AI4Aggregation

This repo contains the code accompanying the publication "Amino Acid Composition drives Peptide Aggregation: Predicting Aggregation for Improved Synthesis", including scripts to reproduce all results.

Installation

This project uses Poetry as its package manager. To install the package, run:

poetry install

Preprocessing data

To create the dataset used to train models, use the following script:

poetry run create_combined_data --uzh_dataset_path <Path to UZH dataset> --mit_dataset_path <Path to MIT dataset> --save_path data/combined_data.csv

This step requires both the UZH and the MIT datasets. For the UZH dataset, download the file uzh_data_clean.csv from the Zenodo record corresponding to this publication. The MIT dataset can be found in its corresponding GitHub repo.
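
Once the script has written data/combined_data.csv, a quick sanity check can confirm that the file is readable and non-empty. The following is only a minimal sketch: the helper name `summarize_csv` is hypothetical and not part of this repo, and it assumes nothing about the file beyond it being a CSV with a header row.

```python
import csv
from pathlib import Path


def summarize_csv(path):
    """Return (column names, number of non-empty data rows) for a CSV with a header row."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # first row is assumed to be the header
        n_rows = sum(1 for row in reader if row)  # skip completely empty lines
    return header, n_rows


if __name__ == "__main__":
    # Path taken from the --save_path used in the command above.
    target = Path("data/combined_data.csv")
    if target.exists():
        columns, n_rows = summarize_csv(target)
        print(f"{target}: {n_rows} rows, columns: {columns}")
    else:
        print(f"{target} not found; run create_combined_data first")
```

The check deliberately avoids assuming any column names, since the schema of the combined dataset is not described here.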

Reproducing Training results

We provide a set of scripts to replicate the results reported in the paper. Each script takes the path to the folder where the experiment results are saved. Training the HuggingFace models requires GPUs:

bash scripts/run_hf_models.sh <Path to Experiment Folder>
bash scripts/run_sklearn_models.sh <Path to Experiment Folder>
bash scripts/run_sklearn_shuffled.sh <Path to Experiment Folder>
bash scripts/run_wof_sklearn.sh <Path to Experiment Folder>
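
If all four experiments should write to the same folder, the commands above can be driven from one small Python helper. This is a sketch under assumptions, not part of the repo: the function names `build_commands` and `run_all` are hypothetical, the script paths are copied from the list above, and the helper is assumed to be run from the repository root.

```python
import subprocess

# Script paths exactly as listed in this README.
SCRIPTS = [
    "scripts/run_hf_models.sh",
    "scripts/run_sklearn_models.sh",
    "scripts/run_sklearn_shuffled.sh",
    "scripts/run_wof_sklearn.sh",
]


def build_commands(experiment_dir):
    """Build one `bash <script> <experiment_dir>` command per training script."""
    return [["bash", script, str(experiment_dir)] for script in SCRIPTS]


def run_all(experiment_dir):
    """Run the scripts one after another, stopping at the first failure."""
    for command in build_commands(experiment_dir):
        print("Running:", " ".join(command))
        # check=True raises CalledProcessError if a script exits with a non-zero status.
        subprocess.run(command, check=True)
```

Calling `run_all("<Path to Experiment Folder>")` then executes the scripts sequentially. Since the HuggingFace script needs GPUs, it may be preferable to remove it from `SCRIPTS` on CPU-only machines.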

Explainability

To explain the predictions of the models, we use SHAP values. To reproduce our results, use the following script:

poetry run explain_model --data_path data/combined_data.csv --output_path <Path where results should be stored>
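
For readers unfamiliar with SHAP values, the idea behind them can be illustrated with an exact Shapley value computation on a toy model. This is not the repository's implementation and does not reproduce what explain_model does: the function `shapley_values`, the linear toy model, and the all-zero baseline are illustrative assumptions only.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Features outside a coalition are set to their baseline value.
    The cost grows exponentially with the number of features.
    """
    n = len(x)
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                present = set(subset)
                without_i = [x[j] if j in present else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                values[i] += weight * (predict(with_i) - predict(without_i))
    return values


def toy_model(v):
    """Hypothetical linear model over three features (illustration only)."""
    return 2.0 * v[0] - 1.0 * v[1] + 0.5 * v[2]


if __name__ == "__main__":
    contributions = shapley_values(toy_model, [1.0, 1.0, 1.0], [0.0, 0.0, 0.0])
    print(contributions)
```

For a linear model, each feature's Shapley value equals its coefficient times its offset from the baseline, and the values sum to the difference between the prediction and the baseline prediction. SHAP libraries approximate this quantity for models where exact enumeration is infeasible.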

