Code accompanying the paper: Amino Acid Composition drives Peptide Aggregation: Predicting Aggregation for Improved Synthesis

rxn4chemistry/AI4Aggregation

AI4Aggregation

This repo contains code accompanying the publication Amino Acid Composition drives Peptide Aggregation: Predicting Aggregation for Improved Synthesis, including scripts to reproduce all results.

Installation

This project uses Poetry as its package manager. To install the package, run:

poetry install

Preprocessing data

To create the dataset used to train models, use the following script:

poetry run create_combined_data --uzh_dataset_path <Path to UZH dataset> --mit_dataset_path <Path to MIT dataset> --save_path data/combined_data.csv

Both the UZH and the MIT datasets are required for this step. For the UZH dataset, download the file uzh_data_clean.csv from the Zenodo record corresponding to this publication. The MIT dataset can be found in its corresponding GitHub repo.

Reproducing Training results

We provide a set of scripts to replicate the results reported in the paper. Each script takes as its argument the path to the folder where the experiment results are saved. Training the HuggingFace models requires GPUs:

bash scripts/run_hf_models.sh <Path to Experiment Folder>
bash scripts/run_sklearn_models.sh <Path to Experiment Folder>
bash scripts/run_sklearn_shuffled.sh <Path to Experiment Folder>
bash scripts/run_wof_sklearn.sh <Path to Experiment Folder>
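
The script name run_sklearn_shuffled.sh suggests that some models are trained on shuffled sequences. This README does not describe that experiment, so the following is only an illustrative sketch under that assumption, not the repository's code. It shows why such a control is informative: shuffling the residues of a peptide changes their order but leaves the amino acid composition unchanged. The helper names and the example peptide are hypothetical.

```python
import random
from collections import Counter


def shuffle_sequence(sequence, seed=0):
    """Return the residues of `sequence` in a random order (seeded for reproducibility)."""
    residues = list(sequence)
    random.Random(seed).shuffle(residues)
    return "".join(residues)


def composition_counts(sequence):
    """Count how often each residue (one-letter code) occurs in the sequence."""
    return Counter(sequence)


# Hypothetical example peptide, not taken from either dataset.
peptide = "ACDVVIKG"
shuffled = shuffle_sequence(peptide)

# The order of residues may differ, but the composition is identical.
same_composition = composition_counts(peptide) == composition_counts(shuffled)
print(peptide, shuffled, same_composition)
```

A model that only sees composition features cannot distinguish a peptide from its shuffled version, so comparing performance on original and shuffled sequences indicates how much of the signal comes from residue order.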

Explainability

To explain the predictions of the models, we use SHAP values. To reproduce our results, use the following script:

poetry run explain_model --data_path data/combined_data.csv --output_path <Path where results should be stored>
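
To give an intuition for what SHAP values measure, here is a minimal sketch for the simplest case: a linear model over amino acid composition features, where (assuming independent features) the SHAP value of feature i is w_i * (x_i - mean_i). This is not the explain_model implementation. The weights, the background peptides, and all function names are arbitrary placeholders chosen for illustration and do not reflect any result from the paper.

```python
from statistics import fmean

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Fraction of each of the 20 standard amino acids in the sequence."""
    return [sequence.count(aa) / len(sequence) for aa in AMINO_ACIDS]


def predict(weights, bias, x):
    """Linear model: f(x) = bias + sum_i w_i * x_i."""
    return bias + sum(w * xi for w, xi in zip(weights, x))


def linear_shap(weights, background, x):
    """SHAP values of a linear model with independent features:
    phi_i = w_i * (x_i - mean of feature i over the background set)."""
    means = [fmean(column) for column in zip(*background)]
    return [w * (xi - m) for w, xi, m in zip(weights, x, means)]


# Arbitrary placeholder weights (NOT fitted, NOT from the paper).
weights = [0.0] * len(AMINO_ACIDS)
weights[AMINO_ACIDS.index("A")] = 1.0
weights[AMINO_ACIDS.index("G")] = -0.5
bias = 0.0

# Hypothetical background set and query peptide.
background = [composition("AAGG"), composition("AGGG")]
x = composition("AAAG")

phi = linear_shap(weights, background, x)

# Additivity: the SHAP values sum to f(x) minus the mean background prediction.
baseline = fmean(predict(weights, bias, b) for b in background)
gap = predict(weights, bias, x) - baseline
print(sum(phi), gap)
```

The additivity check at the end is the property that makes SHAP values useful for attribution: each feature's value is its share of the difference between one prediction and the average prediction.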
