Thank you for answering our questionnaire! Based on your answers, we have provided our team's recommendations to help you properly set up Makya for your project.
Your answers
- I want to modify the center of my molecule, keeping the external branches fixed
- I want to stay close to a reference molecule or dataset
- Yes, I have my own data
Given the requirements of your project, we recommend setting up a Fragment Linking generator coupled with QSAR models.
1. Selection of the generator
The Fragment Linking generator generates compounds by proposing new linkers and/or scaffolds between two building blocks and their reaction centers. It is purely chemistry-driven, so a good understanding of organic chemistry is required. Once you define the exit vectors (where the chemistry will take place), the generator searches for commercial building blocks that can react at those positions.
For example, given a pair of building blocks, A and B, with reaction centers (exit vectors) -NH2 and -Br respectively, the Fragment Linking generator will propose novel compounds by attaching new scaffolds at those centers while keeping the rest of the building blocks intact.
2. Construction of a QSAR model
Makya QSARs are classification models trained on the data of your choice. For each individual objective that you want to model (activity, ADMET properties, etc.), the Makya QSAR module automatically tests different combinations of molecular representations, parameters, and so on, to select an optimal solution. QSARs can be used to guide a generator (thus generating molecules that are optimized on the QSAR model objectives), as we recommend here, but also to score any molecule of interest.
3. Step by Step setup
Step 1: Upload your project data in Makya
- To create your dataset, follow the steps described in the documentation: Datasets.
- Requirements: your dataset should not contain columns without any values or without a column title. It should contain a SMILES column.
Note: Makya automatically cleans the dataset. For more information, see the documentation.
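Before uploading, the dataset requirements listed in Step 1 can be checked locally. The following is a minimal sketch in Python using only the standard library. It assumes the dataset is a CSV file and that the SMILES column is literally named "SMILES"; both are assumptions made for illustration, and the function name check_dataset is hypothetical and not part of Makya.

```python
import csv
import io

def check_dataset(csv_text, smiles_column="SMILES"):
    """Return a list of problems found in a CSV dataset (empty list if none)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["dataset is empty"]
    header, data = rows[0], rows[1:]
    problems = []
    # Assumption: the SMILES column is named exactly `smiles_column`.
    if smiles_column not in header:
        problems.append(f"missing '{smiles_column}' column")
    for index, title in enumerate(header):
        # Collect this column's values, ignoring rows that are too short.
        values = [row[index].strip() for row in data if index < len(row)]
        if not title.strip():
            problems.append(f"column {index + 1} has no title")
        if not any(values):
            problems.append(f"column {index + 1} ('{title}') has no values")
    return problems

# Hypothetical example: 'notes' is empty and the last column has no title.
example = "SMILES,pIC50,notes,\nCCO,6.2,,\nc1ccccc1,5.1,,\n"
for problem in check_dataset(example):
    print(problem)
```

An empty result means the file passes these three checks only; it does not replace the cleaning that Makya performs on upload.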
Step 2: Create a new QSAR trained on your project data
- To create your QSAR, follow the steps described in the documentation: Creating a new Predictor by defining a Target Product Profile (TPP).
- Train your QSAR by clicking on the Run button in the QSAR tab.
- Once the training is complete, validate the performance of your model using the scores provided in Makya. This step is crucial: a poorly performing model should not be used to guide molecule generation (for more information, see the documentation).
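The exact scores reported by Makya are described in its documentation and are not reproduced here. As an illustration of what validating a classification model can involve, the sketch below computes two standard metrics, balanced accuracy and the Matthews correlation coefficient, from held-out labels and predictions. The data and the function names are hypothetical, and nothing here is taken from Makya itself.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def balanced_accuracy(tp, tn, fp, fn):
    """Mean of sensitivity and specificity; robust to class imbalance."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return (sensitivity + specificity) / 2

def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient, between -1 and 1 (0 is chance level)."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator else 0.0

# Hypothetical held-out labels (1 = active) and model predictions.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
print("balanced accuracy:", round(balanced_accuracy(*counts), 3))
print("MCC:", round(matthews_corrcoef(*counts), 3))
```

Values close to 0.5 for balanced accuracy or close to 0 for the Matthews coefficient indicate a model that performs little better than chance, which is the kind of model that should not guide a generator.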
Step 3: Set up your Fragment Linking Generator
A description of the Fragment Linking generator and of the setting-up steps is provided in the documentation: Fragment Linking generator. You can also find examples in our use-cases: for example, Scaffold Hopping with the Fragment Linking generator.
- Create a new Fragment Linking generator in the Generation tab of your project. The generator set-up page appears.
- In the Exit Vectors tab, enter the two external branches of your molecule.
  - The fragments should not contain any charged atoms, as protonation is performed directly inside Makya.
  - It is important to input fragments that are suitable building blocks for chemical synthesis (for example, brominated or chlorinated fragments, or molecules with an OH to form an ester), as the Fragment Linking generator is a chemistry-based generator trained on chemical reactions.
- After having entered your fragments, select the exit vectors by clicking on Set and entering the atom IDs.
  - Make sure to select all the atoms that will be involved in the reaction.
- In the Chemical Space tab, select the dataset of your project.
  - The similarity to this chemical space will be an element of the overall fitness function that will be optimized during molecule generation. It ensures that you will stay close to your project molecules.
- In the QSAR tab, select the QSAR that you trained in Step 2.
  - The objectives that you select will be an element of the overall fitness function that will be optimized during molecule generation.
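To make the idea of an overall fitness function more concrete, here is a minimal sketch of how a similarity term and QSAR objectives could be combined into a single score. Makya's actual fitness function is not described in this guide. The weighted average, the Tanimoto similarity on sets of feature identifiers, and all names and values below are assumptions chosen purely for illustration.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint features."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similarity_to_dataset(candidate, dataset):
    """Similarity to the closest reference molecule in the dataset."""
    return max((tanimoto(candidate, ref) for ref in dataset), default=0.0)

def fitness(candidate, dataset, qsar_probabilities, similarity_weight=0.5):
    """Assumed fitness: weighted average of similarity and mean QSAR score.

    This is NOT Makya's formula; it only illustrates how several
    elements can contribute to one score that an optimizer maximizes.
    """
    similarity = similarity_to_dataset(candidate, dataset)
    if qsar_probabilities:
        qsar = sum(qsar_probabilities) / len(qsar_probabilities)
    else:
        qsar = 0.0
    return similarity_weight * similarity + (1 - similarity_weight) * qsar

# Hypothetical fingerprints, represented as sets of integer feature IDs.
project_dataset = [{1, 2, 3, 4}, {2, 3, 5}]
candidate = {1, 2, 3, 6}
# Hypothetical QSAR outputs: predicted probability of meeting each objective.
score = fitness(candidate, project_dataset, qsar_probabilities=[0.8, 0.6])
print(round(score, 2))
```

In this sketch, a candidate scores highly only if it is both close to the project molecules and predicted favorably by the QSAR, which mirrors the role of the Chemical Space and QSAR selections in the set-up.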
Step 4: Run the Generator and analyze your results
- To run your generator, go back to the Generator tab and click on Run.
- You can see the first generated molecules while the generation is still running. Check that there is no error in your set-up and that the generated molecules conform to the requirements of your problem.
Once you have enough molecules, you can use the Parallel Coordinates to filter the molecules based on scores such as the QSAR predictions. For more information on the visualization, analysis and export of your results, check the documentation: Visualisation and Analysis of Generated Molecules.
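Filtering on parallel coordinates amounts to keeping the molecules whose scores fall inside a chosen range on every axis. If you export your results and want to reproduce such a filter outside Makya, a minimal sketch could look like the following. The score names, values and the function filter_by_ranges are hypothetical and do not reflect Makya's export format.

```python
def filter_by_ranges(molecules, ranges):
    """Keep molecules whose scores lie inside every (low, high) range."""
    kept = []
    for molecule in molecules:
        inside = all(
            low <= molecule[name] <= high
            for name, (low, high) in ranges.items()
        )
        if inside:
            kept.append(molecule)
    return kept

# Hypothetical generated molecules with illustrative score columns.
generated = [
    {"smiles": "CCO", "activity_probability": 0.91, "similarity": 0.72},
    {"smiles": "c1ccccc1", "activity_probability": 0.40, "similarity": 0.85},
    {"smiles": "CC(=O)O", "activity_probability": 0.88, "similarity": 0.30},
]

# One range per axis, as when brushing axes on a parallel coordinates plot.
selection = filter_by_ranges(
    generated,
    {"activity_probability": (0.8, 1.0), "similarity": (0.5, 1.0)},
)
print([molecule["smiles"] for molecule in selection])
```

Each entry in the ranges dictionary plays the role of one brushed axis, so tightening a range narrows the selection in the same way as in the interactive view.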
To Go Further
These are the minimal steps needed to fit the requirements of your project. If you want to add more constraints on the generation, you can do so during the set-up of your generator.
For example, you can add substructure constraints (forcing or preventing the presence of specific substructures) either on the building blocks that will be chosen as the new molecular cores, or directly on the generated molecules. The first option will drastically accelerate the generation by reducing the size of the catalog of building blocks explored by the algorithm: thus, we recommend using it whenever appropriate.
If you have any questions, contact your Application Scientist.