Design of Macromolecules Using ML Summary
Taking a new drug through the various phases of drug development costs on average
$1 to $2 billion and can take up to 15 years. With this research question in mind, the
available data in this domain can be used to develop new drugs in a way that is more accurate,
timely, and cost-effective.
Design of Macromolecules: -
Macromolecule: -
A macromolecule is distinguished by the identity and arrangement of its monomers and linkages.
The monomers are the building blocks, and the linkages are the bonds connecting individual monomers.
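The monomer-and-linkage description above can be read as a graph, with monomers as nodes and linkages as edges. The following is a minimal sketch under that assumption; the class name `Macromolecule`, its methods, and the example peptide are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Macromolecule:
    """A macromolecule as a graph: monomers are nodes, linkages are edges."""
    monomers: List[str] = field(default_factory=list)
    # Each linkage is (index of monomer i, index of monomer j, linkage type).
    linkages: List[Tuple[int, int, str]] = field(default_factory=list)

    def add_monomer(self, name: str) -> int:
        """Add a monomer and return its index."""
        self.monomers.append(name)
        return len(self.monomers) - 1

    def add_linkage(self, i: int, j: int, kind: str) -> None:
        """Connect two existing monomers with a linkage of the given type."""
        n = len(self.monomers)
        if not (0 <= i < n and 0 <= j < n):
            raise IndexError("linkage refers to a monomer that does not exist")
        self.linkages.append((i, j, kind))

    def neighbors(self, i: int) -> List[int]:
        """Return the indices of monomers directly linked to monomer i."""
        result = []
        for a, b, _ in self.linkages:
            if a == i:
                result.append(b)
            elif b == i:
                result.append(a)
        return result


# Illustrative example: a short linear peptide, Gly-Ala-Ser.
peptide = Macromolecule()
gly = peptide.add_monomer("Gly")
ala = peptide.add_monomer("Ala")
ser = peptide.add_monomer("Ser")
peptide.add_linkage(gly, ala, "peptide bond")
peptide.add_linkage(ala, ser, "peptide bond")
```

Because linkages are stored as explicit edges, the same structure can describe branched (non-linear) molecules, not only linear chains.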
Biomacromolecules: -
These are also called biological macromolecules and occur naturally in human beings.
They form the basis of life; strands of RNA and DNA are typical examples.
Artificial Macromolecules: -
Artificial macromolecules are ubiquitous and indispensable to everyday life. Examples are
common items such as cups, covers, and polyethylene bags.
Monomer Representation: -
Macromolecule Representation: -
Similarity Computation: -
Chemical dissimilarity is computed using sequence alignment or graph edit distance and
scored using a Tanimoto matrix.
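As an illustration of the scoring step, the sketch below builds a pairwise Tanimoto matrix, assuming each monomer is described by a set of features and that the Tanimoto score is the size of the intersection divided by the size of the union. The feature sets are hypothetical placeholders, not real fingerprints, and the function names are chosen only for this example.

```python
from typing import List, Set


def tanimoto(a: Set[str], b: Set[str]) -> float:
    """Tanimoto similarity of two feature sets: |A & B| / |A | B|."""
    union = a | b
    if not union:
        # Assumed convention: two empty feature sets count as identical.
        return 1.0
    return len(a & b) / len(union)


def tanimoto_matrix(items: List[Set[str]]) -> List[List[float]]:
    """Pairwise Tanimoto similarities between all feature sets."""
    return [[tanimoto(x, y) for y in items] for x in items]


# Hypothetical feature sets for three monomers (illustration only).
features = [
    {"amine", "carboxyl"},
    {"amine", "carboxyl", "methyl"},
    {"hydroxyl", "ring"},
]

similarity = tanimoto_matrix(features)
# Dissimilarity is taken here as 1 - similarity.
dissimilarity = [[1.0 - s for s in row] for row in similarity]
```

A matrix like `dissimilarity` could then serve as the substitution cost between monomers in a sequence alignment or graph edit distance, though how the scores are combined is not specified in these notes.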
The choice of model depends on three factors:
1. Data type
2. Dataset size
3. The task
If the dataset size is less than 100, simpler models can be used. But if the dataset size is more
than 500 to 600, or even greater, there is enough data for models such as Convolutional Neural
Networks (CNNs) or, for non-linear molecules, Graph Neural Networks (GNNs).
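The dataset-size rule of thumb above can be written as a small helper. This is only a sketch: the function name `suggest_model_family` is hypothetical, the thresholds are the approximate ones quoted in the notes, and the range between 100 and 500 is not covered by the notes, so the function says so rather than guessing.

```python
def suggest_model_family(n_samples: int) -> str:
    """Suggest a model family from dataset size, following the notes' rule of thumb."""
    if n_samples < 0:
        raise ValueError("n_samples must be non-negative")
    if n_samples < 100:
        return "simpler models"
    if n_samples > 500:
        # Enough data for CNNs, or graph neural networks for non-linear molecules.
        return "neural networks"
    # The notes give no guidance for sizes between 100 and 500.
    return "not specified"


for size in (50, 300, 1000):
    print(size, "->", suggest_model_family(size))
```

Dataset size is only one of the three factors; the data type and the task still need to be considered separately.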
Depending on the task, all of these models (simpler models, CNNs, and GNNs) were used for
regression and classification. To discover new molecules, we used a language-based model
or a Recurrent Neural Network (RNN).