The document discusses several open problems in bioinformatics including the molecule isomorphism problem, string search algorithms, pairwise sequence alignment algorithms, protein and RNA folding problems, and biological network inference. Many of these problems relate to determining isomorphisms or structures given molecular representations and sequences.
What Are Some Open Problems in Bioinformatics
What are some open problems in bioinformatics?
Some open theoretical and practical problems I have encountered, off the top of my head:
• Molecule Isomorphism Problem. Given two molecular representations (sequences, structures, notations, etc.) $R_1$ and $R_2$, infer whether the two are isomorphic. Can it be solved in deterministic polynomial time? This is a variation of the graph isomorphism problem. For highly reduced forms of canonical molecules, such as antibodies and antibody-drug conjugates (ADCs), my own algorithms seem to always run in polynomial time. For more complex molecules, however, the problem approaches the general graph isomorphism problem in complexity. As such, no definite answer is known until the $P = NP$ question is resolved.
• String search algorithms. What is the lower bound for string search algorithms? Under a linear space constraint, what is the lower bound?
• Pairwise sequence alignment algorithms. What is the lower bound for pairwise sequence alignment algorithms? Under a linear space constraint, what is the lower bound?
• Molecule database search. Does an error-free database system for molecules exist? In the end this goes back to the first problem (the Molecule Isomorphism Problem): a general polynomial-time algorithm for graph isomorphism would mean that 100% accuracy in molecular search is possible, provided time and space are (polynomially) unconstrained.
• Protein and RNA Folding Problem. Given the sequence $s_0 s_1 \dots s_n$ of a protein or RNA, infer its secondary, tertiary and quaternary structure in a given environment $E_v$. Does such an algorithm exist? Is it beyond $P$?
• Reverse Protein and RNA Folding Problem. Given a representation $R_s$ of a protein or RNA structure (distance matrix, coordinate vector, vector graphic, etc.), find an underlying sequence $S$ that would adopt that structure. Does such an algorithm exist? If so, is it beyond $P$?
• Dynamic Substructure Prediction. Given the structure $R_s$ of a protein, apply one of the following changes to its underlying sequence and infer the new structure $R'_s$:
  insertion, $T_i: (s_0 s_1 \dots s_n) \to (s_0 s_1 \dots s'_0 s'_1 \dots s'_m \dots s_n)$;
  deletion, $T_d: (s_0 s_1 \dots s_n) \to (s_0 s_1 \dots s_{i-1} s_i s_j s_{j+1} \dots s_n)$, where $j - i > 1$;
  substitution, $T_s: (s_0 s_1 \dots s_k \dots s_n) \to (s_0 s_1 \dots s'_k \dots s_n)$, where $s_k \neq s'_k$.
  Does an algorithm exist for all non-secondary parts of a protein (turns and coils)? Is it beyond $P$?
• De novo protein design. Is it possible to design a protein from scratch? Does such an algorithm or method exist?
• Mathematical biological network inference. Is it possible to infer biological networks entirely mathematically (i.e. without using database search)? Is it possible to mathematically infer causality among biological entities? Is it possible to mathematically predict pathways?
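To make the three sequence changes in the Dynamic Substructure Prediction item concrete, here is a minimal sketch in Python. It only applies the insertion, deletion and substitution to a sequence held as a string; it says nothing about the resulting structure, which is the open part. The function names and the 0-based indexing convention are my own assumptions for illustration.

```python
def insert(seq: str, pos: int, segment: str) -> str:
    """Insertion: place `segment` (s'_0 ... s'_m) before index `pos`."""
    if not segment:
        raise ValueError("segment must contain at least one residue")
    if not 0 <= pos <= len(seq):
        raise ValueError("pos must lie within the sequence")
    return seq[:pos] + segment + seq[pos:]


def delete(seq: str, i: int, j: int) -> str:
    """Deletion: keep s_0..s_i and s_j..s_n, dropping everything strictly between."""
    if not (0 <= i and j < len(seq) and j - i > 1):
        raise ValueError("need 0 <= i, j <= n and j - i > 1")
    return seq[: i + 1] + seq[j:]


def substitute(seq: str, k: int, residue: str) -> str:
    """Substitution: replace s_k with a different single residue s'_k."""
    if not 0 <= k < len(seq):
        raise ValueError("k must index an existing residue")
    if len(residue) != 1 or residue == seq[k]:
        raise ValueError("residue must be one character and differ from s_k")
    return seq[:k] + residue + seq[k + 1:]


if __name__ == "__main__":
    s = "ACDEF"  # hypothetical five-residue sequence
    print(insert(s, 2, "GG"))
    print(delete(s, 1, 4))
    print(substitute(s, 2, "W"))
```

The deletion keeps both $s_i$ and $s_j$, which is why the condition $j - i > 1$ is needed: otherwise nothing would be removed.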
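For the pairwise alignment question under a linear space constraint, it may help to see what the standard baseline looks like. The sketch below computes a global alignment score with the usual dynamic programming recurrence while keeping only two rows, so space is linear in the shorter sequence and time stays quadratic. The scoring values (match +1, mismatch -1, gap -1) and the function name are assumptions for illustration. It returns only the score; recovering the alignment itself in linear space needs a divide-and-conquer scheme such as Hirschberg's.

```python
def alignment_score(a: str, b: str, match: int = 1,
                    mismatch: int = -1, gap: int = -1) -> int:
    """Global alignment score using O(min(len(a), len(b))) extra space."""
    # The scoring scheme is symmetric, so we may swap the inputs
    # to keep the shorter sequence along the row.
    if len(b) > len(a):
        a, b = b, a

    # Row 0: aligning the empty prefix of `a` with prefixes of `b`.
    prev = [j * gap for j in range(len(b) + 1)]

    for i, ca in enumerate(a, start=1):
        curr = [i * gap]  # column 0: prefix of `a` against empty string
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            up = prev[j] + gap        # gap in `b`
            left = curr[j - 1] + gap  # gap in `a`
            curr.append(max(diag, up, left))
        prev = curr  # only the previous row is ever needed

    return prev[-1]


if __name__ == "__main__":
    print(alignment_score("ACGT", "AGT"))
```

Whether the quadratic running time of this kind of recurrence can be beaten in general is exactly the lower bound question raised in the list.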