Protein Structure Modeling
András Fiser
Department of Biochemistry and Seaver Foundation Center for Bioinformatics, Albert Einstein College of Medicine, New York, USA
Why is it useful to know the structure of a protein, not only its sequence?
The 3D structure is more informative than the sequence because patterns in space are frequently more recognizable than patterns in sequence.
Evolution tends to conserve function, and function depends more directly on structure than on sequence, so structure is more conserved in evolution than sequence.
Anacystis nidulans
Chondrus crispus
Anabaena 7120
Ab initio prediction
Accuracy and applicability are limited by our understanding of the protein folding problem.
Comparative Modeling
Applicable only to sequences that share recognizable similarity with a template structure.
Fairly accurate (<3 Å RMSD), typically comparable to a low-resolution X-ray experiment.
Not limited by size.
Accuracy and applicability are limited by the number of known folds.
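The accuracy figure quoted above is an RMSD (root-mean-square deviation) between model and experimental coordinates. A minimal sketch of that calculation follows; it assumes the two structures are already superposed and that atoms are paired one to one. The function name `rmsd` and the coordinates are illustrative, not taken from the slides.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) points, in the input units.

    Assumes the structures are already superposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Made-up C-alpha positions in Å, for illustration only.
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
reference = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]
print(rmsd(model, reference))
```

In practice an optimal superposition is usually computed first, so this sketch only covers the final averaging step.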
Structural Genomics
Characterize most protein sequences (red) based on related known structures (green). The number of families is much smaller than the number of proteins.
Structural Genomics
Definition: The aim of structural genomics is to put every protein sequence within modeling distance of a known protein structure.
Size of the problem: There are a few thousand domain fold families. There are ~20,000 sequence families (at 30% sequence identity).
Solution: Determine protein structures for as many different families as possible. Model the rest of the family members using comparative modeling.
Flavodoxin family
Anabaena 7120
COMPARATIVE MODELING
KIGIFFSTSTGNTTEVA
Chondrus crispus
Desulfovibrio vulgaris
TARGET
ASILPKRLFGNCEQTSDEGLKIERTPLVPHISAQNVCLKIDDVPERLIPERASFQWMNDK
TEMPLATE
Target    ASILPKRLFGNCEQTSDEGLKIERTPLVPHISAQNVCLKIDDVPERLIPE
Template  MSVIPKRLYGNCEQTSEEAIRIEDSPIV---TADLVCLKIDEIPERLVGE
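Sequence identity between target and template can be read directly off an alignment such as the one above. The sketch below counts identical residues over the columns where neither row has a gap. That denominator is one common convention and an assumption here, since the slides do not say which one is used. The function name `percent_identity` is illustrative.

```python
def percent_identity(row_a, row_b, gap="-"):
    """Percent identity over aligned columns in which neither row has a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    aligned = 0
    identical = 0
    for res_a, res_b in zip(row_a, row_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns (an assumed convention)
        aligned += 1
        if res_a == res_b:
            identical += 1
    if aligned == 0:
        return 0.0
    return 100.0 * identical / aligned

# The two aligned rows from the slide.
target_row = "ASILPKRLFGNCEQTSDEGLKIERTPLVPHISAQNVCLKIDDVPERLIPE"
template_row = "MSVIPKRLYGNCEQTSEEAIRIEDSPIV---TADLVCLKIDEIPERLVGE"
print(round(percent_identity(target_row, template_row), 1))
```

Other conventions divide by the full alignment length or by the shorter sequence length, which gives somewhat different percentages for the same alignment.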
Template Search
Pattern recognition, heuristic searches (e.g. BLAST, FASTA)
Profile and iterative alignment methods (e.g. HMMs, PSI-BLAST)
Structure-based threading (e.g. THREADER, FUGUE, 3DPSSM)
Target-Template Alignment
Dynamic programming, pairwise alignment
Multiple alignments, profiles, HMMs
Structure-based approaches (threading)
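As an illustration of the dynamic programming approach to pairwise alignment, here is a minimal global alignment sketch in the style of Needleman-Wunsch. The scoring values (match +1, mismatch -1, linear gap -2) are assumptions chosen for simplicity; real target-template alignments typically use substitution matrices and affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global pairwise alignment; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for a[i-1] against b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))

# Short illustrative strings, not taken from the slides.
best, row_a, row_b = needleman_wunsch("ACGT", "AGT")
print(best)
print(row_a)
print(row_b)
```

The table fill costs time proportional to the product of the two sequence lengths, which is why heuristic searches are used to find templates first and the full alignment is computed only for the selected candidates.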
Model Building
Rigid body assembly (COMPOSER)
Segment matching (SEGMOD, 3DPSSM)
Satisfaction of spatial restraints (MODELLER)
Integrated (NEST)
Model Evaluation
Stereochemistry (PROCHECK, WHATCHECK)
Environment (Profiles3D, Verify3D)
Statistical potentials based methods (PROSA)
Is the model reliable? A model is reliable when it is based on a correct template and on an approximately correct alignment.
OK? Yes: END. No: repeat the earlier steps.
1rypH
1ac5
Modeling structural consequences of a point mutation (Ser-Pro) in zebrafish forkhead transcription factor Foxi1
re-modelled wild-type segments (6 and 7 aa) and NMR:
modelled mutated segments with each other (6 and 7 aa):
wild-type and mutated segments (6 and 7 aa):
Drosophila m.
E. coli
structures reveals two subclasses of dUTPases with different type of subunit interfaces.
3. Altered character of subunit interfaces correlates with the suggested different functional mechanism: a polar/charged surface is better adjusted for allosterism.