Test For Upload
Introduction to bioinformatics
Bioinformatics VU
and evolution
What's in a name?
DNA → RNA = Transcription
because the information is exactly copied (or transcribed) from one base-4 system (DNA) to an equivalent base-4 system (RNA). Think of a monk transcribing a scroll.
RNA → Protein = Translation
because the information is converted from a base-4 system (RNA) to a base-20 system (protein). Think of a monk translating a scroll into a new language.
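The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not a full model of gene expression: it assumes the input is the coding strand of the DNA (so transcription reduces to replacing T with U), it uses the standard genetic code, and the function names `transcribe` and `translate` are chosen here for illustration only.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code as one-letter amino acids, with codons ordered
# U, C, A, G at each position; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna: str) -> str:
    """Base-4 to base-4: copy the coding strand into RNA (T becomes U)."""
    return dna.upper().replace("T", "U")


def translate(rna: str) -> str:
    """Base-4 to base-20: read codons (triplets) until a stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


rna = transcribe("ATGGCCTAA")
print(rna)             # AUGGCCUAA
print(translate(rna))  # MA
```

Three bases per codon give 4 × 4 × 4 = 64 possible codons, which is enough to encode the 20 amino acids plus stop signals.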
What is a gene?
A gene is a segment of DNA that contains all the information necessary to code for some function. A gene is also the unit of information that is transferred through Transcription and Translation.
Changes to the cell's internal or external environment can lead to changes in gene expression.
Biological Data
Many different genomics datasets:
Genome sequencing: more than 300 species completely sequenced, with the data in the public domain (i.e. freely available); a virus genome can be sequenced in a day
Gene expression (microarray) data: many microarrays measured per day
Proteomics: Protein Data Bank (PDB), 35,026 structures as of Tuesday, February 7, 2006. http://www.rcsb.org/pdb/
Protein-protein interaction data: many databases worldwide
Metabolic pathway, regulation and signaling data: many databases worldwide
Using and analysing this information effectively by computational means is essential for bioinformatics
Clustering or optimal segmentation by dynamic programming
An important goal of bioinformatics is development of a data analysis infrastructure in support of Genomics and beyond
Each protein sequence "knows" how to fold into its tertiary structure. We still do not understand exactly how or why.
SECONDARY STRUCTURE (helices, strands)
The 1-step process is based on a hydrophobic collapse; the 2-step process, more common in forming larger proteins, is called the framework model of folding
There are many different regulation signals such as start, stop and skip messages hidden in the genome for each gene, but what and where are they?
Expression data
What is life?
NASA astrobiology program: Life is a self-sustained chemical system capable of undergoing Darwinian evolution
Evolution
Four requirements:
Template structure providing stability (DNA)
Copying mechanism (meiosis)
Mechanism providing variation (mutations; insertions and deletions; crossing-over; etc.)
Selection: some traits lead to greater fitness of one individual relative to another. Darwin wrote of "survival of the fittest".
Evolution is a conservative process: the vast majority of mutations will not be selected (i.e. they will not persist because they lead to worse performance or are even lethal). This is called negative (or purifying) selection.
Orthology/paralogy
Orthologous genes are homologous (corresponding) genes in different species; they diverged through a speciation event
Paralogous genes are homologous genes within the same species (genome); they arose through gene duplication
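The species-based distinction above can be written as a small Python sketch. It assumes the two genes are already known to be homologous and only compares their species labels; the `Gene` class and `classify_homologs` function are hypothetical names introduced for illustration, and the globin genes serve only as familiar examples. A real analysis would reconstruct the gene tree rather than rely on species labels alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    species: str


def classify_homologs(a: Gene, b: Gene) -> str:
    """Classify two genes that are ALREADY known to be homologous.

    Simplified rule: same species means paralogous,
    different species means orthologous.
    """
    return "paralogous" if a.species == b.species else "orthologous"


human_hba1 = Gene("HBA1", "human")    # alpha-globin
human_hbb = Gene("HBB", "human")      # beta-globin
mouse_hba = Gene("Hba-a1", "mouse")   # mouse alpha-globin

print(classify_homologs(human_hba1, mouse_hba))  # orthologous
print(classify_homologs(human_hba1, human_hbb))  # paralogous
```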
Consequence of evolution
Notion of comparative analysis (Darwin)
What you know about one species might be transferable to another, for example from mouse to human
Provides a framework for multi-level, large-scale analysis of the plethora of genomics data