
Comparative Genomics 2 - PART 1


Comparative Genomics

Ana C. Marques
The Module - Overview

Lecture 1 - Comparative genomics - Ana Marques
Comparison of DNA sequences.

Lecture 2 - Comparative transcriptomics - Chris Ponting
Comparison of RNA/protein.

Lecture 3 - Disease genomics - Caleb Webber
Using genomics to understand phenotype and disease.

Practical - Ana Marques and Steve Meader
Using web-based data-mining tools to compare disease-associated loci between human and mouse.
The Lecture - Overview

Overview:

1- Genome(s);

2- Genomics: comparative, functional and evolutionary;

3- Protein-coding genes and evolution.


The genome

The genome contains all the biological information required to build and maintain any given living organism.

The genome contains the organism's molecular history.

Decoding the biological information encoded in these molecules will have an enormous impact on our understanding of biology.
Some history

1866 - Gregor Mendel proposed that traits are inherited as discrete units.

1869-Friedrich Miescher isolated DNA.

1919-Phoebus Levene identified the nucleotides and proposed they were linked
through phosphate groups.

1944 - Avery, MacLeod and McCarty showed that DNA, and not protein, is the carrier
of genetic information.

1953 - Based on an X-ray diffraction image taken by Rosalind Franklin and Raymond Gosling,
and on Erwin Chargaff's finding that A and T, and G and C, occur in equal amounts,
James D. Watson and Francis Crick proposed the double-helix structure of DNA.

1957- Crick laid out the central dogma of molecular biology (DNA->RNA->protein).

1961 - Nirenberg and colleagues “cracked” the genetic code


Some history (cont.)

1975- Sanger sequencing

1976/77 - First viral genomes – MS2/φX174 (chromosomal walking - size ~5 kb)

1982 -First shotgun sequenced genome – Bacteriophage lambda (~50 kb)

1995 - First prokaryotic genome – H. influenzae

1996 - First unicellular eukaryotic genome – Yeast

1998 - First multicellular eukaryotic genome – C. elegans

2000 - Drosophila melanogaster - fruitfly

2000 - Arabidopsis thaliana

2001- Human Genome


1865 Mendel discovers laws of genetics

1900 Rediscovery of Mendel’s genetics

1944 DNA identified as hereditary material

1953 DNA structure

1960’s Genetic code

1977 Advent of DNA sequencing

1975-79 First human genes isolated

1986 DNA sequencing automated


~50 years
1990 Human genome project officially begins

1995 First whole genome

1999 First human chromosome

2003 ‘Finished’ human genome sequence


The Human genome project

The Human genome project promised to revolutionise medicine and explain every base of our DNA.

Large MEDICAL GENETICS focus

• Identify variation in the genome that is disease causing
• Determine how individual genes play a role in health and disease
The Human genome project

This was a huge technical undertaking, so further aims of the project were…
• Develop and improve technologies for: DNA sequencing, physical
and genetic mapping, database design, informatics, public access
• Genome projects of 5 model organisms: E. coli, S. cerevisiae, C.
elegans, D. melanogaster, M. musculus.
  – Provide information about these organisms
  – As test cases for refinement and implementation of various tools required for the HGP

• Train scientists for genomic research and analysis


• Examine and propose solutions regarding ethical, legal and social
implications of genomic research (ELSI)
The 2 Human genome projects

PUBLIC - Watson/Collins
• Human Genome Project
• Officially launched in 1990
• Worldwide effort - both academic and government institutions
• Assemble the genome using maps
• 1996 Bermuda accord

PRIVATE - Craig Venter
• 1998 Celera Genomics
• Aim to sequence the human genome in 3 years
• ‘Shotgun’ approach - no use of maps for assembly
• Data release NOT to follow Bermuda principles
The Human genome project

It cost about 3 billion dollars and took 13 years to complete (1990 to 2003, 2 fewer than initially predicted).

• Currently 3.2 Gb
• Approx 200 Mb still in progress
– Heterochromatin
– Repetitive
• Most recent human
genome uploaded
February 2009
The Human genome.
The functional genome

evodisku.multiply.com/notes/item/109

Protein-coding genes alone do not explain complexity/diversity.


The functional genome

35 Research groups threw everything at 30Mb (1%) of human DNA sequence.


>200 experimental datasets (transcription, histone modifications, chromatin structure,
regulatory binding sites, replication timing, population variation and more).
The functional genome map
Estimating the fraction of the genome that is functional

• Only about 1.2% of the genome encodes protein sequence


• Most of it is composed of decaying transposons
• 5% appears “constrained” = likely functional
• >70% appears transcribed but unconstrained (lots fast evolving?)
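
To give a feel for the scale of these percentages, here is a rough arithmetic sketch in Python. It assumes the 3.2 Gb genome size quoted earlier in the lecture; the variable and function names are illustrative only. The last line also assumes, as a simplification, that all protein-coding sequence counts as constrained.

```python
# Rough arithmetic for the fractions quoted above.
# Assumption: a genome size of 3.2 Gb, the figure used earlier in this lecture.
GENOME_SIZE_BP = 3.2e9

fractions = {
    "protein-coding": 0.012,  # ~1.2% encodes protein sequence
    "constrained": 0.05,      # ~5% appears constrained
    "transcribed": 0.70,      # >70% appears transcribed (a lower bound)
}


def fraction_to_megabases(fraction, genome_size_bp=GENOME_SIZE_BP):
    """Convert a fraction of the genome into megabases."""
    return fraction * genome_size_bp / 1e6


sizes_mb = {name: fraction_to_megabases(f) for name, f in fractions.items()}
for name, mb in sizes_mb.items():
    print(f"{name}: ~{mb:.0f} Mb")

# Simplifying assumption: all protein-coding sequence is itself constrained,
# so the remainder is constrained sequence that does not encode protein.
noncoding_constrained_mb = sizes_mb["constrained"] - sizes_mb["protein-coding"]
print(f"constrained but non-coding: ~{noncoding_constrained_mb:.0f} Mb")
```

Under these assumptions, most of the constrained sequence lies outside protein-coding regions, which is the point the slide is making.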
2nd generation sequencing

Genome wide annotation of functional elements made easy!


2nd generation sequencing

Applications

1- Genome sequencing and genome assembly (Panda genome, 2009)
2- Genome re-sequencing (Craig Venter, James Watson… 1000 Genomes Project)
3- Transcriptome sequencing (unbiased)
4- Metagenomics
5- ChIP-seq
6- RIP-seq

…seq.
3rd generation sequencing (and counting)

Single molecule sequencing.


Potential to answer questions that remain open (somatic
variation/ single cell transcription…)

Next generation sequencing has changed (and will continue to change) the way we do and understand biology!
More data but what should we do with it?
From genome to biology

To use this data to understand physiology, behaviour, disease and variation between species/individuals, we need to know:

• The evolutionary history of every genetic element (every base)
• The evolutionary forces shaping the genome
• Structural and sequence variation in the population and between species.

Comparative genomics studies differences between genome sequences, pin-pointing changes over time.
Comparing the number and type of changes against the background of changes expected under
"neutral" evolution provides a better understanding of the forces that shaped genomes and traits.
Comparative genomics

“Nothing in Biology Makes Sense Except in the Light of Evolution.”
Theodosius Dobzhansky
How do genomes change

MUTATION

1. Small scale mutations


Nucleotide substitutions: ACGTGTC → ATGTGTC
Small insertions / deletions (indels): ACGTGTC → AGTGTC

2. Large scale mutations (> 1kb)
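
The two small-scale mutation types in item 1 can be told apart with a very simple check, sketched here in Python using the example sequences from the slide. The function name is hypothetical, and the sketch assumes the two sequences differ by a single event (substitutions at equal length, or one indel); real data would need a proper alignment.

```python
def classify_small_mutation(ancestral, derived):
    """Classify a single small-scale change between two short sequences.

    Simplifying assumption: equal-length sequences differ only by
    substitutions, and unequal-length sequences differ by one indel.
    """
    if ancestral == derived:
        return "identical"
    if len(ancestral) == len(derived):
        # 1-based positions where the two sequences disagree
        positions = [
            i + 1 for i, (a, b) in enumerate(zip(ancestral, derived)) if a != b
        ]
        return f"substitution at position(s) {positions}"
    kind = "deletion" if len(derived) < len(ancestral) else "insertion"
    size = abs(len(ancestral) - len(derived))
    return f"{kind} of {size} bp"


print(classify_small_mutation("ACGTGTC", "ATGTGTC"))
print(classify_small_mutation("ACGTGTC", "AGTGTC"))
```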

How do changes accumulate in the genome?

In 1965 Pauling and colleagues showed that for any given protein the rate of
molecular evolution is approximately constant in all lineages.
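
If the rate really is constant, sequence divergence can be turned into a time estimate. The Python sketch below illustrates that reasoning under a strict clock: two lineages that split t years ago each accumulate changes at rate r, so the distance between them is roughly 2rt. The function name and the example numbers are hypothetical, chosen only for illustration.

```python
def divergence_time(distance, rate_per_site_per_year):
    """Estimate the time since two lineages split under a strict molecular clock.

    distance: substitutions per site separating the two sequences
    rate_per_site_per_year: substitution rate along ONE lineage
    The factor of 2 accounts for changes accumulating on both lineages.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return distance / (2 * rate_per_site_per_year)


# Hypothetical numbers, for illustration only
example_distance = 0.02  # 2% of sites differ
example_rate = 1e-9      # substitutions per site per year
print(f"{divergence_time(example_distance, example_rate):.2e} years")
```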


In 1968, Motoo Kimura proposed that most mutations accumulated in genomes are neutral.

The Neutral Theory.


Neutral model

Aim: Identify regions of the genome that are not evolving neutrally!
LOCUS X - Neutral
Species 1 CGACATTAAATAGGCGCAGGACCAGATACCAGATCAAAGCAGGCGCA
Species 2 CGACGTTAAATTGGCGCAGTATCAGATACCCGATCAAAGCAGACGCA

LOCUS Y
Species 1 CATGGGTCATCACTCTAGCTGTACGTCTACTTCATCATCGCGCTACG
Species 2 CATGAGTCATCACTCTAGCTGTACGTCTACTTCATCATCGCGTTACG

Sequence that is conserved over long evolutionary distances is likely to be under selective constraint.
Conservation is often a good predictor of functionality

[Figure labels: Conservation highlights exons; Novel exon?; Regulatory element?]

BUT…
Conservation is not synonymous with function

Not all functional sequence is conserved across long evolutionary distance.

Heart Enhancers
Conservation is not synonymous with function

Long Intergenic ncRNA

Sequence conservation doesn’t imply function conservation

Despite conservation of binding preferences and binding sites only a small proportion
of TF binding events is conserved across species

Odom D. et al (2007)
Schmidt D. et al (2010)
Sequence conservation doesn’t imply function conservation

Massive turnover of functional sequence in mammalian genomes

Meader S et al. (2011)


Protein-coding genes and evolution

Lessons from comparative genomics:

Changes in protein-coding repertoires and their contributions to phenotypic differences

[Figure legend: same / different / contraction / expansion]
Demuth J.P. et al, (2006)
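
As a minimal sketch of what "expansion" and "contraction" mean here, the Python below labels a gene family by comparing its copy number in two species. The function name and the copy numbers are invented for illustration and are not taken from the cited study; published analyses typically model gene gain and loss along a phylogeny instead of comparing raw counts.

```python
def classify_family(size_species_a, size_species_b):
    """Label a gene family by its copy number in species A relative to species B."""
    if size_species_a == size_species_b:
        return "same"
    return "expansion" if size_species_a > size_species_b else "contraction"


# Hypothetical copy numbers (species A, species B), invented for illustration
family_sizes = {
    "family_1": (12, 12),
    "family_2": (30, 18),
    "family_3": (4, 9),
}

labels = {name: classify_family(a, b) for name, (a, b) in family_sizes.items()}
print(labels)
```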
