VN2024-7 MicrobialGenomics


Microbial Genomics

❑ The genome and information storage and retrieval
❑ DNA sequencing of genomes
❑ Evolution of genomes
❑ Using genome sequence information to analyze gene expression
❑ Comparative genomics
The genome and information storage and retrieval

Information flow in computers is similar to that in cells.

Execute instructions
• Computer: run programs
• Cell: translation → run pathways

Information retrieval
• Computer: RAM (temporary memory); read specific files
• Cell: mRNAs; transcribe specific genes

Information storage
• Computer: hard drive (1s and 0s) holding programs, files, and operating system; changed by installing and editing programs and files
• Cell: genome (G A T C) holding genes and operons; changed by horizontal and vertical gene transfer and mutations

Information is stored in genomes as a DNA sequence.

A concept map showing how sequence information can be used.

What information is in genome(s)?
• From a single species:
  • To ask, first break the genome into pieces.
  • Clone the pieces into plasmids, then sequence the plasmid inserts; or sequence the pieces directly.
  • Construct complete or partial genome(s).
  • Use the genome(s): provides gene annotation.
• From populations of many species:
  • Ask what organisms are there by amplifying the 16S rRNA gene by PCR.
  • Sequence the PCR products and build a phylogenetic tree.

Microbial genome DNA sequence information has many uses.

What is a genome?
All of the genes in an organism:
• Chromosome
• Plasmids
• Viruses

What is the sequence information in a genome useful for?
The DNA sequence of a genome identifies the sequence of all genes and proteins. This can be used:
• To measure transcription of all genes, by quantitating and sequencing mRNA (RNA-seq): Transcriptomics
• To infer metabolic pathways, providing nutrients, metabolic intermediates, and products: Metabolomics
• To determine the mass of all proteins, which allows measuring levels of all proteins: Proteomics
• For diagnostics, to design DNA-dependent diagnostic tools and detect genes involved in virulence: Genomics

DNA Sequencing of Genomes

A DNA sequencing machine
Figure: A MiSeq DNA sequencer, made by a company called Illumina. It has a small screen where operators input programs. Attached to this is an enclosed chamber where the reactions are run.

Laboratory for large-scale sequencing reactions
Figure: The sequencing floor at BGI Hong Kong, showing the Illumina HiSeq 2000 sequencers. The laboratory has a long row of 15-20 sequencing machines on each side. The lab is used for rapid genome sequencing of pathogens.

MinION nanopore sequencing
Figure: Example of using MinION nanopore sequencing.

Genome sequencing projects often use a “shotgun” strategy.

Diagram, starting at the top and going downward, showing steps in a "shotgun" sequencing strategy. Isolated or cloned genomes are broken into many smaller random short pieces, and these pieces are put through a DNA sequencer. Computers then align the resulting sequences into a combined long sequence.
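
The random fragmentation step can be imitated in a few lines of Python. This is only an illustrative sketch: the function name `shotgun_reads`, the toy genome string, and the fixed read length are assumptions made for the example, and real sequencers produce reads of varying length and quality.

```python
import random

def shotgun_reads(genome, n_reads, read_len, seed=0):
    """Sample n_reads random fragments of length read_len from genome."""
    rng = random.Random(seed)             # fixed seed so the example is repeatable
    last_start = len(genome) - read_len   # last position a full-length read can start
    starts = [rng.randint(0, last_start) for _ in range(n_reads)]
    return [genome[s:s + read_len] for s in starts]

# Hypothetical toy "genome"; a real bacterial genome has millions of bases.
genome = "ATGGCGTACGTTAGCCTAGGATCCGATTACGCTAGCTAAGGCT"
reads = shotgun_reads(genome, n_reads=20, read_len=10)
```

Each read is a short piece from an unknown position, which is why a computer is needed afterwards to work out how the pieces fit together.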

After sequencing reactions, the individual sequence reads must
be aligned to yield long consensus sequences (contigs).

Building genomes:
Short sequences → contigs → scaffolds → complete genome

Alignment: different DNA fragments are compared with each other.
• Overlaps: parts of DNA fragments that are identical.
• Alignments can be done for sequence reads from a single microorganism or an entire microbial community that contains hundreds of species.

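
The idea of finding overlaps and joining reads into a contig can be sketched as follows. This is a simplified greedy approach chosen for illustration, not the method used by any particular assembler; the function names and the minimum overlap of 3 bases are assumptions, and sequencing errors, repeats, and reverse-complement reads are ignored.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that matches a prefix of b (0 if shorter than min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the two sequences with the largest overlap until none remain."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate reads, keep order
    while True:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            return contigs                # no overlaps left: these are the contigs
        merged = contigs[i] + contigs[j][k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (i, j)] + [merged]

# Three hypothetical overlapping reads
contigs = greedy_assemble(["ATGGCGT", "GCGTACG", "TACGTTA"])
```

Reads that share no sufficiently long overlap stay as separate contigs, which is where scaffolding and gap closure come in.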
A genome sequencing project involves multiple steps. You
sequenced thousands of small fragments – what is the next step?

Alignment

Contigs → Scaffolds

“Gap closure” and “finishing”

Analyze sequence for Open Reading Frames (ORFs)
to obtain protein sequences → annotated draft
genome

Compare protein sequences with regions or motifs of
known proteins to assign function
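
As a rough illustration of motif-based function assignment, the sketch below scans a protein sequence for one well-known motif, the P-loop (Walker A) nucleotide-binding pattern, written here as a regular expression. The motif table, function name, and example protein are assumptions for the example; real annotation pipelines rely on database searches and statistical models rather than a single pattern.

```python
import re

# Hypothetical one-entry motif table: name -> pattern over one-letter amino acid codes.
# [AG]-x(4)-G-K-[ST] is the commonly cited P-loop (Walker A) pattern.
MOTIFS = {
    "P-loop (Walker A) nucleotide-binding motif": re.compile(r"[AG].{4}GK[ST]"),
}

def find_motifs(protein):
    """Return (motif name, start position, matched text) for every motif hit."""
    hits = []
    for name, pattern in MOTIFS.items():
        for match in pattern.finditer(protein):
            hits.append((name, match.start(), match.group()))
    return hits

# Made-up protein fragment for illustration
hits = find_motifs("MSLKGPSGSGKSTLLR")
```

A hit only suggests a possible function (here, binding ATP or GTP); it is a starting hypothesis rather than proof.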

Link: the number of sequenced prokaryotic genomes in the database at the National Center for Biotechnology Information (NCBI)

Gene prediction uses computational methods to detect genes, or Open Reading Frames (ORFs).
ORF: a region of DNA, from a start codon to a stop codon, that can encode a protein.

• How are ORFs identified?
• Computer software can detect:
  • Start codons
  • Stop codons
  • Ribosome binding sites
  • Codon usage statistics
• The branch of biology dealing with computational approaches to storage, analysis, and comparison of genomes is called bioinformatics.

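
A minimal sketch of the start codon and stop codon part of ORF detection is shown below. It makes several simplifying assumptions that are not from the slides: only ATG counts as a start codon, only the forward strand is scanned, and ribosome binding sites and codon usage statistics are ignored. The function name and `min_codons` parameter are illustrative.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=3):
    """Return (start, end, sequence) for ORFs in the three forward reading frames."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START:
                start = i                       # first start codon opens an ORF
            elif start is not None and codon in STOPS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None                    # stop codon closes the ORF
    return sorted(orfs)

# Hypothetical short sequence containing one small ORF
orfs = find_orfs("CCATGGCTTAAGG")
```

Gene prediction software combines this kind of scan with the other signals listed above to decide which ORFs are likely to be real genes.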
$1,000 human genome sequence

Link: A $100 USD human genome sequence?

Evolution of Genomes

Genome evolution: The size of genomes can vary widely,
depending on how a prokaryote is adapted to its environment.

• Prokaryotes that depend on other organisms for survival (parasites) have the smallest genomes.
• Why?
• Replicating a large genome
requires energy and nutrients.
• Errors in replication can delete
segments of DNA.
• If deleted DNA is not required for
survival, cell will continue to grow
and divide.
• Parasites get nutrients from their host:
genes for synthesis or transport of
those nutrients are no longer needed.
Genome evolution: Mechanisms of genome plasticity

Genome plasticity:
• Genomes are constantly changing
because of:
• Horizontal gene transfer
• Deletions
• Rearrangements
• Translocations

“Plasticity”: like plastic, genomes can change because of evolution and environmental forces.

Genome evolution: Comparison of genomes of closely related
prokaryotes reveals genome plasticity.
All of these sequenced genomes are the same genus and species.
• Genes are connected by lines to show rearrangements.

Darling AE, Miklós I, Ragan MA (2008) Dynamics of Genome Rearrangement in Bacterial Populations. PLoS Genet 4(7): e1000128. https://doi.org/10.1371/journal.pgen.1000128
Genome sequence information reveals an organism’s
metabolism, structure, and other characteristics.
• An estimated 99% of prokaryotes cannot yet be grown in the laboratory; many live only in
complex communities.
• Genome sequencing can reveal much about these microorganisms such
as:
• Metabolic pathways
• Energy production
• Nutrient requirements
• Phylogenetic relationships

Figure source: Hongoh Y et al. PNAS 2008;105:5555-5560. ©2008 by National Academy of Sciences
Genome sequences show how genomes can evolve by
incorporating new genes from other organisms.
This is “horizontal gene transfer”.

Prokaryotes can take up new DNA:
• By mating with other cells (conjugation)
• Through a virus (transduction)
• From the environment (transformation)

Using genome sequence information to analyze gene expression

Using genome sequence information: RNA-Seq can quantitate
the levels of each mRNA in a cell.

• All genes (information storage) can be determined from the genome sequence.
• Which genes are transcribed?
  • Genes needed for growth are transcribed: increased mRNA → proteins (enzymes) → biochemical pathways.
  • Genes not needed are not transcribed: decreased mRNA.
• Gene expression can be measured by quantitating mRNAs for every gene by RNA sequencing (“RNA-Seq”).
• The genome sequence is used to align transcript sequences for RNA-Seq.


Using genome sequence to measure transcription: RNA-Seq
• RNA-Seq examines all mRNA transcripts in a cell.
• Gene expression profiling
  • Compare two or more organisms by measuring their mRNA
  • Compare one organism grown in two different conditions

Workflow, carried out in parallel for S. aureus BG1150 and S. aureus BG1769:
Isolate mRNA → Convert to DNA → DNA sequencing → Assemble transcripts and align with genome → Analyze and compare abundance of transcripts

Figure: 2-dimensional array of rows of red or blue colored rectangles stacked in many adjacent vertical columns. The pattern is blue where genes are transcribed more, and red where genes are transcribed less.

Sharma-Kuinkel BK, Mongodin EF, Myers JR, et al. Potential Influence of Staphylococcus aureus Clonal Complex 30 Genotype and Transcriptome on Hematogenous Infections. Open Forum Infectious Diseases. 2015;2(3):ofv093. doi:10.1093/ofid/ofv093.

Link to figure: Summary of RNA-Seq method http://journals.plos.org/ploscompbiol/article/figure/image?size=medium&id=info:doi/10.1371/journal.pcbi.1004393.g002
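
The final "analyze and compare abundance" step can be sketched with simple arithmetic. The example below assumes that reads have already been aligned and counted per gene; the gene names and counts are made up, and the normalization (counts per million) and pseudocount are common simple choices picked for illustration, not the method of the cited study.

```python
import math

def counts_per_million(counts):
    """Scale raw read counts so that each sample sums to one million."""
    total = sum(counts.values())
    return {gene: 1e6 * n / total for gene, n in counts.items()}

def log2_fold_change(counts_a, counts_b, pseudocount=1.0):
    """log2 of (sample B / sample A) abundance for genes present in both samples."""
    cpm_a = counts_per_million(counts_a)
    cpm_b = counts_per_million(counts_b)
    return {
        gene: math.log2((cpm_b[gene] + pseudocount) / (cpm_a[gene] + pseudocount))
        for gene in cpm_a if gene in cpm_b
    }

# Hypothetical read counts per gene for two isolates
isolate_1 = {"geneA": 100, "geneB": 300, "geneC": 600}
isolate_2 = {"geneA": 400, "geneB": 300, "geneC": 300}
changes = log2_fold_change(isolate_1, isolate_2)
```

A positive value means the gene is transcribed more in the second isolate, a negative value means less, and zero means no change. Heat maps like the red and blue figure described above are typically drawn from values of this kind.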
Comparative Genomics

Using genome sequence information: Comparing genomes of
multiple strains of the same microorganism = “Comparative
genomics”
Two findings illustrate how genomes are dynamic:
• The study of comparative genomics shows that closely related microbes can have significant differences in genetic makeup.
• Core genome: all genes that are found in all organisms being compared.
• Pan genome: the full set of distinct genes found in any of the organisms being compared.

The science behind core and pan genomes
• A study on 61 sequenced E. coli genomes revealed:
  • 993 genes in core genome (out of 4,100 – 5,800 total genes per genome)
  • 15,700 genes in pan genome
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2974192/

Figure source: Li Y, Kwok AHY, Jiang J, Zou Y, Zheng F, Chen P, et al. (2013) Complete Genome Analysis of a Haemophilus parasuis Serovar 12 Strain from China. PLoS ONE 8(9): e68350. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0068350
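
In set terms, the core genome is the intersection of the gene sets and the pan genome is their union. The sketch below assumes that each genome has already been reduced to a set of gene identifiers and that matching genes share the same identifier across strains; the strain and gene names are hypothetical.

```python
def core_and_pan(genomes):
    """Return (core genome, pan genome) for a dict of strain name -> set of gene names.

    Assumes at least one genome is given.
    """
    gene_sets = [set(genes) for genes in genomes.values()]
    core = set.intersection(*gene_sets)   # genes present in every strain
    pan = set.union(*gene_sets)           # genes present in at least one strain
    return core, pan

# Hypothetical gene content of three strains
genomes = {
    "strain1": {"g1", "g2", "g3"},
    "strain2": {"g1", "g2", "g4"},
    "strain3": {"g1", "g5"},
}
core, pan = core_and_pan(genomes)
```

As more strains are added, the core can only shrink or stay the same and the pan genome can only grow or stay the same, which fits the pattern in the E. coli numbers above.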
Comparative genomics: Comparing genomes of pathogenic vs.
non-pathogenic E. coli reveals regions involved in virulence
Figure: Comparative genomics of 3 E. coli. Green = core genome; red or blue = unique genes.

Pathogenicity islands: genes involved in causing disease are often clustered together.
• This clustering suggests that the gene clusters were acquired by horizontal gene transfer.

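
One simple way to look for candidate islands is to walk along the chromosome and report runs of consecutive genes that are not part of the core genome. This is a hypothetical sketch: the function name, the minimum cluster size, and the gene labels are assumptions, the chromosome is treated as linear, and a cluster of unique genes is only a candidate until its genes are shown to be involved in virulence.

```python
def unique_gene_clusters(gene_order, core, min_size=3):
    """Find runs of at least min_size consecutive genes that are absent from the core genome."""
    clusters, run = [], []
    for gene in gene_order:
        if gene in core:
            if len(run) >= min_size:
                clusters.append(run)
            run = []                      # a core gene ends the current run
        else:
            run.append(gene)
    if len(run) >= min_size:              # handle a run that reaches the end
        clusters.append(run)
    return clusters

# Hypothetical gene order: "c" genes are core, "u" genes are unique to this strain
order = ["c1", "u1", "c2", "u2", "u3", "u4", "c3", "u5", "u6"]
clusters = unique_gene_clusters(order, core={"c1", "c2", "c3"})
```

Here only the run of three unique genes is reported; isolated unique genes and shorter runs are ignored.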
Summary
• DNA stores information for the genes and operons of cells.
• New and inexpensive DNA sequencing technologies allow genomes of individual
microorganisms or entire microbial communities to be determined.
• Results of individual sequencing reactions are analyzed using computers, and
overlapping sequences are aligned to create a complete genome.
• The genome sequence is analyzed for genes and open reading frames
(protein coding regions), and these are assigned function based on similarities
to other known genes.
• The genome sequences of individual species, or members of complex
microbial communities can be determined. The sequences provide information
about metabolism, virulence, physiology, phylogeny, and structure of
microorganisms that cannot be isolated or cultured in a laboratory.
• Genome sequence information can be used to analyze gene expression.
• mRNA from cells is converted into DNA by an enzyme (reverse transcriptase). The resulting DNA is
then sequenced, using a method called RNA-Seq. The abundance and
sequence of transcripts can then be analyzed.
• Comparative genomics looks at differences between two or more similar microorganisms to
see what genes are different or unique.
