Genomics-Lectures 1 To 8 - 2023 PDF
• A+T or C+G
DNA
Molecule of Life
Deoxyribonucleic Acid
The double-helix structure
of DNA was
discovered in 1953
by James Watson
and Francis Crick.
Rosalind Franklin
Developed
techniques for X-ray
diffraction
photographs
Franklin’s X-ray
pictures of DNA fibers
were key to the
discovery of the
structure of DNA.
X-Ray Diffraction
Using models,
Watson and Crick
were able to come
up with the
structure of DNA
that matched the
pattern in
Franklin’s
photographs.
NOBEL PRIZE
Watson and Crick,
together with
Maurice Wilkins,
won the Nobel
Prize in Physiology
or Medicine in 1962.
Rosalind Franklin
died in 1958. The
Nobel Prize is
only awarded to
living scientists.
Chemical structure, nomenclature, properties of nucleotides
The molecule
without the
phosphate group is
called a nucleoside
(sugar+base).
DNA
Replication
DNA Replication
(Synthesis of the new DNA Strands)
• DNA template
• DNA polymerase
[Figure: the two template strands (5’ to 3’ and 3’ to 5’) with the forward primer annealed, followed by extension of the new strands.]
The Size of the DNA Fragment Produced in
PCR is Dependent on the Primers
• PCR amplifies the DNA section between the two primers.
• If the DNA sequence is known, primers can be developed to amplify any
piece of an organism’s DNA.
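To make the link between primers and product size concrete, here is a minimal sketch that locates a forward primer and the binding site of a reverse primer on a template and reports the length of the segment they bracket. The template, both primer sequences, and the function names are hypothetical and chosen only for illustration; exact matching on a single strand is assumed.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def amplicon_length(template, forward_primer, reverse_primer):
    """Length of the PCR product, primers included, or None if a primer has no exact match."""
    template = template.upper()
    start = template.find(forward_primer.upper())
    if start == -1:
        return None
    # The reverse primer anneals to the top strand, so its reverse
    # complement is the text that appears in the top-strand sequence.
    reverse_site = reverse_complement(reverse_primer)
    site = template.find(reverse_site, start + len(forward_primer))
    if site == -1:
        return None
    return site + len(reverse_site) - start


# Hypothetical template and primers, used only to illustrate the calculation.
template = "GGATCCATGGCTAGCTAGGATTTACCGGTTAACCGGAATTCGC"
forward = "ATGGCTAGC"
reverse = "GAATTCCGG"
print(amplicon_length(template, forward, reverse))
```

Moving either primer along the template changes the reported length, which is the sense in which the fragment size depends on the primers.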
Forward primer
Reverse primer
DNA Polymerase
• Polymerases such as Pfu, obtained from Archaea, have proofreading
mechanisms.
• Combinations of both Taq and Pfu are available
• high fidelity
• accurate amplification of DNA.
Components of PCR
dNTPs
• Typically 0.2 mM in reaction.
• Too little dNTP gives poor yield.
• Too much leads to misincorporation by the
polymerase.
• Very sensitive to freeze-thaw cycles. Should be kept in
small aliquots and discarded after a few freeze-thaw cycles.
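The 0.2 mM working concentration can be reached from a stock with the usual dilution relation C1 × V1 = C2 × V2. The sketch below assumes a 10 mM dNTP stock and a 50 µL reaction; both values and the function name are illustrative assumptions, not figures from the lecture.

```python
def stock_volume_ul(final_mm, reaction_ul, stock_mm):
    """Volume of stock (in µL) to add so the reaction reaches the final concentration.

    Applies C1 * V1 = C2 * V2, solved for V1.
    """
    if stock_mm <= 0 or final_mm > stock_mm:
        raise ValueError("stock must be positive and at least as concentrated as the target")
    return final_mm * reaction_ul / stock_mm


# Assumed values: 10 mM stock, 50 µL reaction, 0.2 mM final concentration.
print(stock_volume_ul(final_mm=0.2, reaction_ul=50, stock_mm=10))
```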
Cycles in PCR
• The most important region for the specific priming is the 3’ region of the
primer; amplification starts here. The 3’ ends should be free of secondary
structures, repetitive sequences, palindromes and highly degenerate
sequences.
• Primers that are used together should have similar Tm values and should not
differ by more than 5ºC.
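The 5 ºC rule can be checked quickly once a Tm estimate is chosen. The sketch below uses the Wallace rule, Tm ≈ 2 × (A+T) + 4 × (G+C), a rough approximation for short oligonucleotides that is not part of the lecture; the primer sequences and function names are hypothetical.

```python
def wallace_tm(primer):
    """Approximate melting temperature (°C) using the Wallace rule.

    This is a rough estimate for short primers; it ignores salt and
    primer concentration.
    """
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def tm_compatible(primer_a, primer_b, max_diff_c=5):
    """True if the two estimated Tm values differ by no more than max_diff_c."""
    return abs(wallace_tm(primer_a) - wallace_tm(primer_b)) <= max_diff_c


# Hypothetical primer pair.
forward = "ATGGCTAGCTAGGATTTACC"
reverse = "GCGAATTCCGGTTAACCGGT"
print(wallace_tm(forward), wallace_tm(reverse), tm_compatible(forward, reverse))
```

A nearest-neighbor model would give better estimates; the simple rule is used here only to show the comparison step.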
Can you design a PCR program to
amplify a gene of your interest?
Example PCR program
1. 95 °C for 5 min
2. 95 °C for 30 sec
3. 60 °C for 30 sec
4. 72 °C for 1 min (for an amplicon size of 1 kb)
5. Go to step 2; repeat steps 2 to 4 for 35 cycles
6. 72 °C for 10 min
7. 4 °C hold (stop)
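As a rough aid for adapting the example program to another amplicon, the sketch below scales the extension step at 1 min per kb (an assumption generalized from the 1 kb example above) and adds up the hold times. Ramp times between temperatures are ignored and the function names are illustrative.

```python
def extension_seconds(amplicon_bp, seconds_per_kb=60):
    """Extension time at 72 °C, assuming a fixed rate per kb of product."""
    return amplicon_bp / 1000 * seconds_per_kb


def program_minutes(amplicon_bp, cycles=35):
    """Total hold time of the example program, in minutes (ramping ignored)."""
    initial_denaturation = 5 * 60   # step 1: 95 °C for 5 min
    denaturation = 30               # step 2: 95 °C for 30 sec
    annealing = 30                  # step 3: 60 °C for 30 sec
    final_extension = 10 * 60       # step 6: 72 °C for 10 min
    per_cycle = denaturation + annealing + extension_seconds(amplicon_bp)
    return (initial_denaturation + cycles * per_cycle + final_extension) / 60


print(program_minutes(1000))   # 1 kb amplicon, as in the example program
print(program_minutes(2500))   # hypothetical 2.5 kb amplicon
```

The annealing temperature would still need to be set from the primer Tm values; only the timing is modeled here.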
Laboratory Applications of PCR
DNA Repair
• Excision repair:
1. Damaged segment is excised by a repair
enzyme (there are over 50 repair enzymes).
KEY TERMS
*tRNA
*rRNA
*codon
*anticodon
*codon: in RNA, a three-base “word” that
codes for one amino acid (e.g., AUG)
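The idea of a codon as a three-base word can be sketched in a few lines: the code below splits an mRNA string into codons and looks each one up in the standard genetic code, written in the conventional U, C, A, G ordering with "*" marking a stop codon. The example mRNA and the function name are hypothetical.

```python
BASES = "UCAG"
# Standard genetic code, one letter per codon, in UCAG order for each position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def translate(mrna):
    """Translate an mRNA string codon by codon; trailing bases that do not fill a codon are ignored."""
    mrna = mrna.upper()
    usable = len(mrna) - len(mrna) % 3
    codons = [mrna[i:i + 3] for i in range(0, usable, 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)


# Hypothetical mRNA: AUG (Met), UUU (Phe), GGC (Gly), UAA (stop).
print(translate("AUGUUUGGCUAA"))
```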
gcgtacgtacgtagagtgctagtctagtcgtagcgccgtagtcgatcgtgtgggtagtagctgatatgatgcgaggtaggggataggata
gcaacagatgagcggatgctgagtgcagtggcatgcgatgtcgatgatagcggtaggtagacttcgcgcataaagctgcgcgagatg
attgcaaagragttagatgagctgatgctagaggtcagtgactgatgatcgatgcatgcatggatgatgcagctgatcgatgtagatgca
ataagtcgatgatcgatgatgatgctagatgatagctagatgtgatcgatggtaggtaggatggtaggtaaattgatagatgctagatcgt
aggtagtagctagatgcagggataaacacacggaggcgagtgatcggtaccgggctgaggtgttagctaatgatgagtacgtatgag
gcaggatgagtgacccgatgaggctagatgcgatggatggatcgatgatcgatgcatggtgatgcgatgctagatgatgtgtgtcagta
agtaagcgatgcggctgctgagagcgtaggcccgagaggagagatgtaggaggaaggtttgatggtagttgtagatgattgtgtagttg
tagctgatagtgatgatcgtag …….
Human Genome Project
Goals:
■ Identify all the genes in human DNA,
■ Determine the sequences of the 3 billion chemical base pairs that make up human
DNA,
■ Store this information in databases,
■ Improve tools for data analysis,
■ Transfer related technologies to the private sector, and
■ Address the ethical, legal, and social issues (ELSI) that may arise from the project.
Milestones:
■ 1990: Project initiated as joint effort of U.S. Department of Energy and the National
Institutes of Health (project time and cost: 15 years and $3 billion)
■ June 2000: Completion of a working draft of the entire human genome (using DNA
from 5 individuals)
■ February 2001: Analyses of the working draft were published
■ April 2003: HGP sequencing was completed and the Project was declared finished two
years ahead of schedule
Two strategies for genome sequencing
Hierarchical Sequencing
Shotgun Sequencing
DNA Sequencing
• Length: ~ 70 cm
• Thickness: 0.1 mm
By the Numbers
• The human genome contains 3 billion chemical nucleotide bases (A, C,
T, and G).
• The average gene consists of 3000 bases, but sizes vary greatly, with
the largest known human gene being Dystrophin at 2.4 million bases.
• Almost all (99.9%) nucleotide bases are exactly the same in all people.
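• The human genome's gene-dense "urban centers" are predominantly composed of the DNA
building blocks G and C. In contrast, the gene-poor "deserts" are rich in the DNA
building blocks A and T.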
• Genes appear to be concentrated in random areas along the genome, with vast
expanses of noncoding DNA between.
• Stretches of up to 30,000 C and G bases repeating over and over often occur
adjacent to gene-rich areas, forming a barrier between the genes and the "junk DNA."
These CpG islands are believed to help regulate gene expression.
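The contrast between G+C-rich and A+T-rich regions is usually quantified as GC content. A minimal sketch, assuming a plain DNA string and non-overlapping windows (the window size, test sequence, and function names are illustrative):

```python
def gc_fraction(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) in seq."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        return 0.0
    return sum(base in "GC" for base in counted) / len(counted)


def windowed_gc(seq, window):
    """GC fraction of each consecutive, non-overlapping window of the given size."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, window)
    ]


# Hypothetical sequence: a GC-rich stretch followed by an AT-rich stretch.
print(windowed_gc("GGCCGCGCAATTATAT", 8))
```

On real data the window would be thousands of bases wide; high-GC windows would then point to candidate gene-rich regions.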
What does the human genome sequence tell us?
• Chromosome 1 has the most genes (2968), and the Y chromosome has
the fewest (231).
• Gene rich (Chromosome 19) vs. gene poor (Chromosome 13) regions
• Repeated sequences that do not code for proteins ("junk DNA") make up at least
50% of the human genome.
• Repetitive sequences are thought to have no direct functions, but they shed light
on chromosome structure and dynamics. Over time, these repeats reshape the
genome by rearranging it, creating entirely new genes, and modifying and
reshuffling existing genes.
• The human genome has a much greater portion (50%) of repeat sequences than
the mustard weed (11%), the worm (7%), and the fly (3%).
• Humans have on average three times as many kinds of proteins as the fly
or worm because of mRNA transcript "alternative splicing" and chemical
modifications to the proteins. This process can yield different protein
products from the same gene.
What does the human genome sequence tell us?
• Humans share most of the same protein families with worms, flies, and
plants; but the number of gene family members has expanded in
humans, especially in proteins involved in development and immunity.
• The ratio of germline (sperm or egg cell) mutations is 2:1 in males vs.
females. Researchers point to several reasons for the higher mutation rate
in the male germline, including the greater number of cell divisions
required for sperm formation than for eggs.
Genome Analysis
Genome sizes and number of genes
[Figure: aligned sequences … T A G C … and … T G G C …, which differ at a single base (A vs. G).]
Genotype at a SNP
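A SNP is simply a position where two aligned sequences carry different bases. The sketch below compares the two short sequences shown on the slide, TAGC and TGGC; the function name is illustrative, and the sequences are assumed to be already aligned and of equal length.

```python
def snp_positions(seq_a, seq_b):
    """Return (index, base_a, base_b) for every position where the aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (index, a, b)
        for index, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


print(snp_positions("TAGC", "TGGC"))  # [(1, 'A', 'G')]
```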
Occurrence of SNPs across the human genome
A chromosome region with only the SNPs shown. Three haplotypes are
shown. The two SNPs in color are sufficient to identify (tag) each of the
three haplotypes. For example, if a chromosome has alleles A and T at
these two tag SNPs, then it has the first haplotype.
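The tagging idea can be sketched as a small search: find the fewest SNP positions whose alleles, taken together, are different for every haplotype. The three haplotypes below are hypothetical (they are not the ones in the figure), and the brute-force search is only practical for short examples.

```python
from itertools import combinations


def tag_snps(haplotypes):
    """Smallest set of SNP indices that distinguishes every haplotype, or None if no set does."""
    n_sites = len(haplotypes[0])
    for size in range(1, n_sites + 1):
        for sites in combinations(range(n_sites), size):
            # Alleles each haplotype carries at the candidate tag positions.
            signatures = {tuple(hap[s] for s in sites) for hap in haplotypes}
            if len(signatures) == len(haplotypes):
                return sites
    return None


# Hypothetical haplotypes, listing only the SNP positions in a region.
haplotypes = ["ATTGAC", "ATCGGC", "GTCGGC"]
print(tag_snps(haplotypes))
```

Here the first haplotype is the only one with A and T at the two returned positions, mirroring the example in the caption.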
Future Challenges: Specific Research Areas
• Gene regulation
• Developmental genetics
International HapMap project
The International HapMap Project is a multi-country effort
to identify and catalog genetic similarities and differences
in human beings.
• Scientific reasons:
* Recommendation to include samples from at least 3 Old World
continents
* Pilot data showing range of haplotype frequencies
• Ethical reasons:
* No small, isolated populations
* Inclusiveness
(find some less common variation)
• Practical reasons:
* Established relationships with communities
* Funding agency interest
Timeline of the International HapMap Project
Thorisson, G.A., Smith, A.V., Krishnan, L., and Stein, L.D. (2005). The International
HapMap Project Web site. Genome Research, 15:1591-1593.