THE HUMAN
GENOME SEQUENCE
Learning outcomes
• Recognize the international collaborative effort to sequence the human
genome as a public venture
• Discuss the following topics regarding the human genome sequence:
number of genes and complexity; alternative splicing and its
consequences; types of repeats; genetic variations / single nucleotide
polymorphisms
• Recognize that this information is changing the way we diagnose and
treat diseases (molecular medicine / genomic medicine/personalized medicine)
• Discuss the following topics; discovery of new drug targets and rational
drug discovery
• Recognize that improvement to sequencing technology and cost
reduction has led to sequencing of individual genomes
• Discuss the pros and cons of personal genome sequencing
• Recognize that there is a need for consideration of a new area of ethics
and policy-level regulations regarding handling of personal genetic
information
What does genome sequencing mean?
Figuring out the order of DNA
nucleotides, or bases, in a genome—
the order of As, Cs, Gs, and Ts that
make up an organism's DNA
The human genome is the
complete set of DNA
found within the 23 chromosome pairs
in cell nuclei
Human Genome Project
The Human Genome Project
started in 1990 and involved
20 centers from 6 countries (US, UK,
China, France, Germany, Japan).
It cost about $3 billion and was declared
complete in April 2003
Sequencing an entire genome (the bases on
all 23 human chromosomes, about 3 billion
bases) is mostly done by
high-tech machines
Benefits of the genome sequencing in medicine
Better diagnosis of disease
Early detection of certain diseases
Gene therapy
Control systems for drugs
Major Findings of the Human Genome
Two major types of DNA sequences
1. Coding DNA (Genes)
2. Non-coding DNA ("Junk DNA")
Coding DNA (Genes)
• Number of genes: initially estimated at only ~35,000
(later estimates are lower, ~20,000 to 25,000 protein-coding genes)
– < 2% of the genome encodes proteins (exons)
– Fruit fly has ~13,000 genes
• Proteome (the entire set of proteins that can be expressed by a
genome) is complex: 1 gene can code for up to ~1000
proteins
- Alternative splicing
- Variation in gene regulation
- Post-translational modification
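One way to see how a single gene can give rise to many proteins through alternative splicing is to count combinations of optional exons. The sketch below is a deliberately simplified model and an assumption, not something taken from the slides: each optional ("cassette") exon is independently kept or skipped, the first and last exons are always kept, and exon order is preserved. The exon names are hypothetical.

```python
from itertools import product


def splice_isoforms(first_exon, cassette_exons, last_exon):
    """Enumerate isoforms when each cassette exon is either kept or skipped.

    Simplified model (assumption): cassette exons are independent of each
    other and exon order never changes.
    """
    isoforms = []
    for choices in product([True, False], repeat=len(cassette_exons)):
        kept = [exon for exon, keep in zip(cassette_exons, choices) if keep]
        isoforms.append([first_exon, *kept, last_exon])
    return isoforms


# Hypothetical gene with three optional exons between two fixed ones.
isoforms = splice_isoforms("E1", ["E2", "E3", "E4"], "E5")
print(len(isoforms))  # 2**3 combinations
```

Under this model, ten independent cassette exons would already give 2**10 = 1024 combinations, the same order of magnitude as the ~1000 proteins per gene quoted above. Real splicing is more constrained, so this is an upper-bound illustration only.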
Gene density
The gene density of the human genome is very low
(average ~10 genes / Mb)
compared to other known organisms,
e.g. Caenorhabditis elegans (a roundworm): ~190 genes / Mb
Gene density varies between chromosomes
Highest gene density =
Chromosome 19 (23 genes / Mb)
Also 17, 22
Mostly Pseudogenes!
Lowest gene density
Chromosomes 13 & Y (1.8 genes / Mb)
Also 4, 18
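Gene density is simply the number of genes divided by the genome (or chromosome) size in megabases. The sketch below checks the slide's human average using the ~35,000 genes and ~3 billion bases quoted in this lecture. The C. elegans inputs (about 19,000 genes in a genome of about 100 Mb) are assumed round figures that are not given in the slides.

```python
def gene_density(gene_count, genome_size_bp):
    """Return genes per megabase (1 Mb = 1,000,000 bp)."""
    return gene_count / (genome_size_bp / 1_000_000)


# Figures from the slides: ~35,000 genes in ~3 billion bases.
human = gene_density(35_000, 3_000_000_000)

# Assumed round figures for C. elegans (not from the slides).
worm = gene_density(19_000, 100_000_000)

print(f"human: {human:.1f} genes/Mb")
print(f"C. elegans: {worm:.1f} genes/Mb")
```

The human result comes out close to the ~10 genes / Mb average stated above. The same function applies to a single chromosome if its gene count and length are supplied.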
Arrangement of genes in HUMAN GENOME
Gene-dense "urban centers" are G, C rich regions
Gene-poor "deserts" are A, T rich regions
GC-rich regions appear as light bands and
AT-rich regions as dark bands when
chromosomes are stained with Giemsa and
observed under a microscope
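Whether a region is GC-rich or AT-rich can be measured directly from sequence as the fraction of bases that are G or C. The following is a minimal sketch; the toy sequence, window size, and function names are hypothetical and chosen only for illustration.

```python
def gc_content(sequence):
    """Fraction of bases in the sequence that are G or C."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def sliding_gc(sequence, window, step):
    """GC fraction for each window of the given size along the sequence."""
    return [
        (start, gc_content(sequence[start:start + window]))
        for start in range(0, len(sequence) - window + 1, step)
    ]


# Hypothetical toy sequence: a GC-rich stretch followed by an AT-rich one.
toy = "GCGCGGCCGC" + "ATATTAATAT"
for start, gc in sliding_gc(toy, window=10, step=10):
    print(start, gc)
```

On real chromosomes the windows would be far larger (thousands of bases or more); the principle is the same.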
Genes concentrated in random areas along the
genome, with vast expanses of noncoding DNA
Presence of initiator, promoter or enhancer/silencer
sequences
Presence of CpG islands
Stretches rich in CG dinucleotides (CpG islands)
occur near gene-rich areas. They form a barrier
between the genes and the "junk DNA"
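A common way to flag candidate CpG islands is to compare the observed number of CG dinucleotides with the number expected from the C and G content alone. The sketch below uses the widely cited thresholds of at least 200 bp, GC content above 50%, and an observed/expected ratio above 0.6. These thresholds are not stated in the slides and are included here as an assumption.

```python
def cpg_observed_expected(sequence):
    """Observed/expected CpG ratio: (CG count * length) / (C count * G count)."""
    sequence = sequence.upper()
    c_count = sequence.count("C")
    g_count = sequence.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return sequence.count("CG") * len(sequence) / (c_count * g_count)


def looks_like_cpg_island(sequence, min_length=200, min_gc=0.5, min_ratio=0.6):
    """Apply the assumed length, GC-content and obs/exp thresholds."""
    sequence = sequence.upper()
    if len(sequence) < min_length:
        return False
    gc = (sequence.count("G") + sequence.count("C")) / len(sequence)
    return gc > min_gc and cpg_observed_expected(sequence) > min_ratio


# Hypothetical test sequences.
print(looks_like_cpg_island("CG" * 100))  # CG-dense, 200 bp
print(looks_like_cpg_island("AT" * 100))  # no C or G at all
```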
Major Findings of the Human Genome cont.
Non coding DNA
• Introns ~24%
(interspersed between
exons within genes)
• Pseudogenes ~5%
Related to functional genes
but no longer capable of
being transcribed or translated;
their sequences carry a record
of evolutionary history
• Repeated sequences ~ 50%
Important for chromosome structure
Dynamic nature of chromosomes
Repetitive DNA sequences
Repetitive elements differ in their
- position in the genome
- sequence
- size
- number of copies
- presence or absence of coding regions within them
Many questions remain regarding their
- origin
- evolutionary mode
- functions
Repetitive sequences – some are extremely well conserved
between species while others are among the most variable,
defining differences between even closely related species
Located at relatively specific chromosome domains such as
centromeres or subtelomeric regions
Repeated sequences
Interspersed repetitive DNA
(present as single copies and dispersed or scattered throughout the genome)
• Retrotransposons (mobile DNA); some display sequence similarity to certain viruses
– SINEs (short interspersed nuclear elements)
– LINEs (long interspersed nuclear elements)
– LTRs (long terminal repeats)
• DNA transposons (mobile elements)
Tandemly repeated DNA
• Macrosatellite DNA
• Minisatellite
• Microsatellite
Transposable elements (Transposons)
Mobile elements / Jumping genes
Important in genome function and evolution (creation of new genes)
Nearly 100 examples are known of diseases caused by transposon insertions, e.g. some
types of cancer and neurological disorders.
1. Retrotransposons
• function via reverse transcription
• first transcribed into RNA
• then converted back into identical DNA
sequences using reverse transcription
• these sequences are then inserted into
the genome at target sites
• encode the protein required for insertion
2. DNA transposons
• encode the protein transposase,
required for insertion and excision
• some encode other proteins
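The difference between the two classes can be pictured as "copy and paste" (retrotransposons, whose original copy stays in place) versus "cut and paste" (DNA transposons, which are excised and reinserted). The toy model below treats the genome as a plain string and is only a sketch: it ignores the RNA intermediate, the enzymes involved, and any changes at the target site. The genome string and coordinates are hypothetical.

```python
def copy_and_paste(genome, start, end, target):
    """Retrotransposon-like move: the element stays and a copy is inserted."""
    element = genome[start:end]
    return genome[:target] + element + genome[target:]


def cut_and_paste(genome, start, end, target):
    """DNA-transposon-like move: the element is excised, then inserted.

    `target` is an index into the genome after excision.
    """
    element = genome[start:end]
    remainder = genome[:start] + genome[end:]
    return remainder[:target] + element + remainder[target:]


# Hypothetical genome; uppercase marks the mobile element.
genome = "aaaaTTTTcccc"
print(copy_and_paste(genome, 4, 8, 12))  # element now present twice
print(cut_and_paste(genome, 4, 8, 8))    # element moved, genome length unchanged
```

Note that only the copy-and-paste route makes the genome longer with each move, which is consistent with repeated sequences making up such a large share of the genome.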
Tandemly repeated DNA
Identical repeat units (internal repeats) of double-stranded DNA
arranged as tandem units (cluster / array)
Macrosatellites
• Highly repetitive
• Repeat units are 100 to 6500 bp
• Large (100 million bp) clusters
• Found in heterochromatic regions near centromeres & telomeres
Minisatellites: Variable Number Tandem Repeats (VNTRs)
• Moderately repetitive
• 20 to 100 bp per repeat unit
• Total length of tandem units 1 to 30 kb
• Number of repeats is highly variable (polymorphic)
• Found in euchromatic regions of the chromosome
Microsatellites: Simple Tandem Repeats (STRs) / Simple Sequence Repeats (SSRs)
• Moderately repetitive
• 2 to 5 bp per repeat unit
• Array length 40 to 400 bp
• Number of repeats is highly variable (polymorphic)
• Found in euchromatic regions of the chromosome
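The three classes of tandem repeat are distinguished mainly by the length of the repeat unit. The sketch below encodes the ranges given above. Two details are assumptions made for illustration: a unit of exactly 100 bp, where the two stated ranges meet, is assigned to the minisatellites, and unit lengths that fall outside every stated range are reported as "unclassified".

```python
def classify_tandem_repeat(unit_length_bp):
    """Classify a tandem repeat by the length of its repeat unit (in bp)."""
    if 2 <= unit_length_bp <= 5:
        return "microsatellite"
    if 20 <= unit_length_bp <= 100:
        return "minisatellite"
    if 100 < unit_length_bp <= 6500:
        return "macrosatellite"
    # Lengths not covered by the ranges on the slide.
    return "unclassified"


for length in (4, 33, 171, 10):
    print(length, classify_tandem_repeat(length))
```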
Human Variation
~99.9% of the sequence is similar among individuals
~0.1% varies between individuals
Microsatellites and minisatellites are
highly variable (polymorphic)
among individuals, particularly in
the number of repeats at a given locus
Useful in - Gene mapping
- DNA profiling for paternity testing
- Forensic testing
- Confirmation of relatedness and identification
of human remains
Simple Tandem Repeat Length Polymorphisms
[Figure: positions of DNA primers flanking the repeat region, and results of agarose gel electrophoresis for Persons 1 to 10]
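The reason individuals give different band positions on the gel is that the PCR product spans the repeat array, so its length grows with the number of repeat units. The sketch below illustrates that relationship. The flanking sequences, the GATA repeat unit, and the repeat counts are all hypothetical.

```python
import re


def count_tandem_repeats(sequence, unit):
    """Number of units in the longest uninterrupted run of `unit`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence)
    return max((len(run) // len(unit) for run in runs), default=0)


def amplicon_length(flank_left, flank_right, unit, repeats):
    """PCR product length: both flanks (primers included) plus the repeat array."""
    return len(flank_left) + repeats * len(unit) + len(flank_right)


# Hypothetical locus: 20 bp flanks on each side of a GATA repeat.
left = "ACGTTGCAACGTTGCAACGT"
right = "TTGACCAGTTGACCAGTTGA"
allele_a = left + "GATA" * 5 + right
allele_b = left + "GATA" * 8 + right

for allele in (allele_a, allele_b):
    n = count_tandem_repeats(allele, "GATA")
    print(n, amplicon_length(left, right, "GATA", n))
```

The two hypothetical alleles differ by three repeat units, so their products differ by 12 bp, which is the kind of length difference that separates bands on a gel.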
Human Variation
Single Nucleotide Polymorphisms (SNPs) – most
common variation in the human genome
10 million common variants
Occurs at a rate of ~1 in 300 bp
Important for:
- finding chromosomal locations of disease-associated sequences
- tracing human history
- mapping the migrations of human populations
One hot area in pharmaceutical research is the design of
personalized drug treatments based on a patient’s SNP profile
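The two SNP figures above are consistent with each other: one variant site per ~300 bp across ~3 billion bases gives about 10 million sites. The sketch below checks that arithmetic and also shows, on hypothetical toy sequences, how single-base differences between two aligned sequences can be listed. Strictly, a SNP is defined across a population, so comparing two sequences is only an illustration.

```python
def expected_variant_sites(genome_size_bp, bp_per_variant):
    """Approximate number of variant sites at a rate of 1 per `bp_per_variant`."""
    return genome_size_bp // bp_per_variant


def single_base_differences(reference, sample):
    """0-based positions where two equal-length aligned sequences differ."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i, ref_base, alt_base)
        for i, (ref_base, alt_base) in enumerate(zip(reference, sample))
        if ref_base != alt_base
    ]


print(expected_variant_sites(3_000_000_000, 300))
print(single_base_differences("ACGTACGT", "ACGTTCGT"))
```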
Sequencing of individual genomes
Cost of sequencing a human genome is dropping rapidly,
due to the continual development of new, faster, cheaper
DNA sequencing technologies such as "next-generation
DNA sequencing".
The first whole human genome sequence cost
roughly $3 billion in 2003.
In 2006, the cost had decreased to $300,000.
In 2016, the cost had decreased to $1,000.
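The scale of the cost reduction is easier to appreciate as a fold change. The sketch below uses only the three figures quoted above; the dictionary and function names are hypothetical.

```python
# Approximate cost per human genome in US dollars, as quoted on the slide.
costs_usd = {2003: 3_000_000_000, 2006: 300_000, 2016: 1_000}


def fold_reduction(costs, start_year, end_year):
    """How many times cheaper sequencing was in end_year than in start_year."""
    return costs[start_year] / costs[end_year]


print(fold_reduction(costs_usd, 2003, 2006))
print(fold_reduction(costs_usd, 2006, 2016))
print(fold_reduction(costs_usd, 2003, 2016))
```

On these figures, the cost fell about 10,000-fold in the first three years and about 3 million-fold overall.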
Genomic Medicine
Application of genomic information about an individual
as part of their clinical care
(e.g. for diagnostic or therapeutic decision-making)
Accelerated identification of disease-causing genes
Understanding - their functions / role in disease
Anticipated Benefits of Genomic Medicine:
• Improved diagnosis of disease - Rapid, precise, PCR-based (e.g. detection of mutations)
• Earlier detection of genetic predispositions to a disease - Genetic screening of
individuals / populations
• Rational drug design - Drugs will be designed with a better understanding of the disease
process at molecular level
• Gene therapy - Replacing absent / defective genes with ‘good’ genes
• Personalized, custom drugs - Identifying personal genetic makeup
Pharmacogenomics
Study of individual gene-drug interactions
- relevant to disease susceptibility
- response at the individual, cellular and tissue level
A new method of providing
targeted drug treatment:
it is possible to identify people who
react in a specific way to drug treatment
by identifying their genetic makeup
Provision of "custom drugs"
to accommodate variations in drug responses
among individuals
Disadvantages / Ethical issues of Genomic Medicine:
Privacy and confidentiality of genetic information.
Availability of Genetic information of individuals in Genetic databases
(dbSNP; haplotype mapping etc)
Fairness in the use of genetic information
by insurers, employers, courts, schools, adoption agencies,
the military
Stigmatization and discrimination due to an individual’s genetic
differences
May affect the future of many families in a negative way
Uncertainties associated with gene tests for susceptibilities
(e.g. heart disease, diabetes, and Alzheimer’s disease).
There is a need for
consideration of a new area of
ethics and policy-level
regulations regarding handling of
personal genetic information !