Lecture 1: Introducing The Power and Scope of Genetics
Learning Goals:
• That the history of genetics starts long before Mendel: early humans, as
they endeavored through selective breeding to improve domesticated
species, were the first geneticists.
• Like many areas of science, genetics is technology-driven. Today,
courtesy of remarkable single-molecule DNA-sequencing technologies,
we are in a phase of DNA sequence Big Data.
• How comparisons of genomes allow us to infer rates of genetic change
(i.e. the rate at which new mutations occur).
• Karyotypic analysis – the direct viewing of chromosomes – allows us to
visualize genomes divided up into their constituent chromosomes.
• That phylogenetic analysis – at its largest scale, reconstructing the
branching pattern of the tree of life – is based on genetic similarity among
organisms.
• Two taxa are closely related if they have a recent common ancestor,
meaning there hasn’t been enough time for substantial genetic
difference to accumulate between them.
• Genealogical analysis through pedigrees permits us to investigate the
transmission of traits within families.
• That inbreeding results in an increase in the frequency of homozygotes.
• Homozygosity of deleterious recessive alleles causes inbreeding
depression.
• Genome-wide Association Mapping allows us to identify genomic regions
in which variation correlates with the traits we are interested in.
• Apply the logic of trio (parents, offspring) analysis to the assessment of the
de novo mutation rate.
• Perform basic phylogenetic analysis based on the levels of genetic
similarity among organisms: high similarity implies a recent common
ancestor.
• Interpret phylogenetic patterns.
• Interpret pedigrees (i.e. recognize the symbols used, etc.).
• Distinguish between correlation and causation in genetic analysis.
Lecture Notes:
The formal science of genetics originates with the work of Gregor Mendel (who
published his famous work on pea plants in 1866) but, as a less scientific
endeavor, genetics has much more ancient origins. Our ancestors who so
patiently bred animals – domesticating cattle and dogs, for example, from wild
animals – and plants – wheat and corn, for example – were the original
geneticists. For them, the key knowledge was simply that offspring tend to
resemble their parents. If you want a cow with a high milk yield, you should
breed from one with a high milk yield. Now, since the discovery of the double
helix structure of DNA, genetics has become a molecular science, with,
accordingly, emphasis on reading and understanding the genetic code itself,
DNA sequence.
Fred Sanger introduced the original effective way to sequence DNA, the
dideoxy method [see lecture 4], in the mid-1970s. This remained the industry
standard for about 30 years (and it remains important and widely used to this
day) until, spurred by the technological demands of sequencing the
human genome (the first reasonably complete draft was released in 2003),
the first “next gen” (next generation) methods became available around
2007/2008. These high-tech, high-throughput methods have revolutionized
genetics because sequence data can now be collected stunningly efficiently
and cheaply. The cost of sequencing a human genome – about 3 x 10^9
base pairs – has fallen from about $100 million in 2001 to about $1000 in 2020.
This is astoundingly fast technological evolution: molecular biology has even
significantly outstripped the pace of progress in the computer industry (in which
Moore’s Law [named for the erstwhile boss of Intel] states [accurately] that the
rate of computer evolution is exponential, with the power of computers
doubling about every two years). The result in genetics is a new data-rich big
data science.
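The comparison with Moore's Law can be made concrete with a little arithmetic. The sketch below uses only the cost figures quoted above and assumes, purely for illustration, that the cost fell as a smooth exponential (the real decline was uneven). The variable names are hypothetical.

```python
import math

# Figures quoted in the notes: cost per human genome in US dollars.
cost_2001 = 100_000_000
cost_2020 = 1_000
years = 2020 - 2001

# Number of times the cost halved, assuming a smooth exponential decline
# (an assumption for illustration only).
halvings = math.log2(cost_2001 / cost_2020)
halving_time_years = years / halvings

# Moore's Law as stated in the notes: a doubling about every two years.
moore_fold_improvement = 2 ** (years / 2)

print(f"fold reduction in sequencing cost: {cost_2001 / cost_2020:,.0f}")
print(f"implied cost halving time: {halving_time_years:.2f} years")
print(f"Moore's Law improvement over the same period: {moore_fold_improvement:,.0f}-fold")
```

Under these assumptions the cost of sequencing halved roughly every 14 months, noticeably faster than a two-year doubling time.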
1. Human mutation rate. How many new mutations arise in the generation of
each new human? There are two sources of mutation: in the generation of the
father’s sperm, and in the generation of the mother’s egg. In the past, we had
to make convoluted inferential estimates of this key parameter; now, however,
we can simply count. We can analyze “trios”: two parents and their offspring.
We sequence the entire genomes of Mom, of Dad, and of Junior. Then we play
“spot the difference”: any mutations in Junior that were not present in Mom or
Dad are de novo mutations that arose in the generation of the gametes that
gave rise to Junior. It turns out that each of us receives about 70 new mutations,
more from Dad than from Mom (probably because there are more cell divisions
– and therefore more opportunities for replication error [i.e. mutation] – in the
production of sperm than in the production of eggs).
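The “spot the difference” logic of trio analysis can be sketched in a few lines. This is a toy illustration, not a real variant-calling pipeline: the genomes, positions, and the function name `de_novo_variants` are all invented, and real analyses must also deal with sequencing errors and sites with missing coverage.

```python
# Each genome is modelled as a dict: (chromosome, position) -> set of alleles
# observed at that site. All data below are invented for illustration.
def de_novo_variants(child, mother, father):
    """Return sites where the child carries an allele seen in neither parent."""
    found = {}
    for site, child_alleles in child.items():
        parental = mother.get(site, set()) | father.get(site, set())
        new_alleles = child_alleles - parental
        if new_alleles:
            found[site] = new_alleles
    return found

mom = {("chr1", 1000): {"A"}, ("chr2", 500): {"C", "T"}, ("chr3", 42): {"G"}}
dad = {("chr1", 1000): {"A", "G"}, ("chr2", 500): {"C"}, ("chr3", 42): {"G"}}
junior = {
    ("chr1", 1000): {"A", "G"},
    ("chr2", 500): {"C", "T"},
    ("chr3", 42): {"G", "T"},
}

candidates = de_novo_variants(junior, mom, dad)
print(candidates)

# Rough per-site rate implied by the figure in the notes (about 70 new
# mutations per child), assuming a diploid genome of 2 x 3e9 base pairs.
rate_per_bp = 70 / (2 * 3e9)
print(f"approximate rate: {rate_per_bp:.2e} per base pair per generation")
```

In the toy data only the `T` at the chr3 site is absent from both parents, so it is the single candidate de novo mutation. The last lines convert the count of about 70 into an approximate per-base-pair rate.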
Because the problem spread out over time from its Northeastern origin, the best
guess for a cause was some kind of infectious disease. The hunt was on for a
bacterium or virus. The premise here is to apply Koch’s postulates, named for
the 19th century microbiologist who formalized the criteria for determining
disease causation:
1. The microbe must be found in abundance in all animals suffering from the
disease, but should not be found in healthy animals.
2. The microbe must be isolated from a diseased animal and grown in pure
culture.
3. The cultured microbe should cause disease when introduced into a
healthy animal.
4. The microbe must be re-isolated from the inoculated, diseased
experimental host and determined to be identical to the original.
The hunt proved forlorn. No microbes were implicated. What could be going
on? One hint came from analyses of karyotypes – i.e. the structure and number
of chromosomes in a cell. Cancerous growths like the devil sores often have
weird chromosomal arrangements – these seem to be a by-product of the
uncontrolled, unregulated growth of cancer cells – and the devil sores were no
exception to this rule. What, however, was curious about this was that it seemed
that the sores from multiple individuals had the same weird karyotypes. This is
unexpected, as cancers typically each have their own idiosyncratic karyotypes.
populations that are geographically distant from each other – and therefore
seldom exchanging migrants – will accumulate different, independent
mutations, leading them gradually to become more different from each other
over time. In the case of the devils, mitochondrial DNA (mtDNA) was the
genetic marker of choice. Each mitochondrion in a cell has its own small loop of
DNA (about 17 kb in most mammals). Because there are large numbers of
mitochondria in each cell, mtDNA is actually the commonest DNA segment in
tissues, which means it is easy to extract and work with. mtDNA studies found, as
expected, that devil populations are genetically differentiated (though not very
much).
The shocking result came when the mtDNAs of the facial sores were analyzed.
They seemed to be cancers, in which case the assumption was that each one
was derived from the host animal’s cells, meaning that the mtDNA of the sore
should match its host’s mtDNA, just as the mtDNA of a human cancer – say, a
lung cancer – would be expected to match the patient’s mtDNA. But, no, the
mtDNA of the sore and of the sore’s host did not match. Rather, all the sores
had the same, or a very similar, mtDNA sequence, suggesting a single origin.
This result, taken in concert with the identical karyotype observation, suggests
that the disease is a transmissible cancer. Because the animals are very
aggressive towards each other – biting is a frequent form of interaction – it
seems that they readily transmit the cancer to each other. Mysteries, however,
remain. Why, for example, does an animal’s immune system not reject the alien
invading cancer cells? One possibility is that there is relatively little genetic
variation in the immune system, meaning that the animals do not have the
immunogenetic ability to distinguish between self and non-self.
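The mtDNA comparison that points to a transmissible cancer can be illustrated by counting differences between aligned sequences. The sketch below uses short invented sequences and hypothetical sample names. Real mtDNA is about 17 kb and would first need to be aligned.

```python
from itertools import combinations

def differences(seq_a, seq_b):
    """Count mismatched positions between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))

# Invented toy sequences: (host mtDNA, tumor mtDNA) for three animals.
samples = {
    "devil_1": ("ACGTACGTAC", "ACGTTCGTAA"),
    "devil_2": ("ACGAACGTAC", "ACGTTCGTAA"),
    "devil_3": ("ACGTACGTGC", "ACGTTCGTAA"),
}

# Does each tumor match the animal it was taken from?
host_vs_tumor = {
    name: differences(host, tumor) for name, (host, tumor) in samples.items()
}

# Do the tumors from different animals match each other?
tumor_vs_tumor = [
    differences(samples[a][1], samples[b][1]) for a, b in combinations(samples, 2)
]

print("host vs own tumor:", host_vs_tumor)
print("tumor vs tumor:", tumor_vs_tumor)
```

In this toy data every tumor differs from its own host but the tumors are identical to one another, which is the pattern described above: the sores share a single origin rather than arising independently in each animal.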
• Vaccine. Progress has been made in creating a vaccine that primes the
animals’ immune system to respond aggressively to the tumor cells. The
question remains, however, about how, in practice, the wild population
could be vaccinated.
the immune system – as one might have expected) the ability to resist
the cancer.
3. The medical genetics of pain. People with the rare disorder Congenital
Insensitivity to Pain (CIP) – meaning that they cannot feel pain – have a
life-threatening condition because there are no triggers to prevent them from
inadvertently doing self-destructive things. They feel nothing if they put their
finger into the flame of a candle, but the candle burns their flesh nevertheless.
That it sometimes runs in families suggests that there may be genetic factor(s)
underlying it.
In the case of CIP, inbred pedigrees from three separate families helped identify
a particular locus, SCN9A, as implicated in the disease. Interestingly, each of
the three had different mutations in the gene, but all the mutations destroyed
the function of the protein it encodes, Nav1.7. This protein is a sodium
channel in the nervous system.
Curiously, pedigree analysis of another neurological disease, Erythromelalgia,
which is rather the neurological opposite of CIP with patients having heightened
sensitivity to heat and pain, and experiencing acute pain in response even to
mild stimuli, also implicated SCN9A/Nav1.7. Here, however, the key mutations
did not shut down the function of the protein. Nav1.7 plays a central role in
some neurons in the transmission of the action potential – the wave of
membrane depolarization/repolarization that sweeps along the nerve cell to
relay information from one end of the cell to the other. This process is mediated
by the flow of sodium and potassium ions, so it is not difficult to see how
mutations in SCN9A (and therefore changes in Nav1.7) might affect nervous
function. In the case of CIP, the sodium channel simply does not work, so there
is no transmission of information – no pain. In the case of Erythromelalgia, the
mutations cause the sodium channel to go into over-drive, producing a flow of
sodium ions that provokes not a single action potential but, rather, a whole string
of spikes. The upshot: acute pain.
This discussion of rare diseases may seem arcane but the insights they have
yielded may be important both medically and commercially. Pain
management, as the ongoing opioid public health crisis bears eloquent witness,
is a major medical issue of today. Hopefully the increased understanding of the
neuroscience of pain afforded by these analyses will prove helpful in this regard.
The question is a simple one: if you get COVID-19, what determines how sick you
get? As you know, there are many well established factors, most notably your
age and the extent to which you have other medical conditions. But what of
two healthy 30-year olds, A and B, one of whom barely notices when they get
infected while the other ends up in Intensive Care? What is responsible for the
difference? One possibility (of several) is genetics: genetic differences
between A and B may account for the difference in their responses to infection.
How do we go about looking for this?
Often results are presented graphically in ‘Manhattan Plots’ (so-called for their
visual similarity to the NYC skyline). The y-axis shows the strength of the
correlation between genetic variation and the trait we’re interested in; the
x-axis shows the location of the genetic variation in the genome, laid out
chromosome by chromosome.
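The values plotted on the y-axis of a Manhattan plot are typically -log10 of a p-value, so stronger associations stand taller. The sketch below shows one simple way such a score could be computed for a single variant: a chi-square test on allele counts in cases versus controls. This test is chosen here for illustration and is not necessarily what any particular study used. All counts, positions, and function names are invented.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0

def p_value_1df(chi2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

# Invented allele counts per variant:
# (chromosome, position, alt in cases, ref in cases, alt in controls, ref in controls)
variants = [
    ("chr1", 1_200_000, 210, 790, 200, 800),
    ("chr3", 45_800_000, 320, 680, 200, 800),
    ("chr7", 9_500_000, 195, 805, 205, 795),
]

scores = {}
for chrom, pos, a, b, c, d in variants:
    p = p_value_1df(chi_square_2x2(a, b, c, d))
    scores[chrom] = -math.log10(p)  # the height of the point on the plot
    print(f"{chrom}:{pos}  -log10(p) = {scores[chrom]:.2f}")
```

In this toy data the chr3 variant is much more common in cases than in controls, so it towers over the others, like a skyscraper in the skyline. A tall peak shows correlation only. It does not by itself show which gene in the region, if any, is causal.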
For COVID-19 susceptibility, there is one region of the genome, on the third
chromosome, that gives a strong signal. This means that the variants you have
at genes in this region – there are seven genes in the region – may have an
effect on how you respond to COVID-19 infection. It’s worth noting, however,
that this is a small effect relative to other factors like age. What has all this got to
do with Neanderthals? Remarkably, it turns out that the susceptible DNA
sequence that has been identified in these studies is derived from Neanderthals.
A phylogeny of sequences for this region found across humans shows clearly
that the susceptible sequences are either identical to Neanderthal sequences
or descended from Neanderthal sequences. Neanderthals haunt us still.
Concept Check:
Human genome
DNA sequencing
Sanger sequencing
Next gen sequencing
Trio analysis
Karyotype
Phylogeny
mtDNA
Natural selection
Pedigree
Allele
Homozygous
Deleterious allele
Inbreeding depression
Ancient DNA
Genome-wide Association Studies
Manhattan Plot