OpenGeneticsLectures Fall2017
Lectures
Fall 2017
Department of Biological Sciences – University of Alberta, Canada
Also available:
Online Open Genetics!
To be successful in Introductory Genetics, you are encouraged to use the supplementary electronic
resources provided by the website for Online Open Genetics (http://opengenetics.net). These
resources will help you learn and practice problem-solving skills and self-assess your knowledge as you
progress through the course.
The website provides:
(1) access to short instructional videos and
(2) supplementary readings, as well as
(3) interactive exercises.
All of these will help you deepen your understanding of basic concepts in genetics, and practice and
refine the skills needed to solve common problems in genetic analysis.
These supplementary materials can be accessed using the internet via web browsers on Windows or
Mac computers, or on tablets and iPads. More information is available on the web page.
Definitions:
Gene - a hereditary unit that occupies a specific position (locus) within the genome or chromosome
and has one or more specific effects upon the phenotype of the organism and can mutate into
various forms (alleles) and can recombine with similar such units.
Gene locus -(plural = loci) The specific place on a chromosome where a gene is located
Allele - refers to one of the different forms of a gene that can exist at a single gene locus
Genotype - the specific allelic composition of a cell or organism. Normally only the genes under
consideration are listed in a genotype and the alleles at all the remaining gene loci are
considered to be wild type.
Phenotype - the detectable outward manifestation of a specific genotype. In describing a phenotype
usually only the characteristics under consideration are listed while the remaining characters
are assumed to be wild type (normal).
Allelic mutations- two mutations at the same gene locus
Non-allelic mutations- two mutations that affect different gene loci
Genetic Nomenclature & Symbols - What you need to know
Geneticists use a variety of different nomenclature systems to represent genes and their mutations in
different organisms. You will need to become familiar with these different systems in order to
understand genetics and answer questions on an exam.
(Definitions are taken with modification from Griffiths et al., 2000 and A Dictionary of Genetics, 4th Ed., King & Stansfield, 1990.)
Figure 1.
Parent and offspring Wolf’s Monkey.
(Flickr- Eric Heupel - CC BY-NC-ND 2.0)
were S-strain, pathogenic cells recoverable. Thus, some non-living component from the S-type strains
contained genetic information that could be transferred to and transform the living R-type strain
cells into S-type cells.
Figure 2.
Colonies of Rough (left) and Smooth (right) strains of S. pneumoniae.
(J. Exp.Med.98:21, 1953-R. Austrian-Pending)
2. AVERY, MACLEOD AND MCCARTY’S EXPERIMENT (1944)
What kind of molecule from within the S-type cells was responsible for the transformation? To answer
this, researchers named Avery, MacLeod and McCarty separated the S-type cells into various
components, such as proteins, polysaccharides, lipids, and nucleic acids. Only the nucleic acids from
S-type cells were able to make the R-strains smooth and fatal. Furthermore, when cellular extracts of
S-type cells were treated with DNase (an enzyme that digests DNA), the transformation ability was
lost. The researchers therefore concluded that DNA was the genetic material, which in this case
controlled the appearance (smooth or rough) and pathogenicity of the bacteria.
Figure 3.
Experiments of Griffith and of Avery, MacLeod and McCarty. R strains of S. pneumoniae do not cause
lethality. However, DNA-containing extracts from pathogenic S strains are sufficient to make R
strains pathogenic.
(Wikipedia - Modified by Deyholos- CC BY-NC 3.0)
Like all viruses, T2 hijacks the cellular machinery of its host to manufacture more viruses. The T2
phage itself contains only protein and DNA, and no other class of potential genetic material. To
determine which of these two types of molecules contained the genetic blueprint for the virus,
Hershey and Chase grew viral cultures in the presence of radioactive isotopes of either phosphorus
(32P) or sulphur (35S). The phage incorporated these isotopes into their DNA and proteins,
respectively (Figure 5). The researchers then infected E. coli with the radiolabeled viruses, and
looked to see whether 32P or 35S entered the bacteria. After ensuring that all viruses had been
removed from the surface of the cells, the researchers observed that infection with 32P labeled
viruses (but not the 35S labeled viruses) resulted in radioactive bacteria. This demonstrated that
DNA was the material that contained genetic instructions.
4. RNA AND PROTEIN
While DNA is the genetic material for the vast majority of organisms, there are some viruses that
use RNA as their genetic material. These viruses can be either single- or double-stranded. Examples
include SARS virus, influenza virus, hepatitis C virus and polio virus, as well as retroviruses such
as HIV. Retroviruses typically use DNA at some stage in their life cycle to replicate their RNA
genome.
Also, the prion protein is an infectious agent that
transmits characteristics via only a protein (no
nucleic acid present). Prions infect by transmitting
a mis-folded protein state from one aberrant
protein molecule to a normally folded molecule.
These agents are responsible for Bovine
Spongiform Encephalopathy (BSE, also known as
"mad cow disease") in cattle, Chronic Wasting
Disease in deer, scrapie in sheep and Creutzfeldt–Jakob
disease (CJD) in humans. All known prion
diseases act by altering the structure of the brain
or other neural tissue and all are currently
untreatable and ultimately fatal.
Figure 5.
When 32P-labeled phage infects E. coli, radioactivity is found
only in the bacteria, after the phage are removed by
agitation and centrifugation. In contrast, after infection
with 35S-labeled phage, radioactivity is found only in the
supernatant that remains after the bacteria are removed.
(Wikipedia –Modified by Deyholos- CC BY-NC 3.0)
___________________________________________________________________________
SUMMARY:
• Genetics is the scientific study of heredity and the variation of inherited characteristics.
• Heredity is the concept that a trait of an individual can be passed down through generations
• A gene can be defined abstractly as a unit of inheritance.
• The experiments done by Griffith and by Hershey and Chase showed that DNA from bacteria and
viruses can transfer genetic information into bacteria, which demonstrates that DNA is the genetic
material and suggests that it is universal.
• Some viruses use RNA as their genetic material and can be either single or double stranded.
• A prion is a mis-folded protein that transmits its mis-folded state to a normally folded protein.
KEY WORDS:
genetics transform
heredity Avery, MacLeod, & McCarty
Mendel DNase
blending inheritance Hershey and Chase
particulate inheritance bacteriophage
gene 35S
allele 32P
Griffith prion
STUDY QUESTIONS:
1) Imagine that returning astronauts provide you
with living samples of multicellular organisms
discovered on another planet. These organisms
reproduce with a short generation time like our
standard yeast species. Initial observations
about their reproduction indicate that they also
require two “sexual types” to mate, but nothing
else is known about how their genetics works.
a) How could you define laws of heredity for
these organisms?
b) How could you determine what molecules
within these organisms contained genetic
information?
c) Would the mechanisms of genetic
inheritance likely be similar for all
organisms from this planet?
d) Would the mechanisms of genetic
inheritance likely be similar to organisms
from earth?
2) It is relatively easy to extract DNA and protein
from cells; biochemists have been doing this
since at least the 1800’s. Why then did Hershey
and Chase need to use radioactivity to label
DNA and proteins in their experiments?
3) Starting with mice and R and S strains of S.
pneumoniae, what experiments, in addition to
those shown in Figure 3 can be used to
demonstrate that DNA is THE genetic material
and the only genetic material?
4) Mendel put forth a “particulate inheritance”
model – alleles, dominant, recessive, etc. At the
time there was a “blended inheritance” model,
which is like mixing paint colours (analogy).
Suggest an analogy for Mendel’s particulate
model, taking into account the dominant and
recessive characters of alleles.
Figure 5.
The positions of the 14N and 15N
containing DNA in the density
gradient tube on the left.
(Wikipedia-LadyofHats-PD)
3. CHROMOSOME REPLICATION (E. COLI) - CAIRNS EXPERIMENT
If the results of Meselson and Stahl were true and there was semi-conservative replication, then the
two strands of DNA have to separate to provide the template for copying. This should be seen as a
‘fork’ in a linear model if you manage to see the DNA just as it’s replicating. John Cairns in 1963
chose to test this.
To do this he took E. coli cells growing in a normal environment, and then allowed them to grow and
replicate in the presence of radioactive 3H-thymidine. The hypothesis is that if the E. coli’s DNA
or chromosome is semi-conservatively replicated then after the first round of replication there
should be one newly made strand that is radioactive, or “hot”, and the other strand that is the
parental template strand with no radioactivity, so is “cold”. The original parental DNA will have
two strands, each not radioactive. After replication the daughter DNA will have two strands, one
that is radioactive and one that is not. After a third round of replication there will be two types
of daughter DNA, one that has a non-radioactive strand and a radioactive strand, and one that has
two radioactive strands.
After growth in the 3H-thymidine, Cairns lysed the bacteria and collected the contents onto a
microscope slide. He then covered the slide with a photographic emulsion and allowed exposure to
film for 2 months. As the 3H-thymidine decays it emits an electron with a lot of energy and speed,
known as a beta particle. The emulsion reacts with the beta particle creating a black silver grain
on the film. The density of grains should be indicative of whether one or two strands are
radioactive.
After the first replication cycle, the film had a thin circular ring of grains (Figure 6). This was
interpreted to be a daughter chromosome with one strand that is hot and one strand cold. This also
provided physical evidence that the E. coli chromosome is circular, something that had only
previously been shown genetically.
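The strand-labelling prediction behind this experiment can be traced with a minimal Python sketch. It assumes perfectly synchronous rounds of replication, starts from a single fully “cold” chromosome, and treats every newly made strand as fully “hot”. The function names `replicate` and `hot_strand_counts` are illustrative only.

```python
from collections import Counter

def replicate(duplexes):
    """One round of semi-conservative replication in 3H-thymidine medium.

    Each duplex is a pair of strands. Both parental strands are kept, and
    each is paired with a newly synthesized (assumed fully labelled) strand.
    """
    daughters = []
    for strand_a, strand_b in duplexes:
        daughters.append((strand_a, "hot"))
        daughters.append((strand_b, "hot"))
    return daughters

def hot_strand_counts(duplexes):
    """Count duplexes by how many of their two strands are hot (0, 1 or 2)."""
    return Counter(duplex.count("hot") for duplex in duplexes)

# Start with one unlabelled parental chromosome (an assumption of this sketch).
population = [("cold", "cold")]
for round_number in range(1, 4):
    population = replicate(population)
    print(round_number, dict(hot_strand_counts(population)))
```

Under these assumptions, the first round gives only duplexes with one hot strand, matching the thin ring of grains. Duplexes with two hot strands appear from the second round onward, alongside the two duplexes that still carry an original cold strand.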
In the second replication cycle the replication fork was seen. Here Cairns saw the typical thin ring
of grains much like the first replication cycle, but with a branch in the middle that had a thicker
strand (Figure 6). This means that the branch seen was an actively replicating chromosome, using the
radioactive strand of DNA as a template, and adding more radioactive thymidine as the DNA is being
synthesized. Because of the shape these created on the film this replicating structure was called a
theta (Θ) structure. Cairns observed many different molecules corresponding to the progression from
starting replication to the completion of replication.
Figure 6.
[Panels: autoradiograph and interpretation after one round and two rounds of replication.]
In his experiment, Cairns looked at DNA with radioactive thymidine on an autoradiograph film, with
the radioactive thymidine leaving dots on the film. This figure shows what the autoradiograph film
would look like, and below it the interpretation of what the autoradiograph shows. The blue line
represents the ‘cold’ DNA that has no radioactivity, while the red shows the ‘hot’ radioactive DNA.
The density of the dots on the autoradiograph implies whether there is one strand or both strands of
hot DNA. During the second round of replication, a theta structure can be seen, as the circular E.
coli DNA is in the process of being replicated. (Original-L.Canham- CC BY-NC 3.0)
Here Cairns’ results were able to further support the semi-conservative replication theory, showing
the existence of replication forks, as well as the hypothesis that E. coli has a circular
chromosome. What Cairns did not realize is that replication goes in both directions: he thought one
fork was static while the other went around the chromosome replicating. Scientists later went on to
show that replication is in fact bidirectional.
4. ORIGINS OF REPLICATION (PROKARYOTE - SINGLE ORIGIN), REPLICATION FORK
When the cell enters S-phase in the cell cycle (See Chapter 14) the entire chromosomal DNA is
replicated. This is done by enzymes called DNA polymerases. All DNA polymerases synthesize new
strands by adding nucleotides to the 3'OH group present on the previous nucleotide. For this reason
they are said to work in a 5' to 3' direction. DNA polymerases use a single strand of DNA as a
template upon which they synthesize the complementary sequence. This works fine for the middle of
chromosomes. DNA-directed DNA polymerases travel along the original DNA strands making
complementary strands (Figure 7a).
Figure 7.
DNA polymerases make new strands in a 5' to 3' direction. (a) Regular DNA polymerases are proteins
or protein complexes that use a single strand of DNA as a template. For example, the main human DNA
polymerase, Pol α, is a large protein complex made of four polypeptides. (b) Telomerases use their
own RNA as a template. The human telomerase is a complex made of one polypeptide and one RNA
molecule. (Original-Harrington- CC BY-NC 3.0)
DNA replication in both prokaryotes and eukaryotes begins at an Origin of Replication (Ori). Origins
are specific sequences on specific positions on the chromosome. In E. coli, the OriC origin is ~245
bp in size. Chromosome replication begins when the DnaA initiator protein binds to an AT-rich 9-mer
in OriC and melts the two strands. Then DnaC loader protein helps DnaB helicase protein extend the
single stranded regions such that the DnaG primase can initiate the synthesis of an RNA primer, from
which the DNA polymerases can begin DNA synthesis at the two replication forks. The forks continue
in opposite directions until they meet another fork or the end of the chromosome (Figure 8).
Figure 8.
An origin of replication. The sequence-specific DNA duplex is melted, then the primase synthesizes
RNA primers from which bidirectional DNA replication begins as the two replication forks head off in
opposite directions. The leading and lagging strands are shown along with Okazaki fragments. Note
the 5’ and 3’ orientation of all strands.
(Original-Locke- CC BY-NC 3.0)
5. EUKARYOTE CHROMOSOME REPLICATION - MULTIPLE ORIGINS
In prokaryotes, with a small, simple, circular chromosome, only one origin of replication is needed
to replicate the whole genome. For example, E. coli has a ~4.5 Mb genome (chromosome) that can be
duplicated in ~40 minutes assuming a single origin, bi-directional replication, and a speed of ~1000
bases/second/fork for the polymerase.
However, in larger, more complicated eukaryotes, with multiple linear chromosomes, more than one
origin of replication is required per chromosome to duplicate the whole chromosome set in the 8
hours of the replicative phase (S-phase) of the cell cycle. For example, the human diploid genome
has 46 chromosomes (6 x 10^9 base pairs). The shortest chromosomes are ~50 Mbp long and so could not
possibly be replicated from one origin. Additionally, the rate of replication fork movement is
slower, only ~100 bases/second. Thus, eukaryotes contain multiple origins of replication distributed
over the length of each chromosome to enable the duplication of each chromosome within the observed
time of S-phase (Figure 9).
Figure 9.
Part of a eukaryotic chromosome showing multiple Origins (1, 2, 3) of Replication, each defining a
replicon (1, 2, 3). Replication may start at different times in S-phase. Here #1 and #2 begin first
then #3. As the replication forks proceed bi-directionally, they create what are referred to as
“replication bubbles” that meet and form larger bubbles. The end result is two semi-conservatively
replicated duplex DNA strands.
(Original-Locke- CC BY-NC 3.0)
6. TELOMERES
The ends of linear chromosomes present a problem: at each end one strand cannot be completely
replicated because there is no primer to extend and replace the end RNA primer. While the loss of
such a small sequence might not be a problem, the continued rounds of replication would result in
the continued loss of sequence from the chromosome end. Ultimately, the losses would reach a point
where essential gene sequences would be lost and the organism would die. Thus, this end DNA must be
replicated. Most eukaryotes solve the problem of synthesizing this unreplicated, end DNA with a
specialized DNA polymerase called telomerase, in combination with a regular polymerase. Telomerases
are RNA-directed DNA polymerases. They are a riboprotein, as they are composed of
OPEN GENETICS LECTURES – FALL 2017 PAGE 7
CHAPTER 02 – DNA STRUCTURE AND REPLICATION
both protein and RNA. As Figure 10 shows, these enzymes contain a small piece of RNA that serves as
a portable and reusable template from which the complementary DNA is synthesized. The RNA in human
telomerases uses the sequence 3'-AAUCCC-5' as the template, and thus our telomeric DNA has the
complementary sequence 5'-TTAGGG-3' repeated over and over, thousands of times. After the telomerase
has made the first strand, a primase synthesizes an RNA primer and a regular DNA polymerase can then
make a complementary strand so that the telomere DNA will ultimately be double stranded to the
original length (Figure 10).
Note: the number of repeats, and thus the size of the telomere, is not set. It fluctuates after each
round of the cell cycle. Because there are many repeats at the end, this fluctuation maintains a
length buffer (sometimes it’s longer, sometimes it’s shorter), but the average length will be
maintained over the generations of cell replication.
In the absence of telomerase, as is the case in human somatic cells, repeated cell division leads to
the “Hayflick limit”, where the telomeres shorten to a critical limit and then the cells enter a
senescence phase of non-proliferation. The inappropriate activation of telomerase expression permits
a cell and its descendants to become immortal and bypass the Hayflick limit. This happens in cancer
cells, which can form tumours, as well as in cells in culture. HeLa cells, which can be propagated
essentially indefinitely, have been kept in culture since 1951 (See Chapter 41).
Figure 10.
Telomere replication showing the completion of the leading strand and incomplete replication of the lagging strand. The gap is
replicated by the extension of the 3’ end by telomerase and then filled in by extension of an RNA primer.
(Original-Locke- CC BY-NC 3.0)
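The relationship between the telomerase RNA template and the telomeric repeat can be checked with a small Python sketch. It takes the template written 3' to 5', as in the text, and returns the complementary DNA written 5' to 3'. The function name `dna_from_rna_template` and the pairing table are illustrative choices for this sketch.

```python
# RNA base -> complementary DNA base (Watson-Crick pairing).
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}

def dna_from_rna_template(template_3_to_5):
    """Return the DNA strand (5'->3') synthesized on an RNA template (3'->5').

    Because the two strands are antiparallel, reading the template 3'->5'
    and writing the product 5'->3' needs no reversal, only base pairing.
    """
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in template_3_to_5)

repeat = dna_from_rna_template("AAUCCC")
print(repeat)      # TTAGGG
print(repeat * 3)  # three tandem repeats: TTAGGGTTAGGGTTAGGG
```

Reusing the same short template over and over is what produces the long tandem array of identical repeats at the chromosome end.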
___________________________________________________________________________
SUMMARY:
• DNA is a double helix made of two anti-parallel strands of bases on a sugar-phosphate backbone.
• Specific bases on opposite strands pair through hydrogen bonding (A=T and G=C), ensuring
complementarity of the strands.
• The hereditary information is present as the sequence of bases along the DNA strand.
• Chromosome replication begins at an origin and proceeds by DNA polymerases at a replication fork.
• Replication proceeds bi-directionally.
• Typically eukaryotes have multiple origins along each chromosome, while prokaryotes have only one.
• Eukaryotes have telomerase to complete the replication of the ends of chromosomes.
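The replication-time figures quoted in this chapter for E. coli and for human chromosomes follow from simple arithmetic, sketched here in Python. The sketch assumes evenly spaced origins that all fire at the same moment, with two forks per origin moving at a constant rate. Real eukaryotic origins fire at different times in S-phase, so the origin count is only a rough lower bound. The function names are illustrative.

```python
import math

def replication_time_s(length_bp, fork_rate_bp_per_s, origins=1):
    """Seconds to copy a chromosome with bidirectional forks from each origin."""
    forks = 2 * origins
    return length_bp / (forks * fork_rate_bp_per_s)

def min_origins(length_bp, fork_rate_bp_per_s, time_limit_s):
    """Smallest number of simultaneously firing origins that meets a time limit."""
    return math.ceil(length_bp / (2 * fork_rate_bp_per_s * time_limit_s))

# E. coli: ~4.5 Mb, one origin, ~1000 bases/second/fork (figures from the text).
ecoli_min = replication_time_s(4.5e6, 1000) / 60
print(f"E. coli: {ecoli_min:.1f} min")  # 37.5 min, i.e. roughly 40 minutes

# Shortest human chromosome: ~50 Mbp at ~100 bases/second/fork, one origin.
human_h = replication_time_s(50e6, 100) / 3600
print(f"Human, single origin: {human_h:.1f} h")  # 69.4 h, far more than 8 h

# Origins needed to finish a ~50 Mbp chromosome within an 8-hour S-phase.
print(min_origins(50e6, 100, 8 * 3600))  # 9
```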
KEY TERMS:
deoxyribonucleic acid E. coli
nucleotides Meselson and Stahl
purine Nitrogen-14
adenine Nitrogen-15
guanine light
pyrimidine heavy
cytosine CsCl gradient
thymine John Cairns
phosphodiester bond 3H-thymidine
ribonucleic acid photographic emulsion
dideoxynucleotide silver grain
Watson and Crick theta structure
Chargaff’s Rules bidirectional
double helix DNA polymerases
anti-parallel Origin of replication
right-handed replicon
major groove replication bubble
minor groove telomerase
semi-conservative riboprotein
conservative Hayflick limit
dispersive HeLa cells
STUDY QUESTIONS:
1) Compare Watson and Crick’s discovery with Avery, MacLeod and McCarty’s discovery.
a) What did each discover, and what was the impact of these discoveries on biology?
b) How did Watson and Crick’s approach generally differ from Avery, MacLeod and McCarty’s?
c) Briefly research Rosalind Franklin on the internet. Why is her contribution to the
structure of DNA controversial?
2) List the information that Watson and Crick used to deduce the structure of DNA.
3) Refer to Watson and Crick’s
a) List the defining characteristics of the structure of a DNA molecule.
b) Which of these characteristics are most important to replication?
c) Which characteristics are most important to the Central Dogma?
4) Refer to Figure 3.
a) Identify the part of the DNA molecule that would be radioactively labeled in the
manner used by Hershey & Chase
b) DNA helices that are rich in G-C base pairs are harder to separate (e.g. by heating)
than A-T rich helices. Why?
5) Are the ends of eukaryote, linear chromosomes static and fixed in length? Explain.
Figure 1.
Most, but not all, genes code for
proteins. They are transcribed into
mRNA, which is then translated
into polypeptides.
(pixabay-PublicDomainPictures- CC0
1.0)
INTRODUCTION
How is the genetic information in DNA (genes) expressed as biological traits, such as the flower
color of Mendel’s peas? The answer lies in what has become known as molecular biology’s Central
Dogma. While not all genes code for proteins, most do. This chapter describes the Central Dogma and
some experiments that were used to support this concept.
1. CENTRAL DOGMA
The Central Dogma of Biology describes the concept that genetic information is encoded in DNA in the
form of genes (Figure 2). This information is then transferred as needed, in a process called
transcription, into a messenger RNA (mRNA) sequence. The information is then transferred again, in a
process called translation, into a polypeptide (protein) sequence. The sequence of bases in DNA
directly dictates the sequence of bases in the RNA, which in turn dictates the sequence of amino
acids that make up a polypeptide.
The original core of the Central Dogma is that genetic information is NEVER transferred from protein
back to nucleic acids. In certain circumstances, the information in RNA may be converted back to DNA
through a process called reverse transcription. As well, DNA, and its information, can also be
replicated (DNA→DNA).
Proteins do most of the “work” in a cell. They (1) catalyze the formation and breakdown of most
molecules within an organism, as well as (2) form their structural components, and (3) regulate the
expression of genes. By dictating the sequence and thus structure of each protein, DNA directs the
function of that protein, which can thereby affect the entire organism. Thus the genetic
information, or genotype, defines the potential form, or phenotype, of the organism. Note, however,
that the environment can also influence phenotype.
Figure 2.
Central Dogma of molecular biology.
(Original-Locke/Kang- CC BY-NC 3.0)
In the case of Mendel’s peas, purple-flowered plants have a gene that encodes an enzyme that
produces a purple pigment molecule. In the white-flowered plants (a purple-less mutant), the DNA for
this gene has been changed, or mutated, so that it no longer encodes a functional protein. This is
an example of a spontaneous, natural mutation in a gene coding for an enzyme in a biochemical
pathway.
2. GENES CODE FOR ENZYMES – A. GARROD
Life depends on (bio)chemistry to supply energy and to produce the molecules that construct and
regulate cells. In 1908, Archibald Garrod described “in-born errors of metabolism” in humans, using
the congenital disorder, alkaptonuria (black urine disease), as an example of how “genetic defects”
(genotype) led to the lack of an enzyme in a biochemical pathway and caused a disease (phenotype).
People with alkaptonuria have black urine because a chemical, called “alkapton”, makes urine black
when exposed to air. In normal people, enzymes catalyze the reaction to break down alkapton, but
people who are born with the disease, due to a genetic defect, cannot make such enzymes and
therefore cannot break down alkapton. Garrod’s work had a huge impact on modern genetics as it
attempted to explain the biochemical mechanism behind the genes proposed in Mendelian genetics.
3. BEADLE AND TATUM: PROTOTROPHIC AND AUXOTROPHIC MUTANTS
In 1941, over 30 years after Garrod’s discovery, Beadle and Tatum built on this connection between
genes and metabolic pathways. Their research led to the “one gene, one enzyme (or protein)”
hypothesis, which states that each enzyme that acts in a biochemical pathway is encoded by a
different gene. Although we now know of many exceptions to the “one gene, one enzyme” principle, it
is generally true that each different gene produces a protein that has a distinct catalytic,
regulatory, or structural function.
Beadle and Tatum used the fungus Neurospora crassa (a bread mold) for their studies because it had
practical advantages as a laboratory model organism. They knew that Neurospora was prototrophic,
meaning that it could grow on minimal medium (MM). Minimal medium lacked most nutrients, except for
a few minerals, simple sugars, and one vitamin (biotin). Prototrophs can synthesize the amino acids,
vitamins, etc. necessary for normal growth.
They also knew that by exposing Neurospora spores to X-rays, they could randomly induce mutations in
genes (now known as damage to the DNA leading to DNA sequence change). Each spore exposed to X-rays
potentially contained a mutation in a different gene. While most mutagenized spores were still able
to grow on minimal medium (prototrophic), some spores had mutations that changed their phenotype
from a prototroph into an auxotrophic strain, which could no longer grow on minimal medium. Instead
these auxotrophs could grow on complete medium (CM), which was MM supplemented with nutrients, such
as amino acids and vitamins, etc. (Figure 3). In fact, some auxotrophic mutants could grow on
minimal medium with only one, single nutrient supplied, such as the amino acid arginine. This
implied that each auxotrophic mutant was blocked at a specific step in a biochemical pathway and
that by adding an essential compound, such as arginine, that block could be circumvented.
Figure 3.
A single mutagenized spore is used to establish a colony of genetically identical fungi, from which
spores are tested for their ability to grow on different types of media. Because spores of this
particular colony are able to grow only on complete medium (CM), or on minimal medium supplemented
with arginine (MM+Arg), they are considered Arg auxotrophs and we infer that they have a mutation in
a gene in the Arg biosynthetic pathway. This type of screen is repeated many times to identify other
mutants in the Arg pathway and in other pathways.
(Original-Deyholos-CC BY-NC 3.0)
4. ONE GENE: ONE ENZYME HYPOTHESIS LED TO BIOCHEMICAL PATHWAY DISSECTION USING GENETIC SCREENS AND
MUTATIONS
Beadle and Tatum’s experiments are important not only for their conceptual advances in
understanding genes, but also because they demonstrate the utility of screening for genetic mutants
to investigate a biological process (genetic analysis).
Beadle and Tatum’s results were useful to investigate biological processes, specifically the
metabolic pathways that produce amino acids. For example, Srb and Horowitz in 1944 tested the
ability of the amino acids to rescue auxotrophic strains. They added one of each of the amino acids
to minimal medium and recorded which of these restored growth to independent mutants.
Figure 4.
A simplified version of the Arg biosynthetic pathway, showing citrulline (Cit) and ornithine (Orn)
as intermediates in Arg metabolism. These chemical reactions depend on enzymes represented here as
the products of three different genes.
(Original-Deyholos- CC BY-NC 3.0)
A convenient example is arginine. If the progeny of a mutagenized spore could grow on minimal medium
only when it was supplemented with arginine (Arg), then the auxotroph must bear a mutation in the
Arg biosynthetic pathway and was called an “arginineless” strain (arg-).
Synthesis of even a relatively simple molecule such as arginine requires many steps, each with a
different enzyme. Each enzyme works sequentially on a different intermediate in the pathway (Figure
4). For arginine (Arg), two of the biochemical intermediates are ornithine (Orn) and citrulline
(Cit). Thus, mutation of any one of the enzymes in this pathway could turn Neurospora into an Arg
auxotroph (arg-). Srb and Horowitz extended their analysis of Arg auxotrophs by testing the
intermediates of amino acid biosynthesis for the ability to restore growth of the mutants (Figure 5).
Figure 5.
Testing different Arg auxotrophs for their ability to grow on media supplemented with intermediates
in the Arg biosynthetic pathway. (Original-Deyholos- CC BY-NC 3.0)
They found that only Arg could rescue all of the Arg auxotrophs, while either Arg or Cit could
rescue some (Table 1). Based on these results, they deduced the location of each mutation in the Arg
biochemical pathway (i.e. which gene was responsible for the metabolism of which intermediate).
Mutants in:   MM + Orn   MM + Cit   MM + Arg
gene A        Yes        Yes        Yes
gene B        No         Yes        Yes
gene C        No         No         Yes
Table 1.
Ability of auxotrophic mutants of each of the three enzymes of the Arg biosynthetic pathways to grow
on minimal medium (MM) supplemented with Arg or either of its precursors, Orn and Cit. Gene names
refer to the labels used in Figure 4.
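The reasoning used to place each mutation in the Arg pathway can be written as a short Python sketch. It assumes the simplified linear pathway of Figure 4, with the gene order implied by Table 1 (gene A acts before Orn, gene B converts Orn to Cit, gene C converts Cit to Arg), and a single block per mutant. The function names `grows_on` and `infer_mutant_gene` are illustrative.

```python
# Simplified pathway from Figure 4 / Table 1 (gene labels as in the text):
#   precursor --gene A--> Orn --gene B--> Cit --gene C--> Arg
STEPS = ["A", "B", "C"]           # gene whose enzyme performs each step
PRODUCTS = ["Orn", "Cit", "Arg"]  # compound produced by each step

def grows_on(mutant_gene, supplement):
    """True if a mutant blocked at `mutant_gene` grows on MM + supplement.

    A supplement rescues growth when it enters the pathway at or after the
    product of the blocked step, so the block is bypassed.
    """
    return PRODUCTS.index(supplement) >= STEPS.index(mutant_gene)

def infer_mutant_gene(growth):
    """Infer the blocked gene from a dict like {'Orn': False, 'Cit': True, 'Arg': True}."""
    for gene in STEPS:
        if all(grows_on(gene, s) == growth[s] for s in PRODUCTS):
            return gene
    return None  # pattern not explained by a single block in this pathway

# Regenerate the rows of Table 1 from the pathway order.
for gene in STEPS:
    print("gene", gene, ["Yes" if grows_on(gene, s) else "No" for s in PRODUCTS])

print(infer_mutant_gene({"Orn": False, "Cit": True, "Arg": True}))  # B
```

The general rule is that a mutant grows on every compound made after its blocked step and on none made before it, so the earliest compound that rescues growth marks the position of the block.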
___________________________________________________________________________
SUMMARY:
• The Central Dogma describes the information flow from nucleic acids to proteins.
• Garrod's observations showed that there is a connection between genes and enzymes.
• Beadle and Tatum proposed that one gene encodes one enzyme.
• Their work was an example of how to screen for genetic mutants, and thereby characterize
biochemical pathways or biological processes.
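The information flow described by the Central Dogma (first bullet above) can be illustrated with a minimal Python sketch. It assumes the input DNA is the coding (sense) strand, uses the standard genetic code, and simply starts translation at the first AUG, which ignores the real initiation machinery. The function names and the example sequence are illustrative.

```python
BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def transcribe(coding_dna):
    """DNA -> RNA: the mRNA matches the coding strand with U in place of T."""
    return coding_dna.upper().replace("T", "U")

def reverse_transcribe(rna):
    """RNA -> DNA: the reverse of the substitution above."""
    return rna.upper().replace("U", "T")

def translate(mrna):
    """RNA -> protein: read codons from the first AUG until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)

mrna = transcribe("ATGGCCAAGTAA")
print(mrna)             # AUGGCCAAGUAA
print(translate(mrna))  # MAK (one-letter amino acid codes)
```

There is deliberately no function that goes from protein back to nucleic acid, which mirrors the core claim of the Central Dogma.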
KEY TERMS:
Central Dogma prototroph
transcription minimal medium
translation auxotroph
reverse transcription complete medium
genotype genetic screen
phenotype genetic analysis
Beadle & Tatum rescue
metabolic pathway arginine
one-gene:one-enzyme genetic screen for mutations
Neurospora crassa
STUDY QUESTIONS:
1) Compare Figure 4 and Table 1. Suppose you created three new arg- mutations called mutants
#1, #2, & #3. #1 grew on MM+cit and MM+arg, #2 grew on only MM+arg, while #3 grew on
MM+ orn, cit or arg. Which genes are #1, 2, & 3 mutant in (A, B, or C)?
2) Why was the vitamin biotin (see Section #3) always added to the MM?
3) Last century, A. Garrod, and later Beadle and Tatum, showed that genes encode enzymes.
From what we know now, do all genes encode enzymes? Explain.
4) Most mutant proteins differ from wild type (normal) by a single substitution at a specific
amino acid site. Explain how some amino acid changes result in:
a) no loss of protein function,
b) only partial loss-of-function,
c) complete loss-of-function,
d) and how do changes at different amino acid sites result in the same complete
loss-of-function.
5) Some mutants result in the loss of a specific enzyme activity. Does this mean that no protein
product is produced from that mutant gene?
6) The molecular weight of the A and B chains of E. coli tryptophan synthase are 29,500 and
49,500, respectively. The size of the entire enzyme is 159,000.
a) If the average molecular weight of each amino acid is 110, then how many amino
acids are present in each chain?
b) How many chains does the whole enzyme contain? Explain.
7) Recall that Neurospora is an orange coloured bread mould. This biochemical pathway
below is how wild type cells become orange. None of the compounds are essential. Cells
containing W are white, cells with Y are yellow, and cells with O are orange. Assume that
the reactions will go to completion if possible.
Fill in this table with the colours of the cell cultures.
Strain          MM+W   MM+Y   MM+O
gene1+ gene2+
gene1- gene2+
gene1+ gene2-
gene1- gene2-
CHAPTER 04 – COMPLEMENTATION
Figure 1.
In Chinese philosophy the yin yang symbol suggests opposite forces,
such as two mutations, can actually be complementary and how
together they can give rise to a whole, as with complementation in
genetics.
(Wikipedia-Gregory Maxwell-PD)
INTRODUCTION
How do genetic researchers determine whether two mutants that have similar phenotypes are
mutant in the same gene or in different genes? One way is by determining if the genes are located
at a similar or different location. If they are different, they must be in different genes and thus
are not allelic. If they are located in the same region then a complementation test is used. These
consist of classical Mendelian genetic crosses to see if one mutant can complement another, or give
a wild type phenotype. More recently, transformation of DNA with a gene has been used
to see if putting a single gene into a cell/organism can rescue a mutant phenotype.
1. COMPLEMENTATION TESTS AND ALLELISM
As explained in the previous chapter, mutant screening is one of the starting points
geneticists use to investigate biological processes. Geneticists can observe two independently derived
mutants with similar phenotypes, through a mutant screen or in natural populations. An
immediate question from this observation is whether the mutant phenotype is due to a
loss of function in the same gene, or whether they are mutant in different genes that both cause the same
phenotype (e.g., in the same pathway). In other words, are they allelic mutations or non-allelic
mutations, respectively? This question can be resolved using complementation tests, which bring
together, or combine, the two mutations under consideration into the same organism to assess the
combined phenotype.
Figure 2.
In this simplified biochemical pathway, two enzymes encoded by two different genes modify chemical
compounds in two sequential reactions to produce a purple pigment. Loss of either of the enzymes
disrupts the pathway and no pigment is produced. (Original-Deyholos-CC BY-NC 3.0)
The easiest way to understand a complementation test is by example (Figure 2). The pigment in a
purple flower could depend on a biochemical pathway much like the biochemical pathways
OPEN GENETICS LECTURES – FALL 2017 PAGE 1
leading to the production of arginine in Neurospora (Chapter 3). A diploid plant that lacks the function
of gene A (genotype aa) would produce mutant white flowers that phenotypically looked just like
the white flowers of a plant that lacked the function of gene B (genotype bb). The products of both A
and B are enzymes in the same pathway that leads from a colorless compound #1, through colorless
compound #2, to the purple pigment. Blocks at either step will result in a mutant white flower
instead of the wild type purple flower.
mutant). These could be either the exact same mutant alleles (same base pair changes), or
different mutations (different base pair changes, but in the same gene - allelic).
Conversely, if the F1 progeny all appear to be wild type (Case 2 - Figure 3B), then each of the parents
most likely carries a mutation in a different gene. These mutations would then be called non-allelic
mutations - mutant in a different gene locus. These mutations DO COMPLEMENT one another.
Note: For mutations to be used in complementation tests they are (1) usually true-
breeding (homozygous at the mutant locus), and (2) must be recessive mutations. Dominant and
semi-dominant mutations CANNOT be used in complementation tests, since these mutations
won’t show complementation effects of two non-allelic genes. (3) Note that haploid organisms like
Neurospora cannot be used in complementation tests since they have only one set of chromosomes.
(4) Also, remember, some mutant strains may have more than one gene locus mutated and thus
would fail to complement mutants from more than one other locus (or group).
2. COMPLEMENTATION GROUPS = GROUPS OF ALLELIC MUTATIONS
So, with the third mutant strain above, we could assign it to be allelic with either gene A or gene B,
or some other locus, should it complement both gene A and gene B mutations. If they came from
different natural populations or from independently mutagenized individuals, we could
have a fourth, fifth, sixth, etc. white flower strain, and then we could begin to organize the allelic
mutations into groups, which are called complementation groups. These are groups of
mutations that FAIL TO COMPLEMENT one another (a group of NON-complementing mutations) and
are assumed to have mutations in the SAME gene; hence they are grouped as a complementation group.
A group can consist of as few as one mutation and as many as all the mutants under study. Each group
represents a set of mutations in the same gene (allelic). The number of complementation groups
represents the number of genes that are represented in the total collection of mutations.
It all depends on how many mutations you have in a gene. For example, the white gene in Drosophila
has >300 different mutations described in the literature. If you were to obtain
and cross all these mutations to one another, you would find they all belonged to the same
complementation group, or same white gene. Each complementation group represents a gene.
If, however, you obtained a different mutation, vestigial for example, which affects wing growth,
and crossed it to a white eye colour mutation, the double heterozygote would have red eyes and
normal wings (wild type for both characters), so the two would complement and represent two
different complementation groups: (1) white, (2) vestigial. The same would be true for the other
eye-colour mutations mentioned elsewhere in this text. For example, if you crossed a scarlet eye-
colour mutant to a white eye-colour mutant, the double heterozygote would have wild type red
eyes. Each mutant has the wild type allele of the other. Again, remember that all the other genes in
the diploid genome are assumed to be wild type.
To drive home the concept of complementation groups, we will look at two hypothetical
examples.
2.1. EXAMPLE ONE: MULTIPLE MUTANT COMPLEMENTATION TEST
The first example shows the results of a series of crosses as a complementation test table (Figure 4)
with six mutants labeled a to f. The mutants fall into three complementation groups in total: (1) a,
(2) b, c, f and (3) d, e. Notice that a complementation group can consist of only one
mutant, or more than one.
Figure 4.
Complementation test table showing which flower mutant strains complement each other and which
do not. “w” stands for white flowers, which is mutant (no complementation), and “p” stands for
purple, which represents wild type (complementation). Blanks are for crosses not done.
(Original-Di Cara-CC BY-NC 3.0)
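The grouping logic behind a complementation test table can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the function name and the list of non-complementing pairs are hypothetical (the pairs are chosen to match the three groups described for Figure 4, not copied from the figure), and every mutant is assumed to carry a mutation in only one gene.

```python
def complementation_groups(mutants, fails_to_complement):
    """Sort mutants into groups whose members fail to complement one another.

    mutants: list of mutant names.
    fails_to_complement: pairs whose F1 progeny were mutant (e.g. white, "w").
    Assumes each strain is mutant at a single locus (no "double hits").
    """
    parent = {m: m for m in mutants}  # each mutant starts in its own group

    def find(m):
        # Follow links until we reach the representative of m's group.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for m1, m2 in fails_to_complement:
        parent[find(m1)] = find(m2)  # same gene: merge the two groups

    groups = {}
    for m in mutants:
        groups.setdefault(find(m), []).append(m)
    return sorted(sorted(g) for g in groups.values())


mutants = ["a", "b", "c", "d", "e", "f"]
# Hypothetical crosses that gave mutant (white) F1 progeny.
white_f1 = [("b", "c"), ("b", "f"), ("c", "f"), ("d", "e")]
groups = complementation_groups(mutants, white_f1)
print(groups)  # [['a'], ['b', 'c', 'f'], ['d', 'e']]
```

The number of groups returned is the number of genes represented in the collection. A strain with mutations at two loci would incorrectly merge two groups in this simple sketch, which is why the single-locus assumption matters.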
[vector + DNA insert] molecule can be replicated and the result would be multiple clones of the
original DNA insert.
(4) After the E. coli DNA fragments that were once a single long DNA molecule are inserted into DNA
vectors, we have a collection of recombinant DNA molecules, which when transformed, can be called
a DNA library. Among all the recombinant DNA molecules in the library, there are three
possibilities (Figure 7): (1) DNA clones that contain gene a, (2) DNA clones that don’t contain gene a,
which will be collectively represented by the letter b and (3) DNA clones that don’t contain any foreign
genes.
(5) Combine the recombinant DNA molecules and host E. coli strain together so that the auxotrophic
strain can incorporate those DNA molecules through transformation. Growing the strains on
minimal and complete media will let us decide if the transformation rescue worked or not.
The host strain’s genotype is a-b+. It needs a wild type a+ in order to grow on minimal medium.
Therefore, cells carrying plasmids that have the a+ allele would grow (prototrophic), and other cells
that have either plasmids with no transgene or plasmids with gene b would still be auxotrophic.
Notice that the plasmids contain an antibiotic resistance gene called AntiR and that the strains
were actually grown on minimal medium that contained antibiotics. Why was this so? This is
because we want to select for the ones that actually incorporated the plasmid that contained
the wild-type “a” gene.
Only a small fraction of cells is actually transformed by foreign DNA. Therefore, if we grow those strains
on agar plates without antibiotics, we cannot tell whether the growth was due to the
complementation between the host DNA and the recombinant DNA or to some reversion back to
wild type. There is a small possibility that the cells that weren’t transformed could somehow
synthesize the essential substrate due to a spontaneous mutation. Adding the antibiotic
selection will remove cells that weren't transformed and therefore don't contain a plasmid
with the antibiotic resistance gene, and select for the cells that were successfully transformed and
complemented by the recombinant DNA.
Figure 7.
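The expected plate results for the transformation rescue can be summarized as a small decision rule. The Python sketch below is a simplification with hypothetical names: it assumes every plasmid carries the AntiR gene, and it ignores the rare spontaneous revertants mentioned in the text.

```python
def grows(plasmid, medium, antibiotic):
    """Predict whether an a- b+ host cell forms a colony.

    plasmid: None (untransformed), "empty", "a+", or "b" (hypothetical labels).
    medium: "minimal" or "complete".
    antibiotic: True if the plate contains the antibiotic.
    """
    if antibiotic and plasmid is None:
        return False  # no plasmid means no AntiR gene, so the cell is killed
    if medium == "complete":
        return True   # complete medium supplies what the a- mutant cannot make
    return plasmid == "a+"  # on minimal medium only a+ rescues the auxotroph


for plasmid in (None, "empty", "b", "a+"):
    print(plasmid, grows(plasmid, "minimal", antibiotic=True))
```

Only cells carrying the a+ plasmid are predicted to grow on minimal medium with antibiotic, which is the colony class the experiment is designed to recover.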
STUDY QUESTIONS:
1) You are working with a prototrophic model organism (e.g. a fungus). You are interested in
finding genes involved in synthesis of proline (Pro), an amino acid that is normally
synthesized by this organism.
a) How would you design a mutant screen to identify genes required for Pro synthesis?
b) Imagine that your screen identified ten mutants (labeled #1 through #10) that grew
very poorly unless supplemented with proline. How could you determine the
number of different genes represented by these mutants?
c) If each of the ten mutants represents a different gene, what will be the phenotype
of the F1 progeny if any pair of the ten mutants are crossed?
d) If all of the ten mutants represent the same gene, what will be the phenotype of
the F1 progeny if any pair of the ten mutants are crossed?
2) Draw the expected results of a series of complementation tests (crosses), in the form of
a table, for five yeast mutant strains where there are at least three different mutant loci,
and one of the mutations involves a double hit (two loci are mutant in the same strain).
3) Students create a mutant E. coli strain that is auxotrophic for methionine. Three students
build plasmid DNA libraries from wild type DNA from the parental strain. Student A uses EcoRI
to clone the restriction fragments. Student B uses HindIII and student C uses XhoI. Each
transforms the auxotrophic mutant strain with their library. Student A gets lots of prototrophic
colonies on minimal medium, while students B and C don’t get any. Explain what might have
happened. The students’ control experiments indicate that the transformation protocol
worked.
4) Figure 7 shows how we can rescue an a– strain with a plasmid carrying an a+ gene. Could we
also rescue this strain by growing the cells on media containing Enzyme A (the product of the
a+ gene)? How about the product of Enzyme A?
are usually dispersed around the genome (not clustered).
5.2. SMALL RNAS (AN INCOMPLETE LIST):
snRNA (small nuclear RNA) reside in the nucleus and form an RNA-protein complex called the
“spliceosome” that processes the primary mRNA transcripts into the mature mRNA. In this complex,
it is the RNA molecule, not the protein component, that has the catalytic activity.
snoRNA (small nucleolar RNA) act as guides for the modification of other RNA molecules such as
snRNA or rRNA.
miRNA (microRNA) is a single-stranded RNA molecule that is about 22 nucleotides long, and regulates
gene expression at both the transcriptional and post-transcriptional levels.
siRNA (small interfering RNA) are short double stranded RNA molecules (about 21-24 bp long) and
are also involved in RNA interference (RNAi) regulation of genes and post-transcriptional gene
silencing (PTGS). These small interfering RNAs are similar in size and function to miRNAs but come
from a different RNA precursor. Double-stranded RNA molecules are chopped into shorter fragments
that are still double-stranded, which are called siRNAs. For example, RNA is transcribed from
centromeric sequences and modified into siRNA fragments. This double stranded siRNA is separated
into two strands. One of the strands forms a complex with other proteins and this complex finds
its precursor (the centromeric sequence) and modifies it into highly condensed chromatin,
producing heterochromatin.
piRNA (Piwi-interacting RNA) are single stranded RNA molecules of 24-32 nucleotides in length that
interact with a protein called piwi, and this RNA-protein complex affects epigenetic and
post-transcriptional gene silencing. For example, it can block the transcription from DNA
transposons by turning the normal chromatin into heterochromatin.
Each of these types of small RNAs is transcribed from a gene.
Figure 3.
Structure of a gene contains many components. This is a mix of prokaryote and eukaryote gene structure. For example, a
polycistronic operon is found in prokaryotes, while the distant enhancer/silencer elements are found in eukaryotes. ORF is open
reading frame; UTR is untranslated region; RBS is ribosome binding site.
(Wikipedia-Thomas Shafee-CC BY-SA 4.0)
6. GENES: DNA TRANSCRIBED INTO MRNA, TRANSLATED INTO A POLYPEPTIDE (PROTEIN
CODING GENES)
Protein coding genes consist of both a regulatory and a transcribed sequence. The DNA is transcribed
into an mRNA that is then translated into polypeptides as directed by the cis-regulatory
elements in combination with various trans-acting factors. One thing to note is that a gene is depicted
as simply a “block” or a “line” in various diagrams but it actually contains more than that; there are
many components inside a gene (Figure 3). For example, in prokaryotes, a gene consists of:
- regulatory sequences: enhancer/silencer + operator + promoter
- transcribed sequences (transcription unit): 5’UTR (untranslated region) + open reading frame + 3’UTR
(includes the terminator).
The open reading frame (ORF) refers to the sequence beginning at the start codon, through to
the stop codon. Also, many diagrams depict these components as a “single unit” or as a cluster
sequence on the same chromosome, but in eukaryotes, regulatory regions can be very far away
(kilobases upstream/downstream) from the transcription unit.
7. HOW ARE GENES AND OTHER SEQUENCES DISTRIBUTED IN THE GENOME?
In our genome, genes are interspaced by inter-genic regions that contain interspersed repeats
such as SINE (short interspersed elements) or LINE (long interspersed elements). Organisms that have
smaller genes, such as bacteria, tend to have less inter-genic DNA compared to organisms that have
larger genes, such as yeast, Drosophila and mammals.
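The definition of an open reading frame given in Section 6 (start codon through to the stop codon) can be made concrete with a short Python sketch. The function name and the example sequence are hypothetical, and the sketch assumes the input is the coding (non-template) DNA strand written 5' to 3', so the start codon appears as ATG and the stop codons as TAA, TAG or TGA.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_orf(dna):
    """Return the first ORF (start codon through stop codon), or None.

    Scans for ATG, then reads in-frame triplets until a stop codon is found.
    If an ATG has no in-frame stop codon, the next ATG is tried.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        for i in range(start, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return dna[start:i + 3]  # include the stop codon
        start = dna.find("ATG", start + 1)
    return None


# Hypothetical sequence: lowercase flanks stand in for the 5' and 3' UTRs.
print(first_orf("ccATGGCTTAAgg"))  # ATGGCTTAA
```

The bases outside the returned ORF correspond to the untranslated regions of the transcription unit. Real gene-finding tools consider both strands, all reading frames and minimum lengths, which this sketch deliberately leaves out.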
Figure 4.
Comparison of the gene distribution between prokaryote and different eukaryotic organisms.
(Original-Locke- CC BY-NC 3.0)
From: http://sandwalk.blogspot.ca/2011/05/whats-in-your-genome.html
___________________________________________________________________________
SUMMARY:
• The Central Dogma states that information in nucleic acids (DNA and RNA) is translated into protein,
but it can never go back in the opposite direction. More recently it has been described as information
flows from DNA->RNA->protein.
• The definition of a gene is changing as new discoveries are made but in general, a gene is a unit of
inheritance that has a locus on a chromosome, can affect an organism’s phenotype, can exist in various
forms, and can recombine with other such units.
• DNA sequences can be divided into two main categories: functional and non-functional. The functional
DNA can be divided into (1) DNA acting directly, (2) DNA transcribed into RNA, which functions directly,
and (3) DNA transcribed into mRNA, which is translated into a polypeptide, which has a function.
• There are various kinds of RNA molecules that are involved in protein synthesis, DNA replication, post-
transcriptional modification, and gene regulation.
• Protein coding genes contain regulatory sequences and transcribed sequences. The transcript (mRNA)
contains a 5’ untranslated region (5’UTR), the open reading frame (ORF), and the 3’ untranslated
region (3’UTR).
KEY TERMS:
Central Dogma non-protein coding
transcription RNA encoding
translation Structural DNA
reverse transcription rRNA
genotype tRNA
phenotype snRNA
genes snoRNA
alleles miRNA
genetics siRNA
ORF piRNA
protein coding
STUDY QUESTIONS:
1) Provide a definition of a gene that includes all
types of genes (e.g. more than just a protein
coding gene).
2) Is all DNA in a genome part of a gene? Does all
DNA have a function? Explain.
3) Do all transcribed RNA molecules end up as
mRNA transcripts?
4) What is the UTR on an mRNA?
5) Does a segment of DNA have to be transcribed
in order to be a gene?
Figure 1.
Electron micrograph of growing E. coli.
Some show the constriction at the location
where daughter cells separate. The
colouring is false.
(Flickr-NIAID-CC BY 2.0)
INTRODUCTION
With most organisms, every cell contains essentially the same genomic sequence. How then
do cells develop and function differently from each other? The answer lies in the regulation of gene
expression. Only a subset of all the genes is expressed (i.e. active) in any given cell participating
in a particular biological process. Gene expression is regulated at many different steps along the
process that converts DNA information into proteins. In the first stage, transcript abundance
can be controlled by regulating the rate of transcription initiation and processing, as well as
the degradation of transcripts. In many cases, higher abundance of a gene’s transcripts is
correlated with its increased expression. We will focus on transcriptional regulation in E. coli (Figure
1). Be aware, however, that cells also regulate the overall activity of genes in other ways, for
example, by controlling the rate of mRNA translation, processing, and degradation, as well as
the post-translational modification of proteins and protein complexes.
1. THE LAC OPERON – A MODEL PROKARYOTE GENE
Early insights into mechanisms of transcriptional regulation came from studies of E. coli by François
Jacob & Jacques Monod. In E. coli, and many other bacteria, genes encoding several different
polypeptides may be located in a single transcription unit called an operon. The genes in an
operon share the same transcriptional regulation, but are translated individually into separate
polypeptides. Most prokaryote genes are not organized as operons, but are transcribed
individually yielding single peptide units.
Eukaryotes do not group genes together as operons (an exception is C. elegans and a few other
species).
Figure 2.
Diagram of a segment of an E. coli chromosome containing the lac operon, as well as the lacI coding
region. The various genes and cis-elements are not drawn to scale.
(Original-Deyholos-CC BY-NC 3.0)
1.1. BASIC LAC OPERON STRUCTURE
E. coli encounters many different sugars in its environment. These sugars, such as lactose and
glucose, require different enzymes for their metabolism. The genes for three of the enzymes for
lactose metabolism are grouped in the lac operon: lacZ, lacY, and lacA (Figure 2). lacZ encodes an
enzyme called β-galactosidase, which digests lactose into its two constituent sugars: glucose and
galactose. lacY encodes a permease that helps to transfer lactose into the cell. Finally, lacA
encodes a trans-acetylase; the relevance of which in lactose metabolism is not
entirely clear. Transcription of the lac operon normally occurs only when lactose is available for it
to digest. Presumably, this avoids wasting energy in the synthesis of enzymes for which no substrate
is present. In the lac operon there is a single mRNA transcript that includes coding sequences for all
three enzymes and is called a polycistronic mRNA. A cistron in this context is equivalent to a gene.
1.2. CIS- AND TRANS- REGULATORS
In addition to these three protein-coding genes, the lac operon contains several short DNA
sequences that do not encode proteins, but instead act as binding sites for proteins involved in
transcriptional regulation of the operon. In the lac operon, these sequences are called P (promoter),
O (operator), and CBS (CAP-binding site). Collectively, sequence elements such as these are
called cis-elements because they must be located on the same piece of DNA as the genes they
regulate. On the other hand, molecules that are distinct from the target DNA, such as the
proteins that bind to these cis-elements, are called trans-regulators because (as diffusible molecules)
they do not necessarily need to be encoded on the same piece of DNA as the genes they regulate.
2. NEGATIVE REGULATION – INDUCERS AND REPRESSORS
LacI encodes an allosterically regulated repressor
One of the major trans-regulators of the lac operon is encoded by lacI, a gene located just upstream
from the lac operon (Figure 2). Four identical molecules of lacI protein assemble together to
form a homotetramer called a repressor (Figure 3). This repressor is trans-acting and binds to two
cis-acting operator sequences adjacent to the promoter of the lac operon. Binding of the
repressor prevents RNA polymerase from binding to the promoter (Figure 2, Figure 4). Therefore,
the operon is not transcribed when the operator sequence is occupied by a repressor.
Figure 3.
Structure of lacI homotetramer bound to DNA.
(Original-Deyholos- CC BY-NC 3.0)
2.1. THE REPRESSOR ALSO BINDS LACTOSE (ALLOLACTOSE)
Besides its ability to bind to specific DNA sequences at the operator, another important property of the
lacI protein is its ability to bind to allolactose. If lactose is present, β-galactosidase enzymes
convert a few of the lactose molecules into allolactose. This allolactose can then bind
allosterically to the lacI protein. This alters the shape of the protein in a way that prevents it
from binding to the operator. Therefore, in the presence of lactose (allolactose) the repressor
doesn’t bind the operator sequence and thus RNA polymerase is able to bind to the promoter and
transcribe the lac operon. This leads to a moderate level of expression of the mRNA encoding the lacZ,
lacY, and lacA genes. This kind of secondary molecule, which binds to either an activator or a repressor
and induces the production of specific enzymes, is called an inducer. Also, proteins such as lacI that
change their shape and functional properties after binding to a ligand are said to be regulated through
an allosteric mechanism. The role of lacI in regulating the lac operon is summarized in Figure 4.
Figure 5.
CAP, when bound to cAMP, helps RNApol to bind to the
lac operon. cAMP is produced only when glucose [Glc] is
low. (Original-Deyholos-CC BY-NC 3.0)
Figure 6.
Both E. coli strains with genotypes I+ O+ Z- Y+ A+ / F I+ O- Z+ Y- A+
and Is O+ Z+ Y+ A+ / F I- O- Z+ Y+ A+ will induce all the lac genes
because repressor cannot bind to the O- sequence on the
F-factor and cannot prevent transcription.
(Original-Locke-CC BY-NC 3.0)
Figure 7.
E. coli strain with genotype I+ Z- Y+ A+ / F I- Z+ Y- A+ will not
produce lacZ, lacY and lacA products.
(Original-Locke-CC BY-NC 3.0)
Figure 8.
E. coli strains with genotypes 1) Is Z+ Y+ A+ / F I+ Z+ Y- A+ and 2)
Is Z+ Y+ A+ / F I- Z+ Y+ A+ will not produce lacZ, lacY and lacA
products. (Original-Locke-CC BY-NC 3.0)
The repressor protein encoded by the lacI gene has at
least two independent functional domains. This is
the reason why it can mutate independently to give
two different types of mutants (Figure 9).
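The behaviour of partial diploids (merozygotes) such as those in Figures 6 to 8 can be reasoned through with a small model. The Python sketch below is a simplified illustration, not a description of the figures' source: the function names are hypothetical, an operator that cannot bind repressor is written "c" (the Oc of the key terms), glucose and CAP effects are ignored, and a gene counts as expressed if any transcribed copy carries a functional (+) allele.

```python
def dna(I="+", O="+", Z="+", Y="+", A="+"):
    """One DNA copy (chromosome or F-factor) with its lac alleles."""
    return {"I": I, "O": O, "Z": Z, "Y": Y, "A": A}


def lac_products(copies, inducer):
    """Return the set of functional lac products ('Z', 'Y', 'A') made by a cell.

    I alleles: '+' normal repressor, '-' no repressor, 'S' super-repressor
    (cannot bind inducer). O alleles: '+' binds repressor, 'c' cannot.
    Repressor is diffusible (trans-acting); the operator acts only in cis.
    """
    super_repressor = any(c["I"] == "S" for c in copies)
    normal_repressor = any(c["I"] == "+" for c in copies)
    # The super-repressor stays active even when inducer is present.
    repressor_active = super_repressor or (normal_repressor and not inducer)

    made = set()
    for c in copies:
        blocked = repressor_active and c["O"] == "+"
        if not blocked:
            made.update(g for g in "ZYA" if c[g] == "+")
    return made


# Figure 7 genotype: I+ Z- Y+ A+ / F I- Z+ Y- A+
fig7 = [dna(I="+", Z="-"), dna(I="-", Y="-")]
print(lac_products(fig7, inducer=False))  # set(): I+ represses both copies
print(sorted(lac_products(fig7, inducer=True)))  # ['A', 'Y', 'Z']
```

In this model the Figure 7 strain makes no products without inducer because the I+ repressor acts in trans on both copies, while an Is allele on either copy keeps every O+ copy repressed regardless of inducer, as stated for Figure 8.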
alter conformation so it no longer binds to the operator sequence and transcription can take place.
High levels of lactose (inducer) would allosterically inhibit repressor and therefore would not prevent
transcription. Low levels of lactose would not cause inhibition to the repressor, so transcription would
be prevented. Various forms of regulation in the lac operon are found in Figure 10.
Figure 10.
Top: When glucose [Glc] and lactose [Lac] are both high, the lac operon is transcribed at a
basal (<1%) level, because CAP (in the absence of cAMP) is unable to bind to its
corresponding cis-element (yellow) and therefore cannot help to stabilize binding of
RNApol at the promoter.
Bottom: Alternatively, when [Glc] is low, and [Lac] is high, CAP and cAMP can bind near
the promoter and increase further the transcription of the lac operon. (Original-Deyholos-
CC BY-NC 3.0)
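The combined effect of the two sugars described in the text and in the Figure 10 caption can be written as a simple rule. The Python sketch below is a qualitative summary only; the function name and the three level labels are hypothetical, and real expression levels vary continuously rather than in three steps.

```python
def lac_transcription(glucose_high, lactose_present):
    """Qualitative transcription level of a wild type lac operon.

    Without lactose the repressor occupies the operator. With lactose the
    repressor is released, and CAP-cAMP boosts transcription only when
    glucose is low (cAMP is produced only when glucose is low).
    """
    if not lactose_present:
        return "off"      # repressor bound to the operator
    if glucose_high:
        return "basal"    # repressor released, but no CAP-cAMP assistance
    return "high"         # repressor released and CAP-cAMP bound near promoter


for glucose_high in (True, False):
    for lactose_present in (True, False):
        print(glucose_high, lactose_present,
              lac_transcription(glucose_high, lactose_present))
```

The rule captures the two layers of control: negative regulation by the repressor decides whether transcription is possible at all, and positive regulation by CAP decides how strongly it proceeds.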
___________________________________________________________________________
SUMMARY:
• Regulation of gene expression is essential to the normal development and efficient functioning of cells
• Gene expression may be regulated by many mechanisms, including those affecting transcript
abundance, protein abundance, and post-translational modifications
• Regulation of transcript abundance may involve controlling the rate of initiation and elongation of
transcription, as well as transcript splicing, stability, and turnover
• The rate of initiation of transcription is related to the presence of RNA polymerase and associated
proteins at the promoter.
• RNApol may be blocked from the promoter by repressors, or may be recruited or stabilized at the
promoter by other proteins including transcription factors
• The lac operon is a classic, fundamental paradigm demonstrating both positive and negative regulation
through allosteric effects on trans-factors.
KEY TERMS:
gene expression trans-regulators
transcriptional regulation lacI
operon homotetramer
lactose repressor
glucose inducer
lac operon allosteric
lacZ cAMP binding protein
lacY CAP
lacA CAP binding sequence
β-galactosidase CBS
permease adenylate cyclase
trans-acetylase constitutive
P / promoter Oc / I- / Is
O / operator cis dominant
CBS F-factor / episome
CAP-binding site merozygotes
cis-elements
Figure 1.
Some genes are expressed in a segmental pattern and dictate the
development of cells in that segment. This tissue specific patterning
happens through the temporal and spatial regulation of these
genes. (Wikipedia-PhiLiP-PD)
INTRODUCTION
While prokaryote protein-coding genes are relatively simple with a promoter driving a
transcribed mRNA sequence (or a multiple protein coding mRNA in the case of an operon), the
expression of a eukaryote protein-coding gene is much more complex. There are intron sequences,
which are spliced out during processing, or they may be alternately spliced, and there are three
levels of transcriptional regulation. All these make the typical eukaryote gene much larger and
more complex than the typical prokaryote one.
1. THE EUKARYOTIC GENOME CONTAINS VARIOUS TYPES OF SEQUENCES
There are three main types of sequences in a eukaryote genome, which are: (1) single copy
genes, (2) multiple copy genes and (3) repeated sequences. Single copy genes have a single copy in
the genome and include most protein-coding genes. Multiple copy genes have multiple copies in
the genome and include rRNA- and tRNA-coding genes, and some protein coding genes. Repeated
sequences can be either tandem repeats or interspersed repeats (Figure 2). Tandem repeats
follow directly after one another, whereas interspersed repeats are scattered randomly.
Tandem repeats include (a) short centromeric-tandem arrays, which are sequence repeats at the
centromere region and (b) VNTR (Variable Number Tandem Repeats). VNTR include microsatellites
(short tandem repeats) and mini-satellites (longer tandem repeats). Interspersed repeats include
SINEs (Short Interspersed Elements) and LINEs (Long Interspersed Elements).
Figure 2.
Tandem repeats can be either microsatellites or mini-satellites, and interspersed repeats can be SINE
or LINE depending on the length of the repeats.
(Original-Kang- CC BY-NC 3.0)
2. TRANSCRIPTS OF PROTEIN CODING GENES – PROCESSING
2.1. 5’ CAP, POLY(A) TAIL
An mRNA is transcribed by RNA polymerase II using its complementary DNA strand as a template. As it
is synthesized, it undergoes processing before transport to the cytoplasm. Here are the major
steps during transcription of eukaryote mRNA:
a) mRNA transcript is synthesized by RNA polymerase II.
Figure 3.
The steps of synthesizing a primary mRNA transcript. (Original-Locke-CC BY-NC 3.0)
Figure 4.
For each intron spliced out, there are three sites that are essential. They are the 5’ donor site,
branch point, and 3’ acceptor site. The number below each nucleotide represents the percentage of
that nucleotide at that site.
(Original-Locke-CC BY-NC 3.0)
For each intron on the primary transcript RNA, there exists (1) a 5’ splice donor site, (2) a branch
point A, and (3) a 3’ splice acceptor site. (Note that the directionality in these names (e.g. 5’, 3’) is
referenced to the mRNA sequence.) The snRNA of the spliceosome base pairs with the RNA
sequences at the 5’ splice donor site and cuts it. This cut 5’ end of the intron “donates” or attaches
to the branch point A via a 2’-5’ phosphodiester bond and forms a lariat. Next, a second cut is made
at the 3’ splice acceptor site (3’ end of the intron) (Figure 5). The spliceosome joins the two exon
ends together, and the intron sequence is degraded, leaving the mature mRNA.
Note: Almost all eukaryote genes have introns, but for some rare genes, like the Heat Shock Protein 70
(HSP70) gene, the primary transcript is the mature mRNA. Prokaryote genes do NOT have introns.
Figure 5.
(1) Spliceosome makes a cut at 5’ splice donor site, and (2) the 5’ end of the intron attaches to
branch A point, and (3) the second cut is made at the 3’ splice acceptor site, ultimately forming a
mature RNA.
(Original-Kang-CC BY-NC 3.0)
2.3. ALTERNATIVE SPLICING
Many genes have primary transcripts that are processed differently to produce more than one
type of mature mRNA. This is called alternative splicing, which often results in the production of
more than one type of protein product from the same gene (Figure 6). Alternative splicing is
another means of gene regulation, but it happens at a post-transcriptional level.
To further complicate this process, in some organisms it is even possible for exons from
different gene transcripts to be ligated together through a process called trans-splicing. Although
rare, an example comes from the worm, C. elegans, where an identical short leader sequence, the
spliced leader (SL), is trans-spliced onto the 5ʹ ends of multiple mRNAs.
With alternative splicing, it is possible for organisms with 25,000 genes (e.g. humans) to
produce a much larger variety of polypeptides.
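To see how quickly alternative splicing multiplies the number of possible mRNAs, consider the simplest case in which some exons can be either kept or skipped. The Python sketch below is a toy model: the function name, exon names and exon sequences are hypothetical, exon order is always preserved, and other forms of alternative splicing (such as alternative splice sites) are not modelled.

```python
from itertools import combinations


def isoforms(exons, skippable):
    """List the mature mRNAs produced by keeping or skipping optional exons.

    exons: dict mapping exon name to sequence, in 5' to 3' order.
    skippable: set of exon names that may be left out of the mature mRNA.
    Returns a list of (exon names joined by '-', mature sequence) tuples.
    """
    names = list(exons)
    optional = [n for n in names if n in skippable]
    result = []
    for k in range(len(optional) + 1):
        for skipped in combinations(optional, k):
            kept = [n for n in names if n not in skipped]
            result.append(("-".join(kept), "".join(exons[n] for n in kept)))
    return result


# Hypothetical four-exon gene in which exons E2 and E3 are optional.
gene = {"E1": "AUGGC", "E2": "CUUAG", "E3": "GGAUA", "E4": "CUGA"}
for name, seq in isoforms(gene, {"E2", "E3"}):
    print(name, seq)
```

With n independently skippable exons this model gives 2^n mature mRNAs from one gene (four in the example), which illustrates how a limited number of genes can yield a much larger set of polypeptides.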
Figure 6.
Alternative splicing produces different combinations of exons, which result in different mRNA products, thus different proteins.
(Wikipedia- National Human Genome Research Institute-PD)
3. TRANSCRIPTION REGULATION – PROMOTERS, ENHANCERS/SILENCERS
3.1. PROXIMAL REGULATORY SEQUENCES.
As in prokaryotes, the RNA polymerase binds to the DNA at the gene’s promoter to begin transcription.
In eukaryotes, however, RNApol is part of a large protein complex that includes additional proteins
that bind to one or more specific cis-elements in the promoter region, which includes GC-rich boxes,
CAAT boxes, and TATA boxes. Cis-elements are intramolecular elements that exist and act within
the same DNA molecule. In contrast, trans-elements are intermolecular elements that are distinct
molecules from their target DNA; they could be RNA or proteins. High levels of transcription require
both the presence of this protein complex at the promoter, as well as its interaction with other
trans-factors described below. The approximate position of these elements relative to the
transcription start site (+1) is shown in Figure 7, but it should be emphasized that the distance
between any of these elements and the transcription start site can vary, although it is typically
within ~200 base pairs of the start of transcription. This contrasts with the next set of elements.
3.2. DISTAL REGULATORY ELEMENTS
Even more variation is observed in the position and orientation of the second major type of cis-
regulatory element in eukaryotes, which are called enhancer elements. Regulatory trans-factor
proteins called transcription factors bind to enhancer sequences, then, while still bound to
DNA, these proteins interact with RNApol and other proteins at the promoter to enhance the rate
of transcription. There is a wide variety of different transcription factors and each recognizes a specific
DNA sequence (enhancer elements) to promote gene expression in the adjacent gene under specific
circumstances. Because DNA is a flexible molecule, enhancers can be located near (~100s of bp) or far
(~10,000 bp), and either upstream or downstream, from the promoter (Figure 7 and Figure 8).
Figure 7.
Structure of a typical eukaryotic gene. RNA polymerase binding may involve one or more cis-elements within the proximal region
of a promoter (green boxes). Enhancers (yellow boxes) may be located any distance upstream or downstream of the promoter
and are also involved in regulating gene expression. The processing of a primary transcript to a mature mRNA is also shown.
Note: not to scale. (Original-Deyholos- CC BY-NC 3.0)
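The idea that proximal cis-elements sit within roughly 200 bp of the transcription start site can be illustrated by scanning an upstream sequence for short motifs. The following is a rough sketch, not a real promoter-prediction method: the three consensus strings are simplified textbook forms (real elements are degenerate), and the test sequence is invented for the example.

```python
import re

# Simplified consensus motifs (assumption: real elements vary in sequence).
MOTIFS = {"TATA box": "TATAAA", "CAAT box": "CCAAT", "GC box": "GGGCGG"}

def scan_promoter(upstream, window=200):
    """Report motif positions relative to the transcription start site (+1).

    upstream: DNA sequence whose last base is position -1.
    window: how many bases upstream of +1 to search.
    """
    region = upstream.upper()[-window:]
    hits = []
    for name, motif in MOTIFS.items():
        for m in re.finditer(motif, region):
            position = m.start() - len(region)  # -30 means 30 bp upstream of +1
            hits.append((name, position))
    return sorted(hits, key=lambda h: h[1])

# Hypothetical upstream sequence: three motifs separated by filler bases.
upstream = "GGGCGG" + "ACGT" * 10 + "CCAAT" + "ACGT" * 10 + "TATAAA" + "ACGT" * 6
for name, pos in scan_promoter(upstream):
    print(f"{name} at {pos}")
```

In this invented sequence the GC box, CAAT box, and TATA box are reported at -121, -75, and -30. Enhancers would not be found this way, since they can lie far outside the search window.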
3.3. EXAMPLE: GAL4-UAS SYSTEM FROM YEAST – A GENETIC TOOL
Yeasts use the Gal regulon to convert galactose to glucose-1-phosphate for glycolysis.
Geneticists have taken advantage of a yeast distal enhancer sequence to make the GAL4-UAS system,
a powerful technique for studying genes in other eukaryotes. It relies on two parts: a “driver”
and a “responder” (Figure 9.). The driver part is a gene encoding a yeast transcriptional
activator protein called Gal4. It is separate from the responder part, which contains the
enhancer sequence, or upstream activation sequence (UAS, as it is called in yeast), to which the
Gal4 protein specifically binds to activate the Gal genes. This UAS is placed upstream (using
genetic engineering) from a promoter transcribing a reporter gene, or other gene of interest,
such as GFP (green fluorescent protein).
Both parts must be present in the same cell for the system to express the responder gene. If the
driver is absent, the responder product will not be expressed. However, if both are in the same
cell (or organism), the pattern of expression of the driver part will induce the responder
part’s expression in the same pattern. This system works in a variety of eukaryotes, including
humans. It has been especially well exploited in Drosophila, where >10,000 differently
expressing driver lines are available. These lines permit the tissue-specific expression of any
responder gene to examine its effect on development or cellular functions.
Figure 8.
A transcription factor (yellow) bound to an enhancer that is located far from a promoter.
Because of the flexibility of the DNA molecule, the transcription factor and RNApol (green) are
able to interact physically, even though the cis-elements to which they are bound are located far
apart. In eukaryotic cells, RNApol is actually part of a large complex of proteins (not shown
here) that assembles at the promoter. (Original-Deyholos- CC BY-NC 3.0)
Figure 9.
The GAL4-UAS system. The driver, with a wing enhancers,
expresses the Gal4 protein that then binds to the UAS
element upstream of a marker gene, GFP. This would
express the GFP in the wing tissues. The modular aspect
of this system would let the wing enhancer be replaced by
any other enhancer and the GFP marker replaced with
any other gene. (Original-Locke- CC BY-NC 3.0)
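The two-part logic of the GAL4-UAS system can be written as a small truth-table style function. This is a minimal sketch, assuming the simplest possible model: the responder is expressed wherever the driver's enhancer is active, provided both constructs are present. The function and argument names are made up for the illustration.

```python
def responder_expression(driver_tissues, has_driver, has_responder):
    """Return the tissues in which the responder gene is expressed.

    driver_tissues: tissues where the driver's enhancer is active.
    has_driver / has_responder: whether each construct is in the organism.
    """
    if not (has_driver and has_responder):
        return set()            # one part alone gives no responder product
    return set(driver_tissues)  # Gal4 is made there and binds the UAS

wing_driver = {"wing"}
print(responder_expression(wing_driver, has_driver=True, has_responder=True))
print(responder_expression(wing_driver, has_driver=False, has_responder=True))
```

Swapping the set of driver tissues corresponds to replacing the wing enhancer with another enhancer, which is the modular aspect described in Figure 9.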
4. HIGHER ORDER CHROMATIN - ADDITIONAL LEVELS OF REGULATING TRANSCRIPTION
Eukaryotes regulate transcription via promoter sequences close to the transcription unit (as in
prokaryotes) and also use more distant enhancer sequences to provide more variation in the
timing, level, and location of transcription. However, there are still additional levels of
genetic control. These consist of two major mechanisms: (1) large-scale changes in chromatin
structure, and (2) modification of bases in the DNA sequence. These two are often
inter-connected.
Figure 10.
Acetylation of histone proteins is associated with a more open chromatin configuration.
Acetylation is a reversible process. (Original-Deyholos- CC BY-NC 3.0)
4.2. MODIFICATION OF DNA BASES
Likewise, methylation of DNA itself is also associated with transcription regulation. Cytosine
bases, particularly when followed by a guanine (CpG sites), are important targets for DNA
methylation (Figure 11). Methylated cytosine within clusters of CpG sites is often associated
with transcriptionally inactive DNA.
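Clusters of CpG sites can be recognized from sequence alone, before asking whether they are methylated. The sketch below counts CpG dinucleotides and computes a commonly used observed/expected ratio; that ratio and the example sequence are assumptions added for illustration and are not part of the lecture.

```python
def cpg_stats(seq):
    """Count CpG sites and compute simple CpG-cluster statistics."""
    seq = seq.upper()
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    cpg = seq.count("CG")  # a C immediately followed by a G
    gc_content = (c + g) / n if n else 0.0
    # Observed CpG count relative to the count expected from base composition.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return {"cpg_sites": cpg, "gc_content": gc_content, "obs_exp": obs_exp}

stats = cpg_stats("CGCGCGAT")  # hypothetical CpG-rich fragment
print(stats)
```

A region with many CpG sites and a high observed/expected ratio is a candidate CpG cluster; whether its cytosines are actually methylated has to be measured experimentally.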
through which eukaryotic cells control the transcription of specific genes.
5. EPIGENETICS
5.1. THE BACKGROUND OF EPIGENETICS
The word “epigenetics” has become popular in the last decade and its meaning has become
confused. The term epigenetics describes any heritable change in phenotype that is not
associated with a change in the chromosomal DNA sequence. Originally it meant the processes
through which the genes were expressed to give the phenotype; that is, the changes in gene
expression that occur during normal development of multicellular organisms. This includes the
change in transcriptional state of a DNA sequence (gene) via reversible modifications of DNA or
chromatin proteins. Thus, DNA methylation and chromatin protein methylation, phosphorylation,
and acetylation have been targeted as mechanisms for “heritable” changes in cells as they grow
from a single cell (zygote) and differentiate into a multicellular organism. Here, dividing
cells commit to differentiate into different tissues such as muscle, neuron, and fibroblast due
to the genes that they express or silence. Some genes are irreversibly silenced, through
epigenetic mechanisms, in some cell types, but not in others. None of this involves any change
in DNA sequence.
Remember, these epigenetic effects are not permanent changes and thus are not selectable in an
evolutionary context. However, mutations in the genes that regulate the epigenetic effect can be
selected.
5.2. SOME HERITABLE INFORMATION CAN BE PASSED ON INDEPENDENT OF THE DNA SEQUENCE
More recently, however, researchers have found many cases of environmentally induced changes in
gene expression that can be passed on to the next generation – a potential multi-generational
effect. These cases have also been called “epigenetics”, and probably involve similar reversible
changes to the DNA and chromatin proteins. These altered expression patterns represent the
diversity of expression for a genome. This “extended” phenotype, the ability to influence traits
in the next generation, is a topic of current research and only some examples will be discussed
here.
One example comes from the grandchildren of famine victims. They are known to have lower birth
weight than children without a family history of famine. This heritability of an altered state
of gene expression is surprising, since it appears not to involve typical changes in the
sequence of DNA. The term epigenetics is applied here since the apparently heritable change in
phenotype is associated with something other than DNA sequence.
This change is inherited from one generation to the next and is thus transgenerational, for at
least one generation. In developmental epigenetics, the expression state (developmentally
differentiated state) is conserved only from one mitosis to the next, but is erased or reset at
meiosis (the boundary of one generation to the next). The basis of at least some types of
epigenetic inheritance appears to be replication of patterns of histone and DNA methylation that
occurs in parallel with the replication of the primary DNA sequence. The permanence of this
“epigenetic change” is not the same as that of changes in the DNA sequence itself. What is clear
is that epigenetics is an important part of regulating gene expression, and can serve as a type
of cellular memory, certainly within an individual, or across a few generations in some cases.
It is becoming clear that epigenetics is an important part of biology.
5.3. IMPRINTING AND PARENT-OF-ORIGIN EFFECTS
For some genes, the allele inherited from the female parent is expressed differently than the
allele that is inherited from the male parent. This is distinct from sex-linkage and is true
even if both alleles are wild-type and autosomal. During gamete development (gametogenesis),
each parent imprints epigenetic information on some genes that will affect the activity of the
gene in the offspring. Imprinting does not change the DNA sequence, but does involve methylation
of DNA and histones, and generally silences the expression of one of the parent’s alleles. In
humans, some
genes are expressed only from the paternal allele, and other genes are expressed only from the
maternal allele. The imprinting marks are reprogrammed before the next generation of gametes is
formed. Thus, although a male inherits epigenetic information from both his mother and father,
this information is erased before sperm development, and he passes only one pattern of
imprinting to both his sons and daughters. Most examples of imprinting come from placental
mammals, and many imprinted genes control growth rate, such as IGF2 (insulin-like growth
factor 2).
Imprinting appears to explain many different parent-of-origin effects. For example, Prader-Willi
Syndrome (PWS) and Angelman Syndrome (AS) are two phenotypically different conditions in humans
that result from deletion of a specific region of chromosome 15, which contains several genes.
Whether the deletion results in PWS or in AS depends on the parent-of-origin. If the deletion is
inherited from the father, PWS results. Conversely, if the deletion is inherited from the
mother, AS is the result. The gene(s) associated with PWS is maternally silenced by imprinting;
therefore the deletion of its paternally-inherited allele results in a complete deficiency of a
required protein. On the other hand, the paternal allele of the gene involved in AS is silenced
by imprinting, so deletion of the maternal allele results in deficiency of the protein encoded
by that gene.
5.4. TRANSGENERATIONAL INHERITANCE OF NUTRITIONAL INFLUENCES
Nutrition is one aspect of the environment that has been particularly well-studied from an
epigenetic perspective in both mice and humans. People alive today who experienced the Dutch
famine of 1944-1945 as fetuses have IGF2 genes that are less methylated than those of their
siblings. Methylation of IGF2 (and birth weight) is also lower in children of mothers who do not
take folic acid supplements compared with those who do. Furthermore, an individual’s phenotype
can be influenced by the nutrition of parents or even grandparents. This transgenerational
inheritance of nutritional effects appears to involve epigenetic mechanisms.
The mouse agouti gene produces a signaling molecule that regulates pigment-producing cells and
brain cells that affect feeding and body weight. Normally, agouti is silenced by methylation,
and these mice are brown and have a normal weight. When agouti is demethylated by feeding
certain chemicals or by mutating a gene that controls methylation, some mice become yellow and
overweight, although their DNA sequence remains unchanged. Methylation of agouti and normal
weight and pigmentation of offspring can be restored if their mothers are fed folic acid and
other vitamins during pregnancy.
A study of an isolated Swedish village called Överkalix provides an example of transgenerational
inheritance of nutritional factors. Detailed historical records allowed researchers to infer the
nutritional status of villagers going back to 1890. The researchers then studied the health of
two generations of these villagers’ offspring, using medical records. A significant correlation
was found between the mortality risk of grandsons and the food availability of their paternal
grandfathers. This effect was not seen in the granddaughters. Furthermore, the nutrition of
paternal grandmothers, or either of the maternal grandparents, did not affect the health of the
grandsons. It was therefore proposed that epigenetic information affecting health (specifically
diabetes and heart disease) was passed from the grandfathers to the grandsons through the male
line.
5.5. VERNALIZATION AS AN EXAMPLE OF EPIGENETICS
Many plant species in temperate regions are winter annuals, meaning that their seeds germinate
in the late summer, and grow vegetatively through early fall before entering a dormant phase
during the winter, often under a cover of snow. In the spring, the plant resumes growth and is
able to produce seeds before other species that germinated in the spring. In order for this life
strategy to work, the winter annual must not resume growth or start flower production until
winter has ended. Vernalization is the name given to the requirement to experience a long period
of cold temperatures prior to flowering.
This is due to patches of skin cells having different X-chromosomes inactivated. In each orange
hair the Xi chromosome carrying the O^B allele is inactivated. The O^O allele on the Xa is
functional and orange pigments are made. In black hairs the reverse is true: the Xi chromosome
with the O^O allele is inactive and the Xa chromosome with the O^B allele is active. Because the
inactivation decision happens early during embryogenesis, the cells continue to divide to make
large patches on the adult cat skin where one or the other X is inactivated.
Figure 14
Relationship between genotype and phenotype for an X-linked gene in cats. The O^O allele =
orange while the O^B allele = black.
(Original-Harrington- CC BY-NC 3.0)
6.2. FACTOR VIII BLOOD CLOTTING PROTEINS
Another mammalian X-inactivation system is the F8 gene in humans. It makes Factor VIII blood
clotting proteins in liver cells. If a male is hemizygous for a mutant allele the result is
hemophilia type A. Females homozygous for mutant alleles will also have hemophilia. Heterozygous
females, F8+/F8-, do not have hemophilia because even though half of their liver cells do not
make Factor VIII (because the X with the F8+ allele is inactive) the other 50% can (Figure 15).
Because some of their liver cells are exporting Factor VIII proteins into the blood stream they
have the ability to form blood clots throughout their bodies. Even though their liver cells are
a genetic mosaic, this does not produce a visible mosaic phenotype.
Figure 15.
This figure shows the two types of liver cells in females heterozygous for an F8 mutation.
Because people with the F8+/F8- genotype have the same phenotype, normal blood clotting, as
F8+/F8+ people, the F8- mutation is classified as recessive.
(Original-Harrington/Locke- CC BY-NC 3.0)
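The mosaic that results from X-inactivation in an F8+/F8- female can be mimicked with a small simulation. This is a toy model under stated assumptions: the number of founder cells, the number of cell divisions, and the random seed are arbitrary choices for the example, and each founder is assumed to pick its active X independently with equal probability.

```python
import random

def inactivate(n_cells, rng):
    """Each early embryonic cell randomly keeps one X active (named by allele)."""
    return [rng.choice(["F8+", "F8-"]) for _ in range(n_cells)]

def expand(founders, divisions):
    """Daughter cells inherit the inactivation choice of their founder."""
    return [allele for allele in founders for _ in range(2 ** divisions)]

rng = random.Random(1)          # fixed seed so the run is repeatable
founders = inactivate(20, rng)  # hypothetical number of founder cells
liver = expand(founders, divisions=5)
fraction = liver.count("F8+") / len(liver)
print(f"{fraction:.0%} of liver cells have the F8+ allele active")
```

Because every descendant copies its founder's choice, the tissue is a patchwork of clones, and on average about half of the cells make Factor VIII, which is enough for normal clotting.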
___________________________________________________________________________
SUMMARY:
• In the eukaryotic genome, there are single-copy and multi-copy genes and repeated sequences that are
subdivided into various categories. Repeated sequences include tandem and interspersed repeats of
varying lengths.
• The primary mRNA transcript undergoes modification and processing before being exported to
the cytoplasm: addition of a 5' cap and a poly(A) tail, and splicing.
• Alternative splicing allows a large number of products (proteins) to be made from a limited number of
genes.
• In eukaryotes, specific trans-factors bind to enhancers and interact with RNA polymerase and
additional proteins at the promoter to regulate transcriptional initiation.
• The GAL4-UAS system from yeast uses a driver (the transcriptional activator Gal4) and a responder (the
upstream activation sequence, UAS, placed upstream of a reporter or other gene of interest); it can
be introduced into other eukaryotes to express a gene in a chosen pattern.
• Chromatin structure, including reversible modifications such as acetylation of histones and
methylation of DNA at CpG sites, also regulates the initiation of transcription.
• Chromatin modifications or DNA methylation of some genes are heritable over many mitotic, and
sometimes even meiotic, divisions and provide a higher level of transcriptional regulation.
• During gamete development, each parent imprints epigenetic information on some genes that will
affect the activity of the gene in the offspring.
• Heritable changes in phenotype that do not result from a change in DNA sequence are called
epigenetic. Many epigenetic phenomena involve regulation of gene expression by chromatin
modification and/or DNA methylation.
• When there are two X chromosomes in a female, X-inactivation compensates for the extra gene dosage;
examples are calico cats and Factor VIII blood clotting protein in humans.
KEY TERMS:
Single copy genes
Multiple copy genes
Repeated sequences
primary transcript
RNA splicing
introns
exons
mature transcript
spliceosome
lariat
alternative splicing
trans-splicing
GC boxes / CAAT boxes / TATA boxes
Cis-elements
Trans-elements
transcription start site
enhancer elements
transcription factors
GAL4-UAS
Driver/responder
chromatin remodeling
acetylation/deacetylation
methylation/demethylation
CpG sites
epigenetics
transgenerational
gametogenesis
imprint
parent-of-origin
Prader-Willi Syndrome (PWS)
Angelman Syndrome (AS)
agouti
winter annuals
vernalization
X-inactive (Xi) / X-active (Xa)
Barr body
STUDY QUESTIONS:
1) List all the mechanisms that can be used to
regulate gene expression in eukaryotes.
2) How are eukaryotic and prokaryotic gene
regulation systems similar?
How are they different?
3) Histone deacetylase (HDAC) is an enzyme
involved in gene regulation. What might be the
phenotype of a winter annual plant that lacked
HDAC function?
Figure 1.
Image of red blood cells (red), platelets (green) and T cells (orange) taken with a scanning
electron microscope. The cells in this image are artificially coloured.
(Flickr-ZEISS Microscopy-CC BY-NC-ND 2.0)
Figure 3.
Fragments of human chromosome 11 and human chromosome 16 on which are located clusters of β-like and α-like globin genes,
respectively. Additional globin genes (θ theta, μ mu) have also been described by some researchers, but are not shown here.
(Wikipedia –Modified by Kang- CC BY-NC 3.0)
as a globin gene. The ψ (psi) symbol represents the designation as a pseudogene. The globin genes
provide an example of how gene duplication and mutation, followed by selection, allows genes to
evolve specialized expression patterns and functions. In general, many genes have evolved as
gene families in this way, although they are not always clustered together as are the globins.
2. HEMOGLOBIN EXPRESSION CHANGES DURING DEVELOPMENT IN HUMANS.
In humans the composition of the globin tetramer changes during development (Figure 4). There
are 3 distinct time periods that differ in globin gene expression. In embryos, ζ2ε2 (zeta,
epsilon) is the most abundant type, which means the globin tetramer contains two copies of each
of the zeta and epsilon proteins, which are similar but slightly different from each other.
Next, in fetuses, α2γ2 (alpha, gamma) is the most abundant form. From early childhood onward,
most tetramers are of the type α2β2 (alpha, beta). A small amount of adult hemoglobin is α2δ2
(alpha, delta), which has δ globin instead of the more common β globin. Although the six globin
proteins (α = alpha, β = beta, γ = gamma, δ = delta, ε = epsilon, ζ = zeta) are very similar to
each other, they do have slightly different functional properties. For example, fetal hemoglobin
α2γ2 has a higher oxygen affinity than adult hemoglobin, allowing the fetus to more effectively
extract oxygen from maternal blood, which is α2β2. The specialized γ globin genes that are
characteristic of fetal hemoglobin are found only in placental mammals.
Note that in humans the developmental changes in gene expression from zeta to alpha and from
epsilon to gamma to beta parallel the location along the chromosome. This correlation is found
in other species with clusters of globin genes, although not as rigidly.
Figure 4.
Expression of globin genes during prenatal and postnatal development in humans. The organs where the globin genes are primarily
expressed at each developmental stage are also indicated on top.
Data: Wood, W.G. 1976 Br. Med. Bull. 32, 282
Original: (Wikipedia-Furfur- CC BY-SA 3.0)
Derivative work/Translation: (Wikipedia-Leonid2- CC BY-SA 3.0)
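The stage-specific composition of the globin tetramer described in the text can be summarized as a small lookup. This is only a restatement of the three most abundant forms as a data structure; the stage labels are chosen for the example (the text describes the last form as predominant from early childhood onward), and the minor adult form with delta globin is left out.

```python
# Most abundant tetramer at each stage: (alpha-like chain, beta-like chain).
TETRAMERS = {
    "embryo": ("zeta", "epsilon"),
    "fetus": ("alpha", "gamma"),
    "adult": ("alpha", "beta"),
}

def tetramer(stage):
    """Return the four polypeptides of the predominant hemoglobin at a stage."""
    alpha_like, beta_like = TETRAMERS[stage]
    return [alpha_like] * 2 + [beta_like] * 2

for stage in TETRAMERS:
    print(stage, tetramer(stage))
```

Each tetramer always pairs two alpha-like chains with two beta-like chains; only the particular family members change during development.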
3. LOCUS CONTROL REGION (LCR) – ANOTHER LEVEL OF REGULATION
The transcription of the globin genes is controlled at multiple levels. First, the promoter
dictates the mRNA start position and provides a basal level of transcription. Second, there are
enhancer/silencer elements that act on the promoter to determine tissue- and temporal-specific
transcription. Third, the β-globin locus is also regulated by higher order chromatin structure
changes.
At the chromatin level, there is a region upstream from the cluster of β-globin genes that
regulates and controls the expression of all the genes in the cluster. It is called a locus
control region (LCR). The LCR can transcriptionally activate distal globin genes, but its exact
mechanism hasn’t been fully identified. There are many models proposed to explain how the LCR
works, and one of them is by forming a loop and interacting with transcription factors to form a
complex called an enhanceosome. During development, this complex associates with other
transcription factors in a sequential manner. The LCR contains sequences that regulate the
conformation of chromatin for all the adjacent globin genes (Figure 5).
The change in conformation is recognized through differences in this region’s sensitivity to
added nucleases. The LCR contains 4 nuclease hypersensitive sites (HS4, HS3, HS2, and HS1) that
can be detected when isolated nuclei are treated with added nucleases. If the DNA is in a
“closed” conformation the nucleases cannot cleave the DNA. If the DNA is in an “open”
conformation the nucleases can cleave the DNA (Figure 6.). This nuclease sensitivity assay is
done in vitro and reflects the open/closed chromatin found in vivo. Also, these hypersensitive
sites aid in the recruitment of molecular factors that are needed for transcription; each
hypersensitive region independently affects, in an additive fashion, the activation of gene
expression. Note that LCR regions do not necessarily exist at a single site like the beta globin
gene; in other cases, LCR can be found in multiple sites.
This kind of regional change in chromatin conformation permits the various globin genes to be
regulated and expressed by their own individual promoters and enhancers, while also being
subject to dynamic chromatin regulation.
The LCR-dependent chromatin changes develop in erythroid precursor cells long before any globin
gene is expressed. It begins with the opening (becoming nuclease sensitive) of the sites 5’ to
the globin genes first; then sites that are 3’ open later on. Thus the change in chromatin
structure at this locus is a developmentally planned series of events.
A deletion mutation that removes the LCR region prevents the 5’ site from forming and also
prevents the subsequent 3’ site formation; thus the 5’ site is needed for the 3’ site to gain
nuclease sensitivity. Note that LCR sites must be present in order to activate the expression of
globin genes; non red blood cell precursors do not open this region of chromatin so the β-globin
genes are not expressed.
Figure 5.
Diagram showing the role of the LCR in development. The LCR regulates which of the globin genes
in the cluster is expressed at different times during development of the red blood cell.
(Original-Kang-CC BY-NC 3.0)
Figure 6.
The upper part of this diagram represents the nuclease insensitive chromatin form where the
nucleases (scissors) cannot access or cleave the DNA. The lower part represents the nuclease
hypersensitive chromatin form where the nucleases can access and cleave the DNA. The
sensitivity/insensitivity of the DNA to nucleases is just a method to reveal the difference in
chromatin structure along the DNA molecule. Some areas are “open” (sensitive) while others are
“closed” (insensitive).
(Original-Locke- CC BY-NC 3.0)
4. ADDITIONAL INFORMATION-MYOGLOBIN
Globin gene expression is tissue-specific; α- and β-like globin genes are expressed in red blood
cell precursors. A different globin-like gene, myoglobin, is expressed only in muscle cells. Just
like hemoglobin, myoglobin is an oxygen-binding protein, and it acts as temporary oxygen storage
in muscle cells. The main difference between myoglobin and hemoglobin is that hemoglobin is
mainly found in the blood stream, but myoglobin is only found in skeletal and heart muscles to
provide oxygen for metabolically active cells. Therefore, myoglobin can act as a biomarker for
detecting muscle injuries as high concentrations of myoglobin in the bloodstream indicate
internal bleeding from the muscles. Also, myoglobin has a higher affinity for oxygen than
hemoglobin; this feature allows myoglobin to take up oxygen from hemoglobin.
__________________________________________________________________________
SUMMARY:
• Hemoglobin is a tetramer that transports and stores oxygen; it is composed of 2 α-globin-like + 2 β-
globin-like polypeptides. α-globin can be replaced by zeta (ζ) globin, and β-globin can be replaced by
epsilon (ε), gamma (γ) and delta (δ) globin proteins. Each protein has a different affinity for oxygen.
• Pseudogenes are versions of a normal gene that frequently lack the cis-acting regulatory elements but
still possess the protein coding sequences.
• Expression can be tissue specific; globin is expressed in red blood cells and myoglobin is expressed only
in muscle cells.
• Expression is developmentally specific; each globin gene is expressed at a limited time during development.
• Expression is coordinately controlled; alpha and beta genes are expressed to the same level so that there
is a 1:1 ratio of globin polypeptides.
• Promoter, enhancer/silencer elements, and the locus control region regulate gene expression.
• Hemoglobin can be found in the blood stream, and myoglobin is only expressed in the muscles.
Myoglobin acts as temporary oxygen storage for metabolically active cells and has a higher affinity for
oxygen than hemoglobin.
KEY TERMS:
hemoglobin/heme/α, β globin
Gene duplication
gene families
post-translational modification
ζ2ε2 / α2γ2 / α2β2 / α2δ2
zeta (ζ)-globin
locus control region (LCR)
epsilon (ε)/gamma (γ)/delta (δ) globin
myoglobin
locus
Pseudogene
STUDY QUESTIONS:
1) The various α- and β-globin genes are expressed
at various times during development (embryo,
fetus, adult). This might reflect their various
physiological roles during development. What
might those roles be, and how might this be
tested?
2) Why might the α-globin and β-globin genes have
the same intron/exon structure?
3) Draw a simple cartoon showing the organization
of the globin polypeptides in the functional
hemoglobin molecule.
4) Figure 3. shows the organization of the human
globin genes on chromosomes 11 and 16. The
figure lacks a scale bar. Search the internet for a
similar figure and show the length of 10 kbp on
your figure.
5) Some adults have a condition called
hereditary persistence of fetal hemoglobin
(HPFH). What causes it? Does it affect their
health?
INTRODUCTION
Young mammals get nourishment from their mother's milk. Human milk, for example, contains 4%
fat, 1% protein, and 7% carbohydrates. There are 30+ different types of carbohydrates but the
most abundant is the disaccharide lactose ("milk sugar"). How does the infant digest this
lactose? When the lactose reaches the start of the small intestine it encounters an enzyme
called Lactase. As shown in Figure 2, Lactase is a membrane protein on the surface of intestinal
epithelial cells. Lactase performs a hydrolysis reaction, which separates the disaccharide into
two monosaccharides, galactose and glucose. These are then imported into the cells by a second
membrane protein named the Sodium-Glucose Transporter 1 (SGLT1). It transports some
monosaccharides (glucose and galactose) but not others (fructose for example). As its name
suggests, it is powered by a sodium gradient: there are more sodium ions outside the cell than
inside. Each time the protein allows two sodium ions to enter, one monosaccharide can be
imported. Once inside the cell some of the sugars are consumed by the cell itself. Most however
are transported a second time at the other side of the cell where they enter the blood. Once in
the circulatory system the sugars will travel to the other cells of the body.
Figure 2.
Lactose import by a human intestinal epithelial cell.
(Original-Harrington-CC BY-NC 3.0)
Figure 7.
A typical human mRNA shown approximately to scale. In the LCT mRNA the lengths of the 5'UTR, coding sequence, and 3'UTR are
11, 5784, and 479 nucleotides, respectively. Codon 1 is AUG = methionine/start and codon 1928 is UGA = stop.
(Wikipedia-Daylite-PD)
The resolution to this mystery comes from how the Lactase protein is made (Figure 8). When the
ribosome first binds to the mRNA, both are floating free in the cytosol. The first 20 amino
acids of the Lactase protein are an ER signal sequence. This signal attracts an RNA-protein
complex called the Signal Recognition Particle (SRP). It brings the ribosome to the surface of
the rough ER. The ribosome continues protein synthesis but now the protein is fed into the ER
lumen (Figure 8a). Since its job is done, the ER signal sequence is cut off and its amino acids
are recycled. These are typical events in the synthesis of membrane proteins such as Lactase.
Towards the end of the coding sequence comes a stop transfer sequence. This portion is left in
the ER membrane and becomes the trans-membrane domain (Figure 8b). The ribosome soon reaches the
stop codon in the mRNA and departs.
At this point we have a membrane protein but it is still longer than expected. This so-called
pro-Lactase protein travels to the Golgi apparatus. There, an enzyme cuts the protein a second
time, releasing an 847 amino acid long pro region (Figure 8c). The remaining protein is the 1060
amino acid long mature Lactase. It is modified a bit more in the Golgi apparatus before being
sent onwards to the plasma membrane (Figure 8d).
Figure 8.
Synthesis of Lactase proteins in a human intestinal epithelial cell.
(Original-Harrington-CC BY-NC 3.0)
The purpose of this pro region was a mystery until 1994 when scientists made synthetic Lactase
proteins that lacked it. These Lactase proteins were unable to fold into their correct shapes.
Thus the pro regions are necessary so that normal Lactase proteins can fold properly in the ER.
Afterwards in the Golgi apparatus, the pro regions are removed so that the enzymatic active
sites are exposed. Some other proteins, for example Insulin, also contain pro regions when they
are first made. Once the pro-proteins fold into their proper shapes the pro regions are removed
and their amino acids recycled.
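The lengths quoted for the LCT mRNA and the Lactase protein can be checked against each other with a few lines of arithmetic. The numbers below are taken from the Figure 7 caption and the text above; the variable names are simply labels chosen for this sketch.

```python
UTR5, CDS, UTR3 = 11, 5784, 479      # nucleotides, from the Figure 7 caption
SIGNAL, PRO = 20, 847                # amino acids removed during processing

mrna_length = UTR5 + CDS + UTR3
codons = CDS // 3                    # includes the stop codon
translated = codons - 1              # the stop codon adds no amino acid
pro_lactase = translated - SIGNAL    # after the ER signal sequence is cut off
mature = pro_lactase - PRO           # after the pro region is removed

print(mrna_length, codons, translated, pro_lactase, mature)
# prints: 6274 1928 1927 1907 1060
```

The coding sequence gives 1928 codons, matching the stop at codon 1928, and removing the 20 amino acid signal sequence and the 847 amino acid pro region from the 1927 translated amino acids leaves the 1060 amino acid mature Lactase.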
3. LCT GENE EXPRESSION DURING DEVELOPMENT

Because young mammals are completely dependent upon mother's milk their LCT genes are very active
in the intestinal epithelial cells (Figure 9). This supplies these cells with enough Lactase
enzymes to digest the lactose sugars in the gut. The resulting glucose and galactose can then be
imported into the cells. Other cells do not turn on the LCT genes because they have no use for
Lactase proteins. This is an example of spatial gene expression - when a gene is active in only
some cells in a multicellular organism. In liver, muscle, and other cells the LCT genes remain off
so as not to waste nucleotides on unneeded mRNAs and amino acids on unneeded proteins. This also
saves the energy that would be required by the transcription and translation machinery.

Figure 9.
A young elk drinking milk (top) and an older elk eating grass (bottom).
(Wikipedia-Left: Norbert Kaiser, Right: Jonathunder-CC BY-SA 3.0)

When the mammal is older it will switch from mother's milk to eating regular food, a process
called weaning. For most mammals this means that there will be no more lactose in the diet, and
thus no more need for Lactases. The LCT genes are turned off in the intestinal cells because their
job is done. This is an example of temporal gene expression - when a gene is only active during
specific stages of an organism's development.

In about 65% of people the LCT genes follow this pattern of temporal gene expression. By the time
they turn eight the LCT genes have turned off. Without Lactases they can no longer break down
lactose. If they drink cow milk or eat too much cheese, ice cream, or other dairy products the
result is diarrhea. However in 35% of people this doesn't happen. The LCT genes remain active,
Lactases continue to be produced, and milk and dairy products do not cause gastric problems. Each
of us is thus either lactose intolerant or lactose tolerant. Either we have stopped making Lactase
(a phenotype also known as Lactase non-persistence) or we continue to make it (Lactase
persistence). The explanation for this difference reveals much about human genetics and human
evolution.

Before we move on, an option for lactose intolerant people is lactose-reduced dairy products such
as those shown in Figure 1. How are they made? The answer is simple: Lactases are purified from
yeast such as Kluyveromyces fragilis and added to food during its processing. Any lactose will be
broken down into monosaccharides by the time the person eats the food. All people, lactose
tolerant and intolerant, can import glucoses and galactoses into their intestinal cells using
their SGLT1 transporters.

4. EVOLUTION OF THE LCT GENE

If we look at the distribution of lactase persistence (LP) in the human population we see three
hotspots - Northern Europe, Eastern Africa, and Arabia/India (Figure 10). After much searching
scientists found that in each case there was a single mutation in LCT responsible. In the European
population a CG to TA base pair substitution created the LP allele. The reason it took so long to
find was that the mutation was far upstream of the transcribed region, 13 910 bp in fact!

The LP alleles in the other populations were different bp substitutions very close by (Figure 11).
In African populations it was a TA to GC bp substitution at –13 915 while in Arabia/India it was a
CG to GC bp substitution at –13 907.
Figure 10.
Distribution of the lactose tolerant phenotype. Dots represent collection locations. Colours show the frequency of the lactose
tolerant phenotype from 0-10% to 90-100% of the local population.
(BMC Evolutionary Biology, 2010, 10:36-Itan et al-CC BY 2.0)
How can these mutations have an effect? What each does is to turn this stretch of DNA into a
binding site for a positive transcription factor. Positive transcription factors bind to genes and
activate them, while negative transcription factors bind to genes and have the opposite effect. In
this case the positive transcription factor is a protein known as Oct1. It is Oct1 that is keeping
the LCT genes active long after they would otherwise be turned off. All it took to change the LCT
gene's temporal expression pattern were mutations in the gene's regulatory region. The result had
a dramatic effect on a person's phenotype.

Like other mutations these three occurred randomly. But why did these mutations become so common?
The answer comes from the food consumed in these places over the past thousands of years. All
three groups of people raised animals (goats, sheep, cows, or camels) that could be milked. Milk
and milk products offered a new year-round food source. People in these communities with the
lactose tolerance phenotype would have had more food available and been able to have more
children. Their children would have inherited the LP alleles and also had this advantage. Over
many generations the population shifted to where most if not all of the people had the LP alleles
and were Lactase persistent. In other places in the world, places where lactose tolerance had no
benefit, any LP alleles that arose would not have been selected for and would remain rare.

Figure 11.
Mutations in the LCT gene that produce a Lactase persistence allele. Note that locations on genes
are numbered relative to the first base pair read by the RNA Polymerase being +1.
(Original-Harrington-CC BY-NC 3.0)

In summary, while there are different alleles of the LCT gene in the human population neither type
is "better". The original allele turns off after weaning and thus conserves nucleotides, amino
acids, and energy. The LP alleles remain on and allow a person to eat a greater variety of foods.
Ultimately, the reason you are either lactose tolerant or intolerant has to do with what your
ancestors ate and drank thousands of years ago!
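
The way a rare LP allele could become common over many generations can be illustrated with a minimal sketch of selection acting on allele copies. Everything numeric here is an assumption for illustration only: the starting frequency, the advantage `s`, and the target frequency are not values from the text, and the function names are hypothetical. The model simply assumes that each LP allele copy is passed on (1 + s) times as often as the original allele.

```python
def next_frequency(p, s):
    """Frequency of the LP allele after one generation of selection.

    p is the current LP allele frequency; s is the assumed relative
    advantage of an LP allele copy over the original allele.
    """
    return p * (1 + s) / (1 + p * s)


def generations_to_reach(p0, s, target):
    """Count generations until the LP allele frequency reaches the target."""
    if not (0 < p0 < 1 and 0 < target < 1):
        raise ValueError("frequencies must lie strictly between 0 and 1")
    if s <= 0 and p0 < target:
        raise ValueError("without an advantage the allele never reaches the target")
    p, generations = p0, 0
    while p < target:
        p = next_frequency(p, s)
        generations += 1
    return generations


if __name__ == "__main__":
    # Hypothetical values: a rare allele (0.1%) with a 5% advantage.
    print(generations_to_reach(0.001, 0.05, 0.9))
```

With no advantage (s = 0) the frequency does not change, which matches the text's point that LP alleles stay rare where milk offers no benefit.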
SUMMARY:
• In the human gut the disaccharide lactose is hydrolyzed by an intestinal epithelial cell membrane protein
named Lactase. The resulting monosaccharides, galactose and glucose, are then imported into the cell by
the SGLT1 transport protein.
• The LCT gene makes a long pre-mRNA which is processed (5' cap added, introns removed, poly(A) tail
added) to produce a much shorter mature mRNA.
• The Lactase protein contains regions that control where it is synthesized (ER signal sequence), become a
membrane domain (stop transfer sequence), and assist with its folding (pro sequence). Some of these
regions are removed as the protein is formed and delivered to its final location.
• The LCT gene shows both spatial gene expression (it is only active in some cells) and, in many people,
temporal gene expression (it is only active during some developmental stages).
• During human history three independent mutations have generated Lactase persistence (LP) alleles of the
LCT gene. These mutations have altered how the gene is regulated. People with one of these alleles can
digest lactose during their whole lives and not just as infants.
KEY TERMS:
lactose
Lactase
SGLT1
trans-membrane domain
UTR
coding sequence
ER signal sequence
stop transfer sequence
pro region
spatial gene expression
temporal gene expression
lactose intolerant / Lactase non-persistence
lactose tolerant / Lactase persistence
positive transcription factor
negative transcription factor
STUDY QUESTIONS:
1) If a person was heterozygous for LCT, i.e. they had one Lactase persistence allele and one
Lactase non-persistence allele, what would their phenotype be? In other words, is the LP allele
dominant or recessive to the original allele?
2) Go to OMIM and find the entry for the LCT gene. What are the alternative symbols for this gene?
Why is it necessary for the HUGO Gene Nomenclature Committee to approve only one symbol for each
gene?
3) One way to find out if a person is lactose tolerant or intolerant is to feed them some lactose
and then monitor their blood glucose levels. How does this work?
4) Insulin proteins are synthesized and exported from human pancreatic cells. Consult Figure 8 and
describe how these proteins are made.
5) In E. coli a protein called Lac Permease imports lactose into the cell so that a protein called
Beta-Galactosidase can turn it into galactose and glucose. What are the similarities and
differences between how E. coli imports lactose and how you do?
6) How do you suppose other disaccharides are digested in humans? Note that in our diet the most
common disaccharides are lactose (galactose + glucose), sucrose (glucose + fructose), and maltose
(glucose + glucose).
7) Why do people with a lactose intolerant phenotype have gastric troubles if they drink milk or
eat dairy products? More specifically, what problem is all that undigested lactose causing in
their large intestines?
EUKARYOTIC GENES: THE DROSOPHILA WHITE (W) GENE - CHAPTER 10

INTRODUCTION

One of the most striking features of Drosophila melanogaster is the adult's large red eyes (Figure
1). As with other insects, these are compound eyes. Each Drosophila eye is made of about 800 tubes
called ommatidia arranged in a hemisphere (Figure 2). Light enters the outwards facing side of the
ommatidium and activates a light sensitive photoreceptor cell at the base. This cell sends a nerve
impulse to the brain. In order for compound eyes to function the sides of each ommatidium have to
be opaque - otherwise light coming from other directions will activate the photoreceptor cell.
Thus each ommatidium has three parts: a lens at the top, pigment cells along the sides, and a
photoreceptor cell at the base.

In Drosophila melanogaster, two types of pigments are used: orange-coloured drosopterins and
brown-coloured ommochromes. Eyes that contain both pigments have the wild type, bright red colour.
Synthesizing these pigments requires a set of transporters and enzymes. If any of the genes
encoding these proteins is mutated, the result will be a fly with an altered eye colour. In the
wild this would be detrimental. However, a fly confined within a laboratory vial does not require
vision to find food and mates. Eye colour mutations therefore do not compromise the viability and
fertility of lab strains.

Because eye colour mutants are easy to isolate and propagate, scientists have used them to make
many scientific discoveries. The best example of this is a gene called white (w), with mutants
giving a white coloured eye. This chapter describes how this gene functions, some of its mutant
alleles, and why it is important in the history of genetics. The study of fly genes provides
insight into the expression, function, and control of other genes, including human genes and
diseases.

Figure 2.
Three ommatidia within a Drosophila eye. Arthropod compound eyes have multiple lenses in contrast
to human eyes, which have a single lens each.
(Original-Harrington- CC BY-NC 3.0)

1. THE WHITE GENE PROTEIN

Each fly eye begins as a clump of cells called an imaginal disc inside the larva. During pupation
these imaginal discs grow and mature into eyes. One of the developmental steps in the future
pigment cells is to import tryptophan and guanine
molecules (Figure 3). These will be converted into brown (ommochrome) and orange (drosopterin)
pigments, respectively. Tryptophan and guanine are imported by transporter proteins in the cell’s
plasma membrane. Both transporter proteins are heterodimers, proteins made of two different
polypeptides. The transporter made with the W and S polypeptides imports tryptophan while the
guanine transporter is made with the same W polypeptide joined with a B polypeptide.

Figure 3.
Import of pigment precursor molecules into a future eye pigment cell.
(Original-Harrington- CC BY-NC 3.0)

Figure 4.
Three different genes encode the three polypeptides needed to make the two types of transporters.
(Original-Harrington- CC BY-NC 3.0)

Each of these three polypeptides is encoded by a different gene, there being three genes in total
(Figure 4). From these figures we can predict what would happen if any one of these genes were
non-functional.

• If the S gene was mutated, pigment cells would not make any S polypeptides, but the synthesis of
  W and B polypeptides would be unaffected. These cells would therefore be lacking tryptophan
  transporters but would have functional guanine transporters. During pupation the cells would be
  able to synthesize orange (drosopterin) but not brown (ommochrome) pigments. The cells, and the
  fly eyes as a whole, would be orange.
• If the B gene was mutated the opposite situation would happen: the future pigment cells would be
  able to import tryptophan but not guanine. They would contain brown but not orange pigments.
  These flies would have brown eyes.
• If the W gene was mutated neither transporter could be produced. With no precursors imported
  there would be no pigments and the flies would have white eyes.

These figures are missing one piece of information, the true names of the three genes. Each was
discovered decades before its protein's cellular function was revealed. So what did geneticists
name a gene that, when mutated, makes the eyes white? Well, it was named the white gene. To reduce
confusion Drosophila geneticists use a system of italics and capital letters when referring to
DNA, RNA, and proteins. In this system the white gene is transcribed into the white mRNA, which is
translated into the WHITE protein. The wild type (functional) allele of the white gene can be
depicted white+ or just w+.

The two other genes were named the same way. The actual name of the S gene is scarlet (st), while
the B gene is officially the brown (bw) gene. Figure 5 shows the actual names of the three genes
and their polypeptide products. There are a few other genes that also mutate to produce an unusual
eye colour. Many of these encode enzymes which turn the tryptophans into ommochromes or guanines
into drosopterins. One example is the rosy gene which makes the Xanthine Dehydrogenase enzyme.
Flies without this enzyme can not synthesize the
orange pigments and have brown-coloured eyes as a consequence.

2. THE WHITE GENE

2.1. THE FUNCTIONAL (WILD TYPE) ALLELE

The white gene is a typical eukaryotic gene. It makes a pre-mRNA that will have five introns
removed during processing to yield a shorter mature mRNA. In Figure 6 the transcribed region on
the DNA has an arrow indicating where the RNA polymerase starts and the direction it travels. The
mature mRNA is shown below where boxes represent exons and the filled in sections are the protein
coding region. In this mRNA the start codon is in the first exon and the stop codon is in the
sixth (last) exon. The V's below and connecting the boxes represent the introns which were
removed. This representation allows the mature mRNA and gene sequences to line up vertically. The
figure omits the 5' cap and poly(A) tail that are present in the mature mRNA.

Figure 7 shows the WHITE polypeptide. It has been flattened into two-dimensions so that the six
trans-membrane domains and large cytosolic domain can be seen. The actual polypeptide would join
with a similarly structured BROWN or SCARLET polypeptide to form a cylindrical membrane protein.
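
The predictions made in section 1 about what happens when the white, scarlet, or brown gene is non-functional can be restated as a small truth table. The following is a minimal sketch of the text's simplified two-transporter model; the function name `eye_colour` and its boolean arguments are hypothetical, and the colours returned are the ones the text predicts rather than measured phenotypes.

```python
def eye_colour(white_ok, scarlet_ok, brown_ok):
    """Predict eye colour from which of the three genes are functional.

    Each argument is True when the fly can make that polypeptide.
    """
    # WHITE + SCARLET heterodimer imports tryptophan (brown ommochromes).
    imports_tryptophan = white_ok and scarlet_ok
    # WHITE + BROWN heterodimer imports guanine (orange drosopterins).
    imports_guanine = white_ok and brown_ok

    if imports_tryptophan and imports_guanine:
        return "red"      # both pigments present: wild type
    if imports_guanine:
        return "orange"   # drosopterins only
    if imports_tryptophan:
        return "brown"    # ommochromes only
    return "white"        # no precursors imported, no pigment


if __name__ == "__main__":
    for genotype in [(True, True, True), (True, False, True),
                     (True, True, False), (False, True, True)]:
        print(genotype, eye_colour(*genotype))
```

Because the WHITE polypeptide is shared by both transporters, losing it alone removes both pigments, which is why the function checks `white_ok` in both import steps.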
active. They only cause 0.2% of our spontaneous mutations.

2.3. THE WHITE-APRICOT ALLELE

Geneticists have isolated over a thousand mutant alleles of the white gene. Not all of these have
a white-eyed phenotype though. The white-apricot (w^a) allele has apricot-coloured eyes for
example (Figure 8). It was also caused by a spontaneous transposable element insertion, in this
case one called copia. Because of the location and orientation of the copia element insert, the
w^a allele can still make mRNAs but not at wild type levels. This allows the future pigment cells
to synthesize some WHITE polypeptides. Consequently, there are fewer transporter proteins on the
surface of the cells and thus not enough precursors are imported to make pigments. With fewer of
both types of pigment molecules, the cells and eyes have a pale orange colour.

Figure 8.
Drosophila with white+ (left), white- mutation (middle) and white-apricot mutation (right).
(FlyBase-PD)

• Look at the ocelli. These are three simple eyes on the top of insect heads (Figure 9 right).
  Drosophila adults use these to keep themselves upright as they fly. In Drosophila the ocelli are
  normally red but they are unpigmented in certain mutant strains.

Figure 9.
A mantis with prominent pseudopupils (top) and a wasp with its ocelli circled (bottom).
(Wikipedia-right: Luc Viatour/left: Assafn- CC BY-SA 3.0)
The only explanation that fit was that males had one copy of this eye colour gene while females
must have two. This parallels the situation with the X chromosome: males have one and females have
two. His explanation for the results was that the gene for eye colour was on the X chromosome. We
now call genes such as this X-linked. In a famous statement Morgan concluded:

“The fact is that this R [the allele for red eyes] and X [the X chromosome] are combined, and have
never existed apart.”

His three crosses, using modern nomenclature, are shown in Figure 10.

Figure 10.
A modern depiction of Morgan's crosses and results. Female flies have five thin stripes on their
abdomens while males have two thin and one wide stripe.
(Original-Harrington- CC BY-NC 3.0)

Morgan used these results as confirmation of the chromosomal theory of inheritance. Other
biologists had proposed that genes were on chromosomes but here was the evidence – the gene for
Drosophila eye colour was inherited as if on a specific chromosome, the X chromosome. For this and
other research T. H. Morgan won the Nobel prize in Physiology or Medicine in 1933.

In many cases, the white-eyed flies students work with in genetics labs are the descendants of
that one male fly he discovered over one hundred years ago (or 2500+ generations). Also, the
experiment students do today to demonstrate X-linked inheritance would have given them a Nobel
prize had they done it before Morgan in 1910!

4. THE IMPORTANCE OF THE WHITE GENE

4.1. OTHER DISCOVERIES

Since Morgan's time geneticists have made further discoveries using the white gene and its mutant
alleles. Some highlights include:

• Much of what we know of heterochromatin comes from the analysis of the white-mottled#4 (w^m4)
  allele. A chromosome rearrangement has placed the white gene too close to the centromere. In
  some ommatidia the gene is able to function but in most the gene is silenced. The result is a
  fly with a mosaic of red and white ommatidia (Figure 11).
• The first Drosophila transposable element discovered was in the white-ivory (w^i) allele.
• The first Drosophila gene cloned and sequenced was white. The white-apricot allele played an
  important part.
• A functional white+ gene is used as a marker for small pieces of DNA when scientists introduce
  DNA into Drosophila or move DNA from chromosome to chromosome. These techniques are called
  transfection and transposition, respectively.
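
Morgan's argument that the eye colour gene sits on the X chromosome can be checked with a short sketch that enumerates the offspring of a cross. This is an illustration only, not a reproduction of Figure 10: the function name `cross` and the tuple encoding of genotypes are hypothetical, and the code assumes that the wild type allele w+ (red) is dominant over the mutant allele w (white) and that males carry a single X.

```python
from collections import Counter
from itertools import product


def cross(mother, father):
    """Enumerate offspring of an X-linked cross.

    mother: her two X-linked alleles, e.g. ("w+", "w").
    father: his single X-linked allele and "Y", e.g. ("w", "Y").
    Returns a Counter of (sex, eye colour) over the four egg/sperm pairings.
    """
    offspring = Counter()
    for egg, sperm in product(mother, father):
        if sperm == "Y":
            sex, alleles = "male", (egg,)          # sons get their only X from the mother
        else:
            sex, alleles = "female", (egg, sperm)  # daughters get one X from each parent
        colour = "red" if "w+" in alleles else "white"  # assumes w+ is dominant
        offspring[(sex, colour)] += 1
    return offspring


if __name__ == "__main__":
    print(cross(("w+", "w+"), ("w", "Y")))   # red female x white male
    print(cross(("w", "w"), ("w+", "Y")))    # reciprocal cross
```

The reciprocal cross gives red-eyed daughters and white-eyed sons, a sex-dependent result that would not be expected if the gene were on an autosome.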
___________________________________________________________________________
SUMMARY:
• Insects have compound eyes. Pigment-containing cells line the sides of each ommatidium and serve a
crucial function in vision.
• Heterodimer transport proteins import pigment precursor molecules into these cells during
development.
• Drosophila adults that are unable to make one or both of the transporters do not have the normal red
eye colour.
• Mutations in the white gene reduce or eliminate eye pigmentation.
• The Drosophila white gene was the first X-linked gene discovered and was used by Morgan to confirm
the chromosomal theory of inheritance.
• Geneticists continue to use the white gene as a tool in their research.
KEY TERMS:
compound eye
ommatidia
pigment cell
pigment
transporter protein
heterodimer protein
transposable element
pseudopupil
ocelli
X-linked gene
chromosomal theory of inheritance
QUESTIONS:
1) What is the difference between white, white,
and WHITE?
2) How do Drosophila adults with the white1
mutation perceive the world? Is it darker or
lighter?
3) What colour eyes would a fly have if it had
homozygote mutations in both the sepia and
the brown genes?
4) Other Drosophila cells import tryptophan and
guanine but use them to make the
neurotransmitters serotonin and dopamine.
Does this mean that flies with the white1
mutation have altered behaviour?
Figure 1.
The difference in appearance between pigmented and white peacocks is due to mutation.
(Flickr-ecstaticist- CC BY-NC-SA 2.0)

INTRODUCTION

The techniques of genetic analysis discussed in the chapters on Mendelian inheritance depend on
the availability of two or more alleles for a gene of interest. Where do these alleles come from?
The short answer is mutation, or a change to the DNA sequence.

Humans have an interesting relationship with mutations. From our perspective, mutations can be
extraordinarily useful because they are essential for the domestication and improvement of almost
all the organisms we use as food. On the other hand, mutations are the cause of many cancers and
other diseases that can be devastating to individuals. Yet, the vast majority of mutations
probably go unnoticed and undetected. In this section, we will examine some of the causes of
mutations.

1. MUTATION AND POLYMORPHISM

We have previously noted that an important property of DNA is its sequence fidelity: most of the
time its sequence accurately passes the same information from one generation to the next. However,
DNA sequences can change. Changes in DNA sequences are called mutations. If a mutation changes the
phenotype of an individual, the individual is said to be a mutant (as opposed to wild type).

In a typical population of individuals (e.g. a classroom of students), not all members will have
the same DNA sequence – there is genetic variation. The extent of this variation can be divided
into two categories. First, naturally occurring but rare (<1%), sequence variants that are clearly
different from a normal, wild-type sequence are called mutations. Second, in a population there
may be many naturally occurring variants for a trait for which no wild type can be defined. In
this case we use the term polymorphism to refer to variants of DNA sequences and other phenotypes
that co-exist in a population at relatively high frequencies (>1%). Polymorphisms and mutations
arise through similar biochemical processes, but the use of the word “polymorphism” avoids
implying that any particular allele is more normal or abnormal. For example, a change in a
person’s DNA sequence that leads to a
disease such as albinism is appropriately called a mutation, but a difference in DNA sequence that
explains whether a person has red hair rather than brown, black, or blond hair is an example of
polymorphism.

Molecular markers, which we will discuss in the chapters on DNA variation, are a particularly
useful type of polymorphism for some areas of genetic research.

2. TYPES OF MUTATIONS

Mutations, or lesions, may involve the loss (deletion) or gain (insertion) of one or more base
pairs, or else the substitution of one or more base pairs with another DNA sequence of equal
length. These changes in DNA sequence can arise in many ways, some of which are spontaneous and
due to natural processes, while others are induced by humans intentionally (or unintentionally)
using mutagens. There are many ways to classify mutagens, which are the agents or processes that
cause mutation or increase the frequency of mutations. We will classify mutagens here as being (1)
biological, (2) chemical, or (3) physical in the next section.

Mutations can occur in many locations, with respect to a gene. They can occur within genes, and so
can possibly change the polypeptide sequence from that gene. The severity of that mutation, and
how it affects the gene's function, can be described using Muller’s morphs, which are explained in
Chapter 13. They can also occur in regions that are transcribed but not translated. These are
non-protein coding genes, which can include tRNA, rRNA or siRNA. Mutations that can still affect
gene function can occur in regions that are not transcribed or translated, such as in the promoter
or regulatory regions (enhancer/silencer) of genes. Mutations here will affect the ability of RNA
polymerase to bind and transcribe that gene, and so will ultimately affect the overall levels of
the protein. Lastly, mutations can occur in regions between genes, or within introns. These
mutations usually will not affect the functions of any genes, and so the organism will appear wild
type.

2.1. DELETION AND INSERTION MUTATIONS - FRAME SHIFT

A deletion or insertion mutation may cause dramatic changes in the sequence of the protein. A
deletion is removing base pair(s) from the DNA, and an insertion is inserting new base pair(s)
into the DNA. Remember that three nucleotides, or a codon, code for a single amino acid. If the
insertion or deletion is only three nucleotides, it will maintain the sequence reading frame so
the protein will have one extra or one missing amino acid (Figure 2). The same will occur for
multiples of three (6, 9, 12, etc.). The location of the insertion/deletion will affect the
severity of the mutant allele, but this type of mutation is generally less harmful than
non-multiples of three.

If a deletion or insertion mutation is not a multiple of three, it will cause a frame shift. Every
codon after the insertion or deletion will be shifted over, and the ribosome will start placing
incorrect amino acids after the mutation. This will lead to a severely disrupted protein that will
likely not be able to function properly. A frame shift is also very likely to cause a premature
stop codon. This will lead to a truncated, or shortened polypeptide (Figure 2). If this happens
near the end of the polypeptide sequence, it is likely that a frame shift will not have major
effects on the polypeptide function. If it happens near the start then the protein will likely be
non-functional.
Original sequence                    ATG CCG AAA ATA AGT TTC AGG GGT ...
                                     Met Pro Lys Ile Ser Phe Arg Gly ...

Three base insertion                 ATG CCG AAA CTC ATA AGT TTC AGG GGT ...
with no frameshift                   Met Pro Lys Leu Ile Ser Phe Arg Gly ...

Two base insertion with              ATG CCG AAG CAA TAA GTT TCA GGG GT ...
frameshift and stop codon            Met Pro Lys Gln XXX Val Ser Gly ...

Figure 2.
The top is an example sequence, and the amino acids produced from those codons. When a three base
insertion occurs, there isn’t a frameshift, but just an insertion of a new amino acid. When a two
base insertion occurs, it causes a frame shift of every base after, in this situation leading to a
premature stop codon. Purple shows inserted bases, green shows affected bases and amino acids, red
XXX is the stop codon.
(Original-L. Canham- CC BY-NC 3.0)
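
The effect of the two insertions in Figure 2 can be reproduced with a short sketch that translates the coding strand codon by codon using the standard genetic code. The helper names `translate` and `insert_bases` are hypothetical, the output uses one-letter amino acid codes, and translation stops at the first stop codon. Note that in the standard code CAA is read as glutamine (Q).

```python
BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a coding-strand DNA sequence until a stop codon or the end."""
    protein = []
    for start in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[start:start + 3]]
        if amino_acid == "*":
            break  # premature or normal stop: the ribosome releases the polypeptide
        protein.append(amino_acid)
    return "".join(protein)


def insert_bases(dna, position, bases):
    """Return the sequence with `bases` inserted before index `position`."""
    return dna[:position] + bases + dna[position:]


if __name__ == "__main__":
    original = "ATGCCGAAAATAAGTTTCAGGGGT"
    print(translate(original))                           # MPKISFRG
    print(translate(insert_bases(original, 9, "CTC")))   # MPKLISFRG (one extra amino acid)
    print(translate(insert_bases(original, 8, "GC")))    # MPKQ (frame shift, early stop)
```

The three base insertion adds a single leucine and leaves the rest of the protein unchanged, while the two base insertion changes the reading frame and truncates the polypeptide after four amino acids.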
Figure 4.
Examples of substitution
mutations that lead to silent,
nonsense or missense
mutations.
(Wikipedia-Jonsta247- CC BY-SA
4.0)
Figure 6.
Strand-slippage can occur
occasionally during replication,
especially in regions with short,
repeated sequences. This can
lead to either deletion (left) or
insertion (right) of sequences
compared to the products of
normal replication (center),
depending on whether the
template strand or daughter
strand “loops-out” during
replication.
(Original-Deyholos- CC BY-NC
3.0)
This strand-slippage causes one or more bases on either strand to be temporarily displaced in a
loop that is not paired with the opposite strand. If this loop forms on the template strand, the
bases in the loop may not be replicated, and a deletion will be introduced in the growing daughter
strand. Conversely, if a region of the daughter strand that has just been replicated becomes
displaced in a loop, this region may be replicated again, leading to an insertion of additional
sequence in the daughter strand, as compared to the template strand. Frame shift mutations account
for mutation hot spots in some genes. Hot spots are sites in a gene that are significantly more
mutable than other sites.

Regions of DNA that have several tandem repeats of the same few nucleotides are especially prone
to this type of error during replication. Thus regions with short-sequence repeats (SSRs) tend to
be highly polymorphic, and are therefore particularly useful in genetics. They are called
microsatellites.

3.3. TRINUCLEOTIDE REPEATS

Some regions of the genome have repeated sequences. Dinucleotide repeats (e.g. AGAGAGAG) are
common throughout the genome, as well as larger repeats such as VNTRs (Variable Number of Tandem
Repeats). Trinucleotide repeats (e.g. ..CGGCGGCGGCGG..), though, have a tendency to expand the
region of their repeat, which leads to genetic diseases. These are known as trinucleotide repeat
diseases. If there are a low number of repeats, the gene can be stable and the polymerase is able
to faithfully replicate the repeats. Strand slippage, as described in the last section, can cause
an increase in the number of repeats in that region. If it only increases slightly, this usually
doesn’t cause instability. Once a threshold is reached, it becomes more difficult for polymerases
to faithfully replicate the repeated region, and the repeats can grow, often very quickly. This
mostly occurs in the germline of individuals, and often leads to their offspring having more and
more repeats. This threshold is different for different genes.

An example of this can be seen with the trinucleotide repeat disease Huntington’s Disease.
Huntington’s has the repeat CAG within the reading frame of the Huntingtin gene (HTT). A normal
HTT gene will have fewer than 28 repeats. If more repeats are gained, through strand slippage, to
28-35 repeats, then this is called a pre-mutation. At this point, the DNA polymerase cannot
faithfully reproduce the repeats and strand slippage occurs more frequently. Such a person is
unaffected, but their children may inherit the HTT gene with an increased number of repeats, and
their children after that even more. The more repeats, the more severe the disease. Once the
repeats get above 40, the HTT gene will cause the disease, which is a neurodegenerative disorder
that leads to a
decreased life expectancy. With the CAG repeat within the HTT gene reading frame, one can
understand why increasing the repeats will cause a decrease in protein function, as the protein
will be gaining more and more of the amino acid glutamine.

Not all trinucleotide repeat disorders are caused by repeats within the reading frame though, for
example Fragile X syndrome. Fragile X syndrome is a genetic congenital disease characterized by
both intellectual disability and physical abnormalities like a long face and large ears in men.
The gene associated with this disorder is FMR-1. At the 5’ end of the FMR-1 mRNA, before the
translational start site, is a CGG repeat. Most individuals have between 5 and 54 repeats of CGG.
Parents with a premutation, where they are prone to pass on the disease but are not affected
themselves, have between 60 and 200 repeats. Children of individuals with this premutation will
have a greatly expanded region of around 200 to 4000 repeats and will exhibit the syndrome. Since
this is not in the reading frame, why does this expanded repeat cause the disease? It is predicted
that the expanded repeat causes aberrant methylation of base pairs in the 5’ upstream region,
which leads to changes in chromatin structure and gene silencing. The karyogram of an affected
individual exhibits a constricted region on the X chromosome, which is fragile and is prone to
chromosomal breakage. This constriction is caused by the hypermethylation on the large number of
FMR-1 repeats.

common. This reaction occurs because of endogenous metabolites within the cell.

Deamination is the removal of an amine group through a hydrolysis reaction. When looking at the
individual nucleotides (See Chapter 2), adenine, guanine and cytosine all have amine groups on the
base. Like depurination, deamination can occur spontaneously due to metabolites within the cell.
In cytosine deamination the cytosine will change to a uracil (Figure 8) and free ammonia. This
change can be easily corrected, as uracil is only found in RNA, not DNA. So when a uracil is found
in DNA, it is removed and replaced again with a cytosine. If this does not occur though, the
uracil will pair with adenine, leading to a GC to AT transition mutation.

5-methylcytosine is a cytosine with a methyl-tag. Deamination of 5-methylcytosine is the most
common deamination mutation. It leads to thymine and ammonia. In this situation a T would
then be opposite a G. This is fixed prior to the passage of the replication fork, but because the
repair systems do not know where the original C was located, this can lead to a point mutation at
that location.

Deamination of guanine creates the base xanthine. Xanthine is more prone to pair with thymine
instead of cytosine. If unfixed, this will cause a GC to AT mutation. Deamination of adenine leads
to hypoxanthine, which is more prone to pair with cytosine instead of thymine. This causes an AT
to GC mutation.

Oxidative damage is caused by reactive oxygen species such as superoxide (O2•-), hydrogen peroxide
(H2O2) and hydroxyl radicals (•OH). These are normal byproducts in cells, but are also increased
in areas of inflammation as the body uses oxidizing species to attack pathogens. A common product
of oxidative damage is 8-oxo-7-hydrodeoxyguanosine (8-oxo dG), which is formed from oxidatively
damaged guanine. 8-oxo dG mispairs with A, causing GC to TA mutations.

3.5. MUTATIONS FROM TRANSPOSABLE ELEMENTS

Mutations can also be caused by the insertion of viruses, transposable elements (see below), and
other types of DNA that are naturally inserted at more or less random positions in chromosomes.
The insertion may disrupt the coding or regulatory sequence of a gene, including the fusion of
part of one gene with another. These insertions can occur spontaneously, or they may also be
intentionally stimulated in the laboratory as a method of mutagenesis called transposon-tagging.
For example, a type of transposable element called a P element is widely used in Drosophila as a
biological mutagen (see Chapter 30). T-DNA, which is a transposable element modified from a
bacterial pathogen, is used as a mutagen in some plant species.

Transposable elements (TEs) are also known as mobile genetic elements, or more informally as
jumping genes. They are present throughout the genome of almost all organisms. These DNA sequences
have a unique ability to be cut or copied from their original location and inserted into new
locations in the genome. This is called transposition. The insert locations are not entirely
random, but TEs can, in principle, be inserted into almost any region of the genome. TEs can
therefore insert into genes, disrupting their function and causing mutations.

Researchers have developed methods of artificially increasing the rate of transposition, which
makes some TEs a useful type of mutagen. However, the biological importance of TEs extends far
beyond their use in mutant screening. TEs are also important causes of disease and phenotypic
instability, and they are a major mutational force in evolution.
Figure 9.
Diagrams of the two main types of
transposable elements (TEs). Class I
elements transpose via an ssRNA
intermediate, which is reverse
transcribed to dsDNA prior to insertion
of this copy in a new site in the
genome. Class II elements do not
involve an RNA intermediate; most
Class II elements are cut from their
original location as dsDNA, prior to
being inserted into a new site in the
genome. Although the diagram shows
TEs being inserted on the same
chromosome as they originated from,
TEs can also move to other
chromosomes within the same cell.
(Original-Deyholos- CC BY-NC 3.0)
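The two mechanisms in Figure 9 can be contrasted with a small toy model. The Python sketch below is only an illustration under simplifying assumptions: the genome is a plain list of labelled segments, and the function names (cut_and_paste, copy_and_paste) and segment labels are hypothetical, not taken from any library or from the original text.

```python
# Toy model of the two transposition mechanisms described in Figure 9.
# A "genome" is a list of labelled segments; "TE" marks a transposable element.
# All names and labels are hypothetical and chosen only for illustration.

def cut_and_paste(genome, source, target):
    """Class II style: excise the TE at index `source`, then reinsert it.

    `target` is an index into the genome as it exists after excision.
    """
    if genome[source] != "TE":
        raise ValueError("no TE at the source position")
    excised = genome[:source] + genome[source + 1:]  # donor site loses the TE
    return excised[:target] + ["TE"] + excised[target:]


def copy_and_paste(genome, source, target):
    """Class I style: the original TE stays in place and a new copy is inserted."""
    if genome[source] != "TE":
        raise ValueError("no TE at the source position")
    return genome[:target] + ["TE"] + genome[target:]


genome = ["geneA", "TE", "geneB", "geneC"]
after_cut = cut_and_paste(genome, source=1, target=2)
after_copy = copy_and_paste(genome, source=1, target=3)

print(after_cut, "TE copies:", after_cut.count("TE"))
print(after_copy, "TE copies:", after_copy.count("TE"))
```

In this simplified model the cut-and-paste move leaves the TE copy number unchanged, while the copy-and-paste move increases it by one. The model ignores the RNA intermediate, the enzymes involved, and any repair of the donor site.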
There are two major classes of TEs in eukaryotes (Figure 9).
Class I elements include retrotransposons; these are mobile by means of an RNA intermediate. The TE
transcript is reverse-transcribed into DNA before being inserted elsewhere in the genome through
the action of enzymes such as integrases.
Class II elements are known also as transposons. They do not use reverse transcriptase or an RNA
intermediate for transposition. Instead, they use an enzyme called transposase to cut DNA from the
original location and then this excised dsDNA fragment is inserted into a new location. Note that
the name transposon is sometimes used incorrectly to refer to any type of TE, but in this book we
use transposon to refer specifically to Class II elements.
TEs are relatively short DNA stretches of 100-10,000 bp, and encode no more than a few proteins (if
any). Normally, the protein-coding genes within a TE are all related to the TE’s own transposition
functions. These proteins may include reverse transcriptase, transposase, and integrase. However,
some TEs (of either Class I or II) do not encode any proteins at all. These non-autonomous TEs can
only transpose if they are supplied with enzymes produced by other, autonomous TEs located
elsewhere in the genome. In all cases, enzymes for transposition recognize conserved nucleotide
sequences within the TE, which dictate where the enzymes begin cutting or copying.

The human genome consists of nearly 45% TEs, the vast majority of which are families of Class I
elements called Long Interspersed Elements (LINEs) and Short Interspersed Elements (SINEs). The
short, Alu type of SINE occurs in more than one million copies in the human genome (compare this to
the approximately 21,000 non-TE, protein-coding genes in humans). Indeed, TEs make up a significant
portion of the genomes of almost all eukaryotes. Class I elements, which usually transpose via an
RNA copy-and-paste mechanism, tend to be more abundant than Class II elements, which mostly use a
cut-and-paste mechanism. But even the cut-and-paste mechanism can lead to an increase in TE copy
number. For example, if the site vacated by an excised transposon is repaired with a DNA template
from a homologous chromosome that itself contains a copy of a transposon, then the total number of
transposons in the genome will increase.
Besides greatly expanding the overall DNA content of genomes, TEs contribute to genome evolution in
many other ways. As already mentioned, they may disrupt gene function by insertion into a gene’s
coding region or regulatory region. More interestingly, adjacent regions of chromosomal DNA are
sometimes mistakenly transposed along with the TE; this can lead to gene duplication. The
duplicated genes are then free to evolve independently, leading in some cases to the development
of new functions. The breakage of strands by TE excision and integration can disrupt genes, and can
lead to chromosome rearrangement or deletion if errors are made during strand rejoining.
Furthermore, having so many similar TE sequences distributed throughout a chromosome sometimes
allows mispairing of regions of homologous chromosomes at meiosis, which can cause unequal
crossing-over, resulting in the deletion or duplication of large segments of chromosomes. Thus,
TEs are a potentially important evolutionary force, and may not be dismissed as merely “junk DNA”,
as they once were.

4. INDUCED MUTATIONS OF CHEMICAL ORIGIN
Many chemical compounds, whether natural or synthetic, can react with DNA and cause mutations. In
some of these reactions the chemical structure of particular bases may change, so that they are
misread during replication. In other cases the chemical mutagens distort the double helix, causing
it to be replicated inaccurately, while still other compounds may cause breaks in chromosomes that
lead to deletions and other types of aberrations. The following are examples of two classes of
chemical mutagens that are important in genetics and medicine: alkylating agents and intercalating
agents.

4.1. ALKYLATING AGENTS
Ethyl methanesulfonate (EMS) is an example of an alkylating agent that is commonly used by
geneticists to induce mutations in a wide range of
Figure 12.
Mutagenesis with 5-Bromouracil
(Wikipedia-Allen Gathman-CC BY-SA 2.5)
isomer that pairs with guanine. This ultimately
causes an A to G transition (Figure 12).
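The A to G change above is a transition because one purine replaces another; a change between a purine and a pyrimidine is a transversion. A minimal Python sketch of that classification follows. The function name classify_substitution is hypothetical and used only for illustration.

```python
# Classify a single-base substitution as a transition or a transversion.
# Purines are A and G; pyrimidines are C and T.

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Return 'transition', 'transversion', or 'no change' for ref -> alt."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be A, C, G or T")
    if ref == alt:
        return "no change"
    # A transition stays within one chemical class; a transversion crosses classes.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


print(classify_substitution("A", "G"))  # the A to G change described above
print(classify_substitution("G", "T"))  # purine to pyrimidine
```

The function looks at one strand only; a GC to AT change written as a base pair corresponds to G to A on one strand and C to T on the other, and both are transitions.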
5. INDUCED MUTATIONS OF PHYSICAL ORIGIN
Anything that damages DNA by transferring energy to it can be considered a physical mutagen.
Usually this involves radioactive particles, x-rays, or ultraviolet (UV) light. The smaller, fast
moving particles may cause base substitutions or delete a single base, while larger, slightly
slower particles may induce larger deletions by breaking the double stranded helix of the
chromosome. For example, X-rays can cause DNA double stranded breaks.
Physical mutagens can also create unusual structures in DNA, such as the pyrimidine dimers formed
by UV light (Figure 13). Pyrimidine dimers are covalent linkages between two adjacent pyrimidines,
with thymine dimers being the most common. When a cell is trying to replicate its DNA, it cannot go
through the dimer and so is forced to stop. Replication can only proceed if DNA repair enzymes fix
the damage. Pyrimidine dimers cause conformational changes in the DNA, so they are easily
recognized by DNA repair enzymes, but are often repaired incorrectly.

Figure 13.
Thymine dimers are formed when adjacent thymine bases on the same DNA strand become covalently
linked (red bonds) following exposure to mutagens such as UV light. The dimers distort base pairing
and can interrupt processes such as replication. (Original-Deyholos- CC BY-NC 3.0)
Figure 14.
UV photons can cause adjacent pyrimidines to bond with each other. This distorts the DNA, creating
a bulge that prevents polymerases from passing by the lesion.
(Wikipedia- NASA/David Herring-PD)

The most common types of mutations from UV light are GC to AT transitions, but GC to TA, AT to TA,
AT to CG and CG to GC can all be caused by UV mutagenesis.

UV light is a very broad mutagen that can cause many mutation types. Compare this with EMS, which
mostly creates GC to AT mutations; or AFB1, which mostly creates GC to TA mutations, but can cause
some other mutations in low frequencies as well.

6. FAILURE OF REPAIR SYSTEMS
For each type of damage cells have a way to fix it. These repair systems include but aren’t limited
to base excision repair, nucleotide excision repair, mismatch repair, non-homologous end joining
and homologous recombination. The mechanisms of each repair are not important at this time. All
these systems require multiple enzymes to recognize the specific type of mutagenic lesion and
repair it as accurately as possible. In certain situations, DNA repair systems are unable to cope
with DNA damage, either because the lesions are too numerous for the enzymes to be able to
recognize and repair all of them, or because there is damage to the DNA repair systems themselves.

6.1. EXCESSIVE DNA DAMAGE
A cell may be exposed to DNA damage past a threshold that it is normally capable of dealing with.
Such DNA damage usually prevents the DNA polymerase from normally replicating the DNA and it
becomes ‘stuck’. If the cell is unable to find a way to continue replicating, it will die.
Alternatively, it may switch to an error-prone DNA repair system.
In prokaryotes this is called SOS repair, and when induced it will recruit error-prone DNA
translesion polymerases to continue to replicate the DNA past the mutagenic lesion, usually causing
errors in the place of the DNA damage. A similar process, called translesion synthesis, is seen in
eukaryotes.
If the final efforts to rescue a cell from DNA damage are successful, the cell will be able to
survive replication, but will be full of mutations that are potentially deleterious to its
long-term health. If they are not successful, the cell will enter one of three possible states: (1)
a state of dormancy, or senescence, where the cell is still living but no longer functional; (2)
programmed cell death, or apoptosis, will be activated and the cell will die; or (3) unregulated
cell division, where the cell will divide rapidly despite numerous mutations and chromosomal
abnormalities (Figure 15). This can lead to cancerous tumours (Chapter 41).
Figure 15.
Most DNA damage is repaired in
a healthy cell. If the rate of DNA
damage exceeds the rate of
repair, a cell either undergoes
senescence, apoptosis or
uncontrolled cell growth.
(Wikipedia-Harold Brenner)
Translesion synthesis is often used when there are excessive DNA lesions from thymine dimers or
apurinic sites. If these lesions are not able to be repaired through their normal mechanism,
specific translesion polymerases are recruited. Like in the SOS response, the translesion
polymerase is error prone and will often insert incorrect bases in the areas of the lesion.
6.2. DAMAGED DNA REPAIR SYSTEMS
Another source of mutations is when the DNA repair systems fail. The genes that make the proteins
for the various DNA repair machinery are just like any other genes: they can be mutated and cease
to function. Many people are homozygous wild type for these DNA repair genes. Also, most of these
genes are haplosufficient, meaning that only one copy of the normal allele is required to produce a
normal DNA repair system. Those who are heterozygous are at a greater risk though. They may be one
mutation away from completely losing function of that DNA repair system gene. When that gene is
lost, DNA repair mechanisms might not be able to work as well, or at all, and mutagenic lesions can
occur within the genome, leading to permanent mutations.
If an individual is heterozygous for DNA repair genes, then individual cells are at a higher risk
when exposed to mutagens. Since UV light is one of the most common mutagens we encounter, the loss
of DNA repair proteins in skin cells exposed to UV light can make those cells more susceptible to
cancer. Similarly, individuals who smoke increase mutagens in their lungs. Thus, if heterozygous
they are more susceptible to DNA damage and cancer in their lungs compared to a homozygous wild
type individual.
If damage to DNA repair system genes happens in the gametes of both parents, then they can pass
that on to their child, making the child homozygous mutant in that specific DNA repair gene.
Instead of individual cells being at risk, every cell in the homozygous mutant child will lack a
functional copy of that DNA repair gene, causing problems with DNA repair. An example of this is
the disease Xeroderma pigmentosum, which is caused by a mutant version of a nucleotide excision
repair enzyme. Individuals with this inherited disease are extremely susceptible to UV light, and
develop skin cancer very easily and usually die at a young age.
Xeroderma pigmentosum is just one of many examples of people who are born with defects in one of
the DNA repair system genes. Most of these diseases lead to various types of cancers, particularly
many hereditary colorectal cancers, like hereditary nonpolyposis colorectal cancer (HNPCC). The
breast cancer genes, BRCA1 and BRCA2, are associated with DNA repair as well, and when mutant lead
to early onset breast and ovarian cancers.
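The idea that a heterozygote is "one mutation away" from losing a haplosufficient repair gene can be made concrete with a rough calculation. The Python sketch below assumes, purely for illustration, that each functional copy in a cell is inactivated independently with the same probability; the function name and the numerical value of p are hypothetical and are not measured rates.

```python
# Rough illustration of why heterozygotes for a DNA repair gene are at greater risk.
# Assumption (not from the text): each functional copy is lost independently
# with the same probability p in a given cell.

def prob_all_copies_lost(functional_copies, p_loss_per_copy):
    """Chance that every functional copy in one cell has been inactivated."""
    if functional_copies < 0 or not 0.0 <= p_loss_per_copy <= 1.0:
        raise ValueError("invalid input")
    return p_loss_per_copy ** functional_copies


p = 1e-6  # hypothetical per-copy loss probability, for illustration only

homozygous_wild_type = prob_all_copies_lost(2, p)  # needs two independent hits
heterozygous = prob_all_copies_lost(1, p)          # needs only one hit

print("homozygous wild type:", homozygous_wild_type)
print("heterozygous:        ", heterozygous)
```

Under these assumptions a heterozygous cell loses all repair function with probability p, while a homozygous wild type cell does so with probability p squared, which is far smaller for any small p.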
SUMMARY:
• Variations in DNA sequence that originated recently, and are rare in a population, are called mutations.
• Variations in DNA sequence that co-exist in a population, and neither one can be meaningfully defined
as wild type, are called polymorphisms.
• Mutations may either occur spontaneously, or may be induced by exposure to mutagens.
• Mutations may result in substitutions, deletions, insertions or chromosomal rearrangements.
• Spontaneous mutations arise from many sources including natural errors in DNA replication, usually
associated with base mispairing, or else insertion/deletion, especially within repetitive sequences.
Occasionally metabolites within a cell can catalyze spontaneous mutations.
• Transposable elements are dynamic, abundant components of eukaryotic genomes and important
forces in evolution.
• Induced mutations result from mispairing, DNA damage, or sequence interruptions caused by
chemical, or physical mutagens.
• DNA repair systems can fail through excessive mutations so the cell cannot cope, or by loss of function
of the DNA repair genes themselves.
• When DNA repair systems fail, a cell will senesce, undergo apoptosis or become a cancer.
KEY TERMS:
mutation mispairing LINEs
mutant strand slippage SINEs
wild type loop Alu
polymorphism mutation hot spot copy-and-paste
lesion short-sequence repeats (SSRs) cut-and-paste
deletion microsatellites alkylating agent
insertion trinucleotide repeat diseases intercalating agent
substitution depurination EMS
spontaneous apurinic site benzopyrene
induced deamination carcinogenic
mutagen oxidative damage ethidium bromide
codon transposon-tagging aflatoxin
frame shift P element AFB1
premature stop codon T-DNA base analog
truncated transposable elements (TEs) 5-bromouracil
transition mobile genetic elements UV light
transversion Class I TE pyrimidine dimer
silent mutation retrotransposon DNA repair systems
missense mutation integrase senescence
conservative Class II TE apoptosis
non-conservative transposon cancer
nonsense mutation transposase DNA damage threshold
translocation reverse transcriptase error-prone DNA repair
inversion non-autonomous translesion polymerases
DNA replication error autonomous Xeroderma pigmentosum
Figure 1.
A breed of cat, the Canadian
Sphynx, lacks hair due to a genetic
mutation. The Sphynx breed
originated in Minnesota, but the
Canadian Sphynx line was started
in Toronto in 1966 through a
selective breeding program from a
spontaneous mutation that gave
naked kittens. The hairlessness
mutation is inherited in an
autosomal recessive manner.
(Flickr- Weimar Meneses -CC BY
2.0)
INTRODUCTION
When we think of the word "mutation", we automatically think of it as something negative or
detrimental. However, a mutation, which is a change in the DNA sequence, may have one or more
effects on an organism, depending on what it is and in which gene it occurs. While detrimental
effects are most common, sometimes mutations can create new features. These mutations give us a
tool with which to investigate the gene and the biological processes in which it is involved. In
this chapter we will first take a look at how scientists perform genetic screening for mutations,
and the various consequences of those mutations.

1. GENETIC SCREENING FOR MUTATIONS: FORWARD GENETICS, REVERSE GENETICS
Forward genetic screening refers to the process of finding the gene or genes responsible for a
phenotype or biochemical process. One way to identify genes that affect a particular biological
process is to induce random mutations in a large population, and then look for mutants with
phenotypes that might be caused by a disruption of a particular biochemical pathway. This is the
strategy of mutant screening, and has been used very effectively to identify and understand the
molecular components of hundreds of different biological processes. For example, to find the basic
biological processes of memory and learning, researchers have screened mutagenized populations of
Drosophila to recover flies (or larvae) that lack the normal ability to learn (yes, Drosophila can
learn). Such mutants lack the ability to associate a particular odor with an electric shock.
Because of the similarity of biology among all organisms (common descent), some of the genes
identified by this mutant screen of a model organism may be relevant to learning and memory in
humans, including conditions such as Alzheimer’s disease.
On the other hand, reverse genetic screening refers to the process of creating a mutation in a
certain gene, then identifying the phenotypic consequences of that specific mutant gene on the
organism. This method is becoming more useful with the advent of whole genome sequencing. Here, we
have identified the gene sequences, but are unsure of what each gene does.
1.1. GENETIC SCREENS
In a typical mutant screen, researchers treat a parental population with a mutagen. This may
involve soaking seeds in EMS, or mixing a mutagen with the food fed to flies. Usually, no
phenotypes are visible among the individuals that are directly exposed to the mutagen because in
all the cells every strand of DNA will be affected independently. Thus, the induced mutations will
be heterozygous and limited to single cells.

However, what is most important to geneticists are the mutations in the germline of the mutagenized
individuals. The germline is defined as the gametes and any of their developmental precursors, and
is therefore distinct from the somatic cells (i.e. non-reproductive cells) of the body. Because
most induced mutations are recessive, the progeny of mutagenized individuals must be mated in a way
that allows the newly induced mutations to become homozygous (or hemizygous). Strategies for doing
this vary between organisms. In any case, the generation in which induced mutations are expected to
show a phenotype can be examined for the presence of novel traits. Once a relevant mutant has been
identified, geneticists can begin to make inferences about what the normal function of the mutated
gene is, based on its mutant phenotype. This can then be investigated further with molecular
genetic techniques to connect the gene function with the external appearance.

2. SOME MUTATIONS MAY NOT HAVE DETECTABLE MUTANT PHENOTYPES

Not all DNA sequence changes result in mutant phenotypes. Various reasons are described below.

2.1. SILENT CHANGES
After mutagen treatment, the vast majority of base pair changes (especially substitutions) have no
obvious effect on the phenotype. Often, this is because the change occurs in the DNA sequence of a
non-coding region of the DNA, such as in intergenic regions (between genes) or within an intron
where the sequence does not code for protein and is not essential for proper mRNA splicing. Also,
even if the change affects the coding region, it may not alter the amino acid sequence (recall that
the genetic code is degenerate; for example, GCT, GCC, GCA, and GCG all encode alanine); such a
change is referred to as a silent mutation. Additionally, the base substitution may change an amino
acid without quantitatively or qualitatively altering the function of the product, so no
phenotypic change would occur.

2.2. ENVIRONMENT AND GENETIC REDUNDANCY
There are situations where a mutation can cause a complete loss-of-function of a gene, yet not
produce a change in the phenotype, even when the mutant allele is homozygous. The lack of a visible
phenotypic change can be due to environmental effects: the loss of that gene product may not be
apparent in that specific environment, but might be in another. An example is an auxotrophic mutant
on complete medium. Conversely, researchers can alter the environment to reveal such mutants (e.g.
auxotrophs on minimal media).

Alternatively, the lack of a phenotype might be attributed to genetic redundancy. That is, the
mutant gene’s lost function is compensated by another gene, at another locus, encoding a similarly
functioning product. Thus, the loss of one gene is compensated by the presence of another. The
concept of genetic redundancy is an important consideration in genetic screens. A gene whose
function can be compensated for by another gene cannot be easily identified in a genetic screen for
loss-of-function mutations.

2.3. ESSENTIAL GENES AND LETHAL ALLELES
Some mutants may be required to reach a particular developmental stage before the phenotype can be
seen or scored. For example, flower color can only be scored in plants that are mature enough to
make flowers, and eye color can only be scored in flies that have developed to the adult stage.
However, some mutant organisms may not develop sufficiently to reach a stage that can be scored for
a particular phenotype. Mutations in essential genes create recessive lethal alleles that arrest or
derail the development of an individual at an immature (embryonic, larval, or pupal) stage. This
type of mutation may therefore go unnoticed in a typical mutant screen because they are absent from
the
results, which disrupts the properties of the liquid layer that normally forms on the epithelial
surface. In the lungs, this causes mucus to accumulate and can lead to infection. Defects in CFTR
also affect the pancreas, liver, intestines, and sweat glands, all of which need this ion
transport. CFTR is also expressed at high levels in the salivary gland and bladder, but defects in
CFTR function do not cause problems in these organs, probably because other ion transporters are
able to compensate.
Over one thousand different mutant alleles of CFTR have been described. Any mutation that prevents
CFTR from sufficiently transporting ions can lead to cystic fibrosis (CF). Worldwide, the most
common CFTR allele among CF patients is called ΔF508 (delta-F508; or PHE508DEL), which is a
deletion of three nucleotides that eliminates a phenylalanine from position 508 of the 1480 aa
wild-type protein. Mutation ΔF508 causes CFTR to be folded improperly in the endoplasmic reticulum
(ER), which then prevents CFTR from reaching the cell membrane. ΔF508 accounts for approximately
70% of CF cases in North America, with ~1/25 people of European descent being carriers. The high
frequency of the ΔF508 allele has led to speculation that it may confer some selective advantage to
heterozygotes, perhaps by reducing dehydration during cholera epidemics, or by reducing
susceptibility to certain pathogens that bind to epithelial membranes.
CF is also notable because it is one of the well-characterized genetic diseases for which a drug
has been developed that compensates for the effects of a specific mutation. The drug, Kalydeco
(Ivacaftor), was approved by the FDA and Health Canada in 2012, decades after the CFTR gene was
first mapped to DNA markers (in 1985) and cloned (in 1989). Kalydeco is effective on only some CFTR
mutations, most notably G551D (i.e. where glycine is substituted by aspartic acid at position 551
of the protein; GLY551ASP). This mutation is found in less than 5% of CF patients. The G551D
mutation affects the ability of ATP to bind to CFTR and open the channel for transport. Kalydeco
compensates for this mutation by binding to CFTR and holding it in an open conformation. Kalydeco
is expected to cost approximately $250,000 per patient per year.
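The carrier frequency quoted above can be turned into a rough estimate of how often an affected child is expected. The Python sketch below is a back-of-the-envelope calculation, not a figure from the text: it assumes random mating, that CF is inherited as an autosomal recessive condition, that both parents are unaffected, and that the ~1/25 carrier frequency applies to both parents.

```python
# Back-of-the-envelope estimate based on the ~1/25 carrier frequency in the text.
# Assumptions (not from the text): random mating, autosomal recessive inheritance,
# both parents unaffected, and the same carrier frequency for both parents.

from fractions import Fraction

carrier_frequency = Fraction(1, 25)             # ~1/25 carriers, from the text
p_affected_given_two_carriers = Fraction(1, 4)  # carrier x carrier cross

# Chance that both parents of a randomly chosen couple are carriers.
p_both_parents_carriers = carrier_frequency ** 2

# Chance that a child of a randomly chosen couple is homozygous mutant.
p_affected_child = p_both_parents_carriers * p_affected_given_two_carriers

print("both parents carriers:", p_both_parents_carriers)
print("affected child:       ", p_affected_child)
```

Exact fractions are used so that the result stays readable as a ratio rather than a rounded decimal.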
___________________________________________________________________________
SUMMARY:
• Forward genetic screening aims to find the molecular basis for a certain phenotype whereas reverse
genetic screening aims to find the phenotypic effects that a gene might have on the organism.
• Somatic mutations occur in non-reproductive cells which affect the current individual, while germline
mutations occur in the gametes which affect future generations and not the individual.
• Mutations can alter the level or type of a gene’s expression.
• Not all base pair changes (mutations) cause detectable changes in an organism. The efficiency of
mutant screening is limited by silent mutations, redundancy, and embryonic lethality.
• Cystic Fibrosis is a genetic disease caused by mutations in the CFTR gene.
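As a concrete illustration of the silent changes mentioned in the summary, the Python sketch below classifies a single-codon change as silent, missense, or nonsense using the standard genetic code. It is a minimal sketch: the function name classify_codon_change is hypothetical, codons are written as DNA (T rather than U), and changes that remove a stop codon are not treated as a separate category.

```python
# Classify a codon change as silent, missense, or nonsense.
# The table is the standard genetic code, with bases ordered T, C, A, G
# and "*" marking a stop codon.

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_codon_change(before, after):
    """Compare the amino acids encoded by two codons (DNA alphabet)."""
    aa_before = CODON_TABLE[before.upper()]
    aa_after = CODON_TABLE[after.upper()]
    if aa_before == aa_after:
        return "silent"
    if aa_after == "*":
        return "nonsense"
    # Any other amino acid change (stop-loss included, in this simple sketch).
    return "missense"


# The four alanine codons named in the text all encode the same amino acid.
print({codon: CODON_TABLE[codon] for codon in ("GCT", "GCC", "GCA", "GCG")})
print(classify_codon_change("GCT", "GCG"))
print(classify_codon_change("TGG", "TGA"))
print(classify_codon_change("GAA", "GTA"))
```

The sketch only compares encoded amino acids; it says nothing about whether a missense change actually alters the function of the protein, which is the second kind of phenotypically silent change described in the chapter.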
KEY TERMS:
mutant screen recessive lethal allele
loss-of-function double strand break
gain-of-function non-homologous end joining
null DNA repair system
dominant negative chromosome rearrangement
somatic cells CFTR
germline cells Cystic Fibrosis (CF)
silent mutation ΔF508 (PHE508DEL)
inter-genic region Kalydeco
redundancy
essential gene
STUDY QUESTIONS:
1) You have a female fruit fly, whose father was exposed to a mutagen (she, herself, wasn’t).
   Mating this female fly with another non-mutagenized, wild type male produces offspring that all
   appear to be completely normal, except there are twice as many daughters as sons in the F1
   progeny of this cross.
   a) Propose a hypothesis to explain these observations.
   b) How could you test your hypothesis?
2) You decide to use genetics to investigate how your favourite plant makes its flowers smell good.
   a) What steps will you take to identify some genes that are required for production of the
      sweet floral scent? Assume that this plant is a self-pollinating diploid.
   b) One of the recessive mutants you identified has fishy-smelling flowers, so you name the
      mutant (and the mutated gene) fishy. What do you hypothesize about the normal function of
      the wild-type fishy gene?
   c) Another recessive mutant lacks floral scent altogether, so you call it nosmell. What could
      you hypothesize about the normal function of this gene?
3) Suppose you are only interested in finding dominant mutations that affect floral scent.
   a) What do you expect to be the relative frequency of dominant mutations, as compared to
      recessive mutations, and why?
   b) How will you design your screen differently than in the previous question, in order to
      detect dominant mutations specifically?
   c) Which kind of mutagen is most likely to produce dominant mutations, a mutagen that produces
      point mutations, or a mutagen that produces large deletions?
4) You are interested in finding genes involved in synthesis of proline (Pro), an amino acid that
   is normally synthesized by a particular model organism.
   a) How would you design a mutant screen to identify genes required for Pro synthesis?
   b) Imagine that your screen identified ten mutants (#1 through #10) that grew poorly unless
      supplemented with Pro. How could you determine the number of different genes represented by
      these mutants?
   c) If each of the four mutants represents a different gene, what will be the phenotype of the
      F1 progeny if any pair of the four mutants are crossed?
   d) If each of the four mutants represents the same gene, what will be the phenotype of the F1
      progeny if any pair of the four mutants are crossed?
Figure 1.
A flower called Camellia showing co-dominance of the red
and white alleles of flower colour.
(Flickr- darwin cruz-CC BY 2.0)
INTRODUCTION
The previous chapter described the consequences of mutations. We will now use the mutant forms of
a gene to investigate the interactions of alleles at a single locus. This will begin with the
difference between somatic and germ line mutations. Then it will deal with simple
dominance/recessive relationships, which many students have encountered before. It will end with
more sophisticated interactions that can be described by “Muller’s Morphs”, which deal with the
interrelationships of mutant and wild type alleles at a more detailed level.

1. TERMINOLOGY
A specific section of a chromosome is called a locus. Because each gene occupies a specific locus
along a chromosome, the terms locus and gene are often used interchangeably. However, the term
“gene” is a much more general term, while “locus” usually is limited to defining the position along
a chromosome. Each locus will have an allelic form (allele); that is, a specific DNA sequence. In a
population of individuals there will be sequence variation so there will be different alleles. Some
may be defined as wild type, some as variants, others as mutant.

The complete set of alleles at all loci in an individual is its genotype. Typically, when writing
out a genotype, only the alleles at the locus (or loci) of interest are considered and written down
– all the others are still present and assumed to be wild type. So, typically only the alleles at
the few mutant loci appear in the written genotype. All the many, many others that are wild type
are not.
The visible or detectable effect of alleles on the structure or function of that individual is
called its phenotype – what it looks like. The phenotype studied in any particular genetic
experiment may range from simple, visible traits such as hair color, to more complex phenotypes
including disease susceptibility or behavior. If two alleles are present in an individual, as is
the case with diploid organisms, then various interactions between them may influence their
expression in the phenotype.
CHAPTER 13 – ALLELES AT A SINGLE LOCUS
Figure 2.
Relationship between genotype and phenotype for an
allele that is completely dominant to another allele.
(Original-M. Deyholos -CC BY-NC 3.0)
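The genotype-to-phenotype relationship for complete dominance can be written as a small lookup. The Python sketch below is only an illustration: the allele symbols "A" and "a" and the function name phenotype are hypothetical placeholders, not symbols taken from Figure 2.

```python
# Genotype-to-phenotype mapping for one locus with complete dominance.
# "A" (dominant) and "a" (recessive) are placeholder allele symbols.

def phenotype(genotype, dominant="A", recessive="a"):
    """Return the phenotype class of a diploid genotype such as 'Aa'."""
    alleles = list(genotype)
    if len(alleles) != 2 or any(x not in (dominant, recessive) for x in alleles):
        raise ValueError("genotype must consist of exactly two known alleles")
    # With complete dominance, one copy of the dominant allele is enough.
    if dominant in alleles:
        return "dominant phenotype"
    return "recessive phenotype"


for g in ("AA", "Aa", "aa"):
    print(g, "->", phenotype(g))
```

Two of the three genotypes share a phenotype in this model, which is why the genotype of a diploid individual cannot always be read directly from its appearance.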
Figure 3.
Patch of brown eye colour in a green eye.
(Wikipedia-Sheila.lorquiana-CC BY-SA 3.0)
2. SOMATIC VS. GERMLINE MUTATIONS
A mutation occurs in the DNA of a single cell. In single-cell organisms, that mutation is passed on
directly to its descendants, typically through the process of mitosis. In multicellular animals,
there is a partitioning early in development into somatic cells, which form the body cells, and
germline cells, which form the gametes for the next generation. Mutations may be passed on to
somatic cells via mitosis and to gametes via meiosis. In plants, this somatic/germline separation
occurs later, in the cells that form the flower.

2.1. SOMATIC MUTATIONS
Somatic cells form the tissues of the organism and are not passed on as gametes. Any mutations in
somatic cells will only affect the individual in which they occur, not its progeny. If a mutation
occurs in a somatic cell, its mutant descendants will exist alongside other non-mutant (wild type)
cells. If the mutation occurs at a very early stage of development, the mutation will be present in
more cells. This gives rise to an individual composed of two or more types of cells that differ in
their genetic composition. Such an individual is said to be a mosaic. An example is shown in
Figure 3. Cancer cells are another example of mosaicism.
2.2. GERMLINE MUTATIONS
Germline cells are those that form the eggs or sperm cells (ovum or pollen in plants), and are
passed on to form the next generation. Therefore, mutations in germline cells will be passed on to
the next generation but won’t affect the individual in which they occur.
In animals, somatic cells are segregated from germ line cells. In plants, somatic cells become
germline cells; so somatic mutations can become germline mutations.

2.3. HAPLOID VS. DIPLOID ORGANISMS
Haploid organisms have only one copy of a gene, thus a mutation will directly affect the organism’s
phenotype. Therefore, the phenotype can be used to directly infer the genotype of the organism.
However, in diploid organisms, there are two copies of each gene. The phenotype depends upon an
interaction between the two alleles. Thus, a mutation may not have a direct impact on the
organism’s phenotype. The interaction of the two alleles can show complete dominance, incomplete
dominance, co-dominance, or recessiveness. Therefore, inferring the genotype of a diploid based
upon its phenotype is not as simple as in haploids.

3. ALLELES: HETERO-, HOMO-, HEMIZYGOSITY
Mendel’s First Law (segregation of alleles) is especially remarkable because he made his
observations and conclusions (1865) without knowing about the relationships between genes,
chromosomes, and DNA. We now know the reason why more than one allele of a gene can be present in
an individual: most eukaryotic organisms are diploid and have at least two sets of homologous
chromosomes. For organisms that are predominantly diploid, such as humans or Mendel’s peas,
chromosomes exist as pairs, with one copy
inherited from each parent. Diploid cells therefore can contain two different alleles of each gene, with
one allele part of each member of a pair of homologous chromosomes. If both alleles of a particular
gene are the same (indistinguishable), the individual is said to be homozygous at that gene or locus.
On the other hand, if the alleles are different (can be distinguished) from each other, the genotype is
heterozygous. In cases where there is only one copy of a gene present, for example if there is a
deletion of the locus on the homologous chromosome, we use the term hemizygous. Another example
is the single X chromosome in X/Y males, where almost all the loci on that chromosome are
hemizygous. (The exception is the pseudo-autosomal region – see the chapter on sex chromosomes.)
Although a single diploid individual can have at most two different alleles of a particular gene, many
more alleles can exist in a population of individuals. In a natural population the most common allelic
form is usually called the wildtype allele. However, in many populations there can be multiple variants
at the DNA sequence level that are visibly indistinguishable as all exhibit a normal, wild type
appearance. There can also be various mutant alleles (in wild populations and in lab strains) that vary
from wild type in their appearance, each with a different change at the DNA sequence level. The
many different mutations (alleles) at the same locus are called an allelic series for a locus.
4. PLEIOTROPY AND POLYGENIC INHERITANCE
There is usually not a one-to-one correspondence between a gene and a physical characteristic. Often
a gene is responsible for several phenotypic traits and it is said to be pleiotropic. For example,
mutations in the vestigial gene (vg) in Drosophila result in an easily visible short wing phenotype.
However, mutations in this gene also affect the number of egg strings, position of the bristles on the
scutellum, and lifespan in Drosophila. Therefore, the vg gene is said to be pleiotropic in that it affects
many different phenotypic characteristics.
The opposite is also found. Single characteristics can be affected by mutations in multiple, different
genes. This implies that many genes are needed to make each characteristic. For example, if we return
to the Drosophila wing, there are dozens of genes that when mutant alter the normal shape of the
wing, not just the vg locus. Thus there are many genes that are needed to make a normal wing; the
mutation of any one causes an abnormal, mutant, phenotype. This type of arrangement is called
polygenic inheritance.
5. COMPLETE DOMINANCE AND RECESSIVE
An example of a simple phenotype is flower color in Mendel’s peas. We have already said that one
allele as a homozygote produces purple flowers, while the other allele as a homozygote produces
white flowers (Figure 2). But what about a heterozygous individual that has one purple allele and one
white allele? What is the phenotype of a heterozygote?
This can only be determined by experimental observation. We know from observation that
individuals heterozygous for the purple and white alleles of the flower color gene have purple
flowers. Thus, the allele associated with purple color is said to be dominant to the allele that
produces the white color. The white allele, whose phenotype is masked by the purple allele in a
heterozygote, is recessive to the purple allele. The dominant/recessive character is a relationship
between two alleles and must be determined by observation of the heterozygote phenotype.
Sometimes, to represent this relationship, a dominant allele will be written as a capital letter (e.g. A)
while a recessive allele will be written in lower case (e.g. a). However, this is not the only system.
Many different systems of genetic symbols are in use. The most common are shown in Table 3.1. Also
note that genotypes (alleles) are usually written in italics and chromosomes and proteins are not. For
example, the white gene in Drosophila melanogaster on the X chromosome encodes a protein called
WHITE, which is a pigment precursor transmembrane transporter enzyme.
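The complete dominance relationship described for flower colour in Mendel's peas can be sketched in a few lines of code. This is only an illustration: the allele symbols P (purple, dominant) and p (white, recessive) and the function names are assumptions chosen for the example, following the capital/lower-case convention mentioned in the text.

```python
def zygosity(genotype):
    """Classify a two-allele genotype such as 'Pp' as homozygous or heterozygous."""
    first, second = genotype
    return "homozygous" if first == second else "heterozygous"


def flower_colour(genotype, dominant="P", recessive="p"):
    """Phenotype under complete dominance: one dominant allele is enough."""
    first, second = genotype
    for allele in (first, second):
        if allele not in (dominant, recessive):
            raise ValueError(f"unknown allele: {allele!r}")
    # The recessive (white) phenotype appears only when no dominant allele is present.
    return "purple" if dominant in (first, second) else "white"


for genotype in ("PP", "Pp", "pp"):
    print(genotype, zygosity(genotype), flower_colour(genotype))
```

Note that the rule inside `flower_colour` encodes an experimental observation (heterozygotes are purple); it cannot be derived from the two homozygous phenotypes alone.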
Another example of co-dominance is shown in the first figure of this chapter – flower colour in
Camellia sp.
8. BIOCHEMICAL BASIS OF DOMINANCE
Given that a heterozygote’s phenotype cannot simply be predicted from the phenotype of
homozygotes, what does the type of dominance tell us about the biochemical nature of the gene
product? How does dominance work at the biochemical level? There are several different biochemical
mechanisms that may make one allele dominant to another.
For the majority of genes studied, the normal (i.e. wild-type) alleles are haplo-sufficient. So in
diploids, even with a mutation that causes a complete loss of function in one allele, the other allele, a
wild-type allele, will provide sufficient normal biochemical activity to yield a wild type phenotype
and thus be dominant and dictate the heterozygote phenotype.
On the other hand, in some biochemical pathways, a single wild-type allele does not produce enough
protein. Such an allele is haplo-insufficient: it cannot provide enough biochemical activity to result in
a normal phenotype when heterozygous with a non-functioning mutant allele. In this case, the
non-functional mutant allele will be dominant (or semi-dominant) to a wild-type allele.
Mutant alleles may also encode products that have new and/or different biochemical activities instead
of, or in addition to, the normal ones. These novel activities could cause a new phenotype that would
be dominantly expressed.
9. MUTANT CLASSIFICATION
9.1. MORPHOLOGICAL
Morphological mutations cause changes in the visible form of the organism. An example could be a
change in size, shape, colour, number etc.
9.2. LETHAL
A lethal mutation causes the premature death of an organism. For example, in Drosophila lethal
mutations can result in death during the embryonic, larval, or pupal stage. Lethal mutations are
usually recessive, so both copies of a gene have to be lost for the premature death to occur
(homozygous lethal alleles will not be viable). Heterozygotes which have one lethal allele and one
wild type allele are typically viable.
9.3. BIOCHEMICAL
Auxotrophic mutants can be derived from prototrophic parents. This type of mutation blocks a step in
a biochemical pathway as discussed for the arg- mutants of Beadle and Tatum in the chapter on
biochemical pathways. Such biochemical mutations are a specific type of the conditional mutation
class (next).
9.4. CONDITIONAL
Conditional mutations rely on the concept of:
phenotype = genotype + environment + interaction.
Organisms with this kind of mutation express a mutant phenotype, but only under specific
environmental conditions. Under restrictive conditions, they express the mutant phenotype while
under permissive conditions, they show a wild type phenotype. One example of a conditional
mutation is the temperature-sensitive pigmentation of Siamese cats. Siamese cats have temperature
sensitive fur colour; their fur appears unpigmented (light coloured) when grown in a warm
environment. The hair appears pigmented (dark) when grown at a cooler temperature. This is seen at
the peripheral regions of the feet, snout, and ears (Figure 6). This is because at warm temperatures
the enzyme that is needed for melanin pigment synthesis becomes nonfunctional. However, at cooler
temperatures the enzyme needed for melanin synthesis is functional and the deposition of melanin
makes the fur look dark.
Figure 6.
Siamese cats have temperature sensitive pigmentation due to genetic mutation.
(Wikimedia-Telekokopelli-CC BY-SA 3.0)
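The haplo-sufficiency argument in section 8 can be illustrated with a simple dosage model. The sketch below assumes that the activities of the two alleles add together and that a wild type phenotype requires the total to reach some threshold. The additive model, the arbitrary activity units, and the function names are assumptions made for illustration, not measurements from any real gene.

```python
def heterozygote_phenotype(wild_type_activity, mutant_activity, threshold):
    """Phenotype of a heterozygote when allele activities simply add up."""
    total_activity = wild_type_activity + mutant_activity
    return "wild type" if total_activity >= threshold else "mutant"


def wild_type_allele_status(wild_type_activity, threshold):
    """Is one wild-type allele, paired with a null allele, enough on its own?"""
    if heterozygote_phenotype(wild_type_activity, 0.0, threshold) == "wild type":
        return "haplo-sufficient"   # the null allele behaves as recessive
    return "haplo-insufficient"     # the null allele behaves as dominant


# Hypothetical gene A: one wild-type copy supplies 1.0 units and 0.8 are needed.
print(wild_type_allele_status(1.0, threshold=0.8))
# Hypothetical gene B: one wild-type copy supplies 1.0 units but 1.5 are needed.
print(wild_type_allele_status(1.0, threshold=1.5))
```

In this model the same loss-of-function allele is recessive for gene A and dominant for gene B; the difference lies only in how much product the pathway needs.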
10. MULLER’S MORPHS
Exposure of an organism to a mutagen causes mutations in essentially random positions along the
chromosomes. Consequently, most of the mutant phenotypes recovered from a genetic screen are
caused by loss-of-function mutations. These alleles are due to random changes in the DNA sequence
that cause a gene to produce less or no active protein, compared to the wild-type allele.
Loss-of-function alleles tend to be recessive because the wildtype allele is haplo-sufficient. A
loss-of-function allele that produces no active protein is called an amorph, or null. On the other hand,
alleles with only a partial loss-of-function are called hypomorphic. More rarely, a mutant allele may
have a gain-of-function, producing either more of the active protein (hypermorph) or producing an
active protein with a new and different function (neomorph). Finally, antimorph alleles have an
activity that is dominant and opposite to the wild-type function; antimorphs are also known as
dominant negative mutations.
Thus, mutations (changes in a gene sequence) can result in mutant alleles that no longer produce the
same level or type of active product as the wild-type allele. Any mutant allele can be classified into
one of five types: (1) amorph, (2) hypomorph, (3) hypermorph, (4) neomorph, and (5) antimorph.
base pair changes cause the mature mRNA to incorrectly splice introns, therefore the translated
amino acid sequence would be altered and nonfunctional.
(4) Gene is present and a transcript is produced but no translation occurs – changes in the base pair
sequences would preclude the mRNA from binding to the ribosome for translation.
(5) Gene is present and a transcript is produced and translated but a nonfunctional protein product is
produced – the mutation alters a key amino acid in the polypeptide sequence producing a
completely non-functional polypeptide.
Genetic/phenotypic explanation - Amorphic mutations of most genes usually act as recessive to wild
type (case #1). However, with some genes the amorphic mutations are dominant to wild type
(case #2).
case #1: white gene in Drosophila
w+/w+ wildtype and red eyed
w+/w- wildtype and red eyed
w-/w- mutant and white eyed
case #2: Minute locus in Drosophila
M+/M+ wildtype and long bristled
one or more of the following, with the gene still being present:
(1) reduced transcription – changed DNA sequence in the promoter or enhancer/regulatory elements
can reduce the level of transcription.
(2) aberrant processing of the transcript – normal transcription but base pair changes cause the
mature mRNA to incorrectly splice introns, therefore the translated protein sequence would be
altered and function at a reduced level.
(3) reduced translation – changes in the base pair sequences would reduce the efficiency of the mRNA
binding to the ribosome for translation.
(4) reduced-function protein product – normal transcription, processing, and translation but the
mutation changes certain amino acids in the polypeptide sequence so its function is reduced.
Genetic/phenotypic explanation - Hypomorphic mutations of most genes usually act as recessive to
wild type, though hypomorphic mutations theoretically could be dominant to wildtype.
white-apricot (wa) allele in Drosophila
w+/w+ wildtype and red eyed
w+/wa wildtype and red eyed
wa/wa mutant and apricot eye colour
Both amorphs and hypomorphs tend to be recessive to wild type in diploids because the wild type
allele is usually able to supply sufficient product to produce a wild type phenotype (called
haplo-sufficient). If a single wild type allele is not able to supply enough product to produce a wild
type phenotype, then it is haplo-insufficient, and the mutant allele will be dominant to the wild type
allele. Here -/+ heterozygotes produce a mutant phenotype.
While the first two classes involve a loss-of-function, the next two involve a gain-of-function –
quantity or quality. Gain-of-function alleles are almost always dominant to the wild type allele.
10.3. HYPERMORPH
Hypermorphic alleles produce quantitatively more of the same, active product.
Molecular explanation - Changes in the DNA base pair sequence of the hypermorphic allele may cause
one or more of the following, with the gene still being present:
(1) increased transcription – changed DNA sequence in the promoter or enhancer/regulatory
elements that increase the level of transcription.
(2) increased translation – changes in the base pair sequences would increase the efficiency of the
mRNA binding to the ribosome for translation.
(3) increased function protein product – normal transcription, processing, translation but base pair
changes alter certain amino acids in the polypeptide sequence so its function is normal but
increased in amount.
Genetic/phenotypic explanation - Hypermorphic mutations of most genes usually act as dominant to
wild type since they are a gain of function. The classic hypermorph is a gene duplication.
10.4. NEOMORPH
Neomorphic alleles produce a product with a new, different function, something that the wild type
allele does not do.
Molecular explanation - Changes in the DNA base pair sequence of the neomorphic allele may cause
one or more of the following, with the gene still being present:
(1) new transcription – changed DNA sequence in the promoter or enhancer/regulatory elements that
makes new transcription either temporally or in a tissue-specific manner.
(2) new function protein product – normal transcription, processing, translation but base pair
changes alter certain amino acids in the polypeptide sequence so it acquires a new function
(activity) that is different from the normal function (e.g. additional substrate or new binding site).
10.5. ANTIMORPH
Antimorphic alleles are relatively rare, and have a
new activity that is dominant and opposite to the
wildtype function. These alleles usually interfere
with the function from the wild type allele. (They
often lose their normal function as well.) The new
function works against the normal expression of
the wild type allele. This can happen at the
transcriptional, translational, or later level of
expression. Thus, when an antimorphic allele is
heterozygous with wild type, the wild type allele
function is reduced or prevented. At the molecular
level, there are many ways this can happen. The
simplest model to explain an antimorphic effect is
that the protein acts as a dimer (or any multimer)
and the inclusion of a mutant subunit poisons the
whole complex, thereby preventing or reducing its
level of function. Antimorphs are also known as
dominant-negative mutations because they are
usually dominant and act negatively against the
wild type function.
Figure 7.
Five classes of mutants designated as morphs (forms) by a
Nobel prize winner, H.J. Muller, which are known as
Muller’s Morphs. (Original-Locke- CC BY-NC 3.0)
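The five classes can be summarized as a small decision procedure. The sketch below is a simplification for study purposes: it assumes an allele can be described by its level of normal activity relative to wild type (1.0 = wild-type level) plus two yes/no properties. The function name, parameters, and the exact cut-offs are assumptions introduced for this example and are not part of Muller's original definitions.

```python
def classify_morph(relative_activity, new_function=False, antagonizes_wild_type=False):
    """Assign a mutant allele to one of Muller's morphs (simplified sketch).

    relative_activity: normal-function activity relative to wild type (1.0).
    new_function: the product does something the wild-type product does not.
    antagonizes_wild_type: the product works against the wild-type product.
    """
    if relative_activity < 0:
        raise ValueError("relative_activity cannot be negative")
    if antagonizes_wild_type:
        return "antimorph"      # dominant negative
    if new_function:
        return "neomorph"       # gain of a qualitatively new function
    if relative_activity == 0:
        return "amorph"         # null: no active product
    if relative_activity < 1:
        return "hypomorph"      # partial loss of function
    if relative_activity > 1:
        return "hypermorph"     # more of the same active product
    return "wild type"


print(classify_morph(0))                                 # amorph
print(classify_morph(0.5))                               # hypomorph
print(classify_morph(2.0))                               # hypermorph
print(classify_morph(1.0, new_function=True))            # neomorph
print(classify_morph(0.0, antagonizes_wild_type=True))   # antimorph
```

Real alleles are classified from genetic tests (for example, comparing the phenotype of homozygotes, heterozygotes, and hemizygotes), not from a single activity number, so this function only restates the definitions.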
___________________________________________________________________________
SUMMARY:
• Symbols are used to denote the alleles, or genotype, of a locus.
• Phenotype depends on the alleles that are present, their dominance relationships, and sometimes also
interactions with the environment and other factors.
• A somatic mutation affects the individual but not the progeny, whereas a germline mutation affects
the progeny in the next generation but not the individual in which it occurs.
• In a diploid organism, alleles can be homozygous, heterozygous or hemizygous.
• Allelic interactions at a locus can be described as dominant vs. recessive, incomplete dominance, or co-
dominance.
• Muller's morphs classify all types of mutations including: amorph, hypomorph, hypermorph,
neomorph, and antimorph.
KEY TERMS:
homozygous co-dominance
heterozygous ABO blood group
hemizygous haplosufficiency
wild-type haploinsufficiency
variant loss-of-function
locus gain-of-function
genotype amorph
phenotype null
dominant hypomorph
recessive hypermorph
complete dominance neomorph
incomplete (semi) dominance
STUDY QUESTIONS:
1) Distinguish amongst the following terms: (1) gene, (2) locus, (3) allele, (4) transcription unit.
2) A flower geneticist crosses a red flowered diploid plant with a white flowered diploid plant and all
the progeny are red. Use two different forms of symbols to show this cross and its progeny. What
if all the progeny were pink?
3) If your blood type is B, what are the possible genotypes of your parents at the locus that controls
the ABO blood types?
4) In the table below, match the mouse hair color phenotypes with the term from the list that best
explains the observed phenotype, given the genotypes shown. In this case, the allele symbols do
not imply anything about the dominance relationships between the alleles.
List of terms:
haplo-sufficiency,
haplo-insufficiency,
pleiotropy,
incomplete dominance,
co-dominance,
incomplete penetrance,
broad (variable) expressivity.
5) In this hypothetical example of Drosophila bristle mutations, when various, true-breeding mutant
strains (all at a single locus) are crossed to a wild type strain the following phenotypes are
observed in the progeny:
Mutant#1 = bristles 20% shorter
Mutant#2 = bristles 30% longer
Mutant#3 = bristles 50% shorter
Mutant#4 = bristles kinked and misshapen
Mutant#5 = bristles are missing
What is the best characterization, using Muller’s Morphs, for each?
Table for Question 2
endoreduplication (See Chapter 2). Understanding the control of the cell cycle is an active area of
research, particularly because of the relationship between cell division and cancer.
2. MITOSIS
During the S-phase of interphase the chromosomes replicate so that each chromosome has two sister
chromatids attached at the centromere. After S-phase and G2, the cell enters Mitosis. The first step in
mitosis is prophase where the nucleus dissolves and the replicated chromosomes condense into the
visible structures we associate with chromosomes. Next is metaphase, where the microtubules attach
to the kinetochore and the chromosomes align along the middle of the dividing cell, known as the
metaphase plate. The kinetochore is the region on the chromosome where the microtubules attach. It
contains the centromere and proteins that help the microtubules bind. Then in anaphase, each of the
sister chromatids from each chromosome gets pulled towards opposite poles of the dividing cell.
Finally in telophase, identical sets of unreplicated chromosomes (single chromatids) are completely
separated from each other into the two daughter cells, and the nucleus re-forms around each of the
two sets of chromosomes. Following this is the partitioning of the cytoplasm (cytokinesis) to complete
the process and to make two identical daughter cells. Figure 1 and Figure 3 show real pictures and a
cartoon schematic of the process, respectively.
You should note that this is a dynamic and ongoing process, and cells don’t just jump from one stage
to the next. When looking at snapshots of real cells, you will more often see cells between two stages,
as is seen in some of the images in Figure 1.
An acronym to remember the main stages of mitosis is iPMAT, where the little i stands for
interphase, which will be described next.
In contrast, Meiosis, which may appear similar, is a very different process. Read through Chapter 16
and try to identify the similarities and differences between the two processes.
Figure 2.
Stages of the cell cycle. The outer ring identifies when a cell is in interphase (I) and when it is in
mitosis (M). The inner ring identifies the four major stages. Cells can enter G0 if they are not actively
undergoing cell division, and may re-enter the cell cycle at a later time.
(Wikimedia Commons - R. Wheeler - CC BY-SA 3.0)
[Figure 4 labels: 1N, 1C; 2N, 2C; 2N, 4C; fertilization; mitosis; gap (G1); synthesis (S); gap (G2)]
Figure 4.
Changes in DNA and chromosome content during the cell cycle and mitosis. For simplicity, nuclear membranes are not shown,
and all chromosomes are represented in a similar stage of condensation. (Original-M. Deyholos/L. Canham- CC BY-NC 3.0)
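The bookkeeping behind Figure 4 can be written out as a short calculation. The sketch below assumes the usual convention that chromosomes are counted by centromeres, so a replicated chromosome still counts as one chromosome but has two chromatids. The function name and the stage labels are assumptions made for this illustration.

```python
def counts_per_cell(diploid_number):
    """Chromosomes and chromatids per cell at the end of each stage.

    diploid_number is the 2n chromosome count of the organism.
    """
    unreplicated = {"chromosomes": diploid_number, "chromatids": diploid_number}
    replicated = {"chromosomes": diploid_number, "chromatids": 2 * diploid_number}
    return {
        "end of G1": dict(unreplicated),   # 2N, 2C: one chromatid per chromosome
        "end of S": dict(replicated),      # 2N, 4C: two sister chromatids each
        "end of G2": dict(replicated),     # still 2N, 4C
        "end of mitosis (each daughter cell)": dict(unreplicated),  # back to 2N, 2C
    }


# Example with the human diploid number, 2n = 46.
for stage, counts in counts_per_cell(46).items():
    print(stage, counts)
```

The chromosome count never changes through the cycle; only the chromatid count (and therefore the c-value) doubles in S phase and is halved again when sister chromatids separate.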
Table 1.
Measures of genome size in selected organisms. The DNA content (1C) is shown in millions of basepairs (Mb). For eukaryotes, the
chromosome number is the chromosomes counted in a gamete (1N) from each organism. The average gene density is the mean
number of non-coding bases (in bp) between genes in the genome.
___________________________________________________________________________
SUMMARY:
• The asexual transmission of genetic information is accomplished in a process called Mitosis.
• The process of mitosis can be divided into Prophase, Metaphase, Anaphase, and Telophase.
• Mitosis reduces the c-number, but not the n-number of the daughter cells.
• Not all the DNA in an organism codes for genes. In most higher eukaryotes most DNA is non-gene
coding and appears to have no specific function and is called “junk” DNA.
• The c-value paradox refers to the observation that the amount of DNA is not necessarily related to the
complexity of the organism.
KEY TERMS:
mitosis metaphase plate
interphase anaphase
G1 Phase telophase
S Phase unreplicated chromosome
G2 Phase cytokinesis
M Phase n-value
G0 Phase c-value
chromatids replicated chromosome
prophase nuclear genome
metaphase c-value paradox
microtubules
kinetochore
STUDY QUESTIONS:
1) Species A has n=4 chromosomes and Species B
has n=6 chromosomes. Can you tell from this
information which species has more DNA? Can
you tell which species has more genes?
2) The answer to question 1 implies that not all
DNA within a chromosome encodes genes. Can
you name any examples of chromosomal
regions that contain relatively few genes?
3)
a) How many centromeres does a typical
chromosome have?
b) What would happen if there was more than
one centromere per chromosome?
c) What if a chromosome had no
centromeres?
4) For a diploid organism with 2n=16
chromosomes, how many chromosomes and
chromatids are present per cell at the end of:
a) G1,
b) S,
c) G2,
d) mitosis?
5) Refer to Table 1.
a) What is the relationship between DNA
content of a genome, number of genes,
gene density, and chromosome number?
b) What feature of genomes explains the c-
value paradox?
c) Do any of the numbers in this Table show a
correlation with organismal complexity?
INTRODUCTION
Humans, like all other species, store their genetic information in cells as large DNA molecules called
chromosomes. Within each nucleus are 23 pairs of chromosomes, half from the mother and half from
the father. In addition, our mitochondria have their own smaller chromosome that encodes some of
the proteins found in this organelle.
This chapter will provide information on human chromosomes that will be referred to in various
other chapters, lectures, and in the lab.
and clarity. Figure 2 shows a more magnified view of a pair of chromosomes. On average a condensed
human metaphase chromosome is 5 µm long and each chromatid is 700 nm wide. In contrast, a
decondensed interphase chromosome is 2 mm long and only 30 nm wide, yet still fits into a single
nucleus.
Figure 2.
A pair of metacentric human chromosome #1.
(Wikipedia- National Human Genome Research Institute-PD)
cell will become 4c again (replication) before dividing themselves to become 2c each. From this
point forward, every cell in the embryo will be 2c = 6000 Mb before its S phase and 4c = 12 000 Mb
afterwards. The same is true for the cells of fetuses, children, and adults. Because the cells used to
prepare this chromosome spread were adult cells in metaphase each is 4c = 12 000 Mb. Note, there are
some rare exceptions, such as some stages of meiocytes that make germ cells and other rare situations
like the polyploidy of terminally differentiated liver cells. In summary:
Human cell | DNA content
gamete (egg or sperm) | 1c = 3000 Mb
regular cell before S phase | 2c = 6000 Mb
regular cell after S phase | 4c = 12 000 Mb
replication. The n-value does not change while the c-value does.
Figure 3.
Karyogram of a normal human male.
(Wikipedia-National Human Genome Research Institute – PD)
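The DNA content summary for human cells is simple multiplication, shown here as a tiny sketch. The only number taken from the text is 1c = 3000 Mb; the constant and function names are assumptions for this example.

```python
HAPLOID_GENOME_MB = 3000  # 1c for humans, as given in the text


def dna_content_mb(c_value):
    """DNA content in Mb for a human cell with the given c-value (1, 2 or 4)."""
    return c_value * HAPLOID_GENOME_MB


cells = {
    "gamete (egg or sperm)": 1,
    "regular cell before S phase": 2,
    "regular cell after S phase": 4,
}
for name, c_value in cells.items():
    print(f"{name}: {c_value}c = {dna_content_mb(c_value)} Mb")
```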
and so on. The karyogram above shows two copies of each of the autosomes. A karyogram from a
normal female would also show these 22 pairs. There are also the sex-chromosomes, X and Y (see
below). Normal females have two X-chromosomes, while normal males have an X and a Y each. They
act as a homologous pair, similar to the autosomes. During meiosis only one of each autosome pair
and one of the sex-chromosomes makes it into the gamete. This is how 2n = 46 adults can produce
1n = 23 eggs or sperm.
In addition to their length, Cytogeneticists can distinguish chromosomes using their centromere
position and banding pattern. Note that at the resolution in Figure 3 both chromosome 1s look
identical, even though at the base pair level there are small and often significant differences in the
sequence that correspond to allelic differences between these homologous chromosomes.
Remember that in each karyogram there are maternal chromosomes, those inherited from the mother,
and paternal chromosomes, those from the father. For example, everyone has one maternal
chromosome 1 and one paternal chromosome 1. In a typical karyogram it usually is not possible to
tell which is which. In some cases, however, there are visible differences between homologous
chromosomes that do permit the distinction to be made.
2.3. RELATIONSHIPS BETWEEN CHROMOSOMES AND CHROMATIDS
To summarize what we have covered so far, karyograms depict replicated chromosomes (because the
cells had passed S phase in the cell cycle) and two copies of each chromosome (because the cells were
diploid). So how do we refer to all the pieces of DNA present? Figure 4 summarizes the terms used.
2.4. HUMAN SEX CHROMOSOMES
Figure 3 shows that most of our chromosomes are present in two copies. Each copy has the same
length, centromere location, and banding pattern. As mentioned before, these are called autosomes.
However, note that two of the chromosomes, the X and the Y, do not look alike. These are sex
chromosomes. In mammals, males have one of each while females have two X chromosomes.
Figure 4.
The relationships between chromosomes and
chromatids.
(Original Deyholos- CC BY-NC 3.0)
Term | Definition | Example
homologous chromosomes | the maternal and paternal copies of a chromosome | maternal chromosome 1 and paternal chromosome 1
non-homologous chromosomes | two different chromosomes within the same cell/organism | a chromosome 1 and a chromosome 8
sister chromatids | the identical chromatids within a single replicated chromosome | the two chromatids within maternal chromosome 1
non-sister chromatids | the similar but not identical chromatids from homologous chromosomes | a chromatid in maternal chromosome 1 and a chromatid in paternal chromosome 1
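The terms for chromosomes and chromatids can be checked against each other with a small model. The sketch below labels each chromatid by its chromosome, its parental origin, and which of the two sisters it is. The `Chromatid` class, its field names, and the `relationship` function are hypothetical names introduced for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chromatid:
    chromosome: str   # e.g. "1", "8", "X"
    origin: str       # "maternal" or "paternal"
    copy: int         # 1 or 2: which sister within the replicated chromosome


def relationship(a, b):
    """Name the relationship between two chromatids in a replicated diploid cell."""
    if a.chromosome != b.chromosome:
        return "chromatids of non-homologous chromosomes"
    if a.origin != b.origin:
        return "non-sister chromatids"   # similar but not identical
    if a.copy != b.copy:
        return "sister chromatids"       # identical copies made in S phase
    return "same chromatid"


mat1_a = Chromatid("1", "maternal", 1)
mat1_b = Chromatid("1", "maternal", 2)
pat1_a = Chromatid("1", "paternal", 1)
mat8_a = Chromatid("8", "maternal", 1)

print(relationship(mat1_a, mat1_b))   # sister chromatids
print(relationship(mat1_a, pat1_a))   # non-sister chromatids
print(relationship(mat1_a, mat8_a))   # chromatids of non-homologous chromosomes
```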
Figure 6.
Parts of a typical human nuclear chromosome (not to
scale). The ori's and genes are distributed everywhere
along the chromosome, except for the telomeres and
centromere.
(Original-Harrington- CC BY-NC 3.0)
Figure 5.
Meiosis in an XY mammal. The stages shown are anaphase I, anaphase II, and mature sperm. Note how
half of the sperm contain Y chromosomes and half contain X chromosomes.
(Original - Harrington - CC BY-NC 3.0)
3.1. THOUSANDS OF GENES
In the previous sections we mentioned human chromosome 1, but what exactly is it? Well, each
chromosome is a long molecule of double stranded DNA that carries genetic information (genes).
Chromosome 1, being our largest chromosome, has the most genes, about 4778 in total. Many of these
genes are transcribed into mRNAs, which encode proteins. Other genes are transcribed into tRNAs,
rRNA, and other non-coding RNA molecules (see Chapter 07).
3.2. ONE CENTROMERE
A centromere ("middle part") is a place where proteins attach to the chromosome as required during
the cell cycle. Cohesin proteins hold the sister chromatids together beginning in S phase. Kinetochore
proteins form attachment points for microtubules during mitosis. The metaphase chromosomes shown
in Figure 3 have both Cohesin and Kinetochore proteins at their centromeres. There are no genes
within the centromere region DNA; rather it is composed of a simple repeated DNA sequence.
All human chromosomes have a centromere, but not necessarily in the middle of the chromosome. If
it is in the centre of the chromosome it is called a metacentric chromosome. If it is offset a bit it is
submetacentric, and if it is towards one end the chromosome is acrocentric. In humans an example of
each is chromosome 1, 5, and 21, respectively. Humans do not have any telocentric chromosomes,
those with the centromere at one end, but mice and some other mammals do.
3.3. TWO TELOMERES
The ends of a chromosome are called telomeres ("end parts"). DNA replication is unusual here: part
of it is done with a dedicated DNA polymerase known as telomerase. Chapter 2 on DNA replication
goes into more detail. As with the centromere region there are no genes in the telomeres, just simple,
repeated DNA sequences.
3.4. THOUSANDS OF ORIGINS OF REPLICATION
At the beginning of S phase DNA polymerases begin the process of chromosome replication. The sites
where this begins are called origins of replication (ori's). They are found distributed along the
chromosome, about 40 kb apart. S phase begins at each ori as two replication forks leave travelling in
opposite directions. Replication continues and replication forks travelling from one ori will collide
with forks travelling towards it from the neighboring ori. When all the forks meet, DNA replication
will be complete.
4. APPEARANCE OF A TYPICAL NUCLEAR CHROMOSOME DURING THE CELL CYCLE
If we follow a typical chromosome in a typical human cell it alternates between unreplicated and
replicated states and between relatively uncondensed and condensed. The replication is easy to
explain: if a cell has made the commitment to divide, it first needs to replicate its DNA. This occurs
during S phase. Before S phase, chromosomes consist of a single piece of double-stranded DNA and
after they consist of two identical double-stranded DNAs.
The condensation is a more complex story because eukaryotic DNA is always wrapped around some
proteins. Figure 7 shows the different levels commonly found in cells. During interphase, a
chromosome exists mostly as a 30 nm fibre. This allows it to fit inside the nucleus and still have the
DNA be accessible for enzymes performing RNA synthesis, DNA replication, and DNA repair. At the
start of mitosis these processes halt and the chromosome becomes even more condensed. This is
necessary so that the chromosomes are compact enough to move to the opposite ends within the cell.
When mitosis is complete the chromosome returns to its 30 nm fibre structure. Recall that each of our
cells has a maternal and a paternal chromosome 1. Figure 8 shows what these chromosomes look like
during the cell cycle
Figure 7.
Successive stages of
chromosome
condensation depend on
the introduction of
additional proteins.
(Wikipedia-R. Wheeler- CC
BY-SA 3.0)
contain genes that are actively being transcribed. Heterochromatin is more densely compacted and tends not to be transcribed; the genes are inactive. Heterochromatin sequences also include short, highly-repetitive sequences called satellite DNA, which acquired their name because their buoyant density, as determined by ultracentrifugation, is distinctly different from the main band of DNA.
6. PARTS AND APPEARANCE OF A MITOCHONDRIAL CHROMOSOME
While most of our genome is located in the nucleus, there is also DNA in the mitochondria. The human mtDNA is small, only 16.6 kb, and circular, although it is double-stranded like most DNA molecules. It has only 37 genes; 13 of these make mitochondrial proteins and the rest encode tRNAs and rRNAs.
Each mtDNA has a single origin of replication. During DNA replication two replication forks leave the ori and halt when they bump into each other on the opposite side of the circle. DNA replication inside the mitochondria happens throughout interphase, not once during S phase as with the nuclear chromosomes. The consequence is that each mitochondrion has between 2 and 10 identical copies of its chromosome (Figure 9).

chromosomes (Figure 10), like most bacteria that exist today. Mitochondria typically have circular chromosomes that behave more like bacterial chromosomes than eukaryotic chromosomes (i.e. mitochondrial genomes do not undergo mitosis or meiosis). Also, the mitochondrial chromosome is not associated with histones or other proteins that compact it. It also lacks a centromere because mitochondrial replication is simpler than nuclear chromosome replication. Mitochondria just grow larger and split in two, like the cells of their prokaryote origin. Because there are multiple mtDNA copies that are randomly distributed in the matrix, both new mitochondria will end up inheriting some mtDNAs. And lastly, because the mtDNA is circular there are no ends and thus no telomeres.
In summary:
Feature                  Nuclear chromosomes          Mitochondrial chromosome
DNA                      linear double-stranded DNA   circular double-stranded DNA
genes                    thousands                    37
centromeres              1                            0
telomeres                2                            0
origins of replication   thousands                    1
Mitosis/Meiosis          Yes                          No
Figure 9.
The relationship between cells, mitochondria, and
mitochondrial DNA. (Original-Harrington- CC BY-NC 3.0)
one found in their mother (and her mother). Technically speaking we have only one MT-CO1 allele, it will be identical on all of the mtDNA molecules in all of the mitochondria in all of the cells.
In summary:
___________________________________________________________________________
SUMMARY:
• The c-value is the amount of DNA in a gamete. Humans are 1c = 3000 Mb.
• The n-value is the number of chromosomes in a gamete. Humans are 1n = 23.
• A typical cell in your body is 2c = 6000 Mb and 2n = 46 before DNA replication and 4c = 12 000 Mb and
2n = 46 after.
• A picture of metaphase chromosomes can be organized into a karyogram figure and described with a
karyotype statement.
• Humans have two copies of each autosomal chromosome. Females have two X chromosomes while
males have one X and one Y chromosome.
• A typical nuclear chromosome has thousands of genes, one centromere, two telomeres, and thousands
of origins of replication.
• A typical nuclear chromosome is replicated during S phase and consists of two chromatids up until the
start of anaphase. It is condensed during prophase and remains condensed until the start of telophase.
During metaphase a chromosome is both replicated and condensed for these reasons.
• The human mitochondrial chromosome has 37 genes, a single origin of replication, and neither
centromeres nor telomeres.
• Humans have ~29 000 genes, most of which are on autosomal chromosomes.
• A typical human cell has two copies of each autosomal gene and one of each mitochondrial gene.
Genes on sex chromosomes are different: females have two of each X-chromosomal gene while males
have one; males have Y-chromosomal genes while females do not.
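The c-value and n-value arithmetic in the summary above can be checked with a short calculation. This is only a minimal sketch: the function name nuclear_content and its dictionary keys are illustrative choices, not terms from the text, and the numbers are the ones quoted in the summary (1c = 3000 Mb, 1n = 23).

```python
# Values quoted in the summary for humans.
C_VALUE_MB = 3000   # DNA in one gamete (1c), in megabases
N_VALUE = 23        # chromosomes in one gamete (1n)


def nuclear_content(replicated: bool) -> dict:
    """DNA content and chromosome count of a typical diploid human cell.

    Hypothetical helper for illustration only.
    """
    # Replication doubles the amount of DNA (2c -> 4c) ...
    c_multiple = 4 if replicated else 2
    # ... but the two sister chromatids still count as one chromosome (2n).
    n_multiple = 2
    return {
        "dna_mb": c_multiple * C_VALUE_MB,
        "chromosomes": n_multiple * N_VALUE,
        "label": f"{c_multiple}c, {n_multiple}n",
    }


before_s = nuclear_content(replicated=False)
after_s = nuclear_content(replicated=True)
print(before_s)
print(after_s)
```

The point of keeping the two multipliers separate is that DNA content and chromosome number change at different times: S phase changes c but not n.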
KEY TERMS:
replicated chromosome submetacentric
condensed chromosome acrocentric
cytogeneticist telocentric
c-value telomere
n-value origin of replication
karyogram 30 nm fibre
autosome histones
maternal chromosome nucleosome
paternal chromosome histone H1
sex chromosome fibre
homologous chromosome scaffold proteins
non-homologous chromosome chromatin
sister chromatids euchromatin
non-sister chromatids heterochromatin
karyotype satellite DNA
gene mtDNA
centromere endosymbiont theory
metacentric
QUESTIONS:
1) Cytogeneticists use white blood cells to obtain metaphase chromosomes for karyotyping.
a) Why don't they use red blood cells?
b) Why don't they use white blood cells in anaphase?
2) The human Y chromosome is smaller than the X chromosome. Does this mean that males have less DNA than females?
3) Are these statements true or false? For the false statements explain why.
a) Everyone has a paternal chromosome 1.
b) Everyone has a maternal chromosome 1.
c) Everyone has a paternal X chromosome.
d) Everyone has a maternal X chromosome.
e) Everyone has a paternal Y chromosome.
f) Everyone has a maternal Y chromosome.
g) Everyone has a paternal mitochondrial chromosome.
h) Everyone has a maternal mitochondrial chromosome.
4) Explain why centromeres do not have to be in the centre of a chromosome to function.
5) Why do nuclear chromosomes have to have multiple origins of replication?
6) Define chromatin. What is the difference between DNA, chromatin and chromosomes?
7) Have a look at Figure 8. Which of these chromosomes would be associated with:
a) Histone proteins (see Figure 7)
b) Condensin proteins (important scaffold proteins)
c) Cohesin proteins (proteins which hold sister chromatids together)
d) Kinetochore proteins (proteins which connect centromere DNA to microtubules)
8) Where would you find these enzymes in a typical human cell?
a) DNA polymerases
b) RNA polymerases
c) Ribosomes
9) Could the following genes continue to perform their normal developmental function if they were moved next to the LCT gene on Chromosome 2?
a) F8
b) SRY
c) MT-CO1
MENDEL’S FIRST LAW – CHAPTER 16
Figure 1.
Pea plants were used by Gregor Mendel to
discover fundamental laws of genetics.
His first law, the segregation of alleles, is
covered in this chapter. His second law,
independent assortment, is covered in the
next chapter. (Wikimedia commons-B.
Ebbesen-CC BY-SA 3.0)
INTRODUCTION
The once prevalent (but now discredited) concept of blended inheritance proposed that some undefined essence, in its entirety, contained all of the heritable information for an individual. It was thought that mating combined the essences from each parent, much like the mixing of two colors of paint. Once blended together, the individual characteristics of the parents could not be separated again.
However, Gregor Mendel (Figure 2) was one of the first to take a quantitative, scientific approach to the study of heredity. He started with well-characterized strains, repeated his experiments many times, and kept careful records of his observations. Working with peas, Mendel showed that white-flowered plants could be produced by crossing two purple-flowered plants, but only if the purple-flowered plants themselves had at least one white-flowered parent (Figure 3). This was evidence that a discrete genetic factor that produced white flowers had not blended irreversibly with the factor for purple flowers. Mendel’s observations disproved blending inheritance and favored an alternative concept, called particulate inheritance, in which heredity is the product of discrete factors that control independent traits.
Through careful study of patterns of inheritance, Mendel recognized that a single trait could exist in different versions, or alleles, even within an individual plant or animal. For example, he found two allelic forms of a gene for seed color: one allele gave green seeds, and the other gave yellow seeds. Mendel also observed that although different alleles could influence a single trait, they remained indivisible and could be inherited separately. This is the basis of Mendel’s First Law, also called The Law of Equal Segregation, which states: during gamete formation, the two alleles at a gene locus segregate from each other; each gamete has an equal probability of containing either allele.
Figure 2.
Gregor Johann Mendel (1822-1884), an Augustinian Friar, who lived in Moravia (now part of the Czech Republic), published his work in 1866 on what has become known as the laws of Mendelian Inheritance.
(Wikipedia-Hugo Iltis- CC BY 4.0)
1. OVERVIEW
Mendel first made his discoveries of inheritance in the 1850s. In his 1866 publication he didn’t use the word “gene” as the fundamental unit of heredity because it wasn’t coined until 1909 by Danish botanist Wilhelm Johannsen. In 1910 Thomas Hunt Morgan proposed that genes resided on chromosomes and occupied distinct regions on those chromosomes. DNA as a substance was discovered in the 1860s, but it took until the 1940s to realize that DNA was the molecule that contained the genetic information. Then in the 1950s Watson and Crick discovered the structure of DNA.
Nevertheless, Mendel made his discoveries without any of this information. Today we have overwhelming knowledge from research allowing us to understand the molecular mechanism behind Mendel’s laws. To explain Mendel’s First Law, segregation, we will take a closer look at the concept of meiosis.
Figure 5.
Meiosis in Arabidopsis (n=5).
Panels A-C show different
stages of prophase I, each
with an increasing degree of
chromosome condensation.
Subsequent phases are
shown: metaphase I (D),
telophase I (E), metaphase II
(F), anaphase II (G), and
telophase II (H). (PLoS
Genetics-Chelysheva, L. et al
(2008) PLoS Genetics- CC BY
4.0)
2. MEIOSIS I
Meiosis I is called a reductional division, because it
reduces the number of chromosomes inherited in
each of the daughter cells – the parent cell is 2N
while the two daughter cells are each 1N. Meiosis I
is further divided into Prophase I, Metaphase I,
Anaphase I, and Telophase I, which are roughly
similar to the corresponding stages of mitosis,
except that in Prophase I and Metaphase I,
homologous chromosomes pair up with each
other, or synapse, and are called bivalents (Figure
7), in contrast with mitosis where the
chromosomes line up individually during
metaphase. This is an important difference
between mitosis and meiosis, because it affects the
segregation of alleles, and also allows for
recombination to occur through crossing-over,
which will be described later. During Anaphase I,
one member of each pair of homologous
chromosomes migrates to each daughter cell (1N)
(Figure 6).
In meiosis I, replicated homologous chromosomes pair up, or synapse, during prophase I, line up in the middle of the cell during metaphase I, and separate during anaphase I. For this to happen the homologous chromosomes need to be brought together while they condense during prophase I. During synapsis, proteins bind to both homologous chromosomes along their entire length and form the synaptonemal complex (synapse means junction). These proteins hold the chromosomes in the transient structure of a bivalent (Figure 7). The proteins are released when the cell enters anaphase I.
Figure 6.
Changes in DNA and chromosome content during the cell cycle. For simplicity, nuclear membranes are not shown, and all chromosomes are represented in a similar stage of condensation. (Original-Deyholos- CC BY-NC 3.0)
meiosis I is called a reductional cell division. Meiosis II resembles mitosis, with one sister chromatid from each chromosome separating to produce two daughter cells. Because Meiosis II, like mitosis, results in the segregation of sister chromatids, Meiosis II is called an equational division (Figure 6).
If after telophase I the cells went into a state of interkinesis, then during prophase II the haploid chromosomes will condense and the nuclear membrane will dissolve again. If interkinesis did not happen, then the cell will continue directly with meiosis II (Figure 4). Prophase II ends, as in mitosis, with the microtubules beginning to form. As metaphase II starts, the pairs of sister chromatids align themselves along the metaphase plate, with the two sister chromatids of each pair attached to microtubules from opposite poles. Anaphase II splits the sister chromatids and the microtubules pull them to the opposite poles. Telophase II reforms the nuclear membrane around the chromosomes, ending finally with cytokinesis and producing four cells with only one unreplicated chromosome of each type. There will be allelic differences among gametes based upon segregation of heterozygous alleles (note the differences in colours of chromosomes in each of the gametes in Figure 4).
3.1. GAMETE MATURATION
In animals and plants the cells produced by meiosis need to mature before they become functional gametes. In male animals the four products of meiosis are called spermatids. They grow structures, such as tails, and become functional sperm cells. In female animals the gametes are eggs. For each egg to contain the maximum amount of nutrients, typically only one of the four products of meiosis becomes an egg. The other three cells end up as tiny disposable cells called polar bodies. In plants the products of meiosis reproduce a few times using mitosis as they develop into functional male or female gametes.
4. CROSSING OVER (INTRA-CHROMOSOMAL RECOMBINATION)
During prophase I the homologous chromosomes pair together and form a synaptonemal complex. Crossing over occurs within the synaptonemal complex. A crossover is a place where DNA repair enzymes break the DNA of two non-sister chromatids in similar locations and then covalently reattach the broken ends to the other chromatid, creating an exchange between the non-sister chromatids. This reorganization of chromatids will persist for the remainder of meiosis and result in recombination of alleles in the gametes. Crossover events can be seen as chiasmata on the synapsed chromosomes in late Meiosis I.
Crossovers function to hold homologous chromosomes together during meiosis I so they orient correctly and segregate successfully. Crossing over also reshuffles the allele combinations along a chromosome, resulting in genetic diversity that can be selected in a population over time (evolution).
5. ONE LOCUS ON A CHROMOSOME - SEGREGATION - MONOHYBRID
Not only did Mendel solve the mystery of inheritance as units (genes), he also invented several testing and analysis techniques still used today. Classical genetics is the science of examining biological questions using controlled matings of model organisms. It began with Mendel in 1865 but did not attain widespread usage until Mendel’s work was rediscovered around 1900 by four researchers (E. von Tschermak, H. de Vries, C. Correns, and W. J. Spillman). Then Thomas Morgan began working with fruit flies in 1908 and built on this work. Later, starting with Watson and Crick’s structure of DNA in 1953, classical genetics was joined by molecular genetics, the science of solving biological problems using DNA, RNA, and proteins. The genetics of DNA cloning began in 1970 with the discovery of restriction enzymes and plasmids as cloning vectors.
Knowing what we now know about the process of meiosis, we can better understand the mechanisms underlying Mendel’s First Law. The Law of Segregation states that every individual contains a pair of alleles for each gene, which segregate during the formation of gametes, and so for every gene pair each parent passes on a random allele to
its offspring. The series of experiments that led to the formulation of Mendel's first law were based on monohybrid crosses, which will be described below.
5.1. TERMINOLOGY
A specific position, region, or segment along a chromosome is called a locus. Each gene occupies a specific locus (so the terms locus and gene are often used interchangeably). Each locus will have an allelic form (allele). The complete set of alleles (at all loci of interest) in an individual is its genotype. Typically, when writing out a genotype, only the alleles at the locus (loci) of interest are considered; all the others are present and assumed to be wild type but are normally not written in the genotype. The observable or detectable effect of these alleles on the structure or function of that individual is called its phenotype. The phenotype studied in any particular genetic experiment may range from simple, visible traits such as hair color, to more complex phenotypes including disease susceptibility or behavior. If two alleles are present in an individual, then various interactions between them may influence their expression in the phenotype.
5.2. TRUE BREEDING LINES
Geneticists make use of true-breeding lines just as Mendel did (Figure 8a). These are in-bred populations of plants or animals in which all parents and their offspring (over many generations) have the same phenotypes with respect to a particular trait. True-breeding lines are useful because they are typically assumed to be homozygous for the alleles that affect the trait of interest. When two individuals that are homozygous for the same alleles are crossed, all of their offspring will also be homozygous. The continuation of such crosses constitutes a true breeding line or strain. A large variety of different strains, each with a different, true breeding character, can be collected and maintained for genetic research.
5.3. MONOHYBRID CROSSES
A monohybrid cross is one in which both parents are heterozygous (or a hybrid) for a single (mono) trait. The trait might be petal colour in pea plants (Figure 8b). Recall from Figure 3 that the generations in a cross are named P (parental), F1 (first filial), F2 (second filial), and so on.
By using monohybrid crosses, Mendel discovered that genes were discrete units that separated in the creation of offspring. Previous ideas of blending inheritance would mean that a cross between a white flower and a purple flower would create a ‘blended’ phenotype. Instead, what Mendel saw in the hybrids was one distinct parental colour, and when the hybrids were crossed they produced the purple and white seen in the parents in specific ratios. These traits were not blended when the true-breeding lines were crossed, but instead those parental alleles were carried on through the offspring. Through the monohybrid cross he was able to discern the dominant and recessive alleles of each gene he studied in the pea plants. In further crosses (F3, F4, etc.), these traits were continuously transmitted and not lost, though they may be hidden as seen in the F1 generation.
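The specific ratios produced by a monohybrid cross can be worked out by listing every pairing of one gamete from each parent. The sketch below does this for the petal colour example; the allele symbols P (purple, dominant) and p (white, recessive) and the function names are assumptions made for illustration and do not come from the text.

```python
from collections import Counter
from itertools import product


def gametes(genotype: str) -> list:
    """Each gamete receives one of the two alleles, e.g. "Pp" -> ["P", "p"]."""
    return list(genotype)


def monohybrid_cross(parent1: str, parent2: str) -> Counter:
    """Count offspring genotypes over all equally likely gamete pairings."""
    offspring = Counter()
    for allele1, allele2 in product(gametes(parent1), gametes(parent2)):
        # Sort so that "pP" and "Pp" are counted as the same genotype
        # (uppercase letters sort before lowercase ones).
        offspring["".join(sorted((allele1, allele2)))] += 1
    return offspring


def phenotype_counts(genotype_counts: Counter) -> Counter:
    """Assume complete dominance: one uppercase allele gives the dominant colour."""
    counts = Counter()
    for genotype, n in genotype_counts.items():
        colour = "purple" if any(a.isupper() for a in genotype) else "white"
        counts[colour] += n
    return counts


f2_genotypes = monohybrid_cross("Pp", "Pp")   # cross of two heterozygotes
f2_phenotypes = phenotype_counts(f2_genotypes)
print(dict(f2_genotypes))
print(dict(f2_phenotypes))
```

Counting pairings rather than hard-coding the ratio makes the link to segregation explicit: the ratio follows from each parent contributing either allele with equal probability.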
Figure 8.
(a) A true-breeding line (b) A monohybrid cross produced by mating two different pure-breeding lines.
(Original-Deyholos-CC BY-NC 3.0)
6. PUNNETT SQUARES - 3:1 RATIO
The specific ratios seen in the monohybrid cross can be described using a Punnett square, named after R.C. Punnett who devised this approach. Given the genotypes of any two parents, we can predict all of the possible genotypes of the offspring. Furthermore, if we also know the
___________________________________________________________________________
SUMMARY:
• Mendel demonstrated that heredity involved discrete, heritable factors that affected specific traits.
• A gene can be defined operationally as a unit of inheritance.
• Homologous chromosomes contain the same series of genes along their length, but not necessarily the
same alleles. Sister chromatids initially contain the same alleles.
• Homologous chromosomes pair (synapse) with each other during meiosis, but not mitosis.
• A diploid organism can have up to two different alleles at a single locus. The alleles segregate equally
between gametes during meiosis.
• Phenotype depends on the alleles that are present, their dominance relationships, and sometimes also
interactions with the environment and other factors.
• Classical geneticists make use of true breeding lines, monohybrid crosses, Punnett squares, test
crosses, and reciprocal crosses.
KEY TERMS:
blending inheritance chiasma / chiasmata
Gregor Mendel diakinesis
particulate inheritance metaphase I
alleles anaphase I
Mendel’s First Law telophase I
The Law of Equal Segregation interkinesis
dominant prophase II
recessive metaphase II
meiosis I anaphase II
meiosis II telophase II
gametes polar bodies
meiocytes classical genetics
reductional molecular genetics
synapse DNA cloning
bivalent monohybrid cross
equational locus
pair up genotype
synaptonemal complex phenotype
leptotene true-breeding lines
zygotene Punnett square
pachytene test cross
crossing over tester
diplotene
STUDY QUESTIONS:
1) How would the results of the cross in Figure 3
have been different if heredity worked through
blending inheritance rather than particulate
inheritance?
2) A simple mnemonic for leptotene, zygotene,
pachytene, diplotene, & diakinesis is Lame
Zebras Pee Down Drains. Make another one
yourself.
3) What is the maximum number of alleles at a
given autosomal locus in a normal gamete from
a diploid individual? In the whole population of
a species?
4) Wiry hair (W) is dominant to smooth hair (w)
in dogs.
a) If you cross a homozygous, wiry-haired
dog with a smooth-haired dog, what will be
the genotype and phenotype of the F1
generation?
b) If two dogs from the F1 generation mated,
what would be the most likely ratio of hair
phenotypes among their progeny?
c) When two wiry-haired Ww dogs actually
mated, they had a litter of three puppies,
which all had smooth hair. How do you
explain this observation?
d) Someone left a wiry-haired dog on your
doorstep. Without extracting DNA, what
would be the easiest way to determine the
genotype of this dog?
e) Based on the information provided in
question 1, can you tell which, if either, of
the alleles is wild-type?
5) An important part of Mendel’s experiments was
the use of homozygous lines as parents for his
crosses. How did he know they were
homozygous, and why was the use of the lines
important?
6) Does equal segregation of alleles into daughter
cells happen during mitosis, meiosis, or both?
INTRODUCTION
The principles of genetic analysis that we have described for a single locus in Chapter 16 will be extended to the study of alleles at two loci in this Chapter. The analysis of two loci in the same cross provides information for genetic mapping (Chapter 18) and testing gene interactions (Chapter 26). These techniques are very useful for both basic and applied research. Before discussing these techniques, we will first revisit Mendel’s classical experiments.
Before Mendel’s 1866 publication, blended inheritance was the accepted model to explain the transmission of traits. It was Mendel’s work that established that heritable traits were controlled by discrete factors, which we now call alleles, in a particulate inheritance model. At the time it was an important question whether heritable traits, controlled by discrete factors, were inherited independently of each other. To answer this, Mendel took two apparently unrelated traits, such as seed shape and seed color, and studied their inheritance together in one individual. For example, he studied two variants of each trait: seed color was either green or yellow, and seed shape was either round or wrinkled. (He studied seven traits in all, each on a different chromosome.) When either of these traits was studied individually, the phenotypes segregated in the classical 3:1 ratio among the progeny of a monohybrid cross (Figure 2), with ¾ of the seeds yellow and ¼ green in one cross, and ¾ round and ¼ wrinkled in the other cross. Would this be true when both hybrids were in the same individual?
Figure 2.
Monohybrid crosses involving two distinct traits in peas. a) is R/r and b) is Y/y.
Monohybrid crosses are covered in more detail in Chapter 16.
(Original-Deyholos-CC BY-NC 3.0)
As in the previous chapter, we will first walk through how a dihybrid cross works at the DNA level, and then we will explain the results that
When dealing with alleles at two different loci, we have to use nomenclature that makes the arrangement clear. There are three possible arrangements: both loci are on the same chromosome (AB/ab), on different chromosomes (A/a; B/b), or the arrangement is unknown (AaBb).
1. TWO LOCI ON DIFFERENT CHROMOSOMES
The separation of gametes through the process of meiosis has already been introduced. But what does that mean when you are taking multiple different genes (or loci) into account?
Remember the main stages of meiosis. The homologous pairs align during metaphase I, and complete one round of cell division. Then during metaphase II in those two cells the replicated chromosomes align individually and the sister chromatids separate, so when complete you have four daughter cells. Let’s say one chromosome has gene A on it, and another chromosome has gene B on it, and the individual is heterozygous at each gene (i.e. has the genotype A/a ; B/b). There are a variety of ways that the homologous pairs can align themselves during metaphase I. The orientation of that alignment will affect the alleles each gamete receives at the end of telophase II (Figure 3).
Because the alignment at metaphase I is always random, you will see a random, equal distribution of alleles in all the gametes produced. This means that one allele doesn’t affect the distribution of another allele, or in other words, each allele assorts independently (Independent Assortment).
Figure 3.
Independent assortment as seen on two different chromosomes. Gene A is found on the short chromosome and Gene B is found on the long chromosome, and the individual is heterozygous at both genes for the dominant (A and B) and recessive (a and b) alleles. The orientation in which the chromosomes align during metaphase I affects the alleles found in the 4 gametes produced after telophase II.
These are just two of many orientations the chromosomes can arrange themselves in at metaphase I. The full stages of meiosis were removed for simplicity; refer to chapter 16 to understand the divisions that lead to the 4 gametes seen in telophase II.
(Original-L. Canham-CC BY-NC 3.0)
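The effect of metaphase I orientation on the gametes of an A/a ; B/b individual can be sketched by enumerating the two possible alignments of the long chromosome pair relative to the short one. This is a simplified illustration that ignores crossing over; the function name and the "flipped" flag are hypothetical.

```python
from collections import Counter


def gametes_for_orientation(short_pair, long_pair, flipped):
    """Return the four gametes from one meiosis, given the metaphase I alignment.

    short_pair and long_pair are (allele, allele) tuples for the homologs
    carrying gene A and gene B. If flipped is True, the long pair faces the
    poles the other way round relative to the short pair.
    """
    a_pole1, a_pole2 = short_pair
    b_pole1, b_pole2 = (long_pair[1], long_pair[0]) if flipped else long_pair
    # Meiosis I separates the homologs; meiosis II then separates identical
    # sister chromatids, so each meiosis I product gives two identical gametes.
    return [a_pole1 + b_pole1] * 2 + [a_pole2 + b_pole2] * 2


# Both orientations are equally likely, so pool one meiosis of each kind.
tally = Counter()
for flipped in (False, True):
    tally.update(gametes_for_orientation(("A", "a"), ("B", "b"), flipped))

print(dict(tally))
```

Pooling the two equally likely orientations gives the four gamete types in equal numbers, which is what independent assortment predicts.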
2. TWO LOCI ON ONE CHROMOSOME
Based on the description in the last section, it would be expected that if the genes were on the same chromosome the alleles would travel together through meiosis (Figure 4 top). However, when tested this is not always the case. The recombination of alleles can be explained through the phenomenon of crossing over, which occurs during prophase I as described in chapter 16.
Crossing over is an exchange between non-sister chromatids that can occur at any position along the entire chromosome. If the two loci that are being considered are sufficiently separated from each other on the chromosome, crossover events can occur between the two loci.
This, coupled with the random orientation in which the chromosomes align during metaphase I, will allow the other combination of alleles in the gametes (Figure 4 bottom).
While not shown in Figure 4, if the two loci are very far apart, multiple crossover events can also take place, further increasing the shuffling of alleles.
loci. Ultimately, this will result in similar allele combinations to those observed in independent assortment shown above, even if they are on the same chromosome.
If the loci are very close together on the same chromosome, fewer crossovers are likely to occur between them. We will not discuss this situation here, but will do so later in chapter 18.
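How a single crossover between two loci on the same chromosome changes the gametes can be sketched as follows. The example assumes an AD/ad individual, models exactly one crossover between the loci, and uses the labels "maternal" and "paternal" only to tell the two homologs apart; the function name is hypothetical.

```python
def meiosis_products(crossover_between_loci: bool) -> list:
    """Return the four chromatids that end up in the gametes of an AD/ad parent."""
    # After replication each homolog consists of two identical sister
    # chromatids, written here as (allele at locus A, allele at locus D).
    maternal = [("A", "D"), ("A", "D")]
    paternal = [("a", "d"), ("a", "d")]
    if crossover_between_loci:
        # A crossover involves one chromatid from each homolog (non-sisters).
        # They exchange everything beyond the break point, here the D locus.
        m, p = maternal[1], paternal[0]
        maternal[1] = (m[0], p[1])
        paternal[0] = (p[0], m[1])
    return ["".join(chromatid) for chromatid in maternal + paternal]


without = meiosis_products(crossover_between_loci=False)
with_crossover = meiosis_products(crossover_between_loci=True)
print(without)
print(with_crossover)
```

Only two of the four chromatids take part in the exchange, so a meiosis with one crossover still yields two parental gametes alongside the two recombinant ones.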
Figure 4.
Independent assortment as seen on the same chromosome. On the top is an example of what would happen if crossovers do not occur. The dominant alleles of gene A and gene D would travel together, not leading to independent assortment. Crossovers do occur in most situations though, as in the bottom half of the figure. If a crossover occurs between the two genes, then the alleles will transfer to the other non-sister chromatid, thus rearranging alleles. This allows for independent assortment, despite being on the same chromosome.
This is just one of the many arrangements or crossover events that could occur during meiosis, with every meiocyte arranging its chromosomes differently with different crossovers.
(Original-L. Canham-CC BY-NC 3.0)
3. A DIHYBRID CROSS SHOWING MENDEL’S SECOND LAW (INDEPENDENT ASSORTMENT)
Mendel found that each locus had two alleles that segregated from each other during the creation of gametes. He wondered whether dealing with multiple traits at a time would affect this segregation, so he created a dihybrid cross. The distribution of offspring from his experiments led him to formulate Mendel’s Second Law, the Law of Independent Assortment, which states that the segregation of alleles at one locus will not influence the segregation of alleles at another locus during gamete formation; the alleles segregate independently. Next, we will discuss how he came to this understanding, given that independent assortment occurs.
3.1. MENDEL’S SECOND LAW
To analyze the simultaneous segregation of two traits in the same individual, he crossed a pure-breeding line of green, wrinkled peas with a pure-breeding line of yellow, round peas. This produced F1 progeny that had all yellow and round peas. They were called dihybrids because they carried two alleles at each of the two loci (Figure 5).
From Figure 2 we know that yellow and round are dominant, and green and wrinkled are recessive. If the inheritance of seed color was truly independent of seed shape, then when the F1 dihybrids were crossed to each other, a 3:1 ratio of one trait should be observed within each phenotypic class of the other trait (Figure 5). Using the product law, we would therefore predict
Table 1.
Phenotypic classes expected in monohybrid and dihybrid crosses for two seed traits in pea.
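The product rule behind the dihybrid classes can be written out as a short calculation. This is a sketch under the chapter's stated assumptions (round and yellow are dominant, each trait alone segregates 3:1, and the two loci assort independently); the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Expected phenotype frequencies for each trait in a monohybrid cross (3:1).
shape = {"round": Fraction(3, 4), "wrinkled": Fraction(1, 4)}
color = {"yellow": Fraction(3, 4), "green": Fraction(1, 4)}

# Product rule: if the traits are inherited independently, the probability of
# a combined phenotype is the product of the individual probabilities.
expected = {
    (s, c): p_shape * p_color
    for (s, p_shape), (c, p_color) in product(shape.items(), color.items())
}

for phenotype, probability in expected.items():
    print(phenotype, probability)

# Express the classes as parts out of 16, the number of Punnett square cells.
ratio = [int(probability * 16) for probability in expected.values()]
print(ratio)
```

Using exact fractions rather than floating point keeps the sixteenths visible, which makes the correspondence with the 16 cells of the Punnett square easy to see.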
To calculate the expected phenotypic ratios, we assign a phenotype to each of the 16 genotypes in the Punnett Square, based on our knowledge of the alleles and their dominance relationships.
In the case of Mendel’s seeds, any genotype with at least one R allele and one Y allele will be round and yellow; these genotypes are shown in the nine green-shaded cells in Figure 6. We can represent all four of the different genotypes shown in these cells with the notation (R_Y_), where the blank line (_) means “any allele”. The three cells whose genotypes have at least one R allele and are homozygous recessive for y (i.e. R_yy) will have a round, green phenotype. Conversely, the three cells that are homozygous recessive for r, but have at least one Y allele (rrY_), will have wrinkled, yellow seeds. Finally, the rarest phenotypic class of wrinkled, green seeds is produced by the doubly homozygous recessive genotype, rryy, which is expected to occur in only one of the sixteen possible offspring represented in the square.
3.2. ASSUMPTIONS OF THE 9:3:3:1 RATIO
Both the product rule and the Punnett Square approaches showed that a 9:3:3:1 phenotypic ratio is expected among the progeny of a dihybrid cross such as Mendel’s RrYy × RrYy. In making these calculations, we assumed that:
(1) alleles at each locus segregate independently of the alleles at the other;
(2) one allele at each locus is completely dominant (the other recessive); and
(3) each of four possible phenotypes can be distinguished unambiguously, with no interactions between the two genes that would interfere with determining the genotype correctly.
For simplicity, most student examples involve easily scored phenotypes, such as pigmentation or other changes in visible structures. However, keep in mind that the analysis of segregation ratios of any two marker loci can provide insight into their relative positions on chromosomes.
3.3. DEVIATIONS FROM THE 9:3:3:1 PHENOTYPIC RATIO
There can be deviations from the 9:3:3:1 phenotypic ratio. These situations may indicate that one or more of the above conditions has not been met. Modified ratios in the progeny of a dihybrid cross can therefore reveal useful information about the genes involved. One such example is linkage.
Linkage is one of the most important reasons for distortion of the ratios expected from independent assortment. Two loci show linkage if they are located close together on the same chromosome. This close proximity alters the frequency of allele combinations in the gametes. We will return to the concept of linkage in Chapter 18. Deviations from 9:3:3:1 ratios can also be due to interactions between genes, such as epistasis, duplicate gene action and complementary gene action. These interactions are discussed in Chapter 26.
4. THE DIHYBRID TEST CROSS
While the cross of an F1 x F1 gives a ratio of 9:3:3:1, there is a better, easier cross to test for independent assortment: the dihybrid test cross. In a dihybrid test cross, independent assortment is seen as a ratio of 1:1:1:1, which is easier to score than the 9:3:3:1 ratio. This test cross will also be easier to use when testing for linkage (Chapter 18).
As in monohybrid crosses (Chapter 16), you can do test crosses with dihybrids to determine the genotype of an individual with dominant phenotypes, to see if they are heterozygous or homozygous dominant. This type of cross is set up in the same fashion: an individual with an unknown genotype at two loci is crossed to an individual that is homozygous recessive for both loci.
Punnett squares should be done ahead of the crosses, so you know what to expect for any of the possible outcomes. Using the example from the rest of this chapter, you cross a double homozygous recessive pea plant (r/r ; y/y, green and wrinkled) to an unknown individual that has two dominant phenotypes (R/_ ; Y/_, yellow and round). There are four possible genotypes the unknown individual could be: R/R ; Y/Y or R/R ; Y/y or R/r ; Y/Y or R/r ; Y/y. The Punnett squares for the first two are listed below (Figure 7). Notice on the left you only get the dominant phenotype for both, so you know both genes in the unknown are
Figure 8.
Blank Punnett squares to fill in the other two possibilities of the test cross.
___________________________________________________________________________
SUMMARY:
• The alleles of loci on different chromosomes are inherited independently of each other.
• The expected phenotypic ratio of a dihybrid cross is 9:3:3:1.
• The 9:3:3:1 ratio can be modified if the alleles at a locus do not show a simple dominant/recessive relationship,
if there are gene interactions, or if the two loci are linked.
• A test cross gives a ratio of 1:1:1:1 for loci that assort independently.
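As a check on the 9:3:3:1 ratio summarized above, the 16 cells of the dihybrid Punnett square can be enumerated directly. The sketch below is illustrative only; the function names and the genotype string format are assumptions, and it assumes complete dominance and independent assortment.

```python
from collections import Counter
from itertools import product

def gametes(genotype):
    """Return the four equally likely gametes of a two-locus genotype such as 'RrYy'."""
    return [a + b for a, b in product(genotype[:2], genotype[2:])]

def phenotype_class(gamete_1, gamete_2):
    """Return the phenotypic class (e.g. 'R_Y_' or 'rryy') of the offspring of two gametes."""
    label = ""
    for a, b in zip(gamete_1, gamete_2):
        dominant = a.upper()
        # One dominant allele is enough to show the dominant phenotype.
        label += dominant + "_" if dominant in (a, b) else a + b
    return label

def dihybrid_ratio(parent="RrYy"):
    """Count phenotypic classes over all 16 cells of the parent x parent Punnett square."""
    cells = product(gametes(parent), repeat=2)
    return dict(Counter(phenotype_class(g1, g2) for g1, g2 in cells))

print(dihybrid_ratio())
```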
KEY TERMS:
blended inheritance dihybrid cross
heritable traits Mendel’s Second Law
particulate inheritance Law of Independent Assortment
Independent Assortment (IA) 9:3:3:1
crossing over Linkage
dihybrid
STUDY QUESTIONS:
1) Figure 7 shows Punnett squares for two of the
four possible test crosses. Fill in the Punnett
squares in Figure 8 for the other two possible
genotypes of the unknown that aren’t shown.
2) Based on meiosis, when dealing with two loci,
there will always be four distinct gamete types.
But if the organism is homozygous, like the
tester, all those gametes will look the same. In
this situation, when writing a Punnett square, is
it necessary to write out the four similar
gametes? How would you re-draw the Punnett
Square on the right in Figure 7?
3) If two loci assort independently, then the
AABB x aabb cross will result in dihybrid
progeny, which when crossed together will give
ratios of 9:3:3:1 in the F2, assuming “A” and “B”
are dominant to “a” and “b”, respectively.
Now, assume that loci “A” and “B” are
somewhat linked and thus will NOT assort
independently. That is the “AB” and “ab”
combinations are more likely. How will this
affect (change) the 9:3:3:1 ratio?
4) Do the same first cross as in Question 3 but make
the second cross a test cross (x aabb), with
expectation of a 1:1:1:1 ratio. How would the
ratio be changed if the two loci were not
assorting independently but are somewhat
linked?
INTRODUCTION
As we learned in Chapter 17, Mendel reported that the pairs of loci he observed segregated independently of each other; for example, the segregation of seed color alleles was independent from the segregation of alleles for seed shape. This observation was the basis for his Second Law (Independent Assortment), and contributed greatly to our understanding of heredity as single units. However, further research showed that Mendel’s Second Law did not apply to every pair of genes that could be studied. In fact, we now know that alleles of loci that are located close together on the same chromosome tend to be inherited together. This phenomenon is called linkage, and is a major exception to Mendel’s Second Law of Independent Assortment. Researchers use linkage to determine the location of genes along chromosomes in a process called genetic mapping. The concept of gene linkage is important to the natural processes of heredity and evolution, as well as to our genetic manipulation of crops and livestock.
1. GENETIC NOMENCLATURE & SYMBOLS
Nomenclature and symbols have been covered in previous chapters. This will be a brief review to revisit these topics.
A gene is a hereditary unit that occupies a specific position (locus) within the genome or chromosome and has one or more specific effects upon the phenotype of the organism and can mutate into various forms (alleles) (A Dictionary of Genetics 3rd Ed., King & Stansfield, 1985). A genotype is the specific allelic composition of a cell or organism. Normally only the genes under consideration are listed in a genotype and the alleles at the remaining gene loci are considered to be wild type. A phenotype is the detectable outward manifestation of a specific genotype. In describing a phenotype usually only the characteristics under consideration are listed while the remaining characters are assumed to be wild type (normal).
1.1. GENE NAMES AND SYMBOLS
Usually, gene names are unique and their corresponding symbols are unique letters or combinations of letters. So, for example, the "vermilion" gene in Drosophila is represented by the letter "v", while "vg" is the symbol for the "vestigial" gene and "vvl" is the symbol for the "ventral veins lacking" gene locus. Note however that the same letter symbols may represent a different gene in another organism. Gene symbols and gene names are typically shown in italicized text. In lectures we may not always use italics for gene names and symbols.
The normal, or wild type, form of a gene is usually symbolized by a superscript plus sign, "+". E.g. "a+", "b+", etc., or it is sometimes abbreviated to just "+". A forward slash is occasionally used to indicate that the two symbols are alleles of the same gene, but on homologous chromosomes.
A typical mutant form of the gene, of which there can be many, can be symbolized by a superscript minus sign, "-". E.g. "a-", "b-", etc., or sometimes abbreviated to just "a", "b", etc. (no superscript). Therefore, if the genotype of a diploid organism is given as a+/a-, it means there is a wild type allele and a mutant allele of the "a" gene at the "a" locus. This may also be abbreviated to +/a.
In some species of diploids, the dominant allele is typically designated with the upper case letter(s), while the recessive allele is given the lower case letter(s). For example, in Mendel’s peas the dominant round allele is “R”, while the recessive wrinkled allele is “r”.
2. RECOMBINATION
An understanding of meiosis, including the separation of chromosomes and crossing over, is necessary for this chapter. Refer to Chapters 16 and 17 for a review of these concepts.
The term “recombination” is used in several different contexts in genetics. In reference to heredity, recombination is defined as a process that results in gametes with combinations of alleles that were not present in the gametes from the parental generation (Figure 3). Recombination is important because it contributes to the genetic variation that may be observed between individuals within a population and that may be acted upon by selection for evolution.
Figure 2.
A diagram illustrating how chromosomes, loci and alleles in a cell, and how we depict them as text. Diagram labels: cell nucleus, homologous chromosomes, gene locus, different alleles (A and a), written as A/a.
(Original-J.Locke- CC BY-NC 3.0)
Figure 3.
When two loci are on non-homologous chromosomes, their alleles will segregate in combinations identical to those present in the parental gametes (Ab, aB), and in recombinant genotypes (AB, ab) that are different from the parental gametes.
(Original-Deyholos- CC BY-NC 3.0)
2.1. INTER- AND INTRACHROMOSOMAL RECOMBINATION
Interchromosomal recombination occurs through independent assortment of alleles whose loci are on different chromosomes (Chapter 17). Intrachromosomal recombination occurs through crossovers between loci on the same chromosome. It is important to remember that in both of these cases, recombination is a process that occurs during meiosis (mitotic recombination may also occur in some species, but it is relatively rare).
As an example of interchromosomal recombination, consider loci on two different chromosomes as shown in Figure 3. We know that if these loci are on different chromosomes there is no physical connection between them, so they are unlinked and will segregate independently as did Mendel’s traits. The segregation depends on the relative orientation of each pair of chromosomes at metaphase. Since the orientation is random and independent of other chromosomes, each of the arrangements (and their meiotic products) is equally possible for two unlinked loci as shown in Figure 3.
Intrachromosomal recombination occurs through crossovers. Crossovers occur during prophase I of meiosis, when pairs of homologous chromosomes have aligned with each other in a process called synapsis. Crossing over begins with the breakage of DNA of a pair of non-sister chromatids. The breaks occur at corresponding positions on two non-sister chromatids, and then the ends of non-sister chromatids are connected to each other resulting in a reciprocal exchange of double-stranded DNA. Generally, every pair of chromosomes has at least one crossover during meiosis, but often multiple crossovers occur in each chromatid during prophase I. Further details and figures of crossovers are shown in Chapters 16 and 17.
Because interchromosomal recombination occurs through independent assortment, genes in this situation are always unlinked. Intrachromosomal recombination has instances of linked genes, and so they will be the focus of this chapter.
2.2. INHERITING PARENTAL AND RECOMBINANT GAMETES
If we consider only two loci and meiosis results in recombination between them, then the meiotic products (gametes) are said to have a recombinant genotype. On the other hand, if no recombination occurs between the two loci during meiosis, then the products retain their original combinations and are said to have a non-recombinant, or parental genotype. The ability to properly identify parental and recombinant gametes is essential to apply recombination to experimental examples.
To properly identify recombinant and parental gametes from an individual, you need to know the genotype of its parents (the P generation). This is most easily demonstrated in a dihybrid. If, for two genes, one parent has the genotype A/A B/B, they can only produce one type of gamete: AB. Similarly, if they are a/a b/b, then they can also only produce one type of gamete: ab (Figure 4 right). However, if those two gametes (AB and ab) combine, they create an individual (F1) that has a genotype written as A/a B/b. It can be easier to keep track of the parental combinations of gametes by keeping them together when writing the genotype, for this example AB/ab (Figure 4).
So the above dihybrid individual can produce four different gametes: AB, ab, Ab and aB. The parental gametes are those that the individual obtained from their parents, in this case AB and ab. Ab and aB are recombinant gametes and are evidence of a recombination event happening, resulting in a different combination of alleles (Figure 4 right).
For the above example, the P generation has one parent homozygous for both dominant alleles, and the other homozygous for both recessive alleles. It is very important to note that this will not always be the case. In some instances, one parent will be homozygous with one gene dominant and the other gene recessive (A/A b/b) and the other parent will be the opposite (a/a B/B). This situation will change which gametes are parental and which are recombinant (compare left and right in Figure 4).
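The rule for telling parental from recombinant gametes can be written as a short Python sketch. The function name and the two-letter gamete strings are assumptions for illustration; the only input is the pair of gametes that fused to form the dihybrid.

```python
from itertools import product

def classify_gametes(parental_1, parental_2):
    """Label each gamete a dihybrid can make as 'parental' or 'recombinant'.

    parental_1 and parental_2 are the gametes the dihybrid received from
    its two parents, e.g. 'AB' and 'ab'.
    """
    parentals = {parental_1, parental_2}
    # Alleles available at each locus, e.g. [('A', 'a'), ('B', 'b')]
    loci = list(zip(parental_1, parental_2))
    labels = {}
    for combo in product(*loci):
        gamete = "".join(combo)
        labels[gamete] = "parental" if gamete in parentals else "recombinant"
    return labels

# P generation A/A B/B x a/a b/b: the dihybrid received AB and ab
print(classify_gametes("AB", "ab"))
# P generation A/A b/b x a/a B/B: the dihybrid received Ab and aB
print(classify_gametes("Ab", "aB"))
```

The same four gametes are produced in both cases; only the labels change, because the labels depend on which combinations entered the dihybrid from its parents.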
Figure 4.
The genotype of gametes can be inferred unambiguously if the gametes are produced by homozygotes. However, recombination
frequencies can only be measured among the progeny of heterozygotes (i.e. dihybrids). Note that the dihybrid on the left
contains a different configuration of alleles than the dihybrid on the right due to differences in the genotypes of their respective
parents. Therefore, different gametes are defined as recombinant (red) and parental (blue) among the progeny of the two
dihybrids. In the cross at left, the recombinant gametes will be genotype AB and ab, and in the cross on the right, the
recombinant gametes will be Ab and aB.
(Original-Deyholos-CC BY-NC 3.0)
Figure 6.
If two loci are completely linked, their alleles will segregate in combinations identical to those present in the parental gametes (Ab, aB). No recombinants will be observed.
(Original-Deyholos-CC BY-NC 3.0)
Figure 7.
A crossover between two linked loci can generate recombinant genotypes (AB, ab), from the chromatids involved in the crossover. Remember that multiple, independent meioses occur in each organism, so this particular pattern of recombination will not be observed among all the meioses from this individual.
(Original-Deyholos-CC BY-NC 3.0)
3.3. PARTIAL LINKAGE
It is also possible to obtain recombination frequencies between 0% and 50%, which is a situation we call incomplete (or partial) linkage. Incomplete linkage occurs when two loci are located on the same chromosome but the loci are far enough apart so that crossovers occur between them during some, but not all, meioses (Figure 7). Genes that are on the same chromosome are said to be syntenic regardless of whether they are completely or incompletely linked or unlinked. Thus, all linked genes are syntenic, but not all syntenic genes are linked.
Because the location of crossovers is essentially random for any given base pair of the chromosome, the greater the distance between two loci, the more likely a crossover will occur between them. Furthermore, loci that are on the same chromosome, but are sufficiently separated from each other, will on average have multiple crossovers between them and they will behave indistinguishably from physically unlinked loci. A recombination frequency of 50% is therefore the maximum recombination frequency that can be observed, and is indicative of loci that are either on separate chromosomes, or are sufficiently separated on the same chromosome.
4. EXPERIMENTALLY DETERMINING RECOMBINATION FREQUENCY
Let us now consider a complete experiment in which our objective is to measure recombination frequency (Figure 8). We need at least two alleles for each of two genes, and we must know which combinations of alleles were present in the parental gametes. The simplest way to do this is to start with pure-breeding lines that have contrasting alleles at two loci. For example, we could cross short-tailed (aa), brown mice (BB) with long-tailed (AA), white mice (bb). Thus, (aaBB) are short-tailed and brown, while (AAbb) are long-tailed and white (Figure 8 P cross). Based on the genotypes of the parents, we know that the parental gametes will be aB or Ab (but not ab or AB), and all of the progeny will be dihybrids, AaBb. We do not know at this point whether the two loci are on different chromosomes, or whether they are on the same chromosome, and if so, how close together they are.
The recombination events that may be detected will occur during meiosis in the dihybrid individual. If the loci are completely or partially linked, then prior to meiosis, alleles aB will be located on one chromosome, and alleles Ab will be on the other chromosome. These are the parental gametes based on our knowledge of the genotypes of the gametes that produced the dihybrid. Thus, recombinant gametes produced by the dihybrid will have the genotypes ab or AB.
We can then infer unambiguously the genotype of the gametes produced by the dihybrid individual, and therefore calculate the recombination frequency between these two loci. For example, if only two phenotypic classes were observed in the F2 (i.e. short tails and brown fur (aaBb), and white fur with long tails (Aabb)) we would know that the only gametes produced following meiosis of the dihybrid individual were of the parental type: aB and Ab, and the recombination frequency would therefore be 0%. Alternatively, we may observe multiple classes of phenotypes in the F2 in ratios such as shown in Table 2. Given the data in Table 2, the calculation of recombination frequency is straightforward:
$$RF = \frac{\text{number of recombinant offspring}}{\text{total offspring}}$$
Figure 8.
An experiment to measure recombination frequency between two loci. The loci affect coat color (B/b) and tail length (A/a).
(Wikipedia-Modified Deyholos-CC BY-NC 3.0)
Figure 9.
Punnett Square of example test cross. Homozygous recessive tester can only produce one gamete type so only one is listed. Phenotypes are listed below. Using the phenotypes and what we know of the parents, we can identify which phenotypes came from recombinant or parental gametes.
(Original-L. Canham- CC BY-NC 3.0)
Gamete from dihybrid     AB            Ab            aB             ab
Tester gamete ab         Aa Bb         Aa bb         aa Bb          aa bb
Phenotype                Long, Brown   Long, White   Short, Brown   Short, White
Recombinant or parental  R             P             P              R
Table 2.
An example of quantitative data that may be observed in a genetic mapping experiment involving two loci. The data correspond
to the F2 generation in the cross shown in Figure 8.
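The recombination frequency formula can be applied as in the following Python sketch. The progeny counts here are hypothetical numbers chosen for illustration (they are not the values of Table 2), and the function name is an assumption; the parental gametes aB and Ab are those of the mouse cross described in the text.

```python
def recombination_frequency(counts, parental_gametes):
    """RF = (number of recombinant offspring) / (total offspring).

    counts maps each gamete produced by the dihybrid (inferred from the
    test-cross progeny phenotypes) to the number of progeny observed.
    """
    total = sum(counts.values())
    recombinant = sum(n for gamete, n in counts.items()
                      if gamete not in parental_gametes)
    return recombinant / total

# Hypothetical test-cross counts, for illustration only
hypothetical_counts = {"AB": 10, "Ab": 40, "aB": 40, "ab": 10}

# Parents were aaBB and AAbb, so the parental gametes are aB and Ab
rf = recombination_frequency(hypothetical_counts, {"aB", "Ab"})
print(f"RF = {rf:.0%}")
```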
___________________________________________________________________________
SUMMARY:
• Recombination is defined as any process that results in gametes with combinations of alleles that were
not present in the gametes of a previous generation.
• The recombination frequency between any two loci depends on their relative chromosomal locations.
• Unlinked loci show a maximum 50% recombination frequency.
• Loci that are close together on a chromosome are linked and tend to segregate with the same
combinations of alleles that were present in their parents.
• Crossovers are a normal part of most meioses, and allow for recombination between linked loci.
• Measuring recombination frequency is easiest when starting with pure-breeding lines with two alleles
for each locus, and with suitable lines for test crossing.
KEY TERMS:
linkage unlinked
Second Law of Independent Assortment synapsis
gene recombinant genotype (and gametes)
locus parental genotype (and gametes)
allele coupling (cis) configuration
genotype repulsion (trans) configuration
phenotype recombination frequency (RF)
recombination complete (absolute) linkage
interchromosomal recombination incomplete (partial) linkage
independent assortment syntenic
intrachromosomal recombination
crossover
STUDY QUESTIONS:
1) Compare the terms “recombination” and “crossover”. How are they similar? How are they different?
2) Explain why it is usually necessary to start with pure-breeding lines when measuring genetic linkage by the methods presented in this chapter.
3) Suppose you knew that in a population, a trait (allele at a locus) that dominantly affected earlobe shape was tightly linked to a trait that dominantly affected susceptibility to cardiovascular disease in humans. Under what circumstances would this information be clinically useful?
4) In a previous chapter, we said a 9:3:3:1 phenotypic ratio was expected among the progeny of a dihybrid cross, in absence of gene interaction.
a) What does this ratio assume about the linkage between the two loci in the dihybrid cross?
b) What ratio would be expected if the loci were completely linked? Be sure to consider every possible configuration of alleles in the dihybrids.
5) Given a dihybrid with the genotype CcEe:
a) If the alleles are in coupling (cis) configuration, what will be the genotypes of the parental and recombinant progeny from a test cross?
b) If the alleles are in repulsion (trans) configuration, what will be the genotypes of the parental and recombinant progeny from a test cross?
6) In this question the white flowers (w) are recessive to purple flowers (W), and yellow seeds (y) are recessive to green seeds (Y). Suppose a green-seeded, purple-flowered dihybrid is testcrossed, and half of the progeny have yellow seeds.
a) What can you conclude about linkage between these loci?
b) What do you need to know about the progeny in this case?
7) If the progeny of the cross aaBB x AAbb is testcrossed, and the following genotypes are observed among the progeny of the testcross, what is the frequency of recombination between these loci?
AaBb 135
Aabb 430
aaBb 390
aabb 120
8) What is meant by the sentence “All linked genes are syntenic, but not all syntenic genes are linked.”?
INTRODUCTION
In previous chapters the relative location of two loci has been examined. We have used the frequency of recombinants vs parentals to determine the recombination frequency (RF). Two loci either showed independent assortment (unlinked, RF~50%) or were linked (RF<~35%). If linked, the two loci must be located on the same chromosome (syntenic), but if unlinked they could be far apart on the same chromosome or on different chromosomes (non-syntenic). In this chapter we will learn how to construct genetic maps using 3-point crosses.
1. GENETIC MAPPING
A genetic map (or recombination map) is a representation of the linear order of genes (or loci), and their relative distances determined by crossover frequency, along a chromosome. The fact that such linear maps can be constructed supports the concept of genes being arranged in a fixed, linear order along a single duplex of DNA for each chromosome. We can use recombination frequencies to produce genetic maps of all the loci along each chromosome and ultimately in the whole genome.
1.1. CALCULATING MAP DISTANCE
The units of genetic distance are called map units (mu) or centiMorgans (cM), named in honor of Thomas Hunt Morgan by his undergraduate student, Alfred Sturtevant, who developed the concept of genetic maps. Geneticists routinely convert the recombination frequencies of two loci directly into cM. Thus, the recombination frequency in percent is approximately the same as the map distance in cM. For example, if two loci have a recombination frequency of 25% they are said to be ~25 cM apart on a chromosome (Figure 2).
Figure 2.
Two genetic maps consistent with a recombination frequency of 25% between A and B. Note the location of the centromere. (Original-Deyholos-CC BY-NC 3.0)
number of crossovers that occurred. This is because as the distance between loci increases, so does the possibility of having a second (third, or more) crossover occur between the loci. This is a problem for geneticists, because with respect to the loci being studied, these double-crossovers produce gametes with the same genotypes as if no recombination events had occurred (Figure 4), so they have parental genotypes. Thus, a double crossover will appear to be a parental type and not be counted as a recombinant, despite having two (or more) crossovers. Geneticists will sometimes use specific mathematical formulae to adjust large recombination frequencies to account for the possibility of multiple crossovers and thus get a better estimate of the actual distance between two loci.
2. MAPPING WITH THREE-POINT CROSSES
A genetic map consists of multiple loci distributed along a chromosome. A particularly efficient method of mapping three genes at once is the three-point cross, which allows the order and distance between three potentially linked genes to
except pure breeding lines with contrasting genotypes are crossed to produce an individual heterozygous at three loci (a trihybrid), which is then testcrossed to a tester, which is homozygous recessive for all three genes, to determine the recombination frequency between each pair of genes, among the three loci. A Punnett square can be used to predict all the possible outcomes of the test cross (Figure 6). The progeny produced from the testcross are shown in Table 1.
When the trihybrid is crossed to a tester, it should be able to make eight different gametes, to make eight possible different phenotype combinations in the offspring. The next step is to identify whether each gamete is recombinant or parental. This can be done by comparing only two loci at one time to the parental gametes. In this example, the parents of the trihybrid are a/a B/B c/c, and A/A b/b C/C, so the parental gametes would be aBc and AbC respectively. Now by comparing two loci at once you can determine if, between the two, they are recombinant or parental. For example, the offspring in the first row in Table 1 came from gamete aBC.
Figure 4.
A double crossover between two loci will produce gametes
with parental genotypes, even though TWO crossovers
have occurred between the loci.
(Original-Deyholos/Canham-CC BY-NC 3.0)
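The text mentions mathematical formulae that adjust large recombination frequencies for undetected multiple crossovers, but does not name one. As an illustration only, the sketch below uses Haldane's mapping function, a common choice that assumes crossovers occur independently of one another (no interference). Treating this as the intended formula is an assumption, and the function name is hypothetical.

```python
import math

def haldane_map_distance_cM(rf):
    """Estimate map distance in cM from an observed recombination frequency.

    Uses Haldane's mapping function, d = -50 * ln(1 - 2 * rf), which assumes
    no crossover interference. rf is a fraction with 0 <= rf < 0.5.
    """
    if not 0 <= rf < 0.5:
        raise ValueError("rf must satisfy 0 <= rf < 0.5")
    return -50.0 * math.log(1.0 - 2.0 * rf)

for rf in (0.01, 0.10, 0.25, 0.40):
    print(f"RF = {rf:.0%}  ->  about {haldane_map_distance_cM(rf):.1f} cM")
```

For small recombination frequencies the estimate is close to the simple conversion of 1% RF to 1 cM, and the correction grows as RF approaches 50%.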
Figure 6.
Punnett square of the test cross for Figure 5, showing the predicted gametes possible from this cross, and the resulting phenotypes. The tester contributes only the gamete abc; P1 and P2 mark the parental gametes.
(Original-L. Canham-CC BY-NC 3.0)
Gamete from trihybrid   Genotype of progeny   Tail phenotype   Whisker phenotype
aBC                     aa Bb Cc              Short tail       Long whiskers
AbC (P2)                Aa bb Cc              Long tail        Long whiskers
abC                     aa bb Cc              Short tail       Long whiskers
ABC                     Aa Bb Cc              Long tail        Long whiskers
aBc (P1)                aa Bb cc              Short tail       Short whiskers
Abc                     Aa bb cc              Long tail        Short whiskers
abc                     aa bb cc              Short tail       Short whiskers
ABc                     Aa Bb cc              Long tail        Short whiskers
tail        fur         whisker     number of   gamete from   genotype of F2    loci   loci   loci
phenotype   phenotype   phenotype   progeny     trihybrid     from test cross   A, B   A, C   B, C
                                    (n=120)
short       brown       long         5          aBC           aaBbCc            P      R      R
long        white       long        38          AbC (P2)      AabbCc            P      P      P
short       white       long         1          abC           aabbCc            R      R      P
long        brown       long        16          ABC           AaBbCc            R      P      R
short       brown       short       42          aBc (P1)      aaBbcc            P      P      P
long        white       short        5          Abc           Aabbcc            P      R      R
short       white       short       12          abc           aabbcc            R      P      R
long        brown       short        1          ABc           AaBbcc            R      R      P
Table 1.
An example of data that might be obtained from the F2 generation of the three-point cross shown in Figure 5. The rarest
phenotypic classes correspond to double recombinant gametes ABc and abC. Each phenotypic class and corresponding gamete
can also be classified as parental (P) or recombinant (R) with respect to how each pair of loci (A,B), (A,C), (B,C) are arranged on
the chromosome.
Comparing loci A and B, we see that it matches one of the parental gametes and is therefore parental. Comparing A and C, we see that it matches neither parental gamete, so it is recombinant. The same can be said for B and C.
Once each class of progeny has been identified as parental or recombinant for each pair of loci, recombination frequencies may be calculated for each pair of loci individually, as we did before for one pair of loci in our dihybrid cross (Chapter 18):
loci A,B: $RF = \frac{1+16+12+1}{120} = 25\%$
loci A,C: $RF = \frac{1+5+1+5}{120} = 10\%$
loci B,C: $RF = \frac{5+16+12+5}{120} = 32\%$ (not corrected for double crossovers)
We can then use these numbers to build the map, placing the loci with the largest RF on the ends. However, note that in the three-point cross, the sum of the distances between A-B and A-C (35%) is greater than the distance calculated for B-C (32%). This is because of double crossovers between B and C, which were undetected when we considered only pairwise data for B and C. We can easily account for some of these double crossovers, and include them in calculating the map distance between B and C, as follows (Figure 7).
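The pairwise calculations for the three-point cross can be reproduced with the Python sketch below, using the progeny counts from Table 1. The variable and function names are assumptions for illustration. The correction step counts each double-crossover progeny twice for the outer pair of loci; this is the usual textbook correction and is assumed here to be what the figure shows.

```python
from itertools import combinations

# Number of progeny per trihybrid gamete, taken from Table 1
counts = {"aBC": 5, "AbC": 38, "abC": 1, "ABC": 16,
          "aBc": 42, "Abc": 5, "abc": 12, "ABc": 1}
parentals = ("aBc", "AbC")  # gametes from the a/a B/B c/c and A/A b/b C/C parents
loci = "ABC"                # position 0 = locus A, 1 = locus B, 2 = locus C
total = sum(counts.values())

def pair_rf(i, j):
    """Recombination frequency between the loci at positions i and j."""
    parental_pairs = {p[i] + p[j] for p in parentals}
    recombinant = sum(n for gamete, n in counts.items()
                      if gamete[i] + gamete[j] not in parental_pairs)
    return recombinant / total

rf = {loci[i] + loci[j]: pair_rf(i, j) for i, j in combinations(range(3), 2)}

# The pair with the largest RF lies at the ends of the map
outer_pair = max(rf, key=rf.get)

# The two rarest classes are the double crossovers; each contains two
# crossovers between the outer loci, so it is counted twice.
double_crossovers = sum(sorted(counts.values())[:2])
corrected_outer = rf[outer_pair] + 2 * double_crossovers / total

for pair, value in rf.items():
    print(f"loci {pair[0]},{pair[1]}: RF = {value:.1%}")
print(f"outer loci {outer_pair[0]},{outer_pair[1]}: corrected distance = {corrected_outer:.1%}")
```

With the correction, the B-C distance equals the sum of the A-B and A-C distances, which is consistent with locus A lying between B and C.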
[Fragment of a table of gametes and numbers of progeny: aBC 5, Abc 5; ABC 16, abc 12]
Figure 8.
Examples of how three genes can be associated with each other, based on whether all three are unlinked, all three are linked or two linked and one unlinked. Panel labels: All Unlinked; Two Linked and One Unlinked; All Three Linked.
(Original-J. Locke/L. Canham-CC BY-NC 3.0)
over, while others are lower. In addition, the frequency of crossing over varies from species to species, and even from male to female within a species. For example, in Drosophila melanogaster there is no crossing over in males.
From Drosophila recombination data, we know that the likelihood of a crossover is greatest in the middle of a chromosome arm and lower at the telomere and centromere regions (Figure 10). This distribution would be expected if one of the functions of a crossover event were to hold the two synapsed chromosomes together so that they align at metaphase I and segregate correctly in Meiosis I.
4.2. RESOLUTION OF GENETIC MAPS
The resolution of genetic maps depends on two factors: (1) the number of marker loci and (2) the number of progeny.
For species with a high number of marker loci (those which have a phenotype that permits the alleles to be distinguished), more locations can be plotted, resulting in a higher resolution map compared to species with fewer markers.
For species with a greater number of progeny, a better map is possible. The ability to score recombinants among hundreds or thousands of progeny means that one can identify rare or very rare recombinants and thus map loci that are very close together. For example, with the mapping of bacteriophage, it is possible to map mutations down to the level of single base pairs using certain selectable marker systems.
Because of these two factors, the genetic maps of simple prokaryote genomes are more refined than those of the larger and more complex eukaryote genomes.
These days, most laboratory species have had their genomes sequenced. This knowledge provides another means to locate the specific gene(s) responsible for a desired trait(s).
Figure 10.
Diagram of the frequency of crossing over along a
chromosome (bottom). The Y-axis shows the relative
rate of crossing over. The two peaks are present in the
middle of each chromosome arm, while the telomeres
and centromeres have lower frequencies of exchanges.
(Original-J. Locke-CC BY-NC 3.0)
___________________________________________________________________________
SUMMARY:
• A genetic map (or recombination map) is a representation of the linear order of genes (or loci), and their
relative distances determined by crossover frequency, along a chromosome.
• Recombination frequency is usually proportional to the distance between loci, so recombination
frequencies can be used to create genetic maps.
• Recombination frequencies tend to underestimate map distances, especially over long distances, since
double crossovers may be indistinguishable from non-recombinants.
• Three-point crosses can determine the order and map distance among three loci.
• In three-point crosses, a correction for the distance of the outside markers can be made to account for
double crossovers between the two outer loci.
• Crossovers are not equally frequent all along a chromosome. In some regions, crossovers are more
frequent, while in others they are less frequent.
• The resolution of genetic maps depends on the number of markers and the number of progeny.
KEY TERMS:
recombinants map units (mu)
parentals centimorgans (cM)
independent assortment Thomas Hunt Morgan
unlinked Alfred Sturtevant
linked map-based cloning
syntenic conserved synteny
non-syntenic double-crossover
genetic map three-point cross
STUDY QUESTIONS:
1) In corn (i.e. maize, a diploid species), imagine that alleles for resistance to a particular pathogen are recessive and are linked to a locus that affects tassel length (short tassels are recessive to long tassels). Design a series of crosses to determine the map distance between these two loci. You can start with any genotypes you want, but be sure to specify the phenotypes of individuals at each stage of the process and specify which progeny will be considered recombinant. You do not need to calculate recombination frequency.
2) In a mutant screen in Drosophila, you identified a gene related to memory, as evidenced by the inability of recessive homozygotes to learn to associate a particular scent with the availability of food. Given another line of flies with an autosomal mutation that produces orange eyes, design a series of crosses to determine the map distance between these two loci and specify which progeny will be considered recombinant. You do not need to calculate recombination frequency.
3) Imagine that methionine heterotrophy, chlorosis (loss of chlorophyll), and absence of leaf hairs (trichomes) are each caused by recessive mutations at three different loci in Arabidopsis. Given a triple mutant, and assuming the loci are on the same chromosome, explain how you would determine the order of the loci relative to each other.
4) Three loci are linked in the order B-C-A. If the A-B map distance is 1cM, and the B-C map distance is 0.6cM, given the lines AaBbCc and aabbcc, what will be the frequency of Aabb genotypes among their progeny if one of the parents of the dihybrid had the genotypes AABBCC?
5) Genes for body color (B black dominant to b yellow) and wing shape (C straight dominant to c curved) are located on the same chromosome in flies. If single mutants for each of these traits are crossed (i.e. a yellow fly crossed to a curved-wing fly), and their progeny is testcrossed, the following phenotypic ratios are observed among their progeny.

black, straight 17
yellow, curved 12
black, curved 337
yellow, straight 364

a) Calculate the map distance between B and C.
b) Why are the frequencies of the two smallest classes not exactly the same?
6) Given the map distance you calculated between B-C in question 5, if you crossed a double mutant (i.e. yellow body and curved wing) with a wild-type fly, and testcrossed the progeny, what phenotypes in what proportions would you expect to observe among the F2 generation?
7) Wild-type mice have brown fur and short tails. Loss of function of a particular gene produces white fur, while loss of function of another gene produces long tails, and loss of function at a third locus produces agitated behaviour. Each of these loss of function alleles is recessive. If a wild-type mouse is crossed with a triple mutant, and their F1 progeny is test-crossed, the following recombination frequencies are observed among their progeny. Produce a genetic map for these loci.

Fur Tail Behaviour Freq.
white short normal 16
brown short agitated 0
brown short normal 955
white short agitated 36
white long normal 0
brown long agitated 14
brown long normal 46
white long agitated 933
Figure 1.
The E/e gene in turkeys is responsible for
bronze or brown feather colour, and is
located on the Z-chromosome.
(Flickr- stevevoght- CC BY-SA 2.0)
INTRODUCTION
Previously, Mendel, working with plants, showed patterns of inheritance derived from gene loci on autosomal chromosomes. One complication to this model of inheritance in animals is that loci present on sex chromosomes, called sex-linked loci, don’t follow this pattern. This chapter covers the various patterns of inheritance for various sex-linked loci.
1. AUTOSOMES AND SEX CHROMOSOMES
In diploids, most chromosomes exist in pairs (same length, centromere location, and banding pattern) with one set coming from each parent. These chromosomes are called autosomes. However, many species have an additional pair of chromosomes that do not look alike. These are sex chromosomes because they differ between the sexes. In humans, males have one X and one Y while females have two X chromosomes. Autosomes are those chromosomes present in the same number in males and females, while sex chromosomes are those that are not. When sex chromosomes were first discovered their function was unknown and the name X was used to indicate this mystery. The next ones were named Y, then Z, and then W.
The combination of sex chromosomes within a species is associated with either male or female individuals. In mammals, fruit flies, and some dioecious plants, those with two X chromosomes are females while those with an X and a Y are males. In birds, moths, and butterflies, males are ZZ and females are ZW. Because sex chromosomes have arisen multiple times during evolution, the molecular mechanism(s) through which they determine sex differ among those organisms. For example, although humans and Drosophila both have X and Y sex chromosomes, they have different mechanisms for determining sex (see the next chapter).
How do the sex chromosomes behave during meiosis? In those individuals with two of the same chromosome (i.e. homogametic sexes: XX females and ZZ males) the chromosomes pair and segregate during meiosis I the same as autosomes do. During meiosis in XY males or ZW females (heterogametic sexes) the sex chromosomes pair with each other.
In mammals (XX, XY) the consequence of this is that all egg cells will carry an X chromosome, while the sperm cells will carry either an X or a Y chromosome. Half of the offspring will receive two X chromosomes and become female while half will receive an X and a Y and become male (Figure 2). In species with ZZ males, all sperm carry a Z chromosome, while in females, ZW, half of the eggs will have a Z and half a W.
Figure 2.
Meiosis in an XY mammal. The stages shown are anaphase I, anaphase II, and mature sperm. Note how half of the sperm contain Y chromosomes and half contain X chromosomes.
(Original-Harrington-CC BY-NC 3.0)
Figure 3.
X and Y chromosomes have pseudoautosomal regions, which are capable of pairing and recombination during meiosis. (Original-Locke/Kang-CC BY-NC 3.0)
2. PSEUDO-AUTOSOMAL REGIONS ON THE X AND Y CHROMOSOMES
In evolution, before the X and Y chromosomes differentiated, they used to be equivalent homologs, like an autosome. Over time, the Y chromosome lost most of its genes (hence the reduced size), but the X chromosome retained all its genes. Thus, even though the Y chromosome has lost most of its genes, it still shares some regions with the X chromosome. This is the reason why, although X and Y chromosomes are heteromorphic (morphologically dissimilar), they are able to act as a homologous pair in meiosis and undergo crossover. These common regions, which contain similar genes, permit the X and Y to pair up and are called the “pseudoautosomal regions”. The name comes from the observation that genes in these regions behave like autosomal genes in their inheritance. Alleles of the genes in this region cross over just like those on the autosomes. Thus, genes in this region are not inherited in a sex-linked pattern, even though they are located on the X chromosome.
The genes found in the pseudo-autosomal region are present in two copies in both XY males and XX females and thus if expressed from both active and inactive X chromosomes. These genes may explain clinical features in sex chromosome aneuploidy (addition or subtraction of a sex chromosome; e.g. XXY) as gene products may be either under or over expressed in relation to normal females and males. One of the genes in this region is called SHOX. It makes a protein that promotes bone growth. 46,XX and 46,XY people have two functioning copies and have average height. People with 47,XYY and 47,XXX genomes have three copies and are taller than average. People with 45,X have one copy and are short. It is the single copy of SHOX and a few of the other genes in the pseudo-autosomal region that causes health problems for women with Turner syndrome.
3. SEX LINKAGE: AN EXCEPTION TO MENDEL’S FIRST LAW
Above we introduced sex chromosomes and autosomes (non-sex-linked chromosomes). For loci on autosomes, the alleles follow the classic Mendelian pattern of inheritance. However, loci on the sex chromosomes do not follow this pattern, because most (not all) of the loci on the typical X-chromosome are absent from the Y-chromosome, even though the two act as a homologous pair during meiosis. Instead, they will follow a sex-linked pattern of inheritance. An X-linked allele in the father will always be passed on to his daughters only, but an X-linked allele in the mother will be passed on to both daughters and sons equally.
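The transmission rule for X-linked alleles can be checked by enumerating gametes, as in a Punnett square. The following is a minimal sketch only: the function name and the allele labels ("Xm" for a marked X-linked allele, "X+" for wild type) are hypothetical, and it assumes simple XX/XY segregation with equally frequent gametes.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def carriers_by_sex(mother, father, allele):
    """Fraction of daughters and of sons that receive `allele`.

    mother: the two X chromosomes she carries, e.g. ("Xm", "X+")
    father: his X chromosome and the Y, e.g. ("X+", "Y")
    """
    carrying = Counter()
    totals = Counter()
    # Every egg type combined with every sperm type, as in a Punnett square.
    for egg, sperm in product(mother, father):
        sex = "son" if sperm == "Y" else "daughter"
        totals[sex] += 1
        if allele in (egg, sperm):
            carrying[sex] += 1
    return {sex: Fraction(carrying[sex], totals[sex]) for sex in totals}


# Allele carried by the father: all daughters receive it, no sons do.
print(carriers_by_sex(("X+", "X+"), ("Xm", "Y"), "Xm"))
# Allele carried by a heterozygous mother: half of the daughters and half of the sons receive it.
print(carriers_by_sex(("Xm", "X+"), ("X+", "Y"), "Xm"))
```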
mutant allele.
___________________________________________________________________________
SUMMARY:
• Autosomes and sex chromosomes differ in that the former are present in the same number in both sexes, while
the complement of the latter depends on the sex of the individual.
• Pseudo-autosomal regions are regions on X and Y chromosome that can pair up and recombine.
• Sex-linked genes are an exception to standard Mendelian inheritance. Their phenotypes are influenced
by the type of sex chromosome system and the type of dosage compensation system found in the
species.
• Examples of sex-linked genes include the white gene on the Drosophila X chromosome, the TDF gene on
the Y chromosome, and the E/e gene on the Z chromosome.
KEY TERMS:
autosome heteromorphic
sex chromosome sex-linked
homogametic X-linked genes
heterogametic reciprocal cross
pseudoautosomal regions Z-linked genes
STUDY QUESTIONS:
1) A rare dominant mutation causes a neurological
disease that appears late in life in all people
that carry the mutation. If a father has this
disease, what is the probability that his
daughter will also have the disease?
2) Make Punnett Squares to accompany the
crosses shown in Figure 5.
3) Draw reciprocal crosses that would show that
the turkey E-gene is on the Z-chromosome.
Figure 1.
Not all species determine sex using the
same mechanism. There are many
factors that can determine a species’ sex
and one of them is growth temperature.
For alligators, sex is determined by the
temperature of the eggs in their nest.
(Flickr-Florida Fish and Wildlife-CC BY-
ND 2.0)
INTRODUCTION
In the previous Chapter, sex chromosomes were described and their inheritance was compared to that of the autosomes. The linkage of sex chromosomes to the sex of individuals was presumed. In this chapter we will cover the mechanisms of sex determination by chromosomes (genes) as well as other, environmental, mechanisms. In the diversity of animal life, sex is not always determined by genetics (sex chromosomes).
1. SEX DETERMINATION MECHANISMS IN ANIMALS
There are various mechanisms for sex determination in animals. These include sex chromosomes, chromosome dosage, and environmental cues.
1.1. SEX CHROMOSOME SYSTEMS:
a) XY system
Different combinations of the X and Y sex chromosomes can determine the sex of an organism. For example, in humans and other mammals XY embryos develop as males while XX embryos become females. This difference in development is due to the presence of only a single gene, the Sex-determining Region Y (SRY) gene, also known as Testis-Determining Factor (TDF) gene, on the Y-chromosome. Its presence in the genome and expression in gonad tissues dictates that the sex of that individual will be male. Its absence or lack of correct expression results in a female phenotype for that individual.
In mammals, the sex chromosomes evolved just after the divergence of the monotreme lineage (mammals that lay eggs) from the lineage that led to marsupial mammals (young are carried in a pouch) and placental mammals. Thus nearly every mammal species uses the same sex determination system. In this system, during embryogenesis, the gonads will develop into either ovaries or testes. (Figure 2)
Figure 2.
Gonad differentiation is under the control of several genes including Testis-determining factor (TDF, SRY) at Yp11.3. (y chromosome, p arm, region 1, band 1, sub-band 3).
(Original-Harrington/Kang-CC BY-NC 3.0)
their sex during their lifetime. For example, the Wrasse family includes many different species of various sizes and colours. In this family, sex change is typically female-to-male (male-to-female sex change has been seen in experimental conditions). The individual to change sex is generally the largest female in a group.
The tuatara (left) is a reptile, but not a lizard, although it is related to lizards (right). Cladogram: 1=Tuatara, 2=lizards, 3=snakes, 4=crocodiles, 5=birds.
(Left: Flickr-PhillipC- CC BY 2.0) (Right: Wikipedia-Benchill-CC BY 3.0)
2.3. PARTHENOGENETIC SPECIES
Figure 4.
Moon Wrasse (Thalassoma lunare) can change sex.
(Flickr- Nick Hobgood- CC BY-NC 2.0)
Determining Factors | Genetic Mechanism | Cell Response Mechanism
Chromosomal: XX/XY; ZW/ZZ; XX/XO; Haploid/Diploid | Single gene; X-Autosome Ratio (gene dosage) | Hormonal: directs cells to sex phenotype; Cell-autonomous (each cell “knows” what sex it is)
Environment: Rearing temp.; Social interactions; Parthenogenesis | Not genetic | Hormonal?
Table 1. A summary table outlining various factors that affect sex determination and its genetic and cell response mechanism.
___________________________________________________________________________
SUMMARY:
• The sex of an individual can be determined by sex chromosomes
• These include the X/Y, Z/W, and X/O systems
• Also, differences in the ploidy level (haploid vs diploid) determine sex in some species
• Lastly, environmental factors such as rearing temperature or social organization (male vs female ratio)
can determine sex.
KEY TERMS:
single gene X/O system
Testis-Determining Factor (TDF) X:Autosome (X:A) ratio
Sex-determining Region Y (SRY) haploid-diploid system
therians Tuatara
XY system Sex-ratio
ZW system parthenogenetic
STUDY QUESTIONS:
1) Draw reciprocal crosses that would
demonstrate that the turkey E-gene is on the Z
chromosome.
2) Mendel’s First Law (as stated in class) does not
apply to alleles of most genes located on sex
chromosomes. Does the law apply to the
chromosomes themselves?
PAGE 6 OPEN GENETICS LECTURES – FALL 2017
SEX CHROMOSOME: DOSAGE COMPENSATION – CHAPTER 22
Figure 1.
A calico cat showing the random inactivation
(X-inactivation) of one or the other X-
chromosome giving either an orange or black
fur colour. The inactivation is a mechanism of
dosage compensation. (Note: the white
colour pattern is due to another gene.)
(Original-J. Locke-CC:AS)
INTRODUCTION
The previous chapters on sex chromosomes dealt with sex linkage and sex determination. Now, there is one last issue dealing with sex chromosomes, that of dosage compensation. Because the number of X chromosomes (and Z chromosomes) differs between the sexes, there is a difference in the number of copies for each locus on the chromosome: females have two, while males only have one (opposite for the ZZ/ZW system).
1. GENE DOSAGE PROBLEM
For many loci, the different number of chromosomes is inconsequential. That is, the phenotype is unaffected whether there are one or two alleles present. However, for some loci, it is significant and can affect the phenotype. These loci need to have the correct gene dosage to generate a wild type phenotype. The dosage difference between the sexes is reconciled in one of two ways. Either the single X chromosome in males is up-regulated to produce the expression equivalent of two doses, or one of the two doses in females is inactivated so as to only have one active dose.
Mammals and Drosophila both have XX - XY sex determination systems. However, because these systems evolved independently, and very early in evolution, they work differently with regard to compensating for the difference in gene dosage.
Remember, in most cases the sex chromosomes act as a homologous pair even though the Y-chromosome has lost most of the loci when compared to the X-chromosome. Typically, the X and the Y chromosomes were once similar but, for unclear reasons, the Y chromosomes have degenerated, slowly mutating and losing their loci. In modern day mammals the Y chromosomes have very few genes left while the X chromosomes remain as they were. This is a general feature of all organisms that use chromosome based sex determination systems. Chromosomes found in both sexes (the X or the Z) have retained their genes while the chromosomes found in only one sex (the Y or the W) have lost most of their genes. In either case there is a gene dosage difference between the sexes: e.g. XX females have two doses of X-chromosome genes while XY males only have one. This gene dosage difference needs to be compensated in a process called dosage compensation. There are two major mechanisms.
Figure 4.
Relationship between genotype and phenotype for an X-linked gene in cats. The O^O allele = orange while the O^B allele = black.
(Original-Harrington-CC BY-NC 3.0)
b) Chimera
A chimera is an organism composed of genetically distinct cells derived from different (more than one) zygotes. Because the cells are derived from different organisms, the cell populations will have more divergent genotypes when compared to those of a mosaic. The different sources can sometimes even be different species such as a goat and a sheep, which when mixed makes a “shoat” or a “geep”.
Figure 9.
A chimeric mouse on the very right, made in NIMH’s Transgenic Core Facility. The two coloured fur shows the two types of cells present.
(Wikipedia- Staff at NIMH's Transgenic Core Facility-PD)
___________________________________________________________________________
SUMMARY:
• To compensate for under- or over-dosage of gene products, organisms use methods such as expressing genes at twice the normal rate or inactivating one X chromosome.
• X-chromosome inactivation occurs randomly (except for special circumstances), and during interphase the inactivated chromosome appears as a condensed mass in the nucleus called the Barr body.
• The Orange gene in cats and the F8 gene in humans are examples of X-linked genes.
• Sex determination can be either hormonal or cell-autonomous. Abnormality in the cell-autonomous mechanism may result in gynandromorphs.
• Both mosaic and chimeric organisms are composed of genetically distinct cells, but the origins of those cells differ.
KEY TERMS:
dosage compensation cell-autonomous
X-linked genes freemartin
autosomal genes chimera
X-chromosome inactivation sexual gynandromorphs
Barr body genetic mosaics
Orange gene sexually dimorphic
F8 gene X chromosome mosaicism
hermaphrodites gynandromorph
parthenogenesis anastomoses
hormonal
STUDY QUESTIONS:
1) What is the relationship between the O^O and O^B
alleles of the Orange gene in cats?
2) Another cat hair colour gene is called White
Spotting. This gene is autosomal. Cats that have
the dominant “S” allele have white spots, while
the “s” allele doesn’t. Taking the Orange locus
(O^B and O^O) into account, what are the possible
genotypes of cats that are:
a) entirely black
b) entirely orange
c) black and white
d) orange and white
e) orange and black (tortoiseshell)
f) orange, black, and white (calico)
3) Make a diagram similar to Figure 4, but with
the F8 alleles/genotypes, that shows the
relationship between genotype and phenotype
in females and males and which would use the
purified Factor VIII protein.
Figure 1.
Polydactyly (six fingers in this case – count them) is an example of a human
trait that can be studied by pedigree analysis.
(Wikipedia- Drgnu23- CC BY-SA 3.0)
INTRODUCTION
The basic concepts of genetics described in the preceding chapters can be applied to almost any eukaryotic organism. However, some techniques, such as test crosses, can only be performed with model organisms or other species that can be experimentally manipulated. To study the inheritance patterns of genes in humans and other species for which controlled matings are not possible, geneticists use the analysis of pedigrees and populations.
1. PEDIGREE ANALYSIS
1.1. PEDIGREE CHARTS
Pedigree charts are diagrams that show the phenotypes and/or genotypes for a particular organism, its ancestors, and descendants. While commonly used in human families to track genetic diseases, they can be used for any species and any inherited trait. Geneticists use a standardized set of symbols to represent an individual’s sex, family relationships and phenotype. These diagrams are used to determine the mode of inheritance of a particular disease or trait, and to predict the probability of its appearance among offspring. Pedigree analysis is therefore an important tool in basic research, agriculture, and genetic counseling.
Each pedigree chart represents all of the available information about the inheritance of a single trait (most often a disease) within a family. The pedigree chart is therefore drawn using factual information, but there is always some possibility of errors in this information, especially when relying on family members’ recollections or even clinical diagnoses. In real pedigrees, further complications can arise due to incomplete penetrance (including age of onset) and variable expressivity of disease alleles, but for the examples presented in this book, we will presume complete accuracy of the pedigrees; that is, the phenotype accurately reflects the genotype. A pedigree may be drawn when trying to determine the nature of a newly discovered disease, or when an individual with a family history of a disease wants to know the probability of passing the disease on to their children. In either case, a tree is drawn, as shown in Figure 2, with circles to represent females, and squares to represent males. Matings are drawn as a line joining a male and female, while a
AD EXAMPLE: ACHONDROPLASIA
Achondroplasia is a common form of dwarfism. The FGFR3 gene at 4p16 (chromosome 4, p arm, region 1, band 6) encodes a receptor protein that negatively regulates bone development. A specific bp substitution in the gene makes an over-active protein and this results in shortened bones.
Achondroplasia is considered autosomal dominant because the defective proteins made in A / a embryos halt bone growth prematurely. A / A embryos do not make enough limb bones to survive. Most, but not all, dominant mutations are also recessive lethal. In achondroplasia, the A allele shows a dominant visible phenotype (shortness) and a recessive lethal phenotype.
Figure 3.
A pedigree consistent with AD inheritance.
(Unknown)
Table 1.
Genotype nomenclature consistent with AD inheritance.
(Original-Harrington-CC BY-NC 3.0)
Figure 4.
Portrait of Sebastián de Morra, a court dwarf, painted by Diego Velázquez ~1645. He likely had achondroplasia, a condition that has autosomal dominant inheritance. (Wikimedia Commons-Diego Velázquez-PD)
Figure 5.
Diagram showing the mechanism of achondroplasia.
(Original-Harrington-CC BY-NC 3.0)
2.2. X-LINKED DOMINANT (XD)
In X-linked dominant inheritance, the gene responsible for the disease is located on the X-chromosome, and the allele that causes the disease is dominant to the normal allele in females. Because females have twice as many X-chromosomes as males, females tend to be more frequently affected than males in the population. However, not all pedigrees provide sufficient information to distinguish XD and AD. One definitive indication that a trait is inherited as AD, and not XD, is that an affected father passes the disease to a son; this type of transmission is not possible with XD, since males inherit their X chromosome from their mothers.
Figure 6.
Two pedigrees consistent with XD inheritance. (Unknown)
Table 2.
Genotype nomenclature consistent with XD inheritance.
(Original-Harrington-CC BY-NC 3.0)
Figure 8.
A pedigree consistent with AR inheritance. (Unknown)
Table 3.
Genotype nomenclature consistent with AR inheritance.
(Original-Harrington-CC BY-NC 3.0)
Figure 7.
Some types of rickets may follow an XD mode of inheritance.
(Wikipedia-Mrish-CC BY-SA 1.0)
XD EXAMPLE: FRAGILE X SYNDROME
The FMR1 gene at Xq27.3 (X chromosome, q arm, region 2, band 7, sub-band 3) encodes a protein needed for neuron development. There is a (CGG)n repeat array in the 5’UTR (untranslated region). If there is expansion of the repeat in the germline cell, the child will inherit a non-functional allele. X^A / Y males have fragile X mental retardation (IQ < 50) because none of their neurons can make FMR1 proteins. Fragile X syndrome is considered X-linked dominant because only some neurons in X^A / X^a females can make FMR1 proteins. The severity (IQ 50 – 70) in these females depends upon the number and location of these cells within the brain.
2.3. AUTOSOMAL RECESSIVE (AR)
Diseases that are inherited in an autosomal recessive pattern require that both parents of an affected individual carry at least one copy of the disease allele. With AR traits, many individuals in a pedigree can be carriers, probably without knowing it. Compared to pedigrees of dominant traits, AR pedigrees tend to show fewer affected individuals and are more likely than AD or XD to “skip a generation”. Thus, the major feature that distinguishes AR from AD or XD is that unaffected individuals can have affected offspring. Attached earlobes is a human condition that may follow an AR mode of inheritance.
AR EXAMPLE: PHENYLKETONURIA (PKU)
Individuals with phenylketonuria (PKU) have a mutation in the PAH gene at 12q24 (chromosome 12, q arm, region 2, band 4), which encodes an enzyme called phenylalanine hydroxylase (PAH) that converts phenylalanine into tyrosine. Without PAH, the accumulation of phenylalanine and other metabolites, such as phenylpyruvic acid (Figure 10), disrupts brain development, typically within a year after birth, and can lead to intellectual disability. Fortunately, this condition is both easy to diagnose (Figure 9) and can be successfully treated with a low phenylalanine diet. There are over 450 different mutant alleles of the PAH gene, so most people with PKU are compound
Figure 11.
A pedigree consistent with XR inheritance. (Unknown)
Figure 9.
Many inborn errors of metabolism, such as
Table 4.
Genotype nomenclature consistent with XR inheritance.
(Original-Harrington-CC BY-NC 3.0)
Figure 10.
A mutant PAH enzyme cannot catalyze the breakdown
of phenylalanine into tyrosine. This causes a buildup of
phenylpyruvic acid, which damages the central
nervous system.
(Original-Harrington-CC BY-NC 3.0)
4. CALCULATING PROBABILITIES
Once the mode of inheritance of a disease or trait is identified, some inferences about the genotype of individuals in a pedigree can be made, based on their phenotypes and where they appear in the family tree. Given these genotypes, it is possible to calculate the probability of a particular genotype being inherited in subsequent generations. This can be useful in genetic counseling, for example when prospective parents wish to know the likelihood of their offspring inheriting a disease for which they have a family history.
Probabilities in pedigrees are calculated using knowledge of Mendelian inheritance and the same basic methods as are used in other fields. The first formula is the product rule: the joint probability of two independent events is the product of their individual probabilities; this is the probability of one event AND another event occurring. For example, the probability of rolling a “five” with a single throw of a single six-sided die is 1/6, and the probability of rolling “five” in each of three successive rolls is 1/6 x 1/6 x 1/6 = 1/216.
The second useful formula is the sum rule, which states that the combined probability of two mutually exclusive events is the sum of their individual probabilities. This is the probability of one event OR another event occurring. For example, the probability of rolling a five or six in a single throw of a die is 1/6 + 1/6 = 1/3.
With these rules in mind, we can calculate the probability that two carriers (i.e. heterozygotes) of an AR disease will have a child affected with the disease as ½ x ½ = ¼, since for each parent, the probability of any gametes carrying the disease allele is ½. This is consistent with what we already
Figure 15.
Individuals in this pedigree are labeled with numbers to make discussion easier. (Unknown)
Assuming the disease has an AR pattern of inheritance, what is the probability that individual #14 will be affected? We can assume that individuals #1, #2, #3 and #4 are heterozygotes (Aa), because they each had at least one affected (aa) child, but they are not affected themselves. This means that there is a 2/3 chance that individual #6 is also Aa. This is because according to Mendelian inheritance, when two heterozygotes mate, there is a 1:2:1 distribution of genotypes AA:Aa:aa. However, because #6 is unaffected, he can’t be aa, so he is either Aa or AA, and he is twice as likely to be Aa as AA. By the same reasoning, there is likewise a 2/3 chance that #9 is a heterozygous carrier of the disease allele.
If individual #6 is heterozygous for the disease allele, then there is a ½ chance that #12 will also be a heterozygote (i.e. if the mating of #6 and #7 is Aa × AA, half of the progeny will be Aa; we are also assuming that #7, who is unrelated, does not carry any disease alleles). Therefore, the combined probability that #12 is also a heterozygote is 2/3 x 1/2 = 1/3. This reasoning also applies to individual #13, i.e. there is a 1/3 probability that he is a heterozygote for the disease. Thus, the overall probability that both individual #12 and #13 are heterozygous, and that a particular offspring of theirs will be homozygous for the disease alleles is 1/3 x 1/3 x 1/4 = 1/36.
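The calculation above can be reproduced with exact fractions. The following Python sketch is illustrative only; the variable names are hypothetical, and it relies on the same assumptions as the text: AR inheritance, heterozygous grandparents, and unrelated spouses who carry no disease allele.

```python
from fractions import Fraction

# Product rule: three successive "fives" on a six-sided die.
p_three_fives = Fraction(1, 6) ** 3            # 1/216
# Sum rule: a five OR a six in one throw (mutually exclusive outcomes).
p_five_or_six = Fraction(1, 6) + Fraction(1, 6)  # 1/3

# Aa x Aa gives AA : Aa : aa in a 1 : 2 : 1 ratio.
cross = {"AA": Fraction(1, 4), "Aa": Fraction(1, 2), "aa": Fraction(1, 4)}

# An unaffected child of two carriers cannot be aa, so condition on AA or Aa.
p_carrier_if_unaffected = cross["Aa"] / (cross["AA"] + cross["Aa"])  # 2/3

# A carrier mated to a non-carrier (Aa x AA) passes the allele on half the time.
p_12_is_carrier = p_carrier_if_unaffected * Fraction(1, 2)  # 1/3
p_13_is_carrier = p_carrier_if_unaffected * Fraction(1, 2)  # 1/3

# Both parents must be carriers AND the child must receive both disease alleles.
p_14_affected = p_12_is_carrier * p_13_is_carrier * cross["aa"]

print(p_three_fives, p_five_or_six, p_carrier_if_unaffected, p_14_affected)
# 1/216 1/3 2/3 1/36
```

Using `Fraction` rather than floating-point numbers keeps every intermediate value in the same form as the hand calculation.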
___________________________________________________________________________
SUMMARY:
• Pedigree analysis can be used to determine the mode of inheritance of specific traits such as diseases.
• Loci can be X- or Y-linked or autosomal in location and alleles either dominant or recessive with respect
to wild type.
• If the mode of inheritance is known, a pedigree can be used to calculate the probability of inheritance
of a particular genotype by an individual.
KEY TERMS:
Pedigree charts X-linked recessive
mode of inheritance Hemophilia A
genetic counseling Y-linked
incomplete penetrance hairy-ear-rim
variable expressivity chloroplast
proband mitochondrion
affected organelle
carrier mitochondrial inheritance (mtDNA)
autosomal dominant endopolyploidy
Achondroplasia sporadic
X-linked dominant product rule
Fragile X-syndrome sum rule
Phenylketonuria (PKU)
autosomal recessive
STUDY QUESTIONS:
1) What are some of the modes of inheritance that are consistent with this pedigree?
2) In this pedigree in question 1, the mode of inheritance cannot be determined unambiguously. What are
some examples of data (e.g. from other generations) that, if added to the pedigree would help determine
the mode of inheritance?
3) For each of the following pedigrees, name the most likely mode of inheritance (AR=autosomal recessive,
AD=autosomal dominant, XR=X-linked recessive, XD=X-linked dominant). (These pedigrees were obtained
from various external sources).
a)
b)
c)
d)
4) The following pedigree represents a rare, autosomal recessive disease. What are the genotypes of the
individuals who are indicated by letters?
5) If individual #1 in the following pedigree is a heterozygote for a rare, AR disease, what is the probability
that individual #7 will be affected by the disease? Assume that #2 and the spouses of #3 and #4 are not
carriers.
INTRODUCTION segment of the chromosome has been lost (a
deletion), the cell may be missing many genes. The
Previous chapters described chromosomes as
causes of chromosome structural abnormalities
simple linear DNA molecules on which genes are
and the consequences they have for the cell and
located. For example, your largest chromosome,
the organism are described below. They involve
chromosome 1, has about 3536 genes. To ensure
double stranded breaks in the DNA, meiotic
that each of your cells possesses these genes, the
crossover events, and rejoining of the broken ends.
typical linear eukaryotic chromosome has three
Human examples will be used to show the
critical features that allow it to be passed on during
phenotypic consequences and methods for
cell division. (1) Origins of replication found along
detection.
its length provide places for DNA replication to
start, (2) telomeres protect each end of the 1. DNA DOUBLE STRAND BREAKS AND INCORRECT
chromosome, and (3) a single centromere near the MEIOTIC CROSSOVERS CAUSE CHROMOSOMAL
middle provides a place for microtubules to attach REARRANGEMENTS
and move the chromosome during mitosis and
meiosis. 1.1. DOUBLE STRAND BREAKS AND THEIR REPAIR
However, at various locations both strands of the A chromosome is a very long but very thin
double stranded DNA in a chromosome can break molecule. In the phosphodiester backbone there
and the subsequent daughter cell(s) may not retain are only two covalent bonds holding each base pair
all the DNA and thus all the genes. For example, if a to the next. If one of these covalent bonds is
Figure 2.
Repair of single strand nicks and double strand breaks in
DNA. (Original-Harrington-CC BY-NC 3.0)
broken the chromosome will still remain intact, Figure 3.
Errors during DNA repair can cause a chromosome
although a DNA Ligase will be needed to repair the deletion. In this diagram A, B, and C are genes on the
nick (Figure 2a). Problems arise when both strands same chromosome. As in Figure 2 there have been breaks
are broken at or near the same location. This in the DNA, recruitment of NHEJ proteins, and repair.
double strand break will cleave the chromosome After the repairs are completed the small piece of DNA
into two independent pieces (Figure 2b). Because with gene B is lost and the chromosome now only has
genes A and C. (Original-Harrington-CC BY-NC 3.0)
these events do occur in cells there is a repair
system called the non-homologous end joining
(NHEJ) system to fix them. Proteins bind to each 1.2. INCORRECT MEIOTIC CROSSOVERS
broken end of the DNA and reattach them with Meiotic crossovers occur at the beginning of
new covalent bonds. This system is not perfect and meiosis for two reasons. They help hold the
sometimes leads to chromosome rearrangements homologous chromosomes together until
(see next section). separation occurs during anaphase I (see Chapter
16). They also allow recombination to occur
The NHEJ system proteins only function if required. between linked genes (see Chapter 17). The event
If the chromosomes within an interphase nucleus itself takes place during prophase I when a double
are all intact the system is not active. The strand break on one piece of DNA is joined with a
telomeres at the natural ends of chromosomes double strand break on another piece of DNA and
prevent the NHEJ system from attempting to join the ends are put together (Figure 4a). Most of the
the normal ends of chromosomes together. If there time the breaks are on non-sister chromatids and
is one double strand break the two broken ends most of the time the breaks are at the same
can be recognized and joined. But if there are two relative locations.
double strand breaks at the same time there will be Problems occur when the wrong pieces of DNA are
four broken ends in total. The NHEJ system matched up along the chromosomes during
proteins may join the ends together correctly, but crossover events. This can happen if the same or
if they fail, the result is a chromosome similar DNA sequence is found at multiple sites on
rearrangement (Figure 3). the chromosomes (Figure 4b). For example, if there
are two Alu transposable elements on a
chromosome. When the homologous
chromosomes pair during prophase I, the wrong
Alu sequences might line up. A crossover may occur
in this region. If so, when the chromosomes
does not include the centromere (para = beside). If If joined with a normal gamete, they will result in
the breaks occur on different chromosome arms an unbalanced zygote, which is usually lethal. The
the inverted section includes the centromere and consequence of this is that crossover products
the result is a pericentric inversion (peri = around). (recombinants) are lost and thus inversions appear
to suppress crossovers within the inverted region.
Note: with both types of inversions, crossovers
outside the loop are possible and fully viable, as
they don’t alter the gene balance.
Figure 6.
Inversion can result from double strand break repair. 4. DUPLICATIONS
(Original-Harrington- CC BY-NC 3.0)
There are two major forms of duplications: tandem
and inverse duplications. Tandem duplications are
3.2. INVERSIONS FROM INCORRECT MEIOSIS when the duplicated genes are in the same order,
In meiosis, when an inversion chromosome is and inverse duplications are where the duplicated
paired up there is an inversion loop formed. If genes are in the reverse order. For example if you
there is a crossover within the loop then abnormal have a chromosome that has the genes ABCDEFGH,
products will result and abnormal, unbalanced and a duplication occurs in the BCD genes, then a
gametes will be produced. For example, a tandem duplication would look like:
crossover event within the loop of a paracentric ABCDBCDEFGH. An inverse duplication would look
inversion will lead to a di-centric product that will like: ABCDDCBEFGH.
break into deletion products and produce
unbalanced gametes (Figure 7). Similarly, with a Insertional duplications are also seen, where the
pericentric inversion, a crossover event leads to duplicated region is inserted to a more distant
duplicate/deletion products that are unbalanced location. e.g. ABCDEFBCDGH
(Figure 8).
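The three duplication types described in Section 4 can be sketched as simple string operations on a gene order. This is only an illustration: the function name `duplicate` and the `insert_at` parameter are hypothetical, and each gene is assumed to be a single letter as in the ABCDEFGH example.

```python
from typing import Optional

def duplicate(chromosome: str, segment: str, kind: str,
              insert_at: Optional[int] = None) -> str:
    """Return the gene order after duplicating `segment` within `chromosome`."""
    start = chromosome.index(segment)   # position of the original copy
    end = start + len(segment)          # first position after the original copy
    if kind == "tandem":
        # Extra copy sits right after the original, in the same order.
        copy, position = segment, end
    elif kind == "inverse":
        # Extra copy sits right after the original, in reverse order.
        copy, position = segment[::-1], end
    elif kind == "insertional":
        # Extra copy is placed at a more distant site chosen by the caller.
        if insert_at is None:
            raise ValueError("insertional duplications need an insertion point")
        copy, position = segment, insert_at
    else:
        raise ValueError(f"unknown duplication type: {kind}")
    return chromosome[:position] + copy + chromosome[position:]

print(duplicate("ABCDEFGH", "BCD", "tandem"))                    # ABCDBCDEFGH
print(duplicate("ABCDEFGH", "BCD", "inverse"))                   # ABCDDCBEFGH
print(duplicate("ABCDEFGH", "BCD", "insertional", insert_at=6))  # ABCDEFBCDGH
```

The three printed gene orders match the tandem, inverse, and insertional examples given in the text.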
Figure 7.
A paracentric inversion
pairing at meiosis. A
crossover within the loop
causes the production of an
acentric and a dicentric
chromatid, which leads to
deletion products.
(Original-Locke-CC BY-NC
3.0)
Figure 8.
A pericentric inversion
pairing at meiosis. A
crossover within the loop
causes the production of
duplicate and deletion
products.
(Original-Locke-CC BY-NC
3.0)
Figure 11.
A reciprocal translocation
pairing at meiosis. There
are two main avenues for
segregation: Adjacent-1
and Alternate. Adjacent-1
results in duplication and
deletion for part of the
chromosome segments.
Alternate doesn’t.
(Original-Locke-CC BY-NC
3.0)
5.2. TRANSLOCATIONS FROM INCORRECT MEIOSIS The third segregation possibility is known as
For translocations during meiosis, a consequence Adjacent-2, where N1 and T1 go to one pole, while
for the two chromosomes involved is that when N2 and T2 go to the other. This way of segregating
they pair both replicated chromosome pairs will be is extremely rare, and so will not be described in
together, which can be seen cytologically as a any further detail.
tetrad. This tetrad can segregate in three ways. 6. CONSEQUENCES OF CHROMOSOMAL
This set of paired, replicated chromosomes can REARRANGEMENTS
segregate as Alternate (balanced) where both
normal (N1 and N2) and both translocated 6.1. DECREASED VIABILITY
chromosomes (T1 and T2) go to the same poles, All the chromosome rearrangements shown above
respectively. The chromosomes can segregate as produce functional chromosomes. Each has one
Adjacent-1 (unbalanced) where the normal and centromere, two telomeres, and thousands of
translocation chromosomes segregate, with N2 and origins of replication. Because inversions and
T1 segregating from N1 and T2. Alternate and translocations do not change the number of genes
Adjacent-1 both occur at approximately equal in a cell or organism they are said to be balanced
frequency and thus only about half the time do the rearrangements. Unless one of the breakpoints
gametes end up unbalanced (Figure 11). Note how occurred in the middle of a gene the cells will not
each daughter cell in Alternate has equal amounts be affected. On the other hand, deletions and
of blue and black chromosomes, while in Adjacent- duplications are unbalanced rearrangements. The
1 one daughter has extra black chromosomes, and larger they are (more genes involved) the more
the other has extra blue. disruption they cause to the proper functioning of
the cell or organism. Having too much or too little
gene action for a large number of genes can disrupt
the cellular metabolism to generate a phenotype or gametes. This is a general property of inversions
reduce viability. and translocations.
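The segregation patterns of a reciprocal translocation can be checked with a small bookkeeping sketch. It is a simplified model under stated assumptions: each chromosome is reduced to two hypothetical segments (a centromere-proximal block and a distal block), and a gamete counts as balanced only if it carries exactly one copy of every block. The labels `cen1`, `cen2`, `a`, and `b` are placeholders, not names from the text.

```python
from collections import Counter

# Each chromosome is modelled as (centromere-proximal segment, distal segment).
# T1 and T2 have exchanged the distal segments of N1 and N2.
CHROMOSOMES = {
    "N1": ("cen1", "a"),
    "N2": ("cen2", "b"),
    "T1": ("cen1", "b"),
    "T2": ("cen2", "a"),
}

# Which chromosomes travel together to each pole, as described in the text.
SEGREGATIONS = {
    "Alternate": (("N1", "N2"), ("T1", "T2")),
    "Adjacent-1": (("N2", "T1"), ("N1", "T2")),
    "Adjacent-2": (("N1", "T1"), ("N2", "T2")),
}

def is_balanced(gamete) -> bool:
    """True if the gamete has exactly one copy of every segment."""
    counts = Counter(seg for name in gamete for seg in CHROMOSOMES[name])
    return all(counts[seg] == 1 for seg in ("cen1", "cen2", "a", "b"))

for pattern, gametes in SEGREGATIONS.items():
    labels = ["balanced" if is_balanced(g) else "unbalanced" for g in gametes]
    print(pattern, labels)
```

In this model only Alternate segregation gives balanced gametes; both Adjacent patterns duplicate one segment and delete another.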
Figure 12. Figure 13.
A normally arranged chromosome Meiosis in a cell heterozygous for the chromosomes shown in Figure 12. Note that of
(left) and a homolog with a pericentric the four gametes one has a deletion of the A gene and a duplication of the D gene while
inversion (right). another gamete has a duplication of A and a deletion of D.
(Original-Harrington/Canham-CC BY- (Original-Harrington/Canham-CC BY-NC 3.0)
NC 3.0)
7. CHROMOSOMAL REARRANGEMENTS IN
HUMANS
The problems described above can affect all
eukaryotes, unicellular and multicellular. To better
understand the consequences let us consider those
that affect people. The convention when describing
a person's karyotype (chromosome composition) is
to list the total number of chromosomes, then the
sex chromosomes, and then anything out of the
ordinary. Most of us are 46,XX or 46,XY. What
follows are some examples of chromosome
number and chromosome structure abnormalities.
7.2. INVERSION(9)
The most common chromosome rearrangements in
humans are inversions of chromosome 9. About 2%
of the world's population is heterozygous or
homozygous for inversion(9). This rearrangement
does not affect a person's health because the
genes on the chromosome are all present - all that
has changed is their relative locations. Inversion(9) Figure 16.
is different from deletion(5) in two main respects. Human chromosomes. One way to obtain chromosomes is
to take a blood sample, culture the cells for three days in
As mentioned above because it is a balanced
the presence of a T-cell growth factor, arrest the cells in
rearrangement it does not cause harm. And metaphase with a microtubule inhibitor, and then drop the
because of this nearly everyone with an cells onto a slide. The cells burst and the chromosomes
inversion(9) chromosome has inherited it from a stick to the slide. The chromosomes can then be stained or
parent who had inherited it from one of his or her probed. Because the cells are in metaphase it is possible to
see 46 replicated chromosomes here. There will be dozens
parents and so on. In contrast, most cases of
of collections of chromosomes like this over the entire
deletion(5) are due to new mutations occurring in a slide.
parent. (Wikipedia-Steffen Dietzel- CC BY-SA 3.0)
___________________________________________________________________________
SUMMARY:
• Deletion(5) causes a serious condition (cri-du-chat syndrome) because deletions are unbalanced
chromosome rearrangements.
• Inversion(9) causes few health consequences because inversions are balanced chromosome
rearrangements.
• Bright field microscopy can be used to detect chromosome number abnormalities and some
chromosome rearrangements.
• Fluorescence in situ hybridization can be used to detect all types of chromosome abnormalities.
• PCR and DNA chip based techniques can be used to detect chromosome number abnormalities,
deletions, and duplications.
KEY TERMS:
origin of replication inverse duplication
telomere insertional duplication
centromere duplication
double strand break translocation
non-homologous end joining reciprocal translocation
chromosome rearrangement Robertsonian translocation
meiotic crossover Tetrad
Alu transposable elements Alternate (balanced)
terminal deletion Adjacent-1 (unbalanced)
interstitial deletion reduced fertility
deletion karyotype
deletion loop 46,sex,deletion(5)
pseudo-dominant (cri-du-chat syndrome)
inversion 46,sex,inversion(9)
paracentric inversion bright field microscopy
pericentric inversion Giemsa stain
inversion loop fluorescence in situ hybridization
tandem duplication fluorescent DNA probe
STUDY QUESTIONS:
1) Make diagrams showing how an improper
crossover event during meiosis can lead to:
a) an inversion
b) a translocation.
2) If Drosophila geneticists want to generate
mutant strains with deletion mutations, they
expose flies to gamma rays. What does this
imply about gamma rays?
3) Design a FISH based experiment to find out if
someone is a 47,XXX female or a 47,XYY male.
INTRODUCTION more deleterious problems. Having the correct
expression levels of genes is important for the
So far in this textbook, we have talked about cells function of the organism. Since chromosomes have
and organisms that are haploid and diploid. Having large numbers of genes on them, missing or gaining
the appropriate number of chromosomes is whole chromosomes can cause more serious gene
important for allowing mitosis and meiosis to dosage problems. Aneuploidy is caused through
occur. Having too many or too few individual
incorrect segregation in meiosis or mitosis, and if
chromosomes, or whole sets of chromosomes, can
there are living organisms with aneuploidy, they
lead to cell replication or fertility problems. In this
often have difficulty with meiosis or mitosis as well.
chapter we will discuss the repercussions of having
too many or too few individual chromosomes, 1. PLOIDY NOTATION
known as aneuploidy, or having multiples of whole
chromosome sets, known as polyploidy. 1.1. NOTATION OF DNA CONTENT AND CHROMOSOME
CONTENT IN DIPLOID ORGANISMS
Most organisms of all kingdoms are haploid or The amount of DNA within a cell changes following
diploid. Occasionally though, particularly in plants, each of the following events: fertilization, DNA
you will see chromosome sets higher than diploid. synthesis, mitosis, and meiosis (Figure 2). We use
This is known as polyploidy. When coming from a “c” to represent the DNA Content in a cell, and “n”
typically diploid plant, and increasing the ploidy in to represent the Number of complete sets of
even numbers, the resulting plant is typically chromosomes. In a gamete (i.e. sperm or egg), the
healthy, and often with larger fruits produced. amount of DNA is 1c, and the number of
However, when increasing to an odd number, it chromosomes is 1n. Upon fertilization, both the
makes it difficult for gamete production and often DNA content and the number of chromosomes
leads to infertility (seedless varieties). doubles to 2c and 2n, respectively. Following DNA
As opposed to polyploidy, where the plant is often replication, the DNA content doubles again to 4c,
healthy, aneuploid plants and animals (losses or but each pair of sister chromatids is still counted as
multiples of individual chromosomes) often see a single chromosome (a replicated chromosome),
chromosome: tetraploid (2n=4x), hexaploid 2.3. MANY CROP PLANTS ARE HEXAPLOID OR OCTOPLOID
(2n=6x), and so on. The reason for this is clear from Polyploid plants tend to be larger and healthier
a consideration of meiosis. Remembering that the than their diploid counterparts. The strawberries
purpose of meiosis is to reduce the sum of the sold in grocery stores come from octoploid (8x)
genetic material by half, meiosis can equally divide strains and are much larger than the strawberries
an even number of chromosome sets, but not an formed by wild diploid strains. An example is bread
odd number. Thus, polyploids with an odd number wheat which is a hexaploid (6x) strain (Figure 4).
of chromosome sets (e.g. triploids, 2n=3x) tend to be This species is derived from the combination of
sterile, even if they are otherwise healthy. three other wheat species, T. monococcum
The mechanism of meiosis in stable polyploids is (chromosome sets = AA), T. searsii (BB), and T.
essentially the same as in diploids: during tauschii (DD). Each of these chromosome sets has 7
metaphase I, homologous chromosomes pair with chromosomes so the diploid species are 2n=2x=14
each other. Depending on the species, all of the and bread wheat is 2n=6x=42 and has the
homologs may be aligned together at metaphase, chromosome sets AABBDD. Bread wheat is viable
or in multiple separate pairs. For example, in a because each chromosome behaves independently
tetraploid, some species may form tetravalents in during mitosis. The species is also fertile because
which the four homologs from each chromosome during meiosis I the A chromosomes pair with the
align together, or alternatively, two pairs of other A chromosomes, and so on. Thus, even in a
homologs may form two bivalents. Note that polyploid, homologous chromosomes can
because mitosis does not involve any pairing segregate equally and gene balance can be
of homologous chromosomes, mitosis is equally maintained.
effective in diploids, even-number polyploids, and
odd-number polyploids.
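The chromosome arithmetic for bread wheat can be written out as a short sketch. The function `describe_polyploid` is hypothetical, and its pairing check is a deliberate simplification: it only asks whether every chromosome set has a partner of the same type, which is the condition the text gives for regular pairing at meiosis I.

```python
def describe_polyploid(genome_sets: str, x: int = 7) -> dict:
    """Summarise a genome written one letter per chromosome set, e.g. "AABBDD".

    x is the number of chromosomes in one set (7 for the wheat species in the text).
    """
    sets = len(genome_sets)
    # Simplified rule: pairing is regular if each type of set occurs an even number of times.
    every_set_has_partner = all(
        genome_sets.count(s) % 2 == 0 for s in set(genome_sets)
    )
    return {
        "ploidy": f"{sets}x",
        "somatic_chromosomes": sets * x,
        "regular_pairing_expected": every_set_has_partner,
    }

print(describe_polyploid("AA"))      # a diploid ancestor: 2n=2x=14
print(describe_polyploid("AABBDD"))  # bread wheat: 2n=6x=42
```

Real meiotic behaviour depends on the species, so the boolean is only a first approximation.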
Figure 4.
Modern bread wheat is hexaploid, but has
been developed from natural cross breeding
between diploid and tetraploid ancestors.
Meiosis still properly occurs, because the
chromosomes from the individual ancestors
still pair together during metaphase, as is
shown with the cartoon chromosomes
below.
(Original-J. Locke-PD)
Wheat: (Wikipedia- Marknesbitt- PD)
2.4. BANANAS, WATERMELONS, AND OTHER SEEDLESS If triploids cannot make seeds, how do we obtain
PLANTS ARE TRIPLOID enough triploid individuals for cultivation? The
The bananas found in grocery stores are a seedless answer depends on the plant species involved. In
variety called Cavendish. They are a triploid variety some cases, such as banana, it is possible to
(chromosome sets = AAA) of a normally diploid propagate the plant asexually; new progeny can
species called Musa acuminata (AA). Cavendish simply be grown from cuttings from a triploid plant.
plants are viable because mitosis can occur. On the other hand, seeds for seedless watermelon
However, they are sterile because the are produced sexually: a tetraploid watermelon
chromosomes cannot pair properly during meiosis plant is crossed with a diploid watermelon plant.
I. During prophase I there are three copies of each Both the tetraploid and the diploid are fully fertile,
chromosome trying to “pair” with each other. and produce gametes with two (1n=2x) or one
Because proper chromosome segregation in (1n=1x) sets of chromosomes, respectively. These
meiosis fails, seeds cannot be made and the result gametes fuse to produce a zygote (2n=3x) that is
is a fruit that is easier to eat because there are no able to develop normally into an adult plant
seeds to spit out. Seedless watermelons (Figure 5) through multiple rounds of mitosis, but is unable to
have a similar explanation. complete normal meiosis or produce seeds.
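The seedless watermelon cross can be followed with a few lines of arithmetic. This is a minimal sketch that assumes each fertile parent passes exactly half of its chromosome sets to every gamete, and that an odd number of sets in the zygote means sterility. The function names are hypothetical.

```python
def gamete_sets(parent_sets: int) -> int:
    """Number of chromosome sets in a gamete from a parent with `parent_sets` sets."""
    if parent_sets % 2:
        raise ValueError("odd-ploidy parents do not make balanced gametes in this simplified model")
    return parent_sets // 2

def cross(mother_sets: int, father_sets: int) -> dict:
    """Combine one gamete from each parent and predict fertility of the offspring."""
    zygote = gamete_sets(mother_sets) + gamete_sets(father_sets)
    return {"zygote_sets": zygote, "expected_fertile": zygote % 2 == 0}

# Tetraploid (4x) crossed with diploid (2x): 2x gamete + 1x gamete = 3x zygote.
print(cross(4, 2))
```

The triploid offspring can grow by mitosis, but in this model it is flagged as sterile because three sets cannot be divided evenly at meiosis.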
Polyploids are often larger in size than their diploid
relatives (Figure 6). This feature is used extensively
in food plants. For example, most strawberries you
eat are not diploid, but octoploid (8x).
Polyploidy in animals is rare, essentially limited to
lower forms, which often reproduce by
parthenogenesis.
4.1. NOMENCLATURE
If something goes wrong during cell division, an
entire chromosome may be lost and the cell will
lack all of these genes. Conversely, an entire
chromosome may be improperly included into the
new cell. These chromosomal abnormalities are
known as aneuploidy, which is the addition or
subtraction of a chromosome from a pair of
homologs. More specifically, the absence of one
member of a pair of homologous chromosomes is
called monosomy (only one remains). On the other
hand, in a trisomy, there are three, rather than two
Figure 8. (disomy), homologs of a particular chromosome.
Endoreduplicated chromosomes from a Drosophila salivary Different types of aneuploidy are sometimes
gland cell. The banding pattern is produced with represented symbolically; if 2n symbolizes the
fluorescent labels. normal number of chromosomes in a cell, then 2n-
(Flickr-Elissa Lei, Ph.D. @ NIH- CC BY 2.0)
1 indicates monosomy and 2n+1 represents
trisomy. The addition or loss of a whole
3. ENDOREDUPLICATION chromosome is a mutation, a change in the
Endoreduplication is a special type of tissue- genotype of a cell or organism. The most widely
specific genome amplification that occurs in many known human aneuploidy is trisomy-21 (i.e. three
types of plant cells and in specialized cells of some copies of chromosome 21), which is one cause of
animals including humans. Endoreduplication does Down syndrome. Most (but not all) other human
not affect the germline or gametes, so species with autosomal aneuploidies are lethal at an early stage
endoreduplication are not considered polyploids. of embryonic development.
Endoreduplication occurs when a cell undergoes
Aneuploidy can arise through a non-disjunction change to one copy (or three copies) of the
event, which is the failure of at least one pair of hundreds or thousands of genes on an entire
chromosomes or chromatids to segregate during chromosome would be more than is tolerable for the
mitosis or meiosis. Non-disjunction will generate daughter cells. They have what is called an
gametes with extra and/or missing chromosomes. unbalanced genotype, which usually kills the cell
Note that aneuploidy usually affects the number of (decreases their viability).
only one type of chromosome and is therefore If a first division or second division nondisjunction
distinct from polyploidy, in which the entire event occurs during meiosis the result is an
chromosome set is duplicated (see previous unbalanced gamete (Figure 10b and c). The gamete
section). Unlike aneuploidy, which is almost always in this case can often be functional, but after
deleterious, polyploidy can be beneficial in some fertilization the embryo will be genetically
organisms, particularly many species of food unbalanced. This usually leads to the death of the
plants. Higher ploidy levels often result in larger cell or embryo at some point in development.
plants and fruits (Figure 6). There are some exceptions to this in humans and
This section will go into the details of the causes of these will be presented later in this chapter.
aneuploidy and the consequences and diseases 5. CHROMOSOME ABNORMALITIES IN HUMANS
associated with them.
The problems described above can affect all
4.2. NONDISJUNCTION DURING MITOSIS OR MEIOSIS eukaryotes, unicellular and multicellular. To better
Segregation occurs in anaphase. In mitosis and understand the consequences let us consider those
meiosis II, sister chromatids (of replicated that affect people. As you will recall, humans are
chromosomes) are normally pulled to opposite 2n=46. The convention when describing a person's
ends of the cell. In Meiosis I, it is homologous karyotype (chromosome composition) is to list the
chromosomes, which are synapsed at that time, total number of chromosomes, then the sex
that segregate and move apart. chromosomes, and then anything out of the
ordinary. Most of us are 46,XX or 46,XY. What
4.3. CONSEQUENCE: DECREASED VIABILITY follows are some examples of chromosome
A non-disjunction event results in daughter cells number and chromosome structure abnormalities.
having an abnormal number of chromosomes.
Cells, such as the parent cell in Figure 9a, which
have the proper number of chromosomes, are said
to be euploid. The daughter cells have one too
many or one too few chromosomes and are called
aneuploid. Even though both product cells have at
least one copy of all genes, both cells will probably
die. The reason is the loss or gain of a large
number of genes on the chromosome. Genes
normally produce a standard amount of product -
either functional RNAs or proteins. The parent cell
shown has a balanced genotype because it has two
copies of all of its genes (on its autosomes). But if
one of these cells suddenly had only one copy (or
three copies) of all the genes on a whole
chromosome, the amount of product would be
either 50% (or 150%) of what was normal. The cell
could probably tolerate such a change for a single
gene and it would probably survive. But the sudden
Figure 9.
Mitosis done successfully (a) and unsuccessfully (b). In (a), correct
segregation gives two 2n daughter cells. In (b), non-disjunction during
anaphase gives one 2n-1 and one 2n+1 daughter cell. The cell is diploid
and the homologs of one chromosome are shown in grey and black.
(Original-L. Canham & M. Harrington- CC BY-NC 3.0)
Figure 10.
Meiosis done successfully (a) and
unsuccessfully (b and c) (Original-L. Canham &
M. Harrington- CC BY-NC 3.0)
6. GENE BALANCE
Why do trisomies, duplications, and other
chromosomal abnormalities that alter gene copy
number often have a negative effect on the normal
development or physiology of an organism? This is
particularly intriguing because in many species,
aneuploidy is detrimental or lethal, while
polyploidy is tolerated or even beneficial. The
answer probably differs in each case, but is
probably related to the concept of gene balance,
which can be summarized as follows: genes, and
the proteins they produce, have evolved to
function in complex metabolic and regulatory
networks. Some of these networks function best
when certain enzymes and regulators are present
in specific ratios to each other. Increasing or
decreasing the gene copy number for just one part
of the network may throw the whole network out
of balance, leading to increases or decreases of
certain metabolites, which may be toxic in high
concentrations or limiting in other important
processes in the cell. The activity of genes and
metabolic networks is regulated in many different
ways besides changes in gene copy number, so
duplication of just a few genes will usually not be
harmful. However, trisomy and large segmental
duplications of chromosomes affect the dosage of
so many genes that cellular networks are unable to
compensate for such changes and an abnormal or
lethal phenotype results.
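The contrast between aneuploidy and polyploidy can be made concrete by comparing the copy number of one chromosome with the copy number of the rest of the genome. This is a minimal sketch of the gene balance idea, assuming gene product is proportional to copy number. The function `relative_dosage` is hypothetical.

```python
from fractions import Fraction

def relative_dosage(copies_of_chromosome: int, copies_of_others: int) -> Fraction:
    """Dosage of genes on one chromosome relative to the rest of the genome (1 = balanced)."""
    return Fraction(copies_of_chromosome, copies_of_others)

cases = {
    "diploid (euploid)": (2, 2),
    "monosomy (2n-1)": (1, 2),
    "trisomy (2n+1)": (3, 2),
    "tetraploid (4x)": (4, 4),
}
for name, (target, rest) in cases.items():
    print(f"{name}: {relative_dosage(target, rest)}")
```

Monosomy and trisomy give ratios of 1/2 and 3/2 (the 50% and 150% described earlier), while the tetraploid keeps a ratio of 1 because every chromosome is increased together.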
___________________________________________________________________________
SUMMARY:
• Aneuploidy results from the addition or subtraction of one or more chromosomes from a group of
homologs, and is usually deleterious to the cell.
• Polyploidy is the presence of more than two complete sets of chromosomes in a genome. Even-
numbered multiple sets of chromosomes can be stably inherited in some species, especially plants.
• Aneuploidy can affect gene balance.
• Errors during anaphase in mitosis or meiosis can lead to trisomy and other forms of aneuploidy.
• Five common forms of aneuploidy in humans are 47,XY,+21 or 47,XX,+21 (Down syndrome), 47,XYY,
47,XXX, 45,X (Turner syndrome) and 47,XXY (Klinefelter syndrome).
KEY TERMS
aneuploidy balanced
polyploidy unbalanced
n first division nondisjunction
c second division nondisjunction
replicated chromosome karyotype
x 46,XX
monoploid 46,XY
sterile 47,sex,+21 (Down syndrome)
tetravalent trisomy
octoploid 47,XYY
hexaploid 47,XXX
triploid monosomy
gene balance 45,X (Turner syndrome)
cellular network pseudo-autosomal region
non-disjunction 47,XXY (Klinefelter syndrome)
euploid
STUDY QUESTIONS
1) Bread wheat (Triticum aestivum) is a hexaploid. 8) How many Barr bodies would you expect to see
Using the nomenclature presented in class, an in cells from people who are:
ovum cell of wheat has n=21 chromosomes.
a) 46, XY,
How many chromosomes in a zygote of bread
b) 46,XX,
wheat?
c) 47, XYY,
2) For a given gene:
d) 47,XXX,
a) What is the maximum number of alleles
e) 45,X,
that can exist in a 2n cell of a given diploid
f) 47,XXY
individual?
9) Why can people survive with trisomy-21
b) What is the maximum number of alleles
(47,sex,+21) but not monosomy-21 (45,XY,-21
that can exist in a 1n cell of a tetraploid
or 45,XX,-21)?
individual?
10) What would happen if there was a
c) What is the maximum number of alleles
nondisjunction event involving chromosome 21
that can exist in a 2n cell of a tetraploid
in a 46,XY zygote?
individual?
d) What is the maximum number of alleles
that can exist in a population?
3)
a) Why is aneuploidy more often lethal than
polyploidy?
b) Which is more likely to disrupt gene
balance: polyploidy or duplication?
4) For a diploid organism with 2n=4
chromosomes, draw a diagram of all of the
possible configurations of chromosomes during
normal anaphase I, with the maternally and
paternally derived chromosomes labeled.
5) For a triploid organism with 2n=3x=6
chromosomes, draw a diagram of all of the
possible configurations of chromosomes at
anaphase I (it is not necessary to label maternal
and paternal chromosomes).
6) For a tetraploid organism with 2n=4x=8
chromosomes, draw all of the possible
configurations of chromosomes during a
normal metaphase.
7) Make a diagram showing how a nondisjunction
event can lead to a child with a 47,XYY
karyotype.
GENE INTERACTIONS – CHAPTER 26
Figure 1.
Coat color in mammals is an
example of a phenotypic trait that
is controlled by more than one
locus (polygenic) and the alleles at
these loci can interact to alter the
expected Mendelian ratios.
(Flickr-David Blaikie- CC BY 2.0)
INTRODUCTION If the inheritance of seed color was truly
independent of seed shape, then when the F1
The principles of genetic analysis that we have
dihybrids were crossed to each other, a 3:1 ratio of
described for a single locus
one trait should be observed within each
(dominance/recessiveness) can be extended to the
phenotypic class of the other trait (Figure 2). Using
study of alleles at two different loci. While the
the product rule, we would therefore predict that if
analysis of two loci concurrently is required for
¾ of the progeny were green, and ¾ of the progeny
genetic mapping, it can also reveal interactions
were round, then ¾ × ¾ = 9/16 of the progeny
between genes that affect the phenotype.
would be both round and green. Likewise, ¾ × ¼ =
Understanding these interactions is very useful for
3/16 of the progeny would be both round and
both basic and applied research. Before discussing
yellow, and so on. By applying the product rule to
these interactions, we will first revisit Mendelian
all of these combinations of phenotypes, we can
inheritance for two loci.
predict a 9:3:3:1 phenotypic ratio among the
1. MENDELIAN DIHYBRID CROSSES progeny of a dihybrid cross, if certain conditions
are met, including the independent segregation of
1.1. MENDEL’S SECOND LAW (A QUICK REVIEW ) the alleles at each locus. Indeed, 9:3:3:1 is very
To analyze the segregation of two traits (e.g. colour, close to the ratio Mendel observed in his studies of
wrinkle) at the same time, in the same individual, dihybrid crosses, leading him to state his Second
Mendel crossed a pure breeding line of green, Law, the Law of Independent Assortment, which
wrinkled peas with a pure breeding line of yellow, we now express as follows: two loci assort
round peas to produce F1 progeny that were all independently of each other during gamete
green and round, and which were also dihybrids; formation.
they carried two alleles at each of two loci (Figure 2)
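The product rule calculation behind the 9:3:3:1 ratio can be sketched as follows. It assumes two independently assorting loci, here given the generic labels A and B, each showing a 3:1 phenotypic ratio on its own. The notation A_ means at least one dominant allele.

```python
from fractions import Fraction
from itertools import product

# Monohybrid expectation at each locus: 3/4 dominant phenotype, 1/4 recessive.
locus_a = {"A_": Fraction(3, 4), "aa": Fraction(1, 4)}
locus_b = {"B_": Fraction(3, 4), "bb": Fraction(1, 4)}

def dihybrid_classes(first: dict, second: dict) -> dict:
    """Product rule: multiply the probabilities of independent events."""
    return {
        a + b: pa * pb
        for (a, pa), (b, pb) in product(first.items(), second.items())
    }

classes = dihybrid_classes(locus_a, locus_b)
for phenotype, probability in classes.items():
    print(phenotype, probability)
```

The four classes come out as 9/16, 3/16, 3/16, and 1/16, which sum to 1 and give the 9:3:3:1 ratio.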
the typical four phenotypic classes will be observed with epistasis. As we have already discussed, in the absence of epistasis, there are four phenotypic classes among the progeny of a dihybrid cross. The four phenotypic classes correspond to the genotypes: A_B_, A_bb, aaB_, and aabb. If either of the singly homozygous recessive genotypes (i.e. A_bb or aaB_) has the same phenotype as the double homozygous recessive (aabb), then a 9:3:4 phenotypic ratio will be obtained.
For example, in the Labrador Retriever breed of dogs (Figure 4), the B locus encodes a gene for an important step in the production of melanin. The dominant allele, B, is more efficient at pigment production than the recessive b allele, thus B_ hair appears black, and bb hair appears brown. A second locus, which we will call E, controls the deposition of melanin in the hairs. At least one functional E allele is required to deposit any pigment, whether it is black or brown. Thus, all retrievers that are ee fail to deposit any melanin (and so appear pale yellow-white), regardless of the genotype at the B locus (Figure 4, right side). The ee genotype is therefore said to be epistatic to both the B and b alleles, since the homozygous ee phenotype masks the phenotype of the B locus. The B/b locus is said to be hypostatic to the ee genotype. Because the masking allele in this case is recessive, this is called recessive epistasis. A table showing all the possible progeny genotypes and their phenotypes is shown in Figure 5.

Figure 5.
Genotypes and phenotypes among the progeny of a dihybrid cross of Labrador Retrievers heterozygous for two loci affecting coat color. The phenotypes of the progeny are indicated by the shading of the cells in the table: black coat (black, E_B_); chocolate coat (brown, E_bb); yellow coat (yellow, eeB_ or eebb). (Original-Locke-CC BY-NC 3.0)

2.2. DOMINANT EPISTASIS
In some cases, a dominant allele at one locus may mask the phenotype of a second locus. This is called dominant epistasis, which produces a segregation ratio of 12:3:1, which can be viewed as a modification of the 9:3:3:1 ratio in which the A_B_ class is combined with one of the other genotypic classes (9+3) that contains a dominant allele. One of the best known examples of a 12:3:1 segregation ratio is fruit color in some types of squash (Figure 6). Alleles of a locus that we will call B produce either yellow (B_) or green (bb) fruit. However, in the presence of a dominant allele at a second locus that we call A, no pigment is produced at all, and fruit are white. The dominant A allele is therefore epistatic to both B and bb combinations (Figure 7). One possible biological interpretation of this segregation pattern is that the function of the A allele somehow blocks an early stage of pigment synthesis, before either yellow or green pigments are produced.

Figure 6.
Green, yellow, and white fruits of squash. (Flickr-Unknown-CC BY-NC 3.0)

Figure 7.
Genotypes and phenotypes among the progeny of a dihybrid cross of squash plants heterozygous for two loci affecting fruit color. (Original-Deyholos-CC BY-NC 3.0)
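The two modified ratios described above (9:3:4 for Labrador coat color and 12:3:1 for squash fruit color) can be reproduced with the product rule. This is a rough sketch, assuming both parents are heterozygous at both loci and that the loci assort independently; the function names are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict
from fractions import Fraction

# Single-locus class probabilities from a monohybrid cross (e.g. Aa x Aa)
DOMINANT = Fraction(3, 4)   # at least one dominant allele (A_)
RECESSIVE = Fraction(1, 4)  # homozygous recessive (aa)


def labrador_coat(e_dominant, b_dominant):
    """Recessive epistasis: ee masks whatever is present at the B locus."""
    if not e_dominant:
        return "yellow"
    return "black" if b_dominant else "brown"


def squash_fruit(a_dominant, b_dominant):
    """Dominant epistasis: A_ masks whatever is present at the B locus."""
    if a_dominant:
        return "white"
    return "yellow" if b_dominant else "green"


def phenotypic_ratio(phenotype_of):
    """Sum the probabilities of the four genotypic classes by phenotype, in sixteenths."""
    totals = defaultdict(Fraction)
    for first in (True, False):
        for second in (True, False):
            p = (DOMINANT if first else RECESSIVE) * (DOMINANT if second else RECESSIVE)
            totals[phenotype_of(first, second)] += p
    return {name: int(p * 16) for name, p in totals.items()}


labrador = phenotypic_ratio(labrador_coat)
squash = phenotypic_ratio(squash_fruit)
print(labrador)
print(squash)
```

Only the phenotype function changes between the two cases; the underlying 9:3:3:1 genotypic classes are the same, and the masking locus decides which classes are merged.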
Figure 8.
Red (left) and white (right) wheat seeds.
(cropwatch.unl.edu?-pending?)

Figure 9.
Genotypes and phenotypes among the progeny of a dihybrid cross of wheat plants heterozygous for two loci affecting seed color. (Original-Deyholos-CC BY-NC 3.0)

Figure 10.
a) A simplified biochemical pathway showing complementary gene action of A and B. Note that in this case, the same phenotypic ratios would be obtained if gene B acted before gene A in the pathway.
b) biochemical pathway showing two subunits of one enzyme
c) biochemical pathway showing one transcription factor and one enzyme
(Original-Deyholos/Kang-CC BY-NC 3.0)
Figure 13 – Dominant Suppression.
Drosophila cross and its Punnett square showing the effects of dominant suppression of Su gene on the white gene. Note that A = white+, a = white mottled, B = Su+, b = Su-, and ___ (blank) = any allele. (Original-Kang-CC BY-NC 3.0)

Figure 14 – Recessive Suppression.
Drosophila cross and its Punnett square showing the effects of recessive suppression of Su gene on the white gene. Note that A = white+, a = white mottled, B = Su+, b = Su-, and ___ (blank) = any allele. (Original-Kang-CC BY-NC 3.0)
2.7. RECESSIVE SUPPRESSION
On the other hand, in recessive suppression, the mutant suppressor allele is recessive to the wild type suppressor allele. Therefore, two of the mutant alleles are needed to suppress the wm (mottled) phenotype. For example, in Figure 14, flies that have at least one w+ allele will show a wild-type phenotype. Also, flies that have su-/su- alleles will have a wild-type phenotype, since two mutant alleles can suppress the white gene mutation. On the other hand, flies that have the wm/wm alleles will have a mottled phenotype unless they are homozygous for the su- allele. If w+/wm; Su+/Su- flies are crossed together, the ratio of white+ (wild type) to white mottled (mutant) would be 13:3.
2.8. SUMMARY
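Table 1. Summary table showing gene interactions and their genotypic and phenotypic ratios.
The genotypic classes among the progeny of a dihybrid cross are 9 A_B_ : 3 A_bb : 3 aaB_ : 1 aabb.

Interaction | Phenotypic classes | Ratio
None | 9 A_B_ : 3 A_bb : 3 aaB_ : 1 aabb | 9:3:3:1
Recessive epistasis of aa acting on B and b alleles | 9 A_B_ : 3 A_bb : 4 (aaB_ + aabb) | 9:3:4
Dominant epistasis of A acting on B and b alleles | 12 (A_B_ + A_bb) : 3 aaB_ : 1 aabb | 12:3:1
Duplicate genes | 15 (A_B_ + A_bb + aaB_) : 1 aabb | 15:1
Complementary genes | 9 A_B_ : 7 (A_bb + aaB_ + aabb) | 9:7
Recessive suppression by aa acting on bb | 13 (A_B_ + aaB_ + aabb) : 3 A_bb | 13:3
Dominant suppression by A acting on bb | 15 (A_B_ + A_bb + aaB_) : 1 aabb | 15:1

Genotypic classes joined by "+" are combined into a single phenotypic class.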
3. EXAMPLE OF MULTIPLE GENES AFFECTING ONE CHARACTER (POLYGENIC INHERITANCE)

3.1. CONTINUOUS VARIATION
Most of the phenotypic traits commonly used in introductory genetics are qualitative, meaning that the phenotype exists in only two (or possibly a few more) discrete, alternative forms, such as either purple or white flowers, or red or white eyes. These qualitative traits are therefore said to exhibit discrete variation. On the other hand, many interesting and important traits exhibit continuous variation; these exhibit a continuous range of phenotypes that are usually measured quantitatively, such as intelligence, body mass, blood pressure in animals (including humans), and yield, water use, or vitamin content in crops. Traits with continuous variation are often complex, and do not show the simple Mendelian segregation ratios (e.g. 3:1) observed with some qualitative traits. Many complex traits are also influenced heavily by the environment. Nevertheless, complex traits can often be shown to have a component that is heritable, and which must therefore involve one or more genes.
How can genes, which are inherited (in the case of a diploid) as at most two variants each, explain the wide range of continuous variation observed for many traits? The lack of an immediately obvious explanation to this question was one of the early objections to Mendel's explanation of the mechanisms of heredity. However, upon further consideration, it becomes clear that the more loci that contribute to a trait, the more phenotypic classes may be observed for that trait (Figure 15).
Figure 15.
Punnett Squares for one, two, or three loci. We are using a simplified example of up to three semi-dominant genes, and in each
case the effect on the phenotype is additive, meaning the more “upper case” alleles present, the stronger the phenotype.
Comparison of the Punnett Squares and the associated phenotypes shows that under these conditions, the larger the number of
genes that affect a trait, the more intermediate phenotypic classes that will be expected. (Original-Deyholos-CC BY-NC 3.0)
Figure 16.
The more loci that affect a trait, the larger the number of phenotypic classes that can be expected. For some traits, the number
of contributing loci is so large that the phenotypic classes blend together in apparently continuous variation. (Original-Deyholos-
CC BY-NC 3.0)
If the number of phenotypic classes is sufficiently large (as with three or more loci), individual classes may become indistinguishable from each other (particularly when environmental effects are included), and the phenotype appears as a continuous variation (Figure 16). Thus, quantitative traits are sometimes called polygenic traits, because it is assumed that their phenotypes are controlled by the combined activity of many genes. Note that this does not imply that each of the individual genes has an equal influence on a polygenic trait – some may have major effect, while others only minor. Furthermore, any single gene may influence more than one trait, whether these traits are quantitative or qualitative traits.
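The growth in the number of phenotypic classes with the number of loci can be sketched numerically. The example below assumes the simplified model used in Figure 15 (semi-dominant alleles with equal, additive effects, independent assortment) and a cross between two individuals heterozygous at every locus; the function name is hypothetical.

```python
from math import comb


def additive_class_counts(n_loci):
    """Punnett-square cells per phenotypic class, where the class is the
    number of upper-case alleles (0 to 2n) carried by the offspring."""
    return [comb(2 * n_loci, k) for k in range(2 * n_loci + 1)]


for n in (1, 2, 3):
    counts = additive_class_counts(n)
    # The square has 4**n cells, spread over 2n + 1 phenotypic classes.
    print(n, len(counts), counts, sum(counts))
```

Under these assumptions one locus gives 3 classes (1:2:1), two loci give 5 classes (1:4:6:4:1), and three loci give 7 classes, with the intermediate classes always the most frequent.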
4. ENVIRONMENTAL FACTORS
The phenotypes described thus far have a nearly perfect correlation with their associated genotypes; in other words an individual with a particular genotype always has the expected phenotype. However, many (most?) phenotypes are not determined entirely by genotype alone. Instead, they are determined by an interaction between genotype and environmental factors and can be conceptualized in the following relationship:

Genotype + Environment ⇒ Phenotype (G + E ⇒ P)
Or:
Genotype + Environment + Interaction_GE ⇒ Phenotype (G + E + I_GE ⇒ P)
*GE = Genetics and Environment

This interaction is especially relevant in the study of economically important phenotypes, such as human diseases or agricultural productivity. For
4.2. EXPRESSIVITY
Expressivity describes the variability in mutant
phenotypes observed in individuals with a
particular genotype (Figure 18 and Figure 19).
Many human genetic diseases provide examples of
broad expressivity, since individuals with the same
genotypes may vary greatly in the severity of their
symptoms. Incomplete penetrance and broad
expressivity are due to random chance, non-
genetic (environmental), and genetic factors
(mutations in other genes).
(1) Genetic heterogeneity: There is more than one gene or genetic mechanism that can produce the same phenotype.
(2) Polygenic determination: One phenotypic trait is controlled by multiple genes.
(3) Phenocopy: Organisms that do not have the genotype for trait A can also express trait A due to environmental conditions; they do not have the same genotype but the environment simply “copies” the genetic phenotype.
(4) Incomplete penetrance: even though an organism possesses the genotype for trait A, it might not be expressed with 100% effect.
(5) Certain genotypes show a survival rate that is less than 100%. For example, genotypes that cause death, recessive lethal mutations, at the embryo or larval stage will be underrepresented when adult flies are counted.

5.2. THE Χ2 TEST FOR GOODNESS-OF-FIT
For a variety of reasons, the phenotypic ratios observed from real crosses rarely match the exact ratios expected based on a Punnett Square or other prediction techniques. There are many possible explanations for deviations from expected ratios. Sometimes these deviations are due to sampling effects, in other words, the random selection of a non-representative subset of individuals for observation.
A statistical procedure called the chi-square (χ2) test can be used to help a geneticist decide whether the deviation between observed and expected ratios is due to sampling effects, or whether the difference is so large that some other explanation must be sought by re-examining the assumptions used to calculate the expected ratio. The procedure for performing a chi-square test is typically covered in the lab.
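As a rough illustration of the chi-square goodness-of-fit calculation, the sketch below compares a set of observed progeny counts with the counts expected under a 9:3:3:1 ratio. The observed counts are hypothetical numbers chosen for this example, the function name is an assumption, and the critical value 7.815 is the standard table value for 3 degrees of freedom at p = 0.05; the procedure taught in the lab may differ in detail.

```python
def chi_square(observed, expected_ratio):
    """Return the chi-square statistic and the expected counts.

    chi2 = sum over classes of (observed - expected)**2 / expected
    """
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, expected


# Hypothetical counts for the four phenotypic classes of a dihybrid cross
observed = [315, 108, 101, 32]
statistic, expected = chi_square(observed, [9, 3, 3, 1])

degrees_of_freedom = len(observed) - 1  # number of classes minus one
CRITICAL_0_05_DF3 = 7.815               # table value for df = 3, p = 0.05
consistent = statistic < CRITICAL_0_05_DF3

print(round(statistic, 3), degrees_of_freedom, consistent)
```

A statistic below the critical value means the deviation is small enough to be attributed to sampling effects, so the expected ratio is not rejected.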
___________________________________________________________________________
SUMMARY:
• Phenotype depends on the alleles that are present, their dominance relationships, and sometimes also
interactions with the environment and other factors.
• The alleles of different loci are inherited independently of each other, unless they are genetically linked.
• Many important traits show continuous, rather than discrete variation. These are called quantitative traits.
• Many quantitative traits are influenced by a combination of environment and genetics.
• The expected phenotypic ratio of a dihybrid cross is 9:3:3:1, except in cases of linkage or gene interactions
that modify this ratio.
• Modified ratios from 9:3:3:1 are seen in the case of recessive and dominant epistasis, duplicate genes, and
complementary gene action. This usually indicates that the two genes interact within the same biological
pathway.
• There are other factors that alter the expected Mendelian ratios.
KEY TERMS:
Mendel’s Second Law
independent assortment
linkage
dihybrid
Modified Mendelian Ratios
9:3:3:1
9:3:4
12:3:1
recessive epistasis
dominant epistasis
complementary action
redundancy
duplicate gene action
long
dilute
White masking
piebald spotting
calico
Orange
Discrete variation
Continuous variation
Polygenic traits
G + E = P
penetrance
expressivity
recessive lethal mutations
STUDY QUESTIONS:
1) In the table on the opposite page, match the mouse hair color phenotypes with the term from the list that best explains the observed phenotype, given the genotypes shown. In this case, the allele symbols do not imply anything about the dominance relationships between the alleles. List of terms: haplosufficiency, haploinsufficiency, pleiotropy, incomplete dominance, co-dominance, incomplete penetrance, broad (variable) expressivity.

Answer questions 2-4 using the following biochemical pathway for fruit color. Assume all mutations (lower case allele symbols) are recessive, and that either precursor 1 or precursor 2 can be used to produce precursor 3. If the alleles for a particular gene are not listed in a genotype, assume that they are wild-type.

2) If 1 and 2 and 3 are all colorless, and 4 is red, what will be the phenotypes associated with the following genotypes?
a) aa
b) bb
c) dd
d) aabb
e) aadd
f) bbdd
g) aabbdd
h) What will be the phenotypic ratios among the offspring of a cross AaBb × AaBb?
i) What will be the phenotypic ratios among the offspring of a cross BbDd × BbDd?
j) What will be the phenotypic ratios among the offspring of a cross AaDd × AaDd?
3) If 1 and 2 are both colorless, and 3 is blue and 4 is red, what will be the phenotypes associated with the following genotypes?
a) aa
b) bb
c) dd
d) aabb
e) aadd
f) bbdd
g) aabbdd
h) What will be the phenotypic ratios among the offspring of a cross AaBb × AaBb?
i) What will be the phenotypic ratios among the offspring of a cross BbDd × BbDd?
j) What will be the phenotypic ratios among the offspring of a cross AaDd × AaDd?
4) If 1 is colorless, 2 is yellow and 3 is blue and 4 is red, what will be the phenotypes associated with the following genotypes?
a) aa
b) bb
c) dd
d) aabb
e) aadd
f) bbdd
g) aabbdd
h) What will be the phenotypic ratios among the offspring of a cross AaBb × AaBb?
i) What will be the phenotypic ratios among the offspring of a cross BbDd × BbDd?
j) What will be the phenotypic ratios among the offspring of a cross AaDd × AaDd?
5) Which of the situations in questions 2 – 4 demonstrate epistasis?
6) If the genotypes written within the Punnett Square are from the F2 generation, what would be the phenotypes and genotypes of the F1 and P generations for:
a) Figure 5
b) Figure 7
c) Figure 9
d) Figure 11
7) To better understand how genes control the development of three-dimensional structures, you conducted a mutant screen in Arabidopsis plant and identified a recessive point mutation allele of a single gene (g) that causes leaves to develop as narrow tubes rather than the broad flat surfaces that develop in wild-type (G). Allele g causes a complete loss of function. Now you want to identify more genes involved in the same process. Diagram a process you could use to identify other genes that interact with gene g. Show all of the possible genotypes that could arise in the F1 generation.
8) With reference to question 7, if the recessive allele, g is mutated again to make allele g*, what are the possible phenotypes of a homozygous g* g* individual?
9) Again, in reference to question 8, what are the possible phenotypes of a homozygous aagg individual, where a is a recessive allele of a second gene? In each case, also specify the phenotypic ratios that would be observed among the F1 progeny of a cross of AaGg x AaGg
10) Calculate the phenotypic ratios from a dihybrid cross involving the two loci shown in Figure 17. There may be more than one possible set of ratios, depending on the assumptions you make about the phenotype of allele b.
11) Use the product rule to calculate the phenotypic ratios expected from a trihybrid cross. Assume independent assortment and no epistasis/gene interactions.
Table for Question 1

  | A1A1 | A1A2 | A2A2
1 | all hairs black | on the same individual: 50% of hairs are all black and 50% of hairs are all white | all hairs white
2 | all hairs black | all hairs are the same shade of grey | all hairs white
3 | all hairs black | all hairs black | 50% of individuals have all white hairs and 50% of individuals have all black hairs
4 | all hairs black | all hairs black | mice have no hair
5 | all hairs black | all hairs white | all hairs white
6 | all hairs black | all hairs black | all hairs white
7 | all hairs black | all hairs black | hairs are a wide range of shades of grey
PAGE 16 OPEN GENETICS LECTURES – FALL 2017
PHYSICAL MAPPING OF CHROMOSOMES AND GENOMES – CHAPTER 27
Figure 1.
Genetic map of human chromosome 1
showing a region expanded to the point of
showing the genes within that region.
(Wikipedia- SpelgroepPhoenix-CC BY-SA 3.0)
INTRODUCTION
Chromosomes are long duplex molecules of DNA that are either linear or circular and composed of a relatively constant sequence of nucleotides. There are three different ways of describing the linear contents of a chromosome: (1) genetic map, (2) cytogenetic map, and (3) physical map (ultimately the sequence).

1. GENETIC MAP (DISTANCE IN CM, RECOMBINATION FREQUENCY)
In Chapter 18, we described the units of genetic distance (map units / centiMorgans, cM) and how this relates to recombination frequency. We can use this information in order to produce a genetic map, which is a “map” that shows the locations of genes along a linear chromosome. Note that map distances are always calculated for one pair of loci at a time. However, by combining the results of multiple pair-wise calculations, a genetic map of many loci on a chromosome can be produced (Figure 2). A genetic map shows the map distance, in cM, that separates any two loci, and the position of these loci relative to all other mapped loci. The genetic map distance is roughly proportional to the physical distance, i.e. the amount of DNA between two loci. For example, in Arabidopsis, 1.0 cM corresponds to approximately 150,000 bp and contains approximately 50 genes. The exact number of DNA bases in a cM depends on the organism, and on the particular position in the chromosome. Some parts of chromosomes (“crossover hot spots”) have higher rates of recombination than others, while other regions have reduced crossing over and often correspond to large regions of heterochromatin.
When a novel gene or locus is identified by mutation or polymorphism, its approximate position on a chromosome can be determined by crossing it with previously mapped genes, and then calculating the recombination frequency. If the novel gene and the previously mapped genes show complete or partial linkage, the recombination frequency will indicate the approximate position of the novel gene within the genetic map. This information is useful in isolating (i.e. cloning) the specific fragment of DNA that encodes the novel gene, through a process called map-based cloning. Genetic maps are also useful to track genes/alleles in breeding crops and animals, in studying evolutionary relationships between species, and in determining the causes and individual susceptibility of some human diseases.

Figure 2.
Genetic maps for regions of two chromosomes from two species of the silk moth, Bombyx. The scale at left shows distance in cM, and the position of various loci is indicated on each chromosome. Diagonal lines connecting loci on different chromosomes show the position of corresponding loci in different species. This is referred to as regions of conserved synteny.
(NCBI-NIH-PD)

2. CYTOGENETIC MAP
Each eukaryotic species has its nuclear genome divided among a number of chromosomes that is characteristic of that species. For example, a haploid human nucleus (i.e. sperm or egg) normally has 23 chromosomes (n=23), and a diploid human nucleus has 23 pairs of chromosomes (2n=46). A karyotype is the complete set of chromosomes of an individual. In Figure 3, the cell was in metaphase so each of the 46 structures is a replicated chromosome even though it is hard to see the two sister chromatids for each chromosome at this resolution. As expected there are 46 chromosomes. Note that the chromosomes have different lengths. In fact, human chromosomes were named based upon this feature. Our longest chromosome is called 1, our next longest is 2, and so on.

2.1. CENTROMERE LOCATION
A chromosome has telomeres and a centromere, which are usually in a heterochromatic state. The centromere consists of DNA sequences that are bound by centromeric proteins that link the centromere to microtubules. The centromere can be in the middle (metacentric), near the middle (submetacentric), near the end (acrocentric), or at the end (telocentric) of the chromosome, or the entire chromosome can act as a centromere (holocentric). Telomeres are repetitive sequences, such as TTAGGG, at the ends of the chromosomes that help maintain the length of the chromosome. Each chromosome also has a p arm (petite = small) and a q arm (queue = tail, or just the next letter in the alphabet).*

Table 1. Table showing four types of centromere location.
(Original-Harrington/Kang-CC BY-NC 3.0)

* See https://thednaexchange.com/2011/05/02/p-q-solved-being-the-true-story-of-how-the-chromosome-got-its-name/
Figure 3.
Karyogram of a normal human male karyotype.
(Wikipedia-NHGRI-PD)

2.2. KARYOGRAM
By convention the chromosomes are arranged into the pattern shown in Figure 3 and the resulting image is called a karyogram. A karyogram allows a geneticist to determine a person's karyotype - a written description of their chromosomes including anything out of the ordinary. Therefore, a karyotype is a description of the complete set of chromosomes, and a karyogram is an image that visually describes the karyotype.

Figure 4.
Fictional diagram of a human chromosome and its bands. A chromosome has a p arm and a q arm, both of which are divided into regions. These regions are divided into bands, and the bands are subdivided into sub-bands. The bands are numbered away from the centromere, and sub-band numbering restarts within each band. Notice that this fictional diagram was made for educational purposes.
(Original-Kang-CC BY-NC 3.0)
approximately 100kb fragments. Typically the set of sequences in a BAC clone library will contain redundant, overlapping sequences, meaning that different clones will contain DNA from the same part of the genome. Because of these overlaps, it is possible to select the minimum set of clones that represent the entire genome, and to order these clones respective to the sequence of the original chromosome. Note that this is all to be done without knowing the complete sequence of each BAC. A set of overlapping clones is called a contig. Making a contig map can rely on techniques related to Southern blotting: DNA from the ends of one BAC is used as a probe to find clones that contain the same sequence in another, overlapping BAC clone. These clones are then assumed to overlap each other. This process of finding overlaps can progress to position all the clones into overlapping series that span the genome. Also, if we already know the sequence of one strain of a simple organism, it can be used as a reference for mutant strains and can identify the differences in the sequences.
A small genome, such as that of Lambda, is only 48kb long, but most chromosomes are Mb long. Currently, the only way to construct physical maps of large regions is through the joining of smaller regions to map a larger or whole portion of the chromosome. In order to do this, multiple copies of the chromosome have to be broken down into small pieces of different lengths and frames using restriction enzymes, so that the pieces partially overlap with each other. The continual overlaps of the fragments will eventually form a whole map of the chromosome. This contiguous assembly of clones is called a contig.

3.3. RESTRICTION MAPPING PROCEDURE
Restriction mapping is an inexpensive, quick, and easy method to describe a sample of cloned DNA. It is preferred over DNA sequencing for these reasons, but the sequence is still the ultimate description.
Restriction mapping is the technique for identifying the location of restriction sites, relative to other sites on a DNA molecule. Typically a sample of purified cloned DNA is aliquoted into several tubes and each is treated with a different restriction enzyme or combination of enzymes. The resulting fragments are then separated by agarose gel electrophoresis and the restriction fragment sizes determined by comparison to known size markers. By trial and error, the combination of fragments can be assembled like a linear jigsaw puzzle into a map of the restriction sites – a restriction map (Figure 7). One can increase the resolution of the restriction site map by mapping more restriction sites.

3.4. USES OF A RESTRICTION MAP
Restriction mapping is a quick, easy and inexpensive way to characterize and distinguish DNA samples without actually sequencing the DNA; sequences can be represented by a series of restriction sites and, using this knowledge, one can tell if the DNA of interest is similar to or different from others by comparing their degree of overlap. Also, restriction sites offer positions for convenient manipulation of the DNA. Restriction fragments that contain the gene of interest can be cut out and once the gene is purified from the fragments, it can be sequenced or used as a probe. This is the reason why restriction mapping is still routinely used today, even though sequencing technologies allow us to sequence the whole genome.

Figure 5.
A portion of the physical map for human chromosome 4. The entire chromosome is shown at left. The physical map is derived from the small blue lines, each of which represents a cloned piece of DNA approximately 100kb in length. (NCBI-unknown-PD)
Figure 6.
A series of overlapping cloned sequences can
be combined to eventually span much larger
regions, including whole chromosomes .
(Original-Locke-CC BY-NC 3.0)
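The idea of choosing a minimum set of overlapping clones that spans a region can be sketched as a greedy tiling problem. This is only an illustration: it assumes the start and end position of each clone (in kb) is already known, whereas in practice overlaps are inferred experimentally; the clone names, coordinates, and function name are all hypothetical.

```python
def minimal_tiling_path(clones, region_start, region_end):
    """Greedily pick the fewest clones covering [region_start, region_end].

    clones: list of (name, start, end) tuples, positions in kb.
    """
    path = []
    covered_to = region_start
    while covered_to < region_end:
        # Clones that overlap the covered part and extend beyond it
        candidates = [c for c in clones if c[1] <= covered_to < c[2]]
        if not candidates:
            raise ValueError(f"gap in clone coverage at {covered_to} kb")
        best = max(candidates, key=lambda c: c[2])  # reaches furthest along the region
        path.append(best[0])
        covered_to = best[2]
    return path


# Hypothetical BAC clones, each about 100 kb, covering a 300 kb region
clones = [
    ("BAC1", 0, 100),
    ("BAC2", 40, 140),
    ("BAC3", 90, 190),
    ("BAC4", 120, 220),
    ("BAC5", 180, 300),
]
tiling = minimal_tiling_path(clones, 0, 300)
print(tiling)
```

Here three of the five clones are enough to form a contig across the region; the remaining clones are redundant because they lie entirely within the overlaps.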
Figure 7.
By looking at the size of the fragments produced by one restriction enzyme or combination of the restriction enzymes, the
location and the order of the restriction site on a chromosome can be identified, forming a restriction map.
(Original-Locke-CC BY-NC 3.0)
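Working a restriction map in the forward direction can help when solving mapping problems: given a proposed map, predict the fragment sizes for each single and double digest and compare them with the gel. The sketch below assumes a linear DNA molecule; the molecule length, the site positions, and the enzyme labels X and Y are hypothetical.

```python
def digest(length, *site_lists):
    """Fragment sizes (largest first) after cutting a linear molecule.

    length: size of the molecule in bp.
    site_lists: one list of cut positions per enzyme used in the digest.
    """
    cuts = sorted({pos for sites in site_lists for pos in sites})
    edges = [0] + cuts + [length]
    return sorted((b - a for a, b in zip(edges, edges[1:])), reverse=True)


# Hypothetical 10,000 bp linear molecule with sites for two enzymes
LENGTH = 10_000
enzyme_x = [3_000]          # one site
enzyme_y = [2_000, 7_000]   # two sites

single_x = digest(LENGTH, enzyme_x)
single_y = digest(LENGTH, enzyme_y)
double_xy = digest(LENGTH, enzyme_x, enzyme_y)
print(single_x, single_y, double_xy)
```

If the predicted fragment sizes for every digest match the bands on the gel, the proposed map is consistent with the data; if not, the site positions are rearranged and the prediction repeated, which is the trial-and-error step described above.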
SUMMARY:
• There are different types of chromosome maps: genetic (recombination), cytogenetic (metaphase
chromosome), and physical maps.
• Recombination frequency is usually proportional to the distance between loci, and so recombination
frequencies can be used to create genetic maps.
• Chromosomes can be distinguished cytologically based on their length, centromere position, and
banding patterns when stained with dyes.
• Single clones can be restriction mapped and then combined into a contig that represents a larger
region of DNA, ultimately the whole chromosome.
• The ultimate physical map is the DNA sequence of the whole chromosome or genome.
KEY TERMS:
map units karyogram
centiMorgans contig
genetic map physical map
recombination frequency restriction map
map-based cloning contig construction
karyotype
STUDY QUESTIONS:
1) Three loci are linked in the order B-C-A. If the A-B map distance is 1cM, and the B-C map distance is 0.6cM,
given the lines AaBbCc and aabbcc, what will be the frequency of Aabb genotypes among their progeny if
one of the parents of the dihybrid had the genotypes AABBCC?
2) Given the restriction digests and with the fragment sizes shown in the gel diagram, can you
construct a map of this linear DNA molecule (Lambda DNA)?
TIPS ON SOLVING RESTRICTION MAPPING QUESTIONS
1) Start with the Nar and Apa digests, each has only one site. This will help get a simple starting map.
2) Next, try and add in the Cvn sites using the double digests with Nar and Cvn.
3) Next, try and add in the Kpn sites using the double digests with Apa and Cvn.
4) There is no formal method to solve these maps. Instead, think of them like a jigsaw puzzle (only linear)
and use trial and error to solve the puzzle.
5) Use the class notes for help.
RESTRICTION MAPPING AND GEL ELECTROPHORESIS – CHAPTER 28
Figure 1.
Restriction enzymes that are
available on a vending machine.
(Flickr- Jun Seita CC BY-NC 2.0)
INTRODUCTION
Molecular Genetics techniques involve the isolation, purification, and manipulation of DNA. DNA can come in the form of genomic DNA, plasmids, or oligonucleotides.

1. ISOLATING DNA
DNA purification strategies rely on the chemical properties of DNA that distinguish it from other molecules in the cell, namely that it is a very long, negatively charged molecule. To extract purified DNA from a tissue sample, cells are broken open by (1) grinding or lysing in a solution that contains chemicals that protect the DNA while disrupting other components of the cell (Figure 2). These chemicals may include detergents, which dissolve lipid membranes and denature proteins. A cation such as Na+ helps to stabilize the negatively charged DNA and separate it from proteins such as histones. (2) A chelating agent, such as EDTA, is added to protect DNA by sequestering Mg2+ ions, which can otherwise serve as a necessary co-factor for nucleases (enzymes that digest DNA). As a result, free, double-stranded DNA molecules are released from the chromatin into the extraction buffer, which also contains proteins and all other cellular components. (The basics of this procedure can be done with household chemicals and are presented on YouTube.)
The free DNA molecules are subsequently isolated by one of several methods. (3) Commonly, proteins are removed by adjusting the salt concentration so they precipitate. (4) The supernatant, which contains DNA and other, smaller metabolites, is then mixed with ethanol, which causes the DNA to precipitate. (5) A small pellet of DNA can be collected by centrifugation, and (6) after removal of the ethanol, the DNA pellet can be dissolved in water (usually with a small amount of EDTA and a pH buffer) for use in other reactions. Note that this process has purified all of the DNA from a tissue sample; if we want to further isolate a specific gene or DNA fragment, we must use additional techniques, such as PCR.
Figure 2.
Extraction of DNA from a mixture of solubilized cellular components by successive precipitations. Proteins are precipitated, then DNA (in the supernatant) is precipitated in ethanol, leaving a pellet of DNA.
(Original-Deyholos-CC BY-NC 3.0)

Figure 3.
An EcoRI dimer (blue, purple) sits like a saddle on a double helix of DNA (one strand is green, one is brown). This image is looking down the center of the helix.
(NCBI-?-PD)

2. RESTRICTION ENZYMES AND DNA METHYLATION

Figure 4.
Each recognition sequence is cleaved by a restriction enzyme. The enzyme can either cut the DNA at a position offset from the center of the restriction site, creating an overhanging region, or cut directly in the middle to create a blunt end.
(Original-Deyholos/Kang-CC BY-NC 3.0)

Figure 5.
The recognition sequence for EcoRI (blue) is cleaved by the enzyme (grey). This particular enzyme cuts DNA at a position offset from the center of the restriction site. This creates an overhanging, sticky-end.
(Original-Deyholos-CC BY-NC 3.0)
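The offset cut made by EcoRI can be illustrated on a short sequence. The sketch below looks only at the top strand and assumes the usual EcoRI site GAATTC, cut after the first G; the example sequences and the function name are made up for illustration.

```python
def cut_top_strand(sequence, site="GAATTC", offset=1):
    """Split the top strand at every occurrence of a recognition site.

    offset: number of bases from the start of the site to the cut
    (1 for a cut after the first G of GAATTC; 3 for a cut in the
    middle of a six-base site, which would leave blunt ends).
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        fragments.append(sequence[start:pos + offset])
        start = pos + offset
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])
    return fragments


sticky = cut_top_strand("TTGAATTCAAGGGAATTCTT")
blunt = cut_top_strand("AAGATATCTT", site="GATATC", offset=3)
print(sticky)
print(blunt)
```

Because the cut on the bottom strand is offset in the opposite direction, each EcoRI fragment is left with a single-stranded AATT overhang, whereas a cut in the middle of the site leaves no overhang.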
GATATC-3’, producing what are called blunt-ends, which lack overhangs: 5’-GAT ATC-3’
Many different kinds of restriction sites exist in a genome, and it takes time for the enzymes to cut all of the restriction sites. If the concentration of the enzyme is low or the exposure time is short, the result is a partial digest, which produces longer fragments. If the DNA is exposed to the restriction enzyme long enough, the result is a complete digest, which produces shorter fragments.
2.2. DNA METHYLATION
Bacteria keep their DNA safe from their own restriction enzymes by methylating it (adding a CH3 group) using a methyl transferase (methylase) enzyme. For each different restriction enzyme, a matching methylase enzyme is produced to methylate the host DNA. After each round of DNA replication, the methylase has to methylate the newly synthesized DNA.
3. DNA LIGATION
The process of DNA ligation occurs when DNA strands are covalently joined, end-to-end, forming a phosphodiester bond between the 5’ phosphate end and 3’ hydroxyl end through the action of an enzyme called DNA ligase. Typically, sticky-ended molecules with complementary overhanging sequences (compatible ends) facilitate their joining to form recombinant DNA. Likewise, two blunt-ended sequences are also considered compatible to join together, although they do not ligate together as efficiently as sticky-ends. Sticky-ended molecules with non-complementary sequences will not ligate together with DNA ligase.
This function of joining two fragments of DNA together by DNA ligase is essential when connecting Okazaki fragments during DNA replication, or repairing breaks in either single or double stranded DNA molecules during recombination. Therefore, a mutation in the genes that encode ligase enzymes will have a severe negative impact on the organism. In molecular genetics, ligation is particularly important as DNA ligase facilitates the insertion of a double stranded DNA fragment into a plasmid vector: Ligation is therefore central to the production of recombinant DNA.
4. AGAROSE GEL ELECTROPHORESIS
4.1. BASICS
A solution of DNA is colorless, and except for being viscous at high concentrations, is visually indistinguishable from water. Therefore, techniques such as gel electrophoresis have been developed to detect and analyze DNA (Figure 6). This analysis starts when a solution of DNA is deposited at one end of a gel slab. This gel is made from polymers such as agarose, which is a polysaccharide isolated from seaweed. The molecules that compose the gel are linked by hydrogen bonds, not covalent bonds, so the experimenter can mold the shape of the gel by heating and cooling. The DNA is then forced through the gel by an electrical current, with DNA molecules moving toward the positive electrode (Figure 7). This is because the phosphate backbone of DNA or RNA has negative charge on it. Therefore, rather than moving in a vertical manner, the DNA or RNA molecule will move by its horizontal side.
Figure 6.
Apparatus for agarose gel electrophoresis. A waterproof tank is used to pass current through a slab gel, which is submerged in a buffer in the tank. The current is supplied by an adjustable power supply. A gel (stained blue by a dye sometimes used when loading DNA on the gel) sits in a tray, awaiting further analysis, such as photography under a UV light source.
(Flickr- camerazn - CC BY 2.0)
produces a secondary structure that forms loops. This can affect its mobility and can therefore provide wrong information regarding its size. Also, the bands are going to look less sharp. Therefore, denaturing agents have to be used when running RNA on the gel in order to break the hydrogen bonds that hold the secondary structure of RNA together. This way, most of the secondary structure of RNA can be prevented. Some of the molecules might still keep their secondary structure and some might even re-coil. This is why EtBr, which is an intercalating agent that inserts between two planar bases, still works on RNA molecules but with a lower efficiency. Therefore, in order to produce bands of similar quality to DNA on a gel, a lot more RNA has to be used.
RNA gel electrophoresis can be used when scientists want to identify the existence of certain mRNAs and therefore certain genes that are expressed in the cell compared to other cell types or the same cell type at a different stage in life. Also, RNA molecules can be quantified so the level of gene expression can be identified as well. Finally, RNA can be purified just like DNA molecules.
5.2. CONTAMINATION IN GEL ELECTROPHORESIS
DNA samples don’t always separate correctly in agarose gel electrophoresis. Typically, this is because the DNA sample is contaminated with other macromolecules or chemicals. Figure 10 shows the consequences of various forms of contamination.
6. RESTRICTION MAPPING
6.1. PROCEDURE
Restriction mapping is the technique of identifying the location of restriction sites, relative to other sites, on a DNA molecule. Typically, a sample of purified plasmid DNA is aliquoted into several tubes and each is treated with a different restriction enzyme or combination of enzymes. The digests are then separated by agarose gel electrophoresis and their restriction fragment sizes determined. By trial and error, the combination of fragments can be assembled like a linear jigsaw puzzle into a map of the restriction sites – a restriction map.
Figure 10.
The consequences of various forms
of contamination on the separation
of DNA in an agarose gel.
(Original-Harrington-CC BY-NC 3.0)
Figure 11.
By looking at the sizes of the fragments produced by one restriction enzyme or a combination of restriction enzymes, the
location and the order of the restriction sites on a chromosome can be identified, forming a restriction map.
(Original-Locke-CC BY-NC 3.0)
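The jigsaw logic of restriction mapping can be checked in reverse: given a proposed map, compute the fragment sizes each digest should produce and compare them with the bands on the gel. The sketch below assumes a circular plasmid and a complete digest. The function name `circular_fragments`, the 5,000 bp plasmid, and the site positions for "EnzymeA" and "EnzymeB" are all hypothetical values chosen for illustration.

```python
def circular_fragments(plasmid_size, cut_positions):
    """Fragment sizes (bp) from a complete digest of a circular plasmid."""
    cuts = sorted(set(cut_positions))
    if not cuts:
        return [plasmid_size]  # uncut circle
    sizes = [b - a for a, b in zip(cuts, cuts[1:])]
    # The last fragment runs from the final site, through position 0, to the first site.
    sizes.append(plasmid_size - cuts[-1] + cuts[0])
    return sorted(sizes, reverse=True)

# Hypothetical map: positions (bp) of each enzyme's sites on a 5,000 bp plasmid.
site_map = {"EnzymeA": [500, 2500], "EnzymeB": [1500]}

single_a = circular_fragments(5000, site_map["EnzymeA"])
single_b = circular_fragments(5000, site_map["EnzymeB"])
double = circular_fragments(5000, site_map["EnzymeA"] + site_map["EnzymeB"])
print(single_a, single_b, double)
```

A candidate map is consistent with the gel only if the predicted sizes match the observed bands for every single and double digest. Note that two fragments of the same size run as one band.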
6.2. USES
Restriction mapping is a quick, easy and inexpensive way to characterize and distinguish DNA samples without actually sequencing the DNA; sequences can be represented by a series of restriction sites and, using this knowledge, one can tell if the DNA of interest is similar to or different from others by comparing how much their restriction maps overlap.
Also, restriction sites offer positions for convenient manipulation of the DNA. Restriction fragments that contain the gene of interest can be cut out and, once the gene is purified from the fragments, it can be sequenced or used as a probe. This is the reason why restriction mapping is still routinely used today, even though sequencing technologies allow us to sequence the whole genome.
SUMMARY:
• Restriction enzymes are natural endonucleases used in molecular biology to cut DNA sequences at
specific sites.
• DNA fragments with compatible ends can be joined together through ligation. If the ligation produces a
sequence not found in nature, the molecule is said to be recombinant.
• DNA or RNA molecules can be identified, quantified, and separated on electrophoresis gel.
• Contamination in DNA samples, such as RNA, salt, or protein, can affect the banding on an
electrophoresis gel.
KEY TERMS:
lysing
detergents
chelating agent
EDTA
nucleases
supernatant
pellet
restriction endonucleases
restriction enzymes
EcoRI
EcoRV
sticky-ends
blunt-ends
DNA methylation
DNA ligation
DNA ligase
compatible ends
gel electrophoresis
agarose
band
ethidium bromide
size markers
restriction map
STUDY QUESTIONS:
1) A 6.0 kbp PCR fragment flanked by recognition
sites for the HindIII restriction enzyme is cut
with HindIII, then ligated into a 3.0 kbp plasmid
vector that has also been cut with HindIII. This
recombinant plasmid is transformed into E. coli.
From one colony a plasmid is prepared and
digested with HindIII.
a) When the product of the HindIII digestion is
analyzed by gel electrophoresis, what will
be the size of the band(s) observed?
b) What bands would be observed if the
recombinant plasmid was instead cut with
EcoRI, which has only one site, directly in
the middle of the PCR fragment?
c) What band(s) would be observed if the
recombinant plasmid was cut with both
EcoRI and HindIII at the same time?
2) You add ligase to a reaction containing a sticky-
ended plasmid and sticky-ended insert
fragment, which both have compatible ends.
Unbeknownst to you, someone in the lab left
the stock of ligase enzyme out of the freezer
overnight and it degraded (no longer works).
Explain in detail what will happen in your
ligation experiment in this situation should you
try and transform with it.
3) Which would move faster during agarose gel
electrophoresis, a 1.0 kbp duplex DNA molecule
or a 1,000 nt single-stranded RNA molecule?
INTRODUCTION
Recombinant DNA is a general term to describe DNA that has been manipulated (recombined) somehow in vitro. It typically involves the breakage of DNA into fragments, using restriction enzymes, and the rejoining (ligation) of these fragments into various arrangements and into vectors, such as plasmids, to propagate the new arrangement for further analysis, like sequencing, or for insertion into other hosts, such as model organisms, as transgenes.
1. BASIC TERMINOLOGY
Before proceeding any further, there are some basic terms that students should know regarding recombinant DNA technology.
in vivo (in life) experiments done within a living cell/organism
in situ (in place) experiments done on cells and structures removed intact from an organism. (ex. Inserting RNA into a frog egg cell on a petri dish)
in vitro (in glass) experiments done on individual molecules removed from an organism (ex. DNA in a test tube). These days most experiments are done in plastico (in plastic). See Figure 1.
in silico (in silicon) experiments done within a computer simulation.
Recombinant DNA: a composite DNA molecule created in vitro by joining a foreign DNA with a vector DNA molecule. (Note: technically recombinant DNA can also be made in vivo during meiosis in an organism, but this is usually not the typical meaning of these words.)
2. RECOMBINANT DNA TECHNIQUES:
There are many techniques for joining DNA molecules in vitro and introducing them into cells (usually bacteria) where the molecules are then replicated along with the host genomic DNA.
2.1. PLASMIDS ARE NATURALLY PRESENT IN SOME BACTERIA
Many bacteria contain extra-chromosomal DNA elements called plasmids. These are usually small (a few 1000 bp), circular, double stranded molecules that replicate independently of the chromosome and can be present in multiple copies within a cell. In the wild, plasmids can be
transferred between individuals during bacterial mating and are sometimes even transferred between different species. Plasmids are particularly important in medicine because they often carry genes for pathogenicity (making the bacteria more detrimental) and drug-resistance (able to survive various antibiotics). In the lab, plasmids are inserted into bacterial hosts in a process called transformation. These plasmids can be modified by the addition of foreign DNA so that both the plasmid vector and the target foreign DNA are replicated.
There are 3 main features of a plasmid (Figure 2):
(1) Origin of replication (Ori) which is similar in function to oriC in the E. coli chromosome.
(2) Selectable marker gene that helps to screen the desired and undesired strains, which is usually an antibiotic resistance gene like ampR, tetR, or kanR. Some cells have plasmids that contain resistance to multiple, different antibiotics.
(3) Multiple cloning site (MCS) that has many restriction enzyme sites in a short sequence.
particularly because most plasmid vectors used in molecular biology have been engineered to contain recognition sites for a large number of restriction endonucleases in a segment called the Multiple Cloning Site (MCS).
Figure 3.
Cloning of a DNA fragment (red) into a plasmid vector. The vector already contains a selectable marker gene (blue) such as an antibiotic resistance gene.
(Original-Deyholos-CC BY-NC 3.0)
that only those cells that have actually incorporated the plasmid will be able to grow and form colonies. Colonies (clones) can then be picked and used for further study.
Molecular biologists use plasmids as vectors to contain, amplify, transfer, and sometimes express genes of interest that are present in the cloned DNA. Often, the first step in a molecular biology experiment is to “clone a gene” (i.e. make a copy) into a plasmid, then transform this recombinant plasmid into bacteria so that essentially unlimited copies of the gene (and the plasmid that carries it) can be made as the bacteria reproduce. This is a practical necessity for further manipulations of the DNA, since most techniques of molecular biology require many copies of DNA to work. Even though small amounts are needed they are not sensitive enough to work with just a single molecule at a time.
Many molecular cloning and recombination experiments are therefore iterative (repetitive) processes. For example:
1. a DNA fragment (usually isolated by PCR and/or restriction enzyme digestion) is cloned into a plasmid cut with a compatible restriction enzyme
2. the recombinant plasmid is transformed into bacteria
3. the bacteria are allowed to multiply, usually in liquid culture
4. a large quantity of the recombinant plasmid DNA is isolated from the bacterial culture
5. further manipulations (such as site directed mutagenesis or the introduction of another piece of DNA) are conducted on the recombinant plasmid
6. the modified plasmid is again transformed into bacteria, prior to further manipulations, or for expression
3.2. OTHER VECTORS
Lambda phage is a bacteriophage that infects E. coli and can be used as a vector. Lambda phage is a linear DNA vector molecule that can typically hold a 15-20 kb fragment in each clone.
Cosmids are a hybrid vector system composed of part plasmid and part phage DNA. They can hold 30-45 kb fragments in each clone. The lambda phage packaging system (which stuffs the recombinant DNA into the lambda bacteriophage heads) is used for higher transformation efficiency, but cosmids also have the plasmid origin of replication so clones can be replicated in the host like plasmids.
BACs (Bacterial Artificial Chromosomes) are circular DNA vectors that use a plasmid origin of replication to propagate. The insert DNA can be very large, 100’s of kbp, so it may contain many genes. But such large recombinant DNA molecules are difficult to transform, so BACs are difficult to make.
4. DNA LIGATION
The process of DNA ligation occurs when DNA strands are covalently joined, end-to-end, through the action of an enzyme called DNA ligase. Molecules with complementary overhanging sequences are said to have “sticky” or compatible ends, which facilitate their joining to form recombinant DNA. Likewise, two blunt-ended sequences are also considered compatible to join together, although they do not ligate together as efficiently as sticky-ends. Note: sticky-ended molecules with non-complementary sequences will not ligate together with DNA ligase.
The process of ligation is central to the production of recombinant DNA, including the insertion of a double stranded DNA fragment into a plasmid vector.
5. AN APPLICATION OF MOLECULAR CLONING: RECOMBINANT INSULIN
Purified insulin protein is critical to the treatment of diabetes. Prior to ~1980, insulin for clinical use was isolated from human cadavers or from slaughtered animals such as pigs. Human-derived insulin generally had better pharmacological properties, but was in limited supply and carried risks of disease transmission. By cloning the human insulin gene and expressing it in E. coli, large
quantities of insulin protein identical in sequence to the human hormone could be produced in fermenters, safely and efficiently. Production of recombinant insulin also allows specialized variants of the protein to be produced: for example, by changing a few amino acids, longer-acting forms of the hormone can be made. The active insulin hormone contains two peptide fragments of 21 and 30 amino acids, respectively. Today, essentially all insulin is produced from recombinant sources (Figure 4), i.e. human genes and their derivatives expressed in bacteria or yeast.
Figure 4.
A vial of insulin. Note that the label lists the origin as “rDNA”, which stands for recombinant DNA.
(Flickr-DeathByBokeh- CC BY-NC 2.0)
6. GENOMIC DNA LIBRARIES AND CDNA LIBRARIES
6.1. GENOMIC DNA LIBRARY
The human genome is large and complex. It is much easier to break it down into little fragments to study. This is true for all organisms – deal with a gene at a time.
Here is the process of constructing a genomic DNA library (Figure 5):
1. Genomic DNA is broken down into short fragments by partial restriction enzyme digestion. The size is dictated by the vector used. Plasmids will need short fragments (~5 kbp) while cosmid vectors will need larger ones (30-45 kbp).
2. Circular plasmid or cosmid vector DNA is opened with the same restriction enzymes that were used in (1) or another enzyme that yields compatible, sticky ends.
3. The DNA fragment and vector are mixed together in the same test tube and ligated together. (The ligation occurs by random chance so not all molecules will be appropriate for the next step.) Each independent assembly of a DNA segment in a vector is a clone.
4. The recombinant DNA molecules are transformed into a competent bacterial host cell. For plasmids, this is a direct process, while for cosmids, a lambda in vitro packaging system is used to increase the efficiency of the process. (Lambda in vitro packaging system refers to the packaging of recombinant DNA into the head of the bacteriophage, and then transferring this package to the host cell.)
5. The transformed cells that contain a plasmid or cosmid with an antibiotic resistance gene are selected and propagated.
6. Amplified recombinant plasmid DNA molecules can be purified and collected. This collection of many different clones (genomic DNA fragments) makes up a DNA library or clone library that can contain the entire DNA sequence of an organism in the fragmented form of multiple clones.
These clones can then be stored and the fragment of interest can be retrieved at a later time, hence the name “genomic library.” Gene libraries can be constructed using different vectors, but almost all work is done with plasmids these days.
How many clones are needed to include every sequence in a library? The number of clones needed to have 1 genome equivalent can be calculated by dividing the size of the genome by the size of the insert in each clone. For example, if the E. coli genome is about 4,500,000 bp and a cosmid clone contains 45,000 bp, then 4,500,000/45,000 = 100 clones would be needed, end to end, to cover the whole genome. However, in real life, some sequences in the genome might not be cloned at all (others may be cloned more than once) and therefore the process is not 100% efficient. To get a 99% chance of finding a specific gene or sequence of interest, one needs about 5 genome equivalents. That is: 500 clones for the E. coli example above.
PAGE 4 OPEN GENETICS LECTURES – FALL 2017
RECOMBINANT DNA – CHAPTER 29
Figure 5.
Process of making a
genomic clone library of
an organism.
(Original-Locke-CC BY-NC
3.0)
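The genome-equivalent arithmetic in section 6.1 can be checked with a short calculation. The sketch below is an illustration under stated assumptions. The function names are invented here, and the probability formula N = ln(1 - P) / ln(1 - f), where f is the fraction of the genome carried by one clone, is a commonly used textbook estimate that does not appear in these notes. It assumes every fragment is cloned with equal probability.

```python
import math

def genome_equivalent(genome_bp, insert_bp):
    """Clones needed, laid end to end, to equal one genome."""
    return math.ceil(genome_bp / insert_bp)

def clones_for_probability(genome_bp, insert_bp, p=0.99):
    """Clones to screen for probability p that a given sequence is present.

    Assumes random, unbiased cloning: N = ln(1 - p) / ln(1 - f).
    """
    f = insert_bp / genome_bp  # fraction of the genome in one clone
    return math.ceil(math.log(1 - p) / math.log(1 - f))

# E. coli cosmid example from the text: 4,500,000 bp genome, 45,000 bp inserts
one_genome = genome_equivalent(4_500_000, 45_000)
needed = clones_for_probability(4_500_000, 45_000, p=0.99)
print(one_genome, needed, round(needed / one_genome, 1))
```

Under these assumptions the estimate comes out at roughly 460 clones, a little under 5 genome equivalents, which agrees with the rule of thumb of about 500 clones given above.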
6.2. CDNA LIBRARY
The genomic library above hypothetically contains all the sequences of the target’s DNA, but a cDNA library only contains the sequences that are expressed in a particular cell or tissue.
To create a cDNA library, RNA is collected from the cell or tissue of interest. Primers, nucleotides, and reverse transcriptase enzyme are added so that complementary DNA (cDNA), which is complementary to the RNA, is synthesized. The result is an RNA-cDNA hybrid. These two strands are separated by heating, and the RNA can be degraded by adding RNase enzyme or NaOH. The remaining cDNA can act as a template and its complementary DNA strand is synthesized, forming a double helix. From this point, the rest of the procedure to create a library is equivalent to that above. Here, however, the cloned DNA corresponds to the mRNA present in the cell. The DNA between genes, and intron DNA, is absent from this library. These clones are often used to express the gene to make a protein.
The main difference between a genomic DNA library and a cDNA library is that the genomic library contains DNA with exons, introns, and intergenic sequences, so the number of different clones in the library is much bigger. On the other hand, a cDNA library contains only the sequences present after transcription and processing (e.g. splicing exons), which are translated into polypeptides. Therefore, by looking at the cDNA library we can identify which genes are expressed in particular cell types, and at what level.
Figure 6.
Process of producing
cDNA library. (Original-
Locke-CC BY-NC 3.0)
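The two synthesis steps in making cDNA (a first DNA strand copied from the mRNA, then a second DNA strand copied from the first) amount to taking complements. The sketch below shows only that base-pairing logic. The function names and the short mRNA are made up for illustration, and primers, enzymes, and strand removal are not modeled.

```python
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}

def first_strand_cdna(mrna):
    """DNA strand complementary to the mRNA, written 5'->3'."""
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna))

def second_strand(cdna):
    """DNA strand complementary to the first cDNA strand, written 5'->3'."""
    return "".join(DNA_TO_DNA[base] for base in reversed(cdna))

mrna = "AUGGCCUAA"            # hypothetical spliced mRNA, 5'->3'
first = first_strand_cdna(mrna)
second = second_strand(first)
print(first, second)
```

The second strand has the same sequence as the mRNA with T in place of U, which is why a cDNA clone carries the spliced coding sequence without introns.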
7. SCREENING A CLONE LIBRARY
After genomic or cDNA libraries have been constructed, clones containing a particular gene, or DNA sequence, can be identified and recovered using the process of hybridization and labeled DNA probes. DNA labeling involves putting a tag on the DNA molecule that is complementary to the DNA sequence of interest, in some manner that permits one to detect its presence in minute quantities at some later point in an experiment. DNA can be labeled in several different ways; one widely used technique is to replace the normal phosphorus of the DNA with a radioactive atom of phosphorus, 32P (normal isotope = 31P). This radioactivity can be detected by photographic emulsion. A cloned DNA sequence will hybridize to only its complementary sequences and thus provides an almost unique probe. Once the appropriate probes are made, the following procedures are performed:
1. Plate out library - each colony on the bacterial plate is a clone.
2. Lift clones (DNA) onto Nitrocellulose filter. Lyse the cells and fix the clone DNA onto the filter. Denature the clone DNA, so as to make it able to form hybrids with probe.
3. Place filter in a hybridization bag with solution containing labeled, denatured probe DNA. Incubate to permit the probe strands to form hybrids with the clone strands.
4. Wash away unhybridized probe.
5. Expose probed filter to X-ray film (autoradiography) to detect the presence of clones with labeled probe.
6. From the X-ray film determine which clone hybridized to the probe and recover that clone for further analysis.
Figure 7.
Process of screening a
clone library of an
organism.
(Original-Locke-CC BY-
NC 3.0)
____________________________________________________________________________
SUMMARY:
• DNA fragments can be cloned into vectors.
• Transformation of recombinant DNA is the transfer of DNA (usually recombinant plasmids) into
bacteria.
• Cloning of genes into E. coli is a common technique that allows large quantities of the DNA for a gene to
be made.
• This allows further analysis or manipulation of the cloned sequences.
• Genomic DNA libraries contain fragments of genomic DNA.
• cDNA libraries contain shorter segments of DNA that correspond to the mRNA for each gene.
• Gene of interest can be identified using DNA probes to screen genomic or cDNA libraries.
• Cloning can also be used to produce useful proteins, such as insulin, in microbes.
KEY TERMS:
in vivo
in situ
in vitro
in plastico
in silico
Recombinant DNA
plasmid
transformation
Ori
selectable marker
Multiple cloning site (MCS)
competent
electroporation
vector
clone
Lambda phage
cosmid
lambda phage packaging system
BACs
DNA ligation
DNA ligase
sticky / compatible end
genomic library
cDNA library
STUDY QUESTIONS:
1) A coat protein from a particular virus can be
used to immunize children against further
infection. However, inoculation of children with
proteins extracted from natural viruses
sometimes causes a fatal disease, due to
contamination with live viruses. How could you
use molecular biology to produce an optimal
vaccine?
2) How would cloning be different if there were
no selectable markers?
Notes:
The untransformed cells (left) will not grow on any Minimal Media (MM) with or without antibiotic. The cells transformed with a plasmid containing a genomic fragment (blue) will grow on the antibiotic containing media because of the antibiotic
Once cloned, this fragment can be sequenced and the genes within it characterized and used for further experiments to determine its function and role in the biological process under investigation.
Figure 3.
The auxotrophic A- strain E. coli (green)
is transformed with the recombinant
DNA (A+ genomic DNA plus plasmid
vector from Figure 2) to generate three
genotypic classes as shown across the
top. Left: untransformed; Centre:
transformed with random genomic DNA;
Right: transformed with the A+ gene
(red). See text for an explanation.
(Original-J. Locke-CC BY-NC 3.0)
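The selection logic behind the three genotypic classes in Figure 3 can be written as a small truth table. This is a simplified sketch with assumptions stated plainly: the plasmid vector is assumed to carry the antibiotic resistance gene, the A- host is assumed unable to grow on minimal medium unless it receives the A+ gene, and a complete (non-minimal) medium is assumed to supply what the auxotroph lacks. The function and variable names are invented here.

```python
def grows(has_plasmid, has_a_plus, minimal_medium, antibiotic):
    """Predict colony growth for an A- auxotrophic host (simplified model)."""
    if antibiotic and not has_plasmid:
        return False  # no resistance gene without the vector
    if minimal_medium and not has_a_plus:
        return False  # the auxotroph cannot grow without the A+ gene
    return True

# The three classes from Figure 3: (has_plasmid, has_a_plus)
classes = {
    "untransformed": (False, False),
    "random genomic insert": (True, False),
    "A+ insert": (True, True),
}
for name, (plasmid, a_plus) in classes.items():
    print(name, grows(plasmid, a_plus, minimal_medium=True, antibiotic=True))
```

In this model only the class carrying the A+ gene forms colonies on minimal medium with antibiotic, which is the condition that picks out the clone of interest.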
1.4. HOW MANY CLONES ARE NEEDED TO FIND THE ONE OF INTEREST ?
Given that the construction of the recombinant genomic DNA library is random, in that the fragment with the A+ gene is only one of many in the genome, how many clones do we need to make and screen to be sure of finding the one we seek? There is a simple formula that researchers use to estimate this probability:
(1) If each plasmid contains ~4.5 kb of insert DNA and the E. coli genome contains ~4.5 Mb DNA then ~1000 plasmid clones, if arranged end-to-end, could contain one E. coli genome's worth of DNA.
(2) Because of the random nature of which fragments are cloned, probability says we need to screen the equivalent of 5 genomes worth of clones (e.g. 5,000 clones in this case) to provide a 99% chance of finding the A+ gene (or any/every other gene in the genome), taking the conditions in the previous section into account. Because cloning is a statistical process, there can never be a 100% chance, but the 99% chance is almost always sufficient to find the gene of interest.
Note that 5,000 bacterial clones can be produced easily and screened quickly on a single Petri dish plate. This method is relatively easy and straightforward.
<duplex DNA>
5'....GAATTCGGATCC....3'
3'....CTTAAGCCTAGG....5'
↓
5'....GAATTCGGATCC....3'
+
3'....CTTAAGCCTAGG....5'
These dissociated single strands can reassociate (or anneal) to reform the duplex DNA. This process is sequence specific, in that the duplex will only form if the two strands are complementary.
When the two strands reform a duplex, a hybrid is formed, hence the name hybridization.
5'....GAATTCGGATCC....3'
3'....CTTAAGCCTAGG....5'
(duplex DNA again)
Note: hybrids can form between DNA/DNA (two complementary DNA strands), DNA/RNA (a DNA strand and its complementary RNA sequence), or between RNA/RNA (not useful here since clones are DNA).
Hybrid formation only requires that the complementary sequences be similar, not a perfect match. A hybrid can form with some "mismatch" in the pairing.
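The sequence specificity of hybridization, and its tolerance for a few mismatches, can be sketched as a string comparison. The code below uses the 12-base duplex from the diagram above. The helper names and the `max_mismatches` threshold are assumptions made for illustration; real hybridization depends on temperature, salt, and probe length, none of which are modeled here.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(strand):
    """Complementary strand, written 5'->3'."""
    return "".join(PAIR[base] for base in reversed(strand))

def can_hybridize(probe, target, max_mismatches=0):
    """True if the probe can pair with some stretch of the target (both 5'->3')."""
    wanted = reverse_complement(probe)  # the sequence the probe pairs with
    for i in range(len(target) - len(wanted) + 1):
        window = target[i:i + len(wanted)]
        mismatches = sum(1 for a, b in zip(window, wanted) if a != b)
        if mismatches <= max_mismatches:
            return True
    return False

top = "GAATTCGGATCC"                   # top strand of the duplex, 5'->3'
bottom = reverse_complement(top)       # bottom strand, read 5'->3'
imperfect = "GGATCCGAATTG"             # bottom strand with one base changed

print(bottom)
print(can_hybridize(bottom, top))                       # perfect match
print(can_hybridize(imperfect, top))                    # strict: one mismatch fails
print(can_hybridize(imperfect, top, max_mismatches=1))  # relaxed: hybrid forms
```

Raising `max_mismatches` plays the role of relaxing the hybridization conditions so that similar, but not identical, sequences can still pair.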
Example:
Protein sequence:  Met - Lys - Asn - Glu     (Lys=Lysine, Asn=Asparagine, Glu=Glutamic Acid)
codon:             AUG - AAA - AAU - GAA
alternate codon:         AAG - AAC - GAG
Probe sequence:    ATG   AAA   AAT   GAA
                           G     C     G
                           |     |     |
                   use both bases in the oligo-nucleotide: a 50:50 mix of each base
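The mixed probe in this example is really a pool of every sequence that could encode the peptide. A minimal sketch that enumerates the pool is shown below; the `CODONS` table holds only the four amino acids used in the example, and the function name is invented here.

```python
from itertools import product

# DNA codons (T in place of U) for the four amino acids in the example.
CODONS = {
    "Met": ["ATG"],
    "Lys": ["AAA", "AAG"],
    "Asn": ["AAT", "AAC"],
    "Glu": ["GAA", "GAG"],
}

def probe_mix(peptide):
    """All oligonucleotide sequences that could encode the peptide."""
    choices = [CODONS[amino_acid] for amino_acid in peptide]
    return ["".join(combo) for combo in product(*choices)]

probes = probe_mix(["Met", "Lys", "Asn", "Glu"])
print(len(probes))
for sequence in probes:
    print(sequence)
```

Three positions with two possible bases each give 2 x 2 x 2 = 8 different oligonucleotides in the mix, one of which matches the actual gene sequence exactly.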
Another source of DNA for probes comes from polymerase chain reaction (PCR) amplification products. The methodology of PCR is covered in Chapter 31 and will not be presented here. Basically PCR amplification is just another method to synthesize sufficient DNA for use as a probe. It uses two primer oligonucleotides to synthesize (amplify) a specific sequence of duplex DNA. This can then be used as a probe as described in the next section.
2.3. SCREENING A LIBRARY TO FIND A CLONE
Once a DNA fragment has been obtained (see above) it can be labeled and used as a probe. "Labeling" involves putting a tag on the DNA in some manner that permits one to detect its presence, in minute quantities, at some later point in an experiment. DNA can be "labeled" in several different ways. One widely used technique is to replace the normal phosphorus of the DNA with a radioactive atom of phosphorus, 32P (normal isotope = 31P). This radioactivity can be detected by photographic emulsion (e.g. auto-radiography) on sheets of X-ray film. There are several methods to do this (nick-translation, random priming, PCR). All produce a single strand of 32P labeled DNA that will hybridize with its complementary sequence and thus localize any DNA with that sequence.
We now have the probe labeled and ready to screen a genomic library for a gene of interest. The method of screening a library of DNA clones relies on the probe’s sequence specificity to hybridize with only the clones with a complementary sequence. The procedure is diagrammed in Figure 6.
The goal of screening a clone library is to identify and recover clone(s) that have a sequence complementary to the probe (e.g.: a specific gene sequence).
Figure 6.
Procedure to screen a plasmid library of genomic clones
with a DNA probe.
1. Plate out library - each colony on the bacterial plate is a
clone.
2. Lift clones (DNA) onto Nitrocellulose filter. Fix the clone
DNA onto the filter and denature the clone DNA, so as to
make it able to form hybrids with probe.
3. Place filter in a hybridization bag with solution containing
labeled, denatured probe DNA. Incubate to permit the
single strands of probe to form hybrids with the clone single
strands in a sequence specific manner.
4. Wash away unhybridized probe.
5. Expose probed filter to X-ray film (autoradiography) to
detect the presence of clones with labeled probe.
6. From the X-ray film determine which clone hybridized to
the probe and recover that clone for further analysis.
(Original-J. Locke-CC BY-NC 3.0)
3.1. BASICS OF TRANSPOSABLE ELEMENTS
All eukaryotes have multiple types of transposable elements as single or multiple copies in their genomes. These segments of DNA, usually hundreds or thousands of base pairs long, are usually distributed throughout the genome as randomly inserted sequences. Typically, they are mobile, in the sense that some of them are able to move and/or replicate within the genome. There are two main classes: (1) Retro-transposons and (2) DNA transposons.
Retro-transposons (or retroposons) move from one site to another in the genome via RNA intermediates (Figure 7). The sequence is transcribed from the DNA sequence at one locus into RNA, then reverse-transcribed back into DNA (DNA->RNA->DNA), which is then inserted back into the genomic DNA at another locus. Within this class there are two subclasses. The first is the retrovirus-like class, which has sequences similar in organization to a retrovirus, such as HIV. Their replication is also like a retrovirus. Then there is the non-viral class of retroposons. The human Alu element is an example of a short interspersed element (SINE). There are also long interspersed elements (LINEs).
DNA transposons move via DNA intermediates (DNA-> DNA). No RNA (transcription) is involved. One of the best-studied elements is the P element in Drosophila. It is very well characterized and used as a genetic tool to create insertion mutations (mutagenesis) and as a transformation vector to move constructed genes into the germline of Drosophila (transgenes, etc.). We will use the P element as an insertion mutation in our model example here.
P elements are 2907 bp long, have 31 bp inverted repeats at either end, and code for either a transposase or repressor protein, depending on the mRNA splicing alternative. A transposase is an enzyme that binds to the transposon and catalyzes the movement from its current location to another part of the genome. It does this by cleaving the strands surrounding the region, and then cutting and inserting the transposon in a new location in the genome (Figure 7). The transposase transcript has 4 exons (Exon 0 - Exon 3). The repressor polypeptide is made if the last intron (2-3) is not spliced out. The resulting mRNA will be translated such that the 2-3 intron is translated and a premature stop codon prevents translation of the sequence in exon 3 (Figure 8). A simplistic description proposes that this truncated product will bind to the P element ends, but not cleave the DNA as the transposase polypeptide does. This binding prevents other transposase molecules from binding and cleaving, thereby acting as a repressor of mobilization. The real situation is more complex.
In most somatic tissues, the 2-3 intron is not removed from the primary transcript so only the repressor is made. In germline tissues, the intron can be removed and the transposase produced to cause mobilization. The mobilization of the transposon will cause insertions, which can cause mutations in genes. This is explained in the next section.
3.2. CREATING A TE INDUCED MUTANT IN A GENETIC SCREEN
This needs wild type and mutant stocks for the gene of interest. If a P element induced mutation is
already available from a stock collection, then this step is completed. However, this is not the case
with most situations and so you must make your own. To create a P element induced mutation a P
element containing stock (P-stock) is crossed with one that lacks P elements. Both must have wild type
alleles of the gene of interest. If done correctly, this cross will cause the P elements to mobilize,
and create random insertions into genes all over the genome. Next is a genetic screen for this rare
insert into the gene of interest. It can be identified in the screen by a failure to complement an
existing mutant allele of the gene of interest (See Chapter 4).
This new allele is tagged with the P element and can now be recovered and isolated for further use.
3.3. CLONING A TE INSERTION SITE & THE WILD TYPE ALLELE
From the stock with the P element induced allele in the gene of interest, the genomic DNA can be
extracted and a genomic library built. This library can be screened and a clone containing the P
element sequence identified (see the previous section on using a DNA probe to screen a library).
These clones can be characterized and the genomic sequences adjacent to the P element can be
subcloned and used as a probe into a library made from a wild type stock. This will identify clones
containing the wild type DNA of the gene of interest.
3.4. CONFIRMING THE INSERT IS RIGHT GENE BY REVERSION ANALYSIS
Because the insertion of P element (and other TE) are not always the causal event in the type of
mutagenesis described above, it is essential that the cause of the mutation in the gene of interest be
established to be due to the insert that you have cloned. This is usually done through reversion
analysis.
The allele with the TE in the gene of interest (above) is reverted in a manner similar to that of the
initial insertion. The expectation is that the excision of the TE will be associated with a reversion of
the mutation back to wildtype. The presence/absence of the TE insert can be monitored by Southern
Blots in the reverted stocks and compared to the original, parental line and the insert mutant line.
4. CURRENT APPROACHES TO MATCHING GENES TO MUTATIONS
The methods described above have been used for several decades to successfully find many new genes.
However, the current ease and quickness of whole genome sequencing is changing this methodology. If
sufficient money and equipment is available, the mutant genome can be compared to the parental or
wild type genome and single base pair changes can be identified by computer analysis. While
theoretically straightforward, the technical details make it challenging and unclear. Often there are
many changes in the sequence and it is not clear which are causative and which are just random
changes.
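The idea of comparing a mutant genome to a wild type genome by computer can be illustrated with a toy example. The sketch below is not a real analysis pipeline: it assumes two already-aligned sequences of equal length, and the function name and the short sequences are hypothetical.

```python
def single_base_changes(reference, mutant):
    """List (position, reference_base, mutant_base) for every mismatch.

    Assumes the two sequences are already aligned and of equal length;
    positions are 1-based.
    """
    if len(reference) != len(mutant):
        raise ValueError("sequences must be aligned and of equal length")
    changes = []
    for index, (ref_base, mut_base) in enumerate(zip(reference, mutant)):
        if ref_base != mut_base:
            changes.append((index + 1, ref_base, mut_base))
    return changes


# Hypothetical wild type and mutant fragments differing at one base.
wild_type = "ATGGCCTTA"
mutant = "ATGGACTTA"
print(single_base_changes(wild_type, mutant))
```

A list like this only reports the differences. Deciding which change is causative still requires the genetic reasoning described in the text.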
Figure 8.
P element (red) with 31 bp repeats,
has a promoter (P) for a transcript
(green) that includes four exons.
Alternate splicing leads to either a
transposase (exons 0,1,2,3) or a
repressor (exons 0,1,2, intron,3).
(Original-J. Locke-CC BY-NC 3.0)
___________________________________________________________________________
SUMMARY:
• Genes in simple organisms (single cells) can be cloned by the complementation of host mutation by the
transformation of a plasmid containing a wild type copy of the host’s mutant locus.
• In screening a library for a clone, five genomes worth of cloned DNA needs to be screened in order to
have a 99% probability of finding the clone of interest.
• Libraries can be screened with a labeled probe using the DNA-DNA hybridization to bind the probe to a
target sequence in a clone.
• Genes can also be cloned via transposon tagging.
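The 99% figure quoted in the summary can be checked with a short calculation. This is a sketch under the usual simplifying assumption that clones are random, independent samples of the genome; the insert size and genome size used here are illustrative values, not numbers from this chapter.

```python
import math


def probability_of_finding(genome_equivalents, insert_bp, genome_bp):
    """Chance that at least one screened clone contains a given sequence."""
    fraction = insert_bp / genome_bp          # share of the genome in one clone
    clones = genome_equivalents / fraction    # clones making up that many genomes
    return 1 - (1 - fraction) ** clones


def clones_needed(probability, insert_bp, genome_bp):
    """Number of random clones needed to reach the requested probability."""
    fraction = insert_bp / genome_bp
    return math.ceil(math.log(1 - probability) / math.log(1 - fraction))


# Illustrative assumption: 20 kb inserts from a 3.2 billion bp genome.
print(probability_of_finding(5, 20_000, 3_200_000_000))
print(clones_needed(0.99, 20_000, 3_200_000_000))
```

Under these assumptions, five genome equivalents give slightly better than a 99% chance, which is consistent with the rule of thumb above.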
KEY TERMS:
pBR322 cDNA clone
pUC19 synthetic oligo-nucleotide
genomic DNA labeling
DNA vector 32P
complementation autoradiography
clone mobile element
restriction enzyme transposable element
ligation transposon
genomic DNA library transposon tag
transformed retro-transposon
minimal media DNA transposon
antibiotic retrovirus-like class
auxotroph non-viral class
prototroph P element
DNA hybridization Genetic tool
denaturation transposase
reassociation repressor
annealing 2-3 intron
mis-match reversion analysis
probe whole genome sequencing
QUESTIONS:
1) In Figure 4, it is intentionally not stated which is
the mouse and which is the Drosophila DNA
sequence. With a computer and internet
access, how could you determine which is
which?
2) In the example showing how the amino acid
sequence is used to make an oligonucleotide
probe, there are four amino acids shown.
Assume the next is proline.
a) Find a codon usage chart on the internet
and determine what the next three
nucleotides should be in the
oligonucleotide.
b) With the addition of the proline, how many
different sequence oligo-nucleotides are
needed to cover all the possible gene
sequences (assuming there is no exon-
intron site in this region)?
Figure 1.
Plastic disposable tips for a micro-pipettor
are used to accurately distribute microliter
volumes of liquid in molecular biology.
(Flickr-estherase- CC BY-NC-SA 2.0)
INTRODUCTION
While genetics is the study of the inheritance and variation of biological traits, today classical
genetics is often complemented by molecular biology, to give molecular genetics, which involves the
study of DNA and other macromolecules that have been isolated from an organism. Usually, molecular
genetics experiments involve some combination of techniques to isolate, analyze, and characterize the
DNA, RNA, and/or protein transcribed and translated from a particular gene. In some cases, the DNA
may be subsequently manipulated by mutation or by recombination with other DNA fragments.
Techniques of molecular genetics have wide application in many fields of biology, as well as forensics,
biotechnology, and medicine. Polymerase Chain Reaction (PCR) is a widely used technique to amplify
and isolate specific DNA sequences. It requires a “template” DNA, which is often genomic DNA. From
this template, specific sequences can be amplified and many copies can be produced for analysis or
manipulation.
1. ISOLATING GENOMIC DNA
DNA purification strategies rely on the chemical properties of DNA that distinguish it from other
molecules in the cell, namely that it is a very long, negatively charged molecule. To extract purified
DNA from a tissue sample, cells are broken open by grinding or lysing in a solution that contains
chemicals that protect the DNA while disrupting other components of the cell (Figure 2). These
chemicals may include detergents, which dissolve lipid membranes and denature proteins. A cation
such as Na+ helps to stabilize the negatively charged DNA and separate it from proteins, such as
histones. A chelating agent, such as EDTA, is added to protect DNA by sequestering Mg2+ ions, which
can otherwise serve as a necessary co-factor for nucleases (enzymes that digest DNA). As a result,
free, double-stranded DNA molecules are released from the cell and from chromatin into the
extraction buffer, which also contains proteins and all other cellular components. (The basics of this
procedure are simple enough that it can be done with household chemicals as presented on
YouTube.)
The free DNA molecules are subsequently isolated by one of several methods. Commonly, proteins
are removed by adjusting the salt concentration so they precipitate. The supernatant, which contains
DNA and other, smaller metabolites, is then mixed with ethanol, which causes the DNA to precipitate.
A small pellet of DNA can be collected by centrifugation, and after removal of the ethanol, the DNA
pellet can be dissolved in water (usually

very efficient method of amplifying a specific sequence of DNA from a small sample of a large,
complex genome.
Besides its ability to make large amounts of DNA, there is a second characteristic of PCR that makes it
extremely useful. Recall that most DNA polymerases can only add nucleotides to the end of an
existing strand of DNA, and therefore require a primer to initiate the process of replication. For PCR,
chemically synthesized primers of about 20 nucleotides are used. In an ideal PCR, primers only
hybridize to their exact complementary sequence on the template strand (Figure 3).
Successful PCR reactions have been conducted using only a single DNA molecule as a template, but in
practice, most successful PCR reactions contain many thousands of template molecules. The template
DNA (e.g. total genomic DNA) has usually already been purified from cells or tissues using the
techniques described above. However, in some situations it is possible to put whole cells directly in
a PCR reaction for use as a template.
Figure 4.
A strip of PCR tubes
(Wikipedia-madprime- CC BY-SA 3.0)
Figure 5.
Example of a thermal-cycle, in which the annealing temperature is 55°C.
(Original-Deyholos-CC BY-NC 3.0)
An essential aspect of PCR is thermal-cycling, meaning the exposure of the reaction to a series of
precisely defined temperatures (Figure 5). The reaction mixture is first heated to 95°C. This causes
the hydrogen bonds between the strands of the template DNA molecules to melt, or denature. This
produces two single-stranded DNA molecules from each double helix (Figure 7). In the next step
(annealing), the mixture is cooled to 45-65°C. The exact temperature depends on the primer
sequence used and the objectives of the experiment. This allows the formation of double stranded
helices between complementary DNA molecules, including the annealing of primers to the template.
In the final step (extension) the mixture is heated to 72°C. This is the temperature at which the
particular DNA polymerase used in PCR is most active. During extension, the new DNA strand is
synthesized, starting from the 3' end of the primer, along the length of the template strand. The
entire PCR process is very quick, with each temperature phase usually lasting ~30 seconds or less.
Each cycle of three temperatures (denaturation, annealing, extension) is usually repeated about 30
times, amplifying the target region approximately 2^30-fold (roughly a billion-fold). The amount of
DNA product reaches a plateau at 20-40 cycles, usually because the nucleotide precursors have been
exhausted. Notice from the figure that most of the newly synthesized strands in PCR begin and end
with sequences either identical to or complementary to the primer sequences; although a few strands
are longer than this, they are in such a small minority that they can almost always be ignored.
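The doubling arithmetic behind PCR can be sketched in a few lines. This is an idealized model only: it assumes every template is copied in every cycle unless a lower efficiency is supplied, and it ignores the plateau described above. The function names and the `efficiency` parameter are illustrative assumptions.

```python
def pcr_copies(initial_templates, cycles, efficiency=1.0):
    """Idealized copy number after a given number of cycles.

    efficiency=1.0 means perfect doubling each cycle; real reactions
    fall short of this and eventually plateau.
    """
    return initial_templates * (1 + efficiency) ** cycles


def run_time_minutes(cycles, seconds_per_phase=30, phases=3):
    """Approximate cycling time, assuming equal-length temperature phases."""
    return cycles * phases * seconds_per_phase / 60


# One template molecule, 30 perfect cycles of three 30-second phases.
print(pcr_copies(1, 30))
print(run_time_minutes(30))
```

Lowering `efficiency` below 1.0 shows how quickly the final yield drops when doubling is imperfect.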
Figure 6.
A temperature vs. time graph showing two cycles of PCR.
(Original-Harrington-CC BY-NC 3.0)
2.4. AN APPLICATION OF PCR: THE STARLINK AFFAIR
PCR is very sensitive (meaning it can detect very small starting amounts of DNA), and specific
(meaning it can amplify only the target sequence from a mixture of many DNA sequences). Due to
these characteristics, PCR has many practical applications. For example, PCR can detect trace DNA
contaminants in food, air, water or cells. The presence or absence and the type or species of the
contaminant can be identified.
As an example, PCR was used as a tool to test whether genetically modified corn was present in
consumer products on supermarket shelves. Although currently (2013) 85% of corn in the United
States is genetically modified, and contains genes that government regulators have approved for
human consumption, back in 2000, environmental groups showed that a strain of genetically modified
corn, which had only been approved for use as animal feed, had been mixed in with corn used in
producing human food, like taco shells. To do this, the groups purchased taco shells from stores in
the Washington DC area, extracted DNA from the taco shells and used it as a template in a PCR
reaction with primers specific for the unauthorized gene (Cry9C). Their suspicions were confirmed
when they ran this PCR product on an agarose gel and saw a band of size expected for
___________________________________________________________________________
SUMMARY:
• Molecular biology involves the isolation and analysis of DNA and other macromolecules
• Isolation of total genomic DNA involves separating DNA from protein and other cellular components,
for example by ethanol precipitation of DNA.
• PCR can be used as part of a sensitive method to detect the presence of a particular DNA sequence
• PCR can also be used as part of a method to isolate and prepare large quantities of a particular DNA
sequence
• qPCR methodology allows the quantity of DNA product to be measured.
• RT-PCR methodology detects the quantity and quality of the mRNA, which indicates the spatial and
temporal levels of gene expression.
KEY TERMS:
classical genetics thermalcycle
molecular biology denature
molecular genetics anneal
macromolecules extension
lysis thermostable
detergent Taq DNApol
chelating agent electrophoretic agarose gel
EDTA fluorochrome
nuclease Reverse Transcriptase PCR (RT-PCR)
supernatant Temporal level
pellet Spatial level
PCR StarLink affair
primer Cry9C gene
STUDY QUESTIONS:
1) What information, and what reagents would
you need in order to use PCR to detect HIV in a
human blood sample?
2) If you started with 10 molecules of double
stranded DNA template, what is the maximum
number of molecules you would have after
10 PCR cycles?
3) What is present in a PCR tube at the end of a
successful amplification reaction? With this in
mind, why do you usually only see a single,
sharp band on a gel when it is analyzed by
electrophoresis?
INTRODUCTION
A lot of information can be obtained from a visual observation of human chromosomes. This chapter
will discuss two techniques: bright field microscopy and fluorescence in situ hybridization (Figure 1).
While all the examples come from humans these methods can be applied to any organism.
1. BRIGHT FIELD MICROSCOPY
1.1. MAKING A METAPHASE CHROMOSOME SPREAD
The most commonly observed chromosomes are those from white blood cells in the metaphase
stage of mitosis. The protocol is as follows:
Step 1 Obtain a sample of whole blood from a person and add to culture medium.
Step 2 Add lymphocyte growth factor proteins to stimulate white blood cells. (Remember, red blood
cells lack nuclei and chromosomes.) After three days, the cells have reproduced several times
and become more numerous and many are in the process of mitosis.
Step 3 Add colcemid, a microtubule inhibitor. The cells continue through the cell cycle and will
arrest in metaphase. This is because a cell can't enter anaphase without functional
microtubules. More white blood cells will arrest in metaphase. Metaphase cells are best
because their chromosomes are condensed and there is no nuclear envelope to get in the way.
Step 4 Transfer the cells to a hypotonic environment. Water enters the cells and they swell up and
become delicate. Fix the cells with a mix of acetic acid and methanol.
Step 5 Drop the solution containing the cells onto a glass slide. The cells burst on contact, leaving
behind clusters of chromosomes for each cell.
Step 6 Soak the slide in a chromosome staining solution. If Giemsa is used by itself the
chromosomes become a uniform dark purple colour. If Giemsa and trypsin are used together
the chromosomes take on dark purple and light purple bands. This pattern of Giemsa-dark
and Giemsa-light bands is consistent and can be used to identify chromosomes and
chromosome regions. These protocols are known as Giemsa staining and G-banding,
respectively.
Step 7 Observe the slide with a bright field microscope.
Figure 2 shows an example of Giemsa stained chromosomes. These images are called metaphase
chromosome spreads because each cell's chromosomes are randomly displayed on the surface of the
slide.
1.2. USING BRIGHT FIELD MICROSCOPY TO DIAGNOSE DOWN SYNDROME
Recall that Down syndrome is usually due to a person having three copies of chromosome 21, a
situation known as trisomy-21. If a newborn has the physical and mental properties suggestive of
Down syndrome a physician will likely order a chromosome test. A cytogeneticist will take a blood
sample and make a slide of metaphase spreads. Each spread would show 47 chromosomes in total
for a person with trisomy-21. Chromosome 21 can be recognized by its characteristic length,
centromere location, and Giemsa banding pattern. Idiograms are maps showing this information
(Figure 3).
To confirm that it is in fact trisomy-21 at least one of these spreads will have its chromosomes
arranged into a karyogram pattern (see Chapter 15 on Human Chromosomes). Figure 4 shows what
the cytogeneticist is looking for. This karyogram is made with G-banded chromosomes. The extra
chromosome is indeed number 21 so this person has a karyotype of 47,XY,+21.
This table summarizes the terms introduced in this section. While these are the definitions proposed
by the International Standing Committee on Human Cytogenetic Nomenclature, some people use the
terms interchangeably.
Term                          Definition
metaphase chromosome spread   a picture of all the chromosomes from a cell as they appear on the slide
karyogram                     a picture of all the chromosomes from a cell rearranged into the standard pattern
idiogram                      a map of one or more chromosomes showing its Giemsa banding pattern
karyotype                     a written description of a person's chromosome composition
Bright field microscopy has its limitations though - it only works with mitotic chromosomes and many
chromosome rearrangements are either too subtle or too complex for even a skilled cytogeneticist to
discern. Even with a more powerful phase contrast or DIC microscope there are limits to what can be
seen using G-banding. There is a more powerful technique, one based upon hybridization probes,
the topic of the next section.
Figure 2.
A human metaphase chromosome spread. This image shows the 46 chromosomes that came from a
single cell. There will be dozens of collections of chromosomes like this over the entire slide.
(Wikipedia-Steffen Dietzel-CC BY-SA 3.0)
Figure 3.
An idiogram of human chromosome 21. Note that only a single chromatid is shown even though the
map is of a replicated metaphase chromosome. The constriction near the top is the centromere and
the Giemsa-dark and Giemsa-light bands are coloured black and white, respectively.
(ghr-U.S. National Library of Medicine-PD)
Figure 4.
A karyogram from a male with Down syndrome / trisomy-21.
(Wikipedia- U.S. Department of Energy Human-PD)
2. HYBRIDIZATION PROBES
2.1. WHAT HYBRIDIZATION PROBES ARE
If you have a large amount of DNA, for example an entire chromosome or a large collection of
restriction fragments, how could you identify a single gene? One method is to use a hybridization
probe. These are a collection of short pieces of single stranded DNA that can bind to the target
gene. Both the probe and target need to be single stranded so they can pair using complementary
base pairing (As with Ts and Gs with Cs). In short, the procedure is to denature the target DNA, add
the probe, and then detect where the probe has stuck (Figure 5). This section will discuss making
probes and using them. Molecular geneticists use hybridization probes in different protocols. Later in
this chapter we will cover fluorescence in situ hybridization (FISH) while Chapter 34 covers
Southern blotting.
Figure 5.
How a hybridization probe can reveal the location of target DNA.
(Original-Harrington- CC BY-NC 3.0)
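Complementary base pairing between a probe and its target can be mimicked with simple string handling. The sketch below assumes perfect matches only (real hybridization tolerates some mismatch), and the sequences and function names are hypothetical.

```python
# Translation table for complementary bases: A pairs with T, G pairs with C.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the strand that would base-pair with `sequence`, written 5' to 3'."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def find_probe_sites(target_strand, probe):
    """0-based positions on a denatured target strand where the probe can pair."""
    site = reverse_complement(probe)
    positions = []
    start = target_strand.find(site)
    while start != -1:
        positions.append(start)
        start = target_strand.find(site, start + 1)
    return positions


# Hypothetical single-stranded target and a short probe complementary to part of it.
target = "GGATCCATGCAAGTTT"
probe = "CTTGCAT"
print(find_probe_sites(target, probe))
```

The probe is searched for as its reverse complement because the two strands of a duplex run in opposite directions.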
2.2. MAKING HYBRIDIZATION PROBES WITH A RANDOM PRIME LABELLING REACTION
There are a few methods to make hybridization probes but the most common is a random prime
labelling reaction. Essentially, we use DNA Polymerases to make our probe for us in a
microcentrifuge tube. The reaction contains the following:
• Template DNAs. This can be a plasmid, a PCR product, or a BAC (bacterial artificial
chromosome). The template DNA must be denatured to make it single stranded first.
• DNA Polymerases. E. coli DNA Pol works well.
• Random oligonucleotides ("oligos"). These are millions of different pieces of single stranded
DNA. They are made synthetically and purchased as a mixture.
• Regular DNA nucleotides. For example dATP, dGTP, and dTTP.
• Labelled DNA nucleotides. For example [Cy3]-dCTP.
You have to provide the template DNA but each of the other components can be purchased from
Biotechnology companies. The only expensive components are the labelled nucleotides, of which
there are many types (Figure 6). Radioactive ones are the best way to do Southern blotting and
fluorescent ones are at the heart of FISH.
Figure 6.
Two examples of labelled nucleotides.
(Original-Harrington- CC BY-NC 3.0)
Figure 7.
A random prime labelling reaction can use a BAC to make a hybridization probe. The probe will
hybridize to the chromosome region originally used to make the BAC.
(Original-Harrington- CC BY-NC 3.0)
The reaction starts when the mixture is placed at 37°C. The oligos bind randomly to the denatured
template DNA and act as primers. From these primers the DNA Pols make new DNA strands.
Because some of the nucleotides are labelled the new DNA strands are labelled as well. After an hour
or so the reaction is halted. When it is time to use the probe it is denatured. The easiest way to
denature it is 5 minutes at 100°C and 5 minutes on ice. What we call the probe are the new DNA
strands, each is about 100 nucleotides long, each contains several labelled nucleotides, and each is
complementary to the DNA used as the template.
2.3. USING HYBRIDIZATION PROBES
Figure 7 summarizes the random prime labelling reaction. How we make our probe will depend
upon what we are trying to detect. The typical sizes of the various template DNAs we can use are:
• PCR product - up to 5000 bp
• plasmid - inserts can be up to 15 000 bp
• BAC - inserts can be up to 350 000 bp
If we want to detect a single gene a hybridization probe made from a plasmid or PCR product will
suffice. But if we want to detect a large chromosome region we have to use BACs. Sometimes several
BACs are used if we want to detect an entire chromosome. BACs for humans and other commonly
studied organisms can be purchased from Biotechnology companies.
This raises a very important point. No matter what the template DNA was, the probe DNA will be short
pieces of single stranded DNA, typically only 100 nucleotides long. Think of a probe as a cloud of tiny
DNA molecules that can stick along the length of a much larger target region. Chapter 34 describes
how hybridization probes are used in Southern blotting while the next section describes their use
in FISH.
3. FLUORESCENCE IN SITU HYBRIDIZATION (FISH)
3.1. HOW FISH WORKS
The solution to the lack of resolution in G-banding is fluorescence in situ hybridization (FISH). The
technique is similar to a Southern blot in that a single stranded DNA probe is allowed to hybridize
to denatured target DNA (see Chapters 30 & 34). However, instead of the probe being radioactive it
is fluorescent and instead of the target DNA being restriction fragments on a nylon membrane it is
denatured chromosomes on a glass slide. Because there are several fluorescent colours available it is
common to use more than one probe at the same time. Typically, the chromosomes are also coated
with a fluorescent stain called DAPI, which gives them a uniform blue colour. If the chromosomes
have come from a mitotic cell it is possible to see all forty-six of them spread out in a small area.
Alternatively, if the chromosomes are within the nucleus of an interphase cell they appear together
as a large blue sphere. In either case the results are observed with either a standard or confocal
fluorescence microscope.
3.2. USING FISH TO DIAGNOSE DOWN SYNDROME
Most pregnancies result in healthy children. However, in some cases there is an elevated chance that
the fetus has trisomy-21. Older women are at a higher risk because the non-disjunction events that
lead to trisomy become more frequent with maternal age. The second consideration is what the
fetus looks like during an ultrasound examination. Fetuses with trisomy-21 and some other
chromosome abnormalities have a swelling in the back of the neck called a nuchal translucency. If
either or both factors are present the woman may choose to undergo an amniocentesis test. In this
test, some amniotic fluid is withdrawn so that the fetal cells within it can be examined. Figure 8
shows a positive result for trisomy-21. This diagram is based upon actual results. The colours are:
• Blue. The DNA has been stained with DAPI.
• Red. This hybridization probe binds to the centromere of chromosome 21.
• Green. This hybridization probe binds to the centromere of the X chromosome.
Based upon the available information this fetus has two X chromosomes and three chromosome 21s
and therefore has a karyotype of 47,XX,+21.
3.3. USING FISH TO DIAGNOSE CRI-DU-CHAT SYNDROME
A physician may suspect that a patient has a specific genetic condition based upon the patient's
physical appearance, mental abilities, health problems, and other factors. FISH can be used to
confirm the diagnosis. For example, Figure 9 shows a positive result for Cri-du-chat syndrome. This
diagram is based upon actual results. Cells from a patient's blood were prepared to show an
interphase nucleus (a) and mitotic chromosomes (b). There are three colours shown in the diagram:
• Blue. The DNA has been stained with DAPI.
• Green. This hybridization probe binds within the short arm of chromosome 5. This region is
absent in people with Cri-du-chat syndrome.
• Red. This hybridization probe binds within the long arm of chromosome 5. It is used to identify
chromosome 5.
The results show both chromosome 5s have intact long arms but one is missing part of its short arm.
This child has the karyotype 46,XX,del(5), indicative of Cri-du-chat syndrome.
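Karyotype statements such as 47,XX,+21 follow a simple comma-separated pattern: total chromosome count, sex chromosomes, then any abnormalities. The sketch below splits a statement into those three parts. It handles only the simple forms used in this chapter and is not a full parser for cytogenetic nomenclature; the function name is hypothetical.

```python
def parse_karyotype(karyotype):
    """Split a simple karyotype statement into count, sex chromosomes, abnormalities."""
    parts = [part.strip() for part in karyotype.split(",")]
    return {
        "count": int(parts[0]),          # total number of chromosomes
        "sex": parts[1],                 # sex chromosome complement, e.g. "XX"
        "abnormalities": parts[2:],      # anything listed after the sex chromosomes
    }


for statement in ["47,XX,+21", "46,XX,del(5)", "46,XY"]:
    print(statement, parse_karyotype(statement))
```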
Figure 8.
Confirmation of Down syndrome in a fetus using amniocentesis and FISH.
Based upon: Antonarakis, S. E. et al. 2004. Chromosome 21 and Down syndrome: From genomics to
pathophysiology. Nature Reviews Genetics 5:725-738 PubMed ID: 15510164.
(Original-Harrington- CC BY-NC 3.0)
Figure 9.
Confirmation of Cri-du-chat syndrome in a child using interphase cells, metaphase cells, and FISH.
Based upon: Fang J.-S. et al. 2008. Cytogenetic and molecular characterization of three-generation
family with chromosome 5p terminal deletion. Clinical Genetics 73:585-590 PubMed ID: 18400035
(Original-Harrington- CC BY-NC 3.0)
___________________________________________________________________________
SUMMARY:
• Human chromosomes can be observed in either metaphase chromosome spreads or within intact
nuclei.
• DNA stains such as Giemsa and DAPI bind to DNA non-specifically.
• Hybridization probes bind to DNA at specific target sites. They are collections of DNA molecules that
are (i) short, (ii) single stranded, (iii) contain a labelled nucleotide, and (iv) are complementary to a
target region.
• Chromosomes can be prepared with either Giemsa staining or G-banding and then observed with a
visible light microscope.
• Chromosomes can be prepared with both DAPI staining and fluorescently labelled hybridization probes
and then observed with a fluorescence microscope.
• Chromosome number or structural abnormalities can be recognized in a metaphase chromosome
spread, an intact nucleus, or a karyogram diagram. They can be summarized in a karyotype statement.
KEY TERMS:
Giemsa hybridization probe
Giemsa staining fluorescence in situ hybridization (FISH)
G-banding plasmid
metaphase chromosome spread PCR product
idiogram bacteria artificial chromosome (BAC)
karyogram labelled nucleotide
karyotype DAPI
QUESTIONS:
1) Giemsa and DAPI are both used to label DNA.
Why can't we use only Giemsa or only DAPI in
human cytogenetics?
2) What changes would you have to make to the
karyogram in Figure 4 to make it show the
chromosomes from the patient with Cri-du-chat
syndrome described in Figure 9?
3) What are the similarities and differences
between a PCR reaction and a random prime
labelling reaction?
4) In nucleotide triphosphates, the phosphates are
named alpha, beta, and gamma. In Figure 6
why is it the alpha phosphate that is
radioactive?
5) What would Figure 8 look like if it also showed
metaphase chromosomes from another cell?
6) Some men have an extra Y-chromosome. What
is their karyotype? Describe as many ways as
you can to detect this chromosome
abnormality.
7) Some women have an extra X-chromosome.
What is their karyotype? Describe as many
ways as you can to detect this chromosome
abnormality.
8) Design a FISH based experiment to find out if
someone is a 47,XXX female or a 47,XYY male.
Figure 3.
An automated Sanger
sequencing reaction. The regular
dNTPs are shown here as black
while the ddNTPs are in colour.
(Original-Deyholos- CC BY-NC
3.0)
Using automated Sanger sequencing to sequence a plasmid.
(Original-Harrington- CC BY-NC 3.0)
The results from a sequencing reaction are presented as a chromatogram. While Figure 6 only
shows 9 peaks, a successful sequencing reaction will generate about 700 nucleotides worth of data.
The figure shows the results from a single tube but in fact there can be 48 or 96 tubes in total. Thus,
in a single “run” an ABI 3730 machine can sequence up to 67,000 bp of DNA.
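The per-run figure above follows from multiplying the number of tubes by the read length. A minimal sketch of that arithmetic, using only the numbers quoted in the text (the function name is hypothetical):

```python
def bases_per_run(tubes, bases_per_read=700):
    """Approximate nucleotides sequenced in one run of parallel Sanger reactions."""
    return tubes * bases_per_read


# The two run sizes mentioned in the text.
for tubes in (48, 96):
    print(tubes, "tubes:", bases_per_run(tubes), "bp")
```

With 96 tubes this gives 67,200 bp, which the text rounds to 67,000 bp.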
Figure 8.
Using automated Sanger sequencing to sequence a gene. The site of a point mutation is shown here
as 'm'.
(Original-Harrington- CC BY-NC 3.0)
1.3. USING AUTOMATED SANGER DNA SEQUENCING TO SEQUENCE A PLASMID
Making a new recombinant plasmid takes time and money. You will want to confirm that it has the
DNA sequence it should before you use it for important experiments. A simple way to find out is
to sequence it. Let's say you have put a 3.0 kb
the first so called next-generation sequencing machine.
The Illumina MiSeq in the MBSU (Biological Sciences Department, U. of Alberta). The door has been
opened to show where the DNA sample and reaction mixtures are loaded.
(Original-Harrington- CC BY-NC 3.0)
Price   $4.75   $1,250 - $1,850   $2,050 - $5,150
Figure 11.
Assembling the DNA sequence of
a chromosome from many
smaller sequences.
(NHGRI-Darryl Leja-PD)
On the other hand, let's say you wanted to sequence your own DNA. Even if you don't consider the
time and cost of making the BAC clones it would still cost millions of dollars to do all of the
sequencing reactions in the ABI 3730. Conversely the MBSU could use their NextSeq 500 and have
everything done in two days for about $4 000. Each of your 46 chromosomes would be sequenced
about 30 times each. A more expensive machine, the Illumina HiSeq, can sequence human DNA for
about $1 000 a person.
2.4. USING NEXT-GENERATION DNA SEQUENCING TO SEQUENCE HUMANS
Even though we know the average human DNA sequence, each of us is unique. There are two
reasons why human DNA continues to be sequenced. (Table 2)
2.5. USING NEXT-GENERATION DNA SEQUENCING TO SEQUENCE OTHER ORGANISMS
Next-generation sequencing has made it feasible to sequence anything. Here are just a few examples.
(Table 3)
Table 2. Using next-generation sequencing to sequence humans.
Use of next-generation sequencing   Description
Personalized genomics               If we sequence a person's DNA it can reveal information about
                                    their susceptibility to disease and their response to various
                                    medical treatments.
Tumour cell sequencing              If a person has cancer it is now possible to sequence individual
                                    cancer cells. This has revolutionized how physicians help their
                                    patients. Instead of treatments based upon the location of
                                    tumours, treatments can now be designed around the genetic
                                    defects that lead to the cells becoming cancerous in the first
                                    place.
___________________________________________________________________________
SUMMARY:
• Automated Sanger sequencing became commonplace in 1987 and is still used today. It is used to
sequence plasmids and PCR products. The most popular machines are Applied Biosystems' ABI 3730s.
• Next-generation sequencing began in 2004 as a better way to sequence whole genomes. There are
several competing technologies, for example Illumina's MiSeq machine and its sequencing by synthesis
technology.
• Sequencing centres offer both types of sequencing today.
• Sequencing any DNA molecules, large or small, is now fast and inexpensive.
KEY TERMS:
automated Sanger sequencing Illumina MiSeq
Applied Biosystems ABI 3730 Illumina NextSeq 500
DNA Polymerases sequencing by synthesis
primer sequence assembly
regular dNTPs personalized genomics
fluorescently-labelled ddNTPs tumour cell sequencing
capillary tube electrophoresis de novo sequencing
chromatogram metagenomics
next-generation sequencing RNA Seq
STUDY QUESTIONS:
1) What would the chromatogram look like if you
set up an automated Sanger sequencing
reaction with only template, primers,
polymerase, and fluorescent ddNTPs?
2) How could you use DNA sequencing to identify
new species of marine microorganisms?
3) An alternative name for automated Sanger
sequencing is dye-terminator sequencing. Why
is this term appropriate?
4) Ten years ago, it would have cost $100,000,000
to sequence your DNA. Today it would cost as
little as $1,000. Why did the cost go down so
much?
5) Why haven't next-generation machines
completely replaced the first generation of
automated DNA sequencers?
6) True or false: Automated pyrosequencing and
sequencing by synthesis are both considered
next-generation DNA sequencing technologies.
Figure 1.
Agarose gel being placed on a Southern
blotting transfer set up. The DNA in the
gel will be transferred to a membrane,
placed on the gel, via the movement of a
buffer solution.
(WikimediaCommons-National Cancer
Institute-PD)
INTRODUCTION
The separation of DNA, RNA, and polypeptides based on size is a useful biochemical technique that uses migration through a gel to fractionate these macromolecules. For example, bands of DNA form in an electrophoretic gel if many of the DNA molecules are of the same size, such as following a PCR reaction or restriction digestion of a plasmid. In other situations, such as after restriction digestion of chromosomal (genomic) DNA, there will be a very large number of different sized fragments in the digest, and thus it will appear as a continuous smear of DNA, rather than distinct bands, in a lane of a gel. In the genomic DNA case, it is necessary to use additional techniques to detect the presence of a specific DNA sequence within the smear of DNA separated on an electrophoretic gel. This can be done using a DNA sequence probe in a "Southern Blot". In this chapter, we will describe Southern blots, as well as other blotting techniques, such as Northern Blots (RNA) and Western Blots (protein), that use similar principles to detect those macromolecules. (The Eastern and SouthWestern blots will not be described here.)
1. SOUTHERN BLOT (DNA)
A Southern blot (also called a Southern transfer, a name that more accurately describes the procedure) is named after Ed Southern, who invented it in the mid-1970s. This blotting method is used to identify specific DNA fragments, size-separated by gel electrophoresis, that cross-hybridize with a labeled probe (often radioactive). For example, the presence or absence of a restriction fragment of a particular size can be identified in a sample of genomic DNA digested with a specific restriction enzyme.
There are multiple steps in the Southern blot procedure (Figure 2). In the first step, DNA is digested with restriction enzymes and separated by agarose gel electrophoresis. Then a sheet of nylon, nitrocellulose, or similar material (the membrane) is placed against the gel (Figure 1). The DNA, in its separated position (bands or smear), is then transferred to the membrane by drawing a buffer solution out of the gel, in a process called blotting. At this point the blotted DNA is usually covalently attached to the membrane by brief exposure to UV light or by drying. The transfer to a sturdy membrane is necessary because the fragile gel would fall apart during the next two steps in the process. Next, the membrane is bathed in an alkali solution to denature the attached DNA (double-stranded made single-stranded), and this is then neutralized. The membrane is added to a hybridization solution containing a small amount of labeled single-stranded probe DNA that is complementary to a target sequence on the membrane. This probe DNA is labeled using either fluorescent or radioactive molecules. If the hybridization is performed properly, the probe DNA will form a stable duplex only with those DNA molecules on the membrane to which it is complementary. Then, the unhybridized probe is washed off, leaving the hybridized radioactive or fluorescent signal bound. This remaining signal will appear as a distinct band when appropriately detected (fluorescent or radioactive). The band represents the presence of a particular DNA sequence, complementary to the probe sequence, within the mixture of DNA fragments.
The probe's specificity comes from sequence-specific hybridization (which requires complementarity). However, variation in hybridization temperature and washing solutions can alter the stringency of the probe's hybridization. Under maximum stringency hybridization conditions (higher temperature, low salt), probes will only hybridize efficiently with target sequences that are perfectly complementary (maximum number of hydrogen bonds). At lower stringency (lower temperature, higher salt), probes will be able to hybridize to and detect sequences that they do not match exactly, but that have some mismatches along the sequence.
Southern blotting is useful not only for detecting the presence of a DNA sequence within a mixture of DNA molecules, but also for determining the size of a restriction fragment in a DNA sample. One advantage is that Southern blots are able to detect fragments larger than those normally amplified by PCR. Also, long DNA probes can detect fragments that may be relatively dissimilar to the original sequence. Applications of Southern blotting will be discussed further in the context of molecular markers in a subsequent chapter.
Southern blotting was invented long before PCR, but PCR has replaced blotting in many applications because of its simplicity, speed, and convenience.
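The effect of stringency described above can be illustrated with a small sketch. It is only a toy model: stringency is represented by the number of mismatches tolerated between the probe and a site, not by temperature or salt, and the sequences and function names are hypothetical ones chosen for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def probe_binding_sites(probe, strand, max_mismatches):
    """Return start positions on `strand` where the probe could hybridize.

    A site counts as bound when it differs from the probe's reverse
    complement at no more than `max_mismatches` positions. This is a toy
    stand-in for stringency, not a thermodynamic model.
    """
    target = reverse_complement(probe)
    sites = []
    for start in range(len(strand) - len(target) + 1):
        window = strand[start:start + len(target)]
        mismatches = sum(1 for a, b in zip(window, target) if a != b)
        if mismatches <= max_mismatches:
            sites.append(start)
    return sites


# Hypothetical sequences, chosen only for illustration.
probe = "GATTACAGATTACA"
exact_fragment = "CCC" + "TGTAATCTGTAATC" + "GGG"    # perfect complement
related_fragment = "CCC" + "TGTAAACTGTAATC" + "GGG"  # one mismatched base

high = 0  # "high stringency": perfect matches only
low = 2   # "low stringency": up to two mismatches tolerated

print(probe_binding_sites(probe, exact_fragment, high))
print(probe_binding_sites(probe, related_fragment, high))
print(probe_binding_sites(probe, related_fragment, low))
```

In this sketch the related fragment is found only when mismatches are tolerated, which mirrors how a low-stringency wash lets a probe detect similar, but not identical, sequences.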
Figure 2.
A diagram of Southern blotting. Genomic DNA that has been digested with a restriction enzyme is separated on an agarose gel, and then the DNA is transferred from the gel to a nylon membrane (grey sheet) by blotting. The DNA is immobilized on the membrane, and then probed with a radioactively labeled DNA fragment that is complementary to a target sequence. After stringent washing, the blot is exposed to X-ray film to detect the size of the fragment to which the probe is bound. In this case, the probe bound to different-sized fragments in lanes 1, 2, and 3. In the last image the orange shading represents the position of the digested DNA, but it is not actually present on the X-ray film.
(Original-J. Locke-CC BY-NC 3.0)
Figure 5.
Comparison of Southern, Northern, and Western blots. In the cell at the top, DNA is in blue, RNA in red, and polypeptides in green.
Size and amount of DNA, RNA, and polypeptides can be determined using similar blotting methods. A size marker lane is shown
on the left of each gel to estimate molecule size. Although a eukaryotic cell is shown, the same methods can be applied to
prokaryotes, too. (Original-Locke-CC BY-NC 3.0)
___________________________________________________________________________
SUMMARY:
• Southern blotting involves detecting the presence of DNA fragments, such as those from total genomic
DNA digested with a restriction enzyme, separated by agarose gel electrophoresis and transferred to a
membrane that is then probed with a labeled nucleic acid probe.
• With Northern Blots, the same principle as in Southern blotting is used to detect single-stranded RNA of
interest.
• Western Blots are also similar but use acrylamide gels to separate proteins and the membrane is probed
with antibodies to detect the molecule of interest.
KEY TERMS:
Southern Blot hybridization
Southern transfer probe
Northern Blot washing
Western Blot stringency
membrane mismatch
blotting primary antibody
denaturation secondary antibody
STUDY QUESTIONS:
1) Research shows that a particular form of cancer
is caused by a 200bp deletion in a particular
human gene that is normally 2kb long. Only one
mutant copy is needed to cause the disease – it’s
dominant.
a) Explain how you would use Southern
blotting to diagnose the disease.
b) How would any of the blots appear if you
hybridized and washed at very low
temperature (low stringency)?
2) Refer to question 1.
a) Explain how you would detect the presence
of the same deletion using PCR, rather than
a Southern blot.
b) How would PCR products appear if you
annealed at very low temperature?
3) You have a PCR fragment for a human olfactory
receptor gene (perception of smells). You want
to know what genes a dog might have that are
related to this human gene.
a) How can you use your PCR fragment and
genomic DNA from a dog to find this out?
b) Do you think dogs have more or fewer of these
genes?
CHAPTER 35 – DNA VARIATION STUDIED WITH SOUTHERN BLOTS
INTRODUCTION
Imagine that you could compare the complete genomic DNA sequence of any two people you meet today. Although their sequences would be very similar on the whole, they would certainly not be identical at each of the ~3 billion base pair positions you examined (unless, perhaps, your subjects were identical twins, although even they have some somatic differences). In fact, the genomic sequences of almost any two unrelated people differ at millions of nucleotide positions dispersed throughout their genomes. Some of these differences would be found in the regions of genes that code for proteins. Others might affect the amount of transcript that is made for a particular gene. A person's appearance, behavior, health, and other characteristics depend in part on these polymorphisms.
Most of these nucleotide differences, however, have no effect at all. They have no effect on gene sequences or expression, because they occur in regions of DNA that neither encode proteins nor regulate the expression of genes. Nevertheless, these polymorphisms are very useful because they can be used as molecular markers, which can be mapped just like typical genetic markers. Molecular markers are more numerous and can be used in medicine, forensics, ecology, agriculture, evolution, and many other fields. In most situations, molecular markers obey the same rules of inheritance that we have already described, and so can be used to create detailed genetic maps with which to identify gene/disease locations through genetic linkage.
1. MUTATION AND POLYMORPHISM
We have previously noted that an important property of DNA is its fidelity: most of the time it accurately passes the same information from one generation to the next. However, DNA sequences can also change. Changes in DNA sequences are called mutations. If a mutation changes the phenotype of an individual, the individual is said to be a mutant. Naturally occurring, but rare, sequence variants that are clearly different from a normal, wild-type sequence are also called mutations. On the other hand, as discussed above, many naturally occurring variants exist for traits for which no clearly normal type can be defined; thus, we use the term polymorphism to refer to variants of DNA sequences (and other phenotypes) that co-exist in a population at relatively high frequencies (>1%). Polymorphisms and mutations arise through similar biochemical processes, but the use of the word "polymorphism" avoids implying that any particular allele is more normal or abnormal. For example, a change in a person's DNA sequence that leads to a disease such as hemophilia is appropriately called a mutation, but a difference in DNA sequence that explains whether a person has red hair rather than brown or black hair is an example of polymorphism. Molecular markers are a particularly useful type of polymorphism for many areas of genetics research. Mutations of DNA sequences can arise in many ways.
2. MOLECULAR MARKERS – SNPS AND VNTRS
2.1. ORIGINS OF SINGLE BASE PAIR POLYMORPHISMS
Polymorphisms can be single base pair differences between or among individuals in a population (Figure 2). These are referred to as Single Nucleotide Polymorphisms (SNP) or "SNiPs". They are distributed randomly throughout the genome. SNPs occur about once in every 300 base pairs, on average, which means there are roughly 10 million SNPs in the human genome. SNPs are usually identified by sequencing the DNA of multiple individuals and comparing the sequences to find the different bases.
then the expansion and contraction of the number of repeats are called Simple Sequence Repeat (SSR) polymorphisms or Short Tandem Repeats (STR). If they are longer (10-50 base pairs) then they are called Variable Number Tandem Repeats (VNTR) (Figure 3). The difference in names here is just a matter of length.
Because of the tandem nature of these sequences and their propensity for addition/deletion, the number of repeats is typically very variable in a population of individuals. The number of repeats defines an allele, so there will be many alleles in a population and these loci will be highly polymorphic. This leads to a high degree of heterozygosity in the population, which is good for genetic mapping of these markers.
Figure 3.
Simple Sequence Repeat (SSR) polymorphisms or Variable Number Tandem Repeat (VNTR) polymorphism. Each red box represents a repeat. The variant region is marked in blue (increased number of repeats), and each variant sequence is arbitrarily assigned one of two allele labels.
(Original-Deyholos-CC:AN)
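The connection between many alleles and high heterozygosity can be made concrete with a short calculation. The sketch below uses the standard expected heterozygosity, one minus the sum of the squared allele frequencies, and assumes random mating; the allele frequencies are hypothetical values chosen for illustration, not data from the text.

```python
def expected_heterozygosity(allele_frequencies):
    """Expected fraction of heterozygotes at a locus under random mating.

    Homozygotes for an allele of frequency p are expected at frequency
    p * p, so heterozygotes make up whatever remains.
    """
    total = sum(allele_frequencies)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_frequencies)


# Hypothetical loci: a two-allele marker with one rare allele, and a
# repeat locus with ten equally common repeat-number alleles.
two_allele_locus = expected_heterozygosity([0.9, 0.1])
ten_allele_locus = expected_heterozygosity([0.1] * 10)

print(round(two_allele_locus, 2), round(ten_allele_locus, 2))
```

Under these assumptions the ten-allele repeat locus gives far more heterozygotes than the two-allele locus, which is why highly polymorphic repeat loci are informative in mapping crosses.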
as a template might produce a 120bp band (Figure 4). An SNP (single nucleotide polymorphism) is an example of polymorphism that varies in nucleotide identity, but not length. SNPs are the most common of any molecular markers, and the genotypes of thousands of SNP loci can be determined in parallel, using new, hybridization-based instruments. Note that the alleles of most molecular markers are co-dominant, since it is possible to distinguish the molecular phenotype of a heterozygote from either homozygote.
3. RESTRICTION FRAGMENT LENGTH POLYMORPHISM (RFLP)
Another form of molecular marker is the Restriction Fragment Length Polymorphism (RFLP). This polymorphism takes advantage of differences in the length of restriction enzyme (RE) fragments (Figure 4).
Figure 4.
Restriction Fragment Length Polymorphism (RFLP). The variant sequence is marked in blue, and each variant sequence is arbitrarily assigned one of two allele labels.
(Original-Deyholos-CC:AN)
Here, the change in DNA sequence introduces or abolishes a restriction enzyme site. This will change the length of a restriction enzyme fragment that can be detected by Southern Blot of that genomic DNA. While the loss or gain of a RE site is the typical cause of RFLPs, other changes in RE fragment length can be caused by the insertion of mobile genetic elements such as transposons (inserted more or less randomly into chromosomal DNA) or by DNA deletions or duplications.
4. CONSTRUCTION OF GENETIC LINKAGE MAPS
In classical Mendelian genetics, two loci can be mapped relative to one another: they will either assort independently (unlinked) or will be linked, and the frequency of recombination will determine their distance apart. Molecular markers can be used in the same manner, both with each other and in combination with classic Mendelian markers.
By calculating the recombination frequency between pairs of molecular markers, a map of each chromosome can be generated for almost any organism. These maps are calculated using the same mapping techniques described previously for genes; however, the high density of molecular markers and the ease with which they can be genotyped make them more useful than the "old-style" visual phenotype method for constructing genetic maps. These more detailed maps are useful in further studies, including map-based cloning of protein coding genes that were identified by mutation, or of disease loci.
Figure 6 diagrammatically shows a set of hypothetical results of parentals and F2 progeny for a mapping cross. This type of experiment is needed to map the relative distance between two loci, A and B, which are part of a series of loci along a chromosome. Then these loci can be used to test for linkage to disease or other traits in a genome.
Figure 5.
Determining the genotype of an individual
at a single SSR locus using a specific pair of
PCR primers and agarose gel
electrophoresis.
S= size standard
(Original-Deyholos-CC:AN)
Figure 6.
Measuring recombination
frequency between two molecular marker
loci, A and B. A different pair of primers is
used to amplify DNA from either parent
(P) and 15 of the F2 offspring from the
cross shown. Recombinant progeny will
have the genotype A1A2B2B2 or
A2A2B1B2. Individuals #3, #8, #13 are
recombinant, so the recombination
frequency is 3/15=20%. Linkage!
(Original-Deyholos-CC:AN)
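The arithmetic in the Figure 6 caption can be written out as a short sketch. The individual numbers and the total of 15 progeny come from the caption; the function name and the 50% cut-off used to label the loci as linked are illustrative assumptions.

```python
def recombination_frequency(recombinant_count, total_progeny):
    """Fraction of scored progeny that are recombinant."""
    if total_progeny <= 0:
        raise ValueError("total_progeny must be positive")
    return recombinant_count / total_progeny


# Individuals scored as recombinant in the Figure 6 caption.
recombinant_individuals = [3, 8, 13]
progeny_scored = 15

rf = recombination_frequency(len(recombinant_individuals), progeny_scored)

# Assumption for illustration: loci are called linked when the
# recombination frequency is below the 50% expected for independent
# assortment.
linked = rf < 0.5

print(f"recombination frequency = {rf:.0%}, linked = {linked}")
```

With only 15 progeny the estimate is rough; a real mapping cross would score many more individuals before drawing a firm conclusion.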
correlation is to obtain genomic DNA samples from hundreds of individuals with a particular disease, as well as samples from a control population of healthy (non-afflicted) individuals. The genotype of each individual is scored at hundreds or thousands of molecular marker loci (e.g. SNPs), to find alleles that are usually present in persons with the disease, but not in healthy subjects. The molecular marker is presumed to be tightly linked to the gene that causes the disease, although this protein-coding gene may itself be as yet unknown. The presence of a particular molecular polymorphism may therefore be used to diagnose a disease, or to advise an individual of susceptibility to a disease. This is covered in more detail in Chapter 37.
Molecular markers may also be used in a similar way in agriculture to track desired traits in crops or livestock. For example, markers can be identified by screening both the traits and molecular marker genotypes of hundreds of individuals. Markers that are linked to desirable traits can then be used during breeding to select varieties with economically useful combinations of traits, even when the genes underlying the traits are not known.
__________________________________________________________________________
SUMMARY:
• Natural variations in the length or identity of DNA sequences occur at millions of locations throughout
most genomes.
• DNA polymorphisms are often neutral, but because of linkage may be used as molecular markers to
identify regions of genomes that contain genes of interest.
• Molecular markers are useful because of their neutrality, co-dominance, density, allele frequencies,
ease of detection, and expression in all tissues.
• Molecular markers can be used for any application in which the identity of two DNA samples is to be
compared, or when a particular region of a chromosome is to be correlated with inheritance of a trait.
KEY TERMS:
molecular marker SNP
repetitive DNA RFLP
SSR neutral mutation
SSLP
VNTR
STUDY QUESTIONS:
For the next few questions, suppose that you have
a 1.0kbp fragment from the human genome. You
are told it contains only unique sequence (no
repeated DNA sequences such as transposable
elements or Alu sequences). You label this
fragment and use it to probe a Southern blot
containing human genomic DNA (one person)
digested with EcoRI, HindIII, and BamHI in lanes 2,
3, and 4, respectively. Lane 1 contains a size
marker.
1) Will the probe only show one band per lane in
DNA from the individual if they are
homozygous for the region being probed?
2) What if the individual is heterozygous for this
region?
3) What if you examined the genomic DNA of 100
different individuals in a similar manner? Would
they all be expected to have the same pattern?
4) What would you expect if the probe was not
unique, but instead had an Alu repeat within
the 1.0kbp fragment?
5)
• Using PCR it is easy to determine the alleles that each person has.
The rest of this chapter will discuss how we can determine which alleles a person has and how we can use these results for DNA fingerprinting.
2. DETECTING STRS WITH PCR AND AGAROSE GEL ELECTROPHORESIS
Let's say we have a person who is a 7/12 heterozygote at the CSF1PO site. We could find out by isolating their genomic DNA, amplifying the region using standard PCR, and then running the PCR products on an agarose gel. We would get two bands: the faster moving band would be the smaller PCR products from the 7 allele and the slower moving band would be the products made from the 12 allele.
tube electrophoresis. Here, one of the two PCR primers already has a fluorescent dye attached (Figure 3).
The PCR reaction occurs normally, resulting in the PCR products all being fluorescently labelled. They can then be loaded into a capillary tube electrophoresis machine (Figure 4).
As before, DNA migrates through a gel material towards the positive electrode, but this time the gel is contained within a thin tube. Near the end of the tube is a laser to excite the fluorescent dye and a detector to record fluorescence. A computer can then monitor the tube for the appearance of any fluorescent signal. For an STR we would get a single peak if a person was homozygous and two peaks if a person was heterozygous.
Figure 2.
Determining the genotype of an individual at a single STR
using a specific pair of PCR primers and agarose gel
electrophoresis. S = size standard.
(Original-Deyholos-CC BY-NC 3.0)
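The relationship between repeat number and PCR product size can be sketched in a few lines. The repeat unit length (4 bp) and the combined length of the flanking sequence and primers (100 bp) are assumptions made for illustration only; real values depend on the STR and on where the primers are placed.

```python
def product_length(repeat_count, repeat_unit_bp=4, flanking_bp=100):
    """Length in bp of the PCR product from one STR allele.

    repeat_unit_bp and flanking_bp are hypothetical defaults: the
    product is the flanking sequence plus the tandem repeats.
    """
    return flanking_bp + repeat_count * repeat_unit_bp


def expected_products(genotype):
    """Distinct product sizes (bands or peaks) for a two-allele genotype."""
    return sorted({product_length(repeats) for repeats in genotype})


heterozygote = expected_products((7, 12))  # the 7/12 person in the text
homozygote = expected_products((6, 6))     # a hypothetical 6/6 person

print(heterozygote, homozygote)
```

A heterozygote gives two product sizes and a homozygote gives one, whether the products are seen as bands on an agarose gel or as peaks from a capillary tube.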
There are two other similarities between capillary tube electrophoresis and slab gel electrophoresis. Just as slab gels contain several lanes, the machines used for capillary tube electrophoresis have several tubes, up to 96 in fact. This allows scientists to run many samples simultaneously. Another similarity is the need for molecular weight markers. In both systems these are pieces of DNA of known length that are used to estimate the sizes of the DNA molecules in each sample. In the case of capillary tube electrophoresis these DNA molecules are attached to a different coloured fluorescent dye. The computer uses them to estimate the size of the PCR products.
4. MODERN DNA FINGERPRINTING
4.1. OVERVIEW
DNA fingerprinting, as its name suggests, is the ability to produce a unique collection of data for every person, using their DNA. It was first developed by Alec Jeffreys in the United Kingdom in the 1980s; the STR-based version presented above was standardized by the U.S. Federal Bureau of Investigation. The procedure is sometimes called CODIS (COmbined DNA Index System), named after the software that converts the peaks into numerical data.
In Canada and the United States, the test is for 13 autosomal STRs. These are shown in Figure 1. This figure also shows the AMEL gene. The allele on the X chromosome is shorter than the allele on the Y chromosome. This difference can be detected with its own PCR reaction. If a person is XX they will only have the shorter allele, while if a person is XY they will have two sizes of PCR products. The sum total of the PCR reactions produces a collection of data known as a DNA profile. For example, a person's DNA profile might begin:
STR        Chromosome   Result
CSF1PO     5            7/12
D8S1179    8            6/6
D21S11     21           9/10
etc.       etc.         etc.
The chance that another person, even a close relative, has the exact same profile is astronomically small. The exception is identical twins, triplets, etc., who will have the same DNA profile.
4.2. FORENSICS
Forensics is the process of gathering data that can be used in a court of law. Because DNA profiles are virtually unique to a person they can be used to match a person to a DNA sample recovered at a crime scene. In the example below only Suspect 2 matches the DNA at the crime scene; we can exclude suspects 1 and 3 (Table 1). Note what DNA profiling has done: it has made it easier to exclude a suspect than to convict someone.
The technician who does the analysis often has to appear in court to explain the results to the jury. In Canada, if a person is convicted of a serious crime their DNA profile will then be stored at the National DNA Data Bank in Ottawa.
4.3. PATERNITY TESTING
The other common use of DNA fingerprinting is paternity testing. If we have a DNA profile for a child and their biological mother, we can identify who the biological father could be. Remember, the power of this type of test is that of exclusion. If the potential father lacks the alleles present in the child, then he cannot be the father. The large number of STRs and alleles makes it possible to exclude essentially everyone except the real father.
Table 1. DNA profiling of suspects 1, 2 and 3.
STR        DNA at crime scene   Suspect 1   Suspect 2   Suspect 3
CSF1PO     7/12                 8/11        7/12        7/15
D8S1179    6/6                  9/15        6/6         9/12
D21S11     9/10                 4/5         9/10        4/9
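The comparison in Table 1 can be expressed as a short sketch. The genotypes are copied from Table 1; the function name and the choice to treat a genotype as an unordered pair of repeat numbers are illustrative assumptions, and only the three STRs shown are compared rather than the full set of 13.

```python
# Genotypes from Table 1, written as pairs of repeat numbers.
crime_scene = {"CSF1PO": (7, 12), "D8S1179": (6, 6), "D21S11": (9, 10)}

suspects = {
    "Suspect 1": {"CSF1PO": (8, 11), "D8S1179": (9, 15), "D21S11": (4, 5)},
    "Suspect 2": {"CSF1PO": (7, 12), "D8S1179": (6, 6), "D21S11": (9, 10)},
    "Suspect 3": {"CSF1PO": (7, 15), "D8S1179": (9, 12), "D21S11": (4, 9)},
}


def profiles_match(profile_a, profile_b):
    """True if both profiles cover the same STRs with the same genotypes.

    Allele order within a genotype is ignored, so 12/7 equals 7/12.
    """
    if profile_a.keys() != profile_b.keys():
        return False
    return all(
        sorted(profile_a[str_name]) == sorted(profile_b[str_name])
        for str_name in profile_a
    )


not_excluded = [
    name for name, profile in suspects.items()
    if profiles_match(profile, crime_scene)
]
print(not_excluded)
```

A single mismatching STR is enough to exclude a suspect, which is why exclusion is the more certain conclusion.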
For example, consider the situation below. Every child STR allele that isn't from the mother must have come from the father. For example, the child's CSF1PO 7 allele must be maternal (the child lacks the "10" allele), so their 12 allele must be paternal. This means the real father must have at least one 12 allele; this is found in potential fathers #2 and #3. Potential father #1 lacks this allele and thus must be excluded. When we apply this thinking to all of the STRs, only potential father #3 could be the child's biological father; the other two males are excluded. Remember, there are 13 STRs, each with many alleles, so this method is powerful enough to exclude all but the real father.
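The exclusion reasoning above can be sketched for a single STR. The child's 7/12 genotype follows the text; the mother's and the potential fathers' genotypes are hypothetical values chosen so that the outcome at this one locus matches the description (father #1 lacks the 12 allele, fathers #2 and #3 carry it). The function names are also illustrative.

```python
def possible_paternal_alleles(child, mother):
    """Alleles the child could have inherited from the father at one STR.

    If one of the child's alleles can be explained by the mother, the
    other allele is a candidate paternal allele.
    """
    first, second = child
    candidates = set()
    if first in mother:
        candidates.add(second)
    if second in mother:
        candidates.add(first)
    return candidates


def is_excluded(child, mother, man):
    """True if the man carries none of the possible paternal alleles."""
    return not (possible_paternal_alleles(child, mother) & set(man))


child = (7, 12)    # CSF1PO genotype from the text
mother = (7, 10)   # hypothetical
potential_fathers = {  # hypothetical genotypes
    "Potential father #1": (8, 11),
    "Potential father #2": (12, 15),
    "Potential father #3": (9, 12),
}

excluded = [
    name for name, genotype in potential_fathers.items()
    if is_excluded(child, mother, genotype)
]
print(excluded)
```

A full test would repeat this check at every STR and exclude a man who fails at any of them, which is how the remaining candidates are narrowed down.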
___________________________________________________________________________
SUMMARY:
• Short tandem repeats (STRs) are easy-to-detect polymorphisms in human chromosomes, and most are
harmless in that they don't affect the phenotype.
• STR alleles can be detected with standard agarose gel electrophoresis. The size of the band or bands
reveals the number of repeats present in a particular STR.
• STRs can be detected more efficiently with capillary tube electrophoresis. The location of the peak or
peaks is identified with laser illumination of fluorescent tagged primers and the number of repeats
present in a particular STR can be determined by computer.
• DNA profiles are virtually unique to an individual. They are used in modern day DNA fingerprinting and
paternity testing.
KEY TERMS:
short tandem repeat (STR) DNA fingerprinting
PCR DNA profile
agarose gel electrophoresis forensics
fluorescent dye paternity testing
capillary tube electrophoresis
STR #3 brother
CSF1PO 12/15
D8S1179 4/6
D21S11 9/10
Figure 1.
A DNA microarray slide showing the array of test
spots on the slide.
(Flickr- Argonne Laboratory-CC BY SA 2.0)
INTRODUCTION
Microarrays have revolutionized many types of genetics (Figure 1). Projects that used to take years can now be done in weeks if not days. This chapter will look at two of these techniques. One is used to identify genes that are responsible for human diseases. The other is used to reveal whether a person has mutations in any of these genes.
1. SINGLE NUCLEOTIDE POLYMORPHISMS (SNPS)
1.1. WHAT ARE SNPS?
Here is a little section of human DNA from within an intron of the INPP5B gene on chromosome 1:
--TCCTCTCCAGC--
--AGGAGAGGTCG--

--TCCTCACCAGC--
--AGGAGTGGTCG--
Most people have the sequence on the top but some people have the sequence on the bottom. The only difference is the single base pair at the sixth position shown. This TA to AT base pair substitution mutation likely originated as a single event in the human population tens of thousands of years ago. The original allele is called the ancestral allele while the new allele is called the minor allele. This change, and others like it, are called single nucleotide polymorphisms (SNPs). Most SNPs, including this one, do not affect the expression of genes they are near or within. Their sequence has no effect on the phenotype.
This SNP is named rs16824514 and is described at an online SNP database:
http://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs=16824514
rs16824514 is not important for human health, but it does provide an example of what SNPs are:
• It has two alleles, in this case T = ancestral and A = minor.
• People can have three possible genotypes, in this case TT, TA, and AA.
• It is easy to determine which alleles a person has using one of various methods.
The rest of this chapter will discuss how we can determine which alleles of a SNP a person has and what this can tell us. But before then, where do SNPs come from?
1.2. HOW ARE SNPS DISCOVERED?
Most human SNPs were identified during DNA sequencing projects. Research teams deliberately chose people from different ethnic groups to sequence. See Chapter 35, Figure 1, for the spread
of humans across the world. For example, they may have sequenced people from countries X, Y, and Z, each on a different continent. Over the past thousands of years, mutations have been happening in people from each population. A few of the base pair substitutions would have become common and could even have replaced the original ancestral allele. When we sequence present-day members of these diverse populations, we have a good chance of getting a person that has the new allele. When we compare this person's genomic sequence to people from other populations, those people will still have the ancestral allele. This difference at a single place on the chromosome is our SNP. It will be entered into the SNP database as 'reference SNP' followed by a number (rs####).
There are methods other than sequencing to detect SNPs. The detection of SNPs is useful for other organisms that have either never been sequenced or have only been sequenced once. Even though this chapter only discusses the uses of SNPs in human biology and health, they are useful tools used by biologists working in many different organisms to understand basic questions in biology, genetics, and evolution.
2. MICROARRAY TECHNOLOGY
2.1. OVERVIEW
There are many ways to detect SNPs. Over the years, Southern blots, PCR, genome sequencing, and other techniques have been used. Most of these methods are rarely used on a large scale today because they are too labour intensive, expensive, time consuming, or some combination thereof. This chapter will focus on a single method, chosen because it is how most SNP detection is done in 2015. It makes use of a specific type of microarray, a genotyping microarray made by a biotechnology company called Illumina, Inc. in San Diego, California, USA.
Microarrays are a technology used to quantify DNA or RNA molecules. There are many, many types. Some are sold by biotechnology companies, while others are custom made by scientists themselves. They go by different names: microarrays, DNA chips, lab-on-a-chips, biochips, gene arrays, and others. What they have in common is shown in Figure 2. A microarray is a piece of glass or silicon with short pieces of single stranded DNA stuck to its surface. There are thousands of spots of these oligonucleotides (oligos). Within each spot all of the oligos have the same specific sequence.
Figure 2.
A glass slide based microarray. Actual microarrays have hundreds of thousands of these spots, each with thousands of identical oligonucleotides attached.
(Original – Harrington – CC BY SA 4.0)
To use a microarray, you need to prepare the DNA or RNA sample first (Step 1 in Figure 3). DNA samples are broken into smaller fragments, made single stranded, and covalently attached to a fluorescent dye. RNA molecules just need to be attached to the fluorescent molecules. Next you pour the labelled sample onto the microarray (Step 2). If a piece of labelled nucleic acid is complementary to the oligos in a spot they will hybridize. After all of the unhybridized sample is washed off, the spot will fluoresce. The microarray is then put into a microarray reader (Step 3) to take a digital photograph of the fluorescent spots on the surface (Step 4). The spots with fluorescence indicate the presence of complementary sequence in the sample and the level of fluorescence indicates the amount.
2.2. GENOTYPING MICROARRAYS
Have a look at the DNA sequences at the beginning of the chapter. How could we design a microarray to detect which of these alleles a person has? Simply put, this microarray would need to have two spots for each SNP, one for each allele. One will have oligonucleotides that match the ancestral allele and one will have oligos that match the minor allele (Figure 4). To use this microarray we would need to isolate genomic DNA from a person,
then process it into short, single stranded lengths, and fluorescently label those pieces. Next, we would need to inject it onto the microarray, let it hybridize, and wash away the unhybridized probe. In the above example if a person has the ancestral allele some of their DNA sample will bind to the
Both spots fluoresce: the person is heterozygous (TA)
3. GENOME-WIDE ASSOCIATION STUDIES (GWAS)
Some human diseases are caused by mutant alleles of single genes. But how can scientists identify which of our ~20,000 gene(s) is/are responsible?
Consider a type of neuronal degeneration called Huntington disease. It took researchers Nancy
Wexler and James Gusella ten years to discover that the gene responsible for this disease was on
chromosome 4. Their teams used a technique called restriction fragment length polymorphism
(RFLP) mapping. This Southern blotting-based approach has been replaced with much faster
microarray-based methods.
Using genotyping microarrays to discover genes is called SNP mapping or genome-wide association
studies (GWAS). We need large DNA samples from two groups of people, those that have a mutation
causing a disease (or a phenotype of interest) and those who do not. Each person's DNA is isolated
and then genotyped using a genotyping microarray. A powerful statistical test is then used to study
the results. What the software is looking for is correlations; are any of the SNPs correlated with
the disease phenotype (Figure 5)?
Figure 5.
SNPs can be used to determine the location of mutations that affect our health. The G/C vs A/T SNP
is located near the gene of interest. The G/C form will tend to associate with the gene+ allele,
while the A/T form will tend to associate with the gene- allele.
(Original – Harrington – CC BY SA 4.0)
For example, does everyone with the mutant gene have an A/T base pair allele of a specific SNP,
while everyone with the functioning gene has a G/C base pair allele? If so, then the gene should be
near this SNP. Because we know the location of the SNP, we now know the approximate location of
the gene.
The results are displayed in a graph (Figure 6). The x-axis is the location of the SNPs, in this
case all the SNPs on chromosome 4. On the y-axis is the probability that each SNP is close to the
disease-associated gene. For most SNPs the probability is low.
Figure 6.
GWAS results show the probability that SNPs along a chromosome are close to the gene of interest.
(Original – Harrington – CC BY SA 4.0)
GWAS is very effective at identifying genes when there is only one gene that can mutate to cause
the disease. When there are two or more genes the results are harder to interpret.
For example, let's say that there are two genes that can mutate to cause the same disease. In our
group of people with the disease some have mutations in the first gene and some have mutations in
the second gene. SNPs near the first gene will not be close to the second gene, and vice versa. This
dilutes the association between SNPs and the disease. GWAS is not very useful when it comes to
multifactorial diseases. These are diseases such as cancer and heart disease, where there are
mutations in many genes that can be responsible.
4. DIRECT TO CONSUMER DNA TESTING
Direct to consumer DNA testing, as the name suggests, is a form of genetic testing that does not
involve a physician ordering the test, or helping the person to understand the results. The results
go directly to the consumer and they are left to interpret their genotypes, usually with the
assistance of a web site provided by the testing company. There are over a dozen companies
offering this service worldwide.
In the case of 23andMe, for example, a person provides a DNA sample by spitting into a tube and
pays ~$200 to have it tested. Many people think the DNA is being sequenced but this isn't possible
for this price. The cost to sequence a person's DNA in 2015 is about $1000+. Instead what 23andMe
does is to load the DNA samples they receive into genotyping microarrays. In fact, they use the
Illumina microarray described earlier in this chapter.
What they are looking for is SNPs known to be next to described genes. These are genes previously
identified as being medically important.
For example, in Figure 7 if a person is heterozygous or homozygous for the A/T base pair SNP allele
they have a higher probability of having the mutant allele of the gene. Conversely if they only
have the G/C base pair alleles they probably don't. 23andMe reports back a person's genotype as GG,
GA, or AA and describes the likelihood of a person having this disease or condition. While this
looks like a medical diagnostic test, 23andMe argues it isn't looking for the disease causing
mutations directly and therefore they are not offering a diagnostic test.
23andMe and other such companies also offer heredity testing. Other SNPs on the same microarray
are known to be correlated with different ethnic groups. For example, if a minor allele is common
in English people and the person is homozygous for this allele chances are this person is English.
If a person is heterozygous it means one of their parents is likely to be English and the other
not. If a person only has the ancestral alleles it means neither of their parents is likely to be
English. There are enough SNPs used for most people to learn where their ancestors came from.
Additionally, many companies offer testing for the percentage of your genome that came from
Neanderthals. Neanderthals interbred with humans around 60,000 years ago and many people of
European, Asian, Australian, and Native American origin have retained a few percent of the
Neanderthal genome within their own. The human and Neanderthal genomes can be distinguished by
SNPs.
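The two-spot readout of a genotyping microarray, and the GG, GA, or AA report it leads to, can be sketched in a few lines of Python. This is only an illustration: the function name, the 0 to 1 intensity scale, and the 0.5 threshold are assumptions made for the example, and real microarray software is considerably more elaborate.

```python
def call_genotype(ancestral_signal, minor_signal, threshold=0.5,
                  ancestral_base="G", minor_base="A"):
    """Call one SNP genotype from the fluorescence of its two spots.

    Signals are assumed to be scaled from 0 (dark) to 1 (bright);
    the threshold is a hypothetical cut-off, not a vendor value.
    """
    ancestral_on = ancestral_signal >= threshold
    minor_on = minor_signal >= threshold
    if ancestral_on and minor_on:
        return ancestral_base + minor_base   # both spots fluoresce: heterozygous
    if ancestral_on:
        return ancestral_base * 2            # only the ancestral spot: homozygous
    if minor_on:
        return minor_base * 2                # only the minor spot: homozygous
    return None                              # neither spot fluoresces: no call


print(call_genotype(0.9, 0.1))   # only the ancestral spot is bright
print(call_genotype(0.8, 0.7))   # both spots are bright
```

Returning None when neither spot fluoresces keeps a failed hybridization from being reported as a genotype.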
Figure 7.
SNPs can be used to determine the presence of mutations
that affect our health. The SNP that was originally used to
discover this gene is now being used to test for it.
(Original – Harrington – CC BY SA 4.0)
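The search for SNPs that are correlated with a disease, described in the GWAS section, can be sketched as follows. The text does not say which statistical test is used, so the chi-square test on a 2x2 table of allele counts below is an assumption chosen for simplicity, and the SNP names and counts are invented for the example.

```python
def allelic_chi_square(case_counts, control_counts):
    """Chi-square statistic for a 2x2 table of allele counts.

    Each argument is (minor_allele_count, ancestral_allele_count).
    A larger value means a stronger association with the disease.
    """
    a, b = case_counts
    c, d = control_counts
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0   # a row or column is empty, so no test is possible
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical allele counts: (cases, controls) for two SNPs.
snps = {
    "snp_near_gene": ((180, 20), (40, 160)),
    "snp_far_away": ((55, 145), (50, 150)),
}
scores = {name: allelic_chi_square(cases, controls)
          for name, (cases, controls) in snps.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

In this made-up data the minor allele of the first SNP is far more common in people with the disease, so it scores highest, which is the kind of peak a graph like Figure 6 would show.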
___________________________________________________________________________
SUMMARY:
• Single nucleotide polymorphisms (SNPs) are harmless and easy to detect polymorphisms in human
chromosomes
• SNPs can be detected with genotyping microarrays and microarray readers. These microarrays have
pairs of oligonucleotide spots, one for the ancestral allele and one for the minor allele. Where a
person's DNA hybridizes reveals their genotype at a particular SNP.
• Genome-wide association studies (GWAS) use SNPs and genotyping microarrays to determine the
location of mutations that affect our health.
• Direct to consumer DNA testing uses SNPs and genotyping microarrays to determine the presence of
allelic forms that show linkage to genes that may affect our health, predict our ancestry, and estimate
the percentage of Neanderthal sequences we have.
KEY TERMS:
ancestral allele genotyping microarray
minor allele SNP mapping
single nucleotide polymorphism (SNP) genome-wide association study (GWAS)
microarray direct to consumer DNA testing
microarray reader
QUESTIONS:
1) Why do most SNPs only have two alleles?
2) When using a genotyping microarray why is it
important that only perfectly matched DNA
molecules be able to hybridize?
3) Could GWAS be used to find out why some
people have blue eyes and other people don't?
4) Assuming the gene responsible for blue eye
colour is known, could direct to consumer
testing predict whether a person has blue eyes
or not?
5) What do these GWAS results mean?
Figure 1.
Population genetics is important in
ecology, evolution, and even in our
daily lives since disease risks can be
calculated using population
genetics.
(Flickr-Zach Stern- CC BY-NC-ND 2.0)
INTRODUCTION
A population is a large group of individuals of the same species, who are capable of mating with
each other. It is useful to know the frequency of particular alleles within a population, since
this information can be used to calculate disease risks. Population genetics is also important in
ecology and evolution, since changes in allele frequencies may be associated with migration or
natural selection.
1. ALLELE FREQUENCIES MAY BE STUDIED AT THE POPULATION LEVEL
The frequency of different alleles in a population can be determined from the frequency of the
various phenotypes in the population. In the simplest system, with two alleles of the same locus
(e.g. A,a), we use the symbol p to represent the frequency of the dominant allele within the
population, and q for the frequency of the recessive allele. Because there are only two possible
alleles, we can say that the frequency of p and q together represent 100% of the alleles in the
population (p+q=1).
We can calculate the values of p and q, in a representative sample of individuals from a
population, by simply counting the alleles and dividing by the total number of alleles examined.
For a given allele, homozygotes will count for twice as much as heterozygotes.
For example:
genotype    number of individuals
AA          320
Aa          160
aa          20
p = (2(AA) + Aa) / (total alleles counted)
  = (2(320) + 160) / (2(320) + 2(160) + 2(20)) = 0.8
q = (2(aa) + Aa) / (total alleles counted)
  = (2(20) + 160) / (2(320) + 2(160) + 2(20)) = 0.2
         A (p)    a (q)
A (p)    p²       pq
a (q)    pq       q²
2. HARDY-WEINBERG FORMULA
With the allele frequencies of a population we can use an extension of the Punnett Square, and the
product rule, to calculate the expected frequency of each genotype following random matings within
the entire population. This is the basis of the Hardy-Weinberg formula:
p² + 2pq + q² = 1
Here p² is the frequency of homozygotes AA, 2pq is the frequency of the heterozygotes, and q² is
the frequency of homozygotes aa.
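The allele-counting rule and the Hardy-Weinberg formula above can be sketched in Python. The function names are made up for this illustration; the genotype counts are the ones from the example (320 AA, 160 Aa, 20 aa).

```python
def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return (p, q) by counting alleles; homozygotes count twice."""
    total_alleles = 2 * (n_AA + n_Aa + n_aa)
    p = (2 * n_AA + n_Aa) / total_alleles
    q = (2 * n_aa + n_Aa) / total_alleles
    return p, q


def expected_genotype_counts(p, q, n_individuals):
    """Expected numbers of AA, Aa, and aa under p² + 2pq + q² = 1."""
    return {
        "AA": p * p * n_individuals,
        "Aa": 2 * p * q * n_individuals,
        "aa": q * q * n_individuals,
    }


p, q = allele_frequencies(320, 160, 20)
expected = expected_genotype_counts(p, q, 500)
print(round(p, 3), round(q, 3))
print({genotype: round(count, 1) for genotype, count in expected.items()})
```

Comparing the expected counts with the observed counts is the usual first step in deciding whether a sample is consistent with Hardy-Weinberg proportions.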
Notice that if we substitute the allele frequencies we calculated above (p=0.8, q=0.2) into the
formula p² + 2pq + q² = 1, we obtain expected probabilities for each of the genotypes that exactly
match our original observations:
p² = 0.8² = 0.64             0.64 x 500 = 320
2pq = 2(0.8)(0.2) = 0.32     0.32 x 500 = 160
q² = 0.2² = 0.04             0.04 x 500 = 20
This is a demonstration of the Hardy-Weinberg Equilibrium, where both the genotype frequencies
and allele frequencies in a population remain unchanged following successive matings within a
population, if certain conditions are met. These conditions are listed in Table 1. Few natural
populations actually satisfy all of these conditions. Nevertheless, large populations of many
species, including humans, appear to approach Hardy-Weinberg equilibrium for many loci. In these
situations, deviations of a particular gene from Hardy-Weinberg equilibrium can be an indication
that one of the alleles affects the reproductive success of the organism, for example through
natural selection or assortative mating.
The Hardy-Weinberg formula can also be used to estimate allele frequencies, when only the
frequency of one of the genotypic classes is known. For example, if 0.04% of the population is
affected by a particular genetic condition, and all of the affected individuals have the genotype
aa, then we assume that q² = 0.0004 and we can calculate p, q, and 2pq as follows:
q² = 0.04% = 0.0004
q = √0.0004 = 0.02
p = 1 - q = 0.98
2pq = 2(0.98)(0.02) = 0.0392 ≈ 0.04
Thus, approximately 4% of the population is expected to be heterozygous (i.e. a carrier of this
genetic condition). Note that while we recognize that the population is probably not exactly in
Hardy-Weinberg equilibrium for this locus, application of the Hardy-Weinberg formula
nevertheless can give a reasonable estimate of allele frequencies, in the absence of any other
information.
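The carrier estimate worked through above (0.04% of the population affected, all of genotype aa) can be reproduced with a short Python sketch. The function name is hypothetical, and the calculation assumes Hardy-Weinberg proportions, exactly as the text does.

```python
import math


def carrier_frequency(affected_fraction):
    """Estimate p, q, and 2pq from the frequency of aa individuals (q²)."""
    q = math.sqrt(affected_fraction)   # q is the square root of q²
    p = 1 - q                          # because p + q = 1
    return p, q, 2 * p * q


p, q, carriers = carrier_frequency(0.0004)
print(round(q, 4), round(p, 4), round(carriers, 4))
```

The same function can be used for any recessive condition whose incidence is known, for example the 1% case in the study questions.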
Table 1. Conditions for the Hardy-Weinberg equilibrium
• Random mating: Individuals of all genotypes mate together with equal frequency. Alternatively,
assortative mating, in which certain genotypes preferentially mate together, is a type of non-random
mating.
• No natural selection: All genotypes have equal fitness. None are selectively removed by selection.
• No migration: Individuals do not leave or enter the population.
• No mutation: The allele frequencies do not change due to mutation.
• Large population: Random sampling effects in mating (i.e. genetic drift) are insignificant in large
populations.
___________________________________________________________________________
SUMMARY:
• Populations in true Hardy-Weinberg equilibrium have random mating, and no genetic drift, no
migration, no mutation, and no selection with respect to the gene of interest.
• The Hardy-Weinberg formula can be used to estimate allele and genotype frequencies given only
limited information about a population.
KEY TERMS:
population Hardy-Weinberg equilibrium
p / q random mating
p+q=1 natural selection
Hardy-Weinberg formula migration
p2 + 2pq + q2=1 assortative mating
STUDY QUESTIONS:
1) You are studying a population in which the
frequency of individuals with a recessive
homozygous genotype is 1%. Assuming the
population is in Hardy-Weinberg equilibrium,
calculate:
a) The frequency of the recessive allele.
b) The frequency of the dominant allele.
c) The frequency of the heterozygous
phenotype.
d) The frequency of the homozygous dominant
phenotype.
2) Determine whether the following population is
in Hardy-Weinberg equilibrium.
genotype number of
individuals
AA 432
Aa 676
aa 92
3) Out of 1200 individuals examined, 432 are
homozygous dominant (AA) for a particular
gene. What numbers of individuals of the other
two genotypic classes (Aa, aa) would be
expected if the population is in Hardy-Weinberg
equilibrium?
4) Propose an explanation for the deviation
between the genotypic frequencies calculated
in question 3 and those observed in the table in
question 2.
advantage (selection). Other mutations may just increase in frequency in a population through
chance alone (drift).
2.1. INTERGENIC SEQUENCES
Mutations in intergenic sequences (regions between genes) have no effect on gene expression or
phenotype, so there is no selection for/against. Consequently, these mutations are useful for
markers in genetic mapping or DNA fingerprinting. (Random genetic drift can cause fixation, where
all members of a species have the same DNA sequence.) The result is that evolution occurs via
random mutation and fixation by random drift, with no selection for or against these sequences.
2.2. GENE CODING SEQUENCES
Mutation in a gene coding sequence can change the effectiveness of a gene product (RNA or protein),
which can consequently affect the phenotype. This type of change doesn’t alter the gene’s
transcription, but natural selection may act for/against this new phenotype. The result is that
evolution occurs via random mutation and selection for/against the function of the gene’s product.
2.3. GENE’S REGULATORY SEQUENCES
This is the most interesting type. Mutations in regulatory sequences do not change the product
from the gene, just the pattern of transcription. In other words, the time, place (tissue), level,
and response to environment of expression are changed. Regulatory mutations can affect many traits
and characteristics at once (pleiotropic) or create new and/or novel patterns of expression. This
might result in a new function in an organism (e.g. neomorph). With this type of mutation,
evolution occurs via random mutation and selection for/against the novel expression pattern, not
altered gene products.
Mutations in regulatory sequences are a key to understanding how evolution produces new patterns
of expression and new phenotypes. Regulatory sequences have to be identified experimentally and
shown to act combinatorially to regulate transcription. Most genes have multiple independent
regulatory sites adjacent to the transcribed sequence that bind trans-acting factors to modulate
expression.
3. EXAMPLE 1: DROSOPHILA YELLOW GENE
The yellow gene of Drosophila provides an example of the modular nature of enhancers (regulatory
sequences). This gene encodes an enzyme in the pathway that produces a dark pigment in the
insect’s exoskeleton. Null mutants have a yellow cuticle rather than the wild type darker
pigmented cuticle. The gene is called “yellow” because it is named after its mutant phenotype.
Figure 2 shows five distinct enhancer elements that drive transcription of yellow (left, 5’
upstream - wing, body, mouth parts; intron – bristles, claws). Each binds a different,
tissue-specific transcription factor to enhance transcription of the yellow+ transcript (and thus
expression of the protein) in that tissue, which makes the pigment. So, the wing cells will have a
transcription factor that binds to the wing enhancer to drive expression; likewise in the body and
mouth part cells. Thus, specific combinations of cis-elements and trans-factors control the
differential, tissue-specific expression of genes. This type of combinatorial action of enhancers
is typical of the transcriptional activation of most eukaryotic genes: specific transcription
factors activate the transcription of target genes under specific conditions.
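The modular action of the yellow enhancers can be illustrated with a toy model in Python. This is a deliberately simplified sketch, not a description of real gene regulation: it assumes that each tissue always supplies its own transcription factor, that a tissue is pigmented only if its enhancer is present and the coding region is functional, and it ignores dosage and heterozygosity. All names are invented for the example.

```python
# The five tissue-specific enhancers named in the text for the yellow gene.
YELLOW_ENHANCERS = {"wing", "body", "mouth parts", "bristles", "claws"}


def pigmented_tissues(enhancers_present, coding_region_functional=True):
    """Return the set of tissues that express yellow and make dark pigment.

    Assumes every tissue contains the transcription factor that
    recognizes its own enhancer (a simplifying assumption).
    """
    if not coding_region_functional:
        return set()   # a null coding mutation removes pigment everywhere
    return YELLOW_ENHANCERS & set(enhancers_present)


wild_type = pigmented_tissues(YELLOW_ENHANCERS)
no_bristle_enhancer = pigmented_tissues(YELLOW_ENHANCERS - {"bristles"})
null_mutant = pigmented_tissues(YELLOW_ENHANCERS, coding_region_functional=False)
print(sorted(no_bristle_enhancer))
```

The contrast between the last two cases is the point of the section: losing one enhancer changes the pattern of expression in one tissue, while a coding mutation affects the product in every tissue.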
Figure 2.
Tissue-specific cis-regulatory elements within a simplified representation of the yellow gene of Drosophila.
(Original-Deyholos-CC BY-NC 3.0)
While enhancer sequences promote expression, there is an oppositely acting type of element, called
silencers. These elements function in much the same manner, with transcription factors that bind to
DNA sequences, but they act to silence or reduce transcription from the adjacent gene.
Again, a gene’s overall expression profile (transcription level, tissue specific, temporal
specific) is a total combination of all the various enhancer and silencer elements that act on that
gene.
4. EXAMPLE 2: PITX EXPRESSION IN STICKLEBACK FISH
The three-spined stickleback (Gasterosteus aculeatus; Figure 1) provides a classic example of
natural selection that involves a mutation in a cis-regulatory element.
Background: Members of this species occur in one of two forms: (1) populations that inhabit deep,
open water and have a spiny pelvic fin that is thought to deter larger predator fish from feeding
on them; (2) populations that inhabit shallow water environments and lack this spiny pelvic fin. In
shallow water, it appears that a long, spiny pelvic fin is a disadvantage because it allows
predatory insects like dragonfly larvae in the sediment to grasp the stickleback.
Researchers compared gene sequences of individuals from both deep and shallow water environments
as shown in Figure 4. They observed that in embryos from the deep-water population, a gene called
Pitx (paired-like homeodomain 1; encodes a member of the RIEG/PITX homeobox family) was expressed
in several groups of cells, including those that developed into the pelvic fin. Embryos from the
shallow-water population expressed Pitx in the same groups of cells as the other population, with
an important exception: Pitx was not expressed in the pelvic fin primordium in the shallow-water
population. Further genetic analysis showed that the absence of Pitx gene expression from the
developing pelvic fin of shallow-water stickleback was due to the absence (mutation) of a
particular enhancer element upstream of Pitx.
5. EXAMPLE 3: HEMOGLOBIN EXPRESSION IN PLACENTAL MAMMALS
Hemoglobin is the oxygen-carrying component of red blood cells (erythrocytes). Hemoglobin usually
exists as tetramers of four non-covalently bound hemoglobin molecules (Figure 3). Each hemoglobin
molecule consists of a globin polypeptide with a covalently attached heme molecule. Heme is made
through a specialized metabolic pathway and is then bound to globin polypeptide through
post-translational modification.
Figure 3.
A tetramer of human hemoglobin, type α₂β₂. The α chains are labeled red, and the β chains are
labeled blue. Heme groups are green.
(Wikipedia- Zephyris- CC BY-SA 3.0)
Figure 4.
Development of a large, spiny pelvic fin in deep-water stickleback (left) depends on the presence
of a particular enhancer element upstream of a gene called Pitx. Mutants lacking this element, and
therefore the large pelvic fin (right), have been selected for in shallow-water environments.
[Panel labels: Deep Water (left), Shallow Water (right); the Pitx gene is marked in both panels.]
(Original-Deyholos-CC BY-NC 3.0)
The composition of hemoglobin tetramers changes during development (Figure 5). From early
childhood onward, most tetramers are of the type α₂β₂, which means they contain two copies of
each of two slightly different globin proteins named α and β. A small amount of adult hemoglobin is
α₂δ₂, which has δ globin instead of the more common β globin. Other tetrameric combinations
predominate before birth: ζ₂ε₂ is most abundant in embryos, and α₂γ₂ is most abundant in fetuses.
Although the six globin proteins (α = alpha, β = beta, γ = gamma, δ = delta, ε = epsilon, ζ = zeta)
are very similar to each other, they do have slightly different functional properties. For example,
fetal hemoglobin has a higher oxygen affinity than adult hemoglobin, allowing the fetus to more
effectively extract oxygen from maternal blood. The specialized γ globin genes that are
characteristic of fetal hemoglobin are found only in placental mammals. Each of these globin
polypeptides is encoded by a different gene. In humans, globin genes are located in clusters on two
chromosomes (Figure 6). We can infer that these clusters arose through a series of duplications of
an ancestral globin gene. Gene duplication events can occur through rare errors in processes such
as DNA replication, meiosis, or transposition. The duplicated genes can accumulate mutations
independently of each other. Mutations can occur in either the regulatory regions (e.g. promoter
regions), or in the coding regions, or both. In this way, the globin genes have evolved to be
expressed at different phases of development, and to produce proteins optimized for the prenatal
environment.
Of course, not all mutations are beneficial: some mutations can lead to inactivation of one or more
of the products of a gene duplication. This can produce what is called a pseudogene. Examples of
pseudogenes (ψ) are also found in the globin clusters. Pseudogenes have mutations that prevent
them from being expressed at all. The globin genes provide an example of how gene duplication and
mutation, followed by selection, allows genes to evolve specialized expression patterns and
functions. Many genes have evolved as gene families in this way, although they are not always
clustered together as are the globins.
Figure 5.
Expression of globin genes during prenatal and postnatal
development in humans. The organs in which globin genes
are primarily expressed at each developmental stage are also
indicated.
Data: Wood, W.G. 1976 Br. Med. Bull. 32, 282
Original: (Wikipedia-Furfur- CC BY-SA 3.0)
Derivative work/Translation: (Wikipedia-Leonid2- CC BY-SA 3.0)
Figure 6.
Fragments of human chromosome 11 and human chromosome 16 on which are located clusters of β-like
and α-like globin genes, respectively. Additional globin genes (θ, µ) have also been described by
some researchers, but are not shown here.
(Wikipedia – Modified by Kang-CC BY-NC 3.0)
___________________________________________________________________________
SUMMARY:
• Development of a single cell zygote to a multicellular organism involves the sequential expression of
genes so that determination and differentiation can take place and the cells can form the variety of types
found in the adult organism.
• Mutations can occur in intergenic, gene coding, or gene regulatory sequences. Changes in regulatory
sequences can lead to altered gene expression including new developmental times or tissue locations.
• The Drosophila yellow gene is an example of mutations in gene regulatory sequences.
• Stickleback fish provide an example of recent evolutionary events in which mutation of an enhancer
produced a change in morphology with a selective advantage (evolution).
• Expression of the various human globin genes, which generate hemoglobin, is an example of gene
expression changes over developmental time. The family of globin genes arose via gene duplication. Not
all duplications produce functional genes, some are pseudo-genes.
KEY TERMS:
multicellular organisms pleiotropic
zygote stickleback
determination Pitx
differentiation Primordium
cell fate post-translational modification
intergenic sequences hemoglobin/heme/globin
gene coding sequences gene duplication
gene regulatory sequences pseudogene
regulatory mutations gene families
STUDY QUESTIONS:
1) Deep-water sticklebacks that are heterozygous
for a loss-of-function mutation in the coding
region of Pitx look just like homozygous wild-
type fish from the same population. What
phenotype or phenotypes would be expected if
a homozygous wild-type fish from a deep-water
population mated with a homozygous wild-type
fish from a shallow-water population?
2) The modular nature of transcription enhancer
elements can easily be seen in the yellow gene
of Drosophila. Suppose that there was a mutant
that had a deletion of the three distal enhancer
elements (wing, body, mouth – See Figure 2.).
There was another, different mutation that
resulted in a stop codon early in the protein
coding sequence.
a) What would the phenotype of the
homozygote deletion mutant be?
b) What would the phenotype of the
homozygote stop codon mutant be?
c) What would the phenotype of the
heterozygote be?
d) Suppose the heterozygote phenotype was
wild type. How might that occur?
Figure 1.
Two transgenic mice expressing
enhanced green fluorescent protein
(eGFP) under UV-illumination
flanking one plain NOD/SCID mouse
from the non-transgenic parental
line.
(Wikimedia-Moen et. al (2012)-CC
BY 2.0)
INTRODUCTION
The addition of new genetic material to single cell organisms has been possible for decades. Recall
F. Griffith’s 1928 experiments with smooth and rough pneumococcus (Chapter 1). However, the routine
transformation of bacteria with plasmids began in the early 1970s. The ability to transfer DNA
(genes) into complex, multicellular organisms is more recent; it is usually called transfection
when dealing with cells, and it also began in the 1970s. This genetic technology has opened up
whole new avenues of research as well as new possibilities for commercial gain and health
improvement.
1. MODEL ORGANISMS FACILITATE GENETIC ADVANCES
1.1. MODEL ORGANISMS
Many of the great advances in genetics were made using species that are not especially important
from a medical, economic, or even ecological perspective. Geneticists, from Mendel onwards, have
sought the best organisms for their experiments. Today, a small number of species are widely used
as model organisms in genetics (Figure 2). All of these species have specific characteristics that
make large numbers of them easy to grow and analyze in laboratories: (1) they are small, (2) fast
growing with a short generation time, (3) produce lots of progeny from matings that can be easily
controlled, (4) have small genomes (small C-value), and (5) are diploid (i.e. chromosomes are
present in pairs).
The most commonly used model organisms are:
• The prokaryote bacterium, Escherichia coli, is the simplest genetic model organism and is
often used to clone DNA sequences from other model species.
• Yeast (Saccharomyces cerevisiae) is a good general model for the basic functions of
eukaryotic cells.
• The roundworm, Caenorhabditis elegans, is a useful model for the development of
multicellular organisms, in part because it is transparent throughout its life cycle, and its
may not be necessary, and carries with it higher risk of inducing mutations in either the transgene
or host genome). In contrast, transient transfection does not involve integration into the host
genome and the transgene may therefore be delivered to the cell as either RNA or DNA. Advantages of
RNA delivery include that no promoter is needed to drive expression of the transgene. Besides mRNA
transgenes, which could provide a functional version of a mutant protein, there is great interest
in delivery of siRNA (small interfering RNAs), which can be used to silence specific genes in the
host cell’s genome.
Vectors for in vivo gene therapy must be capable of delivering DNA or RNA to a large proportion of
the targeted cells, without inducing a significant immune response, or having any toxic effects.
Ideally, the vectors should also have high specificity for the targeted cell type. Vectors based on
viruses (e.g. lentiviruses) are being developed for both in vivo and ex vivo gene therapies. Other,
non-viral vectors (e.g. vesicles and nanoparticles) are also being developed for gene therapy.
Figure 5.
Production of a transgenic mouse. Stem cells are removed from an embryo, and are transfected (using
electroporation) with a transgenic construct that bears a neomycin resistance gene (neoʳ) flanked
by two segments of DNA homologous to a gene of interest. In the nucleus of a transgenic cell, some
of the foreign DNA will recombine with the targeted gene, disrupting the targeted gene and
introducing the selectable marker. Only cells in which neoʳ has been incorporated will survive
selection. These neomycin resistant cells are then transplanted into another embryo, which will
grow into a chimera within a foster mother.
(Wikipedia-Kiaergaard- CC BY-SA 3.0)
8.2. HOW THE CRISPR SYSTEM WORKS
After a bacterium survives a viral infection, it attains an immune memory so that it can fight back
when the same virus infects the cell; this is where the CRISPR-Cas9 system comes in. There are
three main stages in this adaptive immune system (Figure 7 and Figure 8):
(1) Adaptation
First, a virus injects its DNA into the host bacterial cell. The host bacterial cell then detects
the foreign DNA and integrates a short fragment of the viral DNA into the CRISPR locus. This
insert is called a spacer and the original sequence on the viral genome is called a protospacer.
(2) crRNA biogenesis
After the cell is re-infected with the same virus, the CRISPR sequence is transcribed into a
premature crRNA. This crRNA will contain various spacer and direct repeat sequences, but after
modification the mature crRNA will have only the spacer (the guide RNA) that matches the
invading virus. The mature crRNA will form a complex with tracrRNA and Cas9.
(3) Interference
The crRNA-tracrRNA-Cas9 complex binds to the protospacer adjacent motif (PAM) sequence that is
located next to the target DNA sequence (protospacer) and opens the dsDNA, and the guide RNA
base pairs with the target sequence. The Cas9 protein then makes a double stranded cut in the
DNA.
Now, the host bacterial DNA does not have the PAM sequence but the foreign, viral DNA does.
Thus, the CRISPR system does not cleave its own DNA.
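The role of the PAM in the interference step can be illustrated with a small Python sketch that scans one strand of a DNA sequence for candidate Cas9 target sites. Several things here are assumptions that the text does not state: the PAM is taken to be NGG (the motif commonly cited for Streptococcus pyogenes Cas9), the protospacer is taken to be the 20 bases immediately 5’ of the PAM, and both example sequences are invented.

```python
def find_cas9_targets(dna, protospacer_len=20):
    """List (protospacer, pam, start) for each NGG PAM on the given strand.

    Assumes an NGG PAM located directly 3' of the protospacer; only the
    strand supplied is searched (the complementary strand is ignored).
    """
    dna = dna.upper()
    targets = []
    for pam_start in range(protospacer_len, len(dna) - 2):
        pam = dna[pam_start:pam_start + 3]
        if pam[1:] == "GG":                      # N can be any base
            proto_start = pam_start - protospacer_len
            targets.append((dna[proto_start:pam_start], pam, proto_start))
    return targets


spacer = "ATGCATGCATGCATGCATGC"          # hypothetical 20-base spacer
viral_dna = spacer + "TGG" + "AAAA"      # protospacer followed by a PAM
crispr_locus = spacer + "GTTTTAGA"       # same spacer followed by a repeat, no PAM

print(find_cas9_targets(viral_dna))
print(find_cas9_targets(crispr_locus))
```

The same 20 bases are reported as a target in the viral sequence but not in the CRISPR locus, because only the viral copy is followed by a PAM.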
Figure 7.
Some bacteria and most archaea have
a system that can integrate their
enemy’s DNA and use it against them.
Note that direct repeats are
represented by orange blocks, and
spacers are represented by different
colour blocks. (Original-Kang-CC BY-NC
3.0)
Figure 8.
crRNA will base pair with tracrRNA, and this
will recruit Cas9 protein. Therefore, Cas9
protein is guided by the RNA molecule and
leaves a double stranded break. (Original-
Kang-CC BY-NC 3.0)
___________________________________________________________________________
SUMMARY:
• There are a variety of model organisms that are used for genetic experimentation because they have
advantages for various aspects of research.
• Research on one model organism can be applied to others. This permits genetic knowledge of model
organisms to be transferred to humans, farm animals or crop plants.
• Transgenic organisms contain foreign DNA that has been introduced using biotechnology to make
genetically modified organisms.
• CRISPR-Cas9 is an RNA-guided endonuclease that can find a specific sequence on DNA and make a cut;
the repair mechanism of the host cell will then either introduce a mutation or integrate a new copy.
KEY TERMS:
model organisms vesicles lentiviruses
Escherichia coli stable genome editing
Saccharomyces cerevisiae position effects CRISPR-Cas9 system
Caenorhabditis elegans Agrobacterium-mediated cas
Drosophila melanogaster transformation CAS9
Mus musculus Ti plasmid crRNA
Danio rerio T-DNA direct repeat
Arabidopsis thaliana callus spacer
transgenic organisms recalcitrant tracrRNA
transgene particle bombardment sgRNA
GMO stem cells adaptation
transformation knock-out protospacer
transfection neo-R crRNA Biogenesis
naked DNA germline PAM
naked DNA somatic HDR
carrier ex vivo NHEJ
electroporation in vivo indel
microinjection non-integrative
vector siRNA
STUDY QUESTIONS:
1) a) List the characteristics of an ideal model
organism.
b) Which model organism can be used most
efficiently to identify genes related to:
i) eye development
ii) skeletal development
iii) photosynthesis
iv) cell division
v) cell differentiation
vi) cancer
Notes:
INTRODUCTION
Cancer is a group of diseases that exhibit uncontrolled cell growth, invasion of adjacent tissues, and
sometimes metastasis (the movement of cancer cells through the blood or lymph). In cancer cells, the
regulatory mechanisms that normally control cell division and limit abnormal growth have been disrupted,
usually by the accumulation of several mutations in specific genes. Cancer is therefore essentially a
genetic disease. Although some cancer-related mutations may be heritable, most cancers are sporadic,
meaning they arise from new mutations that occur in the individual who has the disease. In this
chapter, we will examine the connection between cancer and genes.

1. CLASSIFICATION OF CANCERS
Cancers can be classified based on the tissues they resemble and thus in which they originate. For
example, sarcomas are cancers that originate in mesenchymal cells, such as bone, cartilage, fat, or
muscle.
Carcinomas originate in epithelial cells (both inside the body and on its surface) and are the most
common types of cancer (~85%). This includes glandular tissues (e.g. breast, prostate). Each of
these classifications may be further sub-divided. For example, squamous cell carcinoma (SCC), basal
cell carcinoma (BCC), and melanoma are all types of skin cancers originating respectively in the
squamous cells, basal cells, or melanocytes of the skin.
Lymphomas arise from hematopoietic (blood forming) cells. This includes leukemia, the most
common type of cancer in children.

2. CANCER CELL BIOLOGY
Cancer is a progressive disease that usually begins with increased frequency of cell division (Figure 2).
Under the microscope, this may be detectable as increased cellular and nuclear size, and an
increased proportion of cells undergoing mitosis. As the disease progresses, cells typically lose their
normal shape and tissue organization. This increased cell division and abnormal tissue
organization is called dysplasia. Eventually a tumour develops, which can grow rapidly and
expand into adjacent tissues. As cellular damage accumulates and additional growth control
mechanisms are lost, some cells may break free of the primary tumour, pass into the blood or lymph
system, and be transported to another organ, where they develop into new tumours (Figure 3).
The early detection of tumours is important so that they can be treated or removed before the onset of
metastasis, but note that not all tumours lead to cancer. Tumours that do not metastasize are
classified as benign and are not usually considered life threatening. In contrast, malignant tumours
become invasive, and ultimately result in cancer.
Figure 2.
Progressive increases in cell division and abnormal cell morphology associated with cancer.
(Wikipedia-NIH-PD)
Figure 3.
Secondary tumours (white) develop in the liver from cells of a metastatic pancreatic cancer.
(Wikipedia- J. Hayman-PD)
1. Growth signal autonomy - Cancer cells can divide without the external signals normally required to
stimulate division.
2. Insensitivity to growth inhibitory signals - Cancer cells are unaffected by external signals that inhibit
division of normal cells.
3. Evasion of apoptosis - When excessive DNA damage and other abnormalities are detected, apoptosis
(a type of programmed cell death) is induced in normal cells, but not in cancer cells.
4. Reproductive potential not limited by telomeres - Each division of a normal cell reduces the length of
its telomeres. Normal cells arrest further division once telomeres reach a certain length. Cancer cells
avoid this arrest and/or maintain the length of their telomeres.
5. Sustained angiogenesis - Most cancers require the growth of new blood vessels into the tumour. Normal
angiogenesis is regulated by both inhibitory and stimulatory signals not required in cancer cells.
6. Tissue invasion and metastasis - Normal cells generally do not migrate (except in embryo development).
Cancer cells invade other tissues including vital organs.
7. Deregulated metabolic pathways - Cancer cells use an abnormal metabolism to satisfy a high demand for
energy and nutrients.
8. Evasion of the immune system - Cancer cells are able to evade the immune system.
9. Chromosomal instability - Severe chromosomal abnormalities are found in most cancers.
10. Inflammation - Local chronic inflammation is associated with many types of cancer.
Table 1
Ten hallmarks of Cancer (Hanahan and Weinberg, 2000; Hanahan 2011)
1. PAHs (polycyclic aromatic hydrocarbons) - e.g. benzo[a]pyrene and several other components of
the smoke of cigarettes, wood, and fossil fuels
2. Aromatic amines - e.g. formed in food when meat (including fish, poultry) is cooked at high
temperature
3. Nitrosamines and nitrosamides - e.g. found in tobacco and in some smoked meat and fish
4. Azo dyes - e.g. various dyes and pigments used in textiles, leather, paints
5. Carbamates - e.g. ethyl carbamate (urethane) found in some distilled beverages and fermented foods
6. Halogenated compounds - e.g. pentachlorophenol used in some wood preservatives and pesticides
7. Inorganic compounds - e.g. asbestos; may induce chronic inflammation and reactive oxygen species
8. Miscellaneous compounds - e.g. alkylating agents, phenolics
Table 2
Some classes of chemical carcinogens (Pecorino 2008)
5. ONCOGENES
The control of cell division involves many different genes. The products of some of these genes act as
signaling molecules to activate normal progression through the cell cycle. One of the pre-requisites for
cancer occurs when one or more of these activators of cell division become mutated.
The mutation may involve a change in the coding sequence of the protein, so that it is more active
than normal, or a change in the regulation of its expression, so that it is produced at higher levels
than normal, or persists in the cell longer than normal. Genes that are a part of the normal
regulation of cell division, but which after mutation contribute to cancer, are called proto-oncogenes.
Once a proto-oncogene has been abnormally activated by mutation, it is called an oncogene.
More than 100 genes have been defined as proto-oncogenes. These include genes at almost every
step of the signaling pathways that normally induce cells to divide, including growth factors, receptors,
signal transducers, and transcription factors.
ras is an example of a proto-oncogene (Figure 6). ras acts as a switch within signal transduction
pathways, including the regulation of cell division. When a receptor protein receives a signal for cell
division, the receptor activates ras, which in turn activates other signaling components, ultimately
leading to activation of genes involved in cell division. Certain mutations of the ras sequence
cause it to be in a permanently active form, which can lead to constitutive activation of the cell cycle.
This mutation is dominant, as are most oncogenes. An example of the role of ras in relaying a signal for
cell division in the EGF pathway is shown in Figure 7.
Figure 6.
Structure of the ras protein. (Wikipedia-Mark”AbsturZ”-PD)
Figure 7.
Simplified representation of the epidermal growth factor (EGF) signaling pathway. In the panel on the left, the components are
shown in their inactive forms, prior to stimulation of the pathway. The components include the soluble ligand, EGF, its receptor
(EGFR, a tyrosine kinase), ras (a G protein), several kinases (RAF, MEK, MAPK), and a transcription factor (TF). In the right panel,
the activated pathway is shown. Binding of the ligand to its receptor leads to autophosphorylation of the receptor. Through a
series of proteins not shown here, the phosphorylated receptor stimulates conversion of ras to its active, GTP-bound form. The activated
ras then stimulates phosphorylation of a series of kinases, which ultimately activate transcription factors and the expression of
genes required for cell proliferation. (Original-Deyholos-CC:AN)
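The switch-like behaviour of ras described above, and the reason a permanently active allele behaves as a dominant mutation, can be sketched as a toy on/off model. This is only an illustrative simplification: the function name `division_signal`, its parameters, and the allele labels "wt" and "constitutive" are hypothetical, and real signaling is quantitative rather than Boolean.

```python
def division_signal(egf_present, ras_alleles):
    """Toy on/off model of the EGF -> EGFR -> ras -> kinases -> TF relay.

    ras_alleles is a pair of labels (hypothetical, for illustration):
      "wt"           - ras switches on only when the receptor is activated
      "constitutive" - ras is permanently in its active form
    Returns True when the genes for cell proliferation would be switched on.
    """
    receptor_active = egf_present  # ligand binding activates the receptor
    # One active ras protein is enough to pass the signal on.
    ras_active = any(
        allele == "constitutive" or (allele == "wt" and receptor_active)
        for allele in ras_alleles
    )
    kinases_active = ras_active  # RAF, MEK, MAPK relay the signal
    return kinases_active        # transcription factor turns on target genes


for alleles in [("wt", "wt"), ("wt", "constitutive")]:
    for egf in (False, True):
        print(alleles, "EGF present:", egf, "->", division_signal(egf, alleles))
```

In this sketch a cell with one constitutive allele signals for division even without EGF, although its second allele is wild type, which is the sense in which the mutation is dominant.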
6. TUMOUR SUPPRESSOR GENES
More than 30 genes are classified as tumour suppressors. The normal functions of these genes
include repair of DNA, induction of programmed cell death (apoptosis) and prevention of abnormal
cell division. In contrast to proto-oncogenes, in tumour suppressors it is loss-of-function mutations
that contribute to the progression of cancer. This means that tumour suppressor mutations tend to
be recessive, and thus both alleles must be mutated in order to allow abnormal growth to
proceed. It is perhaps not surprising that mutations in tumour suppressor genes are more likely than
oncogenes to be inherited. An example is the tumour suppressor gene, BRCA1, which is involved
in DNA repair. Inherited mutations in BRCA1 increase a woman’s lifetime risk of breast cancer by
up to seven times, although these heritable mutations account for only about 10% of breast
cancer. Thus, sporadic rather than inherited mutations are the most common sources of both
oncogenes and disabled tumour suppressor genes.
An important tumour suppressor gene is a transcription factor named p53 (Figure 8). Other
proteins in the cell sense DNA damage, or abnormalities in the cell cycle, and activate p53
through several mechanisms including phosphorylation (attachment of phosphate to a
specific site on the protein) and transport into the nucleus. In its active form, p53 induces the
transcription of genes with several different types of tumour suppressing functions, including DNA
repair, cell cycle arrest, and apoptosis. Over 50% of human tumours contain mutations in p53. People
who inherit only one functional copy of p53 have a greatly increased incidence of early onset cancer.
However, as with the other cancer related genes we have discussed, most mutations in p53 are
sporadic, rather than inherited. Mutation of p53, through formation of pyrimidine dimers in the
genes following exposure to UV light, has been causally linked to squamous cell and basal cell
carcinomas (but not melanomas, highlighting the variety and complexities of mechanisms that can
cause cancer).

unregulated expression of this gene and its kinase product causes activation of a variety of
intracellular signaling pathways, promoting the uncontrolled proliferative and survival properties
of CML cells (the cancer). Thus, the BCR-ABL tyrosine kinase enzyme exists only in cancer cells
(and not in healthy cells) and a drug that inhibits this activity could be used to target and prevent
the uncontrolled growth of the cancerous CML cells.
___________________________________________________________________________
SUMMARY:
• Cancer is the name given to a class of different diseases that share common properties.
• Most cancers require accumulation of mutations in several different genes.
• Most cancer-causing mutations are sporadic, rather than inherited, and most are caused by
environmental carcinogens, including viruses, radiation, and certain chemicals.
• Oncogenes are hyperactivated regulators of cell division, and are often derived from gain-of-function
mutations in proto-oncogenes.
• Tumour suppressor genes normally help to repair DNA damage, arrest cell division, or kill
over-proliferating cells. Loss-of-function of these genes contributes to the progression of cancer.
• Genetic research into cancer can provide enzyme targets for drug investigation and potential
treatment. e.g. Gleevec™
KEY TERMS:
metastasis epidemiology
sarcoma proto-oncogene
carcinoma receptor
squamous cell carcinoma signal transduction
basal cell carcinoma ras
melanoma tumour suppressors
lymphoma apoptosis
dysplasia BRCA1
benign p53
malignant phosphorylation
carcinogen CML
HPV Philadelphia chromosome
oncogene bcr-abl
ionizing Gleevec™
STUDY QUESTIONS:
1) Why do oncogenes tend to be dominant, but
mutations in tumour suppressors tend to be
recessive?
2) What tumour suppressing functions are
controlled by p53? How can a single gene affect
so many different biological pathways?
3) Are all carcinogens mutagens? Are all mutagens
carcinogens? Explain why or why not.
4) Imagine that a laboratory reports that feeding
chocolate to laboratory rats increases the
incidence of cancer. What other details would
you want to know before you stopped eating
chocolate?
5) Do all women with HPV get cancer? Why or
why not?
6) Do all women with mutations in BRCA1 get
cancer? Why or why not?
CHAPTER 03 – ANSWERS
1) Mutant strain #1 has a mutation in gene B (but genes A and C should be functional).
Mutant strain #2 has a mutation in gene C (but genes A and B should be functional).
Mutant strain #3 has a mutation in gene A (but genes B and C should be functional).
2) Even prototrophs cannot produce the vitamin biotin, so it must be added for any strain to grow. Wild type
strains also lack the enzyme(s) for this biochemical pathway. Biotin is present in Complete Medium.
3) No, we now know that genes also encode tRNA, rRNA, and a variety of other functional RNAs.
4)
a. Changes in many amino acids do not cause a change in function. A specific amino acid is not
required at that site for function to occur.
b. Changes in many amino acids can cause a minor loss in function. A specific amino acid at a site may
be required for optimal function to occur.
c. Changes in some amino acids can cause a complete loss in function. Many specific amino acids are
required at specific sites for any function to occur (e.g. the active site within an enzyme).
d. Any one of the amino acids changed in part (c) can result in a complete loss of function.
5) No, the gene can be transcribed into an mRNA and translated into a polypeptide, but the polypeptide is
not functional because of a change in an amino acid.
6) Chain A has ~268, while chain B has 450. The entire enzyme has 4 chains, two A and two B (a
heterotetramer).
7) row 1 orange, orange, orange
row 2 white, orange, orange
row 3 yellow, yellow, orange
row 4 white, yellow, orange
CHAPTER 04 – ANSWERS
1)
a) Mutagenize a wild type (prototrophic) strain and screen for mutations that fail to grow on minimal
media, but grow well on minimal media supplemented with proline.
b) Take mutants #1-#10 and characterize them, based on:
(1) genetic mapping of the mutants (different locations indicate different genes);
(2) different response to proline precursors (a different response suggests different genes);
(3) complementation tests among the mutations (if they complement then they are mutations in
different genes).
c) If the mutations are in different genes then the F1 progeny would be wild type (able to grow on
minimal medium without proline).
d) If the mutations are in the same gene then the F1 progeny would NOT be wild type (unable to grow on
minimal medium without proline).
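The reasoning in parts (b) to (d) can be sketched as a small grouping routine: two mutants that fail to complement each other are placed in the same gene, and mutants that complement are placed in different genes. The mutant names and the results table below are invented purely for illustration and are not data from the question; the function name `complementation_groups` is likewise hypothetical.

```python
from itertools import combinations


def complementation_groups(mutants, complements):
    """Group mutants into genes from pairwise complementation results.

    complements maps a frozenset of two mutant names to True (F1 is wild
    type, so the mutations are in different genes) or False (F1 is mutant,
    so the mutations are in the same gene).
    """
    groups = []
    for mutant in mutants:
        for group in groups:
            # Failure to complement the group's first member => same gene.
            if not complements[frozenset((mutant, group[0]))]:
                group.append(mutant)
                break
        else:
            groups.append([mutant])  # complements every existing group
    return groups


# Hypothetical results for four proline auxotrophs (illustrative only):
# m1 and m2 fail to complement; every other pair complements.
mutants = ["m1", "m2", "m3", "m4"]
same_gene = {frozenset(("m1", "m2"))}
complements = {
    frozenset(pair): frozenset(pair) not in same_gene
    for pair in combinations(mutants, 2)
}

print(complementation_groups(mutants, complements))
```

With these made-up results the four mutants fall into three complementation groups, suggesting at least three genes.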
2) There are many correct answers for this question. Here is one.
1 2 3 4 5
1 - Mutant in locus (1,2)
2 - - Mutant in locus (1,2)
3 + + - Mutant in locus (3)
2) Both involve trans-factors binding to corresponding cis-elements to regulate the initiation of transcription
by recruiting or stabilizing the binding of RNApol and related transcriptional proteins at the promoter. In
prokaryotes, genes may be regulated as a single operon. In eukaryotes, enhancers may be located much
further from the promoter than in prokaryotes.
3) If there was no deacetylation of FLC by HDAC, transcription of FLC might continue constantly, leading to
constant suppression of flowering, even after winter.
CHAPTER 08 – ANSWERS
1) Various roles could include varying abilities to bind O2. Greater need as an embryo and fetus, less so for an
adult. This could be tested by obtaining the blood from those stages and determining the blood’s ability to
bind O2.
2) The two main types (α-globins and β-globins) are both derived by gene duplication and evolution from a
common ancestral gene.
3) Cartoon:
α-globin and β-globin, two of each type
4) Look it up. If you’re too lazy to do a search, see:
http://www.bloodjournal.org/content/125/24/3694?sso-checked=true
5) It is caused by the LCR keeping the gamma genes on after a person is born rather than switching
to the beta and delta genes (see Figure 5). A person will usually be unaware they have this
condition.
CHAPTER 09 – ANSWERS
1) Heterozygous people have the lactose persistence phenotype. As infants both alleles are active. As adults
only the LP allele remains on but it continues to supply the intestinal epithelial cells with Lactases. LP
alleles are dominant to the original allele because the LP allele's phenotype is the one seen when both
alleles are together.
2) The gene's alternative symbols are LAC and LPH. It is important to have a single symbol to make searches
of published journal articles possible. It is also a guide to authors of future articles, texts, and lectures.
3) If a person has a Lactase persistence phenotype they will break down the lactose and import the glucose
(and galactose). Soon most of the glucose will enter their circulatory system and cause a noticeable
increase in blood glucose levels. The increase is noticeable after 15 minutes and peaks at 45 minutes. If a
person has a non-persistence phenotype none of this will happen because the lactose will pass through
their digestive tract (although some of it will be consumed by gut bacteria).
4) An Insulin protein begins with an ER signal sequence. Once the Ribosome has been delivered to the ER, the
signal sequence can be removed. The Ribosome continues synthesizing the Insulin protein and feeding it into
the ER lumen. Insulin proteins do not have a stop transfer sequence so the entire protein will be released
into the ER lumen. Insulin proteins do have a pro sequence, which is removed once the protein has taken on
its proper shape. When a transport vesicle carrying the Insulin fuses with the plasma membrane the protein is
released from the cell and can enter the blood.
5) The major difference is that E. coli cells import lactose and then hydrolyse it, while human intestinal cells
hydrolyse first and import second. The major similarities are in the proteins
required: Lac Permease (E. coli) and SGLT1 (humans) are both carbohydrate transporters located in the
plasma membrane while Beta-Galactosidase (E. coli) and Lactase (humans) are both enzymes that
hydrolyse the disaccharide lactose.
6) There are enzymes to hydrolyse each disaccharide and transport proteins to import the resulting
monosaccharides. These proteins happen to be named Lactase, Sucrase, Maltase, SGLT1 (imports glucose
and galactose), and GLUT5 (imports fructose).
7) Firstly, the excess lactose upsets the osmotic balance in the large intestine. Water enters the gut from the
tissues leading to diarrhea and dehydration. Secondly, the lactose will be used as food by bacteria living in
the large intestine. When they use the lactose to make ATP they expel carbon dioxide, hydrogen, and
other gases, which cause cramping and flatulence.
CHAPTER 10 - ANSWERS
1) A geneticist would use these as white = phenotype, white = gene, and WHITE = protein.
2) The world is 19 times brighter for these flies. Without the optical insulation provided by the pigments light
from all directions strikes all of the photoreceptors. The flies are unable to make sense of the information
their eyes send to their brains.
3) The flies would be unable to make either transporter and would have white eyes as a result.
4) Yes. Most noticeably, male flies have difficulty performing the mating dance that leads to sex with
female flies.
CHAPTER 11 – ANSWERS
1) Polymorphisms and mutations are both variations in DNA sequence and can arise through the same
mechanisms. We use the term polymorphism to refer to DNA variants that are relatively common in
populations. Mutations affect the phenotype.
2) Misreading of bases during replication can lead to substitution and can be caused by things like
tautomerism, DNA alkylating agents, and irradiation.
3) Looping out of DNA on the template strand during replication; strand breakage, due to radiation and other
mutagens; and (discussed in other chapters) chromosomal aberrations such as deletions and
translocations.
4) Looping out of DNA on the growing strand during replication; transposition; and (discussed in earlier
chapters) chromosomal aberrations such as duplications, insertions, and translocation.
5) Benzopyrene is one of many hazardous compounds present in smoke. Benzopyrene is an intercalating
agent, which slides between the bases of the DNA molecule, distorting the shape of the double helix,
which disrupts transcription and replication and can lead to mutation.
6) See Chapter 10.
7) Class I. See Figure 9 on Transposable Elements.
CHAPTER 12 – ANSWERS
1)
a) One possible explanation is that the original mutagenesis resulted in a loss-of-function mutation in a gene
that is essential for early embryonic development, and that this mutation is X-linked recessive in the
female. Because half of the sons will inherit the X chromosome that bears this mutation, half of the
sons will fail to develop beyond very early development and will not be detected among the F1
progeny. The proportion of male flies that were affected depends on what fraction of the female
parent’s gametes carried the mutation. In this case, it appears that half of the female’s gametes carried
the mutation.
b) To test whether a gene is X-linked, you can usually do a reciprocal cross. However, in this case it would
be impossible to obtain adult male flies that carry the mutation; they are dead. If the hypothesis
proposed in a) above is correct, then half of the females, and none of the living males in the F1 should
carry the mutant allele. You could therefore cross F1 females to wild type males, and see whether the
expected ratios were observed among the offspring (e.g. half of the F1 females should have fewer
male offspring than expected, while the other half of the F1 females and all of the males should have
roughly equal numbers of male and female offspring).
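The ratios expected under the hypothesis in part (a) can be checked by enumerating gametes. The following is a minimal sketch, assuming a mother crossed to a wild type father; the labels "X+" (wild type X) and "Xm" (X carrying the recessive lethal allele) and the function name `surviving_offspring` are hypothetical names chosen for this illustration.

```python
from fractions import Fraction
from itertools import product


def surviving_offspring(mother, father=("X+", "Y")):
    """Fractions of daughters and sons among the surviving offspring.

    Each parent is a pair of sex chromosomes. "X+" is a wild type X, "Xm"
    is an X carrying the recessive lethal allele, and "Y" is the Y.
    """
    counts = {"female": 0, "male": 0}
    for egg, sperm in product(mother, father):
        sex = "male" if sperm == "Y" else "female"
        # An embryo survives only if it has at least one wild type X.
        if "X+" in (egg, sperm):
            counts[sex] += 1
    total = sum(counts.values())
    return {sex: Fraction(n, total) for sex, n in counts.items()}


print("carrier mother:    ", surviving_offspring(("X+", "Xm")))
print("non-carrier mother:", surviving_offspring(("X+", "X+")))
```

Under these assumptions a carrier female is expected to produce about two daughters for every son, while a non-carrier female produces the two sexes in roughly equal numbers, which is the difference the cross in part (b) is designed to reveal.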
2)
a) Treat a population of seeds with a mutagen such as EMS. Allow these seeds to self-pollinate, and then
allow the F1 generation to also self-pollinate. In the F2 generation, smell each flower to find individuals
with abnormal scent.
b) The fishy gene appears to be required to make the normal floral scent. Because the flowers smell fishy
in the absence of this gene, one possible explanation of this is that fishy makes an enzyme that
converts a fishy-smelling intermediate into a chemical that gives flowers their normal, sweet smell.
Note that although we show this biochemical pathway as leading from the fishy-smelling chemical to the
sweet-smelling chemical in one step, it is likely that there are many other enzymes that act after the fishy
enzyme to make the final, sweet-smelling product. In either case, blocking the pathway at the step
catalyzed by the fishy enzyme would explain the fishy smell.
c) In nosmell plants, the normal sweet smell disappears. Unlike fishy, the sweet smell is not replaced by
any intermediate chemical that we can easily detect. Thus, we cannot conclude where in the biochemical
pathway the nosmell mutant is blocked; nosmell may normally therefore act either before or after fishy
normally acts in the pathway:
Alternatively, nosmell may not be part of the biosynthetic pathway for the sweet smelling chemical at all.
It is possible that the normal function of this gene is to transport the sweet-smelling chemical into the cells
from which it is released into the air, or maybe it is required for the development of those cells in the first
place. It could even be something as general as keeping the plants healthy enough that they have enough
energy to do things like produce floral scent.
3)
a) Dominant mutations are generally much rarer than recessive mutations. This is because mutation of a
gene tends to cause a loss of the normal function of this gene. In most cases, having just one normal
(wt) allele is sufficient for normal biological function, so the mutant allele is recessive to the wt allele.
Very rarely, rather than destroying normal gene function, the random act of mutation will cause a gene
to gain a new function (e.g. to catalyze a new enzymatic reaction), which can be dominant (since it
performs this new function whether the wt allele is present or not). This type of gain-of-function
dominant mutation is very rare because there are many more ways to randomly destroy something
than by random action to give it a new function (think of the example given in class of stomping on an
iPod).
b) Dominant mutations should be detectable in the F1 generation, so the F1 generation, rather than the F2
generation can be screened for the phenotype of interest.
c) Large deletions, such as those caused by some types of radiation, are generally less likely than point
mutations to introduce a new function into a protein: it is hard for a protein to gain a new function if
the entire gene has been removed from the genome by deletion.
4)
a) Mutagenize a wild type (prototrophic) strain and screen for mutations that fail to grow on minimal
media, but grow well on minimal media supplemented with proline.
b) Take mutants #1-#10 and characterize them, based on (1) genetic mapping of the mutants (different
locations indicate different genes); (2) different response to proline precursors (a different response
suggests different genes); (3) complementation tests among the mutations (if they complement then
they are mutations in different genes).
c) If the mutations are in different genes then the F1 progeny would be wild type (able to grow on
minimal medium without proline).
d) If the mutations are in the same gene then the F1 progeny would NOT be wild type (unable to grow on
minimal medium without proline).
CHAPTER 13 – ANSWERS
1) These are four common terms that are often used interchangeably by novice students, but do have
distinctly different meanings and uses. (1) gene = general term for a segment of nucleic acid that is
responsible for one or more phenotypes (2) locus = the position of a gene along a chromosome, (3) allele =
the form (DNA sequence) of a gene at a locus, (4) transcription unit = the segment of DNA that is
transcribed into RNA (often mRNA in the case of a protein coding gene).
2) Form (1) RR (red) x rr (white) gives Rr (red progeny). “R” is dominant to “r”.
Form (2) r+r+ (red) x r-r- (white) gives r+r- (red progeny). “r+” is dominant to “r-”.
For pink progeny, the symbols are the same, only “R” or “r+” is semi-dominant to “r” or “r-”.
3) If your blood type is B, then your genotype is either IBi or IBIB. If your genotype is IBi, then your parents
could be any combination of genotypes, as long as one parent had at least one i allele, and the other
parent had at least one IB allele. If your genotype is IBIB, then both parents would have to have at least
one IB allele.
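The parental combinations described in this answer can be listed by brute force. This is a minimal sketch, assuming the three ABO alleles are written as the plain strings "IA", "IB" and "i"; the helper names (`genotype`, `possible_children`, `parent_pairs_for`) are hypothetical and exist only for this illustration.

```python
from itertools import combinations_with_replacement, product

ALLELES = ("IA", "IB", "i")


def genotype(a, b):
    """Order-independent genotype, e.g. genotype("i", "IB") -> ("IB", "i")."""
    return tuple(sorted((a, b), key=ALLELES.index))


def possible_children(parent1, parent2):
    """All genotypes formed by taking one allele from each parent."""
    return {genotype(a, b) for a, b in product(parent1, parent2)}


# The six possible ABO genotypes.
GENOTYPES = [genotype(a, b) for a, b in combinations_with_replacement(ALLELES, 2)]


def parent_pairs_for(child):
    """Unordered parental genotype pairs that can produce the given child."""
    return [
        (p1, p2)
        for p1, p2 in combinations_with_replacement(GENOTYPES, 2)
        if child in possible_children(p1, p2)
    ]


for child in (("IB", "i"), ("IB", "IB")):
    pairs = parent_pairs_for(child)
    print(child, "->", len(pairs), "possible parental combinations")
    for p1, p2 in pairs:
        print("   ", "".join(p1), "x", "".join(p2))
```

Every pair listed for an IBi child has one parent carrying IB and the other carrying i, and every pair listed for an IBIB child has IB in both parents, matching the rule stated in the answer.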
4) case 1 co-dominance
case 2 incomplete-dominance
case 3 incomplete penetrance
case 4 pleiotropy
case 5 haplo-sufficiency
case 6 haplo-insufficiency
case 7 broad (variable) expressivity
5) Mutant#1 = hypomorph
Mutant#2 = hypermorph
Mutant#3 = amorph
Mutant#4 = neomorph
Mutant#5 = antimorph
CHAPTER 14 – ANSWERS
1) No. Since chromosomes vary greatly in size, the number of chromosomes does not correlate with the total
DNA content. For reasons discussed in Chapter 5 and this chapter, the number of genes does not correlate
closely to DNA content either.
2) Heterochromatic regions with repetitive DNA, centromeres, and telomeres are examples of gene-poor
regions of chromosomes.
3)
a. Only one (except for holocentric chromosomes, not discussed in this chapter).
b. The two centromeres might get pulled towards opposite poles at mitosis/meiosis resulting in
chromosome breakage.
c. It would not segregate properly at mitosis or meiosis, leading to aneuploidy. In order to segregate
correctly, there would have to be another way to control its movement at mitosis and meiosis.
4)
a. At the end of G1, 16 chromosomes with 1 chromatid each.
b. At the end of S, 16 chromosomes with 2 chromatids each.
c. At the end of G2, 16 chromosomes with 2 chromatids each.
d. At the end of mitosis, 16 chromosomes with 1 chromatid each.
5)
a. There is little correlation between any of these, with the exception that larger genomes tend to
have more genes.
b. The C-value paradox can be explained by genomes having different amounts of non-coding DNA
between genes and within genes as introns.
c. If we define “organismal complexity” as the size of the genome (or number of cells/organism), then
larger, more complex organisms tend to have more genes although not always and not in a direct,
linear, proportioned manner. Also, those with larger genomes tend to have greater distances
between genes.
CHAPTER 15 - ANSWERS:
1)
a) Red blood cells do not have chromosomes. They are terminally differentiated and have expelled their
nucleus.
b) First, it is difficult to collect cells in anaphase. Second, in anaphase there would be twice as many
chromosomes, which would make identifying them much harder.
2) Yes. Males being 46,XY have slightly less DNA than 46,XX females, but still have the same number of
chromosomes.
3)
a) True.
b) True.
c) False, only females have a paternal X chromosome.
d) True.
e) False, only males have a paternal Y chromosome.
f) False, no one has a maternal Y chromosome. Females don’t have Y chromosomes.
g) False, typically no one has a paternal mitochondrial chromosome. Mitochondria are maternally
inherited. However, there are rare cases of inheritance of paternal mitochondria.
h) True.
4) Centromeres function as a “chromosome's handle”. Each needs one handle but it doesn't matter where
along the chromosome it is.
5) If there was only one ori in the middle of the chromosome it would take too long for the replication forks
to reach the ends of the chromosome. Even with thousands of ori's per chromosome, it still takes 8 hours
to replicate our DNA.
6) Chromatin is the material from which chromosomes are made (mostly DNA + protein). DNA is a
component of both chromatin and chromosomes.
7) Top left: Histone proteins; top right: Histone and Cohesin proteins, bottom right: Histones, Cohesins,
Condensins, and Kinetochore proteins; bottom left: Histone, Condensin, and Kinetochore proteins.
8)
a) DNA Polymerases are found inside the nucleus and the mitochondria.
b) RNA Polymerases are found inside the nucleus and the mitochondria.
c) Ribosomes are found free in the cytosol, on the surface of the rough ER, and inside the mitochondria.
(Some have been found in the nucleus, too.)
9)
a) The F8 gene could work on an autosome. Its mRNAs would still leave the nucleus to be translated
in the cytosol.
b) The SRY gene would not work normally on an autosome because then females would have the gene as
well as males and thus females would become males.
c) The MT-CO1 gene would not work on an autosome (nuclear) because the genetic code is different in
the nucleus (vs. mitochondrion). The protein must be translated inside the mitochondria to be the correct
amino acid sequence. If it were translated in the cytosol the amino acid sequence would be different, and
thus likely not work normally.
CHAPTER 16 – ANSWERS
1) If genetic factors blended together like paint then they could not be separated again. The white flowered
phenotype would therefore not reappear in the F2 generation, and all the flowers would be purple or
maybe light purple, not white.
2) Your choice……
3) There is a maximum of two alleles for a normal autosomal locus from a diploid individual. In the whole
population there can be essentially an unlimited number of different alleles; the limit being determined by
the population size.
4)
a. In the F1 generation, the genotype of all individuals will be Ww and all of the dogs will have wiry
hair.
b. In the F2 generation, there would be an expected 3:1 ratio of wiry-haired to smooth-haired dogs.
c. Although it is expected that only one out of every four dogs in the F2 generation would have
smooth hair, large deviations from this ratio are possible, especially with small sample sizes. These
deviations are due to the random nature in which gametes combine to produce offspring. Another
example of this would be the fairly common observation that in some human families, all of the
offspring are either girls, or boys, even though the expected ratio of the sexes is essentially 1:1.
d. You could do a test cross, i.e. cross the wiry-haired dog to a homozygous recessive dog (ww).
Based on the phenotypes among the offspring, you might be able to infer the genotype of the
wiry-haired parent.
e. From the information provided, we cannot be certain which, if either, allele is wild-type. Generally,
dominant alleles are wild-type, and abnormal or mutant alleles are recessive.
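The expectations in parts a to d can be checked by enumerating gametes. The following is a minimal sketch in Python; the function names cross and prob_no_smooth are made up for this illustration, and the litter size of four is only an example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def cross(parent1, parent2):
    """Genotype probabilities for a single-locus cross, e.g. cross('Ww', 'Ww')."""
    # Each parent contributes one of its two alleles with equal probability.
    counts = Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}

f1 = cross("WW", "ww")          # part a: every F1 dog is Ww
f2 = cross("Ww", "Ww")          # part b: 1 WW : 2 Ww : 1 ww, i.e. 3 wiry : 1 smooth
testcross = cross("Ww", "ww")   # part d: a heterozygous parent gives 1 wiry : 1 smooth

p_wiry = 1 - f2["ww"]

def prob_no_smooth(litter_size):
    """Chance that an F2 litter of this size contains no smooth-haired pups (part c)."""
    return p_wiry ** litter_size

print(f2)
print(prob_no_smooth(4))
```

With four pups, the chance of seeing no smooth-haired offspring at all is (3/4)^4 = 81/256, which shows why small litters often deviate from 3:1.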
5) Even before the idea of a homozygous genotype had really been formulated, Mendel was still able to
assume that he was working with parental lines that contained the genetic material for only one variant of
a trait (e.g. EITHER green seeds or yellow seeds), because these lines were pure-breeding. Pure-breeding
means that the phenotype doesn’t change over several generations of self-pollination. If the parental lines
had not been pure-breeding, it would have been very hard to make certain key inferences, such as that the
F1 generation could contain the genetic information for two variants of a trait, although only one variant
was expressed. This inference led eventually to Mendel’s First Law.
6) Equal segregation of alleles occurs only in meiosis. Although mitosis does produce daughter cells that are
genetically equal, there is no segregation (i.e. separation) of alleles during mitosis; each daughter cell
contains both of the alleles that were originally present in the parent cell.
CHAPTER 17 – ANSWERS
1)
though. For the Punnett square on the right Figure 7, you can simplify it as:
R;Y R;y
R/r R/r
r;y
Y/y y/y
3) The “9” would increase, both “3” would decrease, and the “1” would increase.
4) Two classes, the parentals, would increase, while two classes would decrease, the recombinants.
CHAPTER 18 – ANSWERS
1) Crossovers can be observed cytologically directly under the microscope as chiasmata.
Recombination is defined genetically as the frequency calculated from the observed phenotypic
proportions in the progeny.
Crossovers lead to recombination when they are detected using genetic marker loci. Not all crossovers
result in recombination – some can’t be detected because no visible markers are recombined.
Some recombinants involve crossovers, but not all recombinants result from crossovers.
Crossovers between non-sister chromatids can result in recombination, while crossovers between sister
chromatids, which have identical alleles, will not show any recombination.
When there are two crossovers between the loci being scored for recombination, the result will appear to
be parental, not recombinant.
Recombination can occur without crossovers when marker loci are on different chromosomes, which then
assort independently.
2) The use of pure breeding lines allows the researcher to be sure that he/she is working with homozygous
(known) genotypes. If a parent is known to be homozygous, then all of its gametes will have the same
genotype. This simplifies the definition of parental genotypes and therefore the calculation of
recombination frequencies.
3) This tight linkage would suggest that individuals with the earlobe phenotype would likely carry alleles that
increased their risk of cardiovascular disease. These individuals could therefore be informed of their
increased risk and have an opportunity to seek increased monitoring and reduce other risk factors.
4)
a. It assumes that the loci are completely unlinked.
b. The expected ratio would be all parentals and no recombinants. For example, if the parental
gametes were AB and ab, then the gametes produced by the dihybrids would also be AB and ab,
and the offspring of a cross between the two dihybrids would all be genotype AABB:AaBb:aabb, in
a 1:2:1 ratio. If the parental gametes were Ab and aB, then the gametes produced by the dihybrids
would also be Ab and aB, and the offspring of a cross between the two dihybrids would all be
genotype AAbb:AaBb:aaBB, in a 1:2:1 ratio.
5)
a. Parental: CcEe and ccee; Recombinant: Ccee and ccEe.
b. Parental: Ccee and ccEe; Recombinant: CcEe and ccee.
6) a)- Let WwYy be the genotype of a purple-flowered (W), green seeded (Y) dihybrid. The cross is WwYy ×
wwyy. Half of the progeny will have yellow seeds whether the loci are linked or not. You cannot tell if they
are linked or not given only this information.
b)- You need to know the proportion of the progeny that are white- or purple-flowered, and in what
frequencies each seed colour appears with white and purple flowers, i.e. the frequencies of the four
phenotypic classes. This would tell you whether the two loci are unlinked or, if linked, how tightly
they are linked.
7) If the progeny of the cross aaBB x AAbb is testcrossed, and the following genotypes are observed among
the progeny of the testcross, what is the frequency of recombination between these loci?
AaBb 135 Aabb 430 aaBb 390 aabb 120
(135 + 120)/(135+120+390+430)= 24%
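The same arithmetic can be written as a short Python sketch. The helper name recombination_frequency is hypothetical; the parental classes are taken to be Aabb and aaBb because the original cross was aaBB x AAbb.

```python
def recombination_frequency(counts, parental):
    """Fraction of testcross progeny that fall in non-parental classes."""
    total = sum(counts.values())
    recombinant = sum(n for genotype, n in counts.items() if genotype not in parental)
    return recombinant / total

counts = {"AaBb": 135, "Aabb": 430, "aaBb": 390, "aabb": 120}
rf = recombination_frequency(counts, parental={"Aabb", "aaBb"})
print(f"{rf:.1%}")  # 255 / 1075, which rounds to 24%
```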
8) See section 3.3. Syntenic is the term for genes found on the same chromosome. Linked genes are always
found on the same chromosome, and so are always syntenic. If two genes are far enough apart
on the same chromosome, crossover events will make them assort independently, so they won’t
appear linked. In this latter situation, the genes are syntenic but not linked.
CHAPTER 19 – ANSWERS
1) Let tt be the genotype of short-tasseled plants, and rr the genotype of pathogen-resistant plants. We need to
start with homozygous lines with contrasting combinations of alleles, for example:
P: RRtt (pathogen sensitive, short tassels) × rrTT (pathogen resistant, long tassels)
F1: RrTt (sensitive, long) × rrtt (resistant, short)
F2: parental Rrtt (sensitive, short) , rrTt (resistant, long)
recombinant rrtt (resistant, short) , RrTt (sensitive, long)
2) Let mm be the genotype of mutants that fail to learn, and ee the genotype of mutants with orange eyes. We need to
start with homozygous lines with contrasting combinations of alleles, for example (wt means wild-type):
P: MMEE (wt eyes, wt learning) × mmee (orange eyes, failure to learn)
F1: MmEe (wt eyes, wt learning) × mmee (orange eyes, failure to learn)
F2: parental MmEe (wt eyes, wt learning) , mmee (orange eyes, failure to learn)
recombinant Mmee (orange eyes, wt learning), mmEe (wt eyes, failure to learn)
3) Given a triple mutant aabbcc , cross this to a homozygote with contrasting genotypes, i.e. AABBCC, then
testcross the trihybrid progeny, i.e.
P: AABBCC × aabbcc
F1: AaBbCc × aabbcc
Then, in the F2 progeny, find the two rarest phenotypic classes; these should have reciprocal genotypes,
e.g. aaBbCc and Aabbcc. Find out which of the three possible orders of loci (i.e. A-B-C, B-A-C, or B-C-A)
would, following a double crossover that flanked the middle marker, produce gametes that correspond to
the two rarest phenotypic classes. For example, if the rarest phenotypic classes were produced by
genotypes aaBbCc and Aabbcc, then the trihybrid’s contribution to these genotypes was aBC and Abc.
Since the parental gametes were ABC and abc the only gene order that is consistent with aBC and Abc
being produced by a double crossover flanking a middle marker is B-A-C (which is equivalent to C-A-B).
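The reasoning above amounts to finding the single locus at which a double-crossover gamete differs from the parental gamete it most resembles. A minimal Python sketch follows; middle_locus is a hypothetical helper, and gametes are assumed to be written with the loci in the same (arbitrary) order.

```python
def middle_locus(parentals, double_crossover):
    """Return the locus that a double crossover has swapped relative to a parental gamete."""
    for parental in parentals:
        differing = [i for i, (p, d) in enumerate(zip(parental, double_crossover)) if p != d]
        # A double crossover flanking the middle marker changes exactly one locus.
        if len(differing) == 1:
            return parental[differing[0]].upper()
    raise ValueError("gamete does not differ from either parental at exactly one locus")

parentals = ("ABC", "abc")
print(middle_locus(parentals, "aBC"))  # A
print(middle_locus(parentals, "Abc"))  # A
```

Both rare classes point to A as the middle marker, which matches the order B-A-C.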
4) Based on the information given, the recombinant genotypes with respect to these loci will be Aabb and
aaBb. The frequency of recombination between A-B is 1cM=1%, based on the information given in the
question, so each of the two recombinant genotypes should be present at a frequency of about 0.5%.
Thus, the answer is 0.5%.
5)
a. 4cM
b. Random sampling effects; the same reason that many human families do not have an equal
number of boys and girls.
6) There would be approximately 2% of each of the recombinants: (yellow, straight) and (black, curved), and
approximately 48% of each of the parentals: (yellow, curved) and (black, straight).
7)
A is fur color locus     B is tail length locus     C is behaviour locus

fur (A)   tail (B)   behavior (C)   Freq.   Genotype   AB   AC   BC
white     short      normal            16   aBC        R    R    P
brown     short      agitated           0   ABc        P    R    R
brown     short      normal           955   ABC        P    P    P
white     short      agitated          36   aBc        R    P    R
white     long       normal             0   abC        P    R    R
brown     long       agitated          14   Abc        R    R    P
brown     long       normal            46   AbC        R    P    R
white     long       agitated         933   abc        P    P    P

B              C         A
|--------------|---------|
     4.1cM        1.5cM

Pairwise recombination frequencies are as follows (calculations are shown below):
A - B 5.6%     A - C 1.5%     B - C 4.1%

Recombinants counted for each pair of loci:
Genotype     AB     AC     BC
aBC          16     16      0
ABc           0      0      0
ABC           0      0      0
aBc          36      0     36
abC           0      0      0
Abc          14     14      0
AbC          46      0     46
abc           0      0      0
Total       112     30     82
Frequency   5.6%   1.5%   4.1%
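The pairwise tally can also be reproduced programmatically. This is a sketch only; pairwise_rf is a hypothetical helper, and it takes ABC and abc (the two most frequent classes) as the parental gametes.

```python
from itertools import combinations

progeny = {"aBC": 16, "ABc": 0, "ABC": 955, "aBc": 36,
           "abC": 0, "Abc": 14, "AbC": 46, "abc": 933}
parentals = ("ABC", "abc")
loci = "ABC"

def pairwise_rf(progeny, parentals, loci):
    """Recombination frequency for every pair of loci in a three-point testcross."""
    total = sum(progeny.values())
    result = {}
    for i, j in combinations(range(len(loci)), 2):
        # Allele combinations at loci i and j that are present in the parental gametes.
        parental_pairs = {(p[i], p[j]) for p in parentals}
        recombinant = sum(n for g, n in progeny.items() if (g[i], g[j]) not in parental_pairs)
        result[f"{loci[i]}-{loci[j]}"] = recombinant / total
    return result

rf = pairwise_rf(progeny, parentals, loci)
for pair, value in rf.items():
    print(pair, f"{value:.1%}")
```

The largest distance (A-B) identifies the outer loci, so C lies in the middle, consistent with the map B-C-A.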
CHAPTER 20 – ANSWERS
1) It depends on the chromosomal location of the disease locus. If the gene is autosomal, the probability is
50%. If it is sex-linked, that is on the X-chromosome, it would be 100%. If it is Y-linked, then 0%. In both
situations the probability would decrease if the penetrance was less than 100%.
2)
CHAPTER 21 – ANSWERS
1)
2) Because each egg or sperm cell receives exactly one sex chromosome (even though this can be either an X
or Y, in the case of sperm), it could be argued that the sex chromosomes themselves do obey the law of
equal segregation, even though the alleles they carry may not always segregate equally. However, this
answer depends on how broadly you are willing to stretch Mendel’s First Law.
CHAPTER 22 – ANSWERS
1) Co-dominance
2) Note that a semicolon is used to separate genes on different chromosomes.
Phenotype                                Genotype(s)
a) entirely black                        O^B / O^B ; s / s      O^B / Y ; s / s
b) entirely orange                       O^O / O^O ; s / s      O^O / Y ; s / s
c) black and white                       O^B / O^B ; S / _      O^B / Y ; S / _
d) orange and white                      O^O / O^O ; S / _      O^O / Y ; S / _
e) orange and black (tortoiseshell)      O^O / O^B ; s / s
f) orange, black, and white (calico)     O^O / O^B ; S / _
3) People with hemophilia A use injections of purified Factor VIII proteins (made through the use of
recombinant, cloned Factor VIII gene). It can be delivered on demand (to control existing bleeding) or
regularly (to limit damage to joints).
CHAPTER 23 – ANSWERS
1) The pedigree could show an AD, AR or XR mode of inheritance. It is most likely AD. It could be AR if the
mother was a carrier, and the father was a homozygote. It could be XR if the mother was a carrier, and the
father was a hemizygote. It cannot be XD, since the daughter (#2) would have necessarily inherited the
disease allele on the X chromosome she received from her father.
2) There are many possible answers. Here are some possibilities: if neither of the parents of the father were
affected (i.e. the paternal grandparents of children 1, 2, 3), then the disease could not be dominant. If
only the paternal grandfather was affected, then the disease could only be X-linked recessive if the
paternal grandmother was a heterozygote (which would be unlikely given that this is a rare disease allele).
3)
a. The mode of inheritance is most likely AD, since every affected individual has an affected parent, and the
disease is inherited even in four different matings to unrelated, unaffected individuals. It is very unlikely
that it is XD or XR, in part because an affected father had an affected son.
b. The mode of inheritance cannot be AD or XD, because affected individuals must have an affected parent
when a disease allele is dominant. Neither can it be XR, because there is an affected daughter of a
normal father. Therefore, it must be AR, and this is consistent with the pedigree.
c. The mode of inheritance cannot be AD or XD, because, again, there are affected individuals with
unaffected parents. It is not XR, because there are unaffected sons of an affected mother. It is
therefore likely AR, but note that the recessive alleles for this condition appear to be relatively common
in the population (note that two of the marriages were to unrelated, affected individuals).
d. The mode of inheritance cannot be AD or XD, because, again, there are affected individuals with
unaffected parents. It could be either XR or AR, but because all of the affected individuals are male, and
no affected males pass the disease to their sons, it is likely XR.
4) If a represents the disease allele, individuals a, d, f (who all married into this unusual family) are AA, while
b, c, e, g, h, i, j are all Aa, and k is aa.
5) There is a ½ chance that an offspring of any mating Aa x AA will be a carrier (Aa). So, there is a ½ chance
that #3 will be Aa, and likewise for #4. If #3 is a carrier, there is again a ½ chance that #5 will be a carrier,
and likewise for #6. If #5 and #6 are both Aa, then there is a ¼ chance that this monohybrid cross will
result in #7 having the genotype aa, and therefore being affected by the disease. Thus, the joint
probability is 1/2 x 1/2 x 1/2 x 1/2 x 1/4 =1/64.
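The product rule used here is easy to verify with exact fractions. A minimal Python check, with the individual steps labelled as in the answer above:

```python
from fractions import Fraction
from math import prod

half = Fraction(1, 2)
steps = [
    half,            # #3 is a carrier (Aa)
    half,            # #4 is a carrier (Aa)
    half,            # #5 inherits the disease allele from #3
    half,            # #6 inherits the disease allele from #4
    Fraction(1, 4),  # #7 is aa, given that #5 and #6 are both Aa
]
joint = prod(steps)
print(joint)  # 1/64
```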
CHAPTER 24 – ANSWERS
1)
b.
A crossover occurs between Alu elements on different chromosomes leading to a chromosomal
translocation. Note that the homologous chromosomes are not shown in this figure for
simplicity.
2) Gamma rays are efficient at causing double strand DNA breaks, which are then more likely to rejoin
and produce a deletion.
3) First, obtain permission from the person (and ethical approval from the appropriate oversight board or
committee). Next, isolate some white blood cells, place the cells on a slide, denature the DNA,
hybridize with fluorescent nucleic acid probes specific for the X chromosome or the Y chromosome, and
observe the results with a fluorescence microscope. If they are XXX, there should be three X signals
corresponding to the three X-chromosomes and no Y-chromosome signals. If they are XYY, there
should be one X signal and two Y signals in each cell nucleus.
CHAPTER 25 – ANSWERS
1) 2n=6x=42
2)
a) Two is the maximum number of alleles that can exist for a given gene in a 2n cell of a given diploid
individual.
b) Two is the maximum number of alleles that can exist in a 1n cell of a tetraploid individual.
c) Four is the maximum number of alleles that can exist in a 2n cell of a tetraploid individual.
d) The maximum number of alleles that can exist in a population is theoretically limited only by the
population size.
3)
a) Aneuploidy can disrupt gene balance and disrupt meiosis, whereas even-numbered polyploids (e.g.
tetraploid, hexaploid) can be stable through meiosis, and can retain normal gene balance.
a)
5)
a)
6)
a)
7) As in Figure 12, there is a nondisjunction event during gamete formation. The larger X chromosomes are
shown using open symbols and the smaller Y chromosomes are shown with shaded symbols. A second
division nondisjunction event in the male parent leads to a zygote with an XYY karyotype.
a)
8)
a) 46, XY - zero Barr bodies,
b) 46,XX - one,
c) 47, XYY - zero,
d) 47,XXX - two,
e) 45,X - zero,
f) 47,XXY - one.
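All six answers follow one rule: every X chromosome beyond the first is inactivated, so the number of Barr bodies is the number of X chromosomes minus one. A short Python sketch, assuming karyotypes are written as "count,sex chromosomes" (barr_bodies is a hypothetical helper):

```python
def barr_bodies(karyotype):
    """Number of Barr bodies expected for a karyotype string such as '47,XXY'."""
    sex_chromosomes = karyotype.split(",")[1].strip().upper()
    # All but one X chromosome are inactivated; Y chromosomes do not form Barr bodies.
    return max(sex_chromosomes.count("X") - 1, 0)

for k in ["46, XY", "46,XX", "47, XYY", "47,XXX", "45,X", "47,XXY"]:
    print(k, barr_bodies(k))
```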
9) Having a shortage of key proteins is usually more detrimental than having an excess.
3)
a. red (because A and B are redundant, so products 3 and then 4 can be made)
b. red (because A and B are redundant, so products 3 and then 4 can be made)
c. blue (because product 3 will accumulate, and it is blue)
d. white (because only product 1 and 2 will be present and both are colorless)
e. blue (because only product 1 and 3 will be present and 1 is colorless and 3 is blue)
f. blue (because only product 2 and 3 will be present and 2 is colorless and 3 is blue)
g. white (because only product 1 and 2 will be present and both are colorless)
h) 15 red : 1 white i) 12 red : 4 blue j) 12 red : 4 blue
4)
a) red (because A and B are redundant, so products 3 and then 4 can be made)
b) red (because A and B are redundant, so products 3 and then 4 can be made)
c) blue (because product 3 will accumulate, and it is blue)
d) yellow (because only product 1 and 2 will be present and 1 is colorless and 2 is yellow)
e) blue (because only product 1 and 3 will be present and 1 is colorless and 3 is blue)
f) green? (because only product 2 and 3 will be present and 2 is yellow and 3 is blue, so probably the
fruit will be some combination of those two colors)
g) yellow (because only product 1 and 2 will be present and 1 is colorless and 2 is yellow)
h) 15 red: 1 yellow
i) 12 red: 3 blue:1 green
j) 12 red: 4 blue
5) Epistasis is demonstrated when the phenotype for a mutant in one locus is prevented from being expressed
by a mutant at another locus. In this case, we would expect a homozygous mutant at one locus (e.g. D) to
be the same phenotype as a homozygous mutant in both loci (e.g. D and A, or D and B).
So, the following situations from questions 2-4 demonstrated epistasis:
Q#2: No epistasis can be determined from the phenotypes (even though we know from the pathway provided
that D is downstream of A and B). There are only two possible phenotypes. So even though the D locus
might be epistatic to A and B, one cannot see this interaction because the product of both A and B
(compound 3) is colourless.
PAGE 20 OPEN GENETICS LECTURES – FALL 2017
CHAPTER QUESTION - ANSWERS
Q#3: The phenotypes show that D is epistatic to A and B:
- aadd looks like AAdd or Aadd; dd prevents the expression of the A or a alleles.
- bbdd looks like BBdd or Bbdd: dd prevents the expression of the B or b alleles.
Note that the triple mutant aabbdd would be colourless (white).
Q#4: The phenotypes show that D is epistatic to A, because aadd looks like AAdd or Aadd.
With bbdd, the difference between bbdd (green), Bbdd (blue), and BBdd (blue) is apparent. Thus, the
phenotypes do not provide evidence for epistasis between B and D.
6) The answer is the same for a) – d)
P could have been either: AABB x aabb or aaBB x AAbb;
F1 was : AaBb x AaBb
7) Conduct an enhancer/suppressor screen (which can also result in the identification of revertants), then
allow the plants to self-pollinate in order to make any new, recessive mutations homozygous.
8) Depending which amino acids were altered, and how they were altered, a second mutation in g*g* could
either have no effect (in which case the phenotype would be the same as gg), or it could possibly cause a
reversion of the phenotype to wild-type, so that g*g* and GG have the same phenotype.
Case 2: If the normal function of gene A is in the same process as G, such that a is a recessive allele that
increases the severity of the gg mutant (i.e. a is an enhancer of g) then the phenotype of aagg could be :
no leaves. The phenotypic ratios among the progeny of a dihybrid cross depend on whether aa mutants
have a phenotype independent of gg, in other words, do aaG_ plants have a phenotype that is different
from wild-type or from A_gg. There is no way to know this without doing the experiment, since it depends
on the biology of the particular gene, mutation and pathway involved, so there are three possible
outcomes:
Case 2a) If aa is an enhancer of gg, and aaG_ plants have a mutant phenotype that differs from wild-type
or (A_gg) then the phenotypic ratios among the progeny of a dihybrid cross will be:
9 3 3 1
A_G_ A_gg aaG_ aagg
tubular leaves
Case 2c) If aa is an enhancer of gg, and aaG_ do not have a phenotype that differs from wild-type then the
phenotypic ratios among the progeny of a dihybrid cross will be:
12                    3                 1
A_G_, aaG_            A_gg              aagg
wild-type             tubular leaves    no leaves
Case 3: If the normal function of gene A is in the same process as G, such that a is a recessive allele that
decreases the severity of the gg mutant (i.e. a is a suppressor of g) then the phenotype of aagg could be
: wild-type. The phenotypic ratios among the progeny of a dihybrid cross depend on whether aa mutants
have a phenotype independent of gg, in other words, do aaG_ plants have a phenotype that is different
from wild-type or from A_gg. There is no way to know this without doing the experiment, since it depends
on the biology of the particular gene, mutation and pathway involved, so there are three possible
outcomes:
Case 3a) If aa is a suppressor of gg, and aaG_ plants have a mutant phenotype that differs from wild-type
or (A_gg) then the phenotypic ratios among the progeny of a dihybrid cross will be:
10                    3                 3
A_G_, aagg            A_gg              aaG_
wild-type             tubular leaves    no leaves (some phenotype that differs from gg)
Case 3b) If aa is a suppressor of gg, and aaG_ plants have a mutant phenotype that is the same as A_gg, then
the phenotypic ratios among the progeny of a dihybrid cross will be:
10                    6
A_G_, aagg            A_gg, aaG_
wild-type             tubular leaves
Case 3c) If aa is a suppressor of gg, and aaG_ plants do not have a phenotype that differs from wild-type
then the phenotypic ratios among the progeny of a dihybrid cross will be:
13                    3
A_G_, aaG_, aagg      A_gg
wild-type             tubular leaves
Case 4: If the normal function of gene A is in the same process as G, such that a is a
recessive allele with a phenotype that is epistatic to the gg mutant, then the
phenotype of both aaG_ and aagg could be : no leaves. The phenotypic ratios among
the progeny of a dihybrid cross will be:
9                     4                 3
A_G_                  aaG_, aagg        A_gg
wild-type             no leaves         tubular leaves
Case … ?: There are many more phenotypes and ratios that could be imagined (e.g. different types of
dominance relationships, different types of epistasis, lethality…etc). Isn’t genetics wonderful? It is
sometimes shocking that more people don’t want to become geneticists.
The point of this exercise is to show that many different ratios can be generated, depending on the biology
of the genes involved. On an exam, you could be asked to calculate the ratio, given particular biological
parameters. So, this exercise is also meant to demonstrate that it is better to learn how to calculate ratios
than just trying to memorize which ratios match which parameters. In a real genetic screen, you would
observe the ratios, and then try to deduce something about the biology from those ratios.
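To illustrate calculating a ratio rather than memorizing it, the sketch below starts from the 9:3:3:1 genotypic classes of a dihybrid cross and pools the classes that share a phenotype. It assumes the two loci assort independently; phenotypic_ratio is a hypothetical helper, and the phenotype assignments are those given in Cases 3c and 4 above.

```python
from collections import Counter

# Genotypic classes among the progeny of AaGg x AaGg, assuming independent assortment.
CLASS_WEIGHTS = {"A_G_": 9, "A_gg": 3, "aaG_": 3, "aagg": 1}

def phenotypic_ratio(phenotype_of):
    """Pool the 9:3:3:1 genotypic classes according to the phenotype each one shows."""
    ratio = Counter()
    for genotype_class, weight in CLASS_WEIGHTS.items():
        ratio[phenotype_of[genotype_class]] += weight
    return dict(ratio)

case_3c = phenotypic_ratio({"A_G_": "wild-type", "A_gg": "tubular leaves",
                            "aaG_": "wild-type", "aagg": "wild-type"})
case_4 = phenotypic_ratio({"A_G_": "wild-type", "A_gg": "tubular leaves",
                           "aaG_": "no leaves", "aagg": "no leaves"})
print(case_3c)  # 13 wild-type : 3 tubular leaves
print(case_4)   # 9 wild-type : 3 tubular leaves : 4 no leaves
```

Changing the phenotype assigned to any one class is enough to produce the other ratios listed above.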
CHAPTER 28 – ANSWERS
1)
a) There will be a 6kb band (the insert) and a 3kb band (the plasmid vector)
b) There would be a single 9kb band.
a)
3) No. This policy is not cost-effective, and would violate various constitutional rights. Thus, it is unlikely to
happen. Most democratic countries only store DNA profiles for people accused of a crime (US) or
convicted of a crime (Canada).
4) For CSF1PO the 7 allele is maternal and the 12 allele is paternal; for D8S1179 we can't tell which allele is
which; for D21S11 the 9 allele is paternal and the 10 allele is maternal.
                              Potential fathers
STR        Child    Mother    #1       #2       #3
CSF1PO     7/12     7/10      7/10     12/14    12/13
D8S1179    6/6      6/8       6/9      12/12    5/6
D21S11     9/10     10/11     5/5      9/16     9/9
5) No. He has all the paternal alleles present in the child. He would have to be excluded based on other
evidence. This might come from other STRs being tested, and/or he wasn’t “involved” with the mother.
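The exclusion logic in answers 4 and 5 can be sketched in Python. The helper names paternal_options and is_excluded are hypothetical; the allele data are copied from the table above, and mutation is ignored.

```python
child = {"CSF1PO": (7, 12), "D8S1179": (6, 6), "D21S11": (9, 10)}
mother = {"CSF1PO": (7, 10), "D8S1179": (6, 8), "D21S11": (10, 11)}
candidates = {
    "#1": {"CSF1PO": (7, 10), "D8S1179": (6, 9), "D21S11": (5, 5)},
    "#2": {"CSF1PO": (12, 14), "D8S1179": (12, 12), "D21S11": (9, 16)},
    "#3": {"CSF1PO": (12, 13), "D8S1179": (5, 6), "D21S11": (9, 9)},
}

def paternal_options(child_alleles, mother_alleles):
    """Alleles the father could have contributed at one locus."""
    a, b = child_alleles
    options = set()
    if a in mother_alleles:  # if a came from the mother, b came from the father
        options.add(b)
    if b in mother_alleles:  # if b came from the mother, a came from the father
        options.add(a)
    return options

def is_excluded(candidate):
    """A man is excluded if he lacks the required paternal allele at any locus."""
    return any(
        not (paternal_options(child[locus], mother[locus]) & set(candidate[locus]))
        for locus in child
    )

for name, profile in candidates.items():
    print(name, "excluded" if is_excluded(profile) else "not excluded")
```

Only #3 carries a possible paternal allele at all three loci, which is why he cannot be excluded on these data alone.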
6) a)- We know they are stable within an individual because we can test DNA samples from various tissue
types and times during their lifetime and they all give the same DNA profile for all the individual’s samples.
They are invariant for one person.
b)- No, because there would not be a standard for a single person. This might show up as differences in
DNA profiles if DNA samples are obtained from different tissues or at different times over their lifetime.
8)
#1 B3B4E1E1
#2 B3B4E1E2
#3 B4B4E2E2
#4 B3B3E1E2
#5 B2B4E1E2
#6 B2B3E2E2
9) Neither #3 nor #6 can be a parent, since neither has any alleles in common with #1 at locus E.
10) a) the region of the fragment that is most likely to be polymorphic
i) TAAAGGAATCAATTACTTCTGTGTGTGTGTGTGTGTGTGTGTGTTCTTAGTTGTTTAAGTTTTAAGTTGTGA
ii) ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
iii) ATTTCCTTAGTTAATGAAGACACACACACACACACACACACACAAGAATCAACAAATTCAAAATTCAACACT
b) any simple sequence repeats
i) TAAAGGAATCAATTACTTCTGTGTGTGTGTGTGTGTGTGTGTGTTCTTAGTTGTTTAAGTTTTAAGTTGTGA
ii) ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
iii) ATTTCCTTAGTTAATGAAGACACACACACACACACACACACACAAGAATCAACAAATTCAAAATTCAACACT
c) the best target sites for PCR primers that could be used to detect polymorphisms in the length of the
simple sequence repeat region in different individuals.
i) TAAAGGAATCAATTACTTCTGTGTGTGTGTGTGTGTGTGTGTGTTCTTAGTTGTTTAAGTTTTAAGTTGTGA
ii) ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
iii) ATTTCCTTAGTTAATGAAGACACACACACACACACACACACACAAGAATCAACAAATTCAAAATTCAACACT
CHAPTER 37 – ANSWERS
1) Most SNPs are the result of a single base pair substitution mutation that happened once during human
evolution. The chance is very small that a different mutation could happen at the same site in a different
person and also become prevalent.
2) If the conditions are not stringent enough the labelled DNA will attach to both oligos no matter what a
person's genotype is. It would appear that the person is heterozygous for every single SNP, a very unlikely
situation.
3) Yes. We would need to obtain DNA samples from many people with blue eyes and many people with
brown or other eye colours. In fact, most people with blue eyes have this phenotype because they have
mutations in two genes on chromosome 15 called OCA2 and HERC2.
4) Yes. In fact, 23andMe looks for a SNP near HERC2 for this purpose. In Europeans most people with the GG
genotype of the SNP have blue eyes while people with the AG and AA genotypes have brown eyes.
5) For this disease or phenotype there are two genes responsible, one is on chromosome 1 and the other is
on chromosome 3.
CHAPTER 38 – ANSWERS
1)
a) q = √0.01 = 0.1
b) 1-q = p; 1-0.1 = 0.9
c) 2pq = 2(0.1)(0.9) = 0.18
d) p² = 0.81
2) First, calculate allele frequencies:
p = [2(AA) + (Aa)] / total number of alleles scored = [2(432) + 676] / [2(432+676+92)] = 0.6417
q = [2(aa) + (Aa)] / total number of alleles scored = [2(92) + 676] / [2(432+676+92)] = 0.3583
Next, given these observed allele frequencies, calculate the genotypic frequencies that would
be expected if the population was in Hardy-Weinberg equilibrium.
p² = 0.6417² = 0.4118
2pq = 2(0.6417)(0.3583) = 0.4598
q² = 0.3583² = 0.1284
Finally, given these expected frequencies of each class, calculate the expected numbers of each in your
sample of 1200 individuals, and compare these to your actual observations.
        expected                   observed (reported in the original question)
AA      0.4118 × 1200 = 494        432
Aa      0.4598 × 1200 = 552        676
aa      0.1284 × 1200 = 154         92
The population does not appear to be at Hardy-Weinberg equilibrium, since the observed genotypic
frequencies do not match the expectations. Of course, you could do a chi-square test to determine how
significant the discrepancy is between observed and expected.
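The calculation in answer 2, together with the chi-square statistic mentioned above, can be sketched in Python. Variable names are illustrative only; the critical value of 3.84 assumes one degree of freedom at the 5% level, which is the usual choice for a two-allele Hardy-Weinberg test.

```python
observed = {"AA": 432, "Aa": 676, "aa": 92}
n = sum(observed.values())

# Allele frequencies counted directly from the genotypes.
p = (2 * observed["AA"] + observed["Aa"]) / (2 * n)
q = 1 - p

# Genotype counts expected under Hardy-Weinberg equilibrium.
expected = {"AA": p**2 * n, "Aa": 2 * p * q * n, "aa": q**2 * n}

chi_square = sum((observed[g] - expected[g]) ** 2 / expected[g] for g in observed)

print(f"p = {p:.4f}, q = {q:.4f}")
for g in observed:
    print(g, "expected", round(expected[g]), "observed", observed[g])
print("chi-square exceeds 3.84:", chi_square > 3.84)
```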
3) In this theoretical question, the frequency of genotype AA is set at 432/1200, and we are asked what
frequencies of the other classes would fit a Hardy-Weinberg equilibrium. Given that p² = 432/1200 = 0.36,
p = 0.6 and q = 0.4. Given these allele frequencies and a sample size of 1200 individuals, there
should be 576 Aa individuals (2pq × 1200 = 2(0.6)(0.4) × 1200 = 576) and 192 aa individuals (q² × 1200 =
0.4² × 1200 = 192), if the population was at Hardy-Weinberg equilibrium with 432 AA individuals.
4) The actual population appears to have more heterozygotes and fewer recessive homozygotes than would
be expected for Hardy-Weinberg equilibrium. There are many possible reasons that a population may not
be in equilibrium (see Table 1). In this case, there is possibly some selection against homozygous recessive
genotypes, in favour of heterozygotes in particular. Perhaps the heterozygotes have some selective
advantage that increases their fitness.
It is also worth noting the discrepancies between the allele frequencies calculated in Q2 and Q3. In
question 2, we calculated the frequencies directly from the genotypes (this is the most accurate method,
and does not require the population to be in equilibrium). In question 3, we essentially estimated the frequency
based on one of the phenotypic classes. The discrepancy between these calculations shows the limitations
of using phenotypes to estimate allele frequencies, when a population is not in equilibrium.
CHAPTER 39 – ANSWERS
1) These fish would all be heterozygotes and thus have spines like the deep-water population. The presence
of the enhancer element (deep water) would be dominant to the absence of the element (shallow water).
2)
a) Yellow wings, body, and mouth parts, but normal bristles and claws.
b) All yellow, no normal colour.
c) Same as “a”.
d) The enhancer elements on the stop codon allele might cross regulate the transcription unit on the
deletion allele. That is, wild type enhancers on one allele drive a wild type transcript on the other
allele. (Note: this is the case. See Morris et al. 1999 Genetics 151: 633–651.)
CHAPTER 40 – ANSWERS
3) a) Fast and simple to grow in high density, diploid,
b)
i) zebrafish (for vertebrate eyes); flies for eyes in general
i) zebrafish
ii) Arabidopsis
iii) yeast
iv) C. elegans
v) arguably, any of the organisms, but the vertebrates would be most relevant
CHAPTER 41 – ANSWERS
1) Oncogenes usually arise from gain-of-function mutations, which tend to be dominant (a single mutant
allele is enough to alter the phenotype). Mutations in tumour suppressors are usually loss-of-function
mutations, which tend to be recessive (both alleles must be lost before the phenotype changes).
2) p53 activates DNA repair, apoptosis, and inhibitors of cell division. Different genes involved in each of
these pathways have enhancer elements to which p53 binds; therefore, they can all be activated by p53.
3) Some substances can promote cancer without causing a mutation, for example by inducing the cell cycle
or accelerating it so that there is less time to repair DNA damage. All mutagens are potentially carcinogens,
although some potential mutagens may not cause significant damage to cells in the body due to
detoxification or other reasons that limit their efficacy.
4) Was the dose fed to the rats relevant? Were similar effects seen in other organisms? Do epidemiological
studies support these conclusions? Could the results be replicated by a different research group? What
was the proposed mechanism for this increased incidence?
5) Cancer results from an accumulation of mutations that activate cell division and disable tumour
suppression. HPV infection alone does not satisfy all of these requirements. Also, not all strains of HPV are
equally carcinogenic, and the body’s defense may be able to suppress the activity of the virus.
6) Same for BRCA1 mutations.
END OF ANSWERS
Notes: