OpenGeneticsLectures Fall2017


Open Genetics

Lectures
Fall 2017
Department of Biological Sciences – University of Alberta, Canada

An Open Source Molecular Genetics Textbook

Introduction

Open Genetics Lectures (OGL)

September, 2017 Version
The Open Genetics Lectures textbook is derived from the Open Genetics textbook.

OPEN GENETICS (OG) - History

The first edition of this textbook, called OPEN GENETICS, was produced in January, 2009 as
instructional material for students in Biology 207 at the University of Alberta, and was released to the
public for non-commercial use under the Creative Commons License (See below). Users were
encouraged to make modifications and improvements to the book. All the text in the original 2009
edition was written by Michael Deyholos, Ph.D. In subsequent editions (2010-2014), additional
chapters were written by Mike Harrington, Ph.D., at the University of Alberta. Additional content and
editing were contributed by John Locke, Ph.D. and Mark Wolansky, M.Sc., at the University of Alberta. Photos and some
diagrams were obtained from various, non-copyrighted sources, including Flickr, Wikipedia, Public
Library of Science, and Wikimedia Commons. Photo attributions are listed in the legend with each
image.
Open Genetics Lectures (OGL) – Origin 2015, Updated Summer 2016, 2017
OGL is an alternative approach to an open source textbook. Much of its content is derived from the OG
textbook. The 13 chapters in OG were cut up and distributed into 41 shorter chapters that parallel the
current lecture topics in BIOL 207 (Molecular Genetics and Heredity) at the University of Alberta. More
text content, figures, and chapter-end questions were added in this revision. The most recent version
of OG had ~76,000 words, while the Fall 2015 version of OGL had ~128,000 words, a 68% increase.

This reorganization of OG content into OGL was accomplished during the summer of 2015 by John
Locke, Mike Harrington, Lindsay Canham, and Min Ku Kang. This project was funded in part by the
Alberta Open Educational Resources (ABOER) Initiative, which is made possible through an investment
from the Alberta government. Lindsay Canham was supported by a grant from the Alberta Open
Education Initiative (OEI) through the University of Alberta. Min Ku Kang was supported by a Summer
Student Scholarship from the Centre for Teaching and Learning (CTL), University of Alberta. Without
these sources of financial aid this project would not have been possible. John Locke and Michael
Harrington appreciate their help, as well as that of Michael Deyholos, who initiated this endeavor.
Typographical corrections, rewording, and additional questions were added in the summers of 2016 & 2017
(J. Locke, M. Harrington, and K. King-Jones).
Access to OGL text files through DataVerse
The final version of this work is available via a DataVerse link:
https://dataverse.library.ualberta.ca/dvn/dv/OpenGeneticsLectures
This includes all the .docx files for each chapter and other relevant files. This is made available for
anyone to use, adapt, or improve for educational purposes. If you have edits, improvements or
additions that you wish to share under the same license terms, please contact John Locke, University of
Alberta.
OGL Cover Images:
Girl: Flickr-Kate Andrews-CC BY-NC-SA 2.0 Peas: Wikipedia-Bill Ebbesen-CC BY-SA 3.0
Fly: Wikimedia Commons-Aka-CC BY-SA 2.5 DNA: Wikimedia Commons-Jerome Walker-PD

I OPEN GENETICS LECTURES – FALL 2017

Introduction
Table of Contents
Introduction
    Title page
    Creative Commons Licence
    Online Open Genetics Web Site
    Table of Contents
Chapter 01: DNA is the Genetic Material
    Griffith's Transformation Experiment (1928)
    Avery, MacLeod, & McCarty's Experiment (1944)
    Hershey & Chase Experiment (1952)
    RNA and Protein
Chapter 02: DNA Structure and Replication
    Semi-conservative Replication (vs. Conservative, Dispersive)
    Chromosome Replication (E. coli) – Cairns Experiment
    Origins of Replication (Prokaryote – Single Origin), Replication Fork
    Eukaryote Chromosome Replication – Multiple Origins
    Telomeres
Chapter 03: Genes Encode Proteins
    Central Dogma
    Genes Code for Enzymes – A. Garrod
    Beadle and Tatum: Prototrophic and Auxotrophic Mutants
    One Gene: One Enzyme Hypothesis led to Biochemical Pathway Dissection Using Genetic Screens and Mutants
    Genetics Screens for Mutations Help Characterize Biological Pathways
Chapter 04: Complementation
    Complementation Tests and Allelism
    Complementation Groups = Groups of Allelic Mutations
    Transformation Rescue
Chapter 05: Genes to Genomes
    Central Dogma - Review
    What is a Gene?
    Basic Types of DNA Sequence
    Genes: DNA Acting Directly
    Genes: DNA Transcribed into RNA (RNA Coding Genes)
    Genes: DNA Transcribed into mRNA, Translated into a Polypeptide (Protein Coding Genes)
    How are Genes and Other Sequences Distributed in the Genome?
Chapter 06: Prokaryote Genes: E. coli Lac Operon
    The Lac Operon – a Model Prokaryote Gene
    Negative Regulation – Inducers and Repressors
    Positive Regulation – CAP, cAMP & Polymerase
    The Use of Mutants to Study the Lac Operon
    Summary
Chapter 07: Eukaryote Genes: Structure
    The Eukaryote Genome Contains Various Types of Sequences
    Transcripts of Protein Coding Genes – Processing
    Transcription Regulation – Promoters, Enhancers/Silencers
    Higher Order Chromatin - Additional Levels of Regulating Transcription
    Epigenetics
    DNA is Packaged into Chromatin
    Examples of Chromatin Structure: X-chromosome Inactivation
Chapter 08: Eukaryote Genes: Human Beta-globin Genes
    Beta-globin – Protein and Gene Structure, Clusters, Pseudo-Genes
    Hemoglobin Expression Changes During Development in Humans
    Locus Control Region (LCR) – Another Level of Regulation
    Additional Information – Myoglobin
Chapter 09: Eukaryotic Genes: The Human Lactase (LCT) Gene
    The Lactase Protein
    The LCT Gene and mRNA
    LCT Gene Expression During Development
    Evolution of the LCT Gene
Chapter 10: Eukaryotic Genes: The Drosophila white (w) Gene
    The white Protein
    The white Gene
    The white Gene is X-linked
    The Importance of the white Gene
Chapter 11: Mutations Originate as Damage to DNA
    Mutation and Polymorphism
    Types of Mutations
    Spontaneous Mutations of Biological Origin
    Induced Mutations of Chemical Origin
    Induced Mutations of Physical Origin
    Failure of Repair Systems
Chapter 12: Mutations: Consequences
    Genetic Screening for Mutations: Forward Genetics, Reverse Genetics
    Some Mutations may Not Have Detectable Mutant Phenotypes
    Examples of Human Mutations
Chapter 13: Alleles at a Single Locus
    Terminology
    Somatic vs. Germline Mutations
    Alleles: Hetero-, Homo-, Hemizygosity
    Pleiotropy and Polygenic Inheritance
    Complete Dominance and Recessive
    Incomplete Dominance
    Co-dominance
    Biochemical Basis of Dominance
    Mutant Classification
    Muller’s Morphs
Chapter 14: Mitosis and the Cell cycle
    Four Stages of a Typical Cell Cycle
    Mitosis
    Measures of DNA Content and Chromosome Content
Chapter 15: Human Chromosomes
    Metaphase Chromosome Spreads
    Human Karyograms and Karyotypes
    Parts of a Typical Nuclear Chromosome
    Appearance of a Typical Nuclear Chromosome During the Cell Cycle

    Parts and Appearance of a Mitochondrial Chromosome
    Example Genes
Chapter 16: Mendel's First Law: Segregation of Alleles
    Overview
    Meiosis I
    Meiosis II
    Crossing Over (Intra-chromosomal Recombination)
    One Locus on a Chromosome - Segregation - Monohybrid
    Punnett Squares - 3:1 Ratio
    Single Locus Test Crosses
Chapter 17: Mendel's Second Law: Independent Assortment
    Two Loci on Different Chromosomes
    Two Loci on One Chromosome
    A Dihybrid Cross showing Mendel's Second Law (Independent Assortment)
    The Dihybrid Test Cross
Chapter 18: Genes on the Same Chromosome: Linkage
    Genetic Nomenclature & Symbols
    Recombination
    Unlinked Genes and Complete and Partial Linkage
    Experimentally Determining Recombination Frequency
Chapter 19: Recombination Mapping of Gene Loci
    Genetic Mapping
    Mapping With Three-Point Crosses
    Analysis of Recombination Frequencies in a Three Point Test Cross
    Where Do Crossovers Occur on a Chromosome?
    Resolution of Genetic Maps
Chapter 20: Sex Chromosomes: Sex Linkage
    Autosomes and Sex chromosomes
    Pseudo-autosomal Regions on the X and Y Chromosomes
    Sex Linkage: an Exception to Mendel's First Law
    Y-Linked genes
    Z-linked Genes in Birds
Chapter 21: Sex Chromosomes: Sex Determination
    Sex Determination Mechanisms in Animals
    Environmental Factors
Chapter 22: Sex Chromosomes: Dosage Compensation
    Gene Dosage Problem
    Dosage Compensation in Drosophila
    X-chromosome Inactivation in Mammals
    Mechanism of Sex Determination Systems
Chapter 23: Pedigree Analysis
    Pedigree Analysis
    Modes of Inheritance
    Sporadic and Non-heritable Diseases
    Calculating Probabilities
Chapter 24: Chromosome Rearrangements
    DNA Double Strand Breaks and Incorrect Meiotic Crossovers Cause Chromosomal Rearrangements
    Deletions
    Inversions
    Duplications
    Translocations
    Consequences of Chromosomal Rearrangements
    Chromosomal Rearrangements in Humans
Chapter 25: Chromosomes: Changes in Chromosome Number
    Ploidy Notation
    Polyploidy
    Endoreduplication
    Aneuploidy
    Chromosome Abnormalities in Humans
    Gene Balance
Chapter 26: Gene Interactions
    Mendelian Dihybrid Crosses
    Epistasis and Other Gene Interactions
    Example of Multiple Genes Affecting One Character (Polygenic Inheritance)
    Environmental Factors
    Mendelian Phenotypic Ratios May Not Be As Expected
Chapter 27: Physical Mapping of Chromosomes and Genomes
    Genetic Map (Distance in cM, Recombination Frequency)
    Cytogenetic Map
    Physical Map (DNA Sequence, Restriction Sites)
Chapter 28: Restriction Mapping and Gel Electrophoresis
    Isolating DNA
    Restriction Enzymes and DNA methylation
    DNA Ligation
    Agarose Gel Electrophoresis
    Other Applications of Gel Electrophoresis
    Restriction Mapping
Chapter 29: Recombinant DNA
    Basic Terminology
    Recombinant DNA Techniques
    Using Cloning Vectors
    DNA Ligation
    An Application of Molecular Cloning: Recombinant Insulin
    Genomic DNA Libraries and cDNA Libraries
    Screening a Clone Library
Chapter 30: Cloning a Gene
    Cloning by Complementation – A Hypothetical Auxotrophic Mutation in E. coli
    Cloning by Hybridization of DNA Probes
    Cloning a Gene Using the Transposon Tagging Method
    Current Approaches to Matching Genes to Mutations
Chapter 31: Polymerase Chain Reaction (PCR)
    Isolating Genomic DNA
    Isolating or Detecting a Specific Sequence by PCR
Chapter 32: Observing Intact Chromosomes
    Bright Field Microscopy
    Hybridization Probes
    Fluorescence In Situ Hybridization (FISH)
Chapter 33: DNA Sequencing
    Automated Sanger DNA Sequencing
    Next-generation DNA Sequencing

Chapter 34: Southern/Northern/Western Blots
    Southern Blot
    Northern Blot
    Western Blot
Chapter 35: DNA Variation Studied with Southern Blots
    Mutation and Polymorphism
    Molecular Markers – SNPs and VNTRs
    Restriction Fragment Length Polymorphism (RFLP)
    Construction of Genetic Linkage Maps
    Applications of Molecular Markers
Chapter 36: DNA Variation Studied with PCR
    Short Tandem Repeats (STRs)
    Detecting STRs With PCR and Agarose Gel Electrophoresis
    Detecting STRs With PCR and Capillary Tube Electrophoresis
    Modern DNA Fingerprinting
Chapter 37: DNA Variation Studied with Microarrays
    Single Nucleotide Polymorphisms (SNPs)
    Microarray Technology
    Genome-wide Association Studies (GWAS)
    Direct To Consumer Testing
Chapter 38: Population Genetics
    Allele Frequencies May be Studied at the Population Level
    Hardy-Weinberg Formula
Chapter 39: Evolution of Gene Expression
    Basics of Development
    Variation in Gene Expression and Evolution
    Example 1: Drosophila Yellow Gene
    Example 2: Pitx Expression in Stickleback Fish
    Example 3: Hemoglobin Expression in Placental Mammals
Chapter 40: Transgenic Organisms
    Model Organisms Facilitate Genetic Advances
    What are Transgenic Organisms?
    Making a Transgenic Cell
    Detection of Transgenes and Their Products
    Producing a Transgenic Plant
    Producing a Transgenic Mouse
    Human Gene Therapy
    CRISPR-Cas9 Technology
Chapter 41: Cancer Genetics
    Classification of Cancers
    Cancer Cell Biology
    Hallmarks of Cancer
    Mutagens and Carcinogens
    Oncogenes
    Tumour Suppressor Genes
    Gleevec™ (Imatinib) - The “Poster Boy” of Genetic Research Leading to a Cancer Treatment
Chapter Question: Answers
    Answers for Chapters 1-41
END


Also available:
Online Open Genetics!

To be successful in Introductory Genetics, you are encouraged to use the supplementary electronic
resources provided by the website for Online Open Genetics (http://opengenetics.net). These
resources will help you learn and practice problem-solving skills and self-assess your knowledge as you
progress through the course.

The website provides:
(1) access to short instructional videos and
(2) supplementary readings, as well as
(3) interactive exercises.
All of these will help you deepen your understanding of basic concepts in genetics, and practice and
refine the skills needed to solve common problems in genetic analysis.

These supplementary materials can be accessed using the internet via web browsers on Windows or
Mac computers, or on tablets and iPads. More information is available on the web page.


Definitions:
Gene - a hereditary unit that occupies a specific position (locus) within the genome or chromosome,
has one or more specific effects upon the phenotype of the organism, can mutate into
various forms (alleles), and can recombine with other such units.
Gene locus (plural = loci) - the specific place on a chromosome where a gene is located
Allele - refers to one of the different forms of a gene that can exist at a single gene locus
Genotype - the specific allelic composition of a cell or organism. Normally only the genes under
consideration are listed in a genotype and the alleles at all the remaining gene loci are
considered to be wild type.
Phenotype - the detectable outward manifestation of a specific genotype. In describing a phenotype
usually only the characteristics under consideration are listed while the remaining characters
are assumed to be wild type (normal).
Allelic mutations - two mutations at the same gene locus
Non-allelic mutations - two mutations that affect different gene loci
Genetic Nomenclature & Symbols - What you need to know
Geneticists use a variety of different nomenclature systems to represent genes and their mutations in
different organisms. You will need to become familiar with these different systems in order to
understand genetics and answer questions on an exam.

(Definitions are taken with modification from Griffiths et al, 2000 and A Dictionary of Genetics 4th Ed., King & Stansfield, 1990.)

Gene names and symbols

Different organisms have different nomenclature systems to symbolize their genes
Because different genetic model organisms have historically developed their own nomenclature
systems for denoting genes and alleles, there is a variety of different nomenclature systems in use
today. This can be very confusing for students trying to learn the basics of genetics (genes, alleles, and
mutations). However, all systems have two main parts: (1) a gene name, and (2) an allele name.
Gene Names
Usually, genes have a full name (e.g. the white locus in Drosophila) as well as a short symbol form (e.g.
w in the case of the white locus) that is a unique letter or combination of letters.
• So, the letters “a”, “b”, and “c” would represent different named genes.
• Each named gene would have a unique letter, or combination of letters, for an organism.
- For example, the "vermilion" gene in Drosophila is represented by the letter "v".
- While "vg" is the symbol for the "vestigial" gene.
- And "vvl" is the symbol for the "ventral veins lacking" gene locus.

• Note: the same letter symbols may represent different gene loci in different organisms, but often
the same or similar symbol is used in different organisms.
• Sometimes letters and numbers are used, especially for different loci that have similar phenotypes.
- For example, the arg-1, arg-2, and arg-3 loci described by Beadle and Tatum in Neurospora
have a similar arginine auxotrophic phenotype but are three separate gene loci.
• Also, formally, gene symbols and gene names are always shown in italic text, but in the lecture
portion of BIOL 207, we may not require or use italics in gene names and symbols all the time.
Allele names.
1) Superscripts - usually denote different allelic forms of a gene locus
The normal copy of a gene is known as wild type and is usually symbolized by a superscript plus sign, "+".
E.g. "a+", "b+", or white+, etc., or it is sometimes abbreviated to just "+".
A typical mutant form of the gene, of which there can be many, can be symbolized by a superscript
minus sign, "-". E.g. "a-", "b-", etc., or sometimes abbreviated to just "a", "b", etc. (no superscript).
Therefore if the "genotype" of a diploid organism is given as a+/a-, it means there is a wild type allele
and a mutant allele of the "a" gene at the "a" locus. This might be abbreviated to +/a.
2) UPPER vs. lower case letters are often used to denote dominant and recessive alleles
In diploid heterozygotes, the dominant allele is typically (but not always) designated with the upper
case letter (A) while the recessive allele is given the lower case letter (a). An example of this is Mendel’s
round (R) vs. wrinkled (r) alleles at the pea shape locus. Note that not all wild type alleles are dominant
(capital). Some mutant alleles can be dominant to wild type, even though most are recessive.
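
The superscript convention described above can be sketched in a few lines of Python. This is only an illustration, not part of any standard nomenclature tool: the function names `classify_allele` and `describe_genotype` are hypothetical, superscripts are written inline (a+ and a-), and the sketch deliberately ignores the UPPER vs. lower case convention, which encodes dominance rather than wild type vs. mutant.

```python
def classify_allele(symbol: str) -> str:
    """Classify one allele symbol written with the superscript +/- convention."""
    symbol = symbol.strip()
    if not symbol:
        raise ValueError("empty allele symbol")
    # "a+" (or the abbreviation "+") marks the wild type allele.
    if symbol.endswith("+"):
        return "wild type"
    # "a-" (or the abbreviation "a", with no superscript) marks a mutant allele.
    return "mutant"


def describe_genotype(genotype: str) -> str:
    """Describe a one-locus diploid genotype such as 'a+/a-' or '+/a'."""
    symbols = genotype.split("/")
    if len(symbols) != 2:
        raise ValueError("expected exactly two alleles separated by '/'")
    kinds = sorted(classify_allele(s) for s in symbols)
    if kinds == ["mutant", "mutant"]:
        return "two mutant alleles"
    if kinds == ["wild type", "wild type"]:
        return "two wild type alleles"
    return "one wild type allele and one mutant allele"


for example in ("a+/a-", "+/a", "a-/a-", "a+/a+"):
    print(example, "->", describe_genotype(example))
```

Note that the full form a+/a- and the abbreviated form +/a are read the same way, which is the point of the abbreviation rule above.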

Note: Nomenclature is covered more extensively in Chapter 13.
Symbols of genes vs. proteins.
We also need to use symbols to describe the protein derived from a specific gene. Typically, the name
of the protein uses the same name as the gene, only with all CAPITAL letters. For example, the
Drosophila white gene codes for a polypeptide involved in pigment precursor transport across the cell
membrane (see Chapter 10). Thus, the white gene codes for the WHITE polypeptide/protein.
Attempts to unify the gene name system:
There have been recent attempts to unify the gene and allele naming systems of the various genetic
model organisms (e.g. Alberts et al. Molecular Biology of the Cell, 6/e), but such attempts can lead to
greater student confusion, rather than clarity.

In BIOL 207, we will try to use consistent naming systems to facilitate student learning and leave the
intricacies of multi-organism nomenclature to senior courses where it is necessary and more
appropriate.


Creative Commons Copyright License

This work is licensed under the Creative Commons Attribution-Noncommercial-Share Alike 2.5 Canada
License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/2.5/ca/.
Under the terms of the license, you are free:
• to Share — to copy, distribute and transmit the work
• to Remix — to adapt the work
Under the following conditions:
• Attribution. You must include the name of the original author in books or excerpts derived from this
work.
• Non-commercial. You may not use this work for commercial purposes.
• Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work
only under the same or similar licence to this one.
• For any reuse or distribution, you must make clear to others the licence terms of this work.
• Any of the above conditions can be waived if you get permission from the copyright holder.
• The author's moral rights are retained in this licence.

Figures in this textbook

All figures in this textbook are open source and/or under the Creative Commons License. Many are
obtained from various, non-copyrighted sources, including Flickr, Wikipedia, Public Library of Science,
and Wikimedia Commons. They are available for re-use, as per their licenses. Each caption lists the
origin, creator, and license for that figure. If any are incorrect, please contact John Locke for correction.
Guide for Image, Photo and Figure License Details in this work
In the caption for each image, photo, and figure there is a citation. This includes the source or origin,
creator, and license for each work. We have made a best effort to correctly attribute these works; if
there are any errors, please contact John Locke.
The license is given at the end of the caption in short form as the following:
CC BY 2.0         Creative Commons Attribution 2.0 Generic
CC BY 2.5         Creative Commons Attribution 2.5 Generic
CC BY 3.0         Creative Commons Attribution 3.0 Unported
CC BY 4.0         Creative Commons Attribution 4.0 International
CC BY-NC 2.0      Creative Commons Attribution-NonCommercial 2.0 Generic
CC BY-NC 3.0      Creative Commons Attribution-NonCommercial 3.0 Unported
CC BY-NC-ND 2.0   Creative Commons Attribution-NonCommercial-NoDerivs 2.0 Generic
CC BY-NC-SA 2.0   Creative Commons Attribution-NonCommercial-ShareAlike 2.0 Generic
CC BY-ND 2.0      Creative Commons Attribution-NoDerivs 2.0
CC BY-SA 1.0      Creative Commons Attribution-ShareAlike 1.0 Generic
CC BY-SA 2.0      Creative Commons Attribution-ShareAlike 2.0
CC BY-SA 2.5      Creative Commons Attribution-ShareAlike 2.5 Generic
CC BY-SA 3.0      Creative Commons Attribution-ShareAlike 3.0 Unported
CC BY-SA 4.0      Creative Commons Attribution-ShareAlike 4.0 International
CC0 1.0           CC0 1.0 Universal Public Domain Dedication
PD                Public Domain because it was created by a US government agency or because
                  the author has explicitly released it into the public domain.

Creative Commons Copyright Licenses
Attribution
    This license lets others distribute, remix, tweak, and build upon your work, even commercially, as long as they credit you for the original creation.
Attribution-ShareAlike
    This license lets others remix, tweak, and build upon your work even for commercial purposes, as long as they credit you and license their new creations under the identical terms.
Attribution-NoDerivs
    This license allows for redistribution, commercial and non-commercial, as long as it is passed along unchanged and in whole, with credit to you.
Attribution-NonCommercial
    This license lets others remix, tweak, and build upon your work non-commercially, and although their new works must also acknowledge you and be non-commercial, they don’t have to license their derivative works on the same terms.
Attribution-NonCommercial-ShareAlike
    This license lets others remix, tweak, and build upon your work non-commercially, as long as they credit you and license their new creations under the identical terms.
Attribution-NonCommercial-NoDerivs
    This license is the most restrictive of our six main licenses, only allowing others to download your works and share them with others as long as they credit you, but they can’t change them in any way or use them commercially.
Public Domain
    No longer restricted by copyright.

Taken from https://creativecommons.org/licenses/ on May 25, 2015
Bibliography
Alberts, B. et al. 2004. "Molecular Biology of the Cell, fourth edition". Garland Science, New York
Francis, R. C. 2011. "Epigenetics". Norton, New York
Griffiths, A. J. F. et al. 2005. "Introduction to Genetic Analysis, eighth edition". W. H. Freeman and
Company, New York
King, R. C. and W. D. Stansfield. 1997. "A Dictionary of Genetics, fifth edition". Oxford University Press,
Toronto
MendelWeb. R. B. Blumberg, {August 1, 2012}. World Wide Web URL: www.mendelweb.org/
Online Mendelian Inheritance in Man, OMIM®. McKusick-Nathans Institute of Genetic Medicine, Johns
Hopkins University (Baltimore, MD), {August 1, 2012}. World Wide Web URL: http://omim.org/
Ratjen, F. and G. Döring. 2003. "Cystic fibrosis". The Lancet, Volume 361, Issue 9358, 22 February 2003,
Pages 681-689
Tsui, L.-C. and R. Dorfman. 2013. "The Cystic Fibrosis Gene: A Molecular Genetic Perspective". Cold
Spring Harb Perspect, February 2013;3
Watson, J. D. et al. 2008. "Molecular Biology of the Gene, sixth edition". Pearson Education, Inc., San
Francisco
Wikipedia, URL: www.wikipedia.com
Your Genes, Your Health. Dolan DNA Learning Center / Cold Spring Harbor Laboratory, {August 1, 2012}.
World Wide Web URL: www.ygyh.org/


DNA IS THE GENETIC MATERIAL – CHAPTER 01

CHAPTER 01 – DNA IS THE GENETIC MATERIAL

Figure 1.
Parent and offspring Wolf’s Monkey.
(Flickr- Eric Heupel - CC BY-NC-ND 2.0)

INTRODUCTION

Genetics is the scientific study of heredity and the variation of inherited characteristics. It includes the study of genes, how they function, and how they produce the visible and measurable characteristics we see in individuals and populations of species as they change from one generation to the next, over time, and in different environments.

Heredity is the concept that the characteristics of an individual plant or animal in a population can be passed down through the generations. Offspring tend to resemble their parents (Figure 1). People learned that some heritable characteristics (such as the size or colour of fruit) varied between individuals, and that they could select or breed crops and animals for the most favourable traits. Knowledge of these hereditary properties has been of significant value in the history of human development. In the past, humans could only manipulate and select from naturally existing combinations of genes. More recently, with the discovery of the substance and nature of genetic material, DNA, we can now identify, clone, and create novel, better combinations of genes that will serve our goals. Understanding the mechanisms of genetics is fundamental to using it wisely and for the betterment of all.

Prior to Mendel (1865), heredity was thought to work by “blending inheritance”, but his work demonstrated that inheritance is particulate in nature (particulate inheritance). We now call these “particles” genes and their different forms, alleles.

By the early 1900s, biochemists had isolated hundreds of different chemicals from living cells, but which of these was the genetic material? Proteins seemed like promising candidates, since they were abundant, diverse, and complex molecules. However, a few key experiments demonstrated that DNA, rather than protein, is the genetic material.

1. GRIFFITH’S TRANSFORMATION EXPERIMENT (1928)

Microbiologists identified two strains of the bacterium Streptococcus pneumoniae. The R-strain produced rough colonies on a bacterial plate, while the S-strain produced smooth colonies (Figure 2). More importantly, the S-strain bacteria caused fatal infections when injected into mice, while the R-strain did not (Figure 3). Neither did “heat-treated” S-strain cells. In 1928, Griffith noticed that when “heat-treated” S-strain cells were mixed with some R-type bacteria and injected (neither of which should kill the mice on its own), the mice died. Furthermore, pathogenic S-strain cells were recoverable from these mice. Thus, some non-living component from the S-type strain contained genetic information that could be transferred to the living R-type cells and transform them into S-type cells.

Figure 2.
Colonies of Rough (left) and Smooth (right) strains of S. pneumoniae.
(J. Exp. Med. 98:21, 1953 - R. Austrian - Pending)

2. AVERY, MACLEOD AND MCCARTY’S EXPERIMENT (1944)

What kind of molecule from within the S-type cells was responsible for the transformation? To answer this, Avery, MacLeod and McCarty separated the S-type cells into various components, such as proteins, polysaccharides, lipids, and nucleic acids. Only the nucleic acids from S-type cells were able to make the R-strain cells smooth and fatal. Furthermore, when cellular extracts of S-type cells were treated with DNase (an enzyme that digests DNA), the transforming ability was lost. The researchers therefore concluded that DNA was the genetic material, which in this case controlled the appearance (smooth or rough) and pathogenicity of the bacteria.
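
The reasoning in the Griffith and the Avery, MacLeod and McCarty experiments can be summarized as a small truth table. The Python sketch below is only a restatement of the outcomes described in the text, not a simulation of the biology; the names `mouse_dies` and `transforms_r_cells`, and the sample labels, are hypothetical choices made for this illustration.

```python
LIVE_R = "live R cells"
LIVE_S = "live S cells"
HEAT_TREATED_S = "heat-treated S cells"


def mouse_dies(injected) -> bool:
    """Griffith: does a mouse die after injection of this set of samples?"""
    injected = set(injected)
    if LIVE_S in injected:
        return True  # the S-strain causes fatal infections
    # Transformation: live R cells acquire the S trait from heat-treated S cells.
    return LIVE_R in injected and HEAT_TREATED_S in injected


def transforms_r_cells(fraction: str, dnase_treated: bool = False) -> bool:
    """Avery, MacLeod and McCarty: does this S-cell fraction transform R cells?"""
    # Only the nucleic acid fraction transforms, and DNase abolishes the effect.
    return fraction == "nucleic acids" and not dnase_treated


for sample in ([LIVE_R], [LIVE_S], [HEAT_TREATED_S], [LIVE_R, HEAT_TREATED_S]):
    outcome = "mouse dies" if mouse_dies(sample) else "mouse lives"
    print(" + ".join(sample), "->", outcome)

for fraction in ("proteins", "polysaccharides", "lipids", "nucleic acids"):
    outcome = "transforms" if transforms_r_cells(fraction) else "no transformation"
    print(fraction, "->", outcome)

dnase_outcome = transforms_r_cells("nucleic acids", dnase_treated=True)
print("nucleic acids + DNase ->", "transforms" if dnase_outcome else "no transformation")
```

The informative rows are the last one in each table: two harmless samples kill the mouse only when combined, and the one transforming fraction loses its activity when its DNA is digested.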

Figure 3.
Experiments of Griffith and of Avery, MacLeod and McCarty. R strains of S. pneumoniae do not cause lethality. However, DNA-containing extracts from pathogenic S strains are sufficient to make R strains pathogenic.
(Wikipedia - Modified by Deyholos - CC BY-NC 3.0)

3. HERSHEY AND CHASE’S EXPERIMENT (1952)

Further evidence that DNA is the genetic material came from experiments conducted by Hershey and Chase. These researchers studied the transmission of genetic information in a virus called the T2 bacteriophage, which uses Escherichia coli as its host bacterium (Figure 4).

Figure 4.
Electron micrograph of T2 bacteriophage on the surface of E. coli.
(Wikipedia - Dr Graham Beards - CC BY-SA 3.0)

Like all viruses, T2 hijacks the cellular machinery of its host to manufacture more viruses. The T2 phage itself contains only protein and DNA, and no other class of potential genetic material. To determine which of these two types of molecules contained the genetic blueprint for the virus, Hershey and Chase grew viral cultures in the presence of radioactive isotopes of either phosphorus (32P) or sulphur (35S). The phage incorporated these isotopes into their DNA and proteins, respectively (Figure 5). The researchers then infected E. coli with the radiolabeled viruses, and looked to see whether 32P or 35S entered the bacteria. After ensuring that all viruses had been removed from the surface of the cells, the researchers observed that infection with 32P-labeled viruses (but not the 35S-labeled viruses) resulted in radioactive bacteria. This demonstrated that DNA was the material that contained genetic instructions.

4. RNA AND PROTEIN

While DNA is the genetic material for the vast majority of organisms, some viruses use RNA as their genetic material. The RNA genomes of these viruses can be either single- or double-stranded. Examples include SARS virus, influenza virus, hepatitis C virus and polio virus, as well as retroviruses such as HIV, the virus that causes AIDS. Retroviruses use a DNA intermediate at one stage in their life cycle to replicate their RNA genome.

Also, the prion protein is an infectious agent that transmits characteristics via only a protein (no nucleic acid present). Prions infect by transmitting a mis-folded protein state from one aberrant protein molecule to a normally folded molecule. These agents are responsible for Bovine Spongiform Encephalopathy (BSE, also known as "mad cow disease") in cattle, Chronic Wasting Disease in deer, scrapie in sheep and Creutzfeldt–Jakob disease (CJD) in humans. All known prion diseases act by altering the structure of the brain or other neural tissue, and all are currently untreatable and ultimately fatal.
Figure 5.
When 32P-labeled phage infects E. coli, radioactivity is found only in the bacteria, after the phage are removed by agitation and centrifugation. In contrast, after infection with 35S-labeled phage, radioactivity is found only in the supernatant that remains after the bacteria are removed.
(Wikipedia - Modified by Deyholos - CC BY-NC 3.0)
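
The bookkeeping behind the Hershey and Chase experiment, as summarized in Figure 5, can be written out as a short Python sketch. It encodes the reported observations as data and then draws the inference from them; it does not model the biology. The dictionary names and the function `component_entering_bacteria` are hypothetical and were chosen for this illustration only.

```python
# Which phage component each isotope labels (phosphorus is found in DNA,
# sulphur in protein), as stated in the text.
LABELED_COMPONENT = {"32P": "DNA", "35S": "protein"}

# Where the radioactivity was reported after agitation and centrifugation.
# These are the observations from the text, recorded here as data.
LABEL_LOCATION = {"32P": "bacteria", "35S": "supernatant"}


def component_entering_bacteria(observations: dict) -> str:
    """Infer which phage component entered the cells from isotope locations."""
    entered = [
        LABELED_COMPONENT[isotope]
        for isotope, location in observations.items()
        if location == "bacteria"
    ]
    if len(entered) != 1:
        raise ValueError("expected exactly one labeled component inside the bacteria")
    return entered[0]


for isotope, location in LABEL_LOCATION.items():
    component = LABELED_COMPONENT[isotope]
    print(f"{isotope} labels {component}: radioactivity found in the {location}")

print("Component carrying the genetic instructions:",
      component_entering_bacteria(LABEL_LOCATION))
```

Because each isotope marks exactly one class of molecule, the location of the radioactivity identifies which molecule entered the host cell, which is why the two labels had to be used in separate cultures.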


___________________________________________________________________________
SUMMARY:
• Genetics is the scientific study of heredity and the variation of inherited characteristics.
• Heredity is the concept that a trait of an individual can be passed down through generations.
• A gene can be defined abstractly as a unit of inheritance.
• The experiments of Griffith and of Hershey and Chase showed that DNA from bacteria and viruses can transfer genetic information into bacteria, demonstrating that DNA is the genetic material and that it is universal.
• Some viruses use RNA as their genetic material, which can be either single- or double-stranded.
• A prion is a mis-folded protein that transmits its mis-folded state to normally folded protein molecules.
KEY WORDS:
genetics transform
heredity Avery, MacLeod, & McCarty
Mendel DNase
blending inheritance Hershey and Chase
particulate inheritance bacteriophage
gene 35S
allele 32P
Griffith prion


STUDY QUESTIONS:
1) Imagine that returning astronauts provide you
with living samples of multicellular organisms
discovered on another planet. These organisms
reproduce with a short generation time like our
standard yeast species. Initial observations
about their reproduction indicate that they also
require two “sexual types” to mate, but nothing
else is known about how their genetics works.
a) How could you define laws of heredity for
these organisms?
b) How could you determine what molecules
within these organisms contained genetic
information?
c) Would the mechanisms of genetic
inheritance likely be similar for all
organisms from this planet?
d) Would the mechanisms of genetic
inheritance likely be similar to organisms
from earth?
2) It is relatively easy to extract DNA and protein
from cells; biochemists have been doing this
since at least the 1800’s. Why then did Hershey
and Chase need to use radioactivity to label
DNA and proteins in their experiments?
3) Starting with mice and R and S strains of S.
pneumoniae, what experiments, in addition to
those shown in Figure 3 can be used to
demonstrate that DNA is THE genetic material
and the only genetic material?
4) Mendel put forth a “particulate inheritance”
model – alleles, dominant, recessive, etc. At the
time there was a “blended inheritance” model,
which is like mixing paint colours (analogy).
Suggest an analogy for Mendel’s particulate
model, taking into account the dominant and
recessive characters of alleles.


CHAPTER 02 – DNA STRUCTURE AND REPLICATION

Figure 1.
The scientists responsible for
determining the structure of
DNA are Rosalind Franklin,
Maurice Wilkins, Francis Crick
and James Watson (left to
right). Although the work was
published in 1953, Wilkins,
Crick, and Watson received the
Nobel Prize in Physiology or
Medicine in 1962, after
Franklin had died in 1958 of
ovarian cancer.

From left to right:
(Wikipedia-Unknown-PD)
(Wikipedia-Unknown-PD)
(Wikipedia-Marc Lieberman-CC BY 2.5)
(Wikipedia-Cold Spring Harbor Laboratory-PD)
Bottom:
(Wikipedia-Michael Ströck-CC BY-SA 3.0)

INTRODUCTION

One of the fundamental things to know when studying genetics is the basic structure of DNA and how it is replicated. DNA is the “blueprint” that contains all the instructions for making the proteins that each cell needs, whether it is a single-celled bacterium or a multicellular organism like a human. J. Watson, F. Crick, and M. Wilkins received the Nobel Prize (1962) for discovering the structure of DNA. (R. Franklin might have also received the prize for this discovery, but she died in 1958.)

The basic structure of DNA provides insight into its function. The main features of its structure are that it can reliably: (1) reproduce exact copies of itself to pass on to descendant cells, and (2) use the information to create proteins that produce and regulate the biochemistry of the cell. Remember, however, that DNA within the cell is more than just a loose strand within the nucleus. DNA interacts with proteins and is packaged into higher order structures (chromosomes) that will be discussed later in the textbook in Chapter 7. These proteins also regulate the expression of genes (information in the DNA).

This chapter will cover the components of the DNA molecule, how the double helix structure was discovered, and how the mechanisms of replication were discovered and characterized.

1. DNA STRUCTURE - DOUBLE HELIX

1.1. NUCLEIC ACIDS AND PHOSPHATE SUGAR BACKBONE

In 1869 Johannes Friedrich Miescher, a Swiss physician and biologist, first isolated a substance he called ‘nuclein’ from the nuclei of human white blood cells. He found this substance to be weakly acidic with a high amount of phosphorus. This substance, after being further purified and studied, was later called deoxyribonucleic acid, or DNA. Its name describes three characteristics of the molecule: it contains deoxyribose, a ribose sugar that is missing one hydroxyl group (Figure 2); it is found in the nucleus of a cell; and it is acidic.

After purifying ‘nuclein’ to DNA, researchers found that it contained four different subunits that are linked in


a chain. Those subunits were identified as nucleotides. A nucleotide contains three components: a phosphate group (PO₄³⁻), a deoxyribose sugar, and one of four nitrogenous bases. Those bases fit into two groups based upon their structure. Purines have a double ring structure and include adenine and guanine. Pyrimidines have a single ring structure and include cytosine and thymine (Figure 2). The nature of the phosphate group and the deoxyribose sugar allows nucleotides to chain together, forming the long DNA strand.

Notice in Figure 2 that dAMP has each carbon of the ribose labeled with a number followed by a prime (1’ to 5’). The 1’ position is where the base is attached. The 2’ position of the ribose is missing a hydroxyl group. The 5’ position is attached to the phosphate group. When linked in a chain, the phosphate group is linked to the 3’ oxygen of the next nucleotide by a phosphodiester bond. When a polynucleotide chain is formed, there will always be a free 5’ phosphate at one end, and a free 3’ oxygen on the ribose at the other. These are known as the 5’ and 3’ ends, respectively, of the DNA strand.

Ribonucleic acid (RNA) is like DNA in that it forms chains similarly and has the bases attached to the same carbon. The extra hydroxyl group at the 2’ position causes it to adopt a different conformation than DNA, becoming a more flexible molecule (DNA’s conformation will be described later in this chapter). There are also dideoxynucleotides that are missing the hydroxyl group at both the 2’ and 3’ positions. Because of this, the chain cannot be extended at the 3’ carbon, which terminates the chain. This feature of dideoxynucleotides is used in Sanger sequencing, which will be described in Chapter 33.

Figure 2.
Molecular models of the components of DNA. The top shows the three different types of ribose sugar found in nucleic acids (ribose, deoxyribose, and dideoxyribose). The bottom shows the purine nucleotide monophosphates, deoxyadenosine 5'-monophosphate (dAMP) and deoxyguanosine 5'-monophosphate (dGMP), and the pyrimidine nucleotide monophosphates, deoxycytidine 5'-monophosphate (dCMP) and deoxythymidine 5'-monophosphate (dTMP). Carbon numbers (1’-5’) are labeled on the sugars and dAMP. (Original- L.Canham - CC BY-NC 3.0)

1.2. CHARGAFF’S RULES

When Watson and Crick set out in the early 1950s to determine the structure of DNA, they already knew that DNA is made up of a series of nucleotides with four different bases: adenine (A), cytosine (C), thymine (T), and guanine (G). For DNA, the nucleotide precursors are abbreviated as dNTPs (deoxyribonucleotide triphosphates), which include dATP, dCTP, dGTP, and dTTP. For RNA they are abbreviated as NTPs, which include ATP, CTP, GTP, and UTP. Watson and Crick also knew of Chargaff’s Rules, which were a set of observations about the relative amount of each nucleotide that was present in almost any extract of DNA. Chargaff had observed that for any given species, the abundance of A was the same as T, and G was the same as C. This was essential to Watson & Crick’s model.
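
Chargaff’s observation follows directly if every A on one strand is paired with a T on the other, and every G with a C (the pairing described in section 1.3). The short Python sketch below illustrates this under stated assumptions: the sequence is hypothetical and deliberately A-rich, and the function name is invented for illustration. A single strand need not obey the rule, but the duplex built from it always does.

```python
from collections import Counter

# Pairing partners assumed from the double helix model (A with T, G with C).
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def duplex_base_counts(strand: str) -> Counter:
    """Count bases across both strands of a duplex built from one strand."""
    partner = "".join(PAIR[base] for base in strand)
    return Counter(strand + partner)

# Hypothetical single strand, chosen to be deliberately A-rich.
strand = "AAAAGCAATGA"
single = Counter(strand)
double = duplex_base_counts(strand)

print("single strand:", dict(single))
print("duplex:       ", dict(double))
```

In the single strand A and T are unequal, while in the duplex the counts of A and T match, as do those of G and C.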


Figure 3.
Chemical structure of two pairs of nucleotides in a fragment of double-stranded DNA. Sugar, phosphate, and bases A, C, G, T are labeled. Hydrogen bonds between bases on opposite strands are shown by dashed lines. Note that the G-C pair has more hydrogen bonds than A-T. The polarity of each strand is indicated by the labels 5’ and 3’.
(Wikipedia- Madeleine Price Ball - CC0 1.0)

1.3. THE DOUBLE HELIX

Using proportional metal models of the individual nucleotides, Watson and Crick deduced a structure for DNA that was consistent with Chargaff’s Rules and with x-ray crystallography data that were obtained (with some controversy) from another researcher named Rosalind Franklin. In Watson and Crick’s famous double helix, each of the two strands contains DNA bases connected through covalent bonds to a sugar-phosphate backbone (Figure 1 and Figure 3). Because one side of each sugar molecule is always connected to the opposite side of the next sugar molecule, each strand of DNA has polarity: the two ends are called the 5’ (5-prime) end and the 3’ (3-prime) end. The two strands of the double helix run in anti-parallel (i.e. opposite) directions, with the 5’ end of one strand adjacent to the 3’ end of the other strand. The double helix has a right-handed twist (rather than the left-handed twist that is often represented incorrectly in popular media). The DNA bases extend from the backbone towards the center of the helix, with a pair of bases, one from each strand, forming hydrogen bonds that help to hold the two strands together. Because of the structure of the bases, A can only form hydrogen bonds with T, and G can only form hydrogen bonds with C (remember Chargaff’s Rules). Each strand is therefore said to be complementary to the other, and so each strand also contains enough information to act as a template for the synthesis of the other. This complementary redundancy is important in DNA replication and repair.

Under most conditions, the two strands in the double helix are slightly offset, which creates a major groove and a minor groove. In Figure 1, notice that if you look at the bottom edge of the helix you can see it makes a wave pattern, with a large dip followed by a small dip, followed by a large dip, etc. The “peaks” are not equidistant and you can see the major and minor grooves. These grooves provide access for transcription-regulating proteins (transcription factors), which bind to specific sequences of bases along the DNA.

2. SEMI-CONSERVATIVE REPLICATION (VS. CONSERVATIVE, DISPERSIVE)

From the complementary strands model of DNA, proposed by Watson and Crick in 1953, there were three straightforward possible mechanisms for DNA replication: (1) semi-conservative, (2) conservative, and (3) dispersive (Figure 4).

The semi-conservative model proposes that the two strands of a DNA molecule separate during replication and then each strand acts as a template for synthesis of a new, complementary strand.

The conservative model proposes that the entire DNA duplex acts as a single template for the synthesis of an entirely new duplex.

The dispersive model has the double helix breaking into segments that are then replicated and reassembled, with the new duplexes containing alternating segments from one strand to the other.
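
Because A pairs only with T and G only with C, and because the two strands are anti-parallel, either strand fully determines the other; this is what lets a strand serve as a template in every model above. A minimal Python sketch of that idea follows. The sequence and the function name are hypothetical, chosen only for illustration.

```python
# Watson-Crick pairing partners.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def partner_strand(template: str) -> str:
    """Return the complementary strand, written 5' to 3'.

    `template` is read 5' to 3'. Because the strands are anti-parallel,
    the partner is built by pairing each base and reversing the order.
    """
    return "".join(PAIR[base] for base in reversed(template))

template = "ATGGCA"                    # hypothetical sequence, 5' to 3'
partner = partner_strand(template)

print("5'-" + template + "-3'")
print("3'-" + partner[::-1] + "-5'")   # partner aligned under the template
```

Applying the function to the partner returns the original template, which is the redundancy that replication and repair rely on.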


Figure 4.
The three models of DNA replication possible from the double helix model of DNA structure.
(Wikipedia-Adenosine- CC BY-SA 2.5)

Each of these three models makes a different prediction about how the DNA strands should be distributed following two rounds of replication. These predictions can be tested in the following experiment by following the nitrogen component in DNA in E. coli as it goes through several rounds of replication. In 1958, two scientists, Meselson and Stahl, used different isotopes of nitrogen, which is a major component in DNA. Nitrogen-14 (14N) is the most abundant natural isotope, while Nitrogen-15 (15N) is rare, but also heavier. Neither is radioactive; each can be followed by a difference in density (“light” 14 vs “heavy” 15 atomic weight) in CsCl density gradient ultra-centrifugation of DNA.

The experiment starts with E. coli grown for several generations on medium containing only 15N. These cells will have denser DNA. When extracted and separated in a CsCl density gradient tube, this “heavy” DNA will move to a position nearer the bottom of the tube in the more dense solution of CsCl (left side in Figure 5). DNA extracted from E. coli grown on normal (14N-containing) medium will migrate towards the less dense top of the tube.

If these E. coli cells are transferred to a medium containing only 14N, the “light” isotope, and grown for one generation, then their DNA will be composed of one-half 15N and one-half 14N. If this DNA is extracted and applied to a CsCl gradient, the observed result is that one band appears at the point midway between the locations predicted for wholly 15N DNA and wholly 14N DNA (Figure 5). This “single-band” observation is inconsistent with the predicted outcome from the conservative model of DNA replication (disproving this model), but is consistent with that expected for both the semi-conservative and dispersive models.

If the E. coli is permitted to go through another round of replication in the 14N medium, and the DNA is extracted and separated on a CsCl gradient, then two bands are seen, as Meselson and Stahl observed: one at the 14N-15N intermediate position and one at the wholly 14N position (Figure 5). This result is inconsistent with the dispersive model (which predicts a single band between the 14N-15N position and the wholly 14N position) and thus disproves this model. The two-band observation is consistent with the semi-conservative model, which predicts one wholly 14N duplex and one 14N-15N duplex. Additional rounds of replication also support the semi-conservative model/hypothesis of DNA replication. Thus, the semi-conservative model is the currently accepted mechanism for DNA replication. Note, however, that we now also know from more recent experiments that whole chromosomes, which can be millions of bases in length, are also semi-conservatively replicated.

These experiments, published in 1958, are a wonderful example of how science works. Researchers started with three clearly defined models (hypotheses). These models were tested, and two (conservative and dispersive) were found to be inconsistent with the observations and thus disproven. The third hypothesis, semi-conservative, was consistent with the observations and thereby supported and accepted as the mechanism of DNA replication. Note, however, this is not “proof” of the model, just strong evidence for it; hypotheses are not “proven”, only disproven or supported.
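
The band pattern predicted by the semi-conservative model can be worked out by simple bookkeeping. The Python sketch below is an illustration only, not a description of how Meselson and Stahl analysed their data: each duplex is represented as a pair of strand labels, and the function names are invented for this example.

```python
from collections import Counter

def replicate(duplexes, new_isotope="14N"):
    """One round of semi-conservative replication.

    Each duplex is a pair of strand labels. Every parental strand is kept
    and paired with one newly made strand built from `new_isotope`.
    """
    daughters = []
    for strand_a, strand_b in duplexes:
        daughters.append((strand_a, new_isotope))
        daughters.append((strand_b, new_isotope))
    return daughters

def bands(duplexes):
    """Group duplexes by density class (sorted pair of strand labels)."""
    return Counter("-".join(sorted(duplex)) for duplex in duplexes)

population = [("15N", "15N")]           # start: fully heavy DNA
for generation in range(1, 4):
    population = replicate(population)  # growth in 14N medium
    print(generation, dict(bands(population)))
```

Under these assumptions the first generation contains only 14N-15N hybrid duplexes (one band), and the second contains equal numbers of hybrid and wholly 14N duplexes (two bands). In later generations the hybrid class stays constant while the wholly 14N class grows.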


Figure 5.
The positions of the 14N- and 15N-containing DNA in the density gradient tube on the left.
(Wikipedia-LadyofHats-PD)

3. CHROMOSOME REPLICATION (E. COLI) - CAIRNS EXPERIMENT

If the results of Meselson and Stahl were true and there was semi-conservative replication, then the two strands of DNA have to separate to provide the template for copying. This should be seen as a ‘fork’ in a linear model if you manage to see the DNA just as it’s replicating. John Cairns in 1963 chose to test this.

To do this he took E. coli cells growing in a normal environment, and then allowed them to grow and replicate in the presence of radioactive 3H-thymidine. The hypothesis is that if the E. coli’s DNA or chromosome is semi-conservatively replicated then after the first round of replication there should be one newly made strand that is radioactive, or “hot”, and the other strand that is the parental template strand with no radioactivity, so is “cold”. The original parental DNA will have two strands, each not radioactive. After replication the daughter DNA will have two strands, one that is radioactive and one that is not. After a second round of replication there will be two types of daughter DNA, one that has a non-radioactive strand and a radioactive strand, and one that has two radioactive strands.

After growth in the 3H-thymidine, Cairns lysed the bacteria and collected the contents onto a microscope slide. He then covered the slide with a photographic emulsion and allowed exposure for 2 months. As the 3H in the thymidine decays it emits an electron, known as a beta particle. The emulsion reacts with the beta particle, creating a black silver grain on the film. The density of grains should be indicative of whether one or two strands are radioactive.

After the first replication cycle, the film had a thin circular ring of grains (Figure 6). This was interpreted to be a daughter chromosome with one strand that is hot and one strand cold. This also provided physical evidence that the E. coli chromosome is circular, something that had previously been shown only genetically.


In the second replication cycle the replication fork was seen. Here Cairns saw the typical thin ring of grains much like the first replication cycle, but with a branch in the middle that had a thicker strand (Figure 6). This means that the branch seen was an actively replicating chromosome, using the radioactive strand of DNA as a template, and adding more radioactive thymidine as the DNA is being synthesized. Because of the shape created on the film, this replicating structure was called a theta (Θ) structure. Cairns observed many different molecules corresponding to the progression from starting replication to the completion of replication.

Figure 6.
In his experiment, Cairns looked at DNA with radioactive thymidine on an autoradiograph film, with the radioactive thymidine leaving dots on the film. This figure shows what the autoradiograph film would look like (top) and its interpretation (below), after one round and after two rounds of replication. The blue line represents the ‘cold’ DNA that has no radioactivity, while the red shows the ‘hot’ radioactive DNA. The density of the dots on the autoradiograph indicates whether there is one strand or both strands of hot DNA. During the second round of replication, a theta structure can be seen, as the circular E. coli DNA is in the process of being replicated. (Original-L.Canham- CC BY-NC 3.0)

Here Cairns’ results were able to further support the semi-conservative replication theory, showing the existence of replication forks, as well as the hypothesis that E. coli has a circular chromosome. What Cairns did not realize is that replication proceeds in both directions: he thought one fork was static while the other went around the chromosome replicating. Scientists later went on to show that replication is in fact bidirectional.

4. ORIGINS OF REPLICATION (PROKARYOTE - SINGLE ORIGIN), REPLICATION FORK

When the cell enters S-phase in the cell cycle (See Chapter 14) the entire chromosomal DNA is replicated. This is done by enzymes called DNA polymerases. All DNA polymerases synthesize new strands by adding nucleotides to the 3'OH group present on the previous nucleotide. For this reason they are said to work in a 5' to 3' direction. DNA polymerases use a single strand of DNA as a template upon which they synthesize the complementary sequence. This works fine for the middle of chromosomes. DNA-directed DNA polymerases travel along the original DNA strands making complementary strands (Figure 7a).

Figure 7.
DNA polymerases make new strands in a 5' to 3' direction. (a) Regular DNA polymerases are proteins or protein complexes that use a single strand of DNA as a template. For example, the main human DNA polymerase, Pol α, is a large protein complex made of four polypeptides. (b) Telomerases use their own RNA as a template. The human telomerase is a complex made of one polypeptide and one RNA molecule. (Original-Harrington- CC BY-NC 3.0)

DNA replication in both prokaryotes and eukaryotes begins at an Origin of Replication (Ori). Origins are specific sequences at specific positions on the chromosome. In E. coli, the OriC origin is ~245 bp in size. Chromosome replication begins with the binding of the DnaA initiator protein to an AT-rich 9-mer in OriC, which melts the two strands. Then the DnaC loader protein helps the DnaB helicase protein extend the single-stranded regions such that the DnaG primase can initiate the synthesis of an RNA primer, from which the DNA polymerases


can begin DNA synthesis at the two replication forks. The forks continue in opposite directions until they meet another fork or the end of the chromosome (Figure 8).

Figure 8.
An origin of replication. The sequence-specific DNA duplex is melted, then the primase synthesizes RNA primers from which bidirectional DNA replication begins as the two replication forks head off in opposite directions. The leading and lagging strands are shown along with Okazaki fragments. Note the 5’ and 3’ orientation of all strands.
(Original-Locke- CC BY-NC 3.0)

5. EUKARYOTE CHROMOSOME REPLICATION - MULTIPLE ORIGINS

In prokaryotes, with a small, simple, circular chromosome, only one origin of replication is needed to replicate the whole genome. For example, E. coli has a ~4.5 Mb genome (chromosome) that can be duplicated in ~40 minutes assuming a single origin, bi-directional replication, and a speed of ~1000 bases/second/fork for the polymerase.

However, in larger, more complicated eukaryotes, with multiple linear chromosomes, more than one origin of replication is required per chromosome to duplicate the whole chromosome set in the 8 hours of the replicative phase (S-phase) of the cell cycle. For example, the human diploid genome has 46 chromosomes (6 × 10^9 base pairs). The shortest chromosomes are ~50 Mbp long and so could not possibly be replicated from one origin in that time. Additionally, the rate of replication fork movement is slower, only ~100 bases/second. Thus, eukaryotes contain multiple origins of replication distributed over the length of each chromosome to enable the duplication of each chromosome within the observed time of S-phase (Figure 9).

Figure 9.
Part of a eukaryotic chromosome showing multiple Origins of Replication (1, 2, 3), each defining a replicon (1, 2, 3). Replication may start at different times in S-phase. Here #1 and #2 begin first, then #3. As the replication forks proceed bi-directionally, they create what are referred to as “replication bubbles” that meet and form larger bubbles. The end result is two semi-conservatively replicated duplex DNA strands.
(Original-Locke- CC BY-NC 3.0)

6. TELOMERES

The ends of linear chromosomes present a problem: at each end one strand cannot be completely replicated because there is no primer to extend and replace the end RNA primer. While the loss of such a small sequence might not be a problem, the continued rounds of replication would result in the continued loss of sequence from the chromosome end. Ultimately, the losses would reach a point where essential gene sequences would be lost and the organism would die. Thus, this end DNA must be replicated. Most eukaryotes solve the problem of synthesizing this unreplicated end DNA with a specialized DNA polymerase called telomerase, in combination with a regular polymerase. Telomerases are RNA-directed DNA polymerases. They are a riboprotein, as they are composed of

both protein and RNA. As Figure 10 shows, these enzymes contain a small piece of RNA that serves as a portable and reusable template from which the complementary DNA is synthesized. The RNA in human telomerase uses the sequence 3'-AAUCCC-5' as the template, and thus our telomeric DNA has the complementary sequence 5'-TTAGGG-3' repeated over and over thousands of times. After the telomerase has made the first strand, a primase synthesizes an RNA primer and a regular DNA polymerase can then make a complementary strand so that the telomere DNA will ultimately be double stranded to the original length (Figure 10).

Note: the number of repeats, and thus the size of the telomere, is not set. It fluctuates after each round of the cell cycle. Because there are many repeats at the end, this fluctuation maintains a length buffer (sometimes it’s longer, sometimes it’s shorter), but the average length will be maintained over the generations of cell replication. In the absence of telomerase, as is the case in human somatic cells, repeated cell division leads to the “Hayflick limit”, where the telomeres shorten to a critical limit and then the cells enter a senescence phase of non-proliferation. The inappropriate activation of telomerase expression permits a cell and its descendants to become immortal and bypass the Hayflick limit. This happens in cancer cells, which can form tumours, as well as in cells in culture. HeLa cells, which can be propagated essentially indefinitely, have been kept in culture since 1951 (See Chapter 41).
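
The relationship between the telomerase RNA template and the telomeric repeat can be checked with a few lines of Python. This is a sketch for illustration only; the function name is invented, and the only biological input is the template sequence quoted above.

```python
# DNA base added opposite each RNA template base.
RNA_TO_DNA = {"A": "T", "U": "A", "C": "G", "G": "C"}

def dna_from_rna_template(template_3_to_5: str) -> str:
    """Return the DNA strand (5' to 3') copied from an RNA template.

    The template is written 3' to 5', so reading it left to right gives
    the new DNA strand in its 5' to 3' direction of synthesis.
    """
    return "".join(RNA_TO_DNA[base] for base in template_3_to_5)

template = "AAUCCC"                     # human telomerase template, 3' to 5'
repeat = dna_from_rna_template(template)

print("one repeat:    5'-" + repeat + "-3'")
print("three repeats: 5'-" + repeat * 3 + "-3'")
```

Reusing the same short template again and again is what produces the long tandem array of identical repeats at the chromosome end.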

Figure 10.
Telomere replication showing the completion of the leading strand and incomplete replication of the lagging strand. The gap is
replicated by the extension of the 3’ end by telomerase and then filled in by extension of an RNA primer.
(Original-Locke- CC BY-NC 3.0)
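
As a rough check on the replication times quoted in section 5, the arithmetic can be written out in Python. This is a back-of-the-envelope sketch, not a model of real replication: it assumes evenly spaced origins that all fire at once and forks that never stall, and the function name is invented for illustration.

```python
def replication_time_s(length_bp: float, fork_speed_bp_s: float, origins: int = 1) -> float:
    """Seconds to copy `length_bp` when each origin launches two forks.

    Simplifying assumptions (not from the text): origins are evenly spaced,
    all fire at the same moment, and forks never stall.
    """
    forks = 2 * origins
    return length_bp / (fork_speed_bp_s * forks)

# E. coli: ~4.5 Mb, one origin, ~1000 bases/second/fork.
ecoli_min = replication_time_s(4.5e6, 1000) / 60
# Shortest human chromosome: ~50 Mbp at ~100 bases/second/fork, one origin.
human_h = replication_time_s(50e6, 100) / 3600

# Smallest whole number of origins that fits an 8-hour S-phase.
origins = 1
while replication_time_s(50e6, 100, origins) > 8 * 3600:
    origins += 1

print(f"E. coli, single origin: {ecoli_min:.1f} min")
print(f"50 Mbp chromosome, single origin: {human_h:.1f} h")
print(f"origins needed for 8 h: {origins}")
```

With these numbers E. coli needs about 37.5 minutes, consistent with the ~40 minutes quoted, while a 50 Mbp chromosome copied from a single origin would take roughly 69 hours, far longer than S-phase. Under the idealised assumptions, at least nine origins would be needed for that chromosome; real chromosomes use many more.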


___________________________________________________________________________
SUMMARY:
• DNA is a double helix made of two anti-parallel strands of bases on a sugar-phosphate backbone.
• Specific bases on opposite strands pair through hydrogen bonding (A=T and G=C), ensuring
complementarity of the strands.
• The hereditary information is present as the sequence of bases along the DNA strand.
• Chromosome replication begins at an origin and proceeds by DNA polymerases at a replication fork.
• Replication proceeds bi-directionally.
• Typically eukaryotes have multiple origins along each chromosome, while prokaryotes have only one.
• Eukaryotes have telomerase to complete the replication of the ends of chromosomes.
KEY TERMS:
deoxyribonucleic acid E. coli
nucleotides Meselson and Stahl
purine Nitrogen-14
adenine Nitrogen-15
guanine light
pyrimidine heavy
cytosine CsCl gradient
thymine John Cairns
phosphodiester bond 3H-thymidine
ribonucleic acid photographic emulsion
dideoxynucleotide silver grain
Watson and Crick theta structure
Chargaff’s Rules bidirectional
double helix DNA polymerases
anti-parallel Origin of replication
right-handed replicon
major groove replication bubble
minor groove telomerase
semi-conservative riboprotein
conservative Hayflick limit
dispersive HeLa cells


STUDY QUESTIONS:
1) Compare Watson and Crick’s discovery with Avery, MacLeod and McCarty’s discovery.
a) What did each discover, and what was the impact of these discoveries on biology?
b) How did Watson and Crick’s approach generally differ from Avery, MacLeod and McCarty’s?
c) Briefly research Rosalind Franklin on the internet. Why is her contribution to the structure of DNA controversial?
2) List the information that Watson and Crick used to deduce the structure of DNA.
3) Refer to Watson and Crick’s
a) List the defining characteristics of the structure of a DNA molecule.
b) Which of these characteristics are most important to replication?
c) Which characteristics are most important to the Central Dogma?
4) Refer to Figure 3.
a) Identify the part of the DNA molecule that would be radioactively labeled in the manner used by Hershey & Chase.
b) DNA helices that are rich in G-C base pairs are harder to separate (e.g. by heating) than A-T rich helices. Why?
5) Are the ends of eukaryote, linear chromosomes static and fixed in length? Explain.


CHAPTER 03 – GENES ENCODE PROTEINS

Figure 1.
Most, but not all, genes code for
proteins. They are transcribed into
mRNA, which is then translated
into polypeptides.
(pixabay-PublicDomainPictures- CC0
1.0)

INTRODUCTION

How is the genetic information in DNA (genes) expressed as biological traits, such as the flower color of Mendel’s peas? The answer lies in what has become known as molecular biology’s Central Dogma. While not all genes code for proteins, most do. This chapter describes the Central Dogma and some experiments that were used to support this concept.

1. CENTRAL DOGMA

The Central Dogma of Biology describes the concept that genetic information is encoded in DNA in the form of genes (Figure 2). This information is then transferred as needed, in a process called transcription, into a messenger RNA (mRNA) sequence. The information is then transferred again, in a process called translation, into a polypeptide (protein) sequence. The sequence of bases in DNA directly dictates the sequence of bases in the RNA, which in turn dictates the sequence of amino acids that make up a polypeptide.

The original core of the Central Dogma is that genetic information is NEVER transferred from protein back to nucleic acids. In certain circumstances, the information in RNA may be converted back to DNA through a process called reverse transcription. As well, DNA, and its information, can also be replicated (DNA→DNA).

Proteins do most of the “work” in a cell. They (1) catalyze the formation and breakdown of most molecules within an organism, as well as (2) form their structural components, and (3) regulate the expression of genes. By dictating the sequence and thus structure of each protein, DNA directs the function of that protein, which can thereby affect the entire organism. Thus the genetic information, or genotype, defines the potential form, or phenotype, of the organism. Note, however, that the environment can also influence phenotype.

Figure 2.
Central Dogma of molecular biology.
(Original-Locke/Kang- CC BY-NC 3.0)

In the case of Mendel’s peas, purple-flowered plants have a gene that encodes an enzyme that produces a purple pigment molecule. In the white-flowered plants (a purple-less mutant), the DNA for this gene has been changed, or mutated, so that it


no longer encodes a functional protein. This is an sugars, and one vitamin (biotin). Prototrophs can
example of a spontaneous, natural mutation in a synthesize the amino acids, vitamin, etc. necessary
gene coding for an enzyme in a biochemical for normal growth.
pathway. They also knew that by exposing Neurospora
2. GENES CODE FOR ENZYMES – A. GARROD spores to X-rays, they could randomly induce
mutations in genes (now known as damage to the
Life depends on (bio)chemistry to supply energy
DNA leading to DNA sequence change). Each spore
and to produce the molecules that construct and
exposed to X-rays potentially contained a mutation
regulate cells. In 1908, Archibald Garrod described
in a different gene. While most mutagenized spores
“in-born errors of metabolism” in humans, using
were still able to grow (prototrophic), some spores
the congenital disorder, alkaptonuria (black urine
had mutations that changed their phenotype from
disease), as an example of how “genetic defects”
a prototroph into an auxotrophic strain, which
(genotype) led to the lack of an enzyme in a
could no longer grow on minimal medium. Instead
biochemical pathway and caused a disease
these auxotrophs could grow on complete medium
(phenotype). The reason why people with
(CM), which was MM supplemented with nutrients,
alkaptonuria have black urine is because a chemical,
such as amino acids and vitamins, etc. (Figure 3). In
called “alkapton”, makes urine black when exposed
fact, some auxotrophic mutations could grow on
to air. In normal people, enzymes catalyze the
minimal medium with only one, single nutrient
reaction to break down alkapton, but people who
supplied, such as the amino acid arginine. This
are born with the disease, due to genetic defect,
implied that each auxotrophic mutant was blocked
cannot make such enzymes and therefore cannot
at a specific step in a biochemical pathway and that
break down alkapton. Garrod’s work gave huge
by adding an essential compound, such as arginine,
impact to modern genetics as it attempted to
that block could be circumvented.
explain the biochemical mechanism behind the
genes proposed in Mendelian genetics.
3. BEADLE AND TATUM: PROTOTROPHIC AND
AUXOTROPHIC MUTANTS
In 1941, over 30 years after Garrod’s discovery,
Beadle and Tatum built on this connection
between genes and metabolic pathways. Their
research led to the “one gene, one enzyme (or
protein)” hypothesis, which states that each
enzyme that acts in a biochemical pathway is
encoded by a different gene. Although we now
Figure 3.
know of many exceptions to the “one gene, one A single mutagenized spore is used to establish a colony of
enzyme” principle, it is generally true that each genetically identical fungi, from which spores are tested for
different gene produces a protein that has a their ability to grow on different types of media. Because
distinct catalytic, regulatory, or structural function. spores of this particular colony are able to grown only on
complete medium (CM), or on minimal medium
Beadle and Tatum used the fungus Neurospora supplemented with arginine (MM+Arg), they are considered
crassa (a bread mold) for their studies because it Arg auxotrophs and we infer that they have a mutation in a
had practical advantages as a laboratory model gene in the Arg biosynthetic pathway. This type of screen is
repeated many times to identify other mutants in the Arg
organism. They knew that Neurospora was pathway and in other pathways.
prototrophic, meaning that it could grow on (Original-Deyholos-CC BY-NC 3.0)
minimal medium (MM). Minimal medium lacked
most nutrients, except for a few minerals, simple

PAGE 2 OPEN GENETICS LECTURES – FALL 2017

GENES ENCODE PROTEINS – CHAPTER 03
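The reasoning behind the screen in Figure 3 can be written as a short decision rule. The Python sketch below is only an illustration of that logic, not a description of Beadle and Tatum's actual procedure; the function name classify_strain and its three true/false arguments are assumptions made for this example.

```python
def classify_strain(grows_on_mm, grows_on_cm, grows_on_mm_arg):
    """Classify a strain from growth (True/False) on three media.

    MM = minimal medium, CM = complete medium, MM+Arg = MM plus arginine.
    The function name and arguments are hypothetical, chosen for this sketch.
    """
    if grows_on_mm:
        # Makes everything it needs from minimal medium.
        return "prototroph"
    if not grows_on_cm:
        # Fails even with all nutrients supplied; this screen cannot classify it.
        return "no growth on CM (not classifiable)"
    if grows_on_mm_arg:
        # Needs a supplement, and arginine alone is enough.
        return "Arg auxotroph"
    # Needs a supplement, but arginine alone does not rescue it.
    return "auxotroph (requirement other than Arg)"


if __name__ == "__main__":
    # The colony in Figure 3: no growth on MM, growth on CM and on MM+Arg.
    print(classify_strain(False, True, True))
```

A strain that grows on CM but not on MM+Arg would be tested on MM plus other single nutrients in the same way.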

4. ONE GENE: ONE ENZYME HYPOTHESIS LED TO BIOCHEMICAL PATHWAY DISSECTION USING GENETIC SCREENS
AND MUTATIONS
Beadle and Tatum’s experiments are important not only for their conceptual advances in
understanding genes, but also because they demonstrate the utility of screening for genetic
mutants to investigate a biological process – genetic analysis.
Beadle and Tatum’s results were useful for investigating biological processes, specifically the
metabolic pathways that produce amino acids. For example, Srb and Horowitz in 1944 tested the
ability of the amino acids to rescue auxotrophic strains. They added each amino acid, one at a
time, to minimal medium and recorded which of these restored growth to independent mutants.

Figure 4.
A simplified version of the Arg biosynthetic pathway, showing citrulline (Cit) and ornithine
(Orn) as intermediates in Arg metabolism. These chemical reactions depend on enzymes represented
here as the products of three different genes.
(Original-Deyholos- CC BY-NC 3.0)

A convenient example is arginine. If the progeny of a mutagenized spore could grow on minimal
medium only when it was supplemented with arginine (Arg), then the auxotroph must bear a
mutation in the Arg biosynthetic pathway and was called an “arginineless” strain (arg-).
Synthesis of even a relatively simple molecule such as arginine requires many steps, each with a
different enzyme. Each enzyme works sequentially on a different intermediate in the pathway
(Figure 4). For arginine (Arg), two of the biochemical intermediates are ornithine (Orn) and
citrulline (Cit). Thus, mutation of any one of the enzymes in this pathway could turn Neurospora
into an Arg auxotroph (arg-). Srb and Horowitz extended their analysis of Arg auxotrophs by
testing the intermediates of amino acid biosynthesis for the ability to restore growth of the
mutants (Figure 5).

Figure 5.
Testing different Arg auxotrophs for their ability to grow on media supplemented with
intermediates in the Arg biosynthetic pathway. (Original-Deyholos- CC BY-NC 3.0)

They found that only Arg could rescue all of the Arg auxotrophs, while either Arg or Cit could
rescue some (Table 1). Based on these results, they deduced the location of each mutation in the
Arg biochemical pathway (i.e. which gene was responsible for the metabolism of which
intermediate).

Mutants in:   MM + Orn   MM + Cit   MM + Arg
gene A        Yes        Yes        Yes
gene B        No         Yes        Yes
gene C        No         No         Yes

Table 1.
Ability of auxotrophic mutants in each of the three enzymes of the Arg biosynthetic pathway to
grow on minimal medium (MM) supplemented with Arg or either of its precursors, Orn and Cit. Gene
names refer to the labels used in Figure 4.
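The deduction summarized in Table 1 can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes a single block in a strictly linear pathway (precursor, then Orn, then Cit, then Arg), takes the gene labels from Table 1, and uses the hypothetical names PATHWAY, GENE_FOR_PRODUCT and blocked_gene. The idea is that a mutant is rescued by every compound made after its blocked step, so the earliest compound that restores growth identifies the step that is blocked.

```python
# Compounds in pathway order (assumed linear): precursor -> Orn -> Cit -> Arg
PATHWAY = ["Orn", "Cit", "Arg"]

# Gene whose enzyme produces each compound, as implied by Table 1 (labels from Figure 4)
GENE_FOR_PRODUCT = {"Orn": "gene A", "Cit": "gene B", "Arg": "gene C"}


def blocked_gene(growth):
    """Infer the mutated gene from growth on MM plus each supplement.

    growth maps "Orn", "Cit" and "Arg" to True (growth) or False (no growth).
    """
    for position, compound in enumerate(PATHWAY):
        if growth[compound]:
            # Every compound downstream of the block must also rescue the mutant.
            if not all(growth[later] for later in PATHWAY[position:]):
                raise ValueError("pattern does not fit one block in a linear pathway")
            return GENE_FOR_PRODUCT[compound]
    raise ValueError("mutant is not rescued by any compound tested")


if __name__ == "__main__":
    # The "gene B" row of Table 1: No on Orn, Yes on Cit, Yes on Arg.
    print(blocked_gene({"Orn": False, "Cit": True, "Arg": True}))
```

A growth pattern that skips a downstream compound is reported as an error rather than forced into the model, since it would suggest a branched pathway or more than one mutation.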

5. GENETIC SCREENS FOR MUTATIONS HELP CHARACTERIZE BIOLOGICAL PATHWAYS
The use of many other mutations, together with the “one gene: one enzyme” model, permitted the
genetic dissection of many other biochemical and developmental pathways.
The general strategy for a genetic screen for mutations is to expose a population to a mutagen,
then look for individuals among the progeny that have defects in the biological process of
interest. There are many details that must be considered when designing a genetic screen (e.g.
how can recessive alleles be made homozygous). Nevertheless, mutational analysis has been an
extremely powerful and efficient tool in identifying and characterizing the genes involved in a
wide variety of biological processes, including many genetic diseases in humans. Genetic screens
are covered in more detail in Chapter 12.

___________________________________________________________________________
SUMMARY:
• The Central Dogma describes the information flow from nucleic acids to proteins.
• Garrod's observations showed that there is a connection between genes and enzymes.
• Beadle and Tatum proposed that one gene encodes one enzyme.
• Their work was also an example of how to screen for genetic mutants, and thereby characterize
biochemical pathways or biological processes.
KEY TERMS:
Central Dogma            prototroph
transcription            minimal medium
translation              auxotroph
reverse transcription    complete medium
genotype                 genetic screen
phenotype                genetic analysis
Beadle & Tatum           rescue
metabolic pathway        arginine
one-gene:one-enzyme      genetic screen for mutations
Neurospora crassa


STUDY QUESTIONS:
1) Compare Figure 4 and Table 1. Suppose you created three new arg- mutations called mutants
#1, #2, & #3. #1 grew on MM+cit and MM+arg, #2 grew on only MM+arg, while #3 grew on MM+orn,
cit or arg. Which genes are #1, 2, & 3 mutant in (A, B, or C)?
2) Why was the vitamin biotin (see Section #3) always added to the MM?
3) Last century, A. Garrod, and later Beadle and Tatum, showed that genes encode enzymes. From
what we know now, do all genes encode enzymes? Explain.
4) Most mutant proteins differ from wild type (normal) by a single substitution at a specific
amino acid site. Explain how some amino acid changes result in:
a) no loss of protein function,
b) only partial loss-of-function,
c) complete loss-of-function,
d) and how changes at different amino acid sites can result in the same complete
loss-of-function.
5) Some mutants result in the loss of a specific enzyme activity. Does this mean that no protein
product is produced from that mutant gene?
6) The molecular weights of the A and B chains of E. coli tryptophan synthase are 29,500 and
49,500, respectively. The size of the entire enzyme is 159,000.
a) If the average molecular weight of each amino acid is 110, then how many amino acids are
present in each chain?
b) How many chains does the whole enzyme contain? Explain.
7) Recall that Neurospora is an orange coloured bread mould. The biochemical pathway below is
how wild type cells become orange. None of the compounds are essential. Cells containing W are
white, cells with Y are yellow, and cells with O are orange. Assume that the reactions will go
to completion if possible.
Fill in this table with the colours of the cell cultures.

Strain           MM+W   MM+Y   MM+O
gene1+ gene2+
gene1- gene2+
gene1+ gene2-
gene1- gene2-

CHAPTER 04 – COMPLEMENTATION
Figure 1.
In Chinese philosophy the yin yang symbol suggests that opposite forces, such as two mutations,
can actually be complementary and that together they can give rise to a whole, as with
complementation in genetics.
(Wikipedia-Gregory Maxwell-PD)

INTRODUCTION
How do genetic researchers determine whether two mutants that have similar phenotypes are
mutant in the same gene or in different genes? One way is by determining whether the mutations
are located at a similar or a different location. If the locations are different, the mutations
must be in different genes and thus are not allelic. If they are located in the same region,
then a complementation test is used. Such tests consist of classical Mendelian genetic crosses
to see if one mutant can complement another, that is, give a wild type phenotype. More recently,
transformation with DNA carrying a gene has been used to see if putting a single gene into a
cell/organism can rescue a mutant phenotype.

1. COMPLEMENTATION TESTS AND ALLELISM
As explained in the previous chapter, mutant screening is one of the starting points geneticists
use to investigate biological processes. Geneticists can observe two independently derived
mutants with similar phenotypes, through a mutant screen or in natural populations. An immediate
question from this observation is whether the mutant phenotype is due to a loss of function in
the same gene, or whether the two are mutant in different genes that both cause the same
phenotype (e.g., in the same pathway). In other words, are they allelic mutations or non-allelic
mutations, respectively? This question can be resolved using complementation tests, which bring
together, or combine, the two mutations under consideration in the same organism to assess the
combined phenotype.

Figure 2.
In this simplified biochemical pathway, two enzymes encoded by two different genes modify
chemical compounds in two sequential reactions to produce a purple pigment. Loss of either of
the enzymes disrupts the pathway and no pigment is produced. (Original-Deyholos-CC BY-NC 3.0)

The easiest way to understand a complementation test is by example (Figure 2). The pigment in a
purple flower could depend on a biochemical pathway much like the biochemical pathways leading
to the production of arginine in Neurospora (Chapter 3). A diploid plant that lacks the function
of gene A (genotype aa) would produce mutant white flowers that phenotypically looked just like
the white flowers of a plant that lacked the function of gene B (genotype bb). The products of
both A and B are enzymes in the same pathway that leads from a colorless compound #1, through
colorless compound #2, to the purple pigment. A block at either step will result in a mutant
white flower instead of the wild type purple flower.
Strains with mutations in gene A can be represented as the genotype aa, while strains with
mutations in gene B can be represented as bb. Given that there are two genes here, A and B, each
of these mutant strains can be more completely represented as aaBB and AAbb. (LEARNING NOTE:
Students often forget that genotypes usually show only the mutant loci; however, one must
remember that all the other genes in the diploid genome are assumed to be wild type.)
If these two strains are crossed together, the resulting progeny will all be AaBb. They will
have both a wild type, functional A gene and B gene and will thus have a pigmented, purple
flower, a wild type phenotype. This is an example of complementation. Together, each strain
provides what the other is lacking (AaBb). The mutations are in different genes and are thus
called non-allelic mutations.
Now, if we are presented with a third pure-breeding, independently derived white-flower mutant
strain, we won't initially know if it is mutant in gene A, gene B or some other gene altogether.
We can use complementation testing to determine which gene is mutated. To perform a
complementation test, two homozygous individuals with similar mutant phenotypes are crossed
(Figure 3).
If the F1 progeny all have the same mutant phenotype (Case 1 - Figure 3A), then we infer that
the same gene is mutated in each parent. These mutations would then be called allelic mutations
– mutant in the same gene locus. These two mutations FAIL to COMPLEMENT one another (the progeny
are still mutant). They could be either the exact same mutant alleles (same base pair changes),
or different mutations (different base pair changes, but in the same gene - allelic).
Conversely, if the F1 progeny all appear to be wild type (Case 2 - Figure 3B), then each of the
parents most likely carries a mutation in a different gene. These mutations would then be called
non-allelic mutations - mutant in a different gene locus. These mutations DO COMPLEMENT one
another.

Figure 3A – Observation:
In a typical complementation test, the genotypes of two parents are unknown (although they must
be pure breeding, homozygous mutants). If the F1 progeny all have a mutant phenotype (Case 1),
there is no complementation. If the F1 progeny are all wild-type, the mutations have
successfully complemented each other. (Original-Deyholos-CC BY-NC 3.0)

Figure 3B – Interpretation:
The pure breeding, homozygous mutant parents had unknown genotypes before the complementation
test, but it could be assumed that they had mutations either in the same gene (Case 1) or in
different genes (Case 2). In Case 1, all of the progeny would have the mutant phenotype, because
they would all have the same, homozygous genotype as the parents. In Case 2, each parent has a
mutation in a different gene, therefore none of the F1 progeny would be homozygous mutant at
either locus. Note that the genotype in Case 1 could be written as either aa or aaBB.
(Original-Deyholos-CC BY-NC 3.0)
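The two outcomes of a complementation test (Case 1 and Case 2) can be reproduced with a very small model. The Python sketch below is an illustration under stated assumptions: both parents are pure-breeding, every mutation is a recessive loss-of-function allele written in lower case, and the function names f1, phenotype and complements are hypothetical names chosen for this example.

```python
def f1(parent1, parent2):
    """F1 genotype from a cross of two pure-breeding (homozygous) parents.

    Each parent maps a locus to its two alleles, e.g. {"A": "aa", "B": "BB"}.
    A homozygous parent contributes the same allele in every gamete.
    """
    return {
        locus: "".join(sorted(parent1[locus][0] + parent2[locus][0]))
        for locus in parent1
    }


def phenotype(genotype):
    """Recessive model: one wild type (upper case) allele per locus is enough."""
    every_locus_ok = all(
        any(allele.isupper() for allele in pair) for pair in genotype.values()
    )
    return "wild type" if every_locus_ok else "mutant"


def complements(parent1, parent2):
    """True if the F1 of the cross is wild type, i.e. the mutations complement."""
    return phenotype(f1(parent1, parent2)) == "wild type"


if __name__ == "__main__":
    strain_1 = {"A": "aa", "B": "BB"}  # white flowers, mutant in gene A
    strain_2 = {"A": "AA", "B": "bb"}  # white flowers, mutant in gene B
    print(f1(strain_1, strain_2), complements(strain_1, strain_2))
```

In practice the genotypes are unknown and the inference runs the other way: the observed F1 phenotype is used to decide whether the two parents carry mutations in the same gene.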

Note: For mutations to be used in complementation tests they (1) are usually true-breeding
(homozygous at the mutant locus), and (2) must be recessive mutations. Dominant and
semi-dominant mutations CANNOT be used in complementation tests, since these mutations won’t
show complementation effects of two non-allelic genes. (3) Note that haploid organisms like
Neurospora cannot be used in complementation tests since they have only one set of chromosomes.
(4) Also, remember, some mutant strains may have more than one gene locus mutated and thus would
fail to complement mutants from more than one other locus (or group).

2. COMPLEMENTATION GROUPS = GROUPS OF ALLELIC MUTATIONS
So, with the third mutant strain above, we could assign it to be allelic with either gene A or
gene B, or with some other locus, should it complement both gene A and gene B mutations. If we
also had a fourth, fifth, sixth, etc. white flower strain, from different natural populations or
from independently mutagenized individuals, we could begin to organize the allelic mutations
into groups, which are called complementation groups. These are groups of mutations that FAIL TO
COMPLEMENT one another (a group of NON-complementing mutations) and are assumed to have
mutations in the SAME gene; hence they are grouped as a complementation group. A group can
consist of as few as one mutation and as many as all the mutants under study. Each group
represents a set of mutations in the same gene (allelic). The number of complementation groups
represents the number of genes that are represented in the total collection of mutations.
It all depends on how many mutations you have in a gene. For example, the white gene in
Drosophila has >300 different mutations described in the literature. If you were to obtain all
these mutations and cross them to one another, you would find they all belonged to the same
complementation group, the white gene. Each complementation group represents a gene.
If, however, you obtained a different mutation, vestigial for example, which affects wing
growth, and crossed it to a white eye colour mutation, the double heterozygote would have red
eyes and normal wings (wild type for both characters), so the two would complement and represent
two different complementation groups: (1) white, (2) vestigial. The same would be true for the
other eye-colour mutations mentioned elsewhere in this text. For example, if you crossed a
scarlet eye-colour mutant to a white eye-colour mutant, the double heterozygote would have wild
type red eyes. Each mutant has the wild type allele of the other. Again, remember that all the
other genes in the diploid genome are assumed to be wild type.
To drive home the concept of complementation groups, we will look at two hypothetical examples.

2.1. EXAMPLE ONE: MULTIPLE MUTANT COMPLEMENTATION TEST
The first example shows the results of a series of crosses as a complementation test table
(Figure 4) with six mutants labeled a to f. The mutants fall into three complementation groups
in total: (1) a, (2) b, c, f and (3) d, e. Notice that a complementation group can consist of
only one mutant, or more than one.

Figure 4.
Complementation test table showing which flower mutant strains complement each other and which
do not. “w” stands for white flowers, which is mutant (no complementation) and “p” stands for
purple, which represents wild type (complementation). Blanks are for crosses not done.
(Original-Di Cara-CC BY-NC 3.0)
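Sorting mutants into complementation groups from a table of pairwise crosses amounts to joining every pair that fails to complement. The Python sketch below illustrates this with a simple union-find; it is not taken from the text. The list of failed crosses is hypothetical data, chosen only to be consistent with the three groups named for mutants a to f, and the function name complementation_groups is an assumption of this example.

```python
def complementation_groups(mutants, failed_pairs):
    """Group mutants so that any two that fail to complement share a group.

    mutants: iterable of mutant names.
    failed_pairs: pairs (x, y) whose cross gave mutant F1 progeny.
    """
    parent = {m: m for m in mutants}

    def find(m):
        # Follow links to the representative of m's group (with path halving).
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for x, y in failed_pairs:
        parent[find(x)] = find(y)  # merge the two groups

    groups = {}
    for m in mutants:
        groups.setdefault(find(m), []).append(m)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    # Hypothetical "w" (white, no complementation) results; all other crosses gave "p".
    failed = [("b", "c"), ("c", "f"), ("d", "e")]
    print(complementation_groups("abcdef", failed))
```

This simple joining assumes each strain carries a mutation in only one gene. A strain with mutations in two genes would fail to complement members of two groups and would wrongly merge them, so such results need to be checked by hand.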

2.2. EXAMPLE TWO: DOUBLE HIT STRAIN
The second example is similar, but has a twist (Figure 5). It has five mutants labeled 1-5, with
1-4 being mutations in only a single gene each, while mutant #5 has mutations in two different
genes, and thus is unable to complement mutations in either of those two genes. A double-hit
strain like strain #5 is normally a very rare event, but is included here to make a point. A
double-hit strain may appear to belong in two different groups. In this case, mutants #3 and #4
complement (different genes) but #5 fails to complement both #3 and #4, indicating it has
mutations in both the gene mutated in #3 (gene B) and the gene mutated in #4 (gene C) (Figure 6).

Example 2:
Figure 5.
Complementation test table with pink as mutant and green as wild type (black is for crosses not
done). Note mutant #5 has two mutations. (Original-Locke-CC BY-NC 3.0)

Figure 6.
Chromosomes of the organisms that are used in complementation tests to decide if the genes are
allelic or non-allelic. (Original-Locke-CC BY-NC 3.0)

3. TRANSFORMATION RESCUE
In a normal rescue experiment (see Chapter 3), arginine auxotrophic strains of Neurospora crassa
were "rescued" when supplemented with the amino acids that they could not synthesize and that
were essential for the organism's metabolism. Transformation rescue, rather than giving
supplementary metabolic pathway products, supplies the needed genes that can complement the
mutant allele. The process of taking in foreign DNA (transformation) that contains the normal
version of the gene, and thereby rescuing the auxotrophic strain, is called transformation
rescue.
Let’s say that there is an E. coli auxotrophic mutant in a gene called “a” (Table 1).

E. coli strain   MM (minimal medium)       MM + supplement
a-               Auxotrophic (no growth)   Growth
a+               Growth                    Growth

Table 1.
The auxotrophic strain (a-) cannot grow on MM (minimal medium) but the prototrophic strain (a+)
can grow.

In order to transform this auxotrophic strain and rescue it, we need to:
(1) Make the E. coli auxotrophic cells competent so that they can incorporate foreign DNA
molecules. We can make cells competent via heat shock or electroporation, which can slightly
damage the membrane and therefore provide passageways for DNA molecules to enter the cell.
(2) Extract DNA molecules from a wild type strain of E. coli and break them down into short
fragments using enzymes.
(3) Insert these short fragments of E. coli DNA into a DNA vector, which is a DNA molecule that
can contain, amplify, and transfer the inserted DNA fragments into the host cell. This combined
DNA molecule is called recombinant DNA. Plasmids are small circular DNA molecules that are
mostly found in bacteria and are suitable as DNA vectors. The [vector + DNA insert] molecule can
be replicated and the result would be multiple clones of the original DNA insert.
(4) After the E. coli DNA fragments that were once a single long DNA molecule are inserted into
DNA vectors, we have a collection of recombinant DNA molecules, which, when transformed, can be
called a DNA library. Among all the recombinant DNA molecules in the library, there are three
possibilities (Figure 7): (1) DNA clones that contain gene a, (2) DNA clones that don’t contain
gene a, which will be collectively represented by the letter b, and (3) DNA clones that don’t
contain any foreign genes.
(5) Combine the recombinant DNA molecules and the host E. coli strain together so that the
auxotrophic strain can incorporate those DNA molecules through transformation. Growing the
strains on minimal and complete media will let us decide if the transformation rescue worked or
not.

The host strain’s genotype is a-b+. It needs a wild type a+ allele in order to grow on minimal
medium. Therefore, cells that take up a plasmid carrying the a+ allele would grow
(prototrophic), while cells that carry either a plasmid with no transgene or a plasmid with
gene b would still be auxotrophic.
Notice that the plasmids contain an antibiotic resistance gene called AntiR and that the strains
were actually grown on minimal medium that contained antibiotics. Why was this so? This is
because we want to select for the cells that actually incorporated the plasmid that contained
the wild-type “a” gene.
Only a small fraction of cells is actually transformed by foreign DNA. Therefore, if we grow
those strains on an agar plate without antibiotics, we cannot tell whether growth was due to
complementation of the host mutation by the recombinant DNA or to some reversion back to wild
type. There is a small possibility that cells that weren’t transformed could somehow synthesize
the essential substrate due to a spontaneous mutation. Adding the antibiotic selection will
remove cells that weren't transformed and therefore don't contain a plasmid with the antibiotic
resistance gene, and select for the cells that were successfully transformed and complemented by
the recombinant DNA.

Figure 7.
Transformation rescue diagram (Original-Locke-CC BY-NC 3.0)
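The reason for plating on minimal medium that also contains antibiotic can be made concrete with a toy model of which cells form colonies. The Python sketch below is only an illustration under simplifying assumptions: growth is all-or-nothing, every plasmid carries AntiR, and the cell descriptions, the dictionary CELLS and the functions grows and colonies are hypothetical names invented for this example.

```python
def grows(cell, minimal=True, antibiotic=False):
    """Return True if the cell forms a colony on the given medium.

    cell["a_functional"]: True if the chromosomal a gene works (a+), False if a-.
    cell["plasmid"]: None for untransformed cells, otherwise a set of plasmid genes.
    """
    plasmid = cell["plasmid"] or set()
    if antibiotic and "AntiR" not in plasmid:
        return False  # no resistance gene, killed by the antibiotic
    if not minimal:
        return True  # complete medium supplies the missing nutrient
    return cell["a_functional"] or "a+" in plasmid


# Kinds of cells that could be present after transforming an a- host with the library.
CELLS = {
    "untransformed a-": {"a_functional": False, "plasmid": None},
    "untransformed revertant": {"a_functional": True, "plasmid": None},
    "plasmid, no insert": {"a_functional": False, "plasmid": {"AntiR"}},
    "plasmid with gene b": {"a_functional": False, "plasmid": {"AntiR", "b+"}},
    "plasmid with gene a": {"a_functional": False, "plasmid": {"AntiR", "a+"}},
}


def colonies(minimal, antibiotic):
    """Names of the cell types that grow on the chosen medium."""
    return [name for name, cell in CELLS.items() if grows(cell, minimal, antibiotic)]


if __name__ == "__main__":
    print("MM only:        ", colonies(minimal=True, antibiotic=False))
    print("MM + antibiotic:", colonies(minimal=True, antibiotic=True))
```

In this model a spontaneous revertant grows on plain minimal medium alongside the rescued cells, but only cells carrying the plasmid with gene a grow once the antibiotic is added.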

___________________________________________________________________________
SUMMARY:
• Complementation testing determines whether two mutants are the result of mutation of the same
gene (allelic mutations), or if each mutant is caused by mutation of a different gene (non-allelic
mutations).
• A complementation group contains mutants that cannot complement each other (allelic
mutations) and that are therefore assumed to have mutations at the same gene locus.
• Transformation rescue refers to the incorporation of a recombinant DNA molecule that contains
a gene that is able to complement the mutated gene in another organism.
KEY TERMS:
allelic mutations        transformation rescue
non-allelic mutations    heat shock
complementation test     electroporation
biochemical pathways     DNA vector
complementation          recombinant DNA
complementation group    plasmids
double-hit strain        clones
transformation           DNA library
rescue                   AntiR

STUDY QUESTIONS:
1) You are working with a prototrophic model organism (e.g. a fungus). You are interested in
finding genes involved in synthesis of proline (Pro), an amino acid that is normally synthesized
by this organism.
a) How would you design a mutant screen to identify genes required for Pro synthesis?
b) Imagine that your screen identified ten mutants (labeled #1 through #10) that grew very
poorly unless supplemented with proline. How could you determine the number of different genes
represented by these mutants?
c) If each of the ten mutants represents a different gene, what will be the phenotype of the F1
progeny if any pair of the ten mutants are crossed?
d) If all of the ten mutants represent the same gene, what will be the phenotype of the F1
progeny if any pair of the ten mutants are crossed?
2) Draw the expected results of a series of complementation tests (crosses), in the form of a
table, for five yeast mutant strains where there are at least three different mutant loci, and
one of the mutations involves a double hit (two loci are mutant in the same strain).
3) Students create a mutant E. coli strain that is auxotrophic for methionine. Three students
build plasmid DNA libraries from wild type DNA from the parental strain. Student A uses EcoRI to
clone the restriction fragments. Student B uses HindIII and student C uses XhoI. Each transforms
the auxotrophic mutant strain with their library. Student A gets lots of prototrophic colonies
on minimal medium, while students B and C don’t get any. Explain what might have happened. The
students’ control experiments indicate that the transformation protocol worked.
4) Figure 7 shows how we can rescue an a– strain with a plasmid carrying an a+ gene. Could we
also rescue this strain by growing the cells on media containing Enzyme A (the product of the a+
gene)? How about the product of Enzyme A?

CHAPTER 05 - GENES TO GENOMES

Figure 1.
The DNA in most genomes is organized in levels, from nucleotides to genes, to chromosomes, and
finally to a whole genome.
(Wikipedia- Plociam- CC BY 2.0)

INTRODUCTION
With Gregor Mendel’s work (1865), we transitioned from a “blended” concept of inheritance to a
“particulate” concept. The particles were given the name “gene” in 1909 by Wilhelm Johannsen,
and William Bateson coined the word genetics at about that time. A first understanding of gene
regulation came from Jacob and Monod's work on the lac operon. Marshall Nirenberg and Heinrich
J. Matthaei in 1961 cracked the “genetic code” (codons). With the advent of DNA cloning and
recombinant DNA in the 1970s we first glimpsed the interrupted nature of eukaryote genes in work
by Phillip Allen Sharp and Richard J. Roberts. More recently, the Human Genome Project
determined the entire nucleotide sequence of the human genome. Analysis predicts around
20,000-25,000 genes, but the actual number depends on deciding what is a gene. Since then many
more genomes have been sequenced, both higher, multi-cellular eukaryotes, as well as hundreds
(thousands?) of prokaryotes. Each contributes thousands of new genes to the databases.
What is a "gene" and how are genes organized in a genome?

1. CENTRAL DOGMA - REVIEW
Molecular biology’s Central Dogma (see Chapter 03) states that the genetic information of each
gene is encoded in the nucleotide sequence, and then, as needed, this information is transcribed
into an RNA sequence, and then translated into a polypeptide (protein) sequence. The core of the
Central Dogma is that genetic information is NEVER transferred from protein back to nucleic
acids. The products of protein coding genes: (1) catalyze the formation and breakdown of most
molecules within an organism, as well as (2) form its structural components and (3) regulate the
expression of genes. Thus, nucleic acids (DNA and RNA) dictate the structure of each protein and
the structure affects the function of that protein, which can thereby affect the entire
organism. Thus the sum of all the genes present in the genome of an organism, or genotype,
defines the potential form, or phenotype, of the organism.
In prokaryotes, the DNA is normally a single, circular chromosome, while in eukaryotes it is
present as multiple linear chromosomes.

2. WHAT IS A GENE?
Mendel’s work showed that inheritance was particulate (not blended). These particles became
known as genes and could exist in one or more different versions or forms, which we now call
alleles. In its broadest and most general definition, a gene is an abstract concept with five
components.

A gene is:
(1) a unit of inheritance
(2) that occupies a specific position (locus) within the genome or chromosome and
(3) has one or more specific effects upon the phenotype of the organism and
(4) can mutate into various forms (alleles) and
(5) can recombine with similar such units
(with modification from: King & Stansfield (1968) A dictionary of Genetics, Third Ed. Oxford
University Press, New York).

Note that this definition of a gene isn’t limited to just protein coding genes, but includes
other types that will be discussed below. Today we know more about the different roles that DNA
sequences play in the expression of genetic information. Often it is difficult to say for some
sequences whether they are “genes” or not.

3. BASIC TYPES OF DNA SEQUENCE
First, not all DNA sequences in a genome are parts of a gene. In prokaryotes, most DNA sequences
are genes and only a minority are not. Those sequences not in genes have no apparent function
other than linking one gene to the next (intergenic regions) (Figure 2). In eukaryotes, however,
especially the multi-cellular species such as us, there appears to be a large percentage of
non-functional DNA (~75-90%; sometimes called “Junk” DNA), depending upon what is called
non-functional. Here we call it non-functional if it fails to influence the phenotype, even if
it might have some biochemical activity. Remember all DNA is replicated (a biochemical activity)
and all DNA is bound up in histone proteins (another biochemical activity), so not all
biochemical activity should be considered as functional for the phenotype.
While most references to genes usually involve protein coding genes, the functional (gene
containing) sequences can actually be divided into three main types:
(1) DNA acting directly
(2) DNA transcribed into RNA, which functions directly, and
(3) DNA transcribed into mRNA, which is translated into a polypeptide, which has a function.

4. GENES: DNA ACTING DIRECTLY
DNA does not have to be transcribed and/or translated in order to have a function. The DNA
itself has a function. Examples of this type include origins of replication (Ori), centromeres,
and telomeres. These sequences are essential, can be mutated, and occupy a specific location on
a chromosome. They are unusual and not typically thought of as genes, but they fit the
definition above. They are typically the least frequent in the genome.

Figure 2.
DNA sequence can be divided into two main categories in a general sense: Functional and
Non-functional. The Functional sequences, or genes, can be further divided.
(Original-J. Locke- CC BY-NC 3.0)

5. GENES: DNA TRANSCRIBED INTO RNA (RNA-CODING GENES)
Some genes are transcribed but not translated into a protein. It is the RNA that functions -
functional RNA molecules. There are many types of RNA molecules but for now, we will only list a
few of them.

5.1. PROTEIN SYNTHESIS RELATED
Ribosomes contain rRNA; the large and small subunits each have RNA molecules as components of
their structure. These rRNAs are encoded by the rRNA genes, which are usually located as
multi-gene repeats in clusters. tRNA is also a functional RNA that is involved in the transfer
of amino acids to the elongating polypeptide chain. tRNA genes

are usually dispersed around the genome (not clustered).

5.2. SMALL RNAS (AN INCOMPLETE LIST):
snRNA (small nuclear RNA) molecules reside in the nucleus and form RNA-protein complexes called “spliceosomes” that process the primary mRNA transcripts into the mature mRNA. In this complex, it is the RNA molecule, not the protein component, that has the catalytic activity.

snoRNA (small nucleolar RNA) molecules act as guides for the modification of other RNA molecules such as snRNA or rRNA.

miRNA (microRNA) is a single-stranded RNA molecule, about 22 nucleotides long, that regulates gene expression at both the transcriptional and post-transcriptional levels.

siRNA (small interfering RNA) molecules are short double stranded RNA molecules (about 21-24 bp long) that are also involved in RNA interference (RNAi) regulation of genes and post-transcriptional gene silencing (PTGS). These small interfering RNAs are similar in size and function to miRNAs but come from a different RNA precursor. Double-stranded RNA molecules are chopped into shorter fragments that are still double-stranded, which are called siRNAs. For example, RNA is transcribed from centromeric sequences and modified into siRNA fragments. This double stranded siRNA is separated into two strands. One of the strands forms a complex with other proteins and this complex finds its precursor (the centromeric sequence) and modifies it into a highly condensed chromatin, producing heterochromatin.

piRNA (Piwi-interacting RNA) molecules are single stranded RNA molecules, 24-32 nucleotides in length, that interact with a protein called piwi. This RNA-protein complex affects epigenetic and post-transcriptional gene silencing. For example, it can block the transcription from DNA transposons by turning the normal chromatin into heterochromatin.

Each of these types of small RNAs is transcribed from a gene.

Figure 3.
Structure of a gene contains many components. This is a mix of prokaryote and eukaryote gene structure. For example, a
polycistronic operon is found in prokaryotes, while the distant enhancer/silencer elements are found in eukaryotes. ORF is open
reading frame; UTR is untranslated region; RBS is ribosome binding site.
(Wikipedia-Thomas Shafee-CC BY-SA 4.0)


6. GENES: DNA TRANSCRIBED INTO MRNA, TRANSLATED INTO A POLYPEPTIDE (PROTEIN CODING GENES)
Protein coding genes consist of both a regulatory and a transcribed sequence. The DNA is transcribed into an mRNA that is then translated into polypeptides as directed by the cis-regulatory elements in combination with various trans-acting factors. One thing to note is that a gene is depicted as simply a “block” or a “line” in various diagrams but it actually contains more than that; there are many components inside a gene (Figure 3). For example, in prokaryotes, a gene consists of:
- regulatory sequences: enhancer/silencer + operator + promoter
- transcribed sequences (transcription unit): 5’UTR (untranslated region) + open reading frame + 3’UTR (includes the terminator).

The open reading frame (ORF) refers to the sequence beginning at the start codon, through to the stop codon. Also, many diagrams depict these components as a “single unit” or as a cluster sequence on the same chromosome, but in eukaryotes, regulatory regions can be very far away (kilobases upstream/downstream) from the transcription unit.

7. HOW ARE GENES AND OTHER SEQUENCES DISTRIBUTED IN THE GENOME?
In our genome, genes are interspaced by inter-genic regions that contain interspersed repeats such as SINE (short interspersed elements) or LINE (long interspersed elements). Organisms that have smaller genes, such as bacteria, tend to have less inter-genic DNA compared to organisms that have larger genes, such as yeast, Drosophila and mammals.

Figure 4.
Comparison of the gene distribution between prokaryote and different eukaryotic organisms.
(Original-Locke- CC BY-NC 3.0)
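
The transcript layout described in section 6 (5’UTR + open reading frame + 3’UTR) can be illustrated with a short Python sketch. This is only a simplified illustration, not a gene-finding method: it assumes the ORF starts at the first AUG in the transcript and ends at the first in-frame stop codon, and the function name split_transcript and the example sequence are made up for this example.

```python
# Minimal sketch: split an mRNA into 5'UTR, ORF and 3'UTR.
# Assumption: the ORF begins at the first AUG and runs through the
# first in-frame stop codon (real transcripts can be more complicated).
STOP_CODONS = {"UAA", "UAG", "UGA"}

def split_transcript(mrna):
    """Return (five_prime_utr, orf, three_prime_utr), or None if no ORF is found."""
    start = mrna.find("AUG")
    if start == -1:
        return None
    # Walk codon by codon from the codon after the start codon.
    for pos in range(start + 3, len(mrna) - 2, 3):
        if mrna[pos:pos + 3] in STOP_CODONS:
            end = pos + 3  # the ORF runs through the stop codon
            return mrna[:start], mrna[start:end], mrna[end:]
    return None

# Hypothetical transcript: 5'UTR "GGACU", ORF "AUG GCU UUC UAA", 3'UTR "GCAAUAAA"
print(split_transcript("GGACUAUGGCUUUCUAAGCAAUAAA"))
```

The untranslated regions fall out simply as whatever lies before the start codon and after the stop codon, which is why they are part of the transcript but not of the ORF.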


From: http://sandwalk.blogspot.ca/2011/05/whats-in-your-genome.html


___________________________________________________________________________
SUMMARY:
• The Central Dogma states that information in nucleic acids (DNA and RNA) is translated into protein,
but it can never go back in the opposite direction. More recently it has been described as information
flowing from DNA->RNA->protein.
• The definition of a gene is changing as new discoveries are made but in general, a gene is a unit of
inheritance that has a locus on a chromosome, can affect an organism’s phenotype, can exist in various
forms, and can recombine with other such units.
• DNA sequences can be divided into two main categories: functional and non-functional. The functional
DNA can be divided into (1) DNA acting directly, (2) DNA transcribed into RNA, which functions directly,
and (3) DNA transcribed into mRNA, which is translated into a polypeptide, which has a function.
• There are various kinds of RNA molecules that are involved in protein synthesis, DNA replication, post-
transcriptional modification, and gene regulation.
• Protein coding genes contain regulatory sequences and transcribed sequences. The transcript (mRNA)
contains a 5’ untranslated region (5’UTR), the open reading frame (ORF), and the 3’ untranslated
region (3’UTR).
KEY TERMS:
Central Dogma non-protein coding
transcription RNA encoding
translation Structural DNA
reverse transcription rRNA
genotype tRNA
phenotype snRNA
genes snoRNA
alleles miRNA
genetics siRNA
ORF piRNA
protein coding


STUDY QUESTIONS:
1) Provide a definition of a gene that includes all
types of genes (e.g. more than just a protein
coding gene).
2) Is all DNA in a genome part of a gene? Does all
DNA have a function? Explain.
3) Do all transcribed RNA molecules end up as
mRNA transcripts?
4) What is the UTR on an mRNA?
5) Does a segment of DNA have to be transcribed
in order to be a gene?


PROKARYOTIC GENES: E. COLI LAC OPERON – CHAPTER 06

CHAPTER 06 – PROKARYOTIC GENES: E. COLI LAC OPERON

Figure 1.
Electron micrograph of growing E. coli.
Some show the constriction at the location
where daughter cells separate. The
colouring is false.
(Flickr-NIAID-CC BY 2.0)

INTRODUCTION
With most organisms, every cell contains essentially the same genomic sequence. How then do cells develop and function differently from each other? The answer lies in the regulation of gene expression. Only a subset of all the genes is expressed (i.e. active) in any given cell participating in a particular biological process. Gene expression is regulated at many different steps along the process that converts DNA information into proteins. In the first stage, transcript abundance can be controlled by regulating the rate of transcription initiation and processing, as well as the degradation of transcripts. In many cases, higher abundance of a gene’s transcripts is correlated with its increased expression. We will focus on transcriptional regulation in E. coli (Figure 1). Be aware, however, that cells also regulate the overall activity of genes in other ways, for example by controlling the rate of mRNA translation, processing, and degradation, as well as the post-translational modification of proteins and protein complexes.

1. THE LAC OPERON – A MODEL PROKARYOTE GENE
Early insights into mechanisms of transcriptional regulation came from studies of E. coli by Francois Jacob & Jacques Monod. In E. coli, and many other bacteria, genes encoding several different polypeptides may be located in a single transcription unit called an operon. The genes in an operon share the same transcriptional regulation, but are translated individually into separate polypeptides. Most prokaryote genes are not organized as operons, but are transcribed individually, each yielding a single polypeptide.

Eukaryotes do not group genes together as operons (an exception is C. elegans and a few other species).

Figure 2.
Diagram of a segment of an E. coli chromosome containing the lac operon, as well as the lacI coding region. The various genes and cis-elements are not drawn to scale.
(Original-Deyholos-CC BY-NC 3.0)


1.1. BASIC LAC OPERON STRUCTURE
E. coli encounters many different sugars in its environment. These sugars, such as lactose and glucose, require different enzymes for their metabolism. The genes for three of the enzymes for lactose metabolism are grouped in the lac operon: lacZ, lacY, and lacA (Figure 2). lacZ encodes an enzyme called β-galactosidase, which digests lactose into its two constituent sugars: glucose and galactose. lacY encodes a permease that helps to transfer lactose into the cell. Finally, lacA encodes a trans-acetylase, the relevance of which in lactose metabolism is not entirely clear. Transcription of the lac operon normally occurs only when lactose is available for it to digest. Presumably, this avoids wasting energy in the synthesis of enzymes for which no substrate is present. In the lac operon there is a single mRNA transcript that includes coding sequences for all three enzymes and is called a polycistronic mRNA. A cistron in this context is equivalent to a gene.

1.2. CIS- AND TRANS- REGULATORS
In addition to these three protein-coding genes, the lac operon contains several short DNA sequences that do not encode proteins, but instead act as binding sites for proteins involved in transcriptional regulation of the operon. In the lac operon, these sequences are called P (promoter), O (operator), and CBS (CAP-binding site). Collectively, sequence elements such as these are called cis-elements because they must be located on the same piece of DNA as the genes they regulate. On the other hand, the proteins that bind to these cis-elements are called trans-regulators because (as diffusible molecules) they do not necessarily need to be encoded on the same piece of DNA as the genes they regulate.

2. NEGATIVE REGULATION – INDUCERS AND REPRESSORS
LacI encodes an allosterically regulated repressor
One of the major trans-regulators of the lac operon is encoded by lacI, a gene located just upstream from the lac operon (Figure 2). Four identical lacI polypeptides assemble together to form a homotetramer called a repressor (Figure 3). This repressor is trans-acting and binds to two cis-acting operator sequences adjacent to the promoter of the lac operon. Binding of the repressor prevents RNA polymerase from binding to the promoter (Figure 2, Figure 4). Therefore, the operon is not transcribed when the operator sequence is occupied by a repressor.

Figure 3.
Structure of lacI homotetramer bound to DNA.
(Original-Deyholos- CC BY-NC 3.0)

2.1. THE REPRESSOR ALSO BINDS LACTOSE (ALLOLACTOSE)
Besides its ability to bind to specific DNA sequences at the operator, another important property of the lacI protein is its ability to bind to allolactose. If lactose is present, β-galactosidase enzymes convert a few of the lactose molecules into allolactose. This allolactose can then bind to the lacI protein. This alters the shape of the protein in a way that prevents it from binding to the operator. Therefore, in the presence of lactose (allolactose) the repressor doesn’t bind the operator sequence and thus RNA polymerase is able to bind to the promoter and transcribe the lac operon. This leads to a moderate level of expression of the mRNA encoding the lacZ, lacY, and lacA genes. A small molecule of this kind, which binds to either an activator or a repressor and induces the production of specific enzymes, is called an inducer. Also, proteins such as lacI that change their shape and functional properties after binding to a ligand are said to be regulated through an allosteric mechanism. The role of lacI in regulating the lac operon is summarized in Figure 4.


Figure 4.
When the concentration of lactose [Lac] is low, lacI tetramers bind to operator sequences (O), thereby blocking binding of RNApol (green) to the promoter (P). Alternatively, when [Lac] is high, lactose binds to lacI, preventing the repressor from binding to O, and allowing transcription by RNApol.
(Original-Deyholos-CC BY-NC 3.0)

3. POSITIVE REGULATION – CAP, CAMP & POLYMERASE
A second aspect of lac operon regulation is conferred by a trans-acting factor called cAMP binding protein (CAP, Figure 5). CAP is another example of an allosterically regulated trans-factor. Only when the CAP protein is bound to cAMP can another part of the protein bind to a specific cis-element within the lac promoter called the CAP binding sequence (CBS). CBS is located very close to, but upstream of, the promoter (P). When CAP is bound at the CBS, RNA polymerase is better able to bind to the promoter and initiate transcription. Thus, the presence of cAMP ultimately leads to a further increase in lac operon transcription.

The physiological significance of regulation by cAMP becomes more obvious in the context of the following information. The concentration of cAMP (the small molecule that activates CAP in this case) is inversely related to the abundance of glucose: when glucose concentrations are low, an enzyme called adenylate cyclase is able to produce cAMP from ATP. Evidently, E. coli prefers glucose over lactose, and so expresses the lac operon at high levels only when glucose is absent and lactose is present. This provides another layer of adaptive control of lac operon expression: only in the presence of lactose, and in the absence of glucose, is the operon expressed at its highest levels.

Figure 5.
CAP, when bound to cAMP, helps RNApol to bind to the lac operon. cAMP is produced only when glucose [Glc] is low.
(Original-Deyholos-CC BY-NC 3.0)

4. THE USE OF MUTANTS TO STUDY THE LAC OPERON

4.1. SINGLE MUTANTS OF THE LAC OPERON
The lac operon and its regulators were first characterized by studying mutants of E. coli that exhibited various abnormalities in lactose metabolism. Mutations can occur in any of the lacZ, lacY, and lacA genes. Such mutations result in altered protein sequences, and cause non-functional products. These are mutations in the protein coding sequences (non-regulatory).

Other mutations can cause the lac operon to be expressed constitutively, meaning the operon is transcribed whether or not lactose is present in the medium. Remember that normally the operon is only transcribed if lactose is present. Such mutants are called constitutive mutants. Constitutive mutants are always on and are unregulated by inducers. These include mutations at the lacO and lacI loci.

4.2. INDUCER MUTATIONS (LACI LOCUS)
The lacI locus has two types of mutations: I- and IS.
(1) One type of mutant allele of lacI (called I-) either (a) prevents the production of a repressor polypeptide or (b) produces a polypeptide that cannot bind to the operator sequence. Therefore,


there is no repressor binding and transcription can occur without the presence of inducer (allolactose). This is also a constitutive expresser of the lac operon because absence of repressor binding permits transcription.

Figure 6.
Both E. coli strains with genotypes I+O+Z-Y+A+/F I+O-Z+Y-A+ and ISO+Z+Y+A+/F I-O-Z+Y+A+ will express the lac genes on the F-factor constitutively because repressor cannot bind to the O- sequence on the F-factor and cannot prevent transcription.
(Original-Locke-CC BY-NC 3.0)

Note that I+ is dominant over I-. For example, in an E. coli strain with genotype I+Z-Y+A+/F I-Z+Y-A+, the lac genes will not be transcribed in the absence of inducer because the I+ allele will still produce functional repressors that bind to all operator sequences, preventing transcription (Figure 7).

Figure 7.
E. coli strain with genotype I+Z-Y+A+/F I-Z+Y-A+ will not produce lacZ, lacY and lacA products.
(Original-Locke-CC BY-NC 3.0)

(2) The other type of mutant allele of lacI (called IS) prevents the repressor polypeptide from binding allolactose. The repressor therefore stays bound to the operator, and the lacZ, lacY and lacA genes are non-inducible. This mutant constitutively represses the lac operon whether lactose is present or not. The lac operon is not expressed at all and this mutant is called a “super-repressor”. IS is therefore dominant to both I+ and I- in trans. Therefore, in E. coli strains with the genotypes 1) ISZ+Y+A+/F I+Z+Y-A+ and 2) ISZ+Y+A+/F I-Z+Y+A+, the lacZ, lacY and lacA genes will not be inducible (Figure 8).

Figure 8.
E. coli strains with genotypes 1) ISZ+Y+A+/F I+Z+Y-A+ and 2) ISZ+Y+A+/F I-Z+Y+A+ will not produce lacZ, lacY and lacA products.
(Original-Locke-CC BY-NC 3.0)

The repressor protein encoded by the lacI gene has at least two independent functional domains. This is the reason why it can mutate independently to give two different types of mutants (Figure 9).


Figure 9.
Because the repressor encoded by the lacI gene has independent domains, mutations can also occur independently.
(Original-Deyholos/Locke-CC BY-NC 3.0)

4.3. OPERATOR MUTATIONS
Oc mutations are base pair changes in the operator sequence (the lacO locus). The base pair change reduces or precludes the repressor (the lacI gene product) from recognizing and binding to the operator sequence. Thus, in Oc mutants, lacZ, lacY, and lacA are expressed whether or not lactose is present. Note that this mutation is cis dominant (it only affects the genes on the same chromosome) and has no effect in trans (on another DNA molecule).

Note that while the lac genes in Oc mutants will be constitutively expressed (not regulated by lactose), some may not be maximally expressed. Some alleles may partially bind the repressor and thus have some measure of inhibition. For example, a deletion of the operator sequence will result in maximal expression, while a single base pair change might only reduce binding slightly and affect the level of expression slightly (Table 1).

4.4. THE F-FACTOR AND TWO LAC OPERONS IN A SINGLE CELL – PARTIAL DIPLOID IN E. COLI
More can be learned about the regulation of the lac operon when two different copies (each containing mutations) are present in one cell. This can be accomplished by using the F-factor to carry one copy, while the other is on the genomic E. coli chromosome. This results in a partial diploid E. coli cell, that is, one that contains two independent copies (alleles) of the lac operon and lacI.

The F-factor or Fertility factor is an episome, which is capable of being either a free plasmid or integrated into the host bacterial chromosome. This switching is accomplished by IS (Insertion sequence) elements where unequal crossing over can recombine the F-factor and adjacent DNA sequences (genes) in and out of the host chromosome. If the F factor is present, then the strain is an F+ strain. For example, the genotype of a host bacterium that has a lac- gene and is supplied with an F factor containing lac+ can be written as lac-/F+lac+. Researchers have used this genetic tool to create partial diploids (merozygotes) that allow them to test the regulation with combinations of different mutations in one cell. For example, the F-factor copy may have an IS mutation while the genomic copy might have an OC mutation. How would this cell respond to the presence/absence of lactose (or glucose)? This partial diploid can be used to determine that IS is dominant to I+, which in turn is dominant to I-. It can also be used to show the OC mutation only acts in cis while the lacI mutation can act in trans.
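
The cis/trans reasoning in sections 4.2 to 4.4 can be checked with a small Python sketch. It is a deliberately simplified model, not a quantitative one: it ignores glucose and expression levels, treats Oc as completely unable to bind repressor, and only asks whether any β-galactosidase is made. The class LacCopy and the function makes_beta_gal are names invented for this illustration.

```python
from dataclasses import dataclass

@dataclass
class LacCopy:
    """One DNA molecule (chromosome or F-factor) carrying lacI and the lac operon."""
    I: str  # '+', '-' or 's' (super-repressor)
    O: str  # '+' or 'c' (operator constitutive)
    Z: str  # '+' or '-'

def makes_beta_gal(copies, lactose):
    """Simplified yes/no prediction of beta-galactosidase production."""
    i_alleles = {c.I for c in copies}
    super_repressor = "s" in i_alleles   # cannot bind allolactose
    normal_repressor = "+" in i_alleles  # released when lactose is present
    # The repressor is diffusible (trans): one pool acts on every copy in the cell.
    active_repressor = super_repressor or (normal_repressor and not lactose)
    for c in copies:
        # The operator acts in cis: Oc only frees the genes on its own DNA molecule.
        operator_blocked = active_repressor and c.O == "+"
        if not operator_blocked and c.Z == "+":
            return True
    return False

# Oc on the same molecule as Z+ gives constitutive expression ...
print(makes_beta_gal([LacCopy("+", "+", "-"), LacCopy("+", "c", "+")], lactose=False))
# ... but Oc on the molecule carrying Z- does not rescue the Z+ copy.
print(makes_beta_gal([LacCopy("+", "+", "+"), LacCopy("+", "c", "-")], lactose=False))
```

Because the repressor pool is computed once for the whole cell while the operator is checked copy by copy, the sketch reproduces the two observations in the text: lacI alleles act in trans, and Oc acts only in cis.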
Table 1. Constitutively expressed Oc mutants may not be maximally expressed and have various levels of expression relative to wild type levels.
Level     Genotype               Explanation
100%      lacI- Oc               no repressor
10-20%    lacI+ Oc               repressor fails to bind tightly
~≤1%      P+ Oc, high glucose    basal transcription, constitutive
0%        P- or Z-               no transcription or no protein

5. SUMMARY
In positive regulation, a low glucose level allows adenylate cyclase to produce cAMP from ATP, and cAMP binds to CAP protein. CAP protein can then bind to DNA and increase the level of lac operon transcription. A high glucose level prevents adenylate cyclase from producing cAMP from ATP. Hence, cAMP will not bind to CAP protein and in turn, CAP will not bind to DNA and the level of transcription will be low.

In negative regulation, repressor protein acts to prevent transcription. Inducer binds to repressor to


alter conformation so it no longer binds to the operator sequence and transcription can take place. High levels of lactose (inducer) would allosterically inhibit repressor and therefore would not prevent transcription. Low levels of lactose would not cause inhibition to the repressor, so transcription would be prevented. Various forms of regulation in the lac operon are found in Figure 10.

Figure 10.
Top: When glucose [Glc] and lactose [Lac] are both high, the lac operon is transcribed at a
basal (<1%) level, because CAP (in the absence of cAMP) is unable to bind to its
corresponding cis-element (yellow) and therefore cannot help to stabilize binding of
RNApol at the promoter.
Bottom: Alternatively, when [Glc] is low, and [Lac] is high, CAP and cAMP can bind near
the promoter and increase further the transcription of the lac operon. (Original-Deyholos-

CC BY-NC 3.0)
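
The combined effect of negative regulation (lacI) and positive regulation (CAP-cAMP) on a wild-type lac operon can be written out as a small truth table. The Python sketch below is only a qualitative summary of the section above; the function name lac_operon_level and the labels "off", "basal" and "high" are choices made for this illustration.

```python
def lac_operon_level(glucose_high: bool, lactose_present: bool) -> str:
    """Qualitative transcription level of a wild-type lac operon."""
    # Negative regulation: without lactose (allolactose), lacI stays on the operator.
    repressor_on_operator = not lactose_present
    # Positive regulation: cAMP is made only when glucose is low,
    # and CAP needs cAMP in order to bind the CBS.
    cap_on_cbs = not glucose_high
    if repressor_on_operator:
        return "off"
    return "high" if cap_on_cbs else "basal"

# Print all four combinations of the two sugars.
for glucose_high in (False, True):
    for lactose_present in (False, True):
        print(f"glucose high={glucose_high}, lactose present={lactose_present}: "
              f"{lac_operon_level(glucose_high, lactose_present)}")
```

Only the combination of low glucose and available lactose gives the highest level, which matches the two panels of Figure 10.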


___________________________________________________________________________
SUMMARY:
• Regulation of gene expression is essential to the normal development and efficient functioning of cells
• Gene expression may be regulated by many mechanisms, including those affecting transcript
abundance, protein abundance, and post-translational modifications
• Regulation of transcript abundance may involve controlling the rate of initiation and elongation of
transcription, as well as transcript splicing, stability, and turnover
• The rate of initiation of transcription is related to the presence of RNA polymerase and associated
proteins at the promoter.
• RNApol may be blocked from the promoter by repressors, or may be recruited or stabilized at the
promoter by other proteins including transcription factors
• The lac operon is a classic, fundamental paradigm demonstrating both positive and negative regulation
through allosteric effects on trans-factors.
KEY TERMS:
gene expression trans-regulators
transcriptional regulation lacI
operon homotetramer
lactose repressor
glucose inducer
lac operon allosteric
lacZ cAMP binding protein
lacY CAP
lacA CAP binding sequence
β-galactosidase CBS
permease adenylate cyclase
trans-acetylase constitutive
P / promoter Oc / I- / Is
O / operator cis dominant
CBS F-factor / episome
CAP-binding site merozygotes
cis-elements

STUDY QUESTIONS:
1) With respect to the expression of β-galactosidase, what would be the phenotype of each of the following strains of E. coli?
a) I+, O+, Z+, Y+ (no glucose, no lactose)
b) I+, O+, Z+, Y+ (no glucose, high lactose)
c) I+, O+, Z+, Y+ (high glucose, no lactose)
d) I+, O+, Z+, Y+ (high glucose, high lactose)
e) I+, O+, Z-, Y+ (no glucose, no lactose)
f) I+, O+, Z-, Y+ (high glucose, high lactose)
g) I+, O+, Z+, Y- (high glucose, high lactose)
h) I+, Oc, Z+, Y+ (no glucose, no lactose)
i) I+, Oc, Z+, Y+ (no glucose, high lactose)
j) I+, Oc, Z+, Y+ (high glucose, no lactose)
k) I+, Oc, Z+, Y+ (high glucose, high lactose)
l) I-, O+, Z+, Y+ (no glucose, no lactose)
m) I-, O+, Z+, Y+ (no glucose, high lactose)
n) I-, O+, Z+, Y+ (high glucose, no lactose)
o) I-, O+, Z+, Y+ (high glucose, high lactose)
p) Is, O+, Z+, Y+ (no glucose, no lactose)
q) Is, O+, Z+, Y+ (no glucose, high lactose)
r) Is, O+, Z+, Y+ (high glucose, no lactose)
s) Is, O+, Z+, Y+ (high glucose, high lactose)

Use Answer Legend:
+++ Lots of β-galactosidase activity (100%)
++ Moderate β-galactosidase activity (10-20%)
+ Basal β-galactosidase activity (~≤1%)
- No β-galactosidase activity (0%)

2) In the E. coli strains listed below, some genes are present on both the chromosome, and the extrachromosomal F factor episome. The genotypes of the chromosome and episome are separated by a slash. What will be the β-galactosidase phenotype of these strains? All of the strains are grown in media that lacks glucose.
a) I+, O+, Z+, Y+ / I+, Oc, Z-, Y- (high lactose)
b) I+, O+, Z+, Y+ / I+, Oc, Z-, Y- (no lactose)
c) I+, O+, Z-, Y+ / I+, Oc, Z+, Y+ (high lactose)
d) I+, O+, Z-, Y+ / I+, Oc, Z+, Y+ (no lactose)
e) I+, O+, Z-, Y+ / I-, O+, Z+, Y+ (high lactose)
f) I+, O+, Z-, Y+ / I-, O+, Z+, Y+ (no lactose)
g) I-, O+, Z+, Y+ / I+, O+, Z-, Y+ (high lactose)
h) I-, O+, Z+, Y+ / I+, O+, Z-, Y+ (no lactose)
i) I+, Oc, Z+, Y+ / I+, O+, Z-, Y+ (high lactose)
j) I+, Oc, Z+, Y+ / I+, O+, Z-, Y+ (no lactose)
k) I+, O+, Z-, Y+ / I+, Oc, Z+, Y+ (high lactose)
l) I+, O+, Z-, Y+ / I+, Oc, Z+, Y+ (no lactose)
m) I+, O+, Z-, Y+ / Is, O+, Z+, Y+ (high lactose)
n) I+, O+, Z-, Y+ / Is, O+, Z+, Y+ (no lactose)
o) Is, O+, Z+, Y+ / I+, O+, Z-, Y+ (high lactose)
p) Is, O+, Z+, Y+ / I+, O+, Z-, Y+ (no lactose)

3) What genotypes of E. coli would be most useful in demonstrating that the lacO operator is a cis-acting regulatory factor?

4) What genotypes of E. coli would be useful in demonstrating that the lacI repressor is a trans-acting regulatory factor?

5) What would be the effect of the following loss-of-function mutations on the expression of the lac operon?
a) loss-of-function of adenylate cyclase
b) loss of DNA binding ability of CAP
c) loss of cAMP binding ability of CAP
d) mutation of CAP binding site (CBS) cis-element so that CAP could not bind


EUKARYOTIC GENES: STRUCTURE– CHAPTER 07

CHAPTER 07 – EUKARYOTIC GENES: STRUCTURE

Figure 1.
Some genes are expressed in a segmental pattern and dictate the
development of cells in that segment. This tissue specific patterning
happens through the temporal and spatial regulation of these
genes. (Wikipedia-PhiLiP-PD)

INTRODUCTION
While prokaryote protein-coding genes are relatively simple, with a promoter driving a transcribed mRNA sequence (or a multiple protein coding mRNA in the case of an operon), the expression of a eukaryote protein-coding gene is much more complex. There are intron sequences, which are spliced out during processing, or they may be alternately spliced, and there are three levels of transcriptional regulation. All these make the typical eukaryote gene much larger and more complex than the typical prokaryote one.

1. THE EUKARYOTIC GENOME CONTAINS VARIOUS TYPES OF SEQUENCES
There are three main types of sequences in a eukaryote genome: (1) single copy genes, (2) multiple copy genes and (3) repeated sequences. Single copy genes have a single copy in the genome and include most protein-coding genes. Multiple copy genes have multiple copies in the genome and include rRNA- and tRNA-coding genes, and some protein coding genes. Repeated sequences can be either tandem repeats or interspersed repeats (Figure 2). Tandem repeats follow directly one after another, whereas interspersed repeats are scattered randomly. Tandem repeats include (a) short centromeric-tandem arrays, which are sequence repeats at the centromere region and (b) VNTR (Variable Number Tandem Repeats). VNTR include microsatellites (short tandem repeats) and mini-satellites (longer tandem repeats). Interspersed repeats include SINEs (Short Interspersed Elements) and LINEs (Long Interspersed Elements).

Figure 2.
Tandem repeats can be either microsatellites or mini-satellites, and interspersed repeats can be SINEs or LINEs, depending on the length of the repeats.
(Original-Kang- CC BY-NC 3.0)

2. TRANSCRIPTS OF PROTEIN CODING GENES – PROCESSING

2.1. 5’ CAP, POLY(A) TAIL
An mRNA is transcribed by RNA polymerase II using its complementary DNA strand as a template. As it is synthesized, it undergoes processing before transport to the cytoplasm. Here are the major steps during transcription of eukaryote mRNA:
a) mRNA transcript is synthesized by RNA polymerase II.


b) While the mRNA is being synthesized, a 7-methyl guanosine cap is added to the 5’ end by an enzyme called guanylyltransferase.
c) Transcription proceeds past the poly(A) addition site (AATAAA in the DNA, AAUAAA in the transcript).
d) Endonuclease cleaves the mRNA strand 11-30 nucleotides downstream of the AAUAAA signal sequence to create the 3’ end.
e) At this 3’ end, a poly(A) tail of 150-200 nucleotides is added by poly(A) polymerase.
f) This results in an mRNA primary transcript, which is not yet mature.

2.2. INTRON AND EXON
Primary transcripts undergo RNA splicing and are shortened by the removal of intervening sequences called introns before being transported to the cytoplasm. Only the sequences that are retained, called the exons, are joined together to make the mature transcript mRNA. This is done by a large multi-protein structure called the spliceosome, which also contains small nuclear ribonucleoproteins (snRNPs), each built around a small nuclear RNA (snRNA).

Figure 3.
The steps of synthesizing a primary mRNA transcript. (Original-Locke-CC BY-NC 3.0)
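
Steps c) to e) of section 2.1 (reading past the AAUAAA signal, cleaving downstream of it, and adding the poly(A) tail) can be mimicked on a string. The following Python sketch is a toy illustration only: the function name process_3prime_end is invented here, the default cut_offset of 20 nucleotides is simply an arbitrary value inside the 11-30 range given above, and the default tail_length of 150 is the lower end of the stated 150-200 range.

```python
POLY_A_SIGNAL = "AAUAAA"

def process_3prime_end(pre_mrna, cut_offset=20, tail_length=150):
    """Cleave downstream of the first AAUAAA signal and append a poly(A) tail.

    Returns None if there is no signal or the transcript is too short to cut.
    """
    signal_start = pre_mrna.find(POLY_A_SIGNAL)
    if signal_start == -1:
        return None
    # Cleavage site: cut_offset nucleotides past the end of the signal.
    cut = signal_start + len(POLY_A_SIGNAL) + cut_offset
    if cut > len(pre_mrna):
        return None
    # Everything after the cut is discarded; poly(A) polymerase adds the tail.
    return pre_mrna[:cut] + "A" * tail_length

# Hypothetical transcript: the polymerase has read 40 nt past the signal.
pre_mrna = "GGCAUGCC" + POLY_A_SIGNAL + "C" * 40
processed = process_3prime_end(pre_mrna)
print(len(pre_mrna), len(processed))
```

Note that the tail is not encoded in the DNA template; in the sketch, as in the cell, it is added after cleavage.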


Figure 4.
For each intron spliced out, there are three sites that are essential. They are the 5’ donor site,
branch point, and 3’ acceptor site. The number below each nucleotide represents the percentage of
that nucleotide at that site. (Original-Locke-CC BY-NC 3.0)

For each intron on the primary transcript RNA, there exists (1) a 5’ splice donor site, (2) a branch
point A, and (3) a 3’ splice acceptor site. (Note that the directionality in these names (e.g. 5’, 3’)
is referenced to the mRNA sequence.) The snRNA of the spliceosome base pairs with the RNA
sequences at the 5’ splice donor site, and the spliceosome cuts the RNA there. This cut 5’ end of the
intron “donates” or attaches to the branch point A via a 2’-5’ phosphodiester bond and forms a
lariat. Next, a second cut is made at the 3’ splice acceptor site (3’ end of the intron) (Figure 5).
The spliceosome joins the two exon ends together, and the intron sequence is degraded, leaving the
mature mRNA.

Figure 5.
(1) Spliceosome makes a cut at 5’ splice donor site, and (2) the 5’ end of the intron attaches to
branch A point, and (3) the second cut is made at the 3’ splice acceptor site, ultimately forming a
mature RNA. (Original-Kang-CC BY-NC 3.0)

Note: Almost all eukaryote genes have introns, but for some rare genes, like the Heat Shock Protein
70 (HSP70) gene, the primary transcript is the mature mRNA. Prokaryote genes do NOT have introns.

2.3. ALTERNATIVE SPLICING
Many genes have primary transcripts that are processed differently to produce more than one type
of mature mRNA. This is called alternative splicing, which often results in the production of more
than one type of protein product from the same gene (Figure 6). Alternative splicing is another
means of gene regulation, but it happens at a post-transcriptional level.
To further complicate this process, in some organisms it is even possible for exons from different
gene transcripts to be ligated together through a process called trans-splicing. Although rare, an
example comes from the worm, C. elegans, where an identical short leader sequence, the spliced
leader (SL), is trans-spliced onto the 5ʹ ends of multiple mRNAs.
With alternative splicing, it is possible for organisms with 25,000 genes (e.g. humans) to produce
a much larger variety of polypeptides.
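
A rough way to see how alternative splicing multiplies the number of possible products is to count exon combinations. The Python sketch below assumes the simplest case only, exon skipping with exon order preserved, and the exon names and the choice of which exons are always kept are hypothetical. Real genes use other splicing modes as well, and not every combination is necessarily made in a cell.

```python
from itertools import combinations


def enumerate_isoforms(exons, constitutive):
    """List every exon combination that keeps exon order and all constitutive exons."""
    optional = [e for e in exons if e not in constitutive]
    isoforms = []
    for k in range(len(optional) + 1):
        for kept in combinations(optional, k):
            keep = set(kept) | set(constitutive)
            # Rebuild the mRNA in the original 5' to 3' exon order.
            isoforms.append([e for e in exons if e in keep])
    return isoforms


# Hypothetical gene with four exons; the first and last are always included.
exons = ["E1", "E2", "E3", "E4"]
isoforms = enumerate_isoforms(exons, constitutive={"E1", "E4"})
for iso in isoforms:
    print("-".join(iso))
```

With n optional exons this simple model allows up to 2^n mature mRNAs from one gene, which is why a limited number of genes can yield a much larger set of polypeptides.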


Figure 6.
Alternative splicing produces different combinations of exons, which result in different mRNA products, thus different proteins.
(Wikipedia- National Human Genome Research Institute-PD)

3. TRANSCRIPTION REGULATION – PROMOTERS, ENHANCERS/SILENCERS

3.1. PROXIMAL REGULATORY SEQUENCES.
As in prokaryotes, the RNA polymerase binds to the DNA at the gene’s promoter to begin
transcription. In eukaryotes, however, RNApol is part of a large protein complex that includes
additional proteins that bind to one or more specific cis-elements in the promoter region, which
includes GC-rich boxes, CAAT boxes, and TATA boxes. Cis-elements are intramolecular elements that
exist and act within the same DNA molecule. In contrast, trans-elements are intermolecular
elements that are distinct molecules from their target DNA; they can be RNA or proteins. High
levels of transcription require both the presence of this protein complex at the promoter and its
interaction with other trans-factors described below. The approximate position of these elements
relative to the transcription start site (+1) is shown in Figure 7, but it should be emphasized that
the distance between any of these elements and the transcription start site can vary, although it is
typically within ~200 base pairs of the start of transcription. This contrasts with the next set of
elements.

3.2. DISTAL REGULATORY ELEMENTS
Even more variation is observed in the position and orientation of the second major type of
cis-regulatory element in eukaryotes, which are called enhancer elements. Regulatory trans-factor
proteins called transcription factors bind to enhancer sequences. Then, while still bound to DNA,
these proteins interact with RNApol and other proteins at the promoter to enhance the rate of
transcription. There is a wide variety of different transcription factors, and each recognizes a
specific DNA sequence (enhancer element) to promote expression of the adjacent gene under
specific circumstances. Because DNA is a flexible molecule, enhancers can be located near (~100s of
bp) or far (~10,000s of bp), and either upstream or downstream, from the promoter (Figure 7 and
Figure 8).


Figure 7.
Structure of a typical eukaryotic gene. RNA polymerase binding may involve one or more cis-elements within the proximal region
of a promoter (green boxes). Enhancers (yellow boxes) may be located any distance upstream or downstream of the promoter
and are also involved in regulating gene expression. The processing of a primary transcript to a mature mRNA is also shown.
Note: not to scale. (Original-Deyholos- CC BY-NC 3.0)

Figure 8.
A transcription factor (yellow) bound to an enhancer that is located far from a promoter. Because
of the flexibility of the DNA molecule, the transcription factor and RNApol (green) are able to
interact physically, even though the cis-elements to which they are bound are located far apart. In
eukaryotic cells, RNApol is actually part of a large complex of proteins (not shown here) that
assembles at the promoter. (Original-Deyholos- CC BY-NC 3.0)

3.3. EXAMPLE: GAL4-UAS SYSTEM FROM YEAST – A GENETIC TOOL
Yeasts use the Gal regulon to convert galactose to glucose-1-phosphate for glycolysis. Geneticists
have taken advantage of a yeast distal enhancer sequence to make the GAL4-UAS system, a
powerful technique for studying genes in other eukaryotes. It relies on two parts: a “driver” and a
“responder” (Figure 9). The driver part is a gene encoding a yeast transcriptional activator protein
called Gal4. It is separate from the responder part, which contains the enhancer sequence, or
upstream activation sequence (UAS, as it is called in yeast), to which the Gal4 protein specifically
binds to activate the Gal genes. This UAS is placed upstream (using genetic engineering) from a
promoter transcribing a reporter gene, or other gene of interest, such as GFP (green fluorescent
protein).

Both parts must be present in the same cell for the system to express the responder gene. If the
driver is absent, the responder product will not be expressed. However, when both are in the same
cell (or organism), the pattern of expression of the driver part will induce the responder part’s
expression in the same pattern. This system works in a variety of eukaryotes, including humans. It
has been especially well exploited in Drosophila, where >10,000 differently expressing driver lines
are available. These lines permit the tissue-specific expression of any responder gene to examine
its effect on development or cellular functions.
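
The driver/responder logic can be written down as a small Python sketch. The function name and tissue labels are hypothetical. The model assumes an idealized system with no leaky expression: the responder is expressed exactly where the driver makes Gal4, and nowhere if either part is missing.

```python
def responder_tissues(driver_pattern, has_responder):
    """Return the tissues in which the responder gene is expressed.

    driver_pattern: set of tissues where the driver expresses Gal4
                    (an empty set means no driver is present).
    has_responder:  True if the UAS-responder construct is in the genome.
    """
    if not has_responder:
        return set()
    # Gal4 binds the UAS wherever it is made, so the responder copies the pattern.
    return set(driver_pattern)


# Hypothetical cross: a wing-specific driver line and a UAS-GFP responder line.
print(responder_tissues({"wing"}, has_responder=True))
print(responder_tissues(set(), has_responder=True))
print(responder_tissues({"wing"}, has_responder=False))
```

Swapping the driver pattern or the responder gene changes the outcome independently, which is the modular property that makes the system useful.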


Figure 9.
The GAL4-UAS system. The driver, with a wing enhancer, expresses the Gal4 protein that then binds
to the UAS element upstream of a marker gene, GFP. This would express the GFP in the wing tissues.
The modular aspect of this system would let the wing enhancer be replaced by any other enhancer
and the GFP marker replaced with any other gene. (Original-Locke- CC BY-NC 3.0)

4. HIGHER ORDER CHROMATIN - ADDITIONAL LEVELS OF REGULATING TRANSCRIPTION
Eukaryotes regulate transcription via promoter sequences close to the transcription unit (as in
prokaryotes) and also use more distant enhancer sequences to provide more variation in the timing,
level, and location of transcription. However, there are still additional levels of genetic control.
These consist of two major mechanisms: (1) large-scale changes in chromatin structure, and (2)
modification of bases in the DNA sequence. These two are often inter-connected.

4.1. CHROMATIN DYNAMICS
Despite the simplified way in which we often represent DNA in figures such as those in this
chapter, DNA is almost always associated with various chromatin proteins. For example, histones
remain associated with the DNA even during transcription. Thus the rate of transcription is also
controlled by the accessibility of DNA to RNApol and regulatory proteins. So, in regions where the
chromatin is highly compacted, it is unlikely that any gene will be transcribed, even if all the
necessary cis- and trans- factors are present in the nucleus. The extent of chromatin compaction in
various regions is regulated through the action of chromatin remodeling proteins. These protein
complexes include enzymes that add or remove chemical tags, such as methyl or acetyl groups, to
various DNA-bound proteins. These modifications alter the local chromatin density and thus the
availability for transcription. Acetylated histones, for example, tend to be associated with actively
transcribed genes, whereas deacetylated histones are associated with genes that are silenced
(Figure 10).

Figure 10.
Acetylation of histone proteins is associated with a more open chromatin configuration.
Acetylation is a reversible process. (Original-Deyholos- CC BY-NC 3.0)

4.2. MODIFICATION OF DNA BASES
Likewise, methylation of DNA itself is also associated with transcription regulation. Cytosine
bases, particularly when followed by a guanine (CpG sites), are important targets for DNA
methylation (Figure 11). Methylated cytosine within clusters of CpG sites is often associated with
transcriptionally inactive DNA.

Figure 11.
The methylation reaction shown here produces 5-methylcytosine (5mC). Methyl groups may also be
removed by various processes. (Original-Deyholos- CC BY-NC 3.0)

The modification of DNA and its associated proteins is enzymatically reversible
(acetylation/deacetylation; methylation/demethylation) and thus a cyclical activity. Regulation of
this provides another layer


through which eukaryotic cells control the transcription of specific genes.

5. EPIGENETICS

5.1. THE BACKGROUND OF EPIGENETICS
The word “epigenetics” has become popular in the last decade and its meaning has become confused.
The term epigenetics describes any heritable change in phenotype that is not associated with a
change in the chromosomal DNA sequence. Originally it meant the processes through which the
genes were expressed to give the phenotype; that is, the changes in gene expression that occur
during normal development of multicellular organisms. This includes the change in transcriptional
state of a DNA sequence (gene) via reversible modifications of DNA or chromatin proteins. Thus,
DNA methylation and chromatin protein methylation, phosphorylation, and acetylation have been
targeted as mechanisms for “heritable” changes in cells as they grow from a single cell (zygote) and
differentiate to a multicellular organism. Here, dividing cells commit to differentiate into different
tissues such as muscle, neuron, and fibroblast due to the genes that they express or silence. Some
genes are irreversibly silenced, through epigenetic mechanisms, in some cell types, but not in
others. None of this involves any change in DNA sequence.
Remember, these epigenetic effects are not permanent changes and thus are not selectable in an
evolutionary context. However, mutations in the genes that regulate the epigenetic effect can be
selected.

5.2. SOME HERITABLE INFORMATION CAN BE PASSED ON INDEPENDENT OF THE DNA SEQUENCE
More recently, however, researchers have found many cases of environmentally induced changes in
gene expression that can be passed on to the next generation – a potential multi-generational
effect. These cases have also been called “epigenetics”, and probably involve similar reversible
changes to the DNA and chromatin proteins. These altered expression patterns represent the
diversity of expression for a genome. This “extended” phenotype, the ability to influence traits in
the next generation, is a topic of current research and only some examples will be discussed here.
One example comes from the grandchildren of famine victims. They are known to have lower birth
weight than children without a family history of famine. This heritability of an altered state of
gene expression is surprising, since it appears not to involve typical changes in the sequence of
DNA. The term epigenetics is applied here since the apparently heritable change in phenotype is
associated with something other than DNA sequence.
This change is inherited from one generation to the next and is thus transgenerational, for at least
one generation. In developmental epigenetics, the expression state (developmentally differentiated
state) is conserved only from one mitosis to the next, but is erased or reset at meiosis (the
boundary of one generation to the next). The basis of at least some types of epigenetic inheritance
appears to be replication of patterns of histone and DNA methylation that occurs in parallel with
the replication of the primary DNA sequence. The permanence of this “epigenetic change” is not the
same as changes in the DNA sequence itself. What is clear is that epigenetics is an important part
of regulating gene expression, and can serve as a type of cellular memory, certainly within an
individual, or across a few generations in some cases. It is becoming clear that epigenetics is an
important part of biology.

5.3. IMPRINTING AND PARENT-OF-ORIGIN EFFECTS
For some genes, the allele inherited from the female parent is expressed differently than the
allele that is inherited from the male parent. This is distinct from sex-linkage and is true even if
both alleles are wild-type and autosomal. During gamete development (gametogenesis), each
parent imprints epigenetic information on some genes that will affect the activity of the gene in
the offspring. Imprinting does not change the DNA sequence, but does involve methylation of DNA
and histones, and generally silences the expression of one of the parent’s alleles. In humans, some


genes are expressed only from the paternal allele, and other genes are expressed only from the
maternal allele. The imprinting marks are reprogrammed before the next generation of gametes is
formed. Thus, although a male inherits epigenetic information from both his mother and father,
this information is erased before sperm development, and he passes only one pattern of imprinting
to both his sons and daughters. Most examples of imprinting come from placental mammals, and
many imprinted genes control growth rate, such as IGF2 (insulin-like growth factor 2).
Imprinting appears to explain many different parent-of-origin effects. For example, Prader-Willi
Syndrome (PWS) and Angelman Syndrome (AS) are two phenotypically different conditions in
humans that result from deletion of a specific region of chromosome 15, which contains several
genes. Whether the deletion results in PWS or in AS depends on the parent-of-origin. If the deletion
is inherited from the father, PWS results. Conversely, if the deletion is inherited from the mother,
AS is the result. The gene(s) associated with PWS is maternally silenced by imprinting; therefore
the deletion of its paternally-inherited allele results in a complete deficiency of a required
protein. On the other hand, the paternal allele of the gene involved in AS is silenced by imprinting,
so deletion of the maternal allele results in deficiency of the protein encoded by that gene.

5.4. TRANSGENERATIONAL INHERITANCE OF NUTRITIONAL INFLUENCES
Nutrition is one aspect of the environment that has been particularly well-studied from an
epigenetic perspective in both mice and humans. People alive today who experienced the Dutch
famine of 1944-1945 as fetuses have IGF2 genes that are less methylated than those of their
siblings. Methylation of IGF2 (and birth weight) is also lower in children of mothers who do not
take folic acid supplements compared with those who do. Furthermore, an individual’s phenotype
can be influenced by the nutrition of parents or even grandparents. This transgenerational
inheritance of nutritional effects appears to involve epigenetic mechanisms.
The mouse agouti gene produces a signaling molecule that regulates pigment-producing cells and
brain cells that affect feeding and body weight. Normally, agouti is silenced by methylation, and
these mice are brown and have a normal weight. When agouti is demethylated by feeding certain
chemicals or by mutating a gene that controls methylation, some mice become yellow and
overweight, although their DNA sequence remains unchanged. Methylation of agouti and normal
weight and pigmentation of offspring can be restored if their mothers are fed folic acid and other
vitamins during pregnancy.
A study of an isolated Swedish village called Överkalix provides an example of transgenerational
inheritance of nutritional factors. Detailed historical records allowed researchers to infer the
nutritional status of villagers going back to 1890. The researchers then studied the health of two
generations of these villagers’ offspring, using medical records. A significant correlation was
found between the mortality risk of grandsons and the food availability of their paternal
grandfathers. This effect was not seen in the granddaughters. Furthermore, the nutrition of
paternal grandmothers, or of either of the maternal grandparents, did not affect the health of the
grandsons. It was therefore proposed that epigenetic information affecting health (specifically
diabetes and heart disease) was passed from the grandfathers to the grandsons through the male
line.

5.5. VERNALIZATION AS AN EXAMPLE OF EPIGENETICS
Many plant species in temperate regions are winter annuals, meaning that their seeds germinate in
the late summer, and grow vegetatively through early fall before entering a dormant phase during
the winter, often under a cover of snow. In the spring, the plant resumes growth and is able to
produce seeds before other species that germinated in the spring. In order for this life strategy to
work, the winter annual must not resume growth or start flower production until winter has ended.
Vernalization is the name given to the requirement to experience a long period of cold
temperatures prior to flowering.


How does a plant sense that winter has passed? The signal for resuming growth cannot simply be
warm air temperature, since occasional warm days, followed by long periods of freezing, are common
in temperate climates. Researchers have discovered that winter annuals use epigenetic mechanisms
to sense and “remember” that winter has occurred.

Figure 12.
A winter wheat crop (green) in early spring in the English countryside.
(Flickr-Beardy Git- CC BY-NC-ND 2.0)

Fortunately for the researchers who were interested in vernalization, some varieties of
Arabidopsis are winter annuals. Through mutational analysis of Arabidopsis, researchers found that
a gene called FLC (FLOWERING LOCUS C) encodes a transcription repressor acting on several of the
genes involved in early stages of flowering (Figure 13). In the fall and under other warm
conditions, the histones associated with FLC are acetylated and so FLC is transcribed at high levels;
expression of flowering genes is therefore entirely repressed. However, in response to cold
temperatures, enzymes gradually deacetylate the histones associated with FLC. The longer the cold
temperatures persist, the more acetyl groups are removed from the FLC-associated histones, until
finally the FLC locus is no longer transcribed and the flowering genes are free to respond to other
environmental and hormonal signals that induce flowering later in the spring. Because the
deacetylated state of FLC is inherited as cells divide and the plant grows in the early spring, this is
an example of a type of cellular memory mediated by an epigenetic mechanism.

Figure 13.
In the autumn, histones associated with FLC are acetylated, allowing this repressor of flowering
genes to be expressed. During winter, enzymes progressively deacetylate the histones associated
with FLC, preventing it from being expressed, and therefore allowing flowering genes to respond to
other signals that induce flowering. (Original-Deyholos- CC BY-NC 3.0)

6. EXAMPLES OF CHROMATIN STRUCTURE: X-CHROMOSOME INACTIVATION

6.1. MAMMALIAN X-CHROMOSOME INACTIVATION – CALICO CATS, HUMAN EXAMPLE
In mammals, the dosage compensation system operates in females, not males. In XX embryos, one
X in each cell is randomly chosen and marked for inactivation. From this point forward this
chromosome will be inactive, hence its name X-inactive (Xi). The other X chromosome, the X-active
(Xa), is unaffected. The Xi is replicated during S phase and transmitted during mitosis the same as
any other chromosome, but most of its genes are never expressed. The chromosome appears as a
condensed mass within interphase nuclei called the Barr body. With the inactivation of genes on
one X-chromosome, females have the same number of functioning X-linked genes as males.

This random inactivation of one X-chromosome leads to a commonly observed phenomenon in cats.
A familiar X-linked gene is the Orange gene (O) in cats. The O^O allele encodes an enzyme that
results in orange pigment for the hair. The O^B allele causes the hairs to be black. The phenotypes
of various genotypes of cats are shown in Figure 14. Note that the heterozygous females have an
orange and black mottled phenotype known as tortoiseshell.


This is due to patches of skin cells having different X-chromosomes inactivated. In each orange hair
the Xi chromosome carrying the O^B allele is inactivated. The O^O allele on the Xa is functional
and orange pigments are made. In black hairs the reverse is true: the Xi chromosome with the O^O
allele is inactive and the Xa chromosome with the O^B allele is active. Because the inactivation
decision happens early during embryogenesis, the cells continue to divide to make large patches on
the adult cat skin where one or the other X is inactivated.

Figure 14.
Relationship between genotype and phenotype for an X-linked gene in cats. The O^O allele = orange
while the O^B allele = black. (Original-Harrington- CC BY-NC 3.0)

6.2. FACTOR VIII BLOOD CLOTTING PROTEINS
Another example of mammalian X-inactivation involves the F8 gene in humans. It makes Factor VIII
blood clotting proteins in liver cells. If a male is hemizygous for a mutant allele the result is
hemophilia type A. Females homozygous for mutant alleles will also have hemophilia. Heterozygous
females, F8+/F8-, do not have hemophilia because even though half of their liver cells do not make
Factor VIII (because the X with the F8+ allele is inactive) the other 50% can (Figure 15). Because
some of their liver cells are exporting Factor VIII proteins into the blood stream, they have the
ability to form blood clots throughout their bodies. Even though their liver cells are a genetic
mosaic, this does not produce a visible mosaic phenotype.

Figure 15.
This figure shows the two types of liver cells in females heterozygous for an F8 mutation. Because
people with the F8+/F8- genotype have the same phenotype, normal blood clotting, as F8+/F8+
people, the F8- mutation is classified as recessive. (Original-Harrington/Locke- CC BY-NC 3.0)
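
The "about half of the cells" expectation for a heterozygous female can be illustrated with a short simulation. This Python sketch is a deliberately crude model: it treats every cell as making an independent 50/50 choice of which X stays active, whereas in the embryo the choice is made early and inherited by patches of daughter cells. The function name, cell count, and seed are arbitrary assumptions.

```python
import random


def simulate_x_inactivation(n_cells, seed=0):
    """Return the fraction of cells whose active X carries the F8+ allele."""
    rng = random.Random(seed)
    # In each cell one X is inactivated at random; the other stays active (Xa).
    active_alleles = [rng.choice(["F8+", "F8-"]) for _ in range(n_cells)]
    producing = sum(1 for allele in active_alleles if allele == "F8+")
    return producing / n_cells


fraction = simulate_x_inactivation(10_000)
print(f"Fraction of liver cells making Factor VIII: {fraction:.2f}")
```

Because Factor VIII is secreted into the blood, the cells that do make it can supply the whole body, so the mosaic is not visible. In the cat example the pigment stays in the cell that makes it, so the same mosaic shows up as coloured patches.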


___________________________________________________________________________
SUMMARY:
• In a eukaryotic genome, there are single-copy and multi-copy genes and repeated sequences that are
subdivided into various categories. Repeated sequences include tandem and interspersed repeats of
varying lengths.
• The primary mRNA transcript undergoes modification and processing before being exported to
the cytoplasm: addition of a 5' cap, addition of a poly(A) tail, and splicing.
• Alternative splicing allows a large number of products (proteins) to be made from a limited number
of genes.
• In eukaryotes, enhancers are bound by specific trans-factors, which interact with RNA polymerase
and additional proteins at the promoter to regulate the initiation of transcription.
• The GAL4-UAS system from yeast uses a driver (the transcriptional activator Gal4) and a responder
(the enhancer sequence, or upstream activation sequence, UAS, placed upstream of a reporter or
other gene of interest) to express a gene in a chosen pattern in other eukaryotes.
• Chromatin structure, including reversible modifications such as acetylation of histones and
methylation of DNA at CpG sites, also regulates the initiation of transcription.
• Chromatin modifications or DNA methylation of some genes are heritable over many mitotic, and
sometimes even meiotic, divisions and provide a higher level of transcription regulation.
• During gamete development, each parent imprints epigenetic information on some genes that will
affect the activity of the gene in the offspring.
• Heritable changes in phenotype that do not result from a change in DNA sequence are called
epigenetics. Many epigenetic phenomena involve regulation of gene expression by chromatin
modification and/or DNA methylation.
• When there are two X chromosomes in a female, X-inactivation compensates for overdosage;
examples are calico cats and Factor VIII blood clotting protein in humans.
KEY TERMS:
Single copy genes
Multiple copy genes
Repeated sequences
primary transcript
RNA splicing
introns
exons
mature transcript
spliceosome
lariat
alternative splicing
trans-splicing
GC boxes / CAAT boxes / TATA boxes
Cis-elements
Trans-elements
transcription start site
enhancer elements
transcription factors
GAL4-UAS
Driver/responder
chromatin remodeling
acetylation/deacetylation
methylation/demethylation
CpG sites
epigenetics
transgenerational
gametogenesis
imprint
parent-of-origin
Prader-Willi Syndrome (PWS)
Angelman Syndrome (AS)
agouti
winter annuals
vernalization
X-inactive (Xi) / X-active (Xa)
Barr body


STUDY QUESTIONS:
1) List all the mechanisms that can be used to
regulate gene expression in eukaryotes.
2) How are eukaryotic and prokaryotic gene
regulation systems similar?
How are they different?
3) Histone deacetylase (HDAC) is an enzyme
involved in gene regulation. What might be the
phenotype of a winter annual plant that lacked
HDAC function?


EUKARYOTE GENES: HUMAN BETA-GLOBIN GENES – CHAPTER 08

CHAPTER 08 – EUKARYOTE GENES: HUMAN BETA-GLOBIN GENES

Figure 1.
Image of red blood cells (red), platelets (green) and T cells (orange) with a scanning electron
microscope. The cells in this image are artificially coloured.
(Flickr-ZEISS Microscopy-CC BY-NC-ND 2.0)

INTRODUCTION
Genes encoding the globin polypeptides (component of red blood cell - Figure 2) are found in most
species of higher eukaryotes. The human β-globin genes can be used as an example of a classic
eukaryotic gene because they have most of the features needed for understanding basic eukaryote
gene structure, expression, and regulation. There are several globin genes in each cluster and each
gene is expressed as part of a developmental program. Each gene’s polypeptide product functions
as part of a multimer protein.

1. BETA-GLOBIN – PROTEIN AND GENE STRUCTURE, CLUSTERS, PSEUDO-GENES

1.1. HEMOGLOBIN IS A HETEROTETRAMER WITH 2 α-GLOBIN AND 2 β-GLOBIN POLYPEPTIDES
Hemoglobin is an oxygen-transporting or storing protein. This protein, or something similar, is
found in most animals and many plants. In higher vertebrates, hemoglobin is a component of red
blood cells (erythrocytes) that transports O2 from the environment to the body cells.
The hemoglobin protein usually exists as a heterotetramer of four non-covalently bound
hemoglobin polypeptides (Figure 2). In adults, each hemoglobin protein consists of a dimer of
α-globin and another dimer of β-globin polypeptides, each with a bound heme molecule. Together
these four form the hemoglobin tetramer = 2 α-globin-like + 2 β-globin-like polypeptides. The
heme molecule is made through an independent metabolic pathway and then bound to the globin
polypeptide through the iron (Fe) ion. The Fe is attached by a coordinate covalent bond as a
post-translational modification to the polypeptide.


Figure 2.
A heterotetramer of human hemoglobin, type α2β2. The α chains are labeled red, and the β chains
are labeled blue. The heme groups are green. (Wikipedia-Zephyris- CC BY-SA 3.0)

1.2. HUMAN GLOBIN GENES ARE FOUND IN TWO CLUSTERS
α- and β-like genes form a family of genes in most vertebrates. The positions of the introns/exons
of the α- and β-globin genes are very similar. Each gene has 3 exons and 2 introns. Comparison
with α- and β-globin genes from other species shows that the intron positions are conserved. Both
polypeptides are similar, too. Human α-globin is 141 amino acids long and β-globin is 146 amino
acids long.
Note that there are other proteins that are similar to α-globin, such as zeta (ζ)-globin protein.
There are also β-globin-like proteins such as epsilon (ε), gamma (γ) and delta (δ) globin proteins.
In a tetramer, the zeta can take the place of the alpha, while the epsilon, gamma, or delta can take
the place of the beta – the ratio is always 2 α-like to 2 β-like polypeptides. Each of these types of
proteins (α- vs β-) is encoded by different genes. Each cluster of genes is referred to as a “locus” –
the α-globin locus and the β-globin locus. Each set of genes is located as a single gene cluster
(Figure 3).
In humans, the beta-globin cluster is located on chromosome 11 and includes 5 genes: epsilon,
G-gamma, A-gamma, delta and beta genes. The alpha-globin cluster is located on chromosome 16
and includes 3 genes: zeta, alpha-1 and alpha-2 genes.
Other vertebrates have similar clusters of α- and β-like genes. These clusters have arisen through
a series of duplications of an ancestral globin gene. In general, gene duplication events can occur
through rare errors in normal processes such as DNA replication, meiosis (crossing over), or
transposition. Through time, the duplicated genes can accumulate mutations independently of each
other. Mutations can occur in either the regulatory regions (e.g. promoter regions), or in the
coding regions, or both. In this way, the current globin genes have evolved promoters that are
active at different phases of development, producing proteins optimized for the prenatal or
postnatal environment.

1.3. PSEUDO-GENES
Of course, not all mutations are beneficial: some mutations can lead to inactivation of one or more
of the products of a gene duplication event. This can result in what is called a pseudogene.
Examples of pseudogenes (ψ) are also found within the globin clusters. Pseudogenes have mutations
that prevent them from being expressed. They frequently lack the cis-acting regulatory elements
(promoter and enhancer sequences) that are required for expression, but still retain similarity in
the protein coding sequences, which permits their identification

Figure 3.
Fragments of human chromosome 11 and human chromosome 16 on which are located clusters of β-like and α-like globin genes,
respectively. Additional globin genes (θ theta, µ mu) have also been described by some researchers, but are not shown here.
(Wikipedia –Modified by Kang- CC BY-NC 3.0)


as a globin gene. The ψ (psi) symbol represents the designation as a pseudogene. The globin genes
provide an example of how gene duplication and mutation, followed by selection, allow genes to
evolve specialized expression patterns and functions. In general, many genes have evolved as gene
families in this way, although they are not always clustered together as are the globins.

2. HEMOGLOBIN EXPRESSION CHANGES DURING DEVELOPMENT IN HUMANS.
In humans the composition of the globin tetramer changes during development (Figure 4). There are
3 distinct time periods that differ in globin gene expression. In embryos, ζ2ε2 (zeta, epsilon) is the
most abundant type, which means the globin tetramer contains two copies each of the zeta and
epsilon proteins, which are similar but slightly different from each other. Next, in fetuses, α2γ2
(alpha, gamma) is the most abundant form. From early childhood onward, most tetramers are of the
type α2β2 (alpha, beta). A small amount of adult hemoglobin is α2δ2 (alpha, delta), which has δ
globin instead of the more common β globin. Although the six globin proteins (α = alpha, β = beta,
γ = gamma, δ = delta, ε = epsilon, ζ = zeta) are very similar to each other, they do have slightly
different functional properties. For example, fetal hemoglobin α2γ2 has a higher oxygen affinity
than adult hemoglobin, allowing the fetus to more effectively extract oxygen from maternal blood,
which is α2β2. The specialized γ globin genes that are characteristic of fetal hemoglobin are found
only in placental mammals.
Note that in humans the developmental changes in gene expression from zeta to alpha and from
epsilon to gamma to beta parallel the location along the chromosome. This correlation is found in
other species with clusters of globin genes, although not as rigidly.

Figure 4.
Expression of globin genes during prenatal and postnatal development in humans. The organs where the globin genes are primarily
expressed at each developmental stage are also indicated on top.
Data: Wood, W.G. 1976 Br. Med. Bull. 32, 282
Original: (Wikipedia-Furfur- CC BY-SA 3.0)
Derivative work/Translation: (Wikipedia-Leonid2- CC BY-SA 3.0)
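
The stage-by-stage switching described above can be written out as a small lookup. This is only an illustrative sketch: the stage labels and the function name `tetramer` are assumptions made for the example, and only the most abundant tetramer at each stage is listed (the minor adult form with delta globin is left out).

```python
# Most abundant hemoglobin tetramer at each developmental stage, as described in the text.
# Each entry is (alpha-like chain, beta-like chain); the stage names are illustrative labels.
STAGE_TO_CHAINS = {
    "embryo": ("zeta", "epsilon"),
    "fetus": ("alpha", "gamma"),
    "adult": ("alpha", "beta"),  # "from early childhood onward" in the text
}

GREEK = {
    "alpha": "α", "beta": "β", "gamma": "γ",
    "delta": "δ", "epsilon": "ε", "zeta": "ζ",
}


def tetramer(stage: str) -> str:
    """Return the tetramer formula (two copies of each chain) for a stage."""
    alpha_like, beta_like = STAGE_TO_CHAINS[stage]
    return f"{GREEK[alpha_like]}2{GREEK[beta_like]}2"


for stage in STAGE_TO_CHAINS:
    print(stage, tetramer(stage))
```

Each tetramer always pairs one α-like chain type with one β-like chain type, which is why the lookup stores the two chains separately.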


3. LOCUS CONTROL REGION (LCR) – ANOTHER LEVEL OF REGULATION

The transcription of the globin genes is controlled at multiple levels. First, the promoter dictates
the mRNA start position and provides a basal level of transcription. Second, there are
enhancer/silencer elements that act on the promoter to determine tissue- and temporal-specific
transcription. Third, the β-globin locus is also regulated by higher order chromatin structure changes.

At the chromatin level, there is a region upstream from the cluster of β-globin genes that regulates
and controls the expression of all the genes in the cluster. It is called a locus control region
(LCR). The LCR can transcriptionally activate distal globin genes, but its exact mechanism has not
been fully identified. There are many models proposed to explain how the LCR works. In one of them,
the LCR forms a loop and interacts with transcription factors to form a complex called an
enhanceosome. During development, this complex associates with other transcription factors in a
sequential manner. The LCR contains sequences that regulate the conformation of chromatin for all the
adjacent globin genes (Figure 5).

The change in conformation is recognized through differences in this region’s sensitivity to added
nucleases. The LCR contains 4 nuclease hypersensitive sites (HS4, HS3, HS2, and HS1) that can be
detected when isolated nuclei are treated with added nucleases. If the DNA is in a “closed”
conformation the nucleases cannot cleave the DNA. If the DNA is in an “open” conformation the
nucleases can cleave the DNA (Figure 6). This nuclease sensitivity assay is done in vitro and reflects
the open/closed chromatin found in vivo. Also, these hypersensitive sites aid in the recruitment of
molecular factors that are needed for transcription; each hypersensitive region independently
affects, in an additive fashion, the activation of gene expression. Note that an LCR does not
necessarily exist at a single site as it does at the β-globin locus; in other cases, an LCR can be
found in multiple sites.

This kind of regional change in chromatin conformation permits the various globin genes to be
regulated and expressed by their own individual promoters and enhancers, while also being subject to
dynamic chromatin regulation.

The LCR-dependent chromatin changes develop in erythroid precursor cells long before any globin gene
is expressed. They begin with the opening (becoming nuclease sensitive) of the sites 5’ to the globin
genes; sites that are 3’ open later on. Thus the change in chromatin structure at this locus is a
developmentally planned series of events.

A deletion mutation that removes the LCR prevents the 5’ site from forming and also prevents the
subsequent 3’ site formation; thus the 5’ site is needed for the 3’ site to gain nuclease sensitivity.
Note that the LCR sites must be present in order to activate the expression of globin genes; cells
that are not red blood cell precursors do not open this region of chromatin, so the β-globin genes
are not expressed.

Figure 5.
Diagram showing the role of the LCR in
development. The LCR regulates which
of the globin genes in the cluster is
expressed at different times during
development of the red blood cell.
(Original-Kang-CC BY-NC 3.0)


Figure 6.
The upper part of this diagram represents the nuclease insensitive chromatin form where the nucleases
(scissors) cannot access or cleave the DNA. The lower part represents the nuclease hypersensitive
chromatin form where the nucleases can access and cleave the DNA. The sensitivity/insensitivity of
the DNA to nucleases is just a method to reveal the difference in chromatin structure along the DNA
molecule. Some areas are “open” (sensitive) while others are “closed” (insensitive).
(Original-Locke- CC BY-NC 3.0)

4. ADDITIONAL INFORMATION-MYOGLOBIN

Globin gene expression is tissue-specific; α- and β-like globin genes are expressed in red blood cell
precursors. A different globin-like gene, myoglobin, is expressed only in muscle cells. Just like
hemoglobin, myoglobin is an oxygen-binding protein; it acts as temporary oxygen storage in muscle
cells. The main difference between myoglobin and hemoglobin is that hemoglobin is mainly found in the
blood stream, whereas myoglobin is found only in skeletal and heart muscles, where it provides oxygen
for metabolically active cells. Therefore, myoglobin can act as a biomarker for detecting muscle
injuries: high concentrations of myoglobin in the bloodstream indicate that damaged muscle cells have
released their contents. Also, myoglobin has a higher affinity for oxygen than hemoglobin; this
feature allows myoglobin to take up oxygen from hemoglobin.


__________________________________________________________________________
SUMMARY:
• Hemoglobin is a tetramer that transports and stores oxygen; it is composed of 2 α-globin-like + 2 β-globin-like
polypeptides. α-globin can be replaced by zeta (ζ) globin, and β-globin can be replaced by
epsilon (ε), gamma (γ) and delta (δ) globin proteins. Each protein has a different affinity for oxygen.
• Pseudogenes are versions of a normal gene that frequently lack the cis-acting regulatory elements but
still possess the protein coding sequences.
• Expression can be tissue specific; globin is expressed in red blood cells and myoglobin is expressed only
in muscle cells.
• Expression is developmentally specific; each globin gene is expressed at a limited time during development.
• Expression is coordinately controlled; alpha and beta genes are expressed to the same level so that there
is a 1:1 ratio of globin polypeptides.
• Promoter, enhancer/silencer elements, and the locus control region regulate gene expression.
• Hemoglobin can be found in the blood stream, and myoglobin is only expressed in the muscles.
Myoglobin acts as temporary oxygen storage for metabolically active cells and has a higher affinity for
oxygen than hemoglobin.
KEY TERMS:
hemoglobin/heme/α, β globin
gene duplication
gene families
post-translational modification
ζ₂ε₂ / α₂γ₂ / α₂β₂ / α₂δ₂
zeta (ζ)-globin
locus control region (LCR)
epsilon (ε)/gamma (γ)/delta (δ) globin
myoglobin
locus
pseudogene


STUDY QUESTIONS:
1) The various α- and β-globin genes are expressed
at various times during development (embryo,
fetus, adult). This might reflect their various
physiological roles during development. What
might those roles be, and how might this be
tested?
2) Why might the α-globin and β-globin genes have
the same intron/exon structure?
3) Draw a simple cartoon showing the organization
of the globin polypeptides in the functional
hemoglobin molecule.
4) Figure 3 shows the organization of the human
globin genes on chromosomes 11 and 16. The
figure lacks a scale bar. Search the internet for a
similar figure and show the length of 10 kbp on
your figure.
5) Some adults have a condition called
hereditary persistence of fetal hemoglobin
(HPFH). What causes it? Does it affect their
health?


CHAPTER 09 – EUKARYOTIC GENES:

THE HUMAN LACTASE (LCT) GENE
Figure 1.
Unlike most dairy products,
these lack the sugar lactose.
(Flickr-USDAgov-CC BY 2.0)

INTRODUCTION

Young mammals get nourishment from their mother's milk. Human milk for example contains 4% fat, 1%
protein, and 7% carbohydrates. There are 30+ different types of carbohydrates but the most abundant
is the disaccharide lactose ("milk sugar"). How does the infant digest this lactose? When the lactose
reaches the start of the small intestine it encounters an enzyme called Lactase. As shown in Figure 2,
Lactase is a membrane protein on the surface of intestinal epithelial cells. Lactase performs a
hydrolysis reaction, which separates the disaccharide into two monosaccharides, galactose and glucose.
These are then imported into the cells by a second membrane protein named the Sodium-Glucose
Transporter 1 (SGLT1). It transports some monosaccharides (glucose and galactose) but not others
(fructose for example). As its name suggests, it is powered by a sodium gradient: there are more
sodium ions outside the cell than inside. Each time the protein allows two sodium ions to enter, one
monosaccharide can be imported. Once inside the cell some of the sugars are consumed by the cell
itself. Most however are transported a second time at the other side of the cell where they enter the
blood. Once in the circulatory system the sugars will travel to the other cells of the body.

Figure 2.
Lactose import by a human intestinal epithelial cell.
(Original-Harrington-CC BY-NC 3.0)


1. THE LACTASE PROTEIN

The purposes of this chapter are (i) to review how eukaryotic genes make proteins and (ii) to
demonstrate the importance of mutations in human evolution. Let's start with the protein itself. As
shown in Figure 2, Lactase is a plasma membrane protein. Like many others it is synthesized by
ribosomes at the rough ER, modified by enzymes in the Golgi apparatus, and finally moved in a
transport vesicle to the surface of the cell. The mature protein is 1060 amino acids long. Most of the
protein, including its active site, is outside the cell (Figure 3). It has a single trans-membrane
domain, which anchors it in the membrane. There is also a small cytosolic portion at the carboxyl-end
of the protein.

Figure 3.
The Lactase protein. Its trans-membrane domain is shown here in purple.
(Original-Harrington-CC BY-NC 3.0)

2. THE LCT GENE AND MRNA

In humans, the gene that encodes Lactase is called LCT and is found on chromosome 2. Its cytogenetic
location is 2q21 (chromosome 2, long arm, region 2, band 1; Figure 4). Because chromosome 2 is an
autosome everyone has two copies of the LCT gene, one on their maternal chromosome 2 and one on their
paternal chromosome 2. Information about this and other human genes can be found on the Online
Mendelian Inheritance in Man website. This website is easy to find if you search for "OMIM".

Figure 4.
The location of the LCT gene on Homo sapiens chromosome 2 (HSA2).
(Original-Harrington-CC BY-NC 3.0)

The LCT gene is about 55 kilobases (kb) long (Figure 5). When it is transcribed, RNA polymerases
travel along the DNA from the promoter through the AATAAA polyadenylation site. The RNA made is known
as a pre-mRNA (or primary transcript) as it still requires processing. For this particular pre-mRNA,
16 introns are removed yielding a much shorter mature mRNA. The LCT mature mRNA contains a 5' cap,
6274 nucleotides, and a poly(A) tail (Figure 6).

Figure 5.
The LCT gene and pre-mRNA. (Original-Harrington-CC BY-NC 3.0)

Figure 6.
The LCT mature mRNA. (Original-Harrington-CC BY-NC 3.0)

The structure of the LCT mRNA is typical (Figure 7). It has untranslated regions (UTRs) at each end
and a 5784 nucleotide long coding sequence in the middle. Since three nucleotides are equal to one
codon, the coding sequence holds 1928 codons; the last is the stop codon, so this mRNA encodes a
protein that is 1927 amino acids long. However, recall from earlier in this chapter that the mature
protein is only 1060 amino acids long.


Figure 7.
A typical human mRNA shown approximately to scale. In the LCT mRNA the lengths of the 5'UTR, coding sequence, and 3'UTR are
11, 5784, and 479 nucleotides, respectively. Codon 1 is AUG = methionine/start and codon 1928 is UGA = stop.
(Wikipedia-Daylite-PD)
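
The lengths quoted in this caption can be checked with a few lines of arithmetic. This is a minimal sketch using only the numbers given in the chapter; the variable names are made up for the example.

```python
# Lengths (in nucleotides) quoted in the Figure 7 caption for the LCT mature mRNA.
UTR5, CDS, UTR3 = 11, 5784, 479

total_nt = UTR5 + CDS + UTR3   # the 5' cap and poly(A) tail are not counted here
codons = CDS // 3              # three nucleotides per codon
amino_acids = codons - 1       # the final codon (UGA) is a stop codon, not an amino acid

print(total_nt, codons, amino_acids)
```

The total matches the 6274 nucleotides stated for the mature mRNA, and the codon count explains why codon 1928 is the stop codon while the encoded protein has 1927 amino acids.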

The resolution to this mystery comes from how the Lactase protein is made (Figure 8). When the
ribosome first binds to the mRNA, both are floating free in the cytosol. The first 20 amino acids of
the Lactase protein are an ER signal sequence. This signal attracts an RNA-protein complex called the
Signal Recognition Particle. It brings the ribosome to the surface of the rough ER. The ribosome
continues protein synthesis but now the protein is fed into the ER lumen (Figure 8a). Since its job is
done, the ER signal sequence is cut off and its amino acids are recycled. These are typical events in
the synthesis of membrane proteins such as Lactase.

Towards the end of the coding sequence comes a stop transfer sequence. This portion is left in the ER
membrane and becomes the trans-membrane domain (Figure 8b). The ribosome soon reaches the stop codon
in the mRNA and departs.

Figure 8.
Synthesis of Lactase proteins in a human intestinal epithelial cell.
(Original-Harrington-CC BY-NC 3.0)

At this point we have a membrane protein but it is still longer than expected. This so-called
pro-Lactase protein travels to the Golgi apparatus. There, an enzyme cuts the protein a second time,
releasing an 847 amino acid long pro region (Figure 8c). The remaining protein is the 1060 amino acid
long mature Lactase. It is modified a bit more in the Golgi apparatus before being sent onwards to the
plasma membrane (Figure 8d).

The purpose of this pro region was a mystery until 1994 when scientists made synthetic Lactase
proteins that lacked it. These Lactase proteins were unable to fold into their correct shapes. Thus
the pro regions are necessary so that normal Lactase proteins can fold properly in the ER. Afterwards
in the Golgi apparatus, the pro regions are removed so that the enzymatic active sites are exposed.
Some other proteins, for example Insulin, also contain pro regions when they are first made. Once the
pro-proteins fold into their proper shapes the pro regions are removed and their amino acids
recycled.
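
The two cleavage steps account exactly for the difference between the protein encoded by the mRNA and the mature enzyme. The short calculation below is only a bookkeeping sketch using the amino acid counts given in the text; the constant names are illustrative.

```python
# Amino acid counts quoted in the text.
PRIMARY_TRANSLATION_PRODUCT = 1927  # encoded by the 5784 nucleotide coding sequence
ER_SIGNAL_SEQUENCE = 20             # cut off once the protein is fed into the ER
PRO_REGION = 847                    # cut off in the Golgi apparatus

# First cut: removing the ER signal sequence leaves pro-Lactase.
pro_lactase = PRIMARY_TRANSLATION_PRODUCT - ER_SIGNAL_SEQUENCE
# Second cut: removing the pro region leaves mature Lactase.
mature_lactase = pro_lactase - PRO_REGION

print(pro_lactase, mature_lactase)
```

The second number equals the 1060 amino acids stated for mature Lactase, so no other removed segment is needed to explain the size difference.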


3. LCT GENE EXPRESSION DURING DEVELOPMENT

Because young mammals are completely dependent upon mother's milk their LCT genes are very active in
the intestinal epithelial cells (Figure 9). This supplies these cells with enough Lactase enzymes to
digest the lactose sugars in the gut. The resulting glucose and galactose can then be imported into
the cells. Other cells do not turn on the LCT genes because they have no use for Lactase proteins.
This is an example of spatial gene expression - when a gene is active in only some cells in a
multicellular organism. In liver, muscle, and other cells the LCT genes remain off so as not to waste
nucleotides on unneeded mRNAs and amino acids on unneeded proteins. This also saves the energy that
would be required by the transcription and translation machinery.

Figure 9.
A young elk drinking milk (top) and an older elk eating grass (bottom).
(Wikipedia-Left: Norbert Kaiser, Right: Jonathunder-CC BY-SA 3.0)

When the mammal is older it will switch from mother's milk to eating regular food, a process called
weaning. For most mammals this means that there will be no more lactose in the diet, and thus no more
need for Lactases. The LCT genes are turned off in the intestinal cells because their job is done.
This is an example of temporal gene expression - when a gene is only active during specific stages of
an organism's development.

In about 65% of people the LCT genes follow this pattern of temporal gene expression. By the time
they turn eight the LCT genes have turned off. Without Lactases they can no longer break down lactose.
If they drink cow milk or eat too much cheese, ice cream, or other dairy products the result is
diarrhea. However, in about 35% of people this doesn't happen. The LCT genes remain active, Lactases
continue to be produced, and milk and dairy products do not cause gastric problems. Each of us is thus
either lactose intolerant or lactose tolerant. Either we have stopped making Lactase (a phenotype
also known as Lactase non-persistence) or we continue to make it (Lactase persistence). The
explanation for this difference reveals much about human genetics and human evolution.

Before we move on, an option for lactose intolerant people is lactose-reduced dairy products such as
those shown in Figure 1. How are they made? The answer is simple: Lactases are purified from yeast
such as Kluyveromyces fragilis and added to food during its processing. Any lactose will be broken
down into monosaccharides by the time the person eats the food. All people, lactose tolerant and
intolerant, can import glucose and galactose into their intestinal cells using their SGLT1
transporters.

4. EVOLUTION OF THE LCT GENE

If we look at the distribution of lactase persistence (LP) in the human population we see three
hotspots - Northern Europe, Eastern Africa, and Arabia/India (Figure 10). After much searching
scientists found that in each case there was a single mutation in LCT responsible. In the European
population a CG to TA base pair substitution created the LP allele. The reason it took so long to find
was that the mutation was far upstream of the transcribed region, 13 910 bp in fact!

The LP alleles in the other populations were different bp substitutions very close by (Figure 11). In
African populations it was a TA to GC bp substitution at –13 915 while in Arabia/India it was a CG to
GC bp substitution at –13 907.


Figure 10.
Distribution of the lactose tolerant phenotype. Dots represent collection locations. Colours show the frequency of the lactose
tolerant phenotype from 0-10% to 90-100% of the local population.
(BMC Evolutionary Biology, 2010, 10:36-Itan et al-CC BY 2.0)

How can these mutations have an effect? What each does is to turn this stretch of DNA into a binding
site for a positive transcription factor. Positive transcription factors bind to genes and activate
them, while negative transcription factors bind to genes and have the opposite effect. In this case
the positive transcription factor is a protein known as Oct1. It is Oct1 that is keeping the LCT genes
active long after they would otherwise be turned off. All it took to change the LCT gene's temporal
expression pattern were mutations in the gene's regulatory region. The result had a dramatic effect on
a person's phenotype.

Figure 11.
Mutations in the LCT gene that produce a Lactase persistence allele. Note that locations on genes are
numbered relative to the first base pair read by the RNA Polymerase being +1.
(Original-Harrington-CC BY-NC 3.0)

Like other mutations these three occurred randomly. But why did these mutations become so common? The
answer comes from the food consumed in these places over the past thousands of years. All three groups
of people raised animals (goats, sheep, cows, or camels) that could be milked. Milk and milk-products
offered a new year-round food source. People in these communities with the lactose tolerance
phenotype would have had more food available and been able to have more children. Their children
would have inherited the LP alleles and also had this advantage. Over many generations the population
shifted to where most if not all of the people had the LP alleles and were Lactase persistent. In
other places in the world, places where lactose tolerance had no benefit, any LP alleles that arose
would not have been selected for and would remain rare.

In summary, while there are different alleles of the LCT gene in the human population neither type is
"better". The original allele turns off after weaning and thus conserves nucleotides, amino acids, and
energy. The LP alleles remain on and allow a person to eat a greater variety of foods. Ultimately, the
reason you are either lactose tolerant or intolerant has to do with what your ancestors ate and drank
thousands of years ago!
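
How an initially rare LP allele could become common "over many generations" can be illustrated with a very simple selection model. Everything numeric here is an assumption made for illustration: the chapter gives no selection strength or starting frequency, and the model treats each allele copy independently (it ignores dominance, drift, and migration). The function names are hypothetical.

```python
def next_frequency(p: float, s: float) -> float:
    """Allele frequency after one generation of selection.

    Carriers of the LP allele leave (1 + s) offspring for every 1 left by
    non-carriers; p is the current frequency of the LP allele.
    """
    return p * (1 + s) / (p * (1 + s) + (1 - p))


def generations_to_reach(target: float, p0: float, s: float) -> int:
    """Count the generations needed for the LP allele to rise from p0 to target."""
    if not (0 < p0 < 1 and 0 < target < 1 and s > 0):
        raise ValueError("need 0 < p0 < 1, 0 < target < 1 and s > 0")
    p, generations = p0, 0
    while p < target:
        p = next_frequency(p, s)
        generations += 1
    return generations


# Hypothetical values: a 5% reproductive advantage and a starting frequency of 0.1%.
print(generations_to_reach(target=0.9, p0=0.001, s=0.05))
```

With no advantage (s = 0) the frequency does not change in this model, which mirrors the point that LP alleles would stay rare in places where lactose tolerance had no benefit.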


SUMMARY:
• In the human gut the disaccharide lactose is hydrolyzed by an intestinal epithelial cell membrane protein
named Lactase. The resulting monosaccharides, galactose and glucose, are then imported into the cell by
the SGLT1 transport protein.
• The LCT gene makes a long pre-mRNA which is processed (5' cap added, introns removed, poly(A) tail
added) to produce a much shorter mature mRNA.
• The Lactase protein contains regions that control where it is synthesized (ER signal sequence), become its
trans-membrane domain (stop transfer sequence), and assist with its folding (pro sequence). Some of these
regions are removed as the protein is formed and delivered to its final location.
• The LCT gene shows both spatial gene expression (it is only active in some cells) and, in many people,
temporal gene expression (it is only active during some developmental stages).
• During human history three independent mutations have generated Lactase persistence (LP) alleles of the
LCT gene. These mutations have altered how the gene is regulated. People with one of these alleles can
digest lactose during their whole lives and not just as infants.
KEY TERMS:
lactose
Lactase
SGLT1
trans-membrane domain
UTR
coding sequence
ER signal sequence
stop transfer sequence
pro region
spatial gene expression
temporal gene expression
lactose intolerant / Lactase non-persistence
lactose tolerant / Lactase persistence
positive transcription factor
negative transcription factor


STUDY QUESTIONS:
1) If a person was heterozygous for LCT, i.e. they had one Lactase persistence allele and one
Lactase non-persistence allele, what would their phenotype be? In other words, is the LP
allele dominant or recessive to the original allele?
2) Go to OMIM and find the entry for the LCT gene. What are the alternative symbols for this
gene? Why is it necessary for the HUGO Gene Nomenclature Committee to approve only one
symbol for each gene?
3) One way to find out if a person is lactose tolerant or intolerant is to feed them some
lactose and then monitor their blood glucose levels. How does this work?
4) Insulin proteins are synthesized and exported from human pancreatic cells. Consult Figure 8
and describe how these proteins are made.
5) In E. coli a protein called Lac Permease imports lactose into the cell so that a protein called
Beta-Galactosidase can turn it into galactose and glucose. What are the similarities and
differences between how E. coli imports lactose and how you do?
6) How do you suppose other disaccharides are digested in humans? Note that in our diet the
most common disaccharides are lactose (galactose + glucose), sucrose (glucose +
fructose), and maltose (glucose + glucose).
7) Why do people with a lactose intolerant phenotype have gastric troubles if they drink
milk or eat dairy products? More specifically, what problem is all that undigested lactose
causing in their large intestines?


CHAPTER 10 – EUKARYOTIC GENES:

THE DROSOPHILA WHITE (W) GENE
Figure 1.
A Drosophila melanogaster adult male
(Wikipedia- André Karwath - CC BY-SA 2.5)

INTRODUCTION

One of the most striking features of Drosophila melanogaster is the adult's large red eyes (Figure
1). As with other insects, these are compound eyes. Each Drosophila eye is made of about 800 tubes
called ommatidia arranged in a hemisphere (Figure 2). Light enters the outwards facing side of the
ommatidium and activates a light sensitive photoreceptor cell at the base. This cell sends a nerve
impulse to the brain. In order for compound eyes to function the sides of each ommatidium have to be
opaque - otherwise light coming from other directions will activate the photoreceptor cell. Thus each
ommatidium has three parts: a lens at the top, pigment cells along the sides, and a photoreceptor
cell at the base.

Figure 2.
Three ommatidia within a Drosophila eye. Arthropod compound eyes have multiple lenses in contrast to
human eyes, which have a single lens each.
(Original-Harrington- CC BY-NC 3.0)

In Drosophila melanogaster, two types of pigments are used: orange-coloured drosopterins and
brown-coloured ommochromes. Eyes that contain both pigments have the wild type, bright red colour.
Synthesizing these pigments requires a set of transporters and enzymes. If any of the genes encoding
these proteins is mutated, the result will be a fly with an altered eye colour. In the wild this
would be detrimental; however, a fly confined within a laboratory vial does not require vision to
find food and mates. Eye colour mutations therefore do not compromise the viability and fertility of
lab strains.

Because eye colour mutants are easy to isolate and propagate, scientists have used them to make many
scientific discoveries. The best example of this is a gene called white (w), with mutants giving a
white coloured eye. This chapter describes how this gene functions, some of its mutant alleles, and
why it is important in the history of genetics. The study of fly genes provides insight into the
expression, function, and control of other genes, including human genes and diseases.

1. THE WHITE GENE PROTEIN

Each fly eye begins as a clump of cells called an imaginal disc inside the larva. During pupation
these imaginal discs grow and mature into eyes. One of the developmental steps in the future
pigment cells is to import tryptophan and guanine


molecules (Figure 3). These will be converted into brown (ommochrome) and orange (drosopterin)
pigments, respectively. Tryptophan and guanine are imported by transporter proteins in the cell’s
plasma membrane. Both transporter proteins are heterodimers, proteins made of two different
polypeptides. The transporter made with the W and S polypeptides imports tryptophan while the guanine
transporter is made with the same W polypeptide joined with a B polypeptide.

Figure 3.
Import of pigment precursor molecules into a future eye pigment cell.
(Original-Harrington- CC BY-NC 3.0)

Figure 4.
Three different genes encode the three polypeptides needed to make the two types of transporters.
(Original-Harrington- CC BY-NC 3.0)

Each of these three polypeptides is encoded by a different gene, there being three genes in total
(Figure 4). From these figures we can predict what would happen if any one of these genes were
non-functional.

• If the S gene was mutated, pigment cells would not make any S polypeptides, but the synthesis of W
and B polypeptides would be unaffected. These cells would therefore be lacking tryptophan
transporters but would have functional guanine transporters. During pupation the cells would be
able to synthesize orange (drosopterin) but not brown (ommochrome) pigments. The cells, and the
fly eyes as a whole, would be orange.
• If the B gene was mutated the opposite situation would happen: the future pigment cells would be
able to import tryptophan but not guanine. They would contain brown but not orange pigments. These
flies would have brown eyes.
• If the W gene was mutated neither transporter could be produced. With no precursors imported there
would be no pigments and the flies would have white eyes.

These figures are missing one piece of information, the true names of the three genes. Each was
discovered decades before its protein's cellular function was revealed. So what did geneticists name
a gene that, when mutated, makes the eyes white? Well, it was named the white gene. To reduce
confusion Drosophila geneticists use a system of italics and capital letters when referring to DNA,
RNA, and proteins. In this system the white gene is transcribed into the white mRNA, which is
translated into the WHITE protein. The wild type (functional) allele of the white gene can be
depicted white⁺ or just w⁺.

The two other genes were named the same way. The actual name of the S gene is scarlet (st), while the
B gene is officially the brown (bw) gene. Figure 5 shows the actual names of the three genes and
their polypeptide products. There are a few other genes that also mutate to produce an unusual eye
colour. Many of these encode enzymes which turn the tryptophans into ommochromes or guanines into
drosopterins. One example is the rosy gene which makes the Xanthine Dehydrogenase enzyme. Flies
without this enzyme cannot synthesize the


orange pigments and have brown-coloured eyes as a consequence.

2. THE WHITE GENE

2.1. THE FUNCTIONAL (WILD TYPE) ALLELE

The white gene is a typical eukaryotic gene. It makes a pre-mRNA that will have five introns removed
during processing to yield a shorter mature mRNA. In Figure 6 the transcribed region on the DNA has
an arrow indicating where the RNA polymerase starts and the direction it travels. The mature mRNA is
shown below where boxes represent exons and the filled in sections are the protein coding region. In
this mRNA the start codon is in the first exon and the stop codon is in the sixth (last) exon. The V's
below and connecting the boxes represent the introns which were removed. This representation allows
the mature mRNA and gene sequences to line up vertically. The figure omits the 5' cap and poly(A)
tail that are present in the mature mRNA.

Figure 5.
The actual names of the genes that make the two transporters.
(Original-Harrington- CC BY-NC 3.0)

Figure 6.
The Drosophila white gene and its mRNA.
(Original-Harrington- CC BY-NC 3.0)

Figure 7 shows the WHITE polypeptide. It has been flattened into two-dimensions so that the six
trans-membrane domains and large cytosolic domain can be seen. The actual polypeptide would join with
a similarly structured BROWN or SCARLET polypeptide to form a cylindrical membrane protein.

Figure 7.
The Drosophila WHITE polypeptide.
(Original-Harrington- CC BY-NC 3.0)

2.2. THE FIRST MUTANT ALLELE TO BE DISCOVERED

In 1910 T. H. Morgan, the father of Drosophila genetics, described his discovery of an unusual fly:

“In a pedigree culture of Drosophila which had been running for nearly a year through a
considerable number of generations, a male appeared with white eyes. The normal flies have
brilliant red eyes.”

This fly carried a mutation later named white¹ (w¹). The #1 indicates that this was the first of many
mutant alleles (>300 now). While you might think that this mutation was due to a simple base pair
substitution, many years later it was determined that this mutation was actually due to the insertion
of a transposable element into the white gene. Transposable elements, in this case one called Doc,
are pieces of DNA that jump from one location and insert into new locations on chromosomes. If they
happen to insert into a gene the gene is mutated (non-functional), which is what happened to the
white gene in one of Morgan's flies.

We now know that transposable elements are very active in Drosophila and are responsible for 50% of
the spontaneous mutations discovered by early Drosophila geneticists. Humans also have
transposable elements but they are much less

active. They only cause 0.2% of our spontaneous mutations.

2.3. THE WHITE-APRICOT ALLELE

Geneticists have isolated over a thousand mutant alleles of the white gene. Not all of these have a white-eyed phenotype though. The white^apricot (w^a) allele has apricot-coloured eyes for example (Figure 8). It was also caused by a spontaneous transposable element insertion, in this case one called copia. Because of the location and orientation of the copia element insert, the w^a allele can still make mRNAs but not at wild type levels. This allows the future pigment cells to synthesize some WHITE polypeptides. Consequently, there are fewer transporter proteins on the surface of the cells and thus not enough precursors are imported to make pigments. With fewer of both types of pigment molecules, the cells and eyes have a pale orange colour.

Figure 8.
Drosophila with white+ (left), white- mutation (middle) and white^apricot mutation (right).
(FlyBase-PD)

2.4. OTHER MUTANT ALLELES

Mutant alleles of the white gene produce fly eyes that range from white to yellow, to orange, and to red with every shade in between. Drosophila geneticists can tell these mutant alleles apart using some tricks:

• Observe young adults when pigment deposition is incomplete. Many of the mutant colours darken with age and become harder to distinguish in older flies.
• Look for pseudopupils. Pseudopupils are shadows that appear to float in the middle of insect compound eyes (Figure 9 left). In most Drosophila eye colour mutants the pseudopupil is absent and the eye has a flat, uniform colour.
• Look at the ocelli. These are three simple eyes on the top of insect heads (Figure 9 right). Drosophila adults use these to keep themselves upright as they fly. In Drosophila the ocelli are normally red but they are unpigmented in certain mutant strains.

Figure 9.
A mantis with prominent pseudopupils (top) and a wasp with its ocelli circled (bottom).
(Wikipedia-right: Luc Viatour/left: Assafn- CC BY-SA 3.0)

3. THE WHITE GENE IS X-LINKED

T. H. Morgan noticed that when single white-eyed male flies were mated to female flies with regular red eyes, the offspring all had red eyes. When he mated the offspring and observed the next generation he found that 3/4 of the flies had red eyes and 1/4 had white eyes. But this was different from the typical 3:1 Mendelian ratio because only the males had white eyes. Initially he suspected that the white-eyed phenotype was only possible in males, but soon he had more data. He had also taken that same white-eyed male and mated it to some of its female offspring. From this cross he found a 1:1:1:1 ratio of white-eyed males, red-eyed males, white-eyed females, and red-eyed females.
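
The ratios Morgan observed can be checked by enumerating gametes. The Python sketch below is illustrative only. It assumes, as the text goes on to explain, that the eye colour gene is on the X chromosome, and it also assumes that the white allele is recessive to red. The function name `cross` and the allele labels `+` (red) and `w` (white) are choices made for this example.

```python
from collections import Counter
from itertools import product

def cross(mother, father):
    """Count offspring classes from one cross of an X-linked gene.

    mother: her two X-linked alleles, e.g. ("+", "w")
    father: his single X-linked allele plus "Y", e.g. ("w", "Y")
    Each egg/sperm combination is treated as equally likely.
    """
    counts = Counter()
    for egg, sperm in product(mother, father):
        if sperm == "Y":
            # Sons get their only X from the mother.
            sex = "male"
            colour = "red" if egg == "+" else "white"
        else:
            # Daughters get one X from each parent; white is assumed recessive.
            sex = "female"
            colour = "red" if "+" in (egg, sperm) else "white"
        counts[(sex, colour)] += 1
    return counts

# Cross 1: red-eyed female x the white-eyed male
first_cross = cross(("+", "+"), ("w", "Y"))
# Cross 2: the offspring of cross 1 mated to each other
second_cross = cross(("+", "w"), ("+", "Y"))
# Cross 3: the white-eyed male x one of his daughters
third_cross = cross(("+", "w"), ("w", "Y"))

for name, result in [("cross 1", first_cross),
                     ("cross 2", second_cross),
                     ("cross 3", third_cross)]:
    print(name, dict(result))
```

Under these assumptions the first cross gives only red-eyed offspring, the second gives 3/4 red and 1/4 white with every white-eyed fly being male, and the third gives the 1:1:1:1 ratio.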


The only explanation that fit was that males had one copy of this eye colour gene while females must have two. This parallels the situation with the X chromosome: males have one and females have two. His explanation for the results was that the gene for eye colour was on the X chromosome. We now call genes such as this X-linked. In a famous statement Morgan concluded:

“The fact is that this R [the allele for red eyes] and X [the X chromosome] are combined, and have never existed apart.”

His three crosses, using modern nomenclature, are shown in Figure 10.

Figure 10.
A modern depiction of Morgan's crosses and results. Female flies have five thin stripes on their abdomens while males have two thin and one wide stripe.
(Original-Harrington- CC BY-NC 3.0)

Morgan used these results as confirmation of the chromosomal theory of inheritance. Other biologists had proposed that genes were on chromosomes but here was the evidence – the gene for Drosophila eye colour was inherited as if on a specific chromosome, the X chromosome. For this and other research T. H. Morgan won the Nobel prize in Physiology or Medicine in 1933.

In many cases, the white-eyed flies students work with in genetics labs are the descendants of that one male fly he discovered over one hundred years ago (or 2500+ generations). Also, the experiment students do today to demonstrate X-linked inheritance would have given them a Nobel prize had they done it before Morgan in 1910!

4. THE IMPORTANCE OF THE WHITE GENE

4.1. OTHER DISCOVERIES

Since Morgan's time geneticists have made further discoveries using the white gene and its mutant alleles. Some highlights include:

• Much of what we know of heterochromatin comes from the analysis of the white^mottled#4 (w^m4) allele. A chromosome rearrangement has placed the white gene too close to the centromere. In some ommatidia the gene is able to function but in most the gene is silenced. The result is a fly with a mosaic of red and white ommatidia (Figure 11).
• The first Drosophila transposable element discovered was in the white^ivory (w^i) allele.
• The first Drosophila gene cloned and sequenced was white. The white^apricot allele played an important part.
• A functional white+ gene is used as a marker for small pieces of DNA when scientists introduce DNA into Drosophila or move DNA from chromosome to chromosome. These techniques are called transfection and transposition, respectively.


Figure 11.
A fly with the white^mottled#4 mutation.
(PLoS Genet-Piacentini et. al (2009)-PD)

4.2. RELEVANCE TO HUMAN GENETICS

Are there any direct connections to human genetics? The answer is yes and no. Eye colour in humans, or more specifically, the colour of the iris, is due to a different pigment, melanin. Irises with lots of melanin are brown or black, while those with small amounts are hazel, green, or blue depending upon how light entering the eye is scattered. The colour of the iris does not affect what we see, although people with lighter eyes are more sensitive to bright light than people with darker eyes. In short, eye colour in humans is not related to eye colour in Drosophila.

On the other hand we do have many of the same genes as Drosophila. The Drosophila WHITE, SCARLET, and BROWN proteins are in the ABC transporter family. Humans have proteins in this family, for example the CFTR protein which is non-functional in people with cystic fibrosis. Of the human ABC transporters the one most similar to WHITE is named ABCG1. It is encoded by a gene on our chromosome 21. ABCG1 is a transport protein, although its job is to export cholesterol and phospholipids from macrophage cells.

The same is true of the enzymes involved in Drosophila eye colour. The Drosophila rosy gene mentioned in Section 1 is called XDH in humans. Both genes make enzymes necessary for purine metabolism. There are many examples of medically important human genes that have a Drosophila counterpart. Drosophila geneticists can reveal much about human health by studying and manipulating these genes.


___________________________________________________________________________
SUMMARY:
• Insects have compound eyes. Pigment-containing cells line the sides of each ommatidium and serve a
crucial function in vision.
• Heterodimer transport proteins import pigment precursor molecules into these cells during
development.
• Drosophila adults that are unable to make one or both of the transporters do not have the normal red
eye colour.
• Mutations in the white gene reduce or eliminate eye pigmentation.
• The Drosophila white gene was the first X-linked gene discovered and was used by Morgan to confirm
the chromosomal theory of inheritance.
• Geneticists continue to use the white gene as a tool in their research.
KEY TERMS:
compound eye
ommatidia
pigment cell
pigment
transporter protein
heterodimer protein
transposable element
pseudopupil
ocelli
X-linked gene
chromosomal theory of inheritance


QUESTIONS:
1) What is the difference between white, white,
and WHITE?
2) How do Drosophila adults with the white^1
mutation perceive the world? Is it darker or
lighter?
3) What colour eyes would a fly have if it were
homozygous for mutations in both the sepia and
the brown genes?
4) Other Drosophila cells import tryptophan and
guanine but use them to make the
neurotransmitters serotonin and dopamine.
Does this mean that flies with the white^1
mutation have altered behaviour?


CHAPTER 11 – MUTATIONS ORIGINATE AS DAMAGE TO DNA

Figure 1.
The difference in appearance between pigmented and white peacocks is due to mutation.
(Flickr-ecstaticist- CC BY-NC-SA 2.0)

INTRODUCTION

The techniques of genetic analysis discussed in the chapters on Mendelian inheritance depend on the availability of two or more alleles for a gene of interest. Where do these alleles come from? The short answer is mutation, or a change to the DNA sequence.

Humans have an interesting relationship with mutations. From our perspective, mutations can be extraordinarily useful because they are essential for the domestication and improvement of almost all the organisms we use as food. On the other hand, mutations are the cause of many cancers and other diseases that can be devastating to individuals. Yet, the vast majority of mutations probably go unnoticed and undetected. In this section, we will examine some of the causes of mutations.

1. MUTATION AND POLYMORPHISM

We have previously noted that an important property of DNA is its sequence fidelity: most of the time its sequence accurately passes the same information from one generation to the next. However, DNA sequences can change. Changes in DNA sequences are called mutations. If a mutation changes the phenotype of an individual, the individual is said to be a mutant (as opposed to wild type).

In a typical population of individuals (e.g. a classroom of students), not all members will have the same DNA sequence – there is genetic variation. The extent of this variation can be divided into two categories. First, naturally occurring but rare (<1%), sequence variants that are clearly different from a normal, wild-type sequence are called mutations. Second, in a population there may be many naturally occurring variants for a trait for which no wild type can be defined. In this case we use the term polymorphism to refer to variants of DNA sequences and other phenotypes that co-exist in a population at relatively high frequencies (>1%). Polymorphisms and mutations arise through similar biochemical processes, but the use of the word “polymorphism” avoids implying that any particular allele is more normal or abnormal. For example, a change in a person’s DNA sequence that leads to a
disease such as albinism is appropriately called a mutation, but a difference in DNA sequence that explains whether a person has red hair rather than brown, black, or blond hair is an example of polymorphism.

Molecular markers, which we will discuss in the chapters on DNA variation, are a particularly useful type of polymorphism for some areas of genetic research.

2. TYPES OF MUTATIONS

Mutations, or lesions, may involve the loss (deletion) or gain (insertion) of one or more base pairs, or else the substitution of one or more base pairs with another DNA sequence of equal length. These changes in DNA sequence can arise in many ways, some of which are spontaneous and due to natural processes, while others are induced by humans intentionally (or unintentionally) using mutagens. There are many ways to classify mutagens, which are the agents or processes that cause mutation or increase the frequency of mutations. We will classify mutagens here as being (1) biological, (2) chemical, or (3) physical in the next section.

Mutations can occur in many locations with respect to a gene. They can occur within genes, and so can possibly change the polypeptide sequence from that gene. The severity of that mutation, and how it affects the gene's function, can be described using Muller’s morphs, which are explained in Chapter 13. They can also occur in regions that are transcribed but not translated. These are non-protein coding genes, which can include tRNA, rRNA or siRNA. Mutations that can still affect gene function can occur in regions that are not transcribed or translated, such as in the promoter or regulatory regions (enhancer/silencer) of genes. Mutations here will affect the ability of RNA polymerase to bind and transcribe that gene, and so will ultimately affect the overall levels of the protein. Lastly, mutations can occur in regions between genes, or within introns. These mutations usually do not affect the functions of any genes, and so the organism will usually appear wild type.

2.1. DELETION AND INSERTION MUTATIONS - FRAME SHIFT

A deletion or insertion mutation may cause dramatic changes in the sequence of the protein. A deletion is removing base pair(s) from the DNA, and an insertion is inserting new base pair(s) into the DNA. Remember that three nucleotides, or a codon, code for a single amino acid. If the insertion or deletion is only three nucleotides, it will maintain the sequence reading frame so the protein will have one extra or one missing amino acid (Figure 2). The same will occur for multiples of three (6, 9, 12, etc.). The location of the insertion/deletion will affect the severity of the mutant allele, but this type of mutation is generally less harmful than non-multiples of three.

If a deletion or insertion mutation is not a multiple of three, it will cause a frame shift. The typical codon next to the insertion or deletion will be shifted over, and the ribosome will start placing incorrect amino acids after the mutation. This will lead to a severely disrupted protein that will likely not be able to function properly. A frame shift is also very likely to cause a premature stop codon. This will lead to a truncated, or shortened polypeptide (Figure 2). If this happens near the end of the polypeptide sequence, it is likely that a frame shift will not have major effects on the polypeptide function. If it happens near the start then the protein will likely be non-functional.
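
The effect of insertion length on the reading frame can be tried out on the example sequence used in Figure 2. The Python sketch below is a minimal illustration, not a full translation tool. It assumes the sequence is the coding strand read from its first base, uses the standard genetic code, and reports amino acids as one-letter codes. The helper names `translate` and `insert_bases` are invented for this example.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna):
    """Translate coding-strand DNA codon by codon, stopping at the first stop codon."""
    protein = []
    # Incomplete codons at the end of the sequence are ignored.
    for start in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

def insert_bases(dna, position, new_bases):
    """Return a copy of dna with new_bases inserted after `position` bases."""
    return dna[:position] + new_bases + dna[position:]

original = "ATGCCGAAAATAAGTTTCAGGGGT"
in_frame = insert_bases(original, 9, "CTC")   # three bases: frame is kept
frameshift = insert_bases(original, 8, "GC")  # two bases: frame is shifted

print(translate(original))    # MPKISFRG
print(translate(in_frame))    # MPKLISFRG
print(translate(frameshift))  # MPKQ
```

The three-base insertion adds a single leucine (L) and leaves the rest of the polypeptide unchanged, while the two-base insertion changes the codons that follow and runs into a premature stop codon after only four amino acids.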


Figure 2.
Example sequence:
    ATG CCG AAA ATA AGT TTC AGG GGT ...
    Met Pro Lys Ile Ser Phe Arg Gly ...
Three base insertion with no frameshift:
    ATG CCG AAA CTC ATA AGT TTC AGG GGT ...
    Met Pro Lys Leu Ile Ser Phe Arg Gly ...
Two base insertion with frameshift and stop codon:
    ATG CCG AAG CAA TAA GTT TCA GGG GT ...
    Met Pro Lys Gln XXX Val Ser Gly ...
The top is an example sequence, and the amino acids produced from those codons. When a three base insertion occurs, there isn’t a frameshift, but just an insertion of a new amino acid. When a two base insertion occurs, it causes a frame shift of every base after, in this situation leading to a premature stop codon. Purple shows inserted bases, green shows affected bases and amino acids, red XXX is the stop codon.
(Original-L. Canham- CC BY-NC 3.0)

Figure 3.
Diagram that shows what changes lead to transition mutations and what changes lead to transversion mutations.
(Wikipedia- Petulda-PD)

2.2. SUBSTITUTION MUTATIONS

Mutations don’t always add or remove pieces of DNA. Sometimes they can also just change one nucleotide to another. This is called a substitution. There are two ways that a substitution can change a nucleotide:
(1) A purine can be changed to another purine (A to G or G to A), or a pyrimidine to another pyrimidine (C to T or T to C). This is called a transition mutation.
(2) A purine can be changed to a pyrimidine, or vice versa. This is called a transversion mutation.

A substitution changes the genetic sequence, but this simple change to the DNA sequence can cause three different changes to the polypeptide sequence (Figure 4).
(1) The first is known as a silent mutation. This is where a substitution in the DNA does not affect the amino acid made. This is because of the degeneracy of the genetic code: different codons can code for the same amino acid. For example, glutamic acid can be translated from GAA or GAG, so a transition from A to G in the third position will still produce glutamic acid.
(2) A missense mutation is a mutation that changes the amino acid translated. For example if the codon AGC undergoes a transversion mutation to AGG, then it changes the amino acid from a serine to an arginine. Amino acids are put into groups based on their features: hydrophobic, polar uncharged, charged, and other. If the mutation changes the amino acid to another in a similar group (conservative missense mutation), the protein may still have partial function. But if a charged amino acid changes to a hydrophobic one (non-conservative missense mutation), this is more likely to cause major changes to the function and/or folding of the protein.
(3) A nonsense mutation occurs when the substitution leads to a stop codon. For example, a UCA for serine has a transversion mutation to a UAA, which is one of the three stop codons. A stop codon will lead to a truncated polypeptide and, much like in the last section, its position will affect the severity of this mutation.
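
The two classifications above (transition or transversion, and silent, missense or nonsense) can be combined in a short Python sketch. It is illustrative only: the function names `substitution_type` and `classify` are invented here, the standard genetic code is assumed, and codons that start out as stop codons are not given any special treatment.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
PURINES = {"A", "G"}

def substitution_type(old_base, new_base):
    """Purine<->purine or pyrimidine<->pyrimidine is a transition; otherwise a transversion."""
    if old_base == new_base:
        raise ValueError("the two bases are identical")
    same_class = (old_base in PURINES) == (new_base in PURINES)
    return "transition" if same_class else "transversion"

def classify(codon_before, codon_after):
    """Classify a single-base substitution between two codons (DNA or RNA letters)."""
    # Treat RNA codons as DNA so that one table serves both.
    before = codon_before.upper().replace("U", "T")
    after = codon_after.upper().replace("U", "T")
    changed = [i for i in range(3) if before[i] != after[i]]
    if len(changed) != 1:
        raise ValueError("expected exactly one changed base")
    kind = substitution_type(before[changed[0]], after[changed[0]])
    if CODON_TABLE[before] == CODON_TABLE[after]:
        effect = "silent"
    elif CODON_TABLE[after] == "*":
        effect = "nonsense"
    else:
        effect = "missense"
    return kind, effect

# The three examples given in the text
print(classify("GAA", "GAG"))  # ('transition', 'silent')
print(classify("AGC", "AGG"))  # ('transversion', 'missense')
print(classify("UCA", "UAA"))  # ('transversion', 'nonsense')
```

Whether a missense change is conservative or non-conservative depends on the chemical groups of the two amino acids, which this sketch does not attempt to model.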


Figure 4.
Examples of substitution mutations that lead to silent, nonsense or missense mutations.
(Wikipedia-Jonsta247- CC BY-SA 4.0)

2.3. CHROMOSOMAL REARRANGEMENT MUTATIONS

There can also be mutations that create major rearrangements of the chromosome. These include translocations and inversions. Translocations are a situation where large segments of DNA are swapped between non-homologous chromosomes. Inversions have large segments within a chromosome ‘flipped’ so the DNA stays within the same chromosome, but its orientation is in the opposite direction. In both of these situations, if a break occurs in the middle of a gene, that gene will be disrupted and unable to make a normal polypeptide. More details on translocations and inversions can be found in the Chromosomal Rearrangements Chapter (Chapter 24).

3. SPONTANEOUS MUTATIONS OF BIOLOGICAL ORIGIN

3.1. ERRORS DURING DNA REPLICATION – SINGLE BASE SUBSTITUTIONS

A major source of spontaneous mutations is DNA replication errors. DNA polymerases are usually very accurate in adding a base to the growing strand that is the exact complement of the base on the template strand. However, occasionally, an incorrect base is inserted, generating a mismatched base pair. Usually, the DNA replication machinery will recognize and repair mismatched or mispaired bases, through exonuclease activity of the polymerase, or other repair systems, such as base excision repair or mismatch repair. Nevertheless, some errors become permanently incorporated in a daughter strand, and so become mutations that will be inherited by the cell’s descendants (Figure 5).

Figure 5.
Mispairing of bases (e.g. G with T) can occur due to tautomerism, alkylating agents, or other effects. As a result, in this example the AT base pair in the original DNA strand will become permanently substituted by a GC base pair in some progeny. The mispaired GT base pair will likely be repaired or eliminated before further rounds of replication.
(Original-Deyholos- CC BY-NC 3.0)

3.2. ERRORS DURING DNA REPLICATION – STRAND SLIPPAGE

Another type of error introduced during replication is caused by a rare, temporary misalignment of a few bases between the template strand and daughter strand (Figure 6).


Figure 6.
Strand-slippage can occur occasionally during replication, especially in regions with short, repeated sequences. This can lead to either deletion (left) or insertion (right) of sequences compared to the products of normal replication (center), depending on whether the template strand or daughter strand “loops-out” during replication.
(Original-Deyholos- CC BY-NC 3.0)

This strand-slippage causes one or more bases on either strand to be temporarily displaced in a loop that is not paired with the opposite strand. If this loop forms on the template strand, the bases in the loop may not be replicated, and a deletion will be introduced in the growing daughter strand. Conversely, if a region of the daughter strand that has just been replicated becomes displaced in a loop, this region may be replicated again, leading to an insertion of additional sequence in the daughter strand, as compared to the template strand. Frame shift mutations account for mutation hot spots in some genes. Hot spots are sites in a gene that are significantly more mutable than other sites.

Regions of DNA that have several tandem repeats of the same few nucleotides are especially prone to this type of error during replication. Thus regions with short-sequence repeats (SSRs) tend to be highly polymorphic, and are therefore particularly useful in genetics. They are called microsatellites.

3.3. TRINUCLEOTIDE REPEATS

Some regions of the genome have repeated sequences. Dinucleotide repeats (e.g. AGAGAGAG) are common throughout the genome, as well as larger repeats such as VNTRs (Variable Number of Tandem Repeats). Trinucleotide repeats (e.g. ..CGGCGGCGGCGG..) though have a tendency to expand the region of their repeat, which leads to genetic diseases. These are known as trinucleotide repeat diseases. If there are a low number of repeats, the gene can be stable and the polymerase is able to faithfully replicate the repeats. Strand slippage, as described in the last section, can cause an increase in the number of repeats in that region. If it only increases slightly, this usually doesn’t cause instability. Once a threshold is reached, it becomes more difficult for polymerases to faithfully replicate the repeated region, and the repeats can grow, often very quickly. This mostly occurs in the germline of individuals, and often leads to offspring with more and more repeats. This threshold is different for different genes.

An example of this can be seen with the trinucleotide repeat disease Huntington’s Disease. Huntington’s has the repeat CAG within the reading frame of the Huntingtin gene (HTT). A normal HTT gene will have fewer than 28 repeats. If more repeats are gained, through strand slippage, to 28-35 repeats, then this is called a pre-mutation. At this point, the DNA polymerase cannot faithfully reproduce the repeats and strand slippage occurs more frequently. Such a person is unaffected, but their children may inherit the HTT gene with an increased number of repeats, and their children after that even more. The more repeats, the more severe the disease. Once the repeats get above 40, the HTT gene will cause the disease, which is a neurodegenerative disorder that leads to a
decreased life expectancy. With the CAG repeat within the HTT gene reading frame, one can understand why increasing the repeats will cause a decrease in protein function, as it will be gaining more and more of the amino acid glutamine.

Not all trinucleotide repeat disorders are caused by repeats within the reading frame though, for example Fragile X syndrome. Fragile X syndrome is a genetic congenital disease characterized by both intellectual disability and physical abnormalities such as a long face and large ears in males. The gene associated with this disorder is FMR-1. At the 5’ end of the FMR-1 mRNA, before the translational start site, is a CGG repeat. Most individuals have between 5 and 54 repeats of CGG. Parents with a premutation, where they are prone to pass on the disease but are not affected themselves, have between 60 and 200 repeats. Children of individuals with this premutation will have a greatly expanded region of around 200-4000 repeats and will exhibit the syndrome. Since this is not in the reading frame, why does this expanded repeat cause the disease? It is predicted that the expanded repeat causes aberrant methylation of base pairs in the 5’ upstream region, which leads to changes in chromatin structure and gene silencing. The karyogram of an affected individual exhibits a constricted region on the X chromosome, which is fragile and is prone to chromosomal breakage. This constriction is caused by the hypermethylation on the large number of FMR-1 repeats.

3.4. SPONTANEOUS LESIONS FROM ENDOGENOUS METABOLITES

Occasionally mutagenic lesions are caused by naturally occurring damage to the DNA. These lesions do not occur during DNA replication.

Depurination is a chemical reaction where the bond attaching the purine base (adenine or guanine) to the deoxyribose sugar is hydrolytically cleaved, leaving a deoxyribose sugar without its nucleic base. This is called an apurinic site (Figure 7). Unless the apurinic site is repaired correctly, a mutation will occur. Loss of pyrimidines can also occur but the chemical structure of purines makes them a good leaving group, so depurination is more common. This reaction occurs because of endogenous metabolites within the cell.

Figure 7.
Structure of an apurinic site in a strand of DNA.
(Wikipedia-Chemist234- CC BY-SA 3.0)

Deamination is the removal of an amine group through a hydrolysis reaction. When looking at the individual nucleotides (See Chapter 2), adenine, guanine and cytosine all have amine groups on the base. Like depurination, deamination can occur spontaneously due to metabolites within the cell. In cytosine deamination the cytosine will change to a uracil (Figure 8) and free ammonia. This change can be easily corrected, as uracil is only found in RNA, not DNA. So when a uracil is found in DNA, it is removed and replaced again with a cytosine. If this does not occur though, the uracil will pair with adenine, leading to a GC to AT transition mutation.

Figure 8.
Deamination of cytosine to uracil.
(Wikipedia-Yikrazuul-PD)

5-methylcytosine is a cytosine with a methyl-tag. Deamination of 5-methylcytosine is the most common deamination mutation. It leads to thymine and ammonia. In this situation a T would
then be opposite a G. This mismatch is usually fixed prior to the passage of the replication fork, but because the repair systems do not know that the T was originally a C, this can lead to a point mutation at that location.

Deamination of guanine creates the base xanthine. Xanthine is more prone to pair with thymine instead of cytosine. If unfixed, this will cause a GC to AT mutation. Deamination of adenine leads to hypoxanthine, which is more prone to pair with cytosine instead of thymine. This causes an AT to GC mutation.

Oxidative damage is caused by reactive oxygen species such as superoxide (O2•−), hydrogen peroxide (H2O2) and hydroxyl radicals (•OH). These are normal byproducts in cells, but are also increased in areas of inflammation as the body uses oxidizing species to attack pathogens. A common product of oxidative damage is 8-oxo-7-hydrodeoxyguanosine (8-oxo dG), which is formed from oxidatively damaged guanine. 8-oxo dG mispairs with A, causing GC to TA transversion mutations.

3.5. MUTATIONS FROM TRANSPOSABLE ELEMENTS

Mutations can also be caused by the insertion of viruses, transposable elements (see below), and other types of DNA that are naturally inserted at more or less random positions in chromosomes. The insertion may disrupt the coding or regulatory sequence of a gene, including the fusion of part of one gene with another. These insertions can occur spontaneously, or they may also be intentionally stimulated in the laboratory as a method of mutagenesis called transposon-tagging. For example, a type of transposable element called a P element is widely used in Drosophila as a biological mutagen (see Chapter 30). T-DNA, which is a transposable element modified from a bacterial pathogen, is used as a mutagen in some plant species.

Transposable elements (TEs) are also known as mobile genetic elements, or more informally as jumping genes. They are present throughout the genome of almost all organisms. These DNA sequences have a unique ability to be cut or copied from their original location and inserted into new locations in the genome. This is called transposition. The insert locations are not entirely random, but TEs can, in principle, be inserted into almost any region of the genome. TEs can therefore insert into genes, disrupting their function and causing mutations.

Researchers have developed methods of artificially increasing the rate of transposition, which makes some TEs a useful type of mutagen. However, the biological importance of TEs extends far beyond their use in mutant screening. TEs are also important causes of disease and phenotypic instability, and they are a major mutational force in evolution.

Figure 9.
Diagrams of the two main types of transposable elements (TEs). Class I elements transpose via an ssRNA intermediate, which is reverse transcribed to dsDNA prior to insertion of this copy in a new site in the genome. Class II elements do not involve an RNA intermediate; most Class II elements are cut from their original location as dsDNA, prior to being inserted into a new site in the genome. Although the diagram shows TEs being inserted on the same chromosome as they originated from, TEs can also move to other chromosomes within the same cell.
(Original-Deyholos- CC BY-NC 3.0)
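
The difference between the two mechanisms in Figure 9 can be made concrete with a toy model. The Python sketch below is a deliberate simplification, not a model of real transposition chemistry. It treats a chromosome as an ordered list of named features, and the names `copy_and_paste`, `cut_and_paste`, and the feature labels are invented for this example.

```python
def copy_and_paste(chromosome, element, target):
    """Class I style: the original element stays and a new copy is inserted at index `target`."""
    result = list(chromosome)
    result.insert(target, element)
    return result

def cut_and_paste(chromosome, element, target):
    """Class II style: the element is excised, then inserted at index `target` of what remains."""
    result = list(chromosome)
    result.remove(element)          # excise the first copy found
    result.insert(target, element)
    return result

chromosome = ["geneA", "TE", "geneB", "geneC"]

copied = copy_and_paste(chromosome, "TE", 3)
moved = cut_and_paste(chromosome, "TE", 2)

print(copied, copied.count("TE"))  # ['geneA', 'TE', 'geneB', 'TE', 'geneC'] 2
print(moved, moved.count("TE"))    # ['geneA', 'geneB', 'TE', 'geneC'] 1
```

In this toy model copy-and-paste always raises the copy number, while cut-and-paste only relocates the element.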


There are two major classes of TEs in eukaryotes DNA template from a homologous chromosome
(Figure 9). that itself contains a copy of a transposon, then the
Class I elements include retrotransposons; these total number of transposons in the genome will
are mobile by means of an RNA intermediate. The increase.
TE transcript is reverse-transcribed into DNA before Besides greatly expanding the overall DNA content
being inserted elsewhere in the genome through of genomes, TEs contribute to genome evolution in
the action of enzymes such as integrases. many other ways. As already mentioned, they may
Class II elements are known also as transposons. disrupt gene function by insertion into a gene’s
They do not use reverse transcriptase or an RNA coding region or regulatory region. More
intermediate for transposition. Instead, they use an interestingly adjacent regions of chromosomal DNA
enzyme called transposase to cut DNA from the are sometimes mistakenly transposed along with
original location and then this excised dsDNA the TE; this can lead to gene duplication. The
fragment is inserted into a new location. Note that duplicated genes are then free to evolve
the name transposon is sometimes used incorrectly independently, leading in some cases to the
to refer to any type of TEs, but in this book we use development of new functions. The breakage of
transposon to refer specifically to Class II elements. strands by TE excision and integration can disrupt
TEs are relatively short DNA stretches of 100- genes, and can lead to chromosome
10,000 bp, and encode no more than a few rearrangement or deletion if errors are made
proteins (if any). Normally, the protein-coding during strand rejoining. Furthermore, having so
genes within a TE are all related to the TE’s own many similar TE sequences distributed throughout
transposition functions. These proteins may include a chromosome sometimes allows mispairing of
reverse transcriptase, transposase, and integrase. regions of homologous chromosomes at meiosis,
However, some TEs (of either Class I or II) do not which can cause unequal crossing-over, resulting in
encode any proteins at all. These non-autonomous the deletion or duplication of large segments of
TEs can only transpose if they are supplied with chromosomes. Thus, TEs are a potentially
enzymes produced by other, autonomous TEs important evolutionary force, and should not be
located elsewhere in the genome. In all cases, dismissed as merely “junk DNA”, as they once were.
enzymes for transposition recognize conserved 4. INDUCED MUTATIONS OF CHEMICAL ORIGIN
nucleotide sequences within the TE, which dictate
where the enzymes begin cutting or copying. Many chemical compounds, whether natural or
synthetic, can react with DNA and cause mutations.
The human genome consists of nearly 45% TEs, the In some of these reactions the chemical structure
vast majority of which are families of Class I of particular bases may change, so that they are
elements called Long Interspersed Elements (LINEs) misread during replication. In other cases the
and Short Interspersed Elements (SINEs). The short, chemical mutagens distort the double helix causing
Alu type of SINE occurs in more than one million it to be replicated inaccurately, while still other
copies in the human genome (compare this to the compounds may cause breaks in chromosomes
approximately 21,000, non-TE, protein-coding that lead to deletions and other types of
genes in humans). Indeed, TEs make up a aberrations. The following are examples of two
significant portion of the genomes of almost all classes of chemical mutagens that are important in
eukaryotes. Class I elements, which usually genetics and medicine: alkylating agents, and
transpose via an RNA copy-and-paste mechanism, intercalating agents.
tend to be more abundant than Class II elements,
which mostly use a cut-and-paste mechanism. But 4.1. ALKYLATING AGENTS
even the cut-and-paste mechanism can lead to an Ethyl methanesulfonate (EMS) is an example of an
increase in TE copy number. For example, if the site alkylating agent that is commonly used by
vacated by an excised transposon is repaired with a geneticists to induce mutations in a wide range of

PAGE 8 OPEN GENETICS LECTURES – FALL 2017

MUTATIONS ORIGINATE AS DAMAGE TO DNA – CHAPTER 11

both prokaryotes and eukaryotes. The organism is
fed or otherwise exposed to a solution of EMS. The
compound reacts with some of the G bases it
encounters in a process called alkylation, where
the addition of an alkyl group to G changes the
base pairing properties so that the next time the
alkylated DNA strand is replicated, a T instead of a
C will be inserted opposite to the alkylated G in the
daughter strand (Figure 10). The new strand
therefore bears a C to T transition mutation, which
will be inherited in all the strands that are
subsequently replicated from it.

Figure 10.
Alkylation of G (shown in red) allows G to bond with T,
rather than with C.
(Original-Deyholos- CC BY-NC 3.0)

4.2. INTERCALATING AGENTS

Intercalating agents are another type of chemical
mutagen. They tend to be flat, planar molecules
like benzo[a]pyrene, a component of wood and
tobacco smoke, and induce mutations by inserting
between the stacked bases at the center of the
DNA helix (Figure 11). This intercalation distorts the
shape of the DNA helix, which can cause the wrong
bases to be added to a growing DNA strand during
DNA synthesis.
There are a large number of chemicals that act as
intercalating agents, can mutate DNA, and are
carcinogenic (can cause cancer). Many of these are
also used to treat cancer, as they preferentially kill
actively dividing cells. Another important
intercalating agent is ethidium bromide, the dye
that fluorescently stains DNA in laboratory assays.
For this reason, molecular biologists are trained to
handle this chemical carefully.

Figure 11.
Benzo[a]pyrene (circled in red) is an example of an
intercalating agent.
(Wikipedia-Zephyris- CC BY-SA 3.0)

4.3. OTHER CHEMICAL AGENTS

Aflatoxins are a group of metabolites produced by
fungi that contaminate corn and peanuts. These
metabolites can be toxic and carcinogenic.
Aflatoxin B1 (AFB1) is the most mutagenic. As it is
metabolized for excretion, some of the byproducts
are toxic, including endo epoxide and exo epoxide.
Exo epoxide intercalates into DNA and then
attaches itself covalently to the N7
position of guanine. This leads to the removal of
the modified guanine, creating an apurinic site.
Additionally, metabolism of AFB1 produces reactive
oxygen species, which cause oxidative damage as
described previously. Combining all the mutagenic
properties, aflatoxin mostly creates G to T
transversions, but can also create G to A or G to C
mutations at low frequencies.
Another type of chemical mutagen is the base analog.
Base analogs are chemicals that look similar to a normal
nucleotide and so they can be mistakenly incorporated
into the DNA during replication. One example of a
common base analog is 5-bromouracil (5BU).
When 5BU is incorporated into DNA it will pair with
adenine, but can spontaneously shift into another


Figure 12.
Mutagenesis with 5-Bromouracil
(Wikipedia-Allen Gathman-CC BY-SA 2.5)

isomer that pairs with guanine. This ultimately
causes an A to G transition (Figure 12).
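
The base changes described for the chemical mutagens above fall into two classes: transitions (purine to purine, or pyrimidine to pyrimidine) and transversions (purine to pyrimidine, or the reverse). As a minimal sketch, the following Python function classifies a single-base substitution. The function name and the example list are illustrative choices made here, not part of any standard library.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(original, mutant):
    """Return 'transition' or 'transversion' for a single-base change."""
    original, mutant = original.upper(), mutant.upper()
    for base in (original, mutant):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"unknown base: {base!r}")
    if original == mutant:
        raise ValueError("the two bases are identical, so nothing was substituted")
    # A transition keeps the chemical class (purine or pyrimidine) of the base.
    same_class = {original, mutant} <= PURINES or {original, mutant} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Base changes taken from the descriptions in this chapter.
examples = [
    ("EMS (alkylated G pairs with T)", "C", "T"),
    ("aflatoxin B1 (most frequent change)", "G", "T"),
    ("5-bromouracil", "A", "G"),
]
for mutagen, before, after in examples:
    kind = classify_substitution(before, after)
    print(f"{mutagen}: {before} to {after} is a {kind}")
```

The same function can be applied to any of the other substitutions listed in this chapter, for example the less frequent G to A or G to C changes caused by aflatoxin.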
5. INDUCED MUTATIONS OF PHYSICAL ORIGIN
Anything that damages DNA by transferring energy
to it can be considered a physical mutagen. Usually
this involves radioactive particles, X-rays, or
ultraviolet (UV) light. The smaller, fast moving
particles may cause base substitutions or delete a
single base, while larger, slightly slower particles
may induce larger deletions by breaking the double
stranded helix of the chromosome. For example,
X-rays can cause DNA double stranded breaks.
Physical mutagens can also create unusual
structures in DNA, such as the pyrimidine dimers
formed by UV light (Figure 13). Pyrimidine dimers
are covalent linkages between two adjacent
pyrimidines, with thymine dimers being the most
common. When a cell is trying to replicate its DNA,
the replication machinery cannot pass through the
dimer and so is forced to stop. Replication can only
proceed if DNA repair enzymes fix the damage.
Pyrimidine dimers cause conformational changes in
the DNA, so they are easily recognized by DNA
repair enzymes, but are often repaired incorrectly.

Figure 13.
Thymine dimers are formed when adjacent thymine bases
on the same DNA strand become covalently linked (red
bonds) following exposure to mutagens such as UV light. The
dimers distort base pairing and can interrupt processes
such as replication. (Original-Deyholos- CC BY-NC 3.0)

Figure 14.
UV photons can cause adjacent pyrimidines to bond with
each other. This distorts the DNA, creating a bulge that
prevents polymerases from passing by the lesion.
(Wikipedia- NASA/David Herring-PD)

The most common types of mutations from UV
light are GC to AT transitions but GC to TA, AT to


TA, AT to CG and CG to GC can all be caused by UV
mutagenesis.
UV light is a very broad mutagen that can cause
many mutation types. Compare this with EMS,
which mostly creates GC to AT mutations; or AFB1,
which mostly creates GC to TA mutations, but can
cause some other mutations at low frequencies as
well.

6. FAILURE OF REPAIR SYSTEMS
For each type of damage, cells have a way to fix it.
These repair systems include but aren’t limited to
base excision repair, nucleotide excision repair,
mismatch repair, non-homologous end joining and
homologous recombination. The mechanisms of
each repair are not important at this time. All these
systems require multiple enzymes to recognize the
specific type of mutagenic lesion and repair it as
accurately as possible. In certain situations, DNA
repair systems are unable to cope with DNA
damage, either because the lesions are too
numerous for the enzymes to be able to recognize
and repair all of them, or because there is damage
to the DNA repair systems themselves.
If the final efforts to rescue a cell from DNA
damage are successful, the cell will be able to
survive replication, but will be full of mutations
that are potentially deleterious to its long-term
health. If they are not successful, the cell will enter one of
three possible states: (1) a state of
dormancy, or senescence, where the cell is still
living but no longer functional; (2) programmed cell
death, or apoptosis, where the cell
will die; or (3) unregulated cell division, where the
cell will divide rapidly despite numerous mutations
and chromosomal abnormalities (Figure 15). This
can lead to cancerous tumours (Chapter 41).

Figure 15.
Most DNA damage is repaired in
a healthy cell. If the rate of DNA
damage exceeds the rate of
repair, a cell either undergoes
senescence, apoptosis or
uncontrolled cell growth.
(Wikipedia-Harold Brenner)

6.1. EXCESSIVE DNA DAMAGE
A cell may be exposed to DNA damage past a
threshold that it is normally capable of dealing
with. Such DNA damage usually prevents the DNA
polymerase from normally replicating the DNA and
it becomes ‘stuck’. If the cell is unable to find a way
to continue replicating, this will lead to cellular death.
Alternatively, it may activate an error-prone DNA
repair system.
In prokaryotes this is called SOS repair; when
induced, it recruits error-prone translesion DNA
polymerases to continue to replicate the DNA past
the mutagenic lesion, usually causing errors at the
site of the DNA damage. A similar process is seen
in eukaryotes, called translesion synthesis.


This is often used when there are excessive DNA
lesions from thymine dimers or apurinic sites. If
these lesions are not able to be repaired through
their normal mechanism, specific translesion
polymerases are recruited. As in the SOS response, the
translesion polymerase is error prone and will
often insert incorrect bases in the areas of the
lesion.

6.2. DAMAGED DNA REPAIR SYSTEMS
Another source of mutations is when the DNA
repair systems fail. The genes that make the
proteins for the various DNA repair machinery are
just like any other genes; they can be mutated and
cease to function. Many people are homozygous
wild type for these DNA repair genes. Also, most of
these genes are haplosufficient, meaning that only
one copy of the normal allele is required to
produce a normal DNA repair system. Those who
are heterozygous are at a greater risk though. They
may be one mutation away from completely losing
function of that DNA repair system gene. When
that gene is lost, DNA repair mechanisms might not
be able to work as well, or at all, and mutagenic
lesions can occur within the genome, leading to
permanent mutations.
If an individual is heterozygous for DNA repair
genes, then individual cells are at a higher risk
when exposed to mutagens. Since UV light is one of
the most common mutagens we encounter, the
loss of DNA repair proteins in skin cells exposed to
UV light can make those cells more susceptible to
cancer. Similarly, individuals who smoke increase
mutagens in their lungs. Thus, if heterozygous they
are more susceptible to DNA damage and cancer in
their lungs compared to a homozygous wild type
individual.
If damage to DNA repair system genes happens in
the gametes of both parents, then they can pass
that on to their child, making the child homozygous
mutant in that specific DNA repair gene. Instead of
individual cells being at risk, every cell in the
homozygous mutant child will lack a functional copy
of that DNA repair gene, causing problems with DNA repair. An
example of this is the disease Xeroderma
pigmentosum, which is caused by a mutant version
of a nucleotide excision repair enzyme. Individuals
with this inherited disease are extremely
susceptible to UV light, and develop skin cancer
very easily and usually die at a young age.
Xeroderma pigmentosum is just one example of
the many conditions in which people are born with defects in
one of the DNA repair system genes. Most of these
diseases lead to various types of cancers,
particularly many hereditary colorectal cancers, like
hereditary nonpolyposis colorectal cancer (HNPCC).
The breast cancer genes, BRCA1 and BRCA2, are
associated with DNA repair as well, and when
mutant lead to early onset breast and ovarian
cancers.
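
Individuals with Xeroderma pigmentosum cannot efficiently repair the UV lesions described earlier in this chapter, which form between adjacent pyrimidines on the same strand. As a rough illustration, the sketch below lists every pair of adjacent pyrimidines in a short sequence. The function name and the toy sequence are made up for this example. The scan only identifies candidate sites on one strand; whether a dimer actually forms at a site depends on factors such as UV dose that are not modelled here.

```python
PYRIMIDINES = {"C", "T"}


def dimer_candidate_sites(strand):
    """Return (index, dinucleotide) for each pair of adjacent pyrimidines.

    The index is the 0-based position of the first base of the pair.
    """
    strand = strand.upper()
    sites = []
    for i in range(len(strand) - 1):
        pair = strand[i:i + 2]
        if pair[0] in PYRIMIDINES and pair[1] in PYRIMIDINES:
            sites.append((i, pair))
    return sites


toy_strand = "GATTACCGTA"  # made-up sequence, for illustration only
for index, pair in dimer_candidate_sites(toy_strand):
    print(index, pair)
```

In this toy sequence the scan reports a TT pair and a CC pair. TT pairs correspond to the thymine dimers that the text describes as the most common pyrimidine dimers.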


SUMMARY:
• Variations in DNA sequence that originated recently, and are rare in a population, are called mutations.
• Variations in DNA sequence that co-exist in a population, none of which can be meaningfully defined
as wild type, are called polymorphisms.
• Mutations may either occur spontaneously, or may be induced by exposure to mutagens.
• Mutations may result in substitutions, deletions, insertions or chromosomal rearrangements.
• Spontaneous mutations arise from many sources including natural errors in DNA replication, usually
associated with base mispairing, or else insertion/deletion, especially within repetitive sequences.
Occasionally metabolites within a cell can catalyze spontaneous mutations.
• Transposable elements are dynamic, abundant components of eukaryotic genomes and important
forces in evolution.
• Induced mutations result from mispairing, DNA damage, or sequence interruptions caused by
chemical or physical mutagens.
• DNA repair systems can fail when DNA damage is so excessive that the cell cannot cope, or through loss of function
of the DNA repair genes themselves.
• When DNA repair systems fail, a cell will senesce, undergo apoptosis or become cancerous.
KEY TERMS:
mutation mispairing LINEs
mutant strand slippage SINEs
wild type loop Alu
polymorphism mutation hot spot copy-and-paste
lesion short-sequence repeats (SSRs) cut-and-paste
deletion microsatellites alkylating agent
insertion trinucleotide repeat diseases intercalating agent
substitution depurination EMS
spontaneous apurinic site benzo[a]pyrene
induced deamination carcinogenic
mutagen oxidative damage ethidium bromide
codon transposon-tagging aflatoxin
frame shift P element AFB1
premature stop codon T-DNA base analog
truncated transposable elements (TEs) 5-bromouracil
transition mobile genetic elements UV light
transversion Class I TE pyrimidine dimer
silent mutation retrotransposon DNA repair systems
missense mutation integrase senescence
conservative Class II TE apoptosis
non-conservative transposon cancer
nonsense mutation transposase DNA damage threshold
translocation reverse transcriptase error-prone DNA repair
inversion non-autonomous translesion polymerases
DNA replication error autonomous Xeroderma pigmentosum

STUDY QUESTIONS:
1) How are polymorphisms and mutations alike?
How are they different?
2) What are some of the ways a substitution can
occur in a DNA sequence?
3) What are some of the ways a deletion can
occur in a DNA sequence?
4) What are all of the ways an insertion can occur
in a DNA sequence?
5) In the context of this chapter, explain the
health hazards of smoking tobacco.
6) How was the first mutation in the white gene of
Drosophila, w¹, caused? (See Chapter 10).
7) Which types of transposable elements are
transcribed?


CHAPTER 12 – MUTATIONS: CONSEQUENCES

Figure 1.
A breed of cat, the Canadian
Sphynx, lacks hair due to a genetic
mutation. The Sphynx breed
originated in Minnesota, but the
Canadian Sphynx line was started
in Toronto in 1966 through a
selective breeding program from a
spontaneous mutation that gave
naked kittens. This mutation is
inherited in an autosomal recessive
manner for the hairlessness gene.
(Flickr- Weimar Meneses -CC BY
2.0)

INTRODUCTION
When we think of the word "mutation", we
automatically think of it as something negative or
detrimental. However, a mutation, which is a
change in the DNA sequence, may have one or
more effects on an organism, depending on what it
is and in which gene it occurs. While detrimental
effects are most common, sometimes mutations
can create new features. These mutations give us a
tool with which to investigate the gene and the
biological processes in which it is involved. In this
chapter we will first take a look at how scientists
perform genetic screening for mutations, and the
various consequences of those mutations.

1. GENETIC SCREENING FOR MUTATIONS:
FORWARD GENETICS, REVERSE GENETICS
Forward genetic screening refers to the process of
finding the gene or genes responsible for a certain
phenotype or biochemical process. One way to
identify genes that affect a particular biological
process is to induce random mutations in a large
population, and then look for mutants with
phenotypes that might be caused by a disruption of
a particular biochemical pathway. This is the
strategy of mutant screening, and has been used
very effectively to identify and understand the
molecular components of hundreds of different
biological processes. For example, to find the basic
biological processes of memory and learning,
researchers have screened mutagenized
populations of Drosophila to recover flies (or
larvae) that lack the normal ability to learn (yes,
Drosophila can learn). These mutants lack the ability to
associate a particular odor with an electric shock.
Because of the similarity of biology among all
organisms (common descent), some of the genes
identified by this mutant screen of a model
organism may be relevant to learning and memory
in humans, including conditions such as Alzheimer’s
disease.
On the other hand, reverse genetic screening
refers to the process of creating a mutation in a
gene, then identifying the phenotypic
consequences of that specific mutant gene on the
organism. This method is becoming more useful
with the advent of whole genome sequencing.
Here, we have identified the gene sequences, but
are unsure of what each gene does.


1.1. GENETIC SCREENS
In a typical mutant screen, researchers treat a
parental population with a mutagen. This may
involve soaking seeds in EMS, or mixing a mutagen
with the food fed to flies. Usually, no phenotypes
are visible among the individuals that are directly
exposed to the mutagen because in all the cells
every strand of DNA will be affected
independently. Thus, the induced mutations will be
heterozygous and limited to single cells.
However, what is most important to geneticists are
the mutations in the germline of the mutagenized
individuals. The germline is defined as the gametes
and any of their developmental precursors, and is
therefore distinct from the somatic cells (i.e.
non-reproductive cells) of the body. Because most
induced mutations are recessive, the progeny of
mutagenized individuals must be mated in a way
that allows the newly induced mutations to
become homozygous (or hemizygous). Strategies
for doing this vary between organisms. In any case,
the generation in which induced mutations are
expected to show a phenotype can be examined
for the presence of novel traits. Once a relevant
mutant has been identified, geneticists can begin
to make inferences about what the normal function
of the mutated gene is, based on its mutant
phenotype. This can then be investigated further
with molecular genetic techniques to connect the
gene function with the external appearance.

2. SOME MUTATIONS MAY NOT HAVE
DETECTABLE MUTANT PHENOTYPES
Not all DNA sequence changes result in mutant
phenotypes. Various reasons are described below.

2.1. SILENT CHANGES
After mutagen treatment, the vast majority of base
pair changes (especially substitutions) have no
obvious effect on the phenotype. Often, this is
because the change occurs in the DNA sequence of
a non-coding region of the DNA, such as in
intergenic regions (between genes) or within an
intron where the sequence does not code for
protein and is not essential for proper mRNA
splicing. Also, even if the change affects the coding
region, it may not alter the amino acid sequence
(recall that the genetic code is degenerate; for
example, GCT, GCC, GCA, and GCG all encode
alanine) and is referred to as a silent mutation.
Additionally, the base substitution may change an
amino acid, but this does not quantitatively or
qualitatively alter the function of the product, so
no phenotypic change would occur.

2.2. ENVIRONMENT AND GENETIC REDUNDANCY
There are situations where a mutation can cause a
complete loss of function of a gene, yet not
produce a change in the phenotype, even when the
mutant allele is homozygous. The lack of a visible
phenotypic change can be due to environmental
effects: the loss of that gene product may not be
apparent in that specific environment, but might in
another. An example is an auxotrophic mutant on
complete medium. Conversely, researchers can
alter the environment to reveal such mutants (e.g.
auxotrophs on minimal media).
Alternatively, the lack of a phenotype might be
attributed to genetic redundancy. That is, the
mutant gene’s lost function is compensated by
another gene, at another locus, encoding a
similarly functioning product. Thus, the loss of one
gene is compensated by the presence of another.
The concept of genetic redundancy is an important
consideration in genetic screens. A gene whose
function can be compensated for by another gene
cannot be easily identified in a genetic screen for
loss-of-function mutations.

2.3. ESSENTIAL GENES AND LETHAL ALLELES
Some mutants may be required to reach a particular
developmental stage before the phenotype can be
seen or scored. For example, flower color can only
be scored in plants that are mature enough to
make flowers, and eye color can only be scored in
flies that have developed to the adult stage.
However, some mutant organisms may not develop
sufficiently to reach a stage that can be scored for a
particular phenotype. Mutations in essential genes
create recessive lethal alleles that arrest or derail
the development of an individual at an immature
(embryonic, larval, or pupal) stage. This type of
mutation may therefore go unnoticed in a typical
mutant screen because affected individuals are absent from the


progeny being screened. Furthermore, the progeny
of a monohybrid cross involving an embryonic
lethal recessive allele may all be of a
single phenotypic class, giving a phenotypic ratio of
1:0 (which is the same as 3:0). In this case the
mutation may not be detected. Nevertheless, the
study of recessive lethal mutations (those in
essential genes) has elucidated many important
biochemical pathways.
An example is the identification of whole classes of
genes involved in early embryonic development.
Three Drosophila geneticists, Eric Wieschaus,
Edward Lewis, and Christiane Nüsslein-Volhard,
who were awarded a Nobel Prize (1995), identified
pair-rule, gap, and segment polarity genes that
have corresponding homologs in all segmented
organisms, including humans.

2.4. NAMING GENES
Many genes are first identified in mutant screens,
and so they tend to be named after their mutant
phenotypes, not the normal function or phenotype.
This can cause some confusion for students of
genetics. For example, we have already
encountered an X-linked gene named white in fruit
flies. Null mutants of the white gene have white
eyes, but flies with the normal white⁺ allele have red eyes.
This tells us that the wild type (normal) function of
this gene is required to make red eyes. We now
know its product is a protein that imports a
colourless pigment precursor into developing cells
of the eye. Why don’t we call it the “red” gene,
since that is what its product does? Because there
are more than one dozen genes that when mutant
alter the eye colour; e.g. violet, cinnabar, brown,
scarlet, etc. For all these genes, their function is
also needed to make the eye wild type red and not
the mutant colour. If we used the name “red” for
all these genes it would be confusing, so we use the
distinctive mutant phenotype as the gene name.
However, this can be problematic, as with the
“lethal” mutations described above. This problem
is usually handled by giving numbers or locations to
the gene name, or making up names that describe
how they die (e.g. even-skipped, hunchback, hairy,
runt, etc.).

3. EXAMPLE OF HUMAN MUTATIONS

3.1. CYSTIC FIBROSIS (CF) – AUTOSOMAL RECESSIVE
Cystic fibrosis (CF) is one of many diseases that
geneticists have shown to be primarily caused by
mutation in a single, well-characterized gene.
Cystic fibrosis is the most common (1/2,500)
life-limiting autosomal recessive disease among people
of European heritage, with ~1 in 25 people being
carriers. The frequency varies in different
populations. Most of the deaths caused by CF are
the result of lung disease, but many CF patients
also suffer from other disorders including infertility
and gastrointestinal disease. The disease is due to a
mutation in the CFTR (Cystic Fibrosis
Transmembrane Conductance Regulator) gene,
which was first identified by Lap-chee Tsui’s group
at the University of Toronto.

Figure 2.
Wild-type and mutant forms of CFTR in the cell
membrane. In wild-type, the CFTR ion channel is gated;
when activated by ATP, the channel opens and allows ions
to move across the membrane. In some CFTR mutants,
the channel does not open. This prevents the movement
of ions and water and allows mucus to build up on the
lung epithelium. (Wikipedia- Lbudd14- CC BY-SA 3.0--
modified)

Epithelial tissues in some organs rely on the CFTR
protein to transport ions (especially Cl-) across their
cell membranes. The passage of ions through a
six-sided channel is gated by another part of the CFTR
protein, which binds to ATP. If there is insufficient
activity of CFTR, an imbalance in ion concentration


results, which disrupts the properties of the liquid
layer that normally forms on the epithelial surface.
In the lungs, this causes mucus to accumulate and
can lead to infection. Defects in CFTR also affect
the pancreas, liver, intestines, and sweat glands, all of
which need this ion transport. CFTR is also
expressed at high levels in the salivary gland and
bladder, but defects in CFTR function do not cause
problems in these organs, probably because other
ion transporters are able to compensate.
Over one thousand different mutant alleles of CFTR
have been described. Any mutation that prevents
CFTR from sufficiently transporting ions can lead to
cystic fibrosis (CF). Worldwide, the most common
CFTR allele among CF patients is called ΔF508
(delta-F508; or PHE508DEL), which is a deletion of
three nucleotides that eliminates a phenylalanine
from position 508 of the 1480 aa wild-type protein.
Mutation ΔF508 causes CFTR to be folded
improperly in the endoplasmic reticulum (ER),
which then prevents CFTR from reaching the cell
membrane. ΔF508 accounts for approximately 70%
of CF cases in North America, with ~1/25 people of
European descent being carriers. The high
frequency of the ΔF508 allele has led to speculation
that it may confer some selective advantage to
heterozygotes, perhaps by reducing dehydration
during cholera epidemics, or by reducing
susceptibility to certain pathogens that bind to
epithelial membranes.
CF is also notable because it is one of the
well-characterized genetic diseases for which a drug has
been developed that compensates for the effects
of a specific mutation. The drug, Kalydeco
(Ivacaftor), was approved by the FDA and Health
Canada in 2012, decades after the CFTR gene was
first mapped to DNA markers (in 1985) and cloned
(in 1989). Kalydeco is effective on only some CFTR
mutations, most notably G551D (i.e. where glycine
is substituted by aspartic acid at position 551 of the
protein; GLY551ASP). This mutation is found in less
than 5% of CF patients. The G551D mutation
affects the ability of ATP to bind to CFTR and open
the channel for transport. Kalydeco compensates
for this mutation by binding to CFTR and holding it
in an open conformation. Kalydeco is expected to
cost approximately $250,000 per patient per year.
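
The incidence (1/2,500) and the carrier frequency (~1/25) quoted above for cystic fibrosis are related to each other. As a rough check, the sketch below estimates the carrier frequency from the incidence. It assumes Hardy-Weinberg proportions (random mating, no selection), an assumption that is not stated in the text, and it treats all CF alleles as a single recessive class. The function name is an illustrative choice.

```python
import math


def carrier_frequency(incidence):
    """Estimate heterozygote frequency from the incidence of a recessive disease.

    Assumes Hardy-Weinberg proportions: incidence = q**2, carriers = 2*p*q.
    """
    if not 0 < incidence < 1:
        raise ValueError("incidence must be between 0 and 1")
    q = math.sqrt(incidence)  # frequency of the disease allele
    p = 1 - q                 # frequency of the wild-type allele
    return 2 * p * q


cf_incidence = 1 / 2500       # value quoted in the text
carriers = carrier_frequency(cf_incidence)
print(f"estimated carrier frequency: {carriers:.4f} (about 1 in {1 / carriers:.1f})")
```

Under these assumptions the disease allele frequency is 1/50 and the estimated carrier frequency is close to 0.04, which is consistent with the figure of roughly 1 in 25 given in the text.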


___________________________________________________________________________
SUMMARY:
• Forward genetic screening aims to find the molecular basis for a certain phenotype whereas reverse
genetic screening aims to find the phenotypic effects that a gene might have on the organism.
• Somatic mutations occur in non-reproductive cells and affect the current individual, while germline
mutations occur in the gametes or their precursors and affect future generations rather than the individual.
• Mutations can alter both the level and the type of a gene’s expression.
• Not all base pair changes (mutations) cause detectable changes in an organism. The efficiency of
mutant screening is limited by silent mutations, redundancy, and embryonic lethality.
• Cystic Fibrosis is a genetic disease caused by mutations in the CFTR gene.
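
The silent mutations mentioned in the summary above follow from the degeneracy of the genetic code. As a sketch, the Python code below builds the standard genetic code table for DNA codons and classifies a single-base substitution within a codon as silent, missense, or nonsense. The function name `classify_codon_change` and the example codons are illustrative choices made here; the codon used for the glycine to aspartic acid example is not claimed to be the one at position 551 of CFTR.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one-letter amino acids in TCAG codon order ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(codon, position, new_base):
    """Classify a single-base substitution at a 0-based position in a codon."""
    codon, new_base = codon.upper(), new_base.upper()
    if codon not in CODON_TABLE or new_base not in BASES or position not in (0, 1, 2):
        raise ValueError("invalid codon, base, or position")
    mutant = codon[:position] + new_base + codon[position + 1:]
    if mutant == codon:
        raise ValueError("the new base equals the original base")
    before, after = CODON_TABLE[codon], CODON_TABLE[mutant]
    if before == after:
        return "silent"
    if after == "*":
        return "nonsense"
    if before == "*":
        return "stop-loss"  # a stop codon changed into a sense codon
    return "missense"


# The four alanine codons named in the text all map to 'A'.
print([CODON_TABLE["GC" + base] for base in BASES])
print(classify_codon_change("GCT", 2, "A"))  # GCT to GCA, alanine unchanged
print(classify_codon_change("GGT", 1, "A"))  # GGT to GAT, glycine to aspartic acid
print(classify_codon_change("TGG", 2, "A"))  # TGG to TGA, tryptophan to stop
```

Building the table from the ordered amino acid string keeps the example short; a dictionary written out codon by codon would work equally well.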
KEY TERMS:
mutant screen recessive lethal allele
loss-of-function double strand break
gain-of-function non-homologous end joining
null DNA repair system
dominant negative chromosome rearrangement
somatic cells CFTR
germline cells Cystic Fibrosis (CF)
silent mutation ΔF508 (PHE508DEL)
inter-genic region Kalydeco
redundancy
essential gene


STUDY QUESTIONS:
1) You have a female fruit fly, whose father was
exposed to a mutagen (she, herself, wasn’t).
Mating this female fly with another
non-mutagenized, wild type male produces
offspring that all appear to be completely
normal, except there are twice as many
daughters as sons in the F1 progeny of this
cross.
   a) Propose a hypothesis to explain these
   observations.
   b) How could you test your hypothesis?
2) You decide to use genetics to investigate how
your favourite plant makes its flowers smell
good.
   a) What steps will you take to identify some
   genes that are required for production of
   the sweet floral scent? Assume that this
   plant is a self-pollinating diploid.
   b) One of the recessive mutants you identified
   has fishy-smelling flowers, so you name the
   mutant (and the mutated gene) fishy. What
   do you hypothesize about the normal
   function of the wild-type fishy gene?
   c) Another recessive mutant lacks floral scent
   altogether, so you call it nosmell. What
   could you hypothesize about the normal
   function of this gene?
3) Suppose you are only interested in finding
dominant mutations that affect floral scent.
   a) What do you expect to be the relative
   frequency of dominant mutations, as
   compared to recessive mutations, and why?
   b) How will you design your screen differently
   than in the previous question, in order to
   detect dominant mutations specifically?
   c) Which kind of mutagen is most likely to
   produce dominant mutations, a mutagen
   that produces point mutations, or a
   mutagen that produces large deletions?
4) You are interested in finding genes involved in
synthesis of proline (Pro), an amino acid that is
normally synthesized by a particular model
organism.
   a) How would you design a mutant screen to
   identify genes required for Pro synthesis?
   b) Imagine that your screen identified ten
   mutants (#1 through #10) that grew poorly
   unless supplemented with Pro. How could
   you determine the number of different
   genes represented by these mutants?
   c) If each of the ten mutants represents a
   different gene, what will be the phenotype
   of the F1 progeny if any pair of the ten
   mutants are crossed?
   d) If each of the ten mutants represents the
   same gene, what will be the phenotype of
   the F1 progeny if any pair of the ten
   mutants are crossed?


CHAPTER 13 – ALLELES AT A SINGLE LOCUS

Figure 1.
A flower called Camellia showing co-dominance of the red
and white alleles of flower colour.
(Flickr- darwin cruz-CC BY 2.0)

INTRODUCTION

The previous chapter described the consequences of mutations. We will now use the mutant forms of
a gene to investigate the interactions of alleles at a single locus. This will begin with the
difference between somatic and germ line mutations. Then it will deal with simple
dominance/recessive relationships, which many students have encountered before. It will end with
more sophisticated interactions that can be described by “Muller’s Morphs”, which deal with the
interrelationships of mutant and wild type alleles at a more detailed level.

1. TERMINOLOGY

A specific section of a chromosome is called a locus. Because each gene occupies a specific locus
along a chromosome, the terms locus and gene are often used interchangeably. However, the term
“gene” is a much more general term, while “locus” usually is limited to defining the position
along a chromosome. Each locus will have an allelic form (allele); that is, a specific DNA
sequence. In a population of individuals there will be sequence variation so there will be
different alleles. Some may be defined as wild type, some as variants, others as mutant.

The complete set of alleles at all loci in an individual is its genotype. Typically, when writing
out a genotype, only the alleles at the locus (or loci) of interest are considered and written
down – all the others are still present and assumed to be wild type. So, typically only the
alleles at the few mutant loci appear in the written genotype. All the many, many others that are
wild type are not.

The visible or detectable effect of alleles on the structure or function of that individual is
called its phenotype – what it looks like. The phenotype studied in any particular genetic
experiment may range from simple, visible traits such as hair color, to more complex phenotypes
including disease susceptibility or behavior. If two alleles are present in an individual, as is
the case with diploid organisms, then various interactions between them may influence their
expression in the phenotype.

Figure 2.
Relationship between genotype and phenotype for an
allele that is completely dominant to another allele.
(Original-M. Deyholos -CC BY-NC 3.0)

Figure 3.
Patch of brown eye colour in a green eye.
(Wikipedia-Sheila.lorquiana-CC BY-SA 3.0)

2. SOMATIC VS. GERMLINE MUTATIONS
A mutation occurs in the DNA of a single cell. In single-cell organisms, that mutation is passed
on directly to its descendants, typically through the process of mitosis. In multicellular
animals, there is a partitioning early in development into somatic cells, which form the body
cells, and germline cells, which form the gametes for the next generation. Mutations may be passed
on to somatic cells via mitosis and to gametes via meiosis. In plants, this somatic/germline
separation occurs later, in the cells that form the flower.

2.1. SOMATIC MUTATIONS
Somatic cells form the tissues of the organism and are not passed on as gametes. Any mutations in
somatic cells will only affect the individual in which they occur, not its progeny. If a mutation
occurs in a somatic cell, its mutant descendants will exist alongside other non-mutant (wild type)
cells. If the mutation occurs at a very early stage of development, the mutation will be present
in more cells. This gives rise to an individual composed of two or more types of cells that differ
in their genetic composition. Such an individual is said to be a mosaic. An example is shown in
Figure 3. Cancer cells are another example of mosaicism.

2.2. GERMLINE MUTATIONS
Germline cells are those that form the eggs or sperm cells (ovum or pollen in plants), and are
passed on to form the next generation. Therefore, mutations in germline cells will be passed on to
the next generation but won’t affect the individual in which they occur.

In animals, somatic cells are segregated from germ line cells. In plants, somatic cells become
germline cells; so somatic mutations can become germline mutations.

2.3. HAPLOID VS. DIPLOID ORGANISMS
Haploid organisms have only one copy of a gene, thus a mutation will directly affect the
organism’s phenotype. Therefore, the phenotype can be used to directly infer the genotype of the
organism. However, in diploid organisms, there are two copies of each gene. The phenotype depends
upon an interaction between the two alleles. Thus, any mutation may not have a direct impact on
the organism’s phenotype. The interaction of the two alleles can show complete dominance,
incomplete dominance, co-dominance, or recessiveness. Therefore, inferring the genotype based upon
its phenotype is not as simple as in haploids.

3. ALLELES: HETERO-, HOMO-, HEMIZYGOSITY
Mendel’s First Law (segregation of alleles) is especially remarkable because he made his
observations and conclusions (1865) without knowing about the relationships between genes,
chromosomes, and DNA. We now know the reason why more than one allele of a gene can be present
in an individual: most eukaryotic organisms are diploid and have at least two sets of homologous
chromosomes. For organisms that are predominantly diploid, such as humans or Mendel’s
peas, chromosomes exist as pairs, with one copy


inherited from each parent. Diploid cells therefore can contain two different alleles of each
gene, with one allele part of each member of a pair of homologous chromosomes. If both alleles of
a particular gene are the same (indistinguishable), the individual is said to be homozygous at
that gene or locus. On the other hand, if the alleles are different (can be distinguished) from
each other, the genotype is heterozygous. In cases where there is only one copy of a gene present,
for example if there is a deletion of the locus on the homologous chromosome, we use the term
hemizygous. Another example is the single X chromosome in X/Y males, where almost all the loci on
that chromosome are hemizygous. (The exception is the pseudo-autosomal region – see the chapter on
sex chromosomes.)

Although a single diploid individual can have at most two different alleles of a particular gene,
many more alleles can exist in a population of individuals. In a natural population the most
common allelic form is usually called the wildtype allele. However, in many populations there can
be multiple variants at the DNA sequence level that are visibly indistinguishable as all exhibit a
normal, wild type appearance. There can also be various mutant alleles (in wild populations and in
lab strains) that vary from wild type in their appearance, each with a different change at the
DNA sequence level. The many different mutations (alleles) at the same locus are called an allelic
series for a locus.

4. PLEIOTROPY AND POLYGENIC INHERITANCE

There is usually not a one-to-one correspondence between a gene and a physical characteristic.
Often a gene is responsible for several phenotypic traits and it is said to be pleiotropic. For
example, mutations in the vestigial gene (vg) in Drosophila result in an easily visible short wing
phenotype. However, mutations in this gene also affect the number of egg strings, position of the
bristles on the scutellum, and lifespan in Drosophila. Therefore, the vg gene is said to be
pleiotropic in that it affects many different phenotypic characteristics.

The opposite is also found. Single characteristics can be affected by mutations in multiple,
different genes. This implies that many genes are needed to make each characteristic. For example,
if we return to the Drosophila wing, there are dozens of genes that when mutant alter the normal
shape of the wing, not just the vg locus. Thus there are many genes that are needed to make a
normal wing; the mutation of any one causes an abnormal, mutant, phenotype. This type of
arrangement is called polygenic inheritance.

5. COMPLETE DOMINANCE AND RECESSIVE

An example of a simple phenotype is flower color in Mendel’s peas. We have already said that one
allele as a homozygote produces purple flowers, while the other allele as a homozygote produces
white flowers (Figure 2). But what about a heterozygous individual that has one purple allele
and one white allele? What is the phenotype of a heterozygote?

This can only be determined by experimental observation. We know from observation that
individuals heterozygous for the purple and white alleles of the flower color gene have purple
flowers. Thus, the allele associated with purple color is said to be dominant to the allele
that produces the white color. The white allele, whose phenotype is masked by the purple allele in
a heterozygote, is recessive to the purple allele. The dominant/recessive character is a
relationship between two alleles and must be determined by observation of the heterozygote
phenotype.

Sometimes, to represent this relationship, a dominant allele will be written as a capital letter
(e.g. A) while a recessive allele will be written in lower case (e.g. a). However, this is not the
only system. Many different systems of genetic symbols are in use. The most common are shown in
Table 1. Also note that genotypes (alleles) are usually written in italics and chromosomes and
proteins are not. For example, the white gene in Drosophila melanogaster on the X chromosome
encodes a protein called WHITE, which is a pigment precursor transmembrane transporter enzyme.
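
The reasoning about complete dominance can be made concrete with a small sketch. The allele symbols A (purple) and a (white), and the function names, are illustrative assumptions and not part of the original text. The code enumerates the gamete combinations of a cross and then applies the rule that a single dominant allele is enough to give the dominant phenotype.

```python
from collections import Counter
from itertools import product


def phenotype(genotype, dominant="A"):
    """Phenotype of a one-locus genotype under complete dominance.

    Assumes the hypothetical symbols "A" (purple, dominant) and "a" (white).
    """
    return "purple" if dominant in genotype else "white"


def cross(parent1, parent2):
    """Count offspring genotypes from every combination of parental gametes."""
    offspring = Counter()
    for g1, g2 in product(parent1, parent2):
        # Sort so that "aA" and "Aa" are counted as the same heterozygote.
        offspring["".join(sorted((g1, g2)))] += 1
    return offspring


f2 = cross("Aa", "Aa")
print(dict(f2))
print(dict(Counter(phenotype(g) for g in f2.elements())))
```

Crossing two heterozygotes in this sketch gives genotypes in a 1:2:1 ratio but phenotypes in a 3:1 ratio, because the heterozygote cannot be told apart from the dominant homozygote by appearance alone.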


Table 1. Examples of symbols used to represent genes and alleles.
Examples | Interpretation
A and a | Uppercase letters represent dominant alleles and lowercase letters indicate recessive alleles. Mendel invented this system but it is not commonly used because not all alleles show complete dominance and many genes have more than two alleles.
a+ and a1 | Superscripts or subscripts are used to indicate alleles. For wild type alleles the symbol is a superscript +.
AA or A/A | Sometimes a forward slash is used to indicate that the two symbols are alleles of the same gene locus, but on homologous chromosomes.

6. INCOMPLETE DOMINANCE

Besides the complete dominant and recessive relationship, other relationships can exist between
alleles. In incomplete dominance (also called semi-dominance), both alleles affect the trait
additively, and the phenotype of the heterozygote is typically intermediate between those of the
homozygotes, which is often referred to as a blended phenotype. For example, alleles for color in
carnation flowers (and many other species) exhibit incomplete dominance. Plants with an allele for
red petals (A1) and an allele for white petals (A2) have pink petals. We say that the A1 and the
A2 alleles show incomplete dominance because neither allele is completely dominant over the other
(Figure 4).

Figure 4.
Relationship between genotype and phenotype for
incompletely dominant alleles affecting petal colour in
carnations.
(Original-Deyholos- CC BY-NC 3.0)

7. CO-DOMINANCE

Co-dominance is another type of allelic relationship in which a heterozygous individual expresses
the phenotype of both alleles simultaneously. An example of co-dominance is found within the ABO
blood group of humans. The ABO gene has three common alleles that were named (for historical
reasons) IA, IB, and i. People homozygous for IA or IB display only A or B type antigens,
respectively, on the surface of their blood cells, and therefore have either type A or type B
blood (Figure 5). Heterozygous IA/IB individuals have both A and B antigens on their cells, and so
have type AB blood. Note that the heterozygote expresses both alleles simultaneously, and is not
some kind of novel intermediate between A and B. Co-dominance is therefore distinct from
incomplete dominance, although they are sometimes confused.

It is also important to note that the third allele, i, does not make either antigen and thus is
recessive to the other alleles. IA/i or IB/i individuals display only A or B antigens,
respectively. People homozygous for the i allele have type O blood.

Figure 5.
Relationship between genotype and phenotype for three
alleles of the human ABO gene. The IA and IB alleles show
co-dominance. The IA allele is completely dominant to the
i allele. The IB allele is completely dominant to the i allele.
(Wikipedia –Modified by Deyholos - CC BY-NC 3.0)

This is a useful reminder that different types of dominance relationships can exist, even for
alleles of the same gene. Many types of molecular markers, which we will discuss in a later
chapter, display a co-dominant relationship among alleles.
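
The ABO relationships described above can be summarised in a short sketch. The allele strings "IA", "IB" and "i" follow the text, while the lookup table and function name are illustrative assumptions. Each allele contributes the antigen it encodes (i contributes none), and the blood type is simply the set of antigens present.

```python
# Antigen made by each allele; the i allele makes neither antigen.
ANTIGEN = {"IA": "A", "IB": "B", "i": ""}


def blood_type(allele1, allele2):
    """Return the ABO blood type for a pair of alleles."""
    antigens = {ANTIGEN[allele1], ANTIGEN[allele2]} - {""}
    # Both antigens present -> "AB" (co-dominance); none present -> type O.
    return "".join(sorted(antigens)) or "O"


for pair in [("IA", "IA"), ("IA", "i"), ("IB", "i"), ("IA", "IB"), ("i", "i")]:
    print("/".join(pair), "->", blood_type(*pair))
```

In this sketch co-dominance of IA and IB and the recessiveness of i both fall out of the same rule, which mirrors the point that dominance is a property of a pair of alleles rather than of a single allele.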


Another example of co-dominance is shown in the first figure of this chapter – flower colour in
Camellia sp.

8. BIOCHEMICAL BASIS OF DOMINANCE

Given that a heterozygote’s phenotype cannot simply be predicted from the phenotype of
homozygotes, what does the type of dominance tell us about the biochemical nature of the gene
product? How does dominance work at the biochemical level? There are several different
biochemical mechanisms that may make one allele dominant to another.

For the majority of genes studied, the normal (i.e. wild-type) alleles are haplo-sufficient. So in
diploids, even with a mutation that causes a complete loss of function in one allele, the other
allele, a wild-type allele, will provide sufficient normal biochemical activity to yield a wild
type phenotype and thus be dominant and dictate the heterozygote phenotype.

On the other hand, in some biochemical pathways, a single wild-type allele does not make enough
protein and is haplo-insufficient: it cannot produce enough biochemical activity to result in a
normal phenotype when heterozygous with a non-functioning mutant allele. In this case, the
non-functional mutant allele will be dominant (or semi-dominant) to a wild-type allele.

Mutant alleles may also encode products that have new and/or different biochemical activities
instead of, or in addition to, the normal ones. These novel activities could cause a new phenotype
that would be dominantly expressed.

9. MUTANT CLASSIFICATION

9.1. MORPHOLOGICAL
Morphological mutations cause changes in the visible form of the organism. An example could be
a change in size, shape, colour, number etc.

9.2. LETHAL
A lethal mutation causes the premature death of an organism. For example, in Drosophila lethal
mutations can result in death during the embryonic, larval, or pupal stage. Lethal mutations
are usually recessive, so both copies of a gene have to be lost for the premature death to occur
(homozygous lethal alleles will not be viable). Heterozygotes which have one lethal allele and one
wild type allele are typically viable.

9.3. BIOCHEMICAL
Auxotrophic mutants can be derived from prototrophic parents. This type of mutation blocks
a step in a biochemical pathway as discussed for the arg- mutants of Beadle and Tatum in the
chapter on biochemical pathways. Such biochemical mutations are a specific type of the
conditional mutation class (next).

9.4. CONDITIONAL
Conditional mutations rely on the concept of:
phenotype = genotype + environment + interaction.
Organisms with this kind of mutation express a mutant phenotype, but only under specific
environmental conditions. Under restrictive conditions, they express the mutant phenotype
while under permissive conditions, they show a wild type phenotype. One example of a conditional
mutation is the temperature-sensitive pigmentation of Siamese cats. Siamese cats have
temperature sensitive fur colour; their fur appears unpigmented (light coloured) when grown in a
warm environment. The hair appears pigmented (dark) when grown at a cooler temperature. This is
seen at the peripheral regions of the feet, snout, and ears (Figure 6). This is because at warm
temperatures, the enzyme that is needed for melanin pigment synthesis becomes nonfunctional.
However, at cooler temperatures, the enzyme needed for melanin synthesis is functional and the
deposition of melanin makes the fur look dark.

Figure 6.
Siamese cats have temperature sensitive pigmentation
due to genetic mutation. (Wikimedia-Telekokopelli-CC BY-SA 3.0)

10. MULLER’S MORPHS

Exposure of an organism to a mutagen causes mutations in essentially random positions along the
chromosomes. Consequently, most of the mutant phenotypes recovered from a genetic screen are
caused by loss-of-function mutations. These alleles are due to random changes in the DNA
sequence that cause a gene to produce less or no active protein, compared to the wild-type allele.
Loss-of-function alleles tend to be recessive because the wildtype allele is haplo-sufficient. A
loss-of-function allele that produces no active protein is called an amorph, or null. On the other
hand, alleles with only a partial loss-of-function are called hypomorphic. More rarely, a mutant
allele may have a gain-of-function, producing either more of the active protein (hypermorph) or
producing an active protein with a new and different function (neomorph). Finally, antimorph
alleles have an activity that is dominant and opposite to the wild-type function; antimorphs are
also known as dominant negative mutations.

Thus, mutations (changes in a gene sequence) can result in mutant alleles that no longer produce
the same level or type of active product as the wild-type allele. Any mutant allele can be
classified into one of five types: (1) amorph, (2) hypomorph, (3) hypermorph, (4) neomorph, and
(5) antimorph.

10.1. AMORPH

Amorphic alleles have a complete loss-of-function. They make no active product – zero function.
They are known as a “Null” mutation or a “loss-of-function” mutation.

Molecular explanation - Changes in the DNA base pair sequence of an amorphic allele may cause one
or more of the following:

(1) Gene deletion - The DNA sequence is removed from the chromosome.
(2) Gene is present, but is not transcribed because of a sequence change in the promoter or
enhancer/regulatory elements.
(3) Gene is present but the transcript is aberrantly processed. There is normal transcription but
base pair changes cause the introns to be spliced incorrectly from the mRNA, therefore the
translated amino acid sequence would be altered and nonfunctional.
(4) Gene is present and a transcript is produced but no translation occurs – changes in the base
pair sequences would preclude the mRNA from binding to the ribosome for translation.
(5) Gene is present and a transcript is produced and translated but a nonfunctional protein
product is produced – the mutation alters a key amino acid in the polypeptide sequence
producing a completely non-functional polypeptide.

Genetic/phenotypic explanation - Amorphic mutations of most genes usually act as recessive to
wild type (case #1). However, with some genes the amorphic mutations are dominant to wild type
(case #2).

case #1: white gene in Drosophila
w+/w+  wildtype and red eyed
w+/w-  wildtype and red eyed
w-/w-  mutant and white eyed

case #2: Minute locus in Drosophila
M+/M+  wildtype and long bristled
M+/M-  mutant and short bristled
M-/M-  dead, recessive lethal

For the Minute gene, we conclude that the organism needs both copies to have a wild type
phenotype. Loss of one copy (an amorphic mutation) produces a dominant visible mutant
phenotype. Deletion of the gene is an example of a classic amorphic mutation.

10.2. HYPOMORPH
Hypomorphic alleles show only a partial loss-of-function. These alleles are sometimes referred to
as “leaky” mutations, because they provide some function, but not complete, normal function.

Molecular explanation - Changes in the DNA base
pair sequence of the hypomorphic allele may cause


one or more of the following, with the gene still being present:
(1) reduced transcription – changed DNA sequence in the promoter or enhancer/regulatory
elements can reduce the level of transcription.
(2) aberrant processing of the transcript – normal transcription but base pair changes cause the
introns to be spliced incorrectly from the mRNA, therefore the translated protein sequence
would be altered and function at a reduced level.
(3) reduced translation – changes in the base pair sequences would reduce the efficiency of the
mRNA binding to the ribosome for translation.
(4) reduced-function protein product – normal transcription, processing, and translation but the
mutation changes certain amino acids in the polypeptide sequence so its function is reduced.

Genetic/phenotypic explanation - Hypomorphic mutations of most genes usually act as recessive to
wild type, though hypomorphic mutations theoretically could be dominant to wildtype.

white-apricot (wa) allele in Drosophila
w+/w+  wildtype and red eyed
w+/wa  wildtype and red eyed
wa/wa  mutant and apricot eye colour

Both amorphs and hypomorphs tend to be recessive to wild type in diploids because the wild
type allele is usually able to supply sufficient product to produce a wild type phenotype (called
haplo-sufficient). If a single wild type allele is not able to produce a wild type phenotype, then
it is haplo-insufficient, and the mutant allele will be dominant to the wild type allele. Here -/+
heterozygotes produce a mutant phenotype.

While the first two classes involve a loss-of-function, the next two involve a gain-of-function –
quantity or quality. Gain-of-function alleles are almost always dominant to the wild type allele.

10.3. HYPERMORPH
Hypermorphic alleles produce quantitatively more of the same, active product.

Molecular explanation - Changes in the DNA base pair sequence of the hypermorphic allele may
cause one or more of the following, with the gene still being present:
(1) increased transcription – changed DNA sequence in the promoter or
enhancer/regulatory elements that increase the level of transcription.
(2) increased translation – changes in the base pair sequences would increase the efficiency of
the mRNA binding to the ribosome for translation.
(3) increased function protein product – normal transcription, processing, translation but base
pair changes alter certain amino acids in the polypeptide sequence so its function is normal
but increased in amount.

Genetic/phenotypic explanation - Hypermorphic mutations of most genes usually act as dominant to
wild type since they are a gain of function. The classic hypermorph is a gene duplication.

10.4. NEOMORPH
Neomorphic alleles produce a product with a new, different function, something that the wild type
allele does not do.

Molecular explanation - Changes in the DNA base pair sequence of the neomorphic allele may cause
one or more of the following, with the gene still being present:

(1) new transcription – changed DNA sequence in the promoter or enhancer/regulatory elements
that makes new transcription either temporally or in a tissue-specific manner.
(2) new function protein product – normal transcription, processing, translation but base
pair changes alter certain amino acids in the polypeptide sequence so it acquires a new
function (activity) that is different from the normal function (e.g. additional substrate or
new binding site).


Genetic/phenotypic explanation – Most
neomorphic mutations act as dominant to wild
type since they are a gain-of-function. The classical
neomorphic mutation is a translocation that moves
a new regulatory element next to a gene promoter
so it is expressed in a new tissue or at a new time
during development. Such mutations are often
produced when chromosome breaks are rejoined
and the regulatory sequences of one gene are
juxtaposed next to the transcriptional unit of
another, creating a novel, chimeric gene.

10.5. ANTIMORPH
Antimorphic alleles are relatively rare, and have a
new activity that is dominant and opposite to the
wildtype function. These alleles usually interfere
with the function from the wild type allele. (They
often lose their normal function as well.) The new
function works against the normal expression of
the wild type allele. This can happen at the
transcriptional, translational, or later level of
expression. Thus, when an antimorphic allele is
heterozygous with wild type, the wild type allele
function is reduced or prevented. At the molecular
level, there are many ways this can happen. The
simplest model to explain an antimorphic effect is
that the protein acts as a dimer (or any multimer)
and the inclusion of a mutant subunit poisons the
whole complex, thereby preventing or reducing its
level of function. Antimorphs are also known as
dominant-negative mutations because they are
usually dominant and act negatively against the
wild type function.

10.6. IDENTIFYING MULLER’S MORPHS

All mutations can be sorted into one of the five
morphs based on how they behave when
heterozygous with three other standard alleles
(Figure 7): (1) deletion alleles (zero function), (2)
wild type alleles (normal function), and (3)
duplication alleles (double normal function).

Figure 7.
Five classes of mutants designated as morphs (forms) by a
Nobel prize winner, H.J. Muller, which are known as
Muller’s Morphs. (Original-Locke- CC BY-NC 3.0)
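
The quantitative part of this classification can be sketched as a simple dosage model. This is only an illustration under stated assumptions: allele activity is treated as a single number with wild type set to 1.0, and a normal phenotype is assumed to require the summed activity of both alleles to reach a threshold. The function names and the threshold values are hypothetical. Neomorphs and antimorphs are left out because they change the kind of activity rather than its amount.

```python
def classify_by_activity(activity, wild_type=1.0):
    """Classify an allele by how much normal product it makes (relative units)."""
    if activity == 0:
        return "amorph"
    if activity < wild_type:
        return "hypomorph"
    if activity > wild_type:
        return "hypermorph"
    return "wild type"


def heterozygote_is_wild_type(mutant_activity, threshold, wild_type=1.0):
    """Assumed threshold model: normal phenotype if total activity >= threshold."""
    return mutant_activity + wild_type >= threshold


# Haplo-sufficient gene: one wild type copy is enough, so the amorph is recessive.
print(heterozygote_is_wild_type(0.0, threshold=1.0))
# Haplo-insufficient gene: one copy is not enough, so the amorph is dominant.
print(heterozygote_is_wild_type(0.0, threshold=1.5))
```

Under these assumptions the same amorphic allele behaves as recessive or dominant depending only on how much product the gene needs, which matches the contrast between the white and Minute examples.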


___________________________________________________________________________
SUMMARY:
• Symbols are used to denote the alleles, or genotype, of a locus.
• Phenotype depends on the alleles that are present, their dominance relationships, and sometimes also
interactions with the environment and other factors.
• A somatic mutation affects the individual but not the progeny, whereas a germline mutation affects
the progeny in the next generation but not the individual in which it occurs.
• In a diploid organism, alleles can be homozygous, heterozygous or hemizygous.
• Allelic interactions at a locus can be described as dominant vs. recessive, incomplete dominance, or co-
dominance.
• Muller's morphs classify all types of mutations including: amorph, hypomorph, hypermorph,
neomorph, and antimorph.
KEY TERMS:
homozygous co-dominance
heterozygous ABO blood group
hemizygous haplosufficiency
wild-type haploinsufficiency
variant loss-of-function
locus gain-of-function
genotype amorph
phenotype null
dominant hypomorph
recessive hypermorph
complete dominance neomorph
incomplete (semi) dominance


STUDY QUESTIONS:
1) Distinguish amongst the following terms: (1) gene, (2) locus, (3) allele, (4) transcription unit.
2) A flower geneticist crosses a red flowered diploid plant with a white flowered diploid plant
and all the progeny are red. Use two different forms of symbols to show this cross and its
progeny. What if all the progeny were pink?
3) If your blood type is B, what are the possible genotypes of your parents at the locus that
controls the ABO blood types?
4) In the table below, match the mouse hair color phenotypes with the term from the list that
best explains the observed phenotype, given the genotypes shown. In this case, the allele
symbols do not imply anything about the dominance relationships between the alleles.
List of terms:
haplo-sufficiency,
haplo-insufficiency,
pleiotropy,
incomplete dominance,
co-dominance,
incomplete penetrance,
broad (variable) expressivity.
5) In this hypothetical example of Drosophila bristle mutations, when various, true-breeding
mutant strains (all at a single locus) are crossed to a wild type strain the following phenotypes
are observed in the progeny:
Mutant#1 = bristles 20% shorter
Mutant#2 = bristles 30% longer
Mutant#3 = bristles 50% shorter
Mutant#4 = bristles kinked and misshapen
Mutant#5 = bristles are missing
What is the best characterization, using Muller’s Morphs, for each?

Table for Question 4

  | A1A1 | A1A2 | A2A2
1 | all hairs black | on the same individual: 50% of hairs are all black and 50% of hairs are all white | all hairs white
2 | all hairs black | all hairs are the same shade of grey | all hairs white
3 | all hairs black | all hairs black | 50% of individuals have all white hairs and 50% of individuals have all black hairs
4 | all hairs black | all hairs black | mice have no hair
5 | all hairs black | all hairs white | all hairs white
6 | all hairs black | all hairs black | all hairs white
7 | all hairs black | all hairs black | hairs are a wide range of shades of grey


CHAPTER 14 – MITOSIS AND THE CELL CYCLE

Figure 1.
Confocal micrograph of human cells showing
the stages of cell division. DNA is stained
blue, microtubules stained green and
kinetochores stained pink. Starting from the
top and going clockwise you see an
interphase cell with DNA in the nucleus. In
the next cell, the nucleus dissolves and
chromosomes condense in prophase. The
next is prometaphase where microtubules
are starting to attach, but the chromosomes
haven’t aligned. Next is metaphase where the
chromosomes are all attached to
microtubules and aligned on the metaphase
plate. The next two are early and late
anaphase, as the chromosomes start
separating to their respective poles. Finally
there is telophase where the cells are
completing division to be two daughter cells.
(Flickr-M. Daniels; Wellcome Images- CC BY-NC-ND 2.0)

INTRODUCTION

Cell growth and division are essential to asexual reproduction and the development of
multicellular organisms. The transmission of genetic information is accomplished in a cellular
process called Mitosis. This process ensures that a cell divides with each daughter cell
inheriting identical genetic material, i.e. exactly one copy of each chromosome present in the
parental cell.

1. FOUR STAGES OF A TYPICAL CELL CYCLE

The life cycle of eukaryotic cells can generally be divided into four stages (and a typical cell
cycle is shown in Figure 2). When a cell is produced through fertilization or cell division it
normally goes through four main stages: G1, S, G2, and M. The first stage is a lag period called
Gap 1 (G1), which is the first part of interphase. This is where the cell does its normal cellular
functions and it grows in size, particularly after mitosis when the daughters are half the size of
the mother cell. This stage ends with the onset of the DNA synthesis (S) phase, during which each
chromosome is replicated (For more information on DNA replication, see the chapter on DNA and
chromosome replication). Though the chromosomes are not condensed yet, because S phase is still
part of interphase, they are replicated as two sister chromatids attached at the centromere. Still
in interphase and following replication, there is another lag phase, called Gap 2 (G2). In G2, the
cell continues to grow and acquire the proteins necessary for cell division. There is a checkpoint
stage, where, if there are any problems with replication or acquiring the needed proteins, the
cell cycle will arrest until the cell either fixes the problem or dies. The final stage is mitosis
(M), where the cell undergoes cell division as is described in the last section.

Many variants of this generalized cell cycle also exist. Cells undergoing meiosis do not usually
have a G2 phase. Some cells, such as hematopoietic stem cells, which are found in the bone marrow
and produce all the other blood cells, will consistently go through these phases as they are
constantly replicating. Other cells, as in the nervous system, will no longer divide. These cells
never leave G1 phase, and are said to enter a permanent, non-dividing stage called G0. On the
other hand some cells, like the larval tissues in Drosophila, undergo many rounds of DNA synthesis
(S) without any mitosis or cell division, leading to


endoreduplication (see Chapter 2). Understanding the control of the cell cycle is an active area of
research, particularly because of the relationship between cell division and cancer.

2. MITOSIS
During the S-phase of interphase the chromosomes replicate so that each chromosome has two sister
chromatids attached at the centromere. After S-phase and G2, the cell enters mitosis. The first step
in mitosis is prophase, where the nuclear envelope breaks down and the replicated chromosomes
condense into the visible structures we associate with chromosomes. Next is metaphase, where the
microtubules attach to the kinetochore and the chromosomes align along the middle of the dividing
cell, known as the metaphase plate. The kinetochore is the region on the chromosome where the
microtubules attach. It forms at the centromere and contains proteins that help the microtubules
bind. Then in anaphase, the sister chromatids of each chromosome are pulled towards opposite poles
of the dividing cell. Finally in telophase, identical sets of unreplicated chromosomes (single
chromatids) are completely separated from each other into the two daughter cells, and the nucleus
re-forms around each of the two sets of chromosomes. Following this is the partitioning of the
cytoplasm (cytokinesis) to complete the process and make two identical daughter cells. Figure 1 and
Figure 3 show real pictures and a cartoon schematic of the process, respectively.

You should note that this is a dynamic and ongoing process, and cells don’t just jump from one stage
to the next. When looking at snapshots of real cells, you will more often see cells between two
stages, as in some of the images in Figure 1.

An acronym to remember the main stages of mitosis is iPMAT, where the little i stands for
interphase, which is described above.

In contrast, meiosis, which may appear similar, is a very different process. Read through Chapter 16
and try to identify the similarities and differences between the two processes.

Figure 2.
Stages of the cell cycle. The outer ring identifies when a cell is in interphase (I) and when it is
in mitosis (M). The inner ring identifies the four major stages. Cells can enter G0 if they are not
actively undergoing cell division, and may re-enter the cell cycle at a later time.
(Wikimedia Commons - R. Wheeler - CC BY-SA 3.0)

Interphase Prophase Metaphase Anaphase Telophase Cytokinesis

Figure 3.
A cartoon diagram showing the main stages of Mitosis. (Original-M. Deyholos/L. Canham-CC:AN)


[Figure 4 labels: fertilization; gap phase (G1); synthesis (S); gap phase (G2); mitosis; ploidy and
DNA content labels 1N, 1C; 2N, 2C; 2N, 4C]
Figure 4.
Changes in DNA and chromosome content during the cell cycle and mitosis. For simplicity, nuclear membranes are not shown,
and all chromosomes are represented in a similar stage of condensation. (Original-M. Deyholos/L. Canham- CC BY-NC 3.0)

3. MEASURES OF DNA CONTENT AND CHROMOSOME CONTENT
The amount of DNA within a cell changes during the following events: fertilization, DNA synthesis
and mitosis (Figure 4). We use “c” (or C) to represent the DNA content in a cell, and “n” (or N) to
represent the number of complete sets of chromosomes. In a haploid gamete (i.e. sperm or egg), the
amount of DNA is 1c, and the number of chromosomes is 1n. Upon fertilization, both the DNA content
and the number of chromosomes in the diploid zygote double to 2c and 2n, respectively. Following DNA
replication, the DNA content doubles again to 4c, but each pair of sister chromatids is still
attached at the centromere, and so is still counted as a single chromosome (a replicated
chromosome), so the number of chromosomes remains unchanged at 2n. If the cell undergoes mitosis,
each daughter cell will return to 2c and 2n, because it will receive half of the DNA, and one of
each pair of sister chromatids.

3.1. THE C-VALUE OF THE NUCLEAR GENOME
The complete set of DNA within the nucleus of any organism is called its nuclear genome and is
measured as the C-value in units of either the number of base pairs or picograms of DNA. There is a
general correlation between the nuclear DNA content of a genome (i.e. the C-value) and the physical
size or complexity of an organism. Compare E. coli and humans, for example, in Table 1. There are,
however, many exceptions to this generalization: the human genome contains only 3.2 x 10^9 DNA
bases, while the wheat genome contains 17 x 10^9 DNA bases, almost 6 times as much. The Marbled
Lungfish (Protopterus aethiopicus - Figure 5) contains ~133 x 10^9 DNA bases (~45 times as much as a
human), and a fresh water amoeboid, Polychaos dubium, has as much as 670 x 10^9 bases (200x a
human).

Figure 5.
Marbled Lungfish (Protopterus aethiopicus) has a genome of ~133 x 10^9 base pairs, which is ~45X
that of a human. It is an example of the C-value paradox.
(Wikipedia-OpenCage- CC BY 2.5)
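The bookkeeping of c and n described in Section 3 can be sketched in a few lines of Python. This is
only an illustration: the Cell class and the function names are hypothetical, chosen for this
example, and the model ignores the exceptions (meiosis, endoreduplication) mentioned elsewhere in
the chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    n: int  # number of complete chromosome sets
    c: int  # DNA content, in multiples of the haploid (gamete) amount


def fertilize(egg: Cell, sperm: Cell) -> Cell:
    """Fusion of two gametes adds their chromosome sets and their DNA content."""
    return Cell(n=egg.n + sperm.n, c=egg.c + sperm.c)


def replicate(cell: Cell) -> Cell:
    """S phase doubles the DNA; sister chromatids stay joined, so n is unchanged."""
    return Cell(n=cell.n, c=cell.c * 2)


def mitosis(cell: Cell) -> tuple[Cell, Cell]:
    """Each daughter receives one chromatid from every replicated chromosome."""
    daughter = Cell(n=cell.n, c=cell.c // 2)
    return daughter, daughter


gamete = Cell(n=1, c=1)
zygote = fertilize(gamete, gamete)
after_s = replicate(zygote)
daughter_1, daughter_2 = mitosis(after_s)

for label, cell in [
    ("gamete", gamete),
    ("zygote before S phase", zygote),
    ("zygote after S phase", after_s),
    ("daughter cell", daughter_1),
]:
    print(f"{label}: {cell.n}n, {cell.c}c")
```

Only replicate() changes c without changing n, which is the point made in the text: a replicated
chromosome still counts as one chromosome.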


3.2. THE C-VALUE PARADOX
This apparent paradox (called the C-value paradox) can be explained by the fact that not all nuclear
DNA encodes genes – much of the DNA in larger genomes is non-gene coding. In fact, in many
organisms, genes are separated from each other by long stretches of DNA that do not code for genes
or any other genetic information. Much of this “non-gene” DNA consists of transposable elements of
various types, which are an interesting class of self-replicating DNA elements discussed in
Chapter 30. Other non-gene DNA includes short, highly repetitive sequences of various types.
Together, this non-functional DNA is often referred to as “Junk DNA”.

3.3. THE “ONION TEST”
This “test” deals with any proposed explanation for the function(s) of non-coding (junk) DNA. For
any proposed function for the excess of DNA in eukaryote genomes (C-value paradox), can it “explain
why an onion needs about five times more non-coding DNA for this function than a human?” The onion
Allium cepa has a haploid genome size of ~17 pg, while humans have only ~3.5 pg. Why? Also, onion
species range from 7 to 31.5 pg, so why is there this range of genome size in organisms of similar
complexity?
See (http://www.genomicron.evolverzone.com/2007/04/onion-test/) for details.
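The onion test quotes genome sizes in picograms, while the rest of the chapter uses base pairs. A
minimal sketch of the conversion follows. It assumes the commonly used factor of roughly 978 Mb per
picogram of DNA, which is not stated in the text, and the function name is hypothetical.

```python
# Assumption (not from the text): 1 pg of DNA corresponds to roughly 978 Mb.
MB_PER_PG = 978


def pg_to_mb(picograms: float) -> float:
    """Convert a genome size in picograms to millions of base pairs (Mb)."""
    return picograms * MB_PER_PG


onion_pg = 17.0   # Allium cepa haploid genome, from the text
human_pg = 3.5    # human haploid genome, from the text

ratio = onion_pg / human_pg
print(f"onion: ~{pg_to_mb(onion_pg):,.0f} Mb")
print(f"human: ~{pg_to_mb(human_pg):,.0f} Mb")
print(f"onion/human ratio: {ratio:.1f}")
```

The ratio does not depend on the conversion factor, and it comes out close to the "about five
times" quoted in the onion test.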

Organism                    DNA content   Estimated gene   Average gene    Chromosome
                            (Mb, 1C)      number           density (bp)    number (1N)
Homo sapiens                3,200         25,000           100,000         23
Mus musculus                2,600         25,000           100,000         20
Drosophila melanogaster     140           13,000           9,000           4
Arabidopsis thaliana        130           25,000           4,000           5
Caenorhabditis elegans      100           19,000           5,000           6
Saccharomyces cerevisiae    12            6,000            2,000           16
Escherichia coli            5             3,200            1,400           1

Table 1.
Measures of genome size in selected organisms. The DNA content (1C) is shown in millions of basepairs (Mb). For eukaryotes, the
chromosome number is the chromosomes counted in a gamete (1N) from each organism. The average gene density is the mean
number of non-coding bases (in bp) between genes in the genome.
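As a rough cross-check of Table 1, the total number of base pairs per gene can be computed from the
first two columns. This is a sketch only: the variable and function names are hypothetical, and the
quantity computed here (genome size divided by gene number) includes the genes themselves, so it is
not the same as the table's "average gene density" column, which counts only non-coding bases
between genes.

```python
# (DNA content in Mb, estimated gene number), copied from Table 1.
TABLE_1 = {
    "Homo sapiens": (3200, 25000),
    "Mus musculus": (2600, 25000),
    "Drosophila melanogaster": (140, 13000),
    "Arabidopsis thaliana": (130, 25000),
    "Caenorhabditis elegans": (100, 19000),
    "Saccharomyces cerevisiae": (12, 6000),
    "Escherichia coli": (5, 3200),
}


def bp_per_gene(genome_mb: float, gene_number: int) -> float:
    """Average number of base pairs of genome per gene."""
    return genome_mb * 1_000_000 / gene_number


spacing = {name: bp_per_gene(mb, genes) for name, (mb, genes) in TABLE_1.items()}

for name, value in spacing.items():
    print(f"{name:26s} {value:>10,.0f} bp per gene")
```

Arabidopsis and humans have the same estimated gene number in Table 1, yet the base pairs per gene
differ by more than an order of magnitude, which is the pattern the C-value paradox describes.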


___________________________________________________________________________
SUMMARY:
• The asexual transmission of genetic information is accomplished in a process called Mitosis.
• The process of mitosis can be divided into Prophase, Metaphase, Anaphase, and Telophase.
• Mitosis reduces the c-number, but not the n-number of the daughter cells.
• Not all the DNA in an organism codes for genes. In most higher eukaryotes, most DNA is non-gene
coding, appears to have no specific function, and is called “junk” DNA.
• The c-value paradox refers to the observation that the amount of DNA is not necessarily related to the
complexity of the organism.

KEY TERMS:
mitosis metaphase plate
interphase anaphase
G1 Phase telophase
S Phase unreplicated chromosome
G2 Phase cytokinesis
M Phase n-value
G0 Phase c-value
chromatids replicated chromosome
prophase nuclear genome
metaphase c-value paradox
microtubules
kinetochore


STUDY QUESTIONS:
1) Species A has n=4 chromosomes and Species B
has n=6 chromosomes. Can you tell from this
information which species has more DNA? Can
you tell which species has more genes?
2) The answer to question 1 implies that not all
DNA within a chromosome encodes genes. Can
you name any examples of chromosomal
regions that contain relatively few genes?
3)
a) How many centromeres does a typical
chromosome have?
b) What would happen if there was more than
one centromere per chromosome?
c) What if a chromosome had no
centromeres?
4) For a diploid organism with 2n=16
chromosomes, how many chromosomes and
chromatids are present per cell at the end of:
a) G1,
b) S,
c) G2,
d) mitosis?
5) Refer to Table 1.
a) What is the relationship between DNA
content of a genome, number of genes,
gene density, and chromosome number?
b) What feature of genomes explains the c-
value paradox?
c) Do any of the numbers in this Table show a
correlation with organismal complexity?


HUMAN CHROMOSOMES – CHAPTER 15

CHAPTER 15 – HUMAN CHROMOSOMES

Figure 1.
Human metaphase chromosome spreads. To make
these figures, white blood cells in metaphase were
dropped onto a slide. The cells burst open and the
chromosomes can then be stained with giemsa (a
purple colour). This image shows chromosomes
from three cells that hit the slide close to one
another. They can be distinguished by the size
difference among the chromosome sets, which is
due to the differences in condensation during the
stages of mitosis (prophase).
(Original-Alexander Smith - CC BY-NC 3.0)

INTRODUCTION
Humans, like all other species, store their genetic information in cells as large DNA molecules
called chromosomes. Within each nucleus are 23 pairs of chromosomes, half from mother and half from
father. In addition, our mitochondria have their own smaller chromosome that encodes some of the
proteins found in this organelle.

This chapter will provide information on human chromosomes that will be referred to in various
other chapters, lectures, and in the lab.

1. METAPHASE CHROMOSOME SPREADS

1.1. THE SHAPE OF CHROMOSOMES
Figure 1 shows chromosomes from three cells. Each of the cells was in the metaphase stage of
mitosis, which is why the chromosomes appear replicated and condensed. We refer to chromosomes as
being replicated when they consist of two sister chromatids held together at the centromeres. DNA
replication occurs during S phase. These chromosomes are also condensed. Chromosomes are compacted
at the start of mitosis in prophase. Cytogeneticists can observe chromosomes at any stage of the
cell cycle but those from metaphase cells provide the most detail and clarity. Figure 2 shows a more
magnified view of a pair of chromosomes. On average a condensed human metaphase chromosome is 5 µm
long and each chromatid is 700 nm wide. In contrast, a decondensed interphase chromosome is 2 mm
long and only 30 nm wide, yet still fits into a single nucleus.

Figure 2.
A pair of metacentric human chromosome #1.
(Wikipedia- National Human Genome Research Institute-PD)

1.2. THE AMOUNT OF DNA IN A CELL (C-VALUE)
To calculate how much DNA is seen in the nuclei in Figure 1, consider that a human gamete has about
3000 million base pairs. We can shorten this statement to 1c = 3000 Mb where c is the c-value, the
DNA content in a gamete. When an egg and sperm join, the resulting zygote is 2c = 6000 Mb. Before the
zygote can divide and become two cells it must undergo DNA replication. This doubles the DNA content
to 4c = 12 000 Mb. When the zygote divides, each daughter cell inherits half the DNA and is
therefore back to 2c = 6000 Mb. Then each


cell will become 4c again (replication) before dividing to become 2c each. From this point forward,
every cell in the embryo will be 2c = 6000 Mb before its S phase and 4c = 12 000 Mb afterwards. The
same is true for the cells of fetuses, children, and adults. Because the cells used to prepare this
chromosome spread were adult cells in metaphase, each is 4c = 12 000 Mb. Note, there are some rare
exceptions, such as some stages of meiocytes that make germ cells and other rare situations like the
polyploidy of terminally differentiated liver cells. In summary:

Human cell                     DNA content
gamete (egg or sperm)          1c = 3000 Mb
regular cell before S phase    2c = 6000 Mb
regular cell after S phase     4c = 12 000 Mb

1.3. THE NUMBER OF CHROMOSOMES (N-VALUE)
Human gametes contain 23 chromosomes. We can summarize this statement as 1n = 23 where n is the
n-value, the number of chromosomes in a gamete. When a 1n = 23 sperm fertilizes a 1n = 23 egg, the
zygote will be 2n = 46. But, unlike DNA content (c), the number of chromosomes (n) does not change
with DNA replication. A replicated chromosome is still just one chromosome. Thus the zygote stays
2n = 46 after S phase. When the zygote divides into two cells both contain 46 chromosomes and are
still 2n = 46. Every cell in the embryo, fetus, child, and adult is also 2n = 46 (with the
exceptions noted above).
In summary:

Human cell                     Chromosome number
gamete (egg or sperm)          1n = 23
regular cell before S phase    2n = 46
regular cell after S phase     2n = 46

Note that in a normal cell, the chromosome number is 2n before and after chromosome replication. The
n-value does not change while the c-value does.

Figure 3.
Karyogram of a normal human male.
(Wikipedia-National Human Genome Research Institute – PD)

2. HUMAN KARYOGRAMS AND KARYOTYPES
2.1. KARYOGRAMS
Human cytogeneticists use metaphase chromosome spreads as a standard representation of the
chromosomes in a cell, organism, or species. Comparisons permit them to identify chromosome
abnormalities. Because it can be hard to distinguish individual chromosomes, cytogeneticists sort
the photo to put the chromosomes into a standard pattern. The result is a karyogram ("nucleus
picture"; Figure 3). In the past it was necessary to print a photograph of the metaphase spread, cut
out each chromosome with scissors, and then glue each to a piece of cardboard to show the pattern.
Now, computer software does much of this for us, but the karyogram assembly is usually reviewed by a
qualified cytogeneticist. Either way, the random collection of chromosomes seen in Figure 1 is
converted to the organized pattern in Figure 3.

2.2. HUMAN CHROMOSOMES – AUTOSOMES
The chromosomes are numbered to distinguish them. Chromosomes 1 through 22 are autosomes, which are
present in two copies in both males and females. Because human chromosomes vary in size this was the
easiest way to label them. Our largest chromosome is number 1, our next longest is 2,


and so on. The karyogram in Figure 3 shows two copies of each of the autosomes. A karyogram from a
normal female would also show these 22 pairs. There are also the sex-chromosomes, X and Y (see
below). Normal females have two X-chromosomes, while normal males have an X and a Y each. They act
as a homologous pair, similar to the autosomes. During meiosis only one of each autosome pair and
one of the sex-chromosomes makes it into the gamete. This is how 2n = 46 adults can produce 1n = 23
eggs or sperm.

In addition to their length, cytogeneticists can distinguish chromosomes using their centromere
position and banding pattern. Note that at the resolution in Figure 3 both chromosome 1s look
identical, even though at the base pair level there are small and often significant differences in
the sequence that correspond to allelic differences between these homologous chromosomes.

Remember that in each karyogram there are maternal chromosomes, those inherited from the mother, and
paternal chromosomes, those from the father. For example, everyone has one maternal chromosome 1 and
one paternal chromosome 1. In a typical karyogram it usually is not possible to tell which is which.
In some cases, however, there are visible differences between homologous chromosomes that do permit
the distinction to be made.

2.3. RELATIONSHIPS BETWEEN CHROMOSOMES AND CHROMATIDS
To summarize what we have covered so far, karyograms depict replicated chromosomes (because the
cells had passed S phase in the cell cycle) and two copies of each chromosome (because the cells
were diploid). So how do we refer to all the pieces of DNA present? Figure 4 summarizes the terms
used.

Figure 4.
The relationships between chromosomes and chromatids.
(Original Deyholos- CC BY-NC 3.0)

Term                          Definition                           Example
homologous chromosomes        the maternal and paternal copies     maternal chromosome 1 and
                              of a chromosome                      paternal chromosome 1
non-homologous chromosomes    two different chromosomes within     a chromosome 1 and a
                              the same cell/organism               chromosome 8
sister chromatids             the identical chromatids within a    the two chromatids within
                              single replicated chromosome         maternal chromosome 1
non-sister chromatids         the similar but not identical        a chromatid in maternal
                              chromatids from homologous           chromosome 1 and a chromatid
                              chromosomes                          in paternal chromosome 1

2.4. HUMAN SEX CHROMOSOMES
Figure 3 shows that most of our chromosomes are present in two copies. Each copy has the same
length, centromere location, and banding pattern. As mentioned before, these are called autosomes.
However, note that two of the chromosomes, the X and the Y, do not look alike. These are sex
chromosomes. In mammals, males have one of each while females have two X chromosomes.


Autosomes are those chromosomes present in the same number in males and females while sex
chromosomes are those that are not. When sex chromosomes were first discovered their function was
unknown and the name X was used to indicate this mystery. The next one was named Y (then Z, and then
W – see Chapter 21).

It is a popular misconception that the X and Y chromosomes were named based upon their shapes;
physically each looks like any other chromosome. A Y-chromosome doesn’t look like a Y any more than
a chromosome 4 looks like a 4.

The combination of sex chromosomes within a species is associated with either male or female
individuals. In mammals, fruit flies, and some flowering plants, XX individuals are females while XY
individuals are males.

How do the sex chromosomes behave during meiosis? In those individuals with two of the same
chromosome (i.e. XX females) the chromosomes pair and segregate during meiosis I the same as
autosomes do. During meiosis in XY males the sex chromosomes pair with each other (Figure 5). In
mammals the consequence of this is that all egg cells will carry an X chromosome while the sperm
cells will carry either an X or a Y chromosome. Half of the offspring will receive two X chromosomes
and become female while half will receive an X and a Y and become male.

Figure 5.
Meiosis in an XY mammal. The stages shown are anaphase I, anaphase II, and mature sperm. Note how
half of the sperm contain Y chromosomes and half contain X chromosomes.
(Original - Harrington - CC BY-NC 3.0)

2.5. HUMAN KARYOTYPES
We can summarize the information shown in a karyogram such as Figure 3 with a written statement
known as a karyotype ("nucleus features"). By convention we list (i) the total number of
chromosomes, (ii) the sex chromosomes, and (iii) any abnormalities. The karyotype in Figure 3 would
be 46,XY, which is typical for human males. Most human females are 46,XX.

If a cytogeneticist sees an abnormality, it may not be harmful or detrimental. For example many
people in the world have a chromosome 9 with an inversion in the middle. They are therefore
46,XY,inv(9) or 46,XX,inv(9). Other chromosomal abnormalities do have an effect on a person's health
and wellbeing. An example is 47,XY,+21 or 47,XX,+21. These people have an extra copy of chromosome
21, a condition also known as trisomy-21 and Down Syndrome. These and other examples are described
in the chapters on chromosome structure changes and chromosome number changes.

3. PARTS OF A TYPICAL NUCLEAR CHROMOSOME
A functional chromosome requires four features. These are shown in Figure 6.

Figure 6.
Parts of a typical human nuclear chromosome (not to scale). The ori's and genes are distributed
everywhere along the chromosome, except for the telomeres and centromere.
(Original-Harrington- CC BY-NC 3.0)

3.1. THOUSANDS OF GENES
In the previous sections we mentioned human chromosome 1, but what exactly is it? Each chromosome is
a long molecule of double stranded DNA that carries genetic information (genes). Chromosome 1, being
our largest chromosome, has the most genes, about 4778 in total. Many of these


genes are transcribed into mRNAs, which encode proteins. Other genes are transcribed into tRNAs,
rRNA, and other non-coding RNA molecules (see Chapter 07).

3.2. ONE CENTROMERE
A centromere ("middle part") is a place where proteins attach to the chromosome as required during
the cell cycle. Cohesin proteins hold the sister chromatids together beginning in S phase.
Kinetochore proteins form attachment points for microtubules during mitosis. The metaphase
chromosomes shown in Figure 3 have both Cohesin and Kinetochore proteins at their centromeres. There
are no genes within the centromere region DNA; rather it is composed of a simple repeated DNA
sequence.

All human chromosomes have a centromere, but not necessarily in the middle of the chromosome. If it
is in the centre of the chromosome it is called a metacentric chromosome. If it is offset a bit it
is submetacentric, and if it is towards one end the chromosome is acrocentric. In humans an example
of each is chromosome 1, 5, and 21, respectively. Humans do not have any telocentric chromosomes,
those with the centromere at one end, but mice and some other mammals do.

3.3. TWO TELOMERES
The ends of a chromosome are called telomeres ("end parts"). Part of the DNA replication is unusual
here; it is done with a dedicated DNA polymerase known as telomerase. Chapter 2 on DNA replication
goes into more detail. As with the centromere region there are no genes in the telomeres, just
simple, repeated DNA sequences.

3.4. THOUSANDS OF ORIGINS OF REPLICATION
At the beginning of S phase DNA polymerases begin the process of chromosome replication. The sites
where this begins are called origins of replication (ori's). They are found distributed along the
chromosome, about 40 kb apart. S phase begins at each ori as two replication forks leave travelling
in opposite directions. Replication continues and replication forks travelling from one ori will
collide with forks travelling towards it from the neighboring ori. When all the forks meet, DNA
replication will be complete.

4. APPEARANCE OF A TYPICAL NUCLEAR CHROMOSOME DURING THE CELL CYCLE
If we follow a typical chromosome in a typical human cell it alternates between unreplicated and
replicated states and between relatively uncondensed and condensed. The replication is easy to
explain: if a cell has made the commitment to divide, it first needs to replicate its DNA. This
occurs during S phase. Before S phase, chromosomes consist of a single piece of double-stranded DNA
and after they consist of two identical double-stranded DNAs.

The condensation is a more complex story because eukaryotic DNA is always wrapped around some
proteins. Figure 7 shows the different levels commonly found in cells. During interphase, a
chromosome exists mostly as a 30 nm fibre. This allows it to fit inside the nucleus and still have
the DNA be accessible for enzymes performing RNA synthesis, DNA replication, and DNA repair. At the
start of mitosis these processes halt and the chromosome becomes even more condensed. This is
necessary so that the chromosomes are compact enough to move to the opposite ends within the cell.
When mitosis is complete the chromosome returns to its 30 nm fibre structure. Recall that each of
our cells has a maternal and a paternal chromosome 1. Figure 8 shows what these chromosomes look
like during the cell cycle.
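The "thousands" of origins in Section 3.4 can be checked with a back-of-the-envelope estimate that
uses only numbers quoted in this chapter (1c = 3000 Mb, 1n = 23, and origins about 40 kb apart). The
function and variable names are hypothetical, and the calculation assumes evenly spaced origins on
an average-sized chromosome, which is a simplification.

```python
HAPLOID_GENOME_BP = 3_000_000_000  # 1c = 3000 Mb (Section 1.2)
HAPLOID_CHROMOSOMES = 23           # 1n = 23 (Section 1.3)
ORI_SPACING_BP = 40_000            # origins "about 40 kb apart" (Section 3.4)


def estimate_origins(length_bp: int, spacing_bp: int = ORI_SPACING_BP) -> int:
    """Approximate number of origins on a DNA molecule, assuming even spacing."""
    return length_bp // spacing_bp


# Assumption: an "average" chromosome is simply the genome divided evenly by 23.
average_chromosome_bp = HAPLOID_GENOME_BP // HAPLOID_CHROMOSOMES
origins_per_chromosome = estimate_origins(average_chromosome_bp)
origins_per_haploid_genome = estimate_origins(HAPLOID_GENOME_BP)

print(f"average chromosome: ~{average_chromosome_bp / 1e6:.0f} Mb")
print(f"origins per average chromosome: ~{origins_per_chromosome}")
print(f"origins per haploid genome: ~{origins_per_haploid_genome}")
```

Under these assumptions an average chromosome carries a few thousand origins, in line with the
section heading; larger chromosomes such as chromosome 1 would carry more.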


Figure 7.
Successive stages of chromosome condensation depend on the introduction of additional proteins.
(Wikipedia-R. Wheeler- CC BY-SA 3.0)

Figure 8.
Appearance of maternal and paternal chromosome 1 during the cell cycle. The other 44 chromosomes are
not shown. Note that they are independent during both interphase (top) and mitosis (bottom). After
anaphase there will be two cells in G1.
(Original-Harrington- CC BY-NC 3.0)

5. DNA IS PACKAGED INTO CHROMATIN

5.1. DNA CAN BE HIGHLY COMPACTED
If stretched to its full length, the DNA molecule of the largest human chromosome would be 85 mm
long. Yet during mitosis and meiosis, this DNA molecule is compacted into a chromosome approximately
5 µm long. Although this compaction makes it easier to transport DNA within a dividing cell, it also
makes DNA less accessible for other cellular functions such as DNA synthesis and transcription.
Thus, chromosomes vary in how tightly DNA is packaged, depending on the stage of the cell cycle and
also depending on the level of gene activity required in any particular region of the chromosome.

5.2. LEVELS OF COMPACTION
There are several different levels of structural organization in eukaryotic chromosomes, with each
successive level contributing to the further compaction of DNA (Figure 7). For more loosely
compacted DNA, only the first few levels of organization may apply. Each level involves a specific
set of proteins that associate with the DNA to compact it. First, proteins called the core histones
act as a spool around which DNA is coiled nearly twice to form a structure called the nucleosome.
Nucleosomes are formed at regular intervals along the DNA strand, giving the molecule the appearance
of “beads on a string”. At the next level of organization, histone H1 helps to compact the DNA
strand and its nucleosomes into a 30 nm fibre. Subsequent levels of organization involve the
addition of scaffold proteins that wind the 30 nm fibre into coils, which are in turn wound around
other scaffold proteins.

5.3. CHROMATIN PACKAGING VARIES INSIDE THE NUCLEUS: EUCHROMATIN AND HETEROCHROMATIN
Chromosomes can be stained with certain dyes, which is how they got their name (chromosome means
“colored body”). Certain dyes stain some regions along a chromosome more intensely than others,
giving some chromosomes a banded appearance. The material that makes up chromosomes, which we now
know to be proteins and DNA, is called chromatin. Classically, there are two major types of
chromatin, but these are more the ends of a continuous and varied spectrum. Euchromatin is more
loosely packed, and tends to


contain genes that are actively being transcribed. Heterochromatin is more densely compacted and
tends not to be transcribed; the genes are inactive. Heterochromatin sequences also include short,
highly-repetitive sequences called satellite DNA, which acquired their name because their buoyant
density, as determined by ultracentrifugation, is distinctly different from the main band of DNA.

6. PARTS AND APPEARANCE OF A MITOCHONDRIAL CHROMOSOME
While most of our genome is located in the nucleus, there is also DNA in the mitochondria. The human
mtDNA is small, only 16.6 kb, and circular, although it is double-stranded like most DNA molecules.
It has only 37 genes; 13 of these make mitochondrial proteins and the rest encode tRNAs and rRNAs.

Each mtDNA has a single origin of replication. During DNA replication two replication forks leave
the ori and halt when they bump into each other on the opposite side of the circle. DNA replication
inside the mitochondria happens throughout interphase, not once during S phase as with the nuclear
chromosomes. The consequence is that each mitochondrion has between 2 and 10 identical copies of its
chromosome (Figure 9).

Figure 9.
The relationship between cells, mitochondria, and mitochondrial DNA.
(Original-Harrington- CC BY-NC 3.0)

There are other differences when compared to nuclear chromosomes. Organelles such as mitochondria or
chloroplasts are likely the remnants of prokaryotic endosymbionts that entered the cytoplasm of
ancient progenitors of today’s eukaryotes (endosymbiont theory). These endosymbionts had their own,
circular chromosomes (Figure 10), like most bacteria that exist today. Mitochondria typically have
circular chromosomes that behave more like bacterial chromosomes than eukaryotic chromosomes (i.e.
mitochondrial genomes do not undergo mitosis or meiosis). Also, the mitochondrial chromosome is not
associated with histones or other proteins that compact it. It also lacks a centromere because
mitochondrial replication is simpler than nuclear chromosome replication. Mitochondria just grow
larger and split in two, like the prokaryotic cells from which they originated. Because there are
multiple mtDNA copies that are randomly distributed in the matrix, both new mitochondria will end up
inheriting some mtDNAs. And lastly, because the mtDNA is circular there are no ends and thus no
telomeres.

In summary:

Feature                   Nuclear chromosomes           Mitochondrial chromosome
DNA                       linear double stranded DNA    circular double stranded DNA
genes                     thousands                     37
centromeres               1                             0
telomeres                 2                             0
origins of replication    thousands                     1
Mitosis/Meiosis           Yes                           No


Figure 10.
A map of the complete mitochondrial chromosome of the woolly mammoth (Mammuthus primigenius). The
mtDNA that was used to produce this map was obtained from tissue of a mammoth that lived
approximately 32,000 years ago. Circular organellar chromosomes such as this are typical of almost
all eukaryotes. (From Rogaev et al, 2006). Recent (Rohland et al, 2010) mtDNA work indicates that
mammoths are more closely related to Indian elephants than to either of the African species.

7. EXAMPLE GENES

7.1. LCT - AN AUTOSOMAL GENE
The LCT gene encodes the enzyme Lactase (see Chapter 9). This enzyme allows people to digest the
milk sugar lactose. The LCT gene is on chromosome 2. Because this is an autosome everyone has a
maternal and a paternal copy of the LCT gene. Genes come in different versions called alleles. The
allele of the LCT gene you inherited from your mother will probably be slightly different from the
allele you received from your father. Thus, most people have two different alleles of this gene. If
we consider a cell in G1 there will be two pieces of DNA inside the nucleus that harbour this gene.
When this cell completes DNA replication there will be four copies of this gene. But because the
chromatids on your maternal chromosome 2 are identical, as are the chromatids on your paternal
chromosome 2, this cell will still have just two different alleles. Because of this we simplify
things by saying that humans have two copies of LCT. Because most genes are on autosomes you have
two copies of most of your genes.

7.2. F8 - AN X CHROMOSOMAL GENE
The F8 gene makes a blood-clotting protein called Coagulation Factor VIII (F8) (see Chapter 22).
Without normal F8 a person is unable to stop bleeding if injured. The F8 gene is located on the X
chromosome. Females, with two X chromosomes, have two copies of the F8 gene. Males only have one X
chromosome and thus a single F8 gene. This has an impact on male health, a topic discussed in
Chapter 23 on pedigree analysis.

7.3. SRY - A Y CHROMOSOMAL GENE
The SRY gene is only found in males, because it is located on the Y chromosome (see Chapter 22).
Males have this gene and females do not. In embryogenesis, the presence of this gene leads to being
male. Its absence leads to being female. A pair of organs called the gonads can develop into either
ovaries or testes. In XY embryos the SRY gene makes a protein that causes the gonads to develop into
testes. Conversely, XX embryos do not have this gene and their gonads develop into ovaries instead.
Once formed the testes produce sex hormones that direct the rest of the developing embryo to become
male, while the ovaries make different sex hormones that promote female development. The testes and
ovaries are also the organs where gametes (sperm or eggs) are produced. Whether a person is
genetically male or female is decided at the moment of conception: if the sperm carries a Y
chromosome the result is a male and if the sperm carries an X the result is a female.

7.4. MT-CO1 - A MITOCHONDRIAL GENE
The MT-CO1 gene is located on the mtDNA chromosome. It encodes a protein in Complex IV of the
mitochondrial electron transport chain. For reasons that are not clear this protein must be made in
the mitochondria. It cannot be synthesized in the cytosol of the cell and then imported into the
mitochondria as is the case with most mitochondrial proteins. Because humans generally receive their
mitochondria from their mother, everyone has only one MT-CO1 gene. It is the same

PAGE 8 OPEN GENETICS LECTURES – FALL 2017

HUMAN CHROMOSOMES – CHAPTER 15

one found in their mother (and her mother). molecules in all of the mitochondria in all of the
Technically speaking we have only one MT-CO1 cells.
allele, it will be identical on all of the mtDNA In summary:

Location of a gene          Number of this gene in males    Number of this gene in females
autosomal chromosome        2                               2
X chromosome                1                               2
Y chromosome                1                               0
mitochondrial chromosome    1                               1
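
The same summary can be written as a small lookup. The Python sketch below is only an illustration: the names COPIES_PER_CELL, EXAMPLE_GENES and gene_copies are invented for this example, and the numbers are simply those given in the summary table for the four example genes of this chapter.

```python
# Copies of a gene per cell, keyed by chromosomal location: (males, females).
# Values are taken from the summary table; the names are hypothetical.
COPIES_PER_CELL = {
    "autosomal": (2, 2),
    "X": (1, 2),
    "Y": (1, 0),
    "mitochondrial": (1, 1),
}

# Location of each example gene discussed in this chapter.
EXAMPLE_GENES = {
    "LCT": "autosomal",
    "F8": "X",
    "SRY": "Y",
    "MT-CO1": "mitochondrial",
}


def gene_copies(gene, sex):
    """Return the number of copies of an example gene in a male or female."""
    if sex not in ("male", "female"):
        raise ValueError("sex must be 'male' or 'female'")
    males, females = COPIES_PER_CELL[EXAMPLE_GENES[gene]]
    return males if sex == "male" else females


for gene in EXAMPLE_GENES:
    print(gene, gene_copies(gene, "male"), gene_copies(gene, "female"))
```

As in the text, "one copy" of a mitochondrial gene is a simplification: it counts one allele, not the many mtDNA molecules in each cell.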


___________________________________________________________________________
SUMMARY:
• The c-value is the amount of DNA in a gamete. Humans are 1c = 3000 Mb.
• The n-value is the number of chromosomes in a gamete. Humans are 1n = 23.
• A typical cell in your body is 2c = 6000 Mb and 2n = 46 before DNA replication and 4c = 12 000 Mb and
2n = 46 after.
• A picture of metaphase chromosomes can be organized into a karyogram figure and described with a
karyotype statement.
• Humans have two copies of each autosomal chromosome. Females have two X chromosomes while
males have one X and one Y chromosome.
• A typical nuclear chromosome has thousands of genes, one centromere, two telomeres, and thousands
of origins of replication.
• A typical nuclear chromosome is replicated during S phase and consists of two chromatids up until the
start of anaphase. It is condensed during prophase and remains condensed until the start of telophase.
During metaphase a chromosome is both replicated and condensed for these reasons.
• The human mitochondrial chromosome has 37 genes, a single origin of replication, and neither
centromeres nor telomeres.
• Humans have ~29 000 genes, most of which are on autosomal chromosomes.
• A typical human cell has two copies of each autosomal gene and one of each mitochondrial gene.
Genes on sex chromosomes are different: females have two of each X-chromosomal gene while males
have one; males have Y-chromosomal genes while females do not.
KEY TERMS:
replicated chromosome submetacentric
condensed chromosome acrocentric
cytogeneticist telocentric
c-value telomere
n-value origin of replication
karyogram 30 nm fibre
autosome histones
maternal chromosome nucleosome
paternal chromosome histone H1
sex chromosome fibre
homologous chromosome scaffold proteins
non-homologous chromosome chromatin
sister chromatids euchromatin
non-sister chromatids heterochromatin
karyotype satellite DNA
gene mtDNA
centromere endosymbiont theory
metacentric


QUESTIONS:
1) Cytogeneticists use white blood cells to obtain
metaphase chromosomes for karyotyping.
a) Why don't they use red blood cells?
b) Why don't they use white blood cells in
anaphase?
2) The human Y chromosome is smaller than the
X chromosome. Does this mean that males
have less DNA than females?
3) Are these statements true or false? For the
false statements explain why.
a) Everyone has a paternal chromosome 1.
b) Everyone has a maternal chromosome 1.
c) Everyone has a paternal X chromosome.
d) Everyone has a maternal X chromosome.
e) Everyone has a paternal Y chromosome.
f) Everyone has a maternal Y chromosome.
g) Everyone has a paternal mitochondrial
chromosome.
h) Everyone has a maternal mitochondrial
chromosome.
4) Explain why centromeres do not have to be in
the centre of a chromosome to function.
5) Why do nuclear chromosomes have to have
multiple origins of replication?
6) Define chromatin. What is the difference
between DNA, chromatin and chromosomes?
7) Have a look at Figure 8. Which of these
chromosomes would be associated with:
a) Histone proteins (see Figure 7)
b) Condensin proteins (important scaffold
proteins)
c) Cohesin proteins (proteins which hold sister
chromatids together)
d) Kinetochore proteins (proteins which
connect centromere DNA to microtubules)
8) Where would you find these enzymes in a
typical human cell?
a) DNA polymerases
b) RNA polymerases
c) Ribosomes
9) Could the following genes continue to perform
their normal developmental function if they
were moved next to the LCT gene on
Chromosome 2?
a) F8
b) SRY
c) MT-CO1


CHAPTER 16 – MENDEL’S FIRST LAW: SEGREGATION OF ALLELES

Figure 1.
Pea plants were used by Gregor Mendel to
discover fundamental laws of genetics.
His first law, the segregation of alleles, is
covered in this chapter. His second law,
independent assortment, is covered in the
next chapter. (Wikimedia commons-B.
Ebbesen-CC BY-SA 3.0)

INTRODUCTION

The once prevalent (but now discredited) concept of blending inheritance proposed that some
undefined essence, in its entirety, contained all of the heritable information for an individual. It
was thought that mating combined the essences from each parent, much like the mixing of two colors
of paint. Once blended together, the individual characteristics of the parents could not be
separated again.

However, Gregor Mendel (Figure 2) was one of the first to take a quantitative, scientific approach to
the study of heredity. He started with well-characterized strains, repeated his experiments many
times, and kept careful records of his observations. Working with peas, Mendel showed that
white-flowered plants could be produced by crossing two purple-flowered plants, but only if the
purple-flowered plants themselves had at least one white-flowered parent (Figure 3). This was
evidence that a discrete genetic factor that produced white flowers had not blended irreversibly
with the factor for purple flowers. Mendel’s observations disproved blending inheritance and
favored an alternative concept, called particulate inheritance, in which heredity is the product of
discrete factors that control independent traits.

Through careful study of patterns of inheritance, Mendel recognized that a single trait could exist
in different versions, or alleles, even within an individual plant or animal. For example, he found
two allelic forms of a gene for seed color: one allele gave green seeds, and the other gave yellow
seeds. Mendel also observed that although different alleles could influence a single trait, they
remained indivisible and could be inherited separately. This is the basis of Mendel’s First Law,
also called The Law of Equal Segregation, which states: during gamete formation, the two alleles at
a gene locus segregate from each other; each gamete has an equal probability of containing either
allele.
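
The Law of Equal Segregation can be restated as a short calculation. The following Python sketch is illustrative only: the function name gamete_probabilities is hypothetical, and it assumes a genotype is written as a two-letter string such as "Aa", one letter per allele.

```python
from collections import Counter
from fractions import Fraction


def gamete_probabilities(genotype):
    """Return {allele: probability} for the gametes of a diploid genotype.

    Each of the two alleles at the locus is assumed to have an equal (1/2)
    chance of ending up in any given gamete.
    """
    if len(genotype) != 2:
        raise ValueError("expected a two-allele genotype such as 'Aa'")
    counts = Counter(genotype)
    return {allele: Fraction(n, 2) for allele, n in counts.items()}


print(gamete_probabilities("Aa"))  # heterozygote: two kinds of gamete
print(gamete_probabilities("AA"))  # homozygote: one kind of gamete
```

A heterozygote gives each allele with probability 1/2, while a homozygote can only give one kind of gamete.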


1. OVERVIEW
Mendel first made his discoveries of inheritance in the 1850s. In his 1866 publication he didn’t use
the word “gene” as the fundamental unit of heredity because it wasn’t coined until 1909 by Danish
botanist Wilhelm Johannsen. Thomas Hunt Morgan proposed that genes resided on chromosomes in 1910,
and occupied distinct regions on those chromosomes. DNA as a substance was discovered in the 1860s,
but it took until the 1940s to realize that DNA was the molecule that contained the genetic
information. Then in the 1950s Watson and Crick discovered the structure of DNA. Nevertheless,
Mendel made his discoveries without any of this information. Today we have overwhelming knowledge
from research allowing us to understand the molecular mechanism behind Mendel’s laws. To explain
Mendel’s First Law, segregation, we will take a closer look at the concept of meiosis.

Figure 2.
Gregor Johann Mendel (1822-1884), an Augustinian Friar, who lived in Moravia (now part of the Czech
Republic), published his work in 1866 on what has become known as the laws of Mendelian Inheritance.
(Wikipedia-Hugo Iltis- CC BY 4.0)

1.1. DOMINANT AND RECESSIVE ALLELES

The concepts of dominant and recessive alleles were introduced in Chapter 13. Remember, alleles are
different versions of a gene. The relationship of different alleles of a gene can be described as
complete dominance, incomplete dominance or co-dominance. The traits Mendel studied with his peas
were all completely dominant, and therefore will only be briefly reviewed here.

Figure 3.
Inheritance of flower color in peas. Mendel observed that a cross between pure breeding, white and
purple peas (generation P) produced only progeny (generation F1) with purple flowers. However,
white-flowered plants reappeared among the F2 generation progeny of a mating between two F1 plants.
The symbols P, F1 and F2 are abbreviations for parental, first filial, and second filial
generations, respectively.
(Original-Deyholos- CC BY-NC 3.0)

In a diploid organism, if an allele is dominant only one copy of that allele is necessary to express
the dominant phenotype. If an allele is recessive, then an individual needs to have two copies of it
(i.e. be homozygous) to express the recessive phenotype. If an organism is a heterozygote, or has
one copy of each allele type, then it will show the dominant phenotype. When representing these in
written form, a dominant allele is written as a capital letter (e.g. A), while a recessive allele
will be written in lower case (e.g. a). If these are alleles of the same gene, they should be
written with the same letter. This is the most common way of writing genotypes (Table 1), but there
are many different systems
that often deviate from these general rules. Note that genes and alleles are usually written in
italics and chromosomes and proteins are not; proteins are often written in all capitals. For
example, the white gene in Drosophila melanogaster on the X chromosome encodes a protein called
WHITE.

Table 1. Examples of symbols used to represent genes and alleles.

Examples       Interpretation
A and a        Uppercase letters represent dominant alleles and lowercase letters indicate recessive
               alleles. Mendel invented this system but it is not commonly used because not all
               alleles show complete dominance and many genes have more than two alleles.
a^+ and a^1    Superscripts or subscripts are used to indicate alleles. For wild type alleles the
               symbol is a superscript +.
AA or A/A      Sometimes a forward slash is used to indicate that the two symbols are alleles of the
               same gene, but on homologous chromosomes.

1.2. MEIOSIS OVERVIEW
Most eukaryotes reproduce sexually: a cell from one individual joins with a cell from another to
create offspring. In order for this to be successful, the cells that fuse must contain half the
number of chromosomes found in the adult organism. Otherwise, the number of chromosomes would double
with each generation, which would be unsustainable. The chromosome number is reduced through the
process of meiosis. Meiosis is similar in many ways to mitosis (Figure 4), as the chromosomes are
lined up along the metaphase plate and divided to the poles using microtubules. It also differs in
many significant ways from mitosis. Keep this in mind and try to note the differences as you read
ahead. Note also that in this chapter we will be discussing N and C values. Refer back to Chapter 14
to refresh yourself with this concept.

Meiosis has two main stages, designated by the roman numerals I and II. In Meiosis I homologous
chromosomes segregate, while in Meiosis II sister chromatids segregate (Figure 5). Most
multicellular organisms use meiosis to produce gametes, the cells that fuse to make offspring. Some
single celled eukaryotes such as yeast also use meiosis to enter the haploid part of their life
cycle. Cells that will undergo meiosis are called meiocytes and are diploid (2N) (Figure 6). Cells
that have not yet undergone meiosis to become egg or sperm cells are called oocytes or
spermatocytes, respectively.

Meiosis begins similarly to mitosis in that a cell has grown large enough to divide and has
replicated its chromosomes. However, meiosis requires two rounds of division. In the first, known as
meiosis I, the replicated, homologous chromosomes segregate. During meiosis II the sister chromatids
segregate. Note how meiosis I and II are both divided into prophase, metaphase, anaphase, and
telophase, since those stages have similar features to mitosis (Figure 4). After two rounds of
cytokinesis, four cells will be produced, each with a single copy of each chromosome in the set.
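
The argument that chromosome number would double every generation without meiosis can be checked with a little arithmetic. The Python sketch below is only an illustration: the function name chromosomes_after is hypothetical, and it assumes the human diploid number of 46 given in Chapter 15.

```python
def chromosomes_after(generations, diploid_number=46, meiosis=True):
    """Chromosome number per body cell after some number of sexual generations.

    With meiosis each gamete carries half the parental number; without it,
    each gamete carries the full number, so fertilization doubles the count.
    """
    count = diploid_number
    for _ in range(generations):
        gamete = count // 2 if meiosis else count  # reduction happens here
        count = gamete + gamete                    # two gametes fuse
    return count


print(chromosomes_after(3, meiosis=True))
print(chromosomes_after(3, meiosis=False))
```

With meiosis the number stays at 46 in every generation; without it the count doubles each time (46, 92, 184, 368, ...).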


[Figure 4 panel labels]
Meiosis I: Prophase I (Leptotene, Zygotene, Pachytene, Diplotene, Diakinesis), Metaphase I, Anaphase I, Telophase I & Interkinesis
Meiosis II: Prophase II, Metaphase II, Anaphase II, Telophase II, Cell division / Gamete development
Mitosis: Prophase, Metaphase, Anaphase, Telophase, Cell division

Figure 4.
Stages of Prophase I and Meiosis with comparison to Mitosis. This example uses a diploid animal with 2 chromosome sets, so 4
chromosomes in total: Red, Maroon, Blue and Teal. Cross over events are shown between the two closest non-sister chromatids,
but in reality can happen between all four chromatids.
Prophase I is divided into stages. Leptotene is defined by the beginning of chromosome condensation, though chromosomes are
still long. In Zygotene the chromosomes are still long, but individual chromosomes can be readily identified as they start to pair.
In Pachytene the chromosomes are thickening and fully synapsed. During Diplotene one can begin to see the individual chromatids
and chiasmata. In Diakinesis the chromosomes are fully condensed and the nuclear membrane dissolves. In Metaphase I the synapsed
chromosomes align along the metaphase plate, and the synapse then breaks in Anaphase I. Meiosis I is completed with Telophase I
and potentially interkinesis, completing the reductional division. Meiosis II is an equational division in which the chromosomes
align in Metaphase II similarly to Mitosis and then complete Anaphase II and Telophase II, leaving 4 haploid gametes.
Mitosis is listed for comparison. See the chapter on Mitosis for more details on the stages.
(Original–L. Canham–CC BY-NC 3.0)


Figure 5.
Meiosis in Arabidopsis (n=5).
Panels A-C show different
stages of prophase I, each
with an increasing degree of
chromosome condensation.
Subsequent phases are
shown: metaphase I (D),
telophase I (E), metaphase II
(F), anaphase II (G), and
telophase II (H). (PLoS
Genetics-Chelysheva, L. et al
(2008) PLoS Genetics- CC BY
4.0)

2. MEIOSIS I
Meiosis I is called a reductional division, because it
reduces the number of chromosomes inherited in
each of the daughter cells – the parent cell is 2N
while the two daughter cells are each 1N. Meiosis I
is further divided into Prophase I, Metaphase I,
Anaphase I, and Telophase I, which are roughly
similar to the corresponding stages of mitosis,
except that in Prophase I and Metaphase I,
homologous chromosomes pair up with each
other, or synapse, and are called bivalents (Figure
7), in contrast with mitosis where the
chromosomes line up individually during
metaphase. This is an important difference
between mitosis and meiosis, because it affects the
segregation of alleles, and also allows for
recombination to occur through crossing-over,
which will be described later. During Anaphase I,
one member of each pair of homologous
chromosomes migrates to each daughter cell (1N)
(Figure 6).
In meiosis I replicated, homologous chromosomes pair up, or synapse, during prophase I, line up in
the middle of the cell during metaphase I, and separate during anaphase I. For this to happen the
homologous chromosomes need to be brought together while they condense during prophase I. During
synapsis, proteins bind to both homologous chromosomes along their entire length and form the
synaptonemal complex (synapse means junction). These proteins hold the chromosomes in the transient
structure of a bivalent (Figure 7). The proteins are released when the cell enters anaphase I.

Figure 6.
Changes in DNA and chromosome content during the cell cycle. For simplicity, nuclear membranes are
not shown, and all chromosomes are represented in a similar stage of condensation.
(Original-Deyholos- CC BY-NC 3.0)
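
The changes in N and C values through meiosis can be tallied step by step. The Python sketch below is a bookkeeping aid under stated assumptions, not a model of the cell: the class Cell and the functions replicate, meiosis_one and meiosis_two are hypothetical names, N is counted in chromosome sets, and C is counted in multiples of the DNA content of one gamete (as defined in Chapter 15). Meiosis II, the equational division, is included so that the tally runs through to the gametes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    n: int  # number of chromosome sets (N value)
    c: int  # DNA content in multiples of one gamete's DNA (C value)


def replicate(cell):
    """S phase: DNA content doubles, chromosome number is unchanged."""
    return Cell(cell.n, cell.c * 2)


def meiosis_one(cell):
    """Reductional division: homologues separate, so N and C are both halved."""
    daughter = Cell(cell.n // 2, cell.c // 2)
    return [daughter, daughter]


def meiosis_two(cell):
    """Equational division: sister chromatids separate, so only C is halved."""
    daughter = Cell(cell.n, cell.c // 2)
    return [daughter, daughter]


meiocyte = replicate(Cell(n=2, c=2))         # 2N, 4C after DNA replication
after_first = meiosis_one(meiocyte)          # two cells, each 1N, 2C
gametes = [g for cell in after_first for g in meiosis_two(cell)]  # four cells, 1N, 1C

print(meiocyte, after_first, gametes, sep="\n")
```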


Figure 7.
Diagram of a pair of homologous chromosomes during Prophase I, shown pre-crossover and
post-crossover. Labelled structures: synaptonemal complex, sister chromatids, non-sister chromatids,
bivalent, chiasma.
Sister chromatids are chromatids found in one chromosome, so both blue chromatids are sister
chromatids, and both green chromatids are sister chromatids. Non-sister chromatids are on different
chromosomes of the pair, so the green’s non-sister chromatid is the blue.
When a pair of homologous chromosomes synapse during Prophase I they form a bivalent. Proteins known
as the synaptonemal complex form between both chromosomes and join them together. Crossovers form
between non-sister chromatids, forming a cross-structure called a chiasma.
(Original-L. Canham- CC BY-NC 3.0)

2.1. STAGES OF PROPHASE I
In meiosis, Prophase I is divided up into five visual stages that are steps along a continuum of
events (Figure 4): leptotene, zygotene, pachytene, diplotene and diakinesis. From interphase, a cell
enters leptotene as the nuclear material begins to condense into long thin visible threads
(chromosomes). During zygotene homologous chromosomes begin to pair up (synapse) and form an
elaborate structure called the synaptonemal complex along their length. During zygotene the
chromosomes are still quite long, but it is more apparent that they are distinct now. At pachytene
homologous chromosomes are thicker and fully synapsed (two chromosomes and four chromatids) to form
bivalents. Crossing over (see section below) takes place in pachytene. After this, the pairing
begins to loosen in diplotene. Remember that these are replicated chromosomes, but before this point
this isn’t apparent. This is also when the consequences of each crossing over event can be seen as a
cross structure known as a chiasma (plural: chiasmata). Diakinesis follows as the chromosomes
continue to fully condense and individualize. It is at this point that the nuclear membrane
dissolves and the microtubules begin to form. This is followed by metaphase I, where the paired
chromosomes orient on the metaphase plate in preparation for segregation (reductional).

2.2. METAPHASE I, ANAPHASE I AND TELOPHASE I
Metaphase I is where the major difference between mitosis and meiosis becomes apparent. The
homologous pairs, or bivalents, orient themselves along the metaphase plate and the microtubules
attach themselves to each chromosome’s centromere, with each homologue of a pair attached to only
one pole (Figure 4). This is different from mitosis, where the chromosomes align individually and
the microtubules from both poles attach themselves to an individual chromosome in preparation for
separating the chromatids.

During Anaphase I and Telophase I, the homologous chromosome pairs segregate to their respective
poles, but keep the sister chromatids of each chromosome together. Telophase I completes with cell
division to create two cells. Different organisms and cells behave differently after telophase I. In
some cells, the nuclear membrane reforms around the chromosomes in each pole, and the chromosomes
become elongated again. These cells may stay in the state of interkinesis for some time. In other
organisms the chromosomes stay condensed, no nuclear membrane forms, and the cell goes directly into
meiosis II.

3. MEIOSIS II
At the completion of meiosis I there are two cells, each with one, replicated copy of each
chromosome (1N). Because the number of chromosomes per cell has decreased (2->1),
meiosis I is called a reductional cell division. Meiosis II resembles mitosis, with one sister
chromatid from each chromosome separating to produce two daughter cells. Because Meiosis II, like
mitosis, results in the segregation of sister chromatids, Meiosis II is called an equational
division (Figure 6).

If after telophase I the cells went into a state of interkinesis, then during prophase II the
haploid chromosomes will condense and the nuclear membrane will dissolve again. If interkinesis did
not happen, then the cell will continue with meiosis II (Figure 4). Prophase II ends like in mitosis
with the microtubules beginning to form. As metaphase II starts, the pairs of sister chromatids
align themselves along the metaphase plate, with the two sister chromatids of each chromosome
attached to microtubules from opposite poles. Anaphase II splits the sister chromatids and the
microtubules pull them to the opposite poles. Telophase II reforms the nuclear membrane around the
chromosomes, ending finally with cytokinesis and producing four cells with only one unreplicated
chromosome of each type. There will be allelic differences among gametes based upon segregation of
heterozygous alleles (Note the differences in colours of chromosomes in each of the gametes in
Figure 4).

3.1. GAMETE MATURATION
In animals and plants the cells produced by meiosis need to mature before they become functional
gametes. In male animals the four products of meiosis are called spermatids. They grow structures
such as tails and become functional sperm cells. In female animals the gametes are eggs. For each
egg to contain the maximum amount of nutrients, typically only one of the four products of meiosis
becomes an egg. The other three cells end up as tiny disposable cells called polar bodies. In plants
the products of meiosis reproduce a few times using mitosis as they develop into functional male or
female gametes.

4. CROSSING OVER (INTRA-CHROMOSOMAL RECOMBINATION)
During prophase I the homologous chromosomes pair together and form a synaptonemal complex. Crossing
over occurs within the synaptonemal complex. A crossover is a place where DNA repair enzymes break
the DNA of two non-sister chromatids in similar locations and then covalently reattach the broken
ends so that the non-sister chromatids exchange segments. This reorganization of chromatids will
persist for the remainder of meiosis and result in recombination of alleles in the gametes.
Crossover events can be seen as chiasmata on the synapsed chromosomes in late Meiosis I.

Crossovers function to hold homologous chromosomes together during meiosis I so they orient
correctly and segregate successfully. Crossing over also reshuffles the allele combinations along a
chromosome, resulting in genetic diversity that can be selected in a population over time
(evolution).

5. ONE LOCUS ON A CHROMOSOME - SEGREGATION - MONOHYBRID
Not only did Mendel solve the mystery of inheritance as units (genes), he also invented several
testing and analysis techniques still used today. Classical genetics is the science of examining
biological questions using controlled matings of model organisms. It began with Mendel in 1865 but
did not attain widespread usage until Mendel’s work was rediscovered around 1900 by four researchers
(E. von Tschermak, H. de Vries, C. Correns, and W. J. Spillman). Then Thomas Morgan began working
with fruit flies in 1908 and built on this work. Later, starting with Watson and Crick’s structure
of DNA in 1953, classical genetics was joined by molecular genetics, the science of solving
biological problems using DNA, RNA, and proteins. The genetics of DNA cloning began in 1970 with the
discovery of restriction enzymes and plasmids as cloning vectors.

Knowing what we now know about the process of meiosis, we can better understand the mechanisms
underlying Mendel’s First Law. The Law of Segregation states that every individual contains a pair
of alleles for each gene, which segregate during the formation of gametes, and so for every gene
pair each parent passes on a random allele to
its offspring. The series of experiments that led to the formulation of Mendel's first law were
based on the process of monohybrid crosses, which will be described below.

5.1. TERMINOLOGY
A specific position, region, or segment along a chromosome is called a locus. Each gene occupies a
specific locus (so the terms locus and gene are often used interchangeably). Each locus will have an
allelic form (allele). The complete set of alleles (at all loci of interest) in an individual is its
genotype. Typically, when writing out a genotype, only the alleles at the locus (loci) of interest
are considered; all the others are present and assumed to be wild type but are normally not written
in the genotype. The observable or detectable effect of these alleles on the structure or function
of that individual is called its phenotype. The phenotype studied in any particular genetic
experiment may range from simple, visible traits such as hair color, to more complex phenotypes
including disease susceptibility or behavior. If two alleles are present in an individual, then
various interactions between them may influence their expression in the phenotype.

5.2. TRUE BREEDING LINES
Geneticists make use of true-breeding lines just as Mendel did (Figure 8a). These are in-bred
populations of plants or animals in which all parents and their offspring (over many generations)
have the same phenotypes with respect to a particular trait. True-breeding lines are useful, because
they are typically assumed to be homozygous for the alleles that affect the trait of interest. When
two individuals that are homozygous for the same alleles are crossed, all of their offspring will
also be homozygous. The continuation of such crosses constitutes a true breeding line or strain. A
large variety of different strains, each with a different, true breeding character, can be collected
and maintained for genetic research.

5.3. MONOHYBRID CROSSES
A monohybrid cross is one in which both parents are heterozygous (or a hybrid) for a single (mono)
trait. The trait might be petal colour in pea plants (Figure 8b). Recall from Figure 3 that the
generations in a cross are named P (parental), F1 (first filial), F2 (second filial), and so on.

By using monohybrid crosses, Mendel discovered that genes were discrete units that separated in the
creation of offspring. Previous ideas of blending inheritance would mean that a cross between a
white flower and a purple flower would create a ‘blended’ phenotype. Instead, Mendel saw that the
hybrids showed one of the distinct parental colours and, when crossed, produced the purple and white
of the parents in specific ratios. These traits were not blended when the true-breeding lines were
crossed, but instead those parental alleles were carried on through the offspring. Through the
monohybrid cross he was able to discern the dominant and recessive alleles of each gene he studied
in the pea plants. In further crosses (F3, F4, etc.), these traits were continuously transmitted and
not lost, though they may be hidden as seen in the F1 generation.

Figure 8.
(a) A true-breeding line (b) A monohybrid cross produced by mating two different pure-breeding
lines.
(Original-Deyholos-CC BY-NC 3.0)

6. PUNNETT SQUARES - 3:1 RATIO
The specific ratios seen in the monohybrid cross can be described using a Punnett square, named
after R.C. Punnett who devised this approach. Given the genotypes of any two parents, we can predict
all of the possible genotypes of the offspring. Furthermore, if we also know the
dominance relationships for all alleles, we can predict the phenotypes of the offspring. This
provides a convenient method for calculating the expected genotypic and phenotypic ratios from a
cross.

A Punnett square is a matrix in which all of the possible gametes produced by one parent are listed
along one axis, and the gametes from the other parent are listed along the other axis. Each possible
combination of gametes is listed at the intersection of each row and column, since we know through
the process of meiosis that the alleles on each chromosome separate to form the gametes.

The F1 cross from Figure 8b would be drawn as in Figure 9. As you can see, in a monohybrid cross,
the offspring ratios will be 3:1 of dominant phenotype (purple) : recessive phenotype (white).
Punnett squares can also be used to calculate the frequency of offspring. The frequency of each
offspring genotype is the frequency of the male gamete multiplied by the frequency of the female
gamete.

Figure 9.
         A     a
    A    AA    Aa
    a    Aa    aa
A Punnett square showing a monohybrid cross. The purple boxes (AA, Aa) represent the purple colour
conferred by the dominant (A) allele, while the white box represents the recessive homozygote (aa).
(Original-L. Canham- CC BY-NC 3.0)

7. SINGLE LOCUS TEST CROSSES
Knowing the genotype of an individual is an important part of a genetic experiment. However,
genotypes cannot be observed directly; they must be inferred based on phenotypes. Because of
dominance, it is often not possible to distinguish between a heterozygote and a homozygote based on
phenotype alone (e.g. see the purple-flowered F2 plants in Figure 8b). To determine the genotype of
a specific individual, a test cross can be performed, in which the individual with an unknown
genotype is crossed with an individual that is homozygous recessive for all of the loci being
tested.

For example, if you were given a pea plant with purple flowers it might be a homozygote (AA) or a
heterozygote (Aa). You could cross this purple-flowered plant to a white-flowered plant as a tester,
since you know the genotype of the tester is aa. Depending on the genotype of the purple-flowered
parent (Figure 10), you will observe one of two phenotypic ratios in the F1 generation. If the
purple-flowered parent was a homozygote AA, all of the F1 progeny will be purple. If the
purple-flowered parent was a heterozygote Aa, the F1 progeny should segregate purple-flowered and
white-flowered plants in a 1:1 ratio.

         A     A                  A     a
    a    Aa    Aa            a    Aa    aa
    a    Aa    Aa            a    Aa    aa

Figure 10.
Punnett squares showing the two possible outcomes of a
single locus test cross. (Original-L. Canham- CC BY-NC 3.0)
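
The Punnett squares for the monohybrid cross and the test cross can also be worked out by enumeration. The Python sketch below is a minimal illustration, not a standard tool: the function names punnett and phenotypes are hypothetical, genotypes are assumed to be two-letter strings such as "Aa", and complete dominance of the uppercase allele is assumed, as in Mendel's pea traits.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def punnett(parent1, parent2):
    """Genotype frequencies among the offspring of two single-locus parents.

    Each parent contributes either of its two alleles with probability 1/2,
    so each of the four boxes in the square has frequency 1/2 x 1/2 = 1/4.
    """
    offspring = Counter()
    for allele1, allele2 in product(parent1, parent2):
        # Sorting puts the uppercase allele first, so 'aA' is counted as 'Aa'.
        genotype = "".join(sorted(allele1 + allele2))
        offspring[genotype] += Fraction(1, 4)
    return dict(offspring)


def phenotypes(genotype_freqs, dominant="A"):
    """Collapse genotype frequencies into dominant and recessive phenotypes."""
    result = Counter()
    for genotype, freq in genotype_freqs.items():
        label = "dominant" if dominant in genotype else "recessive"
        result[label] += freq
    return dict(result)


print(phenotypes(punnett("Aa", "Aa")))  # monohybrid cross
print(phenotypes(punnett("AA", "aa")))  # test cross, homozygous parent
print(phenotypes(punnett("Aa", "aa")))  # test cross, heterozygous parent
```

The monohybrid cross gives dominant and recessive phenotypes at frequencies of 3/4 and 1/4 (the 3:1 ratio). The test cross gives all dominant progeny if the tested parent is AA, and 1/2 of each phenotype (the 1:1 ratio) if it is Aa.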


___________________________________________________________________________
SUMMARY:
• Mendel demonstrated that heredity involved discrete, heritable factors that affected specific traits.
• A gene can be defined operationally as a unit of inheritance.
• Homologous chromosomes contain the same series of genes along their length, but not necessarily the
same alleles. Sister chromatids initially contain the same alleles.
• Homologous chromosomes pair (synapse) with each other during meiosis, but not mitosis.
• A diploid organism can have up to two different alleles at a single locus. The alleles segregate equally
between gametes during meiosis.
• Phenotype depends on the alleles that are present, their dominance relationships, and sometimes also
interactions with the environment and other factors.
• Classical geneticists make use of true breeding lines, monohybrid crosses, Punnett squares, test
crosses, and reciprocal crosses.
KEY TERMS:
blending inheritance              chiasma / chiasmata
Gregor Mendel                     diakinesis
particulate inheritance           metaphase I
alleles                           anaphase I
Mendel’s First Law                telophase I
The Law of Equal Segregation      interkinesis
dominant                          prophase II
recessive                         metaphase II
meiosis I                         anaphase II
meiosis II                        telophase II
gametes                           polar bodies
meiocytes                         classical genetics
reductional                       molecular genetics
synapse                           DNA cloning
bivalent                          monohybrid cross
equational                        locus
pair up                           genotype
synaptonemal complex              phenotype
leptotene                         true-breeding lines
zygotene                          Punnett square
pachytene                         test cross
crossing over                     tester
diplotene


STUDY QUESTIONS:
1) How would the results of the cross in Figure 3
have been different if heredity worked through
blending inheritance rather than particulate
inheritance?
2) A simple mnemonic for leptotene, zygotene,
pachytene, diplotene, & diakinesis is Lame
Zebras Pee Down Drains. Make another one
yourself.
3) What is the maximum number of alleles at a
given autosomal locus in a normal gamete from
a diploid individual? In the whole population of
a species?
4) Wiry hair (W) is dominant to smooth hair (w)
in dogs.
a) If you cross a homozygous, wiry-haired
dog with a smooth-haired dog, what will be
the genotype and phenotype of the F1
generation?
b) If two dogs from the F1 generation mated,
what would be the most likely ratio of hair
phenotypes among their progeny?
c) When two wiry-haired Ww dogs actually
mated, they had a litter of three puppies,
which all had smooth hair. How do you
explain this observation?
d) Someone left a wiry-haired dog on your
doorstep. Without extracting DNA, what
would be the easiest way to determine the
genotype of this dog?
e) Based on the information provided in
question 1, can you tell which, if either, of
the alleles is wild-type?
5) An important part of Mendel’s experiments was
the use of homozygous lines as parents for his
crosses. How did he know they were
homozygous, and why was the use of the lines
important?
6) Does equal segregation of alleles into daughter
cells happen during mitosis, meiosis, or both?


CHAPTER 17 – MENDEL’S SECOND LAW:

INDEPENDENT ASSORTMENT
Figure 1.
Hand pollination of a pumpkin flower. When
Mendel was doing his crosses with pea plants, he
pollinated each flower by hand in a similar way in
order to be sure he knew the parents of each
cross.
(Flickr-S. Hurmerinta- CC BY-NC 2.0)

INTRODUCTION
The principles of genetic analysis that we have described for a single locus in Chapter 16 will be extended to the study of alleles at two loci in this Chapter. The analysis of two loci in the same cross provides information for genetic mapping (Chapter 18) and testing gene interactions (Chapter 26). These techniques are very useful for both basic and applied research. Before discussing these techniques, we will first revisit Mendel’s classical experiments.

Before Mendel’s 1865 publication, blended inheritance was the accepted model to explain the transmission of traits. It was Mendel’s work that established that heritable traits were controlled by discrete factors, which we now call alleles, in a particulate inheritance model. At the time, an important question was whether heritable traits, controlled by discrete factors, were inherited independently of each other. To answer this, Mendel took two apparently unrelated traits, such as seed shape and seed color, and studied their inheritance together in one individual. For example, he studied two variants of each trait: seed color was either green or yellow, and seed shape was either round or wrinkled. (He studied seven traits in all, each on a different chromosome.) When either of these traits was studied individually, the phenotypes segregated in the classical 3:1 ratio among the progeny of a monohybrid cross (Figure 2), with ¾ of the seeds yellow and ¼ green in one cross, and ¾ round and ¼ wrinkled in the other cross. Would this be true when both hybrids were in the same individual?

Figure 2.
Monohybrid crosses involving two distinct traits in peas. a) is R/r and b) is Y/y. Monohybrid crosses are covered in more detail in Chapter 16.
(Original-Deyholos-CC BY-NC 3.0)

Like in the previous chapter, we will first walk through how a dihybrid cross works at the DNA level, and then we will explain the results that


Mendel saw that led him to his law, the Law of Independent Assortment.

When dealing with alleles at two different loci, we have to use nomenclature that makes the arrangement clear. There are three possible arrangements: both loci are on the same chromosome (AB/ab), on different chromosomes (A/a; B/b), or unknown (AaBb).

1. TWO LOCI ON DIFFERENT CHROMOSOMES
The separation of gametes through the process of meiosis has already been introduced. But what does that mean when you are taking multiple different genes (or loci) into account?

Remember the main stages of meiosis. The homologous pairs align during metaphase I, and complete one round of cell division. Then during metaphase II in those two cells the replicated chromosomes align individually and the sister chromatids separate, so when complete you have four daughter cells. Let’s say one chromosome has gene A on it, and another chromosome has gene B on it, and the individual is heterozygous at each gene (a.k.a. has the genotype A/a ; B/b). There are a variety of ways that the homologous pairs can align themselves during metaphase I. The orientation of that alignment will affect the alleles each gamete receives at the end of telophase II (Figure 3).

Because the alignment at metaphase I is always random, you will see a random, equal distribution of alleles in all the gametes produced. This means that one allele doesn’t affect the distribution of another allele, or in other words, each allele assorts independently (Independent Assortment).

Figure 3.
[Diagram panels: Metaphase I, Telophase II]
Independent assortment as seen on two different chromosomes. Gene A is found on the short chromosome and Gene B is found on the long chromosome, and both genes are heterozygous for the dominant (A and B) and recessive (a and b) alleles. The orientation in which the chromosomes align themselves during metaphase I affects the alleles found in the 4 gametes produced after telophase II.
These are just two of many orientations the chromosomes can arrange themselves in at metaphase I. The full stages of meiosis were removed for simplicity; refer to chapter 16 to understand the divisions that lead to the 4 gametes seen in telophase II.
(Original-L. Canham-CC BY-NC 3.0)
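
The effect of the metaphase I orientation on the gametes can be written out as a short Python sketch. This is an illustration under simplifying assumptions, not a model from the original text: the two loci are on different chromosomes, crossovers are ignored, and the function names `gametes_by_orientation` and `pooled_frequencies` are invented for this example.

```python
from collections import Counter
from fractions import Fraction


def gametes_by_orientation(locus1, locus2):
    """List the 4 gametes produced for each of the two metaphase I orientations.

    locus1 and locus2 are pairs of alleles on two different chromosome pairs,
    e.g. ("A", "a") and ("B", "b").
    """
    orientations = [locus2, locus2[::-1]]  # the second pair can face either way
    results = []
    for second in orientations:
        # After meiosis I, each pole holds one homolog from each pair.
        poles = [locus1[0] + second[0], locus1[1] + second[1]]
        # Meiosis II separates sister chromatids, so each product appears twice.
        results.append(poles * 2)
    return results


def pooled_frequencies(locus1, locus2):
    """Gamete frequencies when both orientations are equally likely."""
    counts = Counter(g for orientation in gametes_by_orientation(locus1, locus2)
                     for g in orientation)
    total = sum(counts.values())
    return {g: Fraction(n, total) for g, n in counts.items()}


if __name__ == "__main__":
    for gametes in gametes_by_orientation(("A", "a"), ("B", "b")):
        print(gametes)
    print(pooled_frequencies(("A", "a"), ("B", "b")))
```

One orientation yields only AB and ab gametes, and the other yields only Ab and aB. Because the two orientations are equally likely, the four gamete types each end up at a frequency of ¼ when many meioses are pooled.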
2. TWO LOCI ON ONE CHROMOSOME
Based on the description in the last section, it would be expected that if the genes were on the same chromosome the alleles would travel together through meiosis (Figure 4 top). However, when tested this is not always the case. The recombination of alleles can be explained through the phenomenon of crossing over, which occurs during prophase I as described in chapter 16.

Crossing over is an exchange between non-sister chromatids that can occur at any position along the entire chromosome. If the two loci that are being considered are sufficiently separated from each other on the chromosome, crossover events can occur between the two loci.

This, coupled with the random orientation in which the chromosomes align during metaphase I, will allow


the other combination of alleles in the gametes (Figure 4 bottom).

While not shown in Figure 4, if the two loci are very far apart, multiple crossover events can also take place, further increasing the shuffling of alleles.

Figure 4.
[Diagram panels: Metaphase I, Telophase II]
Independent assortment as seen on the same chromosome. On the top is an example of what would happen if crossovers do not occur. The dominant alleles of gene A and gene D would travel together, not leading to independent assortment. Crossovers do occur in most situations though, like in the bottom half of the figure. If a crossover occurs between the two genes, then the alleles will transfer to the other non-sister chromatid, thus rearranging alleles. This allows for independent assortment, despite being on the same chromosome.
This is just one of the many arrangements or crossover events that could occur during meiosis, with every meiocyte arranging itself differently with different crossovers.
(Original-L. Canham-CC BY-NC 3.0)

The farther apart the two loci are on the chromosome, the more crossover events will take place between them. Ultimately, this will result in similar allele combinations to those observed in independent assortment shown above, even if they are on the same chromosome.

If the loci are very close together on the same chromosome, fewer crossovers are likely to occur between them. We will not discuss this situation here, but will do so later in chapter 18.

3. A DIHYBRID CROSS SHOWING MENDEL’S SECOND LAW (INDEPENDENT ASSORTMENT)
Mendel found that each locus had two alleles that segregated from each other during the creation of gametes. He wondered whether dealing with multiple traits at a time would affect this segregation, so he created a dihybrid cross. The distribution of offspring from his experiments led him to formulate Mendel’s Second Law, the Law of Independent Assortment, which states that the segregation of alleles at one locus will not influence the segregation of alleles at another locus during gamete formation – the alleles segregate independently. Next, we will discuss how he came to this understanding, given that independent assortment occurs.

3.1. MENDEL’S SECOND LAW
To analyze the simultaneous segregation of two traits in the same individual, he crossed a pure-breeding line of green, wrinkled peas with a pure-breeding line of yellow, round peas. This produced F1 progeny that all had yellow and round peas. They were called dihybrids because they carried two different alleles at each of the two loci (Figure 5).

From Figure 2 we know that yellow and round are dominant, and green and wrinkled are recessive. If the inheritance of seed color was truly independent of seed shape, then when the F1 dihybrids were crossed to each other, a 3:1 ratio of one trait should be observed within each phenotypic class of the other trait (Figure 5). Using the product law, we would therefore predict


that if ¾ of the progeny were yellow, and ¾ of the progeny were round, then ¾ × ¾ = 9/16 of the progeny would be both round and yellow (Table 1).

Likewise, ¾ × ¼ = 3/16 of the progeny would be both round and green. And ¾ × ¼ = 3/16 of the progeny would be both wrinkled and yellow. And ¼ × ¼ = 1/16 of the progeny would be both wrinkled and green. So by applying the product rule to all of these combinations of phenotypes, we can predict a 9:3:3:1 phenotypic ratio among the progeny of this dihybrid cross if the two loci assort independently and certain other conditions are met (see section below). Indeed, 9:3:3:1 is very close to the ratio Mendel observed in his studies of dihybrid crosses, leading him to formulate his Second Law, the Law of Independent Assortment.

Figure 5.
Two pure-breeding lines are crossed to produce dihybrids in the F1 generation. These F1 are crossed to produce four phenotypic classes, which appear in a 9:3:3:1 ratio.
(Original-Deyholos-CC BY-NC 3.0)

The 9:3:3:1 phenotypic ratio that we calculated using the product rule could also be obtained using a Punnett Square (Figure 6). First, we list the genotypes of the possible gametes along each axis of the Punnett Square. In a diploid with two heterozygous genes of interest, there are up to four combinations of alleles in the gametes of each parent. The gametes from the respective rows and columns are then combined in each cell of the array. When working with two loci, genotypes are written with the symbols for both alleles of one locus, followed by both alleles of the next locus (e.g. AaBb, not ABab). Note that the order in which the loci are written does not imply anything about the actual position of the loci on the chromosomes.

Figure 6.
A Punnett Square showing the results of the dihybrid cross from Figure 5. Each of the four phenotypic classes is represented by a different color of shading.
(Original-Deyholos-CC BY-NC 3.0)
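
The Punnett Square for the RrYy × RrYy cross can also be filled in programmatically. The following Python sketch is only an illustration: the names `GAMETES`, `phenotype` and `punnett_counts` are invented here, and it assumes independent assortment and complete dominance of R (round) over r (wrinkled) and Y (yellow) over y (green).

```python
from collections import Counter
from itertools import product

# The four gamete types each RrYy parent can make (assumed equally frequent).
GAMETES = ["RY", "Ry", "rY", "ry"]


def phenotype(cell):
    """Phenotype of one Punnett Square cell, e.g. 'RYry' (two gametes joined)."""
    shape = "round" if "R" in cell else "wrinkled"   # at least one R allele
    color = "yellow" if "Y" in cell else "green"     # at least one Y allele
    return f"{shape} & {color}"


def punnett_counts(gametes1=GAMETES, gametes2=GAMETES):
    """Count how many of the cells in the square fall in each phenotypic class."""
    counts = Counter()
    for g1, g2 in product(gametes1, gametes2):       # one cell per row/column pair
        counts[phenotype(g1 + g2)] += 1
    return counts


if __name__ == "__main__":
    for label, n in punnett_counts().items():
        print(f"{n}/16 {label}")
```

Under these assumptions the 16 cells split into 9 round & yellow, 3 round & green, 3 wrinkled & yellow and 1 wrinkled & green.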

Frequency of phenotypic classes within separate monohybrid crosses:

seed shape:   ¾ round     ¼ wrinkled
seed color:   ¾ yellow    ¼ green

Frequency of phenotypic classes within a dihybrid cross:

¾ round    × ¾ yellow = 9/16 round & yellow
¾ round    × ¼ green  = 3/16 round & green
¼ wrinkled × ¾ yellow = 3/16 wrinkled & yellow
¼ wrinkled × ¼ green  = 1/16 wrinkled & green

Table 1.
Phenotypic classes expected in monohybrid and dihybrid crosses for two seed traits in pea.
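
The product rule calculation in Table 1 can be reproduced with exact fractions. This is a minimal sketch, assuming the two traits are independent; the dictionary names `shape` and `color` and the function `dihybrid_classes` are chosen for this example only.

```python
from fractions import Fraction
from itertools import product

# Monohybrid phenotype frequencies, as in the top half of Table 1.
shape = {"round": Fraction(3, 4), "wrinkled": Fraction(1, 4)}
color = {"yellow": Fraction(3, 4), "green": Fraction(1, 4)}


def dihybrid_classes(trait1, trait2):
    """Product rule: multiply the frequencies of independent phenotypes."""
    return {f"{p1} & {p2}": f1 * f2
            for (p1, f1), (p2, f2) in product(trait1.items(), trait2.items())}


classes = dihybrid_classes(shape, color)
ratio = [int(freq * 16) for freq in classes.values()]  # express as parts out of 16

if __name__ == "__main__":
    for label, freq in classes.items():
        print(f"{freq} {label}")
    print(":".join(str(n) for n in ratio))
```

The four frequencies sum to 1, and written as parts out of 16 they give the 9:3:3:1 ratio.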


To calculate the expected phenotypic ratios, we assign a phenotype to each of the 16 genotypes in the Punnett Square, based on our knowledge of the alleles and their dominance relationships.

In the case of Mendel’s seeds, any genotype with at least one R allele and one Y allele will be round and yellow; these genotypes are shown in the nine, green-shaded cells in Figure 6. We can represent all four of the different genotypes shown in these cells with the notation (R_Y_), where the blank line (_) means “any allele”. The three cells whose genotypes have at least one R allele and are homozygous recessive for y (i.e. R_yy) will have a round, green phenotype. Conversely, the three cells that are homozygous recessive for r, but have at least one Y allele (rrY_), will have wrinkled, yellow seeds. Finally, the rarest phenotypic class of wrinkled, green seeds is produced by the doubly homozygous recessive genotype, rryy, which is expected to occur in only one of the sixteen possible offspring represented in the square.

3.2. ASSUMPTIONS OF THE 9:3:3:1 RATIO
Both the product rule and the Punnett Square approaches showed that a 9:3:3:1 phenotypic ratio is expected among the progeny of a dihybrid cross such as Mendel’s RrYy × RrYy. In making these calculations, we assumed that:
(1) alleles at each locus segregate independently of the alleles at the other;
(2) one allele at each locus is completely dominant (the other recessive); and
(3) each of four possible phenotypes can be distinguished unambiguously, with no interactions between the two genes that would interfere with determining the genotype correctly.
For simplicity, most student examples involve easily scored phenotypes, such as pigmentation or other changes in visible structures. However, keep in mind that the analysis of segregation ratios of any two marker loci can provide insight into their relative positions on chromosomes.

3.3. DEVIATIONS FROM THE 9:3:3:1 PHENOTYPIC RATIO
There can be deviations from the 9:3:3:1 phenotypic ratio. These situations may indicate that one or more of the above conditions has not been met. Modified ratios in the progeny of a dihybrid cross can therefore reveal useful information about the genes involved. One such example is linkage.

Linkage is one of the most important reasons for distortion of the ratios expected from independent assortment. Two loci show linkage if they are located close together on the same chromosome. This close proximity alters the frequency of allele combinations in the gametes. We will return to the concept of linkage in Chapter 18. Deviations from 9:3:3:1 ratios can also be due to interactions between genes, such as epistasis, duplicate gene action and complementary gene action. These interactions are discussed in Chapter 26.

4. THE DIHYBRID TEST CROSS
While the cross of an F1 x F1 gives a ratio of 9:3:3:1, there is a better, easier cross to test for independent assortment: the dihybrid test cross. In a dihybrid test cross, independent assortment is seen as a ratio of 1:1:1:1, which is easier to score than the 9:3:3:1 ratio. This test cross will also be easier to use when testing for linkage (Chapter 18).

Like in monohybrid crosses (Chapter 16), you can do test crosses with dihybrids to determine the genotype of an individual with dominant phenotypes, to see if they are heterozygous or homozygous dominant. This type of cross is set up in the same fashion: an individual with an unknown genotype at two loci is crossed to an individual that is homozygous recessive for both loci.

Punnett squares should be done ahead of the crosses, so you know what to expect for any of the possible outcomes. Using the example from the rest of this chapter, you cross a double homozygous recessive pea plant (r/r ; y/y, green and wrinkled) to an unknown individual that has two dominant phenotypes (R/_ ; Y/_, yellow and round). There are four possible genotypes the unknown individual could be: R/R ; Y/Y or R/R ; Y/y or R/r ; Y/Y or R/r ; Y/y. The Punnett squares for the first two are listed below (Figure 7). Notice on the left you only get the dominant phenotype for both, so you know both genes in the unknown are


homozygous dominant. On the right you get only the dominant phenotype for round peas, but you get 50% yellow and 50% green peas, showing that the unknown is homozygous for round, but heterozygous for colour of the peas. Figure 8 is blank for you to fill in the two other gamete and genotype possibilities.

Left (unknown R/R ; Y/Y):
          R;Y        R;Y        R;Y        R;Y
  r;y   R/r Y/y    R/r Y/y    R/r Y/y    R/r Y/y
  r;y   R/r Y/y    R/r Y/y    R/r Y/y    R/r Y/y
  r;y   R/r Y/y    R/r Y/y    R/r Y/y    R/r Y/y
  r;y   R/r Y/y    R/r Y/y    R/r Y/y    R/r Y/y

Right (unknown R/R ; Y/y):
          R;Y        R;y        R;Y        R;y
  r;y   R/r Y/y    R/r y/y    R/r Y/y    R/r y/y
  r;y   R/r Y/y    R/r y/y    R/r Y/y    R/r y/y
  r;y   R/r Y/y    R/r y/y    R/r Y/y    R/r y/y
  r;y   R/r Y/y    R/r y/y    R/r Y/y    R/r y/y

Figure 7.
Punnett square for a test cross. The tester in both cases is
the male with the genotype r/r ; y/y .
On the left, the unknown has a genotype of R/R; Y/Y.
On the right, the unknown has the genotype R/R ; Y/y.
(Original-L. Canham-CC BY-NC 3.0)

r;y r;y

r;y r;y

Figure 8.
Blank Punnett squares to fill in the other two possibilities of the test cross.


___________________________________________________________________________
SUMMARY:
• The alleles of loci on different chromosomes are inherited independently of each other.
• The expected phenotypic ratio of a dihybrid cross is 9:3:3:1.
• The 9:3:3:1 ratio can be modified if the alleles at each locus do not show a simple dominant/recessive
relationship, or if there are gene interactions, or if the two loci are linked.
• A test cross gives a ratio of 1:1:1:1 for loci that assort independently.
KEY TERMS:
blended inheritance               dihybrid cross
heritable traits                  Mendel’s Second Law
particulate inheritance           Law of Independent Assortment
Independent Assortment (IA)       9:3:3:1
crossing over                     Linkage
dihybrid


STUDY QUESTIONS:
1) Figure 7 shows Punnett squares for two of the
four possible test crosses. Fill in the Punnett
squares in Figure 8 for the other two possible
genotypes of the unknown that aren’t shown.
2) Based on meiosis, when dealing with two loci,
there will always be four distinct gamete types.
But if the organism is homozygous, like the
tester, all those gametes will look the same. In
this situation, when writing a Punnett square, is
it necessary to write out the four similar
gametes? How would you re-draw the Punnett
Square on the right in Figure 7?
3) If two loci assort independently, then the
AABB x aabb cross will result in dihybrid
progeny, which when crossed together will give
ratios of 9:3:3:1 in the F2, assuming “A” and “B”
are dominant to “a” and “b”, respectively.
Now, assume that locus “A” and “B” are
somewhat linked and thus will NOT assort
independently. That is the “AB” and “ab”
combinations are more likely. How will this
affect (change) the 9:3:3:1 ratio?
4) Do the same first cross as Question 3 but make
the second cross a test cross (x aabb), with
expectation of a 1:1:1:1 ratio. How would the
ratio be changed if the two loci were not
assorting independently but are somewhat
linked?


CHAPTER 18 – GENES ON THE SAME CHROMOSOME: LINKAGE

Figure 1.
The coat colour on this juvenile horse is called Bay
Roan Tobiano. Bay is the brown base coat colour; Roan
is the mixture of white hairs with the base coat,
making a ‘foggy’ colour; and Tobiano is the white
patches. The genes causing the Roan and Tobiano coat
colours, respectively, are found on the same
chromosome and are linked. Knowing this, we can
predict which coat colour genes are from which
parents, and how those genes will be inherited in this
horse’s offspring.
(Wikimedia Commons-Kumana @ Wild Equines- CC BY
2.0)

INTRODUCTION
As we learned in Chapter 17, Mendel reported that the pairs of loci he observed segregated independently of each other; for example, the segregation of seed color alleles was independent from the segregation of alleles for seed shape. This observation was the basis for his Second Law (Independent Assortment), and contributed greatly to our understanding of heredity as single units. However, further research showed that Mendel’s Second Law did not apply to every pair of genes that could be studied. In fact, we now know that alleles of loci that are located close together on the same chromosome tend to be inherited together. This phenomenon is called linkage, and is a major exception to Mendel’s Second Law of Independent Assortment. Researchers use linkage to determine the location of genes along chromosomes in a process called genetic mapping. The concept of gene linkage is important to the natural processes of heredity and evolution, as well as to our genetic manipulation of crops and livestock.

1. GENETIC NOMENCLATURE & SYMBOLS
Nomenclature and symbols have been covered in previous chapters. This will be a brief review to revisit these topics.

A gene is a hereditary unit that occupies a specific position (locus) within the genome or chromosome and has one or more specific effects upon the phenotype of the organism and can mutate into various forms (alleles) (A Dictionary of Genetics 3rd Ed., King & Stansfield, 1985). A genotype is the specific allelic composition of a cell or organism. Normally only the genes under consideration are listed in a genotype and the alleles at the remaining gene loci are considered to be wild type. A phenotype is the detectable outward manifestation of a specific genotype. In describing a phenotype usually only the characteristics under consideration are listed while the remaining characters are assumed to be wild type (normal).


1.1. GENE NAMES AND SYMBOLS
Usually, gene names are unique and their corresponding symbols are unique letters or combinations of letters. So, for example, the "vermilion" gene in Drosophila is represented by the letter "v", while "vg" is the symbol for the "vestigial" gene and "vvl" is the symbol for the "ventral veins lacking" gene locus. Note however that the same letter symbols may represent a different gene in another organism. Gene symbols and gene names are typically shown in italicized text. In lectures we may not always use italics for gene names and symbols.

The normal, or wild type, form of a gene is usually symbolized by a superscript plus sign, "+". E.g. "a⁺", "b⁺", etc. or it is sometimes abbreviated to just "+". A forward slash is occasionally used to indicate that the two symbols are alleles of the same gene, but on homologous chromosomes.

A typical mutant form of the gene, of which there can be many, can be symbolized by a superscript minus sign, "-". E.g. "a⁻", "b⁻", etc., or sometimes abbreviated to just "a", "b", etc. (no superscript). Therefore, if the genotype of a diploid organism is given as a⁺/a⁻, it means there is a wild type allele and a mutant allele of the "a" gene at the "a" locus. This may also be abbreviated to +/a.

In some species of diploids, the dominant allele is typically designated with the upper case letter(s), while the recessive allele is given the lower case letter(s). For example, in Mendel’s peas the dominant round allele is “R”, while the recessive wrinkled allele is “r”.

2. RECOMBINATION
An understanding of meiosis, which leads to the separation of chromosomes, and of crossing over is necessary for this chapter. Refer to Chapters 16 and 17 for a review of these concepts.

The term “recombination” is used in several different contexts in genetics. In reference to heredity, recombination is defined as a process that results in gametes with combinations of alleles that were not present in the gametes from the parental generation (Figure 3). Recombination is important because it contributes to the genetic variation that may be observed between individuals within a population and that may be acted upon by selection for evolution.

Figure 2.
[Diagram labels: Cell, Nucleus, Homologous Chromosomes, Gene locus, Different alleles (A and a), written as A/a]
A diagram illustrating chromosomes, loci and alleles in a cell, and how we depict them as text.
(Original-J.Locke- CC BY-NC 3.0)

Figure 3.
When two loci are on
non-homologous
chromosomes, their
alleles will segregate in
combinations identical
to those present in the
parental gametes (Ab,
aB), and in recombinant
genotypes (AB, ab) that
are different from the
parental gametes.
(Original-Deyholos- CC
BY-NC 3.0)


2.1. INTER- AND INTRACHROMOSOMAL RECOMBINATION
Interchromosomal recombination occurs through independent assortment of alleles whose loci are on different chromosomes (Chapter 17). Intrachromosomal recombination occurs through crossovers between loci on the same chromosome. It is important to remember that in both of these cases, recombination is a process that occurs during meiosis (mitotic recombination may also occur in some species, but it is relatively rare).

As an example of interchromosomal recombination, consider loci on two different chromosomes as shown in Figure 3. We know that if these loci are on different chromosomes there is no physical connection between them, so they are unlinked and will segregate independently as did Mendel’s traits. The segregation depends on the relative orientation of each pair of chromosomes at metaphase. Since the orientation is random and independent of other chromosomes, each of the arrangements (and their meiotic products) is equally possible for two unlinked loci as shown in Figure 3.

Intrachromosomal recombination occurs through crossovers. Crossovers occur during prophase I of meiosis, when pairs of homologous chromosomes have aligned with each other in a process called synapsis. Crossing over begins with the breakage of DNA of a pair of non-sister chromatids. The breaks occur at corresponding positions on two non-sister chromatids, and then the ends of non-sister chromatids are connected to each other resulting in a reciprocal exchange of double-stranded DNA. Generally, every pair of chromosomes has at least one crossover during meiosis, but often multiple crossovers occur in each chromatid during prophase I. Further details and figures of crossovers are shown in Chapters 16 and 17.

Because interchromosomal recombination occurs through independent assortment, genes in this situation are always unlinked. Intrachromosomal recombination has instances of linked genes, and so they will be the focus of this chapter.

2.2. INHERITING PARENTAL AND RECOMBINANT GAMETES
If we consider only two loci and meiosis results in recombination, then the meiotic products (gametes) are said to have a recombinant genotype. On the other hand, if no recombination occurs between the two loci during meiosis, then the products retain their original combinations and are said to have a non-recombinant, or parental genotype. The ability to properly identify parental and recombinant gametes is essential to apply recombination to experimental examples.

To properly identify recombinant and parental gametes from an individual, you need to know the genotype of its parents (the P generation). This is most easily demonstrated in a dihybrid. If, for two genes, one parent has the genotype A/A B/B, they can only produce one type of gamete: AB. Similarly, if they are a/a b/b, then they can also only produce one type of gamete: ab (Figure 4 right). However, if those two gametes (AB and ab) combine, they create an individual (F1) that has a genotype written as A/a B/b. It can be easier to keep track of the parental combinations of gametes by keeping them together when writing the genotype, for this example AB/ab (Figure 4).

So the above dihybrid individual can produce four different gametes: AB, ab, Ab and aB. The parental gametes are those that the individual obtained from their parents, in this case AB and ab. Ab and aB are recombinant gametes and are evidence of a recombination event happening, resulting in a different combination of alleles (Figure 4 right).

For the above example, the P generation has one parent homozygous for both dominant alleles, and the other homozygous for both recessive alleles. It is very important to note that this will not always be the case. In some instances, one parent will be homozygous with one gene dominant and the other gene recessive (A/A b/b) and the other parent will be the opposite (a/a B/B). This situation will change which are the parental and recombinant gametes (compare left and right in Figure 4).


Figure 4.
The genotype of gametes can be inferred unambiguously if the gametes are produced by homozygotes. However, recombination
frequencies can only be measured among the progeny of heterozygotes (i.e. dihybrids). Note that the dihybrid on the left
contains a different configuration of alleles than the dihybrid on the right due to differences in the genotypes of their respective
parents. Therefore, different gametes are defined as recombinant (red) and parental (blue) among the progeny of the two
dihybrids. In the cross at left, the recombinant gametes will be genotype AB and ab, and in the cross on the right, the
recombinant gametes will be Ab and aB.
(Original-Deyholos-CC BY-NC 3.0)
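
The way parental and recombinant gametes depend on the P generation can be made concrete with a small Python sketch. It is an illustrative assumption-laden example rather than a method from the original text: the function name `classify_gametes` is invented, and each gamete is written as a two-letter string with one allele per locus.

```python
def classify_gametes(parent_gamete_1, parent_gamete_2):
    """Label the four possible gametes of a dihybrid as parental or recombinant.

    parent_gamete_1 and parent_gamete_2 are the two gametes that fused to form
    the dihybrid, e.g. "AB" and "ab" (or "Ab" and "aB").
    """
    locus1 = (parent_gamete_1[0], parent_gamete_2[0])  # alleles at the first locus
    locus2 = (parent_gamete_1[1], parent_gamete_2[1])  # alleles at the second locus
    parental = {parent_gamete_1, parent_gamete_2}
    labels = {}
    for first in locus1:
        for second in locus2:
            gamete = first + second
            labels[gamete] = "parental" if gamete in parental else "recombinant"
    return labels


if __name__ == "__main__":
    print(classify_gametes("AB", "ab"))  # P generation: A/A B/B x a/a b/b
    print(classify_gametes("Ab", "aB"))  # P generation: A/A b/b x a/a B/B
```

With the AB and ab parental gametes, Ab and aB come out as recombinant; with Ab and aB as the parental gametes, the labels swap and AB and ab are the recombinants, as described for the two crosses in Figure 4.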

2.3. COUPLING AND REPULSION CONFIGURATION

When looking at an organism that is heterozygous at two loci, you cannot tell from its appearance alone how the mutant and wild type alleles are arranged. Both mutant alleles could be on one homologous chromosome, and both wild type alleles could be on the other (e.g. a⁻b⁻ / A⁺B⁺). This is known as a coupling (or cis) configuration. When one wild type allele and one mutant allele are on one homologous chromosome, and the opposite is on the other, this is known as a repulsion (or trans) configuration (e.g. A⁺b⁻ / a⁻B⁺). The way to determine the orientation is to look at the parents (or P generation) of that cross, if you know their genotypes. If the parents are homozygous for both genes, and one shows both dominant phenotypes and the other shows both recessive phenotypes, then you know that the individual you are looking at is in coupling configuration. If one parent has one dominant and one recessive phenotype, and the other has the opposite, then you know the individual is in repulsion configuration.

Figure 5.
Alleles in coupling configuration (left) or repulsion configuration (right).
(Original-Deyholos-CC BY-NC 3.0)


2.4. RECOMBINATION FREQUENCY

Recombination frequency (RF) is the proportion of gametes that are recombinant, calculated from the numbers of parental and recombinant gametes. The equation is as follows:

$$RF = \frac{\#\ \text{Recombinant gametes}}{\#\ \text{Recombinant} + \#\ \text{Parental}}$$

By identifying and defining parental and recombinant gametes, you can calculate the RF and from there decide the degree of linkage.

Based upon the equation and independent assortment, you can see that the recombination frequency cannot be higher than 0.50. If alleles are assorting independently, there will be a random distribution of the alleles in the progeny, and so 50% will be recombinant gametes and 50% will be parental gametes, making the RF approximately 0.50. If genes are linked you will see a higher percentage of parental gametes, making the RF < 0.50. You will never see more recombinant gametes than parental gametes, and so in no situation will recombination frequency be higher than 0.50, except slightly because of standard experimental error. If you calculate a recombination frequency higher than 0.50, you need to make sure you accurately defined parental and recombinant gametes.

3. UNLINKED GENES AND COMPLETE AND PARTIAL LINKAGE

Any two genes can be varying distances apart. Their RF allows us to categorize their degree of linkage. The amount of linkage can be placed on a sliding scale.

Table 1 shows generally how we categorize the degree of linkage using recombination frequency. Because RF is based upon experimental results that will have some experimental error, these should be treated as guidelines and not hard rules in determining the distance between genes.

Linkage Description     Recombination Frequency
Unlinked                ~0.50 or ~50%
Partial linkage         <0.30 or <30%
Complete linkage        0.00 or 0%

Table 1.
The linkage description is listed corresponding to its recombination frequency. Note: values between 0.30 and 0.50 may be partially linked, or may not be linked at all. It is often difficult to distinguish between these two possibilities because of experimental error.

3.1. UNLINKED GENES
Unlinked genes appear to segregate and show independent assortment. There will be a random and even distribution of gamete types, and an RF of 0.50 is the expectation. This situation occurs in two instances: either when the genes are on completely different chromosomes, or when they are far enough apart on a single chromosome that the crossovers are so numerous that the alleles are distributed randomly (Figure 3). Either way, because the alleles are assorting independently you should observe an equal number of recombinant and parental gametes, with an RF near 0.50. Note, because of real life variability this value can be anywhere from ~0.40 to ~0.60.

3.2. COMPLETE LINKAGE
Having considered unlinked loci, let us turn to the opposite situation, in which two loci are so close together on a chromosome that the parental combinations of alleles always segregate together (Figure 6). This is because the physical distance between the two loci is so short that crossover events become extremely rare. Therefore, the alleles at the two loci are physically attached to the same chromatid and will nearly always segregate together into the same gamete. In this case, no recombinants will be present following meiosis, and the recombination frequency will be 0.00. This is complete (or absolute) linkage and is rare, as the loci must be so close together that crossovers are virtually impossible to detect.
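The recombination frequency formula and the Table 1 guidelines can be sketched as a short Python example. The function names and the exact cut-offs are assumptions for illustration: the 0.30 boundary comes from Table 1, and the 0.40 boundary is one reading of the ~0.40 to ~0.60 range quoted for unlinked genes. As the text stresses, these are guidelines, not hard rules.

```python
def recombination_frequency(n_recombinant, n_parental):
    """RF = recombinant gametes / (recombinant + parental gametes)."""
    total = n_recombinant + n_parental
    if total == 0:
        raise ValueError("at least one gamete must be counted")
    return n_recombinant / total


def describe_linkage(rf):
    """Rough linkage category for an RF value (thresholds are guidelines).

    Hypothetical helper: the cut-offs are an interpretation of Table 1,
    not a rule stated by the text.
    """
    if rf == 0:
        return "complete linkage"
    if rf < 0.30:
        return "partial linkage"
    if rf < 0.40:
        return "ambiguous: partial linkage or unlinked"
    return "consistent with unlinked"


# Example counts: 30 recombinant and 90 parental gametes.
rf = recombination_frequency(30, 90)
print(f"RF = {rf:.2f} ({describe_linkage(rf)})")
```

An RF noticeably above 0.50 is not given its own category here; as noted above, such a value usually means the parental and recombinant classes were assigned the wrong way round.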


Figure 6.
If two loci are completely linked, their alleles will segregate in combinations identical to those present in the parental gametes (Ab, aB). No recombinants will be observed.
(Original-Deyholos-CC BY-NC 3.0)

Figure 7.
A crossover between two linked loci can generate recombinant genotypes (AB, ab), from the chromatids involved in the crossover. Remember that multiple, independent meioses occur in each organism, so this particular pattern of recombination will not be observed among all the meioses from this individual.
(Original-Deyholos-CC BY-NC 3.0)

3.3. PARTIAL LINKAGE
It is also possible to obtain recombination frequencies between 0% and 50%, which is a situation we call incomplete (or partial) linkage. Incomplete linkage occurs when two loci are located on the same chromosome but the loci are far enough apart so that crossovers occur between them during some, but not all, meioses (Figure 7). Genes that are on the same chromosome are said to be syntenic regardless of whether they are completely or incompletely linked or unlinked. Thus, all linked genes are syntenic, but not all syntenic genes are linked.

Because the location of crossovers is essentially random for any given base pair of the chromosome, the greater the distance between two loci, the more likely a crossover will occur between them. Furthermore, loci that are on the same chromosome, but are sufficiently separated from each other, will on average have multiple crossovers between them and they will behave indistinguishably from physically unlinked loci. A recombination frequency of 50% is therefore the maximum recombination frequency that can be observed, and is indicative of loci that are either on separate chromosomes, or are sufficiently separated on the same chromosome.

4. EXPERIMENTALLY DETERMINING RECOMBINATION FREQUENCY
Let us now consider a complete experiment in which our objective is to measure recombination frequency (Figure 8). We need at least two alleles for each of two genes, and we must know which combinations of alleles were present in the parental gametes. The simplest way to do this is to start with pure-breeding lines that have contrasting


alleles at two loci. For example, we could cross short-tailed (aa), brown mice (BB) with long-tailed (AA), white mice (bb). Thus, (aaBB) are short-tailed and brown, while (AAbb) are long-tailed and white (Figure 8 P cross). Based on the genotypes of the parents, we know that the parental gametes will be aB or Ab (but not ab or AB), and all of the progeny will be dihybrids, AaBb. We do not know at this point whether the two loci are on different chromosomes, or whether they are on the same chromosome, and if so, how close together they are.

Figure 8.
An experiment to measure recombination frequency between two loci. The loci affect coat color (B/b) and tail length (A/a).
(Wikipedia-Modified Deyholos-CC BY-NC 3.0)

The recombination events that may be detected will occur during meiosis in the dihybrid individual. If the loci are completely or partially linked, then prior to meiosis, alleles aB will be located on one chromosome, and alleles Ab will be on the other chromosome. These are the parental combinations, based on our knowledge of the genotypes of the gametes that produced the dihybrid. Thus, recombinant gametes produced by the dihybrid will have the genotypes ab or AB.

Now that we have identified the parental and recombinant gametes, how do we determine the genotype of the gametes produced by the dihybrid individual? The most practical method is to use a testcross (Figure 8 F1 to tester), in other words to mate AaBb to an individual that has only recessive alleles at both loci (aabb). This will give a different phenotype in the second generation for each of the four possible combinations of alleles in the gametes of the dihybrid (Figure 9).

Dihybrid gamete             AB             Ab             aB              ab
Progeny (tester gamete ab)  AaBb           Aabb           aaBb            aabb
Phenotype                   Long, Brown    Long, White    Short, Brown    Short, White
Recombinant or parental     R              P              P               R

Figure 9.
Punnett Square of example test cross. Homozygous recessive tester can only produce one gamete type so only one is listed. Phenotypes are listed below. Using the phenotypes and what we know of the parents, we can identify which phenotypes came from recombinant or parental gametes.
(Original-L. Canham- CC BY-NC 3.0)

We can then infer unambiguously the genotype of the gametes produced by the dihybrid individual, and therefore calculate the recombination frequency between these two loci. For example, if only two phenotypic classes were observed in the F2 (i.e. short tails and brown fur (aaBb), and white fur with long tails (Aabb)) we would know that the only gametes produced following meiosis of the dihybrid individual were of the parental type: aB and Ab, and the recombination frequency would therefore be 0%. Alternatively, we may observe multiple classes of phenotypes in the F2 in ratios such as shown in Table 2. Given the data in Table 2, the calculation of recombination frequency is straightforward:

$$RF = \frac{\#\ \text{recombinant offspring}}{\text{Total offspring}} = \frac{13 + 17}{48 + 42 + 13 + 17} = 0.25$$

Because the recombination frequency is below 0.30, we can say that the tail length gene and the fur colour gene are partially linked.


Note: The use of linkage and recombination frequency will be extended to Genetic Mapping in the next chapter.

tail        fur         number of   gamete from   genotype of F2    (P)arental or
phenotype   phenotype   progeny     dihybrid      from test cross   (R)ecombinant
short       brown       48          aB            aaBb              P
long        white       42          Ab            Aabb              P
short       white       13          ab            aabb              R
long        brown       17          AB            AaBb              R

Table 2.
An example of quantitative data that may be observed in a genetic mapping experiment involving two loci. The data correspond
to the F2 generation in the cross shown in Figure 8.
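The analysis of Table 2 can be followed step by step in a short Python sketch. The counts and the P cross (aaBB x AAbb) are taken from the text; the helper name `gamete_from_phenotype` and the data layout are assumptions made for this illustration.

```python
# Testcross progeny from Table 2: (tail phenotype, fur phenotype) -> count.
progeny = {
    ("short", "brown"): 48,
    ("long", "white"): 42,
    ("short", "white"): 13,
    ("long", "brown"): 17,
}

# The P cross was aaBB x AAbb, so the dihybrid received aB and Ab gametes.
parental_gametes = {"aB", "Ab"}


def gamete_from_phenotype(tail, fur):
    """Infer the gamete contributed by the dihybrid parent.

    The tester (aabb) contributes only recessive alleles, so each dominant
    phenotype in the progeny reveals a dominant allele from the dihybrid.
    Hypothetical helper, for illustration only.
    """
    tail_allele = "A" if tail == "long" else "a"
    fur_allele = "B" if fur == "brown" else "b"
    return tail_allele + fur_allele


recombinant = 0
total = 0
for (tail, fur), count in progeny.items():
    gamete = gamete_from_phenotype(tail, fur)
    total += count
    if gamete not in parental_gametes:
        recombinant += count

rf = recombinant / total
print(f"recombinant = {recombinant}, total = {total}, RF = {rf:.2f}")
```

The sketch reproduces the value of 0.25 calculated in the text from the same counts.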


___________________________________________________________________________
SUMMARY:
• Recombination is defined as any process that results in gametes with combinations of alleles that were
not present in the gametes of a previous generation.
• The recombination frequency between any two loci depends on their relative chromosomal locations.
• Unlinked loci show a maximum 50% recombination frequency.
• Loci that are close together on a chromosome are linked and tend to segregate with the same
combinations of alleles that were present in their parents.
• Crossovers are a normal part of most meioses, and allow for recombination between linked loci.
• Measuring recombination frequency is easiest when starting with pure-breeding lines with two alleles
for each locus, and with suitable lines for test crossing.
KEY TERMS:
linkage unlinked
Second Law of Independent Assortment synapsis
gene recombinant genotype (and gametes)
locus parental genotype (and gametes)
allele coupling (cis) configuration
genotype repulsion (trans) configuration
phenotype recombination frequency (RF)
recombination complete (absolute) linkage
interchromosomal recombination incomplete (partial) linkage
independent assortment syntenic
intrachromosomal recombination
crossover


STUDY QUESTIONS:
1) Compare the terms “recombination” and “crossover”. How are they similar? How are they different?
2) Explain why it is usually necessary to start with pure-breeding lines when measuring genetic linkage by the methods presented in this chapter.
3) Suppose you knew that in a population, a trait (allele at a locus) that dominantly affected earlobe shape was tightly linked to a trait that dominantly affected susceptibility to cardiovascular disease in humans. Under what circumstances would this information be clinically useful?
4) In a previous chapter, we said a 9:3:3:1 phenotypic ratio was expected among the progeny of a dihybrid cross, in absence of gene interaction.
a) What does this ratio assume about the linkage between the two loci in the dihybrid cross?
b) What ratio would be expected if the loci were completely linked? Be sure to consider every possible configuration of alleles in the dihybrids.
5) Given a dihybrid with the genotype CcEe:
a) If the alleles are in coupling (cis) configuration, what will be the genotypes of the parental and recombinant progeny from a test cross?
b) If the alleles are in repulsion (trans) configuration, what will be the genotypes of the parental and recombinant progeny from a test cross?
6) In this question the white flowers (w) are recessive to purple flowers (W), and yellow seeds (y) are recessive to green seeds (Y). Suppose a green-seeded, purple-flowered dihybrid is testcrossed, and half of the progeny have yellow seeds.
a) What can you conclude about linkage between these loci?
b) What do you need to know about the progeny in this case?
7) If the progeny of the cross aaBB x AAbb is testcrossed, and the following genotypes are observed among the progeny of the testcross, what is the frequency of recombination between these loci?
AaBb 135
Aabb 430
aaBb 390
aabb 120
8) What is meant by the sentence “All linked genes are syntenic, but not all syntenic genes are linked.”?


CHAPTER 19 – RECOMBINATION MAPPING OF GENE LOCI

Figure 1.
Thomas Hunt Morgan and his undergraduate student Alfred Henry Sturtevant used fruit fly mutations like the ones in this figure to create the first recombination map.
Eye colors (clockwise): brown, cinnabar, sepia, vermilion, white, wild type. Also, the white-eyed fly has a yellow body, the sepia-eyed fly has a black body, and the brown-eyed fly has an ebony body.
(Wikimedia-Ktbn-Public Domain)

INTRODUCTION
In previous chapters the relative location of two loci has been examined. We have used the frequency of recombinants vs parentals to determine the recombination frequency (RF). Two loci could show independent assortment (unlinked, RF~50%) or be linked (RF<~35%). If linked, the two must be located on the same chromosome (syntenic), but if unlinked they could be far apart on the same chromosome or on different chromosomes (non-syntenic). In this chapter we will learn how to construct genetic maps using 3-point crosses.

1. GENETIC MAPPING
A genetic map (or recombination map) is a representation of the linear order of genes (or loci), and their relative distances determined by crossover frequency, along a chromosome. The fact that such linear maps can be constructed supports the concept of genes being arranged in a fixed, linear order along a single duplex of DNA for each chromosome. We can use recombination frequencies to produce genetic maps of all the loci along each chromosome and ultimately in the whole genome.

1.1. CALCULATING MAP DISTANCE
The units of genetic distance are called map units (mu) or centiMorgans (cM), named in honor of Thomas Hunt Morgan by his undergraduate student, Alfred Sturtevant, who developed the concept of genetic maps. Geneticists routinely convert the recombination frequencies of two loci directly into cM. Thus, the recombination frequency in percent is approximately the same as the map distance in cM. For example, if two loci have a recombination frequency of 25% they are said to be ~25cM apart on a chromosome (Figure 2).

Figure 2.
Two genetic maps consistent with a recombination frequency of 25% between A and B. Note the location of the centromere.
(Original-Deyholos-CC BY-NC 3.0)


Note, however, this approximation works well only
for small distances (RF<30%) but progressively fails
at longer distances. This is because as the two loci
get farther apart the RF reaches a maximum at 50%,
like it would for two loci assorting independently
(not linked). In fact, some chromosomes are >100
cM long but such loci at the tips only have an RF of
50%. Calculating the map distance of the whole
chromosome (end-to-end) of over 50cM comes
from mapping of multiple loci dispersed along the
chromosome, each with a value of less than 50%,
with their total adding up to the value over 50cM
(e.g. >100cM as above). The method for mapping of
these long chromosomes is described next.
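The way RF levels off at 50% while map distance keeps growing can be illustrated numerically. The text does not name a particular formula; the sketch below uses Haldane's mapping function, a standard choice that assumes crossovers occur independently of one another. That choice, and the function names, are assumptions made for this example only.

```python
import math


def haldane_rf(map_distance_cm):
    """Expected RF (between 0 and 0.5) for a map distance given in cM.

    Uses Haldane's mapping function, which assumes crossovers are
    independent (an assumption of this sketch, not a claim of the text).
    """
    distance_morgans = map_distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * distance_morgans))


def haldane_distance_cm(rf):
    """Inverse of haldane_rf: map distance in cM for an observed RF below 0.5."""
    if not 0.0 <= rf < 0.5:
        raise ValueError("RF must be at least 0 and below 0.5")
    return -50.0 * math.log(1.0 - 2.0 * rf)


for distance in (1, 5, 10, 25, 50, 100, 200):
    rf = haldane_rf(distance)
    print(f"{distance:>4} cM -> expected RF = {100 * rf:.1f}%")
```

For short distances the expected RF in percent stays close to the distance in cM, whereas for distances of 100 cM or more it approaches, but never exceeds, 50%.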
Note that the map distance of two loci alone does not tell us anything about the orientation of these loci relative to other features, such as centromeres or telomeres, on the chromosome.

1.2. MAP DISTANCE OVER LONG CHROMOSOMES
Map distances are always calculated for one pair of loci at a time. However, by combining the results of multiple pairwise calculations, a genetic map of many loci on a chromosome can be produced (Figure 3). A genetic map shows the map distance, in cM, that separates any two loci, and the position of these loci relative to all other mapped loci. The genetic map distance is roughly proportional to the physical distance, i.e. the amount of DNA between two loci. For example, in Arabidopsis, 1.0 cM corresponds to approximately 150,000bp and contains approximately 50 genes. The exact number of DNA base pairs in a cM depends on the organism, and on the position in the chromosome. Some parts of chromosomes (“crossover hot spots”) have higher rates of recombination than others, while other regions have reduced crossing over and often correspond to large regions of heterochromatin.

Figure 3.
Genetic maps for regions of two chromosomes from two species of the moth, Bombyx. The scale at left shows distance in cM, and the position of various loci is indicated on each chromosome. Diagonal lines connecting loci on different chromosomes show the position of corresponding loci in different species. This is referred to as regions of conserved synteny.
(NCBI-NIH-PD)

When a novel gene or locus is identified by mutation or polymorphism, crossing it with previously mapped genes, and then calculating the recombination frequency can determine its approximate position on a chromosome. If the novel gene shows complete or partial linkage with a previously mapped locus, the recombination frequency will indicate the approximate position of the novel gene within the genetic map. This information is useful in isolating (i.e. cloning) the specific fragment of DNA that encodes the novel gene. This process is called map-based cloning.

Genetic maps are also useful for (1) tracking genes/alleles when breeding crops and animals, (2) studying evolutionary relationships between species, and (3) determining the causes of, and individual susceptibility to, some human diseases.

1.3. GENETIC MAPS ARE AN APPROXIMATION
Genetic maps are useful for showing the order of loci along a chromosome, but the distances are only a relative approximation. The correlation between recombination frequency and actual chromosomal distance is more accurate for short distances (low RF values) than long distances. Observed recombination frequencies between two relatively distant markers tend to underestimate the actual


number of crossovers that occurred. This is because as the distance between loci increases, so does the possibility of a second (or third, or more) crossover occurring between the loci. This is a problem for geneticists, because with respect to the loci being studied, these double-crossovers produce gametes with the same genotypes as if no recombination events had occurred (Figure 4), so they have parental genotypes. Thus, a double crossover will appear to be a parental type and not be counted as a recombinant, despite having two (or more) crossovers. Geneticists will sometimes use specific mathematical formulae to adjust large recombination frequencies to account for the possibility of multiple crossovers and thus get a better estimate of the actual distance between two loci.

Figure 4.
A double crossover between two loci will produce gametes with parental genotypes, even though TWO crossovers have occurred between the loci.
(Original-Deyholos/Canham-CC BY-NC 3.0)

2. MAPPING WITH THREE-POINT CROSSES
A genetic map consists of multiple loci distributed along a chromosome. A particularly efficient method of mapping three genes at once is the three-point cross, which allows the order and distance between three potentially linked genes to be determined in a single cross experiment (Figure 5).

This is particularly useful when mapping a new mutation for which the location is unknown relative to two previously mapped loci with known locations. The basic strategy is the same as for the dihybrid mapping experiment described previously, except pure breeding lines with contrasting genotypes are crossed to produce an individual heterozygous at three loci (a trihybrid), which is then testcrossed to a tester, which is homozygous recessive for all three genes, to determine the recombination frequency between each pair of genes, among the three loci. A Punnett square can be used to predict all the possible outcomes of the test cross (Figure 6). The progeny produced from the testcross are shown in Table 1.

Figure 5.
A three point cross for loci affecting tail length, fur color, and whisker length.
(Original-Modified Deyholos/Locke-CC BY-NC 3.0)

When the trihybrid is crossed to a tester, it should be able to make eight different gametes, giving eight possible phenotype combinations in the offspring. The next step is to identify whether each gamete is recombinant or parental. This can be done by comparing only two loci at a time to the parental gametes. In this example, the parents of the trihybrid are a/a B/B c/c, and A/A b/b C/C, so the parental gametes would be aBc and AbC respectively. Now by comparing two loci at once you can determine if, between the two, they are recombinant or parental. For example, the offspring in the first row in Table 1 came from gamete aBC.


Trihybrid   Progeny genotype       Tail     Fur      Whisker
gamete      (tester gamete abc)    length   color    length
aBC         aaBbCc                 Short    Brown    Long
AbC (P2)    AabbCc                 Long     White    Long
abC         aabbCc                 Short    White    Long
ABC         AaBbCc                 Long     Brown    Long
aBc (P1)    aaBbcc                 Short    Brown    Short
Abc         Aabbcc                 Long     White    Short
abc         aabbcc                 Short    White    Short
ABc         AaBbcc                 Long     Brown    Short

Figure 6.
Punnett square of the test cross for Figure 5, showing the predicted gametes possible from this cross, and the resulting phenotypes.
(Original-L. Canham-CC BY-NC 3.0)

tail        fur         whisker     number of   gamete from   genotype of F2    loci   loci   loci
phenotype   phenotype   phenotype   progeny     trihybrid     from test cross   A, B   A, C   B, C
                                    (n=120)
short       brown       long        5           aBC           aaBbCc            P      R      R
long        white       long        38          AbC (P2)      AabbCc            P      P      P
short       white       long        1           abC           aabbCc            R      R      P
long        brown       long        16          ABC           AaBbCc            R      P      R
short       brown       short       42          aBc (P1)      aaBbcc            P      P      P
long        white       short       5           Abc           Aabbcc            P      R      R
short       white       short       12          abc           aabbcc            R      P      R
long        brown       short       1           ABc           AaBbcc            R      R      P

Table 1.
An example of data that might be obtained from the F2 generation of the three-point cross shown in Figure 5. The rarest phenotypic classes correspond to double recombinant gametes ABc and abC. Each phenotypic class and corresponding gamete can also be classified as parental (P) or recombinant (R) with respect to how each pair of loci (A,B), (A,C), (B,C) are arranged on the chromosome.
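The pairwise classification used in Table 1 can be sketched in Python. The counts and parental gametes (aBc and AbC) come from the table; the variable and function names are assumptions for illustration, and treating the two rarest classes as the double recombinants follows the table caption rather than any general rule.

```python
from itertools import combinations

# Table 1 counts, keyed by the trihybrid gamete written in the order A, B, C.
counts = {
    "aBC": 5, "AbC": 38, "abC": 1, "ABC": 16,
    "aBc": 42, "Abc": 5, "abc": 12, "ABc": 1,
}
parentals = ("aBc", "AbC")
loci = "ABC"
total = sum(counts.values())


def pair_is_parental(gamete, i, j):
    """True if the alleles at loci i and j match one parental gamete."""
    return any(gamete[i] == p[i] and gamete[j] == p[j] for p in parentals)


# Count recombinant progeny for each pair of loci (A,B), (A,C), (B,C).
pairwise_recomb = {}
for i, j in combinations(range(len(loci)), 2):
    pairwise_recomb[loci[i] + loci[j]] = sum(
        n for gamete, n in counts.items() if not pair_is_parental(gamete, i, j)
    )

for pair, n in pairwise_recomb.items():
    print(f"loci {pair}: {n}/{total} = {100 * n / total:.1f}%")

# The pair with the largest RF lies at the ends; the other locus is in the middle.
outer = max(pairwise_recomb, key=pairwise_recomb.get)
middle = next(locus for locus in loci if locus not in outer)
order = outer[0] + middle + outer[1]

# Assumption from the Table 1 caption: the two rarest classes are the double
# recombinants. Each one hides two crossovers between the outer loci.
double_classes = sorted(counts, key=counts.get)[:2]
doubles = sum(counts[g] for g in double_classes)
corrected_outer_rf = (pairwise_recomb[outer] + 2 * doubles) / total

print(f"gene order: {order}")
print(f"corrected RF for {outer}: {100 * corrected_outer_rf:.1f}%")
```

With the double recombinants counted twice, the distance between the outer loci equals the sum of the two shorter intervals, which is the consistency check described in the text.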

Comparing loci A and B, we see that it matches one of the parental gametes (aB) and therefore it is parental. Comparing A and C (aC), we see that it matches neither parental gamete, so it is recombinant. The same can be said for B and C (BC).

Once the classes of progeny have been identified as parental or recombinant for each pair of loci, recombination frequencies may be calculated for each pair of loci individually, as we did before for one pair of loci in our dihybrid cross (Chapter 18).

loci A,B:
$$RF = \frac{1 + 16 + 12 + 1}{120} = 25\%$$

loci A,C:
$$RF = \frac{1 + 5 + 1 + 5}{120} = 10\%$$

loci B,C (not corrected for double crossovers):
$$RF = \frac{5 + 16 + 12 + 5}{120} \approx 32\%$$

We can then use these numbers to build the map, placing the loci with the largest RF on the ends.

However, note that in the three-point cross, the sum of the distances between A-B and A-C (35%) is greater than the distance calculated for B-C (32%). This is because of double crossovers between B and C, which were undetected when we considered only pairwise data for B and C. We can easily account for some of these double crossovers, and include them in calculating the map distance between B and C, as follows (Figure 7).


Figure 7.
Two possible maps based on the data in Table 1 (without correction for double crossovers).
(Original-Deyholos-CC BY-NC 3.0)

We already deduced that the map order must be BAC (or CAB). However, the double recombinants, ABc and abC, were not included in our calculations of recombination frequency between loci B and C. If we included these double recombinant classes (multiplied by 2, since they each represent two recombination events), the calculation of recombination frequency between B and C is as follows, and the result is now more consistent with the sum of map distances between A-B and A-C.

loci B,C (corrected for double crossovers):
$$RF = \frac{5 + 16 + 12 + 5 + 2(1) + 2(1)}{120} = 35\%$$

Thus, the three-point cross was useful for:
(1) determining the order of three loci relative to each other,
(2) calculating map distances between the loci, and
(3) detecting some of the double crossover events that would otherwise lead to an underestimation of map distance.

However, it is possible that other double crossover events remain undetected, for example double crossovers between loci A&B or between loci A&C. Geneticists have developed a variety of mathematical procedures to try to correct for such double crossovers during large-scale mapping experiments.

As more and more genes are mapped a better genetic map can be constructed. Then, when a new gene is discovered, it can be mapped relative to other genes of known location to determine its location. All that is needed to map a gene is two alleles, a wild type allele and a mutant allele.

3. ANALYSIS OF RECOMBINATION FREQUENCIES IN A THREE POINT TEST CROSS
Now that we know what the map looks like, the frequency of each offspring type can be explained.

Parental gametes (AbC and aBc) are the result of no crossovers, or double crossovers between two alleles. Because we know all three loci are linked, it is expected for this frequency to be relatively high, much like what we see in the example above.

There are recombinant gametes that are the result of one crossover between two alleles (aBC, Abc, ABC and abc). Single crossover events are more common than double crossovers, but are more likely to happen between loci B and A, because they are 25 cM apart and so are farther apart than A and C, which are only 10 cM apart. So, we expect to see more recombinant gametes from the former.

And lastly there are recombinant gametes that are a result of double crossover events (ABc and abC). Double crossovers between three linked genes like this are rare, so we don’t expect to see many offspring from these recombinant gametes.

The frequencies we see from this cross agree with our expectations. Figure 9 shows a diagram of the crossover events that produced each recombinant gamete and the number of offspring seen with that gamete type.

In the example given above, all the genes present are linked, with one pair more strongly linked than the other (A and C have stronger linkage than A and B). When choosing three genes to map, this will not always be the case. Sometimes you will have all genes linked, sometimes you may have two genes linked and one gene unlinked, and sometimes they all may be unlinked (Figure 8). Much like what we did above, by comparing the ratios of offspring you should be able to predict if the genes in the trihybrid are linked or not.


Figure 8.
Examples of how three genes can be associated with each other, based on whether all three are unlinked, all three are linked or two linked and one unlinked. Panels: All Unlinked; Two Linked and One Unlinked; All Three Linked.
(Original-J. Locke/L. Canham-CC BY-NC 3.0)

If all three genes are unlinked, then we expect independent assortment and an equal number of all progeny types. As in the example, if all are linked, you expect there to be many parental genotypes and some recombinant genotypes that are the result of a single recombination event. Recombinant genotypes that are a result of two recombination events will be rare. The actual numbers of each will differ depending on whether all the linked genes are equal distances from each other, or if one pair is more tightly linked than the other.

In the case where two genes are linked and one gene is unlinked the following applies: as in the example before we will use the same parental gametes (AbC and aBc), but will assume the genes A and C are linked and B is unlinked. In this case, because linkage causes a higher prevalence of parental gametes, we expect there to be more parental organizations of A and C, and fewer recombinant organizations of A and C. The presence or absence of parental B is not important here, because it is unlinked and will assort independently.

This information is summarized in Table 2. You can use this to look at the offspring of a trihybrid test cross and predict the linkage.

gamete   # of progeny   gamete   # of progeny
aBC      5              Abc      5
ABC      16             abc      12
ABc      1              abC      1

Figure 9.
Diagram of the crossover events that create the different recombinant gametes from the cross in Figure 5. The parental alleles are seen on the black chromosomes. The coloured lines show where the crossover event took place and underline the alleles for that recombinant gamete. Below each diagram is the recombinant gamete and the number of progeny seen in that cross per Table 1.
(Original-L. Canham-CC BY-NC 3.0)

4. WHERE DO CROSSOVERS OCCUR ON A CHROMOSOME?

4.1. GENERAL INFORMATION
A crossover involves the reciprocal exchange between non-sister chromatids when synapsed at prophase I of meiosis. While this exchange can theoretically occur anywhere along the synapsed homologs, observations show us that some regions along a chromosome have higher rates of crossing


over, while others are lower. In addition, the frequency of crossing over varies from species to species, and even from male to female within a species. For example, in Drosophila melanogaster there is no crossing over in males.

From Drosophila recombination data, we know that the likelihood of a crossover is greatest in the middle of a chromosome arm and lower at the telomere and centromere regions (Figure 7). This distribution would be expected if one of the functions of a crossover event were to hold the two synapsed chromosomes together so that they segregate correctly in meiosis I.

4.2. RESOLUTION OF GENETIC MAPS
The resolution of genetic maps depends on two factors: (1) the number of marker loci and (2) the number of progeny.

For species with a high number of marker loci (those which have a phenotype that permits the alleles to be distinguished), more locations can be plotted, resulting in a higher resolution map compared to species with fewer markers.

For species with a greater number of progeny, a better map is possible. The ability to score recombinants among hundreds or thousands of progeny means that one can identify rare or very rare recombinants and thus map loci that are very close together. For example, with the mapping of bacteriophage, it is possible to map mutations down to the level of single base pairs using certain selectable marker systems.

Because of these two factors, the genetic maps of simple prokaryote genomes are more refined than those of the larger and more complex eukaryote genomes.

These days, most laboratory species have had their genomes sequenced. This knowledge provides another means to locate the specific gene(s) responsible for a desired trait(s).

Table 2.
Progeny ratios seen after a trihybrid test cross depending on whether they are all linked, only two are linked or if all are unlinked. This table is based upon the cross done in Figure 5 as an example.

Gametes   Parental (P) or Recombinant (R)   Unlinked   A and C linked   All linked
AbC       P                                 1          more             many
aBc       P                                 1          more             many
ABC       R                                 1          more             some
abc       R                                 1          more             some
aBC       R                                 1          less             some
Abc       R                                 1          less             some
abC       R                                 1          less             rare
ABc       R                                 1          less             rare

Figure 10.
Diagram of the frequency of crossing over along a
chromosome (bottom). The Y-axis shows the relative
rate of crossing over. The two peaks are present in the
middle of each chromosome arm, while the telomeres
and centromeres have lower frequencies of exchanges.
(Original-J. Locke-CC BY-NC 3.0)
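
The link between progeny number and map resolution described in section 4.2 can be illustrated with a little arithmetic. The sketch below is only an illustration, not a method from this chapter: it assumes that 1 cM corresponds to 1% recombinant progeny (reasonable only over short distances) and that each progeny is an independent trial. The function names are made up for this example.

```python
def expected_recombinants(map_distance_cm, n_progeny):
    """Expected number of recombinant progeny for two loci map_distance_cm apart.

    Assumption: 1 cM = 1% recombinant progeny (short distances only).
    """
    return n_progeny * map_distance_cm / 100.0


def chance_of_any_recombinant(map_distance_cm, n_progeny):
    """Probability of scoring at least one recombinant among n_progeny.

    Assumption: each progeny is an independent trial.
    """
    r = map_distance_cm / 100.0
    return 1.0 - (1.0 - r) ** n_progeny


def smallest_nonzero_distance(n_progeny):
    """Map distance (cM) implied by a single recombinant among n_progeny."""
    return 100.0 / n_progeny


if __name__ == "__main__":
    # Two loci 0.1 cM apart: how likely is it that a recombinant is seen at all?
    for n in (100, 1000, 100000):
        print(n, smallest_nonzero_distance(n),
              round(chance_of_any_recombinant(0.1, n), 3))
```

With only 100 progeny, two loci 0.1 cM apart will usually show no recombinants at all, so they cannot be separated on the map. With many thousands of progeny, recombinants between them are almost certain to be recovered.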


___________________________________________________________________________
SUMMARY:
• A genetic map (or recombination map) is a representation of the linear order of genes (or loci), and their
relative distances determined by crossover frequency, along a chromosome.
• Recombination frequency is usually proportional to the distance between loci, so recombination
frequencies can be used to create genetic maps.
• Recombination frequencies tend to underestimate map distances, especially over long distances, since
double crossovers may be indistinguishable from non-recombinants.
• Three-point crosses can determine the order and map distance among three loci.
• In three-point crosses, a correction for the distance of the outside markers can be made to account for
double crossovers between the two outer loci.
• Crossovers are not equally frequent all along a chromosome. They are more frequent in some regions and less frequent in others.
• The resolution of genetic maps depends on the number of markers and the number of progeny.
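
As a worked illustration of the points above, the following sketch computes map distances from three-point testcross counts and applies the double-crossover correction to the outer markers. The progeny counts are invented for this example and are not taken from any cross in this chapter. The function names are likewise hypothetical.

```python
def recombination_frequency(recombinants, total):
    """Recombination frequency as a percentage (read here as map units, cM)."""
    if total <= 0:
        raise ValueError("total progeny must be positive")
    return 100.0 * recombinants / total


def three_point_map(parental, single_left, single_right, double):
    """Map distances for loci in the order left - middle - right.

    single_left  : progeny with a crossover only between left and middle loci
    single_right : progeny with a crossover only between middle and right loci
    double       : progeny with a crossover in both intervals
    """
    total = parental + single_left + single_right + double
    left = recombination_frequency(single_left + double, total)
    right = recombination_frequency(single_right + double, total)
    # Double crossovers look parental for the two outer markers, so they are
    # missed unless the middle marker is scored.
    outer_uncorrected = recombination_frequency(single_left + single_right, total)
    # Each double crossover represents two exchanges between the outer loci.
    outer_corrected = recombination_frequency(
        single_left + single_right + 2 * double, total)
    return {
        "left": left,
        "right": right,
        "outer_uncorrected": outer_uncorrected,
        "outer_corrected": outer_corrected,
    }


if __name__ == "__main__":
    # Hypothetical counts: 800 parental, 100 + 80 single, 20 double crossovers.
    print(three_point_map(800, 100, 80, 20))
```

With these made-up counts, the two intervals measure 12 and 10 map units. The outer markers appear only 18 map units apart until the double crossovers are counted twice, which gives 22 map units, the sum of the two intervals.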
KEY TERMS:
recombinants map units (mu)
parentals centimorgans (cM)
independent assortment Thomas Hunt Morgan
unlinked Alfred Sturtevant
linked map-based cloning
syntenic conserved synteny
non-syntenic double-crossover
genetic map three-point cross


STUDY QUESTIONS:
1) In corn (i.e. maize, a diploid species), imagine that alleles for resistance to a particular pathogen are recessive and are linked to a locus that affects tassel length (short tassels are recessive to long tassels). Design a series of crosses to determine the map distance between these two loci. You can start with any genotypes you want, but be sure to specify the phenotypes of individuals at each stage of the process and specify which progeny will be considered recombinant. You do not need to calculate recombination frequency.
2) In a mutant screen in Drosophila, you identified a gene related to memory, as evidenced by the inability of recessive homozygotes to learn to associate a particular scent with the availability of food. Given another line of flies with an autosomal mutation that produces orange eyes, design a series of crosses to determine the map distance between these two loci and specify which progeny will be considered recombinant. You do not need to calculate recombination frequency.
3) Imagine that methionine heterotrophy, chlorosis (loss of chlorophyll), and absence of leaf hairs (trichomes) are each caused by recessive mutations at three different loci in Arabidopsis. Given a triple mutant, and assuming the loci are on the same chromosome, explain how you would determine the order of the loci relative to each other.
4) Three loci are linked in the order B-C-A. If the A-B map distance is 1 cM, and the B-C map distance is 0.6 cM, given the lines AaBbCc and aabbcc, what will be the frequency of Aabb genotypes among their progeny if one of the parents of the dihybrid had the genotypes AABBCC?
5) Genes for body color (B black dominant to b yellow) and wing shape (C straight dominant to c curved) are located on the same chromosome in flies. If single mutants for each of these traits are crossed (i.e. a yellow fly crossed to a curved-wing fly), and their progeny is testcrossed, the following phenotypic ratios are observed among their progeny.

   black, straight     17
   yellow, curved      12
   black, curved      337
   yellow, straight   364

   a) Calculate the map distance between B and C.
   b) Why are the frequencies of the two smallest classes not exactly the same?
6) Given the map distance you calculated between B-C in question 5, if you crossed a double mutant (i.e. yellow body and curved wing) with a wild-type fly, and testcrossed the progeny, what phenotypes in what proportions would you expect to observe among the F2 generation?
7) Wild-type mice have brown fur and short tails. Loss of function of a particular gene produces white fur, while loss of function of another gene produces long tails, and loss of function at a third locus produces agitated behaviour. Each of these loss of function alleles is recessive. If a wild-type mouse is crossed with a triple mutant, and their F1 progeny is test-crossed, the following recombination frequencies are observed among their progeny. Produce a genetic map for these loci.

   Fur     Tail    Behaviour   Freq.
   white   short   normal       16
   brown   short   agitated      0
   brown   short   normal      955
   white   short   agitated     36
   white   long    normal        0
   brown   long    agitated     14
   brown   long    normal       46
   white   long    agitated    933


SEX CHROMOSOMES: SEX LINKAGE – CHAPTER 20

CHAPTER 20 – SEX CHROMOSOMES: SEX LINKAGE

Figure 1.
The E/e gene in turkeys is responsible for
bronze or brown feather colour, and is
located on the Z-chromosome.
(Flickr- stevevoght- CC BY-SA 2.0)

INTRODUCTION
Previously, Mendel, working with plants, showed patterns of inheritance derived from gene loci on autosomal chromosomes. One complication to this model of inheritance in animals is that loci present on sex chromosomes, called sex-linked loci, don’t follow this pattern. This chapter covers the patterns of inheritance for various sex-linked loci.

1. AUTOSOMES AND SEX CHROMOSOMES
In diploids, most chromosomes exist in pairs (same length, centromere location, and banding pattern) with one set coming from each parent. These chromosomes are called autosomes. However, many species have an additional pair of chromosomes that do not look alike. These are sex chromosomes because they differ between the sexes. In humans, males have one X and one Y chromosome while females have two X chromosomes. Autosomes are those chromosomes present in the same number in males and females, while sex chromosomes are those that are not. When sex chromosomes were first discovered their function was unknown and the name X was used to indicate this mystery. The next ones were named Y, then Z, and then W.

The combination of sex chromosomes within a species is associated with either male or female individuals. In mammals, fruit flies, and some dioecious plants, those with two X chromosomes are females while those with an X and a Y are males. In birds, moths, and butterflies, males are ZZ and females are ZW. Because sex chromosomes have arisen multiple times during evolution, the molecular mechanism(s) through which they determine sex differ among those organisms. For example, although humans and Drosophila both have X and Y sex chromosomes, they have different mechanisms for determining sex (see the next chapter).

How do the sex chromosomes behave during meiosis? In those individuals with two of the same chromosome (i.e. homogametic sexes: XX females and ZZ males) the chromosomes pair and segregate during meiosis I the same as autosomes do. During meiosis in XY males or ZW females (heterogametic sexes) the sex chromosomes pair with each other.

In mammals (XX, XY) the consequence of this is that all egg cells will carry an X chromosome, while the sperm cells will carry either an X or a Y chromosome. Half of the offspring will receive two


X chromosomes and become female while half will receive an X and a Y and become male (Figure 2). In species with ZZ males, all sperm carry a Z chromosome, while in ZW females half of the eggs will carry a Z and half a W.

Figure 2.
Meiosis in an XY mammal. The stages shown are anaphase I, anaphase II, and mature sperm. Note how half of the sperm contain Y chromosomes and half contain X chromosomes.
(Original-Harrington-CC BY-NC 3.0)

2. PSEUDO-AUTOSOMAL REGIONS ON THE X AND Y CHROMOSOMES
In evolution, before the X and Y chromosomes differentiated, they used to be equivalent homologs, like a pair of autosomes. Over time, the Y chromosome lost most of its genes (hence the reduced size), but the X chromosome retained all its genes. Thus, even though the Y chromosome has lost most of its genes, it still shares some regions with the X chromosome. This is the reason why, although X and Y chromosomes are heteromorphic (morphologically dissimilar), they are able to act as a homologous pair in meiosis and undergo crossover. These common regions contain similar genes, permit the X and Y to pair up, and are called the “pseudoautosomal regions”. The name comes from the observation that genes in these regions behave like autosomes in their inheritance. Alleles of the genes in this region cross over just like those on the autosomes. Thus, genes in this region are not inherited in a sex-linked pattern, even though they are located on the X chromosome.

Figure 3.
X and Y chromosomes have pseudoautosomal regions, which are capable of pairing and recombination during meiosis. (Original-Locke/Kang-CC BY-NC 3.0)

The genes found in the pseudo-autosomal regions are present in two copies in both XY males and XX females and are expressed from both active and inactive X chromosomes. These genes may explain clinical features in sex chromosome aneuploidy (addition or subtraction of a sex chromosome; e.g. XXY) as gene products may be either under or over expressed in relation to normal females and males.

One of the genes in this region is called SHOX. It makes a protein that promotes bone growth. 46,XX and 46,XY people have two functioning copies and have average height. People with 47,XYY and 47,XXX genomes have three copies and are taller than average. People with 45,X have one copy and are short. It is the single copy of SHOX and a few of the other genes in the pseudo-autosomal region that causes health problems for women with Turner syndrome.

3. SEX LINKAGE: AN EXCEPTION TO MENDEL’S FIRST LAW
Above we introduced sex chromosomes and autosomes (non-sex-linked chromosomes). For loci on autosomes, the alleles follow the classic Mendelian pattern of inheritance. However, loci on the sex chromosomes do not follow this pattern because most (not all) of the loci on the typical X-chromosome are absent from the Y-chromosome, even though they act as a homologous pair during meiosis. Instead, they will follow a sex-linked pattern of inheritance. An X-linked allele in the father is passed on to all of his daughters and none of his sons, but an X-linked allele in the mother is passed on to daughters and sons equally.


3.1. X-LINKED GENES: THE WHITE GENE IN DROSOPHILA MELANOGASTER
A well-studied sex-linked gene is the white gene on the X chromosome of Drosophila melanogaster. Normally flies have red eyes, but flies with a mutant allele of this gene called white (w-) have white eyes because the red pigments are absent. Because this mutation is recessive to the wild type w+ allele, females that are heterozygous have normal red eyes. Female flies that are homozygous for the mutant allele have white eyes. Because there is no white gene on the Y chromosome, male flies can only be hemizygous for the wild type allele or the mutant allele.

Figure 4.
Relationship between genotype and phenotype for the white gene, an X-linked gene in Drosophila melanogaster. The Y chromosome is indicated with a capital Y because it does not have a copy of the white gene.
(Original-Deyholos/Harrington/Locke-CC BY-NC 3.0)

A researcher may not know beforehand whether a novel mutation is sex-linked. The definitive method to test for sex-linkage is reciprocal crosses (Figure 5). This means to cross a male and a female that have different phenotypes, and then conduct a second set of crosses, in which the phenotypes are reversed relative to the sex of the parents in the first cross. For example, if you were to set up reciprocal crosses with flies from pure-breeding w+ and w- strains the results would be as shown in Figure 5. Whenever reciprocal crosses give different results in the F1 and F2 and whenever the male and female offspring have different phenotypes the usual explanation is sex-linkage. Remember, if the locus were autosomal the F1 and F2 progeny would be different from either of these crosses.

Figure 5.
Reciprocal crosses involving an X-linked gene in Drosophila melanogaster. In the first cross (left) all of the offspring have red eyes. In the second (reciprocal) cross (right) all of the female offspring have red eyes and the male offspring all have white eyes. If the F1 progeny are crossed (to make the P2), the F2 progeny will be different in each cross. The first cross has all red-eyed females and half red-eyed males. In the reciprocal cross, half of the males and half of the females have red eyes.
Thomas Morgan was awarded the Nobel Prize, in part, for using these crosses to demonstrate that genes (such as white) were on chromosomes (in this case the X-chromosome).
(Wikipedia- Deyholos/Harrington/Locke -PD)

A similar pattern of sex-linked inheritance is seen for X-chromosome loci in other species with an XX-XY sex chromosome system, including mammals and humans. The ZZ-ZW system is similar, but reversed (see below).

4. Y-LINKED GENE
Genes located on the Y-chromosome exhibit Y-linkage. For example, the TDF gene that is responsible for sex determination and the hairy ear rim phenotype show only a father-to-son inheritance pattern.
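
The reciprocal crosses described in section 3.1 can also be worked through mechanically. The following is a minimal sketch under simplifying assumptions: each parent passes on one of its two sex chromosomes with equal chance, w+ is completely dominant to w-, and the Y carries no white gene. The genotype encoding and function names are inventions for this example.

```python
from collections import Counter


def phenotype(genotype):
    """Return (sex, eye colour) for a pair of sex-chromosome alleles.

    A genotype is a tuple such as ("w+", "w-") for a female or
    ("w-", "Y") for a male; "Y" stands for a Y chromosome with no white gene.
    """
    sex = "male" if "Y" in genotype else "female"
    eye = "red" if "w+" in genotype else "white"
    return sex, eye


def cross(mother, father):
    """Count offspring phenotypes over all egg x sperm combinations."""
    offspring = [(egg, sperm) for egg in mother for sperm in father]
    return Counter(phenotype(child) for child in offspring)


if __name__ == "__main__":
    # First cross: red-eyed female x white-eyed male
    print(dict(cross(("w+", "w+"), ("w-", "Y"))))
    # Reciprocal cross: white-eyed female x red-eyed male
    print(dict(cross(("w-", "w-"), ("w+", "Y"))))
```

The two F1 results differ (all red-eyed versus red-eyed females with white-eyed males), which is the signature of sex linkage. Passing an F1 female and an F1 male genotype to the same function gives the F2 classes.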


5. Z-LINKED GENES IN BIRDS

One last example is a Z-linked gene that influences feather colour in turkeys. Turkeys are birds, which use the ZZ-ZW sex chromosome system. The E allele makes the feathers bronze and the e allele makes the feathers brown (Figure 6). Only male turkeys can be heterozygous for this locus, because they have two Z chromosomes. Heterozygous males are uniformly bronze because the E allele is completely dominant to the e allele and birds use a dosage compensation system similar to Drosophila and not mammals. Reciprocal crosses between turkeys from pure-breeding bronze and brown breeds would reveal that this gene is in fact Z-linked.

Figure 6.
Relationship between genotype and phenotype for a Z-linked gene in turkeys. The W chromosome does not have an E/e-gene so it is just indicated with a capital W.
(Original-Harrington/Locke-CC BY-NC 3.0)


___________________________________________________________________________
SUMMARY:
• Autosomes are present in the same number in both sexes, whereas the complement of sex chromosomes depends on the sex of the individual.
• Pseudo-autosomal regions are regions on the X and Y chromosomes that can pair up and recombine.
• Sex-linked genes are an exception to standard Mendelian inheritance. Their phenotypes are influenced by the type of sex chromosome system and the type of dosage compensation system found in the species.
• Examples of sex-linked genes include the white gene on the Drosophila X chromosome, the TDF gene on the Y chromosome, and the E/e gene on the turkey Z chromosome.
KEY TERMS:
autosome heteromorphic
sex chromosome sex-linked
homogametic X-linked genes
heterogametic reciprocal cross
pseudoautosomal regions Z-linked genes


STUDY QUESTIONS:
1) A rare dominant mutation causes a neurological
disease that appears late in life in all people
that carry the mutation. If a father has this
disease, what is the probability that his
daughter will also have the disease?
2) Make Punnett Squares to accompany the
crosses shown in Figure 5.
3) Draw reciprocal crosses that would show that
the turkey E-gene is on the Z-chromosome.


SEX CHROMOSOMES: SEX DETERMINATION – CHAPTER 21

CHAPTER 21 – SEX CHROMOSOMES: SEX DETERMINATION

Figure 1.
Not all species determine sex using the
same mechanism. There are many
factors that can determine a species’ sex
and one of them is growth temperature.
For alligators, sex is determined by the
temperature of the eggs in their nest.
(Flickr-Florida Fish and Wildlife-CC BY-
ND 2.0)

INTRODUCTION
In the previous Chapter, sex chromosomes were described and their inheritance was compared to that of the autosomes. The linkage of sex chromosomes to the sex of individuals was presumed. In this chapter we will cover the mechanisms of sex determination by chromosomes (genes) as well as other, environmental, mechanisms. In the diversity of animal life, sex is not always determined by genetics (sex chromosomes).

1. SEX DETERMINATION MECHANISMS IN ANIMALS
There are various mechanisms for sex determination in animals. These include sex chromosomes, chromosome dosage, and environmental cues.

1.1. SEX CHROMOSOME SYSTEMS:
a) XY system
Different combinations of the X and Y sex chromosomes can determine the sex of an organism. For example, in humans and other mammals XY embryos develop as males while XX embryos become females. This difference in development is due to the presence of only a single gene, the Sex-determining Region Y (SRY) gene, also known as Testis-Determining Factor (TDF) gene, on the Y-chromosome. Its presence in the genome and expression in gonad tissues dictates that the sex of that individual will be male. Its absence or lack of correct expression results in a female phenotype for that individual.

In mammals, the sex chromosomes evolved just after the divergence of the monotreme lineage (mammals that lay eggs) from the lineage that led to marsupial mammals (young are carried in a pouch) and placental mammals. Thus nearly every mammal species uses the same sex determination system. In this system, during embryogenesis, the gonads will develop into either ovaries or testes (Figure 2).

Figure 2.
Gonad differentiation is under the control of several genes including Testis-determining factor (TDF, SRY) at Yp11.3 (Y chromosome, p arm, region 1, band 1, sub-band 3).
(Original-Harrington/Kang-CC BY-NC 3.0)


A gene called SRY, present only on the Y chromosome, encodes a protein that directs the gonads to mature into testes. XX embryos do not have this gene and their gonads mature into ovaries instead, the default pathway (Figure 2).

SRY in therians (placental mammals and marsupials) is an intronless gene. It encodes a DNA-binding transcription factor that, when combined with other factors, up-regulates genes that encode male-specific transcription factors. This begins a cascade of gene expression that leads to the differentiation of the gonad into testes. Mutations in the SRY gene lead to a range of sex-related disorders with varying effects on an individual's phenotype. In some cases, the individual will morphologically develop as a female although both X and Y chromosomes are present.

Once formed, the testes produce sex hormones that direct the rest of the developing embryo to become male, while the ovaries make different sex hormones that promote female development. The testes and ovaries are also the organs where gametes (sperm or eggs) are produced.

b) ZW system
Birds, some fish, some insects (butterflies and moths) and reptiles use different chromosomes for sex determination, the Z- and W-chromosomes. The Z-chromosome is larger and has more genes than the W-chromosome. ZZ embryos become male and ZW embryos become female. This sex linkage pattern is the reverse of the X and Y sex linkage pattern. It is currently unknown if the presence of the W chromosome induces female features or two copies of the Z chromosome induce male features. In birds, researchers have not yet found a ZZW or Z0 individual.

c) X/O system
The X/O system (XX female, X/O male), where O is the absence of a chromosome, is found in insects (e.g. grasshoppers). The absence of a chromosome means that there is not a specific gene that determines the sex of an individual; instead it is usually determined via chromosome dosage.

1.2. CHROMOSOME DOSAGE
a) X-Autosome Ratio
This mechanism involves ratios of autosomes to sex chromosomes. This can occur even in species that have two sex chromosomes. For example, although Drosophila melanogaster has XX-XY sex chromosomes, its sex determination system uses a chromosome ratio method, that of the X:Autosome (X:A) ratio. In this system it is the ratio of autosome chromosome sets (A) relative to the number of X-chromosomes (X) that determines the sex. Individuals with two autosome sets and two X-chromosomes (2A:2X) develop as females, while those with only one X-chromosome (2A:1X) develop as males. The presence or absence of the Y-chromosome and its genes is not significant for determining sex; however, there are genes on the Y-chromosome that are needed for male fertility. An X/O fly is phenotypically male but not fertile. By comparison, X/O mammals are phenotypically female because they lack the SRY gene.

b) Ploidy Level
In other species of animals, the number of chromosome sets can determine sex. For example, the haploid-diploid system is used in bees, ants, and wasps. Typically, haploids are male and diploids are female.

2. ENVIRONMENTAL FACTORS
2.1. GROWTH TEMPERATURE
Alligators (Figure 1) – Sex is determined by the temperature during development in the egg and individuals are fully determined by the time of hatching. Developmental temperatures of 30°C produce all females (nests constructed on levees). Developmental temperatures of 34°C yield all males (wet marsh nests). The natural sex ratio at hatching is five females to one male. Note that such a mechanism is sensitive to warming environmental temperatures.

Tuatara (Figure 3) – These reptiles look like lizards but are a distinctly separate Order, which has survived for over 200 million years. There are currently only two extant species. The embryo’s developmental temperature determines the animal’s


sex; low temperatures (below a threshold) produce females and high temperatures (above a threshold) produce males. Global warming will affect the sex ratio in the population. By 2080, conditions are predicted to produce 100% males.

Figure 3.
The tuatara (left) is a reptile, but not a lizard, although it is related to lizards (right). Cladogram: 1=Tuatara, 2=lizards, 3=snakes, 4=crocodiles, 5=birds.
(Left: Flickr-PhillipC- CC BY 2.0)
(Right: Wikipedia-Benchill-CC BY 3.0)

2.2. SOCIAL ORGANIZATION
In some species, the sex ratio in a population determines the sex of individuals. For example, most reef fish can change their sex during their lifetime. The Wrasse family, for instance, includes many different species of various sizes and colours. In this family, sex change is typically female-to-male (male-to-female sex change has been seen in experimental conditions). The individual to change sex is generally the largest female in a group.

2.3. PARTHENOGENETIC SPECIES
In parthenogenetic species, females can lay fertile eggs without requiring males. Examples include walking stick insects, some fish and lizards, and sharks in captivity.

Figure 4.
Moon Wrasse (Thalassoma lunare) can change sex.
(Flickr- Nick Hobgood- CC BY-NC 2.0)

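Determining Factors      | Genetic Mechanism                  | Cell Response Mechanism
-------------------------+------------------------------------+------------------------------------------------------
Chromosomal:             | • Single gene                      | • Hormonal: directs cells to sex phenotype
• XX/XY                  | • X-Autosome Ratio (gene dosage)   | • Cell-autonomous (each cell “knows” what sex it is)
• ZW/ZZ                  |                                    |
• XX/XO                  |                                    |
• Haploid/Diploid        |                                    |
-------------------------+------------------------------------+------------------------------------------------------
Environment:             | Not genetic                        | Hormonal?
• Rearing temp.          |                                    |
• Social interactions    |                                    |
• Parthenogenesis        |                                    |

Table 1. A summary table outlining various factors that affect sex determination and its genetic and cell response mechanism.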
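
The chromosomal rules summarized in this chapter can be written out as simple decision functions. This is only a sketch of the cases stated in the text. It assumes a Y chromosome always carries a functional SRY gene, and it deliberately refuses karyotypes that the chapter does not discuss. All function names are hypothetical.

```python
from fractions import Fraction


def sex_mammal(sex_chromosomes):
    """XY system: a Y (assumed to carry a working SRY gene) gives a male.

    Pass "X" for an X/O individual, which lacks SRY and develops as female.
    """
    return "male" if "Y" in sex_chromosomes else "female"


def sex_bird(sex_chromosomes):
    """ZW system: ZZ is male, ZW is female."""
    pair = sorted(sex_chromosomes)
    if pair == ["Z", "Z"]:
        return "male"
    if pair == ["W", "Z"]:
        return "female"
    raise ValueError("only ZZ and ZW are covered by this sketch")


def sex_drosophila(x_count, autosome_sets=2):
    """X:A ratio: 2A:2X is female, 2A:1X is male; the Y is ignored."""
    ratio = Fraction(x_count, autosome_sets)
    if ratio == 1:
        return "female"
    if ratio == Fraction(1, 2):
        return "male"
    raise ValueError("X:A ratio not covered by this sketch")


def sex_haplodiploid(chromosome_sets):
    """Haploid-diploid system (bees, ants, wasps)."""
    if chromosome_sets == 1:
        return "male"
    if chromosome_sets == 2:
        return "female"
    raise ValueError("only haploid and diploid are covered by this sketch")


if __name__ == "__main__":
    # The same X/O karyotype gives opposite outcomes in the two systems.
    print(sex_mammal("X"), sex_drosophila(1))
```

The final line shows the contrast made in section 1.2: an X/O mammal is female because SRY is absent, whereas an X/O fly is male because its X:A ratio is 1:2.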


___________________________________________________________________________
SUMMARY:
• The sex of an individual can be determined by sex chromosomes.
• This includes the X/Y, Z/W, and X/O systems.
• Differences in the ploidy level (haploid vs diploid) also determine sex in some species.
• Lastly, environmental factors such as rearing temperature or social organization (male vs female ratio) can determine sex.
KEY TERMS:
single gene X/O system
Testis-Determining Factor (TDF) X:Autosome (X:A) ratio
Sex-determining Region Y (SRY) haploid-diploid system
therians Tuatara
XY system Sex-ratio
ZW system parthenogenetic


STUDY QUESTIONS:
1) Draw reciprocal crosses that would
demonstrate that the turkey E-gene is on the Z
chromosome.
2) Mendel’s First Law (as stated in class) does not
apply to alleles of most genes located on sex
chromosomes. Does the law apply to the
chromosomes themselves?

SEX CHROMOSOME: DOSAGE COMPENSATION – CHAPTER 22

CHAPTER 22 – SEX CHROMOSOMES: DOSAGE COMPENSATION

Figure 1.
A calico cat showing the random inactivation
(X-inactivation) of one or the other X-
chromosome giving either an orange or black
fur colour. The inactivation is a mechanism of
dosage compensation. (Note: the white
colour pattern is due to another gene.)
(Original-J. Locke-CC:AS)

INTRODUCTION
The previous chapters on sex chromosomes dealt with sex linkage and sex determination. Now, there is one last issue dealing with sex chromosomes, that of dosage compensation. Because the number of X chromosomes (and Z chromosomes) differs between the sexes, there is a difference in the number of copies for each locus on the chromosome: females have two, while males only have one (opposite for the ZZ/ZW system).

1. GENE DOSAGE PROBLEM
For many loci, the different number of chromosomes is inconsequential. That is, the phenotype is unaffected whether there are one or two alleles present. However, for some loci, it is significant and can affect the phenotype. These loci need to have the correct gene dosage to generate a wild type phenotype. The dosage difference between the sexes is reconciled in one of two ways. Either the single X chromosome in males is up-regulated to produce the expression equivalent of two doses, or one of the two doses in females is inactivated so as to only have one active dose.

Mammals and Drosophila both have XX - XY sex determination systems. However, because these systems evolved independently, and very early in evolution, they work differently with regard to compensating for the difference in gene dosage. Remember, in most cases the sex chromosomes act as a homologous pair even though the Y-chromosome has lost most of the loci when compared to the X-chromosome. Typically, the X and the Y chromosomes were once similar but, for unclear reasons, the Y chromosome has degenerated, slowly mutating and losing its loci. In modern day mammals the Y chromosomes have very few genes left while the X chromosomes remain as they were. This is a general feature of all organisms that use chromosome based sex determination systems. Chromosomes found in both sexes (the X or the Z) have retained their genes while the chromosomes found in only one sex (the Y or the W) have lost most of their genes. In either case there is a gene dosage difference between the sexes: e.g. XX females have two doses of X-chromosome genes while XY males only have one. This gene dosage needs to be compensated in a process called dosage compensation. There are two major mechanisms.


2. DOSAGE COMPENSATION IN DROSOPHILA
In Drosophila and many other insects, dosage compensation takes place in males. To make up for having only a single X chromosome, the genes on it are transcribed at twice the normal rate. This increased gene expression restores a balance between proteins encoded by X-linked genes and those made by autosomal genes.

3. X-CHROMOSOME INACTIVATION IN MAMMALS
3.1. BASICS
In mammals a different mechanism is used, called X-chromosome inactivation, and it operates in females, not males. In XX embryos one X in each cell is randomly marked and inactivated. From that point forward most of the genes on this chromosome will be unexpressed or “inactive”, hence its name Xinactive (Xi). The other X chromosome, the Xactive (Xa), is unaffected and genes are expressed as they normally would be.

The inactivation process is under the control of the X-inactivation centre (XIC), located at Xq13 on the X-chromosome, which contains several genes including the XIST gene. The XIST gene is transcribed (but not translated into a protein) and is responsible for the initiation and propagation of inactivation of one X-chromosome in an XX cell. These XIST RNA transcripts coat the X chromosome so that transcription from that X chromosome is prevented (inactivated).

Figure 2.
X chromosome inactivation during mitosis and after mitosis.
(Original-Harrington/Kang-CC BY-NC 3.0)

The Xi chromosome is replicated during S phase and transmitted during mitosis the same as any other chromosome, but most of its genes are not transcribed (Figure 2). The chromosome appears as a condensed mass within interphase nuclei, called the Barr body (Figure 3), and does not decondense to be expressed. (The Barr body is named after Canadian researcher Murray Barr, who along with his graduate student Ewart Bertram at Western University in London, Ontario discovered it in 1948.) With the inactivation of genes on one X-chromosome, females have the same number of functioning X-linked genes as males. However, some genes, particularly those in the pseudoautosomal regions, escape inactivation and the alleles are expressed from both active and inactive X chromosomes. These genes may explain clinical features in sex chromosome aneuploidy as gene products may be either under or over expressed in relation to normal females and males.

Figure 3.
X chromosomes detected by FISH method in a female cell’s nucleus.
(Wikipedia-Steffen Dietzel-CC BY-SA 3.0)

3.2. X-LINKED GENES – ORANGE GENE IN CATS
A classic X-linked gene that shows X-inactivation is the Orange gene (O) in cats. The O^O allele encodes an enzyme that results in orange pigment in the fur hairs. The O^B allele results in the hairs being black. The phenotypes of various genotypes of cats are shown in Figure 4. Note that the heterozygous females have an orange and black mottled phenotype known as tortoiseshell. This is due to patches of skin cells having different X-


Figure 4.
Relationship between genotype and phenotype for an X-linked gene in cats. The O^O allele = orange while the O^B allele = black.
(Original-Harrington-CC BY-NC 3.0)

chromosomes inactivated. In each orange hair the

Xi chromosome carrying the OB allele is inactivated.
The OO allele on the Xa is functional and orange
pigments are made. In black hairs the reverse is
true, the Xi chromosome with the OO allele is
inactive and the Xa chromosome with the OB allele Figure 5.
is active. Because the inactivation decision happens This figure shows the two types of liver cells in females
early during embryogenesis, the cells continue to heterozygous for an F8 mutation. Because people with
+ -
divide to make large patches on the adult cat skin the F8 /F8 genotype have the same phenotype, normal
+ + -
blood clotting, as F8 /F8 people the F8 mutation is
where one or the other X is inactivated.
classified as recessive. .
(Original-Harrington/Locke-CC BY-NC 3.0)
The Orange gene in cats is also a good
demonstration of how the mammalian dosage
compensation system affects gene expression. 4. MECHANISMS OF SEX DETERMINATION
However, most X-linked genes do not produce such SYSTEMS
dramatic, easy to see, mosaic phenotypes in
Sex is a phenotype. Typically, in most species, there
heterozygous females.
are multiple characteristics, in addition to sex
3.3. A TYPICAL X-LINKED GENE – F8 GENE IN HUMANS organs, that distinguish male from female
A more typical example of an X-linked gene is the individuals (although some species are normally
F8 gene in humans. It makes Factor VIII blood hermaphrodites where both sex organs are
clotting proteins in liver cells. If a male is present in the same individual; e.g. worms). The
hemizygous for a mutant allele (F8-/Y) the result is morphology and physiology of male and females is
hemophilia type A. Females homozygous for a phenotype just like hair or eye colour or wing
mutant alleles (F8-/F8-) will also have hemophilia. shape. The sex of an organism is part of its
However, heterozygous females, those people who phenotype and can be genetically (or
+ -
are F8 /F8 , do not have hemophilia because even environmentally) determined.
though half of their liver cells do not make Factor For each species, the genetic determination relies
+
VIII (because the X with the F8 allele is inactive) on one of several gene or chromosome based
the other 50% can (Figure 5). Because some of mechanisms. See Figure 5 for a summary. There
their liver cells are producing and exporting Factor are, for other species, also a variety of
VIII proteins into the blood stream they have the environmental mechanisms, too (rearing
ability to form blood clots throughout their bodies. temperature, social interactions, parthenogenesis).
The genetic mosaicism in the liver cells of their Whatever the sex choice mechanism, however,
bodies does not result in a visible mosaic there are two different means by which the cells of
phenotype2 an organism carry out this decision: hormonal or
cell-autonomous.


Figure 6.
Different types of chromosomal (or gene) based sex determination. From top to bottom, there is the archetypal XX/XY system found in humans (and most mammals) with the TDF-Y gene leading to a male phenotype; the ZW/ZZ system found in chickens (birds, moths, and butterflies); the same XX/XY system in Drosophila (sex is determined by the X-chromosome:autosome ratio); the XX/XO system as found in grasshoppers; and the diploid/haploid system as found in bees (and ants, and wasps). Also, the hormonal mechanism is used in humans, while all the other examples use the cell-autonomous mechanism for development of the male or female sex phenotype.
(Wikipedia-original - CFCF with additions and corrections by J. Locke- CC BY-SA 3.0)

4.1. HORMONAL MECHANISM
With this system, used by mammals for example, including humans, the zygote initially develops into a sexually undifferentiated embryo that can become either sex. Then, depending on the sex choice of the genital ridge cells, they grow and differentiate into male (testis) or female (ovary) gonads, which will then produce the appropriate hormones (e.g. testosterone or estrogen). This hormone will circulate throughout the body, enter cells, and activate transcription factors, which will cause all the other tissues to develop and differentiate accordingly, into a male or female phenotype for that individual. Simply put, the circulating hormone induces all the cells and tissues to be the appropriate sexual phenotype.

Sometimes this hormone-inducing system fails. In cattle (and some other mammals), a freemartin is a type of heifer (female) that becomes masculinized because of hormone transfer from a bull (male) twin. Externally a freemartin appears as a female but it is infertile over 90% of the time, has masculinized behavior, and non-functioning ovaries. The animal originates as a female (XX) but the female reproductive development is altered by anti-Müllerian hormone from the male twin, acquired via vascular connections between placentas. Thus, the freemartin has conflicting hormonal cues, which leads to the intermediate phenotype.

4.2. CELL-AUTONOMOUS MECHANISM
With this system, used by many animals, including birds and insects, the zygote cell initially has a sex phenotype set at the cell level (not whole organism level). Each cell intrinsically determines its own sex and then develops accordingly, giving the appropriate sexual characteristics and phenotype to the whole organism. Each cell is autonomous with respect to its sex; there are no sex hormone cues to determine the sex expressed by the organism.

This cell autonomy mechanism can lead to sexual gynandromorphs, which are genetic mosaics (a single organism composed of genetically distinct cells derived from the same zygote) that display both male and female characteristics in a mosaic fashion. They often are phenotypically split down the midline of the organism. These rare individuals are thought to be the result of an improper sex chromosome segregation that occurs in a cell very


early in development so that one half of the individual has cells with a male chromosome set while the other half has cells with a female set. If a species is sexually dimorphic (external morphology easily distinguishes males from females), gynandromorphs are easily visible. See Figure 8 for a local example. Gynandromorphs are so common that they are even sometimes seen in the wild. A search on the internet will bring up many more examples.

Figure 8.
Drosophila sexual gynandromorph. The upper side is female with the most distal abdominal segments being not heavily pigmented, while the lower side is male and the two most distal segments are heavily pigmented (red circle). Note the curvature of the abdomen caused by the longer, female half above and shorter male half below. This example was found by a U. of Alberta student in the GENET 375 lab course, Introductions to Molecular Genetic Techniques.
(Original – Locke – CC BY-NC 3.0).

While gynandromorphs are seen in cell-autonomous species, such as insects and birds, they are not seen in hormonally determined species, such as mammals. This is because all the cells in the body display the same sex phenotype caused by the circulating sex hormones. Sexual gynandromorphs appear to be absent in reptiles, amphibians, and fish, indicating that they don’t use a cell-autonomous mechanism. Nevertheless, there are genetic mosaic individuals in these groups, but they do not appear to involve sex determined traits, which is required for a true gynandromorph. They often involve mosaicism of alleles at a single gene locus (somatic mutation) that affect external morphology (e.g. colour).

4.3. MIXED CELL INDIVIDUALS – MOSAIC VS CHIMERA
Both mosaics and chimeras have genetically different cells. However, the difference between the two is the origin of those cells (Figure 10).

a) Mosaic
A mosaic is an organism or a tissue that contains two or more types of genetically different cells derived from the same (single) zygote. Since the cells are derived from the same zygote, most loci will be identical in all cell populations, with only a few loci differing. The genetic differences may arise from mutations, changes in the number or structure of chromosomes, or X-chromosome inactivation.

In a clinical sense, if the mosaic individual has large amounts of genetically abnormal cells and few normal cells, that individual will manifest disease. However, if the individual has small amounts of abnormal cells but a sufficient amount of normal cells, the severity of the disease will be reduced.

Figure 7.
Mosaicism can cause conditions such as partial heterochromia in eyes, which is a condition that causes difference in pigmentation in the iris.
(Wikipedia-Sheila.lorquiana- CC BY-SA 3.0)

For example, most embryos with mosaic Turner syndrome die prior to birth. This is because their abnormal cells have only one X chromosome (45,X). However, those who survive are known to have more normal 46,XX cells than abnormal 45,X cells.

Another example is X chromosome mosaicism. During embryogenesis, one of the female’s X chromosomes is randomly inactivated. Some cells might have their paternal X chromosome inactivated while others might have their maternal


X chromosome inactivated. Since these two cell lines have different genetic composition, females are also considered mosaics. All females have roughly equal amounts of the two genetically different cell lines.

A gynandromorph is an organism that is made up of mosaic tissues of male and female genotypes and displays both male and female characteristics.

b) Chimera
A chimera is an organism composed of genetically distinct cells derived from different (more than one) zygotes. Because the cells are derived from different zygotes, the cell populations will have more divergent genotypes when compared to those of a mosaic. The different sources can sometimes even be different species such as a goat and a sheep, which when mixed make a “shoat” or a “geep”.

Figure 9.
A chimeric mouse on the very right, made in NIMH’s Transgenic Core Facility. The two-coloured fur shows the two types of cells present.
(Wikipedia- Staff at NIMH's Transgenic Core Facility-PD)

Chimeric cattle are another example, from outside the laboratory, that illustrates this concept. When cows conceive fraternal (non-identical) twins, the circulatory systems of the twins can be connected via a joining called an anastomosis. Because of this, blood, cells, and tissue can be exchanged between the fetuses. This is how an organism can contain genetically distinct cells from another organism, its fraternal twin. If the blood and cells are “shared”, a female fetus with a male twin will be exposed to male hormones. The result is a masculinized female, which is called a freemartin (visibly female, but with male behavior and also sterile).

Also, patients who have undergone a cellular transplant, such as a bone marrow transplant, are considered to be chimeras.

Figure 10.
The main difference between mosaic and chimeric organisms is that the former arises from the same zygote whereas the latter arises from different (more than one) zygotes. (Original-Kang-CC BY-NC 3.0)
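The X chromosome mosaicism described in this section (each early embryonic cell randomly inactivates one X, and all of its descendants keep that choice) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name `simulate_x_inactivation`, the number of founder cells, and the number of divisions are illustrative choices, not biological measurements taken from the text.

```python
import random


def simulate_x_inactivation(n_founders=64, divisions=5, seed=0):
    """Count cells by which X chromosome stays active after random inactivation.

    Each founder cell picks one X to keep active at random; every descendant
    inherits that choice, so each founder gives rise to one clonal patch.
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)      # seeded so the sketch is reproducible
    clone_size = 2 ** divisions    # cells per patch after `divisions` mitoses
    counts = {"maternal X active": 0, "paternal X active": 0}
    for _ in range(n_founders):
        active = rng.choice(list(counts))   # random, independent choice per founder
        counts[active] += clone_size        # the whole clone shares the choice
    return counts


if __name__ == "__main__":
    counts = simulate_x_inactivation()
    total = sum(counts.values())
    for label, n in counts.items():
        print(f"{label}: {n} cells ({n / total:.0%})")
```

With many founder cells the two cell lines come out close to equal in size, while a small number of founders gives larger patches and a more uneven split, which is consistent with the patchy tortoiseshell phenotype described earlier in the chapter.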


___________________________________________________________________________
SUMMARY:
• In order to compensate for under or over dosage of gene products, organisms use various methods
such as expressing genes at twice the normal rate or inactivating one X chromosome.
• X-chromosome inactivation occurs randomly (except for special circumstances), and during interphase
the inactivated chromosome appears as a condensed mass in the nucleus called the Barr body.
• The Orange gene in cats and the F8 gene in humans are examples of X-linked genes.
• Sex determination can be either hormonal or cell-autonomous. Abnormality in the cell-autonomous
mechanism may result in gynandromorphs.
• Both mosaic and chimeric organisms are composed of genetically distinct cells, but the origins of
those cells are different.
KEY TERMS:
dosage compensation cell-autonomous
X-linked genes freemartin
autosomal genes chimera
X-chromosome inactivation sexual gynandromorphs
Barr body genetic mosaics
Orange gene sexually dimorphic
F8 gene X chromosome mosaicism
hermaphrodites gynandromorph
parthenogenesis anastomoses
hormonal


STUDY QUESTIONS:
1) What is the relationship between the Oᴼ and Oᴮ
alleles of the Orange gene in cats?
2) Another cat hair colour gene is called White
Spotting. This gene is autosomal. Cats that have
the dominant “S” allele have white spots, while
cats homozygous for the “s” allele do not. Taking
the Orange locus (Oᴮ and Oᴼ) into account, what
are the possible genotypes of cats that are:
a) entirely black
b) entirely orange
c) black and white
d) orange and white
e) orange and black (tortoiseshell)
f) orange, black, and white (calico)
3) Make a diagram similar to Figure 4, but with
the F8 alleles/genotypes, that shows the
relationship between genotype and phenotype
in females and males, and indicate which
individuals would use the purified Factor VIII protein.


CHAPTER 23 – PEDIGREE ANALYSIS

Figure 1.
Polydactyly (six fingers in this case – count them) is an example of a human
trait that can be studied by pedigree analysis.
(Wikipedia- Drgnu23- CC BY-SA 3.0)

INTRODUCTION
The basic concepts of genetics described in the preceding chapters can be applied to almost any eukaryotic organism. However, some techniques, such as test crosses, can only be performed with model organisms or other species that can be experimentally manipulated. To study the inheritance patterns of genes in humans and other species for which controlled matings are not possible, geneticists use the analysis of pedigrees and populations.

1. PEDIGREE ANALYSIS
1.1. PEDIGREE CHARTS
Pedigree charts are diagrams that show the phenotypes and/or genotypes for a particular organism, its ancestors, and descendants. While commonly used in human families to track genetic diseases, they can be used for any species and any inherited trait. Geneticists use a standardized set of symbols to represent an individual’s sex, family relationships and phenotype. These diagrams are used to determine the mode of inheritance of a particular disease or trait, and to predict the probability of its appearance among offspring. Pedigree analysis is therefore an important tool in basic research, agriculture, and genetic counseling.

Each pedigree chart represents all of the available information about the inheritance of a single trait (most often a disease) within a family. The pedigree chart is therefore drawn using factual information, but there is always some possibility of errors in this information, especially when relying on family members’ recollections or even clinical diagnoses. In real pedigrees, further complications can arise due to incomplete penetrance (including age of onset) and variable expressivity of disease alleles, but for the examples presented in this book, we will presume complete accuracy of the pedigrees; that is, the phenotype accurately reflects the genotype. A pedigree may be drawn when trying to determine the nature of a newly discovered disease, or when an individual with a family history of a disease wants to know the probability of passing the disease on to their children. In either case, a tree is drawn, as shown in Figure 2, with circles to represent females, and squares to represent males. Matings are drawn as a line joining a male and female, while a


consanguineous mating (between closely related individuals) is drawn as two lines.

The affected individual that brings the family to the attention of a geneticist is called the proband (or propositus). If the individual is unaffected, they are called the consultand. If an individual is known to have symptoms of the disease (affected), the symbol is filled in. Sometimes a half-filled in symbol is used to indicate a known carrier of a disease; this is someone who does not have any symptoms of the disease, but who passed the disease on to subsequent generations because they are a heterozygote. Female carriers of X-linked traits are indicated by a circle with a dot in the centre. Note that when a pedigree is constructed, it is often unknown whether a particular individual is a carrier or not, so not all carriers are always explicitly indicated in a pedigree. For simplicity, in this chapter we will assume that the pedigrees presented are accurate, and represent fully penetrant traits.

Figure 2.
Symbols used in drawing a pedigree.
(Original-Deyholos/Harrington-CC BY-NC 3.0)

1.2. PEDIGREE CHART CONVENTION SYMBOLS
In pedigree analysis, standardized human pedigree nomenclature is used.

If possible, the male partner should be placed to the left of the female partner on the relationship line. Siblings should be listed from left to right in birth order, oldest to youngest.

2. MODES OF INHERITANCE
Given a pedigree of an uncharacterized disease or trait, one of the first tasks is to determine which modes of inheritance are possible and then which mode of inheritance is most likely. This information is essential in calculating the probability that the trait will be inherited in any future offspring. We will mostly consider five major types of inheritance: autosomal dominant (AD), autosomal recessive (AR), X-linked dominant (XD), X-linked recessive (XR), and Y-linked (Y).

2.1. AUTOSOMAL DOMINANT (AD)
When a disease is caused by a dominant allele of a gene, every person with that allele will show symptoms of the disease (assuming complete penetrance), and only one disease allele needs to be inherited for an individual to be affected. Thus, every affected individual must have an affected parent. A pedigree with affected individuals in every generation is typical of AD diseases. However, beware that other modes of inheritance can also show the disease in every generation, as described below. It is also possible for an affected individual with an AD disease to have a family without any affected children, if the affected parent is a heterozygote. This is particularly true in small families, where the probability of every child inheriting the normal, rather than the disease, allele is not extremely small. Note that AD diseases are usually rare in populations, therefore affected individuals with AD diseases tend to be heterozygotes (otherwise, both parents would have had to be affected with the same rare disease). Achondroplastic dwarfism and polydactyly are both examples of human conditions that may follow an AD mode of inheritance.
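The point about small families in Section 2.1 can be made concrete with a short calculation. The sketch below assumes a mating between a heterozygous affected parent and an unaffected parent, complete penetrance, and independent inheritance for each child, so that each child has a 1/2 chance of receiving the normal allele; the function name `prob_no_affected_children` is a hypothetical helper introduced only for illustration.

```python
from fractions import Fraction


def prob_no_affected_children(n_children: int) -> Fraction:
    """Probability that none of n children of a heterozygous affected parent
    (mated to an unaffected parent) inherits the dominant disease allele."""
    if n_children < 0:
        raise ValueError("n_children must be non-negative")
    p_normal_allele = Fraction(1, 2)   # chance one child receives the normal allele
    # Product rule: independent events multiply.
    return p_normal_allele ** n_children


if __name__ == "__main__":
    for n in (1, 2, 3, 6):
        print(f"{n} children: {prob_no_affected_children(n)}")
```

For two children the probability is 1/4 and for three it is 1/8, so an affected heterozygote with a small, entirely unaffected family is not unusual.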


Figure 3.
A pedigree consistent with AD inheritance.
(Unknown)

Table 1.
Genotype nomenclature consistent with AD inheritance.
(Original-Harrington-CC BY-NC 3.0)

AD EXAMPLE: ACHONDROPLASIA
Achondroplasia is a common form of dwarfism. The FGFR3 gene at 4p16 (chromosome 4, p arm, region 1, band 6) encodes a receptor protein that negatively regulates bone development. A specific base pair substitution in the gene makes an over-active protein and this results in shortened bones. Achondroplasia is considered autosomal dominant because the defective proteins made in A / a embryos halt bone growth prematurely. A / A embryos do not make enough limb bones to survive. Most, but not all, dominant mutations are also recessive lethal. In achondroplasia, the A allele shows a dominant visible phenotype (shortness) and a recessive lethal phenotype.

Figure 4.
Portrait of Sebastián de Morra, a court dwarf, painted by Diego Velázquez ~1645. He likely had achondroplasia, a condition that has autosomal dominant inheritance. (Wikimedia Commons-Diego Velázquez-PD)

Figure 5.
Diagram showing the mechanism of achondroplasia.
(Original-Harrington-CC BY-NC 3.0)

2.2. X-LINKED DOMINANT (XD)
In X-linked dominant inheritance, the gene responsible for the disease is located on the X-chromosome, and the allele that causes the disease is dominant to the normal allele in females. Because females have twice as many X-chromosomes as males, females tend to be more frequently affected than males in the population. However, not all pedigrees provide sufficient information to distinguish XD and AD. One definitive indication that a trait is inherited as AD, and not XD, is that an affected father passes the disease to a son; this type of transmission is not possible with XD, since males inherit their X chromosome from their mothers.

Figure 6.
Two pedigrees consistent with XD inheritance. (Unknown)


Table 2.
Genotype nomenclature consistent with XD inheritance.
(Original-Harrington-CC BY-NC 3.0)

Figure 7.
Some types of rickets may follow an XD mode of inheritance.
(Wikipedia-Mrish-CC BY-SA 1.0)

XD EXAMPLE: FRAGILE X SYNDROME
The FMR1 gene at Xq27.3 (X chromosome, q arm, region 2, band 7, sub-band 3) encodes a protein needed for neuron development. There is a (CGG)n repeat array in the 5’UTR (untranslated region). If there is expansion of the repeat in the germline cell the child will inherit a non-functional allele. XA / Y males have fragile X syndrome with intellectual disability (IQ < 50) because none of their neurons can make FMR1 proteins. Fragile X syndrome is considered X-linked dominant because only some neurons in XA / Xa females can make FMR1 proteins. The severity (IQ 50 – 70) in these females depends upon the number and location of these cells within the brain.

2.3. AUTOSOMAL RECESSIVE (AR)
Diseases that are inherited in an autosomal recessive pattern require that both parents of an affected individual carry at least one copy of the disease allele. With AR traits, many individuals in a pedigree can be carriers, probably without knowing it. Compared to pedigrees of dominant traits, AR pedigrees tend to show fewer affected individuals and are more likely than AD or XD to “skip a generation”. Thus, the major feature that distinguishes AR from AD or XD is that unaffected individuals can have affected offspring. Attached earlobes is a human condition that may follow an AR mode of inheritance.

Figure 8.
A pedigree consistent with AR inheritance. (Unknown)

Table 3.
Genotype nomenclature consistent with AR inheritance.
(Original-Harrington-CC BY-NC 3.0)

AR EXAMPLE: PHENYLKETONURIA (PKU)
Individuals with phenylketonuria (PKU) have a mutation in the PAH gene at 12q24 (chromosome 12, q arm, region 2, band 4), which encodes phenylalanine hydroxylase (PAH), an enzyme that converts phenylalanine into tyrosine. Without PAH, the accumulation of phenylalanine and other metabolites, such as phenylpyruvic acid (Figure 10), disrupts brain development, typically within a year after birth, and can lead to intellectual disability. Fortunately, this condition is both easy to diagnose (Figure 9) and can be successfully treated with a low phenylalanine diet. There are over 450 different mutant alleles of the PAH gene, so most people with PKU are compound


heterozygotes. Compound heterozygotes have two different mutant alleles (different base pair changes) at a given locus, in this case the PAH gene.

Figure 9.
Many inborn errors of metabolism, such as phenylketonuria (PKU), are inherited as AR. Newborns are often tested for a few of the most common metabolic diseases.
(Wikipedia-U.S. Air Force photo/Staff Sgt. Eric T. Sheler-PD)

Figure 10.
The enzyme made from a mutant PAH gene cannot catalyze the breakdown of phenylalanine into tyrosine. This causes a buildup of phenylpyruvic acid, which damages the central nervous system.
(Original-Harrington-CC BY-NC 3.0)

2.4. X-LINKED RECESSIVE (XR)
Because males have only one X-chromosome, any male that inherits an X-linked recessive disease allele will be affected by it (assuming complete penetrance). Therefore, in XR modes of inheritance, males tend to be affected more frequently than females in a population. This is in contrast to AR and AD, where both sexes tend to be affected equally, and XD, in which females are affected more frequently. Note, however, in the small sample sizes typical of human families, it is usually not possible to accurately determine whether one sex is affected more frequently than the other. On the other hand, one feature of a pedigree that can be used to definitively establish that an inheritance pattern is not XR is the presence of an affected daughter from unaffected parents; because she would have had to inherit one disease-allele-bearing X-chromosome from her father, he would also have been affected under XR inheritance.

Figure 11.
A pedigree consistent with XR inheritance. (Unknown)

Table 4.
Genotype nomenclature consistent with XR inheritance.
(Original-Harrington-CC BY-NC 3.0)

Figure 12.
Some forms of colour blindness are inherited as XR-traits. Colour blindness is diagnosed using tests such as this Ishihara Test, which is shown above. Note: if printed in B&W, this image will not show the number. See the online image in colour. (Wikipedia-unknown-PD)

XR EXAMPLE: HEMOPHILIA A
The F8 gene at Xq28 (X chromosome, q arm, region 2, band 8) encodes blood clotting factor VIIIc. Without Factor VIIIc, internal and external bleeding can’t be stopped. Back in the 1900s, an Xa / Y male’s


average life expectancy was 1.4 years, but in the 2000s it has increased to 65 years with the advent of recombinant human Factor VIIIc. Hemophilia A is recessive because XA / Xa females have normal blood coagulation, while Xa / Xa females have hemophilia.

Figure 13.
The F8 gene encodes blood clotting factor VIIIc, which is responsible for blood coagulation. (Original-Harrington-CC BY-NC 3.0)

2.5. Y-LINKED
Only males are affected in human Y-linked inheritance (and other species with the X/Y sex determining system). There is only father-to-son transmission. This is the easiest mode of inheritance to identify, but it is one of the rarest because there are so few genes located only on the Y-chromosome.

A common, but incorrect, example of Y-linked inheritance is the hairy-ear-rim phenotype seen in some Indian families. A better example is the Y-chromosome DNA polymorphisms that have been used to follow the male lineage in large families or through ancient ancestral lineages. For example, the Y-chromosome of Mongolian ruler Genghis Khan (1162-1227 CE), and his male relatives, accounts for ~8% of the Y-chromosome lineage of men in Asia (0.5% worldwide).

2.6. ORGANELLE GENOMES
In eukaryotes, DNA and genes also exist outside of the chromosomes found in the nucleus. Both the chloroplast and mitochondrion have circular chromosomes. These organelle genomes are often present in multiple copies within each organelle. In most sexually reproducing species, organelle chromosomes are inherited from only one parent, usually the one that produces the largest gamete. Thus, in mammals, angiosperms, and many other organisms, mitochondria and chloroplasts are inherited only through the mother (maternally). Therefore, mutations in mitochondrial DNA (mtDNA) are inherited through the maternal line.

There are some human diseases associated with mutations in mitochondrial genes. These mutations can affect both males and females, but males cannot pass them on as all mitochondria are inherited via the egg, not the sperm. Mitochondrial DNA polymorphisms are also used to investigate evolutionary and historical lineages, both ancient and recent. Because of the relative similarity of sequence, mtDNA is also used in species identification in ecology studies. An example of mitochondrial inheritance is Leber hereditary optic neuropathy (LHON). Mitochondria are very important in retinal cells for ATP and/or a specialized function. Mutations in several mtDNA genes result in blindness during early childhood.

3. SPORADIC AND NON-HERITABLE DISEASES
Not all the characterized human traits and diseases are attributed to mutant alleles at a single gene locus. Many diseases that have a heritable component have more complex inheritance patterns due to (1) the involvement of multiple genes, and/or (2) environmental factors.

On the other hand, some non-genetic diseases may appear to be heritable because they affect multiple members of the same family, but this is due to the family members being exposed to the same toxins or other environmental factors (e.g. in their homes).

Finally, diseases with similar symptoms may have different causes, some of which may be genetic while others are not. One example of this is ALS (amyotrophic lateral sclerosis); approximately 5-10% of cases are inherited in an AD pattern, while the majority of the remaining cases appear to be sporadic, in other words, not caused by a mutation inherited from a parent. We now know that different genes or proteins are affected in the inherited and sporadic forms of ALS. The physicist Stephen Hawking (Figure 14) and baseball player Lou Gehrig both suffered from sporadic ALS.
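Returning to the single-gene modes of Section 2, the features used there to rule modes in or out can be collected into a small rule checker. This is a minimal sketch, not a complete pedigree-analysis method: it assumes complete penetrance and no new mutations (as the chapter does), it looks only at one child and its two parents at a time, and the names `Trio` and `consistent_modes` are hypothetical, introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trio:
    """Phenotypes of one child and both parents (hypothetical record type)."""
    father_affected: bool
    mother_affected: bool
    child_affected: bool
    child_is_male: bool


ALL_MODES = frozenset({"AD", "AR", "XD", "XR", "Y"})


def consistent_modes(trios):
    """Return the modes of inheritance not excluded by any trio.

    Assumes complete penetrance and no new mutations.
    """
    modes = set(ALL_MODES)
    for t in trios:
        # Y-linked: only males are affected.
        if t.mother_affected or (t.child_affected and not t.child_is_male):
            modes.discard("Y")
        # Y-linked: strict father-to-son transmission.
        if t.child_is_male and t.father_affected != t.child_affected:
            modes.discard("Y")
        # Dominant: every affected individual has an affected parent.
        if t.child_affected and not (t.father_affected or t.mother_affected):
            modes -= {"AD", "XD"}
            # XR: an affected daughter would need an affected father.
            if not t.child_is_male:
                modes.discard("XR")
        # XD: a son's X comes from his mother, so an affected father cannot
        # be the source of an affected son's disease allele.
        if (t.child_affected and t.child_is_male
                and t.father_affected and not t.mother_affected):
            modes.discard("XD")
    return modes


if __name__ == "__main__":
    # Affected daughter of two unaffected parents: only AR remains.
    print(sorted(consistent_modes([Trio(False, False, True, False)])))
```

The checker only excludes modes; it does not rank the remaining ones. Deciding which of the surviving modes is most likely still requires looking at the whole pedigree, for example at how rare the trait is and how many unrelated carriers would have to marry into the family.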


Figure 14. know from calculating probabilities using a Punnett

Stephen Hawking Square (e.g. in a monohybrid cross Aa x Aa, ¼ of
(Wikipedia-NASA-PD) the offspring are aa).

We can likewise calculate probabilities in the more
complex pedigree shown in Figure 15.

4. CALCULATING PROBABILITIES
Once the mode of inheritance of a disease or trait
is identified, some inferences about the genotype
of individuals in a pedigree can be made, based on their phenotypes and where they appear in the family tree. Given these genotypes, it is possible to calculate the probability of a particular genotype being inherited in subsequent generations. This can be useful in genetic counseling, for example when prospective parents wish to know the likelihood of their offspring inheriting a disease for which they have a family history.
Probabilities in pedigrees are calculated using knowledge of Mendelian inheritance and the same basic methods as are used in other fields. The first formula is the product rule: the joint probability of two independent events is the product of their individual probabilities; this is the probability of one event AND another event occurring. For example, the probability of rolling a “five” with a single throw of a single six-sided die is 1/6, and the probability of rolling a “five” in each of three successive rolls is 1/6 x 1/6 x 1/6 = 1/216.

The second useful formula is the sum rule, which states that the combined probability of two mutually exclusive events is the sum of their individual probabilities. This is the probability of one event OR another event occurring. For example, the probability of rolling a five or a six in a single throw of a die is 1/6 + 1/6 = 1/3.

With these rules in mind, we can calculate the probability that two carriers (i.e. heterozygotes) of an AR disease will have a child affected with the disease as ½ x ½ = ¼, since for each parent, the probability of any gamete carrying the disease allele is ½. This is consistent with what we already

Figure 15.
Individuals in this pedigree are labeled with numbers to make discussion easier. (Unknown)

Assuming the disease has an AR pattern of inheritance, what is the probability that individual #14 will be affected? We can assume that individuals #1, #2, #3 and #4 are heterozygotes (Aa), because they each had at least one affected (aa) child, but they are not affected themselves. This means that there is a 2/3 chance that individual #6 is also Aa. This is because according to Mendelian inheritance, when two heterozygotes mate, there is a 1:2:1 distribution of genotypes AA:Aa:aa. However, because #6 is unaffected, he can’t be aa, so he is either Aa or AA, and he is twice as likely to be Aa as AA. By the same reasoning, there is likewise a 2/3 chance that #9 is a heterozygous carrier of the disease allele.

If individual #6 is heterozygous for the disease allele, then there is a ½ chance that #12 will also be a heterozygote (i.e. if the mating of #6 and #7 is Aa × AA, half of the progeny will be Aa; we are also assuming that #7, who is unrelated, does not carry any disease alleles). Therefore, the combined probability that #12 is also a heterozygote is 2/3 x 1/2 = 1/3. This reasoning also applies to individual #13, i.e. there is a 1/3 probability that he is a heterozygote for the disease. Thus, the overall probability that both individual #12 and #13 are heterozygous, and that a particular offspring of theirs will be homozygous for the disease alleles is 1/3 x 1/3 x 1/4 = 1/36.
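The pedigree arithmetic above can be checked with exact fractions. The following is a minimal sketch, not part of the original text: the function names product_rule and sum_rule and the variable names are illustrative assumptions, and the sum rule is applied only to mutually exclusive outcomes.

```python
from fractions import Fraction


def product_rule(*probabilities):
    """Joint probability of independent events (event 1 AND event 2 AND ...)."""
    result = Fraction(1)
    for p in probabilities:
        result *= p
    return result


def sum_rule(*probabilities):
    """Combined probability of mutually exclusive events (event 1 OR event 2 OR ...)."""
    return sum(probabilities, Fraction(0))


# Dice examples from the text.
three_fives = product_rule(Fraction(1, 6), Fraction(1, 6), Fraction(1, 6))
five_or_six = sum_rule(Fraction(1, 6), Fraction(1, 6))

# Unaffected child of Aa x Aa: AA:Aa:aa occur 1:2:1 and aa is ruled out,
# so the chance of Aa is (2/4) / (1/4 + 2/4).
p_carrier_if_unaffected = Fraction(2, 4) / sum_rule(Fraction(1, 4), Fraction(2, 4))

# #6 (or #9) is Aa with probability 2/3, then transmits the allele with probability 1/2.
p_12_carrier = product_rule(p_carrier_if_unaffected, Fraction(1, 2))
p_13_carrier = p_12_carrier

# Both parents are carriers AND the child receives the disease allele from each (1/4).
p_14_affected = product_rule(p_12_carrier, p_13_carrier, Fraction(1, 4))

print(three_fives, five_or_six, p_carrier_if_unaffected, p_12_carrier, p_14_affected)
```

Using Fraction rather than floating point keeps every intermediate value in the same form as the hand calculation, so each step can be compared directly with the text.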


___________________________________________________________________________
SUMMARY:
• Pedigree analysis can be used to determine the mode of inheritance of specific traits such as diseases.
• Loci can be X- or Y-linked or autosomal in location and alleles either dominant or recessive with respect
to wild type.
• If the mode of inheritance is known, a pedigree can be used to calculate the probability of inheritance
of a particular genotype by an individual.
KEY TERMS:
Pedigree charts X-linked recessive
mode of inheritance Hemophilia A
genetic counseling Y-linked
incomplete penetrance hairy-ear-rim
variable expressivity chloroplast
proband mitochondrion
affected organelle
carrier mitochondrial inheritance (mtDNA)
autosomal dominant endopolyploidy
Achondroplasia sporadic
X-linked dominant product rule
Fragile X-syndrome sum rule
Phenylketonuria (PKU)
autosomal recessive


STUDY QUESTIONS:
1) What are some of the modes of inheritance that are consistent with this pedigree?

2) In the pedigree in question 1, the mode of inheritance cannot be determined unambiguously. What are
some examples of data (e.g. from other generations) that, if added to the pedigree, would help determine
the mode of inheritance?
3) For each of the following pedigrees, name the most likely mode of inheritance (AR=autosomal recessive,
AD=autosomal dominant, XR=X-linked recessive, XD=X-linked dominant). (These pedigrees were obtained
from various external sources).
a)


4) The following pedigree represents a rare, autosomal recessive disease. What are the genotypes of the
individuals who are indicated by letters?

5) If individual #1 in the following pedigree is a heterozygote for a rare, AR disease, what is the probability
that individual #7 will be affected by the disease? Assume that #2 and the spouses of #3 and #4 are not
carriers.


CHAPTER 24 – CHROMOSOME REARRANGEMENTS

Figure 1.
Comparing an ideogram of the human chromosome 2 to the
equivalent chromosomes in chimpanzees, we notice that the
human chromosome 2 likely came from a fusion event that
occurred since their common ancestor. This is supported by
evidence finding telomeric and centromeric sequences in the
middle of human chromosome 2 similar to that of the ends and
middle of the chimpanzee chromosomes.
Note: The actual formation of HSA2 was more complex (and
interesting) than this. As drawn it can't have occurred this way by
either
(i) loss of telomeres followed by chromosome fusion or by
(ii) a Robertsonian translocation because in either case the
telomere DNA would be lost.
For details see:
http://genome.cshlp.org/content/22/6/1036.full
(Flickr- T. Michael Keesey- CC BY 2.0)

INTRODUCTION
Previous chapters described chromosomes as simple linear DNA molecules on which genes are located. For example, your largest chromosome, chromosome 1, has about 3536 genes. To ensure that each of your cells possesses these genes, the typical linear eukaryotic chromosome has three critical features that allow it to be passed on during cell division. (1) Origins of replication found along its length provide places for DNA replication to start, (2) telomeres protect each end of the chromosome, and (3) a single centromere near the middle provides a place for microtubules to attach and move the chromosome during mitosis and meiosis.
However, at various locations both strands of the double stranded DNA in a chromosome can break and the subsequent daughter cell(s) may not retain all the DNA and thus all the genes. For example, if a segment of the chromosome has been lost (a deletion), the cell may be missing many genes. The causes of chromosome structural abnormalities and the consequences they have for the cell and the organism are described below. They involve double stranded breaks in the DNA, meiotic crossover events, and rejoining of the broken ends. Human examples will be used to show the phenotypic consequences and methods for detection.

1. DNA DOUBLE STRAND BREAKS AND INCORRECT MEIOTIC CROSSOVERS CAUSE CHROMOSOMAL REARRANGEMENTS

1.1. DOUBLE STRAND BREAKS AND THEIR REPAIR
A chromosome is a very long but very thin molecule. In the phosphodiester backbone there are only two covalent bonds holding each base pair to the next. If one of these covalent bonds is


Figure 2.
Repair of single strand nicks and double strand breaks in
DNA. (Original-Harrington-CC BY-NC 3.0)

broken the chromosome will still remain intact, although a DNA Ligase will be needed to repair the nick (Figure 2a). Problems arise when both strands are broken at or near the same location. This double strand break will cleave the chromosome into two independent pieces (Figure 2b). Because these events do occur in cells there is a repair system called the non-homologous end joining (NHEJ) system to fix them. Proteins bind to each broken end of the DNA and reattach them with new covalent bonds. This system is not perfect and sometimes leads to chromosome rearrangements (see next section).

The NHEJ system proteins only function if required. If the chromosomes within an interphase nucleus are all intact the system is not active. The telomeres at the natural ends of chromosomes prevent the NHEJ system from attempting to join the normal ends of chromosomes together. If there is one double strand break the two broken ends can be recognized and joined. But if there are two double strand breaks at the same time there will be four broken ends in total. The NHEJ system proteins may join the ends together correctly, but if they fail, the result is a chromosome rearrangement (Figure 3).

Figure 3.
Errors during DNA repair can cause a chromosome deletion. In this diagram A, B, and C are genes on the same chromosome. As in Figure 2 there have been breaks in the DNA, recruitment of NHEJ proteins, and repair. After the repairs are completed the small piece of DNA with gene B is lost and the chromosome now only has genes A and C. (Original-Harrington-CC BY-NC 3.0)

1.2. INCORRECT MEIOTIC CROSSOVERS
Meiotic crossovers occur at the beginning of meiosis for two reasons. They help hold the homologous chromosomes together until separation occurs during anaphase I (see Chapter 16). They also allow recombination to occur between linked genes (see Chapter 17). The event itself takes place during prophase I when a double strand break on one piece of DNA is joined with a double strand break on another piece of DNA and the ends are put together (Figure 4a). Most of the time the breaks are on non-sister chromatids and most of the time the breaks are at the same relative locations.
Problems occur when the wrong pieces of DNA are matched up along the chromosomes during crossover events. This can happen if the same or similar DNA sequence is found at multiple sites on the chromosomes (Figure 4b). For example, suppose there are two Alu transposable elements on a chromosome. When the homologous chromosomes pair during prophase I, the wrong Alu sequences might line up. A crossover may occur in this region. If so, when the chromosomes


separate during anaphase I one of the chromatids will have a duplication and one will have a deletion. Ultimately, of the four cells produced by this meiosis, two will be normal, one will have a chromosome with extra genes, and one will have a chromosome missing some genes. Errors of this type can also cause inversions and translocations.

Figure 4.
Errors during meiotic crossovers can cause duplications and deletions. This diagram shows homologous chromosomes pairing in prophase I and then separating in anaphase I. The shaded boxes are Alu transposable elements. a) The homologous chromosomes pair properly, a crossover occurs, and all four chromatids in anaphase I are normal. b) The pairing is incorrect, a crossover occurs in the mispaired region, and in anaphase I one chromatid has a duplication and another has a deletion.
(Original-Harrington/L. Canham-CC BY-NC 3.0)

Errors during the repair of multiple double strand breaks or incorrect meiotic crossovers can cause four types of chromosome rearrangements: deletion, inversion, duplication or translocation. The type of chromosome rearrangement is either dependent upon where the two breaks were originally and how they are rejoined, or on the location of the homology during meiosis. Figure 3 shows some possibilities but more are shown in the following sections. The first part of each section shows a double strand DNA break between the B and C genes (shown here as a red X). A second DNA break occurs and the NHEJ proteins mend the damage incorrectly by joining the ends (shown with the blue arrows). The chromosomes are drawn unreplicated as they are in G1 phase but these events can happen anytime during interphase. The second part shows how meiosis can cause the rearrangements.

2. DELETIONS
There are two forms of deletions: terminal and interstitial. Terminal deletions are deletions from the end of a chromosome. Interstitial deletions are deletions of a region in the middle of the chromosome, while the arms on each side remain normal. For example, with a chromosome that has the genes ABCDEF, an example of a terminal deletion will be CDEF. An example of an interstitial deletion will be ABCF.

2.1. DELETIONS FROM DOUBLE STRAND BREAK REPAIR
Deletions arise from double strand breaks when both breaks are on one chromosome. If the ends are joined in this way the piece of DNA with the B gene on it does not have a centromere and will be lost during the next cell division.

Figure 5.
Deletion can result from double strand break repair.
(Original-Harrington- CC BY-NC 3.0)

2.2. DELETIONS FROM INCORRECT MEIOSIS
If meiotic rearrangement is the cause, deletion chromosomes will pair up with a normal homolog along the shared regions and, at the missing segment, the normal homolog will loop out (it has nothing to pair with) to form a deletion loop. This can be used to locate the deletion cytologically. The deleted region is also pseudo-dominant, in that it permits the mutant expression of recessive alleles on the normal homolog. Deletion mutations don’t revert because there is nothing to replace the missing DNA.

3. INVERSIONS

3.1. INVERSIONS FROM DOUBLE STRAND BREAKS
Inversions also occur when both double strand breaks are on one chromosome. If the ends are joined in this way, part of the chromosome is inverted. This example shows a paracentric inversion, named because the inverted section


does not include the centromere (para = beside). If the breaks occur on different chromosome arms the inverted section includes the centromere and the result is a pericentric inversion (peri = around).

Figure 6.
Inversion can result from double strand break repair.
(Original-Harrington- CC BY-NC 3.0)

3.2. INVERSIONS FROM INCORRECT MEIOSIS
In meiosis, when an inversion chromosome is paired up there is an inversion loop formed. If there is a crossover within the loop then abnormal products will result and abnormal, unbalanced gametes will be produced. For example, a crossover event within the loop of a paracentric inversion will lead to a dicentric product that will break into deletion products and produce unbalanced gametes (Figure 7). Similarly, with a pericentric inversion, a crossover event leads to duplicate/deletion products that are unbalanced (Figure 8). If joined with a normal gamete, they will result in an unbalanced zygote, which is usually lethal. The consequence of this is that crossover products (recombinants) are lost and thus inversions appear to suppress crossovers within the inverted region.
Note: with both types of inversions, crossovers outside the loop are possible and fully viable, as they don’t alter the gene balance.

Figure 7.
A paracentric inversion pairing at meiosis. A crossover within the loop causes the production of an acentric and a dicentric chromatid, which leads to deletion products.
(Original-Locke-CC BY-NC 3.0)

4. DUPLICATIONS
There are two major forms of duplications: tandem and inverse duplications. Tandem duplications are when the duplicated genes are in the same order, and inverse duplications are where the duplicated genes are in the reverse order. For example, if you have a chromosome that has the genes ABCDEFGH, and a duplication occurs in the BCD genes, then a tandem duplication would look like: ABCDBCDEFGH. An inverse duplication would look like: ABCDDCBEFGH.

Insertional duplications are also seen, where the duplicated region is inserted at a more distant location, e.g. ABCDEFBCDGH.
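The gene-order examples given for deletions, inversions and duplications can be reproduced by treating a chromosome as a string of one-letter gene names. This is only an illustrative sketch under that assumption: the function names and the index-based interface are hypothetical, and real rearrangements involve breakpoints in DNA rather than whole-gene boundaries.

```python
def delete(chromosome, start, end):
    """Remove the genes at positions start..end-1 (terminal if at an end, otherwise interstitial)."""
    return chromosome[:start] + chromosome[end:]


def invert(chromosome, start, end):
    """Reverse the order of the genes at positions start..end-1."""
    return chromosome[:start] + chromosome[start:end][::-1] + chromosome[end:]


def duplicate(chromosome, start, end, inverse=False, insert_at=None):
    """Copy the genes at positions start..end-1.

    By default the copy is placed directly after the original segment (tandem,
    or inverse if inverse=True). Passing insert_at places the copy at a more
    distant position instead (insertional duplication).
    """
    segment = chromosome[start:end]
    copy = segment[::-1] if inverse else segment
    position = end if insert_at is None else insert_at
    return chromosome[:position] + copy + chromosome[position:]


print(delete("ABCDEF", 0, 2))                     # terminal deletion
print(delete("ABCDEF", 3, 5))                     # interstitial deletion
print(invert("ABCDEF", 1, 4))                     # inversion of the BCD segment
print(duplicate("ABCDEFGH", 1, 4))                # tandem duplication of BCD
print(duplicate("ABCDEFGH", 1, 4, inverse=True))  # inverse duplication of BCD
print(duplicate("ABCDEFGH", 1, 4, insert_at=6))   # insertional duplication of BCD
```

The string model ignores the centromere, so it cannot distinguish paracentric from pericentric inversions; it only captures the change in gene order and gene number.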


Figure 8.
A pericentric inversion
pairing at meiosis. A
crossover within the loop
causes the production of
duplicate and deletion
products.
(Original-Locke-CC BY-NC
3.0)

4.1. DUPLICATIONS FROM DOUBLE STRAND BREAKS
Duplications can occur from two DNA breaks at different places in sister chromatids (in a replicated chromosome). The ends are joined together incorrectly to create a chromosome with a duplication (two “B” regions as shown). Note: the reciprocal product has a deletion.

Figure 9.
Duplication can result from double strand break repair.
(Original-Harrington- CC BY-NC 3.0)

4.2. DUPLICATIONS FROM INCORRECT MEIOSIS
Duplications also produce a cytologically visible loop at meiotic pairing. Duplications can revert at a relatively high frequency by unequal crossing over. Duplicated genes offer new possibilities for mutational divergence followed by natural selection in the course of evolution.

5. TRANSLOCATIONS

5.1. TRANSLOCATIONS FROM DOUBLE STRAND BREAKS
Translocations result from two breaks on different chromosomes (not homologs) and incorrect rejoining. This example shows a reciprocal translocation - two chromosomes have 'swapped' arms, the E gene is now part of the white chromosome and the C gene is now part of the shaded chromosome. Robertsonian translocations are those rare situations in which all the genes end up together on one chromosome and the other chromosome is so small that it is typically lost.

Figure 10.
Translocation can result from double strand break repair.
(Original-Harrington- CC BY-NC 3.0)


Figure 11.
A reciprocal translocation
pairing at meiosis. There
are two main avenues for
segregation: Adjacent-1
and Alternate. Adjacent-1
results in duplication and
deletion for part of the
chromosome segments.
Alternate doesn’t.
(Original-Locke-CC BY-NC
3.0)
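The segregation patterns named in Figure 11 and discussed in section 5.2 can be checked by counting chromosome segments in each gamete. The sketch below rests on stated assumptions: the segment labels A to D are hypothetical, N1 is taken to carry A and B and N2 to carry C and D, and the translocation is assumed to have exchanged B and D, so that T1 carries A and D and T2 carries C and B. A gamete is called balanced when it holds exactly one copy of every segment.

```python
from collections import Counter

# Hypothetical segment content of the two normal (N) and two translocated (T) chromosomes.
CHROMOSOMES = {"N1": "AB", "N2": "CD", "T1": "AD", "T2": "CB"}

# Which chromosomes travel together to each pole, following the pairings in the text.
SEGREGATIONS = {
    "Alternate": (("N1", "N2"), ("T1", "T2")),
    "Adjacent-1": (("N1", "T2"), ("N2", "T1")),
    "Adjacent-2": (("N1", "T1"), ("N2", "T2")),
}


def segment_counts(gamete):
    """Count how many copies of each segment a gamete receives."""
    return Counter("".join(CHROMOSOMES[name] for name in gamete))


def is_balanced(gamete):
    """A balanced gamete has exactly one copy of each of the four segments."""
    counts = segment_counts(gamete)
    return all(counts[segment] == 1 for segment in "ABCD")


for pattern, poles in SEGREGATIONS.items():
    for gamete in poles:
        status = "balanced" if is_balanced(gamete) else "unbalanced"
        print(pattern, "+".join(gamete), dict(segment_counts(gamete)), status)
```

Under these assumptions only Alternate segregation gives balanced gametes, while both Adjacent patterns give one duplicated and one missing segment per gamete.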

5.2. TRANSLOCATIONS FROM INCORRECT MEIOSIS
For translocations during meiosis, a consequence for the two chromosomes involved is that when they pair both replicated chromosome pairs will be together, which can be seen cytologically as a tetrad. This tetrad can segregate in three ways. This set of paired, replicated chromosomes can segregate as Alternate (balanced) where both normal (N1 and N2) and both translocated chromosomes (T1 and T2) go to the same poles, respectively. The chromosomes can segregate as Adjacent-1 (unbalanced) where the normal and translocation chromosomes segregate together, with N2 and T1 segregating from N1 and T2. Alternate and Adjacent-1 both occur at approximately equal frequency and thus only about half the time do the gametes end up unbalanced (Figure 11). Note how each daughter cell in Alternate has equal amounts of blue and black chromosomes, while in Adjacent-1 one daughter has extra black chromosomes, and the other has extra blue.
The third segregation possibility is known as Adjacent-2, where N1 and T1 go to one pole, while N2 and T2 go to the other. This way of segregating is extremely rare, and so will not be described in any further detail.

6. CONSEQUENCES OF CHROMOSOMAL REARRANGEMENTS

6.1. DECREASED VIABILITY
All the chromosome rearrangements shown above produce functional chromosomes. Each has one centromere, two telomeres, and thousands of origins of replication. Because inversions and translocations do not change the number of genes in a cell or organism they are said to be balanced rearrangements. Unless one of the breakpoints occurred in the middle of a gene the cells will not be affected. On the other hand, deletions and duplications are unbalanced rearrangements. The larger they are (more genes involved) the more disruption they cause to the proper functioning of the cell or organism. Having too much or too little gene action for a large number of genes can disrupt


the cellular metabolism to generate a phenotype or reduce viability.

6.2. DECREASED FERTILITY
Recall that during meiosis I homologous chromosomes pair up. If a cell has a chromosome with a rearrangement this chromosome will have to pair with its normal homolog.

Cells heterozygous for balanced rearrangements actually have more difficulties in prophase I. Consider the chromosomes shown in Figure 12. There are different ways they might pair during prophase I - one is shown in Figure 13. But if a crossover occurs in the inverted region the result will be unbalanced gametes. Embryos made with unbalanced gametes rarely survive. The consequence is that the heterozygous organism will have reduced fertility.

Note that an organism homozygous for this inversion chromosome will not be affected in this way because no loops are formed. The chromosomes can pair along their entire length and crossovers will not produce any unbalanced gametes. This is a general property of inversions and translocations.

In heterozygotes there are problems during meiosis resulting in a lot of the gametes being unbalanced and an overall reduction in fertility. In homozygotes the rearranged chromosomes pair with one another just fine and there is no effect on fertility.

6.3. CANCER
Some chromosome rearrangements have breakpoints within genes, leading to the creation of hybrid genes: the first part of one gene with the last part of another. If the hybrid gene inappropriately promotes cell replication, the cell can become cancerous.

6.4. EVOLUTION
Those chromosome changes that duplicate genes are important for evolution. If an organism has an extra copy of important genes, one copy can be retained for its original function while the others can mutate and potentially acquire new functions (Figure 14). An example of this is the multiple copies of the globin genes found in mammals.

Figure 12.
A normally arranged chromosome (left) and a homolog with a pericentric inversion (right).
(Original-Harrington/Canham-CC BY-NC 3.0)

Figure 13.
Meiosis in a cell heterozygous for the chromosomes shown in Figure 12. Note that of the four gametes one has a deletion of the A gene and a duplication of the D gene while another gamete has a duplication of A and a deletion of D.
(Original-Harrington/Canham-CC BY-NC 3.0)


Figure 14.
Duplicated genes can mutate without compromising the viability of the organism. Occasionally the result is a new gene. (Original-Harrington-CC BY-NC 3.0)

Chromosome rearrangements that decrease fertility are also important for the origin of new species. If a rearrangement, such as the inversion shown in Figure 12, becomes common in a small isolated population, that population has 100% fertility if they mate within their group, but a reduced fertility if they mate with members of the larger population. As rearrangements accumulate the small population will become more and more reproductively isolated. When members are incapable of forming viable, fertile offspring with the original population the group will have become a new species.

Another example is shown in Figure 1, where the human chromosome 2 is a fusion of two chromosomes present in the common ancestor of humans and other great apes (chimpanzee, gorilla, orangutan). We do not know exactly when in human history (evolution) this fusion event occurred, except that, because it is absent in all other apes and present in all current humans, it must have occurred after the split between chimpanzees and humans.

7. CHROMOSOMAL REARRANGEMENTS IN HUMANS
The problems described above can affect all eukaryotes, unicellular and multicellular. To better understand the consequences let us consider those that affect people. The convention when describing a person's karyotype (chromosome composition) is to list the total number of chromosomes, then the sex chromosomes, and then anything out of the ordinary. Most of us are 46,XX or 46,XY. What follows are some examples of chromosome number and chromosome structure abnormalities.

7.1. CRI-DU-CHAT SYNDROME
Cri-du-chat syndrome occurs when a child inherits a defective chromosome 5 from one parent (Figure 15). This condition is rare - it is present in only 1 in 20 000 to 1 in 50 000 births but it does account for 1% of cases of profound intellectual disability. The specific defect is a deletion that removes 2 Mb or more from the tip of the short arm of the chromosome. In most cases the deletion is the result of a chromosomal rearrangement in one of the parent's germ line cells. People with cri-du-chat have a karyotype of 46,sex,deletion(5).

Figure 15.
A boy with cri-du-chat syndrome. The pictures were taken at 8 months (A), 2 years (B), 4 years (C), and 9 years (D).
(Wikipedia-Paola Cerruti Mainardi/ changes: horizontally aligned the photos- CC BY 2.0)

As with Down syndrome this condition is associated with intellectual disability and other health problems. These problems include an improperly formed larynx which leads to infants making high pitched cat-like crying sounds (hence the name "cry of the cat"). It is suspected that at least some of the intellectual disability phenotype is due to having only a single copy of the CTNND2 gene. This gene is active during embryogenesis and makes a protein essential for neuron migration. Down syndrome and cri-du-chat syndrome are two


examples of the need for genomes to contain the proper number of genes. Having too many copies of key genes (Down syndrome) or too few (cri-du-chat syndrome) can lead to substantial developmental problems.

7.2. INVERSION(9)
The most common chromosome rearrangements in humans are inversions of chromosome 9. About 2% of the world's population is heterozygous or homozygous for inversion(9). This rearrangement does not affect a person's health because the genes on the chromosome are all present - all that has changed is their relative locations. Inversion(9) is different from deletion(5) in two main respects. As mentioned above, because it is a balanced rearrangement it does not cause harm. And because of this nearly everyone with an inversion(9) chromosome has inherited it from a parent who had inherited it from one of his or her parents and so on. In contrast, most cases of deletion(5) are due to new mutations occurring in a parent.

7.3. DIAGNOSING HUMAN CHROMOSOME ABNORMALITIES
How can we confirm that a person has a specific chromosomal abnormality? The first method was simply to obtain a sample of their cells, stain the chromosomes with Giemsa dye, and examine the results with a light microscope (Figure 16). Each chromosome can be recognized by its length, the location of its centromere, and the characteristic pattern of purple bands produced by the Giemsa. Bright field microscopy has its limitations though - it only works with mitotic chromosomes and many chromosome rearrangements are either too subtle or too complex for even a skilled cytogeneticist to discern.

Figure 16.
Human chromosomes. One way to obtain chromosomes is to take a blood sample, culture the cells for three days in the presence of a T-cell growth factor, arrest the cells in metaphase with a microtubule inhibitor, and then drop the cells onto a slide. The cells burst and the chromosomes stick to the slide. The chromosomes can then be stained or probed. Because the cells are in metaphase it is possible to see 46 replicated chromosomes here. There will be dozens of collections of chromosomes like this over the entire slide.
(Wikipedia-Steffen Dietzel- CC BY-SA 3.0)

The solution to these problems was fluorescence in situ hybridization (FISH). A single stranded fluorescent DNA probe is allowed to hybridize to denatured target DNA. Because there are several fluorescent colours available it is common to use more than one probe at the same time. A more detailed explanation of the FISH technique can be found in Chapter 32.
A physician may suspect that a patient has a specific genetic condition based upon the patient's physical appearance, mental abilities, health problems, and other factors. FISH can be used to confirm the diagnosis. For example, Figure 17 shows a positive result for cri-du-chat syndrome. The probes are binding to two long arms of chromosome 5 but only one short arm. One of the chromosome 5s must therefore be missing part of its short arm.
FISH is an elegant technique that produces dramatic images of our chromosomes. Unfortunately, FISH is also expensive, time consuming, and requires a high degree of skill. For these reasons, FISH is slowly being replaced with PCR and DNA chip based methods. Versions of these techniques have been developed that can accurately quantify a person's DNA. For example, DNA from a person with cri-du-chat syndrome will contain 50% less DNA from the end of chromosome 5. These techniques are very useful if the suspected abnormality is a deletion, a duplication, or a change


in chromosome number. They are less useful for diagnosing chromosome inversions and translocations because these rearrangements often involve no net loss or gain of genes.
In the future, all of these techniques will likely be replaced with DNA sequencing. Each new generation of genome sequencing machines can sequence more DNA in less time. Eventually it will be cheaper just to sequence a patient's entire genome than to use FISH or PCR to test for specific chromosome defects. More details on DNA sequencing can be found in Chapter 33.
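The dosage reasoning behind the PCR and DNA chip based methods can be sketched as a simple ratio. This is an illustration only, not a description of any particular assay: the function names, the signal values and the tolerance of 0.3 copies are hypothetical, and a real test would need careful normalization and controls. The idea is that the signal from a test region is compared with the signal from a reference region known to be present in two copies.

```python
def copy_number(sample_signal, reference_signal, reference_copies=2):
    """Estimate copies of a region from its signal relative to a two-copy reference region."""
    return reference_copies * sample_signal / reference_signal


def classify(copies, tolerance=0.3):
    """Interpret an estimated copy number for an autosomal region in a diploid cell."""
    if copies < 2 - tolerance:
        return "loss (deletion or missing chromosome)"
    if copies > 2 + tolerance:
        return "gain (duplication or extra chromosome)"
    return "two copies (no net change)"


# Hypothetical signals: the tip of the short arm of chromosome 5 gives half
# the signal of the reference region, as expected for deletion(5).
tip_of_5p = copy_number(sample_signal=50.0, reference_signal=100.0)
print(tip_of_5p, classify(tip_of_5p))

# A balanced rearrangement such as inversion(9) changes gene order, not dosage,
# so a dosage-based test reports two copies and does not detect it.
region_on_9 = copy_number(sample_signal=100.0, reference_signal=100.0)
print(region_on_9, classify(region_on_9))
```

This also shows why such methods are less useful for inversions and translocations: with no net loss or gain of DNA, the ratio stays at two copies.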

Figure 17.
A positive result for cri-du-chat syndrome. This diagram is based upon actual results. Cells from a patient's blood were prepared to show an interphase nucleus (a) and mitotic chromosomes (b). The DNA has been coloured blue with DAPI. The green fluorescent probe is binding to the tip of the short arm of chromosome 5 (shown here as open circles). This is the region absent in cri-du-chat. The red fluorescent probe is binding to the middle of the long arm of the same chromosome (filled circles). This probe is used as a control.
(Original-Harrington- CC BY-NC 3.0)


___________________________________________________________________________
SUMMARY:
• Deletion(5) causes a serious condition (cri-du-chat syndrome) because deletions are unbalanced
chromosome rearrangements.
• Inversion(9) causes few health consequences because inversions are balanced chromosome
rearrangements.
• Bright field microscopy can be used to detect chromosome number abnormalities and some
chromosome rearrangements.
• Fluorescence in situ hybridization can be used to detect all types of chromosome abnormalities.
• PCR and DNA chip based techniques can be used to detect chromosome number abnormalities,
deletions, and duplications.
KEY TERMS:
origin of replication inverse duplication
telomere insertional duplication
centromere duplication
double strand break translocation
non-homologous end joining reciprocal translocation
chromosome rearrangement Robertsonian translocation
meiotic crossover Tetrad
Alu transposable elements Alternate (balanced)
terminal deletion Adjacent-1 (unbalanced)
interstitial deletion reduced fertility
deletion karyotype
deletion loop 46,sex,deletion(5)
pseudo-dominant (cri-du-chat syndrome)
inversion 46,sex,inversion(9)
paracentric inversion bright field microscopy
pericentric inversion Giemsa stain
inversion loop fluorescence in situ hybridization
tandem duplication fluorescent DNA probe


STUDY QUESTIONS:
1) Make diagrams showing how an improper
crossover event during meiosis can lead to:
a) an inversion
b) a translocation.
2) If Drosophila geneticists want to generate
mutant strains with deletion mutations, they
expose flies to gamma rays. What does this
imply about gamma rays?
3) Design a FISH based experiment to find out if
someone is a 47,XXX female or a 47,XYY male.


CHAPTER 25 – CHANGES IN CHROMOSOME NUMBER

Figure 1.
Xenopus laevis, and other species in the Xenopus genus, are among the few animals that are polyploid. X. laevis is tetraploid (4n), but other species can be up to dodecaploid (12n).
(Wikimedia Commons-P.Narbonne, D.Simpson, J.Gurdon- CC BY 2.5)

INTRODUCTION
So far in this textbook, we have talked about cells and organisms that are haploid and diploid. Having the appropriate number of chromosomes is important for allowing mitosis and meiosis to occur. Having too many or too few individual chromosomes, or whole sets of chromosomes, can lead to cell replication or fertility problems. In this chapter we will discuss the repercussions of having too many or too few individual chromosomes, known as aneuploidy, or having multiples of whole chromosome sets, known as polyploidy.

Most organisms of all kingdoms are haploid or diploid. Occasionally though, particularly in plants, you will see chromosome sets higher than diploid. This is known as polyploidy. When coming from a typically diploid plant, and increasing the ploidy in even numbers, the resulting plant is typically healthy, and often produces larger fruits. However, increasing to an odd number makes gamete production difficult and often leads to infertility (seedless varieties).
As opposed to polyploidy, where the plant is often healthy, aneuploid plants and animals (losses or multiples of individual chromosomes) often see more deleterious problems. Having the correct expression levels of genes is important for the function of the organism. Since chromosomes have large numbers of genes on them, missing or gaining whole chromosomes can cause more serious gene dosage problems. Aneuploidy is caused through incorrect segregation in meiosis or mitosis, and if there are living organisms with aneuploidy, they often have difficulty with meiosis or mitosis as well.

1. PLOIDY NOTATION

1.1. NOTATION OF DNA CONTENT AND CHROMOSOME CONTENT IN DIPLOID ORGANISMS
The amount of DNA within a cell changes following each of the following events: fertilization, DNA synthesis, mitosis, and meiosis (Figure 2). We use “c” to represent the DNA Content in a cell, and “n” to represent the Number of complete sets of chromosomes. In a gamete (i.e. sperm or egg), the amount of DNA is 1c, and the number of chromosomes is 1n. Upon fertilization, both the DNA content and the number of chromosomes double to 2c and 2n, respectively. Following DNA replication, the DNA content doubles again to 4c, but each pair of sister chromatids is still counted as a single chromosome (a replicated chromosome),

so the number of chromosomes remains unchanged at 2n. If the cell undergoes mitosis, each daughter cell will return to 2c and 2n, because it will receive half of the DNA, and one of each pair of sister chromatids. In contrast, the 4 cells that come from meiosis of a 2n, 4c cell are each 1c and 1n, since each pair of sister chromatids, and each pair of homologous chromosomes, divides during meiosis.

N and C values were introduced in Chapter 14.

Figure 2.
Changes in DNA and chromosome content during the cell cycle. For simplicity, nuclear membranes are not shown, and all chromosomes are represented in a similar stage of condensation.
(Original-Deyholos-CC BY-NC 3.0)

1.2. NOTATION IN POLYPLOID ORGANISMS
When describing polyploids, we use the letter “x” (not “n”) to define the level of ploidy. A diploid is 2x, because there are two basic sets of chromosomes, and a tetraploid is 4x, because it contains four chromosome sets. For clarity when discussing polyploids, geneticists will often combine the “x” notation with the “n” notation defined earlier in this chapter. Thus, for both diploids and polyploids, “n” is the number of chromosomes in a gamete, and “2n” is the number of chromosomes following fertilization. For a diploid, therefore, n=x, and 2n=2x. But for a tetraploid, n=2x, and 2n=4x, and for a hexaploid, n=3x, and 2n=6x (Figure 3).

Figure 3.
Here is an example of a diploid (2n=2x), tetraploid (2n=4x) and hexaploid (2n=6x) cell. Each cell has two types of chromosome: a long acrocentric chromosome and a short metacentric chromosome. Diploids have 2 copies of each chromosome, tetraploids have 4 and hexaploids have 6.
(Original-L. Canham-CC BY-NC 3.0)

2. POLYPLOIDY

2.1. POLYPLOIDY FROM CHANGES IN WHOLE SETS OF CHROMOSOMES
Humans, like most animals and most eukaryotic genetic model organisms, have two copies of each autosome. This situation is called diploidy. This means that most of their cells have two homologous copies of each chromosome. In contrast, many plant species and even a few animal species are polyploids. This means they have more than two chromosome sets, and so have more than two homologs of each chromosome in each cell.

When the nuclear content changes by a whole chromosome set we call it a change in ploidy. Gametes are haploid (1n) and thus most animals are diploid (2n) (Figure 2), formed by the fusion of two haploid gametes. However, some species can exist as monoploid (1x), triploid (3x), tetraploid (4x), pentaploid (5x), hexaploid (6x), or higher.

2.2. POLYPLOIDS CAN BE STABLE OR STERILE
Like diploids (2n=2x), stable polyploids generally have an even number of copies of each
chromosome: tetraploid (2n=4x), hexaploid (2n=6x), and so on. The reason for this is clear from a consideration of meiosis. Remembering that the purpose of meiosis is to reduce the sum of the genetic material by half, meiosis can equally divide an even number of chromosome sets, but not an odd number. Thus, polyploids with an odd number of chromosome sets (e.g. triploids, 2n=3x) tend to be sterile, even if they are otherwise healthy.

The mechanism of meiosis in stable polyploids is essentially the same as in diploids: during metaphase I, homologous chromosomes pair with each other. Depending on the species, all of the homologs may be aligned together at metaphase, or in multiple separate pairs. For example, in a tetraploid, some species may form tetravalents in which the four homologs of each chromosome align together, or alternatively, two pairs of homologs may form two bivalents. Note that because mitosis does not involve any pairing of homologous chromosomes, mitosis is equally effective in diploids, even-number polyploids, and odd-number polyploids.

2.3. MANY CROP PLANTS ARE HEXAPLOID OR OCTOPLOID
Polyploid plants tend to be larger and healthier than their diploid counterparts. The strawberries sold in grocery stores come from octoploid (8x) strains and are much larger than the strawberries formed by wild diploid strains. Another example is bread wheat, which is a hexaploid (6x) strain (Figure 4). This species is derived from the combination of three other wheat species, T. monococcum (chromosome sets = AA), T. searsii (BB), and T. tauschii (DD). Each of these chromosome sets has 7 chromosomes, so the diploid species are 2n=2x=14 and bread wheat is 2n=6x=42 and has the chromosome sets AABBDD. Bread wheat is viable because each chromosome behaves independently during mitosis. The species is also fertile because during meiosis I the A chromosomes pair with the other A chromosomes, and so on. Thus, even in a polyploid, homologous chromosomes can segregate equally and gene balance can be maintained.
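
The x, n and 2n notation used in this chapter can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, not part of the original text. The function name `ploidy_summary` and its parameter names are hypothetical, and the "sets divide evenly" flag is only the simplified even/odd rule described above, not a full model of fertility.

```python
def ploidy_summary(x: int, ploidy: int) -> dict:
    """Summarize chromosome counts for an organism.

    x      -- chromosomes in one basic set (hypothetical parameter name)
    ploidy -- number of basic sets in a somatic cell (2 = diploid, 6 = hexaploid)
    """
    if x < 1 or ploidy < 2:
        raise ValueError("expected x >= 1 and ploidy >= 2")
    somatic = ploidy * x        # written "2n" in the chapter's notation
    even = ploidy % 2 == 0      # whole sets can be split into two equal halves
    return {
        "notation": f"2n={ploidy}x={somatic}",
        "somatic_chromosomes": somatic,
        # Gametes carry half of the sets; left undefined when the sets cannot be halved.
        "gamete_chromosomes": somatic // 2 if even else None,
        "sets_divide_evenly": even,
    }


if __name__ == "__main__":
    print(ploidy_summary(7, 2))  # a diploid wheat species
    print(ploidy_summary(7, 6))  # bread wheat
    print(ploidy_summary(3, 3))  # a triploid with three chromosomes per set
```

With seven chromosomes per set, the sketch reproduces the counts given for wheat: 14 chromosomes in a diploid species and 42 in hexaploid bread wheat, whose gametes carry 21.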

Figure 4.
Modern bread wheat is hexaploid, but has
been developed from natural cross breeding
between diploid and tetraploid ancestors.
Meiosis still properly occurs, because the
chromosomes from the individual ancestors
still pair together during metaphase, as is
shown with the cartoon chromosomes
below.
(Original-J. Locke-PD)
Wheat: (Wikipedia- Marknesbitt- PD)

2.4. BANANAS, WATERMELONS, AND OTHER SEEDLESS PLANTS ARE TRIPLOID
The bananas found in grocery stores are a seedless variety called Cavendish. They are a triploid variety (chromosome sets = AAA) of a normally diploid species called Musa acuminata (AA). Cavendish plants are viable because mitosis can occur. However, they are sterile because the chromosomes cannot pair properly during meiosis I. During prophase I there are three copies of each chromosome trying to “pair” with each other. Because proper chromosome segregation in meiosis fails, seeds cannot be made and the result is a fruit that is easier to eat because there are no seeds to spit out. Seedless watermelons (Figure 5) have a similar explanation.

Figure 5.
Seedless watermelon is triploid, with white, aborted seeds within the flesh.
(Flickr- justmakeit- CC BY-NC 2.0)

If triploids cannot make seeds, how do we obtain enough triploid individuals for cultivation? The answer depends on the plant species involved. In some cases, such as banana, it is possible to propagate the plant asexually; new progeny can simply be grown from cuttings from a triploid plant. On the other hand, seeds for seedless watermelon are produced sexually: a tetraploid watermelon plant is crossed with a diploid watermelon plant. Both the tetraploid and the diploid are fully fertile, and produce gametes with two (1n=2x) or one (1n=1x) sets of chromosomes, respectively. These gametes fuse to produce a zygote (2n=3x) that is able to develop normally into an adult plant through multiple rounds of mitosis, but is unable to complete normal meiosis or produce seeds.

Polyploids are often larger in size than their diploid relatives (Figure 6). This feature is used extensively in food plants. For example, most strawberries you eat are not diploid, but octoploid (8x).

Figure 6.
Polyploidy in strawberries. The sweet, flavorful, wild diploid is on the left, while the huge, cultivated octoploid is on the right.
(Wikipedia- Left: Ivar Leidus/Right: David Monniaux - CC BY-SA 3.0)

Polyploidy in animals is rare, essentially limited to lower forms, which often reproduce by parthenogenesis.

2.5. MALE BEES ARE MONOPLOID
Monoploids, with only one chromosome set, are inviable in most species. However, in many species of Hymenoptera (bees, wasps, ants) the males are monoploid and develop from unfertilized eggs. These males do not undergo meiosis to make gametes; mitosis produces sperm. Females are diploid (from fertilized eggs) and produce eggs via meiosis. This is the basis for the haploid-diploid sex determination system (not the X/Y chromosome system). Female bees are diploid (2n=32) and are formed when an egg (n=16) is fertilized by a sperm (n=16). If an egg isn't fertilized it can still develop and the result is an n=16 male drone. Males are described as haploid (because they have the same number of chromosomes as a gamete) or monoploid (because they have only one chromosome set). This form of sex determination produces more females (workers, which do the work; Figure 7) than males, which are only needed for reproduction.
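
The haploid-diploid system described for bees can be written out as simple bookkeeping. This is an illustrative sketch only; the function names `bee_zygote` and `gamete_chromosomes` are hypothetical, and the only biological input is the honey bee set size of 16 chromosomes given in the text.

```python
SET_SIZE = 16  # chromosomes in one honey bee chromosome set (n=16 in the text)


def bee_zygote(egg_fertilized: bool) -> dict:
    """Return the sex and chromosome count that result from one egg."""
    egg = SET_SIZE
    sperm = SET_SIZE if egg_fertilized else 0
    return {
        "sex": "female" if egg_fertilized else "male",
        "chromosomes": egg + sperm,
    }


def gamete_chromosomes(parent: dict) -> int:
    """Females halve their count by meiosis; males copy theirs by mitosis."""
    if parent["sex"] == "female":
        return parent["chromosomes"] // 2
    return parent["chromosomes"]


if __name__ == "__main__":
    for fertilized in (True, False):
        bee = bee_zygote(fertilized)
        print(bee, "-> gametes with", gamete_chromosomes(bee), "chromosomes")
```

Both sexes end up producing gametes with one chromosome set, even though they reach that number by different kinds of cell division.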

Figure 7.
In Hymenoptera, males are haploid and females are diploid.
Bee: (Sharp Photography – Charlesjsharp- CC BY-SA 3.0)
(Original-L. Canham- CC BY-SA 3.0)

3. ENDOREDUPLICATION
Endoreduplication is a special type of tissue-specific genome amplification that occurs in many types of plant cells and in specialized cells of some animals including humans. Endoreduplication does not affect the germline or gametes, so species with endoreduplication are not considered polyploids. Endoreduplication occurs when a cell undergoes extra rounds of DNA synthesis (S-phase) without any mitosis or cytokinesis to produce an endopolyploid cell. This produces multiple chromatids of each chromosome. Endopolyploidy seems to be associated with cells that are metabolically very active, and produce a lot of enzymes and other proteins in a short period of time. An example is the highly endoreduplicated salivary gland polytene chromosomes of D. melanogaster (Figure 8), which can have over 1,000 chromatids that align together and form giant chromosomes. These show a banding pattern that reflects the underlying DNA sequence and genes in that chromosome region. These chromosomes have been wonderful research models in genetics, since their relatively large, amplified size makes it easy to identify and study a wide variety of chromosome aberrations under the microscope.

Figure 8.
Endoreduplicated chromosomes from a Drosophila salivary gland cell. The banding pattern is produced with fluorescent labels.
(Flickr-Elissa Lei, Ph.D. @ NIH- CC BY 2.0)

4. ANEUPLOIDY

4.1. NOMENCLATURE
If something goes wrong during cell division, an entire chromosome may be lost and the cell will lack all of the genes on it. Conversely, an entire chromosome may be improperly included in the new cell. These chromosomal abnormalities are known as aneuploidy, which is the addition or subtraction of a chromosome from a pair of homologs. More specifically, the absence of one member of a pair of homologous chromosomes is called monosomy (only one remains). On the other hand, in a trisomy, there are three, rather than two (disomy), homologs of a particular chromosome. Different types of aneuploidy are sometimes represented symbolically; if 2n symbolizes the normal number of chromosomes in a cell, then 2n-1 indicates monosomy and 2n+1 represents trisomy. The addition or loss of a whole chromosome is a mutation, a change in the genotype of a cell or organism. The most widely known human aneuploidy is trisomy-21 (i.e. three copies of chromosome 21), which is one cause of Down syndrome. Most (but not all) other human autosomal aneuploidies are lethal at an early stage of embryonic development.

Aneuploidy can arise through a non-disjunction event, which is the failure of at least one pair of chromosomes or chromatids to segregate during mitosis or meiosis. Non-disjunction will generate gametes with extra and/or missing chromosomes. Note that aneuploidy usually affects the number of only one type of chromosome and is therefore distinct from polyploidy, in which the entire chromosome set is duplicated (see previous section). Unlike aneuploidy, which is almost always deleterious, polyploidy can be beneficial in some organisms, particularly many species of food plants. Higher ploidy levels often result in larger plants and fruits (Figure 6).

This section will go into the details of the causes of aneuploidy and the consequences and diseases associated with it.

4.2. NONDISJUNCTION DURING MITOSIS OR MEIOSIS
Segregation occurs in anaphase. In mitosis and meiosis II, sister chromatids (of replicated chromosomes) are normally pulled to opposite ends of the cell. In meiosis I, it is homologous chromosomes, which are synapsed at that time, that segregate and move apart.

4.3. CONSEQUENCE: DECREASED VIABILITY
A non-disjunction event results in daughter cells having an abnormal number of chromosomes. Cells, such as the parent cell in Figure 9a, which have the proper number of chromosomes, are said to be euploid. The daughter cells have one too many or one too few chromosomes and are called aneuploid. Even though both product cells have at least one copy of all genes, both cells will probably die. The reason is the loss or gain of a large number of genes on the chromosome. Genes normally produce a standard amount of product, either functional RNAs or proteins. The parent cell shown has a balanced genotype because it has two copies of all of its genes (on its autosomes). But if one of these cells suddenly had only one copy (or three copies) of all the genes on a whole chromosome, the amount of product would be 50% (or 150%) of what was normal. The cell could probably tolerate such a change for a single gene and it would probably survive. But the sudden change to one copy (or three copies) of the hundreds or thousands of genes on an entire chromosome would be more than the daughter cells could tolerate. They have what is called an unbalanced genotype, which decreases their viability and usually kills them.

Figure 9.
Mitosis done successfully (a) and unsuccessfully (b). The cell is diploid and the homologs of one chromosome are shown in grey and black. In (a), correct segregation, a 2n cell gives two 2n daughter cells. In (b), non-disjunction during anaphase, a 2n cell gives one 2n-1 and one 2n+1 daughter cell.
(Original-L. Canham & M. Harrington- CC BY-NC 3.0)

If a first division or second division nondisjunction event occurs during meiosis the result is an unbalanced gamete (Figure 10b and c). The gamete in this case can often be functional, but after fertilization the embryo will be genetically unbalanced. This usually leads to the death of the cell or embryo at some point in development. There are some exceptions to this in humans and these will be presented later in this chapter.

5. CHROMOSOME ABNORMALITIES IN HUMANS
The problems described above can affect all eukaryotes, unicellular and multicellular. To better understand the consequences let us consider those that affect people. As you will recall, humans are 2n=46. The convention when describing a person's karyotype (chromosome composition) is to list the total number of chromosomes, then the sex chromosomes, and then anything out of the ordinary. Most of us are 46,XX or 46,XY. What follows are some examples of chromosome number and chromosome structure abnormalities.

Figure 10.
Meiosis done successfully (a) and unsuccessfully (b and c).
(Original-L. Canham & M. Harrington- CC BY-NC 3.0)

5.1. AUTOSOMAL CHROMOSOME ABNORMALITIES - DOWN SYNDROME
The most common chromosome number abnormality is trisomy-21 or, as it is more commonly known, Down syndrome (Figure 11). It is present in about 1 in 800 births. Infants with this condition have three copies of chromosome 21 rather than the normal two. Don't confuse trisomy, having three copies of one chromosome (i.e. 2n+1), with triploidy, having three entire chromosome sets (3x). Females with trisomy-21 are 47,XX,+21 while males are 47,XY,+21. In general, people with Down syndrome are 47,sex,+21 where the word 'sex' signifies that the sex chromosomes may be XX or XY.

Figure 11.
A young girl with Down Syndrome, which can result from an extra chromosome 21.
(Flickr-Andreas-Photography-CC BY-NC 3.0)

Trisomy-21 may arise from a nondisjunction event during meiosis in either parent or during mitosis very early during embryogenesis. However, most cases are due to a first division non-disjunction event occurring in the female parent (Figure 12).

Figure 12.
This diagram shows meiosis in both parents and fusion of the gametes, including the chromosome segregation error that causes Down syndrome. Note that the cells that begin meiosis are called meiocytes and that this diagram only shows one of the four cells produced by meiosis. Meiosis occurred properly in the male parent but there was a nondisjunction event in the female parent in anaphase I.
(Original-M. Harrington/L. Canham-CC BY-NC 3.0)

Having an extra copy of the smallest human chromosome, chromosome 21, causes substantial health problems. People with Down syndrome have various degrees of intellectual disability and often have other health problems such as heart defects. John Down first described the condition in 1866, but it was not until 1959 that its chromosomal basis was discovered.

Current research suggests that at least some of the cognitive effects are due to having three copies of the DYRK1A gene on chromosome 21. This gene is active in the brain and there is evidence from humans and from mice that neurons are damaged if too much DYRK1A protein is synthesized.

5.2. SEX CHROMOSOME ABNORMALITIES - XYY AND XXX
While fetuses trisomic for any one of the other autosomes seldom survive to term, the situation is quite different for the sex chromosomes. Approximately 1 in 1000 males has an extra Y chromosome and yet most are unaware of it! There appears to be little harm in having two Y chromosomes because they have relatively few genes. Similarly, 1 in 1000 females has an extra X
chromosome. This situation also appears relatively harmless, although for a different reason. Normally in female mammals (humans 46,XX), one of the two X chromosomes is inactivated in each cell so that there can be genetic balance with males (see Chapter 22). In 47,XXX females, two of the X chromosomes are inactivated, leaving one active, just like in normal 46,XX females (Figure 13).

Figure 13.
Simplified dosage compensation in XX and XXX female mammals (panels: XX interphase, XX mitosis, XXX interphase). In mitosis, both X chromosomes are condensed, as normal. In interphase, one chromosome is selectively silenced and remains condensed. The condensed chromosome is called a Barr body. Mammals that are XXX compensate for the extra X chromosome in the same way, but silence 2 of the Xs while the third stays active.
For more information on mammalian dosage compensation, see Chapter 22.
(Original-L. Canham & M. Harrington-CC BY-NC 3.0)

5.3. SEX CHROMOSOME ABNORMALITIES - TURNER SYNDROME
Monosomy (2n-1) for autosomal chromosomes does occur at conception but these embryos almost never survive to term. Similarly, embryos that are 45,Y are also non-viable because they lack the many essential genes found on the X chromosome. The only viable monosomy in humans is 45,X, also known as Turner syndrome. These people are phenotypically female because they lack a Y chromosome (see Chapter 21). They are viable because the one X is active in most cells. People with this condition do have health problems though: they are typically shorter than average, have an elevated risk of heart defects, and are infertile.

The reason for the health problems is that there are a few genes that have allelic copies on both the X and the Y chromosome. They are found in what is called the pseudo-autosomal region of the X and Y chromosome. This region escapes X chromosome inactivation. One of the genes in this region is called SHOX. It makes a protein that promotes bone growth. The normal 46,XX and 46,XY individuals have two functioning copies and have average height. People with 47,XYY and 47,XXX karyotypes have three copies and are typically taller than average, while people with 45,X have one copy and are typically shorter. It is the single copy of SHOX and a few of the other genes in the pseudo-autosomal region that causes health problems for women with Turner syndrome.

5.4. SEX CHROMOSOME ABNORMALITIES - KLINEFELTER SYNDROME
There are four common sex-chromosome aneuploidies: 47,XYY, 47,XXX, 45,X, and 47,XXY. This last situation is known as Klinefelter syndrome. These people are male (because they have a Y chromosome) and tall (because they have three SHOX genes). They have few other health problems because the X chromosome inactivation system is independent of sex (it happens in phenotypic males, as well as females). In the embryonic nuclei, all but one of the X chromosomes are inactivated. It doesn't matter whether the embryo is male or female. Cells from men with Klinefelter syndrome have a Barr body in their nuclei, the same as 46,XX females. They do have fertility problems because there are two active X chromosomes in their testes and this interferes with spermatogenesis. They may still make enough sperm to conceive children using intracytoplasmic sperm injection.

6. GENE BALANCE
Why do trisomies, duplications, and other chromosomal abnormalities that alter gene copy number often have a negative effect on the normal development or physiology of an organism? This is particularly intriguing because in many species, aneuploidy is detrimental or lethal, while polyploidy is tolerated or even beneficial. The answer may differ in each case, but is probably related to the concept of gene balance, which can be summarized as follows: genes, and the proteins they produce, have evolved to function in complex metabolic and regulatory networks. Some of these networks function best when certain enzymes and regulators are present in specific ratios to each other. Increasing or decreasing the gene copy number for just one part of the network may throw the whole network out of balance, leading to increases or decreases of certain metabolites, which may be toxic in high concentrations or limiting in other important processes in the cell. The activity of genes and metabolic networks is regulated in many different ways besides changes in gene copy number, so duplication of just a few genes will usually not be harmful. However, trisomy and large segmental duplications of chromosomes affect the dosage of so many genes that cellular networks are unable to compensate for such changes and an abnormal or lethal phenotype results.
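
The difference between a change in dosage and a change in balance can be made concrete with a small calculation. The sketch below is an illustration, not a model from the original text. It assumes, as the chapter does, that the amount of product scales with gene copy number. The gene names and the helpers `relative_dosage` and `is_balanced` are hypothetical.

```python
from fractions import Fraction


def relative_dosage(copies: int, normal_copies: int = 2) -> Fraction:
    """Gene product relative to a normal diploid cell, assuming output scales with copy number."""
    return Fraction(copies, normal_copies)


def is_balanced(copy_numbers: dict) -> bool:
    """True when every gene is present in the same number of copies."""
    return len(set(copy_numbers.values())) <= 1


# Hypothetical genes: gene_a sits on the affected chromosome, gene_b on another one.
diploid = {"gene_a": 2, "gene_b": 2}
triploid = {"gene_a": 3, "gene_b": 3}    # every chromosome set is tripled
trisomic = {"gene_a": 3, "gene_b": 2}    # one extra chromosome only
monosomic = {"gene_a": 1, "gene_b": 2}   # one chromosome missing

if __name__ == "__main__":
    for name, cell in [("diploid", diploid), ("triploid", triploid),
                       ("trisomic", trisomic), ("monosomic", monosomic)]:
        dosage = {gene: float(relative_dosage(n)) for gene, n in cell.items()}
        print(name, dosage, "balanced" if is_balanced(cell) else "unbalanced")
```

In the triploid every gene rises to 150% together, so the ratios between gene products are unchanged. In the trisomic and monosomic cells only the genes on one chromosome change, which is the imbalance the gene balance concept describes.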

___________________________________________________________________________
SUMMARY:
• Aneuploidy results from the addition or subtraction of one or more chromosomes from a group of
homologs, and is usually deleterious to the cell.
• Polyploidy is the presence of more than two complete sets of chromosomes in a genome. Even-
numbered multiple sets of chromosomes can be stably inherited in some species, especially plants.
• Aneuploidy can affect gene balance.
• Errors during anaphase in mitosis or meiosis can lead to trisomy and other forms of aneuploidy.
• Five common forms of aneuploidy in humans are 47,XY,+21 or 47,XX,+21 (Down syndrome), 47,XYY,
47,XXX, 45,X (Turner syndrome) and 47,XXY (Klinefelter syndrome).
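
The karyotype convention used in this chapter (total count, then sex chromosomes, then anything unusual) is regular enough to read mechanically. The following is a minimal sketch that handles only the simple forms shown above, such as 46,XX or 47,XY,+21. The function name `parse_karyotype` is hypothetical, and the Barr body count simply applies the rule from the text that all but one X chromosome are inactivated.

```python
def parse_karyotype(karyotype: str) -> dict:
    """Split a simple karyotype string such as '47,XX,+21' into its parts."""
    parts = [p.strip() for p in karyotype.split(",")]
    if len(parts) < 2:
        raise ValueError("expected at least 'total,sex chromosomes'")
    total = int(parts[0])
    sex = parts[1].upper()
    if not sex or set(sex) - {"X", "Y"}:
        raise ValueError("sex chromosomes must be written with X and Y only")
    x_count = sex.count("X")
    return {
        "total": total,
        "sex_chromosomes": sex,
        "other_changes": parts[2:],                 # e.g. ['+21']
        "phenotypic_sex": "male" if "Y" in sex else "female",
        "barr_bodies": max(x_count - 1, 0),         # all but one X is inactivated
    }


if __name__ == "__main__":
    for k in ["46,XX", "46,XY", "47,XX,+21", "47,XXY", "45,X"]:
        print(k, parse_karyotype(k))
```

The sketch does not validate that the total matches the listed changes, and it does not cover structural abnormalities.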
KEY TERMS
aneuploidy
polyploidy
n
c
replicated chromosome
x
monoploid
sterile
tetravalent
octoploid
hexaploid
triploid
gene balance
cellular network
non-disjunction
euploid
balanced
unbalanced
first division nondisjunction
second division nondisjunction
karyotype
46,XX
46,XY
47,sex,+21 (Down syndrome)
trisomy
47,XYY
47,XXX
monosomy
45,X (Turner syndrome)
pseudo-autosomal region
47,XXY (Klinefelter syndrome)

STUDY QUESTIONS

1) Bread wheat (Triticum aestivum) is a hexaploid. Using the nomenclature presented in class, an ovum cell of wheat has n=21 chromosomes. How many chromosomes are in a zygote of bread wheat?
2) For a given gene:
a) What is the maximum number of alleles that can exist in a 2n cell of a given diploid individual?
b) What is the maximum number of alleles that can exist in a 1n cell of a tetraploid individual?
c) What is the maximum number of alleles that can exist in a 2n cell of a tetraploid individual?
d) What is the maximum number of alleles that can exist in a population?
3)
a) Why is aneuploidy more often lethal than polyploidy?
b) Which is more likely to disrupt gene balance: polyploidy or duplication?
4) For a diploid organism with 2n=4 chromosomes, draw a diagram of all of the possible configurations of chromosomes during normal anaphase I, with the maternally and paternally derived chromosomes labeled.
5) For a triploid organism with 2n=3x=6 chromosomes, draw a diagram of all of the possible configurations of chromosomes at anaphase I (it is not necessary to label maternal and paternal chromosomes).
6) For a tetraploid organism with 2n=4x=8 chromosomes, draw all of the possible configurations of chromosomes during a normal metaphase.
7) Make a diagram showing how a nondisjunction event can lead to a child with a 47,XYY karyotype.
8) How many Barr bodies would you expect to see in cells from people who are:
a) 46,XY,
b) 46,XX,
c) 47,XYY,
d) 47,XXX,
e) 45,X,
f) 47,XXY
9) Why can people survive with trisomy-21 (47,sex,+21) but not monosomy-21 (45,XY,-21 or 45,XX,-21)?
10) What would happen if there was a nondisjunction event involving chromosome 21 in a 46,XY zygote?


CHAPTER 26 – GENE INTERACTIONS

Figure 1.
Coat color in mammals is an
example of a phenotypic trait that
is controlled by more than one
locus (polygenic), and the alleles at
these loci can interact to alter the
expected Mendelian ratios.
(Flickr-David Blaikie- CC BY 2.0)

INTRODUCTION

The principles of genetic analysis that we have described for a single locus (dominance/recessiveness) can be extended to the study of alleles at two different loci. While the analysis of two loci concurrently is required for genetic mapping, it can also reveal interactions between genes that affect the phenotype. Understanding these interactions is very useful for both basic and applied research. Before discussing these interactions, we will first revisit Mendelian inheritance for two loci.

1. MENDELIAN DIHYBRID CROSSES

1.1. MENDEL’S SECOND LAW (A QUICK REVIEW)
To analyze the segregation of two traits (e.g. seed color and seed shape) at the same time, in the same individual, Mendel crossed a pure breeding line of green, wrinkled peas with a pure breeding line of yellow, round peas to produce F1 progeny that were all green and round, and which were also dihybrids; they carried two alleles at each of two loci (Figure 2).

If the inheritance of seed color was truly independent of seed shape, then when the F1 dihybrids were crossed to each other, a 3:1 ratio of one trait should be observed within each phenotypic class of the other trait (Figure 2). Using the product rule, we would therefore predict that if ¾ of the progeny were green, and ¾ of the progeny were round, then ¾ × ¾ = 9/16 of the progeny would be both round and green. Likewise, ¾ × ¼ = 3/16 of the progeny would be both round and yellow, and so on. By applying the product rule to all of these combinations of phenotypes, we can predict a 9:3:3:1 phenotypic ratio among the progeny of a dihybrid cross, if certain conditions are met, including the independent segregation of the alleles at each locus. Indeed, 9:3:3:1 is very close to the ratio Mendel observed in his studies of dihybrid crosses, leading him to state his Second Law, the Law of Independent Assortment, which we now express as follows: two loci assort independently of each other during gamete formation.
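
The product rule calculation for a dihybrid cross can be reproduced with exact fractions. This is a small sketch for illustration. The phenotype labels follow the cross described in this section, and the function name `dihybrid_classes` is hypothetical. It assumes independent assortment and complete dominance at both loci.

```python
from fractions import Fraction
from itertools import product


def dihybrid_classes() -> dict:
    """Multiply the 3:1 expectations at two independent loci (product rule)."""
    shape = {"round": Fraction(3, 4), "wrinkled": Fraction(1, 4)}
    color = {"green": Fraction(3, 4), "yellow": Fraction(1, 4)}
    return {
        (s, c): p_shape * p_color
        for (s, p_shape), (c, p_color) in product(shape.items(), color.items())
    }


if __name__ == "__main__":
    classes = dihybrid_classes()
    for phenotype, fraction in classes.items():
        # Express each class in sixteenths to read off the phenotypic ratio.
        print(phenotype, fraction, "=", fraction * 16, "/ 16")
```

Expressed in sixteenths, the four classes come out as 9, 3, 3 and 1, and the fractions sum to one.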

Figure 2.
Pure-breeding lines are crossed to produce dihybrids in the F1 generation. The cross of these particular dihybrids produces four phenotypic classes.
(Original-Deyholos-CC BY-NC 3.0)

Figure 3.
A Punnett Square showing the results of the dihybrid cross from Figure 2. Each of the four phenotypic classes is represented by a different color of shading.
(Original-Deyholos-CC BY-NC 3.0)

1.2. ASSUMPTIONS OF THE 9:3:3:1 RATIO
Both the product rule and the Punnett Square approaches showed that a 9:3:3:1 phenotypic ratio is expected among the progeny of a dihybrid cross such as Mendel’s RrYy × RrYy. In making these predictions, we assumed that:
(1) both loci assort independently;
(2) one allele at each locus is completely dominant; and
(3) each of four possible phenotypes can be distinguished unambiguously, with no interactions between the two genes that would alter the phenotypes.

Deviations from the 9:3:3:1 phenotypic ratio may indicate that one or more of the above conditions has not been met. For example, linkage of the two loci results in a distortion of the ratios expected from independent assortment. Also, if complete dominance is lacking (e.g. co-dominance or incomplete dominance) then the ratios will also be distorted. Finally, if there is an interaction between the two loci such that the four classes cannot be distinguished (which is the topic under consideration in this chapter) the ratio will also deviate from 9:3:3:1.

Modified ratios in the progeny of a dihybrid cross can therefore reveal useful information about the genes being investigated. Such interactions lead to Modified Mendelian Ratios.

2. EPISTASIS AND OTHER GENE INTERACTIONS
Some dihybrid crosses produce a phenotypic ratio that differs from the typical 9:3:3:1. These include 9:3:4, 12:3:1, 9:7, or 15:1. Note that each of these modified ratios can be obtained by summing two or more of the 9:3:3:1 classes expected from our original dihybrid cross. In the following sections, we will look at some modified phenotypic ratios obtained from dihybrid crosses and what they might tell us about the interactions between the genes involved.

Figure 4.
Retrievers with different coat colors: (from left to right) black, chocolate, yellow: an example of recessive epistasis phenotypes.
(Flickr- Pirate Scott - CC BY-NC 2.0)

2.1. RECESSIVE EPISTASIS
Epistasis (which means “standing upon”) occurs when the phenotype of one locus masks, or prevents, the phenotypic expression of another locus. Thus, following a dihybrid cross fewer than
the typical four phenotypic classes will be observed with epistasis. As we have already discussed,
in the absence of epistasis, there are four phenotypic classes among the progeny of a dihybrid
cross. The four phenotypic classes correspond to the genotypes: A_B_, A_bb, aaB_, and aabb. If
either of the singly homozygous recessive genotypes (i.e. A_bb or aaB_) has the same phenotype as
the double homozygous recessive (aabb), then a 9:3:4 phenotypic ratio will be obtained.
For example, in the Labrador Retriever breed of dogs (Figure 4), the B locus is a gene for an
important step in the production of melanin. The dominant allele, B, is more efficient at pigment
production than the recessive b allele; thus B_ hair appears black, and bb hair appears brown. A
second locus, which we will call E, controls the deposition of melanin in the hairs. At least one
functional E allele is required to deposit any pigment, whether it is black or brown. Thus, all
retrievers that are ee fail to deposit any melanin (and so appear pale yellow-white), regardless of
the genotype at the B locus (Figure 4, right side). The ee genotype is therefore said to be
epistatic to both the B and b alleles, since the homozygous ee phenotype masks the phenotype of
the B locus. The B/b locus is said to be hypostatic to the ee genotype. Because the masking allele
in this case is recessive, this is called recessive epistasis. A table showing all the possible
progeny genotypes and their phenotypes is shown in Figure 5.

Figure 5.
Genotypes and phenotypes among the progeny of a dihybrid cross of Labrador Retrievers heterozygous
for two loci affecting coat color. The phenotypes of the progeny are indicated by the shading of
the cells in the table: black coat (black, E_B_); chocolate coat (brown, E_bb); yellow coat
(yellow, eeB_ or eebb).
(Original-Locke-CC BY-NC 3.0)

2.2. DOMINANT EPISTASIS
In some cases, a dominant allele at one locus may mask the phenotype of a second locus. This is
called dominant epistasis, which produces a segregation ratio of 12:3:1. This ratio can be viewed
as a modification of the 9:3:3:1 ratio in which the A_B_ class is combined with one of the other
genotypic classes (9+3) that contains a dominant allele. One of the best known examples of a
12:3:1 segregation ratio is fruit color in some types of squash (Figure 6). Alleles of a locus
that we will call B produce either yellow (B_) or green (bb) fruit. However, in the presence of a
dominant allele at a second locus that we call A, no pigment is produced at all, and fruit are
white. The dominant A allele is therefore epistatic to both B_ and bb combinations (Figure 7). One
possible biological interpretation of this segregation pattern is that the function of the A
allele somehow blocks an early stage of pigment synthesis, before either yellow or green pigments
are produced.

Figure 6.
Green, yellow, and white fruits of squash. (Flickr-Unknown-CC BY-NC 3.0)

Figure 7.
Genotypes and phenotypes among the progeny of a dihybrid cross of squash plants heterozygous for
two loci affecting fruit color. (Original-Deyholos-CC BY-NC 3.0)


2.3. DUPLICATE GENE ACTION

When a dihybrid cross produces progeny in two phenotypic classes in a 15:1 ratio, this can be
because the two loci’s gene products have the same (redundant) functions within the same
biological pathway. With yet another pigmentation pathway example, wheat shows this duplicate
gene action. The biosynthesis of red pigment near the surface of wheat seeds (Figure 8) involves
many genes, two of which we will label A and B. Normal, red coloration of the wheat seeds is
maintained if function of either of these genes is lost in a homozygous mutant (e.g. in either
aaB_ or A_bb). Only the doubly recessive mutant (aabb), which lacks function of both genes, shows
a phenotype that differs from that produced by any of the other genotypes (Figure 9). A reasonable
interpretation of this result is that both genes encode the same biological function, and either
one alone is sufficient for the normal activity of that pathway.

Figure 8.
Red (left) and white (right) wheat seeds.
(cropwatch.unl.edu?-pending?)

Figure 9.
Genotypes and phenotypes among the progeny of a dihybrid cross of wheat plants heterozygous for
two loci affecting seed color. (Original-Deyholos-CC BY-NC 3.0)

2.4. COMPLEMENTARY GENE ACTION

The progeny of a dihybrid cross may produce just two phenotypic classes, in an approximately 9:7
ratio. An interpretation of this ratio is that the loss of function of either gene A or gene B has
the same phenotype as the loss of function of both genes, due to complementary gene action
(meaning that the functions of both genes work together to produce a final product). For example,
consider a simple biochemical pathway in which a colorless substrate is converted by the action of
gene A to another colorless product, which is then converted by the action of gene B to a visible
pigment (Figure 10).

Figure 10.
a) A simplified biochemical pathway showing complementary gene action of A and B. Note that in
this case, the same phenotypic ratios would be obtained if gene B acted before gene A in the
pathway.
b) biochemical pathway showing two subunits of one enzyme
c) biochemical pathway showing one transcription factor and one enzyme
(Original-Deyholos/Kang-CC BY-NC 3.0)


Loss of function of either A or B, or both, will have the same result: no pigment production. Thus
A_bb, aaB_, and aabb will all be colorless, while only A_B_ genotypes will produce pigmented
product (Figure 11). The modified 9:7 ratio may therefore be obtained when two genes act together
in the same biochemical pathway, and when their loss of function phenotypes are indistinguishable
from each other or from the loss of both genes. There are also other possible biochemical
explanations for complementary gene action.

Figure 11.
Genotypes and phenotypes among the progeny of a dihybrid cross of a hypothetical plant
heterozygous for two loci affecting flower color. (Original-Deyholos-CC BY-NC 3.0)

2.5. GENETIC SUPPRESSION AND ENHANCEMENT
A suppressor mutation is a type of mutation that usually has no phenotype of its own, but acts to
suppress (make more wild type, less mutant) the phenotypic expression of another mutation that
already exists in an organism. On the other hand, enhancer mutations have the opposite effect of
suppressor mutations. They make the phenotype more mutant and less wild type (enhance the mutant
phenotype).
For example, if a fly has a white^mottled (w^m) phenotype, it can be suppressed to look more like
the white^+ phenotype by a dominant Suppressor mutation (S^-), or Enhanced to look more like
white^- by a dominant enhancer mutation (E^-) (Figure 12). Note that the w^m allele is recessive
to white^+ (w^+) but dominant to white^- (w^-).

Figure 12.
Mutation in the white gene impacts the pigmentation in Drosophila eyes. Note that white^mottled is
recessive to white^+ and dominant to white^-.
(Original-Locke-CC BY-NC 3.0)

The suppressor mutation can be within the original gene itself (intragenic) or outside the gene,
at some other gene elsewhere in the genome (extragenic). For example, a frameshift mutation caused
by a deletion in a gene can be reverted, or suppressed, by an insertion in the same gene to
restore the original reading frame (intragenic suppressor mutation). A case of an extragenic
suppressor mutation, on the other hand, can occur when a mutant phenotype caused by a mutation in
gene A is suppressed by a mutation in gene B. Extragenic suppression is of two types:
(1) dominant suppression and (2) recessive suppression.

2.6. DOMINANT SUPPRESSION
In dominant suppression, the mutant suppressor allele is dominant to the wild type suppressor
allele. Therefore, one mutant suppressor allele is sufficient to suppress the mutant phenotype.
For example, in Figure 13, the Su gene represents the suppressor gene. Flies that have at least
one Su^- allele, even though they have the homozygous recessive w^m/w^m genotype, will show a
wild-type (w^+) phenotype. A fly will have the w^m phenotype only if it is w^m/w^m and also has
the homozygous recessive Su^+/Su^+ genotype. If w^+/w^m; Su^+/Su^- flies are crossed together, the
ratio of white^+ (wild type) to white^mottled (mutant) would be 15:1.


Figure 13 – Dominant Suppression.
Drosophila cross and its Punnett square showing the effects of dominant suppression of Su gene on
the white gene. Note that A = white^+, a = white^mottled, B = Su^+, b = Su^-, and ___ (blank) =
any allele. (Original-Kang-CC BY-NC 3.0)

Figure 14 – Recessive Suppression.
Drosophila cross and its Punnett square showing the effects of recessive suppression of Su gene on
the white gene. Note that A = white^+, a = white^mottled, B = Su^+, b = Su^-, and ___ (blank) =
any allele. (Original-Kang-CC BY-NC 3.0)

2.7. RECESSIVE SUPPRESSION
On the other hand, in recessive suppression, the mutant suppressor allele is recessive to the wild
type suppressor allele. Therefore, two of the mutant alleles are needed to suppress the w^m
(mottled) phenotype. For example, in Figure 14, flies that have at least one w^+ allele will show
a wild-type phenotype. Also, flies that are Su^-/Su^- will have a wild-type phenotype, since two
mutant suppressor alleles can suppress the white gene mutation. On the other hand, flies that are
w^m/w^m will have the mottled phenotype unless they are homozygous for Su^-. If w^+/w^m;
Su^+/Su^- flies are crossed together, the ratio of white^+ (wild type) to white^mottled (mutant)
would be 13:3.


2.8. SUMMARY
Table 1. Summary table showing gene interactions and their genotypic and phenotypic ratios.

Interaction                                  Phenotypic classes (progeny out of 16)             Ratio
None                                         A_B_ (9); A_bb (3); aaB_ (3); aabb (1)             9:3:3:1
Recessive epistasis                          A_B_ (9); A_bb (3); aaB_ + aabb (4)                9:3:4
  of aa acting on B and b alleles
Dominant epistasis                           A_B_ + A_bb (12); aaB_ (3); aabb (1)               12:3:1
  of A acting on B and b alleles
Duplicate genes                              A_B_ + A_bb + aaB_ (15); aabb (1)                  15:1
Complementary genes                          A_B_ (9); A_bb + aaB_ + aabb (7)                   9:7
Recessive suppression                        A_B_ + aaB_ + aabb (13); A_bb (3)                  13:3
  by aa acting on bb
Dominant suppression                         A_B_ + A_bb + aaB_ (15); aabb (1)                  15:1
  by A acting on bb

Genotype classes joined by + are combined into a single phenotypic class (shaded together in the
original table).
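
The modified ratios summarized in Table 1 can be checked by counting. The following is a minimal Python sketch, not part of the original text: it enumerates the 16 cells of an AaBb × AaBb Punnett square, assuming independent assortment and complete dominance, and then merges genotype classes according to each interaction. The function names, the MODELS dictionary, and the phenotype labels ("class 1", "wild type", "mutant") are illustrative assumptions.

```python
from collections import Counter
from itertools import product


def genotype_classes():
    """Count genotype classes among the 16 cells of an AaBb x AaBb Punnett square."""
    gametes = ["AB", "Ab", "aB", "ab"]  # equally frequent under independent assortment
    counts = Counter()
    for egg, pollen in product(gametes, repeat=2):
        locus_a = "A_" if "A" in (egg[0], pollen[0]) else "aa"
        locus_b = "B_" if "B" in (egg[1], pollen[1]) else "bb"
        counts[locus_a + locus_b] += 1
    return counts


# Illustrative mapping of genotype class -> phenotype label for each row of Table 1.
MODELS = {
    "none": {"A_B_": "class 1", "A_bb": "class 2", "aaB_": "class 3", "aabb": "class 4"},
    "recessive epistasis": {"A_B_": "class 1", "A_bb": "class 2", "aaB_": "class 3", "aabb": "class 3"},
    "dominant epistasis": {"A_B_": "class 1", "A_bb": "class 1", "aaB_": "class 2", "aabb": "class 3"},
    "duplicate genes": {"A_B_": "wild type", "A_bb": "wild type", "aaB_": "wild type", "aabb": "mutant"},
    "complementary genes": {"A_B_": "wild type", "A_bb": "mutant", "aaB_": "mutant", "aabb": "mutant"},
    "recessive suppression": {"A_B_": "wild type", "A_bb": "mutant", "aaB_": "wild type", "aabb": "wild type"},
    "dominant suppression": {"A_B_": "wild type", "A_bb": "wild type", "aaB_": "wild type", "aabb": "mutant"},
}


def phenotypic_ratio(model):
    """Sum the genotype-class counts that share a phenotype label."""
    totals = Counter()
    for genotype_class, n in genotype_classes().items():
        totals[model[genotype_class]] += n
    return dict(totals)


if __name__ == "__main__":
    for name, model in MODELS.items():
        print(name, phenotypic_ratio(model))
```

Writing each interaction as a mapping from genotype class to phenotype makes the point of this section explicit: every modified ratio is the same 9:3:3:1 distribution with some classes merged.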
3. EXAMPLE OF MULTIPLE GENES AFFECTING ONE CHARACTER (POLYGENIC INHERITANCE)

3.1. CONTINUOUS VARIATION
Most of the phenotypic traits commonly used in introductory genetics are qualitative, meaning that
the phenotype exists in only two (or possibly a few more) discrete, alternative forms, such as
either purple or white flowers, or red or white eyes. These qualitative traits are therefore said
to exhibit discrete variation. On the other hand, many interesting and important traits exhibit
continuous variation; these exhibit a continuous range of phenotypes that are usually measured
quantitatively, such as intelligence, body mass, blood pressure in animals (including humans), and
yield, water use, or vitamin content in crops. Traits with continuous variation are often complex,
and do not show the simple Mendelian segregation ratios (e.g. 3:1) observed with some qualitative
traits. Many complex traits are also influenced heavily by the environment. Nevertheless, complex
traits can often be shown to have a component that is heritable, and which must therefore involve
one or more genes.
How can genes, which are inherited (in the case of a diploid) as at most two variants each,
explain the wide range of continuous variation observed for many traits? The lack of an
immediately obvious explanation to this question was one of the early objections to Mendel's
explanation of the mechanisms of heredity. However, upon further consideration, it becomes clear
that the more loci that contribute to a trait, the more phenotypic classes may be observed for
that trait (Figure 15).


Figure 15.
Punnett Squares for one, two, or three loci. We are using a simplified example of up to three semi-dominant genes, and in each
case the effect on the phenotype is additive, meaning the more “upper case” alleles present, the stronger the phenotype.
Comparison of the Punnett Squares and the associated phenotypes shows that under these conditions, the larger the number of
genes that affect a trait, the more intermediate phenotypic classes that will be expected. (Original-Deyholos-CC BY-NC 3.0)

Figure 16.
The more loci that affect a trait, the larger the number of phenotypic classes that can be expected. For some traits, the number
of contributing loci is so large that the phenotypic classes blend together in apparently continuous variation. (Original-Deyholos-
CC BY-NC 3.0)

If the number of phenotypic classes is sufficiently large (as with three or more loci), individual
classes may become indistinguishable from each other (particularly when environmental effects are
included), and the phenotype appears as a continuous variation (Figure 16). Thus, quantitative
traits are sometimes called polygenic traits, because it is assumed that their phenotypes are
controlled by the combined activity of many genes. Note that this does not imply that each of the
individual genes has an equal influence on a polygenic trait; some may have a major effect, while
others have only a minor one. Furthermore, any single gene may influence more than one trait,
whether these traits are quantitative or qualitative.
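
The growth in the number of phenotypic classes can be sketched numerically. The short Python example below is an illustration only, under the simplified model of Figure 15: every locus is semi-dominant with an equal, additive effect, loci assort independently, and both parents are heterozygous at every locus. Under those assumptions the number of "upper case" alleles in an offspring follows a binomial distribution. The function name additive_classes is hypothetical.

```python
from math import comb


def additive_classes(n_loci):
    """Punnett-square cell counts for 0, 1, ..., 2n 'upper case' alleles.

    Assumes a cross between parents heterozygous at all n_loci loci, with
    independent assortment and equal additive effects (a simplification).
    """
    total_alleles = 2 * n_loci
    return [comb(total_alleles, k) for k in range(total_alleles + 1)]


for n in (1, 2, 3):
    counts = additive_classes(n)
    # Number of loci, number of phenotypic classes, and the class counts.
    print(n, len(counts), counts)
```

With n loci there are 2n + 1 classes out of 4^n Punnett-square cells, and the extreme classes become rare as n grows, which is why the distribution starts to look continuous.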


3.2. CAT FUR GENETICS – (ADAPTED FROM CHRISTENSEN (2000) GENETICS 155:999-1004)
Most aspects of the fur phenotypes of common cats can be explained by the action of just a few
genes (Table 2). Other genes, not described here, may further modify these traits and account for
the phenotypes seen in tabby cats and in more exotic breeds, such as Siamese.
For example, the X-linked Orange gene has two allelic forms. The O^O allele produces orange fur,
while the O^B allele produces non-orange (often black) fur. Note, however, that because of
X-chromosome inactivation the result is mosaicism in expression. In O^O / O^B female heterozygotes
patches of black and orange are seen, which produces the tortoise shell pattern (Figure 17 A,B).
This is a rare example of co-dominance since the phenotype of both alleles can be seen. Note that
the cat in part A has short fur compared to the cat in part B; recessive alleles at an independent
locus (L/l) produce long (ll) rather than short (L_) fur.
Alleles of the dilute gene affect the intensity of pigmentation, regardless of whether that
pigmentation is due to black or orange pigment. Part C shows a black cat with at least one
dominant allele of dilute (D_), in contrast to the cat in D, which is grey rather than black,
because it has the dd genotype.
Epistasis is demonstrated by an allele of only one of the genes in Table 2. One dominant allele of
white masking (W) prevents normal development of melanocytes (pigment producing cells). Therefore,
cats with genotype (W_) will have entirely white fur regardless of the genotype at the Orange or
dilute loci (part E). Although this locus produces a white colour, W_ is not the same as albinism,
which is a much rarer phenotype caused by mutations in other genes. Albino cats can be
distinguished by having red eyes, while W_ cats have eyes that are not red.
Piebald spotting is the occurrence of patches of white fur. These patches vary in size for many
reasons, including genotype. Homozygous cats with genotype ss do not have any patches of white,
while cats of genotype Ss and SS do have patches of white, and the SS homozygotes tend to have a
larger proportion of white fur than heterozygotes (part F). The combination of piebald spotting
and tortoise shell patterning produces a calico cat, which has separate patches of orange, black,
and white fur.

Figure 17.
Representatives of various fur phenotypes in cats. Tortoise shell (A,B) pigmentation in cats with
short (A) and long (B) fur; black (C) and grey (D) cats that differ in genotype at the dilute
locus. The pure white pattern (E) is distinct from piebald spotting (F).
A: (Flickr-Bill Kuffrey-CC BY 2.0), B: (Wikipedia-Dieter Simon-PD), C: (Flickr-atilavelo-CC BY
2.0), D: (Flickr-Waldo Jaquith-CC BY-SA 2.0), E: (Wikipedia-Valerius Geng-CC BY-SA 3.0), F:
(Flickr-Denni Schnapp-CC BY-NC-SA 2.0) *Changes: Letters and descriptions were added on the
pictures


Table 2. Summary of simplified cat fur phenotypes and genotypes.

Trait               Phenotype                        Genotype            Comments
fur length          short                            LL or Ll            L is completely dominant
                    long                             ll
all white fur       100% white fur                   WW or Ww            If the cat has red eyes it is albino, not W_.
(non-albino)        <100% white fur                  ww                  W is epistatic to all other fur color genes;
                                                                         if cat is W_, can’t infer genotypes for any
                                                                         other fur color genes.
piebald spotting    > 50% white patches              SS                  S is incompletely dominant and shows
                    (but not 100%)                                       variable expressivity
                    < 50% white patches              Ss
                    no white patches                 ss
orange fur          all orange fur                   X^O X^O or X^O Y    O is X-linked
                    tortoise shell variegation       X^O X^B
                    no orange fur (often black)      X^B X^B or X^B Y
dilute              pigmentation is intense          Dd or DD            D is completely dominant
pigmentation        pigmentation is dilute           dd
                    (e.g. gray rather than black;
                    cream rather than orange;
                    light brown rather than brown)
tabby               tabby pattern                    AA or Aa            This is a simplification of the tabby
                    solid coloration                 aa                  phenotype, which involves multiple genes
sex                 female                           XX
                    male                             XY
Note: Phenotypes May Not Be As Expected from the Genotype

4. ENVIRONMENTAL FACTORS
The phenotypes described thus far have a nearly perfect correlation with their associated
genotypes; in other words an individual with a particular genotype always has the expected
phenotype. However, many (most?) phenotypes are not determined entirely by genotype alone.
Instead, they are determined by an interaction between genotype and environmental factors and
can be conceptualized in the following relationship:

Genotype + Environment ⇒ Phenotype (G + E ⇒ P)
Or:
Genotype + Environment + Interaction_GE ⇒ Phenotype (G + E + I_GE ⇒ P)
*GE = Genetics and Environment

This interaction is especially relevant in the study of economically important phenotypes, such as
human diseases or agricultural productivity. For


example, a particular genotype may pre-dispose an
individual to cancer, but cancer may only develop if
the individual is exposed to certain DNA-damaging
chemicals or carcinogens. Therefore, not all
individuals with the particular genotype will
develop the cancer phenotype, only those who
experience a particular environment.
Penetrance and Expressivity
The terms penetrance and expressivity are also
useful to describe the relationship between certain
genotypes and their phenotypes.

4.1. PENETRANCE
Penetrance is the proportion of individuals with a particular genotype that display a
corresponding phenotype (Figure 18). It is usually expressed as a percentage of the population.
Because all pea plants that are homozygous for the allele for white flowers (e.g. aa in Figure 2
of Chapter 12) actually have white flowers, this genotype is completely (100%) penetrant. In
contrast, many human genetic diseases are incompletely penetrant, since not all individuals with
the disease genotype actually develop symptoms associated with the disease (less than 100%).

Figure 18.
Relationship between penetrance and expressivity in eight individuals that all have a mutant
genotype. Penetrance can be complete (all eight have the mutant phenotype) or incomplete (only
some have the mutant phenotype). Amongst those individuals with the mutant phenotype the
expressivity can be narrow (very little variation) to broad (lots of variation).
(Original-Locke-CC BY-NC 3.0)

4.2. EXPRESSIVITY
Expressivity describes the variability in mutant
phenotypes observed in individuals with a
particular genotype (Figure 18 and Figure 19).
Many human genetic diseases provide examples of
broad expressivity, since individuals with the same
genotypes may vary greatly in the severity of their
symptoms. Incomplete penetrance and broad
expressivity are due to random chance, non-
genetic (environmental), and genetic factors
(mutations in other genes).
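
As a rough numerical illustration of these two terms, the Python sketch below computes penetrance as a percentage and summarizes expressivity as the range of phenotype severity among affected individuals. The severity scores (0 = wild-type appearance, higher = more strongly mutant), the two example groups, and the function names are hypothetical; describing expressivity by a min-max range is an assumption made for this example, not a standard measure taken from the text.

```python
def penetrance(scores):
    """Percentage of individuals carrying the genotype that show any mutant phenotype."""
    affected = [s for s in scores if s > 0]
    return 100.0 * len(affected) / len(scores)


def expressivity_range(scores):
    """(min, max) severity among affected individuals, or None if nobody is affected."""
    affected = [s for s in scores if s > 0]
    return (min(affected), max(affected)) if affected else None


# Hypothetical severity scores for eight individuals that all carry the mutant genotype.
complete_narrow = [5, 5, 5, 5, 5, 5, 5, 5]
incomplete_broad = [0, 1, 0, 4, 2, 0, 5, 3]

print(penetrance(complete_narrow), expressivity_range(complete_narrow))
print(penetrance(incomplete_broad), expressivity_range(incomplete_broad))
```

Keeping the two calculations separate mirrors the distinction in the text: penetrance asks whether the phenotype appears at all, while expressivity asks how much it varies among those individuals in which it does appear.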

Figure 19.
Mutations in wings of Drosophila melanogaster showing weak to strong expressivity.
(Original-J. Locke-CC;AN)

5. MENDELIAN PHENOTYPIC RATIOS MAY NOT BE AS EXPECTED
5.1. OTHER FACTORS

There are other factors that affect an organism’s phenotype and thus appear to alter Mendelian
inheritance.


(1) Genetic heterogeneity: There is more than one gene or genetic mechanism that can produce the
same phenotype.
(2) Polygenic determination: One phenotypic trait is controlled by multiple genes.
(3) Phenocopy: Organisms that do not have the genotype for trait A can also express trait A due to
environmental conditions; they do not have the same genotype but the environment simply “copies”
the genetic phenotype.
(4) Incomplete penetrance: Even though an organism possesses the genotype for trait A, the trait
might not be expressed in every such individual.
(5) Certain genotypes show a survival rate that is less than 100%. For example, genotypes that
cause death at the embryo or larval stage (recessive lethal mutations) will be underrepresented
when adult flies are counted.

5.2. THE χ2 TEST FOR GOODNESS-OF-FIT
For a variety of reasons, the phenotypic ratios observed from real crosses rarely match the exact
ratios expected based on a Punnett Square or other prediction techniques. There are many possible
explanations for deviations from expected ratios. Sometimes these deviations are due to sampling
effects, in other words, the random selection of a non-representative subset of individuals for
observation.
A statistical procedure called the chi-square (χ2) test can be used to help a geneticist decide
whether the deviation between observed and expected ratios is due to sampling effects, or whether
the difference is so large that some other explanation must be sought by re-examining the
assumptions used to calculate the expected ratio. The procedure for performing a chi-square test
is typically covered in the lab.
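
Although the full procedure is left to the lab, the core calculation can be sketched in a few lines of Python. The example below is a hedged illustration rather than the course's own protocol: the observed counts are hypothetical, the function name is an assumption, and the critical values are the standard chi-square table entries for p = 0.05 at 1 to 3 degrees of freedom.

```python
def chi_square_statistic(observed, expected_ratio):
    """Sum of (observed - expected)^2 / expected over all phenotypic classes."""
    total = sum(observed)
    ratio_total = sum(expected_ratio)
    expected = [total * r / ratio_total for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Standard chi-square critical values at p = 0.05, keyed by degrees of freedom.
CRITICAL_0_05 = {1: 3.841, 2: 5.991, 3: 7.815}

# Hypothetical counts for the four phenotypic classes of a dihybrid cross.
observed = [315, 108, 101, 32]
statistic = chi_square_statistic(observed, [9, 3, 3, 1])
degrees_of_freedom = len(observed) - 1

print(round(statistic, 3), statistic < CRITICAL_0_05[degrees_of_freedom])
```

If the statistic is below the critical value, the deviation from 9:3:3:1 is small enough to be attributed to sampling effects; if it is above, the assumptions behind the expected ratio (for example, independent assortment or the absence of gene interactions) deserve a second look.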


___________________________________________________________________________
SUMMARY:
• Phenotype depends on the alleles that are present, their dominance relationships, and sometimes also
interactions with the environment and other factors.
• The alleles of different loci are inherited independently of each other, unless they are genetically linked.
• Many important traits show continuous, rather than discrete variation. These are called quantitative traits.
• Many quantitative traits are influenced by a combination of environment and genetics.
• The expected phenotypic ratio of a dihybrid cross is 9:3:3:1, except in cases of linkage or gene interactions
that modify this ratio.
• Modified ratios from 9:3:3:1 are seen in the case of recessive and dominant epistasis, duplicate genes, and
complementary gene action. This usually indicates that the two genes interact within the same biological
pathway.
• There are other factors that alter the expected Mendelian ratios.
KEY TERMS:
Mendel’s Second Law
independent assortment
linkage
dihybrid
Modified Mendelian Ratios
9:3:3:1
9:3:4
12:3:1
recessive epistasis
dominant epistasis
complementary action
redundancy
duplicate gene action
Orange
long
dilute
White masking
piebald spotting
calico
Discrete variation
Continuous variation
Polygenic traits
G + E = P
penetrance
expressivity
recessive lethal mutations


STUDY QUESTIONS:
1) In the table on the opposite page, match the mouse hair color phenotypes with the term from the
list that best explains the observed phenotype, given the genotypes shown. In this case, the
allele symbols do not imply anything about the dominance relationships between the alleles. List
of terms: haplosufficiency, haploinsufficiency, pleiotropy, incomplete dominance, co-dominance,
incomplete penetrance, broad (variable) expressivity.

Answer questions 2-4 using the following biochemical pathway for fruit color. Assume all mutations
(lower case allele symbols) are recessive, and that either precursor 1 or precursor 2 can be used
to produce precursor 3. If the alleles for a particular gene are not listed in a genotype, assume
that they are wild-type.

2) If 1 and 2 and 3 are all colorless, and 4 is red, what will be the phenotypes associated with
the following genotypes?
a) aa
b) bb
c) dd
d) aabb
e) aadd
f) bbdd
g) aabbdd
h) What will be the phenotypic ratios among the offspring of a cross AaBb × AaBb?
i) What will be the phenotypic ratios among the offspring of a cross BbDd × BbDd?
j) What will be the phenotypic ratios among the offspring of a cross AaDd × AaDd?
3) If 1 and 2 are both colorless, and 3 is blue and 4 is red, what will be the phenotypes
associated with the following genotypes?
a) aa
b) bb
c) dd
d) aabb
e) aadd
f) bbdd
g) aabbdd
h) What will be the phenotypic ratios among the offspring of a cross AaBb × AaBb?
i) What will be the phenotypic ratios among the offspring of a cross BbDd × BbDd?
j) What will be the phenotypic ratios among the offspring of a cross AaDd × AaDd?
4) If 1 is colorless, 2 is yellow and 3 is blue and 4 is red, what will be the phenotypes
associated with the following genotypes?
a) aa
b) bb
c) dd
d) aabb
e) aadd
f) bbdd
g) aabbdd
h) What will be the phenotypic ratios among the offspring of a cross AaBb × AaBb?
i) What will be the phenotypic ratios among the offspring of a cross BbDd × BbDd?
j) What will be the phenotypic ratios among the offspring of a cross AaDd × AaDd?
5) Which of the situations in questions 2 – 4 demonstrate epistasis?
6) If the genotypes written within the Punnett Square are from the F2 generation, what would be
the phenotypes and genotypes of the F1 and P generations for:
a) Figure 5
b) Figure 7
c) Figure 9
d) Figure 11
7) To better understand how genes control the development of three-dimensional structures, you
conducted a mutant screen in Arabidopsis plant and identified a recessive point mutation allele of
a single gene (g) that causes leaves to develop as narrow tubes rather than the broad


flat surfaces that develop in wild-type (G). Allele g causes a complete loss of function. Now you
want to identify more genes involved in the same process. Diagram a process you could use to
identify other genes that interact with gene g. Show all of the possible genotypes that could
arise in the F1 generation.
8) With reference to question 7, if the recessive allele, g, is mutated again to make allele g*,
what are the possible phenotypes of a homozygous g* g* individual?
9) Again, in reference to question 8, what are the possible phenotypes of a homozygous aagg
individual, where a is a recessive allele of a second gene? In each case, also specify the
phenotypic ratios that would be observed among the F1 progeny of a cross of AaGg x AaGg.
10) Calculate the phenotypic ratios from a dihybrid cross involving the two loci shown in
Figure 17. There may be more than one possible set of ratios, depending on the assumptions you
make about the phenotype of allele b.
11) Use the product rule to calculate the phenotypic ratios expected from a trihybrid cross.
Assume independent assortment and no epistasis/gene interactions.

Table for Question 1

     A1A1               A1A2                                   A2A2
1    all hairs black    on the same individual: 50% of         all hairs white
                        hairs are all black and 50% of
                        hairs are all white
2    all hairs black    all hairs are the same shade of grey   all hairs white
3    all hairs black    all hairs black                        50% of individuals have all white
                                                               hairs and 50% of individuals have
                                                               all black hairs
4    all hairs black    all hairs black                        mice have no hair
5    all hairs black    all hairs white                        all hairs white
6    all hairs black    all hairs black                        all hairs white
7    all hairs black    all hairs black                        hairs are a wide range of shades
                                                               of grey

PHYSICAL MAPPING OF CHROMOSOMES AND GENOMES– CHAPTER 27

CHAPTER 27 – PHYSICAL MAPPING OF CHROMOSOMES AND GENOMES

Figure 1.
Genetic map of human chromosome 1
showing a region expanded to the point of
showing the genes within that region.
(Wikipedia- SpelgroepPhoenix-CC BY-SA 3.0)

INTRODUCTION

Chromosomes are long duplex molecules of DNA that are either linear or circular and composed of a
relatively constant sequence of nucleotides. There are three different ways of describing the
linear contents of a chromosome: (1) genetic map, (2) cytogenetic map, and (3) physical map
(ultimately the sequence).

1. GENETIC MAP (DISTANCE IN CM, RECOMBINATION FREQUENCY)

In Chapter 18, we described the units of genetic distance (map units / centiMorgans, cM) and how
this relates to recombination frequency. We can use this information in order to produce a genetic
map, which is a “map” that shows the locations of genes along a linear chromosome. Note that map
distances are always calculated for one pair of loci at a time. However, by combining the results
of multiple pair-wise calculations, a genetic map of many loci on a chromosome can be produced
(Figure 2). A genetic map shows the map distance, in cM, that separates any two loci, and the
position of these loci relative to all other mapped loci. The genetic map distance is roughly
proportional to the physical distance, i.e. the amount of DNA between two loci. For example, in
Arabidopsis, 1.0 cM corresponds to approximately 150,000 bp and contains approximately 50 genes.
The exact number of DNA bases in a cM depends on the organism, and on the particular position in
the chromosome. Some parts of chromosomes (“crossover hot spots”) have higher rates of


2. CYTOGENETIC MAP
Each eukaryotic species has its nuclear genome
divided among a number of chromosomes that is
characteristic of that species. For example, a
haploid human nucleus (i.e. sperm or egg) normally
has 23 chromosomes (n=23), and a diploid human
nucleus has 23 pairs of chromosomes (2n=46). A
karyotype is the complete set of chromosomes of
an individual. In Figure 3, the cell was in metaphase
so each of the 46 structures is a replicated
chromosome even though it is hard to see the two
sister chromatids for each chromosome at this
resolution. As expected there are 46 chromosomes.
Note that the chromosomes have different lengths.
In fact, human chromosomes were named based upon this feature. Our largest chromosome is called 1, our next longest is 2, and so on.

2.1. CENTROMERE LOCATION
A chromosome has telomeres and a centromere, which are usually in a heterochromatic state. The centromere consists of DNA sequences that are bound by centromeric proteins, which link the centromere to microtubules. The centromere can be in the middle (metacentric), near the middle (submetacentric), near the end (acrocentric), or at the end (telocentric), or the entire chromosome can act as a centromere (holocentric). Telomeres are repetitive sequences, like TTAGGG, at the ends of the chromosomes that help maintain the length of the chromosome. Another feature is that a chromosome has a p arm (petite = small) and a q arm (queue = tail, or just the next letter in the alphabet).*

Table 1. Table showing four types of centromere location.
(Original-Harrington/Kang-CC BY-NC 3.0)

Figure 2.
Genetic maps for regions of two chromosomes from two species of the silk moth, Bombyx. The scale at left shows distance in cM, and the position of various loci is indicated on each chromosome. Diagonal lines connecting loci on different chromosomes show the position of corresponding loci in different species. These are referred to as regions of conserved synteny.
(NCBI-NIH-PD)

recombination than others, while other regions have reduced crossing over and often correspond to large regions of heterochromatin.
When a novel gene or locus is identified by mutation or polymorphism, its approximate position on a chromosome can be determined by crossing it with previously mapped genes, and then calculating the recombination frequency. If the novel gene and the previously mapped genes show complete or partial linkage, the recombination frequency will indicate the approximate position of the novel gene within the genetic map. This information is useful in isolating (i.e. cloning) the specific fragment of DNA that encodes the novel gene, through a process called map-based cloning. Genetic maps are also useful to track genes/alleles in breeding crops and animals, in studying evolutionary relationships between species, and in determining the causes and individual susceptibility of some human diseases.
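
The calculation of recombination frequency mentioned above can be sketched in a few lines of Python. This is only an illustration: the function names, the locus names (N, A, B) and the progeny counts are hypothetical, and the conversion of 1% recombination to 1 cM is an approximation that holds best for closely linked loci.

```python
def recombination_frequency(parental: int, recombinant: int) -> float:
    """Fraction of scored progeny that are recombinant for a pair of loci."""
    total = parental + recombinant
    if total == 0:
        raise ValueError("no progeny were scored")
    return recombinant / total


def map_distance_cm(parental: int, recombinant: int) -> float:
    """Approximate map distance in centiMorgans (1 cM is taken as 1% recombination)."""
    return 100.0 * recombination_frequency(parental, recombinant)


# Hypothetical test-cross counts for a novel locus N against two mapped loci.
# Each entry is (parental progeny, recombinant progeny).
counts = {
    "N vs A": (460, 40),
    "N vs B": (380, 120),
}

distances = {pair: map_distance_cm(*c) for pair, c in counts.items()}
for pair, distance in distances.items():
    print(f"{pair}: {distance:.1f} cM")
```

With these made-up counts, N would be placed much closer to A than to B, which is the kind of information used to position a new locus on an existing genetic map.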

*
See https://thednaexchange.com/2011/05/02/p-q-solved-being-the-true-story-of-how-the-chromosome-got-its-name/


Figure 3.
Karyogram of a normal human male karyotype.
(Wikipedia-NHGRI-PD)

2.2. KARYOGRAM
By convention the chromosomes are arranged into the pattern shown in Figure 3 and the resulting image is called a karyogram. A karyogram allows a geneticist to determine a person's karyotype, a written description of their chromosomes including anything out of the ordinary. Therefore, a karyotype is a description of the complete set of chromosomes, and a karyogram is an image that visually describes the karyotype.

2.3. BANDING
Various stains and fluorescent dyes, like trypsin + Giemsa and quinacrine, are used to produce characteristic banding patterns that distinguish all 23 chromosomes. Each arm is first divided into regions, the regions are divided into bands, and the bands are further divided into sub-bands. Notice that the band numbers start from the centromere and extend towards the tip of each arm (Figure 4). The number of chromosomes varies between species, but there appears to be very little correlation between chromosome number and either the complexity of an organism or its total amount of genomic DNA.

Figure 4.
Fictional diagram of a human chromosome and its bands. A chromosome has a p and a q arm, which are both divided into regions. These regions are divided into bands, and these bands are subdivided into sub-bands. The bands are numbered away from the centromere, and sub-band numbering restarts within each band. Notice that this fictional diagram was made for educational purposes.
(Original-Kang-CC BY-NC 3.0)

3. PHYSICAL MAP (DNA SEQUENCE, RESTRICTION SITES)

3.1. BASICS
The ultimate physical map is an accurate representation of the DNA sequence of a genome. These days that sequence is usually held in a computer database and is accessible via the Internet. This wasn’t always the case. The first genome sequences were constructed from a series of large, cloned physical fragments of DNA. The map was therefore made from physical entities (pieces of DNA) rather than abstract concepts such as the linkage frequencies between genes that make up a genetic map. It is usually possible to correlate genetic and physical maps, for example by identifying the clone that contains a particular molecular marker. The connection between physical and genetic maps allows the genes underlying particular mutations to be identified through a process called map-based cloning.

3.2. CONTIG CONSTRUCTION
To create a physical map, large fragments of the genome are cloned into plasmid vectors, or into larger vectors called bacterial artificial chromosomes (BACs). BACs can contain


approximately 100 kb fragments. Typically the set of sequences in a BAC clone library will contain redundant, overlapping sequences, meaning that different clones will contain DNA from the same part of the genome. Because of these overlaps, it is possible to select the minimum set of clones that represent the entire genome, and to order these clones with respect to the sequence of the original chromosome. Note that this is all done without knowing the complete sequence of each BAC. A set of overlapping clones is called a contig. Making a contig map can rely on techniques related to Southern blotting: DNA from the ends of one BAC is used as a probe to find other BAC clones that contain the same sequence. These clones are then assumed to overlap each other. This process of finding overlaps can progress to position all the clones into overlapping series that span the genome. Also, if we already know the sequence of one strain of a simple organism, it can be used as a reference for mutant strains to identify the differences in the sequences.
A small genome, like that of phage lambda, is only 48 kb long, but most chromosomes are megabases (Mb) long. Currently, the only way to construct physical maps of large regions is through the joining of smaller regions to map a larger portion or the whole of the chromosome. To do this, multiple copies of the chromosome are broken down with restriction enzymes into small pieces of different lengths and with different end points, so that the pieces partially overlap with each other. The successive overlaps of the fragments eventually form a map of the whole chromosome. This contiguous assembly of clones is called a contig.

Figure 5.
A portion of the physical map for human chromosome 4. The entire chromosome is shown at left. The physical map is derived from the small blue lines, each of which represents a cloned piece of DNA approximately 100 kb in length.
(NCBI-unknown-PD)

3.3. RESTRICTION MAPPING PROCEDURE
Restriction mapping is an inexpensive, quick, and easy method to describe a sample of cloned DNA. It is preferred over DNA sequencing for these reasons, but the sequence is still the ultimate description.
Restriction mapping is the technique for identifying the location of restriction sites, relative to other sites, on a DNA molecule. Typically a sample of purified cloned DNA is aliquoted into several tubes and each is treated with a different restriction enzyme or combination of enzymes. The digested samples are then separated by agarose gel electrophoresis and the restriction fragment sizes are determined by comparison to known size markers. By trial and error, the fragments can be assembled like a linear jigsaw puzzle into a map of the restriction sites, called a restriction map (Figure 7). One can increase the resolution of the restriction map by mapping more restriction sites.

3.4. USES OF A RESTRICTION MAP
Restriction mapping is a quick, easy and inexpensive way to characterize and distinguish DNA samples without actually sequencing the DNA. A sequence can be represented by its series of restriction sites, and using this knowledge one can tell whether the DNA of interest is similar to or different from other samples by comparing how much their restriction maps overlap. Also, restriction sites offer positions for convenient manipulation of the DNA. Restriction fragments that contain the gene of interest can be cut out, and once the gene is purified from the fragments, it can be sequenced or used as a probe. This is the reason why restriction mapping is still routinely used today, even though sequencing technologies allow us to sequence the whole genome.


Figure 6.
A series of overlapping cloned sequences can
be combined to eventually span much larger
regions, including whole chromosomes.
(Original-Locke-CC BY-NC 3.0)
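
The ordering of overlapping clones into a contig can be sketched as a walk along a chain of pairwise overlaps. The sketch below assumes a simplified situation that the text does not guarantee: each clone overlaps only its immediate neighbours, so the overlap data form a single linear chain. The clone names and the overlap table are hypothetical, standing in for the results of end-probe hybridizations.

```python
def order_contig(overlaps: dict[str, set[str]]) -> list[str]:
    """Order clones into a linear contig from pairwise overlap data.

    Assumes every clone overlaps at most two others, so the overlaps
    form one simple chain with exactly two end clones.
    """
    if any(len(neighbours) > 2 for neighbours in overlaps.values()):
        raise ValueError("a clone overlaps more than two others")
    ends = sorted(c for c, neighbours in overlaps.items() if len(neighbours) == 1)
    if len(ends) != 2:
        raise ValueError("overlap data do not form a single linear chain")

    order = [ends[0]]
    previous, current = None, ends[0]
    while True:
        # Step to the neighbour we did not just come from.
        next_clones = [c for c in overlaps[current] if c != previous]
        if not next_clones:
            break
        previous, current = current, next_clones[0]
        order.append(current)

    if len(order) != len(overlaps):
        raise ValueError("some clones are not connected to the contig")
    return order


# Hypothetical hybridization results: which clones share sequence with which.
hits = {
    "BAC1": {"BAC4"},
    "BAC4": {"BAC1", "BAC2"},
    "BAC2": {"BAC4", "BAC3"},
    "BAC3": {"BAC2"},
}

contig = order_contig(hits)
print(" - ".join(contig))
```

Real libraries are redundant, so a clone usually overlaps several others; selecting a minimal tiling set from such data needs more than this simple walk.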

Figure 7.
By looking at the size of the fragments produced by one restriction enzyme or combination of the restriction enzymes, the
location and the order of the restriction site on a chromosome can be identified, forming a restriction map.
(Original-Locke-CC BY-NC 3.0)
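
The trial-and-error step in restriction mapping amounts to proposing a map and checking that it predicts the fragment sizes seen on the gel. A minimal sketch of that check follows; the molecule length, the enzyme names EnzA and EnzB, the site positions and the observed sizes are all hypothetical values chosen for illustration.

```python
def digest(length: int, sites: list[int]) -> list[int]:
    """Sorted fragment sizes from cutting a linear molecule at the given positions."""
    cuts = sorted(set(sites))
    if any(s <= 0 or s >= length for s in cuts):
        raise ValueError("cut sites must lie inside the molecule")
    edges = [0] + cuts + [length]
    return sorted(b - a for a, b in zip(edges, edges[1:]))


# Hypothetical 10,000 bp linear molecule and a candidate map (positions in bp).
LENGTH = 10_000
candidate_map = {"EnzA": [2000], "EnzB": [5000, 9000]}

predicted = {
    "EnzA": digest(LENGTH, candidate_map["EnzA"]),
    "EnzB": digest(LENGTH, candidate_map["EnzB"]),
    "EnzA+EnzB": digest(LENGTH, candidate_map["EnzA"] + candidate_map["EnzB"]),
}

# Hypothetical fragment sizes read from the gel for each digest.
observed = {
    "EnzA": [2000, 8000],
    "EnzB": [1000, 4000, 5000],
    "EnzA+EnzB": [1000, 2000, 3000, 4000],
}

consistent = all(predicted[name] == observed[name] for name in observed)
print("candidate map fits the gel:", consistent)
```

The double digest is what fixes the relative order of sites: a map with the EnzA site at the other end of the molecule gives the same single-digest sizes but different double-digest sizes.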


SUMMARY:
• There are different types of chromosome maps: genetic (recombination), cytogenetic (metaphase
chromosome), and physical maps.
• Recombination frequency is usually proportional to the distance between loci, and so recombination
frequencies can be used to create genetic maps.
• Chromosomes can be distinguished cytologically based on their length, centromere position, and
banding patterns when stained with dyes.
• Single clones can be restriction mapped and then combined into a contig that represents a larger
region of DNA, ultimately the whole chromosome.
• The ultimate physical map is the DNA sequence of the whole chromosome or genome.
KEY TERMS:
map units karyogram
centiMorgans contig
genetic map physical map
recombination frequency restriction map
map-based cloning contig construction
karyotype


STUDY QUESTIONS:
1) Three loci are linked in the order B-C-A. If the A-B map distance is 1cM, and the B-C map distance is 0.6cM,
given the lines AaBbCc and aabbcc, what will be the frequency of Aabb genotypes among their progeny if
one of the parents of the dihybrid had the genotypes AABBCC?
2) Given the restriction digests and with the fragment sizes shown in the gel diagram, can you
construct a map of this linear DNA molecule (Lambda DNA)?

TIPS ON SOLVING RESTRICTION MAPPING QUESTIONS
1) Start with the Nar and Apa digests, each has only one site. This will help get a simple starting map.
2) Next, try and add in the Cvn sites using the double digests with Nar and Cvn.
3) Next, try and add in the Kpn sites using the double digests with Apa and Cvn.
4) There is no formal method to solve these maps. Instead, think of them like a jigsaw puzzle (only linear)
and use trial and error to solve the puzzle.
5) Use the class notes for help.


CHAPTER 28 – RESTRICTION MAPPING AND GEL ELECTROPHORESIS

Figure 1.
Restriction enzymes that are
available on a vending machine.
(Flickr- Jun Seita CC BY-NC 2.0)

INTRODUCTION
Molecular Genetics techniques involve the isolation, purification, and manipulation of DNA. DNA can come in the form of genomic DNA, plasmids, or oligonucleotides.

1. ISOLATING DNA
DNA purification strategies rely on the chemical properties of DNA that distinguish it from other molecules in the cell, namely that it is a very long, negatively charged molecule. To extract purified DNA from a tissue sample, cells are broken open by (1) grinding or lysing in a solution that contains chemicals that protect the DNA while disrupting other components of the cell (Figure 2). These chemicals may include detergents, which dissolve lipid membranes and denature proteins. A cation such as Na+ helps to stabilize the negatively charged DNA and separate it from proteins such as histones. (2) A chelating agent, such as EDTA, is added to protect DNA by sequestering Mg2+ ions, which can otherwise serve as a necessary co-factor for nucleases (enzymes that digest DNA). As a result, free, double-stranded DNA molecules are released from the chromatin into the extraction buffer, which also contains proteins and all other cellular components. (The basics of this procedure can be done with household chemicals and are presented on YouTube.)
The free DNA molecules are subsequently isolated by one of several methods. (3) Commonly, proteins are removed by adjusting the salt concentration so they precipitate. (4) The supernatant, which contains DNA and other, smaller metabolites, is then mixed with ethanol, which causes the DNA to precipitate. (5) A small pellet of DNA can be collected by centrifugation, and (6) after removal of the ethanol, the DNA pellet can be dissolved in water (usually with a small amount of EDTA and a pH buffer) for use in other reactions. Note that this process has purified all of the DNA from a tissue sample; if we want to further isolate a specific gene or DNA fragment, we must use additional techniques, such as PCR.


Figure 2.
Extraction of DNA from a mixture of solubilized cellular components by successive precipitations. Proteins are precipitated, then DNA (in the supernatant) is precipitated in ethanol, leaving a pellet of DNA.
(Original-Deyholos-CC BY-NC 3.0)

Figure 3.
An EcoRI dimer (blue, purple) sits like a saddle on a double helix of DNA (one strand is green, one is brown). This image is looking down the center of the helix.
(NCBI-?-PD)

Figure 4.
Each recognition sequence is cleaved by a restriction enzyme. The enzyme can either cut the DNA at a position offset from the center of the restriction site, creating an overhanging region, or cut directly in the middle to create blunt ends.
(Original-Deyholos/Kang-CC BY-NC 3.0)

Figure 5.
The recognition sequence for EcoRI (blue) is cleaved by the enzyme (grey). This particular enzyme cuts DNA at a position offset from the center of the restriction site. This creates an overhanging, sticky-end.
(Original-Deyholos-CC BY-NC 3.0)

2. RESTRICTION ENZYMES AND DNA METHYLATION

2.1. RESTRICTION ENZYMES
Many bacteria have enzymes that recognize specific DNA sequences (usually 4 or 6 nucleotides) and then cut the double stranded DNA helix at this sequence via hydrolysis of the phosphodiester backbone. These enzymes are called site-specific restriction endonucleases, or more simply “restriction enzymes”, and they naturally function as part of bacterial defenses against viruses and other sources of foreign DNA.
Researchers use restriction enzymes that have been purified from various bacterial species, and which can be purchased from various commercial sources. These enzymes are usually named after the bacterium from which they were first isolated. For example, EcoRI (Figure 3) and EcoRV are both enzymes from E. coli. EcoRI (pronounced eco-r-1) cuts double stranded DNA at the recognition sequence 5’-GAATTC-3’, but note that this enzyme, like many others, does not cut in exactly the middle of the restriction sequence (Figure 4). The ends of a molecule cut by EcoRI have an overhanging region of single stranded DNA, and so are sometimes called sticky-ends. These “sticky ends” are short stretches of complementary bases that anneal together and aid in the formation of recombinant molecules. On the other hand, EcoRV is an example of an enzyme that cuts both strands in exactly the middle of its recognition sequence 5’-


GATATC-3’, producing what are called blunt-ends, which lack overhangs: 5’-GAT ATC-3’.
Many restriction sites exist in a genome, and it takes time for the enzymes to cut all of them. If the concentration of the enzyme or the exposure time is low, the result is a partial digest, which produces longer fragments. If the DNA is exposed to the restriction enzyme for long enough, the result is a complete digest, which produces shorter fragments.

2.2. DNA METHYLATION
Bacteria keep their DNA safe from their own restriction enzymes by methylating it (adding a CH3 group) using a methyl transferase (methylase) enzyme. For each different restriction enzyme, a matching methylase is produced to methylate the host DNA. After each round of DNA replication, the methylase has to methylate the newly synthesized DNA.

3. DNA LIGATION
The process of DNA ligation occurs when DNA strands are covalently joined end-to-end, forming a phosphodiester bond between the 5’ phosphate end and the 3’ hydroxyl end through the action of an enzyme called DNA ligase. Typically, sticky-ended molecules with complementary overhanging sequences (compatible ends) facilitate their joining to form recombinant DNA. Likewise, two blunt-ended sequences are also considered compatible, although they do not ligate together as efficiently as sticky-ends. Sticky-ended molecules with non-complementary sequences will not be ligated together by DNA ligase.
This joining of two DNA fragments by DNA ligase is essential when connecting Okazaki fragments during DNA replication, or when repairing breaks in either single or double stranded DNA molecules during recombination. Therefore, a mutation in a gene that encodes a ligase enzyme will have a severe negative impact on the organism. In molecular genetics, ligation is particularly important as DNA ligase facilitates the insertion of a double stranded DNA fragment into a plasmid vector: Ligation is therefore central to the production of recombinant DNA.

4. AGAROSE GEL ELECTROPHORESIS

4.1. BASICS
A solution of DNA is colorless, and except for being viscous at high concentrations, is visually indistinguishable from water. Therefore, techniques such as gel electrophoresis have been developed to detect and analyze DNA (Figure 6). This analysis starts when a solution of DNA is deposited at one end of a gel slab. This gel is made from polymers such as agarose, which is a polysaccharide isolated from seaweed. The molecules that compose the gel are linked by hydrogen bonds, not covalent bonds, so the experimenter can mold the shape of the gel by heating and cooling. The DNA is then forced through the gel by an electrical current, with DNA molecules moving toward the positive electrode (Figure 7). This is because the phosphate backbone of DNA or RNA has a negative charge. Therefore, rather than moving in a vertical manner, the DNA or RNA molecule will move by its horizontal side.

Figure 6.
Apparatus for agarose gel electrophoresis. A waterproof tank is used to pass current through a slab gel, which is submerged in a buffer in the tank. The current is supplied by an adjustable power supply. A gel (stained blue by a dye sometimes used when loading DNA on the gel) sits in a tray, awaiting further analysis, such as photography under a UV light source.
(Flickr- camerazn - CC BY 2.0)
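
The cutting behaviour described in section 2.1, and the difference between sticky and blunt ends that matters for ligation in section 3, can be sketched in code. This is a minimal illustration that looks only at the top strand of a short, made-up sequence; the table records the standard cut positions G^AATTC for EcoRI and GAT^ATC for EcoRV, and the function names are hypothetical.

```python
# name: (recognition sequence, cut offset within the site on the top strand)
ENZYMES = {
    "EcoRI": ("GAATTC", 1),  # G^AATTC: offset cut, leaves overhangs
    "EcoRV": ("GATATC", 3),  # GAT^ATC: central cut, leaves blunt ends
}


def cut_positions(seq: str, enzyme: str) -> list[int]:
    """Top-strand cut positions for every occurrence of the enzyme's site."""
    site, offset = ENZYMES[enzyme]
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def fragments(seq: str, enzyme: str) -> list[str]:
    """Top-strand fragments produced by a complete digest of a linear molecule."""
    edges = [0] + cut_positions(seq, enzyme) + [len(seq)]
    return [seq[a:b] for a, b in zip(edges, edges[1:])]


def end_type(enzyme: str) -> str:
    """'blunt' if the enzyme cuts in the exact middle of its site, else 'sticky'."""
    site, offset = ENZYMES[enzyme]
    return "blunt" if 2 * offset == len(site) else "sticky"


# Hypothetical sequence containing one EcoRI site and one EcoRV site.
dna = "TTGAATTCCAGATATCGG"

for name in ENZYMES:
    print(name, end_type(name), fragments(dna, name))
```

A partial digest could be modelled by cutting at only a subset of the positions returned by `cut_positions`, which gives fewer, longer fragments.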


Figure 7.
Agarose gel electrophoresis. DNA is loaded into wells at the top of a gel. A current is passed through the gel, pulling DNA towards the positively charged electrode. The DNA fragments are separated by size, with smaller fragments moving fastest towards the electrode. The left lane is a series of known size marker fragments, while the centre and right lanes contain fragments whose size can be approximated in comparison to the known marker sizes.
(Wikipedia-Magnus Manske-PD)

Figure 8.
A photograph of an agarose gel stained with ethidium bromide and illuminated by UV light. The stain associated with DNA is fluorescent orange. Each band represents a different sized cloned fragment. As in the previous figure, the size of the unknown fragments can be determined by comparison to the known marker fragments on the left.
(Wikipedia-Transcontrol- CC BY-SA 3.0)

As it migrates, each piece of DNA threads its way through the pores, which form between the polymers in the gel. Note that the mobility of the molecules is affected by the molecular weight, which with a linear molecule like DNA is primarily length. Because shorter pieces can move through these pores faster than longer pieces, gel electrophoresis separates molecules based on their size (length), with shorter DNA pieces moving faster than longer ones. Circular DNA molecules like plasmids move according to their conformation. Open circular dsDNA will move slower than super-coiled dsDNA of the same length because the super-coiled form is more compact. DNA molecules of a similar size migrate at a similar rate and thus will arrive at a similar location in each gel, called a band. This feature makes it easy to see DNA of a specific size after staining with a fluorescent dye, such as ethidium bromide (EtBr), an intercalating agent that inserts between adjacent bases (Figure 8). By separating a mixture of DNA molecules of known size (size markers) in adjacent lanes on the same gel, the length of an uncharacterized DNA fragment can be estimated. Gel segments containing the DNA bands can also be cut out of the gel, and the size-selected DNA extracted and used in other types of reactions, such as sequencing and cloning.

4.2. GENOMIC VS CLONED DNA ON A GEL
Samples of cloned DNA will have few bands, while genomic DNA will appear as a smear of DNA that represents the many sized fragments present in a whole genome. Since the fragments that are run on the gel represent the whole genome, there are minimal gaps between the lengths of the various DNA fragments, and they therefore appear as a continuous smear on the gel rather than as separate bands.

Figure 9.
Genomic DNA of rice on an agarose gel produces a smear of DNA that represents the fragments in a whole genome, whereas the cloned DNA in the lane on the far right has sharp, well separated bands.
(Flickr-IRRI Photos-CC BY-NC-SA 2.0)

5. OTHER APPLICATIONS OF GEL ELECTROPHORESIS

5.1. SEPARATION OF RNA BY GEL ELECTROPHORESIS.
Like DNA, samples of RNA can be separated on a gel. The major difference between RNA and DNA is that RNA is single stranded and DNA is double stranded. Therefore, RNA molecules are susceptible to intramolecular base pairing which


produces a secondary structure that forms loops. This can affect its mobility and can therefore provide wrong information regarding its size. Also, the bands are going to look less sharp. Therefore, denaturing agents have to be used when running RNA on the gel in order to break the hydrogen bonds that hold the secondary structure of RNA together. This way, most of the secondary structure of RNA can be prevented. Some of the molecules might still retain their secondary structure and some might even re-form it. This is why EtBr, which is an intercalating agent that inserts between two planar bases, still works on RNA molecules but with a lower efficiency. Therefore, in order to produce a signal of similar quality to that of DNA on a gel, a lot more RNA has to be used.
RNA gel electrophoresis can be used when scientists want to identify the existence of certain mRNAs, and therefore certain genes that are expressed in the cell, compared to other cell types or to the same cell type at a different stage in life. Also, RNA molecules can be quantified so the level of gene expression can be identified as well. Finally, RNA can be purified from a gel just like DNA.

5.2. CONTAMINATION IN GEL ELECTROPHORESIS
DNA samples don’t always separate correctly in agarose gel electrophoresis. Typically, this happens because the DNA sample is contaminated with other macromolecules or chemicals. Figure 10 shows the consequences of various forms of contamination.

Figure 10.
The consequences of various forms of contamination on the separation of DNA in an agarose gel.
(Original-Harrington-CC BY-NC 3.0)

6. RESTRICTION MAPPING

6.1. PROCEDURE
Restriction mapping is the technique of identifying the location of restriction sites, relative to other sites, on a DNA molecule. Typically, a sample of purified plasmid DNA is aliquoted into several tubes and each is treated with a different restriction enzyme or combination of enzymes. The digested samples are then separated by agarose gel electrophoresis and their restriction fragment sizes are determined. By trial and error, the fragments can be assembled like a linear jigsaw puzzle into a map of the restriction sites, called a restriction map.
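
Fragment sizes are read off a gel by comparing each band with the size markers run alongside it (section 4.1). One common way to do this, sketched below, assumes that the logarithm of fragment length falls roughly linearly with migration distance over the useful range of the gel; that relationship is an approximation, and the marker sizes and distances here are hypothetical.

```python
import math


def fit_standard_curve(markers: list[tuple[float, float]]) -> tuple[float, float]:
    """Least-squares line of log10(size in bp) against migration distance.

    `markers` holds (distance, size) pairs for the size-marker lane.
    Returns (slope, intercept).
    """
    xs = [distance for distance, _ in markers]
    ys = [math.log10(size) for _, size in markers]
    n = len(markers)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_size(distance: float, slope: float, intercept: float) -> float:
    """Estimated fragment size in bp for a band at the given distance."""
    return 10 ** (slope * distance + intercept)


# Hypothetical marker lane: (migration distance in mm, fragment size in bp).
marker_lane = [(10.0, 10_000), (30.0, 1_000), (50.0, 100)]

slope, intercept = fit_standard_curve(marker_lane)
unknown_band_mm = 40.0
estimated_bp = estimate_size(unknown_band_mm, slope, intercept)
print(f"band at {unknown_band_mm} mm is roughly {estimated_bp:.0f} bp")
```

Estimates are most reliable for bands that fall between two markers; extrapolating beyond the marker range is much less trustworthy.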


Figure 11.
By looking at the size of the fragments produced by one restriction enzyme or combination of the restriction enzymes, the
location and the order of the restriction site on a chromosome can be identified, forming a restriction map.
(Original-Locke-CC BY-NC 3.0)

6.2. USES
Restriction mapping is a quick, easy and inexpensive way to characterize and distinguish DNA samples without actually sequencing the DNA. A sequence can be represented by its series of restriction sites, and using this knowledge one can tell whether the DNA of interest is similar to or different from other samples by comparing how much their restriction maps overlap. Also, restriction sites offer positions for convenient manipulation of the DNA. Restriction fragments that contain the gene of interest can be cut out, and once the gene is purified from the fragments, it can be sequenced or used as a probe. This is the reason why restriction mapping is still routinely used today, even though sequencing technologies allow us to sequence the whole genome.


SUMMARY:
• Restriction enzymes are natural endonucleases used in molecular biology to cut DNA sequences at
specific sites.
• DNA fragments with compatible ends can be joined together through ligation. If the ligation produces a
sequence not found in nature, the molecule is said to be recombinant.
• DNA or RNA molecules can be identified, quantified, and separated on an electrophoresis gel.
• Contaminants in DNA samples, such as RNA, salt, or protein, can affect the banding pattern on an
electrophoresis gel.
KEY TERMS:
lysing blunt-ends
detergents DNA methylation
chelating agent DNA ligation
EDTA DNA ligase
nucleases compatible ends
supernatant gel electrophoresis
pellet agarose
restriction endonucleases band
restriction enzymes ethidium bromide
EcoRI size markers
EcoRV restriction map
sticky-ends


STUDY QUESTIONS:
1) A 6.0 kbp PCR fragment flanked by recognition
sites for the HindIII restriction enzyme is cut
with HindIII then ligated into a 3kb plasmid
vector that has also been cut with HindIII. This
recombinant plasmid is transformed into E. coli.
From one colony a plasmid is prepared and
digested with HindIII.
a) When the product of the HindIII digestion is
analyzed by gel electrophoresis, what will
be the size of the band(s) observed?
b) What bands would be observed if the
recombinant plasmid was instead cut with
EcoRI, which has only one site, directly in
the middle of the PCR fragment?
c) What band(s) would be observed if the
recombinant plasmid was cut with both
EcoRI and HindIII at the same time?
2) You add ligase to a reaction containing a sticky-
ended plasmid and sticky-ended insert
fragment, which both have compatible ends.
Unbeknownst to you, someone in the lab left
the stock of ligase enzyme out of the freezer
overnight and it degraded (no longer works).
Explain in detail what will happen in your
ligation experiment in this situation should you
try and transform with it.
3) Which would move faster during agarose gel
electrophoresis, a 1.0 kbp duplex DNA molecule
or a 1,000 nt single-stranded RNA molecule?


CHAPTER 29 – RECOMBINANT DNA

Figure 1.
The manipulation of DNA often
involves small quantities of liquids
that are accurately dispensed
using micro-pipettors into small
plastic microfuge tubes. Volumes
as small as 0.5 µl are routinely
dispensed for some reactions. The
use of clean, sterile plastic tubes
and tips ensures the reactions
work correctly and are
reproducible.
(Flickr- University of Michigan
School of Natural Resources and
Environment-CC BY 2.0)

INTRODUCTION
Recombinant DNA is a general term to describe DNA that has been manipulated (recombined) somehow in vitro. It typically involves the breakage of DNA into fragments, using restriction enzymes, and the rejoining (ligation) of these fragments into various arrangements and into vectors, such as plasmids, to propagate the new arrangement for further analysis, like sequencing, or for insertion into other hosts, such as model organisms, as transgenes.

1. BASIC TERMINOLOGY
Before proceeding any further, there are some basic terms that students should know regarding recombinant DNA technology.
in vivo (in life): experiments done within a living cell/organism.
in situ (in place): experiments done on cells and structures removed intact from an organism (ex. inserting RNA into a frog egg cell on a petri dish).
in vitro (in glass): experiments done on individual molecules removed from an organism (ex. DNA in a test tube). These days most experiments are done in plastico (in plastic). See Figure 1.
in silico (in silicon): experiments done within a computer simulation.
Recombinant DNA: a composite DNA molecule created in vitro by joining a foreign DNA with a vector DNA molecule. (Note: technically recombinant DNA can also be made in vivo during meiosis in an organism, but this is not the typical meaning of these words.)

2. RECOMBINANT DNA TECHNIQUES:
There are many techniques for joining DNA molecules in vitro and introducing them into cells (usually bacteria) where the molecules are then replicated along with the host genomic DNA.

2.1. PLASMIDS ARE NATURALLY PRESENT IN SOME BACTERIA
Many bacteria contain extra-chromosomal DNA elements called plasmids. These are usually small (a few 1000 bp), circular, double stranded molecules that replicate independently of the chromosome and can be present in multiple copies within a cell. In the wild, plasmids can be


transferred between individuals during bacterial mating and are sometimes even transferred between different species. Plasmids are particularly important in medicine because they often carry genes for pathogenicity (making the bacteria more detrimental) and drug-resistance (able to survive various antibiotics). In the lab, plasmids are inserted into bacterial hosts in a process called transformation. These plasmids can be modified by the addition of foreign DNA so that both the plasmid vector and the target foreign DNA are replicated.
There are 3 main features of a plasmid (Figure 2):
(1) Origin of replication (Ori), which is similar in function to oriC in the E. coli chromosome.
(2) Selectable marker gene that helps to distinguish the desired from the undesired strains, which is usually an antibiotic resistance gene like ampR, tetR, or kanR. Some cells have plasmids that contain resistance to multiple, different antibiotics.
(3) Multiple cloning site (MCS) that has many restriction enzyme sites in a short sequence.

Figure 2.
Basic structure of a double stranded bacterial plasmid, represented by a circle.
(Original-Locke-CC BY-NC 3.0)

3. USING CLONING VECTORS
3.1. PLASMIDS VECTORS
There are multiple steps to using plasmids as cloning vectors. To insert a DNA fragment into a plasmid, both the fragment and the circular plasmid are cut using a restriction enzyme that produces compatible ends. Given the large number of restriction enzymes that are currently available, it is usually not too difficult to find an enzyme for which corresponding recognition sequences are present in both the plasmid and the DNA fragment, particularly because most plasmid vectors used in molecular biology have been engineered to contain recognition sites for a large number of restriction endonucleases in a segment called the Multiple Cloning Site (MCS).

Figure 3.
Cloning of a DNA fragment (red) into a plasmid vector. The vector already contains a selectable marker gene (blue) such as an antibiotic resistance gene.
(Original-Deyholos-CC BY-NC 3.0)

After restriction digestion, the desired fragments may be further purified or selected before they are mixed together with ligase to join them. Following a short incubation, the newly ligated plasmids, containing the gene of interest, are transformed into E. coli.
Transformation is accomplished by mixing the ligated DNA with E. coli cells that have been specially prepared (i.e. made competent) to take up DNA. Bacterial cells can be made competent by exposure to compounds such as CaCl2 or to electrical fields (electroporation). Because only a small fraction of cells that are mixed with DNA will actually be transformed, a selectable marker, such as a gene for antibiotic resistance, is usually also present on the plasmid and used to select those few cells that have taken up the DNA. The rate of DNA uptake varies each time and is called transformation efficiency. This can range from ~10^5 to 10^10 colonies per µg of DNA.
After transformation (combining DNA with competent cells), bacteria are spread on a bacterial agar plate containing an appropriate antibiotic so

PAGE 2 OPEN GENETICS LECTURES – FALL 2017

RECOMBINANT DNA – CHAPTER 29

that only those cells that have actually incorporated the plasmid will be able to grow and form colonies. Colonies (clones) can then be picked and used for further study.
Molecular biologists use plasmids as vectors to contain, amplify, transfer, and sometimes express genes of interest that are present in the cloned DNA. Often, the first step in a molecular biology experiment is to “clone a gene” (i.e. make a copy) into a plasmid, then transform this recombinant plasmid into bacteria so that essentially unlimited copies of the gene (and the plasmid that carries it) can be made as the bacteria reproduce. This is a practical necessity for further manipulations of the DNA, since most techniques of molecular biology require many copies of DNA to work. Even though only small amounts are needed, these techniques are not sensitive enough to work with just a single molecule at a time.
Many molecular cloning and recombination experiments are therefore iterative (repetitive) processes. For example:
1. a DNA fragment (usually isolated by PCR and/or restriction enzyme digestion) is cloned into a plasmid cut with a compatible restriction enzyme
2. the recombinant plasmid is transformed into bacteria
3. the bacteria are allowed to multiply, usually in liquid culture
4. a large quantity of the recombinant plasmid DNA is isolated from the bacterial culture
5. further manipulations (such as site directed mutagenesis or the introduction of another piece of DNA) are conducted on the recombinant plasmid
6. the modified plasmid is again transformed into bacteria, prior to further manipulations, or for expression

3.2. OTHER VECTORS
Lambda phage is a bacteriophage that infects E. coli and can be used as a vector. Lambda phage is a linear DNA vector molecule that can typically hold a 15-20 kb fragment in each clone.
Cosmids are a hybrid vector system composed of part plasmid and part phage DNA. They can carry 30-45 kb fragments in each clone. The lambda phage packaging system (which stuffs the recombinant DNA into lambda bacteriophage heads) is used for higher transformation efficiency, and the cosmid also has a plasmid origin of replication so clones can be replicated in the host like plasmids.
BACs (Bacterial Artificial Chromosomes) are circular DNA vectors that use a plasmid origin of replication to propagate. The insert DNA can be very large, 100s of kbp, so it may contain many genes. However, such large recombinant DNA molecules are difficult to transform, so BAC clones are difficult to make.

4. DNA LIGATION
The process of DNA ligation occurs when DNA strands are covalently joined, end-to-end, through the action of an enzyme called DNA ligase. Molecules with complementary overhanging sequences are said to have “sticky” or compatible ends, which facilitate their joining to form recombinant DNA. Likewise, two blunt-ended sequences are also considered compatible to join together, although they do not ligate together as efficiently as sticky ends. Note: sticky-ended molecules with non-complementary sequences will not ligate together with DNA ligase.
The process of ligation is central to the production of recombinant DNA, including the insertion of a double stranded DNA fragment into a plasmid vector.

5. AN APPLICATION OF MOLECULAR CLONING: RECOMBINANT INSULIN
Purified insulin protein is critical to the treatment of diabetes. Prior to ~1980, insulin for clinical use was isolated from human cadavers or from slaughtered animals such as pigs. Human-derived insulin generally had better pharmacological properties, but was in limited supply and carried risks of disease transmission. By cloning the human insulin gene and expressing it in E. coli, large
quantities of insulin protein identical in sequence to the human hormone could be produced in fermenters, safely and efficiently. Production of recombinant insulin also allows specialized variants of the protein to be produced: for example, by changing a few amino acids, longer-acting forms of the hormone can be made. The active insulin hormone contains two peptide fragments of 21 and 30 amino acids, respectively. Today, essentially all insulin is produced from recombinant sources (Figure 4), i.e. human genes and their derivatives expressed in bacteria or yeast.

Figure 4.
A vial of insulin. Note that the label lists the origin as “rDNA”, which stands for recombinant DNA.
(Flickr-DeathByBokeh-CC BY-NC 2.0)

6. GENOMIC DNA LIBRARIES AND CDNA LIBRARIES
6.1. GENOMIC DNA LIBRARY
The human genome is large and complex. It is much easier to break it down into little fragments to study. This is true for all organisms: deal with one gene at a time.
Here is the process of constructing a genomic DNA library (Figure 5):
1. Genomic DNA is broken down into short fragments by partial restriction enzyme digestion. The size is dictated by the vector used. Plasmids will need short fragments (~5 kbp) while cosmid vectors will need larger ones (30-45 kbp).
2. Circular plasmid or cosmid vector DNA is opened with the same restriction enzymes that were used in (1) or another enzyme that yields compatible, sticky ends.
3. The DNA fragment and vector are mixed together in the same test tube and ligated together. (The ligation occurs by random chance so not all molecules will be appropriate for the next step.) Each independent assembly of a DNA segment in a vector is a clone.
4. The recombinant DNA molecules are transformed into a competent bacterial host cell. For plasmids, this is a direct process, while for cosmids, a lambda in vitro packaging system is used to increase the efficiency of the process. (Lambda in vitro packaging system refers to the packaging of recombinant DNA into the head of the bacteriophage, and then transferring this package to the host cell.)
5. The transformed cells that contain a plasmid or cosmid with an antibiotic resistance gene are selected and propagated.
6. Amplified recombinant plasmid DNA molecules can be purified and collected. This collection of many different clones (genomic DNA fragments) makes up a DNA library or clone library that can contain the entire DNA sequence of an organism in the fragmented form of multiple clones.
These clones can then be stored and the fragment of interest can be retrieved at a later time, hence the name “genomic library.” Gene libraries can be constructed using different vectors, but almost all work is done with plasmids these days.

How many clones are needed to include every sequence in a library? The number of clones needed to have 1 genome equivalent can be calculated by dividing the size of the genome (in bp) by the size of the insert in each clone (in bp). For example, if the E. coli genome is about 4,500,000 bp and a cosmid clone contains 45,000 bp, then 4,500,000/45,000 = 100 clones would be needed, end to end, to cover the whole genome. However, in real life, some sequences in the genome might not be cloned at all (others may be cloned more than once) and therefore the process is not 100% efficient. To get a 99% chance of finding a specific gene or sequence of interest, one needs about 5 genome equivalents. That is: 500 clones for the E. coli example above.

Figure 5.
Process of making a
genomic clone library of
an organism.
(Original-Locke-CC BY-NC
3.0)
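The clone-count arithmetic in section 6.1 can be sketched in a few lines. The text gives the rule of thumb (about 5 genome equivalents for a 99% chance) without a formula; the sketch below assumes the commonly used relation N = ln(1 - P) / ln(1 - f), usually attributed to Clarke and Carbon, where f is the fraction of the genome carried by one clone. The function names are illustrative.

```python
import math


def genome_equivalent(genome_bp, insert_bp):
    """Clones needed to span the genome once if laid end to end."""
    return genome_bp / insert_bp


def clones_for_probability(genome_bp, insert_bp, probability=0.99):
    """Clones to screen so that a given sequence is present with this probability.

    Assumes random, independent cloning: N = ln(1 - P) / ln(1 - f),
    where f = insert_bp / genome_bp.
    """
    f = insert_bp / genome_bp
    return math.ceil(math.log(1 - probability) / math.log(1 - f))


if __name__ == "__main__":
    # E. coli example from the text: 4.5 Mb genome, 45 kb cosmid inserts.
    print(genome_equivalent(4_500_000, 45_000))
    print(clones_for_probability(4_500_000, 45_000))
```

Under these assumptions the cosmid example comes out a little under 5 genome equivalents, which is consistent with the "about 5" figure quoted above.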

6.2. CDNA LIBRARY
The genomic library above hypothetically contains all the sequences of the target’s DNA, but a cDNA library only contains the sequences that are expressed in a particular cell or tissue.
To create a cDNA library, RNA is collected from the cell or tissue of interest. Primers, nucleotides, and reverse transcriptase enzyme are added so that complementary DNA (cDNA), which is complementary to the RNA, is synthesized. The result is an RNA-cDNA hybrid. These two strands are separated by heating, or the RNA can be degraded by adding RNase enzyme or NaOH. The remaining cDNA can act as a template and its complementary DNA strand is synthesized, so that each cDNA becomes a double helix. From this point, the rest of the procedure to create a library is equivalent to that above. Here, however, the cloned DNA corresponds to the mRNA present in the cell. The DNA between genes, and intron DNA, is absent from this library. These clones are often used to express the gene to make a protein.
The main difference between a genomic DNA library and a cDNA library is that the genomic library contains DNA with exons, introns, and intergenic sequences, so the number of different clones in the library is much bigger. On the other hand, a cDNA library contains only the sequences present after transcription and processing (e.g. splicing exons), which are translated into polypeptides. Therefore, by looking at the cDNA library we can identify which genes are expressed in particular cell types, and to what level of expression, too.


Figure 6.
Process of producing
cDNA library. (Original-
Locke-CC BY-NC 3.0)
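The two synthesis steps behind a cDNA clone (first strand made from the mRNA by reverse transcriptase, second strand made from the first) can be sketched with simple base-pairing rules. The mRNA below is a hypothetical 12-base example and the function names are illustrative; priming, enzymes, and strand removal are not modelled.

```python
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna):
    """First-strand cDNA, written 5'->3', complementary to an mRNA written 5'->3'."""
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna.upper()))


def second_strand(cdna):
    """DNA strand, written 5'->3', complementary to the first-strand cDNA."""
    return "".join(DNA_TO_DNA[base] for base in reversed(cdna.upper()))


mrna = "AUGAAAAAUGAA"  # hypothetical mRNA encoding Met-Lys-Asn-Glu
first = first_strand_cdna(mrna)
second = second_strand(first)

if __name__ == "__main__":
    print("mRNA         5'", mrna, "3'")
    print("first strand 5'", first, "3'")
    print("second strand 5'", second, "3'")
```

The second strand has the same sequence as the mRNA with T in place of U, which is why a cDNA clone carries the spliced coding sequence without introns.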

7. SCREENING A CLONE LIBRARY
After genomic or cDNA libraries have been constructed, clones containing a particular gene, or DNA sequence, can be identified and recovered using the process of hybridization and labeled DNA probes. DNA labeling involves putting a tag on the probe DNA molecule, which is complementary to the DNA sequence of interest, in some manner that permits one to detect its presence in minute quantities at some later point in an experiment. DNA can be labeled in several different ways; one widely used technique is to replace the normal phosphorus of the DNA with a radioactive atom of phosphorus, 32P (normal isotope = 31P). This radioactivity can be detected by photographic emulsion. A cloned DNA sequence will hybridize to only its complementary sequences and thus provides an almost unique probe. Once the appropriate probes are made, the following procedures are performed:
1. Plate out library - each colony on the bacterial plate is a clone.
2. Lift clones (DNA) onto nitrocellulose filter. Lyse the cells and fix the clone DNA onto the filter. Denature the clone DNA, so as to make it able to form hybrids with probe.
3. Place filter in a hybridization bag with solution containing labeled, denatured probe DNA. Incubate to permit the probe strands to form hybrids with the clone strands.
4. Wash away unhybridized probe.
5. Expose probed filter to X-ray film (autoradiography) to detect the presence of clones with labeled probe.
6. From the X-ray film determine which clone hybridized to the probe and recover that clone for further analysis.


Figure 7.
Process of screening a
clone library of an
organism.
(Original-Locke-CC BY-
NC 3.0)


____________________________________________________________________________
SUMMARY:
• DNA fragments can be cloned into vectors.
• Transformation of recombinant DNA is the transfer of DNA (usually recombinant plasmids) into
bacteria.
• Cloning of genes into E. coli is a common technique that allows large quantities of the DNA for a gene to be made.
• This allows further analysis or manipulation of the cloned sequences.
• Genomic DNA libraries contain fragments of genomic DNA.
• cDNA libraries contain shorter segments of DNA that correspond to the mRNA for each gene.
• Gene of interest can be identified using DNA probes to screen genomic or cDNA libraries.
• Cloning can also be used to produce useful proteins, such as insulin, in microbes.
KEY TERMS:
in vivo
in situ
in vitro
in plastico
in silico
Recombinant DNA
plasmid
transformation
Ori
selectable marker
Multiple cloning site (MCS)
competent
electroporation
vector
clone
Lambda phage
cosmid
lambda phage packaging system
BACs
DNA ligation
DNA ligase
sticky / compatible end
genomic library
cDNA library


STUDY QUESTIONS:
1) A coat protein from a particular virus can be
used to immunize children against further
infection. However, inoculation of children with
proteins extracted from natural viruses
sometimes causes a fatal disease, due to
contamination with live viruses. How could you
use molecular biology to produce an optimal
vaccine?
2) How would cloning be different if there were
no selectable markers?


CHAPTER 30 – CLONING A GENE

Figure 1.
Diagram of two plasmid vectors. pBR322 (left) was
one of the first widely used plasmid vectors. It has
two antibiotic resistance genes (amp + tet), but no
multiple cloning site. pUC19 (right) was a more
recent improvement. It is smaller, because it has only
one antibiotic resistance gene. Most importantly it
also includes a multiple cloning site (polylinker) at
the beginning of the lacZa gene so it can be used
with the blue/white X-gal scheme for insert
detection.
(pBR322 Wikipedia-Ayacop+Yikrazuul-PD)
(pUC19 Wikipedia- Yikrazuul-PD)

INTRODUCTION
As a geneticist, suppose you have created a strain that has a mutation in a gene that is involved in a biological process you wish to learn more about. This could be a gene dealing with cancer, development, or heart disease, or something as simple as the colour of an eye or flower. All you have to start with is this mutant strain, and you know that it’s genetically different from wild type (or its parents).
How do you find the gene (that is mutant) amongst the ~20,000 that are present in the typical eukaryote genome?
The process of cloning a gene involves several steps and has been done in multiple different ways. Only three basic methods will be presented here. All involve the construction of a genomic DNA clone library, which is the cloning of all the sequences in a genome into a DNA vector (a plasmid for example, Figure 1). Then this library is screened, using one of several methods, to find the specific DNA sequence (clone) that contains the gene of interest to you.
The three methods for screening a library will be (1) cloning by complementation; (2) hybridization of DNA probes; and (3) transposon tagging.

1. CLONING BY COMPLEMENTATION – A HYPOTHETICAL AUXOTROPHIC MUTATION IN E. COLI
The concept of genetic complementation was covered previously in Chapter 4. This cloning method is primarily used with single-cell organisms, such as bacteria or yeast (fungi). It involves the introduction of one wild type gene (allele) into a mutant strain, which is then able to supply sufficient product to result in a wild type phenotype. The easiest explanation is a hypothetical example dealing with an auxotrophic mutation in E. coli.

1.1. BUILDING A GENOMIC DNA LIBRARY CONTAINING THE WILD TYPE GENE
To complete this method we need both an auxotrophic mutant (A-) and a wild type (A+) strain.
The first step is to clone the wild type allele of the gene of interest. Step (1) is to digest the DNA of the wild type strain with a restriction enzyme into clone-sized fragments. One fragment should contain the A+ gene. These fragments are cloned into a vector, such as a plasmid, by means of ligation (Figure 2). The ligated E. coli genomic DNA and plasmid vector constructs are a genomic DNA library.


Figure 2.
E. coli genomic DNA (blue) is digested with a restriction enzyme to generate fragments for cloning into the plasmid vector (black). The ligation products will include some clones with the A+ gene (red) and many without (blue).
(Original-J. Locke-CC BY-NC 3.0)

1.2. TRANSFORMATION AND CLONE SELECTION VIA COMPLEMENTATION
The ligated genomic library is then transformed into the auxotrophic mutant host (A-). This will result in a variety of cells with different genotypes as shown in Figure 3.
The untransformed cells (left) will not grow on any Minimal Media (MM) with or without antibiotic. The cells transformed with a plasmid containing a genomic fragment (blue) will grow on the antibiotic-containing media because of the antibiotic resistance on the plasmid, but only if the media contains the supplement for the auxotrophic strain. The transformants that have the A+ gene (red) will grow on antibiotic media that lacks the supplement. Thus the Minimal Media plate, with or without antibiotic, can be used to select (screen for) the clone with the A+ gene, a prototroph.
Typically, when doing an experiment like this, a researcher would recover several independent clones, each having the same A+ gene fragment cloned in it. This recurring result would provide evidence that this is the gene of interest.

1.3. SELECTION OF THE APPROPRIATE RESTRICTION ENZYME
Note that in this example the restriction enzyme cut the genomic DNA such that the entire A+ gene was contained on one fragment. In actual practice a researcher would not know ahead of time which enzyme to use. Several different enzymes would need to be tried to identify an enzyme that would not cleave within the gene. Separating the A+ gene into two fragments would not result in the recovery of the gene.
Once cloned, this fragment can be sequenced and the genes within it characterized and used for further experiments to determine their function and role in the biological process under investigation.

Figure 3.
The auxotrophic A- strain E. coli (green)
is transformed with the recombinant
DNA (A+ genomic DNA plus plasmid
vector from Figure 2) to generate three
genotypic classes as shown across the
top. Left: untransformed; Centre:
transformed with random genomic DNA;
Right: transformed with the A+ gene
(red). See text for an explanation.
(Original-J. Locke-CC BY-NC 3.0)
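The selection logic for the three genotypic classes in Figure 3 can be written out as a small truth table. This is a simplified model under stated assumptions: growth needs antibiotic resistance only when antibiotic is present, and the nutrient requirement is met either by a supplement in the medium or by a plasmid-borne A+ gene. The function and class names are illustrative.

```python
def grows(has_plasmid, has_a_plus, antibiotic, supplement):
    """Predict growth of an A- auxotrophic host on minimal medium.

    has_plasmid: cell carries the vector (antibiotic resistance marker)
    has_a_plus:  the plasmid insert includes the wild type A+ gene
    antibiotic:  medium contains the antibiotic
    supplement:  medium contains the nutrient the A- strain cannot make
    """
    survives_antibiotic = has_plasmid or not antibiotic
    meets_nutrient_need = supplement or (has_plasmid and has_a_plus)
    return survives_antibiotic and meets_nutrient_need


# (has_plasmid, has_a_plus) for the three classes in Figure 3.
CLASSES = {
    "untransformed": (False, False),
    "random insert": (True, False),
    "A+ insert": (True, True),
}


def growth_table():
    """Return {class: {(antibiotic, supplement): grows?}} for all media."""
    table = {}
    for name, (plasmid, a_plus) in CLASSES.items():
        table[name] = {
            (antibiotic, supplement): grows(plasmid, a_plus, antibiotic, supplement)
            for antibiotic in (False, True)
            for supplement in (False, True)
        }
    return table


if __name__ == "__main__":
    for name, row in growth_table().items():
        print(name, row)
```

Only the A+ insert class grows on minimal medium without the supplement, which is what makes complementation a selection rather than a screen of every colony.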


1.4. HOW MANY CLONES ARE NEEDED TO FIND THE ONE OF INTEREST?
Given that the construction of the recombinant genomic DNA library is random, in that the fragment with the A+ gene is only one of many in the genome, how many clones do we need to make and screen to be sure of finding the one we seek? There is a simple formula that researchers use to estimate this probability:
(1) If each plasmid contains ~4.5 Kb of insert DNA and the E. coli genome contains ~4.5 Mb DNA then ~1000 plasmid clones, if arranged end-to-end, could contain one E. coli genome's worth of DNA.
(2) Because of the random nature of which fragments are cloned, probability says we need to screen the equivalent of 5 genomes worth of clones (e.g. 5,000 clones in this case) to provide a 99% chance of finding the A+ gene (or any/every other gene in the genome), taking the conditions in the previous section into account. Because cloning is a statistical process, there can never be a 100% chance, but the 99% chance is almost always sufficient to find the gene of interest.
Note that 5,000 bacterial clones can be produced easily and screened quickly on a single Petri dish plate. This method is relatively easy and straightforward.

2. CLONING BY HYBRIDIZATION OF DNA PROBES

2.1. DNA HYBRIDIZATION IS SEQUENCE SPECIFIC
The use of DNA probes to recover clones from genomic, or other libraries, relies on the principle of DNA hybridization. DNA is normally a duplex in which the two strands are held together via hydrogen bonds (A=T, G≡C) and base stacking interactions.
The two strands can be separated in a process called denaturation. This can be easily done by heating (e.g. boiling water, 100°C) or by adding alkali (e.g. 50 mM NaOH) to raise the pH.

<duplex DNA>
5'....GAATTCGGATCC....3'
3'....CTTAAGCCTAGG....5'
           ↓
5'....GAATTCGGATCC....3'
           +
3'....CTTAAGCCTAGG....5'

These dissociated single strands can reassociate (or anneal) to reform the duplex DNA. This process is sequence specific, in that the duplex will only form if the two strands are complementary.
When the two strands reform a duplex, a hybrid is formed, hence the name hybridization.

5'....GAATTCGGATCC....3'
3'....CTTAAGCCTAGG....5'
(duplex DNA again)

Note: hybrids can form between DNA/DNA (two complementary DNA strands), DNA/RNA (a DNA strand and its complementary RNA sequence), or between RNA/RNA (not useful here since clones are DNA).
Hybrid formation only requires that the complementary sequences be similar, not a perfect match. A hybrid can form with some "mismatch" in the pairing.

5'....GAATTTAGATCC....3'
      |||||  |||||
3'....CTTAAGCCTAGG....5'
(hybrid duplex with two mismatched base pairs)

The extent of mismatch possible in a hybrid depends upon the hybridization conditions in the solution (temperature, salt concentration, etc.). Under some conditions 30-40% mismatched strands are able to form a stable hybrid. This can be useful for finding a similar gene across species. The DNA sequence from one species can be used to find a similar gene in another species. For example, see Figure 4.
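The idea that hybrids tolerate a limited amount of mismatch can be illustrated by counting mispaired positions between two strands. In this sketch `max_mismatch` is a hypothetical stand-in for hybridization stringency (temperature, salt concentration); real hybrid stability depends on more than the mismatch fraction. The sequences are the ones used in the diagrams above.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_fraction(top, bottom):
    """Fraction of mispaired positions between two antiparallel strands.

    top is written 5'->3' and bottom 3'->5', as in the diagrams in the text,
    so position i of top pairs with position i of bottom.
    """
    if len(top) != len(bottom):
        raise ValueError("strands must be the same length")
    mismatches = sum(1 for t, b in zip(top, bottom) if COMPLEMENT[t] != b)
    return mismatches / len(top)


def can_hybridize(top, bottom, max_mismatch=0.0):
    """True if the mismatch fraction is within the allowed tolerance."""
    return mismatch_fraction(top, bottom) <= max_mismatch


if __name__ == "__main__":
    bottom = "CTTAAGCCTAGG"
    print(mismatch_fraction("GAATTCGGATCC", bottom))   # perfect duplex
    print(mismatch_fraction("GAATTTAGATCC", bottom))   # two mismatches
    print(can_hybridize("GAATTTAGATCC", bottom, max_mismatch=0.3))
```

Raising `max_mismatch` plays the role of relaxing the hybridization conditions, which is how a probe from one species can find a similar gene in another.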


Figure 4.
A DNA sequence alignment showing a comparison between mouse and Drosophila DNA sequences for part of their actin gene. There are seven mismatches in this 120 base sequence (94% similar), which would probably easily form a hybrid.
(Original-J. Locke-PD)

2.2. SOURCE OF A DNA PROBE
Typically a cloned DNA sequence is used as a probe into a DNA library. There must be a source for the DNA and it can be from many different sources, but we will present only four common ones.
First, these sequences can come from “natural” (pre-existing) sources, such as a similar gene cloned from another species. See the mouse/Drosophila actin gene in Figure 4 above. The cloned Drosophila actin gene sequence can be used as a probe into a mouse genomic DNA library to find the mouse actin gene, or vice versa. Another source would be a cDNA clone, which is a plasmid vector containing the DNA complementary to an mRNA sequence. The process of making a cDNA library is similar to that of a genomic DNA library. The result is a set of clones that contain DNA sequences corresponding to the mRNA molecules in a sample, cell, or tissue. The mRNA of a sample is extracted and reverse transcribed into a complementary DNA (cDNA) sequence that is then cloned. This procedure typically makes a mixture of clones that represents the diversity of mRNA in the sample. See Figure 5 for a diagrammatic description of the method.

Figure 5.
Diagram showing steps in the construction of a cDNA library.
(Original-J. Locke-CC BY-NC 3.0)

Another source of DNA for a probe can come from synthetic oligo-nucleotides. The sequence can be derived from the amino acid sequence of the polypeptide the gene encodes. The amino acid sequence can be determined by various biochemical techniques and then reverse translated to determine what specific DNA sequence to chemically synthesize. The degeneracy of the triplet code usually makes it impossible to obtain a unique sequence. Instead, a degenerate oligo-sequence is used.


Example:
(Lys = Lysine, Asn = Asparagine, Glu = Glutamic Acid)
Protein sequence:   Met - Lys - Asn - Glu
codon:              AUG - AAA - AAU - GAA
alternate codon:          AAG - AAC - GAG
Probe sequence:     ATG   AAA   AAT   GAA
                            G     C     G
                            |     |     |
                    use both bases in the oligo-nucleotide
                    (a 50:50 mix of each base)
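The degenerate probe in the example above is really a mixture of every sequence that could encode the peptide. A minimal sketch of enumerating that mixture follows; the codon table covers only the four amino acids in the example (written as DNA, with T in place of U), and `degenerate_probes` is an illustrative name.

```python
from itertools import product

# Codons from the example in the text, written as DNA (T in place of U).
CODONS = {
    "Met": ["ATG"],
    "Lys": ["AAA", "AAG"],
    "Asn": ["AAT", "AAC"],
    "Glu": ["GAA", "GAG"],
}


def degenerate_probes(peptide):
    """List every DNA sequence that could encode the given peptide."""
    choices = [CODONS[residue] for residue in peptide]
    return ["".join(combo) for combo in product(*choices)]


probes = degenerate_probes(["Met", "Lys", "Asn", "Glu"])

if __name__ == "__main__":
    print(len(probes), "possible probe sequences")
    for probe in probes:
        print(probe)
```

With one codon for Met and two each for Lys, Asn, and Glu, the mixture holds 1 x 2 x 2 x 2 = 8 different oligonucleotides, one of which matches the gene exactly.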
Another source of DNA for probes comes from polymerase chain reaction (PCR) amplification products. The methodology of PCR is covered in Chapter 31 and will not be presented here. Basically PCR amplification is just another method to synthesize sufficient DNA for use as a probe. It uses two primer oligonucleotides to synthesize (amplify) a specific sequence of duplex DNA. This can then be used as a probe as described in the next section.

2.3. SCREENING A LIBRARY TO FIND A CLONE
Once a DNA fragment has been obtained (see above) it can be labeled and used as a probe. "Labeling" involves putting a tag on the DNA in some manner that permits one to detect its presence, in minute quantities, at some later point in an experiment. DNA can be "labeled" in several different ways. One widely used technique is to replace the normal phosphorus of the DNA with a radioactive atom of phosphorus, 32P (normal isotope = 31P). This radioactivity can be detected by photographic emulsion (e.g. auto-radiography) on sheets of X-ray film. There are several methods to do this (nick-translation, random priming, PCR). All produce a single strand of 32P labeled DNA that will hybridize with its complementary sequence and thus localize any DNA with that sequence.
We now have the probe labeled and ready to screen a genomic library for a gene of interest. The method of screening a library of DNA clones relies on the probe’s sequence specificity to hybridize with only the clones with a complementary sequence. The procedure is diagrammed in Figure 6.
The goal of screening a clone library is to identify and recover clone(s) that have a sequence complementary to the probe (e.g. a specific gene sequence).
Figure 6.
Procedure to screen a plasmid library of genomic clones
with a DNA probe.
1. Plate out library - each colony on the bacterial plate is a
clone.
2. Lift clones (DNA) onto Nitrocellulose filter. Fix the clone
DNA onto the filter and denature the clone DNA, so as to
make it able to form hybrids with probe.
3. Place filter in a hybridization bag with solution containing
labeled, denatured probe DNA. Incubate to permit the
single strands of probe to form hybrids with the clone single
strands in a sequence specific manner.
4. Wash away unhybridized probe.
5. Expose probed filter to X-ray film (autoradiography) to
detect the presence of clones with labeled probe.
6. From the X-ray film determine which clone hybridized to the probe and recover that clone for further analysis.
(Original-J. Locke-CC BY-NC 3.0)
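The outcome of the screen in Figure 6 can be mimicked in software by asking which clone inserts contain the probe sequence on either strand. This sketch is a simplification: it requires an exact match, whereas real hybridization tolerates some mismatch, and the three-clone `library` is hypothetical (real libraries hold thousands of clones).

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def screen_library(library, probe):
    """Return names of clones whose insert contains the probe on either strand."""
    targets = (probe, reverse_complement(probe))
    return [name for name, insert in library.items()
            if any(target in insert for target in targets)]


# Hypothetical inserts; clone_3 carries the target in the opposite orientation.
library = {
    "clone_1": "TTGACCGAATTCGGATCCAAGT",
    "clone_2": "CCATGGTTAACCGGTTAACCAT",
    "clone_3": "ACTTGGATCCGAATTCGGTCAA",
}
positives = screen_library(library, "GAATTCGGATCC")

if __name__ == "__main__":
    print(positives)
```

Checking both strands matters because a fragment can ligate into the vector in either orientation, and a denatured probe can pair with either clone strand.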


3. CLONING A GENE USING THE TRANSPOSON TAGGING METHOD
The final method of cloning a gene uses the creation of a mutation by inserting a mobile element (ME), also called a transposable element (TE) or transposon, into the gene of interest. This provides a DNA sequence tag (a known sequence) inserted into the unknown gene, and thus a means through which the gene of interest can be identified and cloned. That is, you can transposon tag a gene to clone it.

3.1. BASICS OF TRANSPOSABLE ELEMENTS
All eukaryotes have multiple types of transposable elements as single or multiple copies in their genomes. These segments of DNA, usually hundreds or thousands of base pairs long, are usually distributed throughout the genome as randomly inserted sequences. Typically, they are mobile, in the sense that some of them are able to move and/or replicate within the genome. There are two main classes: (1) Retro-transposons and (2) DNA transposons.
Retro-transposons (or retroposons) move from one site to another in the genome via RNA intermediates (Figure 7). The sequence is transcribed from the DNA sequence at one locus into RNA, then reverse-transcribed back into DNA (DNA->RNA->DNA), which is then inserted back into the genomic DNA at another locus. Within this class there are two subclasses. The first is the retrovirus-like class, which has sequences similar in organization to a retrovirus, such as HIV. Their replication is also like a retrovirus. Then there is the non-viral class of retroposons. The human Alu element is an example of a short interspersed element (SINE). There are also long interspersed elements (LINEs).

Figure 7.
Retro-transposons (above) and DNA transposons (below) showing their basic mechanism of transposition.
(Original-J. Locke-CC BY-NC 3.0)

DNA transposons move via DNA intermediates (DNA->DNA). No RNA (transcription) is involved. One of the best-studied elements is the P element in Drosophila. It is very well characterized and used as a genetic tool to create insertion mutations (mutagenesis) and as a transformation vector to move constructed genes into the germline of Drosophila (transgenes, etc.). We will use the P element as an insertion mutation in our model example here.
P elements are 2907 bp long, have 31 bp inverted repeats at either end, and code for either a transposase or repressor protein, depending on the mRNA splicing alternative. A transposase is an enzyme that binds to the transposon and catalyzes the movement from its current location to another part of the genome. It does this by cleaving the strands surrounding the region, and then cutting and inserting the transposon in a new location in the genome (Figure 7). The transposase transcript has 4 exons (Exon 0 - Exon 3). The repressor polypeptide is made if the last intron (2-3) is not spliced out. The resulting mRNA will be translated such that the 2-3 intron is translated and a premature stop codon prevents translation of the sequence in exon 3 (Figure 8). A simplistic description proposes that this truncated product will bind to the P element ends, but not cleave the DNA as the transposase polypeptide does. This binding prevents other transposase molecules from binding and cleaving, thereby acting as a repressor of mobilization. The real situation is more complex. In most somatic tissues, the 2-3 intron is not removed from the primary transcript so only the repressor is made. In germline tissues, the intron can be removed and the transposase produced to cause mobilization. The mobilization of the transposon will cause insertions, which can cause mutations in genes. This is explained in the next section.


3.2. CREATING A TE INDUCED MUTANT IN A GENETIC SCREEN
This needs wild type and mutant stocks for the gene of interest. If a P element induced mutation is already available from a stock collection, then this step is completed. However, this is not the case in most situations and so you must make your own. To create a P element induced mutation a P element containing stock (P-stock) is crossed with one that lacks P elements. Both must have wild type alleles of the gene of interest. If done correctly, this cross will cause the P elements to mobilize, and create random insertions into genes all over the genome. Next is a genetic screen for this rare insert into the gene of interest. It can be identified in the screen by a failure to complement an existing mutant allele of the gene of interest (see Chapter 4).
This new allele is tagged with the P element and can now be recovered and isolated for further use.

3.3. CLONING A TE INSERTION SITE & THE WILD TYPE ALLELE
From the stock with the P element induced allele in the gene of interest, the genomic DNA can be extracted and a genomic library built. This library can be screened and a clone containing the P element sequence identified (see the previous section on using a DNA probe to screen a library).
These clones can be characterized and the genomic sequences adjacent to the P element can be subcloned and used as a probe into a library made from a wild type stock. This will identify clones containing the wild type DNA of the gene of interest.

3.4. CONFIRMING THE INSERT IS THE RIGHT GENE BY REVERSION ANALYSIS
Because the insertion of a P element (or other TE) is not always the causal event in the type of mutagenesis described above, it is essential that the cause of the mutation in the gene of interest be established to be due to the insert that you have cloned. This is usually done through reversion analysis.
The allele with the TE in the gene of interest (above) is reverted in a manner similar to that of the initial insertion. The expectation is that the excision of the TE will be associated with a reversion of the mutation back to wild type. The presence/absence of the TE insert can be monitored by Southern blots in the reverted stocks and compared to the original, parental line and the insert mutant line.

4. CURRENT APPROACHES TO MATCHING GENES TO MUTATIONS
The methods described above have been used for several decades to successfully find many new genes. However, the current ease and quickness of whole genome sequencing is changing this methodology. If sufficient money and equipment are available, the mutant genome can be compared to the parental or wild type genome and single base pair changes can be identified by computer analysis. While theoretically straightforward, the technical details make it challenging and unclear. Often there are many changes in the sequence and it is not clear which are causative and which are just random changes.

Figure 8.
P element (red) with 31 bp repeats,
has a promoter (P) for a transcript
(green) that includes four exons.
Alternate splicing leads to either a
transposase (exons 0,1,2,3) or a
repressor (exons 0,1,2, intron,3).
(Original-J. Locke-CC BY-NC 3.0)


___________________________________________________________________________
SUMMARY:
• Genes in simple organisms (single cells) can be cloned by the complementation of host mutation by the
transformation of a plasmid containing a wild type copy of the host’s mutant locus.
• In screening a library for a clone, five genomes' worth of cloned DNA must be screened in order to
have a 99% probability of finding the clone of interest.
• Libraries can be screened with a labeled probe using the DNA-DNA hybridization to bind the probe to a
target sequence in a clone.
• Genes can also be cloned via transposon tagging.
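The "five genomes' worth" figure in the summary can be checked with a short calculation. The sketch below is an illustration only. It assumes clones are drawn independently and at random, so that each clone carries a given locus with probability f = insert size / genome size. The genome and insert sizes are hypothetical round numbers chosen for the example, not values from this chapter.

```python
import math


def probability_found(genome_bp, insert_bp, n_clones):
    """Probability that at least one of n_clones random clones carries the locus."""
    f = insert_bp / genome_bp          # chance that a single clone carries the locus
    return 1 - (1 - f) ** n_clones


def clones_needed(genome_bp, insert_bp, probability):
    """Smallest number of random clones giving at least the requested probability."""
    f = insert_bp / genome_bp
    return math.ceil(math.log(1 - probability) / math.log(1 - f))


if __name__ == "__main__":
    genome_bp = 4_600_000   # hypothetical genome size (assumption)
    insert_bp = 10_000      # hypothetical average insert size (assumption)

    five_genomes = 5 * genome_bp // insert_bp   # clones in five genome equivalents
    print("clones in five genome equivalents:", five_genomes)
    print("probability of success:", probability_found(genome_bp, insert_bp, five_genomes))

    n99 = clones_needed(genome_bp, insert_bp, 0.99)
    print("genome equivalents needed for 99%:", n99 * insert_bp / genome_bp)
```

Under these assumptions, screening five genome equivalents gives a probability a little above 99%, which agrees with the rule of thumb in the summary.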
KEY TERMS:
pBR322 cDNA clone
pUC19 synthetic oligo-nucleotide
genomic DNA labeling
DNA vector ³²P
complementation autoradiography
clone mobile element
restriction enzyme transposable element
ligation transposon
genomic DNA library transposon tag
transformed retro-transposon
minimal media DNA transposon
antibiotic retrovirus-like class
auxotroph non-viral class
prototroph P element
DNA hybridization Genetic tool
denaturation transposase
reassociation repressor
annealing 2-3 intron
mis-match reversion analysis
probe whole genome sequencing


QUESTIONS:
1) In Figure 4, it is intentionally not stated which is
the mouse and which is the Drosophila DNA
sequence. With a computer and internet
access, how could you determine which is
which?
2) In the example showing how the amino acid
sequence is used to make an oligonucleotide
probe, there are four amino acids shown.
Assume the next is proline.
a) Find a codon usage chart on the internet
and determine what the next three
nucleotides should be in the
oligonucleotide.
b) With the addition of the proline, how many
different sequence oligo-nucleotides are
needed to cover all the possible gene
sequences (assuming there is no exon-
intron site in this region)?


POLYMERASE CHAIN REACTION – CHAPTER 31

CHAPTER 31 – POLYMERASE CHAIN REACTION (PCR)

Figure 1.
Plastic disposable tips for a micro-pipettor
are used to accurately distribute microliter
volumes of liquid in molecular biology.
(Flickr-estherase- CC BY-NC-SA 2.0)

INTRODUCTION
While genetics is the study of the inheritance and variation of biological traits, today classical
genetics is often complemented by molecular biology to give molecular genetics, which involves the
study of DNA and other macromolecules that have been isolated from an organism. Usually, molecular
genetics experiments involve some combination of techniques to isolate, analyze, and characterize
the DNA, RNA, and/or protein transcribed and translated from a particular gene. In some cases, the
DNA may be subsequently manipulated by mutation or by recombination with other DNA fragments.
Techniques of molecular genetics have wide application in many fields of biology, as well as
forensics, biotechnology, and medicine. Polymerase Chain Reaction (PCR) is a widely used technique
to amplify and isolate specific DNA sequences. It requires a “template” DNA, which is often genomic
DNA. From this template, specific sequences can be amplified and many copies can be produced for
analysis or manipulation.

1. ISOLATING GENOMIC DNA
DNA purification strategies rely on the chemical properties of DNA that distinguish it from other
molecules in the cell, namely that it is a very long, negatively charged molecule. To extract
purified DNA from a tissue sample, cells are broken open by grinding or lysing in a solution that
contains chemicals that protect the DNA while disrupting other components of the cell (Figure 2).
These chemicals may include detergents, which dissolve lipid membranes and denature proteins. A
cation such as Na⁺ helps to stabilize the negatively charged DNA and separate it from proteins, such
as histones. A chelating agent, such as EDTA, is added to protect DNA by sequestering Mg²⁺ ions,
which can otherwise serve as a necessary co-factor for nucleases (enzymes that digest DNA). As a
result, free, double-stranded DNA molecules are released from the cell and from chromatin into the


extraction buffer, which also contains proteins and all other cellular components. (The basics of
this procedure are simple enough that it can be done with household chemicals as presented on
YouTube.)
The free DNA molecules are subsequently isolated by one of several methods. Commonly, proteins are
removed by adjusting the salt concentration so they precipitate. The supernatant, which contains DNA
and other, smaller metabolites, is then mixed with ethanol, which causes the DNA to precipitate. A
small pellet of DNA can be collected by centrifugation, and after removal of the ethanol, the DNA
pellet can be dissolved in water (usually with a small amount of EDTA and a pH buffer) for use in
other reactions. Note that this process has purified all of the DNA from a tissue sample (genomic
and mitochondrial DNA); if we want to isolate a specific gene or DNA fragment, we must use
additional techniques, such as PCR.

Figure 2.
Extraction of DNA from a mixture of solubilized cellular components by successive precipitations.
Proteins are precipitated, then DNA (in the supernatant) is precipitated with ethanol, leaving a
pellet of DNA.
(Original-Deyholos-CC BY-NC 3.0)

2. ISOLATING OR DETECTING A SPECIFIC SEQUENCE BY PCR
2.1. COMPONENTS OF THE PCR REACTION
The Polymerase Chain Reaction (PCR) is a method of DNA amplification that is performed in a test
tube (i.e. in vitro). Here “polymerase” refers to a DNA polymerase enzyme extracted and purified
from bacteria. The “chain reaction” refers to the ability of this technique to produce billions of
copies of a specific DNA molecule, by using each newly replicated double helix as a template to
synthesize two new DNA double helices. PCR is therefore a very efficient method of amplifying a
specific sequence of DNA from a small sample of a large, complex genome.
Besides its ability to make large amounts of DNA, there is a second characteristic of PCR that makes
it extremely useful. Recall that most DNA polymerases can only add nucleotides to the end of an
existing strand of DNA, and therefore require a primer to initiate the process of replication. For
PCR, chemically synthesized primers of about 20 nucleotides are used. In an ideal PCR, primers only
hybridize to their exact complementary sequence on the template strand (Figure 3).

Figure 3.
The primer-template duplex at the top part of the figure is perfectly matched, and will be stable at
a higher temperature than the duplex in the bottom part of the figure, which contains many
mismatches and therefore fewer hydrogen bonds. If the annealing temperature is sufficiently high,
only the perfectly matched primer will be able to initiate extension (grey arrow) from this site on
the template.
(Original-Deyholos-CC BY-NC 3.0)

The experimenter can therefore control exactly what region of a DNA template is amplified by
specifying the sequence of the primers used in the reaction.
To conduct a PCR amplification, an experimenter combines in a small, thin-walled tube (Figure 4),
all of the necessary components for DNA replication, including:
(1) DNA polymerase, and solutions containing
(2) nucleotides (dATP, dCTP, dGTP, dTTP),
(3) a DNA template,
(4) DNA primers,
(5) a pH buffer, and
(6) ions (e.g. Mg²⁺) required by the polymerase.
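The idea that the primers alone decide which region is amplified can be illustrated with a small "in silico PCR". This is a minimal sketch, not a real primer design tool: it assumes perfect (exact) primer matches, a single binding site for each primer, and a template given as its top strand written 5' to 3'. The function names and the example sequences are made up for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def in_silico_pcr(template, forward_primer, reverse_primer):
    """Predict the amplified region (top strand, 5'->3'), or None if a primer has no site.

    The forward primer is identical to part of the top strand. The reverse
    primer anneals to the top strand, so its reverse complement is what
    appears in the top-strand sequence downstream of the forward primer.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


if __name__ == "__main__":
    # Hypothetical template and primers (real primers are about 20 nt long).
    template = "GGATCCATGGCTAGCTTAGGCCATTGACCGTAAGCTTGG"
    forward = "ATGGCTAGC"
    reverse = reverse_complement("GACCGTAAG")
    print(in_silico_pcr(template, forward, reverse))
```

Changing either primer sequence changes the boundaries of the predicted product, which is the sense in which the experimenter "controls exactly what region" is amplified.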

Successful PCR reactions have been conducted using only a single DNA molecule as a template, but in
practice, most successful PCR reactions contain many thousands of template molecules. The template
DNA (e.g. total genomic DNA) has usually already been purified from cells or tissues using the
techniques described above. However, in some situations it is possible to put whole cells directly
in a PCR reaction for use as a template.

Figure 4.
A strip of PCR tubes
(Wikipedia-madprime- CC BY-SA 3.0)

An essential aspect of PCR is thermal-cycling, meaning the exposure of the reaction to a series of
precisely defined temperatures (Figure 5). The reaction mixture is first heated to 95°C. This causes
the hydrogen bonds between the strands of the template DNA molecules to melt, or denature. This
produces two single-stranded DNA molecules from each double helix (Figure 7). In the next step
(annealing), the mixture is cooled to 45-65°C. The exact temperature depends on the primer sequence
used and the objectives of the experiment. This allows the formation of double stranded helices
between complementary DNA molecules, including the annealing of primers to the template. In the
final step (extension) the mixture is heated to 72°C. This is the temperature at which the
particular DNA polymerase used in PCR is most active. During extension, the new DNA strand is
synthesized, starting from the 3' end of the primer, along the length of the template strand. The
entire PCR process is very quick, with each temperature phase usually lasting ~30 seconds or less.
Each cycle of three temperatures (denaturation, annealing, extension) is usually repeated about 30
times, amplifying the target region approximately 2³⁰-fold (roughly a billion-fold). The amount of
DNA product reaches a plateau at 20-40 cycles, usually because the nucleotide precursors have been
exhausted. Notice from Figure 7 that most of the newly synthesized strands in PCR begin and end with
sequences either identical to or complementary to the primer sequences; although a few strands are
longer than this, they are in such a small minority that they can almost always be ignored.

Figure 5.
Example of a thermal-cycle, in which the annealing temperature is 55°C.
(Original-Deyholos-CC BY-NC 3.0)

Figure 6.
A temperature vs. time graph showing two cycles of PCR.
(Original-Harrington-CC BY-NC 3.0)
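The arithmetic behind thermal-cycling can be sketched in a few lines. The example below is only an illustration under stated assumptions: every cycle doubles the target perfectly (real reactions plateau, as noted in the text), each step is held for 30 seconds, and the time spent changing temperature is ignored. The `wallace_tm` function uses a common rule of thumb for short primers (2°C per A or T, 4°C per G or C) that is not taken from this chapter; it gives only a rough starting point for choosing an annealing temperature.

```python
# One cycle of the example profile in Figure 5: (step name, temperature in °C, seconds).
# The 30-second hold times are an assumption based on "~30 seconds or less".
PROFILE = [
    ("denaturation", 95, 30),
    ("annealing", 55, 30),
    ("extension", 72, 30),
]


def theoretical_copies(start_molecules, cycles):
    """Upper bound on target copies, assuming perfect doubling every cycle."""
    return start_molecules * 2 ** cycles


def hold_time_minutes(profile, cycles):
    """Total time spent at the programmed temperatures (ramp time ignored)."""
    seconds_per_cycle = sum(seconds for _, _, seconds in profile)
    return cycles * seconds_per_cycle / 60


def wallace_tm(primer):
    """Rough melting temperature (°C) of a short primer: 2*(A+T) + 4*(G+C)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    print("copies from one template after 30 cycles:", theoretical_copies(1, 30))
    print("hold time for 30 cycles (min):", hold_time_minutes(PROFILE, 30))
    print("estimated Tm of a hypothetical 20-mer:", wallace_tm("ATGCATGCATGCATGCATGC"))
```

The first number is the "2³⁰-fold" amplification mentioned in the text; in practice the yield is lower because the reaction plateaus once reagents run out.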


After completion of the thermal-cycling (amplification), an aliquot from the PCR reaction is usually
loaded onto an electrophoretic agarose gel (described in chapter 28) to determine whether a DNA
fragment of the expected length was successfully amplified or not. Usually, the original template
DNA will be so dilute that it will not be visible on the gel, only the amplified PCR product. The
presence of a sharp band of the expected length indicates that PCR was able to amplify its target.
If the purpose of the PCR was to test for the presence of a particular template sequence, this is
the end of the experiment. Otherwise, the remaining PCR product can be used as starting material for
a variety of other techniques such as sequencing or cloning.

Figure 7.
PCR with the three phases of the thermal cycle numbered. The template strand (blue) is replicated
using primers (red), to prime the newly synthesized strands (green). The green strands, which are
flanked by the two primer sites, will increase in abundance exponentially through successive PCR
cycles.
(Wikipedia-madprime- CC0 1.0)

2.2. REAL TIME PCR / QUANTITATIVE PCR (QPCR)
In a standard PCR reaction, the DNA molecule of interest is amplified and then the products are
typically visualized at the end of the reaction on an electrophoresis gel. On the other hand, a
procedure known as real-time PCR or quantitative PCR (qPCR) detects the replicated DNA molecules
during the amplification process. qPCR uses fluorescent molecules and relies on the fluorescence of
the amplified product measured over a number of cycles. However, the procedure of amplifying the DNA
molecule is identical to the standard PCR procedure. There are two ways of performing qPCR.
(1) Using a fluorescent chemical molecule, known as a fluorochrome, that binds to all double
stranded DNA molecules (nonspecific).
In the first method, the fluorescent dye molecule binds to any double stranded DNA molecule. After
each cycle of amplification, the amount of ds-DNA synthesized can be quantified by measuring the
fluorescence. The intensity of the fluorescence indicates the amount of DNA present.
(2) Using a fluorescent reporter probe (specific).
The second method uses a fluorescent reporter probe that hybridizes with the DNA sequence of
interest. When the Taq polymerase replicates the DNA molecule, it degrades the probe and the
fluorescent molecule is released into the solution. This increases the intensity of fluorescence.
The fluorescence is measured by the real-time PCR machine, which thereby quantifies the DNA
molecules being synthesized.
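Why measuring fluorescence cycle by cycle is quantitative can be shown with a toy model. The sketch below is an illustration, not a description of any particular instrument: it assumes fluorescence is proportional to the number of product molecules, that the product multiplies by (1 + efficiency) every cycle, and that the machine reports the cycle at which an arbitrary detection threshold is crossed. The threshold value and the function name are hypothetical.

```python
import math


def cycles_to_threshold(initial_copies, threshold_copies, efficiency=1.0):
    """Number of cycles needed for the product to reach the detection threshold.

    efficiency = 1.0 means perfect doubling each cycle (an idealization).
    """
    return math.log(threshold_copies / initial_copies) / math.log(1 + efficiency)


if __name__ == "__main__":
    threshold = 1e10  # hypothetical number of copies needed for a detectable signal
    for start in (100, 1_000, 10_000):
        print(start, "starting copies ->", round(cycles_to_threshold(start, threshold), 2), "cycles")
```

In this model a sample with ten times more starting template crosses the threshold about 3.3 cycles earlier (log2 of 10), so the cycle at which the signal appears reports the starting amount.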

2.3. REVERSE TRANSCRIPTASE PCR (RT-PCR)

Reverse Transcriptase PCR (RT-PCR) can detect both the quality and quantity of mRNA molecules (gene
transcription). As a result, we are able to find out the spatial (where the gene is expressed) and
temporal (when the gene is expressed) level of gene expression.
Here is how it works (Figure 8):
(1) mRNA is extracted from the cell, tissue, or organism.
(2) An enzyme called reverse transcriptase (obtained from a retrovirus – see Chapter 30) is added,
along with oligo-dT, which anneals to the poly-A tail and acts as a primer, to synthesize DNA that
is complementary to the mRNA (cDNA).
(3) The mRNA template is degraded, and the cDNA is added to a PCR reaction to amplify a specific
gene sequence. If amplification occurs, the mRNA is present; if not, then it is absent. This permits
the quantitation of a specific mRNA (gene) sequence.
The amplified products visualized on a gel verify the existence and the quantity of the gene of
interest. By extracting mRNA at different stages, we can figure out the temporal level of gene
expression. If we extract mRNAs from different cell types, we can figure out the spatial level of
gene expression.

Figure 8.
The process of RT-PCR. Isolated mRNA can be reverse transcribed, and then used as a template in a
PCR reaction to amplify one or more sequences using gene specific primers.
(Original-J. Locke-CC BY-NC 3.0)
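Step (2) above can be mimicked on paper or in code. The following sketch is a simplified illustration: it treats first-strand cDNA as the exact reverse complement of the mRNA (with T in place of U), which is why an oligo-dT primer pairs with the poly-A tail and ends up at the 5' end of the cDNA. The example mRNA is invented and far shorter than a real transcript.

```python
RNA_TO_CDNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(mrna):
    """Return first-strand cDNA (written 5'->3') complementary to an mRNA (5'->3')."""
    return "".join(RNA_TO_CDNA[base] for base in reversed(mrna))


if __name__ == "__main__":
    mrna = "AUGGCCUAAAAAAAA"          # hypothetical mRNA ending in a short poly-A tail
    cdna = reverse_transcribe(mrna)
    print(cdna)                        # begins with the T's contributed by the oligo-dT primer
```

The resulting cDNA, unlike the mRNA, can serve as a template for the DNA polymerase used in PCR.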

2.4. AN APPLICATION OF PCR: THE STARLINK AFFAIR
PCR is very sensitive (meaning it can detect very small starting amounts of DNA), and specific
(meaning it can amplify only the target sequence from a mixture of many DNA sequences). Due to these
characteristics, PCR has many practical applications. For example, PCR can detect trace DNA
contaminants in food, air, water or cells. The presence or absence and the type or species of the
contaminant can be identified.
As an example, PCR was used as a tool to test whether genetically modified corn was present in
consumer products on supermarket shelves. Although 85% of corn in the United States is currently
(2013) genetically modified and contains genes that government regulators have approved for human
consumption, in 2000 environmental groups showed that a strain of genetically modified corn, which
had only been approved for use as animal feed, had been mixed in with corn used in producing human
food, like taco shells. To do this, the groups purchased taco shells from stores in the Washington
DC area, extracted DNA from the taco shells and used it as a template in a PCR reaction with primers
specific for the unauthorized gene (Cry9C). Their suspicions were confirmed when they ran this PCR
product on an agarose gel and saw a band of the size expected for Cry9C. The PCR test was sensitive
enough to detect one transgenic kernel in a whole bushel of corn (1 per 100,000). The company
(Aventis) that sold the transgenic seed to farmers had to pay for the destruction of large amounts
of corn, and was the target of a class action law-suit by angry consumers who claimed they had been
made sick by the taco shells. Although no legitimate cases of harm were ever proven, the plaintiffs
were awarded $9 million, of which $3 million went to legal fees, and the remainder of the judgment
went to the consumers in the form of coupons for taco shells. The affair damaged the company, and
exposed a weakness in the way genetically modified crops were handled in the United States at the
time.
PCR can also be used in medical diagnostic tests for detection of pathogens in blood, tissues and
body fluids. More recently PCR has been used in the genotyping of patients to match their care with
specific treatments for better outcomes.
PCR is also used for DNA genotyping of biological samples in forensic or criminal investigations.
People can be genotyped for identification purposes, so as to match with samples present at a crime
scene or to establish family relationships in paternity/maternity cases. Genotypes also establish
identity of people for future comparisons, much like taking fingerprints.


___________________________________________________________________________
SUMMARY:
• Molecular biology involves the isolation and analysis of DNA and other macromolecules.
• Isolation of total genomic DNA involves separating DNA from protein and other cellular components,
for example by ethanol precipitation of DNA.
• PCR can be used as part of a sensitive method to detect the presence of a particular DNA sequence.
• PCR can also be used as part of a method to isolate and prepare large quantities of a particular DNA
sequence.
• qPCR methodology allows the quantity of DNA product to be measured.
• RT-PCR methodology detects the quantity and quality of the mRNA, which indicates the spatial and
temporal level of gene expression.
KEY TERMS:
classical genetics thermal-cycle
molecular biology denature
molecular genetics anneal
macromolecules extension
lysis thermostable
detergent Taq DNA pol
chelating agent electrophoretic agarose gel
EDTA fluorochrome
nuclease Reverse Transcriptase PCR (RT-PCR)
supernatant Temporal level
pellet Spatial level
PCR StarLink affair
primer Cry9C gene


STUDY QUESTIONS:
1) What information, and what reagents would
you need in order to use PCR to detect HIV in a
human blood sample?
2) If you started with 10 molecules of double
stranded DNA template, what is the maximum
number of molecules you would have after
10 PCR cycles?
3) What is present in a PCR tube at the end of a
successful amplification reaction? With this in
mind, why do you usually only see a single,
sharp band on a gel when it is analyzed by
electrophoresis?


OBSERVING INTACT CHROMOSOMES – CHAPTER 32

CHAPTER 32 – OBSERVING INTACT CHROMOSOMES

Figure 1.
Fluorescence in situ hybridization of mitotic chromosomes from a
human cancer cell.
(Wikipedia - Pmx- CC BY-SA 3.0)

INTRODUCTION
A lot of information can be obtained from a visual observation of human chromosomes. This chapter
will discuss two techniques: bright field microscopy and fluorescence in situ hybridization
(Figure 1). While all the examples come from humans, these methods can be applied to any organism.

1. BRIGHT FIELD MICROSCOPY
1.1. MAKING A METAPHASE CHROMOSOME SPREAD
The most commonly observed chromosomes are those from white blood cells in the metaphase stage of
mitosis. The protocol is as follows:
Step 1 Obtain a sample of whole blood from a person and add to culture medium.
Step 2 Add lymphocyte growth factor proteins to stimulate white blood cells. (Remember, red blood
cells lack nuclei and chromosomes.) After three days, the cells have reproduced several times and
become more numerous and many are in the process of mitosis.
Step 3 Add colcemid, a microtubule inhibitor. The cells continue through the cell cycle and will
arrest in metaphase. This is because a cell can't enter anaphase without functional microtubules.
More white blood cells will arrest in metaphase. Metaphase cells are best because their chromosomes
are condensed and there is no nuclear envelope to get in the way.
Step 4 Transfer the cells to a hypotonic environment. Water enters the cells and they swell up and
become delicate. Fix the cells with a mix of acetic acid and methanol.
Step 5 Drop the solution containing the cells onto a glass slide. The cells burst on contact,
leaving behind a cluster of chromosomes for each cell.
Step 6 Soak the slide in a chromosome staining solution. If Giemsa is used by itself the chromosomes
become a uniform dark purple colour. If Giemsa and Trypsin are used together the chromosomes take on
dark purple and light purple bands. This pattern of Giemsa-dark and Giemsa-light bands is consistent
and can be used to identify chromosomes and chromosome regions. These protocols are known as Giemsa
staining and G-banding, respectively.
Step 7 Observe the slide with a bright field microscope.


Figure 2 shows an example of Giemsa stained chromosomes. These images are called metaphase
chromosome spreads because each cell's chromosomes are randomly displayed on the surface of the
slide.

Figure 2.
A human metaphase chromosome spread. This image shows the 46 chromosomes that came from a single
cell. There will be dozens of collections of chromosomes like this over the entire slide.
(Wikipedia-Steffen Dietzel-CC BY-SA 3.0)

1.2. USING BRIGHT FIELD MICROSCOPY TO DIAGNOSE DOWN SYNDROME
Recall that Down syndrome is usually due to a person having three copies of chromosome 21, a
situation known as trisomy-21. If a newborn has the physical and mental properties suggestive of
Down syndrome a physician will likely order a chromosome test. A cytogeneticist will take a blood
sample and make a slide of metaphase spreads. Each spread would show 47 chromosomes in total for a
person with trisomy-21. Chromosome 21 can be recognized by its characteristic length, centromere
location, and Giemsa banding pattern. Idiograms are maps showing this information (Figure 3).

Figure 3.
An idiogram of human chromosome 21. Note that only a single chromatid is shown even though the map
is of a replicated metaphase chromosome. The constriction near the top is the centromere and the
Giemsa-dark and Giemsa-light bands are coloured black and white, respectively.
(ghr-U.S. National Library of Medicine-PD)

To confirm that it is in fact trisomy-21 at least one of these spreads will have its chromosomes
arranged into a karyogram pattern (see the Chapter 15 on Human Chromosomes). Figure 4 shows what the
cytogeneticist is looking for.

Figure 4.
A karyogram from a male with Down syndrome / trisomy-21.
(Wikipedia- U.S. Department of Energy Human-PD)

This karyogram is made with G-banded chromosomes. The extra chromosome is indeed number 21 so this
person has a karyotype of 47,XY,+21.
The table below summarizes the terms introduced in this section. While these are the definitions
proposed by the International Standing Committee on Human Cytogenetic Nomenclature, some people use
the terms interchangeably.
Bright field microscopy has its limitations though - it only works with mitotic chromosomes and many
chromosome rearrangements are either too subtle or too complex for even a skilled cytogeneticist to
discern. Even with a more powerful phase contrast or DIC microscope there are limits to what can be
seen using G-banding. There is a more powerful technique, one based upon hybridization probes, the
topic of the next section.


Term                           Definition
metaphase chromosome spread    a picture of all the chromosomes from a cell as they appear on the slide
karyogram                      a picture of all the chromosomes from a cell rearranged into the standard pattern
idiogram                       a map of one or more chromosomes showing its Giemsa banding pattern
karyotype                      a written description of a person's chromosome composition
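Because a karyotype is a written, comma-separated description, it can be split into its parts mechanically. The sketch below is a minimal illustration that handles only the simple forms used in this chapter (for example 47,XY,+21 or 46,XX,del(5)); it is not a parser for the full cytogenetic nomenclature, and the function name and dictionary keys are made up for the example.

```python
def parse_karyotype(text):
    """Split a simple karyotype string into chromosome count, sex chromosomes, and changes."""
    fields = [field.strip() for field in text.split(",")]
    return {
        "count": int(fields[0]),          # total number of chromosomes
        "sex_chromosomes": fields[1],     # e.g. "XX" or "XY"
        "abnormalities": fields[2:],      # e.g. ["+21"]; empty if none are listed
    }


if __name__ == "__main__":
    print(parse_karyotype("47,XY,+21"))
    print(parse_karyotype("46,XX"))
```

Reading the first example back gives 47 chromosomes in total, an X and a Y, and one listed change: an extra copy of chromosome 21.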

2. HYBRIDIZATION PROBES
2.1. WHAT HYBRIDIZATION PROBES ARE
If you have a large amount of DNA, for example an
entire chromosome or a large collection of
restriction fragments, how could you identify a
single gene? One method is to use a hybridization
probe. These are a collection of short pieces of
single stranded DNA that can bind to the target
gene. Both the probe and target need to be single
stranded so they can pair using complementary
base pairing (As with Ts and Gs with Cs). In short,
the procedure is to denature the target DNA, add
the probe, and then detect where the probe has
stuck (Figure 5). This section will discuss making probes and using them. Molecular geneticists use
hybridization probes in different protocols. Later in this chapter we will cover fluorescence in
situ hybridization (FISH) while Chapter 34 covers Southern blotting.

Figure 5.
How a hybridization probe can reveal the location of target DNA.
(Original-Harrington- CC BY-NC 3.0)

2.2. MAKING HYBRIDIZATION PROBES WITH A RANDOM PRIME LABELLING REACTION
There are a few methods to make hybridization probes but the most common is a random prime
labelling reaction. Essentially, we use DNA Polymerases to make our probe for us in a
microcentrifuge tube. The reaction contains the following:
• Template DNAs. This can be a plasmid, a PCR product, or a BAC (bacterial artificial chromosome).
The template DNA must be denatured to make it single stranded first.
• DNA Polymerases. E. coli DNA Pol works well.
• Random oligonucleotides ("oligos"). These are millions of different pieces of single stranded DNA.
They are made synthetically and purchased as a mixture.
• Regular DNA nucleotides. For example dATP, dGTP, and dTTP.
• Labelled DNA nucleotides. For example [Cy3]-dCTP.
You have to provide the template DNA but each of the other components can be purchased from
Biotechnology companies. The only expensive components are the labelled nucleotides, of which there
are many types (Figure 6). Radioactive ones are the best way to do Southern blotting and
fluorescent ones are at the heart of FISH.
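The base-pairing rule that lets a probe find its target (As with Ts, Gs with Cs) can be turned into a simple search. This is a toy sketch rather than a model of real hybridization: it slides the probe's reverse complement along one strand of the target and counts mismatches in each window. The `max_mismatches` parameter is a hypothetical stand-in for how strict the hybridization conditions are, and the sequences are invented and much shorter than a real probe.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def probe_hits(target, probe, max_mismatches=0):
    """List (position, mismatches) for every site on the target the probe could pair with."""
    site = reverse_complement(probe)   # what a perfectly matching target site looks like
    hits = []
    for i in range(len(target) - len(site) + 1):
        window = target[i:i + len(site)]
        mismatches = sum(a != b for a, b in zip(window, site))
        if mismatches <= max_mismatches:
            hits.append((i, mismatches))
    return hits


if __name__ == "__main__":
    target = "CCATGCAAGGTTTTATGCTAGGCC"      # hypothetical single-stranded target
    probe = reverse_complement("ATGCAAGG")   # hypothetical probe for one site
    print(probe_hits(target, probe))                     # exact matches only
    print(probe_hits(target, probe, max_mismatches=1))   # also allows one mismatch
```

Allowing a mismatch reveals a second, imperfect site, which is why the conditions used during hybridization affect how specific a probe is.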


Figure 6.
Two examples of labelled nucleotides.
(Original-Harrington- CC BY-NC 3.0)

The reaction starts when the mixture is placed at 37°C. The oligos bind randomly to the denatured
template DNA and act as primers. From these primers the DNA Pols make new DNA strands. Because some
of the nucleotides are labelled the new DNA strands are labelled as well. After an hour or so the
reaction is halted. When it is time to use the probe it is denatured. The easiest way to denature it
is 5 minutes at 100°C and 5 minutes on ice. What we call the probe is the collection of new DNA
strands: each is about 100 nucleotides long, each contains several labelled nucleotides, and each is
complementary to the DNA used as the template.

2.3. USING HYBRIDIZATION PROBES
Figure 7 summarizes the random prime labelling reaction. How we make our probe will depend upon what
we are trying to detect. The typical sizes of the various template DNAs we can use are:
• PCR product - up to 5000 bp
• plasmid - inserts can be up to 15 000 bp
• BAC - inserts can be up to 350 000 bp
If we want to detect a single gene a hybridization probe made from a plasmid or PCR product will
suffice. But if we want to detect a large chromosome region we have to use BACs. Sometimes several
BACs are used if we want to detect an entire chromosome. BACs for humans and other commonly studied
organisms can be purchased from Biotechnology companies.

Figure 7.
A random prime labelling reaction can use a BAC to make a hybridization probe. The probe will
hybridize to the chromosome region originally used to make the BAC.
(Original-Harrington- CC BY-NC 3.0)

This raises a very important point. No matter what the template DNA was, the probe DNA will be short
pieces of single stranded DNA, typically only 100 nucleotides long. Think of a probe as a cloud of
tiny DNA molecules that can stick along the length of a much larger target region. Chapter 34
describes how hybridization probes are used in Southern blotting while the next section describes
their use in FISH.

3. FLUORESCENCE IN SITU HYBRIDIZATION (FISH)
3.1. HOW FISH WORKS
The solution to the lack of resolution in G-banding is fluorescence in situ hybridization (FISH).
The technique is similar to a Southern blot in that a single stranded DNA probe is allowed to
hybridize to denatured target DNA (see Chapters 30 & 34). However, instead of the probe being
radioactive it is fluorescent, and instead of the target DNA being restriction fragments on a nylon
membrane it is denatured chromosomes on a glass slide. Because there are several fluorescent colours
available it is common to use more than one probe at the same time. Typically, the chromosomes are
also coated with a fluorescent stain called DAPI, which gives them a uniform blue colour. If the
chromosomes have come from a mitotic cell it is possible to see all forty-six of them spread out in
a small area. Alternatively, if the chromosomes are within the nucleus of an interphase cell they
appear together as a large blue sphere. In either case the results are observed with either a
standard or confocal fluorescence microscope.


3.2. USING FISH TO DIAGNOSE DOWN SYNDROME
Most pregnancies result in healthy children. However, in some cases there is an elevated
chance that the fetus has trisomy-21. Older women are at a higher risk because the
non-disjunction events that lead to trisomy become more frequent with maternal age. The second
consideration is what the fetus looks like during an ultrasound examination. Fetuses with
trisomy-21 and some other chromosome abnormalities have a swelling in the back of the neck
called a nuchal translucency. If either, or both, factor is present the woman may choose to
undergo an amniocentesis test. In this test, some amniotic fluid is withdrawn so that the fetal
cells within it can be examined. Figure 8 shows a positive result for trisomy-21. This diagram
is based upon actual results. The colours are:
• Blue. The DNA has been stained with DAPI.
• Red. This hybridization probe binds to the centromere of chromosome 21.
• Green. This hybridization probe binds to the centromere of the X chromosome.
Based upon the available information this fetus has two X chromosomes and three chromosome 21s
and therefore has a karyotype of 47,XX,+21.

3.3. USING FISH TO DIAGNOSE CRI-DU-CHAT SYNDROME
A physician may suspect that a patient has a specific genetic condition based upon the
patient's physical appearance, mental abilities, health problems, and other factors. FISH can
be used to confirm the diagnosis. For example, Figure 9 shows a positive result for Cri-du-chat
syndrome. This diagram is based upon actual results. Cells from a patient's blood were prepared
to show an interphase nucleus (a) and mitotic chromosomes (b). There are three colours shown in
the diagram:
• Blue. The DNA has been stained with DAPI.
• Green. This hybridization probe binds within the short arm of chromosome 5. This region is
absent in people with Cri-du-chat syndrome.
• Red. This hybridization probe binds within the long arm of chromosome 5. It is used to
identify chromosome 5.
The results show both chromosome 5s have intact long arms but one is missing part of its short
arm. This child has the karyotype 46,XX,del(5), indicative of Cri-du-chat syndrome.

Figure 8.
Confirmation of Down syndrome in a fetus using amniocentesis and FISH.
Based upon: Antonarakis, S. E. et al. 2004. Chromosome 21 and Down syndrome: From genomics to
pathophysiology. Nature Reviews Genetics 5:725-738 PubMed ID: 15510164.
(Original-Harrington- CC BY-NC 3.0)

Figure 9.
Confirmation of Cri-du-chat syndrome in a child using interphase cells, metaphase cells, and
FISH. Based upon: Fang J.-S. et al. 2008. Cytogenetic and molecular characterization of
three-generation family with chromosome 5p terminal deletion. Clinical Genetics 73:585-590
PubMed ID: 18400035 (Original-Harrington- CC BY-NC 3.0)


3.4. NEWER TECHNIQUES

FISH is an elegant technique that produces dramatic images of our chromosomes. Unfortunately,
FISH is also expensive, time-consuming, and requires a high degree of skill. For these reasons,
FISH is slowly being replaced with PCR and DNA chip based methods. Versions of these techniques
have been developed that can accurately quantify a person's DNA. For example, a sample of DNA
from a person with Down syndrome will contain 50% more DNA from chromosome 21 (150% of the
normal amount) than from the other chromosomes. Likewise, DNA from a person with Cri-du-chat
syndrome will contain 50% less DNA from the end of chromosome 5. These techniques are very
useful if the suspected abnormality is a deletion, a duplication, or a change in chromosome
number. They are less useful for diagnosing chromosome inversions and translocations because
these rearrangements often involve no net loss or gain of DNA sequences (genes).
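
The dosage arithmetic behind these quantitative tests can be sketched in a few lines of Python. This is only an illustration: it assumes an idealized assay whose signal is directly proportional to copy number, and the function name relative_dosage is made up for this example.

```python
from fractions import Fraction

def relative_dosage(copies_in_patient, normal_copies=2):
    """Amount of DNA from a region relative to a normal diploid cell."""
    return Fraction(copies_in_patient, normal_copies)

cases = {
    "trisomy-21 (three copies of chromosome 21)": relative_dosage(3),
    "Cri-du-chat (one copy of the end of 5p)": relative_dosage(1),
    "balanced inversion or translocation": relative_dosage(2),
}

for label, ratio in cases.items():
    change = float((ratio - 1) * 100)
    # e.g. a ratio of 3/2 is 150% of normal, which is a +50% change
    print(f"{label}: {float(ratio):.0%} of normal ({change:+.0f}%)")
```

A balanced rearrangement gives a ratio of exactly 1, which is why methods that only measure the amount of DNA cannot see it.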

In the future, all of these techniques will likely be replaced with DNA sequencing. Each new
generation of genome sequencing machines can sequence more DNA in less time. Eventually it will
be cheaper just to sequence a patient's entire genome than to use FISH or PCR to test for
specific chromosome defects.


___________________________________________________________________________
SUMMARY:
• Human chromosomes can be observed in either metaphase chromosome spreads or within intact
nuclei.
• DNA stains such as Giemsa and DAPI bind to DNA non-specifically.
• Hybridization probes bind to DNA at specific target sites. They are collections of DNA molecules that
are (i) short, (ii) single stranded, (iii) contain a labelled nucleotide, and (iv) are complementary to a
target region.
• Chromosomes can be prepared with either Giemsa staining or G-banding and then observed with a
visible light microscope.
• Chromosomes can be prepared with both DAPI staining and fluorescently labelled hybridization probes
and then observed with a fluorescence microscope.
• Chromosome number or structural abnormalities can be recognized in a metaphase chromosome
spread, an intact nucleus, or a karyogram diagram. They can be summarized in a karyotype statement.
KEY TERMS:
Giemsa
Giemsa staining
G-banding
metaphase chromosome spread
idiogram
karyogram
karyotype
hybridization probe
fluorescence in situ hybridization (FISH)
plasmid
PCR product
bacterial artificial chromosome (BAC)
labelled nucleotide
DAPI


QUESTIONS:
1) Giemsa and DAPI are both used to label DNA.
Why can't we use only Giemsa or only DAPI in
human cytogenetics?
2) What changes would you have to make to the
karyogram in Figure 4 to make it show the
chromosomes from the patient with Cri-du-chat
syndrome described in Figure 9?
3) What are the similarities and differences
between a PCR reaction and a random prime
labelling reaction?
4) In nucleotide triphosphates, the phosphates are
named alpha, beta, and gamma. In Figure 6
why is it the alpha phosphate that is
radioactive?
5) What would Figure 8 look like if it also showed
metaphase chromosomes from another cell?
6) Some men have an extra Y-chromosome. What
is their karyotype? Describe as many ways as
you can to detect this chromosome
abnormality.
7) Some women have an extra X-chromosome.
What is their karyotype? Describe as many
ways as you can to detect this chromosome
abnormality.
8) Design a FISH based experiment to find out if
someone is a 47,XXX female or a 47,XYY male.


DNA SEQUENCING – CHAPTER 33

CHAPTER 33 – DNA SEQUENCING

Figure 1.
Output from an automated Sanger DNA sequencer.
(Original-Harrington- CC BY-NC 3.0)

INTRODUCTION
DNA sequencing determines the order of nucleotide bases for a DNA molecule. These DNA molecules
could be as small as a single restriction fragment or an entire gene, or as large as an
organism's entire genome. Most DNA sequencing at the University of Alberta is done by the
Molecular Biology Service Unit (MBSU). They use three machines: (1) an Applied Biosystems ABI
3730, (2) an Illumina MiSeq, and (3) an Illumina NextSeq 500, each of which has its own
advantages and purposes. The 3730 uses an older technology called automated Sanger sequencing,
while the Illumina machines perform next-generation DNA sequencing. This chapter will cover how
these machines work and what they are used for.

1. AUTOMATED SANGER DNA SEQUENCING
1.1. HISTORICAL CONTEXT
DNA sequencing has had a long history. Beginning in the 1970s there have been many methods and
improvements. Some dates that stand out are:
• 1977 - Frederick Sanger invents a popular method, later called manual Sanger sequencing.
• 1986 - Leroy Hood improves upon this method to invent automated Sanger sequencing.
• 1987 - Applied Biosystems begins selling a machine to perform automated Sanger sequencing,
their ABI 370.
• 1995 to 2003 - Using ABI 370s, ABI 377s, and similar machines scientists in the US, UK, and
other countries sequenced the human genome.
• 2002 - Applied Biosystems begins selling the ABI 3730 (Figure 2) which became the most
popular way to do automated Sanger sequencing and remains so to this day.

Figure 2.
The Applied Biosystems ABI 3730 in the MBSU (Molecular Biology Service Unit, Biological
Sciences Department, U. of Alberta).
(Original-Harrington- CC BY-NC 3.0)


Figure 3.
An automated Sanger
sequencing reaction. The regular
dNTPs are shown here as black
while the ddNTPs are in colour.
(Original-Deyholos- CC BY-NC
3.0)

1.2. HOW AUTOMATED SANGER DNA SEQUENCING WORKS
Recall that DNA Polymerases incorporate nucleotides (dNTPs) into a growing strand of DNA, based
on the sequence of a template strand. DNA Polymerases add a new base only to the 3’-OH group of
an existing strand of DNA; this is why primers are required in natural DNA synthesis and in
techniques such as PCR. Automated Sanger sequencing relies on the random incorporation of
modified nucleotides called dideoxy nucleotides (ddNTPs, Figure 3). These lack a 3’-OH group
and therefore cannot serve as an attachment site for the addition of the next nucleotide. After
a ddNTP is incorporated into a strand of DNA, no further elongation can occur. The ddNTPs are
labelled with one of four fluorescent dyes, each specific for one of the four nucleotide bases
(Figure 4).

Figure 4.
ddNTPs are synthetic DNA nucleotides that lack a 3' hydroxyl group. If a DNA Polymerase uses
one it can't continue.
(Original-Harrington- CC BY-NC 3.0)
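
The chain-termination idea can be sketched as a small simulation. The Python below is a toy model under several stated assumptions, not a description of how the ABI software works: the template is written in the order the polymerase reads it, the primer is left out, every base has the same chance of being a ddNTP, and the names extend_once and sanger_read are invented for this example.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def extend_once(template, ddntp_fraction, rng):
    """Copy the template base by base until a ddNTP happens to be used.

    Returns the terminated strand, or None if the end of the template was
    reached using only regular dNTPs (no dye, so nothing to detect).
    """
    strand = []
    for base in template:
        strand.append(COMPLEMENT[base])
        if rng.random() < ddntp_fraction:  # ddNTP used: no 3'-OH, so stop
            return "".join(strand)
    return None

def sanger_read(template, n_molecules=20000, ddntp_fraction=0.05, seed=1):
    """Read a sequence from the terminal base of products sorted by length."""
    rng = random.Random(seed)
    last_base_by_length = {}
    for _ in range(n_molecules):
        product = extend_once(template, ddntp_fraction, rng)
        if product is not None:
            # The dye on the ddNTP reports the last base of this product.
            last_base_by_length[len(product)] = product[-1]
    # Electrophoresis orders products from shortest to longest.
    return "".join(last_base_by_length[n] for n in sorted(last_base_by_length))

print(sanger_read("TACGGATCCA"))
```

Because termination is random and there are many template molecules, every possible length is represented, so the terminal dyes read in order of length spell out the new strand.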
Figure 5.
Each ddNTP used in a sequencing reaction is attached to a different fluorescent molecule using
a long chain of carbons. Note that black ink is used to represent yellow fluorescence.
(Original-Harrington- CC BY-NC 3.0)

To sequence a DNA fragment, you need many copies of that fragment (Figure 5). Unlike PCR,
automated Sanger sequencing does not amplify the target sequence and only one primer is used.
This primer is hybridized to the denatured template DNA, and determines where on the template
strand the sequencing reaction will begin. A mixture of regular dNTPs, fluorescently-labelled
ddNTPs, and DNA Polymerase is added to a tube containing the primer-template hybrid. The DNA
Polymerase will then synthesize a new strand of DNA until a fluorescently-labelled ddNTP
nucleotide is incorporated, at which point extension is terminated. Because the reaction
contains millions of template molecules, a sufficient number of shorter molecules is
synthesized, each ending in a fluorescent label that corresponds to the last base incorporated.
The newly synthesized strands can be denatured from the template, and then separated
electrophoretically based on their length (number of bases). The ABI machine is used for this
electrophoresis step. While the original, old ABI 370 used a slab gel similar to the ones used in


undergraduate labs, the newer ABI 3730 uses capillary tube electrophoresis (Figure 6). In this
machine each sample travels through its own tube. Near the end of the tube is a laser, which
excites any fluorescent dyes moving past, and a detector that collects any emitted light. As
each DNA molecule moves past the laser/detector it emits a specific colour. Because there will
be several molecules with the same length and same colour the result appears as a peak of
colour. A computer monitoring the results can add the sequence information to the colours since
red = T and so on. In this way, the DNA sequence can be read simply from the order of the
colours in successive peaks.

The results from a sequencing reaction are presented as a chromatogram. While Figure 6 only
shows 9 peaks, a successful sequencing reaction will generate about 700 nucleotides worth of
data. The figure shows the results from a single tube but in fact there can be 48 or 96 tubes
in total. Thus, in a single “run” an ABI 3730 machine can sequence up to 67,000 bp of DNA.

Figure 6.
Fluorescently labeled products can be separated by capillary electrophoresis, generating a
chromatogram from which the sequence can be read.
(Wikimedia-Abizar-PD)

1.3. USING AUTOMATED SANGER DNA SEQUENCING TO SEQUENCE A PLASMID
Making a new recombinant plasmid takes time and money. You will want to confirm that it has the
DNA sequence it should before you use it for important experiments. A simple way to find out is
to sequence it. Let's say you have put a 3.0 kb insert into a pBluescript II plasmid and right
now the recombinant plasmid is in E. coli cells (Figure 7). The first step is to isolate
plasmid DNA from some of the cells with a mini-prep protocol. This will be the template DNA.
The primer will be oligonucleotides complementary to the pBluescript II vector adjacent to the
insert. The sequencing reaction will tell you the sequence of the insert DNA within the plasmid.

Figure 7.
Using automated Sanger sequencing to sequence a plasmid.
(Original-Harrington- CC BY-NC 3.0)

1.4. USING AUTOMATED SANGER DNA SEQUENCING TO SEQUENCE A GENE
If you suspect that an organism has a mutation in a specific gene you can use automated Sanger
sequencing to find out (Figure 8). Let's say you have a mouse strain and you think it has a
mutation in a gene you are studying. As before, the first step is to isolate DNA. However, we
can't sequence this DNA directly. Amongst all of the genomic DNA there just aren't enough
copies of the gene to serve as the template DNA. To overcome this limitation a PCR reaction is
used to amplify the gene sequence in question. Then we sequence the PCR product. Depending upon
how large the gene is it may take several PCR products and several sequencing reactions to get
the whole sequence.

Figure 8.
Using automated Sanger sequencing to sequence a gene. The site of a point mutation is shown
here as 'm'.
(Original-Harrington- CC BY-NC 3.0)
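
Once a PCR product from the suspected mutant has been sequenced, finding a point mutation comes down to comparing the read with the known sequence. The Python below is a minimal sketch of that comparison. The two sequences are made up, the function name find_point_mutations is hypothetical, and the code assumes the sequences are already aligned and of equal length, which a real read would need alignment software to ensure.

```python
def find_point_mutations(reference, read):
    """List (position, reference_base, read_base) wherever the sequences differ.

    Positions are 1-based. Both sequences must already be aligned.
    """
    if len(reference) != len(read):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (position, ref_base, read_base)
        for position, (ref_base, read_base) in enumerate(zip(reference, read), start=1)
        if ref_base != read_base
    ]

wild_type = "ATGGCCTTAGACGT"    # made-up reference sequence
mouse_read = "ATGGCCTGAGACGT"   # made-up read from the PCR product

for position, ref_base, read_base in find_point_mutations(wild_type, mouse_read):
    print(f"position {position}: {ref_base} changed to {read_base}")
```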


2. NEXT-GENERATION DNA SEQUENCING
2.1. HISTORICAL CONTEXT
Sequencing a single gene or plasmid with an ABI 3730 is quick and inexpensive. But sequencing a
whole genome this way would be very slow and very expensive. There are two reasons for this.
The first is that automated Sanger sequencing requires many copies of the template DNA. A
sample of purified plasmid DNA or purified PCR product has millions of copies of the target
region. But a sample of genomic DNA has only a few copies of a specific target region. For many
years, the only way to sequence an organism was to isolate its genomic DNA, break the DNA into
large pieces, and then clone these pieces into BAC (bacterial artificial chromosome) vectors.
The BAC clones would then have to be sequenced one by one. Most of the 13 years and millions of
dollars it took to sequence the human genome was spent making and organizing these BAC clones.

The second limitation of automated Sanger sequencing is that each reaction can only generate
700 nucleotides worth of data. It took literally millions of independent sequencing reactions
to sequence the human genome.

Beginning in the late 1990s scientists realized that there was a need for a machine that could
sequence genomic DNA directly and with a single reaction. In several instances a technology was
invented in a university lab, developed in a small biotechnology company, and then purchased by
a larger biotech company. An example of this is:
• 1996 - Swedish scientists invent a completely new way to sequence DNA called pyrosequencing.
It is clever but very labour intensive.
• 2000 - An American inventor and entrepreneur, Jonathan Rothberg, refines their technique into
automated pyrosequencing.
• 2004 - His company, 454 Life Sciences, markets the first so-called next-generation sequencing
machine.
• 2007 - The largest biotech company in the world, Roche, buys 454 Life Sciences.
• 2015 - 454, now a subsidiary of Roche, continues to develop and sell next-generation machines.
In 2015, there are several choices for next-generation sequencing. For a few hundred thousand
dollars you can purchase a GS FLX (made by 454/Roche), an ABI 5500 (Applied Biosystems), or an
Ion Proton (Life Technologies). Each is a fancy looking machine that uses a unique and
proprietary technology.

2.2. HOW NEXT GENERATION DNA SEQUENCING WORKS
As mentioned in the introduction to this chapter, the MBSU recently purchased two next
generation machines: an Illumina MiSeq and an Illumina NextSeq 500 (Figure 9).

Both use a similar workflow (Figure 10). The scientist has to isolate genomic DNA from an
organism (step 1) and then use a kit to break it into small fragments (step 2). The scientist
then loads the fragments into the machine and turns it on. Once inside, the DNA fragments are
isolated from each other (step 3), amplified in place (step 4), and finally sequenced (step 5).
The technology is called sequencing by synthesis. Illumina has made animated movies of what
happens within their machines:
www.youtube.com/watch?v=HMyCqWhwB8E

Figure 9.
The Illumina MiSeq in the MBSU (Biological Sciences Department, U. of Alberta). The door has
been opened to show where the DNA sample and reaction mixtures are loaded.
(Original-Harrington- CC BY-NC 3.0)


The output is just raw sequence data; there are no chromatograms. Powerful software is needed
for sequence assembly, the process of joining these small pieces of sequence data into a
continuous sequence (Figure 11). Ultimately there will be one sequence for each of the
organism's chromosomes.
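
Sequence assembly can be illustrated with a greedy overlap merge. The Python below is a deliberately simplified sketch and not the algorithm used by any particular assembly program: it assumes error-free reads that all come from the same strand and a sequence without repeats, and the function names overlap and greedy_assemble are invented for this example.

```python
def overlap(a, b, min_length=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Overlaps shorter than min_length are ignored and reported as 0.
    """
    for n in range(min(len(a), len(b)), min_length - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_length=3):
    """Repeatedly merge the pair of reads that share the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_length)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing overlaps; the remaining pieces stay separate
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads

print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))
```

Real genomes contain repeated sequences and reads contain errors, which is one reason the text describes the software as powerful.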

Figure 10.
Next generation DNA sequencing using Illumina's sequencing by synthesis technology. Steps 1 and
2 are done by the researcher and steps 3, 4, and 5 occur within the machine.
(Original-Harrington- CC BY-NC 3.0)

2.3. COMPARISON BETWEEN DNA SEQUENCING METHODS
Scientists all over the world now have a choice between automated Sanger sequencing and
next-generation sequencing. For example, at the University of Alberta your choices at the MBSU
are shown in Table 1.

Table 1. Comparison between different sequencing machines.
Machine | ABI 3730 | Illumina MiSeq | Illumina NextSeq 500
DNA | plasmid or PCR product | genomic DNA | genomic DNA
Technology | automated Sanger | sequencing by synthesis | sequencing by synthesis
Data generated | 700 bp | 540 Mb to 15 Gb | 16 Gb to 120 Gb
Price | $4.75 | $1,250 - $1,850 | $2,050 - $5,150

Recall that DNA is measured in base pairs where:
1 kilobase (kb) = 1 000 base pairs (bp)
1 Megabase (Mb) = 1 000 000 bp
1 Gigabase (Gb) = 1 000 000 000 bp

Let's say you wanted to sequence a 2,000 bp long PCR product. You could do this with three
sequencing reactions in the ABI 3730 or a single run in the Illumina MiSeq. The first method
would cost $15 and the second would cost $1,250. Even though one machine is a decade older it
is still the way to go! If you did use the MiSeq you'd end up sequencing the same PCR product
over and over. It wouldn't produce any more data.
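
The cost comparison for the 2,000 bp PCR product can be worked through with the figures from Table 1. The Python below is a rough sketch: it assumes each Sanger reaction yields a full 700 bp and ignores the overlap that neighbouring reads would need in practice, and the function name sanger_cost is made up for this example.

```python
import math

SANGER_READ_BP = 700       # data per reaction, from Table 1
SANGER_PRICE = 4.75        # dollars per reaction, from Table 1
MISEQ_MIN_PRICE = 1250     # cheapest MiSeq run in dollars, from Table 1

def sanger_cost(target_bp):
    """Number of reactions and total cost to cover a target once."""
    reactions = math.ceil(target_bp / SANGER_READ_BP)
    return reactions, reactions * SANGER_PRICE

reactions, cost = sanger_cost(2000)
print(f"{reactions} Sanger reactions cost ${cost:.2f}")
print(f"A MiSeq run costs about {MISEQ_MIN_PRICE / cost:.0f} times as much")
```

Three reactions at $4.75 come to $14.25, which the text rounds to $15.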

Figure 11.
Assembling the DNA sequence of
a chromosome from many
smaller sequences.
(NHGRI-Darryl Leja-PD)


On the other hand, let's say you wanted to sequence your own DNA. Even if you don't consider
the time and cost of making the BAC clones it would still cost millions of dollars to do all of
the sequencing reactions in the ABI 3730. Conversely the MBSU could use their NextSeq 500 and
have everything done in two days for about $4 000. Each of your 46 chromosomes would be
sequenced about 30 times. A more expensive machine, the Illumina HiSeq, can sequence human DNA
for about $1 000 a person.

2.4. USING NEXT-GENERATION DNA SEQUENCING TO SEQUENCE HUMANS
Even though we know the average human DNA sequence, each of us is unique. There are two reasons
why human DNA continues to be sequenced (Table 2).

2.5. USING NEXT-GENERATION DNA SEQUENCING TO SEQUENCE OTHER ORGANISMS
Next-generation sequencing has made it feasible to sequence anything. Here are just a few
examples (Table 3).

Table 2. Using next-generation sequencing to sequence humans.
Use of next-generation sequencing | Description
Personalized genomics | If we sequence a person's DNA it can reveal information about their susceptibility to disease and their response to various medical treatments.
Tumour cell sequencing | If a person has cancer it is now possible to sequence individual cancer cells. This has revolutionized how physicians help their patients. Instead of treatments based upon the location of tumours, treatments can now be designed around the genetic defects that lead to the cells becoming cancerous in the first place.

Table 3. Using next-generation sequencing to sequence other organisms.
Use of next-generation sequencing | Description
De novo sequencing | This is when an organism is sequenced for the first time. For example, in 2014 researchers in Sierra Leone sequenced 99 Ebola virus genomes from 78 patients. They identified changes in the virus that caused the recent outbreak.
Metagenomics | This is when the entire collection of DNA in an environment is sequenced to determine which species are present. This technique has been used to show that a person's gut microbes vary with their diet.
RNA Seq | This is when RNAs from a tissue or organ are isolated, copied into DNA molecules, and then sequenced. This reveals which genes were active in the cell, tissue, or organ.


___________________________________________________________________________
SUMMARY:
• Automated Sanger sequencing became commonplace in 1987 and is still used today. It is used to
sequence plasmids and PCR products. The most popular machines are Applied Biosystems' ABI 3730s.
• Next-generation sequencing began in 2004 as a better way to sequence whole genomes. There are
several competing technologies, for example Illumina's MiSeq machine and its sequencing by synthesis
technology.
• Sequencing centres offer both types of sequencing today.
• Sequencing any DNA molecules, large or small, is now fast and inexpensive.
KEY TERMS:
automated Sanger sequencing
Applied Biosystems ABI 3730
DNA Polymerases
primer
regular dNTPs
fluorescently-labelled ddNTPs
capillary tube electrophoresis
chromatogram
next-generation sequencing
Illumina MiSeq
Illumina NextSeq 500
sequencing by synthesis
sequence assembly
personalized genomics
tumour cell sequencing
de novo sequencing
metagenomics
RNA Seq


STUDY QUESTIONS:
1) What would the chromatogram look like if you
set up an automated Sanger sequencing
reaction with only template, primers,
polymerase, and fluorescent ddNTPs?
2) How could you use DNA sequencing to identify
new species of marine microorganisms?
3) An alternative name for automated Sanger
sequencing is dye-terminator sequencing. Why
is this term appropriate?
4) Ten years ago, it would have cost $100,000,000
to sequence your DNA. Today it would cost as
little as $1,000. Why did the cost go down so
much?
5) Why haven't next-generation machines
completely replaced the first generation of
automated DNA sequencers?
6) True or false: Automated pyrosequencing and
sequencing by synthesis are both considered
next-generation DNA sequencing technologies.


SOUTHERN/NORTHERN/WESTERN BLOTS – CHAPTER 34

CHAPTER 34 – SOUTHERN/NORTHERN/WESTERN BLOTS

Figure 1.
Agarose gel being placed on a Southern
blotting transfer set up. The DNA in the
gel will be transferred to a membrane,
placed on the gel, via the movement of a
buffer solution.
(WikimediaCommons-National Cancer
Institute-PD)

INTRODUCTION
The separation of DNA, RNA, and polypeptides based on size is a useful biochemical technique
that uses migration through a gel to fractionate these macromolecules. For example, bands of
DNA in an electrophoretic gel form if many of the DNA molecules are of the same size, such as
following a PCR reaction, or restriction digestion of a plasmid. In other situations, such as
after restriction digestion of chromosomal (genomic) DNA, there will be a very large number of
different sized fragments in the digest and thus it will appear as a continuous smear of DNA,
rather than distinct bands, on a lane in a gel. With the genomic DNA case, it is necessary to
use additional techniques to detect the presence of a specific DNA sequence within the smear of
DNA separated on an electrophoretic gel. This can be done using a DNA sequence probe in a
“Southern Blot”. In this chapter, we will describe Southern blots, as well as other blotting
techniques, such as Northern Blots (RNA) and Western Blots (protein), that use similar
principles to detect those macromolecules. (The Eastern and SouthWestern blots will not be
described here.)

1. SOUTHERN BLOT (DNA)
A Southern blot (also called a Southern Transfer because it more accurately describes the
procedure) is named after Ed Southern, who invented it in the mid-1970s. This blotting method
is used to identify specific DNA fragments, size-separated by gel electrophoresis, that cross
hybridize with a labeled probe (often radioactive). For example, the presence/absence of a
particular sized restriction fragment can be identified in a sample of genomic DNA digested
with a specific restriction enzyme.
There are multiple steps in the Southern Blot procedure (Figure 2). In the first step, DNA is
digested with restriction enzymes and separated by agarose gel electrophoresis. Then a sheet of
a nylon derivative, nitrocellulose, or similar material (membrane) is laid on the gel
(Figure 1). The DNA, in its separated position (bands or smear), is then transferred to the
membrane by drawing a buffer solution out of the gel, in a process called blotting. At this
point the blotted DNA is usually covalently attached to the membrane by brief exposure to UV
light or drying. The transfer to a sturdy membrane is necessary because the fragile gel would
fall apart during the next two steps in the


process. Next, the membrane is bathed in an alkali solution to denature (double stranded made
single stranded) the attached DNA, and this is then neutralized. The membrane is added to a
hybridization solution, containing a small amount of labeled single-stranded probe DNA that is
complementary to a target sequence on the membrane. This probe DNA is labeled using either
fluorescent or radioactive molecules. If the hybridization is performed properly, the probe DNA
will form a stable duplex only with those DNA molecules on the membrane to which it is
complementary. Then, the unhybridized probe is washed off leaving the hybridized radioactive or
fluorescent signal bound. This remaining signal will appear in a distinct band when
appropriately detected (fluorescent or radioactive). The band represents the presence of a
particular DNA sequence within the mixture of DNA fragments that is complementary to the probe
sequence.
The probe’s specificity comes via the sequence specific hybridization (requires
complementarity). However, variation in hybridization temperature and washing solutions can
alter the stringency of the probe’s hybridization. At maximum stringency (higher temperature,
low salt) hybridization conditions, probes will only hybridize efficiently with target
sequences that are perfectly complementary (maximum number of hydrogen bonds). At lower
stringency (lower temperature, higher salt), probes will be able to hybridize to and detect
sequences that they do not match exactly, but that have some mismatches along the sequence.
Southern blotting is useful not only for detecting the presence of a DNA sequence within a
mixture of DNA molecules, but also for determining the size of a restriction fragment in a DNA
sample. One advantage is that Southern blots are able to detect fragments larger than those
normally amplified by PCR. Also, the long DNA probes can detect fragments that may be
relatively dissimilar to the original sequence. Applications of Southern blotting will be
discussed further in the context of molecular markers in a subsequent chapter.
Southern blotting was invented long before PCR, but PCR has replaced blotting in many
applications because of its simplicity, speed, and convenience.
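
The effect of stringency on probe binding can be mimicked with a simple mismatch count. The Python below is only an analogy under stated assumptions: real hybridization depends on temperature, salt, and the thermodynamics of base pairing, whereas this sketch just counts unpaired positions. The sequences are made up and the function names are invented for this example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Sequence of the strand that pairs with seq, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def hybridization_sites(target, probe, max_mismatches=0):
    """Return (start, mismatches) for each site the probe could bind.

    max_mismatches stands in for stringency: 0 models maximum stringency,
    larger values model lower stringency.
    """
    site = reverse_complement(probe)  # what a perfectly matching target looks like
    hits = []
    for start in range(len(target) - len(site) + 1):
        window = target[start:start + len(site)]
        mismatches = sum(1 for a, b in zip(window, site) if a != b)
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits

target = "TTGACCGTAAGGCTGACCATAAGG"   # made-up single-stranded target on the membrane
probe = "CCTTACGGTC"                  # made-up labelled probe

print(hybridization_sites(target, probe, max_mismatches=0))  # high stringency
print(hybridization_sites(target, probe, max_mismatches=1))  # lower stringency
```

At high stringency only the perfectly complementary site is reported; relaxing the limit also picks up the related site that differs at one base.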
Figure 2.
A diagram of Southern blotting. Genomic DNA that has been digested with a restriction enzyme is
separated on an agarose gel, and then the DNA is transferred from the gel to a nylon membrane
(grey sheet) by blotting. The DNA is immobilized on the membrane, and then probed with a
radioactively labeled DNA fragment that is complementary to a target sequence. After stringent
washing, the blot is exposed to X-ray film to detect the size of the fragment to which the
probe is bound. In this case, the probe bound to different-sized fragments in lanes 1, 2, and
3. In the last image the orange represents the position of the digested DNA, but it is not
actually present on the X-ray film.
(Original-J. Locke-CC BY-NC 3.0)


2. NORTHERN BLOT (RNA)
Following the development of the Southern blot, other types of blotting techniques were
invented. The Northern blot is much like the Southern Blot, but involves the size separation of
single stranded RNA in gels similar to those used for DNA.
First, RNA molecules are extracted and isolated from a tissue sample or cells. RNA samples are
loaded on a lane and separated by size using agarose gel electrophoresis. Because we wish to
determine the native length of the RNA transcripts (and because RNA is single stranded) no DNA
restriction enzymes are ever used with RNA. As with DNA, because the gel is fragile and probes
cannot enter the gel matrix, the samples are blotted to a membrane with a positive charge
(nucleic acids have negative charge). In the hybridization step, single stranded DNA or RNA
probes are used in order to detect the RNA of interest. Furthermore, transfer buffer often
contains formamide, which lowers the temperature needed for probe hybridization, since high
temperature might damage the RNA.
Most RNA is single stranded and can fold into various conformations through intra-molecular
base pairing, so the electrophoresis separation is more haphazard and the bands are often less
sharp, compared to that of double stranded DNA. Using a Northern blot, one can observe the size
(quality) and amount (quantity) of transcription of a gene. Thus a pattern of gene expression
can be defined by comparing samples from different tissues.

3. WESTERN BLOT (PROTEIN)
In a Western blot, polypeptides are size separated on an acrylamide gel before transferring to
a membrane. Acrylamide is used because it separates the smaller polypeptide molecules better.
The membrane is then probed, not with DNA or RNA, but with an antibody that specifically binds
to an antigenic site on the target protein (primary antibody). The unbound antibody is washed
away and the bound antibody is then detected by a secondary antibody with some fluorescent or
colour production detection system. Western blots can also give bands proportional to the size
and amount of the target protein (see Figure 3 and Figure 4).

One application of Western blotting is the HIV test. First, cells that may be infected by HIV
are extracted and the cellular proteins, which might contain the viral protein, are separated.
They are then run on the electrophoresis gel and are transferred to a membrane. Antibodies that
will bind to HIV viral proteins are added to the membrane to check for the presence of the
viral proteins.

A comparison of all three blotting methods is shown in Figure 5.

Figure 3.
Antibodies binding to the target protein on a Western Blot. Proteins on a membrane surface
(left) can bind the primary antibody (centre-left). Then, secondary antibodies can bind to the
primary (centre-right). Finally, the secondary antibodies can be detected by various means
(right). This will provide protein specific localization to produce a band on a Western Blot.
(Original-J. Locke-CC BY-NC 3.0)

Figure 4.
Western Blot result using an anti-lipoic acid primary antibody and an IR-dye labelled secondary
antibody in Leishmania major extracts.
(Wikipedia-TimVickers-CC BY-SA 3.0)


Figure 5.
Comparison of Southern, Northern, and Western blots. In the cell at the top, DNA is in blue, RNA in red, and polypeptides in green.
Size and amount of DNA, RNA, and polypeptides can be determined using similar blotting methods. A size marker lane is shown
in the left of each gel to estimate molecule size. Although a eukaryote cell is shown, the same methods can be applied to
prokaryotes, too. (Original-Locke-CC BY-NC 3.0)


___________________________________________________________________________
SUMMARY:
• Southern blotting involves detecting the presence of DNA fragments, such as those from total genomic
DNA digested with a restriction enzyme, separated by agarose gel electrophoresis and transferred to a
membrane that is then probed with a labeled nucleic acid probe.
• With Northern Blots, the same principle used in Southern blotting is applied to detect a single-stranded RNA of
interest.
• Western Blots are also similar but use acrylamide gels to separate proteins and the membrane is probed
with antibodies to detect the molecule of interest.
KEY TERMS:
Southern Blot hybridization
Southern transfer probe
Northern Blot washing
Western Blot stringency
membrane mismatch
blotting primary antibody
denaturation secondary antibody


STUDY QUESTIONS:
1) Research shows that a particular form of cancer
is caused by a 200bp deletion in a particular
human gene that is normally 2kb long. Only one
mutant copy is needed to cause the disease – it’s
dominant.
a) Explain how you would use Southern
blotting to diagnose the disease.
b) How would any of the blots appear if you
hybridized and washed at very low
temperature (low stringency)?
2) Refer to question 1.
a) Explain how you would detect the presence
of the same deletion using PCR, rather than
a Southern blot.
b) How would PCR products appear if you
annealed at very low temperature?
3) You have a PCR fragment for a human olfactory
receptor gene (perception of smells). You want
to know what genes a dog might have that are
related to this human gene.
a) How can you use your PCR fragment and
genomic DNA from a dog to find this out?
b) Do you think dogs have more or fewer of these
genes?


CHAPTER 35 – DNA VARIATION STUDIED WITH SOUTHERN BLOTS

Figure 1.
Map of human migration according to
mitochondrial DNA sequencing. Variations are
inherited and can be detected using Southern
Blots (this Chapter), PCR (Chapter 36), and/or
microarrays (Chapter 37). The letters indicate
the different haplogroups, while the colours and
numbers represent thousands of years before
present for the establishment of these
populations.
(Wikipedia- Avsa – CC BY-SA 3.0)

INTRODUCTION
Imagine that you could compare the complete genomic DNA sequence of any two people you meet today. Although their sequences would be very similar on the whole, they would certainly not be identical at each of the ~3 billion base pair positions you examined (unless, perhaps, your subjects were identical twins, but even they have some somatic differences). In fact, the genomic sequences of almost any two unrelated people differ at millions of nucleotide positions dispersed throughout their genomes. Some of these differences would be found in the regions of genes that code for proteins. Others might affect the amount of transcript that is made for a particular gene. A person’s appearance, behavior, health, and other characteristics depend in part on these polymorphisms.

Most of these nucleotide differences, however, have no effect at all. They have no effect on gene sequences or expression, because they occur in regions of DNA that neither encode proteins nor regulate the expression of genes. Nevertheless, these polymorphisms are very useful because they can be used as molecular markers, which can be mapped just like typical genetic markers. Molecular markers are more numerous and can be used in medicine, forensics, ecology, agriculture, evolution, and many other fields. In most situations, molecular markers obey the same rules of inheritance that we have already described, and so can be used to create detailed genetic maps with which to identify gene/disease locations through genetic linkage.

1. MUTATION AND POLYMORPHISM
We have previously noted that an important property of DNA is its fidelity: most of the time it accurately passes the same information from one generation to the next. However, DNA sequences can also change. Changes in DNA sequences are called mutations. If a mutation changes the phenotype of an individual, the individual is said to be a mutant. Naturally occurring, but rare, sequence variants that are clearly different from a normal, wild-type sequence are also called mutations. On the other hand, as discussed above, many naturally occurring variants exist for traits for which no clearly normal type can be defined; thus, we use the term polymorphism to refer to variants of DNA sequences (and other phenotypes) that co-exist in a population at relatively high frequencies (>1%). Polymorphisms and mutations arise through similar biochemical processes, but the use of the word “polymorphism” avoids implying that any particular allele is more normal or abnormal. For example, a change in a person’s DNA sequence that leads to a disease such as hemophilia is appropriately called a mutation, but a difference in DNA sequence that explains whether a person has red hair rather than brown or black hair is an example of polymorphism. Molecular markers are a particularly useful type of polymorphism for many areas of genetics research. Mutations of DNA sequences can arise in many ways.

2. MOLECULAR MARKERS – SNPS AND VNTRS

2.1. ORIGINS OF SINGLE BASE PAIR POLYMORPHISMS
Polymorphisms can be single base pair differences between or among individuals in a population (Figure 2). These are referred to as Single Nucleotide Polymorphisms (SNP) or “SNiPs”. They are distributed randomly throughout the genome. SNPs occur about once in every 300 base pairs, on average, which means there are roughly 10 million SNPs in the human genome. SNPs are usually identified by sequencing the DNA of multiple individuals and comparing the sequences to find the bases that differ.

Figure 2.
Single Nucleotide Polymorphisms (SNP). The variant region is marked in blue, and each variant sequence is arbitrarily assigned one of two allele labels. (Original-Deyholos-CC:AN)

2.2. ORIGINS OF REPETITIVE SEQUENCE POLYMORPHISM
Some of the sequence changes occur during DNA replication, resulting in an insertion, deletion, or substitution of one or a few nucleotides. These replication errors occur most frequently at sequences where tandem repeats already exist. If the repeats are relatively short (1-5 base pairs), then the expansion and contraction of the number of repeats are called Simple Sequence Repeat (SSR) polymorphisms or Short Tandem Repeats (STR). If they are longer (10-50 base pairs) then they are called Variable Number Tandem Repeats (VNTR) (Figure 3). The difference in names here is just a matter of length.

Because of the tandem nature of these sequences and their propensity for addition/deletion, the number of repeats is typically very variable in a population of individuals. The number of repeats defines an allele, so there will be many alleles in a population and these loci will be highly polymorphic. This leads to a high degree of heterozygosity in the population, which is good for genetic mapping of these markers.

Figure 3.
Simple Sequence Repeat (SSR) polymorphisms or Variable Number Tandem Repeat (VNTR) polymorphism. Each red box represents a repeat. The variant region is marked in blue (increased number of repeats), and each variant sequence is arbitrarily assigned one of two allele labels. (Original-Deyholos-CC:AN)

2.3. CLASSIFICATION AND DETECTION OF REPETITIVE SEQUENCE POLYMORPHISM
Repetitive Sequence Polymorphism can be classified as polymorphisms that either vary in the length of a DNA sequence, or vary only in the identity of nucleotides at a particular position on a chromosome. In both cases, because two or more alternative versions of the DNA sequence exist, and we can detect them, we can treat each variant as a different allele of a single locus. Each allele gives a different molecular phenotype.

For example, polymorphisms of SSRs (simple sequence repeats) can be distinguished based on the length of PCR products: one allele of a particular SSR locus might produce a 100bp band, while the same primers used with a different allele


as a template might produce a 120bp band (Figure 5). An SNP (single nucleotide polymorphism) is an example of a polymorphism that varies in nucleotide identity, but not length. SNPs are the most common of all molecular markers, and the genotypes of thousands of SNP loci can be determined in parallel, using new, hybridization-based instruments. Note that the alleles of most molecular markers are co-dominant, since it is possible to distinguish the molecular phenotype of a heterozygote from either homozygote.

3. RESTRICTION FRAGMENT LENGTH POLYMORPHISM (RFLP)
Another form of molecular marker is the Restriction Fragment Length Polymorphism (RFLP). This polymorphism takes advantage of differences in the length of restriction enzyme (RE) fragments (Figure 4).

Figure 4.
Restriction Fragment Length Polymorphism (RFLP). The variant sequence is marked in blue, and each variant sequence is arbitrarily assigned one of two allele labels. (Original-Deyholos-CC:AN)

Here, the change in DNA sequence introduces or abolishes a restriction enzyme site. This will change the length of a restriction enzyme fragment that can be detected by Southern Blot of that genomic DNA. While the loss or gain of an RE site is the typical cause of RFLPs, other changes in RE fragment length can be caused by the insertion of mobile genetic elements such as transposons (inserted more or less randomly into chromosomal DNA) or by DNA deletions or duplications.

4. CONSTRUCTION OF GENETIC LINKAGE MAPS
In classical Mendelian genetics, two loci can be mapped relative to one another: they will either assort independently (unlinked) or will be linked, and the frequency of recombination will determine their distance apart. Molecular markers can be used in the same manner, both with each other and in combination with classic Mendelian markers.

By calculating the recombination frequency between pairs of molecular markers, a map of each chromosome can be generated for almost any organism. These maps are calculated using the same mapping techniques described previously for genes; however, the high density and ease with which molecular markers can be genotyped make them more useful than the “old-style” visual phenotype method for constructing genetic maps. These more detailed maps are useful in further studies, including map-based cloning of protein coding genes that were identified by mutation, or of disease loci.

Figure 6 diagrammatically shows a set of hypothetical results of parentals and F2 progeny for a mapping cross. This type of experiment is needed to map the relative distance between two loci, A and B, which are part of a series of loci along a chromosome. Then these loci can be used to test for linkage to disease or other traits in a genome.
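The RFLP idea from Section 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two allele sequences are invented, the DNA is treated as linear, and the cut is placed at the start of each recognition site rather than at the enzyme's true cut position. GAATTC is the recognition site of EcoRI.

```python
def digest(sequence, site):
    """Return fragment lengths after cutting a linear sequence at each `site`.

    Simplification (assumption): the cut is made at the first base of the
    recognition site, not at the enzyme's real cut position within it.
    """
    cut_positions = []
    start = 0
    while True:
        idx = sequence.find(site, start)
        if idx == -1:
            break
        cut_positions.append(idx)
        start = idx + 1
    boundaries = [0] + cut_positions + [len(sequence)]
    # Drop zero-length pieces (e.g. a site at the very start of the sequence).
    return [b - a for a, b in zip(boundaries, boundaries[1:]) if b - a > 0]


ECORI_SITE = "GAATTC"

# Hypothetical alleles: allele 2 has a single base change that abolishes
# the second EcoRI site (GAATTC -> GAGTTC).
allele_1 = "A" * 10 + "GAATTC" + "C" * 14 + "GAATTC" + "T" * 8
allele_2 = "A" * 10 + "GAATTC" + "C" * 14 + "GAGTTC" + "T" * 8

fragments_1 = digest(allele_1, ECORI_SITE)  # three fragments
fragments_2 = digest(allele_2, ECORI_SITE)  # two fragments, one longer
print(fragments_1, fragments_2)
```

Because the two alleles give different fragment lengths, a probe that hybridizes to this region would reveal bands of different sizes on a Southern blot.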


Figure 5.
Determining the genotype of an individual
at a single SSR locus using a specific pair of
PCR primers and agarose gel
electrophoresis.
S= size standard
(Original-Deyholos-CC:AN)

Figure 6.
Measuring recombination frequency between
two molecular marker loci, A and B. A different
pair of primers is used to amplify DNA from
either parent (P) and 15 of the F2 offspring
from the cross shown. Recombinant progeny
will have the genotype A1A2B2B2 or
A2A2B1B2. Individuals #3, #8, and #13 are
recombinant, so the recombination frequency
is 3/15 = 20%, which indicates linkage.
(Original-Deyholos-CC:AN)
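The arithmetic in Figure 6 can be sketched as follows. The progeny list here is hypothetical: it is built so that individuals #3, #8, and #13 carry the recombinant genotypes named in the caption, and the two non-recombinant genotypes (A1A2B1B2 and A2A2B2B2) are an assumption, not values read from the figure.

```python
# Recombinant genotypes as named in the Figure 6 caption.
RECOMBINANT = {("A1A2", "B2B2"), ("A2A2", "B1B2")}


def recombination_frequency(progeny):
    """Return (list of recombinant individual numbers, recombination frequency)."""
    recombinants = [
        number
        for number, genotype in enumerate(progeny, start=1)
        if genotype in RECOMBINANT
    ]
    return recombinants, len(recombinants) / len(progeny)


# Hypothetical non-recombinant genotypes (assumed for illustration).
non_recombinant_1 = ("A1A2", "B1B2")
non_recombinant_2 = ("A2A2", "B2B2")

# 15 hypothetical offspring, then place recombinants at #3, #8 and #13.
progeny = [non_recombinant_1, non_recombinant_2] * 7 + [non_recombinant_1]
progeny[2] = ("A1A2", "B2B2")
progeny[7] = ("A2A2", "B1B2")
progeny[12] = ("A1A2", "B2B2")

recombinants, rf = recombination_frequency(progeny)
print(recombinants, f"{rf:.0%}")
```

A recombination frequency well below 50% is what indicates that the two loci are linked.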

5. APPLICATIONS OF MOLECULAR MARKERS
Several characteristics of molecular markers make them useful to geneticists.
1. They are frequent throughout the genome.
2. They are retained in the population (not selected for or against).
3. They are mostly phenotypically neutral.
4. It is relatively easy to find markers that differ between two individuals.
5. There is no worrying about gene interactions or other influences that make it difficult to infer genotype from phenotype.
6. It is possible to study hundreds of loci.
7. Molecular markers can be detected in any tissue or developmental stage.
8. The same type of assay can be used to score molecular phenotypes at millions of different loci.
9. The loci are co-dominant (both alleles are visible) so both can be tracked in pedigrees.

Thus, the neutrality, high density, high degree of polymorphism, co-dominance, and ease of detection of molecular markers led to their wide-spread adoption in many areas of genetic research.

It is worth emphasizing again that DNA polymorphisms are a natural part of most genomes. Geneticists discover these polymorphisms in various ways, including comparison of random DNA sequence fragments from several individuals in a population. Once molecular markers have been identified, they can be used in many ways, including:

5.1. DNA FINGERPRINTING
Just like real fingerprints on your fingers, determination of alleles at genetic loci can be used to make a “DNA fingerprint”. This is done by determining the allelic genotypes at multiple molecular marker loci to make a composite genotype. Then, by comparison, one can determine the similarity between two DNA samples. If marker genotypes differ, then clearly the DNAs are from different sources. However, if they don’t differ, then they could come from the same source. There remains a possibility that they came from different sources that have the same genotype at the markers by chance alone. One can estimate how unlikely it is that matching samples came from different sources, and therefore how likely it is that they came from the same source. For example, a forensic scientist can demonstrate that the blood sample found on a weapon and the blood sample from a particular suspect are indistinguishable. Similarly, cat hair on the suspect's clothing can be shown to match a particular cat at the home where a crime took place.

DNA fingerprinting is also useful in paternity testing and in commercial applications such as verification of species of origin of certain foods and herbal products.

Figure 7.
Paternity testing. Given the molecular phenotype of the child (C) and mother (M), only one of the possible fathers (#2) has alleles that are consistent with the child’s phenotype. (Original-Deyholos-CC:AN)

5.2. POPULATION STUDIES
Population equilibrium - As described in Chapter 38, the observed frequency of alleles, including alleles of molecular markers, can be compared to the frequencies expected for populations in Hardy-Weinberg equilibrium to determine whether the population is in equilibrium. By monitoring molecular markers, ecologists and wildlife biologists can make inferences about migration, selection, diversity, and other population-level parameters.

Ancestry - Molecular markers can also be used by anthropologists to study migration events in human ancestry. Many commercial businesses will genotype people and determine their deep genetic heritage for ~$100-$200. This can be examined through the maternal line via sequencing their mitochondrial genome and through the paternal line via genotyping their Y-chromosome.

For example, about 8% of the men in parts of Asia (about 0.5% of the men in the world) have a Y-chromosomal lineage belonging to Genghis Khan and his relatives (the haplogroup C, although the specific group varies, depending on the source).

5.3. IDENTIFICATION OF LINKED TRAITS
It is often possible to correlate, or link, an allele of a molecular marker with a particular disease or other trait of interest. One way to make this correlation is to obtain genomic DNA samples from hundreds of individuals with a particular disease, as well as samples from a control population of healthy (non-afflicted) individuals. The genotype of each individual is scored at hundreds or thousands of molecular marker loci (e.g. SNPs), to find alleles that are usually present in persons with the disease, but not in healthy subjects. The molecular marker is presumed to be tightly linked to the gene that causes the disease, although this protein-coding gene may itself be as yet unknown. The presence of a particular molecular polymorphism may therefore be used to diagnose a disease, or to advise an individual of susceptibility to a disease. This is covered in more detail in Chapter 37.

Molecular markers may also be used in a similar way in agriculture to track desired traits in crops or livestock. For example, markers can be identified by screening both the traits and molecular marker genotypes of hundreds of individuals. Markers that are linked to desirable traits can then be used during breeding to select varieties with economically useful combinations of traits, even when the genes underlying the traits are not known.
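The case/control comparison described in Section 5.3 can be sketched for a single marker as below. The genotypes are invented and the sample is far too small to mean anything; a real study would score thousands of markers in hundreds of people and use a statistical test with correction for multiple comparisons rather than a raw frequency difference.

```python
from collections import Counter


def allele_frequency(genotypes, allele):
    """Fraction of all allele copies in `genotypes` that are `allele`."""
    counts = Counter(a for genotype in genotypes for a in genotype)
    return counts[allele] / sum(counts.values())


def frequency_difference(cases, controls, allele):
    """Allele frequency in cases minus allele frequency in controls."""
    return allele_frequency(cases, allele) - allele_frequency(controls, allele)


# Hypothetical genotypes at one SNP (two allele copies per person).
cases = [("A", "A"), ("A", "G"), ("A", "A"), ("A", "G")]
controls = [("G", "G"), ("A", "G"), ("G", "G"), ("G", "G")]

case_freq = allele_frequency(cases, "A")
control_freq = allele_frequency(controls, "A")
difference = frequency_difference(cases, controls, "A")
print(case_freq, control_freq, difference)
```

A marker allele that is much more common in affected individuals than in controls is a candidate for being linked to the causative gene.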


__________________________________________________________________________
SUMMARY:
• Natural variations in the length or identity of DNA sequences occur at millions of locations throughout
most genomes.

• DNA polymorphisms are often neutral, but because of linkage may be used as molecular markers to
identify regions of genomes that contain genes of interest.

• Molecular markers are useful because of their neutrality, co-dominance, density, allele frequencies,
ease of detection, and expression in all tissues.

• Molecular markers can be used for any application in which the identity of two DNA samples is to be
compared, or when a particular region of a chromosome is to be correlated with inheritance of a trait.

KEY TERMS:
molecular marker SNP
repetitive DNA RFLP
SSR neutral mutation
SSLP
VNTR


STUDY QUESTIONS:
For the next few questions, suppose that you have
a 1.0kbp fragment from the human genome. You
are told it contains only unique sequence (no
repeated DNA sequences such as transposable
elements or Alu sequences). You label this
fragment and use it to probe a Southern blot
containing human genomic DNA (one person)
digested with EcoRI, HindIII, and BamHI in lanes 2,
3, and 4, respectively. Lane 1 contains a size
marker.
1) Will the probe only show one band per lane in
DNA from the individual if they are
homozygous for the region being probed?
2) What if the individual is heterozygous for this
region?
3) What if you examined genomic DNA from 100
different individuals in a similar manner? Would
they all be expected to have the same pattern?
4) What would you expect if the probe was not
unique, but instead had an Alu repeat within
the 1.0kbp fragment?
5)


CHAPTER 36 - DNA VARIATION STUDIED WITH PCR

Figure 1.
The 13 STR loci used in modern DNA fingerprinting.
The AMEL gene is used to determine the presence of
the X and Y chromosomes.
(Wikimedia-Chemical Science & Technology
Laboratory, National Institute of Standards and
Technology-PD)

INTRODUCTION
Modern day DNA fingerprinting is based upon harmless DNA variations present in our chromosomes (Figure 1). This chapter will look at what these variations are, how they can be detected using two PCR based methods, and how DNA fingerprinting is used in forensics and paternity testing.

1. SHORT TANDEM REPEATS (STRS)
Here is a short sequence of human DNA within intron 6 of the CSF gene on chromosome 5:

--AGATAGATAGATAGATAGATAGATAGAT--
--TCTATCTATCTATCTATCTATCTATCTA--

Note that it has the sequence AGAT repeated seven times. This, and sequences like it, are called short tandem repeats (STRs).

The DNA replication machinery occasionally makes errors at STRs (see Chapter 11). They may expand and end up with eight or more repeats, or they may contract and end up with six or fewer. Because these changes have been happening for tens, if not hundreds of thousands of years, different people in the population have different numbers of this repeat. Some people have as few as five repeats while others have as many as sixteen. People may be homozygous, and have the same number of repeats on their maternal and paternal chromosomes, or be heterozygous, and have two different numbers. Note that in this case none of these changes affect a person's health; the CSF gene is functional no matter how many repeats are present.

This repeat array is named CSF1PO and is described at an online STR database.
http://www.cstl.nist.gov/strbase/str_CSF1PO.htm

CSF1PO is not important for human health but it does play an important role in DNA fingerprinting. Consider what it represents:
• It has twelve distinct alleles (five repeats, six repeats, etc.).
• The chance that two people would have the same genotype (e.g. both being a 7/12 repeat heterozygote) is very small.
• Using PCR it is easy to determine the alleles that each person has.

The rest of this chapter will discuss how we can determine which alleles a person has and how we can use these results for DNA fingerprinting.

2. DETECTING STRS WITH PCR AND AGAROSE GEL ELECTROPHORESIS
Let's say we have a person who is a 7/12 heterozygote at the CSF1PO site. We could find this out by isolating their genomic DNA, amplifying the region using standard PCR, and then running the PCR products on an agarose gel. We would get two bands: the faster moving band would be the smaller PCR products from the 7 allele and the slower moving band would be the products made from the 12 allele.

Figure 2 shows testing for a typical STR. These results show the people were genotypically 17/17, 22/22, and 17/22. Other people would have many other possibilities with different numbers of repeats per allele and different combinations of alleles per person.

Figure 2.
Determining the genotype of an individual at a single STR using a specific pair of PCR primers and agarose gel electrophoresis. S = size standard.
(Original-Deyholos-CC BY-NC 3.0)

3. DETECTING STRS WITH PCR AND CAPILLARY TUBE ELECTROPHORESIS
The method described above is relatively simple to do, but very labour intensive. The gel has to be poured, loaded, run, and photographed by a technician. We can get the same information using a more automated process, one using capillary tube electrophoresis. Here, one of the two PCR primers already has a fluorescent dye attached (Figure 3).

The PCR reaction occurs normally, resulting in the PCR products all being fluorescently labelled. They can then be loaded into a capillary tube electrophoresis machine (Figure 4).

As before, DNA migrates through a gel material towards the positive electrode, but this time the gel is contained within a thin tube. Near the end of the tube is a laser to excite the fluorescent dye and a detector to record fluorescence. A computer can then monitor the tube for the appearance of any fluorescent signal. For an STR we would get a single peak if a person was homozygous and two peaks if a person was heterozygous.

Figure 3.
PCR using a fluorescently labelled primer
(Original-Harrington-CC BY-NC 3.0)

Figure 4.
Capillary tube electrophoresis of a single PCR product. The containers at either end contain a buffered solution and either a negative or positive electrode.
(Original-Harrington-CC BY-NC 3.0)
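The two ideas above, counting tandem repeats in a sequence and turning band or peak sizes into a genotype, can be sketched in Python. The flanking bases, the 300 bp product size for a 7-repeat allele, and the function names are all assumptions made for illustration; real product sizes depend on where the primers sit.

```python
import re


def longest_tandem_run(sequence, unit):
    """Largest number of back-to-back copies of `unit` found in `sequence`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence)
    return max((len(run) // len(unit) for run in runs), default=0)


def genotype_from_peaks(peak_sizes_bp, ref_size_bp, ref_repeats, unit_len=4):
    """Convert PCR product sizes into repeat numbers.

    `ref_size_bp` is the product size for an allele with `ref_repeats`
    repeats (hypothetical calibration values). One peak is read as a
    homozygote, two peaks as a heterozygote.
    """
    alleles = []
    for size in peak_sizes_bp:
        extra_repeats, remainder = divmod(size - ref_size_bp, unit_len)
        if remainder != 0:
            raise ValueError(f"{size} bp is not a whole number of repeats from the reference")
        alleles.append(ref_repeats + extra_repeats)
    if len(alleles) == 1:
        alleles = alleles * 2
    return tuple(sorted(alleles))


# Hypothetical flanking bases around the seven AGAT repeats shown in Section 1.
repeat_count = longest_tandem_run("CC" + "AGAT" * 7 + "GG", "AGAT")

# Assumed calibration: a 7-repeat allele gives a 300 bp product.
heterozygote = genotype_from_peaks([320, 300], ref_size_bp=300, ref_repeats=7)
homozygote = genotype_from_peaks([300], ref_size_bp=300, ref_repeats=7)
print(repeat_count, heterozygote, homozygote)
```

Each extra AGAT repeat adds four base pairs to the product, which is why product length alone is enough to identify the allele.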


There are two other similarities between capillary tube electrophoresis and slab gel electrophoresis. Just as slab gels contain several lanes, the machines used for capillary tube electrophoresis have several tubes, up to 96 in fact. This allows scientists to run many samples simultaneously. Another similarity is the need for molecular weight markers. In both systems these are pieces of DNA of known length that are used to estimate the sizes of the DNA molecules in each sample. In the case of capillary tube electrophoresis these DNA molecules are attached to a different coloured fluorescent dye. The computer uses them to estimate the size of the PCR products.

4. MODERN DNA FINGERPRINTING

4.1. OVERVIEW
DNA fingerprinting, as its name suggests, is the ability to produce a unique collection of data for every person, using their DNA. It was first developed by Alec Jeffreys at the University of Leicester in the 1980s and is now done using the technique presented above. The procedure is sometimes called CODIS (COmbined DNA Index System), named after the U.S. Federal Bureau of Investigation software that converts the peaks into numerical data.

In Canada and the United States, the test is for 13 autosomal STRs. These are shown in Figure 1. This figure also shows the AMEL gene. The allele on the X chromosome is shorter than the allele on the Y chromosome. This difference can be detected with its own PCR reaction. If a person is XX they will only have the shorter allele, while if a person is XY they will have two sizes of PCR products. The sum total of the PCR reactions produces a collection of data known as a DNA profile. For example, a person's DNA profile might begin:

STR        Chromosome   Result
CSF1PO     5            7/12
D8S1179    8            6/6
D21S11     21           9/10
etc.       etc.         etc.

The chance that another person, even a close relative, has the exact same profile is astronomically small. The exceptions are identical twins, triplets, etc., who will have the same DNA profile.

4.2. FORENSICS
Forensics is the process of gathering data that can be used in a court of law. Because DNA profiles are virtually unique to a person they can be used to match a person to a DNA sample recovered at a crime scene. In the example in Table 1, only Suspect 2 matches the DNA at the crime scene; we can exclude suspects 1 and 3. Note what DNA profiling has done: it has made it easier to exclude a suspect than it is to convict someone.

The technician who does the analysis often has to appear in court to explain the results to the jury. In Canada, if a person is convicted of a serious crime their DNA profile will then be stored at the National DNA data bank in Ottawa.

4.3. PATERNITY TESTING
The other common use of DNA fingerprinting is paternity testing. If we have a DNA profile for a child and their biological mother, we can identify who the biological father could be. Remember, the power of this type of test is that of exclusion. If the potential father lacks the alleles present in the child, then he cannot be the father. The large number of STRs and alleles makes it possible to exclude essentially everyone except the real father.

Table 1. DNA profiling of suspects 1, 2 and 3.
STR        DNA at crime scene   Suspect 1   Suspect 2   Suspect 3
CSF1PO     7/12                 8/11        7/12        7/15
D8S1179    6/6                  9/15        6/6         9/12
D21S11     9/10                 4/5         9/10        4/9
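The comparison in Table 1 can be written out as a short Python sketch. The genotypes are those given in Table 1; the data structures and the function name are illustrative choices, and only three of the 13 STRs are compared.

```python
# Genotypes from Table 1, written as (allele, allele) repeat numbers.
crime_scene = {"CSF1PO": (7, 12), "D8S1179": (6, 6), "D21S11": (9, 10)}

suspects = {
    "Suspect 1": {"CSF1PO": (8, 11), "D8S1179": (9, 15), "D21S11": (4, 5)},
    "Suspect 2": {"CSF1PO": (7, 12), "D8S1179": (6, 6), "D21S11": (9, 10)},
    "Suspect 3": {"CSF1PO": (7, 15), "D8S1179": (9, 12), "D21S11": (4, 9)},
}


def mismatched_loci(profile, evidence):
    """Return the STR loci at which `profile` differs from `evidence`."""
    return [
        locus
        for locus, genotype in evidence.items()
        if sorted(profile[locus]) != sorted(genotype)
    ]


# A suspect is excluded as soon as a single locus fails to match.
not_excluded = [
    name
    for name, profile in suspects.items()
    if not mismatched_loci(profile, crime_scene)
]
print(not_excluded)
```

Note that Suspect 3 shares one allele with the crime scene sample at CSF1PO and at D21S11, but a shared allele is not a matching genotype, so Suspect 3 is still excluded.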


For example, consider the situation in Table 2. Every child STR allele that isn't from the mother must have come from the father. For example, the child's CSF1PO 7 allele must be maternal (the child lacks the mother's “10” allele), so the child's 12 allele must be paternal. This means the real father must have at least one 12 allele; this is found in potential fathers #2 and #3. Potential father #1 lacks this allele and thus must be excluded. When we apply this thinking to all of the STRs, only potential father #3 could be the child's biological father; the other two males are excluded. Remember, there are 13 STRs, each with many alleles, so this method is powerful enough to exclude all but the real father.

Table 2. DNA profiling of child, mother and potential fathers.
STR        Child   Mother   Potential father #1   Potential father #2   Potential father #3
CSF1PO     7/12    7/10     7/10                  12/14                 12/13
D8S1179    6/6     6/8      6/9                   12/12                 5/6
D21S11     9/10    10/11    5/5                   9/16                  9/9
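The exclusion reasoning applied to Table 2 can be sketched in Python. The genotypes are those given in Table 2; the function names are illustrative, and the sketch assumes the listed mother is the biological mother and ignores new mutations.

```python
# Genotypes from Table 2, written as (allele, allele) repeat numbers.
child = {"CSF1PO": (7, 12), "D8S1179": (6, 6), "D21S11": (9, 10)}
mother = {"CSF1PO": (7, 10), "D8S1179": (6, 8), "D21S11": (10, 11)}
potential_fathers = {
    "#1": {"CSF1PO": (7, 10), "D8S1179": (6, 9), "D21S11": (5, 5)},
    "#2": {"CSF1PO": (12, 14), "D8S1179": (12, 12), "D21S11": (9, 16)},
    "#3": {"CSF1PO": (12, 13), "D8S1179": (5, 6), "D21S11": (9, 9)},
}


def possible_paternal_alleles(child_genotype, mother_genotype):
    """Alleles the child could have inherited from the father at one locus."""
    first, second = child_genotype
    candidates = set()
    if first in mother_genotype:   # first could be maternal, so second is paternal
        candidates.add(second)
    if second in mother_genotype:  # second could be maternal, so first is paternal
        candidates.add(first)
    return candidates


def excluded_at(father):
    """Loci at which the man carries none of the possible paternal alleles."""
    return [
        locus
        for locus in child
        if not possible_paternal_alleles(child[locus], mother[locus]) & set(father[locus])
    ]


not_excluded = [name for name, father in potential_fathers.items() if not excluded_at(father)]
print(not_excluded)
```

Potential father #1 is excluded at CSF1PO and D21S11, and potential father #2 at D8S1179, which leaves only #3.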


___________________________________________________________________________
SUMMARY:
• Short tandem repeats (STRs) are easy-to-detect polymorphisms in human chromosomes and most are
harmless in that they don’t affect the phenotype.
• STR alleles can be detected with standard agarose gel electrophoresis. The size of the band or bands
reveals the number of repeats present in a particular STR.
• STRs can be detected more efficiently with capillary tube electrophoresis. The location of the peak or
peaks is identified with laser illumination of fluorescent tagged primers and the number of repeats
present in a particular STR can be determined by computer.
• DNA profiles are virtually unique to an individual. They are used in modern day DNA fingerprinting and
paternity testing.
KEY TERMS:
short tandem repeat (STR) DNA fingerprinting
PCR DNA profile
agarose gel electrophoresis forensics
fluorescent dye paternity testing
capillary tube electrophoresis

QUESTIONS:
1) When CSF1PO is amplified with a standard set of PCR primers the 7 allele is 325 bp long and the 6 allele is 321 bp. How long would the 12 allele be?
2) What results would you expect from a person who is a 7/12 heterozygote at CSF1PO using:
a) agarose gel electrophoresis
b) capillary tube electrophoresis
3) In response to a 2015 terrorist attack, Kuwait has made DNA testing of its population (both citizens and foreign residents) mandatory. Do you think Canada should adopt this policy?
4) With regard to Table 2 showing paternity testing, label the child's alleles in red for maternal and blue for paternal. Assume that potential father 3 is indeed the child's biological father.
5) Again using the same table, what if male #3 had a brother with the following DNA profile:

STR        #3 brother
CSF1PO     12/15
D8S1179    4/6
D21S11     9/10

Could he be excluded as the father, genetically speaking?
6) The STR repeat number is very variable from person to person.
a) How do we know they are relatively stable within an individual?
b) If they weren’t, would they be a good tool for DNA profiling?
7) Three different polymorphisms have been identified at a particular molecular marker locus. A single pair of PCR primers will amplify either a 50bp fragment (B2), a 60bp fragment (B3), or a 100bp fragment (B4).
Draw the PCR bands that would be expected if these primers were used to amplify DNA from individuals with each of the following genotypes.
a) B2B2
b) B4B4
c) B2B3
d) B2B4
8) In addition to the primers used to genotype locus B (described above), a separate pair of primers can amplify another polymorphic SSR locus E, with either a 60bp product (E1) or a 90bp (E2) product. DNA was extracted from six individuals (#1- #6), and DNA from each individual was used as a template in separate PCR reactions with primers for either locus B or primers for locus E, and the PCR products were visualized on electrophoretic gels as shown below.
Based on the following PCR banding patterns, what is the full genotype of each of the six individuals?
9) Based on the genotypes you recorded in Question 8, can you determine which of the individuals could be a parent of individual #1?
10) At the bottom of this page, part of the DNA sequence of a chromosome is shown. Identify the following features on the sequence:
a) the region of the fragment that is most likely to be polymorphic
b) any simple sequence repeats
c) the best target sites for PCR primers that could be used to detect polymorphisms in the length of the simple sequence repeat region in different individuals

TAAAGGAATCAATTACTTCTGTGTGTGTGTGTGTGTGTGTGTGTTCTTAGTTGTTTAAGTTTTAAGTTGTGA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
ATTTCCTTAGTTAATGAAGACACACACACACACACACACACACAAGAATCAACAAATTCAAAATTCAACACT


CHAPTER 37 - DNA VARIATION STUDIED WITH MICROARRAYS

Figure 1.
A DNA microarray slide showing the array of test
spots on the slide.
(Flickr- Argonne Laboratory-CC BY SA 2.0)

INTRODUCTION
Microarrays have revolutionized many types of genetics (Figure 1). Projects
that used to take years can now be done in weeks if not days. This chapter
will look at two of these techniques. One is used to identify genes that are
responsible for human diseases. The other is used to reveal whether a person
has mutations in any of these genes.

1. SINGLE NUCLEOTIDE POLYMORPHISMS (SNPS)

1.1. WHAT ARE SNPS?
Here is a little section of human DNA from within an intron of the INPP5B
gene on chromosome 1:

--TCCTCTCCAGC--
--AGGAGAGGTCG--

--TCCTCACCAGC--
--AGGAGTGGTCG--

Most people have the sequence on the top but some people have the sequence
on the bottom. The only difference is a single base pair, the sixth one
shown (T/A on the top, A/T on the bottom). This TA to AT base pair
substitution mutation likely originated as a single event in the human
population tens of thousands of years ago. The original allele is called the
ancestral allele while the new allele is called the minor allele. These, and
changes like it, are called single nucleotide polymorphisms (SNPs). Most
SNPs, including this one, do not affect the expression of genes they are
near or within. Their sequence has no effect on the phenotype.

This SNP is named rs16824514 and is described at an online SNP database:
http://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs=16824514

rs16824514 is not important for human health, but it does provide an example
of what SNPs are:
• It has two alleles, in this case T = ancestral and A = minor.
• People can have three possible genotypes, in this case TT, TA, and AA.
• It is easy to determine which alleles a person has using one of various
methods.

The rest of this chapter will discuss how we can determine which alleles of
a SNP a person has and what this can tell us. But before then, where do SNPs
come from?

1.2. HOW ARE SNPS DISCOVERED?
Most human SNPs were identified during DNA sequencing projects. Research
teams deliberately chose people from different ethnic groups to sequence.
See Chapter 35, Figure 1, for the spread
of humans across the world. For example, they may have sequenced people from
countries X, Y, and Z, each in a different continent. Over the past
thousands of years, mutations have been happening in people from each
population. A few of the base pair substitutions would have become common
and could even have replaced the original ancestral allele. When we sequence
present-day members of these diverse populations, we have a good chance of
getting a person that has the new allele. When we compare this person's
genomic sequence to people from other populations, those people will still
have the ancestral allele. This difference at a single place on the
chromosome is our SNP. It will be entered into the SNP database as
'reference SNP' followed by a number (rs####).

There are methods other than sequencing to detect SNPs. The detection of
SNPs is useful for other organisms that have either never been sequenced or
have only been sequenced once. Even though this chapter only discusses the
uses of SNPs in human biology and health, they are useful tools used by
biologists working in many different organisms to understand basic questions
in biology, genetics, and evolution.

2. MICROARRAY TECHNOLOGY

2.1. OVERVIEW
There are many ways to detect SNPs. Over the years, Southern blots, PCR,
genome sequencing, and other techniques have been used. Most of these
methods are rarely used on a large scale today because they are too labour
intensive, expensive, time consuming, or some combination thereof. This
chapter will focus on a single method, chosen because it is how most SNP
detection is done as of 2015. It makes use of a specific type of microarray,
a genotyping microarray made by a biotechnology company called Illumina,
Inc. in San Diego, California, USA.

Microarrays are a technology used to quantify DNA or RNA molecules. There
are many, many types. Some are sold by biotechnology companies, while others
are custom made by scientists themselves. They go by different names:
microarrays, DNA chips, lab-on-a-chips, biochips, gene arrays, and others.
What they have in common is shown in Figure 2. A microarray is a piece of
glass or silicon with short pieces of single stranded DNA stuck to its
surface. There are thousands of spots of these oligonucleotides (oligos).
Within each spot all of the oligos have the same specific sequence.

Figure 2.
A glass slide based microarray. Actual microarrays have hundreds of
thousands of these spots, each with thousands of identical oligonucleotides
attached.
(Original – Harrington – CC BY SA 4.0)

To use a microarray, you need to prepare the DNA or RNA sample first (Step 1
in Figure 3). DNA samples are broken into smaller fragments, made single
stranded, and covalently attached to a fluorescent dye. RNA molecules just
need to be attached to the fluorescent molecules. Next you pour the labelled
sample onto the microarray (Step 2). If a piece of labelled nucleic acid is
complementary to the oligos in a spot they will hybridize. After all of the
unhybridized sample is washed off, the spot will fluoresce. The microarray
is then put into a microarray reader (Step 3) to take a digital photograph
of the fluorescent spots on the surface (Step 4). The spots with
fluorescence indicate the presence of complementary sequence in the sample
and the level of fluorescence indicates the amount.

2.2. GENOTYPING MICROARRAYS
Have a look at the DNA sequences at the beginning of the chapter. How could
we design a microarray to detect which of these alleles a person has? Simply
put, this microarray would need to have two spots for each SNP, one for each
allele. One will have oligonucleotides that match the ancestral allele and
one will have oligos that match the minor allele (Figure 4). To use this
microarray we would need to isolate genomic DNA from a person,
then process it into short, single stranded lengths, and fluorescently label
those pieces. Next, we would need to inject it onto the microarray, let it
hybridize, and wash away the unhybridized sample. In the above example if a
person has the ancestral allele some of their DNA sample will bind to the
spot on the left. Likewise, if a person has the minor allele some of their
DNA will bind to the spot on the right. There are therefore three possible
results:

Table 1. Results from genotyping microarray.
Result                   Interpretation
left spot fluoresces     the person is homozygous for the ancestral allele (TT)
right spot fluoresces    the person is homozygous for the minor allele (AA)
both spots fluoresce     the person is heterozygous (TA)

Figure 3.
How microarrays are used and read.
(Original – Harrington – CC BY SA 4.0)

Figure 4.
Two spots to detect alleles of one SNP on a genotyping microarray.
(Original – Harrington – CC BY SA 4.0)

These two spots only take up a tiny portion of the surface of the microarray
so there can be many, many other pairs of oligos to detect other SNPs. This
type of microarray is called a genotyping microarray because it determines a
person's genotype at many SNPs all at once.

Illumina, Inc. is a biotechnology company that makes popular genotyping
microarrays and readers. For example, their HumanOmniExpress BeadChip
microarray can test 12 people at a time for 730 000 SNPs. The process is
somewhat different than shown in Figure 3 but the data produced is the same.
This microarray works with two microarray readers, a larger HiScan™ or a
smaller iScan™. The microarray readers are expensive but if one is
continuously used a laboratory can test up to 1400 people a week. The rest
of this chapter will discuss two procedures that can be done with these
microarrays: one looks for new disease causing mutations and the other
reveals whether a person has any of these mutations.

3. GENOME-WIDE ASSOCIATION STUDIES (GWAS)
Some human diseases are caused by mutant alleles of single genes. But how
can scientists identify which of our ~20,000 gene(s) is/are responsible?
Consider a type of neuronal degeneration called Huntington disease. It took
researchers Nancy Wexler and James Gusella ten years to discover that the
gene responsible for this disease was on chromosome 4. Their teams used a
technique called restriction fragment length polymorphism (RFLP) mapping.
This Southern blotting-based approach has been replaced with much faster
microarray-based methods.

Using genotyping microarrays to discover genes is called SNP mapping or
genome-wide association studies (GWAS). We need DNA samples from two large
groups of people, those that have a mutation causing a disease (or a
phenotype of interest) and those who do not. Each person's DNA is isolated
and then genotyped using a genotyping microarray. A powerful statistical
test is then used to study the results. What the software is looking for is
correlations: are any of the SNPs correlated with the disease phenotype
(Figure 5)?

Figure 5.
SNPs can be used to determine the location of mutations that affect our
health. The G/C vs A/T SNP is located near the gene of interest. The G/C
form will tend to associate with the gene+ allele, while the A/T form will
tend to associate with the gene- allele.
(Original – Harrington – CC BY SA 4.0)

For example, does everyone with the mutant gene have an A/T base pair allele
of a specific SNP, while everyone with the functioning gene has a G/C base
pair allele? If so, then the gene should be near this SNP. Because we know
the location of the SNP, we now know the approximate location of the gene.

The results are displayed in a graph (Figure 6). The x-axis is the location
of the SNPs, in this case all the SNPs on chromosome 4. On the y-axis is the
probability that each SNP is close to the disease-associated gene. For most
SNPs the probability is low.

Figure 6.
GWAS results show the probability that SNPs along a chromosome are close to
the gene of interest.
(Original – Harrington – CC BY SA 4.0)

GWAS is very effective at identifying genes when there is only one gene that
can mutate to cause the disease. When there are two or more genes the
results are harder to interpret.

For example, let's say that there are two genes that can mutate to cause the
same disease. In our group of people with the disease some have mutations in
the first gene and some have mutations in the second gene. SNPs near the
first gene will not be close to the second gene, and vice versa. This
dilutes the association between SNPs and the disease. GWAS is not very
useful when it comes to multifactorial diseases. These are diseases such as
cancer and heart disease, where there are mutations in many genes that can
be responsible.

4. DIRECT TO CONSUMER DNA TESTING
Direct to consumer DNA testing, as the name suggests, is a form of genetic
testing that does not involve a physician ordering the test, or helping the
person to understand the results. The results go directly to the consumer
and they are left to interpret their genotypes, usually with the assistance
of a web site provided by the testing company. There are over a dozen
companies offering this service worldwide.

In the case of 23andMe, for example, a person provides a DNA sample by
spitting into a tube and
pays ~$200 to have it tested. Many people think the DNA is being sequenced
but this isn't possible for this price. The cost to sequence a person's DNA
in 2015 is about $1000 or more. Instead what 23andMe does is to load the DNA
samples they receive into genotyping microarrays. In fact, they use the
Illumina microarray described earlier in this chapter.

What they are looking for is SNPs known to be next to described genes. These
are genes previously identified as being medically important.

For example, in Figure 7 if a person is heterozygous or homozygous for the
A/T base pair SNP allele they have a higher probability of having the mutant
allele of the gene. Conversely if they only have the G/C base pair alleles
they probably don't. 23andMe reports back a person's genotype as GG, GA, or
AA and describes the likelihood of a person having this disease or
condition. While this looks like a medical diagnostic test, 23andMe argues
it isn't looking for the disease causing mutations directly and therefore
they are not offering a diagnostic test.

23andMe and other such companies also offer heredity testing. Other SNPs on
the same microarray are known to be correlated with different ethnic groups.
For example, if a minor allele is common in English people and the person is
homozygous for this allele chances are this person is English. If a person
is heterozygous it means one of their parents is likely to be English and
the other not. If a person only has the ancestral alleles it means neither
of their parents is likely to be English. There are enough SNPs used for
most people to learn where their ancestors came from.

Additionally, many companies offer testing for the percentage of your genome
that came from Neanderthals. Neanderthals interbred with humans around
60,000 years ago and many people of European, Asian, Australian, and Native
American origin have retained a few percent of the Neanderthal genome within
their own. The human and Neanderthal genomes can be distinguished by SNPs.
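
As a rough illustration of how the two-spot readout in Table 1 becomes the
genotype that is reported to a consumer (GG, GA, or AA in the Figure 7
example), the logic can be sketched in a few lines of Python. This is only a
sketch under simplifying assumptions: the function names are hypothetical,
each spot is treated as simply lit or not lit, and real microarray readers
work from measured fluorescence intensities rather than yes/no values.

```python
def call_genotype(ancestral_spot_lit, minor_spot_lit, ancestral="T", minor="A"):
    """Turn the two-spot result for one SNP into a genotype (Table 1 logic).

    The boolean inputs are an illustrative simplification of the reader output.
    """
    if ancestral_spot_lit and minor_spot_lit:
        return ancestral + minor      # both spots fluoresce: heterozygous
    if ancestral_spot_lit:
        return ancestral * 2          # only the ancestral spot: homozygous ancestral
    if minor_spot_lit:
        return minor * 2              # only the minor spot: homozygous minor
    raise ValueError("neither spot fluoresced; no genotype can be called")


def copies_of_allele(genotype, allele):
    """Count how many copies (0, 1, or 2) of an allele a genotype carries."""
    return genotype.count(allele)


# Figure 7 example: G is taken as the ancestral allele, A as the linked allele.
genotype = call_genotype(True, True, ancestral="G", minor="A")
print(genotype, copies_of_allele(genotype, "A"))
```

A person with one or two copies of the linked SNP allele would then be told
they have a higher probability of carrying the mutant allele of the nearby
gene, which is a statement about likelihood, not a direct test for the
mutation itself.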

Figure 7.
SNPs can be used to determine the presence of mutations
that affect our health. The SNP that was originally used to
discover this gene is now being used to test for it.
(Original – Harrington – CC BY SA 4.0)
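
The discovery step mentioned in this caption relies on the statistical test
described in the GWAS section. The chapter does not name the test, so the
sketch below assumes one common choice, a chi-square test on a 2 x 2 table of
allele counts (cases vs. controls, one SNP at a time). The function name and
the example counts are hypothetical and chosen only for illustration.

```python
import math


def allelic_association(case_counts, control_counts):
    """Chi-square test (1 degree of freedom) for one SNP.

    case_counts and control_counts are (allele_1, allele_2) counts, i.e. the
    number of chromosomes carrying each allele in people with and without the
    disease. Returns (chi_square, p_value).
    """
    a, b = case_counts
    c, d = control_counts
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column of the table needs a nonzero total")
    chi_square = n * (a * d - b * c) ** 2 / denominator
    # For 1 degree of freedom the chi-square tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


# Made-up counts: the A/T allele is seen on 80 of 100 case chromosomes
# but only 50 of 100 control chromosomes.
chi_square, p_value = allelic_association((80, 20), (50, 50))
print(round(chi_square, 2), p_value)
```

Repeating this for every SNP on the microarray and plotting the strength of
the evidence against chromosome position gives a graph of the kind shown in
Figure 6, where SNPs near the gene of interest stand out from the rest.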


___________________________________________________________________________
SUMMARY:
• Single nucleotide polymorphisms (SNPs) are harmless and easy to detect polymorphisms in human
chromosomes
• SNPs can be detected with genotyping microarrays and microarray readers. These microarrays have
pairs of oligonucleotide spots, one for the ancestral allele and one for the minor allele. Where a
person's DNA hybridizes reveals their genotype at a particular SNP.
• Genome-wide association studies (GWAS) use SNPs and genotyping microarrays to determine the
location of mutations that affect our health.
• Direct to consumer DNA testing uses SNPs and genotyping microarrays to determine the presence of
allelic forms that show linkage to genes that may affect our health, predict our ancestry, and estimate
the percentage of Neanderthal sequences we have.
KEY TERMS:
ancestral allele genotyping microarray
minor allele SNP mapping
single nucleotide polymorphism (SNP) genome-wide association study (GWAS)
microarray direct to consumer DNA testing
microarray reader


QUESTIONS:
1) Why do most SNPs only have two alleles?
2) When using a genotyping microarray why is it
important that only perfectly matched DNA
molecules be able to hybridize?
3) Could GWAS be used to find out why some
people have blue eyes and other people don't?
4) Assuming the gene responsible for blue eye
colour is known, could direct to consumer
testing predict whether a person has blue eyes
or not?
5) What do these GWAS results mean?


POPULATION GENETICS – CHAPTER 38

CHAPTER 38 – POPULATION GENETICS

Figure 1.
Population genetics is important in
ecology, evolution, and even in our
daily lives since disease risks can be
calculated using population
genetics.
(Flickr-Zach Stern- CC BY-NC-ND 2.0)

INTRODUCTION
A population is a large group of individuals of the same species, who are
capable of mating with each other. It is useful to know the frequency of
particular alleles within a population, since this information can be used
to calculate disease risks. Population genetics is also important in ecology
and evolution, since changes in allele frequencies may be associated with
migration or natural selection.

1. ALLELE FREQUENCIES MAY BE STUDIED AT THE POPULATION LEVEL
The frequency of different alleles in a population can be determined from
the frequency of the various phenotypes in the population. In the simplest
system, with two alleles of the same locus (e.g. A,a), we use the symbol p
to represent the frequency of the dominant allele within the population, and
q for the frequency of the recessive allele. Because there are only two
possible alleles, we can say that the frequency of p and q together
represent 100% of the alleles in the population (p+q=1).

We can calculate the values of p and q, in a representative sample of
individuals from a population, by simply counting the alleles and dividing
by the total number of alleles examined. For a given allele, homozygotes
will count for twice as much as heterozygotes.

For example:

genotype   number of individuals
AA         320
Aa         160
aa         20

p = (2(AA) + Aa) / (total alleles counted)
  = (2(320) + 160) / (2(320) + 2(160) + 2(20)) = 0.8
q = (2(aa) + Aa) / (total alleles counted)
  = (2(20) + 160) / (2(320) + 2(160) + 2(20)) = 0.2

2. HARDY-WEINBERG FORMULA
With the allele frequencies of a population we can use an extension of the
Punnett Square, and the product rule, to calculate the expected frequency of
each genotype following random matings within the entire population.

          A (p)    a (q)
A (p)     p^2      pq
a (q)     pq       q^2

This is the basis of the Hardy-Weinberg formula:

p^2 + 2pq + q^2 = 1

Here p^2 is the frequency of homozygotes AA, 2pq is the frequency of the
heterozygotes, and q^2 is the frequency of homozygotes aa.
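
The allele counting and the Hardy-Weinberg expectations described above can
be checked with a short Python sketch. The function names are hypothetical
and introduced only for illustration; the numbers are the genotype counts
from the example in this chapter (320 AA, 160 Aa, 20 aa).

```python
def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return (p, q) by counting alleles; each individual carries two alleles."""
    total_alleles = 2 * (n_AA + n_Aa + n_aa)
    p = (2 * n_AA + n_Aa) / total_alleles   # homozygotes count twice
    q = (2 * n_aa + n_Aa) / total_alleles
    return p, q


def expected_genotype_counts(p, q, n_individuals):
    """Return expected (AA, Aa, aa) counts from p^2, 2pq, and q^2."""
    return (p * p * n_individuals,
            2 * p * q * n_individuals,
            q * q * n_individuals)


p, q = allele_frequencies(320, 160, 20)
print(p, q)
print([round(count) for count in expected_genotype_counts(p, q, 500)])
```

Comparing the expected counts with the observed ones is the basic way to see
whether a sample looks consistent with Hardy-Weinberg proportions.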


Notice that if we substitute the allele frequencies we calculated above
(p=0.8, q=0.2) into the formula p^2 + 2pq + q^2 = 1, we obtain expected
probabilities for each of the genotypes that exactly match our original
observations:

p^2 = 0.8^2 = 0.64           0.64 x 500 = 320
2pq = 2(0.8)(0.2) = 0.32     0.32 x 500 = 160
q^2 = 0.2^2 = 0.04           0.04 x 500 = 20

This is a demonstration of the Hardy-Weinberg Equilibrium, where both the
genotype frequencies and allele frequencies in a population remain unchanged
following successive matings within a population, if certain conditions are
met. These conditions are listed in Table 1. Few natural populations
actually satisfy all of these conditions. Nevertheless, large populations of
many species, including humans, appear to approach Hardy-Weinberg
equilibrium for many loci. In these situations, deviations of a particular
gene from Hardy-Weinberg equilibrium can be an indication that one of the
alleles affects the reproductive success of the organism, for example
through natural selection or assortative mating.

The Hardy-Weinberg formula can also be used to estimate allele frequencies,
when only the frequency of one of the genotypic classes is known. For
example, if 0.04% of the population is affected by a particular genetic
condition, and all of the affected individuals have the genotype aa, then we
assume that q^2 = 0.0004 and we can calculate p, q, and 2pq as follows:

q^2 = 0.04% = 0.0004
q = sqrt(0.0004) = 0.02
p = 1 - q = 0.98
2pq = 2(0.98)(0.02) = 0.0392, or approximately 0.04

Thus, approximately 4% of the population is expected to be heterozygous
(i.e. a carrier) of this genetic condition. Note that while we recognize
that the population is probably not exactly in Hardy-Weinberg equilibrium
for this locus, application of the Hardy-Weinberg formula nevertheless can
give a reasonable estimate of allele frequencies, in the absence of any
other information.
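
The carrier estimate worked through in this section (0.04% of the population
affected, all with genotype aa) can be reproduced with a minimal Python
sketch. The function name is hypothetical, and the calculation assumes, as
the text does, that the population is close enough to Hardy-Weinberg
equilibrium for the formula to apply.

```python
import math


def carrier_estimate(affected_fraction):
    """Estimate (q, p, carrier frequency) from the fraction of aa individuals.

    Assumes Hardy-Weinberg proportions, so affected_fraction is taken as q^2.
    """
    q = math.sqrt(affected_fraction)
    p = 1 - q
    return q, p, 2 * p * q


q, p, carriers = carrier_estimate(0.0004)   # 0.04% of the population affected
print(round(q, 4), round(p, 4), round(carriers, 4))
```

The carrier frequency comes out far larger than the affected fraction, which
is the general pattern for rare recessive conditions: most copies of the
recessive allele are carried by unaffected heterozygotes.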

Table 1. Conditions for the Hardy-Weinberg equilibrium

• Random mating: Individuals of all genotypes mate together with equal frequency. Alternatively,
assortative mating, in which certain genotypes preferentially mate together, is a type of non-random
mating.
• No natural selection: All genotypes have equal fitness. None are selectively removed by selection.
• No migration: Individuals do not leave or enter the population.
• No mutation: The allele frequencies do not change due to mutation.
• Large population: Random sampling effects in mating (i.e. genetic drift) are insignificant in large
populations.


___________________________________________________________________________
SUMMARY:
• Populations in true Hardy-Weinberg equilibrium have random mating, and no genetic drift, no
migration, no mutation, and no selection with respect to the gene of interest.
• The Hardy-Weinberg formula can be used to estimate allele and genotype frequencies given only
limited information about a population.
KEY TERMS:
population Hardy-Weinberg equilibrium
p / q random mating
p+q=1 natural selection
Hardy-Weinberg formula migration
p^2 + 2pq + q^2 = 1 assortative mating


STUDY QUESTIONS:
1) You are studying a population in which the
frequency of individuals with a recessive
homozygous genotype is 1%. Assuming the
population is in Hardy-Weinberg equilibrium,
calculate:
a) The frequency of the recessive allele.
b) The frequency of the dominant allele.
c) The frequency of the heterozygous
genotype.
d) The frequency of the homozygous dominant
genotype.
2) Determine whether the following population is
in Hardy-Weinberg equilibrium.
genotype   number of individuals
AA         432
Aa         676
aa         92
3) Out of 1200 individuals examined, 432 are
homozygous dominant (AA) for a particular
gene. What numbers of individuals of the other
two genotypic classes (Aa, aa) would be
expected if the population is in Hardy-Weinberg
equilibrium?
4) Propose an explanation for the deviation
between the genotypic frequencies calculated
in question 3 and those observed in the table in
question 2.


Evolution of Gene Expression – Chapter 39

CHAPTER 39 – EVOLUTION OF GENE EXPRESSION

Figure 1.
The stickleback is an example of an organism in which
researchers have identified mutations that cause changes in
the regulation of gene expression. These mutations confer a
selective advantage in some environments. Natural selection
acts on mutations that alter gene expression patterns, as well
as those changing the coding regions of genes.
(Wikipedia-K. Szabolcs- CC BY-SA 3.0)

INTRODUCTION
Mutations can occur in both cis-elements (promoter, enhancers, etc.) and in
the genes that code for trans-factors (transcription factors); both can
result in altered patterns of gene expression. If an altered pattern of gene
expression results in a selective advantage (or at least does not produce a
major disadvantage), it may be selected and maintained in future
populations. It may even contribute to the evolution of new species. An
example of a sequence change in an enhancer is found in the Pitx gene of the
Stickleback fish (Figure 1).

1. BASICS OF DEVELOPMENT

1.1. ONE CELL → MULTICELLULAR ORGANISM
In multicellular organisms, the original, single zygote cell, which arises
from the union of two gametes, replicates via mitosis to produce all the
cells in the body. Thus, all the cells have essentially the same genotype.
As these cells grow and divide they express different sets of genes in a
programmed manner. Thus organismal development requires that the cells
differentiate into various cell types and tissues, which ultimately become
organs within the organism. This differentiation is the result of specific
sets of genes being expressed in different cell types.

1.2. DETERMINATION AND DIFFERENTIATION – CELL FATE COMES FROM GENE EXPRESSION
In the process of development cells become more specialized. This
specialization is the result of specific genes being expressed to make or
cause the cell to progress down specific developmental pathways. A cell can
become "determined" (destined to go down one pathway, a cell fate) through
the expression of one or more genes. Usually these genes encode
transcription factors that regulate the transcription of other genes. An
example has been presented in Chapter 21 with the TDF-Y expression dictating
the development of the gonad into a testis.

Once a cell has become determined, then it will differentiate into a
specific cell type by expressing the genes for that cell type and turning
off others that are not expressed in that type. For example a myoblast cell
(muscle cell precursor) will begin the expression of actin, myosin, and
other proteins that form a muscle cell.

The fate of a cell is first determined and then the cell differentiates into
that cell type. This typically happens as a cell goes through a division
where each of the daughter cells goes on to become different cell types.
There is a series of choices with each cell division. This can lead to the
variety of cell and tissue types found in multicellular organisms.

2. VARIATION IN GENE EXPRESSION AND EVOLUTION
Genetic variation arises from random mutation in DNA sequences. Mutations
can occur in three categories of DNA: (1) intergenic sequences, (2) gene
coding sequences, and (3) gene regulatory sequences. If a change (mutation)
occurs it may be selected for in a population because it has an
advantage (selection). Other mutations may just increase in frequency in a
population through chance alone (drift).

2.1. INTERGENIC SEQUENCES
Mutations in intergenic sequences (regions between genes) have no effect on
gene expression or phenotype, so there is no selection for/against.
Consequently, these mutations are useful for markers in genetic mapping or
DNA fingerprinting. (Random genetic drift can cause fixation, where all
members of a species have the same DNA sequence.) The result is that
evolution occurs via random mutation and fixation by random drift, with no
selection for or against these sequences.

2.2. GENE CODING SEQUENCES
Mutation in a gene coding sequence can change the effectiveness of a gene
product (RNA or protein), which can consequently affect the phenotype. This
type of change doesn't alter the gene's transcription, but natural selection
may act for/against this new phenotype. The result is that evolution occurs
via random mutation and selection for/against the function of the gene's
product.

2.3. GENE'S REGULATORY SEQUENCES
This is the most interesting type. Mutations in regulatory sequences do not
change the product from the gene, just the pattern of transcription (time,
place, etc.). In other words, the time, place (tissue), level, and response
to environment of expression are changed. Regulatory mutations can affect
many traits and characteristics at once (pleiotropic) or create new and/or
novel patterns of expression. This might result in a new function in an
organism (e.g. neomorph). With this type of mutation, evolution occurs via
random mutation and selection for/against the novel expression pattern, not
altered gene products.

Mutations in regulatory sequences are a key to understanding how evolution
produces new patterns of expression and new phenotypes. Regulatory sequences
have to be identified experimentally and shown to act combinatorially to
regulate transcription. Most genes have multiple independent regulatory
sites adjacent to the transcribed sequence that bind trans-acting factors to
modulate expression.

3. EXAMPLE 1: DROSOPHILA YELLOW GENE
The yellow gene of Drosophila provides an example of the modular nature of
enhancers (regulatory sequences). This gene encodes an enzyme in the pathway
that produces a dark pigment in the insect's exoskeleton. Null mutants have
a yellow cuticle rather than the wild type darker pigmented cuticle. The
gene is called "yellow" because it is named after its mutant phenotype.

Figure 2 shows five distinct enhancer elements that drive transcription of
yellow (left, 5' upstream: wing, body, mouth parts; intron: bristles,
claws). Each binds a different, tissue-specific transcription factor to
enhance transcription of the yellow+ transcript (and thus express the
protein) in that tissue and make the pigment. So, the wing cells will have a
transcription factor that binds to the wing enhancer to drive expression;
likewise in the body and mouth part cells. Thus, specific combinations of
cis-elements and trans-factors control the differential, tissue-specific
expression of genes. This type of combinatorial action of enhancers is
typical of the transcriptional activation of most eukaryotic genes: specific
transcription factors activate the transcription of target genes under
specific conditions.

Figure 2.
Tissue-specific cis-regulatory elements within a simplified representation of the yellow gene of Drosophila.
(Original-Deyholos-CC BY-NC 3.0)


While enhancer sequences promote expression, there is an oppositely acting
type of element, called silencers. These elements function in much the same
manner, with transcription factors that bind to DNA sequences, but they act
to silence or reduce transcription from the adjacent gene.

Again, a gene's overall expression profile (transcription level, tissue
specific, temporal specific) is a total combination of all the various
enhancer and silencer elements that act on that gene.

4. EXAMPLE 2: PITX EXPRESSION IN STICKLEBACK FISH
The three-spined stickleback (Gasterosteus aculeatus; Figure 1) provides a
classic example of natural selection that involves a mutation in a
cis-regulatory element.

Background: Members of this species occur in one of two forms: (1)
populations that inhabit deep, open water and have a spiny pelvic fin that
is thought to deter larger predator fish from feeding on them; (2)
populations that inhabit shallow water environments and lack this spiny
pelvic fin. In shallow water, it appears that a long, spiny pelvic fin is a
disadvantage because it allows predatory insects like dragonfly larvae in
the sediment to grasp the stickleback.

Researchers compared gene sequences of individuals from both deep and
shallow water environments as shown in Figure 4. They observed that in
embryos from the deep-water population, a gene called Pitx (paired-like
homeodomain 1; encodes a member of the RIEG/PITX homeobox family) was
expressed in several groups of cells, including those that developed into
the pelvic fin. Embryos from the shallow-water population expressed Pitx in
the same groups of cells as the other population, with an important
exception: Pitx was not expressed in the pelvic fin primordium in the
shallow-water population. Further genetic analysis showed that the absence
of Pitx gene expression from the developing pelvic fin of shallow-water
stickleback was due to the absence (mutation) of a particular enhancer
element upstream of Pitx.

Figure 4.
Development of a large, spiny pelvic fin in deep-water stickleback (left)
depends on the presence of a particular enhancer element upstream of a gene
called Pitx. Mutants lacking this element, and therefore the large pelvic
fin (right), have been selected for in shallow-water environments.
(Original-Deyholos-CC BY-NC 3.0)

5. EXAMPLE 3: HEMOGLOBIN EXPRESSION IN PLACENTAL MAMMALS
Hemoglobin is the oxygen-carrying component of red blood cells
(erythrocytes). Hemoglobin usually exists as tetramers of four
non-covalently bound hemoglobin molecules (Figure 3). Each hemoglobin
molecule consists of a globin polypeptide with a covalently attached heme
molecule. Heme is made through a specialized metabolic pathway and is then
bound to globin polypeptide through post-translational modification.

Figure 3.
A tetramer of human hemoglobin, type α2β2. The α chains are labeled red, and
the β chains are labeled blue. Heme groups are green.
(Wikipedia- Zephyris- CC BY-SA 3.0)

OPEN GENETICS LECTURES – FALL 2017 PAGE 3

Evolution of Gene Expression – Chapter 39

The composition of hemoglobin tetramers changes during development (Figure 5). From early childhood onward, most tetramers are of the type α2β2, which means they contain two copies of each of two slightly different globin proteins named α and β. A small amount of adult hemoglobin is α2δ2, which has δ globin instead of the more common β globin. Other tetrameric combinations predominate before birth: ζ2ε2 is most abundant in embryos, and α2γ2 is most abundant in fetuses. Although the six globin proteins (α = alpha, β = beta, γ = gamma, δ = delta, ε = epsilon, ζ = zeta) are very similar to each other, they do have slightly different functional properties. For example, fetal hemoglobin has a higher oxygen affinity than adult hemoglobin, allowing the fetus to more effectively extract oxygen from maternal blood. The specialized γ globin genes that are characteristic of fetal hemoglobin are found only in placental mammals.

Each of these globin polypeptides is encoded by a different gene. In humans, globin genes are located in clusters on two chromosomes (Figure 6). We can infer that these clusters arose through a series of duplications of an ancestral globin gene. Gene duplication events can occur through rare errors in processes such as DNA replication, meiosis, or transposition. The duplicated genes can accumulate mutations independently of each other. Mutations can occur in the regulatory regions (e.g. promoter regions), in the coding regions, or in both. In this way, the regulatory regions of globin genes have evolved to drive expression at different phases of development, and the coding regions have evolved to produce proteins optimized for the prenatal environment.

Of course, not all mutations are beneficial: some mutations can lead to inactivation of one or more of the products of a gene duplication. This can produce what is called a pseudogene. Examples of pseudogenes (ψ) are also found in the globin clusters. Pseudogenes have mutations that prevent them from being expressed at all. The globin genes provide an example of how gene duplication and mutation, followed by selection, allow genes to evolve specialized expression patterns and functions. Many genes have evolved as gene families in this way, although they are not always clustered together as are the globins.

Figure 5.
Expression of globin genes during prenatal and postnatal
development in humans. The organs in which globin genes
are primarily expressed at each developmental stage are also
indicated.

Data: Wood, W.G. 1976 Br. Med. Bull. 32, 282
Original: (Wikipedia-Furfur- CC BY-SA 3.0)
Derivative work/Translation: (Wikipedia-Leonid2- CC BY-SA 3.0)

Figure 6.
Fragments of human chromosome 11 and human chromosome 16, on which are located clusters of β-like and α-like globin genes, respectively. Additional globin genes (θ, μ) have also been described by some researchers, but are not shown here.
(Wikipedia – Modified by Kang-CC BY-NC 3.0)


___________________________________________________________________________
SUMMARY:
• Development of a single-celled zygote into a multicellular organism involves the sequential expression of
genes so that determination and differentiation can take place and the cells can form the variety of types
found in the adult organism.
• Mutations can occur in intergenic, gene coding, or gene regulatory sequences. Changes in regulatory
sequences can lead to altered gene expression including new developmental times or tissue locations.
• The Drosophila yellow gene is an example of mutations in gene regulatory sequences.
• Stickleback fish provide an example of recent evolutionary events in which mutation of an enhancer
produced a change in morphology with a selective advantage (evolution).
• Expression of the various human globin genes, which generate hemoglobin, is an example of gene
expression changes over developmental time. The family of globin genes arose via gene duplication. Not
all duplications produce functional genes; some are pseudogenes.
KEY TERMS:
multicellular organisms
zygote
determination
differentiation
cell fate
intergenic sequences
gene coding sequences
gene regulatory sequences
regulatory mutations
pleiotropic
stickleback
Pitx
primordium
post-translational modification
hemoglobin/heme/globin
gene duplication
pseudogene
gene families


STUDY QUESTIONS:
1) Deep-water sticklebacks that are heterozygous
for a loss-of-function mutation in the coding
region of Pitx look just like homozygous wild-
type fish from the same population. What
phenotype or phenotypes would be expected if
a homozygous wild-type fish from a deep-water
population mated with a homozygous wild-type
fish from a shallow-water population?
2) The modular nature of transcription enhancer
elements can easily be seen in the yellow gene
of Drosophila. Suppose that there was a mutant
that had a deletion of the three distal enhancer
elements (wing, body, mouth – See Figure 2.).
There was another, different mutation that
resulted in a stop codon early in the protein
coding sequence.
a) What would the phenotype of the
homozygote deletion mutant be?
b) What would the phenotype of the
homozygote stop codon mutant be?
c) What would the phenotype of the
heterozygote be?
d) Suppose the heterozygote phenotype was
wild type. How might that occur?


CHAPTER 40 – TRANSGENIC ORGANISMS

Figure 1.
Two transgenic mice expressing
enhanced green fluorescent protein
(eGFP) under UV-illumination
flanking one plain NOD/SCID mouse
from the non-transgenic parental
line.
(Wikimedia-Moen et al. (2012)-CC BY 2.0)

INTRODUCTION
The addition of new genetic material to single-celled organisms has been possible for decades. Recall F. Griffith’s 1928 experiments with smooth and rough pneumococcus (Chapter 1). The routine transformation of bacteria with plasmids, however, began only in the early 1970s. The ability to transfer DNA (genes) into complex, multicellular organisms is more recent; it is usually called transfection when dealing with cells, and it also began in the 1970s. This genetic technology has opened up whole new avenues of research as well as new possibilities for commercial gain and health improvement.

1. MODEL ORGANISMS FACILITATE GENETIC ADVANCES

1.1. MODEL ORGANISMS
Many of the great advances in genetics were made using species that are not especially important from a medical, economic, or even ecological perspective. Geneticists, from Mendel onwards, have sought the best organisms for their experiments. Today, a small number of species are widely used as model organisms in genetics (Figure 2). All of these species have specific characteristics that make large numbers of them easy to grow and analyze in laboratories: they (1) are small, (2) grow fast, with a short generation time, (3) produce lots of progeny from matings that can be easily controlled, (4) have small genomes (small C-value), and (5) are diploid (i.e. chromosomes are present in pairs).

The most commonly used model organisms are:
• The prokaryote bacterium, Escherichia coli, is the simplest genetic model organism and is often used to clone DNA sequences from other model species.
• Yeast (Saccharomyces cerevisiae) is a good general model for the basic functions of eukaryotic cells.
• The roundworm, Caenorhabditis elegans, is a useful model for the development of multicellular organisms, in part because it is transparent throughout its life cycle, and its cells undergo a well-characterized series of divisions to produce the adult body.
• The fruit fly (Drosophila melanogaster) has been studied longer, and probably in more detail, than any of the other genetic model organisms still in use, and is a useful model for studying development as well as physiology and even behaviour.
• The mouse (Mus musculus) is the model organism most closely related to humans; however, there are some practical difficulties in working with mice, such as cost, slow reproductive time, and ethical considerations.
• The zebrafish (Danio rerio) has more recently been developed by researchers as a genetic model for vertebrates. Unlike mice, zebrafish embryos develop quickly and externally to their mothers, and are transparent, making it easier to study the development of internal structures and organs.
• Finally, a small weed, Arabidopsis thaliana, is the most widely studied plant genetic model organism. This provides knowledge that can be applied to other plant species, such as wheat, rice, and corn.

1.2. SOCIETY BENEFITS FROM MODEL ORGANISM RESEARCH
The study of genetic model organisms has greatly increased our knowledge of genetics, and biology in general. Knowledge from model organisms has also had important implications for medical research, agriculture, and biotechnology. By using these species, genetic researchers can gain more knowledge, faster and more cheaply, than by using humans, farm animals or crop plants directly. For example, at least 75% of the approximately 1,000 genes that have been associated with specific human diseases have similar genes in D. melanogaster. Information about how these genes function in model organisms can usually be applied to other species, including humans. From research conducted thus far, we have learned that the main features of many biochemical, cellular, and developmental pathways tend to be common among all species. What is genetically and biochemically true in yeast, worms, flies and mice tends to be true in humans, too.

However, it is sometimes necessary to study important biological processes in non-model organisms. In humans, for example, there are some diseases or other traits for which no clear analog exists in model organisms. In these cases the tools of genetic analysis developed in model organisms can be applied to these other, non-model species. Examples include the development of new types of gene discovery techniques, genetic mapping of desired traits, and whole genome sequencing.
Figure 2.
Some of the most important genetic model organisms in use today. Clockwise from top left: yeast, roundworm, Arabidopsis, zebrafish, mouse, fruit fly.
Credits, clockwise from top left: (Wikipedia-Masur-PD) / (Wikipedia-Kbradnam-CC BY-SA 2.5) / (Wikimedia-Frost Museum-CC BY 2.0) / (Flickr-Max Westby-CC BY-NC-SA 2.0) / (Wikimedia-Rama-CC BY-SA 2.0 FR) / (Flickr-tohru.murakami-CC BY-NC 2.0)


2. WHAT ARE TRANSGENIC ORGANISMS?
Transgenic organisms contain foreign DNA that has been introduced using biotechnology. Foreign DNA (the transgene) is defined here as DNA from another species, or else recombinant DNA from the same species that has been manipulated in the laboratory and then reintroduced. The terms transgenic organism and genetically modified organism (GMO) are generally synonymous. (Note that any mutant is technically “genetically modified”, but the term “GMO” is usually not used to refer to organisms derived via classical mutation and breeding techniques.)

The process of creating transgenic organisms, or transgenic cells that become whole organisms with a permanent change to their germline, has been called either transformation or transfection. (Unfortunately, both words have other meanings. Transformation also refers to the process of a mammalian cell becoming cancerous, while transfection also refers to the process of introducing DNA into cells in culture, either bacterial or eukaryotic, for temporary use, not for germline changes.)

Transgenic organisms are useful in several areas. (1) They are important research tools, and are often used when exploring a gene’s function. (2) Transgenesis is also related to the medical practice of gene therapy, in which DNA is transferred into a patient’s cells to treat disease. (3) Transgenic organisms are widespread in agriculture. Approximately 90% of canola, cotton, corn, soybean, and sugar beets grown in North America are transgenic. No other transgenic livestock or crops (except some squash, papaya, and alfalfa) are currently (2014) produced in North America, although many are being researched.

3. MAKING A TRANSGENIC CELL
To make a transgenic cell, DNA must first be transferred across the cell membrane (and, if present, across the cell wall) without destroying the cell. In some cases, naked DNA (meaning plasmid or linear DNA that is not bound to any type of carrier) may be transferred into the cell by adding DNA to the medium and temporarily increasing the porosity of the cell membrane, for example by electroporation. When working with larger cells, naked DNA can also be microinjected into a cell using a specialized needle. Other methods use vectors to transport DNA across the membrane. Note that the word “vector” as used here refers to any type of carrier, and not just plasmid vectors. Vectors for transformation/transfection include vesicles made of lipids or other polymers that surround DNA; various types of particles that carry DNA on their surface; and infectious viruses and bacteria that naturally transfer their own DNA into a host cell, but which have been engineered to transfer any DNA molecule of interest. Usually the foreign DNA is a complete expression unit that includes its own cis-regulators (e.g. promoter) as well as the gene that is to be transcribed.

When the objective of an experiment is to produce a stable (i.e. heritable) transgenic eukaryote, the foreign DNA must be incorporated into the host’s chromosomes. For this to occur, the foreign DNA must enter the host’s nucleus and recombine with host DNA (a chromosome). In some species, the foreign DNA is inserted at a random location, probably wherever strand breakage and non-homologous end joining happen to occur. In other species, the foreign DNA can be targeted to a particular locus by flanking the foreign DNA with DNA that is homologous to the host’s DNA at that locus. The foreign DNA is then incorporated into the host’s chromosomes through homologous recombination.

4. DETECTION OF TRANSGENES AND THEIR PRODUCTS
To produce a multicellular organism in which all cells are transgenic and the transgene is stably inherited, the cell that was originally transformed must be either a gamete or must develop into tissues that produce gametes. Transgenic gametes can eventually be mated to produce homozygous, transgenic offspring. The presence of the transgene in the offspring is typically confirmed using PCR or Southern blotting (see other chapters), and the expression of the transgene can be measured using reverse-transcription PCR (RT-PCR), Northern (RNA) blotting, and Western (protein) blotting.

The rate of transcription of a transgene is highly dependent on its insertion site (i.e. position effects). That is, the same transgene may be expressed at a high level when inserted at one location, while not expressed at all at a different location. It is the state of the chromatin at the insertion site that influences the level and pattern of expression. Therefore, researchers often generate several independently transformed/transfected lines with the same transgene, and then screen for the lines showing the highest expression. It is also good practice to sequence the transgenic locus from a newly generated transgenic organism, since errors (truncations, rearrangements, and other mutations) can be introduced during transformation/transfection.

5. PRODUCING A TRANSGENIC PLANT

The most common method for producing transgenic plants is Agrobacterium-mediated transformation (Figure 3). Agrobacterium tumefaciens is a soil bacterium that, as part of its natural pathogenesis, transfers a segment of its own tumor-inducing (Ti) plasmid into cells of a host plant. The natural Ti plasmid encodes growth-promoting genes that cause a gall (i.e. tumor) to form on the plant, which also provides an environment for the pathogen to proliferate.

Figure 3.
Production of a transgenic plant using Agrobacterium-mediated transformation. The bacterium has been transformed with a T-DNA plasmid that contains the transgene and a selectable marker that confers resistance to a herbicide or antibiotic. A bacterial culture and plant tissue (e.g. a leaf punch) are co-cultured on growth medium in a Petri dish. Some of the plant cells will become infected by the bacterium, which will transfer the T-DNA into the plant cytoplasm. In some cases the transgene will become integrated into the chromosomal DNA of a plant cell. In the presence of certain combinations of hormones, the plant cells will dedifferentiate into a mass of cells called callus. The presence of a selective agent (e.g. herbicide or antibiotic) in the growth medium prevents untransformed cells from dividing. Therefore, each callus ideally consists only of transgenic plant cells. The resistant calli are transferred to media with other combinations of hormones that promote organogenesis, i.e. differentiation of callus cells into shoots and then roots. The regenerated transgenic plants are transferred to soil. Their seeds can be harvested and tested to ensure that the transgene is stably inherited.
(Original-Deyholos-CC BY-NC 3.0)

Molecular biologists have engineered the Ti plasmid by removing the tumor-inducing genes and adding restriction sites that make it convenient to insert any DNA of interest. This engineered version is called a T-DNA (transfer-DNA) plasmid; the bacterium transfers a linear fragment of this plasmid that includes the conserved “left-border (LB)” and “right-border (RB)” DNA sequences, and anything in between them (up to about 10 kb). The linear T-DNA fragment is transported into the nucleus, where it recombines with the host DNA, probably wherever random breakages occur in the host’s chromosomes. In Arabidopsis and a few other species, flowers can simply be dipped in a suspension of Agrobacterium, and ~1% of the resulting seeds will be transformed.
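The ~1% flower-dip figure quoted above can be turned into a quick planning calculation. The sketch below is only an illustration: it assumes that each seed is transformed independently with the same probability, which the text does not state, and the function names are made up for this example.

```python
import math


def expected_transformants(n_seeds: int, rate: float = 0.01) -> float:
    """Expected number of transformed seeds among n_seeds screened."""
    return n_seeds * rate


def prob_at_least_one(n_seeds: int, rate: float = 0.01) -> float:
    """Chance that a batch of n_seeds contains at least one transformant.

    Assumes seeds are independent events (an assumption for illustration).
    """
    return 1.0 - (1.0 - rate) ** n_seeds


def seeds_needed(target_prob: float, rate: float = 0.01) -> int:
    """Smallest batch size giving at least target_prob of one transformant."""
    return math.ceil(math.log(1.0 - target_prob) / math.log(1.0 - rate))


if __name__ == "__main__":
    print(expected_transformants(1000))   # seeds expected to carry the transgene
    print(round(prob_at_least_one(100), 3))
    print(seeds_needed(0.95))
```

Under these assumptions, screening a few hundred seeds on selective medium is usually enough to recover at least one transformant, which is part of why the flower-dip method is so convenient.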


In most other plant species, cells are induced by hormones to form a mass of undifferentiated tissue called a callus. The Agrobacterium is applied to a callus and a few cells are transformed, which can then be induced by other hormones to regenerate whole plants (Figure 4).

Figure 4.
Organogenesis of flax shoots from calli.
(Original-J. McDill-CC BY-NC-SA 2.0)

Some plant species are resistant (i.e. “recalcitrant”) to transformation by Agrobacterium. In these situations, other techniques must be used, such as particle bombardment, whereby DNA is non-covalently attached to small metallic particles, which are accelerated by compressed air into callus tissue, from which complete transgenic plants can sometimes be regenerated. In all transformation methods, the presence of a selectable marker (e.g. a gene that confers antibiotic resistance or herbicide resistance) is useful for distinguishing transgenic cells from non-transgenic cells at an early stage of the transformation process.

6. PRODUCING A TRANSGENIC MOUSE
In a commonly used method for producing a transgenic mouse, stem cells are removed from a mouse embryo, and a transgenic DNA construct is transferred into the stem cells using electroporation. Some of this transgenic DNA enters the nucleus, where it may undergo homologous recombination (Figure 5). The transgenic DNA construct contains DNA homologous to either side of a locus that is to be targeted for replacement. If the objective of the experiment is simply to delete (“knock-out”) the targeted locus, the host’s DNA can simply be replaced by a selectable marker, as shown. It is also possible to replace the host’s DNA at this locus with a different version of the same gene, or a completely different gene, depending on how the transgenic construct is made. Cells that have been transfected and express the selectable marker (i.e. resistance to the antibiotic neomycin, neoR, in this example) are distinguished from unsuccessfully transfected cells by their ability to survive in the presence of the selective agent (e.g. an antibiotic). Transfected cells are then injected into early-stage embryos, which are then transferred to a foster mother. The resulting pups are chimeras, meaning that only some of their cells are transgenic. Some of the chimeras will produce gametes that are transgenic, which, when combined with a wild-type gamete, will produce mice that are hemizygous for the transgene. Unlike the chimeras, these hemizygotes carry the transgene in all of their cells. Through further breeding, mice that are homozygous for the transgene can be obtained.

7. HUMAN GENE THERAPY
Many different strategies for human gene therapy are under development. In theory, either the germline or somatic cells may be targeted for transfection, but most research has focused on somatic cell transfection, because of risks and ethical issues associated with germline transformation. Gene therapy approaches may be further classified as either ex vivo or in vivo, with the former meaning that cells (e.g. stem cells) are transfected in isolation before being introduced to the body, where they replace defective cells. Ex vivo gene therapies for several blood disorders (e.g. immunodeficiencies, thalassemias) are undergoing clinical trials. For in vivo therapies, the transfection occurs within the patient. The objective may be either stable integration or non-integrative (transient) transfection. As described above, stable transfection involves integration into the host genome. In the clinical context, stable integration may not be necessary, and carries with it a higher risk of inducing mutations in either the transgene or the host genome. In contrast, transient transfection does not involve integration into the host genome, and the transgene may therefore be delivered to the cell as either RNA or DNA. Advantages of RNA delivery include that no promoter is needed to drive expression of the transgene. Besides mRNA transgenes, which could provide a functional version of a mutant protein, there is great interest in delivery of siRNA (small interfering RNAs), which can be used to silence specific genes in the host cell’s genome.

Vectors for in vivo gene therapy must be capable of delivering DNA or RNA to a large proportion of the targeted cells, without inducing a significant immune response or having any toxic effects. Ideally, the vectors should also have high specificity for the targeted cell type. Vectors based on viruses (e.g. lentiviruses) are being developed for both in vivo and ex vivo gene therapies. Other, non-viral vectors (e.g. vesicles and nanoparticles) are also being developed for gene therapy.

Figure 5.
Production of a transgenic mouse. Stem cells are removed from an embryo, and are transfected (using electroporation) with a transgenic construct that bears a neomycin resistance gene (neoR) flanked by two segments of DNA homologous to a gene of interest. In the nucleus of a transgenic cell, some of the foreign DNA will recombine with the targeted gene, disrupting the targeted gene and introducing the selectable marker. Only cells in which neoR has been incorporated will survive selection. These neomycin resistant cells are then transplanted into another embryo, which will grow into a chimera within a foster mother.
(Wikipedia-Kiaergaard- CC BY-SA 3.0)
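The breeding steps described for the transgenic mouse (transgenic gamete from a chimera, then hemizygotes, then homozygotes) follow ordinary Mendelian segregation. The sketch below enumerates the expected offspring classes; it assumes a single insertion locus with equal segregation of alleles, and the allele symbols "T" (transgene) and "+" (wild type) are notation chosen only for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1: str, parent2: str) -> dict[str, Fraction]:
    """Expected genotype frequencies among offspring of two parents.

    Each parent is a two-character string listing its two alleles,
    e.g. "+T" for a hemizygote carrying one copy of the transgene.
    """
    # Every gamete from parent1 can meet every gamete from parent2.
    counts = Counter(
        "".join(sorted(a + b)) for a, b in product(parent1, parent2)
    )
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


if __name__ == "__main__":
    # A transgenic gamete from a chimera combined with wild-type gametes.
    print(cross("TT", "++"))
    # Mating two hemizygotes to recover homozygous transgenic mice.
    print(cross("+T", "+T"))
```

In this simplified model, one quarter of the pups from a hemizygote by hemizygote mating are expected to be homozygous for the transgene, so the offspring must still be genotyped (for example by PCR) to tell the classes apart.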


8. CRISPR-CAS9 TECHNOLOGY
To understand how genes work, geneticists need to modify them (mutation) in order to see how the changes affect the phenotype. This will identify the function(s) of the gene. In the past, random mutagenesis, followed by screening and selection, permitted researchers to identify mutations that affected gene function. While useful, this method has limitations. It produces only a small, very limited set of sequence changes to a gene. Much more could be learned if researchers could define the change first, then see the effect on the phenotype. New methods are needed for this.

Several technologies for targeted genome editing have been developed. These include ZFNs (Zinc Finger Nucleases) and TALENs (Transcription Activator-Like Effector Nucleases), which depend on protein-DNA interaction. However, engineering specific protein sequences to bind to specific DNA sequences can be time consuming and expensive. Recently, a new genome editing tool called the “CRISPR-Cas9 system” has been introduced. It can (1) be easily customized to specific DNA sequences, (2) target with high precision, and (3) target many genes simultaneously.

8.1. WHAT IS CRISPR-CAS9?
The CRISPR system is a combination of two bacterial systems. It is an adaptive bacterial and archaeal immune system that fights against foreign plasmids or viruses. After a bacterium survives a viral infection, it “remembers” the viral DNA sequence by incorporating that sequence into its own genome, and fights back when it is re-infected. CRISPR (pronounced crisper) is an acronym for Clustered Regularly Interspaced Short Palindromic Repeats. In other words, it is a locus on a bacterial or archaeal chromosome that has a cluster of many short sequences that are interspaced repeatedly. The CRISPR locus is composed of (1) CRISPR RNA direct repeats and (2) spacers. There are three main components to the CRISPR-Cas9 system (Figure 6):

(1) CRISPR-associated (cas) genes
The cas genes encode enzymes that control the integration of foreign DNA into the host’s own genome and the defense against bacteriophages. The Cas9 gene from Streptococcus pyogenes encodes a nuclease that has the ability to cut (cleave) foreign DNA.

(2) CRISPR RNA (crRNA) array
This crRNA array is responsible for the target specificity and has two parts to it: (1) CRISPR RNA direct repeats and, interspaced between them, (2) variable spacers. These sequences are not protein-coding genes; they are only transcribed into RNA molecules. Transcription of this array results in the crRNA, which is modified into mature crRNA.

(3) trans-activating crRNA (tracrRNA)
Trans-activating crRNA (tracrRNA) base pairs with crRNA and recruits the Cas9 protein. This crRNA-tracrRNA-Cas9 protein complex is what we call an “RNA-guided endonuclease” that can target and cut specific sites on DNA.

Scientists have fused crRNA and tracrRNA together to form one chimeric RNA molecule called sgRNA (single-guide RNA) so that they can increase the efficiency when performing experiments.

Figure 6.
General representation of the CRISPR locus and related genes in a bacterial / archaeal chromosome. Note that direct repeats are in yellow, and spacers have various colours since they have different sequences, which are from multiple origins.
(Original-Kang-CC BY-NC 3.0)
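The repeat-spacer layout of a CRISPR array described above can be illustrated with a short sketch. Everything in it is a toy: the repeat and spacer sequences are invented and far shorter than real ones, and the function name is hypothetical. It only shows how identical direct repeats delimit spacers of different sequence.

```python
def split_crispr_array(array: str, repeat: str) -> list[str]:
    """Return the spacers found between copies of the direct repeat."""
    if not repeat:
        raise ValueError("repeat must be a non-empty sequence")
    # Splitting on the repeat leaves the spacers; drop empty edge pieces.
    return [piece for piece in array.split(repeat) if piece]


# Toy sequences, invented for illustration only.
REPEAT = "GTTCCA"
SPACERS = ["ACGTACGTAA", "TTGACCTGAT", "CCATGGTACC"]

# Build an array of the form repeat-spacer-repeat-spacer-...-repeat.
toy_array = REPEAT + REPEAT.join(SPACERS) + REPEAT

if __name__ == "__main__":
    for number, spacer in enumerate(split_crispr_array(toy_array, REPEAT), 1):
        print(number, spacer)
```

Each recovered spacer stands for a fragment of foreign DNA captured during an earlier infection, which is why the spacers differ from one another while the repeats do not.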


8.2. HOW THE CRISPR SYSTEM WORKS
After a bacterium survives a viral infection, it attains an immune memory so that it can fight back when the same virus infects the cell; this is where the CRISPR-Cas9 system comes in. There are three main stages in this adaptive immune system (Figure 7 and Figure 8):

(1) Adaptation
First, a virus injects its DNA into the host bacterial cell. Then, the host bacterial cell detects the foreign DNA, and integrates a short fragment of the viral DNA into the CRISPR locus. This insert is called a spacer and the original sequence on the viral genome is called a protospacer.

(2) crRNA biogenesis
After the cell is re-infected with the same virus, the CRISPR sequence is transcribed into a precursor crRNA (pre-crRNA). This pre-crRNA will contain various spacer and direct repeat sequences, but after modification the mature crRNA will only have the spacer (guide RNA) that matches the invading virus. The mature crRNA will form a complex with tracrRNA and Cas9.

(3) Interference
The crRNA-tracrRNA-Cas9 complex first binds to the protospacer adjacent motif (PAM) sequence that is located next to the target DNA sequence (protospacer). It then opens the dsDNA, and the guide RNA base pairs with the target sequence. The Cas9 protein then makes a double-stranded cut in the DNA.

The spacers in the host’s own CRISPR locus are not flanked by a PAM sequence, but the matching protospacers in the foreign, viral DNA are. Thus, the CRISPR system does not cleave its own DNA.

Figure 7.
Some bacteria and most archaea have a system that can integrate their enemy’s DNA and use it against them. Note that direct repeats are represented by orange blocks, and spacers are represented by blocks of different colours.
(Original-Kang-CC BY-NC 3.0)
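The role of the PAM in the interference step can be sketched as a simple sequence search. The example below rests on assumptions that are not given in the text: the PAM is taken to be 5'-NGG-3' (the motif commonly cited for Streptococcus pyogenes Cas9), the protospacer is taken to be 20 nucleotides long, and only the strand as written is scanned. The function name and the test sequence are made up for illustration.

```python
import re


def find_candidate_targets(dna: str, guide_len: int = 20) -> list[tuple[int, str, str]]:
    """List (start, protospacer, PAM) for every site followed by an NGG PAM.

    Only the given strand is searched; a fuller tool would also scan the
    reverse complement. PAM = NGG and guide_len = 20 are assumptions.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all found.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


if __name__ == "__main__":
    protospacer = "ACGTACGTACGTACGTACGT"        # invented 20-nt target
    viral_dna = "TTT" + protospacer + "TGG" + "AAA"
    for start, site, pam in find_candidate_targets(viral_dna):
        print(start, site, pam)
```

A sequence identical to the protospacer but lacking the adjacent PAM, as is the case for a spacer sitting in the CRISPR array, is not reported by this search, which mirrors how the system avoids cutting its own locus.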


Figure 8.
crRNA will base pair with tracrRNA, and this will recruit Cas9 protein. Therefore, Cas9 protein is guided by the RNA molecule and leaves a double-stranded break.
(Original-Kang-CC BY-NC 3.0)

8.3. USING CRISPR-CAS9 FOR GENOME EDITING

What if we could engineer the guide RNA in such a way that it finds a sequence of interest on any DNA and edits it? This would be a powerful, programmable genome editing technology.

We can customize the “search and cut” system that the CRISPR-Cas9 complex uses by changing the guide RNA. Then, this system will search for its new target and make a cut on the DNA at that sequence, creating a break. After this “attack” on the target DNA, the cell tries to repair the break, and it may or may not be successful. We can use this DNA repair mechanism of the host cell to induce mutations in a gene or even exchange the gene with a new one. Here are two possible ways for repairing the break (Figure 9):

(1) HDR (homology-directed repair)
For this type of repair mechanism, a single-stranded DNA donor molecule, with ends homologous to the sequences flanking the break, is supplied. This way, a new strand can be built upon this template DNA. Therefore, we can introduce a new version of the gene.

(2) NHEJ (non-homologous end joining)
This repair mechanism does not require a template DNA. It joins the two ends of the DNA and connects them. This process is error-prone, and insertion-deletion (indel) mutations can occur, which can lead to frameshift mutations.

Figure 9.
The DNA repair enzymes try to fix the break by either (1) HDR (homology-directed repair) or (2) NHEJ (non-homologous end joining). NHEJ is error-prone and can leave a scar such as an indel.
(Original-Kang-CC BY-NC 3.0)

Note that we can also tweak the CRISPR system and use it for other purposes. For example, we can mutate a few amino acids of the Cas9 protein so that it won’t make any cuts. Instead, it can be accessorized with regulatory elements and can activate or repress the expression of certain genes.

Note that this new technology is not perfect:
(1) we still have to solve off-target effects, which occur when CRISPR-Cas9 cuts at undesired, non-target sequences, and
(2) we need to find a way to effectively deliver all the RNAs to all of the target cells, if this is to be used as a means of gene therapy.

Nevertheless, CRISPR technology is just in its infancy and current undergraduate students will be improving and expanding this methodology in the future. It is the PCR of the current age.
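The link between NHEJ indels and frameshift mutations mentioned above comes down to codon arithmetic: an indel shifts the reading frame only when its length is not a multiple of three. The sketch below illustrates this with an invented open reading frame; the sequence, positions, and function names are assumptions made for the example.

```python
CODON_LENGTH = 3


def is_frameshift(indel_length: int) -> bool:
    """True if an insertion or deletion of this many bases shifts the frame."""
    return abs(indel_length) % CODON_LENGTH != 0


def codons(seq: str) -> list[str]:
    """Split a sequence into complete codons, ignoring leftover bases."""
    usable = len(seq) - len(seq) % CODON_LENGTH
    return [seq[i:i + CODON_LENGTH] for i in range(0, usable, CODON_LENGTH)]


def delete_bases(seq: str, position: int, n: int) -> str:
    """Remove n bases starting at a 0-based position (a toy NHEJ deletion)."""
    return seq[:position] + seq[position + n:]


if __name__ == "__main__":
    orf = "ATGGCTGAAGTTCTGTAA"             # invented six-codon reading frame
    print(codons(orf))                      # original codons
    print(codons(delete_bases(orf, 4, 1)))  # 1-bp deletion: downstream codons change
    print(codons(delete_bases(orf, 3, 3)))  # 3-bp deletion: one codon lost, frame kept
```

This is why NHEJ repair is often used to knock out a gene: many of the small indels it leaves behind are frameshifts that scramble the downstream coding sequence, while an in-frame indel may leave much of the protein intact.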


___________________________________________________________________________
SUMMARY:
• There are a variety of model organisms that are used for genetic experimentation because they have
advantages for various aspects of research.
• Research on one model organism can be applied to others. This permits genetic knowledge of model
organisms to be transferred to humans, farm animals or crop plants.
• Transgenic organisms contain foreign DNA that has been introduced using biotechnology to make
genetically modified organisms.
• CRISPR-Cas9 is an RNA-guided endonuclease that can find a specific sequence on DNA and make a cut; the
repair mechanism of the host cell will then either introduce a mutation or integrate a new copy.
KEY TERMS:
model organisms             vesicles                                lentiviruses
Escherichia coli            stable                                  genome editing
Saccharomyces cerevisiae    position effects                        CRISPR-Cas9 system
Caenorhabditis elegans      Agrobacterium-mediated transformation   cas
Drosophila melanogaster     Ti plasmid                              CAS9
Mus musculus                T-DNA                                   crRNA
Danio rerio                 callus                                  direct repeat
Arabidopsis thaliana        recalcitrant                            spacer
transgenic organisms        particle bombardment                    tracrRNA
transgene                   stem cells                              sgRNA
GMO                         knock-out                               adaptation
transformation              neo-R                                   protospacer
transfection                germline                                crRNA biogenesis
naked DNA                   somatic                                 PAM
carrier                     ex vivo                                 HDR
electroporation             in vivo                                 NHEJ
microinjection              non-integrative                         indel
vector                      siRNA


STUDY QUESTIONS:
1) a) List the characteristics of an ideal model
organism.
b) Which model organism can be used most
efficiently to identify genes related to:
i) eye development
ii) skeletal development
iii) photosynthesis
iv) cell division
v) cell differentiation
vi) cancer


CHAPTER 41– CANCER GENETICS

Figure 1.
Fluorescent image of HeLa cells stained for actin with the actin-binding toxin
phalloidin (red), for microtubules (cyan) and for cell nuclei (blue). HeLa cells
are a line of immortal cultured cells derived from a cervical cancer
taken from Henrietta Lacks in 1951.
(Wikipedia-CFCF, NIH-PD)

INTRODUCTION
Cancer is a group of diseases that exhibit uncontrolled cell growth, invasion of adjacent tissues, and sometimes metastasis (the movement of cancer cells through the blood or lymph). In cancer cells, the regulatory mechanisms that normally control cell division and limit abnormal growth have been disrupted, usually by the accumulation of several mutations in specific genes. Cancer is therefore essentially a genetic disease. Although some cancer-related mutations may be heritable, most cancers are sporadic, meaning they arise from new mutations that occur in the individual who has the disease. In this chapter, we will examine the connection between cancer and genes.

1. CLASSIFICATION OF CANCERS
Cancers can be classified based on the tissues they resemble and thus in which they originate. For example, sarcomas are cancers that originate in mesenchymal cells, such as bone, cartilage, fat, or muscle.

Carcinomas originate in epithelial cells (both inside the body and on its surface) and are the most common types of cancer (~85%). This includes glandular tissues (e.g. breast, prostate). Each of these classifications may be further sub-divided. For example, squamous cell carcinoma (SCC), basal cell carcinoma (BCC), and melanoma are all types of skin cancers originating respectively in the squamous cells, basal cells, or melanocytes of the skin.

Lymphomas arise from hematopoietic (blood forming) cells. This includes leukemia, the most common type of cancer in children.

2. CANCER CELL BIOLOGY
Cancer is a progressive disease that usually begins with increased frequency of cell division (Figure 2). Under the microscope, this may be detectable as increased cellular and nuclear size, and an increased proportion of cells undergoing mitosis. As the disease progresses, cells typically lose their normal shape and tissue organization. This increased cell division and abnormal tissue organization is called dysplasia. Eventually a tumour develops, which can grow rapidly and expand into adjacent tissues. As cellular damage accumulates and additional growth control mechanisms are lost, some cells may break free of the primary tumour, pass into the blood or lymph system, and be transported to another organ, where they develop into new tumours (Figure 3).

The early detection of tumours is important so that they can be treated or removed before the onset of metastasis, but note that not all tumours will lead to cancer. Tumours that do not metastasize are classified as benign, and are not usually considered life threatening. In contrast, malignant tumours become invasive, and ultimately result in cancer.


Figure 2.
Progressive increases in cell division and abnormal cell morphology associated with cancer. (Wikipedia-NIH-PD)

Figure 3.
Secondary tumours (white) develop in the liver from cells of a metastatic pancreatic cancer. (Wikipedia- J. Hayman-PD)

1. Growth signal autonomy: Cancer cells can divide without the external signals normally required to stimulate division.
2. Insensitivity to growth inhibitory signals: Cancer cells are unaffected by external signals that inhibit division of normal cells.
3. Evasion of apoptosis: When excessive DNA damage and other abnormalities are detected, apoptosis (a type of programmed cell death) is induced in normal cells, but not in cancer cells.
4. Reproductive potential not limited by telomeres: Each division of a normal cell reduces the length of its telomeres. Normal cells arrest further division once telomeres reach a certain length. Cancer cells avoid this arrest and/or maintain the length of their telomeres.
5. Sustained angiogenesis: Most cancers require the growth of new blood vessels into the tumour. Normal angiogenesis is regulated by both inhibitory and stimulatory signals not required in cancer cells.
6. Tissue invasion and metastasis: Normal cells generally do not migrate (except in embryo development). Cancer cells invade other tissues including vital organs.
7. Deregulated metabolic pathways: Cancer cells use an abnormal metabolism to satisfy a high demand for energy and nutrients.
8. Evasion of the immune system: Cancer cells are able to evade the immune system.
9. Chromosomal instability: Severe chromosomal abnormalities are found in most cancers.
10. Inflammation: Local chronic inflammation is associated with many types of cancer.
Table 1
Ten hallmarks of Cancer (Hanahan and Weinberg, 2000; Hanahan 2011)
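Hallmark 4 in Table 1 describes a countdown: each division shortens the telomeres until a threshold length stops further division, unless telomere length is maintained. A minimal sketch of that logic follows. The function name and all numerical values are assumptions made for illustration and are not measurements from this chapter.

```python
def divisions_until_arrest(telomere_bp, loss_per_division_bp,
                           arrest_threshold_bp, length_maintained=False):
    """Count divisions before telomeres would drop below the arrest threshold.

    Returns None when telomere length is maintained, as in many cancer
    cells, because the countdown then never reaches the threshold.
    """
    if length_maintained:
        return None
    if loss_per_division_bp <= 0:
        raise ValueError("loss_per_division_bp must be positive")
    divisions = 0
    while telomere_bp - loss_per_division_bp >= arrest_threshold_bp:
        telomere_bp -= loss_per_division_bp
        divisions += 1
    return divisions


# Illustrative numbers only (assumed, not taken from the text).
normal_cell = divisions_until_arrest(10_000, 100, 5_000)
cancer_cell = divisions_until_arrest(10_000, 100, 5_000, length_maintained=True)
print(normal_cell, cancer_cell)
```

With these assumed values the normal cell arrests after a fixed number of divisions, while the cell that maintains its telomeres has no such limit.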


3. HALLMARKS OF CANCER
Researchers initially identified six molecular and cellular traits that characterize most cancers, and the list has since been extended to ten. These ten hallmarks of cancer are summarized in Table 1. In this chapter, we will focus on the first two hallmarks, namely growth signal autonomy and insensitivity to growth inhibitory signals.

4. MUTAGENS AND CARCINOGENS
A carcinogen is any agent that directly increases the incidence of cancer, while a mutagen is an agent that increases the incidence of mutations. Most, but not all, carcinogens are mutagens. Carcinogens that do not directly damage DNA include substances that accelerate cell division, thereby leaving less opportunity for the cell to repair induced mutations or errors in replication. Carcinogens that act as mutagens may be biological, physical, or chemical in nature, although the term is most often used in relation to chemical substances.

4.1. HUMAN PAPILLOMA VIRUS (HPV)

Figure 4.
Electron micrograph of HPV. (Laboratory of Tumor Virus Biology,-Unknown-PD)

Figure 5.
Dysplastic (left) and normal (right) cells from a Pap smear. (Flickr-Ed Uthman-CC BY 2.0)

Human Papilloma Virus (HPV, Figure 4) is an example of a biological carcinogen. Almost all cervical cancers begin with infection by HPV, which contains genes that disrupt the normal pattern of cell division within the host cell. Any gene that leads to an uncontrolled increase in cell division is called an oncogene. The HPV E6 and E7 genes are considered oncogenes because they inhibit the host cell’s natural tumour suppressing proteins (including p53, described below). The product of the E5 gene mimics the host’s own signals for cell division, and these and other viral gene products may contribute to dysplasia, which is detected during a Pap smear (Figure 5). Detection of abnormal cell morphology in a Pap smear is not necessarily evidence of cancer. It must be emphasized again that cells have many regulatory mechanisms to limit division and growth, and for cancer to occur, each of these mechanisms must be disrupted. This is one reason why only a minority of individuals with HPV infections ultimately develop cancer. Although most HPV-related cancers are cervical, HPV infection can also lead to cancer in other tissues, in both women and men.

4.2. IONIZING RADIATION
Ionizing radiation is a well-known physical carcinogen, because of its potential to induce DNA damage within the body. The most damaging type of radiation is ionizing, meaning waves or particles with sufficient energy to strip electrons from the molecules they encounter, including DNA or molecules that can subsequently react with DNA. Ionizing radiation, which includes x-rays, gamma rays, and some wavelengths of ultraviolet rays, is distinct from the non-ionizing radiation of microwave ovens, cell phones, and radios. As with other carcinogens, mutations in multiple, independent genes that normally regulate cell division are required before cancer develops.

4.3. CHEMICAL CARCINOGENS
Chemical carcinogens (Table 2) can be either natural or synthetic compounds that, based on animal feeding trials or epidemiological (i.e. human population) studies, increase the incidence of cancer. The definition of a chemical as a carcinogen is problematic for several reasons. Some chemicals become carcinogenic only after they are metabolized into another compound in the body; not all species or individuals may metabolize chemicals in the same way. Also, the carcinogenic properties of a compound are usually dependent on its dose. It can be difficult to define a relevant dose for both lab animals and humans. Nevertheless, when a correlation between cancer incidence and chemical exposure is observed, it is usually possible to find ways to reduce exposure to that chemical.


1. PAHs (polycyclic aromatic hydrocarbons): e.g. benzo[a]pyrene and several other components of the smoke of cigarettes, wood, and fossil fuels
2. Aromatic amines: e.g. formed in food when meat (including fish, poultry) is cooked at high temperature
3. Nitrosamines and nitrosamides: e.g. found in tobacco and in some smoked meat and fish
4. Azo dyes: e.g. various dyes and pigments used in textiles, leather, paints.
5. Carbamates: e.g. ethyl carbamate (urethane) found in some distilled beverages and fermented foods
6. Halogenated compounds: e.g. pentachlorophenol used in some wood preservatives and pesticides.
7. Inorganic compounds: e.g. asbestos; may induce chronic inflammation and reactive oxygen species
8. Miscellaneous compounds: e.g. alkylating agents, phenolics
Table 2
Some classes of chemical carcinogens (Pecorino 2008)

5. ONCOGENES
The control of cell division involves many different genes. Some of these genes encode signaling molecules that activate normal progression through the cell cycle. One of the pre-requisites for cancer occurs when one or more of these activators of cell division become mutated.

The mutation may involve a change in the coding sequence of the protein, so that it is more active than normal, or a change in the regulation of its expression, so that it is produced at higher levels than normal, or persists in the cell longer than normal. Genes that are a part of the normal regulation of cell division, but which after mutation contribute to cancer, are called proto-oncogenes. Once a proto-oncogene has been abnormally activated by mutation, it is called an oncogene. More than 100 genes have been defined as proto-oncogenes. These include genes at almost every step of the signaling pathways that normally induce cells to divide, including growth factors, receptors, signal transducers, and transcription factors.

Figure 6.
Structure of the ras protein.
(Wikipedia-Mark”AbsturZ”-PD)

ras is an example of a proto-oncogene (Figure 6). ras acts as a switch within signal transduction pathways, including the regulation of cell division. When a receptor protein receives a signal for cell division, the receptor activates ras, which in turn activates other signaling components, ultimately leading to activation of genes involved in cell division. Certain mutations of the ras sequence cause it to be in a permanently active form, which can lead to constitutive activation of the cell cycle. This mutation is dominant, as are most oncogenes. An example of the role of ras in relaying a signal for cell division in the EGF pathway is shown in Figure 7.


Figure 7.
Simplified representation of the epidermal growth factor (EGF) signaling pathway. In the panel on the left, the components are
shown in their inactive forms, prior to stimulation of the pathway. The components include the soluble ligand, EGF, its receptor
(EGFR, a tyrosine kinase), ras (a G protein), several kinases (RAF, MEK, MAPK), and a transcription factor (TF). In the right panel,
the activated pathway is shown. Binding of the ligand to its receptor leads to autophosphorylation of the receptor. Through a
series of proteins not shown here, the phosphorylated receptor stimulates conversion of ras to its active, GTP-bound form. The activated
ras then stimulates phosphorylation of a series of kinases, which ultimately activate transcription factors and the expression of
genes required for cell proliferation. (Original-Deyholos-CC:AN)
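The relay shown in Figure 7 can be sketched as a chain of on/off switches. This is a deliberately simplified model, not a quantitative description of the pathway: the function name and the allele labels "wt" and "constitutive" are hypothetical, and each component is treated as simply active or inactive.

```python
def proliferation_genes_on(egf_present, ras_alleles):
    """Follow the EGF signal through the simplified pathway of Figure 7.

    ras_alleles is a pair of labels, each "wt" or "constitutive"
    (hypothetical names for a normal and a permanently active allele).
    """
    # Ligand binding leads to autophosphorylation of the receptor (EGFR).
    receptor_active = egf_present
    # ras is switched on by the receptor, or is always on if at least one
    # allele is permanently active (the mutation is dominant).
    ras_active = receptor_active or "constitutive" in ras_alleles
    # The kinases relay the signal in series: RAF -> MEK -> MAPK.
    raf_active = ras_active
    mek_active = raf_active
    mapk_active = mek_active
    # The transcription factor switches on genes for cell proliferation.
    return mapk_active


print(proliferation_genes_on(False, ("wt", "wt")))            # no signal, no division
print(proliferation_genes_on(True, ("wt", "wt")))             # normal response to EGF
print(proliferation_genes_on(False, ("wt", "constitutive")))  # division without EGF
```

The third call illustrates growth signal autonomy: a single permanently active ras allele is enough to drive the pathway in the absence of the ligand.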

6. TUMOUR SUPPRESSOR GENES
More than 30 genes are classified as tumour suppressors. The normal functions of these genes include repair of DNA, induction of programmed cell death (apoptosis) and prevention of abnormal cell division. In contrast to proto-oncogenes, in tumour suppressors it is loss-of-function mutations that contribute to the progression of cancer. This means that tumour suppressor mutations tend to be recessive, and thus both alleles must be mutated in order to allow abnormal growth to proceed. It is perhaps not surprising that mutations in tumour suppressor genes are more likely than oncogenic mutations to be inherited. An example is the tumour suppressor gene, BRCA1, which is involved in DNA repair. Inherited mutations in BRCA1 increase a woman’s lifetime risk of breast cancer by up to seven times, although these heritable mutations account for only about 10% of breast cancer. Thus, sporadic rather than inherited mutations are the most common sources of both oncogenes and disabled tumour suppressor genes.

An important tumour suppressor gene is a transcription factor named p53 (Figure 8). Other proteins in the cell sense DNA damage, or abnormalities in the cell cycle, and activate p53 through several mechanisms including phosphorylation (attachment of phosphate to a specific site on the protein) and transport into the nucleus. In its active form, p53 induces the transcription of genes with several different types of tumour suppressing functions, including DNA repair, cell cycle arrest, and apoptosis. Over 50% of human tumours contain mutations in p53. People who inherit only one functional copy of p53 have a greatly increased incidence of early onset cancer. However, as with the other cancer related genes we have discussed, most mutations in p53 are sporadic, rather than inherited. Mutation of p53, through formation of pyrimidine dimers in the gene following exposure to UV light, has been causally linked to squamous cell and basal cell carcinomas (but not melanomas, highlighting the variety and complexities of mechanisms that can cause cancer).

Figure 8.
p53 bound to its target site on a DNA molecule.
(Wikipedia-Thomas Splettstoesser from Cho et al., Science 265, p. 346, 1994-CC BY-SA 3.0)

7. GLEEVEC™ (IMATINIB) - THE “POSTER BOY” OF GENETIC RESEARCH LEADING TO A CANCER TREATMENT

7.1. CHRONIC MYELOGENOUS LEUKEMIA (CML)
Chronic myelogenous leukemia (CML) is a cancer of the myeloid cells, a type of white blood cell, which are mutated and proliferate uncontrollably through three stages (chronic, accelerated, and blast crisis) and lead eventually to death. Cytogenetics showed that the myeloid cells of CML patients usually also have a consistent chromosome translocation (the mutant event) between the long arms of chromosomes 9 and 22, t(9;22)(q34;q11). It is also known as the Philadelphia chromosome (Ph+). This translocation involves breaks in two genes, c-abl and bcr, on chromosomes 9 and 22, respectively. The fusion of the translocation breaks results in a chimeric gene, called bcr-abl, that contains exons 1 and/or 2 from bcr (this varies from patient to patient) and exons 2-11 from abl. The chimeric gene is transcribed like bcr and produces a chimeric protein (BCR-ABL or p185bcr-abl) that contains abl enzyme sequences. This chimeric protein has a tyrosine kinase from the abl gene sequences that is unique to the CML mutant cell. The consistent, unregulated expression of this gene and its kinase product causes activation of a variety of intracellular signaling pathways, promoting the uncontrolled proliferative and survival properties of CML cells (the cancer). Thus, the BCR-ABL tyrosine kinase enzyme exists only in cancer cells (and not in healthy cells) and a drug that inhibits this activity could be used to target and prevent the uncontrolled growth of the cancerous CML cells.

7.2. INHIBITING THE BCR-ABL TYROSINE KINASE ACTIVITY
Knowing that the kinase activity was the key to treatment, pharmaceutical companies screened chemical libraries of potential kinase inhibitory compounds. After initially finding low potency inhibitors, a relationship between structure and activity suggested other compounds that were optimized to inhibit the BCR-ABL tyrosine kinase activity. The lead compound was STI571, now called Gleevec™ or imatinib (Figure 9).

Figure 9.
Biochemical structure of Gleevec™ or Imatinib.
(Wikipedia-Fuse809-PD)

This drug was shown to inhibit the BCR-ABL tyrosine kinase activity and to inhibit CML cell proliferation in vitro and in vivo. Gleevec™ works via targeted therapy: only the kinase activity found in the cancer cells is targeted, so the drug’s action is restricted to those cells. In this regard, Gleevec™ was one of the first cancer therapies to show the potential for this type of targeted action. It was dependent upon the genetic identification of the cause and protein target and is often cited as a paradigm for genetic research in cancer therapeutics.
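The composition of the chimeric bcr-abl gene described in section 7.1 can be written out as a short sketch. It assumes, following the text, that the fusion keeps either exon 1 alone or exons 1 and 2 of bcr, followed by exons 2 to 11 of abl. The function name and the exon labels are hypothetical and serve only as an illustration.

```python
def fuse_bcr_abl(bcr_exons_kept):
    """List the exons of a bcr-abl fusion gene, in order.

    bcr_exons_kept is 1 (exon 1 only) or 2 (exons 1 and 2), reflecting
    the patient-to-patient variation described in the text.
    """
    if bcr_exons_kept not in (1, 2):
        raise ValueError("the text describes fusions keeping 1 or 2 bcr exons")
    bcr_part = [f"bcr exon {n}" for n in range(1, bcr_exons_kept + 1)]
    abl_part = [f"abl exon {n}" for n in range(2, 12)]  # abl exons 2 to 11
    return bcr_part + abl_part


fusion = fuse_bcr_abl(2)
print(fusion[:3])   # the junction between the bcr and abl sequences
print(len(fusion))  # total number of exons in this version of the fusion
```

Because the gene begins with bcr sequences it is expressed like bcr, while the abl exons supply the tyrosine kinase activity that the drug is designed to inhibit.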
7.3. CAUTION
This is a simplified presentation of the CML/cancer targeting by the drug Gleevec™. There are many more details than are presented here. This story serves as a model of finding a drug for each type of cancer, rather than the one, single “magic bullet” that kills all cancers. Remember, there are always complexities in this type of research-to-treatment process, such as patient genetic and environmental variation that leads to differences in drug metabolism, uptake, and binding. Also, changes in drug dose, mutation of the bcr-abl gene, and other events can affect the effectiveness of the treatment and the relapse rate. Biological systems are extremely complex and difficult to modulate in the specific, targeted manner necessary to treat cancer ideally.

Remember, the drug, Gleevec™, is not a cure, but only a treatment. It prevents the uncontrolled proliferation of the CML cells, but doesn’t kill them directly. The arrested cells will die eventually, but there is always a small pool of CML cells that will proliferate if the drug is discontinued. While sustained use of this expensive drug may be financially beneficial to the pharmaceutical companies, it is certainly not the ideal situation for the patient.


___________________________________________________________________________
SUMMARY:
• Cancer is the name given to a class of different diseases that share common properties.
• Most cancers require accumulation of mutations in several different genes.
• Most cancer-causing mutations are sporadic, rather than inherited, and most are caused by
environmental carcinogens, including viruses, radiation, and certain chemicals.
• Oncogenes are hyperactivated regulators of cell division, and are often derived from gain-of-function
mutations in proto-oncogenes.
• Tumour suppressor genes normally help to repair DNA damage, arrest cell division, or kill over-
proliferating cells. Loss-of-function of these genes contributes to the progression of cancer.
• Genetic research into cancer can provide enzyme targets for drug investigation and potential
treatment (e.g. Gleevec™).
KEY TERMS:
metastasis epidemiology
sarcoma proto-oncogene
carcinoma receptor
squamous cell carcinoma signal transduction
basal cell carcinoma ras
melanoma tumour suppressors
lymphoma apoptosis
dysplasia BRCA1
benign p53
malignant phosphorylation
carcinogen CML
HPV Philadelphia chromosome
oncogene bcr-abl
ionizing Gleevec™


STUDY QUESTIONS:
1) Why do oncogenes tend to be dominant, but
mutations in tumour suppressors tend to be
recessive?
2) What tumour suppressing functions are
controlled by p53? How can a single gene affect
so many different biological pathways?
3) Are all carcinogens mutagens? Are all mutagens
carcinogens? Explain why or why not.
4) Imagine that a laboratory reports that feeding
chocolate to laboratory rats increases the
incidence of cancer. What other details would
you want to know before you stopped eating
chocolate?
5) Do all women with HPV get cancer? Why or
why not?
6) Do all women with mutations in BRCA1 get
cancer? Why or why not?


Answers
for
Open Genetics Lectures
Chapter Questions

Fall 2017 Version

CHAPTER QUESTION - ANSWERS


CHAPTER 01 – ANSWERS
1) The genetic mechanism could be blending, in which traits mix together like paint and cannot be separated
again; it could be particulate, as with DNA and genes; or it could be something else.
a) Identify pure breeding lines of the individuals that differed in some detectable trait, then cross the lines
with the different traits and see how the traits were inherited over several generations.
b) Purify different biochemical components, then see if any of the components were sufficient to transfer
traits from one individual to another.
c) It depends in part whether the organisms all evolved from the same ancestor. If so, then it seems likely.
d) The extraterrestrials would not necessarily have (and perhaps would be unlikely to have) the exact same types
of reductional divisions of chromosome-like material prior to sexual reproduction. In other words, there
are many conceivable ways to accomplish what sex, meiosis, and chromosomes accomplish on earth.
2) Hershey and Chase wanted to be able to track DNA and protein molecules from a specific source, within a
mixture of other protein and DNA molecules. Radioactivity is a good way to label molecules, since
detection is quite sensitive and the labeling does not interfere with biological function.
3) The experiments shown in Figure 3 show that DNA is necessary for transformation, (since removing the
DNA by nuclease treatment removes the competency for transformation). However, this does not
demonstrate that only DNA is sufficient to transfer genetic information; you could therefore try to purify S
strain DNA and see if injecting that DNA alone could transform R strains into S strains, while R Strain DNA
could not.
4) An analogy for the particulate model could be mixing sterile broth (recessive) with inoculated broth
(dominant). What others can you think of?
CHAPTER 02 – ANSWERS
1) a) Avery and colleagues demonstrated that DNA was likely the genetic material, while Watson and Crick
demonstrated the structure of the molecule. By knowing the structure, it was possible to understand how
DNA replicated, and how it encoded proteins, etc.
b) Avery and colleagues performed experiments, while Watson and Crick mostly analyzed the data of
others and used that to build models.
c) Watson and Crick relied on Franklin’s data in building their model. It is controversial whether Watson
and Crick should have been given access to these data.
2) Chargaff’s Rules, X-ray crystallography data, and Avery, MacLeod & McCarty and Hershey & Chase’s data,
as well as other information (e.g. specific details about the structure of the bases).
3) a) Right-handed, anti-parallel double helix with a major and minor groove. Each strand is composed of
sugar-nucleotide bases linked together by covalent phosphodiester bonds. Specific bases on opposite
strands of the helix pair together through hydrogen bonding, so that each strand contains the same
information in a complementary structure.
b) The complementarity of the bases and the redundant nature of the strands.
c) The order of the bases.
4) a) Hershey & Chase labeled the phosphate groups that join the bases.
b) G-C pairs have more hydrogen bonds, so more energy is required to break the larger number of bonds
in a G-C rich region as compared to an A-T rich region.
5) The ends of eukaryote chromosomes, telomeres, are in a constant state of flux, in that they potentially
change with each replication cycle. Telomerase can add repeats to the end, while the lack of telomerase
activity at the end of a chromosome can result in the loss of repeats.
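The reasoning in answer 4b of this chapter can be checked with a small calculation. The sketch below counts the hydrogen bonds holding a double-stranded region together, using two bonds per A-T pair and three per G-C pair. The function name and the two example sequences are made up for illustration.

```python
# Hydrogen bonds formed by each base with its complementary partner.
HYDROGEN_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def hydrogen_bonds(strand):
    """Total hydrogen bonds between a strand and its complement."""
    return sum(HYDROGEN_BONDS[base] for base in strand.upper())


gc_rich = "GCGCGC"  # hypothetical G-C rich region
at_rich = "ATATAT"  # hypothetical A-T rich region of the same length
print(hydrogen_bonds(gc_rich), hydrogen_bonds(at_rich))
```

For regions of equal length, the G-C rich region has more hydrogen bonds to break, which is why more energy is needed to separate its strands.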


CHAPTER 03 – ANSWERS
1) Mutant strain #1 has a mutation in gene B (but genes A and C should be functional).
Mutant strain #2 is in gene C (but genes A and B should be functional).
Mutant strain #3 is in gene A (but genes B and C should be functional).
2) Even prototrophs cannot produce the vitamin biotin, so it must be added for any strain to grow. Wild type
strains also lack the enzyme(s) for this biochemical pathway. Biotin is present in Complete Medium.
3) No, we now know that genes also encode tRNA, rRNA, and a variety of other functional RNAs.
4)
a. Changes in many amino acids do not cause a change in function. A specific amino acid is not
required at that site for function to occur.
b. Changes in many amino acids can cause a minor loss in function. A specific amino acid at a site may
be required for optimal function to occur.
c. Changes in some amino acids can cause a complete loss in function. Many specific amino acids are
required at specific sites for any function to occur (e.g. the active site within an enzyme).
d. Any one of the amino acids changed in part (c) can result in a complete loss of function.
5) No, the gene can be transcribed into an mRNA and translated into a polypeptide, but the polypeptide is
not functional because of a change in an amino acid.
6) Chain A has ~268, while chain B has 450. The entire enzyme has 4 chains, two A and two B (a
heterotetramer).
7) row 1 orange, orange, orange
row 2 white, orange, orange
row 3 yellow, yellow, orange
row 4 white, yellow, orange

CHAPTER 04 – ANSWERS
1)
a) Mutagenize a wild type (prototrophic) strain and screen for mutants that fail to grow on minimal
media, but grow well on minimal media supplemented with proline.
b) Take mutants #1-#10 and characterize them, based on:
(1) genetic mapping of the mutants (different locations indicate different genes);
(2) different response to proline precursors (a different response suggests different genes);
(3) complementation tests among the mutations (if they complement then they are mutations in
different genes).
c) If the mutations are in different genes then the F1 progeny would be wild type (able to grow on
minimal medium without proline).
d) If the mutations are in the same gene then the F1 progeny would NOT be wild type (unable to grow on
minimal medium without proline).
2) There are many correct answers for this question. Here is one.

     1   2   3   4   5
1    -                   Mutant in locus (1,2)
2    -   -               Mutant in locus (1,2)
3    +   +   -           Mutant in locus (3)
4    +   +   +   -       Mutant in locus (4)
5    +   +   -   -   -   Mutant in locus (3 and 4)
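The way this table is read can be sketched in code: a "-" between two different mutants means they fail to complement and therefore share at least one mutant locus. The dictionary below simply transcribes the table, and the function name is hypothetical.

```python
from collections import defaultdict

# Complementation results transcribed from the table above:
# "+" = the two mutants complement, "-" = they fail to complement.
RESULTS = {
    (1, 1): "-",
    (2, 1): "-", (2, 2): "-",
    (3, 1): "+", (3, 2): "+", (3, 3): "-",
    (4, 1): "+", (4, 2): "+", (4, 3): "+", (4, 4): "-",
    (5, 1): "+", (5, 2): "+", (5, 3): "-", (5, 4): "-", (5, 5): "-",
}


def non_complementing_partners(results):
    """For each mutant, list the other mutants it fails to complement."""
    partners = defaultdict(set)
    for (a, b), outcome in results.items():
        partners[a]  # touching the key makes sure every mutant is listed
        partners[b]
        if a != b and outcome == "-":
            partners[a].add(b)
            partners[b].add(a)
    return {mutant: sorted(found) for mutant, found in sorted(partners.items())}


print(non_complementing_partners(RESULTS))
```

Mutants 1 and 2 fail to complement only each other, so they fall in one locus. Mutant 5 fails to complement both 3 and 4, even though 3 and 4 complement each other, which is why it is scored as mutant in both of those loci.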

3) The auxotrophic strain is mutant in one gene. This gene has both a HindIII and XhoI site within its
sequence, but not an EcoRI site. Thus, the EcoRI library could contain a restriction fragment with an entire,
intact gene, while the two other enzymes would break the gene into two fragments that would not be
cloned together as a functional gene.
4) No. E. coli cells do not normally import proteins from their environments, thus none of the Enzyme A
proteins would enter the cells to effect a rescue. If the product of Enzyme A was added, then it
could rescue the strain, but only if the product could be taken up by the cells.

CHAPTER 05 – ANSWERS
1) Gene: a hereditary unit, that occupies a specific position (Locus) within the genome or chromosome, and
has at least one specific effect on the phenotype of the organism, and can mutate into various allelic
forms, and can recombine with other such units. (From A Dictionary of Genetics, King & Stansfield, Third
edition, 1985.)
2) No, not all DNA in a genome is part of a gene, and not all of it has a known function. Some (most?) DNA
in higher eukaryote genomes has no known function and is often called “junk” DNA or “garbage” DNA.
3) No. RNA transcripts also include tRNA, rRNA, as well as a whole plethora of other RNAs that function as
RNA molecules and are not (cannot be) translated into polypeptides.
4) The UTRs (untranslated regions) are the regions at the 5’ and 3’ ends of the mRNA transcript, outside the
coding sequence (start codon to stop codon), that are not translated. These sequences often have regulating
elements (short sequence stretches) that alter the mRNA’s translation.
5) No. The origin of replication in E. coli (oriC) is an example. The centromeres and telomeres are also
sequences that are not transcribed, but their change or loss can have an effect on the phenotype.

CHAPTER 06 – ANSWERS
1) All are low glucose – lactose is as specified
Legend:
+++ Lots of β-galactosidase activity (100%)
++ Moderate β-galactosidase activity (10-20%)
+ Basal β-galactosidase activity (~≤1%)
- No β-galactosidase activity (0%)

-- a) I+, O+, Z+, Y+ (no glucose, no lactose)
+++ b) I+, O+, Z+, Y+ (no glucose, high lactose)
-- c) I+, O+, Z+, Y+ (high glucose, no lactose)
+ d) I+, O+, Z+, Y+ (high glucose, high lactose)
-- e) I+, O+, Z-, Y+ (no glucose, no lactose)
-- f) I+, O+, Z-, Y+ (high glucose, high lactose)
+ g) I+, O+, Z+, Y- (high glucose, high lactose)
++ h) I+, Oc, Z+, Y+ (no glucose, no lactose)
+++ i) I+, Oc,Z+, Y+ (no glucose, high lactose)
+ j) I+, Oc, Z+, Y+ (high glucose, no lactose)


+ k) I+, Oc, Z+, Y+ (high glucose, high lactose)

+++ l) I-, O+, Z+, Y+ (no glucose, no lactose)
+++ m) I-, O+, Z+, Y+ (no glucose, high lactose)
+ n) I-, O+, Z+, Y+ (high glucose, no lactose)
+ o) I-, O+, Z+, Y+ (high glucose, high lactose)
-- p) Is, O+, Z+, Y+ (no glucose, no lactose)
-- q) Is, O+, Z+, Y+ (no glucose, high lactose)
-- r) Is, O+, Z+, Y+ (high glucose, no lactose)
-- s) Is, O+, Z+, Y+ (high glucose, high lactose)
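The answers to question 1 follow a small set of rules, which can be written out as a sketch. The function below is an illustration rather than a full model of the lac operon: its name and argument conventions are hypothetical, lacY is not modelled, and the "++" level for an Oc operator without lactose is taken directly from the answer key above.

```python
def beta_gal_activity(lac_i, lac_o, lac_z, glucose, lactose):
    """Predict the β-galactosidase level for a haploid lac genotype.

    lac_i: "+", "-" or "s"; lac_o: "+" or "c"; lac_z: "+" or "-".
    glucose and lactose: True for "high", False for "no".
    """
    if lac_z == "-":
        return "--"  # no functional enzyme can be made
    # The repressor can bind DNA if it is the super-repressor (Is), or if it
    # is wild type and no lactose is present to inactivate it.
    repressor_can_bind = lac_i == "s" or (lac_i == "+" and not lactose)
    if lac_o == "+" and repressor_can_bind:
        return "--"  # operator occupied, transcription blocked
    if glucose:
        return "+"   # unrepressed, but only basal expression with glucose
    if lac_o == "c" and not lactose:
        return "++"  # level given in the answer key for Oc without lactose
    return "+++"


print(beta_gal_activity("+", "+", "+", glucose=False, lactose=True))   # part b
print(beta_gal_activity("s", "+", "+", glucose=False, lactose=True))   # part q
```

Running the function over all of the genotypes and conditions in question 1 reproduces the levels listed above, which is a quick way to check an answer worked out by hand.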
2) All are low glucose – lactose is as specified
Legend
+++ Lots of β-galactosidase activity (100%)
++ Moderate β-galactosidase activity (10-20%)
+ Basal β-galactosidase activity (~≤1%)
- No β-galactosidase activity (0%)

+++ a. I+, O+, Z+, Y+ / I+, O-, Z-, Y- (high lactose)
-- b. I+, O+, Z+, Y+ / I+, O-, Z-, Y- (no lactose)
+++ c. I+, O+, Z-, Y+ / I+, O-, Z+, Y+ (high lactose)
++ d. I+, O+, Z-, Y+ / I+, O-, Z+, Y+ (no lactose)
+++ e. I+, O+, Z-, Y+ / I-, O+, Z+, Y+ (high lactose)
-- f. I+, O+, Z-, Y+ / I-, O+, Z+, Y+ (no lactose)
+++ g. I-, O+, Z+, Y+ / I+, O+, Z-, Y+ (high lactose)
-- h. I-, O+, Z+, Y+ / I+, O+, Z-, Y+ (no lactose)
+++ i. I+, Oc, Z+, Y+ / I+, O+, Z-, Y+ (high lactose)
++ j. I+, Oc, Z+, Y+ / I+, O+, Z-, Y+ (no lactose)
+++ k. I+, O+, Z-, Y+ / I+, Oc, Z+, Y+ (high lactose)
++ l. I+, O+, Z-, Y+ / I+, Oc, Z+, Y+ (no lactose)
-- m. I+, O+, Z-, Y+ / Is, O+, Z+, Y+ (high lactose)
-- n. I+, O+, Z-, Y+ / Is, O+, Z+, Y+ (no lactose)
-- o. Is, O+, Z+, Y+ / I+, O+, Z-, Y+ (high lactose)
-- p. Is, O+, Z+, Y+ / I+, O+, Z-, Y+ (no lactose)
3) You could demonstrate this with just I+OcZ-/I+O+Z+. The fact that this does not have constitutive β-
galactosidase expression shows that the operator only acts on the same piece of DNA on which it is
located. There are also other possible answers.
4) You could also demonstrate this with just I+O+Z-/I-O+Z+. The fact that this has the same lactose-inducible
phenotype as wild-type shows that a functional lacI gene can act on operators on both the same piece of
DNA from which it is transcribed, or on a different piece of DNA. There are also other possible answers.
5) For all of these, the answer is the same: The lac operon would remain inducible by lactose, but only up to
a basal level of expression, even in the absence of glucose.
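
The cis/trans reasoning behind answers 2 to 4 can be sketched as a small rule-based model. This is only an illustration: the function name merodiploid_activity, the dictionary layout, and the simplification that any operator mutant (Oc or O-) gives ++ without lactose and +++ with lactose are assumptions chosen to mirror the table in answer 2, not a quantitative model of the operon. Low glucose is assumed throughout, as in question 2.

```python
LEVELS = ["--", "+", "++", "+++"]  # same symbols as the legend above


def merodiploid_activity(copies, lactose):
    """Predict beta-galactosidase activity at low glucose.

    copies  -- list of dicts, one per DNA copy, with keys "I", "O", "Z"
               (values "+", "-", "s" for Is, "c" for Oc)
    lactose -- True if lactose is present
    """
    i_alleles = {copy["I"] for copy in copies}
    # The repressor is a diffusible protein, so it acts in trans on every copy.
    if "s" in i_alleles:
        repressor_binds = True          # super-repressor cannot be induced
    elif "+" in i_alleles:
        repressor_binds = not lactose   # normal repressor is released by lactose
    else:
        repressor_binds = False         # no functional repressor at all

    best = 0
    for copy in copies:
        if copy["Z"] != "+":
            continue                    # this copy cannot make the enzyme
        # The operator acts in cis: it only controls Z on its own copy.
        if copy["O"] == "+":
            level = 0 if repressor_binds else 3
        else:                           # Oc or O-: repressor cannot bind here
            level = 3 if lactose else 2
        best = max(best, level)
    return LEVELS[best]


# The genotype suggested in answer 3: I+ Oc Z- / I+ O+ Z+, no lactose.
oc_zminus = {"I": "+", "O": "c", "Z": "-"}
wild_type = {"I": "+", "O": "+", "Z": "+"}
print(merodiploid_activity([oc_zminus, wild_type], lactose=False))  # prints --
```

Because the operator check sits inside the per-copy loop while the repressor check sits outside it, the sketch makes the difference between a cis-acting element and a trans-acting factor explicit.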

CHAPTER 07 – ANSWERS
1) Transcriptional: initiation, processing & splicing, degradation
Translational: initiation, processing, degradation
Post-translational: modifications (e.g. phosphorylation), localization
Others: histone modification, other chromatin remodeling, DNA methylation

2) Both involve trans-factors binding to corresponding cis-elements to regulate the initiation of transcription
by recruiting or stabilizing the binding of RNApol and related transcriptional proteins at the promoter. In
prokaryotes, genes may be regulated as a single operon. In eukaryotes, enhancers may be located much
further from the promoter than in prokaryotes.
3) If there was no deacetylation of FLC by HDAC, transcription of FLC might continue constantly, leading to
constant suppression of flowering, even after winter.
CHAPTER 08 – ANSWERS
1) Various roles could include varying abilities to bind O2: the need is greater in an embryo and fetus, less so in an
adult. This could be tested by obtaining blood from those stages and determining the blood’s ability to
bind O2.
2) The two main types (α-globins and β-globins) are both derived by gene duplication and evolution from a
common ancestral gene.
3) Cartoon:
α-globin and β-globin, two of each type

4) Look it up. If you’re too lazy to do a search, see:
http://www.bloodjournal.org/content/125/24/3694?sso-checked=true

5) It is caused by the LCR keeping the gamma genes on after a person is born rather than switching
to the beta and delta genes (see Figure 5). A person will usually be unaware they have this
condition.

CHAPTER 09 – ANSWERS
1) Heterozygous people have the lactose persistence phenotype. As infants both alleles are active. As adults
only the LP allele remains on but it continues to supply the intestinal epithelial cells with Lactases. LP
alleles are dominant to the original allele because the LP allele's phenotype is the one seen when both
alleles are together.
2) The gene's alternative symbols are LAC and LPH. It is important to have a single symbol to make searches
of published journal articles possible. It is also a guide to authors of future articles, texts, and lectures.
3) If a person has a Lactase persistence phenotype they will break down the lactose and import the glucose
(and galactose). Soon most of the glucose will enter their circulatory system and cause a noticeable
increase in blood glucose levels. The increase is noticeable after 15 minutes and peaks at 45 minutes. If a
person has a non-persistence phenotype none of this will happen because the lactose will pass through
their digestive tract (although some of it will be consumed by gut bacteria).
4) An Insulin protein begins with an ER signal sequence. Once its Ribosome has been delivered to the ER, the
signal sequence can be removed. The Ribosome continues synthesizing the Insulin protein and feeding it into the ER lumen.
Insulin proteins do not have a stop transfer sequence, so the entire protein will be released into the ER
lumen. Insulin proteins do have a pro sequence; it is removed once the protein has taken on its proper
shape. When a transport vesicle carrying the Insulin fuses with the plasma membrane, the protein is
released from the cell and can enter the blood.
5) The major difference is that E. coli cells import lactose and then hydrolyse it, while human intestinal cells do
these steps in the opposite order: hydrolyse first and import second. The major similarities are in the proteins
required: Lac Permease (E. coli) and SGLT1 (humans) are both carbohydrate transporters located in the
plasma membrane, while Beta-Galactosidase (E. coli) and Lactase (humans) are both enzymes that
hydrolyse the disaccharide lactose.
6) There are enzymes to hydrolyse each disaccharide and transport proteins to import the resulting
monosaccharides. These proteins happen to be named Lactase, Sucrase, Maltase, SGLT1 (imports glucose
and galactose), and GLUT5 (imports fructose).
7) Firstly, the excess lactose upsets the osmotic balance in the large intestine. Water enters the gut from the
tissues leading to diarrhea and dehydration. Secondly, the lactose will be used as food by bacteria living in
the large intestine. When they use the lactose to make ATP they expel carbon dioxide, hydrogen, and
other gases, which cause cramping and flatulence.

CHAPTER 10 – ANSWERS
1) A geneticist would use these as white (plain text) = phenotype, white (italicized) = gene, and WHITE (capitals) = protein.
2) The world is 19 times brighter for these flies. Without the optical insulation provided by the pigments light
from all directions strikes all of the photoreceptors. The flies are unable to make sense of the information
their eyes send to their brains.
3) The flies would be unable to make either transporter and would have white eyes as a result.
4) Yes. Most noticeably, the male flies have difficulty performing the mating dance that leads to sex with
female flies.

CHAPTER 11 – ANSWERS
1) Polymorphisms and mutations are both variations in DNA sequence and can arise through the same
mechanisms. We use the term polymorphism to refer to DNA variants that are relatively common in
populations. Mutations affect the phenotype.
2) Misreading of bases during replication can lead to substitution and can be caused by things like
tautomerism, DNA alkylating agents, and irradiation.
3) Looping out of DNA on the template strand during replication; strand breakage, due to radiation and other
mutagens; and (discussed in other chapters) chromosomal aberrations such as deletions and
translocations.
4) Looping out of DNA on the growing strand during replication; transposition; and (discussed in earlier
chapters) chromosomal aberrations such as duplications, insertions, and translocation.
5) Benzopyrene is one of many hazardous compounds present in smoke. Benzopyrene is an intercalating
agent, which slides between the bases of the DNA molecule, distorting the shape of the double helix,
which disrupts transcription and replication and can lead to mutation.
6) See Chapter 10.
7) Class I. See Figure 9 on Transposable Elements.

CHAPTER 12 – ANSWERS
1)
a) One possible explanation is that original mutagenesis resulted in a loss-of-function mutation in a gene
that is essential for early embryonic development, and that this mutation is X-linked recessive in the
female. Because half of the sons will inherit the X chromosome that bears this mutation, half of the
sons will fail to develop beyond very early development and will not be detected among the F1
progeny. The proportion of male flies that were affected depends on what fraction of the female
parent’s gametes carried the mutation. In this case, it appears that half of the female’s gametes carried
the mutation.
b) To test whether a gene is X-linked, you can usually do a reciprocal cross. However, in this case it would
be impossible to obtain adult male flies that carry the mutation; they are dead. If the hypothesis
proposed in a) above is correct, then half of the females, and none of the living males in the F1 should
carry the mutant allele. You could therefore cross F1 females to wild type males, and see whether the
expected ratios were observed among the offspring (e.g. half of the F1 females should have fewer
male offspring than expected, while the other half of the F1 females and all of the males should have
roughly equal numbers of male and female offspring).
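
The hypothesis in a) makes a numerical prediction that can be sketched in a few lines. The allele labels ("+" for wild type, "m" for the hypothetical lethal allele) and the function name surviving_offspring are assumptions made for illustration only.

```python
from collections import Counter
from itertools import product


def surviving_offspring(mother, father):
    """Count surviving offspring classes for an X-linked recessive lethal.

    mother -- tuple of her two X alleles, e.g. ("+", "m")
    father -- tuple of his X allele and "Y", e.g. ("+", "Y")
    """
    counts = Counter()
    for egg, sperm in product(mother, father):
        if sperm == "Y":
            if egg == "m":
                continue              # hemizygous mutant sons die as embryos
            counts["male"] += 1
        else:
            if egg == "m" and sperm == "m":
                continue              # homozygous mutant daughters would also die
            counts["female"] += 1
    return counts


carrier = surviving_offspring(("+", "m"), ("+", "Y"))
non_carrier = surviving_offspring(("+", "+"), ("+", "Y"))
print(carrier["female"], carrier["male"])          # 2 1
print(non_carrier["female"], non_carrier["male"])  # 2 2
```

Under these assumptions a carrier female gives about two daughters for every son, while a non-carrier female gives the usual 1:1 ratio, which is the difference the cross in b) is designed to detect.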
2)
a) Treat a population of seeds with a mutagen such as EMS. Allow the plants grown from these seeds to self-pollinate, and then
allow the F1 generation to also self-pollinate. In the F2 generation, smell each flower to find individuals
with abnormal scent.
b) The fishy gene appears to be required to make the normal floral scent. Because the flowers smell fishy
in the absence of this gene, one possible explanation of this is that fishy makes an enzyme that
converts a fishy-smelling intermediate into a chemical that gives flowers their normal, sweet smell.

Note that although we show this biochemical pathway as leading from the fishy-smelling chemical to the
sweet-smelling chemical in one step, it is likely that there are many other enzymes that act after the fishy
enzyme to make the final, sweet-smelling product. In either case, blocking the pathway at the step
catalyzed by the fishy enzyme would explain the fishy smell.
c) In nosmell plants, the normal sweet smell disappears. Unlike in fishy, the sweet smell is not replaced by
any intermediate chemical that we can easily detect. Thus, we cannot conclude where in the biochemical
pathway the nosmell mutant is blocked; nosmell may therefore normally act either before or after fishy
acts in the pathway:

Alternatively, nosmell may not be part of the biosynthetic pathway for the sweet smelling chemical at all.
It is possible that the normal function of this gene is to transport the sweet-smelling chemical into the cells
from which it is released into the air, or maybe it is required for the development of those cells in the first
place. It could even be something as general as keeping the plants healthy enough that they have enough
energy to do things like produce floral scent.
3)
a) Dominant mutations are generally much rarer than recessive mutations. This is because mutation of a
gene tends to cause a loss of the normal function of this gene. In most cases, having just one normal
(wt) allele is sufficient for normal biological function, so the mutant allele is recessive to the wt allele.
Very rarely, rather than destroying normal gene function, the random act of mutation will cause a gene
to gain a new function (e.g. to catalyze a new enzymatic reaction), which can be dominant (since it
performs this new function whether the wt allele is present or not). This type of gain-of-function
dominant mutation is very rare because there are many more ways to randomly destroy something
than by random action to give it a new function (think of the example given in class of stomping on an
iPod).
b) Dominant mutations should be detectable in the F1 generation, so the F1 generation, rather than the F2
generation can be screened for the phenotype of interest.
c) Large deletions, such as those caused by some types of radiation, are generally less likely than point
mutations to introduce a new function into a protein: it is hard for a protein to gain a new function if
the entire gene has been removed from the genome by deletion.
4)
a) Mutagenize a wild type (prototrophic) strain and screen for mutants that fail to grow on minimal
media, but grow well on minimal media supplemented with proline.
b) Take mutants #1-#10 and characterize them, based on (1) genetic mapping of the mutants (different
locations indicate different genes); (2) different response to proline precursors (a different response
suggests different genes); (3) complementation tests among the mutations (if they complement then
they are mutations in different genes).
c) If the mutations are in different genes then the F1 progeny would be wild type (able to grow on
minimal medium without proline).
d) If the mutations are in the same gene then the F1 progeny would NOT be wild type (unable to grow on
minimal medium without proline).
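
The logic of parts c) and d) extends naturally to sorting a whole set of mutants into genes. The sketch below is an assumption-laden illustration: the mutant names and the list of non-complementing pairs are hypothetical data, and complementation_groups is a made-up helper name.

```python
def complementation_groups(mutants, fails_to_complement):
    """Group mutants so that non-complementing mutants share a group.

    mutants             -- list of mutant names
    fails_to_complement -- set of frozensets {a, b} whose F1 still cannot
                           grow without proline (mutations in the same gene)
    """
    groups = []
    for m in mutants:
        # Find every existing group holding a partner that m fails to complement.
        hits = [g for g in groups
                if any(frozenset((m, other)) in fails_to_complement for other in g)]
        merged = {m}.union(*hits)
        groups = [g for g in groups if g not in hits] + [merged]
    return groups


# Hypothetical results: m1 x m2 and m3 x m4 give F1 progeny that need proline.
pairs = {frozenset(("m1", "m2")), frozenset(("m3", "m4"))}
for group in complementation_groups(["m1", "m2", "m3", "m4"], pairs):
    print(sorted(group))
```

With this made-up data the four mutants fall into two groups, suggesting (under the stated assumptions) mutations in two different genes.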

CHAPTER 13 – ANSWERS
1) These are four common terms that are often used interchangeably by novice students, but do have
distinctly different meanings and uses. (1) gene = general term for a segment of nucleic acid that is
responsible for one or more phenotypes (2) locus = the position of a gene along a chromosome, (3) allele =
the form (DNA sequence) of a gene at a locus, (4) transcription unit = the segment of DNA that is
transcribed into RNA (often mRNA in the case of a protein coding gene).
2) Form (1) RR (red) x rr (white) gives Rr (red progeny). “R” is dominant to “r”.
Form (2) r+r+ (red) x r-r- (white) gives r+r- (red progeny). “r+” is dominant to “r-”.
For pink progeny, the symbols are the same, only “R” or “r+” is semi-dominant to “r” or “r-”.
3) If your blood type is B, then your genotype is either IBi or IBIB. If your genotype is IBi, then your parents
could be any combination of genotypes, as long as one parent had at least one i allele, and the other
parent had at least one IB allele. If your genotype was IB IB, then both parents would have to have at least
one IB allele.
4) case 1 co-dominance
case 2 incomplete-dominance
case 3 incomplete penetrance
case 4 pleiotropy
case 5 haplo-sufficiency
case 6 haplo-insufficiency
case 7 broad (variable) expressivity
5) Mutant#1 = hypomorph
Mutant#2 = hypermorph
Mutant#3 = amorph
Mutant#4 = neomorph
Mutant#5 = antimorph

CHAPTER 14 – ANSWERS
1) No. Since chromosomes vary greatly in size, the number of chromosomes does not correlate with the total
DNA content. For reasons discussed in Chapter 5 and this chapter, the number of genes does not correlate
closely to DNA content either.
2) Heterochromatic regions with repetitive DNA, centromeres, and telomeres are examples of gene-poor
regions of chromosomes.
3)
a. Only one (except for holocentric chromosomes, not discussed in this chapter).
b. The two centromeres might get pulled towards opposite poles at mitosis/meiosis resulting in
chromosome breakage.
c. It would not segregate properly at mitosis or meiosis, leading to aneuploidy. In order to segregate
correctly, there would have to be another way to control its movement at mitosis and meiosis.
4)
a. At the end of G1, 16 chromosomes with 1 chromatid each.
b. At the end of S, 16 chromosomes with 2 chromatids each.
c. At the end of G2, 16 chromosomes with 2 chromatids each.
d. At the end of mitosis, 16 chromosomes with 1 chromatid each.
5)
a. There is little correlation between any of these, with the exception that larger genomes tend to
have more genes.
b. The C-value paradox can be explained by genomes having different amounts of non-coding DNA
between genes and within genes as introns.
c. If we define “organismal complexity” as the size of the genome (or number of cells/organism), then
larger, more complex organisms tend to have more genes, although not always and not in a direct,
linear, proportioned manner. Also, those with larger genomes tend to have greater distances
between genes.

CHAPTER 15 – ANSWERS
1)
a) Red blood cells do not have chromosomes. They are terminally differentiated and have expelled their
nucleus.
b) First, it is difficult to collect cells in anaphase. Second, in anaphase there would be twice as many
chromosomes, which would make identifying them much harder.

2) Yes. Males being 46,XY have slightly less DNA than 46,XX females, but still have the same number of
chromosomes.
3)
a) True.
b) True.
c) False, only females have a paternal X chromosome.
d) True.
e) False, only males have a paternal Y chromosome.
f) False, no one has a maternal Y chromosome. Females don’t have Y-chromosomes.
g) False, typically no one has a paternal mitochondrial chromosome. Mitochondria are maternally
inherited. However, there are rare cases of inheritance of paternal mitochondria.
h) True.
4) Centromeres function as a “chromosome's handle”. Each needs one handle but it doesn't matter where
along the chromosome it is.
5) If there was only one ori in the middle of the chromosome it would take too long for the replication forks
to reach the ends of the chromosome. Even with thousands of ori's per chromosome, it still takes 8 hours
to replicate our DNA.
6) Chromatin is the material from which chromosomes are made (mostly DNA + protein). DNA is a
component of both chromatin and chromosomes.
7) Top left: Histone proteins; top right: Histone and Cohesin proteins, bottom right: Histones, Cohesins,
Condensins, and Kinetochore proteins; bottom left: Histone, Condensin, and Kinetochore proteins.
8)
a) DNA Polymerases are found inside the nucleus and the mitochondria.
b) RNA Polymerases are found inside the nucleus and the mitochondria.
c) Ribosomes are found free in the cytosol, on the surface of the rough ER, and inside the mitochondria.
(Some have been found in the nucleus, too.)
9)
a) The F8 gene could work on an autosome. Its mRNAs would still leave the nucleus to be translated
in the cytosol.
b) The SRY gene would not work normally on an autosome because then females would have the gene as
well as males and thus females would become males.
c) The MT-CO1 gene would not work on an autosome (nuclear) because the genetic code is different in
the nucleus (vs. mitochondrion). The protein must be translated inside the mitochondria to have the correct
amino acid sequence. If it were translated in the cytosol the amino acid sequence would be different, and
the protein would thus likely not work normally.

CHAPTER 16 – ANSWERS
1) If genetic factors blended together like paint then they could not be separated again. The white flowered
phenotype would therefore not reappear in the F2 generation, and all the flowers would be purple or
maybe light purple, not white.
2) Your choice……
3) There is a maximum of two alleles for a normal autosomal locus from a diploid individual. In the whole
population there can be essentially an unlimited number of different alleles; the limit being determined by
the population size.
4)
a. In the F1 generation, the genotype of all individuals will be Ww and all of the dogs will have wiry
hair.
b. In the F2 generation, there would be an expected 3:1 ratio of wiry-haired to smooth-haired dogs.
c. Although it is expected that only one out of every four dogs in the F2 generation would have
smooth hair, large deviations from this ratio are possible, especially with small sample sizes. These
deviations are due to the random nature in which gametes combine to produce offspring. Another
example of this would be the fairly common observation that in some human families, all of the
offspring are either girls, or boys, even though the expected ratio of the sexes is essentially 1:1.
d. You could do a test cross, i.e. cross the wiry-haired dog to a homozygous recessive dog (ww).
Based on the phenotypes among the offspring, you might be able to infer the genotype of the
wiry-haired parent.
e. From the information provided, we cannot be certain which, if either, allele is wild-type. Generally,
dominant alleles are wild-type, and abnormal or mutant alleles are recessive.
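
The ratios quoted in parts a, b, and d can be checked by enumerating the Punnett square. This is a minimal sketch; the function names cross and phenotypes are illustrative choices, and W is assumed to be fully dominant as stated in the question.

```python
from collections import Counter
from itertools import product


def cross(parent1, parent2):
    """Return genotype counts from a one-locus cross, e.g. cross("Ww", "Ww")."""
    offspring = Counter()
    for a, b in product(parent1, parent2):
        # Sorting puts the capital letter first, so "wW" and "Ww" are pooled.
        offspring["".join(sorted(a + b))] += 1
    return offspring


def phenotypes(genotype_counts, dominant="W"):
    """Collapse genotype counts into wiry (dominant) and smooth (recessive)."""
    result = Counter()
    for genotype, n in genotype_counts.items():
        result["wiry" if dominant in genotype else "smooth"] += n
    return result


print(dict(cross("WW", "ww")))              # F1: every square is Ww
print(dict(phenotypes(cross("Ww", "Ww"))))  # F2: 3 wiry to 1 smooth
print(dict(cross("Ww", "ww")))              # test cross of a heterozygote
print(dict(cross("WW", "ww")))              # test cross of a homozygote
```

The last two lines show why the test cross in part d is informative: a heterozygous parent gives both Ww and ww offspring, whereas a homozygous dominant parent gives only Ww.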
5) Even before the idea of a homozygous genotype had really been formulated, Mendel was still able to
assume that he was working with parental lines that contained the genetic material for only one variant of
a trait (e.g. EITHER green seeds or yellow seeds), because these lines were pure-breeding. Pure-breeding
means that the phenotype doesn’t change over several generations of self-pollination. If the parental lines
had not been pure-breeding, it would have been very hard to make certain key inferences, such as that the
F1 generation could contain the genetic information for two variants of a trait, although only one variant
was expressed. This inference led eventually to Mendel’s First Law.
6) Equal segregation of alleles occurs only in meiosis. Although mitosis does produce daughter cells that are
genetically equal, there is no segregation (i.e. separation) of alleles during mitosis; each daughter cell
contains both of the alleles that were originally present in the parent cell.

CHAPTER 17 – ANSWERS
1)

R;Y R;Y r;Y r;Y R;Y R;y r;Y r;y

R/r R/r r/r r/r R/r R/r r/r r/r
r;y r;y
Y/y Y/y Y/y Y/y Y/y y/y Y/y y/y
R/r R/r r/r r/r R/r R/r r/r r/r
r;y r;y
Y/y Y/y Y/y Y/y Y/y y/y Y/y y/y
R/r R/r r/r r/r R/r R/r r/r r/r
r;y r;y
Y/y Y/y Y/y Y/y Y/y y/y Y/y y/y
R/r R/r r/r r/r R/r R/r r/r r/r
r;y r;y
Y/y Y/y Y/y Y/y Y/y y/y Y/y y/y
2) No, it’s not necessary to write out a Punnett square in a true square 2x2 or 4x4, etc. For simplicity you can
remove the duplicate gametes, and you will still get the same ratio. It isn’t incorrect to write it out fully,
though. For the Punnett square on the right of Figure 7, you can simplify it as:

        R;Y        R;y
r;y   R/r Y/y    R/r y/y
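
The claim in answer 2, that removing duplicate gametes leaves the ratios unchanged, can be checked by building the square both ways. This sketch uses an RrYy × rryy test cross as its example; the function names and the tuple representation of a genotype are assumptions for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def gametes(genotype, unique=False):
    """List the gametes of a two-locus genotype such as ("Rr", "Yy")."""
    combos = [r + ";" + y for r, y in product(*genotype)]
    return sorted(set(combos)) if unique else combos


def offspring_ratios(parent1, parent2, unique=False):
    """Fraction of each offspring genotype in the Punnett square."""
    counts = Counter()
    for g1, g2 in product(gametes(parent1, unique), gametes(parent2, unique)):
        (r1, y1), (r2, y2) = g1.split(";"), g2.split(";")
        key = "".join(sorted(r1 + r2)) + ";" + "".join(sorted(y1 + y2))
        counts[key] += 1
    total = sum(counts.values())
    return {k: Fraction(n, total) for k, n in counts.items()}


full = offspring_ratios(("Rr", "Yy"), ("rr", "yy"))                # 4 x 4 square
short = offspring_ratios(("Rr", "Yy"), ("rr", "yy"), unique=True)  # 4 x 1 square
print(full == short)
```

The shortcut is safe here because every distinct gamete of a parent appears the same number of times in the full list; dropping duplicates therefore rescales every class equally.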
3) The “9” would increase, both “3” would decrease, and the “1” would increase.
4) Two classes, the parentals, would increase, while two classes would decrease, the recombinants.

CHAPTER 18 – ANSWERS
1) Crossovers can be observed cytologically directly under the microscope as chiasmata.
Recombination is defined genetically as the frequency calculated from the observed phenotypic
proportions in the progeny.
Crossovers lead to recombination when they are detected using genetic marker loci. Not all crossovers
result in recombination – some can’t be detected because no visible markers are recombined.
Some recombinants involve crossovers, but not all recombinants result from crossovers.
Crossovers between non-sister chromatids can result in recombination, while crossovers between sister
chromatids, which have identical alleles, will not show any recombination.
When there are two crossovers between the loci being scored for recombination, the result will appear to
be parental, not recombinant.
Recombination can occur without crossovers when marker loci are on different chromosomes, which then
assort independently.
2) The use of pure breeding lines allows the researcher to be sure that he/she is working with homozygous
(known) genotypes. If a parent is known to be homozygous, then all of its gametes will have the same
genotype. This simplifies the definition of parental genotypes and therefore the calculation of
recombination frequencies.
3) This tight linkage would suggest that individuals with the earlobe phenotype would likely carry alleles that
increased their risk of cardiovascular disease. These individuals could therefore be informed of their
increased risk and have an opportunity to seek increased monitoring and reduce other risk factors.
4)
a. It assumes that the loci are completely unlinked.
b. The expected ratio would be all parentals and no recombinants. For example, if the parental
gametes were AB and ab, then the gametes produced by the dihybrids would also be AB and ab,
and the offspring of a cross between the two dihybrids would be AABB, AaBb, and aabb in
a 1:2:1 ratio. If the parental gametes were Ab and aB, then the gametes produced by the dihybrids
would also be Ab and aB, and the offspring of a cross between the two dihybrids would be
AAbb, AaBb, and aaBB in a 1:2:1 ratio.
5)
a. Parental: CcEe and ccee; Recombinant: Ccee and ccEe.
b. Parental: Ccee and ccEe; Recombinant: CcEe and ccee.
6) a) Let WwYy be the genotype of a purple-flowered (W), green seeded (Y) dihybrid. The cross is WwYy ×
wwyy. Half of the progeny will have yellow seeds whether the loci are linked or not. You cannot tell if they
are linked or not given only this information.
b) You need to know how the seed colours are distributed among the white-flowered and purple-flowered
progeny, i.e. what the frequencies of the four classes are. This would help you to know about the linkage
between the two loci: unlinked, or what degree of linkage.
7) If the progeny of the cross aaBB x AAbb is testcrossed, and the following genotypes are observed among
the progeny of the testcross, what is the frequency of recombination between these loci?
AaBb 135    Aabb 430    aaBb 390    aabb 120
(135 + 120)/(135 + 120 + 390 + 430) = 255/1075 ≈ 24%
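
The same calculation can be written as a short function. The name recombination_frequency and the dictionary layout are illustrative assumptions; the counts are those given in the question, and the parental classes follow from the original cross aaBB x AAbb.

```python
def recombination_frequency(counts, parental):
    """Fraction of testcross progeny that are not in a parental class."""
    total = sum(counts.values())
    recombinant = sum(n for genotype, n in counts.items() if genotype not in parental)
    return recombinant / total


testcross = {"AaBb": 135, "Aabb": 430, "aaBb": 390, "aabb": 120}
# The F1 received aB and Ab, so aaBb and Aabb are the parental classes.
rf = recombination_frequency(testcross, parental={"Aabb", "aaBb"})
print(f"{rf:.1%}")  # 23.7%
```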
8) See section 3.3. Syntenic is the term for genes found on the same chromosome. Linked genes are always
found on the same chromosome, and so are always syntenic. If the genes are far enough apart
on the same chromosome, crossover events will make the two genes assort independently, so they won’t
appear linked. Therefore, in this latter situation, these genes are syntenic, but not linked.

CHAPTER 19 – ANSWERS
1) Let tt be the genotype of plants with short tassels, and rr the genotype of pathogen resistant plants. We need to
start with homozygous lines with contrasting combinations of alleles, for example:
P: RRtt (pathogen sensitive, short tassels) × rrTT (pathogen resistant, long tassels)
F1: RrTt (sensitive, long) × rrtt (resistant, short)
F2: parental Rrtt (sensitive, short) , rrTt (resistant, long)
recombinant rrtt (resistant, short) , RrTt (sensitive, long)
2) Let mm be the genotype of mutants that fail to learn, and ee the genotype of orange eyes. We need to
start with homozygous lines with contrasting combinations of alleles, for example (wt means wild-type):
P: MMEE (wt eyes, wt learning) × mmee (orange eyes, failure to learn)
F1: MmEe (wt eyes, wt learning) × mmee (orange eyes, failure to learn)
F2: parental MmEe (wt eyes, wt learning) , mmee (orange eyes, failure to learn)
recombinant Mmee (orange eyes, wt learning), mmEe (wt eyes, failure to learn)
3) Given a triple mutant aabbcc , cross this to a homozygote with contrasting genotypes, i.e. AABBCC, then
testcross the trihybrid progeny, i.e.
P: AABBCC × aabbcc
F1: AaBbCc × aabbcc
Then, in the F2 progeny, find the two rarest phenotypic classes; these should have reciprocal genotypes,
e.g. aaBbCc and Aabbcc. Find out which of the three possible orders of loci (i.e. A-B-C, B-A-C, or B-C-A)
would, following a double crossover that flanked the middle marker, produce gametes that correspond to
the two rarest phenotypic classes. For example, if the rarest phenotypic classes were produced by
genotypes aaBbCc and Aabbcc, then the trihybrid’s contribution to these genotypes was aBC and Abc.
Since the parental gametes were ABC and abc, the only gene order that is consistent with aBC and Abc
being produced by a double crossover flanking a middle marker is B-A-C (which is equivalent to C-A-B).
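
The comparison described in this answer can be sketched as a tiny helper. It assumes both gametes are written with the loci in the same positions and that the two gametes differ at exactly one locus; the function name middle_locus is hypothetical.

```python
def middle_locus(parental_gamete, double_crossover_gamete):
    """Return the locus whose allele was swapped by a double crossover.

    A double crossover exchanges only the middle marker, so the one locus
    that differs between a parental gamete and the matching double-crossover
    gamete must lie between the other two.
    """
    differing = [p.upper()
                 for p, d in zip(parental_gamete, double_crossover_gamete)
                 if p != d]
    if len(differing) != 1:
        raise ValueError("gametes must differ at exactly one locus")
    return differing[0]


print(middle_locus("ABC", "aBC"))  # A
print(middle_locus("abc", "Abc"))  # A
```

Both rare classes point to A as the middle locus, in agreement with the order B-A-C given above.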
4) Based on the information given, the recombinant genotypes with respect to these loci will be Aabb and
aaBb. The frequency of recombination between A-B is 1cM=1%, based on the information given in the
question, so each of the two recombinant genotypes should be present at a frequency of about 0.5%.
Thus, the answer is 0.5%.
5)
a. 4cM
b. Random sampling effects; the same reason that many human families do not have an equal
number of boys and girls.
6) There would be approximately 2% of each of the recombinants: (yellow, straight) and (black, curved), and
approximately 48% of each of the parentals: (yellow, curved) and (black, straight).

7)
A is the fur color locus, B is the tail length locus, and C is the behaviour locus.
For each pair of loci, P = parental and R = recombinant.

fur (A)   tail (B)   behavior (C)   Freq.   Gamete   AB   AC   BC
white     short      normal           16    aBC      R    R    P
brown     short      agitated          0    ABc      P    R    R
brown     short      normal          955    ABC      P    P    P
white     short      agitated         36    aBc      R    P    R
white     long       normal            0    abC      P    R    R
brown     long       agitated         14    Abc      R    R    P
brown     long       normal           46    AbC      R    P    R
white     long       agitated        933    abc      P    P    P

B              C         A
|--------------|---------|
     4.1cM       1.5cM

Pairwise recombination frequencies are as follows (calculations are shown below):
A - B 5.6%    A - C 1.5%    B - C 4.1%

Gamete    AB     AC     BC
aBC       16     16      0
ABc        0      0      0
ABC        0      0      0
aBc       36      0     36
abC        0      0      0
Abc       14     14      0
AbC       46      0     46
abc        0      0      0
Total    112     30     82
RF       5.6%   1.5%   4.1%
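
The tally above can be reproduced programmatically. This is a sketch under stated assumptions: the progeny counts and the parental classes (ABC and abc, the two most frequent) are taken from the table, while the function name pairwise_rf and the data layout are illustrative.

```python
from itertools import combinations

LOCI = "ABC"
progeny = {  # gamete from the trihybrid parent -> number of offspring
    "aBC": 16, "ABc": 0, "ABC": 955, "aBc": 36,
    "abC": 0, "Abc": 14, "AbC": 46, "abc": 933,
}


def pairwise_rf(progeny, parentals=("ABC", "abc")):
    """Recombination frequency for every pair of loci."""
    total = sum(progeny.values())
    result = {}
    for i, j in combinations(range(len(LOCI)), 2):
        # Allele combinations at loci i and j that match a parental gamete.
        parental_pairs = {(p[i], p[j]) for p in parentals}
        recombinants = sum(n for g, n in progeny.items()
                           if (g[i], g[j]) not in parental_pairs)
        result[LOCI[i] + "-" + LOCI[j]] = recombinants / total
    return result


rf = pairwise_rf(progeny)
# The pair with the largest frequency marks the two outer loci.
outer = max(rf, key=rf.get)
middle = (set(LOCI) - set(outer.split("-"))).pop()
for pair, value in rf.items():
    print(pair, f"{value:.1%}")
print("middle locus:", middle)
```

The largest pairwise value (A-B) identifies the outer loci, so C is placed in the middle, matching the map drawn above.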

CHAPTER 20 – ANSWERS
1) It depends on the chromosomal location of the disease locus. If the gene is autosomal, the probability is
50%. If it is sex-linked, that is on the X-chromosome, it would be 100%. If it is Y-linked, then 0%. In both
situations the probability would decrease if the penetrance was less than 100%.
2)

3)

CHAPTER 21 – ANSWERS
1)

2) Because each egg or sperm cell receives exactly one sex chromosome (even though this can be either an X
or Y, in the case of sperm), it could be argued that the sex chromosomes themselves do obey the law of
equal segregation, even though the alleles they carry may not always segregate equally. However, this
answer depends on how broadly you are willing to stretch Mendel’s First Law.

CHAPTER 22 – ANSWERS
1) Co-dominance
2) Note that a semicolon is used to separate genes on different chromosomes.
Phenotype                                 Genotype(s)
a) entirely black                         O^B / O^B ; s / s     O^B / Y ; s / s
b) entirely orange                        O^0 / O^0 ; s / s     O^0 / Y ; s / s
c) black and white                        O^B / O^B ; S / _     O^B / Y ; S / _
d) orange and white                       O^0 / O^0 ; S / _     O^0 / Y ; S / _
e) orange and black (tortoiseshell)       O^0 / O^B ; s / s
f) orange, black, and white (calico)      O^0 / O^B ; S / _
3) People with hemophilia A use injections of purified Factor VIII proteins (made through the use of
recombinant, cloned Factor VIII gene). It can be delivered on demand (to control existing bleeding) or
regularly (to limit damage to joints).

CHAPTER 23 – ANSWERS
1) The pedigree could show an AD, AR or XR mode of inheritance. It is most likely AD. It could be AR if the
mother was a carrier, and the father was a homozygote. It could be XR if the mother was a carrier, and the
father was a hemizygote. It cannot be XD, since the daughter (#2) would have necessarily inherited the
disease allele on the X chromosome she received from her father.
2) There are many possible answers. Here are some possibilities: if neither of the parents of the father were
affected (i.e. the paternal grandparents of children 1, 2, 3), then the disease could not be dominant. If
only the paternal grandfather was affected, then the disease could only be X-linked recessive if the
paternal grandmother was a heterozygote (which would be unlikely given that this is a rare disease allele).
3)
a. The mode of inheritance is most likely AD, since every affected individual has an affected parent, and the
disease is inherited even in four different matings to unrelated, unaffected individuals. It is very unlikely
that it is XD or XR, in part because an affected father had an affected son.
b. The mode of inheritance cannot be AD or XD, because affected individuals must have an affected parent
when a disease allele is dominant. Neither can it be XR, because there is an affected daughter of a
normal father. Therefore, it must be AR, and this is consistent with the pedigree.
c. The mode of inheritance cannot be AD or XD, because, again, there are affected individuals with
unaffected parents. It is not XR, because there are unaffected sons of an affected mother. It is
therefore likely AR, but note that the recessive alleles for this condition appear to be relatively common
in the population (note that two of the marriages were to unrelated, affected individuals).
d. The mode of inheritance cannot be AD or XD, because, again, there are affected individuals with
unaffected parents. It could be either XR or AR, but because all of the affected individuals are male, and
no affected males pass the disease to their sons, it is likely XR.
4) If a represents the disease allele, individuals a, d, f (who all married into this unusual family) are AA, while
b, c, e, g, h, i, j are all Aa, and k is aa.
5) There is a ½ chance that an offspring of any mating Aa x AA will be a carrier (Aa). So, there is a ½ chance
that #3 will be Aa, and likewise for #4. If #3 is a carrier, there is again a ½ chance that #5 will be a carrier,
and likewise for #6. If #5 and #6 are both Aa, then there is a ¼ chance that this monohybrid cross will
result in #7 having the genotype aa, and therefore being affected by the disease. Thus, the joint
probability is 1/2 x 1/2 x 1/2 x 1/2 x 1/4 =1/64.
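
The joint probability in answer 5 can be checked with exact fractions. This is only a sketch of the arithmetic: the variable names are illustrative, and it assumes, as the answer does, that each transmission step is independent.

```python
from fractions import Fraction

half = Fraction(1, 2)
quarter = Fraction(1, 4)

# Four independent steps, each with probability 1/2:
# #3 is a carrier, #4 is a carrier, #5 is a carrier, #6 is a carrier.
carrier_steps = [half, half, half, half]

# Probability that #7 is aa, given that #5 and #6 are both Aa.
joint = quarter
for step in carrier_steps:
    joint *= step

print(joint)  # 1/64
```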

CHAPTER 24 – ANSWERS
1)

a. As in Figure 4, homologous chromosomes pair during prophase I of meiosis. The shaded boxes
are regions of sequence similarity, for example Alu transposable elements. A crossover occurs
between two of the Alu elements on the same chromatid leading to a chromosomal inversion.

b.
A crossover occurs between Alu elements on different chromosomes leading to a chromosomal
translocation. Note that the homologous chromosomes are not shown in this figure for
simplicity.

2) Gamma rays are efficient at causing double strand DNA breaks, which are then more likely to rejoin
and produce a deletion.
3) First, obtain permission from the person (and ethical approval from the appropriate oversight board or
committee). Next, isolate some white blood cells, place the cells on a slide, denature the DNA,
hybridize with fluorescent nucleic acid probes specific for the X chromosome or the Y chromosome,
observe the results with a fluorescence microscope. If they are XXX, there should be three X signals
corresponding to the three X-chromosomes and no Y-chromosome signals. If they are XYY, there
should be one X signal and two Y signals in each cell nucleus.

CHAPTER 25 - ANSWERS
1) 2n=6x=42
2)
a) Two is the maximum number of alleles that can exist for a given gene in a 2n cell of a given diploid
individual.
b) Two is the maximum number of alleles that can exist in a 1n cell of a tetraploid individual.
c) Four is the maximum number of alleles that can exist in a 2n cell of a tetraploid individual.
d) The maximum number of alleles that can exist in a population is theoretically limited only by the
population size.
3)
a) Aneuploidy can disrupt gene balance and disrupt meiosis, whereas even-numbered polyploids (e.g.
tetraploid, hexaploid) can be stable through meiosis, and can retain normal gene balance.

b) Duplication is more likely than polyploidy to disrupt gene balance since only some genes will increase
their copy number following duplication of a chromosomal segment.
4) Maternal chromosomes are black and paternal chromosomes are grey. (Figure not reproduced.)
5) (Figure not reproduced.)
6) (Figure not reproduced.)

7) As in Figure 12, there is a nondisjunction event during gamete formation. The larger X chromosomes are
shown using open symbols and the smaller Y chromosomes are shown with shaded symbols. A second
division nondisjunction event in the male parent leads to a zygote with an XYY karyotype.

8)
a) 46, XY - zero Barr bodies,
b) 46,XX - one,
c) 47, XYY - zero,
d) 47,XXX - two,
e) 45,X - zero,
f) 47,XXY - one.
9) Having a shortage of key proteins is usually more detrimental than having an excess.

10) At the two cell stage, one of the embryo’s cells will be 45,XY,-21 while the other will be 47,XY,+21. As
embryogenesis continues most of the monosomy-21 cells will die and the embryo will ultimately be made
of mostly trisomy-21 cells. The child will be born with Down syndrome.
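
The Barr body counts listed in answer 8 all follow one rule: every X chromosome except one is inactivated, so the number of Barr bodies is the number of X chromosomes minus one (and never less than zero). A minimal sketch of that rule, where the function name and the simple "count,sex chromosomes" karyotype format are assumptions made for illustration:

```python
def barr_bodies(karyotype):
    """Return the expected number of Barr bodies for a karyotype such as '47,XXY'."""
    # Drop spaces so that '46, XY' and '46,XY' are treated alike,
    # then take the sex-chromosome field after the first comma.
    sex_chromosomes = karyotype.replace(" ", "").split(",")[1]
    x_count = sex_chromosomes.count("X")
    # All X chromosomes but one are inactivated.
    return max(x_count - 1, 0)


for k in ["46, XY", "46,XX", "47, XYY", "47,XXX", "45,X", "47,XXY"]:
    print(k, "->", barr_bodies(k))
```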

CHAPTER 26 – ANSWERS
1)
a) case 1 co-dominance
b) case 2 incomplete dominance
c) case 3 incomplete penetrance
d) case 4 pleiotropy
e) case 5 haplosufficiency
f) case 6 haploinsufficiency
g) case 7 broad (variable) expressivity
2) If 1 and 2 and 3 are all colorless, and 4 is red, what will be the phenotypes associated with the following
genotypes? All of these mutations are recessive. As always, if the genotype for a particular gene is not
listed, you can assume that alleles for that gene are wild-type.
a) red (because A and B are redundant, so products 3 and then 4 can be made)
b) red (because A and B are redundant, so products 3 and then 4 can be made)
c) white (because product 3 will accumulate and it is colorless)
d) white (because only product 1 and 2 will be present and both are colorless)
e) white (because only product 1 and 3 will be present and both are colorless)
f) white (because only product 2 and 3 will be present and both are colorless)
g) white (because only product 1 and 2 will be present and both are colorless)

h) 15 red : 1 white i) 12 red : 4 white j) 12 red :4 white

3)
a. red (because A and B are redundant, so products 3 and then 4 can be made)
b. red (because A and B are redundant, so products 3 and then 4 can be made)
c. blue (because product 3 will accumulate, and it is blue)
d. white (because only product 1 and 2 will be present and both are colorless)
e. blue (because only product 1 and 3 will be present and 1 is colorless and 3 is blue)
f. blue (because only product 2 and 3 will be present and 2 is colorless and 3 is blue)
g. white (because only product 1 and 2 will be present and both are colorless)

h) 15 red : 1 white i) 12 red : 4 blue j) 12 red : 4 blue


4)
a) red (because A and B are redundant, so products 3 and then 4 can be made)
b) red (because A and B are redundant, so products 3 and then 4 can be made)
c) blue (because product 3 will accumulate, and it is blue)
d) yellow (because only product 1 and 2 will be present and 1 is colorless and 2 is yellow)
e) blue (because only product 1 and 3 will be present and 1 is colorless and 3 is blue)
f) green? (because only product 2 and 3 will be present and 2 is yellow and 3 is blue, so probably the
fruit will be some combination of those two colors)
g) yellow (because only product 1 and 2 will be present and 1 is colorless and 2 is yellow)
h) 15 red: 1 yellow

i) 12 red: 3 blue:1 green

j) 12 red: 4 blue

5) Epistasis is demonstrated when the phenotype for a mutant in one locus is prevented from being expressed
by a mutant at another locus. In this case, we would expect a homozygous mutant at one locus (e.g. D) to
be the same phenotype as a homozygous mutant in both loci (e.g. D and A, or D and B).
So, the following situations from questions 2-4 demonstrated epistasis:
Q#2: No epistasis can be determined from the phenotypes (even though we know from the pathway provided
that D is downstream of A and B). There are only two possible phenotypes. So even though the D locus
might be epistatic to A and B, one cannot see this interaction because the product of both A and B
(compound 3) is colourless.
Q#3: The phenotypes show that D is epistatic to A and B:
- aadd looks like AAdd or Aadd; dd prevents the expression of the A or a alleles.
- bbdd looks like BBdd or Bbdd: dd prevents the expression of the B or b alleles.
Note that the triple mutant aabbdd would be colourless (white).
Q#4: The phenotypes show that D is epistatic to A, because aadd looks like AAdd or Aadd.
With bbdd, the difference between bbdd (green), Bbdd (blue), and BBdd (blue) is apparent. Thus, the
phenotypes do not provide evidence for epistasis between B and D.
6) The answer is the same for a) – d)
P could have been either: AABB x aabb or aaBB x AAbb;
F1 was : AaBb x AaBb
7) Conduct an enhancer/suppressor screen (which can also result in the identification of revertants). In the
screen, allow the plants to self-pollinate in order to make any new, recessive mutations homozygous.

8) Depending which amino acids were altered, and how they were altered, a second mutation in g*g* could
either have no effect (in which case the phenotype would be the same as gg), or it could possibly cause a
reversion of the phenotype to wild-type, so that g*g* and GG have the same phenotype.

9) Depending on the normal function of gene A, and which amino acids were altered in allele a, there are
many potential phenotypes for aagg:
Case 1: If the normal function of gene A is in an unrelated process (e.g. A is required for root development,
but not the development of leaves), then the phenotype of aagg will be: short roots and narrow leaves.
The phenotypic ratios among the progeny of a dihybrid cross will be:
9            3                  3                 1
A_G_         A_gg               aaG_              aagg
wild-type    tubular leaves,    short roots,      tubular leaves,
             normal roots       normal leaves     short roots

Case 2: If the normal function of gene A is in the same process as G, such that a is a recessive allele that
increases the severity of the gg mutant (i.e. a is an enhancer of g) then the phenotype of aagg could be :
no leaves. The phenotypic ratios among the progeny of a dihybrid cross depend on whether aa mutants
have a phenotype independent of gg, in other words, do aaG_ plants have a phenotype that is different
from wild-type or from A_gg. There is no way to know this without doing the experiment, since it depends
on the biology of the particular gene, mutation and pathway involved, so there are three possible
outcomes:
Case 2a) If aa is an enhancer of gg, and aaG_ plants have a mutant phenotype that differs from wild-type
or (A_gg) then the phenotypic ratios among the progeny of a dihybrid cross will be:
9            3                 3                               1
A_G_         A_gg              aaG_                            aagg
wild-type    tubular leaves    abnormal leaves (some           no leaves
                               phenotype that differs from
                               gg; maybe small twisted
                               leaves)
Case 2b) If aa is an enhancer of gg, and aaG_ plants have a mutant phenotype that is the same as A_gg ,
the phenotypic ratios among the progeny of a dihybrid cross will be:
9            6                 1
A_G_         A_gg, aaG_        aagg
wild-type    tubular leaves    no leaves

Case 2c) If aa is an enhancer of gg, and aaG_ do not have a phenotype that differs from wild-type then the
phenotypic ratios among the progeny of a dihybrid cross will be:
12            3                 1
A_G_, aaG_    A_gg              aagg
wild-type     tubular leaves    no leaves

Case 3: If the normal function of gene A is in the same process as G, such that a is a recessive allele that
decreases the severity of the gg mutant (i.e. a is a suppressor of g) then the phenotype of aagg could be:
wild-type. The phenotypic ratios among the progeny of a dihybrid cross depend on whether aa mutants
have a phenotype independent of gg, in other words, do aaG_ plants have a phenotype that is different
from wild-type or from A_gg. There is no way to know this without doing the experiment, since it depends
on the biology of the particular gene, mutation and pathway involved, so there are three possible
outcomes:
Case 3a) If aa is a suppressor of gg, and aaG_ plants have a mutant phenotype that differs from wild-type
or (A_gg) then the phenotypic ratios among the progeny of a dihybrid cross will be:
10            3                 3
A_G_, aagg    A_gg              aaG_
wild-type     tubular leaves    no leaves (some phenotype
                                that differs from gg)

Case 3b) If aa is a suppressor of gg, and aaG_ plants have a mutant phenotype that is the same as A_gg,
the phenotypic ratios among the progeny of a dihybrid cross will be:
10            6
A_G_, aagg    A_gg, aaG_
wild-type     tubular leaves

Case 3c) If aa is a suppressor of gg, and aaG_ plants do not have a phenotype that differs from wild-type
then the phenotypic ratios among the progeny of a dihybrid cross will be:
13                  3
A_G_, aaG_, aagg    A_gg
wild-type           tubular leaves

Case 4: If the normal function of gene A is in the same process as G, such that a is a
recessive allele with a phenotype that is epistatic to the gg mutant, then the
phenotype of both aaG_ and aagg could be: no leaves. The phenotypic ratios among
the progeny of a dihybrid cross will be:
9            4             3
A_G_         aaG_, aagg    A_gg
wild-type    no leaves     tubular leaves

Case … ?: There are many more phenotypes and ratios that could be imagined (e.g. different types of
dominance relationships, different types of epistasis, lethality…etc). Isn’t genetics wonderful? It is
sometimes shocking that more people don’t want to become geneticists.
The point of this exercise is to show that many different ratios can be generated, depending on the biology
of the genes involved. On an exam, you could be asked to calculate the ratio, given particular biological
parameters. So, this exercise is also meant to demonstrate that it is better to learn how to calculate ratios
than just trying to memorize which ratios match which parameters. In a real genetic screen, you would
observe the ratios, and then try to deduce something about the biology from those ratios.
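
To illustrate calculating a ratio rather than memorizing it, the sketch below enumerates the 16 gamete combinations of an AaGg x AaGg cross and tallies whatever phenotype a user-supplied rule assigns to each genotype class. The function name and the rule functions are hypothetical, and the two rules shown simply encode Case 3c and Case 4 as described above, assuming independent assortment.

```python
from collections import Counter
from itertools import product


def dihybrid_ratio(phenotype_of):
    """Tally phenotypes among the 16 equally likely offspring of AaGg x AaGg.

    phenotype_of(has_A, has_G) returns a phenotype label, where has_A and
    has_G say whether at least one dominant allele is present at each locus.
    """
    gametes = [a + g for a, g in product("Aa", "Gg")]  # AG, Ag, aG, ag
    counts = Counter()
    for egg, pollen in product(gametes, repeat=2):
        has_A = "A" in (egg[0], pollen[0])
        has_G = "G" in (egg[1], pollen[1])
        counts[phenotype_of(has_A, has_G)] += 1
    return dict(counts)


def case_3c(has_A, has_G):
    # aa suppresses gg, and aaG_ looks wild-type: only A_gg is mutant.
    return "tubular leaves" if (has_A and not has_G) else "wild-type"


def case_4(has_A, has_G):
    # aa is epistatic to gg: any aa plant has no leaves.
    if not has_A:
        return "no leaves"
    return "wild-type" if has_G else "tubular leaves"


print(dihybrid_ratio(case_3c))
print(dihybrid_ratio(case_4))
```

Writing a different rule function is enough to explore the other cases, which is the same reasoning one would do by hand with a Punnett square.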

10) Assuming that bb has no phenotype on its own (i.e. A_bb looks like A_B_), then aaB_ will have the mutant
phenotype, and A_bb, A_B_, and aabb will appear phenotypically wild-type. The phenotypic ratio will be
13 wild-type: 3 mutant.
11) For a dihybrid cross, there are 4 classes, 9:3:3:1. In a trihybrid cross without gene interactions, each of
these 4 classes will be further split into a 3:1 ratio based on the phenotype at the third locus. For example,
9 x 3 =27 and 9 x 1 = 9. This explains the first two terms of the complete ratio: 27:9:9:9:3:3:3:1.
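
The full trihybrid ratio in answer 11 can be generated the same way the answer reasons: each locus contributes a factor of 3 (dominant phenotype) or 1 (recessive phenotype). A small sketch, assuming three independently assorting loci with complete dominance and no gene interactions:

```python
from itertools import product

# Each locus shows either the dominant phenotype (3 of every 4 offspring)
# or the recessive phenotype (1 of every 4 offspring).
weights = []
for dominant_at in product([True, False], repeat=3):
    weight = 1
    for is_dominant in dominant_at:
        weight *= 3 if is_dominant else 1
    weights.append(weight)

ratio = sorted(weights, reverse=True)
print(":".join(str(w) for w in ratio))  # 27:9:9:9:3:3:3:1
```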

CHAPTER 27 – ANSWERS
1) Based on the information given, the recombinant genotypes with respect to these loci will be Aabb and
aaBb. The frequency of recombination between A-B is 1cM=1%, based on the information given in the
question, so each of the two recombinant genotypes should be present at a frequency of about 0.5%.
Thus, the answer is 0.5%.
2)

CHAPTER 28 – ANSWERS
1)
a) There will be a 6kb band (the insert) and a 3kb band (the plasmid vector)
b) There would be a single 9kb band.

c) There would only be a 3kb band, which represents three fragments: the plasmid, and both insert
fragments. All are the same size = 3.0 kb, so they will appear on the gel as a single band.
2) The complementary, sticky ends of the insert and plasmid vector may anneal together, but the non-
functional ligase will not be able to covalently link the insert and vector together. Thus, the annealed DNA
fragments will not be stable enough to be transformed and will be unable to replicate, so no transformants
should be expected.
3) In electrophoresis, the main force driving the movement of the molecules is the electrical force, which depends on
the amount of charge per mass. A duplex DNA molecule will have about twice as much charge per length as
single stranded RNA, but it will also have about twice as much mass. The one difference is that deoxyribonucleic
acid (DNA) has one less oxygen per base than ribonucleic acid (RNA), so DNA has about 4% less mass than
RNA per base on average. Thus for duplex DNA vs duplex RNA of the same sequence, DNA should run
slightly faster. However, single stranded RNA (and DNA) is capable of intramolecular base pairing, which
would dramatically reduce the molecule’s length and thus increase its velocity through the gel matrix. Thus,
a single stranded RNA will migrate faster than a double stranded DNA molecule (of the same length) in a
typical agarose gel. The difference in velocity would be determined primarily by the amount and type of
intramolecular base pairing.

CHAPTER 29 – ANSWERS
1) Identify the gene encoding the antigenic fragment of the virus. Clone this gene into E. coli and produce
lots of recombinant protein, purify, and use as a vaccine without the fear of infecting with a whole virus.
2) Without a selectable marker, you would have to individually test millions of bacterial colonies to find one
that contained your cloned fragment. Furthermore, you could not maintain the plasmid in the E. coli
because the retention of the plasmid is dependent upon the antibiotic resistance selectable marker.

CHAPTER 30 - ANSWERS:
1) Use each sequence in an online BLAST search (e.g. http://blast.ncbi.nlm.nih.gov/Blast.cgi). See which is a
100% match to Drosophila and which is a 100% match to mouse.
2) a) The next three nucleotides are CC(ATCG). The third position can be any base.
b) The sequence options need to be described. Met is fixed at ATG. Lysine has two alternatives (AAA and
AAG). Asparagine has two (AAT and AAC). Glutamic acid has two (GAA and GAG). Proline has four
possible codons. Thus 2 x 2 x 2 x 4 = 32 possible sequences.
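
The count of 32 can be confirmed by listing every combination of codons. This sketch uses the standard genetic code for the five amino acids named in the answer; the order of the amino acids in the peptide is an assumption (it follows the order in which the answer lists them), and it does not affect the count.

```python
from itertools import product

# Standard genetic code, DNA coding-strand codons.
codons = {
    "Met": ["ATG"],
    "Lys": ["AAA", "AAG"],
    "Asn": ["AAT", "AAC"],
    "Glu": ["GAA", "GAG"],
    "Pro": ["CCA", "CCT", "CCC", "CCG"],
}

# Assumed peptide order, taken from the order used in the answer.
peptide = ["Met", "Lys", "Asn", "Glu", "Pro"]

sequences = ["".join(choice) for choice in product(*(codons[aa] for aa in peptide))]

print(len(sequences))   # number of possible coding sequences
print(sequences[:3])    # a few examples
```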

CHAPTER 31 – ANSWERS
1) You would need to know that although HIV is an RNA-virus, you should be able to detect the DNA pro-virus
in infected white blood cells. You would have to extract DNA from white blood cells, then use HIV-specific
primers to detect if HIV pro-virus DNA could be amplified. Thus, you would need to know some of the
sequence of the HIV genome. You would probably want to compare your primer sequences to the human
genome sequence too, to make sure the primers are complementary only to HIV-DNA, but not human
DNA. You would probably want to try to amplify some known HIV-free human DNA with the primers as a
negative control, just to be sure that the primers were HIV-specific. And amplify the sequences from a
known positive sample to know you can detect the sequences (positive control).

For the PCR reaction, you would need primers (as mentioned), dNTPs, Taq polymerase, and other buffers
or salts as required by the polymerase. You would need an agarose gel, ethidium bromide, and
electrophoresis buffers to analyze the PCR products: you should detect a band in the positive control sample,
see no band in the negative control sample, and only then test your experimental sample to obtain a valid result.

2) The amplification factor is 2ⁿ, where n is the number of cycles. So after 10 cycles, starting with 10
molecules, you would have 10 × 2¹⁰ = 10,240 molecules.
3) At the end of a successful PCR amplification, the reaction will contain polymerase, remaining dNTPs, and primers, as
well as the original template and the PCR amplification products. By far the most abundant will be the
amplification products, flanked at both ends by the primer sequences. These products will be the only
thing observed on the gel, since the template and other PCR products are present in much lower
abundance, too low to be seen, as evidenced by the absence of bands in the negative control lane.
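
The doubling arithmetic in answer 2 is easy to reproduce for other starting amounts and cycle numbers. A minimal sketch, assuming ideal amplification in which every molecule is copied in every cycle (real reactions fall short of this); the function name is illustrative.

```python
def pcr_copies(starting_molecules, cycles):
    """Ideal PCR yield: the template count doubles every cycle."""
    return starting_molecules * 2 ** cycles


print(pcr_copies(10, 10))  # 10240
```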

CHAPTER 32 - ANSWERS:
1) While both bind to DNA, Giemsa is used with visible light microscopes because it absorbs green, yellow and
red light (and thus appears purple) while DAPI is used with fluorescence microscopes because it absorbs
UV light and emits blue light.
2) There would be only two chromosome 21s, there would be two X chromosomes and no Y, and one of the
chromosome 5s would be shorter.
3) Both are in vitro DNA replication reactions using template DNA, primers, nucleotides, and DNA
polymerases. The differences are: PCR uses only two sequence-specific primers while labelling reactions
use millions of non-specific primers; PCR uses only regular nucleotides while labelling reactions include a
labelled one; and PCR must use a heat-resistant DNA polymerase while labelling reactions don't require one. PCR
reactions also go through about 30 cycles of denaturing, annealing, and extending, while labelling reactions use a
single cycle.
4) When nucleotides are incorporated into the growing strand, the beta and gamma phosphates are
discarded. The only way to make 32P-labelled DNA is to use nucleotides in which the alpha-phosphate
carries the radioactive isotope.
5) The results would be similar to Figure 9. There would be 47 chromosomes glowing blue. The centromeres
of two of the chromosomes would be glowing green and the centromeres of three other chromosomes
would be glowing red.
6) These men are 47,XYY. This situation doesn't affect their health or fertility so it isn't a deleterious condition
and it doesn't have a name. It can be detected with any of the techniques discussed in this chapter:
Giemsa staining, G-banding, or FISH with hybridization probes that bind to the sex chromosomes.
7) These women are 47,XXX. This situation doesn't affect their health or fertility so it isn't a deleterious
condition and it doesn't have a name. It can be detected with any of the techniques discussed in this
chapter.
8) First, obtain permission from the person (and ethical approval from the appropriate oversight board or
committee). Next, isolate some white blood cells, place the cells on a slide, denature the DNA, hybridize
with fluorescent nucleic acid probes specific for the X chromosome or the Y chromosome, observe the
results with a fluorescence microscope. If they are XXX, there should be three X signals corresponding to
the three X-chromosomes and no Y-chromosome signals. If they are XYY, there should be one X signal and
two Y signals in each cell nucleus.

CHAPTER 33 – ANSWERS
1) If only fluorescently-labelled ddNTPs (but no regular dNTPs) were added to the reaction, the reaction
would always terminate at the first base added after the primer, and the chromatogram would be essentially
flat lines (no peaks) for all but the first position, which would show a Brobdingnagian peak.
2) You could extract raw, naked DNA from seawater in various different places in the world, then sequence
all of this DNA, and build a database with the sequences. Next, use computer comparisons to identify DNA
that did not belong to any known species. This is an example of meta-genomics, and is already being done

by some scientists. ***Remember, having the sequence is not the same as having the organism or
understanding the sequence.
3) It is a reference to the fluorescently-labelled dideoxy nucleotides that are at the heart of the procedure.
4) Before 2008 sequencing a genome meant making BAC clones and millions of sequencing reactions. Today
it is done with no cloning step and one reaction.
5) The first generation of automated sequencing machines, such as the ABI 3730, are still less expensive and
more targeted when it comes to sequencing plasmids and PCR products. Many times a researcher only
wants to know the sequence of a short stretch of DNA, not the whole genome.
6) True. Some other next-generation technologies are sequencing by ligation (used in the Applied Biosystems
machines) and ion torrent sequencing (used in Life Technologies machines).

CHAPTER 34 – ANSWERS
1)
a) Radioactively label a piece of DNA that hybridizes to the gene, outside of the part of the gene
contained in the deletion. Extract DNA from the suspect cancer cells of individuals, digest with a
restriction enzyme (the best choice would be ones that cleave just outside the gene), and separate the
DNA by electrophoresis. Southern blot the gel and probe with DNA complementary to the gene. Be
sure the probe spans the 200bp deleted region. Wash at high stringency and expose to a sheet of X-ray
film (or equivalent). Individuals heterozygous for the deletion (affected with the cancer) will have two
bands: one at the normal position and one at a lower position (200 bp lower) on the gel. Those without
the deletion will have only one band, at the normal position.
b) You would probably get hybridization to extra bands, or even just a big smear, since the probe would
hybridize non-specifically to other bands in the genome.
2)
a) Use PCR primers that flank the deletion. Extract DNA from cancer cell samples for use as a template (one
sample per reaction), and analyze the PCR products by gel electrophoresis.
Cancer cells will have two bands, one full length, one 200bp shorter. Normal cells will only have full
length products.
b) If the temperature was too low, the PCR products would probably appear as smears nearly the entire
length of each lane, since the primers would bind to the genomic templates at many different positions
and amplify fragments of many different lengths.
3)
a. Label the PCR fragment for use as a probe. Hybridize the probe to a Southern blot of dog DNA. Cut
out and clone any bands that hybridize to the probe.
Or, more recently, ignore the fragment and dog DNA sample, and take the sequence of the human
olfactory receptor gene and BLAST it against the dog genome sequence. Compare the sequence
output results to identify the dog olfactory receptor genes.
b. More? Do the test.

CHAPTER 35 – ANSWERS
1) An individual homozygous for this region would have the same DNA sequence on both homologs and thus
the same restriction site locations on both maternal and paternal chromosomes. Each restriction enzyme
digestion should produce the same set of fragments from both homologs. Different enzymes should give
different sized fragments. The probe would be expected to hybridize with the complementary sequences
in these fragments. While the three lanes (E, H, and B) could have only a single band hybridizing in each,

there is the possibility that the 1.0 kbp fragment could contain one (or more) of these restriction enzyme
sites and thus the probe would hybridize to two (or more) fragments per lane. Thus at least one fragment
is expected, but more per lane could occur.
2) If the individual was heterozygous, with some sequence variation between the two homologous
chromosomes, then one might expect more than one band per lane due to restriction site polymorphism
leading to restriction fragment length polymorphism.
3) If 100 individuals were examined in this way, polymorphisms in restriction fragment length would likely be
identified, although this is not certain. Some regions of DNA are more variable in the population than others.
4) If the probe fragment had repeated DNA sequences (e.g. Alu repeats), then very many fragments would
hybridize and the signal would not resolve into individual bands, but would instead be a smear paralleling the
distribution of digested DNA fragments in each lane.

CHAPTER 36 - ANSWERS:
1) 345 bp
2) Here are the results. S = size standard, R = results. (Figure not reproduced.)
3) No. This policy is not cost-effective, and would violate various constitutional rights. Thus, it is unlikely to
happen. Most democratic countries only store DNA profiles for people accused of a crime (US) or
convicted of a crime (Canada).
4) For CSF1PO the 7 allele is maternal and the 12 allele is paternal; for D8S1179 we can't tell which allele is
which; for D21S11 the 9 allele is paternal and the 10 allele is maternal.
                                Potential fathers
STR        Child    Mother    #1      #2       #3
CSF1PO     7/12     7/10      7/10    12/14    12/13
D8S1179    6/6      6/8       6/9     12/12    5/6
D21S11     9/10     10/11     5/5     9/16     9/9
5) No. He has all the paternal alleles present in the child. He would have to be excluded based on other
evidence. This might come from other STRs being tested, and/or he wasn’t “involved” with the mother.
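
The exclusion logic used in answers 4 and 5 can be sketched in a few lines. At each locus, an allele of the child can be paternal only if the child's other allele could have come from the mother; a man is excluded at a locus if he carries none of those candidate paternal alleles. The genotypes below are the ones in the table for answer 4, while the function and variable names are hypothetical. The sketch ignores mutation and assumes the listed mother is the biological mother.

```python
def paternal_candidates(child, mother):
    """Alleles of the child that could have come from the father."""
    a, b = child
    candidates = set()
    if b in mother:      # b can be maternal, so a can be paternal
        candidates.add(a)
    if a in mother:      # a can be maternal, so b can be paternal
        candidates.add(b)
    return candidates


# STR genotypes from the table for answer 4.
profiles = {
    "CSF1PO":  {"child": (7, 12), "mother": (7, 10),
                "#1": (7, 10), "#2": (12, 14), "#3": (12, 13)},
    "D8S1179": {"child": (6, 6), "mother": (6, 8),
                "#1": (6, 9), "#2": (12, 12), "#3": (5, 6)},
    "D21S11":  {"child": (9, 10), "mother": (10, 11),
                "#1": (5, 5), "#2": (9, 16), "#3": (9, 9)},
}


def excluding_loci(profiles, men=("#1", "#2", "#3")):
    """Map each man to the loci at which he lacks every candidate paternal allele."""
    result = {}
    for man in men:
        result[man] = [
            locus for locus, row in profiles.items()
            if not paternal_candidates(row["child"], row["mother"]) & set(row[man])
        ]
    return result


for man, loci in excluding_loci(profiles).items():
    print(man, "excluded at:", loci if loci else "no locus (cannot be excluded)")
```

With these three loci only man #3 cannot be excluded, which is why further evidence would be needed to rule him out.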
6) a)- We know they are stable within an individual because we can test DNA samples from various tissue
types and times during their lifetime and they all give the same DNA profile for all the individual’s samples.
They are invariant for one person.
b)- No, because there would not be a standard for a single person. This might show up as differences in
DNA profiles if DNA samples are obtained from different tissues or at different times over their lifetime.

There are, however, situations where a person's DNA does change. The length of telomeres in cells is
known to change during the lifetime of a person. The age of a person can be estimated from telomere length.
7)

8)

#1 B3B4E1E1
#2 B3B4E1E2
#3 B4B4E2E2
#4 B3B3E1E2
#5 B2B4E1E2
#6 B2B3E2E2

9) #3 and #6 cannot be a parent, since neither #3 nor #6 has any alleles in common with #1 at locus E.
10) a) the region of the fragment that is most likely to be polymorphic
i) TAAAGGAATCAATTACTTCTGTGTGTGTGTGTGTGTGTGTGTGTTCTTAGTTGTTTAAGTTTTAAGTTGTGA
ii) ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
iii) ATTTCCTTAGTTAATGAAGACACACACACACACACACACACACAAGAATCAACAAATTCAAAATTCAACACT
b) any simple sequence repeats

i) TAAAGGAATCAATTACTTCTGTGTGTGTGTGTGTGTGTGTGTGTTCTTAGTTGTTTAAGTTTTAAGTTGTGA
ii) ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
iii) ATTTCCTTAGTTAATGAAGACACACACACACACACACACACACAAGAATCAACAAATTCAAAATTCAACACT

c) the best target sites for PCR primers that could be used to detect polymorphisms in the length of the
simple sequence repeat region in different individuals.
i) TAAAGGAATCAATTACTTCTGTGTGTGTGTGTGTGTGTGTGTGTTCTTAGTTGTTTAAGTTTTAAGTTGTGA
ii) ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
iii) ATTTCCTTAGTTAATGAAGACACACACACACACACACACACACAAGAATCAACAAATTCAAAATTCAACACT
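
The simple sequence repeat in answer 10 can also be located programmatically. The sketch below searches the top strand given above for the first dinucleotide unit repeated at least five times in a row; the threshold of five units and the function name are arbitrary choices made for illustration. PCR primers would then be designed in the unique sequence on either side of the reported span.

```python
import re

# Top strand of the fragment shown in answer 10.
seq = "TAAAGGAATCAATTACTTCTGTGTGTGTGTGTGTGTGTGTGTGTTCTTAGTTGTTTAAGTTTTAAGTTGTGA"


def find_dinucleotide_repeat(sequence, min_units=5):
    """Return (unit, number_of_units, start, end) for the first tandem
    dinucleotide repeat with at least min_units copies, or None."""
    # A two-base unit followed by at least (min_units - 1) further copies of itself.
    pattern = re.compile(r"([ACGT]{2})\1{%d,}" % (min_units - 1))
    match = pattern.search(sequence)
    if match is None:
        return None
    unit = match.group(1)
    n_units = (match.end() - match.start()) // 2
    return unit, n_units, match.start(), match.end()


hit = find_dinucleotide_repeat(seq)
print(hit)
if hit is not None:
    unit, n_units, start, end = hit
    print("left flank :", seq[:start])
    print("repeat     :", seq[start:end])
    print("right flank:", seq[end:])
```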

CHAPTER 37: ANSWERS:
1) Most SNPs are the result of a single base pair substitution mutation that happened once during human
evolution. The chance is very small that a different mutation could happen at the same site in a different
person and also become prevalent.
2) If the conditions are not stringent enough the labelled DNA will attach to both oligos no matter what a
person's genotype is. It would appear that the person is heterozygous for every single SNP, a very unlikely
situation.
3) Yes. We would need to obtain DNA samples from many people with blue eyes and many people with
brown or other eye colours. In fact, most people with blue eyes have this phenotype because they have
mutations in two genes on chromosome 15 called OCA2 and HERC2.
4) Yes. In fact, 23andme looks for a SNP near HERC2 for this purpose. In Europeans most people with the GG
genotype of the SNP have blue eyes while people with the AG and AA genotypes have brown eyes.
5) For this disease or phenotype there are two genes responsible, one is on chromosome 1 and the other is
on chromosome 3.

CHAPTER 38 – ANSWERS
1)
a) q = √0.01 = 0.1
b) 1 − q = p; 1 − 0.1 = 0.9
c) 2pq = 2(0.1)(0.9) = 0.18
d) p² = 0.81
2) First, calculate allele frequencies:
p = [2(AA) + (Aa)] / total number of alleles scored = [2(432) + 676] / [2(432 + 676 + 92)] = 0.6417
q = [2(aa) + (Aa)] / total number of alleles scored = [2(92) + 676] / [2(432 + 676 + 92)] = 0.3583
Next, given these observed allele frequencies, calculate the genotypic frequencies that would
be expected if the population was in Hardy-Weinberg equilibrium.
p² = 0.6417² = 0.4118
2pq = 2(0.6417)(0.3583) = 0.4598
q² = 0.3583² = 0.1284
Finally, given these expected frequencies of each class, calculate the expected numbers of each in your
sample of 1200 individuals, and compare these to your actual observations.
      expected                 observed (reported in the original question)
AA    0.4118 × 1200 = 494      432
Aa    0.4598 × 1200 = 552      676
aa    0.1284 × 1200 = 154      92
The population does not appear to be at Hardy-Weinberg equilibrium, since the observed genotypic
frequencies do not match the expectations. Of course, you could do a chi-square test to determine how
significant the discrepancy is between observed and expected.
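
The chi-square test mentioned above could be carried out as in the following sketch, which recomputes the allele frequencies and expected counts from the observed genotype counts and then sums (observed − expected)²/expected over the three classes. Variable names are illustrative. It assumes one degree of freedom (three classes, minus one, minus one for the allele frequency estimated from the data), for which the 5% critical value is 3.841.

```python
observed = {"AA": 432, "Aa": 676, "aa": 92}
n = sum(observed.values())  # 1200 individuals

# Allele frequencies counted directly from the genotypes.
p = (2 * observed["AA"] + observed["Aa"]) / (2 * n)
q = (2 * observed["aa"] + observed["Aa"]) / (2 * n)

# Expected genotype counts under Hardy-Weinberg equilibrium.
expected = {"AA": p * p * n, "Aa": 2 * p * q * n, "aa": q * q * n}

chi_square = sum(
    (observed[g] - expected[g]) ** 2 / expected[g] for g in observed
)

CRITICAL_5_PERCENT_DF1 = 3.841

print("p = %.4f, q = %.4f" % (p, q))
print({g: round(count) for g, count in expected.items()})
print("chi-square = %.2f" % chi_square)
print("reject equilibrium at 5%:", chi_square > CRITICAL_5_PERCENT_DF1)
```

The statistic is far above the critical value, which agrees with the conclusion that the population does not appear to be in equilibrium.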
3) In this theoretical question, the frequency of genotype AA is set at 432/1200, and we are asked what
frequencies of the other classes would fit a Hardy-Weinberg equilibrium. Given that p² = 432/1200 = 0.36,
then p = 0.6 and q = 0.4. Given these allele frequencies and a sample size of 1200 individuals, there
should be 576 Aa individuals (2pq × 1200 = 2(0.6)(0.4) × 1200 = 576) and 192 aa individuals (q² × 1200 =
0.4² × 1200 = 192), if the population was at Hardy-Weinberg equilibrium with 432 AA individuals.
4) The actual population appears to have more heterozygotes and fewer recessive homozygotes than would
be expected for Hardy-Weinberg equilibrium. There are many possible reasons that a population may not
be in equilibrium (see Table 1). In this case, there is possibly some selection against homozygous recessive
genotypes, in favour of heterozygotes in particular. Perhaps the heterozygotes have some selective
advantage that increases their fitness.

It is also worth noting the discrepancies between the allele frequencies calculated in Q2 and Q3. In
question 2, we calculated the frequencies directly from the genotypes (this is the most accurate method,
and does not require the population to be in equilibrium). In question 3, we essentially estimated the
frequency based on one of the phenotypic classes. The discrepancy between these calculations shows the
limitations of using phenotypes to estimate allele frequencies when a population is not in equilibrium.
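
The calculations in questions 1 to 3 can be checked with a short Python sketch. This is only an illustration, not part of the original answer key: the function names are ours, and the chi-square comparison assumes 1 degree of freedom (three genotype classes, minus one, minus one allele frequency estimated from the data) with the standard 0.05 critical value of 3.841.

```python
import math

def allele_freqs_from_counts(n_AA, n_Aa, n_aa):
    """Allele frequencies counted directly from genotype counts (question 2)."""
    total_alleles = 2 * (n_AA + n_Aa + n_aa)
    p = (2 * n_AA + n_Aa) / total_alleles
    return p, 1.0 - p

def allele_freqs_from_homozygote(freq_homozygote):
    """Allele frequencies estimated from one homozygous class.

    Only valid if the population is assumed to be in Hardy-Weinberg
    equilibrium (questions 1 and 3). Returns the frequency of the allele
    carried by that class first, then the other allele.
    """
    freq = math.sqrt(freq_homozygote)
    return freq, 1.0 - freq

def expected_counts(p, q, total):
    """Expected numbers of AA, Aa and aa individuals under Hardy-Weinberg."""
    return [p * p * total, 2 * p * q * total, q * q * total]

def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Question 1: the homozygous recessive class has frequency 0.01.
q1, p1 = allele_freqs_from_homozygote(0.01)

# Question 2: frequencies counted from the observed genotypes.
observed = [432, 676, 92]          # AA, Aa, aa
total = sum(observed)
p_counts, q_counts = allele_freqs_from_counts(*observed)
exp_counts = expected_counts(p_counts, q_counts, total)
chi2 = chi_square(observed, exp_counts)
CRITICAL_0_05_DF1 = 3.841          # assumption: 1 degree of freedom, alpha = 0.05

# Question 3: frequencies estimated from the AA class alone.
p_q3, q_q3 = allele_freqs_from_homozygote(432 / total)
exp_q3 = expected_counts(p_q3, q_q3, total)

print("Q1: q =", round(q1, 4), "p =", round(p1, 4))
print("Q2: p =", round(p_counts, 4), "q =", round(q_counts, 4))
print("Q2 expected counts:", [round(e) for e in exp_counts])
print("Q2 chi-square:", round(chi2, 2), "exceeds critical value:", chi2 > CRITICAL_0_05_DF1)
print("Q3: p =", round(p_q3, 4), "q =", round(q_q3, 4))
print("Q3 expected counts:", [round(e) for e in exp_q3])
```

With the counts from question 2, the chi-square statistic comes out at roughly 60.8, far above 3.841, which agrees with the conclusion that the population is not at Hardy-Weinberg equilibrium. Comparing the two sets of allele frequencies (counted from all genotypes versus estimated from a single class) also makes the discrepancy described above concrete.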

CHAPTER 39 – ANSWERS
1) These fish would all be heterozygotes and thus have spines like the deep-water population. The presence
of the enhancer element (deep water) would be dominant to the absence of the element (shallow water).
2)
a) Yellow wings, body, and mouth parts, but normal bristles and claws.
b) All yellow, no normal colour.
c) Same as “a”.
d) The enhancer elements on the stop codon allele might cross regulate the transcription unit on the
deletion allele. That is, wild type enhancers on one allele drive a wild type transcript on the other
allele. (Note: this is the case. See Morris et al. 1999 Genetics 151: 633–651.)

CHAPTER 40 – ANSWERS
3) a) Fast and simple to grow in high density, diploid,
b)
i) zebrafish (for vertebrate eyes); flies for eyes in general
i) zebrafish
ii) Arabidopsis
iii) yeast
iv) C. elegans
v) arguably, any of the organisms, but the vertebrates would be most relevant

CHAPTER 41 – ANSWERS
1) Oncogenes usually arise from gain-of-function mutations, which tend to be dominant. Mutations in
tumour suppressors are usually loss-of-function mutations, which tend to be recessive because the
remaining wild-type allele is usually haplosufficient.
2) p53 activates DNA repair, apoptosis, and inhibitors of cell division. Different genes involved in each of
these pathways have enhancer elements to which p53 binds; therefore, they can all be activated by p53.
3) Some substances can promote cancer without causing a mutation, for example by inducing the cell cycle
or accelerating it so that there is less time to repair DNA damage. All mutagens are potentially carcinogens,
although some potential mutagens may not cause significant damage to cells in the body due to
detoxification or other reasons that limit their efficacy.
4) Was the dose fed to the rats relevant? Were similar effects seen in other organisms? Do epidemiological
studies support these conclusions? Could the results be replicated by a different research group? What
was the proposed mechanism for this increased incidence?
5) Cancer results from an accumulation of mutations that activate cell division and disable tumour
suppression. HPV infection alone does not satisfy all of these requirements. Also, not all strains of HPV are
equally carcinogenic, and the body’s defense may be able to suppress the activity of the virus.
6) Same for BRCA1 mutations.

END OF ANSWERS
