Human Molecular Genetics Fourth Edition Tom Strachan

HUMAN MOLECULAR GENETICS
4TH EDITION

Garland Science

Scanned and OCRed by RagingShrimp.


"Knowledge Should Be Free"

TOM STRACHAN AND ANDREW READ


*This scan omits the glossary, the index, and some of the end-of-chapter
reference pages, which were judged not worth the time to scan; the scan is
otherwise complete.

Contents

Chapter 1 Nucleic Acid Structure and Gene Expression 1


Chapter 2 Chromosome Structure and Function 29
Chapter 3 Genes in Pedigrees and Populations 61
Chapter 4 Cells and Cell-Cell Communication 91
Chapter 5 Principles of Development 133
Chapter 6 Amplifying DNA: Cell-based DNA Cloning and
PCR 163
Chapter 7 Nucleic Acid Hybridization: Principles and Applications 191
Chapter 8 Analyzing the Structure and Expression of Genes
and Genomes 213
Chapter 9 Organization of the Human Genome 255
Chapter 10 Model Organisms, Comparative Genomics, and
Evolution 297
Chapter 11 Human Gene Expression 345
Chapter 12 Studying Gene Function in the Post-Genome Era 381
Chapter 13 Human Genetic Variability and Its Consequences 405
Chapter 14 Genetic Mapping of Mendelian Characters 441
Chapter 15 Mapping Genes Conferring Susceptibility to Complex
Diseases 467
Chapter 16 Identifying Human Disease Genes and Susceptibility
Factors 497
Chapter 17 Cancer Genetics 537
Chapter 18 Genetic Testing of Individuals 569
Chapter 19 Pharmacogenetics, Personalized Medicine, and
Population Screening 605
Chapter 20 Genetic Manipulation of Animals for Modeling
Disease and Investigating Gene Function 639
Chapter 21 Genetic Approaches to Treating Disease 677
Glossary 719
Index 737

Detailed Contents

Chapter 1
Nucleic Acid Structure and Gene Expression 1
1.1 DNA, RNA, AND POLYPEPTIDES 2
Most genetic information flows in the sequence DNA -> RNA -> polypeptide 2
Nucleic acids and polypeptides are linear sequences of simple repeat units 3
Nucleic acids 3
Polypeptides 4
The type of chemical bonding determines stability and function 6
1.2 NUCLEIC ACID STRUCTURE AND DNA REPLICATION 7
DNA and RNA structure 7
Replication is semi-conservative and semi-discontinuous 9
DNA polymerases sometimes work in DNA repair and recombination 9
Many viruses have RNA genomes 11
1.3 RNA TRANSCRIPTION AND GENE EXPRESSION 12
Most genes are expressed to make polypeptides 14
Different sets of RNA genes are transcribed by the three eukaryotic RNA polymerases 15
1.4 RNA PROCESSING 16
RNA splicing removes unwanted sequences from the primary transcript 16
Specialized nucleotides are added to the ends of most RNA polymerase II transcripts 17
5' capping 18
3' polyadenylation 18
rRNA and tRNA transcripts undergo extensive processing 19
1.5 TRANSLATION, POST-TRANSLATIONAL PROCESSING, AND PROTEIN STRUCTURE 20
mRNA is decoded to specify polypeptides 20
The genetic code is degenerate and not quite universal 22
Post-translational processing: chemical modification of amino acids and polypeptide cleavage 23
Addition of carbohydrate groups 23
Addition of lipid groups 24
Post-translational cleavage 24
The complex relationship between amino acid sequence and protein structure 25
The α-helix 25
The β-pleated sheet 26
The β-turn 27
Higher-order structures 27
FURTHER READING 27

Chapter 2
Chromosome Structure and Function 29
2.1 PLOIDY AND THE CELL CYCLE 30
2.2 MITOSIS AND MEIOSIS 31
Mitosis is the normal form of cell division 31
Meiosis is a specialized reductive cell division that gives rise to sperm and egg cells 33
Independent assortment 34
Recombination 34
X-Y pairing 35
Mitosis and meiosis have key similarities and differences 36
2.3 STRUCTURE AND FUNCTION OF CHROMOSOMES 36
Chromosomal DNA is coiled hierarchically 36
Interphase chromatin varies in its degree of compaction 38
Each chromosome has its own territory in the interphase nucleus 38
Centromeres have a pivotal role in chromosome movement but have evolved to be very different in different organisms 39
Replication of a mammalian chromosome involves the flexible use of multiple replication origins 40
Telomeres have specialized structures to preserve the ends of linear chromosomes 41
Telomere structure, function, and evolution 41
Telomerase and the chromosome end-replication problem 42
2.4 STUDYING HUMAN CHROMOSOMES 43
Chromosome analysis is easier for mitosis than meiosis 44
Chromosomes are identified by size and staining pattern 44
Chromosome banding 44
Reporting of cytogenetic analyses 45
Molecular cytogenetics locates specific DNA sequences on chromosomes 46
Chromosome fluorescence in situ hybridization (FISH) 47
Chromosome painting and molecular karyotyping 47
Comparative genome hybridization (CGH) 48
2.5 CHROMOSOME ABNORMALITIES 50
Numerical chromosomal abnormalities involve gain or loss of complete chromosomes 50
Polyploidy 50
Aneuploidy 50
Mixoploidy 51
Clinical consequences 52
A variety of structural chromosomal abnormalities result from misrepair or recombination errors 53
Different factors contribute to the clinical consequences of structural chromosome abnormalities 55
Incorrect parental origins of chromosomes can result in aberrant development and disease 56
CONCLUSION 58
FURTHER READING 59

Chapter 3
Genes in Pedigrees and Populations 61
3.1 MONOGENIC VERSUS MULTIFACTORIAL INHERITANCE 62
3.2 MENDELIAN PEDIGREE PATTERNS 64
There are five basic Mendelian pedigree patterns 64
X-inactivation 65
Mosaicism due to X-inactivation 66
Few genes on the Y chromosome 67
Genes in the pseudoautosomal region 67
Conditions caused by mutations in the mitochondrial DNA 68
The mode of inheritance can rarely be defined unambiguously in a single pedigree 69
Getting the right ratios: the problem of bias of ascertainment 69
The relation between Mendelian characters and gene sequences 70
Locus heterogeneity 71
Clinical heterogeneity 72
3.3 COMPLICATIONS TO THE BASIC MENDELIAN PEDIGREE PATTERNS 72
A common recessive condition can mimic a dominant pedigree pattern 72
A dominant condition may fail to manifest itself 73
Age-related penetrance in late-onset diseases 73
Many conditions show variable expression 74
Anticipation 74
Imprinting 75
Male lethality may complicate X-linked pedigrees 76
Inbreeding can complicate pedigree interpretation 76
New mutations and mosaicism complicate pedigree interpretation 76
Mosaics 77
Chimeras 78
3.4 GENETICS OF MULTIFACTORIAL CHARACTERS: THE POLYGENIC THRESHOLD THEORY 79
In the early twentieth century there was controversy between proponents of Mendelian and quantitative models of inheritance 79
Polygenic theory explains how quantitative traits can be genetically determined 79
Regression to the mean 80
Hidden assumptions 81
Heritability measures the proportion of the overall variance of a character that is due to genetic differences 81
Misunderstanding heritability 82
The threshold model extended polygenic theory to cover dichotomous characters 82
Using threshold theory to understand recurrence risks 82
Counseling in non-Mendelian conditions is based on empiric risks 83
3.5 FACTORS AFFECTING GENE FREQUENCIES 84
A thought experiment: picking genes from the gene pool 84
The Hardy-Weinberg distribution relates genotype frequencies to gene frequencies 84
Using the Hardy-Weinberg relationship in genetic counseling 84
Inbreeding 85
Other causes of departures from the Hardy-Weinberg relationship 86
Gene frequencies can vary with time 87
Estimating mutation rates 88
The importance of heterozygote advantage 88
CONCLUSION 89
FURTHER READING 89

Chapter 4
Cells and Cell-Cell Communication 91
4.1 CELL STRUCTURE AND DIVERSITY 92
Prokaryotes and eukaryotes represent a fundamental division of cellular life forms 92
The extraordinary diversity of cells in the body 93
Germ cells are specialized for reproductive functions 93
Cells in an individual multicellular organism can differ in DNA content 96
4.2 CELL ADHESION AND TISSUE FORMATION 97
Cell junctions regulate the contact between cells 98
Tight junctions 98
Anchoring cell junctions 99
Communicating cell junctions 99
The extracellular matrix regulates cell behavior as well as acting as a scaffold to support tissues 99
Specialized cell types are organized into tissues 100
Epithelium 100
Connective tissue 101
Muscle tissue 101
Nervous tissue 101
4.3 PRINCIPLES OF CELL SIGNALING 102
Signaling molecules bind to specific receptors in responding cells to trigger altered cell behavior 102
Some signaling molecules bind intracellular receptors that activate target genes directly 103
Signaling through cell surface receptors often involves kinase cascades 105
Signal transduction pathways often use small intermediate intracellular signaling molecules 106
Synaptic signaling is a specialized form of cell signaling that does not require the activation of transcription factors 107
4.4 CELL PROLIFERATION, SENESCENCE, AND PROGRAMMED CELL DEATH 108
Most of the cells in mature animals are non-dividing cells, but some tissues and cells turn over rapidly 108
Mitogens promote cell proliferation by overcoming braking mechanisms that restrain cell cycle progression in G1 109
Cell proliferation limits and the concept of cell senescence 111
Large numbers of our cells are naturally programmed to die 111
The importance of programmed cell death 112
Apoptosis is performed by caspases in response to death signals or sustained cell stress 113
Extrinsic pathways: signaling through cell surface death receptors 113
Intrinsic pathways: intracellular responses to cell stress 113
4.5 STEM CELLS AND DIFFERENTIATION 114
Cell specialization involves a directed series of hierarchical decisions 114
Stem cells are rare self-renewing progenitor cells 115
Tissue stem cells allow specific adult tissues to be replenished 115
Stem cell niches 117
Stem cell renewal versus differentiation 117
Embryonic stem cells and embryonic germ cells are pluripotent 117
Origins of cultured embryonic stem cells 118
Pluripotency tests 119
Embryonic germ cells 119
4.6 IMMUNE SYSTEM CELLS: FUNCTION THROUGH DIVERSITY 119
The innate immune system provides a rapid response based on general pattern recognition of pathogens 121
The adaptive immune system mounts highly specific immune responses that are enhanced by memory cells 122
Humoral immunity depends on the activities of soluble antibodies 123
In cell-mediated immunity, T cells recognize cells containing fragments of foreign proteins 125
T-cell activation 127
The unique organization and expression of Ig and TCR genes 128
Additional recombination and mutation mechanisms contribute to receptor diversity in B cells, but not T cells 130
The monospecificity of Igs and TCRs is due to allelic exclusion and light chain exclusion 131
FURTHER READING 132

Chapter 5
Principles of Development 133
5.1 AN OVERVIEW OF DEVELOPMENT 134
Animal models of development 135
5.2 CELL SPECIALIZATION DURING DEVELOPMENT 136
Cells become specialized through an irreversible series of hierarchical decisions 136
The choice between alternative cell fates often depends on cell position 137
Sometimes cell fate can be specified by lineage rather than position 138
5.3 PATTERN FORMATION IN DEVELOPMENT 139
Emergence of the body plan is dependent on axis specification and polarization 139
Pattern formation often depends on gradients of signaling molecules 141
Homeotic mutations reveal the molecular basis of positional identity 142
5.4 MORPHOGENESIS 144
Morphogenesis can be driven by changes in cell shape and size 144
Major morphogenetic changes in the embryo result from changes in cell affinity 145
Cell proliferation and apoptosis are important morphogenetic mechanisms 145
5.5 EARLY HUMAN DEVELOPMENT: FERTILIZATION TO GASTRULATION 146
During fertilization the egg is activated to form a unique individual 146
Cleavage partitions the zygote into many smaller cells 147
Mammalian eggs are among the smallest in the animal kingdom, and cleavage in mammals is exceptional in several ways 148
Only a small percentage of the cells in the early mammalian embryo give rise to the mature organism 148
Implantation 150
Gastrulation is a dynamic process by which cells of the epiblast give rise to the three germ layers 151
5.6 NEURAL DEVELOPMENT 154
The axial mesoderm induces the overlying ectoderm to develop into the nervous system 155
Pattern formation in the neural tube involves the coordinated expression of genes along two axes 155
Neuronal differentiation involves the combinatorial activity of transcription factors 157
5.7 GERM-CELL AND SEX DETERMINATION IN MAMMALS 158
Primordial germ cells are induced in the early mammalian embryo and migrate to the developing gonads 158
Sex determination involves both intrinsic and positional information 159
5.8 CONSERVATION OF DEVELOPMENTAL PATHWAYS 160
Many human diseases are caused by the failure of normal developmental processes 160
Developmental processes are often highly conserved but some show considerable species differences 160
FURTHER READING 161

Chapter 6
Amplifying DNA: Cell-based DNA Cloning and PCR 163
Cell-based DNA cloning 165
The polymerase chain reaction (PCR) 165
6.1 PRINCIPLES OF CELL-BASED DNA CLONING 165
Manageable pieces of target DNA are joined to vector molecules by using restriction endonucleases and DNA ligase 166
Basic DNA cloning in bacterial cells uses vectors based on naturally occurring extrachromosomal replicons 168
Cloning in bacterial cells uses genetically modified plasmid or bacteriophage vectors and modified host cells 169
Transformation is the key DNA fractionation step in cell-based DNA cloning 171
Recombinant DNA can be selectively purified after screening and amplifying cell clones with desired target DNA fragments 172
6.2 LARGE INSERT CLONING AND CLONING SYSTEMS FOR PRODUCING SINGLE-STRANDED DNA 172
Early large insert cloning vectors exploited properties of bacteriophage λ 173
Large DNA fragments can be cloned in bacterial cells by using extrachromosomal low-copy-number replicons 174
Bacterial artificial chromosome (BAC) and fosmid vectors 174
Bacteriophage P1 vectors and P1 artificial chromosomes 175
Yeast artificial chromosomes (YACs) enable cloning of megabase fragments of DNA 175
Producing single-stranded DNA for use in DNA sequencing and in vitro site-specific mutagenesis 176
M13 vectors 176
Phagemid vectors 177
6.3 CLONING SYSTEMS DESIGNED FOR GENE EXPRESSION 178
Large amounts of protein can be produced by expression cloning in bacterial cells 178
In phage display, heterologous proteins are expressed on the surface of phage particles 179
Eukaryotic gene expression is performed with greater fidelity in eukaryotic cell lines 180
Transient expression in insect cells by using baculovirus 181
Transient expression in mammalian cells 181
Stable expression in mammalian cells 181
6.4 CLONING DNA IN VITRO: THE POLYMERASE CHAIN REACTION 182
PCR can be used to amplify a rare target DNA selectively from within a complex DNA population 183
The cyclical nature of PCR leads to exponential amplification of the target DNA 183
Selective amplification of target sequences depends on highly specific binding of primer sequences 184
PCR is disadvantaged as a DNA cloning method by short lengths and comparatively low yields of product 185
A wide variety of PCR approaches have been developed for specific applications 186
Allele-specific PCR 187
Multiple target amplification and whole genome PCR methods 187
PCR mutagenesis 188
Real-time PCR (qPCR) 189
FURTHER READING 189

Chapter 7
Nucleic Acid Hybridization: Principles and Applications 191
7.1 PRINCIPLES OF NUCLEIC ACID HYBRIDIZATION 192
In nucleic acid hybridization a known nucleic acid population interrogates an imperfectly understood nucleic acid population 192
Probe-target heteroduplexes are easier to identify after capture on a solid support 193
Denaturation and annealing are affected by temperature, chemical environment, and the extent of hydrogen bonding 194
Stringent hybridization conditions increase the specificity of duplex formation 195
The kinetics of DNA reassociation is also dependent on the concentration of DNA 196
7.2 LABELING OF NUCLEIC ACIDS AND OLIGONUCLEOTIDES 197
Different classes of hybridization probe can be prepared from DNA, RNA, and oligonucleotide substrates 197
Long nucleic acid probes are usually labeled by incorporating labeled nucleotides during strand synthesis 197
Labeling DNA by nick translation 198
Random primed DNA labeling 198
PCR-based strand synthesis labeling 198
RNA labeling 198
Radioisotopes can be used to label nucleic acids but are short-lived and can be hazardous 199
Fluorophores are commonly used in nonisotopic labeling of nucleic acids 200
7.3 HYBRIDIZATION TO IMMOBILIZED TARGET NUCLEIC ACIDS 203
Dot-blot hybridization offers rapid screening and often employs allele-specific oligonucleotide probes 203
Southern and northern blot hybridizations detect size-fractionated DNA and RNA 204
Southern blot hybridization 204
Northern blot hybridization 204
In an in situ hybridization test, sample DNA or RNA is immobilized within fixed chromosome or cell preparations 205
Chromosome in situ hybridization 205
Tissue in situ hybridization 206
Hybridization can be used to screen bacterial colonies containing recombinant DNA 206
7.4 MICROARRAY-BASED HYBRIDIZATION ASSAYS 207
Microarray hybridization allows highly parallel hybridization assays using thousands of different immobilized probes 207
High-density oligonucleotide microarrays offer enormously powerful tools for analyzing complex RNA and DNA samples 208
Affymetrix oligonucleotide microarrays 209
Illumina oligonucleotide microarrays 209
Microarray hybridization is used mostly in transcript profiling or assaying DNA variation 209
FURTHER READING 211

Chapter 8
Analyzing the Structure and Expression of Genes and Genomes 213
8.1 DNA LIBRARIES 214
Genomic DNA libraries comprise fragmented copies of all the different DNA molecules in a cell 214
cDNA libraries comprise DNA copies of the different RNA molecules in a cell 214
To be useful, DNA libraries need to be conveniently screened and disseminated 215
Library screening 215
Library amplification and dissemination 216
8.2 SEQUENCING DNA 217
Dideoxy DNA sequencing involves enzymatic DNA synthesis using base-specific chain terminators 217
Automation of dideoxy DNA sequencing increased its efficiency 218
Iterative pyrosequencing records DNA sequences while DNA molecules are being synthesized 219
Massively parallel DNA sequencing enables the simultaneous sequencing of huge numbers of different DNA fragments 220
Massively parallel sequencing of amplified DNA 220
Single-molecule sequencing 220
Microarray-based DNA capture methods allow efficient resequencing 223
8.3 GENOME STRUCTURE ANALYSIS AND GENOME PROJECTS 224
Framework maps are needed for first time sequencing of complex genomes 225
The linear order of genomic DNA clones in a contig matches their original subchromosomal locations 226
The Human Genome Project was an international endeavor and biology's first Big Project 228
The first human genetic maps were of low resolution and were constructed with mostly anonymous DNA markers 228
Physical maps of the human genome progressed from marker maps to clone contig maps 230
The final sequencing phase of the Human Genome Project was a race to an early finish 231
Genome projects have also been conducted for a variety of model organisms 234
Powerful genome databases and browsers help to store and analyze genome data 235
Different computer programs are designed to predict and annotate genes within genome sequences 235
Obtaining accurate estimates for the number of human genes is surprisingly difficult 238
8.4 BASIC GENE EXPRESSION ANALYSES 239
Principles of expression screening 239
Hybridization-based methods allow semi-quantitative and high-resolution screening of transcripts of individual genes 240
Hybridization-based methods for assaying transcript size and abundance 241
Tissue in situ hybridization 241
Quantitative PCR methods are widely used for expression screening 241
Specific antibodies can be used to track proteins expressed by individual genes 242
Protein expression in cultured cells is often analyzed by using different types of fluorescence microscopy 244
8.5 HIGHLY PARALLEL ANALYSES OF GENE EXPRESSION 245
DNA and oligonucleotide microarrays permit rapid global transcript profiling 245
Modern global gene expression profiling increasingly uses sequencing to quantitate transcripts 248
Global protein expression is often profiled with two-dimensional gel electrophoresis and mass spectrometry 249
Two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) 249
Mass spectrometry 250
Comparative protein expression analyses have many applications 252
FURTHER READING 253

Chapter 9
Organization of the Human Genome 255
9.1 GENERAL ORGANIZATION OF THE HUMAN GENOME 257
The mitochondrial genome is densely packed with genetic information 257
Replication of mitochondrial DNA 257
Mitochondrial genes and their transcription 257
The mitochondrial genetic code 258
The human nuclear genome consists of 24 widely different chromosomal DNA molecules 260
The human genome contains at least 26,000 genes, but the exact gene number is difficult to determine 261
Human genes are unevenly distributed between and within chromosomes 263
Duplication of DNA segments has resulted in copy-number variation and gene families 263
Gene duplication mechanisms 263
9.2 PROTEIN-CODING GENES 265
Human protein-coding genes show enormous variation in size and internal organization 265
Size variation 265
Repetitive sequences within coding DNA 266
Different proteins can be specified by overlapping transcription units 266
Overlapping genes and genes-within-genes 266
Genes divergently transcribed or co-transcribed from a common promoter 266
Human protein-coding genes often belong to families of genes that may be clustered or dispersed on multiple chromosomes 268
Different classes of gene family can be recognized according to the extent of sequence and structural similarity of the protein products 269
Gene duplication events that give rise to multigene families also create pseudogenes and gene fragments 271
9.3 RNA GENES 274
More than 1000 human genes encode an rRNA or tRNA, mostly within large gene clusters 277
Ribosomal RNA genes 277
Transfer RNA genes 279
Dispersed gene families make various small nuclear RNAs that facilitate general gene expression 280
Spliceosomal small nuclear RNA (snRNA) genes 281
Non-spliceosomal small nuclear RNA genes 282
Small nucleolar RNA (snoRNA) genes 282
Small Cajal body RNA genes 283
Close to 1000 different human microRNAs regulate complex sets of target genes by base pairing to the RNA transcripts 283
Many thousands of different piRNAs and endogenous siRNAs suppress transposition and regulate gene expression 285
Piwi-protein-interacting RNA 285
Endogenous siRNAs 286
More than 3000 human genes synthesize a wide variety of medium-sized to large regulatory RNAs 287
9.4 HIGHLY REPETITIVE DNA: HETEROCHROMATIN AND TRANSPOSON REPEATS 289
Constitutive heterochromatin is largely defined by long arrays of high-copy-number tandem DNA repeats 289
Transposon-derived repeats make up more than 40% of the human genome and arose mostly through RNA intermediates 290
Human LTR transposons 291
Human DNA transposon fossils 291
A few human LINE-1 elements are active transposons and enable the transposition of other types of DNA sequence 291
Alu repeats are the most numerous human DNA elements and originated as copies of 7SL RNA 292
CONCLUSION 293
FURTHER READING 294

Chapter 10
Model Organisms, Comparative Genomics, and Evolution 297
10.1 MODEL ORGANISMS 298
Unicellular model organisms aid understanding of basic cell biology and microbial pathogens 298
Some invertebrate models offer cheap high-throughput genetic screening, and can sometimes model disease 301
Various fish, frog, and bird models offer accessible routes to the study of vertebrate development 301
Mammalian models are disadvantaged by practical limitations and ethical concerns 303
Humans are the ultimate model organism and will probably be a principal one some time soon 305
10.2 COMPARATIVE GENOMICS 306
Evolutionary constraint and the preservation of function by purifying selection 306
Rapidly evolving sequences and positive selection 308
A variety of computer programs allow automated genome sequence alignments 308
Comparative genomics helps validate predicted genes and identifies novel genes 309
Comparative genomics reveals a surprisingly large amount of functional noncoding DNA in mammals 310
Comparative genomics has been particularly important in identifying regulatory sequences 312
10.3 GENE AND GENOME EVOLUTION 313
Gene complexity is increased by exon duplication and exon shuffling 314
Exon duplication 314
Exon shuffling 315
Gene duplication can permit increased gene dosage but its major value is to permit functional complexity 315
The globin superfamily illustrates divergence in gene regulation and function after gene duplication 317
Two or three major whole genome duplication events have occurred in vertebrate lineages since the split from tunicates 318
Major chromosome rearrangements have occurred during mammalian genome evolution 319
In heteromorphic sex chromosomes, the smaller chromosome is limited to one sex and is mostly non-recombining with few genes 320
The pseudoautosomal regions have changed rapidly during evolution 322
Human sex chromosomes evolved after a sex-determining locus developed on one autosome, causing it to diverge from its homolog 323
Abundant testis-expressed genes on the Y chromosome are mostly maintained by intrachromosomal gene conversion 324
X-chromosome inactivation developed in response to gene depletion from the Y chromosome 326
10.4 OUR PLACE IN THE TREE OF LIFE 326
Molecular phylogenetics uses sequence alignments to construct evolutionary trees 326
Evolutionary trees can be constructed in different ways, and their reliability is tested by statistical methods 328
The G-value paradox: organism complexity is not simply related to the number of (protein-coding) genes 329
Striking lineage-specific gene family expansion often involves environmental genes 332
Regulatory DNA sequences and other noncoding DNA have significantly expanded in complex metazoans 333
Mutations in cis-regulatory sequences lead to gene expression differences that underlie morphological divergence 333
Lineage-specific exons and cis-regulatory elements can originate from transposable elements 335
Gene family expansion and gene loss/inactivation have occurred recently in human lineages, but human-specific genes are very rare 337
Comparative genomic and phenotype-led studies seek to identify DNA sequences important in defining humans 339
CONCLUSION 341
FURTHER READING 342

Chapter 11
Human Gene Expression 345
11.1 PROMOTERS AND THE PRIMARY TRANSCRIPT 347
Transcription by RNA polymerase II is a multi-step process 347
Defining the core promoter and transcription start site 347
Assembling the basal transcription apparatus 348
Elongation of the transcript 349
Termination of transcription 349
Many other proteins modulate the activity of the basal transcription apparatus 349
Sequence-specific DNA-binding proteins can bind close to a promoter or at more remote locations 350
Co-activators and co-repressors influence promoters without binding to DNA 352
11.2 CHROMATIN CONFORMATION: DNA METHYLATION AND THE HISTONE CODE 353
Modifications of histones in nucleosomes may comprise a histone code 353
Open and closed chromatin 354
ATP-dependent chromatin remodeling complexes 356
DNA methylation is an important control in gene expression 357
Methyl-CpG-binding proteins 358
DNA methylation in development 359
Chromatin states are maintained by several interacting mechanisms 360
The role of HP1 protein 361
A role for small RNA molecules 361
A role for nuclear localization 361
No single prime cause? 362
The ENCODE project seeks to give a comprehensive overview of transcription and its control 362
Transcription is far more extensive than previously imagined 362
Predicting transcription start sites 364
11.3 EPIGENETIC MEMORY AND IMPRINTING 365
Epigenetic memory depends on DNA methylation, and possibly on the polycomb and trithorax groups of proteins 365
X-inactivation: an epigenetic change that is heritable from cell to daughter cell, but not from parent to child
Initiating X-inactivation: the role of XIST 366
Escaping X-inactivation 367
At imprinted loci, expression depends on the parental origin 367
Prader-Willi and Angelman syndromes are classic examples of imprinting in humans 368
Two questions arise about imprinting: how is it done and why is it done? 370
Paramutations are a type of transgenerational epigenetic change 370
Some genes are expressed from only one allele but independently of parental origin 371
11.4 ONE GENE - MORE THAN ONE PROTEIN 372
Many genes have more than one promoter 372
Alternative splicing allows one primary transcript to encode multiple protein isoforms 373
RNA editing can change the sequence of the mRNA after transcription 374
11.5 CONTROL OF GENE EXPRESSION AT THE LEVEL OF TRANSLATION 375
Further controls govern when and where a mRNA is translated 375

The discovery of many small RNAs that regulate The protein interactome provides an important
gene expression caused a paradigm shift in gateway to systems biology 399
cell biology 376 Defining nucleic acid-protein interactions is
MicroRNAs as regulators of translation 376 critical to understanding how genes function 401
MicroRNAs and cancer 378 Mapping protein-DNA interactions in vitro 401
Some unresolved questions 378 Mapping protein-DNA interactions in vivo 402
CONCLUSION 378 CONCLUSION 403
FURTHER READING 379 FURTHER READING 404

Chapter 12 Chapter 13
Studying Gene Function in the Human Genetic Variability and Its
Post-Genome Era 381 Consequences 405
12.1 STUDYING GENE FUNCTION: AN OVERVIEW 382 13.1 TYPES OF VARIATION BETWEEN HUMAN
Gene function can be studied at a variety of GENOMES 406
different levels 383 Single nucleotide polymorphisms are numerically
Gene expression studies 383 the most abundant type of genetic variant 406
Gene inactivation and inhibition of gene Both interspersed and tandem repeated sequences
expression 383 can show polymorphic variation 408
Defining molecular partners for gene Short tandem repeat polymorphisms: the
products 383 workhorses of family and forensic studies 408
Genomewide analyses aim to integrate analyses Large-scale variations in copy number are
of gene function 384 surprisingly frequent in human genomes 409
12.2 BIOINFORMATIC APPROACHES TO 13.2 DNA DAMAGE AND REPAIR MECHANISMS 411
STUDYING GENE FUNCTION 384 DNA in cells requires constant maintenance
Sequence homology searches can provide valuable to repair damage and correct errors 411
clues to gene function 385 The effects of DNA damage 413
Database searching is often performed with a DNA replication, transcription, recombination,
model of an evolutionarily conserved sequence 386 and repair use multi protein complexes that
Comparison with documented protein domains share components 415
and motifs can provide additional clues to Defects in DNA repair are the cause of many human
gene function 388 diseases 415
Complementation groups 416
12.3 STUDYING GENE FUNCTION BY SELECTIVE
GENE INACTIVATION OR MODIFICATION 389 13.3 PATHOGENIC DNA VARIANTS 416
Clues to gene function can be inferred from Deciding whether a DNA sequence change is
different types of genetic manipulation 389 pathogenic can be difficult 416
RNA interference is the primary method for evaluating Single nucleotide and other small-scale changes
gene function in cultured mammalian cells 390 are a common type of pathogenic change 417
Global RNAi screens provide a systems-level Missense mutations 417
approach to studying gene function in cells 392 Nonsense mutations 418
Inactivation of genes in the germ line provides the Changes that affect splicing of the primary
most detailed information on gene function 392 transcript 419
Frameshifts 420
12.4 PROTEOMICS, PROTEIN-PROTEIN
Changes that affect the level of gene expression 421
INTERACTIONS, AND PROTEIN-DNA
Pathogenic synonymous (silent) changes 422
INTERACTIONS 393
Variations at short tandem repeats are occasionally
Proteomics is largely concerned with identifying
pathogenic 422
and characterizing proteins at the biochemical
Dynamic mutations: a special class of
and functional levels 393
pathogenic microsatellite variants 423
Large-scale protein-protein interaction studies
Variants that affect dosage of one or more genes
seek to define functional protein networks 395
may be pathogenic 425
Yeast two-hybrid screening relies on reconstituting
a functional transcription factor 396 13.4 MOLECULAR PATHOLOGY: UNDERSTANDING
Affinity purification-mass spectrometry is widely THE EFFECT OF VARIANTS 426
used to screen for protein partners of a test The biggest distinction in molecular pathology is
protein 397 between loss-of-function and gain-of-function
Suggested protein-protein interactions are often changes 428
validated by co-immunoprecipitation or Allelic heterogeneity is a common feature of
pull-down assays 398 loss-of-function phenotypes 429

Loss-of-function mutations produce dominant Lod scores of +3 and -2 are the criteria for linkage
phenotypes when there is haploinsufficiency 429 and exclusion (for a single test) 453
Dominant-negative effects occur when a mutated For whole genome searches a genomewide threshold
gene product interferes with the function of the of significance must be used 455
normal product 431 14.4 MULTIPOINT MAPPING 455
Gain-of-function mutations often affect the way in The CEPH families were used to construct marker
which a gene or its product reacts to regulatory framework maps 455
signals 432 Multipoint mapping can locate a disease locus on a
Diseases caused by gain of function of G-protein- framework of markers 456
coupled hormone receptors 433
14.5 FINE-MAPPING USING EXTENDED PEDIGREES
Allelic homogeneity is not always due to a gain of
function 433
AND ANCESTRAL HAPLOTYPES 457
Autozygosity mapping can map recessive conditions
Loss-of-function and gain-of-function mutations in
efficiently in extended inbred families 457
the same gene will cause different phenotypes 433
Identifying shared ancestral chromosome segments
13.5 THE QUEST FOR GENOTYPE-PHENOTYPE allows high-resolution genetic mapping 459
CORRELATIONS 435
14.6 DIFFICULTIES WITH STANDARD LOD SCORE
The phenotypic effect of loss-of-function mutations
ANALYSIS 460
depends on the residual level of gene function 435
Errors in genotyping and misdiagnoses can generate
Genotype-phenotype correlations are especially
spurious recombinants 461
poor for conditions caused by mitochondrial
Computational difficulties limit the pedigrees that
mutations 436
can be analyzed 462
Variability within families is evidence of modifier
Locus heterogeneity is always a pitfall in human
genes or chance effects 437
gene mapping 462
CONCLUSION 436 Pedigree-based mapping has limited resolution 462
FURTHER READING 438 Characters with non-Mendelian inheritance cannot
be mapped by the methods described in this
Chapter 14 chapter 463
Genetic Mapping of Mendelian CONCLUSION 46.~
Characters 441 FURTHER READING 464
14.1 THE ROLE OF RECOMBINATION IN GENETIC
MAPPING 442
Chapter 15
Recombinants are identified by genotyping parents Mapping Genes Conferring Susceptibility to
and offspring for pairs of loci 442 Complex Diseases 467
The recombination fraction is a measure of the 15.1 FAMILY STUDIES OF COMPLEX DISEASES 468
genetic distance between two loci 442 The risk ratio (λ) is a measure of familial clustering 468
Recombination fractions do not exceed 0.5, however Shared family environment is an alternative
great the distance between two loci 445 explanation for familial clustering 469
Mapping functions define the relationship between Twin studies suffer from many limitations 469
recombination fraction and genetic distance 445 Separated monozygotic twins 470
Chiasma counts give an estimate of the total map Adoption studies are the gold standard for
length 446 disentangling genetic and environmental factors 471
Recombination events are distributed non-randomly 15.2 SEGREGATION ANALYSIS 471
along chromosomes, and so genetic map distances Complex segregation analysis estimates the most
may not correspond to physical distances 446 likely mix of genetic factors in pooled family data 471
14.2 MAPPING A DISEASE LOCUS 448 15.3 LINKAGE ANALYSIS OF COMPLEX
Mapping human disease genes depends on genetic CHARACTERS 473
markers 448 Standard lod score analysis is usually inappropriate
For linkage analysis we need informative meioses 449 for non-Mendelian characters 473
Suitable markers need to be spread throughout the Near-Mendelian families 473
genome 449 Non-parametric linkage analysis does not require a
Linkage analysis normally uses either fluorescently genetic model 474
labeled microsatellites or SNPs as markers 450 Identity by descent versus identity by state 474
14.3 TWO-POINT MAPPING 451 Affected sib pair analysis 474
Scoring recombinants in human pedigrees is not Linkage analysis of complex diseases has several
always simple 451 weaknesses 475
Computerized lod score analysis is the best way to Significance thresholds 475
analyze complex pedigrees for linkage between Striking lucky 476
Mendelian characters 452 An example: linkage analysis in schizophrenia 476

15.4 ASSOCIATION STUDIES AND LINKAGE Mouse models have a special role in identifying
DISEQUILIBRIUM 477 human disease genes 503
Associations have many possible causes 477 16.2 THE VALUE OF PATIENTS WITH
Association is quite distinct from linkage, except CHROMOSOMAL ABNORMALITIES 504
where the family and the population merge 479 Patients with a balanced chromosomal abnormality
Association studies depend on linkage and an unexplained phenotype provide valuable
disequilibrium 479 clues for research 504
The size of shared ancestral chromosome X-autosome translocations are a special case 505
segments 480 Rearrangements that appear balanced under the
Studying linkage disequilibrium 481 microscope are not always balanced at the
The HapMap project is the definitive study of linkage molecular level 506
disequilibrium across the human genome 481 Comparative genomic hybridization allows a
The use of tag-SNPs 484 systematic search for microdeletions and
15.5 ASSOCIATION STUDIES IN PRACTICE 484 microduplications 507
Early studies suffered from several systematic Long-range effects are a pitfall in disease gene
weaknesses 485 identification 508
The transmission disequilibrium test avoids the 16.3 POSITION-INDEPENDENT ROUTES TO
problem of matching controls 485 IDENTIFYING DISEASE GENES 509
Association can be more powerful than linkage studies A disease gene may be identified through knowing
for detecting weak susceptibility alleles 486 the protein product 509
Case-control designs are a feasible alternative to A disease gene may be identified through the function
the TDT for association studies 487 or interactions of its product 509
Special populations can offer advantages in A disease gene may be identified through an animal
association studies 488 model, even without positional information 511
A new generation of genomewide association (GWA) A disease gene may be identified by using
studies has finally broken the logjam in complex characteristics of the DNA sequence 511
disease research 488
The size of the relative risk 490 16.4 TESTING POSITIONAL CANDIDATE GENES 512
For Mendelian conditions, a candidate gene is
15.6 THE LIMITATIONS OF ASSOCIATION normally screened for mutations in a panel of
STUDIES 491 unrelated affected patients 512
The common disease-common variant hypothesis Epigenetic changes might cause a disease without
proposes that susceptibility factors have ancient changing the DNA sequence 513
origins 491 The gene underlying a disease may not be an
The mutation-selection hypothesis suggests that obvious one 514
a heterogeneous collection of recent mutations Locus heterogeneity is the rule rather than the
accounts for most disease susceptibility 492 exception 514
A complete account of genetic susceptibility will Further studies are often necessary to confirm that
require contributions from both the common the correct gene has been identified 515
disease-common variant and mutation-selection
hypotheses 493 16.5 IDENTIFYING CAUSAL VARIANTS FROM
CONCLUSION 494 ASSOCIATION STUDIES 515
Identifying causal variants is not simple 516
FURTHER READING 495 Causal variants are identified through a combination
of statistical and functional studies 516
Chapter 16 Functional analysis of SNPs in sequences with no
Identifying Human Disease Genes and known function is particularly difficult 518
Susceptibility Factors 497 Calpain-10 and type 2 diabetes 518
Chromosome 8q24 and susceptibility to
16.1 POSITIONAL CLONING 498
prostate cancer 518
Positional cloning identifies a disease gene from
its approximate chromosomal location 499 16.6 EIGHT EXAMPLES OF DISEASE GENE
The first step in positional cloning is to define the IDENTIFICATION 519
candidate region as tightly as possible 500 Case study 1: Duchenne muscular dystrophy 519
The second step is to establish a list of genes in the Case study 2: cystic fibrosis 520
candidate region 500 Case study 3: branchio-oto-renal syndrome 522
The third step is to prioritize genes from the Case study 4: multiple sulfatase deficiency 522
candidate region for mutation testing 501 Case study 5: persistence of intestinal lactase 523
Appropriate expression 501 Case study 6: CHARGE syndrome 525
Appropriate function 502 Case study 7: breast cancer 526
Homologies and functional relationships 502 Case study 8: Crohn disease 528

16.7 HOW WELL HAS DISEASE GENE ATM: the initial detector of damage 555
IDENTIFICATION WORKED? 531 Nibrin and the MRN complex 555
Most variants that cause Mendelian disease have CHEK2: a mediator kinase 555
been identified 531 The role of BRCA1/2 556
Genomewide association studies have been very p53 to the rescue 556
successful, but identifying the true functional Defects in the repair machinery underlie a variety
variants remains difficult 532 of cancer-prone genetic disorders 556
Clinically useful findings have been achieved in a Microsatellite instability was discovered through
few complex diseases 532 research on familial colon cancer 557
Alzheimer disease 532 17.6 GENOMEWIDE VIEWS OF CANCER 559
Age-related macular degeneration (ARMD) 533 Cytogenetic and microarray analyses give
Eczema (atopic dermatitis) 533 genomewide views of structural changes 559
The problem of hidden heritability 534 New sequencing technologies allow genomewide
CONCLUSION 534 surveys of sequence changes 559
FURTHER READING 535 Further techniques provide a genomewide view of
epigenetic changes in tumors 560
Genomewide views of gene expression are used to
Chapter 17 generate expression signatures 560
Cancer Genetics 537 17.7 UNRAVELING THE MULTI-STAGE
17.1 THE EVOLUTION OF CANCER 539 EVOLUTION OF A TUMOR 561
17.2 ONCOGENES 540 The microevolution of colorectal cancer has been
Oncogenes function in growth signaling pathways 541 particularly well documented 561
Oncogene activation involves a gain of function 542 17.8 INTEGRATING THE DATA: CANCER AS CELL
Activation by amplification 542 BIOLOGY 564
Activation by point mutation 543 Tumorigenesis should be considered in terms of
Activation by a translocation that creates a pathways, not individual genes 564
novel chimeric gene 543 Malignant tumors must be capable of stimulating
Activation by translocation into a angiogenesis and metastasizing 565
transcriptionally active chromatin region 544 Systems biology may eventually allow a unified
Activation of oncogenes is only oncogenic under overview of tumor development 566
certain circumstances 546 CONCLUSION 566
17.3 TUMOR SUPPRESSOR GENES 546 FURTHER READING 567
Retinoblastoma provided a paradigm for
understanding tumor suppressor genes 546
Some tumor suppressor genes show variations on Chapter 18
the two-hit paradigm 547 Genetic Testing of Individuals 569
Loss of heterozygosity has been widely used as a
18.1 WHAT TO TEST AND WHY 571
marker to locate tumor suppressor genes 548
Tumor suppressor genes are often silenced Many different types of sample can be used for
genetic testing 571
epigenetically by methylation 549
RNA or DNA? 572
17.4 CELL CYCLE DYSREGULATION IN CANCER 550 Functional assays 572
Three key tumor suppressor genes control events
18.2 SCANNING A GENE FOR MUTATIONS 572
in G1 phase 551
A gene is normally scanned for mutations by
pRb: a key regulator of progression through
sequencing 573
G1 phase 551
A variety of techniques have been used to scan a
p53: the guardian of the genome 551
gene rapidly for possible mutations 574
CDKN2A: one gene that encodes two key
Scanning methods based on detecting
regulatory proteins 552
mismatches or heteroduplexes 574
17.5 INSTABILITY OF THE GENOME 553 Scanning methods based on single-strand
Various methods are used to survey cancer cells for conformation analysis 576
chromosomal changes 553 Scanning methods based on translation: the
Three main mechanisms are responsible for the protein truncation test 576
chromosome instability and abnormal Microarrays allow a gene to be scanned for almost
karyotypes 553 any mutation in a single operation 576
Telomeres are essential for chromosomal stability 554 DNA methylation patterns can be detected by a
DNA damage sends a signal to p53, which initiates variety of methods 577
protective responses 555 Unclassified variants are a major problem 578

18.3 TESTING FOR A SPECIFIED SEQUENCE Chapter 19


CHANGE 579 Pharmacogenetics, Personalized Medicine,
Testing for the presence or absence of a restriction and Population Screening 605
site 579
Allele-specific oligonucleotide hybridization 580 19.1 EVALUATION OF CLINICAL TESTS 606
Allele-specific PCR amplification 581 The analytical validity of a test is a measure of its
The oligonucleotide ligation assay 581 accuracy 606
Minisequencing by primer extension 581 The clinical validity of a test is measured by how
Pyrosequencing 583 well it predicts a clinical condition 607
Genotyping by mass spectrometry 583 Tests must also be evaluated for their clinical
Array-based massively parallel SNP genotyping 583 utility and ethical acceptability 608
19.2 PHARMACOGENETICS AND
18.4 SOME SPECIAL TESTS 585 PHARMACOGENOMICS 609
Testing for whole-exon deletions and duplications Many genetic differences affect the metabolism
requires special techniques 585 of drugs 610
The multiplex ligation-dependent probe The P450 cytochromes are responsible for much
amplification (MLPA) test 585 of the phase 1 metabolism of drugs 610
Dystrophin gene deletions in males 586 CYP2D6 611
Apparent non-maternity in a family in which a Other P450 enzymes 612
deletion is segregating 587 Another phase 1 enzyme variant causes a problem
A quantitative PCR assay is used in prenatal testing in surgery 613
for fetal chromosomal aneuploidy 587 Phase 2 conjugation reactions produce excretable
Some triplet repeat diseases require special tests 588 water-soluble derivatives of a drug 613
The mutation screen for some diseases must take Fast and slow acetylators 613
account of geographical variation 589 UGT1A1 glucuronosyltransferase 614
Testing for diseases with extensive locus Glutathione S-transferase GSTM1, GSTT1 614
heterogeneity is a challenge 590 Thiopurine methyltransferase 614
18.5 GENE TRACKING 592 Genetic variation in its target can influence the
pharmacodynamics of a drug 614
Gene tracking involves three logical steps 592
Variants in beta-adrenergic receptors 614
Recombination sets a fundamental limit on the Variations in the angiotensin-converting
accuracy of gene tracking 593 enzyme 615
Calculating risks in gene tracking 594 Variations in the HT2RA serotonin receptor 615
Bayesian calculations 594 Malignant hyperthermia and the ryanodine
Using linkage programs for calculating genetic receptor 616
risks 595
The special problems of Duchenne muscular 19.3 PERSONALIZED MEDICINE: PRESCRIBING
dystrophy 595 THE BEST DRUG 616
Without bedside genotyping it is difficult to put
18.6 DNA PROFILING 596 the ideal into practice 616
A variety of different DNA polymorphisms have Drug effects are often polygenic 616
been used for profiling 596 Warfarin 617
DNA fingerprinting using minisatellite probes 596 Pharmaceutical companies have previously had little
DNA profiling using microsatellite markers 597 incentive to promote personalized medicine 618
Y-chromosome and mitochondrial The stages in drug development 618
polymorphisms 597 Some drugs are designed or licensed for treating
DNA profiling is used to disprove or establish patients with specific genotypes 619
paternity 598 Trastuzumab for breast cancer with HER2
DNA profiling can be used to identify the origin of amplification 619
clinical samples 598 Gefitinib for lung and other cancers that
DNA profiling can be used to determine the zygosity have EGFR mutations 620
of twins 598 Expression profiling of tumors may lead to
DNA profiling has revolutionized forensic personalized treatment 620
investigations but raises issues of civil liberties 599 An alternative approach: depersonalized
Technical issues 599 medicine and the polypill 621
Courtroom issues 600 19.4 PERSONALIZED MEDICINE: TESTING FOR
Ethical and political issues 601 SUSCEPTIBILITY TO COMPLEX DISEASES 622
CONCLUSION 602 Researchers are finally able to identify genetic
susceptibility factors 622
FURTHER READING 603 Individual factors are almost always weak 622

Even if no single test gives a strong prediction, Nuclear transfer has been used to produce
maybe a battery of tests will 623 genetically modified domestic mammals 646
Much remains unknown about the clinical validity Exogenous promoters provide a convenient way
of susceptibility tests 624 of regulating transgene expression 646
Risk of type 2 diabetes: the Framingham and Tetracycline-regulated inducible transgene
Scandinavian studies 625 expression 647
Risk of breast cancer: the study of Pharoah Tamoxifen-regulated inducible transgene
and colleagues 625 expression 647
Risk of prostate cancer: the study of Zheng Transgene expression may be influenced by position
and colleagues 627 effects and locus structure 648
Evidence on the clinical utility of susceptibility
20.3 TARGETED GENOME MODIFICATION
testing is almost wholly lacking 627
AND GENE INACTIVATION IN VIVO 648
19.5 POPULATION SCREENING 628 The isolation of pluripotent ES cell lines was a
Screening tests are not diagnostic tests 629 landmark in mammalian genetics 648
Prenatal screening for Down syndrome defines an Gene targeting allows the production of animals
arbitrary threshold for diagnostic testing 629 carrying defined mutations in every cell 649
Acceptable screening programs must fit certain Different gene-targeting approaches create null
criteria 631 alleles or subtle point mutations 650
What would screening achieve? 631 Gene knockouts 650
Sensitivity and specificity 631 Gene knock-ins 650
Choosing subjects for screening 633 Creating point mutations 651
An ethical framework for screening 633 Microbial site-specific recombination systems allow
Some people worry that prenatal screening conditional gene inactivation and chromosome
programs might devalue and stigmatize engineering in animals 651
affected people 633 Conditional gene inactivation 653
Some people worry that enabling people with Chromosome engineering 653
genetic diseases to lead normal lives spells Zinc finger nucleases offer an alternative way of
trouble for future generations 634 performing gene targeting 654
19.6 THE NEW PARADIGM: PREDICT AND Targeted gene knockdown at the RNA level involves
PREVENT? 634 cleaving the gene transcripts or inhibiting their
translation 656
CONCLUSION 636
In vivo gene knockdown by RNA interference 656
FURTHER READING 637 Gene knockdown with morpholino antisense
oligonucleotides 656
Chapter 20
20.4 RANDOM MUTAGENESIS AND LARGE-SCALE
Genetic Manipulation of Animals for
ANIMAL MUTAGENESIS SCREEN 657
Modeling Disease and Investigating Gene Random mutagenesis screens often use chemicals
Function 639 that mutate bases by adding ethyl groups 658
20.1 OVERVIEW 640 Insertional mutagenesis can be performed in ES
A wide range of species are used in animal modeling 640 cells by using expression-defective transgenes as
Most animal models are generated by some kind gene traps 658
of artificially designed genetic modification 640 Transposons cause random insertional gene
Multiple types of phenotype analysis can be inactivation by jumping within a genome 658
performed on animal models 642 Insertional mutagenesis with the Sleeping
Beauty transposon 659
20.2 MAKING TRANSGENIC ANIMALS 643
piggyBac-mediated transposition 660
Transgenic animals have exogenous DNA inserted The International Mouse Knockout Consortium
into the germ line 643
seeks to knock out all mouse genes 660
Pronuclear microinjection is an established method
for making some transgenic animals 644 20.5 USING GENETICALLY MODIFIED ANIMALS
Transgenes can also be inserted into the germ line TO MODEL DISEASE AND DISSECT GENE
via germ cells, gametes, or pluripotent cells derived FUNCTION 661
from the early embryo 645 Genetically modified animals have furthered our
Gene transfer into gametes and germ cell knowledge of gene function 661
precursors 645 Creating animal models of human disease 662
Gene transfer into pluripotent cells of the early Loss-of-function mutations are modeled by selectively
embryo or cultured pluripotent stem cells 645 inactivating the orthologous mouse gene 662
Gene transfer into somatic cells (animal Null alleles 663
cloning) 645 Humanized alleles 663

Leaky mutations and hypomorphs 663 21.4 PRINCIPLES OF GENE THERAPY AND


Gain-of-function mutations are conveniently MAMMALIAN GENE TRANSFECTION
modeled by expressing a mutant transgene 663 SYSTEMS 696
Modeling chromosomal disorders is a challenge 666 Genes can be transferred to a patient's cells either in
Modeling human cancers in mice is complex 667 culture or within the patient's body 699
Gene knockouts to model loss of tumor Integration of therapeutic genes into host
suppressor function 668 chromosomes has significant advantages but raises
Transgenic mice to model oncogene activation 670 major safety concerns 700
Modeling sporadic cancers 670 Viral vectors offer strong and sometimes long-term
Humanized mice can overcome some of the species transgene expression, but many come with
differences that make mice imperfect models 670 safety risks 700
New developments in genetics are extending the Retroviral vectors 701
range of disease models 671 Adenoviral and adeno-associated virus (AAV)
CONCLUSION 672 vectors 702
Other viral vectors 703
FURTHER READING 673
Non-viral vector systems are safer, but gene
transfer is less efficient and transgene expression
Chapter 21 is often relatively weak 704
Genetic Approaches to Treating Disease 677 Transfer of naked nucleic acid by direct
21.1 TREATMENT OF GENETIC DISEASE VERSUS injection or particle bombardment 704
GENETIC TREATMENT OF DISEASE 678 Lipid-mediated gene transfer 704
Treatment of genetic disease is most advanced for Compacted DNA nanoparticles 704
disorders whose biochemical basis is well 21.5 RNA AND OLIGONUCLEOTIDE THERAPEUTICS
understood 678 AND THERAPEUTIC GENE REPAIR 704
Genetic treatment of disease may be conducted Therapeutic RNAs and oligonucleotides are often
at many different levels 679 designed to selectively inactivate a mutant allele 705
21.2 GENETIC APPROACHES TO DISEASE Therapeutic ribozymes 706
TREATMENT USING DRUGS, RECOMBINANT Therapeutic siRNA 707
PROTEINS, AND VACCINES 680 Antisense oligonucleotides can induce exon
Drug companies have invested heavily in genomics skipping to bypass a harmful mutation 707
to try to identify new drug targets 680 Gene targeting with zinc finger nucleases can repair
Therapeutic proteins can be produced by expression a specific pathogenic mutation or specifically
cloning in microbes, mammalian cell lines, or inactivate a target gene 708
transgenic animals 681 21.6 GENE THERAPY IN PRACTICE 709
Genetic engineering has produced novel antibodies The first gene therapy successes involved recessively
with therapeutic potential 683 inherited blood cell disorders 710
Aptamers are selected to bind to specific target Gene therapies for many other monogenic disorders
proteins and inhibit their functions 685 have usually had limited success 712
Vaccines have been genetically engineered to Cancer gene therapies usually involve selective killing
improve their functions 686 of cancer cells, but tumors can grow again by
Cancer vaccines 687 proliferation of surviving cells 713
21.3 PRINCIPLES AND APPLICATIONS OF CELL Multiple HIV gene therapy strategies are being pursued,
THERAPY 687 but progress toward effective treatment is slow 714
Stem cell therapies promise to transform the CONCLUSION 715
potential of transplantation 687 FURTHER READING 716
Embryonic stem cells 688
Tissue stem cells 688
Practical difficulties in stem cell therapy 689
Allogeneic and autologous cell therapy 689
Nuclear reprogramming offers new approaches to
disease treatment and human models of human
disease
Induced pluripotency in somatic cells 693
Transdifferentiation 694
Stem cell therapy has been shown to work but is
at an immature stage 695
Chapter 1

Nucleic Acid Structure and Gene Expression

KEY CONCEPTS

• The great bulk of eukaryotic genetic information is stored in the DNA found in the nucleus.
  A tiny amount is also stored in mitochondrial and chloroplast DNA.
• DNA molecules are polymers of nucleotide repeat units that consist of one of four types of
  nitrogenous base, plus a sugar, plus a phosphate.
• The backbone of any DNA molecule is a sugar-phosphate polymer, but it is the sequence of
  the bases attached to the sugars that determines the identity and genetic function of any
  DNA sequence.
• DNA normally occurs as a double helix, comprising two strands that are held together by
  hydrogen bonds between pairs of complementary nitrogenous bases.
• Transmission of genetic information from cell to cell is normally achieved by copying the
  complementary DNA molecules that are then shared equally between two daughter cells.
• Genes are discrete segments of DNA that are used as a template to synthesize a functional
  complementary RNA molecule.
• Most genes make an RNA that will serve as a template for making a polypeptide.
• Various genes make RNA molecules that do not encode polypeptide. Such noncoding RNA
  often helps regulate the expression of other genes.
• Like DNA, RNA molecules are polymers of nucleotide repeat units that consist of one of four
  types of nitrogenous base (three of these are the same as in DNA), plus a slightly different
  sugar, plus a phosphate.
• Unlike DNA, RNA molecules are usually single-stranded.
• To become functional, newly synthesized RNA must undergo a series of maturation steps
  such as excising unwanted intervening sequences and chemical modification of certain
  bases.
• Polypeptide synthesis occurs at ribosomes, either in the cytoplasm or inside mitochondria
  and chloroplasts.
• The sequential information encoded in the RNA is interpreted at the ribosome via a triplet
  genetic code, determining the basic structure of the polypeptide.
• Polypeptides often undergo a variety of chemical modifications.
• Proteins display extraordinary structural and functional diversity.
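
The base-pairing concepts above can be illustrated with a minimal Python sketch. It assumes the standard Watson-Crick pairs (A with T, G with C), the antiparallel orientation of the two strands, and the substitution of uracil (U) for thymine (T) in RNA. The function names and the example sequence are hypothetical and chosen only for illustration.

```python
# Watson-Crick base pairs in DNA: A pairs with T, and G pairs with C.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand: str) -> str:
    """Return the complementary DNA strand, written 5'->3'.

    The two strands of a double helix run antiparallel, so the
    complement is built from the reversed input to keep the
    conventional 5'->3' reading direction.
    """
    return "".join(DNA_COMPLEMENT[base] for base in reversed(strand.upper()))


def transcribe(template_strand: str) -> str:
    """Return the RNA complementary to a DNA template strand (5'->3').

    RNA carries uracil (U) wherever DNA would carry thymine (T).
    """
    return complementary_strand(template_strand).replace("T", "U")


if __name__ == "__main__":
    dna = "ATGGCC"  # hypothetical example sequence
    print(complementary_strand(dna))
    print(transcribe(dna))
```

Taking the complement twice returns the original sequence, which is a quick sanity check that each strand fully specifies the other.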

1.1 DNA, RNA, AND POLYPEPTIDES


Molecular genetics is primarily concerned with the inter-relationship between
two nucleic acids, DNA and RNA, and how these are used to synthesize polypeptides,
the basic component of all proteins. RNA may have been the hereditary
material at a very early stage of evolution, but now, except in certain viruses, it no
longer serves this role. Genetic information is instead stored in more chemically
stable DNA molecules that can be copied faithfully and transmitted to daughter
cells.
Nucleic acids were originally isolated from the nuclei of white blood cells, but
are found in all cells and in viruses. In eukaryotes, DNA molecules are found
mainly in the chromosomes of the nucleus, but each mitochondrion also has a
small DNA molecule, as do the chloroplasts of plant cells.
A gene is a part of a DNA molecule that serves as a template for making a
functionally important RNA molecule. In simple organisms such as bacteria, the
DNA is packed with genes (typically at least several hundred up to a few thousand
different genes). In eukaryotes, the small DNA molecules of the mitochondrion
or chloroplast contain a few genes (tens up to hundreds) but the nucleus often
contains thousands of genes, and complex eukaryotes typically have tens of
thousands. In the latter case, however, much of the DNA consists of repetitive
sequences whose functions are not easily identified. Some of the repetitive DNA
sequences support essential chromosomal functions, but there are also many
defective copies of functional genes.
There are many different types of RNA molecule but they can be divided into
two broad classes. In one class, each RNA molecule contains a coding RNA
sequence that can be decoded to generate a corresponding polypeptide sequence.
Because this class of RNA carries genetic information from DNA to the protein
synthesis machinery, it is described as messenger RNA (mRNA). Messenger RNA
made in the nucleus needs to be exported to the cytoplasm to make proteins, but
the messenger RNA synthesized in mitochondria and chloroplasts is used to
make proteins within these organelles. Most gene expression is ultimately dedicated
to making polypeptides, so proteins represent the major functional endpoint
of the information stored in DNA.
The other RNA class is noncoding RNA. Such molecules do not serve as templates
for making polypeptides. Instead they are often involved in assisting the
expression of other genes, sometimes in a fairly general way and sometimes by
regulating the expression of a small set of target genes. These regulatory processes
may involve catalytic RNA molecules (ribozymes).

Most genetic information flows in the sequence DNA → RNA → polypeptide

Genetic information generally flows in a one-way direction: DNA is decoded to
make RNA, and then RNA is used to make polypeptides that subsequently form
proteins. Because of its universality, this flow of genetic information has been
described as the central dogma of molecular biology. Two processes are essential
in all cellular organisms:
• transcription, by which DNA is used by an RNA polymerase as a template for
synthesizing one of many different types of RNA;
• translation, by which mRNA is decoded to make polypeptides at ribosomes,
which are large RNA-protein complexes found in the cytoplasm, and also in
mitochondria and chloroplasts.
Genetic information is encoded in the linear sequence of nucleotides in DNA
and is decoded in groups of three nucleotides at a time (triplets) to give a linear
sequence of nucleotides in RNA. This is in turn decoded in groups of three nucleotides
(codons) to generate a linear sequence of amino acids in the polypeptide
product.
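
As a rough illustration of the DNA → RNA → polypeptide flow described above, the following
Python sketch transcribes a short, made-up coding-strand sequence and splits the resulting
mRNA into codons. The function names, the example sequence, and the deliberately tiny codon
table are assumptions made for illustration; the table covers only the four codons used in the
example, not the full genetic code.

```python
# Minimal sketch of DNA -> RNA -> polypeptide (illustrative only).

# A deliberately partial codon table: only the codons used in the example below.
PARTIAL_CODON_TABLE = {
    "AUG": "Met",
    "GCU": "Ala",
    "AAA": "Lys",
    "UAA": "Stop",
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA corresponding to a DNA coding (sense) strand.

    The mRNA has the same 5'->3' sequence as the coding strand,
    with uracil (U) in place of thymine (T).
    """
    return coding_strand.upper().replace("T", "U")


def codons(mrna: str) -> list[str]:
    """Split an mRNA into consecutive triplets, ignoring any incomplete tail."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


def translate(mrna: str) -> list[str]:
    """Decode codons into amino acids until a stop codon is reached."""
    peptide = []
    for codon in codons(mrna):
        residue = PARTIAL_CODON_TABLE[codon]  # KeyError for codons outside the toy table
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide


mrna = transcribe("ATGGCTAAATAA")
print(mrna)             # AUGGCUAAAUAA
print(codons(mrna))     # ['AUG', 'GCU', 'AAA', 'UAA']
print(translate(mrna))  # ['Met', 'Ala', 'Lys']
```

The sketch treats the input as the coding strand so that transcription reduces to a T-to-U
substitution; starting from the template strand would additionally require taking the
reverse complement.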
Eukaryotic cells, including mammalian cells, contain nonviral chromosomal
DNA sequences, such as members of the mammalian LINE-1 repetitive DNA
family, that encode cellular reverse transcriptases, which can produce DNA
sequences from an RNA template. The central dogma of unidirectional flow of
genetic information in cells is therefore not strictly valid.

Figure 1.1 Repeat units in nucleic acids. (A) The linear backbone of nucleic acids
consists of alternating phosphate and sugar residues. Attached to each sugar is a base.
The basic repeat unit (pale peach shading) consists of a base + sugar + phosphate = a
nucleotide. The sugar has five carbon atoms numbered 1' to 5'. (B) In DNA, the sugar is
deoxyribose. (C) In RNA, the sugar is ribose, which differs from deoxyribose in having a
hydroxyl (OH) group attached to carbon 2'.

Nucleic acids and polypeptides are linear sequences of simple repeat units

Nucleic acids
DNA and RNA have very similar structures. Both are large polymers with long
linear backbones of alternating residues of a phosphate and a five-carbon sugar.
Attached to each sugar residue is a nitrogenous base (Figure 1.1A). The sugars in
DNA and RNA differ, in either lacking or possessing, respectively, an -OH group
at their 2'-carbon positions (Figure 1.1B, C). In deoxyribonucleic acid (DNA), the
sugar is deoxyribose; in ribonucleic acid (RNA), the sugar is ribose.
Unlike the sugar and phosphate residues, the bases of a nucleic acid molecule
vary. The sequence of bases identifies the nucleic acid and determines its function.
Four types of base are commonly found in DNA: adenine (A), cytosine (C),
guanine (G), and thymine (T). RNA also has four major types of base. Three of
them (adenine, cytosine, and guanine) also occur in DNA, but in RNA uracil (U)
replaces thymine (Figure 1.2A).

Figure 1.2 Purines, pyrimidines, nucleosides, and nucleotides. (A) Four nitrogenous
bases (A, C, G, and T) occur in DNA, and four nitrogenous bases (A, C, G, and U) are
found in RNA. A and G are purines; C, T, and U are pyrimidines. (B) A nucleoside is a
base + sugar residue; in this case, it is adenosine. (C) A nucleotide is a nucleoside + a
phosphate group that is attached to the 3' or 5' carbon of the sugar. The two examples
shown here are adenosine 5'-monophosphate (AMP) and 2'-deoxycytidine 5'-triphosphate
(dCTP). The bold lines at the bottom of the ribose and deoxyribose rings mean that the
plane of the ring is at an angle of 90° with respect to the plane of the chemical groups
that are linked to the 1' to 4' carbon atoms within the ring. If the plane of the base is
represented as lying on the surface of the page, the 2' and 3' carbons of the sugar could
be viewed as projecting upward out of the page, and the oxygen atom as projecting
downward below its surface. Phosphate groups are numbered sequentially (α, β, γ, etc.),
according to their distance from the sugar ring.

TABLE 1.1 NOMENCLATURE FOR BASES, NUCLEOSIDES, AND NUCLEOTIDES

Nucleoside = base + sugar; nucleotide = nucleoside + phosphate(s).

Base      Sugar        Nucleoside      Monophosphate                            Diphosphate                        Triphosphate

Purines
Adenine   ribose       adenosine       adenosine monophosphate (AMP) [a]        adenosine diphosphate (ADP)        adenosine triphosphate (ATP)
          deoxyribose  deoxyadenosine  deoxyadenosine monophosphate (dAMP) [b]  deoxyadenosine diphosphate (dADP)  deoxyadenosine triphosphate (dATP)
Guanine   ribose       guanosine       guanosine monophosphate (GMP) [a]        guanosine diphosphate (GDP)        guanosine triphosphate (GTP)
          deoxyribose  deoxyguanosine  deoxyguanosine monophosphate (dGMP)      deoxyguanosine diphosphate (dGDP)  deoxyguanosine triphosphate (dGTP)

Pyrimidines
Cytosine  ribose       cytidine        cytidine monophosphate (CMP) [a]         cytidine diphosphate (CDP)         cytidine triphosphate (CTP)
          deoxyribose  deoxycytidine   deoxycytidine monophosphate (dCMP)       deoxycytidine diphosphate (dCDP)   deoxycytidine triphosphate (dCTP)
Thymine   ribose       thymidine       thymidine monophosphate (TMP) [a]        thymidine diphosphate (TDP)        thymidine triphosphate (TTP)
          deoxyribose  deoxythymidine  deoxythymidine monophosphate (dTMP)      deoxythymidine diphosphate (dTDP)  deoxythymidine triphosphate (dTTP)
Uracil    ribose       uridine         uridine monophosphate (UMP) [a]          uridine diphosphate (UDP)          uridine triphosphate (UTP)
          deoxyribose  deoxyuridine    deoxyuridine monophosphate (dUMP)        deoxyuridine diphosphate (dUDP)    deoxyuridine triphosphate (dUTP)

[a] Nucleoside monophosphates are alternatively named as follows: AMP, adenylate; GMP, guanylate; CMP, cytidylate; TMP, thymidylate; UMP, uridylate.
[b] Where the sugar is ribose, the nucleotide is AMP; where the sugar is deoxyribose, the nucleotide is dAMP. This pattern applies throughout the table.
Note that TMP, TDP, and TTP are not normally found in cells.

Bases consist of heterocyclic rings of carbon and nitrogen atoms, and can be
divided into two classes: purines (A and G), which have two interlocked rings,
and pyrimidines (C, T, and U), which have a single ring. In nucleic acids, each
base is attached to carbon 1' (one prime) of the sugar; a sugar with an attached
base is called a nucleoside (Figure 1.2B). A nucleoside with a phosphate group
attached at the 5' or 3' carbon of the sugar is the basic repeat unit of a DNA strand
and is called a nucleotide (Figure 1.2C and Table 1.1).
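
The naming pattern in Table 1.1 is regular enough to be written down as a small lookup. The
Python sketch below is one hedged way of doing so: the helper names `base_class` and
`nucleotide` are hypothetical, and the code simply reproduces the table's convention (a "d"
prefix for deoxyribose, then the base letter, then M, D, or T for the number of phosphates).

```python
# Sketch of the naming pattern in Table 1.1 (illustrative helpers, not a standard API).

NUCLEOSIDES = {
    "A": "adenosine",
    "G": "guanosine",
    "C": "cytidine",
    "T": "thymidine",
    "U": "uridine",
}
PURINES = {"A", "G"}  # two interlocked rings; the other bases are pyrimidines
PHOSPHATE_WORDS = {1: ("mono", "M"), 2: ("di", "D"), 3: ("tri", "T")}


def base_class(base: str) -> str:
    """Classify a base as a purine or a pyrimidine."""
    if base not in NUCLEOSIDES:
        raise ValueError(f"unknown base: {base!r}")
    return "purine" if base in PURINES else "pyrimidine"


def nucleotide(base: str, phosphates: int, deoxy: bool = False) -> tuple[str, str]:
    """Return (full name, abbreviation) for a nucleotide, following Table 1.1."""
    if base not in NUCLEOSIDES:
        raise ValueError(f"unknown base: {base!r}")
    if phosphates not in PHOSPHATE_WORDS:
        raise ValueError("phosphates must be 1, 2, or 3")
    word, letter = PHOSPHATE_WORDS[phosphates]
    name = f"{'deoxy' if deoxy else ''}{NUCLEOSIDES[base]} {word}phosphate"
    abbreviation = f"{'d' if deoxy else ''}{base}{letter}P"
    return name, abbreviation


print(base_class("G"))                 # purine
print(nucleotide("A", 1))              # ('adenosine monophosphate', 'AMP')
print(nucleotide("C", 3, deoxy=True))  # ('deoxycytidine triphosphate', 'dCTP')
```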
Polypeptides
Proteins are composed of one or more polypeptide molecules that may be modified
by the addition of carbohydrate side chains or other chemical groups. Like
DNA and RNA, polypeptide molecules are polymers that are a linear sequence of
repeating units. The basic repeat unit is called an amino acid. An amino acid has
a positively charged amino group (-NH2) and a negatively charged carboxylic
acid (carboxyl) group (-COOH). These are connected by a central α-carbon atom
that also bears an identifying side chain that determines the chemical nature of
the amino acid. Polypeptides are formed by a condensation reaction between the
amino group of one amino acid and the carboxyl group of the next, to form a
repeating backbone, where the side chain (called an R-group) can differ from one
amino acid to another (Figure 1.3).
Figure 1.3 The basic repeat structure of polypeptides. A polypeptide is a polymer
consisting of amino acid repeat units (pale peach shading). Amino acids have the
general formula H2N-CH(R)-COOH, where R is the side chain, H2N- is the amino group,
and -COOH is the carboxyl group. The central α carbon carries all three groups, in
each amino acid. The blue shading illustrates one of the peptide bonds that link adjacent
amino acids.

The 20 different common amino acids can be categorized according to their
side chains:
• basic amino acids (Figure 1.4A) carry a side chain with a net positive charge
at physiological pH;
• acidic amino acids (Figure 1.4B) carry a side chain with a net negative charge
at physiological pH;
• uncharged polar amino acids (Figure 1.4C) are electrically neutral overall,
although their side chains carry polar electrical groups with fractional electrical
charges (denoted as δ+ or δ−);
• nonpolar neutral amino acids (Figure 1.4D) are hydrophobic (repelling water),
often interacting with one another and with other hydrophobic groups.
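
The four side-chain classes listed above can be captured in a simple lookup keyed by the
one-letter amino acid codes given in Figure 1.4. The following Python sketch is illustrative
only; the names `SIDE_CHAIN_CLASS` and `class_composition` and the example peptide are
assumptions, not part of the original text.

```python
from collections import Counter

# Class membership follows Figure 1.4 (one-letter amino acid codes).
_CLASSES = {
    "basic": "KRH",               # lysine, arginine, histidine
    "acidic": "DE",               # aspartic acid, glutamic acid
    "uncharged polar": "NQSTYC",  # amide, hydroxyl, and sulfhydryl side chains
    "nonpolar": "GAVLIPMFW",      # nine nonpolar neutral amino acids
}

# Invert the table so each one-letter code maps to its class.
SIDE_CHAIN_CLASS = {
    letter: cls for cls, letters in _CLASSES.items() for letter in letters
}


def class_composition(peptide: str) -> Counter:
    """Count how many residues of a peptide fall into each side-chain class."""
    return Counter(SIDE_CHAIN_CLASS[residue] for residue in peptide.upper())


composition = class_composition("MKDEL")  # hypothetical five-residue peptide
print(len(SIDE_CHAIN_CLASS))    # 20
print(composition["nonpolar"])  # 2
print(composition["acidic"])    # 2
print(composition["basic"])     # 1
```

A tally like this gives only a coarse picture of a polypeptide's character; as noted in the
text, glycine and cysteine sit at intermediate positions on the hydrophilic-hydrophobic scale.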
Figure 1.4 R groups of the 20 common amino acids, grouped according to chemical class.
There are 11 polar amino acids, divided into three classes: (A) basic amino acids
(positively charged); (B) acidic amino acids (negatively charged); and (C) uncharged
polar amino acids bearing three different types of chemical group. Polar chemical
groups are highlighted. (D) In addition, a fourth class is composed of nine nonpolar
neutral amino acids. Amino acids within each class are chemically very similar.
Side-chain carbon atoms are numbered from the central α-carbon atom (see the lysine
side chain). In proline, the R group's side chain connects to the amino acid's -NH2
group as well as to its central α-carbon atom.
(A) Basic: lysine (Lys; K), arginine (Arg; R), histidine (His; H).
(B) Acidic: aspartic acid (Asp; D), glutamic acid (Glu; E).
(C) Uncharged polar: asparagine (Asn; N) and glutamine (Gln; Q), with amide groups;
serine (Ser; S), threonine (Thr; T), and tyrosine (Tyr; Y), with hydroxyl groups;
cysteine (Cys; C), with a sulfhydryl group.
(D) Nonpolar: glycine (Gly; G), alanine (Ala; A), valine (Val; V), leucine (Leu; L),
isoleucine (Ile; I), proline (Pro; P), methionine (Met; M), phenylalanine (Phe; F),
tryptophan (Trp; W).

TABLE 1.2 WEAK NONCOVALENT BONDS AND FORCES

Type of bond          Nature of bond

Hydrogen              Hydrogen bonds form when a hydrogen atom interacts with electron-attracting
                      atoms, usually oxygen or nitrogen atoms

Ionic                 Ionic interactions occur between charged groups. They can be very strong in
                      crystals, but in an aqueous environment the charged groups are shielded by both
                      water molecules and ions in solution and so are quite weak. Nevertheless, they
                      can be very important in biological function, as in enzyme-substrate recognition

Van der Waals forces  Any two atoms in close proximity show a weak attractive bonding interaction
                      (van der Waals attraction) as a result of their fluctuating electrical charges. When
                      atoms become extremely close, they repel each other very strongly (van der
                      Waals repulsion). Although individual van der Waals attractions are very weak, the
                      cumulative effect of many such attractive forces can be important when there is a
                      very good fit between the surfaces of two macromolecules

Hydrophobic forces    Water is a polar molecule. Hydrophobic molecules or chemical groups in an
                      aqueous environment tend to cluster. This minimizes their disruptive effects on
                      the complex network of hydrogen bonds between water molecules. Hydrophobic
                      groups are said to be held together by hydrophobic bonds, although the basis of
                      their attraction is their common repulsion by water molecules

In general, polar amino acids are hydrophilic, and nonpolar amino acids are
hydrophobic. Glycine, with its very small side chain, and cysteine (whose -SH
group is not as polar as an -OH group) occupy intermediate positions on the
hydrophilic-hydrophobic scale.
As described below, the side chains can be modified by the addition of various
chemical groups or sugar chains.

The type of chemical bonding determines stability and function


The stability of nucleic acid and protein polymers is primarily dependent on
strong covalent bonds between the atoms of their linear backbones. In addition
to covalent bonds, weak noncovalent bonds (Table 1.2) are important both
between and within nucleic acids or protein molecules (Box 1.1). Individual
noncovalent bonds are typically more than 10 times weaker than individual covalent
bonds.
The structure of water is particularly complex, with a rapidly fluctuating network
of noncovalent bonding occurring between water molecules. The predominant
force in this structure is the hydrogen bond, a weak electrostatic bond
between fractionally positive hydrogen atoms and fractionally negative atoms
(oxygen atoms, in the case of water molecules).

BOX 1.1 THE IMPORTANCE OF HYDROGEN BONDING IN NUCLEIC ACIDS AND PROTEINS

Intermolecular hydrogen bonding in nucleic acids
This is important in permitting the formation of the following double-stranded
nucleic acids:
• Double-stranded DNA. The stability of the double helix is maintained by hydrogen
bonding between A-T and C-G base pairs. The individual hydrogen bonds are weak, but
in eukaryotic cells the two strands of a DNA helix are held together by between
tens of thousands and hundreds of millions of hydrogen bonds.
• DNA-RNA duplexes. Hydrogen bonds form naturally between DNA and RNA during
transcription, but the base pairing is transient because the RNA migrates away from
DNA as it matures.
• Double-stranded RNA. This occurs stably in the genomes of some viruses. It also
arises transiently in cells during gene expression. For example, during RNA splicing,
small nuclear RNA molecules bind to complementary sequences in pre-mRNA, and mRNA
codons bind to tRNA during translation. Many regulatory RNAs, such as microRNAs,
control the expression of selected target genes by base pairing to complementary
sequences at the RNA level.

Intramolecular hydrogen bonding in nucleic acids
This is particularly prevalent in RNA molecules. Intramolecular base pairing can form
hairpins that may be crucially important to the structure of some RNAs such as rRNA
and tRNA (see Figure 1.9), and as targets for gene regulation.

Intramolecular hydrogen bonding in proteins
Several characteristic elements of protein secondary structure, such as α-helices and
β-pleated sheets, arise because of hydrogen bonding between backbone N-H and C=O
groups of different amino acids on the same polypeptide chain.
NUCLEIC ACID STRUCTURE AND DNA REPLICATION 7

Charged molecules are highly soluble in water. Because of the phosphate
groups in their component nucleotides, both DNA and RNA are negatively
charged polyanions. Depending on their amino acid composition, proteins may
be electrically neutral, or they may carry a net positive charge (basic protein) or a
net negative charge (acidic protein). All of these molecules can form multiple
interactions with the water during their solubilization. Even electrically neutral
proteins are readily soluble, if they contain sufficient charged or neutral polar
amino acids. In contrast, membrane-bound proteins with many hydrophobic
amino acids are thermodynamically more stable in a hydrophobic environment.
Although individually weak, the numerous noncovalent bonds acting together
make large contributions to the stability of the conformation (structure) of these
molecules and are important for specifying the shape of a macromolecule.
Covalent bonds are comparatively stable, so a high input of energy is needed to
break them. Noncovalent bonds, however, are constantly being made and broken
at physiological temperatures (see Box 1.1).

1.2 NUCLEIC ACID STRUCTURE AND DNA REPLICATION

DNA and RNA structure

DNA and RNA molecules have linear backbones consisting of alternating
sugar residues and phosphate groups. The sugar residues are linked by
3',5'-phosphodiester bonds, in which a phosphate group links the 3' carbon
atom of one sugar to the 5' carbon atom of the next sugar in the sugar-phosphate
backbone (Figure 1.5).

Figure 1.5 A 3',5'-phosphodiester bond. The phosphodiester bond (pale peach
shading) joins the 3' carbon atom of one sugar to the 5' carbon atom of the next
sugar in the sugar-phosphate backbone of a nucleic acid.

Although certain viral genomes are composed of single-stranded DNA, cellular
DNA forms a double helix: two strands of DNA are held together by hydrogen
bonds to form a duplex. Hydrogen bonding occurs between the laterally opposed
complementary base pairs on the two strands of the DNA duplex. Such base
pairs form according to Watson-Crick rules: A pairs with T, while G pairs with C
(Figure 1.6).
Because of base pairing, the base composition of DNA is not random: the
amount of A equals that of T, and the amount of G equals that of C. The base
composition of DNA can therefore be specified by quoting the percentage of GC
(= percentage of G + percentage of C) in its composition. For example, DNA with
42% GC has the following base composition: G, 21%; C, 21%; A, 29%; T, 29%.
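
The arithmetic behind this example follows directly from the pairing rules: G and C each
account for half of the GC percentage, and A and T each account for half of the remainder.
A minimal Python sketch, assuming double-stranded DNA that obeys Watson-Crick pairing
throughout (the function name `base_composition` is hypothetical):

```python
def base_composition(gc_percent: float) -> dict[str, float]:
    """Return the percentage of each base in duplex DNA with the given %GC.

    Assumes strict Watson-Crick pairing, so that %G == %C and %A == %T.
    """
    if not 0 <= gc_percent <= 100:
        raise ValueError("gc_percent must lie between 0 and 100")
    g = c = gc_percent / 2          # G and C share the GC fraction equally
    a = t = (100 - gc_percent) / 2  # A and T share whatever is left
    return {"G": g, "C": c, "A": a, "T": t}


print(base_composition(42))  # {'G': 21.0, 'C': 21.0, 'A': 29.0, 'T': 29.0}
```

The calculation does not apply to single-stranded nucleic acids, where the amounts of
complementary bases need not be equal.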
The two strands of a DNA double helix curve around each other to produce a
minor groove and a major groove in the double helix, where the distance occupied
by a single complete turn of the helix (its pitch) is 3.6 nm (Figure 1.7). DNA
can adopt different types of helical structure. Under physiological conditions,
most DNA in bacterial or eukaryotic cells adopts the B form, which is a right-handed
helix (it spirals in a clockwise direction away from the observer) and has
10 base pairs per turn. Rarer forms are A-DNA (right-handed helix with 11 base
pairs per turn) and Z-DNA (a left-handed helix with 12 base pairs per turn).
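
The two figures quoted for B-DNA (a pitch of 3.6 nm and 10 base pairs per turn) are enough
for a back-of-the-envelope estimate of how long a stretch of duplex is. The Python sketch
below assumes an idealized, perfectly regular B-form helix; the constant and function names
are hypothetical.

```python
# Values quoted in the text for the B form of DNA.
B_DNA_PITCH_NM = 3.6    # length of one complete helical turn, in nanometres
B_DNA_BP_PER_TURN = 10  # base pairs per helical turn


def helix_turns(n_bp: int, bp_per_turn: float = B_DNA_BP_PER_TURN) -> float:
    """Number of helical turns spanned by n_bp base pairs."""
    return n_bp / bp_per_turn


def helix_length_nm(
    n_bp: int,
    pitch_nm: float = B_DNA_PITCH_NM,
    bp_per_turn: float = B_DNA_BP_PER_TURN,
) -> float:
    """Approximate end-to-end length of an ideal helix of n_bp base pairs."""
    return n_bp * pitch_nm / bp_per_turn


print(helix_turns(1000))      # 100.0
print(helix_length_nm(1000))  # 360.0
```

On these figures, each base pair advances the helix by roughly 0.36 nm. The pitches of A-DNA
and Z-DNA are not given in the text, so the defaults above apply to the B form only.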

Figure 1.6 AT and GC base pairs. (A) AT base pairs have two connecting hydrogen bonds
(dotted red lines); (B) GC base pairs have three. Fractional positive charges and
fractional negative charges are shown by δ+ and δ−, respectively.

Because the phosphodiester bonds link carbon atoms number 3' and number 5'
of successive sugar residues, the two ends of a linear DNA strand are different.
The 5' end has a terminal sugar residue in which carbon atom number 5' is not
linked to another sugar residue. The 3' end has a terminal sugar residue whose 3'
carbon is not involved in phosphodiester bonding. The two strands of a DNA
duplex are described as being anti-parallel to each other because the 5'→3' direction
of one DNA strand is the opposite to that of its partner, according to Watson-Crick
base-pairing rules (Figure 1.8).
Genetic information is encoded by the linear sequence of bases in the DNA
strands. The two strands of a DNA duplex have complementary sequences, so the
sequence of bases of one DNA strand can readily be inferred from that
of the other strand. It is usual to describe DNA by writing the sequence of bases
of one strand only, in the 5'→3' direction, which is the direction of synthesis of
new DNA or RNA from a DNA template. When describing the sequence of a DNA
region encompassing two neighboring bases (a dinucleotide) on one DNA strand,
it is usual to insert a 'p' to denote a connecting phosphodiester bond. So, a CG
base pair means a C on one DNA strand is hydrogen-bonded to a G on the complementary
strand, but CpG represents a deoxycytidine covalently linked to a
neighboring deoxyguanosine on the same DNA strand (see Figure 1.8).
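
Because the two strands are complementary and anti-parallel, the partner of any strand
written 5'→3' is its reverse complement. The Python sketch below illustrates this with the
trinucleotide from Figure 1.8 and also counts CpG dinucleotides along a single strand. The
function names and the second example sequence are assumptions made for illustration.

```python
# Watson-Crick partners for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand: str) -> str:
    """Return the partner strand, written 5'->3', of a DNA strand written 5'->3'."""
    return strand.upper().translate(COMPLEMENT)[::-1]


def count_cpg(strand: str) -> int:
    """Count CpG dinucleotides: a C followed directly by a G on the SAME strand."""
    s = strand.upper()
    return sum(1 for i in range(len(s) - 1) if s[i:i + 2] == "CG")


print(reverse_complement("CGT"))  # ACG  (the two strands shown in Figure 1.8)
print(count_cpg("ACGTCGCG"))      # 3
```

Note that `count_cpg` looks at neighbors along one strand, which is the distinction the text
draws between a CpG dinucleotide and a CG base pair formed across the two strands.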
Unlike DNA, RNA is normally single-stranded except for certain viruses that
have double-stranded RNA genomes. However, to perform certain cell functions
two RNA molecules may need to associate transiently to form base pairs, and
intermolecular hydrogen bonding also permits the formation of transient RNA-DNA
duplexes (see Box 1.1).
In addition, hydrogen bonding can occur between bases within a single-stranded
RNA (or DNA) molecule to produce structurally and functionally important
stretches of double-stranded sequence. Hairpin structures may be formed
with stems that are stabilized by hydrogen bonding between bases (Figure 1.9A).
Intrachain base pairing causes certain RNA molecules to have complex structures
(Figure 1.9B).
In double-stranded RNA, A pairs with U instead of T. Although G usually pairs
with C, sometimes G-U base pairs are formed (see the example in Figure 1.9B).
Although not particularly stable, G-U base pairing does not significantly distort
the RNA-RNA helix.

Figure 1.7 Features of the DNA double helix. The two DNA strands wind round each
other, producing a minor groove and a major groove in the double helix. The double helix
has a pitch of 3.6 nm per turn and a radius of 1 nm.
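
The RNA pairing rules just described (A with U, G with C, and occasionally G with U) can be
sketched as a check on whether two RNA segments could form a base-paired stem, as in a
hairpin. This Python example is a simplified illustration: it tests only full-length,
anti-parallel pairing and ignores loops, bulges, and thermodynamic stability. The function
name `can_pair` and the example sequences are hypothetical.

```python
# Standard RNA base pairs, plus the less stable G-U pairs mentioned in the text.
STANDARD_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
GU_PAIRS = {("G", "U"), ("U", "G")}


def can_pair(segment_a: str, segment_b: str, allow_gu: bool = True) -> bool:
    """Return True if two RNA segments, each written 5'->3', can pair along
    their whole length in an anti-parallel orientation."""
    if len(segment_a) != len(segment_b):
        return False
    allowed = (STANDARD_PAIRS | GU_PAIRS) if allow_gu else STANDARD_PAIRS
    # Anti-parallel: the first base of one segment faces the last base of the other.
    facing = zip(segment_a.upper(), reversed(segment_b.upper()))
    return all(pair in allowed for pair in facing)


print(can_pair("GGAC", "GUCC"))                  # True  (all standard pairs)
print(can_pair("GGAU", "GUCC"))                  # True  (includes one U-G pair)
print(can_pair("GGAU", "GUCC", allow_gu=False))  # False
```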
Figure 1.8 Anti-parallel nature of the DNA double helix. The two anti-parallel DNA
strands run in opposite directions in linking 3' to 5' carbon atoms in the sugar residues.
This double-stranded trinucleotide has the sequence 5' pCpGpT-OH 3'/5' pApCpG-OH 3',
where p stands for a phosphate group and -OH 3' represents the 3' terminal hydroxyl
group. This is conventionally abbreviated to give the 5'→3' sequence of nucleotides on
only one strand, either as 5'-CGT-3' (blue strand) or as 5'-ACG-3' (purple strand).

Figure 1.9 Base pairing within single-stranded nucleic acids. (A) Hairpin formation
by intramolecular hydrogen bonding. Hydrogen bonds between the sequences
highlighted by dark pink shading within this single-stranded nucleic acid (shown here
as RNA) can stabilize folding back to form a hairpin with a double-stranded stem.
(B) Extensive intramolecular base pairing in transfer RNA. The tRNAGly shown here as an
example illustrates the classical cloverleaf tRNA structure. There are three hairpins (the
D arm, anticodon arm, and the TΨC arm) plus a stretch of base pairing between 5' and
3' terminal sequences (called the acceptor arm because the 3' end is used to attach an
amino acid). Note that tRNAs always have the same number of base pairs in the stems of
the different arms of their cloverleaf structure and that the anticodon at the center of the
middle loop identifies the tRNA according to the amino acid it will bear. The minor
nucleotides depicted are: D, 5,6-dihydrouridine; Ψ, pseudouridine (5-ribosyluracil); m5C,
5-methylcytidine; m1A, 1-methyladenosine; Um, 2'-O-methyluridine.

Replication is semi-conservative and semi-discontinuous

For new DNA synthesis (replication) to begin, the two DNA strands of a helix
need to be unwound by the enzyme helicase. The two unwound DNA strands
then each serve as a template for DNA polymerase to make complementary DNA
strands, using the four deoxynucleoside triphosphates (dATP, dCTP, dGTP, and
dTTP). Two daughter DNA duplexes are formed, each identical to the parent molecule
(Figure 1.10). Each daughter DNA duplex contains one strand from the
parent molecule and one newly synthesized DNA strand, so the replication process
is semi-conservative.
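
The bookkeeping behind semi-conservative replication can be sketched in a few lines of
Python. In this toy model each strand is a (sequence, origin) pair with the sequence written
5'→3', and each parental strand templates one new complementary strand. The data layout, the
function names, and the use of the Figure 1.8 trinucleotide as the parental duplex are
assumptions made for illustration; the enzymology is ignored entirely.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def new_strand(template: str) -> str:
    """Return the strand complementary to the template, written 5'->3'."""
    return template.translate(COMPLEMENT)[::-1]


def replicate(duplex):
    """Replicate a duplex given as ((seq, origin), (seq, origin)).

    Each daughter duplex keeps one strand of the input duplex unchanged and
    pairs it with a freshly made complementary strand labelled "new".
    """
    (seq1, origin1), (seq2, origin2) = duplex
    daughter1 = ((seq1, origin1), (new_strand(seq1), "new"))
    daughter2 = ((new_strand(seq2), "new"), (seq2, origin2))
    return daughter1, daughter2


parent = (("CGT", "parental"), ("ACG", "parental"))
daughter1, daughter2 = replicate(parent)
print(daughter1)  # (('CGT', 'parental'), ('ACG', 'new'))
print(daughter2)  # (('CGT', 'new'), ('ACG', 'parental'))
```

Both daughters carry the same pair of sequences as the parent, and each contains exactly one
parental strand.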
DNA replication is initiated at specific points, called origins of replication,
generating Y-shaped replication forks, where the parental DNA duplex is opened
up. The anti-parallel parental DNA strands serve as templates for the synthesis of
complementary daughter strands that run in opposite directions.

The overall direction of chain growth is 5'→3' for one daughter strand, the
leading strand, but 3'→5' for the other daughter strand, the lagging strand
(Figure 1.11). The reactions catalyzed by DNA polymerase involve the addition of
a deoxynucleoside monophosphate (dNMP) residue to the free 3' hydroxyl group
of the growing DNA strand. However, only the leading strand always has a free 3'
hydroxyl group that allows continuous elongation in the same direction in which
the replication fork moves.
The direction of synthesis of the lagging strand is opposite to that in which the
replication fork moves. As a result, strand synthesis needs to be accomplished in
a progressive series of steps, making DNA segments that are typically 100-1000
nucleotides long (Okazaki fragments). Successively synthesized fragments are
eventually joined covalently by the enzyme DNA ligase to ensure the creation of
two complete daughter DNA duplexes. Only the leading strand is synthesized
continuously, so DNA synthesis is therefore semi-discontinuous.
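
A toy model can make the piecewise synthesis of the lagging strand concrete. In the Python
sketch below, the lagging-strand template is written 5'→3' and is assumed to be exposed by
the fork from its 5' end; each exposed window is copied as a separate fragment, and the
fragments are then joined. The function names, the example sequence, and the fragment length
of 4 (real Okazaki fragments are 100-1000 nucleotides long) are illustrative assumptions, and
RNA primers are ignored.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return strand.translate(COMPLEMENT)[::-1]


def okazaki_fragments(lagging_template: str, fragment_length: int) -> list[str]:
    """Copy the lagging-strand template window by window, in the order in
    which the advancing fork exposes it. Each fragment is written 5'->3'."""
    if fragment_length < 1:
        raise ValueError("fragment_length must be at least 1")
    windows = [
        lagging_template[i:i + fragment_length]
        for i in range(0, len(lagging_template), fragment_length)
    ]
    return [reverse_complement(window) for window in windows]


def ligate(fragments: list[str]) -> str:
    """Join the fragments into one strand, as DNA ligase does. Later fragments
    lie 5' of earlier ones on the new strand, so the synthesis order is reversed."""
    return "".join(reversed(fragments))


template = "ATGCCGTAAGCT"
fragments = okazaki_fragments(template, 4)
print(fragments)                                          # ['GCAT', 'TACG', 'AGCT']
print(ligate(fragments) == reverse_complement(template))  # True
```

The final comparison shows that the ligated fragments give the same strand that continuous
synthesis would have produced, which is the sense in which DNA ligase completes the duplex.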
Figure 1.10 Semi-conservative DNA replication. The parental DNA duplex
consists of two complementary, anti-parallel DNA strands that unwind to
serve as templates for the synthesis of new complementary DNA strands. Each
completed daughter DNA duplex contains one of the two parental DNA strands plus
one newly synthesized DNA strand, and is structurally identical to the original parental
DNA duplex.

DNA polymerases sometimes work in DNA repair and recombination

The machinery for DNA replication relies on a variety of proteins (Box 1.2) and
RNA primers, and has been highly conserved during evolution. However, the
complexity of the process is greater in mammalian cells, in terms of the numbers
of different DNA polymerases (Table 1.3), and of their constituent proteins and
subunits.
Most DNA polymerases in mammalian cells use an individual DNA strand as
a template for synthesizing a complementary DNA strand and so are DNA-directed
DNA polymerases. Unlike RNA polymerases, DNA polymerases normally
require the 3'-hydroxyl end of a base-paired primer strand as a substrate.
Therefore, an RNA primer, synthesized by a primase, is needed to provide a free
3' OH group for the DNA polymerase to start synthesizing DNA.

Figure 1.11 Semi-discontinuous DNA replication. The enzyme helicase opens up
a replication fork, where synthesis of new daughter DNA strands can begin. The overall
direction of movement of the replication fork matches that of the continuous 5'→3'
synthesis of the leading daughter DNA strand. Replication is semi-discontinuous
because the lagging strand, which is synthesized in the opposite direction, is built
up in pieces (Okazaki fragments, shown here as fragments A, B, and C) that will later be
stitched together by a DNA ligase.

There are close to 20 different types of DNA polymerase in mammalian cells.
Most use DNA as a template to synthesize DNA and they have been grouped into
four families (A, B, X, and Y) on the basis of sequence comparisons (see Table
1.3).
Members of family B are classical (high-fidelity) DNA polymerases and include
the enzymes devoted to replicating nuclear DNA. They mostly have an associated
3'→5' exonuclease activity that is important in proofreading: if the wrong base is
inserted at the 3' OH group of the growing DNA chain, the 3'→5' exonuclease snips
it out. This results in high-fidelity replication, because base misincorporation
errors are extremely infrequent. DNA polymerase α is a complex of a polymerase
and a primase and is devoted to initiating DNA synthesis and initiating Okazaki
fragments. DNA polymerases δ and ε carry out most of the DNA synthesis and are
strand-specific (see Table 1.3).
Many DNA polymerases work in DNA repair or recombination. They include
classical high-fidelity DNA polymerases that are also involved in replication
(DNA polymerases δ and ε) and others that are dedicated to DNA repair or recombination.
Some of the latter are high-fidelity polymerases but many of them are
comparatively prone to base misincorporation, notably members of family X and
especially family Y members. For example, DNA polymerase ι (iota) can have an
error rate 20,000 or more times that of DNA polymerase ε.
The high error rate in some DNA polymerases is tolerated because they work
in DNA repair processes and so are used to synthesize only small stretches of
DNA. In other cases, high error rates are advantageous. For example, low-fidelity

BOX 1.2 MAJOR CLASSES OF PROTEINS INVOLVED IN DNA REPLICATION

Topoisomerases: start the process of DNA unwinding by breaking a single DNA
strand, releasing the tension holding the helix in its coiled and supercoiled form.

Helicases: unwind the double helix at the replication fork, once supercoiling
has been eliminated by a topoisomerase.

Single-stranded binding proteins: maintain the stability of the replication
fork. Single-stranded DNA is very vulnerable to enzymatic attack; the bound
proteins protect it from being degraded.

Primases: enzymes that attach a small complementary RNA sequence (a primer)
to single-stranded DNA at the replication fork. The RNA primer provides the 3' OH
needed by DNA polymerase to begin synthesis (unlike RNA polymerases, DNA
polymerases cannot initiate new strand synthesis from a bare single-stranded
template but require an initiating molecule with a free 3' OH group onto which
deoxynucleoside triphosphates can be attached to build a complementary strand).

DNA polymerases: for synthesizing new DNA strands. New cellular DNA synthesis
normally depends on an existing DNA strand template that is read by a
DNA-directed DNA polymerase. This complex aggregate of protein subunits often
also provides DNA proofreading and DNA repair functions (see Table 1.3). This
means that any wrongly incorporated bases can be identified, removed, and
repaired. DNA can also be synthesized from an RNA template, using an
RNA-directed DNA polymerase (a reverse transcriptase). The ends of linear
chromosomes are copied using a reverse transcriptase (telomerase).

DNA ligases: needed to seal nicks that remain in newly synthesized DNA after
the RNA primers have been removed and the small gaps filled by DNA polymerase.
The DNA ligases catalyze the formation of a phosphodiester bond between
unattached but adjacent 3' hydroxyl and 5' phosphate groups.

TABLE 1.3 MAMMALIAN DNA POLYMERASES

DNA-DIRECTED DNA POLYMERASES

Polymerase | Family | Standard DNA replication | Additional or alternative roles in DNA repair, recombination, etc.
α (alpha) | B | initiates synthesis at replication origins and initiates synthesis of Okazaki fragments on lagging strand | -
β (beta) | X | - | base excision repair(b)
γ (gamma) | A | mitochondrial DNA synthesis | mitochondrial DNA repair
δ (delta) | B | main polymerase that synthesizes lagging strand | multiple roles in DNA repair
ε (epsilon) | B | synthesizes leading strand | multiple roles in DNA repair
ζ (zeta) | B | - | translesion synthesis(c)
η (eta) | Y | - | translesion synthesis(c)
θ (theta) | A | - | possible role in interstrand crosslink repair(d); base excision repair(b); translesion synthesis(c); somatic hypermutation(g)
ι (iota) | Y | - | translesion synthesis(c); possible roles in base excision repair(b) and mismatch repair(e)
κ (kappa) | Y | - | translesion synthesis(c); nucleotide excision repair(f)
λ (lambda) | X | - | double-strand break repair; V(D)J recombination(g); base excision repair(b) (entry shared with μ)
μ (mu) | X | - | double-strand break repair; V(D)J recombination(g); base excision repair(b) (entry shared with λ)
ν (nu) | A | - | possible role in interstrand crosslink repair(d)
Rev1 | Y | - | translesion synthesis(c)
TdT(a) | X | - | V(D)J recombination(g)

RNA-DIRECTED DNA POLYMERASES (REVERSE TRANSCRIPTASES)

Polymerase | Role
Interspersed repeat reverse transcriptases (LINE-1 or endogenous retrovirus elements) | occasionally converts mRNA and other RNA into cDNA, which can integrate elsewhere into the genome
Telomerase reverse transcriptase (Tert) | replicates DNA at the ends of linear chromosomes

(a) Terminal deoxynucleotide transferase. (b) Base excision repair identifies and
removes inappropriate bases or inappropriately modified bases. (c) Translesion
synthesis involves the replication of DNA past damaged DNA (lesions) on the
template strand. (d) Interstrand crosslink repair is the repair of highly cytotoxic
lesions where covalent DNA bonds have been formed between the DNA strands.
(e) Mismatch repair is a form of DNA repair that corrects mistakes arising when
noncomplementary nucleotides form a base pair. (f) Nucleotide excision repair is
used to fix helix-distorting lesions. (g) Somatic hypermutation and V(D)J
recombination are mechanisms used in B cells to diversify immunoglobulin
sequences.

DNA polymerases can continue to synthesize new DNA strands opposite a lesion
in the template DNA (translesion synthesis) and they can contribute to the
sequence diversity of immunoglobulins (e.g. by introducing many base changes
in coding sequences) and so assist in the recognition of numerous foreign
antigens by the immune system.
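
The argument that a high error rate is tolerable when only short stretches are synthesized comes down to simple arithmetic: expected misincorporations are the per-base error rate multiplied by the number of bases copied. The sketch below uses the 20,000-fold figure quoted in the text; the baseline rate of 1e-6 per base and the two stretch lengths are hypothetical placeholders chosen only to make the arithmetic visible, not measured values.

```python
def expected_misincorporations(error_rate_per_base, bases_synthesized):
    """Expected number of wrong bases = per-base error rate x bases copied."""
    return error_rate_per_base * bases_synthesized


HIGH_FIDELITY_RATE = 1e-6    # hypothetical placeholder, not a measured value
FOLD_INCREASE = 20_000       # "20,000 or more times", as quoted in the text
LOW_FIDELITY_RATE = HIGH_FIDELITY_RATE * FOLD_INCREASE

# A long stretch copied by the high-fidelity enzyme (length is hypothetical).
long_replication = expected_misincorporations(HIGH_FIDELITY_RATE, 1_000_000)

# A short repair patch copied by the low-fidelity enzyme (length is hypothetical).
short_patch = expected_misincorporations(LOW_FIDELITY_RATE, 10)
```

Under these assumed numbers the short patch is expected to carry fewer errors than the long stretch, even though the enzyme that makes it is 20,000 times less accurate per base.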

Many viruses have RNA genomes

DNA is the hereditary material in all present-day cells and we generally think of
the genome as the collective term for the different hereditary DNA molecules of
an organism or cell. However, many viruses have an RNA genome. These RNA
molecules can undergo self-replication, although the 2' OH group on their ribose
residues makes the sugar-phosphate bonds rather unstable chemically. By
contrast, in DNA, the deoxyribose residues carry only hydrogen atoms at the 2'
position, making DNA a more stable carrier of genetic information.

TABLE 1.4 DIFFERENT CLASSES OF GENOME

Genome type | Single linear | Multiple linear | Single circular | Multiple circular | Mixed (linear + circular)

DNA GENOMES
Single-stranded (ss)DNA | some viruses | segmented ssDNA viruses | some viruses | - | -
Double-stranded (ds)DNA | some viruses; a very few bacteria, e.g. Borrelia | segmented dsDNA viruses; eukaryotic nuclei | mitochondria; chloroplasts; many bacteria and Archaea | multipartite viruses; some bacteria | a very few bacteria, e.g. Agrobacterium tumefaciens

RNA GENOMES
Single-stranded (ss)RNA | some viruses | segmented ssRNA viruses | a very few viruses | - | -
Double-stranded (ds)RNA | a few viruses | segmented dsRNA viruses | - | - | -

See Figure 1.12 for examples of viral genomes and for explanations of segmented and multipartite viruses.

Viruses have developed many different strategies to infect and subvert cells,
and their genomes show extraordinary diversity when compared with cellular
genomes (Table 1.4 and Figure 1.12). Because RNA replication has a much higher
error rate than DNA replication, viral RNA genomes have a higher mutational
load than DNA genomes. Although viral RNA genomes are generally quite small,
the elevated mutation rate permits more rapid adaptation to changing
environmental conditions. RNA viruses usually replicate in the cytoplasm; DNA
viruses generally replicate in the nucleus.
Retroviruses are unusual RNA viruses both because they replicate in the
nucleus and also because their RNA replicates via a DNA intermediate. The
single-stranded RNA genome is converted into a single-stranded cDNA using a
viral reverse transcriptase. The single-stranded viral cDNA is then converted into
double-stranded DNA, by using a DNA polymerase from the host cell. Other viral
proteins then help insert this double-stranded DNA into the host cell's chromo­
somal DNA. It can remain there for long periods or be used to synthesize new
viral RNA genomes that are packaged as new virus particles.
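
The two copying steps of the retroviral cycle (RNA to single-stranded cDNA, then cDNA to double-stranded DNA) can be sketched as follows. The function names and the six-nucleotide "genome" are hypothetical; priming, integration, and every other detail of the real mechanism are left out.

```python
# Toy model of retroviral reverse transcription (illustrative assumptions only).
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna_5to3):
    """Viral reverse transcriptase step: RNA -> single-stranded cDNA (5'->3')."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna_5to3))


def second_strand(cdna_5to3):
    """Host DNA polymerase step: copy the cDNA to give the partner strand (5'->3')."""
    return "".join(DNA_TO_DNA[base] for base in reversed(cdna_5to3))


viral_rna = "AUGGCU"                      # hypothetical genome fragment
cdna = reverse_transcribe(viral_rna)
partner = second_strand(cdna)
```

The second DNA strand has the same sequence as the original RNA with T in place of U, which is why the integrated double-stranded DNA can later serve as a source of new viral RNA genomes.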

1.3 RNA TRANSCRIPTION AND GENE EXPRESSION


As well as having global roles in storing and transmitting genetic information and
supporting chromosome function, DNA can have cell-specific functions because
it contains sequences that can be used to make RNA and polypeptides in ways
that differ from cell to cell. Genes are discrete DNA segments that are spaced at
irregular intervals along the DNA sequence and serve as templates for making
complementary RNA sequences (transcription). The initial primary RNA transcript
must then undergo a series of maturation steps that ultimately result in a
mature, functional noncoding RNA or a messenger RNA that will in turn serve as
a template to make a polypeptide. Some of the gene products are needed by
essentially all cells for a variety of vital cell processes (such as DNA replication or
protein synthesis). However, other RNA and protein products are made in some
cell types but not others and may even be specific for individual cells in
some exceptional cases, as in individual B and T lymphocytes.
The DNA compositions of the different cell types in a multicellular organism
are essentially identical. The variation between cells happens because of differ­
ences in gene expression, primarily at the level of transcription: different genes
are transcribed in different cells according to the needs of the cells. Some genes,
known as housekeeping genes, need to be expressed in essentially all cells, but
other genes show tissue-specific gene expression or they may be expressed at
specific times (e.g. at specific stages of development or of the cell cycle).
Figure 1.12 The extraordinary variety of viral genomes. (A) Strandedness and
topology. In single-stranded viral genomes, the RNA used to make protein products
may have the same sense as the genome, which is therefore a positive-strand
genome (+), or be the opposite sense (antisense) of the genome, a negative-strand
genome (-). Some single-stranded (+) RNA viruses (e.g. retroviruses) go through a
DNA intermediate, and some double-stranded DNA viruses (e.g. hepatitis B) go
through a replicative RNA form. (B) Segmented and multipartite genomes.
Segmented genomes are ones in which the genome is divided into multiple
different nucleic acid molecules, each specifying a messenger RNA devoted to
making a single polypeptide. For example, the genome of an influenza virus is
partitioned into eight different negative single-strand RNA molecules. In some
segmented genomes, each of the different molecules is packaged in a separate
virus particle (capsid). Such genomes are described as multipartite genomes, as
illustrated here by the bipartite genome of the gemini virus.

[Figure labels. Panel A is arranged by nucleic acid (DNA, RNA), strandedness
(single-stranded, double-stranded), and topology (linear, circular). Examples
shown: parvoviruses; M13, fd, f1; hepatitis A (no DNA intermediate);
rhabdoviruses; hepatitis delta virus; retroviruses (replicate via DNA
intermediate); herpes virus, λ; adenovirus (has covalently linked terminal
protein); poxviruses (have closed ends); papovaviruses; baculoviruses;
reoviruses. Panel B: gemini virus (bipartite; two circular double-stranded DNA
molecules in a double capsid); influenza virus (segmented negative-strand RNA).]

Normally, only one of the two DNA strands in a duplex serves as a template for
RNA synthesis. During transcription, double-stranded DNA is bound by RNA
polymerase. The DNA is then unwound, enabling the DNA strand that will act as
a template for RNA synthesis (the template strand) to form a transient
double-stranded RNA-DNA hybrid with the growing RNA strand.
The RNA transcript is complementary to the template strand of the DNA and
has the same 5'->3' direction and base sequence (except that U replaces T) as the
opposite, nontemplate DNA strand. The nontemplate strand is often called the
sense strand, and the template strand is often called the antisense strand (Figure
1.13).
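
The relationship between the template strand, the sense strand, and the RNA transcript can be checked with a short Python sketch. The function names and the example sequence are hypothetical; the only biology used is the base-pairing rule and the U-for-T substitution stated above.

```python
# Toy model of transcription (illustrative assumptions only).
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def template_strand(sense_5to3):
    """Return the template (antisense) strand, written 3'->5' so that each
    position lines up with the sense-strand base it pairs with."""
    return "".join(DNA_PAIR[base] for base in sense_5to3)


def transcribe(template_3to5):
    """Build the RNA 5'->3' by pairing a ribonucleotide with each template base."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5)


sense = "ATGGCCTAA"                  # hypothetical sense-strand sequence, 5'->3'
template = template_strand(sense)    # written 3'->5'
rna = transcribe(template)           # written 5'->3'
```

The transcript reads exactly like the sense strand with U in place of T, which is why gene sequences are conventionally documented using the sense strand.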
Figure 1.13 Transcribed RNA is complementary to one strand of DNA. The
nucleotide sequence of transcribed RNA is normally identical to that of the
sense strand, except that U replaces T, and is complementary to that of the
template strand. The nucleotide at the extreme 5' end of a primary RNA transcript
carries a 5' triphosphate group that may later undergo modification; the 3' end
has a free hydroxyl group.

[Figure labels: template (antisense) strand; RNA, with a 5' ppp end and a 3' OH
end.]

In documenting gene sequences, it is customary to show only the DNA
sequence of the sense strand. The orientation of sequences relative to a gene
normally refers to the sense strand. For example, the 5' end of a gene refers to
sequences at the 5' end of the sense strand, and upstream or downstream
sequences flank the gene at its 5' or 3' ends, respectively, with reference to the
sense strand. For transcription to proceed efficiently, various proteins
(transcription factors) must bind to particular DNA sequence elements
(collectively called a promoter) that are often located close to and upstream of
a gene. The bound transcription factors serve to position and guide the RNA
polymerase.
RNA polymerases synthesize RNA from four nucleotide precursors: ATP, CTP,
GTP, and UTP. Elongation involves the addition of the appropriate ribonucleoside
monophosphate residue (AMP, CMP, GMP, or UMP) to the free 3' hydroxyl group
at the 3' end of the growing RNA strand. These nucleotides are derived by
splitting a pyrophosphate residue (PPi) from their appropriate ribonucleoside
triphosphate (rNTP) precursors. Only the initiator nucleotide at the extreme 5' end
of a primary transcript carries a 5' triphosphate group.

Most genes are expressed to make polypeptides


Most eukaryotic genes are expressed to produce polypeptides using RNA
polymerase II, one of three RNA polymerases (Table 1.5). None of the three RNA
polymerases can initiate transcription by itself: they require regulatory factors.
A crucial regulatory element is the promoter, a collection of closely spaced short
DNA sequence elements in the immediate vicinity of a gene. Promoters are
recognized and bound by transcription factors that then guide and activate the
polymerase. Transcription factors are said to be trans-acting, because they are
produced by remote genes and need to migrate to their sites of action. In
contrast, promoter sequences are cis-acting because they are located on the same
DNA molecule as the genes they regulate.
Promoters recognized by RNA polymerase II often include the following
elements:
• The TATA box. Often TATAAA, or a variant sequence, this element is usually
found about 25 base pairs (bp) upstream from the transcriptional start site
(designated by -25; Figure 1.14A). It usually occurs in genes that are actively
transcribed by RNA polymerase II only at a particular stage in the cell cycle
(e.g. histone genes) or in specific cell types (e.g. the β-globin gene). A
mutation in the TATA box does not prevent the initiation of transcription but
causes transcription to begin at an incorrect location.
• The GC box. Usually a variant of the sequence GGGCGG, the GC box occurs in
a variety of genes, many of which lack a TATA box. This is the case for the
'housekeeping' genes that perform the same function in all cells (such as
those encoding DNA and RNA polymerases, histones, or ribosomal proteins).
Although the GC box sequence is sequentially asymmetrical, it seems to
function in either orientation (Figure 1.14B).
• The CAAT box. Often located at position -80, it is usually the strongest
determinant of promoter efficiency. Like the GC box, it functions in either
orientation.
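
The three promoter elements listed above can be located in a sequence with a short Python sketch. The function names and the toy upstream sequence are hypothetical. The TATAAA and GGGCGG motifs are the ones quoted in the text; the CCAAT motif used for the CAAT box is an assumption, because the text does not give its sequence. Only exact matches are reported, whereas real promoters often contain variants.

```python
import re

# (motif on the sense strand, whether it also functions in reverse orientation)
MOTIFS = {
    "TATA box": ("TATAAA", False),
    "GC box": ("GGGCGG", True),
    "CAAT box": ("CCAAT", True),   # assumed consensus, not given in the text
}
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    return "".join(PAIR[base] for base in reversed(seq))


def find_promoter_elements(upstream):
    """Scan sense-strand sequence that ends immediately before the start site.

    The last base of `upstream` is position -1. Returns a list of
    (element, position of the motif's first base, orientation), sorted by position.
    """
    hits = []
    length = len(upstream)
    for name, (motif, either_orientation) in MOTIFS.items():
        patterns = [(motif, "forward")]
        if either_orientation:
            patterns.append((reverse_complement(motif), "reverse"))
        for pattern, orientation in patterns:
            # A lookahead is used so that overlapping matches are not missed.
            for match in re.finditer(f"(?={pattern})", upstream):
                hits.append((name, match.start() - length, orientation))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical upstream region; N is filler that matches no motif.
upstream = "NN" + "CCGCCC" + "NNNN" + "TATAAA" + "N" * 19
elements = find_promoter_elements(upstream)
```

In this toy sequence the scan reports a GC box in reverse orientation and a TATA box whose first base lies at -25.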

TABLE 1.5 THE THREE CLASSES OF EUKARYOTIC RNA POLYMERASE

RNA polymerase | RNA synthesized | Notes
I | 28S rRNA(a), 18S rRNA(a), 5.8S rRNA(a) | localized in the nucleolus; RNA polymerase I produces a single primary transcript (45S rRNA) that is cleaved to give the three rRNA classes listed here
II | mRNA(b), miRNA(c), most snRNAs(d) and snoRNAs(e) | RNA polymerase II transcripts are unique in being subject to capping and polyadenylation
III | 5S rRNA(a), tRNA(f), U6 snRNA(g), 7SL RNA(h), various other small noncoding RNAs | the promoter for some genes transcribed by RNA polymerase III (e.g. 5S rRNA, tRNA, 7SL RNA) is internal to the gene; for others, it is located upstream of the gene (see Figure 1.15)

(a) Ribosomal RNA. (b) Messenger RNA. (c) MicroRNA. (d) Small nuclear RNAs.
(e) Small nucleolar RNAs. (f) Transfer RNA. (g) U6 snRNA is a component of the
spliceosome, an RNA-protein complex that removes unwanted noncoding sequences
from newly formed RNA transcripts. (h) 7SL RNA forms part of the signal
recognition particle, which has an important role in the transport of newly
synthesized proteins.
Figure 1.14 Promoters for two eukaryotic genes encoding polypeptides.
Polypeptide-encoding genes are transcribed by RNA polymerase II. The promoters
are defined by short sequence elements located in regions just upstream of the
transcription start site (+1). (A) The β-globin gene promoter includes a TATA box
(orange), a CAAT box (purple), and a GC box (blue). (B) The glucocorticoid
receptor gene is unusual in possessing 13 upstream GC boxes: 10 in the normal
orientation, and 3 in the reverse orientation (alternative orientations for GC box
elements are indicated by chevron directions).

[Figure scales: (A) -100 to +1; (B) -1000 to +1.]

For a gene to be transcribed by RNA polymerase II the DNA must first be bound
by general transcription factors, to form a preinitiation complex. General
transcription factors required by RNA polymerase II include TFIIA, TFIIB, TFIID,
TFIIE, TFIIF, and TFIIH. These transcription factors may themselves comprise
several components. For example, TFIID consists of the TATA-box-binding protein
(TBP; also found in association with RNA polymerases I and III) plus various
TBP-associated factors (TAF proteins). The complex that is required to initiate
transcription by an RNA polymerase is known as the basal transcription apparatus
and consists of the polymerase plus all of its associated general transcription
factors.
In addition to the general transcription factors required by RNA polymerase
II, specific recognition elements are recognized by tissue-restricted transcription
factors. For example, an enhancer is a cluster of cis-acting short sequence ele­
ments that can enhance the transcriptional activity of a specific eukaryotic gene.
Unlike a promoter, which has a relatively constant position with regard to the
transcriptional initiation site, enhancers are located at variable (often consider­
able) distances from their transcriptional start sites. Furthermore, their function
is independent of their orientation. Enhancers do, however, also bind gene regu­
latory proteins. The DNA between the promoter and enhancer sites loops out,
which brings the two different DNA sequences together and allows the proteins
bound to the enhancer to interact with the transcription factors bound to the
promoter, or with the RNA polymerase.
A silencer has similar properties to an enhancer but it inhibits, rather than
stimulates, the transcriptional activity of a specific gene.

Different sets of RNA genes are transcribed by the three eukaryotic RNA polymerases

Genes that encode polypeptides are always transcribed by RNA polymerase II.
However, RNA genes (genes that make noncoding RNA) may be transcribed by
polymerases I, II, or III, depending on the type of RNA (see Table 1.5). RNA
polymerase I is unusual because it is dedicated to transcribing RNA from a single
transcription unit, generating a large transcript that is then processed to yield
three types of ribosomal RNA (see below).
RNA polymerase II synthesizes various types of small noncoding RNA in addi­
tion to mRNA. They include many types of small nuclear RNA (snRNA) and of
small nucleolar RNA (snoRNA) that are involved in different RNA processing
events. In addition, it synthesizes many microRNAs (miRNAs) that can show
tissue-specific expression and typically regulate the expression of distinctive sets
of target genes.
RNA polymerase III transcribes a variety of small noncoding RNAs that are
typically expressed in almost all cells, including the different transfer RNA
species, 5S ribosomal RNA (rRNA), and some snRNAs. The genes for transfer RNAs
(tRNAs) and 5S rRNA are unusual in that the promoters lie within, rather than
upstream of, the transcribed sequence (Figure 1.15).
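
The division of labor among the three polymerases can be written down as a small lookup table. The sketch below encodes only the unambiguous entries of Table 1.5 (snRNAs and snoRNAs other than U6 are omitted because the table says only that "most" are made by polymerase II); the dictionary and function names are hypothetical.

```python
# Lookup built from the unambiguous entries of Table 1.5 (names are illustrative).
TRANSCRIBED_BY = {
    "28S rRNA": "I",
    "18S rRNA": "I",
    "5.8S rRNA": "I",
    "mRNA": "II",
    "miRNA": "II",
    "5S rRNA": "III",
    "tRNA": "III",
    "U6 snRNA": "III",
    "7SL RNA": "III",
}

# Polymerase III genes whose promoter lies inside the transcribed sequence.
INTERNAL_PROMOTER = {"5S rRNA", "tRNA", "7SL RNA"}


def polymerase_for(rna_class):
    """Return the RNA polymerase (I, II, or III) that makes this RNA class."""
    return TRANSCRIBED_BY[rna_class]


def has_internal_promoter(rna_class):
    """True if the promoter is internal to the gene, according to Table 1.5."""
    return rna_class in INTERNAL_PROMOTER
```

For example, `polymerase_for("tRNA")` returns `"III"`, and `has_internal_promoter("U6 snRNA")` returns `False` because the U6 snRNA promoter lies upstream of the gene.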
Internal promoters are possible because the job of a promoter is simply to
attract transcription factors that will guide the RNA polymerase to the correct
transcriptional start site. By the time the polymerase is in place and ready to initi­
ate transcription, any transcription factors previously bound to downstream
promoter elements will have been removed from the template strand. As an
example, transcription of a tRNA gene begins with the following sequence:
• TFIIIC (transcription factor for polymerase III C) binds to the A and B boxes of
the internal promoter of a tRNA gene (see Figure 1.15).
• Bound TFIIIC guides the binding of another transcription factor, TFIIIB, to a
position upstream of the transcriptional start site; TFIIIC is no longer required
and any bound TFIIIC is removed from the internal promoter.
• TFIIIB guides RNA polymerase III to bind to the transcriptional start site.

Figure 1.15 Promoter elements in three genes transcribed by RNA polymerase III.
(A) tRNA genes have an internal promoter consisting of an A box (located within
the D arm of the tRNA; see Figure 1.9B) and a B box that is usually found in the
TψC arm. (B) The promoter of the Xenopus 5S rRNA gene has three components: an
A box (+50 to +60), an intermediate element (IE; +67 to +72), and the C box (+80
to +90). (C) The human U6 snRNA gene has an external promoter consisting of
three components. A distal sequence element (DSE; -240 to -215) enhances
transcription and works alongside a core promoter composed of a proximal
sequence element (PSE; -65 to -48) and a TATA box (-32 to -25). Arrows mark the
+1 position.

1.4 RNA PROCESSING


The RNA transcript of most eukaryotic genes undergoes a series of processing
reactions to make a mature mRNA or noncoding RNA.

RNA splicing removes unwanted sequences from the primary transcript

For most vertebrate genes (almost all protein-coding genes and some RNA
genes) only a small portion of the gene sequence is eventually decoded to give
the final product. In these cases the genetic instructions for making an mRNA or
mature noncoding RNA occur in exon segments that are separated by intervening
intron sequences that do not contribute genetic information to the final
product.
Transcription of a gene initially produces a primary transcript RNA that is
complementary to the entire length of the gene, including both exons and introns.
This primary transcript then undergoes RNA splicing, which is a series of
reactions whereby the intronic RNA segments are removed and discarded while the
remaining exonic RNA segments are joined end-to-end, to give a shorter RNA
product (Figure 1.16).
RNA splicing requires recognition of the nucleotide sequences at the
boundaries of transcribed exons and introns (splice junctions). The dinucleotides
at the ends of introns are highly conserved: the vast majority of introns start
with a GT (becoming GU in intronic RNA) and end with an AG (the GT-AG rule).
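
The net effect of splicing, together with the GT-AG rule, can be illustrated with a minimal Python sketch. The function name, the coordinate convention (0-based, end-exclusive exon intervals on the sense strand), and the toy gene are hypothetical; the sketch checks only the terminal dinucleotides and ignores the other sequences that real splicing depends on.

```python
def splice(gene_sense_5to3, exons):
    """Remove introns and join exons in their original linear order.

    gene_sense_5to3: sense-strand DNA sequence of the transcription unit.
    exons: ordered list of (start, end) intervals, 0-based and end-exclusive.

    Each intron (the gap between consecutive exons) must obey the GT-AG rule.
    Returns the spliced RNA, written with U in place of T.
    """
    for (_, upstream_end), (downstream_start, _) in zip(exons, exons[1:]):
        intron = gene_sense_5to3[upstream_end:downstream_start]
        if not (intron.startswith("GT") and intron.endswith("AG")):
            raise ValueError(f"intron {intron!r} does not follow the GT-AG rule")
    spliced_dna = "".join(gene_sense_5to3[start:end] for start, end in exons)
    return spliced_dna.replace("T", "U")


# Hypothetical two-exon gene: exon 1, a GT...AG intron, exon 2.
gene = "ATGGCC" + "GTAAGTCCAG" + "GGATAA"
mature_rna = splice(gene, [(0, 6), (16, 22)])
```

A call with an intron that does not begin with GT or end with AG raises `ValueError`, mirroring the fact that the conserved dinucleotides are required for recognition of the splice junctions.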

Figure 1.16 The process of RNA splicing. (A) In this example, the gene contains
three exons and two introns. (B) The primary RNA transcript is a continuous RNA
copy of the gene and contains sequences transcribed from exons (E1, E2, and E3)
and introns. (C) The primary transcript is cleaved at regions corresponding to
exon-intron boundaries (splice junctions). The RNA copies of the introns are
snipped out and discarded. (D) The RNA copies of the exons are retained and then
fused together (spliced) in the same linear order as in the genomic DNA sequence.

[Figure labels: (A) transcription unit with exon 1, intron 1, exon 2, intron 2,
exon 3; each intron begins with gt and ends with ag. Step: transcription of gene.
(B) Primary transcript, with introns beginning gu and ending ag. Step: cleave
primary RNA transcript and discard intronic sequences 1 and 2. (C) Separate E1,
E2, and E3 segments. Step: splicing of exonic sequences 1, 2, and 3 to produce
mature RNA. (D) E1, E2, and E3 joined.]

Figure 1.17 Three consensus DNA sequences in introns of complex eukaryotes.
Most introns in eukaryotic genes contain conserved sequences that correspond to
three functionally important regions. Two of the regions, the splice donor site
and the splice acceptor site, span the 5' and 3' boundaries of the intron. The
branch site is an additional important region that typically occurs less than 20
nucleotides upstream of the splice acceptor site. The nucleotides shown in red in
these three consensus sequences are almost invariant. The other nucleotides
detailed in both the intron and the exons are those most commonly found at each
position. In some instances, two nucleotides may be equally common, as in the
case of C and T near the 3' end of the intron. Where N appears, any of the four
nucleotides may occur.

[Figure labels: exon; splice donor site; branch site; splice acceptor site; exon.
The intron is drawn with a length range extending to > 10,000 nucleotides, and
the branch site lies < 20 nucleotides from the splice acceptor site.]

Although the conserved GT and AG dinucleotides are crucial for splicing, they
are not sufficient to mark the limits of an intron. The nucleotide sequences that
are immediately adjacent to them are also quite highly conserved, constituting
splice junction consensus sequences (Figure 1.17). A third conserved intronic
sequence that is also important in splicing is known as the branch site and is
typically located no more than 40 nucleotides upstream of the intron's 3' terminal
AG (see Figure 1.17). Other exonic and intronic sequences can promote splicing
(splice enhancer sequences) or inhibit it (splice silencer sequences), and
mutations in these sequences can cause disease.
The essential steps in splicing are as follows:
• Nucleophilic attack of the intron's 5' terminal G nucleotide by the invariant A
of the branch site consensus sequence, to form a lariat-shaped structure.
• Cleavage of the exon/intron junction at the splice donor site.
• Nucleophilic attack by the 3' end of the upstream exon on the splice acceptor
site, leading to cleavage and release of the intronic RNA in the form of a lariat,
and the splicing together of the two exonic RNA segments (Figure 1.18).
For genes residing in eukaryotic nuclei, RNA splicing is mediated by a large
RNA-protein complex, called the spliceosome. Spliceosomes have five types of
snRNA (small nuclear RNA) and more than 50 proteins. The snRNA molecules
associate with proteins to form small nuclear ribonucleoprotein (snRNP, or
snurp) particles. The specificity of the splicing reaction is established by
RNA-RNA base pairing between the RNA transcript to be spliced and snRNA
molecules within the spliceosome. There are two types of spliceosome:
• The major (GU-AG) spliceosome processes transcripts corresponding to
classical GT-AG introns. It contains five types of snRNA. U1 and U2 snRNAs
recognize and bind the splice donor and branch sites, respectively. U4, U5, and U6
snRNAs subsequently bind to cause looping out of the intronic RNA (Figure
1.19).
• The minor (AU-AC) spliceosome processes transcripts corresponding to rare
AU-AC introns. It also has five snRNAs but uses U11 and U12 snRNA instead
of U1 and U2 and has variants of U4 and U6 snRNA.
Once a splice donor site is recognized by the spliceosome, it scans the RNA
sequence until it meets the next splice acceptor site (signaled as a target by the
upstream presence of the branch site consensus sequence).

Specialized nucleotides are added to the ends of most RNA polymerase II transcripts

In addition to RNA splicing, the ends of RNA polymerase II transcripts undergo
modifications: the 5' end is capped by adding a variant guanine by using an
unusual phosphodiester bond, and a long sequence of adenines is added to the
3' end. As well as protecting the ends from cellular exonucleases, these
modifications may assist the correct functioning of the RNA transcripts.

Figure 1.18 The mechanism of RNA splicing. (A) The unprocessed primary
RNA transcript with intronic RNA separating sequences E1 and E2 that
correspond to exons in DNA. The splicing mechanism involves a nucleophilic
attack on the G of the 5' GU dinucleotide. This is carried out by the 2' OH group
on the conserved A of the branch site and results in the formation of a lariat
structure (B), and cleavage of the splice donor site. The 3' OH at the 3' end of the
E1 sequence performs a nucleophilic attack on the splice acceptor site, causing
release of the intronic RNA (as a lariat-shaped structure) and (C) fusion (splicing)
of E1 and E2.

[Figure labels: splice donor site; splice acceptor site. Steps: nucleophilic
attack by A in branch site at the 5' terminal G of intronic RNA; nucleophilic
attack of splice acceptor site by 3' end of E1.]

Figure 1.19 Role of small nuclear ribonucleoproteins (snRNPs) in RNA
splicing. (A) The unprocessed primary RNA transcript as in Figure 1.18.
(B) Within the spliceosome, part of the U1 snRNA is complementary in sequence
to the splice donor site consensus sequence. As a result, the U1 snRNA-protein
complex (U1 snRNP) binds to the splice junction by RNA-RNA base pairing. The
U2 snRNP complex similarly binds to the branch site by RNA-RNA base pairing.
Interaction between the splice donor and splice acceptor sites is stabilized
by (C) the binding of a multi-snRNP particle that contains the U4, U5, and U6
snRNAs. The U5 snRNP binds simultaneously to both the splice donor and splice
acceptor sites. Their cleavage releases the intronic sequence and allows (D) E1
and E2 to be spliced together.

[Figure steps: U1 snRNP binds to splice donor site, and U2 snRNP binds to branch
site; binding of U4/U5/U6 snRNP complex, with U5 snRNP binding to donor and
acceptor sites; cleavage of splice donor and acceptor sites, and splicing of E1
to E2.]

5' capping
Shortly after the initiation of synthesis of primary RNA transcripts that will
become mRNA, a methylated nucleoside (7-methylguanosine, m7G) is linked by
a 5'-5' phosphodiester bond to the first 5' nucleotide. This is described as the
capping of the 5' end of the transcript (Figure 1.20); the caps of snRNA gene
transcripts may undergo additional modification. The 5' cap may have several
functions:
• to protect the transcript from 5'->3' exonuclease attack (uncapped mRNA
molecules are rapidly degraded);
• to facilitate transport of mRNAs from the nucleus to the cytoplasm;
• to facilitate RNA splicing; and
• to facilitate attachment of the 40S subunit of cytoplasmic ribosomes to mRNA
during translation.

3' polyadenylation
Transcription by both RNA polymerase I and III stops after the enzyme
recognizes a specific transcription termination site. However, the 3' ends of mRNA
molecules are determined by a post-transcriptional cleavage reaction. The
sequence AAUAAA (sometimes AUUAAA) signals the 3' cleavage for most
polymerase II transcripts.
Cleavage occurs at a specific site 15-30 nucleotides downstream of the
AAUAAA sequence, although the primary transcript may continue for hundreds
or even thousands of nucleotides past the cleavage point. After cleavage has
occurred, the enzyme poly(A) polymerase sequentially adds adenylate (AMP)
residues to the 3' end (about 200 in the case of mammalian mRNA). This
polyadenylation reaction (Figure 1.21) produces a poly(A) tail that is thought to:
• Help transport mRNA to the cytoplasm.
• Stabilize at least some mRNA molecules in the cytoplasm.
• Enhance recognition of mRNA by the ribosomal machinery.
Histone genes are unique in producing mRNA that does not become
polyadenylated; termination of their transcription nevertheless also involves
3' cleavage of the primary transcript.
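
The cleavage and polyadenylation steps can be sketched in Python as follows. The function name and the toy transcript are hypothetical, and two modeling choices are assumptions: the cleavage distance is measured from the end of the signal, and a single fixed offset within the 15-30 nucleotide window stands in for the real, sequence-dependent cleavage site.

```python
import re


def cleave_and_polyadenylate(transcript, offset=20, tail_length=200):
    """Cleave downstream of the polyadenylation signal and add a poly(A) tail.

    transcript: primary RNA transcript, written 5'->3'.
    offset: assumed distance (15-30 nt) from the end of the signal to the cut.
    tail_length: number of adenylate residues added (about 200 in mammals).

    Returns the processed RNA, or None if no usable signal is found.
    """
    if not 15 <= offset <= 30:
        raise ValueError("offset must lie within the 15-30 nucleotide window")
    signal = re.search("A[AU]UAAA", transcript)   # AAUAAA or AUUAAA
    if signal is None:
        return None
    cut = signal.end() + offset
    if cut > len(transcript):
        return None            # transcript too short to reach the cleavage site
    return transcript[:cut] + "A" * tail_length


# Hypothetical primary transcript that runs on well past the cleavage point.
primary = "GGC" + "AAUAAA" + "C" * 50
processed = cleave_and_polyadenylate(primary)
```

Everything downstream of the cut is discarded, which is why the 3' end of the mature mRNA is set by cleavage rather than by the point at which transcription stops.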
Figure 1.20 The 5' cap of a eukaryotic mRNA. The nucleotide components in pink represent the residue of the original 5' end of a eukaryotic pre-mRNA. The primary pre-mRNA transcript begins with a nucleotide that contains a purine (Pu) base and a 5' triphosphate group. However, as the pre-mRNA undergoes processing, the end phosphate group at the 5' end is excised with a phosphatase to leave a 5' diphosphate group, and a specialized nucleotide is covalently joined to form a cap that will protect mRNA from exonuclease attack and assist in the initiation of translation. The cap nucleotide (with base shown in red) is first formed when a GTP residue is cleaved to generate a guanosine monophosphate that is then added through a 5'-5' triphosphate linkage (pale peach shading) to the diphosphate group of the original purine end nucleotide. Subsequently nitrogen atom 7 of the new 5' terminal G is methylated. In mRNAs synthesized in vertebrate cells, the 2' carbon atom of the ribose of each of the two adjacent nucleotides, the original purine end-nucleotide and its neighbor, is also methylated, as illustrated in this example. m7G, 7-methylguanosine; N, any nucleotide.

Figure 1.21 Polyadenylation of 3' ends of eukaryotic mRNAs. (A, B) As RNA polymerase II advances to transcribe a gene it carries at its rear two multiprotein complexes required for polyadenylation: CPSF (cleavage and polyadenylation specificity factor) and CStF (cleavage stimulation factor) that cooperate to identify a polyadenylation signal downstream of the termination codon in the RNA transcript and to cut the transcript. The polyadenylation signal comprises an AAUAAA sequence or close variant and some poorly understood downstream signals. (C) Cleavage occurs normally about 15-30 nucleotides downstream of the AAUAAA element, and (D) AMP residues are subsequently added by poly(A) polymerase to form a poly(A) tail.
Panels: (A) DNA, 5' AATAAA 3' paired with 3' TTATTT 5'; (B) capped transcript, 5' m7Gppp ... AAUAAA ... 3'; (C) transcript after cleavage 15-30 nucleotides downstream of AAUAAA; (D) polyadenylated transcript, 5' m7Gppp ... AAUAAA ... AAAAAAA...AAA-OH 3'.

rRNA and tRNA transcripts undergo extensive processing


Four major classes of eukaryotic rRNA have been identified: 28S, 18S, 5.8S, and 5S rRNA (S is the Svedberg unit of the sedimentation coefficient, a measure of how fast large molecular structures sediment in an ultracentrifuge, which depends on both size and shape). 18S rRNA is found in the small subunits of ribosomes; the other three are components of the large subunit. Very large amounts of rRNA are required for cells to perform protein synthesis, so many genes are devoted to making rRNA in the nucleolus, a visibly distinct compartment of the nucleus.
In human cells a cluster of approximately 250 genes encodes 5S rRNA, which is transcribed by RNA polymerase III; this polymerase also transcribes some other small RNA species. The 28S, 18S, and 5.8S rRNAs are encoded by consecutive genes on a common 13 kb transcription unit (Figure 1.22) that is transcribed by RNA polymerase I. A compound unit of the 13 kb transcription unit and an adjacent 27 kb non-transcribed spacer is tandemly repeated about 30-40 times at the nucleolar organizer regions on the short arms of each of the five human acrocentric chromosomes (13, 14, 15, 21, and 22). These five clusters of rRNA genes, each about 1.5 million bases (Mb) long, are sometimes referred to as ribosomal DNA (rDNA).
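As a quick consistency check on the figures quoted above, the size of one rDNA cluster can be worked out from the repeat length and copy number. The sketch below uses only the numbers given in the text; the variable and function names are illustrative.

```python
TRANSCRIPTION_UNIT_KB = 13   # 18S-5.8S-28S transcription unit
SPACER_KB = 27               # adjacent non-transcribed spacer
REPEAT_KB = TRANSCRIPTION_UNIT_KB + SPACER_KB   # 40 kb compound unit

def cluster_size_mb(copies):
    """Size in Mb of a tandem array containing the given number of repeats."""
    return REPEAT_KB * copies / 1000

low, high = cluster_size_mb(30), cluster_size_mb(40)
print(f"{low:.1f}-{high:.1f} Mb per cluster")
```

With 30-40 copies of a 40 kb unit the estimate is 1.2-1.6 Mb per cluster, which agrees with the quoted value of about 1.5 Mb.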
Figure 1.22 The major rRNA species are synthesized by cleavage of a shared primary transcript. (A) In human cells, the 18S, 5.8S, and 28S rRNAs are encoded by a single transcription unit that is 13 kb long. It occurs within tandem repeat units of about 40 kb that also include a roughly 27 kb non-transcribed (intergenic) spacer. (B) Transcription by RNA polymerase I produces a 13 kb primary transcript (45S rRNA) that then undergoes a complex series of post-transcriptional cleavages. (C-E) Ultimately, individual 18S, 28S, and 5.8S rRNA molecules are released. The 18S rRNA will form part of the small ribosomal subunit. The 5.8S rRNA binds to a complementary segment of the 28S rRNA; the resulting complex will form part of the large ribosomal subunit. The latter also contains 5S rRNA, which is encoded separately by dedicated genes transcribed by RNA polymerase III.
Panels: (A) rDNA repeat with the 13 kb transcription unit (18S, 5.8S, and 28S genes separated by spacers) and the 27 kb intergenic spacer; (B) 45S primary transcript; (C) 41S intermediate; (D) 20S and 32S intermediates; (E) mature 18S rRNA, and 5.8S rRNA paired with 28S rRNA.

In addition to the sequence of cleavage reactions (see Figure 1.22), the primary rRNA transcript also undergoes a variety of base-specific modifications. This extensive RNA processing is undertaken by many different small nucleolar RNAs that are encoded by about 200 different genes in the human genome.
Mature tRNA molecules also undergo extensive base modifications, and about 10% of the bases in any tRNA are modified versions of A, C, G, or U. Common examples of modified nucleosides include dihydrouridine, which has extra hydrogens at carbons 5 and 6; pseudouridine, an isomer of uridine; inosine (deaminated adenosine); and N,N'-dimethylguanosine.

1.5 TRANSLATION, POST-TRANSLATIONAL PROCESSING, AND PROTEIN STRUCTURE
The mRNA produced by genes in the nucleus migrates to the cytoplasm, where it engages with ribosomes and other components to initiate translation and polypeptide synthesis. Messenger RNA transcribed from genes in the mitochondria and chloroplasts is translated on dedicated ribosomes within these organelles.
Only a central segment of a eukaryotic mRNA molecule is translated to make a polypeptide. The flanking untranslated regions (the 5' UTR and 3' UTR) are transcribed from exon sequences present at the 5' and 3' ends of the gene. They assist in binding and stabilizing the mRNA on the ribosomes, and promote efficient translation (Figure 1.23).
Ribosomes are large RNA-protein complexes composed of two subunits. In eukaryotes, cytoplasmic ribosomes have a large 60S subunit and a smaller 40S subunit. The 60S subunit contains three types of rRNA molecule: 28S rRNA, 5.8S rRNA, and 5S rRNA, as well as about 50 ribosomal proteins. The 40S subunit contains a single 18S rRNA and more than 30 ribosomal proteins. Ribosomes provide the structural framework for polypeptide synthesis. The RNA components are predominantly responsible for the catalytic function of the ribosome; the protein components are thought to enhance the function of the rRNA molecules, although a surprising number of them do not seem to be essential for ribosome function.

mRNA is decoded to specify polypeptides


The assembly of a new polypeptide from its constituent amino acids is governed by a triplet genetic code. Within an mRNA the central nucleotide sequence that is used to make polypeptide is scanned from 5' to 3' on the ribosome in groups of three nucleotides (codons). Each codon specifies an amino acid, and the decoding process uses a collection of different tRNA molecules, each of which binds one type of amino acid. An amino acid-tRNA complex is known as an aminoacyl tRNA and is formed when a dedicated aminoacyl tRNA synthetase covalently links the required amino acid to the terminal adenosine in the conserved CCA trinucleotide at the 3' end of the tRNA.

Figure 1.23 Transcription and translation of the human β-globin gene. (A) The β-globin gene comprises three exons (E1-E3) and two introns. The 5' end sequence of E1 and the 3' end sequence of E3 are noncoding sequences (unshaded sections). (B) These sequences are transcribed and so occur at the 5' and 3' ends (unshaded sections) of the β-globin mRNA that emerges from RNA processing. They are not, however, translated and so do not specify any part of the precursor polypeptide (C). This figure also illustrates that some codons can be specified by bases that are separated by an intron. The arginine at position 104 in the β-globin polypeptide is encoded by the last three nucleotides (AGG) of exon 2 but the arginine at position 30 is encoded by an AGG codon whose first two bases are encoded by the last two nucleotides of exon 1 and whose third base is encoded by the first nucleotide of exon 2. (D) During post-translational modification the 147-amino acid precursor polypeptide undergoes cleavage to remove its N-terminal methionine residue, to generate the mature 146-residue β-globin protein. The flanking N and C symbols to the left and right, respectively, in (C) and (D) depict the N-terminus (N) and C-terminus (C).
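The point made in the Figure 1.23 legend, that a single codon can be assembled from bases lying either side of an intron, follows from simple arithmetic on the coding length of each exon. The sketch below is a minimal illustration; the function name and the toy exon segments are hypothetical and are not the real β-globin sequence.

```python
def codons_spanning_junctions(coding_exons):
    """Return (codon_number, codon, bases_from_upstream_exon) for split codons.

    coding_exons: coding-sequence segments, one per exon, in order, with the
    first segment starting at the first base of the start codon.
    """
    cds = "".join(coding_exons)
    split = []
    boundary = 0
    for segment in coding_exons[:-1]:
        boundary += len(segment)      # index of the first base of the next exon
        phase = boundary % 3          # bases of the codon left in the upstream exon
        if phase:                     # 0 means the junction falls between codons
            start = boundary - phase
            split.append((start // 3 + 1, cds[start:start + 3], phase))
    return split

# Hypothetical toy exons chosen so that an AGG codon is split 2 + 1,
# mirroring the arrangement described for the arginine at position 30.
toy_exons = ["AUGGUGAG", "GCUGUAA"]
print(codons_spanning_junctions(toy_exons))
```

When the cumulative coding length at a junction is a multiple of three, as for the exon 2 to exon 3 junction described in the legend, no codon is split and the function reports nothing for that junction.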

Figure 1.24 In translation, the genetic code is deciphered on ribosomes by codon-anticodon recognition. (A) The large ribosomal subunit (60S in eukaryotes) has two sites for binding an aminoacyl tRNA (a transfer RNA with its attached amino acid): the P (peptidyl) site and the A (aminoacyl) site. The small ribosomal subunit (40S in eukaryotes) binds mRNA, which is scanned along its 5' UTR in a 5'→3' direction until the start codon is identified, an AUG located within a larger consensus sequence (see the text). An initiator tRNAMet carrying a methionine residue binds to the P site with its anticodon in register with the AUG start codon. (B) The appropriate aminoacyl tRNA is bound to the A site with its anticodon base-pairing with the next codon (GGG in this case, specifying glycine). (C) The rRNA in the large subunit catalyzes peptide bond formation, resulting in the methionine detaching from its tRNA and being bound instead to the glycine attached to the tRNA held at the A site. (D) The ribosome translocates along the mRNA so that the tRNA bearing the Met-Gly dipeptide is bound by the P site. The next aminoacyl tRNA (here, carrying Tyr) binds to the A site in preparation for new peptide bond formation. (E) Peptide bond formation. The N atom of the amino group of the amino acid bound to the tRNA in the A site makes a nucleophilic attack on the carboxyl C atom of the amino acid held by the tRNA bound to the P site.

Each tRNA has its own anticodon, a trinucleotide at the center of the anticodon arm (see Figure 1.9B) that provides the necessary specificity to interpret the genetic code. For an amino acid to be added to a growing polypeptide, the relevant codon of the mRNA molecule must be recognized by base pairing with a complementary anticodon on the appropriate aminoacyl tRNA molecule. This happens on the ribosome. The small ribosomal subunit binds the mRNA, and the large subunit has two sites for binding aminoacyl tRNAs, namely a P (peptidyl) site and an A (aminoacyl) site (Figure 1.24).
The cap at the 5' end of messenger RNA molecules is important in initiating translation. It is recognized by certain key proteins that bind the small ribosomal subunit, and these initiation factors hold the mRNA in place. In cap-dependent translation initiation, the ribosome scans the 5' UTR of the mRNA in the 5'→3' direction to find a suitable initiation codon, an AUG that is found within the Kozak consensus sequence 5'-GCCPuCCAUGG-3' (where Pu represents purine). The most important determinants are the G at position +4 (immediately following the AUG codon), and the purine (preferably A) at -3 (three nucleotides upstream of the AUG codon).
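The two key determinants of the Kozak context can be checked mechanically. The sketch below is only an illustration: the function names and the labels "weak", "adequate", and "strong" are assumptions introduced here, and the score uses nothing beyond the purine at -3 and the G at +4 highlighted in the text.

```python
def kozak_strength(mrna, aug_index):
    """Classify the context of the AUG that starts at aug_index (0-based).

    The A of AUG is +1, so +4 is the base immediately after the AUG and
    -3 is three bases upstream of the A (there is no position 0).
    """
    minus3 = mrna[aug_index - 3] if aug_index >= 3 else ""
    plus4 = mrna[aug_index + 3] if aug_index + 3 < len(mrna) else ""
    score = (minus3 in ("A", "G")) + (plus4 == "G")
    return ("weak", "adequate", "strong")[score]

def scan_for_start(mrna):
    """List every AUG in 5'->3' order with its context classification."""
    mrna = mrna.upper().replace("T", "U")
    results = []
    idx = mrna.find("AUG")
    while idx != -1:
        results.append((idx, kozak_strength(mrna, idx)))
        idx = mrna.find("AUG", idx + 1)
    return results

# The consensus from the text with Pu = A, followed by two arbitrary bases.
print(scan_for_start("GCCACCAUGGCG"))
# A hypothetical AUG in a poor context (U at -3, C at +4).
print(scan_for_start("UUUUUUAUGCUU"))
```

Reporting every AUG with its context, rather than stopping at the first, keeps the sketch neutral about how a scanning ribosome chooses among candidates.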

When a suitable initiation codon is identified, an initiating tRNAMet with its attached methionine binds to the P site on the large ribosomal subunit so that its anticodon base-pairs with the AUG initiator codon on the mRNA (see Figure 1.24). Once this has happened, the translational reading frame is established and codons are interpreted as successive groups of three nucleotides continuing in the 5'→3' direction downstream of the initiating AUG codon. An aminoacyl tRNA for the second codon (a tRNAGly to recognize GGG in the example of Figure 1.24) binds to the neighboring A site in the large subunit.
Once the P and A sites are occupied by aminoacyl tRNAs, the largest rRNA component within the large subunit of the ribosome is thought to act as a peptidyl transferase. It catalyzes the formation of a peptide bond by a condensation reaction between the amino group of the amino acid held by the tRNA in the A site and the carboxyl group of the methionine held by the tRNAMet. The net result is to detach the initiator methionine from its tRNA and attach it to the second amino acid, forming a dipeptide (see Figure 1.24). Now without any attached amino acid, the tRNAMet migrates away from the P site and its place is taken by the tRNA with the attached dipeptide that formerly occupied the A site. The liberated A site is now filled by an aminoacyl tRNA carrying an anticodon that is complementary to the third codon, and a new peptide bond is formed to make a tripeptide, and so on.
After a ribosome has initiated translation of an mRNA and has then moved along the mRNA, other ribosomes can engage with the same mRNA. The resulting polyribosome structures (polysomes) make multiple copies of a polypeptide from the one mRNA molecule. Polypeptide chain elongation occurs until a termination codon is met. For mRNA transcribed from nuclear genes, termination codons come in three varieties: UAA (ochre), UAG (amber), and UGA (opal), but there are some differences for mitochondrial mRNA as described in the next section.
In response to a termination codon a protein release factor enters the A site instead of an aminoacyl tRNA to signal that the polypeptide should disengage from the ribosome. The completed polypeptide will then undergo processing that can include cleavage and modification of the side chains. Its backbone will have a free amino group at one end (the N-terminal end) and a free carboxyl group at the other end (the C-terminal end).
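The sequence of events described above (find the initiating AUG, read successive triplets 5'→3', stop at a termination codon) can be mimicked by a short decoding routine. This is a sketch under simplifying assumptions: it starts at the first AUG regardless of Kozak context, uses the standard nuclear code only, and expects a sequence containing only A, C, G, and U (or T). The names are illustrative.

```python
from itertools import product

BASES = "UCAG"
# Standard nuclear genetic code in UCAG order; '*' marks UAA, UAG, and UGA.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna):
    """Return the one-letter peptide read from the first AUG to the first stop."""
    mrna = mrna.upper().replace("T", "U")
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":          # termination codon: release the chain
            break
        peptide.append(amino_acid)
    return "".join(peptide)

# The AUG GGG UAC codons of Figure 1.24, with a hypothetical flank and stop codon.
print(translate("GGAUGGGGUACUAAGG"))
```

The example reproduces the Met-Gly-Tyr order of Figure 1.24; the returned string is written from the N-terminus to the C-terminus.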

The genetic code is degenerate and not quite universal


The genetic code is a three-letter code, and there are four possible bases to choose from at each of the three base positions in a codon. There are therefore 4³ = 64 possible codons, which is more than sufficient to encode the 20 major types of amino acid. The genetic code is degenerate because, on average, each amino acid is specified by about three different codons. Some amino acids (such as leucine, serine, and arginine) are specified by as many as six codons; others are much more poorly represented (Figure 1.25). The degeneracy of the genetic code most often involves the third base of the codon.
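The counting argument in this paragraph can be verified directly from the standard code. The following sketch builds the 64-codon table and tallies how many codons specify each amino acid; the compact table string and variable names are conveniences introduced for the example.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard nuclear genetic code in UCAG order; '*' marks the stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

codons_per_amino_acid = Counter(aa for aa in CODON_TABLE.values() if aa != "*")
sense_codons = sum(codons_per_amino_acid.values())

print(len(CODON_TABLE))                                   # all possible triplets
print(sense_codons, len(codons_per_amino_acid))           # sense codons, amino acids
print(round(sense_codons / len(codons_per_amino_acid), 2))
print(sorted(aa for aa, n in codons_per_amino_acid.items() if n == 6))
print(sorted(aa for aa, n in codons_per_amino_acid.items() if n == 1))
```

The tally gives 61 sense codons for 20 amino acids, an average of about three each, with leucine, arginine, and serine (L, R, S) at six codons and methionine and tryptophan (M, W) at one.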
Figure 1.25 The genetic code. All 64 possible codons of the genetic code and the amino acid specified by each, as read in the 5'→3' direction from the mRNA sequence. The interpretations of the 64 codons in the 'universal' genetic code are shown in black immediately to the right of the codons. Sixty-one codons specify an amino acid. Three STOP codons (UAA, UAG, and UGA) do not encode any amino acid. The genetic code for mitochondrial mRNA (mtDNA) conforms to the universal code except for a few variants. For example, in the mitochondrial genetic code in humans and many other species four codons are used differently: UGA encodes tryptophan instead of being a STOP codon, AUA encodes methionine, and instead of encoding arginine, AGA and AGG are STOP codons.

Codons                Universal code   mtDNA variant
AAA, AAG              Lys
AAC, AAU              Asn
CAA, CAG              Gln
CAC, CAU              His
GAA, GAG              Glu
GAC, GAU              Asp
UAA, UAG              STOP
UAC, UAU              Tyr
ACA, ACG, ACC, ACU    Thr
CCA, CCG, CCC, CCU    Pro
GCA, GCG, GCC, GCU    Ala
UCA, UCG, UCC, UCU    Ser
AGA, AGG              Arg              STOP
AGC, AGU              Ser
CGA, CGG, CGC, CGU    Arg
GGA, GGG, GGC, GGU    Gly
UGA                   STOP             Trp
UGG                   Trp
UGC, UGU              Cys
AUA                   Ile              Met
AUG                   Met
AUC, AUU              Ile
CUA, CUG, CUC, CUU    Leu
GUA, GUG, GUC, GUU    Val
UUA, UUG              Leu
UUC, UUU              Phe

Although more than 60 codons can specify an amino acid, the number of different cytoplasmic tRNA molecules is quite a bit smaller, and only 22 types of mitochondrial tRNA are made. The interpretation of more than 60 sense codons with a much smaller number of different tRNAs is possible because base pairing in RNA is more flexible than in DNA. Pairing of codon and anticodon follows the normal A-U and G-C rules for the first two base positions in a codon. However, at the third position there is some flexibility (base wobble), and GU base pairs are tolerated here (Table 1.6).

TABLE 1.6 RULES FOR BASE PAIRING CAN BE RELAXED (WOBBLE) AT POSITION 3 OF A CODON

Base at 5' end of tRNA anticodon   Base recognized at 3' end of mRNA codon
A                                  U only
C                                  G only
G (or I)a                          C or U
U                                  A or G

aInosine (I) is a deaminated form of adenosine.

The genetic code is the same throughout nearly all life forms. However, mitochondria and chloroplasts have a limited capacity for protein synthesis, and during evolution their genetic codes have diverged slightly from that used at cytoplasmic ribosomes. Translation of nuclear-encoded mRNA continues until one of three stop codons is encountered (UAA, UAG, or UGA) but in mammalian mitochondria there are four possibilities (UAA, UAG, AGA, and AGG).
The meaning of a codon can also be dependent upon the sequence context; that is, the nature of the nucleotide sequence in which it is embedded. Depending on the surrounding sequence, some codons in a few types of nuclear-encoded mRNA can be interpreted differently from normal. For example, in a wide variety of cells the stop codon UGA is alternatively interpreted as encoding selenocysteine with some nuclear-encoded mRNAs, and UAG can sometimes be interpreted to encode glutamine.
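The wobble rules of Table 1.6 lend themselves to a small lookup. The sketch below lists the codons that a given anticodon could read if only those rules applied; the function name is hypothetical, and the simplified rules treat inosine exactly as the table does.

```python
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}
# Table 1.6: base at the 5' end of the anticodon -> bases accepted at the
# 3' end (third position) of the codon.
WOBBLE = {"A": "U", "C": "G", "G": "CU", "I": "CU", "U": "AG"}

def codons_recognized(anticodon):
    """Anticodon written 5'->3'; returns the codons (5'->3') it can read.

    The first and second anticodon bases may be anything in the tables above;
    this sketch assumes the second and third are standard A, C, G, or U.
    """
    first, second, third = anticodon.upper()
    # Pairing is antiparallel: codon base 1 pairs with anticodon base 3,
    # codon base 2 with anticodon base 2, both by the normal rules.
    prefix = WATSON_CRICK[third] + WATSON_CRICK[second]
    return [prefix + base for base in WOBBLE[first]]

print(codons_recognized("GCC"))   # both codons specify glycine
print(codons_recognized("CAU"))   # the methionine anticodon
```

An anticodon with G at its 5' end reads two codons that differ only at the third position, which is how fewer tRNAs than sense codons can suffice.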

Post-translational processing: chemical modification of amino acids and polypeptide cleavage
Primary translation products often undergo a variety of modifications during or after translation. Simple or complex chemical groups are often covalently attached to the side chains of certain amino acids (Table 1.7). In addition, polypeptides may occasionally be cleaved to yield one or more active polypeptide products.

TABLE 1.7 MAJOR TYPES OF MODIFICATION OF POLYPEPTIDES

Type of modification (group added) | Target amino acid(s) | Notes
Phosphorylation (PO4-) | Tyr, Ser, Thr | achieved by specific kinases; may be reversed by phosphatases
Methylation (CH3) | Lys | achieved by methylases; reversed by demethylases
Hydroxylation (OH) | Pro, Lys, Asp | hydroxyproline (Hyp) and hydroxylysine (Hyl) are particularly common in collagens
Acetylation (CH3CO) | Lys | achieved by an acetylase; reversed by deacetylase
Carboxylation (COOH) | Glu | achieved by γ-carboxylase
N-glycosylation (complex carbohydrate) | Asn(a) | takes place initially in the endoplasmic reticulum, with later additional changes occurring in the Golgi apparatus
O-glycosylation (complex carbohydrate) | Ser, Thr, Hyl(b) | takes place in the Golgi apparatus; less common than N-glycosylation
Glycosylphosphatidylinositol (glycolipid) | Asp(c) | serves to anchor protein to outer layer of plasma membrane
Myristoylation (C14 fatty acyl group) | Gly(d) | serves as membrane anchor
Palmitoylation (C16 fatty acyl group) | Cys(e) | serves as membrane anchor
Farnesylation (C15 prenyl group) | Cys(c) | serves as membrane anchor
Geranylgeranylation (C20 prenyl group) | Cys(c) | serves as membrane anchor

(a) This is especially common when Asn is in the sequence Asn-X-(Ser/Thr), where X is any amino acid other than Pro. (b) Hydroxylysine. (c) At C-terminus of polypeptide. (d) At N-terminus of polypeptide. (e) To form S-palmitoyl link.

Addition of carbohydrate groups
Glycoproteins have oligosaccharides covalently attached to the side chains of certain amino acids. Few proteins in the cytosol are glycosylated (carry an attached carbohydrate); if they are, they have a single sugar residue, N-acetylglucosamine, attached to a serine or threonine residue. However, proteins that are secreted from cells or transported to lysosomes, the Golgi apparatus, or the plasma membrane are routinely glycosylated. In these cases, the sugars are assembled as oligosaccharides before being attached to the protein.
Two major types of glycosylation occur. Carbohydrate N-glycosylation involves attaching a carbohydrate group to the nitrogen atom of an asparagine side chain, and O-glycosylation entails adding a carbohydrate to the oxygen atom of an OH group carried by the side chains of certain amino acids (see Table 1.7).
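Table 1.7 notes that N-glycosylation is especially common where asparagine occurs in the sequence Asn-X-(Ser/Thr), with X being any amino acid other than proline. A pattern search for that motif might look like the sketch below; the function name and the toy peptide are hypothetical, and a match only marks a candidate site, not a confirmed modification.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr (one-letter codes).
# The lookahead allows overlapping motifs to be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")

def n_glycosylation_candidates(protein):
    """Return 1-based positions of Asn residues that lie in the motif."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]

# Hypothetical peptide: N3 (N-G-S) matches, N8 (N-P-T) is excluded by the
# proline, and N12 (N-N-T) and N13 (N-T-S) both match.
print(n_glycosylation_candidates("MKNGSAANPTLNNTS"))
```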
Proteoglycans are proteins with attached glycosaminoglycans (polysaccharides) that usually include repeating disaccharide units containing glucosamine or galactosamine. The best-characterized proteoglycans are components of the extracellular matrix, a complex network of macromolecules secreted by, and surrounding, cells in tissues or in culture systems.
Addition of lipid groups
Some proteins, notably membrane proteins, are modified by the addition of fatty acyl or prenyl groups. These added groups typically serve as membrane anchors: hydrophobic groups that secure a newly synthesized protein within either a plasma membrane or the endoplasmic reticulum (see Table 1.7).
Anchoring a protein to the outer layer of the plasma membrane involves the attachment of a glycosylphosphatidylinositol (GPI) group. This glycolipid group contains a fatty acyl group that serves as the membrane anchor; it is linked successively to a glycerophosphate unit, an oligosaccharide unit, and finally (through a phosphoethanolamine unit) to the C-terminus of the protein. The entire protein, except the GPI anchor, is located in the extracellular space.
Post-translational cleavage
The primary translation product may also undergo internal cleavage to generate a smaller mature product. Occasionally the initiating methionine is cleaved from the primary translation product, as during the synthesis of β-globin (see Figure 1.23C, D). More substantial polypeptide cleavage is observed during the maturation of many proteins, including plasma proteins, polypeptide hormones, neuropeptides, and growth factors. Cleavable signal sequences are often used to mark proteins either for export or for transport to a specific intracellular location. A single mRNA molecule can sometimes specify more than one functional polypeptide chain as a result of post-translational cleavage of a large precursor polypeptide (Figure 1.26).

TABLE 1.8 LEVELS OF PROTEIN STRUCTURE

Level | Definition | Notes
Primary | the linear sequence of amino acids in a polypeptide | can vary enormously in length from a few to thousands of amino acids
Secondary | the path that a polypeptide backbone follows within local regions of the primary structure | varies along the length of the polypeptide; common elements of secondary structure include the α-helix and β-pleated sheet
Tertiary | the overall three-dimensional structure of a polypeptide, arising from the combination of all of the secondary structures | can take various forms (e.g. globular, rod-like, tube, coil, sheet)
Quaternary | the aggregate structure of a multimeric protein (comprising more than one subunit, which may be of more than one type) | can be stabilized by disulfide bridges between subunits or ligand binding, and other factors

The complex relationship between amino acid sequence and protein structure
Proteins can be composed of one or more polypeptides, each of which may be subject to post-translational modification. Interactions between a protein and either of the following may substantially alter the conformation of that protein:
• A cofactor, such as a divalent cation (such as Ca2+, Fe2+, Cu2+, or Zn2+), or a small molecule required for functional enzyme activity (such as NAD+).
• A ligand (any molecule that a protein binds specifically).
Four different levels of structural organization in proteins have been distinguished and defined (see Table 1.8).
Even within a single polypeptide, there is ample scope for hydrogen bonding between different amino acid residues. This stabilizes the partial polar charges along the backbone of the polypeptide and has profound effects on that protein's overall shape. With regard to a protein's conformation, the most significant hydrogen bonds are those that occur between the oxygen of one peptide bond's carbonyl (C=O) group and the hydrogen of the amino (NH) group of another peptide bond. Several fundamental structural patterns (motifs) stabilized by hydrogen bonding within a single polypeptide have been identified, the most fundamental of which are described below.
The α-helix
This is a rigid cylinder that is stabilized by hydrogen bonding between the carbonyl oxygen of a peptide bond and the hydrogen atom of the amino nitrogen of a peptide bond located four amino acids away (Figure 1.27). α-Helices often occur in proteins that perform key cellular functions (such as transcription factors, where they are usually represented in the DNA-binding domains). Identical α-helices with a repeating arrangement of nonpolar side chains can coil round each other to form a particularly stable coiled coil. Coiled coils occur in many fibrous proteins, such as collagen of the extracellular matrix, the muscle protein tropomyosin, α-keratin in hair, and fibrinogen in blood clots.

Figure 1.26 Insulin synthesis involves multiple post-translational cleavages of polypeptide precursors. (A) The human insulin gene comprises three exons and two introns. The coding sequence (the part that will be used to make polypeptide) is shown in deep blue. It is confined to the 3' sequence of exon 2 and the 5' sequence of exon 3. (B) Exon 1 and the 5' part of exon 2 specify the 5' untranslated region (5' UTR), and the 3' end of exon 3 specifies the 3' UTR. The UTRs are transcribed and so are present at the ends of the mRNA. (C) A primary translation product, preproinsulin, has 110 residues and is cleaved to give (D) a 24-residue N-terminal leader sequence (that is required for the protein to cross the cell membrane but is thereafter discarded) plus an 86-residue proinsulin precursor. (E) Proinsulin is cleaved to give a central segment (the connecting peptide) that may maintain the conformation of the A and B chains of insulin before the formation of their interconnecting covalent disulfide bridges (see Figure 1.29).
Residue numbering shown in the panels: preproinsulin 1-110; leader sequence 1-24; connecting peptide 1-35; insulin B chain 1-30; insulin A chain 1-21.

Figure 1.27 The structure of a standard α-helix and an amphipathic α-helix. (A) The structure of an α-helix is stabilized by hydrogen bonding between the oxygen of the carbonyl group (C=O) of each peptide bond and the hydrogen on the peptide bond amide group (NH) of the fourth amino acid away, making the helix have 3.6 amino acid residues per turn. The side chains of each amino acid are located on the outside of the helix; there is almost no free space within the helix. Note that only the backbone of the polypeptide is shown, and some bonds have been omitted for clarity. (B) An amphipathic α-helix has tighter packing and has charged amino acids and hydrophobic amino acids located on different surfaces. Here we show an end view of such a helix: five positively charged arginine residues are clustered on one side of the helix, whereas the opposing side has a series of hydrophobic amino acids (mostly Ala, Leu, and Gly). The lines within the circle indicate neighboring residues: the initiator methionine (position 1) is connected to a leucine (2), which is connected to an arginine (3), which is adjacent to an alanine (4), and so on.
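The end-on view used to show an amphipathic helix follows from the figure of 3.6 residues per turn: each residue is rotated about 100 degrees around the helix axis relative to the one before. A minimal sketch of that calculation is given below; the function name is illustrative, and the four-residue input simply reuses the first residues named in the Figure 1.27 legend.

```python
RESIDUES_PER_TURN = 3.6

def helical_wheel(sequence):
    """Return (position, residue, angle in degrees) for an end-on view."""
    step = 360 / RESIDUES_PER_TURN      # about 100 degrees per residue
    return [
        (i + 1, residue, round(i * step) % 360)
        for i, residue in enumerate(sequence)
    ]

# Met, Leu, Arg, Ala: positions 1-4 as described for Figure 1.27B.
for position, residue, angle in helical_wheel("MLRA"):
    print(position, residue, angle)
```

Because 18 residues make exactly five turns, position 19 returns to the same angle as position 1. Residues spaced three or four apart (for example positions 1, 4, 5, and 8, at 0, 300, 40, and 340 degrees) fall on the same face, which is how a suitably patterned sequence can place charged and hydrophobic side chains on opposite surfaces.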
The β-pleated sheet
β-Pleated sheets are also stabilized by hydrogen bonding but, in this case, the hydrogen bonds occur between opposed peptide bonds in parallel or anti-parallel segments of the same polypeptide chain (Figure 1.28). β-Pleated sheets occur, often together with α-helices, at the core of most globular proteins.

Figure 1.28 The structure of a β-pleated sheet. Hydrogen bonding occurs here between the carbonyl (C=O) oxygens and amide (NH) hydrogens on adjacent segments of (A) parallel and (B) anti-parallel β-pleated sheets. (Adapted from Lehninger AL, Nelson DL & Cox MM (1993) Principles of Biochemistry, 2nd ed. With permission from WH Freeman and Company.)

Figure 1.29 Intrachain and interchain disulfide bridges in human insulin. Disulfide bridges (-S-S-) form by a condensation reaction between the sulfhydryl (-SH) groups on the side chains of cysteine residues. They can form between cysteine side chains within the same polypeptide (such as between positions 6 and 11 within the insulin A chain) and also between cysteine side chains on different interacting polypeptides such as the insulin A and B chains.
Insulin A chain: GIVEQCCTSICSLYQLENYCN
Insulin B chain: FVNQHLCGSHLVEALYLVCGERGFFYTPKT
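The cysteine positions available for disulfide bridging can be read off the two chain sequences shown in Figure 1.29. The sketch below assumes the one-letter sequences as transcribed from the figure; the function name is illustrative.

```python
# One-letter sequences of the human insulin chains as shown in Figure 1.29.
A_CHAIN = "GIVEQCCTSICSLYQLENYCN"
B_CHAIN = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"

def cysteine_positions(chain):
    """Return the 1-based positions of cysteine (C) residues in a chain."""
    return [i for i, residue in enumerate(chain, start=1) if residue == "C"]

print(len(A_CHAIN), cysteine_positions(A_CHAIN))
print(len(B_CHAIN), cysteine_positions(B_CHAIN))
```

The A chain (21 residues) carries cysteines at positions 6, 7, 11, and 20, and the B chain (30 residues) at positions 7 and 19. Positions 6 and 11 of the A chain are the pair named in the legend as forming the intrachain bridge, which leaves two A-chain cysteines and two B-chain cysteines for the interchain bridges.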

The ~-turn
Hydrogen bonding can occur betvveen amino acids that are even nearer to each
other within a polypeptide. When this arises between the peptide bond CO group
of one amino acid residue and the peptide bond NH group of an amino acid resi­
due three places farth er along, this results in a hairpin ~-turn. Abrupt changes in
the direction of a polypeptide enable compact globular shapes to be achieved.
These ~-turns can connect parallel or anti-parallel strands in ~-pleated sheets.
Higher-order structures
Many more complex structural motifs, consisting of combinations of the above
structural modules, form protein domains. Such domains are often crucial to a
protein's overall shape and stability and usually represent functi onal units
involved in binding other molecules. Another important determinant of the
StTucture (and function) of a protein is disulfide bridges. They can form between
the sulfur atoms of sulfhydryl (- SH) groups on two amino acids that may reside
on a single polypeptide chain or on two polypeptide chains (Figure 1.29).
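As a small illustration of where disulfide bridges could form, the following Python sketch locates the cysteine residues in the two insulin chains. The one-letter sequences are transcribed from Figure 1.29 (assuming the figure text reads as shown here), and the function name `cysteine_positions` is purely illustrative. The sketch only lists candidate residues; it does not predict which cysteines pair with which.

```python
# One-letter amino acid sequences as read from Figure 1.29 (assumed transcription).
INSULIN_A = "GIVEQCCTSICSLYQLENYCN"
INSULIN_B = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"


def cysteine_positions(chain: str) -> list:
    """Return the 1-based positions of cysteine (C) residues in a chain."""
    return [i for i, residue in enumerate(chain, start=1) if residue == "C"]


a_cys = cysteine_positions(INSULIN_A)
b_cys = cysteine_positions(INSULIN_B)

# Each cysteine carries a sulfhydryl (-SH) group that could take part in a bridge.
print("A chain cysteines:", a_cys)
print("B chain cysteines:", b_cys)
```

Under these assumptions the A chain has cysteines at positions 6, 7, 11, and 20, which is consistent with the intrachain bridge between positions 6 and 11 described in the figure legend.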
In general, the primary structure of a protein determines the set of secondary
structures that, together, generate the protein's tertiary structure. Secondary
structural motifs can be predicted from an analysis of the primary structure, but
the overall tertiary structure cannot easily be accurately predicted. Finally, some
proteins form complex aggregates of polypeptide subunits, giving an arrange­
ment known as the quaternary structure.

Chapter 2

Chromosome Structure and Function

KEY CONCEPTS

• Chromosomes have two fundamental roles: the faithful transmission and appropriate expression of
  genetic information.
• Prokaryotic chromosomes contain circular double-stranded DNA molecules that are relatively
  protein-free, but eukaryotic chromosomes consist of linear double-stranded DNA molecules complexed
  throughout their lengths with proteins.
• Chromatin is the DNA-protein matrix of eukaryotic chromosomes. The complexed proteins serve
  structural roles, including compacting the DNA in different ways, and also regulatory roles.
• Chromosomes undergo major changes in the cell cycle, notably at S phase when they replicate and at
  M phase when the replicated chromosomes become separated and allocated to two daughter cells.
• DNA replication at S phase produces two double-stranded daughter DNA molecules that are
  held together at a specialized region, the centromere. When the daughter DNA molecules remain held
  together like this they are known as sister chromatids, but once they separate at M phase they become
  individual chromosomes.
• At the metaphase stage of M phase the chromosomes are so highly condensed that gene expression is
  uniformly shut down. But this is the optimal time for viewing them under the microscope. Staining
  with dyes that bind preferentially to GC-rich or AT-rich regions can give reproducible chromosome
  banding patterns that allow different chromosomes to be differentiated.
• During interphase, the long period of the cell cycle that separates successive M phases, chromosomes
  have generally very long extended conformations and are invisible under optical microscopy. The
  extended structure means that genes can be expressed efficiently.
• Even during interphase some chromosomal regions always remain highly condensed and
  transcriptionally inactive (heterochromatin), whereas others are extended to allow gene expression
  (euchromatin).
• Sperm and egg cells have one copy of each chromosome (they are haploid), but most cells are diploid,
  having two sets of chromosomes.
• Fertilization of a haploid egg by a haploid sperm generates the diploid zygote from which all other
  body cells arise by cell division.
• In mitosis a cell divides to give two daughter cells, each with the same number and types of
  chromosomes as the original cell.
• Meiosis is a specialized form of cell division that occurs in certain cells of the testes and ovaries to
  produce haploid sperm and egg cells. During meiosis new genetic combinations are randomly created,
  partly by exchanging sequences between maternal and paternal chromosomes.
• Three types of functional element are needed for eukaryotic chromosomes to transmit DNA faithfully
  from mother cell to daughter cells: the centromere (ensures correct chromosome segregation at cell
  division); replication origins (initiate DNA replication); and telomeres (cap the chromosomes to stop
  the internal DNA from being degraded by nucleases).
• An abnormal number of chromosomes can sometimes occur but this is often lethal if present in most
  cells of the body.
• Structural chromosome abnormalities arising from breaks in chromosomes can cause genes to be
  deleted or incorrectly expressed.
• Having the correct number and structure of chromosomes is not enough. They must also have the
  correct parental origin because certain genes are preferentially expressed on either paternally or
  maternally inherited chromosomes.

The underlying structure and fundamental functions of DNA (replication and
transcription) were introduced in the previous chapter. But DNA functions in a
context. In eukaryotic cells, the very long DNA molecules in the nucleus are
complexed with a variety of structural and regulatory proteins and structured into
linear chromosomes. The DNA molecules in mitochondria are different: they are
comparatively short, have little protein attached to them, and are circular.
This chapter introduces the life cycle of chromosomes in eukaryotic cells that
are usually formed from other cells by cell division. The process of cell division is
a small component of the cell cycle, the process in which chromosomes and their
constituent DNA molecules need to make perfect copies of themselves and then
segregate into daughter cells. There are important differences between how this
occurs in routine cell division and in the specialized form of cell division that
gives rise to sperm and egg cells.
A feature common to both types of cell division is the importance to the cell
of chromosome condensation. This affects the expression of information encoded
in the DNA and makes long and fragile DNA strands resilient to breakage during
the dramatic rearrangements that occur in cell division. The use of dyes to stain
condensed chromosomes has revealed patterns that, like fingerprints, can be
used to distinguish between them. Careful examination of these and other pat­
terns can reveal evidence of chromosomal abnormalities, such as breakages and
rearrangements that have occurred and survived but may cause disease.

2.1 PLOIDY AND THE CELL CYCLE


The chromosome and DNA content of cells is defined by the number (n) of
different chromosomes, the chromosome set, and its associated DNA content (C). For
human cells, n = 23 and C is about 3.5 pg (3.5 × 10⁻¹² g). Different cell types in an
organism, however, may differ in ploidy, that is, the number of copies they have of the
chromosome set. Sperm and egg cells carry a single chromosome set and are said
to be haploid (they have n chromosomes and a DNA content of C). Most human
and mammalian cells carry two copies of the chromosome set and are diploid
(having 2n chromosomes and a DNA content of 2C). However, in several
non-mammalian animal species most of the body cells are not diploid but instead are
either haploid or polyploid. In the latter case, some species are tetraploid (4n)
and others have a ploidy of more than 4n, but triploidy (3n) is less common in
animals because it can give rise to problems in producing sperm and egg cells.
The cells of our body are all derived ultimately from a single diploid cell, the
zygote, that is formed when a sperm fertilizes an egg. Starting from the zygote,
organisms grow by repeated rounds of cell division. Each round of cell division is
a cell cycle and comprises a brief M phase, during which cell division occurs, and
the much longer intervening interphase, which has three parts (Figure 2.1). They
are: S phase (during which DNA synthesis occurs), G1 phase (the gap between M
phase and S phase), and G2 phase (the gap between S phase and M phase).
We will describe the cell biology underlying the phases of the cell cycle in a
later chapter. Here we are concerned with the life cycle of chromosomes. During
each cell cycle, chromosomes undergo profound changes to their structure,
number, and distribution within the cell. From the end of M phase right through
until DNA duplication in S phase, a chromosome of a diploid cell contains a
single DNA double helix and the total DNA content is 2C (see Figure 2.1). After DNA
duplication, the total DNA content is 4C, but the duplicated double helices are
held together along their lengths so that each chromosome has double the DNA
content of a chromosome in early S phase. During M phase the duplicated
double helices separate, generating two daughter chromosomes, giving 4n
chromosomes. After equal distribution of the chromosomes to the two daughter cells,
both cells will have 2n chromosomes and a DNA content of 2C (see Figure 2.1).
G1 is the normal state of a cell, and is the long-term end state of non-dividing
cells. Cells enter S phase only if they are committed to mitosis; as will be described
in more detail in Chapter 4, non-dividing cells remain in a modified G1 stage,
sometimes called the G0 phase. The cell cycle diagram can give the impression
that all the interesting action happens in S and M phases, but this is an illusion.
A cell spends most of its life in G0 or G1 phase, and that is where the genome does
most of its work.
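The bookkeeping of chromosome number and DNA content described above can be written out as a short Python sketch. It is only an illustration: the class and function names are hypothetical, chromosome number is counted in multiples of n and DNA content in multiples of C, and the human values (n = 23, C of about 3.5 pg) are the approximate figures quoted in this section.

```python
from dataclasses import dataclass

HUMAN_N = 23        # chromosomes in one chromosome set (from the text)
HUMAN_C_PG = 3.5    # approximate DNA content of one set, in picograms (from the text)


@dataclass(frozen=True)
class CellState:
    stage: str
    chromosomes_n: int  # chromosome count, in multiples of n
    dna_c: int          # DNA content, in multiples of C


def diploid_cell_cycle() -> list:
    """States of a diploid cell through one cell cycle, as described in the text."""
    return [
        CellState("G1: one double helix per chromosome", 2, 2),
        CellState("after S phase: two double helices held together", 2, 4),
        CellState("M phase: sister chromatids have separated", 4, 4),
        CellState("each daughter cell after division", 2, 2),
    ]


for state in diploid_cell_cycle():
    chromosomes = state.chromosomes_n * HUMAN_N
    dna_pg = state.dna_c * HUMAN_C_PG
    print(f"{state.stage}: {chromosomes} chromosomes, about {dna_pg:.1f} pg DNA")
```

Note that DNA content doubles in S phase while the chromosome count stays at 2n; the count only doubles briefly in M phase, once the sister chromatids separate.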

Figure 2.1 Changes in chromosomes and DNA content during the cell cycle. The cell
cycle shown at the right includes a very short M phase, when the chromosomes become
extremely highly condensed in preparation for nuclear and cell division. Afterwards,
cells enter a long period of growth called interphase, during which chromosomes
are enormously extended so that genes can be expressed. Interphase is divided
into three phases: G1, S (when the DNA replicates), and G2. Chromosomes contain
one DNA double helix from the end of M phase right through until just before the
DNA is duplicated in S phase. After the DNA double helix has been duplicated, the two
resulting double helices are held together tightly along their lengths (by specialized
protein complexes called cohesins) until M phase. As the chromosomes condense
at M phase they are now seen to consist of two sister chromatids, each containing a
DNA duplex, that are bound together only at the centromeres. During M phase the
two sister chromatids separate to form two independent chromosomes that are then
equally distributed into the daughter cells.
[Panel labels. Early S phase: one DNA double helix per chromosome (chromosomes = 2n,
DNA = 2C). Late S phase: two DNA double helices per chromosome (chromosomes = 2n,
DNA = 4C). M phase: sister chromatids separate to give two chromosomes that are
distributed into two daughter cells (chromosomes = 2n, then 4n, then 2n).]
A small subset of diploid body cells constitute the germ line that gives rise to
gametes (sperm cells or egg cells). In humans, where n = 23, each gamete
contains one sex chromosome plus 22 non-sex chromosomes (autosomes). In eggs,
the sex chromosome is always an X; in sperm it may be either an X or a Y. After a
haploid sperm fertilizes a haploid egg, the resulting diploid zygote and almost all
of its descendant cells have the chromosome constitution 46,XX (female) or 46,XY
(male) (Figure 2.2).
Cells outside the germ line are somatic cells. Human somatic cells are usually
diploid but, as will be described later, there are notable exceptions. Some types of
non-dividing cell lack a nucleus and any chromosomes, and so are nulliploid.
Other cell types have multiple chromosome sets; they are naturally polyploid as
a result of multiple rounds of DNA replication without cell division.

2.2 MITOSIS AND MEIOSIS


Mitosis and meiosis are both cell division processes that involve chromosome
replication and cell division. However, the products of mitosis have the same
ploidy as the initiating cell, whereas meiosis halves the cell's ploidy. Furthermore,
whereas mitosis gives rise to genetically identical products, meiosis generates
genetic diversity to ensure that offspring are genetically different from their
parents.
Mitosis is the normal form of cell division

As an embryo develops through fetus, infant, and child to adult, many cell cycles
are needed to generate the required number of cells. Because many cells have a
limited life span, there is also a continuous requirement to generate new cells,
even in an adult organism. All of these cell divisions occur by mitosis, which is
the normal process of cell division throughout the human life cycle. Mitosis
ensures that a single cell gives rise to two daughter cells that are each genetically
identical to the parent cell, barring any errors that might have occurred during
DNA replication. During a human lifetime, there may be something like 10¹⁷
mitotic divisions.

Figure 2.2 The human life cycle, from a chromosomal viewpoint. Haploid
egg and sperm cells originate from diploid precursors in the ovary and testis in
women and men, respectively. All eggs have a 23,X chromosome constitution,
representing 22 autosomes plus a single X sex chromosome. A sperm can carry
either sex chromosome, so that the chromosome constitution is 23,X (50%) and
23,Y (50%). After fertilization and fusion of the egg and sperm nuclei, the diploid
zygote will have a chromosome constitution of either 46,XX or 46,XY, depending
on which sex chromosome the fertilizing sperm carried. After many cell cycles,
this zygote gives rise to all cells of the adult body, almost all of which will have
the same chromosome complement as the zygote from which they originated.

The M phase of the cell cycle includes various stages of nuclear division
(prophase, prometaphase, metaphase, anaphase, and telophase), and also cell
division (cytokinesis), which overlaps the final stages of mitosis (Figure 2.3). In
preparation for cell division, the previously highly extended duplicated
chromosomes contract and condense so that, by the metaphase stage of mitosis, they are
readily visible when viewed under the microscope.

Figure 2.3 Mitosis and cytokinesis.
(A) Mitotic stages and cytokinesis. Late in interphase, the duplicated chromosomes
are still dispersed in the nucleus, and the nucleolus is distinct. Early in prophase, the
first stage of mitosis, the centrioles (which were previously duplicated in interphase)
begin to separate and migrate to opposite poles of the cell, where they will form
the spindle poles. In prometaphase, the nuclear envelope breaks down, and the
now highly condensed chromosomes become attached at their centromeres to
the array of microtubules extending from the mitotic spindle. At metaphase, the
chromosomes all lie along the middle of the mitotic spindle. At anaphase, the sister
chromatids separate and begin to migrate toward opposite poles of the cell, as a result
of both shortening of the microtubules and further separation of the spindle poles. The
nuclear envelope forms again around the daughter nuclei during telophase and the
chromosomes decondense, completing mitosis. Constriction of the cell begins.
During cytokinesis, filaments beneath the plasma membrane constrict the cytoplasm,
ultimately producing two daughter cells.
(B) Metaphase-anaphase transition. Metaphase chromosomes aligned along
the equatorial plane (dashed line) have their sister chromatids held tightly
together (by cohesin protein complexes) at the centromeres. The transition to
anaphase is marked by disruption of the cohesin complexes, thereby releasing the
sister chromatids to form independent chromosomes with their own centromeres,
which are then pulled by the microtubules of the spindle in the direction of opposing
poles (arrows).

The chromosomes of early S phase have one DNA double helix, but after DNA
replication two identical DNA double helices are produced (see Figure 2.1), and
they are held together along their lengths by multisubunit protein complexes
called cohesins. Recent data suggest that individual cohesin subunits are linked
together to form a large protein ring. Some models suggest that multiple cohesin
rings encircle the two double helices to entrap them along their lengths, or
cohesin rings form round the individual double helices and then interact to
ensure that the two double helices are held tightly together.
Later, when the chromosomes have undergone compaction in preparation
for cell division, the cohesins are removed from all parts of the chromosomes
apart from the centromeres. As a result, by prometaphase when the
chromosomes can now be viewed under the light microscope, individual chromosomes
can be seen to comprise two sister chromatids that are attached at the
centromere by the residual cohesin complexes that continue to bind the two DNA
helices at this position.
Later still, at the start of anaphase, the residual cohesin complexes holding the
sister chromatids together at the centromere are removed. The two sister
chromatids can now disengage to become independent chromosomes that will be
pulled to opposite poles of the cell and then distributed equally to the daughter
cells (see Figure 2.3). Interaction between the mitotic spindle and the centromere
is crucial to this process and we will consider this in detail in Section 2.3.

Meiosis is a specialized reductive cell division that gives rise to sperm and egg cells
Diploid primordial germ cells migrate into the embryonic gonad and engage in
repeated rounds of mitosis, to generate spermatogonia in males and oogonia
in females. Further growth and differentiation produce primary spermatocytes
in the testis, and primary oocytes in the ovary. This process requires many more
mitotic divisions in males than in females, and probably contributes to
differences in mutation rate between the sexes.
The diploid spermatocytes and oocytes can then undergo meiosis, the cell
division process that produces haploid gametes. Meiosis is a reductive division
because it involves two successive cell divisions (known as meiosis I and II) but
only one round of DNA replication. As a result, it gives rise to four haploid cells.
In males, the two meiotic cell divisions are each symmetrical, producing four
functionally equivalent spermatozoa. Female meiosis is different because at each
meiosis asymmetric cell division results in an unequal division of the cytoplasm.
The products of female meiosis I (the first meiotic division) are a large secondary
oocyte and a small cell (polar body) that is discarded. During meiosis II the
secondary oocyte then gives rise to the large mature egg cell and a second polar
body, which again is discarded (Figure 2.4).
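The arithmetic behind "two divisions but only one round of DNA replication" can be followed in a few lines of Python. This is a minimal sketch with a hypothetical function name; DNA content is expressed per cell in units of C, and the count of four products ignores the fact that, in females, the polar bodies are discarded.

```python
def meiosis_dna_content(start_cells: int = 1, start_dna_c: float = 2.0) -> list:
    """Follow cell number and per-cell DNA content (in units of C) through meiosis."""
    steps = [("diploid primary spermatocyte or oocyte", start_cells, start_dna_c)]

    # One round of DNA replication doubles the DNA content of the cell.
    cells, dna = start_cells, start_dna_c * 2
    steps.append(("after the single round of DNA replication", cells, dna))

    # Two successive divisions, each doubling cell number and halving DNA per cell.
    for name in ("after meiosis I", "after meiosis II"):
        cells, dna = cells * 2, dna / 2
        steps.append((name, cells, dna))
    return steps


for name, cells, dna in meiosis_dna_content():
    print(f"{name}: {cells} cell(s), {dna:g}C each")
```

Starting from one diploid (2C) cell, the sketch ends with four cells of DNA content C each, which matches the statement that meiosis gives rise to four haploid cells.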
In humans, primary oocytes enter meiosis I during fetal development but are
then all arrested at prophase until after the onset of puberty. After puberty in
females, one primary oocyte completes meiosis with each menstrual cycle.
Because ovulation can continue up to the fifth and sometimes sixth decades of
life, this means that meiosis can be arrested for many decades in primary oocytes
that are used in ovulation in later life. While arrested in prophase, the primary
oocytes continue to grow to a large size, acquiring an outer jelly coat, cortical
granules, and reserves of ribosomes, mRNA, yolk, and other cytoplasmic resources
that would sustain an early embryo. In males, huge numbers of sperm are
produced continuously from puberty onward.

Figure 2.4 Male and female germ line development and gametogenesis.
(A) Diploid primordial germ cells migrate to the embryonic gonad (the female
ovary (left) or the male testis (right)), and enter rounds of mitosis that establish
spermatogonia (in males) and oogonia (in females). (B) These undergo further mitotic
divisions, growth, and differentiation to produce diploid primary spermatocytes
and diploid primary oocytes, which can enter meiosis. (C) Meiosis I. After DNA
duplication, the cells become tetraploid but then divide to produce two diploid cells.
In male gametogenesis, the cell division is symmetrical, generating identical diploid
secondary spermatocytes. In female meiosis I, by contrast, the division is asymmetric;
the secondary oocyte is much larger than the first polar body, which is discarded.
(D) Meiosis II. The diploid secondary spermatocyte and secondary oocyte divide
without prior DNA synthesis to give haploid cell products. In male gametogenesis, this
division is again symmetrical, producing two haploid spermatids from each secondary
spermatocyte. In female meiosis II, the egg produced is much larger than the second
(also discarded) polar body. (E) Maturation of spermatids produces four spermatozoa.
[Diagram labels: ovary, testis, oogonium, spermatogonium, primary oocyte, primary
spermatocyte, secondary oocyte, secondary spermatocytes, polar bodies, spermatids,
mature egg, mature spermatozoa.]

Figure 2.5 Independent assortment of maternal and paternal homologs during
meiosis. The figure shows a random selection of just 5 of the 8,388,608 (2²³)
theoretically possible combinations of homologs that might occur in haploid
human spermatozoa after meiosis in a diploid primary spermatocyte. Maternally
derived homologs are represented by pink boxes, and paternally derived homologs
by blue boxes. For simplicity, the diagram ignores recombination.
[Diagram: the diploid primary spermatocyte carries a maternal set (chromosomes 1-22
plus X) and a paternal set (chromosomes 1-22 plus Y); after meiosis, five example
haploid sperm cells (sperm 1 to sperm 5) each carry chromosomes 1-22 plus an X or a Y.]

The second division of meiosis is identical to mitosis, but the first division has
important differences. Its purpose is to generate genetic diversity between the
daughter cells. This is done by two mechanisms: independent assortment of
paternal and maternal homologs, and recombination.

Independent assortment
Every diploid cell contains two chromosome sets and therefore has two copies
(homologs) of each chromosome (except in the special case of the X and Y
chromosomes in males). One homolog is paternally inherited and the other is
maternally inherited. During meiosis I the maternal and paternal homologs of each
pair of replicated chromosomes undergo synapsis by pairing together to form a
bivalent. After DNA replication, the homologous chromosomes each comprise
two sister chromatids, so each bivalent is a four-stranded structure at the
metaphase plate. Spindle fibers then pull one complete chromosome (two
chromatids) to either pole. In humans, for each of the 23 homologous pairs, the choice of
which daughter cell each homolog enters is independent. This allows 2²³ or about
8.4 × 10⁶ different possible combinations of parental chromosomes in the
gametes that might arise from a single meiotic division (Figure 2.5).
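The count of possible combinations follows directly from 23 independent two-way choices, and a short Python sketch makes this concrete. The function names are hypothetical, "M" and "P" simply stand for the maternal and paternal homolog of each pair, and recombination is ignored, as in Figure 2.5.

```python
import random
from typing import Optional

HOMOLOG_PAIRS = 23  # number of homologous chromosome pairs in humans


def count_assortments(pairs: int = HOMOLOG_PAIRS) -> int:
    """Number of homolog combinations when each pair assorts independently."""
    return 2 ** pairs


def random_gamete(pairs: int = HOMOLOG_PAIRS,
                  rng: Optional[random.Random] = None) -> str:
    """Pick the maternal (M) or paternal (P) homolog independently for each pair."""
    rng = rng or random.Random()
    return "".join(rng.choice("MP") for _ in range(pairs))


print(count_assortments())  # 2**23 = 8388608, about 8.4 x 10^6

# Five example gametes, analogous to the five sperm drawn in Figure 2.5.
rng = random.Random(0)
for i in range(1, 6):
    print(f"sperm {i}: {random_gamete(rng=rng)}")
```

Each printed string is one of the 8,388,608 equally likely outcomes under this simplified model; allowing for recombination would make the real number of distinct gametes far larger.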

Recombination
The five stages of prophase of meiosis I (Figure 2.6) begin during fetal life and, in
human females, can last for decades. During this extended process, the homologs
within each bivalent normally exchange segments of DNA at randomly
positioned but matching locations. At the zygotene stage (Figure 2.6B), a
proteinaceous synaptonemal complex forms between closely apposed homologous
chromosomes. Completion of the synaptonemal complex marks the start of the
pachytene stage (Figure 2.6C), during which recombination (crossover) occurs.
Crossover involves physical breakage of the DNA in one paternal and one
maternal chromatid, and the subsequent joining of maternal and paternal fragments.
The mechanism that allows alignment of the homologs is not known (see
Figure 2.6A, B), although such close apposition is required for recombination.
Located at intervals on the synaptonemal complex are very large multiprotein
assemblies, called recombination nodules, that may mediate recombination
events. Recombined homologs seem to be physically connected at specific points.
Each such connection marks the point of crossover and is known as a chiasma
(plural chiasmata). There are an average of 55 chiasmata per cell in human male
meiosis, and maybe 50% more in female meiosis.

Figure 2.6 The five stages of prophase in meiosis I. (A) In leptotene,
the duplicated homologous chromosomes begin to condense but remain
unpaired. (B) In zygotene, duplicated maternal and paternal homologs pair to
form bivalents, comprising four chromatids. (C) In pachytene, recombination
(crossing over) occurs by means of the physical breakage and subsequent
rejoining of maternal and paternal chromosome fragments. There are two
crossovers in the bivalent on the left and one in the bivalent on the right. For
simplicity, both crossovers on the left involve the same two chromatids. In
reality, more crossovers may occur, involving three or even all four chromatids in
a bivalent. (D) During diplotene, the homologous chromosomes may separate
slightly, except at the chiasmata. (E) Diakinesis is marked by contraction of the
bivalents and is the transition to metaphase I. In this figure, only 2 of 23 possible
pairs of homologs are illustrated (with the maternal homolog colored light blue,
and the paternal homolog dark blue).
[Arrow labels between panels: pairing, crossing over, partial separation, contraction.]

Figure 2.7 Metaphase I to production of gametes. (A) At metaphase I, the bivalents
align on the metaphase plate, at the center of the spindle apparatus. Contraction of
spindle fibers draws the chromosomes in the direction of the spindle poles (arrows).
(B) The transition to anaphase I occurs at the consequent rupture of the chiasmata.
(C) Cytokinesis segregates the two chromosome sets, each to a different primary
spermatocyte. Note that, as shown in this panel, after recombination during prophase
I the chromatids share a single centromere but are no longer identical. (D) Meiosis II
in each primary spermatocyte, which does not include DNA replication, generates
unique genetic combinations in the haploid secondary spermatocytes. Only 2 of the
possible 23 different human chromosomes are depicted, for clarity, so only 2² (i.e. 4) of
the 2²³ (8,388,608) possible combinations are illustrated. Although oogenesis
can produce only one functional haploid gamete per meiotic division (see Figure 2.4),
the processes by which genetic diversity arises are the same as in spermatogenesis.
[Panel labels: metaphase plate; breakage of chiasmata; cytokinesis; alternative
outcomes at meiosis II.]
In addition to their role in recombination, chiasmata are thought to be
essential for correct chromosome segregation during meiosis I. By holding together
maternal and paternal homologs of each chromosome pair on the spindle until
anaphase I, they have a role analogous to that of the centromeres in mitosis and
in meiosis II. Children with incorrect numbers of chromosomes have been shown
genetically to be often the product of gametes in which a bivalent lacked
chiasmata.
Meiosis II resembles mitosis, except that there are only 23 chromosomes
instead of 46. Each chromosome already consists of two chromatids that become
separated at anaphase II. However, whereas the sister chromatids of a mitotic
chromosome are genetically identical, the two chromatids of a chromosome
entering meiosis II (Figure 2.7) are usually genetically different from each other,
as a result of recombination events that took place during meiosis I.
Together, the effects of recombination between homologs (during prophase I)
as well as independent assortment of homologs (during anaphase I) ensure that
a single individual can produce an almost unlimited number of genetically
distinct gametes. The genetic consequences of recombination are considered more
fully in a later chapter.
X-Y pairing
During meiosis I in a human primary oocyte, each chromosome has a fully
homologous partner, and the two X chromosomes synapse and engage in
crossover just like any other pair of homologs. In male meiosis there is a problem. The
human X and Y sex chromosomes are very different from one another. Not only is
the X very much larger than the Y but it has a rather different DNA content and
very many more genes than the Y. Nevertheless, the X and Y do pair during
prophase I, thus ensuring that at anaphase I each daughter cell receives one sex
chromosome, either an X or a Y.
Human X and Y chromosomes pair end-to-end rather than along the whole
length, thanks to short regions of homology between the X and Y chromosomes
at the very ends of the two chromosomes. Pairing is sustained by an obligatory
crossover in a 2.6 Mb homology region at the tips of the short arms, but crossover
also sometimes occurs in a second homology region, 0.32 Mb long, at the tips of
the long arms. Genes in the terminal X-Y homology regions have some
interesting properties:
• They are present as homologous copies on the X and Y chromosomes.
• They are mostly not subject to the transcriptional inactivation that affects
  most X-linked genes as a result of the normal condensation of one of the
  two X chromosomes in female mammalian somatic cells (X-inactivation).
• They display inheritance patterns like those of genes on autosomal
  chromosomes, rather than X-linked or Y-linked genes.
As a result of their autosomal-like inheritance, the terminal X-Y homology
regions are known as pseudoautosomal regions. We will describe them in more
detail in a later chapter when we consider how sex chromosomes evolved in
mammals.
The Project Gutenberg eBook of Aarne herran rahat: Kertomus
This ebook is for the use of anyone anywhere in the United States
and most other parts of the world at no cost and with almost no
restrictions whatsoever. You may copy it, give it away or re-use it
under the terms of the Project Gutenberg License included with this
ebook or online at www.gutenberg.org. If you are not located in the
United States, you will have to check the laws of the country where
you are located before using this eBook.

Title: Aarne herran rahat: Kertomus

Author: Selma Lagerlöf

Translator: Jalmari Jäntti

Release date: May 28, 2022 [eBook #68191]

Language: Finnish

Original publication: Finland: Werner Söderström, 1903

Credits: Tapio Riikonen

*** START OF THE PROJECT GUTENBERG EBOOK AARNE HERRAN RAHAT: KERTOMUS ***
AARNE HERRAN RAHAT

A Tale

By

SELMA LAGERLÖF

Translated into Finnish from the manuscript by J.

Porvoo, Werner Söderström, 1903.


CONTENTS:

At Solberg Parsonage.
On the Quays.
The Messenger.
In the Moonlight.
The Wife.
In the Town Hall Cellar.
Restless.
Sir Archie's Flight.
Across the Ice.
The Roar of the Waves.

At Solberg Parsonage.

1.

In the days when King Fredrik II of Denmark was master of Bohuslääni,
there lived in Marstrand a fish hawker named Torarin Haakenpoika. He
was a poor man of low station, for one of his hands was crippled, so
that he was fit neither for fishing nor for rowing. Since he could not
earn his living from the sea like the other coast folk, he made a habit
of driving about the countryside selling salted and dried fish. He
spent few days of the year under his own roof, for he was forever
plodding from village to village on top of his load of fish.

One February evening, as dusk was falling, Torarin was on his way from
Kunghälla toward the parish of Solberg. The highway was quite open and
deserted, but Torarin had no need to keep silent on that account, for
beside him on the load he had a brotherly companion with whom he could
chat. This was a little black-coated dog that Torarin had named Grim.
For the most part it lay still in one place, its head between its paws,
answering its master's talk only with a blink of the eye. But if
anything was mentioned that displeased it, it sat up, raised its muzzle
in the air, and howled worse than a wolf.

"The thing is, Grim my dog," said Torarin, "that today I have had
important news. Both in Kunghälla and in Kaarepyy they were saying that
the sea has frozen over. The weather has indeed stayed fine and clear
and cold of late, as you well know, who spend every day out of doors,
and the sea is said to have truly frozen, not only in the sounds and
fjords but far out into the Kattegat as well. There are no waves
running there now, and no passage for boat or ship among the skerries.
Everywhere stretches thick, firm, smooth ice, over which a sleigh horse
can trot all the way to Marstrand and the Paternoster skerry."

The dog listened to all this very attentively. It did not seem to mind
in the least. It just lay there and blinked at Torarin.

"We haven't so very much left of our load of fish," Torarin went on to
his dog. "What would you say, Grim, to turning off at the bend in the
road and driving westward to the sea? We would drive past Solberg
church and from there down to the shore at Ödsmålskil, and from there I
shouldn't think it more than five quarter-miles straight to Marstrand.
It would be grand to come home for once without needing a boat or a
ferry."

Just then they were driving across the long heath of Kaarepyy. Until
now the weather had been calm all day, but now a cold gust of wind came
across the heath and made the air raw.

"It may look lazy, to be sure, to leave off work in the middle of the
best season," Torarin continued his excuses, "but we have now been
going round the villages so long that we could well do with sitting at
home by the hearth for a couple of days and warming our numbed bodies."

The dog still lay there without a word, and from this Torarin the fish
hawker took it that his plan was approved.

"Mother has been sitting alone in the cottage for many weeks now," he
said, flapping his arms to get warm. "She must be longing for us by
this time. And life in Marstrand is lively now in wintertime. The
streets and lanes, you know, Grim, are full of visiting fishermen and
seafolk. There is dancing in the sheds on the shore every single
evening. And the quantities of ale that flow in the tavern, those you
cannot even imagine."

As he chattered on, Torarin leaned closer to the dog to see whether it
was really listening to what he said.

And since the dog lay there wide awake and showed no sign of
displeasure, only flashing its eyes now and then, Torarin did turn
westward toward the sea at the fork in the road. He flicked the horse
with the ends of the reins and let it go at a good pace.

"Since we are driving past Solberg parsonage anyway," said Torarin
again after a while, "I will just look in there and ask whether it is
true that the sea is frozen all the way to Marstrand. They ought to
know about it there."

This last Torarin said only half aloud, not considering whether the dog
heard or not. But scarcely were the words spoken when the dog sprang up
and let out a dreadful howl.

The horse shied to the side of the road, and Torarin too was startled
and turned to see whether he had a pack of wolves at his heels. But
when he saw that it was only Grim howling, he tried to calm it.

"Now, old friend," said Torarin to the dog, "don't you remember how
many nights the two of us have spent at Solberg parsonage? It is not
certain, true enough, that Herr Aarne knows anything about the sea
freezing, but at any rate we shall get a good supper from him before we
set out to sea."

The dog was not soothed by Torarin's words. It raised its muzzle in the
air and howled more terribly still.

Then Torarin came near to being seized with dread. It was almost dark
already, but Solberg church could still be dimly seen on the plain,
which was ringed on the landward side by long wooded ridges and on the
seaward side by bare rocky knolls. Driving there all alone in the midst
of the great white plain, he felt as small as a worm of the earth, with
neither help nor shelter. But in the darkness of the forests and in the
clefts of the bare rocks lurk great trolls and monsters, which now, in
the gloom of night, dare to creep out of their hiding places. And on
the whole open plain they can surprise no other prey than him, poor
Torarin.

Once more, for the last time, he tried to calm his dog.

"Now, old friend, what has set you against Herr Aarne? He is a man of
noble birth and the richest in the whole land. Had he not become a
priest, he would have been a war hero."

But even this could not silence the dog. Then his patience gave out, so
that he seized the dog by the scruff of the neck and threw it down from
the load.

The dog did not follow the sleigh tracks but stayed sitting in the
road, and Torarin heard its howling until he drove through the dark
gateway into the parsonage yard, which was enclosed on all four sides
by low wooden buildings.

2.

At Solberg parsonage the priest, Herr Aarne, sat at the supper table
with all his household. Torarin was the only guest in the company.

The priest was an aged, white-haired man, yet still strong and upright
in bearing. Beside him sat his wife, and on her age had left its mark.
Her head and hands trembled, and she was almost deaf. On Herr Aarne's
other side sat the curate. He was young and pale and careworn in
appearance, as though wearied by the great store of learning he had
gathered during his student years at Wittenberg.

These three sat at the head of the table, somewhat apart from the rest.
Below them sat first Torarin and then the servants. These too were old
folk. There were three farmhands, all bald-headed and stoop-backed, and
all with blinking, watering eyes. There were only two maids. They may
have been a little younger and sprightlier than the men, but they too
looked worn by work and frail with age.

At the foot of the table sat two children. One was the priest's
granddaughter, his son's daughter. She was no more than fourteen,
fair-haired and very slight of build. Her face was not yet fully
formed, but she looked likely to grow beautiful. The maiden sitting
beside her was a poor girl, distantly related to the priest's wife. She
was an orphan and had been taken in as a foster child at the parsonage.
The two maidens sat very close together on the bench and seemed to be
good friends.

All of these sat and ate without a word. Torarin looked at each in
turn, but no one seemed inclined to talk during the meal. All the old
people were thinking the same thing: It is a great blessing to have
food and not to suffer want and hunger, as we have had to do many a
time before. At table one should think of nothing but thanking God for
his goodness.

Since Torarin could not get into talk with anyone, his eyes wandered
about the room. He studied now the great hearth oven, which with its
many tiers filled the whole door end of the room, now the tall-posted
curtained bed in the far corner. He let his gaze travel along the
benches that ran round the walls, or rise to the large smoke hole in
the roof, through which smoke curled out and winter frost-mist streamed
in.

When Torarin the fish hawker, who lived in the smallest hovel on the
coast, saw all this, he thought: If I were such a great man as Herr
Aarne, I would not be content to live in an old-fashioned cabin like
this, all in one room. I would build a high-gabled house of stone with
many rooms, such as our mayor and aldermen in Marstrand have.

But most often Torarin turned his eyes to the great oak chest that
stood at the foot of the four-poster bed. He looked at it so often
because it was said that the priest, Herr Aarne, kept all his silver
money in it, and it was even said that the chest was full of it to the
very brim.

And Torarin, who was so poor that he scarcely ever had a silver coin in
his pocket, said to himself: And yet I would not want that money. They
say that Herr Aarne took it from the great monasteries that once stood
in these parts, and that the old monks foretold that the money would
bring ruin upon him.

Just as he was in the midst of these thoughts, Torarin saw the old deaf
mistress raise her hand to her ear to hear better. Then she turned to
Herr Aarne and asked him: "Why are they whetting knives at Branehög
now?"

The room was so completely silent that when the old woman said this,
everyone started and looked at one another in terror. When they saw her
sitting with her hand at her ear, listening to something, they too kept
their spoons still and strained their ears.

For a moment there was a deathly hush in the room, but during it the
old mistress grew more and more uneasy. She caught hold of Herr Aarne's
arm and said to him: "I do not understand why they are whetting such
long knives at Branehög this evening."

Torarin saw Herr Aarne stroke his wife's hand to calm her. He did not
trouble to answer, however, but went on eating calmly as before.

The old mistress still sat with her hand at her ear. Terror brought
tears to her eyes, and her hands and head shook as if in a convulsion.

Then the young maidens sitting at the foot of the table burst into
frightened weeping.

"Can you not hear the grating and scraping?" the old woman asked. "Can
you not hear how it screeches and cuts the ears?"

Herr Aarne sat silent and stroked his wife's hand. As long as he kept
silent, no one dared to say a word.

But they all believed that the old mother of the house heard something,
something dreadful that boded ruin. They all felt the blood freeze in
their veins. No one else at the table dared any longer to lift a morsel
to his mouth, except Herr Aarne himself.

They reflected that the old mistress had, after all, looked after the
household for many years. She had always been at home, watching wisely
and tenderly over children and servants, cattle and all the property,
so that everything had prospered. Now she was worn out with work and
spent with age, but surely she would be the first to notice a danger if
one threatened the house.

The old mistress's distress grew and grew. She clasped her hands, and
in her helplessness she began to weep ever more bitterly, so that tears
rolled in great drops down her withered cheeks.

"Will you not even ask, Aarne Aarnenpoika, what it is that distresses
me so?" she lamented.

Herr Aarne now bent toward her and said: "I do not understand what it
is that has frightened you."

"It is the long knives I fear, the ones they are whetting over at
Branehög," said his wife.

"How can you hear that they are whetting knives at Branehög?" said Herr
Aarne with a smile. "That farm is a quarter-mile from here, after all.
Now take up your spoon and let us finish our supper!"

The old woman tried to master her terror. She took up her spoon and
dipped it into the milk bowl, but as she did so her hand shook so that
everyone heard the spoon rattle against the rim. But she laid it down
again at once. "How can I eat?" she said. "Do I not hear how it
screeches, do I not hear how it cuts?"

Then Herr Aarne pushed the milk bowl away from him and clasped his
hands. All the others did likewise, and the curate began to say grace.

When it was ended, Herr Aarne looked at those sitting round the table,
and when he saw that they were pale and frightened, he grew angry. He
began to speak to them of the days when he had come to Bohuslääni to
preach Luther's doctrine there. In those days the Papists had hunted
him and his servants like beasts of the forest. "Have we not seen the
enemy lying in wait for us by the roadside when we were on our way to
the house of the Lord? Have we not had to flee for our lives from our
own home and roam the woods without shelter? Are we now to give
ourselves up for lost?"

Herr Aarne spoke like a hero, and everyone's courage rose as they
listened to him.

It is true indeed, they thought. God has protected Herr Aarne in the
greatest dangers. His hand is his shield. He does not abandon his
servant.

3.

When Torarin drove out of the parsonage yard, his dog came to meet him
and jumped up on the load. Seeing that the dog had stayed outside the
parsonage the whole time, Torarin grew uneasy again. "Grim my dog," he
said, "why do you sit here in the gateway? Why don't you come into the
house and get some food? Can something be threatening Herr Aarne? Have
I perhaps seen him now for the last time? But then, death overtakes
even great heroes at last. He must be close on ninety years old by
now."

Torarin turned onto the road that led past the Branehög farm to the
shore at Ödsmålskil.

As he drove past Branehög, he saw that the yard was full of sleighs and
that light gleamed through the chinks of the closed window shutters.

Seeing this, Torarin said to Grim: "The people here are still up. I
will drive in and ask whether any knives have been whetted here
tonight."

He drove up to the farm, and when he opened the door he saw that a
great feast was in progress. The wall benches were full of old men
drinking ale and wine, and the young people were romping on the floor,
playing and singing.

Torarin saw at once that no one here could be readying weapons or
plotting murder. He pushed the door shut and was just about to go on
his way when the host came after him. He pressed Torarin to stay for
the feast, since he had happened upon it, and drew him into the room.

Torarin then sat a long while chatting with the farmers. They were in
high spirits, and Torarin was glad to be rid of all his dark thoughts.

But Torarin was not the only one who came late to that evening's feast.
Much later still there came a man and his wife. They were shabbily
dressed, and having entered they stopped shyly in the corner by the
door.

The host went at once to meet the new guests. He took each of them by
the hand and led them farther in to sit down. Then he said to the
others: "Is it not true, as they say, that he who has the shortest way
arrives last? These are my nearest neighbours. Here at Branehög there
are no other landholders but they and I."

"Say rather that there are none but you," the man said. "You cannot
call me a landholder. I am only a charcoal burner whom you have allowed
to build a cottage on your land."

The charcoal burner was given a place beside Torarin, and they fell to
talking. The newcomer told Torarin why he had come so late to the
feast. They had had visitors in their cottage, and had not dared to
leave it in their hands. Three wandering journeyman tanners had been
with them all day. When they came in the morning they had been
exhausted and wretched to look at. They had said that they had lost
their way in the forest and wandered about without food for a whole
week. But once they had been given food and sleep they had quickly
revived, and in the evening they had asked which was the largest and
finest house in the neighbourhood, where they might go and inquire for
work. The charcoal burner had answered that the parsonage, where Herr
Aarne lived, was the richest thereabouts. But scarcely had he said so
when they had unpacked their knapsacks, pulled out long knives, and
begun to whet them. They had kept at it for some time, and looked so
wild as they did so that the charcoal burner and his wife were
terrified and did not dare to leave home. "I can still see them as if
they sat there before me, grinding away at those knives," said the
charcoal burner. "They were dreadful to look at. They had great beards
that had not been trimmed for who knows how long, and they wore tough
leather coats with the fur outside, which were utterly filthy. I began
to think I had got dog-headed men into my cottage. I was right glad
when they went on their way at last."

When Torarin heard this, he told the charcoal burner what had happened
during his visit to the parsonage.

"So knives were whetted at Branehög today after all," said Torarin,
laughing. He had by now drunk a good deal, for on arriving at the farm
he had been so downcast and frightened that he had had to take comfort
where he could find it.

"Now I am cheerful again," he said, "since I have learned that it was
no omen the priest's wife heard, but only some tanner putting his tools
in order."

4.

It was the small hours of the night when two men stepped out of the
Branehög farmhouse to harness their horses and drive home.

Once out in the yard, they saw a fire blazing up toward the sky to the
north. They hurried straight back into the house and shouted: "Come
out! Come out! Solberg parsonage is on fire!"

There were many people gathered at the feast, and whoever had a horse
flung himself on its back and hastened toward the parsonage, but those
who had to run there on their own swift feet arrived almost as soon.

When the feast guests reached the parsonage, they met not a single soul
there. It looked as though everyone were asleep, though the flames were
roaring up to the clouds.

It was not the buildings themselves that were on fire, but a bonfire
piled against the wall of the dwelling house, kindled with brushwood
and straw. It had not been burning long. The flames had only had time
to blacken the old log wall and melt the snow from the thatched roof.
But the fire was already smouldering in the eaves. Everyone saw at once
that this was arson. They began to doubt whether Herr Aarne and his
people were really asleep, or whether something had happened to them.

But before the rescuers tried to get inside, they first used long poles
to push the burning logs away from the wall, and climbed up to tear
down the roof thatch, which was smoking and on the point of bursting
into flame.

Then some of the men went to open the door of the house and wake Herr
Aarne. But when the foremost of them reached the threshold, he stepped
aside to let the man behind him go first.

That man went a step forward, but when he was about to take hold of the
door handle, he too stepped aside to make way for the one behind.

They shrank in horror from opening that door, for a broad stream of
blood was seeping out beneath its threshold and the handle was stained
with blood.

Then the door opened before them and Herr Aarne's curate came out. He
staggered toward the men, a great wound in his head and covered in
blood from head to foot.

He stood for a moment before the men and raised his hand to ask for
silence.

Then he said in a rattling voice: "This night Herr Aarne was murdered
with all his household. Three murderers broke in through the smoke
hole, wearing leather coats with the fur outside. They fell upon us
like wild beasts and killed us."

He could say no more. He swayed where he stood and fell dead to the
ground at the men's feet.

Now the men went into the house and saw that what the priest had said
was true.

The great chest in which Herr Aarne had kept his money had been dragged
away, and Herr Aarne's horse had been taken from the stable and his
sleigh from the shed.

The peasants saw the tracks of sleigh runners leading across the
parsonage meadows to the sea, and some twenty men hurried off in
pursuit of the murderers. But the women went into the house and carried
the bodies out of the bloody room onto the clean snow.

Then it turned out that not all of Herr Aarne's household were there;
one was missing. It was the poor maiden whom Herr Aarne had taken in as
a foster child. People wondered whether she had escaped, or whether the
murderers had carried her off with them.

But when they searched the whole house carefully, they found her
crouching behind the great oven. There she had hidden throughout the
struggle, and no harm had come to her, but she was so stunned with
terror that she could neither speak nor answer.

On the Quays.

The poor maiden who had escaped the massacre had been taken by Torarin
with him to Marstrand. He had felt such pity for the girl that he had
offered her a home in his little cottage, where she might share the
same food as he and his mother.

More than this, he thought, I cannot do for Herr Aarne, who has many a
time bought fish from me and let
