Cancer Genomics
Ben Tran, Janet E. Dancey, Suzanne Kamel-Reid, John D. McPherson, Philippe L. Bedard,
Andrew M.K. Brown, Tong Zhang, Patricia Shaw, Nicole Onetto, Lincoln Stein, Thomas J. Hudson,
Benjamin G. Neel, and Lillian L. Siu
See accompanying editorial on page 584
Ben Tran, Philippe L. Bedard, and Lillian L. Siu, Princess Margaret Hospital, University Health Network, University of Toronto; Janet E. Dancey, John D. McPherson, Andrew M.K. Brown, Nicole Onetto, Lincoln Stein, and Thomas J. Hudson, Ontario Institute for Cancer Research; Suzanne Kamel-Reid, Tong Zhang, and Patricia Shaw, Toronto General Hospital, University Health Network, University of Toronto; John D. McPherson, Nicole Onetto, Lincoln Stein, Thomas J. Hudson, and Benjamin G. Neel, University of Toronto; and Benjamin G. Neel, Campbell Family Cancer Research Institute, Ontario Cancer Institute, University Health Network, University of Toronto, Toronto, Canada.
Submitted August 29, 2011; accepted November 16, 2011; published online ahead of print at www.jco.org on January 23, 2012.
Terms in blue are defined in the glossary, found at the end of this article and online at www.jco.org.
Authors' disclosures of potential conflicts of interest and author contributions are found at the end of this article.
Corresponding author: Lillian L. Siu, MD, FRCPC, Princess Margaret Hospital, Drug Development Program, 610 University Ave, Ste 5-718, Toronto, Ontario, M5G 2M9, Canada; e-mail: [email protected].
© 2012 by American Society of Clinical Oncology
0732-183X/12/3006-647/$20.00
DOI: 10.1200/JCO.2011.39.2316
ABSTRACT
In recent years, the increasing awareness that somatic mutations and other genetic aberrations drive human malignancies has led us within reach of personalized cancer medicine (PCM). The implementation of PCM is based on the following premises: genetic aberrations exist in human malignancies; a subset of these aberrations drives oncogenesis and tumor biology; these aberrations are actionable (defined as having the potential to affect management recommendations based on diagnostic, prognostic, and/or predictive implications); and there are highly specific anticancer agents available that effectively modulate these targets. This article highlights the technology underlying cancer genomics and examines the early results of genome sequencing and the challenges met in the discovery of new genetic aberrations. Finally, drawing from experiences gained in a feasibility study of somatic mutation genotyping and targeted exome sequencing led by Princess Margaret Hospital–University Health Network and the Ontario Institute for Cancer Research, the processes, challenges, and issues involved in the translation of cancer genomics to the clinic are discussed.
J Clin Oncol 30:647-660. © 2012 by American Society of Clinical Oncology
INTRODUCTION
In recent years, an increasing appreciation and identification of somatic mutations and other genetic aberrations that drive human malignancies have led us within reach of personalized cancer medicine (PCM). The US National Cancer Institute defines personalized medicine as a form of medicine that uses information about a person's genes, proteins, and environment to prevent, diagnose, and treat disease.[1] The implementation of PCM is based on the following premises: genetic aberrations exist in human malignancies; a subset of these aberrations drives oncogenesis and tumor biology; these aberrations are actionable (have potential to affect management recommendations based on diagnostic, prognostic, and/or predictive implications); and highly specific anticancer agents are available that effectively modulate these targets. The National Cancer Institute Office of Cancer Genomics, established to facilitate PCM through validation of these key premises, articulates the following mission and goals[2]: First, enhance the understanding of the molecular mechanisms of cancer; second, accelerate genomic science and technology development; and third, translate genomic data to improve cancer prevention, early detection, diagnosis, and treatment. Hence, the focus of this article will revolve around the three pillars that support cancer genomics: discovery, technology, and translation.
Since the Human Genome Project, the emerging scientific era of "omics" has revolutionized the study of cancer. Although cancer is recognized as a disease driven fundamentally by genetic changes, the somatic events that drive the multistep progression of carcinogenesis are not well understood, even in the most studied cancer types.[3] The International Cancer Genome Consortium (ICGC) is coordinating efforts aimed at identifying all genomic alterations significantly associated with cancer, including genomic loss or amplification, mutations in coding regions, chromosomal rearrangements, aberrant methylation, and expression profiles. Through the ICGC, the discovery pillar targets the decoding of cancer genomes. Much of the discovery is dependent on advances in molecular diagnostics, particularly genome sequencing, within the technology pillar. The improved timeliness and cost associated with genome sequencing have driven discovery not only in cancer genomics but also in the final pillar of clinical translation. Until now, the majority of the focus within cancer genomics lay in discovery. As our molecular understanding of cancer improves, the prospect of applying genomic knowledge in the clinic becomes increasingly tangible. However, specific challenges in the scientific, regulatory, and ethical domains remain to be overcome before PCM can become a reality. This article highlights the technology that empowers cancer genomics, examines the early results of genome sequencing and the challenges met in the discovery of new genetic aberrations, and discusses the processes involved in the translation of cancer genomics to the clinic.

JOURNAL OF CLINICAL ONCOLOGY, SPECIAL ARTICLE, VOLUME 30, NUMBER 6, FEBRUARY 20, 2012
TECHNOLOGY
The field of cancer genomics is growing rapidly as a result of revolutionary advances in DNA sequencing technologies. In this section, we review the technologic developments that have catalyzed increased understanding of cancer biology: whole genome sequencing (WGS), targeted sequencing, genotyping, and bioinformatics.
WGS
WGS is the backbone technology that supports the in-depth sequencing of cancer genomes. Genome sequencing consists of three phases: sample preparation, physical sequencing, and reconstruction.[4] In sample preparation, the target genome is broken into fragments.[4] During physical sequencing, individual bases in each fragment are identified in order; the number of individual bases identified contiguously is defined as the read length.[4] During reconstruction, bioinformatics software aligns overlapping reads from each fragment, allowing the original genome to be constructed; the longer the read length, the easier the reconstruction.[5] Traditionally, genome sequencing has been costly and time consuming. However, new technology has diminished both of these impediments.[4,5]
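The reconstruction phase described above can be illustrated with a toy sketch. The code below is not the method used by any platform or by the authors; it is a deliberately simple greedy merge of overlapping reads, and the function names (`overlap`, `greedy_assemble`), the minimum overlap of 3 bases, and the example reads are all assumptions made for illustration. Production assemblers use far more elaborate approaches and must cope with sequencing errors and repeats.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> str:
    """Repeatedly merge the pair of reads with the largest overlap (toy example)."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_size, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # no remaining overlaps; the reads cannot be joined further
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return max(contigs, key=len)


# Hypothetical reads drawn from the made-up sequence ATGGCGTACGTTAGC
reads = ["ATGGCGTA", "GCGTACGT", "TACGTTAGC"]
print(greedy_assemble(reads))
```

With longer reads, overlaps are longer and less likely to be ambiguous, which is one way to see why reconstruction becomes easier as read length increases.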
First-generation sequencing, or Sanger sequencing, has been the workhorse of DNA sequencing for almost 30 years.[4] Although significant improvements in optimization, miniaturization, multiplexing, automation, and pipeline integration have occurred, the fundamental technology has not changed significantly.[4] Sanger sequencing can produce read lengths of up to 1,000 bases, considerably longer than second-generation sequencing platforms.[5] It is an effective method, the long read lengths and high accuracy of which have resulted in monumental accomplishments, including completion of the Human Genome Project.[4,6] However, limitations of high cost and low throughput (small amount of data generated per unit of time) have led to the development of next-generation sequencing (NGS).[4,5,7]
NGS platforms consist of second- and third-generation technologies, described in depth by Metzker.[4] Both are more economical than Sanger and have higher throughput. Second-generation platforms are dominated by cyclic array or flush and scan-based sequencing. Strands of fragmented DNA are amplified, and bases are then added sequentially using DNA polymerase. Excess reagent is washed out, imaging then identifies the base incorporated, and the process is repeated.[8] This repetitive process leads to millions of reads, each of limited length (approximately 50 to 400 bases), creating a challenge for genome reconstruction.[4,5,7,9] Several second-generation platforms are commercially available, each with distinct differences (Table 1). Although NGS has improved cost and throughput, its disadvantages include short read length, complex sample preparation, need for amplification, long time to results, and significant data storage and interpretation requirements.[5,11]
Third-generation sequencing technologies include novel platforms, such as the PacBio RS (Pacific Biosciences, Menlo Park, CA) and Ion Torrent PGM (Life Technologies, Carlsbad, CA; Table 1). PacBio RS uses a process called single-molecule, real-time detection of biologic processes that results in longer read lengths, averaging 964 bases in published articles,[4,11] and more than 2,000 bases in more recent applications at our institute. Instead of relying on amplified DNA, single-molecule sequencing detects the specific sequence of each individual DNA strand. Ion Torrent PGM uses nonoptical DNA sequencing. Rather than detecting nucleotide incorporation optically, PGM uses a semiconductor that senses the ions produced as nucleotides are incorporated. Although read length currently averages fewer than 200 bases, accuracy is high, and run time is short, potentially allowing for real-time clinical application.[10,12]
The first human genome sequence cost more than $2 billion and took a decade to complete.[20] However, advances in cost and timeliness gained through these novel platforms bring the $1,000 genome target of the National Institutes of Health within reach. As genome sequencing becomes more affordable and accessible, our understanding of the molecular basis of cancer is expected to improve exponentially.
Targeted Genome Sequencing
Although cheaper than Sanger sequencing, WGS remains expensive on a grand scale, with current costs of $10,000 to $35,000[7] per human genome, exclusive of labor and other expenses. Targeted sequencing refers to strategies that enrich the input for DNA regions of interest,[7] such as the whole exome or the cancer genome (ie, genes potentially involved in tumor biology). Many of the platforms used for WGS also are used for targeted sequencing, although polymerase chain reaction (PCR) amplification of targeted regions or hybridization of the test DNA to specific arrays of oligonucleotides corresponding to the desired target sequences is required. In addition to reducing the cost per sample, these approaches increase coverage of areas of interest, which may overcome problems of cancer cell cellularity in tumor specimens and increase accuracy.[7] As WGS costs remain high, targeted sequencing, particularly exome sequencing, is likely to dominate near-term sequencing strategies.[21]
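The link between target size, coverage, and tumor cellularity can be made concrete with back-of-the-envelope arithmetic. The sketch below assumes the standard approximation that mean coverage equals the number of reads times the read length divided by the target size. The figures used (a genome of roughly 3.1 billion bases, an exome of roughly 30 million bases, 100 million reads of 100 bases, a specimen with 30% tumor cells) are illustrative assumptions, not values from this article, and the calculation ignores off-target reads, uneven capture, and copy number changes.

```python
def mean_coverage(n_reads: int, read_length: int, target_size: int) -> float:
    """Average number of reads covering each base of the target."""
    return n_reads * read_length / target_size


def expected_variant_reads(coverage: float, tumor_fraction: float) -> float:
    """Expected reads carrying a heterozygous somatic mutation.

    Assumes a diploid locus with no copy number change, so only half of the
    DNA copies contributed by tumor cells carry the mutation.
    """
    return coverage * tumor_fraction / 2


N_READS = 100_000_000          # assumed sequencing output
READ_LENGTH = 100              # assumed read length in bases
GENOME_SIZE = 3_100_000_000    # approximate human genome size (assumption)
EXOME_SIZE = 30_000_000        # approximate exome size (assumption)

for name, size in [("whole genome", GENOME_SIZE), ("exome", EXOME_SIZE)]:
    cov = mean_coverage(N_READS, READ_LENGTH, size)
    support = expected_variant_reads(cov, tumor_fraction=0.3)
    print(f"{name}: {cov:.1f}x coverage, {support:.1f} expected variant reads")
```

Under these assumptions the same sequencing output yields about 3x coverage of the whole genome but more than 300x coverage of the exome, which is why a mutation present in a minority of cells is far more likely to be observed in the targeted experiment.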
Cancer Genotyping
The increasing number of targeted therapeutics, the antitumor activity of which is based on the presence of specific biomarkers, has created a growing need for real-time detection of recognized genetic aberrations in clinical samples in a cost-effective and timely manner. Given the observation that some cancer mutations occur at similar DNA bases in tumors from different patients (so-called recurrent mutations), it is possible to use assays that test for single bases (a process referred to as mutation genotyping). Given that the number of clinically validated recurrent and predictive mutations is small, methods such as PCR-based restriction fragment length polymorphism are currently being used for somatic mutation genotyping in individual patients with cancer.[13] However, the repertoire of recurrent mutations is increasing, and there is interest in testing these to evaluate their role as predictive mutations for the numerous molecularly targeted agents in development. This leads to a need for higher-throughput genotyping methods.
High-throughput genotyping platforms, consisting of multiplexed assays and microarrays, have been successfully used for genotyping clinical samples.[15,22-24] Table 1 details several of these platforms, including the Taqman OpenArray Genotyping system (using Taqman genotyping assays; Applied Biosystems, Carlsbad, CA),
Table 1. Sequencing and Genotyping Platforms

| Platform | Method | Application | Comment |
| --- | --- | --- | --- |
| Sequencing | | | |
| First generation (Sanger sequencing) | | | |
| Sanger[4] | Strands of fragmented DNA are resolved on gel and distributed in order of length, with end base labelled | Targeted sequencing; whole genome sequencing; genotyping | Despite high accuracy and successes such as the first human genome, several limitations, particularly low throughput, have led to increased use of NGS technologies |
| Second generation (cyclic array-based sequencing) | Strands of fragmented DNA are amplified; then bases are added sequentially using DNA polymerase; excess reagent is washed out, imaging identifies base incorporated, and process repeats | Targeted sequencing; whole genome sequencing | Higher throughput has provided significant advantages; however, limitations such as sample preparation, short read lengths, and relatively slow run time have limited clinical use; newer versions (such as MiSeq [Illumina] or 454 Junior [Roche]) sacrifice genome coverage for faster run time to become more amenable to clinical application |
| 454 (Roche, Basel, Switzerland)[4,5,10] | Pyrophosphate released at time of base incorporation | | |
| HiSeq (Illumina, San Diego, CA)[4,5,9] | Fluorescent-labelled nucleotides added simultaneously | | |
| SOLiD 4 (Life Technologies, Carlsbad, CA)[4,5,10] | Driven by DNA ligase instead of DNA polymerase | | |
| Third generation (novel technologies) | | | |
| PacBio RS (Pacific Biosciences, Menlo Park, CA)[4,10,11] | Single-molecule real-time sequencing; imaging of dye-labelled nucleotides as they are incorporated during DNA synthesis by single DNA polymerase molecule | Targeted sequencing; whole genome sequencing | Results in long read lengths, short run time, and high throughput with simple sample preparation; potential for clinical application |
| Ion Torrent PGM (Life Technologies)[10,12] | Nonoptical DNA sequencing; massively parallel semiconductor senses ions produced as nucleotides are incorporated by DNA polymerase-based synthesis | Targeted sequencing; whole genome sequencing | Low technology cost and short run time; potential for clinical application |
| Genotyping | | | |
| Restriction fragment length polymorphism[13,14] | Uses restriction enzymes to fragment DNA in presence of targeted mutation; then gel electrophoresis separates resulting fragments, identifying mutation | Single somatic mutation analysis | Allows detection of low-frequency mutations ( 4%) but has low throughput and is dependent upon subjective visual interpretation; still used in some centers for KRAS mutation testing; however, not feasible method for high-throughput genotyping |
| Taqman OpenArray Genotyping System (Applied Biosystems, Carlsbad, CA)[14,15,16] | Uses allele-specific PCR and dye-labelled probes (Taqman assay) combined with fluorescent readout systems | Somatic mutation analysis; SNP genotyping | Effective and accurate high-throughput genotyping platform |
| MassARRAY (Sequenom, San Diego, CA)[14,15,16] | Uses allele-specific PCR combined with MALDI-TOF mass spectrometry to detect mutations/SNPs | Somatic mutation analysis; SNP genotyping; gene expression analysis; methylation analysis | Effective and accurate high-throughput genotyping platform; able to detect low-frequency mutations ( 10%); premade (Oncocarta) and customized mutation panels available |
| ABI PRISM 3100 Genetic Analyzer (Applied Biosystems)[14,17,18] | Uses allele-specific PCR with oligonucleotide primers and labelled nucleotides for primer extension (SNaPshot assay) combined with capillary electrophoresis and optical imaging | Somatic mutation analysis; SNP genotyping; gene expression analysis; methylation analysis | Effective and accurate high-throughput genotyping platform |
| iScan (Illumina)[15,16] | Uses allele- and locus-specific PCR with oligonucleotide primers; hybridization of assay products onto BeadChip; then imaging of fluorescent signals | Somatic mutation analysis; SNP genotyping; gene expression analysis; methylation analysis | Effective and accurate high-throughput genotyping platform |
| Gene Titan (Affymetrix, Santa Clara, CA) | Uses microarray technology and GeneChip arrays | Somatic mutation analysis; SNP genotyping | Effective and accurate high-throughput genotyping platform |
| aCGH platform (Agilent, Santa Clara, CA)[19] | Uses microarray technology and CGH arrays (including Agilent and Oxford Gene Technology [Oxford, United Kingdom] arrays) to detect copy number variations | aCGH | Effective platform for analysis of copy number variations with high resolution and high throughput |

Abbreviations: aCGH, array-based CGH; CGH, comparative genomic hybridization; MALDI-TOF, matrix-assisted laser desorption/ionization time of flight; NGS, next-generation sequencing; PCR, polymerase chain reaction; SNP, single nucleotide polymorphism.
ABI 3730 DNA Analyzer (using SNaPshot assays; Applied Biosystems), iScan platform (using Goldengate assays; Illumina, San Diego, CA), Affymetrix genotyping arrays (Santa Clara, CA), and MassARRAY platform (using mass spectrometry with matrix-assisted laser desorption/ionization time-of-flight analysis; Sequenom, San Diego, CA).[14-18] These high-throughput genotyping platforms can analyze hundreds to millions of germline and/or somatic variants simultaneously and are distinctly different from sequencing technologies. Whereas DNA sequencing can detect any sequence variant in the gene(s) evaluated, genotyping detects only known variants that have been selected for analysis. Because multiplexed assays and microarrays are relatively inexpensive and provide results rapidly, they are currently the most common technologies used for both somatic and germline mutation genotyping in clinical samples.
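The difference between sequencing and genotyping can be shown with a minimal sketch. The reference sequence, the sample, the one-entry panel, and the function names below are all hypothetical and chosen only for illustration; the sketch also assumes the sample is already aligned to the reference and contains substitutions only.

```python
REFERENCE = "ATGGCGTACGTTAGC"  # made-up reference sequence

# Hypothetical genotyping panel: 0-based position -> variant base being assayed
PANEL = {4: "T"}


def call_by_sequencing(reference: str, sample: str) -> dict[int, str]:
    """Report every position where the sample differs from the reference."""
    return {i: s for i, (r, s) in enumerate(zip(reference, sample)) if r != s}


def call_by_genotyping(sample: str, panel: dict[int, str]) -> dict[int, str]:
    """Report only the preselected variants that are present in the sample."""
    return {pos: alt for pos, alt in panel.items() if sample[pos] == alt}


# The sample carries the panel variant (position 4) and a second variant
# (position 10) that is not on the panel.
sample = "ATGGTGTACGATAGC"
print(call_by_sequencing(REFERENCE, sample))  # finds both variants
print(call_by_genotyping(sample, PANEL))      # finds only the panel variant
```

The genotyping call is cheaper because it inspects only the panel positions, but the variant at position 10 goes unreported because it was never selected for analysis.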
Simple somatic mutations are only one of several types of genetic aberrations that have the potential to be predictive biomarkers. Translocations, DNA amplifications, deletions, methylation, and gene expression also are important, and assaying for these aberrations also can be performed. In the clinic today, fluorescence in situ hybridization (FISH) is the gold standard for identification of ERBB2 amplification.[25-27] However, like restriction fragment length polymorphism, FISH has a low throughput. To identify and validate gene copy number changes that include deletions, gains, and amplifications (when gene copy number is greater than 10), a higher-throughput technology is required. Array-based comparative genomic hybridization is a molecular-cytogenetic method for detection of gene copy numbers that has high resolution and high throughput. Technologies such as the Agilent array-based comparative genomic hybridization platform (Santa Clara, CA) are able to perform high-throughput genotyping for gene copy number variations in clinical samples.[19,28]
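As a rough illustration of how an array-based comparative genomic hybridization signal relates to the categories named above, the sketch below converts a tumor-versus-reference log2 ratio into an estimated copy number and labels it. It assumes a diploid reference and a specimen consisting entirely of tumor cells, which real samples rarely satisfy; the function names are hypothetical, and only the amplification cutoff (copy number greater than 10) comes from the text.

```python
def estimated_copy_number(log2_ratio: float) -> int:
    """Copy number implied by a tumor/reference log2 ratio.

    Assumes the reference carries two copies and the sample is pure tumor.
    """
    return round(2 * 2 ** log2_ratio)


def classify_copy_number(copies: int) -> str:
    """Label a copy number using the categories described in the text."""
    if copies > 10:
        return "amplification"
    if copies > 2:
        return "gain"
    if copies < 2:
        return "deletion"
    return "normal"


for ratio in (-1.0, 0.0, 1.0, 3.0):
    copies = estimated_copy_number(ratio)
    print(f"log2 ratio {ratio:+.1f}: {copies} copies, {classify_copy_number(copies)}")
```

In practice, normal-cell contamination compresses the observed ratios toward zero, so analysis software typically applies noise-aware thresholds rather than the exact conversion used here.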
Multiplexed assays and microarrays are currently the dominant technologies in high-throughput genotyping of somatic mutations, gene copy number variations, and other alterations affecting gene expression or DNA methylation.[29] However, as costs associated with targeted or whole-genome sequencing fall, NGS platforms may become the preferred option.
Bioinformatics
Bioinformatics is the application of statistics and computer science to biology. It includes both information management (eg, genomic databases and visualization) and algorithm development, particularly for the assembly, annotation, and comparison of genomes.[30-33] As detailed in Table 2, these bioinformatic functions are an essential component of cancer genomics. A detailed discussion of bioinformatics in the age of NGS is beyond the scope of this review; instead, the reader is referred to an excellent review of this topic by Pop et al.[30]
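Table 2 notes that alignment tools for short reads must tolerate mismatches, because a read may differ from the reference through a sequencing error or a genuine variant. The brute-force sketch below illustrates that idea only; it is not how BLAST, BLAT, or any NGS aligner works internally (those rely on indexing to handle millions of reads), it ignores gaps, and the names `hamming` and `map_read` and the example sequences are assumptions.

```python
from typing import Optional


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def map_read(reference: str, read: str,
             max_mismatches: int = 1) -> Optional[tuple[int, int]]:
    """Return (position, mismatches) of the best placement, or None.

    Slides the read along the reference and keeps the first placement with
    the fewest mismatches, provided it does not exceed `max_mismatches`.
    """
    best = None
    for pos in range(len(reference) - len(read) + 1):
        distance = hamming(reference[pos:pos + len(read)], read)
        if distance <= max_mismatches and (best is None or distance < best[1]):
            best = (pos, distance)
    return best


reference = "ATGGCGTACGTTAGC"           # made-up reference sequence
print(map_read(reference, "GCGTAC"))    # exact match
print(map_read(reference, "GCGAAC"))    # one mismatch tolerated
print(map_read(reference, "TTTTTT"))    # no acceptable placement
```

A mismatch found this way is only a candidate variant; as Table 2 cautions, it must still be distinguished from technology-specific errors.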
DISCOVERY
The discovery of genetic aberrations in human cancers has identified potential therapeutic targets and provided key insights into the mechanisms underlying tumorigenesis,[34-41] as described in an excellent review by Stratton et al.[42] The ICGC, formed in 2008 and incorporating the Cancer Genome Project of the United Kingdom and the Cancer Genome Atlas of the United States, coordinates research projects that aim to comprehensively elucidate genomic changes present in multiple cancers.[43] Its primary goals are to generate comprehensive catalogues of genomic abnormalities in 500 tumors from each of 50 different cancer types and to accelerate research into the causes and control of cancer.[43]
Whereas ICGC projects aim to develop a molecular map of the genetic aberrations involved in cancer, genome-wide association studies (GWAS) investigate the inherited basis of cancer by comparing common DNA variations in a large set of unrelated patient cases and controls and identifying genetic variants associated with altered risk.[44] Although GWAS are important in the larger scheme of cancer genomics, this article focuses predominantly on acquired genetic aberrations that arise in the genomes of cancer cells.
Results from ICGC studies of glioblastoma multiforme (GBM), ovarian carcinoma, and chronic lymphocytic leukemia have been published.[38,45,46] These studies surveyed for gene mutations, DNA copy number, gene expression, and methylation in large cohorts. The GBM study identified a substantial proportion of tumors with MGMT promoter methylation, now known to be a predictive biomarker for
Table 2. Role of Bioinformatics in Cancer Genomics

| Bioinformatics Function | Background |
| --- | --- |
| Genome alignment and reconstruction | Alignment and reconstruction require reads sufficiently long to be mapped accurately onto reference genome sequence[13,15,21] |
| | Mapping processes must efficiently handle millions of generated sequences while being robust in presence of sequencing errors and SNPs[13,21] |
| | Existing sequencing alignment tools like BLAST or BLAT are adequate for long reads produced by Sanger sequencing, but for short reads produced by NGS, newer alignment tools are being developed to allow for mismatches and/or gaps[13,21] |
| | SNPs identified through this process must be carefully analysed to ensure they are real and not a result of technology-specific errors[13,21] |
| Base calling | Base calling is process of converting sequencing signals into bases and is essential for SNP and somatic variant identification[31] |
| | Improvements in base calling accuracy are essential to reduce false positives and lead to more reliable identification of germline and somatic variants[31,33] |
| De novo genome assembly | De novo genome assembly is like solving a large jigsaw puzzle without knowing the final picture |
| | Several assembly tools have been adapted or independently developed for generating assemblies from short reads[30,31] |
| Genome browsing and annotation | Genome browsing and annotation enable millions of sequences to be available to biomedical community through easily accessible and user-friendly systems; essential for collaboration and progress in research |
| | Commonly used browsers include EntrezGene browser, University of California Santa Cruz genome browser, and European Bioinformatics Institute/Ensembl browser[31] |
| | Commonly used browsers containing cancer mutation datasets include COSMIC and ICGC |
| | National Centre for Biotechnology Information SNP database stores millions of SNPs,[31] which can be useful in classifying mutations into known germline variants |

Abbreviations: BLAST, Basic Local Alignment Search Tool; BLAT, BLAST-Like Alignment Tool; COSMIC, Catalogue of Somatic Mutations in Cancer; ICGC, International Cancer Genome Consortium; NGS, next-generation sequencing; SNP, single nucleotide polymorphism.
temozolomide sensitivity.[38] The ovarian study identified impaired homologous recombination in approximately 50% of tumors, a potential predictor for benefit from poly (ADP-ribose) polymerase inhibitors.[45] The chronic lymphocytic leukemia study demonstrated that NOTCH1 and MYD88 mutations are associated with distinct clinical subgroups with specific biologic features.[46] These results validate the role of cancer genomics in achieving PCM.
Sequenced Cancer Genomes
Much of the focus in cancer genomics has centered on sequencing cancer genomes and identifying potential driver mutations (ie, mutations that are functionally critical in the tumor). Cancer genomes from several tumor types have been sequenced and published (Table 3). The earliest comprehensive survey of human genes in cancer by sequencing methods examined the whole exome (20,661 genes) of 22 human GBM samples, correlating sequencing results with patient outcome data. This study resulted in the unexpected but important finding of IDH1 mutations as a potential favorable prognostic biomarker.[39] These initial results provided early validation of the utility of genome sequencing in cancer.
Whole genomes (or exomes) of acute myeloid leukemia, melanoma, small-cell lung cancer, prostate cancer, pancreatic cancer, hepatocellular carcinoma, and multiple myeloma have also been sequenced. Results from each study illustrate the effectiveness of cancer genome sequencing in furthering our understanding of cancer.[35-37,40,47-49]
Epigenetics and Gene Expression
Epigenetic aberrations in cancer, such as global hypomethylation of DNA, hypermethylation of tumor-suppressor genes, and inactivation of microRNA by DNA methylation, are being systematically studied, because it is clear they can have a significant impact on tumor biology and treatment outcomes.[50] Using transcriptional profiling, gene expression signatures are also being studied extensively in cancer.[51] However, because of challenges associated with reproducibility, only a few validated discoveries have been made to date. These include the novel molecular classification of breast cancer initially reported by Perou et al[52] and the development of validated recurrence scores for early breast cancer such as OncotypeDX (Genomic Health, Redwood
Table 3. Sequenced Cancer Genomes

| Author | Tumor | No. of Samples | Tissue Type | Genome or Exome | Novel Mutations | Novel Mutations in Coding Regions | Comment |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ding et al[34] | Basal-like breast cancer | 1 | Blood, primary, metastasis, xenograft | Genome | 27,173, primary; 51,710, metastasis; 109,078, xenograft | 200, primary; 225, metastasis; 328, xenograft | 48 validated somatic mutations present in all three tumor tissues, with two additional mutations in metastasis |
| Mardis et al[35] | AML | 1 | Tumor, skin | Genome | 20,256 | 113 | Recurrent mutations in IDH1 discovered |
| Ley et al[41] | AML | 1 | Tumor, skin | Genome | 31,632 | 241 | Eight newly defined somatic mutations for AML |
| Pleasance et al[37] | Malignant melanoma | 1 | Cell line, lymphoblastoid cell line | Genome | 33,345 | 292 | Identification of mutation signature caused by exposure to ultraviolet light |
| Pleasance et al[36] | Small-cell lung cancer | 1 | Cell line, lymphoblastoid cell line | Genome | 22,190 | 134 | Identification of mutation signature caused by exposure to tobacco smoke |
| Parsons et al[39] | GBM | 22 | Seven tumors; 15 xenograft, blood | Exome