HUGO Gene Nomenclature Committee (HGNC) Recommendations For The Designation of Gene Fusions


www.nature.com/leu Leukemia

CONSENSUS STATEMENT OPEN

HUGO Gene Nomenclature Committee (HGNC) recommendations for the designation of gene fusions

Elspeth A. Bruford 1,2, Cristina R. Antonescu 3, Andrew J. Carroll 4, Arul Chinnaiyan 5,6, Ian A. Cree 7, Nicholas C. P. Cross 8,9, Raymond Dalgleish 10, Robert Peter Gale 11, Christine J. Harrison 12, Rosalind J. Hastings 13, Jean-Loup Huret 14, Bertil Johansson 15, Michelle Le Beau 16, Cristina Mecucci 17, Fredrik Mertens 15, Roel Verhaak 18,19 and Felix Mitelman 15

© The Author(s) 2021

Gene fusions have been discussed in the scientific literature since they were first detected in cancer cells in the early 1980s. There is
currently no standardized way to denote the genes involved in fusions, but in the majority of publications the gene symbols in
question are listed either separated by a hyphen (-) or by a forward slash (/). Both types of designation suffer from important
shortcomings. HGNC has worked with the scientific community to determine a new, instantly recognizable and unique separator—
a double colon (::)—to be used in the description of fusion genes, and advocates its usage in all databases and articles describing
gene fusions.

Leukemia (2021) 35:3040–3043; https://doi.org/10.1038/s41375-021-01436-6

BRIEF HISTORICAL BACKGROUND OF GENE FUSIONS
Technical developments at the end of the 1970s enabled the identification of genes in the breakpoints of chromosome rearrangements, which in the early 1980s led to the discovery and characterization of gene fusions in neoplasia. While the products of translocation events are referred to by several terms, including fusion genes, hybrid genes and chimeric genes, here we largely choose to use the term fusion genes as this is most widely used in this context. Analyses of the recurrent balanced translocations in Burkitt lymphoma (BL) and chronic myeloid leukemia (CML) proved particularly pivotal. The picture to emerge was that reciprocal translocations exert their effects by one of two alternative mechanisms: deregulation, usually resulting in the overexpression of a seemingly normal gene in one of the breakpoints, or the creation of a hybrid, chimeric gene through fusion of parts of two genes, one in each breakpoint [1] (see Fig. 1).
BL provided the first conclusive evidence for the deregulation mechanism. This tumor type was found to harbor one of three translocations: t(8;14)(q24;q32), t(2;8)(p11;q24) or t(8;22)(q24;q11). In all three, the breakpoints in chromosome 8 were found to be within or adjacent to the MYC oncogene (8q24), and the other breakpoint in an immunoglobulin gene, encoding the heavy chain (IGH in 14q32) or the kappa (IGK in 2p11) or lambda (IGL in 22q11) light chains [2–4]. As a consequence of these translocations, the MYC gene becomes transcriptionally deregulated, often overexpressed, owing to the influence of regulatory elements of the immunoglobulin genes.
The alternative mechanism, the creation of a hybrid gene, was documented at the same time in CML with the demonstration that the Philadelphia chromosome, i.e., the derivative chromosome 22 resulting from the recurrent reciprocal translocation t(9;22)(q34;q11), juxtaposed the 5′ part of the BCR gene at 22q11 with the 3′ part of the ABL1 tyrosine kinase-encoding gene from 9q34. This leads to an in-frame fusion of parts of the two genes and results in an abnormal protein, which displays increased tyrosine kinase activity [5–8] (see Fig. 1).
These and similar molecular insights into how cancer-specific chromosomal abnormalities act pathogenetically sparked an enormous interest in cytogenetics as a powerful means to pinpoint the locations of genes important in tumorigenesis, and an impressive amount of information has been accumulated through these efforts. Almost 1000 gene fusions have been found by genomic characterization of breakpoints in cytogenetically identified aberrations, balanced as well as unbalanced, in various leukemias, lymphomas, and solid tumors. The accumulated data have shown that the consequences of practically all gene fusions are in principle the same as those originally elucidated in BL and

1 HUGO Gene Nomenclature Committee (HGNC), European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK.
2 Department of Haematology, University of Cambridge School of Clinical Medicine, Cambridge, UK.
3 Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
4 Department of Genetics, University of Alabama at Birmingham, Birmingham, AL, USA.
5 University of Michigan Medical School, Ann Arbor, MI, USA.
6 Howard Hughes Medical Institute, Chevy Chase, MD, USA.
7 International Agency for Research on Cancer, World Health Organization, Lyon, France.
8 Faculty of Medicine, University of Southampton, Southampton, UK.
9 Wessex Regional Genetics Laboratory, Salisbury NHS Foundation Trust, Salisbury, Wiltshire, UK.
10 Department of Genetics and Genome Biology, University of Leicester, Leicester, UK.
11 Centre for Haematology Research, Department of Immunology and Inflammation, Imperial College London, London, UK.
12 Translational and Clinical Research Institute, Newcastle University Centre for Cancer, Newcastle upon Tyne, UK.
13 The Women’s Centre, John Radcliffe Hospital, Oxford University Hospitals Foundation Trust, Oxford, UK.
14 10 rue des Treilles, Masseuil, Quinçay, France.
15 Division of Clinical Genetics, Department of Laboratory Medicine, Lund University, Lund, Sweden.
16 Comprehensive Cancer Center, University of Chicago, Chicago, IL, USA.
17 Department of Medicine and Surgery, University of Perugia, Perugia, Italy.
18 Jackson Laboratory for Genomic Medicine, Farmington, CT, USA.
19 Department of Neurosurgery, Amsterdam University Medical Center, Amsterdam, Netherlands.
✉email: [email protected]

Received: 8 September 2021 Revised: 10 September 2021 Accepted: 21 September 2021
Published online: 6 October 2021

Fig. 1 The chromosomal basis of gene fusions. a Gene fusions may originate through balanced and unbalanced chromosome
rearrangements. Balanced changes comprise translocations (the transfer of chromosome segments between chromosomes), insertions (a
chromosome segment in a new interstitial position in the same or another chromosome) and inversions (rotation of a chromosome segment
by 180°); an example of an unbalanced change is the deletion of an interstitial chromosomal segment. Small arrows indicate breakpoints, and
large arrows indicate the resulting rearranged chromosomes. A and B signify affected genes. Note that a reciprocal gene fusion may be
generated on the partner derivative chromosome as a result of a reciprocal translocation, but this is not shown. b Both balanced and
unbalanced aberrations may lead to the deregulation of either gene A or gene B by the juxtaposition of the coding sequences with the
regulatory sequences of the other gene, or to the creation of a chimeric gene through the fusion of parts of both genes.

CML, i.e., deregulation of a seemingly normal gene or the creation of a hybrid gene. However, not all gene fusions have translational consequences, and some could result in gene inactivation [1, 9, 10].
The advent of massively parallel sequencing (MPS) has recently provided a radically new means to identify fusions at the DNA or RNA levels without any prior information on the cytogenetic features of the neoplastic cells. The results of such unbiased gene fusion detection efforts during the last decade have dramatically changed the gene fusion landscape. More than 30,000 gene fusions, the great majority involving previously unsuspected genes, have now been identified through deep sequencing in a wide variety of neoplasms and these are reported in a number of online resources, e.g., https://mitelmandatabase.isb-cgc.org/, http://atlasgeneticsoncology.org/, https://cancer.sanger.ac.uk/cosmic, https://ccsm.uth.edu/FusionGDB/, http://www.kobic.re.kr/chimerdb/, https://tumorfusions.org/. A major challenge will be to verify by functional studies which of the alleged gene fusions are pathogenetically important in carcinogenesis, and which are either secondary progressional changes or non-consequential “noise” abnormalities, e.g., by-products of the genetic instability that characterizes many cancer cells. It is important to note that while gene fusions are a hallmark of neoplasia, they can also occur in heritable disorders such as the formation of the Lepore and anti-Lepore haemoglobins from the HBD and HBB genes [11].
There has never been a generally recommended, standardized way to denote gene fusions. Instead, multiple notations have been used with varying popularity over time, though the most common designation is SYMBOL-SYMBOL followed by SYMBOL/SYMBOL, both of which we critique below.

PROBLEMS WITH THE NOMENCLATURE USED TO DESCRIBE GENE FUSIONS
The SYMBOL-SYMBOL notation, e.g., BCR-ABL1, to denote fusion genes has three important shortcomings:
(1) The HGNC has approved the use of the hyphen separator, in collaboration with all contributing genome annotation groups involved in the Consensus Coding Sequence (CCDS) Project [12], for denoting readthrough transcripts, e.g., INS-IGF2.
(2) A hyphen is often used in the literature to denote members of a complex, e.g., MRE11-NBN, MRE11-RAD50-NBN.
(3) There are also specific groups of approved gene symbols containing hyphens as separators within the symbol, e.g., TRX-CAT1-2.
Hence, it is difficult to search specifically for gene fusions in databases and in the literature using the hyphen symbol.
The SYMBOL/SYMBOL notation, e.g., BCR/ABL1, has at least four major disadvantages:
(1) The forward slash is an accepted symbol in the established cytogenetic International System for Human Cytogenomic Nomenclature (ISCN) to denote different clones, both constitutionally (mosaicism) and in cancer cells; the Human Genome Variation Society (HGVS) guidelines (https://varnomen.hgvs.org/recommendations/general/) also use a forward slash to indicate mosaicism [13].
(2) The forward slash is often used in the literature in place of “either/or”, e.g., BRCA1/2, and to denote involvement of alternative genes in a fusion, e.g., “SS18-SSX1/SSX2”.
(3) Pathway and complex descriptions use this character, e.g., RAS/RAF/MAPK.

(4) Commercial dual fusion fluorescence in situ hybridization translocation probes (CE marked) also use a forward slash to indicate the two probe sets used, e.g., BCR/ABL1.
In view of the considerations listed above, and hence the clear need for a standardized and unique way to denote gene fusions, the HGNC concluded that an alternative needed to be sought to replace the use of either a hyphen (-) or a forward slash (/).

RECOMMENDED NEW NOMENCLATURE TO DESCRIBE GENE FUSIONS
After careful deliberation, and consultations with experts in the field, HGNC recommends that a new separator, a double colon (::), be used in describing gene fusions, e.g., BCR::ABL1. The double colon (::) has several important advantages:
First, it follows the long-standing recommendation of the internationally accepted ISCN cytogenetic nomenclature in which a single colon (:) is used to indicate a chromosome break and a double colon (::) to denote break and reunion [14]. The :: separator thus nicely reflects the principal mode of origin of most fusion genes. We are aware that fusion transcripts may occasionally originate at the RNA level through cis- or trans-splicing without a genomic breakage and reunion correlate, but we deem it unnecessary to create a different nomenclature system for such events, especially since the HGVS already recommends using the double colon to describe RNA fusion transcripts (https://varnomen.hgvs.org/recommendations/general/).
Secondly, it is instantly recognizable and creates a unique symbol in the existing gene nomenclature, and hence is easily searchable in databases and in the literature.
Thirdly, different gene fusions found in different single cells or in separate clones within the same tumor will be easily recognizable, i.e., SYMBOL::SYMBOL/SYMBOL::SYMBOL.

SPECIFIC RECOMMENDATIONS WHEN DESCRIBING GENE FUSIONS
In line with established HGNC recommendations [15], genes involved in fusions should be designated by their HGNC approved gene symbols written in italics, whereas proteins are not italicized, e.g., BCR::ABL1 in italics denotes a fusion of the BCR and ABL1 genes, while BCR::ABL1 in roman type designates the corresponding protein product. By convention, fusion transcripts identified at the RNA level, e.g., by RNA-seq, are designated as genes, i.e., in italics. The double colon (::) should be used for all types of gene fusions, i.e., both those giving rise to a hybrid, chimeric gene (BCR::ABL1) and those where regulatory elements from one gene deregulate a partner gene (IGH::MYC).
In accordance with established practice in designating gene fusions, the 5′ partner gene should always be listed first in the description of a fusion gene, i.e., before the double colon, irrespective of chromosomal location or the orientation of the gene. Thus, in the BCR::ABL1 fusion gene, the outcome of the translocation t(9;22)(q34.1;q11.2), the BCR gene in chromosome 22 is the 5′ gene and the ABL1 gene from chromosome 9 is the 3′ gene.
In tables in scientific articles and in databases presenting gene fusions, the two genes are often designated either as “Gene A” and “Gene B” or “Gene 1” and “Gene 2”. Thus, it is not explicitly stated that “Gene A” or “Gene 1” represents the 5′ gene although this is usually the case. To avoid ambiguous interpretations, HGNC recommends that 5′ and 3′ genes be clearly indicated in tables showing gene fusions. In fusions giving rise to a deregulated gene the regulatory or enhancer element should be listed first, whenever known.
If one of the genes in a fusion is unknown this may be indicated by either a question mark (?) or by the chromosomal band where the breakpoint is located, following ISCN convention. If one of the breakpoints lies in an intergenic region and the genomic coordinate of that breakpoint is known, this can be denoted in an abbreviated format for publication as chr#:g.coordinate number. For example, ABL1::? denotes a fusion between ABL1 and an unknown gene, 6q25::ABL1 a fusion between an unknown gene located in chromosome band 6q25 and ABL1, and ABL1::chr11:g.1850000 a fusion between ABL1 and a breakpoint at nucleotide 1,850,000 on chromosome 11. In the first and third examples the unknown gene and intergenic region are the 3′ partners, and in the second example the unknown gene is the 5′ partner. Full ISCN [14] (https://iscn.karger.com/) or HGVS (https://varnomen.hgvs.org/recommendations/DNA/variant/complex/) nomenclature should be used for formal reporting.
Note that HGNC always advocates listing a stable gene ID, ideally an HGNC ID, when referencing genes in publications. We do not recommend that the IDs be included in the fusion notation, but rather in the accompanying text, e.g., a gene fusion involving BCR (HGNC:1014) and ABL1 (HGNC:76) is denoted as BCR::ABL1.

CONCLUSIONS
There has long been a need for a unique, standardized and easily recognizable way to symbolize gene fusion events consistently, both in the literature and in databases. Following consultation with experts in the field of gene fusions, HGNC recommends the use of the separator “::”, a double colon, between approved gene symbols, to designate the genes involved in gene fusion events, e.g., BCR::ABL1. This recommendation is further endorsed by all the authors of this manuscript, by the HGVS and the ISCN, and by the following resources: the WHO Classification of Tumors, COSMIC, OMIM, Atlas of Genetics and Cytogenetics in Oncology and Haematology, Mitelman Database of Chromosome Aberrations and Gene Fusions in Cancer, and the Tumor Fusion Gene Data Portal. We urge all readers to use and publicize this form of notation for describing gene fusions in all future communications to avoid confusion. We recognize that this is a newly established recommendation that could be further developed in due course, and welcome feedback from the community (via the HGNC website, www.genenames.org).

REFERENCES
1. Mertens F, Johansson B, Fioretos T, Mitelman F. The emerging complexity of gene fusions in cancer. Nat Rev Cancer. 2015;15:371–81.
2. Dalla-Favera R, Bregni M, Erikson J, Patterson D, Gallo RC, Croce CM. Human c-myc onc gene is located on the region of chromosome 8 that is translocated in Burkitt lymphoma cells. Proc Natl Acad Sci USA. 1982;79:7824–7.
3. Taub R, Kirsch I, Morton C, Lenoir G, Swan D, Tronick S, et al. Translocation of the c-myc gene into the immunoglobulin heavy chain locus in human Burkitt lymphoma and murine plasmacytoma cells. Proc Natl Acad Sci USA. 1982;79:7837–41.
4. Erikson J, Nishikura K, ar-Rushdi A, Finan J, Emanuel B, Lenoir G, et al. Translocation of an immunoglobulin κ locus to a region 3’ of an unrearranged c-myc oncogene enhances c-myc transcription. Proc Natl Acad Sci USA. 1983;80:7581–5.
5. Heisterkamp N, Stam K, Groffen J, de Klein A, Grosveld G. Structural organization of the bcr gene and its role in the Ph’ translocation. Nature. 1985;315:758–61.
6. Shtivelman E, Lifshitz B, Gale RP, Canaani E. Fused transcript of abl and bcr genes in chronic myelogenous leukaemia. Nature. 1985;315:550–4.
7. Ben-Neriah Y, Daley GQ, Mes-Masson AM, Witte ON, Baltimore D. The chronic myelogenous leukemia-specific P210 protein is the product of the bcr/abl hybrid gene. Science. 1986;233:212–4.
8. Fainstein E, Marcelle C, Rosner A, Canaani E, Gale RP, Dreazen O, et al. A new fused transcript in Philadelphia chromosome positive acute lymphocytic leukaemia. Nature. 1987;330:386–8.
9. Kumar-Sinha C, Kalyana-Sundaram S, Chinnaiyan AM. Landscape of gene fusions in epithelial cancers: seq and ye shall find. Genome Med. 2015;7:129.
10. Rowley JD, Le Beau MM, Rabbitts TH, editors. Chromosomal Translocations and Genome Rearrangements in Cancer. New York: Springer; 2015.
11. Efremov GD. Hemoglobins Lepore and anti-Lepore. Hemoglobin. 1978;2:197–233.
12. Pujar S, O’Leary NA, Farrell CM, Loveland JE, Mudge JM, Wallin C, et al. Consensus coding sequence (CCDS) database: a standardized set of human and mouse

protein-coding regions supported by expert curation. Nucleic Acids Res. 2018;46:D221–D228.
13. den Dunnen JT, Dalgleish R, Maglott DR, Hart RK, Greenblatt MS, McGowan-Jordan J, et al. HGVS recommendations for the description of sequence variants: 2016 Update. Hum Mutat. 2016;37:564–9.
14. McGowan-Jordan J, Hastings R, Moore S, editors. An International System for Human Cytogenomic Nomenclature. Cytogenet Genome Res. 2020;160:341–503.
15. Bruford EA, Braschi B, Denny P, Jones TEM, Seal RL, Tweedie S. Guidelines for human gene nomenclature. Nat Genet. 2020;52:754–8.

ACKNOWLEDGEMENTS
EAB is currently funded by the National Human Genome Research Institute (NHGRI) grant U24HG003345 and Wellcome Trust grant 208349/Z/17/Z. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health, nor the views of the authors’ employers and associated institutions. Where authors are identified as personnel of the International Agency for Research on Cancer/World Health Organization, the authors alone are responsible for the views expressed in this article and they do not necessarily represent the decisions, policy, or views of the International Agency for Research on Cancer/World Health Organization. We thank our colleagues from the College of American Pathologists, and Dr Zbyslaw Sondka from COSMIC, for useful discussions.

AUTHOR CONTRIBUTIONS
RPG, NCPC and EAB instigated the initial discussions. EAB coordinated the work, FM and EB drafted the paper, and JLH created the figure. All authors commented on and edited previous drafts, and approved the final paper.

COMPETING INTERESTS
MLB is a member of the Board of Directors of the American Cancer Society and a member of the ACS Cancer Action Network; RV is a co-founder of, and has received research funding from, Boundless Bio.

ADDITIONAL INFORMATION
Correspondence and requests for materials should be addressed to Elspeth A. Bruford.
Reprints and permission information is available at http://www.nature.com/reprints
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
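
ILLUSTRATIVE SKETCH OF THE NOTATION
The designation rules given under the specific recommendations (5′ partner before the double colon; a question mark, a chromosomal band, or chr#:g.coordinate for a partner that is not a known gene; a forward slash between fusions found in different clones) lend themselves to a small parser. The Python sketch below is not part of the HGNC recommendation. All function names and the regular expressions are assumptions made for illustration, and the symbol check only tests the rough shape of a symbol rather than looking it up in HGNC.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple

SEPARATOR = "::"        # fusion separator recommended in the text
CLONE_SEPARATOR = "/"   # separates fusions found in different clones or cells

# Hypothetical patterns, not an official grammar.
CHROM = r"(?:[1-9]|1[0-9]|2[0-2]|X|Y)"
BAND = re.compile(rf"^{CHROM}[pq]\d+(?:\.\d+)?$")   # e.g. 6q25
COORD = re.compile(rf"^chr{CHROM}:g\.\d+$")         # chr#:g.coordinate
SYMBOL = re.compile(r"^[A-Za-z][A-Za-z0-9-]*$")     # rough shape only


@dataclass(frozen=True)
class Partner:
    text: str
    kind: str  # "symbol", "unknown", "band" or "coordinate"


def classify(text: str) -> Partner:
    """Decide which of the partner forms described in the text this is."""
    if text == "?":
        return Partner(text, "unknown")
    if COORD.match(text):
        return Partner(text, "coordinate")
    if BAND.match(text):
        return Partner(text, "band")
    if SYMBOL.match(text):
        return Partner(text, "symbol")
    raise ValueError(f"unrecognized fusion partner: {text!r}")


def parse_fusion(designation: str) -> Tuple[Partner, Partner]:
    """Split 'A::B' into (5' partner, 3' partner); the 5' partner is listed first."""
    parts = designation.split(SEPARATOR)
    if len(parts) != 2:
        raise ValueError(f"expected exactly one '::' in {designation!r}")
    five_prime, three_prime = (classify(p.strip()) for p in parts)
    return five_prime, three_prime


def parse_clones(text: str) -> List[Tuple[Partner, Partner]]:
    """Parse 'A::B/C::D', i.e. different fusions in separate clones."""
    return [parse_fusion(part) for part in text.split(CLONE_SEPARATOR)]


def format_fusion(five_prime: str, three_prime: str) -> str:
    """Build a designation, validating both partners first."""
    return f"{classify(five_prime).text}{SEPARATOR}{classify(three_prime).text}"
```

With these definitions, parse_fusion("BCR::ABL1") yields BCR as the 5′ partner and ABL1 as the 3′ partner, whereas a hyphenated string such as INS-IGF2 is rejected because it contains no double colon. This mirrors the argument that the double colon is unambiguous where the hyphen and forward slash are not.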