Mass Spectrometry-Based Metabolomics
https://doi.org/10.1038/s41592-021-01197-1
Mass spectrometry-based metabolomics approaches can enable detection and quantification of many thousands
of metabolite features simultaneously. However, compound identification and reliable quantification are greatly
complicated owing to the chemical complexity and dynamic range of the metabolome. Simultaneous
quantification of many metabolites within complex mixtures can additionally be complicated by ion suppression,
fragmentation and the presence of isomers. Here we present guidelines covering sample preparation, replication
and randomization, quantification, recovery and recombination, ion suppression and peak misidentification,
as a means to enable high-quality reporting of liquid chromatography– and gas chromatography–mass
spectrometry-based metabolomics-derived data.
Metabolomics, the large-scale study of the metabolic complement of the cell1–3, is a mature science that has been practiced for over 20 years4. Indeed, it is now a commonly used experimental systems biology tool with demonstrated utility in both fundamental and applied aspects of plant, microbial and mammalian research5–15. Among the many thousands of studies published in this area over the last 20 years, notable highlights5–8,10,11,16 are briefly described in Supplementary Note 1.
Despite the insight afforded by such studies, the nature of metabolites, particularly their diversity (in both chemical structure and dynamic range of abundance9,12), remains a major challenge with regard to the ability to provide adequate coverage of the metabolome that can complement that achieved for the genome, transcriptome and proteome. Despite these comparative limitations, enormous advances have been made with regard to the number of analytes about which accurate quantitative information can be acquired, and a vast number of studies have yielded important biological information and biologically active metabolites across the kingdoms of life14. We have previously estimated that upwards of 1 million different metabolites occur across the tree of life, with between 1,000 and 40,000 estimated to occur in a single species4.
1Max Planck Institute of Molecular Plant Physiology, Potsdam-Golm, Germany. 2Institute of Plants Systems Biology and Biotechnology, Plovdiv,
Bulgaria. 3Department of Plant and Environmental Sciences, Weizmann Institute of Science, Rehovot, Israel. 4Department of Life Sciences, Ben
Gurion University of the Negev, Beersheva, Israel. 5Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA.
6Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany. 7Department of Bioinformatics, University of Jena,
Jena, Germany. 8Interfaculty Institute of Cell Biology, Eberhard Karls University of Tuebingen, Tuebingen, Germany. 9Biological Sciences, Royal
Holloway University of London, Egham, UK. 10Max Planck Institute for Biology of Ageing, Cologne, Germany. 11BU Bioscience, Wageningen
Research, Wageningen, the Netherlands. 12Laboratory of Plant Physiology, Wageningen University, Wageningen, the Netherlands. 13Molecular
Systems Biology, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Groningen, the Netherlands. 14Max
Planck Institute for Terrestrial Microbiology, Marburg, Germany. 15College of Tropical Crops, Hainan University, Haikou, China. 16Bioinformatics
and Scientific Data, Leibniz Institute for Plant Biochemistry, Halle, Germany. 17BioInnovation Institute, Copenhagen, Denmark. 18Department of
Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden. 19Plant Molecular Science Center, Chiba
University, Chiba, Japan. 20RIKEN Center for Sustainable Resource Science, Yokohama, Japan. 21Institute for Molecular Systems Biology, ETH
Zurich, Zurich, Switzerland. 22Boyce Thompson Institute and Department of Chemistry and Chemical Biology, Cornell University, Ithaca, NY,
USA. 23Center for Metabolomics and Mass Spectrometry, Scripps Research Institute, La Jolla, CA, USA. 24Department of Biochemistry and MU
Metabolomics Center, University of Missouri, Columbia, MO, USA. 25State Key Laboratory of Genetic Engineering, Zhongshan Hospital and
School of Life Sciences, Human Phenome Institute, Metabonomics and Systems Biology Laboratory at Shanghai International Centre for
Molecular Phenomics, Fudan University, Shanghai, China. 26Department of Biological Science, Nara Institute of Science and Technology, Ikoma,
Japan. 27Singapore Phenome Center, Lee Kong Chian School of Medicine, School of Biological Sciences, Nanyang Technological University,
Nanyang, Singapore. 28Key Laboratory of Horticultural Plant Biology (MOE), College of Horticulture and Forestry Sciences, Huazhong
Agricultural University, Wuhan, China. 29CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical
Physics, Chinese Academy of Sciences, Dalian, China. ✉e-mail: [email protected]; [email protected]
[Figure 1 schematic. Recoverable labels: Grinding; Extraction; Mass spectrometry (TOF, Orbitrap, Q, IT); ions fly or oscillate in the mass spectrometer according to their m/z; Detector; axes: intensity versus m/z.]
Fig. 1 | Metabolomics workflow. Metabolomics involves several basic steps: (1) sample preparation and extraction; (2) metabolite separation
on a column (chromatography) such as by GC, LC or EC; (3) ionization of metabolites using an ion source; (4) separation by a mass analyzer as
ions fly or oscillate on the basis of their mass-to-charge (m/z) ratio; and (5) detection. Metabolites can be identified on the basis of a combination
of retention time (RT) and MS signature. TOF, time of flight; Q, quadrupole; IT, ion trap.
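The idea in the caption above, that a metabolite is annotated from a combination of retention time and MS signature, can be sketched as a simple two-tolerance lookup. This is a minimal illustration and not the authors' procedure: the class `LibraryEntry`, the function names, the tolerances (5 ppm, 0.2 min) and the second library entry are assumptions introduced here. The rutin values (m/z 611.1607, RT 6.85 min, measured m/z 611.1604) are the example values shown in the article's documentation figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LibraryEntry:
    """One reference compound (hypothetical structure for this sketch)."""
    name: str
    mz: float       # theoretical m/z of the expected ion
    rt_min: float   # reference retention time in minutes


def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def match_feature(mz, rt_min, library, ppm_tol=5.0, rt_tol_min=0.2):
    """Return names of entries that agree in BOTH m/z and retention time.

    The tolerances are illustrative assumptions, not recommended values.
    """
    return [
        entry.name
        for entry in library
        if abs(ppm_error(mz, entry.mz)) <= ppm_tol
        and abs(rt_min - entry.rt_min) <= rt_tol_min
    ]


LIBRARY = [
    LibraryEntry("rutin [M+H]+", 611.1607, 6.85),
    # Hypothetical entry with the same m/z but a different RT, to show
    # why m/z alone is not enough.
    LibraryEntry("hypothetical isobar", 611.1607, 9.40),
]

hits = match_feature(611.1604, 6.85, LIBRARY)
print(hits)
```

Requiring agreement on both axes is what lets the lookup reject the isobaric entry; a match of this kind is still only a putative annotation until confirmed with an authenticated standard.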
drying and lack of storage in sealed containers can generate artifactual geometric isomers of pigments39. Freeze drying is also unsuitable when volatile components are of interest. While the appropriate means of storage is strictly dependent on the stability of the class of targeted metabolites under study, it is not recommended to store samples between 0 and −40 °C. At these temperatures, substances can become concentrated in a residual aqueous phase39. It is therefore recommended, where necessary, to store completely dry residues for as short a time as possible before their analysis. In addition, great care must be taken to ensure that metabolism remains quenched during thawing. This is particularly pertinent for extracts containing secondary metabolites. In such extracts, degradative enzymes often retain their activities, which, if not kept in check, may result in the consumption or conversion of certain metabolites with a concomitant appearance of new compounds or breakdown products51.
Similar issues are also present with respect to both the experimental growth media and the initial extraction solvents used. Growth media often need to be removed via multiple wash steps to reduce the effects of ion suppression during the subsequent MS analysis, and the solvent used for initial extraction may need to be exchanged owing to incompatibility with the instrumentation used for the metabolite analysis. Two pitfalls are pertinent here: (1) the washing process results in the loss of metabolites and (2) solvent removal leads to concentration of the metabolites and thereby an acceleration of chemical reactions between them. Thus, considerable caution is advised in method optimization to ensure that extraction and handling methods allow adequate quantitative representation of cellular metabolites. In some instances, such as the analysis of volatile or semivolatile compounds, sample extraction and handling should only be performed on fresh material. We strongly recommend the adoption of recovery and recombination experiments (see below) when either a substantially novel metabolomics technique is introduced or a novel cell type, tissue or organism is studied.

Sample replication and randomization
An important issue is the nature and number of biological, technical and analytical replicates. Before using any new extraction protocol or analytical procedure and when working with new biological materials, it is essential to perform extensive pilot experiments to fully assess the technical variation that is necessary to design a statistically sound experiment. To avoid misunderstanding, we refer readers to the definitions of each type of replicate provided in ref. 39. While analytical replicates, that is, replicates corresponding to repeated injection of the exact same extract, are useful in assessing machine performance, technical replicates, which encompass the entire experimental procedure, allow a far more comprehensive assessment of any experimental variance in data generation39. Indeed, such analyses are essential for the establishment of a new extraction or processing procedure or a new analytical technique as well as for the optimization of a new instrument.
Biological replication is even more important and should involve at least four but preferably more replicates; the required number of replicates depends on the desired statistical power, effect size and actual variance52. Care must be taken to acquire such replicates in a highly uniform manner. For plants, this can also mean collecting samples at the same time of day and under the same environmental conditions. In many instances, a full and independent repeat of a biological experiment is advisable53. There are different stages where technical replicates can be made: at sampling, quenching, extraction and analysis, replicates can be made independently of the entire process. In our experience, the extraction step is the most critical of these. Whether technical replication is needed in support of biological replication is highly dependent on the relative magnitudes of variation; in cases in which the biological variation greatly exceeds the technical variation, it is sensible to sacrifice the latter to increase the former. With new systems, pilot experiments are highly recommended to evaluate biological and technical variation and hence determine how many samples and how many replicates are needed to achieve statistical robustness52.
Careful spatiotemporal randomization of biological samples throughout a metabolomics experiment is equally essential. If a set of samples is analyzed in a nonrandom order, treatment and control samples or time points may end up being measured under very different conditions. As a result, interpretation can be confounded by
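One way to avoid a nonrandom run order, as discussed in the randomization passage above, is to build the injection sequence in randomized blocks so that every stretch of the run contains each experimental group. The sketch below is only an illustration under stated assumptions: the function name `randomized_block_order`, the sample labels and the fixed seed are hypothetical, and the article does not prescribe this particular scheme.

```python
import random


def randomized_block_order(samples_by_group, seed=0):
    """Return a run order built from randomized blocks.

    Each block holds at most one sample per group, and the order inside
    each block is shuffled, so no group is measured only early or only
    late in the run. The seed makes the order reproducible for reporting.
    """
    rng = random.Random(seed)
    # Shuffle within each group so replicate number does not map to position.
    shuffled = {
        group: rng.sample(samples, len(samples))
        for group, samples in samples_by_group.items()
    }
    n_blocks = max(len(samples) for samples in shuffled.values())
    order = []
    for i in range(n_blocks):
        block = [samples[i] for samples in shuffled.values() if i < len(samples)]
        rng.shuffle(block)
        order.extend(block)
    return order


# Hypothetical design: four biological replicates per group.
samples = {
    "control": ["C1", "C2", "C3", "C4"],
    "treatment": ["T1", "T2", "T3", "T4"],
}
run_order = randomized_block_order(samples, seed=42)
print(run_order)
```

Recording the seed alongside the run order in the methods section lets another laboratory reconstruct exactly which sample was injected when.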
[Figure residue: chart with panel title "Chromatography separation" and a percentage axis (0–100%); category labels not recoverable.]

A further aspect of quantification is the basis on which quantities are expressed for tissue samples. Data are often provided per gram of fresh or dry weight, while for body fluids they are often provided per volume. The case of cellular metabolomics is more complicated given that cell size is often variable; values are therefore typically provided [...] cell, yet is often not given enough consideration by the community.
Recovery and recombination experiments
Recovery experiments, in which authenticated standard com[...]

[Figure residue: chart with a numeric axis (60–140); axis title and category labels not recoverable.]
instruments detect on the [...] features; however, these include a large number of adduct and isotope peaks. Bioinformatics tools for analyte identification take this into account and even use commonly observed adducts as a means of identifying analytes (discussed in detail below). Nonetheless, there are three common problems that contribute to misidentification.
First, isomers (compounds with an identical molecular formula but distinct structures) are common in nature. Important examples from primary metabolism [...] and inositol phosphates, citrate and isocitrate, glucose and fructose, and alanine and sarcosine. High-resolution MS [...] discriminate between these and other sets of isomers, especially when fragmentation patterns are similar, and some types of isomers may not separate well on conventional reverse-phase high-performance LC (HPLC). To improve separation, reverse-phase ion pairing chromatography, hydrophilic interaction chromatography (HILIC) and other chromatographic methods can be used; another option is chemical derivatization before chromatography12. In cases where isomers cannot be separated, this needs to be clearly stated because such compounds may have greatly different biological functions.
Second, the presence of overlapping compounds may prevent detection of some metabolites. While the increasingly high resolution of mass spectrometers has mitigated this issue to some extent, the resolving power of many current instruments is insufficient to separate ions differing in mass by less than 5 parts per million (ppm)12. This problem, however, is only acute when chromatography is also unable to separate analytes that cannot be separated on the basis of mass.
The third major hurdle (which is more relevant for LC–MS than GC–MS) is the formation of in-source degradation products. These are by-product ions of ESI due to simple loss of water, carbon dioxide or hydrogen phosphate, more complicated molecular rearrangements and the attachment of other ions. In-source degradation reduces the intensity of the metabolite parent ion, and the resulting fragment ions may confound analysis of other co-eluting compounds, for example, if they have the same molecular formula as the molecular ion of another metabolite12. We provide examples of these from our own work in Supplementary Fig. 1. These examples demonstrate the need for careful manual curation of all peak assignments, which, however, is often not feasible when annotating several hundred or thousand metabolites (Fig. 4). In ambiguous cases, the exact identification of a peak can often be best demonstrated via comparative biochemical approaches, for example, by analyzing the metabolome in known mutants that can be anticipated to lack certain metabolites24,85 or incubation of a purified peak with known enzymes or chemical treatments73. These methods can also be combined with other approaches such as using authenticated standards for isomer annotation86 and dual-labeling approaches87.

[Fig. 4, panel c (Documentation). Recoverable labels: Samples; Metabolite name; Chemical formula; Measured m/z 611.1604; Theoretical m/z 611.1607; RT 6.85 min; Identification level; Fragmentation MS/MS; International ID (e.g., HMDB, PubChem, KEGG, etc.); Public repositories; Data analysis, visualization and identification.]
Fig. 4 | Workflow for metabolic data processing and downstream result documentation. a,b, Structure elucidation workflow for data acquisition (a) and processing and annotation (b). c, Simple design for metabolic data documentation and how data can be linked to the mzTab49 tool to facilitate data representation, sharing and deposition to public repositories.

As an aside, a critical aspect of nontargeted metabolomics is peak filtering. Metabolomics datasets from such studies contain a large proportion of uninformative features that can impede subsequent statistical analysis, and there is thus a need for versatile and data-adaptive methods for filtering data before investigating the underlying biological phenomena88. A list of suggestions for the design and implementation of data filtering strategies is provided in Supplementary Note 5.

Reporting transparency
To fully exploit metabolomics data, they need to be comparable between different laboratories. Indeed, several comparative studies have been published, as we detail in Supplementary Note 6. In addition to comparability at a quantitative level, clear metabolite ontologies are also needed to ensure that metabolites are annotated in a common fashion (Supplementary Note 7).
To ensure that methods can be readily adopted by others, a wealth of detailed information is required. However, detailed descriptions of sample preparation and analytical procedures are often (at least partially) absent in publications, especially in cases where metabolomics is not the primary focus of the published work. We recommend that the following items be considered as mandatory components of any methods section for metabolomics experiments.
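Two of the misidentification hurdles discussed in this part of the article, mass differences of a few ppm and in-source degradation products, lend themselves to a short numerical sketch. The code below is a hedged illustration, not a method from the article: the function names, the tolerances and the feature list are hypothetical. Only water and carbon dioxide losses are included, because the text does not specify which phosphate species is lost. The neutral-loss masses are standard monoisotopic values.

```python
# Monoisotopic masses of two neutral losses named in the text (Da).
NEUTRAL_LOSSES = {"H2O": 18.010565, "CO2": 43.989829}


def ppm_difference(mz_a: float, mz_b: float) -> float:
    """Absolute difference between two m/z values in parts per million."""
    return abs(mz_a - mz_b) / mz_b * 1e6


def flag_in_source_candidates(features, ppm_tol=5.0, rt_tol_min=0.05):
    """Flag co-eluting feature pairs separated by a known neutral loss.

    features: list of (mz, rt_min) tuples.
    Returns (parent_mz, fragment_mz, loss_name) tuples. A flag is only a
    prompt for manual curation, not proof of in-source degradation.
    """
    flagged = []
    for parent_mz, parent_rt in features:
        for frag_mz, frag_rt in features:
            if frag_mz >= parent_mz or abs(parent_rt - frag_rt) > rt_tol_min:
                continue
            for loss_name, loss_mass in NEUTRAL_LOSSES.items():
                if ppm_difference(frag_mz, parent_mz - loss_mass) <= ppm_tol:
                    flagged.append((parent_mz, frag_mz, loss_name))
    return flagged


# Hypothetical feature list: the second feature co-elutes with the first and
# sits one water loss below it; the third has a CO2-like spacing but elutes
# much later, so it is not flagged.
features = [(191.0197, 3.20), (173.0092, 3.21), (147.0299, 5.00)]
candidates = flag_in_source_candidates(features)
print(candidates)

# Mass error between the measured and theoretical m/z values shown in the
# Fig. 4c documentation example.
print(round(ppm_difference(611.1604, 611.1607), 2))
```

The retention-time condition matters as much as the mass condition here: a fragment formed in the ion source must co-elute with its parent, whereas a genuinely distinct metabolite with the same mass spacing usually does not.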
Extensive recommendations have been made before36,39; however, we believe that this list will need to be revisited frequently owing to improvements in instrumentation and other aspects of the metabolomics workflow. If unsure of how much methodological detail to provide, imagine that your twin is sitting on a different continent in front of similar instrumentation and has to configure the equipment in a comparable manner. Increasingly, there is software support to extract such information from raw data files converted into, for example, the mzML file format44 (Fig. 4c).
Considering the number of possible pitfalls in the annotation and quantification of metabolites in metabolomics approaches, the current general level of reporting in the literature is not entirely satisfactory (Figs. 4 and 5). Given restrictive journal word limits and the fact that scientific reports tend to be highly concise, it is perhaps not surprising that authors do not refer to compounds as ‘the metabolite that we putatively annotate as X’ within the text of their articles. That said, there is nothing to preclude highly [...]
We recommend a streamlined, simpler reporting approach (Fig. 5). While this is similar to that previously suggested for plant analyses39, we have updated reporting recommendations to ensure broader applicability and relevance. To simplify the adoption of these recommendations, we supply Supplementary Tables 1 and 2 as template Microsoft Excel spreadsheets. Supplementary Table 1 contains a list of simple questions regarding the reporting of metabolite data, and Supplementary Table 2 provides recommendations for metabolite annotation for typical GC–MS or LC–MS experiments. Once one is used to filling out these tables, it is our experience that it takes between 30 and 60 minutes to complete the process. In the case of large datasets consisting of hundreds to thousands of samples, which nowadays represent what is reported in a sizeable proportion of metabolomics papers, the time for upload in metabolomics repositories is thus considerably longer than the filling out of our suggested Excel tables.

Summary
In summary, we have presented here recommendations to improve the quality and cross-laboratory comparability of metabolic datasets. These range from recommendations on sampling and metabolite extraction, quantification and peak identification to guidelines on transparency in measurement and documentation, for which a data- rather than chromatogram-centric approach is suggested. We anticipate that the adoption of these recommendations will offer several advantages: (1) perusal of reported metadata will provide readers with the ability to assess the quality of the data reported and, as such, allow greater confidence in the conclusions drawn; (2) researchers will have a simple route to gain information needed to aid them in annotating their own experimental output
[Fig. 5 residue. Recoverable labels: PubChem; Accurate m/z; Literature survey; Mol. formula: molecular formula of the metabolite; Theor. m/z: theoretical monoisotopic mass for the ion; Metabolite documentation; MS/MS fragmentation scan; axes: intensity versus m/z.]
Fig. 5 | Metabolite annotation and documentation. Structure elucidation workflow of metabolite identification. MS/MS fragmentation provides
information about compound structure. Metabolite annotation can be achieved using reference compounds, MS2 analysis, NMR or a photodiode
array (PDA) detector for UV–visible light spectrum detection. Database searching enables molecular formula calculation. Illustrated is an
example of our recommendations for reporting metabolomics data for a typical LC–MS experiment for the compound rutin (a flavonoid
glycoside). Comparison of the MS and MS/MS spectra for rutin reveals a peak at 611 m/z in the MS scan and two major fragments at 465 and 303 m/z in
the MS/MS scan, providing information about chemical loss of rhamnose (−146 m/z) and glucose (−162 m/z) moieties. For metabolite
documentation, the current general recommended levels of reporting are shown; see Supplementary Tables 1 and 2 for further details.
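The rutin example in the caption above can be checked with a short calculation of the theoretical monoisotopic m/z and of the fragment masses implied by the stated sugar losses. This is a sketch under explicit assumptions: the ion is taken to be [M+H]+, rutin is taken as C27H30O16, and the losses are taken as the anhydro-sugar residues C6H10O4 (rhamnose, about 146 Da) and C6H10O5 (glucose, about 162 Da). The helper `monoisotopic_mass` is a hypothetical name and only handles C, H and O.

```python
import re

# Monoisotopic atomic masses (Da) for the elements needed here.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON = 1.00727646  # mass of a proton, for an assumed [M+H]+ ion


def monoisotopic_mass(formula: str) -> float:
    """Monoisotopic mass of a simple formula such as 'C27H30O16'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


# Theoretical m/z of protonated rutin.
rutin_mh = monoisotopic_mass("C27H30O16") + PROTON

# Sequential neutral losses of the two sugar residues.
after_rhamnose = rutin_mh - monoisotopic_mass("C6H10O4")
after_glucose = after_rhamnose - monoisotopic_mass("C6H10O5")

print(f"{rutin_mh:.4f} {after_rhamnose:.4f} {after_glucose:.4f}")
```

Under these assumptions the precursor comes out at about 611.1607, which agrees with the theoretical m/z shown in the documentation example, and the two fragments fall near 465 and 303.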
and (3) data obtained by multiple laboratories may be compared more easily.
A recent example of comprehensive documentation of a metabolomics experiment is provided by the study of Price et al.91, who evaluated metabolite levels in understudied crop species, assembling an extensive database of the underlying data. Greater adoption of simple reporting tables such as the ones we describe here (Supplementary Tables 1 and 2) or the similar one proposed by Dorrestein and coworkers (for a comparison of these tables, see Supplementary Note 8) has the potential to elucidate general aspects of the metabolic response.
We would like to stress that the intention of the recommendations presented here is to encourage fuller and more faithful reporting of both metabolite annotations and their respective quantification. Our proposed reporting standards are not meant to be a direct replacement for the standards set by metabolome repositories. In fact, in most instances, these are entirely complementary to one another. We recommend that metabolomics practitioners follow repository standards alongside those we discuss here. There is a wealth of data reported in the literature that, for one reason or another, have not been deposited in repositories (such as MetaboLights, the Metabolomics Workbench and GNPS-MassIVE), and for such data it would be excellent if the metadata could be captured. This is important not only for possible reuse of the data but equally as a means of allowing the reader the possibility to evaluate their veracity. Expansion of such approaches, including input from both experimental and computational scientists, will facilitate the generation of pan-metabolome databases, which will undoubtedly open new horizons for metabolomics in all kingdoms of life.
We believe that more widespread adoption of these recommendations will enhance the quality of reporting of metabolite data, advance community efforts to improve the annotation of metabolomes and, finally, facilitate the exchange and comparability of metabolite data from different laboratories. These efforts will also facilitate comparison of metabolomics datasets obtained from different species, supporting the renaissance of comparative biochemistry.

Received: 2 April 2020; Accepted: 27 May 2021; Published online: 8 July 2021

References
1. Doerr, A. Global metabolomics. Nat. Methods 14, 32 (2017).
2. Fessenden, M. Metabolomics: small molecules, single cells. Nature 540, 153–155 (2016).
3. Oliver, S. G., Winson, M. K., Kell, D. B. & Baganz, F. Systematic functional analysis of the yeast genome. Trends Biotechnol. 16, 373–378 (1998).
4. Alseekh, S. & Fernie, A. R. Metabolomics 20 years on: what have we learned and what hurdles remain? Plant J. 94, 933–942 (2018).
5. Chevalier, C. et al. Gut microbiota orchestrates energy homeostasis during cold. Cell 163, 1360–1374 (2015).
This paper demonstrates that the microbiota is a key factor orchestrating overall energy homeostasis during increased demand in mammals.
6. Chu, C. et al. The microbiota regulate neuronal function and fear extinction learning. Nature 574, 543–548 (2019).
7. Djamei, A. et al. Metabolic priming by a secreted fungal effector. Nature 478, 395–398 (2011).
8. Dorr, J. R. et al. Synthetic lethal metabolic targeting of cellular senescence in cancer therapy. Nature 501, 421–425 (2013).
This paper illustrates the identification of metabolite biomarkers for use in cancer diagnostics and to serve as targets for new-concept anticancer therapies.