Mass Spectrometry-Based Metabolomics


Perspective

https://doi.org/10.1038/s41592-021-01197-1

Mass spectrometry-based metabolomics: a guide for annotation, quantification and best reporting practices

Saleh Alseekh 1,2 , Asaph Aharoni 3, Yariv Brotman4, Kévin Contrepois 5, John
D’Auria 6, Jan Ewald 7, Jennifer C. Ewald 8, Paul D. Fraser9, Patrick Giavalisco10,
Robert D. Hall 11,12, Matthias Heinemann 13, Hannes Link14, Jie Luo15, Steffen
Neumann16, Jens Nielsen 17,18, Leonardo Perez de Souza 1, Kazuki Saito 19,20, Uwe
Sauer 21, Frank C. Schroeder 22, Stefan Schuster7, Gary Siuzdak 23, Aleksandra
Skirycz 1,22, Lloyd W. Sumner 24, Michael P. Snyder 5, Huiru Tang 25, Takayuki Tohge26,
Yulan Wang 27, Weiwei Wen 28, Si Wu5, Guowang Xu 29, Nicola Zamboni 21 and Alisdair
R. Fernie 1,2 ✉

Mass spectrometry-based metabolomics approaches can enable detection and quantification of many thousands of metabolite features simultaneously. However, compound identification and reliable quantification are greatly complicated owing to the chemical complexity and dynamic range of the metabolome. Simultaneous quantification of many metabolites within complex mixtures can additionally be complicated by ion suppression, fragmentation and the presence of isomers. Here we present guidelines covering sample preparation, replication and randomization, quantification, recovery and recombination, ion suppression and peak misidentification, as a means to enable high-quality reporting of liquid chromatography– and gas chromatography–mass spectrometry-based metabolomics-derived data.
Metabolomics, the large-scale study of the metabolic complement of the cell1–3, is a mature science that has been practiced for over 20 years4. Indeed, it is now a commonly used experimental systems biology tool with demonstrated utility in both fundamental and applied aspects of plant, microbial and mammalian research5–15. Among the many thousands of studies published in this area over the last 20 years, notable highlights5–8,10,11,16 are briefly described in Supplementary Note 1.

Despite the insight afforded by such studies, the nature of metabolites, particularly their diversity (in both chemical structure and dynamic range of abundance9,12), remains a major challenge with regard to the ability to provide adequate coverage of the metabolome that can complement that achieved for the genome, transcriptome and proteome. Despite these comparative limitations, enormous advances have been made with regard to the number of analytes about which accurate quantitative information can be acquired, and a vast number of studies have yielded important biological information and biologically active metabolites across the kingdoms of life14. We have previously estimated that upwards of 1 million different metabolites occur across the tree of life, with between 1,000 and 40,000 estimated to occur in a single species4.

1Max Planck Institute of Molecular Plant Physiology, Potsdam-Golm, Germany. 2Institute of Plants Systems Biology and Biotechnology, Plovdiv,
Bulgaria. 3Department of Plant and Environmental Sciences, Weizmann Institute of Science, Rehovot, Israel. 4Department of Life Sciences, Ben
Gurion University of the Negev, Beersheva, Israel. 5Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA.
6Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany. 7Department of Bioinformatics, University of Jena,
Jena, Germany. 8Interfaculty Institute of Cell Biology, Eberhard Karls University of Tuebingen, Tuebingen, Germany. 9Biological Sciences, Royal
Holloway University of London, Egham, UK. 10Max Planck Institute for Biology of Ageing, Cologne, Germany. 11BU Bioscience, Wageningen
Research, Wageningen, the Netherlands. 12Laboratory of Plant Physiology, Wageningen University, Wageningen, the Netherlands. 13Molecular
Systems Biology, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Groningen, the Netherlands. 14Max
Planck Institute for Terrestrial Microbiology, Marburg, Germany. 15College of Tropical Crops, Hainan University, Haikou, China. 16Bioinformatics
and Scientific Data, Leibniz Institute for Plant Biochemistry, Halle, Germany. 17BioInnovation Institute, Copenhagen, Denmark. 18Department of
Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden. 19Plant Molecular Science Center, Chiba
University, Chiba, Japan. 20RIKEN Center for Sustainable Resource Science, Yokohama, Japan. 21Institute for Molecular Systems Biology, ETH
Zurich, Zurich, Switzerland. 22Boyce Thompson Institute and Department of Chemistry and Chemical Biology, Cornell University, Ithaca, NY,
USA. 23Center for Metabolomics and Mass Spectrometry, Scripps Research Institute, La Jolla, CA, USA. 24Department of Biochemistry and MU
Metabolomics Center, University of Missouri, Columbia, MO, USA. 25State Key Laboratory of Genetic Engineering, Zhongshan Hospital and
School of Life Sciences, Human Phenome Institute, Metabonomics and Systems Biology Laboratory at Shanghai International Centre for
Molecular Phenomics, Fudan University, Shanghai, China. 26Department of Biological Science, Nara Institute of Science and Technology, Ikoma,
Japan. 27Singapore Phenome Center, Lee Kong Chian School of Medicine, School of Biological Sciences, Nanyang Technological University,
Nanyang, Singapore. 28Key Laboratory of Horticultural Plant Biology (MOE), College of Horticulture and Forestry Sciences, Huazhong
Agricultural University, Wuhan, China. 29CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical
Physics, Chinese Academy of Sciences, Dalian, China. e-mail: [email protected]; [email protected]

Nature Methods | VOL 18 | July 2021 | 747–756 | www.nature.com/naturemethods




However, thus far, even the most comprehensive methods cannot provide firm upper limits for metabolite number. Current capabilities for detection and quantification of metabolites fall a long way short of being comprehensive. Currently, combinations of the most comprehensive methods are able to quantify 700 of the 3,700 metabolites predicted to be present in Escherichia coli17,18, 500 of the 2,680 metabolites predicted to be present in yeast19,20, 8,000 of the 114,100 metabolites predicted to be present in humans21 and only 14,000 of the over 400,000 metabolites predicted to be present in the plant kingdom4,22. Chemical diversity, rapid turnover times and broad dynamic range in cellular abundance currently prohibit the possibility of using single-extraction and single-analysis procedures to measure all metabolites9. Consequently, many different extraction techniques and combinations of analytical methods have been developed in an attempt to achieve adequate metabolite coverage. This renders the establishment of good working practices13,15,23–26 more difficult than with RNA-seq27, for example. Furthermore, rigorous standards are needed for normalization of metabolomics data28,29. This is exacerbated by the breadth of aims associated with the measurement of metabolites, which encompass targeted metabolite analysis, metabolite profiling, flux profiling, metabolomics-scale analysis and metabolite fingerprinting techniques30,31.

Given the myriad of aims and methodologies, we argue that it is particularly important to define clear guidelines for acquisition and reporting of metabolite data because there are many potential sources of misinterpretation. This is not the first time such guidelines have been suggested, with several insightful papers published on this topic12,32 and long-established metabolome databases including MetaboLights33–36 and the Metabolome Workbench (https://www.metabolomicsworkbench.org/) also driving this field. A more detailed description of these repositories as well as of more recent developments is provided in Supplementary Note 2. Although the detailed standards set out by the Metabolomics Standards Initiative32 and these repositories are laudable and clearly represent the gold standard of metabolomics reporting, it is notable that only a small fraction of published metabolomics studies follow these standards in their entirety and submit their data to the metabolome databases. There are probably several reasons underlying this. First, few journals currently mandate that data be stored in one of the metabolomics repositories. Second, unlike the situation 20 years ago, or even when the work of the Metabolomics Standards Initiative was first published some 13 years ago32,36–38, metabolomics experiments often represent only one component of studies integrating a wide range of techniques. Moreover, many groups outsource their metabolomics workflow to service providers and do not always have the experience to provide the raw data or even have access to them. In parallel, requiring reviewers to comment on all aspects of multiomics studies in the absence of clear guidelines is a big ask, especially considering that many biologists lack expert competence in the area of metabolomics. Finally, and perhaps most tellingly, there is difficulty in reporting chromatogram-level information, which often requires several attempts to fulfil the criteria of the major metabolomics repositories. However, while the reporting of this information is highly useful for several purposes, it is not essential for all. As we illustrate here, evaluation of the quality of the metabolomics data presented in a paper can effectively be performed on the basis of a relatively small amount of metadata, namely, by analyzing the quality of the metabolite annotation as well as assessing the quantitative recovery of analyte peaks. Our aim here is to present a simplified reporting workflow, with the hope of capturing more of the missing information.

While nuclear magnetic resonance (NMR) and capillary electrophoresis–mass spectrometry (CE–MS) have specific advocates and have clear advantages in structure elucidation and sensitivity, respectively, we will focus here on chromatography (either gas chromatography (GC) or liquid chromatography (LC)) hyphenated to MS; we therefore focus our guidelines on such techniques, given that the majority of metabolomics studies rely on these approaches. In contrast to the suggestions of the Metabolomics Standards Initiative32,36–38 and the major repositories mentioned above, we provide reporting guidelines at the level of the processed data (supported by the provision of representative chromatograms allowing the assessment of metabolite identification), rather than the raw chromatograms. A similar recommendation was made to the plant research community in 2011 (ref. 39). Here we have aimed to revise and update these recommendations to (1) be more globally applicable and (2) reinforce our contention that quantification control experiments should be regarded as mandatory and can aid in determining how problematic the effects of ion suppression are in an experiment. We highlight potential sources of error and provide recommendations for ensuring the robustness of the metabolite data obtained and reported. We also present guidelines for sampling, extraction and storage, metabolite identification and reporting. We stress the need for recombination and recovery experiments aimed at checking both qualitative metabolite identifications and the quantitative recovery of these metabolites. In addition, we suggest a stricter nomenclature for metabolite annotation that would improve reporting by removing much of the ambiguity concerning the quality of metabolite annotation that is currently apparent in many metabolomics studies. The scope of our guidelines does not encompass detailed downstream computational analysis of the acquired datasets, although we note several important recent
advances in this area40–47. These tools and their application are discussed in Supplementary Note 3.

We believe that such efforts are necessary to enable between-laboratory comparisons of datasets, which, as has been demonstrated for transcriptomics, provide huge statistical power and deeper biological insights and, furthermore, a basis for better integration with other datasets48,49.

Sampling, quenching, metabolite extraction and storage
The very first (and particularly vital) step in a metabolomics workflow (Figs. 1 and 2) is the rapid stopping, or quenching, of metabolism and extraction of the metabolites in a manner that produces a stable extract that is quantitatively reflective of the endogenous metabolite levels present in the original living cell. This is especially important in highly metabolically active systems such as cells and tissues, but less so in biofluids such as serum, plasma or urine samples12. Indeed, there is no one method to fit all cases, with specific sampling, quenching and extraction needed for each tissue type. That said, certain evaluations of quality are universally applicable, and our aim here is to provide clear instructions on how to apply them.

Quenching needs to satisfy two criteria: it should (1) completely terminate all enzyme and chemical activities and (2) avoid the perturbation of existing metabolite levels during harvesting. Details regarding specific considerations that need to be taken into account for quenching the metabolism of various species are provided in Supplementary Note 4. The efficiency of quenching can be followed either by controlled comparisons of various extraction methods38 or, alternatively, by determining the abundance of (stable isotope-labeled) standards spiked into the quenching solvent (see “Recovery and recombination experiments”). For tissues, where possible, quick excision followed by snap-freezing in liquid nitrogen is recommended, with subsequent storage of deep-frozen tissue at a constant −80 °C until the first application of extraction solvent. However, for bulky tissue, submersion in liquid nitrogen is not sufficient because the center of the tissue is cooled too slowly. In such cases, freeze-clamping, where tissue is almost instantaneously squashed flat between two prefrozen metal blocks (known as a Wollenberger clamp), is preferred39,50.

Irrespective of the quenching method, the downstream steps of these processes also warrant caution. For example, improper freeze


[Fig. 1 schematic. Panel labels: (1) quenching, grinding and extraction; (2) chromatography (GC, LC, EC), retention time (RT); (3) ionization; (4) separation of ions in the mass spectrometer (TOF, Orbitrap, Q, IT), where ions separate (fly or oscillate) according to their m/z; (5) detector, with intensity plotted against m/z.]

Sample preparation and extraction
• Standards spiked into the quenching solvent
• Grinding, isolation of cells, fast-filtration or aspiration
• Avoid environmental perturbation during harvesting
• Control environment: harvesting at the same time and under the same conditions
• Snap-freezing in liquid nitrogen
• Enzyme quenching: completely terminate all enzyme activities

Sample replication and randomization
• At least four biological replicates, preferably more
• Technical and analytic replicates are worthy of consideration
• Randomization of samples throughout workflows is essential
• In large-scale studies, quality-control samples and batch correction are essential

Chromatography–mass spectrometry
• Separation methods, composition of the mobile phase, column properties and injection volume
• Metabolites are within their range of detection
• Avoid ion suppression: dilution of extracts, sonication, filtration or centrifugation, recovery test
• Choosing ionization source and type of detection mode, MS method, scan number and speed, MS/MS and energy for fragmentation

Fig. 1 | Metabolomics workflow. Metabolomics involves several basic steps: (1) sample preparation and extraction; (2) metabolite separation
on a column (chromatography) such as by GC, LC or EC; (3) ionization of metabolites using an ion source; (4) separation by a mass analyzer as
ions fly or oscillate on the basis of their mass-to-charge (m/z) ratio; and (5) detection. Metabolites can be identified on the basis of a combination
of retention time (RT) and MS signature. TOF, time of flight; Q, quadrupole; IT, ion trap.
drying and lack of storage in sealed containers can generate artifactual geometric isomers of pigments39. Freeze drying is also unsuitable when volatile components are of interest. While the appropriate means of storage is strictly dependent on the stability of the class of targeted metabolites under study, it is not recommended to store samples between 0 and −40 °C. At these temperatures, substances can become concentrated in a residual aqueous phase39. It is therefore recommended, where necessary, to store completely dry residues for as short a time as possible before their analysis. In addition, great care must be taken to ensure that metabolism remains quenched during thawing. This is particularly pertinent for extracts containing secondary metabolites. In such extracts, degradative enzymes often retain their activities, which, if not kept in check, may result in the consumption or conversion of certain metabolites with a concomitant appearance of new compounds or breakdown products51.

Similar issues are also present with respect to both the experimental growth media and the initial extraction solvents used. Growth media often need to be removed via multiple wash steps to reduce the effects of ion suppression during the subsequent MS analysis, and the solvent used for initial extraction may need to be exchanged owing to incompatibility with the instrumentation used for the metabolite analysis. Two pitfalls are pertinent here: (1) the washing process results in the loss of metabolites and (2) solvent removal leads to concentration of the metabolites and thereby an acceleration of chemical reactions between them. Thus, considerable caution is advised in method optimization to ensure that extraction and handling methods allow adequate quantitative representation of cellular metabolites. In some instances, such as the analysis of volatile or semivolatile compounds, sample extraction and handling should only be performed on fresh material. We strongly recommend the adoption of recovery and recombination experiments (see below) when either a substantially novel metabolomics technique is introduced or a novel cell type, tissue or organism is studied.

Sample replication and randomization
An important issue is the nature and number of biological, technical and analytical replicates. Before using any new extraction protocol or analytical procedure and when working with new biological materials, it is essential to perform extensive pilot experiments to fully assess the technical variation that is necessary to design a statistically sound experiment. To avoid misunderstanding, we refer readers to the definitions of each type of replicate provided in ref. 39. While analytical replicates, that is, replicates corresponding to repeated injection of the exact same extract, are useful in assessing machine performance, technical replicates, which encompass the entire experimental procedure, allow a far more comprehensive assessment of any experimental variance in data generation39. Indeed, such analyses are essential for the establishment of a new extraction or processing procedure or a new analytical technique as well as for the optimization of a new instrument.

Biological replication is even more important and should involve at least four but preferably more replicates; the required number of replicates depends on the desired statistical power, effect size and actual variance52. Care must be taken to acquire such replicates in a highly uniform manner. For plants, this can also mean collecting samples at the same time of day and under the same environmental conditions. In many instances, a full and independent repeat of a biological experiment is advisable53. There are different stages where technical replicates can be made: at sampling, quenching, extraction and analysis, replicates can be made independently of the entire process. In our experience, the extraction step is the most critical of these. Whether technical replication is needed in support of biological replication is highly dependent on the relative magnitudes of variation; in cases in which the biological variation greatly exceeds the technical variation, it is sensible to sacrifice the latter to increase the former. With new systems, pilot experiments are highly recommended to evaluate biological and technical variation and hence determine how many samples and how many replicates are needed to achieve statistical robustness52.

Careful spatiotemporal randomization of biological samples throughout a metabolomics experiment is equally essential. If a set of samples is analyzed in a nonrandom order, treatment and control samples or time points may end up being measured under very different conditions. As a result, interpretation can be confounded by



sample age or shifting instrument performance, potentially occluding biological variation between sample groups or, worse, creating artifactual differences. This is particularly important in large-scale metabolic profiling studies to characterize the natural variation of metabolism, akin to genome-wide association studies10,54–56. In such experiments, weeks of instrument time may be required. Clear best-practice guidelines for such large-scale studies have been presented elsewhere57–60, so we will not discuss them further here. Irrespective of the size of the experiment, the use of quality-control samples and batch correction is also essential61. Such experimental controls help monitor instrument performance and stability and, thereby, data quality. These controls ensure that missing data or peaks with low signal-to-noise ratios do not occur. Either mixtures of authenticated metabolite samples at defined concentrations or dry-stored aliquots of a broadly shared and appropriately standardized biological extract (for example, multi-kilogram extracts of Arabidopsis, E. coli, yeast or human cell lines) can serve as broadly useful reference samples. Use of these references enhances accurate quantification and makes it possible to more effectively use the data in metabolite databases62–66. A pooled quality-control sample allows for evaluation (and correction) of run order and batch effects within a study, but not necessarily across experiments, as is possible with reference material.

Quantification
The aforementioned details of extraction, storage and replication are equally applicable when ensuring the accuracy of any method of metabolite quantification, including those that target single metabolites (Fig. 2). The remainder of this article will address issues that are, at least partially, restricted to untargeted metabolomics approaches. There are several essential aspects requiring consideration here.

First, it is essential to ensure that the levels of all metabolites of potential interest can be detected and, ideally, can be measured within a linear range of detection. This is most readily achieved through analyses of independent dilutions of each extract. Additionally, for experiments that begin with intact tissues, it is important to ensure complete tissue disruption. In the case of cellular studies, one must further take into consideration whether to limit the study to the endogenous cellular metabolites or also assess the exometabolome. For these controls, and many others, we provide a list of reporting recommendations in the section below on transparency in measurement, metabolite annotation and documentation.
Fig. 2 | Workflow for typical MS-based metabolomics. Overview chart listing the major steps and guidelines involved in typical MS-based metabolomics studies.

Biological samples
• Experimental design
• Growth conditions, treatments, tissue type
• Replication (n = 4), quality control
• Care before and during sampling
• Freeze and then extract
• Storage conditions

Quenching and extraction
• Tissue samples: quick freeze, liquid N2 grinding, liquid N2 clamping, pulverization, lyophilization and cell lysis
• Biofluid/cell culture extraction: cold organic solvent, aspiration or filtration
• Concentration/drying under N2 or cold speed-vac for short time; SPE column for clean-up or enrichment of extract
• Recovery test, spike internal standard

Chromatography separation
Liquid chromatography:
• Wide range of compounds, reverse-phase column (silica C18), mobile phase, injection volume
Gas chromatography:
• Low-molecular-weight compounds, volatile, derivatization (MSTFA, BSTFA), packed column, carrier gas, injection mode
• Sample separation based on chemical properties
• Reduce ion suppression
• Improve detection of low-abundance compounds
• Separation of isomers
• Reduce the interfering compounds

Mass spectrometry
• Ionization source, ionization type (ESI, EI, APCI, etc.), polarity, voltage, temperature, vacuum
• Mass analyzer (TOF, Orbitrap, ion trap, FT–ICR, etc.)
• Resolution, sensitivity, mass accuracy, scan rates, acquisition mode (full scan, MS/MS, SIM, MRM, ddMS, etc.)

Data processing, metabolite identification and data analysis
• Convert raw MS data (m/z) to intensities table
• Requires filtering, detection and normalization
• Identification and documentation
• Data analysis and representation

Metabolomics data are most frequently provided as relative quantities (that is, relative quantification is performed) with respect to a reference sample. This is in contrast to NMR-based studies, which usually provide absolute concentrations (that is, absolute quantification), with peak intensities directly proportional to concentrations and directly comparable across different peaks and samples. The relative intensities of LC–MS and GC–MS peaks representing different compounds do not directly correlate to absolute concentrations. This is due to the differential ionization efficiencies of the different metabolites within a complex mixture. To address this issue, standard curves can be used to determine how signal intensity responds as a function of analyte concentration and, moreover, the range of linearity of this relationship12. The ability to generate such curves is of course dependent on the availability of validated pure standards. While relative values are highly useful in many contexts and indeed are the only way of expressing the levels and changes in level of non-annotated analytes, absolute values have much greater utility for determining enzyme binding site occupancies, the thermodynamics of metabolic reactions12,67 and the molecular dynamics underlying the flow of atoms through a metabolic network68–70. A further advantage of the methods used for absolute quantification is that they can be readily adapted into a means of quality control for both quantification and the correctness of peak annotation, for example, through thermodynamics71. However, obtaining standard curves for thousands of metabolites in a complex mixture is currently not always practical. While many of the metabolite signals in such mixtures are nonlinear owing to a variety of reasons, including ion interaction, ion suppression, etc., which substantially complicates quantitation (as described in the next section), there are experimental tools allowing the extent of this problem to be quantified and reported.
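To make the role of a standard curve concrete, the sketch below fits a straight line to hypothetical standards and converts a sample intensity to a concentration only when the result falls inside the calibrated range. This is a simplified illustration under stated assumptions, not the authors' procedure: the function names, the units and all numbers are invented, and a plain unweighted linear fit is assumed even though real calibrations may need weighting or a nonlinear model.

```python
def fit_standard_curve(concentrations, intensities):
    """Unweighted least-squares line: intensity = slope * concentration + intercept."""
    n = len(concentrations)
    mean_c = sum(concentrations) / n
    mean_i = sum(intensities) / n
    scc = sum((c - mean_c) ** 2 for c in concentrations)
    sci = sum((c - mean_c) * (i - mean_i)
              for c, i in zip(concentrations, intensities))
    slope = sci / scc
    intercept = mean_i - slope * mean_c
    return slope, intercept

def quantify(intensity, slope, intercept, calibrated_range):
    """Return the concentration, or None when it lies outside the calibrated range."""
    concentration = (intensity - intercept) / slope
    low, high = calibrated_range
    if low <= concentration <= high:
        return concentration
    return None  # extrapolation beyond the standards is not reported

# Hypothetical pure standards (concentration in micromolar, arbitrary intensity units).
standard_conc = [1.0, 2.0, 5.0, 10.0]
standard_signal = [1100.0, 2100.0, 5100.0, 10100.0]

slope, intercept = fit_standard_curve(standard_conc, standard_signal)
calibrated = (min(standard_conc), max(standard_conc))

print(quantify(3100.0, slope, intercept, calibrated))   # inside the calibrated range
print(quantify(50100.0, slope, intercept, calibrated))  # outside: returns None
```

Refusing to extrapolate is a deliberate design choice here: it keeps reported absolute values within the range in which linearity was actually demonstrated.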

Quantification is particularly problematic in the case of external
a
160
calibration, where quantification of standards is carried out in a
far simpler mixture than that of the biological extract. Therefore,
140
either internal quantification using isotopically labeled standards or
120
quantification of a mixture of internal and external standards, as )

100
%

described below, is preferable. (

80
r
A further aspect of quantification is the basis on which quantities
e

60
are expressed for tissue samples. Data are often provided per gram ofc

40
fresh or dry weight, while for body fluids they are often provided per
volume. The case of cellular metabolomics is more complicated given
20
that cell size is often variable; values are therefore typically provided
0
l
l
l
)
e
d
e
e
e
e
e
d
d
d
d
d
e
e
d
n
e
e
e
e
e
e
e
e
e
a
e
d
e
e

per milligram of protein or based on cell counts. The basis on which i


i
i
i
i
i
i
o
o
o
i
S
s
t
s
s
s
r
n
n
n
n
n
n
n
n
n
e
n
n
n
a
n
n
n
n
c
c
c
c
c
c
c
i
I
c
i
i
n
i
i
i
i
i
i
i
i
i
i
i
i
i
i
r
l
i
l
(
o
o
o
o
e
h
r
s
a
s
a
c
c
s
c
a
a
a
a
a
a
c

n
t
n
n
h
n
n
r
l

o
c
o

a
c
U
m
n
t
p
e
o
y
s
y
o
c
u
t
u
i
r
c
o
a
a
o
a

both absolute and relative metabolite levels is provided is of funda l


c
c
c
c
c
c
c
c
o
y
l
i
l
u
l
V
r
a
o
i
i
i
i
i
i
i
i
L
l
c
n
l
e
S
t
e
e
u
n
m
u
t
e
l
t
t
i
P
i
l
r
y
r
h
a
a
r
r
v
G
r
A
n
n

a
l
t
r
t
t
L

a
u
i
a
p
m
m
G
,
l
G
S
i
T
b
o
l
,
F
u
h
i
o
y
a
u
e
O
c
h
y
o
a
a
a
s
r
a
C
M
r
n
t
t
e
T
I
G

mental importance—for example, values given on the basis of fresh p


c
t
y
R
P
y
r
R
M
T
G
e
u
u
s
u
e
l
l
h
P
M
h
A
g
S
B
T
G
P
o

weight can be dramatically influenced by the osmotic potential of the r

b
cell—yet is often not given enough consideration by the community.
140
120
Recovery and recombination experiments
100
Recovery experiments, in which authenticated standard com )

80
(

pounds are added to the initial extraction solvent to assess losses


y

60
e

during extraction, storage and handling, were vigorously champi v

oned in the 1970s to 1990s72 and can provide persuasive evidence


40
e

that the data reported are a valid reflection of cellular metabolite


compositions39. Recent examples exist of validated methods in microbial, plant and mammalian systems73–75. However, the metabolomics community has been relatively slow in adopting these control procedures. This is partially explained by the lack of commercially available standards and/or simple synthetic approaches to make standards. Indeed, for unknown analytes, this approach is by its nature impossible.

Fortunately, there is an alternative approach, extract recombination, that circumvents this practical limitation. In this approach, the extract of a novel tissue is characterized by combination with that of a well-characterized reference material such as one from E. coli, Arabidopsis or human biofluids. Such experiments not only provide information concerning the appropriateness of the extraction buffer but additionally allow an assessment of so-called matrix effects caused by ion suppression76–78. These experiments additionally allow a quantitative assessment of the reliability of known peaks79. A schematic representation of recovery and metabolic recombination experiments is presented in Fig. 3.

For known metabolites, we suggest that recovery or metabolic recombination experiments be carried out for each new tissue type or species. It is clear that in any metabolomics-scale study certain metabolites will have poor recovery. While this does not preclude the reporting of their values, it is important that this is documented to allow readers discretion in their interpretation. Recovery rates of 70–130% are acceptable, with anything deviating beyond this range representing a metabolite whose quantification should be subject to further testing. For example, even a 50% recovery rate, if reproducible and linear, could be deemed acceptable (Fig. 3). The importance of such control experiments is perhaps best illustrated by cases in which they were not performed. Anecdotally, there are several examples in the literature where the metabolite data reported cannot be reflective of cellular content, for example, because the zero levels reported for metabolites, if representative of cellular levels, would indicate that the cells tested were not viable.

Ion suppression
Despite the selectivity and sensitivity of MS techniques, there are considerable challenges with regard to reproducibility and accuracy when analyzing complex samples. These problems are not insurmountable but require that additional care be taken when interpreting results. Ion suppression is a general problem in LC–MS analyses due to matrix effects influencing the ionization of co-eluting analytes, affecting the precision and accuracy of quantification or preventing less abundant metabolites from being detected at all76,78,80. As mentioned above, the best method of assessing the potential impact of ion suppression is to mix two independent extracts in a recombination experiment (Fig. 3) and assess whether the metabolites detected can be quantitatively recovered51. Essentially, within this process, co-eluting analytes compete for the ionization energy, resulting in incomplete ionization. Therefore, a decreased ion count for an analyte may be due either to a decreased concentration of the analyte itself or to increased concentrations of co-eluting analytes. It is critically important to consider these effects during method validation to ensure the quality of the analysis.

While there is no universal solution to the ion suppression problem, assessing the effects of ion suppression affords greater confidence in the accuracy of the results. There are several strategies that can help minimize ion suppression77. Among these, improvements in sample preparation and chromatographic selectivity are currently the most effective. In some situations, using suitable clean-up procedures depending on sample type and analyte properties may allow removal of co-eluting components. This might involve simple dilution of extracts or the growth media from which the samples are derived51 or optimization of various steps of sample work-up, including sonication, solvent partitioning, filtration, centrifugation and protein precipitation81. In addition, solid-phase extraction (SPE) using appropriate absorbents has been demonstrated to be an effective method to reduce matrix

Fig. 3 | Recovery tests. a,b, Recovery tests were performed using GC–MS (a) and LC–MS (b) peaks obtained for a mixture of extracts from Arabidopsis and lettuce leaves. The mixture was made by combining extracts from Arabidopsis (A) and lettuce (B) leaves (0.2 mg fresh weight per μl) at a 1:1 ratio. The percentage recovery was estimated using the theoretical concentration in the extract mixture: ((level in leaves (A) × A%) + (level in leaves (B) × B%))/100. Dashed lines indicate the acceptable range of 70–130%. Compounds in gray are statistically outside this range. Error bars represent ± s.e.m.
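The recovery estimate given in the Fig. 3 caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names and the example levels are hypothetical, and expressing recovery as measured level divided by theoretical level, times 100, is an assumption based on common practice. Only the theoretical-concentration formula and the 70–130% window come from the text.

```python
def theoretical_level(level_a, level_b, pct_a=50.0, pct_b=50.0):
    """Expected level in the mixed extract, following the Fig. 3 caption:
    ((level in A x A%) + (level in B x B%)) / 100."""
    return (level_a * pct_a + level_b * pct_b) / 100.0


def percent_recovery(measured_mix, level_a, level_b, pct_a=50.0, pct_b=50.0):
    """Measured level in the mixture as a percentage of the expected level.
    (Assumed definition; the source gives only the expected-level formula.)"""
    expected = theoretical_level(level_a, level_b, pct_a, pct_b)
    if expected <= 0:
        raise ValueError("expected level must be positive")
    return 100.0 * measured_mix / expected


def within_acceptable_range(recovery, low=70.0, high=130.0):
    """True if the recovery falls inside the 70-130% window quoted in the text."""
    return low <= recovery <= high


# Hypothetical peak intensities for one metabolite (arbitrary units), 1:1 mixture.
recovery = percent_recovery(measured_mix=92.0, level_a=120.0, level_b=80.0)
acceptable = within_acceptable_range(recovery)
```

A metabolite falling outside the window is not necessarily unusable; as noted above, a lower but reproducible and linear recovery may still be acceptable if it is documented.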

Nature Methods | VOL 18 | July 2021 | 747–756 | www.nature.com/naturemethods




effects. Furthermore, it is possible to adjust chromatography conditions so that the peaks of interest do not elute in regions of suppression; for example, modifying the composition of the mobile phase or gradient conditions can aid chromatographic separation and thereby improve performance.

Careful selection of the ion source and column polarity is an alternative strategy to reduce ion suppression. For example, atmospheric pressure chemical ionization (APCI) is less prone to matrix effects than electrospray ionization (ESI). In addition, using APCI can also reduce interference effects12. It has been demonstrated that ion suppression is often less severe for negatively charged compounds than for positively charged ones82. Finally, although the above-mentioned strategies may not be sufficient to completely remove the effects of ion suppression in complex samples, the extent of the problem can at least be quantified by carrying out control experiments as described in the preceding section.

Peak misidentification
The orthogonal use of chromatography (either gas or liquid based) with MS and in some cases also tandem MS (MS/MS) fragmentation patterns provides great specificity83,84. Current high-end instruments detect on the order of 10,000 or 100,000 features; however, these include a large number of adduct and isotope peaks. Bioinformatics tools for analyte identification take this into account and even use commonly observed adducts as a means of identifying analytes (discussed in detail below). Nonetheless, there are three common problems that contribute to misidentification.

First, isomers (compounds with an identical molecular formula but distinct structures) are common in nature. Important examples from primary metabolism include hexose phosphates and inositol phosphates, citrate and isocitrate, glucose and fructose, and alanine and sarcosine. High-resolution MS alone may not suffice to discriminate between these and other sets of isomers, especially when fragmentation patterns are similar, and some types of isomers may not separate well on conventional reverse-phase high-performance LC (HPLC). To improve separation, reverse-phase ion pairing chromatography, hydrophilic interaction chromatography (HILIC) and other chromatographic methods can be used; another option is chemical derivatization before chromatography12. In cases where isomers cannot be separated, this needs to be clearly stated because such compounds may have greatly different biological functions.

Second, the presence of overlapping compounds may prevent detection of some metabolites. While the increasingly high resolution of mass spectrometers has mitigated this issue to some extent, the resolving power of many current instruments is insufficient to separate ions differing in mass by less than 5 parts per million (ppm)12. This problem, however, is only acute when chromatography is also unable to separate analytes that cannot be separated on the basis of mass.

The third major hurdle (which is more relevant for LC–MS than GC–MS) is the formation of in-source degradation products. These are by-product ions of ESI due to simple loss of water, carbon dioxide or hydrogen phosphate, more complicated molecular rearrangements and the attachment of other ions. In-source degradation reduces the intensity of the metabolite parent ion, and the resulting fragment ions may confound analysis of other co-eluting compounds, for example, if they have the same molecular formula as the molecular ion of another metabolite12. We provide examples of these from our own work in Supplementary Fig. 1. These examples demonstrate the need for careful manual curation of all peak assignments, which, however, is often not feasible when annotating several hundred or thousand metabolites (Fig. 4). In ambiguous cases, the exact identification of a peak can often be best demonstrated via comparative biochemical approaches, for example, by analyzing the metabolome in known mutants that can be anticipated to lack certain metabolites24,85 or incubation of a peak with known enzymes or chemical treatments73. These methods can also be combined with other approaches such as using authenticated standards for isomer annotation86 and dual-labeling approaches87.

As an aside, a critical aspect of nontargeted metabolomics is peak filtering. Metabolomics datasets from such studies contain a large proportion of uninformative features that can impede subsequent statistical analysis, and there is thus a need for versatile and data-adaptive methods for filtering data before investigating the underlying biological phenomena88. A list of suggestions for the design and implementation of data filtering strategies is provided in Supplementary Note 5.

[Fig. 4 graphic not reproduced. Panel contents:
a, Sample preparation: see sampling, quenching, metabolite extraction and storage (Fig. 2).
b, Data acquisition, processing and annotation (feature detection, alignment, normalization, identification):
• Extraction of information from raw data, including filtering, feature detection and alignment
• Many software packages and algorithms are available for processing and analysis of metabolite data (e.g., MetAlign, XCMS, AMDIS, GNPS, Expressionist Refiner MS, TagFinder, MZmine, TargetSearch, MSClust, etc.)
c, Documentation (reporting standard): metabolite name; metabolite class; chemical formula (C27H30O16); RT (6.85 min); measured m/z (611.1604); theoretical m/z (611.1607); fragmentation MS/MS; identification level; references; international ID (e.g., HMDB, PubChem, KEGG, etc.); data analysis, visualization and identification; mzTab; public repositories.]

Fig. 4 | Workflow for metabolic data processing and downstream result documentation. a,b, Structure elucidation workflow for data acquisition (a) and processing and annotation (b). c, Simple design for metabolic data documentation and how data can be linked to the mzTab49 tool to facilitate data representation, sharing and deposition to public repositories.

Reporting transparency
To fully exploit metabolomics data, they need to be comparable between different laboratories. Indeed, several comparative studies have been published, as we detail in Supplementary Note 6. In addition to comparability at a quantitative level, clear metabolite ontologies are also needed to ensure that metabolites are annotated in a common fashion (Supplementary Note 7).

To ensure that methods can be readily adopted by others, a wealth of detailed information is required. However, detailed descriptions of sample preparation and analytical procedures are often (at least partially) absent in publications, especially in cases where metabolomics is not the primary focus of the published work. We recommend that the following items be considered as mandatory components of any methods section for metabolomics experiments.

• Chromatography: composition of the mobile phase, column properties, temperature, flow rate and injection volume
• Mass spectrometry: ionization source and type of detection mode, MS method, scan number and speed, and MS/MS parameters, including resolution settings and the energy used for fragmentation (Box 1)

Extensive recommendations have been made before36,39; however, we believe that this list will need to be revisited frequently owing to improvements in instrumentation and other aspects of the metabolomics workflow. If unsure of how much methodological detail to provide, imagine that your twin is sitting on a different continent in front of similar instrumentation and has to configure the equipment in a comparable manner. Increasingly, there is software support to extract such information from raw data files converted into, for example, the mzML file format44 (Fig. 4c).

Considering the number of possible pitfalls in the annotation and quantification of metabolites in metabolomics approaches, the current general level of reporting in the literature is not entirely satisfactory (Figs. 4 and 5). Given restrictive journal word limits and the fact that scientific reports tend to be highly concise, it is perhaps not surprising that authors do not refer to compounds as ‘the metabolite that we putatively annotate as X’ within the text of their articles. That said, there is nothing to preclude highly detailed reporting of the exact nature of the annotation within the supplementary data associated with a paper, either copublished or made available through separate web resources. Databases such as MetaboLights89 and the Metabolomics Workbench90 can be used for this purpose and indeed have been adopted as a requirement for many journals.

We recommend a streamlined, simpler reporting approach (Fig. 5). While this is similar to that previously suggested for plant analyses39, we have updated reporting recommendations to ensure broader applicability and relevance. To simplify the adoption of these recommendations, we supply Supplementary Tables 1 and 2 as template Microsoft Excel spreadsheets. Supplementary Table 1 contains a list of simple questions regarding the reporting of metabolite data, and Supplementary Table 2 provides recommendations for metabolite annotation for typical GC–MS or LC–MS experiments. Once one is used to filling out these tables, it is our experience that it takes between 30 and 60 minutes to complete the process. In the case of large datasets consisting of hundreds to thousands of samples, which nowadays represent what is reported in a sizeable proportion of metabolomics papers, the time for upload in metabolomics repositories is thus considerably longer than the filling out of our suggested Excel tables.

Summary
In summary, we have presented here recommendations to improve the quality and cross-laboratory comparability of metabolic datasets. These range from recommendations on sampling and metabolite extraction, quantification and peak identification to guidelines on transparency in measurement and documentation, for which a data- rather than chromatogram-centric approach is suggested. We anticipate that the adoption of these recommendations will offer several advantages: (1) perusal of reported metadata will provide readers with the ability to assess the quality of the data reported and, as such, allow greater confidence in the conclusions drawn; (2) researchers will have a simple route to gain information needed to aid them in annotating their own experimental output

Box 1 | Information required for transparency in measurement and metabolite annotation and documentation

Chromatography
• Instrument description: manufacturer, model number, software and version36,39
• Separation conditions: column parameters (model, number, thickness, diameter, length and particle size)
• Separation method: mobile-phase composition and modifiers, flow rate, gradient program, column temperature, pressure, temperature and injection: split or splitless and injection cycle time

Mass spectrometry
• Instrument type and parameters: model, software and version36,39
• Type of ionization: ESI, EI, APCI or others; positive or negative polarity; and other ionization parameters (voltage, gas, vacuum and temperature)
• Mass analyzer: TOF, Orbitrap, ion trap, FT-ICR, etc.; hybrid or single-mass analyzer used for the experiment; and collision energy used for fragmentation
• Instrument performance: resolution, sensitivity, mass accuracy and scan rates
• Acquisition mode: full scan, MSMS, SIM, MRM, ddMS, etc.
• Detector

Metabolite documentation (minimum ontology)
• Details are presented in Fig. 4 and Supplementary Tables 1 and 2. Included minimum proposed reporting data: retention time, theoretical monoisotopic mass, m/z detected in the experiment for (M−H)− and/or (M+H)+ ions, m/z error (in ppm), MS/MS fragments obtained from the (M−H)− and/or (M+H)+ ions, metabolite name and compound class36,92
• For known compounds, we propose to add international identifiers (such as from HMDB, KEGG, PubChem, KNApSAcK, etc.)
• Quantified data, including peak intensity and area, etc., across the experiment must be provided in an .xls or .text file as a supplementary file
• Representative chromatogram(s) should be included to allow the assessment of metabolite identification

More extensive ontology
• Check requirements for repository submission35
• Format data using formats such as NetCDF for MS data93
• Include international metabolite identifiers
• State data availability: freely available, published or not
• Provide a summary of the experiment
• Indicate whether authenticated or reference spectra were used for identification
• Give details on code or other information used for analysis if available
• In the case of submission of downstream data (results), the minimum structure for table format and the experiment must be provided; see Hoffmann et al.44 for an example
• In the case of submission of data to GNPS for molecular networking, see Jarmusch et al.45 for an example

These represent recommendations in cases where the raw data or downstream results are submitted to repository databases (for example, MetaboLights, the Metabolomics Workbench, MetaPhen, GNPS, etc.)
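The minimum metabolite documentation listed in Box 1 maps naturally onto one record per annotated peak, exported as a delimited text table. The sketch below is only an illustration under stated assumptions: the class name, field names and column order are hypothetical and do not reproduce the layout of the Supplementary Tables or of any repository format, and the m/z error is computed with the common definition |found − theoretical| / theoretical × 10^6, which may differ from how a given reported value was derived. The example values are those shown for rutin in Figs. 4 and 5.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class AnnotationRecord:
    """One annotated peak with the minimum fields named in Box 1 (hypothetical layout)."""
    peak_no: int
    rt_min: float            # retention time, minutes
    putative_name: str
    metabolite_class: str
    formula: str
    theor_mz: float          # theoretical monoisotopic m/z of the ion
    found_mz: float          # m/z detected in the experiment
    msms_fragments: tuple    # fragment m/z values
    identifier: str          # international identifier, e.g. a PubChem CID
    level: str               # identification level (A-D)

    def mz_error_ppm(self) -> float:
        """Absolute mass error in ppm (assumed definition, see lead-in)."""
        return abs(self.found_mz - self.theor_mz) / self.theor_mz * 1e6


COLUMNS = ["peak_no", "rt_min", "putative_name", "metabolite_class", "formula",
           "theor_mz", "found_mz", "mz_error_ppm", "msms_fragments",
           "identifier", "level"]


def to_tsv(records) -> str:
    """Serialize records to tab-separated text suitable for a supplementary file."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    for r in records:
        writer.writerow([
            r.peak_no, f"{r.rt_min:.2f}", r.putative_name, r.metabolite_class,
            r.formula, f"{r.theor_mz:.4f}", f"{r.found_mz:.4f}",
            f"{r.mz_error_ppm():.2f}",
            ";".join(f"{mz:.2f}" for mz in r.msms_fragments),
            r.identifier, r.level,
        ])
    return buf.getvalue()


rutin = AnnotationRecord(1, 6.85, "Rutin", "Flavonoids", "C27H30O16",
                         611.1607, 611.1604, (611.17, 465.10, 303.05),
                         "PubChem:5280805", "B(i)")
table = to_tsv([rutin])
```

Keeping the error as a computed value rather than a typed-in field avoids transcription mistakes when the theoretical or found m/z is later corrected.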



[Fig. 5 graphic not reproduced. Panel contents: metabolite annotation workflow (reference compound; MS2, MSn; 1D or 2D NMR; PDA, UV–VIS absorbance; databases, e.g., PubChem; accurate m/z; literature survey); MS scan of rutin with a peak at m/z 611.16; MS/MS fragmentation scan (HCD/CID) and MS2 analysis showing fragments at m/z 465.10 (–146) and 303.05 (–162); structure of rutin; and the metabolite documentation fields and example entry below.]

Metabolite documentation fields:
• Peak no.: number referenced back to the main text
• RT: retention time, in minutes
• Putative name: putative identification of the metabolite
• Mol. formula: molecular formula of the metabolite
• Theor. m/z: theoretical monoisotopic mass for the ion
• Found m/z: mass detected in the experiment, (M – H)– and/or (M + H)+
• m/z error (ppm): difference between theoretical and found m/z values in ppm
• MS/MS fragments: fragments obtained from the ion (M – H)–, (M + H)+
• MS/MS CE (eV): collision energy used for fragmentation
• ID: international identifier number (e.g., HMDB, PubChem)
• Identification level (A–D): A, standard or NMR; B(i–iii), MS/MS; C(i–iii), MSn; D, MS only

Example entry:
• Peak no.: 1
• RT: 6.85
• Putative name: Rutin
• Metabolite class: Flavonoids
• Molecular formula: C27H30O16
• ES(+) theor. m/z: 611.1607
• ES(+) found m/z: 611.1604
• ES(–) theor. m/z: –
• ES(–) found m/z: –
• m/z error (ppm): 0.3
• MS/MS ES(+) fragments: 611.17 (M+H); 465.10 (M+H – Rha); 303.05 (M+H – Rha – Glc)
• MS/MS CE (eV): 40
• References (ID): 5280805
• Identification level (A–D): B(i)

Fig. 5 | Metabolite annotation and documentation. Structure elucidation workflow of metabolite identification. MS/MS fragmentation provides information about compound structure. Metabolite annotation can be achieved using reference compounds, MS2 analysis, NMR or a photodiode array (PDA) detector for UV–visible light spectrum detection. Database searching enables molecular formula calculation. Illustrated is an example of our recommendations for reporting metabolomics data for a typical LC–MS experiment for the compound rutin (a flavonoid glycoside). Comparison of the MS and MS/MS spectra for rutin reveals a peak at 611 m/z in the MS scan and two major fragments at 465 and 303 m/z in the MS/MS scan, providing information about chemical loss of rhamnose (−146 m/z) and glucose (–162 m/z) moieties. For metabolite documentation, the current general recommended levels of reporting are shown; see Supplementary Tables 1 and 2 for further details.
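The neutral-loss reasoning in the Fig. 5 caption (−146 for rhamnose, −162 for glucose) can be checked programmatically. The following is a minimal sketch, assuming a small hand-written lookup of monoisotopic neutral-loss masses (deoxyhexose, C6H10O4, 146.0579 Da; hexose, C6H10O5, 162.0528 Da) and an arbitrary tolerance of 0.02 Da chosen because the fragment m/z values are reported to two decimals. The function and table names are hypothetical and this is not the annotation procedure used by the authors.

```python
# Monoisotopic masses (Da) of two common glycosidic neutral losses.
NEUTRAL_LOSSES = {
    "deoxyhexose (e.g., rhamnose)": 146.0579,  # C6H10O4
    "hexose (e.g., glucose)": 162.0528,        # C6H10O5
}


def annotate_losses(peaks, tol=0.02):
    """Return (higher_mz, lower_mz, label) for every pair of peaks whose
    spacing matches a known neutral loss within `tol` Da."""
    ordered = sorted(peaks, reverse=True)
    hits = []
    for i, hi in enumerate(ordered):
        for lo in ordered[i + 1:]:
            delta = hi - lo
            for label, mass in NEUTRAL_LOSSES.items():
                if abs(delta - mass) <= tol:
                    hits.append((hi, lo, label))
    return hits


# Precursor and fragment m/z values listed for rutin in Fig. 5.
hits = annotate_losses([611.17, 465.10, 303.05])
for hi, lo, label in hits:
    print(f"{hi:.2f} -> {lo:.2f}: loss of {label}")
```

A match on mass spacing alone does not distinguish isomeric sugars (for example, glucose from galactose), so such output supports, but does not replace, the identification levels described above.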
and (3) data obtained by multiple laboratories may be compared more easily.

A recent example of comprehensive documentation of a metabolomics experiment is provided by the study of Price et al.91, who evaluated metabolite levels in understudied crop species, assembling an extensive database of the underlying data. Greater adoption of simple reporting tables such as the ones we describe here (Supplementary Tables 1 and 2) or the similar one proposed by Dorrestein and coworkers (for a comparison of these tables, see Supplementary Note 8) has the potential to elucidate general aspects of the metabolic response.

We would like to stress that the intention of the recommendations presented here is to encourage fuller and more faithful reporting of both metabolite annotations and their respective quantification. Our proposed reporting standards are not meant to be a direct replacement for the standards set by metabolome repositories. In fact, in most instances, these are entirely complementary to one another. We recommend that metabolomics practitioners follow repository standards alongside those we discuss here. There is a wealth of data reported in the literature that, for one reason or another, have not been deposited in repositories (such as MetaboLights, the Metabolomics Workbench and GNPS-MassIVE), and for such data it would be excellent if the metadata could be captured. This is important not only for possible reuse of the data but equally as a means of allowing the reader the possibility to evaluate their veracity. Expansion of such approaches, including input from both experimental and computational scientists, will facilitate the generation of pan-metabolome databases, which will undoubtedly open new horizons for metabolomics in all kingdoms of life.

We believe that more widespread adoption of these recommendations will enhance the quality of reporting of metabolite data, advance community efforts to improve the annotation of metabolomes and, finally, facilitate the exchange and comparability of metabolite data from different laboratories. These efforts will also facilitate comparison of metabolomics datasets obtained from different species, supporting the renaissance of comparative biochemistry.

Received: 2 April 2020; Accepted: 27 May 2021; Published online: 8 July 2021

References
1. Doerr, A. Global metabolomics. Nat. Methods 14, 32 (2017).
2. Fessenden, M. Metabolomics: small molecules, single cells. Nature 540, 153–155 (2016).
3. Oliver, S. G., Winson, M. K., Kell, D. B. & Baganz, F. Systematic functional analysis of the yeast genome. Trends Biotechnol. 16, 373–378 (1998).
4. Alseekh, S. & Fernie, A. R. Metabolomics 20 years on: what have we learned and what hurdles remain? Plant J. 94, 933–942 (2018).
5. Chevalier, C. et al. Gut microbiota orchestrates energy homeostasis during cold. Cell 163, 1360–1374 (2015).
This paper demonstrates that the microbiota is a key factor orchestrating overall energy homeostasis during increased demand in mammals.
6. Chu, C. et al. The microbiota regulate neuronal function and fear extinction learning. Nature 574, 543–548 (2019).
7. Djamei, A. et al. Metabolic priming by a secreted fungal effector. Nature 478, 395–398 (2011).
8. Dorr, J. R. et al. Synthetic lethal metabolic targeting of cellular senescence in cancer therapy. Nature 501, 421–425 (2013).
This paper illustrates the identification of metabolite biomarkers for use in cancer diagnostics and to serve as targets for new-concept anticancer therapies.

9. Fernie, A. R., Trethewey, R. N., Krotzky, A. J. & Willmitzer, L. Metabolite profiling: from diagnostics to systems biology. Nat. Rev. Mol. Cell Biol. 5, 763–769 (2004).
10. Gieger, C. et al. Genetics meets metabolomics: a genome-wide association study of metabolite profiles in human serum. PLoS Genet. 4, e100282 (2008).
11. Guijas, C., Montenegro-Burke, J. R., Warth, B., Spilker, M. E. & Siuzdak, G. Metabolomics activity screening for identifying metabolites that modulate phenotype. Nat. Biotechnol. 36, 316–320 (2018).
12. Lu, W. et al. Metabolite measurement: pitfalls to avoid and practices to follow. Annu. Rev. Biochem. 86, 277–304 (2017).
A useful and comprehensive review highlighting the pitfalls encountered in metabolomics and providing guidelines for accurate metabolite measurements.
13. Mashego, M. R. et al. Microbial metabolomics: past, present and future methodologies. Biotechnol. Lett. 29, 1–16 (2007).
14. Rinschen, M. M., Ivanisevic, J., Giera, M. & Siuzdak, G. Identification of bioactive metabolites using activity metabolomics. Nat. Rev. Mol. Cell Biol. 20, 353–367 (2019).
15. Van Gulik, W. M. et al. Fast sampling of the cellular metabolome. Methods Mol. Biol. 881, 279–306 (2012).
16. Delzenne, N. M. & Bindels, L. B. Microbiome metabolomics reveals new drivers of human liver steatosis. Nat. Med. 24, 906–907 (2018).
17. Guo, A. C. et al. ECMDB: the E. coli metabolome database. Nucleic Acids Res. 41, D625–D630 (2013).
18. Sajed, T. et al. ECMDB 2.0: a richer resource for understanding the biochemistry of E. coli. Nucleic Acids Res. 44, D495–D501 (2016).
19. Hautbergue, T., Jamin, E. L., Debrauwer, L., Puel, O. & Oswald, I. P. From genomics to metabolomics, moving toward an integrated strategy for the discovery of fungal secondary metabolites. Nat. Prod. Rep. 35, 147–173 (2018).
20. Ramirez-Gaona, M. et al. YMDB 2.0: a significantly expanded version of the yeast metabolome database. Nucleic Acids Res. 45, D440–D445 (2017).
21. Wishart, D. S. et al. HMDB 4.0: the human metabolome database for 2018. Nucleic Acids Res. 46, D608–D617 (2018).
The database described is a groundbreaking, comprehensive and freely available web resource containing detailed information about the human metabolome.
22. Saito, K. & Matsuda, F. Metabolomics for functional genomics, systems biology, and biotechnology. Annu. Rev. Plant Biol. 61, 463–489 (2010).
23. Lisec, J., Schauer, N., Kopka, J., Willmitzer, L. & Fernie, A. R. Gas chromatography mass spectrometry-based metabolite profiling in plants. Nat. Protoc. 1, 387–396 (2006).
24. Tohge, T. & Fernie, A. R. Combining genetic diversity, informatics and metabolomics to facilitate annotation of plant gene function. Nat. Protoc. 5, 1210–1227 (2010).
25. van Gulik, W. M. Fast sampling for quantitative microbial metabolomics. Curr. Opin. Biotechnol. 21, 27–34 (2010).
26. Vuckovic, D. Current trends and challenges in sample preparation for global metabolomics using liquid chromatography–mass spectrometry. Anal. Bioanal. Chem. 403, 1523–1548 (2012).
27. Wang, Z., Gerstein, M. & Snyder, M. RNA-seq: a revolutionary tool for transcriptomics. Nat. Rev. Genet. 10, 57–63 (2009).
28. Bennett, B. D., Yuan, J., Kimball, E. H. & Rabinowitz, J. D. Absolute quantitation of intracellular metabolite concentrations by an isotope ratio-based approach. Nat. Protoc. 3, 1299–1311 (2008).
29. Li, B. et al. NOREVA: normalization and evaluation of MS-based metabolomics data. Nucleic Acids Res. 45, W162–W170 (2017).
30. Fiehn, O. Metabolomics—the link between genotypes and phenotypes. Plant Mol. Biol. 48, 155–171 (2002).
31. Papadimitropoulos, M. P., Vasilopoulou, C. G., Maga-Nteve, C. & Klapa, M. I. Untargeted GC–MS metabolomics. Methods Mol. Biol. 1738, 133–147 (2018).
32. Fiehn, O. et al. The metabolomics Standards Initiative (MSI). Metabolomics 3, 175–178 (2007).
A brief report outlining the history, stature and intentions of MSI, an authoritative standards initiative for metabolomics.
33. Haug, K. et al. MetaboLights—an open-access general-purpose repository for metabolomics studies and associated meta-data. Nucleic Acids Res. 41, D781–D786 (2013).
34. Salek, R. M., Haug, K. & Steinbeck, C. Dissemination of metabolomics results: role of MetaboLights and COSMOS. GigaScience 2, 8 (2013).
35. Steinbeck, C. et al. MetaboLights: towards a new COSMOS of metabolomics data management. Metabolomics 8, 757–760 (2012).
36. Sumner, L. W. et al. Proposed minimum reporting standards for chemical analysis: Chemical Analysis Working Group (CAWG) Metabolomics Standards Initiative (MSI). Metabolomics 3, 211–221 (2007).
37. Sansone, S. A. et al. The Metabolomics Standards Initiative. Nat. Biotechnol. 25, 846–848 (2007).
This article highlights two standards and guidelines papers for MS and sample preparation by the Human Proteome Organization Proteomics Standardization Initiative (HUPO-PSI) and the Functional Genomics Experiment (FuGE) and describes how metabolomics standards should align to these.
38. Spicer, R. A., Salek, R. & Steinbeck, C. A decade after the Metabolomics Standards Initiative: it’s time for a revision. Sci. Data 4, 3 (2017).
39. Fernie, A. R. et al. Recommendations for reporting metabolite data. Plant Cell 23, 2477–2482 (2011).
40. Aksenov, A. A. et al. A machine learning workflow enables automatic deconvolution of GC-MS data. Nat. Biotechnol. 39, 169–173 (2021).
41. Aron, A. T. et al. Reproducible molecular networking of untargeted mass spectrometry data using GNPS. Nat. Protoc. 15, 1954–1991 (2020).
42. Blaženović, I. et al. Structure annotation of all mass spectra in untargeted metabolomics. Anal. Chem. 91, 2155–2162 (2019).
43. Buendia, P. et al. Ontology-based metabolomics data integration with quality control. Bioanalysis 11, 1139–1155 (2019).
44. Hoffmann, N. et al. mzTab-M: a data standard for sharing quantitative results in mass spectrometry metabolomics. Anal. Chem. 91, 3302–3310 (2019).
45. Jarmusch, A. K. et al. ReDU: a framework to find and reanalyze public mass spectrometry data. Nat. Methods 17, 901–904 (2020).
This tool enables the capture of public MS-based metabolomics data and their subsequent reanalysis.
46. Nothias, L. F. et al. Feature-based molecular networking in the GNPS analysis environment. Nat. Methods 17, 905–908 (2020).
47. Tsugawa, H. et al. A cheminformatics approach to characterize metabolomes in stable-isotope-labeled organisms. Nat. Methods 16, 295–298 (2019).
48. Huan, T. et al. Systems biology guided by XCMS Online metabolomics. Nat. Methods 14, 461–462 (2017).
49. Patel, V. R., Eckel-Mahan, K., Sassone-Corsi, P. & Baldi, P. CircadiOmics: integrating circadian genomics, transcriptomics, proteomics and metabolomics. Nat. Methods 9, 772–773 (2012).
50. Palladino, G. W., Wood, J. J. & Proctor, H. J. Modified freeze clamp technique for tissue assay. J. Surg. Res. 28, 188–190 (1980).
51. Tohge, T. et al. From models to crop species: caveats and solutions for translational metabolomics. Front. Plant Sci. 2, 61 (2011).
52. Trutschel, D., Schmidt, S., Grosse, I. & Neumann, S. Experiment design beyond gut feeling: statistical tests and power to detect differential metabolites in mass spectrometry data. Metabolomics 11, 851–860 (2015).
53. Sanchez, D. H., Szymanski, J., Erban, A., Udvardi, M. K. & Kopka, J. Mining for robust transcriptional and metabolic responses to long-term salt stress: a case study on the model legume Lotus japonicus. Plant Cell Environ. 33, 468–480 (2010).
54. Chen, W. et al. Comparative and parallel genome-wide association studies for metabolic and agronomic traits in cereals. Nat. Commun. 7, 12767 (2016).
55. Fuhrer, T., Zampieri, M., Sevin, D. C., Sauer, U. & Zamboni, N. Genomewide landscape of gene–metabolome associations in Escherichia coli. Mol. Syst. Biol. 13, 907 (2017).
56. Hartiala, J. A. et al. Genome-wide association study and targeted metabolomics identifies sex-specific association of CPS1 with coronary artery disease. Nat. Commun. 7, 10558 (2016).
57. Alseekh, S., Wu, S., Brotman, Y. & Fernie, A. R. Guidelines for sample normalization to minimize batch variation for large-scale metabolic profiling of plant natural genetic variance. Methods Mol. Biol. 1778, 33–46 (2018).
58. Dunn, W. B. et al. Procedures for large-scale metabolic profiling of serum and plasma using gas chromatography and liquid chromatography coupled to mass spectrometry. Nat. Protoc. 6, 1060–1083 (2011).
59. Griffin, J. L. et al. Standard reporting requirements for biological samples in metabolomics experiments: mammalian/in vivo experiments. Metabolomics 3, 179–188 (2007).
60. Vandeputte, D., Tito, R. Y., Vanleeuwen, R., Falony, G. & Raes, J. Practical considerations for large-scale gut microbiome studies. FEMS Microbiol. Rev. 41, S154–S167 (2017).
61. Wehrens, R. et al. Improved batch correction in untargeted MS-based metabolomics. Metabolomics 12, 88 (2016).
62. Cui, Q. et al. Metabolite identification via the Madison Metabolomics Consortium Database. Nat. Biotechnol. 26, 162–164 (2008).
63. Nakamura, Y. et al. KNApSAcK Metabolite Activity Database for retrieving the relationships between metabolites and biological activities. Plant Cell Physiol. 55, e7 (2014).
64. Tautenhahn, R. et al. An accelerated workflow for untargeted metabolomics using the METLIN database. Nat. Biotechnol. 30, 826–828 (2012).
65. Vinaixa, M. et al. Mass spectral databases for LC/MS- and GC/MS-based metabolomics: state of the field and future prospects. Trac-Trends Anal. Chem. 78, 23–35 (2016).
66. Zhu, Z. J. et al. Liquid chromatography quadrupole time-of-flight mass spectrometry characterization of metabolites guided by the METLIN database. Nat. Protoc. 8, 451–460 (2013).
67. Niebel, B., Leupold, S. & Heinemann, M. An upper limit on Gibbs energy dissipation governs cellular metabolism. Nat. Metab. 1, 125–132 (2019).



68. Jourdan, F., Breitling, R., Barrett, M. P. & Gilbert, D. MetaNetter: inference and visualization of high-resolution metabolomic networks. Bioinformatics 24, 143–145 (2008).
69. Pirhaji, L. et al. Revealing disease-associated pathways by network integration of untargeted metabolomics. Nat. Methods 13, 770–776 (2016).
70. Shen, X. T. et al. Metabolic reaction network-based recursive metabolite annotation for untargeted metabolomics. Nat. Commun. 10, 1516 (2019).
This represents an important example of a tool using a metabolic reaction network that expands metabolite annotations without the need for a comprehensive standard spectral library.
71. Kummel, A., Panke, S. & Heinemann, M. Putative regulatory sites unraveled by network-embedded thermodynamic analysis of metabolome data. Mol. Syst. Biol. 2, 2006.0034 (2006).
72. Ap Rees, T. & Hill, S. A. Metabolic control analysis of plant metabolism. Plant Cell Environ. 17, 587–599 (1994).
73. Arrivault, S. et al. Use of reverse-phase liquid chromatography, linked to tandem mass spectrometry, to profile the Calvin cycle and other metabolic intermediates in Arabidopsis rosettes at different carbon dioxide concentrations. Plant J. 59, 826–839 (2009).
74. Lu, W., Bennett, B. D. & Rabinowitz, J. D. Analytical strategies for LC–MS-based targeted metabolomics. J. Chromatogr. B Anal. Technol. Biomed. Life Sci. 871, 236–242 (2008).
75. Lunn, J. E. et al. Sugar-induced increases in trehalose 6-phosphate are correlated with redox activation of ADP-glucose pyrophosphorylase and higher rates of starch synthesis in Arabidopsis thaliana. Biochem. J. 397, 139–148 (2006).
76. Annesley, T. M. Ion suppression in mass spectrometry. Clin. Chem. 49, 1041–1044 (2003).
77. Antignac, J. P., Marchand, P., Le Bizec, B. & Andre, F. Identification of ractopamine residues in tissue and urine samples at ultra-trace level using liquid chromatography–positive electrospray tandem mass spectrometry. J. Chromatogr. B Anal. Technol. Biomed. Life Sci. 774, 59–66 (2002).
78. Kebarle, P. & Tang, L. From ions in solution to ions in the gas phase - the mechanism of electrospray mass spectrometry. Anal. Chem. 65, 972A–986A (1993).
79. Roessner-Tunali, U. et al. De novo amino acid biosynthesis in potato tubers is regulated by sucrose levels. Plant Physiol. 133, 683–692 (2003).
80. Buhrman, D. L., Price, P. I. & Rudewicz, P. J. Quantitation of SR 27417 in human plasma using electrospray liquid chromatography–tandem mass spectrometry: a study of ion suppression. J. Am. Soc. Mass Spectrom. 7, 1099–1105 (1996).
81. Gerssen, A. et al. Solid phase extraction for removal of matrix effects in lipophilic marine toxin analysis by liquid chromatography–tandem mass spectrometry. Anal. Bioanal. Chem. 394, 1213–1226 (2009).
82. Freitas, L. G., Götz, C. W., Ruff, M., Singer, H. P. & Müller, S. R. Quantification of the new triketone herbicides, sulcotrione and mesotrione, and other important herbicides and metabolites, at the ng/l level in surface waters using liquid chromatography–tandem mass spectrometry. J. Chromatogr. A 1028, 277–286 (2004).
83. De Vijlder, T. et al. A tutorial in small molecule identification via electrospray ionization–mass spectrometry: the practical art of structural elucidation. Mass Spectrom. Rev. 37, 607–629 (2018).
84. Dettmer, K., Aronov, P. A. & Hammock, B. D. Mass spectrometry-based metabolomics. Mass Spectrom. Rev. 26, 51–78 (2007).
85. Tohge, T. et al. Functional genomics by integrated analysis of metabolome and transcriptome of Arabidopsis plants over-expressing an MYB transcription factor. Plant J. 42, 218–235 (2005).
86. Shahaf, N. et al. The WEIZMASS spectral library for high-confidence metabolite identification. Nat. Commun. 7, 12423 (2016).
This represents a reference metabolite spectral library developed from high-resolution MS data acquired from a structurally diverse set of 3,540 plant metabolites, providing great promise for addressing the question of comprehensivity in metabolomics.
87. Feldberg, L., Venger, I., Malitsky, S., Rogachev, I. & Aharoni, A. Dual labeling of metabolites for metabolome analysis (DLEMMA): a new approach for the identification and relative quantification of metabolites by means of dual isotope labeling and liquid chromatography–mass spectrometry. Anal. Chem. 81, 9257–9266 (2009).
88. Schiffman, C. et al. Filtering procedures for untargeted LC–MS metabolomics data. BMC Bioinformatics 20, 334 (2019).
89. Kale, N. S. et al. MetaboLights: an open-access database repository for metabolomics data. Curr. Protoc. Bioinformatics 53, 14.13.11–14.13.18 (2016).
90. Sud, M. et al. Metabolomics Workbench: an international repository for metabolomics data and metadata, metabolite standards, protocols, tutorials and training, and analysis tools. Nucleic Acids Res. 44, D463–D470 (2016).
91. Price, E. J. et al. Metabolite database for root, tuber, and banana crops to facilitate modern breeding in understudied crops. Plant J. 101, 1258–1268 (2020).
Recent exemplary documentation of a metabolomics experiment that evaluated metabolite levels in crop species, providing not only an extensive database but moreover an excellent example of how to correctly investigate understudied species.
92. Bino, R. J. et al. Potential of metabolomics as a functional genomics tool. Trends Plant Sci. 9, 418–425 (2004).
93. Kirwan, J. A. et al. Preanalytical processing and biobanking procedures of biological samples for metabolomics research: a white paper, community perspective (for ‘Precision Medicine and Pharmacometabolomics Task Group’–The Metabolomics Society Initiative). Clin. Chem. 64, 1158–1182 (2018).

Acknowledgements
A.R.F. and S.A. are supported by the European Union’s Horizon 2020 research and innovation program, under PlantaSYST (SGA-CSA no. 739582 under FPA no. 664620). J.E. and S.S. were supported by the German Research Foundation (DFG) grant numbers 210879364 and 239748522. P.D.F. is grateful for funding from the CGIAR Research Program on Roots, Tubers and Bananas (RTB) and is supported by CGIAR Fund Donors and the Biotechnology and Biological Sciences Research Council OPTICAR Project (project BB/P001742/1). R.D.H. acknowledges receipt of the Nils Foss Food Excellence Prize, which funded his time spent on this initiative. W.W. is supported by the Huazhong Agricultural University Scientific & Technological Self-Innovation Foundation (program no. 2017RC002). K.C. and M.P.S. are supported by the NIH under grant numbers 5U54HG010426-03 and 1U2CCA233311-01. H.T. acknowledges financial support from the Shanghai Municipal Science and Technology Major Project (2017SHZDZX01) and the National Natural Science Foundation of China (31821002). G.X. is supported by the National Natural Science Foundation of China (21934006). T.T. was supported by JSPS KAKENHI grants-in-aid (19H03249 and 19K06723). G.S. is supported by the NIH under grant number R35GM130385.

Author contributions
A.R.F. and S.A. wrote the manuscript with contributions and input from all authors. All authors read and approved the final manuscript.

Competing interests
The authors declare no competing interests.

Additional information
Supplementary information The online version contains supplementary material available at https://doi.org/10.1038/s41592-021-01197-1.
Correspondence should be addressed to S.A. or A.R.F.
Peer review information Nature Methods thanks Justin van der Hooft and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Allison Doerr was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.
Reprints and permissions information is available at www.nature.com/reprints.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© Springer Nature America, Inc. 2021
