Biomedicines 12 02118


biomedicines

Article
A Multiplexed Quantitative Proteomics Approach to the Human
Plasma Protein Signature
Estefanía Núñez 1,2 , María Gómez-Serrano 3 , Enrique Calvo 1,2 , Elena Bonzon-Kulichenko 1 ,
Marco Trevisan-Herraz 4 , José Manuel Rodríguez 1 , Fernando García-Marqués 5 , Ricardo Magni 1,2 ,
Enrique Lara-Pezzi 1,2 , José Luis Martín-Ventura 2,6 , Emilio Camafeita 1,2, * and Jesús Vázquez 1,2, *

1 Centro Nacional de Investigaciones Cardiovasculares Carlos III, 28029 Madrid, Spain; [email protected] (E.N.);
[email protected] (E.C.); [email protected] (E.B.-K.); [email protected] (J.M.R.);
[email protected] (R.M.); [email protected] (E.L.-P.)
2 CIBER de Enfermedades Cardiovasculares (CIBERCV), 28029 Madrid, Spain; [email protected]
3 Institute for Tumor Immunology, Center for Tumor Biology and Immunology (ZTI), Philipps University,
35043 Marburg, Germany; [email protected]
4 International Center for Life, Newcastle University, Newcastle upon Tyne NE1 4EP, UK;
[email protected]
5 Canary Center for Cancer Early Detection, Stanford, CA 94304, USA; [email protected]
6 IIS-Fundación Jiménez-Díaz, 28015 Madrid, Spain
* Correspondence: [email protected] (E.C.); [email protected] (J.V.)

Abstract: Despite the plasma proteome being able to provide a unique insight into the health and
disease status of individuals, holding singular promise as a source of protein biomarkers that could
be pivotal in the context of personalized medicine, only around 100 proteins covering a few human
conditions have been approved as biomarkers by the US Food and Drug Administration (FDA) so far.
Mass spectrometry (MS) currently has enormous potential for high-throughput analysis in clinical
research; however, plasma proteomics remains challenging mainly due to the wide dynamic range
of plasma protein abundances and the time-consuming procedures required. We applied a new
MS-based multiplexed proteomics workflow to quantitate proteins, encompassing 67 FDA-approved
biomarkers, in >1300 human plasma samples from a clinical cohort. Our results indicate that this
workflow is suitable for large-scale clinical studies, showing good accuracy and reproducibility
(coefficient of variation (CV) < 20 for 90% of the proteins). Furthermore, we identified plasma
signature proteins (stable in time on an individual basis), stable proteins (exhibiting low biological
variability and high temporal stability), and highly variable proteins (with low temporal stability)
that can be used for personalized health monitoring and medicine.

Keywords: LC-MS/MS; human plasma; plasma proteomics; clinical proteomics; atherosclerosis;
personalized medicine

Citation: Núñez, E.; Gómez-Serrano, M.; Calvo, E.; Bonzon-Kulichenko, E.; Trevisan-Herraz, M.;
Rodríguez, J.M.; García-Marqués, F.; Magni, R.; Lara-Pezzi, E.; Martín-Ventura, J.L.; et al.
A Multiplexed Quantitative Proteomics Approach to the Human Plasma Protein Signature. Biomedicines
2024, 12, 2118. https://doi.org/10.3390/biomedicines12092118

Academic Editor: M. Walid Qoronfleh

Received: 22 July 2024


Revised: 30 August 2024
Accepted: 9 September 2024
Published: 18 September 2024

Copyright: © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).

Biomedicines 2024, 12, 2118. https://doi.org/10.3390/biomedicines12092118 https://www.mdpi.com/journal/biomedicines

1. Introduction
Blood is the primary link between different parts of the body and has the potential to
expose the health/pathological status of any key organ. Plasma and serum are the predominant
clinical specimens available for routine molecular analysis, and among the molecules
present in blood, proteins have the greatest clinical significance. Plasma is estimated to
contain more than 20,000 different proteins (based on the number of human protein-coding
genes) with concentrations spanning 10–12 orders of magnitude. For instance, albumin constitutes
60% of the total plasma proteome, with a concentration of 50 mg/mL, whereas the
concentration of interleukin-6 is 4.2 pg/mL [1,2]. In addition, alternative splicing and
post-translational modifications (PTMs) lead to millions of different proteoforms [3], not to
mention the underexplored peptidome [4] and the non-canonical proteome [5]. Thousands
of proteins have been measured in plasma or serum using either MS-based proteomics,
affinity-based proteomics, or combinations thereof, but to date, only around 100 plasma
protein biomarkers have been approved by the FDA [6].
MS is the most commonly used biomolecular detection technique in proteomics, allowing
high-throughput analysis of proteins in biological samples with almost complete
coverage [7,8]. MS is the technique of choice for the analysis of human samples and body
fluids, as it can yield specific quantitative information on every proteome component
in an unbiased way and therefore contribute to deciphering the molecular basis of human
diseases. However, MS-based analysis of human plasma samples is constrained by
the above-mentioned large dynamic range of protein concentrations and complexity. To
overcome this drawback and achieve deep and extensive proteome coverage upon liquid
chromatography–tandem MS (LC-MS/MS) analysis, depletion of highly
abundant proteins (i.e., those accounting for >99% of the total protein mass) and extensive
peptide-level fractionation via offline separation techniques, such as affinity enrichment
[9,10], reversed-phase chromatography at basic pH, and strong cation exchange
chromatography, have been used [11–13]. However, immunodepletion can not only lead to
sample loss and diminished sample-to-sample reproducibility but also cause analytical bias
due to the removal of lower abundance proteins carried over by such species as albumin or
nonspecifically bound to the antibodies used [14,15]; moreover, extensive fractionation at
the peptide level is time-consuming and therefore not feasible for large sample sizes [16].
Some of these drawbacks can be circumvented by using multiplexed analysis based on
isobaric labeling techniques in combination with peptide-level fractionation to perform
large-scale studies with suitable depth in a reasonable time.
Recent advances in MS-based proteomics have led to noticeable improvements in
sensitivity, analytical dynamic range, speed, and robustness that hold enormous potential
for large-scale plasma proteomics in clinical research. Label-free approaches are more
challenging to implement when using very long gradients and peptide fractionation (as
required to dig deeper into the plasma proteome) because of the difficulty of maintaining
constant chromatographic and mass spectrometric performance, two parameters that
are essential for accurate quantification, across hundreds or thousands of individual
LC-MS/MS runs. On the other hand, isobaric labeling allows routine analysis of large numbers
of samples via LC-MS/MS without demanding chromatographic stability over extended
time ranges.
Analysis of >1000 samples from a clinical research cohort of subjects with acceptable
throughput and analytical depth using MS-based shotgun proteomics is a challenge. So
far, most non-targeted plasma proteomics studies have been confined to the analysis of
<100 samples, which offers limited power for biomarker discovery, and only a few works
have analyzed >500 human plasma samples with significant proteome coverage. Bruderer
et al. [17] described the analysis of 1508 plasma samples at 31 samples/day with a depth
of 565 proteins per analysis using capillary-flow data-independent acquisition; however,
quality control samples were required to monitor LC-MS/MS performance. Cominetti
et al. [18] resorted to multiplexed labeling with tandem mass tags (TMT) to quantitate
1000 samples at 50 samples/day, but only 190 proteins were identified per analysis. More
recently, Niu et al. [19], using a data-independent acquisition approach, reported the analysis
of 659 plasma samples at 60 samples/day with a depth of ca. 300 proteins per analysis.
We have recently developed an MS-based proteomics workflow for the high-throughput
analysis of human plasma samples based on isobaric labeling of peptides for multiplexed
quantification, followed by automated peptide and protein quantification using a robust
statistical model previously developed in our laboratory [20–25]. This workflow has been
used for the analysis of the human plasma proteome in several works [26–33]. In this
study, we evaluate in detail the performance and the statistical accuracy of our plasma
proteomics workflow and study the evolution of plasma protein abundance over time
using previously published data from over 1300 human plasma samples pertaining to the
PESA and AWHS cohorts.

2. Materials and Methods


2.1. Plasma Samples
Plasma samples were collected from the PESA study cohort [34] at baseline (V1,
444 samples) and at the 3-year follow-up visit (V2, 444 samples), as well as from the
AWHS cohort [35,36] (350 samples). PESA is a prospective cohort study with asymptomatic
employees (age: 40–54 years) of the Santander Bank (Madrid, Spain). AWHS is a prospective
longitudinal cohort study of middle-aged workers free of clinical cardiovascular disease
(CVD) from the General Motors Spain automobile assembly plant (Zaragoza, Spain).

2.2. Plasma Depletion


Immunodepletion of the 14 most abundant plasma proteins was carried out with the
Multiple Affinity Removal Column Human 14 (4.6 × 100 mm, Agilent, Santa Clara, CA,
USA) coupled with a 1290 Infinity II liquid chromatography system (Agilent).

2.3. Protein Digestion


A volume of 5 µL of plasma from each individual (corresponding to 300 µg of protein,
as measured by NanoDrop 1000, Thermo Fisher Scientific, Waltham, MA, USA) was
mixed with 5 µL of a buffer containing 50 mM Tris, 2% sodium dodecyl sulfate, and
100 mM dithiothreitol and boiled for 5 min. On-filter protein digestion was performed
using Nanosep Centrifugal Devices with Omega Membrane 10K (Pall, Portsmouth, UK)
following the manufacturer’s instructions. Briefly, 320 µL of urea was added to each sample,
and the mixture was transferred to the filter, which was then subjected to centrifugation at
14,000 × g for 10 min. Cysteine residues were alkylated with 50 mM iodoacetamide with 1 h
incubation at room temperature in the dark. After two washes with urea followed by two
washes with 100 mM ammonium bicarbonate pH 8.8, proteins were digested with trypsin
(1:30 w/w trypsin:protein, Promega, Madison, WI, USA) overnight at 37 °C. The resulting
peptides were eluted with ammonium bicarbonate and NaCl, and the resulting peptide
solution was acidified with 25% trifluoroacetic acid (TFA; final concentration, 1%; Merck,
Darmstadt, Germany) and desalted with Oasis cartridges (Waters, Milford, MA, USA)
following the manufacturer’s instructions. Finally, the eluted peptides were vacuum-dried
and stored at −20 °C until further use.
On-plate protein digestion was performed as described above using 96-well plates
(AcroPrep 96-well Filter Plates, Pall) coupled to a vacuum manifold (Pall), after which the
resulting peptides were desalted in Oasis HLB 96-well plates (Waters). Finally, the peptide
solutions were vacuum-dried and stored at −20 °C until further use.

2.4. Isobaric Labeling


Peptide samples were taken up in 100 mM triethylammonium bicarbonate, and the
peptide concentration was measured using a DirectDetect infrared spectrometer (Merck).
Then, the peptides were subjected to multiplexed isobaric labeling with TMT using the TMT
10plex Isobaric Label Reagent Set (Thermo Fisher Scientific) following the manufacturer’s
instructions. A total of 156 TMT experiments were performed, processed in 15-TMT batches.
Each TMT experiment comprised eight individuals, with two TMT channels reserved for a
reference internal standard sample that was prepared as follows: After protein digestion,
equal peptide amounts from half of the samples included in each of the 15-TMT batches
were pooled; then, for each TMT experiment, a 50 µg aliquot was labeled with TMT reagent
126, and another 50 µg aliquot was labeled with TMT reagent 131; afterward, these two
labeled samples were pooled (to compensate for potential different labeling efficiencies of
the two TMT reagents used) and used to spike the eight pooled samples comprising each
TMT experiment with the same amount of internal standard sample. The resulting labeled
peptide mixtures were acidified with 25% TFA (1% final concentration) and desalted with
Oasis cartridges (Waters). Finally, 1/10 and 9/10 aliquots of the eluted peptides were
vacuum-dried and stored at −20 °C for later MS analysis and peptide-level fractionation,
respectively.
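
The labeling design described above can be sketched as a channel layout. The following Python sketch is illustrative only: it assumes the standard TMT 10plex channel names, places the internal standard in channels 126 and 131 as stated in the text, and fills the remaining eight channels with individual samples in order. The function name and sample identifiers are hypothetical and do not come from the original workflow.

```python
# Standard TMT 10plex reporter channels (assumed naming).
TMT10_CHANNELS = ["126", "127N", "127C", "128N", "128C",
                  "129N", "129C", "130N", "130C", "131"]
# Channels reserved for the pooled internal standard, as described in the text.
REFERENCE_CHANNELS = ("126", "131")


def layout_experiments(sample_ids):
    """Group samples into TMT experiments of eight individuals each.

    Hypothetical helper: returns one dict per experiment that maps
    each channel name to a sample ID or to "internal_standard".
    """
    sample_channels = [c for c in TMT10_CHANNELS if c not in REFERENCE_CHANNELS]
    experiments = []
    for start in range(0, len(sample_ids), len(sample_channels)):
        chunk = sample_ids[start:start + len(sample_channels)]
        layout = {ch: "internal_standard" for ch in REFERENCE_CHANNELS}
        layout.update(zip(sample_channels, chunk))
        experiments.append(layout)
    return experiments
```

Under these assumptions, sixteen samples would yield two experiments, each with eight individuals and two internal standard channels.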

2.5. Fractionation of Peptide Samples


The 9/10 aliquot was taken up in 0.1% TFA and separated into five fractions using the
High pH Reversed-Phase Peptide Fractionation Kit (Thermo Fisher Scientific) according to
the manufacturer’s instructions. Briefly, the cartridges were washed with 50% and 100%
acetonitrile (ACN) and equilibrated with 0.1% TFA. Then, the labeled peptide samples were
loaded into the cartridges and the peptides eluted in five different fractions with increasing
ACN concentration: 12.5% ACN, 15% ACN, 17.5% ACN, 20% ACN, and 50% ACN. The
resulting labeled peptide fractions were vacuum-dried and stored at −20 °C for further
MS analysis.

2.6. LC-MS/MS Analysis


The labeled peptide samples were taken up in 0.1% formic acid (FA) and subjected
to LC-MS/MS analysis using an EASY-nLC 1200 liquid chromatography system (Thermo
Fisher Scientific) coupled with an Orbitrap Fusion mass spectrometer (Thermo Fisher
Scientific) in the case of PESA V1 plasma samples or an Ultimate 3000 HPLC system
(Thermo Fisher Scientific) coupled with a Q Exactive HF mass spectrometer (Thermo
Fisher Scientific) in the case of PESA V2 and AWHS plasma samples. C18-based reversed-
phase separation was carried out using a PepMap 100 C18 trapping column (Thermo
Fisher Scientific) and an EASY-Spray 50 cm analytical column (Thermo Fisher Scientific).
Peptides were loaded in buffer A (0.1% (v/v) FA in water) and eluted with an ACN gradient
consisting of 0–21% buffer B (100% ACN, 0.1% (v/v) FA) for 300 min and 21–90% B for 5 min
at a flow rate of 200 nL/min. Mass spectra were acquired in a data-dependent manner, with
an automatic switch between MS and MS/MS using a top-speed method with the Fusion
mass spectrometer and a top-15 method with the Q Exactive HF instrument. MS spectra
were acquired in the Orbitrap analyzer in the 400–1500 m/z range with 70,000 resolution.
Higher energy collisional dissociation was performed with a normalized collision energy
value of 30, and MS/MS spectra were acquired with 60,000 resolution in the Orbitrap.

2.7. Protein Identification


For peptide identification, MS/MS spectra were searched with the SEQUEST HT
algorithm implemented in Proteome Discoverer 2.1 (Thermo Scientific) against a concatenated
target-decoy database comprising human protein sequences (UniProtKB/Swiss-Prot
2014_07 release) with the following parameters: trypsin digestion with up to 2 missed
cleavages; Cys carbamidomethylation (57.021464 Da) and TMT labeling (229.162932 Da)
at peptide N-terminus and Lys residues were set as fixed modifications; Met oxidation
(15.994915 Da) was allowed as a dynamic modification; and precursor and fragment mass
tolerance were set to 800 ppm and 0.02 Da, respectively. The false discovery rate (FDR) for
peptide identification was calculated based on the refined method with 15 ppm precursor
mass tolerance postfiltering [20,25]. A 1% FDR threshold was considered for peptide
identification, and peptides were assigned to the best protein proposed by Proteome Discoverer.
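
As an illustration of the target-decoy filtering idea, the sketch below applies a precursor mass postfilter and then finds the lowest score at which the running decoy/target ratio stays within the FDR limit. It is a simplified stand-in, not the refined method of [20,25]: the plain decoy/target estimate, the tuple layout, and the function name are assumptions made for this example.

```python
def score_threshold_at_fdr(psms, max_fdr=0.01, max_ppm=15.0):
    """Return the lowest target score that keeps the estimated FDR <= max_fdr.

    psms: iterable of (score, is_decoy, precursor_ppm_error) tuples
    (hypothetical layout). Higher scores are assumed to be better.
    Returns None if no score cut-off satisfies the FDR limit.
    """
    # Postfilter: discard matches whose precursor mass error is too large.
    kept = sorted(
        (p for p in psms if abs(p[2]) <= max_ppm),
        key=lambda p: p[0],
        reverse=True,
    )
    targets = decoys = 0
    threshold = None
    for score, is_decoy, _ppm in kept:
        if is_decoy:
            decoys += 1
            continue
        targets += 1
        # Plain concatenated-database estimate: decoys / targets.
        if decoys / targets <= max_fdr:
            threshold = score
    return threshold
```

In this sketch the wide search tolerance admits many candidates, and the postfilter removes those with large mass errors before the FDR is estimated.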

2.8. Protein Quantification and Statistical Analysis


Protein quantification and statistical and systems biology analysis were performed
based on the quantitative information extracted from the MS/MS spectra of the TMT-
labeled peptides using the iSanXoT package [37], based on the models and algorithms
previously developed in our laboratory [21,22,24,38]. For peptide quantification, the raw
quantitative data were used as inputs to the weighted spectrum, peptide, and protein
(WSPP) model to compute the log2 ratio of each individual with respect to the average
value of the two internal standard samples. Comparative analysis was carried out by
estimating the contribution of spectrum, peptide, and protein variance to the total technical
variance; thus, relative protein abundance was measured by Xq, the log2 ratio at the protein
level, and by Zq, the log2 ratio at the protein level expressed in units of standard deviation,
based on the corresponding variance estimation. The model also enabled the estimation of
statistical weight, Wq, for each protein, defined as the inverse of its variance.
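
The relationship between Xq, Wq, and Zq described above can be sketched as follows. This is a minimal illustration that assumes Zq is the deviation of Xq from the weighted grand mean, scaled by the square root of Wq; the full WSPP model [21] includes further terms that are not reproduced here, and the function names are hypothetical.

```python
import math


def standardize(xq, wq):
    """Convert protein log2 ratios into standardized values.

    xq: protein-level log2 ratios; wq: statistical weights (1 / variance).
    Assumes Zq = (Xq - weighted grand mean) * sqrt(Wq).
    """
    grand_mean = sum(w * x for x, w in zip(xq, wq)) / sum(wq)
    return [(x - grand_mean) * math.sqrt(w) for x, w in zip(xq, wq)]


def outside_interval(zq, z_limit=1.96):
    """Flag proteins whose |Zq| exceeds the limit (about 95% for a normal)."""
    return [abs(z) > z_limit for z in zq]
```

Under the null hypothesis of no abundance change, Zq values computed this way would be expected to follow a standard normal distribution.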

To calculate CV, we estimated absolute protein abundances by summing up the
reporter intensities of all scans belonging to a given protein and calculated an average
across all the internal standard samples included in the TMT batches. We then multiplied
the averaged abundance of each protein by $2^{X_q}$. The standard deviation was estimated as
$1/\sqrt{W_q}$.
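
A minimal sketch of this calculation is given below. The abundance step follows the text directly. The conversion of the log2-scale standard deviation into a percentage CV is not specified in the text, so the first-order (delta-method) propagation used here, CV ≈ ln(2) × SD, is an assumption made only for illustration, and the function names are hypothetical.

```python
import math


def protein_abundance(mean_reference_intensity, xq):
    """Abundance in a sample: averaged reference abundance times 2**Xq."""
    return mean_reference_intensity * 2.0 ** xq


def approximate_cv_percent(wq):
    """Approximate CV (%) from the statistical weight Wq.

    The SD of the log2 ratio is 1 / sqrt(Wq). The linear-scale relative
    SD is approximated as ln(2) * SD, which is an assumption of this
    sketch and holds only for small SD values.
    """
    sd_log2 = 1.0 / math.sqrt(wq)
    return 100.0 * math.log(2.0) * sd_log2
```

For example, a protein with Wq = 25 has a log2-scale standard deviation of 0.2, which corresponds to a CV of roughly 14% under this approximation.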

2.9. Functional Enrichment and Clustering Analyses


Functional enrichment and clustering analyses were performed with String v11.0 [39]
and Cytoscape v3.7.2 [40], respectively.

2.10. Biochemical Measurements


The plasma levels of immunoglobulin heavy constant alpha 2 (IGHA2) and apolipoprotein(a)
(LPA) were measured by immunoturbidimetric assays (LK088.OPT and LK098.OPT,
respectively, The Binding Site, London, UK) using the Binding Site Optilite analyzer (The
Binding Site) in a blinded manner. The plasma levels of C-reactive protein (CRP) and
apolipoprotein B-100 (ApoB) were measured as previously published [34].

3. Results
3.1. Influence of Plasma Depletion on Protein Quantification
In this work, we investigated the proteomics results obtained from the analysis of
1008 plasma samples from the PESA study cohort [34], namely 120 samples used as a
test cohort, 444 additional baseline samples, and 444 samples at 3-year follow-up visit
(Figure 1). Firstly, we analyzed whether depletion was a suitable procedure to increase
proteome coverage using the 120-sample test cohort. These plasma samples were depleted
of the 14 most abundant proteins prior to filter-aided digestion with trypsin. Then, the
resulting peptides were labeled with TMT reagents prior to fractionation using high-pH
reversed-phase chromatography. The peptides were analyzed by LC-MS/MS both before
(fast analysis) and after fractionation (deep analysis), after which peptide and protein
quantification were achieved based on the WSPP model previously developed in our
laboratory [21] (Figure 1). Of note, we obtained an average of 10,957 peptides
and 1674 proteins per sample, which suggested that depletion was a promising approach
to enhance the depth of analysis of this kind of sample. We then investigated whether
depletion produced alterations to protein abundance. We found that after depletion, around
30% of the peptide–spectrum matches (PSMs) belonged to depletable proteins, indicating
that these depletable proteins were still present in significant amounts. The clustering
analysis revealed that the remaining depletable proteins (highlighted in green in Figure 2A,
left panel) gathered at a single main cluster together with many other species (in blue),
indicating that all these proteins were affected by the depletion process. In clear contrast,
this cluster was not observed when the same 120 plasma samples were processed without
prior depletion (see the details below), where both the depletable proteins and the majority
of the other proteins found previously clustered did not reveal any evidence of clustering
(Figure 2A, right panel). Moreover, the protein abundance (excluding depletable proteins)
in each of the 120 depleted samples (P, Figure 2B, left panel) showed low correlation with
the corresponding non-depleted samples (V1). In clear contrast, when no depletion was
performed, a high intra-individual correlation was found between protein levels at baseline
(V1, Figure 2B, right panel) and those from the plasma samples obtained three years later
(V2). These results demonstrate that depletion of the most abundant proteins, although
successful in improving proteome coverage, introduced a strong variability in the plasma
proteome, probably due to the incompleteness of depletion and the existing interactions
between depletable and non-depletable proteins. For these reasons, the depletion protocol
was judged unsuitable for the analysis of large cohorts.

Figure 1. Proteomics workflow for multiplexed analysis of large plasma cohorts. (1) A total of
1378 plasma samples were analyzed, of which 140 samples were subjected to plasma depletion
before protein digestion. (2) The same on-filter tryptic digestion procedure was applied using
individual centrifugal devices (1028 samples) and the 96-well plate format (350 samples). (3) The
resulting peptides were isobarically labeled with TMT for a total of 156 TMT experiments that were
processed in 15-TMT batches. Each TMT experiment comprised eight different plasma samples
together with the internal standard sample prepared by pooling half of the samples included in each
of the 15-TMT batches. (4,5) The labeled peptides were analyzed by LC-MS/MS both before (fast
analysis) and after (deep analysis) high-pH reversed-phase fractionation. (6) Protein quantification
was accomplished using iSanXoT, an implementation of the WSPP statistical model previously
developed in our group. LC-MS/MS, liquid chromatography–tandem mass spectrometry; TMT,
tandem mass tags; WSPP, weighted spectrum, peptide, and protein. This figure was created in
BioRender (BioRender.com/v03v870).

Figure 2. Technical issues in plasma depletion. (A) Protein clusters in depleted and non-depleted
plasma samples showing that most depletable proteins (green) that could also be quantified in
depleted plasmas grouped along with many other proteins (blue) in a single cluster, with other
proteins quantified shown in orange (left panel). In the non-depleted plasma samples, depletable
proteins distribute among several clusters (right panel). (B) Correlation analysis, excluding depletable
proteins, between plasma samples from the same individuals with and without depletion (labeled
as P and V1, respectively) and between non-depleted plasma samples from the same individuals at
baseline (V1) and three years later (V2) (right panel). Squares indicate the best correlation found for
individuals from V1.

3.2. Performance of the Plasma Proteomics Workflow
The 444 baseline samples (V1) and the 444 samples from the same individuals obtained
at the 3-year follow-up visit (V2, see Figure 1) were processed without depletion. Using
this approach, we were able to quantify 4604 peptides on average, corresponding to
917 proteins (547 quantified with >1 peptide across at least 60% of the individuals), with
the deep analysis, which included fractionation; meanwhile, 3027 peptides, corresponding
to 537 proteins (349 detected with >1 peptide across at least 60% of the individuals),
were quantitated with the fast analysis.
The fast approach (without fractionation) required 50 min of LC-MS/MS time per
sample, allowing the analysis of 29 samples per day. These figures were comparable to
those reported with published conventional label-free approaches (21–88 min per sample,
depending on the method used, allowing the analysis of 50 and 15 samples per day,
respectively). The protein throughput, however, compared favorably (537 versus 241–512
proteins per sample) [41–44]. In comparison, the deep analysis required 225 min per sample,
allowing the analysis of 8 samples per day, but with a depth of more than 900 proteins,
a number rarely found in high-throughput plasma proteomics works. Of interest, one
representative work in the field reached 965 plasma proteins via label-free analysis using
extensive fractionation, but at the cost of considerably longer analysis time (960 min) [45].

3.3. Quantification Accuracy of the Plasma Proteomics Workflow


To investigate the quantitative accuracy of the protein quantifications produced using
the workflow, we resorted to the WSPP statistical model previously developed in our
laboratory, which provides a statistical framework for testing the quality of quantitative
experiments and detecting experimental deviations. The WSPP model can separately estimate
the variance originating from (i) protein extraction and manipulation (protein-level
variance), (ii) protein digestion and subsequent peptide labeling (peptide-level variance), and
(iii) extraction of quantitative information from LC-MS/MS data (scan-level variance) [24].
First, to ascertain quantitative reproducibility, we plotted protein quantification values
from the same internal standard sample labeled with two different TMT tags against their
corresponding statistical weight (i.e., the inverse of variance) (Figure 3A). Protein
quantification was generally within the 95% confidence interval, and practically no significant
protein changes were detected, supporting accurate quantification of the internal standard
sample. This analysis was also performed with biological samples (Figure 3B). In this case,
5% of the quantified proteins lay outside the confidence limits, which could be accounted
for by inter-individual variability of the plasma proteome [46–51].
Furthermore, technical variability was assessed by estimating variances at the scan,
peptide, and protein levels using the quantification of the same internal standard sample
labeled with two different TMT tags (Figure 3C, top panel). As expected for the analysis
of technical replicates from identical samples, the variances at the peptide and protein
levels were not significantly different from zero, and significant variance, generated by
the errors associated with the LC-MS/MS analysis, was only detected at the scan level.
To assess biological variability, we analyzed the variances generated from the analysis of
separate plasma samples. These were calculated by matching the quantification of one
individual against the average of the internal standard sample (which was labeled with two
different TMT tags) (Figure 3C, bottom panel). As expected, the variance at the scan level
was similar to that of the technical replicates, and non-zero variances were measured at
the peptide and protein levels due to biological variability (Figure 3C, bottom panel). These
variances were in the ranges previously reported for the analysis of biological samples of
very different origin [22]. Of note, the distribution of quantification errors was in all cases
in excellent agreement with the theoretical null hypothesis distributions (Figure 3C, red
sigmoids), indicating that the statistical model was an accurate approach to analyze the
results produced using this workflow.
Finally, prior biochemical measurements of a number of proteins, namely immunoglobulin
heavy constant alpha 2 (IGHA2), C-reactive protein (CRP), apolipoprotein(a) (LPA),
and apolipoprotein B-100 (APOB), in hundreds of samples [33,34] showed a very strong
correlation (p-value < $10^{-15}$) with the quantitative values provided by the workflow, further
supporting their high quantitation accuracy (Figure 4A).
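
A comparison of this kind can be sketched with a plain Pearson correlation between the standardized proteomics values and the log-transformed biochemical measurements. The text does not state which correlation statistic or software was used, so the Pearson coefficient, the function name, and the input layout below are assumptions made for illustration. No p-value is computed in this sketch.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences.

    x: e.g. Zq values per individual; y: e.g. log-transformed
    biochemical concentrations for the same individuals (hypothetical inputs).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

The function assumes that both sequences are paired by individual and that neither is constant, since a constant input would give a zero denominator.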

Figure 3. Protein quantification accuracy of the large-scale proteomics workflow. (A) Average protein
quantification values (grand-mean-corrected) from the same internal standard sample labeled with
two different TMT reagents plotted against their ranked statistical weight. Red lines delimit the 95%
confidence interval (two standard deviations). (B) Average protein quantification values
(grand-mean-corrected) from two different biological samples plotted against their ranked statistical weight.
Red lines delimit the 95% confidence interval (two standard deviations). (C) Technical and biological
variances at the scan, peptide, and protein levels. Technical variance (top panel) was estimated
using the quantification of the same internal standard labeled with two different TMT tags, whereas
biological variance (bottom panel) was estimated by matching the quantification of one individual
against the average of the internal standard sample (which was labeled with two different TMT tags).
Variance values are expressed as average (from 14 and 56 experiments for technical and biological
variance, respectively) ± SEM. SEM, standard error of the mean; TMT, tandem mass tags.

Figure 4. Correlation with biochemical measurements and quantitative reproducibility of the
large-scale proteomics workflow. (A) Biochemical versus proteomics quantification of
immunoglobulin heavy constant alpha 2 (IGHA2), C-reactive protein (CRP), apolipoprotein(a) (LPA),
and apolipoprotein B-100 (ApoB), showing a strong correlation (p-value < 10⁻¹⁵). The Pearson
correlation coefficient (r) is indicated (IGHA2, r = 0.82; CRP, r = 0.75; LPA, r = 0.73; ApoB,
r = 0.62). (B) Average LFQ intensity showing reproducible quantification across 14 technical
replicates (left panel) and 444 biological replicates (right panel) of five selected proteins with
plasma concentration spanning five orders of magnitude. Blue line: human albumin, HSA; orange:
ceruloplasmin, CP; red: fibulin-1, FBLN1; green: platelet factor 4, PF4; and gray: multimerin-1,
MMRN1. LFQ, label-free quantification; Zq, standardized log2 ratio at the protein level.

3.4. Technical and Biological Variability of the Plasma Proteome
To evaluate the technical variability of the plasma proteome, we selected five plasma
proteins, namely human serum albumin (HSA), ceruloplasmin (CP), fibulin-1 (FBLN1),
platelet factor 4 (PF4), and multimerin-1 (MMRN1), with plasma concentrations spanning
five orders of magnitude, and represented their estimated absolute abundances (calculated
as the average value of the same internal standard sample labeled with two different TMT
tags). The quantification was reproducible across the 14 technical replicates for the five
proteins selected (Figure 4B, left). The absolute abundance of these five proteins across
444 individuals distributed among 56 TMT experiments showed the expected biological
variability of these samples around stable average values (Figure 4B, right).

To investigate the analytical variability, we plotted CV for the 1000 most abundant
proteins quantified across 14 technical replicates against their ranked abundance. The results
indicate that 90% of the proteins (895) had CV < 20%, which is the most common cut-off in
diagnostic assays [52] (Figure 5A), whereas 31% of the proteins showed CV < 5 (Figure 5B).
Moreover, we were able to quantify 67 out of the 109 FDA-approved biomarkers [52], with
only one of them, one of the less abundant species, showing a CV > 20 (Figure 5C).
Inter-individual (biological) variability was estimated by calculating the CV for the 1000 most
abundant proteins quantified across the 444 individuals that were analyzed via LC-MS/MS
over four months. In this case, 20% of the proteins (195) showed CV ranging from 10 to 20,
and 25% between 20 and 30 (Figure 5D,E). Regarding the FDA-approved biomarkers, half
of them showed CV < 20 (Figure 5F).
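The CV analysis described above can be sketched as follows. This is a minimal illustration that assumes linear-scale abundances and the sample standard deviation; the helper names `cv_percent` and `bin_cvs` and the replicate values are hypothetical and are not taken from the study.

```python
import statistics

def cv_percent(values):
    """Coefficient of variation in percent: sample SD divided by the mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("CV is undefined for a zero mean")
    return statistics.stdev(values) / mean * 100.0

def bin_cvs(cvs, edges=(5, 10, 20)):
    """Fraction of proteins per CV bin; default bins: <5, 5-10, 10-20, >=20."""
    counts = [0] * (len(edges) + 1)
    for cv in cvs:
        # The bin index is the number of edges that the CV reaches or exceeds.
        counts[sum(cv >= e for e in edges)] += 1
    return [c / len(cvs) for c in counts]

# Hypothetical abundances of three proteins across four technical replicates.
replicates = {
    "protein_a": [100.0, 102.0, 98.0, 100.0],
    "protein_b": [50.0, 56.0, 44.0, 50.0],
    "protein_c": [10.0, 16.0, 4.0, 10.0],
}
cvs = {name: cv_percent(v) for name, v in replicates.items()}
fractions = bin_cvs(list(cvs.values()))
```

The same two functions would apply to biological variability by passing per-individual abundances and a different set of bin edges, for example `edges=(20, 30, 40, 60)`.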

Figure 5. Analytical and biological variability of the large-scale proteomics workflow. (A) Analytical
CV for the 1000 most abundant proteins quantitated across 14 technical replicates as a function of
their ranked abundance. A total of 895 proteins showed CV < 20 (blue dots), and 105 proteins
showed CV > 20 (gray dots). (B) Analytical CV values distributed as follows: 31% of the proteins
had CV < 5; 31% ranged from 5 to 10; 28% ranged from 10 to 20; and 10% showed CV > 20.
(C) Analytical variability of the FDA-approved biomarkers. The analytical CV of the 67 FDA
biomarkers that could be quantitated was plotted as a function of their ranked abundance. Only one
of the biomarkers showed CV > 20 (orange dashed line). (D) Biological CV for the 1000 most
abundant proteins quantitated across 444 individuals as a function of their ranked abundance. A
total of 195 proteins showed CV < 20 (blue dots), and 805 proteins showed CV > 20 (gray dots).
(E) Biological CV values distributed as follows: 18% of the proteins had CV from 10 to 20; 25%
ranged from 20 to 30; 20% ranged from 30 to 40; 20% ranged from 40 to 60; and 16% showed
CV > 60. (F) Biological variability of the FDA biomarkers. The biological CV of the 67 FDA
biomarkers that could be quantitated was plotted as a function of their ranked abundance. A total
of 33 proteins showed CV > 20 (orange dashed line). CV, coefficient of variation; FDA, US Food
and Drug Administration.
Next, we set out to examine in more detail the nature of the proteins showing very
high biological variability not due to technical variability, i.e., those quantitated across at
least 80% of the individuals with technical CV (CVt) < 12 and biological CV (CVb) > 30
(Figure 6A). The functional enrichment analysis of these proteins revealed acute-phase
response, lipoprotein particle organization, oxygen transport, and platelet degranulation as
enriched (at 1% FDR) functional categories (Figure 6B).
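The selection criteria for highly biologically variable proteins can be written as a simple filter. In this sketch, only the three thresholds (80% coverage, CVt < 12, CVb > 30) come from the text; the function name, the record layout, and the example values are hypothetical.

```python
def high_biological_variability(proteins, n_individuals,
                                min_coverage=0.80, max_cvt=12.0, min_cvb=30.0):
    """Return names of proteins passing the coverage, CVt, and CVb filters."""
    selected = []
    for p in proteins:
        coverage = p["n_quantified"] / n_individuals
        if coverage >= min_coverage and p["cvt"] < max_cvt and p["cvb"] > min_cvb:
            selected.append(p["name"])
    return selected

# Hypothetical records; CV values are percentages.
proteins = [
    {"name": "P1", "n_quantified": 440, "cvt": 6.0, "cvb": 55.0},   # passes all filters
    {"name": "P2", "n_quantified": 300, "cvt": 5.0, "cvb": 60.0},   # coverage below 80%
    {"name": "P3", "n_quantified": 444, "cvt": 18.0, "cvb": 70.0},  # technical CV too high
    {"name": "P4", "n_quantified": 430, "cvt": 4.0, "cvb": 15.0},   # biological CV too low
]
hits = high_biological_variability(proteins, n_individuals=444)
```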

Figure 6. Highly biologically variable proteins. (A) Analytical variability (CVt) was plotted versus
biological variability (CVb) for proteins quantitated across at least 80% of the individuals at baseline.
Proteins with CVt < 12 and CVb > 30 were considered to have high biological variability (orange
dots). (B) Correlation network obtained with the highly biologically variable proteins. Edge line
thickness is proportional to the strength of the association. The following functional categories were
found enriched (FDR < 1%): acute-phase response (blue), lipoprotein particle organization (orange),
oxygen transport (pink), and platelet degranulation (red). Additional highly biologically variable
proteins are highlighted in gray. CVb, biological coefficient of variation; CVt, technical coefficient of
variation; FDR, false discovery rate.
3.5. Long-Term Temporal Stability of the Plasma Proteome
Protein quantitative values from the same individual showed a significant correlation
(FDR < 5%) between baseline (V1) and three years later (V2) in 84% of the individuals; this
correlation was not found when comparing the proteome of different individuals or upon
random comparison (Figure 7A). Likewise, for 94% of the proteins, a significant correlation
(FDR < 5%) was obtained between the quantification values of the protein across all
individuals at baseline and the values of the same protein across the same individuals three
years later; once again, this correlation was not found when comparing different proteins
or upon random comparison (Figure 7B). We also found that the proteins from 25% of the
individuals in V1 did not show a correlation higher than 0.42 with any of the individuals in
V2, indicating that the plasma underwent significant technical or biological alterations. In
clear contrast, in the remaining 75% of individuals, the correlation matrix built with the
plasma proteins quantitated at baseline and three years later showed that, for most (90%)
of the individuals, the correlation is higher between proteins from the same individual as
compared with correlation across different subjects (diagonal dots in Figure 7C). This is
an extremely improbable result (p-value < 10⁻¹⁵), since only 1–2 individuals are expected
to match the same individual by chance alone in a population of ca. 300 samples. These
results indicate that the plasma proteome of each individual is highly stable over time and
can be recognized as long as three years later in the majority of cases.
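The matching of each individual to their own later sample can be sketched as a best-match search over the V1 versus V2 correlation matrix. This is an illustrative reconstruction, not the authors' procedure; the function names and the three toy profiles are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def self_match_rate(v1, v2):
    """Fraction of individuals whose V1 profile correlates best with their own V2 profile."""
    ids = list(v1)
    hits = 0
    for i in ids:
        # Pick the V2 individual with the highest correlation to this V1 profile.
        best = max(ids, key=lambda j: pearson_r(v1[i], v2[j]))
        hits += best == i
    return hits / len(ids)

# Hypothetical protein profiles (same protein order) for three individuals.
v1 = {"s1": [1.0, 2.0, 3.0, 4.0], "s2": [4.0, 1.0, 3.0, 2.0], "s3": [2.0, 4.0, 1.0, 3.0]}
v2 = {"s1": [1.1, 2.2, 2.9, 4.1], "s2": [3.8, 1.2, 3.1, 1.9], "s3": [2.1, 3.9, 0.8, 3.2]}
rate = self_match_rate(v1, v2)
```

If best matches were assigned at random, each of n individuals would match themselves with probability of roughly 1/n, so about one self-match would be expected in total regardless of cohort size. This is in line with the chance expectation quoted above.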

Figure 7. Long-term stability of the plasma proteome. (A) Probability distribution of the Pearson
correlation coefficient between plasma protein quantitative values at baseline (V1) and three years
later (V2), showing statistically significant temporal intra-individual correlation (5% FDR) between
protein quantitative values for 84% of the individuals (red line). No correlation was found between
different (blue line) or random (orange line) individuals. (B) Probability distribution of the Pearson
correlation coefficient between the plasma protein quantitative values at baseline (V1) and three
years later (V2), showing statistically significant temporal intra-protein correlation (5% FDR) for
94% of the proteins quantitated across individuals (red line). No correlation was found between
different (blue line) or random (orange line) proteins. (C) The Pearson correlation matrix of plasma
proteins quantitated across 440 individuals (224 proteins) in V1 and V2. For most of the subjects
(90%), intra-individual correlation was higher than that between proteins from different individuals.
FDR, false discovery rate.

3.6. Classification of Plasma Proteins According to Temporal Stability and Biological Variability
To further study the plasma proteome in terms of temporal stability and biological
variability, we plotted CVb against the temporal stability of those proteins quantified with
a CVt < 12. Following the results in Figure 7C, temporal stability was measured by the
Pearson correlation between the plasma proteome quantified at baseline and three years
later for each individual. This allowed us to classify the proteins into three groups: stable
proteins, characterized by high temporal stability and low biological variability; signature
proteins, with both high temporal stability and high biological variability; and unstable proteins,
distinguished by low temporal stability regardless of biological variability (Figure 8 and
Table S1). Functional enrichment analysis revealed that each of these protein groups was
implicated in distinct biological processes. Thus, stable proteins participated in processes
such as complement activation, blood coagulation, lipid transport (mainly apolipoproteins
involved in high-density lipoprotein (HDL) assembly, cholesterol transport, and
very-low-density lipoprotein (VLDL)-receptor binding), metal ion transport, and endopeptidase
activity. Signature proteins were implicated primarily in the innate immune and
inflammatory response and lipid transport (represented principally by HDL apolipoprotein
components). Finally, unstable proteins were composed mostly of proteins involved in
complement activation, principally through the classical pathway; serine proteases with
endopeptidase activity; hemoglobins; blood coagulation proteins; and apolipoproteins
(mainly VLDL apolipoprotein components) (Figure 8).

Figure 8. Temporal stability and biological variability of plasma proteins. The Pearson correlation
coefficient between protein quantitative values at baseline and three years later (temporal stability)
versus CVb. Only proteins with CVt < 12 were considered. Proteins were classified into three groups:
stable proteins (Pearson coefficient > 0.4 and CVb < 20), signature proteins (Pearson coefficient > 0.4
and CVb > 30), and unstable proteins (Pearson coefficient < 0.4). The insets show correlation across
protein components of the functional categories found enriched (FDR < 1%) in each protein group,
with edge line thickness proportional to correlation strength. CVb, biological coefficient of variation;
CVt, technical coefficient of variation; FDR, false discovery rate.
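The three-group classification can be expressed as a short decision rule using the thresholds stated in the Figure 8 caption. This is a minimal sketch; the function name is hypothetical, and the "unclassified" label for proteins falling on a boundary or in the CVb range between 20 and 30 is an assumption, since the caption does not assign those cases to a group.

```python
def classify_protein(temporal_r, cvb, r_cut=0.4, cvb_low=20.0, cvb_high=30.0):
    """Assign a protein to a group from its temporal stability (Pearson r) and CVb (%)."""
    if temporal_r < r_cut:
        return "unstable"       # low temporal stability, any CVb
    if temporal_r > r_cut and cvb < cvb_low:
        return "stable"         # high stability, low biological variability
    if temporal_r > r_cut and cvb > cvb_high:
        return "signature"      # high stability, high biological variability
    return "unclassified"       # boundary values or CVb between the two cut-offs

# Hypothetical (temporal stability, CVb) pairs for four proteins.
examples = {
    "prot_1": (0.70, 10.0),
    "prot_2": (0.80, 50.0),
    "prot_3": (0.10, 45.0),
    "prot_4": (0.60, 25.0),
}
groups = {name: classify_protein(r, cvb) for name, (r, cvb) in examples.items()}
```

The rule presupposes that proteins were already filtered for CVt < 12, as stated in the caption.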
3.7. Performance of the On-Plate Sample Preparation Method


Our original MS-based proteomics workflow relies on individual centrifugal devices
for on-filter single-sample digestion. As an alternative, we also developed a 96-well plate
format for reducing sample preparation time and improving the reproducibility of such
procedures as protein digestion. This on-plate protocol was used to process 350 plasma
samples from the AWHS cohort [35,36].
The examination of protein quantification accuracy based on the internal standard
sample included in every TMT experiment and on the biological samples, as described above
(Section 3.3), revealed that quantification was accurate in both cases (Figure S1A,B).
Technical and biological variability were also assessed by estimating the variance at the scan,
peptide, and protein levels. The technical and biological variances were similar to those
found with the original protocol (Figure S1C) [22]. On-plate performance of the workflow
was also evaluated before and after peptide-level fractionation. On average, 1845/2937
peptides corresponding to 390/774 proteins were quantitated before/after fractionation.
The analytical variability was evaluated by plotting the CV of the 1000 most abun-
dant proteins quantified across 14 technical replicates against their ranked abundance
(Figure S2A). We observed that ca. 99% of the proteins (988) had CV < 20, with only
12 proteins showing CV > 20 (Figure S2B). Interestingly, none of the 49 FDA-approved
biomarkers quantified with our workflow showed CV > 20 (Figure S2C). Inter-individual
(biological) variability was analyzed by plotting the CV of the 1000 most abundant proteins
across 350 individuals that were analyzed via LC-MS/MS over two months against their
ranked abundance (Figure S2D). Forty percent of proteins (403) showed CV < 20, and
twenty-eight percent of proteins showed CV between 10 and 20 (Figure S2E). Regarding
the FDA-approved biomarkers, only 15 out of 49 had CV > 20 (Figure S2F). These results
revealed that the plate protocol had a similar performance to the original one and was
therefore suitable for large-scale clinical studies.

4. Discussion
MS-based proteomics is increasingly entering regulated clinical and diagnostic set-
tings [53–55], with the potential to yield predictive biomarker signatures that support
clinical decisions, as well as to enable the prediction of patient evolution, provided that
datasets of appropriate depth and size are available [56]. However, while routine clinical
applications demand accuracy, reproducibility, and robustness, with low cost and high
throughput to facilitate comparison within and between laboratories [57], MS-based analy-
sis of human plasma remains challenging mainly because of both the large dynamic range
in abundance of blood proteins and the presence of a few large, extremely abundant species
that hamper the detection of lower abundance species [58]. Depletion of highly abundant
species from plasma and peptide-level fractionation have been used to circumvent these
difficulties; nevertheless, depletion can introduce analytical bias due to the removal of lower
abundance proteins either carried over by the depleted proteins or unspecifically bound to
the antibodies used [14,15], and extensive fractionation is highly time-consuming [16]. This
unfortunate trade-off between depth and throughput has led to a situation where only a
few works on plasma proteomics have analyzed >500 samples with significant proteome
coverage [17–19]. For these reasons, we developed an MS-based proteomics workflow
aimed at the analysis of human plasma samples that benefits from accurate quantification
and higher analysis throughput facilitated by multiplexed isobaric labeling, increased
depth of detection attained via peptide-level fractionation, and robust statistical treatment
by means of a validated, fully automated quantitation procedure [20–25]. Our workflow
is characterized by (i) applicability to large sample cohorts, as illustrated by the analy-
sis of >1300 human plasma samples, which can also be accomplished on 96-well plates;
(ii) high-throughput capacity and satisfactory depth of detection, allowing the quantifi-
cation of either 537 proteins at 29 samples per day (50 min per sample) without prior
peptide-level fractionation or 917 proteins at 8 samples per day (180 min per sample) after
fractionation; and (iii) high quantitative accuracy and reproducibility, with CV < 20% for
90% of the proteins quantitated (895 proteins).
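The throughput figures quoted above follow from the per-sample analysis times if one assumes uninterrupted 24 h acquisition, which is an assumption made here for illustration and is not stated in the text. The function name is hypothetical.

```python
MINUTES_PER_DAY = 24 * 60

def samples_per_day(minutes_per_sample):
    """Samples analyzed per day under continuous acquisition (illustrative assumption)."""
    return MINUTES_PER_DAY / minutes_per_sample

unfractionated = samples_per_day(50)    # 28.8, consistent with the reported 29 per day
fractionated = samples_per_day(180)     # 8.0, consistent with the reported 8 per day
```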
Reproducibility issues in the depletion of high-abundance proteins from plasma have long
been known [59,60]. In this work, we demonstrated that this procedure induces changes in
the plasma proteome that lead to biased peptide and protein quantification, as evidenced by
the decreased intra-individual plasma proteome correlation upon depletion as compared
to the long-term correlation. Accordingly, this time-consuming, multistep procedure
was not included in our workflow. The workflow was evaluated for quantitative reproducibility
using the WSPP model to separately estimate scan-, peptide-, and protein-level variances,
which showed that both technical and biological variances were within the expected values
at the three levels. Quantitative accuracy was further supported by the very strong
correlation found between the quantitative values provided by the workflow and biochemical
measurements.
To investigate the sources of variability associated with our workflow, we first demonstrated
both technical and biological reproducibility using five proteins with plasma concentration
spanning five orders of magnitude (HSA, CP, FBLN1, PF4, and MMRN1). Then,
analytical reproducibility (i.e., that related exclusively to the LC-MS/MS analysis) was
shown with the 1000 most abundant species quantitated, where 90% of the proteins (895 proteins,
including all but one of the 67 quantitated FDA-approved biomarkers) were found to have analytical
CV < 20, the most common cut-off in diagnostic assays [52]. Finally, this 1000-protein set
was also used to further assess biological reproducibility across 444 individuals, where 20%
of the proteins (195 proteins, including 33 FDA-approved biomarkers) exhibited biological
CV < 20. The results support the capacity of our newly developed workflow to provide accurate
protein quantification, with quantitative changes accounting for the inter-individual
variability of plasma proteins instead of technical or analytical variability. The functional
categories found enriched with the subset of highly biologically variable proteins, i.e.,
those quantitated across at least 80% of the subjects with CVt < 12 and CVb > 30, are in
agreement with a previous report, where high-abundance erythrocyte-specific proteins
(such as hemoglobin), acute-phase proteins (such as serum amyloid A1 protein, SAA1; and
C-reactive protein, CRP), lipoproteins (such as apolipoprotein(a), LPA), and proteins from
the blood coagulation system (such as fibrinogen, FG; platelet basic protein, PPBP; and
platelet factor 4 variant, PF4V1) were described as the species with the highest biological
variability [45]. Likewise, the selection of a longitudinal prospective cohort comprising
baseline and 3-year follow-up visit samples enabled us to examine the long-term temporal
stability of the plasma proteome. The majority of individuals (84%) and proteins (94%)
showed significant correlation between these two time points, suggesting that the plasma
proteome is highly stable over time, with much higher inter- than intra-individual variability.
Similar observations have been reported previously but were based on a few selected
proteins, smaller longitudinal cohorts, or shorter periods of time [46–51].
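As an illustration of the reproducibility figures above, the following minimal Python sketch computes a percent CV per protein, the fraction of proteins under the 20% cut-off, and a flag for highly biologically variable proteins using the criteria stated in the text (quantitated in at least 80% of subjects, CVt < 12, CVb > 30). It assumes linear-scale abundances and CV = 100 × SD/mean; the function names and toy numbers are hypothetical and are not taken from the authors' pipeline.

```python
import numpy as np

def cv_percent(values, axis=0):
    """Coefficient of variation in percent, ignoring missing values (NaN).

    Assumes linear-scale abundances (an assumption, not the authors' definition).
    """
    values = np.asarray(values, dtype=float)
    mean = np.nanmean(values, axis=axis)
    sd = np.nanstd(values, axis=axis, ddof=1)  # sample standard deviation
    return 100.0 * sd / mean

def fraction_below(cv, cutoff=20.0):
    """Fraction of proteins whose CV is below the cut-off (20% in the text)."""
    cv = np.asarray(cv, dtype=float)
    return float(np.mean(cv < cutoff))

def highly_variable(cv_t, cv_b, coverage,
                    max_cv_t=12.0, min_cv_b=30.0, min_coverage=0.8):
    """Boolean mask: low technical CV, high biological CV, sufficient coverage."""
    cv_t, cv_b, coverage = (np.asarray(a, dtype=float)
                            for a in (cv_t, cv_b, coverage))
    return (coverage >= min_coverage) & (cv_t < max_cv_t) & (cv_b > min_cv_b)

# Toy data (hypothetical): rows are repeated measurements, columns are proteins.
replicates = np.array([[100.0, 50.0],
                       [110.0, 80.0],
                       [90.0, 20.0]])
cv = cv_percent(replicates)      # one CV per protein
share = fraction_below(cv)       # share of proteins with CV < 20%
print(cv, share)
```

The same `cv_percent` helper could be applied to technical replicates (CVt) or across individuals (CVb), depending on which axis of the data matrix is summarized.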
According to their behavior over time and their biological variability, plasma proteins
were divided into three different groups: stable proteins (high temporal stability with low
biological variability), signature proteins (high temporal stability and biological variability),
and unstable proteins (low temporal stability with either low or high biological variability).
Stable proteins comprise the second largest group, with species involved in such processes
as complement activation, endopeptidase activity, blood coagulation, and lipid transport
(mainly apolipoproteins involved in HDL assembly and VLDL-receptor binding). The
largest subgroup of stable proteins is composed of complement factors, which are involved
in the pathogenesis of many inflammatory diseases; thus, C3 and C4 circulating levels
are used to monitor patients with systemic lupus erythematosus [61]. The low biological
variability of these complement components is accounted for by the fact that hereditary
complement deficiencies are rare and generally associated with recurrent bacterial infec-
tions [62]; however, a reduced group of complement factors were classified as unstable
proteins due to their high biological variability and low temporal stability. Moreover, some
studies reported low biological variability for fibrinogen, clotting factors, and antithrombin,
while fibrinolytic molecules, such as plasminogen activator inhibitor 1 and fibrinopeptide
A, were found highly variable [63].
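The three-group classification described above can be sketched in Python as follows. This is only an illustration under stated assumptions: temporal stability is approximated here by the Pearson correlation between baseline and follow-up values across individuals, and the cut-offs `min_r` and `cv_b_cutoff` are hypothetical parameters chosen for the example, not the thresholds used in the study.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation between two visits, skipping pairs with missing values."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    ok = ~(np.isnan(x) | np.isnan(y))
    return float(np.corrcoef(x[ok], y[ok])[0, 1])

def classify_protein(baseline, follow_up, cv_b, min_r=0.5, cv_b_cutoff=30.0):
    """Assign a protein to the stable, signature, or unstable group.

    min_r and cv_b_cutoff are hypothetical cut-offs for illustration only.
    """
    r = pearson_r(baseline, follow_up)
    if r < min_r:
        # Low temporal stability, regardless of biological variability.
        return "unstable"
    # High temporal stability: split by biological variability.
    return "signature" if cv_b > cv_b_cutoff else "stable"

# Toy values (hypothetical) for four individuals at two visits.
baseline = [1.0, 2.0, 3.0, 4.0]
tracking = [1.1, 2.1, 2.9, 4.2]    # follow-up closely tracks baseline
shuffled = [3.0, 1.0, 4.0, 2.0]    # follow-up unrelated to baseline

print(classify_protein(baseline, tracking, cv_b=45.0))  # signature
print(classify_protein(baseline, tracking, cv_b=10.0))  # stable
print(classify_protein(baseline, shuffled, cv_b=45.0))  # unstable
```

A significance test on the correlation, rather than a fixed cut-off on r, would be an equally reasonable stability criterion; the fixed cut-off is used here only to keep the sketch short.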
Signature proteins, which also show low temporal variability, but, unlike stable pro-
teins, are highly biologically variable, are implicated in immune and inflammatory re-
sponses and in lipid transport (represented mainly by HDL apolipoprotein components).
LPA plasma levels, which have long been associated with increased risk of CVD, have
previously been shown to vary more than 1000-fold across individuals of the same population
[64] as a consequence of the high genetic variability of LPA and the involvement of other
genes related to its synthesis and metabolism. In contrast, LPA plasma concentrations are
not altered by environmental factors and are thought to be relatively constant throughout
a person’s lifetime [65]. Moreover, plasma lipid levels and risk of CVD are highly her-
itable, with estimates ranging from 30% to 60% [66–68], and genetic polymorphisms in
apolipoprotein-encoding genes constitute important modulators of serum lipid profiles
and CVD susceptibility. Human studies have identified polymorphisms in the APOA4
gene that associate with its plasma levels, inter-individual variability in cholesterol levels,
and risk of coronary heart disease [69]. Moreover, circulating HP levels have previously
been demonstrated to be markedly reproducible in healthy individuals over a 4-month
period [70]. It has also been previously described that the intra-individual variations (with
CVs ranging from 5% to 52%) of some serum immunoglobulins, such as IgG, IgA, and IgM,
were smaller than inter-individual variations (with CVs ranging from 15% to 108%) [71].
These findings indicate that both stable and signature proteins may serve as reliable
biomarkers of different nature. Thus, alterations to the former reflect an individual’s de-
viation from baseline population health, whereas signature proteins might constitute a
personalized fingerprint of disease (e.g., by reflecting immune-inflammatory pathways as-
sociated with metabolic health). In contrast, unstable proteins (the largest group, composed
mainly of complement and blood coagulation components, serpins with endopeptidase
activity, hemoglobins, and VLDL apolipoprotein components) have low diagnostic or prog-
nostic capacity due to their high temporal and biological variability. Thus, high-abundance
erythrocyte-specific proteins, and specifically hemoglobin subunits alpha (HBA2), beta
(HBB), and delta (HBD), often show high variability due to a different extent of erythrocyte
hemolysis or platelet contamination during plasma preparation and harvesting [57].
How personalized medicine should be implemented in the analysis of otherwise healthy
individuals to estimate disease risk and guide medical interpretation is not yet clear.
High-prevalence diseases like CVD involve many distinct genes and biological pathways,
together with environmental contributors that have not been fully established yet [72], and
detailed molecular analysis of blood samples will help predict, diagnose, and treat dis-
eases [73]. The workflow presented here generated a rich proteomics dataset on two visits
(baseline and 3-year follow-up) of 444 individuals that provided insight into potential
personal CVD markers and patterns. Inter-individual differences, ultimately reflected in
the signature plasma protein group, are likely due to genetic variability; however, envi-
ronmental contributors cannot be ruled out, suggesting that these potential markers are
actionable through lifestyle changes.
Finally, the proteomics workflow put forward in this study should be easily adaptable
to other body fluids, such as cerebrospinal fluid, urine, or saliva, all of which are tantalizing
sources of disease biomarkers.

Supplementary Materials: The following supporting information can be downloaded at:
https://www.mdpi.com/article/10.3390/biomedicines12092118/s1, Figure S1: On-plate protein
quantification accuracy of the large-scale proteomics workflow; Figure S2: On-plate analytical
and biological variability of the large-scale proteomics workflow; Table S1: Classification of
plasma proteins according to their temporal stability and biological variability.
Author Contributions: J.L.M.-V. and J.V. conceived and designed the study; E.L.-P. developed the
clinical cohorts, including recruitment of patients, collection of clinical and demographic data, and
storing of samples; E.N., M.G.-S., E.B.-K., F.G.-M., M.T.-H. and E.C. (Enrique Calvo) conducted the
plasma proteomics analyses; E.N., J.M.R., R.M. and J.V. performed the quantitative analysis of plasma
proteomics and verified the data; J.L.M.-V. conducted the antibody-based analyses and verified the
data; E.N., E.C. (Emilio Camafeita), J.L.M.-V. and J.V. drafted the manuscript. All authors provided
important intellectual revisions to the manuscript and read and approved the final manuscript. All
authors have read and agreed to the published version of the manuscript.
Funding: This study was supported by competitive grants PID2021-122348NB-I00 funded by MI-
CIU/AEI/10.13039/501100011033 and by “ERDF A way of making Europe”, PLEC2022-009298,
PLEC2022-009235, and EQC2021-007053-P funded by MICIU/AEI/10.13039/501100011033 and by
“European Union NextGenerationEU/PRTR”, S2022/BMD-7333-CM (INMUNOVAR-CM) funded by
Comunidad de Madrid, and LCF/PR/HR22/52420019 funded by “la Caixa” Foundation. The PESA
study is co-funded equally by the Centro Nacional de Investigaciones Cardiovasculares (CNIC),
Madrid, Spain, and Banco Santander, Madrid, Spain. The CNIC is supported by the Instituto de
Salud Carlos III (ISCIII), the Ministerio de Ciencia, Innovación y Universidades (MICIU), and the Pro
CNIC Foundation, and is a Severo Ochoa Center of Excellence (grant CEX2020-001041-S funded by
MICIU/AEI/10.13039/501100011033).
Institutional Review Board Statement: The study was conducted in accordance with the Declaration
of Helsinki, and approved by the Ethics Committee of Instituto de Salud Carlos III (protocol code
PI-19-2019, 1 June 2010).
Informed Consent Statement: Written informed consent was obtained from the subjects involved in
this study.
Data Availability Statement: MS raw data have been deposited in PeptideAtlas
(http://www.peptideatlas.org/PASS/PASS01382 and http://www.peptideatlas.org/PASS/PASS01522,
accessed on 17 August 2024).
Conflicts of Interest: The authors declare no conflicts of interest. The funding sources played no
role in the study design; in the collection, analysis, or interpretation of the data; in the writing of the
report; or in the submission of this paper for publication.

References
1. Arican, O.; Aral, M.; Sasmaz, S.; Ciragil, P. Serum Levels of Tnf-Alpha, Ifn-Gamma, Il-6, Il-8, Il-12, Il-17, and Il-18 in Patients with
Active Psoriasis and Correlation with Disease Severity. Mediat. Inflamm. 2005, 2005, 273–279. [CrossRef] [PubMed]
2. Geyer, P.E.; Holdt, L.M.; Teupser, D.; Mann, M. Revisiting Biomarker Discovery by Plasma Proteomics. Mol. Syst. Biol. 2017, 13,
942. [CrossRef] [PubMed]
3. Smith, L.M.; Kelleher, N.L. Proteoforms as the Next Proteomics Currency. Science 2018, 359, 1106–1107. [CrossRef]
4. Hellinger, R.; Sigurdsson, A.; Wu, W.; Romanova, E.V.; Li, L.; Sweedler, J.V.; Sussmuth, R.D.; Gruber, C.W. Peptidomics. Nat. Rev.
Methods Primers 2023, 3, 25. [CrossRef]
5. Prensner, J.R.; Abelin, J.G.; Kok, L.W.; Clauser, K.R.; Mudge, J.M.; Ruiz-Orera, J.; Bassani-Sternberg, M.; Moritz, R.L.; Deutsch,
E.W.; van Heesch, S. What Can Ribo-Seq, Immunopeptidomics, and Proteomics Tell Us About the Noncanonical Proteome? Mol.
Cell Proteom. 2023, 22, 100631. [CrossRef]
6. Anderson, N.L.; Ptolemy, A.S.; Rifai, N. The Riddle of Protein Diagnostics: Future Bleak or Bright? Clin. Chem. 2013, 59, 194–197.
[CrossRef] [PubMed]
7. Nagaraj, N.; Wisniewski, J.R.; Geiger, T.; Cox, J.; Kircher, M.; Kelso, J.; Paabo, S.; Mann, M. Deep Proteome and Transcriptome
Mapping of a Human Cancer Cell Line. Mol. Syst. Biol. 2011, 7, 548. [CrossRef]
8. Beck, M.; Schmidt, A.; Malmstroem, J.; Claassen, M.; Ori, A.; Szymborska, A.; Herzog, F.; Rinner, O.; Ellenberg, J.; Aebersold, R.
The Quantitative Proteome of a Human Cell Line. Mol. Syst. Biol. 2011, 7, 549. [CrossRef]
9. Pieper, R.; Su, Q.; Gatlin, C.L.; Huang, S.T.; Anderson, N.L.; Steiner, S. Multi-Component Immunoaffinity Subtraction Chro-
matography: An Innovative Step Towards a Comprehensive Survey of the Human Plasma Proteome. Proteomics 2003, 3, 422–432.
[CrossRef] [PubMed]
10. Qian, W.J.; Kaleta, D.T.; Petritis, B.O.; Jiang, H.; Liu, T.; Zhang, X.; Mottaz, H.M.; Varnum, S.M.; Camp, D.G., 2nd; Huang, L.; et al.
Enhanced Detection of Low Abundance Human Plasma Proteins Using a Tandem Igy12-Supermix Immunoaffinity Separation
Strategy. Mol. Cell Proteom. 2008, 7, 1963–1973. [CrossRef]
11. Luque-Garcia, J.L.; Neubert, T.A. Sample Preparation for Serum/Plasma Profiling and Biomarker Identification by Mass
Spectrometry. J. Chromatogr. A 2007, 1153, 259–276. [CrossRef] [PubMed]
12. Mortezai, N.; Harder, S.; Schnabel, C.; Moors, E.; Gauly, M.; Schluter, H.; Wagener, C.; Buck, F. Tandem Affinity Depletion: A
Combination of Affinity Fractionation and Immunoaffinity Depletion Allows the Detection of Low-Abundance Components in
the Complex Proteomes of Body Fluids. J. Proteome Res. 2010, 9, 6126–6134. [CrossRef] [PubMed]
13. Cao, Z.; Tang, H.Y.; Wang, H.; Liu, Q.; Speicher, D.W. Systematic Comparison of Fractionation Methods for in-Depth Analysis of
Plasma Proteomes. J. Proteome Res. 2012, 11, 3090–3100. [CrossRef]
14. Bellei, E.; Bergamini, S.; Monari, E.; Fantoni, L.I.; Cuoghi, A.; Ozben, T.; Tomasi, A. High-Abundance Proteins Depletion for
Serum Proteomic Analysis: Concomitant Removal of Non-Targeted Proteins. Amino Acids 2011, 40, 145–156. [CrossRef]
15. Gundry, R.L.; White, M.Y.; Nogee, J.; Tchernyshyov, I.; Van Eyk, J.E. Assessment of Albumin Removal from an Immunoaffinity
Spin Column: Critical Implications for Proteomic Examination of the Albuminome and Albumin-Depleted Samples. Proteomics
2009, 9, 2021–2028. [CrossRef]
16. Addona, T.A.; Shi, X.; Keshishian, H.; Mani, D.R.; Burgess, M.; Gillette, M.A.; Clauser, K.R.; Shen, D.; Lewis, G.D.; Farrell, L.A.;
et al. A Pipeline That Integrates the Discovery and Verification of Plasma Protein Biomarkers Reveals Candidate Markers for
Cardiovascular Disease. Nat. Biotechnol. 2011, 29, 635–643. [CrossRef] [PubMed]
17. Bruderer, R.; Muntel, J.; Muller, S.; Bernhardt, O.M.; Gandhi, T.; Cominetti, O.; Macron, C.; Carayol, J.; Rinner, O.; Astrup, A.;
et al. Analysis of 1508 Plasma Samples by Capillary-Flow Data-Independent Acquisition Profiles Proteomics of Weight Loss and
Maintenance. Mol. Cell Proteom. 2019, 18, 1242–1254. [CrossRef]
18. Cominetti, O.; Nunez Galindo, A.; Corthesy, J.; Oller Moreno, S.; Irincheeva, I.; Valsesia, A.; Astrup, A.; Saris, W.H.; Hager, J.;
Kussmann, M.; et al. Proteomic Biomarker Discovery in 1000 Human Plasma Samples with Mass Spectrometry. J. Proteome Res.
2016, 15, 389–399. [CrossRef]
19. Niu, L.; Thiele, M.; Geyer, P.E.; Rasmussen, D.N.; Webel, H.E.; Santos, A.; Gupta, R.; Meier, F.; Strauss, M.; Kjaergaard, M.; et al.
Noninvasive Proteomic Biomarkers for Alcohol-Related Liver Disease. Nat. Med. 2022, 28, 1277–1287. [CrossRef]
20. Bonzon-Kulichenko, E.; Garcia-Marques, F.; Trevisan-Herraz, M.; Vazquez, J. Revisiting Peptide Identification by High-Accuracy
Mass Spectrometry: Problems Associated with the Use of Narrow Mass Precursor Windows. J. Proteome Res. 2015, 14, 700–710.
[CrossRef]
21. Garcia-Marques, F.; Trevisan-Herraz, M.; Martinez-Martinez, S.; Camafeita, E.; Jorge, I.; Lopez, J.A.; Mendez-Barbero, N.;
Mendez-Ferrer, S.; Del Pozo, M.A.; Ibanez, B.; et al. A Novel Systems-Biology Algorithm for the Analysis of Coordinated Protein
Responses Using Quantitative Proteomics. Mol. Cell. Proteom. 2016, 15, 1740–1760. [CrossRef] [PubMed]
22. Jorge, I.; Navarro, P.; Martinez-Acedo, P.; Nunez, E.; Serrano, H.; Alfranca, A.; Redondo, J.M.; Vazquez, J. Statistical Model to
Analyze Quantitative Proteomics Data Obtained by 18o/16o Labeling and Linear Ion Trap Mass Spectrometry: Application to the
Study of Vascular Endothelial Growth Factor-Induced Angiogenesis in Endothelial Cells. Mol. Cell. Proteom. 2009, 8, 1130–1149.
[CrossRef]
23. Martinez-Bartolome, S.; Navarro, P.; Martin-Maroto, F.; Lopez-Ferrer, D.; Ramos-Fernandez, A.; Villar, M.; Garcia-Ruiz, J.P.;
Vazquez, J. Properties of Average Score Distributions of Sequest: The Probability Ratio Method. Mol. Cell. Proteom. 2008, 7,
1135–1145. [CrossRef] [PubMed]
24. Navarro, P.; Trevisan-Herraz, M.; Bonzon-Kulichenko, E.; Nunez, E.; Martinez-Acedo, P.; Perez-Hernandez, D.; Jorge, I.; Mesa, R.;
Calvo, E.; Carrascal, M.; et al. General Statistical Framework for Quantitative Proteomics by Stable Isotope Labeling. J. Proteome
Res. 2014, 13, 1234–1247. [CrossRef]
25. Navarro, P.; Vazquez, J. A refined method to calculate false discovery rates for peptide identification using decoy databases. J.
Proteome Res. 2009, 8, 1792–1796. [CrossRef]
26. Baldan-Martin, M.; Mourino-Alvarez, L.; Gonzalez-Calero, L.; Moreno-Luna, R.; Sastre-Oliva, T.; Ruiz-Hurtado, G.; Segura, J.;
Lopez, J.A.; Vazquez, J.; Vivanco, F.; et al. Plasma Molecular Signatures in Hypertensive Patients with Renin-Angiotensin System
Suppression: New Predictors of Renal Damage and De Novo Albuminuria Indicators. Hypertension 2016, 68, 157–166. [CrossRef]
27. Baldan-Martin, M.; Lopez, J.A.; Corbacho-Alonso, N.; Martinez, P.J.; Rodriguez-Sanchez, E.; Mourino-Alvarez, L.; Sastre-Oliva,
T.; Martin-Rojas, T.; Rincon, R.; Calvo, E.; et al. Potential Role of New Molecular Plasma Signatures on Cardiovascular Risk
Stratification in Asymptomatic Individuals. Sci. Rep. 2018, 8, 4802. [CrossRef] [PubMed]
28. Santos-Lozano, A.; Fiuza-Luces, C.; Fernandez-Moreno, D.; Llavero, F.; Arenas, J.; Lopez, J.A.; Vazquez, J.; Escribano-Subias, P.;
Zugaza, J.L.; Lucia, A. Exercise Benefits in Pulmonary Hypertension. J. Am. Coll. Cardiol. 2019, 73, 2906–2907. [CrossRef]
29. Calvo, E.; Corbacho-Alonso, N.; Sastre-Oliva, T.; Nunez, E.; Baena-Galan, P.; Hernandez-Fernandez, G.; Rodriguez-Cola, M.;
Jimenez-Velasco, I.; Corrales, F.J.; Gambarrutta-Malfati, C.; et al. Why Does Covid-19 Affect Patients with Spinal Cord Injury
Milder? A Case-Control Study: Results from Two Observational Cohorts. J. Pers. Med. 2020, 10, 182. [CrossRef]
30. Nunez, E.; Orera, I.; Carmona-Rodriguez, L.; Pano, J.R.; Vazquez, J.; Corrales, F.J. Mapping the Serum Proteome of Covid-19
Patients; Guidance for Severity Assessment. Biomedicines 2022, 10, 1690. [CrossRef]
31. de la Fuente-Alonso, A.; Toral, M.; Alfayate, A.; Ruiz-Rodriguez, M.J.; Bonzon-Kulichenko, E.; Teixido-Tura, G.; Martinez-
Martinez, S.; Mendez-Olivares, M.J.; Lopez-Maderuelo, D.; Gonzalez-Valdes, I.; et al. Aortic Disease in Marfan Syndrome Is
Caused by Overactivation of sGC-PRKG Signaling by NO. Nat. Commun. 2021, 12, 2628. [CrossRef] [PubMed]
32. Corbacho-Alonso, N.; Baldan-Martin, M.; Lopez, J.A.; Rodriguez-Sanchez, E.; Martinez, P.J.; Mourino-Alvarez, L.; Sastre-Oliva, T.;
Cabrera, M.; Calvo, E.; Padial, L.R.; et al. Cardiovascular Risk Stratification Based on Oxidative Stress for Early Detection of
Pathology. Antioxid. Redox Signal. 2021, 35, 602–617. [CrossRef] [PubMed]
33. Nunez, E.; Fuster, V.; Gomez-Serrano, M.; Valdivielso, J.M.; Fernandez-Alvira, J.M.; Martinez-Lopez, D.; Rodriguez, J.M.; Bonzon-
Kulichenko, E.; Calvo, E.; Alfayate, A.; et al. Unbiased Plasma Proteomics Discovery of Biomarkers for Improved Detection of
Subclinical Atherosclerosis. EBioMedicine 2022, 76, 103874. [CrossRef] [PubMed]
34. Fernandez-Ortiz, A.; Jimenez-Borreguero, L.J.; Penalvo, J.L.; Ordovas, J.M.; Mocoroa, A.; Fernandez-Friera, L.; Laclaustra, M.;
Garcia, L.; Molina, J.; Mendiguren, J.M.; et al. The Progression and Early Detection of Subclinical Atherosclerosis (Pesa) Study:
Rationale and Design. Am. Heart J. 2013, 166, 990–998. [CrossRef]
35. Casasnovas, J.A.; Alcaide, V.; Civeira, F.; Guallar, E.; Ibanez, B.; Borreguero, J.J.; Laclaustra, M.; Leon, M.; Penalvo, J.L.; Ordovas,
J.M.; et al. Aragon Workers' Health Study–Design and Cohort Description. BMC Cardiovasc. Disord. 2012, 12, 45. [CrossRef]
36. Laclaustra, M.; Casasnovas, J.A.; Fernandez-Ortiz, A.; Fuster, V.; Leon-Latre, M.; Jimenez-Borreguero, L.J.; Pocovi, M.; Hurtado-
Roca, Y.; Ordovas, J.M.; Jarauta, E.; et al. Femoral and Carotid Subclinical Atherosclerosis Association with Risk Factors and
Coronary Calcium: The Awhs Study. J. Am. Coll. Cardiol. 2016, 67, 1263–1274. [CrossRef]
37. Rodriguez, J.M.; Jorge, I.; Martinez-Val, A.; Barrero-Rodriguez, R.; Magni, R.; Nunez, E.; Laguillo, A.; Devesa, C.A.; Lopez, J.A.;
Camafeita, E.; et al. Isanxot: A Standalone Application for the Integrative Analysis of Mass Spectrometry-Based Quantitative
Proteomics Data. Comput. Struct. Biotechnol. J. 2024, 23, 452–459. [CrossRef]
38. Trevisan-Herraz, M.; Bagwan, N.; Garcia-Marques, F.; Rodriguez, J.M.; Jorge, I.; Ezkurdia, I.; Bonzon-Kulichenko, E.; Vazquez, J.
Sanxot: A Modular and Versatile Package for the Quantitative Analysis of High-Throughput Proteomics Experiments. Bioinfor-
matics 2019, 35, 1594–1596. [CrossRef]
39. Szklarczyk, D.; Kirsch, R.; Koutrouli, M.; Nastou, K.; Mehryary, F.; Hachilif, R.; Gable, A.L.; Fang, T.; Doncheva, N.T.; Pyysalo, S.;
et al. The String Database in 2023: Protein-Protein Association Networks and Functional Enrichment Analyses for Any Sequenced
Genome of Interest. Nucleic Acids Res. 2023, 51, D638–D646. [CrossRef]
40. Shannon, P.; Markiel, A.; Ozier, O.; Baliga, N.S.; Wang, J.T.; Ramage, D.; Amin, N.; Schwikowski, B.; Ideker, T. Cytoscape: A
Software Environment for Integrated Models of Biomolecular Interaction Networks. Genome Res. 2003, 13, 2498–2504. [CrossRef]
41. Bennike, T.B.; Bellin, M.D.; Xuan, Y.; Stensballe, A.; Moller, F.T.; Beilman, G.J.; Levy, O.; Cruz-Monserrate, Z.; Andersen, V.; Steen,
J.; et al. A Cost-Effective High-Throughput Plasma and Serum Proteomics Workflow Enables Mapping of the Molecular Impact
of Total Pancreatectomy with Islet Autotransplantation. J. Proteome Res. 2018, 17, 1983–1992. [CrossRef] [PubMed]
42. Johansson, M.; Yan, H.; Welinder, C.; Vegvari, A.; Hamrefors, V.; Back, M.; Sutton, R.; Fedorowski, A. Plasma Proteomic Profiling
in Postural Orthostatic Tachycardia Syndrome (Pots) Reveals New Disease Pathways. Sci. Rep. 2022, 12, 20051. [CrossRef]
[PubMed]
43. Mc Ardle, A.; Binek, A.; Moradian, A.; Chazarin Orgel, B.; Rivas, A.; Washington, K.E.; Phebus, C.; Manalo, D.M.; Go, J.;
Venkatraman, V.; et al. Standardized Workflow for Precise Mid- and High-Throughput Proteomics of Blood Biofluids. Clin. Chem.
2022, 68, 450–460. [CrossRef] [PubMed]
44. Woo, J.; Zhang, Q. A Streamlined High-Throughput Plasma Proteomics Platform for Clinical Proteomics with Improved Proteome
Coverage, Reproducibility, and Robustness. J. Am. Soc. Mass Spectrom. 2023, 34, 754–762. [CrossRef] [PubMed]
45. Geyer, P.E.; Kulak, N.A.; Pichler, G.; Holdt, L.M.; Teupser, D.; Mann, M. Plasma Proteome Profiling to Assess Human Health and
Disease. Cell Syst. 2016, 2, 185–195. [CrossRef]
46. Carlsson, L.; Lind, L.; Larsson, A. Reference Values for 27 Clinical Chemistry Tests in 70-Year-Old Males and Females. Gerontology
2010, 56, 259–265. [CrossRef]
47. Crawford, P.T.; Elisens, W.J. Genetic Variation and Reproductive System among North American Species of Nuttallanthus
(Plantaginaceae). Am. J. Bot. 2006, 93, 582–591. [CrossRef]
48. Geyer, P.E.; Wewer Albrechtsen, N.J.; Tyanova, S.; Grassl, N.; Iepsen, E.W.; Lundgren, J.; Madsbad, S.; Holst, J.J.; Torekov, S.S.;
Mann, M. Proteomics Reveals the Effects of Sustained Weight Loss on the Human Plasma Proteome. Mol. Syst. Biol. 2016, 12, 901.
[CrossRef]
49. Kamstrup, P.R.; Benn, M.; Tybjaerg-Hansen, A.; Nordestgaard, B.G. Extreme Lipoprotein(a) Levels and Risk of Myocardial
Infarction in the General Population: The Copenhagen City Heart Study. Circulation 2008, 117, 176–184. [CrossRef]
50. Liu, Y.; Buil, A.; Collins, B.C.; Gillet, L.C.; Blum, L.C.; Cheng, L.Y.; Vitek, O.; Mouritsen, J.; Lachance, G.; Spector, T.D.; et al.
Quantitative Variability of 342 Plasma Proteins in a Human Twin Population. Mol. Syst. Biol. 2015, 11, 786. [CrossRef]
51. Anderson, L. Six Decades Searching for Meaning in the Proteome. J. Proteom. 2014, 107, 24–30. [CrossRef] [PubMed]
52. Anderson, N.L. The Clinical Plasma Proteome: A Survey of Clinical Assays for Proteins in Plasma and Serum. Clin. Chem. 2010,
56, 177–185. [CrossRef]
53. He, T. Implementation of Proteomics in Clinical Trials. Proteom. Clin. Appl. 2019, 13, e1800198. [CrossRef]
54. DeMarco, M.L.; Nguyen, Q.; Fok, A.; Hsiung, G.R.; van der Gugten, J.G. An Automated Clinical Mass Spectrometric Method for
Identification and Quantification of Variant and Wild-Type Amyloid-Beta 1-40 and 1-42 Peptides in Csf. Alzheimers Dement. 2020,
12, e12036. [CrossRef]
55. Banerjee, S. Empowering Clinical Diagnostics with Mass Spectrometry. ACS Omega 2020, 5, 2041–2048. [CrossRef] [PubMed]
56. Lancaster, S.M.; Lee-McMullen, B.; Abbott, C.W.; Quijada, J.V.; Hornburg, D.; Park, H.; Perelman, D.; Peterson, D.J.; Tang, M.;
Robinson, A.; et al. Global, Distinctive, and Personal Changes in Molecular and Microbial Profiles by Specific Fibers in Humans.
Cell Host Microbe 2022, 30, 848–862.e847. [CrossRef] [PubMed]
57. Geyer, P.E.; Voytik, E.; Treit, P.V.; Doll, S.; Kleinhempel, A.; Niu, L.; Muller, J.B.; Buchholtz, M.L.; Bader, J.M.; Teupser, D.; et al.
Plasma Proteome Profiling to Detect and Avoid Sample-Related Biases in Biomarker Studies. EMBO Mol. Med. 2019, 11, e10427.
[CrossRef]
58. Hortin, G.L.; Sviridov, D.; Anderson, N.L. High-Abundance Polypeptides of the Human Plasma Proteome Comprising the Top 4
Logs of Polypeptide Abundance. Clin. Chem. 2008, 54, 1608–1616. [CrossRef]
59. Millioni, R.; Tolin, S.; Puricelli, L.; Sbrignadello, S.; Fadini, G.P.; Tessari, P.; Arrigoni, G. High Abundance Proteins Depletion Vs
Low Abundance Proteins Enrichment: Comparison of Methods to Reduce the Plasma Proteome Complexity. PLoS ONE 2011, 6,
e19603. [CrossRef]
60. Pernemalm, M.; Orre, L.M.; Lengqvist, J.; Wikstrom, P.; Lewensohn, R.; Lehtio, J. Evaluation of Three Principally Different Intact
Protein Prefractionation Methods for Plasma Biomarker Discovery. J. Proteome Res. 2008, 7, 2712–2722. [CrossRef]
61. Ekdahl, K.N.; Persson, B.; Mohlin, C.; Sandholm, K.; Skattum, L.; Nilsson, B. Interpretation of Serological Complement Biomarkers
in Disease. Front. Immunol. 2018, 9, 2237. [CrossRef] [PubMed]
62. Skattum, L.; van Deuren, M.; van der Poll, T.; Truedsson, L. Complement Deficiency States and Associated Infections. Mol.
Immunol. 2011, 48, 1643–1655. [CrossRef] [PubMed]
63. Banfi, G.; Del Fabbro, M. Biological Variation in Tests of Hemostasis. Semin. Thromb. Hemost. 2009, 35, 119–126. [CrossRef]
[PubMed]
64. Crawford, D.C.; Peng, Z.; Cheng, J.F.; Boffelli, D.; Ahearn, M.; Nguyen, D.; Shaffer, T.; Yi, Q.; Livingston, R.J.; Rieder, M.J.; et al.
Lpa and Plg Sequence Variation and Kringle Iv-2 Copy Number in Two Populations. Hum. Hered. 2008, 66, 199–209. [CrossRef]
65. Maranhao, R.C.; Carvalho, P.O.; Strunz, C.C.; Pileggi, F. Lipoprotein (a): Structure, Pathophysiology and Clinical Implications.
Arq. Bras. Cardiol. 2014, 103, 76–84. [CrossRef]
66. Tada, H.; Won, H.H.; Melander, O.; Yang, J.; Peloso, G.M.; Kathiresan, S. Multiple Associated Variants Increase the Heritability
Explained for Plasma Lipids and Coronary Artery Disease. Circ. Cardiovasc. Genet. 2014, 7, 583–587. [CrossRef]
67. Schmidt, E.M.; Willer, C.J. Insights into Blood Lipids from Rare Variant Discovery. Curr. Opin. Genet. Dev. 2015, 33, 25–31.
[CrossRef] [PubMed]
68. Cole, C.B.; Nikpay, M.; McPherson, R. Gene-Environment Interaction in Dyslipidemia. Curr. Opin. Lipidol. 2015, 26, 133–138.
[CrossRef]
69. Wong, W.M.; Hawe, E.; Li, L.K.; Miller, G.J.; Nicaud, V.; Pennacchio, L.A.; Humphries, S.E.; Talmud, P.J. Apolipoprotein Aiv Gene
Variant S347 Is Associated with Increased Risk of Coronary Heart Disease and Lower Plasma Apolipoprotein Aiv Levels. Circ.
Res. 2003, 92, 969–975. [CrossRef] [PubMed]
70. Schenk, M.; Reichmann, R.; Koelman, L.; Pfeiffer, A.F.H.; Rudovich, N.N.; Aleksandrova, K. Intra-Individual Reproducibility of
Galectin-1, Haptoglobin, and Nesfatin-1 as Promising New Biomarkers of Immunometabolism. Metab. Open 2020, 6, 100034.
[CrossRef]
71. Hosogaya, S.; Naito, K.; Sakamoto, M.; Osada, M.; Yatomi, Y.; Ozaki, Y. Biological Inter- and Intra-Individual Variations of Serum
Immunochemical Constituents and Their Allowable Limits of Analytical Error. Rinsho Byori 1999, 47, 875–880. [PubMed]
72. Doran, S.; Arif, M.; Lam, S.; Bayraktar, A.; Turkez, H.; Uhlen, M.; Boren, J.; Mardinoglu, A. Multi-Omics Approaches for Revealing
the Complexity of Cardiovascular Disease. Brief. Bioinform. 2021, 22, bbab061. [CrossRef] [PubMed]
73. Ahadi, S.; Zhou, W.; Schussler-Fiorenza Rose, S.M.; Sailani, M.R.; Contrepois, K.; Avina, M.; Ashland, M.; Brunet, A.; Snyder,
M. Personal Aging Markers and Ageotypes Revealed by Deep Longitudinal Profiling. Nat. Med. 2020, 26, 83–90. [CrossRef]
[PubMed]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
