How Can Chemometrics Support The Development of Point of Need

This document discusses how chemometrics can support the development of point-of-need devices for decentralized analytical chemistry. It highlights the role of chemometrics in conceptualizing, producing, and analyzing data from portable devices to make them more reliable and cost-effective. The document provides an overview of common chemometric tools used at different stages of developing point-of-need devices, from target identification and experimental design to data analysis and result visualization. It aims to help analytical chemists choose the appropriate chemometric method for a given device architecture and transduction method.
pubs.acs.org/ac Feature

How Can Chemometrics Support the Development of Point of Need Devices?
The necessity to establish novel solutions for decentralized monitoring is attracting attention in all fields of analytical chemistry, i.e., clinical, pharmaceutical, environmental, agri-food. The research around the terms “point-of-need”, “point-of-care”, “lab-on-chip”, “biosensor”, “microfluidics”, etc. has always aimed to produce easy-to-use and fast-response devices to be used by nonspecialists. However, the routes to produce the optimal device might be time-consuming and costly. In this Feature, we would like to highlight the role of chemometric-based approaches that are useful in the conceptualization, production, and data analysis in developing reliable portable devices and that also decrease the number of experiments (thus, costs) at the same time. Readers will be provided with a concise overview of the most employed chemometric tools used for target identification, design of experiments, data analysis, and digitalization of results applied to the development of diverse portable analytical platforms. This Feature provides a tutorial perspective regarding all the major methods and applications that have been developed to date. In particular, the presence of a concise and informative table assists analytical chemists in utilizing the right chemometrics-based tool depending on the architectures and transduction.
Sara Tortorella and Stefano Cinti*

Cite This: Anal. Chem. 2021, 93, 2713−2722

Published: January 26, 2021
© 2021 American Chemical Society. https://dx.doi.org/10.1021/acs.analchem.0c04151

■ DECENTRALIZED ANALYTICAL CHEMISTRY
The role of analytical chemistry in providing facile and affordable solutions for the major fields of action, namely, clinical, pharmaceutical, environmental, and agri-food, is continuously facing challenges.1,2 The research and the economic efforts, along with the development of innovative and breakthrough point-of-need tools (including point-of-care and lab-on-chip devices), represent a hot topic in the analytical sciences. Although few of these efforts have been capable of generating marketable solutions, the glucose biosensor for diabetic patients and the lateral flow immunoassay strip for pregnancy tests still roughly encompass the total market. Public authorities, nonprofit foundations, and private companies, e.g., EU Commission, NIH, Bill and Melinda Gates, Wellcome Trust, AIRC, Roche, Samsung, Google, etc., are committed to funding researchers who aim to enable nonspecialized customers and citizens, worldwide, to be actively involved in monitoring analytes. Devices to improve self-healthcare (diagnostics and personalized treatments), tools for evaluating environmental pollution and the effectiveness of remediation, and portable solutions to improve crop productivity while adapting to the effects of climate change are only a few of the contexts where analytical chemistry plays a leading role.3−5 The COVID-19 pandemic is only the latest example that highlights the necessity of user-friendly, rapid, and affordable devices for the use of nonspecialists, as already established by the WHO with the ASSURED criteria.6,7 In addition to this, the obvious limitations existing in developing/remote countries, in the frame of the 2030 Agenda for Sustainable Development (UN), represent a clear objective to be extended within a multidisciplinary vision.8 The research around the various architectures and principles of analytical methods, e.g., electrochemistry, colorimetry, fluorimetry, spectrophotometry, spectrometry, chromatography, has been merged with other disciplines such as biology, biochemistry, organic chemistry, material science, engineering, and microfabrication, with the

aim to reduce tasks for the end-user, to improve the reliability and (possibly) to reduce costs (e.g., the use of eco-friendly materials like paper-based substrates,9−11 the synthesis of biomimetic nanomaterials,12,13 the rational design of recognition probes (aptamers), the multiplexing approach, microfluidics for lowering sample treatment/chemicals use/waste management).14−16 Among all the aforementioned objectives, a common feature is recognized: making analytical processes more convenient in terms of (i) fabrication (use of synthetic materials instead of animal sources, e.g., oligonucleotide aptamers vs antibodies), (ii) application (in situ use, e.g., point-of-care vs laboratory-bound), (iii) environment (reduction of waste production, e.g., paper vs plastic), (iv) social impact (improving citizen participation, e.g., nonspecialists vs skilled personnel), and (v) economics (limiting the use of raw materials/maintenance, e.g., microfluidics vs bulky/expensive approaches). However, the path from conceptualization to market, through design and data analysis, still appears dependent on a univariate paradigm.17 The choice of a target, optimization of a sensor, analysis of a signal, noise discrimination, and evaluation of experimental parameters, e.g., stability, reproducibility, shelf life, are mainly derived from linear correlation. For instance, the amount of an enzyme to develop a biosensor is optimized by assuming other experimental parameters are irrelevant and/or the presence of interfering species in complex matrixes might limit real application. Although this approach works in many cases, the adoption of statistical methods to understand the additive effect of multiple species, to evaluate the correlation of experimental variables on signal output, and to discriminate/classify multitargets simultaneously, represents a useful opportunity for moving toward a multivariate perspective.18 Plenty of information can be extracted from data if more than one variable is considered at the same time: the understanding of how multiple inputs correlate with each other and affect the output can potentially improve both the analytical performances and the cost of portable devices.19

The aim of this Feature is to highlight the use of chemometrics for the development of portable analytical devices, with a holistic description. Although other content on chemometrics applied to portable devices has been reported in the literature,19−21 some novel aspects are included in this Feature: (1) the perspective is extended to aspects that are often not addressed, such as target identification and digitalization, and (2) the reported examples are focused on diverse disciplines around the world of portable devices, microfluidics, selective biosensors, nonspecific array, optical readouts, etc. All the steps from bench to market, namely, target identification, device optimization, signal treatment and data digitalization, can benefit from the adoption of chemometrics. Readers working in the field of point-of-need devices will be able to consider novel routes for enhancing their research. As reported in the preface of the book “Chemometrics in Electroanalysis”,21 Prof. Scholtz used the following words: “Still, only a few electrochemists and electroanalysts make use of it, probably because their attention is completely absorbed by the purely electrochemical problems, leaving not much time to study chemometrics.” This Feature is intended for those operating within the field of point-of-need devices who still do not consider the use of chemometrics to improve their outputs. Light theoretical descriptions are combined with practical evidence to support the realization of novel analytical tools.

■ CHEMOMETRICS AT A GLANCE
Chemometrics was defined for the first time by a young assistant professor writing a grant application as “the art of extracting chemically relevant information from data produced in chemical experiments”.22 It was 1972, and the professor was Svante Wold (professor of organic chemistry at Umeå University, Sweden), today frequently remembered as the “father of chemometrics”. A few years later, together with his colleague Bruce Kowalski (professor of analytical chemistry, University of Washington, Seattle, WA), he founded the International Chemometrics Society. As clearly explained by Wold himself on different occasions, the main goal of this new discipline was to get chemically relevant information out of measured chemical data (e.g., design of experiment, DoE, multivariate analysis),23−25 and to represent and display this information. These complementary tasks clearly demanded knowledge of statistics and applied mathematics, but the approach has always been fit-for-use: “We must remain chemists and adapt statistics to chemistry instead of vice versa. And chemometrics must continue to be motivated by chemical problem solving, not by method development.”22 From the 70s up to the 90s, the development of computerized instruments (especially in analytical and physical-organic chemistry) made data acquisition easier and cheaper. In those years, chemometrics became established and a big effort was put into designing novel algorithms for data information extraction and optimization, analogous to what biologists, psychologists, and economists have done with biometrics, psychometrics, and econometrics, respectively.26−28 In the late 90s, the application of computer and informatics technology in chemistry led to the coining of the counterpart of bioinformatics in chemistry: chemoinformatics.29,30 Chemoinformatics is the use of informatics methods to solve chemical problems,30,31 thus including chemical database systems and structures, computer-assisted structural elucidation, computer-assisted drug and chemical synthesis design, and molecular modeling.29,32,33 Mainly born as a tool to support analytical and physical-chemical data analysis, today chemometrics is applied in both academia and industry to broader areas of analytical chemistry, process optimization, drug design, biomarker discovery, material design, food science, digital and signal processing, image analysis, and omics sciences with the potential to revolutionize the very intellectual roots of problem solving.33−35

The first big challenge for chemometrics is the reduction of the number of experiments. The reasons for reducing the number of experiments are obvious: experiments are expensive, time-consuming, and sometimes pose ethical issues (e.g., animal experimentation). Minimizing the number of experiments, without compromising the information content therein, is arguably the main aim of a scientist. To this aim, we can use experimental design, also known as Design of Experiment (DoE): a mathematical framework for planning experiments by changing all involved variables simultaneously, thus extracting the maximum amount of information in the fewest number of experiments. Different mathematical strategies exist and have been extensively described elsewhere.36−40 However, rather than the difficulty stemming from mathematical aspects, the main stumbling block is the mental attitude required to switch from the one-variable-at-a-time strategy (OVAT) to DoE, which is still underrepresented in the scientific community.24 When optimizing a biosensor, many variables must be taken into account: the pH, the concentration of the target analyte,
temperature, time of reaction, and other parameters depending on the working principle. DoE, by means of, e.g., factorial designs40 or D-optimal design methodologies,41 allows one to identify the minimum essential experiments needed to span all sources of the variation and suggests that the resulting experimental data will identify the optimal conditions, the variables that most influence the results, and those that do not.

In addition to DoE, chemometrics tools and, more generally, statistical tools of data mining are commonly used to extract hidden information and enlightening relationships among data. In the simplest case, data come or can be summarized in a block of data X, an m × n matrix in which, for each of the m samples, n experimental variables or molecular descriptors are reported. Starting from the assumption that the n variables frequently are more than the m samples, that they are likely to be correlated, and that they can be missing in some samples, multivariate analysis25 (MVA) is the elected way to extract information through data analysis.42 We can distinguish two approaches for the application of MVA: unsupervised and supervised. In the unsupervised setting, the goal is to explore the variance in a single block of data X. For that, a matrix factorization can be performed without any a priori knowledge (e.g., no information about the class label of data, the number of classes, etc.), so that natural patterns can be elucidated. This approach is ideal to explore data in an unbiased fashion, especially in an early phase of the investigation, when no or little information on the variables or analytes involved in the process is available. Among unsupervised multivariate analysis tools, principal component analysis (PCA) is the workhorse in chemometrics. PCA is used to reduce the complexity and the dimensionality of a set of data contained in a matrix by rationalizing the variance and providing an overview of all observations or samples in the data table.43 The idea around PCA is to reduce the dimensionality of a data set consisting of a large number of variables by deriving new variables (principal components) as combinations of the original variables: it allows one to retain as much as possible of the variability present in the data set and to reduce noise and redundancy. By inspecting a PCA model, groupings, trends, and outliers can also be found.

However, in some cases we do have additional a priori knowledge of the samples, for example, concentration, dose, age, gender. In this scenario, we can use supervised models to explore the variance in a block of data X that allows the prediction of a response block Y, that is, our additional knowledge. The latter may contain quantitative data, which puts one in the regression domain, or categorical data (e.g., healthy versus disease samples), which puts one in the classification domain. This method helps shift the question from “What is in there (X)?” to either “What is its relation to Y (quantitative)?” or “What is the difference between the classes in Y (categorical)?”.43 As a result, provided that no overfitting44 is occurring (i.e., the model we are building to rationalize relationships among observations is too closely or purely fitting the training data with poor predictive ability on novel data), supervised methods can point to the variables that lead to the desired quantification or classification. Popular supervised multivariate analysis tools are partial least squares regression45 (PLS, e.g., age, dose concentration) and its extension to classification problems, known as PLS discriminant analysis46 (PLS-DA, e.g., control vs treated, healthy vs disease). Sparse and other variants of those techniques also exist.47−49 When there is a high imbalance between the number of training samples in each class and/or where only one category is of interest (e.g., traceability problems, food authentication, and in the context of personalized medicine), class modeling tools are used instead.50 Instead of looking for differences among samples belonging to different categories, class modeling techniques focus on the dis/similarities between the samples belonging to a particular category.50,51 In the last decades, novel data mining techniques have been developed to identify relationships and trends in large, multidimensional big data sets as might be obtained from modern high-throughput screening (HTS) or untargeted omics analyses, namely, substructural analysis,52 discriminant analysis,53 neural networks,54−56 decision trees,57,58 support vector machine,59 and kernel algorithms.60 However, it is almost impossible to define the best modeling technique a priori: an initial benchmark study is necessary to determine the most appropriate one. Despite the mathematical complexity added to new algorithms, the final goal is still the same: to provide interpretable, thus useful, models with adequate predictive ability.

■ TARGET IDENTIFICATION
Point-of-need devices are built to sensitively, accurately, and selectively detect an analyte (or group of analytes) of interest. A question arises: how was that particular analyte (or group of analytes) identified as most related to a specific pathophysiological condition we aim to monitor? The process of target analyte identification or, more popularly nowadays, of biomarker discovery, deserves a dedicated research effort which can be sped up by chemometrics. We can distinguish two possible approaches for target identification: hypothesis-based and discovery-based, Figure 1.

Figure 1. (A) Hypothesis-based approach focused on cystic fibrosis (CF). People affected by CF have CFTR (membrane protein) malfunctioning (ii), leading to chloride ions accumulation within cells. Sweat chloride detection is used to diagnose CF. (B) Discovery-based approach with the use of PCA: (i) score plot that displays variability among samples, (ii) loading plot that displays contribution of original variables (X, Y, Z, three possible biomarkers), and (iii) biplot that visualizes the correlation among samples and original variables.

The hypothesis-based approach is grounded on the mechanistic understanding of biochemical processes behind the pathophysiological condition of interest: understanding that diabetes mellitus increases blood glucose levels led to the identification of glycosylated hemoglobin as an ideal biomarker for diagnosis of diabetes.61,62 The same happens for pregnancy tests.63 Discovery-based
approaches, on the other hand, aim to identify statistically significant changes in molecular species associated with the pathophysiological state of interest. For instance, the breast and ovarian cancer-associated gene BRCA1 was identified by positional cloning of a region on chromosome 17 that is frequently deleted in breast cancer.64

Today, in the high-throughput omics era, where generating data is arguably easier and faster than interpreting results, the discovery-based approach is predominant. Especially in the early stages of an investigation, appropriately sized data sets undergo statistical analysis for data reduction and classification in a purely data-driven fashion. In such a multivariable domain, multivariate analysis is the natural choice to identify a panel of complementary target analytes that can effectively discriminate the samples under investigation better than a single one (i.e., univariate approach).65 Taking into account the correlation structure of the data and the synergies and antagonisms plausibly existing among the potential analytes, the multivariate approach outperforms the univariate one in sensitivity, specificity, and reliability and has been successfully used for diagnostic and prognostic biomarker discovery.66 As an example, one of the few Food and Drug Administration (FDA) approved biomarkers62 is the one for ovarian cancer (Ova1), discovered by artificial neural network (ANN) modeling of the plasma proteome of women with ovarian cancer compared to women with benign gynecological diseases.67 As a result, a panel of five biomarkers was found to outperform the previously known ovarian cancer biomarker, CA125,68 in the ability to discriminate between invasive ovarian cancer and benign lesions.62,69 When the dimension of the cohorts is limited or the interest is focused on the phenomenological characterization of the disease,51 multivariate biomarker discovery is achieved through the building of exploratory models. In this case, PCA and hierarchical cluster analysis (HCA) are predominantly used to reduce the dimensionality of the data and elucidate a natural pattern. Recently, a Sparse Mean approach was proposed as most sensitive and best able to identify the specifically perturbed variables in PCA-based methods.70 When multiple sources of variability are present, PCA may suffer from an interpretational problem and other strategies can be used, e.g., ANOVA simultaneous component analysis (ASCA)71 and ANOVA principal component analysis (ANOVA-PCA).72 However, such unsupervised, exploratory methods may not be the most straightforward choice for target identification, as they are not designed to specifically find differences among groups, while target analytes and biomarkers are supposed to be a unique molecular signature of a certain group.73 Among supervised classification methods, PLS-DA, support-vector machine, random forest, and artificial neural networks are used to force the method to provide the desired classification as well as to predict the classification of new samples. When a statistically significant discrimination among classes (e.g., disease and healthy) is found, it means that a mathematical relationship between the data and the categorical variable y (e.g., the class) was established and therefore it can be used to predict the class of novel samples. As anticipated, supervised methods suffer from the risk of overfitting, which arises when the fitting of training data is so close that both the predictive features of the data and noise are incorporated into the model, which will imply poor model performance in the prediction stage. In order to verify that the model holds a true biochemical meaning and avoid overfitting, we can focus on variable selection74 and validation to verify that the predictive features identified are not noise and to test the predictive ability of the model beyond the training data, respectively.44,75 All in all, even if supervised methods are preferred in the target identification step, it is good practice to start the analysis with unsupervised methods: if the desired classification is already visible in a PCA scores plot, for instance, the supervised algorithms can be applied with less probability of overfitting. A discovery-based approach can then evolve into a hypothesis-based approach, since it is not sufficient to prove that an analyte can discriminate two or more groups of samples: a biochemical mechanistic explanation is needed to support the discovery. Indeed, one of the bottlenecks of a discovery-based, data-driven target identification is to validate the robustness and to prove clinical applicability of the proposed markers, thus to prove interpretability of the model. In this context, it is crucial to combine the data-driven approach with expert knowledge throughout the entire process of target identification: from sample sizing to collection, from data modeling to result interpretation and validation.

■ DESIGN AND OPTIMIZATION OF THE DEVICE
As written above, a common way to develop point-of-need devices is represented by a single variable optimization. This approach appears inconvenient from two perspectives: number of experiments and reliability of optimization.76 To optimize a device composed of N variables, L levels, and with a number of R repetitions, N × L × R experiments are required: for instance, 75 experiments are needed to optimize (variable-by-variable) 5 variables with 5 levels, repeated 3 times. Even if the OVAT approach might work in some cases, the number of experiments increases quickly, along with time and cost.77 In addition, the presence of interaction among variables is not taken into account at all. The lack of information related to variable correlation might lead to a “falsely” optimized final device, thus negatively weighing on the performance.

Generally, if the interactions among variables are high, then a great difference is observed between the optimizations obtained by univariate and multivariate approaches. To overcome this limitation, DoE allows one to observe all the variables simultaneously, by adopting statistical multivariate methods that have the goal of lowering resources and improving outcomes.78,79 However, even if the multivariate vision contains the above-discussed advantages, a clarification should be given: if N variables are considered, and each of them is investigated at L levels, all the possible combinations are $L^N$, e.g., 2 variables with 10 values lead to $10^2 = 100$ experiments, namely, full factorial. In this case, the multivariate approach to define the correlation of just two variables produces a high number of experiments. The adoption of DoE might help analytical chemists understand the effect of variables on response. Designs are obtained by combining the variables through well-defined rules. If the aim is to evaluate the effect of variables on the response, especially when a process is unknown, a Plackett−Burman (P−B) design can be used for screening experiments:80 it is known as a screening design, and it is intended as linear combinations of two levels of each variable, i.e., the upper level is signed as “+” and the lower level is signed as “−”. This is a very economic approach for screening the contribution of a high number of variables (N) using a number of experiments equal to N + 1 that is a multiple of 4, e.g., 11 variables can be screened using 12 experiments (Figure 2A). However, it is very important to highlight that this approach (1) is useful to identify the
most relevant variables on response to be further investigated, (2) considers all the interactions negligible, and (3) is effective for linear data behavior.

Figure 2. (A) P−B screening design algorithm for 11 factors, (B) (i) representation of a CCD on three variables and (ii) a response surface for microfluidic platform optimization. Reproduced with permission from ref 81. Copyright 2017 Springer-Verlag Berlin Heidelberg. (C) (i) BBD scheme and (ii) a response surface for pencil-based electrochemical device for metal ions. Reproduced with permission from ref 82. Copyright 2015 Elsevier B.V.

If the contribution of a factor is relevant, it should not be included in this DoE, and existing interactions should also be considered when performing a P−B design. When data are not linear, e.g., quadratic behavior is present, central composite and Box−Behnken experimental designs can be successfully used for obtaining response surfaces (the relation between different experimental variables and the responses) to define an optimum. The central composite design (CCD) is adopted on a lower number of factors (generally 2−5) with respect to the P−B design. CCD allows one to estimate the constants, the linear terms, the interactions between variables, and the quadratic terms, according to the selected model (usually, the interactions among more than two terms are not taken into account).83 It should be thought of as a cube of objects (as for factorial design) plus a second set of objects distributed in the form of a star which goes beyond the limits of the cube to provide the estimation of curvature (Figure 2B). In this case the number of experiments (N) is defined by the following equation: $N = 2^F + 2F + C$, where F is the number of factors and C is the number of central points (the points where all factors are set at their center value). Additionally, CCD permits one to reuse previous factorial experiments (blocking):84 if the experiments are too long, one can decide to carry out two blocks of experiments, i.e., cube points, star points. Alternatively, the Box−Behnken design (BBD) represents a valuable alternative to the CCD:85 BBD tries to minimize the effects of extreme values like those provided by the star points in CCD. For this purpose, a cube and some central points are still used but, unlike CCD, samples are not positioned in the vertices but in the middle of the edges, and star samples are not used (Figure 2C). BBD needs fewer experiments than the analogous CCD: BBD is very useful when extreme experiments are undesired, and “blocking” is generally not available. The […] avoiding experiments performed under extreme conditions. For instance, it is commonly utilized to optimize parameters while developing portable colorimetric devices that are built on paper substrate, i.e., microfluidic paper-based devices:86 the flow geometry and the amount of deposited reagents on the paper-based strips can be finely optimized, resulting in performance boost and time/cost savings (only 73 experiments were used) while analyzing uric acid in human urine. These multivariate methods, whereby all factors are varied simultaneously, represent a good starting point for beginners. The obtained data can be conveniently analyzed in combination with the analysis of variance (ANOVA) to understand the variation induced by the different variables and to obtain the optimal compromise between the number of experiments and chemical meaning.87

■ DATA ANALYSIS
Let us think about a single measurement like detecting blood glucose with a portable strip: the value of produced current is correlated to the glucose concentration within the blood droplet. This is an example where using univariate methods can be enough: the meaning of the variable (i.e., current) is the same among different samples.88 Instead, a more complex example arises when data are composed of information on the pH, target concentration, and color of the sample, as in the case of a microfluidic device.89 In this case, data are heterogeneous with diverse magnitude, scale, and meaning: to obtain information from multinature data, a multivariate approach is essential. It should be clear that specific conditions like those relative to environmental pollution, health disease, and food authentication are often the results of multiple variables, thus large sets of data need to be extracted and interpreted to get information.90 PCA is very useful for this. It is largely used for pattern recognition of data sets acquired with multiarray systems such as the electronic nose and tongue.91−93 As reported in Figure 3A, PCA combined with a lab-on-chip device composed of sensitized beads for recognition of interleukin-1β (IL-1β), C-reactive protein (CRP), and metalloproteinase-8 (MMP-8) allowed one to discriminate between periodontally healthy and unhealthy patients and increase the diagnostic value of IL-1β and MMP-8 biomarkers for periodontitis.94 In addition, a pioneering paper demonstrated that it is possible to classify wine samples from different regions, through the use of electrochemical capillary electrophoresis on a chip.97 However, PCA’s exploratory nature is not enough to address specific issues: it may be useful within food science when a qualitative answer is required, but it fails to provide models for quantitative prediction. To overcome this, supervised chemometric approaches can be used. PLS is largely adopted for multivariate regression, for defining mathematical relationships among variables and providing quantitative predictions.98,99 As shown in Figure 3B, PLS has been coupled to a mobile device for interpreting the thermal stability of raw milk by means of the alizarol test. The smartphone app, namely, “PhotoMetrix Pro”, is freely available in the Google Play Store. PLS allowed a satisfactory agreement when compared with the reference method (potentiometric).95 PLS has been successfully applied to a microfluidic system
number of experiments required for BBD of F factors and C with a 1H NMR-based metabolomic footprint, named as
central points is given by the following formula N = 2F × (F − “metabolomics-on-a-chip”, to identify metabolomic markers100
1) + C. This design avoids combination for which all factors and to develop a small-molecule toxicity-oriented database.
are simultaneously at their highest or lowest levels, thus PLS demonstrated to reduce the coefficient of variation for the
2717 https://dx.doi.org/10.1021/acs.analchem.0c04151
Anal. Chem. 2021, 93, 2713−2722
Analytical Chemistry pubs.acs.org/ac Feature
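The run counts given above for the central composite design, $N = 2^F + 2F + C$, and for the Box−Behnken design, $N = 2F(F - 1) + C$, can be checked by building the coded design points directly. The following is a minimal Python sketch and not code from any of the cited works: the function names are hypothetical, the default star distance assumes a rotatable CCD (an assumption not stated in the text), and the BBD uses the pairwise edge-midpoint construction that matches the formula above.

```python
from itertools import combinations, product


def ccd_runs(n_factors: int, n_center: int) -> int:
    """Run count of a full central composite design: N = 2^F + 2F + C."""
    return 2 ** n_factors + 2 * n_factors + n_center


def bbd_runs(n_factors: int, n_center: int) -> int:
    """Run count of a pairwise Box-Behnken design: N = 2F(F - 1) + C."""
    return 2 * n_factors * (n_factors - 1) + n_center


def ccd_points(n_factors, n_center, alpha=None):
    """Coded CCD points: cube corners, then star points, then center points."""
    if alpha is None:
        # Assumption: rotatable design, star distance alpha = (2^F)^(1/4).
        alpha = (2 ** n_factors) ** 0.25
    cube = [list(p) for p in product((-1.0, 1.0), repeat=n_factors)]
    star = []
    for i in range(n_factors):
        for sign in (-1.0, 1.0):
            point = [0.0] * n_factors
            point[i] = sign * alpha  # one factor beyond the cube, rest at center
            star.append(point)
    center = [[0.0] * n_factors for _ in range(n_center)]
    return cube + star + center


def bbd_points(n_factors, n_center):
    """Coded BBD points: edge midpoints (two factors at +/-1, the rest at 0)."""
    edges = []
    for i, j in combinations(range(n_factors), 2):
        for a, b in product((-1.0, 1.0), repeat=2):
            point = [0.0] * n_factors
            point[i], point[j] = a, b
            edges.append(point)
    center = [[0.0] * n_factors for _ in range(n_center)]
    return edges + center


if __name__ == "__main__":
    for f in (3, 4, 5):
        print(f, ccd_runs(f, 3), bbd_runs(f, 3))
```

With three factors and three central points, the sketch gives 17 runs for the CCD and 15 for the BBD, and no BBD point has all factors at ±1 simultaneously, which reflects the absence of vertex experiments described above.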

Figure 3. (A) Score plot of PCA demonstrates the different behavior among (circle) healthy patients and (square) periodontal-affected ones. Reproduced with permission from ref 94. Copyright 2007 John Wiley and Sons. (B) PLS correlations between the predicted and measured saffron samples with a portable electronic nose made with 10 metal oxide sensors. Reproduced with permission from ref 95. Copyright 2018 Springer Science Business Media, LLC, part of Springer Nature. (C) (i) ANN network constructed based on a set of data which is divided in training and external tests with different concentrations of AChE mixtures, i.e., chlorpyriphos-oxon (CPO) and malaoxon (MO), (ii) correlation graph between expected and target concentrations of MO in external test with milk. Reproduced with permission from ref 96. Copyright 2012 Elsevier B.V.

Table 1. Chemometrics Tools Applied to Portable Devices

| tool | type of device | chemometrics remarks | ref |
|---|---|---|---|
| Design and Optimization | | | |
| CCD | PPy/HRP-SPE for Ochratoxin A | Optimal setup with lowest experiments | 107 |
| CCD | Colorimetric μPAD for glucose | Interactions evaluation, and 1.2% absolute error | 81 |
| CCD | Inkjet-printed PAD for isoniazid | Only 46 experiments to optimize the platform | 108 |
| BBD | Colorimetric μPAD for uric acid | Setup optimization: reagents and geometry | 76 |
| PCA | Sensitized beads for periodontitis | New biomarkers for periodontitis | 94 |
| PCA-HCA | Smartphone for amines | Discrimination of three amines through color maps | 118 |
| PCA-ANN | AuNP-SPE for tetracycline, cefixime | Discrimination of antibiotics mixture | 109 |
| PLS | μH NMR for NH3, DMSO, phenol | Toxicity-oriented “metabolomic-on-a chip” database | 88 |
| PLS | D-SPE for propionaldehyde | Coefficient of variation from 33 to 15% | 89 |
| PLS | MIP-optosense for 1−2-naphthylamine | Lowering interferents, no pretreatment needed | 111 |
| PLS | Printed tongue for Cd, Pb, Tl, Bi | Simultaneous detection and data reduction | 116 |
| PLS-ANN | Pencil electrode for Zn, Cd, Pb, Cu | Increase the linearity of the data | 93 |
| PLS-ANN | Color array for volatile N2-based | Inaccurate predictions minimized | 112 |
| ANN | Multiarray for Cd, Pb, Hg | Decrease of complexity of the input data | 113 |
| ANN | μEllman assay for pesticides | Differentiate five different pesticides in mixture | 119 |
| LDA | Colorimetric array for glyphosate | 100% discrimination of herbicide anions | 114 |
| PLS-DA | FTIR for morphine and thebaine | “Signature” peaks in the poppy IR spectra | 115 |
| PLS-DA | NIR for explosives | No handling variability due to human handling | 110 |
| LDA | Multifluo array for chloropropanol | Discriminate four species in mixture | 117 |

determination of propionaldehyde in wine from 33 to 15%, using an electrochemical biosensor.101 The combination of PLS-DA with a microfluidic paper-based device made it possible to overcome the limitation of univariate approaches, which were not able to simultaneously detect acetate, cyanide, fluoride, and phosphate ions in aqueous solution.102 Instead, a common tool adopted for nonlinear (but also linear) responses is the artificial neural network (ANN): ANNs are algorithms that simulate the biological neuronal system. Unlike the previous chemometric tools, ANNs do not require a priori knowledge of the model.103 They are very useful both for data exploration and for qualitative/quantitative prediction with chemical sensors whose response is the result of a series of complex interacting phenomena. As shown in Figure 3C, ANN has been used in combination with an automated flow screen-printed AChE-inhibition biosensor for the detection of chlorpyriphos-oxon and malaoxon. The final configuration with two neurons as the input layer (use of two enzymes), 10 neurons as the hidden layer (mixtures of pesticides), and 2 neurons as the output layer (amount of pesticides) allowed one to obtain good quantification of mixtures in milk.96 The use of ANN also allowed one to consider the metallic interactions among four metals, e.g., Zn, Cd, Pb, and Cu, and simultaneously detect them in raw propolis samples through the use of a pencil-based electrochemical sensor.82 However, users should be aware of the previously discussed overfitting issue that affects these techniques: that is why validation with a relevant number of objects is strongly required. In Table 1, simple cases for users interested in approaching chemometrics tools are reported.

Highly multivariate data coming from designed experiments, where both the variables (x block) and responses (y block) are multivariate, as is frequent in metabolomics studies, might be modeled by multilevel methods, including the above-mentioned ASCA71 and ANOVA-PCA.72 In addition, ANOVA-target projection (ANOVA-TP)104 is well suited for testing the statistical significance of the studied effects and for straightforward visualization and accurate estimation of between- and within-class variance, and repeated measures ANOVA (rMANOVA)105 is applicable to clinical and personalized medicine investigations. In the context of personalized medicine, a class-modeling approach such as Statistical Health Monitoring (SHM) can be used, as in metabolomics studies: the metabolic profile of an individual is compared with that of healthy people in a multivariate manner to detect abnormal metabolite/pattern concentrations.106

■ DIGITALIZATION
When thinking about the “next 20 years” scenario, we can foresee with confidence that advances in technologies, computerization, and miniaturization are likely to increase. The advent of the internet of things (IoT) and of innovative strategies based on information and communication technologies (ICTs), together with the development of technologies like wearables, digital biosensors, smart houses, and smart cities, will make it possible to monitor everyone in real time.120 Specifically referring to science, all areas of research will become data-intensive, emphasis will shift from data generation to data analysis, and knowledge of data-mining techniques will be essential to carry out research, thus bringing new challenges to researchers.121,122 The combination of biosensors with IoT and ICT strategies is also essential to generate population health data that can be used, for instance, to predict the outbreak of infectious diseases.123 Gartner summarizes these concepts in its definition of big data as “high-volume, high-velocity, and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation”.124 In this scenario, to quote Gasteiger, “the application of chemoinformatics is only limited by your own imagination!”30 Big data techniques, machine learning, signal theory, hierarchical architecture for the detection of security incidents in a security information system are the basis for the entire workflow. In the past 2 decades, the combination of multivariate techniques with the LASSO (least absolute shrinkage and selection operator) gave rise to the so-called sparse models, now popular in the chemometric community.47,125 These methods adapt multivariate techniques to the huge data dimensionality, generating simpler and easier-to-interpret models. Indeed, while the common tendency of data mining tools is to make the analysis as simple as possible for the end-user, leading in many cases to “black boxes” in which advanced data interpretation is very limited, chemometrics approaches like the ones mentioned above offer the unique opportunity of clearly interpreting and visualizing statistical analysis outcomes and evaluating their robustness. This is what makes chemometrics “sexy” for the years to come: being an already up-to-date tool for solving real complex problems in an effective and still interpretable, thus user-friendly, way. In addition, chemometrics offers the possibility of integrating and interpreting the complex multidimensional information provided by different sensors/devices through data fusion techniques.126−128 Data can be fused by simple concatenation of different sensor data (namely, low-level data fusion); by fusing the features extracted from the original data, e.g., via MVA or feature selection strategies (midlevel); or by merging the different model responses only, after each data set has been modeled independently (high-level).129,130 When using data fusion approaches, the trade-off to be found is in enhancing the quantity and the quality of the information content that can be extracted without including higher amounts of noise, i.e., nonpredictive information. For instance, from information provided by wearable sensors, level data fusion techniques and inference methods are used for purposes ranging from activity recognition for Parkinson’s disease monitoring,131 through fall detection and prediction, to physiological monitoring for early risk detection and intervention.132,133 Clearly, new issues also arise: ethical and regulatory issues concerning the requirements and specifications of data analysis components, the user, and, in e-Health applications, patient consent, data, and privacy protection.113 Digital sensors and biometric monitoring should clearly empower citizens and hold the promise of huge potential benefits, but in order to fully exploit these devices, a cultural effort has to be made to inform users and gain their compliance and co-operation. In the digital and IoT era, chemometrics will be even closer to the end users in their everyday life, not only providing a means of personalized, user-centered data analysis but also preserving privacy through edge analytics and helping users understand why and how their data are analyzed.

■ CONCLUSIONS
The use of chemometrics tools represents a great resource for developing novel devices for decentralized analysis: depending on the analytical necessities and experimental settings, a variety of statistical-based approaches might extract plenty of useful information from data (sometimes not taken into account). The adoption of the mathematical tools behind chemometrics is reported to be essential for all the steps around portable devices, from conceptualization to final application: in fact, the research around the implementation of analytical devices could benefit from chemometrics through identifying relevant targets, optimizing architectures, analyzing data, and collecting information. Chemometrics can make important advances in developing analytical portable devices. It might facilitate (i) the discovery of novel targets, e.g., hypothesis-based and discovery-based approaches, (ii) the optimization of experiments, e.g., design of experiments, and (iii) the classification of data and their processing. In addition, in the digital era, point-of-need devices will be increasingly connected and personalized: chemometrics is and will be essential to mine the collected big data, derive interpretable models for early risk detection and intervention, and grant privacy and data protection. The aim of the paper is to offer nonchemometricians starting approaches for developing portable analytical devices. Although it is mainly focused on basic chemometric tools, data fusion approaches and approaches able to deal with highly multivariate data coming from designed experiments, such as ASCA, APCA, ANOVA-TP, and rMANOVA, should be taken into account for outcomes from different devices. Chemometrics represents a multidisciplinary pursuit, incorporating chemical, mathematical, and computational sciences. The adoption of models to describe, process, and differentiate data should never be independent from the chemical perspective, and focus must be put on chemical interpretability and predictive ability. The years to come bring a new challenge for chemists: to bridge the gaps among different disciplines and to find solutions to the same scientific question by embracing different cultures. This Feature is a first step in this path.

■ AUTHOR INFORMATION
Corresponding Author
Stefano Cinti − Department of Pharmacy, University of Naples “Federico II”, 80131 Naples, Italy; BAT Center−Interuniversity Center for Studies on Bioinspired Agro-Environmental Technology, University of Napoli “Federico II”, 80055 Naples, Italy; orcid.org/0000-0002-8274-7452; Email: [email protected]

Author
Sara Tortorella − Molecular Horizon srl, 06084 Bettona, Perugia, Italy; orcid.org/0000-0001-9691-8323

Complete contact information is available at: https://pubs.acs.org/10.1021/acs.analchem.0c04151

Author Contributions
The manuscript was written through contributions of all authors.

Notes
The authors declare no competing financial interest.

Biographies
Sara Tortorella graduated in Chemistry from the University of Perugia (Italy), and in 2017 she received her Ph.D. in Biotechnology in the group of Cheminformatics and Molecular Modeling led by Prof. Gabriele Cruciani. During her studies, she joined excellent and top-world research groups at Northwestern University (Chicago, US), University of York (York, UK), Barcelona Biomedical Research Park (Barcelona, ES), and GlaxoSmithKline (Philadelphia, US). Her expertise is the use of chemometric and cheminformatic for material and drug design, molecular modeling, MS and MS imaging lipidomics data analysis, software design. She is the coordinator of the "Diffusione della Cultura Chimica" of the Società Chimica Italiana (Dissemination of Chemical Culture group of the Italian Chemical Society), young member of the Italian Metabolomics Network board, and Science Communication Manager of EpiLipidNet COST action CA19105.

Stefano Cinti is an Assistant Professor at the Department of Pharmacy, University of Naples “Federico II”. He obtained a Ph.D. in Chemical Sciences in 2016 in the group headed by Prof. Giuseppe Palleschi at University of Rome “Tor Vergata”. He leads the Uninanobiosensors Lab (uninanobiosensors.com) at the University of Naples “Federico II”, and his research interests include the development of electrochemical sensors, paper-based devices, point-of-care, nanomotors and nanomaterials. During his research activity, he had the opportunity to spend a period abroad in Finland, the U.K., the U.S., Germany, and Spain. He published more than 40 papers in peer-reviewed journals, with a H-index of 25 and >1800 citations. Among all the recognitions, in 2019 he was named Best Young Researcher in Analytical Chemistry (by the Italian Chemical Society), and in 2020 he was included in the World’s Top 2% Scientists. He is a member of the board of the Chemical Cultural Diffusion Group and of the Young Group of the Italian Chemical Society.

■ ACKNOWLEDGMENTS
S.C. acknowledges the MIUR Grant “Dipartimento di Eccellenza 2018-2022” to the Department of Pharmacy of University of Naples “Federico II”. Authors acknowledge Julian Ramirez for proofreading the manuscript.

■ REFERENCES
(1) Wang, S.; Lifson, M. A.; Inci, F.; Liang, L.-G.; Sheng, Y.-F.; Demirci, U. Expert Rev. Mol. Diagn. 2016, 16, 449−459.
(2) Chin, C. D.; Linder, V.; Sia, S. K. Lab Chip 2012, 12, 2118−2134.
(3) Nayak, S.; Blumenfeld, N. R.; Laksanasopin, T.; Sia, S. K. Anal. Chem. 2017, 89, 102−123.
(4) Gubala, V.; Harris, L. F.; Ricco, A. J.; Tan, M. X.; Williams, D. E. Anal. Chem. 2012, 84, 487−515.
(5) Dai, Y.; Liu, C. C. Angew. Chem. 2019, 131, 12483−12496.
(6) Urdea, M.; Penny, L. A.; Olmsted, S. S.; Giovanni, M. Y.; Kaspar, P.; Shepherd, A.; Wilson, P.; Dahl, C. A.; Buchsbaum, S.; Moeller, G.; Hay Burgess, D. C. Nature 2006, 444, 73−79.
(7) Weiss, C.; Carriere, M.; Fusco, L.; Capua, I.; Regla-Nava, J. A.; Pasquali, M.; Scott, J. A.; Vitale, F.; Unal, M. A.; Mattevi, C.; et al. ACS Nano 2020, 14, 6383−6406.
(8) https://www.un.org/sustainabledevelopment/development-agenda/
(9) Cinti, S.; Moscone, D.; Arduini, F. Nat. Protoc. 2019, 14, 2437−2451.
(10) Parolo, C.; Sena-Torralba, A.; Bergua, J. F.; Calucho, E.; Fuentes-Chust, C.; Hu, L.; Rivas, L.; Álvarez-Diduk, R.; Nguyen, E. P.; Cinti, S.; et al. Nat. Protoc. 2020, 15, 3788−3816.
(11) Cinti, S. Anal. Bioanal. Chem. 2019, 411, 4303−4311.
(12) Labib, M.; Sargent, E. H.; Kelley, S. O. Chem. Rev. 2016, 116, 9001−9090.
(13) Wu, Y.; Belmonte, I.; Sykes, K. S.; Xiao, Y.; White, R. J. Anal. Chem. 2019, 91, 15335−15344.
(14) Yetisen, A. K.; Akram, M. S.; Lowe, C. R. Lab Chip 2013, 13, 2210−2251.
(15) Xu, D.; Huang, X.; Guo, J.; Ma, X. Biosens. Bioelectron. 2018, 110, 78−88.
(16) Song, Y.; Lin, B.; Tian, T.; Xu, X.; Wang, W.; Ruan, Q.; Guo, J.; Zhu, Z.; Yang, C. Anal. Chem. 2019, 91, 388−404.
(17) Turner, A. P. Chem. Soc. Rev. 2013, 42, 3184−3196.
(18) Brereton, R. G.; Jansen, J.; Lopes, J.; Marini, F.; Pomerantsev, A.; Rodionova, O.; Roger, J. M.; Walczak, B.; Tauler, R. Anal. Bioanal. Chem. 2017, 409, 5891−5899.
(19) Jalali-Heravi, M.; Arrastia, M.; Gomez, F. A. Anal. Chem. 2015, 87, 3544−3555.
(20) Martynko, E.; Kirsanov, D. Biosensors 2020, 10, 100.
(21) Díaz-Cruz, J. M.; Esteban, M.; Ariño, C. Chemometrics in Electroanalysis; Springer, 2019.
(22) Wold, S. Chemom. Intell. Lab. Syst. 1995, 30, 109−115.
(23) Brereton, R. G. Chemometrics: Data Analysis for the Laboratory and Chemical Plant; Wiley, 2003.
(24) Leardi, R. Anal. Chim. Acta 2009, 652, 161−172.
(25) Wold, S.; Albano, C.; Dunn, W. J.; Edlund, U.; Esbensen, K.; Geladi, P.; Hellberg, S.; Johansson, E.; Lindberg, W.; Sjöström, M. In Chemometrics; Springer Netherlands: Dordrecht, The Netherlands, 1984; pp 17−95.
(26) Geladi, P.; Esbensen, K. J. Chemom. 1990, 4, 337−354.
(27) Esbensen, K.; Geladi, P. J. Chemom. 1990, 4, 389−412.
(28) Kiralj, R.; Ferreira, M. M. C. J. Chemom. 2006, 20, 247−272.
(29) Chen, W. L. J. Chem. Inf. Model. 2006, 46, 2230−2255.
(30) Gasteiger, J. In Handbook of Chemoinformatics; Wiley-VCH Verlag GmbH, 2008; pp 3−5.
(31) Gasteiger, J. Mol. Inf. 2014, 33, 454−457.
(32) Oprea, T. I. Chemoinformatics in Drug Discovery; Oprea, T. I., Ed.; Methods and Principles in Medicinal Chemistry; Wiley-VCH Verlag GmbH & Co. KGaA: Weinheim, Germany, 2005.
(33) Applied Chemoinformatics; Engel, T., Gasteiger, J., Eds.; Wiley-VCH Verlag GmbH & Co. KGaA: Weinheim, Germany, 2018.
(34) Lavine, B. K.; Workman, J. Chemometrics: Past, Present, and Future. In Chemometrics and Chemoinformatics; ACS Symposium Series, Vol. 894; American Chemical Society, 2005; pp 1−13.
(35) Lavine, B. K.; Workman, J. Anal. Chem. 2013, 85, 705−714.
(36) Box, J. F. Am. Stat. 1980, 34, 1−7.
(37) Trygg, J.; Wold, S. Introduction to Statistical Experimental Design - What is it? Why and Where is it Useful?, 1996; https://www.win.tue.nl/~adibucch/6BV04/tutorialTryggWold.pdf.
(38) L. Eriksson, E.; Johansson; Wold, N. K.; Wikstrom, C.; Wold, S. Design of Experiments, Principles and Applications; Carlson, R., Ed.; Umetrics AB, Umea Learnways AB: Stockholm, Sweden, 2001; Vol. 15.
(39) Lanati, A.; Poli, C.; Imberti, M.; Menegon, A.; Grohovaz, F. A design of experiment approach to optimize an image analysis protocol for drug screening. In Mathematical Models in Biology; Springer: Cham, Switzerland, 2015; pp 65−84.
(40) Box, G. E. P.; Hunter, J. S.; Hunter, W. G. Statistics for Experimenters: Design, Innovation, and Discovery, 1st ed.; Wiley, 1978.
(41) de Aguiar, P. F.; Bourguignon, B.; Khots, M. S.; Massart, D. L.; Phan-Than-Luu, R. Chemom. Intell. Lab. Syst. 1995, 30, 199−210.
(42) Smilde, A.; Bro, R.; Geladi, P. Multi-Way Analysis with Applications in the Chemical Sciences; Wiley: Chichester, U.K., 2004.
(43) Trygg, J.; Holmes, E.; Lundstedt, T. J. Proteome Res. 2007, 6, 469−479.
(44) Faber, N. M.; Rajkó, R. Anal. Chim. Acta 2007, 595, 98−106.
(45) Wold, S.; Sjöström, M.; Eriksson, L. Chemom. Intell. Lab. Syst. 2001, 58, 109−130.
(46) Barker, M.; Rayens, W. J. Chemom. 2003, 17, 166−173.
(47) Camacho, J.; Saccenti, E. J. Chemom. 2018, 32, e2964.


(48) Lê Cao, K.-A.; Rossouw, D.; Robert-Granié, C.; Besse, P. Stat. Appl. Genet. Mol. Biol. 2008, 7, 35.
(49) Trygg, J.; Wold, S. J. Chemom. 2002, 16, 119−128.
(50) De Luca, S.; Bucci, R.; Magrì, A. D.; Marini, F. In Encyclopedia of Analytical Chemistry: Applications, Theory and Instrumentation; Wiley, 2006; pp 1−24.
(51) Calvani, R.; Marini, F.; Cesari, M.; Tosato, M.; Anker, S. D.; von Haehling, S.; Miller, R. R.; Bernabei, R.; Landi, F.; Marzetti, E. J. Cachexia Sarcopenia Muscle 2015, 6, 278−286.
(52) Cramer, R. D.; Redl, G.; Berkoff, C. E. J. Med. Chem. 1974, 17, 533−535.
(53) McFarland, J. W.; Gains, D. J. In Comprehensive Medicinal Chemistry; Ramsden, C. A., Ed.; Pergamon Press: New York, 1990; pp 667−689.
(54) Rumelhart, D. E.; Hinton, G. E.; Williams, R. J. Nature 1986, 323, 533−536.
(55) Schneider, G.; Wrede, P. Prog. Biophys. Mol. Biol. 1998, 70, 175−222.
(56) Zupan, J.; Gasteiger, J. Neural Networks in Chemistry and Drug Design, 2nd ed.; John Wiley & Sons, Inc.: New York, 1999.
(57) Salzberg, S. L. Mach. Learn. 1994, 16, 235−240.
(58) Berk, R. A. In Statistical Learning from a Regression Perspective; Springer New York: New York, 2008; pp 1−65.
(59) Saeh, J. C.; Lyne, P. D.; Takasaki, B. K.; Cosgrove, D. A. J. Chem. Inf. Model. 2005, 45, 1122−1133.
(60) Hann, M. M.; Leach, A. R.; Harper, G. J. Chem. Inf. Comput. Sci. 2001, 41, 856−864.
(61) Diabetes Care 2010, 33 Suppl 1, S4−S10.
(62) McDermott, J. E.; Wang, J.; Mitchell, H.; Webb-Robertson, B.-J.; Hafen, R.; Ramey, J.; Rodland, K. D. Expert Opin. Med. Diagn. 2013, 7, 37−51.
(63) Canfield, R. E.; O’Connor, J. F.; Birken, S.; Krichevsky, A.; Wilcox, A. J. Environ. Health Perspect. 1987, 74, 57−66.
(64) Friedman, L. S.; Ostermeyer, E. A.; Lynch, E. D.; Szabo, C. I.; Anderson, L. A.; Dowd, P.; Lee, M. K.; Rowell, S. E.; Boyd, J.; King, M. C. Cancer Res. 1994, 54, 6374−6382.
(65) Zhang, Z.; Yu, Y.; Xu, F.; Berchuck, A.; van Haaften-Day, C.; Havrilesky, L. J.; de Bruijn, H. W. A.; van der Zee, A. G. J.; Woolas, R. P.; Jacobs, I. J.; et al. Gynecol. Oncol. 2007, 107, 526−531.
(66) Robotti, E.; Manfredi, M.; Marengo, E. J. Proteomics Bioinform. 2014, S3, 003.
(67) Zhang, Z.; Barnhill, S. D.; Zhang, H.; Xu, F.; Yu, Y.; Jacobs, I.; Woolas, R. P.; Berchuck, A.; Madyastha, K. R.; Bast, R. C., Jr Gynecol. Oncol. 1999, 73, 56−61.
(68) Buas, M. F.; Gu, H.; Djukovic, D.; Zhu, J.; Drescher, C. W.; Urban, N.; Raftery, D.; Li, C. I. Gynecol. Oncol. 2016, 140, 138−144.
(69) Zhang, Z.; Chan, D. W. Cancer Epidemiol., Biomarkers Prev. 2010, 19, 2995−2999.
(70) Koeman, M.; Engel, J.; Jansen, J.; Buydens, L. Sci. Rep. 2019, 9, 1123.
(71) Smilde, A. K.; Jansen, J. J.; Hoefsloot, H. C.; Lamers, R. J. A.; Van Der Greef, J.; Timmerman, M. E. Bioinformatics 2005, 21, 3043−3048.
(72) Harrington, P. D. B.; Vieira, N. E.; Espinoza, J.; Nien, J. K.; Romero, R.; Yergey, A. L. Anal. Chim. Acta 2005, 544, 118−127.
(73) Hendriks, M. M. W. B.; Eeuwijk, F. A. va.; Jellema, R. H.; Westerhuis, J. A.; Reijmers, T. H.; Hoefsloot, H. C. J.; Smilde, A. K. TrAC, Trends Anal. Chem. 2011, 30, 1685−1698.
(74) Kvalheim, O. M.; Arneberg, R.; Bleie, O.; Rajalahti, T.; Smilde, A. K.; Westerhuis, J. A. J. Chemom. 2014, 28, 615−622.
(75) Szymańska, E.; Saccenti, E.; Smilde, A. K.; Westerhuis, J. A. Metabolomics 2012, 8, 3−16.
(76) Leardi, R. Anal. Chim. Acta 2009, 652, 161−172.
(77) Ferreira, S. L.; Lemos, V. A.; de Carvalho, V. S.; da Silva, E. G.; Queiroz, A. F.; Felix, C. S.; da Silva, D. L. F.; Dourado, G. B.; Oliveira, R. V. Microchem. J. 2018, 140, 176−182.
(78) Bezerra, M. A.; Santelli, R. E.; Oliveira, E. P.; Villar, L. S.; Escaleira, L. A. Talanta 2008, 76, 965−977.
(79) Vera Candioti, L.; De Zan, M. M.; Camara, M. S.; Goicoechea, H. C. Talanta 2014, 124, 123−138.
(80) Vander Heyden, Y.; Nijhuis, A.; Smeyers-Verbeke, J.; Vandeginste, B. G. M.; Massart, D. L. J. Pharm. Biomed. Anal. 2001, 24, 723−753.
(81) Avoundjian, A.; Jalali-Heravi, M.; Gomez, F. A. Anal. Bioanal. Chem. 2017, 409, 2697−2703.
(82) Pierini, G. D.; Pistonesi, M. F.; Di Nezio, M. S.; Centurión, M. E. Microchem. J. 2016, 125, 266−272.
(83) Hamedpour, V.; Leardi, R.; Suzuki, K.; Citterio, D. Analyst 2018, 143, 2102−2108.
(84) NIST/SEMATECH e-Handbook of Statistical Methods, http://www.itl.nist.gov/div898/handbook/ (Accessed Dec 30, 2020).
(85) Ferreira, S. L. C.; Bruns, R. E.; Ferreira, H. S.; Matos, G. D.; David, J. M.; Brandao, G. C.; da Silva, E. G. P.; Portugal, L. A.; dos Reis, P. S.; Souza, A. S.; dos Santos, W. N. L. Anal. Chim. Acta 2007, 597, 179−186.
(86) Hamedpour, V.; Postma, G. J.; van Den Heuvel, E.; Jansen, J. J.; Suzuki, K.; Citterio, D. Anal. Bioanal. Chem. 2018, 410, 2305−2313.
(87) Lundstedt, T.; Seifert, E.; Abramo, L.; Thelin, B.; Nyström, Å.; Pettersen, J.; Bergman, R. Chemom. Intell. Lab. Syst. 1998, 42, 3−40.
(88) Ni, Y.; Kokot, S. Anal. Chim. Acta 2008, 626, 130−146.
(89) Jayawardane, B. M.; Wei, S.; McKelvie, I. D.; Kolev, S. D. Anal. Chem. 2014, 86, 7274−7279.
(90) Forina, M.; Oliveri, P.; Casale, M.; Lanteri, S. Anal. Chim. Acta 2008, 622, 85−93.
(91) Biancolillo, A.; Marini, F. Front. Chem. 2018, 6, 576.
(92) Di Natale, C.; Paolesse, R.; Macagnano, A.; Mantini, A.; D’Amico, A.; Legin, A.; Lvova, L.; Rudnitskaya, A.; Vlasov, Y. Sens. Actuators, B 2000, 64, 15−21.
(93) Wasilewski, T.; Migoń, D.; Gebicki, J.; Kamysz, W. Anal. Chim. Acta 2019, 1077, 14−29.
(94) Christodoulides, N.; Floriano, P. N.; Miller, C. S.; Ebersole, J. L.; Mohanty, S.; Dharshan, P.; Griffin, M.; Lennart, A.; Ballard, K. L. M.; King, C. P., Jr; et al. Ann. N. Y. Acad. Sci. 2007, 1098, 411−428.
(95) Helfer, G. A.; Tischer, B.; Filoda, P. F.; Parckert, A. B.; dos Santos, R. B.; Vinciguerra, L. L.; Ferrão, M. F.; Barin, J. S.; da Costa, A. B. Food Anal. Methods 2018, 11, 2022−2028.
(96) Mishra, R. K.; Alonso, G. A.; Istamboulie, G.; Bhand, S.; Marty, J. L. Sens. Actuators, B 2015, 208, 228−237.
(97) Scampicchio, M.; Mannino, S.; Zima, J.; Wang, J. Electroanalysis 2005, 17, 1215−1221.
(98) Bevilacqua, M.; Bro, R.; Marini, F.; Rinnan, Å.; Rasmussen, M. A.; Skov, T. TrAC, Trends Anal. Chem. 2017, 96, 42−51.
(99) Camacho, J.; Picó, J.; Ferrer, A. J. Chemom. 2008, 22, 299−308.
(100) Shintu, L.; Baudoin, R.; Navratil, V.; Prot, J. M.; Pontoizeau, C.; Defernez, M.; Blaise, B. J.; Domange, C.; Péry, A. R.; Toulhoat, P.; et al. Anal. Chem. 2012, 84, 1840−1848.
(101) Rojas, J.; Fontana Tachon, A.; Chevalier, D.; Noguer, T.; Marty, J. L.; Ghommidh, Ch. Sens. Actuators, B 2004, 102, 284−290.
(102) Jiménez-Carvelo, A. M.; Salloum-Llergo, K. D.; Cuadros-Rodríguez, L.; Capitán-Vallvey, L. F.; Fernández-Ramos, M. D. Microchem. J. 2020, 157, 104930.
(103) Abbasitabar, F.; Zare-Shahabadi, V.; Shamsipur, M.; Akhond, M. Sens. Actuators, B 2011, 156, 181−186.
(104) Marini, F.; de Beer, D.; Joubert, E.; Walczak, B. J. Chromatogr. A 2015, 1405, 94−102.
(105) van Der Leeden, R. Qual. Quant. 1998, 32, 15−29.
(106) Engel, J.; Blanchet, L.; Engelke, U. F.; Wevers, R. A.; Buydens, L. M. PLoS One 2014, 9, e92452.
(107) Alonso-Lomillo, M. A.; Dominguez-Renedo, O.; Ferreira-Goncalves, L.; Arcos-Martinez, M. J. Biosens. Bioelectron. 2010, 25, 1333−1337.
(108) Hamedpour, V.; Leardi, R.; Suzuki, K.; Citterio, D. Analyst 2018, 143, 2102−2108.
(109) Asadollahi-Baboli, M.; Mani-Varnosfaderani, A. Measurement 2014, 47, 145−149.
(110) Risoluti, R.; Gregori, A.; Schiavone, S.; Materazzi, S. Anal. Chem. 2018, 90, 4288−4292.


(111) Valero-Navarro, A.; Damiani, P. C.; Fernández-Sánchez, J. F.; Segura-Carretero, A.; Fernández-Gutiérrez, A. Talanta 2009, 78, 57−65.
(112) Khulal, U.; Zhao, J.; Hu, W.; Chen, Q. RSC Adv. 2016, 6, 4663−4672.
(113) González-Calabuig, A.; Guerrero, D.; Serrano, N.; del Valle, M. Electroanalysis 2016, 28, 663−670.
(114) Hamedpour, V.; Sasaki, Y.; Zhang, Z.; Kubota, R.; Minami, T. Anal. Chem. 2019, 91, 13627−13632.
(115) Turner, N. W.; Cauchi, M.; Piletska, E. V.; Preston, C.; Piletsky, S. A. Biosens. Bioelectron. 2009, 24, 3322−3328.
(116) Pérez-Ràfols, C.; Serrano, N.; Díaz-Cruz, J. M.; Ariño, C.; Esteban, M. Sens. Actuators, B 2017, 250, 393−401.
(117) Wong, S. F.; Low, K. H.; Khor, S. M. Talanta 2020, 218, 121169.
(118) Bueno, L.; Meloni, G. N.; Reddy, S. M.; Paixao, T. R. RSC Adv. 2015, 5, 20148−20154.
(119) Mwila, K.; Burton, M. H.; Van Dyk, J. S.; Pletschke, B. I. Environ. Monit. Assess. 2013, 185, 2315−2327.
(120) Qi, J.; Yang, P.; Min, G.; Amft, O.; Dong, F.; Xu, L. Pervasive Mob. Comput. 2017, 41, 132−149.
(121) Eisenstein, M. Nature 2015, 527, S2−S4.
(122) Seife, C. Nature 2015, 518, 480−481.
(123) Radin, J. M.; Wineinger, N. E.; Topol, E. J.; Steinhubl, S. R. Lancet Digit. Heal. 2020, 2, e85−e93.
(124) Gartner Glossary, https://www.gartner.com/en/information-technology/glossary/big-data (accessed 2020-10-02).
(125) Camacho, J.; Rodríguez-Gómez, R. A.; Saccenti, E. J. Comput. Graph. Stat. 2017, 26, 501−512.
(126) Van Mechelen, I.; Smilde, A. K. Chemom. Intell. Lab. Syst. 2010, 104, 83−94.
(127) Forshed, J.; Idborg, H.; Jacobsson, S. P. Chemom. Intell. Lab. Syst. 2007, 85, 102−109.
(128) Biancolillo, A.; Boqué, R.; Cocchi, M.; Marini, F. In Data Fusion Methodology and Applications, Vol. 31; Cocchi, M., Ed.; Elsevier, 2019; pp 271−310.
(129) Silvestri, M.; Elia, A.; Bertelli, D.; Salvatore, E.; Durante, C.; Li Vigni, M.; Marchetti, A.; Cocchi, M. Chemom. Intell. Lab. Syst. 2014, 137, 181−189.
(130) Vera, L.; Aceña, L.; Guasch, J.; Boqué, R.; Mestres, M.; Busto, O. Talanta 2011, 87, 136−142.
(131) Rodriguez-Martin, D.; Sama, A.; Perez-Lopez, C.; Catala, A.; Cabestany, J.; Rodriguez-Molinero, A. Expert Syst. Appl. 2013, 40, 7203−7211.
(132) Smolinska, A.; Blanchet, L.; Coulier, L.; Ampt, K. A. M.; Luider, T.; Hintzen, R. Q.; Wijmenga, S. S.; Buydens, L. M. C. PLoS One 2012, 7, e38163.
(133) King, R. C.; Villeneuve, E.; White, R. J.; Sherratt, R. S.; Holderbaum, W.; Harwin, W. S. Med. Eng. Phys. 2017, 42, 1−12.
