Milk Analysis Using Milk Analyzers in A Standardiz
Clinical Nutrition
journal homepage: http://www.elsevier.com/locate/clnu
Original article
Article info

Article history:
Received 23 March 2019
Accepted 27 August 2019

Keywords: Breast milk; Good clinical laboratory practice; Nutrition; Preterm infants; Infrared spectroscopy; Macronutrient

Summary

Background: Human milk analyzers are increasingly used to rapidly measure the macronutrient content in breast milk for individual target fortification, to reduce the risk of postnatal growth restriction. However, many milk analyzers are used without calibration, validation or quality assurance.

Aims: To investigate measurement quality between different human milk analyzers, to test whether accuracy and precision of devices can be improved by establishing individual calibration curves, and to assess long-term stability of measurements, following good clinical laboratory practice (GCLP).

Methods: Sets of identical breast milk samples were sent to 13 participating centres in North America and Europe, for a total of 15 devices. The study included 3 sets of samples: A) initial assessment of the device's performance consisting of 10 calibration samples with random replicates; B) long term stability and quality control consisting of 2 batches of samples to be measured every time before the device is used, over 6 months; C) ring trial consisting of 2 samples to be measured monthly. The devices tested were Unity SpectraStar (n = 5) and MIRIS Human Milk Analyzer (n = 10).

Results: There are significant variations in accuracy and precision between different milk analyzers' fat, protein and lactose measurements. However, the accuracy of measurements can be improved by establishing individual correction algorithms. Repeated measurements are more robust when coming from a larger batch volume. Long term stability also varies between devices.

Conclusion: The variations in measurements between devices are clinically significant and would impact both daily dietary prescriptions, and the outcomes of clinical studies assessing the effect of targeted adjustment of nutrient intake in preterm babies. This study shows that it is crucial to follow GCLP when using milk analyzers to ensure proper measurement of macronutrients, similar to what is required of other medical devices.

© 2019 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Abbreviations: IR, infrared; GCLP, good clinical laboratory practice; QC, quality control; CV, coefficient of variation; CHO, carbohydrates.
* Corresponding author. Department of Pediatrics, McMaster University Room 4F5, 1280 Main Street West, Hamilton, Ontario, Canada, L8S 4K1. Fax: +1 905 521 5007.
E-mail address: [email protected] (C. Fusch).
1 The collaborators of the MAMAS Study are listed in Appendix A.
https://doi.org/10.1016/j.clnu.2019.08.028
0261-5614/© 2019 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
C. Kwan et al. / Clinical Nutrition 39 (2020) 2121–2128

1. Introduction

Human milk provides the best basis for enteral nutrition in neonates [1,2]. However, despite its many benefits, macronutrient content in native breast milk is insufficient for very low birth weight infants and needs to be fortified to meet their caloric and nutritional needs [2–5]. Recent research suggests that not only protein content, but also protein-to-energy and carbohydrate-to-fat ratios determine the rate and composition of growth [6]. However, breast milk macronutrient content shows great variability; the content varies between mothers, within the same mother, through the lactation period, during the same day, and even during feeds [5,7–10]. This variability may lead to unfavourable dietary intake, putting appropriate postnatal growth at risk. Indeed, postnatal growth retardation is still observed in more than 50% of preterm babies fed standard fortified mother's milk [11]. The current practice to fortify breast milk with standard amounts of fat, protein and carbohydrates may, therefore, not be
adequate to reduce this variability in macronutrient content and meet the nutritional needs of all preterm infants [12].

To overcome this variability, infrared (IR) human milk analyzers are starting to be used in clinical settings. They serve to rapidly measure the macronutrient content in breast milk prior to supplementation with commercially available fortifiers. Such measurements may provide the basis to individually target fortify breast milk with modular fat, carbohydrates and protein [12]. However, currently, it is reported that these devices were mostly used without calibration, validation and quality assurance [13]. Standards for good clinical laboratory practice (GCLP) were not applied in many studies [13]. Results by human milk analyzers can thus be charged with significant errors in protein and fat determination (combination of analytical and pre-analytical errors) that can be unknown to the user [14,15].

An under- or overestimation of macronutrient levels can lead to meaningful differences in growth rates [16]. For example, a systematic error in protein determination of 0.3 g/dL would lead to a difference in intake of 0.5 g of protein/kg/d (assuming a milk intake of 150 mL/kg/day). This difference in protein intake will lead to a difference in growth rates of 3–4 g/kg/d [16]. A similar situation would occur if there are errors in fat measurement. An error in fat measurement of 0.7 g/dL can lead to a difference in intake of 1 g of fat/kg/d. This difference translates into a difference of 9 cal/kg/d, which can also significantly jeopardize the protein-to-energy ratio, facilitate amino acid oxidation and have a significant impact on the rate of lean mass accretion. It is therefore important to introduce GCLP when using human milk analyzers to avoid introducing measurement errors that can affect daily dietary prescriptions and the outcome of clinical studies [16].

Hence, we have launched the MAMAS (Milk Analysis using Milk Analyzers in a Standardized setting) study, a multicentre quality initiative, to set up a Quality Control system for the calibration and validation of milk analyzers, following GCLP. Specifically, we would like to compare the variation between milk analyzers at different centres and ensure their quality in an international multicentre study, by (i) investigating the differences in measurements between different devices, (ii) testing whether the accuracy and precision of the devices can be improved by establishing individual calibration curves and (iii) assessing the long-term stability of the measurements after setting up a Quality Control system, following GCLP guidelines for calibration and validation of milk analyzers in a laboratory or clinical setting.

2. Material and methods

2.1. Study design

In this international multicentre validation study, 17 centres with a known interest in target fortification of breast milk were approached. Amongst these centres, four were unable to participate due to internal organizational issues. Therefore, in this study, 13 centres with a total of 15 devices from the following participating countries were included: Canada (n = 4), the United States (n = 3), France, Germany, Poland, Switzerland, Austria, and Sweden (all n = 1). The devices tested were the SpectraStar by Unity Scientific (Brookfield, Connecticut, USA) (n = 5) and the Human Milk Analyzer by MIRIS (Uppsala, Sweden) (n = 10).

This study consisted of three parts. The first part was an initial assessment of the different devices, where each device analyzed 10 calibration samples with random replicates. The results were also used to generate a correction algorithm for each individual device, using the inverse of the regression equation from the 10 calibration samples. This type of correction algorithm has been previously tested and validated for human milk analyzers [14,17]. This correction algorithm was applied to all measurements in parts two and three of the study.

The second part of the study assessed long term stability using quality control (QC) samples. Each device received two batches of QC samples (“QC high” and “QC low”), sufficient to be measured every time before the device was used, over a period of six months.

The third part of the study was a ring trial consisting of 2 samples to be measured monthly by each device, for a period of five months.

This study was approved by the Hamilton Integrated Research Ethics Board (HIREB #10-555-T), and has also received local ethics approval at all participating centres. This study followed the Standards for Quality Improvement Reporting Excellence [18].

2.2. Sample preparation

Breast milk samples were collected from mothers with excess milk who stayed in the NICU at McMaster Children's Hospital. All mothers had provided informed written consent prior to their breast milk donation. Samples from different lactation periods were collected in order to cover a wide range of macronutrient content. Collected samples were stored in the freezer at −20 °C until analysis.

A total amount of 8.9 L of breast milk was used to prepare samples for all three parts of the study. For the first part of the study, 2 L of breast milk was used to prepare 10 batches of calibration samples. For each batch, breast milk samples (V = 200 mL) from the same mother and similar lactation stage were thawed at 24 °C in a thermostatic water bath. It was previously shown that freezing and thawing do not impact the precision of the bedside determination of macronutrient content [14]. Subsequently, these samples were homogenized for 1.5 s/mL using an ultrasonic sonicator (VCX 130; Sonics and Materials Inc, Newtown, CT, USA), and then underwent Holder pasteurization. Samples were then pooled together and stirred for 20 min. While still stirring, samples were aliquoted into a set of 60 × 1.5 mL samples and a set of 20 × 4.5 mL samples. This pasteurization and pooling process was repeated to prepare a total of 10 batches of breast milk samples with different macronutrient content. For each device, the participating centre received 4 samples from each batch (3 × 1.5 mL and 1 × 4.5 mL from each of the ten batches), for a total of 40 samples to measure: 30 × 1.5 mL to be measured once, 10 × 4.5 mL to be measured in triplicates (in fractions of 1.5 mL). The sample labels of the 40 samples were randomly attributed so that replicates from the same batch were not measured consecutively and to ensure that samples were measured in the same order across centres. The two different volumes of samples (1.5 mL and 4.5 mL set-up) were chosen to investigate the impact of pre-analytical and sample handling errors of small volumes on the macronutrient measurements.

For the second part of the study, 6 L of breast milk were used to prepare 2 batches of different QC samples (QC high and QC low). For each batch, 3 L of breast milk were thawed, homogenized, and pasteurized like in the first part of the study. Samples were then pooled together and stirred for 20 min. While still stirring, samples were aliquoted into 1.5 mL samples. Over a period of six months, participants were asked to measure one sample from each QC batch every time before they used their device to measure milk samples.

For the ring trial (third part of this study), 900 mL of breast milk was used to prepare 10 batches of different samples. For each batch, 90 mL of breast milk were thawed, homogenized, and pasteurized like in the first part of the study. Samples were then pooled and stirred for 20 min. While still stirring, samples were aliquoted into 60 × 1.5 mL samples. For each device, the participating centre received 3 samples from each batch (3 × 1.5 mL), for a total of 30 ring trial samples. Each month, participants were asked to measure
the samples from 2 batches, over a period of five months. Participants either measured all three triplicates from each batch, or just a single measurement, according to their usual local practice; they were asked to execute their routine milk analysis.

After the samples were prepared for each part of the study, they were stored in the freezer at −20 °C and shipped to participating centres on dry ice.

2.3. Sample analysis – infrared human milk analyzers

Once arrived at the participating centre, samples were stored in the freezer at −20 °C until analysis. For the first part of the study, participants were asked to analyze the samples within 14 days of receiving them. Prior to analysis, samples were treated and handled at each site as they would be typically, following their local practice. For instance, at McMaster University, samples were taken out of the freezer and warmed at 37 °C using a water bath until thawed. Samples were then homogenized using a sonicator; the 1.5 mL samples were homogenized for 15 s each and the 4.5 mL samples were homogenized for 1 min (VCX 130; Sonics and Materials Inc, Newtown, CT, USA).

After homogenization (if applied at the centre), each sample was analyzed, using a human milk analyzer. Each of the 1.5 mL samples was measured once. For the 4.5 mL sample, measurements were done in triplicate; users of the SpectraStar had to pipette ≈1.5 mL out of the vial for each measurement, and MIRIS users injected from the loaded syringe three volumes of 1.5 mL each, according to their local practices. All data were transferred to McMaster University for analysis.

2.4. Sample analysis – chemical reference methods

At McMaster University, a set of samples from all three parts of the study were measured using chemical reference methods that were previously validated [14,19]. These methods consist of a modified ether Mojonnier fat extraction to measure fat, elemental analysis (EA) to measure protein, and ultra-performance liquid-chromatography tandem mass spectrometry (UPLC MS/MS) to measure lactose. All measurements by milk analyzers were compared to the chemical reference values to investigate the accuracy of the measurements.

An independent validation for fat and true protein was done by sending a set of initial assessment calibration samples to a certified and accredited reference lab (Valacta, Sainte-Anne-de-Bellevue, QC, Canada) [20]. Lactose validation analysis could not be done because the accredited reference method (IDF 198:2007) does not apply to milk that is either fermented or contains significant amounts of oligosaccharides.

2.5. Statistical analysis

The graphs in this paper were plotted using Microsoft Excel® 2010 and 2018 (Microsoft, Redmond, Washington, USA) and Prism version 6® (GraphPad Software Inc, La Jolla, California, USA). Outliers have been removed from this paper's figures. Outliers were defined as follows: for part 1 of the study, data lying outside the 95% confidence intervals around the regression line; for part 2 of the study, data that were 2 standard deviations above or below a device's average measurement for each particular QC.

In part 3 (ring trial) of the study, the number of devices that achieved readings within ±5% and ±10% of the reference fat values, and within ±10% and ±20% of the reference protein values, were reported respectively. These cutoffs were chosen because of their potential impact on calorie intake and weight gain, as suggested by other working groups [21]. For protein, a measurement error of ±10% or ±20% translates into an error of ±0.1 to ±0.2 g of protein per dL (assuming an average content of 1 g/dL); this leads to a difference in intake of 0.15–0.3 g of protein (assuming a milk intake of 150 mL/kg/day). This difference in protein intake would translate into a weight gain difference of 1–2 g/kg/day, which might be clinically meaningful. Similarly for fat, a measurement error of ±10% would translate into an error of approximately ±0.5 g of fat per dL (assuming an average content of 4.5 g/dL). This would lead to a difference in intake of 0.7–0.8 g of fat (assuming a milk intake of 150 mL/kg/day), equivalent to 6–8 kcal/kg/day, at which point the impact on growth starts becoming clinically important.

3. Results

Supplemental Fig. 1 shows that chemical analysis for fat and protein correlates strongly with the reference methods (Valacta). In protein data analysis, one value was removed from all further analysis because the chemical analysis value deviated more than 3 SD.

3.1. Part 1 – initial assessment

Fifteen devices returned measurements for part 1 of the study (Supplemental Fig. 2). Figure 1 shows the performance of the 15 devices for all three macronutrients. Among different devices, there are significant variations in fat, protein and carbohydrate measurements, independent of the type of device that was used. For protein measurements, the range of variation exceeds 1 g/dL. Carbohydrate measurements using IR correlate poorly with lactose measurements using the reference method. Because of this poor correlation, correction algorithms were not developed and applied to the device's carbohydrates/lactose measurements; as a consequence, carbohydrates/lactose data cannot be corrected in the subsequent parts of this study. All repeated measurements for fat and protein are shown for each individual device in Supplemental Figs. 3 and 4. In these figures, individual devices are numbered from 1 to 15 and this numbering is identically assigned for all subsequent figures. Triplicate measurements of 1.5 mL samples taken from one larger batch volume (4.5 mL) are more robust than taken from multiple smaller batch volumes (3 × 1.5 mL).

Precision analysis shows large differences between devices. Overall, the coefficient of variation (CV) ranges from 0.7 to 16.2% for fat, 1.5–31.6% for protein, and 0.5–9.3% for lactose among all devices. The average CV (all centres) is larger for the 1.5 mL samples volume set-up compared to the 4.5 mL, across all 3 macronutrients. The corresponding values are: 5.6% for the 1.5 mL sample and 3.5% for the 4.5 mL sample set-up (fat), 12.7% for the 1.5 mL and 6.8% for the 4.5 mL sample set-up (protein), 2.6% for the 1.5 mL and 2.2% for the 4.5 mL sample set-up (lactose/carbohydrates).

3.2. Part 2 – efficacy of correction algorithm and long term quality control

Fourteen devices returned data for part 2 of the study (Supplemental Fig. 2); one device did not contribute to this part of the study because it was not used for routine measurements. Figure 2 and Supplemental Fig. 5 show each device's QC data, including both uncorrected and corrected measurements. For correction, the initial calibration results of the 4.5 mL sample set-up were used because it had lower variation compared to 1.5 mL sample set-up.

In Fig. 2 and Supplemental Fig. 5, results are grouped and colour-coded according to the performance. Green: device measured accurately on initial assessment and correction was not needed; Yellow: device did not measure accurately on initial assessment
Fig. 1. Correlation of uncorrected data for fat, protein and carbohydrate (CHO) content from 15 bedside devices versus reference values. Each device is represented by its own colour
consistent in each panel. Panels A, B, and D show mean of fat, protein, and CHO contents obtained from three repeated measurements each drawn from a 1.5 mL sample volume set-
up. Panel C shows mean of three fat measurements obtained from one 4.5 mL sample volume set-up. (For interpretation of the references to colour in this figure legend, the reader is
referred to the Web version of this article.)
Fig. 2. Overview of the individual performance of 15 devices for one set of QC samples (QC high; fat and protein) and effect of applying the correction algorithm. Grey horizontal
line shows the chemical reference value. The results before (panel A) and after applying the correction algorithm (panel B) are shown. Results are grouped and colour-coded
according to the performance. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)
and the correction was able to improve the accuracy of the measurements; Red: device did not measure accurately on initial assessment and the correction was unable to improve the accuracy of the measurements. If a centre received a new device during the study, only QC data from the device that completed an initial calibration (and thus have a corresponding correction algorithm) are shown in the figure. Centre #2 received a new replacement device during the study, so measurements are separated into 2 graphs (Fig. 2a and b; one graph for each device), because the second device had completed a separate initial assessment (although not shown in part 1 of this study).

Overall, Fig. 2 shows that correction of the devices using the initial calibration data significantly improved the accuracy for fat and protein measurements for most of the devices. This correction reduced systematic errors, but not the random variation. Some devices did not require a correction since the uncorrected
measurements were already accurate, and some devices did not show improvement in accuracy after correction.

Figure 3 and Supplemental Figs. 6–10 show the performance of the devices over time. There is significant systematic and random variation between and within devices. The long-term stability of devices also varies between centres; some are more accurate and precise over time than others. There are also differences in the long-term trends of each device; some seem to be increasing over time, while others seem to be decreasing over time. Device 2b started using sonication as of day 27, which is reflected in a change in the data – sonication has improved the accuracy and decreased the random variation.

Supplemental Fig. 11 shows the overview of the performance of all 15 devices for high fat QC, including various parameters: the accuracy of the uncorrected QC data, accuracy of the initial calibration assessment, whether there are differences between the 1.5 mL and 4.5 mL sample set-up calibration, and the noise (imprecision) of the following: QC data, initial calibration using the 1.5 mL sample set-up, initial calibration using the 4.5 mL sample set-up. Supplemental Fig. 11 shows that there are some centres that perform well in all parameters and do not require a correction in their measurements, and other centres whose measurements improved in quality after applying the correction algorithm. Some centres with a large noise/imprecision level did not see an improvement in their measurement quality after applying the correction algorithm.

3.3. Part 3 – ring trial

Fifteen devices returned data for the ring trial (Supplemental Fig. 2). For the ring trial, fat, protein and carbohydrate measurements vary significantly between devices, independent of the device used. Figure 4 shows the ring trial fat and protein data from all devices, including both uncorrected and corrected data. If a centre had their device replaced or modified and subsequently saw a shift in their measurements as shown in part 2 of this study, only measurements from the first device that completed the initial assessment of the study were included. One sample (ring trial sample #4) has been removed from fat data analysis because none of the devices measured close to the chemical value, which suggests that the result from chemical analysis for this sample may be inaccurate.
Fig. 3. Long term stability of fat measurements (QC low), ordered by the degree of inter-day variation. Presented are uncorrected data (indicated by the black diamonds). The
horizontal line indicates the reference value. The dotted vertical line marks centres that received a new replacement device during the study or had the device's equations modified;
this line separates the periods before and after device modification.
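The long-term QC evaluation behind Fig. 3 can be illustrated with a minimal Python sketch. It assumes that inter-day variation is expressed as a coefficient of variation and applies the part 2 outlier rule from Section 2.5 (readings more than 2 standard deviations from the device's own mean). The function name `qc_summary`, the readings and the reference value are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def qc_summary(readings, reference):
    """Summarise repeated QC readings (g/dL) from one device for one QC batch."""
    avg = mean(readings)
    sd = stdev(readings)  # sample standard deviation
    # Part 2 outlier rule (Section 2.5): more than 2 SD above or below
    # the device's own average for this QC batch.
    outliers = [x for x in readings if abs(x - avg) > 2 * sd]
    return {
        "mean": avg,
        "cv_percent": 100 * sd / avg,                          # random variation
        "bias_percent": 100 * (avg - reference) / reference,  # systematic offset
        "outliers": outliers,
    }


# Hypothetical fat readings for one QC batch over ten measurement days.
readings = [2.9, 3.0, 3.1, 3.0, 2.9, 3.1, 3.0, 3.0, 3.0, 4.0]
summary = qc_summary(readings, reference=3.0)
print(summary)
```

Keeping the CV (random variation) and the bias (systematic offset) as separate numbers mirrors the distinction the paper draws between the two error types: only the latter can be addressed by a correction algorithm.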
Fig. 4. Effect of applying the correction algorithm on the measurement accuracy for ring trial samples. Left panel: fat, right panel: protein.
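The ring trial evaluation summarised in Fig. 4 counts how many devices read within a fixed percentage of the chemical reference value (±5% and ±10% for fat, ±10% and ±20% for protein, as stated in Section 2.5). A minimal sketch of that count is given below; the function names and all readings are hypothetical and only illustrate the calculation.

```python
def within_tolerance(reading, reference, tolerance_percent):
    """True if a reading lies within +/- tolerance_percent of the reference."""
    return abs(reading - reference) <= reference * tolerance_percent / 100


def percent_devices_within(readings, reference, tolerance_percent):
    """Share of devices (in %) whose reading meets the tolerance."""
    hits = sum(within_tolerance(r, reference, tolerance_percent) for r in readings)
    return 100 * hits / len(readings)


# Hypothetical fat readings (g/dL) from five devices for one ring trial sample.
reference_fat = 4.0
uncorrected = [3.2, 3.9, 4.5, 4.1, 4.8]
corrected = [3.7, 4.0, 4.3, 4.1, 4.5]

before = percent_devices_within(uncorrected, reference_fat, 10)
after = percent_devices_within(corrected, reference_fat, 10)
print(before, after)
```

With these made-up numbers the share of devices within ±10% rises from 40% to 80%; the study itself reports the change as a difference in percentage points across all participating devices.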
Correction of the ring trial data improves the accuracy of the measurements (Fig. 4). Overall, the percentage of devices measuring within 10% of the chemical value for fat has increased by 21% after applying the correction algorithm. For protein, the percentage of devices measuring within 20% of the chemical value has increased by 9% after applying the correction algorithm. These percentages further increased after removing data from devices with large random variation in their measurements (data not shown). These results also show that ring trials as part of GCLP are feasible.

4. Discussion

This is the first trial to compare human milk analyzer measurements in a multicentre setting. We found that between devices, there was a large variation in accuracy and precision, as well as in long term stability. The order of magnitude of these errors was different for fat and protein analysis. Repeated measurements from a single larger batch volume (4.5 mL) were more robust than repeated measurements from multiple smaller batch volumes (3 × 1.5 mL). It was further found that the accuracy of fat and protein measurements could be improved by establishing individual correction algorithms, and once validated the long-term results were quite robust. Ring trials in milk analyzers seem to be feasible and useful. The data of this study confirm once again that lactose cannot be measured using infrared human milk analyzers; the reasons for this limitation and its implications for bedside measurements have been discussed elsewhere [14].

The observed variations in macronutrient analysis between centres are clinically significant and would impact the outcomes of clinical interventions and trials on the effect of targeted adjustment of nutrient intake in human milk fed babies. Measurement errors of 1 g of protein per dL will lead to differences in protein intake by 1.6 g/kg/day, which would result in differences in growth rates of 8–10 g/kg/day. Such differences in growth rates are clinically meaningful [22]. Moreover, there is not only a risk of not providing sufficient nutrient intake, but there is also a risk that these measurement errors could lead to severe overfortification (e.g. 5.5 instead of 4.0 g of protein/kg/day), leading to overgrowth and unfavourable fat mass composition.

These observed variations result from a combination of random and systematic errors. Random errors would lead to significant scattering of the data, which could be due to the operator's performance. Systematic errors would introduce a systematic offset in the data and could be due to the performance of the device, like inaccurately pre-set calibration, or due to the performance of the operator, such as certain kinds of pre-analytical errors or improper sample handling. Inconsistent homogenization of the milk samples prior to analysis is a source of error with sample handling [14]. On one hand, homogenization is important to ensure that the sample measured is representative of the milk batch because it reduces the loss of fat adhering to sample vials [14]. On the other hand, duration and intensity of homogenization affect the size and distribution of the fat globules, and subsequently the readout of the fat signal [14]. Because of the mathematical cross-talk between molecule absorptions inherent in IR absorptiometry applied to complex fluids/emulsions, the quality of homogenization also affects the precision of protein measurements. It has been shown that focussing and standardizing homogenization improves the quality of fat and protein measurements [14,23,24].

To control for these errors and to reduce the variation in measurements using human milk analyzers, it is crucial to introduce GCLP, similar to what is required for other medical analytical devices used to base clinical decisions on [25]. All device operators must be trained, as this will improve sample handling and reduce errors caused by the operator. If there is enough sample volume, samples should be measured in duplicate or in triplicate to control for errors. QC samples and ring trials will also identify the need for a device review or training if the performance is not sufficient. For instance, daily QC and QC logs can identify malfunction of the device, inter-day variation, and long-term shift [16,25]. Recently, MIRIS reacted to the ongoing discussion of measurement imprecision and introduced a calibration control kit in 2018 for human milk analysis, in line with the GCLP concept. However, there are currently no published data on the precision of this kit. In short, GCLP, such as QC samples and ring trials, must be introduced to bedside milk analysis to avoid confusing results in milk intervention studies and clinical routine, and to ensure reliable and stable results.

To reduce the systematic analytical error of a device, centres can also adjust their device either with the manufacturer or by applying an individual correction algorithm as used in this study. It is important to note that this correction algorithm is device-specific, as shown by the different regression lines from the initial validation part of the study. Although not shown in the results section, this study found that when a centre gets a new device, applying the correction algorithm from their old device leads to erroneous results, even if both devices are from the same manufacturer. It is also worth noting that recalibration is necessary not only if the device is replaced with a different one, but also if the same device receives a software upgrade. Therefore, each device needs its own calibration, validation and correction algorithm.
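As a rough illustration of such a device-specific correction, the sketch below fits a regression of device readings against chemical reference values and inverts it, following the description in Section 2.1 ("the inverse of the regression equation"). It assumes a simple linear least-squares fit, which the text does not spell out, and all function names and numbers are hypothetical. A small helper also converts a concentration error into an intake error at the milk intake of 150 mL/kg/day assumed in the paper.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def make_correction(reference_values, device_readings):
    """Build a device-specific correction by inverting the calibration line.

    Assumption: device reading = intercept + slope * reference value.
    """
    intercept, slope = fit_line(reference_values, device_readings)

    def correct(reading):
        return (reading - intercept) / slope

    return correct


def intake_error(conc_error_g_per_dl, milk_intake_ml_per_kg_day=150):
    """Intake error (g/kg/day) caused by a concentration error (g/dL)."""
    return conc_error_g_per_dl * milk_intake_ml_per_kg_day / 100


# Hypothetical calibration set: this device reads 0.2 + 1.1 * true protein (g/dL).
reference = [0.8, 1.0, 1.2, 1.5, 2.0]
readings = [0.2 + 1.1 * r for r in reference]

correct = make_correction(reference, readings)
print(correct(1.3))       # a raw reading of 1.3 g/dL maps back to about 1.0 g/dL
print(intake_error(0.3))  # about 0.45 g/kg/day, rounded to 0.5 in the Introduction
```

Because the fitted intercept and slope belong to one device, reusing `correct` for a replacement device or after a software upgrade would apply the wrong line, which is consistent with the recommendation above to recalibrate each device individually. A correction of this form removes a linear systematic offset only and leaves random scatter unchanged.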
C. Kwan et al. / Clinical Nutrition 39 (2020) 2121e2128 2127
However, our study showed that the correction algorithm was not always successful. There were four main outcomes after applying the correction algorithm to the QC results: 1) a correction was not necessary and did not significantly change the results, since the device already showed high accuracy in both the initial calibration and the QC measurements; 2) a correction successfully improved the accuracy of the measurements; 3) a correction did not improve the accuracy of the measurements because the calibration and QC measurements were inconsistent (e.g. the calibration measured too high, but the QC measured too low); and 4) a correction did not work despite the calibration and QC data being consistent.

Many factors can affect the efficacy of the correction algorithm on the QC results, such as the imprecision of the initial calibration, the imprecision of the long-term QC measurements, and the consistency between the initial calibration and the QC measurements. For the correction algorithm to work successfully, both the calibration and the QC measurements need to be robust. It is also important to note that the correction algorithm will only correct for systematic errors of the device, and not random errors, such as certain kinds of preanalytical sample handling errors, since these are not distorted in a linear fashion. Hence, sample handling (such as homogenization) is another factor that affects the efficacy of the correction algorithm, and sample handling must be adequate for the correction to improve the accuracy of the measurements. If the correction algorithm does not improve the accuracy of the measurements, we recommend repeating the initial calibration and then the QC measurements, as well as working on reducing sample handling errors.

Another finding from this study is the impact of volume on measurement quality. We have shown that samples drawn from a larger sample volume (4.5 mL) generally show less random variation than samples drawn from a smaller volume (1.5 mL), especially among MIRIS users. This difference between the two sample volume set-ups can be explained by sample handling errors, such as the introduction of air bubbles, and by proper homogenization, which may be more problematic at the smaller volume. However, it is of interest to note that some centres were still able to obtain accurate and precise results when using the smaller sample volume set-up (1.5 mL). It can therefore be speculated that optimal training and sample handling would also allow for precise measurements when only smaller volumes of milk are available.

Finally, the study did not specifically compare the two types of instruments (SpectraStar and MIRIS), since the number of SpectraStar devices was too small (n = 5). It is therefore difficult to draw conclusions. However, in our personal experience, both devices provide comparably accurate and precise results if GCLP is followed.

5. Conclusion

This large multicentre study revealed that there are significant variations in accuracy and precision between different milk analyzers used for bedside measurements of fat, protein and lactose. Repeated measurements are more robust when coming from a larger batch volume, indicating a source of preanalytical error such as sample handling. The long-term stability also varies. Our study also shows that the accuracy of measurements can be improved by establishing individual correction algorithms. Hence, it is crucial to follow GCLP when using human milk analyzers. This should include training of the operator, proper sample handling, a robust calibration and validation, and daily QC for precision and accuracy, in order to ensure proper measurement of the macronutrients, similar to what is required of other medical devices used for clinical decision-making.

Statement of authorship

Celia Kwan: conceptualization, methodology, formal analysis, investigation, data curation, writing, project administration. Gerhard Fusch: conceptualization, methodology, formal analysis, investigation, data curation, writing, project administration. Niels Rochow: conceptualization, formal analysis. Christoph Fusch: conceptualization, methodology, supervision, writing, funding acquisition, project administration. MAMAS Study collaborators: investigation, resources.

Conflict of interest

All authors declare no conflict of interest.

Research funding

The study is funded by the Canadian Institutes of Health Research (CIHR) and the Natural Sciences and Engineering Research Council (NSERC).

Appendix A

The collaborators of the MAMAS Study are the following:

1. Dept. Pediatrics, McMaster University, Hamilton, Canada: C. Kwan, G. Fusch, N. Rochow, S. el-Helou, C. Fusch
2. BWH, Boston, USA: M. Belfort
3. NorthernStar Mothers Milk Bank, Calgary, Canada: J. Festival
4. Texas Children's Hospital, USA: A. Hair
5. CHRU Nancy, France: J.-M. Hascoet
6. Klinikum Neukölln, Berlin, Germany: T. Kuehn
7. Uppsala, Sweden: MIRIS
8. Inselspital, Bern, Switzerland: M. Nelle
9. SickKids, Toronto, Canada: D. O'Connor
10. Victoria General Hospital, BC, Canada: G. Pelligra
11. Cincinnati Children's Hospital, USA: B. Poindexter, T. Fu
12. Medical University of Graz, Austria: B. Urlesberger
13. Warsaw Medical University, Warsaw, Poland: A. Wesolowska, O. Barbarska

Appendix B. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.clnu.2019.08.028.

References

[1] American Academy of Pediatrics Committee on Nutrition. Nutritional needs of low-birth-weight infants. Pediatrics 1985;75(5):976–86.
[2] Reali A, Greco F, Fanaro S, Atzei A, Puddu M, Moi M, et al. Fortification of maternal milk for very low birth weight (VLBW) pre-term neonates. Early Hum Dev 2010;86(Suppl. 1):33–6.
[3] Maggio L, Costa S, Gallini F. Human milk fortifiers in very low birth weight infants. Early Hum Dev 2009;85(Suppl. 10):S59–61.
[4] Rochow N, Fusch G, Zapanta B, Ali A, Barui S, Fusch C. Target fortification of breast milk: how often should milk analysis be done? Nutrients 2015;7(4):2297–310.
[5] Fusch G, Mitra S, Rochow N, Fusch C. Target fortification of breast milk: levels of fat, protein or lactose are not related. Acta Paediatr 2015;104(1):38–42.
[6] Kashyap S, Ohira-Kist K, Abildskov K, Towers HM, Sahni R, Ramakrishnan R, et al. Effects of quality of energy intake on growth and metabolic response of enterally fed low-birth-weight infants. Pediatr Res 2001;50(3):390–7.
[7] Dritsakou K, Liosis G, Valsami G, Polychronopoulos E, Skouroliakou M. The impact of maternal- and neonatal-associated factors on human milk's macronutrients and energy. J Matern Fetal Neonatal Med 2017;30(11):1302–8.
[8] de Halleux V, Rigo J. Variability in human milk composition: benefit of individualized fortification in very-low-birth-weight infants. Am J Clin Nutr 2013;98(2):529S–35S.
[9] Rigourd V. [Mid-infrared spectrometric analysis to evaluate nutritional content of human milk bank]. Arch Pediatr 2010;17(6):772–3.
[10] Radmacher PG, Lewis SL, Adamkin DH. Individualizing fortification of human milk using real time human milk analysis. J Neonatal Perinatal Med 2013;6(4):319–23.
[11] Rochow N, Landau-Crangle E, Fusch C. Challenges in breast milk fortification for preterm infants. Curr Opin Clin Nutr Metab Care 2015;18(3):276–84.
[12] Rochow N, Fusch G, Choi A, Chessell L, Elliott L, McDonald K, et al. Target fortification of breast milk with fat, protein, and carbohydrates for preterm infants. J Pediatr 2013;163(4):1001–7.
[13] Fusch G, Kwan C, Kotrri G, Fusch C. "Bed side" human milk analysis in the neonatal intensive care unit: a systematic review. Clin Perinatol 2017;44(1):209–67.
[14] Fusch G, Rochow N, Choi A, Fusch S, Poeschl S, Ubah AO, et al. Rapid measurement of macronutrients in breast milk: how reliable are infrared milk analyzers? Clin Nutr 2015;34(3):465–76.
[15] Kwan C, Fusch G, Bahonjic A, Rochow N, Fusch C. Infrared analyzers for breast milk analysis: fat levels can influence the accuracy of protein measurements. Clin Chem Lab Med 2017;55(12):1931–5.
[16] Fusch G, Kwan C, Huang RC, Rochow N, Fusch C. Need of quality control program when using near-infrared human milk analyzers. Acta Paediatr 2016 Mar;105(3):324–5.
[17] Kotrri G, Fusch G, Kwan C, Choi D, Choi A, Al Kafi N, et al. Validation of correction algorithms for near-IR analysis of human milk in an independent sample set – effect of pasteurization. Nutrients 2016;8(3):119.
[18] Ogrinc G, Davies L, Goodman D, Batalden P, Davidoff F, Stevens D. SQUIRE 2.0 (Standards for QUality Improvement Reporting Excellence): revised publication guidelines from a detailed consensus process. BMJ Qual Saf 2016;25(12):986–92.
[19] Choi A, Fusch G, Rochow N, Sheikh N, Fusch C. Establishment of micromethods for macronutrient contents analysis in breast milk. Matern Child Nutr 2015 Oct;11(4):761–72.
[20] Valacta. Valacta reference/calibration laboratory [cited 2018 July 2]. 2018. Available from: http://www.valacta.com/english/laboratoire_reference.html.
[21] Sauer CW, Boutin MA, Kim JH. Wide variability in caloric density of expressed human milk can lead to major underestimation or overestimation of nutrient content. J Hum Lact 2017;33(2):341–50.
[22] Embleton ND. Optimal protein and energy intakes in preterm infants. Early Hum Dev 2007;83(12):831–7.
[23] Aernouts B, Polshin E, Saeys W, Lammertyn J. Mid-infrared spectrometry of milk for dairy metabolomics: a comparison of two sampling techniques and effect of homogenization. Anal Chim Acta 2011;705(1–2):88–97.
[24] Di Marzo L, Barbano DM. Effect of homogenizer performance on accuracy and repeatability of mid-infrared predicted values for major milk components. J Dairy Sci 2016;99(12):9471–82.
[25] Ezzelle J, Rodriguez-Chavez IR, Darden JM, Stirewalt M, Kunwar N, Hitchcock R, et al. Guidelines on good clinical laboratory practice: bridging operations between research and clinical research laboratories. J Pharm Biomed Anal 2008;46(1):18–29.