Chapter 1
Statistics in Food Science and Nutrition
Abstract Food and nutrition is not limited to cuisine, culture, and healthy living; it
is also filled with the joy of data and statistics. Major innovations in statistics have
come about from applied problems in food science. “The lady tasting tea” experi-
ment and the relationship between Guinness and the t-test are classic examples.
A fundamental knowledge of statistical analysis and study design is more important
than ever, especially for investigating the relationship between food habits, health,
and lifestyle.
Perhaps you enjoy reading food labels in the supermarket to learn the content of
protein, fat, and carbohydrates and other nutritional facts. Consumers are also
interested in expiration dates and in sales statistics on different food groups. The
subject of food and nutrition is not all about tradition, culture, healthy living, and
cuisine. It is also filled with the joy of analyzing data, whether they are collected
from existing databases, gathered by interviewing consumers, or produced by
yourself in the laboratory.
Quantitative research involves the collection of relevant data and their analysis.
Scientific reasoning, theory, and conclusions depend largely on the interpretation
of data. Food science and nutrition is filled with the assessment of data from physi-
cal, microbial, chemical, sensory, and commercial analysis. In such an interdisci-
plinary area filled with quantitative data, statistics is at the heart of food and
nutrition research.
Fig. 1.1 William Sealy Gosset (left) (reproduced from Annals of Human Genetics, 1939: 1-9,
with permission from John Wiley and Sons) and Ronald A. Fisher (right) are both regarded as
being among the foremost founding fathers of modern statistics. Their methodological innovations
in statistics were developed from agricultural and food scientific applications (see, e.g., Box 1978;
Plackett and Bernard 1990)
From a very applied point of view, statistics can be divided into two subfields –
descriptive statistics and statistical testing. Basic descriptive statistics are usually
encountered long before one enters the field of professional food and nutrition
research. The idioms “a picture is worth a thousand words” and “not seeing the
forest for the trees” summarize what descriptive statistics is all about: it
facilitates the interpretation of data and results. Typical examples are bar charts,
trend lines, and scatter plots. Its application and usefulness are obvious also
beyond the scientific (and statistical) community.
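As a minimal sketch of how a handful of numbers can summarize a data set, the following Python example computes basic descriptive statistics. The protein values are hypothetical and invented purely for illustration; they do not come from any real food label.

```python
import statistics

# Hypothetical protein content (g per 100 g) read from ten food labels.
protein_g = [3.4, 8.1, 12.5, 7.9, 21.0, 2.2, 9.6, 10.3, 6.8, 8.2]

summary = {
    "n": len(protein_g),
    "mean": statistics.mean(protein_g),
    "median": statistics.median(protein_g),
    "sd": statistics.stdev(protein_g),  # sample standard deviation (divides by n - 1)
    "min": min(protein_g),
    "max": max(protein_g),
}

# Print one summary measure per line.
for name, value in summary.items():
    print(f"{name:>6}: {value:.2f}")
```

The mean and median differ here because one product has a much higher protein content than the rest, which is the kind of pattern a plot would reveal at a glance.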
Statistical testing or inference, on the other hand, may not always be so intui-
tive, and its usefulness may be questioned outside the scientific community. It
involves such expressions as significance, p-values, confidence intervals, and
probability distributions and is founded on mathematics. One might have heard
comments like “There are three kinds of lies: lies, damned lies, and statistics.”
Mathematics and accusations of lying aside, consider the following questions: Are
your data due to a “real” effect or just a coincidence? Would you obtain the same
results if you repeated the measurements, study, or experiment? What if the exper-
iment were repeated and those data gave a different conclusion? If the results are
due to coincidence or randomness, the results do not reflect a “real” effect.
Statistical testing based on mathematical techniques says something about the
probability that the outcome is not just a coincidence and about what the “real”
effect is. Mathematics is the basic tool, but its application and interpretation
depend on the scientific knowledge in your field of research. This is indeed the
case as one moves from simple statistical tests to more complex statistical models
known as analysis of variance, regression analysis, general linear model, logistic
regression, and generalized linear models. These methods are in fact closely related
to each other, even though they have very different names.
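The question of whether a result is "real" or just a coincidence can be illustrated with a permutation test, sketched below in Python. This is one simple way to approach the question, not a method prescribed by the text. The sensory scores, the function name `permutation_p_value`, and its parameters are hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=1):
    """Approximate two-sided permutation test for a difference in means.

    The group labels are shuffled many times; the p-value estimates how often
    chance alone gives a difference at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        # The small tolerance guards against floating-point ties.
        if diff >= observed - 1e-12:
            extreme += 1
    # Adding one to both counts keeps the estimate above zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical sensory scores for two recipes (illustrative values only).
recipe_a = [6.1, 5.8, 6.4, 6.0, 6.3, 5.9, 6.2, 6.5]
recipe_b = [5.2, 5.5, 5.0, 5.4, 5.1, 5.6, 5.3, 4.9]

p_value = permutation_p_value(recipe_a, recipe_b)
print(f"approximate p-value: {p_value:.4f}")
```

A small p-value means that random relabeling rarely reproduces a difference as large as the observed one, so coincidence is an unlikely explanation.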
1.2 Historical Anecdotes Relating Statistics to Food Science and Nutrition
What do drinking a pint of Guinness and a cup of tea with milk have in common?
They have triggered fundamental innovations in statistical theory from applied
problems in food science.
William Sealy Gosset, a chemistry graduate from Oxford, took up a job at Arthur
Guinness, Son & Co, Ltd, in Dublin, Ireland. His task was to apply scientific meth-
ods to beer processing. To brew a perfect beer, exact amounts of yeast had to be
mixed with the continuously fermenting barley. The amount of yeast was quantified
by counting colonies. Gosset’s challenge was to develop a method to quantify the
amount of yeast colonies in entire jars of brewing beer based on small samples taken
from the jars. This applied problem in food technology triggered an innovation in
mathematics and statistics. He presented a report to the Guinness Board titled “The
Application of the ‘Law of Error’ to the Work of the Brewery.” Guinness Breweries
did not allow its scientists to publish articles, but Gosset was granted permission
provided he used a pseudonym and did not reveal any confidential data (Plackett
and Bernard 1990; Raju 2005). Two papers were published in the statistical journal
Biometrika under the pseudonym “Student” entitled “On the error of counting with
a hemacytometer” (Student 1907) and “The probable error of a mean” (Student
1908). The last paper is a classic describing the test that became the well-known
(Student’s) t-test. Gosset continued to write statistical papers based on applied prob-
lems encountered in the brewery. Student’s t-test and the t-distribution are used
extensively in all types of applied statistics, but their applied origin is in food sci-
ence and technology.
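The one-sample form of the t-test can be sketched as follows. The statistic is the difference between the sample mean and a reference value, divided by the standard error of the mean. The yeast counts and the reference value of 100 are hypothetical, and the critical value 2.262 is the tabulated two-sided 5% value for 9 degrees of freedom.

```python
import math
import statistics


def one_sample_t(values, mu0):
    """Return the t statistic and degrees of freedom for H0: mean == mu0."""
    n = len(values)
    sample_mean = statistics.mean(values)
    sample_sd = statistics.stdev(values)       # divides by n - 1
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - mu0) / standard_error, n - 1


# Hypothetical yeast cell counts from ten small samples (illustrative only).
counts = [102, 98, 110, 105, 95, 108, 101, 99, 107, 104]

t_stat, df = one_sample_t(counts, mu0=100)
T_CRITICAL_DF9 = 2.262  # tabulated two-sided 5% critical value for df = 9

print(f"t = {t_stat:.3f} with {df} degrees of freedom")
print("significant at 5%" if abs(t_stat) > T_CRITICAL_DF9 else "not significant at 5%")
```

With small samples the t-distribution has heavier tails than the normal distribution, which is why the critical value is larger than the familiar 1.96.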
The “lady tasting tea” is the famous anecdote of an encounter with a woman by
R.A. Fisher – one of the founding fathers of modern statistics and experimental
design (Box 1978; Sturdivant 2000). His principles on statistical analysis and espe-
cially randomized designs are used in such experimental fields as agriculture, food
science, and medicine. The title refers to the investigation of an English lady’s claim
that she could tell whether milk was poured into a cup first and then tea or first the
tea and then milk. This is the account of what had a major impact on modern statis-
tics according to Box (1978).
Already, quite soon after he (i.e. R. A. Fisher) had come to Rothamsted, his presence had
transformed one commonplace tea time to an historic event. It happened one afternoon
when he drew a cup of tea from the urn and offered it to the lady beside him, Dr. B. Muriel
Bristol, an algologist. She declined it, stating that she preferred a cup into which the milk
had been poured first. “Nonsense,” returned Fisher, smiling, “Surely it makes no differ-
ence.” But she maintained, with emphasis, that of course it did. From just behind, a voice
suggested, “Let’s test her.” It was William Roach who was not long afterward to marry Miss
Bristol. Immediately, they embarked on the preliminaries of the experiment, Roach assist-
ing with the cups and exulting that Miss Bristol divined correctly more than enough of those
cups into which tea had been poured first to prove her case.
Regardless of whether or not the experiment was actually run, the statistical test
developed to analyze a cross-table from this small sample of tea cups and the principles
of randomly assigning samples of tea with milk poured first or after the tea are in
common use. The mathematical-statistical test is called “Fisher’s exact test,” and the
principle of randomization is fundamental in experimental studies and perhaps nowa-
days most closely linked to medical sciences and randomized clinical trials in drug
development. However, its important origin is, again, linked to food science and
technology.
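The logic of Fisher's exact test for the tea-tasting experiment can be sketched with a short calculation. The sketch assumes the classic design of eight cups, four with milk poured first, with the lady asked to pick out the four milk-first cups. The passage above does not state the number of cups, so these figures are an assumption. Under pure guessing, the number of correctly identified cups follows a hypergeometric distribution.

```python
from math import comb


def prob_exactly_correct(k, cups=8, milk_first=4):
    """Probability of picking exactly k true milk-first cups by guessing."""
    return (
        comb(milk_first, k)
        * comb(cups - milk_first, milk_first - k)
        / comb(cups, milk_first)
    )


def one_sided_p_value(k_observed, cups=8, milk_first=4):
    """Probability of k_observed or more correct picks under pure guessing."""
    return sum(
        prob_exactly_correct(k, cups, milk_first)
        for k in range(k_observed, milk_first + 1)
    )


for k in range(5):
    print(f"{k} correct: P = {prob_exactly_correct(k):.4f}, "
          f"P(at least {k}) = {one_sided_p_value(k):.4f}")
```

Under these assumptions, only a perfect score gives a probability below the conventional 5% level (1 in 70). Three correct out of four could easily arise by guessing.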
R.A. Fisher’s work on experimental data from crop cultivation at the Rothamsted
Experimental Station in Hertfordshire, England, led to numerous innovations
in modern statistics. He pioneered the principles of randomization and design of
experiments and developed statistical methods such as analysis of variance and the
foundations of maximum-likelihood estimation. Many of these innovations were
presented in his 1925 book Statistical Methods for Research Workers and the later
book The Design of Experiments, published in 1935. Both books are regarded as
reference texts in statistical science. Again, the applied problems that triggered
these important innovations in experimental and statistical science were rooted in
agriculture and closely linked to food science and technology.
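The principle of randomization can be sketched in a few lines: treatments are assigned to experimental units by chance, so that no systematic preference of the experimenter influences the comparison. The plot names, treatment labels, and the function `randomize_treatments` below are hypothetical and serve only as an illustration of a completely randomized design with equal replication.

```python
import random


def randomize_treatments(units, treatments, seed=None):
    """Randomly assign treatments to units with equal replication."""
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    replicates = len(units) // len(treatments)
    # Build the full list of labels, then shuffle it.
    labels = [t for t in treatments for _ in range(replicates)]
    rng = random.Random(seed)
    rng.shuffle(labels)
    return dict(zip(units, labels))


# Hypothetical field plots and treatments (illustrative names only).
plots = [f"plot_{i}" for i in range(1, 13)]
design = randomize_treatments(
    plots, ["control", "fertilizer_A", "fertilizer_B"], seed=42
)

for plot, treatment in design.items():
    print(plot, "->", treatment)
```

Fixing the seed makes the allocation reproducible, which is useful for documentation. In a real study the seed would be chosen before any data are seen.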
Food scientists encounter many types of data. Consumers report their preferences,
sensory panels give scores on selected scales, laboratories conduct chemical and
microbial analyses, and companies set specific targets on production costs and sales.
New ingredients may improve the technological properties in foodstuffs, but how
should one conduct a study to test if it is worth changing the production process?
How does one take into account other factors that might have an effect? How many
samples should one test – 5, 10, 35, 178, or even more? Does it matter if data are
collected from the same sample repeatedly over time compared to a “fresh” sample
for each data point? All these questions matter when designing experimental studies
and when choosing appropriate statistical methods to describe and analyze the data.
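The question of how many samples to test can be approached with a standard sample-size formula for comparing two group means. It is shown here as a rough sketch using the normal approximation, and it is not taken from the text. The function name and its parameters (`delta` for the smallest difference worth detecting, `sd` for the assumed standard deviation) are illustrative choices.

```python
import math
from statistics import NormalDist


def samples_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided comparison of means.

    Uses n = 2 * ((z_(1 - alpha/2) + z_power) * sd / delta) ** 2,
    rounded up to the next whole sample.
    """
    if delta == 0:
        raise ValueError("delta must be non-zero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * sd / delta) ** 2
    return math.ceil(n)


# Detecting a difference of one standard deviation needs far fewer samples
# than detecting a difference of half a standard deviation.
print(samples_per_group(delta=1.0, sd=1.0))
print(samples_per_group(delta=0.5, sd=1.0))
```

Halving the difference to be detected roughly quadruples the required number of samples. This is why the answer to "5, 10, 35, 178, or even more?" depends on the size of the effect one hopes to find.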
Epidemiology is the study of the distribution and patterns of health events,
health characteristics, and their causes or influences on well-defined populations.
It is the cornerstone method of public health research and identifies risk factors for
disease and targets for preventive medicine (see, e.g., Rothman et al. 2008). A bal-
anced and appropriate diet has since ancient times been related to health and the
prevention of disease. Food and nutritional epidemiology as a scientific discipline
has attracted increased interest in recent years (Michels 2003). A large number of
observational studies have attempted to elucidate the role of diet in health and
disease. Because diet is strongly related to other aspects of life, and because of the
long exposure times and the practical as well as ethical obstacles to assigning
subjects to specific diets, randomized intervention studies are not as central in
nutritional research as, for instance, in drug development. The effect of fats on the
risk of coronary heart disease, the proportion of carbohydrates in one’s diet and its
effect on body mass index, and the onset of diabetes are all well-known examples of
areas of concern in food epidemiology. A complicating issue is that individuals
who try to eat a healthy diet are likely to lead a healthy lifestyle in general. This
and other complexities of food epidemiology are partly responsible for tabloid
headlines about which types of food are “good” or “bad.” It is therefore more
important than ever to have a basic knowledge of epidemiological principles and of
statistical methods to describe and analyze data from nutritional studies.
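The complication that healthy eaters tend to lead healthier lives in general can be made concrete with a small calculation. In the sketch below, all probabilities are invented for illustration. Diet is given no effect on disease at all, yet the crude comparison still makes the healthy diet look protective because lifestyle influences both diet and disease.

```python
# Assumed (hypothetical) probabilities; none of these come from real studies.
P_HEALTHY_LIFESTYLE = 0.5
P_HEALTHY_DIET_GIVEN_LIFESTYLE = {True: 0.8, False: 0.2}
# Disease risk depends on lifestyle only: diet has no effect in this model.
P_DISEASE_GIVEN_LIFESTYLE = {True: 0.10, False: 0.30}


def crude_risk(healthy_diet):
    """Disease risk among people with the given diet, ignoring lifestyle."""
    diseased = 0.0
    total = 0.0
    for lifestyle in (True, False):
        p_lifestyle = P_HEALTHY_LIFESTYLE if lifestyle else 1 - P_HEALTHY_LIFESTYLE
        p_diet = P_HEALTHY_DIET_GIVEN_LIFESTYLE[lifestyle]
        if not healthy_diet:
            p_diet = 1 - p_diet
        joint = p_lifestyle * p_diet
        diseased += joint * P_DISEASE_GIVEN_LIFESTYLE[lifestyle]
        total += joint
    return diseased / total


risk_healthy_diet = crude_risk(True)
risk_other_diet = crude_risk(False)

print(f"crude risk, healthy diet: {risk_healthy_diet:.2f}")
print(f"crude risk, other diet:   {risk_other_diet:.2f}")
print("within each lifestyle group the risk is identical for both diets")
```

Comparing diets within each lifestyle group (stratification) removes the apparent effect in this toy model. This is the kind of reasoning that epidemiological methods formalize.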
References
Abstract The best way to learn statistics is by taking courses and working with
data. Some books may also be helpful. A first step in applied statistics is usually to
describe and summarize data using estimates and descriptive plots. The principle
behind p-values and statistical inference in general is covered with a schematic
overview of statistical tests and models.
How does one learn statistics, epidemiology, and experimental design? The
recommended approach is, of course, to take (university) courses and combine them
with applied use. In the same way that it takes considerable effort and time to become
trained in food technology or chemistry or as a physician, learning statistics – both
the mathematical theory and applied use – takes time and effort. Courses or books
that promise to teach statistics without requiring much time and that neglect the
fundamental aspects of the subject can be deceptive. Learning the technical use of
statistical software without some fundamental knowledge of what the methods
express and of the basic calculations may leave the statistical analysis in a black box.
Appropriate statistical analysis and a robust experimental design should be the
opposite of a black box: they should shed light on the data and give clear insights.
Ideally, it should not be Harry Potter magic!
A comprehensive introduction to statistics and experimental design goes some-
what beyond the scope of this brief text. Therefore, this section will refer the reader to
several excellent textbooks on the subject available from Springer. Readers with
access to a university library service should be able to obtain these texts online through