The document discusses the book 'Statistics in Food Science and Nutrition' by Are Hugo Pripp, which covers the application of statistical methods in food science and nutrition. It emphasizes the importance of understanding statistical principles for analyzing food quality, nutritional epidemiology, and the use of multivariate statistics. The book aims to provide a concise introduction to these topics while encouraging readers to seek further education in statistics and research methodology.
Statistics in Food Science and Nutrition

Are Hugo Pripp
Oslo University Hospital
Oslo, Norway

ISBN 978-1-4614-5009-2 ISBN 978-1-4614-5010-8 (eBook)


DOI 10.1007/978-1-4614-5010-8
Springer New York Heidelberg Dordrecht London

Library of Congress Control Number: 2012945941

© Springer Science+Business Media New York 2013


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology
now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection
with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and
executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this
publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s
location, in its current version, and permission for use must always be obtained from Springer. Permissions
for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to
prosecution under the respective Copyright Law.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of
publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for
any errors or omissions that may be made. The publisher makes no warranty, express or implied, with
respect to the material contained herein.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)


Preface

A brief book will by its nature represent a compromise between methodological
principles in statistics and epidemiology necessary to understand the subject and
issues that are specific to the application of statistics to food science and nutrition.
Therefore, the part on the methods and principles of statistics and epidemiology will
cover these issues in a concise and basic manner. However, food scientists interested
in statistics and epidemiology are urged to take general courses or read one of the
many recommended textbooks in this area. The field of statistics is very large, and
the number of journals and books is, fortunately, growing. For instance,
www.springerlink.com lists 102 books available (as of April 2012) with the word
statistics in the title.
Chapter 2, “Methods and Principles of Statistical Analysis,” provides a concise
introduction to the most important principles, with selected references to many
of the excellent textbooks available. Readers who are unfamiliar with the basics of
these principles are encouraged to consult more comprehensive textbooks or take
courses in statistics, research methodology, and epidemiology.
The remaining chapters will be devoted to specific topics of interest in food sci-
ence and nutrition. A primary goal of food producers is to make foods of excellent
quality. Therefore, Chap. 3 concerns statistics in relation to product quality and
sensory analysis. The relationship between food, lifestyle, and health is more impor-
tant now than ever. The number of nutritional studies is quickly expanding, and their
findings are discussed in the mass media. Health claims (or risks) are being reported
to the public with increasing frequency, but are the claims based on rigorous research
and proper statistical analysis? In this connection, Chap. 4 will focus on nutritional
epidemiology and the health effects of foods. The topic of food and health is becom-
ing ever more vital and closely linked to other lifestyle issues such as malnutrition
or obesity, for example. Proper study design and statistical analysis are therefore of
core importance. Chapter 5 – the last one in this Springer Brief – is titled
“Application of Multivariate Statistics: Benefits and Pitfalls.” Food science and
technology have been closely linked to the innovative use of novel multivariate
statistics. These methods have been shown to have many applications and to confer
numerous benefits in data analysis, and they are used especially in such areas as
spectroscopy, chemometrics, and sensory analysis. However, their complex
statistical-mathematical nature is not without pitfalls. Pretreatment of data and subsequent
interpretation of results may be an issue. Thus, one should have a basic understand-
ing of the methods’ statistical principles before applying them extensively.
With an educational and scientific background in food science and statistics, I
have had an ongoing personal interest in finding areas where the two topics inter-
twine. The many inspiring discussions on study design and statistical analysis with
medical researchers at Oslo University Hospital have also been instrumental to my
understanding of both the possibilities and limitations of statistics and, sometimes,
the beauty of a simple statistical test. I would like to thank Susan Safren and Rita
Beck at Springer New York for their continuous interest and patience in the prepara-
tion of this text. Special thanks go to my parents for always encouraging me.

Norway Are Hugo Pripp


Contents

1 Statistics in Food Science and Nutrition
    1.1 The Food Statistician
    1.2 Historical Anecdotes Relating Statistics to Food Science and Nutrition
    1.3 Why Statistics, Experimental Design, and Epidemiology Matter
    References
2 Methods and Principles of Statistical Analysis
    2.1 Recommended Textbooks on Statistics
        2.1.1 Applied Statistics, Epidemiology, and Experimental Design
        2.1.2 Advanced Text on the Theoretical Foundation in Statistics
    2.2 Describing Data
        2.2.1 Categorical Data
        2.2.2 Numerical Data
        2.2.3 Other Types of Data
    2.3 Summarizing Data
        2.3.1 Contingency Tables (Cross Tabs) for Categorical Data
        2.3.2 The Most Representative Value of Continuous Data
        2.3.3 Spread and Variation of Continuous Data
    2.4 Descriptive Plots
        2.4.1 Bar Chart
        2.4.2 Histograms
        2.4.3 Box Plots
        2.4.4 Scatterplots
        2.4.5 Line Plots
    2.5 Statistical Inference (the p-Value Stuff)
    2.6 Overview of Classical Statistical Tests
    2.7 Overview of Statistical Models
    References
3 Applying Statistics to Food Quality
    3.1 The Concept of Food Quality
    3.2 Measuring Quality Quantitatively
    3.3 Statistical Process Control
        3.3.1 The Foundation of Statistical Process Control
        3.3.2 Control Charts
        3.3.3 The Statistics of Six Sigma
        3.3.4 Multivariate Statistical Process Control
    3.4 Statistical Assessment of Sensory Data
        3.4.1 Methods in Sensory Evaluation
        3.4.2 Statistical Assessment of Differences Between Foods
        3.4.3 Statistical Assessment of Similarities Between Foods
    3.5 Statistical Assessment of Shelf Life
        3.5.1 Shelf Life and Product Quality
        3.5.2 Detection of Shelf Life
        3.5.3 Statistical Assessment of Shelf Life: Food Survival Analysis
    References
4 Nutritional Epidemiology and Health Effects of Foods
    4.1 Food: The Source of Health and Disease
    4.2 Epidemiological Principles and Designs
        4.2.1 Clinical and Epidemiological Research Strategies
        4.2.2 Clinical and Epidemiological Study Designs
    4.3 Methods to Assess Food Intake
    4.4 Epidemiological Use of Multiple Regression Models
        4.4.1 Adjusting for Confounders
        4.4.2 Assessment of Effect Modification (Interaction)
        4.4.3 Intermediate Variables in the Causal Pathway
    References
5 Application of Multivariate Analysis: Benefits and Pitfalls
    5.1 Introduction of Multivariate Statistics in Food Science
    5.2 Principal Component Analysis or Factor Analysis: When and Where
        5.2.1 What Is Principal Component Analysis?
        5.2.2 What Is Factor Analysis?
        5.2.3 Overall Recommendations
    5.3 Exploratory Data Analysis
    5.4 Pattern Recognition and Clustering
    5.5 Modeling and Optimization
    5.6 Limitations with Multivariate Statistical Analysis in Food Science
    References

Index
Chapter 1
Statistics in Food Science and Nutrition

Abstract Food and nutrition is not limited to cuisine, culture, and healthy living; it
is also filled with the joy of data and statistics. Major innovations in statistics have
come about from applied problems in food science. “The lady tasting tea” experi-
ment and the relationship between Guinness and the t-test are classic examples.
A fundamental knowledge of statistical analysis and study design is more important
than ever, especially for investigating the relationship between food habits, health,
and lifestyle.

Keywords William Sealy Gosset • Ronald A. Fisher • Food epidemiology

1.1 The Food Statistician

Perhaps you enjoy reading food labels on packages in the supermarket to know the
content of protein, fat, and carbohydrates and other nutritional facts. Consumers
are interested in sales statistics on different food groups and expiration dates. The
subject of food and nutrition is not all about tradition, culture, healthy living, and
cuisine. It is also filled with the joy of collecting and analyzing data, whether taken
from existing databases, gathered by interviewing consumers, or produced in the laboratory.
Quantitative research involves the collection of relevant data and their analysis.
Scientific reasoning, theory, and conclusions depend largely on the interpretation
of data. Food science and nutrition is filled with the assessment of data from physi-
cal, microbial, chemical, sensory, and commercial analysis. In such an interdisci-
plinary area filled with quantitative data, statistics is at the heart of food and
nutrition research.

A.H. Pripp, Statistics in Food Science and Nutrition, SpringerBriefs in Food, Health, and Nutrition, DOI 10.1007/978-1-4614-5010-8_1, © Springer Science+Business Media New York 2013

Fig. 1.1 William Sealy Gosset (left) (reproduced from Annals of Human Genetics, 1939: 1-9,
with permission from John Wiley and Sons) and Ronald A. Fisher (right) are both regarded as
being among the foremost founding fathers of modern statistics. Their methodological innovations
in statistics were developed from agricultural and food scientific applications (see, e.g., Box 1978;
Plackett and Bernard 1990)

From a very applied point of view, statistics can be divided into two subfields –
descriptive statistics and statistical testing. Basic descriptive statistics are usually
encountered long before one enters the field of professional food and nutrition
research. The idioms “a picture is worth a thousand words” and “not seeing the
forest for the trees” summarize what descriptive statistics is all about: making
data and results easier to interpret. Typical examples are bar charts, trend lines,
and scatter plots. Their application and usefulness are obvious even beyond the
scientific (and statistical) community.
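As a small illustration of descriptive statistics, the following Python sketch summarizes a handful of measurements with the standard library. The fat-content values and the function name `describe` are hypothetical and serve only as an example; they are not taken from the book.

```python
from statistics import mean, median, stdev


def describe(values):
    """Return a few basic descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "sd": stdev(values),  # sample standard deviation (n - 1 in the denominator)
        "min": min(values),
        "max": max(values),
    }


# Hypothetical fat content (%) measured in five milk samples
fat_percent = [3.2, 3.5, 3.1, 3.8, 3.4]
summary = describe(fat_percent)
for name, value in summary.items():
    print(f"{name}: {value}")
```

A summary like this condenses the raw numbers into a typical value and a measure of spread, which is the numerical counterpart of a bar chart or box plot.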
Statistical testing or inference, on the other hand, may not always be so intui-
tive, and its usefulness may be questioned outside the scientific community. It
involves such expressions as significance, p-values, confidence intervals, and
probability distributions and is founded on mathematics. One might have heard
comments like “There are three kinds of lies: lies, damned lies, and statistics.”
The mathematics and accusations of lying aside, consider the following questions: Are
your data due to a “real” effect or just a coincidence? Would you obtain the same
results if you repeated the measurements, study, or experiment? What if the exper-
iment were repeated and those data gave a different conclusion? If the results are
due to coincidence or randomness, the results do not reflect a “real” effect.
Statistical testing based on mathematical techniques says something about the
probability that the outcome is not just a coincidence and about what the “real”
effect is. Mathematics is the basic tool, but its application and interpretation
depend on the scientific knowledge in your field of research. This is indeed the
case as one moves from simple statistical tests to more complex statistical models
known as analysis of variance, regression analysis, general linear model, logistic
regression, and generalized linear models. These models are actually closely related to
each other, even though they have very different names.
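The question of whether a result could be "just a coincidence" can be made concrete with a permutation test. The sketch below is one possible illustration in Python, not a method described in the book: it repeatedly shuffles the group labels and counts how often a difference in means at least as large as the observed one arises by chance. The function name, the data, and the number of permutations are assumptions chosen for the example.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=1):
    """Estimate a two-sided p-value for the difference in means by shuffling labels."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        # Small tolerance guards against floating-point rounding
        if diff >= observed - 1e-12:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical measurements for an old and a new recipe
old_recipe = [10.1, 10.3, 10.2, 10.4, 10.0]
new_recipe = [12.1, 12.3, 12.2, 12.4, 12.0]
p = permutation_p_value(old_recipe, new_recipe)
print(f"Estimated p-value: {p:.4f}")
```

A small p-value means that random relabeling rarely produces a difference as large as the one observed, so coincidence alone is an unlikely explanation.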

1.2 Historical Anecdotes Relating Statistics to Food Science and Nutrition

What do drinking a pint of Guinness and a cup of tea with milk have in common?
They have triggered fundamental innovations in statistical theory from applied
problems in food science.
William Sealy Gosset, a chemistry graduate from Oxford, took up a job at Arthur
Guinness, Son & Co, Ltd, in Dublin, Ireland. His task was to apply scientific meth-
ods to beer processing. To brew a perfect beer, exact amounts of yeast had to be
mixed with the continuously fermenting barley. The amount of yeast was quantified
by counting colonies. Gosset’s challenge was to develop a method to quantify the
number of yeast colonies in entire jars of brewing beer based on small samples taken
from the jars. This applied problem in food technology triggered an innovation in
mathematics and statistics. He presented a report to the Guinness Board titled “The
Application of the ‘Law of Error’ to the Work of the Brewery.” Guinness Breweries
did not allow its scientists to publish articles, but Gosset was granted permission
provided he used a pseudonym and did not reveal any confidential data (Plackett
and Bernard 1990; Raju 2005). Two papers were published in the statistical journal
Biometrika under the pseudonym “Student”: “On the error of counting with
a haemacytometer” (Student 1907) and “The probable error of a mean” (Student
1908). The latter paper is a classic describing the test that became the well-known
(Student’s) t-test. Gosset continued to write statistical papers based on applied prob-
lems encountered in the brewery. Student’s t-test and the t-distribution are used
extensively in all types of applied statistics, but their applied origin is in food sci-
ence and technology.
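To make the idea behind the t-test concrete, the sketch below computes the one-sample t statistic, t = (mean - mu0) / (s / sqrt(n)), for a small sample in Python. The measurements and the target value are hypothetical and are not Gosset's data; the function name is likewise chosen only for this example.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """Return the t statistic and degrees of freedom for H0: population mean == mu0."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # stdev uses n - 1 in the denominator
    t = (mean(sample) - mu0) / standard_error
    return t, n - 1


# Hypothetical small sample compared against a target value of 5.0
sample = [5.1, 4.9, 5.3, 5.2, 5.0]
t_stat, df = one_sample_t(sample, mu0=5.0)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

The statistic is then compared with the t-distribution with n - 1 degrees of freedom, which accounts for the extra uncertainty of estimating the standard deviation from a small sample.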
The “lady tasting tea” is a famous anecdote about an encounter between a woman and
R.A. Fisher – one of the founding fathers of modern statistics and experimental
design (Box 1978; Sturdivant 2000). His principles of statistical analysis and especially
randomized designs are used in such experimental fields as agriculture, food
science, and medicine. The title refers to the investigation of an English lady’s claim
that she could tell whether milk was poured into a cup first and then tea or first the
tea and then milk. This is the account of what had a major impact on modern statis-
tics according to Box (1978).
Already, quite soon after he (i.e. R. A. Fisher) had come to Rothamsted, his presence had
transformed one commonplace tea time to an historic event. It happened one afternoon
when he drew a cup of tea from the urn and offered it to the lady beside him, Dr. B. Muriel
Bristol, an algologist. She declined it, stating that she preferred a cup into which the milk
had been poured first. “Nonsense,” returned Fisher, smiling, “Surely it makes no differ-
ence.” But she maintained, with emphasis, that of course it did. From just behind, a voice
suggested, “Let’s test her.” It was William Roach who was not long afterward to marry Miss
Bristol. Immediately, they embarked on the preliminaries of the experiment, Roach assist-
ing with the cups and exulting that Miss Bristol divined correctly more than enough of those
cups into which tea had been poured first to prove her case.

Regardless of whether or not the experiment was actually run, the statistical test
developed to analyze a cross-table from this small sample of tea cups and the principles
of randomly assigning samples of tea with milk poured first or after the tea are in
common use. The mathematical-statistical test is called “Fisher’s exact test,” and the
principle of randomization is fundamental in experimental studies and perhaps nowa-
days most closely linked to medical sciences and randomized clinical trials in drug
development. However, its important origin is, again, linked to food science and
technology.
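The logic of Fisher's exact test can be sketched for the tea-tasting setting. The text above does not state how many cups were used, so the Python example below assumes the version commonly used in teaching: eight cups, four with milk poured first, and a taster who must pick the four milk-first cups. Under pure guessing, the number of correct picks follows a hypergeometric distribution.

```python
from math import comb


def tea_tasting_p_value(n_milk_first, n_tea_first, n_correct):
    """One-sided p-value: probability of at least n_correct right picks by guessing.

    The taster selects exactly n_milk_first cups as 'milk first'.
    """
    total_ways = comb(n_milk_first + n_tea_first, n_milk_first)
    p = 0.0
    for k in range(n_correct, n_milk_first + 1):
        # k correct picks among the milk-first cups, the rest from the tea-first cups
        ways = comb(n_milk_first, k) * comb(n_tea_first, n_milk_first - k)
        p += ways / total_ways
    return p


# Assumed design: 8 cups, 4 of each kind
p_all_correct = tea_tasting_p_value(4, 4, 4)
p_three_or_more = tea_tasting_p_value(4, 4, 3)
print(f"All four correct:      p = {p_all_correct:.4f}")
print(f"Three or more correct: p = {p_three_or_more:.4f}")
```

With this assumed design there are 70 equally likely ways to choose four cups, so only a perfect selection would be convincing at the conventional 5% level.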
R.A. Fisher’s work on experimental data from crop cultivation at the Rothamsted
Experimental Station in Hertfordshire, England, led to numerous innovations
in modern statistics. He pioneered the principles of randomization and design of
experiments and developed statistical methods such as analysis of variance and the
foundation of maximum-likelihood estimations. Many of these innovations were
presented in his 1925 book Statistical Methods for Research Workers and the later
book The Design of Experiments, published in 1935. Both books are regarded as
reference texts in statistical science. Again, the applied problems that triggered these
important innovations in experimental and statistical science were rooted in agriculture
and closely linked to food science and technology.

1.3 Why Statistics, Experimental Design, and Epidemiology Matter

Food scientists encounter many types of data. Consumers report their preferences,
sensory panels give scores on selected scales, laboratories conduct chemical and
microbial analyses, and companies set specific targets on production costs and sales.
New ingredients may improve the technological properties in foodstuffs, but how
should one conduct a study to test if it is worth changing the production process?
How does one take into account other factors that might have an effect? How many
samples should one test – 5, 10, 35, 178, or even more? Does it matter if data are
collected from the same sample repeatedly over time compared to a “fresh” sample
for each data point? All this is very important when it comes to planning experimental
studies with an appropriate design and describing the data with appropriate statistical methods.
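The question of how many samples to test can be approached with a sample size calculation. The Python sketch below uses a common normal-approximation formula for comparing two group means, n = 2 (z_(1-alpha/2) + z_(power))^2 sigma^2 / delta^2 per group. This formula is a standard textbook approximation rather than something taken from this text, and the values for the standard deviation and the smallest relevant difference are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def samples_per_group(sigma, delta, alpha=0.05, power=0.80):
    """Approximate sample size per group for detecting a mean difference delta.

    sigma: assumed common standard deviation of the measurements
    delta: smallest difference between group means worth detecting
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_power = standard_normal.inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * sigma / delta) ** 2
    return ceil(n)


# Hypothetical planning values: SD of 1.0 unit, differences of 1.0 and 0.5 units
n_large_effect = samples_per_group(sigma=1.0, delta=1.0)
n_small_effect = samples_per_group(sigma=1.0, delta=0.5)
print(n_large_effect, n_small_effect)
```

Halving the difference one wants to detect roughly quadruples the required number of samples, which is why the answer to "5, 10, 35, or 178?" depends on the size of the effect and the variability of the measurements.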
Epidemiology is the study of the distribution and patterns of health events,
health characteristics, and their causes or influences on well-defined populations.
It is the cornerstone method of public health research and identifies risk factors for
disease and targets for preventive medicine (see, e.g., Rothman et al. 2008). A bal-
anced and appropriate diet has since ancient times been related to health and the
prevention of disease. Food and nutritional epidemiology as a scientific discipline
has attracted increased interest in recent years (Michels 2003). A large number of
observational studies have attempted to elucidate the role of diet in health and disease.
Since diet is strongly related to other aspects of life, and because of the long
exposure time and the practical as well as ethical obstacles in assigning subjects to
specific diets, randomized intervention studies are not as central in nutritional
research as, for instance, in drug development. The effect of fats on the risk of
coronary heart disease, the proportion of carbohydrates in one’s diet and its effect
on body mass index, and the onset of diabetes are all well-known examples of
areas of concern in food epidemiology. A complicating issue is that individuals
who try to eat a healthy diet are likely to lead a healthy lifestyle in general. This
and other complex issues of food epidemiology can partly explain tabloid headlines
about what type of food is “good” or “bad.” It is therefore more important than ever
to have a basic knowledge of epidemiological principles and of statistical methods
to describe and analyze data from nutritional studies.
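The complication that people with a healthy diet often lead a healthy lifestyle in general is an example of confounding. The Python sketch below uses entirely made-up counts in which disease risk depends only on physical activity, yet a crude comparison makes the healthy diet look protective because active people more often eat healthily. All names and numbers are hypothetical illustrations.

```python
# (number diseased, group size) for each diet within each activity level; made-up counts
strata = {
    "active": {"healthy_diet": (80, 800), "other_diet": (20, 200)},
    "inactive": {"healthy_diet": (60, 200), "other_diet": (240, 800)},
}


def risk_ratio(exposed, unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (exposed[0] / exposed[1]) / (unexposed[0] / unexposed[1])


def pooled(diet):
    """Sum cases and group sizes for one diet over all activity levels."""
    cases = sum(groups[diet][0] for groups in strata.values())
    total = sum(groups[diet][1] for groups in strata.values())
    return cases, total


crude_rr = risk_ratio(pooled("healthy_diet"), pooled("other_diet"))
stratum_rr = {
    level: risk_ratio(groups["healthy_diet"], groups["other_diet"])
    for level, groups in strata.items()
}

print(f"Crude risk ratio: {crude_rr:.2f}")
for level, rr in stratum_rr.items():
    print(f"Risk ratio among {level} people: {rr:.2f}")
```

In this constructed example the crude risk ratio is well below 1, while within each activity level the diet makes no difference, which is the kind of pattern that stratification or regression adjustment is meant to reveal.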

References

Box JF (1978) R. A. Fisher, the life of a scientist. Wiley, New York


Michels KB (2003) Nutritional epidemiology – past, present, future. Int J Epidemiol 32:486–488.
doi:10.1093/ije/dyg216
Plackett RL, Bernard GA (1990) Student: a statistical biography of William Sealy Gosset.
Clarendon, Oxford
Raju TNK (2005) William Sealy Gosset and William A. Silverman: two ‘students’ of science.
Pediatrics 116:732–735. doi:10.1542/peds.2005-1134
Rothman KJ, Greenland S, Lash TL (2008) Modern epidemiology, 3rd edn. LWW, Philadelphia
Student (1907) On the error of counting with a haemacytometer. Biometrika 5:351–360
Student (1908) The probable error of a mean. Biometrika 6:1–25
Sturdivant R (2000) Lady tasting tea. Adapted from D Nolan and T Speed (2000) Mathematical
statistics through applications. Springer, New York.
http://www.dean.usma.edu/math/people/sturdivant/images/MA376/dater/ladytea.pdf. Accessed 1 Jun 2012
Chapter 2
Methods and Principles of Statistical Analysis

Abstract The best way to learn statistics is by taking courses and working with
data. Some books may also be helpful. A first step in applied statistics is usually to
describe and summarize data using estimates and descriptive plots. The principle
behind p-values and statistical inference in general is covered with a schematic
overview of statistical tests and models.

Keywords Recommended textbooks • Descriptive statistics • p-values • Statistical models

2.1 Recommended Textbooks on Statistics

How does one learn statistics, epidemiology, and experimental design? The recommended
approach is, of course, to take (university) courses and combine them with applied
use. Just as it takes considerable effort and time to become trained in food
technology or chemistry or as a physician, learning statistics – both the mathematical
theory and applied use – takes time and effort. Some courses or books that promise to
teach statistics without requiring much time and that neglect all the fundamental
aspects of the subject could be deceiving. Learning the technical use of statistical
software without some fundamental knowledge of what these methods express and
the basics of calculations may leave the statistical analysis part in a black box.
Appropriate statistical analysis and a robust experimental design should be the oppo-
site of a black box – it should shed light upon data and give clear insights. It should
ideally not be Harry Potter magic!

A.H. Pripp, Statistics in Food Science and Nutrition, SpringerBriefs in Food, Health, and Nutrition, DOI 10.1007/978-1-4614-5010-8_2, © Springer Science+Business Media New York 2013

A comprehensive introduction to statistics and experimental design goes somewhat
beyond the scope of this brief text. Therefore, this section will refer the reader to
several excellent textbooks on the subject available from Springer. Readers with
access to a university library service should be able to obtain these texts online through
www.springerlink.com. Readers unfamiliar with the general aspects of statistics and
experimental design or who have not taken introductory courses are encouraged to
study some of these textbooks. An overview of the principles of descriptive statistics,
statistical inference (e.g., estimations and p-values), classic tests, and statistical models
is given later, but it is assumed that the reader has a basic knowledge of these
principles.

2.1.1 Applied Statistics, Epidemiology, and Experimental Design

Statistics for Non-Statisticians by Madsen (2011) is an excellent introductory textbook
for those new to the field. It covers the collection and presentation of data,
basic statistical concepts, descriptive statistics, probability distributions (with an
emphasis on the normal distribution), and statistical tests. The free spreadsheet soft-
ware OpenOffice is used throughout the text. Additional material on statistical soft-
ware, more comprehensive explanations on probability theory, and statistical
methods and examples are provided in appendices and at the textbook’s Web site.
At 160 pages, the textbook is not overwhelming. Readers with different interests,
either in applied statistics or in mathematical-statistical concepts, are told which
parts to read. Readers unfamiliar with statistics are highly encouraged to read this
text or a similar introductory textbook on statistics.
Applied Statistics Using SPSS, STATISTICA, MATLAB and R by Marques de Sá
(2007) is another recommended textbook, although it goes into somewhat more depth
on mathematical-statistical principles. However, it provides a very useful introduction
to using these four key statistical software packages for applied statistics. Combined with
software manuals, it will give the reader an improved understanding of how to conduct
descriptive statistics and tests. Both SPSS and STATISTICA have menu-based sys-
tems in addition to allowing users to write command lines (syntaxes). MATLAB and
R might have a steeper learning curve and assume a more in-depth understanding of
mathematical-statistical concepts, but they have many advanced functions and are
used widely in statistical research. R is available for free and can be downloaded from
the Internet. This is sometimes a great advantage and makes the user independent of
license renewals. Those who wish to make the effort to learn and use R will be part of
a large statistical community (R Development Core Team 2012). It may, however,
take some effort if one is unfamiliar with computer programming.
Biostatistics with R: An Introduction to Statistics Through Biological Data
by Shahbaba (2012) gives a very useful step-by-step introduction to the R software
platform using biological data. The many statistical methods available through
so-called R packages and the (free) availability of the software make it very
attractive, but its somewhat more complicated structure compared to commercial
software like SPSS, STATISTICA, or STATA might make it less relevant for those
who use mostly so-called standard methods and have access to commercial
software.
