Foundations of Applied Statistical Methods - 2nd Edition Reference Book Download

Foundations of Applied Statistical Methods - 2nd Edition by Hang Lee provides a comprehensive introduction to applied statistical methods, emphasizing foundational concepts over complex mathematical derivations. The book is designed for researchers and graduate students, offering clear explanations and practical examples to enhance understanding of statistical techniques. It covers essential topics such as data description, statistical inference, t-tests, ANOVA, and regression analysis.

Foundations of Applied
Statistical Methods
Second Edition
Hang Lee
Massachusetts General Hospital Biostatistics Center
Department of Medicine
Harvard Medical School
Boston, MA, USA

ISBN 978-3-031-42295-9 ISBN 978-3-031-42296-6 (eBook)

https://doi.org/10.1007/978-3-031-42296-6

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland
AG 2014, 2023
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by
similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Paper in this product is recyclable.

Preface

Researchers who design and conduct experiments or sample surveys, perform data
analysis and statistical inference, and write scientific reports need adequate
knowledge of applied statistics. To build adequate and sturdy knowledge of applied
statistical methods, a firm foundation is essential. I have come across many researchers
who had studied statistics in the past but are still far from ready to apply the
learned knowledge to their problem solving, and others who have forgotten what they
had learned. This could be partly because the mathematical technicality of
their past study material was above their mathematics proficiency, or because the
worked examples they studied often failed to address the essential fundamentals of the
applied methods. This book is written to fill the gap between traditional textbooks
involving an ample amount of technically challenging mathematical derivations and
worked examples of data analyses that often underemphasize fundamentals. The
chapters of this book are dedicated to spelling out and demonstrating, not merely
explaining, the necessary foundational ideas so that motivated readers can learn to fully
appreciate the fundamentals of the commonly applied methods and revivify their
forgotten knowledge of the methods without having to deal with complex mathematical
derivations or attempt to generalize oversimplified worked examples of
plug-and-play techniques. Detailed mathematical expressions are exhibited only if
they are definitional or intuitively comprehensible. Data-oriented examples are
illustrated only to aid the demonstration of fundamentals. This book can be used
as a guidebook for applied researchers or as an introductory statistical methods
course textbook for graduate students not majoring in statistics.

Boston, MA, USA Hang Lee

Contents

1 Description of Data and Essential Probability Models
1.1 Types of Data
1.2 Description of Data
1.2.1 Distribution
1.2.2 Description of Categorical Data
1.2.3 Description of Continuous Data
1.2.4 Stem-and-Leaf Plot
1.2.5 Box-and-Whisker Plot
1.3 Descriptive Statistics
1.3.1 Statistic
1.3.2 Central Tendency Descriptive Statistics for Quantitative Outcomes
1.3.3 Dispersion Descriptive Statistics for Quantitative Outcomes
1.3.4 Variance
1.3.5 Standard Deviation
1.3.6 Property of Standard Deviation After Data Transformations
1.3.7 Other Descriptive Statistics for Dispersion
1.3.8 Dispersions Among Multiple Data Sets
1.3.9 Caution to CV Interpretation
1.4 Statistics for Describing Relationships Between Two Outcomes
1.4.1 Linear Correlation Between Two Continuous Outcomes
1.4.2 Contingency Table to Describe an Association Between Two Categorical Outcomes
1.4.3 Odds Ratio
1.5 Two Essential Probability Distribution
1.5.1 Gaussian Distribution
1.5.2 Probability Density Function of Gaussian Distribution
1.5.3 Application of Gaussian Distribution
1.5.4 Standard Normal Distribution
1.5.5 Binomial Distribution
Bibliography

2 Statistical Inference Concentrating on a Single Mean
2.1 Population and Sample
2.1.1 Sampling and Non-sampling Errors
2.1.2 Sample Distribution and Sampling Distribution
2.1.3 Standard Error
2.1.4 Sampling Methods and Sampling Variability of the Sample Means
2.2 Statistical Inference
2.2.1 Data Reduction and Related Nomenclatures
2.2.2 Central Limit Theorem
2.2.3 The t-Distribution
2.2.4 Hypothesis Testing
2.2.5 Accuracy and Precision
2.2.6 Interval Estimation and Confidence Interval
2.2.7 Bayesian Inference
2.2.8 Study Design and Its Impact to Accuracy and Precision
Bibliography

3 t-Tests for Two-Mean Comparison
3.1 Independent Samples t-Test for Comparing Two Independent Means
3.1.1 Independent Samples t-Test When Variances Are Unequal
3.1.2 Denominator Formulae of the Test Statistic for Independent Samples t-Test
3.1.3 Connection to the Confidence Interval
3.2 Paired Sample t-Test for Comparing Paired Means
3.3 Use of Excel for t-Tests
Bibliography

4 Inference Using Analysis of Variance (ANOVA) for Comparing Multiple Means
4.1 Sums of Squares and Variances
4.2 F-Test
4.3 Multiple Comparisons and Increased Chance of Type 1 Error
4.4 Beyond Single-Factor ANOVA
4.4.1 Multi-factor ANOVA
4.4.2 Interaction
4.4.3 Repeated Measures ANOVA
4.4.4 Use of Excel for ANOVA
Bibliography

5 Inference of Correlation and Regression
5.1 Inference of Pearson’s Correlation Coefficient
5.2 Linear Regression Model with One Independent Variable: Simple Regression Model
5.3 Simple Linear Regression Analysis
5.4 Linear Regression Models with Multiple Independent Variables
5.5 Logistic Regression Model with One Independent Variable: Simple Logistic Regression Model
5.6 Consolidation of Regression Models
5.6.1 General and Generalized Linear Models
5.6.2 Multivariate Analysis Versus Multivariable Model
5.7 Application of Linear Models with Multiple Independent Variables
5.8 Worked Examples of General and Generalized Linear Models
5.8.1 Worked Example of a General Linear Model
5.8.2 Worked Example of a Generalized Linear Model (Logistic Model) Where All Multiple Independent Variables Are Dummy Variables
5.9 Measure of Agreement Between Outcome Pairs: Concordance Correlation Coefficient for Continuous Outcomes and Kappa (κ) for Categorical Outcomes
5.10 Handling of Clustered Observations
Bibliography

6 Normal Distribution Assumption-Free Nonparametric Inference
6.1 Comparing Two Proportions Using a 2 × 2 Contingency Table
6.1.1 Chi-Square Test for Comparing Two Independent Proportions
6.1.2 Fisher’s Exact Test
6.1.3 Comparing Two Proportions in Paired Samples
6.2 Normal Distribution Assumption-Free Rank-Based Methods for Comparing Distributions of Continuous Outcomes
6.2.1 Permutation Test
6.2.2 Wilcoxon’s Rank Sum Test
6.2.3 Kruskal–Wallis Test
6.2.4 Wilcoxon’s Signed Rank Test
6.3 Linear Correlation Based on Ranks
6.4 About Nonparametric Methods
Bibliography

7 Methods for Censored Survival Time Data
7.1 Censored Observations
7.2 Probability of Surviving Longer Than Certain Duration
7.3 Statistical Comparison of Two Survival Distributions with Censoring
Bibliography

8 Sample Size and Power
8.1 Sample Size for Single Mean Interval Estimation
8.2 Sample Size for Hypothesis Tests
8.2.1 Sample Size for Comparing Two Means Using Independent Samples z- and t-Tests
8.2.2 Sample Size for Comparing Two Proportions
Bibliography

9 Review Exercise Problems
9.1 Review Exercise 1
9.1.1 Solutions for Review Exercise 1
9.2 Review Exercise 2
9.2.1 Solutions for Review Exercise 2

10 Statistical Tables

Index

Chapter 1
Description of Data and Essential
Probability Models

This chapter describes how to make sense of gathered data before performing formal
statistical inference. The topics covered are types of data, how to visualize data, how
to summarize data into a few descriptive statistics (i.e., condensed numerical indices),
and an introduction to some useful probability models.

1.1 Types of Data

Typical types of data arising from most studies fall into one of the following
categories.
Nominal categorical data contain qualitative information and appear as discrete
values that are codified into numbers or characters (e.g., 1 = case with a disease
diagnosis, 0 = control; M = male, F = female, etc.).
Ordinal categorical data are semi-quantitative and discrete, and the numeric
coding scheme orders the values, such as 1 = mild, 2 = moderate, and 3 = severe.
Note that the value of 3 (severe) is not necessarily three times more severe than
1 (mild).
Count (number of events) data are quantitative and discrete (i.e., 0, 1, 2, . . .).
Interval scale data are quantitative and continuous. There is no absolute 0, and the
reference value is arbitrary. Examples of such data are temperature values in °C and °F.
Ratio scale data are quantitative and continuous, and there is an absolute 0; e.g.,
body weight and height.
In most cases, the types of data fall into the classification scheme
shown in Table 1.1, in which data are classified as either quantitative
or qualitative, and as either discrete or continuous.
Nonetheless, the definitions of some data types may not be clear; in particular, the
similarity and dissimilarity between the ratio scale and the interval scale
need further clarification.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023

H. Lee, Foundations of Applied Statistical Methods,
https://doi.org/10.1007/978-3-031-42296-6_1

Table 1.1 Classification of data types

            Qualitative                      Quantitative
Discrete    Nominal categorical              Ordinal categorical (e.g., 1 = mild,
            (e.g., M = male, F = female)     2 = moderate, 3 = severe)
                                             Count (e.g., number of incidences
                                             0, 1, 2, 3, . . .)
Continuous  N/A                              Interval scale (e.g., temperature)
                                             Ratio scale (e.g., weight)

Ratio scale: If two distinct values of quantitative data can be meaningfully represented
by a ratio of two numerical values, then such data are ratio scale data. For example,
for two observations xi = 200 and xj = 100, for i ≠ j, the ratio xi/xj = 2 shows that xi is
twice xj. Examples include lung volume, age, and disease duration.
Interval scale: If two distinct values of quantitative data are not ratio-able, then
such data are interval scale data. Temperature is a good example; it is measured in three
systems, i.e., Fahrenheit, Celsius, and Kelvin. Only the Kelvin system has an
absolute 0 (there is no negative temperature in the Kelvin system). For example, 200 °F
is not a temperature that is twice as high as 100 °F. We can only say that 200 °F is
higher by 100 degrees (i.e., the displacement between 200 and 100 is 100 degrees in
the Fahrenheit measurement scale).
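
The contrast between the two scales can be checked numerically. The following is a minimal sketch (the function and variable names are illustrative and not taken from the book) that converts 200 °F and 100 °F into Celsius and Kelvin with the standard conversion formulas. The ratio of the two temperatures changes with the measurement system, whereas their difference remains the same displacement once the size of a degree is accounted for.

```python
def fahrenheit_to_celsius(temp_f):
    """Standard conversion: C = (F - 32) * 5/9."""
    return (temp_f - 32.0) * 5.0 / 9.0


def fahrenheit_to_kelvin(temp_f):
    """Kelvin is Celsius shifted so that 0 K is the absolute zero."""
    return fahrenheit_to_celsius(temp_f) + 273.15


high_f, low_f = 200.0, 100.0  # the two temperatures discussed in the text

# Ratios depend on where each system places its zero point.
ratio_f = high_f / low_f
ratio_c = fahrenheit_to_celsius(high_f) / fahrenheit_to_celsius(low_f)
ratio_k = fahrenheit_to_kelvin(high_f) / fahrenheit_to_kelvin(low_f)

# Differences are comparable: one Celsius degree equals 9/5 Fahrenheit degrees.
diff_f = high_f - low_f
diff_c = fahrenheit_to_celsius(high_f) - fahrenheit_to_celsius(low_f)

print(f"ratio in Fahrenheit: {ratio_f:.3f}")
print(f"ratio in Celsius:    {ratio_c:.3f}")
print(f"ratio in Kelvin:     {ratio_k:.3f}")
print(f"difference: {diff_f:.1f} F degrees = {diff_c:.2f} C degrees")
```

Only the Kelvin ratio (about 1.18) compares the two temperatures relative to an absolute zero, which is why 200 °F cannot be called twice as hot as 100 °F.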

1.2 Description of Data

1.2.1 Distribution

A distribution is a complete description of how high the occurring chance (i.e.,
probability) of a unique datum or a certain range of data values is. The following two
examples will help you grasp the concept. If you keep on rolling a die, you will
expect to observe 1, 2, 3, 4, 5, or 6 equally likely, i.e., the probability of each unique
outcome value is 1/6. We say that a probability of 1/6 is distributed to 1, 1/6 to 2, 1/6 to
3, 1/6 to 4, 1/6 to 5, and 1/6 to 6. As another example, if you keep on rolling a die
many times, and each time you call it a success if the observed outcome is 5 or 6 and
a failure otherwise, then your expected chance of observing a success will be 1/3
and that of a failure will be 2/3. We say that a probability of 1/3 is distributed to the
success, and 2/3 is distributed to the failure. Many distributions cannot
be described as simply as these two examples and require descriptions using
sophisticated mathematical functions.
Let us discuss how to describe the distributions arising from various types of data.
One way to describe a set of collected data is to describe the distribution of relative
frequency for the observed individual values (e.g., which values are very common and
which values are less common). Graphs, simple tables, or a few summary
numbers are commonly used.
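
The two die examples can be reproduced with a short sketch. It assumes a fair six-sided die; the seed and the number of simulated rolls are arbitrary choices made only for illustration. The first part writes down the distribution exactly, and the second part compares it with the relative frequencies observed in simulated rolls.

```python
import random
from collections import Counter
from fractions import Fraction

faces = [1, 2, 3, 4, 5, 6]

# Exact distribution of a fair die: a probability of 1/6 is distributed to each face.
die_distribution = {face: Fraction(1, 6) for face in faces}

# A "success" is an outcome of 5 or 6; everything else is a "failure".
success_prob = sum(p for face, p in die_distribution.items() if face >= 5)
failure_prob = 1 - success_prob

# Simulated rolls: relative frequencies approach the exact probabilities.
random.seed(2023)   # arbitrary seed so the run is repeatable
n_rolls = 60_000    # arbitrary number of rolls
rolls = [random.choice(faces) for _ in range(n_rolls)]
counts = Counter(rolls)
relative_freq = {face: counts[face] / n_rolls for face in faces}

print("P(success) =", success_prob, " P(failure) =", failure_prob)
for face in faces:
    print(f"face {face}: exact {float(die_distribution[face]):.4f}, "
          f"observed {relative_freq[face]:.4f}")
```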

1.2.2 Description of Categorical Data

A simple tabulation (frequency table) lists the observed count (and the proportion as a
percentage) for each category. A bar chart (see Figs. 1.1 and 1.2) can be used
for a visual summary of nominal and ordinal outcome distributions. The size of each
bar in Figs. 1.1 and 1.2 reveals the actual counts. It is also common to present
the relative frequency (i.e., the proportion of each category as a percentage of the total).
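
A frequency table of this kind takes only a few lines to build. The sketch below uses made-up severity ratings (they are not the data behind Figs. 1.1 and 1.2) and prints the count, the percentage, and a crude text bar for each category.

```python
from collections import Counter

# Hypothetical ordinal outcomes for ten subjects (illustration only).
severity = ["mild", "moderate", "mild", "severe", "moderate",
            "mild", "mild", "severe", "moderate", "mild"]
category_order = ["mild", "moderate", "severe"]  # keep the ordinal ordering

counts = Counter(severity)
total = len(severity)

# Each row holds (category, count, percentage of the total).
frequency_table = [
    (category, counts[category], 100.0 * counts[category] / total)
    for category in category_order
]

for category, count, percent in frequency_table:
    bar = "#" * count  # bar length equals the observed count
    print(f"{category:<10}{count:>3}{percent:>7.1f}%  {bar}")
```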

1.2.3 Description of Continuous Data

Figure 1.3 is a list of white blood cell (WBC) counts of 31 patients diagnosed with a
certain illness, listed by patient identification number. Does this listing itself tell
us the group characteristics such as the average and the variability among patients?
How can we describe the distribution of these data, i.e., how much of the
occurring chance is distributed to WBC = 5200, how much to WBC = 3100, . . .,
etc.? Such a description may be very cumbersome. As depicted in Fig. 1.4, listing the
full data in ascending order can be a primitive way to describe the distribution, but it
still does not summarize the distribution. An option is to visualize the relative frequencies
for grouped intervals of the observed data. Such a presentation is called
a histogram. To create a histogram, one first needs to create equally spaced
WBC categories and count how many observations fall into each category. Then
the bar graph can be drawn, where each bar size indicates the relative frequency of
that specific WBC interval. The process of drawing bar graphs manually seems
cumbersome. The next section introduces a much less cumbersome manual technique to
visualize continuous outcomes.

Fig. 1.1 Frequency table and bar chart for describing nominal categorical data

Fig. 1.2 Frequency table and bar chart for describing ordinal data
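
The histogram construction described in this section (equally spaced categories, then a count per category) can be sketched as follows. The WBC values and the interval width of 2000 are hypothetical choices for illustration; they are not the 31 values listed in Fig. 1.3.

```python
# Hypothetical WBC counts (illustration only).
wbc = [1800, 2600, 3100, 3400, 3700, 3900, 4200, 4800, 5200, 6100, 7500, 11200]
bin_width = 2000  # assumed width of each equally spaced interval


def histogram_counts(values, width):
    """Count how many values fall into each equally spaced interval."""
    first_edge = (min(values) // width) * width
    n_bins = (max(values) - first_edge) // width + 1
    counts = [0] * n_bins
    for value in values:
        counts[(value - first_edge) // width] += 1
    return first_edge, counts


first_edge, counts = histogram_counts(wbc, bin_width)
relative_freq = [count / len(wbc) for count in counts]

for i, (count, freq) in enumerate(zip(counts, relative_freq)):
    low = first_edge + i * bin_width
    high = low + bin_width
    # Each interval includes its lower edge and excludes its upper edge.
    print(f"[{low:>5}, {high:>5})  {'#' * count:<6} {freq:.2f}")
```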

1.2.4 Stem-and-Leaf Plot

The stem-and-leaf plot requires much less work than creating the conventional
histogram while providing the same information as the histogram does. It
is a quick and easy option to sketch a continuous data distribution.

Fig. 1.3 List of WBC raw data of 31 subjects

Let us use a small data set for illustration, and then revisit our WBC data example
for more discussion after this method becomes familiar to you. The following nine
data points 12, 32, 22, 28, 26, 45, 32, 21, and 85 are ages (ratio scale) of a small
group. Figures 1.5, 1.6, 1.7, 1.8, and 1.9 demonstrate how to create the
stem-and-leaf plot of these data.
The main idea of this technique is to quickly sketch the distribution of an
observed data set without computational burden. Let us just take each datum in the
order in which it is recorded (i.e., the data are not preprocessed by other techniques such
as sorting in ascending/descending order) and plot one value at a time (see Fig. 1.5).
Note that the oldest observed age is 85 years, which is much greater than the next
oldest age of 45 years, and the unobserved stem interval values (i.e., 50s, 60s, and 70s)
are still placed on the plot. The determination of the number of equally spaced major
intervals (i.e., number of stems) can be subjective and data range dependent.
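
The procedure can also be written as a short sketch using the nine ages given above. It assumes stems of width 10 (the tens digit) and leaves equal to the units digit; the function name is illustrative. Values are taken in the order recorded, and the unobserved stems (50s, 60s, and 70s) are kept as empty rows.

```python
ages = [12, 32, 22, 28, 26, 45, 32, 21, 85]  # the nine ages, in recorded order


def stem_and_leaf(values, stem_unit=10):
    """Group non-negative integers by stem, keeping the recorded order of leaves."""
    lowest_stem = min(values) // stem_unit
    highest_stem = max(values) // stem_unit
    # Create every stem in the range, including those with no observations.
    stems = {stem: [] for stem in range(lowest_stem, highest_stem + 1)}
    for value in values:
        stems[value // stem_unit].append(value % stem_unit)
    return stems


plot = stem_and_leaf(ages)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Sorting the leaves within each stem is an optional extra step; it is omitted here because the text plots each value as it is recorded.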
Figure 1.10 depicts the distribution of our WBC data set by the stem-and-leaf
plot. Most values lie between 3000 and 4000 (i.e., the mode); the contour of the
frequency distribution is skewed to the right, and the mean value does not describe
the central location well; the smallest and the largest observations are 1800 and
11,200, respectively, and there are no observed values lying between 1000 and 1100.

Fig. 1.4 List of 31 individual WBC values in ascending order

1.2.5 Box-and-Whisker Plot

Unlike the stem-and-leaf plot, this plot does not show the individual data values
explicitly. It can describe data sets whose sample sizes are larger than what
can usually be illustrated manually by the stem-and-leaf plot. If the stem-and-leaf
plot is seen from a bird's-eye point of view (Fig. 1.11), then the resulting description
can be made as depicted in the right-hand side panels of Figs. 1.12 and 1.13.
The unique feature of this technique is that it identifies and visualizes where the middle
half of the data exists (i.e., the interquartile range) by the box and the interval where
the rest of the data exist by the whiskers.
If there are two or more modes, the box-and-whisker plot cannot fully characterize
such a phenomenon, but the stem-and-leaf plot can (see Fig. 1.14).
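
The numbers behind such a plot can be computed directly. The sketch below reuses the nine ages from Sect. 1.2.4 and relies on Python's `statistics.quantiles` with its default ("exclusive") method; other quartile conventions exist and may give slightly different box edges, so the values should be read as one reasonable choice and not necessarily the one used for the book's figures.

```python
import statistics

ages = [12, 32, 22, 28, 26, 45, 32, 21, 85]  # the nine ages from Sect. 1.2.4

# Quartiles split the sorted data into four parts; the box spans Q1 to Q3.
q1, median, q3 = statistics.quantiles(ages, n=4)
iqr = q3 - q1  # interquartile range: where the middle half of the data lies

summary = {
    "min": min(ages),   # end of the lower whisker when whiskers span all remaining data
    "q1": q1,
    "median": median,
    "q3": q3,
    "max": max(ages),   # end of the upper whisker when whiskers span all remaining data
}

for name, value in summary.items():
    print(f"{name:<7}{value}")
print(f"IQR    {iqr}")
```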

Fig. 1.5 Step-by-step illustration of creating a stem-and-leaf plot

Fig. 1.6 Illustration of creating a stem-and-leaf plot

Fig. 1.7 Two stem-and-leaf plots describing two same data sets
