Biostatistics
What is Biostatistics?
Statistics: A field of study concerned with methods and procedures for:
Collection, organization, analysis, summarization, and interpretation of numerical data, and
Making scientific inferences about a body of data when only a small part of the data is observed.
The tools of statistics are employed in many fields such as business, education, psychology, agriculture, economics, etc.
Provides methods of organizing information
Assessment of risk factors
Cause & effect relationship
Drawing of inferences
Information from sample to population
Ways of Data Presentation include:
Tables
Graphs (Bar chart, Pie chart, Histogram, Scatter plot, etc.)
2. Inferential statistics: deals with techniques of making
conclusions about the population based on the information
obtained from a sample drawn from that population.
Data, population, Sample, parameter, Statistic, Variables
Data: numbers obtained by taking measurements, by counting, or by observation.
Numerical descriptions of things
The raw material for statistics.
Population and sample
Population: any well-defined group of subjects or objects that share common characteristics.
A group of people, institutions, or items that have something in common and about which we wish to draw conclusions at a particular time.
E.g., all TB patients in Ethiopia, all hospitals in Hawassa
A population is generally large, so it is difficult to study every member.
Sample:
A small group or subset of a population about which
information is actually obtained.
Samples are used to describe & make inferences about
the populations from which they arise
Statistical methods are based on these samples
Samples should be selected using a suitable method so that they are representative of the population (e.g., a random sample)
Parameter and statistic
Parameter:
A numerical descriptive measure derived from the
data of a population.
Variable
• Variable: A characteristic which takes different values in
different persons, places, or things.
Variables can be broadly classified into:
1. Categorical variable: A variable which cannot be measured in quantitative form but can only be sorted by name or category
Categorical variable is divided into two:
1. Nominal:
2. Ordinal:
2. Quantitative variable: A variable that can be measured or
counted and expressed numerically.
Quantitative variable is divided into two:
• The values are not just labels, but are actual measurable quantities.
2. Continuous variable:
SUMMARY
Variable
Types of variables: Quantitative (Numerical) and Qualitative (Categorical)
Measurement scales
Scales of measurement
1. Nominal scale:
Example of nominal scale:
Race/Ethnicity:
1. Black
2. White
3. Latino
4. Other
• The numbers have NO meaning
• They are labels only
• If nominal data take only two possible values, they are
called dichotomous or binary.
• Yes/no questions
2. Ordinal scale:
Example of ordinal scale:
• Pain level:
1. None
2. Mild
3. Moderate
4. Severe
• The numbers have LIMITED meaning: 4>3>2>1 is all we know apart from their utility as labels
3. Interval scale:
- Measured on a continuum, and differences between any two numbers on the scale are of known size.
Example: Temperature in °F on 4 consecutive days
Days: A B C D
Temp. (°F): 50 55 60 65
For these data, not only is day A (50 °F) cooler than day D (65 °F), it is 15 °F cooler.
- It has no true zero point. "0" is arbitrarily chosen and doesn't reflect the absence of the attribute.
4. Ratio scale:
A measurement on a higher scale can be transformed into one on
a lower scale, but not vice versa.
[Figure: measurement scales ordered by degree of precision in measuring: Nominal, Ordinal, Interval, Ratio]
Dependent vs. Independent Variable
Class Exercise
I. Classify the variables below as quantitative or qualitative, and write in brackets whether each is nominal, ordinal, discrete, or continuous
Source/ Type of data
1. Primary data:
– Collected by the investigator or under his/her close
supervision for the purpose of specific inquiry or
study.
– Original in character and mostly generated by individuals or research institutes.
– The investigator is aware of any limitations the data may contain, since he/she knows under what conditions the data were collected.
– Considered to be more reliable and relatively
accurate.
2. Secondary data:
Advantages:
• Data collection is inexpensive.
• Less time consuming
Disadvantages:
• It is sometimes difficult to gain access to the records or
reports required,
• The data may not always be complete and precise enough, or
too disorganized.
Reading assignment
1. Data Collection Techniques
2. Types of questions
Close ended questions
Open ended questions
3. Questionnaire forms
Structured
Semi-structured
Unstructured
4. Questionnaire designing
Descriptive Statistics
• Numbers that have not been summarized and organized are called
raw data.
Methods of Data Organization and
Presentation
A. Describing categorical variables
• Table of frequency distributions
– Frequency
– Relative frequency
– Cumulative frequencies
• Charts
– Bar charts
– Pie charts
Frequency distributions
• Simple and effective way of summarizing categorical data
• The actual summarization and organization of data starts from
frequency distribution
• Done by counting the number of observations falling into each of
the categories or levels of the variables.
E.g. Birth weight with levels 'Very low', 'Low', 'Normal' and 'Big'.
Relative Frequency
• It is the proportion or percentage of observations in each category of a variable.
Cumulative frequency
• It is the number of observations in the category of a variable plus
observations in all categories smaller than it.
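The three quantities defined above (frequency, relative frequency, cumulative frequency) can be sketched in a few lines of Python. The category names follow the birth-weight example; the twelve observations are made-up illustration values, not data from these slides.

```python
from collections import Counter

# Category order follows the text; the observations are hypothetical.
levels = ["Very low", "Low", "Normal", "Big"]
observations = ["Normal", "Low", "Normal", "Big", "Normal", "Very low",
                "Normal", "Low", "Normal", "Normal", "Big", "Low"]

def frequency_table(data, ordered_levels):
    """Return rows of (level, frequency, relative frequency, cumulative frequency)."""
    counts = Counter(data)
    n = len(data)
    rows, running = [], 0
    for level in ordered_levels:
        freq = counts.get(level, 0)
        running += freq                      # this category plus all smaller ones
        rows.append((level, freq, freq / n, running))
    return rows

table = frequency_table(observations, levels)
for level, freq, rel, cum in table:
    print(f"{level:<10}{freq:>4}{rel:>8.3f}{cum:>4}")
```

The cumulative column only makes sense when the categories have a natural order, as they do here.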
Table 1. Distribution of birth weight of newborns between Sept-Oct, 2020 at 'X' Hospital.
B) Describing Quantitative variable:
– Frequency
– Relative frequency
– Cumulative frequencies
To determine the number of class intervals and the corresponding width, we may use:
Sturges' rule:
$$K = 1 + 3.322\,\log_{10} n$$
$$W = \frac{L - S}{K}$$
where
K = number of class intervals
n = number of observations
W = width of the class interval
L = the largest value
S = the smallest value
Example:
Leisure time (hours) per week for 40 college students:
23 24 18 14 20 36 24 26 23 21 16 15 19 20 22 14 13 10
19 27 29 22 38 28 34 32 23 19 21 31 16 28 19 18 12 27
15 21 25 16
Time (Hours)   Frequency   Relative Frequency   Cumulative Relative Frequency
10-14          5           0.125                0.125
15-19          11          0.275                0.400
20-24          12          0.300                0.700
25-29          7           0.175                0.875
30-34          3           0.075                0.950
35-39          2           0.050                1.00
Total          40          1.00
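As a rough check, the table above can be reproduced from the 40 leisure-time values with Sturges' rule. In this sketch K is rounded to the nearest whole number and W is rounded up; both rounding choices are assumptions, since the slides do not state them.

```python
import math

hours = [23, 24, 18, 14, 20, 36, 24, 26, 23, 21, 16, 15, 19, 20, 22, 14, 13, 10,
         19, 27, 29, 22, 38, 28, 34, 32, 23, 19, 21, 31, 16, 28, 19, 18, 12, 27,
         15, 21, 25, 16]

n = len(hours)
k = round(1 + 3.322 * math.log10(n))               # Sturges' rule, rounded (assumption)
width = math.ceil((max(hours) - min(hours)) / k)   # W = (L - S) / K, rounded up (assumption)

# Build the classes starting at the smallest value: 10-14, 15-19, ...
lower = min(hours)
rows, cum_rel = [], 0.0
for i in range(k):
    lo = lower + i * width
    hi = lo + width - 1
    freq = sum(lo <= h <= hi for h in hours)
    rel = freq / n
    cum_rel += rel
    rows.append((lo, hi, freq, rel, cum_rel))

for lo, hi, freq, rel, cum in rows:
    print(f"{lo}-{hi}  {freq:>3}  {rel:.3f}  {cum:.3f}")
```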
• Class limit: the range for each class
– Upper class limit
– Lower class limit
• True class limit (class boundary): subtract 0.5 from the lower class limit and add 0.5 to the upper class limit
Time (Hours)   True limit (class boundary)   Mid-point   Frequency
Types of tables
Guidelines for constructing tables
• Keep them simple
• Show totals
Specific types of graphs include:
• Bar graph
• Pie chart
(Both are used for nominal, ordinal, and discrete data.)
1. Bar charts (Graphs)
• Categories are listed on the horizontal axis (X-axis)
• There are different types of bar graphs, the most important ones
are:
A. Simple bar chart: A one-dimensional chart in which each bar represents the whole magnitude of a category (only one variable).
[Figure: simple bar chart of number of children by immunization status (Not immunized, Partially immunized, Fully immunized)]
[Figure: bar chart of number of women by marital status (Married, Single, Divorced, Widowed)]
[Figure: bar chart of number of women by marital status (Married, Single, Divorced, Widowed)]
Fig. 3 TT Immunization status by marital status of women 15-49 years, Asendabo town, 1996
Subdivided bar chart (cont.)
Method of constructing bar chart
• All the bars should rest on the same line called the base
Steps to construct a pie-chart
• Construct a frequency table
Example: Distribution of deaths for females, in England and
Wales, 1989.
Distribution of cause of death for females, in England and Wales, 1989
[Pie chart: Circulatory system 42%, Neoplasms 30%, Respiratory system 13%, Others 8%, Digestive system 4%, Injury and poisoning 3%]
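In a pie chart each category gets a sector whose angle is proportional to its share of the total, so a percentage converts to degrees by multiplying by 360/100. The sketch below applies this to the percentages in the figure above; the pairing of 42% with the circulatory system and 13% with the respiratory system is read from the extracted figure labels and should be treated as an assumption.

```python
# Percentages as read from the pie chart labels (pairing partly assumed).
shares = {
    "Circulatory system": 42,
    "Neoplasms": 30,
    "Respiratory system": 13,
    "Others": 8,
    "Digestive system": 4,
    "Injury and poisoning": 3,
}

total = sum(shares.values())
# Sector angle in degrees for each cause of death.
angles = {cause: 360 * pct / total for cause, pct in shares.items()}

for cause, angle in angles.items():
    print(f"{cause:<22}{angle:6.1f} degrees")
```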
3. Histogram
• Histograms are frequency distributions with continuous class intervals that have been turned into graphs.
• It is necessary that the class intervals be non-overlapping so that
each observation falls in one and only one interval.
Example: Distribution of the age of women at the time of marriage
[Figure: histogram; x-axis: Age group (class boundaries 14.5-19.5, 19.5-24.5, 24.5-29.5, 29.5-34.5, 34.5-39.5, 39.5-44.5, 44.5-49.5), y-axis: No. of women]
4. Frequency polygon
[Figure: frequency polygon, "Age of women at the time of marriage"; x-axis: Age (12 to 47), y-axis: No. of women]
[Figure: frequency polygon of birth weight of 9975 newborns for males and females; x-axis: Birth Weight (500 to 5000), y-axis: %; separate lines for Males and Females]
5. Ogive Curve (Cumulative Frequency Polygon)
• Used to know the number of items whose values are more or less than a
certain amount.
• E.g. to know the no. of patients whose weight is <50 or >60 Kg.
[Figure: x-axis: Upper class boundary (4.5 to 39.5), y-axis: Cumulative frequency]
Fig 4: Cumulative frequency curve for amount of time college students devoted to leisure activities
6. Line graph
[Figure: line graph; x-axis: Year (1967 to 1979)]
Stem-and-Leaf Plot
A quick way to organize data that gives a visual impression similar to a histogram while retaining much more detail on the data.
Serves the same purpose as a histogram and reveals the presence or absence of symmetry.
Most effective with relatively small data sets.
Not suitable for reports and other communications, but helps researchers to understand the nature of their data.
Example
• 43, 28, 34, 61, 77, 82, 22, 47, 49, 51, 29, 36,
66, 72, 41
2 | 2 8 9
3 | 4 6
4 | 1 3 7 9
5 | 1
6 | 1 6
7 | 2 7
8 | 2
Steps to construct Stem-and-Leaf Plots
3. Write the second stem (first stem +1) below the first
stem
4. Continue with the remaining stems until you reach
the largest stem in the data set
5. Draw a vertical bar to the right of the column of
stems
6. For each number in the data set, find the appropriate
stem and write the leaf to the right of the vertical
bar
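The construction steps can be sketched in Python for two-digit data, taking the tens digit as the stem and the units digit as the leaf. The function name `stem_and_leaf` is illustrative, and the data are the fifteen values from the example above.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted leaves (units digits)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    # List every stem from the smallest to the largest, even one with no leaves.
    return {s: plot.get(s, []) for s in range(min(plot), max(plot) + 1)}

data = [43, 28, 34, 61, 77, 82, 22, 47, 49, 51, 29, 36, 66, 72, 41]
result = stem_and_leaf(data)
for stem, leaves in result.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```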
Scatter Plots
The most useful graphical tool for displaying the relationship between two quantitative variables is a two-way scatter plot.
Scatter plots present data on the x- and y-axes and are used
to investigate an association between two variables.
A point represents each individual or object, and an
association between two variables can be studied by
analyzing patterns across multiple points.
A regression line is added to a graph to determine whether
the association between two variables can be explained or
not.
Scatter plot (two-way): the example displays annual salary vs. years of education.
Box-and-Whisker Plots
It is a useful visual device for communicating the
information contained in a data set.
The construction of a box-and-whisker plot makes use of
the quartiles
Examination of a box-and-whisker plot for a set of
data reveals information regarding the amount of
spread, location of concentration, and symmetry
of the data.
Box plots
Any question?
Numerical summary measures
A. Measures of Central location
• The objective of calculating a measure of central tendency (MCT) is to determine a single value that may be used to represent the whole data set.
Mean
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
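b) Grouped data
• We assume that all values falling into a particular class interval are located at the mid-point of the interval. It is calculated as follows:
$$\bar{x} = \frac{\sum_{i=1}^{k} m_i f_i}{\sum_{i=1}^{k} f_i}$$
• where,
k = number of class intervals
m_i = mid-point of the i-th class interval
f_i = frequency of the i-th class interval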
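As an illustration of the grouped-data mean, the sketch below uses the leisure-time frequency table from earlier in these slides (classes 10-14 through 35-39). The mid-points are computed from the class limits; the variable names are illustrative.

```python
# Class limits and frequencies from the leisure-time frequency table.
classes = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34), (35, 39)]
freqs = [5, 11, 12, 7, 3, 2]

midpoints = [(lo + hi) / 2 for lo, hi in classes]   # 12, 17, 22, 27, 32, 37

# Weighted mean: every value in a class is treated as sitting at its mid-point.
grouped_mean = sum(m * f for m, f in zip(midpoints, freqs)) / sum(freqs)
print(grouped_mean)
```

Because the raw values are replaced by class mid-points, the grouped mean is an approximation and will generally differ slightly from the mean of the ungrouped data.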
Properties of the arithmetic mean
• For given set of data there is one and only one arithmetic mean
(uniqueness).
• Influenced by each and every value in the data set hence affected
by the extreme values.
Median
• With the observations arranged in increasing or decreasing order,
the median is defined as the middle observation.
a) Ungrouped data: the median is the 0.5(n+1)th ordered observation.
The median is a better measure of central tendency (than the mean)
when the distribution is skewed
b) Grouped data
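To find a unique median value, use the following formula:
$$\tilde{x} = L_m + \left(\frac{\frac{n}{2} - F_c}{f_m}\right) W$$
• where,
• L_m = lower true class boundary of the interval containing the median
• F_c = cumulative frequency of the interval just above the median class interval
• f_m = frequency of the interval containing the median
• W = class interval width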
Example. Compute the median age of 169 subjects from the
grouped data.
• n/2 = 84.5, which falls in the 3rd class interval
• Fc = 70
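The grouped-median calculation can be sketched as follows. The frequency table for this example did not survive in the slides, so the sketch assumes it is the same 169-subject age table used in the variance example later in this deck (classes 10-19 to 60-69 with frequencies 4, 66, 47, 36, 12, 4), which is consistent with Fc = 70.

```python
def grouped_median(lower_boundaries, freqs, width):
    """Median for grouped data: L_m + ((n/2 - F_c) / f_m) * W."""
    n = sum(freqs)
    cum = 0  # cumulative frequency of the classes before the current one
    for lb, f in zip(lower_boundaries, freqs):
        if cum + f >= n / 2:          # this class contains the n/2-th observation
            return lb + (n / 2 - cum) / f * width
        cum += f

# Assumed table: true lower class boundaries and frequencies (see lead-in).
lower_boundaries = [9.5, 19.5, 29.5, 39.5, 49.5, 59.5]
freqs = [4, 66, 47, 36, 12, 4]

median_age = grouped_median(lower_boundaries, freqs, width=10)
print(round(median_age, 2))
```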
Properties of median
Quartiles
• If the data are divided into four equal parts, we speak of
quartiles.
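a) The first quartile (Q1): 25% of all the ranked observations are less than Q1. [25th percentile]
b) The second quartile (Q2): 50% of all the ranked observations are less than Q2. [50th percentile, the median]
c) The third quartile (Q3): 75% of all the ranked observations are less than Q3. [75th percentile]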
Percentiles
– P0: The minimum
– P25: 25% of the sample values are less than or equal to this value.
P25 means 1st Quartile or 25th percentile and given by:-
0.25(n+1)th observation
– P50: 50% of the sample are less than or equal to this value. 2nd
Quartile or 50th percentile and given by:-
0.5(n+1)th observation
– P75: 75% of the sample values are less than or equal to this value.
3rd Quartile or 75th percentile and given by:-
0.75(n+1)th observation
– P100: The maximum
Example: Birth weight in grams
2069, 2581, 2759, 2834, 2838, 2841, 3031, 3101, 3200, 3245, 3248,
3260, 3265, 3314, 3323, 3484, 3541, 3609, 3649, 4146
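The p(n+1)th-observation rule can be applied to these 20 birth weights as sketched below. When the position is not a whole number, this sketch interpolates linearly between the two neighbouring observations; that interpolation is an assumption, since the slides only give the position formula.

```python
def percentile_n_plus_1(sorted_values, p):
    """Value at 1-based position p*(n+1), interpolating between neighbours."""
    n = len(sorted_values)
    pos = p * (n + 1)
    lower = int(pos)              # whole part of the position
    frac = pos - lower            # fractional part used for interpolation
    if lower < 1:
        return sorted_values[0]
    if lower >= n:
        return sorted_values[-1]
    below, above = sorted_values[lower - 1], sorted_values[lower]
    return below + frac * (above - below)

weights = [2069, 2581, 2759, 2834, 2838, 2841, 3031, 3101, 3200, 3245, 3248,
           3260, 3265, 3314, 3323, 3484, 3541, 3609, 3649, 4146]  # already sorted

q1 = percentile_n_plus_1(weights, 0.25)   # 5.25th observation
q2 = percentile_n_plus_1(weights, 0.50)   # 10.5th observation
q3 = percentile_n_plus_1(weights, 0.75)   # 15.75th observation
print(q1, q2, q3, q3 - q1)
```

Other quartile conventions exist and can give slightly different values for the same data.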
Mode
• It is the value that occurs most often.
• The mode of grouped data usually refers to the modal class with
the highest frequency.
Often its value is not unique (more than one mode is possible)
Descriptive statistics
Measures of dispersion
• The amount may be small when the values are close together.
• Example –
– Range = 42-5 = 37
Properties of range
2. Inter-quartile range (IQR)
• Indicates the spread of the middle 50% of the observations, and
used with median
IQR = Q3 - Q1
Example: Suppose the first and third quartiles for weights of girls 12 months of age are 8.8 kg and 10.2 kg, respectively.
IQR = 10.2 - 8.8 = 1.4 kg
i.e., 50% of the infant girls weigh between 8.8 and 10.2 kg.
Example 2
• Given the following data set (age of patients): 18 21 23 24 24 32 42 59
• Solution: Q1 = 22 and Q3 = 37
• Hence, IQR = 37 - 22 = 15
Properties of IQR:
$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
Example. Compute the variance and SD of the age of 169 subjects from the grouped data.
Class interval   (mi)   (fi)   (mi-Mean)   (mi-Mean)^2   (mi-Mean)^2 fi
10-19            14.5   4      -19.88      395.2144      1580.8576
20-29            24.5   66     -9.88       97.6144       6442.5504
30-39            34.5   47     0.12        0.0144        0.6768
40-49            44.5   36     10.12       102.4144      3686.9184
50-59            54.5   12     20.12       404.8144      4857.7728
60-69            64.5   4      30.12       907.2144      3628.8576
Total                   169                1907.2864     20197.6336
Mean = 5810.5/169 = 34.38 years
S^2 = 20197.63/(169 - 1) = 120.22
SD = √S^2 = √120.22 = 10.96
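The grouped-data variance and SD can be checked directly from the class mid-points and frequencies in the table. This is a minimal sketch; note that 5810.5/169 evaluates to about 34.38, and the deviations below are taken from that value.

```python
from math import sqrt

midpoints = [14.5, 24.5, 34.5, 44.5, 54.5, 64.5]
freqs = [4, 66, 47, 36, 12, 4]

n = sum(freqs)                                              # 169 subjects
mean = sum(m * f for m, f in zip(midpoints, freqs)) / n     # 5810.5 / 169

# Sum of squared deviations of the mid-points, weighted by class frequency.
ss = sum(f * (m - mean) ** 2 for m, f in zip(midpoints, freqs))
variance = ss / (n - 1)
sd = sqrt(variance)

print(round(mean, 2), round(variance, 2), round(sd, 2))
```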
Properties of SD
• Has the advantage of being expressed in the same units of
measurement as the mean
CV is the ratio of the SD to the mean multiplied by 100.
$$CV = \frac{S}{\bar{x}} \times 100$$
SD Mean CV (%)
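Because the CV is unit-free, it is typically used to compare the variability of measurements recorded in different units. The sketch below uses two hypothetical variables (the SD and mean values are made up for illustration, since the table rows did not survive in the slides).

```python
def coefficient_of_variation(sd, mean):
    """CV (%) = SD / mean * 100."""
    return sd / mean * 100

# Hypothetical summary statistics for two variables in different units.
cv_weight = coefficient_of_variation(sd=5.0, mean=50.0)     # e.g. weight in kg
cv_height = coefficient_of_variation(sd=8.0, mean=160.0)    # e.g. height in cm

print(f"weight CV = {cv_weight:.1f}%, height CV = {cv_height:.1f}%")
```

In this made-up case weight is relatively more variable than height, even though its SD is smaller in absolute terms.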
Skewed distributions
B. Negatively skewed distribution: occurs when majority of
scores are at the right end of the curve and a few small scores are
scattered at the left end.
Mean, Median & Mode
Which measures to use?
• When the distribution is symmetric, summarize the data using means and
standard deviations.
• When the data are skewed, it is preferable to use the median and IQR as
summary statistics.
• Median and IQR are not easily influenced by extreme values in a skewed
distribution unlike means and standard deviations.
• Remark:
• The mean and median of symmetric distribution coincide.
• When skewed to the right, its mean is larger than its median.
• When skewed to the left, its mean is smaller than its median.(see fig. a-c)
Any question?
Probability and
Probability Distributions
Brain storming
For a certain major operation, the probability of death is 1 in 20 individuals (0.05).
If 19 consecutive individuals undergo the procedure and all of them survive, and the 20th individual is you, what will you decide?
Would you undergo the operation or not? Why?
Types of Events
The Variance of a Discrete Random Variable
1. Binomial Distribution
It is one of the most widely encountered discrete
probability distribution.
Considers dichotomous/ binary random variables
Is based on a process known as Bernoulli trial, James
Bernoulli (1654 – 1705).
– A single trial of an experiment results in only one of two mutually exclusive outcomes (dead or alive, sick or well, male or female, +ve or -ve, yes/no, etc.)
Binomial distribution is used to make inferences
about population proportions.
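The binomial probability of exactly k "successes" in n independent Bernoulli trials, each with success probability p, is C(n, k) p^k (1 - p)^(n - k). The formula is not shown on this slide, so the sketch below is supplied as a standard illustration; it reuses the 1-in-20 death probability from the brainstorming question and assumes the 20 operations are independent.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 20, 0.05   # 20 patients, probability of death 0.05 each (independence assumed)

p_no_deaths = binomial_pmf(0, n, p)
p_one_death = binomial_pmf(1, n, p)
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))

print(round(p_no_deaths, 4), round(p_one_death, 4))
```

Under independence, each patient's risk stays at 0.05 regardless of how many earlier patients survived.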
Finding normal curve areas
1. The table gives the area between a value of Z0 and +∞
3. Read the value of the area (P) from the body of the table where the row and column intersect. Values of P are given as decimals to four places.
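The upper-tail area that such a table lists can also be computed numerically. This sketch uses the identity P(Z ≥ z0) = 0.5 · erfc(z0/√2) for the standard normal distribution; the function name is illustrative, and the result is rounded to four decimal places to mimic a table lookup.

```python
from math import erfc, sqrt

def upper_tail_area(z0):
    """Area under the standard normal curve between z0 and +infinity."""
    return 0.5 * erfc(z0 / sqrt(2))

for z0 in (0.0, 1.0, 1.96):
    print(z0, round(upper_tail_area(z0), 4))
```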
Exercises
Types and Techniques of Sampling
Sampling is a procedure by which some members of the given
population are selected as representative of the entire population for
observation /study purpose.
Sampling would be easy if all populations are
similar/homogenous.
Definitions of terms
• Reference population/Target population: the population of interest to which the findings of the study are going to be generalized.
• Source population: the population from which the study subjects are obtained.
• Study/Sample population: the population included in the sample.
• Sampling unit: the unit of selection in the sampling process.
For example, in a sample of districts, the sampling unit is a district; in a sample of persons, the sampling unit is a person, etc.
• Study unit: the unit on which information is collected.
For example, in a study of the prevalence of disease, the study unit is an individual person. In a study of family size, the study unit is a household.
• Sampling frame- is the list of sampling units in the source
population from which the sample will be selected.
Example:
Researchers are interested to see whether there is association
between sexual debut and severe menstrual pain among
reproductive age group female Hawassa University students.
Types of sampling
I. Probability sampling
Probability sampling is any method of sampling that utilizes some form of random selection.
It is more complex, more time-consuming, and usually more costly than non-probability sampling.
Every individual of the target population has a known and non zero
chance to be included in the sample.
Generalization is possible (from sample to population)
A) Simple random sampling (SRS)
Computer programs
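A computer program can draw a simple random sample directly from a numbered sampling frame. The sketch below uses Python's standard `random` module; the frame of 500 numbered units, the sample size of 20, and the seed are hypothetical choices for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the sampling frame, each equally likely."""
    rng = random.Random(seed)       # a seed makes the draw reproducible
    return rng.sample(frame, n)     # sampling without replacement

frame = list(range(1, 501))         # hypothetical frame: units numbered 1..500
sample = simple_random_sample(frame, n=20, seed=2024)
print(sorted(sample))
```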
Procedure:
57172 42088 70098 11333 26902 29959 43909 49607
33883 87680 28923 15659 09839 45817 89405 70743
77950 67344 10609 87119 15859 74577 42791 75889
11607 11596 01796 24498 17009 67119 00614 49529
56149 55678 38169 47228 49931 94303 67448 31286
80719 65101 77729 83949 83358 75230 56624 27549
93809 19505 82000 79068 45552 86776 48980 56684
40950 86216 48161 17646 24164 35513 94057 51834
12182 59744 65695 83710 41125 14291 74773 66391
13382 48076 73151 48724 35670 38453 63154 58116
38629 94576 48859 75654 17152 66516 78796 73099
60728 32063 12431 23898 23683 10853 04038 75246
01881 99056 46747 08846 01331 88163 74462 14551
23094 29831 95387 23917 07421 97869 88092 72201
15243 21100 48125 05243 16181 39641 36970 99522
53501 58431 68149 25405 23463 49168 02048 31522
07698 24181 01161 01527 17046 31460 91507 16050
22921 25930 79579 43488 13211 71120 91715 49881
68127 00501 37484 99278 28751 80855 02035 10910
55309 10713 36439 65660 72554 77021 46279 22705
92034 90892 69853 06175 61221 76825 18239 47687
50612 84077 41387 54107 09190 74305 68196 75634
81415 98504 32168 17822 49946 37545 47201 85224
38461 44528 30953 08633 08049 68698 08759 45611
07556 24587 88753 71626 64864 54986 38964 83534
60557 50031 75829 05622 30237 77795 41870 26300
SRS has certain limitations:
B) Systematic Random Sampling
• The number of the first student to be included in the sample is
chosen randomly, for example by blindly picking one out of five
pieces of paper, numbered 1 to 5.
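Systematic sampling with a sampling interval of 5, as in the pieces-of-paper example, can be sketched as follows. The list of 100 numbered students is a hypothetical frame; the random start between 1 and 5 plays the role of the blindly picked piece of paper.

```python
import random

def systematic_sample(frame, k, start=None, seed=None):
    """Take every k-th unit of the frame, beginning at a random start in 1..k."""
    if start is None:
        start = random.Random(seed).randint(1, k)   # random start, inclusive of 1 and k
    return frame[start - 1::k]

students = list(range(1, 101))                      # hypothetical frame of 100 students
chosen = systematic_sample(students, k=5, start=3)  # start fixed at 3 for illustration
print(chosen)
```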
Important if the source population is arranged in some order:
– Order of registration of patients
– Numerical order of house numbers
– Students' registration lists
Merits
Demerits
Examples
C) Stratified Sampling
Example: Equal allocation:
Example: Proportionate Allocation
Village A B C D Total
HHs 100 150 120 130 500
S. size ? ? ? ? 60
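Under proportionate allocation each stratum receives a share of the total sample equal to its share of the population: n_h = n × N_h / N. The sketch below applies this to the four villages in the table; rounding to whole households is an assumption, and in general rounded allocations may need a small adjustment to add up to n.

```python
households = {"A": 100, "B": 150, "C": 120, "D": 130}   # HHs per village (from the table)
n = 60                                                   # total sample size

total = sum(households.values())                         # 500 households
exact = {v: n * h / total for v, h in households.items()}
allocation = {v: round(x) for v, x in exact.items()}     # whole households (assumption)

for village in households:
    print(village, exact[village], allocation[village])
```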
Merit
Demerit
D) Cluster sampling
Sometimes it is too expensive to carry out SRS
Population may be large and scattered.
Complete list of the study population unavailable
Population consists of many natural groups
(clusters)
Travel costs can become expensive if interviewers
have to survey people from one end to the
other.
The clusters should be similar to one another (homogeneous between clusters), unlike stratified sampling, where the strata differ from one another (heterogeneous between strata)
• The sampling unit is a cluster, and the sampling frame is a list of these
clusters.
Procedure
• These clusters are often geographic units (eg districts, villages, etc.)
Example: Cluster sampling
[Figure: a population divided into five clusters (Cluster 1 to Cluster 5)]
Merit
• A list of all the individual study units in the reference
population is not required.
• It is sufficient to have a list of clusters.
Demerit
• It is based on the assumption that the characteristic to be studied
is uniformly distributed throughout the reference population,
which may not always be the case.
Hence, sampling error is usually higher than a simple
random sample of the same size.
E) Multi-stage sampling
• Selection is done in stages until the final sampling units (e.g. households or persons) are reached.
• The primary sampling unit (PSU) is the sampling unit (usually large size)
in the first sampling stage.
• The secondary sampling unit (SSU) is the sampling unit in the second
sampling stage, etc.
• Example - The PSUs could be kebeles and the SSUs could be households.
Merit
– No need to have a list of all units in the population.
– Saves a great amount of time and effort
Demerit
– Error will be multiplied
– Provide less precise estimation
F. Sampling with probability
proportional to size (PPS)
Steps in PPS
• List all Kebeles/clusters with their population size/HHs size
• Calculate the cumulative frequency of the population/HHs
• Calculate the sampling interval (say K) by dividing the total population/HHs by the number of Kebeles/clusters to be selected
• Randomly choose a number between 1 and K, say j
• Kebeles/clusters whose cumulative frequency contains the jth, (j+K)th, … unit will be included in the sample
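The PPS steps can be sketched in Python as follows. The five kebele names and their household counts are hypothetical, the sampling interval is taken as a whole number (integer division, an assumption), and the random start j is fixed in the example so the result is reproducible.

```python
from itertools import accumulate

def pps_systematic(sizes, n_clusters, j):
    """Select clusters with probability proportional to size.

    sizes: mapping of cluster name -> population/HH size (listing order is kept)
    n_clusters: number of clusters to select
    j: random start between 1 and the sampling interval K
    """
    names = list(sizes)
    cumulative = list(accumulate(sizes[name] for name in names))
    interval = cumulative[-1] // n_clusters            # sampling interval K
    targets = [j + i * interval for i in range(n_clusters)]

    selected, idx = [], 0
    for target in targets:
        while cumulative[idx] < target:                # find the cluster containing this unit
            idx += 1
        selected.append(names[idx])
    return interval, selected

# Hypothetical kebeles and household counts (illustration only).
sizes = {"K1": 300, "K2": 500, "K3": 200, "K4": 600, "K5": 400}
interval, selected = pps_systematic(sizes, n_clusters=2, j=250)
print(interval, selected)
```

In practice j would be drawn at random, for example with `random.randint(1, interval)`; a very large cluster can be hit by more than one target.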
II. Non-probability sampling
No random selection (unrepresentative of the given population)
It is useful when descriptive comments about the sample itself are desired
2. Volunteer sampling
• Occurs when people volunteer to be involved
in the study.
• In experiments or pharmaceutical trials (drug
testing), for example, it would be difficult and
unethical to enlist random participants from
the general public.
• In these instances, the sample is taken from a
group of volunteers.
Errors in sampling
When we take a sample, our results will not exactly equal the
correct results for the whole population. The sample value
deviates from the population value.
• Two types of errors
Non Sampling Error …
Sampling Distribution
A sampling distribution is a distribution of all possible values of a
statistic computed from samples of the same size randomly selected from
the same population.
Example:
Take a sample (n) from population (N) and calculate the statistics, e.g.,
Mean.
Do you expect all the sample means to be the same?
A. Sampling distribution of mean/Distribution of
sample mean
• Suppose we have a population of size N=4, constituting the ages
of four outpatients.
$$\mu = \frac{\sum x_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$$
$$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}} = 2.236$$
Sample mean (x̄)   Freq   P(x̄)
18 1 0.0625
19 2 0.1250
20 3 0.1875
21 4 0.2500
22 3 0.1875
23 2 0.1250
24 1 0.0625
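Sampling distribution of all sample means
$$\mu_{\bar{x}} = \frac{\sum \bar{x}_i}{N^n} = \frac{18 + 19 + 21 + \dots + 24}{16} = 21$$
$$\sigma_{\bar{x}} = \sqrt{\frac{\sum (\bar{x}_i - \mu_{\bar{x}})^2}{N^n}} = \sqrt{\frac{(18 - 21)^2 + (19 - 21)^2 + \dots + (24 - 21)^2}{16}} = 1.58$$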
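The sampling distribution above can be reproduced by enumerating every possible sample. The 16 sample means imply samples of size n = 2 drawn with replacement from the four ages; that reading is inferred from the frequency table rather than stated on the slide.

```python
from collections import Counter
from itertools import product
from math import sqrt

population = [18, 20, 22, 24]   # ages of the four outpatients
n = 2                           # sample size (inferred from the 16 samples)

# All ordered samples of size n drawn with replacement: 4^2 = 16 samples.
sample_means = [sum(s) / n for s in product(population, repeat=n)]

mu = sum(population) / len(population)
sigma = sqrt(sum((x - mu) ** 2 for x in population) / len(population))

mu_xbar = sum(sample_means) / len(sample_means)
se = sqrt(sum((m - mu_xbar) ** 2 for m in sample_means) / len(sample_means))

freq = Counter(sample_means)    # frequency of each distinct sample mean
print(mu, round(sigma, 3), mu_xbar, round(se, 2))
```

The standard deviation of the sample means equals sigma divided by the square root of n, which is the standard error of the mean.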
Compare the population distribution with its sampling
distribution
We note that the mean of the sampling distribution of the mean has the same
value as the mean of the population.
However, the variance is different from the original population variance: it is equal to the population variance divided by the sample size used to obtain the sampling distribution.
The square root of the sampling distribution variance is called standard error of
the mean or, simply, standard error.
Or, the standard deviation of any sample statistics is called its standard error.
Standard error is determined by both the sample size and the degree of
variability among the individual observations.
Properties of sampling distribution of mean
C. Sampling Distribution of proportion/
Distribution of Sample Proportion
Properties of Sampling Distribution of
sample proportion
• Construction of sampling distribution of
sample proportion is done in manner similar
to that of sampling distribution of sample
mean.
Inferential Statistics
Statistical Estimation
• Two methods of estimation:
– Point estimation
– Interval estimation
Degrees of Freedom (df)
Idea: Number of observations that are free to vary
after sample mean has been calculated