Lect 4 The Normal distributionXIUGAI

1. The document discusses the normal distribution and key concepts such as the mean, standard deviation, symmetry, and infinite extensibility. 2. It introduces the standard normal distribution and how it can be used to compare areas under the normal curve between different distributions. 3. Applications of the normal distribution include establishing medical reference ranges based on measurement data from healthy populations.

Uploaded by

fareehakanwar93

Medical Statistics

医学统计学
Lecture 4 Normal distribution
正态分布

Dept. of Epi & Biostatistics


Key points of last class
• Relative measures
• Frequency (prevalence rate and constituent ratio)
• Intensity (rate)
• Ratio
• Application of relative numbers
• Standardization of crude rates
• Dynamic series and their analysis indices
Vocabulary for Lecture 4
Normal Distribution 正态分布
Standard normal distribution 标准正态分布
Reference Range 参考值范围
Peak 峰
Curve 曲线
Density function 密度函数
Random sampling 随机抽样
Instrument 设备
Quality control 质量控制
Two-side 双侧
One-side 单侧
Percentile 百分位数
Content:
• Concept of normal distribution
• Standard normal distribution
• Application of normal distribution

1. Concept of normal distribution

1.1 Frequency Distribution

Let's recall:
Example 2.1 The serum iron levels (μmol/L) of 120 healthy men aged 18 to 35, randomly selected from a community, are listed below:
Continuous measurement data
7.42 8.65 23.02 21.61 21.31 21.46 9.97 22.73 14.94 20.18 21.62 23.07

20.38 8.40 17.32 29.64 19.69 21.69 23.90 17.45 19.08 20.52 24.14 23.77

18.36 23.04 24.22 24.13 21.53 11.09 18.89 18.26 23.29 17.67 15.38 18.61

14.27 17.40 22.55 17.55 16.10 17.98 20.13 21.00 14.56 19.89 19.82 17.48

14.89 18.37 19.50 17.08 18.12 26.02 11.34 13.81 10.25 15.94 15.83 18.54

24.52 19.26 26.13 16.99 18.89 18.46 20.87 17.51 13.12 11.75 17.40 21.36

17.14 13.77 12.50 20.40 20.30 19.38 23.11 12.67 23.02 24.36 25.61 19.53

14.77 14.37 24.75 12.73 17.25 19.09 16.79 17.19 19.32 19.59 19.12 15.31

21.75 19.47 15.51 10.86 27.81 21.65 16.32 20.75 22.11 13.17 17.55 19.26

12.65 18.48 19.83 23.12 19.22 19.22 16.72 27.90 11.74 24.66 14.18 16.52
Approach of frequency distribution table

Group (μmol/L)   Tally           Frequency   Cumulative frequency
6~               一              1           1
8~               上              3           4
10~              正一            6           10
12~              正上            8           18
14~              正正丅          12          30
16~              正正正正        20          50
18~              正正正正正丅    27          77
20~              正正正上        18          95
22~              正正丅          12          107
24~              正上            8           115
26~              止              4           119
28~30            一              1           120
Total                            120
Approach of frequency distribution table

Group (μmol/L)   Frequency (%)   Cumulative frequency (%)
6~               0.83            0.83
8~               2.50            3.33
10~              5.00            8.33
12~              6.67            15.00
14~              10.00           25.00
16~              16.67           41.67
18~              22.50           64.17
20~              15.00           79.17
22~              10.00           89.17
24~              6.67            95.83
26~              3.33            99.17
28~30            0.83            100.00
Total            100.00
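The percentage columns can be recomputed from the group counts. The sketch below is only an illustration: the counts are copied from the frequency table, and the function name `frequency_table` is a made-up helper, not something from the lecture.

```python
# Group lower bounds (μmol/L) and counts, copied from the frequency table.
lower_bounds = [6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28]
counts = [1, 3, 6, 8, 12, 20, 27, 18, 12, 8, 4, 1]

def frequency_table(counts):
    """Return (percent, cumulative_percent) lists for a list of group counts."""
    n = sum(counts)
    percent, cumulative_percent = [], []
    running = 0
    for c in counts:
        running += c
        percent.append(100 * c / n)
        # Accumulate the counts, not the rounded percentages,
        # so that rounding errors do not pile up.
        cumulative_percent.append(100 * running / n)
    return percent, cumulative_percent

percent, cumulative = frequency_table(counts)
for lo, p, cp in zip(lower_bounds, percent, cumulative):
    print(f"{lo:>2}~  {p:6.2f}  {cp:6.2f}")
```

Accumulating counts rather than rounded percentages keeps the cumulative column exact: 115 of 120 observations is 95.83%.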
Chart of frequency distribution
[Figure: histogram of the frequency distribution of the serum iron measurements; horizontal axis: serum iron (μmol/L); vertical axis: frequency (%), 0.00 to 25.00]

Note: the horizontal axis shows the group values; the vertical axis can show either frequency or frequency density.
Frequency density = frequency / group interval
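As a small worked illustration of the frequency density formula, the sketch below divides each group's frequency (in %) by the group interval of 2 μmol/L. The counts come from the serum iron table; the variable names are illustrative only.

```python
counts = [1, 3, 6, 8, 12, 20, 27, 18, 12, 8, 4, 1]  # groups 6~, 8~, ..., 28~30
width = 2                                           # group interval in μmol/L
n = sum(counts)

# Frequency (%) of each group, then frequency density = frequency / group interval.
frequency_pct = [100 * c / n for c in counts]
density = [f / width for f in frequency_pct]

# With density on the vertical axis, bar area = density × width,
# so the bar areas add up to the total frequency (100%).
total_area = sum(d * width for d in density)
print(max(density), total_area)
```

Plotting density instead of raw frequency makes the total bar area fixed, which is what lets a histogram be compared with a probability curve whose total area is 1.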
1.2 Approaching the Normal Distribution

• Increase the sample size and decrease the group interval: the frequency distribution approaches the normal distribution.
• The normal distribution is a special continuous probability distribution.
[Figure: frequency density of serum iron; horizontal axis: serum iron (μmol/L), 7 to 29; vertical axis: frequency density, 0 to 12.50]
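The effect of a larger sample and a narrower group interval can be tried out with simulated data. This is a rough sketch under stated assumptions: the data are random draws from a standard normal distribution (not the serum iron data), and `histogram_density` and `max_gap` are hypothetical helper names.

```python
import random
from statistics import NormalDist

def histogram_density(samples, lo, hi, width):
    """Relative frequency density for equal-width groups on [lo, hi)."""
    n_bins = round((hi - lo) / width)
    counts = [0] * n_bins
    for x in samples:
        k = int((x - lo) // width)
        if 0 <= k < n_bins:
            counts[k] += 1
    n = len(samples)
    return [c / n / width for c in counts]

def max_gap(n, width, mu=0.0, sigma=1.0, seed=0):
    """Largest distance between the histogram and the normal curve at group midpoints."""
    rng = random.Random(seed)
    samples = [rng.gauss(mu, sigma) for _ in range(n)]
    lo, hi = mu - 6 * sigma, mu + 6 * sigma
    dens = histogram_density(samples, lo, hi, width)
    curve = NormalDist(mu, sigma)
    return max(abs(d - curve.pdf(lo + (k + 0.5) * width))
               for k, d in enumerate(dens))

coarse = max_gap(n=120, width=1.0)      # small sample, wide groups
fine = max_gap(n=100_000, width=0.1)    # large sample, narrow groups
print(coarse, fine)
```

With the large sample and narrow groups, the histogram should hug the smooth curve much more closely than with 120 observations in wide groups.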
1.3 Properties of normal distribution

• Infinite extensibility.
• Symmetry.
• Two parameters.
• The area under the curve.
1.3.1 Infinite extensibility

• Bell curve / normal curve: the curve extends indefinitely in both directions, approaching but never touching the horizontal axis.
• The normal curve was developed mathematically in 1733.
• Gauss used the normal curve to analyze astronomical data in 1809.
• The normal curve is often called the Gaussian distribution.
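1.3.2 Symmetry

• The curve is symmetric about its peak, which lies at the center, where the mean and the median coincide.
[Figure: symmetric normal curve; horizontal axis from −3 to 3]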
1.3.3 Two Parameters
• The normal distribution is characterized by two parameters: the mean (µ or X̄) and the standard deviation (σ or s).
Two Parameters
• The mean (µ or X̄) is a measure of location or center, and the standard deviation (σ or s) is a measure of scale or spread.
• The mean can be any value between −∞ and +∞.
• The standard deviation must be positive.
• Each pair of values of µ and σ defines a specific normal distribution; collectively, all possible normal distributions make up the normal family.
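Two Parameters: mean
• Position translation: changing the mean µ shifts the curve along the horizontal axis without changing its shape.
[Figure: three normal curves with means µ1, µ2, µ3 and standard deviations σ1, σ2, σ3]
[Figure: position translation, including the curve N(−1, 0.5²)]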
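Two Parameters: changing σ
• With the mean fixed, a smaller σ gives a taller, narrower curve; a larger σ gives a flatter, wider one.
[Figure: three normal curves labelled 1, 2, 3 with means µ1, µ2, µ3 and standard deviations σ1, σ2, σ3; horizontal axis from −3 to 3]
[Figure: changing σ, with curves for σ = 0.5, σ = 1 and σ = 2; horizontal axis from −6 to 6, vertical axis from 0 to 0.9]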
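How σ changes the shape can be checked numerically. The sketch below uses Python's standard `statistics.NormalDist` to evaluate the height of the curve at its mean for the three σ values shown in the figure; `peak_height` is an illustrative helper name.

```python
from statistics import NormalDist

def peak_height(sigma, mu=0.0):
    """Height of the N(mu, sigma^2) curve at its mean, i.e. at the peak."""
    return NormalDist(mu=mu, sigma=sigma).pdf(mu)

for sigma in (0.5, 1, 2):
    print(f"sigma = {sigma}: peak height = {peak_height(sigma):.3f}")

# Shifting the mean moves the curve but leaves the peak height unchanged.
same_height = abs(peak_height(0.5, mu=-1) - peak_height(0.5, mu=0)) < 1e-12
print(same_height)
```

Halving σ doubles the peak height, because the total area under the curve has to stay equal to 1.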
1.3.4 The area under the curve
• The total area under the curve is 1.
• Probability density function of a normal distribution:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \qquad -\infty < X < +\infty$$

where $e \approx 2.71828$ and $\pi \approx 3.14159$.
• Denoted by $N(\mu, \sigma^2)$.
[Figure: the area under the normal curve]

Proportion Rule of Normal Distribution
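
The slide's figure is not preserved, so the sketch below only illustrates the idea: it writes out the normal density function, integrates it numerically, and reports the share of the area within µ ± 1σ, µ ± 1.96σ and µ ± 2.58σ. The choice of N(8, 2²) and the helper names `normal_pdf` and `area` are assumptions for the illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x, written out from the formula."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def area(a, b, mu, sigma, steps=20000):
    """Trapezoid-rule approximation of the area under the curve between a and b."""
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    for k in range(1, steps):
        total += normal_pdf(a + k * h, mu, sigma)
    return total * h

mu, sigma = 8, 2   # illustrative values
print(f"total area: {area(mu - 10 * sigma, mu + 10 * sigma, mu, sigma):.4f}")
for k in (1, 1.96, 2.58):
    share = area(mu - k * sigma, mu + k * sigma, mu, sigma)
    print(f"within mu ± {k} sigma: {share:.4f}")
```

The proportions depend only on the number of standard deviations from the mean, not on the particular µ and σ, which is what makes a single standard table usable for every normal distribution.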


Example 4.1

Compare the area for X < 6 with the area for X > 11.84 in the same normal distribution N(8, 2²).
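
Because both areas belong to the same distribution, they can be compared directly. A minimal sketch using the standard library's `statistics.NormalDist` (the variable names are illustrative):

```python
from statistics import NormalDist

dist = NormalDist(mu=8, sigma=2)       # N(8, 2^2)

left_area = dist.cdf(6)                # area to the left of 6
right_area = 1 - dist.cdf(11.84)       # area to the right of 11.84

print(f"P(X < 6)     = {left_area:.4f}")
print(f"P(X > 11.84) = {right_area:.4f}")
```

The value 6 lies 1 SD below the mean while 11.84 lies 1.92 SD above it, so the left-hand tail is the larger of the two.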
Example 4.2

• Compare the area for X1 < 6 in the normal distribution N(8, 2²) with the area for X2 < 10 in the normal distribution N(12, 3²).
Example 4.2
• You can't compare the areas directly.
• Which is bigger, $\frac{5}{7}$ or $\frac{2}{3}$?
  $\frac{5}{7} = \frac{15}{21}$ and $\frac{2}{3} = \frac{14}{21}$, so $\frac{5}{7} > \frac{2}{3}$.
• Can we find a common standard for comparing areas in different normal distributions?
Standard Normal Distribution
2. Standard Normal (Z) Distribution
• The standard (or canonical) normal distribution is a special member of the normal family that has a mean of 0 and a standard deviation of 1.
• It is called the Z distribution.
• Denoted by N(0, 1²).
2. Standard Normal Distribution
• The standard normal distribution is important because the probabilities and percentiles of any normal distribution can be computed from it, provided µ and σ are known.
• The formula: $Z = \frac{X - \mu}{\sigma}$ or $Z = \frac{X - \bar{X}}{s}$
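Go back to example 4.2
• X1 ~ N(8, 2²): $Z_1 = \frac{X_1 - \mu}{\sigma} = \frac{6 - 8}{2} = -1.00$
• X2 ~ N(12, 3²): $Z_2 = \frac{X_2 - \mu}{\sigma} = \frac{10 - 12}{3} = -0.67$
• Turn to page 139. The table gives the two-tailed area, so it is halved to get the area to the left of Z, written Φ(Z).
• Φ(Z1) = Φ(−1.00) = 0.3173/2 = 0.1587
• Φ(Z2) = Φ(−0.67) = 0.5029/2 = 0.2515
• The area to the left of 10 in N(12, 3²) is therefore larger than the area to the left of 6 in N(8, 2²).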
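The standardization step can be sketched in a few lines. `z_score` is an illustrative helper; `NormalDist().cdf` from the Python standard library stands in for the printed table and returns the area to the left of Z.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standardize x from N(mu, sigma^2) to the Z scale: Z = (x - mu) / sigma."""
    return (x - mu) / sigma

std_normal = NormalDist()          # N(0, 1)

z1 = z_score(6, mu=8, sigma=2)     # X1 ~ N(8, 2^2)
z2 = z_score(10, mu=12, sigma=3)   # X2 ~ N(12, 3^2)

p1 = std_normal.cdf(z1)            # area to the left of Z1
p2 = std_normal.cdf(z2)            # area to the left of Z2
print(f"Z1 = {z1:.2f}, area = {p1:.4f}")
print(f"Z2 = {z2:.2f}, area = {p2:.4f}")
```

The second area comes out slightly above the table-based value because the code keeps Z2 unrounded, whereas the table lookup rounds it to −0.67.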
Example 4.3
• The heights of boys aged 8 in Hefei city are normally distributed with a mean of 123.02 cm and an SD of 4.79 cm.
• What proportion of boys are taller than 130 cm?
• What proportion of boys have heights between 120 cm and 128 cm?
• Find the height interval, centered on the mean, that covers 60% of the boys.
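Example 4.3

Here Φ(Z) is the area under the standard normal curve to the left of Z.

1) $Z = \frac{130 - 123.02}{4.79} = 1.46$, so $P(X > 130) = 1 - \Phi(1.46) = 0.0721$.

2) $Z_1 = \frac{120 - 123.02}{4.79} = -0.63$ and $Z_2 = \frac{128 - 123.02}{4.79} = 1.04$, so $P(120 < X < 128) = \Phi(1.04) - \Phi(-0.63) = 0.8508 - 0.2643 = 0.5865$.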
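Example 4.3
3) 1 − 0.60 = 0.40, so each tail holds 0.40 / 2 = 0.20.
• Turn to page 139: a two-tailed area of 0.40 (0.20 in each tail) corresponds to |Z| = 0.84.
• The central 60% is therefore −0.84 < Z < 0.84:

$$-0.84 < \frac{X - 123.02}{4.79} < 0.84$$

$$X = 123.02 \pm 4.79 \times 0.84$$

$$119.00 < X < 127.04$$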
3. Application of normal distribution
• Medical reference range (reference interval).
• Quality control.
• The basis of other statistical distributions.
3.1 Reference range
Report of the endocrine lab, the First Affiliated Hospital of AHMU
Name: Tang xia        Sample number: 310             Time: 2011-05-09
Sex: female           Subject number: 2011178211     Type of sample: blood
Age: 39 years         Department: FY                 Doctor: Chen xuhua

Abbreviation   Inspection item                Result   Units    Reference range
T3             Triiodothyronine               1.55     nmol/L   0.92~2.79
T4             Tetraiodothyronine             76.80    nmol/L   58.10~140.60
TSH            Thyroid-stimulating hormone    2.938    uIU/L    0.350~5.500

Time: 2011-05-09      Examiner: Ye zhiheng

What is a reference range?
• It usually describes the variation of a measurement or value in “healthy / normal individuals”.
• It gives a physician or other health professional a basis for interpreting a particular patient's results (physiological, biochemical, … measurements).
• It covers the values of most “healthy / normal individuals”.
“Healthy / Normal individuals”
• “Healthy individuals” are not necessarily perfectly healthy people; some may have a disease, provided it does not affect the index under study.
• Investigating the reference range of height for boys aged 8: three of them have a heavy cold (a cold does not affect height, so they can be included).
• Investigating the reference range of weight for boys aged 8: three of them have diarrhea (diarrhea can affect weight, so they should be excluded).
“Most” healthy individuals
• What is “most”?
• 60%? 80%? 90%? 95%? 99%?
• Think about the small-probability event.
• Usually, it is 95%.
• Two-sided or one-sided.
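Two-sided or one-sided?
• Decided by professional knowledge and the research purpose.
• Two-sided: for a variable where values that are either too high or too low are abnormal, e.g. height, weight, blood pressure.
[Figure: standard normal curve with an area of 0.025 in each tail, beyond −1.96 and 1.96]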
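Two-sided or one-sided?
• One-sided:
• Only too high is abnormal (upper limit only), e.g. hair mercury, blood lead.
• Only too low is abnormal (lower limit only), e.g. IQ, vital capacity (lung capacity, an index of respiratory function).
[Figure: standard normal curves with an area of 0.05 in one tail, above 1.64 or below −1.64]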
How to work out a reference range?
• Define “normal person” clearly.
• Is grouping needed? (Male/female? Age groups?)
• Determine the sample size and use random sampling.
• Measurement (instrument, method, quality control, …)
• Two-sided or one-sided?
• “Most individuals”? (99%? 95%? 90%? …)
• Statistical method?
  • Normal distribution method
  • Percentile method
Statistical method?
• If the frequency distribution is close to a normal distribution: normal distribution method.
• If the data are not normally distributed: percentile method.

        Normal distribution method                      Percentile method
%       Two-sided        One-sided     One-sided        Two-sided      One-sided     One-sided
        X̄ ± Z(α/2)·s     lower limit   upper limit                     lower limit   upper limit
90      X̄ ± 1.64s        X̄ − 1.28s     X̄ + 1.28s        P5~P95         P10           P90
95      X̄ ± 1.96s        X̄ − 1.64s     X̄ + 1.64s        P2.5~P97.5     P5            P95
99      X̄ ± 2.58s        X̄ − 2.33s     X̄ + 2.33s        P0.5~P99.5     P1            P99
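
The multipliers in the normal distribution columns are quantiles of the standard normal distribution. The sketch below recomputes them with `statistics.NormalDist.inv_cdf`; the function names `z_two_sided` and `z_one_sided` are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()   # N(0, 1)

def z_two_sided(level):
    """Multiplier for a two-sided range: alpha/2 is cut off in each tail."""
    alpha = 1 - level
    return std_normal.inv_cdf(1 - alpha / 2)

def z_one_sided(level):
    """Multiplier for a one-sided range: all of alpha is cut off in one tail."""
    return std_normal.inv_cdf(level)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: two-sided {z_two_sided(level):.3f}, "
          f"one-sided {z_one_sided(level):.3f}")
```

The computed values agree with the table to within rounding; the 1.64 used for the 95% one-sided limit is about 1.645 when more digits are kept.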


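Normal distribution method
• If the frequency distribution is close to a normal distribution:
• Two-sided (1 − α) range:

$$\bar{X} - Z_{\alpha/2}\, s < X < \bar{X} + Z_{\alpha/2}\, s, \quad \text{denoted } \bar{X} \pm Z_{\alpha/2}\, s$$

• One-sided (1 − α) range:

$$X > \bar{X} - Z_{\alpha}\, s \quad \text{or} \quad X < \bar{X} + Z_{\alpha}\, s$$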
Percentile method
• If the frequency distribution is not a normal distribution:
• Two-sided (1 − 0.05) range:

$$P_{2.5} < X < P_{97.5}$$

• One-sided (1 − 0.05) range:

$$X > P_{5} \quad \text{or} \quad X < P_{95}$$
Example 4.4
• The hemoglobin of 120 healthy females is normally distributed with a mean of 117.4 g/L and an SD of 10.2 g/L. Calculate the 95% reference range of hemoglobin for healthy females.
• For hemoglobin, values that are either too high or too low are abnormal, so the range is two-sided.

$$\bar{X} \pm 1.96S = 117.4 \pm 1.96 \times 10.2 = 97.41 \sim 137.39 \text{ (g/L)}$$
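
The two-sided calculation of Example 4.4 can be reproduced as follows. This is only a sketch: `two_sided_range` is a made-up helper, and the coverage check assumes the population really is N(117.4, 10.2²).

```python
from statistics import NormalDist

def two_sided_range(mean, sd, z=1.96):
    """Two-sided reference range: mean ± z·sd (z = 1.96 for 95%)."""
    return mean - z * sd, mean + z * sd

low, high = two_sided_range(117.4, 10.2)
print(f"95% reference range: {low:.2f} to {high:.2f} g/L")

# Share of a N(117.4, 10.2^2) population that falls inside the range.
hemoglobin = NormalDist(mu=117.4, sigma=10.2)
coverage = hemoglobin.cdf(high) - hemoglobin.cdf(low)
print(f"coverage: {coverage:.4f}")
```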


Example 4.5
• The vital capacity (lung capacity) of 100 healthy men is normally distributed with a mean of 4.2 L and an SD of 0.7 L. Calculate the 95% reference range of vital capacity for healthy men.
• Vital capacity is an index of respiratory function; higher is better.
• It has no upper limit, so the range is one-sided with a lower limit only.

$$\bar{X} - 1.64S = 4.2 - 1.64 \times 0.7 = 3.052$$

• The reference range of vital capacity is more than 3.052 L.
Example 4.6
• The blood lead content of 110 people is normally distributed with a mean of 12.0 μg/100g and an SD of 2.5 μg/100g. Calculate the 95% reference range of blood lead content for them.
• Blood lead content is an index used in examinations for lead poisoning; lower is better.
• It has no lower limit, so the range is one-sided with an upper limit only.

$$\bar{X} + 1.64S = 12.0 + 1.64 \times 2.5 = 16.1$$

• The reference range of blood lead is less than 16.1 μg/100g.
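
The one-sided limits of Examples 4.5 and 4.6 follow the same pattern. In this sketch `lower_limit` and `upper_limit` are illustrative helpers that use the table value z = 1.64 for a 95% one-sided range.

```python
def lower_limit(mean, sd, z=1.64):
    """One-sided range with a lower limit only (low values are abnormal)."""
    return mean - z * sd

def upper_limit(mean, sd, z=1.64):
    """One-sided range with an upper limit only (high values are abnormal)."""
    return mean + z * sd

vital_capacity_limit = lower_limit(4.2, 0.7)   # Example 4.5, in L
blood_lead_limit = upper_limit(12.0, 2.5)      # Example 4.6, in μg/100g

print(f"vital capacity: more than {vital_capacity_limit:.3f} L")
print(f"blood lead: less than {blood_lead_limit:.1f} μg/100g")
```

Which function applies is a subject-matter decision, not a statistical one: it depends on whether high or low values of the index are the abnormal ones.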
Example 4.7
• The blood lead contents (μg/100g) of 200 workers are given below. Calculate the 95% reference range of blood lead content for workers.

Group     Frequency   Cumulative frequency
3-        36          36
8-        39          75
13-       47          122
18-       30          152
23-       18          170
28-       16          186
33-       3           189
38-       7           196
43-       3           199
48-52     1           200
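Example 4.7
• It is a skewed distribution, so the percentile method is used.
• It has no lower limit, so only the upper limit P95 is needed.

$$P_{95} = L + \frac{i}{f_x}\left(n \cdot x\% - \sum f_L\right) = 38 + \frac{5}{7}\left(200 \times 95\% - 189\right) = 38.7\ \mu\text{g}/100\text{g}$$

where L is the lower bound of the group containing P95, i is the group interval, f_x is the frequency of that group, n is the sample size and Σf_L is the cumulative frequency below that group.

• The reference range of blood lead content for workers is lower than 38.7 μg/100g.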
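
The grouped percentile calculation can be sketched as a short function. The group bounds and the interval of 5 are read from the table of Example 4.7, with the group frequencies chosen to agree with its cumulative column; `grouped_percentile` is a hypothetical helper name.

```python
def grouped_percentile(lower_bounds, counts, width, pct):
    """Percentile from a grouped frequency table by linear interpolation:
    P = L + (i / f) * (n * pct% - cumulative frequency below the group)."""
    n = sum(counts)
    target = n * pct / 100
    cumulative = 0
    for lower, f in zip(lower_bounds, counts):
        # The percentile lies in the first group whose cumulative
        # frequency reaches the target rank.
        if f > 0 and cumulative + f >= target:
            return lower + width / f * (target - cumulative)
        cumulative += f
    raise ValueError("pct must be between 0 and 100")

lower_bounds = [3, 8, 13, 18, 23, 28, 33, 38, 43, 48]
counts = [36, 39, 47, 30, 18, 16, 3, 7, 3, 1]   # consistent with the cumulative column

p95 = grouped_percentile(lower_bounds, counts, width=5, pct=95)
print(f"P95 = {p95:.1f} μg/100g")
```

The interpolation assumes observations are spread evenly within a group, which is the usual convention for grouped data.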
3.2 Quality control
• Example 4.8: Over one month, the daily output of a new kind of machine is normally distributed with a mean of 450 per day and an SD of 20 per day. Today the machine produced 540. Is there something wrong with the machine?

$\bar{X} - 1S < X < \bar{X} + 1S$: normal
$\bar{X} - 2S < X < \bar{X} + 2S$: warning limits
$\bar{X} - 3S < X < \bar{X} + 3S$: control limits
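
A minimal sketch of how the limits might be applied to Example 4.8. The three status labels are an assumed reading of the warning and control limits, and `control_limits` and `status` are illustrative names.

```python
def control_limits(mean, sd, k):
    """Limits at mean ± k·sd (k = 2 for warning limits, k = 3 for control limits)."""
    return mean - k * sd, mean + k * sd

def status(value, mean, sd):
    """Classify a value by its distance from the mean in SD units (assumed labels)."""
    distance = abs(value - mean) / sd
    if distance <= 2:
        return "within warning limits"
    if distance <= 3:
        return "between warning and control limits"
    return "outside control limits"

mean, sd = 450, 20
print("warning limits:", control_limits(mean, sd, 2))
print("control limits:", control_limits(mean, sd, 3))
print("today (540):", status(540, mean, sd))
```

An output of 540 is (540 − 450) / 20 = 4.5 SD above the mean, beyond the upper control limit of 510, so under this reading the machine should be checked.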
Example 4.8
[Figure: control chart of daily production; horizontal axis: days]
3.3 The basis of other statistical distributions
• We often transform other distributions to the normal distribution for further analysis.
• Sometimes we explain theories using the properties of the normal distribution.
