
Lec5&6 02sep2016


Introduction to Biostatistics

Shamik Sen
Dept. of Biosciences & Bioengineering
IIT Bombay
Average or Arithmetic Mean

Sample Mean, $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$

Population Mean, µ
Transformations on Arithmetic Mean

y = ax

y=c+x

y = c + ax
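The effect of each transformation on the mean can be checked numerically. The sketch below uses hypothetical values for x, a, and c (none of them come from the slides) and illustrates that the mean of ax is a times the mean of x, the mean of c + x is c plus the mean of x, and the mean of c + ax is c plus a times the mean of x.

```python
from statistics import mean

# Hypothetical sample and constants, chosen only for illustration.
x = [4, 8, 6, 2]
a, c = 2.0, 3.0

mean_x = mean(x)                                # 5
mean_scaled = mean([a * xi for xi in x])        # y = ax
mean_shifted = mean([c + xi for xi in x])       # y = c + x
mean_linear = mean([c + a * xi for xi in x])    # y = c + ax

print(mean_scaled, a * mean_x)
print(mean_shifted, c + mean_x)
print(mean_linear, c + a * mean_x)
```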
Geometric Mean

The geometric mean is preferred when some values in the dataset are much larger than the others

15, 10, 5, 8, 17, 100

Arithmetic Mean = 25.8


Geometric Mean = 14.7
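The two values above can be reproduced with a few lines of code. This is a minimal sketch that computes the geometric mean as the exponential of the average logarithm, which assumes all values are positive.

```python
import math

data = [15, 10, 5, 8, 17, 100]
n = len(data)

arithmetic_mean = sum(data) / n
# Geometric mean = n-th root of the product, computed via logs for stability.
geometric_mean = math.exp(sum(math.log(v) for v in data) / n)

print(round(arithmetic_mean, 1), round(geometric_mean, 1))  # 25.8 14.7
```

The single large value (100) pulls the arithmetic mean well above most of the data, while the geometric mean stays closer to the bulk of the values.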
Median

The median of a set of N measurements is the value that falls in the middle position when measurements are ordered from smallest to largest

Median position = 0.5 * (N+1)
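The position formula can be turned into a short function. In this sketch the function name is illustrative, and a fractional position (even N) is resolved by averaging the two neighbouring values, which is the usual convention.

```python
def median_by_position(values):
    """Median via the 1-based position 0.5 * (N + 1)."""
    ordered = sorted(values)
    n = len(ordered)
    position = 0.5 * (n + 1)      # 1-based position in the ordered data
    lower = int(position)         # whole part of the position
    if position == lower:         # odd N: position is an exact rank
        return ordered[lower - 1]
    # Even N: position ends in .5, so average the two middle values.
    return (ordered[lower - 1] + ordered[lower]) / 2

print(median_by_position([100, 95, 40, 75, 60]))     # 75
print(median_by_position([15, 10, 5, 8, 17, 100]))   # 12.5
```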


Mode

Mode is the most frequently occurring value

Number of visits to a dental clinic in a typical week

6 7 5 1 8
4 9 3 3 4
7 2 1 4 5
5 5 5 5 7
3 4 4 5 8
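For the clinic data above, the mode can be found by counting how often each value occurs. A minimal sketch using the standard library:

```python
from collections import Counter

visits = [
    6, 7, 5, 1, 8,
    4, 9, 3, 3, 4,
    7, 2, 1, 4, 5,
    5, 5, 5, 5, 7,
    3, 4, 4, 5, 8,
]

counts = Counter(visits)
# most_common(1) returns the (value, count) pair with the highest count.
mode_value, mode_count = counts.most_common(1)[0]
print(mode_value, mode_count)  # 5 7
```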
Mean Vs Median Vs Mode

• Mode is used for large datasets.


• Mean & Median are used for both small and large datasets

• Median is less sensitive to outliers


Example 1: Mean Vs Median Vs Mode

1, 2, 3, 2, 4, 2, 8, 3, 6, 3, 2, 5, 45, 36, 89
Example 2: Mean Vs Median Vs Mode

1, 41, 2, 36, 3, 3, 40, 2, 29, 4, 2, 3, 3, 6, 3, 2, 5, 45, 36, 39
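The three measures can be compared directly on the datasets of Example 1 and Example 2. This sketch uses Python's statistics module; the variable names are illustrative.

```python
from statistics import mean, median, mode

example_1 = [1, 2, 3, 2, 4, 2, 8, 3, 6, 3, 2, 5, 45, 36, 89]
example_2 = [1, 41, 2, 36, 3, 3, 40, 2, 29, 4, 2, 3, 3, 6, 3, 2, 5, 45, 36, 39]

summary = {}
for name, data in (("Example 1", example_1), ("Example 2", example_2)):
    summary[name] = (mean(data), median(data), mode(data))
    print(name, "mean=%.2f median=%s mode=%s" % summary[name])
```

In both datasets a few large values pull the mean (about 14.07 and 15.25) far above the median (3 and 3.5) and the mode (2 and 3).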


Example 3: Mean Vs Median Vs Mode

Symmetric Versus Asymmetric Distributions

[Figure: relative frequency versus height, with panels Day 1, Day 3, and Day 2]
Measures of Variability: Range

Range = maximum – minimum

100, 95, 40, 75, 60

The range should be interpreted relative to the minimum or maximum value


Measures of Variability: Mean Absolute Deviation

Example: 1, 2, 5, 8, 12, 8, 1, 7, 5, 42
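A minimal sketch of the calculation for this example, assuming the mean absolute deviation is defined as the average of the absolute deviations from the arithmetic mean:

```python
data = [1, 2, 5, 8, 12, 8, 1, 7, 5, 42]
n = len(data)

mean_value = sum(data) / n                       # 9.1
# Average distance of each measurement from the mean (assumed definition).
mean_abs_dev = sum(abs(v - mean_value) for v in data) / n

print(round(mean_value, 2), round(mean_abs_dev, 2))  # 9.1 7.16
```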
Measures of Variability: Standard Deviation

Variance = square of standard deviation
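The relation between variance and standard deviation can be checked on the five values used for the range example (100, 95, 40, 75, 60). This sketch assumes the sample definitions with an n - 1 denominator, which is what statistics.variance and statistics.stdev compute.

```python
from statistics import stdev, variance

data = [100, 95, 40, 75, 60]

sample_variance = variance(data)   # sum of squared deviations / (n - 1)
sample_sd = stdev(data)            # square root of the sample variance

print(sample_variance, round(sample_sd, 2))
```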


[Figure: relative frequency versus height]
Transformations with Standard Deviation

y = ax

y=c+x

y = c + ax
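A numerical check of how these transformations affect the standard deviation: scaling by a multiplies it by |a|, and adding a constant c leaves it unchanged. The sample and the constants below are hypothetical; a negative a is used to show that the absolute value matters.

```python
from statistics import stdev

# Hypothetical sample and constants, chosen only for illustration.
x = [4, 8, 6, 2]
a, c = -2.0, 3.0

sd_x = stdev(x)
sd_scaled = stdev([a * xi for xi in x])        # y = ax      -> |a| * sd_x
sd_shifted = stdev([c + xi for xi in x])       # y = c + x   -> sd_x
sd_linear = stdev([c + a * xi for xi in x])    # y = c + ax  -> |a| * sd_x

print(sd_x, sd_scaled, sd_shifted, sd_linear)
```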
Standard Deviation for grouped data

If discrete values are repeated, calculate the standard deviation from the distinct values and their frequencies, in the same way as the mean is calculated for grouped data
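As an illustration, the dental clinic data from the Mode slide can be summarised as a frequency table (value: number of occurrences) and the standard deviation computed from it. The sketch assumes the sample form with an n - 1 denominator.

```python
import math

# value -> frequency, tallied from the 25 dental-clinic measurements
frequencies = {1: 2, 2: 1, 3: 3, 4: 5, 5: 7, 6: 1, 7: 3, 8: 2, 9: 1}

n = sum(frequencies.values())
grouped_mean = sum(v * f for v, f in frequencies.items()) / n
# Each squared deviation is weighted by how often the value occurs.
sum_sq = sum(f * (v - grouped_mean) ** 2 for v, f in frequencies.items())
grouped_sd = math.sqrt(sum_sq / (n - 1))

print(n, round(grouped_mean, 2), round(grouped_sd, 2))
```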
Standard Deviation: Practical Significance

Chebyshev’s Theorem:

Given a number k (greater than 1) and a set of N measurements, at least $1 - 1/k^2$ of the measurements will lie within k standard deviations of their mean

Example: n = 26, Mean = 75, Variance = 100. Comment on the distribution.
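One way to comment on this example is to tabulate the Chebyshev bound for a few values of k. The sketch below is an illustration only; the function name and the choice of k = 2 and k = 3 are assumptions.

```python
import math

n, mean_value, variance_value = 26, 75.0, 100.0
sd = math.sqrt(variance_value)   # 10.0

def chebyshev_bound(k):
    """Minimum fraction of measurements within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("the bound is informative only for k > 1")
    return 1 - 1 / k ** 2

for k in (2, 3):
    low, high = mean_value - k * sd, mean_value + k * sd
    fraction = chebyshev_bound(k)
    # Round up: a fractional measurement cannot occur.
    at_least = math.ceil(n * fraction)
    print(f"k={k}: at least {fraction:.3f} of the data "
          f"({at_least} of {n}) lies in [{low}, {high}]")
```

For k = 2 this gives at least 3/4 of the measurements between 55 and 95, and for k = 3 at least 8/9 between 45 and 105.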
Testing Chebyshev’s Theorem

26.1 26 14.5 29.3 19.7


22.1 21.2 26.6 31.9 25
15.9 20.8 20.2 17.8 13.3

Mean = 22
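The theorem can be tested on these 15 measurements by counting how many fall within k standard deviations of the mean. This sketch assumes the sample standard deviation (n - 1 denominator); the helper name is illustrative.

```python
from statistics import mean, stdev

data = [26.1, 26, 14.5, 29.3, 19.7,
        22.1, 21.2, 26.6, 31.9, 25,
        15.9, 20.8, 20.2, 17.8, 13.3]

m = mean(data)
s = stdev(data)

def fraction_within(values, center, spread, k):
    """Fraction of values lying within k * spread of center."""
    inside = [v for v in values if abs(v - center) <= k * spread]
    return len(inside) / len(values)

for k in (2, 3):
    observed = fraction_within(data, m, s, k)
    bound = 1 - 1 / k ** 2
    print(f"k={k}: observed {observed:.2f}, Chebyshev bound {bound:.2f}")
```

The observed fractions should never fall below the bound; Chebyshev's theorem is conservative, so they are typically well above it.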
Standard Deviation of Normal Distributions

+/- 1 s.d.: 68%


+/- 2 s.d.: 95%
+/- 3 s.d.: 99.7%
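These percentages follow from the normal distribution itself: the probability of falling within k standard deviations of the mean is erf(k / sqrt(2)). A minimal sketch (the function name is illustrative):

```python
import math

def normal_coverage(k):
    """Probability that a normal variable lies within k s.d. of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"+/- {k} s.d.: {100 * normal_coverage(k):.1f}%")
```

This prints 68.3%, 95.4%, and 99.7%, which are usually rounded to the 68%, 95%, and 99.7% quoted above.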
[Figure: relative frequency versus height]
Checking Standard Deviation Calculation

Range approximately equal to 4 times the standard deviation

Example: 5, 7, 1, 3, 4
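For this example the rough estimate (range divided by 4) can be compared with the computed value. The sketch assumes the sample standard deviation; with only five measurements the rule is a coarse sanity check rather than a precise estimate.

```python
from statistics import stdev

data = [5, 7, 1, 3, 4]

data_range = max(data) - min(data)   # 6
rough_sd = data_range / 4            # 1.5, the quick estimate
computed_sd = stdev(data)            # sqrt(5), about 2.24

print(rough_sd, round(computed_sd, 2))
```

The two values are of the same order of magnitude, which is all the check is meant to confirm; a computed value many times larger or smaller than the estimate would suggest an arithmetic error.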
Standard Error of the Mean (SEM)
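A minimal sketch, assuming the usual definition SEM = s / sqrt(n), where s is the sample standard deviation and n is the number of measurements. The data are the five values from the preceding example.

```python
import math
from statistics import stdev

data = [5, 7, 1, 3, 4]

s = stdev(data)                    # sample standard deviation
sem = s / math.sqrt(len(data))     # assumed definition: s / sqrt(n)

print(round(s, 3), round(sem, 3))
```

The SEM describes the uncertainty of the sample mean, so it shrinks as n grows even when the standard deviation of the data stays the same.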
Relative Standing: z-score

Z-score = (x-Mean)/Standard Deviation

Example: Mean = 25, Standard Deviation = 4; x = 30

A measurement with a z-score greater than 3 in absolute value is treated as an outlier

Example: 1, 1, 0, 15, 2, 3, 4, 0, 1, 3
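Both examples can be worked through in code. The sketch assumes the sample standard deviation (n - 1 denominator) for the dataset; the function name is illustrative.

```python
from statistics import mean, stdev

def z_score(x, center, spread):
    """Number of standard deviations by which x lies above the mean."""
    return (x - center) / spread

# First example: Mean = 25, Standard Deviation = 4, x = 30
z_single = z_score(30, 25, 4)
print(z_single)  # 1.25

# Second example: z-score of every measurement in the dataset
data = [1, 1, 0, 15, 2, 3, 4, 0, 1, 3]
m, s = mean(data), stdev(data)
z_values = [z_score(v, m, s) for v in data]
flagged = [v for v, z in zip(data, z_values) if abs(z) > 3]

print(round(max(z_values), 2), flagged)
```

Under this assumption the value 15 has a z-score of about 2.7: it is unusual, but it does not cross the cutoff of 3, partly because the extreme value itself inflates the standard deviation.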
Relative Standing: Percentiles
The p-th percentile is the value that is greater than p% of the measurements

Q1: first quartile (at position 0.25*(n+1))

Q3: third quartile (at position 0.75*(n+1))

Inter-quartile Range = IQR = Q3 – Q1


Example: Calculating Quartiles

16, 25, 4, 18, 11, 13, 20, 8, 11, 9
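The quartile positions can be applied to this dataset as follows. In this sketch a fractional position is resolved by linear interpolation between the two neighbouring ordered values, which is an assumption about how the positions are meant to be read; the function name is illustrative.

```python
def value_at_position(ordered, position):
    """Value at a 1-based, possibly fractional, position in ordered data."""
    n = len(ordered)
    lower = int(position)           # rank just below the position
    fraction = position - lower     # how far towards the next rank
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])

data = sorted([16, 25, 4, 18, 11, 13, 20, 8, 11, 9])
n = len(data)

q1 = value_at_position(data, 0.25 * (n + 1))   # position 2.75
q3 = value_at_position(data, 0.75 * (n + 1))   # position 8.25
iqr = q3 - q1

print(q1, q3, iqr)  # 8.75 18.5 9.75
```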


Example: Calculating Quartiles

The following data give noise levels measured at 36 different times directly outside of Mumbai CST.

82, 89, 94, 110, 74, 122, 112, 95, 100, 78, 65, 60, 90,
83, 87, 75, 114, 85, 69, 94, 124, 115, 107, 88, 97, 74, 72,
68, 83, 91, 90, 102, 77, 125, 108, 65

Determine the quartiles.
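A possible solution using the standard library: statistics.quantiles with its default "exclusive" method places the quartiles at positions 0.25*(n+1), 0.5*(n+1), and 0.75*(n+1) and interpolates linearly between neighbouring values. That interpolation is an assumption about how fractional positions should be handled.

```python
from statistics import quantiles

noise = [82, 89, 94, 110, 74, 122, 112, 95, 100, 78, 65, 60, 90,
         83, 87, 75, 114, 85, 69, 94, 124, 115, 107, 88, 97, 74, 72,
         68, 83, 91, 90, 102, 77, 125, 108, 65]

# n=4 splits the data into quartiles; the default method is "exclusive".
q1, q2, q3 = quantiles(noise, n=4)
iqr = q3 - q1

print(q1, q2, q3, iqr)  # 75.5 89.5 105.75 30.25
```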


Box-plot

Min, Q1, Median, Q3, Max


[Figure: box plot; axis label: Area (sq. microns)]
Detecting Outliers

Lower Fence = Q1 – 1.5 IQR


Upper Fence = Q3 + 1.5 IQR
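The fences can be applied to the ten values from the z-score example (1, 1, 0, 15, 2, 3, 4, 0, 1, 3). This sketch assumes quartiles at positions 0.25*(n+1) and 0.75*(n+1) with linear interpolation, which is what statistics.quantiles computes by default.

```python
from statistics import quantiles

data = [1, 1, 0, 15, 2, 3, 4, 0, 1, 3]

q1, _, q3 = quantiles(data, n=4)   # 0.75 and 3.25 for this dataset
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
# Any measurement outside the fences is flagged as an outlier.
outliers = [v for v in data if v < lower_fence or v > upper_fence]

print(lower_fence, upper_fence, outliers)  # -3.0 7.0 [15]
```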
Moments

Given a set of observations $y_i$ ($i = 1, \dots, n$) of a variable Y, the rth sample moment about zero is defined as:

$$m'_r = \frac{1}{n}\sum_{i=1}^{n} y_i^r$$
Moments

The rth sample moment about the mean is defined as:

$$m_r = \frac{1}{n}\sum_{i=1}^{n} (y_i - \bar{y})^r$$


Skewness
Kurtosis
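Moments, skewness, and kurtosis can be computed in a few lines. The sketch below assumes moments with a 1/n denominator and the moment-ratio forms skewness = m3 / m2^1.5 and kurtosis = m4 / m2^2, where m_r is the rth moment about the mean. Several conventions exist, so these specific forms are an assumption; the function names are illustrative.

```python
def moment_about_zero(values, r):
    """r-th sample moment about zero: average of y_i ** r."""
    return sum(v ** r for v in values) / len(values)

def moment_about_mean(values, r):
    """r-th sample moment about the mean: average of (y_i - mean) ** r."""
    center = moment_about_zero(values, 1)   # first moment about zero = mean
    return sum((v - center) ** r for v in values) / len(values)

def skewness(values):
    # Assumed form: third central moment scaled by m2 ** 1.5.
    return moment_about_mean(values, 3) / moment_about_mean(values, 2) ** 1.5

def kurtosis(values):
    # Assumed form: fourth central moment scaled by m2 ** 2 (normal -> 3).
    return moment_about_mean(values, 4) / moment_about_mean(values, 2) ** 2

data = [5, 7, 1, 3, 4]
print(moment_about_zero(data, 1), moment_about_mean(data, 2))
print(skewness(data), kurtosis(data))
```

For this symmetric dataset the skewness is zero; a positive value would indicate a longer right tail and a negative value a longer left tail.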
