Ug Statistic Notes

The document provides a comprehensive overview of statistical formulas and concepts related to measures of central tendency, variability, and inferential statistics. It includes definitions and calculations for mean, median, mode, standard deviation, confidence intervals, and various statistical tests such as the chi-square test. Additionally, it covers vital statistics calculations and case-control study methodologies, along with examples and exercises for practical application.

Uploaded by

jxmcsksgbc

STATISTICS FORMULAE
Measures of central tendency (mean, median, mode)
a. Mean = ∑x / n (sum of observations divided by no. of observations)
b. Median = middle value = value of the middle observation when the observations are arranged in order

c. Mode is the most frequently occurring observation, e.g. 2, 4, 5, 7, 7, 8, 10, 7, 9, 7, 18, 20, ……… Here the mode is 7 (the most frequent item or most fashionable value).

Mode = 3 median - 2 mean


Mean - Mode = 3 (Mean – median)
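
The three measures above can be checked with a minimal Python sketch using the standard `statistics` module. It assumes the data set is only the twelve values printed in the mode example (the source list trails off, so stopping there is an assumption). The last line evaluates the empirical relation Mode = 3 Median - 2 Mean, which is only an approximation and need not equal the observed mode.

```python
import statistics

# Assumption: only the twelve values printed in the mode example are used.
observations = [2, 4, 5, 7, 7, 8, 10, 7, 9, 7, 18, 20]

mean = statistics.mean(observations)      # sum of observations / no. of observations
median = statistics.median(observations)  # middle value of the ordered observations
mode = statistics.mode(observations)      # most frequently occurring observation

# Empirical relation from the notes: Mode = 3 Median - 2 Mean (approximate only).
empirical_mode = 3 * median - 2 * mean

print(mean, median, mode, empirical_mode)
```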

______________________________________________________________

1. 95% confidence interval (CI) = mean ± 2 SD for a normal distribution, i.e. about 95% of observations lie within 2 SD of the mean

2. Range = highest value – lowest value


3. Mean deviation = ∑ |x – x̄| / n

4. Standard deviation (SD) = √[∑ (x – x̄)² / n] for sample size (n) > 30

5. Standard deviation (SD) = √[∑ (x – x̄)² / (n – 1)] for sample size (n) < 30

6. Standard error of mean = σ/√n, where σ is the standard deviation

7. Standard error of proportion = √(pq/n), where p = proportion of males and q = proportion of females (q = 1 – p)

8. Coefficient of variation = (SD / Mean) × 100 = (σ / x̄) × 100, for comparing variability between 2 groups

It is a relative measure of variation. It is also used for comparing variability between 2 or more samples with different magnitudes and between 2 or more variables in the same sample.

9. Relative deviate (Z) = (x – x̄) / σ = (Individual observation – Mean) / S.D.

Also known as the relative variate or standard normal variate: it is the deviation of an individual observation from the mean x̄ in a normal curve, measured in terms of standard deviation. It indicates how much higher or lower an observation is than the mean, expressed in units of S.D.

10. Standard error of difference between 2 proportions = √(p1q1/n1 + p2q2/n2)

11. Standard error of difference between 2 means = √(σ1²/n1 + σ2²/n2)

12. Chi square test (χ²): χ² = ∑ [(O – E)² / E], where O = observed value and E = expected value

E = (Row total × Column total) / N, calculated for each cell

The chi square test is used to test the significance of the difference between 2 or more groups (proportions) when the data are expressed as frequencies or counts, such as the no. of responses (e.g. patients) in 2 or more categories.

Degree of freedom (d.f.) = (c – 1)(r – 1), where c = no. of columns and r = no. of rows

For 2 × 2 contingency table, d.f. = 1
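
The chi square calculation above can be sketched in a few lines of Python. The function name `chi_square` and the 2 × 2 counts are hypothetical and chosen only for illustration; the expected value of each cell is computed as row total × column total / N, as in the notes.

```python
def chi_square(observed):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # E = row total x column total / N, calculated for each cell
            e = row_totals[i] * col_totals[j] / grand_total
            statistic += (o - e) ** 2 / e

    df = (len(col_totals) - 1) * (len(row_totals) - 1)  # (c - 1)(r - 1)
    return statistic, df


# Hypothetical 2 x 2 table of counts (not taken from the notes).
table = [[30, 20],
         [10, 40]]
chi2, df = chi_square(table)
print(round(chi2, 2), df)
```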

Pie diagram: The frequency of each group is represented as a sector of a circle, and the angle of each sector denotes the frequency of that group.

Size of each angle (degrees) = (Class frequency / Total frequency) × 360°

In the above example, 30% = (30/100) × 360 = 108°; 23% = 82.8°; 15% = 54°; 7% = 25.2°; 4% = 14.4°; 2% = 7.2°
The angles of all the groups total 360°.
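
The angle formula can be checked with a short Python sketch. Two assumptions are made: the share paired with 82.8° is taken as 23, since 82.8/360 × 100 = 23, and a final share of 19 is added so that the classes total 100 (that remainder is not stated in the notes).

```python
def pie_angles(frequencies):
    """Angle in degrees for each class: class frequency / total frequency x 360."""
    total = sum(frequencies)
    return [f / total * 360 for f in frequencies]


# Shares from the example; 23 and the trailing 19 are assumptions (see above).
shares = [30, 23, 15, 7, 4, 2, 19]
angles = pie_angles(shares)
print([round(a, 1) for a in angles])
```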

Histogram:
It is a graphical representation of the frequency distribution of quantitative data. The groups (class intervals) of the variable are marked on the horizontal axis and the frequency, i.e. the no. of observations, on the vertical axis.

The height of each block gives the magnitude (frequency) of that group.

The total area covered by the histogram gives the total frequency. Hence, it is also called an area diagram.

Frequency polygon: It is derived from a histogram by joining the midpoints of the tops of the histogram blocks, and is used for presenting quantitative data. It gives an idea of the nature & shape of the data. While making a frequency polygon, do not draw the histogram on the same graph.

A frequency polygon is useful for comparing two or more groups in terms of frequency distribution.

Line chart:
It is a frequency polygon presenting variation by lines. It shows the trend of events occurring over a period of time, e.g. IMR, CBR, CDR. It brings out the factors that are important in epidemiological studies, as it gives an idea about time, place & person.

Scatter diagram:
It is used to assess the relationship between two continuous variables (such as height and weight).

One variable is plotted on the X-axis and the other on the Y-axis; each pair of readings is marked as a dot where the perpendiculars drawn from the 2 readings meet. A line is then drawn to show the nature of the correlation.

If the dots are nearer to the line → strong correlation

If the dots are away from the line → weak correlation

Strength of correlation → correlation coefficient (denoted by r)

Extent of correlation → –1 ≤ r ≤ +1, i.e. (–1 to +1)

When r = ±1 → perfect correlation
r = ±0.9 → strong correlation
r = ±0.5 to ±0.7 → moderate correlation
r = ±0.3 → weak correlation
r = 0 → no correlation
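
As an illustration of how a correlation coefficient can be computed, here is a minimal Python sketch. It assumes the Pearson product-moment coefficient, the usual choice for two continuous variables; the notes do not state which formula they intend. The function name `pearson_r` and the height and weight readings are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical heights (cm) and weights (kg), not taken from the notes.
heights = [150, 155, 160, 165, 170]
weights = [50, 54, 59, 63, 70]
r = pearson_r(heights, weights)
print(round(r, 3))
```

The result always lies between -1 and +1, so it can be read against the scale listed above.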

Screening: TP = true positive, FP = false positive, FN = false negative, TN = true negative

Test result    Disease present    Disease absent    Total
Positive       (a) TP             (b) FP            (a+b)
Negative       (c) FN             (d) TN            (c+d)
Total          (a+c)              (b+d)             (a+b+c+d)

• Sensitivity = a/(a+c) × 100

• Specificity = d/(b+d) × 100

• Positive predictive value = a/(a+b) × 100

• Negative predictive value = d/(c+d) × 100

• Percentage of false negatives = c/(a+c) × 100

• Percentage of false positives = b/(b+d) × 100

• Exercise: In a pilot study of 1600 subjects to identify breast cancer with the help of
breast carcinoma promoting factor, 880 individuals had a negative test result of
which did not have breast cancer. Calculate negative predictive value of the test.

Ans: 55 i.e. 880/1600 × 100
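
The screening measures listed above can be sketched as one small Python function. The function name `screening_measures` and the counts passed to it are hypothetical; they are not the exercise data.

```python
def screening_measures(a, b, c, d):
    """a = TP, b = FP, c = FN, d = TN; every result is a percentage."""
    return {
        "sensitivity": a / (a + c) * 100,
        "specificity": d / (b + d) * 100,
        "positive_predictive_value": a / (a + b) * 100,
        "negative_predictive_value": d / (c + d) * 100,
        "false_negative_pct": c / (a + c) * 100,
        "false_positive_pct": b / (b + d) * 100,
    }


# Hypothetical counts chosen only to exercise the formulas.
measures = screening_measures(a=90, b=30, c=10, d=170)
for name, value in measures.items():
    print(f"{name}: {value:.1f}")
```

Sensitivity and the percentage of false negatives share the denominator (a+c), so they always add up to 100; the same holds for specificity and the percentage of false positives.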



VITAL STATISTICS
• The census population of a city was 6,00,000. The following events occurred during the year 1991.

• Total live births (LBs) – 15,000; total deaths – 6,000; total maternal deaths – 60; infant deaths – 800; neonatal deaths – 720; still births (SBs) in 1991 – 140; early neonatal deaths – 480

• Calculate CBR, CDR, IMR, MMR, NMR, PNMR, Early NMR, Late NMR, PMR (MYP = mid-year population)

• CBR = total live births / MYP × 1000 = 15000/600000 × 1000 = 25 per 1000 population

• CDR = total deaths / MYP × 1000 = 6000/600000 × 1000 = 10 per 1000 population

• IMR = infant deaths / total LBs × 1000 = 800/15000 × 1000 = 53.3 per 1000 LBs

• MMR = maternal deaths / total LBs × 1000 = 60/15000 × 1000 = 4 per 1000 LBs

• NMR = neonatal deaths / total LBs × 1000 = 720/15000 × 1000 = 48 per 1000 LBs

• PNMR = (infant deaths – neonatal deaths) / total LBs × 1000 = (800 – 720)/15000 × 1000 = 5.3 per 1000 LBs

• Early NMR = early neonatal deaths / total LBs × 1000 = 480/15000 × 1000 = 32 per 1000 LBs

• Late NMR = (neonatal deaths – early neonatal deaths) / total LBs × 1000 = (720 – 480)/15000 × 1000 = 16 per 1000 LBs

• PMR = (SBs + early neonatal deaths) / total births (live + SB) × 1000 = (140 + 480)/(15000 + 140) × 1000 = 40.95 per 1000 total births
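
The worked example can be reproduced with a short Python sketch. The figures are those given in the example; as in the notes, the census population is used as the mid-year population. The helper `per_1000` is a hypothetical name introduced here.

```python
# Figures from the worked example (city, 1991).
mid_year_population = 600_000  # census population used as MYP, as in the notes
live_births = 15_000
total_deaths = 6_000
maternal_deaths = 60
infant_deaths = 800
neonatal_deaths = 720
early_neonatal_deaths = 480
still_births = 140


def per_1000(numerator, denominator):
    """Express a ratio per 1000 of the denominator."""
    return numerator / denominator * 1000


rates = {
    "CBR": per_1000(live_births, mid_year_population),
    "CDR": per_1000(total_deaths, mid_year_population),
    "IMR": per_1000(infant_deaths, live_births),
    "MMR": per_1000(maternal_deaths, live_births),
    "NMR": per_1000(neonatal_deaths, live_births),
    "PNMR": per_1000(infant_deaths - neonatal_deaths, live_births),
    "Early NMR": per_1000(early_neonatal_deaths, live_births),
    "Late NMR": per_1000(neonatal_deaths - early_neonatal_deaths, live_births),
    # Perinatal mortality uses total births (live + still) as the denominator.
    "PMR": per_1000(still_births + early_neonatal_deaths, live_births + still_births),
}

for name, value in rates.items():
    print(f"{name}: {value:.2f}")
```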

• CASE CONTROL STUDY

               Cases    Controls    Total
Smokers        (a)      (b)         (a+b)
Non-smokers    (c)      (d)         (c+d)
Total          (a+c)    (b+d)       (a+b+c+d)

• Calculate exposure rates in cases & controls. OR/RR

• Exposure rate in cases = (smokers among cases / total cases) × 100 = a/(a+c) × 100

• Exposure rate in controls = (smokers among controls / total controls) × 100 = b/(b+d) × 100

• OR = ad/bc

• Case fatality rate (CFR) = (No. of deaths due to a disease / Total no. of cases of that disease) × 100

Secondary attack rate (SAR) = (No. of exposed persons developing the disease within the range of the incubation period / Total no. of exposed/susceptible contacts) × 100
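
Returning to the case control table, the exposure rates and the odds ratio can be sketched in Python as follows. The function name `case_control_measures` and the counts are hypothetical and serve only to exercise the formulas.

```python
def case_control_measures(a, b, c, d):
    """a, c = smokers and non-smokers among cases; b, d = the same among controls."""
    return {
        "exposure_rate_cases": a / (a + c) * 100,     # a/(a+c) x 100
        "exposure_rate_controls": b / (b + d) * 100,  # b/(b+d) x 100
        "odds_ratio": (a * d) / (b * c),              # OR = ad/bc
    }


# Hypothetical counts (not taken from the notes).
result = case_control_measures(a=33, b=55, c=2, d=27)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

An odds ratio above 1 means the exposure is more common among cases than among controls.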

Q. In a population of 100 females the mean Hb concentration was 10 & the SD was 1. What is the standard error (SE)?

Ans: The SE is 0.1 (SD/√n = 1/√100)

Q. Mean of 25 variables is 2, SD is 2; SEM is?

Ans: 0.4 (SD/√n = 2/√25)

Q. Calculate the SE for a population of 25 persons with a history of fever of 8 days & with SD 2

Ans: 0.4 (SD/√n = 2/√25)
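
The standard error questions above can be checked with a short Python sketch. The second half illustrates the two SD denominators (n and n – 1) listed earlier in the notes, using a hypothetical sample that is not part of the source.

```python
import math
import statistics


def standard_error(sd, n):
    """Standard error of the mean: SD / sqrt(n)."""
    return sd / math.sqrt(n)


print(standard_error(1, 100))  # Hb question: SD 1, n = 100 -> 0.1
print(standard_error(2, 25))   # SD 2, n = 25 -> 0.4

# Hypothetical sample to compare the two SD formulas.
sample = [4, 8, 6, 5, 3, 7, 9, 6]
sd_n = statistics.pstdev(sample)         # divides by n
sd_n_minus_1 = statistics.stdev(sample)  # divides by n - 1
print(round(sd_n, 3), round(sd_n_minus_1, 3))
```

Dividing by n – 1 always gives the slightly larger value, and the gap shrinks as n grows.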

Q. In a chi square test with 1 degree of freedom and χ² = 6.7, the p value will be less than

Ans: 0.01 (6.7 exceeds 6.63, the critical value of χ² at p = 0.01 for 1 d.f.)
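
The p value for a chi square statistic with 1 degree of freedom can be computed directly with the standard library, as in this minimal sketch. It relies on the fact that a chi square variable with 1 d.f. is the square of a standard normal deviate; the function name `chi2_p_value_df1` is hypothetical, and the shortcut does not apply to other degrees of freedom.

```python
import math


def chi2_p_value_df1(chi2):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom."""
    # With 1 d.f., P(X > chi2) = P(|Z| > sqrt(chi2)) = erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


p = chi2_p_value_df1(6.7)
print(round(p, 4))
```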

Sample size = 4pq / L², where q = 1 – p (or 100 – p if p is a percentage) and L is the allowable error
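
A minimal Python sketch of the sample size formula follows. It assumes that p and L are both expressed in percent, that q = 100 – p, and that L is the allowable error; the function name `sample_size` and the example values are hypothetical.

```python
import math


def sample_size(p, allowable_error):
    """n = 4pq / L^2 with q = 100 - p; p and L are both in percent (assumption)."""
    q = 100 - p
    # Round up, since a sample cannot contain a fraction of a person.
    return math.ceil(4 * p * q / allowable_error ** 2)


# Hypothetical inputs: expected prevalence 10%, allowable error 2%.
n = sample_size(10, 2)
print(n)
```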
