6. Descriptive PPHD
▪ Types of data :
1. Qualitative data
2. Quantitative data
Some basic terminology:
1. Characteristics: qualities and measurements
(height, weight, income, blood pressure, etc.)
▪ Methods of presentation :
1. Tabulation method
2. Graphical method
Data
- Why is it essential to know the type of data (type of variable
and scale of measurement)?
Ex:
Grades of students: Excellent, Satisfactory, Unsatisfactory.
Grades of cancers: I, II, III, IV (Roman numerals)
Interval: An interval variable has, in addition to the properties
of the ordinal level of measurement, equal and fixed
distances (intervals) between values. The origin (zero point) is
arbitrary.
Mean = Σ X / n
= (83 + 75 + …… + 90) / 10
= 810 / 10
= 81
Advantages of a Mean
• It is easy to understand and simple to calculate.
• It is based on all the values.
• It is rigidly defined .
• It can be calculated even if some of the details
of the data are lacking, for example when only the
total and the number of observations are known.
• It is not based on the position in the series.
Disadvantages of a Mean
• It is affected by extreme values.
• It cannot be calculated for open-ended classes.
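The effect of an extreme value on the mean can be seen with a small sketch. The marks below are hypothetical, invented only for illustration, and the helper name `mean` is not from the slides.

```python
def mean(values):
    """Arithmetic mean: sum of the observations divided by their count."""
    return sum(values) / len(values)


# Hypothetical marks, invented for illustration only.
marks = [70, 72, 75, 78, 80]
marks_with_outlier = marks + [5]  # one extreme low value added

mean_without = mean(marks)
mean_with = mean(marks_with_outlier)

print(f"Mean without the extreme value: {mean_without:.2f}")
print(f"Mean with the extreme value:    {mean_with:.2f}")
```

A single extreme observation pulls the mean well below every one of the original marks, which is the disadvantage listed above.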
Observations :
N=9
2) Real variability:
Ex: The higher rate of coronary disease in bus drivers than in
conductors may be due to the strain or tension involved in driving.
3) Experimental variability:
It may be due to methods or materials, for example a defective
weighing machine.
It calls for trained interviewers and observers;
untrained staff may introduce variability or error.
• Mean (average) deviation
• Standard deviation (S.D.)
• Coefficient of variation: C.V. = (S.D. / Mean) × 100
• Correlation coefficient
• RANGE:
The interval from the lowest to the highest observation in the given
series of observations. It is a very poor measure of variation in a
sample because it depends only on the two extreme values.
• Ex:
1. Systolic blood pressure
100-140 mm Hg
2. Diastolic blood pressure
80-90 mm Hg
3. Fasting blood sugar
80-120 mg/dL
• INTERQUARTILE RANGE: Arrange the data in
ascending order and find the median. Then find the medians of the
lower half and the upper half (the first quartile Q1 and the third
quartile Q3). The interquartile range is Q3 − Q1.
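The range and the interquartile range can be sketched as follows. The readings are hypothetical, and the handling of an odd number of observations (dropping the median from both halves) is one common convention assumed here, not something stated in the slides.

```python
from statistics import median


def value_range(values):
    """Return the lowest and highest observations."""
    return min(values), max(values)


def interquartile_range(values):
    """Median-of-halves method: Q1 and Q3 are the medians of the
    lower and upper halves of the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Assumed convention: for odd n the middle value belongs to neither half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    q1, q3 = median(lower), median(upper)
    return q1, q3, q3 - q1


# Hypothetical systolic blood pressure readings (mm Hg).
readings = [110, 100, 140, 120, 130, 105, 125, 115]

low, high = value_range(readings)
q1, q3, iqr = interquartile_range(readings)
print(f"Range: {low}-{high}")
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
```

Other quartile conventions exist and can give slightly different values for small samples.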
Mean deviation (MD) = Σ |x − m| / n
Variance = Σ (x − m)² / (n − 1)
Standard deviation (SD) = √[ Σ (x − m)² / (n − 1) ]
Variance = (SD)²
where m is the mean and n is the number of observations.
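A minimal sketch of these three measures, using the (n − 1) denominator for the variance and SD. The data set and the function names are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
from math import isclose, sqrt


def mean_deviation(values):
    """Average of the absolute deviations from the mean."""
    m = sum(values) / len(values)
    return sum(abs(x - m) for x in values) / len(values)


def variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values) / (len(values) - 1)


def standard_deviation(values):
    """Square root of the variance."""
    return sqrt(variance(values))


# Hypothetical observations with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

md = mean_deviation(data)
var = variance(data)
sd = standard_deviation(data)

print(f"MD = {md:.3f}, variance = {var:.3f}, SD = {sd:.3f}")
print("Variance equals SD squared:", isclose(var, sd ** 2))
```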
Ex:
• Heights of all students of a class
• 100 students
• Mean height = 17,452/100 = 174.52 cm
• Dispersion:
• - Range is 160-182 cm
• - Standard deviation is 11.5 cm
SD = √[ Σ (xi − m)² / n ] = √[ ( Σ xi² − (Σ xi)² / n ) / n ]
where m is the mean, n is the number of observations, and the sums
run over i = 1 to n.
It is the square root of the average of the squared
deviations measured from the mean.
It measures the spread of the data about the mean. The S.D.
increases as the spread about the mean increases. An S.D. of
0 indicates that all the observations are identical.
Note: If n is less than 30, replace n in the denominator
by (n − 1).
Merits of S.D.
• It is based on all the observations.
• It is a better measure of dispersion than the range
and the mean deviation (M.D.).
• It is least affected by fluctuation of sampling.
• It is independent of change of origin.
• Ex. The following are the heights in cm of 5 students:
167, 170, 168, 175, 172. Find the SD.
m = 852/5 = 170.4 cm
Σ (x − m)² = 41.2
SD = √(41.2/4) = √10.3 ≈ 3.21 cm
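The worked example can be checked with a short sketch. It uses the five heights given above, computes the sum of squared deviations both from the definition and from the shortcut form Σx² − (Σx)²/n, and compares the result with `statistics.stdev`, which also uses the (n − 1) denominator. The variable names are illustrative.

```python
from math import sqrt
from statistics import stdev

heights = [167, 170, 168, 175, 172]  # heights in cm from the example
n = len(heights)
m = sum(heights) / n

# Sum of squared deviations, from the definition.
ss_definition = sum((x - m) ** 2 for x in heights)

# Same quantity from the shortcut form: sum of squares minus (sum)^2 / n.
ss_shortcut = sum(x * x for x in heights) - sum(heights) ** 2 / n

# n is less than 30, so the denominator is n - 1.
sd = sqrt(ss_definition / (n - 1))

print(f"Mean = {m:.1f} cm")
print(f"Sum of squared deviations = {ss_definition:.1f} (shortcut: {ss_shortcut:.1f})")
print(f"SD = {sd:.2f} cm, statistics.stdev gives {stdev(heights):.2f} cm")
```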
Applications of S.D.
• It is the most commonly used measure of variation.
• It is also used to calculate the variance and the C.V.
• The higher the SD, the higher the variation, provided the units of
measurement are the same.
• Presuming that the data show a normal (Gaussian)
distribution, we can make certain predictions about the
distribution of values within the sample. The approximate
predictions are:
68% of values will be within mean ± 1 SD
95% of values will be within mean ± 2 SD
99.7% of values will be within mean ± 3 SD
Such a range is called the CONFIDENCE INTERVAL for the
sample.
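For an exactly normal distribution, the share of values within k SD of the mean can be computed from the standard normal curve. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `coverage` is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def coverage(k):
    """Proportion of a normal distribution lying within mean ± k SD."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


coverages = {k: coverage(k) for k in (1, 2, 3)}

for k, share in coverages.items():
    print(f"mean ± {k} SD covers {share * 100:.1f}% of values")
```

The shares come out at roughly 68%, 95% and 99.7%. Real samples follow these figures only approximately, and only when the data are close to normal.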
• To determine the precision, consistency or reliability of an
instrument. The reliability of the instrument can be determined
by calculating the S.D. of repeated measurements on the same
subject by the same instrument.
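As a sketch of this use, the S.D. of repeated readings from two instruments can be compared. Both sets of readings are hypothetical, invented for illustration, and `statistics.stdev` uses the (n − 1) denominator.

```python
from statistics import stdev

# Hypothetical repeated weighings (kg) of the same subject on two machines.
instrument_a = [70.1, 70.0, 69.9, 70.2, 69.8]
instrument_b = [70.5, 69.2, 71.0, 68.9, 70.4]

sd_a = stdev(instrument_a)
sd_b = stdev(instrument_b)

# The smaller S.D. indicates the more consistent instrument.
more_consistent = "A" if sd_a < sd_b else "B"

print(f"S.D. of instrument A: {sd_a:.3f} kg")
print(f"S.D. of instrument B: {sd_b:.3f} kg")
print(f"More consistent instrument: {more_consistent}")
```

A small S.D. shows that the readings agree with each other. It does not by itself show that they are close to the true value.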
• C.V. = (S.D. / Mean) × 100
The following table shows the mean and SD of height and weight for 10 subjects.
Variable       Mean     SD
Height (cm)    175.5    7.29
Weight (kg)    72.4     12.27
Which variable shows more variation, weight or height?
C.V. (height) = (7.29/175.5) × 100 = 4.15%
C.V. (weight) = (12.27/72.4) × 100 = 16.95%
Here weight is more variable than height.
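The comparison can be reproduced with a short sketch using the means and SDs from the table. The function name `coefficient_of_variation` is illustrative.

```python
def coefficient_of_variation(sd, mean):
    """C.V. as a percentage: (S.D. / mean) x 100."""
    return sd / mean * 100


# Values taken from the table above.
cv_height = coefficient_of_variation(sd=7.29, mean=175.5)
cv_weight = coefficient_of_variation(sd=12.27, mean=72.4)

print(f"C.V. (height) = {cv_height:.2f}%")
print(f"C.V. (weight) = {cv_weight:.2f}%")
print("More variable:", "weight" if cv_weight > cv_height else "height")
```

Because the C.V. is unit-free, it allows variables measured in different units, such as cm and kg, to be compared directly.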
Summarizing Data by Graphical Method & Interpretation by Data Presentation
Frequency distribution table
Table No. 3
Simple bar diagram
Multiple bar diagram
Histogram
• Interval estimation can be contrasted with point
estimation.
• A point estimate is a single value given as the estimate of
a population parameter that is of interest, for example, the
mean of some quantity.
• An interval estimate specifies instead a range within
which the parameter is estimated to lie.
• Confidence intervals are commonly reported in tables or
graphs along with point estimates of the same
parameters, to show the reliability of the estimates.
• The range within which the expected or predicted value falls
is called the ‘precision’ of the prediction, and the chance of the
predicted value falling in that range is called the ‘reliability’
of the prediction.
• The reliability is expressed as the confidence level, and
its complement is the significance level (for example, a 95%
confidence level corresponds to a 5% significance level).
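The difference between a point estimate and an interval estimate can be sketched with the height figures quoted earlier (mean 175.5 cm, SD 7.29 cm, 10 subjects). The slides do not give an interval formula, so the one used here is an assumption: mean ± z × SD/√n with a normal multiplier z. For a sample as small as 10, a t multiplier would normally be preferred, so the interval below is illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def normal_interval(mean, sd, n, confidence=0.95):
    """Interval estimate for a mean, assuming a normal approximation.

    The significance level is the complement of the confidence level.
    """
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for 95%
    margin = z * sd / sqrt(n)                # z times the standard error
    return mean - margin, mean + margin


point_estimate = 175.5  # mean height in cm
lower, upper = normal_interval(mean=point_estimate, sd=7.29, n=10)

print(f"Point estimate: {point_estimate} cm")
print(f"95% interval estimate: {lower:.2f} to {upper:.2f} cm")
```

Raising the confidence level widens the interval: more reliability is bought at the cost of less precision.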