
MATM 111 Data Management Dispersion and Normal Curve

The document provides an overview of statistical measures, focusing on quantiles, measures of dispersion, variance, and standard deviation. It explains how to calculate percentiles, quartiles, and deciles, as well as various measures of dispersion including range, mean absolute deviation, and standard deviation. Additionally, it discusses the normal distribution and its characteristics, emphasizing the importance of standard deviation in assessing data consistency.

Uploaded by

Cielo Gatdula
Copyright
© © All Rights Reserved

MATM 111 – Mathematics in the Modern World

Math as a Tool: Data Management

QUANTILE (OTHER LOCATION IN THE DISTRIBUTION)


- Divides a given set of data into k equal parts.
Types of Quantile
A. PERCENTILE (P)
- A percentile is a measure that tells us what percent of the total frequency scored at or below that measure.
- Divides the distribution into 100 equal parts (P1, P2, P3, …, P100).
- To find the kth percentile, follow these steps (for ungrouped data):
1. Arrange the data in ascending order.
2. Divide k by 100 and multiply by the sample size:

$P_k = \frac{k}{100}\, n$

3. If the computed value is not an integer, round up and take the data value at that position.

B. QUARTILE (Q)
- A measure that divides a set of data or distribution into 4 equal parts (Q1, Q2, Q3, and Q4).
- To find the kth quartile, follow these steps (for ungrouped data):
1. Arrange the data in ascending order.
2. Divide k by 4 and multiply by the sample size:

$Q_k = \frac{k}{4}\, n$

3. If the computed value is not an integer, round up and take the data value at that position.
C. DECILE (D)
- A measure that divides a set of data or distribution into 10 equal parts (D1, D2, D3, …, D10).
- To find the kth decile, follow these steps (for ungrouped data):
1. Arrange the data in ascending order.
2. Divide k by 10 and multiply by the sample size:

$D_k = \frac{k}{10}\, n$

3. If the computed value is not an integer, round up and take the data value at that position.

Example 1: Given the following scores:
Scores: 11, 15, 16, 12, 9, 20, 18, 19, 14, 13, 20, 11, 19, 21, 25, 15, 17, 11

Find: a.) P15, b.) Q1, and c.) D7

Solution:
Arrange the data from lowest to highest; n = 18
9, 11, 11, 11, 12, 13, 14, 15, 15, 16, 17, 18, 19, 19, 20, 20, 21, 25

a) P15 = 11

$P_k = \frac{k}{100}\, n = \frac{15(18)}{100} = 2.7 \approx$ 3rd data; P15 = 11

b) Q1 = 12

$Q_k = \frac{k}{4}\, n = \frac{1(18)}{4} = 4.5 \approx$ 5th data; Q1 = 12

c) D7 = 19

$D_k = \frac{k}{10}\, n = \frac{7(18)}{10} = 12.6 \approx$ 13th data; D7 = 19
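
The three-step rule above can be sketched in Python. This is a minimal illustration, not part of the original handout: the function name `quantile_ungrouped` and its parameters are assumptions, and it assumes 1 ≤ k ≤ parts. When the computed position is already a whole number, the sketch simply takes the value at that position.

```python
def quantile_ungrouped(data, k, parts):
    """Return the k-th quantile when the data are divided into `parts` parts.

    Follows the handout's rule: position = (k / parts) * n, rounded up
    when it is not a whole number. Assumes 1 <= k <= parts.
    """
    ordered = sorted(data)                      # step 1: ascending order
    n = len(ordered)
    position = (k * n + parts - 1) // parts     # steps 2 and 3: ceiling of k*n/parts
    return ordered[position - 1]                # positions are counted from 1


scores = [11, 15, 16, 12, 9, 20, 18, 19, 14, 13, 20, 11, 19, 21, 25, 15, 17, 11]

p15 = quantile_ungrouped(scores, 15, 100)   # 15th percentile
q1 = quantile_ungrouped(scores, 1, 4)       # 1st quartile
d7 = quantile_ungrouped(scores, 7, 10)      # 7th decile
print(p15, q1, d7)                          # prints: 11 12 19
```

Using integer arithmetic for the rounding step avoids floating-point surprises when k·n/parts is very close to a whole number.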

For Grouped Data

Formula:

$P_k = L_b + \left( \frac{nk/100 - \mathit{fl}}{f} \right) i$

$Q_k = L_b + \left( \frac{nk/4 - \mathit{fl}}{f} \right) i$

$D_k = L_b + \left( \frac{nk/10 - \mathit{fl}}{f} \right) i$

MEASURE OF DISPERSION

- A measure of dispersion shows the extent to which numerical values tend to spread out around the average. It is also used to determine the consistency or homogeneity of a set of data.
- Also known as a measure of variation or measure of variability.
- Supplements the measures of central tendency in the analysis of data.
- A suitable measure should be large when the values vary over a wide range and small when the range of variation is not too great.

MEASURES OF DISPERSION

1.) Range (R)

- Quick to compute, but a poor measure of variation because it considers only the extreme values.
- It tells nothing about the distribution of the numbers in between.
2.) Mean Absolute Deviation (MAD)
- A modification of the range, and more reliable than the range.
3.) QUANTILE DEVIATION

Another way of measuring the variability of a set of observations is through the quantile deviations: the percentile deviation, decile deviation, interquartile range, and quartile deviation.

a. Percentile Deviation (P.D.)

The percentile deviation describes the variation of the middle 80% of the data set:

P.D. = P90 − P10

b. Decile Deviation (D.D.)

Like the percentile deviation, the decile deviation describes the variation of the middle 80% of the data set:

D.D. = D9 − D1

[Figure: scale from 0 to 10 with the middle 80% marked]

c. Interquartile Range (I.R.)

A measure of variation based on the quartiles of a distribution. It describes the variation of the middle 50% of the data set.

I.R. = Q3 − Q1

d. Quartile Deviation (Q.D.)

$Q.D. = \frac{Q_3 - Q_1}{2}$

For Ungrouped Data:

Example: Given the following data:

18, 21, 23, 24, 26, 26, 27, 28, 30,
31, 33, 34, 35, 36, 37, 40, 44, 50

a.) To find the percentile deviation:

$P_{10} = \frac{nk}{100} = \frac{(18)(10)}{100} = 1.8$, or the 2nd score, which is 21

$P_{90} = \frac{nk}{100} = \frac{(18)(90)}{100} = 16.2$, or the 17th score, which is 44

*Percentile deviation = P90 − P10

= 44 − 21

= 23

b.) To find the decile deviation:

$D_1 = \frac{nk}{10} = \frac{(18)(1)}{10} = 1.8$, or the 2nd score, which is 21

$D_9 = \frac{nk}{10} = \frac{(18)(9)}{10} = 16.2$, or the 17th score, which is 44

*Decile deviation = D9 − D1

= 44 − 21

= 23

c.) To find the interquartile range:

$Q_3 = \frac{nk}{4} = \frac{(18)(3)}{4} = 13.5$, or the 14th score, which is 36

$Q_1 = \frac{nk}{4} = \frac{(18)(1)}{4} = 4.5$, or the 5th score, which is 26

*Interquartile range = Q3 − Q1

= 36 − 26

= 10

d.) To find the quartile deviation:

*Quartile deviation = (Q3 − Q1) / 2

= (36 − 26) / 2 = 5
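
The four quantile deviations in this example can be checked with a short Python sketch. The helper `quantile_ungrouped` is a hypothetical name introduced here for illustration; it applies the handout's round-up rule for ungrouped data and assumes 1 ≤ k ≤ parts.

```python
def quantile_ungrouped(data, k, parts):
    """k-th quantile of ungrouped data using the round-up position rule."""
    ordered = sorted(data)
    position = (k * len(ordered) + parts - 1) // parts  # ceiling of k*n/parts
    return ordered[position - 1]                        # 1-based position


data = [18, 21, 23, 24, 26, 26, 27, 28, 30,
        31, 33, 34, 35, 36, 37, 40, 44, 50]

# Middle 80% of the data, measured two ways
percentile_deviation = quantile_ungrouped(data, 90, 100) - quantile_ungrouped(data, 10, 100)
decile_deviation = quantile_ungrouped(data, 9, 10) - quantile_ungrouped(data, 1, 10)

# Middle 50% of the data
interquartile_range = quantile_ungrouped(data, 3, 4) - quantile_ungrouped(data, 1, 4)
quartile_deviation = interquartile_range / 2

print(percentile_deviation, decile_deviation, interquartile_range, quartile_deviation)
# prints: 23 23 10 5.0
```

The percentile and decile deviations agree because P10 = D1 and P90 = D9 by definition.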

4.) VARIANCE (Var)

The variance, or mean square deviation, is the expected value of the squared deviation of an observation from its theoretical mean.

5.) STANDARD DEVIATION

This value is obtained by taking the square root of the calculated variance, and it is considered the most reliable measure of dispersion.

The variance and standard deviation are based on all items in the data set, and each item is given a proper weight. These two are very useful measures of variability because they measure the average scattering of the data around the mean. The variance and the standard deviation increase as the deviations about the mean increase, and decrease as these deviations decrease. A small standard deviation (and variance) means a high degree of uniformity in the observations and homogeneity in the distribution. The variance is the most suitable for algebraic manipulation, but its computed value is in squared units. On the other hand, the standard deviation has a value in the original units of the data. Thus, it serves as the primary measure of dispersion, just as the mean serves as the primary measure of central tendency.

The standard deviation, however, has its own limitations. It gives more weight to extreme values and less to those near the mean, and its computation is not as easy as that of the range. This measure is not appropriate when comparing two or more data sets in different units or at different levels.

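Formula:

Range (R)
Ungrouped: R = Hs - Ls
Grouped: R = UBLHCL - LBLLCI

Variance (Var)

a. For Population Variance ($\sigma^2$)
Ungrouped: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
Grouped: $\sigma^2 = \frac{\sum f(x - \mu)^2}{N}$

b. For Sample Variance ($s^2$, $sd^2$)
Ungrouped: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
Grouped: $s^2 = \frac{\sum f(x - \bar{x})^2}{n - 1}$

Standard Deviation (s, sd)

$sd = \sqrt{Var}$

a. For Population Standard Deviation ($\sigma$): $\sigma = \sqrt{\sigma^2}$
b. For Sample Standard Deviation (s, sd): $s = \sqrt{s^2}$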

Note: Notice that the sample variance and the population variance have different denominators. The sample variance uses n − 1 in the denominator instead of n. Statistical theory shows that if many samples are drawn from a given population, the sample variance is computed for each sample, and these variances are averaged, then this average will equal the population variance when n − 1 is used in the denominator.
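
The note can be checked exactly on a tiny case. The sketch below is an illustration under stated assumptions, not part of the original handout: the population {1, 2, 3} is a made-up toy population, samples of size 2 are drawn with replacement, and every possible ordered sample is listed so that no random simulation is needed.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3]  # hypothetical toy population


def mean(values):
    """Exact arithmetic mean as a Fraction."""
    return Fraction(sum(values), len(values))


def variance(values, denominator):
    """Sum of squared deviations from the mean, divided by `denominator`."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / denominator


# Population variance divides by N
population_variance = variance(population, len(population))

# Every ordered sample of size 2, drawn with replacement (9 samples)
samples = list(product(population, repeat=2))

# Average of the sample variances using n - 1, then using n
avg_with_n_minus_1 = mean([variance(s, len(s) - 1) for s in samples])
avg_with_n = mean([variance(s, len(s)) for s in samples])

print(population_variance, avg_with_n_minus_1, avg_with_n)
# prints: 2/3 2/3 1/3
```

With n − 1 the average of the sample variances matches the population variance (2/3), while dividing by n gives an average that is too small (1/3).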

INTERPRETATION OF THE STANDARD DEVIATION

The accuracy and position of a score in the frequency distribution relative to the mean can be determined using Chebyshev's Theorem.

The proportion or percentage of any data set that lies within k standard deviations of the mean is at least $1 - \frac{1}{k^2}$.

Note: k is any positive number greater than 1. The theorem applies to any distribution.

Statements by Chebyshev about a distribution:

a. At least 75% of the observations are within 2 standard deviations of the mean.
b. At least 88.9% of the observations are within 3 standard deviations of the mean.

Example: The midterm exam scores of 50 BS PSY students last prelim had a mean of 55 and a standard deviation of 12 points.

Using Chebyshev's rules, describe the distribution.

a.) At least 75% of the students had scores between 31 and 79.
b.) At least 88.9% of the students had scores between 19 and 91.
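
The bounds and intervals in this example can be reproduced with a few lines of Python. The function names `chebyshev_bound` and `chebyshev_interval` are assumptions made for this sketch.

```python
def chebyshev_bound(k):
    """Minimum proportion of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2


def chebyshev_interval(mean, sd, k):
    """Interval from k standard deviations below to k above the mean."""
    return (mean - k * sd, mean + k * sd)


mean_score, sd_score = 55, 12  # values from the example

for k in (2, 3):
    low, high = chebyshev_interval(mean_score, sd_score, k)
    print(f"k = {k}: at least {chebyshev_bound(k):.1%} of scores lie between {low} and {high}")
```

For k = 2 the interval is 31 to 79, and for k = 3 it is 19 to 91, matching the statements above.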

For Ungrouped Data:

Example 1: Suppose you have sample scores of 70, 85, 80, 90, and 75. Solve for the range, variance, and standard deviation.

Solution:

a. Range

R = Hs − Ls = 90 − 70
R = 20

b. Variance

$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$

Solve for the mean first:

$\bar{x} = \frac{\sum x}{n} = \frac{400}{5} = 80$

Scores (x)    (x − x̄)           (x − x̄)²
70            70 − 80 = −10     100
75            75 − 80 = −5      25
80            80 − 80 = 0       0
85            85 − 80 = 5       25
90            90 − 80 = 10      100
Σx = 400                        Σ(x − x̄)² = 250

$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{250}{5 - 1} = \frac{250}{4}$

$s^2 = 62.5$

c. Standard Deviation

$s = \sqrt{s^2} = \sqrt{62.5}$

s ≈ 7.91

Analysis: The standard deviation of the distribution is approximately 8.
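
As a cross-check, the same three measures can be computed with Python's standard `statistics` module, whose `variance` and `stdev` functions use the sample (n − 1) denominator. This is a sketch for verification only; the variable names are chosen here for illustration.

```python
import statistics

scores = [70, 85, 80, 90, 75]

score_range = max(scores) - min(scores)        # R = Hs - Ls
sample_variance = statistics.variance(scores)  # divides by n - 1
sample_sd = statistics.stdev(scores)           # square root of the sample variance

print(score_range, sample_variance, round(sample_sd, 2))
# prints: 20 62.5 7.91
```

For population data, `statistics.pvariance` and `statistics.pstdev` divide by N instead.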

Example 2: Consider the following set of observations. Four students took exams in a certain university:

Student   Subject
          Eng   Fil   Sci   Math   Abstract
A         20    24    26    22     23
B         18    26    28    20     23
C         30    24    22    19     20
D         27    25    20    19     24

Determine which student has the best performance.

Student   Mean   R   Var   sd

Compute each individual's mean or average:

x̄A = __________   x̄B = __________
x̄C = __________   x̄D = __________

State your analysis:

The mean score of student A is 23, student B is 23, student C is 23, and student D is 23. This shows that all of them have the same average score.

State your interpretation:

Since all of them have the same average score, all of them have the same level of performance.

Compute each individual's standard deviation:

s²A = __________   sA = __________
s²B = __________   sB = __________
s²C = __________   sC = __________
s²D = __________   sD = __________

State your analysis:

The computed standard deviation of student A is 2.24, student B is 4.12, student C is 4.36, and student D is 3.39. This shows that student A has the lowest computed sd.

State your interpretation:

Since student A has the lowest computed sd among the four students, student A has the most consistent scores.

Therefore, student A has the best performance among the four.

Note: When comparing two or more sets of data using the standard deviation, the set with the lowest computed sd is the one with the most consistent or homogeneous elements.
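
The comparison in Example 2 can be reproduced with a short Python sketch using the standard `statistics` module. The dictionary layout and variable names are assumptions made for illustration; the scores are those in the table of Example 2, and the sample (n − 1) formulas are used.

```python
import statistics

# Scores per student in Eng, Fil, Sci, Math, Abstract
scores = {
    "A": [20, 24, 26, 22, 23],
    "B": [18, 26, 28, 20, 23],
    "C": [30, 24, 22, 19, 20],
    "D": [27, 25, 20, 19, 24],
}

summary = {}
for student, marks in scores.items():
    summary[student] = {
        "mean": statistics.mean(marks),
        "range": max(marks) - min(marks),
        "var": statistics.variance(marks),          # sample variance (n - 1)
        "sd": round(statistics.stdev(marks), 2),    # sample standard deviation
    }

# The lowest sd marks the most consistent set of scores
most_consistent = min(summary, key=lambda s: summary[s]["sd"])

for student, row in summary.items():
    print(student, row["mean"], row["range"], row["var"], row["sd"])
print("Most consistent:", most_consistent)
```

All four means come out to 23, so the means alone cannot separate the students; the standard deviations (2.24, 4.12, 4.36, 3.39) do.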

THE NORMAL CURVE

The most important probability distribution in the entire field of statistics is the normal distribution. Its graph is the bell-shaped curve called the normal curve.

[Figure: normal curve with the horizontal axis marked at -3s, -2s, -1s, 0, 1s, 2s, 3s]

The mathematical equation of the normal curve was derived by De Moivre in 1733. An extensive study was made by Gauss; thus, it is sometimes called the Gaussian distribution in his honor.

It is completely specified by two parameters: the mean, µ, and the standard deviation, σ.

Characteristics of the Normal Curve

1.) The tails of the curve are asymptotic to the horizontal axis.
2.) The curve is symmetric about the mean (x̄).
3.) The mean, median, and mode have the same value and are therefore located at the same point (center) along the horizontal axis.
4.) The area under the normal curve is 1 (100%).
5.) The curve is divided into ±3 standard deviations.

Note:

1.) The normal curve is a theoretical model, a kind of frequency polygon that is perfectly symmetrical and smooth.
2.) The area under the normal curve represents probability.

Example 1.) Suppose a quiz in a class of 50 students, consisting of 25 boys and 25 girls, has the following results:

Boys: x̄ = 78, sd = 3, n = 25
Girls: x̄ = 78, sd = 4, n = 25

[Figure: Quiz Results of Boys, curve with the horizontal axis marked at 69, 72, 75, 78, 81, 84, 87]
[Figure: Quiz Results of Girls, curve with the horizontal axis marked at 66, 70, 74, 78, 82, 86, 90]

Standard Score (Z-score)

To determine the proportion of the total area greater than, between, or less than an empirical value, we use a standardizing technique that transforms the original score into units of standard deviation. We convert original scores into standard scores, or z-scores. This means that the empirical distribution is standardized into the theoretical normal curve. The standardized theoretical normal curve has a mean of 0 (x̄ = 0) and a standard deviation of 1 (sd = 1).

Formula:

$Z = \frac{x - \bar{x}}{sd}$

Where:
x = empirical value
x̄ = mean
sd = standard deviation
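
A minimal Python sketch of the z-score formula and its inverse follows. The function names are assumptions made here, and the raw score of 84 is a hypothetical value chosen to illustrate the quiz example above (boys: mean 78, sd 3; girls: mean 78, sd 4).

```python
def z_score(x, mean, sd):
    """Convert an empirical value x into units of standard deviation."""
    return (x - mean) / sd


def raw_score(z, mean, sd):
    """Convert a z-score back into the original units."""
    return mean + z * sd


# The same hypothetical raw score of 84 in the two quiz distributions
z_boys = z_score(84, 78, 3)    # 2.0 standard deviations above the mean
z_girls = z_score(84, 78, 4)   # 1.5 standard deviations above the mean
print(z_boys, z_girls)         # prints: 2.0 1.5

# Going back from a z-score to the original scale
print(raw_score(2.0, 78, 3))   # prints: 84.0
```

The same raw score stands further from the mean, in relative terms, in the distribution with the smaller standard deviation.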

Application of the Normal Curve

1.) In a statistics examination, the mean grade is 78 and the standard deviation is 10. Find the following:
a.) The corresponding z-scores of two students whose grades are 93 and 62, respectively.
Solution:

b.) The grades of two students whose z-scores are 0.6 and 1.2, respectively.
Solution:

3.) Find the area under the normal curve which lies:

a. between z = -0.55 and z = 0.55


Solution:

b. between z = 2.25 and z = 2.65


Solution:

c. to the right of z = 2.20


Solution:

d. to the left of z = 1.35
Solution:

e. to the right of z = -1.40


Solution:

5.) If adult male cholesterol is normally distributed with µ = 200 and σ = 25, what is the probability of selecting a male whose cholesterol is:

f. less than 165


Solution:

g. greater than 165


Solution:

h. between 165 and 220


Solution:

i. greater than 220


Solution:

j. determine the normal lowest and highest adult cholesterol levels.


Solution:

7.) A group test was administered and had the following results:

Subject    x̄     sd    Zac's score
Physics    60    10    85
Mat 3      55    9     58
Eng 3      45    8     40

a.) What percentage of the students taking the test did Zac surpass in each subject?
b.) If there are 200 students who took the test, how many of them did he surpass in each subject?

Solutions:

Physics:

a.) Ans =_________


b.) Ans =_________

SKEWNESS

The shape of the graph of a distribution is another important property of the distribution. The data are either symmetrical (with a bell-shaped curve) or not. If the set of data is not symmetrical, it is called asymmetrical or skewed. A skewed distribution is one that has the largest number of values on either the right or the left.

Measure of Relative Skewness

$Sk = \frac{3(\bar{x} - Md)}{s}$

Where:
x̄ = the sample mean,
Md = the sample median,
s = the sample standard deviation.

Types of Skewness

a. Positively Skewed: Md < Mean (order along the horizontal axis: Mo, Md, x̄)

b. Negatively Skewed: Md > Mean (order along the horizontal axis: x̄, Md, Mo)

Note:

For a perfect normal distribution, Sk = 0. When Sk > 0, the distribution is positively skewed. When Sk < 0, the distribution is negatively skewed.
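
The skewness measure above, three times the difference between the mean and the median divided by the sample standard deviation, can be sketched in Python as follows. The function name and the two skewed data sets are hypothetical and chosen only to show the sign of the result; the symmetric set reuses the scores 70 to 90 from the earlier variance example.

```python
import statistics


def pearson_skewness(data):
    """Relative skewness: 3 * (mean - median) / sample standard deviation."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    s = statistics.stdev(data)      # sample standard deviation (n - 1)
    return 3 * (mean - median) / s


symmetric = [70, 75, 80, 85, 90]     # mean = median = 80
right_tailed = [1, 2, 2, 3, 12]      # hypothetical data, mean 4 > median 2
left_tailed = [1, 10, 11, 11, 12]    # hypothetical data, mean 9 < median 11

print(pearson_skewness(symmetric))         # 0.0, no skew
print(pearson_skewness(right_tailed) > 0)  # True, positively skewed
print(pearson_skewness(left_tailed) < 0)   # True, negatively skewed
```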

KURTOSIS

Another property that describes the shape of a frequency distribution curve is kurtosis. It describes the extent of peakedness or flatness of the distribution of the data. It can be measured by the coefficient of kurtosis (k). The kurtosis of a mesokurtic curve is 3; it is greater than 3 for a leptokurtic curve and less than 3 for a platykurtic curve.

Measure of Kurtosis

For Ungrouped Data:

$K = \frac{\sum_i (x_i - \bar{x})^4}{n\, s^4}$

Where
xi = value of the ith observation,
x̄ = the sample mean, and
n = the sample size.

For Grouped Data:

$k = \frac{\sum_i f_i (x_i - \bar{x})^4}{n\, s^4}$

Where
i = ith class interval,
xi = classmark of the ith class interval,
x̄ = sample mean,
n = sample size.
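
A minimal Python sketch of the ungrouped formula is given below. It rests on two assumptions not spelled out in the handout: the formula is read as the sum of fourth-power deviations divided by n times s⁴, and s is taken to be the sample standard deviation (n − 1 denominator), following the notation used for s elsewhere in these notes. The function name is hypothetical, and the data are the five scores from the earlier variance example.

```python
import statistics


def kurtosis_ungrouped(data):
    """Coefficient of kurtosis: sum of (x - mean)^4 divided by n * s^4.

    Assumption: s is the sample standard deviation (n - 1 denominator).
    """
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)
    fourth_powers = sum((x - mean) ** 4 for x in data)
    return fourth_powers / (n * s ** 4)


scores = [70, 85, 80, 90, 75]
k = kurtosis_ungrouped(scores)
print(round(k, 3))   # prints: 1.088
```

Here the sum of fourth powers is 21250 and s⁴ = 62.5² = 3906.25, so K = 21250 / (5 × 3906.25) = 1.088, which is less than 3 and would be classed as platykurtic under the rule above.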

Types of Kurtosis

Leptokurtic

Mesokurtic (Normal Curve)

Platykurtic

College of Arts & Sciences Mathematics & Physics Department

Name: _______________________________________ Prog/Year/Section: ________________________________


Instructor:___________________
AREAS UNDER THE NORMAL CURVE

z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224

0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621

1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441

1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817

2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952

2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990
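
Entries of this table give the area under the standard normal curve between 0 and z. They can be reproduced with Python's standard `math.erf`, using the identity area(0 to z) = 0.5 · erf(z / √2). The function names below are assumptions made for this sketch.

```python
import math


def area_zero_to_z(z):
    """Area under the standard normal curve between 0 and z (uses |z|)."""
    return 0.5 * math.erf(abs(z) / math.sqrt(2))


def area_left_of(z):
    """Area to the left of z: half the curve plus or minus the table area."""
    if z >= 0:
        return 0.5 + area_zero_to_z(z)
    return 0.5 - area_zero_to_z(z)


def area_between(z1, z2):
    """Area between two z values, in either order."""
    return abs(area_left_of(z2) - area_left_of(z1))


print(round(area_zero_to_z(1.00), 4))      # prints: 0.3413
print(round(area_zero_to_z(1.96), 4))      # prints: 0.475
print(round(area_between(-1.0, 1.0), 4))   # prints: 0.6827
```

Because the curve is symmetric, the table lists only positive z; areas for negative z are found by using the same entry on the other side of the mean.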
