Faseeh Stats Project
Annual Greenhouse Gas Emissions and Population for 10 Large Nations. The data set represents
the annual greenhouse gas emissions (total greenhouse gas emissions, kt of CO2 equivalent /
100,000) and population (population / 10,000,000) for 10 nations for the years 1970-1974. Nations:
Brazil, Canada, China, Germany, France, UK, India, Japan, Mexico, US. Both variables are
continuous, as they can take infinitely many values and are recorded as decimals.
2: Methodology:
A: No. of classes: 2^k > n
2^k > 50
2^6 = 64 > 50
k = 6
No. of classes (for both variables) = 6
Finding the class interval: h = (highest value - lowest value) / k
For Data A: h = (59.13 - 2.11) / 6 = 9.50, so class interval = 10
For Data B: h = (90.04 - 2.13) / 6 = 14.65, so class interval = 15
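The class-count rule and the class-interval formula above can be checked with a short Python sketch. The function names are illustrative only and are not part of the project; the inputs are the sample size and the highest and lowest values quoted above.

```python
import math


def number_of_classes(n: int) -> int:
    """Return the smallest k such that 2**k > n (the 2^k rule)."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k


def class_width(highest: float, lowest: float, k: int) -> float:
    """Unrounded class interval h = (highest - lowest) / k."""
    return (highest - lowest) / k


n = 50
k = number_of_classes(n)           # 2**6 = 64 is the first power of 2 above 50
h_a = class_width(59.13, 2.11, k)  # Data A, before rounding up
h_b = class_width(90.04, 2.13, k)  # Data B, before rounding up

print(k, math.ceil(h_a), math.ceil(h_b))
```

Rounding the unrounded widths up to the next whole number gives the class intervals of 10 and 15 used in the project.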
1. Frequency Distribution
[Figure: frequency distribution graphs for Data A (x-axis values 7, 18, 29, 40, 51, 62) and Data B (x-axis values 9, 24, 39, 54, 69, 84)]
Interpretation: Both variables have broadly similar graphs, with some differences: variable A has
a slightly steeper curve and variable B has a slightly gentler curve.
Key: The median is indicated with a blue line and the mode is indicated with a red line in both
graphs.
Mean = ∑Fx / ∑F
Mean = 757/50 = 15.14
Median = L + (h/f)(n/2 - c), where c is the cumulative frequency of the classes before the median class
Median class: n/2 = 50/2 = 25 (lies in the 1st class, 2-12, with boundaries 1.5-12.5, so c = 0)
Median = 1.5 + (11/35)(25 - 0)
Median = 1.5 + 7.857
Median = 9.357
Mode = L + [(fm - f1) / ((fm - f1) + (fm - f2))] * h
Modal class = highest-frequency class (f = 35) = 2-12
Mode = 1.5 + [(35 - 0) / ((35 - 0) + (35 - 9))] * 11
Mode = 1.5 + 6.311
Mode = 7.811
The data is not symmetrical, as the mean is higher than the median, which is higher than the
mode, i.e. (7.811 < 9.357 < 15.14).
The mean is the most appropriate measure of central tendency here, as this data is continuous and
has no outliers.
Measures of Dispersion:
1: Range = largest value - lowest value
Range = 59.13 - 2.11
Range = 57.02
Coefficient of range = (L - S) / (L + S)
Coefficient of range = (59.13 - 2.11) / (59.13 + 2.11)
Coefficient of range = 57.02 / 61.24
Coefficient of range = 0.931
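As a quick check of the range arithmetic, the following Python sketch recomputes both figures from the largest and smallest values of Data A. The function name is an illustrative choice, not something defined in the project.

```python
def range_and_coefficient(largest: float, smallest: float) -> tuple[float, float]:
    """Return (range, coefficient of range) for the given extremes."""
    value_range = largest - smallest
    coefficient = value_range / (largest + smallest)
    return value_range, coefficient


range_a, coeff_range_a = range_and_coefficient(59.13, 2.11)
print(round(range_a, 2), round(coeff_range_a, 3))
```

A coefficient close to 1 means the spread is large relative to the size of the values themselves.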
Quartile deviation
Quartile deviation = (Q3 - Q1) / 2
Q3 = L + (h/f)(3n/4 - c)
Q3 class: 3n/4 = 3(50)/4 = 37.5 (lies in the 2nd class, 13-23)
Q3 = 12.5 + (11/9)(37.5 - 35)
Q3 = 12.5 + 1.22(2.5)
Q3 = 15.55
Q1 = L + (h/f)(n/4 - c)
Q1 class: n/4 = 50/4 = 12.5 (lies in the 1st class, 2-12)
Q1 = 1.5 + (11/35)(12.5 - 0)
Q1 = 5.428
Q.D = (15.55 - 5.428) / 2
Q.D = 5.061
Coefficient of quartile deviation = (Q3 - Q1) / (Q3 + Q1)
= (15.55 - 5.428) / (15.55 + 5.428)
= 10.122 / 20.978
= 0.48
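The quartile formulas for grouped data share one pattern, L + (h/f)(position - c), so a single helper can cover Q1 and Q3. The Python sketch below is illustrative only: the function name is an assumption, and the inputs are the class boundaries, class width, and frequencies used in the Data A working above (35 observations in the first class, 9 in the second).

```python
def grouped_quantile(lower_boundary: float, width: float, freq: float,
                     cum_before: float, position: float) -> float:
    """Grouped-data quantile: L + (h / f) * (position - C).

    lower_boundary: lower class boundary L of the class holding the quantile
    width:          class width h
    freq:           frequency f of that class
    cum_before:     cumulative frequency C of all earlier classes
    position:       n/4 for Q1, n/2 for the median, 3n/4 for Q3
    """
    return lower_boundary + (width / freq) * (position - cum_before)


n = 50
q1 = grouped_quantile(1.5, 11, 35, 0, n / 4)       # falls in the 1st class
q3 = grouped_quantile(12.5, 11, 9, 35, 3 * n / 4)  # falls in the 2nd class

quartile_deviation = (q3 - q1) / 2
coeff_qd = (q3 - q1) / (q3 + q1)
print(round(q1, 3), round(q3, 3), round(quartile_deviation, 2), round(coeff_qd, 2))
```

Working at full precision gives values that match the hand calculation to two decimal places; small differences in the third decimal come from rounding intermediate steps by hand.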
Mean deviation (about mean) = ∑F|X - mean| / ∑F = 569.8/50 = 11.396
Coefficient of mean deviation = mean deviation / mean = 11.396 / 15.14 = 0.752
Standard deviation = √(∑F(X - mean)^2 / ∑F) = √(13745.88 / 50) = √274.92 = 16.58
Coefficient of variation = (S / mean) * 100 = (16.58 / 15.14) * 100 = 109.51%
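The mean deviation, standard deviation, and coefficient of variation can be reproduced from the column totals. This Python sketch takes the totals reported for Data A as given (the underlying frequency table is not recomputed here), and the variable names are illustrative.

```python
import math

n = 50                  # total frequency, sum of F
mean = 15.14            # grouped mean of Data A
sum_abs_dev = 569.8     # reported total of F * |X - mean|
sum_sq_dev = 13745.88   # reported total of F * (X - mean)^2

mean_deviation = sum_abs_dev / n
coeff_mean_deviation = mean_deviation / mean
std_dev = math.sqrt(sum_sq_dev / n)          # divide by n first, then take the root
coeff_variation = std_dev / mean * 100       # expressed as a percentage

print(round(mean_deviation, 3), round(std_dev, 2), round(coeff_variation, 1))
```

A coefficient of variation above 100% means the standard deviation is larger than the mean, which fits a data set where most observations sit in the lowest class and a few are much larger.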
Calculating the first four moments about the mean gives a negative third moment
(m3 = -9694.60), which indicates that the data is negatively skewed.
b1 = m3^2 / m2^3 = (-9694.60)^2 / (274.95)^3 = 93985269 / 20785533.31 = 4.521
b2 = m4 / m2^2 = 454973.73 / (274.95)^2 = 6.01
The distribution is more peaked than the normal curve, and the curve is leptokurtic, as b2 > 3.
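The two moment ratios can be recomputed from the moments quoted for Data A. The Python sketch below is illustrative: the function names are assumptions, and the classification uses the usual convention that b2 = 3 corresponds to the normal curve.

```python
def beta_coefficients(m2: float, m3: float, m4: float) -> tuple[float, float]:
    """Return (b1, b2), where b1 = m3^2 / m2^3 and b2 = m4 / m2^2."""
    b1 = m3 ** 2 / m2 ** 3
    b2 = m4 / m2 ** 2
    return b1, b2


def kurtosis_type(b2: float) -> str:
    """Classify the curve by comparing b2 with 3, the value for a normal curve."""
    if b2 > 3:
        return "leptokurtic"
    if b2 < 3:
        return "platykurtic"
    return "mesokurtic"


b1_a, b2_a = beta_coefficients(m2=274.95, m3=-9694.60, m4=454973.73)
print(round(b1_a, 2), round(b2_a, 2), kurtosis_type(b2_a))
```

Because m3 is squared, b1 is never negative; the direction of skew has to be read from the sign of m3 itself.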
Data: B
Class       Frequency   Midpoint   Fx     C.F   Class        |X - Mean|   F|X - Mean|   (X - Mean)^2   F(X - Mean)^2
intervals   (F)         (X)                     limits
2-16        35          9          315    35    1.5-16.5     12.24        428.4         149.817        5243.595
17-31       5           24         120    40    16.5-31.5    2.76         13.8          7.6176         38.088
32-46       0           39         0      40    31.5-46.5    17.76        0             315.41         0
47-61       0           54         0      40    46.5-61.5    32.76        0             1073.21        0
62-76       5           69         207    45    61.5-76.5    47.76        143.28        2281.01        6843
77-91       5           84         420    50    76.5-91.5    62.76        313.8         3938           19690
Total       ∑F = 50                ∑Fx = 1062                             ∑F|X - Mean| = 899.28        ∑F(X - Mean)^2 = 31814.68
Mean = ∑Fx / ∑F
Mean = 1062/50 = 21.24
Median class: n/2 = 50/2 = 25 (lies in the 1st class, 2-16, with boundaries 1.5-16.5, so h = 15)
Median = 1.5 + (15/35)(25 - 0)
Median = 1.5 + 10.714
Median = 12.214
Mode = L + [(fm - f1) / ((fm - f1) + (fm - f2))] * h
Modal class = highest-frequency class (f = 35) = 2-16
Measures of Dispersion:
Range = largest value - lowest value
Range = 90.04 - 2.13
Range = 87.91
Coefficient of range = (L - S) / (L + S)
Coefficient of range = (90.04 - 2.13) / (90.04 + 2.13)
Coefficient of range = 87.91 / 92.17
Coefficient of range = 0.953
Quartile deviation = (Q3 - Q1) / 2
Q3 = L + (h/f)(3n/4 - c)
Q3 class: 3n/4 = 3(50)/4 = 37.5 (lies in the 2nd class, 17-31, with boundaries 16.5-31.5, so h = 15)
Q3 = 16.5 + (15/5)(37.5 - 35)
Q3 = 16.5 + 3(2.5)
Q3 = 24
Q1 = L + (h/f)(n/4 - c)
Q1 = 1.5 + (15/35)(12.5 - 0)
Q1 = 6.857
Q.D = (24 - 6.857) / 2
Q.D = 8.571
Coefficient of quartile deviation = (Q3 - Q1) / (Q3 + Q1)
Coefficient of Q.D = (24 - 6.857) / (24 + 6.857)
Coefficient of Q.D = 0.556
Mean deviation (about mean) = ∑F|X - mean| / ∑F = 899.28/50 = 17.98
Coefficient of mean deviation = mean deviation / mean = 17.98 / 21.24 = 0.8465
Standard deviation = √(∑F(X - mean)^2 / ∑F) = √(31814.68 / 50) = √636.29 = 25.22
Coefficient of variation = (S / mean) * 100 = (25.22 / 21.24) * 100 = 118.74%
Skewness
sk = (mean - mode) / standard deviation = (21.24 - 10.409) / 25.22 = 0.429
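The mode-based skewness coefficient for Data B can be recomputed directly from the table total. In this Python sketch the mean, mode, and sum of squared deviations are taken from the text as given, and the variable names are illustrative.

```python
import math

n = 50                  # total frequency
mean_b = 21.24          # grouped mean of Data B
mode_b = 10.409         # mode value used in the skewness calculation above
sum_sq_dev_b = 31814.68 # table total of F * (X - mean)^2

std_dev_b = math.sqrt(sum_sq_dev_b / n)
skewness_b = (mean_b - mode_b) / std_dev_b   # (mean - mode) / standard deviation

print(round(std_dev_b, 2), round(skewness_b, 2))
```

A positive coefficient means the mean lies above the mode, that is, the longer tail of the distribution is on the right.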
M1 = ∑F(X - mean) / ∑F = 0/50 = 0
M2 = ∑F(X - mean)^2 / ∑F = 36713/50 = 734.26
M3 = ∑F(X - mean)^3 / ∑F = -1714837/50 = -34296
M4 = ∑F(X - mean)^4 / ∑F = 104335076/50 = 2086701
Calculating the first four moments about the mean gives a negative third moment
(M3 = -34296), which points to negative skew, in contrast to the positive mode-based
coefficient of 0.429.
b1 = m3^2 / m2^3 = (-34296)^2 / (734.26)^3 = 1176215616 / 395867282.55 = 2.97
b2 = m4 / m2^2 = 2086701 / (734.26)^2 = 3.87
The distribution is more peaked than the normal curve, and the curve is leptokurtic, as b2 > 3.
Variable B has data closer to a normal distribution, as its b2 value is closer to 3, and the normal
distribution has a b2 value of 3.
Conclusion: I studied the data and applied various measures, such as measures of central tendency
(mean, median, mode) and measures of dispersion (quartile deviation, etc.). Skewness was then
calculated and was found to be negative for Set A and positive for Set B.