Data Analysis Project
on Country Average IQ
Contents
Acknowledgement
1. Introduction
1.1 Report Objective
1.2 Problem Statement
2. Overview
3. Data Exploration
3.1 Data Dictionary
3.2 Descriptive Analysis
4. Graphical Data Analysis
4.1 Univariate analysis
4.1.1 Country average IQ
4.1.2 Country education expenditure
4.2 Bivariate analysis
4.2.1 Education Expense and Country Average IQ
5. Testing Hypothesis
5.1 Test on two means assuming variance is equal
5.2 Test on two variances between two variables
6. Conclusion
Reference
Appendix
Acknowledgement
First and foremost, the group would like to give special thanks to our professor,
Suchismita Das, for guiding and assisting us in this project and in the Statistics and Probability
class as a whole. This report would not have been possible without the combined effort of
each member of the group: Altaibaatar, Hung, Jason, and Patrick.
1. Introduction
1.1 Report Objective
The data set, obtained from Kaggle, covers 100 countries and records each country's
average IQ and average education expenditure per student (in dollars). The original
data set contained 106 country records. Because the Education Expenditure field was
incomplete for 6 of them (Taiwan and North Korea among others), the researchers
excluded those countries.
This leaves 100 records to use as a sample for examining the relationship, and
possible correlation, between Average IQ and Education Expenditure.
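The exclusion step described above can be sketched in Python as follows. This is only an illustration: the record type, the function name, and the third (incomplete) row are hypothetical, while the Japan and Hungary values are taken from the appendix.

```python
from typing import NamedTuple, Optional


class CountryRecord(NamedTuple):
    country: str
    avg_iq: int
    edu_expense: Optional[int]  # dollars per student; None if missing in the source


def drop_incomplete(records):
    """Return only the records that have an education expenditure value."""
    return [r for r in records if r.edu_expense is not None]


# Japan and Hungary use values from the appendix; the third row is a
# hypothetical incomplete record standing in for an excluded country.
raw = [
    CountryRecord("Japan", 106, 1340),
    CountryRecord("Hungary", 99, 585),
    CountryRecord("Hypothetical incomplete country", 100, None),
]
cleaned = drop_incomplete(raw)
print(len(raw), "records before cleaning,", len(cleaned), "after")
```

Dropping incomplete rows, rather than filling in the missing values, keeps every remaining record fully observed for both variables.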
2. Overview
The original dataset reports the worldwide average IQ by country along with several other
pieces of information. In this report the researchers focus on two variables: average IQ and
education expenditure. The report covers descriptive statistics from the data exploration,
graphical analysis of the data, and hypothesis testing.
The dataset was selected from Kaggle and is owned by Abhijit Dahatonde.
3. Data Exploration
3.1 Data Dictionary
IQ - A score that indicates how far above or below their peers an individual stands
in mental ability. Scores are normally obtained from IQ tests. In this report the
value is the average for each country.
Education Expenditure - The average education expense per student in a country
(converted to dollars).
3.2 Descriptive Analysis
Statistic    IQ    Education Expenditure (dollars per student)
Mean         87    929
Median       88    367
Mode         99    16
Minimum      55    1
Range        51    5435
IQR          17    1302
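The statistics in the table can be computed along the following lines. This is a minimal sketch using Python's standard statistics module. The report does not state which quartile convention was used, so the "inclusive" method below is an assumption, and the input is only the average IQ of the ten highest-ranked countries in the appendix rather than the full sample.

```python
import statistics


def describe(values):
    """Summary statistics matching the rows of the descriptive table."""
    # Quartile convention is an assumption; the report does not state it.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),  # first mode if several values tie
        "minimum": min(values),
        "range": max(values) - min(values),
        "iqr": q3 - q1,
    }


# Average IQ of the ten highest-ranked countries in the appendix.
top_ten_iq = [106, 106, 106, 104, 103, 101, 101, 100, 100, 100]
summary = describe(top_ten_iq)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Because the range and IQR are differences, they carry the same unit as the variable itself (IQ points or dollars per student).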
4. Graphical Data Analysis
4.1 Univariate analysis
This analysis focuses on one variable at a time. Its purpose is to summarize what the
researchers can learn from the charts constructed from the data.
Histogram - A chart whose bar heights show how often values fall within each interval
(bin) of a variable.
4.1.1 Country average IQ
[Figure: histogram of Country Average IQ]
4.1.2 Country education expenditure
[Figure: histogram of education expenditure per student; y-axis: number of countries]
The data indicates a strongly positively (right) skewed distribution of education
expenditure per student across the sample of 100 countries. 66 countries spend less
than $891 per student, clustering heavily at the lower end, and the number of
countries drops sharply in the spending ranges above $891 per student. The mean
($929) lying well above the median ($367) reflects the same long right tail. Most
countries in the sample therefore devote relatively small budgets to education,
while far fewer allocate substantially more. This imbalance could point to broader
systemic constraints on education budgets in these countries. Given the impact
education has on economic development and social welfare, having such a large
share of countries clustered at the low end of spending is concerning.
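The direction of the skew can be checked numerically as well as from the histogram. The sketch below is an illustration only: it uses ten education-expense values picked from the appendix rather than all 100, and the function names are our own. A mean above the median and a positive third-moment skewness both indicate a long right tail.

```python
import statistics


def share_below(values, threshold):
    """Fraction of values strictly below the threshold."""
    return sum(v < threshold for v in values) / len(values)


def skewness(values):
    """Population (moment) skewness: positive for a long right tail."""
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / len(values)
    m3 = sum((v - mean) ** 3 for v in values) / len(values)
    return m3 / m2 ** 1.5


# Ten education-expense values from the appendix (dollars per student):
# Norway, Greenland, United States, Spain, Poland, Mexico, Iran, India,
# Chad, Somalia.
sample = [5436, 4518, 2608, 1176, 545, 437, 176, 47, 16, 1]

print("share below 891:", share_below(sample, 891))
print("mean > median:", statistics.fmean(sample) > statistics.median(sample))
print("skewness > 0:", skewness(sample) > 0)
```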
4.2 Bivariate analysis
This analysis studies the relationship between two variables. It uses graphical representations
such as a scatter plot, together with measures such as correlation and covariance, to describe
that relationship.
Scatter plot - A graph that plots the values of two variables on the x and y axes. With it
we can observe the relationship between two numeric variables.
Correlation - A measure of the strength and direction of the linear relationship between two
continuous variables.
[Figure: scatter plot of education expense against Country Average IQ]
SUMMARY        Value
Correlation    0.57859348
Covariance     7723.37172
Count          100

MCT        AVGIQ       EDUEXP
Mean       87          929
Median     88          367
Mode       99          16
Std Dev    11.36651    1174
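Correlation and covariance are linked by r = cov(x, y) / (s_x s_y). The sketch below computes both for a ten-country subset of the appendix, and then checks that the summary values reported above satisfy the same identity. The n - 1 divisor is an assumption, since the report does not say whether sample or population formulas were used, and the function names are our own.

```python
import math


def sample_covariance(xs, ys):
    """Sample covariance with an n - 1 divisor (assumed convention)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by both standard deviations."""
    std_x = math.sqrt(sample_covariance(xs, xs))
    std_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (std_x * std_y)


# Subset of the appendix: Norway, Greenland, United States, Spain, Poland,
# Mexico, Iran, India, Chad, Somalia.
avg_iq = [98, 99, 97, 95, 96, 87, 81, 77, 75, 69]
edu_exp = [5436, 4518, 2608, 1176, 545, 437, 176, 47, 16, 1]
r_subset = pearson_r(avg_iq, edu_exp)

# Consistency check on the reported summary: covariance / (std dev * std dev).
r_from_summary = 7723.37172 / (11.36651 * 1174)

print("r for the subset:", round(r_subset, 3))
print("r implied by the summary values:", round(r_from_summary, 4))
```

The value implied by the summary agrees with the reported correlation to about three decimal places. The small gap comes from the EDUEXP standard deviation being rounded to a whole number.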
$5,500 per student. While directionally aligned, the magnitude of spending in the
billions exceeds what the broader data distribution would predict.
The majority of countries have more modest education budgets, concentrated
below roughly $900 per student, with average IQ levels between 70 and 90. This
range holds the bulk of the data: most countries pair a mid-range average IQ
with far lower spending than the top-spending countries.
5. Testing Hypothesis
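Hypothesis testing lets us judge whether a difference observed between two groups in the
sample is large enough to hold in the population.
5.1 Test on two means assuming variance is equal
Group 1 is the countries with average IQ ≥ 99 and group 2 is the countries with average
IQ < 99. The variable compared is education expense per student. The null hypothesis is
μ1 = μ2 and the alternative is μ1 > μ2, tested at α = 0.05 with a pooled (equal-variance)
estimate of the standard deviation. The pooled t statistic evaluates to
\[ t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{1/n_1 + 1/n_2}} = 5.564 \]
and is compared with the one-tailed critical value at 98 degrees of freedom:
\[ t = 5.564 > t_{98,\,0.05} = 1.661 \]
Since t = 5.564 exceeds the critical value tc = 1.661, we reject the null hypothesis: there is
enough evidence to claim that countries with average IQ ≥ 99 spend more on education
per student, on average, than countries with average IQ < 99.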
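We can also construct a 95% confidence interval for the difference between the two means.
The interval is centred on the observed difference of 1431, and a test statistic of t = 5.564
implies a standard error of about 257.2. With the two-sided critical value t98,0.025 ≈ 1.984
this gives approximately
\[ 921 < \mu_1 - \mu_2 < 1941 \]
The interval excludes zero, which agrees with the rejection of the null hypothesis.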
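The mechanics of the test on two means can be sketched as follows. The function names are our own, and the first call uses hypothetical summary figures chosen only to show the arithmetic; they are not the group statistics of this report. The last line applies the right-tailed decision rule to the values reported above (t = 5.564, tc = 1.661).

```python
import math


def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Return (t, degrees of freedom) for the equal-variance two-sample t-test."""
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df


def reject_right_tailed(t_stat, t_crit):
    """Reject H0 in favour of mu1 > mu2 when t exceeds the critical value."""
    return t_stat > t_crit


# Hypothetical summary figures, for illustration only.
t_demo, df_demo = pooled_t(10.0, 4.0, 5, 8.0, 4.0, 5)
print("demo t:", round(t_demo, 4), "df:", df_demo)

# Decision using the statistic and critical value reported in this section.
decision = reject_right_tailed(5.564, 1.661)
print("reject H0:", decision)
```

With group sizes of 20 and 80 the same function returns 98 degrees of freedom, matching the critical value used above.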
5.2 Test on two variances between two variables
In this section we test whether the variance of education expense differs between the two
groups: countries with average IQ ≥ 99 and countries with average IQ < 99.
[Table columns: Average IQ ≥ 99, Average IQ < 99]
The significance level is α = 0.05 and the degrees of freedom are v1 = 19 and v2 = 79.
The test is two-tailed, so the rejection region lies below the lower critical value and above
the upper critical value, both taken from the F-table:
\[ F_{1-0.025,\,(19,79)} = 0.447 \le F = \frac{S_1^2}{S_2^2} = 1.31 < F_{0.025,\,(19,79)} = 1.906 \]
From the values above, FL = 0.447 ≤ F = 1.31 ≤ FU = 1.906, so the test statistic falls
outside the rejection region and we fail to reject the null hypothesis. There is not enough
evidence to claim that countries with average IQ ≥ 99 and countries with average IQ < 99
differ in the variance of education expense, so we proceed as if the two population
variances are equal.
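The two-tailed decision rule can be written as a short check. This is a sketch with function names of our own. The critical values are the F-table figures quoted above, not values computed by the code.

```python
def f_statistic(var1, var2):
    """Ratio of the two sample variances, S1^2 / S2^2."""
    return var1 / var2


def reject_two_tailed(f_stat, f_lower, f_upper):
    """Reject H0 (equal variances) when F falls outside [f_lower, f_upper]."""
    return f_stat < f_lower or f_stat > f_upper


# Values reported in this section: F = 1.31, FL = 0.447, FU = 1.906.
reject = reject_two_tailed(1.31, 0.447, 1.906)
print("reject H0 of equal variances:", reject)
```

A ratio near 1 means the two sample variances are similar. Only ratios below FL or above FU count as evidence of a difference at α = 0.05.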
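The confidence interval for the ratio of the two variances is obtained from
\[ \frac{S_1^2}{S_2^2}\,F_{1-\alpha/2,\,(v_1-1,\,v_2-1)} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{S_1^2}{S_2^2}\,F_{\alpha/2,\,(v_1-1,\,v_2-1)} \]
After calculation we get the confidence interval
\[ 0.604 < \frac{\sigma_1^2}{\sigma_2^2} < 2.574 \]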
6. Conclusion
In summary, the analysis shows that country average IQ is positively correlated with
education expense per student (r ≈ 0.58). Not every relevant factor is considered in this
report, and the sample is limited to the 100 countries that remained after data cleaning.
Within those limits, we consider the conclusion reasonably well supported.
In the descriptive and graphical analysis, the sample mean and variance serve as point
estimators of the population mean and variance. The test on two means assumes equal
variances, which may not hold exactly. That test indicates that average education expense
is higher for countries with an average IQ of 99 or more. The variance test found no
evidence that the variance of education expense differs between the two groups.
Reference
https://www.kaggle.com/datasets/abhijitdahatonde/worldwide-average-iq-levels/data
https://courses.lumenlearning.com/introstats1/chapter/test-of-two-variances/
Appendix
RANK COUNTRY AVG IQ EDUCATION EXPENSE (dollars per student; "." marks thousands)
1 Hong Kong 106 1.283
2 Japan 106 1.340
3 Singapore 106 1.428
4 China 104 183
5 South Korea 103 1.024
6 Netherlands 101 2.386
7 Finland 101 2.725
8 Canada 100 2.052
9 Luxembourg 100 3.665
10 Macao 100 1.448
11 Germany 100 1.883
12 Switzerland 100 3.550
13 Estonia 100 749
14 Australia 99 2.343
15 United Kingdom 99 2.076
16 Greenland 99 4.518
17 Iceland 99 3.812
18 Austria 99 2.341
19 Hungary 99 585
20 New Zealand 99 2.083
21 Belgium 98 2.507
22 Norway 98 5.436
23 Sweden 98 3.419
24 Denmark 98 4.133
25 Cambodia 97 16
26 France 97 2.042
27 United States 97 2.608
28 Poland 96 545
29 Czechia 96 712
30 Russia 96 338
31 Spain 95 1.176
32 Ireland 95 2.500
33 Italy 95 1.380
34 Croatia 95 505
35 Lithuania 95 550
36 Israel 93 1.811
37 Mongolia 93 153
38 Portugal 93 1.005
39 Bermuda 92 1.748
40 Bulgaria 91 224
41 Greece 91 781
42 Ukraine 91 143
43 Vietnam 91 76
44 Kazakhstan 89 225
45 Malaysia 89 450
46 Myanmar 89 14
47 Thailand 89 182
48 Serbia 89 208
49 Brunei 88 1.020
50 Chile 88 482
51 Costa Rica 88 487
52 Iraq 88 193
53 Romania 88 249
54 Argentina 87 454
55 Mauritius 87 310
56 Mexico 87 437
57 Turkey 87 334
58 Georgia 86 84
59 Sri Lanka 85 55
60 Cuba 84 395
61 Brazil 83 427
62 Philippines 83 67
63 Colombia 83 231
64 Laos 83 32
65 Venezuela 83 273
66 Albania 82 118
67 United Arab Emirates 82 805
68 Dominican Republic 82 158
69 Puerto Rico 82 1.174
70 Afghanistan 81 15
71 Iran 81 176
72 Pakistan 81 27
73 Indonesia 80 79
74 Kuwait 80 1.990
75 Oman 80 798
76 Qatar 80 2.331
77 Bolivia 79 156
78 Ecuador 79 199
79 Egypt 78 92
80 Algeria 77 253
81 India 77 47
82 Saudi Arabia 77 1.265
83 Sudan 77 25
84 Syria 76 252
85 Bangladesh 75 20
86 Chad 75 16
87 East Timor 74 54
88 Kenya 74 62
89 Zimbabwe 74 50
90 El Salvador 72 113
91 Morocco 71 144
92 South Africa 69 336
93 Niger 69 15
94 Somalia 69 1
95 Ethiopia 67 21
96 Cameroon 67 36
97 Congo 64 7
98 Ghana 61 76
99 Ivory Coast 61 69
100 Gambia 55 14