SQL For Data Science Project Week 4
Summer-Men
corr r2 regr_intercept regr_slope cnt
0.8885850837 0.789583451 16.7894246363 4.5224768144 230
Summer-Women
corr r2 regr_intercept regr_slope cnt
0.8756011598 0.7666773911 7.1624204141 4.3315877441 222
Winter-Men
corr r2 regr_intercept regr_slope cnt
0.8226198447 0.6767034089 6.7220461281 6.1787707995 114
Winter-Women
corr r2 regr_intercept regr_slope cnt
0.8129183736 0.6608362822 4.2185210832 3.0377191215 90
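The column names in the four tables above match the SQL regression aggregates (`corr`, `regr_r2`, `regr_intercept`, `regr_slope`) plus a row count. As a rough illustration of what those aggregates compute for paired values, here is a minimal Python sketch. The function name `regression_summary` and the sample data are hypothetical and are not taken from the project; the slides do not say which two columns were paired.

```python
from math import sqrt


def regression_summary(xs, ys):
    """Least-squares summary for paired samples, mirroring the table columns."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross products.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("both samples must vary")
    corr = sxy / sqrt(sxx * syy)
    slope = sxy / sxx
    return {
        "corr": corr,
        "r2": corr ** 2,
        "regr_intercept": mean_y - slope * mean_x,
        "regr_slope": slope,
        "cnt": n,
    }


# Hypothetical data lying exactly on y = 2x + 1.
summary = regression_summary([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
print(summary)
```

In simple linear regression r2 is the square of corr, which the tables confirm: for Summer-Men, 0.8885850837 squared is about 0.7896.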
Percent of women participants growing over time
• At the Olympic Games in 1896 there were no women participants, and now they make up almost half of all
participants.
Year season female_perc
1896 Summer 0
1900 Summer 1.8790849673
1904 Summer 0.9230769231
1906 Summer 0.7134363853
1908 Summer 2.1739130435
1912 Summer 2.200083022
1920 Summer 2.9147982063
1924 Summer 4.7911547912
1924 Winter 4.1533546326
…
…
…
2002 Winter 36.9320550229
2004 Summer 40.7312683528
2006 Winter 38.2919005613
2008 Summer 42.2882833287
2010 Winter 40.7334384858
2012 Summer 44.2521631644
2014 Winter 40.14571949
2016 Summer 45.0308614366
Percent of women participants growing over time:
correlation between "Year" and female_perc
corr r2 regr_intercept regr_slope cnt
0.9520753909 0.90644755 1915.8001528124 2.4633563774 51
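An intercept of about 1915.8 suggests that the year was treated as the dependent variable, so the fitted line reads year ≈ 1915.8 + 2.46 × female_perc. This is an inference from the numbers and is not stated in the slides. Under that assumption, the slope in the more intuitive direction (percentage points gained per year) can be recovered from the identity that the two regression slopes multiply to r2. The names below are hypothetical helpers for illustration.

```python
# Values copied from the table above.
R2 = 0.90644755
SLOPE_YEAR_ON_PERC = 2.4633563774  # assumed unit: years per percentage point
INTERCEPT_YEAR = 1915.8001528124   # assumed meaning: fitted year at female_perc = 0


def slope_perc_on_year(r2, slope_year_on_perc):
    """In simple linear regression the two slopes multiply to r squared."""
    return r2 / slope_year_on_perc


def fitted_year(female_perc):
    """Year predicted by the fitted line for a given percentage of women."""
    return INTERCEPT_YEAR + SLOPE_YEAR_ON_PERC * female_perc


points_per_year = slope_perc_on_year(R2, SLOPE_YEAR_ON_PERC)
print(f"{points_per_year:.3f} percentage points per year")
```

If the assumption holds, the share of women rose by roughly 0.37 percentage points per year on average over the 51 Games in the sample.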
Which teams have the highest percentage of
women participants
noc season female_perc
HKG Winter 80
TLS Summer 62.5
KOS Summer 62.5
PRK Winter 61.1940298507
CHN Winter 60.7594936709
CHN Summer 54.0057452921
BHU Summer 51.8518518519
UZB Winter 51.8518518519
MHL Summer 50
PLW Summer 50
CPV Summer 50
LCA Summer 50
ANG Summer 48.347107438
BLR Summer 46.9026548673
PRK Summer 46.6507177033
UKR Summer 46.1147421932
VIE Summer 45.5284552846
RUS Summer 43.908045977
UKR Winter 43.6666666667
DEN Winter 43.2432432432
Percent of women participants in teams
• There are many teams that never had any women participants, but there are also teams where most of the
participants were women.
Summer
avg min max
24.0928672197 0 62.5
Winter
avg min max
20.5519064359 0 80
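The per-season avg, min and max above can be obtained by first computing female_perc per team and season and then aggregating those team-level values. The sketch below runs such a query through Python's built-in `sqlite3` module. The table name `athlete_events`, its columns and the sample rows are assumptions made for illustration, and the query counts rows rather than distinct athletes, which may differ from what the project did.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per participant entry.
conn.execute("CREATE TABLE athlete_events (noc TEXT, season TEXT, sex TEXT)")
rows = [
    ("AAA", "Summer", "F"), ("AAA", "Summer", "M"),
    ("AAA", "Summer", "M"), ("AAA", "Summer", "M"),
    ("BBB", "Summer", "M"), ("BBB", "Summer", "M"),
    ("CCC", "Winter", "F"), ("CCC", "Winter", "F"),
    ("CCC", "Winter", "F"), ("CCC", "Winter", "F"),
    ("CCC", "Winter", "M"),
]
conn.executemany("INSERT INTO athlete_events VALUES (?, ?, ?)", rows)

QUERY = """
WITH team_share AS (
    -- Percentage of women per team (noc) and season.
    SELECT noc,
           season,
           100.0 * SUM(CASE WHEN sex = 'F' THEN 1 ELSE 0 END) / COUNT(*)
               AS female_perc
    FROM athlete_events
    GROUP BY noc, season
)
-- Summary of the team-level percentages per season.
SELECT season, AVG(female_perc), MIN(female_perc), MAX(female_perc)
FROM team_share
GROUP BY season
ORDER BY season
"""

season_stats = conn.execute(QUERY).fetchall()
for season, avg_perc, min_perc, max_perc in season_stats:
    print(season, avg_perc, min_perc, max_perc)
```

Averaging team-level percentages weights every team equally regardless of its size, so the avg column is not the same as the overall share of women among all participants.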
Insights Discovered
• Analysis of the Olympic Games dataset shows that the number of
participants per country is highly correlated with medal rank.
At first I built metrics from ranks and numbers of participants and
checked how they behaved for Olympic teams. I believed that looking at
those data would answer my main hypothesis, but this approach led
nowhere specific.
The second approach was much simpler: I checked the correlation between
the average number of medals and the average number of participants per
team. They were highly correlated.
Insights Discovered
• The percentage of women participants shows a strong upward trend over
time (correlation with year of about 0.95), although it did not rise at
every single Games.
• 80% of the participants on the Hong Kong team at the Winter Olympic
Games were women.
• Some teams have never had women participants at the Olympic
Games.
Recommendations and Actions
• In my analyses I focused on specific questions. I believe that this
dataset can give us much more. I didn’t even consider information
about the age, height and weight of participants. I believed that there was
no place for them in this specific study, but there is opportunity for
more analyses. Even with the data that I used, you can find more
questions to answer.
• This data is becoming incomplete. It should be updated with new
data after every Olympic Games. It would be nice to know whether the
observed trends continue. Maybe they will slow down and
we will observe some kind of stagnation, or maybe they will even
reverse. The future will tell.