Tutorial 6
In [91]: import random
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
Exercise 1
In [1]: def greet():
    print("Hello, there!")
In [3]: greet("Alice")
Hello, Alice!
In [4]: greet(name="Alice")
Hello, Alice!
Exercise 2
In [43]: greet(firstname="Alice")
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[43], line 1
----> 1 greet(firstname="Alice")
Hello, Alice!
Exercise 3
file:///C:/Users/longo/Downloads/Tutorial6.html 1/13
11/28/23, 1:30 PM Tutorial6
Hello, there!
Hello, Bob!
In [17]: help(greet)
greet(name)
Greets the user with the provided name.
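The outputs in Exercises 1 to 3 (`Hello, there!`, `Hello, Bob!` and the `help(greet)` text) are consistent with a function that takes an optional name and carries a docstring. The following is a minimal sketch under that reading; the default value `"there"` and the helper `build_greeting` are assumptions for illustration and do not appear in the notebook.

```python
def build_greeting(name="there"):
    """Return the greeting text for the provided name."""
    return f"Hello, {name}!"


def greet(name="there"):
    """Greets the user with the provided name."""
    print(build_greeting(name))


greet()            # no argument given, so the default value is used
greet("Bob")       # positional argument
greet(name="Bob")  # keyword argument; the keyword must match the parameter name
```

Calling `greet(firstname="Bob")` on this sketch would raise a `TypeError`, because `firstname` is not a parameter of the function.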
Exercise 4
In [24]: def RPS(n_rounds):
    """
    Game of Rock, Paper, Scissors.
    """
    # n_rounds is supplied by the caller; reassigning it here would ignore the argument
    computer_score = 0
    user_score = 0
    print(f"You won {user_score} time(s).")
    print(f"The computer won {computer_score} time(s).")
    score_count.plot(kind='pie', autopct='%1.1f%%')
In [25]: RPS(3)
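The body of the game loop is not visible in the cell above. As a rough sketch of how a round could be decided and the scores tallied, the example below uses only the standard library. The names `CHOICES`, `BEATS`, `decide_round` and `play_rounds` are hypothetical, and the user's choices are passed in as a list instead of being read with `input()`.

```python
import random
from collections import Counter

CHOICES = ("rock", "paper", "scissors")
# Each key beats the value it maps to.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def decide_round(user_choice, computer_choice):
    """Return 'user', 'computer' or 'draw' for a single round."""
    if user_choice == computer_choice:
        return "draw"
    if BEATS[user_choice] == computer_choice:
        return "user"
    return "computer"


def play_rounds(user_choices, rng=None):
    """Play one round per entry in user_choices and count the outcomes."""
    rng = rng or random.Random()
    outcomes = Counter()
    for user_choice in user_choices:
        computer_choice = rng.choice(CHOICES)
        outcomes[decide_round(user_choice, computer_choice)] += 1
    return outcomes


scores = play_rounds(["rock", "paper", "scissors"])
print(f"You won {scores['user']} time(s).")
print(f"The computer won {scores['computer']} time(s).")
```

A tally like `scores` could then be wrapped in a pandas Series to draw the pie chart, which is one plausible origin of the `score_count` object used in the notebook.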
Improving visualisations
Exercise 5
In [92]: url = 'https://docs.google.com/spreadsheets/d/e/2PACX-1vTEX2jqDTG_1uLK6lc1jxhsLQd47
gs_intro_survey = pd.read_csv(url)
print(gs_intro_survey)
What sort of job would you like to do when you graduate? Monthly_Income
0 Diplomat 3000.0
1 Humanitarian aid 300.0
2 Human rights development 300.0
3 Journalist 80.0
4 Consulting 500.0
.. ... ...
89 no clue 0.0
90 Diplomat 500.0
91 No clue 250.0
92 Researcher 300.0
93 Human Rights Activist 312.0
# Create subplots
fig, axes = plt.subplots(nrows=3, ncols=1, figsize=(8, 10))
# Plot histogram
axes[0].hist(heights, bins=20, color='skyblue', edgecolor='black')
axes[0].set_title('Histogram')
Exercise 6
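The histogram shows the frequency distribution of the data: how many observations fall
into each bin. The rug plot marks each individual observation along the axis, showing the
spread of the data and where it is concentrated. The box plot shows the spread, the median
(orange line) and the upper and lower quartiles.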
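To make "frequency distribution" concrete, the sketch below counts how many values fall into each equal-width bin, which is the quantity a histogram draws as bar heights. The function `bin_counts` and the sample data are made up for illustration, and the sketch assumes the values are not all identical (otherwise the bin width would be zero).

```python
def bin_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    low, high = min(values), max(values)
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for value in values:
        # The maximum value belongs to the last bin rather than a new one.
        index = min(int((value - low) / width), n_bins - 1)
        counts[index] += 1
    return counts


sample_heights = [150, 155, 160, 160, 165, 170, 170, 170, 180, 190]  # made-up data
print(bin_counts(sample_heights, 4))  # prints [2, 3, 3, 2]
```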
Exercise 7
# Create subplots
fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(4, 5))
# Plot histogram
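# Label the histogram panel directly; plt.xlabel/plt.ylabel would label the last subplot instead
axes[0].set_xlabel('Height')
axes[0].set_ylabel('Frequency')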
axes[0].hist(heights, bins=20, color='skyblue', edgecolor='green') #Formats the fig
axes[0].set_title('Histogram')
axes[0].axvline(184, color='black')
axes[0].text(184, 8,"Leto's height",rotation=90,
backgroundcolor='white', horizontalalignment='center')
Exercise 8
171.19893617021276
Out[37]:
170.5
Out[38]:
In [39]: df = pd.DataFrame(gs_intro_survey)
The mode was repeated 6 times. Please refer to the table below to find its corresponding value.
Height
153.0 1
154.0 1
155.0 1
156.0 1
157.0 2
158.0 3
160.0 5
161.0 2
162.0 2
162.5 1
163.0 5
164.0 1
164.7 1
165.0 3
166.0 2
167.0 4
168.0 6
169.0 3
170.0 3
171.0 3
172.0 6
173.0 5
174.0 1
175.0 5
176.0 1
177.0 4
178.0 3
178.5 1
180.0 1
182.0 2
183.0 4
184.0 1
185.0 2
187.0 1
188.0 1
189.0 1
190.0 2
192.0 1
193.0 1
200.0 1
dtype: int64
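In the table above, two heights (168.0 and 172.0) each appear 6 times, so the data has more than one mode. The sketch below shows one way to list every value that shares the highest count. It uses the standard library instead of pandas, and the sample data is made up for illustration.

```python
from collections import Counter
from statistics import multimode

sample = [168.0, 172.0, 168.0, 160.0, 172.0, 175.0]  # made-up data

counts = Counter(sample)
highest_count = max(counts.values())
# Keep every value whose count ties for the maximum.
modes = sorted(value for value, count in counts.items() if count == highest_count)

print(f"Modes: {modes}, each repeated {highest_count} times")
print(multimode(sample))  # standard-library equivalent; returns all tied modes
```

Reporting all tied values avoids silently picking one of them when the plot later marks "the" mode.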
# Create subplots
fig, axes = plt.subplots(nrows=3, ncols=1, figsize=(8, 10))
# Plot histogram
axes[0].hist(heights, bins=20, color='skyblue', edgecolor='black')
axes[0].set_title('Histogram')
axes[0].axvline(168, color='black')
axes[0].text(168, 8,"Mode",rotation=90,
backgroundcolor='white', horizontalalignment='center')
axes[0].axvline(170.5, color='black')
axes[0].text(170.5, 8,"Median",rotation=90,
backgroundcolor='white', horizontalalignment='center')
axes[0].axvline(171.19, color='black')
axes[0].text(171.19, 8,"Mean",rotation=90,
backgroundcolor='white', horizontalalignment='center')
Measures of Variation
Exercise 9
In [54]: def differenceFrom(x, m=0):
    difference = m - x
    return difference
In [45]: differenceFrom(1)
-1
Out[45]:
Exercise 10
In [47]: divideBy(4, 2)
2.0
Out[47]:
Exercise 11
In [60]: def calculateMean(numberList):
    mean = np.mean(numberList)
    return mean
3.4285714285714284
Out[53]:
Exercise 12
In [104… def calculateVariance(numberList):
    mean = calculateMean(numberList)
    number_data = 0
    sum_squared_difference = 0
    for x in numberList:
        difference = differenceFrom(x, mean)
        squared_difference = difference * difference
        sum_squared_difference += squared_difference
        number_data += 1
2.9523809523809526
Out[105]:
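The final step of the variance calculation (dividing the sum of squared differences by a count) is not visible in the cell above. The sketch below shows both common choices of divisor: n for the population variance and n - 1 for the sample variance. The names `mean_of` and `variance_of` and the example list are hypothetical and not taken from the notebook.

```python
def mean_of(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def variance_of(values, sample=False):
    """Average squared deviation from the mean.

    sample=False divides by n (population variance);
    sample=True divides by n - 1 (sample variance).
    """
    centre = mean_of(values)
    sum_squared_difference = sum((centre - x) ** 2 for x in values)
    divisor = len(values) - 1 if sample else len(values)
    return sum_squared_difference / divisor


example = [1, 2, 3, 3, 4, 5, 6]  # hypothetical list for illustration
print(mean_of(example))
print(variance_of(example))               # population variance
print(variance_of(example, sample=True))  # sample variance
```

For comparison, `np.var` divides by n by default and switches to n - 1 when called with `ddof=1`.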
Exercise 13
In [102… def calculateVariancewithoutsquare(numberList):
    mean = calculateMean(numberList)
    number_data = 0
    sum_squared_difference = 0
    for x in numberList:
        squared_difference = differenceFrom(x, mean)
        sum_squared_difference += squared_difference
        number_data += 1
-1.4802973661668753e-16
Out[103]:
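The result above is effectively zero because deviations from the mean always cancel: the values below the mean contribute exactly as much as the values above it, and what remains is floating-point rounding error. A small sketch of that effect, using a hypothetical list that is not taken from the notebook:

```python
example = [1, 2, 3, 3, 4, 5, 6]  # hypothetical list for illustration
centre = sum(example) / len(example)

# Signed differences cancel out, so their total is zero up to rounding error.
signed_total = sum(centre - x for x in example)

# Squaring makes every term non-negative, so nothing cancels.
squared_total = sum((centre - x) ** 2 for x in example)

print(signed_total)
print(squared_total)
```

This is why the variance squares each difference before summing: without the square, the measure could not distinguish a tightly clustered dataset from a widely spread one.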
Exercise 14
In [81]: def test_standard_deviation(r, m=10):
    x = np.zeros(10)  # create a dataset of 10 values
    x[:5] = m - r
    x[5:] = m + r
    return x
In [82]: test_standard_deviation(77)
array([-67., -67., -67., -67., -67., 87., 87., 87., 87., 87.])
Out[82]:
In [84]: np.mean([-67., -67., -67., -67., -67., 87., 87., 87., 87., 87.])
10.0
Out[84]:
In [85]: np.std([-67., -67., -67., -67., -67., 87., 87., 87., 87., 87.])
77.0
Out[85]:
In [87]: test_standard_deviation(10,9)
array([-1., -1., -1., -1., -1., 19., 19., 19., 19., 19.])
Out[87]:
In [88]: np.mean([-1., -1., -1., -1., -1., 19., 19., 19., 19., 19.])
9.0
Out[88]:
In [89]: np.std([-1., -1., -1., -1., -1., 19., 19., 19., 19., 19.])
10.0
Out[89]:
Here m is the mean and r is the standard deviation.
This function creates a perfectly symmetrical array centred on the mean: the first 5 values
are the mean minus the standard deviation and the last 5 values are the mean plus the
standard deviation.
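The reason this construction works is that every value sits exactly r away from m, so each squared deviation equals r squared, the variance equals r squared, and the standard deviation equals r. The sketch below checks this with the standard library's population standard deviation (the same divisor that `np.std` uses by default). The function name `symmetric_dataset` and the `half` parameter are assumptions for illustration.

```python
from statistics import mean, pstdev


def symmetric_dataset(r, m=10, half=5):
    """Return `half` values at m - r followed by `half` values at m + r."""
    return [float(m - r)] * half + [float(m + r)] * half


values = symmetric_dataset(77)
print(mean(values))    # the mean of the dataset, expected to equal m
print(pstdev(values))  # population standard deviation, expected to equal r
```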
Exercise 15
In [108… std = np.std(heights)
# Create subplots
fig, axes = plt.subplots(nrows=3, ncols=1, figsize=(8, 10))
# Plot histogram
axes[0].hist(heights, bins=20, color='skyblue', edgecolor='black')
axes[0].set_title('Histogram')
axes[0].axvline(168, color='black')
axes[0].text(168, 8,"Mode",rotation=90,
backgroundcolor='white', horizontalalignment='center')
axes[0].axvline(170.5, color='black')
axes[0].text(170.5, 8,"Median",rotation=90,
backgroundcolor='white', horizontalalignment='center')
axes[0].axvline(171.19, color='black')
axes[0].text(171.19, 8,"Mean",rotation=90,
backgroundcolor='white', horizontalalignment='center')
axes[0].axvline(std, color='black')
axes[0].text(std, 8,"Standard deviation",rotation=90,
backgroundcolor='white', horizontalalignment='center')
axes[0].axvline(variance, color='black')
axes[0].text(variance, 8,"Variance",rotation=90,
backgroundcolor='white', horizontalalignment='center')
Exercise 16
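Using everything you’ve learned this week, can you now determine what information
the boxplot shows you? The box plot shows the median (orange line), the lower and upper
quartiles (the two ends of the box) and the whiskers (lines), which by default in matplotlib
reach to the furthest data points within 1.5 times the interquartile range of the box.
Any points beyond the whiskers are drawn individually as outliers.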
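The quantities a box plot draws can also be computed directly. The sketch below uses the standard library instead of matplotlib; `box_plot_summary` is a hypothetical helper, the data is made up, and the whisker rule shown (1.5 times the interquartile range) is matplotlib's default, assumed here for illustration.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Return the quartiles, whisker ends and outliers a box plot would show."""
    # method="inclusive" interpolates linearly between the sorted data points.
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    return {
        "q1": q1,
        "median": q2,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": sorted(v for v in values if v < lower_fence or v > upper_fence),
    }


sample = [1, 2, 3, 4, 5, 6, 7, 8, 30]  # made-up data with one extreme value
print(box_plot_summary(sample))
```

In this made-up sample the value 30 lies beyond the upper fence, so the upper whisker stops at 8 and 30 would be drawn as a separate point.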