
INTERMEDIATE PYTHON

Line plot (1)

• print() the last item from both the year and the pop list to see what the
predicted population for the year 2100 is. Use two print() functions.
• Before you can start, you should import matplotlib.pyplot as plt. pyplot is a
sub-package of matplotlib, hence the dot.
• Use plt.plot() to build a line plot. year should be mapped on the horizontal
axis, pop on the vertical axis. Don’t forget to finish off with
the plt.show() function to actually display the plot.

# edited/added
import numpy as np
year=list(range(1950,2100+1))
pop=list(np.loadtxt('pop1.txt', dtype=float))
# Print the last item from year and pop
print(year[-1])
print(pop[-1])
# Import matplotlib.pyplot as plt
import matplotlib.pyplot as plt
# Make a line plot: year on the x-axis, pop on the y-axis
plt.plot(year, pop)
# Display the plot with plt.show()
plt.show()

Line Plot (2): Interpretation
Have another look at the plot you created in the previous exercise; it’s shown
on the right. Based on the plot, in approximately what year will there be
more than ten billion human beings on this planet?

2040
2060
2085
2095

Line plot (3)

• Print the last item from both the list gdp_cap, and the list life_exp; it
is information about Zimbabwe.
• Build a line chart, with gdp_cap on the x-axis, and life_exp on the y-axis.
Does it make sense to plot this data on a line plot?
• Don’t forget to finish off with a plt.show() command, to actually
display the plot.

# edited/added
gdp_cap=list(np.loadtxt('gdp_cap.txt', dtype=float))
life_exp=list(np.loadtxt('life_exp.txt', dtype=float))
# Print the last item of gdp_cap and life_exp
print(gdp_cap[-1])
print(life_exp[-1])
# Make a line plot, gdp_cap on the x-axis, life_exp on the y-axis
plt.plot(gdp_cap, life_exp)
# Display the plot
plt.show()

Scatter Plot (1)

• Change the line plot that’s coded in the script to a scatter plot.
• A correlation will become clear when you display the GDP per capita
on a logarithmic scale. Add the line plt.xscale('log').
• Finish off your script with plt.show() to display the plot.

# Change the line plot below to a scatter plot
plt.scatter(gdp_cap, life_exp)
# Put the x-axis on a logarithmic scale
plt.xscale('log')
# Show plot
plt.show()

Scatter plot (2)

• Start from scratch: import matplotlib.pyplot as plt.
• Build a scatter plot, where pop is mapped on the horizontal axis,
and life_exp is mapped on the vertical axis.
• Finish the script with plt.show() to actually display the plot. Do you
see a correlation?

# edited/added
pop=list(np.loadtxt('pop2.txt', dtype=float))
# Import package
import matplotlib.pyplot as plt
# Build Scatter plot
plt.scatter(pop, life_exp)
# Show plot
plt.show()

Build a histogram (1)

• Use plt.hist() to create a histogram of the values in life_exp. Do not
specify the number of bins; Python will set the number of bins to 10
by default for you.
• Add plt.show() to actually display the histogram. Can you tell which
bin contains the most observations?

# Create histogram of life_exp data
plt.hist(life_exp)
# Display histogram
plt.show()

Build a histogram (2): bins

• Build a histogram of life_exp, with 5 bins. Can you tell which bin
contains the most observations?
• Build another histogram of life_exp, this time with 20 bins. Is this
better?

# Build histogram with 5 bins
plt.hist(life_exp, bins = 5)
# Show and clear plot
plt.show()
plt.clf()
# Build histogram with 20 bins
plt.hist(life_exp, bins = 20)
# Show and clear plot again
plt.show()
plt.clf()

Build a histogram (3): compare

• Build a histogram of life_exp with 15 bins.
• Build a histogram of life_exp1950, also with 15 bins. Is there a big
difference with the histogram for the 2007 data?

# edited/added
life_exp1950=list(np.loadtxt('life_exp1950.txt', dtype=float))
# Histogram of life_exp, 15 bins
plt.hist(life_exp, bins = 15)
# Show and clear plot
plt.show()
plt.clf()
# Histogram of life_exp1950, 15 bins
plt.hist(life_exp1950, bins = 15)
# Show and clear plot again
plt.show()
plt.clf()

Choose the right plot (1)
You’re a professor teaching Data Science with Python, and you want to
visually assess if the grades on your exam follow a particular distribution.
Which plot do you use?

Line plot
Scatter plot
Histogram

Choose the right plot (2)
You’re a professor in Data Analytics with Python, and you want to visually
assess if longer answers on exam questions lead to higher grades. Which plot
do you use?

Line plot
Scatter plot
Histogram

Labels

• The strings xlab and ylab are already set for you. Use these variables
to set the label of the x- and y-axis.
• The string title is also coded for you. Use it to add a title to the plot.
• After these customizations, finish the script with plt.show() to
actually display the plot.

# Basic scatter plot, log scale
plt.scatter(gdp_cap, life_exp)
plt.xscale('log')
# Strings
xlab = 'GDP per Capita [in USD]'
ylab = 'Life Expectancy [in years]'
title = 'World Development in 2007'
# Add axis labels
plt.xlabel(xlab)
plt.ylabel(ylab)
# Add title
plt.title(title)
# After customizing, display the plot
plt.show()
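
The histogram exercises earlier in this part ask which bin contains the most observations. That can also be checked numerically. The sketch below is an illustration only, not part of the course solutions: it uses np.histogram(), which returns the bin counts and the bin edges, and the values in sample_life_exp are made up as a stand-in for the life_exp list. The helper name fullest_bin is hypothetical.

```python
import numpy as np

# Hypothetical stand-in for life_exp; these values are made up for illustration
sample_life_exp = [43.8, 76.4, 72.3, 42.7, 75.3, 81.2, 79.8, 75.6, 64.1, 79.4,
                   56.7, 65.6, 74.9, 50.7, 72.4, 73.0, 52.3, 49.6, 59.7, 50.4]

def fullest_bin(values, bins):
    """Return (lower_edge, upper_edge, count) of the bin holding the most values."""
    counts, edges = np.histogram(values, bins=bins)
    i = int(np.argmax(counts))  # index of the first bin with the highest count
    return float(edges[i]), float(edges[i + 1]), int(counts[i])

# Compare a coarse and a fine binning, as in the 5-bin and 20-bin exercise
for n_bins in (5, 20):
    low, high, count = fullest_bin(sample_life_exp, n_bins)
    print(f"{n_bins} bins: fullest bin runs from {low:.1f} to {high:.1f} ({count} values)")
```

With few bins the fullest bin is wide and says little about where the values cluster; with many bins it is narrow but holds fewer observations, which is the trade-off the 5-bin and 20-bin histograms show visually.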

Ticks

• Use tick_val and tick_lab as inputs to the xticks() function to make
the plot more readable.
• As usual, display the plot with plt.show() after you’ve added the
customizations.

# Scatter plot
plt.scatter(gdp_cap, life_exp)
# Previous customizations
plt.xscale('log')
plt.xlabel('GDP per Capita [in USD]')
plt.ylabel('Life Expectancy [in years]')
plt.title('World Development in 2007')
# Definition of tick_val and tick_lab
tick_val = [1000, 10000, 100000]
tick_lab = ['1k', '10k', '100k']
# Adapt the ticks on the x-axis
plt.xticks(tick_val, tick_lab)
# After customizing, display the plot
plt.show()

Sizes

• Run the script to see how the plot changes.
• Looks good, but increasing the size of the bubbles will make things
stand out more.
o Import the numpy package as np.
o Use np.array() to create a numpy array from the list pop. Call
this NumPy array np_pop.
o Double the values in np_pop by setting the value of np_pop equal
to np_pop * 2. Because np_pop is a NumPy array, each array
element will be doubled.
o Change the s argument inside plt.scatter() to
be np_pop instead of pop.

# Import numpy as np
import numpy as np
# Store pop as a numpy array: np_pop
np_pop = np.array(pop)
# Double np_pop
np_pop = np_pop * 2
# Update: set s argument to np_pop
plt.scatter(gdp_cap, life_exp, s = np_pop)
# Previous customizations
plt.xscale('log')
plt.xlabel('GDP per Capita [in USD]')
plt.ylabel('Life Expectancy [in years]')
plt.title('World Development in 2007')
plt.xticks([1000, 10000, 100000],['1k', '10k', '100k'])
# Display the plot
plt.show()

Colors

• Add c = col to the arguments of the plt.scatter() function.
• Change the opacity of the bubbles by setting the alpha argument
to 0.8 inside plt.scatter(). Alpha can be set from zero to one, where
zero is totally transparent, and one is not at all transparent.

# edited/added
col=list(np.loadtxt('col.txt', dtype=str))
# Specify c and alpha inside plt.scatter()
plt.scatter(x = gdp_cap, y = life_exp, s = np.array(pop) * 2, c = col, alpha = 0.8)
# Previous customizations
plt.xscale('log')
plt.xlabel('GDP per Capita [in USD]')
plt.ylabel('Life Expectancy [in years]')
plt.title('World Development in 2007')
plt.xticks([1000,10000,100000], ['1k','10k','100k'])
# Show the plot
plt.show()

Additional Customizations

• Add plt.grid(True) after the plt.text() calls so that gridlines are drawn
on the plot.

# Scatter plot
plt.scatter(x = gdp_cap, y = life_exp, s = np.array(pop) * 2, c = col, alpha = 0.8)
# Previous customizations
plt.xscale('log')
plt.xlabel('GDP per Capita [in USD]')
plt.ylabel('Life Expectancy [in years]')
plt.title('World Development in 2007')
plt.xticks([1000,10000,100000], ['1k','10k','100k'])
# Additional customizations
plt.text(1550, 71, 'India')
plt.text(5700, 80, 'China')
# Add grid() call
plt.grid(True)
# Show the plot
plt.show()

Interpretation
If you have a look at your colorful plot, it’s clear that people live longer in
countries with a higher GDP per capita. No high income countries have really
short life expectancy, and no low income countries have very long life
expectancy. Still, there is a huge difference in life expectancy between
countries on the same income level. Most people live in middle income
countries, where the difference in lifespan between countries is huge,
depending on how income is distributed and how it is used.
What can you say about the plot?

The countries in blue, corresponding to Africa, have both low life expectancy
and a low GDP per capita.
There is a negative correlation between GDP per capita and life expectancy.
China has both a lower GDP per capita and lower life expectancy compared to
India.

Motivation for dictionaries

• Use the index() method on countries to find the index of 'germany'.
Store this index as ind_ger.
• Use ind_ger to access the capital of Germany from the capitals list.
Print it out.

# Definition of countries and capital
countries = ['spain', 'france', 'germany', 'norway']
capitals = ['madrid', 'paris', 'berlin', 'oslo']
# Get index of 'germany': ind_ger
ind_ger = countries.index('germany')
# Use ind_ger to print out capital of Germany
print(capitals[ind_ger])

Create dictionary

• With the strings in countries and capitals, create a dictionary
called europe with 4 key:value pairs. Beware of capitalization! Make
sure you use lowercase characters everywhere.
• Print out europe to see if the result is what you expected.

# Definition of countries and capital
countries = ['spain', 'france', 'germany', 'norway']
capitals = ['madrid', 'paris', 'berlin', 'oslo']
# From string in countries and capitals, create dictionary europe
europe = {'spain':'madrid', 'france':'paris', 'germany':'berlin', 'norway':'oslo'}
# Print europe
print(europe)

Access dictionary

• Check out which keys are in europe by calling the keys() method
on europe. Print out the result.
• Print out the value that belongs to the key 'norway'.

# Definition of dictionary
europe = {'spain':'madrid', 'france':'paris', 'germany':'berlin', 'norway':'oslo' }
# Print out the keys in europe
print(europe.keys())
# Print out value that belongs to key 'norway'
print(europe['norway'])

Dictionary Manipulation (1)

• Add the key 'italy' with the value 'rome' to europe.
• To assert that 'italy' is now a key in europe, print out 'italy' in europe.
• Add another key:value pair to europe: 'poland' is the key, 'warsaw' is
the corresponding value.
• Print out europe.

# Definition of dictionary
europe = {'spain':'madrid', 'france':'paris', 'germany':'berlin', 'norway':'oslo' }
# Add italy to europe
europe['italy'] = 'rome'
# Print out italy in europe
print('italy' in europe)
# Add poland to europe
europe['poland'] = 'warsaw'
# Print europe
print(europe)

Dictionary Manipulation (2)

• The capital of Germany is not 'bonn'; it’s 'berlin'. Update its value.
• Australia is not in Europe, Austria is! Remove the
key 'australia' from europe.
• Print out europe to see if your cleaning work paid off.

# Definition of dictionary
europe = {'spain':'madrid', 'france':'paris', 'germany':'bonn',
          'norway':'oslo', 'italy':'rome', 'poland':'warsaw',
          'australia':'vienna' }
# Update capital of germany
europe['germany'] = 'berlin'
# Remove australia
del(europe['australia'])
# Print europe
print(europe)

Dictionariception

• Use chained square brackets to select and print out the capital of
France.
• Create a dictionary, named data, with the
keys 'capital' and 'population'. Set them to 'rome' and 59.83,
respectively.
• Add a new key-value pair to europe; the key is 'italy' and the value
is data, the dictionary you just built.

# Dictionary of dictionaries
europe = { 'spain': { 'capital':'madrid', 'population':46.77 },
           'france': { 'capital':'paris', 'population':66.03 },
           'germany': { 'capital':'berlin', 'population':80.62 },
           'norway': { 'capital':'oslo', 'population':5.084 } }
# Print out the capital of France
print(europe['france']['capital'])
# Create sub-dictionary data
data = { 'capital':'rome', 'population':59.83 }
# Add data to europe under key 'italy'
europe['italy'] = data
# Print europe
print(europe)

Dictionary to DataFrame (1)

• Import pandas as pd.
• Use the pre-defined lists to create a dictionary called my_dict. There
should be three key value pairs:
o key 'country' and value names.
o key 'drives_right' and value dr.
o key 'cars_per_cap' and value cpc.
• Use pd.DataFrame() to turn your dict into a DataFrame called cars.
• Print out cars and see how beautiful it is.

# Pre-defined lists
names = ['United States', 'Australia', 'Japan', 'India', 'Russia', 'Morocco', 'Egypt']
dr = [True, False, False, False, True, True, True]
cpc = [809, 731, 588, 18, 200, 70, 45]
# Import pandas as pd
import pandas as pd
# Create dictionary my_dict with three key:value pairs: my_dict
my_dict = { 'country':names, 'drives_right':dr, 'cars_per_cap':cpc }
# Build a DataFrame cars from my_dict: cars
cars = pd.DataFrame(my_dict)
# Print cars
print(cars)

Dictionary to DataFrame (2)

• Hit Run Code to see that, indeed, the row labels are not correctly set.
• Specify the row labels by setting cars.index equal to row_labels.
• Print out cars again and check if the row labels are correct this time.

import pandas as pd
# Build cars DataFrame
names = ['United States', 'Australia', 'Japan', 'India', 'Russia', 'Morocco', 'Egypt']
dr = [True, False, False, False, True, True, True]
cpc = [809, 731, 588, 18, 200, 70, 45]
cars_dict = { 'country':names, 'drives_right':dr, 'cars_per_cap':cpc }
cars = pd.DataFrame(cars_dict)
print(cars)
# Definition of row_labels
row_labels = ['US', 'AUS', 'JPN', 'IN', 'RU', 'MOR', 'EG']
# Specify row labels of cars
cars.index = row_labels
# Print cars again
print(cars)

CSV to DataFrame (1)

• To import CSV files you still need the pandas package: import it
as pd.
• Use pd.read_csv() to import cars.csv data as a DataFrame. Store this
DataFrame as cars.
• Print out cars. Does everything look OK?

# Import pandas as pd
import pandas as pd
# Import the cars.csv data: cars
cars = pd.read_csv('cars.csv')
# Print out cars
print(cars)

CSV to DataFrame (2)

• Run the code with Run Code and assert that the first column should
actually be used as row labels.
• Specify the index_col argument inside pd.read_csv(): set it to 0, so
that the first column is used as row labels.
• Has the printout of cars improved now?

# Import pandas as pd
import pandas as pd
# Fix import by including index_col
cars = pd.read_csv('cars.csv', index_col = 0)
# Print out cars
print(cars)

Square Brackets (1)

• Use single square brackets to print out the country column of cars as a
Pandas Series.
• Use double square brackets to print out the country column of cars as
a Pandas DataFrame.
• Use double square brackets to print out a DataFrame with both
the country and drives_right columns of cars, in this order.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)
# Print out country column as Pandas Series
print(cars['country'])
# Print out country column as Pandas DataFrame
print(cars[['country']])
# Print out DataFrame with country and drives_right columns
print(cars[['country', 'drives_right']])

Square Brackets (2)

• Select the first 3 observations from cars and print them out.
• Select the fourth, fifth and sixth observation, corresponding to row
indexes 3, 4 and 5, and print them out.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)
# Print out first 3 observations
print(cars[0:3])
# Print out fourth, fifth and sixth observation
print(cars[3:6])

loc and iloc (1)

• Use loc or iloc to select the observation corresponding to Japan as a
Series. The label of this row is JPN, the index is 2. Make sure to print
the resulting Series.
• Use loc or iloc to select the observations for Australia and Egypt as a
DataFrame. You can find out about the labels/indexes of these rows
by inspecting cars in the IPython Shell. Make sure to print the
resulting DataFrame.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)
# Print out observation for Japan
print(cars.iloc[2])
# Print out observations for Australia and Egypt
print(cars.loc[['AUS', 'EG']])

loc and iloc (2)

• Print out the drives_right value of the row corresponding to Morocco
(its row label is MOR)
• Print out a sub-DataFrame, containing the observations for Russia and
Morocco and the columns country and drives_right.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)
# Print out drives_right value of Morocco
print(cars.iloc[5, 2])
# Print sub-DataFrame
print(cars.loc[['RU', 'MOR'], ['country', 'drives_right']])

loc and iloc (3)

• Print out the drives_right column as a Series using loc or iloc.
• Print out the drives_right column as a DataFrame using loc or iloc.
• Print out both the cars_per_cap and drives_right column as a
DataFrame using loc or iloc.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)
# Print out drives_right column as Series
print(cars.iloc[:, 2])
# Print out drives_right column as DataFrame
print(cars.iloc[:, [2]])
# Print out cars_per_cap and drives_right as DataFrame
print(cars.loc[:, ['cars_per_cap', 'drives_right']])

Equality

• In the editor on the right, write code to see if True equals False.
• Write Python code to check if -5 * 15 is not equal to 75.
• Ask Python whether the strings "pyscript" and "PyScript" are equal.
• What happens if you compare booleans and integers? Write code to
see if True and 1 are equal.

# Comparison of booleans
print(True == False)
# Comparison of integers
print(-5 * 15 != 75)
# Comparison of strings
print("pyscript" == "PyScript")
# Compare a boolean with a numeric
print(True == 1)

Greater and less than

• Write Python expressions, wrapped in a print() function, to check
whether:
o x is greater than or equal to -10. x has already been defined for
you.
o "test" is less than or equal to y. y has already been defined for
you.
o True is greater than False.

# Comparison of integers
x = -3 * 6
print(x >= -10)
# Comparison of strings
y = "test"
print("test" <= y)
# Comparison of booleans
print(True > False)

Compare arrays

• Which areas in my_house are greater than or equal to 18?
• You can also compare two NumPy arrays element-wise. Which areas
in my_house are smaller than the ones in your_house?
• Make sure to wrap both commands in a print() statement so that you
can inspect the output!

# Create arrays
import numpy as np
my_house = np.array([18.0, 20.0, 10.75, 9.50])
your_house = np.array([14.0, 24.0, 14.25, 9.0])
# my_house greater than or equal to 18
print(my_house >= 18)
# my_house less than your_house
print(my_house < your_house)
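
The Equality and Greater and less than exercises rely on two Python rules that are easy to miss. The sketch below is an illustration only, not part of the course solutions: it shows that bool is a subclass of int, so True behaves like 1 in comparisons, and that strings are compared character by character using Unicode code points, which makes string comparison case-sensitive.

```python
# Booleans are integers under the hood: bool is a subclass of int
print(isinstance(True, int))      # True
print(True == 1, False == 0)      # True True
print(True > False)               # True, because 1 > 0

# Strings compare character by character using Unicode code points,
# so an uppercase letter sorts before its lowercase counterpart
print(ord("P"), ord("p"))         # 80 112
print("pyscript" == "PyScript")   # False: 'p' and 'P' are different characters
print("PyScript" < "pyscript")    # True: 80 < 112

# <= is also satisfied when both strings are equal
y = "test"
print("test" <= y)                # True
```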
Boolean Operators

and, or, not (1)

• Write Python expressions, wrapped in a print() function, to check
whether:
o my_kitchen is bigger than 10 and smaller than 18.
o my_kitchen is smaller than 14 or bigger than 17.
o double the area of my_kitchen is smaller than triple the area
of your_kitchen.

# Define variables
my_kitchen = 18.0
your_kitchen = 14.0
# my_kitchen bigger than 10 and smaller than 18?
print(my_kitchen > 10 and my_kitchen < 18)
# my_kitchen smaller than 14 or bigger than 17?
print(my_kitchen < 14 or my_kitchen > 17)
# Double my_kitchen smaller than triple your_kitchen?
print(my_kitchen * 2 < your_kitchen * 3)

and, or, not (2)

x=8
y=9
not(not(x < 3) and not(y > 14 or y > 10))

What will the result be if you execute these three commands in the IPython
Shell?

NB: Notice that not has a higher priority than and and or, it is executed first.

True
False
Running these commands will result in an error.

Boolean operators with NumPy

• Generate boolean arrays that answer the following questions:
• Which areas in my_house are greater than 18.5 or smaller than 10?
• Which areas are smaller than 11 in both my_house and your_house?
Make sure to wrap both commands in a print() statement, so that you
can inspect the output.

# Create arrays
import numpy as np
my_house = np.array([18.0, 20.0, 10.75, 9.50])
your_house = np.array([14.0, 24.0, 14.25, 9.0])
# my_house greater than 18.5 or smaller than 10
print(np.logical_or(my_house > 18.5, my_house < 10))
# Both my_house and your_house smaller than 11
print(np.logical_and(my_house < 11, your_house < 11))

if, elif, else

Warmup
To experiment with if and else a bit, have a look at this code sample:

area = 10.0
if(area < 9) :
    print("small")
elif(area < 12) :
    print("medium")
else :
    print("large")

What will the output be if you run this piece of code in the IPython Shell?

small
medium
large
The syntax is incorrect; this code will produce an error.

if

• Examine the if statement that prints out "looking around in the
kitchen." if room equals "kit".
• Write another if statement that prints out "big place!" if area is greater
than 15.

# Define variables
room = "kit"
area = 14.0
# if statement for room
if room == "kit" :
    print("looking around in the kitchen.")
# if statement for area
if area > 15 :
    print("big place!")

Add else

# Define variables
room = "kit"
area = 14.0
# if-else construct for room
if room == "kit" :
    print("looking around in the kitchen.")
else :
    print("looking around elsewhere.")
# if-else construct for area
if area > 15 :
    print("big place!")
else :
    print("pretty small.")

Customize further: elif

# Define variables
room = "bed"
area = 14.0
# if-elif-else construct for room
if room == "kit" :
    print("looking around in the kitchen.")
elif room == "bed":
    print("looking around in the bedroom.")
else :
    print("looking around elsewhere.")
# if-elif-else construct for area
if area > 15 :
    print("big place!")
elif area > 10 :
    print("medium size, nice!")
else :
    print("pretty small.")

Driving right (1)

• Extract the drives_right column as a Pandas Series and store it as dr.
• Use dr, a boolean Series, to subset the cars DataFrame. Store the
resulting selection in sel.
• Print sel, and assert that drives_right is True for all observations.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)
# Extract drives_right column as Series: dr
dr = cars['drives_right']
# Use dr to subset cars: sel
sel = cars[dr]
# Print sel
print(sel)

Driving right (2)

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)
# Convert code to a one-liner
sel = cars[cars['drives_right']]
# Print sel
print(sel)

Cars per capita (1)

• Select the cars_per_cap column from cars as a Pandas Series and store
it as cpc.
• Use cpc in combination with a comparison operator and 500. You
want to end up with a boolean Series that’s True if the corresponding
country has a cars_per_cap of more than 500 and False otherwise.
Store this boolean Series as many_cars.
• Use many_cars to subset cars, similar to what you did before. Store
the result as car_maniac.
• Print out car_maniac to see if you got it right.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)
# Create car_maniac: observations that have a cars_per_cap over 500
cpc = cars['cars_per_cap']
many_cars = cpc > 500
car_maniac = cars[many_cars]
# Print car_maniac
print(car_maniac)

Cars per capita (2)

• Use the code sample provided to create a DataFrame medium, that
includes all the observations of cars that have
a cars_per_cap between 100 and 500.
• Print out medium.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)
# Import numpy, you'll need this
import numpy as np
# Create medium: observations with cars_per_cap between 100 and 500
cpc = cars['cars_per_cap']
between = np.logical_and(cpc > 100, cpc < 500)
medium = cars[between]
# Print medium
print(medium)

while: warming up
Can you tell how many printouts the following while loop will do?

x=1
while x < 4 :
    print(x)
    x=x+1

0
1
2
3
4

Basic while loop

• Create the variable offset with an initial value of 8.
• Code a while loop that keeps running as long as offset is not equal
to 0. Inside the while loop:
o Print out the sentence "correcting...".
o Next, decrease the value of offset by 1. You can do this
with offset = offset - 1.
o Finally, still within your loop, print out offset so you can see
how it changes.

# Initialize offset
offset = 8
# Code the while loop
while offset != 0 :
    print("correcting...")
    offset = offset - 1
    print(offset)

Add conditionals

• Inside the while loop, complete the if-else statement:
o If offset is greater than zero, you should decrease offset by 1.
o Else, you should increase offset by 1.
• If you’ve coded things correctly, hitting Submit Answer should work
this time.

# Initialize offset
offset = -6
# Code the while loop
while offset != 0 :
    print("correcting...")
    if offset > 0 :
        offset = offset - 1
    else :
        offset = offset + 1
    print(offset)

Loop over a list
Write a for loop that iterates over all elements of the areas list and prints out
every element separately.

# areas list
areas = [11.25, 18.0, 20.0, 10.75, 9.50]
# Code the for loop
for area in areas :
    print(area)

Indexes and values (1)

• Adapt the for loop in the sample code to use enumerate() and use two
iterator variables.
• Update the print() statement so that on each run, a line of the
form "room x: y" should be printed, where x is the index of the list
element and y is the actual list element, i.e. the area. Make sure to
print out this exact string, with the correct spacing.

# areas list
areas = [11.25, 18.0, 20.0, 10.75, 9.50]
# Change for loop to use enumerate() and update print()
for index, area in enumerate(areas) :
    print("room " + str(index) + ": " + str(area))

Indexes and values (2)
For non-programmer folks, room 0: 11.25 is strange. Wouldn’t it be better if
the count started at 1?
Adapt the print() function in the for loop so that the first printout
becomes "room 1: 11.25", the second one "room 2: 18.0" and so on.

# areas list
areas = [11.25, 18.0, 20.0, 10.75, 9.50]
# Adapt the printout
for index, area in enumerate(areas) :
    print("room " + str(index + 1) + ": " + str(area))

Loop over list of lists
Write a for loop that goes through each sublist of house and prints out the x is
y sqm, where x is the name of the room and y is the area of the room.

# house list of lists
house = [["hallway", 11.25],
         ["kitchen", 18.0],
         ["living room", 20.0],
         ["bedroom", 10.75],
         ["bathroom", 9.50]]
# Build a for loop from scratch
for x in house :
    print("the " + x[0] + " is " + str(x[1]) + " sqm")

Loop over dictionary

# Definition of dictionary
europe = {'spain':'madrid', 'france':'paris', 'germany':'berlin',
          'norway':'oslo', 'italy':'rome', 'poland':'warsaw', 'austria':'vienna' }
# Iterate over europe
for key, value in europe.items() :
    print("the capital of " + str(key) + " is " + str(value))

Loop over NumPy array

• Import the numpy package under the local alias np.
• Write a for loop that iterates over all elements in np_height and prints
out "x inches" for each element, where x is the value in the array.
• Write a for loop that visits every element of the np_baseball array and
prints it out.

# edited/added
import numpy as np
import pandas as pd
mlb = pd.read_csv('baseball.csv')
np_height = np.array(mlb['Height'])
np_weight = np.array(mlb['Weight'])
# Import numpy as np
import numpy as np
baseball = [[180, 78.4],
            [215, 102.7],
            [210, 98.5],
            [188, 75.2]]
np_baseball = np.array(baseball)
# For loop over np_height
for x in np_height[:5]: # edited/added
    print(str(x) + " inches")
# For loop over np_baseball
for x in np.nditer(np_baseball) :
    print(x)

Loop over DataFrame (1)
Write a for loop that iterates over the rows of cars and on each iteration
perform two print() calls: one to print out the row label and one to print out
all of the row’s contents.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)
# Iterate over rows of cars
for lab, row in cars.iterrows() :
    print(lab)
    print(row)

Loop over DataFrame (2)

• Using the iterators lab and row, adapt the code in the for loop such
that the first iteration prints out "US: 809", the second iteration "AUS:
731", and so on.
• The output should be in the form "country: cars_per_cap". Make sure
to print out this exact string (with the correct spacing).
o You can use str() to convert your integer data to a string so
that you can print it in conjunction with the country label.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)
# Adapt for loop
for lab, row in cars.iterrows() :
    print(lab + ": " + str(row['cars_per_cap']))

Add column (1)

• Use a for loop to add a new column, named COUNTRY, that contains
an uppercase version of the country names in the "country" column.
You can use the string method upper() for this.
• To see if your code worked, print out cars. Don’t indent this code, so
that it’s not part of the for loop.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)
# Code for loop that adds COUNTRY column
for lab, row in cars.iterrows() :
    cars.loc[lab, "COUNTRY"] = row["country"].upper()
# Print cars
print(cars)

Add column (2)

• Replace the for loop with a one-liner that uses .apply(str.upper). The
call should give the same result: a column COUNTRY should be
added to cars, containing an uppercase version of the country names.
• As usual, print out cars to see the fruits of your hard labor.

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)
# Use .apply(str.upper)
cars["COUNTRY"] = cars["country"].apply(str.upper)
# Print cars
print(cars)

Random float

• seed(): sets the random seed, so that your results are reproducible
between simulations. As an argument, it takes an integer of your
choosing. If you call the function, no output will be generated.
• rand(): if you don’t specify any arguments, it generates a random float
between zero and one.

• Import numpy as np.
• Use seed() to set the seed; as an argument, pass 123.
• Generate your first random float with rand() and print it out.

# Import numpy as np
import numpy as np
# Set the seed
np.random.seed(123)
# Generate and print random float
print(np.random.rand())

Roll the dice

• Use randint() with the appropriate arguments to randomly generate
the integer 1, 2, 3, 4, 5 or 6. This simulates a dice. Print it out.
• Repeat the outcome to see if the second throw is different. Again,
print out the result.

# Import numpy and set seed
import numpy as np
np.random.seed(123)
# Use randint() to simulate a dice
print(np.random.randint(1,7))
# Use randint() again
print(np.random.randint(1,7))

Determine your next move

• Roll the dice. Use randint() to create the variable dice.
• Finish the if-elif-else construct by replacing ___:
o If dice is 1 or 2, you go one step down.
o If dice is 3, 4 or 5, you go one step up.
o Else, you throw the dice again. The number of eyes is the number of
steps you go up.
• Print out dice and step. Given the value of dice, was step updated
correctly?

# NumPy is imported, seed is set
# Starting step
step = 50
# Roll the dice
dice = np.random.randint(1,7)
# Finish the control construct
if dice <= 2 :
    step = step - 1
elif dice <= 5 :
    step = step + 1
else :
    step = step + np.random.randint(1,7)
# Print out dice and step
print(dice)
print(step)

Random Walk

The next step
• Make a list random_walk that contains the first step, which is the
integer 0.
• Finish the for loop:
   ◦ The loop should run 100 times.
   ◦ On each iteration, set step equal to the last element in
   the random_walk list. You can use the index -1 for this.
   ◦ Next, let the if-elif-else construct update step for you.
   ◦ The code that appends step to random_walk is already coded.
• Print out random_walk.

# NumPy is imported, seed is set
# Initialize random_walk
random_walk = [0]
# Complete the for loop: run it 100 times
for x in range(100) :
    # Set step: last element in random_walk
    step = random_walk[-1]
    # Roll the dice
    dice = np.random.randint(1,7)
    # Determine next step
    if dice <= 2:
        step = step - 1
    elif dice <= 5:
        step = step + 1
    else:
        step = step + np.random.randint(1,7)
    # Append step to random_walk
    random_walk.append(step)
# Print random_walk
print(random_walk)

How low can you go?

• Use max() in a similar way to make sure that step doesn’t go below
zero if dice <= 2.
• Run the code and check the contents of random_walk.

# NumPy is imported, seed is set
# Initialize random_walk
random_walk = [0]
for x in range(100) :
    step = random_walk[-1]
    dice = np.random.randint(1,7)
    if dice <= 2:
        # Use max to make sure step can't go below 0
        step = max(0, step - 1)
    elif dice <= 5:
        step = step + 1
    else:
        step = step + np.random.randint(1,7)
    random_walk.append(step)
print(random_walk)
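The clamped walk can also be wrapped in a small function, which makes it easier to reuse in later exercises. The following is only a sketch: the name simulate_walk and its n_throws parameter are hypothetical and do not appear in the course material. The dice rules and the max(0, ...) clamp are the same as in the code above.

```python
import numpy as np

def simulate_walk(n_throws=100):
    """Return the positions of one random walk (hypothetical helper)."""
    walk = [0]  # the walk starts at step 0
    for _ in range(n_throws):
        step = walk[-1]
        dice = np.random.randint(1, 7)  # integer from 1 to 6 inclusive
        if dice <= 2:
            # max() keeps the position from dropping below 0
            step = max(0, step - 1)
        elif dice <= 5:
            step = step + 1
        else:
            # throw again and climb by the number of eyes
            step = step + np.random.randint(1, 7)
        walk.append(step)
    return walk

np.random.seed(123)
walk = simulate_walk()
# 101 positions (start + 100 throws); the minimum is 0 because the walk starts there
print(len(walk), min(walk))
```

Because the starting position is included, the returned list has one more element than the number of throws.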


Visualize the walk

• Import matplotlib.pyplot as plt.
• Use plt.plot() to plot random_walk.
• Finish off with plt.show() to actually display the plot.

# NumPy is imported, seed is set
# Initialization
random_walk = [0]
for x in range(100) :
    step = random_walk[-1]
    dice = np.random.randint(1,7)
    if dice <= 2:
        step = max(0, step - 1)
    elif dice <= 5:
        step = step + 1
    else:
        step = step + np.random.randint(1,7)
    random_walk.append(step)
# Import matplotlib.pyplot as plt
import matplotlib.pyplot as plt
# Plot random_walk
plt.plot(random_walk)
# Show the plot
plt.show()

Distribution

Simulate multiple walks

• Fill in the specification of the for loop so that the random walk is
simulated 10 times.
• After the random_walk array is entirely populated, append the array
to the all_walks list.
• Finally, after the top-level for loop, print out all_walks.

# NumPy is imported; seed is set
# Initialize all_walks (don't change this line)
all_walks = []
# Simulate random walk 10 times
for i in range(10) :
    # Code from before
    random_walk = [0]
    for x in range(100) :
        step = random_walk[-1]
        dice = np.random.randint(1,7)
        if dice <= 2:
            step = max(0, step - 1)
        elif dice <= 5:
            step = step + 1
        else:
            step = step + np.random.randint(1,7)
        random_walk.append(step)
    # Append random_walk to all_walks
    all_walks.append(random_walk)
# Print all_walks
print(all_walks)

Visualize all walks

• Use np.array() to convert all_walks to a NumPy array, np_aw.
• Try to use plt.plot() on np_aw. Also include plt.show(). Does it work
out of the box?
• Transpose np_aw by calling np.transpose() on np_aw. Call the
result np_aw_t. Now every row in np_aw_t represents the
position after 1 throw for the 10 random walks.
• Use plt.plot() to plot np_aw_t; also include a plt.show(). Does it look
better this time?

# numpy and matplotlib imported, seed set.
# initialize and populate all_walks
all_walks = []
for i in range(10) :
    random_walk = [0]
    for x in range(100) :
        step = random_walk[-1]
        dice = np.random.randint(1,7)
        if dice <= 2:
            step = max(0, step - 1)
        elif dice <= 5:
            step = step + 1
        else:
            step = step + np.random.randint(1,7)
        random_walk.append(step)
    all_walks.append(random_walk)
# Convert all_walks to NumPy array: np_aw
np_aw = np.array(all_walks)
# Plot np_aw and show
plt.plot(np_aw)
plt.show()
# Clear the figure
plt.clf()
# Transpose np_aw: np_aw_t
np_aw_t = np.transpose(np_aw)
# Plot np_aw_t and show
plt.plot(np_aw_t)
plt.show()

Implement clumsiness

• Change the range() function so that the simulation is performed 250
times.
• Finish the if condition so that step is set to 0 if a random float is less
than or equal to 0.001. Use np.random.rand().

# numpy and matplotlib imported, seed set
# Simulate random walk 250 times
all_walks = []
for i in range(250) :
    random_walk = [0]
    for x in range(100) :
        step = random_walk[-1]
        dice = np.random.randint(1,7)
        if dice <= 2:
            step = max(0, step - 1)
        elif dice <= 5:
            step = step + 1
        else:
            step = step + np.random.randint(1,7)

        # Implement clumsiness
        if np.random.rand() <= 0.001 :
            step = 0

        random_walk.append(step)
    all_walks.append(random_walk)
# Create and plot np_aw_t
np_aw_t = np.transpose(np.array(all_walks))
plt.plot(np_aw_t)
plt.show()

Plot the distribution

• To make sure we’ve got enough simulations, go crazy. Simulate the
random walk 500 times.
• From np_aw_t, select the last row. This contains the endpoint of all
500 random walks you’ve simulated. Store this NumPy array as ends.
• Use plt.hist() to build a histogram of ends. Don’t forget plt.show() to
display the plot.

# numpy and matplotlib imported, seed set
# Simulate random walk 500 times
all_walks = []
for i in range(500) :
    random_walk = [0]
    for x in range(100) :
        step = random_walk[-1]
        dice = np.random.randint(1,7)
        if dice <= 2:
            step = max(0, step - 1)
        elif dice <= 5:
            step = step + 1
        else:
            step = step + np.random.randint(1,7)
        if np.random.rand() <= 0.001 :
            step = 0
        random_walk.append(step)
    all_walks.append(random_walk)
# Create np_aw_t
np_aw_t = np.transpose(np.array(all_walks))
# Select last row from np_aw_t: ends
ends = np_aw_t[-1,:]
# Plot histogram of ends, display plot
plt.hist(ends)
plt.show()

Calculate the odds
The histogram of the previous exercise was created from a NumPy array ends,
which contains 500 integers. Each integer represents the end point of a random
walk. To calculate the chance that this end point is greater than or equal to 60,
you can count the number of integers in ends that are greater than or equal to
60 and divide that number by 500, the total number of simulations.
Well then, what’s the estimated chance that you’ll reach at least 60 steps high
if you play this Empire State Building game? The ends array is everything
you need; it’s available in your Python session so you can make calculations
in the IPython Shell.

48.8%

76.6%

78.4%

95.9%
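The counting recipe described above (count the end points that are at least 60, then divide by the number of simulations) could be sketched as follows. The ends array is rebuilt here with the same simulation code so that the snippet runs on its own. The names count and chance are illustrative. The exact percentage depends on the seed and on the order of the random calls, so no result is stated here.

```python
import numpy as np

np.random.seed(123)

# Rebuild the 500 simulated walks, including the clumsiness rule
all_walks = []
for i in range(500):
    random_walk = [0]
    for x in range(100):
        step = random_walk[-1]
        dice = np.random.randint(1, 7)
        if dice <= 2:
            step = max(0, step - 1)
        elif dice <= 5:
            step = step + 1
        else:
            step = step + np.random.randint(1, 7)
        if np.random.rand() <= 0.001:
            step = 0
        random_walk.append(step)
    all_walks.append(random_walk)

# Last row of the transposed array: the end point of every walk
ends = np.transpose(np.array(all_walks))[-1, :]

# Count the end points that are at least 60 and divide by the number of walks
count = np.sum(ends >= 60)
chance = count / len(ends)
print(f"{100 * chance:.1f}%")
```

The comparison ends >= 60 yields a boolean array, so np.mean(ends >= 60) gives the same fraction in a single call.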
