Python Data Science Toolbox
def square():
    new_value = 4 ** 2
    return new_value

object1 contains "data + analysis + visualization", object2 contains "1*3", object3 contains 13.
object1 contains "data+analysis+visualization", object2 contains 3, object3 contains "13".
object1 contains "dataanalysisvisualization", object2 contains 3, object3 contains "111".

Complete the function header by adding the appropriate function name, shout.
In the function body, concatenate the string, 'congratulations' with another string, '!!!'. Assign the result to shout_word.
Print the value of shout_word.
Call the shout function.
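A minimal sketch of the shout exercise described above. It follows the listed steps literally, so the function prints its result rather than returning it; only the names shout and shout_word from the instructions are used.

```python
# Define the function shout
def shout():
    """Print a string with three exclamation marks."""
    # Concatenate the strings: shout_word
    shout_word = 'congratulations' + '!!!'

    # Print shout_word
    print(shout_word)

# Call shout
shout()
```

Because shout() only prints, calling it evaluates to None; a version that returns the string would be needed to reuse the value.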
    # Concatenate shout1 with shout2: new_shout
    new_shout = shout1 + shout2

    # Return new_shout
    return new_shout

# Pass 'congratulations' and 'you' to shout: yell
yell = shout('congratulations', 'you')

# Print yell
print(yell)

Modify the function header such that the function name is now shout_all, and it accepts two parameters, word1 and word2, in that order.
Concatenate the string '!!!' to each of word1 and word2 and assign to shout1 and shout2, respectively.
Construct a tuple shout_words, composed of shout1 and shout2.
Call shout_all() with the strings 'congratulations' and 'you' and assign the result to yell1 and yell2 (remember, shout_all() returns 2 variables!).

# Define shout_all with parameters word1 and word2
def shout_all(word1, word2):
    """Return a tuple of strings"""
    # Concatenate word1 with '!!!': shout1
    shout1 = word1 + '!!!'

    # Concatenate word2 with '!!!': shout2
    shout2 = word2 + '!!!'

    # Construct a tuple with shout1 and shout2: shout_words
    shout_words = (shout1, shout2)

    # Return shout_words
    return shout_words

# Pass 'congratulations' and 'you' to shout_all(): yell1, yell2
yell1, yell2 = shout_all('congratulations', 'you')

# Print yell1 and yell2
print(yell1)
print(yell2)

A brief introduction to tuples

Print out the value of nums in the IPython shell. Note the elements in the tuple.
In the IPython shell, try to change the first element of nums to the value 2 by doing an assignment: nums[0] = 2. What happens?
Unpack nums to the variables num1, num2, and num3.
Construct a new tuple, even_nums composed of the same elements in nums, but with the 1st element replaced with the value, 2.

# edited/added
nums = (3, 4, 6)

# Unpack nums into num1, num2, and num3
num1, num2, num3 = nums

Bringing it all together (1)

Import the pandas package with the alias pd.
Import the file 'tweets.csv' using the pandas function read_csv(). Assign the resulting DataFrame to df.
Complete the for loop by iterating over col, the 'lang' column in the DataFrame df.
Complete the bodies of the if-else statements in the for loop: if the key is in the dictionary langs_count, add 1 to the value corresponding to this key in the dictionary, else add the key to langs_count and set the corresponding value to 1. Use the loop variable entry in your code.

# Import pandas
import pandas as pd

# Import Twitter data as DataFrame: df
df = pd.read_csv('tweets.csv')

# Initialize an empty dictionary: langs_count
langs_count = {}

# Extract column from DataFrame: col
col = df['lang']

# Iterate over lang column in DataFrame
for entry in col:

    # If the language is in langs_count, add 1
    if entry in langs_count.keys():
        langs_count[entry] += 1
    # Else add the language to langs_count, set the value to 1
    else:
        langs_count[entry] = 1

# Print the populated dictionary
print(langs_count)

Bringing it all together (2)

Define the function count_entries(), which has two parameters. The first parameter is df for the DataFrame and the second is col_name for the column name.
Complete the bodies of the if-else statements in the for loop: if the key is in the dictionary langs_count, add 1 to its current value, else add the key to langs_count and set its value to 1. Use the loop variable entry in your code.
Return the langs_count dictionary from inside the count_entries() function.
Call the count_entries() function by passing to it tweets_df and the name of the column, 'lang'. Assign the result of the call to the variable result.

# edited/added
tweets_df = pd.read_csv('tweets.csv')

# Define count_entries()
def count_entries(df, col_name):
    """Return a dictionary with counts of

Pop quiz on understanding scope
# Create a string: team
team = "teen titans"

# Define change_team()
def change_team():
    """Change the value of the global variable team."""

    # Use team in global scope
    global team

    # Change the value of team in global: team
    team = "justice league"

# Print team
print(team)

# Call change_team()
change_team()

# Print team
print(team)

Python’s built-in scope

Here you’re going to check out Python’s built-in scope, which is really just a built-in module called builtins. However, to query builtins, you’ll need to import builtins ‘because the name builtins is not itself built in…No, I’m serious!’ (Learning Python, 5th edition, Mark Lutz). After executing import

‘sum’
‘range’
‘array’
‘tuple’

Nested Functions I

Complete the function header of the nested function with the function name inner() and a single parameter word.
Complete the return value: each element of the tuple should be a call to inner(), passing in the parameters from three_shouts() as arguments to each call.

# Define three_shouts
def three_shouts(word1, word2, word3):
    """Returns a tuple of strings
    concatenated with '!!!'."""

    # Define inner
    def inner(word):
        """Returns a string concatenated with '!!!'."""
        return word + '!!!'

    # Return a tuple of strings
    return (inner(word1), inner(word2), inner(word3))

# Call three_shouts() and print
print(three_shouts('a', 'b', 'c'))

Nested Functions II

Complete the function header of the inner function with the function name inner_echo() and a single parameter word1.
Complete the function echo() so that it returns inner_echo.
We have called echo(), passing 2 as an argument, and assigned the resulting function to twice. Your job is to call echo(), passing 3 as an argument. Assign the resulting function to thrice.
Hit Submit to call twice() and thrice() and print the results.

# Call echo: twice
twice = echo(2)

# Call echo: thrice
thrice = echo(3)

# Call twice() and thrice() then print
print(twice('hello'), thrice('hello'))

The keyword nonlocal and nested functions

Assign to echo_word the string word, concatenated with itself.
Use the keyword nonlocal to alter the value of echo_word in the enclosing scope.
Alter echo_word to echo_word concatenated with '!!!'.
Call the function echo_shout(), passing it a single argument 'hello'.
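The two nested-function exercises above show little or none of their function bodies, so here is one possible sketch of both. It assumes that echo(n) returns inner_echo, which repeats its argument n times, and that echo_shout(word) doubles word and then appends '!!!' from an inner function. The inner function name shout() and the choice to return echo_word are assumptions for this sketch, since the text does not specify them.

```python
# Define echo: returns a closure that remembers n
def echo(n):
    """Return the inner_echo function."""

    # Define inner_echo
    def inner_echo(word1):
        """Concatenate n copies of word1."""
        return word1 * n

    # Return inner_echo
    return inner_echo

# Call echo: twice, thrice
twice = echo(2)
thrice = echo(3)
print(twice('hello'), thrice('hello'))


# Define echo_shout
def echo_shout(word):
    """Return word doubled, with '!!!' appended by a nested function."""

    # Concatenate word with itself: echo_word
    echo_word = word + word

    # Define the inner function (name assumed for this sketch)
    def shout():
        # Rebind echo_word in the enclosing scope
        nonlocal echo_word
        echo_word = echo_word + '!!!'

    # Call shout(), then return the altered echo_word
    shout()
    return echo_word

print(echo_shout('hello'))
```

Without the nonlocal statement, the assignment inside shout() would create a new local echo_word, and reading it on the right-hand side would raise an UnboundLocalError.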
def shout_echo(word1, echo=1, intense=False):
    """Concatenate echo copies of word1 and three
    exclamation marks at the end of the string."""

    # Concatenate echo copies of word1 using *: echo_word
    echo_word = word1 * echo

Complete the function header with the function name gibberish. It accepts a single flexible argument *args.
Initialize a variable hodgepodge to an empty string.
Return the variable hodgepodge at the end of the function body.
Call gibberish() with the single string, "luke". Assign the result to one_word.
Hit the Submit button to call gibberish() with multiple arguments and to print the value to the Shell.
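One way the gibberish() exercise above could be completed. The loop that appends each argument to hodgepodge is not spelled out in the instructions and is an assumption here, as are the variable many_words and the extra strings passed in the second call.

```python
# Define gibberish
def gibberish(*args):
    """Concatenate strings in *args together."""

    # Initialize an empty string: hodgepodge
    hodgepodge = ''

    # Concatenate the strings in args (assumed step)
    for word in args:
        hodgepodge += word

    # Return hodgepodge
    return hodgepodge

# Call gibberish() with one string: one_word
one_word = gibberish("luke")

# Call gibberish() with several strings (chosen for illustration): many_words
many_words = gibberish("luke", "leia", "han")

print(one_word)
print(many_words)
```

Inside the function, args is an ordinary tuple, so it can be looped over, indexed, or passed to len() like any other tuple.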
        # Else add the entry to cols_count, set the value to 1
        else:
            cols_count[entry] = 1

    # Return the cols_count dictionary
    return cols_count

# Call count_entries(): result1
result1 = count_entries(tweets_df, col_name='lang')

# Call count_entries(): result2
result2 = count_entries(tweets_df, col_name='source')

# Print result1 and result2
print(result1)
print(result2)

Bringing it all together (2)

Complete the function header by supplying the parameter for the DataFrame df and the flexible argument *args.
Complete the for loop within the function definition so that the loop occurs over the tuple args.
Call count_entries() by passing the tweets_df DataFrame and the column name 'lang'. Assign the result to result1.
Call count_entries() by passing the tweets_df DataFrame and the column names 'lang' and 'source'. Assign the result to result2.

# Define count_entries()
def count_entries(df, *args):
    """Return a dictionary with counts of occurrences as value for each key."""

    # Initialize an empty dictionary: cols_count
    cols_count = {}

    # Iterate over column names in args
    for col_name in args:

        # Extract column from DataFrame: col
        col = df[col_name]

        # Iterate over the column in DataFrame
        for entry in col:

            # If entry is in cols_count, add 1
            if entry in cols_count.keys():
                cols_count[entry] += 1

            # Else add the entry to cols_count, set the value to 1
            else:
                cols_count[entry] = 1

    # Return the cols_count dictionary
    return cols_count

# Call count_entries(): result1
result1 = count_entries(tweets_df, 'lang')

# Call count_entries(): result2
result2 = count_entries(tweets_df, 'lang', 'source')

# Print result1 and result2
print(result1)
print(result2)

    return words

Define the lambda function echo_word using the variables word1 and echo. Replicate what the original function definition for echo_word() does above.
Call echo_word() with the string argument 'hey' and the value 5, in that order. Assign the call to result.

# Define echo_word as a lambda function: echo_word
echo_word = (lambda word1, echo: word1 * echo)

# Call echo_word: result
result = echo_word('hey', 5)

# Print result
print(result)

Pop quiz on lambda functions

How would you write a lambda function add_bangs that adds three exclamation points '!!!' to the end of a string a?
How would you call add_bangs with the argument 'hello'?

The lambda function definition is: add_bangs = (a + '!!!'), and the function call is: add_bangs('hello').
The lambda function definition is: add_bangs = (lambda a: a + '!!!'), and the function call is: add_bangs('hello').

Map() and lambda functions

For example:
nums = [2, 4, 6, 8, 10]
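A sketch of how map() and a lambda fit together, using the nums list from the Map() example above. The squaring lambda and the names square_all and square_list are illustrative choices; the source does not show what is applied to nums.

```python
# Create a list of numbers: nums
nums = [2, 4, 6, 8, 10]

# Use map() to apply a lambda function over nums: square_all
square_all = map(lambda num: num ** 2, nums)

# map() returns a lazy map object; convert it to a list: square_list
square_list = list(square_all)

# Print the squared values
print(square_list)
```

The map object can be consumed only once, so converting it to a list is the usual step before printing or reusing the values.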
Filter() and lambda functions

In the filter() call, pass a lambda function and the list of strings, fellowship. The lambda function should check if the number of characters in a string member is greater than 6; use the len() function to do this. Assign the resulting filter object to result.
Convert result to a list and print out the list.

# Create a list of strings: fellowship
fellowship = ['frodo', 'samwise', 'merry', 'pippin', 'aragorn', 'boromir', 'legolas',
              'gimli', 'gandalf']

# Use filter() to apply a lambda function over fellowship: result
result = filter(lambda member: len(member) > 6, fellowship)

# Convert result to a list: result_list
result_list = list(result)

# Print result_list
print(result_list)

Reduce() and lambda functions

Remember gibberish() from a few exercises back?

# Define gibberish
def gibberish(*args):

Import the reduce function from the functools module.
In the reduce() call, pass a lambda function that takes two string arguments item1 and item2 and concatenates them; also pass the list of strings, stark. Assign the result to result. The first argument to reduce() should be the lambda function and the second argument is the list stark.

# Import reduce from functools
from functools import reduce

# Create a list of strings: stark
stark = ['robb', 'sansa', 'arya', 'brandon', 'rickon']

# Use reduce() to apply a lambda function over stark: result
result = reduce(lambda item1, item2: item1 + item2, stark)

# Print the result
print(result)

first 2 characters in a tweet x are 'RT'. Assign the resulting filter object to result. To get the first 2 characters in a tweet x, use x[0:2]. To check equality, use a Boolean filter with ==.
Convert result to a list and print out the list.

# Select retweets from the Twitter DataFrame: result
result = filter(lambda x: x[0:2] == 'RT', tweets_df['text'])

# Create list from filter object result: res_list
res_list = list(result)

# Print all retweets in res_list
for tweet in res_list:
    print(tweet)

Pop quiz about errors

Take a look at the following function calls to len():

len('There is a beast in every man and it stirs when you put a sword in his hand.')
len(['robb', 'sansa', 'arya', 'eddard', 'jon'])
len(525600)
len(('jaime', 'cersei', 'tywin', 'tyrion', 'joffrey'))

Which of the function calls raises an error and what type of error is raised?

The call len('There is a beast in every man and it stirs when you put a sword in his hand.') raises a TypeError.
The call len(['robb', 'sansa', 'arya', 'eddard', 'jon']) raises an IndexError.

Initialize the variables echo_word and shout_words to empty strings.
Add the keywords try and except in the appropriate locations for the exception handling block.
Use the * operator to concatenate echo copies of word1. Assign the result to echo_word.
Concatenate the string '!!!' to echo_word. Assign the result to shout_words.

# Define shout_echo
def shout_echo(word1, echo=1):
    """Concatenate echo copies of word1 and three
    exclamation marks at the end of the string."""

    # Initialize empty strings: echo_word, shout_words
    echo_word = ''
    shout_words = ''

    # Add exception handling with try-except
    try:
        # Concatenate echo copies of word1 using *: echo_word
        echo_word = word1 * echo

    # Return shout_words
    return shout_words

# Call shout_echo
shout_echo("particle", echo="accelerator")

Error handling by raising an error

Complete the if statement by checking if the value of echo is less than 0.
In the body of the if statement, add a raise statement that raises a ValueError with message 'echo must be greater than or equal to 0' when the value supplied by the user to echo is less than 0.

# Define shout_echo
def shout_echo(word1, echo=1):
    """Concatenate echo copies of word1 and three
    exclamation marks at the end of the string."""

    # Raise an error with raise
    if echo < 0:
        raise ValueError('echo must be greater than or equal to 0')
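
To make the error-handling exercises above concrete, here is a sketch of a complete shout_echo() that raises the ValueError, together with callers that catch the errors. The rest of the function body after the raise, the try-except blocks around the calls, and the messages they print are assumptions made for this sketch, not the course's solution.

```python
# Define shout_echo
def shout_echo(word1, echo=1):
    """Concatenate echo copies of word1 and three
    exclamation marks at the end of the string."""

    # Raise an error with raise
    if echo < 0:
        raise ValueError('echo must be greater than or equal to 0')

    # Concatenate echo copies of word1 using *: echo_word
    echo_word = word1 * echo

    # Concatenate '!!!' to echo_word: shout_word
    shout_word = echo_word + '!!!'

    # Return shout_word
    return shout_word


# A negative echo triggers the ValueError raised in the function
try:
    shout_echo("particle", echo=-1)
except ValueError as err:
    print('ValueError:', err)

# A string passed as echo fails with a TypeError,
# here already at the echo < 0 comparison
try:
    shout_echo("particle", echo="accelerator")
except TypeError:
    print('word1 must be a string and echo must be an integer')

# A valid call returns the shouted string
print(shout_echo("particle", echo=2))
```

Catching specific exception types such as ValueError and TypeError keeps unrelated errors visible, which a bare except would hide.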