Lecture 10
Strings
Generators
• Special functions that return lazy iterables
• Use less memory
• The key change: the function uses yield instead of return
• def square(it):
      for i in it:
          yield i*i
• When iterating through a generator, execution runs to the first yield and immediately hands back that first computed value; later values are computed only when requested
• Generator expressions are shorthand for simple generators (remember, there are no tuple comprehensions)
- (i * i for i in [1,2,3,4,5])
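A minimal sketch of the laziness described above, reusing the slide's square generator. The naturals helper is a hypothetical infinite source added only to show that nothing is computed until a value is requested.

```python
from itertools import islice

def square(it):
    """Yield the square of each item, one at a time."""
    for i in it:
        yield i * i

def naturals():
    """Hypothetical infinite source: 1, 2, 3, ..."""
    n = 1
    while True:
        yield n
        n += 1

# Values are produced on demand, so an infinite input is fine
# as long as we only ask for finitely many results.
first_five = list(islice(square(naturals()), 5))
print(first_five)  # [1, 4, 9, 16, 25]

# The equivalent generator expression from the slide
gen = (i * i for i in [1, 2, 3, 4, 5])
first = next(gen)   # computes only the first square
rest = sum(gen)     # consumes the remaining four squares
print(first, rest)  # 1 54
```

Once a generator has been consumed it is empty; iterating again requires creating a new one.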
Efficient Evaluation
• Only compute when necessary, not beforehand
• # Wasteful: both functions run before we know which result is needed
  # u = compute_fast_function(s, t)
  # v = compute_slow_function(s, t)
  if s > t and s**2 + t**2 > 100:
      u = compute_fast_function(s, t)
      res = u / 100
  else:
      v = compute_slow_function(s, t)
      res = v / 100
• The slow function is executed only when the condition is false
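A small sketch of the idea above. The bodies of compute_fast_function and compute_slow_function are hypothetical stand-ins (the slides do not define them), and the calls list exists only to record which one actually ran.

```python
calls = []  # illustration only: records which function was executed

def compute_fast_function(s, t):
    # Hypothetical stand-in for a cheap computation
    calls.append("fast")
    return s + t

def compute_slow_function(s, t):
    # Hypothetical stand-in for an expensive computation
    calls.append("slow")
    return s * t

def result(s, t):
    # Each function is called only inside the branch that needs it
    if s > t and s**2 + t**2 > 100:
        return compute_fast_function(s, t) / 100
    return compute_slow_function(s, t) / 100

r1 = result(12, 10)   # condition is true: only the fast function runs
print(calls)          # ['fast']
r2 = result(4, 5)     # condition is false: only the slow function runs
print(calls)          # ['fast', 'slow']
```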
Short-Circuit Evaluation
• Automatic, works left to right according to order of operations (and before or)
• Works for and and or
• and:
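- if any operand is falsy, stop and return it (False for comparisons)
- a, b = 2, 3
  a > 3 and b < 5   # a > 3 is False, so b < 5 is never evaluated
• or:
- if any operand is truthy, stop and return it (True for comparisons)
- a, b, c = 2, 3, 7
  a > 3 or b < 5 or c > 8   # stops at b < 5; c > 8 is never evaluated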
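To see short-circuiting happen, the sketch below wraps each operand in a hypothetical check helper that logs when it is evaluated. The helper and the log lists are assumptions for illustration, not part of the slides.

```python
def check(label, test, log):
    """Evaluate test() and record that this operand was reached."""
    log.append(label)
    return test()

a, b, c = 2, 3, 7

and_log = []
and_result = (check("a > 3", lambda: a > 3, and_log)
              and check("b < 5", lambda: b < 5, and_log))
print(and_result, and_log)  # False ['a > 3']

or_log = []
or_result = (check("a > 3", lambda: a > 3, or_log)
             or check("b < 5", lambda: b < 5, or_log)
             or check("c > 8", lambda: c > 8, or_log))
print(or_result, or_log)    # True ['a > 3', 'b < 5']

# and/or return the deciding operand itself, not necessarily a bool
print(0 and 5)    # 0
print("" or "x")  # x
```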
Memoization
• memo_dict = {}
  def memoized_slow_function(s, t):
      if (s, t) not in memo_dict:
          memo_dict[(s, t)] = compute_slow_function(s, t)
      return memo_dict[(s, t)]
• for s, t in [(12, 10), (4, 5), (5, 4), (12, 10)]:
      if s > t and (c := memoized_slow_function(s, t)) > 50:
          pass
      else:
          c = compute_fast_function(s, t)
• The second time s=12, t=10 comes up, the result is looked up instead of recomputed!
• Trade memory for compute time
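The standard library offers the same idea ready-made in functools.lru_cache. The sketch below assumes a hypothetical stand-in for the slow function and uses a counter to show how many real computations happen.

```python
from functools import lru_cache

call_count = 0  # illustration only: counts real computations

@lru_cache(maxsize=None)  # unbounded cache keyed on the arguments
def memoized_slow_function(s, t):
    global call_count
    call_count += 1
    return s * t  # hypothetical stand-in for the slow computation

for s, t in [(12, 10), (4, 5), (5, 4), (12, 10)]:
    memoized_slow_function(s, t)

print(call_count)                                # 3
print(memoized_slow_function.cache_info().hits)  # 1
```

The arguments must be hashable, just as the (s, t) tuples used as dictionary keys above must be.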
Functional Programming
• Programming without imperative statements like assignment
• In addition to comprehensions & iterators, have functions:
- map: iterable of n values to an iterable of n transformed values
- filter: iterable of n values to an iterable of m (m <= n) values
• Eliminates need for concrete looping constructs
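A short sketch of map and filter with ordinary named functions. The helper names square and is_even are chosen here for illustration.

```python
def square(x):
    return x * x

def is_even(x):
    return x % 2 == 0

nums = range(10)

# map: n values in, n transformed values out (lazily)
squares = list(map(square, nums))
print(squares)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

# filter: n values in, m <= n values out (lazily)
evens = list(filter(is_even, nums))
print(evens)    # [0, 2, 4, 6, 8]

# Composed without an explicit loop: sum of squares of the even numbers
total = sum(map(square, filter(is_even, nums)))
print(total)    # 120
```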
Lambda Functions
• def is_even(x):
      return (x % 2) == 0
• filter(is_even, range(10))  # lazy iterator
• Lots of code to write a simple check
• Lambda functions allow inline function definition
• Usually used for "one-liners": a simple data transform/expression
• filter(lambda x: x % 2 == 0, range(10))
• Parameters follow lambda, no parentheses
• No return keyword as this is implicit in the syntax
• JavaScript has similar functionality (arrow functions): (d => d % 2 == 0)
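A few lambda uses side by side, as a sketch. The words list is a made-up sample.

```python
# Inline predicate instead of a separate def
evens = list(filter(lambda x: x % 2 == 0, range(10)))
print(evens)  # [0, 2, 4, 6, 8]

# The same selection as a comprehension
same = [x for x in range(10) if x % 2 == 0]
print(evens == same)  # True

# Lambdas are common as key functions
words = ["pear", "fig", "banana"]  # hypothetical sample data
by_length = sorted(words, key=lambda w: len(w))
print(by_length)  # ['fig', 'pear', 'banana']

# Multiple parameters are separated by commas, still without parentheses
total = (lambda x, y: x + y)(2, 3)
print(total)  # 5
```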
Assignment 3
• Important for Test 1, but studying also should be a priority
• Deadline moved to Friday, Feb. 24
• Pokémon Data
• Looking at where and how people and goods move across land borders
• Start with the sample notebook (or copy its code) to download the data
• Data is a list of dictionaries
• Need to iterate through, update, and create new lists & dictionaries
Test 1
• This Wednesday, Feb. 22, 11:00am-12:15pm
• In-Class, paper/pen & pencil
• Covers material through last week
• Format:
- Multiple Choice
- Free Response
• Information at the link above
Strings
• Remember strings are sequences of characters
• Strings are collections so have len, in, and iteration
- s = "Huskies"
  len(s); "usk" in s; [c for c in s if c == 's']
• Strings are sequences so have
- indexing and slicing: s[0], s[1:]
- concatenation and repetition: s + " at NIU"; s * 2
• Single or double quotes 'string1', "string2"
• Triple double-quotes: """A string over many lines"""
• Escaped characters: '\n' (newline) '\t' (tab)
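The operations above, run on the slide's example string, as a quick sketch. The poem variable is an added sample.

```python
s = "Huskies"

# Collection behavior
print(len(s))                      # 7
print("usk" in s)                  # True
print([c for c in s if c == "s"])  # ['s', 's']

# Sequence behavior
print(s[0], s[1:], s[-3:])         # H uskies ies
print(s + " at NIU")               # Huskies at NIU
print(s * 2)                       # HuskiesHuskies

# Triple quotes keep line breaks; an escape like "\t" is one character
poem = """A string
over many lines"""
print(poem.count("\n"))            # 1
print(len("a\tb"))                 # 3
```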
fi
fi
Codes
• Characters are still stored as bits and thus can be represented by numbers
- ord → character to integer
- chr → integer to character
- "\N{horse}": named emoji
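A brief sketch of moving between characters and their integer codes, using the standard unicodedata module to confirm the character's official name.

```python
import unicodedata

print(ord("A"))           # 65
print(chr(97))            # a
print(chr(ord("a") + 1))  # b

# \N{...} looks a character up by its Unicode name
horse = "\N{HORSE}"
print(len(horse))               # 1 (a single character)
print(hex(ord(horse)))          # 0x1f40e
print(unicodedata.name(horse))  # HORSE
```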
Finding & Counting Substrings
• s.count(sub): Count the number of occurrences of sub in s
• s.find(sub): Find the first position where sub occurs in s, else -1
• s.rfind(sub): Like find, but returns the right-most position
• s.index(sub): Like find, but raises a ValueError if not found
• s.rindex(sub): Like index, but returns right-most position
• sub in s: Returns True if s contains sub
• s.startswith(sub): Returns True if s starts with sub
• s.endswith(sub): Returns True if s ends with sub
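The methods above applied to one sample string, as a sketch. Note that count only counts non-overlapping occurrences.

```python
s = "Mississippi"  # sample string chosen for illustration

print(s.count("ss"))      # 2
print(s.count("issi"))    # 1 (the two occurrences overlap)
print(s.find("ss"))       # 2
print(s.rfind("ss"))      # 5
print(s.find("z"))        # -1
print("ss" in s)          # True
print(s.startswith("Miss"), s.endswith("pi"))  # True True

# index behaves like find but raises instead of returning -1
try:
    s.index("z")
    found = True
except ValueError:
    found = False
print(found)              # False
```

Prefer find when a missing substring is an expected case and index when it should be treated as an error.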
Transforming Text
• s.replace(oldsub, newsub):
  Copy of s with occurrences of oldsub replaced by newsub
• s.upper(): Copy of s with all uppercase characters
• s.lower(): Copy of s with all lowercase characters
• s.capitalize(): Copy of s with first character capitalized (the rest lowercased)
• s.title(): Copy of s with first character of each word capitalized
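Each of these methods returns a new string and leaves the original untouched, as the sketch below shows. The last line illustrates a known quirk of title with apostrophes.

```python
s = "hello world"

print(s.replace("o", "0"))  # hell0 w0rld
print(s.upper())            # HELLO WORLD
print("MiXeD".lower())      # mixed
print(s.capitalize())       # Hello world
print(s.title())            # Hello World
print(s)                    # hello world (unchanged)

# title treats any non-letter as a word boundary
print("they're here".title())  # They'Re Here
```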
Joining
• join is a method on the separator used to join a list of strings
• ','.join(names)
- names is a list of strings, ',' is the separator used to join them
• Example:
- def orbit(n):
      # …
      return orbit_as_list
  print(','.join(orbit(n)))
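A self-contained sketch of join. The names and values lists are made-up sample data; join requires every item to be a string, so numbers are converted first.

```python
names = ["Ada", "Grace", "Alan"]  # hypothetical sample data

csv_line = ",".join(names)
print(csv_line)             # Ada,Grace,Alan

# Non-string items must be converted before joining
values = [4, 2, 1]
numbers_line = ",".join(str(v) for v in values)
print(numbers_line)         # 4,2,1

# split is the inverse operation
print(csv_line.split(","))  # ['Ada', 'Grace', 'Alan']
```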
Formatting
• s.ljust, s.rjust: justify strings by adding fill characters to obtain a string with the specified width
• s.zfill: rjust with zeros (a leading sign stays in front)
• s.format: templating function
- Replace fields indicated by curly braces with corresponding values
- "My name is {} {}".format(first_name, last_name)
- "My name is {1} {0}".format(last_name, first_name)
- "My name is {first_name} {last_name}".format(
      first_name=name[0], last_name=name[1])
- Braces can contain number or name of keyword argument
- Whole format mini-language to control formatting
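A sketch of the justification methods and the three format styles. The name values are hypothetical samples.

```python
first_name, last_name = "Ada", "Lovelace"  # hypothetical sample values
name = (first_name, last_name)

print("abc".ljust(6, ".") + "|")  # abc...|
print("abc".rjust(6) + "|")       #    abc|
print("42".zfill(5))              # 00042
print("-42".zfill(5))             # -0042

a = "My name is {} {}".format(first_name, last_name)
b = "My name is {1} {0}".format(last_name, first_name)
c = "My name is {first_name} {last_name}".format(
    first_name=name[0], last_name=name[1])
print(a)            # My name is Ada Lovelace
print(a == b == c)  # True

# The mini-language follows a colon: right-align, width 8, 2 decimals
print("[{:>8.2f}]".format(3.14159))  # [    3.14]
```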
D. Koop, CSCI 503/490, Spring 2023 22
Format Strings
• Formatted string literals (f-strings) prefix the starting delimiter with f
• Reference variables directly!
- f"My name is {first_name} {last_name}"
• Can include expressions, too:
- f"My name is {name[0].capitalize()} {name[1].capitalize()}"
• Same format mini-language is available
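The f-string versions of the examples above, as a sketch with hypothetical sample values.

```python
name = ("ada", "lovelace")  # hypothetical sample values
first_name, last_name = name

plain = f"My name is {first_name} {last_name}"
fancy = f"My name is {name[0].capitalize()} {name[1].capitalize()}"
print(plain)  # My name is ada lovelace
print(fancy)  # My name is Ada Lovelace

# Format specs follow a colon and may themselves use variables
width = 10
padded = f"[{first_name:>{width}}]"
print(padded)          # [       ada]
print(f"{2 ** 10:,}")  # 1,024
```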
Numeric Formatting
• Add positive sign:
- f'[{27:+10d}]' # '[       +27]'
• Add space but only show negative numbers:
- print(f'{27: d}\n{-27: d}') # note the space in front of 27
• Separators:
- f'{12345678:,d}' # '12,345,678'
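The numeric specs above can be checked directly. The last two lines are extra examples from the same mini-language, added for illustration.

```python
print(f"[{27:+10d}]")       # [       +27]
print(f"[{27: d}]")         # [ 27]
print(f"[{-27: d}]")        # [-27]
print(f"{12345678:,d}")     # 12,345,678

# Also part of the mini-language: precision and percent
print(f"{1234.5678:,.2f}")  # 1,234.57
print(f"{0.256:.1%}")       # 25.6%
```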
Raw Strings
• Raw strings prefix the starting delimiter with r
• Backslashes are kept literally instead of starting escape sequences
• '\\n is the way you write a newline, \\\\ for \\.'
• r"\n is the way you write a newline, \\ for \."
• Useful for regular expressions
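A quick check that the two spellings above produce the same string, plus the reason raw strings suit regular expressions.

```python
import re

normal = '\\n is the way you write a newline, \\\\ for \\.'
raw = r"\n is the way you write a newline, \\ for \."
print(normal == raw)          # True

# "\n" is one character; r"\n" is a backslash followed by n
print(len("\n"), len(r"\n"))  # 1 2

# Regex patterns are full of backslashes, so raw strings avoid doubling them
print(r"\d+" == "\\d+")       # True
digits = re.findall(r"\d+", "a1b22")
print(digits)                 # ['1', '22']
```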
Regular Expressions
• AKA regex
• A syntax to better specify how to decompose strings
• Look for patterns rather than specific characters
• "31" in "The last day of December is 12/31/2016."
• May work for some questions, but now suppose I have other lines like: "The last day of September is 9/30/2016."
• …and I want to find dates that look like:
• {digits}/{digits}/{digits}
• Cannot search for every combination!
• \d+/\d+/\d+ # \d is a character class
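A sketch applying the pattern above to both example sentences with Python's re module. The variable names are chosen here for illustration.

```python
import re

line1 = "The last day of December is 12/31/2016."
line2 = "The last day of September is 9/30/2016."

# The literal test only works for one of the lines
print("31" in line1, "31" in line2)  # True False

# One pattern covers both: digits, slash, digits, slash, digits
pattern = r"\d+/\d+/\d+"
dates = [re.search(pattern, line).group() for line in (line1, line2)]
print(dates)  # ['12/31/2016', '9/30/2016']
```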
Metacharacters
• Need to have some syntax to indicate things like repeat or one-of-these or
this is optional.
• . ^ $ * + ? { } [ ] \ | ( )
• []: define character class
• ^: complement (opposite) when it is the first character inside [], e.g. [^0-9]
• \: escape, but now escapes metacharacters and references classes
• *: repeat zero or more times
• +: repeat one or more times
• ?: zero or one time
• {m,n}: at least m and at most n
Character class   Matches
\d                Any digit (0–9).
\D                Any character that is not a digit.
\s                Any whitespace character (such as spaces, tabs and newlines).
\S                Any character that is not a whitespace character.
\w                Any word character (letters, digits, and the underscore).
\W                Any character that is not a word character.
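A few small patterns exercising the metacharacters and character classes listed above, as a sketch. The sample strings are made up for illustration.

```python
import re

# {m,n}: bounded repetition
strict_date = r"\d{1,2}/\d{1,2}/\d{4}"
print(bool(re.fullmatch(strict_date, "9/30/2016")))    # True
print(bool(re.fullmatch(strict_date, "123/30/2016")))  # False

# ?: zero or one time
print(re.findall(r"colou?r", "color colour colouur"))  # ['color', 'colour']

# * versus +: zero or more versus one or more
print(re.findall(r"ab*", "a ab abbb"))  # ['a', 'ab', 'abbb']
print(re.findall(r"ab+", "a ab abbb"))  # ['ab', 'abbb']

# [^...]: complemented character class
print(re.findall(r"[^0-9]+", "a1bc22d"))  # ['a', 'bc', 'd']

# \w includes the underscore; \$ escapes a metacharacter
print(re.findall(r"\w+", "snake_case, 2 words!"))  # ['snake_case', '2', 'words']
print(re.findall(r"\$\d+", "cost: $15"))           # ['$15']
```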
Examples
• s0 = "No full dates here, just 02/15"
  s1 = "02/14/2021 is a date"
  s2 = "Another date is 12/25/2020"
• re.match(r'\d+/\d+/\d+',s1) # returns match object
• re.match(r'\d+/\d+/\d+',s0) # None
• re.match(r'\d+/\d+/\d+',s2) # None! match only looks at the start of the string
• re.search(r'\d+/\d+/\d+',s2) # returns 1 match object
• re.search(r'\d+/\d+/\d+',s3) # returns only 1 match object (the first), even if s3 contains several dates
• re.findall(r'\d+/\d+/\d+',s3) # returns list of strings
• re.finditer(r'\d+/\d+/\d+',s3) # returns iterable of matches
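A runnable sketch of the calls above. The value of s3 is an assumption made for this example (a string containing two dates); s0, s1, and s2 are taken from the slide.

```python
import re

s0 = "No full dates here, just 02/15"
s1 = "02/14/2021 is a date"
s2 = "Another date is 12/25/2020"
s3 = "Dates: 02/14/2021 and 12/25/2020"  # hypothetical: not given in the slides

pattern = r"\d+/\d+/\d+"

# match anchors at the start of the string
print(re.match(pattern, s1).group())   # 02/14/2021
print(re.match(pattern, s0))           # None
print(re.match(pattern, s2))           # None

# search scans the whole string but stops at the first hit
print(re.search(pattern, s2).group())  # 12/25/2020
print(re.search(pattern, s3).group())  # 02/14/2021

# findall returns strings; finditer yields match objects
all_dates = re.findall(pattern, s3)
print(all_dates)                       # ['02/14/2021', '12/25/2020']
from_iter = [m.group() for m in re.finditer(pattern, s3)]
print(from_iter == all_dates)          # True
```

Use finditer when the positions of the matches (m.start(), m.end()) are needed in addition to the matched text.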