Chapter 08
Chapter Goals
• To build and use a set container
• To learn common set operations for processing data
• To build and use a dictionary container
• To work with a dictionary for table lookups
• To work with complex data structures
In this chapter, we will learn how to work with two more types of
containers (sets and dictionaries) as well as how to combine
containers to model complex structures.
Contents
• Sets
• Dictionaries
• Complex Structures
4/15/2025 2
Sets
SECTION 8.1
Sets
• A set is a container that stores a collection of unique values
• Unlike a list, the elements or members of the set are not stored in any
particular order and cannot be accessed by position
• Operations are the same as the operations performed on sets in
mathematics
• Because sets do not need to maintain a particular order, set
operations are much faster than the equivalent list operations
Example Set
• This set contains three sets of colors––the colors of the British,
Canadian, and Italian flags
• In each set, the order does not matter, and the colors are not duplicated
in any one of the sets
Creating and Using Sets
• To create a set with initial elements, you can specify the elements
enclosed in braces, just like in mathematics:
cast = { "Luigi", "Gumbys", "Spiny" }
• Alternatively, you can use the set() function to convert any sequence
into a set:
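For instance, a list of names can be converted as follows. This is a small sketch; the list name names and the duplicated "Luigi" entry are assumptions made for illustration.

```python
# Convert a list (which may contain duplicates) into a set.
names = ["Luigi", "Gumbys", "Spiny", "Luigi"]
cast = set(names)

print(len(names))  # 4 elements in the list
print(len(cast))   # 3 elements in the set; the duplicate "Luigi" is dropped
```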
Creating an Empty Set
• For historical reasons, you cannot use {} to make an empty set in
Python; {} creates an empty dictionary
• Instead, use the set() function with no arguments:
cast = set()
• As with any container, you can use the len() function to obtain the
number of elements in a set:
numberOfCharacters = len(cast) # In this case it’s zero
Set Membership: in
• To determine whether an element is contained in the set, use the in
operator or its inverse, the not in operator:
if "Luigi" in cast :
    print("Luigi is a character in Monty Python’s Flying Circus.")
else :
    print("Luigi is not a character in the show.")
Accessing Set Elements
• Note that the order in which the elements of the set are visited
depends on how they are stored internally
• Note that the order of the elements in the output is different from
the order in which the set was created
for actor in sorted(cast) :
    print(actor)
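A short sketch contrasting the two traversal styles, assuming the three-element cast set used earlier. A plain for loop visits the elements in whatever order they are stored internally, while sorted() returns a new list in alphabetical order.

```python
cast = {"Luigi", "Gumbys", "Spiny"}

# Plain traversal: every element is visited once, in no guaranteed order.
visited = []
for actor in cast:
    visited.append(actor)

# sorted() builds a new list, so the order is predictable.
for actor in sorted(cast):
    print(actor)
```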
Adding Elements
• Sets are mutable collections, so you can add elements by using the
add() method:
cast = set(["Luigi", "Gumbys", "Spiny"]) #1
cast.add("Arthur") #2
cast.add("Spiny") #3 (no effect: "Spiny" is already in the set)
Removing Elements: discard()
• The discard() method removes an element if the element exists:
cast.discard("Arthur") #4
Removing Elements: remove()
• The remove() method, on the other hand, removes an element if it
exists, but raises a KeyError exception if the given element is not a
member of the set:
cast.remove("The Colonel") # Raises an exception
Removing Elements: clear()
• Finally, the clear() method removes all elements of a set, leaving
the empty set:
cast.clear() # cast now has size 0
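The three removal methods can be compared side by side. This is a minimal sketch; the starting contents of cast and the variable sizeBeforeClear are assumptions for illustration.

```python
cast = {"Luigi", "Gumbys", "Spiny", "Arthur"}

cast.discard("Arthur")       # Removes "Arthur"
cast.discard("The Colonel")  # Not a member: nothing happens

try:
    cast.remove("The Colonel")  # Not a member: raises KeyError
except KeyError:
    print("The Colonel is not in the set.")

sizeBeforeClear = len(cast)  # 3
cast.clear()
print(len(cast))             # 0
```

Prefer discard() when a missing element is not an error, and remove() when it signals a bug that should be noticed.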
Subsets
• A set is a subset of another set if and only if every element of the first set
is also an element of the second set
• For example, the Canadian flag colors are a subset of the British flag
colors
• The Italian flag colors are not, because green does not occur in the
British flag
The issubset() Method
• The issubset() method returns True or False to report whether
one set is a subset of another:
canadian = { "Red", "White" }
british = { "Red", "Blue", "White" }
italian = { "Red", "White", "Green" }

# True
if canadian.issubset(british) :
    print("All Canadian flag colors occur in the British flag.")

# True
if not italian.issubset(british) :
    print("At least one of the colors in the Italian flag does not.")
Set Equality / Inequality
• We test set equality with the == and != operators
• Two sets are equal if and only if they have exactly the same elements
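A brief sketch of both operators. The set french is a hypothetical addition for illustration; the other two sets reuse the flag colors from this section.

```python
french = {"Red", "White", "Blue"}
british = {"Blue", "White", "Red"}   # Same elements, different order
italian = {"Red", "White", "Green"}

print(french == british)   # True: order does not matter
print(french != italian)   # True: "Blue" and "Green" differ
```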
Set Union: union()
• The union of two sets contains all of the elements from both sets,
with duplicates removed
# inEither: The set {"Blue", "Green", "White", "Red"}
inEither = british.union(italian)
• Both the British and Italian sets contain the colors Red and White, but
the union is a set and therefore contains only one instance of each
color
Difference of Two Sets: difference()
• The difference of two sets results in a new set that contains those
elements in the first set that are not in the second set
print("Colors that are in the Italian flag but not the British:")
print(italian.difference(british)) # Prints {'Green'}
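The three combining operations can be sketched together on the flag sets. The names inBoth and onlyItalian are assumptions for illustration; sorted() is used only to make the printed order predictable.

```python
british = {"Red", "Blue", "White"}
italian = {"Red", "White", "Green"}

inEither = british.union(italian)          # Elements in at least one set
inBoth = british.intersection(italian)     # Elements in both sets
onlyItalian = italian.difference(british)  # In italian but not in british

print(sorted(inEither))     # ['Blue', 'Green', 'Red', 'White']
print(sorted(inBoth))       # ['Red', 'White']
print(sorted(onlyItalian))  # ['Green']
```

Note that difference() is not symmetric: british.difference(italian) is the set containing only "Blue".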
Common Set Operations
Common Set Operations (2)
Simple Examples
• Open the file: set examples.py
Set Example: Spell Checking
• The program spellcheck.py reads a file that contains correctly spelled
words and places the words in a set
• It then reads all words from a document––here, the book Alice in
Wonderland––into a second set
• Finally, it prints all words from the document that are not in the set of
correctly spelled words
• Open the file spellcheck.py
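The approach described above might be sketched as follows. This is not the contents of spellcheck.py: the helper extractWords and the two short sample strings are hypothetical stand-ins for the word list file and the document file.

```python
def extractWords(text):
    """Return the set of lowercase words in text, keeping letters only."""
    words = set()
    for token in text.split():
        cleaned = "".join(ch for ch in token if ch.isalpha()).lower()
        if cleaned != "":
            words.add(cleaned)
    return words

# Hypothetical stand-ins for the word list file and the document file.
correctlySpelled = extractWords("alice was beginning to get very tired")
documentWords = extractWords("Alice was begining to get verry tired.")

# Words in the document that are not in the set of correct words.
misspellings = documentWords.difference(correctlySpelled)
for word in sorted(misspellings):
    print(word)
```

Because both collections are sets, the final step is a single difference() call rather than a nested loop.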
Example: Spellcheck.py
Execution: Spellcheck.py
Programming Tip
• When you write a program that manages a collection of unique items,
sets are far more efficient than lists
• Some programmers prefer to use the familiar lists, replacing
itemSet.add(item)
with:
if item not in itemList :
    itemList.append(item)
• However, the list version is much slower, because each not in test
must search the list one element at a time
Counting Unique Words
Problem Statement
• We want to be able to count the number of unique words in a text
document
• The nursery rhyme “Mary had a little lamb” has 57 unique words
• Our task is to write a program that reads in a text document and
determines the number of unique words in the document
Step One: Understand the Task
• To count the number of unique words in a text document we need to
be able to determine if a word has been encountered earlier in the
document
• Only the first occurrence of a word should be counted
• The easiest way to do this is to read each word from the file and add it
to the set
• Because a set cannot contain duplicates we can use the add method
• The add method will prevent a word that was encountered earlier
from being added to the set
• After we process every word in the document the size of the set will
be the number of unique words contained in the document
Create an empty set
for each word in the text document
    Add the word to the set
Number of unique words = the size of the set
• Creating the empty set, adding an element to the set, and determining
the size of the set are standard set operations
• Reading the words in the file can be handled as a separate task
• We need to read individual words from the file. For simplicity in our
example we will use a literal file name:
inputFile = open("nurseryrhyme.txt", "r")
for line in inputFile :
    theWords = line.split()
    for word in theWords :
        pass # Process the word here
Step Four: Clean the Words
• To strip out all the characters that are not letters we will iterate
through the string, one character at a time, and build a new
“clean” word
def clean(string) :
    result = ""
    for char in string :
        if char.isalpha() :
            result = result + char
    return result.lower()
Step Five: Some Assembly Required
• Implement the main() function and combine it with the other
functions
• Open the file: countwords.py
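One way the pieces might fit together is sketched below. It is not the contents of countwords.py: the function name countUniqueWords is an assumption, and io.StringIO stands in for the opened text file so that the example runs without a data file.

```python
import io

def clean(string):
    """Keep only the letters of string and convert them to lowercase."""
    result = ""
    for char in string:
        if char.isalpha():
            result = result + char
    return result.lower()

def countUniqueWords(inputFile):
    """Count the distinct words in an open text file (or file-like object)."""
    uniqueWords = set()
    for line in inputFile:
        for word in line.split():
            cleaned = clean(word)
            if cleaned != "":
                uniqueWords.add(cleaned)
    return len(uniqueWords)

# io.StringIO stands in for open("nurseryrhyme.txt", "r") in this sketch.
sample = io.StringIO("Mary had a little lamb,\nlittle lamb, little lamb.\n")
print(countUniqueWords(sample))  # 5
```

The check for an empty string matters because a token made up only of punctuation or digits cleans to "" and should not be counted as a word.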
Dictionaries
SECTION 8.2
Dictionaries
• A dictionary is a container that keeps associations between keys and
values
• Every key in the dictionary has an associated value
• Keys are unique, but a value may be associated with several keys
• Example (the mapping between the key and value is indicated by an
arrow):
Syntax: Sets and Dictionaries
Creating Dictionaries
• Suppose you need to write a program that looks up the phone
number for a person in your mobile phone’s contact list
• You can use a dictionary where the names are keys and the phone
numbers are values:
contacts = { "Fred": 7235591, "Mary": 3841212, "Bob": 3841212, "Sarah": 2213278 }
• You can use the dict() function to create a duplicate copy of a
dictionary:
oldContacts = dict(contacts)
Accessing Dictionary Values []
• The subscript operator [] is used to return the value associated with
a key
• For example:
print("Fred's number is", contacts["Fred"]) # Prints: Fred's number is 7235591
• The key supplied to the subscript operator must be a valid key in the
dictionary, or a KeyError exception will be raised
Dictionaries: Checking Membership
• To find out whether a key is present in the dictionary, use the in (or
not in) operator:
if "John" in contacts :
    print("John's number is", contacts["John"])
else :
    print("John is not in my contact list.")
Default Values
• Often, you want to use a default value if a key is not present
• Instead of using the in operator, you can simply call the get()
method and pass the key and a default value
• The default value is returned if there is no matching key
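A minimal sketch of get() with a default value. The key "Tony" and the default 411 are assumptions chosen for illustration.

```python
contacts = {"Fred": 7235591, "Mary": 3841212}

# get() returns the stored value when the key exists...
fredNumber = contacts.get("Fred", 411)
# ...and the supplied default (here 411) when it does not.
tonyNumber = contacts.get("Tony", 411)

print("Fred's number is", fredNumber)  # Fred's number is 7235591
print("Tony's number is", tonyNumber)  # Tony's number is 411
```

Unlike the subscript operator, get() never raises a KeyError, and it does not add the missing key to the dictionary.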
Adding/Modifying Items
• A dictionary is a mutable container
• You can add a new item using the subscript operator [], with the key
taking the place of a list index:
contacts["John"] = 4578102 #1
• To change the value associated with a given key, set a new value using
the [] operator on an existing key:
contacts["John"] = 2228102 #2
Adding New Elements Dynamically
• Sometimes you may not know which items will be contained in the
dictionary when it’s created
• You can create an empty dictionary like this:
favoriteColors = {}
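Items can then be added one at a time as they become known. A small sketch follows; the names and colors are made up for illustration.

```python
favoriteColors = {}

# Each assignment to a new key adds an item.
favoriteColors["Juliet"] = "Blue"
favoriteColors["Adam"] = "Red"
favoriteColors["Eve"] = "Blue"

# Assigning to an existing key replaces its value instead of adding an item.
favoriteColors["Adam"] = "Green"

print(len(favoriteColors))     # 3
print(favoriteColors["Adam"])  # Green
```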
Removing Elements
• To remove an item from a dictionary, call the pop() method with
the key as the argument:
contacts = { "Fred": 7235591, "Mary": 3841212, "Bob": 3841212, "Sarah": 2213278 }
contacts.pop("Fred")
• This removes the entire item, both the key and its associated value.
• Note: If the key is not in the dictionary, the pop method raises a
KeyError exception
• To prevent the exception from being raised, you should test for the
key in the dictionary:
if "Fred" in contacts :
    contacts.pop("Fred")
Traversing a Dictionary
• You can iterate over the individual keys in a dictionary using a for
loop:
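print("My Contacts:")
for key in contacts :
    print(key)
• The keys are printed one per line, for example:
Bob
John
Mary
Fred
• Note: older versions of Python store dictionary items in an order that
is optimized for efficiency, which may not be the order in which they
were added; since Python 3.7, dictionaries preserve insertion order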
• Now, the contact list will be printed in order by name:
My Contacts:
Bob 3841212
Fred 7235591
John 4578102
Mary 3841212
Sarah 2213278
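A sorted listing like the one above could be produced by a loop along these lines. This is a sketch; the contents of contacts follow the earlier slides, and the column width of 10 is an assumption.

```python
contacts = {"Fred": 7235591, "Mary": 3841212, "Bob": 3841212,
            "Sarah": 2213278, "John": 4578102}

print("My Contacts:")
for key in sorted(contacts):
    # %-10s left-aligns the name in a 10-character column.
    print("%-10s %d" % (key, contacts[key]))
```

Passing a dictionary to sorted() yields a sorted list of its keys, so the dictionary itself is left unchanged.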
• Python allows you to iterate over the items in a dictionary using the
items() method
• This is a bit more efficient than iterating over the keys and then
looking up the value of each key
• The items() method returns a sequence of tuples that contain the
keys and values of all items
• Here the loop variable item will be assigned a tuple that contains the
key in the first slot and the value in the second slot
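A minimal sketch of traversing with items(), assuming a small contacts dictionary. The second loop, which unpacks each tuple into the hypothetical names name and number, is an equivalent alternative.

```python
contacts = {"Fred": 7235591, "Mary": 3841212, "Bob": 3841212}

# Each item is a (key, value) tuple.
for item in contacts.items():
    print(item[0], item[1])

# The tuple can also be unpacked directly into two loop variables.
for (name, number) in contacts.items():
    print(name, number)
```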
Dictionaries: Data Records
• You create an item for each field of a data record, in which the key is
the field name and the value is the data value for that field
• For example, this dictionary named record stores a single student
record with fields for ID, name, class, and GPA:
record = { "id": 100, "name": "Sally Roberts", "class": 2, "gpa": 3.78 }
Dictionaries: Data Records
• The following function reads one record from a file in which each line
has the form country:population
def extractRecord(infile) :
    record = {}
    line = infile.readline()
    if line != "" :
        fields = line.split(":")
        record["country"] = fields[0]
        record["population"] = int(fields[1])
    return record
• The dictionary record that is returned has two items, one with
the key "country" and the other with the key "population"
• This function’s result can be used to print all of the records to the
terminal
infile = open("populations.txt", "r")
record = extractRecord(infile)
while len(record) > 0 :
    print("%-20s %10d" % (record["country"], record["population"]))
    record = extractRecord(infile)
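To try the function without a populations.txt file, the loop can be run against an in-memory file. In this sketch io.StringIO stands in for the data file, the two sample lines are made up, and the records list is an assumption added to collect the results.

```python
import io

def extractRecord(infile):
    """Read one "country:population" line; return {} at end of file."""
    record = {}
    line = infile.readline()
    if line != "":
        fields = line.split(":")
        record["country"] = fields[0]
        record["population"] = int(fields[1])
    return record

# Made-up sample data standing in for populations.txt.
infile = io.StringIO("Atlantis:1200\nEl Dorado:350\n")

records = []
record = extractRecord(infile)
while len(record) > 0:
    records.append(record)
    print("%-20s %10d" % (record["country"], record["population"]))
    record = extractRecord(infile)
```

An empty dictionary serves as the end-of-file signal, which is why the loop tests len(record) > 0.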
Common Dictionary Operations (2)
Complex Structures
SECTION 8.3
Complex Structures
• Containers are very useful for storing collections of values
• In Python, the list and dictionary containers can contain any type of
data, including other containers
• Some data collections, however, may require more complex
structures.
• In this section, we explore problems that require the use of a complex
structure
A Dictionary of Sets
• The index of a book specifies on which pages each term occurs
• Build a book index from page numbers and terms contained in a text
file with the following format:
6:type
7:example
7:index
7:program
8:type
10:example
11:program
20:set
• The file includes every occurrence of every term to be included in the
index and the page on which the term occurs
• If a term occurs on the same page more than once, the index
includes the page number only once
• The output of the program should be a list of terms in alphabetical
order followed by the page numbers on which the term occurs,
separated by commas, like this:
example: 7, 10
index: 7
program: 7, 11
set: 20
type: 6, 8
• A dictionary of sets would be appropriate for this problem
• Each key can be a term and its corresponding value a set of the page
numbers where it occurs
Why Use a Dictionary?
• The terms in the index must be unique
• By making each term a dictionary key, there will be only one instance
of each term.
• The index listing must be provided in alphabetical order by term
• We can iterate over the keys of the dictionary in sorted order to
produce the listing
• Duplicate page numbers for a term should only be included once
• By adding each page number to a set, we ensure that no duplicates
will be added
Dictionary Sets: Buildindex.py
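A minimal sketch of how such an index builder might look, not the original listing. The function names buildIndex and printIndex are assumptions, and io.StringIO stands in for the data file; the sample repeats the line 7:example to show that duplicate page numbers are ignored.

```python
import io

def buildIndex(infile):
    """Map each term to the set of page numbers on which it occurs."""
    entries = {}
    for line in infile:
        line = line.strip()
        if line == "":
            continue
        fields = line.split(":")
        page = int(fields[0])
        term = fields[1]
        if term in entries:
            entries[term].add(page)
        else:
            entries[term] = {page}
    return entries

def printIndex(entries):
    """Print the terms alphabetically, each followed by its sorted pages."""
    for term in sorted(entries):
        pages = ", ".join(str(page) for page in sorted(entries[term]))
        print(term + ": " + pages)

# io.StringIO stands in for the index data file in this sketch.
data = io.StringIO("6:type\n7:example\n7:index\n7:program\n"
                   "8:type\n10:example\n11:program\n20:set\n7:example\n")
index = buildIndex(data)
printIndex(index)
```

The membership test before add() is needed because a new term has no set yet; the first occurrence creates the set, and later occurrences add to it.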
A Dictionary of Lists
• A common use of dictionaries in Python is to store a collection of lists
in which each list is associated with a unique name or key
• For example, consider the problem of extracting data from a text file
that represents the yearly sales of different ice cream flavors in
multiple stores of a retail ice cream company
vanilla:8580.0:7201.25:8900.0
chocolate:10225.25:9025.0:9505.0
rocky road:6700.1:5012.45:6011.0
strawberry:9285.15:8276.1:8705.0
cookie dough:7901.25:4267.0:7056.5
• The data is to be processed to produce a tabular report with one row
per flavor, one column per store, and row and column totals
• With a dictionary of lists, each row of the table is an item in the
dictionary
• The name of the ice cream flavor is the key used to identify a
particular row in the table.
• The value for each key is a list that contains the sales, by store, for
that flavor of ice cream
Example: Icecreamsales.py
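A minimal sketch of the dictionary-of-lists approach, not the original listing. The function name readData, the column widths, and the use of io.StringIO in place of the data file are assumptions; only two rows of the sample data are used to keep the example short.

```python
import io

def readData(infile):
    """Map each flavor name to a list of its sales figures, one per store."""
    salesData = {}
    for line in infile:
        fields = line.strip().split(":")
        if len(fields) < 2:
            continue   # Skip blank or malformed lines
        flavor = fields[0]
        salesData[flavor] = [float(amount) for amount in fields[1:]]
    return salesData

def printReport(salesData):
    """Print one row per flavor with a row total, then the store totals."""
    numStores = max(len(sales) for sales in salesData.values())
    storeTotals = [0.0] * numStores
    for flavor in sorted(salesData):
        print("%-15s" % flavor, end="")
        rowTotal = 0.0
        for i, amount in enumerate(salesData[flavor]):
            print("%10.2f" % amount, end="")
            rowTotal = rowTotal + amount
            storeTotals[i] = storeTotals[i] + amount
        print("%12.2f" % rowTotal)
    print("%-15s" % "", end="")
    for total in storeTotals:
        print("%10.2f" % total, end="")
    print()

# Two rows of the sample data; io.StringIO stands in for the data file.
data = io.StringIO("vanilla:8580.0:7201.25:8900.0\n"
                   "rocky road:6700.1:5012.45:6011.0\n")
salesData = readData(data)
printReport(salesData)
```

Row totals come from summing one list, while column totals accumulate the same position across all lists, which is why the store totals are kept in a separate list.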
Modules
SPLITTING OUR PROGRAMS INTO PIECES
Modules
• When you write small programs, you can place all of your code into a
single source file
• When your programs get larger or you work in a team, that situation
changes
• You will want to structure your code by splitting it into separate
source files (“modules”)
Reasons for Employing Modules
• Large programs can consist of hundreds of functions that become
difficult to manage and debug if they are all in one source file
• By distributing the functions over several source files and grouping
related functions together, it becomes easier to test and debug the
various functions
• The second reason becomes apparent when you work with other
programmers in a team
• It would be very difficult for multiple programmers to edit a single
source file simultaneously
• The program code is broken up so that each programmer is solely
responsible for a unique set of files
• The supplemental modules contain supporting functions and constant
variables
Modules Example
• Splitting the dictionary of lists into modules
• The tabulardata.py module contains functions for reading the
data from a file and printing a dictionary of lists with row and column
totals
• The salesreport.py module is the driver (or main) module that
contains the main function
• By splitting the program into two modules, the functions in the
tabulardata.py module can be reused in another program that
needs to process named lists of numbers
• However, if a module defines many functions, it is easier to use the
form:
import tabulardata
• With this form, you must prepend the name of the module to the
function name:
tabulardata.printReport(salesData)
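The two import forms can be compared in a runnable way with the standard library math module, used here as a stand-in because tabulardata.py is not reproduced in these notes.

```python
# Form 1: import selected names; they are used without a prefix.
from math import sqrt, pi

print(sqrt(16))        # 4.0

# Form 2: import the whole module; every name needs the module prefix.
import math

print(math.sqrt(16))   # 4.0
print(math.pi == pi)   # True: both names refer to the same value
```

The prefixed form is longer to type, but it makes clear where each function comes from and avoids name clashes between modules.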
Review
Python Sets
• A set stores a collection of unique values
• A set is created using a set literal or the set function
• The in operator is used to test whether an element is a member of a
set
• New elements can be added using the add() method
• Use the discard() method to remove elements from a set
• The issubset() method tests whether one set is a subset of
another set
• The union() method produces a new set that contains all of the
elements from both sets
• The intersection() method produces a new set with the
elements that are contained in both sets
• The difference() method produces a new set with the elements
that belong to the first set but not the second
• The implementation of sets arranges the elements in the set so that
they can be located quickly
Python Dictionaries
• A dictionary keeps associations between keys and values
• Use the [] operator to access the value associated with a key
• The in operator is used to test whether a key is in a dictionary
• New entries can be added or modified using the [] operator
• Use the pop() method to remove a dictionary entry
Complex Structures
• Complex structures can help to better organize data for processing
• The code of complex programs is distributed over multiple files