Lecture 3 - Collections and Complexity Analysis
Complexity Analysis
CIS-275
Introduction to Collections
● A collection is a group of zero or more items that are treated as a conceptual unit.
○ i.e. the items in a collection are related in some way.
● Some examples of simple collections:
○ A list of integers.
○ A dictionary of employee names and phone numbers.
● In this course, we will build our own collections (also called data structures).
● The basic principles of collections are the same in any programming language, and
remain the same over time.
● Although we will build data structures in Python, their implementation is similar in
any language we choose.
● There is an item (D1) in the first position, an item (D2) in the second position, etc.
● D2 is the only item with D1 as a predecessor and D3 as a successor.
● What are some real-world examples of a linear collection? People in a line, stacks of
dinner plates, a list of grocery items, etc...
● If a collection sorts its entries, we can only add items which can be logically compared
to each other in some way.
● Because an integer cannot be compared to a string, an error would occur if we try to
add “Steve” to my_list.
● Since string objects can be compared with each other, they can be added to the same
SortedList.
● Would the following code work? No!
my_list = SortedList()
my_list.add({ "Bob": "123-4567"})
my_list.add({ "Carl": "555-4444"})
● Some objects, even if they are of the same type, cannot be compared meaningfully.
● Python does not define an ordering (<, >) for dictionaries, so a TypeError would occur here.
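The SortedList class used in these examples is not defined in the notes. As a minimal sketch, assuming a hypothetical SortedList built on Python's standard bisect module, the three cases above can be tried out like this:

```python
import bisect


class SortedList:
    """Hypothetical sorted collection; not the course's actual implementation."""

    def __init__(self):
        self._items = []

    def add(self, item):
        # insort compares the new item with existing entries using <,
        # so every entry must be comparable with every other entry.
        bisect.insort(self._items, item)

    def __str__(self):
        return str(self._items)


numbers = SortedList()
numbers.add(5)
numbers.add(2)
try:
    numbers.add("Steve")                # str vs. int: no ordering defined
except TypeError as error:
    print("Cannot add 'Steve':", error)

names = SortedList()
names.add("Steve")
names.add("Carl")                       # str vs. str: fine
print(names)

contacts = SortedList()
contacts.add({"Bob": "123-4567"})       # first item: nothing to compare against
try:
    contacts.add({"Carl": "555-4444"})  # dict vs. dict: no ordering defined
except TypeError as error:
    print("Cannot add second dictionary:", error)
```

In this sketch the first dictionary is accepted because an empty collection requires no comparison; the error only appears once two dictionaries have to be ordered.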
Chapter 2 An Overview of Collections 18
Sorted Collections
● Consider this very simple class:
class Person:
    """ Simple class containing a Person's name and Phone Number """
● One option is for __gt__ to simply return True if the calling object’s _name string is
greater than other’s, False otherwise.
● After adding this method, Person objects can be placed in any sorted collection.
○ Note: It’s probably best to implement the __ge__ method as well (for greater than
or equal to) because some sorted collections might use that operator instead.
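The comparison methods described above might look like the following sketch. The constructor and the _phone attribute are assumptions made for illustration; the notes only mention the _name attribute.

```python
class Person:
    """Simple class containing a Person's name and phone number."""

    def __init__(self, name, phone):
        # Attribute names are assumed; the notes only mention _name.
        self._name = name
        self._phone = phone

    def __gt__(self, other):
        # True if this person's name sorts after the other's.
        return self._name > other._name

    def __ge__(self, other):
        # True if this person's name sorts after or equal to the other's.
        return self._name >= other._name

    def __str__(self):
        return f"{self._name}: {self._phone}"


bob = Person("Bob", "123-4567")
carl = Person("Carl", "555-4444")
print(carl > bob)    # True: "Carl" > "Bob"
print(bob >= carl)   # False
```

Python's built-in sorted function also accepts these objects: when __lt__ is not defined, the expression a < b falls back to b > a.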
● Determine the size: Allow the use of Python’s len function to get the number of items in the collection.
● Test for item membership: Allow the use of Python’s in operator to search for a given target item in the collection. Return True if found, False otherwise.
● Traverse the collection: Allow the use of Python’s for loop to visit each item in the collection (the order of visitation depends upon the type of collection).
● Obtain a string representation: Allow the use of Python’s str function to obtain a string representation of the collection.
● Test for equality: Allow the use of Python’s == operator to determine whether two collections are equal. Two collections are considered equal if they are of the same type and contain the same items. If the collection type is linear, the items must be in the same order in both collections as well.
● Concatenate two collections: Allow the use of Python’s + operator to create a new collection of the same type, containing all items from both operands.
● Convert collection type: Create a collection of a different type with the same items as a source collection (e.g. convert a tuple to a list).
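In Python, each of these operations corresponds to a special method. The sketch below uses a hypothetical SimpleList class (a thin wrapper around a Python list, not one of the course's collections) to show which method supports which operation:

```python
class SimpleList:
    """Hypothetical linear collection backed by a Python list."""

    def __init__(self, source=None):
        # Convert collection type: copy the items of any iterable source.
        self._items = list(source) if source is not None else []

    def __len__(self):                # len(collection)
        return len(self._items)

    def __contains__(self, target):   # target in collection
        return target in self._items

    def __iter__(self):               # for item in collection
        return iter(self._items)

    def __str__(self):                # str(collection)
        return "[" + ", ".join(str(item) for item in self._items) + "]"

    def __eq__(self, other):          # collection == other
        # Same type and same items in the same order (linear collection).
        return type(self) is type(other) and self._items == other._items

    def __add__(self, other):         # collection + other
        return SimpleList(list(self) + list(other))


a = SimpleList((1, 2, 3))             # built from a tuple
b = SimpleList([4, 5])
c = a + b
print(len(c), 4 in c, str(c))
print(a == SimpleList([1, 2, 3]), a == SimpleList([3, 2, 1]))
```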
{ 5, 8, 7, 1, 2, 6 }
{ 55, 18, 7, 21, 2, 608, 110, 310, 910, 11, 12, 15, 61, 190, 75, 81, 14, 91, 80, 15, 43, 8, 100,
13, 17, 1, 8, 5, 12, 101, 214, 70, 16, 300, 9, 24, 37, 715, 212, 1, 34, 4, 0, 82, 73, 46, 54, 72,
102, 201, 305, 200, 60, 401, 501, 312, 502, 400, 231, 342, 320, 208, 876, 403, 109, 504,
31, 372, 462, 429, 480, 501, 430, 720, 58, 73, 24, 807, 819, 650, 460, 306, 203, 27, 189 }
start = time.time()
# Do some work here
elapsed = time.time() - start
● elapsed will contain how many seconds have passed between the start time and end
time of the work that takes place.
● The for loop iterates 10,000,000 times and its body performs two operations.
● The bulk of the algorithm’s execution time is spent executing these two operations.
● These operations are called the basic operations, and the number of times they are performed is determined by the input size (here, 10,000,000).
import time

print("%12s%16s" % ("Run #", "Seconds"))
for count in range(5):
    start = time.time()
    # start the algorithm
    work = 1
    for x in range(10000000):
        work += 1
        work -= 1
    # end the algorithm
    elapsed = time.time() - start
    print("%12d%16.3f" % (count + 1, elapsed))
● Do you expect each run of the algorithm to have a somewhat similar or vastly different runtime?
       Run #         Seconds
           1           2.196
           2           2.097
           3           2.266
           4           2.255
           5           2.348
● The runtimes are similar because the input size never changed. The algorithm always performs 10,000,000 iterations of its basic operations:
work = 1
for x in range(10000000):
    work += 1
    work -= 1
import time

problem_size = 10000000
print("%12s%16s" % ("Problem Size", "Seconds"))
for count in range(5):
    start = time.time()
    # start the algorithm
    work = 1
    for x in range(problem_size):
        work += 1
        work -= 1
    # end of the algorithm
    elapsed = time.time() - start
    print("%12d%16.3f" % (problem_size, elapsed))
    problem_size *= 2
● In this updated version of the program, a different input size is tested for each run of the algorithm.
● In the first test, the input size is 10,000,000.
● In the second test, it is 20,000,000.
● etc.
● Do you expect each run of the algorithm to have a somewhat similar or vastly
different runtime?
Version with a nested loop:
work = 1
for x in range(problem_size):
    for y in range(problem_size):
        work += 1
        work -= 1
Previous version with a single loop:
work = 1
for x in range(problem_size):
    work += 1
    work -= 1
● As the problem size doubles, the runtime now increases at a much faster rate than doubling: it roughly quadruples, because the inner loop body runs problem_size² times.
● As the problem size gets even larger, the rate at which the runtime increases keeps going up as well.
● Different computers can have very different processing speeds. One machine might run one type of algorithm slowly, while another machine is slow on a different type of algorithm.
● If a computer has multiple programs running, each of them may run more slowly, which distorts timing measurements.
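problem_size = 1000  # kept small: the inner loop body runs problem_size ** 2 times
print("%12s%16s" % ("Problem Size", "Iterations"))
for count in range(5):
    iterations = 0
    # start the algorithm
    work = 1
    for x in range(problem_size):
        for y in range(problem_size):
            iterations += 1  # count each pass through the inner loop
            work += 1
            work -= 1
    # end of the algorithm
    print("%12d%16d" % (problem_size, iterations))
    problem_size *= 2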
● For each of the 5 tests of the algorithm, we count how many times in total the inner
loop is reached.
● This gives a general idea of how much work the algorithm is doing.
● The output does not change, regardless of what computer we run it on!
Chapter 3 Searching, Sorting, and Complexity Analysis 47
Counting Instructions
● Given this output, what is the relationship between the problem size and the number
of iterations performed?
problem_size = 1000
print("%12s%16s" % ("Problem Size", "Iterations"))
for count in range(5):
    iterations = 0
    # start the algorithm
    work = 1
    for x in range(problem_size):
        iterations += 1
        work += 1
        work -= 1
    # end of the algorithm
    print("%12d%15d" % (problem_size, iterations))
    problem_size *= 2
Version with a nested loop:
work = 1
for x in range(problem_size):
    for y in range(problem_size):
        iterations += 1
        work += 1
        work -= 1
Version with a single loop:
work = 1
for x in range(problem_size):
    iterations += 1
    work += 1
    work -= 1
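To compare the two loop structures directly, the counting can be wrapped in functions. This is a sketch for illustration; the function names are hypothetical and the problem sizes are kept small so the nested version finishes quickly.

```python
def count_single_loop(problem_size):
    """Return how many times the body of a single loop runs."""
    iterations = 0
    for x in range(problem_size):
        iterations += 1
    return iterations


def count_nested_loops(problem_size):
    """Return how many times the body of the inner loop runs."""
    iterations = 0
    for x in range(problem_size):
        for y in range(problem_size):
            iterations += 1
    return iterations


print("%12s%16s%16s" % ("Problem Size", "Single Loop", "Nested Loops"))
problem_size = 100  # kept small so the nested loops finish quickly
for count in range(4):
    print("%12d%16d%16d" % (problem_size,
                            count_single_loop(problem_size),
                            count_nested_loops(problem_size)))
    problem_size *= 2
```

Each time the problem size doubles, the single-loop count doubles while the nested-loop count is multiplied by four.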
Every quadratic algorithm is worse than every linear algorithm given a large enough input!
Although n² and 0.01n² are very different time complexities, we group them together
because they both grow in a similar manner (quadratically).
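As a rough illustration of the first claim, the sketch below compares a quadratic cost with a small constant (0.01n²) against a linear cost with a large constant (100n). The linear function and its constant are assumptions chosen for this example; they do not come from the lecture.

```python
def quadratic_cost(n):
    # 0.01 * n^2, written as a division to keep the arithmetic exact
    return n * n / 100


def linear_cost(n):
    # 100 * n: a linear cost with a deliberately large constant (assumed)
    return 100 * n


print("%10s%16s%16s" % ("n", "0.01n^2", "100n"))
for n in (10, 100, 1000, 10000, 100000, 1000000):
    print("%10d%16.0f%16d" % (n, quadratic_cost(n), linear_cost(n)))
```

With these particular constants the two costs are equal at n = 10,000. Below that the quadratic function is cheaper; above it the linear function wins, and the gap keeps widening.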
Order
● Consider this chart.
● For smaller inputs (10, 20, 50...), 0.1n² + n + 100 is much less efficient than 0.1n².
● However, once the input size reaches 100 and above, the impact that n and 100 have on
performance becomes negligible.
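The chart itself is not reproduced in these notes, but the same comparison can be computed directly. The sketch below (function names are hypothetical) prints both functions and the ratio between them:

```python
def full_cost(n):
    # 0.1n^2 + n + 100
    return 0.1 * n ** 2 + n + 100


def leading_term(n):
    # 0.1n^2 only
    return 0.1 * n ** 2


print("%8s%16s%16s%10s" % ("n", "0.1n^2+n+100", "0.1n^2", "ratio"))
for n in (10, 20, 50, 100, 1000, 10000):
    ratio = full_cost(n) / leading_term(n)
    print("%8d%16.1f%16.1f%10.3f" % (n, full_cost(n), leading_term(n), ratio))
```

The ratio falls from 12 at n = 10 to 1.2 at n = 100 and to about 1.001 at n = 10,000, so the lower-order terms matter less and less as n grows.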
● Given a function’s complexity class, we know it’s more efficient than those to its right
in this list and less efficient than those to its left.
Example:
● As n grows, any function that is Θ(n lg n) is more efficient than any function that is
Θ(n²) and less efficient than any function that is Θ(n), etc.
● There is a more rigorous way to prove this, but for this course we use this type of
intuition to determine complexity classes.
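To build this intuition, it can help to tabulate the three functions named in the example. This is a sketch, not a proof; the growth helper is hypothetical, and lg is taken to mean the base-2 logarithm.

```python
import math


def growth(n):
    """Return the values of n, n lg n, and n^2 for a given n."""
    return n, n * math.log2(n), n ** 2


print("%10s%14s%16s" % ("n", "n lg n", "n^2"))
for n in (4, 16, 128, 1024, 8192):
    linear, linearithmic, quadratic = growth(n)
    print("%10d%14.0f%16d" % (linear, linearithmic, quadratic))
```

At every size in the table, n lg n sits between n and n², and the distance between the columns grows with n.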
‘1’ ‘2’ ‘3’ ‘4’ ‘5’ ‘6’ ‘7’ ‘8’ ‘9’ ‘10’