
Lecture 3: An Overview of Collections and

Complexity Analysis

CIS-275
Introduction to Collections
● A collection is a group of zero or more items that are treated as a conceptual unit.
○ i.e. the items in a collection are related in some way.
● Some examples of simple collections:
○ A List of integers.
○ A dictionary of employee names and phone numbers.
● In this course, we will build our own collections (also called data structures).
● The basic principles of collections are the same in any programming language, and
remain the same over time.
● Although we will build data structures in Python, their implementation is similar in
any language we choose.

Chapter 2 An Overview of Collections 2


Collection Types
● Python has several built-in collection types:
○ string, list, tuple, set, and dictionary.
○ The most common of these are string and list.
● Programmers often create other collection types as well.
● Some examples which we will build in this course:
○ array
○ bag
○ linked list
○ stack
○ queue
○ priority queue
○ binary search tree
○ graph



Collection Types
● In many programming languages, collections are homogeneous.
○ This means an instance of a collection can only hold objects of one data type.
○ For example, in Java when we create an array, we must decide what type of data
it will hold (integers, strings, characters, etc.)
● In Python, collections are typically heterogeneous.
○ An instance of a Python collection can hold any type of object.
○ When we create a List in Python, we do not bind it to a type. We can add any
type of object we want to it:
○ This is convenient and easy to write, but not always efficient.

from collections import Counter

my_list = []
my_list.append("Hello")
my_list.append(5)
my_list.append(Counter())



Collection Types
● One way to categorize types of collections is by how they organize their data.
● Although there are numerous types of collections, most fit in one of the following
four main categories:
○ Linear Collections
○ Hierarchical Collections
○ Graph Collections
○ Unordered Collections



Linear Collections
● The items in a linear collection are ordered by position.
● Each item (other than the first) has a unique predecessor and each item (other than the
last) has a unique successor.
● For example:

● There is an item (D1) in the first position, an item (D2) in the second position, etc.
● D2 is the only item with D1 as a predecessor and D3 as a successor.
● What are some real-world examples of a linear collection?



Linear Collections
● The items in a linear collection are ordered by position.
● Each item (other than the first) has a unique predecessor and each item (other than the
last) has a unique successor.
● For example:

● There is an item (D1) in the first position, an item (D2) in the second position, etc.
● D2 is the only item with D1 as a predecessor and D3 as a successor.
● What are some real-world examples of a linear collection? People in a line, stacks of
dinner plates, a list of grocery items, etc...



Hierarchical Collections
● The items in a hierarchical collection are ordered in a
structure resembling an upside-down tree.
● Each data item (other than the top one) has only one parent
but may have multiple children.
○ For example, D3 has one parent (D1) but three children
(D4, D5, and D6)
● Typically, an item’s parents and children have a specific
relation to each other.
○ For example, we might have a tree in which each parent
is larger than its children.
● What are some real-world examples of a hierarchical
collection?



Hierarchical Collections
● The items in a hierarchical collection are ordered in a
structure resembling an upside-down tree.
● Each data item (other than the top one) has only one parent
but may have multiple children.
○ For example, D3 has one parent (D1) but three children
(D4, D5, and D6)
● Typically, an item’s parents and children have a specific
relation to each other.
○ For example, we might have a tree in which each parent
is larger than its children.
● What are some real-world examples of a hierarchical
collection? A file directory system, a company’s
organizational tree, a book’s table of contents, etc...
Graph Collections
● A graph collection (or graph) is a collection in which
each data item may have any number of successors and
predecessors.
○ In a graph, successors and predecessors are
considered the same thing and are called neighbors.
○ i.e. D3 has the following neighbors: D1, D2, D4,
and D5
● What are some real-world examples of graphs?



Graph Collections
● A graph collection (or graph) is a collection in which
each data item may have any number of successors and
predecessors.
○ In a graph, successors and predecessors are
considered the same thing and are called neighbors.
○ i.e. D3 has the following neighbors: D1, D2, D4,
and D5
● What are some real-world examples of graphs?
○ Maps of airline routes between cities
○ Electrical wiring diagrams
○ The internet



Unordered Collections
● The items in an unordered collection are not stored in
any particular order.
○ i.e. each item is simply in the collection...it has no
predecessors or successors.
● There is no “first” or “last” item in the collection.
● What are some real-world examples of an unordered
collection?



Unordered Collections
● The items in an unordered collection are not stored in
any particular order.
○ i.e. each item is simply in the collection...it has no
predecessors or successors.
● There is no “first” or “last” item in the collection.
● What are some real-world examples of an unordered
collection?
○ Most typically a bag! One specific example is a bag
of marbles. There is no ordering to the marbles,
each is simply in the bag.



Sorted Collections
● A sorted collection is a collection whose items are always stored in a specific order
(even if the items are added out of order)
● For example, Python’s built-in List is not a sorted collection:
my_list = []
my_list.append(5)
my_list.append(10)
my_list.append(2)
print(my_list) # Prints [5, 10, 2]. Items are not sorted

● A sorted collection automatically sorts its items by their natural ordering.


● If Python’s List were a sorted collection, this code would have printed [2, 5, 10],
regardless of the order in which the values were added.



Sorted Collections
● Suppose we create a collection called SortedList.
○ It behaves just like a List, but it sorts its items as they are entered.
● Is there anything wrong with this code?
my_list = SortedList()
my_list.add(5)
my_list.add(10)
my_list.add("Steve")



Sorted Collections
● Suppose we create a collection called SortedList.
○ It behaves just like a List, but it sorts its items as they are entered.
● Is there anything wrong with this code? Yes
my_list = SortedList()
my_list.add(5)
my_list.add(10)
my_list.add("Steve") # Error

● If a collection sorts its entries, we can only add items which can be logically compared
to each other in some way.
● Because an integer cannot be compared to a string, an error would occur if we try to
add “Steve” to my_list.
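This is easy to check directly in the interpreter; in Python 3, ordering comparisons between unrelated types fail:

```python
# Ordering comparisons between an int and a str raise a TypeError in
# Python 3, which is why a sorted collection cannot hold both.
try:
    5 > "Steve"
except TypeError as err:
    print("Cannot compare:", err)
```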



Sorted Collections
● The following lines of code should work properly:
my_list = SortedList()
my_list.add("Zander")
my_list.add("John")
my_list.add("Steve")

● Since string objects can be compared with each other, they can be added to the same
SortedList.
● Would the following code work?
my_list = SortedList()
my_list.add({"Bob": "123-4567"})
my_list.add({"Carl": "555-4444"})



Sorted Collections
● The following lines of code should work properly:
my_list = SortedList()
my_list.add("Zander")
my_list.add("John")
my_list.add("Steve")

● Since string objects can be compared with each other, they can be added to the same
SortedList.
● Would the following code work? No!
my_list = SortedList()
my_list.add({"Bob": "123-4567"})
my_list.add({"Carl": "555-4444"})

● Some objects, even if they are of the same type, cannot be compared meaningfully.
● Python does not know how to order dictionaries, so a TypeError would occur here.
Sorted Collections
● Consider this very simple class:

class Person:
    """ Simple class containing a Person's name and phone number """

    def __init__(self, name, number):
        self._name = name
        self._number = number

● Would we be able to add a Person object to a SortedList?



Sorted Collections
● Consider this very simple class:

class Person:
    """ Simple class containing a Person's name and phone number """

    def __init__(self, name, number):
        self._name = name
        self._number = number

● Would we be able to add a Person object to a SortedList? No!


● We have not told Python how Person objects should be compared.
○ i.e. how does it know if one Person comes before or after another?
● To fix this, we must define the __gt__ method for our class.
● How should we sort Person objects?



Sorted Collections
class Person:
    """ Simple class containing a Person's name and phone number """

    def __init__(self, name, number):
        self._name = name
        self._number = number

    def __gt__(self, other):
        """ The Person with the higher alphabetical name is greater """
        return self._name > other._name

● One option is for __gt__ to simply return True if the calling object’s _name string is
greater than other’s, False otherwise.
● After adding this method, Person objects can be placed in any sorted collection.
○ Note: It’s probably best to implement the __ge__ method as well (for greater than
or equal to) because some sorted collections might use that operator instead.
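For illustration, here is a minimal sketch of what the SortedList assumed in these slides might look like. The class name and its add method come from the slides; the implementation below is my own assumption, relying only on the > comparison (i.e. __gt__) that its items must support:

```python
class SortedList:
    """Hypothetical sketch of a SortedList: items are kept in
    ascending order as they are added."""

    def __init__(self):
        self._items = []

    def add(self, item):
        # Walk past every stored element the new item is greater than,
        # then insert just before the first element it is not greater than.
        i = 0
        while i < len(self._items) and item > self._items[i]:
            i += 1
        self._items.insert(i, item)

    def __str__(self):
        return str(self._items)

my_list = SortedList()
for value in (5, 10, 2):
    my_list.add(value)
print(my_list)  # [2, 5, 10]
```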



A Taxonomy of Collection Types
● This diagram shows the four general types of collection we
just discussed: Graph, Hierarchical, Linear, and Unordered.
● Under each general type are more specific types, many of
which we’ll discuss in this course.
○ Note: Several of these collections we will also
implement in multiple ways.
○ For example, there are two common ways to
implement a Stack...we’ll cover both.



Operations on Collections
● There are several general types of operations that are common to most collections.
○ However, these operations work differently depending on the collection type.
● For example, each collection has a method to retrieve an item, but depending on the
collection type, a different item is retrieved:
○ i.e. The method to retrieve an item from a Queue returns a different item than the
method to retrieve an item from a Stack containing the same items.
● In this lecture, we will simply list the types of operations that will be included in all of
our collections.
○ Throughout the semester we will talk more about how the implementations of these
methods differ based on collection type.



Operations on Collections
Category of Operation Description

Determine the size Allow the use of Python’s len function to get # of items in collection

Test for Item Membership Allow the use of Python’s in operator to search for a given target item
in collection. Return True if found, False otherwise.

Traverse the Collection Allow the use of Python’s for loop to visit each item in collection (the
order of visitation depends upon the type of collection)

Obtain a string representation Allow the use of Python’s str function to obtain a string representation
of collection

Test for Equality Allow the use of Python’s == operator to determine whether two
collections are equal. Two collections are considered equal if they
are of the same type and contain the same items. If collection type is
linear, items must be in same order in both collections as well.
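In Python, each of these operation categories maps onto a special (“dunder”) method. As a hypothetical sketch (the Bag class and its method bodies below are my own illustration, not the course's implementation), a tiny unordered collection supporting most of the table might look like this:

```python
class Bag:
    """A tiny unordered collection sketch showing how Python's
    built-in operators map onto special methods."""

    def __init__(self, items=None):
        self._items = list(items) if items is not None else []

    def __len__(self):              # len(bag)
        return len(self._items)

    def __contains__(self, item):   # item in bag
        return item in self._items

    def __iter__(self):             # for item in bag: ...
        return iter(self._items)

    def __str__(self):              # str(bag)
        return "{" + ", ".join(map(str, self._items)) + "}"

    def __eq__(self, other):        # bag == other (simplified: ignores duplicates)
        if type(self) != type(other) or len(self) != len(other):
            return False
        return all(item in other for item in self)

    def __add__(self, other):       # bag + other
        return Bag(self._items + list(other))

b = Bag([1, 2, 3])
print(len(b), 2 in b, str(b), b == Bag([3, 2, 1]))  # 3 True {1, 2, 3} True
```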



Operations on Collections (continued)
Category of Operation Description

Concatenate Two Collections Allow the use of Python’s + operator to create a new collection of the
same type, containing all items from both operands.

Convert Collection type Create a collection of a different type with the same items as a source
collection (i.e. convert a tuple to a list)

Insert an Item Add a given item to collection, possibly at a given location.

Remove an Item Remove an item from collection, possibly at a given location

Replace an item Combine removal and insertion into one operation



Operations on Collections
● There is no standard name for the insertion, removal, replacement, or access operations.
● However, there are a few common variations:
○ Removal is often called pop or remove.
○ Insertion is often called push, add, or append.
○ Retrieval (without a removal) is often called peek.



Algorithms
Four people want to cross a bridge in 17 minutes; they all begin on the same side. It is
night, and they have one flashlight. A maximum of two people can cross the bridge at one
time. Any party that crosses, either one or two people, must have the flashlight with them.
The flashlight must be walked back and forth; it cannot be thrown, for example.

Each person walks at a different speed:


Person a: 1 minute to cross the bridge
Person b: 2 minutes to cross the bridge
Person c: 5 minutes to cross the bridge
Person d: 10 minutes to cross the bridge.
A pair must walk together at the rate of the slower person's pace.

Chapter 3 Searching, Sorting, and Complexity Analysis 27


Algorithms
● Algorithms are one of the basic building blocks of computer programs.
○ The other is a data structure, which this course is mostly about. As we’ll see,
you rarely use one without the other.
● An algorithm is a finite sequence of steps that produces an answer to a problem.



Algorithms
● With a very small input, most problems can be solved relatively quickly by humans.

For example, sort this list:

{ 5, 8, 7, 1, 2, 6 }



Algorithms
● However, with larger inputs, a computer is required if we’re interested in efficiency

For example, sort this list:

{ 55, 18, 7, 21, 2, 608, 110, 310, 910, 11, 12, 15, 61, 190, 75, 81, 14, 91, 80, 15, 43, 8, 100,
13, 17, 1, 8, 5, 12, 101, 214, 70, 16, 300, 9, 24, 37, 715, 212, 1, 34, 4, 0, 82, 73, 46, 54, 72,
102, 201, 305, 200, 60, 401, 501, 312, 502, 400, 231, 342, 320, 208, 876, 403, 109, 504,
31, 372, 462, 429, 480, 501, 430, 720, 58, 73, 24, 807, 819, 650, 460, 306, 203, 27, 189 }



Measuring the Efficiency of Algorithms
● When writing an algorithm, it is extremely important to ensure that it is correct.
○ A correct algorithm always solves the problem it is intended to answer.
● Other important qualities we want for our algorithms include: readability, ease of
maintenance, and run-time performance (or efficiency).
● In this lecture we will discuss ways to measure the run-time performance of an
algorithm and to compare the efficiency of two algorithms.



Measuring the Efficiency of Algorithms
● Why do we care about efficiency? Computers are getting faster!
○ As we will see, an algorithm can be so inefficient that it would take years or
centuries to complete, even on superfast computers!
■ Regardless of how fast computers get, some algorithms are still unviable.

● Therefore, to measure efficiency, we don’t calculate how fast an algorithm runs on a
specific computer.
○ Instead, we measure how efficient an algorithm is in relation to other algorithms.
○ i.e. we might say Algorithm A is 100 times less efficient than Algorithm B



Measuring the Efficiency of Algorithms
● One way to measure the time cost of an algorithm is to use the computer’s clock to
simply determine how much time it takes.
○ This process is called benchmarking.
● To do this, our code can perform the following steps:
○ Record the current time.
○ Run the algorithm with a certain input size.
○ Record the current time again.
○ Subtract the first time from the second time to determine how many seconds
have passed.



Benchmarking
● To benchmark a program, we can use Python’s time module.
● This module includes a time() function which returns the number of seconds that
have passed since January 1, 1970 (the Unix epoch).
● To determine how much time has passed:

start = time.time()
# Do some work here
elapsed = time.time() - start

● elapsed will contain how many seconds have passed between the start time and end
time of the work that takes place.
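As an aside not covered on the slide: time.time() follows the system clock, which can be adjusted while the program runs. For benchmarking, the standard library also provides time.perf_counter() (since Python 3.3), a monotonic, high-resolution timer that follows the same pattern:

```python
import time

start = time.perf_counter()       # monotonic, high-resolution timer
work = sum(range(1_000_000))      # some stand-in work to benchmark
elapsed = time.perf_counter() - start
print("%.3f seconds" % elapsed)
```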



Benchmarking
● Consider this (rather pointless) algorithm:
work = 1
for x in range(10000000):
    work += 1
    work -= 1

● The for loop iterates 10,000,000 times and its body performs two operations.
● The bulk of the algorithm’s execution time will be executing these two operations.
● These two statements are the algorithm’s basic operations, and the number of times
they are performed is determined by the problem (input) size.

● Typically when we benchmark an algorithm, we run it multiple times.


● Let’s write code that executes this algorithm 5 times and records how long each
execution takes.
Benchmarking
● The outer loop executes 5 times. In each of these executions, the algorithm runs once
and its execution time is recorded.

import time

print("%12s%16s" % ("Run #", "Seconds"))
for count in range(5):
    start = time.time()
    # start the algorithm
    work = 1
    for x in range(10000000):
        work += 1
        work -= 1
    # end the algorithm
    elapsed = time.time() - start
    print("%12d%16.3f" % (count + 1, elapsed))

● Do you expect each run of the algorithm to have a somewhat similar or vastly
different runtime?



Benchmarking
● As can be seen from this chart, each of the 5 runs of the algorithm took around 2
seconds to execute:

Run # Seconds
1 2.196
2 2.097
3 2.266
4 2.255
5 2.348

● The reason for this is that the problem size never changed. The algorithm always
performs 10,000,000 basic operations:
work = 1
for x in range(10000000):
    work += 1
    work -= 1



Benchmarking
import time

problem_size = 10000000
print("%12s%16s" % ("Problem Size", "Seconds"))
for count in range(5):
    start = time.time()
    # start the algorithm
    work = 1
    for x in range(problem_size):
        work += 1
        work -= 1
    # end of the algorithm
    elapsed = time.time() - start
    print("%12d%16.3f" % (problem_size, elapsed))
    problem_size *= 2

● In this updated version of the program, a different problem size is tested for each
run of the algorithm.
● In the first test, the problem size is 10,000,000.
● In the second test, it is 20,000,000, etc.
● Do you expect each run of the algorithm to have a somewhat similar or vastly
different runtime?



Benchmarking
● When I ran the program on my computer, I got the following results:
Problem Size Seconds
10000000 2.340
20000000 4.383
40000000 8.997
80000000 18.732
160000000 41.275

● As can be seen, the runtimes grow as the input grows.


● Based on these results, might we be able to somewhat predict the running time (in
seconds) if the input was doubled again?



Benchmarking
● When I ran the program on my computer, I got the following results:
Problem Size Seconds
10000000 2.340
20000000 4.383
40000000 8.997
80000000 18.732
160000000 41.275

● As can be seen, the runtimes grow as the input grows.


● Based on these results, might we be able to somewhat predict the running time (in
seconds) if the input was doubled again?
● The runtime appears to approximately double as well.
○ If the input were doubled again, it would likely take about 80 seconds to execute



Benchmarking
import time

problem_size = 10000000
print("%12s%16s" % ("Problem Size", "Seconds"))
for count in range(5):
    start = time.time()
    # start the algorithm
    work = 1
    for x in range(problem_size):
        for y in range(problem_size):
            work += 1
            work -= 1
    # end of the algorithm
    elapsed = time.time() - start
    print("%12d%16.3f" % (problem_size, elapsed))
    problem_size *= 2

● This algorithm has been given another slight update.
● Can anybody spot the change?
● How might this affect runtime?



Benchmarking
New:

work = 1
for x in range(problem_size):
    for y in range(problem_size):
        work += 1
        work -= 1

Original:

work = 1
for x in range(problem_size):
    work += 1
    work -= 1

● The new algorithm contains a set of nested loops.


○ i.e. one loop inside another loop.
● For each iteration of the outer loop, the inner loop iterates problem_size times.
● Therefore, if problem_size is 100:
○ When x is 1, the inner loop iterates 100 times.
○ When x is 2, the inner loop iterates 100 times, etc.



Benchmarking
● As can be seen, the new algorithm grows at a different rate than the old one:
Problem Size Seconds
1000 0.112
2000 0.847
4000 3.260
8000 14.880
16000 52.011

● As the problem size doubles, the runtime now increases at a much faster rate than
doubling.
● As the problem size gets even larger, the rate at which the runtime increases will
continue to grow as well.



Benchmarking
● Why might benchmarking like this not be the most accurate way to test an
algorithm’s efficiency?
Problem Size Seconds
10000000 2.340
20000000 4.383
40000000 8.997
80000000 18.732
160000000 41.275



Benchmarking
● Why might benchmarking like this not be the most accurate way to test an
algorithm’s efficiency?
Problem Size Seconds
10000000 2.340
20000000 4.383
40000000 8.997
80000000 18.732
160000000 41.275

● Different computers can have very different processing speeds. Some might run one
type of algorithm slowly while another runs a different type of algorithm slowly.
● If a computer has multiple programs running, the efficiency of every program running
will decrease.



Counting Instructions
● Rather than measuring how many seconds have passed when an algorithm runs, a
more accurate technique is to count how many instructions occur.
● When doing this, we want to determine which instructions execute more times as the
problem size grows.

problem_size = 10000000
print("%12s%16s" % ("Problem Size", "Iterations"))
for count in range(5):
    # start the algorithm
    work = 1
    for x in range(problem_size):
        for y in range(problem_size):
            work += 1
            work -= 1
    # end of the algorithm
    problem_size *= 2



Counting Instructions
problem_size = 1000
print("%12s%16s" % ("Problem Size", "Iterations"))
for count in range(5):
    iterations = 0
    # start the algorithm
    work = 1
    for x in range(problem_size):
        for y in range(problem_size):
            iterations += 1
            work += 1
            work -= 1
    # end of the algorithm
    print("%12d%15d" % (problem_size, iterations))
    problem_size *= 2

● For each of the 5 tests of the algorithm, we count how many times in total the inner
loop is reached.
● This gives a general idea of how much work the algorithm is doing.
● The output does not change, regardless of what computer we run it on!
Counting Instructions
● Given this output, what is the relationship between the problem size and the number
of iterations performed?

Problem Size Iterations


1000 1000000
2000 4000000
4000 16000000
8000 64000000



Counting Instructions
● Given this output, what is the relationship between the problem size and the number
of iterations performed?

Problem Size Iterations


1000 1000000
2000 4000000
4000 16000000
8000 64000000

● The number of iterations performed is the problem size squared!


● More simply, we can say this algorithm performs n² iterations.



Counting Instructions
● Let’s go back to our original algorithm without nested loops.
● What will its output look like?

problem_size = 1000
print("%12s%16s" % ("Problem Size", "Iterations"))
for count in range(5):
    iterations = 0
    # start the algorithm
    work = 1
    for x in range(problem_size):
        iterations += 1
        work += 1
        work -= 1
    # end of the algorithm
    print("%12d%15d" % (problem_size, iterations))
    problem_size *= 2



Counting Instructions
Problem Size Iterations
1000 1000
2000 2000
4000 4000
8000 8000
16000 16000

● The number of iterations performed is simply the problem size!


● More simply, we can say this algorithm performs n iterations.



Counting Instructions
New:

work = 1
for x in range(problem_size):
    for y in range(problem_size):
        iterations += 1
        work += 1
        work -= 1

Original:

work = 1
for x in range(problem_size):
    iterations += 1
    work += 1
    work -= 1

● The original algorithm performs problem_size iterations.
○ More simply, we say it performs n iterations.
● The new algorithm performs problem_size * problem_size iterations.
○ Or problem_size² iterations.
○ More simply, we say it performs n² iterations.



Time Complexity
● When we count the number of instructions an algorithm performs, we calculate its
time complexity.
● An algorithm’s time complexity usually grows as its input size grows.
● The time complexity of New is n² and the time complexity of Original is n.
● However, as we’ll see, an algorithm’s time complexity is rarely so simple.



Time Complexity
● Consider this function:
def my_function(n):
    for i in range(2 * n):
        print("Hello")
    for i in range(n):
        print("Goodbye")

● What is its time complexity?



Time Complexity
● Consider this function:
def my_function(n):
    for i in range(2 * n):
        print("Hello")
    for i in range(n):
        print("Goodbye")

● What is its time complexity?


○ There are two basic operations:
■ Printing “Hello” in the first loop
■ Printing “Goodbye” in the second loop
○ The first is performed 2n times, the second is performed n times.
○ Therefore, the time complexity is 2n + n = 3n
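We can sanity-check this count with an instrumented version of the function (a sketch of my own, not from the slides) that tallies the operations instead of printing:

```python
def my_function_operations(n):
    """Count the print operations my_function(n) would perform."""
    ops = 0
    for i in range(2 * n):
        ops += 1   # one "Hello" per iteration
    for i in range(n):
        ops += 1   # one "Goodbye" per iteration
    return ops

print(my_function_operations(10))  # 30, i.e. 3 * 10
```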



Order
There are an infinite number of time complexities, making algorithms difficult to compare.

With Order, we group algorithms with other algorithms of similar complexity.

● Algorithms with time complexities of n, 3n, 100n, etc. are linear-time
○ Their time complexity grows linearly with n
● Algorithms with time complexities of n², 0.01n², etc. are quadratic-time
○ Their time complexity grows quadratically with n

Every quadratic algorithm is worse than every linear algorithm given a large enough input!

Although n² and 0.01n² are very different time complexities, we group them together
because they both grow in a similar manner (quadratically).
Order
● Consider this chart.
● For smaller inputs (10, 20, 50...) 0.1n² + n + 100 is much less efficient than 0.1n².
● However, once the input size reaches 100 and above, the impact n and 100 have on
performance is negligible.



Order
The term that eventually dominates is the one we are interested in.
● In any function, we can throw away lower-order terms:
○ i.e. 0.1n³ + 10n² + 5n + 25 is a complete cubic function. We throw away 10n², 5n,
and 25, as they are each lower-order than 0.1n³. As n grows, 0.1n³ will eventually
dominate the others.
● When a function is cubic, we say that it is in the complexity class Θ(n³)
○ We also say the function is “order n³”
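A quick numeric check (my own arithmetic, at a modest n of 1,000) shows how thoroughly the cubic term dominates the others:

```python
# Share of each term of 0.1n^3 + 10n^2 + 5n + 25 at n = 1,000.
n = 1_000
terms = {"0.1n^3": 0.1 * n**3, "10n^2": 10 * n**2, "5n": 5 * n, "25": 25}
total = sum(terms.values())
for name, value in terms.items():
    print("%7s: %6.2f%% of the total" % (name, 100 * value / total))
```

At this size the cubic term already accounts for over 90% of the total, and its share only grows with n.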



Complexity Classes
● Common complexity classes, from most efficient to least efficient:
Θ(1)  Θ(lg n)  Θ(n)  Θ(n lg n)  Θ(n²)  Θ(n³)  Θ(2ⁿ)  Θ(n!)

● Given a function’s complexity class, we know it’s more efficient than those to its right
in this list and less efficient than those to its left.

Example:
● As n grows, any function that is Θ(n lg n) is more efficient than any function that is
Θ(n²) and less efficient than any function that is Θ(n), etc.
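Plugging a single problem size, say n = 1,000, into each class makes the gaps concrete. This small table-printer is my own illustration (2ⁿ and n! are left out of the loop because they are astronomically large even at this size):

```python
import math

n = 1_000
growth = [
    ("1", 1),
    ("lg n", math.log2(n)),
    ("n", n),
    ("n lg n", n * math.log2(n)),
    ("n^2", n ** 2),
    ("n^3", n ** 3),
]
for name, value in growth:
    print("%7s: %15.0f" % (name, value))
```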



Complexity Classes
This chart displays the rate at which functions within common complexity classes grow
based on input size.



Complexity Classes
What complexity class does the following time complexity belong in?
n + n² + 2ⁿ + n⁴



Complexity Classes
What complexity class does the following time complexity belong in?
n + n² + 2ⁿ + n⁴

● Θ(2ⁿ), because 2ⁿ eventually dominates the other terms.


○ We can throw out n, n², and n⁴.

● There is a more rigorous way to prove this, but for this course we use this type of
intuition to determine complexity classes.



Sequential Search of a List
● This code performs what is called a sequential search of a list:
def sequential_search(target, lst):
    """ Returns the position of the target item if found or -1 otherwise """
    position = 0
    while position < len(lst):
        if target == lst[position]:
            return position
        position += 1
    return -1

● Starting at index 0, each index is checked.


○ If the item is found, its index is returned.
○ If it is not found, -1 is returned.
● What is the time complexity of this algorithm?



Best-Case, Worst-Case, and Average-Case
● This code performs what is called a sequential search of a list:
def sequential_search(target, lst):
    """ Returns the position of the target item if found or -1 otherwise """
    position = 0
    while position < len(lst):
        if target == lst[position]:
            return position
        position += 1
    return -1

● The time complexity of sequential search depends upon its input.


○ If the first index contains target, only one operation is performed.
○ If the last index contains target (or target is not in the list at all), n operations
are performed.
○ If an index somewhere in the middle contains target, between 1 and n operations
are performed.
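These cases can be observed directly by instrumenting the search to count comparisons. The counting wrapper below is my own addition, not from the textbook:

```python
def sequential_search_count(target, lst):
    """Sequential search that also reports how many comparisons it made."""
    comparisons = 0
    for position, item in enumerate(lst):
        comparisons += 1
        if target == item:
            return position, comparisons
    return -1, comparisons

data = [10, 20, 30, 40, 50]
print(sequential_search_count(10, data))  # best case: (0, 1)
print(sequential_search_count(50, data))  # found at the last index: (4, 5)
print(sequential_search_count(99, data))  # not in the list: (-1, 5)
```

The first call stops after one comparison, while the last two scan all n = 5 items, matching the best-case Θ(1) and worst-case Θ(n) analysis.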
Chapter 3 Searching, Sorting, and Complexity Analysis 64
Best-Case, Worst-Case, and Average-Case
● Sequential search has a best-case complexity of Θ(1), a worst-case complexity of Θ(n),
and an average-case complexity of Θ(n).
● As can be imagined, best-case performance rarely occurs.
● Average-case is more useful, but more difficult to calculate.
● Therefore, we often consider the worst-case performance of an algorithm whose running
time varies with its input.

Chapter 3 Searching, Sorting, and Complexity Analysis 65


Sequential Search
● Suppose we have a list of sorted items.
● Is there a faster way of searching for an item than checking each index?
○ For example, to see if 10 is in this list, a sequential search checks every index.

1  2  3  4  5  6  7  8  9  10

Chapter 3 Searching, Sorting, and Complexity Analysis 66


Binary Search
● A binary search works in much the same way as looking up a word in a (paper)
dictionary.
● Rather than starting at the beginning of a dictionary, most people open to the middle:
○ If they find the word they are looking for, great!
○ If the word they are looking for comes after the word they find, they check the
second half of the dictionary.
○ Otherwise, they check the first half.
● In this manner, the number of words being searched is cut in half each time.

1  2  3  4  5  6  7  8  9  10

Chapter 3 Searching, Sorting, and Complexity Analysis 67


Binary Search
● Let’s perform a binary search for 7.
● To start a binary search, we pick the list’s midpoint.
○ We can calculate the midpoint as: ⌊(first index + last index) / 2⌋
○ In this example, the midpoint is: ⌊(0 + 9) / 2⌋ = ⌊4.5⌋ = 4
● We check the midpoint and find 5.
● What do we do from here?

1  2  3  4  5  6  7  8  9  10

Chapter 3 Searching, Sorting, and Complexity Analysis 68


Binary Search
● Let’s perform a binary search for 7.
● To start a binary search, we pick the list’s midpoint.
○ We can calculate the midpoint as: ⌊(first index + last index) / 2⌋
○ In this example, the midpoint is: ⌊(0 + 9) / 2⌋ = ⌊4.5⌋ = 4
● We check the midpoint and find 5.
● What do we do from here?
● We can ignore the sublist to the left of index 4, since 7 > 5.
● We then perform a binary search on the sublist to the right of 4.
○ The midpoint of this sublist is ⌊(5 + 9) / 2⌋ = 7

1  2  3  4  5  6  7  8  9  10

Chapter 3 Searching, Sorting, and Complexity Analysis 69


Binary Search
● Because 7 < 8, we can ignore the sublist to the right of index 7.
● The sublist to the left has a midpoint of ⌊(5 + 6) / 2⌋ = ⌊5.5⌋ = 5

1  2  3  4  5  6  7  8  9  10

Chapter 3 Searching, Sorting, and Complexity Analysis 70


Binary Search
● Because 6 < 7, we check the sublist to the right of index 5.
● The sublist to the right has a midpoint of ⌊(6 + 6) / 2⌋ = 6
● This index contains the number we were looking for, so the binary search is finished!

1  2  3  4  5  6  7  8  9  10

Chapter 3 Searching, Sorting, and Complexity Analysis 71


Binary Search
def binary_search(target, sorted_list):
    """Return the index of target in sorted_list, or -1 if it is absent."""
    left = 0
    right = len(sorted_list) - 1
    while left <= right:
        midpoint = (left + right) // 2
        if target == sorted_list[midpoint]:
            return midpoint
        elif target < sorted_list[midpoint]:
            right = midpoint - 1
        else:
            left = midpoint + 1
    return -1

● This code performs a binary search on the sorted_list parameter.


● The while loop continues until the value is found (and the function returns) or until
left is greater than right (i.e. the value is not in the list).
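For completeness, the function can be exercised on the running example. This self-contained sketch restates the slide's definition and searches the sorted list 1 through 10:

```python
def binary_search(target, sorted_list):
    """Return the index of target in sorted_list, or -1 if it is absent."""
    left = 0
    right = len(sorted_list) - 1
    while left <= right:
        midpoint = (left + right) // 2
        if target == sorted_list[midpoint]:
            return midpoint
        elif target < sorted_list[midpoint]:
            right = midpoint - 1
        else:
            left = midpoint + 1
    return -1

numbers = list(range(1, 11))          # [1, 2, ..., 10]
print(binary_search(7, numbers))      # 7 lives at index 6
print(binary_search(11, numbers))     # not present, so -1
```

Searching for 7 visits midpoints 4, 7, 5, and 6, exactly the trace worked through on the preceding slides.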

Chapter 3 Searching, Sorting, and Complexity Analysis 72


Binary Search Evaluation
● Binary search's worst case occurs when the target is not in the list.
● In this case, the list is cut in half on each iteration of the while loop until the loop
ends.
● Whenever the input is cut in half on each iteration, the running time is Θ(log₂ n).
● Therefore, on a sorted list, binary search is asymptotically faster than sequential
search!

Chapter 3 Searching, Sorting, and Complexity Analysis 73
