Week 6 Lecture
Review
• In the recorded lecture we learned:
– GCD – Euclid’s method; GCD(x,y) = GCD(y, x % y)
– Root finding with binary search
– Learned how to prove correctness
Runtime Analysis
How fast will your program run?
• The running time of your program will depend upon:
– The algorithm
– The input
– Your implementation of the algorithm in a programming language
– The compiler/interpreter you use
– The OS on your computer
– Your computer hardware
– Maybe other things: temperature outside; other programs on
your computer; …
• Our Motivation: analyze the running time of an algorithm
as a function of only simple parameters of the input
Basic idea: counting operations
• Each algorithm performs a sequence of basic operations:
– Arithmetic: (low + high)/2
– Comparison: if x > 0:
– Assignment: temp = x
– Branching: while y>0:
–…
• Count number of basic operations performed on the input
• Difficulties:
– Which operations are basic?
– Not all operations take the same amount of time
– Operations take different times with different hardware or compilers
Difficulties do not matter “so much”
• Operation counts are only problematic in terms of
“constant factors”
• The general form of the function describing the
running time is invariant over hardware, languages or
compilers!
for i in range(n):
    for j in range(n):
        x = x + i*j
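To make the counting concrete, here is a small sketch (not from the slides) that counts how many times the innermost statement of the loop above executes – the count is n², independent of hardware, language, or compiler:

```python
def inner_ops(n):
    """Count how many times the innermost statement executes
    in a doubly nested loop over range(n)."""
    ops = 0
    for i in range(n):
        for j in range(n):
            ops += 1  # one basic operation per inner iteration
    return ops
```

For n = 10 the inner statement runs 100 times; doubling n quadruples the count, which is the signature of O(n²) growth.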
Mathematical Formalization
f = O(g)
There exist a constant c and an x0 such that for all x > x0, f(x) ≤ c·g(x)
Source https://en.wikipedia.org/wiki/Big_O_notation#/media/File:Big-O-notation.png
Mathematical Formalization
• f=O(g) should be considered as saying that “f is at
most g, up to constant factors”
• We usually will have f be the running time of an
algorithm, and g a nicely written function.
– For example:
"The running time of the algorithm was O(n²)"
Example
• 4n² + 2n + 3 = O(n²), since for all n ≥ 3 we have 4n² + 2n + 3 ≤ 5n² (take c = 5 and x0 = 3 in the definition)
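One valid choice of witnesses is c = 5 and x₀ = 3; a throwaway sketch to sanity-check them numerically over a range:

```python
def f(n):
    """The polynomial from the example: 4n^2 + 2n + 3."""
    return 4 * n * n + 2 * n + 3

# f(n) <= 5*n^2 holds for every n from 3 up to 10000 ...
assert all(f(n) <= 5 * n * n for n in range(3, 10001))
# ... but not below x0 = 3, which is why the definition allows an x0
assert f(2) > 5 * 2 * 2
```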
What’s going on with average case size
• Assume, first of all, that what we are searching for is in the list (if not, the average case of the search would also depend on how often the item is absent from the array)
• In our search algorithms, the average case size can be thought of as the sum Σᵢ₌₀ⁿ pᵢ·dᵢ, where dᵢ is a possible number of steps and pᵢ is the probability that the search takes exactly dᵢ steps
[figure: an array of n unexamined cells]
Average work = (1*(1/n))…
Average Case Size of Binary Search Algorithm
[figure: binary search over an array of n sorted cells – the middle element is found in 1 step (probability 1/n), 2 elements are found in 2 steps, 4 elements in 3 steps, 8 elements in 4 steps, and so on]
Average work = (1*(1/n)) + (2*(2/n)) + (3*(4/n)) + (4*(8/n)) + …
• In other words, there is 1/n chance of 1 step, 2/n chance of 2 steps, 4/n chance of 3 steps, 8/n chance of 4 steps…
Average Case Size of Binary Search Algorithm
• Summing this series gives roughly log₂(n) − 1 steps on average, so the average case, like the worst case, is O(log n)
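The average-case behavior can be checked by brute force. This sketch (assuming the standard iterative binary search) averages the number of comparisons over every possible target in a sorted list:

```python
def avg_probes(n):
    """Average number of comparison steps binary search makes,
    averaged over all n possible (present) targets."""
    lst = list(range(n))
    total = 0
    for target in lst:
        lo, hi = 0, n - 1
        steps = 0
        while lo <= hi:
            mid = (lo + hi) // 2
            steps += 1            # count one probe of the array
            if lst[mid] == target:
                break
            elif lst[mid] < target:
                lo = mid + 1
            else:
                hi = mid - 1
        total += steps
    return total / n
```

For n = 15 this gives (1·1 + 2·2 + 3·4 + 4·8)/15 = 49/15 ≈ 3.27 steps on average, compared to 4 steps in the worst case.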
Running time of simple GCD algorithm
Theorem: The simple GCD algorithm, which tests every candidate divisor starting from min(x, y) and going down, runs in time O(min(x, y))
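The slide's code isn't reproduced in this transcript; a sketch of one common "simple" version (an assumption about which variant the lecture means), which tries every candidate from min(x, y) downward:

```python
def simple_gcd(x, y):
    """Greatest common divisor by exhaustive search:
    try every candidate from min(x, y) down to 1."""
    for d in range(min(x, y), 0, -1):
        if x % d == 0 and y % d == 0:
            return d
```

In the worst case (e.g., when x and y are coprime) the loop runs min(x, y) times, which is where the O(min(x, y)) bound comes from.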
Running time of Euclid’s GCD Algorithm
• How fast does Euclid’s algorithm terminate?
Useful Claim: if a ≥ b > 0, then a % b < a/2
Proof:
If b ≤ a/2, then a % b < b ≤ a/2
If b > a/2, then a % b = a − b < a/2
So every two iterations of Euclid's algorithm at least halve the first argument, which means the algorithm terminates after O(log a) iterations
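For contrast with the simple algorithm, here is Euclid's algorithm from the recorded lecture, written directly from the recurrence GCD(x, y) = GCD(y, x % y):

```python
def gcd(x, y):
    """Euclid's algorithm: repeatedly replace (x, y) with (y, x % y).
    By the useful claim, the first argument at least halves every
    two iterations, so the loop runs O(log x) times."""
    while y > 0:
        x, y = y, x % y
    return x
```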
Python Implements Lists
a = [1, 2, 3]
This results in Python setting aside, let’s say, 4 places for the list a, but only using the first 3
[figure: a points to a block of 4 cells holding 1, 2, 3, with one cell still free]
len(a) == 3
Python Implements Lists
a.append(4)
a becomes [1, 2, 3, 4]
[figure: a points to the same block of 4 cells, now holding 1, 2, 3, 4]
Here, append is O(1), free space was available
len(a) == 4
Python Implements Lists
Now do a.append(5)
No contiguous free space. Python finds new (double) space for the list, updates a, copies pointers to the new place
a becomes [1, 2, 3, 4, 5]
[figure: a points to a new block of 8 cells, holding 1, 2, 3, 4, 5]
Here, append is O(n), no free space was available
len(a) == 5
Python Implements Lists
[figure: a points to a full block of 8 cells, holding 1 through 8]
len(a) == 8
Now do a.append(9)
No contiguous free space is available
Python Implements Lists
[figure: a points to a new block of 16 cells, holding 1 through 9]
Again Python doubled the space and copied the list, so this append was O(n) – but because doubling makes copies rare, append is O(1) on average (amortized)
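The over-allocation is visible from Python itself. A sketch using sys.getsizeof (the exact growth pattern is a CPython implementation detail and may differ between versions):

```python
import sys

a = []
sizes = []
for i in range(32):
    a.append(i)
    sizes.append(sys.getsizeof(a))

# the allocated size only jumps at a few reallocation points,
# not on every append
growth_points = [i for i in range(1, len(sizes)) if sizes[i] != sizes[i - 1]]
print(growth_points)
```

Most appends leave the reported size unchanged; only the occasional append triggers a reallocation (and a copy).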
Hash Tables
A dictionary maps each key to a memory location and stores the value there.
We wish to store these key-value pairs:
  key     value
  hello   up
  there   wide
  frank   now
The dictionary computes a hash of each key (these values are from the lecture's run; string hashes vary between Python runs):
  print(hash("hello"))   # 7453177550310124801
  print(hash("world"))   # -9087840708672955053
  print(hash("frank"))   # -2055844616174617975
and reduces it modulo the table size (here 11) to choose a cell:
  print(hash("hello")%11)  # 3
  print(hash("there")%11)  # 0
  print(hash("frank")%11)  # 8
[figure: a table with cells 0–10: (there, wide) stored in cell 0, (hello, up) in cell 3, (frank, now) in cell 8]
• The hash function sends us to the right table location for storing and for retrieving a key-value pair
• Keys sometimes map into the same cell (a “collision”), but not too often if the table is large enough
Hash Tables with Collision
Now we also wish to store the pair:
  key     value
  fred    velma
The dictionary computes the hash and the table cell as before:
  print(hash("hello"))   # 7453177550310124801
  print(hash("world"))   # -9087840708672955053
  print(hash("fred"))    # -9162233691798532529
  print(hash("hello")%11)  # 3
  print(hash("there")%11)  # 0
  print(hash("fred")%11)   # 3
[figure: cells 0–10: (there, wide) in cell 0, (hello, up) in cell 3 – and (fred, velma) maps to cell 3 as well]
If there is a collision when storing the key-value, we can move forward and look for an empty location – and when we retrieve the key-value pair, we may need to visit more than one location
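Python's dict actually uses a more elaborate open-addressing scheme, but the "move forward" idea can be sketched with simple linear probing (the class and method names here are made up for illustration):

```python
class SimpleHashTable:
    """Toy hash table with linear probing.
    Simplification: assumes the table never fills up completely."""

    def __init__(self, size=11):
        self.size = size
        self.cells = [None] * size  # each cell: a (key, value) pair or None

    def _probe(self, key):
        # start at hash(key) % size and scan forward until we find
        # either the key itself or an empty cell
        i = hash(key) % self.size
        while self.cells[i] is not None and self.cells[i][0] != key:
            i = (i + 1) % self.size
        return i

    def store(self, key, value):
        self.cells[self._probe(key)] = (key, value)

    def retrieve(self, key):
        cell = self.cells[self._probe(key)]
        if cell is None:
            raise KeyError(key)
        return cell[1]
```

Storing and retrieving both follow the same probe sequence, which is why retrieval may need to visit more than one cell after a collision.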
Algorithms on Lists
Intro2CS – week 6
Sorting
• Sorting data is often done so that subsequent searching will be much easier (e.g., binary search)
• An absolutely fundamental set of algorithms in Computer Science
The sorting problem
Input: A list
Output: The same list, with the elements re-ordered from
smallest to largest.
[53, 12, 77, 23, 50] → [12, 23, 50, 53, 77]
Sorting Built Into Python
• Timsort
• “Timsort is a hybrid stable sorting algorithm,
derived from merge sort and insertion sort,
designed to perform well on many kinds of
real-world data…It was implemented by Tim
Peters in 2002 for use in the Python
programming language.”
Wikipedia
sorted()
>>> numbers = [6, 9, 3, 1]
>>> numbers_sorted = sorted(numbers)
>>> numbers_sorted
[1, 3, 6, 9]
>>> numbers
[6, 9, 3, 1]
Selection Sort (large to small sorting)
starting order:
18 35 22 97 84 55 61 10 47
search through list, find largest value, exchange with
first list value:
97 35 22 18 84 55 61 10 47
Continue the Select and Exchange Process (large to small
sorting)
search through rest of list, one less each time:
97 84 61 18 35 55 22 10 47
97 84 61 55 35 18 22 10 47
97 84 61 55 47 18 22 10 35
97 84 61 55 47 35 22 10 18
97 84 61 55 47 35 22 10 18
97 84 61 55 47 35 22 18 10
Selection Sort (small to large sorting)
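The code on these slides isn't captured in the transcript; a sketch of the small-to-large version, with find_smallest as a helper (matching the invariants discussed below):

```python
def find_smallest(lst, start):
    """Return the index of the smallest element in lst[start:]."""
    smallest = start
    for i in range(start + 1, len(lst)):
        if lst[i] < lst[smallest]:
            smallest = i
    return smallest

def selection_sort(lst):
    """Sort lst in place, smallest to largest, by repeatedly
    selecting the smallest remaining element and swapping it
    into position i."""
    for i in range(len(lst)):
        j = find_smallest(lst, i)
        lst[i], lst[j] = lst[j], lst[i]
    return lst
```

For example, selection_sort([53, 12, 77, 23, 50]) produces [12, 23, 50, 53, 77].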
Proof of Correctness: find_smallest
Invariants:
• lst always holds the same elements
• After the i’th iteration, lst[:i+1] holds the smallest
i+1 elements of lst in sorted order
Selection Sort – O(n²) Runtime
• The outer loop runs O(n) times, and each iteration calls find_smallest, which does O(n) work – so the total running time is O(n)·O(n) = O(n²)
Stable Sorting vs. Unstable Sorting Techniques
• A list might include elements with exactly the same “sorting value” (e.g., the list holds numbers, and we’re sorting on just one digit of each)
• Sorting algorithms that leave such elements in their original relative order are called stable, while sorting algorithms that may change the original order of those elements are called unstable
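Python's built-in sort (Timsort) is stable, so this is easy to see: sort pairs by their first component only, and ties keep their original left-to-right order (the data here is made up):

```python
data = [(2, "a"), (1, "b"), (2, "c"), (1, "d")]
# sort by the first component only; equal keys keep original order
by_first = sorted(data, key=lambda pair: pair[0])
print(by_first)  # [(1, 'b'), (1, 'd'), (2, 'a'), (2, 'c')]
```

An unstable sort would be free to return, say, (2, "c") before (2, "a").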
Merge Sort Intuition
Source: https://commons.wikimedia.org/wiki/File:Merge_sort_algorithm_diagram.svg
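The split-then-merge picture in the linked diagram can be turned into code; a sketch (not the lecture's implementation):

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:   # <= keeps the merge stable
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])       # leftovers are already sorted
    result.extend(right[j:])
    return result

def merge_sort(lst):
    """Split the list in half, sort each half recursively,
    then merge the two sorted halves: O(n log n)."""
    if len(lst) <= 1:
        return lst
    mid = len(lst) // 2
    return merge(merge_sort(lst[:mid]), merge_sort(lst[mid:]))
```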
QuickSort Intuition
Source: https://www.techiedelight.com/quicksort/
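Likewise, a compact sketch of the partition-and-recurse idea behind QuickSort (this list-building version is easy to read but not in-place, unlike the classic formulation):

```python
def quicksort(lst):
    """Pick a pivot, partition into smaller / equal / larger,
    and recurse on the two outer parts."""
    if len(lst) <= 1:
        return lst
    pivot = lst[len(lst) // 2]
    smaller = [x for x in lst if x < pivot]
    equal = [x for x in lst if x == pivot]
    larger = [x for x in lst if x > pivot]
    return quicksort(smaller) + equal + quicksort(larger)
```

On average the pivot splits the list roughly in half, giving O(n log n); a consistently bad pivot degrades this to O(n²).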