DSA Week1


CS 261
Data Structures and Algorithms

Nazeef Ul Haq
Fall 2023

Course Credits: Stanford-CS161


MULTIPLICATION!
What’s the best way to multiply two numbers?

MULTIPLICATION: THE PROBLEM

Input: two non-negative numbers, x and y (n digits each)
Output: the product x · y

     5678
   x 1234
  -------
  7006652

GRADE-SCHOOL MULTIPLICATION

     45
   x 63
   ----
    135
   2700
   ----
   2835

Algorithm description (informal*): compute partial products (using
multiplication & “carries” for digit overflows), and add all
(properly shifted) partial products together.

* This is not a good example of what your algorithm descriptions should look like on HW/quizzes

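The informal description above can be sketched directly in code. Here is an illustrative Python sketch (not the course’s reference code): each digit of y produces one carried partial product, which is then shifted and accumulated.

```python
def grade_school_multiply(x: int, y: int) -> int:
    """Multiply non-negative x and y by summing shifted partial products."""
    total = 0
    for shift, ch in enumerate(reversed(str(y))):
        d = int(ch)
        # Compute the partial product x * d digit by digit, with carries.
        partial_digits = []
        carry = 0
        for xch in reversed(str(x)):
            prod = int(xch) * d + carry
            partial_digits.append(prod % 10)   # keep the low digit
            carry = prod // 10                 # carry the overflow
        if carry:
            partial_digits.append(carry)
        partial = int("".join(map(str, reversed(partial_digits))))
        total += partial * 10 ** shift         # shift, then add into the sum
    return total
```

For the worked example, grade_school_multiply(45, 63) produces 135 and 2700 as the two partial products and returns 2835.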
GRADE-SCHOOL MULTIPLICATION

    45123456678093420581217332421      (n digits)
  x 63782384198347750652091236423

How efficient is this algorithm?
(How many single-digit operations are required in the worst case?)

● n partial products: ~2n² ops (at most n multiplications
  & n additions per partial product)
● adding n partial products: ~2n² ops
  (a bunch of additions & “carries”)

~4n² operations in the worst case

THE QUESTION IS... CAN WE DO BETTER?

WHAT EXACTLY DOES “BETTER” MEAN?

Is 1000000n operations better than 4n²?
Is 0.000001n³ operations better than 4n²?
Is 3n² operations better than 4n²?

● The answers for the first two depend on what value n is…
  ○ 1000000n < 4n² only when n exceeds a certain value (in this case, 250000)
● These constant multipliers are too environment-dependent...
  ○ An operation could be faster/slower depending on the machine, so 3n² ops on a
    slow machine might not be “better” than 4n² ops on a faster machine

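The 250000 crossover is easy to check by brute force. A small illustrative helper (the function name and defaults are mine, not from the slides):

```python
def crossover(a: int = 1000000, c: int = 4) -> int:
    """Smallest n where a*n < c*n**2, i.e. where the linear cost
    finally undercuts the quadratic one."""
    n = 1
    while a * n >= c * n * n:
        n += 1
    return n
```

Here crossover() returns 250001: for every n up to 250000, the “slow-looking” 4n² is actually cheaper than 1000000n.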
WHAT EXACTLY DOES “BETTER” MEAN? INTRODUCING...

ASYMPTOTIC ANALYSIS

Some guiding principles: we care about how the running time/number of
operations scales with the size of the input (i.e. the runtime’s rate of growth),
and we want some measure of runtime that’s independent of hardware,
programming language, memory layout, etc.
  ○ Note: details like hardware/language/memory/compiler/etc. could totally be
    important to real world engineers, but in TheoryLand™, we want to reason about
    high-level algorithmic approaches rather than lower-level details

ASYMPTOTIC ANALYSIS (High Level Idea)

We’ll express the asymptotic runtime of an algorithm using BIG-O NOTATION.

● We would say Grade-school Multiplication “runs in time O(n²)”
  (read “big-oh of n squared” or “Oh of n squared”)
  ○ Informally, this means that the runtime “scales like” n²
  ○ We’ll discuss the formal definition of Big-O (math-y stuff) in the next lecture

THE POINT OF ASYMPTOTIC NOTATION

Suppress constant factors and lower-order terms:
  constant factors → too system dependent
  lower-order terms → irrelevant for large inputs

ASYMPTOTIC ANALYSIS (High Level Idea)

[Figure: runtime (ms) vs. n (input size) for 0.1n^1.6 + 300 and 0.008n².
When n is small (up to a few hundred), the O(n²) curve 0.008n² lies below
the O(n^1.6) curve; when n gets bigger (into the thousands), the O(n²)
curve grows past it.]

ASYMPTOTIC ANALYSIS (High Level Idea)

● To compare algorithm runtimes in this class, we compare their Big-O runtimes
  ○ Ex: a runtime of O(n²) is considered “better” than a runtime of O(n³)
  ○ Ex: a runtime of O(n^1.6) is considered “better” than a runtime of O(n²)
  ○ Ex: a runtime of O(1/n) is considered “better” than O(1)

So the question is:

Can we multiply n-digit integers faster than O(n²)?

(Don’t worry, we’ll revisit Asymptotic Analysis & Big-O stuff more
formally in Lecture 2!)

DIVIDE AND CONQUER
algorithm design paradigm

● An algorithm design paradigm:
  1. break up a problem into smaller subproblems
  2. solve those subproblems recursively
  3. combine the results of those subproblems to get the overall answer

                      big problem
                     /           \
          sub-problem             sub-problem
          /         \             /         \
  sub-sub      sub-sub      sub-sub      sub-sub
  problem      problem      problem      problem

MULTIPLICATION SUBPROBLEMS

● Original large problem: multiply two 4-digit numbers
● What are the subproblems? Let’s unravel some stuff...

  1234 x 5678
  = ( 12×100 + 34 ) x ( 56×100 + 78 )
  = ( 12×56 )·100² + ( 12×78 + 34×56 )·100 + ( 34×78 )

  One 4-digit problem → Four 2-digit subproblems

● More generally, for two n-digit numbers:

  [x1x2...xn-1xn] x [y1y2...yn-1yn]
  = ( a·10^(n/2) + b ) x ( c·10^(n/2) + d )
  = ( a×c )·10^n + ( a×d + b×c )·10^(n/2) + ( b×d )

  One n-digit problem → Four (n/2)-digit subproblems

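The split step (“write x as a·10^(n/2) + b”) is just a divmod by 10^(n/2). A quick Python check of the 4-digit decomposition above (helper name is illustrative):

```python
def split(x: int, n: int) -> tuple[int, int]:
    """Write an n-digit x as (a, b) with x == a * 10**(n // 2) + b."""
    return divmod(x, 10 ** (n // 2))

a, b = split(1234, 4)   # a = 12, b = 34
c, d = split(5678, 4)   # c = 56, d = 78

# The three-term expansion equals the original product:
assert (a * c) * 100**2 + (a * d + b * c) * 100 + (b * d) == 1234 * 5678
```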
LET’S SEE SOME PSEUDOCODE

MULTIPLY( x, y ):                          // x & y are n-digit numbers
  if (n = 1):                              // Base case: we can just reference some
    return x·y                             //   memorized 1-digit multiplication tables
  write x as a·10^(n/2) + b                // a, b, c, & d are
  write y as c·10^(n/2) + d                //   (n/2)-digit numbers
  ac = MULTIPLY(a,c)
  ad = MULTIPLY(a,d)                       // These are recursive calls that
  bc = MULTIPLY(b,c)                       //   provide subproblem answers
  bd = MULTIPLY(b,d)
  return ac·10^n + (ad + bc)·10^(n/2) + bd // Add them up to get our overall answer!

Note: we’re making a big assumption that n is a power of 2 just to make
the pseudocode simpler.

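One way to turn this pseudocode into runnable Python, keeping the slides’ power-of-2 assumption and passing n explicitly (a sketch, not the course’s reference code):

```python
def multiply(x: int, y: int, n: int) -> int:
    """Divide-and-conquer product of two n-digit numbers (n a power of 2)."""
    if n == 1:
        return x * y                    # base case: 1-digit "table lookup"
    half = 10 ** (n // 2)
    a, b = divmod(x, half)              # write x as a*10^(n/2) + b
    c, d = divmod(y, half)              # write y as c*10^(n/2) + d
    ac = multiply(a, c, n // 2)         # four recursive
    ad = multiply(a, d, n // 2)         #   half-sized
    bc = multiply(b, c, n // 2)         #   subproblems
    bd = multiply(b, d, n // 2)
    return ac * half * half + (ad + bc) * half + bd   # combine
```

For example, multiply(1234, 5678, 4) returns 7006652.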
HOW EFFICIENT IS THIS ALGORITHM?

● Let’s start small: if we’re multiplying two 4-digit numbers, how many
  1-digit multiplications does the algorithm perform?
  ○ In other words, how many times do we reach the base case where we actually
    perform a “multiplication” (a.k.a. a table lookup)?
  ○ This at least lower bounds the number of operations needed overall

This is called a Recursion Tree:

                       [4 digits]
        /           /              \           \
  [2 digits]   [2 digits]    [2 digits]    [2 digits]
   / | | \      / | | \       / | | \       / | | \
  1  1 1  1    1  1 1  1     1  1 1  1     1  1 1  1   (1 digit each)

Sixteen 1-digit multiplications!

HOW EFFICIENT IS THIS ALGORITHM?

● Now let’s generalize: if we’re multiplying two n-digit numbers, how many
  1-digit multiplications does the algorithm perform?

Recursion Tree:
  Level 0: 1 problem of size n
  Level 1: 4^1 problems of size n/2
  ···
  Level t: 4^t problems of size n/2^t
  ···
  Level log₂n: n² problems of size 1

There are log₂n levels (you need to cut n in half log₂n times to get to
size 1), and the number of problems on the last level (size 1) is
4^(log₂n) = n^(log₂4) = n².

HOW EFFICIENT IS THIS ALGORITHM?

The running time of this Divide-and-Conquer multiplication algorithm
is at least O(n²)!

We know there are already n² multiplications happening at the bottom
level of the recursion tree, so that’s why we say “at least” O(n²).

Wait, our grade-school algorithm was already O(n²)!
Is Divide-and-Conquer really that useless?

Karatsuba says no!!!

KARATSUBA INTEGER MULTIPLICATION
Three subproblems instead of four!

CHOOSING SUBPROBLEMS WISELY

  [x1x2...xn-1xn] x [y1y2...yn-1yn]
  = ( a·10^(n/2) + b ) x ( c·10^(n/2) + d )
  = ( a×c )·10^n + ( a×d + b×c )·10^(n/2) + ( b×d )

The subproblems we choose to solve just need to provide these quantities:

  ac      ad + bc      bd

Originally, we assembled these quantities by computing FOUR things: ac, ad, bc, and bd.

KARATSUBA’S TRICK

end result = ( ac )·10^n + ( ad + bc )·10^(n/2) + ( bd )

ac & bd can be recursively computed as usual

ad + bc is equivalent to (a+b)(c+d) - ac - bd
                       = (ac + ad + bc + bd) - ac - bd
                       = ad + bc

So, instead of computing ad & bc as two separate subproblems,
let’s just compute (a+b)(c+d) instead!

OUR THREE SUBPROBLEMS

These three subproblems give us everything we need to compute our desired quantities:

  1. ac
  2. bd
  3. (a+b)(c+d)

(a+b) and (c+d) are both going to be roughly (n/2)-digit numbers (at most
one extra digit from a carry), so we still have half-sized subproblems!

Assemble our overall product by combining these three subproblems:

  ( ac )·10^n + ( ad + bc )·10^(n/2) + ( bd )

where ac is subproblem 1, bd is subproblem 2, and ad + bc comes from
subproblem 3 minus subproblems 1 and 2.

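Karatsuba’s three-subproblem recursion, sketched in Python under the same power-of-2 assumption as before. Note that a+b and c+d can carry into one extra digit; correctness is unaffected because the base case multiplies exactly.

```python
def karatsuba(x: int, y: int, n: int) -> int:
    """Karatsuba product of two n-digit numbers (n a power of 2)."""
    if n == 1:
        return x * y                         # base case lookup
    half = 10 ** (n // 2)
    a, b = divmod(x, half)
    c, d = divmod(y, half)
    ac = karatsuba(a, c, n // 2)             # subproblem 1
    bd = karatsuba(b, d, n // 2)             # subproblem 2
    cross = karatsuba(a + b, c + d, n // 2)  # subproblem 3: (a+b)(c+d)
    ad_plus_bc = cross - ac - bd             # Karatsuba's trick
    return ac * half * half + ad_plus_bc * half + bd
```

Only three recursive calls per level instead of four; e.g. karatsuba(1234, 5678, 4) still returns 7006652.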
WHAT’S THE RUNTIME?

This was the Recursion Tree + Analysis from Divide-and-Conquer Attempt 1:

  Level 0: 1 problem of size n
  Level 1: 4^1 problems of size n/2
  ···
  Level t: 4^t problems of size n/2^t
  ···
  Level log₂n: n² problems of size 1

log₂n levels (you need to cut n in half log₂n times to get to size 1);
# of problems on the last level (size 1) = 4^(log₂n) = n^(log₂4) = n².

For Karatsuba’s, we’ll replace the branching factor of 4 with a 3!


WHAT’S THE RUNTIME?

Karatsuba Multiplication Recursion Tree:

  Level 0: 1 problem of size n
  Level 1: 3^1 problems of size n/2
  ···
  Level t: 3^t problems of size n/2^t
  ···
  Level log₂n: n^1.6 problems of size 1

log₂n levels (you need to cut n in half log₂n times to get to size 1);
# of problems on the last level (size 1) = 3^(log₂n) = n^(log₂3) ≈ n^1.6.

Thus, the runtime is O(n^1.6)!

NOTE: I know it looks like we didn’t account for the work done on higher
levels in the recursion tree, but as we’ll learn later, the work on the
last level actually dominates in this particular recursion tree!

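The 4-vs-3 branching difference is easy to see by counting base cases. A small illustrative counter (assuming n is a power of 2, as in the slides):

```python
def base_case_count(n: int, branch: int) -> int:
    """Number of size-1 leaves for a recursion that splits a size-n
    problem into `branch` half-sized subproblems."""
    if n == 1:
        return 1
    return branch * base_case_count(n // 2, branch)
```

For n = 1024, the four-way recursion reaches 4^10 = 1024² = 1048576 base cases, while Karatsuba’s three-way recursion reaches only 3^10 = 59049, matching 3^(log₂n) = n^(log₂3).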

IT WORKS IN PRACTICE TOO!

[Figure: measured running times of the O(n²) grade-school algorithm vs.
the O(n^1.6) Karatsuba algorithm as n grows; the O(n²) curve grows much
faster.]

THE QUESTION IS... CAN WE DO BETTER?

CAN WE DO BETTER?

● Toom-Cook (1963): another Divide & Conquer! Instead of breaking into
  three (n/2)-sized problems, break into five (n/3)-sized problems.
  ○ Runtime: O(n^1.465)
● Schönhage–Strassen (1971): uses fast polynomial multiplications
  ○ Runtime: O(n log n log log n)
● Fürer (2007): uses Fourier Transforms over complex numbers
  ○ Runtime: O(n log(n) 2^O(log*(n)))
● Harvey and van der Hoeven (2019!): wild stuff
  ○ Runtime: O(n log(n))

We won’t expect you to know any of these algorithms, by the way!
