
Principles of Algorithmic Techniques

CS-E3190
Pekka Orponen
Department of Computer Science
Aalto University

Autumn 2017
Lecture 3: Basic arithmetic; divide-and-conquer I

• We begin by discussing elementary algorithms for computing with integers and their running times.
• Then we present the fundamental divide-and-conquer algorithm design paradigm, and show how it can be used to derive more efficient methods for arithmetic computations.

3.1 Basic arithmetic: bases and logs

• The number of digits needed to represent an integer N ≥ 0 in base b is the least integer n such that N ≤ b^n − 1, that is,

    n = ⌈log_b(N + 1)⌉.

• log_b N = (log_a N)/(log_a b)
• In big-O notation the base is irrelevant and omitted: O(log N)
• If omitted, the base is taken to be 2.
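
For example, N = 255 fits in n = ⌈log_2 256⌉ = 8 binary digits, while N = 256 already needs n = ⌈log_2 257⌉ = 9.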

Logs in algorithms

log_2 N is a close estimate of:

• the power to which you need to raise 2 in order to get N
• the number of times N must be halved to get down to 1
• the number of bits in the binary representation of N
• the depth of a complete binary tree with N nodes
• the sum ∑_{i=1}^{N} 1/i (actually closer to ln N, but within a constant factor anyway)
• etc.
Basic arithmetic: addition
Lemma. The sum of any three single-digit numbers is at most two digits long.
Proof. For any base b ≥ 2, we have 3(b − 1) ≤ (b + 1)(b − 1) = (b − 1)·b + (b − 1), and the right-hand side is the largest two-digit number in base b, which proves the claim.
Binary addition with carry:

    Carry:  1     1 1 1
              1 1 0 1 0 1   (53)
            + 1 0 0 0 1 1   (35)
            -------------
            1 0 1 1 0 0 0   (88)

By the previous lemma, the carry is always at most one bit.
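
To make the procedure concrete, here is a minimal Python sketch of schoolbook addition on bit lists; the representation (least significant bit first) and the function name are illustrative choices, not part of the lecture.

    def add_binary(x_bits, y_bits):
        """Schoolbook binary addition of bit lists (LSB first)."""
        n = max(len(x_bits), len(y_bits))
        x_bits = x_bits + [0] * (n - len(x_bits))  # pad to common length n
        y_bits = y_bits + [0] * (n - len(y_bits))
        result, carry = [], 0
        for i in range(n):
            s = x_bits[i] + y_bits[i] + carry  # at most 3, so two digits suffice
            result.append(s % 2)               # sum bit
            carry = s // 2                     # carry bit, always 0 or 1
        if carry:
            result.append(carry)               # x + y has at most n + 1 bits
        return result

    # 53 + 35 = 88, with bits listed least significant first:
    assert add_binary([1, 0, 1, 0, 1, 1], [1, 1, 0, 0, 0, 1]) == [0, 0, 0, 1, 1, 0, 1]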

Complexity of the algorithm

• Let n be the maximum of the bit-lengths of the integers x and y.
• Then the bit-length of x + y is at most n + 1, and the time taken to do the addition is c_0 + c_1·n, where c_0 and c_1 are constants.
• Hence the algorithm is of linear, i.e. O(n), time complexity.
• Is there a faster method?

Basic arithmetic: grade-school multiplication

    Carry:  1 1 1 1
                  1 1 0 1   (13)
                × 1 0 1 1   (11)
                ---------
                  1 1 0 1   (   1 × 1101)
                1 1 0 1     (  10 × 1101)
              0 0 0 0       ( 100 × 1101)
            + 1 1 0 1       (1000 × 1101)
            ---------------
            1 0 0 0 1 1 1 1 (143)

How long does this computation take?

Complexity of the algorithm

• If x and y are both n-bit integers, then there are n intermediate rows, with lengths of up to 2n bits (with shifting).
• The total time taken to add up these rows, summing two numbers (= rows) at a time, is

    O(n) + O(n) + ⋯ + O(n)  (n − 1 additions)  =  O(n^2).

Al Khwarizmi's strikeout algorithm

      y     x
     11    13
      5    26     (halve y, double x)
      2    52     (strike out rows with even y)
      1   104
          ---
          143     (answer: the sum of x over the surviving rows, 13 + 26 + 104)

Same as binary multiplication! (Note: halve ≈ shift right, double ≈ shift left.)
How long does this computation take?
• n recursive calls (= rows), since y is n bits long
• at each call: one halving, a test whether the remainder is even or odd, one doubling, and possibly one addition of integers of length O(n), for a total of O(n) bit operations
• a total of O(n^2) bit operations

Can we do better?
Multiplication à la Française (1/2)

    x·y = 2·(x·⌊y/2⌋)      if y is even
    x·y = x + 2·(x·⌊y/2⌋)  if y is odd

    13·11 = 13 + 2·(13·5)
          = 13 + 26·5
          = 13 + 26 + 2·(26·2)
          = 13 + 26 + 2·52
          = 13 + 26 + 104 = 143

Multiplication à la Française (2/2)

Algorithm 1: Multiplication à la Française
    function multiply(x, y)
    Input: Two n-bit integers x and y, where y ≥ 0
    Output: The product x·y
    if y = 0 then
        return 0
    else
        set z = multiply(x, ⌊y/2⌋)
    end
    if y is even then
        return 2z
    else
        return x + 2z
    end

Complexity is again O(n^2).
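
The same algorithm as a runnable Python sketch (the final assert is an illustrative check, not from the slides):

    def multiply(x, y):
        """Multiplication à la française, as in Algorithm 1; assumes y >= 0."""
        if y == 0:
            return 0
        z = multiply(x, y // 2)    # recurse on floor(y/2)
        if y % 2 == 0:
            return 2 * z           # y even: x*y = 2*(x*floor(y/2))
        else:
            return x + 2 * z       # y odd:  x*y = x + 2*(x*floor(y/2))

    assert multiply(13, 11) == 143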
Division
Algorithm 2: Division
    function divide(x, y)
    Input: Two n-bit integers x and y, where y ≥ 1
    Output: The quotient and remainder of x divided by y
    if x = 0 then
        return (q, r) = (0, 0)
    else
        set (q, r) = divide(⌊x/2⌋, y)
        q = 2·q, r = 2·r
        if x is odd then
            r = r + 1
        end
        if r ≥ y then
            r = r − y, q = q + 1
        end
        return (q, r)
    end
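
Again as a Python sketch (the function name and the final check are illustrative):

    def divide(x, y):
        """Recursive division, as in Algorithm 2; assumes y >= 1.
        Returns (q, r) with x = q*y + r and 0 <= r < y."""
        if x == 0:
            return (0, 0)
        q, r = divide(x // 2, y)   # recurse on floor(x/2)
        q, r = 2 * q, 2 * r
        if x % 2 == 1:
            r = r + 1
        if r >= y:
            r, q = r - y, q + 1
        return (q, r)

    assert divide(143, 11) == (13, 0)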

3.2 The divide-and-conquer paradigm
Recall the merge sort algorithm: a dramatic improvement in algorithm complexity (from O(n^2) to O(n·log n)) by recursive partitioning:

    function MergeSort(A[1…n])
    if n = 1 then return
    else
        Introduce auxiliary arrays A′[1…⌊n/2⌋], A″[1…⌈n/2⌉]
        A′ ← A[1…⌊n/2⌋]
        A″ ← A[⌊n/2⌋+1…n]
        MergeSort(A′)
        MergeSort(A″)
        Merge(A′, A″, A)
    end
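
For reference, a compact Python version (returning a new list rather than sorting in place, unlike the pseudocode above):

    def merge_sort(a):
        """Merge sort on a list; returns a new sorted list."""
        n = len(a)
        if n <= 1:
            return a
        left = merge_sort(a[:n // 2])     # A'
        right = merge_sort(a[n // 2:])    # A''
        # Merge(A', A'', A): interleave the two sorted halves.
        out, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                out.append(left[i])
                i += 1
            else:
                out.append(right[j])
                j += 1
        return out + left[i:] + right[j:]

    assert merge_sort([5, 2, 4, 6, 1, 3]) == [1, 2, 3, 4, 5, 6]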

Large integer multiplication
• Let us try to apply the partitioning idea to the problem of multiplying two n-bit integers X and Y.
• Partition each of X and Y into two n/2-bit numbers and perform the multiplications recursively between those. (Assume for simplicity n = 2^k, k = 0, 1, 2, ….)
• More precisely, partition as follows:

    X = A·2^{n/2} + B,  where A is the high and B the low half of X (n/2 bits each),
    Y = C·2^{n/2} + D,  where C is the high and D the low half of Y.

• This leads to the recursive computation rule:

    X·Y = A·C·2^n + (A·D + B·C)·2^{n/2} + B·D.

• Denote by T(n) the number of elementary bit operations needed to multiply two n-bit numbers following this rule.
• The computation involves four multiplications of n/2-bit numbers, by recursion, and three additions of n-bit numbers, which can each be done in O(n) time.
• We thus obtain the following recurrence equation for T(n):

    T(1) = c_1
    T(n) = 4·T(n/2) + c_2·n,  n = 2^k, k = 0, 1, 2, …

  The solution to this is T(n) = (c_1 + c_2)·n^2 − c_2·n, as can be verified by substitution: 4·((c_1 + c_2)·(n/2)^2 − c_2·n/2) + c_2·n = (c_1 + c_2)·n^2 − c_2·n.

• This is still O(n^2), so no improvement over the grade-school algorithm was achieved.

• Karatsuba & Ofman (1962): the product X·Y can be obtained from A, B, C, D with just three multiplications instead of four:

    X·Y = A·C·2^n + [(A − B)·(D − C) + A·C + B·D]·2^{n/2} + B·D
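
  (Indeed, (A − B)·(D − C) = A·D − A·C − B·D + B·C, so adding A·C + B·D leaves exactly the middle term A·D + B·C; the three multiplications are A·C, B·D and (A − B)·(D − C).)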

• This leads to the recurrence:

    T(1) = c_1
    T(n) = 3·T(n/2) + c_2·n,  n = 2^k, k = 0, 1, 2, …

Let us solve the equations by naive unwinding:

    T(n) = 3·T(n/2) + c_2·n
         = 3·(3·T(n/4) + c_2·n/2) + c_2·n
         = 3^2·T(n/2^2) + c_2·n·((3/2) + 1)
         …
         = 3^i·T(n/2^i) + c_2·n·((3/2)^{i−1} + (3/2)^{i−2} + ⋯ + 1)
         …
         = 3^k·T(n/2^k) + c_2·n·((3/2)^{k−1} + (3/2)^{k−2} + ⋯ + 1)
         = 3^k·c_1 + c_2·n·((3/2)^k − 1)/((3/2) − 1)
         = 3^{log_2 n}·c_1 + 2·c_2·n·((3/2)^{log_2 n} − 1)
         = n^{log_2 3}·c_1 + 2·c_2·(n^{log_2 3} − n)
         = (c_1 + 2·c_2)·n^{log_2 3} − 2·c_2·n

Thus the repeated application of the Karatsuba-Ofman trick at all levels of the recursion leads to an O(n^{log_2 3}) = O(n^{1.59…}) algorithm. This is an order-of-growth improvement over the simple O(n^2) method.
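
A runnable Python sketch of the scheme (splitting at half the larger bit-length; the sign handling is needed because A − B and D − C may be negative):

    def karatsuba(x, y):
        """Karatsuba-Ofman multiplication of integers. A sketch only:
        a real implementation would recurse down to the machine word
        length (see note 3 below), not to single bits."""
        if x < 0 or y < 0:                 # signs arise from (A-B)*(D-C)
            sign = -1 if (x < 0) != (y < 0) else 1
            return sign * karatsuba(abs(x), abs(y))
        if x < 2 or y < 2:                 # base case: a single-bit factor
            return x * y
        half = max(x.bit_length(), y.bit_length()) // 2
        A, B = x >> half, x & ((1 << half) - 1)   # x = A*2^half + B
        C, D = y >> half, y & ((1 << half) - 1)   # y = C*2^half + D
        ac = karatsuba(A, C)
        bd = karatsuba(B, D)
        mid = karatsuba(A - B, D - C) + ac + bd   # = A*D + B*C
        return (ac << (2 * half)) + (mid << half) + bd

    assert karatsuba(13, 11) == 143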
Additional notes
1. Because of the constant factors involved, the grade-school method is actually more efficient than the Karatsuba-Ofman algorithm up to about 500-bit numbers. On the other hand, numbers of this size and bigger are routinely used in e.g. cryptographic applications.
2. In principle, the multiplication of very large numbers can be done in time O(n·log_2 n·log_2 log_2 n) by an algorithm of Schönhage and Strassen, applying Fourier transform techniques. It is an open problem whether multiplication can be done in time O(n).
3. A note on implementation: in practice the partitioning of numbers in the Karatsuba-Ofman technique should of course not be continued all the way to single-bit numbers, but only to the level of the word length of the underlying hardware. At this point the multiplications can be completed by basic machine operations.

Divide-and-conquer, or problem partitioning

• General idea: partition a problem instance x into subinstances x_1, x_2, …, x_m, solve these recursively yielding subsolutions y_1, y_2, …, y_m, and compose the eventual solution y corresponding to x from these (see the sketch below).
• The method is efficient if the decrease in time complexity resulting from dealing with smaller subinstances compensates for the time needed for the partitioning and composition.
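
The paradigm as a generic, runnable Python skeleton; the four function-valued parameters (is_small, solve_directly, partition, compose) are hypothetical names standing for the problem-specific ingredients, and the list-summing instantiation is only a toy example:

    def divide_and_conquer(x, is_small, solve_directly, partition, compose):
        """Generic divide-and-conquer skeleton."""
        if is_small(x):
            return solve_directly(x)
        subsolutions = [divide_and_conquer(xi, is_small, solve_directly,
                                           partition, compose)
                        for xi in partition(x)]    # solve x1, ..., xm
        return compose(subsolutions)               # build y from y1, ..., ym

    # Toy instantiation: summing a list by recursive halving.
    assert divide_and_conquer(
        list(range(10)),
        is_small=lambda x: len(x) <= 1,
        solve_directly=lambda x: x[0] if x else 0,
        partition=lambda x: (x[:len(x) // 2], x[len(x) // 2:]),
        compose=sum,
    ) == 45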

A master theorem for recurrences
The solution T(n) to the recurrence equation

    T(1) = c_1 > 0,
    T(n) = a·T(n/b) + c·n^d,  for n = b^k, k = 1, 2, …

satisfies:

    T(n) = O(n^d),          if d > log_b a and c ≠ 0,
    T(n) = O(n^d·log n),    if d = log_b a and c ≠ 0,
    T(n) = O(n^{log_b a}),  if d < log_b a or c = 0.
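
For example, the Karatsuba-Ofman recurrence above has a = 3, b = 2, d = 1; since 1 = d < log_2 3 ≈ 1.585, the third case gives T(n) = O(n^{log_2 3}), matching the unwinding computation.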

The result can be proved by elementary, but somewhat tedious, unwinding of the recurrence, or by the following high-level computation-tree argument.

Solving recurrences
Master theorem (weak variant)
Suppose that a > 0, b > 1, C ≥ 0, d ≥ 0 are constants such that

    T(1) ≤ C,
    T(n) ≤ a·T(n/b) + C·n^d,  n = b^j, j = 1, 2, …

Then,

    T(n) = O(n^d)          if d > log_b a;
    T(n) = O(n^d·log n)    if d = log_b a;
    T(n) = O(n^{log_b a})  if d < log_b a.

Proof (1/2). Consider the recursion tree defined by the recurrence T(n) ≤ a·T(n/b) + C·n^d, and the time used at each node; then take the sum of the times over all nodes and simplify.

[Figure: the recursion tree. The root, of size n, costs C·n^d; each node with n > 1 has a child nodes; the a children of the root each cost C·(n/b)^d, the a^2 nodes at the next level each cost C·(n/b^2)^d, and so on.]

Proof (2/2)

    Level j    Size       #Nodes/level     Time/node
    0          n          1                C·n^d
    1          n/b        a                C·n^d·b^{−d}
    2          n/b^2      a^2              C·n^d·b^{−2d}
    …          …          …                …
    j          n/b^j      a^j              C·n^d·b^{−jd}
    …          …          …                …
    log_b n    1          a^{log_b n}      C

Summing over all levels,

    T(n) ≤ C·n^d · ∑_{j=0}^{log_b n} (a/b^d)^j =
      O(n^d)          if b^d > a;
      O(n^d·log n)    if b^d = a;
      O(n^{log_b a})  if b^d < a.
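
The sum is a geometric series with ratio a/b^d. When b^d > a it is bounded by a constant, so the root term C·n^d dominates; when b^d = a, each of the log_b n + 1 levels contributes the same amount C·n^d; and when b^d < a the last level dominates, contributing on the order of a^{log_b n} = n^{log_b a}.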

Matrix multiplication

• Task: Given two n × n matrices A = (a_ij) and B = (b_ij), compute the product C = AB, where

    c_ij = ∑_{k=1}^{n} a_ik·b_kj.

• Standard method: Matrix C has n^2 entries; computing each takes Θ(n) operations, for a total of Θ(n^3) operations.
• A divide-and-conquer solution was developed by V. Strassen in 1969.

Strassen's algorithm (1/2)
Idea: The product of two 2 × 2 matrices can be computed with just seven multiplications instead of the obvious eight.
Assume for simplicity that n = 2^k for some k ≥ 1, and partition A and B into n/2 × n/2 blocks:

    A = [ A11  A12 ]        B = [ B11  B12 ]
        [ A21  A22 ]            [ B21  B22 ]

Compute the seven block products:

    M1 = (A21 + A22 − A11)·(B22 − B12 + B11)
    M2 = A11·B11
    M3 = A12·B21
    M4 = (A11 − A21)·(B22 − B12)
    M5 = (A21 + A22)·(B12 − B11)
    M6 = (A12 − A21 + A11 − A22)·B22
    M7 = A22·(B11 + B22 − B12 − B21)
Strassen's algorithm (2/2)
Then:

    AB = [ M2 + M3             M1 + M2 + M5 + M6 ]
         [ M1 + M2 + M4 − M7   M1 + M2 + M4 + M5 ]

In addition to the 7 submatrix multiplications for M1, …, M7, one thus also needs to perform 24 submatrix additions or subtractions, each of complexity Θ(n^2).
Nevertheless this results in an asymptotic improvement:

    T(1) ≤ c_1
    T(n) ≤ 7·T(n/2) + c·n^2,  for n = 2^k, k ≥ 1

    2 < log_2 7  ⇒  T(n) = O(n^{log_2 7}) = O(n^{2.807…})
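
A NumPy sketch of the scheme above; the crossover point n0 = 64 and the use of random integer test matrices are illustrative assumptions (see note 1 below for the breakeven issue):

    import numpy as np

    def strassen(A, B, n0=64):
        """Strassen's algorithm for n x n matrices, n a power of two.
        Falls back to the direct method below the assumed crossover n0."""
        n = A.shape[0]
        if n <= n0:
            return A @ B                    # direct method at small sizes
        h = n // 2                          # split into n/2 x n/2 blocks
        A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
        B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
        M1 = strassen(A21 + A22 - A11, B22 - B12 + B11, n0)
        M2 = strassen(A11, B11, n0)
        M3 = strassen(A12, B21, n0)
        M4 = strassen(A11 - A21, B22 - B12, n0)
        M5 = strassen(A21 + A22, B12 - B11, n0)
        M6 = strassen(A12 - A21 + A11 - A22, B22, n0)
        M7 = strassen(A22, B11 + B22 - B12 - B21, n0)
        C = np.empty_like(A)                # assemble the four blocks of AB
        C[:h, :h] = M2 + M3
        C[:h, h:] = M1 + M2 + M5 + M6
        C[h:, :h] = M1 + M2 + M4 - M7
        C[h:, h:] = M1 + M2 + M4 + M5
        return C

    # Sanity check against the direct method:
    rng = np.random.default_rng(0)
    A = rng.integers(0, 10, (128, 128))
    B = rng.integers(0, 10, (128, 128))
    assert (strassen(A, B) == A @ B).all()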

Additional notes

1. Again, the recursion should not be continued until the level


n = 1, but only to the breakeven point n = n0 , where the
standard direct method becomes faster than further recursion.
2. Typically, Strassens algorithm is faster than the direct method for
n & 1500.
3. The asymptotically best currently known matrix multiplication
algorithm is of complexity O(n2.373 ) [V. Williams 2012], but
because of the size of the constant factors involved this result is
mainly of theoretical interest.
