
Peace be upon you, and the mercy and blessings of Allah.
In the name of Allah, the Most Gracious, the Most Merciful.
We begin with the holy name of Allah, the Most Beneficent, the Most Merciful.
Data Structure & Algorithms
COSC-1102

Aqeel-Ur-Rehman


[email protected]
Week # 04

Algorithm Analysis
Lecture Recap
In our last session we covered the following:
• Asymptotic Analysis
• Justification for analysis
• Quadratic and polynomial growth
• Counting machine instructions
• Landau symbols
• Big-Θ as an equivalence relation
• Little-o as a weak ordering
Lecture Agenda
Today we will cover the following:
• We will introduce machine instructions
• We will calculate the run times of:
• Operators +, -, =, +=, ++, etc.
• Control statements if, for, while, do-while, switch
• Functions
• Recursive functions
Motivation
• The goal of algorithm analysis is to take a block of code and determine the asymptotic run time or asymptotic memory requirements based on various parameters
• Given an array of size n:
• Selection sort requires Θ(n²) time
• Merge sort, quick sort, and heap sort all require Θ(n ln(n)) time
• However:
• Merge sort requires Θ(n) additional memory
• Quick sort requires Θ(ln(n)) additional memory
• Heap sort requires Θ(1) additional memory
Motivation
The asymptotic behaviour of algorithms indicates the ability to scale
• Suppose we are sorting an array of size n

Selection sort has a run time of Θ(n²)
• 2n entries require (2n)² = 4n²
• Four times as long to sort
• 10n entries require (10n)² = 100n²
• One hundred times as long to sort
Motivation
The other sorting algorithms have Θ(n ln(n)) run times
• 2n entries require (2n) ln(2n) = (2n)(ln(n) + ln(2)) = 2(n ln(n)) + 2 ln(2) n
• 10n entries require (10n) ln(10n) = (10n)(ln(n) + ln(10)) = 10(n ln(n)) + 10 ln(10) n

In each case, there is only an additional Θ(n) term

However:
• Merge sort will require twice and 10 times as much memory
• Quick sort will require one or four additional memory locations
• Heap sort will not require any additional memory
Motivation
We will see algorithms which run in Θ(n^lg(3)) time
• lg(3) ≈ 1.585
• 2n entries require (2n)^lg(3) = 2^lg(3) n^lg(3) = 3 n^lg(3)
• Three times as long to sort
• 10n entries require (10n)^lg(3) = 10^lg(3) n^lg(3) ≈ 38.5 n^lg(3)
• About 38 times as long to sort
Motivation

Binary search runs in Θ(ln(n)) time:
• Doubling the size requires only one additional comparison
Motivation
If we are storing objects which are not related, the hash table has, in many cases, optimal characteristics:
• Many operations are Θ(1)
• I.e., the run times are independent of the number of objects being stored

If we are required to store both objects and relations, both memory and time will increase
• Our goal will be to minimize this increase
Motivation
To properly investigate the determination of run times asymptotically:
• We will begin with machine instructions
• Discuss operations
• Control statements
• Conditional statements and loops
• Functions
• Recursive functions
Machine Instructions
Any given processor is capable of performing only a limited number of operations

These operations are called instructions

The collection of instructions is called the instruction set
• The exact set of instructions differs between processors
• MIPS, ARM, x86, 6800, 68k
• You will see another in the ColdFire in ECE 222
• Derived from the 68000, which was derived from the 6800
Machine Instructions
Any instruction runs in a fixed amount of time (an integral number of CPU cycles)

An example on the ColdFire is:
0x06870000000F
which adds 15 to the 7th data register

As humans are not good at hex, this can be programmed in assembly language as
ADDI.L #$F, D7
• More in ECE 222
Machine Instructions
Assembly language has an almost one-to-one translation to machine instructions
• Assembly language is a low-level programming language

Other programming languages are higher-level: Fortran, Pascal, Matlab, Java, C++, and C#

The adjective “high” refers to the level of abstraction:
• Java, C++, and C# have abstractions such as OO
• Matlab and Fortran have operations which do not map to a relatively small number of machine instructions:
>> 1.27^2.9 % 1.27**2.9 in Fortran
2.0000036616123606774
Machine Instructions
The C programming language (C++ without objects and other abstractions) can be referred to as a mid-level programming language
• There is abstraction, but the language is closely tied to the standard capabilities of a processor
• There is a closer relationship between operators and machine instructions

Consider the operation a += b;
• Assume that the compiler already has the value of the variable a in register D1 and perhaps b is a variable stored at the location stored in address register A1; this is then converted to the single instruction
ADD (A1), D1
Operators
Because each machine instruction can be executed in a fixed number of cycles, we may assume each operation requires a fixed number of cycles
• The time required for any operator is Θ(1), including:
• Retrieving/storing variables from memory
• Variable assignment =
• Integer operations + - * / % ++ --
• Logical operations && || !
• Bitwise operations & | ^ ~
• Relational operations == != < <= >= >
• Memory allocation and deallocation new delete
Operators
Of these, memory allocation and deallocation are the slowest by a significant factor
• A quick test on eceunix shows a factor of over 100
• They require communication with the operating system
• This does not account for the time required to call the constructor and destructor

Note that after memory is allocated, the constructor is run
• The constructor may not run in Θ(1) time (see the sketch below)
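For instance, here is a minimal hypothetical sketch (the class Buffer and its contents are our own illustration, not course code) of a constructor whose run time is Θ(m) rather than Θ(1):

#include <cstddef>

// Hypothetical class whose constructor zero-initializes m entries:
// the allocation itself is a fixed cost, but the constructor body
// is a Theta(m) loop.
class Buffer {
    std::size_t size;
    int *data;

public:
    Buffer( std::size_t m ) : size( m ), data( new int[m] ) {
        for ( std::size_t i = 0; i < size; ++i ) {
            data[i] = 0;          // runs m times: Theta(m)
        }
    }

    ~Buffer() {
        delete[] data;            // deallocation: back to the operating system
    }
};

int main() {
    Buffer b( 1000 );             // construction costs Theta(1000), not Theta(1)
    return 0;
}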
Blocks of Operations
Each operation runs in Θ(1) time and therefore any fixed number of operations also runs in Θ(1) time, for example:
// Swap variables a and b
int tmp = a;
a = b;
b = tmp;

// Update a sequence of values
// ece.uwaterloo.ca/~ece250/Algorithms/Skip_lists/src/Skip_list.h
++index;
prev_modulus = modulus;
modulus = next_modulus;
next_modulus = modulus_table[index];
Blocks of Operations
Seldom will you find large blocks of operations without any additional control statements

This example rearranges an AVL tree structure:
Tree_node *lrl = left->right->left;
Tree_node *lrr = left->right->right;
parent = left->right;
parent->left = left;
parent->right = this;
left->right = lrl;
left = lrr;

Run time: Θ(1)
Blocks in Sequence
Suppose you have now analyzed a number of blocks of code run in sequence
template <typename T>
void update_capacity( int delta ) {
    T *array_old = array;                        // Θ(1)
    int capacity_old = array_capacity;
    array_capacity += delta;
    array = new T[array_capacity];

    for ( int i = 0; i < capacity_old; ++i ) {
        array[i] = array_old[i];                 // Θ(n)
    }

    delete[] array_old;                          // Θ(1) or Ω(n)
}

To calculate the total run time, add the entries: Θ(1 + n + 1) = Θ(n)
Blocks in Sequence
• This is taken from code found at http://ece.uwaterloo.ca/~ece250/Algorithms/Sparse_systems/

template <int M, int N>
Matrix<M, N> &Matrix<M, N>::operator= ( Matrix<M, N> const &A ) {
    if ( &A == this ) {                          // Θ(1)
        return *this;
    }

    if ( capacity != A.capacity ) {              // Θ(1)
        delete [] column_index;
        delete [] off_diagonal;
        capacity = A.capacity;
        column_index = new int[capacity];
        off_diagonal = new double[capacity];
    }

    for ( int i = 0; i < minMN; ++i ) {          // Θ(min(M, N))
        diagonal[i] = A.diagonal[i];
    }

    for ( int i = 0; i <= M; ++i ) {             // Θ(M)
        row_index[i] = A.row_index[i];
    }

    for ( int i = 0; i < A.size(); ++i ) {       // Θ(n)
        column_index[i] = A.column_index[i];
        off_diagonal[i] = A.off_diagonal[i];
    }

    return *this;                                // Θ(1)
}

Total: Θ(1 + 1 + min(M, N) + M + n + 1) = Θ(M + n)
Note that min(M, N) ≤ M, but we cannot say anything about the relative sizes of M and n
Blocks in Sequence
Other examples include:
• Run three blocks of code which are Θ(1), Θ(n²), and Θ(n)
Total run time Θ(1 + n² + n) = Θ(n²)
• Run two blocks of code which are Θ(n ln(n)) and Θ(n^1.5)
Total run time Θ(n ln(n) + n^1.5) = Θ(n^1.5)

Recall this linear ordering from the previous topic
• When considering a sum, take the dominant term
Control Statements
Next we will look at the following control statements

These are statements which potentially alter the execution of instructions
• Conditional statements
if, switch
• Condition-controlled loops
for, while, do-while
• Count-controlled loops
for i from 1 to 10 do ... end do; # Maple
• Collection-controlled loops
foreach ( int i in array ) { ... } // C#
Control Statements
Given any collection of nested control statements, it is always
necessary to work inside out
• Determine the run times of the inner-most statements and work your
way out
Control Statements
Given
if ( condition ) {
    // true body
} else {
    // false body
}

The run time of a conditional statement is:
• the run time of the condition (the test), plus
• the run time of the body which is run

In most cases, the run time of the condition is Θ(1)
Control Statements
In some cases, it is easy to determine which statement must be run:

int factorial( int n ) {
    if ( n == 0 ) {
        return 1;
    } else {
        return n * factorial( n - 1 );
    }
}
Control Statements
In others, it is less obvious
• Find the maximum entry in an array:

int find_max( int *array, int n ) {
    int max = array[0];

    for ( int i = 1; i < n; ++i ) {
        if ( array[i] > max ) {
            max = array[i];
        }
    }

    return max;
}
Analysis of Statements
In this case, we don’t know how many times the assignment to max will be run

If we had information about the distribution of the entries of the array, we may be able to determine it:
• if the list is sorted (ascending), it will be run in every pass through the loop
• if the list is sorted (descending), it will be run only once (the initial assignment)
• if the list is uniformly randomly distributed, then???
Condition-controlled Loops
The C++ for loop is a condition-controlled statement:
for ( int i = 0; i < N; ++i ) {
    // ...
}

is identical to
int i = 0;           // initialization
while ( i < N ) {    // condition
    // ...
    ++i;             // increment
}
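The agenda also lists do-while; a minimal sketch (our own example, following the same pattern) shows the only difference: the condition is tested after the body, so the body always runs at least once:

int i = 0;              // initialization
do {
    // ...
    ++i;                // increment
} while ( i < N );      // condition: checked after each pass, so the
                        // body executes at least once even if N <= 0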
Condition-controlled Loops
The initialization, condition, and increment usually are single statements running in Θ(1)

for ( int i = 0; i < N; ++i ) {
    // ...
}
Condition-controlled Loops
The initialization, condition, and increment statements are usually Θ(1)

For example,
for ( int i = 0; i < n; ++i ) {
    // ...
}

Assuming there are no break or return statements in the loop, the run time is Ω(n)
Condition-controlled Loops
If the body does not depend on the variable (in this example, i), then the run time of
for ( int i = 0; i < n; ++i ) {
    // code which is Θ(f(n))
}

is Θ(n f(n))

If the body is O(f(n)), then the run time of the loop is O(n f(n))
Condition-controlled Loops
For example,
int sum = 0;

for ( int i = 0; i < n; ++i ) {
    sum += 1;    // Θ(1)
}

This code has run time
Θ(n·1) = Θ(n)
Condition-controlled Loops
Another example:
int sum = 0;

for ( int i = 0; i < n; ++i ) {
    for ( int j = 0; j < n; ++j ) {
        sum += 1;    // Θ(1)
    }
}

The previous example showed that the inner loop is Θ(n), thus the outer loop is
Θ(n·n) = Θ(n²)
Analysis of Repetition Statements
Suppose with each pass through the loop, we perform a linear search of an array of size m:
for ( int i = 0; i < n; ++i ) {
    // search through an array of size m
    // O(m)
}

The inner search is O(m) and thus the outer loop is O(n m); a concrete sketch follows
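A sketch of this pattern (the function names and parameters here are our own, for illustration):

// Linear search: O(m), since it may return early.
bool contains( int *array, int m, int value ) {
    for ( int j = 0; j < m; ++j ) {
        if ( array[j] == value ) {
            return true;
        }
    }

    return false;
}

// For each of n queries, scan the array of size m.
int count_found( int *queries, int n, int *array, int m ) {
    int count = 0;

    for ( int i = 0; i < n; ++i ) {
        if ( contains( array, m, queries[i] ) ) {   // O(m) per iteration
            ++count;
        }
    }

    return count;                                   // total: O(n m)
}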
Conditional Statements

Consider this example:

void Disjoint_sets::clear() {
    if ( sets == n ) {                  // Θ(1)
        return;
    }

    max_height = 0;                     // Θ(1)
    num_disjoint_sets = n;

    for ( int i = 0; i < n; ++i ) {     // Θ(n)
        parent[i] = i;
        tree_height[i] = 0;             // Θ(1)
    }
}

T_clear(n) = Θ(1) if sets = n
T_clear(n) = Θ(n) otherwise
Analysis of Repetition Statements
If the body does depend on the variable (in this example, i), then the run time of
for ( int i = 0; i < n; ++i ) {
    // code which is Θ(f(i, n))
}

is Θ(1 + Σ_{i=0}^{n−1} (1 + f(i, n))), and if the body is O(f(i, n)), the result is O(1 + Σ_{i=0}^{n−1} (1 + f(i, n)))
Analysis of Repetition Statements
For example,
int sum = 0;

for ( int i = 0; i < n; ++i ) {
    for ( int j = 0; j < i; ++j ) {
        sum += i + j;
    }
}

The inner loop is Θ(1 + i), hence the outer loop is
Θ(1 + Σ_{i=0}^{n−1} (1 + (1 + i))) = Θ(1 + 2n + n(n − 1)/2) = Θ(n²)
Analysis of Repetition Statements
As another example:
int sum = 0;

for ( int i = 0; i < n; ++i ) {
    for ( int j = 0; j < i; ++j ) {
        for ( int k = 0; k < j; ++k ) {
            sum += i + j + k;
        }
    }
}

From inside to out, the run times are:
Θ(1)
Θ(j)
Θ(i²)
Θ(n³)
Control Statements
Switch statements appear to be nested if statements:

switch( i ) {
case 1: /* do stuff */ break;
case 2: /* do other stuff */ break;
case 3: /* do even more stuff */ break;
case 4: /* well, do stuff */ break;
case 5: /* tired yet? */ break;
default: /* do default stuff */
}
Control Statements
Thus, a switch statement would appear to run in O(n) time
where n is the number of cases, the same as nested if
statements
• Why then not use:

if ( i == 1 ) { /* do stuff */ }
else if ( i == 2 ) { /* do other stuff */ }
else if ( i == 3 ) { /* do even more stuff */ }
else if ( i == 4 ) { /* well, do stuff */ }
else if ( i == 5 ) { /* tired yet? */ }
else { /* do default stuff */ }
Control Statements
Question:
Why would you introduce something into a programming language which is redundant?

There are reasons for this:
• your name is Larry Wall and you are creating the Perl (not PERL) programming language
• you are introducing software engineering constructs, for example, classes
Control Statements
However, switch statements were included in the original C language... why?

First, you may recall that the cases must be actual constant values, either:
• integers
• characters

For example, you cannot have a case with a variable, e.g.,
case n: /* do something */ break; // bad
Control Statements
The compiler looks at the different cases and calculates an appropriate jump

For example, assume:
• the cases are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
• each case requires a maximum of 24 bytes (for example, six instructions)

Then the compiler simply computes a jump based on the variable, jumping ahead either 0, 24, 48, 72, ..., or 240 bytes; a sketch of the idea follows
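The effect can be sketched in C++ with a table of function pointers (a hand-written stand-in for what the compiler generates; the handler names are hypothetical):

void case_0() { /* do stuff */ }
void case_1() { /* do other stuff */ }
void case_2() { /* do even more stuff */ }
void default_case() { /* do default stuff */ }

void dispatch( int i ) {
    // One table lookup and one indirect call: Theta(1),
    // independent of the number of cases.
    static void (*jump_table[])() = { case_0, case_1, case_2 };

    if ( i >= 0 && i < 3 ) {
        jump_table[i]();
    } else {
        default_case();     // values outside the table go to the default
    }
}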
Serial Statements
Suppose we run one block of code followed by another block of code

Such code is said to be run serially

If the first block of code is O(f(n)) and the second is O(g(n)), then the run time of the two blocks of code is
O( f(n) + g(n) )
which usually (for algorithms not including function calls) simplifies to one or the other
Serial Statements
Consider the following two problems:
• search through a random list of size n to find the maximum entry, and
• search through a random list of size n to find if it contains a particular entry

What is the proper means of describing the run time of these two algorithms?
Serial Statements
Searching for the maximum entry requires that each element in the array be examined, thus, it must run in Θ(n) time

Searching for a particular entry may end earlier: for example, the first entry we check may be the one we are looking for, thus, it runs in O(n) time
Serial Statements
Therefore:
• if the leading term is big-Θ, then the result must be big-Θ, otherwise
• if the leading term is big-O, we can say the result is big-O

For example,
O(n) + O(n²) + O(n⁴) = O(n + n² + n⁴) = O(n⁴)
O(n) + Θ(n²) = Θ(n²)
O(n²) + Θ(n) = O(n²)
O(n²) + Θ(n²) = Θ(n²)
Functions
A function (or subroutine) is code which has been separated out, either to:
• avoid repeating operations
• e.g., mathematical functions
• group related tasks
• e.g., initialization
Functions
Because a subroutine (function) can be called from anywhere,
we must:
• prepare the appropriate environment
• deal with arguments (parameters)
• jump to the subroutine
• execute the subroutine
• deal with the return value
• clean up
Functions
Fortunately, this is such a common task that all modern processors have instructions that perform most of these steps in one instruction

Thus, we will assume that the overhead required to make a function call and to return is Θ(1)
• We will discuss this later (stacks/ECE 222)
Functions
Because any function requires the overhead of a function call and return, we will always assume that
T_f = Ω(1)

That is, it is impossible for any function call to have a zero run time
Functions
Thus, given a function f(n) (the run time of which depends on n), we will denote the run time of f(n) by some function T_f(n)
• We may shorten this to T(n)

Because the run time of any function is at least Ω(1), we will include the time required to both call and return from the function in the run time
Functions
Consider this function:
void Disjoint_sets::set_union( int m, int n ) {
    m = find( m );                                // 2 T_find
    n = find( n );

    if ( m == n ) {
        return;
    }

    --num_disjoint_sets;

    if ( tree_height[m] >= tree_height[n] ) {     // the remainder is Θ(1)
        parent[n] = m;

        if ( tree_height[m] == tree_height[n] ) {
            ++( tree_height[m] );
            max_height = std::max( max_height, tree_height[m] );
        }
    } else {
        parent[m] = n;
    }
}

T_set_union = 2 T_find + Θ(1)
Recursive Functions

A function is relatively simple (and boring) if it simply performs operations and calls other functions

Most interesting functions designed to solve problems usually end up calling themselves
• Such a function is said to be recursive
Recursive Functions

As an example, we could implement the factorial function recursively:

int factorial( int n ) {
    if ( n <= 1 ) {
        return 1;                         // Θ(1)
    } else {
        return n * factorial( n - 1 );    // T!(n − 1) + Θ(1)
    }
}
Recursive Functions

Thus, we may analyze the run time of this function as follows:
T!(n) = Θ(1) if n = 1
T!(n) = T!(n − 1) + Θ(1) if n > 1

We don’t have to worry about the time of the conditional (Θ(1)), nor is there a probability involved with the conditional statement
Recursive Functions

The analysis of the run time of this function yields a recurrence relation:
T!(n) = T!(n − 1) + Θ(1)    T!(1) = Θ(1)

This recurrence relation has Landau symbols…
• Replace each Landau symbol with a representative function:
T!(n) = T!(n − 1) + 1    T!(1) = 1
Recursive Functions

Thus, to find the run time of the factorial function, we need to solve
T!(n) = T!(n − 1) + 1    T!(1) = 1

The fastest way to solve this is with Maple:
> rsolve( {T(n) = T(n - 1) + 1, T(1) = 1}, T(n) );
n

Thus, T!(n) = Θ(n)
Recursive Functions
Unfortunately, you don’t have Maple on the examination, thus, we can examine the first few steps:
T!(n) = T!(n − 1) + 1
      = T!(n − 2) + 1 + 1 = T!(n − 2) + 2
      = T!(n − 3) + 3

From this, we see a pattern:
T!(n) = T!(n − k) + k
Recursive Functions

If k = n − 1 then
T!(n) = T!(n − (n − 1)) + n − 1
      = T!(1) + n − 1
      = 1 + n − 1 = n

Thus, T!(n) = Θ(n)
Recursive Functions

Incidentally, we may write the factorial function using the ternary ?: operator
• Ternary operators take three arguments; ?: is the only ternary operator in C++

int factorial( int n ) {
    return (n <= 1) ? 1 : n * factorial( n - 1 );
}
Recursive Functions

Suppose we want to sort an array of n items

We could:
• go through the list and find the largest item
• swap the last entry in the list with that largest item
• then, go on and sort the rest of the array

This is called selection sort

Recursive Functions
void sort( int *array, int n ) {
    if ( n <= 1 ) {
        return;                  // special case: 0 or 1 items are always sorted
    }

    int posn = 0;                // assume the first entry is the largest
    int max = array[posn];

    for ( int i = 1; i < n; ++i ) {    // search through the remaining entries
        if ( array[i] > max ) {        // if a larger one is found
            posn = i;                  // update both the position and value
            max = array[posn];
        }
    }

    int tmp = array[n - 1];      // swap the largest entry with the last
    array[n - 1] = array[posn];
    array[posn] = tmp;

    sort( array, n - 1 );        // sort everything else
}
Recursive Functions
We could call this function as follows:

int array[7] = {5, 8, 3, 6, 2, 4, 7};
sort( array, 7 );    // sort an array of seven items
Recursive Functions
The first call finds the largest element and swaps it into the last position; the next call finds the 2nd-largest, the third finds the 3rd-largest, and so on through the 4th, 5th, and 6th, after which the array is sorted
Recursive Functions
Analyzing the function, we get:
T(n) = Θ(1) if n = 1
T(n) = T(n − 1) + Θ(n) if n > 1

Recursive Functions
Thus, replacing each Landau symbol with a representative, we are required to solve the recurrence relation
T(n) = T(n − 1) + n    T(1) = 1

The easy way to solve this is with Maple:
> rsolve( {T(n) = T(n - 1) + n, T(1) = 1}, T(n) );
n(n + 1)/2
> expand( % );
n²/2 + n/2
Recursive Functions
Consequently, the sorting routine has the run time
T(n) = Θ(n²)
To see this by hand, consider the following:
T(n) = T(n − 1) + n
     = T(n − 2) + (n − 1) + n
     = T(n − 3) + (n − 2) + (n − 1) + n
     = ⋮
     = T(1) + Σ_{i=2}^{n} i = Σ_{i=1}^{n} i = n(n + 1)/2
Recursive Functions
Consider, instead, a binary search of a sorted list:
• Check the middle entry
• If we do not find it, check either the left- or right-hand side, as appropriate

Thus, T(n) = T((n − 1)/2) + Θ(1); a sketch of such a search follows
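The slides analyze the recurrence without listing code; a minimal recursive sketch (our own, assuming the array is sorted in ascending order) is:

// Search array[first..last] for value; each call does Theta(1) work
// and recurses on one half of roughly (n - 1)/2 entries.
bool binary_search( int *array, int first, int last, int value ) {
    if ( first > last ) {
        return false;                        // empty range: not found
    }

    int middle = first + (last - first)/2;   // check the middle entry

    if ( array[middle] == value ) {
        return true;
    } else if ( array[middle] < value ) {
        return binary_search( array, middle + 1, last, value );
    } else {
        return binary_search( array, first, middle - 1, value );
    }
}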
Recursive Functions
Also, if n = 1, then T(1) = Θ(1)

Thus we have to solve:
T(n) = 1 if n = 1
T(n) = T((n − 1)/2) + 1 if n > 1

Solving this can be difficult, in general, so we will consider only special values of n

Assume n = 2^k − 1 where k is an integer

Then (n − 1)/2 = (2^k − 1 − 1)/2 = 2^(k−1) − 1
Recursive Functions
For example, searching a list of size 31 requires us to check the center entry

If it is not found, we must check one of the two halves, each of which is of size 15
31 = 2⁵ − 1
15 = 2⁴ − 1
Recursive Functions

Thus, we can write
T(n) = T(2^k − 1)
     = T((2^k − 1 − 1)/2) + 1
     = T(2^(k−1) − 1) + 1
     = T((2^(k−1) − 1 − 1)/2) + 1 + 1
     = T(2^(k−2) − 1) + 2
Recursive Functions
Notice the pattern with one more step:
T(n) = T(2^(k−1) − 1) + 1
     = T(2^(k−2) − 1) + 2
     = T(2^(k−3) − 1) + 3
Recursive Functions
Thus, in general, we may deduce that after k − 1 steps:
T(n) = T(2^(k−(k−1)) − 1) + k − 1
     = T(1) + k − 1
     = k
because T(1) = 1
Recursive Functions
Thus, T(n) = k, but n = 2^k − 1
Therefore k = lg(n + 1)

However, recall that f(n) = Θ(g(n)) if
lim_{n→∞} f(n)/g(n) = c where 0 < c < ∞

Using l'Hôpital's rule:
lim_{n→∞} lg(n + 1)/ln(n) = lim_{n→∞} (1/((n + 1) ln(2)))/(1/n) = lim_{n→∞} n/((n + 1) ln(2)) = 1/ln(2)

Thus, T(n) = Θ(lg(n + 1)) = Θ(ln(n))

Cases

As well as determining the run time of an algorithm, because the data may not be deterministic, we may be interested in:
• Best-case run time
• Average-case run time
• Worst-case run time

In many cases, these will be significantly different
Cases
Searching a list linearly is simple enough

We will count the number of comparisons


• Best case:
• The first element is the one we’re looking for: O(1)
• Worst case:
• The last element is the one we’re looking for, or it is not in the list: O(n)
• Average case?
• We need some information about the list...
Cases
Assume the entry we are looking for is in the list and is equally likely to be in any location

If the list is of size n, then there is a 1/n chance of it being in the kth location

Thus, we sum
(1/n) Σ_{k=1}^{n} k = (1/n) · n(n + 1)/2 = (n + 1)/2

which is O(n); a sketch of the search being counted follows
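A sketch of the linear search whose comparisons are being counted (our own illustration):

// Returns the index of value, or -1 if absent. If value sits in the
// k-th location (k = 1, ..., n), the loop performs exactly k comparisons,
// which is what the (n + 1)/2 average above counts.
int linear_search( int *array, int n, int value ) {
    for ( int i = 0; i < n; ++i ) {
        if ( array[i] == value ) {       // comparison number i + 1
            return i;
        }
    }

    return -1;                           // n comparisons when absent
}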
Cases

Suppose we have a different distribution:
• there is a 50% chance that the element is the first
• for each subsequent element, the probability is reduced by ½

We could write:
Σ_{k=1}^{n} k/2^k ≤ Σ_{k=1}^{∞} k/2^k = ?
Cases
You’re not expected to know this for the final, however, for interest:

> sum( k/2^k, k = 1..infinity );
2

Thus, the average case is O(1); a quick numeric check follows
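If you would rather not trust Maple, a short numeric check (our own) of the partial sums:

#include <cmath>
#include <iostream>

// Partial sums of sum_{k >= 1} k/2^k rapidly approach 2,
// confirming the O(1) average case.
int main() {
    double sum = 0.0;

    for ( int k = 1; k <= 60; ++k ) {
        sum += k/std::pow( 2.0, k );
    }

    std::cout << sum << std::endl;    // prints 2 (to double precision)
    return 0;
}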
Cases
Previously, we had an example where we were looking for the number of times a particular assignment statement was executed:

int find_max( int *array, int n ) {
    int max = array[0];

    for ( int i = 1; i < n; ++i ) {
        if ( array[i] > max ) {
            max = array[i];
        }
    }

    return max;
}
Cases
This example is taken from Preiss
• The best case was once (first element is largest)
• The worst case was n times

For the average case, we must consider:
• What is the probability that the ith object is the largest of the first i objects?
Cases
To consider this question, we must assume that elements in the array are evenly distributed

Thus, given a sub-list of size k, the probability that any one element is the largest is 1/k

Thus, given a value i, there are i + 1 objects, hence
Σ_{i=0}^{n−1} 1/(i + 1) = Σ_{i=1}^{n} 1/i = ?
Cases
We can approximate the sum by an integral: the area under y = 1/(x + 1) from x = 0 to x = n
Cases
From calculus:
n n 1
1 1 n 1

0
x 1
dx   dx ln( x) 1 ln(n  1)  ln(1) ln(n  1)
1
x

How about the error? Our approximation would be useless if


the error was O(n)
Cases

Consider the following image which highlights the errors
• The errors can be fit into the box [0, 1] × [0, 1]
Cases
Consequently, the error must be < 1

In fact, it converges to the Euler–Mascheroni constant γ ≈ 0.57721566490
• Therefore, the error is Θ(1)
Cases
Thus, the number of times that the assignment statement will be executed, assuming an even distribution, is O(ln(n)); a numeric check follows
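A quick numeric check (our own) that the harmonic sum behaves like ln(n) plus the constant γ:

#include <cmath>
#include <iostream>

// H(n) = sum_{i=1}^{n} 1/i is the expected number of executions of the
// assignment; H(n) - ln(n) converges to gamma = 0.57721...
int main() {
    int const n = 1000000;
    double harmonic = 0.0;

    for ( int i = 1; i <= n; ++i ) {
        harmonic += 1.0/i;
    }

    std::cout << harmonic - std::log( (double) n ) << std::endl;  // ~0.5772
    return 0;
}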
Cases

Thus, the total run time of:

int find_max( int *array, int n ) {
    int max = array[0];

    for ( int i = 1; i < n; ++i ) {
        if ( array[i] > max ) {
            max = array[i];
        }
    }

    return max;
}

is
Θ(1 + Σ_{i=1}^{n−1} (1 + 1/(i + 1))) = Θ(n + ln(n)) = Θ(n)
Lecture Recap
Today we covered the following:
• Machine instructions
• The run times of:
• Operators +, -, =, +=, ++, etc.
• Control statements if, for, while, do-while, switch
• Functions
• Recursive functions
End of Lecture
04
