TCC 236/05 Data Structures and Algorithms
Unit 4: Algorithms and Performance
COURSE TEAM
Course Team Coordinator: Mr. Ishan Sudeera Abeywardena
Content Writers: Ms. Neeta Deshpande, Ms. Seema Gondhalekar, Dr. Lichade and Mr. Ishan Sudeera
Abeywardena
Instructional Designer: Ms. Marnisya Rahim
Academic Member: Mr. Vincent Chung Sheng Hung
COURSE COORDINATOR
Dr. Lim Ting Yee
PRODUCTION
In-house Editor: Ms. Marnisya Rahim
Graphic Designer: Ms. Audrey Yeong
Wawasan Open University is Malaysia’s first private not-for-profit tertiary institution dedicated to
adult learners. It is funded by the Wawasan Education Foundation, a tax-exempt entity established
by the Malaysian People’s Movement Party (Gerakan) and supported by the Yeap Chor Ee Charitable
and Endowment Trusts, other charities, corporations, members of the public and occasional grants
from the Government of Malaysia.
The course material development of the university is funded by Yeap Chor Ee Charitable and
Endowment Trusts.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or
transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or
otherwise, without prior written permission from WOU.
Contents

Unit 4 Algorithms and Performance
    Unit overview
    Unit objectives
    Objectives
    Introduction
    Performance of a program
    Objectives
    Introduction
    Objectives
    Introduction
    Base case
    Inductive hypothesis
    Permutations
    Combinations
    Summary of Unit 4
    References
    Glossary
Unit 4 Algorithms and Performance
Unit Overview
There may exist more than one algorithm for a problem. When comparing two
different algorithms that solve the same problem, we will often find that one
is an order of magnitude more efficient than the other. In that case, it only
makes sense that we be able to recognise and choose the more efficient
algorithm. In this unit we try to understand what makes a specific algorithm
efficient and how to measure that efficiency.
We will also discuss recursion and recursive functions, which are commonly
used in algorithmic problem solving.
Unit Objectives
By the end of Unit 4, you should be able to:
Introduction
In this section, you will learn how discrete mathematics relates to data structures.
Before we proceed, let us understand what happens when a program is executed
and how we might evaluate a program, for example by determining if it is readable.
We may also analyse program execution time and the storage complexity
associated with it, for example, how fast the program runs and how much
storage it requires. Related questions are: how big must its data structures
be, and how many steps are required to execute its algorithm?

Since this unit discusses discrete mathematics in data structures, we will
analyse programs in terms of storage and time complexity, which together we
call performance, before we move on to recursive functions.
Performance of a program
In considering the performance of a program, we are primarily interested in
its time efficiency (how fast it runs) and its space efficiency (how much
memory it requires).
We often find that we can trade time efficiency for space efficiency, or vice-versa.
To estimate either of these, we need some measure of the problem size. Let us
assume that some number N represents the size of the problem. N can reflect
one or more features of the problem; for instance, N might be the number of
input data values, or the number of elements in an array.
Suppose that we are given two algorithms for finding the largest value in a
set of N numbers, and that for every N the second algorithm executes twice as
many instructions as the first. Let the first algorithm execute S instructions
per input value, so that it executes N × S instructions in total while the
second executes 2N × S. If each instruction takes 1 unit of time, say
1 millisecond, then for N = 10, 100, 1000 and 10 000 the number of operations
and the estimated execution times are as given below.
N          Algorithm 1                            Algorithm 2
           Instructions    Est. execution time    Instructions    Est. execution time
10         10S             10S msec               20S             20S msec
100        100S            100S msec              200S            200S msec
1000       1000S           1000S msec             2000S           2000S msec
10 000     10 000S         10 000S msec           20 000S         20 000S msec

(1 millisec = 1 msec = 1/1000 sec)

Table 4.1
Notice that for larger values of N, the difference between the execution times
of the two algorithms becomes appreciable, and one may clearly say that
Algorithm 2 is slower than Algorithm 1; the gap widens as the problem size
grows. Note, however, that this improvement is only by a constant factor:
Algorithm 1 is two times faster, and it remains exactly two times faster
regardless of the problem size. What interests us more is how the number of
steps an algorithm takes grows in relation to the problem size, which is
expressed as an order of growth.
• If the problem size doubles and the algorithm takes one more step, we
relate the number of steps to the problem size by O(log2 N ). It is read as
order of log N.
• If the problem size doubles and the algorithm takes twice as many steps,
the number of steps is related to the problem size by O(N ), i.e., order of
N, i.e., number of steps is directly proportional to N.
• If the problem size doubles and the algorithm takes more than twice as
many steps, i.e., if the number of steps required grows faster than the
problem size, we use an expression such as O(N log2 N ). Here the number
of steps grows faster than the problem size itself, but not as fast as
its square.
The approximate values of some common growth rate functions are compared below:

O(1) < O(log2 n) < O(n) < O(n log2 n) < O(n^2) < O(n^3) < O(2^n)
[Figure: the growth rate functions log2 n, n, n log2 n, n^2, n^3 and 2^n
plotted for n from 1 to 50; the value of the growth rate function rises
slowest for log2 n and fastest for 2^n.]
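To get a feel for these growth rates, you can tabulate them yourself. The short Java program below is our own illustrative sketch (it is not part of the original course listing); it prints each growth rate function for a few values of n:

public class GrowthRates {
    public static void main(String[] args) {
        // Tabulate the common growth rate functions for a few problem sizes.
        System.out.println("n\tlog2 n\tn log2 n\tn^2\tn^3\t2^n");
        for (int n : new int[] {1, 10, 20, 30, 40, 50}) {
            double log2n = Math.log(n) / Math.log(2);   // log2 n via natural logs
            System.out.printf("%d\t%.3f\t%.1f\t%d\t%d\t%.0f%n",
                    n, log2n, n * log2n,
                    (long) n * n, (long) n * n * n, Math.pow(2, n));
        }
    }
}

Running it shows the same picture as the figure: 2^n dwarfs every other function long before n reaches 50.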
Activity 4.1

1.
A. data size
B. problem size
C. input
D. process

2.
A. input
B. data values
C. instructions
D. variables

3.
A. exact
B. expected
C. estimated
D. entered
Consider an unsorted integer array of size n. The sort described here searches
for the smallest element in the array and then interchanges it with the element
at the first position, a[0]. It then searches for the next smallest element
from positions 1 to n − 1 and swaps it with the element at the second position,
a[1], and so on (this strategy is usually called selection sort). A sketch of
the coding is given below.
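The original listing is not reproduced here; the following minimal Java sketch is consistent with the description above and with the operation counts that follow (one comparison plus a conditional assignment in the inner loop, and a three-assignment swap completing each outer pass):

static void selectionSort(int[] a) {
    int n = a.length;
    for (int i = 0; i < n; i++) {
        int min = i;                     // position of the smallest element so far
        for (int j = i + 1; j < n; j++) {
            if (a[j] < a[min]) {         // one comparison per inner iteration
                min = j;                 // an assignment, roughly half the time
            }
        }
        int temp = a[i];                 // three assignments complete each outer
        a[i] = a[min];                   // iteration by swapping the smallest
        a[min] = temp;                   // remaining element into position i
    }
}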
Study the coding above. The question is, "How many operations does this
program execute?" There are two loops, one outer and one inner. The outer loop
runs n times, while the number of times the inner loop runs changes on every
iteration.
For n = 5, the inner loop runs as follows:

Outer counter i    Inner loop runs    Times
0                  1 to 4             4
1                  2 to 4             3
2                  3 to 4             2
3                  4 to 4             1
4                  5 to 4             no execution
The execution of the inner loop depends on the counter of the outer loop. For
each run of the inner loop, we have a comparison and an assignment. We do not
always get an assignment due to the if clause. If we are interested in the average
case, we would assume that about half the time, the if clause is true. And then we
get the three assignments, each time we complete the outer loop.
This gives us the following number of operations carried out on average (where we
ignore the fact that incrementing loop variables, etc. costs a bit of time, too):
f(n) = Σ_{i=0}^{n−1} ( Σ_{j=i+1}^{n−1} (1 + 1/2) + 3 )
     = (3/2) × n(n − 1)/2 + 3n
     = (3/4)n^2 + (9/4)n
Therefore this algorithm has complexity O(n^2), which is not a very good
complexity class, and we will see that there are better sorting algorithms.
One thing to note is that when considering sorting algorithms, we often
measure complexity via the number of comparisons alone, ignoring things such
as assignments. If we count only comparisons in the example above, we get
f(n) = (n − 1) + (n − 2) + ... + 1
     = n(n − 1)/2
     = n^2/2 − n/2
Therefore the algorithm again has complexity O(n^2); the complexity class
remains the same. That is why we can restrict ourselves to counting the number
of comparisons involved, and this yields a good measure for comparing the
various sorting algorithms. Still, when considering the complexity of a
particular algorithm, what exactly to count is always a judgement call.
It is also a good idea to keep track of other factors, in particular those
that go with the dominating subterm. In the example above, the factor applied
to the dominating subterm n^2 is 3/4. Such factors matter in practice. For a
big problem, a linear algorithm will eventually beat a quadratic one, but if
we know that the problem has a size of at most 100, then a complexity of
order n^2/10 is preferable to one of order 1 000 000n. And if the algorithm
is known to be applied only to fairly small inputs, then the simplest
algorithm may be the best choice, since it is easier to program and the
programming time saved outweighs the running time.
The complexity of an algorithm M is the function f(n) which gives the running
time or storage space requirement of the algorithm in terms of the size n of
the input data. Frequently, the storage space required by an algorithm is
simply a multiple of the data size n; a problem with n = 200 then costs twice
as much as a problem with n = 100, and in general the cost of solving the
problem grows with n.
Such linear dependence on n holds only for simple algorithms. At the lower end
of the scale we have algorithms with logarithmic dependence on n, while at the
higher end we have algorithms with exponential dependence on n. With
increasing n, the relative difference in the cost of a computation between
these two extremes is enormous. The rate of increase of f(n) is usually
examined by comparing f(n) with standard functions such as log2 n, n log2 n,
n^2, n^3 and 2^n.
The following table illustrates the comparative cost for a range of n values.
n        n^2      n^3      2^n        log2 n    n log2 n
2        4        8        4          1         2
10       10^2     10^3     >10^3      3.322     33.22
10^2     10^4     10^6     >10^25     6.644     664.4
10^3     10^6     10^9     >10^250    9.966     9966.0
10^4     10^8     10^12    >10^2500   13.287    132877

Table 4.3
The values in Table 4.3 show that we can solve only very small problems with
an algorithm that exhibits exponential behaviour. Even if we assume that a
computer can do about one million operations per second, an exponential
algorithm with n = 100 would take immeasurably long to terminate. On the
other hand, an algorithm with logarithmic dependence on n would require only
about 13 steps for a problem with n = 10^4, which is about 13 microseconds of
computer time. This example stresses how important it is to understand the
way the complexity f(n) of an algorithm M increases as the problem size n
increases.
In analysing any given algorithm, there are two measures of performance. These
are the worst case and average case behaviours. These two measures can be applied
to both the time and space complexity of an algorithm. The worst-case complexity
for a given problem size n corresponds to the maximum complexity encountered
among all problems of size n. In many practical applications it is much more
important to have a measure of the expected complexity of a given algorithm
rather than the worst case behaviour. The expected complexity gives a measure of
the behaviour of the algorithm averaged over all the possible problems of size n.
Consider, for example, a linear search: the algorithm examines the list items
one by one, comparing each with the search value x, and exits as soon as x is
found or the list is exhausted. A sketch is given below.
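A Java sketch of such a linear search is given below; the method name and the convention of returning −1 for an unsuccessful search are our own illustrative choices:

static int linearSearch(int[] data, int item) {
    for (int i = 0; i < data.length; i++) {
        if (data[i] == item) {   // one comparison per element examined
            return i;            // found: report its position
        }
    }
    return -1;                   // not found after examining all n values
}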
In the worst case it is necessary to examine all n values in the list before
terminating, i.e., the algorithm requires f (n) = n + 1 comparisons. However,
the average case is somewhat different. It is generally assumed that all
possible points of termination are equally likely, that is, the probability
that x is found at position 1, 2, 3, ... and so on is 1/n in each case.
Using the standard mathematical formula 1 + 2 + 3 + ... + n = n(n + 1)/2, the

average search cost = (1/n) × n(n + 1)/2 = (n + 1)/2.
Let us take another example: the binary search algorithm. Binary search is
closely related to the binary search tree, in which the data is systematically
arranged: the values in the left subtree < the root node value < the values in
the right subtree. You will learn more about searching algorithms in Unit 5.
At this stage, you just need to understand the impact of the choice of
algorithm on the cost.
Step 1: [Initialise] Set BEG := 1, END := n and MID := INT((BEG + END)/2)
Step 2: Repeat steps 2.1 and 2.2 while BEG ≤ END and
        DATA[MID] not = ITEM
    2.1: If ITEM < DATA[MID], then:
             Set END := MID − 1
         Else:
             Set BEG := MID + 1
         [End of If structure]
    2.2: Set MID := INT((BEG + END)/2)
    [End of Loop]
Step 3: If DATA[MID] = ITEM, then set LOC := MID; else set LOC := NULL
Step 4: Exit
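The same algorithm can be written in Java as follows. This is a sketch assuming the array is sorted in ascending order; returning −1 to signal an unsuccessful search is our own convention:

static int binarySearch(int[] data, int item) {
    int beg = 0;
    int end = data.length - 1;
    int mid = (beg + end) / 2;
    while (beg <= end && data[mid] != item) {
        if (item < data[mid]) {
            end = mid - 1;           // discard the upper half
        } else {
            beg = mid + 1;           // discard the lower half
        }
        mid = (beg + end) / 2;       // middle of the remaining portion
    }
    return (beg <= end) ? mid : -1;  // -1 signals that ITEM is not present
}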
Each pass of the loop halves the portion of the list still to be searched, so
in the worst case the number of comparisons f(n) satisfies 2^f(n) > n; that
is, the worst-case cost is approximately log2 n comparisons.
Here we have to find the average number of iterations of the search loop that is
required before the algorithm terminates in a successful search.
One element (the middle one) can be found with 1 comparison, two elements with
2 comparisons each, four elements with 3 comparisons each and, in general,
2^i elements with i + 1 comparisons each.
Assume that all items present in the array are equally likely to be retrieved
(i.e., each with probability 1/n). Then the average search cost is the sum of
all possible search costs, each multiplied by its associated probability, i.e.:

average search cost = (1/n) Σ_{i=0}^{⌊log2 n⌋} (i + 1) × 2^i

The sum Σ i × 2^i that appears here can be evaluated using the identity
Σ i × x^i = x × d/dx ( Σ x^i ).

Hence, the average search cost = ((n + 1)⌊log2 n⌋ + 1)/n ≈ ⌊log2 n⌋ for large n.
Hence in binary search algorithm, the running time for the average case is
approximately equal to the running time for the worst case.
Similarly, we can find the worst and average case complexities of other
algorithms. They are tabulated below.
Algorithm         Worst case                  Average case
Bubble sort       n(n − 1)/2 = O(n^2)         n(n − 1)/2 = O(n^2)
Quick sort        n(n + 3)/2 = O(n^2)         1.4n log n = O(n log n)
Heap sort         3n log n = O(n log n)       3n log n = O(n log n)
Insertion sort    n(n − 1)/2 = O(n^2)         n(n − 1)/4 = O(n^2)
Selection sort    n(n − 1)/2 = O(n^2)         n(n − 1)/2 = O(n^2)
You will learn the method above in detail in the next section. At this stage don’t
worry about how it works.
Summary
Self-test 4.1

2.
A. space
B. machine
C. time
D. memory

3.
A. syntax
B. grammar
C. logical
D. language

4.
A. logic
B. error
C. variables
D. constants

5.
A. logic
B. complexity
C. algorithm
Feedback
Activity 4.1
1. problem size
2. data values
3. expected
Introduction
In the previous units you learnt about some of the more commonly used data
structures. For efficient use of these data structures, the data within them
often needs to be sorted, and there are a number of sorting algorithms which
can be used to do this. You will learn more about these various sorting
algorithms and how to use them in Unit 5. In this section we will see how each
sorting algorithm has its own efficiency. For example, if a given array is to
be sorted, then we are required to perform a number of operations. The number
of operations will certainly vary from algorithm to algorithm. It will also
depend on the number of entries to be sorted: it is intuitively clear that,
whatever the algorithm may be, the fewer the entries, the fewer the
operations. Thus the number of operations will heavily depend on n, the
number of items to be sorted.
One more point deserves attention. If the data is in random order, the number
of operations is itself a random quantity. For instance, sorting {1, 2, 3, 4}
requires very few operations, while {4, 3, 2, 1} requires the maximum number.
For random arrangements like {4, 3, 1, 2} or {3, 1, 2, 4}, the count depends
on how disordered the data is. To deal with these aspects of randomness,
techniques from statistics are used. However, these are quite complex, and a
lot of research is still being carried out in this area; we shall borrow some
results without the details of their proofs.
From the discussion above, we note that, for any algorithm, the study of the
expected number of operations that are necessary for sorting the given data is of
significance and importance, as the time for sorting and the necessary data space
required are two things of main concern. These major aspects are studied under
the heading complexity of algorithms. Efficiency of the algorithms is inversely
proportional to their complexity, i.e., the more the complexity, the less efficient
the algorithm is and vice versa.
As stated earlier, the detailed study of this aspect requires high level statistics
knowledge. We shall be content with mentioning results that are derived from
research in this field.
Now if n is sufficiently large (which is in general the case), one need not
consider the entire expression involving n. The crux of the matter is that,
when reporting a large number, we are allowed to round off.
Similar situations arise in the study of the complexities of algorithms.
Suppose that we have to report n^4 + 6n^2 + 10. For n = 3 or n = 4, reporting
this as 3^4 or 4^4 would certainly be a big mistake. However, for n = 10^17,
if we report it as (10^17)^4, not much is lost. This is exactly the logic
behind the Big O notation that we study in this section.
f(n) = O(n^3)

This means that f(n) is a function of n with n^3 as its most significant part;
the terms that are not reported are of lower order. A few examples of Big O
notation are given below:

1. f(n) = O(n)
2. f(n) = O(n^(3/2))
3. f(n) = O(log n)

These are read as: 1. f(n) is of order n; 2. f(n) is of order n^(3/2);
3. f(n) is of order log n.
Three main aspects one should consider while comparing the efficiencies of
sorting algorithms are:

1. Time required for execution.
2. Space requirement.
3. Time required to prepare the program.

The third aspect is not very important and is referred to in only a few
textbooks, so we shall neglect it. With high capacity storage devices now
widely available, the second aspect has also lost much of its importance.
Therefore we shall consider the first aspect in detail.
Suppose we want to sort n items with bubble sort; then at each iteration we
have to carry out (n − 1) comparisons. Thus the total number of comparisons
is (n − 1)^2 and the complexity of bubble sort is taken as O(n^2). However,
this is not the full answer. In bubble sort, after each comparison, we either
retain the elements in the same positions or exchange them, and exchanges
also add to the complexity. After the first iteration the largest element
reaches its place, after the second iteration two items are at their proper
places, and so on, so progressively fewer exchanges are needed. Using
probability theory it can be proved that the time component of bubble sort is
of the order n^2. Thus the complexity of bubble sort is O(n^2), and bubble
sort is considered to be the least efficient. A sketch follows below.
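A Java sketch of this naive bubble sort is shown below; the comparison counter is our own addition, included to make the (n − 1)^2 count visible:

static long bubbleSort(int[] a) {
    int n = a.length;
    long comparisons = 0;
    for (int pass = 0; pass < n - 1; pass++) {
        for (int j = 0; j < n - 1; j++) {
            comparisons++;              // (n - 1) comparisons per pass
            if (a[j] > a[j + 1]) {      // exchange adjacent items that
                int temp = a[j];        // are out of order
                a[j] = a[j + 1];
                a[j + 1] = temp;
            }
        }
    }
    return comparisons;                 // (n - 1)^2 in total, i.e., O(n^2)
}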
Next we consider the complexity of the insertion sort technique, which is
usually recommended for small n. If the items are in decreasing order and are
to be sorted in increasing order, then the second element needs one
comparison, the third element needs two comparisons, and so on. So in the
worst case, the total number of comparisons is:

1 + 2 + ... + (n − 1) = n(n − 1)/2

Generally, items are in random order, and from the theory of probability the
average number of comparisons is n(n − 1)/4.
Thus the complexity of insertion sort is O(n^2), which is quite large. One
comment is necessary here: while the worst case requires the maximum number
of replacements, a nearly sorted list requires very few changes. Furthermore,
for small n, insertion sort is useful as the program is uncomplicated, as the
sketch below shows.
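A minimal Java sketch of insertion sort is given below. In the worst case (a reverse-ordered list) the inner loop performs 1 + 2 + ... + (n − 1) = n(n − 1)/2 comparisons, while on a nearly sorted list it exits almost immediately:

static void insertionSort(int[] a) {
    for (int i = 1; i < a.length; i++) {
        int key = a[i];                  // next element to insert
        int j = i - 1;
        while (j >= 0 && a[j] > key) {   // up to i comparisons for element i
            a[j + 1] = a[j];             // shift larger elements one place right
            j--;
        }
        a[j + 1] = key;                  // drop the element into its place
    }
}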
In shell sort, we choose decrements k1 > k2 > k3 > ... and in the first pass
we sort the items at positions 1, k1 + 1, 2k1 + 1, ..., then the items at
positions 2, k1 + 2, ..., and so on. In the second pass we use positions
1, k2 + 1, 2k2 + 1, etc. At the higher passes we get longer lists to sort,
but they are nearly sorted, so the expected number of changes is quite small.

It is generally accepted that for shell sort the average number of
comparisons is O(n(log2 n)^2), and the expected number of exchanges is also
small; this is an advantage of shell sort over insertion sort. However, the
study of the complexity of shell sort is quite an advanced topic requiring
higher level statistical techniques, and choosing the ki's is a tough task:
the complexity obviously depends on these choices.
Therefore, after this much discussion on the complexity of the Shell sorting
technique, we move on to the study of the complexity of the selection sort
technique.
In the selection sort technique, the smallest element is first brought to the
first place, which requires (n − 1) comparisons; then the smallest among the
2nd to nth elements is brought to the 2nd place with (n − 2) comparisons, and
so on. The total number of comparisons is:

(n − 1) + (n − 2) + ... + 2 + 1 = n(n − 1)/2 = O(n^2)
It may be noted that the number of comparisons remains the same whether the
array is in the worst form or nearly sorted. Thus it is in no way a better technique
than the insertion sort technique.
Before studying the complexity of the heap sort technique, let us recall a few
things.
Heap is a tree structure where the element at every node is larger than all the
elements of the subtree of that node. The following are the phases of the heap
sort technique:
1. For a given array, prepare a HEAP using the heap construction algorithm.
2. Remove items at the top of the heap sequentially and form an array of
the removed items. The resulting array will be the sorted list.
From the discussion above, we can see that the cost of the heap sort
technique is twofold: the comparisons made while preparing the heap of the
given items, and the comparisons made while removing items from it. A heap is
stored as a binary tree. If the maximum depth is k, then at the kth level
there are at most 2^k items, and the total number of items is
1 + 2 + 2^2 + ... + 2^k = 2^(k+1) − 1. Thus if n is the total number of items,
the maximum depth is of the order log2 n, i.e., the depth is O(log2 n).

As we have seen, inserting an item into the heap requires a number of
comparisons equal to the depth. Hence, to prepare a heap, the upper bound on
comparisons is n log2 n.
Also while removing items at the root sequentially, one is required to carry out
comparisons equal to the depth.
The discussion of the average number of comparisons is beyond the scope of
this course, so we satisfy ourselves by stating the result obtained by
researchers: the average number of comparisons is n log2 n, i.e.,
O(n log2 n).

The following table summarises the complexities of the various sorting
techniques in terms of the average number of comparisons.

Technique         Average number of comparisons
Bubble sort       n(n − 1)/2 = O(n^2)
Selection sort    n(n − 1)/2 = O(n^2)
Insertion sort    n(n − 1)/4 = O(n^2)
Shell sort        O(n(log2 n)^2)
Heap sort         n log2 n = O(n log2 n)
Quick sort        1.4n log n = O(n log n)
It may be noted that insertion sort, bubble sort and selection sort consist mainly of
comparison of two items and the exchange (if necessary) of those items. Preparation
of such programs is somewhat easy.
For heap sort, however, we have to prepare heaps, which demands creation of
additional structures. The same thing can be stated for Quicksort where we need
stack creation. In the case of shell sort, proper care is needed for choosing the
decreasing sequence of k’s. But of course it does not need any additional space.
There are many ways to solve a problem, and the question is which algorithm
is the best. But why are we concerned about this at all? Moore's Law says
that the number of transistors on a CPU doubles roughly every 18 months.
With the speed of today's CPUs, why should we bother about how efficient our
code is?

The travelling salesman problem answers this question. It asks: "If a
travelling salesman has to visit 100 different locations in a town, what is
the shortest route that he can take?" The total number of distinct routes
possible is 100!, an astronomically large number (about 9.3 × 10^157). What
does this mean in terms of running time? A supercomputer capable of checking
100 billion routes per second could check only on the order of 10^18 routes
in the space of one year, so unimaginably many years would be needed to check
all routes.
This is the reason why we need to know the efficiency of an algorithm.
Efficiency of an algorithm is, in general, a function of the number of
elements to be processed; if the algorithm contains no loops, its running
time is instead simply a function of the number of instructions, regardless
of the input size.
Below, we will study in more detail how the Big O notation serves as a
measure of algorithm efficiency.
Big O notation is used to express an upper bound on a function. Let us have a
look at a code segment containing a single linear loop, sketched below.
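The original code segment is not reproduced here; a loop of the following shape, whose body performs some constant number c of operations per iteration, would give the operation count shown below:

int i = 1;              // 1 operation: the initialisation
while (i < n) {         // n - 1 loop-condition tests that succeed
    // ... loop body: c operations per iteration, c * (n - 1) in total ...
    i++;                // n - 1 increments
}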
f(n) = 1 + (n − 1) + c × (n − 1) + (n − 1)
     = (c + 2) × (n − 1) + 1
     = (c + 2)n − (c + 2) + 1
     = O(n)
Now consider a nested loop:

int i = 1;               // executed 1 time
while (i <= n) {         // n comparisons
    int j = 1;           // initialisation done n times
    while (j <= n) {     // n comparisons for each outer iteration
        j++;             // executed n times per outer iteration
    }
    i++;                 // executed n times
}
Each level of nesting in the code segment above is evaluated just like a
single loop, and the total number of iterations is the product of the numbers
of inner and outer loop iterations:

f(n) = 1 + (n + n + n) × (n + n)
     = 1 + 3n × 2n
     = 1 + 6n^2

which is O(n^2).
Summary
Self-test 4.2

1.
A. directly
B. inversely
C. never

2.
A. less
B. more
C. most

3.
A. number of entries
B. complexity of the algorithm
C. efficiency of the algorithm
D. speed of the computer

4.
A. Space requirement.
B. Time required for execution.
C. Time required to prepare the program.
Introduction
In the earlier sections, we learnt how to evaluate the efficiency of
algorithms. In this section, we will study recursive functions in detail. You
first met recursive functions in TCC 121/05 Programming Fundamentals with
Java; we begin by refreshing your memory, and then examine recursive
functions for the summation of mathematical series in detail.
In some situations, a function may call itself, either directly or
indirectly. Direct calling: function A( ) calls function A( ). Indirect
calling: function A( ) calls function B( ), and function B( ) calls
function A( ), forming a loop of calls.

Such a function is called a recursive function, and the process is called
recursion. A function is said to be recursive if it has the following
properties:
Property 1: There must exist some base criteria for which function will not call
itself but will return some constants.
Property 2: In every recursive call, the function argument must tend to the base
criteria.
5! = 1 × 2 × 3 × 4 × 5
   = 5 × (4 × 3 × 2 × 1)
   = 5 × 4!

Thus Fact(5) = 5 × Fact(4). In general, Fact(n) = n × Fact(n − 1), with
Fact(0) = 1 as the base criteria.

Note that, in every recursive call, the function argument zooms towards the
base criteria.
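As a minimal Java sketch, the factorial function can therefore be written recursively as:

static long fact(int n) {
    if (n == 0) {
        return 1;              // base criteria: 0! = 1
    }
    return n * fact(n - 1);    // the argument moves towards the base criteria
}

Calling fact(5) unwinds as 5 × fact(4) = 5 × 4 × fact(3) = ... = 120.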
When a function calls itself recursively, each invocation gets a fresh set of all the
variables, independent of the previous set.
Consider, for example, a recursive function printd( ) that prints an integer
one decimal digit at a time:

void printd(int n) {
    if (n >= 10)
        printd(n / 10);                        // print the leading digits first
    System.out.print((char) (n % 10 + '0'));   // then print the last digit
}
If we call the function printd( ) with argument 123, i.e. printd(123), the first
call to printd( ) receives the argument n = 123. It passes 12 to a second printd( ),
which in turn passes 1 to a third. The third-level printd prints 1, then returns to
the second-level printd which prints 2. It finally returns to the first-level printd
which prints 3 and terminates.
class power_Fun
{
    // power_fract() computes a^(b/i) by stepping the exponent b
    // towards the base case b == 0, i units at a time.
    // b must be a multiple of i for the recursion to terminate.
    double power_fract(double a, float b, float i)
    {
        if (b == 0)
        {
            return 1;
        }
        else if (b < 0)
        {
            return power_fract(a, b + i, i) / a;
        }
        else
        {
            return a * power_fract(a, b - i, i);
        }
    }
}

// class main
class mainclass
{
    public static void main(String args[])
    {
        power_Fun pow1 = new power_Fun();
        System.out.println(pow1.power_fract(12.6678, 8, 2));
    }
}
Tutorial
Lab activity
The discussion above shows clearly that any mathematical function which can
be expressed in terms of smaller arguments and has some base criteria can be
formulated as a recursive function. We will see some more such functions in
the next section.
Activity 4.2

1.
A. directly
B. indirectly
C. directly or indirectly
D. using reference

2.
A. zero
B. calling argument
C. base criteria
D. specified constant

3.
A. once
B. recursively
C. every time
D. indirectly
Example 1
The Fibonacci series is a mathematical series in which the first two terms
are constant and the subsequent terms are generated by the following rule.

Rule: From the third term onwards, every term is the sum of the previous two
terms.

For example: 1, 1, 2, 3, 5, 8, 13, 21, ...

This clearly shows that from the third term onwards, every term depends on
the previous terms.
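A straightforward recursive Java sketch of this rule is:

static int fib(int n) {
    if (n <= 2) {
        return 1;                        // base criteria: the first two terms are constant
    }
    return fib(n - 1) + fib(n - 2);      // every later term is the sum of the previous two
}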
Example 2
Let us write a recursive function for calculating the sum of the following
mathematical series:

S = 1^2 + 2^2 + 3^2 + ... + n^2
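A recursive Java sketch for this sum uses S(1) = 1 as the base criteria and S(n) = n^2 + S(n − 1) otherwise:

static int sumOfSquares(int n) {
    if (n == 1) {
        return 1;                          // base criteria: S(1) = 1
    }
    return n * n + sumOfSquares(n - 1);    // S(n) = n^2 + S(n - 1)
}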
Reminder: NCR(n, r) denotes the number of combinations of n items taken r at
a time; for example, with n = 5 and r = 2, NCR(5, 2) = 10. A recursive
implementation returns 1 when r == 0 or n == r, and
NCR(n − 1, r) + NCR(n − 1, r − 1) otherwise, as sketched below. Assuming we
have the main class, let us now trace a run of this code for NCR(4, 2).
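The original listing is not reproduced here; a Java sketch consistent with the trace below is:

static int NCR(int n, int r) {
    if (r == 0 || n == r) {
        return 1;                                  // base criteria
    }
    return NCR(n - 1, r) + NCR(n - 1, r - 1);      // Pascal's rule
}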
At each call, the condition (r == 0) || (n == r) decides whether to return 1
or to recurse:

NCR(4, 2): condition false → return NCR(3, 2) + NCR(3, 1)
NCR(3, 2): condition false → return NCR(2, 2) + NCR(2, 1)
NCR(3, 1): condition false → return NCR(2, 1) + NCR(2, 0)
NCR(2, 2): condition true  → return 1
NCR(2, 1): condition false → return NCR(1, 1) + NCR(1, 0)
NCR(2, 0): condition true  → return 1
NCR(1, 1): condition true  → return 1
NCR(1, 0): condition true  → return 1
In short, NCR(4, 2) = NCR(3, 2) + NCR(3, 1) = (1 + 2) + (2 + 1) = 6.
Another classic recursive function computes the greatest common divisor. For
example, let a = 18 and b = 12; then the greatest common divisor is 6. A
recursive sketch is given below.
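A recursive Java sketch using Euclid's algorithm (our own illustrative formulation) is:

static int gcd(int a, int b) {
    if (b == 0) {
        return a;              // base criteria: gcd(a, 0) = a
    }
    return gcd(b, a % b);      // the second argument shrinks towards 0
}

For example, gcd(18, 12) → gcd(12, 6) → gcd(6, 0) → 6.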
Let’s say you have an old brush with N bristles. You’re pretty sure that every old
brush is bumpy. Induction might be a great way to prove it, because you could use
it to show that no matter what positive integer number of bristles the brush has,
the property of being bumpy holds.
1. Base case
Check to make sure that whatever we want to prove holds for small
positive integers, like 1, 2, or 3.
2. Inductive hypothesis

Assume that whatever we want to prove holds for some arbitrary positive
integer K.

3. Inductive step

Show that if it holds for K, then it also holds for K + 1.
We have the sets {1}, {1, 2}, {1, 2, 3}, ... {1, 2, 3, ..., K }, ... and so on. We want to
show that the set of the first N counting numbers has 2^N subsets.
Base case

The subsets of {1} are { } and {1}. There are two of them, which means 2^1
subsets. The subsets of {1, 2} are { }, {1}, {2} and {1, 2}; there are four
of them, or 2^2.
Inductive hypothesis
We assume that a set with the first K counting numbers has 2^K subsets.
Inductive step
Consider the set {1, 2, ..., K + 1}. Every subset has exactly one of two
properties:
a) K + 1 is not an element.
b) K + 1 is an element.
All the subsets with property (a) are actually subsets of { 1, 2, ..., K } and so we
know there are 2^K of them by the Inductive Hypothesis. And, if we take each
of the property (a) subsets, and stick the element K + 1 in, we get all the subsets
with property (b). So there are the same number of them, and there are 2^K
subsets with property (b). In total, we have 2^K + 2^K = 2(2^K ) = 2^(K + 1)
subsets, and we’re done.
Imagine the situation of climbing a ladder. The Base Case is like getting onto
the ladder near the bottom. The Inductive Hypothesis is like assuming to get to
the K th rung. And, the Inductive Step tells how to climb from the K th rung to
the (K + 1)th rung. After all, what do we need to know in order to climb a ladder?
We need to know how to get on, and how to get from one rung to the next.
The fact that we know how to get from rung K to rung K + 1 means we can
insert any value for K: we can get to (let's say) rung 10 by going from rung
1 to rung 2, rung 2 to rung 3, rung 3 to rung 4, rung 4 to rung 5, rung 5 to
rung 6, rung 6 to rung 7, rung 7 to rung 8, rung 8 to rung 9, and rung 9 to
rung 10.
This is very simple. But one can think of the ladder as branching at every rung.
And it doesn’t just branch into two ladder-paths, but a lot of ladder-paths. And
it happens at every rung, so that there are bunches of rungs at each level of the
ladder. It’s hard to imagine what that would look like, let alone how one would
climb such a ladder. Luckily, there’s a way out of this problem.
Whenever we stand on a rung of the branching ladder, we mark it and look
downwards to see what rungs are under it, so that we know how we got there.
The induction proof, in ladder terms, then looks like this:
Base Case makes sure that we can get onto the ladder. Induction Hypothesis
assumes that if we happen to be on a K-level rung, we know how we got there.
Induction Step figures out how to get back from a K + 1 level rung to a K
level rung, so we’ll know where we are.
[Figure: towns A, B and C joined by roads; roads 1 and 2 lead from A to B,
and roads 3, 4 and 5 lead from B to C.]
Let’s count the number of different routes we can take when travelling from A to
C. This will be easier to do if we number the roads as shown above.
The possible routes are: (1, 3), (1, 4), (1, 5), (2, 3), (2, 4) and (2, 5),
that is, 2 × 3 = 6 routes.
Now let’s try to find the number of routes from A to C if there are 3 roads from
A to B and 4 from B to C.
There are 3 choices of routes from A to B. For each of these, there are 4 choices
from B to C. So there are 3 × 4 = 12 choices of routes from A to C.
Finally, suppose there are 3 roads from A to B, 4 roads from B to C and 3 roads
from C to another town, D. How many possible routes are there?
As above, there are 12 choices of routes from A to C. For each of these, there are
3 choices from C to D. So there are 12 × 3 (i.e., 3 × 4 × 3) = 36 choices of routes
from A to D.
These examples illustrate the basic counting principle, which we can express
informally as: if one task can be done in m different ways and, for each of
these, a second task can be done in n different ways, then the two tasks
together can be done in m × n different ways.

In what follows we also use factorial notation: n! denotes the product
n × (n − 1) × ... × 2 × 1. For example:

6! = 6 × 5 × 4 × 3 × 2 × 1 = 720
3! = 3 × 2 × 1 = 6
1! = 1

We also define 0! to be 1.
Permutations
The word ‘permutations’ means ‘arrangements’. We use it to refer to the number
of ways of arranging a set of objects. In other words, we use permutations when
we are concerned about ‘order’.
Before explaining the concept in detail, let us try the following examples.
Example 3
How many different 4 letter arrangements can we make of the letters in the word
‘cats’, using each letter once only?
Solution
There are four choices for position 1. For each of those choices there are 3
letters left, and so 3 ways to fill position 2. So by the counting principle,
there are 4 × 3 ways of filling the first two positions. For each of these
choices there are now 2 letters left, so there are 2 ways of filling the
third position, and the remaining letter must then go into the last position.
Altogether there are 4 × 3 × 2 × 1 = 4! = 24 arrangements.
Example 4
How many ways can the numbers 7, 8 and 9 be arranged using each number
once? Write out all the permutations of 7, 8 and 9 to check that your answer is
correct.
Solution

There are 3 × 2 × 1 = 3! = 6 arrangements. They are:

7, 8, 9   7, 9, 8
8, 7, 9   8, 9, 7
9, 7, 8   9, 8, 7
Example 5
How many 3 letter ‘words’ can be made using the letters a, b, c, d, e and f if
each letter can be used at most once?
Solution
The first box can be filled in 6 ways, the second in 5 ways and the third in 4
ways, i.e., the number of permutations of 6 letters taken 3 at a time is 6 × 5 ×
4 = 120.
Now let us work out a general formula for the number of arrangements of n
different objects taken r at a time.
We get,
n × (n − 1) × (n − 2) × ... × (n − r + 1) × (n − r) × (n − r − 1) × ... × 2 × 1
(n − r) × (n − r − 1) × ... × 2 × 1
Combinations
“Combinations” is a technical term meaning ‘selections’. We use it to refer to
the number of different sets of a certain size that can be selected from a larger
collection of objects where order does not matter.
Example 6
How many ways can two captains be chosen from this group of four cricket
players D, L, S, Y ?
Solution
There are 4 ways of choosing the first captain and 3 ways of choosing the second.
Thus it would seem at first glance that there are 12 ways to choose the captains
as below:
L, D D, Y Y, S S, L
L, Y D, S Y, L S, D
L, S D, L Y, D S, Y
However, when we look more closely at this list, we notice that only 6 of these
form different groups. D and Y, for example, make up the same group as Y and
D. The order in which people are chosen does not matter. To obtain the correct
answer we cannot just find:
4P2 = 4!/2! = (4 × 3 × 2 × 1)/(2 × 1) = 12

but must also divide by the number of ways the two members can be arranged,
i.e., 2!, giving the answer 12/2! = 6.
Example 7
How many distinct sets of 3 differently coloured pens can be bought if the shop
sells pens in 8 different colours?
Solution
There are 8 different colours of pens available, so the first pen can be chosen in any
one of these 8 colours, that is, in 8 different ways. Since the pens selected are all to
be in different colours, once the first pen is chosen, there are 7 ways of choosing the
colour of the second pen, and corresponding to each of these, 6 ways of choosing
the third.
However, these sets of 3 will not all be distinct. Suppose that a red pen is
selected first, then a blue and then a white. This results in the same set of
pens as if a white were chosen first, then a blue and then a red. In fact,
since a set of 3 colours can be arranged in 3 × 2 × 1 = 3! different ways,
there will be 3! different choices which result in the same set of pens.

Hence we must divide the number obtained by 3!. So the answer to our problem
is 8!/(3! × 5!) = (8 × 7 × 6)/(3 × 2 × 1) = 56.
Now, once again, let’s work out a general formula. This time we want the
number of ways of choosing r objects from n distinct objects when the order in
which the objects are chosen does not matter. Any two selections containing the
same r objects are considered to be the same.
nPr = n!/(n − r)!

nCr = nPr/r!
    = n!/(r! × (n − r)!)
    = [n × (n − 1) × (n − 2) × ... × (n − r + 1)]/[r × (r − 1) × (r − 2) × ... × 2 × 1]
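These formulas translate directly into code. The small Java sketch below is illustrative, and valid only for small n since long overflows beyond 20!:

static long factorial(int n) {
    long f = 1;
    for (int k = 2; k <= n; k++) {
        f *= k;                  // n! = n × (n − 1) × ... × 2 × 1
    }
    return f;
}

static long nPr(int n, int r) {
    return factorial(n) / factorial(n - r);   // arrangements: order matters
}

static long nCr(int n, int r) {
    return nPr(n, r) / factorial(r);          // selections: divide out the r! orderings
}

Here nPr(4, 2) gives 12 and nCr(4, 2) gives 6, matching Example 6, while nCr(8, 3) gives 56 as in Example 7.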
Going back to the previous two examples, we can now write them in this notation.
Summary
Self-test 4.3

1.
A. first
B. first and second
C. first n terms
D. specified number of terms

2.
A. natural numbers
B. non-negative numbers
C. whole numbers
D. integers

3.
A. 2
B. 4
C. 6
D. 7
Feedback
Activity 4.2
1. directly or indirectly
2. base criteria
3. recursively
Summary of Unit 4
Summary
Feedback
Self-test 4.1
2. space
3. syntax
4. error
5. complexity
Self-test 4.2
1. inversely
2. less
3. number of entries
Self-test 4.3
2. non-negative numbers
3. 4
4. a. True
b. True
c. True
d. True
References
Carrano, F M and Prichard, J J (2006) Data Abstraction and Problem Solving
with Java: Walls and Mirrors, Boston: Pearson Education Inc.
Glossary

Algorithm    A step by step specification of the method to solve a problem
             within a finite amount of time.

Space        Memory.