Chapter 1 - Introduction
Chapter Outline
• Introduction
• Abstract data types
• Data structures
• Algorithms
• Properties of algorithms
• Algorithm analysis
• Measures of time
Introduction
• A program is written in order to solve a problem.
• A solution to a problem actually consists of two things:
• A way to organize the data
• Sequence of steps to solve the problem
• The way data are organized in a computer's memory is called a Data Structure, and the sequence of computational steps used to solve a problem is called an Algorithm.
• Therefore, a program is nothing but data structures plus algorithms.
Introduction (cont.)
• Given a problem, the first step in solving it is to obtain one's own abstract view, or model, of the problem.
• This process of modeling is called abstraction.
Abstract Data Type (cont.)
• A data structure is a language construct that the
programmer has defined in order to implement an
abstract data type.
• There are lots of formalized and standard Abstract
data types such as Stacks, Queues, Trees, etc.
• Do all characteristics need to be modeled?
• Not at all!!
➢It depends on the scope of the model
➢It depends on the reason for developing the model
Abstraction
• Abstraction is a process of classifying characteristics as
relevant and irrelevant for the particular purpose at hand and
ignoring the irrelevant ones.
• Applying abstraction correctly is the essence of successful
programming
• How do data structures model the world or some part of the
world?
• The value held by a data structure represents some specific
characteristic of the world
• The characteristic being modeled restricts the possible values
held by a data structure
• The characteristic being modeled restricts the possible
operations to be performed on the data structure.
• Note: Notice the relation between characteristic, value, and data
structures
• Where are algorithms, then?
Example 1: Employee ADT
• If we are going to model employees of an
organization:
• This ADT stores employees with their relevant attributes and discards irrelevant attributes (some such attributes are name, sex, id, salary, etc.).
• This ADT supports operations such as hiring, firing, and retiring.
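• A minimal C++ sketch of such an Employee ADT (the class name, members, and the status flag used by hire/fire/retire are illustrative assumptions, not a prescribed design):

#include <string>

// Sketch: only the attributes judged relevant are modeled; everything else is ignored.
class Employee {
public:
    Employee(const std::string& name, char sex, int id, double salary)
        : name(name), sex(sex), id(id), salary(salary), active(false) {}

    // Operations supported by the ADT.
    void hire()   { active = true;  }
    void fire()   { active = false; }
    void retire() { active = false; }

    bool isActive() const { return active; }

private:
    // Relevant attributes kept by the model.
    std::string name;
    char sex;
    int id;
    double salary;
    bool active;   // hypothetical status flag driven by hire/fire/retire
};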
Example 2: List ADT
• An ADT for a list of integers might specify the
following operations:
• Insert a new integer at a particular position in the
list.
• Return true if the list is empty.
• Reinitialize the list.
• Return the number of integers currently in the list.
• Delete the integer at a particular position in the list.
• From this description, the input and output of each operation should be clear, but the implementation of the list has not been specified (a possible C++ interface is sketched below).
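• One possible way to express this List ADT in C++ is as an abstract class that fixes the operations but not the representation (the identifiers below are illustrative):

// A possible interface for the integer-list ADT above.
// It specifies *what* each operation does, not *how* the list is stored.
class IntList {
public:
    virtual ~IntList() {}

    virtual void insert(int position, int value) = 0; // insert at a position
    virtual bool isEmpty() const = 0;                 // true if the list is empty
    virtual void clear() = 0;                         // reinitialize the list
    virtual int  size() const = 0;                    // number of integers stored
    virtual void remove(int position) = 0;            // delete at a position
};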
Example 3: Other ADTs
• Objects such as lists, sets, and graphs, along with their operations, can
be viewed as ADTs, just as integers, floats, doubles, and booleans are data types.
• Integers, reals, and booleans have operations associated with them, and
so do ADTs.
• For the set ADT, we might have such operations as add, remove, and size.
• Alternatively, we might only want the two operations union and find,
which would define a different ADT on the set.
• An abstract data type can also be viewed as the realization of a data
type as a software component.
• The interface of the ADT is defined in terms of a type and a set of
operations on that type.
• The behavior of each operation is determined by its inputs and outputs.
• An ADT does not specify how the data type is implemented.
• There are lots of formalized and standard Abstract data types such as
Stacks, Queues, Trees, Graphs etc.
Data Structures
• A data structure is the implementation for an ADT.
• With abstraction you create a well-defined entity that can be
properly handled. These entities define the data structure of
the program.
• A data structure is a language construct that the programmer
has defined in order to implement an abstract data type.
• The C++ class (and struct) allows for the implementation of ADTs, with appropriate hiding of implementation details.
• Thus, any other part of the program that needs to perform an
operation on the ADT can do so by calling the appropriate
method.
• If for some reason implementation details need to be
changed, it should be easy to do so by merely changing the
routines that perform the ADT operations.
• This change, in a perfect world, would be completely
transparent to the rest of the program.
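• As a sketch of this idea, here is one possible data structure for the IntList interface from the Example 2 slide (an array-based representation is assumed); because the array and the counter are private, they could later be replaced by, say, a linked representation without changing any caller:

// One possible data structure implementing the IntList ADT: a fixed-capacity array.
class ArrayIntList : public IntList {
public:
    ArrayIntList() : count(0) {}

    void insert(int position, int value) {
        if (position < 0 || position > count || count == CAPACITY) return;
        for (int i = count; i > position; --i)   // shift elements right
            data[i] = data[i - 1];
        data[position] = value;
        ++count;
    }
    bool isEmpty() const { return count == 0; }
    void clear()         { count = 0; }
    int  size()  const   { return count; }
    void remove(int position) {
        if (position < 0 || position >= count) return;
        for (int i = position; i < count - 1; ++i)   // shift elements left
            data[i] = data[i + 1];
        --count;
    }

private:
    static const int CAPACITY = 100;   // implementation detail, hidden from callers
    int data[CAPACITY];
    int count;
};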
Example:
struct StudentRecord
{
    char Name[20];
    char Id_No[10];
    char Dept[10];
    int age;
};
Algorithms
• An algorithm is a well-defined computational procedure that
takes some value or a set of values as input and produces
some value or a set of values as output.
• Data structures model the static part of the world. They are
unchanging while the world is changing.
• In order to model the dynamic part of the world we need to
work with algorithms.
• Algorithms are the dynamic part of a program’s world model.
• An algorithm transforms data structures from one state to
another state in two ways:
• An algorithm may change the value held by a data structure
  e.g., age = age + 1;
• An algorithm may change the data structure itself
  e.g., sort students by name (see the sketch below)
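• Both kinds of change can be sketched with the StudentRecord struct defined earlier (the helper names are illustrative; strcmp is used because Name is a char array):

#include <algorithm>  // std::sort
#include <cstring>    // std::strcmp
#include <vector>

// Sketch: algorithms acting on the StudentRecord struct from the previous slide.
bool byName(const StudentRecord& a, const StudentRecord& b) {
    return std::strcmp(a.Name, b.Name) < 0;
}

void birthday(StudentRecord& s) {
    s.age = s.age + 1;                 // changes the value held by a data structure
}

void sortByName(std::vector<StudentRecord>& students) {
    std::sort(students.begin(), students.end(), byName);  // rearranges the structure itself
}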
Algorithms
• The quality of a data structure is related to its ability to
successfully model the characteristics of the world.
• Similarly, the quality of an algorithm is related to its
ability to successfully simulate the changes in the
world.
• However, independent of any particular world model,
the quality of data structure and algorithms is
determined by their ability to work together well.
• Generally speaking, correct data structures lead to
simple and efficient algorithms and correct algorithms
lead to accurate and efficient data structures.
Properties of an Algorithm
• Finiteness: Algorithm must complete after a finite number of steps.
• Definiteness: Each step must be clearly defined, having one and only
one interpretation. At each point in computation, one should be able
to tell exactly what happens next.
• Sequence: Each step must have a unique defined preceding and
succeeding step. The first step (start step) and last step (halt step)
must be clearly noted.
• Feasibility: It must be possible to perform each instruction.
• Correctness: It must compute the correct answer for all possible legal inputs.
• Language Independence: It must not depend on any one
programming language.
• Completeness: It must solve the problem completely.
• Effectiveness: It must be possible to perform each step exactly and in
a finite amount of time.
• Efficiency: It must solve the problem with the least amount of computational resources such as time and space.
• Generality: Algorithm should be valid on all possible inputs.
• Input/Output: There must be a specified number of input values, and
one or more result values.
Algorithm Analysis Concepts
• Algorithm analysis refers to the process of determining how much computing time and storage an algorithm will require.
• In other words, it’s a process of predicting the resource
requirement of algorithms in a given environment.
• In order to solve a problem, there are many possible algorithms.
• One has to be able to choose the best algorithm for the problem at
hand using some scientific method.
• To classify some data structures and algorithms as good, we need
precise ways of analyzing them in terms of resource requirement.
• The main resources are:
• Running Time
• Memory Usage
• Communication Bandwidth
• Running time is usually treated as the most important since
computational time is the most precious resource in most problem
domains.
Algorithm Analysis Concepts(cont..)
• There are two approaches to measure the efficiency of algorithms:
1. Empirical: Program competing algorithms and try them on different instances. This approach uses actual system clock time:
   Total time = t2 - t1, where t1 = starting time and t2 = finishing time
   (a timing sketch is given at the end of this slide).
2. Theoretical: Determine mathematically the quantity of resources (execution time, memory space, etc.) needed by each algorithm.
• However, it is difficult to use actual clock-time as a consistent measure of an
algorithm’s efficiency, because clock-time can vary based on many things.
For example,
• Specific processor speed
• Current processor load
• Specific data for a particular run of the program
o Input Size
o Input Properties
• Operating environment (multitasking vs. single tasking)
• Accordingly, we can analyze an algorithm according to the number of
operations required, rather than according to an absolute amount of time
involved. This can show how an algorithm’s efficiency changes according to
the size of the input.
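• A minimal sketch of the empirical approach using the C++ <chrono> clock (the summation being timed and the input size are placeholders, not a prescribed benchmark):

#include <chrono>
#include <iostream>

// Empirical measurement: total time = t2 - t1 around the code being measured.
int main() {
    const int n = 1000000;             // placeholder input size
    long long sum = 0;

    std::chrono::steady_clock::time_point t1 = std::chrono::steady_clock::now();
    for (int i = 0; i < n; ++i)        // the "algorithm" under test (placeholder)
        sum += i;
    std::chrono::steady_clock::time_point t2 = std::chrono::steady_clock::now();

    std::chrono::duration<double, std::milli> elapsed = t2 - t1;
    std::cout << "sum = " << sum << ", time = " << elapsed.count() << " ms\n";
    return 0;
}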
Complexity Analysis
• Complexity Analysis is the systematic study of the cost of computation,
measured either in time units or in operations performed, or in the amount of
storage space required.
• The goal is to have a meaningful measure that permits comparison of
algorithms independent of operating platform.
• There are two things to consider:
• Time Complexity: Determine the approximate number of operations
required to solve a problem of size n.
• Space Complexity: Determine the approximate memory required to solve a
problem of size n.
• Complexity analysis involves two distinct phases:
➢ Algorithm Analysis: Analysis of the algorithm or data structure to produce a
function T(n) that describes the algorithm in terms of the operations
performed in order to measure the complexity of the algorithm.
➢ Order of Magnitude Analysis: Analysis of the function T(n) to determine the
general complexity category to which it belongs.
• There is no generally accepted set of rules for algorithm analysis. However,
an exact count of operations is commonly used.
Analysis Rules
1. We assume an arbitrary time unit.
2. Execution of one of the following operations takes time 1:
• Assignment Operation
• Single Input/Output Operation
• Single Boolean Operations
• Single Arithmetic Operations
• Function Return
3. Running time of a selection statement (if, switch) is the time for the condition
evaluation + the maximum of the running times for the individual clauses in the
selection.
Example:
int x;
int sum=0;
if (a>b)
{
    sum=a+b;
    cout<<sum;
}
else
{
    cout<<b;
}
T(n)=1+1+max(3,1)=5
4. Loops:
• The running time of a loop is the running time of the statements inside the loop multiplied by the number of iterations.
• The total running time of a statement inside a group of nested loops is the running time of the statement multiplied by the product of the sizes of all the loops.
• For nested loops, analyze inside out.
• Always assume that the loop executes the maximum number of iterations possible (a counting sketch for this rule follows after rule 5).
5. Running time of a function call is 1 for setup + the time for any parameter calculations + the time required for the execution of the function body.
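• A rough sketch that checks the loop rule by counting, at run time, how often a statement inside two nested loops executes (the loop sizes and counter variable are illustrative):

#include <iostream>

// Counting check for the loop rule: a statement inside two nested loops of
// sizes N and M executes N * M times.
int main() {
    const int N = 10, M = 20;          // illustrative loop sizes
    long ops = 0;                      // counts executions of the inner statement
    int sum = 0;

    for (int i = 0; i < N; ++i)
        for (int j = 0; j < M; ++j) {
            sum = sum + i + j;         // the statement being counted
            ++ops;
        }

    std::cout << "inner statement executed " << ops
              << " times (expected N*M = " << N * M << ")\n";
    return 0;
}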
Algorithm Analysis Examples
Example 1:
int count()
{
    int k=0;
    int n, i;              // n and i declared here so the snippet compiles
    cout<<"Enter an integer";
    cin>>n;
    for (i=0;i<n;i++)
        k=k+1;
    return 0;
}
Time Units to Compute
-------------------------------------------------
1 for the assignment statement: int k=0
1 for the output statement.
1 for the input statement.
In the for loop:
1 assignment, n+1 tests, and n increments.
n loops of 2 units for an assignment, and an addition.
1 for the return statement.
-------------------------------------------------------------------
T (n)= 1+1+1+(1+n+1+n)+2n+1 = 4n+6 = O(n)
Algorithm Analysis Examples
Example 2:
int total(int n)
{
int sum=0;
for (int i=1;i<=n;i++)
sum=sum+1;
return sum;
}
Time Units to Compute
-------------------------------------------------
1 for the assignment statement: int sum=0
In the for loop:
1 assignment, n+1 tests, and n increments.
n loops of 2 units for an assignment, and an addition.
1 for the return statement.
-------------------------------------------------------------------
T (n)= 1+ (1+n+1+n)+2n+1 = 4n+4 = O(n)
Algorithm Analysis Examples
Example 3:
void func()
{
    int n;                 // n declared here so the snippet compiles
    int x=0;
    int i=0;
    int j=1;
    cout<<"Enter a value";
    cin>>n;
    while (i<n){
        x++;
        i++;
    }
    while (j<n)
    {
        j++;
    }
}
Time Units to Compute
-------------------------------------------------
1 for the first assignment statement: x=0;
1 for the second assignment statement: i=0;
1 for the third assignment statement: j=1;
1 for the output statement.
1 for the input statement.
In the first while loop:
n+1 tests
n loops of 2 units for the two increment (addition) operations
In the second while loop:
n tests
n-1 increments
-------------------------------------------------------------------
T (n)= 1+1+1+1+1+(n+1)+2n+n+(n-1) = 5n+5 = O(n)
Algorithm Analysis Examples
Example 4:
int sum (int n)
{
int partial_sum = 0;
for (int i = 1; i <= n; i++)
partial_sum = partial_sum +(i * i * i);
return partial_sum;
}
Time Units to Compute
-------------------------------------------------
1 for the assignment.
In the for loop:
1 assignment, n+1 tests, and n increments.
n loops of 4 units for an assignment, an addition, and two
multiplications.
1 for the return statement.
-------------------------------------------------------------------
T (n)= 1+(1+n+1+n)+4n+1 = 6n+4 = O(n)
Calculate T(n)
1)
int sum=0;
for (i=0;i<n; i++)
    for (j=0;j<n; j++)
        sum++;
2)
long factorial(int n)
{
    if (n<=1)
        return 1;
    else
        return n*factorial(n-1);
}
Formal Approach to Analysis
• In the examples above, we have seen that such analysis is complex.
• However, it can be simplified by using a more formal approach, in which case we can ignore initializations, loop control (conditions), and bookkeeping (assignment operations).
Formal Approach to Analysis
➢ For Loops: Formally
for (int i = 1; i <= N; i++) {
    sum = sum+i;
}

Σ_{i=1}^{N} 1 = N

• Suppose we count the number of additions that are done. There is 1 addition per iteration of the loop, hence N additions in total.
Formal Approach to Analysis
➢ Nested Loops: Formally
• Nested for loops translate into multiple summations, one for each for loop.
for (int i = 1; i <= N; i++) {
    for (int j = 1; j <= M; j++) {
        sum = sum+i+j;
    }
}

Σ_{i=1}^{N} Σ_{j=1}^{M} 2 = Σ_{i=1}^{N} 2M = 2MN

• Again, count the number of additions. The outer summation is for the outer for loop.
Formal Approach to Analysis
➢ Conditionals: Formally
• If (test) s1 else s2: compute the maximum of the running times for s1 and s2.
if (test == 1) {
    for (int i = 1; i <= N; i++) {
        sum = sum+i;
    }
}
else {
    for (int i = 1; i <= N; i++) {
        for (int j = 1; j <= N; j++) {
            sum = sum+i+j;
        }
    }
}

max( Σ_{i=1}^{N} 1, Σ_{i=1}^{N} Σ_{j=1}^{N} 2 ) = max(N, 2N²) = 2N²
Measures of Time
• In order to determine the running time of an algorithm it
is possible to define three functions Tbest(n), Tavg(n) and
Tworst(n) as the best, the average and the worst case
running time of the algorithm respectively.
• Average Case (Tavg): The amount of time the algorithm
takes on an "average" set of inputs.
• Worst Case (Tworst): The amount of time the algorithm
takes on the worst possible set of inputs.
• Best Case (Tbest): The amount of time the algorithm takes on the best possible set of inputs.
• We are interested in the worst-case time, since it
provides a bound for all input – this is called the “Big-
Oh” estimate.
Best Case analysis
▪ assumes the input data are arranged in the most
advantageous order for the algorithm.
▪ For sorting algorithm-if the list is already sorted.
▪ For searching algorithm if the desired item is located
at the first accessed position.
Worst Case analysis
▪ assumes the input data are arranged in the most
disadvantageous order for the algorithm.
▪ While sorting, if the list is in opposite order.
▪ While searching, if the desired item is located at the
last position or is missing.
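▪ A linear-search sketch that makes these cases concrete (the function name and array layout are illustrative):

// Linear search: returns the index of key in a[0..n-1], or -1 if missing.
// Best case:  key is at a[0]             -> 1 comparison.
// Worst case: key is at a[n-1] or absent -> n comparisons.
int linearSearch(const int a[], int n, int key) {
    for (int i = 0; i < n; ++i)
        if (a[i] == key)
            return i;
    return -1;
}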
Asymptotic Analysis
• Asymptotic analysis is concerned with how the
running time of an algorithm increases with the size of
the input in the limit, as the size of the input increases
without bound.
• There are five notations used to describe a running
time function. These are:
• Big-Oh Notation (O)
• Big-Omega Notation (Ω)
• Theta Notation (Θ)
• Little-o Notation (o)
• Little-Omega Notation (ω)
The Big-Oh Notation
• Big-Oh notation is a way of comparing algorithms and is
used for computing the complexity of algorithms; i.e., the
amount of time that it takes for a computer program to run.
• It's only concerned with what happens for very large values of n.
• Therefore only the largest term in the expression
(function) is needed.
• For example, if the number of operations in an algorithm
is n² - n, n is insignificant compared to n² for large values
of n.
• Hence the n term is ignored. Of course, for small values of
n, it may be important.
• However, Big-Oh is mainly concerned with large values
of n.
• Formal Definition: f(n) = O(g(n)) if there exist c, k ∊ ℛ⁺ such that for all n ≥ k, f(n) ≤ c·g(n).
The Big-Oh Notation-Examples
Examples: The following points are facts that you can
use for Big-Oh problems:
• 1 <= n for all n >= 1
• n <= n² for all n >= 1
• 2ⁿ <= n! for all n >= 4
• log₂n <= n for all n >= 2
• n <= n·log₂n for all n >= 2
The Big-Oh Notation-Examples
Example: f(n)=10n+5 and g(n)=n. Show that f(n) is
O(g(n)).
To show that f(n) is O(g(n)) we must show that there exist constants c and k such that f(n) <= c·g(n) for all n >= k,
i.e., 10n+5 <= c·n for all n >= k.
Try c=15. Then we need to show that 10n+5 <= 15n.
Solving for n we get: 5 <= 5n, or 1 <= n.
So f(n) = 10n+5 <= 15n = 15·g(n) for all n >= 1
(c=15, k=1).
Typical Orders
• Here is a table of some typical cases. This uses logarithms to base 2, but these are simply proportional to logarithms in other bases.

N      O(1)  O(log n)  O(n)    O(n log n)  O(n²)       O(n³)
1      1     1         1       1           1           1
2      1     1         2       2           4           8
4      1     2         4       8           16          64
8      1     3         8       24          64          512
16     1     4         16      64          256         4,096
1024   1     10        1,024   10,240      1,048,576   1,073,741,824
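• A small sketch that reproduces rows of this table using an integer base-2 logarithm (the table's n = 1 row treats log 1 as 1, so the sketch starts at n = 2):

#include <iostream>

// Prints rows of the typical-orders table: n, log2 n, n log n, n^2, n^3.
int main() {
    const long long ns[] = {2, 4, 8, 16, 1024};
    for (int k = 0; k < 5; ++k) {
        long long n = ns[k];
        long long lg = 0;
        for (long long v = n; v > 1; v /= 2) ++lg;   // integer log base 2
        std::cout << n << "\t" << lg << "\t" << n * lg
                  << "\t" << n * n << "\t" << n * n * n << "\n";
    }
    return 0;
}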
Properties of the Big-O Notation
• Higher powers grow faster
nʳ is O(nˢ) if 0 <= r <= s
• Fastest growing term dominates a sum
If f(n) is O(g(n)), then f(n) + g(n) is O(g(n))
e.g., 5n⁴ + 6n³ is O(n⁴)
• Exponential functions grow faster than powers, i.e. nᵏ is O(bⁿ) for b > 1 and k >= 0
e.g., n²⁰ is O(1.05ⁿ)
• Logarithms grow more slowly than powers
log_b n is O(nᵏ) for b > 1 and k > 0
e.g., log₂n is O(n^0.5)
Big-Omega Notation
• Just as O-notation provides an asymptotic upper bound on a function, Ω-notation provides an asymptotic lower bound.
• Formal Definition: A function f(n) is Ω(g(n)) if there exist constants c and k ∊ ℛ⁺ such that
f(n) >= c·g(n) for all n >= k.
• f(n) = Ω(g(n)) means that f(n) is greater than or equal to some constant multiple of g(n) for all values of n greater than or equal to some k.
Example: If f(n) = n², then f(n) = Ω(n)
If f(n) = 3n+2 and g(n) = √n, show that f(n) = Ω(g(n)):
we need f(n) >= c·g(n) for all n >= k
3n+2 >= c·√n; let c = 1, k = 1
3n+2 >= √n for all n >= 1
so 3n+2 = Ω(g(n)) (c = 1, k = 1)
• In simple terms, f(n) = Ω(g(n)) means that the growth rate of f(n) is greater than or equal to that of g(n).
• It describes the best case analysis.
Theta Notation
• A function f(n) belongs to the set Θ(g(n)) if there exist positive constants c1 and c2 such that it can be sandwiched between c1·g(n) and c2·g(n), for sufficiently large values of n.
• Formal Definition: A function f(n) is Θ(g(n)) if it is both O(g(n)) and Ω(g(n)). In other words, there exist constants c1, c2, and k > 0 such that c1·g(n) <= f(n) <= c2·g(n) for all n >= k.
• If f(n) = Θ(g(n)), then g(n) is an asymptotically tight bound for f(n).
• In simple terms, f(n) = Θ(g(n)) means that f(n) and g(n) have the same rate of growth.
• Example:
1. If f(n) = 2n+1, then f(n) = Θ(n), for c1 = 2, c2 = 3 and k = 1
2. If f(n) = 2n², then
f(n) = O(n⁴)
f(n) = O(n³)
f(n) = O(n²)
• All of these are technically correct, but the last expression is the best and tightest one. Since 2n² and n² have the same growth rate, it can be written as f(n) = Θ(n²).
• It represents the amount of time the algorithm takes on an average set of inputs ("Average Case").
Example: f(n) = 5n+6. Find g(n) such that f(n) = Θ(g(n)):
we need c1·g(n) <= f(n) <= c2·g(n) for all n >= k
n <= 5n+6 <= 10n for all n >= 2
so g(n) = n, with c1 = 1, c2 = 10, k = 2
Little-o Notation
• Big-Oh notation may or may not be asymptotically
tight, for example:
2n² = O(n²)
    = O(n³)
• f(n)=o(g(n)) means for all c>0 there exists some k>0
such that f(n)<c.g(n) for all n>=k.
• Informally, f(n)=o(g(n)) means f(n) becomes
insignificant relative to g(n) as n approaches infinity.
Example: f(n) = 3n+4 is o(n²)
• In simple terms, f(n) has less growth rate compared to
g(n).
For instance, g(n) = 2n² is o(n³) and O(n²), but g(n) is not o(n²).
▪ It describes the worst case analysis.
Example: Find g(n) such that f(n) = o(g(n)) for f(n) = n²:
n² < 2n² for all n > 1 only gives c = 2, k = 1; since little-o requires the bound for every c > 0, g(n) = n² does not work (as noted above).
n² < n³ for all n > 1, so g(n) = n³ and f(n) = o(n³)
n² < n⁴ for all n > 1, so g(n) = n⁴ and f(n) = o(n⁴)
Little-Omega Notation (ω)
• Formal Definition: f(n) = ω(g(n)) means that for all c > 0 there exists some k > 0 such that f(n) > c·g(n) for all n >= k; that is, f(n) grows strictly faster than g(n).
Example: Find g(n) such that f(n) = ω(g(n)) for f(n) = n² + 3:
g(n) = n, since n² + 3 > n for n >= 2 (c = 1, k = 2)
g(n) = √n can also be a solution, since n² + 3 > √n for n >= 2 (c = 1, k = 2)
Relational Properties of the Asymptotic Notations
• Transitivity
• if f(n) = Θ(g(n)) and g(n) = Θ(h(n)) then f(n) = Θ(h(n)),
• if f(n) = O(g(n)) and g(n) = O(h(n)) then f(n) = O(h(n)),
• if f(n) = Ω(g(n)) and g(n) = Ω(h(n)) then f(n) = Ω(h(n)),
• if f(n) = o(g(n)) and g(n) = o(h(n)) then f(n) = o(h(n)), and
• if f(n) = ω(g(n)) and g(n) = ω(h(n)) then f(n) = ω(h(n)).
• Symmetry
• f(n) = Θ(g(n)) if and only if g(n) = Θ(f(n)).
• Transpose symmetry
• f(n) = O(g(n)) if and only if g(n) = Ω(f(n)),
• f(n) = o(g(n)) if and only if g(n) = ω(f(n)).
• Reflexivity
• f(n) = Θ(f(n)),
• f(n) = O(f(n)),
• f(n) = Ω(f(n)).
Thank you!!