Data Structure and Algorithm

Uploaded by Tariku Ayele

Chapter One

Introduction to Data Structures and Algorithms
Objectives
• Introduce the basic definitions and principles of data structures and algorithms
• Introduce commonly used data structures and their applications
• Describe the costs and benefits associated with each data structure
  – for each data structure, the amount of space and time required for typical operations
• Select the best data structure and algorithm for a given problem
• Introduce the basic principles of resource-requirement analysis
• Measure the effectiveness of a data structure and algorithm
Introduction

• The primary purpose of most computer programs is not only to perform calculations, but to store and retrieve information, usually as fast as possible.
• Data structures and algorithms help us understand how to structure information to support efficient processing.
• For this reason, the study of data structures and the algorithms that manipulate them is at the heart of computer science.
Introduction to Data Structures
and Algorithms
Software Engineering
• Is the use of engineering principles in order to develop good software.
• Human creativity + Computer speed & Reliability = Program.
• There are a number of general principles that apply to many areas, including
aspects of software engineering.
Software Engineering Principles
a) Separation of Concerns
• Allows us to deal with different aspects of a problem and focus on each separately.
• There are many areas of concerns in software development (e.g. software
functionalities, user interface design, hardware configuration, software
applications, space and time efficiency, team organization and structure, design
strategies, control procedures, error handling).
• By separating the multiple concerns and focusing on them individually, the
inherent complexity of a large-scale software project can be greatly reduced and
better managed.
• Separation of concerns has been known to enhance the visibility, understandability
and maintainability of software systems.
Software Engineering Principles....(Continued)

b) Modularity
– Modularity is an important Software Engineering principle.
– It is a practical application of the principle of “Separation of Concerns" by
dividing a complex system into simpler and more manageable modules.
– Each module performs a set of tasks.
– Modules are technically connected to one another.
• The important attributes of modules are cohesion and coupling.
• Design goals require modules to have low-coupling and high cohesion.
• Cohesion
– is a property of modules.
– Cohesion is a measure of the inter-relatedness of elements (statements,
procedures, declarations) within a module.
– A module is said to have high cohesion if all the elements in the module are
strongly connected with one another.
– A cohesive module provides a small number of closely related services
Software Engineering Principles....(Continued)

• Coupling
– is a property of systems.
– The measure of inter-module relation is known as coupling.
– Modules are loosely coupled if the communication between them is
simple.
– “Modules form clusters with few interconnections.”
– Tight coupling of modules makes analysis, understanding,
modification and testing of modules difficult. Reuse of modules is
also hindered.
• The goal is: high cohesion and low coupling.
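A minimal C++ sketch of this goal (the payroll and reporting module names are illustrative, not from the slides): each namespace groups closely related services (high cohesion), and the modules communicate only through plain values passed across a narrow interface (low coupling).

```cpp
// Hypothetical module: every function relates to one concern,
// computing pay (high cohesion).
namespace payroll {
    double grossPay(double hourlyRate, double hours) {
        return hourlyRate * hours;
    }
    double netPay(double gross, double taxRate) {
        return gross * (1.0 - taxRate);
    }
}

// A second module uses payroll only through its narrow public
// interface, passing plain values (low coupling): it never
// touches payroll's internals.
namespace reporting {
    double monthlyNet(double hourlyRate, double hours, double taxRate) {
        double gross = payroll::grossPay(hourlyRate, hours);
        return payroll::netPay(gross, taxRate);
    }
}
```

If payroll later changes how it computes pay internally, reporting is unaffected, which is exactly the benefit loose coupling promises.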
Software Engineering Principles....(Continued)

c) Abstraction
It is sometimes best to concentrate on the general aspects of a problem while deliberately setting aside the details.
Software Engineering Principles....(Continued)

d) Anticipation of Change
• General rule: write all documents and code under the assumption that they will
subsequently be corrected, adapted, or changed by somebody else.
e) Generality
• Before starting off on a new software development project, a software engineer should always consider what might be the underlying, hidden general problem that requires a solution.
• The general problem may not be as complex as the original problem and there
may also be a ready-made solution that is capable of satisfying the
requirements.
A general solution is often:
• Not much harder to write than a special-purpose solution;
• More likely to be re-used; and
• Perhaps a little less efficient.
• Generality needs support from the programming language.
• Current languages do not support generality well; new OO languages may be
better (abstract classes, frameworks,. . .).
• Generalization is related to abstraction.
Software Engineering Principles....(Continued)

f) Incrementality
• It is easier to make small changes to a working system than to rebuild
the system.
• Why? Because if the modified system does not work, the errors must
have been introduced by the small changes.
g) Rigor and Formality
• Software engineering is a creative design activity, BUT
• It must be practiced systematically
• Rigor is a necessary complement to creativity that increases our
confidence in our developments
• Formality is rigor at the highest degree
– Software process driven and evaluated by mathematical laws
• Rigor: Careful and precise reasoning.
Software Engineering Principles....(Continued)

• Formal: Reasoning based on a mechanical set of rules (“formal


system”).
• Use rigor as much as possible. Use formality when suitable tools are
available (compilers, parser generators, proof checkers,...).
Definitions for Software Engineering
• Product — What we are trying to build.
• Process — The methods we use to build the product.
• Method — A guideline that describes an activity. Methods are general,
abstract, widely applicable. Example: top-down design.
• Technique —A precise rule that defines an activity. Techniques are
precise, particular, and limited. Example: loop termination proof.
• Tool — A mechanical/automated aid to assist in the application of a
methodology. Examples: editor, compiler, . . .
• Methodology — a collection of techniques and tools.
Software Engineering Principles....(Continued)

Software Engineering
• Is the use of engineering principles in order to develop good software.
Good software is:
• Correct
– The software performs according to the SRD (Software
Requirements Document).
– The SRD may be too vague (although it should not be) — in this
case, conformance to a specification is needed.
• Reliable
– This is a weaker requirement than “correct”. E-mail is reliable —
messages usually arrive — but probably incorrect.
• Robust
– The software behaves well when exercised outside the
requirements.
– For example, software designed for 10 users should not fall apart
with 11 users.
Software Engineering Principles....(Continued)

• Performance
– The software should have good space/time utilization, fast response
times.
– And the worst response time should not be too different from the
average response time.
• Friendly
– The software should be easy to use, should not irritate the user,
and should be consistent.
– The screen always mirrors the state.
– One key — one effect. Example: F1 for help.
• Verifiable
– It should be possible to prove (verify) correctness of the software.
– A common term that is not easily defined; it is easier to verify a
compiler than a word-processor.
• Flexibility
– Ability to accommodate new requirements (new situations).
Software Engineering Principles....(Continued)

• Maintainable
– Easy to correct or upgrade.
– Code traceable to design; design traceable to requirements.
– Clear simple code.
– Good documentation.
– Simple interfaces between modules.
• Reusable
– We need abstract modules that can be used in many situations.
– Sometimes, we can produce a sequence of products, each using
code from the previous one.
– Example: Accounting systems.
– Object-Oriented techniques aid reuse.
Software Engineering Principles....(Continued)

• Portable
– The software should be easy to move to different platforms.
– This implies few Operating System and hardware dependencies.
– Recent developments in platform standards (PCs, UNIX, . . .) have
aided portability.
– Portability and efficiency often conflict.
– Highly portable systems consist of many layers, each layer hiding
local details.
– Recent achievements in portability depend on fast processors and
large memories.
• Interoperable
– The software should be able to cooperate with other software
(word-processors, spread-sheets, graphics packages, . . .).
• Visible
– All steps must be documented.
Introduction....(Continued)

Program vs Data Structure vs Algorithm

A program
• A set of instructions written in order to solve a problem.
 A solution to a problem actually consists of two things:
 A way to organize the data
 A sequence of steps to solve the problem
Introduction....(continued)

• The way data are organized in a computer's memory is called a data structure.
• The sequence of computational steps to solve a problem is called an algorithm.
• Therefore, a program is data structures plus algorithms.
• The first step in solving a problem is obtaining one's own abstract view, or model, of the problem.
• This process of modeling is called abstraction.
Abstraction
• Abstraction is a process of classifying characteristics as relevant or irrelevant for the particular purpose at hand, and ignoring the irrelevant ones.
• Abstraction takes us from the problem to a model; a data structure then realizes the model.
• The model defines an abstract view of the problem.
• The model should only focus on problem-related aspects.
Example: model students of OSU.
• Relevant:
 Name
 ID
 Dept
 Age, year
• Not relevant:
 Height, weight
• Using the model, a programmer tries to define the properties of the problem.
• These properties include:
 The data which are affected, and
 The operations that are involved in the problem.
• An entity with the properties just described is called an abstract data type (ADT).
Abstract Data Types
• An ADT is a logical description of a problem's data.
• An abstract data type (ADT) is the realization of a data type as a software component.
• It is a specification that describes a data set and the operations on that data.
• Think of an ADT as a picture of the data and of the operations that manipulate and change that data.
• The ADT specifies:
 What data is stored.
 What operations can be done on the data.
• It does not specify how to store the data or how to implement the operations.
• It is independent of any programming language.
Example: ADT employees of an organization:
 This ADT stores employees with their relevant attributes, discarding the irrelevant ones.
Relevant: Name, ID, Sex, Age, Salary, Dept, Address
 char name[20]; char id[10]; int age; int salary; char dept[10]; char address[15];
Not relevant: weight, color, height
 This ADT supports hiring, firing, retiring, … operations.
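The split between "what" and "how" can be sketched in C++ (a hypothetical illustration, not a prescribed interface): the abstract class states what is stored and what operations exist, while a concrete class supplies the implementation.

```cpp
#include <string>

// The ADT: states WHAT is stored and WHAT can be done, not how.
// (The exact method set here is an illustrative assumption.)
class EmployeeADT {
public:
    virtual ~EmployeeADT() {}
    virtual std::string name() const = 0;   // stored data, exposed abstractly
    virtual bool isActive() const = 0;
    virtual void hire() = 0;                // operations the ADT supports
    virtual void fire() = 0;
};

// One possible data structure implementing the ADT:
// this is where the "how" (storage, field layout) lives.
class SimpleEmployee : public EmployeeADT {
    std::string name_;
    bool active_ = false;
public:
    explicit SimpleEmployee(const std::string& n) : name_(n) {}
    std::string name() const override { return name_; }
    bool isActive() const override { return active_; }
    void hire() override { active_ = true; }
    void fire() override { active_ = false; }
};
```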
Data Structure
• A data structure is the physical description of a problem.
• A data structure is an implementation of an ADT.
• It can be implemented and used within an algorithm.
• A data structure is a way of collecting and organizing data so that we can perform operations on it effectively.
• Anything that can store data can be called a data structure.
– Hence Integer, Float, Boolean, Char, etc. are all data structures. They are known as primitive data structures.
• Some more complex data structures are:
– Linked List
– Tree
– Graph
– Stack, Queue, etc.
• All these data structures allow us to perform different operations on data.
• We select a data structure based on which type of operation is required.
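As a small illustration of selecting a structure by operation (a sketch, not part of the slides): if the only operations needed are "insert at end" and "remove from end", a stack fits, and can be built on top of a dynamic array.

```cpp
#include <vector>

// A stack chosen because the problem only needs push/pop at one
// end (the class name IntStack is illustrative).
class IntStack {
    std::vector<int> data_;
public:
    void push(int v) { data_.push_back(v); }
    int pop() {                       // caller must ensure the stack is non-empty
        int v = data_.back();
        data_.pop_back();
        return v;
    }
    bool empty() const { return data_.empty(); }
};
```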
Data Structure
• Example:
struct Student_Record
{
    char name[20];
    char ID_NO[10];
    char Department[10];
    int age;
};
• Attributes of each variable:
– Name: Textual label.
– Address: Location in memory.
– Scope: Visibility in statements of a program.
– Type: Set of values that can be stored + set of operations that can be
performed.
– Size: The amount of storage required to represent the variable.

Selecting a Data Structure
Select a data structure as follows:
1. Analyze the problem to determine the resource constraints
a solution must meet.
2. Determine the basic operations that must be supported.
Quantify the resource constraints for each operation.
3. Select the data structure that best meets these requirements.
Algorithm
• An algorithm is a method or a process followed to solve a problem.
• If the problem is viewed as a function, then an algorithm is an implementation of the function that transforms an input to the corresponding output.

• Inputs → Algorithm → Outputs

• A problem can be solved by many different algorithms, whereas a given algorithm solves only one problem.
• Data structures model the static part of the world. They
are unchanging while the world is changing.
• In order to model the dynamic part of the world we
need to work with algorithms.
• The quality of a data structure is related to its ability to
successfully model the characteristics of the world
(problem).
• Similarly, the quality of an algorithm is related to its
ability to successfully simulate the changes in the world.

• However, the quality of data structures and algorithms is ultimately determined by their ability to work together well (they are cooperative).

• Generally speaking, correct data structures lead to simple and efficient algorithms, and correct algorithms lead to accurate and efficient data structures.

• How do algorithms transform data structures from one state to another?
– Take values as input. Example: cin>>age;
– Change the values held by data structures. Example: age=age+1;
– Change the organization of the data structure. Example: sort students by name.
– Produce outputs. Example: display student's information.
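The four kinds of transformation above can be sketched in one place (the Student record and helper names are illustrative):

```cpp
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

struct Student {
    std::string name;
    int age;
};

// Change a value held by the data structure (age = age + 1).
void birthday(Student& s) { s.age = s.age + 1; }

// Change the organization of the data structure (sort by name).
void sortByName(std::vector<Student>& v) {
    std::sort(v.begin(), v.end(),
              [](const Student& a, const Student& b) { return a.name < b.name; });
}

// Produce output (display each student's information).
void display(const std::vector<Student>& v) {
    for (const auto& s : v)
        std::cout << s.name << ' ' << s.age << '\n';
}
```

Taking values as input would correspond to reading fields with cin>> before calling these helpers.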
Properties of Algorithms
Finiteness:
 An algorithm must complete after a finite number of steps.

Finite:
    int i = 0;
    while (i < 10) {
        cout << i;
        i++;
    }

Infinite:
    while (true) {
        cout << "Hello";
    }
Definiteness (Absence of ambiguity):
 There can be no ambiguity as to which step will be performed next.
 Each step must be clearly defined, having one and only one interpretation.
 At each point in the computation, one should be able to tell exactly what happens next.
Sequential:
 Each step must have a uniquely defined preceding and succeeding step.
 The first step (start step) and last step (halt step) must be clearly noted.
Correctness:
 It must compute the correct answer for all possible legal inputs.
 The output should be as expected, required, and correct.
Language Independence:
 It must not depend on any one programming language.
Feasibility:
 Each instruction should have the possibility to be executed.
1) for (int i = 0; i < 0; i++) {
       cout << i;   // there is no possibility that this statement will be executed
   }
2) if (5 > 7) {
       cout << "hello";   // not executed
   }
Effectiveness:
 Doing the right thing. It should yield the correct result all the time, for all possible cases.

Efficiency:
 It must solve the problem with the least amount of computational resources, such as time and space.
 Producing an output as per the requirement within the given resources (constraints).
Example: Write a program that takes a number and displays the square of the number.
1) int x;
   cin >> x;
   cout << x * x;

2) int x, y;
   cin >> x;
   y = x * x;
   cout << y;
Example: Write a program that takes two numbers and displays the sum of the two.

Program I:
    cin >> a;
    cin >> b;
    sum = a + b;
    cout << sum;

Program II:
    cin >> a;
    cin >> b;
    a = a + b;
    cout << a;

Program III (the most efficient):
    cin >> a;
    cin >> b;
    cout << a + b;

All are effective but with different efficiencies.
Precision:
 The result should always be the same if the algorithm is given identical input.

Simplicity:
 A good general rule is that each step should carry out one logical operation.
• What is simple to one processor may not be simple to another.
Chapter Two

Algorithm Analysis Concepts
Algorithm Complexity Analysis
 Algorithm analysis refers to the process of determining how much computing time and storage an algorithm will require.
 In other words, it is the process of predicting the resource requirements of an algorithm in a given environment.
 In order to solve a problem, there are many possible algorithms.
 One has to be able to choose the best algorithm for the problem at hand using some scientific method.
Algorithm Complexity Analysis
 To measure the goodness of data structures and algorithms, we need precise ways of analyzing them in terms of resource requirements.
 The main resources are:
 Running time (computational time complexity)
 Memory usage (computational space complexity)
 Bandwidth usage (the total amount of data communicated among processing nodes in a distributed computing environment)
 Running time is usually treated as the most important, since computational time is the most precious resource in most problem domains.
Algorithm Complexity Analysis
 There are two approaches to measuring the efficiency of algorithms:
 Empirical: programming competing algorithms and trying them on different instances.
 Theoretical: determining mathematically the quantity of resources (execution time, memory space, etc.) needed by each algorithm.
1. Empirical approach
 Based on the total running time of the program.
 Uses actual system clock time.
Algorithm Complexity Analysis
Example:
    t1
    for (int i = 0; i <= 10; i++)
        cout << i;
    t2

Running time taken by the above algorithm (TotalTime) = t2 - t1
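The t1/t2 sketch above can be made runnable with std::chrono (the loop being timed is an arbitrary placeholder workload, not from the slides):

```cpp
#include <chrono>

// Placeholder workload: sum of 1..n.
long long sumTo(long long n) {
    long long s = 0;
    for (long long i = 1; i <= n; ++i)
        s += i;
    return s;
}

// Empirical measurement, as in the t1/t2 example above.
long long elapsedMicros(long long n) {
    auto t1 = std::chrono::steady_clock::now();   // like "t1" above
    volatile long long s = sumTo(n);              // the work being measured
    (void)s;                                      // volatile keeps it from being optimized away
    auto t2 = std::chrono::steady_clock::now();   // like "t2" above
    return std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1).count();
}
```

Running elapsedMicros twice on the same n can give different results, which is precisely the inconsistency of clock-time measurement discussed next.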

 However, it is difficult to use actual clock-time as a consistent


measure of an algorithm’s efficiency, because clock-time can
vary based on many factors. For example,
 Specific processor speed
 Current processor load
Algorithm Complexity Analysis
 Input data used in the program
 Input size
 Input properties
 Programming language
 For example, C is relatively faster than Java, because C is closer to machine language, so Java takes a relatively larger amount of time for interpretation/translation to machine code.
 Operating environment
 Algorithm used
Note: the important factors for this course are the input size and the algorithm used.
Algorithm Complexity Analysis
2. Theoretical approach
 Determining the quantity of resources required using mathematical concepts.
 We analyze an algorithm according to the number of basic operations (time units) required, rather than according to an absolute amount of time involved.
 This can show how an algorithm's efficiency changes as the size of the input grows.
Algorithm Complexity Analysis
 We use the theoretical approach to determine the efficiency of an algorithm because:
• The number of operations will not vary under different conditions.
• It gives us a meaningful measure that permits comparison of algorithms independent of the operating platform.
• It helps to determine the complexity of an algorithm.
Algorithm Complexity Analysis
An algorithm can be analyzed under three specific
cases:
Best case
Average case
Worst case
Algorithm Complexity Analysis
Best case analysis
We analyze the performance of the algorithm under the circumstances in which it works best.
In that way, we can determine the lower bound of its performance.
However, note that we may obtain these results only under very unusual or special circumstances, and it may be difficult to find the optimum input data for such an analysis.
Algorithm Complexity Analysis
Average case analysis
This gives an indication of how the algorithm performs under the most probable data conditions.
In principle, this analysis is made by taking all possible combinations of data, experimenting with them, and finally averaging the results.
However, such an analysis may not reflect the exact behavior of the algorithm on a real-life data set.
Nevertheless, this analysis gives you a better idea of how the algorithm works for your problem.
Algorithm Complexity Analysis
Worst case analysis
In contrast to the best-case analysis, this gives an indication of how badly the algorithm can perform;
 in other words, it gives an upper bound for its performance.
Sometimes this is useful in determining the applicability of an algorithm to a mission-critical application.
However, this analysis may be too pessimistic for a general application, and it may even be difficult to find a test data set that produces the worst case.
Algorithm Analysis Rules
 There is no generally accepted set of rules for
algorithm analysis.
 However, an exact count of operations is
commonly used.
 To count the number of operations we can use the
following Analysis Rule.
Analysis Rules:
1. Assume an arbitrary time unit.
2. Execution of one of the following operations takes 1 time unit:
 Assignment operation
Example: i=0;
 Single input/output operation
Example: cin>>a;
         cout<<"hello";
 Single Boolean operation
Example: i>=10
 Single arithmetic operation
Example: a+b;
 Function return
Example: return sum;
3. The running time of a selection statement (if, switch) is the time for the condition evaluation plus the maximum of the running times of the individual clauses in the selection.
Example:
    int x;
    int sum = 0;
    if (a > b)
    {
        sum = a + b;
        cout << sum;
    }
    else
    {
        cout << b;
    }
T(n) = 1 + 1 + max(3, 1) = 5
4. Loop statements:
• The running time of a loop is the running time of the statements inside the loop × the number of iterations, plus the time for setup (1), plus the time for checking (number of iterations + 1), plus the time for updates (number of iterations).
• The total running time of statements inside a group of nested loops is the running time of the statements × the product of the sizes of all the loops.
• For nested loops, analyze inside out.
• Always assume that the loop executes the maximum number of iterations possible. (Why? Because we are interested in the worst-case complexity.)
5. Function call:
• 1 for setup + the time for any parameter calculations + the time required for the execution of the function body.
Examples:
1)
    int k = 0, n;
    cout << "Enter an integer";
    cin >> n;
    for (int i = 0; i < n; i++)
        k++;
T(n) = 3 + 1 + (n+1) + n + n = 3n + 5
Algorithm Analysis Rules
Example
int count() {
    int k = 0, n;
    cout << "Enter an integer";
    cin >> n;
    for (int i = 0; i < n; i++)
        k = k + 1;
    return 0;
}
Time units to compute:
1 for the assignment statement: k=0
1 for the output statement: cout<<
1 for the input statement: cin>>
In the for loop:
1 assignment: i=0
n+1 tests: i<n
n increments: i++
n loops of 2 units for an assignment and an addition.
1 for the return statement.
-------------------------------------------------------------------
T(n) = 1+1+1+(1+n+1+n)+2n+1 = 4n+6 = O(n)
Algorithm Analysis Rules
Example
int total(int n)
{
    int sum = 0;
    for (int i = 1; i <= n; i++)
        sum = sum + 1;
    return sum;
}

Time units to compute:
1 for the assignment statement: sum=0
In the for loop:
1 assignment: i=1
n+1 tests: i<=n
n increments: i++
n loops of 2 units for an assignment and an addition.
1 for the return statement.
-------------------------------------------------------------------
T(n) = 1+(1+n+1+n)+2n+1 = 4n+4 = O(n)
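As a sanity check (an illustrative sketch, not part of the original slides), the counting scheme above can be instrumented in code: incrementing a counter at each charged operation reproduces T(n) = 4n + 4.

```cpp
// Runs the total() example above while counting time units
// exactly as the analysis rules charge them.
long long countUnits(int n) {
    long long units = 0;
    int sum = 0;   units++;        // 1: assignment sum = 0
    int i = 1;     units++;        // 1: loop setup i = 1
    while (true) {
        units++;                   // one unit per test i <= n (n+1 tests total)
        if (!(i <= n)) break;
        sum = sum + 1; units += 2; // 2: an addition and an assignment
        i++;           units++;    // 1: increment
    }
    units++;                       // 1: return statement
    return units;
}
```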
Algorithm Analysis Rules
Example
int sum(int n)
{
    int partial_sum = 0;
    for (int i = 1; i <= n; i++)
        partial_sum = partial_sum + (i * i * i);
    return partial_sum;
}

Time units to compute:
1 for the assignment.
In the for loop:
1 assignment, n+1 tests, and n increments.
n loops of 4 units for an assignment, an addition, and two multiplications.
1 for the return statement.
-------------------------------------------------------------------
T(n) = 1+(1+n+1+n)+4n+1 = 6n+4 = O(n)
Algorithm Analysis Rules
Example
void func()
{
    int x = 0;
    int i = 0;
    int j = 1;
    int n;
    cout << "Enter n";
    cin >> n;
    while (i < n) {
        x++;
        i++;
    }
    while (j < n) {
        j++;
    }
}
Time units to compute:
1 for the first assignment statement: x=0
1 for the second assignment statement: i=0
1 for the third assignment statement: j=1
1 for the output statement: cout<<
1 for the input statement: cin>>
In the first while loop:
n+1 tests: i<n
n loops of 2 units for the two increment (addition) operations x++ and i++
In the second while loop:
n tests: j<n
n-1 increments: j++
-------------------------------------------------------------------
T(n) = 1+1+1+1+1+(n+1)+2n+n+(n-1) = 5n+5 = O(n)
Algorithm Analysis Rules
Example
void func()
{
    int x = 0;
    int i = 0;
    int n;
    cout << "Enter n";
    cin >> n;
    while (i < n)
    {
        int j = 1;
        while (j < n)
        {
            j++;
        }
        x++;
        i++;
    }
}
Time units to compute:
1 for the first assignment statement: x=0
1 for the second assignment statement: i=0
1 for the output statement: cout<<
1 for the input statement: cin>>
In the outer while loop:
n+1 tests: i<n
n loops of 3 units: 2 for the two increments x++ and i++, and 1 for the assignment j=1
n executions of the inner loop
In the inner while loop:
n tests: j<n
n-1 increments: j++
-------------------------------------------------------------------
T(n) = 1+1+1+1+(n+1)+3n+n(n+n-1) = 2n² + 3n + 5
T(n) = O(n²)
Algorithm Analysis Rules
Formal Approach
 In the examples above, the analysis is laborious because it tries to count each and every detail.
 However, it can be simplified by using a formal approach,
 in which case we can ignore statements such as:
 initializations,
 loop control statements, and
 bookkeeping (declarations, pushes and pops in function calls, etc.).
Algorithm Analysis Rules (Formal Approach)
For Loops: Formally
 In general, a for loop translates to a summation.
 The index and bounds of the summation are the same as the index and bounds of the for loop.

for (int i = 1; i <= N; i++) {
    sum = sum + i;
}

translates to:  Σ (i = 1 to N) 1 = N
Algorithm Analysis Rules (Formal Approach)
Nested Loops:
 Nested for loops translate into multiple summations, one for each for loop.

for (int i = 1; i <= N; i++) {
    for (int j = 1; j <= M; j++) {
        sum = sum + i + j;
    }
}

translates to:  Σ (i = 1 to N) Σ (j = 1 to M) 2 = Σ (i = 1 to N) 2M = 2MN

 The outer summation is for the outer for loop.
 2 is the number of additions in the inner loop body.
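The 2MN result can be checked mechanically (an illustrative sketch): count the additions actually performed by the nested loops.

```cpp
// Counts the additions executed by the N-by-M nested loop above.
// Each inner iteration performs 2 additions (sum + i, then + j).
long long countAdditions(int N, int M) {
    long long count = 0;
    int sum = 0;
    for (int i = 1; i <= N; ++i)
        for (int j = 1; j <= M; ++j) {
            sum = sum + i + j;   // the two additions being charged
            count += 2;
        }
    return count;
}
```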
Algorithm Complexity Analysis (Formal Approach)
Consecutive Statements
 Add the running times of the separate blocks of your code.

for (int i = 1; i <= N; i++) {
    sum = sum + i;
}
for (int i = 1; i <= N; i++) {
    for (int j = 1; j <= N; j++) {
        sum = sum + i + j;
    }
}

translates to:  (Σ (i = 1 to N) 1) + (Σ (i = 1 to N) Σ (j = 1 to N) 2) = N + 2N²
Algorithm Complexity Analysis (Formal Approach)
Conditional Statements
 If (test) s1 else s2: compute the maximum of the running times of s1 and s2.

if (test == 1) {
    for (int i = 1; i <= N; i++) {
        sum = sum + i;
    }
}
else {
    for (int i = 1; i <= N; i++) {
        for (int j = 1; j <= N; j++) {
            sum = sum + i + j;
        }
    }
}

translates to:  max(Σ (i = 1 to N) 1, Σ (i = 1 to N) Σ (j = 1 to N) 2) = max(N, 2N²) = 2N²

Exercises
Algorithm Complexity Analysis
Categories of Algorithm Analysis
• Algorithms may be examined under different situations to correctly determine their efficiency for accurate comparison.
• We can categorize algorithm analysis as:
– Best case
– Average case
– Worst case
Algorithm Complexity Analysis
Best Case Analysis:
• We are usually not interested in the best case because it rarely happens.
• An analysis based on the best case is not likely to be representative of the behavior of the algorithm.
• It does not describe the typical behavior of the algorithm.
• It assumes the input data are arranged in the most advantageous order for the algorithm, i.e., it takes the smallest possible set of inputs.
• It causes execution of the fewest number of statements.
Algorithm Complexity Analysis
• It computes the lower bound of T(n), where T(n) is the complexity function.
Examples:
For a sorting algorithm
 If the list is already sorted (data are arranged in the required order).
For a searching algorithm
 If the desired item is located at the first accessed position.
Algorithm Complexity Analysis
Worst Case Analysis:
• Assumes the input data are arranged in the most disadvantageous order for the algorithm.
• Takes the worst possible set of inputs.
• Causes execution of the largest number of statements.
• Computes the upper bound of T(n), where T(n) is the complexity function.
Examples:
For sorting algorithms
 If the list is in the opposite order.
For searching algorithms
 If the desired item is located at the last position or is missing.
Algorithm Complexity Analysis
Worst Case Analysis:
• Worst case analysis is the most common analysis because:
It provides the upper bound for all inputs (even bad ones).
Average case analysis is often difficult to determine and define.
If inputs were always in their best-case arrangement, there would be little need to develop better algorithms.
Best case analysis cannot be used to estimate complexity.
We are interested in the worst case time since it provides a bound for all inputs; this is called the "Big-Oh" estimate.
Average Case Analysis:
• Determines the average of the running time over all permutations of the input data.
• Takes an average set of inputs.
• It assumes random input of a given size.
• It causes execution of an average number of statements.
• Computes the tight (optimal) bound of T(n), where T(n) is the complexity function.
• Sometimes average cases are as bad as worst cases and as good as best cases.
Examples:
For sorting algorithms
 While sorting, considering any arrangement (order) of the input data.
For searching algorithms
 While searching, if the desired item is located at any location or is missing.
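Linear search makes all three cases concrete (a sketch; the function name is illustrative): the best case hits at index 0 with one comparison, the worst case scans all n elements, and the average case scans about n/2.

```cpp
#include <vector>

// Returns the index of key in a, or -1 if it is missing.
int linearSearch(const std::vector<int>& a, int key) {
    for (int i = 0; i < static_cast<int>(a.size()); ++i)
        if (a[i] == key)
            return i;   // best case: key at index 0 (1 comparison)
    return -1;          // worst case: key at the end or missing (n comparisons)
}
// Average case: over random key positions, about n/2 comparisons.
```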
Algorithm Complexity Analysis
Asymptotic (growth rate) Analysis
 Asymptotic analysis is concerned with how the running time of an algorithm increases as the size of the input increases.
 It is only concerned with what happens for very large values of n.
 For example, if the number of operations in an algorithm is n² − n, the n term is insignificant compared to n² for large values of n.
 Hence the n term is ignored.
 Of course, for small values of n, it may be important.
 However, growth rate analysis is mainly concerned with large values of n.
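A quick numeric illustration of why the n term is dropped (an illustrative sketch): the n term's share of n² shrinks toward zero as n grows.

```cpp
// Fraction that the low-order n term contributes relative to n^2.
// For n = 10 it is 10%, for n = 1,000,000 it is 0.0001%.
double lowOrderFraction(double n) {
    return n / (n * n);
}
```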
Algorithm Complexity Analysis
Types of Asymptotic Notations
 There are five notations used to describe a running time function asymptotically. These are:
 Big-Oh notation (O)
 Big-Omega notation (Ω)
 Theta notation (Θ)
 Little-o notation (o)
 Little-omega notation (ω)
Algorithm Complexity Analysis
1. Big-Oh Notation
 Definition: We say f(n) = O(g(n)) if there are positive constants n0 and c such that, to the right of n0, the value of f(n) always lies on or below c·g(n).
• As n increases, f(n) grows no faster than g(n).
• It is only concerned with what happens for very large values of n.
• Describes the worst case analysis.
• Gives an upper bound for a function to within a constant factor.
• O-notation is used to represent the amount of time an algorithm takes on the worst possible set of inputs: the "worst case".
• Note: different functions with the same growth rate may be represented using the same O notation.
Question-1
f(n) = 10n+5 and g(n) = n. Show that f(n) is O(g(n)).
To show that f(n) is O(g(n)), we must show that there exist constants c and k such that f(n) <= c·g(n) for all n >= k.
10n+5 <= c·n for all n >= k.
Let c = 15; then show that 10n+5 <= 15n,
i.e., 5 <= 5n, i.e., 1 <= n.
So f(n) = 10n+5 <= 15·g(n) for all n >= 1.
(c = 15, k = 1): there exist two constants that satisfy the above constraints.
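The constants found above (c = 15, k = 1) can be checked mechanically (an illustrative sketch):

```cpp
// Returns true when 10n + 5 <= 15n, i.e. when the Big-Oh bound
// from Question-1 holds. It holds exactly for n >= 1.
bool bigOHolds(long long n) {
    return 10 * n + 5 <= 15 * n;
}
```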

Question-2
f(n) = 3n²+4n+1. Show that f(n) = O(n²).
4n <= 4n² for all n >= 1, and 1 <= n² for all n >= 1.
So 3n²+4n+1 <= 3n²+4n²+n² for all n >= 1
            <= 8n² for all n >= 1.
We have shown that f(n) <= 8n² for all n >= 1.
Therefore f(n) is O(n²): (c = 8, k = 1) are two constants that satisfy the constraints.
2. Big-Omega (Ω) Notation (Lower bound)
• Definition: We write f(n) = Ω(g(n)) if there are positive constants n0 and c such that, to the right of n0, the value of f(n) always lies on or above c·g(n).
• As n increases, f(n) grows no slower than g(n).
• Describes the best case analysis.
• Used to represent the amount of time the algorithm takes on the smallest possible set of inputs: the "best case".

Example:
Find g(n) such that f(n) = Ω(g(n)) for f(n) = 3n+5.
Take g(n) = √n, c = 1, k = 1: 3n+5 >= √n for all n >= 1,
so f(n) = 3n+5 = Ω(√n).
Big-Omega ()-Notation (Lower bound)

41
3. Theta Notation (Θ-Notation) (Optimal bound)
• Definition: We say f(n)=Θ(g(n)) if there exist positive
constants no, c1 and c2 such that to the right of no, the value
of f(n) always lies between c1.g(n) and c2.g(n) inclusive, i.e.,
c1.g(n)<=f(n)<=c2.g(n), for all n>=no.
• As n increases, f(n) grows as fast as g(n).
• Describes the average case analysis.
• To represent the amount of time the algorithm takes on an
average set of inputs- “Average case”.
Example: Find g(n) such that f(n) = Θ(g(n)) for f(n)=2n2+3
n2 <= 2n2+3 <= 3n2 for all n>=2
c1=1, c2=3 and no=2 (no=1 does not work: at n=1,
2n2+3=5 exceeds 3n2=3)
f(n) = Θ(n2).
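The Theta sandwich above can be checked numerically. A minimal sketch (a finite check, not a proof) verifying that with c1=1, c2=3 and no=2 the value 2n^2+3 stays between n^2 and 3n^2:

```python
# Check the Theta witnesses for f(n)=2n^2+3 over a range of n >= no=2.
ok = all(n * n <= 2 * n * n + 3 <= 3 * n * n for n in range(2, 10_001))
print(ok)  # True

# The same check starting at n=1 fails, showing why no=2 is needed:
ok_from_1 = all(n * n <= 2 * n * n + 3 <= 3 * n * n for n in range(1, 10_001))
print(ok_from_1)  # False (2*1+3=5 > 3*1=3)
```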
[Figure: Theta Notation (Θ-Notation) (Optimal bound) — f(n) lies between c1.g(n) and c2.g(n) to the right of no]
4. Little-oh (small-oh) Notation
• Definition: We say f(n)=o(g(n)) if, for every positive
constant c, there is a positive constant no such that to the
right of no, the value of f(n) lies strictly below c.g(n).
• As n increases, g(n) grows strictly faster than f(n), i.e.,
f(n)/g(n) tends to 0.
• Denotes an upper bound that is not asymptotically tight.
• Big O-Notation denotes an upper bound that may or may
not be asymptotically tight.
Example:
Find g(n) such that f(n) = o(g(n)) for f(n) = n2
n2 < c.n3 whenever n > 1/c, so g(n) = n3, f(n)=o(n3)
n2 < c.n4 for all large n, so g(n) = n4, f(n)=o(n4)
Note: n2 is not o(n2), since n2 < c.n2 fails for every c<=1.
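The "tends to 0" characterization above is easy to observe numerically. A small sketch: the ratio n^2/n^3 shrinks toward 0 as n grows (so n^2 = o(n^3)), while n^2/n^2 stays fixed at 1 (so n^2 is not o(n^2)).

```python
# Ratios f(n)/g(n) for growing n illustrate little-oh.
ns = (10, 100, 1000, 10_000)
ratios_cubic = [n**2 / n**3 for n in ns]  # heads to 0
ratios_same = [n**2 / n**2 for n in ns]   # stuck at 1
print(ratios_cubic)  # [0.1, 0.01, 0.001, 0.0001]
print(ratios_same)   # [1.0, 1.0, 1.0, 1.0]
```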
5. Little-Omega (ω) notation
• Definition: We write f(n)=ω(g(n)) if, for every positive
constant c, there is a positive constant no such that to the
right of no, the value of f(n) always lies strictly above c.g(n).
• As n increases, f(n) grows strictly faster than g(n), i.e.,
f(n)/g(n) tends to infinity.
• Denotes a lower bound that is not asymptotically tight.
• Big Ω-Notation denotes a lower bound that may or may not
be asymptotically tight.
Example: Find g(n) such that f(n)=ω(g(n)) for f(n)=n2+3
g(n)=n: for any c, n2+3 > c.n once n > c, so f(n)=ω(n).
g(n)=√n: n2+3 > c.√n for all large n, so g(n)=√n can also be a
solution.
Rules to estimate Big Oh of a given
function
When considering the growth rate of a function using Big-
Oh
 Ignore the lower order terms and the coefficients of the
highest-order term
 No need to specify the base of logarithm
 Changing the base from one constant to another changes the
value of the logarithm by only a constant factor
Example:
1. T(n)=3n + 5 => O(n)
2. T(n)=3n2+4n+2 => O(n2)
Rule 1:
If T1(n)=O(f(n)) and T2(n)=O(g(n)), then
a) T1(n)+T2(n)=max(O(f(n)),O(g(n))),
b) T1(n)*T2(n)=O(f(n)*g(n))
Rule 2:
If T(n) is a polynomial of degree k, then T(n)=Θ(nk).
Rule 3:
logk n=O(n) for any constant k. This tells us that
logarithms grow very slowly.
• We can always determine the relative growth rates of two
functions f(n) and g(n) by computing lim n->infinity
f(n)/g(n). The limit can have four possible values.
– The limit is 0: This means that f(n)=o(g(n)).
– The limit is c≠0: This means that f(n)=Θ(g(n)).
– The limit is infinity: This means that
g(n)=o(f(n)).
– The limit oscillates: This means that there is no
relation between f(n) and g(n).
Example:
• n3 grows faster than n2, so we can say that
n2=O(n3) or n3=Ω(n2).
• f(n)=n2 and g(n)=2n2 grow at the same rate, so
both f(n)=O(g(n)) and f(n)=Ω(g(n)) are true.
• If f(n)=2n2, f(n)=O(n4), f(n)=O(n3), and f(n)=O(n2)
are all correct, but the last option is the best
answer.
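The limit rule above can be approximated numerically: evaluate f(n)/g(n) at a large n and see whether the ratio heads toward 0, a constant, or infinity. A sketch (a single large sample stands in for the true limit):

```python
# Approximate lim f(n)/g(n) by evaluating the ratio at a large n.
def ratio(f, g, n=10**6):
    return f(n) / g(n)

r1 = ratio(lambda n: n**2, lambda n: n**3)      # near 0  => n^2 = o(n^3)
r2 = ratio(lambda n: n**2, lambda n: 2 * n**2)  # 0.5     => n^2 = Theta(2n^2)
print(r1, r2)
```

A single sample cannot detect an oscillating limit, so this is only a heuristic companion to the analytic rule.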
T(n)                   Complexity function F(n)   Big-O category
c, c is constant       1                          c=O(1)
10logn + 5             logn                       T(n)=O(logn)
√n + 2                 √n                         T(n)=O(√n)
5n + 3                 n                          T(n)=O(n)
3nlogn + 5n + 2        nlogn                      T(n)=O(nlogn)
10n^2 + nlogn + 1      n^2                        T(n)=O(n^2)
5n^3 + 2n^2 + 5        n^3                        T(n)=O(n^3)
2^n + n^5 + n + 1      2^n                        T(n)=O(2^n)
7n! + 2^n + n^2 + 1    n!                         T(n)=O(n!)
8n^n + 2^n + n^2 + 3   n^n                        T(n)=O(n^n)
 Arrangement of common functions by growth rate. List of typical growth rates.

Function   Name
c          Constant
log N      Logarithmic
log^2 N    Log-squared
N          Linear
N log N    Log-Linear
N^2        Quadratic
N^3        Cubic
2^N        Exponential
• The order of the body statements of a
given algorithm is very important in
determining Big-Oh of the algorithm.
Example: Find Big-Oh of the following
algorithm.
1. for( int i=1;i<=n; i++)
      sum=sum + i;
T(n)=2*n=2n=O(n).
2. for(int i=1; i<=n; i++)
      for(int j=1; j<=n; j++)
         k++;
T(n)=1*n*n=n2 = O(n2).
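The operation counts claimed for the two loops above can be verified by instrumenting them. A sketch in Python (the slides' snippets are C-style; the loop structure is the same): the single loop performs 2 operations per pass (one addition, one assignment), and the nested loop's body runs once per (i, j) pair.

```python
# Count operations performed by the two example loops.

def single_loop_ops(n):
    """Loop 1: sum = sum + i, counted as 2 ops per iteration -> T(n)=2n."""
    ops = 0
    s = 0
    for i in range(1, n + 1):
        s = s + i   # one addition + one assignment
        ops += 2
    return ops

def nested_loop_body_count(n):
    """Loop 2: k++ executes once per (i, j) pair -> T(n)=n*n."""
    k = 0
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            k += 1
    return k

print(single_loop_ops(100))        # 200 = 2n
print(nested_loop_body_count(100)) # 10000 = n^2
```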
Algorithm Complexity Analysis
(Formal Approach)
Big-Oh formal definition (O)
 a function f(n) is of order (or has complexity)
O(g(n)) if and only if there exist constants n0 > 0
and c > 0 such that f(n) ≤ c[g(n)] for all n > n0
In simple words, f(n) = O(g(n)) means that
the growth rate of f(n) is less than or equal to that of g(n).
Algorithm Complexity Analysis
(Formal Approach)
Examples: The following points are facts that you
can use for Big-Oh problems:
 1<=n for all n>=1
 n<=n2 for all n>=1
 2n <=n! for all n>=4
 log2n<=n for all n>=2
 n<=nlog2n for all n>=2
 Big-O expresses an upper bound on the growth rate of
a function, for sufficiently large values of n.
 An upper bound is the best algorithmic solution that
has been found for a problem.
 It tells “ What is the best that we know we can do?”
Algorithm Complexity Analysis
(Formal Approach)
Big-Oh Typical orders

N     O(1)  O(log n)  O(n)    O(n log n)  O(n^2)     O(n^3)
1     1     1         1       1           1          1
2     1     1         2       2           4          8
4     1     2         4       8           16         64
8     1     3         8       24          64         512
16    1     4         16      64          256        4,096
1024  1     10        1,024   10,240      1,048,576  1,073,741,824
Examples
Big Oh: more examples
 n2/2 – 3n = O(n2)
 1 + 4n = O(n)
 7n2 + 10n + 3 = O(n2) = O(n3)
 log10 n = log2 n / log2 10 = O(log2 n) = O(log n)
 sin(n) = O(1); 10 = O(1), 10^10 = O(1)
 log n + n = O(n)
 logk n = O(n) for any constant k
 n = O(2^n), but 2^n is not O(n)
 2^(10n) is not O(2^n)
 Σ (i=1 to N) i <= N·N = O(N^2)
Math Review: logarithmic functions
 x = b^a iff logb x = a
 log(ab) = log a + log b
 loga b = logm b / logm a
 log(a^b) = b·log a
 a^(log n) = n^(log a)
 log^b a = (log a)^b ≠ log(a^b)
 d(ln x)/dx = 1/x
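The logarithm identities above can be spot-checked with Python's `math` module. A quick numeric sketch at arbitrary sample values (floating point, so comparisons use a tolerance):

```python
import math

a, b, x = 3.0, 5.0, 7.0

# log(ab) = log a + log b
assert math.isclose(math.log(a * b), math.log(a) + math.log(b))
# change of base: log_a b = log_m b / log_m a (natural log as base m)
assert math.isclose(math.log(b, a), math.log(b) / math.log(a))
# log(a^b) = b * log a
assert math.isclose(math.log(a ** b), b * math.log(a))
# a^(log n) = n^(log a): take logs of both sides to see they agree
assert math.isclose(a ** math.log(x), x ** math.log(a))

print("all identities hold")
```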
Some rules
When considering the growth rate of a function using Big-Oh
 Ignore the lower order terms and the coefficients of the
highest-order term
 No need to specify the base of logarithm
 Changing the base from one constant to another changes the value
of the logarithm by only a constant factor

 Important property
 If T1(n) = O(f(n)) and T2(n) = O(g(n)), then
T1(n) + T2(n) = max(O(f(n)), O(g(n))),
T1(n) * T2(n) = O(f(n) * g(n))
Big-Omega
 f(n) = Ω(g(n)): there exist c, n0 > 0 such that f(n) >= c·g(n) when n >= n0
 f(n) grows no slower than g(n) for “large” n
Big-Omega
 f(n) = Ω(g(n)) if there are positive constants c and n0 such
that
f(n) >= c·g(n) for n >= n0
 The growth rate of f(n) is greater than or equal to the
growth rate of g(n).
Example
 Let f(N) = 2N2. Then
f(N) = Ω(N)
f(N) = Ω(N2) (best answer)
Big-Theta f(N) = Θ(g(N))
 the growth rate of f(n) is the same as the growth rate of
g(n)
Big-Theta
f(n) = Θ(g(n)) iff
f(n) = O(g(n)) and f(n) = Ω(g(n))
The growth rate of f(n) equals the growth rate of g(n)
Example:
Let f(n)=n2, g(n)=2n2
Since f(n) = O(g(n)) and f(n) = Ω(g(n)),
thus f(n) = Θ(g(n)).
Big-Theta means the bound is the tightest possible.
Some rules
If T(n) is a polynomial of degree k, then T(n) = Θ(nk).
For logarithmic functions, T(n) = logm n = Θ(log N).
Growth rates …
• Doubling the input size
– f(N) = c => f(2N) = f(N) = c
– f(N) = log N => f(2N) = f(N) + log 2
– f(N) = N => f(2N) = 2 f(N)
– f(N) = N2 => f(2N) = 4 f(N)
– f(N) = N3 => f(2N) = 8 f(N)
– f(N) = 2^N => f(2N) = (f(N))^2
• Advantages of algorithm analysis
– To eliminate bad algorithms early
– pinpoints the bottlenecks, which are worth coding
carefully
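The doubling rules above are easy to confirm directly. A minimal sketch at an arbitrary sample size N: quadratic work grows 4x, cubic work grows 8x, and exponential work squares when the input doubles.

```python
# Effect of doubling the input size on common growth functions.
N = 20

quad_ratio = (2 * N) ** 2 // N ** 2   # N^2: 4x more work
cube_ratio = (2 * N) ** 3 // N ** 3   # N^3: 8x more work
exp_squares = 2 ** (2 * N) == (2 ** N) ** 2  # 2^N: work is squared

print(quad_ratio, cube_ratio, exp_squares)  # 4 8 True
```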
Algorithm Complexity Analysis
(Formal Approach)
Further notes on
 Big-Omega Notation (Ω)
 Theta Notation (Θ)
 Little-o Notation (o)
 Little-Omega Notation (ω)
 Relations among notations
Algorithm Complexity Analysis
 When plotting different algorithms against time, we may
find different behaviors:
 Linear performance: searching for a value in an array of data
 Quadratic performance: working out how “similar” two strings are
 Cubic performance: a simple implementation of the “maximum
contiguous subsequence sum” problem
 Logarithmic performance: binary searching for an integer in a list
of sorted integers or searching in a balanced binary tree
 Polynomial performance:
Algorithm Complexity Analysis
 Algorithm performance analysis focuses on the running time
of the algorithm and the growth rate of the algorithm
 Running time refers to the number of operations that the
algorithm needs to perform on a problem with data size
equal to N
 Growth rate analysis usually takes the running time function
and finds the most dominant part by neglecting the terms
which are less discriminative
 For significantly large data sets, no matter how large these
constants are, they do not affect the overall performance
of the algorithm.
Algorithm Complexity Analysis
 For example: the running time of a given sorting algorithm may
be defined as
f(n) = 0.5n2 + n – 2
 In this case the running time of the algorithm can be
considered quadratic (n2), as the quadratic term is the
most determinant factor
 For small problem sizes, the difference between a good
algorithm and a bad algorithm is not significant.
 But, as problem size increases, the difference becomes
significant.
 The following table illustrates how the differences between
different orders grow with the data set size.
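The dominance of the quadratic term in f(n) = 0.5n^2 + n - 2 can be seen by watching f(n)/n^2 settle toward the constant 0.5 as n grows. A short numeric sketch:

```python
# The ratio f(n)/n^2 tends to 0.5: the lower-order terms fade out.
f = lambda n: 0.5 * n * n + n - 2

ratios = {n: f(n) / n**2 for n in (10, 1000, 100_000)}
for n, r in ratios.items():
    print(n, r)
# 10 -> 0.58, 1000 -> 0.500998, 100000 -> ~0.50001
```

This is exactly why the analysis keeps only the n^2 term: for large n the other terms contribute a vanishing fraction of the total work.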
Algorithm Complexity Analysis

Data size
N      F(N)=2^N     F(N)=N^3  F(N)=N^2  F(N)=N log10 N  F(N)=N  F(N)=k=5
10     1024         10^3      10^2      10              10      5
10^2   1.27x10^30   10^6      10^4      2x10^2          10^2    5
10^3   1.1x10^301   10^9      10^6      3x10^3          10^3    5
10^4   Very large   10^12     10^8      4x10^4          10^4    5
10^5   Very large   10^15     10^10     5x10^5          10^5    5
10^6   Very large   10^18     10^12     6x10^6          10^6    5
10^7   Very large   10^21     10^14     7x10^7          10^7    5
10^8   Very large   10^24     10^16     8x10^8          10^8    5
10^9   Very large   10^27     10^18     9x10^9          10^9    5
10^10  Very large   10^30     10^20     10x10^10        10^10   5
10^11  Very large   10^33     10^22     11x10^11        10^11   5
Algorithm Complexity Analysis
 A fast algorithm enables us to solve a problem on a slow
machine, but a fast machine is no help when we are using a
slow algorithm.
 The following table shows approximate running times for
three algorithms of different orders, on two computers.
 To compute the time that the computer will take, just
compute the complexity expression for a given N and divide
by the processing speed.

Operations     | Problem size (N) = 1 Million        | Problem size (N) = 1 Billion
per sec        | N          N log N     N^2          | N          N log N     N^2
10^6 (1 MHz)   | 1 second   20 seconds  4 months     | 2.8 hours  3.5 days    316231 years
10^9 (1 GHz)   | 0.001 sec  0.02 sec    2.8 hours    | 1 second   30 seconds  316 years
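The recipe just described, evaluate the complexity expression at N and divide by operations per second, can be written directly. A minimal sketch (the table's own figures may use a different convention for what counts as one operation, so exact numbers can differ):

```python
# time (seconds) = operations required / operations per second
def seconds(ops, ops_per_sec):
    return ops / ops_per_sec

N = 10**6
# Linear work on a 1 MHz machine:
print(seconds(N, 10**6))      # 1.0 second
# Quadratic work on a 1 GHz machine:
print(seconds(N * N, 10**9))  # 1000.0 seconds
```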
Algorithm Complexity Analysis
 Are linear algorithms better or faster than cubic algorithms
for any input size?
 It is not sensible to make generalizations:
 actual performance depends on the size of the input, the constant
factors, the implementation, and the environment.
 When considering algorithms, we need to be aware that in
many cases:
 different constants may be the dominant factors
 a different order may be the dominant factor
 order may not matter because the data set is small or the algorithm
is insignificant
 memory may be the limiting factor
Algorithm Complexity Analysis
 Complexity analysis is the systematic study of the cost of
computation, measured either in time units or in operations
performed, or in the amount of storage space required.
 The goal is to have a meaningful measure that permits
comparison of algorithms independent of operating
platform.
 There are two factors to consider in algorithm analysis:
time complexity and space complexity
 Time Complexity: Determine the approximate number of
operations required to solve a problem of size n.
 Space Complexity: Determine the approximate memory required
to solve a problem of size n.
Algorithm Complexity Analysis
 Complexity analysis involves two distinct phases:
 Algorithm running time analysis: analysis of the algorithm or
data structure to produce a function T(n) that describes the
algorithm in terms of the operations performed, in order to measure
the complexity of the algorithm.
 Order of magnitude (growth rate) analysis: analysis of the
function T(n) to determine the general complexity category to
which it belongs.
 There is no generally accepted set of rules for algorithm
analysis. However, an exact count of operations is
commonly used.