Data Structure and Algorithm
Objectives
• Introduce the basic definitions and principles of data
structures and algorithms.
• Introduce commonly used data structures and their
applications.
• Describe the costs and benefits associated with each data
structure.
– For each data structure, the amount of space and time
required for typical operations.
• Select the best data structure and algorithm for a given
problem.
• Introduce the basic principles of resource requirement
analysis techniques.
• Measure the effectiveness of a data structure and
algorithm.
Introduction
b) Modularity
– Modularity is an important Software Engineering principle.
– It is a practical application of the principle of “Separation of Concerns" by
dividing a complex system into simpler and more manageable modules.
– Each module performs a set of tasks.
– Modules are technically connected to one another.
• The important attributes of modules are cohesion and coupling.
• Design goals require modules to have low-coupling and high cohesion.
• Cohesion
– is a property of modules.
– Cohesion is a measure of the inter-relatedness of elements (statements,
procedures, declarations) within a module.
– A module is said to have high cohesion if all the elements in the module are
strongly connected with one another.
– A cohesive module provides a small number of closely related services
Software Engineering Principles....(Continued)
• Coupling
– is a property of systems.
– The measure of inter-module relation is known as coupling.
– Modules are loosely coupled if the communication between them is
simple.
– “Modules form clusters with few interconnections.”
– Tight coupling of modules makes analysis, understanding,
modification and testing of modules difficult. Reuse of modules is
also hindered.
• The goal is: high cohesion and low coupling.
c) Abstraction
It is sometimes best to concentrate on general aspects of the problem
while carefully removing detailed aspects.
d) Anticipation of Change
• General rule: write all documents and code under the assumption that they will
subsequently be corrected, adapted, or changed by somebody else.
• Generality: Before starting off on a new software development project, a
software engineer should always consider what might be the underlying,
hidden general problem that requires the solution.
• The general problem may not be as complex as the original problem and there
may also be a ready-made solution that is capable of satisfying the
requirements.
A general solution is often:
• Not much harder to write than a special-purpose solution;
• More likely to be re-used; and
• Perhaps a little less efficient.
• Generality needs support from the programming language.
• Current languages do not support generality well; new OO languages may be
better (abstract classes, frameworks, ...).
• Generalization is related to abstraction.
f) Incrementality
• It is easier to make small changes to a working system than to rebuild
the system.
• Why? Because if the modified system does not work, the errors must
have been introduced by the small changes.
g) Rigor and Formality
• Software engineering is a creative design activity, BUT
• It must be practiced systematically
• Rigor is a necessary complement to creativity that increases our
confidence in our developments
• Formality is rigor at the highest degree
– Software process driven and evaluated by mathematical laws
• Rigor: Careful and precise reasoning.
Software Engineering
• Is the use of engineering principles in order to develop good software.
Good software is:
• Correct
– The software performs according to the SRD (Software
Requirements Document).
– The SRD may be too vague (although it should not be) — in this
case, conformance to a specification is needed.
• Reliable
– This is a weaker requirement than “correct”. E-mail is reliable
(messages usually arrive) but probably not correct.
• Robust
– The software behaves well when exercised outside the
requirements.
– For example, software designed for 10 users should not fall apart
with 11 users.
• Performance
– The software should have good space/time utilization, fast response
times.
– And the worst response time should not be too different from the
average response time.
• Friendly
– The software should be easy to use, should not irritate the user,
and should be consistent.
– The screen always mirrors the state.
– One key — one effect. Example: F1 for help.
• Verifiable
– It should be possible to prove (verify) correctness of the software.
– A common term that is not easily defined; it is easier to verify a
compiler than a word-processor.
• Flexibility
– Ability to accommodate new requirements (new situations).
• Maintainable
– Easy to correct or upgrade.
– Code traceable to design; design traceable to requirements.
– Clear simple code.
– Good documentation.
– Simple interfaces between modules.
• Reusable
– We need abstract modules that can be used in many situations.
– Sometimes, we can produce a sequence of products, each using
code from the previous one.
– Example: Accounting systems.
– Object-Oriented techniques aid reuse.
• Portable
– The software should be easy to move to different platforms.
– This implies few Operating System and hardware dependencies.
– Recent developments in platform standards (PCs, UNIX, . . .) have
aided portability.
– Portability and efficiency often conflict.
– Highly portable systems consist of many layers, each layer hiding
local details.
– Recent achievements in portability depend on fast processors and
large memories.
• Interoperable
– The software should be able to cooperate with other software
(word-processors, spread-sheets, graphics packages, . . .).
• Visible
– All steps must be documented.
Abstraction
• Abstraction is a process of classifying characteristics
as relevant and irrelevant for the
particular purpose at hand and ignoring
the irrelevant ones.
Model
• The model defines an abstract view of
the problem.
• The model should focus only on
problem-related properties.
Example: model students of OSU.
• Relevant:
Name
ID
Dept
Age, year;
• Not relevant:
float height, weight;
• Using the model, a programmer tries to define the
properties of the problem.
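As a rough illustration of the student example, the model might be written as a C++ struct that keeps only the relevant attributes. The type name `Student`, the field names, and the helper `inDepartment` are assumptions made for this sketch, not part of the original slides.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of a student: only the attributes the example
// marks as relevant are kept; height and weight are deliberately left out.
struct Student {
    std::string name;
    std::string id;
    std::string dept;
    int age;
    int year;
};

// Illustrative operation on the model: is the student in the given department?
bool inDepartment(const Student& s, const std::string& dept) {
    return s.dept == dept;
}
```

Leaving out the irrelevant attributes keeps the model small and makes it clear which properties the rest of the program may rely on.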
Abstract Data Types
• An abstract data type (ADT) is the realization of a data type as a software
component.
• It is a logical description: a specification that describes a data set and the
operations on that data.
• Think of an ADT as a picture of the data and the operations that manipulate and
change that data.
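One way to picture the split between specification and realization is an abstract class paired with a concrete class. The sketch below uses an integer stack purely as an illustration; the names `IntStack` and `VectorIntStack` are hypothetical and not taken from the slides.

```cpp
#include <cassert>
#include <vector>

// Specification: the operations an integer stack offers, with no
// commitment to how the data is stored (hypothetical ADT for illustration).
class IntStack {
public:
    virtual ~IntStack() = default;
    virtual void push(int value) = 0;
    virtual void pop() = 0;        // precondition: !empty()
    virtual int top() const = 0;   // precondition: !empty()
    virtual bool empty() const = 0;
};

// One possible realization of the specification, backed by std::vector.
class VectorIntStack : public IntStack {
public:
    void push(int value) override { items_.push_back(value); }
    void pop() override { assert(!items_.empty()); items_.pop_back(); }
    int top() const override { assert(!items_.empty()); return items_.back(); }
    bool empty() const override { return items_.empty(); }

private:
    std::vector<int> items_;
};
```

Code written against `IntStack` does not need to change if the vector is later swapped for another representation such as a linked list.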
Selecting a Data Structure
Select a data structure as follows:
1. Analyze the problem to determine the resource constraints
a solution must meet.
2. Determine the basic operations that must be supported.
Quantify the resource constraints for each operation.
3. Select the data structure that best meets these requirements.
Algorithm
• An algorithm is a method or a process followed to solve a
problem
• If the problem is viewed as a function, then an algorithm is
an implementation for the function that transforms an
input to the corresponding output.
• Data structures model the static part of the world. They
are unchanging while the world is changing.
• In order to model the dynamic part of the world we
need to work with algorithms.
• The quality of a data structure is related to its ability to
successfully model the characteristics of the world
(problem).
• Similarly, the quality of an algorithm is related to its
ability to successfully simulate the changes in the world.
• However, the quality of data structures and
algorithms is determined by their ability to work
together well (they are cooperative).
• How do algorithms transform a data structure from one state
to another?
– Take values as input. Example: cin>>age;
– Change the values held by data structures. Example:
age=age+1;
– Change the organization of the data structure.
Example: sort students by name.
– Produce outputs.
Example: display a student's information.
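The four kinds of transformation listed above can be sketched in C++ roughly as follows. The `Student` struct and all function names are assumptions made for illustration; input is read from a generic stream rather than `cin` so the functions are easy to exercise.

```cpp
#include <algorithm>
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record used only for this sketch.
struct Student {
    std::string name;
    int age;
};

// Take values as input: reads "name age" from any input stream.
Student readStudent(std::istream& in) {
    Student s{};
    in >> s.name >> s.age;
    return s;
}

// Change the values held by the data structure.
void haveBirthday(Student& s) { s.age = s.age + 1; }

// Change the organization of the data structure: sort students by name.
void sortByName(std::vector<Student>& students) {
    std::sort(students.begin(), students.end(),
              [](const Student& a, const Student& b) { return a.name < b.name; });
}

// Produce output: format a student's information as text.
std::string describe(const Student& s) {
    std::ostringstream out;
    out << s.name << " (" << s.age << ")";
    return out.str();
}
```

Passing `std::cin` to `readStudent` gives the interactive behaviour of the `cin>>age;` example.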
Properties of Algorithms
Finiteness:
Algorithm must complete after a finite number
of steps.
Algorithm should have a finite number of steps.
Sequential:
Each step must have a uniquely defined preceding
and succeeding step.
The first step (start step) and last step (halt step)
must be clearly noted.
Correctness:
It must compute the correct answer for all possible legal
inputs.
The output should be correct and as required.
Language Independence:
It must not depend on any one programming language.
Feasibility:
Each instruction must be capable of being executed.
In the following examples the output statement can never be executed:
1) for(int i=0; i<0; i++){
       cout<<i;   // never executed: i<0 is false from the start
   }
2) if(5>7) {
       cout<<"hello";   // never executed: 5>7 is always false
   }
Effectiveness:
Doing the right thing. It should yield the correct result
all the time for all of the possible cases.
Efficiency:
It must solve the problem with the least amount of computational
resources, such as time and space.
It should produce the required output within the
given resources (constraints).
Example: Write a program that takes a number and
displays the square of the number.
1) int x;
cin>>x;
cout<<x*x;
2) int x,y;
cin>>x;
y=x*x;
cout<<y;
Example: Write a program that takes two numbers and
displays the sum of the two.
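A minimal sketch of one possible answer follows, written in the style of the square example. The function name `displaySum` is hypothetical, and the streams are passed in as parameters (an assumption of this sketch) so that the same code works with `cin`/`cout` or with string streams.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Reads two integers from `in` and writes their sum to `out`.
// No extra variable is used to hold the result, mirroring variant 1
// of the square example.
void displaySum(std::istream& in, std::ostream& out) {
    int a = 0, b = 0;
    in >> a >> b;
    out << a + b;
}
```

Calling `displaySum(std::cin, std::cout)` gives the interactive program the exercise asks for.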
Precision:
The result should always be the same if the algorithm is
given identical input.
Simplicity:
A good general rule is that each step should carry out one logical
step.
• What is simple to one processor may not be simple to
another.
Chapter Two
Algorithm Complexity Analysis
Algorithm Analysis Rules
Example
int count(){
    int n, k=0;
    cout<<"Enter an integer";
    cin>>n;
    for (int i=0;i<n;i++)
        k=k+1;
    return 0;
}
Time Units to Compute
1 for the assignment statement: int k=0
1 for the output statement: cout<<
1 for the input statement: cin>>
In the for loop:
1 assignment i=0,
n+1 tests i<n,
n increments i++.
n loops of 2 units for an assignment and an addition.
1 for the return statement.
-------------------------------------------------------------------
T(n) = 1+1+1+(1+n+1+n)+2n+1 = 4n+6 = O(n)
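The count above can be checked mechanically. The sketch below replays the same loop and adds up time units using the cost model of this example (one unit per assignment, test, increment, addition, input, output and return). The function name `countSteps` is an assumption made for illustration.

```cpp
#include <cassert>

// Replays count() for a given n and returns the number of time units,
// using the cost model of the example above.
long countSteps(long n) {
    assert(n >= 0);
    long steps = 0;
    long k = 0;
    steps += 1;              // int k = 0
    steps += 1;              // cout << ...
    steps += 1;              // cin >> n
    long i = 0;
    steps += 1;              // i = 0
    while (true) {
        steps += 1;          // test i < n (runs n+1 times in total)
        if (!(i < n)) break;
        k = k + 1;
        steps += 2;          // one addition and one assignment
        i++;
        steps += 1;          // increment
    }
    steps += 1;              // return
    (void)k;                 // k is only needed to mirror the original loop
    return steps;
}
```

For any non-negative n the function returns 4n+6, which matches the hand count.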
Example
int total(int n)
{
int sum=0;
for (int i=1;i<=n;i++)
sum=sum+1;
return sum;
}
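Applying the same cost model as in the previous example to total() gives the following count. This is a worked sketch; the grouping of terms is chosen here for readability.

```latex
T(n) = \underbrace{1}_{\texttt{sum=0}}
     + \underbrace{\bigl(1 + (n+1) + n\bigr)}_{\text{loop control: } \texttt{i=1},\ \texttt{i<=n},\ \texttt{i++}}
     + \underbrace{2n}_{\text{loop body}}
     + \underbrace{1}_{\texttt{return}}
     = 4n + 4 = O(n)
```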
Formal Approach
In the above examples we have seen that the analysis is
complex, because it tries to count each and every detail.
However, it can be simplified by using a formal
approach, in which we can ignore statements such as
• initializations,
• loop control statements, and
• bookkeeping (statements such as declarations, and the number of pushes
and pops in function calls).
Algorithm Analysis Rules
(Formal Approach)
For Loops: Formally
In general, a for loop translates to a summation.
The index and bounds of the summation are the same as
the index and bounds of the for loop.
for (int i = 1; i <= N; i++) {
    sum = sum+i;
}
$\sum_{i=1}^{N} 1 = N$
For Nested Loops:
Nested for loops translate into multiple summations, one
for each for loop.
for (int i = 1; i <= N; i++) {
    for (int j = 1; j <= M; j++) {
        sum = sum+i+j;
    }
}
$\sum_{i=1}^{N} \sum_{j=1}^{M} 2 = \sum_{i=1}^{N} 2M = 2MN$
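The nested-loop summation can be checked by counting the additions directly. In the sketch below the inner statement `sum = sum+i+j` is charged two additions, as in the summation; the function name `nestedAdditions` is hypothetical.

```cpp
#include <cassert>

// Runs the nested loop and returns how many additions the statement
// sum = sum + i + j performs in total (two per execution).
long nestedAdditions(long N, long M) {
    long additions = 0;
    long sum = 0;
    for (long i = 1; i <= N; i++) {
        for (long j = 1; j <= M; j++) {
            sum = sum + i + j;
            additions += 2;
        }
    }
    (void)sum;  // the sum itself is not needed for the count
    return additions;
}
```

The returned value equals 2MN for any non-negative N and M, in agreement with the double summation.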
Worst Case Analysis:
• Assumes the input data are arranged in the most
disadvantageous order for the algorithm.
• Takes the worst possible set of inputs.
• Causes execution of the largest number of statements.
• Computes the upper bound of T(n), where T(n) is the
complexity function.
Examples:
– For sorting algorithms: the list is in the opposite order.
– For searching algorithms: the desired item is located at the
last position or is missing.
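The searching case can be made concrete with a linear search that reports how many comparisons it made. This is a sketch for illustration; the function name `linearSearchComparisons` is an assumption.

```cpp
#include <cassert>
#include <vector>

// Scans data from left to right and returns the number of element
// comparisons made before the search stops.
int linearSearchComparisons(const std::vector<int>& data, int target) {
    int comparisons = 0;
    for (int value : data) {
        ++comparisons;
        if (value == target) {
            break;  // found: stop early
        }
    }
    return comparisons;
}
```

For a list of n items, the count reaches n both when the target is in the last position and when it is missing, which are exactly the worst-case inputs named above.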
• O-notation is used to represent the amount of time
an algorithm takes on the worst possible set of inputs
(the “worst case”).
• Note: Different functions with the same growth rate
may be represented using the same O-notation.
2. Big-Omega (Ω)-Notation (Lower bound)
Example:
Find g(n) such that f(n) = Ω(g(n)) for f(n) = 3n+5.
g(n) = √n, c = 1, k = 1.
f(n) = 3n+5 = Ω(√n)
3. Theta Notation (Θ-Notation) (Optimal bound)
4. Little-oh (small-oh) Notation
• Definition: We say f(n) = o(g(n)) if, for every positive
constant c, there is a positive constant n0 such that
f(n) < c·g(n) for all n ≥ n0.
• As n increases, g(n) grows strictly faster than f(n).
• Describes the worst case analysis.
• Denotes an upper bound that is not asymptotically tight.
• Big-O notation denotes an upper bound that may or may
not be asymptotically tight.
Example:
Find g(n) such that f(n) = o(g(n)) for f(n) = n^2.
One valid choice is g(n) = n^3, since n^2/n^3 = 1/n tends to 0 as n grows.
Rules to estimate the Big-Oh of a given
function
When considering the growth rate of a function using Big-Oh:
• Ignore the lower-order terms and the coefficient of the
highest-order term.
• There is no need to specify the base of the logarithm:
changing the base from one constant to another changes the
value of the logarithm by only a constant factor.
Example:
1. T(n) = 3n + 5 = O(n)
2. T(n) = 3n^2 + 4n + 2 = O(n^2)
Rule 1:
If T1(n) = O(f(n)) and T2(n) = O(g(n)), then
a) T1(n) + T2(n) = max(O(f(n)), O(g(n))),
b) T1(n) * T2(n) = O(f(n) * g(n))
Rule 2:
If T(n) is a polynomial of degree k, then T(n) = Θ(n^k).
Rule 3:
log^k n = O(n) for any constant k. This tells us that
logarithms grow very slowly.
To compare the growth rates of f(n) and g(n), consider the limit of
f(n)/g(n) as n → ∞:
– The limit is 0: This means that f(n) = o(g(n)).
– The limit is c ≠ 0: This means that f(n) = Θ(g(n)).
– The limit is infinity: This means that
g(n) = o(f(n)).
– The limit oscillates: This means that there is no
relation between f(n) and g(n).
Example:
• n^3 grows faster than n^2, so we can say that
n^2 = O(n^3) or n^3 = Ω(n^2).
• f(n) = n^2 and g(n) = 2n^2 grow at the same rate, so
both f(n) = O(g(n)) and f(n) = Ω(g(n)) are true.
• If f(n) = 2n^2, then f(n) = O(n^4), f(n) = O(n^3), and f(n) = O(n^2)
are all correct, but the last option is the best
answer.
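The examples above can be checked by taking the limit of the ratio f(n)/g(n) as n grows. The lines below are a worked sketch of three such limits.

```latex
\lim_{n\to\infty} \frac{n^2}{n^3} = \lim_{n\to\infty} \frac{1}{n} = 0
  \quad\Rightarrow\quad n^2 = o(n^3)

\lim_{n\to\infty} \frac{n^2}{2n^2} = \frac{1}{2} \neq 0
  \quad\Rightarrow\quad n^2 = \Theta(2n^2)

\lim_{n\to\infty} \frac{n^3}{n^2} = \lim_{n\to\infty} n = \infty
  \quad\Rightarrow\quad n^2 = o(n^3)
```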
T(n)                   Complexity category function F(n)   Big-O
c (c is a constant)    1                                   c = O(1)
7n! + 2n + n^2 + 1     n!                                  T(n) = O(n!)
T(n)=2*n=2n=O(n).
2. for(int i=1; i<=n; i++)
       for(int j=1; j<=n; j++)
           k++;
T(n) = 1*n*n = n^2 = O(n^2).
Big-Oh formal definition (O)
A function f(n) is of order (or has complexity)
O(g(n)) if and only if there exist constants n0 > 0
and c > 0 such that f(n) ≤ c·g(n) for all n > n0.
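To show how the definition is applied, the worked sketch below finds witnesses c and n0 for f(n) = 7n^2 + 10n + 3 and g(n) = n^2. The constants chosen here are one valid choice among many.

```latex
\text{For } n \ge 1:\quad 10n \le 10n^2 \ \text{ and } \ 3 \le 3n^2, \text{ so}

7n^2 + 10n + 3 \;\le\; 7n^2 + 10n^2 + 3n^2 \;=\; 20\,n^2 .

\text{Hence } f(n) \le c \cdot g(n) \text{ for all } n > n_0 \text{ with } c = 20,\ n_0 = 1,
\text{ and } 7n^2 + 10n + 3 = O(n^2).
```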
Examples
Big Oh: more examples
n^2/2 – 3n = O(n^2)
1 + 4n = O(n)
7n^2 + 10n + 3 = O(n^2), and therefore also O(n^3)
log_10 n = log_2 n / log_2 10 = O(log_2 n) = O(log n)
sin(n) = O(1)
10 = O(1)
10^10 = O(1)
log n + n = O(n)
log^k n = O(n) for any constant k
n = O(2^n), but 2^n is not O(n)
2^(10n) is not O(2^n)
$\sum_{i=1}^{N} i \le N \cdot N = O(N^2)$
Math Review: logarithmic functions
$x^a = b$ iff $\log_x b = a$
$\log ab = \log a + \log b$
$\log_a b = \dfrac{\log_m b}{\log_m a}$
$\log a^b = b \log a$
$a^{\log n} = n^{\log a}$
Important property
If T1(n) = O(f(n)) and T2(n) = O(g(n)), then
T1(n) + T2(n) = max(O(f(n)), O(g(n))),
T1(n) * T2(n) = O(f(n) * g(n))
Big-Omega
f(n) = Ω(g(n)) if there are positive constants c and n0 such
that
f(n) ≥ c g(n) for n ≥ n0
f(n) grows no slower than g(n) for “large” n
Example
Let f(N) = 2N^2. Then
f(N) = Ω(N)
f(N) = Ω(N^2) (best answer)
Big-Theta
f(n) = Θ(g(n)) iff
f(n) = O(g(n)) and f(n) = Ω(g(n))
The growth rate of f(n) equals the growth rate of g(n)
Example:
Let f(n) = n^2, g(n) = 2n^2
Since f(n) = O(g(n)) and f(n) = Ω(g(n)),
thus f(n) = Θ(g(n)).
Big-Theta means the bound is the tightest possible.
Some rules
If T(n) is a polynomial of degree k, then T(n) = Θ(n^k).
Growth rates …
• Doubling the input size
– f(N) = c ⇒ f(2N) = f(N) = c
– f(N) = log N ⇒ f(2N) = f(N) + log 2
– f(N) = N ⇒ f(2N) = 2 f(N)
– f(N) = N^2 ⇒ f(2N) = 4 f(N)
– f(N) = N^3 ⇒ f(2N) = 8 f(N)
– f(N) = 2^N ⇒ f(2N) = (f(N))^2
• Advantages of algorithm analysis
– Eliminates bad algorithms early
– Pinpoints the bottlenecks, which are worth coding
carefully
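The effect of doubling the input size can be checked numerically by computing the ratio f(2N)/f(N) for a few cost functions. The function names below are hypothetical, and the constant cost of 5 is an arbitrary value chosen for this sketch.

```cpp
#include <cassert>

// Example cost functions (illustrative only).
double constantCost(double) { return 5.0; }
double linearCost(double n) { return n; }
double quadraticCost(double n) { return n * n; }
double cubicCost(double n) { return n * n * n; }

// Returns f(2N) / f(N): how much the cost grows when the input doubles.
double doublingRatio(double (*f)(double), double n) {
    assert(f(n) != 0.0);
    return f(2.0 * n) / f(n);
}
```

The ratio is 1 for constant cost, 2 for linear, 4 for quadratic and 8 for cubic cost, independent of the N chosen.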
Further notes on
• Big-Omega Notation (Ω)
• Theta Notation (Θ)
• Little-o Notation (o)
• Little-Omega Notation (ω)
• Relations among notations
[Table: running times at 10^9 (1 GHz). Surviving entries: 0.001, 0.02 seconds, 2.8 hours, 1 second, 30 seconds, 316 years. Row and column labels are missing.]
Are linear algorithms better or faster than cubic algorithms
for any input size?
It is not sensible to make generalizations:
actual performance depends on the size of the input, the constant
factors, the implementation, and the environment.