Data Structure and Algorithm-1
Efficiency
Efficiency is measured in terms of:
• Running time
• Space used
Both depend on:
• Size of input
• Data structure
• Hardware environment (clock speed, processor, memory, disk speed)
• Software environment (OS, language, interpreted or compiled)
If the hardware and software environments remain the same, the running time depends
on the size of the input and on the data structure used.
1. for j ← 2 to length(A)
2.     key ← A[ j ]
3.     // Insert A[ j ] into the sorted sequence A[1 .. j-1]
4.     i ← j - 1
5.     while i > 0 and A[ i ] > key
6.         A[ i+1 ] ← A[ i ]
7.         i ← i - 1
8.     A[ i+1 ] ← key
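The pseudocode above can be sketched as a runnable function. This is a minimal Python version; since Python lists are 0-indexed, the outer loop starts at index 1 and the while-loop condition becomes i >= 0:

```python
def insertion_sort(a):
    """Sort the list a in place and return it (0-indexed insertion sort)."""
    for j in range(1, len(a)):
        key = a[j]
        # a[0 .. j-1] is already sorted; shift larger elements one cell right
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return a
```

For example, insertion_sort([5, 2, 4, 6, 1, 3]) returns [1, 2, 3, 4, 5, 6].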
Pseudo Code Constructs
The programming-language constructs we use in pseudo-code are taken from high-level
languages such as C and C++. They are as follows:
• Expression: ← is used for assignment and = is used for comparison.
• Method declaration: callx(param1, param2) is a method named "callx" with two
input parameters, param1 and param2.
Programming constructs
• Decision structure: if <condition> then <true part> else <false part>. Indentation is
used to show the scope.
• While-loop: while <condition> do <actions>. Here also, indentation is used to show
the scope.
• For-loop: for <variable initialization, condition checking and increment> do
<action>. Indentation is used to show the scope of the for-loop.
• Repeat-loop: repeat <action> until <condition>.
• Array indexing: A[i] represents the i-th cell in the array A.
Methods
• Method call: object.method(<arguments>), where method belongs to object.
• Method return: the value the method passes back to its caller.
Random Access Machine (RAM) Model
In the RAM model, each primitive operation (assignment, comparison, arithmetic
operation, array access, method call and return) is assumed to take a constant
amount of time. By counting primitive operations we can thus have an idea of the
running time of the code.
Best-case, Average-case and worst-case analysis
Let us do a cost analysis of Insertion sort.
                                                        cost   times
for j ← 2 to n                                          c1     n
    key ← A[ j ]                                        c2     n-1
    // Insert A[ j ] into the sorted sequence A[1..j-1]  0     n-1
    i ← j - 1                                           c3     n-1
    while i > 0 and A[ i ] > key                        c4     Σ_{j=2..n} t_j
        A[ i+1 ] ← A[ i ]                               c5     Σ_{j=2..n} (t_j - 1)
        i ← i - 1                                       c6     Σ_{j=2..n} (t_j - 1)
    A[ i+1 ] ← key                                      c7     n-1
Here t_j is the number of times the while-loop test is executed for a given j.
• Best case for insertion sort: the numbers are already sorted, so t_j = 1 for
every j. Running time = f(n), a linear function of n.
• Worst case: the numbers are in reverse sorted order, so t_j = j. Running
time = f(n²), a quadratic function of n.
• Average case: on average t_j ≈ j/2. Running time is still f(n²), quadratic in n.
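The t_j counts above can be checked empirically. The sketch below (the function name is ours, not from the text) instruments the while-loop of insertion sort to count how many element shifts occur for a sorted input versus a reverse-sorted one:

```python
def insertion_sort_shifts(a):
    """Sort a in place and return the number of while-loop element shifts."""
    shifts = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]  # one shift corresponds to one extra while-test
            i -= 1
            shifts += 1
        a[i + 1] = key
    return shifts

n = 100
best = insertion_sort_shifts(list(range(n)))          # already sorted
worst = insertion_sort_shifts(list(range(n, 0, -1)))  # reverse sorted
```

For a sorted input the while-body never runs (best == 0), while the reverse-sorted input performs n(n-1)/2 = 4950 shifts, matching the linear versus quadratic analysis above.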
Time vs Input Graph
Best-case, Average-case and worst-case analysis
• Best, worst and average cases of a given algorithm express what
the resource usage is at least, at most and on average,
respectively. Usually the resource being considered is running
time, but it could also be memory or other resources.
• In real-time computing, the worst-case execution time is often of
particular concern since it is important to know how much time
might be needed in the worst case to guarantee that the algorithm
will always finish on time.
• Average-case and worst-case performance are the most widely used in
algorithm analysis. Best-case performance is less widely used.
Probabilistic analysis techniques, especially expected value, are used
to determine average-case running time.
• Average case is often as bad as the worst case.
• Finding average case can be very difficult.
Analyzing Recursive Algorithm
[Figure: Running Time versus Input size, showing the curves f(n) and c*g(n);
c*f(n) is an upper bound for T(n).]
Asymptotic Notation
Big O notation is a huge simplification; can we justify it?
– It only makes sense for large problem sizes
– For sufficiently large problem sizes, the highest-order term
swamps all the rest!
Consider R = x² + 3x + 5 as x varies:

x = 0:        x² = 0          3x = 0      5 = 5    R = 5
x = 10:       x² = 100        3x = 30     5 = 5    R = 135
x = 100:      x² = 10,000     3x = 300    5 = 5    R = 10,305
x = 1,000:    x² = 1,000,000  3x = 3,000  5 = 5    R = 1,003,005
x = 10,000:                                        R = 100,030,005
x = 100,000:                                       R = 10,000,300,005
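The dominance of the highest-order term can also be seen by computing the ratio R/x², which approaches 1 as x grows. A small sketch:

```python
def R(x):
    """The example polynomial R = x^2 + 3x + 5."""
    return x**2 + 3 * x + 5

# As x grows, the lower-order terms 3x + 5 become negligible next to x^2
for x in (10, 100, 1000, 10_000, 100_000):
    print(x, R(x), R(x) / x**2)
```

At x = 100,000 the ratio is already 1.0000300005, i.e. within 0.003% of 1.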
Some properties of Big Oh notation
• The fastest-growing function dominates a sum: O(f(n) + g(n)) is O(max{f(n), g(n)}).
• The product of upper bounds is an upper bound for the product: if f is O(g) and
h is O(r), then fh is O(gr).
• "f is O(g)" is transitive: if f is O(g) and g is O(h), then f is O(h).
• If d is O(f), then ad is O(f) for any constant a > 0.
• If d is O(f) and e is O(g), then d + e is O(f + g).
• If f(n) = a_0 + a_1 n + … + a_d n^d, then f(n) is O(n^d).
• log(n^x) is O(log n) for any fixed x > 0.
• Hierarchy of functions: O(1), O(log n), O(n^{1/2}), O(n log n), O(n²), O(2^n), O(n!).
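The hierarchy can be illustrated numerically: evaluating each function in the list at a moderate n (here n = 20) already produces a strictly increasing sequence. A quick sketch:

```python
import math

n = 20
# The document's hierarchy, evaluated at n: 1, log n, n^(1/2), n log n, n^2, 2^n, n!
values = [
    1,
    math.log2(n),
    math.sqrt(n),
    n * math.log2(n),
    n**2,
    2**n,
    math.factorial(n),
]
# Each function is dominated by the next one in the hierarchy
assert all(a < b for a, b in zip(values, values[1:]))
```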
An Example of solving Big O
Values of common functions (logarithms are base 2):

        n:   1    2    4    8    16   32
        1:   1    1    1    1    1    1
    log n:   0    1    2    3    4    5
        n:   1    2    4    8    16   32
  n log n:   0    2    8    24   64   160
       n²:   1    4    16   64   256  1024
The difference between the Contrapositive method and the Contradiction method is
subtle. Let's examine how the two methods work when trying to prove "If P, Then Q".
• Method of Contradiction: Assume P and Not Q and prove some sort of contradiction.
• Method of Contrapositive: Assume Not Q and prove Not P.
The method of Contrapositive has the advantage that your goal is clear: Prove Not P. In
the method of Contradiction, your goal is to prove a contradiction, but it is not always
clear what the contradiction is going to be at the start.
Simple Justification Techniques
Induction
If q(1) is true, and whenever q(n) is true for an integer n we can prove that
q(n+1) is also true, then q(n) is true for all positive integers n.
For any positive integer n, 1 + 2 + ... + n = n(n+1)/2.
Proof. (Proof by Mathematical Induction) Let P(n) be the statement "1 + 2 + ... + n
= n(n+1)/2". (The idea is that P(n) should be an assertion that for any n is verifiably
either true or false.) The proof will now proceed in two steps: the initial step and the
inductive step.
Initial Step. We must verify that P(1) is True. P(1) asserts "1 = 1(2)/2", which is
clearly true. So we are done with the initial step.
Inductive Step. Here we must prove the following assertion: "If there is a k such that
P(k) is true, then (for this same k) P(k+1) is true." Thus, we assume there is a k such
that 1 + 2 + ... + k = k (k+1)/2. (We call this the inductive assumption.) We must
prove, for this same k, the formula 1 + 2 + ... + k + (k+1) = (k+1)(k+2)/2.
This is not too hard: 1 + 2 + ... + k + (k+1) = k(k+1)/2 + (k+1) = (k(k+1) + 2 (k+1))/2 =
(k+1)(k+2)/2. The first equality is a consequence of the inductive assumption.
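The closed form proved above can be spot-checked mechanically. Checking finitely many cases is not a proof, but it is a useful sanity check on the algebra:

```python
def sum_formula_ok(n):
    """True if 1 + 2 + ... + n equals n(n+1)/2 for this n."""
    return sum(range(1, n + 1)) == n * (n + 1) // 2

# Verify the formula for the first thousand positive integers
assert all(sum_formula_ok(n) for n in range(1, 1001))
```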
Loop Invariant
A loop invariant is a condition that is true immediately before and after each
iteration of a loop. To prove an algorithm correct using a loop invariant we show
three things: initialization (the invariant holds before the first iteration),
maintenance (if it holds before an iteration, it still holds before the next one),
and termination (when the loop ends, the invariant yields the desired property).
For the for-loop of insertion sort, the invariant is: at the start of each
iteration, A[1 .. j-1] consists of the elements originally in A[1 .. j-1], but in
sorted order.
Basic Probability
Independence: two events A and B are independent if P(A ∩ B) = P(A)·P(B).
The expected value of a random variable is the weighted average of all possible
values that this random variable can take on. The weights used in computing this
average correspond to the probabilities in the case of a discrete random variable,
or to densities in the case of a continuous random variable.
Suppose a random variable X can take value x1 with probability p1, value x2 with
probability p2, and so on, and, lastly, value xk with probability pk. Then the
expectation of this random variable X is defined as
E(X) = x1·p1 + x2·p2 + … + xk·pk
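As a small example of the definition, the expected value of a fair six-sided die, where each face has probability 1/6, can be computed exactly with rational arithmetic:

```python
from fractions import Fraction

faces = [1, 2, 3, 4, 5, 6]
p = Fraction(1, 6)  # fair die: every face has the same probability

# E(X) = x1*p1 + x2*p2 + ... + xk*pk
expectation = sum(x * p for x in faces)
print(expectation)  # 7/2, i.e. 3.5
```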