Data Structure and Algorithms
• Proof by contradiction
• Summations
– Formulas for sums and products of series
• Recursion analysis
• Logarithms
Definition (Algorithm)
• What exactly is an algorithm?
INSERTION-SORT(A)
for j = 2 to length[A]
    key ← A[j]
    // insert A[j] into the sorted sequence A[1..j-1]
    i ← j - 1
    while i > 0 and A[i] > key
        A[i+1] ← A[i]   // move A[i] one position right
        i ← i - 1
    A[i+1] ← key
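For concreteness, here is a minimal C++ translation of the pseudocode above (a sketch using 0-based indexing; the function and variable names are our own):

#include <cstddef>
#include <vector>

// Sort v into ascending order, in place.
void insertionSort(std::vector<int>& v) {
    for (std::size_t j = 1; j < v.size(); ++j) {
        int key = v[j];          // next element to insert into the sorted prefix v[0..j-1]
        std::size_t i = j;
        while (i > 0 && v[i - 1] > key) {
            v[i] = v[i - 1];     // move the larger element one position right
            --i;
        }
        v[i] = key;
    }
}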
Example Algorithm (2)
• Finding the maximum element problem
– That is, given array A=[31, 41, 26, 41, 58], the maximum
algorithm returns 58 which is the maximum element in the
array
Example Algorithm (2)
• An algorithm for finding the maximum element
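The algorithm itself appears in the slides only as a figure; a minimal C++ sketch of the standard linear scan it likely shows (an assumption, not the slides' exact code):

#include <cstddef>
#include <vector>

// Return the maximum element of a non-empty array with one left-to-right scan.
int findMax(const std::vector<int>& a) {
    int max = a[0];
    for (std::size_t i = 1; i < a.size(); ++i)
        if (a[i] > max)
            max = a[i];   // remember the largest element seen so far
    return max;
}
// For A = [31, 41, 26, 41, 58], findMax returns 58, as in the example above.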
Example Algorithm (3)
• The greatest common divisor problem for two natural numbers
– Problem statement
• Determine the greatest common divisor of two natural numbers a and b (where a < b)
– Input
• Two natural numbers a and b
– Output
• The largest natural number d which is a common divisor of a and b
Example Algorithm (3)
• An algorithm for finding the greatest common divisor of two natural numbers
Example Algorithm (3)
• GCD(33,21)=3
• 33 = 1*21 + 12
• 21 = 1*12 + 9
• 12 = 1*9 + 3
• 9 = 3*3
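The algorithm referred to above is Euclid's; a short C++ sketch that mirrors the trace:

// Euclid's algorithm: repeatedly replace (a, b) by (b, a mod b)
// until the remainder is 0; the last non-zero value is the GCD.
int gcd(int a, int b) {
    while (b != 0) {
        int r = a % b;   // e.g., 33 = 1*21 + 12 gives r = 12
        a = b;
        b = r;
    }
    return a;
}
// gcd(33, 21) returns 3, matching the trace above.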
Why learn about algorithms?
• Three main reasons
1. We must create algorithms so that other people or
computers can help us achieve our goals
Purpose of an Algorithm in Programs
• To accept input
– There are zero or more quantities which are externally
supplied
• To produce an output
– At least one quantity is produced
Essential Properties of Algorithms
• Input specified
– The inputs are the data that will be transformed during the
computation to produce the output
– We must specify
• The type of data
• The amount of data
Essential Properties of Algorithms
• Output specified
– The outputs are the data resulting from the computation (the
intended result)
Essential Properties of Algorithms
• Finiteness (It Must Terminate)
– An algorithm must have the property of finiteness
– That is, every valid algorithm should have a finite number of steps
– It must complete after a finite number of steps
– It must eventually stop either with the right output or with a
statement that no solution is possible
– If you trace out the instructions of an algorithm, then for all
cases the algorithm must terminate after a finite number of
steps
– Finiteness is an issue for computer algorithms because if the algorithm doesn't specify when to stop (computer algorithms often repeat instructions), the computer will continue to repeat the instructions forever
Essential Properties of Algorithms
• Definiteness (Absence of Ambiguity)
– Each step must be clearly defined, having one and only one
interpretation
Essential Properties of Algorithms
• Correctness
– The first step (start step) and last step (halt step) must be
clearly noted
Essential Properties of Algorithms
• Simplicity
Essential Properties of Algorithms
• Language Independence
– Completeness
Models of Computation
A model of computation specifies the primitive
operations a computer is assumed to support.
Analysis of algorithms
• The objective of algorithm analysis is
– to determine how quickly an algorithm executes in
practice
– To measure either time or space requirements of an
algorithm
• What to measure
– Space utilization- amount of memory required
– Time efficiency- amount of time required to process the
data
Analysis of algorithms
• What to analyze
– The most important resource to analyze is generally the running time
– Several factors affect the running time of a program
• Compiler used
• Computer used
• The algorithm used
• The input to the algorithm
– The first two are beyond the scope of a theoretical model
– Although they are important, we will not deal with them in this course
– The last two are the main factors that we deal with
– Typically, the size of the input is an important consideration
Analysis of algorithms
• Space utilization and time efficiency depend on many factors
– Size of input
– Speed of machine (computer used)
– Quality of source code (used algorithm)
– Quality of compiler
Running Time of Algorithm
• The running time of an algorithm depends on a number of
factors
• But, for most algorithms, the running time depends on the input
– An already sorted sequence is easier to sort
Running Time of Algorithm
• Generally, we seek an upper bound on the running time, because everybody likes a guarantee
• The running time can be estimated in two ways
– Empirically
– Theoretically
Empirical Analysis
• What you should do
– Run the program with the data sets (the inputs) of varying
size and composition
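A minimal C++ sketch of such an experiment, timing runs of increasing input size with the standard <chrono> clock (the input generator and sizes are illustrative choices, not the slides' own):

#include <chrono>
#include <cstddef>
#include <iostream>
#include <vector>

int main() {
    for (std::size_t n = 1000; n <= 100000; n *= 10) {
        std::vector<int> data(n);
        for (std::size_t i = 0; i < n; ++i)
            data[i] = static_cast<int>(n - i);   // reverse-sorted input, one possible composition

        auto start = std::chrono::steady_clock::now();
        // ... run the algorithm under test on `data` here ...
        auto stop = std::chrono::steady_clock::now();

        auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(stop - start);
        std::cout << "n = " << n << ": " << ms.count() << " ms\n";
    }
}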
Empirical Analysis
• The results might look like this:
[Figure: measured running time in ms (0 to 7000) plotted against input size (0 to 100); the measured times grow with input size]
Empirical Analysis
• Is an experimental study
Limitations of Empirical Analysis
• It can’t be used in estimating the efficiency of algorithms
because the result varies due to variations in
– Processor speed
– Input size
– SW environment
Limitations of Empirical Analysis
• It is difficult to know which input instances to test it on
Exercise
• To be included
Theoretical Analysis
• Is in contrast to the “experimental approach”
Theoretical Analysis
• Takes an algorithm and produces a function T(n), expressed in terms of the number of operations performed on an input of size n
Theoretical Analysis
• Rather, it uses the number of operations (expressed in time units), because the number of operations does not vary with
– Processor speed
– Current processor load
– SW environment
Theoretical Analysis
• What will be the number of operations for the algorithm below using a computer
– Running at 100 MHz or at 750 MHz
– Using the DOS OS
– Using the UNIX OS
– While printing or browsing
Theoretical Analysis
• Is an approach that can be used to measure or estimate the complexity of an algorithm, since the result does not vary with the underlying computer system
How do we measure the complexity of algorithms?
• Two steps or phases for this
– Analysis of the algorithm (counting operations)
– Order of magnitude
How do we measure the complexity of algorithms?
• Step one – analysis of the algorithm
Calculating complexity of an algorithm
• Step one - analysis of algorithm
– Primitive operations
• Identifiable in pseudocode
Primitive or Basic Operations
• Step one - analysis of algorithm
– Examples of basic operations could be
• An assignment (Assigning a value to a variable)
• Calling a method
• Returning from a method
• Evaluating an expression
– An arithmetic operation between two variables (e.g., performing an arithmetic operation such as addition)
• A comparison between two variables (e.g., two numbers)
• Indexing into an array
Running time- T(n)
• We measure execution (running) time in terms of any of the
following
– Arithmetic operations
– Assignment operations
– Loop iterations
– Comparisons
– Procedure calls (calling a method)
– Returning from a method
Examples
• An example of computing the running time T(n) is sketched below
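A hedged example of the counting (our C++ fragment, not from the slides), summing an array a of size n:

int sum = 0;                    // 1 assignment
for (int i = 0; i < n; i++)     // 1 assignment, n+1 comparisons, n increments
    sum += a[i];                // n additions and n assignments
// Total: T(n) = 1 + 1 + (n+1) + n + 2n = 4n + 3 operations, which is O(n)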
Order of Magnitude
• Refers to the rate at which the storage or time grows as a function
of problem size
Functions whose complexity is known through
research and experiment
• Constant function
– T(n) ∈ O(1): Great. This means your algorithm takes only constant time. You can't beat this
• log log n
– T(n) ∈ O(log log n): Super fast! For all intents this is as fast as constant time
• Logarithmic time
– T(n) ∈ O(log n): Very good. This is called logarithmic time. This is what we look for in most data structures. Note that log 1,000 ≈ 10 and log 1,000,000 ≈ 20 (logs base 2)
• Polylogarithmic time
– T(n) ∈ O((log n)^k) (where k is a constant): This is called polylogarithmic time. Not bad, when simple logarithmic time is not achievable
Functions whose complexity is known through
research and experiment
• Linear time
– T(n) ∈ O(n): This is called linear time. It is about the best that one can hope for if your algorithm has to look at all the data
– In data structures, the game is usually to avoid this where possible
• n log n
– T(n) ∈ O(n log n): This one is famous, because this is the time needed to sort a list of numbers. It arises in a number of other problems as well
• Quadratic time
– T(n) ∈ O(n^2): Okay if n is in the thousands, but rough when n gets into the millions
Functions whose complexity is known through
research and experiment
• Polynomial time
– T(n) ∈ O(n^k) (where k is a constant): This is called polynomial time. Practical if k is not too large
• Exponential time
– T(n) ∈ O(2^n), T(n) ∈ O(n^n), T(n) ∈ O(n!): algorithms taking this much time are only practical for the smallest values of n (e.g., n ≤ 10 or maybe n ≤ 20)
Order of Magnitude
• This type of analysis is called asymptotic analysis
• When we look at input sizes large enough to make only the order
of growth of the running time relevant, we are studying
asymptotic efficiency of algorithms
More examples on computing T(n) and O(T(n))
• Since we are giving the answer in terms of Big-Oh, there are lots of shortcuts that can be taken without affecting the final answer
– In real life, not all operations take exactly the same time
Analysis Rules
3. If-then-else statement
– The running time is never more than the running time of the test plus the larger of the running times of the two branches
Analysis Rules
• Example – Running time of an if-then-else statement

if (x > 5)                     // constant-time test
    for (i = 1; i <= n; i++)   // executed n times
        cout << i;             // constant time
else
    cout << "hello";           // constant time
Analysis Rules
5. Running time of a Loop statement
The running time of a loop is, at most, the running
time of the statements inside the loop (including
tests) multiplied by the number of iterations.
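A minimal C++ instance of this rule (our sketch):

// Running time of a single loop: the constant-time body executes n times,
// so T(n) = c*n, which is O(n).
int sumFirstN(int n) {
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += i;          // constant-time body, executed n times
    return sum;
}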
Analysis Rules
6. Running time of Nested Loop
• Analyze inside out
• The running time of a statement inside a group of nested
loops is the running time of the statement multiplied by the
product of the sizes of all the loops
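For example, a small C++ sketch of the rule with two nested loops of size n (our illustration):

// Running time of nested loops: the innermost statement is constant time
// and executes n * n times, so T(n) is O(n^2).
int countPairs(int n) {
    int count = 0;
    for (int i = 0; i < n; i++)        // outer loop: n iterations
        for (int j = 0; j < n; j++)    // inner loop: n iterations each
            count++;                   // executed n * n times in total
    return count;
}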
Algorithm Analysis Categories
• An algorithm may run faster on certain data sets than
others
Best Case Analysis
• The best case asks which input of size n is the cheapest among all inputs of size n
• The best case is obtained from the input data set that results in the best possible performance
Best Case Analysis
• Input
– Assumes the input data are found in the most advantageous
order for the algorithm
– For sorting, the most advantageous order is that data are
arranged in the required order
– For searching, the best case is that the required item is found
at the first position
– Question
• Which order is the most advantageous for an algorithm
that sorts 2 5 8 1 4 (ascending order)
• Input size
– Assumes that the input size is the minimum possible
Best Case Analysis
• This case causes the fewest number of operations to be executed
• So, it computes the lower bound of T(n); you cannot do better
Worst Case Analysis
• Is the complexity of an algorithm based on the worst input of each size
Worst Case Analysis
• Input (i.e., arrangement of data)
– Assumes that the data are arranged in the most
disadvantageous order
– For sorting, the worst case assumes that the data are arranged in the opposite of the required order
– For searching, the worst case assumes that the required item is found at the last position or is missing
• Input size
– Assumes the input size to be infinite
– That is, it assumes that the input size is a very large number
– That is, it considers n → infinity
Worst Case Analysis
• This case causes the highest number of operations to be executed by the algorithm
Worst Case Analysis
• Worst case is easier to analyze and can yield useful information
Average Case Analysis
• Is the complexity of an algorithm averaged over all inputs of
each size
• Computes the expected bound of T(n) (the one observed most of the time)
Average Case Analysis
• Input (i.e., arrangement of data)
• Input size
– The input size can be any random number as small as the best
case or as large as the worst case
Average Case Analysis
• Problems with performing an average case analysis
Average Case Analysis
• As a result, for several of the algorithms, we limit our analysis to determining the best and worst counts
– These are easier to analyze
Graphical view of best, average and worst cases
• The running time of an algorithm varies with the input and typically grows with the input size
[Figure: running time versus input size (1000 to 4000), with separate curves for the best case, average case, and worst case; the worst-case curve grows fastest]
Which Analysis to Use?
• Question
– How can we determine the complexity of an algorithm on any
data set?
– When is the worst case time important?
Why use a worst case analysis
2. For some algorithms, the worst case occurs fairly often
– That is, the worst case "may" be a typical case
Insertion Sort Algorithm
Input: a sequence of n numbers <a1, a2, …, an>
Output: a permutation (reordering) <a1', a2', …, an'> such that a1' ≤ a2' ≤ … ≤ an'
• Idea (similar to sorting a hand of playing cards)
– Each time, take one card and insert it into its correct position among the already sorted cards
Asymptotic Notations
• Goal: to simplify analysis by getting rid of unneeded
information (like “rounding” 1,000,001≈1,000,000)
Big-Oh Notation
• Big-Oh notation is a way of comparing algorithms
Big-Oh
• Formal definition of Big-Oh
Let T, f : ℕ → ℝ be functions. We say T(n) is O(f(n)) (pronounced "T(n) is big-oh of f(n)"), written T(n) ∈ O(f(n)) or T(n) = O(f(n)), if there exist n0 ∈ ℕ and a constant c > 0 in ℝ such that for all integers n ≥ n0, T(n) ≤ c·f(n)
• That is, for a given function f(n) we denote by O(f(n)) the set of functions
O(f(n)) = { T : ℕ → ℝ : there exist n0 ∈ ℕ and c > 0 in ℝ such that for all n ≥ n0 we have 0 ≤ T(n) ≤ c·f(n) }
Big-Oh Notation
• O(f(n)) is best thought of as a set of functions
• Any function T(n) ∈ O(f(n)) cannot exceed c·f(n), for some constant c, when n is sufficiently large, that is, for n ≥ n0
Big-Oh
• This definition says that eventually there is some point n0 past which c·f(n) is always at least as large as T(n), so that if the constant factors are ignored, f(n) is at least as big as T(n)
T(n) = O(f(n))
Big-Oh Notation
• For example
– If T(n) = 1000n and f(n) = n^2, then
• n0 = 1,000 and c = 1 work
• We could also use n0 = 10 and c = 100
– Thus we say that 1000n = O(n^2)
Big-Oh Example 1
• Consider the function f(n) = 8n + 128, shown in the figure
Big-Oh Example 1
• E.g., suppose we choose c = 1. Then we need an n0 such that 8n + 128 ≤ n^2 for all n ≥ n0
• Since n^2 − 8n − 128 = (n − 16)(n + 8), we have 8n + 128 ≤ n^2 for all n ≥ 16
• So, for c = 1 and n0 = 16, f(n) ≤ c·n^2 for all integers n ≥ n0, and hence f(n) = O(n^2)
Big-Oh Example 1
• Of course, there are many other values of c and n0 that will do
– For example, c = 136 and n0 = 1 also work, since 8n ≤ 8n^2 and 128 ≤ 128n^2 for all n ≥ 1
Big-Oh Example 2
• For the functions f(n) = 2n + 6 and g(n) = n (shown to the right), there are positive constants c and n0 such that f(n) ≤ c·g(n) for n ≥ n0 (e.g., c·g(n) = 4n works, since 2n + 6 ≤ 4n for n ≥ 3)
• Conclusion: 2n + 6 is O(n)
[Figure: plots of f(n) = 2n + 6, c·g(n) = 4n, and g(n) = n against n]
Some Facts to Use for Big-Oh Problems
• 1 ≤ n for all n ≥ 1
• n ≤ n^2 for all n ≥ 1
• 2^n ≤ n! for all n ≥ 4
• Exercise
– If f(n) = 10n + 5 and g(n) = n, then show that f(n) is O(g(n))
– If f(n) = 3n^2 + 4n + 1, then show that f(n) = O(n^2)
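One way to work the exercises, sketched using the facts above:
– f(n) = 10n + 5 ≤ 10n + 5n = 15n for all n ≥ 1, so with c = 15 and n0 = 1 we have f(n) ≤ c·g(n); hence f(n) is O(n)
– f(n) = 3n^2 + 4n + 1 ≤ 3n^2 + 4n^2 + n^2 = 8n^2 for all n ≥ 1 (using n ≤ n^2 and 1 ≤ n^2), so c = 8 and n0 = 1 show f(n) = O(n^2)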
Big-Oh Notation
• Alternate definition
T(n) ∈ O(f(n)) if lim_{n→∞} T(n)/f(n) is a constant (possibly zero)
Remarks on the definition of Big-Oh notation
• Certain conventions have evolved concerning how big-oh expressions are normally written
• The definition itself does not say anything about how good or tight a bound is
• Example
– n = O(n^2), n = O(n^2.5), n = O(n^3), n = O(2^n)
• Therefore
– When we report a Big-Oh bound for T(n), we normally give the tightest upper bound we can, that is, the best that we can do
• The definition does not put any restriction on the values of c and n0, and gives little advice in situations where there are many candidates
• In fact, there are infinitely many pairs of c's and n0's that can be given for the same pair of functions T(n) and f(n)
• Exercise
– For T(n) = 2n^2 + 3n + 1 = O(n^2), where f(n) = n^2, find candidate values for c and n0 such that T(n) ≤ c·f(n) for all n ≥ n0
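A sketch of one candidate pair: since 3n ≤ 3n^2 and 1 ≤ n^2 for all n ≥ 1, T(n) = 2n^2 + 3n + 1 ≤ 2n^2 + 3n^2 + n^2 = 6n^2, so c = 6 and n0 = 1 work (many other pairs do too).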
Properties of the O notation
• If k is a constant, then k is O(1)
• Constant factors may be ignored
– For all k > 0, k·f is O(f)
• Higher powers grow faster
– n^r is O(n^s) if 0 ≤ r ≤ s
• Fastest growing term dominates a sum
– If f is O(g), then f + g is O(g)
– e.g., an^4 + bn^3 is O(n^4)
• A polynomial's growth rate is determined by its leading term
– If f is a polynomial of degree d, then f is O(n^d)
– That is, a polynomial is O(the term containing the highest power)
Properties of the O notation
• The relation f is O(g) is transitive
– If f is O(g) and g is O(h), then f is O(h)
• Product of upper bounds is an upper bound for the product
– If f is O(g) and h is O(r), then fh is O(gr)
• Exponential functions grow faster than powers
– n^k is O(b^n) for all b > 1 and k ≥ 0
– e.g., n^20 is O(1.05^n)
• Logarithms grow more slowly than powers
– log_b n is O(n^k) for all b > 1 and k > 0
– e.g., log_2 n is O(n^0.5)
Properties of the O notation
• All logarithms grow at the same rate
– log_b n is O(log_d n) for all b, d > 1
• The sum of the first n r-th powers grows as the (r+1)-th power
– Σ_{k=1}^{n} k^r is Θ(n^(r+1))
– e.g., Σ_{k=1}^{n} k = n(n+1)/2 is Θ(n^2)
Properties of the O notation
• If f1(n) = O(g1(n)) and f2(n) = O(g2(n)), then
f1(n) + f2(n) = O(max(g1(n), g2(n)))
Big-Omega
• Definition (Big-Omega)
Consider a function f(n) which is non-negative for all integers n ≥ 0. We say that "f(n) is Omega of g(n)", which we write f(n) = Ω(g(n)), if there exist an integer n0 and a constant c > 0 in ℝ such that for all integers n ≥ n0, f(n) ≥ c·g(n)
Big-Omega – Example 1
• Consider the function f(n) = 5n^2 − 64n + 256
• As with big-oh, it does not matter what the particular constants are, as long as they exist!
Big-Omega – Example 1
• E.g., suppose we choose c = 1. Then f(n) − c·n^2 = 4n^2 − 64n + 256 = 4(n − 8)^2 ≥ 0 for every integer n
• So, we have that for c = 1 and n0 = 0, f(n) ≥ c·n^2 for all integers n ≥ n0
More examples on Big-Omega
• One more example is sketched below
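One added example (ours, not from the slides): show n^3 = Ω(n^2). With c = 1 and n0 = 1, n^3 ≥ 1·n^2 for all n ≥ 1, so n^3 = Ω(n^2). Consistently, lim_{n→∞} n^3/n^2 = ∞, which the alternative definition below allows (a constant or ∞, but not zero).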
Big-Omega
• Alternative definition
f(n) ∈ Ω(g(n)) if lim_{n→∞} f(n)/g(n) is either a constant or ∞ (but not zero)
Big-Omega
• The definition of omega is almost identical to that of big-oh
Asymptotic Lower Bound - Omega
• Transpose symmetry
– f(n) = Ω(g(n)) if and only if g(n) = O(f(n))
Theta Notation (Θ)
• The Theta notation is used when the function f can be bounded both from above and below by the same function g
• Definition (Theta)
Consider a function f(n) which is non-negative for all integers n ≥ 0. We say that "f(n) is Theta of g(n)", which we write f(n) = θ(g(n)), if there exist positive constants c1, c2, and n0 such that c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0
Theta Notation (Θ) – Examples
• An example follows below
Theta Notation (Θ)
• Alternative definition
f(n) ∈ θ(g(n)) if lim_{n→∞} f(n)/g(n) is a non-zero constant (not zero and not ∞)
• Example
– Show that f(n) = (n^3 + 3n^2 + 2n)/6 ∈ θ(n^3)
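A worked sketch of the limit: lim_{n→∞} f(n)/n^3 = lim_{n→∞} (n^3 + 3n^2 + 2n)/(6n^3) = lim_{n→∞} (1/6 + 1/(2n) + 1/(3n^2)) = 1/6, a non-zero constant, so f(n) ∈ θ(n^3).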
Theta Notation (Θ)
• For a given function g(n), θ(g(n)) denotes a set of functions
Theta Notation (Θ)
• The figure below gives an intuitive picture of functions f(n) and g(n), where f(n) = θ(g(n))
[Figure: f(n) sandwiched between c1·g(n) and c2·g(n) for all n ≥ n0]
• The Theta notation asymptotically bounds a function from above and below
• In other words, for all n ≥ n0, the function f(n) is equal to g(n) to within a constant factor
Theta Notation (Θ)
• Therefore, we assume that every function used with θ-notation is asymptotically nonnegative
Theta Notation (Θ)
• Thus g is both a lower bound and an upper bound (up to a constant factor c) on the values of f for all suitably large n
• That is, another way to view the θ-notation is that f(n) = θ(g(n)) means f(n) is both Ω(g(n)) and Ο(g(n))
Properties of Theta
• Reflexive
– f(n) = θ(f(n))
• Transitive
– If T(n) = θ(f(n)) and f(n) = θ(g(n)), then T(n) = θ(g(n))
• Symmetry
– T(n) = θ(f(n)) if and only if f(n) = θ(T(n))
• If T(n) = O(f(n)) and T(n) ≠ θ(f(n)), then T(n) ≠ Ω(f(n))
Little-oh
• Is also called small-oh notation
• Example
– The bound 2n^2 = Ο(n^2) is asymptotically tight
– The bound 2n = Ο(n^2) is not asymptotically tight
– Ο(n^3) is not a tight upper bound for the expression (1/2)n^2 + (1/6)n, whereas Ο(n^2) is
– Therefore, (1/2)n^2 + (1/6)n = ο(n^3) but not ο(n^2)
Little-oh
• We use the ο-notation to denote an upper bound that is not
asymptotically tight
• That is, it represents an upper bound like big-Oh, but the upper bound is not tight
Little-oh
• Definition (Little-oh)
T(n) is said to be in little-oh of f(n), written as T(n) = o(f(n)), if for every constant c > 0 there exists an n0 such that T(n) < c·f(n) for all n ≥ n0
• Little-oh
– describes a non-tight upper bound of T(n)
– shows the worst case of an algorithm
Little-oh
• Example
– Consider the function T(n) = n + 1
– Clearly, T(n) = O(n^2)
– Clearly, T(n) ≠ Ω(n^2), since no matter what c > 0 we choose, c·n^2 > n + 1 for large enough n
– Thus, we may write T(n) = o(n^2)
Little-oh
• If T(n) = o(f(n)), then applying the idea of limits, as n increases indefinitely T(n) becomes insignificant compared to f(n), i.e.,
lim_{n→∞} T(n)/f(n) = 0
• Example
– Show 4n^3 + 3n^2 + 5 = o(n^4 − 3n^3 − 5n − 4)
– This is equivalent to n^4 − 3n^3 − 5n − 4 = ω(4n^3 + 3n^2 + 5)
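A worked sketch of the limit: dividing numerator and denominator by n^4 gives lim_{n→∞} (4n^3 + 3n^2 + 5)/(n^4 − 3n^3 − 5n − 4) = lim_{n→∞} (4/n + 3/n^2 + 5/n^4)/(1 − 3/n − 5/n^3 − 4/n^4) = 0/1 = 0, so the little-oh claim holds.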
Little-oh
• Little-oh is not reflexive
– T(n) ≠ o(T(n))
Little-oh (Small-oh)
• Little-oh is not symmetric
– T(n) = o(f(n)) does not imply f(n) = o(T(n))
Little-oh
• Little-oh is transitive
– If T(n) = o(f(n)) and f(n) = o(g(n)), then T(n) = o(g(n))
Little omega
• We use little-ω to denote a lower bound that is not asymptotically tight
Little omega
• Definition (Little-omega)
f(n) is said to be in little-omega of g(n), written f(n) = ω(g(n)), if for every constant c > 0 there exists an n0 such that f(n) > c·g(n) for all n ≥ n0
Little omega
• So, if f(n) = ω(g(n)), then as n → infinity the magnitude of g(n) becomes insignificant compared to that of f(n)
• Therefore,
lim_{n→∞} g(n)/f(n) = 0 ⇒ g(n) = o(f(n))
Comparison of Ω, Θ, and Ο
• Ο(g): functions that grow no faster than g
Clarification
• When talking about big-O, Ω, θ, little-o, and ω, we do not talk about specific functions