CSC 301 - Design and Analysis of Algorithms: Lecture# 01: Introduction
Algorithm Definition
• An algorithm is a step-by-step procedure for solving a particular
problem in a finite amount of time.
Input X → ALGORITHM (some mysterious processing) → Output Y = F(X), where F: X → Y
Algorithm -- Examples
• Repairing a lamp
• A cooking recipe
• Calling a friend on the phone
• The rules of how to play a game
• Directions for driving from A to B
• A car repair manual
• Human Brain Project
• Internet & Communication Links (Graph)
• Matrix Multiplication
Algorithm vs. Program
• A computer program is an instance, or concrete representation,
of an algorithm in some programming language
[Figure: High Level Language Program]
Solving Problems (1)
When faced with a problem:
1. First clearly define the problem
3. Select the one that seems the best under the prevailing
circumstances
Solving Problems (2)
• It is quite common to first solve a problem for a particular case
• Then for another
• And, possibly another
• And watch for patterns and trends that emerge
• And to use the knowledge from these patterns and trends in
coming up with a general solution
• And this general solution is called an “Algorithm”
One Problem, Many Algorithms
Problem
• The statement of the problem specifies, in general terms, the
desired input/output relationship.
Algorithm
• The algorithm describes a specific computational procedure for
achieving input/output relationship.
Example
• Sorting a sequence of numbers into non-decreasing order.
Algorithms
• Various algorithms, e.g. merge sort, quick sort, heap sort, etc.
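The idea that one problem admits many algorithms can be sketched in Python. This is a minimal illustration, not the course's reference code: merge sort is named on the slide, while insertion sort and the sample instance are assumptions chosen for brevity. Both procedures achieve the same input/output relationship.

```python
def insertion_sort(seq):
    """Return a new list in non-decreasing order (quadratic time in the worst case)."""
    out = []
    for x in seq:
        i = len(out)
        # Walk left past every element larger than x.
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out


def merge_sort(seq):
    """Return a new list in non-decreasing order (n log n time)."""
    if len(seq) <= 1:
        return list(seq)
    mid = len(seq) // 2
    left, right = merge_sort(seq[:mid]), merge_sort(seq[mid:])
    merged, i, j = [], 0, 0
    # Repeatedly take the smaller front element of the two sorted halves.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]


instance = [31, 41, 59, 26, 41, 58]  # one instance of the sorting problem (assumed data)
print(insertion_sort(instance) == merge_sort(instance))
```

The two functions differ in procedure and cost, yet any correct sorting algorithm must return the same output for the same instance.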
Problem Instances
• An input sequence is called an instance of a Problem
Properties of Algorithms
• It must be composed of an ordered sequence of precise steps.
• It must be correct.
• It must terminate.
Syntax & Semantics
An algorithm is “correct” if its:
• Semantics are correct
• Syntax is correct
Semantics:
• The concept embedded in an algorithm (the soul!)
Syntax:
• The actual representation of an algorithm (the body!)
WARNINGS:
1. An algorithm can be syntactically correct, yet semantically incorrect: a dangerous situation! Example: “Colorless green ideas sleep furiously!”
2. Syntactic correctness is easier to check as compared to semantic correctness
Algorithm Summary
• Problem Statement
• Relationship b/w input and output
• Algorithm
• Procedure to achieve the relationship
• Definition
• A sequence of steps that transform the input to output
• Instance
• The input needed to compute solution
• Correct Algorithm
• for every input it halts with correct output
Brief History
• The study of algorithms began with mathematicians and was a
significant area of work in the early years. The goal of those early
studies was to find a single, general algorithm that could solve all
problems of a single type.
Why Algorithms are Useful?
• Once we find an algorithm for solving a problem, we do not need
to re-discover it the next time we are faced with that problem
Why Write an Algorithm Down?
• For your own use in the future, so that you don’t have to spend
time rethinking it
Designing of Algorithms
• Selecting the basic approaches to the solution of the problem
• Choosing data structures
• Putting the pieces of the puzzle together
• Expressing and implementing the algorithm
• clearness, conciseness, effectiveness, etc.
Important Designing Techniques
• Brute Force–Straightforward, naive approach–Mostly expensive
Analysis of Algorithms
• Many criteria affect the running time of an algorithm, including
• speed of CPU, bus and peripheral hardware
• design time, programming time and debugging time
• language used and coding efficiency of the programmer
• quality of input (good, bad or average)
• But
• Programs derived from two algorithms for solving the same
problem should both be
• Machine independent
• Language independent
• Amenable to mathematical study
• Realistic
Analysis of Algorithms
• The following three cases are investigated in algorithm analysis:
• C) Best Case: The best outcome for any possible input
• provides lower bound of resources
Analysis of Algorithms
• An algorithm may perform very differently on different example
instances, e.g. a bubble sort algorithm might be presented with data:
• already in order
• in random order
• in the exact reverse order of what is required
• Platform dependent
• Execution time differs on different architectures
• Data dependent
• Execution time is sensitive to the amount and type of data manipulated
• Language dependent
• Execution time differs for the same code written in different languages
∴ an absolute measure for an algorithm is not appropriate
Theoretical Analysis
• Data independent
• Takes into account all possible inputs
• Platform independent
• Language independent
• Implementation independent
• not dependent on skill of programmer
• can save time of programming an inefficient solution
But Computers are So Fast These Days??
• Do we need to bother with algorithmics and complexity
any more?
• computers are fast, compared to even 10 years ago...
Importance of Analyzing Algorithms
• Need to recognize limitations of various algorithms for solving a
problem
• We have to weigh the trade-offs between an algorithm’s time
requirement and memory requirements.
What do we analyze about Algorithms?
• Algorithms are analyzed to understand their behavior and to
improve them if possible
• Correctness
• Does the input/output relation match algorithm requirement?
• Amount of work done
• Basic operations to do task
• Amount of space used
• Memory used
• Simplicity, clarity
• Verification and implementation.
• Optimality
• Is it impossible to do better?
Problem Solving Process
• Problem
• Strategy
• Algorithm
• Input
• Output
• Steps
• Analysis
• Correctness
• Time & Space Optimality
• Implementation
• Verification
Computation Model for Analysis
• To analyze an algorithm is to determine the amount of
resources necessary to execute it. These resources include
computational time, memory and communication bandwidth.
Pseudocode
• High-level description of an algorithm
• More structured than English prose but less detailed than a
program
• Preferred notation for describing algorithms
• Hides program design issues
ArrayMax(A, n)
Input: Array A of n integers
Output: maximum element of A
currentMax ← A[0]
for i = 1 to n-1 do
    if A[i] > currentMax then
        currentMax ← A[i]
return currentMax
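For comparison, the ArrayMax pseudocode can be transcribed into Python almost line for line. This is a sketch assuming a non-empty list; the name `array_max` and the sample data are illustrative only.

```python
def array_max(a):
    """Return the maximum element of a non-empty list, scanning left to right."""
    current_max = a[0]
    for i in range(1, len(a)):  # i = 1 .. n-1, as in the pseudocode
        if a[i] > current_max:
            current_max = a[i]
    return current_max


print(array_max([3, 9, 2, 7]))
```

The pseudocode hides details such as how the array is stored or what happens on empty input; the program has to commit to them.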
Pseudocode
• Indentation indicates block structure, e.g. the body of a loop
• Looping constructs while and for, and the conditional if-then-else
• The symbol // indicates that the remainder of the line is a comment.
• Arithmetic & logical expressions: (+, -, *, /) and (and, or, not)
• Assignment & swap statements: a ← b, a ← b ← c, a ↔ b
• Return/Exit/End: termination of an algorithm or block
Pseudocode
• Local variables mostly used unless global variable explicitly defined
• If A is a structure then |A| is size of structure. If A is an Array then
n =length[A], upper bound of array. All Array elements are
accessed by name followed by index in square brackets A[i].
• Parameters are passed to a procedure by value
• Semicolons are used for multiple short statements written on one line
Elementary Operations
• An elementary operation is an operation which takes constant time
regardless of problem size.
• The running time of an algorithm on a particular input is determined by
the number of “Elementary Operations” executed.
• Theoretical analysis on paper from a description of an algorithm
Instruction and Sequence
• A linear sequence of elementary operations is also performed in
constant time.
• More generally, given two program fragments P1 and P2 which run
sequentially in times t1 and t2
• use the maximum rule which states that the larger time dominates
• complexity will be max(t1,t2)
Sequences
• Analysing a group of consecutive statements
• The statement taking the maximum time will be the one counted
• use the maximum rule
Block #1 takes time t1
Block #2 takes time t2
T(n) = max(t1, t2)
for i = 1 to n do
P(i);
T(n) = nt
for i = 0 to n do
for j = 0 to m do
P(j);
• Assume that P(j) takes time t, where t is independent of i and j
• Start with outer loop:
• How many iterations? n
• How much time per iteration? Need to evaluate inner loop
$T(n) = n + \sum_{i=0}^{n} r_i + t \sum_{i=0}^{n} r_i$
$= n + n(n+1)/2 + t\,n(n+1)/2$
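The closed form n(n+1)/2 can be checked by counting. The sketch below assumes the inner loop runs r_i = i times on outer iteration i, which is the reading under which the sum of r_i for i = 0..n equals n(n+1)/2; the function name is hypothetical.

```python
def count_inner_calls(n):
    """Count calls to P when the inner loop runs r_i = i times for i = 0..n."""
    calls = 0
    for i in range(n + 1):   # outer loop: i = 0 .. n
        for _ in range(i):   # assumed: r_i = i inner iterations
            calls += 1       # stands in for one call to P
    return calls


n = 10
print(count_inner_calls(n), n * (n + 1) // 2)
```

Multiplying this count by the per-call time t gives the t·n(n+1)/2 term of T(n).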
Analysis Example
Algorithm                              Number of times executed
1. n = read input from user            1
2. sum = 0                             1
3. i = 0                               1
4. while i < n                         n
5.     number = read input from user   n or $\sum_{i=0}^{n-1} 1$
6.     sum = sum + number              n or $\sum_{i=0}^{n-1} 1$
7.     i = i + 1                       n or $\sum_{i=0}^{n-1} 1$
8. mean = sum / n                      1
The computing time for this algorithm in terms of input size n is:
T(n) = 1 + 1 + 1 + n + n + n + n + 1
T(n) = 4n + 4
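The count T(n) = 4n + 4 can be reproduced by instrumenting the algorithm. This is a sketch under two assumptions: the input is passed in as a non-empty list instead of being read from the user, and the loop test is counted n times (once per successful test), following the slide's convention.

```python
def mean_with_count(numbers):
    """Return (mean, steps), counting steps with the slide's convention."""
    steps = 0
    n = len(numbers); steps += 1            # step 1: read n
    total = 0; steps += 1                   # step 2: sum = 0
    i = 0; steps += 1                       # step 3: i = 0
    while i < n:
        steps += 1                          # step 4: counted once per successful test
        number = numbers[i]; steps += 1     # step 5: read the next number
        total += number; steps += 1         # step 6: sum = sum + number
        i += 1; steps += 1                  # step 7: i = i + 1
    mean = total / n; steps += 1            # step 8: mean = sum / n
    return mean, steps


print(mean_with_count([2, 4, 6]))
```

Counting the final failing loop test as well would give 4n + 5; the growth rate is linear either way.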
Another Analysis Example
i=1 ............................. 1
while (i < n) ................... n-1
    a=2+g ....................... n-1
    i=i+1 ....................... n-1
if (i<=n) ....................... 1
    a=2 ......................... 1
else
    a=3 ......................... 1
T(n) = 1 + 3(n-1) + 1 + 1 = 3n
Another Analysis Example
i=1 ............................. 1
while (i<=10) ................... 10
    i=i+1 ....................... 10
i=1 ............................. 1
while (i<=n) .................... n
    a=2+g ....................... n
    i=i+1 ....................... n
if (i<=n) ....................... 1
    a=2
else
    a=3 ......................... 1 (only one branch executes)
T(n) = 1 + 10 + 10 + 1 + 3n + 1 + 1 = 3n + 24
Asymptotic Growth Rate
• Changing the hardware/software environment
• Affects T(n) by constant factor, but does not alter the growth rate of T(n)
[Figure: Time (steps) against Input (size) for 0.05 N^2 = O(N^2) and 3N = O(N); the two curves cross at N = 60]
Important Functions
These functions often appear in algorithm analysis:
A comparison of Growth-Rate Functions
Size does Matter:
What happens if we increase the input size N?
A comparison of Growth-Rate Functions
[Figures: plots of f(n) = n, log(n), n log(n), n^2, n^3 and 2^n for n = 1 to 20, with the vertical axis limit raised from 250 to 500, 1000 and 5000, followed by a plot on logarithmic axes for n from 1 to 65536]
Performance Classification
f(n)       Classification
1          Constant: run time is fixed, and does not depend upon n. Most instructions are executed once, or only a few times, regardless of the amount of information being processed.
log n      Logarithmic: when n increases, so does run time, but much slower. Common in programs which solve large problems by transforming them into smaller problems.
n          Linear: run time varies directly with n. Typically, a small amount of processing is done on each element.
n log n    When n doubles, run time slightly more than doubles. Common in programs which break a problem down into smaller sub-problems, solve them independently, then combine solutions.
n^2        Quadratic: when n doubles, run time increases fourfold. Practical only for small problems; typically the program processes all pairs of input (e.g. in a double nested loop).
2^n        Exponential: when n doubles, run time squares. This is often the result of a natural, “brute force” solution.
Running Time vs. Time Complexity
• Running time is how long it takes a program to run.
Analysis of Results
f(n) = a n^2 + b n + c
where a = 0.0001724, b = 0.0004 and c = 0.1
n       f(n)     a n^2    % of n^2
125     2.8      2.7      94.7
250     11.0     10.8     98.2
500     43.4     43.1     99.3
1000    172.9    172.4    99.7
2000    690.5    689.6    99.9
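The table can be regenerated from the stated coefficients, which shows how quickly the quadratic term dominates. This is a small sketch; the helper names `f` and `quadratic_share` are illustrative.

```python
A, B, C = 0.0001724, 0.0004, 0.1  # coefficients a, b, c from the slide


def f(n):
    """Total cost f(n) = a n^2 + b n + c."""
    return A * n ** 2 + B * n + C


def quadratic_share(n):
    """Percentage of f(n) contributed by the a n^2 term."""
    return 100 * (A * n ** 2) / f(n)


for n in (125, 250, 500, 1000, 2000):
    print(n, round(f(n), 1), round(A * n ** 2, 1), round(quadratic_share(n), 1))
```

As n grows, the lower-order terms b n + c contribute a vanishing share, which is why only the leading term matters for the growth rate.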
Model of Computation
Drawbacks:
• poor assumption that each basic operation takes constant time
• Adding, Multiplying, Comparing etc.
Comparisons: 2n + 2
Time complexity is O(n).
Prerequisite Review :
Mathematics
Data Structures
Notations
• Floor ⌊x⌋ and Ceiling ⌈x⌉: ⌊3.4⌋ = 3 and ⌈3.4⌉ = 4
• open interval (a, b) is {x | a < x < b}
• $\log_x 1 = 0$
Summation Algebra
$\sum_{i=1}^{N} x_i$
The sum of numbers from 1 to N; e.g. 1 + 2 + 3 + … + N
$\sum_{i=1}^{N} x_i^2$
Suppose our list has 5 numbers, and they are 1, 3, 2, 5, 6.
Then the resulting summation will be 1^2 + 3^2 + 2^2 + 5^2 + 6^2 = 75
The First Constant Rule: $\sum_{i=x}^{y} a = (y - x + 1)a$, or $\sum_{i=1}^{N} a = Na$
The Second Constant Rule: $\sum_{i=1}^{N} a x_i = a \sum_{i=1}^{N} x_i$
The Distributive Rule: $\sum_{i=1}^{N} (x_i + y_i) = \sum_{i=1}^{N} x_i + \sum_{i=1}^{N} y_i$
Double Summation: $\sum_{i=1}^{3} \sum_{j=1}^{3} x_{ij} = x_{11} + x_{12} + x_{13} + x_{21} + x_{22} + x_{23} + x_{31} + x_{32} + x_{33}$
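The summation rules are easy to check numerically. In the sketch below, `xs` is the list from the slide, while `ys` and the constant `a` are hypothetical values chosen only for illustration.

```python
xs = [1, 3, 2, 5, 6]   # the list from the slide
ys = [2, 0, 1, 4, 3]   # hypothetical second list for the distributive rule
a = 7                  # hypothetical constant

# Sum of squares of the list elements.
sum_of_squares = sum(x ** 2 for x in xs)

# First constant rule: summing a constant N times gives N * a.
first_rule = sum(a for _ in xs) == len(xs) * a

# Second constant rule: a constant factor moves outside the sum.
second_rule = sum(a * x for x in xs) == a * sum(xs)

# Distributive rule: the sum of (x_i + y_i) splits into two sums.
distributive = sum(x + y for x, y in zip(xs, ys)) == sum(xs) + sum(ys)

print(sum_of_squares, first_rule, second_rule, distributive)
```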
Summation Algebra: Practice Questions
$\sum_{i=0}^{6} 2 = 14$
$\sum_{i=1}^{N} 6 x_i y = 6y \sum_{i=1}^{N} x_i$
$\sum_{i=0}^{2} \sum_{j=0}^{3} (2i + 3j) = 78$
Arithmetic Series/Sequence
• A sequence in which the difference between one term and the
next is a constant. (add a fixed value each time ... on to infinity)
• Generalized form: {a, a+d, a+2d, a+3d, ... }
• where a is the first term, and d is the common difference between
consecutive terms
Series Examples
Permutation and Combination
Permutation
A permutation of a set of n elements is an arrangement of the elements in a given order
e.g. the permutations of elements a, b, c are abc, acb, bac, bca, cab, cba
- n! permutations exist for a set of n elements
5! = 120 permutations for 5 elements
Combination
A combination of a set of n elements is a selection of the elements in which order does not matter
e.g. the only combination of all the elements a, b, c is abc
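Python's standard library can enumerate both kinds of arrangement, which makes the difference concrete. This is a minimal sketch using `itertools`; the variable names are illustrative.

```python
from itertools import combinations, permutations
from math import factorial

elements = "abc"

# Every ordering counts as a distinct permutation.
perms = ["".join(p) for p in permutations(elements)]

# Order is ignored, so choosing all three elements gives a single combination.
combs = ["".join(c) for c in combinations(elements, 3)]

print(perms, len(perms) == factorial(len(elements)))
print(combs)
print(factorial(5))
```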
Sample Space
A set “S” consisting of all possible outcomes that can result from a
random experiment (real or conceptual), can be defined as the
sample space for that experiment.
• Each possible outcome is called a sample point in that space.
Example: The sample space for tossing two coins at once (or tossing
a coin twice) will contain four possible outcomes and is denoted by
S = {HH, HT, TH, TT}.
In this example, clearly, S is the Cartesian product A × A, where A =
{H, T}.
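The sample space for two coin tosses can be generated as a Cartesian product. The sketch below also evaluates one illustrative event, "at least one head", which is an assumption added for the example and not taken from the slides.

```python
from fractions import Fraction
from itertools import product

A = ["H", "T"]

# S = A x A: every ordered pair of outcomes from two tosses.
S = {"".join(outcome) for outcome in product(A, repeat=2)}

# Hypothetical event for illustration: at least one head appears.
event = {s for s in S if "H" in s}
p_event = Fraction(len(event), len(S))

print(sorted(S), p_event)
```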
Events
Any subset of a sample space S of a random experiment, is called an event. In
other words, an event is an individual outcome or any number of outcomes
(sample points) of a random experiment.
Simple event is an event that contains exactly one sample point.
Compound event is an event that contains more than one sample point, and
is produced by the union of simple events.
Example: When we toss a coin, we get either a head or a tail, but not
both at the same time. The two events head and tail are therefore
mutually exclusive.
Probability Theory
If a random experiment can produce n mutually exclusive and
equally likely outcomes, and if m out of these outcomes are
considered favorable to the occurrence of a certain event A, then
the probability of the event A, denoted by P(A), is defined as the
ratio m/n.
$P(A) = \frac{m}{n} = \frac{\text{Number of favourable outcomes}}{\text{Total number of possible outcomes}}$
$P(A) = \frac{n(A)}{n(S)} = \frac{7}{8}.$
Probability Theory
Example: Four items are taken at random from a box of 12 items and
inspected. The box is rejected if more than 1 item is found to be faulty. If
there are 3 faulty items in the box, find the probability that the box is
accepted.
Solution: The sample space for this experiment is the number of possible
combinations for selecting 4 out of 12 items from the box.
$\binom{12}{4} = \frac{n!}{(n-k)!\,k!} = \frac{12!}{8!\,4!} = 495$
The box contains 3 faulty and 9 good items. The box is accepted if there is
(i) no faulty item, or (ii) one faulty item in the sample of 4 items selected.
Let A denote the event that the number of faulty items chosen is 0 or 1. Then:
$n(A) = \binom{3}{0}\binom{9}{4} + \binom{3}{1}\binom{9}{3} = 126 + 252 = 378$ sample points.
$P(A) = \frac{m}{n} = \frac{378}{495} \approx 0.76$
Hence the probability that the box is accepted is 76%.
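The arithmetic in this example can be reproduced with `math.comb`. This is a short sketch; the variable names are illustrative.

```python
from math import comb

total = comb(12, 4)                    # ways to choose 4 items out of 12

# Accepted samples: 0 faulty and 4 good, or 1 faulty and 3 good.
favourable = comb(3, 0) * comb(9, 4) + comb(3, 1) * comb(9, 3)

p_accept = favourable / total
print(total, favourable, round(p_accept, 2))
```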
Data Structures Review
• Data structure is the logical or mathematical model of a
particular organization of data.
• Data structures let the input and output be represented in a
way that can be handled efficiently and effectively.
• Data may be organized in different ways.
• Array
• Linked list
• Stack
• Queue
• Graph/Tree
Arrays
• The linear search algorithm solves this problem by comparing ‘x’, one by
one, with each element in A. That is, we compare ‘x’ with A[1], then
A[2], and so on, until we find the location of ‘x’.
Worst case: ‘x’ is located in the last location of the array or is not there
at all.
T(n) = 1 + n + n + 1 + 1
= 2n + 3
= O(n)
Average case
Average Case: Assume that it is equally likely for ‘x’ to appear at
any position in array A. Accordingly, the number of comparisons
can be any of the numbers 1, 2, 3, ..., n, and each number occurs
with probability p = 1/n.
The average number of comparisons is therefore
1·(1/n) + 2·(1/n) + … + n·(1/n) = (n + 1)/2.
This agrees with our intuitive feeling that the average number of
comparisons needed to find the location of ‘x’ is approximately
equal to half the number of elements in the list A.
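A linear search that also reports its comparison count makes the worst and average cases measurable. This is a sketch with hypothetical data; note that Python lists are indexed from 0, whereas the slides index from 1.

```python
def linear_search(a, x):
    """Return (index, comparisons); index is -1 when x is not in a."""
    comparisons = 0
    for i, item in enumerate(a):
        comparisons += 1          # one comparison of x with a[i]
        if item == x:
            return i, comparisons
    return -1, comparisons


a = [7, 3, 9, 4, 8]               # hypothetical array with distinct elements

# Average over the case where x is equally likely to be at each position.
average = sum(linear_search(a, x)[1] for x in a) / len(a)
print(average, (len(a) + 1) / 2)
print(linear_search(a, 42))       # worst case: x is absent, n comparisons
```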
Linked List
[Figure: Head → A → B → C]
Stack
• LIFO
• Implemented using linked-list or arrays
Queue
• FIFO
• Implemented using linked-list or arrays
[Figures: an expression tree; a graph whose vertices are the cities Tokyo, Seattle, Seoul, L.A. and Sydney]
Graphs
• A graph (network) G = (V, E) is a data structure containing a set of
vertices (points/nodes) V, and a set of edges E, where an edge in E
represents a connection between a pair of vertices in V.
V (G)= { A, B, C, D}
E (G)= { (A, B), (A, C), (B, C), (B, D) }
• Examples
• Web pages with links
• Methods in a program that call each other
• Road maps (e.g., Google maps)
• Airline routes
• Facebook friends
• Course pre-requisites
• Family trees
• Paths through a maze
Definitions
1. Vertex set = {v1, v2, v3, v4, v5, v6}
2. Edge set = {e1, e2, e3, e4, e5, e6, e7}
3. e1, e2, and e3 are incident on v1
4. v2 and v3 are adjacent to v1
5. e2, e3 and e4 are adjacent to e1
6. v5 and v6 are adjacent to themselves
7. v4 is an isolated vertex
8. e6 and e7 are loops
9. e2 and e3 are parallel
[Figure: graph on vertices v1 to v6 with edges e1 to e7]
Definitions: Path
A path of length k is a sequence v0, v1, …, vk of vertices such
that (vi, vi+1) for i = 0, 1, …, k – 1 is an edge of G.
Vertex a is reachable from b if a path exists from b to a.
Length of a path is the number of edges in the path.
b, c, d is not a path
a, e, b is a closed path
Simple path (no repeated vertices):
a, e, k, p, l, q
m, h, d, c, g
Non-simple path:
a, b, e, f, g, b, g, l
[Figure: graph on vertices a to h and j to q, drawn in four rows: a b c d / e f g h / j k l m / n o p q]
Definitions: Cycle
A cycle is a path that starts and ends at the same vertex.
A simple cycle has no repeated vertices.
An acyclic graph does not contain any cycles
Example: the cycle k, j, n, k, p, o, k is not simple.
[Figure: graph on vertices a to h and j to q, drawn in four rows]
Definitions: Subgraph
A subgraph H of G
• is a graph;
• its edges and vertices are subsets of those of G.
V(H) = {b, d, e, f, g, h, l, p, q}
E(H) = {(b, e), (b, g), (e, f), (d, h), (l, p), (l, q)}
[Figure: graph on vertices a to h and j to q, drawn in four rows, with subgraph H highlighted]
Directed Graphs
• A graph is directed (a digraph) if it consists of a finite set of vertices V
and a set A of ordered pairs of distinct vertices called arcs or directed
edges. An edge (u, v) is written u → v.
• An undirected edge is treated as two directed edges in opposite
directions
• Each arc begins (source) and ends (destination) at a vertex.
• Outdeg(v) is the number of arcs beginning at v
• Indeg(v) is the number of arcs ending at v.
• Theorem: The sum of the outdegrees of the vertices of a digraph G
equals the sum of the indegrees of the vertices, which equals the
number of edges in G.
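The degree theorem can be checked on any small digraph: each arc contributes exactly one to some outdegree and one to some indegree. The arc list below is hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical digraph: each pair (u, v) is an arc from source u to destination v.
arcs = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("D", "A")]

outdeg = Counter(u for u, _ in arcs)   # arcs beginning at each vertex
indeg = Counter(v for _, v in arcs)    # arcs ending at each vertex

print(dict(outdeg), dict(indeg))
print(sum(outdeg.values()) == sum(indeg.values()) == len(arcs))
```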
More Definitions
• A simple graph is a graph:
• that is undirected
• that contains no parallel edges
• that contains no loop of length one
• A bipartite graph has its vertices partitioned into two subsets M and N
such that each edge of G connects a vertex of M to a vertex of N.
• In a connected graph every vertex is reachable from any other.
Representations of Graphs
Two techniques to represent graphs:
• Adjacency matrix
• Adjacency List
Adjacency matrix: $a_{ij} = 1$ if $(i, j) \in E$, and $a_{ij} = 0$ otherwise.
Adjacency list: Given a graph G(V, E), an adjacency list represents the graph by an array
Adj of |V| lists. For each u ∈ V, the adjacency list Adj[u] consists of
all the vertices adjacent to vertex u.
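Both representations can be built from the same edge list. The sketch below uses the vertex and edge sets given earlier for the example graph, V = {A, B, C, D} and E = {(A, B), (A, C), (B, C), (B, D)}, and assumes the graph is undirected; the variable names are illustrative.

```python
V = ["A", "B", "C", "D"]
E = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D")]

index = {v: i for i, v in enumerate(V)}        # vertex name -> row/column number

matrix = [[0] * len(V) for _ in V]             # |V| x |V| adjacency matrix
adj = {v: [] for v in V}                       # one adjacency list per vertex

for u, v in E:
    # Assumed undirected: record the edge in both directions.
    matrix[index[u]][index[v]] = 1
    matrix[index[v]][index[u]] = 1
    adj[u].append(v)
    adj[v].append(u)

for row in matrix:
    print(row)
print(adj)
```

The matrix uses |V|^2 entries regardless of how many edges exist, while the lists store only the edges that are present.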
Adjacency Matrix Example
Graph on vertices 1 to 5 (symmetric matrix, so the graph is undirected):
    1 2 3 4 5
1   0 1 0 0 1
2   1 0 1 1 1
3   0 1 0 1 0
4   0 1 1 0 1
5   1 1 0 1 0
Graph on vertices 1 to 6 (non-symmetric matrix, so the graph is directed):
    1 2 3 4 5 6
1   0 1 0 1 0 0
2   0 0 0 0 1 0
3   0 0 0 0 1 1
4   0 1 0 0 0 0
5   0 0 0 1 0 0
6   0 0 0 0 0 1
Adjacency List Example
Adj[1]: 2, 3, 5
Adj[2]: 1, 3
Adj[3]: 1, 2, 4, 5
Adj[4]: 3, 5
Adj[5]: 1, 3, 4
Tree: Node Types
[Figure: tree with root A; children of the root B, F, K; third row C, H, D, L, X; fourth row Q, G, M, I, N, P. One sub-tree is highlighted, and nodes are labelled Degree 0, Degree 1, Degree 2 and Degree 3]
Tree: Node Level (Distance from the Root)
Level 0: A
Level 1: B, F, K
Level 2: C, H, D, L, X
Level 3: Q, G, M, I, N, P
Height/Depth of the Tree = Maximum Level + 1
Special Trees: Binary Tree
[Figure: example binary trees, including one whose nodes are numbered level by level up to 15]
Special Trees: Complete Binary Tree
• A complete binary tree is a binary tree in which every level,
except possibly the last, is completely filled, and all nodes are
as far left as possible.
[Figure: complete binary tree with nodes numbered 1 to 12, level by level]
• Heap is a complete binary tree
• Height of the tree = ⌊lg n⌋
• No. of leaves = ⌈n/2⌉
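Both properties can be checked with the usual array numbering of a complete binary tree, where node i has children 2i and 2i + 1. This is a sketch assuming nodes are numbered from 1 and that height is counted in edges from the root to the deepest node; the function names are hypothetical.

```python
def height(n):
    """Height in edges of a complete binary tree with n >= 1 nodes: floor(lg n)."""
    return n.bit_length() - 1


def count_leaves(n):
    """Count nodes i in 1..n that have no left child, i.e. 2*i > n."""
    return sum(1 for i in range(1, n + 1) if 2 * i > n)


n = 12  # the 12-node tree numbered 1 to 12
print(height(n), count_leaves(n), (n + 1) // 2)
```

The integer expression `(n + 1) // 2` is the ceiling of n/2.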
Approaches to Algorithms Analysis
(Nested Loops)
i) Top-Down Approach
Example: Top-down vs. Bottom-up
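for i ← 1 to n do
    for j ← 1 to m do
        a ← b
Step-1: Bottom while loop (lines 5 and 6)
$\text{while}(j) = \sum_{k=0}^{j} 1 = j + 1$
$\text{for}(i) = \sum_{j=1}^{2i} \text{while}(j) = \sum_{j=1}^{2i} (j + 1) = \sum_{j=1}^{2i} j + \sum_{j=1}^{2i} 1$
$= \frac{2i(2i+1)}{2} + 2i = 2i^2 + 3i$
Summing over the outer loop (Quadratic Series):
$\sum_{i=1}^{n} (2i^2 + 3i) = 2\left[\frac{2n^3 + 3n^2 + n}{6}\right] + 3\left[\frac{n(n+1)}{2}\right]$
$= O(n^3)$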