Unit 1-1
Introduction
Outline
Facts
problem
algorithm
Points to remember
What is an algorithm?
Recipe, process, method, technique, procedure, routine,…
with the following requirements:
1. Finiteness
terminates after a finite number of steps
2. Definiteness
rigorously and unambiguously specified
3. Clearly specified input
valid inputs are clearly specified
4. Clearly specified/expected output
can be proved to produce the correct output given a valid input
5. Effectiveness
steps are sufficiently simple and basic
Why study algorithms?
Theoretical importance
Practical importance
Basic Issues Related to Algorithms
How to design algorithms
Proving correctness
Optimality
Analysis of Algorithms
Euclid’s Algorithm
Two descriptions of Euclid’s algorithm
while n ≠ 0 do
    r ← m mod n
    m ← n
    n ← r
return m
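The loop above translates almost line for line into C++. The following is a minimal sketch; the function name `gcd_euclid` and the choice of unsigned 64-bit integers are assumptions made for illustration, not part of the slides.

```cpp
#include <cstdint>

// Greatest common divisor by Euclid's algorithm:
// repeatedly replace (m, n) with (n, m mod n) until n becomes 0.
// Assumes m and n are not both zero.
std::uint64_t gcd_euclid(std::uint64_t m, std::uint64_t n) {
    while (n != 0) {
        std::uint64_t r = m % n;  // r <- m mod n
        m = n;                    // m <- n
        n = r;                    // n <- r
    }
    return m;
}
```

For example, `gcd_euclid(60, 24)` goes through the pairs (60, 24), (24, 12), (12, 0) and returns 12.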
Other methods for computing gcd(m,n)
Middle-school procedure
Step 1 Find the prime factorization of m
Step 2 Find the prime factorization of n
Step 3 Find all the common prime factors
Step 4 Compute the product of all the common prime factors
and return it as gcd(m,n)
Is this an algorithm?
Input: Integer n ≥ 2
Output: List of primes less than or equal to n
for p ← 2 to n do A[p] ← p
for p ← 2 to n do
    if A[p] ≠ 0 //p hasn’t been previously eliminated from the list
        j ← p * p
        while j ≤ n do
            A[j] ← 0 //mark element as eliminated
            j ← j + p
Example: 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Time complexity: O(n log log n)
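The sieve pseudocode above can be sketched in C++ as follows. The function name `sieve` is an assumption, and two details differ from the pseudocode: the outer loop stops once p * p exceeds n (larger p would eliminate nothing), and a final pass collects the surviving entries into the output list.

```cpp
#include <vector>

// Returns all primes <= n using the sieve of Eratosthenes. Assumes n >= 2.
std::vector<int> sieve(int n) {
    std::vector<int> a(n + 1, 0);
    for (int p = 2; p <= n; ++p) a[p] = p;

    for (long long p = 2; p * p <= n; ++p) {
        if (a[p] != 0) {  // p has not been eliminated, so it is prime
            for (long long j = p * p; j <= n; j += p) {
                a[j] = 0;  // mark multiple of p as eliminated
            }
        }
    }

    std::vector<int> primes;
    for (int p = 2; p <= n; ++p) {
        if (a[p] != 0) primes.push_back(a[p]);
    }
    return primes;
}
```

Calling `sieve(25)` yields 2, 3, 5, 7, 11, 13, 17, 19, 23.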
Example n= 25
Two main issues related to algorithms
Fundamentals of Algorithmic Problem Solving
Analysis of algorithms
Important problem types
sorting
searching
string processing
graph problems
combinatorial problems
geometric problems
numerical problems
Sorting (I)
Rearrange the items of a given list in ascending order.
• Input: A sequence of n numbers <a1, a2, …, an>
• Output: A reordering <a´1, a´2, …, a´n> of the input
sequence such that a´1≤ a´2 ≤ … ≤ a´n.
Why sorting?
• Help searching
• Algorithms often use sorting as a key subroutine.
Sorting key
• A specially chosen piece of information used to guide
sorting. E.g., sort student records by names.
Sorting (II)
Examples of sorting algorithms
• Selection sort
• Bubble sort
• Insertion sort
• Merge sort
• Heap sort …
Two properties
• Stability: A sorting algorithm is called stable if it preserves the
relative order of any two equal elements in its input.
• In place : A sorting algorithm is in place if it does not require extra
memory, except, possibly for a few memory units.
Selection Sort
Algorithm SelectionSort(A[0..n-1])
//The algorithm sorts a given array by selection sort
//Input: An array A[0..n-1] of orderable elements
//Output: Array A[0..n-1] sorted in ascending order
for i ← 0 to n – 2 do
    min ← i
    for j ← i + 1 to n – 1 do
        if A[j] < A[min]
            min ← j
    swap A[i] and A[min]
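A minimal C++ sketch of the SelectionSort pseudocode, assuming the array holds `int` values in a `std::vector` (the element type and the name `selection_sort` are illustrative choices):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sorts a in ascending order by selection sort, in place.
void selection_sort(std::vector<int>& a) {
    const std::size_t n = a.size();
    for (std::size_t i = 0; i + 1 < n; ++i) {
        std::size_t min = i;  // index of the smallest element seen in a[i..n-1]
        for (std::size_t j = i + 1; j < n; ++j) {
            if (a[j] < a[min]) min = j;
        }
        std::swap(a[i], a[min]);  // move the smallest remaining element into place
    }
}
```

The loop condition `i + 1 < n` plays the role of `i ← 0 to n – 2` and also handles an empty vector safely with unsigned indices.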
Searching
Find a given value, called a search key, in a given set.
Examples of searching algorithms
• Sequential search
• Binary search …
Input: sorted array a_i < … < a_j and key x;
while i ≤ j do
    m ← ⌊(i+j)/2⌋;
    if x = a_m then output a_m and stop;
    if x < a_m then j ← m-1
    else i ← m+1;
Time: O(log n)
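Binary search can be sketched in C++ as below. This is an illustrative version, not the slide's exact formulation: it recomputes the midpoint on every iteration and returns the index of the key, or -1 when the key is absent. The name `binary_search_index` is an assumption.

```cpp
#include <vector>

// Returns the index of key in the ascending-sorted vector a, or -1 if absent.
int binary_search_index(const std::vector<int>& a, int key) {
    int lo = 0;
    int hi = static_cast<int>(a.size()) - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;  // midpoint, written to avoid overflow
        if (a[mid] == key) return mid;
        if (key < a[mid]) {
            hi = mid - 1;  // continue in the left half
        } else {
            lo = mid + 1;  // continue in the right half
        }
    }
    return -1;
}
```

Each iteration halves the remaining range, which is where the O(log n) bound comes from.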
String Processing
Examples:
(i) searching for a word or phrase on WWW or in a
Word document
(ii) searching for a short read in the reference genomic
sequence
Graph Problems
Informal definition
• A graph is a collection of points called vertices, some of
which are connected by line segments called edges.
Modeling real-life problems
• Modeling WWW
• Communication networks
• Project scheduling …
Examples of graph algorithms
• Graph traversal algorithms
• Shortest-path algorithms
• Topological sorting
Combinatorial Problems
Problems that ask, explicitly or implicitly, to find a
combinatorial object—such as a permutation, a
combination, or a subset—that satisfies certain constraints
with some additional property such as a maximum value or
a minimum cost.
Combinatorial problems are the most difficult problems in
computing, from both a theoretical and practical
standpoint because:
• the number of combinatorial objects typically grows extremely fast
with a problem’s size.
• there are no known algorithms for solving most such problems
exactly in an acceptable amount of time.
Numerical Problems
• Problems that involve mathematical objects of continuous
nature: solving equations and systems of equations,
computing definite integrals, evaluating functions, and so on.
• The majority of such mathematical problems can be solved
only approximately.
Fundamental data structures
list
• string
stack
queue
priority queue/heap
graph
Linear Data Structures
Arrays
• A sequence of n items of the same data type that are stored
contiguously in computer memory and made accessible by specifying a
value of the array’s index.
• fixed length (need preliminary reservation of memory)
• contiguous memory locations
• direct access
• Insert/delete
Linked List
• A sequence of zero or more nodes each containing two kinds of
information: some data and one or more links called pointers to other
nodes of the linked list.
• Singly linked list (next pointer)
• Doubly linked list (next + previous pointers)
• dynamic length
• arbitrary memory locations
• access by following links
• Insert/delete
Stacks and Queues
Stacks
• A stack of plates
– insertion/deletion can be done only at the top.
– LIFO
• Two operations (push and pop)
Queues
• A queue of customers waiting for services
– Insertion/enqueue from the rear and
deletion/dequeue from the front.
– FIFO
• Two operations (enqueue and dequeue)
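The LIFO and FIFO disciplines can be seen side by side with the standard containers `std::stack` and `std::queue`. The helper names `lifo_order` and `fifo_order` below are hypothetical; each one inserts the same items and then removes them all, recording the order of removal.

```cpp
#include <queue>
#include <stack>
#include <vector>

// Pushes every item onto a stack, then pops them all: last in, first out.
std::vector<int> lifo_order(const std::vector<int>& items) {
    std::stack<int> s;
    for (int x : items) s.push(x);
    std::vector<int> out;
    while (!s.empty()) {
        out.push_back(s.top());  // the top is the most recently pushed item
        s.pop();
    }
    return out;
}

// Enqueues every item at the rear, then dequeues from the front: first in, first out.
std::vector<int> fifo_order(const std::vector<int>& items) {
    std::queue<int> q;
    for (int x : items) q.push(x);
    std::vector<int> out;
    while (!q.empty()) {
        out.push_back(q.front());  // the front is the oldest item
        q.pop();
    }
    return out;
}
```

With input 1, 2, 3 the stack version returns 3, 2, 1 and the queue version returns 1, 2, 3.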
Priority Queue and Heap
Priority queues (implemented using heaps)
A data structure for maintaining a set of elements,
each associated with a key/priority, with the
following operations
• Finding the element with the highest priority
• Deleting the element with the highest priority
• Inserting a new element
Scheduling jobs on a shared computer
[Figure: a heap with root 9, children 6 and 8, and leaves 5, 2, 3; array representation: 9 6 8 5 2 3]
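The three priority-queue operations map onto `std::priority_queue`, which is a max-heap by default: `top()` finds the highest-priority element, `pop()` deletes it, and `push()` inserts a new one. The sketch below uses a hypothetical helper, `drain_by_priority`, that inserts a set of keys and then removes them in priority order.

```cpp
#include <queue>
#include <vector>

// Inserts all keys into a max-priority queue, then repeatedly removes
// the element with the highest priority until the queue is empty.
std::vector<int> drain_by_priority(const std::vector<int>& keys) {
    std::priority_queue<int> pq;
    for (int k : keys) pq.push(k);  // insert a new element

    std::vector<int> out;
    while (!pq.empty()) {
        out.push_back(pq.top());  // find the element with the highest priority
        pq.pop();                 // delete it
    }
    return out;
}
```

For the keys 9, 6, 8, 5, 2, 3 the removal order is 9, 8, 6, 5, 3, 2.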
Graphs
Formal definition
• A graph G = <V, E> is defined by a pair of two sets: a
finite set V of items called vertices and a set E of vertex
pairs called edges.
Undirected and directed graphs (digraphs).
What’s the maximum number of edges in an undirected
graph with |V| vertices?
Complete, dense, and sparse graphs
• A graph with every pair of its vertices connected by an
edge is called complete
Graph Representation
Adjacency matrix
• n x n boolean matrix if |V| is n.
• The element on the ith row and jth column is 1 if there’s an
edge from ith vertex to the jth vertex; otherwise 0.
• The adjacency matrix of an undirected graph is symmetric.
Adjacency linked lists
• A collection of linked lists, one for each vertex, that contain all
the vertices adjacent to the list’s vertex.
Which data structure would you use if the graph is a 100-node star
shape?
Example (4 vertices):
Adjacency matrix    Adjacency lists
0 1 1 1             1: 2 3 4
0 0 0 1             2: 4
0 0 0 1             3: 4
0 0 0 0             4: (empty)
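To make the star-graph question concrete, the sketch below builds both representations from an edge list. All names (`Edge`, `adjacency_matrix`, `adjacency_lists`, `star_edges`) are hypothetical, vertices are numbered from 0, and an undirected edge is stored as two directed edges.

```cpp
#include <utility>
#include <vector>

using Edge = std::pair<int, int>;  // directed edge (from, to), 0-based vertices

// n x n boolean matrix: cell [u][v] is true when there is an edge u -> v.
std::vector<std::vector<bool>> adjacency_matrix(int n, const std::vector<Edge>& edges) {
    std::vector<std::vector<bool>> m(n, std::vector<bool>(n, false));
    for (const Edge& e : edges) m[e.first][e.second] = true;
    return m;
}

// One list per vertex holding the vertices adjacent to it.
std::vector<std::vector<int>> adjacency_lists(int n, const std::vector<Edge>& edges) {
    std::vector<std::vector<int>> adj(n);
    for (const Edge& e : edges) adj[e.first].push_back(e.second);
    return adj;
}

// Undirected star on n vertices: center 0 is joined to every other vertex.
std::vector<Edge> star_edges(int n) {
    std::vector<Edge> edges;
    for (int v = 1; v < n; ++v) {
        edges.push_back({0, v});
        edges.push_back({v, 0});
    }
    return edges;
}
```

For a 100-vertex star the matrix stores 100 × 100 = 10,000 cells, while the lists hold only 198 entries in total, which suggests adjacency lists for such a sparse graph.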
Weighted Graphs
Weighted graphs
• Graphs or digraphs with numbers assigned to the edges.
[Figure: example weighted graph on vertices 1, 2, 3, 4 with edge weights 5, 6, 7, 8, 9]
Graph Properties -- Paths and Connectivity
Paths
• A path from vertex u to v of a graph G is defined as a sequence of
adjacent (connected by an edge) vertices that starts with u and ends
with v.
• Simple paths: All edges of a path are distinct.
• Path lengths: the number of edges, or the number of vertices – 1.
Connected graphs
• A graph is said to be connected if for every pair of its vertices u and
v there is a path from u to v.
Connected component
• A maximal connected subgraph of a given graph.
Graph Properties -- Acyclicity
Cycle
• A simple path of a positive length that starts and
ends at the same vertex.
Acyclic graph
• A graph without cycles
• DAG (Directed Acyclic Graph)
Trees
Trees
• A tree (or free tree) is a connected acyclic graph.
• Forest: a graph that has no cycles but is not necessarily
connected.
Properties of trees
• For every two vertices in a tree there always exists exactly one
simple path from one of these vertices to the other. Why?
– Rooted trees: The above property makes it possible to
select an arbitrary vertex in a free tree and consider it as
the root of the so called rooted tree.
– Levels in a rooted tree.
• |E| = |V| - 1
[Figure: a free tree on vertices 1 to 5 and the rooted tree obtained from it with vertex 3 as the root]
Rooted Trees (I)
Ancestors
• For any vertex v in a tree T, all the vertices on the simple path
from the root to that vertex are called ancestors of v.
Descendants
• All the vertices for which a vertex v is an ancestor are said to be
descendants of v.
Parent, child and siblings
• If (u, v) is the last edge of the simple path from the root to
vertex v, u is said to be the parent of v and v is called a child of
u.
• Vertices that have the same parent are called siblings.
Leaves
• A vertex without children is called a leaf.
Subtree
• A vertex v with all its descendants is called the subtree of T
rooted at v.
Rooted Trees (II)
Depth of a vertex
• The length of the simple path from the root to the vertex.
Height of a tree
• The length of the longest simple path from the root to a leaf.
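[Figure: rooted tree with root 3, vertices 4, 1, 5 on the first level and vertex 2 on the second level; height h = 2]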
Ordered Trees
Ordered trees
• An ordered tree is a rooted tree in which all the children of each
vertex are ordered.
Binary trees
• A binary tree is an ordered tree in which every vertex has no more
than two children and each child is designated as either a left child
or a right child of its parent.
Binary search trees
• Each vertex is assigned a number.
• A number assigned to each parental vertex is larger than all the
numbers in its left subtree and smaller than all the numbers in its
right subtree.
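⌊log₂ n⌋ ≤ h ≤ n – 1, where h is the height of a binary tree and n the size.
[Figure: a binary tree with root 9, children 6 and 8, and leaves 5, 2, 3; a binary search tree with root 6, children 3 and 9, and leaves 2, 5, 8]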
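The binary search tree property and the height bounds can be explored with a small C++ sketch. The names `Node`, `bst_insert`, `bst_contains`, and `bst_height` are hypothetical; duplicates are ignored, and the height of an empty tree is taken to be -1 so that a single vertex has height 0.

```cpp
#include <algorithm>
#include <memory>

struct Node {
    int key;
    std::unique_ptr<Node> left, right;
    explicit Node(int k) : key(k) {}
};

// Inserts key so that smaller keys go left and larger keys go right.
void bst_insert(std::unique_ptr<Node>& root, int key) {
    if (!root) {
        root = std::make_unique<Node>(key);
    } else if (key < root->key) {
        bst_insert(root->left, key);
    } else if (key > root->key) {
        bst_insert(root->right, key);
    }  // equal keys are ignored
}

// Follows one root-to-leaf path, so the cost is proportional to the height.
bool bst_contains(const Node* node, int key) {
    while (node != nullptr) {
        if (key == node->key) return true;
        node = (key < node->key) ? node->left.get() : node->right.get();
    }
    return false;
}

// Length of the longest path from the root to a leaf (-1 for an empty tree).
int bst_height(const Node* node) {
    if (node == nullptr) return -1;
    return 1 + std::max(bst_height(node->left.get()), bst_height(node->right.get()));
}
```

Inserting 6, 3, 9, 2, 5, 8 in that order gives height 2, which equals ⌊log₂ 6⌋, while inserting 1, 2, 3, 4, 5, 6 in sorted order gives a chain of height 5 = n – 1.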
Some Well-known Computational Problems
Sorting
Searching
Shortest paths in a graph
Minimum spanning tree
Primality testing
Traveling salesman problem
Knapsack problem
Chess
Towers of Hanoi
Program termination
Fundamentals of the Analysis of
Algorithm Efficiency
Analysis of algorithms
Issues:
• correctness
• time efficiency
• space efficiency
• optimality
Approaches:
• theoretical analysis
• empirical analysis
Space Complexity
S(P) = C + S_P(I)
Fixed Space Requirements (C)
Independent of the characteristics of the inputs
and outputs
• instruction space
• space for simple variables, fixed-size structured
variable, constants
Variable Space Requirements (S_P(I))
depend on the instance characteristic I
• number, size, values of inputs and outputs associated
with I
• recursive stack space, formal parameters, local
variables, return address
Measuring an Input’s Size
Units for Measuring Running Time
Measuring an algorithm’s running time in a standard unit of time
(a second or a millisecond) has drawbacks:
• dependence on the speed of a particular computer
• dependence on the quality of the program implementing the
algorithm
• dependence on the compiler used to generate the machine code
• the difficulty of clocking the actual running time of the
program
Time Complexity
The metrics used include:
Step count: considers the time required by every instruction in an
algorithm.
• Determine the total number of steps contributed by each
statement: steps per execution × frequency
• Add up the contributions of all statements.
Iterative function to sum a list of numbers
s/e = steps per execution
Statement                          s/e   Frequency   Total steps
float sum(float list[ ], int n)     0       0           0
{                                   0       0           0
    float tempsum = 0;              1       1           1
    int i;                          0       0           0
    for(i=0; i <n; i++)             1       n+1         n+1
        tempsum += list[i];         1       n           n
    return tempsum;                 1       1           1
}                                   0       0           0
Total                                                   2n+3
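The 2n+3 total can be checked by instrumenting the function with a counter that is incremented wherever the table charges one step. This is a sketch under that counting convention; the name `sum_counting_steps` and the `steps` output parameter are additions for illustration.

```cpp
#include <cstddef>
#include <vector>

// Sums the list and reports how many steps the table's convention charges.
float sum_counting_steps(const std::vector<float>& list, std::size_t& steps) {
    steps = 0;
    float tempsum = 0;
    ++steps;  // "float tempsum = 0;" executes once

    std::size_t i = 0;
    while (true) {
        ++steps;  // the loop test runs n + 1 times (the last test fails)
        if (!(i < list.size())) break;
        tempsum += list[i];
        ++steps;  // the loop body runs n times
        ++i;
    }

    ++steps;  // "return tempsum;" executes once
    return tempsum;
}
```

For a list of 5 numbers the counter ends at 1 + 6 + 5 + 1 = 13, which agrees with 2n + 3 for n = 5.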
Recursive function to sum a list of numbers
T(n) ≈ c_op · C(n)
T(n): running time
c_op: execution time for the basic operation
C(n): number of times the basic operation is executed
Input size and basic operation examples
Problem                 Input size measure       Basic operation
Typical graph problem   #vertices and/or edges   Visiting a vertex or traversing an edge
Physical time calculation in C++
The clock() function in C++ (header <ctime>) returns the approximate
processor time consumed by the program.
To compute the processor time used, take the difference between the
values returned by two calls to clock(), one at the start and one at
the end of the program. To convert that value to seconds, divide it
by the macro CLOCKS_PER_SEC.
The clock() time may advance faster or slower than the actual
wall clock, depending on how the operating system allocates
resources to the process.
If the processor is shared with other processes, the clock() time
may advance slower than the wall clock. If the current process
runs multiple threads, the clock() time may advance faster than
the wall clock.
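A minimal sketch of the two-call pattern described above. The workload `busy_work` and the wrapper `cpu_seconds_for` are hypothetical names used only to have something to time; the measured value varies between machines and runs.

```cpp
#include <ctime>

// Hypothetical workload: sums 0 .. iterations-1. The volatile accumulator
// keeps the compiler from optimizing the loop away.
unsigned long long busy_work(unsigned long long iterations) {
    volatile unsigned long long acc = 0;
    for (unsigned long long i = 0; i < iterations; ++i) {
        acc = acc + i;
    }
    return acc;
}

// Processor time, in seconds, consumed while running the workload.
double cpu_seconds_for(unsigned long long iterations) {
    const std::clock_t start = std::clock();  // first call, before the work
    busy_work(iterations);
    const std::clock_t end = std::clock();    // second call, after the work
    return static_cast<double>(end - start) / CLOCKS_PER_SEC;
}
```

Because clock() measures processor time rather than wall-clock time, a program that mostly sleeps or waits for I/O can report a much smaller value than a stopwatch would.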
<ctime> Functions
C++ strftime()
C++ mktime()
C++ localtime()
C++ gmtime()
C++ ctime()
C++ asctime()
C++ time()
C++ difftime()
C++ clock()
clock() prototype
• clock_t clock();
3n³ + 20n² + 5
3n³ + 20n² + 5 is O(n³)
need c > 0 and n0 ≥ 1 such that 3n³ + 20n² + 5 ≤ c•n³ for n ≥ n0
this is true for c = 4 and n0 = 21
3 log n + 5
3 log n + 5 is O(log n)
need c > 0 and n0 ≥ 1 such that 3 log n + 5 ≤ c•log n for n ≥ n0
this is true for c = 8 and n0 = 2
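The constants in both examples can be checked numerically. The sketch below assumes base-2 logarithms for the second example (the slides do not state the base, and n0 = 2 only works with base 2); the function names are hypothetical.

```cpp
#include <cmath>

// True when 3n^3 + 20n^2 + 5 <= 4n^3, the inequality behind c = 4.
bool cubic_bound_holds(long long n) {
    return 3 * n * n * n + 20 * n * n + 5 <= 4 * n * n * n;
}

// True when 3 log2(n) + 5 <= 8 log2(n), the inequality behind c = 8.
// Assumes base-2 logarithms and n >= 1.
bool log_bound_holds(double n) {
    const double lg = std::log2(n);
    return 3.0 * lg + 5.0 <= 8.0 * lg;
}
```

The cubic inequality reduces to 20n² + 5 ≤ n³, which fails at n = 20 (8005 > 8000) and holds from n = 21 on (8825 ≤ 9261). The logarithmic one reduces to 5 ≤ 5 log₂ n, that is, n ≥ 2.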
Ω-notation
For function g(n), we define Ω(g(n)),
big-Omega of n, as the set:
Ω(g(n)) = {f(n) :
∃ positive constants c and n0,
such that ∀n ≥ n0,
we have 0 ≤ cg(n) ≤ f(n)}
Intuitively: Set of all functions
whose rate of growth is the same
as or higher than that of g(n).
g(n) is an asymptotic lower bound for f(n).
f(n) = Θ(g(n)) ⇒ f(n) = Ω(g(n)).
Θ(g(n)) ⊂ Ω(g(n)).
Big-omega
Example
Θ-notation
For function g(n), we define Θ(g(n)),
big-Theta of n, as the set:
Θ(g(n)) = {f(n) :
∃ positive constants c1, c2, and n0,
such that ∀n ≥ n0,
we have 0 ≤ c1g(n) ≤ f(n) ≤ c2g(n)
}
Intuitively: Set of all functions that
have the same rate of growth as g(n).
Basic asymptotic efficiency classes
1         constant
log n     logarithmic
n         linear
n log n   n-log-n
n²        quadratic
n³        cubic
2ⁿ        exponential
n!        factorial
Order of Growth of Time Complexity
• Time Complexity/Order of Growth defines the amount of time taken
by any program with respect to the size of the input.
Values of some important functions as n → ∞
Some properties of asymptotic order of growth
f(n) ∈ O(f(n))
Time efficiency of nonrecursive algorithms
Best-case situation:
If the first two elements of the array are the same, then we can exit
after one comparison. Best case = 1 comparison.
Time Complexity
Worst-case situation:
• The basic operation is the comparison in the inner loop. The
worst case happens for two kinds of inputs:
– Arrays with no equal elements
– Arrays in which only the last two elements are the pair of
equal elements
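The best and worst cases above appear to describe a brute-force check of whether all elements of an array are distinct. Under that assumption, the following sketch counts the inner-loop comparisons; the name `unique_elements` and the `comparisons` output parameter are illustrative.

```cpp
#include <cstddef>
#include <vector>

// Returns true if all elements of a are distinct, comparing every pair
// until a match is found. comparisons receives the number of element
// comparisons performed (the basic operation).
bool unique_elements(const std::vector<int>& a, std::size_t& comparisons) {
    comparisons = 0;
    const std::size_t n = a.size();
    for (std::size_t i = 0; i + 1 < n; ++i) {
        for (std::size_t j = i + 1; j < n; ++j) {
            ++comparisons;
            if (a[i] == a[j]) return false;  // equal pair found: stop early
        }
    }
    return true;
}
```

With 4 elements, an array whose first two elements are equal stops after 1 comparison, while an array with no equal elements, or with only the last two equal, needs all n(n – 1)/2 = 6 comparisons.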
Example 3: Matrix multiplication
Example 4: Counting binary digits
T(n) = 2ⁿ T(0) + 2ⁿ⁻¹ + 2ⁿ⁻² + … + 2² + 2¹ + 1
Tree of calls for the Tower of Hanoi Puzzle
Time complexity of TOH Problem
T(n) = 2*T(n-1) + 1
T(n) = 2 * ( 2 * T(n-2) + 1) + 1
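The recurrence can be checked by running the recursive solution and counting the moves. The sketch below assumes T(0) = 0 (no disks, no moves), under which the recurrence solves to 2ⁿ – 1; the function name `hanoi` and the peg labels are illustrative.

```cpp
#include <utility>
#include <vector>

// Moves n disks from peg `from` to peg `to` using peg `via` as a spare,
// appending each single-disk move (source, destination) to `moves`.
void hanoi(int n, char from, char to, char via,
           std::vector<std::pair<char, char>>& moves) {
    if (n == 0) return;                    // T(0) = 0
    hanoi(n - 1, from, via, to, moves);    // T(n-1): clear the way
    moves.push_back({from, to});           // + 1: move the largest disk
    hanoi(n - 1, via, to, from, moves);    // T(n-1): restack on top of it
}
```

For n = 2 with pegs A, C, B the moves are A→B, A→C, B→C, which is 2² – 1 = 3 moves.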
Fibonacci numbers
The Fibonacci numbers:
0, 1, 1, 2, 3, 5, 8, 13, 21, …
The Fibonacci recurrence:
F(n) = F(n-1) + F(n-2)
F(0) = 0
F(1) = 1
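The recurrence and its two initial conditions can be evaluated bottom-up, keeping only the last two values. This is one possible sketch; the name `fib` is an assumption, and the result fits in 64 bits only for n ≤ 93.

```cpp
#include <cstdint>

// Returns F(n) with F(0) = 0 and F(1) = 1, computed iteratively.
// The result is exact for n <= 93; larger n overflows 64 bits.
std::uint64_t fib(unsigned n) {
    std::uint64_t a = 0;  // F(i)
    std::uint64_t b = 1;  // F(i + 1)
    for (unsigned i = 0; i < n; ++i) {
        const std::uint64_t next = a + b;  // F(i + 2) = F(i + 1) + F(i)
        a = b;
        b = next;
    }
    return a;
}
```

This uses n additions, in contrast to a direct recursive translation of the recurrence, which recomputes the same values many times.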