
Lecture 04

Data Analysis and Algorithms

Uploaded by

kainat sajid

Advanced Analysis of Algorithms

Levitin: Chapter 3
Today Covered

Algorithm Design Techniques/Strategies

Brute Force Approach
• Checking primality
• Sorting sequence of numbers
• Knapsack problem
• Closest pair in 2-D, 3-D and n-D
• Finding maximal points in n-D
Basic Issues Related to Algorithms
• How to design algorithms

• How to express algorithms

• Proving correctness

• Efficiency (or complexity) analysis


– Theoretical analysis

– Empirical analysis

• Optimality

Algorithm design strategies

• Brute force
• Divide and conquer
• Decrease and conquer
• Transform and conquer
• Space and time tradeoffs
• Greedy approach
• Dynamic programming
• Backtracking
• Branch-and-bound
• Iterative improvement

Analysis of Algorithms
• How good is the algorithm?
– Correctness
– Time efficiency
– Space efficiency

• Does there exist a better algorithm?


– Lower bounds
– Optimality

What is an algorithm?
• Recipe, process, method, technique, procedure, routine, ... with the following requirements:
1. Finiteness
   – terminates after a finite number of steps
2. Definiteness
   – rigorously and unambiguously specified
3. Clearly specified input
   – valid inputs are clearly specified
4. Clearly specified/expected output
   – can be proved to produce the correct output given a valid input
5. Effectiveness
   – steps are sufficiently simple and basic
Why study algorithms?
• Theoretical importance

– the core of computer science

• Practical importance

– A practitioner’s toolkit of known algorithms

– Framework for designing and analyzing algorithms for new problems

Brute Force Approach

• Brute force is a straightforward approach to solving a problem, usually directly based on the problem statement and definitions of the concepts involved.
• Brute Force means “Just do it”.
• And often, the brute-force strategy is indeed the one
that is easiest to apply.
Simple Brute Force Approaches
• Primality Testing
• Selection Sort
• Bubble Sort
• Linear Search
• String Matching
• Exhaustive Search
Primality Testing

(the given number has n binary digits)

First Algorithm for Testing Primality
Brute Force Approach
Prime (n)
  for i ← 2 to n − 1
    if n mod i = 0 then
      return "number is composite"
  return "number is prime"
• The computational cost is O(n) divisions, where n is the value of the number being tested.
• In terms of binary digits, a number with n bits can be as large as 2^n, so the cost is O(2^n).
Refined Algorithm for Testing Primality
Prime (n)
  for i ← 2 to n/2
    if n mod i = 0 then
      return "number is composite"
  return "number is prime"

• The computational cost is O(n/2) divisions, where n is the value of the number being tested.
• In terms of binary digits, the cost for an n-bit number is O(2^(n−1)), not much of an improvement.
Algorithm for Testing Primality
• We are not interested in how many operations are required to test whether the number n is prime.
• Instead, we are interested in how many operations are required to test whether a number with n digits is prime.
• RSA-128 encryption uses prime numbers which are 128 bits long, where 2^128 is:
340282366920938463463374607431768211456
• Therefore, to prime test a number with n binary digits by brute force, we need to check up to 2^n numbers.
• Thus brute-force prime testing requires time exponential in the number of digits n.
• This is not acceptable in this case.
Lemma
Statement
• If n ∈ N, n > 1 is not prime, then n is divisible by some prime number p ≤ sqrt(n).
Proof
• Since n is not prime, it can be factored as n = x·y where 1 < x ≤ y < n (without loss of generality, x is the smaller factor).
• Our claim is that x ≤ sqrt(n).
• Otherwise x > sqrt(n) and y ≥ x > sqrt(n), so x·y > n, a contradiction.
• If x is prime, take p = x. If not, x can be factored further until a prime factor p of x is reached; then p ≤ x ≤ sqrt(n) and p divides n.
• Hence we only need to check divisors up to sqrt(n) in the primality test.
Refined Algorithm for Testing Primality
Prime (n)
  for i ← 2 to sqrt(n)
    if n mod i = 0 then
      return "number is composite"
  return "number is prime"

• The computational cost is O(sqrt(n)), much faster.
• In terms of binary digits, the cost for an n-bit number is O(sqrt(2^n)) = O(2^(n/2)), still exponential.
• The computational cost can be decreased using number-theoretic concepts, which will be discussed later on.
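The refined test can be sketched in Python as follows. This is only an illustration of trial division up to sqrt(n); the names `is_prime` and `worst_case_divisors` are hypothetical and do not come from the lecture. The second function counts how many candidate divisors the loop may try for the largest number with a given bit length, which makes the exponential growth in the number of digits visible.

```python
import math


def is_prime(n: int) -> bool:
    """Brute-force primality test: try every divisor from 2 to floor(sqrt(n))."""
    if n < 2:
        return False
    for i in range(2, math.isqrt(n) + 1):
        if n % i == 0:
            return False  # a divisor was found, so n is composite
    return True  # no divisor up to sqrt(n), so n is prime


def worst_case_divisors(bits: int) -> int:
    """Number of candidates 2..floor(sqrt(N)) for N = 2^bits - 1 (hypothetical helper)."""
    return math.isqrt(2**bits - 1) - 1


if __name__ == "__main__":
    print([k for k in range(2, 30) if is_prime(k)])
    for bits in (8, 16, 32, 64, 128):
        print(bits, worst_case_divisors(bits))
```

Each extra pair of bits roughly doubles the number of candidate divisors, which is the 2^(n/2) behaviour stated above.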
Sorting Sequence of Numbers
An Example of Algorithm
• Input: A sequence of n numbers (distinct)
  ⟨a1, a2, ..., an⟩
• Output: A permutation ⟨a'1, a'2, ..., a'n⟩ of the input sequence such that
  a'1 ≤ a'2 ≤ ... ≤ a'n
• Example: ⟨3, 0, 6, 1, 5, 2⟩ → ⟨0, 1, 2, 3, 5, 6⟩
Sorting Algorithm
Sorting Algorithm: Brute Force Approach
Sort the array [2, 4, 1, 3] in increasing order
s1 = [4,3,2,1], s2 = [4,3,1,2], s3 = [4,1,2,3]
s4 = [4,2,3,1], s5 = [4,1,3,2], s6 = [4,2,1,3]
s7 = [3,4,2,1], s8 = [3,4,1,2], s9 = [3,1,2,4]
s10 = [3,2,4,1], s11 = [3,1,4,2], s12 = [3,2,1,4]
s13 = [2,3,4,1], s14 = [2,3,1,4], s15 = [2,1,4,3]
s16 = [2,4,3,1], s17 = [2,1,3,4], s18 = [2,4,1,3]
s19 = [1,3,2,4], s20 = [1,3,4,2], s21 = [1,4,2,3]
s22 = [1,2,3,4], s23 = [1,4,3,2], s24 = [1,2,4,3]
There are 4! = 24 permutations, of which only s22 is in increasing order.
For n elements there are n! permutations, hence a cost of order n! for sorting.
Generating Permutations
Permute (i)    // initial call: Permute(1)
  if i == N
    output A[1..N]
  else
    for j = i to N do
      swap(A[i], A[j])
      Permute(i + 1)
      swap(A[i], A[j])
• There are 4! = 24 permutations of 4 elements.
• For n elements there are n! permutations, hence a cost of order n! for sorting.
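A minimal Python sketch of the swap-based Permute procedure, followed by the brute-force sort built on it. The function names are illustrative assumptions, and indices are 0-based rather than the 1-based indices of the pseudocode.

```python
def permutations_by_swapping(items):
    """Yield every permutation of `items`, following the swap scheme of Permute(i)."""
    a = list(items)
    n = len(a)

    def permute(i):
        if i >= n - 1:
            yield tuple(a)  # one complete arrangement
            return
        for j in range(i, n):
            a[i], a[j] = a[j], a[i]  # place element j at position i
            yield from permute(i + 1)
            a[i], a[j] = a[j], a[i]  # undo the swap before the next choice

    yield from permute(0)


def brute_force_sort(items):
    """Return the first generated permutation that is in increasing order."""
    for p in permutations_by_swapping(items):
        if all(p[k] <= p[k + 1] for k in range(len(p) - 1)):
            return list(p)


if __name__ == "__main__":
    print(len(list(permutations_by_swapping([2, 4, 1, 3]))))
    print(brute_force_sort([2, 4, 1, 3]))
```

In the worst case the sorted arrangement is the last one generated, so all n! permutations are examined.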
Example
Generating Permutations
Theorem
• Prove, by mathematical induction, that the computational cost of generating permutations of n elements is n!.
Proof
• If n = 1, then the statement is true, because 1! = 1.
• Assume that a set of k elements has k! permutations.
• If we add one more element to any one of these permutations, there are k + 1 positions in which to insert it, giving k + 1 new permutations per original permutation.
• The total number of permutations is therefore k!(k + 1) = (k + 1)!.
• Hence the statement is true for all n.
Selection Sort
Bubble Sort
Sequential Search
Brute-Force Strengths and Weaknesses
• Strengths
– wide applicability
– simplicity
– yields reasonable algorithms for some important problems (e.g., matrix multiplication, sorting, searching, string matching)

• Weaknesses
– rarely yields efficient algorithms
– some brute-force algorithms are unacceptably slow
– not as constructive as some other design techniques
Exhaustive Search
• Many important problems require finding an element with
a special property in a domain that grows exponentially
(or faster) with an instance size.
• Typically, such problems arise in situations that involve—
explicitly or implicitly—combinatorial objects such as
permutations, combinations, and subsets of a given set.
• Many such problems are optimization problems: they ask
to find an element that maximizes or minimizes some
desired characteristic such as a path length or an
assignment cost.
Exhaustive Search
• Exhaustive search is simply a brute-force approach to combinatorial problems.
• It suggests generating each and every element of the problem domain, selecting those of them that satisfy all the constraints, and then finding a desired element (e.g., the one that optimizes some objective function).
• Note that although the idea of exhaustive search is quite straightforward, its implementation typically requires an algorithm for generating certain combinatorial objects.
Exhaustive Search
A brute force solution to a problem involving search for an
element with a special property, usually among
combinatorial objects such as permutations, combinations,
or subsets of a set.

Method:
– generate a list of all potential solutions to the problem in a systematic manner (see algorithms in Sec. 5.4)
– evaluate potential solutions one by one, disqualifying infeasible ones and, for an optimization problem, keeping track of the best one found so far
– when search ends, announce the solution(s) found

Exhaustive Search
• Traveling Salesman Problem
• Knapsack Problem
• Depth First Search
• Breadth First Search
• Closest Pair Problem
• Finding Maximal Points
Traveling Salesman Problem

• Given n cities with known distances between each pair, find the shortest tour that passes through all the cities exactly once before returning to the starting city
• Alternatively: Find the shortest Hamiltonian circuit in a weighted connected graph
• Example: a complete graph on the vertices a, b, c, d with edge weights
  a-b = 2, a-c = 8, a-d = 5, b-c = 3, b-d = 4, c-d = 7
How do we represent a solution (Hamiltonian circuit)?
TSP by Exhaustive Search

Tour             Cost

a→b→c→d→a        2+3+7+5 = 17
a→b→d→c→a        2+4+7+8 = 21
a→c→b→d→a        8+3+4+5 = 20
a→c→d→b→a        8+7+4+2 = 21
a→d→b→c→a        5+4+3+8 = 20
a→d→c→b→a        5+7+3+2 = 17

Efficiency: Θ((n-1)!)
Chapter 5 discusses how to generate permutations fast.
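The table above can be reproduced with a short Python sketch. It fixes the start city and tries all (n-1)! orderings of the remaining cities. The edge weights are the ones implied by the tour costs in the table; the names `DIST`, `dist` and `tsp_brute_force` are assumptions made for this illustration.

```python
from itertools import permutations

# Symmetric edge weights of the four-city example (as implied by the tour table).
DIST = {
    ("a", "b"): 2, ("a", "c"): 8, ("a", "d"): 5,
    ("b", "c"): 3, ("b", "d"): 4, ("c", "d"): 7,
}


def dist(u, v):
    """Look up the weight of edge u-v in either direction."""
    return DIST[(u, v)] if (u, v) in DIST else DIST[(v, u)]


def tsp_brute_force(cities, start):
    """Try every ordering of the non-start cities and keep the cheapest tour."""
    others = [c for c in cities if c != start]
    best_cost, best_tour = None, None
    for order in permutations(others):
        tour = (start, *order, start)
        cost = sum(dist(tour[k], tour[k + 1]) for k in range(len(tour) - 1))
        if best_cost is None or cost < best_cost:
            best_cost, best_tour = cost, tour
    return best_cost, best_tour


if __name__ == "__main__":
    print(tsp_brute_force(["a", "b", "c", "d"], "a"))
```

Because ties are broken in favour of the first tour found, the sketch reports a→b→c→d→a rather than its reverse, which has the same cost.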
0-1 Knapsack Problem
0-1 Knapsack Problem Statement
The knapsack problem arises whenever there is
resource allocation with no financial constraints
Problem Statement
• You are in Japan on an official visit and want to shop at a store
• You have a list of required items
• You also have a bag (knapsack) of fixed capacity, and you can fill only this bag with the selected items
• Every item has a value (cost) and a weight
• Your objective is to find the most valuable set of items that you can buy without exceeding the bag limit
0-1 Knapsack Example
Input
• Given n items, each with
– weight wi
– value vi
• Knapsack of capacity W
Output: Find the most valuable subset of items that fits into the knapsack

Example: knapsack capacity W = 16
item   weight   value
1      2        20
2      5        30
3      10       50
4      5        10
0-1 Knapsack Problem Example
#     Subset       Total weight   Total value
1.    ∅            0              0
2.    {1}          2              20
3.    {2}          5              30
4.    {3}          10             50
5.    {4}          5              10
6.    {1,2}        7              50
7.    {1,3}        12             70
8.    {1,4}        7              30
9.    {2,3}        15             80
10.   {2,4}        10             40
11.   {3,4}        15             60
12.   {1,2,3}      17             not feasible
13.   {1,2,4}      12             60
14.   {1,3,4}      17             not feasible
15.   {2,3,4}      20             not feasible
16.   {1,2,3,4}    22             not feasible
0-1 Knapsack Algorithm
Knapsack-BF (n, V, W, C)
  Compute all subsets, s, of S = {1, 2, ..., n} (here S = {1, 2, 3, 4})
  solution ← ∅
  for all s ⊆ S
    weight ← sum of the weights of the items in s
    if weight > C then s is not feasible; skip it
    else
      new solution ← sum of the values of the items in s
      solution ← solution ∪ {new solution}
  return maximum of solution
0-1 Knapsack Algorithm Analysis
Approach
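• In the brute force algorithm, we go through all combinations and find the one with maximum value and with total weight less than or equal to the capacity W = 16

Complexity
• Cost of computing subsets: O(2^n) for n elements
• Cost of computing weights: one sum per subset, O(2^n) sums
• Cost of computing values: one sum per subset, O(2^n) sums
• Total cost in worst case: O(2^n) subset evaluations (O(n · 2^n) elementary operations if each sum is computed from scratch)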
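The exhaustive approach can be sketched in Python with `itertools.combinations` enumerating all 2^n subsets. The function name `knapsack_brute_force` and the 0-based item indices are assumptions of this sketch, not part of the lecture's pseudocode.

```python
from itertools import combinations


def knapsack_brute_force(weights, values, capacity):
    """Examine every subset of items and keep the most valuable feasible one."""
    n = len(weights)
    best_value, best_subset = 0, ()
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            weight = sum(weights[i] for i in subset)
            if weight > capacity:
                continue  # infeasible subset: exceeds the knapsack capacity
            value = sum(values[i] for i in subset)
            if value > best_value:
                best_value, best_subset = value, subset
    return best_value, best_subset


if __name__ == "__main__":
    # The four-item example with capacity 16.
    value, subset = knapsack_brute_force([2, 5, 10, 5], [20, 30, 50, 10], 16)
    print(value, [i + 1 for i in subset])  # item numbers are 1-based in the slides
```

On the example instance this selects items 2 and 3 with total weight 15 and total value 80, matching row 9 of the table.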
The Closest Pair Problem
Finding Closest Pair
Problem
The closest pair problem is defined as follows:
• Given a set of n points, determine the two points that are closest to each other in terms of distance. Furthermore, if more than one pair of points has the closest distance, all such pairs should be identified.
Input:
• a set of n points
Output:
• a pair of points closest to each other
• there can be more than one such pair
Definition: Closest Pair
Distance
• In mathematics, particularly in geometry, a distance on a given set M is a function d: M × M → R, where R denotes the set of real numbers, that satisfies the following conditions for all x, y, z in M:

1. Non-negativity: d(x, y) ≥ 0.
2. d(x, y) = 0 if and only if x = y.
3. Symmetry: d(x, y) = d(y, x).
4. Triangle inequality: d(x, z) ≤ d(x, y) + d(y, z).
Finding Closest Pair
Closest Pair Problem in 2-D
• A point in 2-D is an ordered pair of values (x, y).
• The Euclidean distance between two points Pi = (xi, yi) and Pj = (xj, yj) is
  d(Pi, Pj) = sqrt((xi − xj)^2 + (yi − yj)^2)
• The closest-pair problem is finding the two closest points in a set of n points.
• The brute force algorithm checks every pair of points.
• Assumption: We can avoid computing square roots by using squared distances.
– This does not affect correctness, because the square root is monotonically increasing: the pair with the smallest squared distance also has the smallest distance.
Brute Force Approach: Finding Closest Pair in 2-D
ClosestPairBF(P)
  mind ← ∞
  for i ← 1 to n do
    for j ← 1 to n do
      if i ≠ j then
        d ← ((xi − xj)^2 + (yi − yj)^2)
        if d < mind then
          mind ← d
          mini ← i
          minj ← j
  return mind, p(mini, minj)

Time Complexity
$\sum_{i=1}^{n} \sum_{j=1}^{n} c = \sum_{i=1}^{n} cn = cn^2 = \Theta(n^2)$
Improved Version: Finding Closest Pair in 2-D
ClosestPairBF(P)
  mind ← ∞
  for i ← 1 to n − 1 do
    for j ← i + 1 to n do
      d ← ((xi − xj)^2 + (yi − yj)^2)
      if d < mind then
        mind ← d
        mini ← i
        minj ← j
  return mind, p(mini, minj)

Time Complexity
$\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} c = \sum_{i=1}^{n-1} c(n-i) = c\left(\sum_{i=1}^{n-1} n - \sum_{i=1}^{n-1} i\right) = cn(n-1) - c\frac{(n-1)n}{2} = \Theta(n^2)$
Brute Force Approach: Finding Closest Pair in 3-D
ClosestPairBF(P)
  mind ← ∞
  for i ← 1 to n − 1 do
    for j ← i + 1 to n do
      d ← ((xi − xj)^2 + (yi − yj)^2 + (zi − zj)^2)
      if d < mind then
        mind ← d
        mini ← i
        minj ← j
  return mind, p(mini), p(minj)

Time Complexity
$\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} c = \sum_{i=1}^{n-1} c(n-i) = c\left(\sum_{i=1}^{n-1} n - \sum_{i=1}^{n-1} i\right) = \Theta(n^2)$
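The improved pseudocode carries over to any number of dimensions, since only the distance computation changes. Below is a minimal Python sketch under that assumption; points are tuples of equal length, the comparison uses squared distances, and the name `closest_pair_brute_force` is hypothetical. For brevity it returns only the first closest pair found, not all tied pairs.

```python
def closest_pair_brute_force(points):
    """Return (squared distance, i, j) for the closest pair, checking each pair once."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    best = None
    for i in range(n - 1):
        for j in range(i + 1, n):
            # Squared Euclidean distance; works for 2-D, 3-D, or n-D tuples.
            d = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            if best is None or d < best[0]:
                best = (d, i, j)
    return best


if __name__ == "__main__":
    print(closest_pair_brute_force([(0, 0), (5, 5), (1, 1), (9, 0)]))
    print(closest_pair_brute_force([(0, 0, 0), (1, 2, 2), (10, 10, 10), (1, 2, 3)]))
```

The double loop visits n(n-1)/2 pairs, so the running time is Θ(n^2) for a fixed dimension.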
Finding Maximal in n-dimension
Maximal Points
• Dominated Point in 2-D
A point p is said to be dominated by q if
p.x ≤ q.x and p.y ≤ q.y
• Dominated Point in n-D
A point p is said to be dominated by q if
p.xi ≤ q.xi ∀ i = 1, ..., n
• Maximal Point
A point is said to be maximal if it is not dominated by any other point.
Example: Maximal Points in 2-Dimension
[Figure: scatter plot of the points (1,3), (2,8), (4,4), (4,10), (5,2), (7,6), (8,8), (9,1), (11,4), with the x-axis running from 1 to 12 and the y-axis from 1 to 11]
Brute Force Algorithm in n-dimension
MAXIMAL-POINTS (int m, Point P[1..m])
  A ← ∅
  for i ← 1 to m                  // m is the number of points
    do maximal ← true
      for j ← 1 to m
        do
          if (i ≠ j) and P[i].x[k] ≤ P[j].x[k] for all k ← 1 to n    // n is the dimension
            then maximal ← false; break
      if maximal
        then A ← A ∪ {P[i]}
  return A
Problem Statement
Problem Statement:
Given a set of m points, P = {p1, p2, ..., pm}, in n dimensions, our objective is to compute the set of maximal points, i.e. the set of points which are not dominated by any other point in the given list.

Mathematical Description:
Maximal Points =
{ p ∈ P | ∀ q ∈ {p1, ..., pm}, q ≠ p, ∃ i ∈ {1, ..., n} such that p.xi > q.xi }
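A minimal Python sketch of the brute-force maximal-points computation, using the definition that p is dominated by q when every coordinate of p is at most the corresponding coordinate of q. The function names are assumptions, and the test data are the nine points that survive from the 2-D example figure.

```python
def dominated_by(p, q):
    """True if p is dominated by q: p[k] <= q[k] in every dimension k."""
    return all(a <= b for a, b in zip(p, q))


def maximal_points(points):
    """Compare every point against every other point; keep the undominated ones."""
    result = []
    for i, p in enumerate(points):
        # Note: under this definition two identical points dominate each other.
        if not any(i != j and dominated_by(p, q) for j, q in enumerate(points)):
            result.append(p)
    return result


if __name__ == "__main__":
    pts = [(4, 10), (2, 8), (8, 8), (7, 6), (4, 4), (11, 4), (1, 3), (5, 2), (9, 1)]
    print(maximal_points(pts))
```

With m points in n dimensions the sketch performs at most m^2 domination checks of n comparisons each.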
The Assignment Problem
There are n people who need to be assigned to n
jobs, one person per job. The cost of assigning
person i to job j is C[i,j]. Find an assignment that
minimizes the total cost.
           Job 0   Job 1   Job 2   Job 3
Person 0   9       2       7       8
Person 1   6       4       3       7
Person 2   5       8       1       8
Person 3   7       6       9       4

Algorithmic Plan: Generate all legitimate assignments, compute their costs, and select the cheapest one.
How many assignments are there? n!
Pose the problem as one about a cost matrix:
Assignment Problem by Exhaustive Search
C =
    9  2  7  8
    6  4  3  7
    5  8  1  8
    7  6  9  4

Assignment (col. #s)    Total Cost
1, 2, 3, 4              9+4+1+4 = 18
1, 2, 4, 3              9+4+8+9 = 30
1, 3, 2, 4              9+3+8+4 = 24
1, 3, 4, 2              9+3+8+6 = 26
1, 4, 2, 3              9+7+8+9 = 33
1, 4, 3, 2              9+7+1+6 = 23
etc.
(For this particular instance, the optimal assignment can be found by exploiting the specific features of the numbers given. It is: 2, 1, 3, 4.)
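The exhaustive search over all n! assignments can be sketched in Python as follows. Each permutation gives, for every person (row), the job (column) assigned to that person. The function name `assignment_brute_force` is an assumption, and columns are 0-based in the code while the table above numbers them from 1.

```python
from itertools import permutations


def assignment_brute_force(cost):
    """Try every one-to-one assignment of people (rows) to jobs (columns)."""
    n = len(cost)
    best_cost, best_jobs = None, None
    for jobs in permutations(range(n)):
        total = sum(cost[person][jobs[person]] for person in range(n))
        if best_cost is None or total < best_cost:
            best_cost, best_jobs = total, jobs
    return best_cost, best_jobs


if __name__ == "__main__":
    C = [
        [9, 2, 7, 8],
        [6, 4, 3, 7],
        [5, 8, 1, 8],
        [7, 6, 9, 4],
    ]
    total, jobs = assignment_brute_force(C)
    print(total, [j + 1 for j in jobs])  # convert to the 1-based column numbers
```

For the example matrix the search confirms the assignment 2, 1, 3, 4 with total cost 2+6+1+4 = 13.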
Final Comments on Exhaustive Search
• Exhaustive-search algorithms run in a realistic amount of time only on very small instances

• In some cases, there are much better alternatives!
– Euler circuits
– shortest paths
– minimum spanning tree
– assignment problem
• In many cases, exhaustive search or a variation of it is the only known way to get an exact solution
Conclusion
• Designing algorithms using the Brute Force approach was discussed
• For Brute Force, formally, the output of any sorting algorithm must satisfy the following two conditions:
– Output is in decreasing/increasing order
and
– Output is a permutation, or reordering, of the input.
• Algorithms computing maximal points can be considered a generalization of sorting algorithms
• Maximal points are useful in Computer Science and Mathematics: for a maximal point p, every other point has at least one component that is dominated by the corresponding component of p
