Advanced Algorithm
Complexity analysis is a technique for characterizing the resources an algorithm consumes as a
function of input size, independently of the machine, language, and compiler. It is used to compare
how execution time varies across different algorithms. The resources evaluated when analyzing the
complexity of an algorithm are mainly time (the number of instructions executed by the algorithm) and
space (the amount of memory used by the algorithm).
1. Asymptotic notation in complexity analysis
1. Big O notation
Big O notation is special notation that tells you how fast an algorithm is. Who cares? Well, it turns out
that you’ll use other people’s algorithms often—and when you do, it’s nice to understand how fast or
slow they are; and for some complex algorithm you write yourself, it may be worth testing how
efficient it is. In this section, I’ll explain what Big O notation is and give you a list of the most common
running times for algorithms using it.
Big-O notation represents an upper bound on the running time of an algorithm; it therefore gives the
worst-case complexity of an algorithm. Using big-O notation, we can asymptotically bound the
growth of a running time from above, to within a constant factor. It is a model for
quantifying algorithm performance.
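For reference, here are the most common running times, from fastest to slowest: O(1) (constant time), O(log n) (logarithmic time, e.g. binary search), O(n) (linear time, e.g. simple search), O(n log n) (e.g. quicksort on average), O(n²) (quadratic time, e.g. selection sort), and O(n!) (factorial time, e.g. solving the travelling salesman problem by brute force).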
The following graph shows the shape of these running-time curves as a function of input size, using big O notation.
2. Time complexity
The time complexity of an algorithm is estimated by summing the running times of the statements it
executes. Consider a sequence of simple statements such as:
x←3; //Statement1
y←5; //Statement2
z←x+y; //Statement3
flag←true; //Statement4
Problem: estimate the total time taken by this sequence.
Solution:
total time = time(statement1) + time(statement2) + ... + time(statementN)
Assuming that n is the size of the input, let’s use T(n) to represent the overall time and t to represent the
amount of time that a statement or collection of statements takes to execute:
T(n) = t(statement1) + t(statement2) + ... + t(statementN);
Each of the statements above takes a constant amount of time, so the sequence as a whole runs in O(1)
time. A loop, on the other hand, repeats its body, so its cost is the cost of the body multiplied by the
number of iterations:
for i←1 to n do
write(“Hello World”);
endFor
The body executes n times, so this loop runs in O(n) time.
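To make this concrete, here is a small Python sketch of the same analysis (the function greet and its statements are our own illustration): a constant-time part followed by a loop whose body executes n times, so T(n) = c + c'·n = O(n).

def greet(n):
    flag = True              # Statement executed once: constant time
    message = "Hello World"  # Statement executed once: constant time
    for i in range(n):       # The loop body executes n times
        print(message)       # Constant-time body, repeated n times
    # T(n) = t(flag) + t(message) + n * t(print) = O(n)

greet(3)  # prints "Hello World" three times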
3. Space complexity
The amount of memory required by an algorithm to solve a given problem is called the space complexity of the
algorithm. Problem-solving using a computer requires memory to hold temporary data or the final result while the
program is in execution. The space complexity of an algorithm is the total space taken by the algorithm with
respect to the input size. Space complexity includes both auxiliary space and the space used by the input.
Space complexity is a parallel concept to time complexity. If we need to create an array of size n, this will
require O(n) space. If we create a two-dimensional array of size n*n, this will require O(n²) space.
The stack space consumed by recursive calls also counts.
Consider, for example, the following recursive function:
Var int add(var int n)
begin
if (n <= 0)
return 0;
endif
return n + add(n-1);
end
The call add(4) produces the following sequence of nested calls:
1. add(4)
2. -> add(3)
3. -> add(2)
4. -> add(1)
5. -> add(0)
Each of these calls is added to the call stack and takes up actual memory, so the function uses
O(n) space.
By contrast, consider the following pair of functions:
Var int sum(var int n)
begin
int sum = 0;
for i←0 to n-1 do
sum = sum + pairSum(i, i+1);
endfor
return sum;
end
Var int pairSum(var int x, var int y)
begin
return x + y;
end
There will be roughly O(n) calls to pairSum. However, those calls do not exist simultaneously on the
call stack: each one returns before the next is made, so only O(1) space is used.
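A runnable Python sketch of the two situations above (our own transcription of the pseudocode):

def add(n):
    # Each call waits for add(n-1), so n calls pile up on the stack: O(n) space.
    if n <= 0:
        return 0
    return n + add(n - 1)

def pair_sum(x, y):
    return x + y

def sum_pairs(n):
    # Each pair_sum call returns before the next begins: O(1) extra space.
    total = 0
    for i in range(n):
        total += pair_sum(i, i + 1)
    return total

print(add(4))        # prints 10
print(sum_pairs(4))  # prints 16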
Divide and conquer
Definition
Divide and conquer is a problem-solving approach that involves breaking down a complex problem
into smaller, more manageable subproblems in order to solve the original problem.
Characteristics
The divide and conquer programming paradigm requires that the problem to be solved be
decomposable into subproblems, and this decomposition can be recursive. This is also called a
top-down approach.
The divide and conquer approach has three steps: divide, conquer, and combine. Only the first two
steps are made explicit in the name of the divide and conquer paradigm, but a step of combining the
solutions to the subproblems is necessary to solve the general problem. Let's study each of these three
steps in more detail.
• The "divide" step consists of breaking down the main problem into subproblems.
• The "conquer" step consists of solving each of the subproblems individually.
• The "combine" step consists of merging all of the results obtained for each of the subproblems
in order to obtain the final result of the solution to the original problem.
So the general algorithm of the principle is as follows.
Algorithm DAndC(P)
begin
if Small(P) then
return Solution(P);
else
//Divide P into smaller subproblems P1, P2, ..., Pk
return Combine(DAndC(P1), DAndC(P2), ..., DAndC(Pk));
endif
end
Applications
• Quicksort
array function quicksort(array)
begin
//Base case: arrays with 0 or 1 element are already “sorted.”
if length(array) < 2:
return array
else //Recursive case
pivot = array[0]
less = [i for i in array[1:] if i <= pivot] //Sub-array of all the elements less than or equal to the pivot
greater = [i for i in array[1:] if i > pivot] //Sub-array of all the elements greater than the pivot
return quicksort(less) + [pivot] + quicksort(greater) //Combine the sorted sub-arrays around the pivot
endif
end
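The pseudocode above is close enough to Python that a runnable version is almost identical (a teaching sketch: practical quicksorts usually partition in place and pick the pivot randomly):

def quicksort(array):
    # Base case: arrays with 0 or 1 element are already sorted.
    if len(array) < 2:
        return array
    pivot = array[0]
    less = [i for i in array[1:] if i <= pivot]     # elements <= pivot
    greater = [i for i in array[1:] if i > pivot]   # elements > pivot
    # Combine: sorted left part + pivot + sorted right part.
    return quicksort(less) + [pivot] + quicksort(greater)

print(quicksort([10, 5, 2, 3]))  # prints [2, 3, 5, 10]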
• Binary search (also called dichotomic search), sketched below
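A minimal recursive sketch of binary search in Python (our own illustration; it assumes the list items is sorted and returns the index of target, or None if it is absent):

def binary_search(items, target, lo=0, hi=None):
    if hi is None:
        hi = len(items)
    if lo >= hi:
        return None                  # empty range: target not found
    mid = (lo + hi) // 2             # divide: split the range in half
    if items[mid] == target:
        return mid
    if items[mid] < target:          # conquer: recurse on one half only
        return binary_search(items, target, mid + 1, hi)
    return binary_search(items, target, lo, mid)

print(binary_search([1, 3, 5, 7, 9], 7))  # prints 3

Here the "combine" step is trivial, which is why each call does O(1) work and the total running time is O(log n).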
Dynamic programming
Dynamic programming (DP) is an algorithmic method of solving optimization problems. Programming in this context
refers to mathematical programming, which is a synonym for optimization. DP solves a problem by combining the
solutions to its sub-problems. The famous divide-and-conquer method also solves a problem in a similar manner. The
divide-and-conquer method divides a problem into independent sub-problems, whereas in DP, either the sub-problems
depend on the solution sets of other sub-problems or the sub-problems appear repeatedly. DP uses the dependency of
the sub-problems and attempts to solve a subproblem only once; it then stores its solution in a table for future lookup.
This strategy helps avoid the time spent recalculating solutions to old sub-problems, resulting in an efficient
algorithm.
To illustrate DP, we use the Fibonacci series, defined as follows:
f(0) = 0, f(1) = 1, and f(n) = f(n-1) + f(n-2) for n > 1
An implementation of this definition using recursion is given below.
Var int fibonacci(var int n)
begin
if (n==1)
return 1;
else
if (n==0)
return 0;
else
return fibonacci(n-1) + fibonacci(n-2);
endif
endif
end
This algorithm quickly becomes very slow even for relatively small values of n. For n = 50, for example, it
can take more than an hour to terminate. Let’s analyze the problem.
1.1 Problem analysis
Let us analyze the calls made by the algorithm for n = 6, as shown in the following graph:
By carefully observing the diagram above, you will have noticed that many calculations are unnecessary,
because they are carried out many times: for example, there are 2 calls of f(4), 3 calls of f(3), 5 calls of f(2), etc.
We could therefore greatly simplify the calculation by performing each repeated computation once and for all,
"memorizing" its result and reusing it when necessary.
1.2 Solution: Memoization
We can solve the problem iteratively by constructing a table in a bottom-up fashion. A top-down
approach, on the other hand, may seem infeasible given this simple recursive algorithm. In fact, however, the
unnecessary recomputations that prevent the recursive algorithm from being efficient can be avoided
by recording all the computed solutions along the way. This idea of constructing a table in a top-down
recursive fashion is called memoization. A more optimized solution to our initial problem is
provided below.
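Here is one possible memoized version, sketched in Python (the dictionary memo is our own naming):

def fibonacci(n, memo=None):
    if memo is None:
        memo = {}                # table of already-computed results
    if n <= 1:
        return n                 # f(0) = 0, f(1) = 1
    if n not in memo:
        # Compute f(n) only once; later calls reuse the stored value.
        memo[n] = fibonacci(n - 1, memo) + fibonacci(n - 2, memo)
    return memo[n]

print(fibonacci(50))  # 12586269025, returned almost instantly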
Exercise: calculate the time complexity of this algorithm for both recursive and DP approaches
Greedy algorithm
A greedy algorithm is any algorithm that follows the problem-solving heuristic of making the
locally optimal choice at each stage.
In mathematical optimization and computer science, a heuristic (from Greek εὑρίσκω, "I find,
discover") is a technique designed to solve problems more quickly when classic methods
are too slow for finding an exact or approximate solution, or when classic methods fail to find
any exact solution in a search space.
Most, though not all, of these problems have n inputs and require us to obtain a subset that
satisfies some constraints. Any subset that satisfies these constraints is called a feasible
solution. We need to find a feasible solution that either maximizes or minimizes a given
objective function. A feasible solution that does this is called an optimal solution. There is
usually an obvious way to determine a feasible solution but not necessarily an optimal solution.
The greedy method suggests that one can devise an algorithm that works in stages,
considering one input at a time. At each stage, a decision is made regarding whether a
particular input is in an optimal solution. This is done by considering the inputs in an order
determined by some selection procedure. If the inclusion of the next input into the partially
constructed optimal solution will result in an infeasible solution, then this input is not added to
the partial solution. Otherwise, it is added. The selection procedure itself is based on some
optimization measure. This measure may be the objective function. In fact, several different
optimization measures may be plausible for a given problem. Most of these, however, will
result in algorithms that generate suboptimal solutions. This version of the greedy technique
is called the subset paradigm.
We can describe the subset paradigm abstractly, but more precisely than above, by
considering the control abstraction in the algorithm below.
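In the pseudocode style used earlier, this control abstraction can be sketched as follows:

Solution Greedy(a, n)
begin
solution ← ∅; //Initialize the solution to the empty set
for i←1 to n do
x ← Select(a); //Select an input from a[] and remove it
if Feasible(solution, x) then
solution ← Union(solution, x); //Add x to the partial solution
endif
endfor
return solution;
end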
The function Select selects an input from a[] and removes it. The selected input's value is
assigned to x. Feasible is a Boolean-valued function that determines whether x can be
included in the solution vector. The function Union combines x with the solution and updates
the objective function. The function Greedy describes the essential way that a greedy
algorithm will look, once a particular problem is chosen and the functions Select,Feasible, and
Union are properly implemented.
Let us try to apply the greedy method to solve the knapsack problem. We are given n objects
and a knapsack or bag. Object i has a weight wi, and the knapsack has a capacity m. If a
fraction xi, 0 ≤ xi ≤ 1, of object i is placed into the knapsack, then a profit of pi·xi is earned. The
objective is to obtain a filling of the knapsack that maximizes the total profit earned. Since the
knapsack capacity is m, we require the total weight of all chosen objects to be at most m.
Formally, the problem can be stated as
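maximize Σ pi·xi for i = 1 to n (4.1)
subject to Σ wi·xi ≤ m for i = 1 to n (4.2)
and 0 ≤ xi ≤ 1, 1 ≤ i ≤ n (4.3)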
The profits and weights are positive numbers. A feasible solution (or filling) is any set (x1, ...,
xn) satisfying (4.2) and (4.3) above. An optimal solution is a feasible solution for which (4.1) is
maximized.
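As an illustration, here is a minimal Python sketch of this greedy method (function and variable names are our own): it considers the objects in decreasing order of profit-to-weight ratio, takes each object whole while it fits, and then takes the largest fraction of the next one; for this fractional knapsack problem, that selection procedure is known to produce an optimal solution.

def greedy_knapsack(profits, weights, m):
    # Consider objects by decreasing profit/weight ratio.
    order = sorted(range(len(profits)),
                   key=lambda i: profits[i] / weights[i], reverse=True)
    x = [0.0] * len(profits)    # x[i] = fraction of object i taken
    remaining = m
    for i in order:
        if weights[i] <= remaining:
            x[i] = 1.0                      # the whole object fits
            remaining -= weights[i]
        else:
            x[i] = remaining / weights[i]   # take the fraction that fits
            break
    return x

# Example instance: n = 3, m = 20, profits (25, 24, 15), weights (18, 15, 10).
print(greedy_knapsack([25, 24, 15], [18, 15, 10], 20))  # [0.0, 1.0, 0.5]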
Chap: Trees
Definition: A tree is a finite set of one or more nodes such that there is a specially designated node
called the root and the remaining nodes are partitioned into n ≥ 0 disjoint sets T1,..., Tn, where each of
these sets is a tree. The sets T1,..., Tn are called the subtrees of the root.
Vocabulary
There are many terms that are often used when referring to trees. Consider the tree in the figure below.
This tree has 13 nodes, each data item of a node being a single letter for convenience. The root contains
A (we usually say node A), and we normally draw trees with their roots at the top. The number of
subtrees of a node is called its degree. The degree of A is 3, of C is 1, and of F is 0. Nodes that have degree
zero are called leaf or terminal nodes. The set {K, L, F, G, M, I, J} is the set of leaf nodes of the given
tree. The other nodes are referred to as nonterminals. The roots of the subtrees of a node X are the
children of X. The node X is the parent of its children. Thus the children of D are H, I, and J, and the
parent of D is A.
Fig. Sample tree
Children of the same parent are said to be siblings. For example, H, I, and J are siblings. We can extend
this terminology if we need to, so that we can ask for the grandparent of M, which is D, and so on. The
degree of a tree is the maximum degree of the nodes in the tree. The tree in the previous Figure has
degree 3. The ancestors of a node are all the nodes along the path from the root to that node. The
ancestors of M are A, D, and H.
The level of a node is defined by initially letting the root be at level one. If a node is at level p, then its
children are at level p+1. The previous figure shows the levels of all the nodes in that
tree. The height or depth of a tree is defined to be the maximum level of any node in the tree.
A forest is a set of n ≥ 0 disjoint trees. The notion of a forest is very close to that of a tree, because if we
remove the root of a tree, we get a forest. For example, in the previous figure, if we remove A, we get a forest
with three trees.
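To tie this vocabulary together, here is a small Python sketch (class and function names are our own) that represents a tree as nodes holding a list of children and computes the degree and height defined above:

class Node:
    def __init__(self, data, children=None):
        self.data = data
        self.children = children or []   # the subtrees of this node

def degree(node):
    # The degree of a node is its number of subtrees.
    return len(node.children)

def height(node):
    # The root is at level 1; the height is the maximum level of any node.
    if not node.children:
        return 1
    return 1 + max(height(child) for child in node.children)

# A fragment consistent with the sample tree: A is the root of degree 3,
# C has one child G, and the children of D are H, I, and J.
tree = Node("A", [Node("B"), Node("C", [Node("G")]),
                  Node("D", [Node("H"), Node("I"), Node("J")])])
print(degree(tree))  # prints 3
print(height(tree))  # prints 3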