Data Structure & Algorithm

Introduction to Data Structure

The computer:
- Is involved in our day-to-day lives.
- Has limited memory, so memory must be used efficiently.
- Stores and manipulates large amounts of data; the type of the data and how it is stored are very important.
Now the question: how are data stored in memory, and how are they processed? The answer lies in the study of data structures.
Introduction to Data Structure
A data structure may be defined as follows:
A data structure is a way of organizing data that considers not only the items stored, but also their relationships to each other. Advance knowledge of the relationships between data items allows the design of efficient algorithms for the manipulation of the data.
The mathematical models used to organize the data in the computer memory (main or secondary), together with the methods to process them, are collectively called data structures.
Contd.
A program is written in order to solve a problem. A solution to a problem actually consists of two things:
- A way to organize the data
- A sequence of steps to solve the problem
The way data are organized in a computer's memory is called a data structure, and the sequence of computational steps to solve a problem is called an algorithm.
Therefore, a program is nothing but data structures plus algorithms.
Categories of data structures
Data structures fall into two categories:
- Linear: Array, Stack, Queue, Linked list
- Non-linear: Tree, Graph
Contd.
Linear DS: data elements are organized in some sequence.
- Various operations are only possible in sequence, i.e., we cannot insert an element directly into some location without traversing the previous elements of the data structure.
Non-linear DS: data elements are organized in an arbitrary order; they are not organized in any sequence.
Common operations on DSs
Creating: reserves the memory for the data elements.
Traversal: visiting (processing) each element in the DS.
Search: finding the location of the element with a given value, or the record with a given key.
Insertion: adding a new element to the DS.
Deletion: removing an element from the DS.
Updating: changing the existing value of a data element in the DS.
Sorting: arranging the elements in some type of order.
Merging: combining two lists into a single list.
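Several of these operations can be sketched with C++'s std::vector; this is an illustrative sketch (the function name demoOperations is invented for this example, not from the slides).

```cpp
#include <algorithm>
#include <vector>

// Illustrative sketch: a few of the common operations applied
// to a std::vector<int>.
std::vector<int> demoOperations() {
    std::vector<int> v = {3, 1, 2};    // creating: memory is reserved
    v.push_back(4);                    // insertion: v is {3, 1, 2, 4}
    v.erase(v.begin());                // deletion: removes 3, v is {1, 2, 4}
    std::sort(v.begin(), v.end());     // sorting: v stays {1, 2, 4}
    return v;
}
```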
DS Example Applications
How does Google quickly find web pages that
contain a search term?
What’s the fastest way to broadcast a message
to a network of computers?
 How can a subsequence of DNA be quickly
found within the genome?
How does your operating system track which
memory (disk or RAM) is free?
Abstract data type
Solving a problem involves processing data, and an important part of the solution is the careful organization of the data.
In order to do that, we need to identify:
1. The collection of data items
2. The basic operations that must be performed on them
Abstract Data Type (ADT): a collection of data items together with the operations on the data, whose implementation is hidden.
An implementation of an ADT consists of storage structures to store the data items and algorithms for the basic operations.
Contd. Abstract data type
The word “abstract” refers to the fact that the
data and the basic operations defined on it are
being studied independently of how they are
implemented.
We think about what can be done with the
data, not how it is done.
The ADT specifies:
- What can be stored in the Abstract Data Type
- What operations can be done on/by the
Abstract Data Type.
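As a minimal sketch of this idea (the class and member names are illustrative, not from the slides), a stack ADT in C++ can expose *what* operations exist while hiding *how* they are implemented:

```cpp
#include <vector>

// Minimal sketch of a Stack ADT: the public interface states WHAT
// can be done; the private std::vector is one possible storage
// structure, hidden from users of the type.
class IntStack {
public:
    void push(int x)   { items.push_back(x); }
    void pop()         { items.pop_back(); }
    int  top() const   { return items.back(); }
    bool empty() const { return items.empty(); }
private:
    std::vector<int> items;  // could be a linked list instead; users never know
};
```

Because only the interface is public, the vector could be swapped for a linked list without changing any code that uses IntStack.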
Algorithm
An algorithm is a well-defined computational
procedure that takes some value or a set of
values as input and produces some value or a set
of values as output.
An algorithm is a set of instructions to be
followed to solve a problem.
-There can be more than one solution (more
than one algorithm) to solve a given problem.
-An algorithm can be implemented using
different programming languages on different
platforms.
Contd. Algorithm
An algorithm must be correct. It should
correctly solve the problem.
Once we have a correct algorithm for a
problem, we have to determine the efficiency of
that algorithm.
The quality of an algorithm is related to its
ability to successfully simulate the changes in the
world.
Algorithmic Performance
There are two aspects of algorithmic performance:
Time
-Instructions take time.
-How fast does the algorithm perform?
-What affects its runtime? 
Space
-Data structures take space
-What kind of data structures can be used?
-How does choice of data structure affect the
runtime?
We will focus on time:
How to estimate the time required for an
algorithm
How to reduce the time required
Properties of an algorithm
1. Finiteness: Algorithm must complete after a
finite number of steps.
2. Definiteness: Each step must be clearly
defined, having one and only one
interpretation. At each point in computation,
one should be able to tell exactly what
happens next.
3. Sequence: Each step must have a unique
defined preceding and succeeding step. The
first step (start step) and last step (halt step)
must be clearly noted.
Contd. Properties of an algorithm
4. Feasibility: It must be possible to perform
each instruction.
5. Correctness: It must compute the correct
answer for all possible legal inputs.
6. Language Independence: It must not depend
on any one programming language.
7. Completeness: It must solve the problem
completely.
Contd. Properties of an algorithm
8. Effectiveness: It must be possible to perform
each step exactly and in a finite amount of time.
9. Efficiency: It must solve the problem with the
least amount of computational resources, such as
time and space.
10. Generality: The algorithm should be valid for
all possible inputs.
11. Input/Output: There must be a specified
number of input values, and one or more result
values.
Algorithm Analysis Concepts
Algorithm analysis refers to the process of
determining how much computing time and
storage an algorithm will require.
In other words, it is the process of predicting the
resource requirements of algorithms in a given
environment.
In order to solve a problem, there are many
possible algorithms. One has to be able to choose
the best algorithm for the problem at hand using
some scientific method.
Algorithm Analysis Concepts
To classify some data structures and
algorithms as good, we need precise ways of
analyzing them in terms of resource
requirement.
The main resources are:
-Running Time
-Memory Usage
Running time is usually treated as the most
important since computational time is the most
precious resource in most problem domains.
Algorithm Analysis Concepts
There are two approaches to measuring the
efficiency of algorithms:
Empirical: programming competing algorithms
and trying them on different instances.
Theoretical: mathematically determining the
quantity of resources (execution time, memory
space, etc.) needed by each algorithm.
Complexity Analysis
Complexity Analysis is the systematic study of the cost
of computation, measured either in time units or in
operations performed, or in the amount of storage space
required.
The goal is to have a meaningful measure that permits
comparison of algorithms independent of operating
platform.
There are two things to consider:
Time Complexity: Determine the approximate number
of operations required to solve a problem of size n.
Space Complexity: Determine the approximate memory
required to solve a problem of size n.
Contd. Complexity Analysis
Complexity analysis involves two distinct phases:
Algorithm Analysis: Analysis of the algorithm or data
structure to produce a function T (n) that describes the
algorithm in terms of the operations performed in order
to measure the complexity of the algorithm.
Order of Magnitude Analysis: Analysis of the function T
(n) to determine the general complexity category to
which it belongs.
 
There is no generally accepted set of rules for
algorithm analysis. However, an exact count of
operations is commonly used.
Best, Worst, or Average Case Analysis
An algorithm can require different times to
solve different problems of the same size.
E.g., searching for an item in a list of n elements
using sequential search. Cost: 1, 2, ..., n
1. Worst-Case Analysis: the maximum amount
of time that an algorithm requires to solve a
problem of size n.
- This gives an upper bound on the time
complexity of an algorithm.
- Normally, we try to find the worst-case
behavior of an algorithm.
Contd.
2. Best-Case Analysis: the minimum amount of
time that an algorithm requires to solve a problem
of size n.
- The best-case behavior of an algorithm is NOT
very useful.
3. Average-Case Analysis: the average amount of
time that an algorithm requires to solve a problem
of size n.
- Sometimes it is difficult to find the average-case
behavior of an algorithm.
- We have to look at all possible data
organizations of a given size n, and the
distribution probabilities of these organizations.
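The sequential search mentioned above can be sketched as follows; the best case costs 1 comparison (key at the front), the worst case costs n (key at the end or absent).

```cpp
#include <vector>

// Sequential search: returns the index of key in a, or -1 if absent.
// Best case: 1 comparison (key is the first element).
// Worst case: n comparisons (key is last, or not in the list).
int sequentialSearch(const std::vector<int>& a, int key) {
    for (int i = 0; i < static_cast<int>(a.size()); ++i)
        if (a[i] == key)
            return i;   // found after i+1 comparisons
    return -1;          // n comparisons were made: worst case
}
```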
Analysis Rules
1. We assume an arbitrary time unit.
2. Execution of one of the following operations takes
time 1:
-Assignment Operation
-Single Input/Output Operation
-Single Boolean Operations
-Single Arithmetic Operations
-Function Return
3. Running time of a selection statement (if, switch)
is the time for the condition evaluation + the
maximum of the running times for the individual
clauses in the selection.
Contd. Analysis Rules
Example: Simple If-Statement        Cost
if (n < 0)                          c1
    absval = -n;                    c2
else
    absval = n;                     c3

Total Cost <= c1 + max(c2, c3)
Contd. Analysis Rules
4. Loops: Running time for a loop is equal to the
running time for the statements inside the loop *
number of iterations.
-The total running time of a statement inside a group
of nested loops is the running time of the statements
multiplied by the product of the sizes of all the loops.
-For nested loops, analyze inside out.
-Always assume that the loop executes the maximum
number of iterations possible.
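The product rule for nested loops can be checked with a small sketch (the function name nestedCount is invented for this example): the innermost statement of two loops of sizes n and m executes n*m times.

```cpp
// Sketch: count how many times the innermost statement of two
// nested loops executes, illustrating the product rule above.
long nestedCount(long n, long m) {
    long count = 0;
    for (long i = 0; i < n; ++i)
        for (long j = 0; j < m; ++j)
            ++count;   // innermost statement: executes n*m times in total
    return count;
}
```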
Contd. Analysis Rules
Example: Simple Loop        Cost    Times
i = 1;                      c1      1
sum = 0;                    c2      1
while (i <= n) {            c3      n+1
    i = i + 1;              c4      n
    sum = sum + i;          c5      n
}

Total Cost = c1 + c2 + (n+1)*c3 + n*c4 + n*c5

The time required for this algorithm is proportional to n.
Example
int k = 0;
cout << "Enter an integer";
cin >> n;
for (i = 0; i < n; i++)
    k = k + 1;
return 0;

Time Units to Compute
-------------------------------------------------
1 for the assignment statement: int k=0
1 for the output statement.
1 for the input statement.
In the for loop:
    1 assignment (i=0), n+1 tests, and n increments.
    n loops of 2 units each: an assignment and an addition (k=k+1).
1 for the return statement.
-------------------------------------------------
T(n) = 1+1+1 + (1+(n+1)+n) + 2n + 1 = 4n+6 = O(n)
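The tally above can be sketched as a function (countOps is an invented name, and the per-statement costs follow the unit-cost rules from the slides) that sums the same time units and reproduces the closed form T(n) = 4n + 6.

```cpp
// Sketch: sum the time units assigned to each statement of the
// example above and confirm the closed form T(n) = 4n + 6.
long countOps(long n) {
    long ops = 3;    // int k=0; the output; the input (1 unit each)
    ops += 1;        // for-loop initialization: i = 0
    ops += n + 1;    // n+1 loop tests
    ops += n;        // n increments: i++
    ops += 2 * n;    // n iterations of k = k + 1: addition + assignment
    ops += 1;        // return 0
    return ops;      // equals 4n + 6
}
```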
Exercise

int sum=0;
for (int i=1;i<=n;i++)
sum=sum+1;
return sum;
Algorithm Growth Rates
We measure an algorithm's time requirement
as a function of the problem size.
- Problem size depends on the application: e.g.,
the number of elements in a list for a sorting
algorithm, or the number of disks for the Towers
of Hanoi.
So, for instance, we say that (if the problem size
is n):
Algorithm A requires 5*n^2 time units to solve
a problem of size n.
Algorithm B requires 7*n time units to solve a
problem of size n.
Algorithm Growth Rates
The most important thing to learn is how
quickly the algorithm's time requirement grows
as a function of the problem size.
- Algorithm A requires time proportional to n^2.
- Algorithm B requires time proportional to n.
An algorithm's proportional time requirement
is known as its growth rate.
We can compare the efficiency of two
algorithms by comparing their growth rates.
Algorithm Growth Rates (cont.)
[Figure: time requirements as a function of the problem size n]
Contd.
There are five notations used to describe a
running time function. These are:
Big-Oh Notation (O)
Big-Omega Notation (Ω)
Theta Notation (Θ)
Little-o Notation (o)
Little-Omega Notation (ω)
The Big-Oh Notation
Big-Oh notation is a way of comparing
algorithms and is used for computing the
complexity of algorithms, i.e., the amount of time
that it takes for a computer program to run.
If Algorithm A requires time proportional to
f(n), Algorithm A is said to be of order f(n), and
this is denoted as O(f(n)).
The function f(n) is called the algorithm's
growth-rate function.
Since the capital O is used in the notation, this
notation is called the Big-O notation.
The Big-Oh Notation
If Algorithm A requires time proportional to n^2, it
is O(n^2).
If Algorithm A requires time proportional to n, it
is O(n).
Big-Oh is only concerned with what happens for
very large values of n.
Therefore only the largest term in the expression
(function) is needed. For example, if the number of
operations in an algorithm is n^2 - n, then n is
insignificant compared to n^2 for large values of n.
Hence the n term is ignored.
The Big-Oh Notation
Formal Definition: f(n) = O(g(n)) if there exist c,
k ∈ ℝ+ such that f(n) ≤ c·g(n) for all n ≥ k.
Examples: the following facts are useful for
Big-Oh problems:
1 <= n for all n >= 1
n <= n^2 for all n >= 1
2^n <= n! for all n >= 4
log2(n) <= n for all n >= 2
n <= n·log2(n) for all n >= 2
Example
1. f(n) = 10n+5 and g(n) = n. Show that f(n) is
O(g(n)).
Solution: To show that f(n) is O(g(n)), we must
find constants c and k such that
f(n) <= c·g(n) for all n >= k,
i.e., 10n+5 <= c·n for all n >= k.
Try c = 15. Then we need to show that
10n+5 <= 15n.
Solving for n we get: 5 <= 5n, or 1 <= n.
So f(n) = 10n+5 <= 15·g(n) for all n >= 1.
(c = 15, k = 1)
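The constants found above can be sanity-checked empirically; this sketch (bigOhHolds is an invented name) verifies f(n) <= c*g(n) over a sample range. Note this is a spot check, not a proof.

```cpp
// Empirical sanity check (not a proof): verify that
// f(n) = 10n + 5 stays below c * g(n), with g(n) = n,
// for every n from k up to some limit.
bool bigOhHolds(long c, long k, long limit) {
    for (long n = k; n <= limit; ++n) {
        long f = 10 * n + 5;   // f(n) = 10n + 5
        long g = n;            // g(n) = n
        if (f > c * g)
            return false;      // the bound fails at this n
    }
    return true;
}
```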
Exercise

1. f(n) = 3n^2 + 4n + 1. Show that f(n) = O(n^2).

Solution

1. f(n) = 3n^2 + 4n + 1. Show that f(n) = O(n^2).
4n <= 4n^2 for all n >= 1, and 1 <= n^2 for all n >= 1.
So 3n^2 + 4n + 1 <= 3n^2 + 4n^2 + n^2 for all n >= 1
                 <= 8n^2 for all n >= 1.
We have shown that f(n) <= 8n^2 for all n >= 1.
Therefore, f(n) is O(n^2). (c = 8, k = 1)
Big-O Theorems
For all the following theorems, assume that f(n) is a function of n
and that K is an arbitrary constant.
Theorem 1: K is O(1). // constant time
- This means that the algorithm requires the same fixed
number of steps regardless of the size of the task.
Theorem 2: A polynomial is O(the term containing the
highest power of n).
- A polynomial's growth rate is determined by its leading term:
if f(n) is a polynomial of degree d, then f(n) is O(n^d).
- f(n) = 7n^4 + 3n^2 + 5n + 1000 is O(7n^4), i.e., O(n^4).
Big-O Theorems
Theorem 3: K*f(n) is O(f(n)). [That is, constant coefficients
can be dropped.]
- g(n) = 7n^4 is O(n^4).
Theorem 4: If f(n) is O(g(n)) and g(n) is O(h(n)), then f(n) is
O(h(n)). [transitivity]
Big-O Theorems
Theorem 5: Each of the following functions is strictly big-O
of its successors:
K [constant]
log_b(n) [assume log base 2 if no base is shown]
n
n·log_b(n)
n^2
n to higher powers
2^n
3^n
larger constants to the n-th power
n! [n factorial]
n^n
For example: f(n) = 3n·log_b(n) + 4·log_b(n) + 2 is O(n·log_b(n)), and n^2 is O(2^n).
Big-O Theorems
Theorem 6: In general, f(n) is big-O of
the dominant term of f(n), where
"dominant" may usually be determined
from Theorem 5.
f(n) = 7n^2 + 3n·log(n) + 5n + 1000 is O(n^2)
g(n) = 7n^4 + 3^n + 10^6 is O(3^n)
h(n) = 7n(n + log(n)) is O(n^2)
Theorem 7: For any base b, log_b(n) is
O(log(n)).
Examples
In general, in Big-Oh analysis we focus on the "big
picture," that is, the operations that affect the
running time the most: the loops.
Simplify the count:
1. Drop all lower-order terms: 7n - 2 → 7n
2. Eliminate constants: 7n → n
3. The remaining term is the Big-Oh: 7n - 2 is O(n)
More Examples

Example: f(n) = 5n^3 - 2n^2 + 1
1. Drop all lower-order terms: 5n^3 - 2n^2 + 1 → 5n^3
2. Eliminate the constants: 5n^3 → n^3
3. The remaining term is the Big-Oh: f(n) is O(n^3)
Growth-Rate Functions
O(1)         Time requirement is constant; it is independent of the problem's size.
O(log2 n)    Time requirement for a logarithmic algorithm increases slowly as the problem size increases.
O(n)         Time requirement for a linear algorithm increases directly with the size of the problem.
O(n log2 n)  Time requirement for an n*log2(n) algorithm increases more rapidly than for a linear algorithm.
O(n^2)       Time requirement for a quadratic algorithm increases rapidly with the size of the problem.
O(n^3)       Time requirement for a cubic algorithm increases more rapidly with the size of the problem than for a quadratic algorithm.
O(2^n)       As the size of the problem increases, the time requirement for an exponential algorithm increases too rapidly to be practical.
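The ordering of these growth-rate functions can be spot-checked numerically for a sample size n (orderedGrowth is an invented name; the check only holds once n is large enough, per Theorem 5):

```cpp
#include <cmath>

// Sketch: evaluate the growth-rate functions from the table at a
// sample size n and check that they appear in strictly increasing
// order: 1 < log2(n) < n < n*log2(n) < n^2 < n^3 < 2^n.
bool orderedGrowth(double n) {
    double f[] = {1.0, std::log2(n), n, n * std::log2(n),
                  n * n, n * n * n, std::pow(2.0, n)};
    for (int i = 0; i + 1 < 7; ++i)
        if (f[i] >= f[i + 1])
            return false;   // ordering fails at this pair
    return true;
}
```

For small n (e.g., n = 2, where log2(n) = 1 equals the constant term) the strict ordering does not yet hold, which is exactly why Big-Oh statements are qualified with "for all n >= k".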
