Data Structure & Algorithm
Introduction to Data Structure
The computer:
Is involved in our day-to-day lives.
Has limited memory, so memory must be used efficiently.
Stores and manipulates large amounts of data, so the type of the data and how it is stored are very important.
Now the question: how are data stored in memory, and how are they processed?
The answer lies in the study of data structures.
Introduction to Data Structure
A data structure may be defined as:
A way of organizing data that considers not only the items stored but also their relationships to each other. Advance knowledge of the relationships between data items allows the design of efficient algorithms for manipulating the data.
The mathematical model used to organize data in computer memory (main or secondary), together with the methods used to process that data, is collectively called a data structure.
Contd.
A program is written in order to solve a
problem. A solution to a problem actually
consists of two things:
- A way to organize the data
- A sequence of steps to solve the problem
The way data are organized in a computer's memory is called a data structure, and the sequence of computational steps used to solve a problem is called an algorithm.
Therefore, a program is nothing but data structures plus algorithms.
Categories of data structures
Data structures fall into two categories:
Linear: Array, Stack, Queue, Linked list
Non-linear: Tree, Graph
Contd.
Linear DS: data elements are organized in a sequence.
Operations are only possible in that sequence, i.e. we cannot insert an element directly at some location without traversing the previous elements of the data structure (see the linked-list sketch below).
Non-linear DS: data elements are organized in an arbitrary order; they do not form a single sequence.
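As a minimal sketch of this point (the Node type and insertAfter function below are illustrative, not from the notes), inserting into a singly linked list requires walking past the preceding nodes, which is exactly what makes the structure linear:
// Hypothetical node type and insert function, used only to illustrate the point above.
struct Node {
    int data;
    Node* next;
};
// Insert a new value after the k-th node (0-based).
// Because the list is linear, we must traverse the k preceding nodes first.
void insertAfter(Node* head, int k, int value) {
    Node* cur = head;
    for (int i = 0; i < k && cur != nullptr; ++i)
        cur = cur->next;                    // walk the sequence one node at a time
    if (cur == nullptr) return;             // position k does not exist
    cur->next = new Node{value, cur->next}; // splice the new node into the chain
}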
Common operations on DSs
Creating: Reserving memory for the data elements.
Traversal: Visiting (processing) each element in the DS.
Search: Finding the location of the element with a given value, or the record with a given key.
Insertion: Adding a new element to the DS.
Deletion: Removing an element from the DS.
Updating: Changing the value of an existing element in the DS.
Sorting: Arranging the elements in some order.
Merging: Combining two lists into a single list.
A small sketch of a few of these operations on an array appears below.
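A minimal sketch, assuming a plain C++ array of ints (the names linearSearch and insertAt are illustrative, not from the notes), showing search, insertion, and traversal:
#include <iostream>
// Linear search: return the index of key, or -1 if it is not present.
int linearSearch(const int a[], int n, int key) {
    for (int i = 0; i < n; ++i)
        if (a[i] == key) return i;
    return -1;
}
// Insertion at position pos: shift later elements right to make room.
// n is the current element count; the array's capacity must be at least n + 1.
void insertAt(int a[], int& n, int pos, int value) {
    for (int i = n; i > pos; --i)
        a[i] = a[i - 1];
    a[pos] = value;
    ++n;
}
int main() {
    int a[10] = {3, 7, 9};
    int n = 3;
    insertAt(a, n, 1, 5);                        // a becomes {3, 5, 7, 9}
    std::cout << linearSearch(a, n, 9) << "\n";  // traversal finds 9 at index 3
}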
DS Example Applications
How does Google quickly find web pages that
contain a search term?
What’s the fastest way to broadcast a message
to a network of computers?
How can a subsequence of DNA be quickly
found within the genome?
How does your operating system track which
memory (disk or RAM) is free?
Abstract data type
Solving a problem involves processing data, and an important part of the solution is the careful organization of the data.
In order to do that, we need to identify:
1. The collection of data items
2. The basic operations that must be performed on them
Abstract Data Type (ADT): a collection of data items together with the operations on that data, whose implementation is hidden.
An implementation of an ADT consists of storage structures to store the data items and algorithms for the basic operations.
Contd. Abstract data type
The word “abstract” refers to the fact that the data and the basic operations defined on them are studied independently of how they are implemented.
We think about what can be done with the
data, not how it is done.
The ADT specifies:
- What can be stored in the Abstract Data Type
- What operations can be done on/by the
Abstract Data Type.
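As a minimal sketch (the class IntStack below is illustrative, not part of the notes), an ADT can be expressed in C++ as a class whose public interface states what can be done, while the private part hides how it is done:
#include <vector>
#include <stdexcept>
// A stack ADT: the public interface says WHAT can be done,
// the private section hides HOW it is done.
class IntStack {
public:
    void push(int x) { items.push_back(x); }
    void pop() {
        if (items.empty()) throw std::runtime_error("pop on empty stack");
        items.pop_back();
    }
    int top() const {
        if (items.empty()) throw std::runtime_error("top on empty stack");
        return items.back();
    }
    bool empty() const { return items.empty(); }
private:
    std::vector<int> items;   // hidden storage structure; could be a linked list instead
};
The same interface could be implemented with an array or a linked list without changing any code that uses it; that independence is what the word “abstract” refers to.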
Algorithm
An algorithm is a well-defined computational
procedure that takes some value or a set of
values as input and produces some value or a set
of values as output.
An algorithm is a set of instructions to be
followed to solve a problem.
-There can be more than one solution (more
than one algorithm) to solve a given problem.
-An algorithm can be implemented using
different programming languages on different
platforms.
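As a small illustration (both functions below are sketches, not from the notes), the same problem, computing 1 + 2 + ... + n, can be solved by two different algorithms:
// Algorithm 1: add the numbers one by one (about n additions).
long sumLoop(int n) {
    long sum = 0;
    for (int i = 1; i <= n; ++i)
        sum += i;
    return sum;
}
// Algorithm 2: use the closed-form formula n(n+1)/2 (a constant number of operations).
long sumFormula(int n) {
    return static_cast<long>(n) * (n + 1) / 2;
}
Both are correct; which one is better is the subject of algorithm analysis in the rest of this chapter.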
Contd. Algorithm
An algorithm must be correct. It should
correctly solve the problem.
Once we have a correct algorithm for a
problem, we have to determine the efficiency of
that algorithm.
The quality of an algorithm is also related to how well it models the real-world problem it is meant to solve.
Algorithmic Performance
There are two aspects of algorithmic performance:
Time
-Instructions take time.
-How fast does the algorithm perform?
-What affects its runtime?
Space
-Data structures take space
-What kind of data structures can be used?
-How does choice of data structure affect the
runtime?
We will focus on time:
How to estimate the time required for an
algorithm
How to reduce the time required
Properties of an algorithm
1. Finiteness: Algorithm must complete after a
finite number of steps.
2. Definiteness: Each step must be clearly
defined, having one and only one
interpretation. At each point in computation,
one should be able to tell exactly what
happens next.
3. Sequence: Each step must have a unique
defined preceding and succeeding step. The
first step (start step) and last step (halt step)
must be clearly noted.
Contd. Properties of an algorithm
4. Feasibility: It must be possible to perform
each instruction.
5. Correctness: It must compute the correct answer for all possible legal inputs.
6. Language Independence: It must not depend
on any one programming language.
7. Completeness: It must solve the problem
completely.
Contd. Properties of an algorithm
8.Effectiveness: It must be possible to perform
each step exactly and in a finite amount of time.
9.Efficiency: It must solve with the least amount
of computational resources such as time and
space.
10. Generality: Algorithm should be valid on all
possible inputs.
11. Input/Output: There must be a specified
number of input values, and one or more result
values.
Algorithm Analysis Concepts
Algorithm analysis refers to the process of determining how much computing time and storage an algorithm will require.
In other words, it’s a process of predicting the
resource requirement of algorithms in a given
environment.
In order to solve a problem, there are many
possible algorithms. One has to be able to choose
the best algorithm for the problem at hand using
some scientific method.
Algorithm Analysis Concepts
To classify some data structures and
algorithms as good, we need precise ways of
analyzing them in terms of resource
requirement.
The main resources are:
-Running Time
-Memory Usage
Running time is usually treated as the most
important since computational time is the most
precious resource in most problem domains.
Algorithm Analysis Concepts
There are two approaches to measuring the efficiency of algorithms:
Empirical: Programming competing algorithms and trying them on different problem instances; a timing sketch is shown below.
Theoretical: Mathematically determining the quantity of resources (execution time, memory space, etc.) needed by each algorithm.
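A minimal sketch of the empirical approach, reusing the sumLoop function from the earlier example (the timing harness itself is illustrative, not from the notes):
#include <chrono>
#include <iostream>
long sumLoop(int n) {          // the algorithm being measured
    long sum = 0;
    for (int i = 1; i <= n; ++i) sum += i;
    return sum;
}
int main() {
    using clock = std::chrono::steady_clock;
    for (int n : {1000, 10000, 100000, 1000000}) {
        auto start = clock::now();
        volatile long result = sumLoop(n);   // volatile keeps the call from being optimized away
        auto stop = clock::now();
        std::cout << "n = " << n << "  time = "
                  << std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count()
                  << " us\n";
    }
}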
Complexity Analysis
Complexity Analysis is the systematic study of the cost
of computation, measured either in time units or in
operations performed, or in the amount of storage space
required.
The goal is to have a meaningful measure that permits
comparison of algorithms independent of operating
platform.
There are two things to consider:
Time Complexity: Determine the approximate number
of operations required to solve a problem of size n.
Space Complexity: Determine the approximate memory
required to solve a problem of size n.
Contd. Complexity Analysis
Complexity analysis involves two distinct phases:
Algorithm Analysis: Analysis of the algorithm or data structure to produce a function T(n) that describes the algorithm in terms of the operations performed, in order to measure its complexity.
Order of Magnitude Analysis: Analysis of the function T(n) to determine the general complexity category to which it belongs.
There is no generally accepted set of rules for
algorithm analysis. However, an exact count of
operations is commonly used.
Best, Worst, or Average Case Analysis
An algorithm can require different times to solve different problems of the same size.
E.g. searching for an item in a list of n elements using sequential search. Cost: 1, 2, ..., n.
1. Worst-Case Analysis – The maximum amount of time that an algorithm requires to solve a problem of size n.
-This gives an upper bound on the time complexity of an algorithm.
-Normally, we try to find the worst-case behavior of an algorithm.
Contd.
2. Best-Case Analysis – The minimum amount of time that an algorithm requires to solve a problem of size n.
-The best-case behavior of an algorithm is NOT very useful.
3. Average-Case Analysis – The average amount of time that an algorithm requires to solve a problem of size n.
-Sometimes it is difficult to find the average-case behavior of an algorithm.
-We have to look at all possible data organizations of a given size n and the probability distribution of these organizations.
A sequential-search sketch illustrating the three cases appears below.
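A minimal sketch of the sequential search mentioned above (the function name is illustrative):
// Sequential search: compare the key with each element in turn.
// Best case: the key is the first element    -> 1 comparison.
// Worst case: the key is last or not present -> n comparisons.
// Average case (key present, all positions equally likely): about (n + 1) / 2 comparisons.
int sequentialSearch(const int a[], int n, int key) {
    for (int i = 0; i < n; ++i)
        if (a[i] == key)
            return i;      // found at position i after i + 1 comparisons
    return -1;             // not found after n comparisons
}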
Analysis Rules
1. We assume an arbitrary time unit.
2. Execution of one of the following operations takes
time 1:
-Assignment Operation
-Single Input/Output Operation
-Single Boolean Operations
-Single Arithmetic Operations
-Function Return
3. Running time of a selection statement (if, switch)
is the time for the condition evaluation + the
maximum of the running times for the individual
clauses in the selection.
Contd. Analysis Rules
Example: Simple If-Statement (cost per line)
if (n < 0)          // c1
    absval = -n;    // c2
else
    absval = n;     // c3
Example: Simple Loop
int sum = 0;
for (int i = 1; i <= n; i++)
    sum = sum + 1;
return sum;
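Applying the rules above (this count is a sketch: the exact constants depend on which operations one chooses to count), the if-statement costs c1 + max(c2, c3), and the loop can be counted line by line:
int sum = 0;                   // 1 assignment
for (int i = 1; i <= n; i++)   // 1 assignment, n + 1 comparisons, n increments
    sum = sum + 1;             // n additions and n assignments
return sum;                    // 1 return
T(n) = 1 + (1 + (n + 1) + n) + 2n + 1 = 4n + 4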
Algorithm Growth Rates
We measure an algorithm’s time requirement as a function of the problem size.
-Problem size depends on the application: e.g. the number of elements in a list for a sorting algorithm, or the number of disks for the Towers of Hanoi.
So, for instance, we say that (if the problem size is n):
Algorithm A requires 5*n² time units to solve a problem of size n.
Algorithm B requires 7*n time units to solve a problem of size n.
Algorithm Growth Rates
The most important thing to learn is how
quickly the algorithm’s time requirement grows
as a function of the problem size.
-Algorithm A requires time proportional to n².
-Algorithm B requires time proportional to n.
An algorithm’s proportional time requirement
is known as growth rate.
We can compare the efficiency of two
algorithms by comparing their growth rates.
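To make the comparison concrete (the numbers below simply evaluate the two expressions given above):
n = 10:     5*n² = 500          7*n = 70
n = 100:    5*n² = 50,000       7*n = 700
n = 1000:   5*n² = 5,000,000    7*n = 7,000
Even though Algorithm B’s constant (7) is larger than Algorithm A’s (5), B quickly becomes far faster as n grows; the growth rate, not the constant, is what matters.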
Contd.
There are five notations used to describe a running-time function. These are:
Big-Oh Notation (O)
Big-Omega Notation (Ω)
Theta Notation (Θ)
Little-o Notation (o)
Little-Omega Notation (ω)
The Big-Oh Notation
Big-Oh notation is a way of comparing algorithms and is used for describing the complexity of an algorithm, i.e., the amount of time that it takes for a computer program to run.
If Algorithm A requires time proportional to
f(n), Algorithm A is said to be order f(n), and it is
denoted as O(f(n)).
The function f(n) is called the algorithm’s
growth-rate function.
Since the capital O is used in the notation, this
notation is called the Big O notation.
The Big-Oh Notation
If Algorithm A requires time proportional to n², it is O(n²).
If Algorithm A requires time proportional to n, it is O(n).
Represented by O (Big-Oh).
Big-Oh is only concerned with what happens for very large values of n.
Therefore only the largest term in the expression (function) is needed. For example, if the number of operations in an algorithm is n² – n, then n is insignificant compared to n² for large values of n. Hence the n term is ignored.
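A quick numeric check of that claim:
n = 10:     n² – n = 90           n² = 100
n = 1000:   n² – n = 999,000      n² = 1,000,000
Already at n = 1000 the difference is only 0.1%, so n² – n and n² grow at essentially the same rate, and both are O(n²).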
The Big-Oh Notation
Formal Definition: f(n) = O(g(n)) if there exist c, k ∈ ℝ⁺ such that f(n) ≤ c·g(n) for all n ≥ k.
Examples: The following points are facts that you can use for Big-Oh problems:
1 ≤ n for all n ≥ 1
n ≤ n² for all n ≥ 1
2ⁿ ≤ n! for all n ≥ 4
log₂n ≤ n for all n ≥ 2
n ≤ n·log₂n for all n ≥ 2
Example
1. f(n) = 10n + 5 and g(n) = n. Show that f(n) is O(g(n)).
Solution: To show that f(n) is O(g(n)), we must find constants c and k such that
f(n) ≤ c·g(n) for all n ≥ k,
i.e. 10n + 5 ≤ c·n for all n ≥ k.
Try c = 15. Then we need to show that 10n + 5 ≤ 15n.
Solving for n we get: 5 ≤ 5n, i.e. 1 ≤ n.
So f(n) = 10n + 5 ≤ 15·g(n) for all n ≥ 1.
(c = 15, k = 1).
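As one more illustration of the same technique (this example is not from the notes; it just applies the definition again, using the fact n ≤ n² for all n ≥ 1):
2. f(n) = 3n² + 2n and g(n) = n². Show that f(n) is O(g(n)).
3n² + 2n ≤ 3n² + 2n² = 5n² for all n ≥ 1 (since n ≤ n²).
So f(n) ≤ 5·g(n) for all n ≥ 1. (c = 5, k = 1).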
Exercise