Introduction to DS - Sorting - Searching Algorithms
RECOMMENDED BOOKS
C++ Plus Data Structures, Fifth Edition, by Nell Dale
Data Structures with C++, Schaum’s Outline Series
Data and Information
Data can be defined as a representation of facts and concepts by values.
Data is a collection of raw facts.
Information
Information is organized or classified data that has meaning for the receiver.
Information is the processed data on which decisions and actions are based.
Data Processing Cycle
Algorithm
Why Data Structures?
They are essential ingredients in creating fast and powerful algorithms.
They help to manage and organize data.
They make code cleaner and easier to understand.
Classification of Data Structure
The two broad categories of data structure are:
Primitive Data Structure
Non-Primitive Data Structure
Primitive Data Structure
Non-Primitive Data Structure
Difference between them
Classification of Data Structure
[Figure: Data structure branches into Primitive DS and Non-Primitive DS]
Classification of Data Structure
Non-Primitive DS
Characteristic    Description
Linear            In linear data structures, the data items are arranged in a linear sequence. Example: Array
Non-Linear        In non-linear data structures, the data items are not in sequence. Example: Tree, Graph
Homogeneous       In homogeneous data structures, all the elements are of the same type. Example: Array
Non-Homogeneous   In non-homogeneous data structures, the elements may or may not be of the same type.
Static            Static data structures are those whose sizes and associated memory locations are fixed at compile time. Example: Array
Dynamic           Dynamic data structures are those which expand or shrink depending upon the program's needs during execution. Their associated memory locations also change. Example: Linked List created using pointers
Data Structure Operations
The most commonly used operations on data structures are broadly categorized into the following types:
Create
Selection
Updating
Searching
Sorting
Merging
Delete
Insert
Description of Various Data Structures: Arrays
An array is defined as a finite set of homogeneous elements, that is, data items of the same type.
It means an array can contain one type of data only: all integers, all floating-point numbers, or all characters.
Arrays
The elements of an array are always stored in consecutive (contiguous) memory locations.
Arrays
Insertion of new element
Deletion of required element
Modification of an element
Merging of arrays
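As a rough illustration of the array operations listed above, the sketch below uses plain Java int arrays. The class name ArrayOps and its method names are hypothetical, chosen only for this example. Because a Java array has a fixed size, insertion, deletion, and merging here each return a new array.

```java
import java.util.Arrays;

// Hypothetical helper class, not from the slides.
class ArrayOps {
    // Insertion: return a copy of a with value placed at index pos.
    static int[] insertAt(int[] a, int pos, int value) {
        int[] result = new int[a.length + 1];
        System.arraycopy(a, 0, result, 0, pos);
        result[pos] = value;
        System.arraycopy(a, pos, result, pos + 1, a.length - pos);
        return result;
    }

    // Deletion: return a copy of a without the element at index pos.
    static int[] deleteAt(int[] a, int pos) {
        int[] result = new int[a.length - 1];
        System.arraycopy(a, 0, result, 0, pos);
        System.arraycopy(a, pos + 1, result, pos, a.length - pos - 1);
        return result;
    }

    // Merging: append all elements of b after the elements of a.
    static int[] merge(int[] a, int[] b) {
        int[] result = Arrays.copyOf(a, a.length + b.length);
        System.arraycopy(b, 0, result, a.length, b.length);
        return result;
    }
}
```

Modification needs no helper: an assignment such as a[2] = 99 overwrites the element at index 2 in place.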
Lists
A list (linear linked list) can be defined as a collection of a variable number of data items.
An element of a list must contain at least two fields: one for storing data or information and the other for storing the address of the next element.
Storing the address requires a special data type, the pointer type.
Lists
Technically each such element is referred to as a node, therefore a list can be defined as a collection of nodes as shown below:
[Figure: linear linked list, with Head referring to the first node]
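A node with a data field and a next-address field might look like the following minimal sketch. Java uses object references where the slides say pointers, and the names IntLinkedList, Node, addFirst, and size are assumptions made for this example.

```java
// Hypothetical singly linked list of ints; all names are illustrative.
class IntLinkedList {
    // A node has two fields: the data and the reference to the next node.
    static class Node {
        int data;
        Node next;
        Node(int data) { this.data = data; }
    }

    Node head; // refers to the first node, null when the list is empty

    // Insert a new node at the front of the list.
    void addFirst(int value) {
        Node n = new Node(value);
        n.next = head;
        head = n;
    }

    // Walk the chain of next references and count the nodes.
    int size() {
        int count = 0;
        for (Node cur = head; cur != null; cur = cur.next) {
            count++;
        }
        return count;
    }
}
```

Because each node is allocated on demand, the list can grow or shrink at run time, which is what makes it a dynamic structure.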
Stack
A stack is also an ordered collection of elements, like an array, but it has the special feature that deletion and insertion of elements can be done only from one end, called the top of the stack (TOP).
Due to this property it is also called a last in, first out (LIFO) type of data structure.
Stack
Insertion of an element into a stack is called PUSH and deletion of an element from a stack is called POP.
The figure below shows how the operations take place on a stack:
[Figure: PUSH and POP operations on a STACK]
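PUSH and POP can be sketched with a fixed-size array and a top index, roughly as below. This is one possible static implementation; the class name ArrayStack and the choice to throw IllegalStateException on overflow or underflow are assumptions of this example.

```java
// Hypothetical fixed-capacity stack of ints (static, array-based).
class ArrayStack {
    private final int[] items;
    private int top = -1; // index of the top element, -1 when empty

    ArrayStack(int capacity) { items = new int[capacity]; }

    boolean isEmpty() { return top == -1; }

    // PUSH: insert at the top of the stack.
    void push(int value) {
        if (top == items.length - 1) {
            throw new IllegalStateException("stack is full");
        }
        items[++top] = value;
    }

    // POP: remove and return the element at the top of the stack.
    int pop() {
        if (isEmpty()) {
            throw new IllegalStateException("stack is empty");
        }
        return items[top--];
    }
}
```

Pushing 1, 2, 3 and then popping three times yields 3, 2, 1, which is the LIFO behaviour.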
Stack
The stack can be implemented in two ways:
Using arrays (Static implementation)
Using pointers (Dynamic implementation)
Queue
A queue is a first in, first out (FIFO) type of data structure.
In a queue, new elements are added at one end, called the REAR end, and elements are always removed from the other end, called the FRONT end.
People standing in line at a railway reservation counter are an example of a queue.
Queue
[Figure: a queue holding 10 20 30 40 50, with the front and rear ends marked]
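The front and rear behaviour can be sketched with a circular array, as below. This is only one possible static implementation; the class name ArrayQueue, the method names enqueue and dequeue, and the exceptions thrown are assumptions of this example.

```java
// Hypothetical fixed-capacity queue of ints stored in a circular array.
class ArrayQueue {
    private final int[] items;
    private int front = 0; // index of the next element to remove
    private int size = 0;  // number of stored elements

    ArrayQueue(int capacity) { items = new int[capacity]; }

    boolean isEmpty() { return size == 0; }

    // Add a new element at the REAR end.
    void enqueue(int value) {
        if (size == items.length) {
            throw new IllegalStateException("queue is full");
        }
        int rear = (front + size) % items.length; // slot just past the last element
        items[rear] = value;
        size++;
    }

    // Remove and return the element at the FRONT end.
    int dequeue() {
        if (isEmpty()) {
            throw new IllegalStateException("queue is empty");
        }
        int value = items[front];
        front = (front + 1) % items.length; // wrap around at the end of the array
        size--;
        return value;
    }
}
```

Enqueuing 10, 20, 30 and then dequeuing returns 10 first, which is the FIFO behaviour.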
Queue
The queue can be implemented in two ways:
Using arrays (Static implementation)
Using pointers (Dynamic implementation)
Trees
A tree is a non-linear type of data structure.
A tree represents the hierarchical relationship between various elements.
Trees
The tree structure organizes the data into branches, which relate the information.
[Figure: tree with root A on the first level, B and C on the second level, and D, E, F, G on the third level]
Graph
A graph is a mathematical non-linear data structure capable of representing many kinds of physical structures.
Definition: A graph G(V,E) is a set of vertices V and a set of edges E.
Graph
An edge connects a pair of vertices and may have a weight, such as a length, a cost, or another measure appropriate to the graph.
Vertices of the graph are shown as points or circles, and edges are drawn as arcs or line segments.
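One common way to store a graph G(V,E) with weighted edges is an adjacency matrix, sketched below for an undirected graph. The class name WeightedGraph is hypothetical, vertices are assumed to be numbered from 0, and the sketch assumes every weight is positive so that 0 can stand for "no edge".

```java
// Hypothetical weighted, undirected graph stored as an adjacency matrix.
class WeightedGraph {
    // weight[u][v] != 0 means an edge between u and v exists (assumption: weights are positive)
    private final int[][] weight;

    WeightedGraph(int vertexCount) {
        weight = new int[vertexCount][vertexCount];
    }

    // Add an undirected edge between u and v with weight w.
    void addEdge(int u, int v, int w) {
        weight[u][v] = w;
        weight[v][u] = w;
    }

    boolean hasEdge(int u, int v) { return weight[u][v] != 0; }

    int weightOf(int u, int v) { return weight[u][v]; }

    // Degree of u: the number of edges that touch it.
    int degree(int u) {
        int count = 0;
        for (int v = 0; v < weight.length; v++) {
            if (weight[u][v] != 0) {
                count++;
            }
        }
        return count;
    }
}
```

An adjacency list is the usual alternative when the graph has few edges relative to the number of vertices.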
Graph
Example of graph:
[Figure: example graphs on vertices v1 to v5, with numeric edge labels 6, 8, 9, 10, 11, and 15]
Undirected graph
Simple graph
Weighted graph
Connected graph
Non-connected graph
OPERATIONS
MAJOR OPERATIONS
Traversing: Accessing each record exactly once so that certain items in the record may be processed (also known as visiting the record)
Searching: Finding the location of the record with a given key value, or finding the locations of all records which satisfy one or more conditions
Inserting: Adding a new record to the structure
Deleting: Removing a record from the structure
Sorting: Arranging a list in some logical order
Merging: Combining two lists into a single list
Abstract Data Type
Abstract data type (ADT)
Abstract data type (ADT) is a specification of a set of
data and the set of operations that can be performed on
the data. Each operation does a specific task.
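In Java, the split between specification and implementation can be sketched with an interface and a class. The IntBag ADT below is a made-up example, not one named in the slides: the interface lists the operations, and ArrayIntBag is just one possible way to carry them out.

```java
// Hypothetical ADT: the interface says WHAT operations exist, not HOW they work.
interface IntBag {
    void add(int value);          // put one value into the bag
    boolean contains(int value);  // report whether the value is present
    int size();                   // number of stored values
}

// One possible implementation, backed by an array that grows when full.
class ArrayIntBag implements IntBag {
    private int[] items = new int[4];
    private int count = 0;

    public void add(int value) {
        if (count == items.length) {
            items = java.util.Arrays.copyOf(items, count * 2);
        }
        items[count++] = value;
    }

    public boolean contains(int value) {
        for (int i = 0; i < count; i++) {
            if (items[i] == value) {
                return true;
            }
        }
        return false;
    }

    public int size() { return count; }
}
```

Code written against IntBag keeps working if ArrayIntBag is later replaced by, say, a linked-list implementation.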
Uses of ADT
List ADT operations:
1. Insertion at first, middle, last
2. Deletion at first, middle, last
3. Searching
4. Reversing
5. Traversing
6. Modifying the list
7. Merging the list
Sorting and Searching
"There's nothing in your head the
sorting hat can't see. So try me
on and I will tell you where you
ought to be."
-The Sorting Hat, Harry Potter
and the Sorcerer's Stone
CS 307
Fundamentals of
Computer Science
Sorting and Searching
Fundamental problems in computer science and programming
Sorting is done to make searching easier
Multiple different algorithms solve the same problem
How do we know which algorithm is "better"?
Look at searching first
Examples will use arrays of ints to illustrate the algorithms
Searching
Given a list of data, find the location of a particular value or report that the value is not present
linear search
intuitive approach
start at first item
is it the one I am looking for?
if not go to next item
repeat until found or all items checked
If the items are not sorted or cannot be sorted, this approach is necessary
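The linear search steps above can be sketched as follows for an array of ints. The class and method names are illustrative, and returning -1 for "not present" is an assumption that matches the convention of the binary search code later in these notes.

```java
// Hypothetical holder class for the sketch.
class LinearSearch {
    // Return the index of the first occurrence of target, or -1 if it is not present.
    static int linearSearch(int[] list, int target) {
        for (int i = 0; i < list.length; i++) {
            if (list[i] == target) {
                return i; // found: stop at the first match
            }
        }
        return -1; // all items checked without a match
    }
}
```

In the worst case every one of the N items is examined once.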
Attendance Question 1
What is the average case Big O of linear
search in an array with N items, if an
item is present?
A. O(N)
B. O(N^2)
C. O(1)
D. O(logN)
E. O(NlogN)
Searching in a Sorted List
If items are sorted then we can divide and conquer
dividing your work in half with each step
generally a good thing
The Binary Search on a List in Ascending Order
Start at middle of list
is that the item?
If not, is the middle item less than or greater than the item?
if less than, move to second half of list
if greater than, move to first half of list
repeat until found or sub list size = 0
Binary Search
[Figure: list with the low item, middle item, and high item marked]
Is the middle item what we are looking for? If not, is it more or less than the target item? (Assume lower)
Variables of Interest?
Attendance Question 2
What is the worst case Big O of binary search in
an array with N items, if an item is present?
A. O(N)
B. O(N^2)
C. O(1)
D. O(logN)
E. O(NlogN)
Generic Binary Search
public static int bsearch(Comparable[] list, Comparable target)
{   int result = -1;
    int low = 0;
    int high = list.length - 1;
    int mid;
    while( result == -1 && low <= high )
    {   mid = low + ((high - low) / 2);
        if( list[mid].equals(target) )
            result = mid;
        else if( list[mid].compareTo(target) < 0 )
            low = mid + 1;   // middle item is too small, search the upper half
        else
            high = mid - 1;  // middle item is too large, search the lower half
    }
    return result;
}
Other Searching Algorithms
Interpolation Search
more like what people really do
Indexed Searching
Binary Search Trees
Hash Table Searching
Grover's Algorithm (Waiting for quantum computers to be built)
best-first
Sorting Fun
Why Not Bubble Sort?
Sorting
A fundamental application for computers
Done to make finding data (searching) faster
Many different algorithms for sorting
One of the difficulties with sorting is working with a fixed size storage container (array)
if resize, that is expensive (slow)
The "simple" sorts run in quadratic time, O(N^2)
bubble sort
selection sort
insertion sort
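Of the three simple sorts, insertion sort is the one these notes do not show in code. A minimal sketch for an array of ints is given below; the class name is hypothetical, and the variable naming follows the style of the other code in these notes.

```java
// Hypothetical holder class for the sketch.
class InsertionSortSketch {
    // Sort list into ascending order in place.
    static void insertionSort(int[] list) {
        for (int i = 1; i < list.length; i++) {
            int temp = list[i]; // next item to insert into the sorted prefix
            int j = i;
            // shift larger items of the sorted prefix one slot to the right
            while (j > 0 && list[j - 1] > temp) {
                list[j] = list[j - 1];
                j--;
            }
            list[j] = temp;
        }
    }
}
```

The nested loops give the quadratic worst case, while an already sorted array needs only one pass because the inner loop never runs.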
Stable Sorting
A property of sorts
If a sort guarantees the relative order of equal items stays the same then it is a stable sort
[7_1, 6, 7_2, 5, 1, 2, 7_3, -5]
subscripts added for clarity
[-5, 1, 2, 5, 6, 7_1, 7_2, 7_3]
result of stable sort
Real world example:
sort a table in Wikipedia by one criterion, then another
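Stability can be observed in Java because Arrays.sort on an array of objects is documented to be a stable sort. In the sketch below the Item class and its label field are hypothetical: the label plays the role of the subscript that tells equal values apart, and the comparator looks only at the value.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical demo class, not from the slides.
class StableSortDemo {
    // A value plus a label that records which of several equal values it is.
    static class Item {
        final int value;
        final String label;
        Item(int value, String label) {
            this.value = value;
            this.label = label;
        }
    }

    // Sort a copy by value only and return the labels in sorted order.
    static String[] sortedLabels(Item[] items) {
        Item[] copy = items.clone();
        Arrays.sort(copy, Comparator.comparingInt((Item it) -> it.value));
        String[] labels = new String[copy.length];
        for (int i = 0; i < copy.length; i++) {
            labels[i] = copy[i].label;
        }
        return labels;
    }
}
```

With three items of value 7 labelled in their original order, the labels come back in that same relative order, since the comparator treats them as equal and the sort never reorders equal items.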
Selection sort
Algorithm
Search through the list and find the smallest element
swap the smallest element with the first element
repeat starting at second element and find the second smallest element
public static void selectionSort(int[] list)
{   int min;
    int temp;
    for(int i = 0; i < list.length - 1; i++) {
        // find the index of the smallest element in list[i..end]
        min = i;
        for(int j = i + 1; j < list.length; j++)
            if( list[j] < list[min] )
                min = j;
        // swap it into position i
        temp = list[i];
        list[i] = list[min];
        list[min] = temp;
    }
}
Selection Sort in Practice
44 68 191 119 119 37 83 82 191 45 158 130 76 153 39 25
Array sorted
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Shellsort on Another Data Set
44 68 191 119 119 37 83 82 191 45 158 130 76 153 39 25
ShellSort Code
public static void shellsort(Comparable[] list)
{   for(int gap = list.length / 2; gap > 0; gap /= 2)
        for(int i = gap; i < list.length; i++)
        {   Comparable tmp = list[i];
            int j = i;
            // shift larger elements that are gap positions apart to the right
            for( ; j >= gap &&
                   tmp.compareTo( list[j - gap] ) < 0;
                   j -= gap )
                list[ j ] = list[ j - gap ];
            list[ j ] = tmp;
        }
}
Comparison of Various Sorts
Num Items Selection Insertion Shellsort Quicksort
1000 16 5 0
2000 59 49 0 6
4000 271 175 6 5
8000 1056 686 11 0
16000 4203 2754 32 11
32000 16852 11039 37 45
64000 expected? expected? 100 68
128000 expected? expected? 257 158
256000 expected? expected? 543 335
512000 expected? expected? 1210 722
1024000 expected? expected? 2522 1550
times in milliseconds
Quicksort
Invented by C.A.R. (Tony) Hoare
A divide and conquer approach
that uses recursion
1. If the list has 0 or 1 elements it is sorted
2. otherwise, pick any element p in the list. This is
called the pivot value
3. Partition the list minus the pivot into two sub lists
according to values less than or greater than the
pivot. (equal values go to either)
4. return the quicksort of the first list followed by
the quicksort of the second list
Quicksort in Action
39 23 17 90 33 72 46 79 11 52 64 5 71
Pick middle element as pivot: 46
Partition list
23 17 5 33 39 11 46 79 72 52 64 90 71
quick sort the less than list
Pick middle element as pivot: 33
23 17 5 11 33 39
quicksort the less than list, pivot now 5
{} 5 23 17 11
quicksort the less than list, base case
quicksort the greater than list
Pick middle element as pivot: 17
and so on….
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Quicksort on Another Data Set
44 68 191 119 119 37 83 82 191 45 158 130 76 153 39 25
Big O of Quicksort?
public static void swapReferences( Object[] a, int index1, int index2 )
{   Object tmp = a[index1];
    a[index1] = a[index2];
    a[index2] = tmp;
}
public void quicksort( Comparable[] list, int start, int stop )
{   if(start >= stop)
        return; //base case list of 0 or 1 elements
    int pivotIndex = (start + stop) / 2;
    // Place pivot at start position
    swapReferences(list, pivotIndex, start);
    Comparable pivot = list[start];
    // Begin partitioning
    int i, j = start;
    // from first to j are elements less than or equal to pivot
    // from j to i are elements greater than pivot
    // elements beyond i have not been checked yet
    for(i = start + 1; i <= stop; i++ )
    {   //is current element less than or equal to pivot
        if(list[i].compareTo(pivot) <= 0)
        {   // if so move it to the less than or equal portion
            j++;
            swapReferences(list, i, j);
        }
    }
    //restore pivot to correct spot
    swapReferences(list, start, j);
    quicksort( list, start, j - 1 ); // Sort small elements
    quicksort( list, j + 1, stop ); // Sort large elements
}
Attendance Question 5
What is the best case and worst case
Big O of quicksort?
   Best        Worst
A. O(NlogN)    O(N^2)
B. O(N^2)      O(N^2)
C. O(N^2)      O(N!)
D. O(NlogN)    O(NlogN)
E. O(N)        O(NlogN)
Quicksort Caveats
Average case Big O?
Worst case Big O?
Coding the partition step is usually the hardest part
Attendance Question 6
You have 1,000,000 items that you will
be searching. How many searches need
to be performed before the data is
changed to make sorting worthwhile?
A. 10
B. 40
C. 1,000
D. 10,000
E. 500,000
Merge Sort Algorithm
Don Knuth cites John von Neumann as the creator of this algorithm
1. If a list has 1 element or 0 elements it is sorted
2. If a list has 2 or more elements split it into 2 separate lists
3. Perform this algorithm on each of those smaller lists
4. Take the 2 sorted lists and merge them together
Merge Sort
When implementing, one temporary array is used instead of multiple temporary arrays.
Why?
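A merge sort along these lines might look like the sketch below, which allocates a single temporary array once and passes it to every recursive call. The class and helper names are hypothetical. One plausible reading of the question above is that a single buffer avoids allocating and discarding new arrays at every level of the recursion.

```java
// Hypothetical holder class for the sketch.
class MergeSortSketch {
    // Sort list into ascending order using one shared temporary array.
    static void mergeSort(int[] list) {
        int[] temp = new int[list.length]; // the single temporary array
        sort(list, temp, 0, list.length - 1);
    }

    // Sort list[low..high], both ends inclusive.
    private static void sort(int[] list, int[] temp, int low, int high) {
        if (low >= high) {
            return; // 0 or 1 elements: already sorted
        }
        int mid = low + (high - low) / 2;  // split into two halves
        sort(list, temp, low, mid);        // sort each half
        sort(list, temp, mid + 1, high);
        merge(list, temp, low, mid, high); // merge the sorted halves
    }

    // Merge sorted list[low..mid] and list[mid+1..high] via temp.
    private static void merge(int[] list, int[] temp, int low, int mid, int high) {
        int left = low;
        int right = mid + 1;
        int k = low;
        while (left <= mid && right <= high) {
            // <= takes from the left half on ties, which keeps the sort stable
            if (list[left] <= list[right]) {
                temp[k++] = list[left++];
            } else {
                temp[k++] = list[right++];
            }
        }
        while (left <= mid) {
            temp[k++] = list[left++];
        }
        while (right <= high) {
            temp[k++] = list[right++];
        }
        for (int i = low; i <= high; i++) {
            list[i] = temp[i]; // copy the merged run back
        }
    }
}
```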
Final Comments
Language libraries often have sorting
algorithms in them
Java Arrays and Collections classes
C++ Standard Template Library
Python sort and sorted functions
Hybrid sorts
when size of unsorted list or portion of array is small use insertion sort, otherwise use O(N log N) sort like Quicksort or Mergesort
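The hybrid idea can be sketched by giving quicksort a size cutoff below which it hands the range to insertion sort. The class name and the cutoff value of 10 are assumptions made for this example; real libraries choose their own thresholds and use more refined algorithms.

```java
// Hypothetical hybrid of quicksort and insertion sort for int arrays.
class HybridSortSketch {
    static final int CUTOFF = 10; // assumed threshold, not taken from the slides

    static void hybridSort(int[] list) {
        sort(list, 0, list.length - 1);
    }

    // Sort list[start..stop], both ends inclusive.
    private static void sort(int[] list, int start, int stop) {
        if (stop - start + 1 <= CUTOFF) {
            insertionSort(list, start, stop); // small range: use the simple sort
            return;
        }
        // partition around the middle element, moved to the start position
        swap(list, start + (stop - start) / 2, start);
        int pivot = list[start];
        int j = start;
        for (int i = start + 1; i <= stop; i++) {
            if (list[i] <= pivot) {
                j++;
                swap(list, i, j);
            }
        }
        swap(list, start, j); // restore pivot to its final spot
        sort(list, start, j - 1);
        sort(list, j + 1, stop);
    }

    private static void insertionSort(int[] list, int start, int stop) {
        for (int i = start + 1; i <= stop; i++) {
            int temp = list[i];
            int j = i;
            while (j > start && list[j - 1] > temp) {
                list[j] = list[j - 1];
                j--;
            }
            list[j] = temp;
        }
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}
```

In everyday code the library routines mentioned above, such as Arrays.sort in Java, are usually the better choice than a hand-written sort.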