0.introduction To DS - Sorting - Searching

The document introduces different data structures like arrays, lists, stacks, queues, trees and graphs. It provides details about each data structure like their definition, implementation, operations etc.

Uploaded by

whyalwaysarnab
CSE225: Data Structure and Algorithms
Introduction to Data Structure (DS)
RECOMMENDED BOOKS
 C++ Plus Data Structures, Fifth Edition, by Nell Dale
 Data Structures with C++, Schaum's Outline Series
Data and Information
 Data can be defined as a representation of facts and concepts by values.
 Data is a collection of raw facts.
 Data is represented with characters such as letters (A-Z, a-z), digits (0-9), or special characters (+, -, /, *, <, >, =, etc.).
 A data structure is a representation of the logical relationships existing between individual elements of data.
Information
 Information is organized or classified data that has some meaningful value for the receiver. Information is the processed data on which decisions and actions are based.
 The processed data must have the following characteristics:
 Timeliness: information should be available when required.
 Accuracy: information should be accurate.
 Completeness: information should be complete.
Data Processing Cycle
 Input: the input data is prepared in some convenient form for processing.
 Processing: the input data is changed to produce data in a more useful form.
 Output: the result of the preceding processing step is collected.
Algorithm
 An algorithm is a well-defined list of steps for solving a problem.
 A data structure is the logical or mathematical relationship among individual elements of data.
 Algorithm + Data Structure = Program
Why Data Structures?
 They are essential ingredients in creating fast and powerful algorithms.
 They help manage and organize data.
 They make code cleaner and easier to understand.
Classification of Data Structure
 The two broad categories of data structure are:
 Primitive Data Structure
 Non-Primitive Data Structure
Primitive Data Structure
 These are basic structures that are operated on directly by machine instructions.
 Examples: integers, floating-point numbers, character constants, string constants, pointers, etc.
Non-Primitive Data Structure
 These are derived from the primitive data structures.
 Examples: Array, List, Stack, Queue, Tree, Graph
Difference between them
 A primitive data structure is a basic structure that is usually built into the language, such as an integer or a float.
 A non-primitive data structure is built out of primitive data structures linked together in meaningful ways, such as a linked list, binary search tree, AVL tree, graph, etc.
Classification of Data Structure
Data structure
 Primitive DS: Integer, Float, Character, Pointer
 Non-Primitive DS
Classification of Data Structure
Non-Primitive DS
 Linear List: Array, Linked List, Stack, Queue
 Non-Linear List: Graph, Trees
Types of Data
Characteristic: Description
Linear: In linear data structures, the data items are arranged in a linear sequence. Example: Array
Non-Linear: In non-linear data structures, the data items are not in sequence. Example: Tree, Graph
Homogeneous: In homogeneous data structures, all the elements are of the same type. Example: Array
Non-Homogeneous: In non-homogeneous data structures, the elements may or may not be of the same type. Example: Structures
Static: Static data structures are those whose sizes and associated memory locations are fixed at compile time. Example: Array
Dynamic: Dynamic structures are those which expand or shrink depending on the program's needs during execution; their associated memory locations also change. Example: Linked List created using pointers
Data Structure Operations
 The most commonly used operations on data structures are broadly categorized into the following types:
 Create
 Selection
 Updating
 Searching
 Sorting
 Merging
 Delete
 Insert
Description of various Data Structures: Arrays
 An array is defined as a finite set of homogeneous elements, that is, items of the same data type.
 This means an array can contain only one type of data: all integers, all floating-point numbers, or all characters.
Arrays
 The elements of an array are always stored in consecutive (contiguous) memory locations.
 The number of elements that can be stored in an array, that is, its size or length, is given by the following equation:
(upper bound - lower bound) + 1
Arrays
 Insertion of a new element
 Deletion of a required element
 Modification of an element
 Merging of arrays
Lists
 A list (linear linked list) can be defined as a collection of a variable number of data items.
 An element of a list must contain at least two fields: one for storing data or information and another for storing the address of the next element.
 Storing an address requires a special data type, the pointer type.
Lists
 Technically, each such element is referred to as a node; therefore a list can be defined as a collection of nodes, as shown below:
[Figure: Linear Linked List. Head points to node AAA, which links to BBB, then CCC; each node has an information field and a pointer field]
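The node layout in the figure can be sketched in Java roughly as follows. This is a minimal illustration, not the course's reference code: the names LinkedListSketch, Node, info, next and traverse are assumptions chosen to mirror the "information field" and "pointer field" labels.

```java
// Hypothetical names throughout; a sketch of a singly linked list of strings.
class LinkedListSketch {
    // One node: an information field plus a pointer (reference) to the next node.
    static class Node {
        String info;
        Node next;

        Node(String info, Node next) {
            this.info = info;
            this.next = next;
        }
    }

    // Walk from the head, following next references until the end (null).
    static String traverse(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node cur = head; cur != null; cur = cur.next) {
            sb.append(cur.info);
            if (cur.next != null) {
                sb.append(" -> ");
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Build the three-node list from the figure: Head -> AAA -> BBB -> CCC
        Node head = new Node("AAA", new Node("BBB", new Node("CCC", null)));
        System.out.println(traverse(head));
    }
}
```

The last node's pointer field is null, which is how the traversal knows where the list ends.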


Lists
 Types of linked lists:
 Singly linked list
 Doubly linked list
 Singly circular linked list
 Doubly circular linked list
Stack
 A stack is an ordered collection of elements, like an array, but with a special feature: insertion and deletion of elements can be done only at one end, called the top of the stack (TOP).
 Because of this property it is also called a last in, first out (LIFO) data structure.
Stack
 Insertion of an element into a stack is called PUSH, and deletion of an element from a stack is called POP.
 The figure below shows how the operations take place on a stack:
[Figure: STACK, with PUSH and POP both acting on the top]
Stack
 A stack can be implemented in two ways:
 Using arrays (static implementation)
 Using pointers (dynamic implementation)
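A minimal sketch of the static (array) implementation, assuming a fixed capacity chosen at construction time. The class name ArrayStackSketch and its method names are illustrative, not taken from the slides.

```java
// Hypothetical array-based stack of ints with a fixed capacity.
class ArrayStackSketch {
    private final int[] items;
    private int top = -1; // index of the top element; -1 means the stack is empty

    ArrayStackSketch(int capacity) {
        items = new int[capacity];
    }

    boolean isEmpty() {
        return top == -1;
    }

    // PUSH: insert at the top.
    void push(int value) {
        if (top == items.length - 1) {
            throw new IllegalStateException("stack overflow");
        }
        items[++top] = value;
    }

    // POP: remove from the top, so the last item pushed comes out first (LIFO).
    int pop() {
        if (isEmpty()) {
            throw new IllegalStateException("stack underflow");
        }
        return items[top--];
    }
}
```

Both operations touch only the TOP index, which is why each runs in constant time.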
Queue
 A queue is a first in, first out (FIFO) data structure.
 In a queue, new elements are added at one end, called the REAR, and elements are always removed from the other end, called the FRONT.
 People standing in line at a railway reservation counter are an example of a queue.
Queue
 The figure below shows how the operations take place on a queue:
[Figure: queue holding 10 20 30 40 50, with front at 10 and rear at 50]
Queue
 A queue can be implemented in two ways:
 Using arrays (static implementation)
 Using pointers (dynamic implementation)
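One way to sketch the static (array) implementation is a circular buffer, so that slots freed at the front can be reused at the rear. The circular layout, the class name ArrayQueueSketch and the method names are assumptions made for this illustration.

```java
// Hypothetical array-based FIFO queue of ints using a circular buffer.
class ArrayQueueSketch {
    private final int[] items;
    private int front = 0; // index of the next element to remove
    private int size = 0;  // number of elements currently stored

    ArrayQueueSketch(int capacity) {
        items = new int[capacity];
    }

    boolean isEmpty() {
        return size == 0;
    }

    // Add at the REAR end.
    void enqueue(int value) {
        if (size == items.length) {
            throw new IllegalStateException("queue full");
        }
        int rear = (front + size) % items.length; // wraps around the array
        items[rear] = value;
        size++;
    }

    // Remove from the FRONT end, so the first item added comes out first (FIFO).
    int dequeue() {
        if (isEmpty()) {
            throw new IllegalStateException("queue empty");
        }
        int value = items[front];
        front = (front + 1) % items.length;
        size--;
        return value;
    }
}
```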
Trees
 A tree is a non-linear data structure.
 A tree represents the hierarchical relationship between various elements.
Trees
 There is a special data item at the top of the hierarchy, called the root of the tree.
 The remaining data items are partitioned into a number of mutually exclusive subsets, each of which is itself a tree, called a subtree.
Trees
 The tree structure organizes the data into branches, which relate the information.
[Figure: tree with root A on the top level, B and C on the second level, and D, E, F, G on the third level]
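The tree in the figure can be sketched as linked nodes. The extracted figure does not show which third-level items belong to which parent, so this sketch assumes a binary tree with D and E under B, and F and G under C. TreeSketch, TreeNode and preorder are hypothetical names.

```java
// Hypothetical binary tree of characters; the parent/child pairing is an assumption.
class TreeSketch {
    static class TreeNode {
        char data;
        TreeNode left;
        TreeNode right;

        TreeNode(char data, TreeNode left, TreeNode right) {
            this.data = data;
            this.left = left;
            this.right = right;
        }
    }

    // Preorder traversal: visit the root first, then each subtree, which is itself a tree.
    static String preorder(TreeNode node) {
        if (node == null) {
            return "";
        }
        return node.data + preorder(node.left) + preorder(node.right);
    }

    public static void main(String[] args) {
        TreeNode b = new TreeNode('B', new TreeNode('D', null, null), new TreeNode('E', null, null));
        TreeNode c = new TreeNode('C', new TreeNode('F', null, null), new TreeNode('G', null, null));
        TreeNode root = new TreeNode('A', b, c);
        System.out.println(preorder(root));
    }
}
```

The recursion mirrors the definition on the previous slide: every child of the root is the root of its own subtree.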
Graph
 A graph is a mathematical non-linear data structure capable of representing many kinds of physical structures.
 Definition: A graph G(V, E) is a set of vertices V and a set of edges E.
Graph
 An edge connects a pair of vertices and may have a weight, such as a length, a cost, or another measure associated with the graph.
 Vertices of the graph are shown as points or circles, and edges are drawn as arcs or line segments.
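A graph G(V, E) with optional edge weights can be sketched with an adjacency list. This is one common representation among several; the class GraphSketch, its methods and the sample vertex names in the comments are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical adjacency-list graph: each vertex maps to its outgoing edges.
class GraphSketch {
    static class Edge {
        final String to;
        final int weight;

        Edge(String to, int weight) {
            this.to = to;
            this.weight = weight;
        }
    }

    private final Map<String, List<Edge>> adjacency = new HashMap<>();

    void addVertex(String v) {
        adjacency.putIfAbsent(v, new ArrayList<>());
    }

    // A directed, weighted edge from one vertex to another.
    void addDirectedEdge(String from, String to, int weight) {
        addVertex(from);
        addVertex(to);
        adjacency.get(from).add(new Edge(to, weight));
    }

    // An undirected edge is stored here as two directed edges, one each way.
    void addUndirectedEdge(String a, String b, int weight) {
        addDirectedEdge(a, b, weight);
        addDirectedEdge(b, a, weight);
    }

    int vertexCount() {
        return adjacency.size();
    }

    int outDegree(String v) {
        return adjacency.get(v).size();
    }
}
```

For example, calling addDirectedEdge("v1", "v2", 6) records an edge of weight 6 that can be followed from v1 to v2 but not back.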
Graph
 Example of graph:
[Figure: two example graphs, (a) a directed and weighted graph and (b) an undirected graph; the vertex labels v1 to v5 and the edge weights 6, 8, 9, 10, 11, 15 appear in the original figure]
Graph
 Types of graphs:
 Directed graph
 Undirected graph
 Simple graph
 Weighted graph
 Connected graph
 Non-connected graph
OPERATIONS
 Data appearing in a data structure is processed by means of certain operations.
 The particular data structure one chooses for a given situation depends largely on the frequency with which specific operations are performed.
MAJOR OPERATIONS
 Traversing: accessing each record exactly once so that certain items in the record may be processed (also known as visiting the record)
 Searching: finding the location of the record with a given key value, or finding the locations of all records which satisfy one or more conditions
 Inserting: adding a new record to the structure
 Deleting: removing a record from the structure
 Sorting: arranging a list in some logical order
 Merging: combining two lists into a single list
Abstract Data Type
 Abstract Data Types (ADTs) are a model used to understand the design of a data structure. "Abstract" means an implementation-independent view of the data structure.
 ADTs specify the type of data stored and the operations that the data supports.
Abstract data type (ADT)
 An abstract data type (ADT) is a specification of a set of data and the set of operations that can be performed on the data. Each operation does a specific task.
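In Java, one way to express this separation is an interface for the specification and a class for one implementation. The names IntStackADT and LinkedIntStack are hypothetical, and the choice of a stack as the example ADT is an assumption made for illustration.

```java
// The ADT: what a stack of ints can do, with no commitment to how it is stored.
interface IntStackADT {
    void push(int value);

    int pop();

    boolean isEmpty();
}

// One possible implementation, using linked nodes (a dynamic implementation).
class LinkedIntStack implements IntStackADT {
    private static class Node {
        final int value;
        final Node next;

        Node(int value, Node next) {
            this.value = value;
            this.next = next;
        }
    }

    private Node top; // null when the stack is empty

    public void push(int value) {
        top = new Node(value, top);
    }

    public int pop() {
        if (top == null) {
            throw new IllegalStateException("stack is empty");
        }
        int value = top.value;
        top = top.next;
        return value;
    }

    public boolean isEmpty() {
        return top == null;
    }
}
```

Code written against IntStackADT would keep working if the linked implementation were swapped for an array-based one, which is the implementation-independent view the slides describe.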
Uses of ADT
 1. Helps to develop well-designed programs efficiently
 2. Facilitates the decomposition of the complex task of developing a software system into a number of simpler subtasks
 3. Helps to reduce the number of things the programmer has to keep in mind at any time
 4. Breaking down a complex task into a number of simpler subtasks also simplifies testing and debugging
List of ADT operations:
 1. Insertion at first, middle, last
 2. Deletion at first, middle, last
 3. Searching
 4. Reversing
 5. Traversing
 6. Modifying the list
 7. Merging the list
Sorting and Searching
"There's nothing in your head the
sorting hat can't see. So try me
on and I will tell you where you
ought to be."
-The Sorting Hat, Harry Potter
and the Sorcerer's Stone

CS 307
Fundamentals of
Computer Science
Sorting and Searching
 Fundamental problems in computer
science and programming
 Sorting done to make searching easier
 Multiple different algorithms to solve
the same problem
 How do we know which algorithm is
"better"?
 Look at searching first
 Examples will use arrays of ints to
Searching
 Given a list of data find the location of a
particular value or report that value is
not present
 linear search
 intuitive approach
 start at first item
 is it the one I am looking for?
 if not go to next item
 repeat until found or all items checked
 If items not sorted or unsortable this
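The linear search steps above can be sketched for an array of ints as follows. LinearSearchSketch and linearSearch are hypothetical names, and returning -1 for "not present" is a common convention assumed here.

```java
// Hypothetical linear search over an unsorted array of ints.
class LinearSearchSketch {
    // Returns the index of the first occurrence of target, or -1 if it is not present.
    static int linearSearch(int[] list, int target) {
        for (int i = 0; i < list.length; i++) { // start at the first item
            if (list[i] == target) {            // is it the one I am looking for?
                return i;
            }
            // if not, go on to the next item
        }
        return -1; // all items checked without finding the target
    }
}
```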
Attendance Question 1
 What is the average case Big O of linear search in an array with N items, if an item is present?
A. O(N)
B. O(N^2)
C. O(1)
D. O(logN)
E. O(NlogN)
Searching in a Sorted List
 If items are sorted then we can divide and conquer
 dividing your work in half with each step
 generally a good thing
 The Binary Search on List in Ascending order
 Start at middle of list
 is that the item?
 If not, is the middle item less than or greater than the target item?
 less than, move to second half of list
 greater than, move to first half of list
 repeat until found or sub list size = 0
Binary Search
[Figure: list with its low item, middle item, and high item marked]
Is middle item what we are looking for? If not, is it more or less than the target item? (Assume lower)
[Figure: the same list with the search range narrowed and new low, middle, and high items marked]
and so forth…
Trace When Key == 3
Trace When Key == 30

Variables of Interest?
Attendance Question 2
What is the worst case Big O of binary search in
an array with N items, if an item is present?
A. O(N)
B. O(N^2)
C. O(1)
D. O(logN)
E. O(NlogN)
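Generic Binary Search
public static int bsearch(Comparable[] list, Comparable target)
{   int result = -1;
    int low = 0;
    int high = list.length - 1;
    int mid;
    while( result == -1 && low <= high )
    {   mid = low + ((high - low) / 2); // avoids overflow of (low + high)
        int cmp = list[mid].compareTo(target);
        if( cmp == 0 )
            result = mid;
        else if( cmp < 0 ) // middle item is less than target: search second half
            low = mid + 1;
        else               // middle item is greater than target: search first half
            high = mid - 1;
    }
    return result;
}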
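To follow the variables of interest (low, mid, high) during a search, an int version with a print statement can help. The array contents below are made up for illustration, since the data from the original trace slides did not survive extraction; BinarySearchTrace and search are hypothetical names.

```java
// Hypothetical tracing version of binary search on an ascending array of ints.
class BinarySearchTrace {
    // Returns the index of target, or -1 if absent, printing low/mid/high at each step.
    static int search(int[] list, int target) {
        int low = 0;
        int high = list.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;
            System.out.println("low=" + low + " mid=" + mid + " high=" + high);
            if (list[mid] == target) {
                return mid;
            } else if (list[mid] < target) {
                low = mid + 1;  // target can only be in the second half
            } else {
                high = mid - 1; // target can only be in the first half
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] data = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31}; // assumed sample data
        System.out.println("key 3 found at index " + search(data, 3));
        System.out.println("key 30 found at index " + search(data, 30));
    }
}
```

Each printed line corresponds to one halving of the search range, which is where the logarithmic behaviour comes from.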
Other Searching Algorithms
 Interpolation Search
 more like what people really do
 Indexed Searching
 Binary Search Trees
 Hash Table Searching
 Grover's Algorithm (Waiting for
quantum computers to be built)
 best-first

Sorting Fun
Why Not Bubble Sort?
Sorting
 A fundamental application for computers
 Done to make finding data (searching) faster
 Many different algorithms for sorting
 One of the difficulties with sorting is working with a fixed size storage container (array)
 if resize, that is expensive (slow)
 The "simple" sorts run in quadratic time O(N^2)
 bubble sort
 selection sort
 insertion sort
Stable Sorting
 A property of sorts
 If a sort guarantees the relative order of equal items stays the same then it is a stable sort
 [7_1, 6, 7_2, 5, 1, 2, 7_3, -5]
 subscripts added for clarity
 [-5, 1, 2, 5, 6, 7_1, 7_2, 7_3]
 result of stable sort
 Real world example:
 sort a table in Wikipedia by one criterion, then
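The subscripted example can be reproduced in Java by tagging each value with a label and sorting only by value. The Java documentation states that Arrays.sort on object arrays is stable, so equal values keep their original order. StableSortSketch, Item and sortedLabels are hypothetical names.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical demonstration of stability using labelled values.
class StableSortSketch {
    static class Item {
        final int value;
        final String label; // carries the subscript so equal values can be told apart

        Item(int value, String label) {
            this.value = value;
            this.label = label;
        }
    }

    // Sorts a copy by value only and returns the labels in sorted order.
    static String[] sortedLabels(Item[] items) {
        Item[] copy = items.clone();
        Arrays.sort(copy, Comparator.comparingInt((Item it) -> it.value));
        String[] labels = new String[copy.length];
        for (int i = 0; i < copy.length; i++) {
            labels[i] = copy[i].label;
        }
        return labels;
    }

    public static void main(String[] args) {
        Item[] items = {
            new Item(7, "7_1"), new Item(6, "6"), new Item(7, "7_2"), new Item(5, "5"),
            new Item(1, "1"), new Item(2, "2"), new Item(7, "7_3"), new Item(-5, "-5")
        };
        System.out.println(Arrays.toString(sortedLabels(items)));
    }
}
```

Because the comparator looks only at the value, any unstable sort would be free to emit the three sevens in a different order.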
Selection sort
 Algorithm
 Search through the list and find the smallest element
 swap the smallest element with the first element
 repeat starting at second element and find the second smallest element
public static void selectionSort(int[] list)
{   int min;
    int temp;
    for(int i = 0; i < list.length - 1; i++) {
        // find the index of the smallest element in list[i..]
        min = i;
        for(int j = i + 1; j < list.length; j++)
            if( list[j] < list[min] )
                min = j;
        // swap the smallest element into position i
        temp = list[i];
        list[i] = list[min];
        list[min] = temp;
    }
}
Selection Sort in Practice
44 68 191 119 119 37 83 82 191 45 158 130 76 153 39 25

What is the T(N), actual number of statements executed, of the selection sort code, given a list of N elements? What is the Big O?
Generic Selection Sort
public void selectionSort(Comparable[] list)
{   int min; Comparable temp;
    for(int i = 0; i < list.length - 1; i++)
    {   min = i;
        for(int j = i + 1; j < list.length; j++)
            if( list[min].compareTo(list[j]) > 0 )
                min = j;
        temp = list[i];
        list[i] = list[min];
        list[min] = temp;
    }
}

 Best case, worst case, average case Big O?
Attendance Question 3
Is selection sort always stable?
A. Yes
B. No
Insertion Sort
 Another of the O(N^2) sorts
 The first item is sorted
 Compare the second item to the first
 if smaller swap
 Third item, compare to item next to it
 need to swap
 after swap compare again
 And so forth…
Insertion Sort Code
public void insertionSort(int[] list)
{   int temp, j;
    for(int i = 1; i < list.length; i++)
    {   temp = list[i];
        j = i;
        while( j > 0 && temp < list[j - 1])
        {   // swap elements
            list[j] = list[j - 1];
            list[j - 1] = temp;
            j--;
        }
    }
}
 Best case, worst case, average case Big O?
Attendance Question 4
 Is the version of insertion sort shown
always stable?
A. Yes
B. No
Comparing Algorithms
 Which algorithm do you think will be
faster given random data, selection sort
or insertion sort?
 Why?
Sub Quadratic Sorting Algorithms
Sub Quadratic means having a Big O better than O(N^2)
ShellSort
 Created by Donald Shell in 1959
 Wanted to stop moving data small
distances (in the case of insertion sort
and bubble sort) and stop making
swaps that are not helpful (in the case
of selection sort)
 Start with sub arrays created by looking
at data that is far apart and then reduce
the gap size
46 2 83 41 102 5 17 31 64 49 18
ShellSort in practice
Gap of five. Sort sub array with 46, 5, and 18
5 2 83 41 102 18 17 31 64 49 46
Gap still five. Sort sub array with 2 and 17
5 2 83 41 102 18 17 31 64 49 46
Gap still five. Sort sub array with 83 and 31
5 2 31 41 102 18 17 83 64 49 46
Gap still five Sort sub array with 41 and 64
5 2 31 41 102 18 17 83 64 49 46
Gap still five. Sort sub array with 102 and 49
5 2 31 41 49 18 17 83 64 102 46
Continued on next slide:
Completed Shellsort
5 2 31 41 49 18 17 83 64 102 46
Gap now 2: Sort sub array with 5 31 49 17 64 46
5 2 17 41 31 18 46 83 49 102 64
Gap still 2: Sort sub array with 2 41 18 83 102
5 2 17 18 31 41 46 83 49 102 64
Gap of 1 (Insertion sort)
2 5 17 18 31 41 46 49 64 83 102

Array sorted
Shellsort on Another Data Set
index: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
data: 44 68 191 119 119 37 83 82 191 45 158 130 76 153 39 25
Initial gap = length / 2 = 16 / 2 = 8
initial sub arrays indices:
{0, 8}, {1, 9}, {2, 10}, {3, 11}, {4, 12}, {5, 13}, {6, 14}, {7, 15}
next gap = 8 / 2 = 4
{0, 4, 8, 12}, {1, 5, 9, 13}, {2, 6, 10, 14}, {3, 7, 11, 15}
next gap = 4 / 2 = 2
{0, 2, 4, 6, 8, 10, 12, 14}, {1, 3, 5, 7, 9, 11, 13, 15}
final gap = 2 / 2 = 1
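The index groups listed above follow a simple pattern: for a given gap, group k holds the indices k, k + gap, k + 2·gap, and so on. A small sketch that regenerates them, assuming the halving gap sequence shown above (ShellGapsSketch and groups are hypothetical names):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that lists the sub array indices Shellsort works on for each gap.
class ShellGapsSketch {
    // Index groups that the given gap splits an array of the given length into.
    static List<List<Integer>> groups(int length, int gap) {
        List<List<Integer>> result = new ArrayList<>();
        for (int start = 0; start < gap; start++) {
            List<Integer> group = new ArrayList<>();
            for (int i = start; i < length; i += gap) {
                group.add(i);
            }
            result.add(group);
        }
        return result;
    }

    public static void main(String[] args) {
        int length = 16;
        // Halve the gap each round until it reaches 1 (a plain insertion sort pass).
        for (int gap = length / 2; gap > 0; gap /= 2) {
            System.out.println("gap " + gap + ": " + groups(length, gap));
        }
    }
}
```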

ShellSort Code
public static void shellsort(Comparable[] list)
{   // the unused locals temp and swap from the original have been dropped
    for(int gap = list.length / 2; gap > 0; gap /= 2)
        for(int i = gap; i < list.length; i++)
        {   Comparable tmp = list[i];
            int j = i;
            // gapped insertion: shift larger elements of this sub array up by gap
            for( ; j >= gap &&
                   tmp.compareTo( list[j - gap] ) < 0;
                   j -= gap )
                list[ j ] = list[ j - gap ];
            list[ j ] = tmp;
        }
}
Comparison of Various Sorts
Num Items   Selection   Insertion   Shellsort   Quicksort
1000        16          5           0           0
2000        59          49          0           6
4000        271         175         6           5
8000        1056        686         11          0
16000       4203        2754        32          11
32000       16852       11039       37          45
64000       expected?   expected?   100         68
128000      expected?   expected?   257         158
256000      expected?   expected?   543         335
512000      expected?   expected?   1210        722
1024000     expected?   expected?   2522        1550
times in milliseconds

Quicksort
Invented by C.A.R. (Tony) Hoare
 A divide and conquer approach
that uses recursion
1. If the list has 0 or 1 elements it is sorted
2. otherwise, pick any element p in the list. This is
called the pivot value
3. Partition the list minus the pivot into two sub lists
according to values less than or greater than the
pivot. (equal values go to either)
4. return the quicksort of the first list followed by
the quicksort of the second list
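The four steps can be followed almost literally with lists, at the cost of extra memory compared with an in-place version. This is a sketch for understanding the recursion only; choosing the middle element as pivot and sending equal values to the first list are assumptions, and QuicksortListSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical list-based quicksort that mirrors the four steps directly.
class QuicksortListSketch {
    static List<Integer> quicksort(List<Integer> list) {
        // Step 1: a list with 0 or 1 elements is already sorted.
        if (list.size() <= 1) {
            return new ArrayList<>(list);
        }
        // Step 2: pick a pivot (any element works; the middle one is used here).
        int pivotIndex = list.size() / 2;
        int pivot = list.get(pivotIndex);
        // Step 3: partition the list minus the pivot into two sub lists.
        List<Integer> smaller = new ArrayList<>();
        List<Integer> larger = new ArrayList<>();
        for (int i = 0; i < list.size(); i++) {
            if (i == pivotIndex) {
                continue;
            }
            int value = list.get(i);
            if (value <= pivot) {
                smaller.add(value);
            } else {
                larger.add(value);
            }
        }
        // Step 4: quicksort of the first list, then the pivot, then quicksort of the second list.
        List<Integer> result = quicksort(smaller);
        result.add(pivot);
        result.addAll(quicksort(larger));
        return result;
    }
}
```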
Quicksort in Action
39 23 17 90 33 72 46 79 11 52 64 5 71
Pick middle element as pivot: 46
Partition list
23 17 5 33 39 11 46 79 72 52 64 90 71
quick sort the less than list
Pick middle element as pivot: 33
23 17 5 11 33 39
quicksort the less than list, pivot now 5
{} 5 23 17 11
quicksort the less than list, base case
quicksort the greater than list
Pick middle element as pivot: 17
and so on….
Quicksort on Another Data Set
index: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
data: 44 68 191 119 119 37 83 82 191 45 158 130 76 153 39 25
Big O of Quicksort?
public static void swapReferences( Object[] a, int index1, int index2 )
{   Object tmp = a[index1];
    a[index1] = a[index2];
    a[index2] = tmp;
}
public void quicksort( Comparable[] list, int start, int stop )
{   if(start >= stop)
        return; //base case list of 0 or 1 elements
    int pivotIndex = (start + stop) / 2;
    // Place pivot at start position
    swapReferences(list, pivotIndex, start);
    Comparable pivot = list[start];
    // Begin partitioning
    int i, j = start;
    // from first to j are elements less than or equal to pivot
    // from j to i are elements greater than pivot
    // elements beyond i have not been checked yet
    for(i = start + 1; i <= stop; i++ )
    {   //is current element less than or equal to pivot
        if(list[i].compareTo(pivot) <= 0)
        {   // if so move it to the less than or equal portion
            j++;
            swapReferences(list, i, j);
        }
    }
    //restore pivot to correct spot
    swapReferences(list, start, j);
    quicksort( list, start, j - 1 ); // Sort small elements
    quicksort( list, j + 1, stop ); // Sort large elements
}
Attendance Question 5
 What is the best case and worst case Big O of quicksort?
   Best        Worst
A. O(NlogN)    O(N^2)
B. O(N^2)      O(N^2)
C. O(N^2)      O(N!)
D. O(NlogN)    O(NlogN)
E. O(N)        O(NlogN)
Quicksort Caveats
 Average case Big O?
 Worst case Big O?
 Coding the partition step is usually the
hardest part
Attendance Question 6
 You have 1,000,000 items that you will
be searching. How many searches need
to be performed before the data is
changed to make sorting worthwhile?
A. 10

B. 40

C. 1,000

D. 10,000

E. 500,000
Merge Sort Algorithm
Don Knuth cites John von Neumann as the creator of this algorithm
1. If a list has 1 element or 0 elements it is sorted
2. If a list has 2 or more elements, split it into 2 separate lists
3. Perform this algorithm on each of those smaller lists
4. Take the 2 sorted lists and merge them together
Merge Sort
When implementing, one temporary array is used instead of multiple temporary arrays.
Why?
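A sketch of merge sort for ints that allocates a single temporary array up front and reuses it in every merge step. One plausible reason for this design is that it avoids creating a new array on each recursive call. MergeSortSketch and its helper names are hypothetical.

```java
// Hypothetical merge sort on an int array using one shared temporary array.
class MergeSortSketch {
    static void mergeSort(int[] list) {
        int[] temp = new int[list.length]; // allocated once, shared by all merges
        sort(list, temp, 0, list.length - 1);
    }

    private static void sort(int[] list, int[] temp, int low, int high) {
        if (low >= high) {
            return; // 0 or 1 elements: already sorted
        }
        int mid = low + (high - low) / 2;
        sort(list, temp, low, mid);      // sort the first half
        sort(list, temp, mid + 1, high); // sort the second half
        merge(list, temp, low, mid, high);
    }

    // Merge the sorted ranges list[low..mid] and list[mid+1..high].
    private static void merge(int[] list, int[] temp, int low, int mid, int high) {
        int left = low;
        int right = mid + 1;
        int out = low;
        while (left <= mid && right <= high) {
            // taking from the left on ties keeps equal items in their original order
            temp[out++] = (list[left] <= list[right]) ? list[left++] : list[right++];
        }
        while (left <= mid) {
            temp[out++] = list[left++];
        }
        while (right <= high) {
            temp[out++] = list[right++];
        }
        for (int i = low; i <= high; i++) {
            list[i] = temp[i]; // copy the merged range back
        }
    }
}
```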
Final Comments
 Language libraries often have sorting
algorithms in them
 Java Arrays and Collections classes
 C++ Standard Template Library
 Python sort and sorted functions
 Hybrid sorts
 when size of unsorted list or portion of
array is small use insertion sort, otherwise
use an O(N log N) sort like Quicksort or Mergesort
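A hybrid sort along these lines might look like the following sketch, which runs quicksort but hands small ranges to insertion sort. The cutoff of 10 is an arbitrary assumption, not a value from the slides, and HybridSortSketch is a hypothetical name; library implementations choose their own thresholds and algorithms.

```java
// Hypothetical hybrid sort: quicksort for large ranges, insertion sort for small ones.
class HybridSortSketch {
    static final int CUTOFF = 10; // assumed threshold for switching to insertion sort

    static void sort(int[] list) {
        sort(list, 0, list.length - 1);
    }

    private static void sort(int[] list, int start, int stop) {
        if (stop - start + 1 <= CUTOFF) {
            insertionSort(list, start, stop); // also handles empty and 1-element ranges
            return;
        }
        // Same partition scheme as the quicksort slide: pivot moved to the start position.
        int pivotIndex = start + (stop - start) / 2;
        swap(list, pivotIndex, start);
        int pivot = list[start];
        int j = start;
        for (int i = start + 1; i <= stop; i++) {
            if (list[i] <= pivot) {
                j++;
                swap(list, i, j);
            }
        }
        swap(list, start, j); // restore pivot to its correct spot
        sort(list, start, j - 1);
        sort(list, j + 1, stop);
    }

    // Insertion sort restricted to list[start..stop].
    private static void insertionSort(int[] list, int start, int stop) {
        for (int i = start + 1; i <= stop; i++) {
            int temp = list[i];
            int j = i;
            while (j > start && temp < list[j - 1]) {
                list[j] = list[j - 1];
                j--;
            }
            list[j] = temp;
        }
    }

    private static void swap(int[] list, int a, int b) {
        int tmp = list[a];
        list[a] = list[b];
        list[b] = tmp;
    }
}
```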
