DATA STRUCTURE AND ALGORITHM
Dr. Khine Thin Zar
Professor
Computer Engineering and Information Technology Dept.
Yangon Technological University
Lecture 9
Searching
Outline of Class (Lecture 9)
Introduction
Searching Unsorted and Sorted Arrays
Sequential Search/ Linear Search Algorithm
Binary Search Algorithm
Quadratic Binary Search Algorithm
Self-Organizing Lists
Bit Vectors for Representing Sets
Hashing
Introduction
Searching
is the most frequently performed of all computing
tasks.
is an attempt to find the record within a collection of
records that has a particular key value.
Given a collection L of n records of the form (k1, I1), (k2, I2), ..., (kn, In),
where Ij is the information associated with key kj of record j, for 1 ≤ j ≤ n,
the search problem is to locate a record (kj, Ij) in L such
that kj = K (if one exists).
Searching is a systematic method for locating the record
(or records) with key value kj = K.
Introduction (Cont.)
Successful Search
a record with key kj = K is found
Unsuccessful Search
no record with kj = K is found (no such record exists)
Exact-match query
is a search for the record whose key value matches a
specified key value.
Range query
is a search for all records whose key value falls within a
specified range of key values.
Introduction (Cont.)
Categories of search algorithms :
Sequential and list methods
Direct access by key value (hashing)
Tree indexing methods
Searching Unsorted and Sorted Arrays
Sequential Search/ Linear Search Algorithm
Jump Search Algorithm
Exponential Search Algorithm
Binary Search Algorithm
Quadratic Binary Search (QBS) Algorithm
Searching Unsorted and Sorted Arrays
(Cont.)
Sequential search on an unsorted list requires Θ(n) time in
the worst case.
How many comparisons does linear search do on average?
A major consideration is whether K is in list L at all.
K is in one of positions 0 to n-1 in L (each position having
its own probability)
K is not in L at all.
The probability that K is not in L is:
P(K ∉ L) = 1 − Σ_{i=0}^{n-1} P(K = L[i])
where P(x) is the probability of event x.
Searching Unsorted and Sorted Arrays
(Cont.)
Let pi be the probability that K is in position i of L (indexed
from 0 to n-1).
For any position i in the list, we must look at i + 1 records
to reach it.
So the cost when K is in position i is i + 1; when K is not in
L, sequential search requires n comparisons.
Let pn be the probability that K is not in L.
The average cost is:
T(n) = n·pn + Σ_{i=0}^{n-1} (i + 1)·pi
Searching Unsorted and Sorted Arrays
(Cont.)
Assume all the pi's are equal (except pn), so pi = (1 − pn)/n for 0 ≤ i ≤ n-1. Then
T(n) = n·pn + Σ_{i=0}^{n-1} (i + 1)(1 − pn)/n = n·pn + (1 − pn)(n + 1)/2
Depending on the value of pn: if pn = 0 (K is always in L), T(n) = (n + 1)/2;
if pn = 1, T(n) = n. So (n + 1)/2 ≤ T(n) ≤ n.
Searching Unsorted and Sorted Arrays
(Cont.)
For large collections of records that are searched repeatedly,
sequential search is unacceptably slow.
One way to reduce search time is to preprocess the records by
sorting them.
Jump search algorithm
What is the right amount to jump?
For some value j, we check every j-th element in L; that is, we check
elements L[j], L[2j], and so on.
So long as K is greater than the values we are checking, we continue on.
If we reach a value in L greater than K, we do a linear search on the
piece of length j − 1 that we know brackets K if it is in the list.
If we define m such that mj ≤ n < (m + 1)j, then the total cost of this
algorithm is at most m + j − 1 3-way comparisons.
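The steps above can be sketched in Python (the function name and the common choice j = √n are illustrative, not from the slides):

```python
import math

def jump_search(arr, key):
    """Jump search on a sorted list: probe every j-th element, then
    do a linear search within the block that may bracket the key.
    Choosing j = sqrt(n) balances the jumping and scanning phases."""
    n = len(arr)
    if n == 0:
        return -1
    j = max(1, int(math.sqrt(n)))
    prev = 0
    # Jump ahead in steps of j while the probed value is still < key.
    while prev < n and arr[min(prev + j, n) - 1] < key:
        prev += j
    # Linear scan within the bracketing block of length at most j.
    for i in range(prev, min(prev + j, n)):
        if arr[i] == key:
            return i
    return -1
```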
Sequential Search/ Linear Search
Algorithm
Starts at the first record and moves through each record
until a match is found or the list is exhausted.
Easy to write and efficient for short lists
Does not require sorted data
A major consideration is whether K is in list L at all.
K is in one of positions 0 to n-1 in L or K is not in L at all.
A simple approach is sequential/linear search:
Start from the leftmost element of arr[ ] and one by one
compare x with each element of arr[ ].
If x matches an element, return the index.
If x doesn't match any element, return -1.
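The steps above can be sketched as (a minimal illustration; the function name is hypothetical):

```python
def linear_search(arr, x):
    """Sequential search: compare x with each element, left to right.
    Returns the index of the first match, or -1 if x is absent."""
    for i, item in enumerate(arr):
        if item == x:
            return i
    return -1
```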
Analysis of Sequential Search
The list of items is not ordered;
Best case: find the item in the first place, at the beginning
of the list. need only one comparison.
Worst case: not discover the item until the very last
comparison, the nth comparison.
Average case: find the item about halfway into the list.
The complexity of the sequential search: O(n).
Analysis of Sequential Search (Cont.)
If the items were ordered in some way;
Best case: discover that the item is not in the list by looking
at only one item.
Worst case: not discover the item until the very last
comparison
Average case: know after looking through only n/2 items.
The complexity of the sequential search: O(n).
Sequential Search Algorithm (Cont.)
For large collections of records that are searched repeatedly,
sequential search is unacceptably slow.
One way to reduce search time is to preprocess the records by sorting
them.
Given a sorted array, an obvious improvement is to test if the current
element in L is greater than K.
If it is, then we know that K cannot appear later in the array, and we
can quit the search early.
If we look first at position 1 in sorted array L and find that K is
bigger, then we rule out position 0 as well as position 1.
Linear search is rarely used in practice because other search
algorithms, such as binary search and hash tables, allow
significantly faster searching in comparison to linear search.
Binary Search Algorithm
Searching a sorted array by repeatedly dividing the
search interval in half.
Start with an interval covering the whole array.
If the search key is less than the item in the middle of the
interval, narrow the interval to the lower half.
Otherwise narrow it to the upper half. Repeatedly check
until the value is found or the interval is empty.
If we know nothing about the distribution of key values,
then binary search is the best algorithm available for
searching a sorted array.
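The interval-halving procedure described above can be sketched as:

```python
def binary_search(arr, key):
    """Binary search on a sorted array: repeatedly halve the interval."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == key:
            return mid
        elif key < arr[mid]:
            hi = mid - 1   # narrow to the lower half
        else:
            lo = mid + 1   # narrow to the upper half
    return -1              # interval is empty: not found
```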
Dictionary Search or interpolation
Search
The “computed” binary search is called a dictionary search or
interpolation search, where additional information about the
data is used to achieve better time complexity.
Binary search always probes the middle element of the interval.
Interpolation search may probe different locations according
to the value of the key being searched.
For example, if the value of the key is close to the last element,
interpolation search is likely to start searching toward the end.
In a dictionary search, we search L at a position p that is
appropriate to the value of K, as follows:
p = (K − L[0]) / (L[n-1] − L[0])
This equation computes the position of K as a fraction of
the distance between the smallest and largest key values.
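A possible Python sketch of this idea on numeric keys, using the proportional probe position described above (integer arithmetic and bounds checks are added; the function name is illustrative):

```python
def interpolation_search(arr, key):
    """Interpolation (dictionary) search on a sorted numeric array.
    The probe position is a fraction of the distance between the
    smallest and largest keys, rather than always the midpoint."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi and arr[lo] <= key <= arr[hi]:
        if arr[hi] == arr[lo]:   # all keys equal: avoid division by zero
            p = lo
        else:
            # Position proportional to where key lies in [arr[lo], arr[hi]].
            p = lo + (key - arr[lo]) * (hi - lo) // (arr[hi] - arr[lo])
        if arr[p] == key:
            return p
        elif arr[p] < key:
            lo = p + 1
        else:
            hi = p - 1
    return -1
```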
Quadratic Binary Search Algorithm
A variation on dictionary search is known as Quadratic
Binary Search (QBS)
In the quadratic binary search algorithm, first calculate the middle
element, the 1/4th element, and the 3/4th element.
Compare the key with the items in the middle, 1/4th, and
3/4th positions of the array.
Comparison Between Binary and
Quadratic Binary Search Algorithm
Is QBS better than binary search?
Theoretically, yes.
First we compare the running times: Θ(lg lg n) for QBS versus
Θ(lg n) for binary search.
Comparison Between Binary and
Quadratic Binary Search Algorithm (Cont.)
Let us look at the actual number of comparisons used.
For binary search, about lg n − 1 total comparisons are
required on average.
Quadratic binary search requires about 2.4 lg lg n
comparisons.
Self-Organizing Lists
Assume that we know, for each key ki, the probability pi that
the record with key ki will be requested.
Assume also that the list is ordered.
Search in the list will be done sequentially, beginning
with the first position.
Over the course of many searches, the expected number
of comparisons required for one search is:
C̄n = Σ_{i=0}^{n-1} (i + 1)·pi = 1·p0 + 2·p1 + ... + n·p_{n-1}
The cost to access the record in L[0] is 1 (because one key value is
looked at), and the probability of this occurring is p0.
The cost to access the record in L[1] is 2 (because we must look at the
first and the second records’ key values), with probability p1, and so
on.
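This expected cost can be computed directly from the access probabilities (a small sketch, not from the slides; the function name is illustrative):

```python
def expected_comparisons(probs):
    """Expected comparisons for sequential search in an ordered list,
    where probs[i] is the probability that the requested record is at
    position i (the probabilities should sum to 1)."""
    return sum((i + 1) * p for i, p in enumerate(probs))

# Placing frequently requested records near the front lowers the cost:
front_loaded = expected_comparisons([0.5, 0.25, 0.15, 0.1])  # about 1.85
back_loaded = expected_comparisons([0.1, 0.15, 0.25, 0.5])   # about 3.15
```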
Self-Organizing Lists (Cont.)
As an example of applying self-organizing lists, consider an
algorithm for compressing and transmitting messages. The list is self-
organized by the move-to-front rule. Transmission is in the form of
words and numbers, by the following rules:
If the word has been seen before, transmit the current position of the word in
the list. Move the word to the front of the list.
If the word is seen for the first time, transmit the word. Place the word at the
front of the list.
Both the sender and the receiver keep track of the position of words
in the list in the same way (using the move-to-front rule), so they
agree on the meaning of the numbers that encode repeated
occurrences of words.
“The car on the left hit the car I left.”
The entire transmission would be:
“The car on 3 left hit 3 5 I 5.”
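The move-to-front transmission rule can be sketched as follows (a minimal illustration; case and trailing punctuation are normalized so that the two occurrences of "The"/"the" match, as in the example):

```python
def mtf_encode(message):
    """Move-to-front encoding of a word sequence: a previously seen
    word is transmitted as its 1-based position in the list, then moved
    to the front; a new word is transmitted literally and placed at the
    front. Sender and receiver update the list identically."""
    words_seen = []
    out = []
    for raw in message.split():
        word = raw.strip(".,!?\"'").lower()   # normalize for matching
        if word in words_seen:
            out.append(str(words_seen.index(word) + 1))  # 1-based position
            words_seen.remove(word)
        else:
            out.append(word)
        words_seen.insert(0, word)            # move-to-front in both cases
    return " ".join(out)

encoded = mtf_encode("The car on the left hit the car I left.")
# -> "the car on 3 left hit 3 5 i 5"
```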
Ziv-Lempel coding
is a class of coding algorithms commonly used in file compression
utilities.
Bit Vectors for Representing Sets
Determining whether a value is a member of a particular set is a
special case of searching for keys in a sequence of records.
The set is represented using a bit array, with a bit position
allocated for each potential member.
Those members actually in the set store a value of 1 in their
corresponding bit; those members not in the set store a value of 0 in
their corresponding bit.
This representation scheme is called a bit vector or a bitmap.
Figure: The bit array for the set of primes in the range 0 to 15. The bit at
position i is set to 1 if and only if i is prime.
To compute the set of numbers between 0 and 15 that are both prime
and odd:
0011010100010100 & 0101010101010101 = ?
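The bitwise AND can be checked directly in Python (here the leftmost bit is position 0, matching the figure's left-to-right layout):

```python
# 16-bit vectors for the range 0..15; the leftmost bit is position 0.
primes = 0b0011010100010100   # 2, 3, 5, 7, 11, 13
odds   = 0b0101010101010101   # 1, 3, 5, 7, 9, 11, 13, 15
both = primes & odds          # members of both sets: the odd primes
print(format(both, "016b"))   # 0001010100010100 -> {3, 5, 7, 11, 13}
```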
Hashing
Hashing
The process of finding a record using some computation
to map its key value to a position in the array.
Hash function
The function that maps key values to positions is called a
hash function (h).
Hash Table
The array that holds the records (HT).
Slot
A position in the hash table
The number of slots in hash table HT will be denoted by
the variable M, with slots numbered from 0 to M-1.
Hashing (Cont.)
Applications
Is not suitable for applications where multiple records
with the same key value are permitted.
Is not suitable for answering range queries.
Is not suitable for finding the record with the minimum
or maximum key value.
Is most appropriate for answering exact-match queries.
Is suitable for both in-memory and disk-based
searching.
Is suitable for organizing large databases stored on
disk.
Hashing (Cont.)
Given a hash function h and two keys k1 and k2, if h(k1) = β = h(k2),
where β is a slot in the table, then we say that k1 and k2
have a collision at slot β under hash function h.
Finding a record with key value K in a database organized
by hashing follows a two-step procedure:
1. Compute the table location h(K).
2. Starting with slot h(K), locate the record containing
key K using (if necessary) a collision resolution policy.
Choosing a hash function
Main objectives
Choose an easy to compute hash function
Minimize number of collisions
Searching vs. Hashing
Searching methods: key comparisons
Time complexity: O(size) or O(log n)
Hashing methods: hash functions
Expected time: O(1)
Hash Table
A hash table is a data structure that is used to store
key/value pairs.
An array of fixed size
Uses a hash function to compute an index into an array
in which an element will be inserted or searched.
Mapping (hash function) h from key to index
Hash Table Operations
Insert
Delete
Search
Hash Function
If key range too large, use hash table with fewer buckets and a
hash function which maps multiple keys to same bucket:
h(k1) = β = h(k2): k1 and k2 have a collision at slot β
Popular hash functions: hashing by division
h(k) = k % D,
where D is the number of buckets in the hash table
Example: hash table with 12 buckets
h(k) = k%12
80 → 6 (80 % 12 = 6), 50 → 2, 64 → 4
62 → 2 → collision!
If the keys are strings, convert them into a numeric value.
With string keys, each ASCII character maps to a unique integer:
"label" = 108 + 97 + 98 + 101 + 108 = 512
Open Hashing (separate chaining)
One of the most commonly used collision resolution
techniques.
Implement using linked lists.
To store an element in the hash table you must insert it
into a specific linked list. If there is any collision (i.e. two
different elements have same hash value) then store both
the elements in the same linked list.
Collisions are stored outside the table (open hashing)
Data organized in linked lists
Hash table: array of pointers to the linked lists
The simplest form of open hashing defines each slot in
the hash table to be the head of a linked list.
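A minimal separate-chaining table in Python (a sketch, not from the slides: Python lists stand in for the linked lists, and the class name is illustrative):

```python
class ChainedHashTable:
    """Open hashing (separate chaining): each slot heads a list of the
    records whose keys hash there; colliding keys share one chain."""

    def __init__(self, m=10):
        self.m = m
        self.slots = [[] for _ in range(m)]   # one chain per slot

    def _h(self, key):
        return key % self.m                   # hash by division

    def insert(self, key, value):
        self.slots[self._h(key)].append((key, value))

    def search(self, key):
        for k, v in self.slots[self._h(key)]:  # scan only one chain
            if k == key:
                return v
        return None

    def delete(self, key):
        chain = self.slots[self._h(key)]
        self.slots[self._h(key)] = [(k, v) for k, v in chain if k != key]
```

Search, insert, and delete each touch only the single chain the key hashes to, which is why performance stays near O(1) while the chains stay short.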
Open Hashing (separate chaining)
(Cont.)
Better space utilization for large items.
Simple collision handling: searching linked list.
Overflow: we can store more items than the hash table size.
Deletion is quick and easy: deletion from the linked list.
Open Hashing (separate chaining)
(Cont.)
Figure: An illustration of open hashing for seven numbers stored in a ten-slot
hash table using the hash function h(K) = K mod 10. The numbers are inserted
in the order 9877, 2007, 1000, 9530, 3013, 9879, and 1057. Two of the values
hash to slot 0, one value hashes to slot 3, three of the values hash to slot 7,
and one value hashes to slot 9.
Closed Hashing (open addressing)
Closed hashing stores all records directly in the hash table. Each
record R with key value kR has a home position that is h(kR), the
slot computed by the hash function.
One implementation for closed hashing groups hash table slots
into buckets. The M slots of the hash table are divided into B
buckets.
The hash function assigns each record to the first slot within one
of the buckets.
If this slot is already occupied, then the bucket slots are searched
sequentially until an open slot is found.
If a bucket is entirely full, then the record is stored in an overflow
bucket of infinite capacity at the end of the table.
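Bucket insertion under these rules can be sketched as follows (the function name and the list-based overflow area are illustrative, not from the slides):

```python
def bucket_insert(table, key, B, slots_per_bucket, overflow):
    """Closed hashing with buckets: the table's M slots are grouped
    into B buckets; a record goes into the first free slot of its home
    bucket, or into an overflow area if the bucket is entirely full."""
    b = key % B                          # home bucket
    start = b * slots_per_bucket
    for i in range(start, start + slots_per_bucket):
        if table[i] is None:
            table[i] = key
            return i                     # stored in the home bucket
    overflow.append(key)                 # bucket full: overflow area
    return -1

# The figure's example: 5 buckets of 2 slots each, h(K) = K mod 5.
table, overflow = [None] * 10, []
for k in [9877, 2007, 1000, 9530, 3013, 9879, 1057]:
    bucket_insert(table, k, 5, 2, overflow)
# 1057 finds bucket 2 (slots 4-5) already full and lands in overflow.
```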
Closed Hashing (open addressing)
(Cont.)
Figure: An illustration of bucket hashing for seven numbers
stored in a five bucket hash table using the hash function h(K) =
K mod 5. Each bucket contains two slots. The numbers are
inserted in the order 9877, 2007, 1000, 9530, 3013, 9879, and
1057. Two of the values hash to bucket 0, three values hash to
bucket 2, one value hashes to bucket 3, and one value hashes
to bucket 4. Because bucket 2 cannot hold three values, the
third one ends up in the overflow bucket.
Next Week Lecture (Week 10)
Lecture 10: Indexing
Thank you!