4.6 Algorithms For Select and Join Operations

Uploaded by

Medha Harini

II IV

CS8492
DATABASE MANAGEMENT SYSTEMS
(Common to CSE & IT)

UNIT NO. 4

4.6 QUERY PROCESSING OVERVIEW,

ALGORITHMS FOR SELECT AND JOIN
OPERATIONS

Version: 1.0

Algorithms for SELECT and JOIN Operations

Implementing the SELECT Operation

There are many algorithms for executing a SELECT operation, which is basically a search
operation to locate the records in a disk file that satisfy a certain condition. Some of the search
algorithms depend on the file having specific access paths, and they may apply only to certain types
of selection conditions. We discuss some of the algorithms for implementing SELECT in this
section. We will use the following operations, specified on the relational database in Figure 3.5, to
illustrate our discussion:

OP1: σSsn = ‘123456789’ (EMPLOYEE)
OP2: σDnumber > 5 (DEPARTMENT)
OP3: σDno = 5 (EMPLOYEE)
OP4: σDno = 5 AND Salary > 30000 AND Sex = ‘F’ (EMPLOYEE)
OP5: σEssn = ‘123456789’ AND Pno = 10 (WORKS_ON)

Search Methods for Simple Selection. A number of search algorithms are possible for
selecting records from a file. These are also known as file scans, because they scan the records of a
file to search for and retrieve records that satisfy a selection condition. If the search algorithm
involves the use of an index, the index search is called an index scan. The following search methods
(S1 through S6) are examples of some of the search algorithms that can be used to implement a
select operation:

S1—Linear search (brute force algorithm). Retrieve every record in the file, and test whether
its attribute values satisfy the selection condition. Since the records are grouped into disk blocks,
each disk block is read into a main memory buffer, and then a search through the records within the
disk block is conducted in main memory.

S2—Binary search. If the selection condition involves an equality comparison on a key
attribute on which the file is ordered, binary search, which is more efficient than linear search, can
be used. An example is OP1 if Ssn is the ordering attribute for the EMPLOYEE file.

S3a—Using a primary index. If the selection condition involves an equality comparison on a
key attribute with a primary index—for example, Ssn = ‘123456789’ in OP1—use the primary
index to retrieve the record. Note that this condition retrieves a single record (at most).

S3b—Using a hash key. If the selection condition involves an equality comparison on a key
attribute with a hash key (for example, Ssn = ‘123456789’ in OP1), use the hash key to retrieve
the record. Note that this condition retrieves a single record (at most).

S4—Using a primary index to retrieve multiple records. If the comparison condition is >,
>=, <, or <= on a key field with a primary index (for example, Dnumber > 5 in OP2), use the index
to find the record satisfying the corresponding equality condition (Dnumber = 5), then retrieve all
subsequent records in the (ordered) file. For the condition Dnumber < 5, retrieve all the preceding
records.

S5—Using a clustering index to retrieve multiple records. If the selection condition involves
an equality comparison on a nonkey attribute with a clustering index—for example, Dno = 5 in
OP3—use the index to retrieve all the records satisfying the condition.

S6—Using a secondary (B+-tree) index on an equality comparison. This search method can
be used to retrieve a single record if the indexing field is a key (has unique values) or to retrieve
multiple records if the indexing field is not a key. This can also be used for comparisons involving
>, >=, <, or <=.

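Methods S1 and S2 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the "file" is an in-memory list, the sample rows and the blocking factor of two records per block are made up, and the names `linear_search` and `binary_search` are hypothetical. A real DBMS would binary-search over disk blocks rather than individual records.

```python
# Made-up sample rows; the list is kept sorted on Ssn (the ordering key).
EMPLOYEE = [
    {"Ssn": "123456789", "Dno": 5},
    {"Ssn": "333445555", "Dno": 5},
    {"Ssn": "453453453", "Dno": 5},
    {"Ssn": "987654321", "Dno": 4},
]
# Assumed blocking factor: 2 records per block.
BLOCKS = [EMPLOYEE[0:2], EMPLOYEE[2:4]]


def linear_search(blocks, predicate):
    """S1: read every block and test every record against the condition."""
    matches = []
    for block in blocks:            # one block access per iteration
        for record in block:        # search within the block in memory
            if predicate(record):
                matches.append(record)
    return matches


def binary_search(sorted_records, key_field, value):
    """S2: equality on the ordering key; returns at most one record."""
    low, high = 0, len(sorted_records) - 1
    while low <= high:
        mid = (low + high) // 2
        key = sorted_records[mid][key_field]
        if key == value:
            return sorted_records[mid]
        if key < value:
            low = mid + 1
        else:
            high = mid - 1
    return None


# OP3 (Dno = 5) has no ordering to exploit here, so it uses the linear scan.
op3 = linear_search(BLOCKS, lambda r: r["Dno"] == 5)
# OP1 (Ssn = '123456789') can use binary search because the file is ordered on Ssn.
op1 = binary_search(EMPLOYEE, "Ssn", "123456789")
```

The linear scan touches every block regardless of how many records match, whereas the binary search halves the remaining range at each step.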
In Section 19.8, we discuss how to develop formulas that estimate the access cost of these
search methods in terms of the number of block accesses and access time. Method S1 (linear
search) applies to any file, but all the other methods depend on having the appropriate access path on
the attribute used in the selection condition. Method S2 (binary search) requires the file to be sorted
on the search attribute. The methods that use an index (S3a, S4, S5, and S6) are generally referred to
as index searches, and they require the appropriate index to exist on the search attribute. Methods S4
and S6 can be used to retrieve records in a certain range—for example, 30000 <= Salary <= 35000.
Queries involving such conditions are called range queries.
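A range query over an ordered index can be sketched as follows. The sketch assumes the index is a pair of parallel in-memory lists (sorted key values and the matching record ids); the salary values, record ids, and the function name `range_scan` are made up for illustration and do not come from the text.

```python
from bisect import bisect_left, bisect_right

# Hypothetical index on Salary: sorted key values with parallel record ids.
SALARY_KEYS = [25000, 30000, 32000, 35000, 40000, 43000]
SALARY_RIDS = [2, 0, 5, 4, 1, 3]


def range_scan(index_keys, index_rids, low, high):
    """Return the record ids whose key k satisfies low <= k <= high."""
    start = bisect_left(index_keys, low)    # first entry with key >= low
    end = bisect_right(index_keys, high)    # one past the last entry with key <= high
    return index_rids[start:end]


# 30000 <= Salary <= 35000
rids = range_scan(SALARY_KEYS, SALARY_RIDS, 30000, 35000)
```

The two binary searches locate the boundaries of the range, and every entry between them qualifies, which mirrors "find the boundary record, then read the subsequent records" in method S4.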

Search Methods for Complex Selection. If a condition of a SELECT operation is a conjunctive
condition (that is, if it is made up of several simple conditions connected with the AND logical
connective, such as OP4 above), the DBMS can use the following additional methods to implement
the operation:

S7—Conjunctive selection using an individual index. If an attribute involved in any single
simple condition in the conjunctive select condition has an access path that permits the use of one of
the methods S2 to S6, use that condition to retrieve the records and then check whether each
retrieved record satisfies the remaining simple conditions in the conjunctive select condition.

S8—Conjunctive selection using a composite index. If two or more attributes are involved in
equality conditions in the conjunctive select condition and a composite index (or hash structure)
exists on the combined fields, we can use the index directly. An example is OP5 when an index has
been created on the composite key (Essn, Pno) of the WORKS_ON file.

S9—Conjunctive selection by intersection of record pointers. If secondary indexes (or
other access paths) are available on more than one of the fields involved in simple conditions in the
conjunctive select condition, and if the indexes include record pointers (rather than block pointers),
then each index can be used to retrieve the set of record pointers that satisfy the individual
condition. The intersection of these sets of record pointers gives the record pointers that satisfy the
conjunctive select condition, which are then used to retrieve those records directly. If only some of
the conditions have secondary indexes, each retrieved record is further tested to determine whether it
satisfies the remaining conditions. In general, method S9 assumes that each of the indexes is on a
nonkey field of the file, because if one of the conditions is an equality condition on a key field, only
one record will satisfy the whole condition.
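The intersection step of method S9 can be sketched as below, applied to OP4. This is a simplified illustration, not a description of any particular DBMS: record pointers are modeled as list positions, the secondary indexes are plain dictionaries, the sample rows are made up, and `build_secondary_index` and `select_by_intersection` are hypothetical names.

```python
from collections import defaultdict

# Made-up sample rows; a record's list position stands in for its record pointer.
EMPLOYEE = [
    {"Ssn": "111", "Dno": 5, "Salary": 30000, "Sex": "M"},
    {"Ssn": "222", "Dno": 5, "Salary": 40000, "Sex": "F"},
    {"Ssn": "333", "Dno": 5, "Salary": 25000, "Sex": "F"},
    {"Ssn": "444", "Dno": 4, "Salary": 43000, "Sex": "F"},
]


def build_secondary_index(records, field):
    """Map each value of `field` to the set of record pointers holding it."""
    index = defaultdict(set)
    for rid, record in enumerate(records):
        index[record[field]].add(rid)
    return index


def select_by_intersection(records, indexes, equalities, residual=lambda r: True):
    """Intersect the pointer sets from each index, then fetch and re-check.

    `equalities` maps an indexed field to the value it must equal;
    `residual` tests the simple conditions that have no index.
    """
    rid_sets = [indexes[f].get(v, set()) for f, v in equalities.items()]
    rids = set.intersection(*rid_sets) if rid_sets else set(range(len(records)))
    return [records[rid] for rid in sorted(rids) if residual(records[rid])]


# Assume secondary indexes exist on Dno and Sex, but not on Salary.
indexes = {
    "Dno": build_secondary_index(EMPLOYEE, "Dno"),
    "Sex": build_secondary_index(EMPLOYEE, "Sex"),
}
# OP4: Dno = 5 AND Salary > 30000 AND Sex = 'F'
op4 = select_by_intersection(
    EMPLOYEE,
    indexes,
    {"Dno": 5, "Sex": "F"},
    residual=lambda r: r["Salary"] > 30000,
)
```

Only the records whose pointers survive the intersection are fetched, and the unindexed Salary condition is tested on those records alone.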

Whenever a single condition specifies the selection (such as OP1, OP2, or OP3), the
DBMS can only check whether or not an access path exists on the attribute involved in that
condition. If an access path (such as index or hash key or sorted file) exists, the method
corresponding to that access path is used; otherwise, the brute force, linear search approach of
method S1 can be used. Query optimization for a SELECT operation is needed mostly for
conjunctive select conditions whenever more than one of the attributes involved in the conditions
have an access path. The optimizer should choose the access path that retrieves the fewest records in
the most efficient way by estimating the different costs (see Section 19.8) and choosing the method
with the least estimated cost.

Selectivity of a Condition. When the optimizer is choosing between multiple simple
conditions in a conjunctive select condition, it typically considers the selectivity of each condition.
The selectivity (sl) is defined as the ratio of the number of records (tuples) that satisfy the condition
to the total number of records (tuples) in the file (relation), and thus is a number between zero and
one. Zero selectivity means none of the records in the file satisfies the selection condition, and a
selectivity of one means that all the records in the file satisfy the condition. In general, the
selectivity will not be either of these two extremes, but will be a fraction that estimates the
percentage of file records that will be retrieved.

Although exact selectivities of all conditions may not be available, estimates of selectivities
are often kept in the DBMS catalog and are used by the optimizer. For example, for an equality
condition on a key attribute of relation r(R), sl = 1/|r(R)|, where |r(R)| is the number of tuples in
relation r(R). For an equality condition on a nonkey attribute with i distinct values, sl can be
estimated by (|r(R)|/i)/|r(R)| or 1/i, assuming that the records are evenly or uniformly distributed
among the distinct values. Under this assumption, |r(R)|/i records will satisfy an equality condition
on this attribute. In general, the number of records satisfying a selection condition with selectivity sl
is estimated to be |r(R)| * sl. The smaller this estimate is, the higher the desirability of using that
condition first to retrieve records. In certain cases, the actual distribution of records among the
various distinct values of the attribute is kept by the DBMS in the form of a histogram, in order to
get more accurate estimates of the number of records that satisfy a particular condition.
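The estimates above reduce to simple arithmetic, sketched here in Python. The relation size of 10,000 tuples and the distinct-value counts are made-up figures, and the function names are hypothetical; the formulas are the uniform-distribution estimates stated in the text.

```python
def equality_selectivity(num_records, num_distinct, is_key=False):
    """Estimated selectivity of an equality condition.

    Key attribute: 1/|r(R)|. Nonkey attribute with i distinct values: 1/i,
    assuming records are uniformly distributed among the distinct values.
    """
    if is_key:
        return 1.0 / num_records
    return 1.0 / num_distinct


def estimated_matches(num_records, selectivity):
    """Estimated number of qualifying records: |r(R)| * sl."""
    return num_records * selectivity


def pick_first_condition(selectivities):
    """Return the condition with the smallest estimated selectivity."""
    return min(selectivities, key=selectivities.get)


NUM_RECORDS = 10_000  # assumed |r(R)|
selectivities = {
    "Ssn = '123456789'": equality_selectivity(NUM_RECORDS, NUM_RECORDS, is_key=True),
    "Dno = 5": equality_selectivity(NUM_RECORDS, 50),   # assumed 50 departments
    "Sex = 'F'": equality_selectivity(NUM_RECORDS, 2),  # assumed 2 distinct values
}
best = pick_first_condition(selectivities)
```

Under these assumed figures, the key condition is estimated to return one record, the Dno condition about 200, and the Sex condition about 5,000, so the optimizer would prefer to retrieve by the key condition first.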

Disjunctive Selection Conditions. Compared to a conjunctive selection condition, a
disjunctive condition (where simple conditions are connected by the OR logical connective rather
than by AND) is much harder to process and optimize. For example, consider OP4′:

OP4′: σDno = 5 OR Salary > 30000 OR Sex = ‘F’ (EMPLOYEE)

With such a condition, little optimization can be done, because the records satisfying the
disjunctive condition are the union of the records satisfying the individual conditions. Hence, if any
one of the conditions does not have an access path, we are compelled to use the brute force, linear
search approach. Only if an access path exists on every simple condition in the disjunction can we
optimize the selection by retrieving the records satisfying each condition (or their record ids) and
then applying the union operation to eliminate duplicates.
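The two cases for a disjunctive condition can be sketched as follows. The sketch assumes each simple condition is given as a pair of a predicate and an optional lookup function that stands in for an access path and returns a set of record ids; the sample rows and the name `disjunctive_select` are made up for illustration.

```python
# Made-up sample rows; list positions stand in for record ids.
EMPLOYEE = [
    {"Ssn": "111", "Dno": 5, "Salary": 30000, "Sex": "M"},
    {"Ssn": "222", "Dno": 5, "Salary": 40000, "Sex": "F"},
    {"Ssn": "333", "Dno": 5, "Salary": 25000, "Sex": "F"},
    {"Ssn": "444", "Dno": 4, "Salary": 43000, "Sex": "F"},
    {"Ssn": "555", "Dno": 1, "Salary": 20000, "Sex": "M"},
]


def disjunctive_select(records, conditions):
    """Evaluate cond1 OR cond2 OR ... over `records`.

    `conditions` is a list of (predicate, lookup) pairs. `lookup` returns the
    set of record ids satisfying the condition via an access path, or is None
    when no access path exists for that condition.
    """
    if any(lookup is None for _, lookup in conditions):
        # At least one condition lacks an access path: fall back to a linear scan.
        return [r for r in records if any(pred(r) for pred, _ in conditions)]
    rids = set()
    for _, lookup in conditions:
        rids |= lookup()    # set union eliminates duplicate record ids
    return [records[rid] for rid in sorted(rids)]


def ids_where(pred):
    """Stand-in for an index lookup: the record ids satisfying `pred`."""
    return lambda: {i for i, r in enumerate(EMPLOYEE) if pred(r)}


def dno_is_5(r):
    return r["Dno"] == 5


def salary_over_30000(r):
    return r["Salary"] > 30000


def sex_is_f(r):
    return r["Sex"] == "F"


# Every condition has an access path: union of record ids.
all_indexed = disjunctive_select(EMPLOYEE, [
    (dno_is_5, ids_where(dno_is_5)),
    (salary_over_30000, ids_where(salary_over_30000)),
    (sex_is_f, ids_where(sex_is_f)),
])
# The Salary condition has no access path: linear scan of the whole file.
one_missing = disjunctive_select(EMPLOYEE, [
    (dno_is_5, ids_where(dno_is_5)),
    (salary_over_30000, None),
    (sex_is_f, ids_where(sex_is_f)),
])
```

Both calls return the same records; the difference is that the second one must read the entire file because a single unindexed condition could match any record.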

A DBMS will have available many of the methods discussed above, and typically many
additional methods. The query optimizer must choose the appropriate one for executing each
SELECT operation in a query. This optimization uses formulas that estimate the costs for each
available access method, as we will discuss in Section 19.8. The optimizer chooses the access
method with the lowest estimated cost.