Discussion Session Week 7: Database Indices

The document discusses database indexing and works through examples to determine when an index benefits a query. In Example 1, an index on SSN helps a query that searches on SSN, but an index on gender does not, because it would fetch half the rows. An index on city helps because only about 100 of the 4M rows match. In Example 2, indexes on the join columns perform worse than a sort-merge join. In Example 3, an index on sales.customer_id helps once there is a very selective filter on the customer table.

Uploaded by

Chochunder


Discussion Session Week 7

Database Indices

Example 1
Assume the table created by the following statement:
CREATE TABLE customer (
    id      SERIAL PRIMARY KEY,
    SSN     INTEGER NOT NULL UNIQUE,
    gender  VARCHAR(6) NOT NULL CHECK (gender = 'MALE' OR gender = 'FEMALE'),
    city    TEXT NOT NULL
);

Example 1
Assume the following information for the customer table:

Id     SSN    Gender   City
4M     4M     2        40K

The number below each field is the number of distinct values for that attribute. Since id is a primary key, the table itself has 4M tuples.

Example 1a
Consider the following prepared query:

SELECT *
FROM customer
WHERE SSN = ?

And the following information we have about the system (DA = disk access):
Disk page size: 4KB
Each tuple is 100 bytes (40 tuples per page)
Index lookup cost: 0 DA (index in main memory)
Page access cost: 1 DA
Tuples are not clustered
An index on ID always exists

Should we use an index on the SSN attribute to answer this query more efficiently?

Example 1a
Cost without index:
Only one tuple is in the answer, but we need to scan the entire relation to find it.
4M tuples / 40 tuples per page = 100k page accesses = 100k DA

Cost with index:
The index lookup finds the correct page right away.
T(customer)/V(customer, SSN) = 4M/4M = 1 DA
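The two costs above can be reproduced with a small calculation. This is only a sketch of the slide's cost model; the function names `scan_cost` and `unclustered_index_cost` are illustrative and not taken from the slides.

```python
# Cost model of Example 1a. All figures are the slide's assumptions.
PAGE_SIZE_BYTES = 4096                                   # 4KB disk page
TUPLE_SIZE_BYTES = 100
TUPLES_PER_PAGE = PAGE_SIZE_BYTES // TUPLE_SIZE_BYTES    # 40 whole tuples fit


def scan_cost(num_tuples: int) -> int:
    """Disk accesses (DA) needed to read every page of the relation."""
    return -(-num_tuples // TUPLES_PER_PAGE)             # ceiling division


def unclustered_index_cost(num_tuples: int, distinct_values: int) -> int:
    """DA for an equality lookup through an in-memory index.

    With unclustered tuples, each matching tuple costs one page access,
    and T / V tuples are expected to match.
    """
    return -(-num_tuples // distinct_values)


T_CUSTOMER = 4_000_000   # tuples in customer
V_SSN = 4_000_000        # distinct SSN values (SSN is UNIQUE)

print(scan_cost(T_CUSTOMER))                      # 100000
print(unclustered_index_cost(T_CUSTOMER, V_SSN))  # 1
```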

Example 1b
What about this prepared query:

SELECT *
FROM customer
WHERE Gender = ?

Assume the same information about the system.
Should we use an index on the Gender attribute to answer this query more efficiently?

Example 1b
Cost without index:
We still need to scan the entire relation.
100k page accesses = 100k DA

Cost with index (unclustered tuples):
Half the tuples are in the answer (50% chance for each tuple), and each one costs a page access.
T(customer)/V(customer, gender) = 4M/2 = 2M DA

Cost with index (clustered tuples):
Each page is accessed at most once, so the cost is at most the number of pages: #pagesAccessed = 100k DA.

Index beneficial: No, in all cases.
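A short sketch of the three figures on this slide follows. The "pages accessed at most once" case is modelled as the slide states it, namely as a cap at the total number of pages; the helper names are hypothetical.

```python
TUPLES_PER_PAGE = 40          # from the slide: 100-byte tuples, 4KB pages
T_CUSTOMER = 4_000_000        # tuples in customer
V_GENDER = 2                  # distinct gender values


def total_pages(num_tuples: int) -> int:
    """Number of pages the relation occupies (cost of a full scan in DA)."""
    return -(-num_tuples // TUPLES_PER_PAGE)


def index_cost_one_access_per_tuple(num_tuples: int, distinct: int) -> int:
    """Unclustered case: every matching tuple triggers its own page access."""
    return num_tuples // distinct


def index_cost_each_page_once(num_tuples: int, distinct: int) -> int:
    """Slide's clustered case: no page is read twice, so the cost is capped
    at the number of pages in the relation."""
    return min(num_tuples // distinct, total_pages(num_tuples))


scan = total_pages(T_CUSTOMER)
unclustered = index_cost_one_access_per_tuple(T_CUSTOMER, V_GENDER)
capped = index_cost_each_page_once(T_CUSTOMER, V_GENDER)

print(scan, unclustered, capped)   # 100000 2000000 100000
# Neither index figure is below the scan cost, matching "No, in all cases".
print(unclustered < scan or capped < scan)   # False
```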

Example 1c
What about this prepared query:

SELECT *
FROM customer
WHERE City = ?

Assume the same information about the system.
Should we use an index on the city attribute to answer this query more efficiently?

Example 1c
Cost without index:
Scan the entire relation.
100k page accesses = 100k DA

Cost with index:
About 100 tuples (1 in 40K) are in the answer.
T(customer)/V(customer, city) = 4M/40K = 100 DA

Index beneficial: Yes.
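Taken together, Examples 1a to 1c suggest a rule of thumb under this cost model: an unclustered index beats a full scan when T/V is smaller than the number of pages, i.e. when the attribute has more distinct values than there are tuples per page. The sketch below checks this for the three attributes; `index_beneficial` is an illustrative name, and the rule only holds under the slide's assumptions (in-memory index, uniform value distribution).

```python
TUPLES_PER_PAGE = 40
T_CUSTOMER = 4_000_000

# Distinct-value counts from the customer table statistics.
DISTINCT_VALUES = {"ssn": 4_000_000, "gender": 2, "city": 40_000}


def index_beneficial(num_tuples: int, distinct: int,
                     tuples_per_page: int = TUPLES_PER_PAGE) -> bool:
    """True when an unclustered index lookup costs fewer DA than a scan."""
    scan_cost = num_tuples // tuples_per_page    # read every page once
    index_cost = num_tuples // distinct          # one DA per matching tuple
    return index_cost < scan_cost


for attribute, distinct in DISTINCT_VALUES.items():
    print(attribute, index_beneficial(T_CUSTOMER, distinct))
# ssn True
# gender False
# city True
```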

Example 2
Assume we now have another relation:
CREATE TABLE sales (
    id           SERIAL PRIMARY KEY,
    customer_id  INTEGER NOT NULL REFERENCES customer(id),
    product      TEXT NOT NULL,
    amount       INTEGER NOT NULL CHECK (amount > 0 AND amount <= 4000)
);
Assume the company doesn't allow sales of more than 4000 products at a time.

Example 2
Assume the following information for the sales table:

ID     Customer_id   Product   Amount
40M    4M

Again, the number below each field is the number of distinct values for that attribute. Since customer_id is a foreign key, there cannot be more than 4M distinct values. There are 40M tuples in this table.

Example 2
Consider the following prepared query:

SELECT *
FROM customer AS C, sales AS S
WHERE C.id = S.customer_id

Recall the information we have about the system (DA = disk access):
Disk page size: 4KB
Each tuple is 100 bytes for both relations (40 tuples per page)
Index lookup cost: 0 DA (index in main memory)
Page access cost: 1 DA
Tuples are not clustered
An index on ID always exists
Page write cost: 1 DA

Should we use an index on the customer_id attribute of sales or the id attribute of customer to answer this query more efficiently?

Example 2
Recall the best alternative, the sort-merge join:
Recall from the lecture notes: given a join of tables R and S, sort-merge join takes only 2 reads of R and S and one write of the equivalent amount of data.
Recall: #pagesInCustomer = 100k, #pagesInSales = 1M.
Cost of sort-merge join:
#pageAccesses = (2 reads + 1 write) x (#pagesInCustomer + #pagesInSales)
= 3 x (100k + 1M)
= 3300k DA
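The sort-merge figure can be checked with a few lines. This is a sketch of the slide's simplified formula (two reads and one write of each relation), not a general sort-merge cost model; the function name is illustrative.

```python
TUPLES_PER_PAGE = 40


def sort_merge_join_cost(pages_r: int, pages_s: int,
                         reads: int = 2, writes: int = 1) -> int:
    """DA for a sort-merge join under the lecture's simplification:
    every page of both inputs is read twice and written once."""
    return (reads + writes) * (pages_r + pages_s)


pages_in_customer = 4_000_000 // TUPLES_PER_PAGE    # 100k pages
pages_in_sales = 40_000_000 // TUPLES_PER_PAGE      # 1M pages

print(sort_merge_join_cost(pages_in_customer, pages_in_sales))   # 3300000
```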

Example 2
Assume the index is on sales.customer_id:
For each tuple in customer, we join with the tuples from sales that match the customer tuple's id.
#pageAccesses = T(Customer) x (T(Sales)/V(Sales, customer_id))
= 4M x (40M / 4M) = 40M DA
Index on sales.customer_id is much worse than sort-merge join.

Example 2
Assume the index is on customer.id:
For each tuple in sales, we join with the tuple from customer that is referenced by the sales tuple's customer_id.
#pageAccesses = T(Sales) x (T(Customer)/V(Customer, id))
= 40M x (4M / 4M) = 40M DA
Index on customer.id is also worse than sort-merge join.
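Both index-based plans follow the same pattern: one probe per outer tuple, and each probe fetches T(inner)/V(inner, join attribute) tuples at one DA each. The sketch below applies that formula in both directions and compares the results with the 3300k DA sort-merge cost. The name `index_join_cost` is hypothetical, and the cost of reading the outer relation is ignored, as on the slides.

```python
T_CUSTOMER = 4_000_000
T_SALES = 40_000_000
V_SALES_CUSTOMER_ID = 4_000_000    # distinct customer_id values in sales
V_CUSTOMER_ID = 4_000_000          # customer.id is a primary key
SORT_MERGE_COST = 3_300_000        # DA, from the sort-merge slide


def index_join_cost(outer_tuples: int, inner_tuples: int,
                    inner_distinct: int) -> int:
    """DA for an index-based join: each outer tuple probes the in-memory
    index (0 DA) and then fetches its matching inner tuples (1 DA each)."""
    matches_per_probe = inner_tuples // inner_distinct
    return outer_tuples * matches_per_probe


via_sales_index = index_join_cost(T_CUSTOMER, T_SALES, V_SALES_CUSTOMER_ID)
via_customer_index = index_join_cost(T_SALES, T_CUSTOMER, V_CUSTOMER_ID)

print(via_sales_index)       # 40000000
print(via_customer_index)    # 40000000
print(min(via_sales_index, via_customer_index) > SORT_MERGE_COST)   # True
```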

Example 3
Now assume we slightly change the previous query by adding a selection:

SELECT *
FROM customer AS C, sales AS S
WHERE C.id = S.customer_id AND
      C.id = 12345 -- Yannis's ID

Important to know: selection will happen first on most database systems.
Should we use an index on the customer_id attribute of sales or the id attribute of customer to answer this query more efficiently?

Example 3
If we use the sort-merge join:
Recall: #pagesInSales = 1M.
Given that only one tuple from customer is selected and selection happens before the join, we have #pagesInCustomer = 1.
#pageAccesses = (2 reads + 1 write) x (#pagesInCustomer + #pagesInSales)
= 3 x (1 + 1M)
= 3M DA (approximately)

Example 3
Assume the index is on sales.customer_id:
There is only one tuple in customer after the selection.
#pageAccesses = T(Customer) x (T(Sales)/V(Sales, customer_id))
= 1 x (40M / 4M) = 10 DA
Index on sales.customer_id is much, much better than sort-merge join if we have a very selective selection.

Example 3
Assume the index is on customer.id:
There is no selection over sales, so every sales tuple still probes the index.
#pageAccesses = T(Sales) x (T(Customer)/V(Customer, id))
= 40M x (4M / 4M) = 40M DA
Index on customer.id is not affected by the selection on the customer table.
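The three plans of Example 3 can be put side by side. The sketch below recomputes each cost from the slide's formulas, with the selection reducing customer to a single tuple on a single page, and picks the cheapest plan. Plan labels and variable names are illustrative.

```python
TUPLES_PER_PAGE = 40
T_SALES = 40_000_000
T_CUSTOMER = 4_000_000
V_SALES_CUSTOMER_ID = 4_000_000
V_CUSTOMER_ID = 4_000_000

# After the selection C.id = 12345, one customer tuple (one page) remains.
selected_customer_tuples = 1
selected_customer_pages = 1
pages_in_sales = T_SALES // TUPLES_PER_PAGE          # 1M pages

plans = {
    # 2 reads + 1 write of every input page
    "sort-merge join": 3 * (selected_customer_pages + pages_in_sales),
    # one probe per selected customer tuple, 10 matching sales tuples each
    "index on sales.customer_id":
        selected_customer_tuples * (T_SALES // V_SALES_CUSTOMER_ID),
    # sales is the outer relation and is not filtered, so nothing changes
    "index on customer.id": T_SALES * (T_CUSTOMER // V_CUSTOMER_ID),
}

for name, cost in plans.items():
    print(f"{name}: {cost} DA")

best_plan = min(plans, key=plans.get)
print(best_plan)   # index on sales.customer_id
```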
