CS DBMS 8

The document discusses different types of indexes that can be created in a database to optimize query performance. It explains how regular indexes and composite indexes work to reduce disk access by sorting or grouping related data. The document also covers full text indexes and how they allow more advanced search capabilities through keywords and boolean logic.

Uploaded by

Prabhat


DBMS Lecture 8- Indexing:

Introduction:

As we all know, a database stores its data on disk.

Let's consider a students table with the columns id, name, batch_id.

Suppose we run a simple query: select * from students where batch_id=2;

Internally, the data is first brought into RAM, and then the query runs on it.

- The CPU cannot directly talk to disk.
- The OS cannot directly operate on disk.
- Content from disk is brought into memory, and then the application works on that data.

If the CPU were to work directly on disk, it would waste a lot of its time, because disk is very slow compared to the CPU. Instead, the CPU makes an I/O call to bring data into memory, and while the data is being brought in, it does something else.

Let's say we have a CPU. The cache sits almost directly on the CPU and is small in size; because it is close to the CPU, it is very fast. Then comes RAM, which is volatile and a bit farther from the CPU than the cache. Then comes the disk, which is persistent storage; disks tend to be larger but are not as fast as RAM.

16GB of disk or 16GB of RAM, which is cheaper? Disk is cheaper.

Disk working:

By the nature of how a disk works, it will be slower: we have to rotate the disk and move a pointer to the correct place where the data is stored, then read the data, and then send it to the CPU.

When you read from disk, it gives you the data of a complete section, so if related things are stored close together, you get the data faster.

A table is stored on disk sorted by primary key.

And if I write a query,

select *
from students
where batch_id=2;

The very first thing that happens is that the query executor goes to the disk and reads the first row: it brings row1 into memory and checks whether it matches the condition. Then it brings row2 into memory and checks the condition, then row3, and so on.

If many students do not have batch_id=2, there will be many disk accesses that are slow and give me no data. Let's say there are 1M rows and only 20 students with batch_id=2. How many disk accesses are wasted? 1M-20 of them, and we waste a lot of memory and time on them.
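The row-by-row scan above can be sketched in a few lines. This is only a toy model, assuming each row read costs one disk access; the generated data and the names `full_scan` and `students` are made up for illustration.

```python
def full_scan(rows, batch_id):
    """Scan every row without an index; return (matching ids, disk accesses)."""
    accesses = 0
    matches = []
    for row_id, row_batch in rows:      # every row is brought into memory
        accesses += 1
        if row_batch == batch_id:
            matches.append(row_id)
    return matches, accesses


# Hypothetical data: 1,000,000 students, only the first 20 are in batch 2.
students = ((i, 2 if i <= 20 else 1) for i in range(1, 1_000_001))

matches, accesses = full_scan(students, batch_id=2)
wasted = accesses - len(matches)
print(len(matches), accesses, wasted)   # 20 1000000 999980
```

Every access that does not produce a matching row is counted as wasted, which is the 1M-20 figure from the text.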

How do indexes work?

select *
from students
where id=3;

By default, data is sorted by primary key. Let's say I start reading the disk from the first row. I bring the first row into memory and check it: not relevant. Then the next row: not relevant. Then the third row: it is relevant to me. Will I go on to the 4th row? No. The filter in the query is on the primary key (id is 3), and once I find id 3 in my table, nothing after it can have id=3.

In the previous example, we were filtering on a non-primary-key column, so a match could be anywhere in the table and we had to check the whole table.

Now we can use this idea. All of the rows in this students table are stored on disk, so each row has an address. Suppose someone gives us, in memory, the address of each id. Then how will this query work?
select *
from students
where id=3;

1. Find the address of the id from memory.

2. Go to the address and fetch data.
3. Done.

In this case, you don't even have to access the first 2 rows. I know the address, so I can go straight to it: just 1 disk access and it's done.

Let's say there are 1M rows and I need to get the student with id=15.

Case1: No sorting and no table. Worst case is 1M rows to access.

Case2: Have sorting but no table. Then 15 rows to access.

Case3: Have sorting and have table. Then only 1.
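The three cases can be compared with a small counting sketch. Everything here is an assumption for illustration: the `Disk` class is a toy in which the row at address a has id a + 1, and the in-memory address table is modeled by a function instead of a real 1M-entry table.

```python
class Disk:
    """Toy disk holding rows sorted by id; each read() is one disk access."""

    def __init__(self, n_rows):
        self.n_rows = n_rows
        self.accesses = 0

    def read(self, address):
        self.accesses += 1
        return address + 1              # the row at `address` has id address + 1


def scan_everything(disk, target):
    """Case 1: no ordering is assumed, so in the worst case every row is read."""
    found = None
    for address in range(disk.n_rows):
        if disk.read(address) == target:
            found = address
    return found


def scan_until_found(disk, target):
    """Case 2: rows are sorted by id, so stop as soon as the id is reached."""
    for address in range(disk.n_rows):
        if disk.read(address) >= target:
            return address
    return None


def lookup_then_fetch(disk, target):
    """Case 3: an in-memory table gives the address; one read fetches the row."""
    address = target - 1                # stands in for the in-memory address table
    disk.read(address)
    return address


def count_accesses(strategy, target, n_rows=1_000_000):
    disk = Disk(n_rows)
    strategy(disk, target)
    return disk.accesses


case1 = count_accesses(scan_everything, 15)
case2 = count_accesses(scan_until_found, 15)
case3 = count_accesses(lookup_then_fetch, 15)
print(case1, case2, case3)              # 1000000 15 1
```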

Here I am optimizing the number of disk accesses. This is where indexes come into the picture: can we use this idea to speed up the queries we have?

Let's say I have students table,

select *
from students
where `p-No` = 44667;

Can I apply a similar operation? Can I sort this data on the disk by p-No? Not really, since the data is already sorted on the disk by the primary key.

V1: Let me create a separate table, and this table is sorted by p-No.

So,
1. Find the address of the p-No from memory.
2. Go to the address and fetch data.
3. Done.

How will I get p-No 4 from the table? One option is a HashMap.

Since the table is sorted by p-No, I can also binary search it in memory to find p-No 5, then go to that address and do the work.
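A minimal sketch of the binary search step, assuming the helper table is a list of (p_no, disk_address) pairs kept sorted by p_no. The sample numbers and the name `find_address` are invented for illustration.

```python
from bisect import bisect_left

# Hypothetical helper table: (p_no, disk_address) pairs, sorted by p_no.
index = sorted([(44667, 3), (10234, 0), (98765, 1), (55501, 2), (30011, 4)])
keys = [p_no for p_no, _ in index]


def find_address(p_no):
    """Binary search the in-memory table; return the disk address or None."""
    pos = bisect_left(keys, p_no)
    if pos < len(keys) and keys[pos] == p_no:
        return index[pos][1]
    return None


print(find_address(44667))   # 3
print(find_address(11111))   # None
```

The search touches only memory; the single disk access happens afterwards, at the returned address.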

V2:

Select *
from students
where batch_id=3;

What is different between V2 and V1? Something is fundamentally different: there can be multiple students with batch_id=3, not just 1 single row. Say the disk is divided into 2 sections. Now, to find all the students with batch_id=3:
1. Go to the table in memory and find all the disk addresses for batch_id=3.
2. Fetch only those sections from disk.

Will I still get rows that don't have batch_id=3? Yes, because a whole section is read at once. But I will not access sections that don't contain batch_id=3, so I have saved the irrelevant disk accesses.
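The V2 idea can be sketched as follows, assuming the in-memory table maps each batch_id to the set of disk sections that contain it. The section layout and all names are hypothetical.

```python
from collections import defaultdict

# Hypothetical disk: section number -> rows (id, batch_id) stored in that section.
disk_sections = {
    0: [(1, 3), (2, 1), (3, 3)],
    1: [(4, 2), (5, 2), (6, 1)],
    2: [(7, 1), (8, 3), (9, 2)],
    3: [(10, 1), (11, 2), (12, 1)],
}

# In-memory table: batch_id -> sections holding at least one row of that batch.
batch_index = defaultdict(set)
for section, rows in disk_sections.items():
    for _row_id, batch_id in rows:
        batch_index[batch_id].add(section)


def students_in_batch(batch_id):
    """Read only the sections listed in the table; filter rows in memory."""
    sections_read = 0
    result = []
    for section in sorted(batch_index.get(batch_id, ())):
        sections_read += 1              # the whole section comes into memory
        result.extend(i for i, b in disk_sections[section] if b == batch_id)
    return result, sections_read


print(students_in_batch(3))   # ([1, 3, 8], 2)
```

Sections 1 and 3 are never read, even though the two sections that are read still contain a few rows from other batches.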

V3:

select *
from students
where psp between 60 and 80;

Case1: Don't have any helper table.
1. Go through every section.
2. Fetch that section into memory.
3. Compare the condition.

Case2: Have a table keyed by psp.

Is a table with a row for every possible psp value going to be good? No. I will only have rows for integer psp values, and for each one I store the addresses. Let's say, for example, rows with a psp of 0 are at addresses 1, 4, 7, 8.

If I have the query BETWEEN 10 AND 35: first I go to psp 10 in the table, then take the entry for 20, then the entry for 30, and stop once the psp goes above 35. These helper tables are called index tables, or simply indexes.
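A range lookup over such a table can be sketched like this, assuming the keys are kept sorted so that the start and end of the range can be found by binary search. Only the psp 0 entry comes from the text; the other entries and the name `addresses_between` are made up.

```python
from bisect import bisect_left, bisect_right

# Hypothetical index table: integer psp -> disk addresses.
psp_index = {0: [1, 4, 7, 8], 10: [2], 20: [3, 9], 30: [5], 40: [6], 60: [10]}
sorted_keys = sorted(psp_index)


def addresses_between(low, high):
    """Collect the addresses for every key in [low, high], walking keys in order."""
    start = bisect_left(sorted_keys, low)     # first key >= low
    end = bisect_right(sorted_keys, high)     # one past the last key <= high
    addresses = []
    for key in sorted_keys[start:end]:
        addresses.extend(psp_index[key])
    return addresses


print(addresses_between(10, 35))   # [2, 3, 9, 5]
```

Because the keys are sorted, the walk visits 10, 20, and 30 and stops before 40 without looking at any other entry.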

Purpose of Index: To prevent unnecessary disk fetches which leads to faster queries.

When does an index need to be updated?
- When there is a Create/Update/Delete on the table, the index needs to be updated as well.
- The index needs to be updated whenever any write happens on a table.
- Having indexes will slow down writes but make reads faster.
- The index also gets stored on disk => the storage requirement of the DB will increase.
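To see why writes get slower, here is a rough sketch of a table that keeps one index on batch_id. The class `IndexedTable` and its counters are purely illustrative, not how any real database engine is implemented.

```python
from bisect import insort


class IndexedTable:
    """Toy table with one index on batch_id; counts the extra index writes."""

    def __init__(self):
        self.rows = {}            # address -> row
        self.index = []           # sorted list of (batch_id, address)
        self.index_writes = 0
        self.next_address = 0

    def insert(self, row):
        address = self.next_address
        self.next_address += 1
        self.rows[address] = row
        insort(self.index, (row["batch_id"], address))   # keep index sorted
        self.index_writes += 1
        return address

    def update_batch(self, address, new_batch_id):
        old_batch_id = self.rows[address]["batch_id"]
        self.index.remove((old_batch_id, address))       # drop the stale entry
        insort(self.index, (new_batch_id, address))
        self.rows[address]["batch_id"] = new_batch_id
        self.index_writes += 1

    def delete(self, address):
        row = self.rows.pop(address)
        self.index.remove((row["batch_id"], address))
        self.index_writes += 1


table = IndexedTable()
a = table.insert({"id": 1, "batch_id": 2})
b = table.insert({"id": 2, "batch_id": 1})
table.update_batch(a, 3)
table.delete(b)
print(table.index, table.index_writes)   # [(3, 0)] 4
```

Each of the four writes to the table also had to touch the index, which is the extra work the bullets above describe.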

An index table is something like a set of <Key, Value> pairs. Key => column value, Value => address on disk. The index table is also sorted by the column.

Data Structure:
- Sorted by key.
- It allows fast access by key.

DS for indexes: BTree / B+Tree.

Difference between a BTree and a normal binary tree: in a normal binary tree, each node has at most 2 children. In a BTree, a node can have many children, so the height of the tree is reduced, which means fewer nodes are visited per lookup and the query becomes faster.
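As a rough back-of-the-envelope check of the height claim, the sketch below counts how many levels are needed before a tree with a given number of children per node can reach 1M keys. The fanout of 100 is an assumed, illustrative value; real BTree fanouts depend on page and key sizes.

```python
def min_levels(n_keys, fanout):
    """Smallest number of levels h such that fanout**h >= n_keys."""
    levels, capacity = 0, 1
    while capacity < n_keys:
        capacity *= fanout
        levels += 1
    return levels


print(min_levels(1_000_000, 2))     # 20
print(min_levels(1_000_000, 100))   # 3
```

With 2 children per node a lookup may pass through about 20 levels, while with 100 children per node about 3 levels are enough.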

When to create indexes?
- Don't create indexes when you create the table.
- Create indexes when you have a query that needs to be sped up.
- Create indexes based on actual access patterns, not on predictions.
- Do performance testing => see how it's going to affect your writes and what the performance impact is.

So, Indexes make reads faster by reducing the number of irrelevant disk accesses.

In MySQL Workbench:

Explain select *
from sql_store.customers
where points=2273;

The EXPLAIN command tells you what is going to happen behind the scenes when you execute this query. After running this, the rows column shows roughly how many rows MySQL expects to examine to execute this particular query: 22

So let's create an index on points:

create index {idx_tablename_columnname} on {table_name}({column_name});

create index idx_customers_points on sql_store.customers(points);

Now again run this:

Explain select *
from sql_store.customers
where points=2273;

Now, rows will be just 1.

See all the indexes: show indexes in sql_store.customers; By default, a table is always indexed on its primary key.
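The same before/after experiment can be tried without a MySQL server. The sketch below is an assumption-heavy stand-in: it uses Python's built-in sqlite3 module, so the engine is SQLite and not MySQL, the command is SQLite's EXPLAIN QUERY PLAN instead of MySQL's EXPLAIN, and the customers table and its data are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table customers (customer_id integer primary key, points integer)"
)
# Hypothetical data: 1000 customers with points 0, 5, 10, ...
conn.executemany(
    "insert into customers (points) values (?)",
    [(p,) for p in range(0, 5000, 5)],
)

QUERY = "select * from customers where points = 2273"


def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[-1] for row in conn.execute("explain query plan " + sql)]


before = plan(QUERY)     # no index on points yet: a scan of the table
conn.execute("create index idx_customers_points on customers(points)")
after = plan(QUERY)      # the plan now mentions idx_customers_points

print(before)
print(after)
```

The exact wording of the plan differs between SQLite versions, but the first plan reports a scan of the table and the second reports a search using the new index.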

Q- Find a customer with a particular address. Now, I will create an index on the address column.

But is that really going to be advantageous to us? Not really. 2 customers are not going to have the same address, so the index table ends up with 1 row per address, each holding the full address string, and the size of the index becomes huge.

In Scaler's codebase,
select *
from students
where name="naman";

1. Create an index on the students' name column. The index table is huge.

Now, if I had indexed just the first character, 'N'=[2,5,8,9], then the index would have at most 26 entries, but there would again be a lot of irrelevant accesses.

Now, index on 'NA'=[2,8,9]. The chance of an irrelevant access is a bit less.

Now, index on 'NAM'=[8,9]. After this, it's going to be almost the same.

Why should I create an index on all 6 characters if 2 or 3 characters give me almost the same result?

So rather than creating an index on the complete column, just create an index on the first 3 characters.
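The trade-off between prefix length and irrelevant accesses can be sketched as below. The names are invented so that the row ids match the 'N', 'NA', and 'NAM' lists in the text. In MySQL itself, a prefix index is declared by giving a length after the column, for example create index idx_students_name on students (name(3)); the index name there is only an example.

```python
from collections import defaultdict

# Hypothetical rows: id -> name, chosen to match the lists in the text.
names = {1: "AMIT", 2: "NARESH", 3: "RAHUL", 4: "PRIYA", 5: "NIKHIL",
         6: "SURESH", 7: "KIRAN", 8: "NAMAN", 9: "NAMITA"}


def prefix_index(rows, length):
    """Index only the first `length` characters of each name."""
    index = defaultdict(list)
    for row_id in sorted(rows):
        index[rows[row_id][:length]].append(row_id)
    return dict(index)


for length in (1, 2, 3, 5):
    index = prefix_index(names, length)
    candidates = index["NAMAN"[:length]]
    print(length, len(index), candidates)
```

Going from 1 to 3 characters cuts the candidate rows for NAMAN from four to two, while the number of index entries grows only from 6 to 8 in this toy data.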

FULL TEXT INDEXES:

Let's say you have a blogging website with a table called blogs. One popular query on this table would be on the contents.

Find all blogs that have react in their title or in their content. And also an advanced search => find all blogs that contain react but don't contain redux.

This is where Full Text Indexes come in: they help with such text retrieval queries.

select *
from sql_blog.posts
where body like '%react%'
or title like '%react%';

Q- Find the blog posts that contain both the words react and redux.

select *
from sql_blog.posts
where (body like '%react%'
or title like '%react%')
and
(body like '%redux%'
or title like '%redux%');

Now there can be many more sophisticated things that I want to do. Solution: Full Text Index.

create fulltext index idx_posts_body_title on sql_blog.posts(body, title);

Now index is created.

select *
from posts
where match(title,body) against ("react redux" in boolean mode);

Here both words are optional: if a title or body contains either react or redux, the row is returned.

Now,
select *
from posts
where match(title,body) against ("+react redux" in boolean mode);

Here, it must contain react, but redux is optional.

select *
from posts
where match(title,body) against ("+react +redux" in boolean mode);

Here it must contain both.

select *
from posts
where match(title,body) against ("+react redux -form" in boolean mode);

Here, it must contain react, redux is optional, and it must not contain the word form.
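The meaning of the + and - operators in the four queries above can be modeled with a small function. This is a simplified sketch of the semantics only, assuming whole-word, case-insensitive matching; it is not how MySQL implements full text search, and the name `matches` and the sample texts are made up.

```python
import re


def matches(text, query):
    """Simplified model of boolean mode: +word required, -word excluded."""
    words = set(re.findall(r"\w+", text.lower()))
    required, excluded, optional = [], [], []
    for term in query.lower().split():
        if term.startswith("+"):
            required.append(term[1:])
        elif term.startswith("-"):
            excluded.append(term[1:])
        else:
            optional.append(term)
    if any(term in words for term in excluded):
        return False
    if required:
        return all(term in words for term in required)
    return any(term in words for term in optional)


print(matches("Intro to Redux", "react redux"))                   # True
print(matches("Intro to Redux", "+react redux"))                  # False
print(matches("React with Redux", "+react +redux"))               # True
print(matches("React form handling", "+react redux -form"))       # False
```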

So this type of advanced search capability is also available in MySQL using a Full Text Index.

COMPOSITE INDEXES:

students(name, phone_number) - Here the index table is first sorted by name; then, if two rows have the same name, the tie breaker is the phone number.

If two people have the same name and the same phone number, then the primary key is the tie breaker. So if no other tie breaker works, the final tie breaker is the primary key.

select *
from students
where phone_number=124;

Will this be faster? No. The index is sorted by name first, so rows with a given phone number are scattered throughout it.

In a Composite Index:
- Put the higher cardinality column first => the one with more distinct values.
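Why a filter on the second column alone gets no help from the index can be seen in a small sketch. It assumes the composite index is a list of (name, phone_number, id) tuples sorted in that order; the data and function names are hypothetical.

```python
from bisect import bisect_left

# Hypothetical composite index: sorted by name, then phone_number, then id.
index = sorted([
    ("amit", 124, 7), ("naman", 555, 2), ("naman", 101, 9),
    ("riya", 124, 4), ("zoya", 300, 1),
])


def find_by_name(name):
    """Binary search works: entries with the same name are contiguous."""
    pos = bisect_left(index, (name,))
    ids = []
    while pos < len(index) and index[pos][0] == name:
        ids.append(index[pos][2])
        pos += 1
    return ids


def find_by_phone(phone):
    """No shortcut: matching phone numbers are scattered, so scan every entry."""
    examined = 0
    ids = []
    for _name, entry_phone, row_id in index:
        examined += 1
        if entry_phone == phone:
            ids.append(row_id)
    return ids, examined


print(find_by_name("naman"))   # [9, 2]
print(find_by_phone(124))      # ([7, 4], 5)
```

The lookup by name jumps straight to the matching entries, while the lookup by phone number has to examine all 5 entries to find its 2 matches.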
