1 - Dbms Module 3 PPT 1
PREPARED BY SHARIKA T R,
SNGCE
SYLLABUS
• SQL DML (Data Manipulation Language)
▫ SQL queries on single and multiple tables, Nested queries (correlated and
non-correlated), Aggregation and grouping, Views, assertions, Triggers,
SQL data types.
• Physical Data Organization
▫ Review of terms: physical and logical records, blocking factor, pinned and
unpinned organization. Heap files, Indexing, Single-level indices, numerical
examples, Multi-level-indices, numerical examples, B-Trees & B+-Trees
(structure only, algorithms not required), Extendible Hashing, Indexing on
multiple keys – grid files
Visit my YouTube Page: https://www.youtube.com/c/sharikatr for notes and PPTs
Data-manipulation language (DML)
• The SQL DML provides the ability to query information from
the database and to insert tuples into, delete tuples from, and
modify tuples in the database.
▫ Integrity
The SQL DDL includes commands for specifying integrity
constraints that the data stored in the database must satisfy.
Updates that violate integrity constraints are disallowed.
▫ View definition
The SQL DDL includes commands for defining views.
▫ Transaction control
SQL includes commands for specifying the beginning and ending of
transactions.
Q10 selects all combinations of
EMPLOYEE Ssn and DEPARTMENT Dname in the
database.
DISTINCT Keyword
• To eliminate duplicate tuples from the result of an SQL
query, we use the keyword DISTINCT in the SELECT
clause
• Only distinct tuples remain in the result
• A query with SELECT DISTINCT eliminates duplicates,
whereas a query with SELECT ALL does not.
• SELECT with neither ALL nor DISTINCT is equivalent to
SELECT ALL
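The difference between SELECT ALL and SELECT DISTINCT can be checked with a small sketch. It uses Python's built-in sqlite3 module with an in-memory database; the EMPLOYEE rows and salary values are made up for illustration.

```python
import sqlite3

# Hypothetical EMPLOYEE rows; only the Salary column matters here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (Ssn TEXT PRIMARY KEY, Salary INTEGER)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?)",
    [("1", 30000), ("2", 40000), ("3", 30000), ("4", 25000)],
)

# SELECT ALL (the default) keeps duplicate salary values.
all_salaries = [row[0] for row in conn.execute(
    "SELECT ALL Salary FROM EMPLOYEE ORDER BY Salary")]

# SELECT DISTINCT keeps one copy of each salary value.
distinct_salaries = [row[0] for row in conn.execute(
    "SELECT DISTINCT Salary FROM EMPLOYEE ORDER BY Salary")]
```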
The first SELECT query retrieves the projects that involve a ‘Smith’ as manager of
the department that controls the project, and the second retrieves the projects that
involve a ‘Smith’ as a worker on the project. Notice that if several employees have the
last name ‘Smith’, the project names involving any of them will be retrieved.
Applying the UNION operation to the two SELECT queries gives the desired result.
UNION ALL
• The UNION ALL command combines the result set of two
or more SELECT statements (allows duplicate values).
• For example, a UNION ALL of the City columns of the
"Customers" and "Suppliers" tables returns the cities from
both tables, duplicate values included.
This SQL UNION ALL example would return the supplier_id multiple times in
the result set if that same value appeared in both the suppliers and orders tables.
The SQL UNION ALL operator does not remove duplicates. If you wish to remove
duplicates, use the UNION operator.
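A minimal sketch of UNION versus UNION ALL, assuming hypothetical Customers and Suppliers tables that each have a City column (the rows are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (Name TEXT, City TEXT);
    CREATE TABLE Suppliers (Name TEXT, City TEXT);
    INSERT INTO Customers VALUES ('Asha', 'Kochi'), ('Ravi', 'Delhi');
    INSERT INTO Suppliers VALUES ('Acme', 'Kochi'), ('Zen', 'Pune');
""")

# UNION ALL keeps 'Kochi' twice because it appears in both tables.
union_all = [r[0] for r in conn.execute(
    "SELECT City FROM Customers UNION ALL "
    "SELECT City FROM Suppliers ORDER BY City")]

# UNION removes the duplicate city.
union = [r[0] for r in conn.execute(
    "SELECT City FROM Customers UNION "
    "SELECT City FROM Suppliers ORDER BY City")]
```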
INTERSECT Operator
• INTERSECT operator is used to
return the records that are in
common between two SELECT
statements or data sets.
• If a record exists in one query and not
in the other, it will be omitted from
the INTERSECT results.
• It is the intersection of the two
SELECT statements.
EXCEPT
• The SQL EXCEPT clause/operator is used to combine two
SELECT statements and returns rows from the first
SELECT statement that are not returned by the second
SELECT statement.
▫ This means EXCEPT returns only the rows that are not
returned by the second SELECT statement.
• Just as with the UNION operator, the same rules apply
when using the EXCEPT operator.
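INTERSECT and EXCEPT can be compared side by side on the same pair of SELECT statements. The sketch below assumes two hypothetical single-column tables, Suppliers and Orders, with invented supplier_id values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Suppliers (supplier_id INTEGER);
    CREATE TABLE Orders (supplier_id INTEGER);
    INSERT INTO Suppliers VALUES (1), (2), (3);
    INSERT INTO Orders VALUES (2), (3), (4);
""")

# INTERSECT: ids returned by both SELECT statements.
in_both = [r[0] for r in conn.execute(
    "SELECT supplier_id FROM Suppliers INTERSECT "
    "SELECT supplier_id FROM Orders ORDER BY supplier_id")]

# EXCEPT: ids returned by the first SELECT but not by the second.
only_in_suppliers = [r[0] for r in conn.execute(
    "SELECT supplier_id FROM Suppliers EXCEPT "
    "SELECT supplier_id FROM Orders ORDER BY supplier_id")]
```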
Suppose that we want to see the effect of giving all employees who work on the
‘ProductX’ project a 10 percent raise; we can issue Query 13 to see what their
salaries would become. This example also shows how we can rename an
attribute in the query result using AS in the SELECT clause.
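A sketch of such a query, run against a tiny in-memory database. The table layouts follow the COMPANY schema only loosely, and the sample rows and the column alias Increased_sal are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, Ssn TEXT PRIMARY KEY, Salary REAL);
    CREATE TABLE PROJECT (Pname TEXT, Pnumber INTEGER PRIMARY KEY);
    CREATE TABLE WORKS_ON (Essn TEXT, Pno INTEGER);
    INSERT INTO EMPLOYEE VALUES ('John', 'Smith', '1', 30000),
                                ('Joyce', 'English', '2', 25000);
    INSERT INTO PROJECT VALUES ('ProductX', 1), ('ProductY', 2);
    INSERT INTO WORKS_ON VALUES ('1', 1), ('2', 2);
""")

# Arithmetic in the SELECT clause; AS renames the computed column.
cur = conn.execute("""
    SELECT E.Fname, E.Lname, 1.1 * E.Salary AS Increased_sal
    FROM EMPLOYEE AS E, WORKS_ON AS W, PROJECT AS P
    WHERE E.Ssn = W.Essn AND W.Pno = P.Pnumber AND P.Pname = 'ProductX'
""")
column_names = [d[0] for d in cur.description]
raised = cur.fetchall()
```

The stored salaries are not changed; only the query result shows the raised value.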
Student Table
Q. Fetch all data from the table Student and sort the
result first in ascending order according to the column Age,
and then in descending order according to the column
ROLL_NO.
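One way to write this query is sketched below. The Student rows are invented, since the original table is not reproduced here; only the column names ROLL_NO and Age come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (ROLL_NO INTEGER, NAME TEXT, Age INTEGER)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [(1, "A", 18), (2, "B", 19), (3, "C", 18), (4, "D", 20)],
)

# Age ascending first; ties on Age are broken by ROLL_NO descending.
ordered = conn.execute(
    "SELECT * FROM Student ORDER BY Age ASC, ROLL_NO DESC").fetchall()
roll_order = [row[0] for row in ordered]
```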
NESTING OF QUERIES
• A complete SELECT query, called a nested query, can be
specified within the WHERE-clause of another query,
called the outer query
• Many of the previous queries can be specified in an
alternative form using nesting
Here the nested query has a different result for each tuple in the outer
query.
A query written with nested SELECT... FROM... WHERE... blocks and using
the = or IN comparison operators can always be expressed as a single block
query.
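The equivalence can be illustrated with a correlated nested query and its single-block form. The EMPLOYEE and DEPENDENT rows below are assumptions chosen so that exactly one employee has a dependent with the same first name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Ssn TEXT PRIMARY KEY, Fname TEXT);
    CREATE TABLE DEPENDENT (Essn TEXT, Dependent_name TEXT);
    INSERT INTO EMPLOYEE VALUES ('1', 'John'), ('2', 'Alicia');
    INSERT INTO DEPENDENT VALUES ('1', 'John'), ('1', 'Mary'), ('2', 'Tom');
""")

# Correlated nested query: the inner SELECT refers to E.Fname of the
# outer tuple, so it is re-evaluated for each EMPLOYEE tuple.
nested = [r[0] for r in conn.execute("""
    SELECT E.Fname
    FROM EMPLOYEE AS E
    WHERE E.Ssn IN (SELECT D.Essn
                    FROM DEPENDENT AS D
                    WHERE D.Dependent_name = E.Fname)
""")]

# The same request as a single block with a join condition.
# (A join can repeat an employee if several dependents match.)
single_block = [r[0] for r in conn.execute("""
    SELECT E.Fname
    FROM EMPLOYEE AS E, DEPENDENT AS D
    WHERE E.Ssn = D.Essn AND E.Fname = D.Dependent_name
""")]
```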
SELECT COUNT(*)
FROM EMPLOYEE;
GROUP BY
• The GROUP BY clause specifies the grouping attributes,
which should also appear in the SELECT clause, so that the
value resulting from applying each aggregate function to a
group of tuples appears along with the value of the
grouping attributes
If NULLs exist in the grouping attribute, then a separate group is created for all
tuples with a NULL value in the grouping attribute.
For example, if the EMPLOYEE table had some tuples that had NULL for the
grouping attribute Dno, there would be a separate group for those tuples in the
result of Q24.
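The NULL group can be observed with a grouped query of this kind (department number, employee count, average salary). The rows below are assumptions; two employees deliberately have no department.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (Ssn TEXT PRIMARY KEY, Dno INTEGER, Salary REAL)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [("1", 5, 30000), ("2", 5, 40000), ("3", 4, 25000),
     ("4", None, 10000), ("5", None, 20000)],
)

# One result row per distinct Dno value; NULL forms its own group.
groups = {
    dno: (count, avg_salary)
    for dno, count, avg_salary in conn.execute(
        "SELECT Dno, COUNT(*), AVG(Salary) FROM EMPLOYEE GROUP BY Dno")
}
```

In Python the NULL department shows up as the key None.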
HAVING clause
• When a GROUP BY query involves a join, the grouping and the
aggregate functions are applied after the relations are joined.
• Sometimes we want to retrieve the values of these functions only
for groups that satisfy certain conditions
• SQL provides the HAVING clause for this purpose; it can appear
only in conjunction with a GROUP BY clause
• HAVING provides a condition on the summary information
regarding the group of tuples associated with each value of the
grouping attributes.
• Only the groups that satisfy the condition are retrieved in the
result of the query.
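A sketch of a HAVING query that keeps only projects with more than two workers. The PROJECT and WORKS_ON rows are invented: one project has three workers and the other has two.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PROJECT (Pnumber INTEGER PRIMARY KEY, Pname TEXT);
    CREATE TABLE WORKS_ON (Essn TEXT, Pno INTEGER);
    INSERT INTO PROJECT VALUES (1, 'ProductX'), (2, 'ProductY');
    INSERT INTO WORKS_ON VALUES ('1', 1), ('2', 1), ('3', 1), ('1', 2), ('2', 2);
""")

# The join happens first, then grouping, then HAVING filters whole groups.
busy_projects = conn.execute("""
    SELECT P.Pnumber, P.Pname, COUNT(*)
    FROM PROJECT AS P JOIN WORKS_ON AS W ON P.Pnumber = W.Pno
    GROUP BY P.Pnumber, P.Pname
    HAVING COUNT(*) > 2
""").fetchall()
```

A WHERE condition would instead filter individual tuples before the groups are formed.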
Q 26. For each project on which more than two
SQL JOIN
• How do I get data from multiple tables?
▫ A SQL JOIN combines records from two tables.
▫ A JOIN locates related column values in the two tables.
▫ A query can contain zero, one, or multiple JOIN operations.
▫ INNER JOIN is the same as JOIN; the keyword INNER is
optional
JOIN Syntax
• The general syntax is
SELECT column-names
FROM table-name1 INNER JOIN table-name2
ON column-name1 = column-name2
WHERE condition
• The INNER keyword is optional: it is the default as well as
the most commonly used JOIN operation.
Example
• Problem: List all orders with customer information
SELECT OrderNumber, TotalAmount, FirstName, LastName, City,
Country
FROM "Order" JOIN Customer
ON "Order".CustomerId = Customer.Id
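The join above can be tried on a small in-memory database. The table layouts and rows are assumptions; the table name is written as "Order" in double quotes because ORDER is a reserved word in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (Id INTEGER PRIMARY KEY, FirstName TEXT,
                           LastName TEXT, City TEXT, Country TEXT);
    CREATE TABLE "Order" (Id INTEGER PRIMARY KEY, OrderNumber TEXT,
                          TotalAmount REAL, CustomerId INTEGER);
    INSERT INTO Customer VALUES (1, 'Paul', 'Henriot', 'Reims', 'France'),
                                (2, 'Karin', 'Josephs', 'Munster', 'Germany');
    INSERT INTO "Order" VALUES (10, '542378', 440.0, 1),
                               (11, '542379', 1863.4, 2),
                               (12, '542380', 1813.0, 1);
""")

# Each order row is matched with the customer whose Id equals CustomerId.
orders = conn.execute("""
    SELECT OrderNumber, TotalAmount, FirstName, LastName, City, Country
    FROM "Order" JOIN Customer
    ON "Order".CustomerId = Customer.Id
    ORDER BY OrderNumber
""").fetchall()
```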
Example
• In the COMPANY database we may frequently issue queries that retrieve the
employee name and the project names that the employee works on.
• Rather than having to specify the join of the three tables EMPLOYEE,
WORKS_ON, and PROJECT every time we issue this query, we can define
a view that is specified as the result of these joins.
• Then we can issue queries on the view, which are specified as
single-table retrievals rather than as retrievals involving two joins
on three tables.
• We call the EMPLOYEE, WORKS_ON, and PROJECT tables the defining
tables of the view.
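A sketch of such a view over the three defining tables. The view name EMP_PROJECTS and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Ssn TEXT PRIMARY KEY, Fname TEXT, Lname TEXT);
    CREATE TABLE PROJECT (Pnumber INTEGER PRIMARY KEY, Pname TEXT);
    CREATE TABLE WORKS_ON (Essn TEXT, Pno INTEGER, Hours REAL);
    INSERT INTO EMPLOYEE VALUES ('1', 'John', 'Smith'), ('2', 'Joyce', 'English');
    INSERT INTO PROJECT VALUES (1, 'ProductX'), (2, 'ProductY');
    INSERT INTO WORKS_ON VALUES ('1', 1, 32.5), ('2', 1, 20.0), ('2', 2, 10.0);

    -- The view hides the two joins over the three defining tables.
    CREATE VIEW EMP_PROJECTS AS
        SELECT Fname, Lname, Pname, Hours
        FROM EMPLOYEE, PROJECT, WORKS_ON
        WHERE Ssn = Essn AND Pno = Pnumber;
""")

# Queries on the view look like single-table retrievals.
on_product_x = conn.execute("""
    SELECT Fname, Lname FROM EMP_PROJECTS
    WHERE Pname = 'ProductX' ORDER BY Fname
""").fetchall()
```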
[Figure: view definitions V1 and V2]
DROP VIEW
• If we do not need a view any more, we can use the DROP
VIEW command to dispose of it.
• For example, to get rid of the view V1, we can use the SQL
statement in V1A:
View Materialization
• View materialization involves physically creating a temporary view
table when the view is first queried and keeping that table on the
assumption that other queries on the view will follow
• An efficient strategy for automatically updating the view table
when the base tables are updated must be developed in order to
keep the view up-to-date.
• Techniques using the concept of incremental update have been
developed for this purpose,
▫ where the DBMS can determine what new tuples must be inserted,
deleted, or modified in a materialized view table when a database
update is applied to one of the defining base tables
UPDATING VIEWS
• Certain conditions must be satisfied for a view to be updatable:
1. The view is defined based on one and only one table.
2. The view must include the PRIMARY KEY of the table based upon
which the view has been created.
3. The view should not have any field made out of aggregate functions.
4. The view must not have any DISTINCT clause in its definition.
5. The view must not have any GROUP BY or HAVING clause in its
definition.
6. The view must not have any SUBQUERIES in its definition.
7. If the view you want to update is based upon another view, the latter
should be updatable.
8. None of the selected output fields (of the view) may use
constants, strings or value expressions.
Uses of a View
• Restricting data access
▫ Views provide an additional level of table security by restricting
access to a predetermined set of rows and columns of a table.
• Hiding data complexity
▫ A view can hide the complexity that exists in a multiple table join.
• Simplify commands for the user
▫ Views allow users to select information from multiple tables
without requiring them to know how to perform a join.
Example 1
• For each tuple in the student relation, the value of the
attribute tot_cred must equal the sum of credits of courses
that the student has completed successfully.
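CREATE ASSERTION is part of the SQL standard but is not implemented by most DBMSs, SQLite included. As a rough stand-in, the sketch below runs a query that lists the students who violate the condition; the constraint holds exactly when that list is empty. The table layouts (student, takes, course), the sample rows, and the rule that a grade other than NULL or 'F' counts as successful completion are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (ID INTEGER PRIMARY KEY, name TEXT, tot_cred INTEGER);
    CREATE TABLE course (course_id TEXT PRIMARY KEY, credits INTEGER);
    CREATE TABLE takes (ID INTEGER, course_id TEXT, grade TEXT);
    INSERT INTO course VALUES ('CS101', 4), ('MA101', 3);
    INSERT INTO student VALUES (1, 'Anu', 7), (2, 'Biju', 4);
    INSERT INTO takes VALUES (1, 'CS101', 'A'), (1, 'MA101', 'B'), (2, 'CS101', 'F');
""")

def violations(connection):
    """Return IDs of students whose tot_cred differs from their earned credits."""
    return [r[0] for r in connection.execute("""
        SELECT s.ID
        FROM student AS s
        WHERE s.tot_cred <> (
            SELECT COALESCE(SUM(c.credits), 0)
            FROM takes AS t JOIN course AS c ON t.course_id = c.course_id
            WHERE t.ID = s.ID AND t.grade IS NOT NULL AND t.grade <> 'F')
        ORDER BY s.ID
    """)]

before_fix = violations(conn)   # student 2 failed CS101 but claims 4 credits
conn.execute("UPDATE student SET tot_cred = 0 WHERE ID = 2")
after_fix = violations(conn)
```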
TRIGGER
• A trigger is a stored procedure in a database that is automatically
invoked whenever a special event in the database occurs.
• It specifies an action to be taken when certain events occur and when
certain conditions are satisfied.
• For example,
▫ a trigger can be invoked when a row is inserted into a specified table
or when certain table columns are being updated.
▫ it may be useful to specify a condition that, if violated, causes some
user to be informed of the violation
Benefits of Triggers
• Generating some derived column values automatically
• Enforcing referential integrity
• Event logging and storing information on table access
• Auditing
• Synchronous replication of tables
• Imposing security authorizations
• Preventing invalid transactions
Creating Triggers
Example
• The example creates a row-level trigger for
the CUSTOMERS table that fires
for INSERT, UPDATE, or
DELETE operations performed
on that table.
• This trigger displays the
salary difference between the
old values and the new values
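A comparable row-level trigger can be sketched in SQLite, with two differences that are assumptions of this sketch rather than part of the example described above: SQLite allows only one event per trigger, so only UPDATE is covered, and since a trigger cannot print, the salary difference is written to a hypothetical SALARY_LOG table instead of being displayed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMERS (ID INTEGER PRIMARY KEY, NAME TEXT, SALARY REAL);
    CREATE TABLE SALARY_LOG (CUSTOMER_ID INTEGER, OLD_SALARY REAL,
                             NEW_SALARY REAL, DIFFERENCE REAL);

    -- Fires once for each updated row; OLD and NEW refer to that row.
    CREATE TRIGGER log_salary_change
    AFTER UPDATE OF SALARY ON CUSTOMERS
    FOR EACH ROW
    BEGIN
        INSERT INTO SALARY_LOG
        VALUES (OLD.ID, OLD.SALARY, NEW.SALARY, NEW.SALARY - OLD.SALARY);
    END;

    INSERT INTO CUSTOMERS VALUES (1, 'Ramesh', 2000.0);
    UPDATE CUSTOMERS SET SALARY = SALARY + 500 WHERE ID = 1;
""")

salary_log = conn.execute("SELECT * FROM SALARY_LOG").fetchall()
```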
Triggering a Trigger
• Here is one INSERT statement, which will create a new record in the
table
Physical Files
• Physical files contain the actual data that is stored on the
system, and a description of how data is to be presented to
or received from a program.
• They contain only one record format, and one or more
members.
• Records in database files can be externally or program-
described.
• A physical file can have a keyed sequence access path.
▫ This means that data is presented to a program in a sequence
based on one or more key fields in the file.
Logical files
• Logical files do not contain data.
• They contain a description of records found in one or more
physical files.
• A logical file is a view or representation of one or more
physical files.
• Logical files that contain more than one format are referred
to as multi-format logical files.
• If your program processes a logical file which contains
more than one record format, you can use a read by record
format to set the format you wish to use.
File organizations
• File organization refers to the organization of the data of a
file into records, blocks, and access structures; this
includes the way records and blocks are placed on the
storage medium and interlinked.
• An access method, on the other hand, provides a group of
operations that can be applied to a file.
• In general, it is possible to apply several access methods to
a file organization
Secondary organization
• A secondary organization or auxiliary access structure
allows efficient access to file records based on fields
other than those that have been used for the primary file
organization
Question
• Number of records = 100000
• Record size = 100 bytes
• Block size = 512 bytes
• For spanned organization, number of
blocks needed = ┌100000 × 100 ÷ 512┐
= 19532 blocks
• In case of unspanned organization
• No. of records per block = └512/100┘ = 5 records/block
• Total no. of blocks = 100000/5 = 20000 blocks
• So the unspanned organization needs 20000 - 19532 = 468 more
disk blocks than the spanned organization
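The arithmetic can be checked with a few lines of Python. The function names are illustrative; integer ceiling division is used so that no floating-point rounding is involved.

```python
def blocks_spanned(num_records, record_size, block_size):
    # Records may cross block boundaries, so only total bytes matter.
    total_bytes = num_records * record_size
    return -(-total_bytes // block_size)          # ceiling division

def blocks_unspanned(num_records, record_size, block_size):
    # Each block holds a whole number of records (the blocking factor).
    records_per_block = block_size // record_size
    return -(-num_records // records_per_block)   # ceiling division

spanned = blocks_spanned(100000, 100, 512)
unspanned = blocks_unspanned(100000, 100, 512)
extra_blocks = unspanned - spanned
```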
File Headers
• A file header or file descriptor contains information about a
file that is needed by the system programs that access the
file records.
• The header includes information to determine the disk
addresses of the file blocks as well as record format
descriptions,
▫ which may include field lengths and the order of fields within
a record for fixed-length unspanned records, and field type
codes, separator characters, and record type codes for
variable-length records.
• To search for a record on disk, one or more blocks are copied into main
memory buffers.
• Programs then search for the desired record or records within the buffers,
using the information in the file header.
• If the address of the block that contains the desired record is not known,
the search programs must do a linear search through the file blocks.
• Each file block is copied into a buffer and searched until the record is
located or all the file blocks have been searched unsuccessfully.
• This can be very time-consuming for a large file. The goal of a good file
organization is to locate the block that contains a desired record with a
minimal number of block transfers.
• Searching
▫ Searching for a record using any search condition involves a
linear search through the file block by block, which is an
expensive procedure.
▫ If only one record satisfies the search condition, then, on
average, a program will read into memory and search
half the file blocks before it finds the record.
▫ For a file of b blocks, this requires searching (b/2) blocks,
on average.
▫ If no records or several records satisfy the search
condition, the program must read and search all b blocks
in the file
• Deletion
▫ To delete a record, a program must first find its block, copy the
block into a buffer, delete the record from the buffer, and finally
rewrite the block back to the disk.
▫ This leaves unused space in the disk block.
▫ Deleting a large number of records in this way results in wasted
storage space.
▫ Another technique used for record deletion is to have an extra byte
or bit, called a deletion marker, stored with each record.
▫ A record is deleted by setting the deletion marker to a certain
value. A different value for the marker indicates a valid (not
deleted) record. Search programs consider only valid records in a
block when conducting their search.
▫ Both of these deletion techniques require periodic reorganization
of the file to reclaim the unused space of deleted records
• A binary search for disk files can be done on the blocks rather than on the
records.
• Suppose that the file has b blocks numbered 1, 2, ..., b; the records are
ordered by ascending value of their ordering key field; and we are searching
for a record whose ordering key field value is K.
• A binary search usually accesses log2(b) blocks, whether the record is found
or not. This is an improvement over linear searches, where, on average, (b/2)
blocks are accessed when the record is found and b blocks are accessed
when the record is not found.
• Ordering does not provide any advantages for random or ordered access of
the records based on values of the other nonordering fields of the file. In
these cases, we do a linear search for random access.
• To access the records in order based on a nonordering field, it is necessary
to create another sorted copy of the file in a different order
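A block-level binary search can be sketched as follows. The file is modelled as a Python list of blocks, each block being a sorted list of key values; this in-memory model and the function name are assumptions made for illustration, and each list access stands in for one block transfer from disk.

```python
def binary_search_blocks(blocks, key):
    """Return (found, blocks_accessed) for a file ordered on its key field."""
    low, high = 0, len(blocks) - 1
    accesses = 0
    while low <= high:
        mid = (low + high) // 2
        block = blocks[mid]        # simulates reading block mid into a buffer
        accesses += 1
        if key < block[0]:         # key precedes the first record of the block
            high = mid - 1
        elif key > block[-1]:      # key follows the last record of the block
            low = mid + 1
        else:                      # key can only be inside this block
            return key in block, accesses
    return False, accesses

# 16 blocks of 4 records each, holding the even keys 0, 2, ..., 126.
demo_blocks = [[2 * (i * 4 + j) for j in range(4)] for i in range(16)]
found_hit, hit_accesses = binary_search_blocks(demo_blocks, 100)
found_miss, miss_accesses = binary_search_blocks(demo_blocks, 7)
```

For the 16-block demo file, neither search reads more than a handful of blocks, whereas a linear search for the missing key would read all 16.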
Hashed Files
■ Hashing for disk files is called External Hashing
■ The file blocks are divided into M equal-sized buckets, numbered
bucket0, bucket1, ..., bucketM-1
▫ Typically, a bucket corresponds to one (or a fixed number of) disk
blocks.
■ One of the file fields is designated to be the hash key of the file.
■ The record with hash key value K is stored in bucket i, where i=h(K),
and h is the hashing function.
■ Search is very efficient on the hash key.
■ Collisions occur when a new record hashes to a bucket that is already
full.
▫ An overflow file is kept for storing such records.
▫ Overflow records that hash to each bucket can be linked together.
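The bucket and overflow idea can be sketched in memory. The class name, the hash function h(K) = K mod M, and the use of one Python list per bucket and per overflow chain are all assumptions of this sketch; a real implementation works on disk blocks.

```python
class HashedFile:
    """Toy model of external hashing with a per-bucket overflow chain."""

    def __init__(self, num_buckets, bucket_capacity):
        self.num_buckets = num_buckets
        self.bucket_capacity = bucket_capacity
        self.buckets = [[] for _ in range(num_buckets)]
        self.overflow = [[] for _ in range(num_buckets)]

    def h(self, key):
        # Assumed hash function: K mod M.
        return key % self.num_buckets

    def insert(self, key, record):
        i = self.h(key)
        if len(self.buckets[i]) < self.bucket_capacity:
            self.buckets[i].append((key, record))
        else:
            # Collision on a full bucket: chain the record in overflow.
            self.overflow[i].append((key, record))

    def search(self, key):
        i = self.h(key)
        # Only bucket i and its overflow chain need to be examined.
        for k, record in self.buckets[i] + self.overflow[i]:
            if k == key:
                return record
        return None

demo_file = HashedFile(num_buckets=4, bucket_capacity=2)
for k in (1, 5, 9, 2):
    demo_file.insert(k, f"r{k}")
```

Keys 1, 5 and 9 all hash to bucket 1; the third one no longer fits and lands in that bucket's overflow chain.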
Extendible Hashing
Index Structures
• Advantages:
▫ Stores and organizes data into computer files.
▫ Makes it easier to find and access data at any given time.
▫ An index is a data structure that is added to a file to provide
faster access to the data.
▫ It reduces the number of blocks that the DBMS has to check.
• Disadvantages
▫ The index must be updated whenever records are inserted into
or deleted from the main table.
Types of index
• Indexes can be characterized as
1. Dense index
2. Sparse index
• A dense index has an index entry for every search key
value (and hence every record) in the data file.
• A sparse (or nondense) index, on the other hand, has
index entries for only some of the search values.
Structure of index
• An index is a small table having only two columns.
• The first column contains a copy of the primary or
candidate key of a table
• The second column contains a set of pointers holding the
address of the disk block where that particular key value
can be found.
• If the index entries are sorted, they are called ordered
indices.
Primary Index
• Primary index is defined on an ordered data file. The data
file is ordered on a key field.
• The key field is generally the primary key of the relation.
• A primary index is a nondense (sparse) index, since it
includes one entry for each disk block of the data file, holding
the key of that block's anchor record, rather than one entry
for every search value.
Figure 5.1: Primary Index on the Ordering Key Field of the File
Example
• Suppose we have an ordered file with 30,000
records stored on a disk with block size 1024
bytes; records are of fixed size with unspanned
organisation.
• Record length = 100 bytes. How many block accesses
are needed when using a primary index, if the
ordering key field of the file is 9 bytes and the
block pointer is 6 bytes?
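One way to work this example is sketched below, under the usual assumptions: a binary search on the ordered data file when no index is used, and a binary search on the index followed by one data block access when the primary index is used. The function name is illustrative.

```python
import math

def primary_index_accesses(num_records, record_size, block_size,
                           key_size, pointer_size):
    records_per_block = block_size // record_size            # blocking factor
    data_blocks = math.ceil(num_records / records_per_block)
    without_index = math.ceil(math.log2(data_blocks))        # binary search on data
    entries_per_block = block_size // (key_size + pointer_size)
    # Sparse index: one entry per data block, not per record.
    index_blocks = math.ceil(data_blocks / entries_per_block)
    with_index = math.ceil(math.log2(index_blocks)) + 1      # +1 for the data block
    return data_blocks, without_index, index_blocks, with_index

data_blocks, without_index, index_blocks, with_index = primary_index_accesses(
    num_records=30000, record_size=100, block_size=1024,
    key_size=9, pointer_size=6)
```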
Clustering Index
• If file records are physically ordered on a nonkey field,
which does not have a distinct value for each record, that
field is called the clustering field and the data file is called a
clustered file.
• A clustering index speeds up retrieval of all the records that
have the same value for the clustering field.
• This differs from a primary index, which requires that the
ordering field of the data file have a distinct value for each
record.
Question
• Suppose we have an ordered file with 300,000 records
stored on a disk with block size 4096 bytes; records are of
size 100 bytes, unspanned organisation.
• The file is ordered on a non-key field
(department_name). How many block accesses are needed when
using a clustering index, if the ordering field of the file is 5
bytes and the block pointer is 6 bytes? There are 1000
departments and 300 employees per department.
No. of records = 300000
Block size = 4096 bytes
Record size = 100 bytes
Ordering field = 5 bytes
Block pointer = 6 bytes
Clustering Index
Without index:
Records per block = └4096/100┘ = 40
No. of data blocks = ┌300000/40┐ = 7500
No. of block accesses = ┌log2 7500┐ = 13
With clustering index:
Index records per block = └4096/(6+5)┘ = └4096/11┘ = 372
Total no. of index records = 1000 (one per department)
No. of clustering index blocks = ┌1000/372┐ = 3 blocks
No. of block accesses = ┌log2 3┐ + 1 = 2 + 1 = 3 block accesses
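The same calculation as a short sketch. The function name is illustrative, and the final "+ 1" counts only the first data block of the matching cluster, following the worked solution.

```python
import math

def clustering_index_accesses(num_records, record_size, block_size,
                              field_size, pointer_size, distinct_values):
    records_per_block = block_size // record_size
    data_blocks = math.ceil(num_records / records_per_block)
    without_index = math.ceil(math.log2(data_blocks))
    entries_per_block = block_size // (field_size + pointer_size)
    # One index entry per distinct value of the clustering field.
    index_blocks = math.ceil(distinct_values / entries_per_block)
    with_index = math.ceil(math.log2(index_blocks)) + 1
    return data_blocks, without_index, index_blocks, with_index

clustering_result = clustering_index_accesses(
    num_records=300000, record_size=100, block_size=4096,
    field_size=5, pointer_size=6, distinct_values=1000)
```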
Secondary Index
• A secondary index provides a secondary means of accessing a file for
which some primary access already exists.
• The secondary index may be on a field which is a candidate key and has
a unique value in every record, or a non-key with duplicate values.
• The index is an ordered file with two fields.
• The first field is of the same data type as some non-ordering field of the
data file that is an indexing field.
• The second field is either a block pointer or a record pointer.
• There can be many secondary indexes (and hence, indexing fields) for
the same file.
• A secondary index on a key field includes one entry for each record
in the data file; hence, it is a dense index.
Example
• Consider an unordered file with 10^8 records with record
size of 400 bytes with an unspanned organization. Suppose
that we construct a single-level secondary index for the file
where the search key field is 16 bytes and the block pointer is 4
bytes. Assume the disk block size is 4096 bytes.
• How many blocks are required to store the data file?
• How many blocks are required to store the index file?
• Disk file
▫ No. of data records per block = floor(4096/400) = 10
▫ No. of data blocks required = ceil(10^8/10) = 10^7
• Index file
▫ Size of one index record = 16 + 4 = 20 bytes
▫ No. of index records = no. of data records = 10^8
(secondary index on key field)
▫ No. of index records per block = floor(4096/20) = 204
▫ No. of index blocks required = ceil(10^8/204) = 490197
Example 2
• Consider an unordered file with 10^8 records with record
size of 400 bytes with a spanned organization. Suppose
that we construct a single-level secondary index for the file
where the search key field is 16 bytes and the block pointer is 4
bytes. Assume the disk block size is 4096 bytes.
• How many blocks are required to store the data file?
• How many blocks are required to store the index file?
• Disk file
▫ No. of data blocks required = ceil(10^8*400/4096) = 9765625
• Index file
▫ Size of one index record = 16 + 4 = 20 bytes
▫ No. of index records = no. of data records = 10^8
(secondary index on key field)
▫ No. of index blocks required = ceil(10^8*20/4096) = 488282
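Both the unspanned and the spanned variant of this secondary-index example can be checked with one helper. The helper name is illustrative; as in the worked solution, the spanned case stores the index file as spanned too.

```python
def blocks_needed(num_records, record_size, block_size, spanned):
    if spanned:
        # Records may straddle blocks: divide total bytes by block size.
        return -(-(num_records * record_size) // block_size)
    # Unspanned: whole records per block only.
    records_per_block = block_size // record_size
    return -(-num_records // records_per_block)

NUM_RECORDS = 10**8
BLOCK_SIZE = 4096
DATA_RECORD_SIZE = 400
INDEX_ENTRY_SIZE = 16 + 4    # search key + block pointer

# Dense secondary index: one index entry per data record.
unspanned_blocks = (
    blocks_needed(NUM_RECORDS, DATA_RECORD_SIZE, BLOCK_SIZE, spanned=False),
    blocks_needed(NUM_RECORDS, INDEX_ENTRY_SIZE, BLOCK_SIZE, spanned=False),
)
spanned_blocks = (
    blocks_needed(NUM_RECORDS, DATA_RECORD_SIZE, BLOCK_SIZE, spanned=True),
    blocks_needed(NUM_RECORDS, INDEX_ENTRY_SIZE, BLOCK_SIZE, spanned=True),
)
```

Each tuple holds (data blocks, index blocks) for that organization.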
Example
• Consider an unordered file with 30000 records, a record
size of 100 bytes, and an unspanned organization. Suppose
that we construct a secondary index on the key field, where the
key field size is 9 bytes and the block pointer is 6 bytes. Assume
the disk block size is 1024 bytes.
• What is the average number of disk block accesses to search for a
record without the index?
• What is the average number of disk block accesses to search for a
record with the index?
• Data file
▫ No. of data records per block = floor(1024/100) = 10
▫ No. of data blocks required = ceil(30000/10) = 3000
▫ Average no. of disk block accesses (linear search) = ceil(3000/2) = 1500
• Index file
▫ Size of one index record = 9 + 6 = 15 bytes
▫ No. of index records = No. of data records = 30000 (dense secondary index on key field)
▫ No. of index records per block = floor(1024/15) = 68
▫ No. of index blocks required = ceil(30000/68) = 442
▫ Avg no. of disk block accesses required (binary search on the index, plus one data block access) = ceil(log2 442) + 1 = 9 + 1 = 10
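The comparison between searching with and without the index can be reproduced in a few lines of Python. This is only a sketch of the arithmetic above; the variable names and the `ceil_div` helper are illustrative assumptions.

```python
import math

def ceil_div(a, b):
    """Integer ceiling of a / b."""
    return -(-a // b)

num_records, record_size, block_size = 30000, 100, 1024
entry_size = 9 + 6  # key field + block pointer

# Without index: linear scan over the unordered data file.
data_blocks = ceil_div(num_records, block_size // record_size)
avg_without_index = ceil_div(data_blocks, 2)

# With index: binary search over the index blocks, then one data block read.
index_blocks = ceil_div(num_records, block_size // entry_size)
accesses_with_index = math.ceil(math.log2(index_blocks)) + 1

print(data_blocks, avg_without_index, index_blocks, accesses_with_index)
```

With the numbers from the example this prints 3000 1500 442 10.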
Figure 5.4: Secondary Index (with Record Pointer) on a Non Key Field implemented
using one level of indirection so that Index entries are of Fixed Length and have unique
field values
Example
• Consider a file of 16384 records. Each record is 32 bytes
long and its key field is of size 6 bytes. The file is ordered
on a non-key field, and the file organization is
unspanned. The file is stored in a file system with block
size 1024 bytes, and the size of a block pointer is 10 bytes.
If the secondary index is built on the key field of the
file, and a multi-level index scheme is used to store
the secondary index, the number of first-level and second-
level blocks in the multi-level index are?
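• Without Indexing
▫ No. of data records per block = 1024/32 = 32
▫ No. of data blocks = 16384/32 = 512
• Indexing First Level:
▫ No. of index records = 16384 (the secondary index is built on the key field, on which the file is not ordered, so it has one entry per record)
▫ No. of index records per block = 1024/(6+10) = 64
▫ No. of first-level index blocks = 16384/64 = 256
• Indexing Second Level:
▫ No. of index records = No. of first-level index blocks = 256
▫ No. of second-level index blocks = 256/64 = 4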
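The same division can be repeated level by level, since each higher level of a multi-level index holds one entry per block of the level below it. A small Python sketch under that assumption follows; `index_levels` is a hypothetical helper name.

```python
def index_levels(num_entries, entries_per_block):
    """Return the number of index blocks at each level, first level first."""
    levels = []
    entries = num_entries
    while True:
        blocks = -(-entries // entries_per_block)  # ceiling division
        levels.append(blocks)
        if blocks == 1:      # a single block is the top level
            break
        entries = blocks     # next level indexes the blocks of this level
    return levels

fan_out = 1024 // (6 + 10)   # index entries per block = 64
levels = index_levels(16384, fan_out)
print(levels)
```

For this example the list is [256, 4, 1]: 256 first-level blocks, 4 second-level blocks, and a single top-level block.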
Example 2
Given
• No. of records = 10000
• Record size = 80 bytes
• Key field size = 15 bytes
• Block size = 512 bytes
• Block pointer size = 6 bytes
• File sorted on primary key
• Unspanned organisation
Search Trees
• A search tree is slightly different from a multilevel index.
• A search tree of order p is a tree such that each node contains at most
p−1 search values and p pointers in the order
<P1, K1, P2, K2, ..., Pq−1, Kq−1, Pq>, where q ≤ p.
• Each Pi is a pointer to a child node (or a NULL pointer), and each Ki is a
search value from some ordered set of values.
• Two constraints must hold at all times on the search tree:
1. Within each node, K1 < K2 < ... < Kq−1.
2. For all values X in the subtree pointed at by Pi, we have Ki−1 < X < Ki
for 1 < i < q; X < Ki for i = 1; and Ki−1 < X for i = q.
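The two constraints are what make lookup possible: at each node, the search value either matches some Ki or falls into exactly one subtree Pi. A minimal Python sketch of that lookup follows; the `Node` class and `search` function are illustrative assumptions, with `None` standing in for a NULL pointer.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys  # K1 < K2 < ... < Kq-1
        # P1 ... Pq; None plays the role of a NULL pointer.
        self.children = children or [None] * (len(keys) + 1)

def search(node, x):
    """Return True if x is stored somewhere in the search tree."""
    while node is not None:
        i = 0
        # Skip every key smaller than x (constraint 1 keeps keys sorted).
        while i < len(node.keys) and x > node.keys[i]:
            i += 1
        if i < len(node.keys) and node.keys[i] == x:
            return True
        # Constraint 2: x can only be in the subtree between K(i-1) and Ki.
        node = node.children[i]
    return False

root = Node([5, 8], [Node([1, 3]), Node([6, 7]), Node([9, 12])])
print(search(root, 7), search(root, 4))
```

For this small hand-built tree the calls print True False.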
B-Trees
• A B-tree starts with a single root node (which is also a leaf node) at
level 0 (zero).
• Once the root node is full with p – 1 search key values and we
attempt to insert another entry in the tree, the root node splits into
two nodes at level 1.
• Only the middle value is kept in the root node, and the rest of the
values are split evenly between the other two nodes.
• When a nonroot node is full and a new entry is inserted into it,
▫ that node is split into two nodes at the same level, and the middle
entry is moved to the parent node along with two pointers to the new
split nodes.
• If the parent node is full, it is also split. Splitting can propagate all
the way to the root node, creating a new level if the root is split.
Properties of a B-tree
• For a tree to be classified as a B-tree, it must fulfill the
following conditions:
▫ the nodes in a B-tree of order m can have a maximum of m
children
▫ each internal node (non-leaf and non-root) must have at least ⌈m/2⌉
children (m/2 rounded up)
▫ the root should have at least two children – unless it’s a leaf
▫ a non-leaf node with k children should have k-1 keys
▫ all leaves must appear on the same level
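As a quick check of these bounds, the limits for a given order m can be tabulated in Python. This is only a sketch; the function name and dictionary keys are assumptions chosen for illustration.

```python
def btree_limits(m):
    """Child and key bounds for a B-tree of order m, per the conditions above."""
    min_children = -(-m // 2)  # ceil(m / 2), for non-root internal nodes
    return {
        "max_children": m,
        "min_children_internal": min_children,
        "max_keys": m - 1,                      # k children -> k - 1 keys
        "min_keys_internal": min_children - 1,
    }

print(btree_limits(3))
print(btree_limits(5))
```

For m = 3 a non-root internal node has 2 or 3 children and 1 or 2 keys; for m = 5 it has 3 to 5 children and 2 to 4 keys.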
Building a B-tree
• Since we’re starting with an empty tree, the first item we insert
will become the root node of our tree.
• At this point, the root node has the key/value pair.
• The key is 1, but the value is depicted as a star to make it easier to
represent, and to indicate it is a reference to a record.
• The root node also has pointers to its left and right children shown
as small rectangles to the left and right of the key.
• Since the node has no children, those pointers will be empty for
now:
Our right sub-tree is now at full capacity, so to insert 5 we need to use the same
splitting logic explained above. We split the node into two so that Key 3 goes to a
left sub-tree and 5 goes to a right sub-tree leaving 4 to be promoted to the root
node alongside 2.
Next, we try to insert 7. However, since the rightmost tree is now at full capacity
we know that we need to do another split operation and promote one of the keys.
But wait! The root node is also at full capacity, which means that it also needs to
be split.
So, we end up doing this in two steps. First, we need to split the right nodes 5 and
6 so that 7 will be on the right, 5 will be on the left, and 6 will be promoted.
Then, to promote 6, we need to split the root node such that 4 will become a part
of new root and 6 and 2 become the parents of the right and left subtree.
Continuing in this way, we fill the tree by adding Keys 8, 9 and 10 until we
get the final tree:
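The walkthrough can be replayed with a small Python sketch. It assumes a B-tree of order 3 (at most two keys per node), which is what the splits described above suggest, and it stores keys only, omitting the record references. The class and function names are illustrative, and distinct keys are assumed.

```python
import bisect

class BTreeNode:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []  # empty list means leaf

class BTree:
    def __init__(self, order):
        self.order = order  # maximum number of children per node
        self.root = BTreeNode()

    def insert(self, key):
        split = self._insert(self.root, key)
        if split is not None:  # root overflowed: the tree grows one level
            middle, right = split
            self.root = BTreeNode([middle], [self.root, right])

    def _insert(self, node, key):
        i = bisect.bisect_left(node.keys, key)
        if not node.children:  # leaf: place the key here
            node.keys.insert(i, key)
        else:                  # internal: descend, then absorb any split
            split = self._insert(node.children[i], key)
            if split is not None:
                middle, right = split
                node.keys.insert(i, middle)
                node.children.insert(i + 1, right)
        if len(node.keys) <= self.order - 1:
            return None
        # Overflow: keep the left half, promote the middle key.
        mid = len(node.keys) // 2
        middle = node.keys[mid]
        right = BTreeNode(node.keys[mid + 1:], node.children[mid + 1:])
        node.keys = node.keys[:mid]
        node.children = node.children[:mid + 1]
        return middle, right

def dump(node):
    """Leaves become key lists; internal nodes become (keys, children)."""
    if not node.children:
        return node.keys
    return (node.keys, [dump(c) for c in node.children])

tree = BTree(order=3)
for k in range(1, 11):
    tree.insert(k)
print(dump(tree.root))
```

Under these assumptions the final tree has root 4, a left child 2 with leaves 1 and 3, and a right child holding 6 and 8 with leaves 5, 7, and 9-10.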
B+-Trees
• In a B+-tree, data pointers are stored only at the leaf nodes
of the tree; hence, the structure of leaf nodes differs from
the structure of internal nodes.
• The leaf nodes have an entry for every value of the search
field, along with a data pointer to the record.
• The structure of the internal nodes of a B+-tree of order p is as
follows:
1. Each internal node is of the form <P1, K1, P2, K2, ..., Pq−1, Kq−1,
Pq> where q ≤ p and each Pi is a tree pointer.
2. Within each internal node, K1 < K2 < ... < Kq−1.
3. For all search field values X in the subtree pointed at by Pi, we have
Ki−1 < X≤ Ki for 1 < i < q; X ≤ Ki for i = 1; and Ki−1 < X for i = q.
4. Each internal node has at most p tree pointers.
5. Each internal node, except the root, has at least ⌈p/2⌉ tree
pointers. The root node has at least two tree pointers if it is an
internal node.
6. An internal node with q pointers, q ≤ p, has q − 1 search field values.
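Conditions 2, 4, 5 and 6 are local to a node, so they can be checked without looking at the rest of the tree. A minimal Python sketch follows; the function name and its parameters are assumptions made for illustration, and condition 3 is not checked because it needs the subtrees.

```python
def valid_internal(keys, num_pointers, p, is_root=False):
    """Check the per-node B+-tree conditions for an internal node of order p."""
    min_pointers = 2 if is_root else -(-p // 2)  # ceil(p / 2) for non-root
    keys_sorted = all(a < b for a, b in zip(keys, keys[1:]))  # condition 2
    pointer_count_ok = min_pointers <= num_pointers <= p      # conditions 4, 5
    key_count_ok = len(keys) == num_pointers - 1              # condition 6
    return keys_sorted and pointer_count_ok and key_count_ok

print(valid_internal([3, 5], 3, p=3))            # full node of order 3
print(valid_internal([3], 2, p=5))               # too few pointers for p = 5
print(valid_internal([3], 2, p=5, is_root=True)) # allowed for the root
```

These three calls print True, False and True respectively.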
B+Tree
• When a leaf node is full and a new entry is inserted there, the node overflows
and must be split.
• The first j = ⌈(pleaf + 1)/2⌉ entries in the original node are kept there, and
the remaining entries are moved to a new leaf node.
• The jth search value is replicated in the parent internal node, and an extra
pointer to the new node is created in the parent.
• These must be inserted in the parent node in their correct sequence.
• If the parent internal node is full, the new value will cause it to overflow also,
so it must be split.
• The entries in the internal node up to Pj (the jth tree pointer after inserting
the new value and pointer, where j = ⌊(p + 1)/2⌋) are kept, while the jth
search value is moved to the parent, not replicated.
• A new internal node will hold the entries from Pj+1 to the end of the entries in
the node.
• This splitting can propagate all the way up to create a new root node and
hence a new level for the B+-tree.
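The two split rules differ in one detail: a leaf split replicates the jth value in the parent, while an internal split moves it up. A Python sketch of both follows. It assumes the node is given as plain lists that already contain the newly inserted entry; the function names are hypothetical, and data pointers in the leaf are omitted.

```python
import math

def split_leaf(keys, p_leaf):
    """Split an overflowing leaf holding p_leaf + 1 sorted keys."""
    assert len(keys) == p_leaf + 1
    j = math.ceil((p_leaf + 1) / 2)
    left, right = keys[:j], keys[j:]
    # The jth value stays in the left leaf AND is copied to the parent.
    return left, right, left[-1]

def split_internal(keys, pointers, p):
    """Split an overflowing internal node holding p keys and p + 1 pointers."""
    assert len(keys) == p and len(pointers) == p + 1
    j = (p + 1) // 2
    left = (keys[:j - 1], pointers[:j])   # P1..Pj with K1..K(j-1)
    moved_up = keys[j - 1]                # Kj goes to the parent only
    right = (keys[j:], pointers[j:])      # P(j+1).. with K(j+1)..
    return left, moved_up, right

print(split_leaf([1, 2, 3], p_leaf=2))
print(split_internal([3, 5, 7], ["a", "b", "c", "d"], p=3))
```

In the first call key 2 appears both in the left leaf and as the value sent to the parent; in the second call key 5 appears in neither half.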
Building a B+tree
When we come to insert Key 3, we find that in doing so we will exceed the
capacity of the root node. Similar to a normal B-tree, this means we need
to perform a split operation. However, unlike with the B-tree, we must
copy-up the first key in the new rightmost leaf node. As mentioned, this is
so we can make sure we have a key/value pair for Key 2 in the leaf nodes:
Next, we add Key 4 to the rightmost leaf node. Since it’s full, we need to
perform another split operation and copy-up Key 3 to the root node:
Now, let’s add 5 to the rightmost leaf node. Once again, to keep the order, we’ll split
the leaf node and copy-up 4. Since that will overflow the root node, we’ll have to
perform another split operation, splitting the root node into two nodes and promoting
3 into a new root node.
Notice the difference between splitting a leaf node and splitting an internal node.
When we split the internal node in the second split operation, we didn’t copy-up Key
3; it was moved up instead.
In the same way, we keep adding the keys from 6 to 10, each time splitting and
copying-up when necessary until we have reached our final tree:
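The steps of this walkthrough can be replayed with a short Python sketch. It assumes at most two keys per node and follows the split convention used in the walkthrough (the first key of the new right leaf is copied up, and a search key equal to a separator goes right). Record references are omitted and all names are illustrative.

```python
import bisect

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children  # None for a leaf

def insert(node, key, max_keys):
    """Insert key; return (separator, new_right_node) if node was split."""
    if node.children is None:                 # leaf node
        bisect.insort(node.keys, key)
        if len(node.keys) <= max_keys:
            return None
        mid = len(node.keys) // 2
        right = Node(node.keys[mid:])
        node.keys = node.keys[:mid]
        return right.keys[0], right           # copy-up: key stays in the leaf
    i = bisect.bisect_right(node.keys, key)
    split = insert(node.children[i], key, max_keys)
    if split is None:
        return None
    sep, right = split
    node.keys.insert(i, sep)
    node.children.insert(i + 1, right)
    if len(node.keys) <= max_keys:
        return None
    mid = len(node.keys) // 2
    up = node.keys[mid]
    new_right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys = node.keys[:mid]
    node.children = node.children[:mid + 1]
    return up, new_right                      # move-up: key leaves this level

def build(keys, max_keys=2):
    root = Node([])
    for k in keys:
        split = insert(root, k, max_keys)
        if split is not None:                 # root split: add a new level
            sep, right = split
            root = Node([sep], [root, right])
    return root

def dump(node):
    """Leaves become key lists; internal nodes become (keys, children)."""
    if node.children is None:
        return node.keys
    return (node.keys, [dump(c) for c in node.children])

def leaf_keys(node):
    if node.children is None:
        return list(node.keys)
    return [k for c in node.children for k in leaf_keys(c)]

print(dump(build(range(1, 6))))
print(leaf_keys(build(range(1, 11))))
```

After inserting keys 1 to 5 the sketch yields root 3 with internal children 2 and 4, matching the steps described above, and after keys 1 to 10 every key still appears in a leaf.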
• Most multi-level indexes use B-tree or B+-tree data structures because of the
insertion and deletion problem.
• These structures leave space in each tree node (disk block) to allow for new
index entries.
• These data structures are variations of search trees that allow efficient
insertion and deletion of new search values.
• In B-Tree and B+-Tree data structures, each node corresponds to a disk
block.
• Each node is kept between half-full and completely full.
• An insertion into a node that is not full is quite efficient.
• If a node is full the insertion causes a split into two nodes.
• Splitting may propagate to other tree levels.
• A deletion is quite efficient if a node does not become less than half full.
• If a deletion causes a node to become less than half full, it must be merged
with neighboring nodes.
Grid File
References
• Elmasri R. and S. Navathe, Database Systems: Models,
Languages, Design and Application Programming,
Pearson Education, 2013.
• https://www.baeldung.com/cs/b-trees-vs-btrees
• https://www.cs.usfca.edu/~galles/visualization/BTree.htm
• https://www.cs.usfca.edu/~galles/visualization/BPlusTree.html
MODULE 3 ENDS