DBMS Indexing and Storage

Database Management System

Uploaded by

Nameet Jain

Chapter 11: Indexing and Storage

Modified from:
Database System Concepts, 6th Ed.
©Silberschatz, Korth and Sudarshan
See www.db-book.com for conditions on re-use
Chapter 11: Indexing and Storage
• DBMS Storage
  ◦ Memory hierarchy
  ◦ File Organization
  ◦ Buffering
• Indexing
  ◦ Basic Concepts
  ◦ B+-Trees
  ◦ Static Hashing
  ◦ Index Definition in SQL
  ◦ Multiple-Key Access

CS425 – Fall 2013 – Boris Glavic 11.2 ©Silberschatz, Korth and Sudarshan
Memory Hierarchy

DBMS Storage
• Modern computers have different types of memory
  ◦ Cache, main memory, hard disk, SSD, …
• Memory types have different characteristics in terms of
  ◦ Persistent vs. volatile
  ◦ Speed (random vs. sequential access)
  ◦ Size
  ◦ Price – this usually determines size
• Database systems are designed to use these different memory types effectively
  ◦ Need for persistent storage: the state of the database needs to be written to persistent storage
    ▪ This guarantees that database content is not lost when the computer is shut down
  ◦ Moving data between different types of memory
    ▪ Want to use fast memory to speed up operations
    ▪ Need slower memory for its size
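Storage Hierarchy
[Figure: the storage hierarchy, listed from fastest and smallest to slowest and largest: cache, main memory, flash memory, magnetic disk, optical disk, magnetic tapes. Axis labels: "Speed" and "Size". The figure also carries the label "Persistent storage" (cache and main memory are volatile; the levels from flash memory down are persistent).]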

Main Memory vs. Disk
• Why do we not use only main memory?
  ◦ What if the database does not fit into main memory?
  ◦ Main memory is volatile
• Main memory vs. disk
  ◦ Given the available main memory, when do we keep which part of the database in main memory?
    ▪ Buffer manager: component of the DBMS that decides when to move data between disk and main memory
  ◦ How do we ensure the transaction property durability?
    ▪ The buffer manager needs to make sure that data written by committed transactions is written to disk

Random vs. Sequential Access
• Transfer of data from disk has a minimal size = 1 block
  ◦ Reading 1 byte takes as long as reading one block (e.g., 4 KB)
• Random access
  ◦ Read data from anywhere on the disk
  ◦ Need to get to the right track (seek time)
  ◦ Need to wait until the right sector is under the head (on average ½ of the time for one rotation) (rotational delay)
  ◦ Then can transfer data at ~ transfer rate
• Sequential access
  ◦ Read data that is on the current track + sector
  ◦ Can transfer data at ~ transfer rate
• Reading a large number of small pieces of data randomly is very slow compared to sequential access
  ◦ Thus, try to lay out data on disk in a way that enables sequential access
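
The cost difference can be made concrete with a back-of-the-envelope model. The sketch below charges every random block read a seek, half a rotation and one block transfer, while a sequential read pays the seek and rotational delay only once. The drive parameters (5 ms seek, 8 ms per rotation, 4 KB blocks, 100 MB/s) are assumed for illustration and do not come from the slides.

```python
def transfer_ms(block_kb, transfer_mb_per_s):
    """Time to transfer one block at the sustained transfer rate, in milliseconds."""
    return block_kb / 1024 / transfer_mb_per_s * 1000


def random_read_ms(num_blocks, seek_ms, rotation_ms, block_kb, transfer_mb_per_s):
    """Every block pays seek time + average rotational delay + transfer time."""
    per_block = seek_ms + rotation_ms / 2 + transfer_ms(block_kb, transfer_mb_per_s)
    return num_blocks * per_block


def sequential_read_ms(num_blocks, seek_ms, rotation_ms, block_kb, transfer_mb_per_s):
    """Seek and rotational delay are paid once; the rest is pure transfer."""
    return seek_ms + rotation_ms / 2 + num_blocks * transfer_ms(block_kb, transfer_mb_per_s)


# Assumed (hypothetical) drive parameters, not taken from the slides.
DRIVE = dict(seek_ms=5.0, rotation_ms=8.0, block_kb=4, transfer_mb_per_s=100.0)

random_total = random_read_ms(1000, **DRIVE)
sequential_total = sequential_read_ms(1000, **DRIVE)
print(f"random: {random_total:.1f} ms, sequential: {sequential_total:.1f} ms")
```

Under these assumptions, reading 1000 blocks randomly takes roughly two orders of magnitude longer than reading them sequentially.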

File Organization
• The database is stored as a collection of files. Each file stores records (tuples from a table). A record is a sequence of fields (the attributes of a tuple).
• Reading one record at a time from disk would be very slow (random access)
  ◦ Organize our database files in pages (size of a block or larger)
  ◦ Read/write data in units of pages
  ◦ One page will usually contain several records
• One approach:
  ◦ Assume record size is fixed
  ◦ Each file has records of one particular type only
  ◦ Different files are used for different relations
  This case is easiest to implement; we will consider variable-length records later.

Fixed-Length Records
• Simple approach:
  ◦ Store record i starting from byte $n \times (i - 1)$, where n is the size of each record. Put at most $\lfloor P / n \rfloor$ records on each page, where P is the page size.
  ◦ Record access is simple, but records may cross block boundaries
    ▪ Modification: do not allow records to cross block boundaries
• Deletion of record i, alternatives (here n denotes the number of the last record, not the record size):
  ◦ Move records i + 1, …, n to i, …, n – 1
  ◦ Move record n to i
  ◦ Do not move records, but link all free records on a free list

Free Lists
• Store the address of the first deleted record in the file header.
• Use this first record to store the address of the second deleted record, and so on.
• Can think of these stored addresses as pointers, since they “point” to the location of a record.
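
A minimal sketch of a fixed-length record file with a free list, kept in memory for illustration. The class name `FixedLengthFile`, the `FreeSlot` marker and the use of record numbers (rather than byte addresses) as pointers are assumptions of this example, not part of the slides.

```python
from collections import namedtuple

# A deleted slot stores the number of the next free record (or None at the end of the list).
FreeSlot = namedtuple("FreeSlot", "next_free")


class FixedLengthFile:
    """In-memory stand-in for a file of fixed-length records with a free list."""

    def __init__(self, record_size):
        self.record_size = record_size
        self.slots = []        # slots[i - 1] holds record i (records are numbered from 1)
        self.free_head = None  # "file header": number of the first deleted record

    def offset(self, i):
        """Byte offset at which record i starts."""
        return self.record_size * (i - 1)

    def insert(self, record):
        """Reuse the first free slot if there is one, otherwise append at the end."""
        if self.free_head is not None:
            i = self.free_head
            self.free_head = self.slots[i - 1].next_free
            self.slots[i - 1] = record
        else:
            self.slots.append(record)
            i = len(self.slots)
        return i

    def delete(self, i):
        """Do not move any records; push slot i onto the front of the free list."""
        self.slots[i - 1] = FreeSlot(self.free_head)
        self.free_head = i


f = FixedLengthFile(record_size=40)
first, second, third = f.insert("a"), f.insert("b"), f.insert("c")
f.delete(second)
f.delete(first)               # free list is now: header -> 1 -> 2
reused_1 = f.insert("d")      # takes slot 1
reused_2 = f.insert("e")      # takes slot 2
appended = f.insert("f")      # free list is empty again, so the file grows
```

Because deleted slots are chained through the slots themselves, the free list needs no extra space beyond the single pointer in the header.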

Variable-Length Records
• Variable-length records arise in database systems in several ways:
  ◦ Storage of multiple record types in a file.
  ◦ Record types that allow variable lengths for one or more fields, such as strings (varchar).
  ◦ Record types that allow repeating fields (used in some older data models).
• Attributes are stored in order
• Variable-length attributes are represented by a fixed-size (offset, length) pair, with the actual data stored after all fixed-length attributes
• Null values are represented by a null-value bitmap

Variable-Length Records: Slotted Page Structure
• The slotted page header contains:
  ◦ number of record entries
  ◦ end of free space in the block
  ◦ location and size of each record
• Records can be moved around within a page to keep them contiguous with no empty space between them; the entry in the header must be updated.
• Pointers should not point directly to a record; instead they should point to the entry for the record in the header.
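
The slotted page idea can be sketched as follows. This is an illustrative model only: the class `SlottedPage` is hypothetical, the header is kept as a Python list instead of being serialized into the page, and free-space checks are omitted. It shows why external pointers should reference the slot number: compaction moves records, but the slot numbers stay valid.

```python
class SlottedPage:
    """Toy slotted page: records grow from the end of the page toward the header."""

    def __init__(self, size=4096):
        self.data = bytearray(size)
        self.free_end = size   # end of free space in the block
        self.slots = []        # header entries: (offset, length), or None if deleted

    def insert(self, record: bytes) -> int:
        """Store the record just below the current end of free space; return its slot number."""
        self.free_end -= len(record)
        self.data[self.free_end:self.free_end + len(record)] = record
        self.slots.append((self.free_end, len(record)))
        return len(self.slots) - 1

    def read(self, slot: int) -> bytes:
        offset, length = self.slots[slot]
        return bytes(self.data[offset:offset + length])

    def delete(self, slot: int) -> None:
        """Remove a record and compact the rest so that they stay contiguous."""
        self.slots[slot] = None
        live = [(s, self.read(s)) for s, entry in enumerate(self.slots) if entry is not None]
        self.free_end = len(self.data)
        for s, record in live:
            self.free_end -= len(record)
            self.data[self.free_end:self.free_end + len(record)] = record
            self.slots[s] = (self.free_end, len(record))  # header entry is updated


page = SlottedPage(size=64)
a = page.insert(b"alice")
b = page.insert(b"bob")
c = page.insert(b"carol")
page.delete(b)   # "carol" is moved, but slot number c still finds it
```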

Organization of Records in Files
• Heap – a record can be placed anywhere in the file where there is space
  ◦ Deletion efficient
  ◦ Insertion efficient
  ◦ Search is expensive
    ▪ Example: get the instructor with name Glavic
      – Have to search through all instructors
• Sequential – store records in sequential order, based on the value of some search key of each record
  ◦ Deletion expensive and/or wastes space
  ◦ Insertion expensive and/or wastes space
  ◦ Search is efficient (e.g., binary search)
    ▪ As long as the search is on the search key we are ordering on

Buffering

Buffer Manager
• Buffer manager
  ◦ Responsible for loading pages from disk and writing modified pages back to disk
• Handling blocks
  1. If the block is already in the buffer, the buffer manager returns the address of the block in main memory.
  2. If the block is not in the buffer, the buffer manager
     a. Allocates space in the buffer for the block,
        ▪ replacing (throwing out) some other block, if required, to make space for the new block;
        ▪ the replaced block is written back to disk only if it was modified since the most recent time that it was written to/fetched from the disk.
     b. Reads the block from the disk into the buffer, and returns the address of the block in main memory to the requester.
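
The block-handling steps above can be sketched as a small buffer pool. The slide does not fix a replacement policy at this point, so LRU is an assumption of this example; the class `BufferManager`, its counters, and the dict standing in for the disk are likewise hypothetical.

```python
from collections import OrderedDict


class BufferManager:
    """Toy buffer pool with LRU replacement and write-back of modified blocks."""

    def __init__(self, capacity, disk):
        self.capacity = capacity
        self.disk = disk             # dict: block id -> contents (stands in for the disk)
        self.frames = OrderedDict()  # buffered blocks, least recently used first
        self.dirty = set()           # blocks modified since they were fetched/written
        self.disk_reads = 0
        self.disk_writes = 0

    def get(self, block_id):
        if block_id in self.frames:                 # step 1: already buffered
            self.frames.move_to_end(block_id)
            return self.frames[block_id]
        if len(self.frames) >= self.capacity:       # step 2a: make space
            victim, contents = self.frames.popitem(last=False)
            if victim in self.dirty:                # write back only if modified
                self.disk[victim] = contents
                self.dirty.discard(victim)
                self.disk_writes += 1
        self.frames[block_id] = self.disk[block_id]  # step 2b: read from disk
        self.disk_reads += 1
        return self.frames[block_id]

    def write(self, block_id, contents):
        """Modify a block in the buffer; the disk copy is updated on eviction."""
        self.get(block_id)
        self.frames[block_id] = contents
        self.dirty.add(block_id)


disk = {1: "a", 2: "b", 3: "c"}
bm = BufferManager(capacity=2, disk=disk)
bm.get(1)
bm.write(2, "B")
bm.get(3)   # evicts block 1, which is clean: no disk write
bm.get(1)   # evicts block 2, which is dirty: written back to disk
```

A real buffer manager would also honor pinned blocks and forced output for recovery, which this sketch leaves out.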

Buffer-Replacement Policies
• Most operating systems replace the block least recently used (LRU strategy)
• Idea behind LRU – use the past pattern of block references as a predictor of future references
• Queries have well-defined access patterns (such as sequential scans), and a database system can use the information in a user’s query to predict future references
  ◦ LRU can be a bad strategy for certain access patterns involving repeated scans of data
    ▪ For example: when computing the join of 2 relations r and s by nested loops
        for each tuple tr of r do
          for each tuple ts of s do
            if the tuples tr and ts match …
  ◦ A mixed strategy with hints on replacement strategy provided by the query optimizer is preferable
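
To see why LRU behaves badly for repeated scans, the following sketch simulates the inner loop of the nested-loops join at block granularity: the blocks of s are scanned once per tuple of r, with a buffer that is one frame too small to hold all of s. The function `count_hits` and the chosen sizes are illustrative assumptions; pinning is not modeled.

```python
def count_hits(policy, num_frames, inner_blocks, outer_tuples):
    """Count buffer hits while scanning blocks 0..inner_blocks-1 once per outer tuple."""
    frames = []  # buffered block ids, ordered from least to most recently used
    hits = 0
    for _ in range(outer_tuples):
        for block in range(inner_blocks):
            if block in frames:
                hits += 1
                frames.remove(block)
            elif len(frames) == num_frames:
                # LRU evicts the front of the list, MRU evicts the back.
                frames.pop(0 if policy == "LRU" else -1)
            frames.append(block)
    return hits


lru_hits = count_hits("LRU", num_frames=3, inner_blocks=4, outer_tuples=3)
mru_hits = count_hits("MRU", num_frames=3, inner_blocks=4, outer_tuples=3)
print(lru_hits, mru_hits)
```

In this setting LRU always evicts exactly the block that will be needed soonest, so every access after the first scan is still a miss, while MRU keeps most of s in the buffer across scans.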

Buffer-Replacement Policies (Cont.)
• Pinned block – memory block that is not allowed to be written back to disk, e.g., because an operation still needs this block.
• Toss-immediate strategy – frees the space occupied by a block as soon as the final tuple of that block has been processed
• Most recently used (MRU) strategy – the system must pin the block currently being processed. After the final tuple of that block has been processed, the block is unpinned, and it becomes the most recently used block.
• The buffer manager can use statistical information regarding the probability that a request will reference a particular relation
  ◦ E.g., the data dictionary is frequently accessed. Heuristic: keep data-dictionary blocks in the main memory buffer
• Buffer managers also support forced output of blocks for the purpose of recovery (more in Chapter 16 in the textbook)

Indexing and Hashing

Basic Concepts
• Indexing mechanisms are used to speed up access to desired data.
  ◦ E.g., author catalog in a library
• Search key – attribute or set of attributes used to look up records in a file.
• An index file consists of records (called index entries) of the form (search-key, pointer)
• Index files are typically much smaller than the original file
• Two basic kinds of indices:
  ◦ Ordered indices: search keys are stored in some sorted order
  ◦ Hash indices: search keys are distributed uniformly across “buckets” using a “hash function”.

Index Evaluation Metrics
• Access types supported efficiently. E.g.,
  ◦ records with a specified value in the attribute
  ◦ or records with an attribute value falling in a specified range of values.
• Access time
• Insertion time
• Deletion time
• Space overhead

Ordered Indices
• In an ordered index, index entries are stored sorted on the search-key value. E.g., author catalog in a library.
• Primary index: in a sequentially ordered file, the index whose search key specifies the sequential order of the file.
  ◦ Also called clustering index
  ◦ The search key of a primary index is usually but not necessarily the primary key.
• Secondary index: an index whose search key specifies an order different from the sequential order of the file. Also called non-clustering index.
• Index-sequential file: ordered sequential file with a primary index.

Secondary Indices Example
[Figure: secondary index on salary field of instructor]
• An index record points to a bucket that contains pointers to all the actual records with that particular search-key value.
• Secondary indices have to be dense
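
A small sketch of such a dense secondary index with buckets of record pointers. The instructor rows below are hypothetical sample data, list positions stand in for record pointers, and the function names are made up for this example.

```python
import bisect
from collections import defaultdict

# Hypothetical instructor file, stored sequentially by ID: (ID, name, dept_name, salary).
instructors = [
    (10101, "Srinivasan", "Comp. Sci.", 65000),
    (12121, "Wu", "Finance", 90000),
    (15151, "Mozart", "Music", 40000),
    (58583, "Califieri", "History", 80000),
    (98345, "Kim", "Elec. Eng.", 80000),
]
SALARY = 3  # position of the salary field in a record


def build_secondary_index(records, field):
    """Dense index: one entry per search-key value, pointing to a bucket of positions."""
    buckets = defaultdict(list)
    for position, record in enumerate(records):
        buckets[record[field]].append(position)
    return sorted(buckets), buckets  # sorted key list + buckets of record pointers


def lookup_range(index, low, high):
    """Return pointers to all records with low <= key <= high, in key order."""
    keys, buckets = index
    start = bisect.bisect_left(keys, low)
    stop = bisect.bisect_right(keys, high)
    return [pos for key in keys[start:stop] for pos in buckets[key]]


salary_index = build_secondary_index(instructors, SALARY)
same_salary = lookup_range(salary_index, 80000, 80000)
mid_range = lookup_range(salary_index, 60000, 85000)
```

The index has to be dense because the file is not sorted on salary: a record with a given salary can be anywhere, so every search-key value needs its own entry.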

Primary and Secondary Indices
• Indices offer substantial benefits when searching for records.
• BUT: updating indices imposes overhead on database modification: when a file is modified, every index on the file must be updated.
• A sequential scan using a primary index is efficient, but a sequential scan using a secondary index is expensive
  ◦ Each record access may fetch a new block from disk
  ◦ A block fetch requires about 5 to 10 milliseconds, versus about 100 nanoseconds for memory access

Secondary Indices
• Frequently, one wants to find all the records whose values in a certain field (which is not the search key of the primary index) satisfy some condition.
  ◦ Example 1: in the instructor relation stored sequentially by ID, we may want to find all instructors in a particular department
  ◦ Example 2: as above, but where we want to find all instructors with a specified salary or with salary in a specified range of values
• We can have a secondary index with an index record for each search-key value

B+-Tree Index
B+-tree indices are an alternative to indexed-sequential files.
• Disadvantage of indexed-sequential files
  ◦ Performance degrades as the file grows, since many overflow blocks get created.
  ◦ Periodic reorganization of the entire file is required.
• Advantage of B+-tree index files:
  ◦ Automatically reorganizes itself with small, local changes in the face of insertions and deletions.
  ◦ Reorganization of the entire file is not required to maintain performance.
• (Minor) disadvantage of B+-trees:
  ◦ Extra insertion and deletion overhead, space overhead.
• Advantages of B+-trees outweigh disadvantages
  ◦ B+-trees are used extensively

Example of B+-Tree

B+-Tree Index Files (Cont.)
A B+-tree is a rooted tree satisfying the following properties:
• All paths from root to leaf are of the same length
• Each node that is not a root or a leaf has between $\lceil n/2 \rceil$ and $n$ children.
• A leaf node has between $\lceil (n-1)/2 \rceil$ and $n-1$ values
• Special cases:
  ◦ If the root is not a leaf, it has at least 2 children.
  ◦ If the root is a leaf (that is, there are no other nodes in the tree), it can have between 0 and $n-1$ values.

B+-Tree Node Structure
• Typical node: $P_1, K_1, P_2, \ldots, P_{n-1}, K_{n-1}, P_n$
  ◦ $K_i$ are the search-key values
  ◦ $P_i$ are pointers to children (for non-leaf nodes) or pointers to records or buckets of records (for leaf nodes).
• The search keys in a node are ordered
  $K_1 < K_2 < K_3 < \ldots < K_{n-1}$
  (Initially assume no duplicate keys; duplicates are addressed later)

Leaf Nodes in B+-Trees
Properties of a leaf node:
• For $i = 1, 2, \ldots, n-1$, pointer $P_i$ points to a file record with search-key value $K_i$.
• If $L_i$, $L_j$ are leaf nodes and $i < j$, $L_i$’s search-key values are less than or equal to $L_j$’s search-key values
• $P_n$ points to the next leaf node in search-key order

Example of B+-tree
[Figure: B+-tree for instructor file (n = 6)]
• Leaf nodes must have between 3 and 5 values ($\lceil (n-1)/2 \rceil$ and $n-1$, with $n = 6$).
• Non-leaf nodes other than the root must have between 3 and 6 children ($\lceil n/2 \rceil$ and $n$, with $n = 6$).
• The root must have at least 2 children.
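
The occupancy bounds follow directly from n. As a quick check, a hypothetical helper `occupancy_bounds` (not part of the slides) computes them:

```python
import math


def occupancy_bounds(n):
    """Return ((min, max) values per leaf, (min, max) children per non-root internal node)."""
    leaf = (math.ceil((n - 1) / 2), n - 1)
    internal = (math.ceil(n / 2), n)
    return leaf, internal


leaf_bounds, internal_bounds = occupancy_bounds(6)
print(leaf_bounds, internal_bounds)
```

For n = 6 this gives 3 to 5 values per leaf and 3 to 6 children per non-root internal node, matching the bullets above.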

Observations about B+-trees
• Since the inter-node connections are done by pointers, “logically” close blocks need not be “physically” close.
• The non-leaf levels of the B+-tree form a hierarchy of sparse indices.
• The B+-tree contains a relatively small number of levels
  ◦ Level below root has at least $2 \cdot \lceil n/2 \rceil$ values
  ◦ Next level has at least $2 \cdot \lceil n/2 \rceil \cdot \lceil n/2 \rceil$ values
  ◦ … etc.
  ◦ If there are $K$ search-key values in the file, the tree height is no more than $\lceil \log_{\lceil n/2 \rceil}(K) \rceil$
  ◦ Thus searches can be conducted efficiently.
• Insertions and deletions to the main file can be handled efficiently, as the index can be restructured in logarithmic time (as we shall see).

Queries on B+-Trees
• Find record with search-key value V.
  1. C = root
  2. While C is not a leaf node {
     1. Let i be the least value s.t. $V \le K_i$.
     2. If no such i exists, set C = last non-null pointer in C
     3. Else { if ($V = K_i$) set C = $P_{i+1}$ else set C = $P_i$ }
     }
  3. Let i be the least value s.t. $K_i = V$
  4. If there is such a value i, follow pointer $P_i$ to the desired record.
  5. Else no record with search-key value V exists.
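
The lookup procedure translates almost line by line into code. The sketch below assumes unique keys and uses 0-based list indices where the slide uses 1-based subscripts; the `Node` class, the tiny hand-built tree and the string "record pointers" are all made up for illustration.

```python
class Node:
    def __init__(self, keys, pointers, is_leaf):
        self.keys = keys          # sorted search-key values K_1 .. K_m
        self.pointers = pointers  # leaf: one record pointer per key; non-leaf: m + 1 children
        self.is_leaf = is_leaf


def find(root, v):
    """Return the record pointer for search-key value v, or None if it does not exist."""
    c = root
    while not c.is_leaf:
        # Least i such that v <= K_i (None if there is no such key).
        i = next((j for j, k in enumerate(c.keys) if v <= k), None)
        if i is None:
            c = c.pointers[-1]     # last non-null pointer in C
        elif v == c.keys[i]:
            c = c.pointers[i + 1]  # P_{i+1}
        else:
            c = c.pointers[i]      # P_i
    for k, p in zip(c.keys, c.pointers):
        if k == v:
            return p
    return None


def leaf(*names):
    return Node(list(names), [f"record:{name}" for name in names], is_leaf=True)


# Hypothetical two-level tree over instructor names.
root = Node(
    ["Califieri", "Einstein"],
    [leaf("Adams", "Brandt"), leaf("Califieri", "Crick"), leaf("Einstein", "Gold")],
    is_leaf=False,
)

print(find(root, "Crick"), find(root, "Zed"))
```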

Handling Duplicates
• With duplicate search keys
  ◦ In both leaf and internal nodes,
    ▪ we cannot guarantee that $K_1 < K_2 < K_3 < \ldots < K_{n-1}$
    ▪ but can guarantee $K_1 \le K_2 \le K_3 \le \ldots \le K_{n-1}$
  ◦ Search keys in the subtree to which $P_i$ points
    ▪ are $\le K_i$, but not necessarily $< K_i$
    ▪ To see why, suppose the same search-key value V is present in two leaf nodes $L_i$ and $L_{i+1}$. Then in the parent node $K_i$ must be equal to V

Queries on B+-Trees (Cont.)
• If there are $K$ search-key values in the file, the height of the tree is no more than $\lceil \log_{\lceil n/2 \rceil}(K) \rceil$.
• A node is generally the same size as a disk block, typically 4 kilobytes
  ◦ and n is typically around 100 (40 bytes per index entry).
• With 1 million search-key values and n = 100
  ◦ at most $\lceil \log_{50}(1{,}000{,}000) \rceil = 4$ nodes are accessed in a lookup.
• Contrast this with a balanced binary tree with 1 million search-key values: around 20 nodes are accessed in a lookup
  ◦ The above difference is significant since every node access may need a disk I/O, costing around 20 milliseconds
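
The numbers on this slide can be checked with a few lines of arithmetic. The function name `max_lookup_nodes` is made up for this example; the 20 ms per node access is the figure quoted above.

```python
import math


def max_lookup_nodes(num_keys, n):
    """Upper bound on nodes visited in a B+-tree lookup: ceil(log base ceil(n/2) of K)."""
    return math.ceil(math.log(num_keys) / math.log(math.ceil(n / 2)))


btree_nodes = max_lookup_nodes(1_000_000, 100)
binary_nodes = math.ceil(math.log2(1_000_000))

# With one disk I/O per node access at about 20 ms each:
btree_ms = btree_nodes * 20
binary_ms = binary_nodes * 20
print(btree_nodes, binary_nodes, btree_ms, binary_ms)
```

The high fan-out is what keeps the tree shallow: each level divides the remaining search space by at least 50 instead of by 2.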

Updates on B+-Trees: Insertion (Cont.)
• Splitting a leaf node:
  ◦ Take the n (search-key value, pointer) pairs (including the one being inserted) in sorted order. Place the first $\lceil n/2 \rceil$ in the original node, and the rest in a new node.
  ◦ Let the new node be p, and let k be the least key value in p. Insert (k, p) in the parent of the node being split.
  ◦ If the parent is full, split it and propagate the split further up.
• Splitting of nodes proceeds upwards till a node that is not full is found.
  ◦ In the worst case the root node may be split, increasing the height of the tree by 1.
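
The leaf split itself can be sketched in a few lines. This covers only the splitting step, not the insertion into the parent or the upward propagation; the function `split_leaf` and the string pointers are illustrative assumptions. The example uses n = 4 and a full leaf holding Brandt, Califieri and Crick, into which Adams is inserted.

```python
import bisect
import math


def split_leaf(pairs, new_pair, n):
    """Split a full leaf (n - 1 sorted (key, pointer) pairs) on inserting new_pair.

    Returns (pairs kept in the original node, pairs of the new node p,
    least key k of p, which goes into the parent together with a pointer to p).
    """
    all_pairs = list(pairs)
    bisect.insort(all_pairs, new_pair)   # the n pairs in sorted order
    cut = math.ceil(n / 2)               # the first ceil(n/2) stay in the original node
    original, new_node = all_pairs[:cut], all_pairs[cut:]
    return original, new_node, new_node[0][0]


full_leaf = [("Brandt", "r1"), ("Califieri", "r2"), ("Crick", "r3")]
original, new_node, k = split_leaf(full_leaf, ("Adams", "r4"), n=4)
print([key for key, _ in original], [key for key, _ in new_node], k)
```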

[Figure: result of splitting the node containing Brandt, Califieri and Crick on inserting Adams]
Next step: insert entry with (Califieri, pointer-to-new-node) into parent

[Figure: B+-tree before and after insertion of “Adams”]

[Figure: B+-tree before and after insertion of “Lamport”]

[Figure: B+-tree before and after deleting “Srinivasan”]
• Deleting “Srinivasan” causes merging of under-full leaves

[Figure: deletion of “Singh” and “Wu” from result of previous example]
• The leaf containing Singh and Wu became underfull, and borrowed a value Kim from its left sibling
• The search-key value in the parent changes as a result
Non-Unique Search Keys
• Alternatives to the scheme described earlier
  ◦ Buckets on separate block (bad idea)
  ◦ List of tuple pointers with each key
    ▪ Extra code to handle long lists
    ▪ Deletion of a tuple can be expensive if there are many duplicates on the search key (why?)
    ▪ Low space overhead, no extra cost for queries
  ◦ Make the search key unique by adding a record-identifier
    ▪ Extra storage overhead for keys
    ▪ Simpler code for insertion/deletion
    ▪ Widely used

Static Hashing
• A bucket is a unit of storage containing one or more records (a bucket is typically a disk block).
• In a hash file organization we obtain the bucket of a record directly from its search-key value using a hash function.
• Hash function h is a function from the set of all search-key values K to the set of all bucket addresses B.
• The hash function is used to locate records for access, insertion as well as deletion.
• Records with different search-key values may be mapped to the same bucket; thus the entire bucket has to be searched sequentially to locate a record.

Hash file organization of instructor file, using dept_name as key
(See figure in next slide.)
• There are 8 buckets.
• The binary representation of the ith letter of the alphabet is assumed to be the integer i.
• The hash function returns the sum of the binary representations of the characters modulo 8
  ◦ E.g. h(Music) = 1, h(History) = 2, h(Physics) = 3, h(Elec. Eng.) = 3
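
A sketch of this hash function. The listed values h(Music) = 1, h(History) = 2, h(Physics) = 3 and h(Elec. Eng.) = 3 are reproduced when letters are numbered a = 1 … z = 26 and the sum is reduced modulo 8. Ignoring case, spaces and punctuation is an assumption of this example, and the number of buckets is left as a parameter.

```python
def h(dept_name, num_buckets=8):
    """Sum of letter positions (a = 1, ..., z = 26) modulo the number of buckets.

    Assumption: case is ignored and non-letter characters contribute nothing.
    """
    total = sum(ord(ch) - ord("a") + 1 for ch in dept_name.lower() if "a" <= ch <= "z")
    return total % num_buckets


for name in ("Music", "History", "Physics", "Elec. Eng."):
    print(name, h(name))
```

Physics and Elec. Eng. land in the same bucket, which illustrates that records with different search-key values can share a bucket.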

[Figure: hash file organization of instructor file, using dept_name as key (see previous slide for details)]
Hash Functions
• The worst hash function maps all search-key values to the same bucket; this makes access time proportional to the number of search-key values in the file.
• An ideal hash function is uniform, i.e., each bucket is assigned the same number of search-key values from the set of all possible values.
• An ideal hash function is random, so each bucket will have the same number of records assigned to it irrespective of the actual distribution of search-key values in the file.
• Typical hash functions perform computation on the internal binary representation of the search key.
  ◦ For example, for a string search key, the binary representations of all the characters in the string could be added and the sum modulo the number of buckets could be returned.

Handling of Bucket Overflows
• Bucket overflow can occur because of
  ◦ Insufficient buckets
  ◦ Skew in distribution of records. This can occur due to two reasons:
    ▪ multiple records have the same search-key value
    ▪ the chosen hash function produces a non-uniform distribution of key values
• Although the probability of bucket overflow can be reduced, it cannot be eliminated; it is handled by using overflow buckets.
• Overflow chaining – the overflow buckets of a given bucket are chained together in a linked list.
• The above scheme is called closed hashing.
  ◦ An alternative, called open hashing, which does not use overflow buckets, is not suitable for database applications.
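
Overflow chaining can be sketched as a linked list of fixed-capacity buckets. The classes `Bucket` and `HashFile` are hypothetical, and the identity hash used in the example is chosen only to make the bucket assignment easy to follow.

```python
class Bucket:
    def __init__(self, capacity):
        self.capacity = capacity
        self.records = []      # (key, record) pairs stored in this bucket
        self.overflow = None   # next bucket in the overflow chain, if any


class HashFile:
    """Static hash file: a fixed number of buckets, each with an overflow chain."""

    def __init__(self, num_buckets, bucket_capacity, hash_fn):
        self.buckets = [Bucket(bucket_capacity) for _ in range(num_buckets)]
        self.hash_fn = hash_fn

    def _bucket_for(self, key):
        return self.buckets[self.hash_fn(key) % len(self.buckets)]

    def insert(self, key, record):
        b = self._bucket_for(key)
        while len(b.records) >= b.capacity:   # walk the chain to a bucket with space
            if b.overflow is None:
                b.overflow = Bucket(b.capacity)
            b = b.overflow
        b.records.append((key, record))

    def lookup(self, key):
        """Search the primary bucket and all of its overflow buckets sequentially."""
        b, found = self._bucket_for(key), []
        while b is not None:
            found.extend(record for k, record in b.records if k == key)
            b = b.overflow
        return found


hf = HashFile(num_buckets=2, bucket_capacity=2, hash_fn=lambda key: key)
for key in (0, 2, 4, 6, 8, 1):
    hf.insert(key, f"r{key}")   # all even keys collide in bucket 0
```

With five records hashing to bucket 0 and a capacity of two, that bucket ends up with a chain of two overflow buckets, and a lookup may have to visit all three.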

Hash Indices
• Hashing can be used not only for file organization, but also for index-structure creation.
• A hash index organizes the search keys, with their associated record pointers, into a hash file structure.
• Strictly speaking, hash indices are always secondary indices
  ◦ If the file itself is organized using hashing, a separate primary hash index on it using the same search key is unnecessary.
  ◦ However, we use the term hash index to refer to both secondary index structures and hash-organized files.

[Figure: hash index on instructor, on attribute ID]

Deficiencies of Static Hashing
• In static hashing, function h maps search-key values to a fixed set B of bucket addresses. Databases grow or shrink with time.
  ◦ If the initial number of buckets is too small, and the file grows, performance will degrade due to too many overflows.
  ◦ If space is allocated for anticipated growth, a significant amount of space will be wasted initially (and buckets will be underfull).
  ◦ If the database shrinks, again space will be wasted.
• One solution: periodic reorganization of the file with a new hash function
  ◦ Expensive, disrupts normal operations
• Better solution: allow the number of buckets to be modified dynamically.

Index Definition in SQL
• Create an index
    create index <index-name> on <relation-name> (<attribute-list>)
  E.g.: create index b_index on branch(branch_name)
• Use create unique index to indirectly specify and enforce the condition that the search key is a candidate key.
• To drop an index
    drop index <index-name>
• Most database systems allow specification of the type of index, and clustering.
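
These statements can be tried out with Python's built-in sqlite3 module. The sketch below is one possible illustration: the branch schema, the sample rows and the index names are assumptions, and SQLite is used only because it needs no setup. Other systems offer additional options such as index type and clustering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema for the branch relation.
conn.execute("create table branch (branch_name text, branch_city text, assets integer)")

# Plain index and unique index on the same attribute.
conn.execute("create index b_index on branch(branch_name)")
conn.execute("create unique index b_unique on branch(branch_name)")

conn.execute("insert into branch values ('Downtown', 'Brooklyn', 900000)")
try:
    # Violates the candidate-key condition enforced by the unique index.
    conn.execute("insert into branch values ('Downtown', 'Horseneck', 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

conn.execute("drop index b_index")
index_names = [
    row[0]
    for row in conn.execute(
        "select name from sqlite_master where type = 'index' and tbl_name = 'branch'"
    )
]
print(duplicate_rejected, index_names)
```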

From Everand
Hard Circle Drives (HDDs): Uncovering the Center of Information Stockpiling
Friend Good
No ratings yet
FreeBSD Mastery: Storage Essentials: IT Mastery, #4
From Everand
FreeBSD Mastery: Storage Essentials: IT Mastery, #4
Michael W. Lucas
No ratings yet
FreeBSD Mastery: Advanced ZFS: IT Mastery, #9
From Everand
FreeBSD Mastery: Advanced ZFS: IT Mastery, #9
Michael W. Lucas
No ratings yet


n The database is stored as a collection of files. Each file stores

n Variable-length records arise in database systems in several ways:

n Slotted page header contains:

n Heap – a record can be placed anywhere in the file where there

Secondary index on salary field of instructor

n Index record points to a bucket that contains pointers to all the

B+-tree indices are an alternative to indexed-sequential files.

n Disadvantage of indexed-sequential files

A B+-tree is a rooted tree satisfying the following properties:

n All paths from root to leaf are of the same length

l Ki are the search-key values

Properties of a leaf node:

B+-tree for instructor file (n = 6)

n Leaf nodes must have between 3 and 5 values
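The 3-to-5 range above is consistent with the usual B+-tree occupancy rule, under which a non-root leaf holds between ⌈(n–1)/2⌉ and n–1 values and a non-root internal node has between ⌈n/2⌉ and n children. The following is a minimal sketch of that arithmetic; the function names are hypothetical and chosen only for illustration.

```python
import math


def leaf_value_bounds(n):
    """Return (min, max) search-key values allowed in a non-root leaf.

    Assumes the standard rule: ceil((n - 1) / 2) up to n - 1 values,
    where n is the number of pointers per node.
    """
    return math.ceil((n - 1) / 2), n - 1


def internal_child_bounds(n):
    """Return (min, max) children allowed in a non-root internal node.

    Assumes the standard rule: ceil(n / 2) up to n children.
    """
    return math.ceil(n / 2), n


# For the instructor example with n = 6:
print(leaf_value_bounds(6))      # (3, 5)
print(internal_child_bounds(6))  # (3, 6)
```

With n = 6 the leaf bound evaluates to 3 and 5, matching the statement above.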

 but can guarantee K1 ≤ K2 ≤ K3 ≤ . . . ≤ Kn–1

l Search-keys in the subtree to which Pi points

 To see why, suppose same search key value V is present

B+-Tree before and after insertion of “Adams”

B+-Tree before and after insertion of “Lamport”

Before and after deleting “Srinivasan”

n Deleting “Srinivasan” causes merging of under-full leaves

Deletion of “Singh” and “Wu” from result of previous example

n Leaf containing Singh and Wu became underfull, and borrowed a value

n A bucket is a unit of storage containing one or more records (a

Hash file organization of instructor file, using dept_name as key

n There are 10 buckets,

Hash file organization of instructor file, using dept_name as key

n Overflow chaining – the overflow buckets of a given bucket are
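A static hash file with 10 buckets and overflow chaining could be sketched roughly as follows. This is an illustration under stated assumptions, not the organization used in the figures: the hash function (sum of character codes modulo 10), the bucket capacity of 2, and the record values are all hypothetical.

```python
NUM_BUCKETS = 10      # bucket count taken from the example above
BUCKET_CAPACITY = 2   # hypothetical capacity, chosen to force an overflow


class Bucket:
    """A bucket holds up to BUCKET_CAPACITY records plus a link to an overflow bucket."""

    def __init__(self):
        self.records = []     # list of (key, record) pairs
        self.overflow = None  # next bucket in the overflow chain, if any


def h(key):
    """Hypothetical hash function: sum of character codes modulo the bucket count."""
    return sum(ord(c) for c in key) % NUM_BUCKETS


def insert(buckets, key, record):
    """Place the record in its home bucket, chaining a new overflow bucket when full."""
    bucket = buckets[h(key)]
    while len(bucket.records) >= BUCKET_CAPACITY:
        if bucket.overflow is None:
            bucket.overflow = Bucket()
        bucket = bucket.overflow
    bucket.records.append((key, record))


def lookup(buckets, key):
    """Scan the home bucket and its whole overflow chain for matching keys."""
    matches = []
    bucket = buckets[h(key)]
    while bucket is not None:
        # Different keys can hash to the same bucket, so compare each key.
        matches.extend(rec for k, rec in bucket.records if k == key)
        bucket = bucket.overflow
    return matches


buckets = [Bucket() for _ in range(NUM_BUCKETS)]
for rec in ("r1", "r2", "r3"):  # placeholder records
    insert(buckets, "Physics", rec)
```

Here the third insert does not fit in the home bucket, so it lands in a chained overflow bucket, and a lookup has to follow the chain to find it.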

Hash index on instructor, on attribute ID
