
DBMS VTH Unit


Database Security and Database Recovery
INDEX
❑ DATABASE
❑ TYPES OF DATABASES
❑ DATABASE MANAGEMENT SYSTEM
❑ DATA RECOVERY
▪ FAILURE CLASSIFICATION
▪ TRANSACTION FAILURE
▪ SYSTEM CRASH
▪ DISK FAILURE
▪ STORAGE STRUCTURE
▪ VOLATILE STORAGE
▪ NON-VOLATILE STORAGE
▪ RECOVERY AND ATOMICITY
▪ LOG BASED RECOVERY
▪ RECOVERY WITH CONCURRENT TRANSACTIONS
▪ CHECKPOINT
▪ HOW DATA RECOVERY WORKS
▪ DATA RECOVERY METHODS
❑ DATA SECURITY
▪ WHY IS DATA SECURITY IMPORTANT?
▪ SOURCES OF VULNERABILITY
▪ SECURITY THREATS: MOST COMMON ATTACKS
▪ DATA SECURITY ARCHITECTURE
▪ OBJECTIVES OF THE SECURITY ARCHITECTURE
▪ DATABASE SECURITY STRATEGY
▪ KEY PILLARS OF DATABASE SECURITY STRATEGY
▪ METHODS TO SECURE DATABASES
▪ DATABASE SECURITY BEST PRACTICES
▪ 10 BEST SYSTEMS AVAILABLE FOR BUSINESS PROFESSIONALS
DATABASE
A database is an organized collection of data, generally stored and
accessed electronically from a computer system.

Types Of Database
• Traditional Database (TDB)
• A traditional database consists of text and numbers only.
• Multimedia Database (MMDB)
• A multimedia database (MMDB) is a collection of
related multimedia data.
• The multimedia data include one or more primary media data types
such as text, images, graphic objects (including drawings, sketches
and illustrations), animation sequences, audio and video.
• Geographic Information System (GIS)
• A geographic information system (GIS) is a system designed to
capture, store, manipulate, analyse, manage, and present spatial
or geographic data.

• Real-time Database (RDB)
• A real-time database is a database system which uses real-time processing
to handle workloads whose state is constantly changing.
• This differs from traditional databases containing persistent data, mostly
unaffected by time. For example, a stock market changes very rapidly and
is dynamic.
• Data Warehouse (DW)
• A data warehouse (DW or DWH), also known as an enterprise data
warehouse (EDW), is a system used for reporting and data analysis, and is
considered a core component of business intelligence.
• DWs are central repositories of integrated data from one or more disparate
sources. They store current and historical data in a single place, where it is
used to create analytical reports for workers throughout the enterprise.
Database Management System (DBMS)
The software used to manage a database is called a Database Management
System (DBMS). For example, MySQL and Oracle are popular DBMSs
used in different applications. A DBMS lets users perform the following tasks:
• Data Definition: creating, modifying and removing the definitions that
describe the organization of data in the database.
• Data Update: inserting, modifying and deleting the actual data in
the database.
• Data Retrieval: retrieving data from the database so that it can be used
by applications for various purposes.
• User Administration: registering and monitoring users, enforcing data
security, monitoring performance, maintaining data integrity, dealing with
concurrency control and recovering information corrupted by unexpected failure.

DATABASE RECOVERY

• Data recovery is the process of restoring data that has been lost,
accidentally deleted, corrupted or made inaccessible.

• In enterprise IT, data recovery typically refers to the restoration of
data to a desktop, laptop, server or external storage system from
a backup.

Failure Classification

Transaction failure
• A transaction has to abort when it fails to execute or when it
reaches a point from which it cannot go any further. This is called
transaction failure.

• Reasons for a transaction failure could be:

1. Logical errors: the transaction cannot complete because it
has a code error or an internal error condition.
2. System errors: the database system itself terminates an
active transaction because the DBMS is not able to execute it, or
it has to stop because of some system condition. For example, in
case of deadlock or resource unavailability, the system aborts an
active transaction.
System Crash
• Problems external to the system may cause it to stop abruptly
and crash.

• For example, an interruption in the power supply may cause the
underlying hardware or software to fail. Other examples include
operating system errors.

Disk Failure

• In the early days of technology evolution, it was a common problem that
hard-disk drives or storage drives failed frequently.

• Disk failures include the formation of bad sectors, an unreachable
disk, a disk head crash or any other failure that destroys all or part of
the disk storage.

Storage Structure

In brief, the storage structure can be divided into two categories −

•Volatile storage

•Non-Volatile storage

Volatile storage
• As the name suggests, volatile storage cannot
survive system crashes. Volatile storage devices are
placed very close to the CPU; normally they are
embedded onto the chipset itself.
• Main memory and cache memory are
examples of volatile storage. They are fast but can
store only a small amount of information.

Non-Volatile storage
• These memories are made to survive system crashes. They are huge
in data storage capacity, but slower to access.

• Examples include hard disks, magnetic tapes, flash memory,
and non-volatile (battery-backed) RAM.

Recovery and Atomicity
• When a system crashes, it may have several transactions being
executed and various files opened for them to modify the data items.
Transactions are made of various operations, which are atomic in
nature.

• According to the ACID properties of a DBMS, the atomicity of a transaction
as a whole must also be maintained, that is, either all of its operations are
executed or none.

Log-based Recovery
• A log is a sequence of records that describe the actions
performed by a transaction. It is important that the log records are written
prior to the actual modification and stored on stable storage media,
which is failsafe.
• The database can be modified using two approaches:
• Deferred database modification: all log records are written to stable
storage, and the database is updated only when a transaction commits.
• Immediate database modification: each log record is followed by the actual
database modification. That is, the database is modified immediately
after every operation.

Log-based recovery works as follows:
• The log file is kept on stable storage media.
• When a transaction enters the system and starts execution, it writes a
log record about it:
• <Tn, Start>

• When the transaction modifies an item X, it writes a log record as follows:
• <Tn, X, V1, V2>

• This record states that Tn has changed the value of X from V1 to V2.

• When the transaction finishes, it logs:
• <Tn, commit>
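
The three record types can be illustrated with a small in-memory sketch. It assumes immediate database modification, and the names `db`, `log`, `write_item` and `recover` are hypothetical; a real DBMS keeps the log on stable storage rather than in a Python list.

```python
# Minimal sketch of log-based recovery (assumed names, in-memory only).
db = {"X": 100, "Y": 50}   # stands in for the database on disk
log = []                   # stands in for the log on stable storage


def start(tn):
    log.append((tn, "start"))                     # <Tn, Start>


def write_item(tn, item, new_value):
    # Write the log record <Tn, X, V1, V2> first, then modify the database.
    log.append((tn, item, db[item], new_value))
    db[item] = new_value


def commit(tn):
    log.append((tn, "commit"))                    # <Tn, commit>


def recover():
    """Undo uncommitted transactions, then redo committed ones."""
    committed = {rec[0] for rec in log if rec[1:] == ("commit",)}
    # Undo: scan the log backwards and restore the old values V1.
    for rec in reversed(log):
        if len(rec) == 4 and rec[0] not in committed:
            _, item, old_value, _ = rec
            db[item] = old_value
    # Redo: scan the log forwards and reapply the new values V2.
    for rec in log:
        if len(rec) == 4 and rec[0] in committed:
            _, item, _, new_value = rec
            db[item] = new_value


start("T1")
write_item("T1", "X", 120)
commit("T1")
start("T2")
write_item("T2", "Y", 0)   # the system "crashes" before T2 commits
recover()
print(db)  # {'X': 120, 'Y': 50}
```

After recovery, the committed change by T1 survives while the uncommitted change by T2 is rolled back, which is the atomicity guarantee described earlier.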

Recovery with Concurrent Transactions

• When more than one transaction is being executed in parallel, the logs
are interleaved. At the time of recovery, it would be hard for the
recovery system to backtrack through all the logs and then start recovering. To ease
this situation, most modern DBMSs use the concept of 'checkpoints'.

CHECKPOINT
• A checkpoint is a mechanism where all the previous logs are removed
from the system and stored permanently on a storage disk.

• A checkpoint declares a point before which the DBMS was in a
consistent state and all the transactions were committed.

• In the illustrated example, transactions T1, T2 and T3 are put into the redo list.

• T4 is put into the undo list.
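
The usual rule behind these lists is that a transaction with a commit record after the checkpoint is redone, while one that started but never committed is undone. The sketch below applies that rule to a hypothetical sequence of log records chosen to reproduce the T1 to T4 outcome; the original figure is not available, so the exact records are an assumption.

```python
def classify(log_records):
    """Return (redo_list, undo_list) for the log records after the last checkpoint."""
    seen, committed = [], set()
    for tn, action in log_records:
        if tn not in seen:
            seen.append(tn)        # includes transactions started before the checkpoint
        if action == "commit":
            committed.add(tn)
    redo = [t for t in seen if t in committed]
    undo = [t for t in seen if t not in committed]
    return redo, undo


# Hypothetical records written after the checkpoint and before the crash.
records = [
    ("T1", "commit"),   # T1 started before the checkpoint
    ("T2", "start"),
    ("T2", "commit"),
    ("T3", "start"),
    ("T4", "start"),
    ("T3", "commit"),
    # crash: T4 never committed
]
redo_list, undo_list = classify(records)
print(redo_list, undo_list)  # ['T1', 'T2', 'T3'] ['T4']
```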

How data recovery works
• The data recovery process varies, depending on the circumstances
of the data loss, the data recovery software used to create the
backup and the backup target media.

• Data recovery is possible because a file and the information about
that file are stored in different places.

Data recovery methods
• Standard Data Recovery is where we recover data using standard lab
facilities. This is typically used for recovering data from media when there
is no physical failure.

• Remote Data Recovery is where we perform secure remote data
recovery services. This is used in cases where the media
has experienced logical failures.

Database Security
• Database security refers to the collective measures used to
protect and secure a database or database management
software from illegitimate use and malicious threats and attacks.
• Database security protects the confidentiality, integrity and
availability (CIA) of an organization's data.
• Confidentiality is a set of rules that limits access to information.
• Integrity is the assurance that the information is trustworthy and
accurate.
• Availability is a guarantee of reliable access to the information
by authorized people.

Why is database security important?
• Finances and reputation.
▪ Companies block attacks, including ransomware and breached
firewalls, which in turn keeps sensitive information safe.
▪ Prevent malware or viral infections which can corrupt data, bring
down a network, and spread to all endpoint devices.
▪ Ensure that physical damage to the server doesn’t result in the
loss of data.
▪ Prevent data loss through corruption of files or programming
errors.
Note: In 2015, with an estimated world population of 7.4 billion, 7%
of the world population was exposed to hackers and 500 million
identities were exposed.

Sources of Vulnerability
1. Application:
• SQL injection attack
• Application bypass
2. Test and Dev:
• Access to production data in non-secure environments
• Access to production systems for troubleshooting
3. Administrative Account Misuse:
• System and application admins, DBAs
• Stolen credentials, inadequate training, malicious insiders
4. Operations:
• Lost/Stolen Backups
• Direct OS Access.
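
The SQL injection item under "Application" can be made concrete with a short sketch using Python's standard `sqlite3` module. The `users` table, its contents and the malicious input are invented for illustration; the point is only the contrast between pasting input into the SQL text and binding it as a parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)

user_input = "nobody' OR '1'='1"   # attacker-controlled value

# Vulnerable: the input becomes part of the SQL text, so the OR clause
# is executed and every row matches.
unsafe_sql = "SELECT secret FROM users WHERE name = '" + user_input + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safer: the input is bound to a placeholder and treated purely as data.
safe_rows = conn.execute(
    "SELECT secret FROM users WHERE name = ?", (user_input,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))  # 2 0
```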
SECURITY THREATS: MOST COMMON ATTACKS
1. Privilege abuse: When database users are provided with
privileges that exceed their day-to-day job requirements, these
privileges may be abused intentionally or unintentionally.

2. Operating system vulnerabilities: Vulnerabilities in underlying
operating systems like Windows, UNIX, Linux, etc., and in the
services related to the databases could lead to
unauthorized access. This may lead to a Denial of Service (DoS)
attack.

3. Database rootkits: A database rootkit is a program that is hidden
inside the database and that provides administrator-level privileges
to gain access to the data. It may even turn off alerts triggered by
Intrusion Prevention Systems (IPS).

4. Weak authentication: Weak authentication models allow attackers
to employ strategies such as social engineering and brute force to
obtain database login credentials and assume the identity of
legitimate database users.

5. Weak audit trails: A weak audit logging mechanism in a database
server represents a critical risk to an organization, especially in
retail, financial, healthcare, and other industries with stringent
regulatory compliance requirements. Audit trails act as the last line of database
defense: they can detect the existence of a violation.

Data Security Architecture
• Security architecture is the set of design artifacts that describe how
the security controls (= security countermeasures) are
positioned and how they relate to the overall systems architecture. These
controls serve the purpose of maintaining the system's quality attributes.
• An information security architecture is designed to be strategic,
with a longer life than a blueprint, design requirement, or
topological chart or configuration. It is meant to assist in making
choices associated with the identification, acquisition, design,
application, implementation, deployment, and operation of
elements in the organization’s technical environment.
• The information security architecture should support many
communities, departments, and lines of business, and should
represent the long-term view of technical direction.
• An architecture that supports:
1. An effective security program that recognizes that not all
information is identical or constant in terms of value and
risk over time.
2. A well-organized and efficient security program that applies the
right technology to protect the most critical assets, together with
quality processes that reduce the risks.
3. A high-quality security program that includes regular
management reviews and technology assessments to ensure
that controls remain effective.

OBJECTIVES OF THE SECURITY ARCHITECTURE
The specific objectives and deliverables of the organization’s
information security architecture can be defined as follows:
• Provides guidance to the organization’s corporate and
departmental IT decision-makers.
• Supports, enables, and extends the organization’s security
policy and standards by providing specific security-related
guidance.
• Describes general security strategies within the organization’s
information security architecture domain.
• Describes the high-level design objectives.
• Describes the concept of “security zones”.
• Describes a risk management architecture.
• Leverages leading industry standards and representations to
ensure best security practices are being applied.
DATABASE SECURITY STRATEGY

• A database security strategy focuses on proactively
protecting data from internal and external attacks, curtailing
data exposure to privileged and authorized IT users, and
safeguarding all databases, including production and
non-production.

• With internal and external attacks on corporate and public
applications and robust regulatory compliance
enforcement, data security continues to be the highest
priority for enterprises and governments year after year,
especially at the very core, i.e. “the databases that contain
the corporate crown jewels”.

KEY PILLARS OF DATABASE SECURITY STRATEGY
❑ FOUNDATION PILLAR
The “foundation pillar” stresses discovery and
classification of sensitive data and devising a vigorous
authentication, authorization, and access control (AAA)
framework. In addition, all critical databases must be
patched periodically to remove known vulnerabilities.
• To establish a strong database security foundation,
enterprises should use:
• Database discovery and classification, which provides
information on all databases to focus upon
• AAA mechanisms for appropriate database access
• Patch management, which protects against identified
vulnerabilities.
❑ DETECTION PILLAR

• This pillar encompasses monitoring, auditing, and
vulnerability assessment. Vulnerability assessment
reports gaps in the database environment, such as weak
passwords or excessive access privileges.
• Data and metadata within databases can be accessed,
modified, or even deleted in moments. The detection pillar
emphasizes a comprehensive audit trail of database
activities.
• Detection layer security fundamentally includes:
• Continuous auditing and alerting on data anomalies and
access by privileged users
• Security monitoring and real-time intrusion prevention to
defend the database against potential threats
• Vulnerability assessment to check for database integrity
and security configuration across databases.
❑ PREVENTIVE PILLAR

• This pillar encompasses data encryption, data masking,
and database firewalls. It emphasizes preventing
unauthorized access and protecting against potential
attacks.
• Preventive security measures include:
• Network and data-at-rest encryption
• Data masking (redaction) across all databases to prevent
data exposure.
• Database firewalls to prevent potential threats, such as
SQL injection attacks or privilege escalation, from
impacting databases.
• Change management to enable a formal procedure to
manage changes in production. The goal is to prevent
unauthorized access to and exposure of private data.
METHODS TO SECURE DATABASES

• While some attackers still focus on denial of service attacks and
vandalism, cybercriminals often target the database because
that is where the money is.
• Database security on its own is an extremely in-depth topic
that could never be covered in whole. However, there are a
few best practices that can help even the smallest of
businesses secure their databases.
DATABASE SECURITY BEST PRACTICES
• Separate the Database and Web Servers
• Always keep the database server separate from the web
server. Most vendors try to make things easier by having
the database created on the same server on which the
application is installed.
• This also makes it easier for an attacker to access the
data because they only need to crack the administrator
account for one server to have access to everything.
• Instead, a database should reside on a separate
database server located behind a firewall, not in the DMZ
with the web server. This makes for a more complicated
setup.
• Encrypt Stored Files
• The stored files of a web application often contain information about
the databases that the software needs to connect to. This
information, if stored in plain text as in many default installations,
provides the keys an attacker needs to access sensitive data.

• WhiteHat Security estimates that 83 percent of all web sites are
vulnerable to at least one form of attack.

• Encrypt Your Backups Too
• Encrypt back-up files. Not all data theft happens as a result of an
outside attack. Sometimes, it’s the people we trust most that are the
attackers.
• Use a Web Application Firewall (WAF)
• Many people are under the misconception that protecting
the web server has nothing to do with the database. This
is not true. In addition to protecting a site against
cross-site scripting vulnerabilities and website
vandalism, a good application firewall can thwart SQL
injection attacks as well. By preventing the injection of
SQL queries by an attacker, the firewall can help keep
sensitive information stored in the database away from
attackers.

• Websites that utilize third-party applications,
components, and various other plug-ins and add-ons are
more susceptible to an exploit, especially when these have
not been patched.

• Minimize Use of 3rd Party Apps
• Keep third-party applications to a minimum. We all want our web site to be
filled with interactive widgets and sidebars filled with cool content, but any
app that pulls from the database is a potential threat. Many of these
applications are created by hobbyists or programmers who discontinue
support for them. Unless they are absolutely necessary, don’t install them.

• Don't Use a Shared Server
• Avoid using a shared web server if your database holds sensitive
information. While it may be easier and cheaper to host your site with a
hosting provider, you are essentially placing the security of your
information in the hands of someone else. If you have no other choice,
make sure to review their security policies and speak with them about
what their responsibilities are should your data become compromised.

• Enable Security Controls
• Enable security controls on your database. While most databases
nowadays enable security controls by default, it never hurts to
go through the security controls and check that this was done.

• Keep in mind that securing your database means you have to shift
your focus from web developer to database administrator. In small
businesses, this may mean added responsibilities and additional buy-in
from management. However, getting everyone on the same page
when it comes to security can make the difference between preventing
an attack and responding to an attack.
10 OF THE BEST SYSTEMS AVAILABLE FOR BUSINESS PROFESSIONALS:
▪ Oracle
▪ Microsoft SQL Server
▪ MySQL
▪ PostgreSQL
▪ Microsoft Access
▪ Teradata
▪ IBM DB2
▪ Informix
▪ SAP ASE (Sybase Adaptive Server Enterprise)
▪ Amazon’s SimpleDB
Disk Storage, Basic File Structures and Hashing
CONTENTS
• Introduction
• Secondary Storage Devices
• Buffering of Blocks
• Placing File Records on Disk
• Operations on Files
• Files of Unordered Records (Heap Files)
• Files of Ordered Records (Sorted Files)
• Hashing Techniques
• Parallelizing Disk Access Using RAID Technology
INTRODUCTION
In a computerized database, the data is stored on a computer storage
medium, which includes:

• Primary Storage: can be processed directly by the CPU (e.g., the main
memory, cache) – fast, expensive, but of limited capacity

• Secondary Storage: cannot be processed directly by the CPU (e.g.,
magnetic disks, optical disks, tapes) – slow, costs less, but has a large
capacity.
For the following reasons, most databases are stored permanently
on secondary storage:

• They are too large to fit entirely in main memory

• They must persist over long periods of time, but the main memory is
volatile storage

• Secondary storage costs less

Note: In real-time applications, such as telephone switching
applications, the entire database can be kept in the main memory (with a
backup copy on secondary devices) – main memory databases.
SECONDARY STORAGE DEVICES
• Magnetic tapes (offline): operator must load it

• Magnetic Disks (online): can be accessed directly

• The capacity of a device is the number of bytes it can store

• A disk can be single-sided or double-sided

• Many disks are assembled into a disk pack


Figure: (a) A single-sided disk with read/write hardware. (b) A disk pack with read/write hardware.
Figure: Different sector organizations on disk: (a) sectors subtending a fixed angle; (b) sectors maintaining a uniform recording density.
• The tracks with the same diameter on the various surfaces are called a
cylinder.

• During disk formatting (initializing), each track is divided into
equal-sized disk blocks (or pages).

• Blocks are separated by fixed-size interblock gaps.

• A disk is a random access addressable device.

• The hardware address of a block is a combination of a cylinder number,
track number, and block number.
• A buffer is a contiguous reserved area in main memory that holds one
block.

• For a read command, the block from disk is copied into the buffer.

• For a write command, the contents of the buffer are copied to the
disk.

• The read/write head is the hardware mechanism that reads or writes a
block.
• A disk pack is mounted in the disk drive, which includes a motor that
rotates the disks.

• A disk controller controls the disk drive and interfaces it to the computer
system.

• The time the disk controller needs to mechanically position the
read/write head on the correct track is called the seek time.

• The time needed for the beginning of the desired block to rotate into
position under the read/write head is called the rotational delay or
latency.
• After finding the desired block, the time required to transfer the data
(read or write a block) is called the block transfer time.

• The seek time and rotational delay are usually much larger than the
block transfer time.

• The time required to transfer consecutive blocks is usually determined
by the bulk transfer rate.

• A magnetic tape is a sequential access device.

• A tape drive includes a mechanism to read the data from or to write the
data to a tape reel.
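
The three disk timing components (seek time, rotational delay, block transfer time) add up to the time needed to access one random block. The sketch below assumes the common approximation that the average rotational delay is half a rotation; the function name and all drive parameters are hypothetical example values, not figures from these slides.

```python
def average_access_time_ms(seek_ms, rpm, block_bytes, transfer_bytes_per_s):
    """Estimate the time to read one random block, in milliseconds."""
    rotation_ms = 60_000 / rpm                 # time for one full rotation
    rotational_delay_ms = rotation_ms / 2      # assumed average: half a rotation
    transfer_ms = block_bytes / transfer_bytes_per_s * 1000
    return seek_ms + rotational_delay_ms + transfer_ms


# Hypothetical drive: 8 ms seek, 7200 rpm, 4 KB blocks, 100 MB/s transfer rate.
t = average_access_time_ms(
    seek_ms=8, rpm=7200, block_bytes=4096, transfer_bytes_per_s=100_000_000
)
print(round(t, 2))  # 12.21
```

With these example numbers the transfer itself takes about 0.04 ms, so almost all of the access time is seek time and rotational delay, which matches the second bullet above.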
BUFFERING OF BLOCKS
• Buffers are reserved in main memory to speed up processing.

• While one buffer is being read or written (by disk controllers), the CPU
can process data in the other buffers.

• Buffers play an important role when processes are running
concurrently, either in an interleaved or parallel fashion.

• Double buffering enables continuous reading or writing of data on
consecutive disk blocks.
PLACING FILE RECORDS ON DISK
• Data is usually stored in the form of records, which are collections of
fields.

• A record may represent an entity (tuple), and thus each field
corresponds to an attribute.

• A data type associated with each field specifies the types of values the
field can take.

• A collection of field names and their corresponding data types
constitutes a record type or record format definition.
• A file is a sequence of records.

• If every record in the file has the same size, the file is of type
fixed-length records.

• If different records in the file have different sizes, the file is of type
variable-length records.

• A file that contains records of different record types and hence of
varying size is called a mixed file.

• For variable-length fields, we could use a special separator character
(which does not appear in any field value) to terminate each field.

Figure: Three record storage formats:
• (a) A fixed-length record with six fields and size of 71 bytes
• (b) A record with two variable-length fields and three fixed-length fields
• (c) A variable-field record with three types of separator characters
• A block is the unit of data transfer between disk and memory.

• The blocking factor is the number of records per block,
bfr = ⌊ B/R ⌋, where B is the block size and R is the record size in bytes.

• If records are allowed to cross block boundaries, the file organization is
called spanned.

• If records are not allowed to cross block boundaries, the file
organization is called unspanned.
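
As a worked example of the blocking factor, the sketch below uses the 71-byte fixed-length record mentioned earlier together with an assumed block size of 512 bytes and an assumed file of 1000 records. The function names are hypothetical.

```python
import math


def blocking_factor(block_size, record_size):
    """bfr = floor(B / R): whole records that fit in one block."""
    return block_size // record_size


def blocks_needed(num_records, block_size, record_size, spanned=False):
    if spanned:
        # Records may cross block boundaries, so no space is wasted per block.
        return math.ceil(num_records * record_size / block_size)
    # Unspanned: each block holds at most bfr whole records.
    return math.ceil(num_records / blocking_factor(block_size, record_size))


B, R = 512, 71                       # B is an assumed block size
bfr = blocking_factor(B, R)
unused = B - bfr * R                 # bytes left over in each unspanned block
print(bfr, unused)                   # 7 15
print(blocks_needed(1000, B, R))                 # 143
print(blocks_needed(1000, B, R, spanned=True))   # 139
```

The unspanned organization wastes 15 bytes per block in this example, which is why it needs four more blocks than the spanned one for the same file.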
Figure: Types of record organization: (a) Unspanned (b) Spanned

In contiguous allocation, the file blocks are allocated to consecutive disk
blocks.

In linked allocation, each file block contains a pointer to the next file
block.

In indexed allocation, one or more index blocks contain pointers to the
actual file blocks.

A file header or file descriptor contains information about a file (e.g., the
disk address, record format descriptions, etc.)
OPERATIONS ON FILES
Two main types of operations:
• Retrieval operations: do not change any data in the file
• Update operations: change the file by insertion or deletion of records
or by modification of field values.
Locating and accessing file records relies on the
following high-level operations:
• Open
• Reset
• Find
• Read
• FindNext
• Update (insert, delete, modify)
• Close
• A file organization refers to the way records and blocks are placed on
the storage device.

• An access method provides a group of operations that can be applied
to a file.

• A file is said to be static if update operations are rarely applied to it;
otherwise it is dynamic.

• A good file organization should perform as efficiently as possible the
operations we expect to apply frequently to the file.
FILES OF UNORDERED RECORDS
• Records are placed in the file in the order in which they are inserted.
Such an organization is called a heap or pile file.

• Insertion: is very efficient

• Searching: requires a linear search (expensive)

• Deleting: requires a search, then one of two deletion techniques:
• Copy the block into a buffer, delete the record from the buffer, and rewrite the block (leaves
unused space in the disk block)
• Keep an extra byte or bit (deletion marker) with each record and set it on deletion.

• Both of these deletion techniques require periodic reorganization of the file to reclaim unused space.
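
The heap file operations can be sketched with an in-memory list standing in for the file. This is only an illustration of the deletion-marker technique; the `HeapFile` class and its method names are hypothetical, and a real implementation would work on disk blocks.

```python
class HeapFile:
    """Unordered file: records are appended; deletion only sets a marker."""

    def __init__(self):
        self.entries = []                      # each entry: [deleted, record]

    def insert(self, record):
        self.entries.append([False, record])   # append at the end: cheap

    def search(self, field, value):
        for deleted, record in self.entries:   # linear search: expensive
            if not deleted and record[field] == value:
                return record
        return None

    def delete(self, field, value):
        for entry in self.entries:
            if not entry[0] and entry[1][field] == value:
                entry[0] = True                # set the deletion marker
                return True
        return False

    def reorganize(self):
        # Reclaim the space held by records marked as deleted.
        self.entries = [e for e in self.entries if not e[0]]


f = HeapFile()
for name in ("Smith", "Jones", "Brown"):
    f.insert({"NAME": name})
f.delete("NAME", "Jones")
print(len(f.entries), f.search("NAME", "Jones"))  # 3 None
f.reorganize()
print(len(f.entries))  # 2
```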


FILES OF ORDERED RECORDS
• Records of a file on disk are ordered based on the values of one of
their fields, called the ordering field.

• Reading the records in order of the ordering field is extremely efficient.

• Search: is very efficient (binary search)

• Insertion and deletion are expensive.

• Ordered files are rarely used in database applications (unless using
indexed-sequential files)
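
Binary search on an ordered file works on blocks rather than on individual records: each step reads the middle block and compares the search key with the first and last records in it. The sketch below assumes an in-memory list of blocks ordered by a NAME field; the data and the function name are made up for illustration.

```python
def binary_search_blocks(blocks, key):
    """Search blocks ordered by NAME; return (record or None, blocks read)."""
    low, high = 0, len(blocks) - 1
    reads = 0
    while low <= high:
        mid = (low + high) // 2
        block = blocks[mid]                    # one block access
        reads += 1
        if key < block[0]["NAME"]:
            high = mid - 1                     # key lies in an earlier block
        elif key > block[-1]["NAME"]:
            low = mid + 1                      # key lies in a later block
        else:
            for record in block:               # search within the block
                if record["NAME"] == key:
                    return record, reads
            return None, reads
    return None, reads


names = ["Aaron", "Adams", "Akers", "Allen", "Baker", "Bell",
         "Clark", "Davis", "Evans", "Ford", "Gray", "Hall"]
# Three records per block, four blocks in total.
blocks = [[{"NAME": n} for n in names[i:i + 3]] for i in range(0, len(names), 3)]

print(binary_search_blocks(blocks, "Evans"))  # ({'NAME': 'Evans'}, 2)
print(binary_search_blocks(blocks, "Zed"))    # (None, 3)
```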
Figure: Some blocks of an ordered (sequential) file of EMPLOYEE records with NAME as the ordering key field.
HASHING TECHNIQUES
• A hash function maps the hash field of a record to the address on the
storage media at which the record is stored.

• Hashing provides very fast access to records when the search
condition is an equality condition on the hash field.

• For internal files, hashing is implemented as a hash table. The mapping
that assigns each element of the data to a cell of the hash table is called a
hash function.
• Two records that yield the same hash value are said to collide.

• A good hash function must be easy to compute and generate a low
number of collisions.

• The process of finding another position (for colliding data) is called
collision resolution.

• There are several methods for collision resolution, including open
addressing, chaining, and multiple hashing.
• Open addressing: Proceeding from the occupied position specified by
the hash function, check the subsequent positions in order until an
unused position is found.

• Chaining: Associate an overflow area (or a linked list) with each cell
(hashing address) and then simply store the data in this area.

• Multiple hashing: Apply a second hash function if the first results in a
collision. If another collision results, use open addressing, or apply a
third hash function, and then use open addressing.
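
Open addressing and chaining can be compared with a small sketch. It assumes a table of M = 7 positions and the hash function h(K) = K mod M, both chosen only for illustration, and uses three keys that all hash to the same position.

```python
M = 7                                  # assumed table size


def h(key):
    return key % M                     # assumed hash function


def insert_open_addressing(table, key):
    """Check subsequent positions in order until an unused one is found."""
    start = h(key)
    for step in range(M):
        position = (start + step) % M
        if table[position] is None:
            table[position] = key
            return position
    raise RuntimeError("hash table is full")


def insert_chaining(table, key):
    """Store colliding keys in the list attached to the hash address."""
    table[h(key)].append(key)


keys = (10, 17, 24)                    # all three hash to position 3

open_table = [None] * M
for k in keys:
    insert_open_addressing(open_table, k)
print(open_table)      # [None, None, None, 10, 17, 24, None]

chain_table = [[] for _ in range(M)]
for k in keys:
    insert_chaining(chain_table, k)
print(chain_table[3])  # [10, 17, 24]
```

With open addressing the colliding keys spill into neighbouring positions, while chaining keeps them together under the original hash address.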
Figure: Internal hashing data structures. (a) Array of M positions for use in internal hashing. (b) Collision resolution by chaining records.
• Hashing for disk files is called external hashing.

• The target address space in external hashing is made of buckets
(each of which holds a disk block or a cluster of contiguous blocks).

• The collision problem is less severe, because as many records as will
fit in a bucket can hash to the same bucket without causing a collision
problem.

• A table maintained in the file header converts the bucket number into
the corresponding disk block address.
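
A rough sketch of external hashing with overflow handled by chaining is shown below. The bucket capacity, the number of buckets and the hash function are assumptions chosen to keep the example small; real buckets hold disk blocks rather than Python lists.

```python
BUCKET_CAPACITY = 2    # records per bucket (assumed)
NUM_BUCKETS = 3        # fixed number of buckets (assumed)


class Bucket:
    def __init__(self):
        self.records = []
        self.overflow = None           # next bucket in the overflow chain

    def insert(self, key):
        if len(self.records) < BUCKET_CAPACITY:
            self.records.append(key)
            return
        if self.overflow is None:
            self.overflow = Bucket()   # allocate an overflow bucket
        self.overflow.insert(key)

    def contains(self, key):
        if key in self.records:
            return True
        return self.overflow is not None and self.overflow.contains(key)


buckets = [Bucket() for _ in range(NUM_BUCKETS)]


def bucket_number(key):
    return key % NUM_BUCKETS           # assumed hash function


for key in (3, 6, 9, 4):
    buckets[bucket_number(key)].insert(key)

print(buckets[0].records, buckets[0].overflow.records)  # [3, 6] [9]
print(buckets[1].records)                               # [4]
```

Keys 3 and 6 share bucket 0 without any collision handling; only the third key, 9, has to go to the overflow chain.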
Figure: Matching bucket numbers to disk block addresses.
Figure: Handling overflow for buckets by chaining.
• The hashing scheme is called static hashing if a fixed number of
buckets is allocated.

• A major drawback of static hashing is that the number of buckets must
be chosen large enough to handle large files. That is, it is difficult
to expand or shrink the file dynamically.

• Two solutions to the above problem are:
• Extendible hashing, and
• Linear hashing
Figure: Structure of the extendible hashing scheme.
PARALLELIZING DISK ACCESS USING RAID TECHNOLOGY
• A major advance in disk technology is represented by the development
of Redundant Arrays of Inexpensive/Independent Disks (RAID).

• Improving Performance with RAID: A concept called data striping is
used. It distributes data transparently over multiple disks to make them
appear as a single disk.

• Improving Reliability with RAID: A concept called mirroring or
shadowing is used. Data is written redundantly to two identical physical
disks that are treated as one logical disk.
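
Striping and mirroring can be sketched with lists standing in for disks. The round-robin, block-level placement used here is an assumption made for illustration, as are the function names and block labels.

```python
def stripe(blocks, num_disks):
    """Distribute consecutive blocks over the disks in round-robin order."""
    disks = [[] for _ in range(num_disks)]
    for i, block in enumerate(blocks):
        disks[i % num_disks].append(block)
    return disks


def mirrored_write(block, primary, mirror):
    """Write the same block to two disks treated as one logical disk."""
    primary.append(block)
    mirror.append(block)


file_a = [f"A{i}" for i in range(8)]   # eight blocks of a file A
striped = stripe(file_a, 4)
print(striped)  # [['A0', 'A4'], ['A1', 'A5'], ['A2', 'A6'], ['A3', 'A7']]

primary, mirror = [], []
for block in file_a[:2]:
    mirrored_write(block, primary, mirror)
print(primary == mirror)  # True
```

With striping, four consecutive blocks can be read from four disks in parallel; with mirroring, either copy can serve a read if the other disk fails.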
Figure: Data striping. File A is striped across four disks.