ADE Unit-1: Entity Relational Model
ENTITY:
An entity may be defined as a thing which is recognized as being
capable of an independent existence and which can be uniquely
identified. An entity is an abstraction from the complexities of
some domain.
NORMALIZATION:
Database normalization is the process of removing
redundant data from your tables in order to improve storage
efficiency, data integrity, and scalability.
In the relational model, methods exist for quantifying how
efficient a database is. These classifications are called
normal forms (or NF), and there are algorithms for
converting a given database between them.
Normalization generally involves splitting existing tables into
multiple ones, which must be re-joined or linked each time a
query is issued.
The Purpose of Normalization:
1. Normalization is a technique for producing a set of relations
with desirable properties, given the data requirements of an
enterprise.
2. The process of normalization is a formal method that identifies
relations based on their primary or candidate keys and the
functional dependencies among their attributes.
Definition of 1NF:
First Normal Form is a relation in which the intersection of each
row and column contains one and only one value. There are two
approaches to removing repeating groups from unnormalized
tables:
1. Remove the repeating groups by entering appropriate data in
the empty columns of rows containing the repeating data.
2. Remove the repeating group by placing the repeating data,
along with a copy of the original key attribute(s), in a separate
relation. A primary key is identified for the new relation.
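The second approach can be sketched in a few lines of Python. This is only an illustration: the customer table, its column names, and the helper to_1nf are hypothetical and do not come from the notes.

```python
# Hypothetical unnormalized table: each customer row carries a repeating group of phones.
unnormalized = [
    {"customer_id": 1, "name": "Ann", "phones": ["555-0100", "555-0101"]},
    {"customer_id": 2, "name": "Bob", "phones": ["555-0200"]},
]

def to_1nf(rows, key, group):
    """Move the repeating group `group` into a separate relation.

    Returns (parent, child): parent keeps the non-repeating attributes,
    child holds one row per repeated value plus a copy of the original key.
    """
    parent, child = [], []
    for row in rows:
        parent.append({k: v for k, v in row.items() if k != group})
        for value in row[group]:
            child.append({key: row[key], group: value})
    return parent, child

customers, customer_phones = to_1nf(unnormalized, "customer_id", "phones")
```

In this sketch the primary key of the new relation would be the pair (customer_id, phones), since the copied key alone no longer identifies a row.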
Second normal form (2NF):
It is a relation that is in first normal form and in which every
non-primary-key attribute is fully functionally dependent on the
primary key.
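A minimal sketch of how a 2NF violation could be detected: look for non-key attributes that are determined by a proper subset of the primary key. The Enrollment relation, its functional dependencies, and the helper names closure and partial_dependencies are assumptions made for illustration.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of `attrs` under the functional dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def partial_dependencies(attributes, primary_key, fds):
    """Map each proper subset of the key to the non-key attributes it determines."""
    non_key = set(attributes) - set(primary_key)
    found = {}
    for size in range(1, len(primary_key)):
        for subset in combinations(sorted(primary_key), size):
            determined = closure(subset, fds) & non_key
            if determined:
                found[subset] = determined
    return found

# Hypothetical relation: Enrollment(student_id, course_id, student_name, grade)
attributes = {"student_id", "course_id", "student_name", "grade"}
primary_key = {"student_id", "course_id"}
fds = [
    (("student_id", "course_id"), ("grade",)),
    (("student_id",), ("student_name",)),
]
violations = partial_dependencies(attributes, primary_key, fds)
```

Here student_name depends on student_id alone, so the relation is not in 2NF; an empty result would mean no partial dependency was found.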
Boyce-Codd normal form (BCNF):
A relation is in BCNF, if and only if, every determinant is a
candidate key.
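The BCNF condition can be tested mechanically. The sketch below uses the common superkey test (the closure of each nontrivial determinant must cover every attribute) as a stand-in for the candidate-key wording above. The Teaching relation and the function names are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under the functional dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Return the nontrivial dependencies whose determinant is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        if not trivial and closure(lhs, fds) != set(attributes):
            violations.append((lhs, rhs))
    return violations

# Hypothetical relation: Teaching(student, course, instructor)
attributes = {"student", "course", "instructor"}
fds = [
    (("student", "course"), ("instructor",)),
    (("instructor",), ("course",)),
]
bad = bcnf_violations(attributes, fds)
```

In this example instructor determines course but is not a key, so the relation is not in BCNF.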
Multi-valued dependency (MVD) :
It represents a dependency between attributes (for example, A, B
and C) in a relation, such that for each value of A there is a set of
values for B and a set of values for C. However, the sets of values
for B and C are independent of each other.
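One way to read the independence condition: whenever two rows share the same A value, the row that combines the B value of one with the C value of the other must also be present. A small sketch that checks this on a relation instance, with hypothetical course, teacher, and book data:

```python
def mvd_holds(rows, a, b, c):
    """Check the multi-valued dependency a ->> b on `rows`, where c is the remaining attribute."""
    tuples = {(r[a], r[b], r[c]) for r in rows}
    for a1, b1, _ in tuples:
        for a2, _, c2 in tuples:
            # Same A value: the B of one row must pair with the C of the other.
            if a1 == a2 and (a1, b1, c2) not in tuples:
                return False
    return True

# Hypothetical data: the teachers and the books of a course are independent of each other.
full = [
    {"course": "DB", "teacher": "Smith", "book": "Book1"},
    {"course": "DB", "teacher": "Smith", "book": "Book2"},
    {"course": "DB", "teacher": "Jones", "book": "Book1"},
    {"course": "DB", "teacher": "Jones", "book": "Book2"},
]
independent = mvd_holds(full, "course", "teacher", "book")
broken = mvd_holds(full[:3], "course", "teacher", "book")
```

Dropping the last row removes the (Jones, Book2) combination, so the dependency no longer holds on the shortened instance.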
QUERY PROCESSING:
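1. Parse and translate: check for good syntax, check that all
referenced relations exist, and translate the SQL to relational algebra.
2. Optimize: make it run faster.
3. Evaluate: execute the chosen plan and return the answer.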
Three Steps of Query Processing
1) Parsing and translation first translates the query into its
internal form, then into relational algebra, and verifies that the
referenced relations exist.
2) Optimization finds the most efficient evaluation plan for a
query, because there can be more than one way to evaluate it.
3) In evaluation, the query-execution engine takes a
query-evaluation plan, executes that plan, and returns the answers to
the query.
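The three steps can be observed with SQLite, used here only because it ships with Python. Its internal form is bytecode rather than relational algebra, so this is an analogy, not the exact pipeline described above. The account table and the helper check are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, branch TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO account (branch, balance) VALUES (?, ?)",
    [("Perryridge", 400), ("Brighton", 900), ("Perryridge", 1700)],
)

def check(sql):
    """Step 1: compile the statement without running it; bad syntax or unknown relations fail here."""
    try:
        conn.execute("EXPLAIN " + sql)
        return "ok"
    except sqlite3.OperationalError as exc:
        return f"rejected: {exc}"

query = "SELECT id, balance FROM account WHERE balance < 1000"
syntax_check = check("SELEC id FROM account")           # misspelled keyword
relation_check = check("SELECT id FROM no_such_table")  # relation does not exist
valid_check = check(query)

# Step 2: ask the optimizer which evaluation plan it chose.
plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()

# Step 3: the execution engine runs the plan and returns the answer.
answer = conn.execute(query).fetchall()
```

The text of the plan rows differs between SQLite versions, so it is best inspected by printing plan rather than relied on in code.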
Evaluation
Selection Operation (primary index):
QUERY OPTIMIZATION:
A sightseeing trip
Start: an SQL query
End: an execution plan
Intermediate Stopovers
query trees
logical tree transforms
strategy selection
What happens after the journey?
Execution plan is executed
Query answer returned
Query Trees:
Example:
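As a hedged illustration of a query tree and one logical tree transform, the sketch below builds a small tree and pushes a selection below a join when only one input has the selected attribute. The node classes, the employee and department relations, and push_selection are hypothetical; a real optimizer applies many more rules.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Relation:
    name: str
    attrs: tuple

@dataclass(frozen=True)
class Select:
    attr: str
    value: object
    child: object

@dataclass(frozen=True)
class Join:
    left: object
    right: object

def attrs_of(node):
    """Attributes produced by a subtree."""
    if isinstance(node, Relation):
        return set(node.attrs)
    if isinstance(node, Select):
        return attrs_of(node.child)
    return attrs_of(node.left) | attrs_of(node.right)

def push_selection(node):
    """Move a selection below a join when exactly one side carries the attribute."""
    if isinstance(node, Select) and isinstance(node.child, Join):
        join = node.child
        in_left = node.attr in attrs_of(join.left)
        in_right = node.attr in attrs_of(join.right)
        if in_left and not in_right:
            pushed = Select(node.attr, node.value, join.left)
            return Join(push_selection(pushed), push_selection(join.right))
        if in_right and not in_left:
            pushed = Select(node.attr, node.value, join.right)
            return Join(push_selection(join.left), push_selection(pushed))
    if isinstance(node, Select):
        return Select(node.attr, node.value, push_selection(node.child))
    if isinstance(node, Join):
        return Join(push_selection(node.left), push_selection(node.right))
    return node

employee = Relation("employee", ("emp_id", "name", "dept_id"))
department = Relation("department", ("dept_id", "dept_name"))

# Initial tree: select dept_name = 'Sales' on top of (employee JOIN department).
tree = Select("dept_name", "Sales", Join(employee, department))
optimized = push_selection(tree)
```

After the transform the selection sits directly above department, so fewer rows reach the join. A selection on dept_id, which both inputs carry, is left in place by this simple rule.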
Transaction processing:
Transaction processing is designed to maintain a computer
system (typically a database or some modern filesystems) in a
known, consistent state.
Deadlocks:
In some cases, two transactions may, in the course of their
processing, attempt to access the same portion of a database at
the same time, in a way that prevents them from proceeding.
For example, transaction A may access portion X of the
database, and transaction B may access portion Y of the
database.
If, at that point, transaction A then tries to access portion Y of the
database while transaction B tries to access portion X, a deadlock
occurs, and neither transaction can move forward.
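A common way to detect this situation is to record which transaction waits for which and look for a cycle in that wait-for graph. The following is a minimal sketch; the function name find_deadlock and the dictionary layout are assumptions, not part of the notes.

```python
def find_deadlock(waits_for):
    """Return a list of transactions that form a cycle in the wait-for graph, or None."""
    visiting, done = [], set()

    def visit(txn):
        if txn in done:
            return None
        if txn in visiting:
            # Reached a transaction already on the current path: a cycle.
            return visiting[visiting.index(txn):]
        visiting.append(txn)
        for other in waits_for.get(txn, ()):
            cycle = visit(other)
            if cycle:
                return cycle
        visiting.pop()
        done.add(txn)
        return None

    for txn in waits_for:
        cycle = visit(txn)
        if cycle:
            return cycle
    return None

# A holds X and waits for Y (held by B); B holds Y and waits for X (held by A).
deadlocked = find_deadlock({"A": ["B"], "B": ["A"]})
# B is not waiting for anything, so A can eventually proceed.
no_deadlock = find_deadlock({"A": ["B"], "B": []})
```

Once a cycle is found, a system would typically abort one of the transactions in it so that the others can continue.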
Concurrency control:
Concurrency control in database management systems,
other transactional objects, and related distributed
applications (e.g., Grid computing and Cloud computing)
ensures that database transactions are performed
concurrently without violating the data integrity of the
respective databases.
Thus concurrency control is an essential element for
correctness in any system where two or more database
transactions, executed with time overlap, can access the same
data, e.g., in virtually any general-purpose database system.
Consequently a vast body of related research has
accumulated since database systems emerged in the
early 1970s.
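A toy model of why overlapping transactions need control: two deposits applied to one balance, once serially and once interleaved. The schedule format, the amounts, and the function run are hypothetical and only simulate the effect; no real DBMS is involved.

```python
def run(schedule):
    """Apply a schedule of (transaction, operation) steps to a single balance.

    Each transaction reads the balance into a private copy, then writes
    the copy plus its deposit back to the shared balance.
    """
    balance = 100
    local = {}
    deposits = {"T1": 50, "T2": 30}
    for txn, op in schedule:
        if op == "read":
            local[txn] = balance
        elif op == "write":
            balance = local[txn] + deposits[txn]
    return balance

serial = [("T1", "read"), ("T1", "write"), ("T2", "read"), ("T2", "write")]
interleaved = [("T1", "read"), ("T2", "read"), ("T1", "write"), ("T2", "write")]

serial_result = run(serial)
interleaved_result = run(interleaved)
```

The serial schedule ends at 180, while the interleaved one ends at 130 because T2 overwrites T1's update. Concurrency control exists to rule out or repair schedules of the second kind.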
RECOVERY:
Types of failures
1. Site failure
2. Lost messages
3. Network partitioning
4. Byzantine failures
Effects of failures
1. Inconsistent database
2. Transaction processing is blocked
3. Failed component unavailable
Independent Recovery
A recovering site makes a transition directly to a final state
without communicating with other sites.
Lemma
For a protocol, if a local state's concurrency set contains both an
abort and a commit state, the protocol is not resilient to an arbitrary
failure of a single site.
Database Tuning:
Database tuning describes a group of activities used to
optimize and homogenize the performance of a database.
It usually overlaps with query tuning, but also covers the design of
the database files and the selection of the database management
system (DBMS), operating system, and CPU that the DBMS runs
on.
The goal is to maximize use of system resources to perform
work as efficiently and rapidly as possible.
Most systems are designed to manage work efficiently, but it
is possible to greatly improve performance by customizing
settings and the configuration for the database and the
DBMS being tuned.
I/O tuning
Hardware and software configuration of disk subsystems are
examined: RAID levels and configuration [1], block and stripe