
DATABASE

A database is an organized collection of interrelated data that can be easily accessed and managed.

DBMS

A Database Management System (DBMS) is software for storing and retrieving users' data while applying appropriate security measures. It allows users to create their own databases as per their requirements. Examples are MySQL, Oracle, and MongoDB.

NEED FOR DBMS

1. Processing queries and object management : We can directly store data in the form of objects in a DBMS. In a file system, application-level code needs to be written to handle, store, and scan through the data, whereas a DBMS gives us the ability to query the database directly.

2. Controlling redundancy and inconsistency : Redundancy refers to repeated instances of the same data. A DBMS provides redundancy control, whereas in a file system the same data may be stored multiple times. For example, suppose an employee is working on two projects; we might end up storing that employee's information twice, which increases access time and storage. A DBMS uses data normalization to avoid redundancy. Now suppose we need to change the employee's mobile number and we change it in one place but forget to edit it in another. This leads to data inconsistency, because in the future we would not know which copy is correct.

3. Efficient memory management and indexing : In a file system, query operations typically require scanning entire files, whereas a DBMS can index records through the database schema on any attribute of the data. This helps in fast retrieval of data based on the indexed attribute.

4. Concurrent access : Multiple users can access the database at the same time when
we are using the DBMS.

5. Security : Only authorised users should be allowed to access the database and their
identity should be authenticated using a username and password.

6. Backup and Recovery : Users don't need to back up data periodically because this is taken care of by the DBMS. Moreover, it restores the database to its previous consistent state after a crash or system failure.
TYPES OF DBMS ARCHITECTURE

1-Tier Architecture : The simplest database architecture, in which the client, server, and database all reside on the same machine. A simple example is installing a database on your own system to practice SQL queries. Such an architecture is rarely used in production.

2-Tier Architecture : Two-tier architecture consists of two layers - Client Layer and Database Layer. The application logic is either buried inside the user interface on the client or within the database on the server (or both). It is easy to build and maintain, but less secure because the client can communicate with the database directly, and performance suffers as the system scales. Example - a railway reservation system, or bank operations during a branch visit.

3-Tier Architecture : Consists of three layers - Client Layer, Business/Application Layer (which provides the GUI to the users), and Database Layer. In three-tier architecture, the application logic resides in the middle tier, separated from the data and the user interface. It is more complex to build and maintain, but it is more secure because the client is not allowed to communicate with the database directly, and it scales easily. Example - a registration form containing text boxes, labels, and buttons, or a large website on the Internet.

TYPES OF DATABASE LANGUAGES

1. DDL (Data Definition Language) : It contains commands which are required to define the database structure or schema. It creates the skeleton of the database. Examples are -

CREATE - used to create the database or its objects (tables, indexes, constraints)
ALTER - used to alter the structure of the database
DROP - used to delete objects from the database
TRUNCATE - used to remove all records from a table
COMMENT - used to add comments to the data dictionary
RENAME - used to rename an object

2. DML (Data Manipulation Language) : It contains commands which are required to access and manipulate the data present in the database. Examples are -
SELECT - It retrieves data from a database
INSERT - It inserts data into a table
UPDATE - It updates existing data within a table
DELETE - It deletes records from a table (all rows, or only those matching a WHERE clause)

3. DCL(Data Control Language) : It contains commands which are required to deal with
the user rights, permissions and controls of the database system. Examples are -
GRANT - It gives user access privileges to a database
REVOKE - It takes back permissions from the user
4. TCL (Transaction Control Language) : It contains commands which are required to deal with transactions within the database, i.e., to manage the changes made by DML statements. Examples are -
COMMIT - It saves the transaction or work done on the database
SAVEPOINT - It sets a point in a transaction to which you can later roll back
ROLLBACK - It restores the database to its state at the last COMMIT
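
As a brief illustrative sketch (the employees table, its columns, and the report_user name are assumptions, not taken from the text), one command from each language category might look like this -

-- DDL : define the structure
CREATE TABLE employees (
    emp_id   INT PRIMARY KEY,
    emp_name VARCHAR(100),
    salary   DECIMAL(10, 2)
);

-- DML : manipulate the data
INSERT INTO employees (emp_id, emp_name, salary) VALUES (1, 'Dana', 50000);
UPDATE employees SET salary = 52000 WHERE emp_id = 1;

-- DCL : control access (assumes a user or role named report_user exists)
GRANT SELECT ON employees TO report_user;

-- TCL : make the DML changes permanent
COMMIT;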

DATA ABSTRACTION
The process of hiding irrelevant details from users is known as data abstraction.
Data abstraction can be divided into 3 levels -

1. Internal or Physical Level : It is the lowest level of abstraction and is managed by the DBMS. It tells us how the data is actually stored in memory. Details of this level are typically hidden from system admins, developers, and users.

2. Conceptual or Logical Level : This is the level at which developers and system admins work. It determines what data is stored in the database and what the relationships among the data entities are.

3. External or View Level : This is the highest level of abstraction. It describes how the
data should be shown to the user and hides the details of the table schema and its
physical storage from the users.

INTEGRITY CONSTRAINTS

Integrity constraints are a set of rules which are used to maintain the quality of information
and to guard the database against accidental damage. Types are -

1. Domain Integrity : Every attribute in a table must have a defined domain, i.e., a finite set of values it may take. When we assign a datatype to a column, we limit the values it can contain. In addition, we can restrict values as per business rules, e.g., gender must be 'M' or 'F'.

2. Entity Integrity : Each table must have a column or a set of columns through
which we can uniquely identify a row. These columns cannot have NULL values. (No
primary key can take NULL value)

3. Referential Integrity : It is specified between two tables. If a foreign key in Table 1 refers to the primary key of Table 2, then every value of the foreign key in Table 1 must be NULL or be available in Table 2.

4. Key Integrity : These are also called uniqueness constraints, since they ensure that every tuple in the relation is unique. A relation can have multiple candidate keys, out of which we choose one as the primary key, which must be unique and not NULL.
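
As a hedged sketch of how these constraints can be declared (the employee and departments tables are illustrative assumptions, not from the text) -

-- assumes a departments table with primary key dept_id already exists
CREATE TABLE employee (
    emp_id  INT PRIMARY KEY,                       -- entity/key integrity: unique and not null
    gender  CHAR(1) CHECK (gender IN ('M', 'F')),  -- domain integrity: datatype plus a business rule
    dept_id INT,
    FOREIGN KEY (dept_id) REFERENCES departments (dept_id)  -- referential integrity: value must exist in departments or be NULL
);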

DBMS vs RDBMS

RDBMS is an advanced version of a DBMS.

1. Storage : A DBMS stores data as files, while an RDBMS stores data in tabular form.
2. Access : In a DBMS, data elements need to be accessed individually, while an RDBMS can access multiple data elements at the same time.
3. Relationships : In a DBMS there is no relationship between data, while in an RDBMS data is stored in the form of tables which are related to each other.
4. Normalization : Normalization is not present in a DBMS, while it is present in an RDBMS.
5. Distributed databases : A DBMS does not support distributed databases, while an RDBMS does.
6. Structure : A DBMS stores data in either a navigational or hierarchical form, while an RDBMS uses a tabular structure where the headers are the column names and the rows contain the corresponding values.
7. Redundancy : Data redundancy is common in a DBMS, while keys and indexes in an RDBMS do not allow data redundancy.
8. Scale : A DBMS is used by small organizations to deal with small amounts of data, while an RDBMS is used to handle large amounts of data.
9. Speed : Data fetching is slower in a DBMS for large amounts of data, while it is fast in an RDBMS because of the relational approach.
10. Security : The data in a DBMS is subject to low security levels with regard to data manipulation, while an RDBMS provides multiple levels of data security.
11. Examples : DBMS - XML; RDBMS - MySQL, PostgreSQL, SQL Server, Oracle, etc.

DEGREE OR TYPES OF RELATIONSHIPS

1. One To One : Such a relationship exists when each record of one table is related to only one record of the other table. For example, consider two entities 'Person' and 'Passport' : each person can have only one passport, and each passport belongs to only one person.

2. One To Many : Such a relationship exists when each record of one table can be related to one or more records of the other table. This is the most common kind of relationship. For example, if there are two entity types 'Customer' and 'Account', then each 'Customer' can have more than one 'Account', but each 'Account' is held by only one 'Customer'.

3. Many To Many : Such a relationship exists when each record of the first table can be related to one or more records of the second table, and a single record of the second table can be related to one or more records of the first table. For example, if there are two entity types 'Customer' and 'Product', then each customer can buy more than one product and a product can be bought by many different customers.
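
As an illustrative sketch (the table definitions are assumptions based on the examples above), a one-to-many relationship is usually modelled with a foreign key, and a many-to-many relationship with a junction table -

-- One To Many : each Account is held by exactly one Customer
CREATE TABLE Customer (
    customer_id INT PRIMARY KEY,
    name        VARCHAR(100)
);

CREATE TABLE Account (
    account_id  INT PRIMARY KEY,
    customer_id INT NOT NULL,
    FOREIGN KEY (customer_id) REFERENCES Customer (customer_id)
);

-- Many To Many : Customer and Product linked through a junction table
CREATE TABLE Product (
    product_id INT PRIMARY KEY,
    name       VARCHAR(100)
);

CREATE TABLE Purchase (
    customer_id INT,
    product_id  INT,
    PRIMARY KEY (customer_id, product_id),
    FOREIGN KEY (customer_id) REFERENCES Customer (customer_id),
    FOREIGN KEY (product_id)  REFERENCES Product (product_id)
);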

KEYS IN DBMS
Candidate Key : The minimal set of attributes which can uniquely identify rows of a table.
There can be more than one candidate keys in a table. One key amongst all candidate
keys can be chosen as a primary key.

Super Key : Any set of attributes which identifies rows in a table uniquely. A super key is a superset of a candidate key.

Primary Key : It is a column of a table, or a set of columns, that identifies every record in that table uniquely. There can be more than one candidate key in a relation, out of which one is chosen as the primary key. There can be only one primary key in a table, and it must be unique and not NULL.

Alternate/Secondary Key : All the candidate keys which are not chosen as primary keys
are considered as alternate keys.

Unique Key : A unique key is very similar to a primary key, except that a unique key column allows NULL values while a primary key column does not. Also, a table can have only one primary key but multiple unique keys.

Composite Key : A composite key refers to a combination of two or more columns that
can uniquely identify each row in a table.

Foreign Key : It is used to establish relationships between two tables. A foreign key requires each value in a column or set of columns to match the primary key of the referenced table. Foreign keys help to maintain data and referential integrity.
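
The following hedged sketch (all table and column names are illustrative) shows several of these key types declared together, reusing the Customer table sketched earlier -

CREATE TABLE orders (
    order_id    INT,
    order_line  INT,
    customer_id INT NOT NULL,
    invoice_no  VARCHAR(20) UNIQUE,      -- unique key: enforces uniqueness but allows NULL
    PRIMARY KEY (order_id, order_line),  -- composite primary key: two columns together identify a row
    FOREIGN KEY (customer_id) REFERENCES Customer (customer_id)  -- foreign key into the referenced table
);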

E-R MODEL

An entity-relationship model is a diagrammatic approach to database design in which real-world objects are represented as entities and the relationships between them are shown.

Entity : An entity is defined as a real-world object having attributes that represent characteristics of that particular object. For example, if we have a table STUDENT (Roll_no, Student_name, Age, Mobile_no), then each student in that table is an entity and can be uniquely identified by their roll number, i.e., Roll_no.

Entity Type : It is a collection of entities that have the same attributes. So, an entity type in
an ER diagram is defined by a name (here, STUDENT) and a set of attributes(here,
Roll_no, Student_name, Age, Mobile_no) which is the STUDENT table.

Types -

1. Strong Entity Type : Those entity types which have a key attribute. The primary key
helps in identifying each entity uniquely. It is represented by a rectangle. In the above
example, Roll_no identifies each element of the table uniquely and hence, we can say that
STUDENT is a strong entity type.

2. Weak Entity Type : Those entity types which don't have a key attribute. A weak entity type can't be identified on its own; it depends upon some other strong entity type for its distinct identity. It is represented by a double-outlined rectangle. The relationship between a weak entity type and a strong entity type is called an identifying relationship and is shown with a double-outlined diamond instead of a single-outlined diamond. Example: suppose we have two tables, Customer (Customer_id, Name, Mobile_no, Age, Gender) and Address (Locality, Town, State, Customer_id). Here we cannot identify an address uniquely, as there can be many customers from the same locality. So we need an attribute of the strong entity type 'Customer' to uniquely identify entities of the 'Address' entity type.

Entity Set : It is the set of all entities of a specific entity type in a database. For example, the set of all students, employees, or teachers represents an entity set. We can say that the entity type is a superset of the entity set, as all the entities are included in the entity type.

E-R MODEL vs RELATIONAL MODEL

1. Basic : The ER model is used to describe a set of objects known as entities, as well as the relationships between them, while the relational model is used to represent a collection of tables as well as the relationships between them.
2. Type : The ER model is a high-level or conceptual model, while the relational model is an implementation or representational model.
3. Components : The ER model represents components as Entity, Entity Type, and Entity Set, while the relational model represents components as domains, attributes, and tuples.
4. Used by : The ER model is helpful for people who don't have any knowledge of how the data is implemented, while the relational model is mostly popular among programmers.
5. Relationships : It is easy to understand the relationships between entities in the ER model, while it is less easy to derive the relationships between tables in the relational model.

SQL

Structured Query Language (SQL) is the standard language used to operate, manage, and access databases. SQL itself is an ANSI/ISO standard rather than a product owned by a single company; SQL Server is Microsoft's database product built on it.

MySQL

MySQL is an open-source relational database management system that uses SQL commands to perform specific functions/operations in a database. It is owned and offered by Oracle Corporation.

JOIN

A SQL Join statement is used to combine data or rows from two or more tables based on a
related column between them. Different types of Joins are :

1. INNER JOIN : selects all rows from both tables as long as the join condition is satisfied, i.e., the values of the common field are the same. Example -

SELECT Orders.OrderID, Customers.CustomerName
FROM Orders
INNER JOIN Customers
ON Orders.CustomerID = Customers.CustomerID;

2. LEFT JOIN : returns all the rows of the table on the left side of the join and the matching rows from the table on the right side. For rows that have no matching row on the right side, the result set will contain NULL. Example -

SELECT Customers.CustomerName, Orders.OrderID
FROM Customers
LEFT JOIN Orders
ON Customers.CustomerID = Orders.CustomerID;
3. RIGHT JOIN : returns all the rows of the table on the right side of the join and the matching rows from the table on the left side. For rows that have no matching row on the left side, the result set will contain NULL.
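
A sketch mirroring the earlier examples, using the same Customers and Orders tables (this query itself is not from the original text). Example -

SELECT Customers.CustomerName, Orders.OrderID
FROM Customers
RIGHT JOIN Orders
ON Customers.CustomerID = Orders.CustomerID;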

4. FULL JOIN : creates the result set by combining the results of both LEFT JOIN and RIGHT JOIN. The result set contains all the rows from both tables. For rows that have no match, the result set will contain NULL values.
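
A sketch using standard SQL syntax (note that some engines, such as MySQL, do not support FULL OUTER JOIN directly). Example -

SELECT Customers.CustomerName, Orders.OrderID
FROM Customers
FULL OUTER JOIN Orders
ON Customers.CustomerID = Orders.CustomerID;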

5. SELF JOIN : used to join a table to itself as if the table were two tables;
temporarily renaming at least one table in the SQL statement. Example -

SELECT A.CustomerName AS CustomerName1, B.CustomerName AS CustomerName2, A.City
FROM Customers A, Customers B
WHERE A.CustomerID <> B.CustomerID;

VIEWS

Views in SQL are kind of virtual tables. A view also has rows and columns as they are in a
real table in the database. We can create a view by selecting fields from one or more
tables present in the database. Syntax -

CREATE VIEW view_name AS
SELECT column1, column2, ...
FROM table_name
WHERE condition;

SELECT * FROM view_name;

CREATE OR REPLACE VIEW view_name AS
SELECT column1, column2, ...
FROM table_name
WHERE condition;

DROP VIEW view_name;

Uses of a View :

• Summarize data from various tables which can be used to generate reports.
• Restrict access to the data in such a way that a user can see and (sometimes)
modify exactly what they need and no more.
• Limiting the visibility of columns (via select) or rows (via where) to just those
required for a particular task.
• Combining rows and/or columns from multiple tables into one logical table.

TRIGGER
A trigger is a stored procedure in the database which is automatically invoked whenever a specified event in the database occurs. For example, a trigger can be invoked when a row is inserted into a specified table or when certain table columns are being updated.

Syntax :

create trigger Trigger_Name
[before | after]
[insert | update | delete]
on [table_name]
[for each row]
[trigger_body]

Explanation of syntax :

create trigger Trigger_Name : Creates or replaces an existing trigger with the name Trigger_Name.
[before | after] : This specifies when the trigger will be executed.
[insert | update | delete] : This specifies the DML operation.
on [table_name] : This specifies the name of the table on which the trigger is going to be
applied.
[for each row] : This specifies a row-level trigger, i.e., the trigger will be executed for each
row being affected.
[trigger_body] : It consists of queries that need to be executed when the trigger is fired.

Example :

Given a Student Report database in which students' marks are recorded, create a trigger so that the total marks are automatically computed whenever a record is inserted.

create trigger stud_marks
before insert
on Student
for each row
set NEW.total = NEW.subj1 + NEW.subj2 + NEW.subj3;

The above SQL statement creates a trigger on the Student table: whenever subject marks are entered, the trigger computes the total before the row is inserted and stores it along with the entered values.

SQL INJECTION
It is a code injection technique that might destroy your database. It is one of the most
common web hacking techniques. SQL injection usually occurs when you ask a user for
input, like their username/userid, and instead of a name/id, the user gives you an SQL
statement that you will unknowingly run on your database.
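
As an illustrative sketch (the Users table and the input value are hypothetical), suppose an application builds the query by concatenating user input; a crafted input can then change the query's meaning -

-- Intended query, expecting a plain numeric id from the user :
SELECT * FROM Users WHERE UserId = 105;

-- If the attacker submits the input : 105 OR 1=1
-- the concatenated text becomes the following query, which returns every row :
SELECT * FROM Users WHERE UserId = 105 OR 1=1;

Using parameterized queries (prepared statements) instead of string concatenation is the standard defence, since the input is then treated purely as data rather than as part of the SQL text.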

SUBQUERY
A subquery (also called an inner query or nested query) is a query within another SQL query, typically embedded within the WHERE clause. Subqueries must be enclosed within parentheses.
A subquery is used to return data that will be used in the main query as a condition to further
restrict the data to be retrieved. Example -

SELECT * FROM CUSTOMERS
WHERE ID IN (SELECT ID
             FROM CUSTOMERS
             WHERE SALARY > 4500);

DELETE vs TRUNCATE

1. The DELETE command is used to delete specified rows (one or more), while TRUNCATE is used to delete all the rows from a table.
2. DELETE is a DML (Data Manipulation Language) command, while TRUNCATE is a DDL (Data Definition Language) command.
3. A WHERE clause may be used with DELETE in order to filter the records, while there is no WHERE clause with TRUNCATE.
4. In the DELETE command, a tuple is locked before being removed, while in TRUNCATE the data page is locked before the table data is removed.
5. The DELETE statement removes rows one at a time and records an entry in the transaction log for each deleted row, while TRUNCATE TABLE removes the data by deallocating the data pages used to store the table data and records only the page deallocations in the transaction log.
6. DELETE is slower than TRUNCATE, while TRUNCATE is faster than DELETE.
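
A short sketch on a hypothetical employees table contrasting the two commands -

-- DELETE : removes only the rows matching the condition, one row at a time, logging each deletion
DELETE FROM employees WHERE dept_id = 10;

-- TRUNCATE : removes all rows by deallocating the data pages; no WHERE clause is allowed
TRUNCATE TABLE employees;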

CURSOR

A cursor is a temporary work area allocated by the database server when the user performs DML operations on a table. Cursors are used to store and traverse the rows retrieved from database tables.
There are 2 types of cursors :
1. Implicit Cursors : are also known as Default Cursors of SQL SERVER. These Cursors
are allocated by SQL SERVER when the user performs DML operations.

2. Explicit Cursors : are created by users whenever they are required. They are used for fetching data from a table in a row-by-row manner.

How to create Explicit Cursor :

1. Declare Cursor Object

Syntax : DECLARE cursor_name CURSOR FOR SELECT * FROM table_name

Example : DECLARE s1 CURSOR FOR SELECT * FROM studDetails

2. Open Cursor Connection

Example : OPEN s1

3. Fetch Data from cursor

There are a total of 6 methods to access data from a cursor. They are as follows :
FIRST is used to fetch only the first row from cursor table.
LAST is used to fetch only last row from cursor table.
NEXT is used to fetch data in forward direction from cursor table.
PRIOR is used to fetch data in backward direction from cursor table.
ABSOLUTE n is used to fetch the exact nth row from cursor table.
RELATIVE n is used to fetch the data in incremental way as well as decremental way.

Syntax :
FETCH NEXT/FIRST/LAST/PRIOR/ABSOLUTE n/RELATIVE n FROM cursor_name

Example :
FETCH FIRST FROM s1
FETCH LAST FROM s1
FETCH NEXT FROM s1
FETCH PRIOR FROM s1
FETCH ABSOLUTE 7 FROM s1
FETCH RELATIVE -2 FROM s1

4. Close cursor connection

Example : CLOSE s1

5. Deallocate cursor memory

Example : DEALLOCATE s1

INDEXING IN DBMS
Indexing is a way to optimize the performance of a database by minimizing the number of disk accesses required when a query is processed. It is a data structure technique used to quickly retrieve records from a database.

An index is a small table having only two columns. The first column contains a search key, i.e., a copy of the primary or candidate key of the table. The second column contains a set of pointers holding the addresses of the disk blocks where that specific key value can be found.
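
As a hedged sketch (the table and index names are assumptions), an index is typically created on a column that is frequently used for lookups -

CREATE INDEX idx_employee_name
ON employees (emp_name);

-- Queries that filter on the indexed column can now use the index instead of a full scan :
SELECT * FROM employees WHERE emp_name = 'Dana';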

Types of Indexing

1. Primary Index : The default format of indexing. A primary index is an ordered file of fixed-length records with two fields: the first field is the primary key and the second field is a pointer to the corresponding data block.
Primary indexing in DBMS is further divided into two types :

• Dense Index : The dense index contains an index record for every search key value in the data file, which makes searching faster. The number of records in the index table is the same as the number of records in the main table.

• Sparse Index : The index record appears only for a few items in the data file, and each entry in the index table points to a block. To locate a record, we find the index record with the largest search key value less than or equal to the search key value we are looking for, then start at the record pointed to by that index record and proceed along the pointers in the file (that is, sequentially) until we find the desired record.

2. Clustered Index : It is defined on an ordered data file. The data file is ordered on a non-
key field. In some cases, the index is created on non-primary key columns which may not
be unique for each record. In such cases, in order to identify the records faster, we will group
two or more columns together to get the unique values and create index out of them. This
method is known as the clustering index. Basically, records with similar characteristics are
grouped together and indexes are created for these groups.

3. Secondary Index or Non-Clustered Index : In secondary indexing, another level of indexing is introduced to reduce the size of the first-level mapping. Large ranges of column values are chosen initially so that the first-level mapping stays small; each range is then divided into smaller ranges. The first-level mapping is stored in primary memory so that address fetches are fast, while the second-level mapping and the actual data are stored in secondary memory (hard disk).
FUNCTIONAL DEPENDENCY
It is a constraint that determines the relation of one attribute to another attribute in a
DBMS. Functional Dependency helps to maintain the quality of data in the database.
A functional dependency is denoted by an arrow "→". The dependency X → Y means that attribute Y is functionally dependent on X, i.e., the value of X determines the value of Y.

Employee number Employee Name Salary City
1 Dana 50000 San Francisco
2 Francis 38000 London
3 Andrew 25000 Tokyo

In this example, if we know the value of Employee number, we can obtain Employee Name, City, and Salary. So City, Employee Name, and Salary are functionally dependent on Employee number: Employee number → {Employee Name, Salary, City}.

NORMALIZATION

It is a database design technique that reduces data redundancy and eliminates undesirable characteristics like insertion, update, and deletion anomalies. Normalization rules divide larger tables into smaller tables and link them using relationships. The purpose of normalization in SQL is to eliminate redundant (repetitive) data and ensure data is stored logically.

Purpose of Normalization :

1. Prevent the same data from being stored in many places.
2. Prevent updates being made to some copies of the data and not others.
3. Prevent unintended loss of data, where deleting a record also deletes other, unrelated information.
4. Ensure queries are more efficient.

Types of Normalization :

First Normal Form (1NF)

A relation is in 1NF if every attribute in that relation is a single-valued attribute. First normal form disallows multi-valued attributes and composite attributes.
Example : Relation EMPLOYEE is not in 1NF because of multi-valued attribute
EMP_PHONE.

EMP_ID EMP_NAME EMP_PHONE EMP_STATE

14 John 7272826385, 9064738238 UP

20 Harry 8574783832 Bihar

12 Sam 7390372389, 8589830302 Punjab

The decomposition of the EMPLOYEE table into 1NF has been shown below:

EMP_ID EMP_NAME EMP_PHONE EMP_STATE

14 John 7272826385 UP

14 John 9064738238 UP

20 Harry 8574783832 Bihar

12 Sam 7390372389 Punjab

12 Sam 8589830302 Punjab

Second Normal Form (2NF)

A table or relation must be in 1NF, and all the non-primary-key attributes must be fully functionally dependent on the primary key. This applies to relations with composite keys, that is, relations whose primary key is composed of two or more attributes.

Example : Let's assume, a school can store the data of teachers and the subjects they
teach. In a school, a teacher can teach more than one subject.

TEACHER_ID SUBJECT TEACHER_AGE


25 Chemistry 30
25 Biology 30
47 English 35
83 Math 38
83 Computer 38

Here, Primary Key : (TEACHER_ID, SUBJECT), Non-prime attribute : TEACHER_AGE

In the above TEACHER table, the non-prime attribute TEACHER_AGE depends only on TEACHER_ID, which is a proper subset of the candidate key. So there is a partial dependency, which violates the rule for 2NF.
To convert the given table into 2NF, we decompose it into two tables :

TEACHER_DETAIL
TEACHER_ID TEACHER_AGE
25 30
47 35
83 38

TEACHER_SUBJECT
TEACHER_ID SUBJECT
25 Chemistry
25 Biology
47 English
83 Math
83 Computer

Third Normal Form (3NF)

A table or relation must be in 2NF and there should be no transitive dependency for non-
prime attributes.

Example :

EMP_ID EMP_NAME EMP_ZIP EMP_STATE EMP_CITY


222 Harry 201010 UP Noida
333 Stephan 02228 US Boston
444 Lan 60007 US Chicago
555 Katharine 06389 UK Norwich
666 John 462007 MP Bhopal

Primary Key : EMP_ID
Non-prime attributes : All attributes except EMP_ID are non-prime.

Here, (EMP_STATE, EMP_CITY) depend on EMP_ZIP, and EMP_ZIP depends on EMP_ID :
EMP_ID -> EMP_ZIP and EMP_ZIP -> (EMP_STATE, EMP_CITY)

The non-prime attributes (EMP_STATE, EMP_CITY) are therefore transitively dependent on the primary key (EMP_ID), which violates the rule of third normal form.

That's why we move EMP_CITY and EMP_STATE to a new EMPLOYEE_ZIP table, with EMP_ZIP as its primary key.

EMPLOYEE table:
EMP_ID EMP_NAME EMP_ZIP
222 Harry 201010
333 Stephan 02228
444 Lan 60007
555 Katharine 06389
666 John 462007

EMPLOYEE_ZIP table:
EMP_ZIP EMP_STATE EMP_CITY
201010 UP Noida
02228 US Boston
60007 US Chicago
06389 UK Norwich
462007 MP Bhopal

Boyce Codd Normal Form (BCNF)

BCNF is an advanced, stricter version of 3NF. A table is in BCNF if, for every functional dependency X → Y, X is a super key of the table.

Example : Let's assume there is a company where employees work in more than one
department.

EMP_ID EMP_COUNTRY EMP_DEPT DEPT_TYPE EMP_DEPT_NO


264 India Designing D394 283
264 India Testing D394 300
364 UK Stores D283 232
364 UK Developing D283 549

In the above table the functional dependencies are as follows :
EMP_ID -> EMP_COUNTRY
EMP_DEPT -> {DEPT_TYPE, EMP_DEPT_NO}

Candidate key : {EMP_ID, EMP_DEPT} (composite)

The table is not in BCNF because the left-hand sides of the functional dependencies, EMP_ID and EMP_DEPT, are not candidate keys on their own.

To convert the given table into BCNF, we decompose it into three tables :

EMP_COUNTRY table :
EMP_ID EMP_COUNTRY
264 India
364 UK

EMP_DEPT table :
EMP_DEPT DEPT_TYPE EMP_DEPT_NO
Designing D394 283
Testing D394 300
Stores D283 232
Developing D283 549

EMP_DEPT_MAPPING table :
EMP_ID EMP_DEPT
264 Designing
264 Testing
364 Stores
364 Developing

Candidate Keys -
For the first table : EMP_ID
For the second table : EMP_DEPT
For the third table : {EMP_ID, EMP_DEPT}

Now the decomposition is in BCNF, because the left-hand side of every functional dependency is a candidate key of its table.
DENORMALIZATION

When we normalize tables, we break them into multiple smaller tables. So when we want
to retrieve data from multiple tables, we need to perform some kind of join operation on
them. In that case, we use the denormalization technique that eliminates the drawback of
normalization.
So, Denormalization is a database optimization technique in which we add redundant data
to one or more tables.

For example, in a normalized database we might have a Courses table and a Teachers table. Each entry in Courses would store the teacherID for a course but not the teacherName. Whenever we need to retrieve a list of all courses with the teacher's name, we would have to join these two tables. The drawback is that if the tables are large, we may spend an unnecessarily long time doing joins. In that case we can denormalize: add redundant data (for example, store the teacher's name directly in Courses) and accept the extra update effort in exchange for the efficiency benefit of fewer joins.
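
A hedged sketch of this example (the table and column names are assumptions): the normalized design needs a join to list courses with teacher names, while the denormalized design stores a redundant copy of the name -

-- Normalized : the teacher's name lives only in Teachers, so a join is required
SELECT c.courseName, t.teacherName
FROM Courses c
JOIN Teachers t ON c.teacherID = t.teacherID;

-- Denormalized : teacherName is copied into Courses, so no join is needed,
-- at the cost of keeping the copy in sync whenever a teacher's name changes
ALTER TABLE Courses ADD COLUMN teacherName VARCHAR(100);
SELECT courseName, teacherName FROM Courses;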

TRANSACTION

A transaction can be defined as a group of tasks. A single task is the minimum processing unit
which cannot be divided further.
Let’s take an example of a simple transaction. Suppose a bank employee transfers Rs 500 from A's
account to B's account. This very simple and small transaction involves several low-level tasks.

A’s Account
Open_Account(A)
Old_Balance = A.balance
New_Balance = Old_Balance - 500
A.balance = New_Balance
Close_Account(A)

B’s Account

Open_Account(B)
Old_Balance = B.balance
New_Balance = Old_Balance + 500
B.balance = New_Balance
Close_Account(B)
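
Written as a single SQL transaction, the same transfer would look roughly like the sketch below (the accounts table and its columns are assumptions); either both updates take effect or neither does -

START TRANSACTION;

UPDATE accounts SET balance = balance - 500 WHERE account_id = 'A';
UPDATE accounts SET balance = balance + 500 WHERE account_id = 'B';

COMMIT;   -- if any step fails, issue ROLLBACK instead so the transfer is all-or-nothing
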
STATES OF TRANSACTION

1. Active State : This is the first state in the life cycle of a transaction. A transaction is
called in an active state as long as its instructions are getting executed. All the changes
made by the transaction now are stored in the buffer in main memory.

2. Partially Committed State : After the last instruction of transaction has executed, it
enters into a partially committed state. After entering this state, the transaction is
considered to be partially committed. It is not considered fully committed because all the
changes made by the transaction are still stored in the buffer in main memory.

3. Committed State : After all the changes made by the transaction have been
successfully stored into the database, it enters into a committed state. After a transaction
has entered the committed state, it is not possible to roll back the transaction.

4. Failed State : When a transaction is getting executed in the active state or partially
committed state and some failure occurs due to which it becomes impossible to continue
the execution, it enters into a failed state.

5. Aborted State : After the transaction has failed and entered into a failed state, all the
changes made by it have to be undone. To undo the changes made by the transaction, it
becomes necessary to roll back the transaction. After the transaction has rolled back
completely, it enters into an aborted state.

6. Terminated State : This is the last state in the life cycle of a transaction.
After entering the committed state or aborted state, the transaction finally enters into a
terminated state where its life cycle finally comes to an end.

ACID PROPERTIES

In order to maintain consistency in a database, before and after the transaction, certain
properties are followed. These are called ACID properties.

1. Atomicity : By this, we mean that either the entire transaction takes place at once or
doesn’t happen at all. There is no midway i.e. transactions do not occur partially. Atomicity
is the main focus in the bank systems.

Example – Suppose person A, having $30 in his account, wishes to send $10 to person B's account, and account B already holds $100. Two operations take place: a deduction of $10 from account A and an addition of $10 to account B. Now suppose the debit executes successfully but the credit fails. Account A's balance becomes $20 while account B's balance remains $100, leaving the data inconsistent. Atomicity requires that in such a case the debit is also rolled back, so that either both operations happen or neither does.

2. Consistency : This means that integrity constraints must be maintained so that the
database is consistent before and after the transaction. It refers to the correctness of a
database.
Example – Suppose there are three accounts A, B, and C, where A makes transactions to B and C one by one, each consisting of a debit and a credit operation. Before any transaction, A's balance is $300. A first transfers $50 to B, after which B's balance becomes $150. A then transfers $20 to C, and at that moment C reads A's balance as $250, which is correct because the $50 debit to B has already completed. Both transfers succeed and the values read are correct, so the data is consistent. If C had still read A's balance as $300, the data would be inconsistent, because the earlier debit would not be reflected.

3. Isolation : This property ensures that multiple transactions can occur concurrently
without leading to the inconsistency of database state. Transactions occur independently
without interference. Changes occurring in a particular transaction will not be visible to any
other transaction until that particular change in that transaction has been committed.

Example - If two transactions are running concurrently on two different accounts, the values of the accounts should not interfere with each other. For instance, account A may make transactions T1 and T2 to accounts B and C, but both execute independently without affecting each other.

4. Durability : This property ensures that once the transaction has completed execution, the updates and modifications to the database are written to disk and persist even if a system failure occurs. These updates become permanent and are stored in non-volatile memory.
