DBMS Assignment

This document contains instructions for a DBMS assignment asking students to explain Codd's 12 rules for relational databases, describe different types of DBMS languages with examples, write examples of keys, explain data modification anomalies, and normalize forms. It then provides a detailed response covering Codd's 12 rules, examples of data definition language (DDL), data manipulation language (DML), data query language (DQL), data control language (DCL), and transaction control language (TCL).

Uploaded by

Avishek Jha

DBMS ASSIGNMENT

DBMS assignment 1:

1. State Codd's 12 rules.

2. Explain the different types of DBMS languages with examples.

3. Write examples of keys.

4. Describe the various types of data modification anomalies.

5. Explain all the normal forms with examples.

ANSWERS:
1.
Codd's Rules, established by Edgar F. Codd, define the requirements that a database
management system (DBMS) must meet to be considered a relational database
management system (RDBMS). Here are Codd's 12 rules for an RDBMS:

1. Information Rule:
All information in the database is to be represented in one and only one way, namely
as values in a table.

2. Guaranteed Access Rule:


Each data value in the database should be accessible by specifying a table name,
primary key value, and column name.

3. Systematic Treatment of Null Values:


The DBMS must support null values, which are distinct from an empty string or zero,
as a systematic way of representing missing or inapplicable information, independent
of data type.

4. Dynamic Online Catalog Based on The Relational Model:


The structure (metadata) describing the database is to be stored in the same
relational format as the data itself.

5. Comprehensive Data Sublanguage Rule:


The DBMS must support at least one clearly defined language that includes data
definition, data manipulation, data integrity, and database transaction control.

6. View Updating Rule:


All views that are theoretically updatable should be updatable by the system.

7. High-Level Insert, Update, and Delete:


The capability of handling a base relation or a derived relation as a single operand
applies not only to the retrieval of data but also to the insertion, update, and deletion
of data.

8. Physical Data Independence:


Changes to the physical storage structures or devices should not affect the logical
structure of the database.

9. Logical Data Independence:


Changes to the logical structure (table structures, domains, etc.) should not require
changes to the application programs.

10. Integrity Independence:


Integrity constraints must be definable in the relational data sublanguage and stored
in the catalog, not in application programs, so that the DBMS itself enforces them.

11. Distribution Independence:


A user's perception of the data should remain unchanged, regardless of where or
how the data is physically stored or distributed.

12. Non-subversion Rule:


If a relational system has a low-level (single-record-at-a-time) language, that low
level cannot be used to subvert or bypass the integrity rules and constraints
expressed in the higher-level relational language.

Adhering to these rules ensures that a database system is truly relational, providing a
high level of consistency, flexibility, and data integrity.
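
Two of these rules are easy to try out. The sketch below uses Python's built-in sqlite3 module with a small employees table; the table, its columns, and the sample rows are assumptions made up for illustration. It shows Rule 2 (a value is reached by table name, primary key value, and column name) and Rule 3 (NULL is distinct from an empty string).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO employees (id, name, phone) VALUES (?, ?, ?)",
    [(1, "Asha", "555-0100"), (2, "Ravi", None), (3, "Mina", "")],
)

# Rule 2: table name + primary key value + column name locate exactly one value.
name_of_2 = conn.execute("SELECT name FROM employees WHERE id = ?", (2,)).fetchone()[0]

# Rule 3: NULL (phone unknown) is not the same thing as an empty string.
null_ids = [r[0] for r in conn.execute("SELECT id FROM employees WHERE phone IS NULL")]
empty_ids = [r[0] for r in conn.execute("SELECT id FROM employees WHERE phone = ''")]

print(name_of_2, null_ids, empty_ids)
```

Only row 2 matches the IS NULL test and only row 3 matches the empty-string test, so the two cases are kept apart.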

OR,

Codd’s rules are a set of criteria that define what a relational database management
system (RDBMS) is. They were proposed by Edgar F. Codd, the inventor of the relational
model, in 1985. The rules are numbered from 0 to 12, and they are as follows:

Rule 0: The Foundation Rule. The database must be based on the relational
model, and the system must support relational operations on the data.

Rule 1: The Information Rule. All information in the database, including metadata,
must be represented as values in tables. Each value must be stored in a single cell
of a table.

Rule 2: The Guaranteed Access Rule. Every data element must be accessible by a
combination of table name, primary key value, and column name.

Rule 3: The Systematic Treatment of Null Values. The database must handle null
values consistently and transparently. Null values indicate the absence or
inapplicability of data.

Rule 4: The Active Online Catalog Rule. The database must have an online
catalog that contains the metadata of the database. The catalog must be accessible
using the same query language as the data.

Rule 5: The Comprehensive Data Sublanguage Rule. The database must support
a data sublanguage that has a well-defined syntax and semantics. The sublanguage
must allow data definition, data manipulation, data integrity, data security, and
transaction management.

Rule 6: The View Updating Rule. Every view that is theoretically updatable must
also be updatable by the system. Views are virtual tables that derive their data
from other tables.

Rule 7: The High-level Insert, Update, and Delete Rule. The database must
support high-level operations that can insert, update, or delete data from one or
more tables in a single statement.

Rule 8: The Physical Data Independence Rule. The database must be
independent of the physical storage and access methods of the data. Changes to
the physical level should not affect the logical level or the applications.

Rule 9: The Logical Data Independence Rule. Applications must be independent
of the logical structure and organization of the data. Information-preserving changes
to the logical level, such as splitting or combining tables, should not require
changes to the applications.

Rule 10: The Integrity Independence Rule. The database must allow integrity
constraints to be defined and enforced by the system. Integrity constraints are rules
that ensure the validity and consistency of the data.

Rule 11: The Distribution Independence Rule. The database must be able to
distribute the data across multiple locations without affecting the users or the
applications. The distribution of data should be transparent to the users.

Rule 12: The Non-subversion Rule. The database must not allow any access to
the data that bypasses the integrity, security, or transaction rules of the system. The
system must prevent any unauthorized or inconsistent manipulation of the data.
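
Rule 4 can be illustrated with SQLite, whose catalog is itself a table named sqlite_master. This is a minimal sketch using Python's sqlite3 module; the two tables are hypothetical, and other systems expose their catalogs under different names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE departments (department_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, department_id INTEGER)"
)

# The catalog is read with an ordinary SELECT, just like user data.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

# Column metadata for one table; each returned row describes one column,
# and index 1 of the row is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(employees)")]

print(tables, columns)
```

The first query uses plain SQL; the PRAGMA statement is SQLite-specific and is shown only as a convenient way to list columns.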

2.

Database Management Systems (DBMS) use different types of languages to interact
with databases. Here are some of the key types of languages used in DBMS:

1. Data Definition Language (DDL):

Purpose: DDL is used to define and manage the structure of the database.

Examples:

CREATE TABLE : Creates a new table in the database.

ALTER TABLE : Modifies the structure of an existing table.

DROP TABLE : Deletes a table and its data from the database.

Example:

CREATE TABLE employees (
id INT PRIMARY KEY,
name VARCHAR(255),
salary DECIMAL(10,2)
);

2. Data Manipulation Language (DML):

Purpose: DML is used to manipulate data stored in the database.

Examples:

SELECT : Retrieves data from one or more tables.

INSERT : Adds new records to a table.

UPDATE : Modifies existing records in a table.

DELETE : Removes records from a table.

Example:

SELECT name, salary FROM employees WHERE salary > 50000;

3. Data Query Language (DQL):

Purpose: DQL is a subset of DML used specifically for querying data.

Examples:

SELECT : Retrieves data from one or more tables.

Example:

SELECT * FROM customers WHERE country = 'USA';

4. Data Control Language (DCL):

Purpose: DCL is used to control access to data within the database.

Examples:

GRANT : Provides specific privileges to database users.

REVOKE : Removes specific privileges from database users.

Example:

GRANT SELECT, INSERT ON employees TO user1;

5. Transaction Control Language (TCL):

Purpose: TCL is used to manage transactions in the database.

Examples:

COMMIT : Saves changes made during the current transaction.

ROLLBACK : Undoes changes made during the current transaction.

SAVEPOINT : Sets a point within a transaction to which you can later roll back.

Example:

BEGIN TRANSACTION;
-- SQL statements
COMMIT;

6. Procedural Language/Structured Query Language (PL/SQL):

Purpose: PL/SQL is Oracle's procedural extension to SQL, used to create
stored procedures, functions, and triggers. Other database systems offer
comparable procedural extensions with their own syntax.

Examples:

CREATE PROCEDURE : Creates a stored procedure.

CREATE FUNCTION : Creates a user-defined function.

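Example:

-- Note: this uses MySQL stored-procedure syntax. Oracle PL/SQL declares
-- parameters differently (for example, empID IN NUMBER).
CREATE PROCEDURE GetEmployeeDetails (IN empID INT)
BEGIN
SELECT * FROM employees WHERE id = empID;
END;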

These languages provide a comprehensive set of tools for defining, manipulating, and
controlling access to data in a relational database management system. The examples
given are in SQL, which is the standard language for interacting with relational
databases. Different database systems may have variations in syntax and additional
features.
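
The TCL statements are easiest to understand by watching what survives a partial rollback. The following is a minimal sketch using Python's sqlite3 module; the table, the sample row, and the savepoint name before_raise are assumptions chosen for illustration.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# transaction statements below are sent to SQLite exactly as written.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, salary REAL)")

conn.execute("BEGIN")
conn.execute("INSERT INTO employees VALUES (1, 'Asha', 52000)")
conn.execute("SAVEPOINT before_raise")
conn.execute("UPDATE employees SET salary = salary * 2 WHERE id = 1")
conn.execute("ROLLBACK TO before_raise")  # undoes only the UPDATE
conn.execute("COMMIT")                     # makes the INSERT permanent

salary = conn.execute("SELECT salary FROM employees WHERE id = 1").fetchone()[0]
print(salary)
```

The inserted row is kept with its original salary, because ROLLBACK TO discards only the work done after the savepoint.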

OR,

Different types of DBMS languages are used to perform various tasks on a database,
such as creating, modifying, querying, or controlling the data. The four main categories
of DBMS languages are:

Data Definition Language (DDL): This language is used to define the structure and
schema of the database, such as tables, columns, indexes, and constraints. For
example, the SQL statement CREATE TABLE students (id INT, name VARCHAR(50), age
INT); uses DDL to create a table named students with three columns: id, name,
and age.

Data Manipulation Language (DML): This language is used to manipulate the data
stored in the database, such as inserting, updating, deleting, or retrieving data. For
example, the SQL statement INSERT INTO students VALUES (1, 'Alice', 20); uses DML
to insert a new record into the students table with the values 1, 'Alice', and 20.

Data Control Language (DCL): This language is used to control the access and
security of the database, such as granting or revoking permissions, roles, or
privileges to users or groups. For example, the SQL statement GRANT SELECT ON
students TO user1; uses DCL to grant the permission to select data from
the students table to the user named user1.

Transaction Control Language (TCL): This language is used to manage the
transactions in the database, such as committing, rolling back, or saving the changes
made by the users. For example, the SQL statement COMMIT; uses TCL to commit
the changes made by the current transaction to the database.

These are some of the common examples of DBMS languages, but there are also other
languages that are specific to certain types of databases or applications. For example,
XQuery is a language that is used to query and manipulate data in XML formats, OQL
is a language that is used to query and manipulate data in object-oriented databases,
and PL/SQL is a language that is used to write procedural code and functions in Oracle
databases.
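
The difference between DDL and DML can be seen by running both against the students table from the examples above. This is a small sketch using Python's sqlite3 module; the email column and the second student are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure, then alter it.
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.execute("ALTER TABLE students ADD COLUMN email TEXT")

# DML: change the rows, not the structure.
conn.execute("INSERT INTO students (id, name, age) VALUES (1, 'Alice', 20)")
conn.execute("INSERT INTO students (id, name, age) VALUES (2, 'Bob', 22)")
conn.execute("UPDATE students SET email = 'alice@example.com' WHERE id = 1")
conn.execute("DELETE FROM students WHERE id = 2")

rows = conn.execute("SELECT id, name, age, email FROM students").fetchall()
print(rows)
```

SQLite has no user accounts, so DCL statements such as GRANT and REVOKE cannot be tried this way; they need a server-based DBMS.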

3.

In a relational database management system (RDBMS), different types of keys are used
to establish relationships between tables and ensure data integrity. Here are examples of
various keys in a database:

1. Primary Key:

Definition: A primary key is a unique identifier for a record in a table.

Example:

CREATE TABLE students (
student_id INT PRIMARY KEY,
student_name VARCHAR(255),
age INT
);

2. Foreign Key:

Definition: A foreign key is a field in a table that refers to the primary key in
another table.

Example:

CREATE TABLE courses (
course_id INT PRIMARY KEY,
course_name VARCHAR(255),
instructor_id INT,
FOREIGN KEY (instructor_id) REFERENCES instructors(instructor_id)
);

3. Composite Key:

Definition: A composite key is a key that consists of two or more columns that
together uniquely identify a record.

Example:

CREATE TABLE orders (
order_id INT,
product_id INT,
customer_id INT,
PRIMARY KEY (order_id, product_id)
);

4. Candidate Key:

Definition: A candidate key is a minimal set of one or more columns that uniquely
identifies a record. A table can have several candidate keys; one of them is chosen
as the primary key.

Example:

CREATE TABLE employees (
employee_id INT,
email VARCHAR(255) UNIQUE,
ssn VARCHAR(9) UNIQUE,
PRIMARY KEY (employee_id)
);

5. Alternate Key:

Definition: An alternate key is a candidate key that is not selected as the
primary key.

Example:

CREATE TABLE customers (
customer_id INT PRIMARY KEY,
email VARCHAR(255) UNIQUE,
alternate_id VARCHAR(10) UNIQUE
);

6. Super Key:

Definition: A super key is any set of one or more columns that uniquely identifies a
record. It may include extra columns that are not needed for uniqueness.

Example:

CREATE TABLE products (
product_id INT,
product_code VARCHAR(10),
description VARCHAR(255),
PRIMARY KEY (product_id),
UNIQUE (product_code)
);

7. Surrogate Key:

Definition: A surrogate key is a system-generated key used as the primary key
instead of using natural keys.

Example:

CREATE TABLE employees (
employee_id INT AUTO_INCREMENT PRIMARY KEY,
employee_name VARCHAR(255),
department_id INT
);

These examples demonstrate the use of different keys in a relational database to
establish relationships between tables and ensure data integrity. The specific syntax and
features may vary depending on the database management system being used (e.g.,
MySQL, PostgreSQL, SQL Server).
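
To see what key constraints actually do, it helps to try inserts that break them. The sketch below uses Python's sqlite3 module with the instructors and courses tables from the foreign key example; the sample rows and the helper function rejected are assumptions introduced for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE instructors (instructor_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE courses (
        course_id INTEGER PRIMARY KEY,
        course_name TEXT,
        instructor_id INTEGER,
        FOREIGN KEY (instructor_id) REFERENCES instructors(instructor_id)
    )""")
conn.execute("INSERT INTO instructors VALUES (1, 'Dr. Rao')")
conn.execute("INSERT INTO courses VALUES (10, 'Databases', 1)")

def rejected(sql):
    """Return True if the DBMS refuses the statement because of a key constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# Reusing primary key value 10 violates uniqueness.
duplicate_pk = rejected("INSERT INTO courses VALUES (10, 'Networks', 1)")
# Instructor 99 does not exist, so the foreign key has nothing to reference.
dangling_fk = rejected("INSERT INTO courses VALUES (11, 'Networks', 99)")

print(duplicate_pk, dangling_fk)
```

Both inserts are refused, which is how the DBMS keeps the primary key unique and the foreign key pointing at an existing row.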

OR,

Super key: A super key is a set of one or more attributes that can uniquely identify a
tuple in a relation. A super key may contain extra attributes that are not necessary for
uniqueness. For example, in the EMPLOYEE table, {EMPLOYEE_ID},
{EMPLOYEE_ID, EMPLOYEE_NAME}, and {EMPLOYEE_ID, EMPLOYEE_NAME,
DEPARTMENT_ID} are all super keys, but only {EMPLOYEE_ID} is minimal.

Candidate key: A candidate key is a minimal super key, that is, a super key that
does not have any redundant attribute. A candidate key can uniquely identify a tuple
in a relation. A relation may have more than one candidate key. For example, in the
EMPLOYEE table, {EMPLOYEE_ID}, {SSN}, {PASSPORT_NUMBER}, and
{LICENSE_NUMBER} are all candidate keys.

Primary key: A primary key is a candidate key that is chosen by the database
designer to identify the tuples in a relation. A relation can have only one primary key.
A primary key cannot have null values or duplicates. For example, in the
EMPLOYEE table, {EMPLOYEE_ID} can be chosen as the primary key.

Alternate key: An alternate key is a candidate key that is not chosen as the primary
key. An alternate key can be used as a backup or alternative way to identify the
tuples in a relation. Because it is still a candidate key, its values must be unique,
and it is typically declared with a UNIQUE constraint. For example, in the
EMPLOYEE table, {SSN}, {PASSPORT_NUMBER}, and {LICENSE_NUMBER} are all
alternate keys.

Foreign key: A foreign key is an attribute or a set of attributes in one relation that
references the primary key of another relation. A foreign key is used to establish a
relationship between two relations. A foreign key may have null values or
duplicates. For example, in the EMPLOYEE table, {DEPARTMENT_ID} is a foreign
key that references the primary key {DEPARTMENT_ID} of the DEPARTMENT
table.

Composite key: A composite key is a key that consists of two or more attributes. A
composite key can be a super key, a candidate key, a primary key, or a foreign key,
depending on the context. For example, in the STUDENT_COURSE table,
{STUDENT_ID, COURSE_ID} is a composite key that can act as a primary key to
identify the tuples in the relation.

Unique key: A unique key is a key that can uniquely identify a tuple in a relation, but
it is not a primary key. Unlike a primary key, a unique key may allow null values; how
many nulls are permitted depends on the DBMS. A unique key
can be a candidate key or an alternate key, depending on the context. For example,
in the EMPLOYEE table, {EMAIL} can be a unique key that can identify the tuples in
the relation, but it is not a primary key.

Surrogate key: A surrogate key is a key that is artificially generated by the database
system to identify the tuples in a relation. A surrogate key has no meaning or relation
to the data in the tuple. A surrogate key is usually a numeric or alphanumeric value.
A surrogate key can be a primary key or a foreign key, depending on the context. For
example, in the EMPLOYEE table, {EMPLOYEE_ID} can be a surrogate key that is
generated by the system to identify the tuples in the relation.

Secondary key: A secondary key is a key that is used for indexing or sorting the
tuples in a relation, but it is not used for identification. A secondary key can have null
values or duplicates. A secondary key can be any attribute or a combination of
attributes that is not a primary key or a foreign key. For example, in the EMPLOYEE
table, {EMPLOYEE_NAME} can be a secondary key that is used for sorting the
tuples in the relation.
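
The relationship between super keys and candidate keys can be checked mechanically on sample data. The following is a rough sketch in Python; the EMPLOYEE rows are invented for illustration. Keep in mind that a key is a rule about every possible row, so sample data can only show that a set of attributes is not a key, never prove that it is one.

```python
from itertools import combinations

# Hypothetical EMPLOYEE rows, invented for illustration.
rows = [
    {"EMPLOYEE_ID": 1, "SSN": "111", "EMPLOYEE_NAME": "Asha", "DEPARTMENT_ID": 10},
    {"EMPLOYEE_ID": 2, "SSN": "222", "EMPLOYEE_NAME": "Ravi", "DEPARTMENT_ID": 10},
    {"EMPLOYEE_ID": 3, "SSN": "333", "EMPLOYEE_NAME": "Asha", "DEPARTMENT_ID": 20},
    {"EMPLOYEE_ID": 4, "SSN": "444", "EMPLOYEE_NAME": "Asha", "DEPARTMENT_ID": 10},
]

def is_super_key(attrs, rows):
    """True if no two sample rows agree on every attribute in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def candidate_keys(rows):
    """Minimal super keys of the sample: no smaller key is contained in them."""
    attrs = sorted(rows[0])
    keys = []
    # Smaller sets are tried first, so any superset of a found key is skipped.
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            if is_super_key(combo, rows) and not any(set(k) <= set(combo) for k in keys):
                keys.append(combo)
    return keys

print(is_super_key(("EMPLOYEE_ID", "EMPLOYEE_NAME"), rows))
print(candidate_keys(rows))
```

Here {EMPLOYEE_ID, EMPLOYEE_NAME} is a super key but not a candidate key, because EMPLOYEE_ID alone is already enough.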

4.

Data modification anomalies are issues that can occur when making changes to a
database. These anomalies can lead to inconsistencies, errors, and loss of data integrity.
There are three main types of data modification anomalies:

1. Insertion Anomalies:

Definition: Insertion anomalies occur when adding new data to the database.

Types:

Incomplete Information: Inability to insert a new record without providing
data for an entire row, even if some information is not applicable or available.

Inability to Add Related Data: Inability to add a record to a table without
having related data in another table.

Example:
Consider a database that stores information about students and their courses. If a
student enrolls in a new course, but the course information requires data that is not
yet available (e.g., instructor details), an insertion anomaly occurs.

2. Update Anomalies:

Definition: Update anomalies occur when modifying data in the database.

Types:

Data Inconsistency: Modifying data in one place and not updating it in all
related places can lead to inconsistencies.

Loss of Data: Overwriting existing data when updating a record, resulting in
the loss of historical information.

Example:
Continuing with the student and course database, if an instructor changes their email
address, updating it in one place (e.g., the "instructors" table) but not in another
(e.g., the "courses" table), can lead to data inconsistency.

3. Deletion Anomalies:

Definition: Deletion anomalies occur when removing data from the database.

Types:

Loss of Information: Deleting a record unintentionally removes related
information that may still be relevant.

Inability to Delete: Inability to delete a record without removing related data
that should be retained.

Example:
In the student and course database, if deleting a course record also deletes the
associated instructor information, it leads to a loss of information. On the other hand,
if an instructor can only be removed by deleting all courses associated with them, it
creates an inability to delete without losing relevant data.

To address these anomalies, database designers often normalize the database structure,
use proper relationships, and consider the use of constraints and transactions to
maintain data integrity during modifications. Normalization involves organizing data to
reduce redundancy and dependency, thereby minimizing the likelihood of anomalies.
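
A deletion anomaly is easy to reproduce. The sketch below uses Python's sqlite3 module and a hypothetical denormalized table, course_offerings, in which instructor details are stored on each course row; the table and its rows are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical denormalized table: instructor details live on each course row.
conn.execute("""
    CREATE TABLE course_offerings (
        course_id INTEGER PRIMARY KEY,
        course_name TEXT,
        instructor_name TEXT,
        instructor_email TEXT
    )""")
conn.executemany("INSERT INTO course_offerings VALUES (?, ?, ?, ?)", [
    (1, "Databases", "Dr. Rao", "rao@example.edu"),
    (2, "Networks", "Dr. Sen", "sen@example.edu"),
])

# Deleting Dr. Sen's only course also removes the only record of Dr. Sen.
conn.execute("DELETE FROM course_offerings WHERE course_id = 2")
remaining = [r[0] for r in conn.execute(
    "SELECT DISTINCT instructor_name FROM course_offerings")]

print(remaining)
```

Keeping instructors in their own table, referenced from the course rows by a foreign key, would let the course be deleted without losing the instructor.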

OR,

Data modification anomalies are inconsistencies or errors that can occur when data is
inserted, deleted, or updated in a relational database. There are three main types of data
modification anomalies:

Insertion anomaly: This occurs when it is not possible to insert data into a database
because the required fields are missing or because the data is incomplete. For
example, if we want to insert a new student record into a table that contains both
student and course information, we may not be able to do so if we do not know the
course details or if the course does not exist yet.

Deletion anomaly: This occurs when deleting data from a database also deletes
other data that should not be deleted. For example, if we want to delete a course
record from a table that contains both student and course information, we may also
delete the student records that are enrolled in that course, which is undesirable.

Update anomaly: This occurs when updating data in a database causes data
inconsistency or duplication. For example, if we want to update the phone number of
a student in a table that contains both student and course information, we may have
to update multiple rows that contain the same student information, which is inefficient
and prone to errors.

To avoid data modification anomalies, we can use the process of normalization, which is
the technique of designing the structure of a database to reduce redundancy and
improve data integrity. Normalization involves splitting a table into smaller and more
specific tables that are linked by foreign keys, and applying different levels of normal
forms to ensure that each table satisfies certain criteria.
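
The update anomaly described above can be reproduced in a few lines. This is a minimal sketch using Python's sqlite3 module; the enrollments table, which repeats the student's phone number on every course row, and its sample data are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical combined table: the phone number is repeated once per course.
conn.execute("""
    CREATE TABLE enrollments (
        student_id INTEGER,
        student_phone TEXT,
        course TEXT,
        PRIMARY KEY (student_id, course)
    )""")
conn.executemany("INSERT INTO enrollments VALUES (?, ?, ?)", [
    (1, "555-0100", "Math"),
    (1, "555-0100", "Physics"),
])

# The update reaches only one of the two rows that repeat the phone number.
conn.execute(
    "UPDATE enrollments SET student_phone = '555-0199' "
    "WHERE student_id = 1 AND course = 'Math'"
)

phones = sorted(r[0] for r in conn.execute(
    "SELECT DISTINCT student_phone FROM enrollments WHERE student_id = 1"))

print(phones)
```

The same student now has two different phone numbers on record. With the phone stored once in a separate students table, a single UPDATE could not leave the data inconsistent.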

5.

Normalization is the process of organizing data in a relational database to reduce
redundancy and improve data integrity. The normalization process involves dividing large
tables into smaller, related tables and defining relationships between them. There are
different normal forms (NF) that represent different levels of normalization. The
commonly used normalization forms are:

1. First Normal Form (1NF):


Definition: A table is in 1NF if it contains only atomic (indivisible) values, and there
are no repeating groups or arrays.

Example:

-- Not in 1NF
CREATE TABLE students (
student_id INT PRIMARY KEY,
student_name VARCHAR(255),
subjects VARCHAR(255) -- Repeating group
);

-- In 1NF
CREATE TABLE students (
student_id INT PRIMARY KEY,
student_name VARCHAR(255)
);

CREATE TABLE student_subjects (
student_id INT,
subject_name VARCHAR(255),
PRIMARY KEY (student_id, subject_name),
FOREIGN KEY (student_id) REFERENCES students(student_id)
);

2. Second Normal Form (2NF):


Definition: A table is in 2NF if it is in 1NF and all non-key attributes are fully
functionally dependent on the primary key.

Example:

-- Not in 2NF: product_name and price depend on product_id alone,
-- which is only part of the composite primary key
CREATE TABLE orders (
order_id INT,
product_id INT,
product_name VARCHAR(255),
price DECIMAL(10, 2),
PRIMARY KEY (order_id, product_id)
);

-- In 2NF
CREATE TABLE orders (
order_id INT,
product_id INT,
PRIMARY KEY (order_id, product_id)
);

CREATE TABLE products (
product_id INT PRIMARY KEY,
product_name VARCHAR(255),
price DECIMAL(10, 2)
);

3. Third Normal Form (3NF):


Definition: A table is in 3NF if it is in 2NF, and all non-key attributes are non-
transitively dependent on the primary key.

Example:

-- Not in 3NF
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
department_name VARCHAR(255),
manager_name VARCHAR(255),
manager_phone VARCHAR(15)
);

-- In 3NF
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
department_id INT,
manager_id INT,
FOREIGN KEY (department_id) REFERENCES departments(department_id),
FOREIGN KEY (manager_id) REFERENCES managers(manager_id)
);

CREATE TABLE departments (
department_id INT PRIMARY KEY,
department_name VARCHAR(255)
);

CREATE TABLE managers (
manager_id INT PRIMARY KEY,
manager_name VARCHAR(255),
manager_phone VARCHAR(15)
);

Boyce-Codd Normal Form (BCNF):


Definition: A table is in BCNF if it is in 3NF and for every non-trivial functional
dependency X → Y, X is a superkey.

Example:

-- Not in BCNF: assume each instructor teaches exactly one course
-- (instructor -> course), but instructor is not a superkey of this table
CREATE TABLE enrollments (
student_id INT,
course VARCHAR(50),
instructor VARCHAR(255),
PRIMARY KEY (student_id, course)
);

-- In BCNF: every determinant is now a key of its own table
CREATE TABLE instructor_courses (
instructor VARCHAR(255) PRIMARY KEY,
course VARCHAR(50)
);

CREATE TABLE student_instructors (
student_id INT,
instructor VARCHAR(255),
PRIMARY KEY (student_id, instructor),
FOREIGN KEY (instructor) REFERENCES instructor_courses(instructor)
);

Normalization beyond BCNF (Fourth Normal Form, Fifth Normal Form, etc.) is less
commonly used in practice and is typically applied in specific situations where extreme
normalization is required.

Each normalization form builds on the previous one, and the goal is to minimize
redundancy, dependencies, and anomalies in the database schema. The choice of
normalization depends on the specific requirements and characteristics of the data being
modeled.
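
Whether a dependency breaks BCNF can be checked with the attribute closure algorithm. The following is a rough sketch in Python rather than a full normalization tool; the relation, the quantity attribute, and the listed functional dependencies are assumptions chosen to resemble the orders example.

```python
def closure(attrs, fds):
    """Attribute closure: every attribute determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(relation, fds):
    """Listed non-trivial dependencies whose left side is not a superkey."""
    return [
        (lhs, rhs) for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(relation)
    ]

# Hypothetical order-lines relation keyed by (order_id, product_id).
relation = {"order_id", "product_id", "quantity", "product_name", "price"}
fds = [
    (("order_id", "product_id"), ("quantity",)),
    (("product_id",), ("product_name", "price")),
]

violations = bcnf_violations(relation, fds)
print(violations)
```

The dependency from product_id is reported because product_id alone does not determine the whole row. Since product_id is part of the composite key, the same dependency is also the partial dependency that 2NF forbids.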

OR,

Database normalization is a design principle for organizing data in a consistent
way. It helps you avoid redundancy and maintain the integrity
of the database. It also helps you eliminate undesirable characteristics associated with
insertion, deletion, and updating.

The main purpose of database normalization is to avoid complexities, eliminate
duplicates, and organize data in a consistent way. In normalization, the data is divided
into several tables linked together with relationships.

There are several normal forms in database normalization, but the most commonly used
are the first three forms: 1NF (First Normal Form), 2NF (Second Normal Form), and 3NF
(Third Normal Form). There are also 4NF (Fourth Normal Form), 5NF (Fifth Normal
Form), and even 6NF (Sixth Normal Form), but most designs stop at 3NF.

First Normal Form (1NF): For a table to be in the first normal form, it must meet the
following criteria:

A single cell must not hold more than one value (atomicity).

There must be a primary key for identification.

No duplicated rows or columns.

Each column must have only one value for each row in the table.

Second Normal Form (2NF): A table is in the Second Normal Form if it satisfies the
following:

It is in the First Normal Form.

And, it has no Partial Dependency. A table is said to have a Partial Dependency if an
attribute depends only on part, and not the whole, of the Primary Key.

Third Normal Form (3NF): A table is in the Third Normal Form if it satisfies the
following:

It is in the Second Normal Form.

And, it doesn't have Transitive Dependency. Transitive Dependency occurs when a
non-key attribute depends on another non-key attribute.

OR,

Normalization is the process of organizing a database to reduce redundancy and
improve data integrity. There are several levels of normalization, each with its own set
of rules, known as normal forms. The most common normal forms are 1NF, 2NF, and
3NF, which stand for first normal form, second normal form, and third normal form,
respectively. Here is a brief explanation of each normal form with examples:

1NF: A table is in 1NF if it meets the following criteria:

Each cell contains only a single value (atomicity).

Each column has a unique name.

There are no duplicate rows or columns.

Each column has only one value for each row in the table.

For example, consider the following table that stores information about students and
their courses:

Student ID   Name      Course
101          Alice     Math, Physics
102          Bob       Chemistry, Biology
103          Charlie   Math, Physics
104          David     History, Geography

This table is not in 1NF because it violates the atomicity rule: the Course column
holds multiple values for each student. Splitting Course into fixed columns such as
Course 1 and Course 2 would only turn the list into a repeating group, so the usual
fix is to store one row per student and course:

Student ID   Name      Course
101          Alice     Math
101          Alice     Physics
102          Bob       Chemistry
102          Bob       Biology
103          Charlie   Math
103          Charlie   Physics
104          David     History
104          David     Geography

Now, this table is in 1NF: every cell holds a single value, and the combination of
Student ID and Course identifies each row.
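
A common way to reach 1NF is to emit one row per student and course. The sketch below does this in plain Python for the four students in the example; the function name to_first_normal_form is an assumption made up for illustration.

```python
# Rows taken from the unnormalized example: the Course cell holds a
# comma-separated list of courses.
unnormalized = [
    (101, "Alice", "Math, Physics"),
    (102, "Bob", "Chemistry, Biology"),
    (103, "Charlie", "Math, Physics"),
    (104, "David", "History, Geography"),
]

def to_first_normal_form(rows):
    """Return one (student_id, name, course) row per course, so each cell is atomic."""
    return [
        (student_id, name, course.strip())
        for student_id, name, courses in rows
        for course in courses.split(",")
    ]

normalized = to_first_normal_form(unnormalized)
for row in normalized:
    print(row)
```

Each student now appears once per course, and the pair (student_id, course) can serve as the key of the resulting table.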

2NF: A table is in 2NF if it is in 1NF and it meets the following criterion:

Every non-key attribute is fully dependent on the primary key.

This means that every column that is not part of the primary key must depend on the
whole primary key, and not on just a part of it. The rule therefore matters only when the
primary key is composite. A primary key is
a column or a combination of columns that uniquely identifies each row in the table.

For example, consider the following table that stores information about employees
and their departments:

Employee ID   Name    Department ID   Department Name   Salary
201           Emma    10              Sales             5000
202           Frank   20              Marketing         6000
203           Grace   10              Sales             5500
204           Harry   30              Finance           7000

This table is in 1NF because it meets all the criteria of 1NF. Because its primary key
is the single column Employee ID, no attribute can depend on only part of the key, so
strictly speaking the table already satisfies 2NF. A genuine 2NF violation needs a
composite key: for example, in a table keyed by (Order ID, Product ID), a Product Name
column would depend on Product ID alone. The redundancy here has a different cause:
Department Name depends on Department ID, a non-key column, rather than directly on
Employee ID. Splitting the table into two tables, one for employees and one for
departments, removes that redundancy:

Employee ID   Name    Department ID   Salary
201           Emma    10              5000
202           Frank   20              6000
203           Grace   10              5500
204           Harry   30              7000

Department ID   Department Name
10              Sales
20              Marketing
30              Finance

Now, both tables are in 2NF because they meet all the criteria of 2NF.

3NF: A table is in 3NF if it is in 2NF and it meets the following criterion:

Every non-key attribute is non-transitively dependent on the primary key.

This means that every column that is not part of the primary key should depend
directly on the primary key, and not on any other non-key column in the same table. A
transitive dependency arises when a non-key column depends on the primary key only
through another non-key column.

For example, consider the following table that stores information about products and
their suppliers:

Product ID   Product Name   Supplier ID   Supplier Name   Supplier Address
301          Laptop         40            ABC Inc.        123 Main Street
302          Printer        50            XYZ Ltd.        456 Park Avenue
303          Mouse          40            ABC Inc.        123 Main Street
304          Keyboard       60            PQR Co.         789 Elm Road

This table is in 2NF because it meets all the criteria of 2NF. However, it is not in 3NF
because it contains a transitive dependency. The primary key of this table
is Product ID, but Supplier Name and Supplier Address depend on it only
transitively: they depend on Supplier ID, a non-key column, which in turn depends on
Product ID. To convert this table into 3NF, we
need to split it into two tables, one for products and one for suppliers:

Product ID   Product Name   Supplier ID
301          Laptop         40
302          Printer        50
303          Mouse          40
304          Keyboard       60

Supplier ID   Supplier Name   Supplier Address
40            ABC Inc.        123 Main Street
50            XYZ Ltd.        456 Park Avenue
60            PQR Co.         789 Elm Road

Now, both tables are in 3NF because they meet all the criteria of 3NF.
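
A good decomposition loses no information: joining the two tables should give back the original rows. The sketch below checks this with Python's sqlite3 module, using the product and supplier data from the example; the lowercase table and column names are assumptions chosen for the SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE suppliers (supplier_id INTEGER PRIMARY KEY, "
    "supplier_name TEXT, supplier_address TEXT)"
)
conn.execute(
    "CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT, "
    "supplier_id INTEGER REFERENCES suppliers(supplier_id))"
)
conn.executemany("INSERT INTO suppliers VALUES (?, ?, ?)", [
    (40, "ABC Inc.", "123 Main Street"),
    (50, "XYZ Ltd.", "456 Park Avenue"),
    (60, "PQR Co.", "789 Elm Road"),
])
conn.executemany("INSERT INTO products VALUES (?, ?, ?)", [
    (301, "Laptop", 40), (302, "Printer", 50), (303, "Mouse", 40), (304, "Keyboard", 60),
])

# Joining on supplier_id rebuilds the original five-column rows.
rebuilt = conn.execute("""
    SELECT p.product_id, p.product_name, s.supplier_id, s.supplier_name, s.supplier_address
    FROM products AS p JOIN suppliers AS s ON s.supplier_id = p.supplier_id
    ORDER BY p.product_id""").fetchall()

for row in rebuilt:
    print(row)
```

The address of ABC Inc. is now stored once, yet both the Laptop and Mouse rows still show it after the join.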
