NAME: LYENENO SILILO
PROGRAM: ADVANCED CERTIFICATE IN COMPUTER STUDIES
COURSE: DATABASE TECHNOLOGY
LECTURER: MR MBUMWAE
WORK TITLE: ASSIGNMENT ONE TERM THREE
a.) Normalization (up to 3rd Normal Form)
First Normal Form (1NF)
Order_ID | Customer_Name | Book_Title       | Author_Name | Publisher | Price | Quantity | Date
001      | John Doe      | The Future       | Alice Baker | ABC Pub   | 50    | 2        | 2024-05-13
002      | Jane Smith    | Data Science     | Bob Marks   | XYZ Pub   | 30    | 1        | 2024-05-14
003      | John Doe      | Machine Learning | Alice Baker | ABC Pub   | 70    | 1        | 2024-05-13
Second Normal Form (2NF)
1. Orders Table
Order_ID | Customer_Name | Book_Title       | Quantity | Date
001      | John Doe      | The Future       | 2        | 2024-05-13
002      | Jane Smith    | Data Science     | 1        | 2024-05-14
003      | John Doe      | Machine Learning | 1        | 2024-05-13
2. Books Table
Book_Title       | Author_Name | Publisher | Price
The Future       | Alice Baker | ABC Pub   | 50
Data Science     | Bob Marks   | XYZ Pub   | 30
Machine Learning | Alice Baker | ABC Pub   | 70
Third Normal Form (3NF)
1. Customers Table
Customer_ID | First_Name | Last_Name | Other_Name
1           | John       | Doe       |
2           | Jane       | Smith     |
2. Authors Table
Author_ID | First_Name | Last_Name | Other_Name
1         | Alice      | Baker     |
2         | Bob        | Marks     |
3. Publishers Table
Publisher_ID | Publisher_Name
1            | ABC Pub
2            | XYZ Pub
4. Books Table
Book_ID | Book_Title       | Author_ID | Publisher_ID | Price
1       | The Future       | 1         | 1            | 50
2       | Data Science     | 2         | 2            | 30
3       | Machine Learning | 1         | 1            | 70
5. Orders Table
Order_ID | Customer_ID | Book_ID | Quantity | Order_Date
001      | 1           | 1       | 2        | 2024-05-13
002      | 2           | 2       | 1        | 2024-05-14
003      | 1           | 3       | 1        | 2024-05-13
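The 3NF design above can be expressed directly as SQL table definitions. The following is a minimal sketch: the table and column names follow the tables above, while the data types and foreign-key clauses are assumptions.

create table customers (
    customer_id int primary key,
    first_name  varchar(50),
    last_name   varchar(50),
    other_name  varchar(50)
);

create table authors (
    author_id  int primary key,
    first_name varchar(50),
    last_name  varchar(50),
    other_name varchar(50)
);

create table publishers (
    publisher_id   int primary key,
    publisher_name varchar(100)
);

create table books (
    book_id      int primary key,
    book_title   varchar(100),
    author_id    int references authors(author_id),       -- each book has one author
    publisher_id int references publishers(publisher_id), -- and one publisher
    price        decimal(10,2)
);

create table orders (
    order_id    int primary key,
    customer_id int references customers(customer_id),
    book_id     int references books(book_id),
    quantity    int,
    order_date  date
);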
b.) Entity-Relationship Diagram (ERD)
SQL queries:
a. select o.order_id, b.book_title,
          a.first_name as author_first_name, a.last_name as author_last_name,
          (b.price * o.quantity) as total_price
   from orders o
   join books b on o.book_id = b.book_id
   join authors a on b.author_id = a.author_id
   where o.customer_id in
         (select customer_id from customers where first_name = 'John');

b. create view total_amount_spent as
   select c.customer_id, c.first_name, sum(b.price * o.quantity) as total_amount
   from customers c
   join orders o on c.customer_id = o.customer_id
   join books b on o.book_id = b.book_id
   group by c.customer_id, c.first_name;
   select * from total_amount_spent;

c. update books
   set price = price * 1.10
   where publisher_id in
         (select publisher_id from publishers where publisher_name = 'XYZ Pub');
   select * from books
   where publisher_id in
         (select publisher_id from publishers where publisher_name = 'XYZ Pub');
The Importance of Normalization in Database Design
Normalization is a systematic approach to organizing data in a database to minimize
redundancy and improve data integrity. The process involves dividing large tables into
smaller, related tables and defining relationships between them. This is crucial for efficient
data management and retrieval.
Role of Functional Dependencies
A fundamental concept in normalization is the functional dependency, which describes a relationship between attributes in a database: if attribute A functionally determines attribute B (written A → B), then knowing the value of A uniquely identifies the value of B. This principle
guides the decomposition of relations, ensuring that each piece of data is stored in only one
place, thereby reducing redundancy. For example, in a bookstore database, if each book has a
unique ISBN, then the ISBN can serve as a primary key, ensuring that all related data about
the book (like title, author, and price) is stored together.
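To make the ISBN example concrete, here is a hedged sketch: the books_by_isbn table below is hypothetical (the schema in part (a) keys books by Book_ID instead), but it shows how declaring the determinant as a primary key lets the DBMS enforce the dependency ISBN → {title, author, price}.

-- hypothetical table for illustration: isbn determines every other column
create table books_by_isbn (
    isbn       char(13) primary key,  -- determinant: isbn -> book_title, author_id, price
    book_title varchar(100),
    author_id  int,
    price      decimal(10,2)
);

Because isbn is the key, each ISBN maps to exactly one title, author, and price, so those facts are stored in only one place.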
Trade-offs in Database Decomposition
While normalization enhances data integrity and reduces redundancy, it also introduces trade-offs. One significant trade-off is the increased complexity of database queries. When
data is spread across multiple tables, retrieving information often requires complex joins,
which can impact performance. For instance, a query that retrieves customer orders along
with book details may involve joining several tables, which can slow down response times,
especially in large databases. Therefore, database designers must balance the benefits of
normalization with the potential performance impacts, ensuring that the database remains
efficient for its intended use.
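As a concrete sketch of this cost, listing orders with full book detail against the 3NF schema in part (a) already takes a four-join query (column names follow the earlier tables):

select o.order_id, c.first_name, c.last_name, b.book_title,
       a.last_name as author, p.publisher_name,
       (b.price * o.quantity) as total_price
from orders o
join customers  c on o.customer_id  = c.customer_id
join books      b on o.book_id      = b.book_id
join authors    a on b.author_id    = a.author_id
join publishers p on b.publisher_id = p.publisher_id;

Each extra table in the chain adds work for the query optimizer and executor, which is exactly the trade-off described above.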
Boyce-Codd Normal Form (BCNF)
Achieving Boyce-Codd Normal Form (BCNF) is critical for further reducing anomalies in
databases. BCNF is a stronger version of the Third Normal Form (3NF) and addresses certain
types of redundancy that can still exist in 3NF. Specifically, BCNF requires that every determinant in a relation be a candidate key: if one attribute (or set of attributes) determines another, the determining side must itself be a candidate key, not merely any attribute. By enforcing this rule, BCNF
eliminates update, insertion, and deletion anomalies that can occur when data is not properly
organized. For example, if a book's price changes, ensuring that the price is stored in only one
place prevents inconsistencies across the database.
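As a hedged sketch in the bookstore context (the order_lines tables are hypothetical): suppose order line items were stored as (order_id, book_id, price) with the dependency book_id → price. The determinant book_id is not a candidate key of that table, so the design violates BCNF, and every order repeats the book's price. Decomposing fixes it:

-- violates BCNF: book_id -> price, but the key is (order_id, book_id),
-- so the price is duplicated in every order that includes the book
create table order_lines_bad (
    order_id int,
    book_id  int,
    price    decimal(10,2),
    primary key (order_id, book_id)
);

-- BCNF decomposition: every determinant is now a key of its own table
create table order_lines (
    order_id int,
    book_id  int,
    primary key (order_id, book_id)
);

create table book_prices (
    book_id int primary key,  -- the determinant book_id is the key here
    price   decimal(10,2)
);

After the split, a price change updates one row in book_prices, matching the single-place storage described above.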
Concurrency Control in Databases
Concurrency control is essential in database management systems to ensure the consistency
and integrity of data when multiple transactions occur simultaneously. As databases are often
accessed by multiple users at the same time, it is crucial to implement mechanisms that
prevent conflicts and maintain data accuracy.
Two-Phase Locking (2PL)
One widely used method for concurrency control is the Two-Phase Locking (2PL) protocol.
This protocol ensures serializability by requiring that a transaction acquire all the locks it needs before it releases any of them. The 2PL protocol therefore operates in two phases: a growing phase, in which locks are acquired and none are released, and a shrinking phase, in which locks are released and no new ones may be acquired.
This structure helps prevent issues such as lost updates and dirty reads, ensuring that
transactions are executed in a consistent manner.
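Most SQL database systems apply a strict form of 2PL automatically; the sketch below makes the phases visible with explicit row locks (the PostgreSQL-style for update syntax and the row choices are assumptions):

begin;
-- growing phase: acquire every lock the transaction needs
select price    from books  where book_id  = 1 for update;  -- lock the book row
select quantity from orders where order_id = 1 for update;  -- lock the order row

-- with both rows locked, the updates cannot conflict with other writers
update books  set price    = price * 1.10  where book_id  = 1;
update orders set quantity = quantity + 1  where order_id = 1;

-- shrinking phase: commit releases all locks at once (strict 2PL), so no
-- new lock is ever requested after a lock has been released
commit;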
Deadlock Prevention Techniques
In addition to 2PL, deadlock prevention techniques are crucial for managing concurrency.
Techniques such as wait-die and wound-wait help mitigate the risk of deadlocks, which occur
when two or more transactions are waiting indefinitely for resources held by each other. The
wait-die scheme allows an older transaction to wait for a younger one to release its locks, while a younger transaction is aborted ("dies") if it requests a lock held by an older one. Under the complementary wound-wait scheme, an older transaction preempts ("wounds") the younger lock holder, aborting it, while a younger requester is made to wait. Both schemes impose an age-based ordering that keeps transactions flowing and prevents the system from becoming unresponsive.
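A hedged sketch of how such a deadlock arises when two transactions update the same rows in opposite order (the session labels and interleaving are illustrative):

-- session A
begin;
update books set price = 55 where book_id = 1;  -- A locks book 1

-- session B, running concurrently
begin;
update books set price = 35 where book_id = 2;  -- B locks book 2

-- session A
update books set price = 60 where book_id = 2;  -- A now waits for B

-- session B
update books set price = 40 where book_id = 1;  -- B waits for A: deadlock
-- under wait-die, whichever of A and B is younger would be aborted at this
-- point instead of waiting; many DBMSs instead detect the cycle and roll
-- one of the transactions back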
Snapshot Isolation
Another important concept in concurrency control is snapshot isolation, which provides high
concurrency with minimal locking overhead. This technique allows transactions to read a
consistent snapshot of the database at a specific point in time, enabling multiple transactions
to operate simultaneously without interfering with each other. Snapshot isolation enhances
performance by reducing the need for locks, thus allowing for faster transaction processing
while maintaining data integrity.
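As a hedged sketch: PostgreSQL provides snapshot isolation through its repeatable read level (SQL Server exposes it as set transaction isolation level snapshot; exact syntax varies by DBMS):

begin transaction isolation level repeatable read;
-- every read in this transaction sees the same snapshot, taken at the
-- first statement, even if other sessions commit price changes meanwhile
select sum(price) as catalogue_value from books;
select price from books where book_id = 1;  -- consistent with the sum above
commit;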
Conclusion
In conclusion, both normalization and concurrency control are fundamental to effective
database design and management. Normalization ensures data integrity and reduces
redundancy, while concurrency control mechanisms like 2PL, deadlock prevention
techniques, and snapshot isolation maintain consistency and performance in multi-user
environments. Understanding and implementing these concepts is essential for developing
robust and efficient database systems. By carefully considering the principles of normalization and concurrency control, database designers can create systems that are both efficient and
reliable, ultimately leading to better data management and user satisfaction.