lecture-4
Evrad KAMTCHOUM
1 Introduction
2 Entity-Relationship Modeling
Functional Dependencies
Database Normalization
Database Denormalization
Integrity Rules and Constraints
4 Conclusion
Definition
Database design is the process of creating a detailed data model of a
database. It involves identifying the data requirements, organizing the
data into tables, and defining the relationships between tables.
There are several key concepts in database design that form the
foundation of a well-structured database:
Entity: Represents a real-world object or concept, such as a customer
or product.
Attribute: Describes the characteristics or properties of an entity.
Relationship: Defines how entities are related to each other.
Cardinality: Describes the number of instances of one entity that
can be associated with a single instance of another entity.
Normalization: Process of organizing data to minimize redundancy
and dependency.
Denormalization: Technique used to optimize query performance by
introducing redundancy.
Understanding these concepts is essential for designing effective database
schemas.
Evrad KAMTCHOUM (CCMC (UBa)) Database Systems November 14, 2024 6 / 48
Best Practices in Database Design
Definition
Conceptual modeling involves creating a high-level description of the data
to be stored in the database, independent of any specific implementation
details.
Definition
Physical modeling involves translating the logical database design into the actual
implementation details, including data storage structures, indexing strategies, and performance
optimization techniques.
Storage Structures: Determining how data will be stored on disk, such as tables, indexes,
and partitions.
Indexes: Creating indexes to speed up data retrieval operations by providing efficient
access paths to data.
Data Types and Constraints: Specifying the data types and constraints for each attribute
in the database schema.
Normalization: Ensuring the database schema is in an optimal normalized form to
minimize redundancy and dependency.
Performance Optimization: Implementing strategies to improve query performance and
overall system efficiency, such as query optimization and caching.
Data Integrity: Enforcing data integrity constraints, such as foreign key relationships and
unique constraints, at the database level.
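The physical-design decisions listed above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the customers and orders tables, their columns, and the index name are assumptions made for the example, and other database systems expose different storage and indexing options.

```python
import sqlite3

# In-memory database; all table, column, and index names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,       -- data type plus primary key
    email       TEXT NOT NULL UNIQUE,      -- unique constraint
    name        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),  -- foreign key
    total_cents INTEGER NOT NULL CHECK (total_cents >= 0)            -- value restriction
);
-- Secondary index giving an access path for "orders of a given customer".
CREATE INDEX idx_orders_customer ON orders(customer_id);
""")

conn.execute("INSERT INTO customers VALUES (1, 'a@example.com', 'Ada')")
conn.execute("INSERT INTO orders VALUES (10, 1, 2500)")

def try_insert(sql):
    """Return True if the statement succeeds, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

duplicate_email_ok = try_insert("INSERT INTO customers VALUES (2, 'a@example.com', 'Bob')")
orphan_order_ok = try_insert("INSERT INTO orders VALUES (11, 99, 100)")
negative_total_ok = try_insert("INSERT INTO orders VALUES (12, 1, -5)")

# Column 1 of each PRAGMA index_list row is the index name.
index_names = [row[1] for row in conn.execute("PRAGMA index_list('orders')")]
print(duplicate_email_ok, orphan_order_ok, negative_total_ok, index_names)
```

All three invalid inserts are rejected by the database itself rather than by application code, which is the point of enforcing integrity at the database level.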
Definition
The database design process involves several stages aimed at creating an efficient, secure, and
scalable database system that meets the requirements of the organization or application.
In this diagram, we have entities such as Student, Course, and College, with attributes and
relationships defined between them.
Introduction to Functional Dependencies
Definition
Functional dependencies are a key concept in database theory, describing
the relationship between attributes in a relation. They help ensure data
integrity and guide the process of normalization.
Importance
Functional dependencies aid in the process of data modeling, where
designers define the structure and relationships of data entities in a
database. By identifying and modeling functional dependencies, designers
can create accurate and efficient database schemas.
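A functional dependency X -> Y holds when any two rows that agree on X also agree on Y. The sketch below tests that condition on sample data; the function name fd_holds and the enrollment rows are hypothetical. Note that such a test can refute a dependency but cannot prove it holds for all future data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows: list of dicts (one per tuple); lhs, rhs: lists of attribute names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The same left-hand side value must always map to the same right-hand side.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical sample data.
enrollments = [
    {"StudentID": 1, "StudentName": "Amina", "Course": "Databases"},
    {"StudentID": 1, "StudentName": "Amina", "Course": "Networks"},
    {"StudentID": 2, "StudentName": "Paul", "Course": "Databases"},
]

print(fd_holds(enrollments, ["StudentID"], ["StudentName"]))  # True
print(fd_holds(enrollments, ["StudentID"], ["Course"]))       # False
```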
Definition
Database normalization is a process used to organize a database schema so
that data redundancy and undesirable dependencies (such as partial and
transitive dependencies) are reduced.
There are several normal forms defined in the process of database normalization,
each building upon the previous one:
1 First Normal Form (1NF): Ensures that each column contains atomic
values, eliminating repeating groups.
2 Second Normal Form (2NF): Meets the requirements of 1NF and ensures
that non-key attributes are fully dependent on the primary key.
3 Third Normal Form (3NF): Meets the requirements of 2NF and eliminates
transitive dependencies between non-key attributes.
4 Boyce-Codd Normal Form (BCNF): A stricter version of 3NF, ensuring
that every determinant is a candidate key.
5 Fourth Normal Form (4NF): Addresses multi-valued dependencies and
further reduces redundancy.
6 Fifth Normal Form (5NF): Eliminates join dependencies by decomposing
relation schemas.
This schema violates 1NF because the Item column contains multiple
values. By normalizing the schema and separating items into a separate
table, we can eliminate redundancy and achieve higher levels of
normalization.
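As a rough illustration of the 1NF fix, the sketch below splits a multi-valued Item field into one row per item. The sample rows and the comma-separated encoding are assumptions made for the example.

```python
# Hypothetical unnormalized rows: the Item column packs several values into one field.
raw_orders = [
    {"OrderID": 1, "Item": "Laptop, Mouse"},
    {"OrderID": 2, "Item": "Phone"},
]

def to_first_normal_form(rows):
    """Emit one (OrderID, Item) row per item so that every value is atomic."""
    result = []
    for row in rows:
        for item in row["Item"].split(","):
            result.append({"OrderID": row["OrderID"], "Item": item.strip()})
    return result

order_items = to_first_normal_form(raw_orders)
for row in order_items:
    print(row)
```

In the resulting table the pair (OrderID, Item) identifies a row, and each field holds exactly one value.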
In this table:
(OrderID, CustomerID) is the composite primary key.
Product is a non-key attribute.
This table violates the Second Normal Form (2NF) because:
Product is functionally dependent on only part of the primary key
(OrderID), not on the entire primary key (OrderID, CustomerID).
To bring this table into 2NF, we need to split it into two separate tables:
Orders and Products, with a foreign key relationship between them.
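The split can be sketched as two projections of the original table, followed by a check that joining them on OrderID gives back the original rows. The rows use the order data from this example; the variable names are illustrative.

```python
# Rows of the original table: (OrderID, CustomerID, Product).
order_rows = [
    (1, 101, "Laptop"),
    (2, 101, "Phone"),
    (3, 102, "Tablet"),
]

# Decompose by projection; sets remove any duplicate rows.
orders = {(order_id, customer_id) for order_id, customer_id, _ in order_rows}
products = {(order_id, product) for order_id, _, product in order_rows}

# A natural join on OrderID rebuilds the original rows if no information was lost.
rejoined = {
    (o_id, customer_id, product)
    for o_id, customer_id in orders
    for p_id, product in products
    if o_id == p_id
}
lossless = rejoined == set(order_rows)

print(sorted(orders))
print(sorted(products))
print(lossless)  # True
```

OrderID in the Products table plays the role of the foreign key that links the two tables.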
Example of Third Normal Form (3NF)
Let’s continue with our previous example of a database table representing orders. After applying
Second Normal Form (2NF), we have two separate tables: Orders and Products.
Orders:
OrderID CustomerID
1 101
2 101
3 102
Products:
OrderID Product
1 Laptop
2 Phone
3 Tablet
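However, the Orders table can still contain transitive dependencies. A transitive dependency
exists when a non-key attribute depends on the primary key only through another non-key
attribute. For example, if Orders also stored customer details, those details would depend on
CustomerID, which in turn depends on OrderID.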
To achieve Third Normal Form (3NF), we need to further decompose the Orders table into two
separate tables: Orders and Customers, ensuring that each table represents a single entity and
eliminates transitive dependencies.
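To make the decomposition concrete, suppose a customer name is recorded as well. CustomerName and the sample names are hypothetical and do not appear in the tables above. Once customers have their own table, a customer's details are stored once, however many orders refer to them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (
    CustomerID   INTEGER PRIMARY KEY,
    CustomerName TEXT NOT NULL          -- hypothetical attribute
);
CREATE TABLE Orders (
    OrderID    INTEGER PRIMARY KEY,
    CustomerID INTEGER NOT NULL REFERENCES Customers(CustomerID)
);
INSERT INTO Customers VALUES (101, 'Alice'), (102, 'Bob');
INSERT INTO Orders VALUES (1, 101), (2, 101), (3, 102);
""")

# Renaming a customer touches exactly one row, although two orders refer to it.
cur = conn.execute(
    "UPDATE Customers SET CustomerName = 'Alice B.' WHERE CustomerID = 101")
rows_changed = cur.rowcount

names_by_order = conn.execute("""
    SELECT o.OrderID, c.CustomerName
    FROM Orders AS o JOIN Customers AS c ON c.CustomerID = o.CustomerID
    ORDER BY o.OrderID
""").fetchall()
print(rows_changed, names_by_order)
```

Had the name been stored on every order row, the same change would have required updating several rows, with the risk of leaving them inconsistent.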
Let’s continue with our example of a database table representing orders. After applying Third
Normal Form (3NF), we have three separate tables: Orders, Customers, and Products.
Orders:
OrderID ProductID
1 101
2 102
3 103
Products:
ProductID ProductName
101 Laptop
102 Phone
103 Tablet
The Orders table is now in Third Normal Form (3NF), but it still contains a dependency
between OrderID and ProductID. To achieve Boyce-Codd Normal Form (BCNF), we need to
decompose the Orders table further.
We can create a new table, OrderDetails, to represent the relationship between orders and
products, with OrderID and ProductID as its primary key. This ensures that the determinant of
every non-trivial functional dependency is a candidate key.
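The BCNF condition (every determinant is a candidate key) can be checked mechanically with an attribute closure. The sketch below is a simplification: it inspects only the dependencies that are listed explicitly, and the assumed dependency ProductID -> ProductName is an illustration based on the Products table.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs.

    fds is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(relation, fds):
    """List the non-trivial listed FDs whose left-hand side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not set(relation) <= closure(lhs, fds)
    ]

# Assumed dependency: a product identifier determines the product name.
fds = [({"ProductID"}, {"ProductName"})]

# One wide table mixing order lines and product names violates BCNF ...
wide = bcnf_violations({"OrderID", "ProductID", "ProductName"}, fds)
# ... while the two decomposed tables do not.
products_ok = bcnf_violations({"ProductID", "ProductName"}, fds)
details_ok = bcnf_violations({"OrderID", "ProductID"}, [])
print(wide, products_ok, details_ok)
```

In the wide table ProductID determines ProductName without being a key, so the dependency is reported. After decomposition ProductID is the key of Products, and OrderDetails has no non-trivial dependency left.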
Let’s consider a database table representing orders and products, which is already in
Boyce-Codd Normal Form (BCNF):
Orders:
OrderID ProductID
1 101
2 102
3 103
Products:
ProductID ProductName
101 Laptop
102 Phone
103 Tablet
However, there might still be multi-valued dependencies present. For example, a single order can
contain multiple products, and a single product can appear in multiple orders. Such multi-valued
facts cause redundancy when they are stored in the same table as other, independent facts about
the same key.
To achieve Fourth Normal Form (4NF), we need to further decompose the table to remove
multi-valued dependencies. We can create a new table, OrderItems, to represent the relationship
between orders and products, with OrderID and ProductID as its primary key.
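A minimal sketch of such an OrderItems table, using sqlite3. The rows are hypothetical and extend the example so that order 1 contains two products and product 102 appears in two orders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE OrderItems (
    OrderID   INTEGER NOT NULL,
    ProductID INTEGER NOT NULL,
    PRIMARY KEY (OrderID, ProductID)   -- each order/product pair appears once
);
INSERT INTO OrderItems VALUES (1, 101), (1, 102), (2, 102), (3, 103);
""")

products_in_order_1 = [r[0] for r in conn.execute(
    "SELECT ProductID FROM OrderItems WHERE OrderID = 1 ORDER BY ProductID")]
orders_with_product_102 = [r[0] for r in conn.execute(
    "SELECT OrderID FROM OrderItems WHERE ProductID = 102 ORDER BY OrderID")]

try:
    conn.execute("INSERT INTO OrderItems VALUES (1, 101)")  # duplicate pair
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(products_in_order_1, orders_with_product_102, duplicate_rejected)
```

Because the table holds nothing but the order/product pairs, both directions of the relationship can be queried without repeating any other data.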
Definition
Database denormalization is a process used to improve the performance of
a database by adding redundancy to the data model.
Genres:
BookID Genre
1 Fiction
2 Fiction
In this schema, the books are stored in a Books table with BookID, Title, and
Author columns. Additionally, there is a Genres table with BookID and Genre
columns.
To simplify queries and improve performance, we may choose to denormalize the
schema by adding a Genre column directly to the Books table. This introduces
redundancy but can optimize read performance by eliminating the need for joins.
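The trade-off can be sketched with sqlite3 as follows. The titles and authors are placeholder values, and BooksDenorm is a hypothetical name for the denormalized table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized: the genre lives in its own table.
CREATE TABLE Books  (BookID INTEGER PRIMARY KEY, Title TEXT, Author TEXT);
CREATE TABLE Genres (BookID INTEGER REFERENCES Books(BookID), Genre TEXT);
INSERT INTO Books  VALUES (1, 'Book One', 'Author A'), (2, 'Book Two', 'Author B');
INSERT INTO Genres VALUES (1, 'Fiction'), (2, 'Fiction');

-- Denormalized: the genre is copied onto each book row.
CREATE TABLE BooksDenorm (
    BookID INTEGER PRIMARY KEY, Title TEXT, Author TEXT, Genre TEXT
);
INSERT INTO BooksDenorm
    SELECT b.BookID, b.Title, b.Author, g.Genre
    FROM Books AS b JOIN Genres AS g ON g.BookID = b.BookID;
""")

# The normalized schema needs a join to list titles with their genre ...
with_join = conn.execute("""
    SELECT b.Title, g.Genre
    FROM Books AS b JOIN Genres AS g ON g.BookID = b.BookID
    ORDER BY b.BookID
""").fetchall()

# ... while the denormalized table answers the same question from one table.
without_join = conn.execute(
    "SELECT Title, Genre FROM BooksDenorm ORDER BY BookID").fetchall()

print(with_join == without_join)  # True
```

The cost is that the genre now exists in two places, so every change must update both copies or the copies drift apart.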
Definition
A Logical Data Model (LDM) is a representation of the data elements and
their relationships in a database, independent of any specific database
management system or physical implementation.
For entities: Any entity becomes a table, the properties of the entity
are the attributes of the table, the identifier of the entity is the
primary key of the table.
For relationships: That depends on the cardinalities. Two cases are
possible:
one-to-one or one-to-many: The relationship is materialised by the
addition of a foreign key.
many-to-many: The relationship is transformed into a new table.
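These mapping rules can be sketched with the Student, Course, and College entities. The cardinalities used here (a college has many students; students and courses are many-to-many) and the Enrollment table name are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Entities become tables; identifiers become primary keys.
CREATE TABLE College (college_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE Course  (course_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);

-- One-to-many (a college has many students): foreign key on the "many" side.
CREATE TABLE Student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    college_id INTEGER NOT NULL REFERENCES College(college_id)
);

-- Many-to-many (students enroll in courses): the relationship becomes a table.
CREATE TABLE Enrollment (
    student_id INTEGER NOT NULL REFERENCES Student(student_id),
    course_id  INTEGER NOT NULL REFERENCES Course(course_id),
    PRIMARY KEY (student_id, course_id)
);
""")

tables = sorted(r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"))
# Column 2 of each PRAGMA foreign_key_list row is the referenced table.
student_refs = [r[2] for r in conn.execute("PRAGMA foreign_key_list('Student')")]
enrollment_refs = sorted(
    r[2] for r in conn.execute("PRAGMA foreign_key_list('Enrollment')"))
print(tables, student_refs, enrollment_refs)
```

The one-to-many relationship adds only a column to Student, whereas the many-to-many relationship needs its own table whose primary key combines the two foreign keys.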
Key Takeaways
Database design is a critical aspect of building a robust and efficient
database system. By following principles such as entity-relationship
modeling, normalization, and denormalization, you can create a
well-structured database that meets the data requirements of your
application while optimizing performance and ensuring data integrity.
Entity-Relationship (ER) modeling is a powerful technique for designing and
visualizing the structure of a database. By defining entities, attributes, and
relationships, designers can create clear and concise representations of the
database schema.
Integrity rules and constraints are essential components of database
management systems, ensuring the accuracy, consistency, and reliability of
the data stored within them. By enforcing entity integrity, referential
integrity, domain integrity, and user-defined integrity, databases can
maintain data quality and prevent data corruption.
Key Takeaways
Database normalization is a critical process in database design, aimed at organizing the
database schema to reduce redundancy and dependency of data. By achieving higher
levels of normalization, databases can improve data integrity, minimize anomalies, and
optimize performance.
Functional dependencies are a fundamental concept in database theory, describing the
relationship between attributes in a relation. They help ensure data integrity, guide the
process of normalization, and facilitate efficient database design.
Database denormalization is a powerful technique for improving the performance of a
database by introducing redundancy into the data model. By carefully balancing the
advantages and disadvantages of denormalization and following best practices, database
administrators can optimize query performance and enhance the user experience of
applications.
The logical data model and database schema are fundamental components of database
design and management, providing a structured representation of the data and its
relationships.