Relational Model (Unit-2)
1. Relation (Table):
2. Tuple (Row):
3. Attribute (Column):
Each attribute has a specific data type (e.g., integer, text, date).
4. Domain:
Example: The "Age" column might have a domain of integers between 1 and 120.
7. Constraints:
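The "Age" domain above can be approximated in a real system with a column type plus a CHECK constraint. The following is a minimal sketch using Python's built-in sqlite3 module; the Person table and its sample rows are hypothetical and serve only as illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: the CHECK constraint restricts Age to the domain 1..120.
conn.execute("""
    CREATE TABLE Person (
        Name TEXT NOT NULL,
        Age  INTEGER CHECK (Age BETWEEN 1 AND 120)
    )""")

conn.execute("INSERT INTO Person VALUES ('Alex', 20)")  # inside the domain

try:
    conn.execute("INSERT INTO Person VALUES ('Sam', 150)")  # outside the domain
    out_of_domain_rejected = False
except sqlite3.IntegrityError:
    out_of_domain_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Person").fetchone()[0]
print(out_of_domain_rejected, row_count)
```

Only the first row is stored; the second insert is refused by the database itself rather than by application code.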
******
E. F. Codd’s 12 Rules for the Relational Model
E. F. Codd, the father of relational databases, defined 12 rules (actually 13, including Rule 0) to determine whether a
database management system (DBMS) is truly relational. These rules ensure data integrity, consistency, and structure in
relational databases.
Rule 0: The Foundation Rule
A system must be relational, meaning it must manage data entirely using relational principles (tables, rows, and columns).
Rule 1: The Information Rule
All data must be stored in tables (relations), with rows and columns.
Rule 2: The Guaranteed Access Rule
Every data value must be accessible using a combination of table name, primary key, and column name.
Rule 3: Systematic Treatment of NULL Values
The database must support NULL values to represent missing, unknown, or not applicable data.
Rule 4: Active Online Catalog
The system must store metadata (schema, tables, constraints, etc.) in tables, just like actual data.
✅ Example: Information about tables and columns should be queryable using SQL.
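As one illustration of a queryable catalog, SQLite keeps its schema in a table named sqlite_master. The sketch below uses Python's built-in sqlite3 module; the Student table is a hypothetical example, and other database systems expose their catalogs under different names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (Student_ID INTEGER PRIMARY KEY, Name TEXT)")

# The catalog is itself a table, so an ordinary SELECT can read it.
table_names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
column_names = [row[1] for row in conn.execute("PRAGMA table_info(Student)")]

print(table_names, column_names)
```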
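Rule 5: Comprehensive Data Sub-Language Rule
A relational database must support a full data language like SQL, which can define data and views, manipulate data,
enforce integrity constraints, control authorization, and manage transactions.
✅ Example: In SQL, CREATE TABLE, SELECT, GRANT, and COMMIT are all statements of the same language.
Rule 6: View Updating Rule
Views (virtual tables created using SELECT) must be updatable if logically possible.
✅ Example: If a VIEW is created for active students, updates made through the view should be reflected in the base table.
Rule 7: High-Level Insert, Update, and Delete
The system must support operations on sets of rows, not just one at a time.
Rule 8: Physical Data Independence
Changes to how data is stored (physical structure) should not affect applications.
✅ Example: If a table is moved to a different disk, applications should work without modification.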
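Rule 9: Logical Data Independence
Changes to the logical structure (such as adding tables or columns) should not affect applications that do not use the
changed parts.
✅ Example: If a new column (Email) is added to a Student table, old queries should still work.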
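The Email example can be tried out directly. This is a small sketch with Python's built-in sqlite3 module; the sample row is hypothetical, and the point is only that a query written before the change returns the same result after it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (Student_ID INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("INSERT INTO Student (Student_ID, Name) VALUES (101, 'Alex')")

# A query written before the schema change.
old_query = "SELECT Student_ID, Name FROM Student"
before = conn.execute(old_query).fetchall()

# Logical change: add a new column to the table.
conn.execute("ALTER TABLE Student ADD COLUMN Email TEXT")

after = conn.execute(old_query).fetchall()
print(before == after)
```

The old query names its columns explicitly. A query written as SELECT * would start returning the extra column, which is one reason explicit column lists are usually preferred in application code.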
Rule 10: Integrity Independence
Integrity constraints (such as primary key, foreign key, and check constraints) must be stored in the database and not in
applications.
Rule 11: Distribution Independence
A relational database should work the same way whether data is stored locally or across multiple locations (distributed
databases).
✅ Example: A bank database should work the same whether accounts are stored in one branch or multiple branches.
Rule 12: Non-Subversion Rule
If a low-level access method (like file manipulation) exists, it must not bypass relational security and integrity rules.
✅ Example: Users should not be able to modify database files directly and thereby bypass constraints.
******
Relational Algebra
Relational Algebra is a fundamental concept in databases used to query and manipulate relational data. Relational
operators in relational algebra can be classified into Traditional and Special operators.
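Traditional (Set) Operators:
1. Union (∪) – Combines tuples from two relations (tables), removing duplicates.
2. Intersection (∩) – Returns tuples that appear in both relations.
3. Set Difference (-) – Returns tuples in one relation but not in another.
4. Cartesian Product (×) – Combines every tuple of one relation with every tuple of another.
Special (Relational) Operators:
1. Selection (σ) – Returns the tuples of a relation that satisfy a condition.
2. Projection (π) – Returns only the named attributes of a relation, removing duplicate tuples.
Example: π Name, Age (Employee) → Selects only the Name and Age columns.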
3. Join (⨝) – Combines related tuples from two relations based on a condition.
Types of Joins:
4. Division (÷) – Used for queries like "Find those who have all required attributes."
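The operators above can be sketched in a few lines of Python by treating a relation as a set of rows. This is an illustrative model, not how a DBMS implements them: the helper names (relation, select, project, natural_join, divide) and the sample data are assumptions made for this example, and divide assumes the divisor is non-empty. Union, intersection, and set difference map directly onto Python's set operators |, & and -.

```python
def relation(rows):
    """Build a relation: a set of rows, each row a frozenset of (attribute, value) pairs."""
    return {frozenset(row.items()) for row in rows}

def select(rel, predicate):
    """Selection (sigma): keep the rows that satisfy the predicate."""
    return {row for row in rel if predicate(dict(row))}

def project(rel, attrs):
    """Projection (pi): keep only the named attributes; duplicates vanish because the result is a set."""
    return {frozenset((a, v) for a, v in row if a in attrs) for row in rel}

def natural_join(r, s):
    """Natural join: combine rows that agree on every shared attribute.
    With no shared attributes this degenerates to the Cartesian product."""
    joined = set()
    for x in r:
        for y in s:
            dx, dy = dict(x), dict(y)
            if all(dx[a] == dy[a] for a in dx.keys() & dy.keys()):
                joined.add(frozenset({**dx, **dy}.items()))
    return joined

def divide(r, s):
    """Division: rows of r (minus s's attributes) that are paired with every row of s."""
    s_attrs = {a for row in s for a, _ in row}
    candidates = {frozenset((a, v) for a, v in row if a not in s_attrs) for row in r}
    return {c for c in candidates if all(c | t in r for t in s)}

# Hypothetical sample data.
employee = relation([
    {"Name": "Asha", "Age": 31, "Dept": "D01"},
    {"Name": "Ravi", "Age": 24, "Dept": "D02"},
])
enrolled = relation([
    {"Student": 101, "Course": "C01"},
    {"Student": 101, "Course": "C02"},
    {"Student": 102, "Course": "C01"},
])
required = relation([{"Course": "C01"}, {"Course": "C02"}])

# pi Name, Age (sigma Age > 30 (Employee))
older = project(select(employee, lambda t: t["Age"] > 30), {"Name", "Age"})

# Students enrolled in all required courses.
completed_all = divide(enrolled, required)
```

Here older contains only Asha's Name and Age, and completed_all contains only student 101, the one student paired with both required courses.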
******
Relational Calculus
Relational Calculus is a non-procedural query language in relational databases, meaning it focuses on what to retrieve
rather than how to retrieve it (unlike Relational Algebra, which specifies a step-by-step process). Queries in relational
calculus are expressed as logical formulas, describing the properties of data to be retrieved.
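1. Tuple Relational Calculus (TRC)
Uses tuple variables that represent entire rows (tuples) of a relation.
General format:
{ t | P(t) }
where t is a tuple variable and P(t) is a condition (predicate) that t must satisfy.
Example (illustrative, assuming an Employee relation with Name and Age columns):
{ t | Employee(t) ∧ t.Age > 30 } → Returns the Employee tuples whose Age is greater than 30.
2. Domain Relational Calculus (DRC)
Uses domain variables that represent column values (attributes) rather than entire tuples.
General format:
{ <x1, x2, ..., xn> | P(x1, x2, ..., xn) }
Example (illustrative, assuming Employee has only the columns Name and Age):
{ <n, a> | Employee(n, a) ∧ a > 30 } → Returns the Name and Age values of employees older than 30.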
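Python set comprehensions read much like calculus expressions, so they make a convenient way to experiment with the idea of stating what is wanted rather than how to compute it. The sketch below assumes a hypothetical Employee relation with just Name and Age and made-up rows.

```python
from collections import namedtuple

# Hypothetical relation Employee(Name, Age).
Employee = namedtuple("Employee", ["Name", "Age"])
employees = {Employee("Asha", 31), Employee("Ravi", 24), Employee("Meena", 45)}

# Tuple-calculus style: { t | Employee(t) ∧ t.Age > 30 }
# The variable t ranges over whole tuples.
trc_result = {t for t in employees if t.Age > 30}

# Domain-calculus style: { <n> | ∃a (Employee(n, a) ∧ a > 30) }
# The variables n and a range over individual column values.
drc_result = {n for (n, a) in employees if a > 30}
```

Both comprehensions describe the same condition; the first returns whole tuples, the second returns only the names.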
******
Relational Algebra vs. Relational Calculus
| Relational Algebra | Relational Calculus |
|--------------------|---------------------|
| A procedural language in which the user specifies the sequence of steps (procedures) to obtain information from the database. | A non-procedural language in which the user specifies what is to be retrieved from the database rather than how to retrieve it. |
| Specifies operations performed on existing relations to obtain new relations. | Expresses queries directly over the relations in the form of logical formulas. |
| Queries are domain independent. | Queries can be domain dependent. |
| The evaluation of a query depends on the order of operations. | The evaluation of a query does not depend on the order of operations. |
******
In the relational model of databases, keys are crucial for uniquely identifying records (tuples) in a table (relation),
preventing duplicate data, and establishing relationships between tables. There are several types of keys, each serving a
specific purpose:
1. Super Key
A super key is any set of attributes that uniquely identifies a tuple in a relation. A relation can have multiple super keys.
2. Candidate Key
A candidate key is a minimal super key, meaning no attribute can be removed from it without losing uniqueness. A relation
can have multiple candidate keys.
Example: If {Student_ID, Name} is a super key but Student_ID alone is also unique, then {Student_ID} is a candidate key.
3. Primary Key
A primary key is a candidate key selected to uniquely identify tuples in a relation. It must be unique and not null.
4. Foreign Key
A foreign key is an attribute (or set of attributes) in one table that refers to the primary key of another table. It establishes
relationships between tables.
Example: If a Courses table has an attribute Student_ID that references the Students table, then Student_ID is a foreign
key.
5. Composite Key
A composite key consists of multiple attributes that together uniquely identify a tuple.
6. Unique Key
A unique key ensures that values in a column (or set of columns) are unique across all rows, similar to a primary key but
allowing NULL values.
7. Alternate Key
An alternate key is any candidate key that is not chosen as the primary key.
Example: If a Students table has {Student_ID, Email} as candidate keys and Student_ID is chosen as the primary key, then
Email is an alternate key.
8. Surrogate Key
A surrogate key is an artificial key, usually an auto-generated number, used when no natural primary key exists or when the
natural key is unsuitable.
Example: A database may use an auto-incrementing User_ID instead of relying on SSN or Email.
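Several of the key types above can be declared and tested in one small schema. The following is a sketch using Python's built-in sqlite3 module; the Enrollments table, the rejected helper, and all sample values are hypothetical names chosen for this example. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when this is on

conn.execute("""
    CREATE TABLE Students (
        Student_ID INTEGER PRIMARY KEY,   -- primary key
        Email      TEXT UNIQUE,           -- alternate key, declared as a unique key
        Name       TEXT NOT NULL
    )""")
conn.execute("""
    CREATE TABLE Enrollments (
        Enrollment_ID INTEGER PRIMARY KEY AUTOINCREMENT,                 -- surrogate key
        Student_ID    INTEGER NOT NULL REFERENCES Students(Student_ID),  -- foreign key
        Course_ID     TEXT NOT NULL,
        UNIQUE (Student_ID, Course_ID)                                   -- composite key
    )""")

conn.execute("INSERT INTO Students VALUES (101, 'alex@example.com', 'Alex')")
conn.execute("INSERT INTO Enrollments (Student_ID, Course_ID) VALUES (101, 'C01')")

def rejected(sql):
    """Return True if the database refuses the statement because of a key constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

duplicate_pk = rejected("INSERT INTO Students VALUES (101, 'other@example.com', 'Sam')")
duplicate_email = rejected("INSERT INTO Students VALUES (102, 'alex@example.com', 'Sam')")
dangling_fk = rejected("INSERT INTO Enrollments (Student_ID, Course_ID) VALUES (999, 'C01')")
duplicate_pair = rejected("INSERT INTO Enrollments (Student_ID, Course_ID) VALUES (101, 'C01')")
print(duplicate_pk, duplicate_email, dangling_fk, duplicate_pair)
```

Each of the four statements breaks a different key rule (primary, unique, foreign, composite), and each is refused by the database.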
******
What is Normalization
Normalization is the process of organizing data in a database to reduce redundancy (duplicate data) and improve data
integrity. It involves dividing large tables into smaller ones and establishing relationships between them using keys.
Normalization follows a series of rules called normal forms (NF). Each normal form builds on the previous one.
1. First Normal Form (1NF) – Remove Duplicate Columns & Ensure Atomicity
2. Second Normal Form (2NF) – Remove Partial Dependency
3. Third Normal Form (3NF) – Remove Transitive Dependency
******
Advantages and Disadvantages of Normalization
Advantages of Normalization
Disadvantages of Normalization
1. Increased Complexity
Splitting data across many tables can make reporting more complex.
******
First, Second, and Third Normal Forms (1NF, 2NF, 3NF) with Examples
Normalization is a process of organizing data in a database to eliminate redundancy and improve data integrity. It follows
different normal forms, where each form builds upon the previous one.
First Normal Form (1NF) – Remove Duplicate Columns & Ensure Atomicity
Rules:
1. Each column should have atomic (indivisible) values (no multiple values in one cell).
|-----------|------|------------|
|-----------|------|---------|
🔹 Now, the table is in 1NF because each column has atomic values, and there are no repeating groups.
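Second Normal Form (2NF) – Remove Partial Dependency
Rules:
1. The table must already be in 1NF.
2. All non-key columns must depend on the entire primary key, not just a part of it.
| StudentID | CourseID | StudentName | CourseName |
|-----------|----------|------------|-----------|
| 101 | C01 | Alex | Math |
| 101 | C02 | Alex | Science |
Problem: The primary key is (StudentID, CourseID), but "StudentName" depends only on "StudentID", and "CourseName" depends
only on "CourseID".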
Student Table
| StudentID | StudentName |
|-----------|------------|
| 101 | Alex |
Course Table
| CourseID | CourseName |
|----------|-----------|
| C01 | Math |
| C02 | Science |
Enrollment Table
| StudentID | CourseID |
|-----------|----------|
| 101 | C01 |
| 101 | C02 |
🔹 Now, the table is in 2NF because all non-key attributes depend on the whole primary key, not just part of it.
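A useful sanity check on a decomposition is that joining the smaller tables gives back the original rows, so no information was lost. The sketch below loads the Student, Course, and StudentID/CourseID rows shown above into SQLite through Python's built-in sqlite3 module; the name Enrollment for the third table and the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (StudentID INTEGER PRIMARY KEY, StudentName TEXT);
    CREATE TABLE Course (CourseID TEXT PRIMARY KEY, CourseName TEXT);
    CREATE TABLE Enrollment (
        StudentID INTEGER REFERENCES Student(StudentID),
        CourseID  TEXT REFERENCES Course(CourseID),
        PRIMARY KEY (StudentID, CourseID)
    );
    INSERT INTO Student VALUES (101, 'Alex');
    INSERT INTO Course VALUES ('C01', 'Math'), ('C02', 'Science');
    INSERT INTO Enrollment VALUES (101, 'C01'), (101, 'C02');
""")

# Joining the three tables rebuilds the wide, unnormalized rows.
rebuilt = conn.execute("""
    SELECT e.StudentID, e.CourseID, s.StudentName, c.CourseName
    FROM Enrollment AS e
    JOIN Student AS s ON s.StudentID = e.StudentID
    JOIN Course  AS c ON c.CourseID  = e.CourseID
    ORDER BY e.StudentID, e.CourseID
""").fetchall()
print(rebuilt)
# → [(101, 'C01', 'Alex', 'Math'), (101, 'C02', 'Alex', 'Science')]
```

Each student name and course name is now stored once, yet the combined view is still available whenever a query needs it.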
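Third Normal Form (3NF) – Remove Transitive Dependency
Rules:
1. The table must already be in 2NF.
2. No non-key column may depend on another non-key column (no transitive dependency).
| StudentID | StudentName | DepartmentID | DepartmentName |
|-----------|------------|--------------|---------------|
Problem: "DepartmentName" depends on "DepartmentID", which in turn depends on "StudentID" (a transitive dependency).
Student Table
| StudentID | StudentName | DepartmentID |
|-----------|------------|--------------|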
Department Table
| DepartmentID | DepartmentName |
|-------------|---------------|
| D01 | Science |
| D02 | Arts |
🔹 Now, the table is in 3NF because there are no transitive dependencies. Each column depends only on the primary key.
******
Boyce-Codd Normal Form (BCNF) – A Stricter 3NF
BCNF is an enhanced version of 3NF that removes anomalies not handled by 3NF.
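Rules of BCNF
1. The table must already be in 3NF.
2. For every functional dependency X → Y, X must be a super key.
| StudentID | CourseID | Instructor |
|-----------|----------|------------|
Functional Dependencies:
(StudentID, CourseID) → Instructor
CourseID → Instructor (Problem: CourseID alone determines Instructor, but CourseID is not a super key!)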
Course Table
| CourseID | Instructor |
|----------|------------|
Enrollment Table
| StudentID | CourseID |
|-----------|----------|
| 101 | C01 |
| 102 | C01 |
| 103 | C02 |
🔹 Now, the table is in BCNF because every determinant is a super key, preventing anomalies.
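The rule "every determinant is a super key" can be checked mechanically with the attribute-closure technique. The sketch below is a minimal illustration: the function names closure and bcnf_violations are hypothetical, and the dependency (StudentID, CourseID) → Instructor is assumed here to stand for the composite key of the original table.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """List the non-trivial dependencies whose left side is not a super key."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not rhs <= lhs and closure(lhs, fds) != attributes]

# Original table: (StudentID, CourseID, Instructor)
original_attrs = {"StudentID", "CourseID", "Instructor"}
original_fds = [
    (frozenset({"StudentID", "CourseID"}), frozenset({"Instructor"})),  # assumed key dependency
    (frozenset({"CourseID"}), frozenset({"Instructor"})),
]
original_problems = bcnf_violations(original_attrs, original_fds)

# After decomposition: Course(CourseID, Instructor) and Enrollment(StudentID, CourseID)
course_problems = bcnf_violations(
    {"CourseID", "Instructor"},
    [(frozenset({"CourseID"}), frozenset({"Instructor"}))],
)
enrollment_problems = bcnf_violations({"StudentID", "CourseID"}, [])
```

For the original table the only violation reported is CourseID → Instructor, because the closure of {CourseID} does not include StudentID. Both decomposed tables report no violations. A complete check of a decomposition would also project the dependencies onto each new table; here they are written out by hand.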
******
What do you mean by functional dependency? Explain with suitable example.
Functional dependency is a relationship between attributes (columns) in a relational database. It defines how one attribute
determines another within a table.
Definition
If A → B, it means that for each value of A there is exactly one corresponding value of B.
|-----------|------|--------|
🚫 Incorrect Dependency:
Name → StudentID is not valid because two students might have the same name.
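Whether a dependency is contradicted by the data in a table can be tested with a short function. The sketch below is illustrative: the function name holds and the sample rows are hypothetical, chosen so that two students share the same name.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it afterwards.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical rows: two different students are both named Alex.
students = [
    {"StudentID": 101, "Name": "Alex"},
    {"StudentID": 102, "Name": "Sam"},
    {"StudentID": 103, "Name": "Alex"},
]

id_determines_name = holds(students, ["StudentID"], ["Name"])
name_determines_id = holds(students, ["Name"], ["StudentID"])
print(id_determines_name, name_determines_id)
# → True False
```

A check like this can only show that a dependency is violated by the current rows. Whether a dependency is truly valid is a statement about the meaning of the data, so it has to come from the design, not from a sample.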
1. Full Functional Dependency
A functional dependency is fully functional when the dependent attribute depends on the entire primary key and not on any
part of it.
2. Partial Dependency
A partial dependency happens when a non-key attribute depends on part of a composite primary key instead of the whole
key.
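3. Transitive Dependency
A transitive dependency happens when a non-key attribute depends on another non-key attribute, which in turn depends on the
primary key (if A → B and B → C, then A → C).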
******