Relational Model (Unit-2)

The document provides an overview of the Relational Data Model (RDM), introduced by E.F. Codd, which organizes data into tables (relations) with unique rows and columns. It outlines key components such as relations, tuples, attributes, and keys, along with Codd's 12 rules for ensuring data integrity in relational databases. Additionally, it covers normalization processes to reduce redundancy and improve data integrity, detailing the advantages and disadvantages of normalization and the different normal forms.

Uploaded by

ramaraji614

Unit-2

Relational Data Model


The Relational Data Model (RDM) is a way of organizing data into tables, called relations, where data is stored in rows and
columns. This model was introduced by E.F. Codd in 1970 and is widely used in Relational Database Management Systems
(RDBMS) like MySQL, PostgreSQL, and SQL Server.

Key Components of the Relational Data Model

1. Relation (Table):

A relation is a table in which every row is unique and every column has a distinct name.

Each table represents an entity (e.g., Employees, Products).

2. Tuple (Row):

A tuple is a single record in a table.

Each row contains values for different attributes.

3. Attribute (Column):

Attributes are the columns in a table, representing data fields.

Each attribute has a specific data type (e.g., integer, text, date).

4. Domain:

The domain defines the possible values for an attribute.

Example: The "Age" column might have a domain of integers between 1 and 120.

5. Primary Key (PK):

A unique identifier for each record in a table.

Example: "Student_ID" in a Student table.

6. Foreign Key (FK):

A column that links to the primary key of another table.

Used to establish relationships between tables.

7. Constraints:

Rules to maintain data integrity (e.g., NOT NULL, UNIQUE, CHECK).
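
The components listed above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a reference schema: the Department table, its columns, and the sample rows are assumptions made for the example, and SQLite only enforces foreign keys after the PRAGMA shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when this is on

# Hypothetical schema: each table is a relation, each column an attribute.
conn.executescript("""
CREATE TABLE Department (
    Dept_ID   INTEGER PRIMARY KEY,                      -- primary key
    Dept_Name TEXT NOT NULL UNIQUE                      -- constraints
);
CREATE TABLE Student (
    Student_ID INTEGER PRIMARY KEY,                     -- primary key
    Name       TEXT NOT NULL,
    Age        INTEGER CHECK (Age BETWEEN 1 AND 120),   -- domain of Age
    Dept_ID    INTEGER REFERENCES Department(Dept_ID)   -- foreign key
);
""")

conn.execute("INSERT INTO Department VALUES (1, 'Science')")
conn.execute("INSERT INTO Student VALUES (101, 'Alex', 20, 1)")  # one tuple


def try_insert(sql):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


bad_age = try_insert("INSERT INTO Student VALUES (102, 'Bob', 150, 1)")   # outside the Age domain
bad_fk = try_insert("INSERT INTO Student VALUES (103, 'Cara', 19, 99)")   # no Department 99
dup_pk = try_insert("INSERT INTO Student VALUES (101, 'Dan', 21, 1)")     # duplicate primary key
```

All three rejected inserts leave the Student table with its single valid row, which is the point of declaring the rules in the schema rather than in application code.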

******
E. F. Codd’s 12 Rules for the Relational Model
E. F. Codd, the father of relational databases, defined 12 rules (actually 13, including Rule 0) to determine whether a
database management system (DBMS) is truly relational. These rules ensure data integrity, consistency, and structure in
relational databases.

Rule 0: The Foundation Rule

A system must be relational, meaning it must manage data entirely using relational principles (tables, rows, and columns).

Rule 1: Information Rule

All data must be stored in tables (relations), with rows and columns.

Rule 2: Guaranteed Access Rule

Every data value must be accessible using a combination of table name, primary key, and column name.

Rule 3: Systematic Treatment of NULL Values

The database must support NULL values to represent missing, unknown, or not applicable data.

✅ Example: If a student hasn’t provided a phone number, it should be stored as NULL.

Rule 4: Dynamic Online Catalog (Data Dictionary)

The system must store metadata (schema, tables, constraints, etc.) in tables, just like actual data.

✅ Example: Information about tables and columns should be queryable using SQL.

SELECT * FROM INFORMATION_SCHEMA.TABLES;
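
INFORMATION_SCHEMA is available in systems such as MySQL, PostgreSQL, and SQL Server. As a hedged sketch of the same idea in a system that can be run from Python's standard library, SQLite keeps its catalog in a table named sqlite_master and exposes column metadata through PRAGMA table_info. The Student table here is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (Student_ID INTEGER PRIMARY KEY, Name TEXT NOT NULL)")

# The catalog is itself a table, so an ordinary SELECT works on it.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

# table_info returns one row per column: (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Student)")]
```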

Rule 5: Comprehensive Data Sublanguage Rule

A relational database must support a full data language like SQL, which can:

1. Define the structure (DDL - Data Definition Language).

2. Manipulate data (DML - Data Manipulation Language).

3. Query data (SELECT statements).

4. Control access (DCL - Data Control Language).

✅ Example:

SELECT * FROM Student WHERE Course = 'Math';

Rule 6: View Updating Rule

Views (virtual tables created using SELECT) must be updatable if logically possible.

✅ Example: If a VIEW is created for active students, updates should reflect in the base table.

Rule 7: High-Level Insert, Update, and Delete

The system must support operations on sets of rows, not just one at a time.

✅ Example: All students in a specific course can be updated at once with a single statement.
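
A minimal sketch of such a set-level update, run through Python's sqlite3 module. The Status column and the sample rows are hypothetical; only the idea of one statement changing many rows comes from the rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student (Student_ID INTEGER PRIMARY KEY, Name TEXT, Course TEXT, Status TEXT)")
conn.executemany("INSERT INTO Student VALUES (?, ?, ?, ?)", [
    (101, "Alex", "Math", "enrolled"),
    (102, "Bob", "Math", "enrolled"),
    (103, "Cara", "History", "enrolled"),
])

# One statement operates on the whole set of matching rows; no row-by-row loop is needed.
cursor = conn.execute("UPDATE Student SET Status = 'completed' WHERE Course = ?", ("Math",))
rows_changed = cursor.rowcount

statuses = conn.execute("SELECT Name, Status FROM Student ORDER BY Student_ID").fetchall()
```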

Rule 8: Physical Data Independence

Changes to how data is stored (physical structure) should not affect applications.

✅ Example: If a table is moved to a different disk, applications should work without modification.

Rule 9: Logical Data Independence

Changes to logical structure (schema modifications) should not affect applications.

✅ Example: If a new column (Email) is added to a Student table, old queries should still work.

Rule 10: Integrity Independence

Integrity constraints (such as primary key, foreign key, and check constraints) must be stored in the database and not in
applications.

Rule 11: Distribution Independence

A relational database should work the same way whether data is stored locally or across multiple locations (distributed
databases).

Example: A bank database should work the same whether accounts are stored in one branch or multiple branches.

Rule 12: Nonsubversion Rule

If a low-level access method (like file manipulation) exists, it must not bypass relational security and integrity rules.

Example: Users should not be able to modify database files directly in a way that bypasses constraints.

******
Relational Algebra
Relational Algebra is a fundamental concept in databases used to query and manipulate relational data. Relational
operators in relational algebra can be classified into Traditional and Special operators.

Types of Relational Operators

Relational operators can be broadly classified into two types:

Traditional Relational Operators

Special Relational Operators

Traditional Relational Operators:

1. Union (∪) – Combines tuples from two relations (tables), removing duplicates.

Example: A ∪ B → Includes all unique tuples from A and B.

2. Intersection (∩) – Returns common tuples between two relations.

Example: A ∩ B → Includes only tuples present in both A and B.

3. Set Difference (-) – Returns tuples in one relation but not in another.

Example: A - B → Includes tuples in A but not in B.

4. Cartesian Product (×) – Combines every tuple of one relation with every tuple of another.

Example: A × B → Forms all possible pairs of tuples from A and B.

5. Rename (ρ) – Renames a relation or its attributes.

Example: ρ(NewName, Employee) → Renames the Employee relation to NewName.
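
Because a relation is a set of tuples, the traditional operators map closely onto Python's set operations. The sketch below is illustrative only: the relations A and B, the schema names, and the rename helper are assumptions, and A and B are taken to have the same attributes so that union, intersection, and difference are meaningful.

```python
from itertools import product

# Two relations over the same attributes (id, name), modelled as sets of tuples.
A = {(1, "Alex"), (2, "Bob")}
B = {(2, "Bob"), (3, "Cara")}

union = A | B                                    # A ∪ B, duplicates removed automatically
intersection = A & B                             # A ∩ B
difference = A - B                               # A - B
cartesian = {a + b for a, b in product(A, B)}    # A × B, each result tuple has 4 values


def rename(attributes, mapping):
    """Rename attributes of a schema; names not in mapping are kept unchanged."""
    return tuple(mapping.get(name, name) for name in attributes)


schema = ("id", "name")
renamed = rename(schema, {"name": "full_name"})  # the tuples themselves do not change
```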


Special Relational Operators:

1. Selection (σ) – Filters rows (tuples) based on a condition.

Example: σ Age > 30 (Employee) → Selects employees older than 30.

2. Projection (π) – Extracts specific columns (attributes).

Example: π Name, Age (Employee) → Selects only the Name and Age columns.

3. Join (⨝) – Combines related tuples from two relations based on a condition.

Types of Joins:

Theta Join (⨝ θ) – Uses a general comparison condition θ (=, ≠, <, ≤, >, ≥), e.g., A ⨝ A.id = B.id B.

Equi-Join – A special case of Theta Join using only =.

Natural Join (⋈) – Joins on attributes with the same name.

4. Division (÷) – Used for queries like "Find those who have all required attributes."

Example: A ÷ B → Finds tuples in A that relate to all tuples in B.
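
The special operators can be sketched in a few lines of Python. This is one possible illustration under stated assumptions: rows are modelled as dictionaries, the Employee, Department, and Completed data are invented for the example, and division is shown only for the common two-column case.

```python
Employee = [
    {"Name": "Alex", "Age": 34, "Dept": "D01"},
    {"Name": "Bob", "Age": 28, "Dept": "D02"},
    {"Name": "Cara", "Age": 41, "Dept": "D01"},
]
Department = [
    {"Dept": "D01", "DeptName": "Science"},
    {"Dept": "D02", "DeptName": "Arts"},
]


def select(relation, predicate):
    """σ: keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """π: keep only the named attributes; the set removes duplicate tuples."""
    return {tuple(row[a] for a in attributes) for row in relation}


def natural_join(r, s):
    """Natural join: pair rows that agree on every attribute name the relations share."""
    common = set(r[0]) & set(s[0]) if r and s else set()
    return [{**x, **y} for x in r for y in s if all(x[a] == y[a] for a in common)]


def divide(pairs, required):
    """÷: the x values that are paired with every y in required."""
    xs = {x for x, _ in pairs}
    return {x for x in xs if all((x, y) in pairs for y in required)}


older = select(Employee, lambda row: row["Age"] > 30)   # σ Age > 30 (Employee)
names_ages = project(Employee, ["Name", "Age"])         # π Name, Age (Employee)
joined = natural_join(Employee, Department)             # joins on the shared Dept attribute

Completed = {("Alex", "Math"), ("Alex", "Science"), ("Bob", "Math")}
all_required = divide(Completed, {"Math", "Science"})   # who completed every required course
```

Only Alex is paired with both required courses, so the division returns Alex alone.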

******
Relational Calculus
Relational Calculus is a non-procedural query language in relational databases, meaning it focuses on what to retrieve
rather than how to retrieve it (unlike Relational Algebra, which specifies a step-by-step process). Queries in relational
calculus are expressed as logical formulas, describing the properties of data to be retrieved.

Types of Relational Calculus

1. Tuple Relational Calculus (TRC)

Uses tuple variables to represent rows in a relation.

Queries are expressed using predicate logic.

General format:

{ t | P(t) }

Example:

Retrieve employees who earn more than 50,000:

{ t | t ∈ Employee ∧ t.salary > 50000 }

2. Domain Relational Calculus (DRC)

Uses domain variables that represent column values (attributes) rather than entire tuples.

General format:

{ (x1, x2, ..., xn) | P(x1, x2, ..., xn) }

Example:

Retrieve employee names and salaries where salary > 50,000:


{ (name, salary) | ∃ d1, d2, ..., dn (Employee(name, salary, d1, d2, ..., dn) ∧ salary > 50000) }
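
Python set comprehensions read much like calculus expressions, since they state a condition on the result rather than a sequence of operations. The sketch below is an analogy, not a calculus implementation: the Emp tuple type, its single extra dept attribute, and the sample salaries are assumptions.

```python
from collections import namedtuple

Emp = namedtuple("Emp", ["name", "salary", "dept"])
Employee = {
    Emp("Alex", 62000, "D01"),
    Emp("Bob", 48000, "D02"),
    Emp("Cara", 75000, "D01"),
}

# TRC: the variable t ranges over whole tuples.
# { t | t ∈ Employee ∧ t.salary > 50000 }
trc = {t for t in Employee if t.salary > 50000}

# DRC: the variables range over attribute values; dept is bound but not returned,
# which plays the role of the existential quantifier.
# { (name, salary) | ∃ dept (Employee(name, salary, dept) ∧ salary > 50000) }
drc = {(name, salary) for (name, salary, dept) in Employee if salary > 50000}
```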

******
Relational Algebra vs. Relational Calculus

| Relational Algebra | Relational Calculus |
|--------------------|---------------------|
| A procedural language: the user specifies the sequence of steps (operations) that obtains the information from the database. | A non-procedural language: the user specifies what is to be retrieved from the database rather than how to retrieve it. |
| It resembles a programming language. | It is closer to natural language. |
| It is prescriptive: it describes the steps that perform a given task. | It is descriptive: it describes the desired result. |
| It specifies operations performed on existing relations to obtain new relations. | It states the result as a formula over the relations rather than as operations on them. |
| Queries are domain independent. | Queries can be domain dependent (unsafe expressions are possible). |
| It provides a collection of explicit operations (union, intersection, difference, select, project, join, etc.) that can be used to build a desired relation from the given relations in the database. | It provides a notation for formulating the definition of the desired relation in terms of the given relations. |
| The evaluation of a query depends on the order of operations. | The evaluation of a query does not depend on an order of operations. |

******

Keys in relational model

In the relational model of databases, keys are crucial for uniquely identifying records (tuples) in a table (relation), preventing duplicate data, and establishing relationships between tables. There are several types of keys, each serving a specific purpose:

1. Super Key

A super key is any set of attributes that uniquely identifies a tuple in a relation. A relation can have multiple super keys.

Example: In a Students table, {Student_ID, Name} is a super key if Student_ID is unique.

2. Candidate Key

A candidate key is a minimal super key, meaning it has no redundant attributes. Every relation can have multiple candidate
keys.

Example: If {Student_ID, Name} is a super key but Student_ID alone is also unique, then {Student_ID} is a candidate key.

3. Primary Key
A primary key is a candidate key selected to uniquely identify tuples in a relation. It must be unique and not null.

Example: If Student_ID is a candidate key, it can be chosen as the primary key.

4. Foreign Key

A foreign key is an attribute (or set of attributes) in one table that refers to the primary key of another table. It establishes
relationships between tables.

Example: If a Courses table has an attribute Student_ID that references the Students table, then Student_ID is a foreign
key.

5. Composite Key

A composite key consists of multiple attributes that together uniquely identify a tuple.

Example: In a Course_Enrollment table, {Student_ID, Course_ID} together form a composite key.

6. Unique Key

A unique key ensures that values in a column (or set of columns) are unique across all rows, similar to a primary key but
allowing NULL values.

Example: An Employees table may have Email as a unique key.

7. Alternate Key

An alternate key is any candidate key that is not chosen as the primary key.

Example: If a Students table has Student_ID and Email as candidate keys and Student_ID is chosen as the primary key, then Email is an alternate key.

8. Surrogate Key

A surrogate key is an artificial key, usually an auto-generated number, used when no natural key exists or when the natural key is unsuitable as a primary key.

Example: A database may use an auto-incrementing User_ID instead of relying on SSN or Email.
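
The difference between super keys and candidate keys can be sketched by testing attribute sets against sample rows. Two caveats apply: the sample data is invented, and a key is a rule declared on the schema, so sample rows can show that an attribute set is not a key but can never prove that it is one.

```python
from itertools import combinations

students = [
    {"Student_ID": 101, "Name": "Alex", "Email": "alex@example.com"},
    {"Student_ID": 102, "Name": "Bob", "Email": "bob@example.com"},
    {"Student_ID": 103, "Name": "Alex", "Email": "alex2@example.com"},
]


def is_unique(rows, attrs):
    """True if no two rows share the same values for attrs (super key test on this sample)."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def minimal_unique_sets(rows):
    """Unique attribute sets with no unique proper subset (candidate keys of this sample)."""
    attrs = sorted(rows[0])
    found = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            # Skip any set that contains an already-found smaller key: it is not minimal.
            if is_unique(rows, combo) and not any(set(k) <= set(combo) for k in found):
                found.append(combo)
    return found


super_key = is_unique(students, ["Student_ID", "Name"])  # unique, but Name is redundant
name_only = is_unique(students, ["Name"])                # two students are named Alex
candidates = minimal_unique_sets(students)
```

On this sample the candidates are Email and Student_ID; choosing Student_ID as the primary key would leave Email as the alternate key.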

******
What is Normalization
Normalization is the process of organizing data in a database to reduce redundancy (duplicate data) and improve data
integrity. It involves dividing large tables into smaller ones and establishing relationships between them using keys.

The goal of normalization is to:

Minimize data redundancy (avoid duplicate data).

Ensure data consistency (maintain data accuracy).

Improve database efficiency (optimize queries and storage).

Rules of Normalization (Normal Forms)

Normalization follows a series of rules called normal forms (NF). Each normal form builds on the previous one.
1. First Normal Form (1NF) – Remove Duplicate Columns & Ensure Atomicity
2. Second Normal Form (2NF) – Remove Partial Dependency
3. Third Normal Form (3NF) – Remove Transitive Dependency

Higher Normal Forms (BCNF, 4NF, 5NF, 6NF)

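BCNF (Boyce-Codd Normal Form): A stricter version of 3NF in which every determinant must be a super key.

4NF: Removes multi-valued dependencies.

5NF: Removes join dependencies that are not implied by the candidate keys.

6NF: Decomposes relations into irreducible components; used in advanced cases like temporal databases and data warehousing.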

******
Advantages and Disadvantages of Normalization
Advantages of Normalization

1. Reduces Data Redundancy

Eliminates duplicate data, saving storage space.

2. Improves Data Integrity and Consistency

Ensures accurate and consistent data across tables.

3. Prevents Data Anomalies

Avoids insertion, update, and deletion anomalies.

4. Efficient Query Performance

Smaller, narrower tables can make single-table lookups and updates faster and indexes more compact.

5. Easy to Maintain and Update

Each fact is stored in one place, so a change is made once instead of in every row that repeats it.

Disadvantages of Normalization

1. Increased Complexity

More tables and relationships make database design more complicated.

2. Slower Join Operations

Querying data requires multiple JOINs, which can slow performance.

3. Difficult for Reporting & Analysis

Splitting data across many tables can make reporting more complex.

4. More Foreign Keys

Excessive foreign keys may increase storage and processing time.


5. Not Always Necessary

In some cases, denormalization (combining tables) is better for performance.

******
First, Second, and Third Normal Forms (1NF, 2NF, 3NF) with Examples
Normalization is a process of organizing data in a database to eliminate redundancy and improve data integrity. It follows
different normal forms, where each form builds upon the previous one.

First Normal Form (1NF) – Remove Duplicate Columns & Ensure Atomicity

Rules:

1. Each column should have atomic (indivisible) values (no multiple values in one cell).

2. Each row should be unique (a primary key must exist).

3. No repeating groups (no multiple columns for similar data).

✅ Example (Before 1NF - Not in 1NF)

| StudentID | Name | Subjects |

|-----------|------|------------|

| 101 | Alex | Math, Science |

| 102 | Bob | English, History |

Problem: "Subjects" column has multiple values.

✅ After 1NF (Separate into Rows)

| StudentID | Name | Subject |

|-----------|------|---------|

| 101 | Alex | Math |

| 101 | Alex | Science |

| 102 | Bob | English |

| 102 | Bob | History |

🔹 Now, the table is in 1NF because each column has atomic values, and there are no repeating groups.
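
The conversion to 1NF shown above can be sketched as a small transformation. The sketch assumes the multi-valued cell is a comma-separated string, as in the example table; the function name to_1nf is hypothetical.

```python
unnormalized = [
    {"StudentID": 101, "Name": "Alex", "Subjects": "Math, Science"},
    {"StudentID": 102, "Name": "Bob", "Subjects": "English, History"},
]


def to_1nf(rows):
    """Emit one row per (student, subject) so that every cell holds a single value."""
    result = []
    for row in rows:
        for subject in row["Subjects"].split(","):
            result.append({
                "StudentID": row["StudentID"],
                "Name": row["Name"],
                "Subject": subject.strip(),
            })
    return result


first_nf = to_1nf(unnormalized)

# StudentID alone no longer identifies a row; the pair (StudentID, Subject) does.
keys = {(row["StudentID"], row["Subject"]) for row in first_nf}
```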

Second Normal Form (2NF) – Remove Partial Dependency

Rules:

1. The table must be in 1NF.

2. All non-key columns must depend on the entire primary key, not just a part of it.

✅ Example (Before 2NF - Not in 2NF)

| StudentID | CourseID | StudentName | CourseName |

|-----------|----------|------------|-----------|

| 101 | C01 | Alex | Math |

| 101 | C02 | Alex | Science |

Problem: "StudentName" depends only on "StudentID", and "CourseName" depends only on "CourseID".

✅ After 2NF (Separate Tables)

Student Table

| StudentID | StudentName |

|-----------|------------|

| 101 | Alex |

Course Table

| CourseID | CourseName |

|----------|-----------|

| C01 | Math |

| C02 | Science |

Enrollment Table (Bridge Table)

| StudentID | CourseID |

|-----------|----------|

| 101 | C01 |

| 101 | C02 |

🔹 Now, the tables are in 2NF because every non-key attribute depends on the whole primary key of its table, not just part of it.
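
The 2NF decomposition above amounts to three projections, and joining them back should reproduce the original rows. A minimal sketch of that check follows; the project helper is an assumption of the example, and the data is the sample table from this section.

```python
enrollment_flat = [
    {"StudentID": 101, "CourseID": "C01", "StudentName": "Alex", "CourseName": "Math"},
    {"StudentID": 101, "CourseID": "C02", "StudentName": "Alex", "CourseName": "Science"},
]


def project(rows, attrs):
    """Project onto attrs and drop duplicate tuples."""
    return sorted({tuple(row[a] for a in attrs) for row in rows})


student = project(enrollment_flat, ["StudentID", "StudentName"])   # Alex is now stored once
course = project(enrollment_flat, ["CourseID", "CourseName"])
enrollment = project(enrollment_flat, ["StudentID", "CourseID"])   # bridge table

# Rejoin through the keys to confirm that no information was lost.
student_name = dict(student)
course_name = dict(course)
rejoined = [
    {"StudentID": s, "CourseID": c,
     "StudentName": student_name[s], "CourseName": course_name[c]}
    for s, c in enrollment
]
```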

Third Normal Form (3NF) – Remove Transitive Dependency

Rules:

1. The table must be in 2NF.

2. No transitive dependency: A non-key column should not depend on another non-key column.

✅ Example (Before 3NF - Not in 3NF)

| StudentID | StudentName | DepartmentID | DepartmentName |

|-----------|------------|--------------|---------------|

| 101 | Alex | D01 | Science |

| 102 | Bob | D02 | Arts |

Problem: "DepartmentName" depends on "DepartmentID", a non-key column, and so depends on "StudentID" only transitively.

✅ After 3NF (Separate into Tables)

Student Table

| StudentID | StudentName | DepartmentID |

|-----------|------------|--------------|

| 101 | Alex | D01 |

| 102 | Bob | D02 |

Department Table

| DepartmentID | DepartmentName |

|-------------|---------------|

| D01 | Science |

| D02 | Arts |

🔹 Now, the table is in 3NF because there are no transitive dependencies. Each column depends only on the primary key.
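
One practical effect of the 3NF design can be sketched with Python's sqlite3 module: a department name lives in exactly one row, so renaming it is a single-row update that every joined query then sees. Student 103 and the new name 'Natural Sciences' are hypothetical additions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (DepartmentID TEXT PRIMARY KEY, DepartmentName TEXT NOT NULL);
CREATE TABLE Student (
    StudentID    INTEGER PRIMARY KEY,
    StudentName  TEXT NOT NULL,
    DepartmentID TEXT REFERENCES Department(DepartmentID)
);
INSERT INTO Department VALUES ('D01', 'Science'), ('D02', 'Arts');
INSERT INTO Student VALUES (101, 'Alex', 'D01'), (102, 'Bob', 'D02'), (103, 'Cara', 'D01');
""")

# The name is stored once, so one row changes no matter how many students are in D01.
rows_changed = conn.execute(
    "UPDATE Department SET DepartmentName = 'Natural Sciences' WHERE DepartmentID = 'D01'"
).rowcount

report = conn.execute("""
    SELECT s.StudentID, s.StudentName, d.DepartmentName
    FROM Student AS s JOIN Department AS d ON d.DepartmentID = s.DepartmentID
    ORDER BY s.StudentID
""").fetchall()
```

In the table before 3NF, the same rename would have to touch every student row in the department, and missing one would leave the data inconsistent.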

******
Boyce-Codd Normal Form (BCNF) – A Stricter 3NF
BCNF is an enhanced version of 3NF that removes anomalies not handled by 3NF.

Rules of BCNF

1. The table must be in 3NF.

2. For every non-trivial functional dependency (A → B), A must be a super key.

Problem (Not in BCNF)

| StudentID | CourseID | Instructor |

|-----------|----------|------------|

| 101 | C01 | Dr. Smith |

| 102 | C01 | Dr. Smith |

| 103 | C02 | Dr. John |

Functional Dependencies:

{StudentID, CourseID} → Instructor (Correct: the determinant is the key.)

CourseID → Instructor (Problem: CourseID alone determines Instructor, but CourseID is not a super key! Because CourseID is part of the key, this is also a partial dependency, so the table fails 2NF and 3NF as well as BCNF.)

Solution (Convert to BCNF - Separate Tables)

Course Table
| CourseID | Instructor |

|----------|------------|

| C01 | Dr. Smith |

| C02 | Dr. John |

Enrollment Table

| StudentID | CourseID |

|-----------|----------|

| 101 | C01 |

| 102 | C01 |

| 103 | C02 |

🔹 Now, both tables are in BCNF because every determinant is a super key, preventing these anomalies.
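
The BCNF rule can be checked mechanically once the functional dependencies are written down. The sketch below uses the standard attribute-closure computation to decide whether a determinant is a super key; the function names and the representation of dependencies as pairs of frozensets are assumptions of the example.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Non-trivial dependencies whose determinant is not a super key of schema."""
    return [
        (lhs, rhs) for lhs, rhs in fds
        if not rhs <= lhs and not closure(lhs, fds) >= set(schema)
    ]


schema = {"StudentID", "CourseID", "Instructor"}
fds = [
    (frozenset({"StudentID", "CourseID"}), frozenset({"Instructor"})),
    (frozenset({"CourseID"}), frozenset({"Instructor"})),
]
violations = bcnf_violations(schema, fds)

# After decomposition, CourseID is the key of the Course table, so nothing is flagged.
course_violations = bcnf_violations(
    {"CourseID", "Instructor"},
    [(frozenset({"CourseID"}), frozenset({"Instructor"}))],
)
```

Only CourseID → Instructor is reported for the original table, which matches the dependency flagged in the example.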

******
What do you mean by functional dependency? Explain with suitable example.
Functional dependency is a relationship between attributes (columns) in a relational database. It defines how one attribute
determines another within a table.

Definition

If A → B, it means that each value of A is associated with exactly one value of B.

A (Determinant) → B (Dependent/Determined Attribute)

✅ Example: In a Student table:

| StudentID | Name | Course |

|-----------|------|--------|

| 101 | Alex | Math |

| 102 | Bob | Science |

StudentID → Name (Each StudentID has only one Name).

StudentID → Course (Each StudentID is enrolled in only one Course).

🚫 Incorrect Dependency:

Name → StudentID is not valid because two students might have the same name.
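
Whether a dependency is contradicted by the data can be tested directly: A → B fails as soon as two rows agree on A but differ on B. The sketch below is illustrative; the function name fd_holds is hypothetical, and the third row (a second student named Alex) is added to the example table so that the invalid dependency actually fails.

```python
def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on the lhs attributes but differ on the rhs attributes."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it afterwards.
        if seen.setdefault(key, value) != value:
            return False
    return True


students = [
    {"StudentID": 101, "Name": "Alex", "Course": "Math"},
    {"StudentID": 102, "Name": "Bob", "Course": "Science"},
    {"StudentID": 103, "Name": "Alex", "Course": "History"},  # hypothetical extra row
]

id_to_name = fd_holds(students, ["StudentID"], ["Name"])   # each ID has one name
name_to_id = fd_holds(students, ["Name"], ["StudentID"])   # "Alex" maps to two IDs
```

As with keys, a passing check on sample rows does not prove the dependency; it only shows the data does not contradict it.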

Types of Functional Dependency

1. Full Functional Dependency

A functional dependency is full when the dependent attribute depends on the entire composite key and not on any proper subset of it.

2. Partial Dependency

A partial dependency happens when a non-key attribute depends on part of a composite primary key instead of the whole
key.

3. Transitive Dependency

A transitive dependency happens when A → B and B → C (and B does not determine A), meaning A determines C only indirectly, through B.

******
