Unit 4

Unit 4 covers relational database design, focusing on normalization to reduce redundancy and improve data integrity, alongside file organization, indexing, and hashing techniques. It outlines the database design process, including requirement analysis, conceptual and logical design, normalization, and optimization for performance. Key concepts include various normal forms, file organization types, indexing methods like B+ trees, and hashing techniques for efficient data retrieval.

Uploaded by

G016 Bobade Samruddhi


Unit 4: Relational Database Design and File Organization, Indexing & Hashing

1. Normalization
Normalization is the process of organizing data in a database to reduce redundancy and improve data integrity.
It involves decomposing a table into smaller, more manageable tables while ensuring that the relationships
between the data are preserved.
Features of Good Relational Designs
 Minimized Redundancy: Avoids duplicate data, which saves storage space and reduces
inconsistencies.
 Data Integrity: Ensures accuracy and consistency of data through constraints like primary keys,
foreign keys, and unique constraints.
 Ease of Maintenance: Simplifies updates, deletions, and insertions by reducing anomalies.
 Scalability: Supports future growth and changes in the database structure.
Functional Dependencies
 A functional dependency (FD) is a relationship between two sets of attributes in a relation. It is denoted
as X → Y, where X determines Y: any two rows that agree on X must also agree on Y.
 Example: In a table Student(StudentID, Name, Age), StudentID → Name means that StudentID
uniquely determines Name.
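A minimal Python sketch of how one might test whether an FD holds in a given set of rows. The helper name fd_holds and the sample data are illustrative assumptions, not part of the original notes. Note that sample rows can only refute an FD; whether it holds in general is a design decision.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Two rows that agree on lhs must also agree on rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows for Student(StudentID, Name, Age).
students = [
    {"StudentID": 1, "Name": "Asha", "Age": 20},
    {"StudentID": 2, "Name": "Ravi", "Age": 21},
    {"StudentID": 3, "Name": "Asha", "Age": 22},
]

print(fd_holds(students, ["StudentID"], ["Name"]))  # True
print(fd_holds(students, ["Name"], ["Age"]))        # False: two students named Asha differ in Age
```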
Normal Forms
Normal forms are a series of guidelines to ensure that a database design is free from redundancy and
anomalies.
First Normal Form (1NF):
 A table is in 1NF if:
o All attributes contain atomic (indivisible) values.
o Each column contains only one value per row.
 Example: A table with a multi-valued attribute like PhoneNumbers is not in 1NF. It should be split into
separate rows (one phone number per row) or moved into a separate table.
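The 1NF fix can be sketched in Python as follows. The function name to_1nf, the column name PhoneNumber, and the sample rows are assumptions made for illustration.

```python
def to_1nf(rows, multi_attr, single_attr):
    """Emit one row per value of the multi-valued attribute."""
    flat = []
    for row in rows:
        rest = {k: v for k, v in row.items() if k != multi_attr}
        for value in row[multi_attr]:
            flat.append({**rest, single_attr: value})
    return flat


# Hypothetical rows with a multi-valued PhoneNumbers attribute.
students = [
    {"StudentID": 1, "PhoneNumbers": ["555-0101", "555-0102"]},
    {"StudentID": 2, "PhoneNumbers": ["555-0199"]},
]

flat = to_1nf(students, "PhoneNumbers", "PhoneNumber")
for row in flat:
    print(row)
```

After the split every cell holds a single atomic value, and (StudentID, PhoneNumber) can serve as the key of the flattened table.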
Second Normal Form (2NF):
 A table is in 2NF if:
o It is in 1NF.
o All non-key attributes are fully functionally dependent on the primary key (no partial
dependency).
 Example: In a table Order(OrderID, ProductID, ProductName, Quantity) with the composite primary key
(OrderID, ProductID), ProductName depends only on ProductID. This partial dependency violates 2NF.
Split into Order(OrderID, ProductID, Quantity) and Product(ProductID, ProductName).
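As a rough illustration, the 2NF decomposition above amounts to two projections of the original table. The helper project and the sample rows below are assumptions for the sketch.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples (first-seen order)."""
    unique = dict.fromkeys(tuple(row[a] for a in attrs) for row in rows)
    return [dict(zip(attrs, values)) for values in unique]


# Hypothetical Order rows; the key is (OrderID, ProductID).
orders = [
    {"OrderID": 1, "ProductID": 10, "ProductName": "Pen", "Quantity": 2},
    {"OrderID": 1, "ProductID": 11, "ProductName": "Ink", "Quantity": 1},
    {"OrderID": 2, "ProductID": 10, "ProductName": "Pen", "Quantity": 5},
]

order_items = project(orders, ["OrderID", "ProductID", "Quantity"])
products = project(orders, ["ProductID", "ProductName"])

print(len(order_items))  # 3
print(products)          # "Pen" is now stored once instead of twice
```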
Third Normal Form (3NF):
 A table is in 3NF if:
o It is in 2NF.
o There are no transitive dependencies (non-key attributes depend only on the primary key).
 Example: In a table Employee(EmployeeID, DepartmentID, DepartmentLocation), EmployeeID determines
DepartmentID and DepartmentID determines DepartmentLocation, so DepartmentLocation depends on the key
only transitively. This violates 3NF. Split into Employee(EmployeeID, DepartmentID) and
Department(DepartmentID, DepartmentLocation).
Boyce-Codd Normal Form (BCNF):
 A stricter version of 3NF.
 A table is in BCNF if:
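o For every non-trivial functional dependency X → Y, X is a superkey.
 Example: In a table Enrollment(StudentID, CourseID, Instructor), if Instructor determines CourseID, it
violates BCNF because Instructor is not a superkey. Split into InstructorCourse(Instructor, CourseID) and
StudentInstructor(StudentID, Instructor). This decomposition is lossless, but the dependency
(StudentID, CourseID) → Instructor can no longer be checked within a single table.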
Fourth Normal Form (4NF):
 A table is in 4NF if:
o It is in BCNF.
o It has no multi-valued dependencies (MVDs) unless they are trivial.
 Example: In a table EmployeeSkills(EmployeeID, Skill, Language), if an employee can have multiple
skills and languages independently, it violates 4NF. Split into EmployeeSkills(EmployeeID, Skill) and
EmployeeLanguages(EmployeeID, Language).
Functional Dependency Theory
 Closure of a Set of Functional Dependencies: The set of all functional dependencies that can be
inferred from a given set of FDs.
 Armstrong's Axioms: A set of rules to derive all possible FDs from a given set:
o Reflexivity: If Y ⊆ X, then X → Y.
o Augmentation: If X → Y, then XZ → YZ.
o Transitivity: If X → Y and Y → Z, then X → Z.
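In practice, the axioms are usually applied through the attribute-closure algorithm: start from a set of attributes and keep adding whatever the FDs let you derive. The sketch below uses that closure to test for superkeys and BCNF violations on the Enrollment example. The function names, and the FD (StudentID, CourseID) → Instructor, are assumptions added for illustration.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) tuples."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have all of lhs, we can derive rhs.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(attrs, all_attrs, fds):
    return closure(attrs, fds) >= set(all_attrs)


def bcnf_violations(all_attrs, fds):
    """Non-trivial FDs whose left side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and not is_superkey(lhs, all_attrs, fds)
    ]


enrollment = ("StudentID", "CourseID", "Instructor")
fds = [
    (("StudentID", "CourseID"), ("Instructor",)),  # assumed for this sketch
    (("Instructor",), ("CourseID",)),
]

print(sorted(closure(["Instructor"], fds)))  # ['CourseID', 'Instructor']
print(bcnf_violations(enrollment, fds))      # [(('Instructor',), ('CourseID',))]
```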
Multivalued Dependencies
 A multivalued dependency (MVD) occurs when an attribute determines a set of values independently
of other attributes.
 Denoted as X →→ Y, meaning X multidetermines Y.
 Example: In a table Employee(EmployeeID, Skill, Language), if an employee can have multiple skills
and languages independently, there is an MVD.
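One way to see what "independently" means: for each EmployeeID, the stored (Skill, Language) pairs must be every combination of that employee's skills and languages. The following Python sketch checks this on sample rows; the helper name mvd_holds and the data are illustrative assumptions.

```python
from collections import defaultdict
from itertools import product


def mvd_holds(rows, x, y):
    """Check X ->> Y on rows (list of dicts with identical keys)."""
    attrs = list(rows[0].keys()) if rows else []
    z = [a for a in attrs if a not in x and a not in y]
    groups = defaultdict(set)
    for row in rows:
        x_val = tuple(row[a] for a in x)
        groups[x_val].add((tuple(row[a] for a in y), tuple(row[a] for a in z)))
    for pairs in groups.values():
        ys = {p[0] for p in pairs}
        zs = {p[1] for p in pairs}
        # Every Y value must appear with every Z value for the same X value.
        if pairs != set(product(ys, zs)):
            return False
    return True


# Hypothetical rows: employee 1 has two skills and two languages.
rows = [
    {"EmployeeID": 1, "Skill": "SQL", "Language": "English"},
    {"EmployeeID": 1, "Skill": "SQL", "Language": "Hindi"},
    {"EmployeeID": 1, "Skill": "Python", "Language": "English"},
    {"EmployeeID": 1, "Skill": "Python", "Language": "Hindi"},
]

print(mvd_holds(rows, ["EmployeeID"], ["Skill"]))       # True
print(mvd_holds(rows[:3], ["EmployeeID"], ["Skill"]))   # False: one combination is missing
```

The four rows needed for just two skills and two languages show the redundancy that a 4NF decomposition removes.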
Database Design Process
1. Requirement Analysis: Understand the data and its relationships.
2. Conceptual Design: Create an Entity-Relationship (ER) model.
3. Logical Design: Convert the ER model into relational schemas.
4. Normalization: Apply normal forms to eliminate redundancy.
5. Physical Design: Implement the database with file organization, indexing, and hashing.

Database design is a structured process that involves converting requirements and conceptual models into a set
of relational schemas that can be implemented in a database system. The process ensures that the database is
efficient, accurate, and easy to maintain, and normalization is a key part of that process.
Overview of the Database Design Process
1. ER Diagram to Relational Schema:
o The design often starts with an Entity-Relationship (E-R) diagram. This is a conceptual
model of the data.
o The E-R diagram is then converted into relation schemas. This step involves creating tables
(relations) based on the entities and their relationships in the diagram.
2. Normalization:
o After creating a relation schema, normalization is applied. Normalization is a process that
removes redundancy and ensures the schema adheres to specific normal forms (e.g., 1NF, 2NF,
3NF, BCNF).
o The goal of normalization is to organize the schema in such a way that it reduces anomalies and
ensures data integrity.
3. Denormalization:
o Sometimes, to improve performance, a database designer might intentionally denormalize the
schema. Denormalization introduces some redundancy (i.e., storing the same data in multiple
places) to make certain queries faster.
o This is particularly useful when reads are much more frequent than writes. However, every update must
then keep all redundant copies consistent.
Specific Design Issues
1. E-R Diagram and Normalization:
 When creating an E-R diagram, careful attention to entity design can often eliminate the need for
extensive normalization later.
 Functional dependencies in an entity set (like dept_name → dept_address) need to be handled, or they
will lead to redundancy.
 In cases of complex relationships or multivalued dependencies, normalization helps in breaking down
the relation schema into smaller, more manageable pieces.
2. Naming Conventions:
 Unique Role Assumption: Each attribute name should have a unique meaning to avoid confusion. For
instance, naming a field "number" for both a room number and phone number in different tables is
problematic.
 Consistency in naming conventions across entities and relationships is crucial for clarity. This makes
database management easier, especially in large systems.
3. Denormalization for Performance:
 Denormalization is sometimes used to speed up query performance at the cost of data redundancy.
 A normalized schema may require joins to answer certain queries. Denormalizing the schema (like
storing combined information in one relation) can avoid costly joins.
 Materialized views are an alternative to full denormalization, where the database stores the result of a
join or a computation for quicker access.
4. Other Design Issues:
 Time-related Data: Storing time-series data can be tricky. For example, in a university database,
creating separate relations for every year (e.g., total_inst_2007, total_inst_2008, etc.) is problematic
because it requires new relations to be added every year and complicates queries.
 Crosstab Representations: Sometimes data is represented in a cross-tab format (e.g., one column per
year for a department). While this may be useful for reports, it is not suitable for the underlying
database design.
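A small sketch of the alternative to per-year relations or per-year columns: store the year as an ordinary attribute. The function unpivot, the column name total_inst, and the sample values are assumptions made for illustration.

```python
def unpivot(rows, key_attr, value_attr):
    """Turn one-column-per-year rows into (key, year, value) rows."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column != key_attr:
                long_rows.append(
                    {key_attr: row[key_attr], "year": int(column), value_attr: value}
                )
    return long_rows


# Hypothetical crosstab: one column per year for each department.
crosstab = [
    {"dept_name": "Physics", "2007": 10, "2008": 12},
    {"dept_name": "Music", "2007": 4, "2008": 5},
]

long_rows = unpivot(crosstab, "dept_name", "total_inst")
for row in long_rows:
    print(row)
```

With the year stored as data, a new year only adds rows; no schema change or new relation is needed, and queries can filter or group on year directly.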
Practical Steps in the Database Design Process
1. Start with Conceptual Design:
o Create the E-R diagram based on the requirements.
o Identify entities, relationships, and attributes.
2. Convert E-R Diagram to Relational Schema:
o Convert entities into tables.
o Define primary keys and foreign keys.
3. Normalize the Schema:
o Apply normalization rules (1NF, 2NF, 3NF, BCNF) to ensure the schema is well-structured and
free from redundancy.
4. Optimize for Performance:
o If performance is a concern, consider denormalization or materialized views to speed up query
execution.
5. Test and Refine:
o Test the schema with sample data and queries.
o Refine the design as necessary based on performance or new requirements.
By following these steps and considering design principles like normalization, naming conventions, and
performance optimizations, you can ensure a well-designed, efficient, and maintainable database.

2. File Organization
File organization refers to how data is stored in files and accessed in a database system.
Types of File Organization
 Heap File Organization: Records are stored in no particular order. Insertion is fast, but searching and
deletion are slow.
 Sequential File Organization: Records are stored in a sorted order based on a key. Efficient for range
queries but slow for insertions and deletions.
 Hash File Organization: Records are stored based on a hash function applied to a key. Fast for exact
match queries but inefficient for range queries.
 Indexed Sequential Access Method (ISAM): Combines sequential and indexed file organization.
Uses an index to locate records quickly.
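The trade-off between heap and sequential organization can be sketched with in-memory lists standing in for files. This is a simplification under stated assumptions: real files are read block by block, and the record layout (key, payload) is hypothetical.

```python
import bisect


def heap_search(records, key):
    """Heap file: no order, so every record may have to be examined."""
    for record in records:
        if record[0] == key:
            return record
    return None


def sequential_search(sorted_records, keys, key):
    """Sequential file: binary search over the sorted keys."""
    i = bisect.bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return sorted_records[i]
    return None


def sequential_range(sorted_records, keys, lo, hi):
    """Range query: matching records are contiguous in a sorted file."""
    return sorted_records[bisect.bisect_left(keys, lo):bisect.bisect_right(keys, hi)]


heap_file = [(42, "d"), (7, "a"), (19, "c"), (13, "b")]  # insertion order
sorted_file = sorted(heap_file)                          # ordered by key
keys = [k for k, _ in sorted_file]

print(heap_search(heap_file, 19))                  # (19, 'c')
print(sequential_search(sorted_file, keys, 19))    # (19, 'c')
print(sequential_range(sorted_file, keys, 10, 20)) # [(13, 'b'), (19, 'c')]
```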

3. Indexing
Indexing is a data structure technique used to quickly locate and access data in a database.
Ordered Indices
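 Primary Index: Built on the search key that defines the sorted order of the data file, typically the primary key.
 Secondary Index: Built on a search key whose order differs from the file's sort order (often a non-key
attribute), allowing faster access to records by that attribute.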
B+ Tree Index Files
 A balanced tree structure where all leaf nodes are at the same level.
 Supports efficient insertion, deletion, and search operations.
 Used in most database systems for indexing.
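A lookup-only sketch of the B+ tree idea in Python, assuming a small hand-built tree: internal nodes hold only separator keys, all values sit in the leaves, and leaves are linked for range scans. Insertion and deletion with node splitting and merging are omitted, and the class and function names are illustrative.

```python
import bisect
from dataclasses import dataclass
from typing import Optional


@dataclass
class Leaf:
    keys: list
    values: list
    next: Optional["Leaf"] = None  # link to the next leaf for range scans


@dataclass
class Internal:
    keys: list      # separator keys
    children: list  # len(children) == len(keys) + 1


def find_leaf(node, key):
    """Descend from the root to the leaf that could contain key."""
    while isinstance(node, Internal):
        node = node.children[bisect.bisect_right(node.keys, key)]
    return node


def search(root, key):
    leaf = find_leaf(root, key)
    i = bisect.bisect_left(leaf.keys, key)
    if i < len(leaf.keys) and leaf.keys[i] == key:
        return leaf.values[i]
    return None


def range_scan(root, lo, hi):
    """Find the first leaf, then follow leaf links until keys exceed hi."""
    leaf = find_leaf(root, lo)
    out = []
    while leaf is not None:
        for k, v in zip(leaf.keys, leaf.values):
            if k > hi:
                return out
            if k >= lo:
                out.append((k, v))
        leaf = leaf.next
    return out


# Hand-built tree: three leaves under one root; every leaf is at the same level.
leaf3 = Leaf([40, 50], ["r40", "r50"])
leaf2 = Leaf([20, 30], ["r20", "r30"], leaf3)
leaf1 = Leaf([5, 10], ["r5", "r10"], leaf2)
root = Internal([20, 40], [leaf1, leaf2, leaf3])

print(search(root, 30))          # r30
print(search(root, 25))          # None
print(range_scan(root, 10, 40))  # [(10, 'r10'), (20, 'r20'), (30, 'r30'), (40, 'r40')]
```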
B Tree Index File
 Similar to B+ trees but stores data in both internal and leaf nodes.
 Less commonly used in databases compared to B+ trees.

4. Hashing
Hashing is a technique that applies a hash function to a key to map it to a location (bucket) in a hash table for fast access.
Static Hashing
 The size of the hash table is fixed.
 Collision: Occurs when two keys hash to the same location.
 Collision Resolution Techniques:
o Chaining: Store multiple items in the same location using linked lists.
o Open Addressing: Find another location within the hash table.
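A minimal sketch of static hashing with chaining, assuming a fixed number of buckets and using Python lists in place of linked lists. The class name and the choice of eight buckets are illustrative.

```python
class ChainedHashTable:
    """Static hash table: the number of buckets never changes."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, key):
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        chain = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)  # overwrite an existing key
                return
        chain.append((key, value))       # a collision simply extends the chain

    def get(self, key, default=None):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return default


table = ChainedHashTable(num_buckets=8)
table.put(1, "one")
table.put(9, "nine")   # 9 % 8 == 1, so this collides with key 1
print(table.get(9))              # nine
print(len(table.buckets[1]))     # 2
```

Because the bucket count is fixed, chains grow as more records are added, which is the limitation dynamic hashing addresses.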
Dynamic Hashing
 The size of the hash table can grow or shrink dynamically.
 Extendible Hashing: Uses a directory to point to buckets, allowing the hash table to expand.
 Linear Hashing: Gradually increases the number of buckets as needed.
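A simplified sketch of extendible hashing, assuming small in-memory buckets and using the low-order bits of the hash as the directory index. Deletion and bucket merging are omitted, and keys with identical hash values would need overflow handling that this sketch does not include. All names are illustrative.

```python
class Bucket:
    def __init__(self, depth, capacity):
        self.depth = depth        # local depth: number of hash bits this bucket uses
        self.capacity = capacity
        self.items = {}


class ExtendibleHash:
    def __init__(self, capacity=2):
        self.capacity = capacity
        self.global_depth = 1
        self.directory = [Bucket(1, capacity), Bucket(1, capacity)]

    def _slot(self, key):
        # Directory index = low-order global_depth bits of the hash.
        return hash(key) & ((1 << self.global_depth) - 1)

    def get(self, key, default=None):
        return self.directory[self._slot(key)].items.get(key, default)

    def put(self, key, value):
        while True:
            bucket = self.directory[self._slot(key)]
            if key in bucket.items or len(bucket.items) < bucket.capacity:
                bucket.items[key] = value
                return
            self._split(bucket)  # bucket is full: split it and retry

    def _split(self, bucket):
        if bucket.depth == self.global_depth:
            # Double the directory; slot i and slot i + old_size share a bucket.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bucket.depth += 1
        sibling = Bucket(bucket.depth, self.capacity)
        high_bit = 1 << (bucket.depth - 1)
        # Slots whose newly used bit is 1 now point to the sibling.
        for i, b in enumerate(self.directory):
            if b is bucket and i & high_bit:
                self.directory[i] = sibling
        # Redistribute the old items between the bucket and its sibling.
        old_items, bucket.items = bucket.items, {}
        for k, v in old_items.items():
            self.directory[self._slot(k)].items[k] = v


table = ExtendibleHash(capacity=2)
for n in range(10):
    table.put(n, f"record-{n}")

print(table.get(7))             # record-7
print(table.global_depth)       # 3
print(len(table.directory))     # 8
```

Only the overflowing bucket is split, and the directory doubles only when that bucket already uses all of the directory's bits, so growth is incremental rather than a full rebuild.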

Summary
 Normalization ensures a well-structured database by eliminating redundancy and anomalies.
 File Organization determines how data is stored and accessed.
 Indexing (e.g., B+ trees, B trees) improves query performance.
 Hashing (static and dynamic) provides fast data retrieval but requires careful handling of collisions.
This unit is crucial for designing efficient, scalable, and maintainable database systems.
