Chapter 1

Uploaded by

starlord68736
Copyright: © All Rights Reserved

Data: raw facts and figures that have no particular meaning on their own (e.g., 201, Shounak, 21).

Record: a collection of related data items. The values 201, Shounak, 21 have no meaning in
isolation, but organized under column headings they represent meaningful information:

ROLL   NAME      AGE
201    Shounak   21

Table (relation): a collection of records.

ROLL   NAME      AGE
201    Shounak   21
202    Parthib   21
203    Soubeer   22

Columns are also called attributes or fields; each attribute takes its values from a domain.
Rows are also called tuples or records.
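
The difference between raw data, a record, and a table can be sketched in a few lines of Python. This is only an illustration: the class name StudentRecord and the variable names are assumptions made for the example, not part of any DBMS.

```python
from typing import NamedTuple


class StudentRecord(NamedTuple):
    """One record (row/tuple): related data items grouped under named fields."""
    roll: int
    name: str
    age: int


# Raw data: values with no labels attached carry no particular meaning.
raw_values = [201, "Shounak", 21]

# A record gives each value a field (attribute) name.
record = StudentRecord(*raw_values)

# A table (relation) is a collection of records that share the same fields.
student_table = [
    record,
    StudentRecord(202, "Parthib", 21),
    StudentRecord(203, "Soubeer", 22),
]

for row in student_table:
    print(row.roll, row.name, row.age)
```
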
• A DBMS is a collection of interrelated data and a set of programs to access those data
o Its primary purpose is to provide a way to store and retrieve database information
that is both convenient and efficient

FILE BASED SYSTEMS

• File-based systems are the traditional method of organizing and storing data on a computer. They
are essentially a collection of individual files, each containing a specific set of data (e.g., a
set of text documents, spreadsheets, images, or videos).

KEY CHARACTERISTICS OF FBS :

• Decentralized: each file is managed individually
• Data redundancy: the same data may be stored in multiple files
• Limited sharing: sharing data between files and applications is difficult and requires
manual work
• Simple structure: there are no complex relationships between files

DISADVANTAGES OF FBS :

• Data redundancy: duplicate data may be stored across files, wasting storage and
increasing the risk of data inconsistencies
• Data isolation: data is often stored in separate files, making it difficult to integrate and
analyze information across different sources
• Limited data sharing: sharing data between different applications or users can be
complex and often requires manual intervention
• Integrity problems: it is difficult to enforce and update consistency constraints
• Atomicity problems: ensuring that an operation either completes fully or does not execute at all
during failures is challenging
• Security problems: it is hard to restrict unauthorized access due to the lack of centralized control
• Concurrent access issues: simultaneous updates can cause data inconsistency

ADVANTAGES OF DBMS over FBS :

• Reduces data redundancy: a DBMS stores each piece of data in one place, avoiding duplication
across multiple files
• Ensures data consistency: an update is made once and is seen by every user and application,
avoiding mismatched data
• Easy retrieval: data can be retrieved with queries instead of custom-written programs
• Better security: centralized access control restricts unauthorized access
• Supports data integrity (e.g., no negative balance allowed) and atomicity (an operation either
completes fully or does not execute at all, even during system failures)
• Provides backup and recovery facilities that keep the data safe in case of failure
• Supports concurrent access: concurrency control keeps simultaneous updates from corrupting data
• Scalability: handles large and growing amounts of data
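
The integrity and atomicity points above can be tried out with Python's built-in sqlite3 module. The sketch below is only illustrative: the account table, its values, and the transfer function are assumptions made for the example. A CHECK constraint forbids negative balances, and a transfer that would break the rule is rolled back as a whole.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # integrity rule
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; either both updates happen or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30)         # both updates are applied
rejected = transfer(conn, 1, 2, 500)  # debit would go negative, so nothing is applied
balances = dict(conn.execute("SELECT id, balance FROM account"))
print(ok, rejected, balances)
```

In the rejected transfer the credit to account 2 has already run when the debit fails, yet the rollback undoes it, so no half-finished transfer is left behind.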

THE DATABASE APPROACH

The database approach manages data by storing all information in a centralized database system
that multiple users and applications can access and manipulate efficiently.

Key Features of the Database Approach

1. Centralized Storage: All data is stored in a single, organized database rather than scattered
across files.

2. Data Sharing: Allows multiple users and applications to access the same data concurrently.

3. Data Independence: Changes to the database structure don’t affect applications that use the
data.

4. Minimized Redundancy: Reduces duplicate data to save storage and maintain consistency.

5. Enhanced Security: Access to data is controlled to protect sensitive information.

6. Data Integrity: Ensures the accuracy and consistency of the data.

7. Efficient Retrieval: Queries can quickly retrieve specific data without complex programming.

LOGICAL DBMS ARCHITECTURE: 1-TIER, 2-TIER, 3-TIER

1 tier: the user interacts directly with the DBMS, and any changes are made directly to the
database. It is mainly used for developing local applications, where programmers communicate
with the database directly for quick responses.

2 tier: the system has two layers

• Client: the user interface where users interact with the application
• Server: the database layer where data is stored and managed

The client sends requests to the server; the server processes them and sends back the required
data. This architecture is commonly used where the user interacts with the database through an
application rather than directly.

Advantages: better and simpler user interaction, and faster communication between server and
client

Example: a desktop application such as MS Access, where the user interacts with the database
through a software interface

3 tier: there is no direct communication between the client and the database server

• It is mainly used for large web applications
• Features: data backup, recovery, security, and concurrency control

A 3-tier architecture divides the system into three layers to enhance modularity and
scalability:

1. Presentation Layer (Client): The user interface (UI) where users interact with the system,
such as web browsers or desktop applications.

2. Application Layer (Middle Tier): Processes the business logic and acts as a bridge
between the client and the database.

3. Data Layer (Database Server): Stores and manages the data.

How it works:

1. The client sends a request to the application layer.
2. The application layer processes the request and queries the database.
3. The database returns the results to the application layer, which delivers them to the client.

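
The request flow above can be sketched in Python, with one small class or function per layer. All names here (ProductStore, ProductService, render_quote, the product table and its rows) are hypothetical and chosen only for illustration; in a real 3-tier system the layers usually run as separate processes or machines.

```python
import sqlite3


# Data layer: the only code that talks to the database.
class ProductStore:
    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price INTEGER)"
        )
        self.conn.executemany(
            "INSERT INTO product VALUES (?, ?, ?)",
            [(1, "Pen", 10), (2, "Notebook", 45)],
        )

    def find_product(self, product_id):
        return self.conn.execute(
            "SELECT id, name, price FROM product WHERE id = ?", (product_id,)
        ).fetchone()


# Application layer: business logic between the client and the data layer.
class ProductService:
    def __init__(self, store):
        self.store = store

    def quote(self, product_id, quantity):
        row = self.store.find_product(product_id)
        if row is None:
            return None
        _, name, unit_price = row
        return {"name": name, "total": unit_price * quantity}


# Presentation layer: formats the result for the user, never touches SQL.
def render_quote(service, product_id, quantity):
    result = service.quote(product_id, quantity)
    if result is None:
        return "Product not found"
    return f"{result['name']}: total {result['total']}"


service = ProductService(ProductStore())
message = render_quote(service, 2, 3)
missing = render_quote(service, 99, 1)
print(message)
print(missing)
```
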
Advantages
• Scalability: Each layer can be modified or scaled independently.
• Security: Direct access to the database is restricted to the middle layer.
• Maintainability : Easier to maintain and update the system
• Modularity : changes to one layer can be made without affecting other layers

Disadvantages
• Complexity: More layers make the system harder to develop and manage.
• Performance Overhead: Communication between layers can slow down the system.
• Cost: Requires more resources and infrastructure, increasing costs.
• Network Dependency: Issues in the network can disrupt communication between layers.
• Development Effort: More time and effort needed to design and implement the system.
Example
A typical e-commerce website:
• Client: User's browser.
• Application Layer: The server that handles the website logic (e.g., Java Spring, Django).
• Database Layer: Stores product details, user data, and orders.
NEED FOR 3-TIER ARCHITECTURE
• Separation of Tasks: Divides the system into different layers (UI, business logic, data),
making it easier to manage and update each part.
• Data Independence: Changes in one layer (like the database) don't affect other layers,
making the system more flexible.
• Scalability: Each layer can be scaled separately to handle more users or data, improving
performance.
• Security: The database is protected from direct user access, keeping sensitive data safe.
• Easier Maintenance: Each layer can be updated independently, making it easier to
maintain the system.
PHYSICAL DBMS ARCHITECTURE

Physical DBMS Architecture refers to how data is stored and managed on physical devices like
hard drives. It focuses on how the database organizes and retrieves data efficiently.

Components of Physical DBMS Architecture:

1. Storage Manager: Manages how data is saved and accessed from the storage.

o File Manager: Handles storing and organizing files.

o Buffer Manager: Manages memory used to store data temporarily for faster access.

o Index Manager: Manages indexes that help quickly find data.

2. Data Access Methods: These are ways the system finds data efficiently, such as using
sorting or indexing.

3. Disk Storage: Refers to how data is saved on storage devices (e.g., hard drives or SSDs)
and organized for easy access.

4. Transaction Management: Makes sure that database changes are completed correctly, or
if there’s a problem, everything is rolled back to the correct state.

5. Recovery Management: Recovers data if there’s a system failure or crash.

Purpose of Physical DBMS Architecture:

• Efficient Storage: Organizes data for better storage and faster access.

• Data Safety: Ensures data is consistent and can be recovered.


• Performance: Helps retrieve data quickly and efficiently.

DATABASE ADMINISTRATOR (DBA) FUNCTIONS & ROLE

• A DBA is responsible for managing and maintaining the database system

• The DBA ensures that the database runs efficiently, smoothly, and securely, and that it is
accessible when needed

KEY FUNCTIONS OF DBA

1. Database design and planning: designs the structure of the database so that it is
well-organized, efficient, and scalable
2. Data security: controls access to the database and enforces security measures
3. Backup and recovery: plans and implements data backups and recovery techniques
4. Performance monitoring: monitors and optimizes the database for better performance
5. Database maintenance: regularly updates and maintains the database
6. Data integrity: ensures data accuracy and consistency through constraints
7. Troubleshooting: resolves issues such as slow queries or database crashes

Data Files, Indices, and Data Dictionary

1. Data Files:
• Definition: Data files store the actual data in a database. Each data file consists of records
organized in a structured format, such as rows in a table.
• Role: They store all the information needed for operations like retrieval, insertion, and
updates.
• Types:
o Primary Data Files: Store the actual data.
o Secondary Data Files: Used for specific functions like indexing or for organizing
data across multiple locations.
2. Indices:
• Definition: An index is a data structure that improves the speed of data retrieval
operations on a database table.
• Purpose: It provides quick access to rows based on values in specific columns.
• Types:
o Primary Index: Built on the primary key of the table.
o Secondary Index: Built on non-primary key columns.
• Benefits:
o Speeds up query performance.
o Reduces data retrieval time by providing fast lookup of records.
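
The effect of an index can be observed with Python's sqlite3 module. The sketch below assumes a small student table and an index named idx_student_name (both made up for the example) and asks SQLite for its query plan before and after the secondary index is created. The exact wording of the plan text differs between SQLite versions, so the code only collects it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(201, "Shounak", 21), (202, "Parthib", 21), (203, "Soubeer", 22)],
)

query = "SELECT roll, name, age FROM student WHERE name = ?"


def plan(conn, sql, params):
    """Return SQLite's query-plan description lines for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


plan_before = plan(conn, query, ("Parthib",))

# Secondary index: built on a non-primary-key column.
conn.execute("CREATE INDEX idx_student_name ON student (name)")
plan_after = plan(conn, query, ("Parthib",))

print("before:", plan_before)
print("after: ", plan_after)
matches = conn.execute(query, ("Parthib",)).fetchall()
```

Before the index exists the plan is a scan of the whole table; afterwards the plan mentions idx_student_name, meaning the lookup goes through the index.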

3. Data Dictionary:
A Data Dictionary is like a reference guide that stores information about the database (metadata),
such as what data the tables hold and how the tables are connected. It helps keep the data
organized and prevents duplication.
Example: a data dictionary could describe a table that holds employee details, explaining what
each column in the table means (name, address, etc.). It is an important part of a database
because it helps manage data in a clear and structured way.

• Role: It helps DBAs and developers understand the structure and relationships of data
within the database.
• Contents:
o Table names, column names, data types, and sizes.
o Relationships (foreign keys, etc.).
o Constraints (primary keys, unique keys, etc.).
• Benefits:
1. Better Understanding: It provides important information about the database, such as
entities, relationships, and attributes, which the data model alone doesn’t give.
2. Reduces Redundancy: It helps avoid repeating data and ensures consistency when
different team members use the data.
3. Structured Design: It supports designing and analyzing data by following data standards,
which are rules for how data should be collected and presented.
4. Naming Conventions: It helps define the rules for naming things in the database, ensuring
consistency.

In summary:

• Data files hold the actual data in the database.
• Indices help speed up data retrieval by providing quick access to data.
• Data dictionaries store metadata about the database, ensuring proper structure and
integrity.
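
Most database systems expose their data dictionary (system catalog) through queries. As a small sketch, SQLite keeps its catalog in the sqlite_master table and reports column metadata through PRAGMA table_info. The employee table below is a made-up example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " address TEXT)"
)

# Table names recorded in the catalog.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# Column metadata, one row per column:
# (cid, name, type, notnull, default_value, pk)
columns = conn.execute("PRAGMA table_info(employee)").fetchall()

print("tables:", tables)
for cid, name, col_type, notnull, default, pk in columns:
    print(f"{name}: type={col_type}, not_null={bool(notnull)}, primary_key={bool(pk)}")
```
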

DATA ABSTRACTION

Data abstraction is the process of hiding the details of how data is stored and maintained while
showing only the relevant information to the user. It allows users to interact with the data without
needing to understand its internal complexities.

Physical Level (Internal View/Schema)

• Definition: This is the lowest level of abstraction that describes how data is actually stored
on disk.

• Details: It involves technical specifics, such as data structures, indexing, and file
organization.

• Example: A student record stored as a block of memory locations.

Logical Level (Conceptual View/Schema)

• Definition: This level describes what data is stored in the database and the relationships
among those data. Database administrators typically work at this level.

• Details: It provides a simplified representation of the database, hiding the complexity of
the physical level.

• Example: A student record described by a type definition:

type student = record
    ID : char(10);
    name : char(30);
    dept_name : char(20);
    total_credits : numeric(3);
end;

View Level (External View/Schema)

• Definition: The highest level of abstraction shows only specific parts of the database to
users, tailored to their needs.

• Details:

o Multiple views can exist for the same database.

o It hides details of the logical level and enforces data security by restricting access
to sensitive parts of the database.

• Example: A university registrar’s office clerk sees only student records, not instructor
salaries.
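
A view can be sketched with SQLite as follows. The instructor table, its single row, and the view name instructor_public are assumptions made for the example. Note that SQLite has no user accounts or permissions, so this shows only how a view hides a column; in a multi-user DBMS the clerk would additionally be granted access to the view and not to the base table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor ("
    " id INTEGER PRIMARY KEY, name TEXT, dept_name TEXT, salary INTEGER)"
)
conn.execute("INSERT INTO instructor VALUES (1, 'Asha', 'Physics', 90000)")

# External view: exposes names and departments, hides the salary column.
conn.execute(
    "CREATE VIEW instructor_public AS SELECT id, name, dept_name FROM instructor"
)

cursor = conn.execute("SELECT * FROM instructor_public")
view_columns = [d[0] for d in cursor.description]
view_rows = cursor.fetchall()
print(view_columns, view_rows)
```
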

Schema: The overall structure of the database (e.g., a blueprint showing how tables like student
and course are related).

Instance: The actual data stored in the database at a specific time (e.g., a table filled with rows
of student data).

Importance of Data Abstraction

1. Efficiency: Hides storage complexities while maintaining quick access to data.

2. Security: Restricts user access to sensitive information through views.

3. Flexibility: Allows the logical structure to change without affecting user interaction (data
independence).

Types of Databases

Databases are classified based on their data organization, usage, and access methods.

1. Relational Database

• Definition: Data is organized in tables (relations) with rows (records) and columns
(attributes).

• Example: SQL databases like MySQL, PostgreSQL.

• Usage: Banking, e-commerce, inventory systems.

2. Hierarchical Database

• Definition: Data is organized in a tree-like structure where each child has a single parent.

• Example: IBM Information Management System (IMS).

• Usage: Banking applications, real-time applications.

3. Network Database

• Definition: Data is organized as a graph with multiple parent-child relationships.

• Example: Integrated Data Store (IDS).

• Usage: Telecommunications, supply chain management.

4. Object-Oriented Database

• Definition: Data is stored in objects as in object-oriented programming.

• Example: db4o, ObjectDB.

• Usage: Multimedia applications, CAD systems.

5. Document-Oriented Database (NoSQL)

• Definition: Data is stored as documents, often in JSON or XML format.

• Example: MongoDB, CouchDB.

• Usage: Real-time analytics, content management systems.

6. Key-Value Database

• Definition: Data is stored as key-value pairs.

• Example: Redis, DynamoDB.

• Usage: Caching, session management.

7. Columnar Database

• Definition: Data is stored in columns instead of rows for fast analytics.

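• Example: Amazon Redshift, ClickHouse. Apache Cassandra and HBase are often listed here as
well, although strictly they are wide-column stores.

• Usage: Big data analytics, real-time reporting.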

Relational Model

• The relational model organizes data into tables, also called relations.

• Each table has:

o Columns: Represent attributes (properties of data) with unique names.

o Rows: Represent individual records of data.

• It is a record-based model, meaning data is stored in fixed formats (tables with
predefined fields).

• Most modern database systems, like MySQL and PostgreSQL, use the relational model
because it is simple and efficient.

Entity-Relationship (E-R) Model

• Definition: A conceptual design model that describes data using entities, attributes, and
relationships. It is used for database design.

• Components:

1. Entities: Objects or things in the real world (e.g., Student, Course).

2. Attributes: Properties of entities (e.g., Name, ID, Dept_Name).

3. Relationships: Associations between entities (e.g., "Enrolls" relationship between
Student and Course).

• ER Diagram Symbols:

o Rectangle: Entity.

o Ellipse: Attribute.

o Diamond: Relationship.

o Line: Connects entities to attributes or relationships.

• Example: A university database might have entities like Student and Course, with a
relationship Enrolls linking them.

A database model is a theoretical blueprint or framework that defines how data is organized and
structured within a database system. It provides a conceptual representation of the data and its
relationships.

The relational model is a specific type of database model that organizes data into tables, with
rows representing individual records and columns representing attributes. It's characterized by
the use of primary keys and foreign keys to establish relationships between tables.

DATABASE TERMS

Domains

• A domain defines the set of permissible values for an attribute.

• It specifies the data type and any constraints on the values, such as ranges, allowed
characters, or valid values.
Example:
• Age: Domain: Integer, Range: 0-120
• Gender: Domain: {'Male', 'Female', 'Other'}
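
The two domains in the example can be written as CHECK constraints. The sketch below uses SQLite through Python; the person table and the try_insert helper are hypothetical names used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person ("
    " name TEXT NOT NULL,"
    " age INTEGER CHECK (age BETWEEN 0 AND 120),"
    " gender TEXT CHECK (gender IN ('Male', 'Female', 'Other')))"
)


def try_insert(conn, name, age, gender):
    """Return True if the row satisfies every domain rule, else False."""
    try:
        with conn:
            conn.execute("INSERT INTO person VALUES (?, ?, ?)", (name, age, gender))
        return True
    except sqlite3.IntegrityError:
        return False


valid = try_insert(conn, "Shounak", 21, "Male")
bad_age = try_insert(conn, "Parthib", 150, "Male")        # outside 0-120
bad_gender = try_insert(conn, "Soubeer", 22, "Unknown")   # not in the allowed set
row_count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
print(valid, bad_age, bad_gender, row_count)
```
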

Tuple and Relation

• Tuple: A single row in a table, representing a single record or instance of an entity. It
contains values for each attribute of the relation.

• Relation: A table in a relational database. It consists of a set of tuples, each having the
same number of attributes.

Keys

Consider an "Employees" table:

EmployeeID   Name      Email              Phone
101          Shaun     [email protected]  9876543210
102          Soubeer   [email protected]  9903543102

• Super key: A set of one or more attributes that uniquely identifies each tuple in a relation.

o {EmployeeID} is a super key.

o {EmployeeID, Email} is also a super key, though it's not minimal.

• Candidate Key: A minimal super key, i.e., a set of attributes that uniquely identifies a tuple
and from which no attribute can be removed without losing uniqueness.

o {EmployeeID} and {Email} are candidate keys (both can uniquely identify each
row).

o {EmployeeID, Email} is not a candidate key because it is not minimal.

• Primary Key: A candidate key chosen to uniquely identify tuples in a relation. It is a
special type of candidate key.

o In the "Employees" table, {EmployeeID} could be chosen as the primary key since it
uniquely identifies every employee.

• Foreign Key: An attribute or set of attributes in one relation that refers to the primary key
of another relation. It establishes a link between tables.

Example:

"Departments" table:

DeptID   DeptName
D1       HR
D2       IT

"Employees" table:

EmployeeID   Name    DeptID
101          Alice   D1
102          Bob     D2

o DeptID in the "Employees" table is a foreign key referring to DeptID in the
"Departments" table.

• Alternate Key: any candidate key that is not chosen as the primary key.

o If "EmployeeID" is the primary key in an "Employees" table, "Email" could be an
alternate key if it is guaranteed to be unique for each employee.

• Unique Key: similar to a candidate key in that it ensures uniqueness within a column. However,
a unique key column may hold NULL (whether one or several NULLs are allowed depends on the
DBMS), while primary key columns cannot hold NULL.

o Example: A "Phone Number" column might be a unique key, as some employees
might not have a phone number.

• Composite Key: a key consisting of multiple attributes combined to uniquely identify a
record.

o Example: In an "Orders" table, a combination of "CustomerID" and "OrderDate"
might be a composite key to uniquely identify each order.

• Surrogate Key: a special ID number or code given to a record in a database. It is created by
the system and has no real-world meaning; it is just a unique value that identifies each record.
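
The super key and candidate key definitions can be checked mechanically against sample rows. The sketch below is an illustration under stated assumptions: the e-mail addresses are placeholders (the originals are not shown in the table above), and the third row is hypothetical, added so that Name and Phone repeat. A real key is a property of the schema and must hold for every possible set of rows, so a check on sample data can only rule candidates out.

```python
from itertools import combinations

employees = [
    {"EmployeeID": 101, "Name": "Shaun", "Email": "shaun@example.com", "Phone": "9876543210"},
    {"EmployeeID": 102, "Name": "Soubeer", "Email": "soubeer@example.com", "Phone": "9903543102"},
    # Hypothetical row: same name and phone as employee 101.
    {"EmployeeID": 103, "Name": "Shaun", "Email": "shaun2@example.com", "Phone": "9876543210"},
]


def is_super_key(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows):
    """Minimal super keys: super keys that contain no smaller super key."""
    attributes = list(rows[0])
    found = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(key) <= set(attrs) for key in found):
                continue  # contains a smaller key, so it is not minimal
            if is_super_key(rows, attrs):
                found.append(attrs)
    return found


keys = candidate_keys(employees)
print(keys)
print(is_super_key(employees, ("EmployeeID", "Email")))  # super key, but not minimal
```
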

RELATIONAL CONSTRAINTS

Relational constraints are rules that must be enforced on the data within a relational database to
maintain data integrity. They ensure that the data is accurate, consistent, and meets the specific
requirements of the application.

Domain Constraints

• Domain constraints restrict the values that can be stored in an attribute to a specific set of
values, ensuring data quality by preventing invalid values from being entered.

o Example: specifying that an attribute must be of a certain data type (e.g., integer, date, string).

Key Constraints

• Key constraints ensure the uniqueness and referential integrity of data within and between
tables.

• Uniqueness: The primary key must be unique. No two rows can have the same primary
key.

o Example: In a "Students" table, each student should have a unique Student_ID.

• Referential Integrity: A foreign key in one table must either match a primary key in
another table or be left empty (null).

o Example: If a "Course" table has a foreign key that references the "Student" table,
every Student_ID in the "Course" table should match an existing Student_ID in the
"Student" table, or it can be empty (null) if no student is assigned.

Update Operations and Dealing with Constraint Violations

• Update Operations: Actions that modify data in the database, such as:

o Insert: Adding new tuples to a relation.

o Delete: Removing existing tuples from a relation.

o Update: Modifying the values of attributes in existing tuples.

• Dealing with Constraint Violations:

o Reject the operation: Prevent the update from occurring if it would violate a
constraint.

o Trigger actions: Perform actions (e.g., sending notifications) when a constraint is
violated.

o Cascade updates/deletes: Automatically update or delete related tuples in other
relations to maintain referential integrity.

1. Integrity Constraints

• Definition: Integrity constraints are rules that must be enforced on the data within a
relational database to maintain data accuracy, consistency, and adherence to the
application's specific requirements. They act as safeguards to prevent invalid or
inconsistent data from being entered or modified.

• Types:

o Domain Constraints: Restrict the values that can be stored in an attribute to a
specific set (e.g., data type, range, valid values).

o Key Constraints: Ensure the uniqueness and referential integrity of data:

▪ Primary Key: Uniquely identifies each row in a table.

▪ Unique Key: Ensures uniqueness within a column while permitting NULL (how many
NULLs are allowed depends on the DBMS).

▪ Foreign Key: Links tables by referencing the primary key of another table.

▪ Candidate Key: A minimal superkey (a set of attributes uniquely identifying a
tuple).

o Entity Integrity: Ensures that every table has a primary key and that no primary
key value is null.

o Referential Integrity: Maintains consistency between related tables by ensuring
that foreign key values either match a primary key value in the referenced table or
are null.
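
The key and referential integrity rules can be seen in action with SQLite. This is a minimal sketch: the table and column names are assumptions, the employee "Carol" is hypothetical, and foreign key enforcement has to be switched on explicitly in SQLite with a PRAGMA.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE department (dept_id TEXT PRIMARY KEY, dept_name TEXT)")
conn.execute(
    "CREATE TABLE employee ("
    " employee_id INTEGER PRIMARY KEY,"
    " name TEXT,"
    " dept_id TEXT REFERENCES department (dept_id))"
)
conn.executemany("INSERT INTO department VALUES (?, ?)", [("D1", "HR"), ("D2", "IT")])
conn.commit()


def attempt(sql, params):
    """Run one statement; report whether the DBMS accepted or rejected it."""
    try:
        with conn:
            conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


insert = "INSERT INTO employee VALUES (?, ?, ?)"
results = [
    attempt(insert, (101, "Alice", "D1")),  # valid row
    attempt(insert, (101, "Bob", "D2")),    # duplicate primary key value
    attempt(insert, (102, "Bob", "D9")),    # foreign key matches no department
    attempt(insert, (103, "Carol", None)),  # null foreign key is permitted
]
for line in results:
    print(line)
```
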

2. Update Operations

• Definition: Actions that modify data within the database:

o Insert: Adds new tuples (rows) to a relation.

o Delete: Removes existing tuples from a relation.

o Update: Modifies the values of attributes in existing tuples.

3. Dealing with Constraint Violations

When an update operation attempts to violate an integrity constraint:

• Reject the Operation: The database system prevents the update from occurring. This is the
most common and strict approach.

• Trigger Actions: Execute specific actions (e.g., send notifications, log the violation) when a
constraint is violated.

• Cascade Updates/Deletes: Automatically update or delete related tuples in other tables to
maintain referential integrity. This requires careful consideration to avoid unintended
consequences.
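
The "reject" and "cascade" strategies can be compared side by side in SQLite. In this sketch (table names are assumptions made for the example) one referencing table uses the default behaviour and the other declares ON DELETE CASCADE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE department (dept_id TEXT PRIMARY KEY)")
# Default behaviour: deleting a department that is still referenced is rejected.
conn.execute(
    "CREATE TABLE employee_strict ("
    " employee_id INTEGER PRIMARY KEY,"
    " dept_id TEXT REFERENCES department (dept_id))"
)
# Cascade: deleting a department also deletes the employees that reference it.
conn.execute(
    "CREATE TABLE employee_cascade ("
    " employee_id INTEGER PRIMARY KEY,"
    " dept_id TEXT REFERENCES department (dept_id) ON DELETE CASCADE)"
)
conn.executemany("INSERT INTO department VALUES (?)", [("D1",), ("D2",)])
conn.execute("INSERT INTO employee_strict VALUES (101, 'D1')")
conn.execute("INSERT INTO employee_cascade VALUES (102, 'D2')")
conn.commit()


def delete_department(dept_id):
    """Try to delete a department; return False if the DBMS rejects it."""
    try:
        with conn:
            conn.execute("DELETE FROM department WHERE dept_id = ?", (dept_id,))
        return True
    except sqlite3.IntegrityError:
        return False


d1_deleted = delete_department("D1")  # referenced by employee_strict
d2_deleted = delete_department("D2")  # referenced by employee_cascade
remaining = [r[0] for r in conn.execute("SELECT dept_id FROM department")]
strict_rows = conn.execute("SELECT COUNT(*) FROM employee_strict").fetchone()[0]
cascade_rows = conn.execute("SELECT COUNT(*) FROM employee_cascade").fetchone()[0]
print(d1_deleted, d2_deleted, remaining, strict_rows, cascade_rows)
```

The cascade silently removes employee 102 along with department D2, which is the kind of unintended consequence the note above warns about.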

4. Relational Operations

Relational operations manipulate relations (tables) to retrieve and modify data. Common
operations include:

• Select: Extracts a subset of tuples based on a condition.

• Project: Extracts a subset of attributes from a relation.

• Join: Combines tuples from two or more relations based on a matching condition.

• Union: Combines two relations into a single relation containing all unique tuples.

• Intersection: Creates a relation containing only tuples that exist in both input relations.

• Difference: Creates a relation containing tuples that exist in the first relation but not in the
second.
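
The six operations listed above can be sketched in plain Python, treating a relation as a list of dictionaries. The function names are chosen for this example, the join shown is a natural join (rows are matched on the attribute names both relations share), and the row for "Carol" is hypothetical.

```python
def select(relation, predicate):
    """Rows that satisfy the condition."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Keep only the named attributes, dropping duplicate result rows."""
    result = []
    for row in relation:
        reduced = {a: row[a] for a in attributes}
        if reduced not in result:
            result.append(reduced)
    return result


def natural_join(left, right):
    """Combine rows that agree on every attribute name the relations share."""
    result = []
    for lrow in left:
        for rrow in right:
            common = set(lrow) & set(rrow)
            if all(lrow[a] == rrow[a] for a in common):
                result.append({**lrow, **rrow})
    return result


def union(a, b):
    return a + [row for row in b if row not in a]


def intersection(a, b):
    return [row for row in a if row in b]


def difference(a, b):
    return [row for row in a if row not in b]


departments = [{"DeptID": "D1", "DeptName": "HR"}, {"DeptID": "D2", "DeptName": "IT"}]
alice = {"EmployeeID": 101, "Name": "Alice", "DeptID": "D1"}
bob = {"EmployeeID": 102, "Name": "Bob", "DeptID": "D2"}
carol = {"EmployeeID": 103, "Name": "Carol", "DeptID": "D1"}  # hypothetical row
employees = [alice, bob]
new_hires = [bob, carol]

it_staff = select(employees, lambda row: row["DeptID"] == "D2")
names = project(employees, ["Name"])
joined = natural_join(employees, departments)
everyone = union(employees, new_hires)
in_both = intersection(employees, new_hires)
only_old = difference(employees, new_hires)
print(joined)
```
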

Key Points:

• Integrity constraints are essential for maintaining data accuracy and consistency.

• Update operations can potentially violate constraints, requiring careful handling to ensure
data integrity.

• Relational operations provide the foundation for querying and manipulating data in
relational databases.

ER MODELLING
1) What is an Entity?

• Definition: An entity is any object or thing in the real world that can be distinctly identified.
It can be a person, place, object, or event.

o Example: In a school database, Student and Teacher are entities. Each student or teacher
can be uniquely identified by their ID.

2) What is an Attribute?

• An attribute is a property or characteristic of an entity. It helps describe the entity in more
detail.

• Example: For a Student entity, attributes could be Name, Age, Grade, Address.

3) What is a Relationship?

• A relationship describes how two or more entities are related to each other.

• Example: A Student is enrolled in a Course. This is a relationship between the Student and
Course entities.

More about Entities and Relationships

4) Types of Entities

• Strong Entity: An entity that can exist independently, without relying on other entities.

o Example: A Student entity can exist on its own, having its own unique student ID.

• Weak Entity: An entity that depends on a strong entity to exist. It cannot be uniquely
identified by its own attributes.

o Example: A Course Enrollment entity might depend on both the Student and
Course entities for its identity.

5) Types of Relationships

• One-to-One (1:1): One entity in a table is related to one entity in another table.

o Example: A Student might have one StudentID card.

• One-to-Many (1:M): One entity in a table can be related to multiple entities in another
table.

o Example: A Teacher can teach many Courses, but each Course is taught by only
one Teacher.

• Many-to-Many (M:M): Multiple entities in one table are related to multiple entities in
another table.

o Example: Students can enroll in many Courses, and each Course can have many
Students.

Conversion of E-R Diagram to Relational Database

How to Convert an E-R Diagram to a Relational Database?

• Step 1: Convert entities to tables: Each entity becomes a table in the database.

o Example: The Student entity becomes the Student table.

• Step 2: Convert attributes to columns: Each attribute of the entity becomes a column in
the table.

o Example: The Student table will have columns like StudentID, Name, and Age.

• Step 3: Convert relationships to foreign keys: For relationships, we add foreign keys in
the related tables to show how they are connected. For a one-to-many relationship, the foreign
key goes in the table on the "many" side; a many-to-many relationship needs a separate table.

o Example: In a Student-Course relationship (many-to-many), we create a new table
like Student_Course, with foreign keys to both Student and Course.
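
The three conversion steps can be sketched as an SQLite schema. The Student columns follow the example above; the Course columns (CourseID, Title) and all sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Steps 1 and 2: each entity becomes a table, each attribute a column.
CREATE TABLE Student (
    StudentID INTEGER PRIMARY KEY,
    Name      TEXT NOT NULL,
    Age       INTEGER
);
CREATE TABLE Course (
    CourseID TEXT PRIMARY KEY,
    Title    TEXT NOT NULL
);
-- Step 3: the many-to-many relationship becomes its own table whose
-- composite primary key is made of two foreign keys.
CREATE TABLE Student_Course (
    StudentID INTEGER NOT NULL REFERENCES Student (StudentID),
    CourseID  TEXT    NOT NULL REFERENCES Course (CourseID),
    PRIMARY KEY (StudentID, CourseID)
);
INSERT INTO Student VALUES (201, 'Shounak', 21), (202, 'Parthib', 21);
INSERT INTO Course VALUES ('C1', 'Databases'), ('C2', 'Networks');
INSERT INTO Student_Course VALUES (201, 'C1'), (201, 'C2'), (202, 'C1');
""")

enrollments = conn.execute(
    "SELECT s.Name, c.Title FROM Student_Course sc"
    " JOIN Student s ON s.StudentID = sc.StudentID"
    " JOIN Course c ON c.CourseID = sc.CourseID"
    " ORDER BY s.Name, c.Title"
).fetchall()
print(enrollments)
```
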

Why is the ER Model Used?

The Entity-Relationship (ER) Model is used to visually represent and design databases. It helps
in organizing data, identifying how different pieces of information are related, and creating a
structure for the database. This makes it easier to understand, manage, and work with complex
data systems.

Main Reasons for Using the ER Model:

1. Database Design: ER modeling helps in designing databases by clearly defining the
entities, their attributes, and the relationships between them.

2. Simplifies Complex Data: It breaks down complex data into smaller, manageable parts,
making it easier to understand the structure.

3. Communication: It is an easy way to communicate between technical and non-technical
people (like developers, analysts, and business stakeholders).

4. Data Integrity: It helps to define rules for data consistency, ensuring the database
functions correctly.

BENEFITS :

Clear Structure: Provides a visual map of data, showing entities and their relationships, making it
easier to understand and design the database.

Improves Data Integrity: Ensures rules like uniqueness and consistency are enforced, keeping
the data accurate and reliable.

Reduces Redundancy: Helps avoid duplicate data by organizing it into related entities and
defining relationships.

Easy Communication: Simplifies discussion between technical and non-technical stakeholders
through easy-to-understand diagrams.

Simplifies Database Design: Clearly defines what data to store, how to structure it, and how
tables are related, leading to better database organization.

Easier Modifications: Allows easy addition of new features or changes without disrupting the
existing structure.

Faster Development: Speeds up database creation by providing a clear guide for developers.

Improved Query Design: Makes it easier to design efficient queries by showing how data is
related across tables.
