| Basis | DBMS | File system |
|---|---|---|
| Sharing of data | Due to the centralized approach, data sharing is easy. | Data is distributed across many files, possibly in different formats, so it isn't easy to share. |
| Data abstraction | DBMS gives an abstract view of data that hides the details. | The file system exposes the details of data representation and storage. |
| Security and protection | DBMS provides a good protection mechanism. | It isn't easy to protect a file under the file system. |
| Recovery mechanism | DBMS provides a crash recovery mechanism, i.e., it protects the user from system failure. | The file system has no crash recovery mechanism, i.e., if the system crashes while some data is being entered, the content of the file may be lost. |
| Manipulation techniques | DBMS contains a wide variety of sophisticated techniques to store and retrieve data. | The file system can't store and retrieve data efficiently. |
| Concurrency problems | DBMS takes care of concurrent access to data using some form of locking. | Concurrent access causes many problems, such as redirecting the file while another user is deleting or updating some information. |
| Where to use | The database approach is used in large systems that interrelate many files. | The file system approach is suited to small systems with few, largely independent files. |
| Cost | The database system is expensive to design. | The file system approach is cheaper to design. |
| Data redundancy and inconsistency | Due to the centralization of the database, the problems of data redundancy and inconsistency are controlled. | Files and application programs are created by different programmers, so there is a lot of duplicated data, which may lead to inconsistency. |
| Structure | The database structure is complex to design. | The file system approach has a simple structure. |
| Data independence | Data independence exists, in two forms: logical data independence and physical data independence. | There is no data independence. |
| Data models | Three types of data models exist: hierarchical, network, and relational. | There is no concept of data models. |
| Flexibility | Changes to the stored data are often necessary, and they are easier to make with a database approach. | The system is less flexible than the DBMS approach. |
**Advantages of DBMS**:
1. **Data Centralization**:
- DBMS centralizes data storage, providing a single source of truth for the
organization's data.
- This reduces data redundancy and inconsistency, as there is only one copy of
each data item stored in the database.
3. **Data Security**:
- DBMS provides advanced security features such as access control,
authentication, encryption, and auditing to protect data from unauthorized
access, manipulation, or disclosure.
- It allows administrators to define user roles and permissions, restricting
access to sensitive data.
4. **Data Accessibility**:
- DBMS offers powerful query languages (e.g., SQL) and reporting tools for
retrieving, analyzing, and presenting data in various formats and contexts.
- It supports concurrent access by multiple users or applications, enabling
efficient data sharing and collaboration.
5. **Scalability and Performance**:
- Database management systems are designed to handle large volumes of data and
support high levels of concurrency and throughput.
- They offer optimized data storage and retrieval mechanisms for efficient
performance, even as the database grows in size and complexity.
**Disadvantages of DBMS**:
5. **Vendor Lock-In**:
- Organizations may face vendor lock-in when using proprietary DBMS
solutions, limiting their flexibility to switch to alternative vendors or migrate to
open-source solutions.
- This dependency on a specific vendor can result in higher costs, limited
customization options, and potential compatibility issues with other systems.
- **Users**: These are individuals or applications that interact with the DBMS
to perform various operations such as querying, updating, and managing data.
Users can be categorized into different roles based on their level of access and
privileges, such as administrators, database designers, and end-users.
The architecture of a DBMS defines the overall design and structure of the
system, including how its various components interact with each other. A
typical DBMS architecture consists of three main layers:
1. **Relational Model**:
- The relational model organizes data into tables (relations) consisting of rows
(tuples) and columns (attributes).
- It establishes relationships between tables using keys, such as primary keys
and foreign keys.
- SQL (Structured Query Language) is commonly used to manipulate and
query relational databases.
4. **Hierarchical Model**:
- The hierarchical model organizes data in a tree-like structure, with parent-
child relationships between data elements.
- It is characterized by a one-to-many relationship between parent and child
records.
- This model is used in hierarchical databases such as IBM IMS; XML
(eXtensible Markup Language) documents follow the same tree structure.
5. **Network Model**:
- The network model extends the hierarchical model by allowing many-to-
many relationships between records.
- It represents data as a collection of records connected in a network-like
structure.
- This model offers more flexibility in representing complex relationships but
can be more difficult to implement and query compared to the relational
model.
6. **Object-Relational Model**:
- The object-relational model combines features of both the relational and
object-oriented models.
- It allows for the storage of complex data types and supports inheritance,
encapsulation, and polymorphism.
- This model bridges the gap between relational databases and object-
oriented programming languages.
7. **Dimensional Model**:
- The dimensional model is used in data warehousing and OLAP (Online
Analytical Processing) systems.
- It organizes data into fact tables containing measures and dimension tables
containing descriptive attributes.
- This model is optimized for analyzing large volumes of data and performing
complex queries for decision support.
8. **Graph Model**:
- The graph model represents data as nodes (entities) and edges
(relationships) in a graph structure.
- It is suitable for modeling complex networks and relationships found in
social networks, recommendation systems, and network analysis.
Each of these models has its own strengths and weaknesses, and the choice of
model depends on factors such as the nature of the data, the requirements of
the application, and the preferences of the database designers.
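
To make one of these models concrete, here is a minimal sketch of the dimensional model described above, built with Python's standard `sqlite3` module. The table names (`dim_product`, `fact_sales`), the columns, and the sample rows are all invented for illustration; a real data warehouse would have many more dimensions and far more data.

```python
import sqlite3

# In-memory database; all table names and rows are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension table: descriptive attributes
    CREATE TABLE dim_product (
        product_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        category   TEXT NOT NULL
    );
    -- Fact table: numeric measures plus a key into the dimension
    CREATE TABLE fact_sales (
        sale_id    INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
        quantity   INTEGER NOT NULL,
        revenue    REAL NOT NULL
    );
""")
conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [(1, "Laptop", "Electronics"),
     (2, "Phone", "Electronics"),
     (3, "Desk", "Furniture")],
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
    [(1, 1, 2, 2000.0), (2, 2, 5, 2500.0),
     (3, 3, 1, 300.0), (4, 1, 1, 1000.0)],
)


def revenue_by_category(connection):
    """Aggregate the measures in the fact table by a dimension attribute."""
    return connection.execute("""
        SELECT d.category, SUM(f.quantity), SUM(f.revenue)
        FROM fact_sales AS f
        JOIN dim_product AS d ON d.product_id = f.product_id
        GROUP BY d.category
        ORDER BY d.category
    """).fetchall()


print(revenue_by_category(conn))
# [('Electronics', 8, 5500.0), ('Furniture', 1, 300.0)]
```

The query joins the fact table to a dimension table and groups by a descriptive attribute, which is the typical shape of a decision-support query in this model.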
In the context of databases, a "key" refers to a specific attribute or combination
of attributes that uniquely identifies a record (row) within a table (relation).
Keys are essential for maintaining data integrity and enforcing constraints
within a database. There are several types of keys commonly used in database
management systems (DBMS). Here are some key types:
1. **Primary Key**:
- A primary key is a unique identifier for each record in a table.
- It must contain unique values and cannot have NULL values.
- Every table in a database should have a primary key; most DBMSs index it
automatically, so it typically serves as the main index for the table.
- Example: EmployeeID in an Employee table.
2. **Composite Key**:
- A composite key is a key composed of multiple attributes (columns) that,
when combined, uniquely identify a record.
- It is used when no single attribute can uniquely identify a record on its own.
- Example: Combination of EmployeeID and DepartmentID in an Employee-
Department table.
3. **Foreign Key**:
- A foreign key is an attribute or set of attributes in one table that refers to
the primary key in another table.
- It establishes a relationship between two tables, known as a parent-child
relationship.
- Foreign keys help maintain referential integrity and enforce constraints
between related tables.
- Example: DepartmentID in an Employee table, referencing the
DepartmentID primary key in a Department table.
4. **Alternate Key**:
- An alternate key is a candidate key that is not chosen as the primary key.
- Like every candidate key it is unique, so it can serve as an alternate means of identifying records.
- Example: Email address in a User table, if the primary key is UserID.
5. **Candidate Key**:
- A candidate key is an attribute or set of attributes that can uniquely identify
a record within a table.
- It satisfies the uniqueness and minimality constraints required for a primary
key.
- Example: Both EmployeeID and Social Security Number (SSN) could serve as
candidate keys in an Employee table.
6. **Super Key**:
- A super key is a set of attributes that uniquely identifies a record within a
table.
- It may contain more attributes than necessary to uniquely identify records.
- Example: Combination of EmployeeID, FirstName, and LastName in an
Employee table.
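
The key types above can be tried out with a small sketch using Python's standard `sqlite3` module. The schema reuses the Employee and Department examples from the list; the `Email` column, the `EmployeeDepartment` table name, the `try_insert` helper, and the sample values are assumptions made for illustration. Note that SQLite enforces foreign keys only when `PRAGMA foreign_keys` is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
    CREATE TABLE Department (
        DepartmentID INTEGER PRIMARY KEY,          -- primary key
        Name         TEXT NOT NULL
    );
    CREATE TABLE Employee (
        EmployeeID   INTEGER PRIMARY KEY,          -- primary key
        Email        TEXT NOT NULL UNIQUE,         -- alternate key
        DepartmentID INTEGER NOT NULL
                     REFERENCES Department(DepartmentID)  -- foreign key
    );
    CREATE TABLE EmployeeDepartment (
        EmployeeID   INTEGER NOT NULL REFERENCES Employee(EmployeeID),
        DepartmentID INTEGER NOT NULL REFERENCES Department(DepartmentID),
        PRIMARY KEY (EmployeeID, DepartmentID)     -- composite key
    );
""")


def try_insert(sql, params):
    """Return True if the row is accepted, False if a key constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


emp = "INSERT INTO Employee VALUES (?, ?, ?)"
results = {
    "department": try_insert("INSERT INTO Department VALUES (?, ?)", (10, "Sales")),
    "employee": try_insert(emp, (1, "ann@example.com", 10)),
    "duplicate primary key": try_insert(emp, (1, "bob@example.com", 10)),
    "duplicate alternate key": try_insert(emp, (2, "ann@example.com", 10)),
    "dangling foreign key": try_insert(emp, (3, "cy@example.com", 99)),
    "composite key": try_insert("INSERT INTO EmployeeDepartment VALUES (?, ?)", (1, 10)),
    "duplicate composite key": try_insert("INSERT INTO EmployeeDepartment VALUES (?, ?)", (1, 10)),
}
for label, accepted in results.items():
    print(f"{label}: {'accepted' if accepted else 'rejected'}")
```

Only the first valid row of each kind is accepted; the duplicates and the employee pointing at a non-existent department are rejected by the DBMS itself, without any checking code in the application.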
1. **Selection (σ)**:
- The selection operation selects rows from a relation (table) that satisfy a
specified condition or predicate.
- It is denoted by the σ symbol and takes the form
σ<sub>condition</sub>(relation).
2. **Projection (π)**:
- The projection operation selects specific columns (attributes) from a relation
while discarding the rest.
- It is denoted by the π symbol and takes the form π<sub>attribute1,
attribute2, ...</sub>(relation).
3. **Union (∪)**:
- The union operation combines the rows of two relations into a single
relation, eliminating duplicate rows.
- It is denoted by the ∪ symbol and takes the form relation<sub>1</sub> ∪
relation<sub>2</sub>.
4. **Intersection (∩)**:
- The intersection operation returns the common rows between two
relations.
- It is denoted by the ∩ symbol and takes the form relation<sub>1</sub> ∩
relation<sub>2</sub>.
5. **Difference (−)**:
- The difference operation returns the rows that are present in one relation
but not in the other.
- It is denoted by the − symbol and takes the form relation<sub>1</sub> −
relation<sub>2</sub>.
7. **Join (⋈)**:
- The join operation combines rows from two relations based on a common
attribute or condition.
- It is denoted by the ⋈ symbol and takes the form relation<sub>1</sub>
⋈<sub>condition</sub> relation<sub>2</sub>.
8. **Division (÷)**:
- The division operation returns rows from one relation that are related to all
rows of another relation.
- It is denoted by the ÷ symbol and takes the form relation<sub>1</sub> ÷
relation<sub>2</sub>. It is used less often than the other operations.
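
As an illustration of the operations listed above, the following sketch implements them in plain Python, representing a relation as a list of dictionaries (one dictionary per row). The function names and the sample relations are invented for this example, and the code ignores details such as checking that both inputs to union, intersection, and difference have the same attributes.

```python
def _dedupe(rows):
    """Relations are sets, so drop duplicate rows (keeping first occurrences)."""
    seen, out = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out


def select(relation, predicate):            # σ
    return [row for row in relation if predicate(row)]


def project(relation, attributes):          # π
    return _dedupe({a: row[a] for a in attributes} for row in relation)


def union(r1, r2):                          # ∪
    return _dedupe(r1 + r2)


def intersection(r1, r2):                   # ∩
    return [row for row in r1 if row in r2]


def difference(r1, r2):                     # −
    return [row for row in r1 if row not in r2]


def join(r1, r2, condition):                # ⋈
    return [{**a, **b} for a in r1 for b in r2 if condition(a, b)]


def divide(r1, r2):                         # ÷
    """Rows of r1 (minus r2's attributes) that are paired with every row of r2."""
    divisor_attrs = set(r2[0])
    quotient_attrs = [a for a in r1[0] if a not in divisor_attrs]
    candidates = project(r1, quotient_attrs)
    return [c for c in candidates if all({**c, **d} in r1 for d in r2)]


# Hypothetical sample relations
employees = [{"emp": "Ann", "dept": "Sales"},
             {"emp": "Bob", "dept": "IT"},
             {"emp": "Cy", "dept": "Sales"}]
departments = [{"dept": "Sales", "floor": 1}, {"dept": "IT", "floor": 2}]
completed = [{"emp": "Ann", "course": "SQL"},
             {"emp": "Ann", "course": "Python"},
             {"emp": "Bob", "course": "SQL"}]
required = [{"course": "SQL"}, {"course": "Python"}]

print(select(employees, lambda r: r["dept"] == "Sales"))
print(project(employees, ["dept"]))
print(join(employees, departments, lambda a, b: a["dept"] == b["dept"]))
print(divide(completed, required))  # employees who completed every required course
```

Division is the least intuitive of these: here it returns only Ann, because she is the only employee paired with every row of `required`.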
1. **ACID Properties**:
- ACID is an acronym that stands for Atomicity, Consistency, Isolation, and
Durability. These properties ensure the reliability and integrity of transactions
in a database system:
- **Atomicity**: Transactions are atomic, meaning they are either
completed successfully in their entirety or aborted with no partial changes
applied to the database. Atomicity ensures that transactions are indivisible and
all-or-nothing.
- **Consistency**: Transactions maintain the consistency of the database by
transforming it from one consistent state to another consistent state.
Consistency ensures that the database remains valid and adheres to predefined
constraints and integrity rules.
- **Isolation**: Transactions execute independently of each other, as if they
were executed sequentially, even when executed concurrently. Isolation
prevents interference between transactions and ensures data integrity and
correctness.
- **Durability**: Once a transaction commits and its changes are applied to
the database, they persist even in the event of a system failure. Durability
guarantees that committed transactions survive system crashes or failures and
are not lost.
2. **Database Recovery**:
- Database recovery ensures that the ACID properties are maintained even
after a system failure or crash. The recovery process typically involves the
following steps:
- **Backup**: Regularly back up the database to preserve a copy of the data
in case of failure.
- **Transaction Logging**: Log all changes made by transactions to a
transaction log before applying them to the database. Transaction logs record
the sequence of operations performed by transactions.
- **Checkpointing**: Periodically create checkpoints that save the current
state of the database to disk and record the corresponding position in the
transaction log, so that recovery can start from the latest checkpoint instead
of replaying the entire log.
- **Rollback and Rollforward**: During recovery, analyze the transaction log
to identify incomplete or partially applied transactions. Roll back incomplete
transactions to undo their effects on the database. Roll forward committed
transactions by reapplying their changes from the transaction log to restore the
database to a consistent state.
- **Redo and Undo Logs**: Redo logs contain information about committed
transactions that need to be reapplied during recovery, while undo logs contain
information about transactions that need to be rolled back.
- **Crash Recovery**: When a system failure occurs, initiate crash recovery
to restore the database to a consistent state using the information from the
transaction log and checkpoints.
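
The rollback and rollforward steps above can be sketched as a toy recovery routine in Python. This is a deliberately simplified model, not how any particular DBMS implements recovery: the log record layout, the `recover` function, and the sample data are all assumptions made for illustration, checkpoints are not modeled, and the sketch assumes that no two unfinished transactions wrote the same item.

```python
# Hypothetical log record shapes:
#   ("BEGIN", txn)
#   ("WRITE", txn, key, old_value, new_value)
#   ("COMMIT", txn)

def recover(disk_state, log):
    """Return a consistent database state from the on-disk state and the log."""
    committed = {rec[1] for rec in log if rec[0] == "COMMIT"}
    db = dict(disk_state)

    # Roll forward (redo): reapply writes of committed transactions in log order,
    # in case their changes had not reached disk before the crash.
    for rec in log:
        if rec[0] == "WRITE" and rec[1] in committed:
            _, _, key, _old, new = rec
            db[key] = new

    # Roll back (undo): scan the log backwards and restore the old values
    # written by transactions that never committed.
    for rec in reversed(log):
        if rec[0] == "WRITE" and rec[1] not in committed:
            _, _, key, old, _new = rec
            db[key] = old

    return db


# Before the crash: A=100, B=100, C=10.
# T1 moves 50 from A to B and commits; T2 changes C but never commits.
log = [
    ("BEGIN", "T1"),
    ("WRITE", "T1", "A", 100, 50),
    ("WRITE", "T1", "B", 100, 150),
    ("BEGIN", "T2"),
    ("WRITE", "T2", "C", 10, 99),
    ("COMMIT", "T1"),
]

# State found on disk after the crash: T1's update to B was never flushed,
# while T2's uncommitted update to C was.
disk_after_crash = {"A": 50, "B": 100, "C": 99}

recovered = recover(disk_after_crash, log)
print(recovered)  # {'A': 50, 'B': 150, 'C': 10}
```

After recovery, the committed transfer by T1 is fully present (durability) and the partial work of T2 has been removed (atomicity), which is the consistent state the recovery process aims for.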