UNIT 1 - 2 DBMS Notes
UNIT-1
1. What is Data?
Definition: Data refers to raw facts and figures without context. It can be anything
that is collected and has the potential to be processed into meaningful information.
Examples: Numbers, dates, names, or measurements.
Importance: Data is essential for decision-making and analysis in modern
applications and organizations.
2. What is a Database?
DBMS: A database management system is a collection of programs that enables users to
create and maintain a database. In other words, it is general-purpose software that lets
users define, construct, and manipulate databases for various applications.
Simplified DBMS
1) Users/programmers
2) Application programs/queries
3) Software to process queries/programs
4) Software to access stored data
5) DBMS catalog containing the stored database definition (metadata)
6) The physically stored database
Reduced Data Redundancy: A DBMS centralizes data, reducing the need for
repeated storage of the same data in multiple locations.
Improved Data Integrity: Ensures data accuracy and consistency using rules and
constraints.
Data Independence: Applications are insulated from changes in data structure
(logical and physical independence).
Centralized Management: The DBMS centralizes data storage and access,
simplifying management.
Security: The DBMS provides strong mechanisms for user authentication and
authorization.
Brief explanation (problems of file-based storage that a DBMS addresses):
Data duplication and inconsistency: Different programmers create files and programs
over time, often using different structures and programming languages. This can lead to the
same data being stored in multiple places. For example, if a student has two majors, their
contact details might be saved in separate files for each department. If the student’s address
changes and only one file is updated, the data becomes inconsistent. This also increases
storage costs.
Data isolation: Data is stored in different files and formats, making it hard to retrieve
specific information. Writing new programs to access this scattered data is challenging.
Integrity issues: Some rules must be followed when storing data, such as ensuring that a
department’s account balance never goes below zero. These rules are enforced through
coding in different programs. However, if new rules are needed, updating multiple programs
can be difficult, especially when different data sources are involved.
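Atomicity problems: Computer systems can fail, and when they do, the data must be left
accurate and complete. An operation such as a funds transfer must happen in its entirety or
not at all; if a failure occurs after one account has been changed but before the other has
been changed, the data is left inconsistent. This all-or-nothing guarantee is hard to provide
with ordinary file processing.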
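The atomicity and integrity points above can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the `account` table, the `transfer` and `balances` helpers, and the sample balances are all hypothetical. The CHECK constraint stands in for the rule that a balance never goes below zero, and the transaction ensures a failed transfer leaves no partial change behind.

```python
import sqlite3

# Hypothetical schema: one table, with the "never below zero" rule as a CHECK constraint.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes the block
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the credit was undone as well.
        return False


def balances(conn):
    """Current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account"))
```

Calling `transfer(conn, "A", "B", 500)` credits B first and then fails on the debit, so the whole transfer is rolled back and both balances stay as they were.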
OR
Responsibilities:
o Database Security: Ensuring only authorized users can access sensitive data.
o Performance Tuning: Optimizing query performance and managing
resources.
o Backup and Recovery: Creating backup strategies to recover data in case of
failure.
o Database Maintenance: Ensuring the database is running smoothly, including
updates and patches.
Skills Required: Knowledge of database software, system administration,
programming, and troubleshooting.
Responsibilities:
o Database Modeling: Designing the structure of the database using techniques
like Entity-Relationship (ER) diagrams.
o Schema Design: Defining the tables, columns, and relationships in a database.
o Normalization: Ensuring the database is free from redundancy and maintains
data integrity.
o Business Requirements Analysis: Understanding the needs of users and
translating them into database structures.
Skills Required: Understanding of business processes, data modeling, and database
theory.
Data Sharing: Multiple users can access the database concurrently without conflicts.
Data Consistency and Integrity: DBMS ensures that the data remains consistent,
accurate, and valid through constraints and validation rules.
Reduced Redundancy: Storing data in a central location reduces duplicate entries.
Improved Security: Data is stored securely, with access controlled by user
authentication.
Backup and Recovery: Provides mechanisms to protect and recover data in case of
failure.
Better Data Management: Supports complex querying, reporting, and data analysis.
Simple Applications: For applications with minimal data storage and no need for
complex data relationships (e.g., personal projects, small-scale applications).
High Performance Requirements: When the overhead of using a DBMS slows
down performance, such as in real-time systems with low-latency requirements.
Cost Considerations: The cost of setting up and maintaining a DBMS may not be
justified for small, simple systems.
Lack of Need for Complex Queries: If the application doesn’t require complex data
retrieval, DBMS may be unnecessary.
OR
While Database Management Systems (DBMS) offer numerous advantages, there are
scenarios where using a DBMS might not be the best choice. Here are some situations
where a DBMS may not be appropriate:
- In systems where performance and latency are critical (e.g., real-time systems,
gaming, or high-frequency trading), the overhead of a DBMS (e.g., query parsing,
transaction management) might introduce unacceptable delays. In-memory data
structures or custom storage solutions may be more suitable.
3. Limited Resources
- If the system has limited computational resources (e.g., embedded systems, IoT
devices), a DBMS might consume too much memory, processing power, or storage.
Lightweight alternatives like SQLite or file-based storage may be more appropriate.
- For data that is short-lived or only needed temporarily (e.g., cache data, session
data), a DBMS might add unnecessary complexity. In-memory stores like Redis or
simple file storage could be more efficient.
6. Cost Constraints
7. Lack of Expertise
- If the team lacks the expertise to design, manage, and optimize a DBMS, using one
could lead to poor performance, security vulnerabilities, or data integrity issues. In such
cases, simpler solutions might be more manageable.
- For applications that are single-user or standalone (e.g., a personal diary app), the
advanced features of a DBMS (e.g., concurrency control, multi-user support) are
unnecessary. A local file system or lightweight database might be sufficient.
- For experimental or prototype projects where the data model is not yet stable, using
a DBMS might slow down development. Flexible, schema-less storage options (e.g.,
NoSQL databases or flat files) might be more appropriate.
10. High Data Volatility
- If the data changes very frequently and requires constant schema updates, a DBMS
might not be the best choice. Schema-less databases or file-based systems might offer
more flexibility.
UNIT-2
A data model is a collection of conceptual tools for describing data, the relationships
among data, data semantics, and consistency constraints. The main data models are:
1. Relational Model – This model organizes data into tables (also called relations),
where each row represents a record, and each column represents an attribute. It is the
most commonly used model in databases today.
2. Entity-Relationship (E-R) Model – This model represents data using entities (real-
world objects) and their relationships. It is mainly used in database design.
What is an Entity?
An Entity may be an object with a physical existence – a particular person, car, house, or
employee – or it may be an object with a conceptual existence – a company, a job, or a
university course.
Entity Set
An entity set is a collection of entities of the same type. An ER diagram can represent an
entity set, but not an individual entity: an individual entity corresponds to a row in the
relation, whereas an ER diagram is a graphical representation of how the data is structured.
Types of Entity
There are two types of entity:
1. Strong Entity
A Strong Entity is a type of entity that has a key attribute. A Strong Entity does not depend
on any other entity in the schema. It has a primary key that identifies it uniquely, and it
is represented by a rectangle. Such entity types are called Strong Entity Types.
2. Weak Entity
Most entity types have a key attribute that uniquely identifies each entity in the entity set.
Some entity types, however, have no key attribute of their own. These are called Weak
Entity Types.
For example, a company may store information about the dependents (parents, children,
spouse) of an employee, but a dependent cannot exist without the employee. So Dependent
is a Weak Entity Type, and Employee, a Strong Entity Type, is the identifying entity type
for Dependent.
A weak entity type is represented by a double rectangle. The participation of a weak entity
type in its identifying relationship is always total. The relationship between the weak entity
type and its identifying strong entity type is called the identifying relationship, and it is
represented by a double diamond.
What are Attributes?
Attributes are the properties that define the entity type. For example, Roll_No, Name, DOB,
Age, Address, and Mobile_No are the attributes that define the entity type Student. In an ER
diagram, an attribute is represented by an oval.
[Figure: Attribute]
Types of Attributes
1. Key Attribute
The attribute that uniquely identifies each entity in the entity set is called the key
attribute. For example, Roll_No will be unique for each student. In an ER diagram, the key
attribute is represented by an oval with the attribute name underlined.
[Figure: Key Attribute]
2. Composite Attribute
An attribute composed of several other attributes is called a composite attribute. For
example, the Address attribute of the Student entity type consists of Street, City, State, and
Country. In an ER diagram, a composite attribute is represented by an oval connected to the
ovals of its component attributes.
[Figure: Composite Attribute]
3. Multivalued Attribute
An attribute that can hold more than one value for a given entity is called a multivalued
attribute. For example, Phone_No (there can be more than one for a given student). In an ER
diagram, a multivalued attribute is represented by a double oval.
[Figure: Multivalued Attribute]
4. Derived Attribute
An attribute that can be derived from other attributes of the entity type is known as a derived
attribute. For example, Age can be derived from DOB. In an ER diagram, a derived attribute
is represented by a dashed oval.
[Figure: Derived Attribute]
The Complete Entity Type Student with its Attributes can be represented as:
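As a rough code analogue of the attribute kinds described above (not a substitute for the ER diagram), the Student entity type might be sketched as a Python dataclass. The class names, field names, and sample values below are illustrative assumptions, not part of the notes.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Address:
    """Composite attribute: made up of simpler component attributes."""
    street: str
    city: str
    state: str
    country: str


@dataclass
class Student:
    roll_no: int        # key attribute: unique for each student
    name: str
    dob: date
    address: Address    # composite attribute
    phone_nos: list[str] = field(default_factory=list)  # multivalued attribute

    def age(self, today: date) -> int:
        """Derived attribute: computed from dob rather than stored."""
        had_birthday = (today.month, today.day) >= (self.dob.month, self.dob.day)
        return today.year - self.dob.year - (0 if had_birthday else 1)


# Made-up sample entity (one member of the Student entity set).
s = Student(
    roll_no=1,
    name="Asha",
    dob=date(2004, 5, 20),
    address=Address("1 Main St", "Pune", "MH", "India"),
    phone_nos=["555-0101", "555-0102"],
)
```

Age is a method rather than a stored field, which mirrors the idea that a derived attribute is calculated from other attributes when needed.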
3. Object-Based Model – This model is based on object-oriented programming and
combines features of the relational and E-R models. It allows data to be stored as
objects with methods (functions) and identities.
class Student {
public:
    void search();   // member function (method) of the class
    void update();   // member function (method) of the class
};
Here, Student is a class, and objects like S1, S2 can be created from it in the main function.
4. Semistructured Model – Unlike the previous models, this allows data items of the
same type to have different attributes. XML is a common format for semistructured
data.
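A small sketch of what "data items of the same type with different attributes" can look like in XML, parsed with Python's standard xml.etree.ElementTree module. The element names and values are made-up sample data.

```python
import xml.etree.ElementTree as ET

# Made-up sample data: both elements are <student>, but their attributes differ.
doc = """
<students>
  <student>
    <name>Asha</name>
    <phone>555-0101</phone>
  </student>
  <student>
    <name>Ravi</name>
    <email>ravi@example.com</email>
    <major>Physics</major>
  </student>
</students>
"""

root = ET.fromstring(doc)

# For each <student>, collect the tags of the child elements it actually has.
fields_per_student = [
    sorted(child.tag for child in student) for student in root.findall("student")
]
```

The two students end up with different field lists, which a fixed relational table could only accommodate with NULLs or extra tables.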
Characteristics:
Does not follow a strict format like traditional databases but has some structure.
Cannot be stored in rows and columns like relational databases.
Uses tags and metadata to organize and describe data.
Data is arranged in a hierarchy, grouping similar entities together.
Entities in the same group may have different attributes.
Lacks enough metadata, making automation and management difficult.
The size and type of attributes in the same group may vary.
Not easily processed by computer programs due to its flexible structure.
Sources of semistructured data:
Emails
XML and markup languages (e.g., HTML)
Binary files
Network data (TCP/IP packets)
Compressed files (ZIP, RAR)
Data from multiple sources
Web pages
Advantages:
Older models, such as the network and hierarchical models, were more complex and are
now rarely used, except in older systems.
Network Model
The network model is a type of database model where data is organized using records
(nodes) and relationships (edges) in a graph-like structure. It allows complex relationships
between data and supports many-to-many relationships, unlike the hierarchical model,
which follows a strict parent-child structure.
Example:
In the network model, both students and courses can have multiple connections, unlike the
hierarchical model, which would force a single-parent structure.
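The many-to-many idea can be sketched with plain Python dictionaries acting as the graph. The student and course names are made-up sample records; a real network-model DBMS would maintain these links as record pointers rather than dictionaries.

```python
# Made-up sample records; each pair is an edge between a student and a course.
enrollments = [
    ("Asha", "DBMS"),
    ("Asha", "Networks"),
    ("Ravi", "DBMS"),
]

courses_of = {}   # student -> set of courses the student takes
students_of = {}  # course  -> set of students enrolled in it

for student, course in enrollments:
    courses_of.setdefault(student, set()).add(course)
    students_of.setdefault(course, set()).add(student)
```

Here Asha is linked to two courses and DBMS is linked to two students, so neither side is restricted to a single "parent".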
Advantages:
Disadvantages:
Hierarchical Model
The Hierarchical Model is a type of database model where data is organized in a tree-like
structure. It follows a parent-child relationship, meaning each parent record can have
multiple child records, but each child has only one parent.
1. Tree Structure – Data is stored in a hierarchical (tree) format, where a single root
node connects to multiple child nodes.
2. Parent-Child Relationship – Each child record has only one parent, but a parent
can have multiple children.
3. Fast Data Access – Because of predefined relationships, retrieving related data is
quick.
4. Uses Pointers – Relationships between records are maintained using pointers.
Example:
Company
├── HR
│   ├── Employee 1
│   └── Employee 2
├── IT
│   ├── Employee 3
│   └── Employee 4
└── Finance
    ├── Employee 5
    └── Employee 6
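A minimal sketch of the same tree in Python, assuming a hypothetical `Node` class. Each record keeps exactly one parent pointer and a list of children, which is the defining restriction of the hierarchical model.

```python
class Node:
    """A record in a hierarchy: one parent pointer, any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """Names from the root down to this record, found by following parent pointers."""
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))


# Part of the example tree above.
company = Node("Company")
hr = Node("HR", company)
it = Node("IT", company)
emp3 = Node("Employee 3", it)
```

Because every record has a single parent, there is exactly one path from the root to any record, which is why lookups along predefined relationships are fast.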
✔ Efficient for one-to-many relationships – Best suited for hierarchical data like
organizational structures.
✔ Fast data retrieval – Since relationships are predefined, searching is quick.
✔ Ensures data integrity – The parent-child structure prevents orphan records.
- A schema contains schema objects such as tables, foreign keys, primary keys, views, columns,
data types, stored procedures, etc.
- A database schema is the skeleton structure of the database. It represents the logical
view of the entire database.
- A database schema can be represented using a visual diagram (a schema diagram) that shows
the database objects and their relationships with each other.
Example: The database consists of information about a set of customers and accounts in a bank
and the relationship between them
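One possible schema for the bank example, sketched with Python's sqlite3 module. The table and column names are assumptions, and the sketch simplifies the relationship so that each account belongs to exactly one customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Hypothetical schema: two tables linked by a foreign key.
conn.executescript(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE account (
        account_no  INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        balance     INTEGER NOT NULL DEFAULT 0
    );
    """
)

# The catalog (metadata) lists the schema objects that were just defined.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]


def orphan_account_rejected(conn):
    """Try to create an account for a customer that does not exist."""
    try:
        conn.execute("INSERT INTO account VALUES (1, 999, 0)")  # no customer 999
    except sqlite3.IntegrityError:
        return True
    return False
```

The foreign key is the part of the schema that records the relationship between customers and accounts; the DBMS uses it to reject an account that points at no customer.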
2-Tier Architecture
• The 2-tier architecture is similar to a basic client-server model . The application at the
client end directly communicates with the database on the server side. APIs like
ODBC and JDBC are used for this interaction. The server side is responsible for
providing query processing and transaction management functionalities. On the client
side, the user interfaces and application programs are run. The application on the
client side establishes a connection with the server side to communicate with the
DBMS.
An advantage of this architecture is that it is easier to understand and maintain, and it
is compatible with existing systems. However, it performs poorly when there are a large
number of users.
• Advantages of 2-Tier Architecture
• Easy to Access: The client accesses the database directly, which makes retrieval
fast.
• Scalable: We can scale the database easily, by adding clients or upgrading hardware.
• Low Cost: 2-Tier Architecture is cheaper than 3-Tier Architecture and Multi-Tier
Architecture .
• Easy Deployment: 2-Tier Architecture is easier to deploy than 3-Tier Architecture.
• Simple: 2-Tier Architecture is easily understandable as well as simple because of
only two components.
3-Tier Architecture
• In 3-Tier Architecture , there is another layer between the client and the server. The
client does not directly communicate with the server. Instead, it interacts with an
application server which further communicates with the database system and then the
query processing and transaction management takes place. This intermediate layer
acts as a medium for the exchange of partially processed data between the server and
the client. This type of architecture is used in the case of large web applications.
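A toy sketch of the three tiers in a single Python process, only to show who talks to whom. The class names, the `student` table, and the validation rule are assumptions; in a real deployment each tier would typically run on a separate machine.

```python
import sqlite3


class DatabaseTier:
    """Data tier: the only layer that talks to the DBMS."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)"
        )
        self.conn.execute("INSERT INTO student VALUES (1, 'Asha')")
        self.conn.commit()

    def find_student(self, roll_no):
        return self.conn.execute(
            "SELECT name FROM student WHERE roll_no = ?", (roll_no,)
        ).fetchone()


class ApplicationServer:
    """Middle tier: validates requests and applies business rules."""

    def __init__(self, db):
        self.db = db

    def get_student_name(self, roll_no):
        if not isinstance(roll_no, int) or roll_no <= 0:
            return "invalid roll number"
        row = self.db.find_student(roll_no)
        return row[0] if row else "not found"


def client(server, roll_no):
    """Presentation tier: talks only to the application server, never to the database."""
    return server.get_student_name(roll_no)


server = ApplicationServer(DatabaseTier())
```

In a 2-tier design the `client` function would run the SELECT itself; here it cannot reach the database at all, so checks in the middle tier cannot be bypassed.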
Data Independence refers to the ability to modify the database structure without affecting
the applications or programs that use the data. This helps in easy maintenance and scalability
of databases.
In short, data independence ensures that database changes do not break existing
applications, making the system more stable and efficient.
Types of Database Languages in DBMS
• Database languages are specialized languages used to interact with a database. They
allow users to perform different tasks such as defining, controlling,
and manipulating the data. There are several types of database languages in DBMS,
categorized into the following four main types:
• DDL (Data Definition Language)
• DCL (Data Control Language)
• DML (Data Manipulation Language)
• TCL (Transaction Control Language)
CREATE Command
The CREATE is a DDL command used to create databases, tables, triggers and other
database objects.
Syntax
CREATE TABLE Students (
column1 INT,
column2 VARCHAR(50),
column3 INT
);
Alter Command
ALTER is a DDL command which changes or modifies the existing structure of the
database, and it also changes the schema of database objects. We can also add and drop
constraints of the table using the ALTER command.
Syntax
ALTER TABLE Students ADD column_name datatype;
Drop Command
DROP is a DDL command used to delete/remove the database objects from the SQL
database. We can easily remove the entire table, view, or index from the database using
this DDL command.
Syntax
DROP TABLE Table_name;
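The three DDL commands above (CREATE, ALTER, DROP) can be tried end to end with Python's sqlite3 module. The column names and the `columns` and `table_exists` helpers are illustrative assumptions; other DBMSs differ slightly in ALTER syntax and in how metadata is queried.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def columns(conn, table):
    """Column names of `table`, read from SQLite's table metadata."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def table_exists(conn, table):
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (table,)
    ).fetchone()
    return row is not None


# CREATE: define a new table.
conn.execute("CREATE TABLE Students (Roll_No INT, Name VARCHAR(50), Age INT)")
after_create = columns(conn, "Students")

# ALTER: change the structure of the existing table by adding a column.
conn.execute("ALTER TABLE Students ADD COLUMN Mobile_No VARCHAR(15)")
after_alter = columns(conn, "Students")

# DROP: remove the table and its definition entirely.
conn.execute("DROP TABLE Students")
after_drop = table_exists(conn, "Students")
```

All three statements change the schema (the structure), not the rows, which is what separates DDL from DML.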
Update Command
The UPDATE command in SQL (Structured Query Language) is used to modify
existing records in a table. This command enables users to change the values of
one or more columns for specific rows based on a condition (criteria). It is
crucial for maintaining and adjusting data in a database when necessary.
Syntax
UPDATE Table_Name SET Name = 'New_Value' WHERE Name = 'Old_Value';
Delete Command
The DELETE command in SQL (Structured Query Language) is used to remove
one or more existing records from a table in a database. It is an essential
operation for managing data, enabling users to delete specific rows that meet
certain conditions or criteria.
Syntax:
DELETE FROM Table_Name WHERE Column = Value;
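A short runnable sketch of UPDATE and DELETE using Python's sqlite3 module. The `Students` table and its rows are made-up sample data; the point is that the WHERE clause decides which rows are affected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (Roll_No INT, Name VARCHAR(50))")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?)",
    [(1, "Asha"), (2, "Ravi"), (3, "Meena")],
)

# UPDATE changes only the rows that satisfy the WHERE condition.
updated = conn.execute(
    "UPDATE Students SET Name = ? WHERE Roll_No = ?", ("Ravi Kumar", 2)
).rowcount

# DELETE removes only the rows that satisfy the WHERE condition.
deleted = conn.execute("DELETE FROM Students WHERE Roll_No = ?", (3,)).rowcount

remaining = conn.execute(
    "SELECT Roll_No, Name FROM Students ORDER BY Roll_No"
).fetchall()
```

Leaving out the WHERE clause would make either command apply to every row in the table, so it is worth checking the condition with a SELECT first.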
Merge Command
The MERGE command in SQL is used to perform an upsert operation, which
combines both UPDATE and INSERT actions. It allows you to insert new rows if
they do not exist, or update existing rows if they match a certain condition. This
command is particularly useful for synchronizing two tables by inserting new
records and updating existing ones in a single operation.
Syntax:
MERGE INTO target_table AS target
USING source_table AS source
ON (target.id = source.id)
WHEN MATCHED THEN
UPDATE SET target.name = source.name
WHEN NOT MATCHED THEN
INSERT (id, name) VALUES (source.id, source.name);
Example (tables and output not reproduced): a merge involving New_Hires_Table (Table 2),
matching rows ON target.EmployeeID = source.EmployeeID.
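SQLite does not implement MERGE, so the sketch below swaps in a different technique with the same upsert effect: INSERT ... ON CONFLICT DO UPDATE (available in SQLite 3.24 and later). The table contents are made-up sample data, and the source rows are given as a Python list rather than a second table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target_table (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO target_table VALUES (?, ?)", [(1, "Asha"), (2, "Ravi")]
)

# Rows to merge in: id 2 already exists (update), id 3 does not (insert).
source_rows = [(2, "Ravi Kumar"), (3, "Meena")]

# Upsert: insert each row, or update the name if the id is already present.
conn.executemany(
    """
    INSERT INTO target_table (id, name) VALUES (?, ?)
    ON CONFLICT(id) DO UPDATE SET name = excluded.name
    """,
    source_rows,
)

merged = conn.execute("SELECT id, name FROM target_table ORDER BY id").fetchall()
```

The ON CONFLICT branch plays the role of WHEN MATCHED, and the plain insert plays the role of WHEN NOT MATCHED.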
CALL Command
The Call command is used to invoke a stored procedure or user-defined function, which is a
set of precompiled SQL statements. It allows you to execute complex operations within a
database as a single unit.
Syntax:
CALL procedure_name(parameter1, parameter2);
Example:
CALL UpdateEmployeeSalary(101, 55000);
LOCK TABLE
The LOCK TABLE command locks a table to prevent access by other sessions, ensuring that
no other operations (like insertions, updates, or deletions) can be performed on the table
while it is locked. It is useful for transactional operations where consistency is important.
Syntax:
LOCK TABLE your_table IN EXCLUSIVE MODE;
Example:
LOCK TABLE ClassMembers IN EXCLUSIVE MODE;
TCL ( Transaction Control Language )
Transaction Control Language (TCL) commands are used to manage and control
transactions in a database, grouping operations into logical units. These commands help
ensure the integrity and consistency of data during complex operations. The two main
commands in this category are:
Commit: Saves all the changes made during the current transaction to the database. This
is especially important in domains such as banking.
Rollback: Restores the database to its state at the last commit, undoing the changes made
since then. This also plays an important role in banking.
Both commands are explained in more detail below.
Commit Command
The COMMIT command is used to save all changes made during a transaction in the
database. This command ensures that modifications made by DML statements (such as
INSERT, UPDATE, or DELETE) become permanent in the database.
Syntax
-- one or more database operations (DML statements)
COMMIT;
ROLLBACK Command
The rollback command is used to restore the database to its state at the last COMMIT,
effectively undoing any changes made since that point. It helps ensure data consistency by
allowing the reversal of partial or erroneous operations.
Syntax
ROLLBACK;
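COMMIT and ROLLBACK can be observed with Python's sqlite3 module, where `conn.commit()` and `conn.rollback()` issue the corresponding commands. The `account` table, the `balance` helper, and the amounts are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('A', 100)")
conn.commit()  # COMMIT: the row and its balance of 100 are now permanent


def balance(conn):
    return conn.execute(
        "SELECT balance FROM account WHERE name = 'A'"
    ).fetchone()[0]


# An erroneous change made after the last commit.
conn.execute("UPDATE account SET balance = 0 WHERE name = 'A'")
before_rollback = balance(conn)  # the uncommitted change is visible in this session

conn.rollback()  # ROLLBACK: undo everything since the last COMMIT
after_rollback = balance(conn)
```

The balance returns to the committed value, which is the "state at the last COMMIT" described above.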
Interfaces in DBMS
A database management system (DBMS) interface is a user interface that lets users submit
queries to a database without writing the query language themselves.
• Menu-Based Interfaces
• These interfaces present the user with lists of options (called menus) that lead the user
through the formation of a request. The basic advantage of menus is that they remove
the need to memorize the specific commands and syntax of a query language. The
query is composed step by step by picking options from a menu displayed by the
system. Pull-down menus are a very popular technique in Web-based interfaces. They
are also often used in browsing interfaces, which allow a user to look through the
contents of a database in an exploratory and unstructured manner.
• Forms-Based Interfaces
• A forms-based interface displays a form to each user. Users can fill out all of the form
entries to insert new data, or they can fill out only certain entries, in which case the
DBMS will retrieve matching data for the remaining entries. Forms are usually
designed and programmed for non-expert (naive) users. Many DBMSs have forms
specification languages, which are special languages that help specify such forms.
• Example: SQL*Forms is a form-based language that specifies queries using a form
designed in conjunction with the relational database schema.
• Graphical User Interface
• A GUI typically displays a schema to the user in diagrammatic form. The user can
then specify a query by manipulating the diagram. In many cases, GUIs utilize both
menus and forms. Most GUIs use a pointing device, such as a mouse, to pick certain
parts of the displayed schema diagram.
• Graphical user interface of Coordi system: Information stored in the database
management system is represented as visual icon objects. Relevant objects can be
linked through the lines, which can be manually created by simply drawing lines
between the objects.
• A Natural Language Interface (NLI) is a system that can understand and process
requests written in English or other languages. It works by using a special schema,
which is like a blueprint, and a dictionary of important words to understand the
meaning behind the request.
• When you make a request, the NLI checks the words in its schema and dictionary to
figure out what you're asking. If it understands your request, it translates it into a
high-level query that a database management system (DBMS) can process. If it
doesn't understand, it starts a conversation with you to clarify what you mean.
However, the main disadvantage is that these interfaces are not very advanced, meaning they
might not understand complex requests very well.
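The flow described above (look up words in a dictionary tied to the schema, translate to a query if understood, otherwise ask for clarification) can be sketched as a toy function. Everything here is a hypothetical illustration: the word dictionaries, the `translate` function, and the table and column names. Real natural language interfaces are far more sophisticated.

```python
# Hypothetical schema vocabulary for a toy natural language interface.
TABLE_WORDS = {"students": "Students", "courses": "Courses"}
COLUMN_WORDS = {"names": "Name", "ages": "Age", "titles": "Title"}


def translate(request):
    """Return ("query", sql) if the request is understood, else ("clarify", question)."""
    words = request.lower().replace("?", "").split()
    tables = [TABLE_WORDS[w] for w in words if w in TABLE_WORDS]
    columns = [COLUMN_WORDS[w] for w in words if w in COLUMN_WORDS]

    if len(tables) != 1:
        # Not understood: start a dialogue instead of guessing.
        return ("clarify", "Which table do you mean: Students or Courses?")

    select_list = ", ".join(columns) if columns else "*"
    return ("query", f"SELECT {select_list} FROM {tables[0]};")
```

For instance, `translate("Show the names and ages of all students")` produces a SELECT over the Students table, while a request that mentions no known table comes back as a clarifying question.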
The generalized algorithm of users' interaction with the modules of an NLIDB (natural
language interface to a database) system includes the following main steps:
1. Acquire information about the desired database as a whole.
2. Get information about the described entity as a whole.
3. Get information about the set of attributes of the selected entity.
4. Perform a managed dialogue to find dependencies and relationships between entity
attributes and to establish their type and multiplicity. On the system side, the received
data is normalized, primary keys are added if necessary, and new tables are formed from
subsets of attributes.
5. Perform a managed dialogue to find dependencies between entities and establish the
multiplicity of these relations. If necessary, the system adds foreign keys and additional
tables.
6. Obtain an entry-level abstraction of the data domain.
7. Check, edit, and adjust business rules and input data.
8. Generate SQL queries and create a database file.
Speech Input and Output Interfaces
• Limited use of speech, as an input query and as the answer to a question or the result
of a request, is becoming commonplace. Applications with a limited vocabulary,
such as inquiries for telephone directory, flight arrival/departure, and bank account
information, allow speech for input and output so that ordinary users can access
this information.
• Speech input is detected using a library of predefined words and is used to set up the
parameters that are supplied to the queries. For output, a similar conversion from text
or numbers into speech takes place.