DBMS-mod 1,2,3,4

The document provides an overview of Database Management Systems (DBMS), detailing its definition, characteristics, advantages, and disadvantages. It also covers the history of DBMS, differences between file systems and DBMS, levels of abstraction, data independence, and the database design process, including the Entity-Relationship model. Additionally, it explains the structure of a DBMS and the importance of proper database design for effective data management.

Uploaded by

Dev

DBMS -1 sem

Module-1

#DBMS:-It stands for Database Management System.
-It is used to store and retrieve data in an easy and effective manner.
-It is also used to organize data in the form of tables, schemas, views, reports, etc.
-A database management system is software that is used to manage a database. For example, MySQL and Oracle are popular database systems used in many different applications.
-A DBMS provides an interface for operations such as creating a database, creating tables in it, storing data, updating data, and a lot more.
-It provides protection and security for the database. When there are multiple users, it also maintains data consistency.

*Characteristics of DBMS:-A DBMS contains automatic backup and recovery procedures.
-It supports the ACID properties (atomicity, consistency, isolation, durability), which keep data in a consistent state in case of failure.
-It can reduce the complexity of the relationships between data.
-It provides security of data.
-It can present the database from different viewpoints according to the requirements of the user.
-Control of data redundancy.
-Insulation between programs and data.
-Self-describing nature of data.
-Fewer integrity problems, because integrity constraints are enforced by the system.
-Stores any kind of structured data.
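The atomicity part of the ACID properties can be sketched with Python's built-in sqlite3 module. This is only an illustration: the accounts table, the balances and the transfer function are made up for the example and are not part of these notes.

```python
import sqlite3

def make_bank():
    """Create a tiny in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " id INTEGER PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 50)")
    conn.commit()
    return conn

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    # `with conn:` commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))
        # If this debit violates the CHECK constraint, the credit above is undone too.
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))

def balances(conn):
    """Return {account id: balance}."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
```

A transfer that would overdraw an account raises sqlite3.IntegrityError and leaves both balances exactly as they were, which is the "healthy state in case of failure" mentioned above.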

→Advantages of a DBMS
-Controls database redundancy: It can control data redundancy because all the data is stored in one single database rather than in many separate files.
-Data sharing: The authorized users of an organization can share the data among multiple users.
-Easy maintenance: It is easy to maintain because of the centralized nature of the database system.
-Reduces time: It reduces development time and maintenance needs.
-Backup: It provides backup and recovery subsystems, which automatically back up data against hardware and software failures and restore the data if required.
-Multiple user interfaces: It provides different types of user interfaces, such as graphical user interfaces and application program interfaces.
-Data integrity and security.
-Data administration.
→Disadvantages of a DBMS
-Cost of hardware and software: It requires a high-speed processor and a large memory to run the DBMS software.
-Size: It occupies a large amount of disk space and memory to run efficiently.
-Complexity: A database system creates additional complexity and requirements.
-Higher impact of failure: A failure has a large impact, because in most organizations all the data is stored in a single database; if that database is damaged by a power failure or database corruption, the data may be lost forever.
-Lower performance for small, simple applications, because of the overhead of the DBMS.
-High maintenance cost.
#History of DBMS
-Charles Bachman was the first person to develop the Integrated Data Store (IDS), which was based on the network data model.
-It was developed in the early 1960s.
-In the late 1960s, IBM (International Business Machines Corporation) developed the Information Management System (IMS), which is still in use in many places today.
-It was based on the hierarchical database model.
-In 1970, Edgar Codd proposed the relational database model. Many of the database systems we use today are relational.
-Later in the same decade (the 1970s), IBM developed the Structured Query Language (SQL) as part of the System R project.
-SQL was later adopted as a standard query language by ANSI and ISO.

*Difference between File System and DBMS

→File system:- The file system is basically a way of arranging files on a storage medium such as a hard disk. <-or-> It is a collection of programs that manage and store data in files and folders on a computer's hard disk.
-It organizes the files and helps in retrieving them when they are required.
-A file system consists of different files, which are grouped into directories.
-The directories can contain further folders and files.
-The file system performs basic operations such as file management, file naming, and setting access rules.
-Data redundancy is high and cannot be controlled easily in file management systems.
Eg:- Consider a student file system.
-The student file contains information about the student (i.e. roll no, student name, course, etc.).
-Similarly, we have a subject file that contains information about the subject, and a result file that contains information about the result.
-Some fields are duplicated in more than one file, which leads to data redundancy.
-To overcome this problem, we need a centralized system, i.e. the DBMS approach.
→DBMS (Database Management System):- A database management system is basically software that manages a collection of related data.
-It is used for storing data and retrieving it effectively when it is needed.
-It also provides proper security measures for protecting the data from unauthorized access.
-It also provides mechanisms for data recovery and data backup.
-The operations performed by a DBMS include insertion, deletion, selection, sorting, etc.
-Examples are Oracle, MySQL, and MS SQL Server.
-(fig: DBMS approach)
*In the figure, duplication of data is reduced because the data is centralized.

File system | DBMS
Redundant data can be present. | Redundancy is reduced and controlled.
It doesn't provide backup and recovery of data if it is lost. | It provides backup and recovery of data even if it is lost.
There is less data consistency. | There is more data consistency.
It is less complex than a DBMS. | It is more complex to handle.
It provides less security. | It provides more security.
It is less expensive. | It is expensive.
Only one user can access the data at a time. | Multiple users can access the data at a time.
The user has to write procedures for managing the data. | The user is not required to write such procedures.
The file system approach has a simple structure. | The database structure is complex to design.
Eg:- file processing in Cobol, C++ | Eg:- Oracle, SQL Server


#Describing and Storing Data in a DBMS:-A data model is a collection of high-level data description constructs that hide many low-level storage details.
-Most commercial database systems are based on the relational data model.
-It is easier to use a semantic data model to model an application domain.
-A widely used semantic data model, the entity-relationship (ER) model, allows us to pictorially denote entities and the relationships among them.
-An example of poor design: The relational schema for Students illustrates a poor design choice; you should never create a field such as age, whose value is constantly changing. A better choice would be DOB (date of birth); age can be computed from it. We continue to use age in our examples, however, because it makes them easier to read.
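The point about storing DOB instead of age can be sketched as a small helper. The function name age_on and the sample dates are assumptions made for this illustration only.

```python
from datetime import date

def age_on(dob: date, today: date) -> int:
    """Return the number of completed years between dob and today."""
    # Compare (month, day) to see whether this year's birthday has passed yet.
    had_birthday = (today.month, today.day) >= (dob.month, dob.day)
    return today.year - dob.year - (0 if had_birthday else 1)
```

Because the age is computed whenever it is needed, the stored value (DOB) never goes out of date, which is the reason DOB is the better field to store.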

*The Relational Model:-The relational model represents how data is stored in relational databases.
-A relational database stores data in the form of relations (tables).
-A description of data in terms of a data model is called a schema.
-A relation schema gives the name of the relation together with its attributes. e.g.; STUDENT (ROLL_NO, NAME, ADDRESS, PHONE, AGE) is the relation schema for STUDENT. A schema that contains more than one relation is called a relational schema.
-Attributes are the properties that define a relation. e.g.; ROLL_NO, NAME
-Consider a relation STUDENT with attributes ROLL_NO, NAME, ADDRESS, PHONE, and AGE, shown in the table.
-Each row in the relation is known as a tuple. The relation contains 4 tuples, one of which is:
1 RAM DELHI 9455123451 18
-In a relational database system, a relation instance is a finite set of tuples. Relation instances do not have duplicate tuples.

–>Schema:-The overall design of the database is called the database schema.
-The schema does not change frequently.
-It is the logical structure of a database.
-It defines how the data is organized and how the relations among the data are associated.
-Database designers design the schema to help programmers understand the database and make it useful.
-Types of schema (3 types):-
1.Physical schema:-The design of a database at the physical level is called the physical schema.
-It defines how the data will be stored in blocks.
2.Conceptual/logical schema:- It describes what type of data is to be stored in the database and also describes the relationships among the stored data.
-It maintains data consistency and provides data integrity as well as security.
-It is the middle level.
-For each database there is one conceptual schema.
3.External/view schema:-The highest level of schema.
-It is closest to the end user.
-It provides different views of the same database to each user.
-Generally, a DBMS consists of one physical schema, one conceptual schema, and several external (sub) schemas.

→Difference between Schema and Instance/extension/database state in DBMS
-An instance is the collection of information stored in the database at a particular moment. Instances are changed by operations such as insertion and deletion of data.
-An instance is dynamic: its value keeps changing.
-The schema is described above.

*Levels of Abstraction in a DBMS:-A data definition language (DDL) is used to define the external and conceptual schemas.
-Information about the conceptual, external, and physical schemas is stored in the system catalogs.
-Any given database has exactly one conceptual schema and one physical schema, but it may have several external schemas.
-There are 3 levels of abstraction:
1.Physical/internal schema:-It is the lowest level of abstraction.
-It defines how the data is actually stored.
-The entire database is described at this level.
-It is a very complex level to understand.
2.Conceptual/logical schema:-It is the intermediate level, the next level above the physical one.
-It is less complex than the physical level.
-It is used by developers and database administrators.
-It describes what data is stored in the database and what relationships exist among the data.
3.External/view schema:- It is the highest level.
-There are different views, and every view defines only a part of the entire data.
-It is the least complex level and the easiest to understand.

*Data Independence:-Data independence is achieved through the use of the three levels of data abstraction.
-It is the ability to modify the schema without having to rewrite the application programs. Or
It is the ability to modify the schema at one level of the database system without altering the schema at the next higher level.
-There are two types of data independence:
1.Physical data independence:-It is the capacity to change the internal schema without having to change the conceptual schema.
-If we change the storage size of the database system server, the conceptual structure of the database is not affected.
-Physical data independence separates the conceptual level from the internal level.
-Physical data independence occurs at the logical interface level.
2.Logical data independence:-It is the capacity to change the conceptual schema without having to change the external schema.
-Logical data independence separates the external level from the conceptual level.
-If we change the conceptual view of the data, the user view of the data is not affected.
-Logical data independence occurs at the user interface level.
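Both kinds of data independence can be sketched with SQLite through Python's sqlite3 module. A view plays the role of an external schema. The view name student_names, the added gpa column and the index name are hypothetical choices for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Conceptual schema: the Students relation used in these notes.
conn.execute(
    "CREATE TABLE students (sid INTEGER PRIMARY KEY, name TEXT, login TEXT, age INTEGER)")
conn.execute("INSERT INTO students VALUES (53688, 'Smith', 'smith@ee', 18)")
# External schema: a view that exposes only part of the data.
conn.execute("CREATE VIEW student_names AS SELECT sid, name FROM students")

def read_view():
    """What a user of the external schema sees."""
    return conn.execute("SELECT sid, name FROM student_names ORDER BY sid").fetchall()

before = read_view()

# Logical data independence: change the conceptual schema (add a column).
conn.execute("ALTER TABLE students ADD COLUMN gpa REAL")
after_logical_change = read_view()

# Physical data independence: change the storage structure (add an index).
conn.execute("CREATE INDEX idx_students_name ON students(name)")
after_physical_change = read_view()
```

In this sketch the view returns the same rows before and after both changes, so a program written against the view would not need to be modified.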
*Structure of a DBMS:-

-The figure shows the structure of a typical DBMS based on the relational data model.
-The DBMS accepts SQL commands generated from a variety of user interfaces, and then it
● produces query evaluation plans,
● executes these plans against the database, and
● returns the answers.
->Explanation of the figure
-Queries from a variety of user interfaces are received by the DBMS in the form of SQL commands.
-The query evaluation engine generates a query evaluation plan, executes it against the database, and then returns the result to the user.
-The query evaluation engine does this using four modules (parser, operator evaluator, optimizer, plan executor).
-First, the SQL command is received by the parser. The parser breaks the query into tokens and checks whether the query is syntactically and semantically correct.
-If the query is correct, it is converted into an algebraic expression.
-This algebraic expression is then passed as input to the operator evaluator.
-The operator evaluator module evaluates the operators in the algebraic expression using several techniques, such as indexing.
-A set of possible query evaluation plans is then generated for executing the SQL query.
-All these alternative query evaluation plans are given as input to the optimizer.
-The optimizer module chooses the best query evaluation plan from the set of alternatives, in order to execute the query most efficiently.
-This optimized plan is then executed by the plan executor module.
-During the execution of the query evaluation plan, the role of the file and access methods, the buffer manager, and the disk space manager is very important.
-The file and access methods layer provides the abstraction of the file structures stored in the database, and it creates indexes on the files in order to access the desired data quickly.
-The buffer manager fetches data from disk into main memory, and it also decides what data is kept in memory.
-The disk space manager manages the space on the disk, allocating empty space for new requests and reclaiming the space of files when they are deleted by users.
-The transaction manager and the lock manager enforce concurrency control, because the data stored in the database can be accessed by multiple users at the same time.
-The recovery manager maintains a log that is used to restore the system to a consistent state after a crash of the system or the database.
-The system catalog, also called the data dictionary, is usually stored as part of the database. It stores metadata, that is, the description of all the tables, views, data files, and indexes stored in the database.
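The work of the optimizer can be observed in SQLite, whose EXPLAIN QUERY PLAN statement reports the plan chosen for a query. This is a sketch using Python's sqlite3 module; the index name idx_students_name is an assumption for the illustration, and the exact wording of the plan text differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (sid INTEGER PRIMARY KEY, name TEXT, login TEXT, age INTEGER)")

def plan(sql):
    """Return the plan SQLite's optimizer chose for `sql`, as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row is the human-readable description of a plan step.
    return "; ".join(row[-1] for row in rows)

query = "SELECT * FROM students WHERE name = 'Smith'"

# With no index on name, the only available plan is a scan of the whole table.
plan_without_index = plan(query)

# Once an index exists, the optimizer has an alternative plan and picks it.
conn.execute("CREATE INDEX idx_students_name ON students(name)")
plan_with_index = plan(query)
```

Printing the two strings shows a table scan before the index exists and an index search afterwards, for exactly the same SQL command.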
#Introduction to Database Design:-Database design can be generally defined as a collection of tasks or processes that support the design, development, implementation, and maintenance of an enterprise data management system.
-A properly designed database is easy to maintain, improves data consistency, and is cost-effective in terms of disk storage space.
-The database designer decides how the data elements correlate and what data must be stored.
-The main objective of database design in a DBMS is to produce logical and physical design models of the proposed database system.
-The logical model concentrates on the data requirements and the data to be stored, independent of physical considerations. It does not concern itself with how or where the data will be stored physically.
-The physical design model involves translating the logical design of the database onto physical media, using hardware resources and software systems such as database management systems (DBMS).
→Major steps in database design: the database design process can be divided into six steps:
1.Requirements Analysis:-The very first step in designing a database application.
-Talk to the potential users and understand what data is to be stored and what operations and requirements are needed.
2.Conceptual Database Design:-Develop a high-level description of the data and constraints (we will use the ER data model).
3.Logical Database Design:-Convert the conceptual model to a schema in the data model of the chosen DBMS. For a relational database, this means converting the conceptual design to a relational schema (logical schema).
4.Schema Refinement:-Look for potential problems in the original choice of schema and try to redesign it.
5.Physical Database Design:-Direct the DBMS in its choice of underlying data layout (e.g., indexes and clustering) in the hope of optimizing performance.
6.Applications and Security Design:-Decide how the underlying database will interact with the surrounding applications.

*ER model :-ER model stands for Entity-Relationship model.
-It is a high-level data model.
-It describes the structure of a database with the help of a diagram known as an entity-relationship diagram.
-For example, suppose we design a school database. In this database, the student will be an entity with attributes like address, name, id, age, etc. The address can be another entity with attributes like city, street name, pin code, etc., and there will be a relationship between them.
-The components of the ER model are:

1.Entity:- An entity may be any object, class, person or place.
-In the ER diagram, an entity is represented by a rectangle.
-It is a real-world object.
-An entity set is a group of similar entities, and these entities can have attributes.
-An entity type defines a collection of entities that have the same attributes.
-e.g.; E1 is an entity of entity type Student, and the set of all students is called the entity set. In the ER diagram, an entity type is represented by a rectangle (fig).

-Entities are of two types:
● Strong entity:-An entity that can be uniquely identified by its own attributes.
-Also known as a parent entity.
-It does not depend on any other entity.
-It has a key attribute.
-It is represented by a single rectangle.
● Weak entity:-An entity that cannot be uniquely identified by its own attributes.
-Also known as a child entity.
-It is represented by a double rectangle.
-It always depends on a strong entity.

Q)What is an entity type and an entity set? (The answer is given above.)

→note: The relationship between a strong and a weak entity is represented by a double diamond (fig).

2.Attributes:-An attribute is used to describe a property of an entity.
-For example, id, age, contact number, name, etc. can be attributes of a student.
-An attribute is represented by an oval.

-Types of attributes are:
a.Key attribute:-The key attribute is used to represent the main characteristic of an entity.
-The attribute that uniquely identifies each entity in the entity set is called the key attribute.
-For example, Roll_No will be unique for each student.
-In the ER diagram, a key attribute is represented by an oval with the attribute name underlined.
b.Composite attribute:-An attribute that is composed of several other attributes is known as a composite attribute.
-For example, the Address attribute of the Student entity type consists of Street, City, State, and Country.
-A composite attribute is represented by an oval that is connected to the ovals of its component attributes.
c.Single-valued attribute:- It has only a single value.
-eg: Rollno, Age
d.Multivalued attribute:- An attribute that can have more than one value is known as a multivalued attribute.
-A double oval is used to represent a multivalued attribute.
-For example, a student can have more than one phone number.
e.Derived attribute:-An attribute that can be derived from another attribute is known as a derived attribute.
-eg; Age (can be derived from DOB).
-A derived attribute is represented by a dashed oval.
f.Stored attribute:-An attribute from which the values of other attributes are derived.
-eg; DOB
g.Complex attribute:- It combines multivalued and composite attributes.
-Multivalued is written with {} and composite with ().
-eg; {college degree(college,year,degree)}.
-The complete entity type Student with its attributes can be represented as shown in the figure (fig).
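The attribute types can also be written down as data structures. The sketch below uses Python dataclasses; the class and field names are assumptions chosen to match the examples in the list above, not a standard notation.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Address:
    """Composite attribute: one attribute made of several parts."""
    street: str
    city: str
    state: str
    country: str

@dataclass
class CollegeDegree:
    """The composite part of the complex attribute {college degree(college,year,degree)}."""
    college: str
    year: int
    degree: str

@dataclass
class Student:
    roll_no: int                                    # key attribute
    name: str                                       # single-valued attribute
    address: Address                                # composite attribute
    phones: List[str] = field(default_factory=list)             # multivalued attribute
    degrees: List[CollegeDegree] = field(default_factory=list)  # complex attribute
```

A multivalued attribute becomes a list, a composite attribute becomes a nested record, and a complex attribute becomes a list of nested records.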

3.Relationship:- A relationship is used to describe the relation between entities.
-A diamond is used to represent a relationship.

-A relationship type represents the association between entity types.
-For example, 'Enrolled in' is a relationship type that exists between the entity types Student and Course.
-In the ER diagram, a relationship type is represented by a diamond that is connected to the entities with lines.

-A set of relationships of the same type is known as a relationship set.
-The following relationship set depicts that S1 is enrolled in C2, S2 is enrolled in C1, and S3 is enrolled in C3 (fig).

Q)What is a relationship type and a relationship set? (The answer is given above.)

→Degree of a relationship set:- The number of different entity sets participating in a relationship set is called the degree of the relationship set.
● Unary relationship:-When there is only ONE entity set participating in a relationship, the relationship is called a unary relationship.
-For example, one person is married to only one person (both belong to the same entity set).
● Binary relationship:-When there are TWO entity sets participating in a relationship, the relationship is called a binary relationship.
-For example, Student is enrolled in Course.
● n-ary relationship:-When there are n entity sets participating in a relationship, the relationship is called an n-ary relationship.
→Cardinality:-The number of times an entity of an entity set participates in a relationship set is known as cardinality. Cardinality can be of different types:
→Types of relationship/different types of cardinality:-
➢ One-to-one relationship:-When each entity in each entity set can take part only once in the relationship, the cardinality is one to one.
-eg; a male can marry one female and a female can marry one male.
-So the relationship will be one to one.

➢ One-to-many relationship:-When only one instance of the entity on the left and more than one instance of the entity on the right are associated with the relationship, it is known as a one-to-many relationship.
-The cardinality is one to many.
-Eg; a scientist can invent many inventions, but each invention is made by only one specific scientist.

➢ Many-to-one relationship:- When more than one instance of the entity on the left and only one instance of the entity on the right are associated with the relationship, it is known as a many-to-one relationship.
-For example, a student can take only one course, but one course can be taken by many students.
➢ Many-to-many relationship:- When more than one instance of the entity on the left and more than one instance of the entity on the right are associated with the relationship, it is known as a many-to-many relationship.
-eg; a student can take more than one course, and one course can be taken by many students.
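A many-to-many relationship set is commonly stored as a separate table that holds one row per relationship. The sketch below uses SQLite through Python's sqlite3 module with the S1/S2/S3 and C1/C2/C3 enrollments mentioned earlier; the table names and the extra row (S1, C1) are assumptions added to show the many-to-many case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (sid TEXT PRIMARY KEY);
CREATE TABLE course  (cid TEXT PRIMARY KEY);
-- One row per (student, course) pair in the relationship set.
CREATE TABLE enrolled_in (
    sid TEXT REFERENCES student(sid),
    cid TEXT REFERENCES course(cid),
    PRIMARY KEY (sid, cid)
);
INSERT INTO student VALUES ('S1'), ('S2'), ('S3');
INSERT INTO course  VALUES ('C1'), ('C2'), ('C3');
INSERT INTO enrolled_in VALUES ('S1', 'C2'), ('S2', 'C1'), ('S3', 'C3');
-- Hypothetical extra enrollment: S1 now takes two courses, C1 has two students.
INSERT INTO enrolled_in VALUES ('S1', 'C1');
""")

def courses_of(sid):
    """All courses one student is enrolled in."""
    rows = conn.execute(
        "SELECT cid FROM enrolled_in WHERE sid = ? ORDER BY cid", (sid,))
    return [cid for (cid,) in rows]

def students_in(cid):
    """All students enrolled in one course."""
    rows = conn.execute(
        "SELECT sid FROM enrolled_in WHERE cid = ? ORDER BY sid", (cid,))
    return [sid for (sid,) in rows]
```

Here courses_of('S1') gives two courses and students_in('C1') gives two students, which is what makes the relationship many to many.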

→Instance of a relationship set:-It is a specific set of relationships.
-It is a snapshot of the relationship set at some instant in time.
-(For the rest, refer to online resources.)

*Additional Features of the ER Model

➢ Key Constraints:- Consider the Works_In relationship set: an employee can work in several departments, and a department can have several employees.
-Employee 231-31-5368 has worked in Department 51 since 3/3/93 and in Department 56 since 2/2/92. Department 51 has two employees.
-Now consider another relationship set called Manages between the Employees and Departments entity sets, such that each department has at most one manager, although a single employee is allowed to manage more than one department.
-The restriction that each department has at most one manager is an example of a key constraint.
-This restriction is indicated in the ER diagram of Figure 2.6 by an arrow from Departments to Manages.
-It can be extended to a ternary relationship as well (explained below).

Or

-Consider Works_In:- An employee can work in many departments, and a department can have many employees.
-The Works_In relationship set is many-to-many (m:n).
-In contrast, each department has at most one manager, according to the key constraint on Manages.
-The Manages relationship set is one-to-many (1:N).
-This restriction is indicated in the ER diagram by an arrow from Departments to Manages.
-If we add the restriction that each employee can manage at most one department, we obtain a one-to-one (1:1) relationship set (fig).
-It can be extended to a ternary relationship as well:
-each employee works in at most one department and at a single location (fig).
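The key constraint on Manages can be sketched in SQL by making the department the primary key of the Manages table, so a department can appear at most once. The example runs on SQLite through Python's sqlite3 module; the column names did and ssn and the second employee number are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees   (ssn TEXT PRIMARY KEY);
CREATE TABLE departments (did INTEGER PRIMARY KEY);
-- did is the key of manages: each department has at most one manager.
-- ssn is not unique, so one employee may manage several departments.
CREATE TABLE manages (
    did INTEGER PRIMARY KEY REFERENCES departments(did),
    ssn TEXT NOT NULL REFERENCES employees(ssn)
);
INSERT INTO employees VALUES ('231-31-5368'), ('131-24-3650');
INSERT INTO departments VALUES (51), (56);
""")

def assign_manager(did, ssn):
    """Try to record that `ssn` manages `did`; return False if the key constraint forbids it."""
    try:
        with conn:
            conn.execute("INSERT INTO manages VALUES (?, ?)", (did, ssn))
        return True
    except sqlite3.IntegrityError:
        return False
```

The same employee can be assigned to departments 51 and 56, but a second manager for department 51 is rejected.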
➢ Participation Constraints:- A participation constraint specifies whether every entity of an entity set must take part in a relationship set.
-There are two types of participation constraints:
1. Total participation:-It specifies that each entity in the entity set must participate in at least one relationship instance of that relationship set.
-It is represented by a double line between the entity set and the relationship set.
-It is also called mandatory participation.
2. Partial participation:-It specifies that each entity in the entity set may or may not participate in a relationship instance of that relationship set.
-It is represented by a single line between the entity set and the relationship set.
-(fig: total and partial participation in an ER model)
-Eg; If each student must enroll in a course, the participation of Student is total participation.
-If some courses are not taken by any student, the participation of Course is partial participation.
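For the simple case where every student takes exactly one course (many-to-one), total participation of Student can be sketched as a NOT NULL foreign key. This is an illustration on SQLite through Python's sqlite3 module under that assumption; the table and column names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when this is on
conn.executescript("""
CREATE TABLE course (cid TEXT PRIMARY KEY);
-- cid NOT NULL: every student must be enrolled in some course (total participation).
CREATE TABLE student (
    sid TEXT PRIMARY KEY,
    cid TEXT NOT NULL REFERENCES course(cid)
);
INSERT INTO course VALUES ('C1'), ('C2');
INSERT INTO student VALUES ('S1', 'C1');
""")

def enroll(sid, cid):
    """Insert a student; return False if a constraint rejects the row."""
    try:
        with conn:
            conn.execute("INSERT INTO student VALUES (?, ?)", (sid, cid))
        return True
    except sqlite3.IntegrityError:
        return False

def courses_without_students():
    """Courses that take no part in the relationship (partial participation)."""
    rows = conn.execute(
        "SELECT cid FROM course"
        " WHERE cid NOT IN (SELECT cid FROM student) ORDER BY cid")
    return [cid for (cid,) in rows]
```

A student without a course is rejected, while a course without students (C2 here) is allowed.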
➢ Weak entity:- A weak entity set is an entity set that does not have a primary key of its own.
-In other words, its own attributes are not enough to form a primary key.
-Here Employee is a strong entity because it has a primary key (ssn), and Dependents is a weak entity because it has no primary key.
-The Dependents entity has the attributes pname and age, and we set pname as the partial key.
-The partial key of the weak entity set and the primary key of the strong entity set are combined to identify each entity in the weak entity set. <-or->
The combination of the partial key and the primary key of the strong entity set makes it possible to uniquely identify all entities of the weak entity set.
-The partial key is also called the discriminator.
-The partial key/discriminator is represented by underlining it with a dashed line.
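The combination of partial key and owner key can be sketched as a composite primary key. The example runs on SQLite through Python's sqlite3 module; the ON DELETE CASCADE rule, the employee ids and the dependent names are assumptions for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed in SQLite for the cascade to work
conn.executescript("""
CREATE TABLE employees (ssn TEXT PRIMARY KEY);
-- pname alone is only a partial key; (ssn, pname) together identify a dependent.
CREATE TABLE dependents (
    pname TEXT NOT NULL,
    age   INTEGER,
    ssn   TEXT NOT NULL REFERENCES employees(ssn) ON DELETE CASCADE,
    PRIMARY KEY (ssn, pname)
);
INSERT INTO employees VALUES ('E1'), ('E2');
-- Two dependents with the same pname are fine because their owners differ.
INSERT INTO dependents VALUES ('Joe', 5, 'E1'), ('Joe', 7, 'E2');
""")

def dependents():
    """All (pname, ssn) pairs currently stored."""
    return conn.execute(
        "SELECT pname, ssn FROM dependents ORDER BY ssn, pname").fetchall()

def remove_employee(ssn):
    """Delete an employee; the cascade removes that employee's dependents as well."""
    with conn:
        conn.execute("DELETE FROM employees WHERE ssn = ?", (ssn,))
```

Deleting the owning employee also deletes its dependents, which reflects that a weak entity always depends on a strong entity.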
➢ Class Hierarchies:- A class hierarchy can be viewed in one of two ways:
● Specialization (top-down approach):-Specialization is the process of identifying subsets of an entity set that share some distinguishing characteristics.
-It breaks an entity from a higher level (superclass) into multiple entities at a lower level (subclasses).
-The class Vehicle can be specialized into Car, Truck and Motorcycle (top-down approach).
-Hence, Vehicle is the superclass and Car, Truck and Motorcycle are subclasses.
-All three of these inherit attributes from Vehicle.

● Generalization (bottom-up approach):-Generalization is the process of combining several entity sets into one higher-level entity set that holds their common attributes or properties.
-The entity that is created contains the common features. Generalization is a bottom-up process.
-The classes Car, Truck and Motorcycle can be generalized into Vehicle (bottom-up approach).
-Car, Truck and Motorcycle are subclasses, while Vehicle is the superclass.
➢ Aggregation:- Aggregation is used when a single entity alone does not make sense in a relationship.
-In that case the relationship set itself acts as one single entity.
-In aggregation, the relationship between two entities is treated as a single entity.
___________________________________________________________________________

Module-2 - Relational Model

# Introduction to the Relational Model:- The relational model was proposed by E.F. Codd to model data in the form of relations or tables.
-The relational model represents how data is stored in relational databases.
-A relational database stores data in the form of relations (tables).
-Consider a relation STUDENT with attributes ROLL_NO, NAME, ADDRESS, PHONE, and AGE, shown in the table.

–> IMPORTANT TERMINOLOGIES

➢ Attribute: Attributes are the properties that define a relation. e.g.; ROLL_NO, NAME
➢ Relation Schema: A relation schema gives the name of the relation together with its attributes. e.g.; STUDENT (ROLL_NO, NAME, ADDRESS, PHONE, AGE) is the relation schema for STUDENT. A schema that contains more than one relation is called a relational schema.
➢ Tuple: Each row in the relation is known as a tuple. The above relation contains 4 tuples, one of which is:
1 RAM DELHI 9455123451 18
➢ Relation Instance: The set of tuples of a relation at a particular instant of time is called a relation instance. Table 1 shows the relation instance of STUDENT at a particular time. It can change whenever there is an insertion, deletion, or update in the database.
➢ Degree: The number of attributes in the relation is known as the degree of the relation. The STUDENT relation defined above has degree 5.
➢ Cardinality: The number of tuples in a relation is known as its cardinality. The STUDENT relation defined above has cardinality 4.
➢ Column: A column represents the set of values for a particular attribute. For example, the column ROLL_NO can be extracted from the relation STUDENT.
➢ NULL Values: A value that is not known or is unavailable is called a NULL value. It is shown as a blank space. e.g.; the PHONE of the STUDENT with ROLL_NO 4 is NULL.
OR
1. Attribute: Each column in a table. Attributes are the properties that define a relation. e.g., Student_Rollno, NAME, etc.
2. Table: In the relational model, relations are saved in table format. A table has two parts, rows and columns. Rows represent records and columns represent attributes.
3. Tuple: A single row of a table, which contains a single record.
4. Relation Schema: A relation schema gives the name of the relation together with its attributes.
5. Degree: The total number of attributes in the relation is called the degree of the relation.
6. Cardinality: The total number of rows present in the table.
7. Column: A column represents the set of values for a specific attribute.
8. Relation instance: A relation instance is a finite set of tuples in the RDBMS. Relation instances never have duplicate tuples.
9. Relation key: One or more attributes whose values uniquely identify each row are called the relation key.
10. Attribute domain: Every attribute has a pre-defined set of permitted values, which is known as the attribute domain.
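Degree and cardinality can be computed directly when a relation instance is written as a set of tuples. In the Python sketch below, only the first tuple and the NULL phone of ROLL_NO 4 come from these notes; the remaining values are hypothetical placeholders.

```python
# Relation schema: STUDENT(ROLL_NO, NAME, ADDRESS, PHONE, AGE)
ATTRIBUTES = ("ROLL_NO", "NAME", "ADDRESS", "PHONE", "AGE")

# Relation instance as a set of tuples; None stands for NULL.
student = {
    (1, "RAM", "DELHI", "9455123451", 18),    # tuple shown in the notes
    (2, "RAMESH", "GURGAON", "9652431543", 18),  # hypothetical values
    (3, "SUJIT", "ROHTAK", "9156253131", 20),    # hypothetical values
    (4, "SURESH", "DELHI", None, 18),            # PHONE is NULL
}

degree = len(ATTRIBUTES)    # number of attributes
cardinality = len(student)  # number of tuples

# A set cannot hold duplicates, so re-adding an existing tuple changes nothing.
student.add((1, "RAM", "DELHI", "9455123451", 18))
cardinality_after_duplicate = len(student)

# A column is the set of values of one attribute.
roll_no_column = sorted(row[ATTRIBUTES.index("ROLL_NO")] for row in student)
```

Using a set mirrors the rule that a relation instance never contains duplicate tuples.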

*Creating and Modifying Relations Using SQL:-The CREATE TABLE statement is used
to define a new table.
-Data Definition Language (DDL) supports the creation, deletion, and modification of tables.
-To create the Students relation, we can use the following statement:
CREATE TABLE Students ( sid INT, name CHAR(30), login CHAR(20), age INT );
-Tuples are inserted using the INSERT command.
-We can insert a single tuple into the Students table as follows:
INSERT INTO Students VALUES (53688,'Smith','smith@ee',18);
-We can delete tuples using the DELETE command.
-We can delete all Students tuples with name equal to Smith using the command:
DELETE FROM Students WHERE name='Smith' ;
-We can modify the column values in an existing row using the UPDATE command.
-For example, we can increment the age of the student with sid 53688:
UPDATE Students SET age =age+1 WHERE sid = 53688 ;
-The WHERE clause determines which rows are to be modified.
-The SET clause then determines how these rows are to be modified.
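-A minimal sketch of the statements above, run through Python's built-in sqlite3 module. SQLite is an assumption here (the notes do not name a DBMS), and the second student row (Jones) is made up for illustration.

```python
import sqlite3

# In-memory SQLite database (assumed DBMS, chosen only because it needs no setup).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid INT, name CHAR(30), login CHAR(20), age INT)")

# Insert two tuples; the Jones row is hypothetical.
conn.execute("INSERT INTO Students VALUES (53688, 'Smith', 'smith@ee', 18)")
conn.execute("INSERT INTO Students VALUES (53650, 'Jones', 'jones@cs', 19)")

# WHERE selects the rows to change, SET says how to change them.
conn.execute("UPDATE Students SET age = age + 1 WHERE sid = 53688")
age_after_update = conn.execute(
    "SELECT age FROM Students WHERE sid = 53688").fetchone()[0]

# Delete every tuple whose name is Smith.
conn.execute("DELETE FROM Students WHERE name = 'Smith'")
remaining = conn.execute("SELECT sid, name FROM Students").fetchall()

print(age_after_update, remaining)
```

-After the UPDATE the age of student 53688 is 19, and after the DELETE only the Jones row remains.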

*Integrity Constraints:-Integrity constraints are a set of rules.
-They are used to maintain the quality of information.
-Integrity constraints ensure that data insertion, updating, and other processes are
performed in such a way that data integrity is not affected.
-Thus, integrity constraints guard against accidental damage to the database.

→ Types of Integrity Constraint/Constraints in Relational Model


➢ Key Constraints:- Every relation in the database should have at least one set of
attributes that identifies a tuple uniquely. Such a set of attributes is called a key.
-An attribute that can uniquely identify a tuple in a relation is called the key of the
table.
-e.g.; ROLL_NO in STUDENT is a key. No two students can have the same roll number.
-So a key has two properties:
○ It should be unique for all tuples.
○ It can’t have NULL values.
-There could be multiple keys in a single entity set, but out of these multiple keys,
only one key will be the primary key.
-A primary key can only contain unique and not null values in the relational database
table.
-The last row of the student table violates the key integrity constraint, since ID 1002
appears twice in the primary key column.
-A primary key must be unique and not null, therefore duplicate values are not allowed in
the ID column.
-A set of fields that uniquely identifies a tuple according to a key constraint is called a
candidate key.
→primary key:-A primary key is an attribute (or set of attributes) of a table whose task is
to uniquely identify the rows, or tuples, of that table.
-A primary key is used to uniquely identify the rows of a table.
-Properties of a Primary Key:
*A relation can contain only one primary key.
*A primary key is the minimum super key.
*primary key attribute should not be null.
*Primary key is always chosen from the possible candidate keys.
*Primary key cannot contain duplicate values.
Eg: CREATE TABLE Students ( sid INT PRIMARY KEY, name CHAR(30), login CHAR(20), age INT );
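-A small sketch of how a DBMS enforces the uniqueness property of a primary key. It assumes SQLite through Python's sqlite3 module; the second row (Jones) is made up, and the ID 1002 is borrowed from the key-constraint example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumed DBMS: in-memory SQLite
conn.execute(
    "CREATE TABLE Students (sid INT PRIMARY KEY, name CHAR(30), login CHAR(20), age INT)")
conn.execute("INSERT INTO Students VALUES (1002, 'Smith', 'smith@ee', 18)")

# A second row with the same sid violates the key constraint and is rejected.
try:
    conn.execute("INSERT INTO Students VALUES (1002, 'Jones', 'jones@cs', 19)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Students").fetchone()[0]
print(duplicate_rejected, row_count)
```

-Only the first row is stored; the insert that repeats the key raises an integrity error.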

➢ Foreign Key Constraints:-A foreign key is different from a super key, candidate key or
primary key because a foreign key is the one that is used to link two tables together
or create connectivity between the two.
-A foreign key links two tables together through a primary key.
-It means that the columns of one table point to the primary key attribute of the other
table.
-The use of a foreign key is simply to link the attributes of two tables together with
the help of a primary key attribute.
-It is used for creating and maintaining the relationship between the two relations.
-example; Consider two tables Student and Department having their respective
attributes as shown in the below table structure:

-In the tables, one attribute is common to both, namely Stud_Id.
-In the Student table, the field Stud_Id is a primary key because it uniquely
identifies every row of the Student table.
-On the other hand, Stud_Id is a foreign key attribute for the Department table
because it is acting as a primary key attribute for the Student table.
-It means that both the Student and Department table are linked with one another
because of the Stud_Id attribute.
-In the below-shown figure, you can view the relationship between the two tables.
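-The Student/Department link can be sketched as follows. This assumes SQLite through Python's sqlite3 module; every column other than Stud_Id (Name, Dept_Name) and all row values are hypothetical, since the original table structure is only shown in a figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE Student (Stud_Id INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("""
    CREATE TABLE Department (
        Dept_Name TEXT,
        Stud_Id   INTEGER,
        FOREIGN KEY (Stud_Id) REFERENCES Student (Stud_Id)
    )""")

conn.execute("INSERT INTO Student VALUES (101, 'Asha')")
conn.execute("INSERT INTO Department VALUES ('CS', 101)")  # 101 exists in Student

# 999 does not exist in Student, so this row would dangle and is rejected.
try:
    conn.execute("INSERT INTO Department VALUES ('EE', 999)")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True

dept_rows = conn.execute("SELECT Dept_Name, Stud_Id FROM Department").fetchall()
print(dangling_rejected, dept_rows)
```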
➢ Domain constraints:-Domain constraints can be defined as the definition of a valid
set of values for an attribute.
-The data type of domain includes string, character, integer, time, date, currency, etc.
The value of the attribute must be available in the corresponding domain.
-fig

*General Constraints:- refer to online sources (textbook page 103)

#E-R Model to Relational Model:-There is more than one approach to translating
an ER diagram to a relational model. They are:
1. Entity Sets to Tables:-An entity set is mapped to a relation in a straightforward way.
-Each attribute of the entity set becomes an attribute of the table.
-Consider the Employees entity set with attributes ssn, name, and lot shown in
Figure.
-the Employees entity set, containing three
Employees entities.
-The Employees Entity Set in a tabular format.

CREATE TABLE Employees ( ssn CHAR(11),
name CHAR(30) ,
lot INTEGER,
PRIMARY KEY (ssn) )

2.Relationship Sets (without Constraints) to Tables:-Create a table for the
relationship set.
- Add all primary keys of the participating entity sets as fields of the table.
-Add a field for each attribute of the relationship.
-Declare a primary key using all key fields from the entity sets.
-Declare foreign key constraints for all these fields from the entity sets

-eg;CREATE TABLE Works_In2 ( ssn CHAR(11),
did INTEGER,
address CHAR(20) ,
since DATE,
PRIMARY KEY (ssn, did, address),
FOREIGN KEY (ssn) REFERENCES Employees,
FOREIGN KEY (address) REFERENCES Locations,
FOREIGN KEY (did) REFERENCES Departments)

3.Translating Relationship Sets with Key Constraints:-


-It can be translated into relations using either of two approaches:

a).Defining three relations as follows:


->A separate relation for the entity set ‘Employees’.
->A separate relation for the entity set ‘ Departments’.
->A separate relation for the relationship set ‘ Manages’.
b).Defining two relations as follows:
->A separate relation for the entity set ‘Employees’.
->A common relation for ‘Departments’ and for ‘Manages’.
-Approach (a) is obtained with the following SQL statement for the relationship set
‘Manages’.
-CREATE TABLE Manages (ssn CHAR (11) ,
did INTEGER,
since DATE,
PRIMARY KEY (did),
FOREIGN KEY (ssn) REFERENCES Employees,
FOREIGN KEY (did)REFERENCES Departments)
-In approach (b), each department has at most one manager, so Manages and
Departments are combined into a single relation named ‘Dept_Mgr’.
-CREATE TABLE Dept_Mgr ( did INTEGER,
dname CHAR(20),
budget REAL,
ssn CHAR (11) ,
since DATE,
PRIMARY KEY (did),
FOREIGN KEY (ssn) REFERENCES Employees)
-Note that ssn can take on null values.

4.Translating Relationship Sets with Participation Constraints:-

-The Figure shows two relationship sets, Manages and Works_In.


-Every department is required to have a manager, due to the participation constraint,
and at most one manager, due to the key constraint.
-CREATE TABLE Dept_Mgr ( did INTEGER,
dname CHAR(20) ,
budget REAL,
ssn CHAR(11) NOT NULL,
since DATE,
PRIMARY KEY (did),
FOREIGN KEY (ssn) REFERENCES Employees ON DELETE NO
ACTION);
-Here we use the second approach (approach (b)) mentioned above, because the
constraint that every department must have a manager cannot be captured with the
first approach.
-Since the ER diagram shows a total participation constraint on the Department entity
set as well as a key constraint, there must be exactly one manager per department.
-Therefore the attribute ssn cannot take on null values.
-The NO ACTION specification (which is not really needed since it is the default) ensures
that an Employee tuple cannot be deleted while it is pointed to by a 'Dept_Mgr' tuple.
-The total participation constraint shown in the 'Works_In' relationship set cannot be
specified in the above SQL statement unless we use table constraints or assertions.

5.Translating Weak Entity Sets:-A weak entity set always participates in a one-to-many
binary relationship and has a key constraint and total participation.
-the weak entity has only a partial key. Also, when an owner entity is deleted, we want all
owned weak entities to be deleted.
-Consider the Dependents weak entity set with partial key pname.
-A Dependents entity can be identified uniquely only if we take the key of the owning
Employees entity and the pname of the Dependents entity, and the Dependents entity
must be deleted if the owning Employees entity is deleted.

-CREATE TABLE Dep_Policy ( pname CHAR(20) ,
age INTEGER,
cost REAL,
ssn CHAR (11) ,
PRIMARY KEY (pname, ssn),
FOREIGN KEY (ssn) REFERENCES Employees ON DELETE CASCADE )
-the primary key is (pname, ssn) , since Dependents is a weak entity.
-We have to ensure that every Dependents entity is associated with an Employees entity
(the owner), as per the total participation constraint on Dependents.
-The ssn cannot be null.
-The CASCADE option ensures that information about an employee's policy and
dependents is deleted if the corresponding Employees tuple is deleted.
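-A minimal sketch of the CASCADE behaviour described above, assuming SQLite through Python's sqlite3 module. The employee and dependent rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this for ON DELETE CASCADE

conn.execute("CREATE TABLE Employees (ssn CHAR(11) PRIMARY KEY, name CHAR(30), lot INTEGER)")
conn.execute("""
    CREATE TABLE Dep_Policy (
        pname CHAR(20),
        age   INTEGER,
        cost  REAL,
        ssn   CHAR(11),
        PRIMARY KEY (pname, ssn),
        FOREIGN KEY (ssn) REFERENCES Employees (ssn) ON DELETE CASCADE
    )""")

# One owner entity with two weak (owned) entities; values are hypothetical.
conn.execute("INSERT INTO Employees VALUES ('123-22-3666', 'Attishoo', 48)")
conn.execute("INSERT INTO Dep_Policy VALUES ('Joe', 10, 50.0, '123-22-3666')")
conn.execute("INSERT INTO Dep_Policy VALUES ('Amy', 7, 40.0, '123-22-3666')")

count_sql = "SELECT COUNT(*) FROM Dep_Policy"
dependents_before = conn.execute(count_sql).fetchone()[0]

# Deleting the owner removes its dependents as well.
conn.execute("DELETE FROM Employees WHERE ssn = '123-22-3666'")
dependents_after = conn.execute(count_sql).fetchone()[0]

print(dependents_before, dependents_after)
```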

___________________________________________________________________________

Module-3-Structured Query Language

#Structured Query Language/ Overview of SQL:-Structured Query Language
(SQL) is the most widely used commercial relational database language.
-It was originally developed at IBM in the SEQUEL-XRM and System-R projects (1974-1977).
-Structured Query Language is a standard Database language which is used to create,
maintain and retrieve the relational database.
-SQL follows the following rules:
● Structured Query Language is not case sensitive. Generally, keywords of SQL are
written in uppercase.
● SQL statements are not tied to text lines. A single SQL statement can be written
on one line or spread over multiple lines.
● Using the SQL statements, you can perform most of the actions in a database.
● SQL is based on tuple relational calculus and relational algebra.
-The SQL language has several aspects to it.
● The Data Manipulation Language (DML) :- This subset of SQL allows users
to write queries and to insert, delete and modify rows
● The Data Definition Language(DDL) :- This subset of SQL supports the
creation, deletion and modification of definitions for tables and views.
● Triggers and Advanced Integrity Constraints :- Newer versions of the SQL standard
(SQL:1999 onward) include support for triggers, which are actions executed by the DBMS
whenever changes to the database meet conditions specified in the trigger.
- SQL allows the use of queries to specify complex integrity constraints.
● Embedded and Dynamic SQL: Embedded SQL features allow SQL
code to be called from a host language such as C or COBOL. Dynamic
SQL features allow a query to be constructed (and executed) at run-time.
● Client-Server Execution and Remote Database Access: These commands control
how a client application program can connect to an SQL
database server, or access data from a database over a network
● Transaction Management: Various commands allow a user to control the aspects of
how a transaction is to be executed
● Security: SQL provides mechanisms to control users' access to data objects such as
tables and views.
● Advanced Features :- Newer versions of SQL support object-oriented features, recursive
queries, decision support queries etc.

*Basic Queries in SQL:-A query describes the result that is wanted; a DBMS would typically
execute it in a different and more efficient way than a literal, step-by-step reading of the
query suggests.
-The basic form of an SQL query is as follows:
● SELECT [DISTINCT] select-list
● FROM from-list
● WHERE qualification
-Every query must have a SELECT clause and a FROM clause. The WHERE clause is optional.
*UNION,INTERSECT and EXCEPT:-UNION, EXCEPT, and INTERSECT are operators that
operate similarly and are used between two queries to form Boolean combinations between
the results of the two queries.
-Given queries A and B, UNION returns all records returned by either A or B.
-EXCEPT returns all records in A but not in B.
-INTERSECT returns all records returned by both A and also by B.
-eg:Consider the tables called Invoices and Students.

-The Invoices table lists invoices for construction work done by named people, with some
names appearing more than once.
-The Students table lists names of students.
-Some names appear in both tables.
-We can form Boolean combinations between the results of queries involving those tables.
-For example, we may want to know the names of all people working on construction
projects who are not students.
- The UNION, EXCEPT, and INTERSECT operators can form Boolean combinations between
any two queries.
-Examples: SELECT Name FROM Invoices UNION SELECT Name FROM Students;
-The query above returns a table with all names found either in the Invoices table or in the
Students table, removing duplicates.
- To keep duplicates in the results table, we use ALL:
-eg: SELECT Name FROM Invoices UNION ALL SELECT Name FROM Students;

→EXCEPT
-eg:SELECT Name FROM Invoices EXCEPT SELECT Name FROM Students;

-The query above returns a table with all names found in the Invoices table except those
who are students, removing duplicates.
-To keep duplicates in the results table, we use ALL:
-eg:SELECT Name FROM Invoices EXCEPT ALL SELECT Name FROM Students;

→INTERSECT
-eg:SELECT Name FROM Invoices INTERSECT SELECT Name FROM Students;

-The query above returns a table with all names found in the Invoices table who also are
students, removing duplicates.
-To keep duplicates in the results table, we use ALL:
-eg:SELECT Name FROM Invoices INTERSECT ALL SELECT Name FROM Students;
-Note:INTERSECT has priority over UNION or EXCEPT.
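-The three set operators can be tried out as sketched below. This assumes SQLite through Python's sqlite3 module; the Amount column and all rows are made up, since the original tables are only shown in a figure. Note that SQLite accepts ALL only with UNION, so the ALL variants of EXCEPT and INTERSECT are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Invoices (Name TEXT, Amount REAL)")
conn.execute("CREATE TABLE Students (Name TEXT)")

# Hypothetical data: Bob has two invoices and is also a student.
conn.executemany("INSERT INTO Invoices VALUES (?, ?)",
                 [("Ann", 100.0), ("Bob", 250.0), ("Bob", 75.0), ("Cid", 40.0)])
conn.executemany("INSERT INTO Students VALUES (?)", [("Bob",), ("Dee",)])

def names(sql):
    """Run a one-column query and return its values as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

union_names = names("SELECT Name FROM Invoices UNION SELECT Name FROM Students")
union_all_names = names("SELECT Name FROM Invoices UNION ALL SELECT Name FROM Students")
except_names = names("SELECT Name FROM Invoices EXCEPT SELECT Name FROM Students")
intersect_names = names("SELECT Name FROM Invoices INTERSECT SELECT Name FROM Students")

print(union_names, union_all_names, except_names, intersect_names)
```

-With this data, UNION gives four distinct names, UNION ALL gives all six rows, EXCEPT gives Ann and Cid, and INTERSECT gives only Bob.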
* Nested Queries:-One of the most powerful features of SQL is nested queries.
-A nested query is a query that has another query embedded within it; the embedded query
is called a subquery.
-The embedded query can of course be a nested query itself.
-A subquery typically appears within the WHERE clause of a query.
-It can sometimes appear in the FROM clause or HAVING clause.
-Example: Find the names of employees who have regno=103
-The query is as follows −
select E.ename from employee E where E.eid IN (select S.eid from salary S where
S.regno=103);
-There are mainly two types of nested queries:
1. Independent Nested Queries:-In independent nested queries, query execution starts
from innermost query to outermost queries.
-The execution of inner query is independent of outer query, but the result of inner
query is used in execution of outer query.
-Various operators like IN, NOT IN, ANY, ALL etc are used in writing independent
nested queries.
-eg:

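-If we want to find out the S_IDs of students who are enrolled in C_NAME ‘DSA’ or ‘DBMS’,
we can write it with the help of an independent nested query and the IN operator. From the
COURSE table, we can find out the C_IDs for C_NAME ‘DSA’ or ‘DBMS’, and we can use these
C_IDs for finding S_IDs from the STUDENT_COURSE table:
Select S_ID from STUDENT_COURSE where C_ID IN (Select C_ID from COURSE where
C_NAME='DSA' or C_NAME='DBMS');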

2. Co-related Nested Queries:-In co-related nested queries, the output of the inner query
depends on the row which is currently being processed in the outer query.
-e.g.; If we want to find out the S_NAME of STUDENTs who are enrolled in C_ID ‘C1’, it
can be done with the help of a co-related nested query as:
Select S_NAME from STUDENT S where EXISTS (Select * from STUDENT_COURSE SC where
S.S_ID=SC.S_ID and SC.C_ID='C1');
-For each row of STUDENT S, it will find the rows from STUDENT_COURSE where
S.S_ID = SC.S_ID and SC.C_ID='C1'. If for an S_ID from STUDENT S at least one row exists
in STUDENT_COURSE SC with C_ID='C1', then the inner query will return true and the
corresponding S_NAME will be returned as output.
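-Both kinds of nested query can be sketched as follows, assuming SQLite through Python's sqlite3 module. The table and column names follow the examples above, but the rows are made up because the original tables are only shown in a figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STUDENT (S_ID TEXT, S_NAME TEXT);
    CREATE TABLE COURSE (C_ID TEXT, C_NAME TEXT);
    CREATE TABLE STUDENT_COURSE (S_ID TEXT, C_ID TEXT);
    -- Hypothetical rows for illustration only.
    INSERT INTO STUDENT VALUES ('S1', 'Ram'), ('S2', 'Ramesh'), ('S3', 'Sujit');
    INSERT INTO COURSE VALUES ('C1', 'DSA'), ('C2', 'Programming'), ('C3', 'DBMS');
    INSERT INTO STUDENT_COURSE VALUES ('S1', 'C1'), ('S1', 'C3'), ('S2', 'C1'), ('S3', 'C2');
""")

# Independent nested query: the inner query runs once, on its own.
independent = sorted(row[0] for row in conn.execute("""
    SELECT DISTINCT S_ID FROM STUDENT_COURSE
    WHERE C_ID IN (SELECT C_ID FROM COURSE
                   WHERE C_NAME = 'DSA' OR C_NAME = 'DBMS')"""))

# Co-related nested query: the inner query refers to the current outer row S.
correlated = sorted(row[0] for row in conn.execute("""
    SELECT S_NAME FROM STUDENT S
    WHERE EXISTS (SELECT * FROM STUDENT_COURSE SC
                  WHERE S.S_ID = SC.S_ID AND SC.C_ID = 'C1')"""))

print(independent, correlated)
```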
*Aggregate Operators:-An aggregate function is used to perform a calculation on
multiple rows of a single column of a table. It returns a single value.
Or
-In database management an aggregate function is a function where the values of multiple
rows are grouped together as input on certain criteria to form a single value of more
significant meaning.
-It is also used to summarize the data.
-Types of SQL Aggregation Function are;

1. Count():-COUNT function is used to Count the number of rows in a database table.


- It can work on both numeric and non-numeric data types.
-COUNT(*) returns the count of all the rows in a specified table.
-COUNT(*) counts duplicate rows and rows containing NULL values.
-Syntax : COUNT(*)
-fig

-Eg 1:SELECT COUNT(*) FROM PRODUCT_MAST;


Output:10
-Eg 2:COUNT with WHERE
SELECT COUNT(*) FROM PRODUCT_MAST WHERE RATE>=20;
Output:7

2. Sum():-SUM function is used to calculate the sum of the values in the selected column.


-It works on numeric fields only.
-Syntax: SUM()
-Example: SUM()
SELECT SUM(COST) FROM PRODUCT_MAST;
Output:670
-Example 2: SUM() with WHERE
SELECT SUM(COST) FROM PRODUCT_MAST WHERE QTY>3;
Output:320

3. Avg():-The AVG function is used to calculate the average value of a numeric column.
-Syntax: AVG()
-Example: SELECT AVG(COST) FROM PRODUCT_MAST;
Output:67.00

4. Max():-MAX function is used to find the maximum value of a certain column.


-This function determines the largest value of all selected values of a column.
-Syntax: MAX()
-Example: SELECT MAX(RATE) FROM PRODUCT_MAST;
Output: 30

5. Min():-MIN function is used to find the minimum value of a certain column.


-This function determines the smallest value of all selected values of a column.
-Syntax: MIN()
-Example: SELECT MIN(RATE) FROM PRODUCT_MAST;
Output: 10
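-The five aggregate functions can be tried together as sketched below, assuming SQLite through Python's sqlite3 module. The three PRODUCT_MAST rows are made up, so the results differ from the outputs quoted above, which come from a table shown only in a figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT_MAST (PRODUCT TEXT, QTY INTEGER, RATE INTEGER, COST INTEGER)")
# Hypothetical rows, with COST = QTY * RATE.
conn.executemany("INSERT INTO PRODUCT_MAST VALUES (?, ?, ?, ?)",
                 [("Item1", 2, 10, 20), ("Item2", 2, 35, 70), ("Item3", 5, 30, 150)])

# All five aggregates in one query; each collapses many rows into one value.
summary = conn.execute(
    "SELECT COUNT(*), SUM(COST), AVG(COST), MAX(RATE), MIN(RATE) FROM PRODUCT_MAST"
).fetchone()

# Aggregates combined with WHERE only see the rows that pass the condition.
count_rate_20 = conn.execute(
    "SELECT COUNT(*) FROM PRODUCT_MAST WHERE RATE >= 20").fetchone()[0]
sum_qty_gt_3 = conn.execute(
    "SELECT SUM(COST) FROM PRODUCT_MAST WHERE QTY > 3").fetchone()[0]

print(summary, count_rate_20, sum_qty_gt_3)
```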

*Null Values:-So far we have assumed that the column values in a row are always known.
-In practice column values can be unknown.
-For example, when a sailor, say Dan, joins a yacht club, he may not yet have a rating
assigned.
-Since the definition for the Sailors table has a rating column, what row should we insert for
Dan? What is needed here is a special value that denotes unknown.
-SQL provides a special column value called null to use in such situations.
-We use null when the column value is either unknown or inapplicable.
-Using our Sailors table definition, we might enter the row (108, Dan, null, 39) to represent
Dan. The columns are (no, name, rate, age).
-It is important to understand that you cannot use comparison operators such as “=”, “<”, or
“>” with NULL values.
-This is because the NULL values are unknown and could represent any value.
-Instead, you must use “IS NULL” or “IS NOT NULL” operators to check if a value is NULL.
->Principles of NULL values
● Setting a NULL value is appropriate when the actual value is unknown, or when a
value is not meaningful.
● A NULL value is not equivalent to a value of ZERO if the data type is a number and is
not equivalent to spaces if the data type is a character.
● A NULL value can be inserted into columns of any data type.
● An arithmetic or comparison expression that involves a NULL value evaluates to NULL
(unknown).
● If a column value is NULL, UNIQUE, FOREIGN KEY, and CHECK constraints are not
violated by that value.
-We can replace a null value using the UPDATE statement.
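-A short sketch of how NULL behaves in comparisons, arithmetic and counting, assuming SQLite through Python's sqlite3 module. The Dan row follows the example above; the second sailor (Dustin) is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
conn.execute("INSERT INTO Sailors VALUES (108, 'Dan', NULL, 39)")
conn.execute("INSERT INTO Sailors VALUES (22, 'Dustin', 7, 45)")  # hypothetical row

# "= NULL" is never true, so this query finds nothing.
eq_null = conn.execute("SELECT sname FROM Sailors WHERE rating = NULL").fetchall()

# IS NULL is the correct test for a missing value.
is_null = conn.execute("SELECT sname FROM Sailors WHERE rating IS NULL").fetchall()

# Arithmetic on NULL gives NULL, which Python receives as None.
plus_one = conn.execute("SELECT rating + 1 FROM Sailors WHERE sid = 108").fetchone()[0]

# COUNT(*) counts every row, COUNT(rating) skips rows where rating is NULL.
counts = conn.execute("SELECT COUNT(*), COUNT(rating) FROM Sailors").fetchone()

print(eq_null, is_null, plus_one, counts)
```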

*String and Date Functions:-String functions are used to perform an operation on input
string and return an output string.
-String functions are the predefined functions that allow the database users for string
manipulation.
-Following are the most important string functions:
● ASCII():-This function is used to find the ASCII value of a character.
● CHAR_LENGTH():-This string function returns the length of the specified word. It
shows the number of characters from the word.
-Syntax: SELECT char_length('Hello!');
Output: 6

● CHARACTER_LENGTH():-This string function returns the length of the given string. It


shows the number of all characters and spaces from the sentence.
-Syntax: SELECT CHARACTER_LENGTH('geeks for geeks');
Output: 15

● CONCAT():-This function is used to join two or more words or strings. In Oracle the ||
operator does the same.
Syntax: SELECT 'Geeks' || ' ' || 'forGeeks' FROM dual;
Output: ‘Geeks forGeeks’

● FIND_IN_SET():-This function is used to find the position of a symbol in a
comma-separated set of symbols.
Syntax: SELECT FIND_IN_SET('b', 'a,b,c,d,e,f');
Output: 2

● FORMAT():-This function is used to display a number in the given format.


Syntax: Format("0.981", "Percent");
Output: ‘98.10%’

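● INSERT():-As a string function (in MySQL), INSERT() inserts a substring into a string at a
given position, replacing a given number of characters. It is different from the INSERT INTO
statement, which adds rows to a table.
Syntax: SELECT INSERT('Quadratic', 3, 4, 'What');
Output: ‘QuWhattic’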

● LCASE():-This function is used to convert the given string into lower case.
Syntax: LCASE ("GeeksFor Geeks To Learn");
Output: geeksfor geeks to learn

● LENGTH(): This function is used to find the length of a word.


Syntax: LENGTH('GeeksForGeeks');
Output: 13

● LOWER():-This function is used to convert an upper case string into lower case.
Syntax: SELECT LOWER('GEEKSFORGEEKS.ORG');
Output: geeksforgeeks.org

● REVERSE():-This function is used to reverse a string.


Syntax: SELECT REVERSE('geeksforgeeks.org');
Output: ‘gro.skeegrofskeeg’

● UCASE():-This function is used to make the string in upper case.


Syntax: UCASE ("GeeksForGeeks");
Output:GEEKSFORGEEKS

-The date and time functions in DBMS are quite useful to manipulate and store values
related to date and time.
-Here are some of the important date functions which are in-built in SQL;
● NOW():-It Returns the current date and time.
Syntax: SELECT NOW();
Output:2023-04-30 02:58:30

● CURDATE():- Returns the current date.


Query:SELECT CURDATE();
Output: 2023-04-30

● CURTIME():-Returns the current time.


Query:SELECT CURTIME();
Output: 03:00:05
● DATEDIFF():-Returns the number of days between two dates.
Syntax:DATEDIFF(date1, date2);
Query:SELECT DATEDIFF('2017-01-13','2017-01-03') AS DateDiff;
Output: 10

● DATE_FORMAT():- Displays date/time data in different formats.

*Complex Integrity Constraints in SQL:-Integrity constraints are a set of rules. They are used
to maintain the quality of information.
-Integrity constraints ensure that data insertion, updating, and other processes are
performed in such a way that data integrity is not affected.
-Thus, integrity constraints guard against accidental damage to the database.
-This section covers the more complex integrity constraints that can be specified in SQL.
-Several kinds of constraints are involved:
● Constraints over a single table:-It is possible to specify complex constraints over a
single table using table constraints, which have the form CHECK
conditional-expression.
-For example, to ensure that rating is an integer in the range 1 to 10, we could use
the following statement.
CREATE TABLE Sailors ( sid INTEGER,
sname CHAR(10),
rating INTEGER,
age REAL,
PRIMARY KEY (sid),
CHECK (rating >= 1 AND rating <= 10 ))
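-A minimal sketch of the CHECK constraint in action, assuming SQLite through Python's sqlite3 module. The two sailor rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Sailors (
        sid    INTEGER,
        sname  CHAR(10),
        rating INTEGER,
        age    REAL,
        PRIMARY KEY (sid),
        CHECK (rating >= 1 AND rating <= 10)
    )""")

# A rating inside the range 1 to 10 is accepted.
conn.execute("INSERT INTO Sailors VALUES (22, 'Dustin', 7, 45.0)")

# A rating outside the range violates the table constraint and is rejected.
try:
    conn.execute("INSERT INTO Sailors VALUES (31, 'Lubber', 11, 55.5)")
    out_of_range_rejected = False
except sqlite3.IntegrityError:
    out_of_range_rejected = True

stored = conn.execute("SELECT sid, rating FROM Sailors").fetchall()
print(out_of_range_rejected, stored)
```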

● Domain Constraints:-A user can define a new domain using the CREATE DOMAIN
statement, which makes use of CHECK constraints.
-CREATE DOMAIN ratingval INTEGER DEFAULT 1
CHECK (VALUE>=1 AND VALUE<=10)

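● Assertions:-Table constraints are associated with a single table.
-Table constraints are required to hold only if the associated table is nonempty.
-When a constraint involves two or more tables, the table constraint mechanism is
sometimes cumbersome and not quite what is desired.
-To cover such situations, SQL supports assertions, which are constraints not associated
with any one table.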

*Triggers and Views in SQL:-A trigger is a stored procedure in a database which
is automatically invoked whenever a special event in the database occurs.
-For example, a trigger can be invoked when a row is inserted into a specified table or when
certain table columns are being updated.
-trigger is always associated with a table.
-A trigger is called a special procedure because it cannot be called directly like a stored
procedure.
-The key distinction between the trigger and procedure is that a trigger is called
automatically when a data modification event occurs against a table.
-Views in SQL are kind of virtual tables.
-A view also contains rows and columns.
-A view can be created using the CREATE VIEW statement.
-A View can be created from a single table or multiple tables.

#Accessing Databases from Applications :-SQL commands can be executed
from within a program in a host language such as C or Java.

*Embedded SQL:-The use of SQL commands within a host language program is called
Embedded SQL.
-Embedded SQL is the one which combines the high level language with the DB language like
SQL.
-It allows the application languages to communicate with DB and get requested result.
-The high level languages which support embedding SQL within them are also known as host
languages.
-There are different host languages which support embedding SQL, like C, C++, ADA,
Pascal, FORTRAN, Java etc.
-When SQL is embedded within C or C++ using Oracle's precompiler, it is known as
Pro*C/C++ or simply Pro*C.
-Pro*C is a commonly used form of embedded SQL.
-Refer to online sources for more.
*Dynamic SQL:-SQL statements can be executed in one of two ways: statically or
dynamically.
1. Static SQL:-For statically executed SQL statements, the syntax is fully known at
precompile time.
2. Dynamic SQL:-Dynamic SQL is a programming technique used to construct SQL queries
at runtime.
-Dynamic SQL can be used to create general and flexible SQL queries.
-For the rest, refer to online sources.
*Cursors:-Whenever DML statements are executed, a temporary work area is created in
the system memory and it is called a cursor.
-A cursor can hold more than one row, but only one row is processed at a time.
-Cursors are very helpful in all kinds of databases like Oracle, SQL Server, MySQL, etc.
-Two different types of cursors are available.
1. Implicit cursors:-These types of cursors are generated and allocated by the SQL
server when the system performs INSERT, DELETE, and UPDATE operations on SQL
queries.
-This cursor is also referred to as the default cursor in SQL.
-An implicit cursor is also created by the system when the SELECT query selects the
single row.

2. Explicit cursors:-These types of cursors are created by the user using the SELECT
query.
-An explicit cursor holds multiple records but processes a single row at a time. It uses
the pointer, which moves to another row after reading one row.
-It is basically used for gaining extra control over the temporary work area.

#Relational Database Design

*Schema Refinement:-Normalization or Schema Refinement is a technique of organizing
the data in the database.
-It is a systematic approach of decomposing tables to eliminate data redundancy and
remove undesirable characteristics like insertion, update and deletion anomalies.
-Schema refinement means refining the schema by using some technique. The main
technique of schema refinement is decomposition.
-Identifying and clearing the future problems in the database is called schema refinement.
-The main problem addressed by this refinement is data redundancy. It is avoided by the
normalization technique.
-The basic goal of normalization is to eliminate redundancy.
-Redundancy refers to repetition of same data or duplicate copies of same data stored in
different locations.
-Although decomposition can eliminate redundancy, it can lead to problems of its own and
should be used with caution.

→Problems Caused by redundancy:-Storing the same information redundantly can lead
to several problems. They are:
● Redundant Storage: Some information is stored repeatedly.
● Update Anomalies: If one copy of such repeated data is updated and the other copies
are not updated in the same way, an inconsistency is produced.
● Insertion Anomalies: It may not be possible to store certain information unless some
other, unrelated, information is stored as well
● Deletion Anomalies: It may not be possible to delete certain information without
losing some other, unrelated, information as well.
→Decompositions : Decomposition means replacing a relation with a collection of smaller
relations. In other words, the process of breaking down a relation into smaller relations is
called decomposition.
-It breaks a table into multiple tables in a database.
-If the relation is not decomposed properly, it may lead to problems like loss of
information.
-If asked about Schema Refinement, also write about Problems Caused by Redundancy and Decompositions.

*Functional Dependencies:-A functional dependency is a relationship that exists
between two attributes (or sets of attributes).
-It is denoted as X → Y, where X is a set of attributes that is capable of determining the value
of Y.
-The attribute set on the left side of the arrow, X is called Determinant, while on the right
side, Y is called the Dependent.
-Example:Assume we have an employee table with attributes:
Emp_Id, Emp_Name, Emp_Address.
-Here the Emp_Id attribute can uniquely identify the Emp_Name attribute of the employee table,
because if we know the Emp_Id, we can tell the employee name associated with it.
-Functional dependency can be written as: Emp_Id → Emp_Name
-We can say that Emp_Name is functionally dependent on Emp_Id.
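-Whether a functional dependency X → Y holds in a given table instance can be checked mechanically: no two rows may agree on X and differ on Y. The sketch below is one way to do that; the function name fd_holds and the employee rows are hypothetical. Such a check can only show that the data does not contradict the dependency; the dependency itself is a statement about every possible instance.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs holds in this list of rows.

    rows is a list of dicts (one per tuple); lhs and rhs are lists of
    attribute names.
    """
    seen = {}  # maps each X-value to the Y-value first seen with it
    for row in rows:
        x_value = tuple(row[a] for a in lhs)
        y_value = tuple(row[a] for a in rhs)
        # setdefault stores y_value the first time and returns the stored one after that.
        if seen.setdefault(x_value, y_value) != y_value:
            return False  # same X, different Y: the dependency is violated
    return True

# Hypothetical employee rows; two different employees share the name Ravi.
employees = [
    {"Emp_Id": 1, "Emp_Name": "Ravi", "Emp_Address": "Kochi"},
    {"Emp_Id": 2, "Emp_Name": "Ravi", "Emp_Address": "Delhi"},
    {"Emp_Id": 3, "Emp_Name": "Meena", "Emp_Address": "Kochi"},
]

id_determines_name = fd_holds(employees, ["Emp_Id"], ["Emp_Name"])
name_determines_address = fd_holds(employees, ["Emp_Name"], ["Emp_Address"])
print(id_determines_name, name_determines_address)
```

-Here Emp_Id → Emp_Name holds, while Emp_Name → Emp_Address does not, because the two employees named Ravi have different addresses.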
-Types of Functional dependencies in DBMS:
● Trivial functional dependency:- (explanation is in the textbook)
● Non-Trivial functional dependency:-
● Multivalued functional dependency:-
● Transitive functional dependency:-

*Normal Forms: (to be explained; see the textbook)


● First Normal Form,
● Second Normal Form,
● Third Normal Form,
● Boyce Codd Normal Form.

___________________________________________________________________________
Module-4-Transaction Management

#Transaction Management:-A transaction is a logical unit of work
that contains one or more SQL statements. A transaction is an atomic unit.
-A transaction usually means that the data in the database has changed. One of the major
uses of DBMS is to protect the user’s data from system failures.
-It is done by ensuring that all the data is restored to a consistent state when the computer
is restarted after a crash.
-One of the important properties of the transaction is that it contains a finite number of
steps.
- Executing the same program multiple times will generate multiple transactions.
-Example: Transfer of 50₹ from Account A to Account B. Initially A= 500₹, B= 800₹.
● R(A) -- 500 // Read A balance 500
● A = A-50 // Deducting 50₹ from A.
● W(A)--450 //write A balance 450.
● R(B) -- 800 //Read B balance 800.
● B=B+50 // 50₹ is added to B's Account.
● W(B) --850 //Write B balance 850.
● commit // Stop.
-The updated value of Account A = 450₹ and Account B = 850₹.

->Uses of Transaction Management


-The DBMS is used to schedule concurrent access to data. It means that multiple users can
access data from the database without interfering with each other.
-Transactions are used to manage concurrency.
-It is also used to satisfy ACID properties.
-It is used to solve Read/Write Conflicts.
-It is used to implement Recoverability, Serializability, and Cascading.
-Transaction Management is also used for Concurrency Control Protocols and the Locking of
data.
→Disadvantages of using a Transaction
-It may be difficult to change the information within the transaction database by end-users.
-We need to always roll back and start from the beginning rather than continue from the
previous state.

*Properties of Transaction (ACID Properties):-For a transaction to be performed in a
DBMS, it must possess several properties, often called the ACID properties.
● A – Atomicity:-The entire transaction takes place at once or doesn't happen at all.
- or Either all actions are carried out or none are.
-Users should not have to worry about the effect of incomplete transactions.
● C – Consistency:-The database must be consistent before and after the transaction.
● I – Isolation:-Ensures that a transaction is isolated from other transactions.
● D – Durability:- The database should be durable enough to hold all its latest updates
even if the system fails or restarts.
-If a transaction updates a chunk of data in a database and commits, then the
database will hold the modified data.

->Atomicity:-By this, we mean that either the entire transaction takes place at once or
doesn’t happen at all.
-There is no midway i.e. transactions do not occur partially.
-Each transaction is considered as one unit and either runs to completion or is not executed
at all.
-Atomicity involves the following two operations:
1. Abort: If a transaction aborts then all the changes made are not visible.
2. Commit: If a transaction commits then all the changes made are visible.
-Atomicity is also known as the ‘All or nothing rule’.
-Example:Consider the following transaction T consisting of T1 and T2: Transfer of 100 from
account X to account Y.

-If the transaction fails after completion of T1 but before completion of T2 (say, after
write(X) but before write(Y)), then the amount has been deducted from X but not added to
Y.
-This results in an inconsistent database state.
-Therefore, the transaction must be executed in its entirety in order to ensure the
correctness of the database state.

->Consistency:-This means that integrity constraints must be maintained so that the
database is consistent before and after the transaction.
-It refers to the correctness of a database.
-Referring to the example above, the total amount before and after the transaction must be the same.
-Total before T occurs = 500 + 200 = 700.
-Total after T occurs = 400 + 300 = 700.
-Therefore, the database is consistent. Inconsistency occurs in case T1 completes but T2 fails.
As a result, T is incomplete.
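
-The total-amount check above can be written as a small invariant test. This is only a sketch: the function names are hypothetical, and "consistency" here is reduced to the single rule that the sum of the two balances must not change.

```python
def total(balances):
    """Sum of all account balances."""
    return sum(balances.values())


def is_consistent(before, after):
    """The transfer is consistent if the total amount is preserved."""
    return total(before) == total(after)


before = {"X": 500, "Y": 200}
after_complete = {"X": 400, "Y": 300}   # T1 and T2 both completed
after_partial = {"X": 400, "Y": 200}    # T1 completed but T2 failed

complete_ok = is_consistent(before, after_complete)
partial_ok = is_consistent(before, after_partial)
```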

->Isolation:-It shows that the data which is used at the time of execution of a transaction
cannot be used by the second transaction until the first one is completed.
-In isolation, if the transaction T1 is being executed and using the data item X, then that data
item can't be accessed by any other transaction T2 until the transaction T1 ends.
-Example: Let X = 500, Y = 500. Consider two transactions T and T''.
-Suppose T has executed up to Read(Y), having already written its new value of X (50,000), and
then T'' starts.
-As a result, the operations interleave: T'' reads the correct (updated) value of X but an
incorrect (not yet updated) value of Y, and the sum computed by
T'': (X + Y = 50,000 + 500 = 50,500)
is thus not consistent with the sum at the end of the transaction
T: (X + Y = 50,000 + 450 = 50,450).
-This results in database inconsistency, due to a loss of 50 units. Hence, transactions must
take place in isolation, and their changes should become visible to other transactions only
after they have been committed.
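
-A small Python simulation of this example. The operations of T (multiply X by 100, subtract 50 from Y) are not listed in the text; they are an assumption inferred from the figures 50,000 and 450.

```python
def interleaved_run():
    """T'' runs in the middle of T and sees a half-finished state."""
    db = {"X": 500, "Y": 500}
    x = db["X"]                 # T: Read(X)
    db["X"] = x * 100           # T: Write(X), X becomes 50,000
    y = db["Y"]                 # T: Read(Y)
    seen_by_t2 = db["X"] + db["Y"]   # T'': reads new X but old Y
    db["Y"] = y - 50            # T: Write(Y), Y becomes 450
    final_sum = db["X"] + db["Y"]
    return seen_by_t2, final_sum


def serial_run():
    """T'' runs only after T has finished."""
    db = {"X": 500, "Y": 500}
    db["X"] = db["X"] * 100
    db["Y"] = db["Y"] - 50
    seen_by_t2 = db["X"] + db["Y"]
    return seen_by_t2, db["X"] + db["Y"]


interleaved = interleaved_run()
serial = serial_run()
```

-In the interleaved run the sum seen by T'' differs from the final sum by 50; in the serial run the two agree.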

->Durability:-This property ensures that once the transaction has completed execution, the
updates and modifications to the database are stored in and written to disk and they persist
even if a system failure occurs.
-These updates now become permanent and are stored in non-volatile memory.
-The effects of the transaction, thus, are never lost.
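
-One way to picture durability is a commit log that is forced to disk before the commit is reported. The sketch below is an assumption for illustration only (a JSON-lines file and the hypothetical functions `commit` and `recover`); it is not how any particular DBMS stores its log.

```python
import json
import os
import tempfile


def commit(log_path, updates):
    """Append the committed updates to the log and force them to disk."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(updates) + "\n")
        log.flush()
        os.fsync(log.fileno())   # ask the OS to write through to stable storage


def recover(log_path):
    """Rebuild the database state after a restart by replaying the log."""
    state = {}
    if os.path.exists(log_path):
        with open(log_path, encoding="utf-8") as log:
            for line in log:
                state.update(json.loads(line))
    return state


log_path = os.path.join(tempfile.mkdtemp(), "commit.log")
commit(log_path, {"X": 400, "Y": 300})
recovered = recover(log_path)   # what a restarted system would read back
```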

->Advantages of ACID Properties in DBMS:


● Data Consistency: ACID properties ensure that the data remains consistent and
accurate after any transaction execution.
● Data Integrity: ACID properties maintain the integrity of the data by ensuring that
any changes to the database are permanent and cannot be lost.
● Concurrency Control: ACID properties help to manage multiple transactions
occurring concurrently by preventing interference between them.
● Recovery: ACID properties ensure that in case of any failure or crash, the system can
recover the data up to the point of failure or crash.

→Disadvantages of ACID Properties in DBMS:


● Performance: The ACID properties can cause a performance overhead in the system,
as they require additional processing to ensure data consistency and integrity.
● Scalability: The ACID properties may cause scalability issues in large distributed
systems where multiple transactions occur concurrently.
● Complexity: Implementing the ACID properties can increase the complexity of the
system and require significant expertise and resources.
-Overall, the advantages of ACID properties in DBMS outweigh the disadvantages.
They provide a reliable and consistent approach to data management, ensuring data
integrity, accuracy, and reliability. However, in some cases, the overhead of
implementing ACID properties can cause performance and scalability issues.
-Therefore, it’s important to balance the benefits of ACID properties against the
specific needs and requirements of the system.

*Concurrent Execution of Transactions:-In a multi-user system, multiple users can
access and use the same database at one time, which is known as concurrent execution
of the database.
-It means that the same database is used simultaneously on a multi-user system by
different users.
-During transaction processing, a system usually allows more than one transaction to execute
simultaneously. This is called concurrent execution.
->Advantages of concurrent execution of a transaction
● Decrease waiting time or turnaround time.
● Improve response time
● Increased throughput or resource utilization.
->Concurrency problems
-Several problems can occur when concurrent transactions are run in an uncontrolled
manner; these are known as concurrency problems.
-The following types of problems or conflicts occur due to concurrent
execution of transactions:
1. Lost update problem (Write – Write conflict):-This type of problem occurs when two
transactions access the same data item and their operations are interleaved in a
way that leaves the value of that data item incorrect.
-If two transactions T1 and T2 read the same data item and then both
update it, the second write overwrites the first write.
-Example: Let the initial value of A be 100.
-Explanation of the figure:
❖ At time t1, T1 reads the value of A, i.e., 100.
❖ At time t2, T1 deducts 50 from its value of A.
❖ At time t3, T2 reads the value of A, i.e., 100.
❖ At time t4, T2 adds 150 to its value of A.
❖ At time t5, T1 writes the value of A on the basis of the value computed at time t2, i.e., 50.
❖ At time t6, T2 writes the value of A on the basis of the value computed at time t4, i.e., 250.
❖ So at time t6, the update of T1 is lost
because T2 overwrites the value of A without looking at its current value.
❖ This type of problem is known as the Lost Update Problem.
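
-The schedule above can be replayed step by step in Python. This is a sketch with hypothetical function names; each transaction works on a private copy of A, which is what allows the second write to wipe out the first.

```python
def lost_update_schedule(a=100):
    """Interleaved schedule t1..t6 from the example."""
    db = {"A": a}
    t1_local = db["A"]      # t1: T1 reads A
    t1_local -= 50          # t2: T1 deducts 50 from its copy
    t2_local = db["A"]      # t3: T2 reads A (still the old value)
    t2_local += 150         # t4: T2 adds 150 to its copy
    db["A"] = t1_local      # t5: T1 writes its result
    db["A"] = t2_local      # t6: T2 overwrites it; T1's deduction is lost
    return db["A"]


def serial_schedule(a=100):
    """T1 runs to completion, then T2 runs."""
    db = {"A": a}
    db["A"] -= 50           # T1
    db["A"] += 150          # T2 starts from T1's result
    return db["A"]


interleaved_result = lost_update_schedule()
serial_result = serial_schedule()
```

-The two results differ by exactly the 50 that T1 deducted, which is the lost update.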

2. Dirty read problem (W-R conflict):-This type of problem occurs when one
transaction T1 updates a data item of the database and then fails
for some reason, but its updates have already been read by some other transaction.
-Example: Let the initial value of A be 100.
-Explanation of the figure:
❖ At time t1, T1 reads the value of A, i.e., 100.
❖ At time t2, T1 adds 20 to the value of A.
❖ At time t3, T1 writes the value of A (120) to the
database.
❖ At time t4, T2 reads the value of A,
i.e., 120.
❖ At time t5, T2 adds 30 to the value of A.
❖ At time t6, T2 writes the value of A (150) to the database.
❖ At time t7, T1 fails due to a power failure and is rolled back according to
the atomicity property of transactions (either all or none).
❖ So, at time t4 transaction T2 holds a value which was never committed to the
database. The read performed by T2 is known as a dirty read.
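
-The same steps as a Python sketch (the function name and the separate `committed` dictionary are assumptions for illustration). It returns the value T2 read, the value T2 wrote, and the only value that was ever committed.

```python
def dirty_read_schedule(a=100):
    committed = {"A": a}          # last committed state
    db = dict(committed)          # shared working state, no isolation
    t1_local = db["A"]            # t1: T1 reads A
    t1_local += 20                # t2: T1 adds 20
    db["A"] = t1_local            # t3: T1 writes 120, not yet committed
    t2_read = db["A"]             # t4: T2 reads the uncommitted value
    t2_written = t2_read + 30     # t5: T2 adds 30
    db["A"] = t2_written          # t6: T2 writes its result
    # t7: T1 fails and is rolled back, so the value T2 read at t4
    # never becomes part of any committed state.
    return t2_read, t2_written, committed["A"]


dirty_value, derived_value, committed_value = dirty_read_schedule()
```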

3. Unrepeatable read (R-W Conflict):-It is also known as the inconsistent retrieval
problem. It occurs when a transaction T1 reads the value of a data item twice and the data item is
changed by another transaction T2 between the two read operations.
-Hence T1 sees two different values for its two reads of the same data
item.
-Example: Let the initial value of A be 100.
-Explanation of the figure:
❖ At time t1, T1 reads the value of A, i.e., 100.
❖ At time t2, T2 reads the value of A, i.e., 100.
❖ At time t3, T2 adds 30 to the value of A.
❖ At time t4, T2 writes the value of A (130) to
the database.
❖ Transaction T2 has updated the value of A. Thus, when
T1 performs another read,
it sees the new value of A, which was written by T2.
This type of conflict is known as an R-W conflict.
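
-A short Python replay of this schedule (function name hypothetical), returning the two values T1 sees for the same data item:

```python
def unrepeatable_read_schedule(a=100):
    db = {"A": a}
    first_read = db["A"]     # t1: T1 reads A
    t2_local = db["A"]       # t2: T2 reads A
    t2_local += 30           # t3: T2 adds 30
    db["A"] = t2_local       # t4: T2 writes the new value
    second_read = db["A"]    # T1 reads A again within the same transaction
    return first_read, second_read


first_read, second_read = unrepeatable_read_schedule()
```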

*Serialisability:-Serializability is a way to check whether the concurrent execution of two or more
transactions maintains database consistency.
-It allows transactions to execute simultaneously while guaranteeing the same outcome as if
they had run one after another, without interfering with one another.
-A non-serial schedule is called a serializable schedule if its effect is equivalent to that of some serial schedule.
-There are different types of serializability. Each type has some advantages and
disadvantages. The two most common types are:
1. Conflict Serializability:-Two operations conflict if they belong to different
transactions, operate on the same data item, and at least one of them is a write.
A schedule is conflict serializable if its conflicting operations are executed in the
same order as in some serial schedule, which maintains the consistency of the database.
-In a DBMS, each transaction has a unique identifier, and the operations of a
transaction are tracked by that identifier.
-This ensures that no two conflicting operations on the same data item are
executed concurrently.
-For example, consider the order table and the customer
table. One customer can have multiple orders, but each order belongs to only one
customer.
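
-A common textbook test for conflict serializability builds a precedence graph (an edge Ti -> Tj for every conflicting pair where Ti's operation comes first) and checks it for cycles. The sketch below assumes a schedule is given as a list of hypothetical `(transaction, operation, item)` tuples, with `"R"` for read and `"W"` for write.

```python
def precedence_edges(schedule):
    """Return the set of edges (Ti, Tj) induced by conflicting operations."""
    edges = set()
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if ti != tj and item_i == item_j and "W" in (op_i, op_j):
                edges.add((ti, tj))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True
        visiting.add(node)
        cyclic = any(visit(nxt) for nxt in graph[node])
        visiting.discard(node)
        done.add(node)
        return cyclic

    return any(visit(node) for node in graph)


def is_conflict_serializable(schedule):
    return not has_cycle(precedence_edges(schedule))


# Interleaving of the lost update problem: R1(A) R2(A) W1(A) W2(A).
lost_update = [("T1", "R", "A"), ("T2", "R", "A"), ("T1", "W", "A"), ("T2", "W", "A")]
# The same operations run serially: T1 then T2.
serial = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]
```

-The lost update interleaving produces edges in both directions between T1 and T2, so it is not conflict serializable; the serial order produces only T1 -> T2.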

2. View Serializability:-A schedule is view serializable if every transaction in it
produces the same results as it would in some proper sequential (serial) execution
on the data items.
-Unlike conflict serializability, view serializability looks at what each transaction
reads and at the final value written to each data item, so that the database is not left inconsistent.
-Two schedules are view equivalent if, for every data item, the same transaction
performs the initial read, every read sees the value produced by the same write, and
the same transaction performs the final write.
-To understand view serializability better, we consider two schedules S1 and S2.
-These two schedules are created from the same two transactions T1 and T2; S1 is view
serializable if it is view equivalent to a serial schedule S2.
->The benefits of using serializability in the database:
● Predictable execution: Under serializability, the transactions (threads) of the DBMS behave as if
they were executed one at a time.
-There are no surprises in the DBMS.
-All the variables are updated as expected, and there is no data loss or
corruption.
● Easier to Reason about & Debug: Since every transaction behaves as if it were executed alone, it is
much easier to understand what each one does.
-This makes the debugging process easier, because we don't have to worry about
concurrent processes.
● Reduced Costs: With the help of the serializability property, we can reduce the cost of the
hardware that is used for the smooth operation of the database.
-It can also reduce the development cost of the software.
● Increased Performance: In some cases, serializable executions can perform better
than their non-serializable counterparts, since they allow the developer to optimize
the code for performance.

*Anomalies Due to Interleaved Execution:-The three anomalous situations can be
described in terms of when the actions of two transactions T1 and T2 conflict with each
other:
-In a write-read (WR) conflict (Reading Uncommitted Data), T2 reads a data object previously
written by T1.
-Read-write (RW) conflicts (Unrepeatable Reads) and write-write (WW) conflicts
(Overwriting Uncommitted Data) are defined similarly.
-The answer is given earlier in these notes (important: do not use the names written there).
eg: Instead of "Dirty read problem (W-R conflict)", write "Reading Uncommitted Data (WR Conflicts)".
->Difference Between Serial Schedule and Serializable Schedule:

*Lock-Based Concurrency Control:-In this type of protocol, a transaction cannot
read or write data until it acquires an appropriate lock on it.
-This helps in eliminating concurrency problems by restricting access to a locked data item
to the transaction that holds the lock.
-There are two types of lock:
1. Shared lock:-It is also known as a Read-only lock.
-In a shared lock, the data item can only be read by the transaction.
-The transaction does not have permission to update the data item.
-Eg: Consider a case where two transactions are reading the account
balance of a person. The database will let both of them read by placing a shared lock.
However, if another transaction wants to update that account’s balance, the shared lock
prevents it until the reading is over.

2. Exclusive lock:-Under an exclusive lock, the data item can be both read and
written by the transaction.
-This lock is exclusive: while it is held, multiple transactions cannot modify the same
data simultaneously.
-Consider a transaction (T2) which needs to update the value of data item A. The
following steps take place when the lock protocol is applied to this transaction:
● T2 will acquire an exclusive lock on the data item A.
● Read the current value of data item A.
● Modify the data item as required. In the example illustrated, a value of 50 is
subtracted from the data item A.
● Write the updated value of the data item.
● Once the transaction is completed, the data item will be unlocked.
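
-The behaviour of the two lock types can be sketched as a tiny lock table in Python. The class and method names are hypothetical, and the sketch deliberately ignores waiting queues and lock upgrades; it only decides whether a request can be granted immediately.

```python
class LockTable:
    """Grants shared ('S') and exclusive ('X') locks on data items."""

    def __init__(self):
        self.locks = {}   # item -> (mode, set of transactions holding it)

    def acquire(self, txn, item, mode):
        """Return True if the lock is granted, False if txn must wait."""
        held = self.locks.get(item)
        if held is None:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        # Only shared locks are compatible with each other.
        if mode == "S" and held_mode == "S":
            holders.add(txn)
            return True
        return False

    def release(self, txn, item):
        held = self.locks.get(item)
        if held is None:
            return
        _mode, holders = held
        holders.discard(txn)
        if not holders:
            del self.locks[item]


table = LockTable()
read_1 = table.acquire("T1", "balance", "S")      # first reader
read_2 = table.acquire("T2", "balance", "S")      # second reader shares the lock
write_3 = table.acquire("T3", "balance", "X")     # writer must wait for the readers
table.release("T1", "balance")
table.release("T2", "balance")
write_3_retry = table.acquire("T3", "balance", "X")   # granted once reading is over
```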

-There are four types of lock protocols available:
1. Simplistic lock protocol:-(no explanation needed)
2. Pre-claiming Lock Protocol:-(no explanation needed)
3. Two-phase locking (2PL):-A transaction is said to follow the Two-Phase
Locking protocol if its locking and unlocking are done in two phases.
-The locking and unlocking of a transaction take place in one of the 2 phases,
i.e., the Growing or the Shrinking phase.
-Growing phase: It is the phase where new locks can be acquired on the data
items.
-Shrinking phase: It is the phase where the existing locks on the data items
are released.
-Lock point: the point where the growing phase ends and the shrinking phase begins.
(no need to explain anything further)
4. Strict Two-phase locking (Strict-2PL):-In the first phase, after acquiring all the
locks, the transaction continues to execute normally.
-The only difference between 2PL and Strict-2PL is that Strict-2PL does not
release a lock immediately after using it.
-Strict-2PL waits until the whole transaction commits, and then it releases
all the locks at once.
-The Strict-2PL protocol therefore has no gradual shrinking phase of lock release.
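
-Both rules can be checked mechanically on the sequence of actions of one transaction. The sketch below assumes a hypothetical representation in which each action is an `(action, item)` tuple with action `"lock"`, `"unlock"` or `"commit"`, and follows the description of Strict-2PL given in these notes.

```python
def follows_2pl(actions):
    """No lock may be acquired once the first lock has been released."""
    shrinking = False
    for action, _item in actions:
        if action == "unlock":
            shrinking = True            # the lock point has passed
        elif action == "lock" and shrinking:
            return False                # growing again after shrinking
    return True


def follows_strict_2pl(actions):
    """All locks are held until commit and released only afterwards."""
    committed = False
    for action, _item in actions:
        if action == "commit":
            committed = True
        elif action == "unlock" and not committed:
            return False                # released a lock before commit
        elif action == "lock" and committed:
            return False                # acquired a lock after commit
    return True


two_phase = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
not_two_phase = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]
strict = [("lock", "A"), ("lock", "B"), ("commit", None),
          ("unlock", "A"), ("unlock", "B")]
```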

*Deadlocks:-Consider the above execution phase. Now, T1
holds an Exclusive lock over B, and T2 holds a Shared lock over A.
-Consider Statement 7, where T2 requests a lock on B, while in
Statement 8 T1 requests a lock on A.
-As you may notice, this causes a deadlock, as neither can proceed
with its execution.
-Such a cycle of transactions waiting for locks to be released is
called a deadlock.
-Clearly, these two transactions will make no further progress.
-Worse, they hold locks that may be required by other
transactions.
-The DBMS must either prevent or detect (and resolve) such
deadlock situations; the common approach is to detect and
resolve deadlocks.
-A simple way to identify deadlocks is to use a timeout mechanism.
-If a transaction has been waiting too long for a lock, we can assume (pessimistically)
that it is in a deadlock cycle and abort it.

*Crash Recovery:-Crash recovery is the process by which the database is moved back to a
consistent and usable state.
-This is done by rolling back incomplete transactions and completing committed transactions
that were still in memory when the crash occurred.
-Conditions that can result in transaction failure
● A power failure on the machine causing the database manager and the database
partitions on it to go down.
● A hardware failure such as memory corruption, or disk, CPU, or network failure.
● A serious operating system error that causes the DB to go down
-When a DBMS is restarted after a crash, the recovery manager is given control
and must bring the database to a consistent state.
-The recovery manager is also responsible for undoing the actions of an aborted transaction.

->Stealing Frames and Forcing pages:-google


->Overview of ARIES:-google

*Dealing with Deadlocks:-A deadlock is a condition where two or more transactions are
waiting indefinitely for one another to give up locks.
-Deadlock is said to be one of the most feared complications in DBMS as no task ever gets
finished and is in waiting state forever.
-For example: In the student table, transaction T1 holds a lock on some rows and needs to
update some rows in the grade table. Simultaneously, transaction T2 holds locks on some
rows in the grade table and needs to update the rows in the Student table held by
Transaction T1.

-Now, the main problem arises. Now Transaction T1 is waiting for T2 to release its lock and
similarly, transaction T2 is waiting for T1 to release its lock.
->Deadlock Detection:In a database, when a transaction waits indefinitely to obtain a lock,
then the DBMS should detect whether the transaction is involved in a deadlock or not.
-The lock manager maintains a wait-for graph to detect deadlock cycles in the
database.
❖ Wait-for Graph:-This is a suitable method for deadlock detection. In this method, a
graph is created based on the transactions and their locks.
-If the created graph has a cycle or closed loop, then there is a deadlock.
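
-A minimal Python sketch of the cycle check. The wait-for graph is assumed to be a dictionary mapping each transaction to the set of transactions it is waiting for; the function name is hypothetical.

```python
def has_deadlock(wait_for):
    """Return True if the wait-for graph contains a cycle."""
    visiting, done = set(), set()

    def visit(txn):
        if txn in done:
            return False
        if txn in visiting:
            return True                      # came back to a waiting transaction
        visiting.add(txn)
        cyclic = any(visit(other) for other in wait_for.get(txn, ()))
        visiting.discard(txn)
        done.add(txn)
        return cyclic

    return any(visit(txn) for txn in list(wait_for))


# T1 waits for T2 and T2 waits for T1: a closed loop.
deadlocked = {"T1": {"T2"}, "T2": {"T1"}}
# T1 waits for T2, T2 waits for T3, T3 waits for nobody: no loop.
no_deadlock = {"T1": {"T2"}, "T2": {"T3"}, "T3": set()}
```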

->Deadlock prevention: For a large database, the deadlock prevention method is
suitable.
-A deadlock can be prevented if the resources are allocated in such a way that a
deadlock never occurs.
-The deadlock prevention mechanism proposes two schemes:
1. Wait-Die Scheme:-In this scheme, if a transaction requests a resource that is
locked by another transaction, the DBMS simply checks the timestamps
of both transactions and allows the older transaction to wait until the
resource is available for execution.
-Suppose there are two transactions T1 and T2, and let the timestamp of any
transaction T be TS(T). If T2 holds a lock on some resource
and T1 requests that resource, then the DBMS performs the following
actions:
-It checks if TS(T1) < TS(T2): if T1 is the older transaction and T2 holds
the resource, then T1 is allowed to wait until the resource is available for
execution.
-That means if a younger transaction has locked some resource and an older
transaction is waiting for it, then the older transaction is allowed to wait
till it is available.
-If T1 is the older transaction and holds some resource, and T2 is
waiting for it, then T2 is killed and restarted later with a random delay but with
the same timestamp, i.e.,
if the older transaction holds some resource and the younger transaction
requests it, then the younger transaction is killed and restarted
after a very minute delay with the same timestamp.
-This scheme allows the older transaction to wait but kills the younger one.
Or
-Check if TS(Ti) < TS(Tj): if Ti is the older transaction and Tj holds some
resource, then Ti is allowed to wait until the data item is available for
execution. That means if the older transaction is waiting for a resource which
is locked by the younger transaction, then the older transaction is allowed to
wait for the resource until it is available.
-Check if TS(Ti) < TS(Tj): if Ti is the older transaction and holds some resource,
and Tj is waiting for it, then Tj is killed and restarted later with a random
delay but with the same timestamp.

2. Wound Wait Scheme:-In this scheme, if an older transaction requests a
resource held by a younger transaction, the older transaction forces the
younger transaction to abort and release the resource.
-The younger transaction is restarted after a minute delay but with the same
timestamp.
-If the younger transaction requests a resource that is held by an older
one, then the younger transaction is asked to wait till the older one releases
it.
Or
-In the wound-wait scheme, if the older transaction requests a resource which
is held by the younger transaction, then the older transaction forces the younger one
to abort and release the resource. After a minute delay, the
younger transaction is restarted but with the same timestamp.
-If the older transaction holds a resource which is requested by the
younger transaction, then the younger transaction is asked to wait until the older one
releases it.
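
-Both schemes reduce to one timestamp comparison between the requesting transaction and the transaction holding the resource. In the sketch below a smaller timestamp means an older transaction; the function names and the returned strings are hypothetical.

```python
def wait_die(ts_requester, ts_holder):
    """Older requester waits; younger requester dies (is restarted later)."""
    return "wait" if ts_requester < ts_holder else "die"


def wound_wait(ts_requester, ts_holder):
    """Older requester wounds (aborts) the holder; younger requester waits."""
    return "wound holder" if ts_requester < ts_holder else "wait"


# T1 is older (timestamp 1), T2 is younger (timestamp 2).
older_requests_wait_die = wait_die(1, 2)
younger_requests_wait_die = wait_die(2, 1)
older_requests_wound_wait = wound_wait(1, 2)
younger_requests_wound_wait = wound_wait(2, 1)
```

-In both schemes it is always the younger transaction that is aborted, and it keeps its original timestamp when restarted.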

->The differences between the Wait-Die and Wound-Wait prevention schemes are:

#Distributed Database:-A distributed database is essentially a database that is
spread across numerous sites, i.e., on various computers or over a network of
computers, and is not restricted to a single system.
-This can be necessary when different people from all over the world need to access a
certain database.
- It must be handled such that, to users, it seems to be a single database.
-Types of Distributed Database are;
1. Homogeneous Database:-A homogeneous database stores data uniformly across all
locations. All sites utilize the same operating system, database management system,
and data structures.
-They are therefore simple to handle.
2. Heterogeneous Database:-It is the opposite of a homogeneous distributed database.
-In a heterogeneous distributed database, different sites can use different schemas
and software, which can lead to problems in query processing and transactions.
-Also, a particular site might be completely unaware of the other sites. Different
computers may use different operating systems and different database applications.
-They may even use different data models for the database. Hence, translations are
required for different sites to communicate.

*Distributed Data Storage :There are 2 ways in which data can be stored on different
sites. These are:
1. Replication:-In this approach, the entire relation is stored redundantly at 2 or
more sites. If the entire database is available at all sites, it is a fully redundant
database.
-Hence, in replication, systems maintain copies of data.
-This is advantageous as it increases the availability of data at different sites. Also,
now query requests can be processed in parallel.
-However, it has certain disadvantages as well.
-Data needs to be constantly updated. Any change made at one site needs to be
recorded at every site that relation is stored or else it may lead to inconsistency. This
is a lot of overhead.
-Also, concurrency control becomes way more complex as concurrent access now
needs to be checked over a number of sites.
2. Fragmentation:-In this approach, the relations are fragmented (i.e., they’re divided
into smaller parts) and each of the fragments is stored in different sites where
they’re required.
- It must be made sure that the fragments are such that they can be used to
reconstruct the original relation (i.e, there isn’t any loss of data).
-Fragmentation is advantageous because it doesn’t create copies of data, so consistency is not
a problem.
-Fragmentation of relations can be done in two ways:
● Horizontal fragmentation – Splitting by rows – The relation is fragmented into
groups of tuples so that each tuple is assigned to at least one fragment.
● Vertical fragmentation – Splitting by columns – The schema of the relation is
divided into smaller schemas. Each fragment must contain a common
candidate key so as to ensure a lossless join.
-In certain cases, an approach that is hybrid of fragmentation and replication is used.
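
-A small Python sketch of both kinds of fragmentation and of the reconstruction step. The `students` relation, its columns and the helper names are made up for illustration; rows are plain dictionaries and `id` plays the role of the common candidate key.

```python
def horizontal(rows, column):
    """Split by rows: group tuples by the value of one column."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[column], []).append(row)
    return fragments


def vertical(rows, key, columns):
    """Split by columns: every fragment keeps the key plus some columns."""
    return [{key: row[key], **{c: row[c] for c in columns}} for row in rows]


def rejoin(left, right, key):
    """Reconstruct the relation by joining two vertical fragments on the key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


students = [
    {"id": 1, "name": "Asha", "city": "Kochi"},
    {"id": 2, "name": "Ravi", "city": "Delhi"},
    {"id": 3, "name": "Mini", "city": "Kochi"},
]

by_city = horizontal(students, "city")          # one fragment per site
names = vertical(students, "id", ["name"])
cities = vertical(students, "id", ["city"])

# Horizontal fragments are recombined with a union, vertical ones with a join.
union_back = sorted(
    (row for fragment in by_city.values() for row in fragment),
    key=lambda row: row["id"],
)
joined_back = rejoin(names, cities, "id")
```

-Both reconstructions give back the original relation, which is the lossless condition stated above.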

->Advantages of Distributed Database System :


● There is fast data processing as several sites participate in request processing.
● The reliability and availability of the system are high.
● It has a reduced operating cost.
● It is easier to expand the system by adding more sites.
● It has improved sharing ability and local autonomy.
->Disadvantages of Distributed Database System :
● The system becomes complex to manage and control.
● The security issues must be carefully managed.
● The system requires deadlock handling during transaction processing;
otherwise the entire system may end up in an inconsistent state.
● Some standardization is needed for processing in a distributed database
system.

*Distributed DBMS Architectures:-Common Architecture Models of Distributed
Database Systems:
● Client-Server Architecture of DDBMS:-This is a two-level architecture
where clients and servers are the points or levels where the main functionality is
divided.
-The server provides various functionality, such as managing transactions,
managing the data, processing queries, and optimization.
(look up the rest on the net)
● Peer-to-peer Architecture of DDBMS:-In this architecture, each node or peer is
considered a server as well as a client, and it performs its database services as
both (server and client).
-The peers coordinate their efforts and share their resources with one another.
● Multi DBMS Architecture of DDBMS:-This is an amalgam of two or more
independent Database Systems that function as a single integrated Database
System.
-The Distributed Database System material from earlier in these notes should also be written here.

___________________________________________________________________________

Note;
Mod1:- data users,
Mod2:-Types of Integrity Constraint (1.Domain constraints, 2.Entity integrity constraints,
3.Referential Integrity Constraints,4.Key constraints)
Mod 3:-Schedules Involving Aborted Transactions.
