
Database Management

1. Define Database: A database is a collection of logically related data stored in an efficient and
compact manner. A database is usually controlled by a database management system (DBMS).

2. What is meant by the E-R model? ER model stands for Entity-Relationship model. It is a high-level,
conceptual data model used to define the data elements of a system and the relationships between
them.

It produces a conceptual design for the database and gives a simple, easy-to-understand view of the data.
In ER modeling, the database structure is portrayed as a diagram called an entity-relationship diagram.

For example, suppose we design a school database. In this database, Student would be an entity with
attributes such as name, id, age, and address. Address could be another entity, with attributes such as
city, street name, and pin code, and there would be a relationship between the two.

3. Define Schema: A database schema defines how data is organized within a relational database,
including logical constraints such as table names, fields, data types, and the relationships between
these entities. Schemas are commonly communicated through visual representations of the database
architecture and form the foundation of an organization's data management discipline. The process of
designing a database schema is also known as data modeling.

4. What is generalization? Generalization is a bottom-up approach in which two or more lower-level
entities combine to form a higher-level entity when they share some common attributes.

Generalization resembles a subclass/superclass hierarchy; the distinguishing feature is the direction of
the approach. In generalization, subclasses are combined to form a more generalized superclass, and
the resulting higher-level entity can itself be combined with other entities to form a still higher-level
entity.

For example, the Faculty and Student entities can be generalized to create a higher-level entity Person.

5. What is specialization?

Specialization is a top-down approach, the opposite of generalization. In specialization, one higher-level
entity is broken down into two or more lower-level entities.

Specialization is used to identify subsets of an entity set that share some distinguishing characteristics.

Normally, the superclass is defined first, the subclasses and their related attributes are defined next, and
the relationship sets are then added.

For example, in an employee management system, the EMPLOYEE entity can be specialized as TESTER
or DEVELOPER based on the role played in the company.
6. Types of functional dependency:

A full functional dependency X -> Y holds when Y is functionally dependent on X but not on any proper
subset of X. For example, in a database of employees, the employee ID alone fully determines the
employee's name, address, and other personal information.

A partial functional dependency exists when the dependent attributes are determined by only part of a
composite determinant. For example, if a table's key is the pair (employee ID, project ID), a dependency
of the employee's name on employee ID alone is a partial dependency.

A transitive functional dependency exists when the dependent attributes are determined indirectly,
through attributes that are not part of the key. For example, in a database of employees, the employee ID
may determine the employee's department, which in turn determines the employee's salary.
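These definitions can be checked mechanically against sample data: X -> Y holds in a table exactly when no two rows agree on X but differ on Y. A minimal Python sketch, where the employees list and its column names are invented for illustration:

```python
# Sketch: checking whether a functional dependency X -> Y holds in a
# concrete data set, modelled here as a list of dicts (invented data).
def holds(rows, determinant, dependent):
    """Return True if the attributes in `determinant` functionally
    determine the attributes in `dependent` in this data set."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in determinant)
        y = tuple(row[a] for a in dependent)
        if seen.setdefault(x, y) != y:
            return False  # same determinant value, different dependent value
    return True

employees = [
    {"emp_id": 1, "dept": "ACC", "dept_city": "Delhi",  "name": "Ram"},
    {"emp_id": 2, "dept": "ACC", "dept_city": "Delhi",  "name": "Sita"},
    {"emp_id": 3, "dept": "SHP", "dept_city": "Mumbai", "name": "Ram"},
]

print(holds(employees, ["emp_id"], ["name"]))     # True: emp_id -> name
print(holds(employees, ["dept"], ["dept_city"]))  # True: the dept -> dept_city step of a transitive chain
print(holds(employees, ["name"], ["dept"]))       # False: two Rams in different departments
```

Note that a check like this can only refute a dependency; whether one truly holds is a property of the schema's semantics, not of one sample of data.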

7. Differentiate between full functional dependency and partial functional dependency:

1. A functional dependency X -> Y is a full functional dependency if Y is functionally dependent on X
and not on any proper subset of X. It is a partial dependency if Y is functionally dependent on X and
can also be determined by a proper subset of X.

2. In a full functional dependency, the non-prime attribute is functionally dependent on the whole
candidate key; in a partial functional dependency, it is dependent on only part of a candidate key.

3. In a full functional dependency, removing any attribute of X breaks the dependency; in a partial
functional dependency, the dependency still holds after removing some attribute of X.

4. Full functional dependency conforms to the normalization standard of Second Normal Form (2NF);
partial functional dependency does not, and 2NF eliminates partial dependencies.

5. An attribute A is fully functionally dependent on an attribute set B if it depends on B and on no
proper subset of B; A is partially dependent on B if it depends on some proper subset of B.

6. Full functional dependency enhances the quality of the data in our database; partial dependency
does not, and it must be eliminated in order to normalize to Second Normal Form.

8. What is data integrity?

Data integrity is the overall accuracy, completeness, and consistency of data. Data integrity also refers to
the safety of data in regard to regulatory compliance — such as GDPR compliance — and security. It is
maintained by a collection of processes, rules, and standards implemented during the design phase. When
the integrity of data is secure, the information stored in a database will remain complete, accurate, and
reliable, no matter how long it’s stored or how often it’s accessed.

The importance of data integrity in protecting yourself from data loss or a data leak cannot be overstated.
To keep your data safe from outside forces acting with malicious intent, you must first ensure that internal
users handle data correctly. By implementing appropriate data validation and error checking, you can
ensure that sensitive data is never miscategorized or stored incorrectly in a way that exposes you to risk.

Data integrity in SQL databases refers to ensuring that each row of a table is uniquely identified so that
data can be retrieved separately. To achieve this, you need constraints on columns (constraints are sets of
rules). Data constraints prevent invalid data entry into the base tables of the database, which helps
maintain data integrity.
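As a rough illustration of such constraints, the following Python/SQLite sketch (the table and column names are invented) shows a NOT NULL and a CHECK constraint rejecting invalid rows before they reach the base table:

```python
import sqlite3

# A minimal sketch of column constraints enforcing data integrity.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE accounts (
        acc_no  INTEGER PRIMARY KEY,       -- uniquely identifies each row
        owner   TEXT NOT NULL,             -- NULLs rejected
        balance REAL CHECK (balance >= 0)  -- domain constraint
    )
""")
conn.execute("INSERT INTO accounts VALUES (1, 'Asha', 100.0)")

try:
    conn.execute("INSERT INTO accounts VALUES (2, NULL, 50.0)")  # violates NOT NULL
except sqlite3.IntegrityError as e:
    print("rejected:", e)

try:
    conn.execute("INSERT INTO accounts VALUES (3, 'Ravi', -5.0)")  # violates CHECK
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

Only the first, valid row survives; the invalid inserts never enter the base table.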

9. Define referential integrity.

Referential integrity is a property of data that ensures the accuracy and consistency of data within a
relationship, where data is linked between two or more tables by foreign keys referencing primary keys. It
requires that whenever a foreign key value is used, it must reference a valid, existing primary key in the
parent table. This helps to prevent incorrect records from being added, deleted, or modified.

For example, if we delete a row in a primary table, we need to ensure that there’s no foreign key in any
related table with the value of the deleted row. We should only be able to delete a primary key if there are
no associated rows. Otherwise, we would end up with an orphaned record.

Referential integrity is a subset of data integrity, which is concerned with the accuracy and consistency of
all data (relationship or otherwise). Maintaining data integrity is a crucial part of working with databases.
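The orphaned-record scenario above can be demonstrated with SQLite, which enforces foreign keys once they are switched on for the connection. The table names here are illustrative:

```python
import sqlite3

# Sketch of referential integrity: a child row may only reference an
# existing parent, and a referenced parent cannot be deleted.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.execute("CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE emp (
        emp_id  INTEGER PRIMARY KEY,
        dept_id INTEGER REFERENCES dept(dept_id)
    )
""")
conn.execute("INSERT INTO dept VALUES (10, 'Accounts')")
conn.execute("INSERT INTO emp VALUES (1, 10)")

try:
    conn.execute("INSERT INTO emp VALUES (2, 99)")  # no parent row 99
except sqlite3.IntegrityError as e:
    print("insert rejected:", e)

try:
    conn.execute("DELETE FROM dept WHERE dept_id = 10")  # would orphan emp 1
except sqlite3.IntegrityError as e:
    print("delete rejected:", e)
```

Both violations are rejected, so the parent and child tables keep exactly the consistent rows they started with.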

10. Describe DCL

DCL stands for Data Control Language.

Data Control Language (DCL) is used to control access to stored data. It is mainly used to grant and
revoke the access rights a user requires on a database. DCL statements are typically auto-committed and
cannot be rolled back.

It is a part of the Structured Query Language (SQL) and helps in controlling access to information stored
in a database, complementing the data manipulation language (DML) and the data definition
language (DDL).

It is the simplest of the three sublanguages. It enables administrators to set and remove database
permissions for users as needed. Its commands (GRANT, REVOKE, and in some systems DENY) are
employed to give, withdraw, and deny users permission to retrieve and manipulate a database.

11. Weak Entity:

A weak entity is an entity set that does not have sufficient attributes to form a primary key. It depends on
another entity (known as the owner entity) for its existence. A weak entity has a total participation
constraint (existence dependency) in its identifying relationship with the owner entity. Weak entity types
have partial keys: sets of attributes with whose help the tuples of the weak entity can be distinguished
and identified.

12. Difference between Strong and Weak Entity:

A strong entity is not dependent of any other entity in the relation. A strong entity will always have a primary
key. Strong entities are represented by a single rectangle. The relationship of two strong entities is
represented by a single diamond.

A weak entity is dependent on a strong entity to ensure its existence. Unlike a strong entity, a weak entity
does not have any primary key. It instead has a partial discriminator key. A weak entity is represented by a
double rectangle. The relation between one strong and one weak entity is represented by a double
diamond.

1. A strong entity always has a primary key, while a weak entity has only a partial discriminator key.

2. A strong entity is not dependent on any other entity, while a weak entity depends on a strong entity.

3. A strong entity is represented by a single rectangle, while a weak entity is represented by a double
rectangle.

4. The relationship between two strong entities is represented by a single diamond, while the relationship
between a strong and a weak entity is represented by a double diamond.

5. A strong entity may have either total or partial participation, while a weak entity always has total
participation.

13. Define entity and give example.

An entity in a database management system (DBMS) is a real-world object or concept that is
distinguishable from other objects or concepts. It can be anything about which we store information, such
as a student, an employee, or a bank account. An entity is described by its attributes.

Entity Set
An entity set is a collection of entities of the same type that share the same attributes.
For example, all students of a school form an entity set of Student entities.

14. What is view? How to create a view? Give the syntax.


In the world of databases, a view is a query that’s stored on a database.
The term can also be used to refer to the result set of a stored query.
Views are sometimes referred to as virtual tables, because they present data in the form of a table, but
without such a table existing in the database.
A view can present data from multiple tables and present it as though it’s in a single table (just like any
other SELECT query). So when creating a view (just as creating any SELECT query), you specify
which columns to display.

Creating a View
Views are created using the SQL CREATE VIEW statement. For example, to create a view called, say,
"NewCustomers", you would start with:

CREATE VIEW NewCustomers AS
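An end-to-end sketch using SQLite (the customers table, its data, and the view's defining query are all invented for illustration) completes the statement above with a SELECT and then queries the view exactly like a table:

```python
import sqlite3

# Sketch: a view is a stored query that presents data as a virtual table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, joined TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                 [(1, 'Asha', '2023-01-10'),
                  (2, 'Ravi', '2024-06-01'),
                  (3, 'John', '2024-07-15')])

# The CREATE VIEW statement, completed with an (invented) SELECT:
conn.execute("""
    CREATE VIEW NewCustomers AS
    SELECT id, name FROM customers WHERE joined >= '2024-01-01'
""")

# The view is queried like any table, but no extra table exists on disk.
print(conn.execute("SELECT name FROM NewCustomers ORDER BY id").fetchall())
```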

15. Data manipulation operations:


Data manipulation operations in a DBMS (Database Management System) are used to manipulate the
data stored in a database. These operations are insert, update, delete, and select.

The insert operation adds new data to a database. The update operation modifies existing data. The
delete operation removes data from a database. The select operation retrieves data from a database.

Data manipulation is performed using tools such as SQL (Structured Query Language), specifically its
DML (Data Manipulation Language) statements. SQL is used for structured data manipulation, whereas
NoSQL databases such as MongoDB provide their own query interfaces for unstructured data.

Data manipulation is one of the initial processes in data analysis. It involves arranging or rearranging
data points to make it easier for users and data analysts to derive insights or business directives. The
steps involved in data manipulation are: mine the data and create a database, perform data
preprocessing, arrange the data, transform the data, and perform data analysis.
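The four operations can be sketched in Python with SQLite; the student table and its values are invented for illustration:

```python
import sqlite3

# Sketch of the four DML operations: INSERT, UPDATE, DELETE, SELECT.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (reg TEXT PRIMARY KEY, name TEXT, phone TEXT)")

# INSERT: add new data
conn.execute("INSERT INTO student VALUES ('R1', 'Sundar', '9898786756')")
conn.execute("INSERT INTO student VALUES ('R2', 'Ram', '9897786776')")

# UPDATE: modify existing data
conn.execute("UPDATE student SET phone = '0000000000' WHERE reg = 'R2'")

# DELETE: remove data
conn.execute("DELETE FROM student WHERE reg = 'R1'")

# SELECT: retrieve data
print(conn.execute("SELECT reg, phone FROM student").fetchall())
```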

16. Difference between primary key and foreign key


A primary key is a column or group of columns in a relational database table that uniquely identifies each
row in the table. It is used to ensure that data in the specific column is unique and cannot have NULL
values. A primary key can be an existing table column or a column that is specifically generated by the
database according to a defined sequence.

On the other hand, a foreign key is a column or group of columns in a relational database table that
provides a link between data in two tables. It references a column (most often the primary key) of another
table, thereby identifying the relationship between the tables: the primary key of one table acts as a
foreign key in another table.

In summary, a primary key ensures that the data in a specific column is unique and non-null, while a
foreign key establishes the relationship between tables by referencing the primary key of another table.

17. What is the command alter table in SQL?


The ALTER TABLE command in SQL is used to add, delete, or modify columns in an existing table. It
can also be used to add and drop various constraints on an existing table. Here are some examples
of how to use the ALTER TABLE command:

To add a column in a table, use the following syntax:


ALTER TABLE table_name ADD column_name datatype;

To delete a column in a table, use the following syntax:


ALTER TABLE table_name DROP COLUMN column_name;

To rename a column in a table, use the following syntax:


ALTER TABLE table_name RENAME COLUMN old_name to new_name;

To change the data type of a column in a table, use the following syntax (this form is used by SQL
Server; MySQL uses MODIFY COLUMN instead):

ALTER TABLE table_name ALTER COLUMN column_name datatype;
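A small runnable sketch of the first two forms against SQLite, which supports ADD COLUMN and (from version 3.25) RENAME COLUMN; the table and column names are invented:

```python
import sqlite3

# Sketch: ALTER TABLE in action. Exact syntax varies by vendor; the
# statements below are the SQLite-supported subset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (emp_id INTEGER, name TEXT)")

# Add a column:
conn.execute("ALTER TABLE emp ADD COLUMN salary REAL")

# Rename a column (requires SQLite 3.25 or later):
conn.execute("ALTER TABLE emp RENAME COLUMN name TO full_name")

# Inspect the resulting schema:
cols = [row[1] for row in conn.execute("PRAGMA table_info(emp)")]
print(cols)
```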

18. What is a degree of a relation?


Degree of a relation (table)
The degree of a relation is the number of attributes (columns) in the given table. It is also called
Arity.
EMPLOYEES Table

EMPID HIREDATE SALARY DEPT JOBCODE SEX

119012 01JUL1973 42340.58 CSR010 602 F

120591 05DEC1985 31000.55 SHP002 602 F

127845 16JAN1972 75320.34 ACC024 204 M

129540 01AUG1987 56123.34 SHP002 204 F

135673 15JUL1989 46322.58 ACC013 602 F

212916 15FEB1958 52345.58 CSR010 602 F

216382 15JUN1990 34004.65 SHP013 602 F

For the EMPLOYEES table given above, the degree is 6; that is, there are 6 attributes (columns/fields)
in this table.

STUDENT
RegNo SName Gen Phone
R1 Sundar M 9898786756
R3 Karthik M 8798987867
R4 John M 7898886756
R2 Ram M 9897786776
For the STUDENT table given above, the degree is 4; that is, there are 4 attributes in the STUDENT
table.
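Programmatically, the degree is just the number of columns; with Python's sqlite3 it can be read off the cursor description (the schema below mirrors the STUDENT table above):

```python
import sqlite3

# Sketch: the degree (arity) of a relation is its column count.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (RegNo TEXT, SName TEXT, Gen TEXT, Phone TEXT)")

cur = conn.execute("SELECT * FROM student")
degree = len(cur.description)  # one entry per column, even with no rows
print(degree)
```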

19. Difference between inner join and outer join

In SQL, Inner Join and Outer Join are two types of join operations used to combine data from two or
more tables. The main difference between the two is that an Inner Join returns only the rows that
match in both tables, while an Outer Join returns all the rows from one table and matching rows from
the other table.

Here’s a simple example to illustrate the difference between the two:


Suppose we have two tables, Table A and Table B, with the following data:

Table A

+----+-------+
| ID | Name  |
+----+-------+
| 1  | John  |
| 2  | Jane  |
| 3  | Alice |
+----+-------+

Table B

+----+-------+
| ID | Color |
+----+-------+
| 1  | Red   |
| 2  | Blue  |
+----+-------+

If we perform an Inner Join on Table A and Table B using the ID column, we get the following result:
Inner Join
+----+------+-------+
| ID | Name | Color |
+----+------+-------+
| 1 | John | Red |
| 2 | Jane | Blue |
+----+------+-------+
On the other hand, if we perform a Left Outer Join on Table A and Table B using the ID column, we
get the following result:
Left Outer Join
+----+-------+-------+
| ID | Name | Color |
+----+-------+-------+
| 1 | John | Red |
| 2 | Jane | Blue |
| 3 | Alice | NULL |
+----+-------+-------+
All the rows from Table A are returned, along with the matching rows from Table B. If there is no
matching row in Table B, then NULL is returned for the columns of Table B.
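The example above can be reproduced with SQLite (an ORDER BY is added only to make the row order deterministic; SQL NULL surfaces in Python as None):

```python
import sqlite3

# Sketch: inner join vs. left outer join on the Table A / Table B example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE B (id INTEGER, color TEXT)")
conn.executemany("INSERT INTO A VALUES (?, ?)",
                 [(1, 'John'), (2, 'Jane'), (3, 'Alice')])
conn.executemany("INSERT INTO B VALUES (?, ?)", [(1, 'Red'), (2, 'Blue')])

inner = conn.execute("""
    SELECT A.id, A.name, B.color
    FROM A INNER JOIN B ON A.id = B.id ORDER BY A.id""").fetchall()
left = conn.execute("""
    SELECT A.id, A.name, B.color
    FROM A LEFT OUTER JOIN B ON A.id = B.id ORDER BY A.id""").fetchall()

print(inner)  # only the matching rows
print(left)   # all rows of A; None (NULL) where B has no match
```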

20. Two phase locking:

Locking in a database management system is used for handling transactions in databases. The two-phase
locking protocol ensures serializable conflict schedules. A schedule is called conflict serializable if it can be
transformed into a serial schedule by swapping non-conflicting operations.

Two-Phase Locking

The types of locks used in transaction control are:

Shared Lock: Data can only be read when a shared lock is applied. Data cannot be written. It is denoted
as lock-S

Exclusive lock: Data can be read as well as written when an exclusive lock is applied. It is denoted
as lock-X

The two phases of Locking are :


Growing Phase: In the growing phase, the transaction only acquires locks; it cannot release any lock.
This phase ends at the lock point, the moment the transaction acquires its final lock, after which the
shrinking phase begins.

Shrinking Phase: In the shrinking phase, the transaction only releases locks; it cannot acquire any new
lock.

Two-Phase Locking Types

Two-phase Locking is further classified into three types :

1. Strict two-phase locking protocol :

o The transaction may release shared locks after the lock point.
o The transaction cannot release any exclusive lock until it commits.
o Because exclusive locks are held until commit, no other transaction can read
uncommitted changes, so strict two-phase locking avoids cascading rollbacks.
2. Rigorous two-phase locking protocol :
o The transaction cannot release either kind of lock, neither shared nor exclusive,
until it commits.
o Serializability is guaranteed in the rigorous two-phase locking protocol.
o Freedom from deadlock is not guaranteed in the rigorous two-phase locking protocol.
3. Conservative two-phase locking protocol :
o The transaction must lock all the data items it requires before the
transaction begins.
o If any of the data items is not available for locking, then no data items are
locked and the transaction waits.
o The read and write sets must be known before the transaction begins, which is
normally not possible.
o The conservative two-phase locking protocol is deadlock-free.
o The conservative two-phase locking protocol does not ensure a strict schedule.

21. Difference between procedural and non-procedural language:


Key Differences Between Procedural and Nonprocedural Language

1. The semantics of a procedural language are quite complex compared to those of a non-procedural
language.
2. Functions in a non-procedural programming language can return values of any data type, whereas
functions in a procedural language are restricted in the data types and values they can return.
3. For time-critical applications a non-procedural language performs effectively, while for less
time-critical applications a procedural language produces better results and is more efficient.
4. Programs written in a procedural language tend to be larger, while non-procedural language
programs tend to be smaller.

Procedural languages are command-driven and statement-oriented. The program code is written as
a sequence of instructions, where the user specifies both "what to do" and "how to do it" (a step-by-step
procedure). These instructions are executed in sequential order. Examples of procedural
languages include FORTRAN, COBOL, ALGOL, BASIC, C, and Pascal.

On the other hand, non-procedural languages are function-driven and applicative. The user specifies
only "what to do", not "how to do it". Programs are built by composing functions into more complex
functions. Examples of non-procedural languages include SQL, PROLOG, and LISP.

22. Relational database model:


The relational model represents how data is stored in relational databases. A relational database
consists of a collection of tables, each of which is assigned a unique name. Consider a relation
STUDENT with attributes ROLL_NO, NAME, ADDRESS, PHONE, and AGE, shown in the table below.
Table Student
ROLL_NO NAME ADDRESS PHONE AGE

1 RAM DELHI 9455123451 18

2 RAMESH GURGAON 9652431543 18

3 SUJIT ROHTAK 9156253131 20

4 SURESH DELHI 18

A relational model diagram is a graphical representation of the structure and constraints of a relational
database. A relational model diagram consists of one or more tables, each with a unique name and a set of
attributes. Each attribute has a name and a data type, and optionally a domain or a set of allowed values.
A relational model diagram also shows the relationships between the tables, which are based on the
concept of keys. A key is an attribute or a set of attributes that can uniquely identify a tuple (row) in a table.
A primary key is a key that is chosen to be the main identifier for a table. A foreign key is a key that
references a primary key in another table, and establishes a link between the two tables.
A relational model diagram can also include integrity constraints, which are rules that specify the valid
states of the database. Some common types of integrity constraints are:
 Domain constraints: These specify the valid values for an attribute, such as a range, a list, or a pattern.
 Entity integrity: This ensures that every tuple in a table has a unique and non-null primary key value.
 Referential integrity: This ensures that every foreign key value in a table either matches a primary key
value in another table, or is null.
 Other constraints: These are any additional rules that apply to the database, such as uniqueness, check,
or not null constraints.
What is a Database Management System?

A database management system (DBMS) is software designed to store, retrieve, and manage data. The
most prevalent DBMS in an enterprise database system is the RDBMS. The complete form of RDBMS is
Relational Database Management System. Now that it is clear what a database management system is,
let’s learn about the relational database management system.

What is Relational Database Management System with Example?

According to E. F. Codd’s relational model, an RDBMS allows users to construct, update, manage, and
interact with a relational database, allowing storing data in a tabular form. Therefore, consider RDBMS as
an advanced data management system that makes gaining insights from data a lot easier. But why do we
need a relational database?

Today, various businesses use relational database architecture instead of flat files or hierarchical
databases for their company database management system (DBMS). So, what is the reason for creating a
relational database? A relational database is purpose-built to efficiently handle a wide range of data
formats and process queries. And how is data in a relational database management system organized?

The answer to this is simple: a relational database management system organizes data in tables that can
be linked internally depending on shared data. This allows a user to retrieve one or more tables with just
one query easily. On the other hand, flat-file stores data in a single table structure, which is less efficient
and consumes more space and memory.

Hence, we need a relational database. An example of a relational database management system could be
a production department in an organization that leverages this model to process purchases and track
inventory.

The language used to access almost all commercially available, company-wide database management
systems in use today is Structured Query Language (SQL). Widely used relational database management
systems include Oracle Database, MySQL, PostgreSQL (an open-source relational database), and
Microsoft SQL Server. RDBMS structures are commonly used to perform the four basic CRUD operations
(create, read, update, and delete), which are critical in supporting consistent data management.

Now that you know the definition of an RDBMS let’s look at how it differs from a DBMS and the
characteristics of a relational database system.

Differences Between RDBMS and DBMS


There are some contrasting differences between RDBMS vs. DBMS. An RDBMS is an advanced version of
a DBMS. Unlike a DBMS that manages databases on a computer network and hard disks, an RDBMS
database helps maintain relationships between its tables.

Here are some of the main differences between an RDBMS and a DBMS:

 Number of operators:

A DBMS allows only a single user to operate at a time, whereas multiple users can operate an
RDBMS concurrently. An RDBMS uses intricate algorithms that enable several users to access the
database simultaneously while preserving data integrity, significantly reducing response time.

 Hardware and software need:

A DBMS utilizes fewer data storage and retrieval resources than an RDBMS. The latter is more
complex due to its multi-table structure and cross-referencing capability, making it costlier than a
DBMS. RDBMSs are also generally used for enterprise-class applications, while DBMSs are more
commonly utilized for smaller, purpose-specific applications.

 Data modification:

Altering data in a DBMS is quite difficult, whereas you can easily modify data in an RDBMS using
an SQL query. Thus, programmers can change/access multiple data elements simultaneously. This
is one of the reasons why an RDBMS is more efficient than a DBMS.

 Data volume:

A DBMS is more suitable for handling low data volume, whereas an RDBMS can handle even large
data volumes.

 Keys and Indexes:

A DBMS doesn’t involve keys and indexes, whereas an RDBMS specifies a relationship between
data elements via keys and indexes.

 Data consistency:

As a DBMS does not follow the ACID (Atomicity, Consistency, Isolation, and Durability) model, the
data stored can have inconsistencies. In contrast, an RDBMS follows the ACID model, making it
structured and consistent.

 Database structure:

A DBMS works by storing data in a hierarchical structure, while an RDBMS stores data in tables.
 Data fetching speed:

In a DBMS, the process is relatively slow, especially when data is complex and extensive. This is
because each of the data elements must be fetched individually. In an RDBMS, data is fetched
faster because of the relational approach. Plus, SQL facilitates quicker data retrieval in an RDBMS.

 Distributed databases:

A DBMS doesn’t support distributed databases, whereas an RDBMS offers full support for
distributed databases.

 Client-server architecture:

Unlike a DBMS, an RDBMS supports client-server architecture.

How Does a Relational Database Management System Work?

Data is stored in a relational database in the form of multiple tables. A key question here arises, how does
a database structure work, and how is it implemented? Let’s understand this in detail.

A database structure works by arranging every table into rows (known as records/ tuples) and columns
(known as fields/attributes). Tables, columns, and rows are the three major components of a relational
database.

Advantages of Relational Database Management System

The pros of a relational database management system offer a systematic view of data, which helps
businesses improve their decision-making processes by enhancing different areas.

Various other advantages of a relational database model:

Enhanced Data Security

The authorization and access control features in relational database software support advanced encryption
and decryption, enabling database administrators to manage access to the stored data. This offers
significant benefits in terms of security. In addition, operators can modify access to the database tables and
even limit the available data to others. This makes RDBMSs an ideal data storage solution for businesses
where the higher management needs to control data access for workers and clients.

Retain Data Consistency

It is easier to add new data or modify existing tables in an RDBMS while maintaining data consistency with
the existing format. This is mainly because an RDBMS is ACID-compliant.
Better Flexibility and Scalability

An RDBMS offers more flexibility when updating data as the modifications only have to be made once. For
instance, updating the details in the main table will automatically update the relevant files and save you the
trouble of changing several files one by one. Plus, each table can be altered independently without
disturbing the others. This makes relational databases scalable for growing data volumes.

Easy Maintenance

Relational databases are considered low-maintenance because users can quickly test, regulate, fix, and
back up data; the automation tools in an RDBMS help systematize these tasks.

Reduced Risk of Errors

In relational database software, you can easily check for errors against the data in different records.
Furthermore, as each data item is stored at a single location, there’s no possibility of older versions blurring
the picture.

Conclusion

Over time, RDBMSs have evolved to provide increasingly advanced query optimization and sophisticated
plugins for enterprise developers. As a result, various enterprise applications of relational database
management systems exist. They also serve as a focal point in numerous applications, such as reporting,
analytics, and data warehousing.

23. Privileges recognized in SQL:

In SQL, there are two types of privileges: system privileges and object privileges. System privileges
allow users to create, alter, or drop database objects, while object privileges allow users to execute,
select, insert, update, or delete data from the database objects to which the privileges apply.
System privileges are rights to perform certain actions on the database or on specific types of
database objects. For example, the system privilege to create a table allows a user to create a table in their
own schema. System privileges can also control the use of computing resources, such as the amount of
disk space or CPU time that a user can consume. There are about 60 different system privileges in
Oracle.
Object privileges are rights to perform specific operations on a particular database object, such as a
table, view, sequence, or procedure. For example, the object privilege to insert rows into a table allows a
user to insert data into that table. Object privileges can be granted by the owner of the object or by another
user who has been granted the privilege with the WITH GRANT OPTION clause.
Some examples of system privileges and object privileges are:
 CREATE TABLE is a system privilege that allows a user to create a table in their own schema.
 SELECT is an object privilege that allows a user to query data from a table or view.
 DROP ANY TABLE is a system privilege that allows a user to drop any table in the database, regardless
of the owner.
 UPDATE is an object privilege that allows a user to modify data in a table or view.
 CREATE ANY INDEX is a system privilege that allows a user to create an index on any table in the
database, regardless of the owner.
 EXECUTE is an object privilege that allows a user to run a stored procedure or function.

24. Tuple relational calculus-

Tuple Relational Calculus (TRC) is a non-procedural query language used in relational database
management systems (RDBMS) to retrieve data from tables. TRC is based on the concept of tuples, which
are ordered sets of attribute values that represent a single row or record in a database table. TRC is a
declarative language, meaning that it specifies what data is required from the database rather than how to
retrieve it. TRC queries are expressed as logical formulas that describe the desired tuples.

The basic syntax of TRC is as follows:

{ t | P(t) }

where t is a tuple variable and P(t) is a logical formula that describes the conditions that the
tuples in the result must satisfy. The curly braces {} are used to indicate that the expression is a set of
tuples.

For example, let’s say we have a table called “Employees” with the following attributes: Employee ID,
Name, Salary, Department ID. To retrieve the names of all employees who earn more than $50,000 per
year, we can use the following TRC query:

{ t.Name | Employees(t) ∧ t.Salary > 50000 }

In this query, the expression “Employees(t)” specifies that the tuple variable t represents a row in the
“Employees” table. The “∧” symbol is the logical AND operator, which is used to combine the condition
“t.Salary > 50000” with the table selection. The result of this query will be a set of tuples, where each
tuple contains the Name attribute of an employee who earns more than $50,000 per year.
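The same query can be run in its practical SQL form. The sketch below is illustrative only; the table layout and sample rows are assumptions made to match the Employees example above:

```python
import sqlite3

# TRC is declarative notation; SQL is its practical counterpart.
# Schema and sample data are assumed to follow the "Employees" example.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Employees (
    EmployeeID   INTEGER PRIMARY KEY,
    Name         TEXT,
    Salary       INTEGER,
    DepartmentID INTEGER)""")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?, ?)",
    [(1, "Asha", 62000, 10), (2, "Ben", 48000, 10), (3, "Chen", 75000, 20)],
)

# SQL equivalent of the TRC query { t.Name | Employees(t) AND t.Salary > 50000 }
rows = conn.execute("SELECT Name FROM Employees WHERE Salary > 50000").fetchall()
print(rows)  # only the employees earning more than 50,000
```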

TRC can also be used to perform more complex queries, such as joins and nested queries, by using
additional logical operators and expressions. While TRC is a powerful query language, it can be more
difficult to write and understand than SQL. However, it is useful in certain applications, such as the
formal verification of database schemas and academic research.

25. Retrieve data from the database:

Data retrieval is the process of obtaining data from a database management system (DBMS). In
databases, data retrieval is the process of identifying and extracting data from a database, based on a
query provided by the user or application. It enables the fetching of data from a database in order to
display it on a monitor and/or use within an application.

To retrieve the desired data, the user presents a set of criteria by a query. A query language, like
Structured Query Language (SQL), is used to prepare the queries.

SQL Server is a Microsoft product that provides the facility to insert, update, delete, and retrieve data
from database tables. Suppose you have created a database and some tables to store data, and now
want to display the data to verify that it is correct and complete; you can do this with the SELECT
command. Structured Query Language offers database users a powerful and flexible data retrieval
mechanism: the SELECT statement.

The SQL SELECT statement retrieves data from one or more tables in a database, allowing you to easily
access the specific information you need. This statement is commonly used and is essential for efficient
data retrieval. You can streamline your database queries using the SELECT statement and get the precise
data you require.

The basic syntax of the SQL SELECT statement is as follows:

SELECT column1, column2, ...
FROM table_name
WHERE condition;

Several clauses help retrieve data according to different conditions, including WHERE, ORDER BY,
DISTINCT, and GROUP BY.
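A minimal sketch of these retrieval clauses in action, using SQLite; the table name and data are invented for illustration:

```python
import sqlite3

# Set up a small sample table to demonstrate SELECT with various clauses.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (Roll_No INTEGER, Name TEXT, Marks INTEGER)")
conn.executemany("INSERT INTO Student VALUES (?, ?, ?)",
                 [(1, "Anu", 85), (2, "Raj", 72), (3, "Mia", 85)])

# WHERE filters rows, ORDER BY sorts the result.
top = conn.execute(
    "SELECT Name FROM Student WHERE Marks > 80 ORDER BY Name").fetchall()

# DISTINCT removes duplicate values from the result.
marks = conn.execute(
    "SELECT DISTINCT Marks FROM Student ORDER BY Marks").fetchall()

# GROUP BY aggregates rows that share a value.
counts = conn.execute(
    "SELECT Marks, COUNT(*) FROM Student GROUP BY Marks ORDER BY Marks").fetchall()

print(top, marks, counts)
```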

26. Draw and explain the 3 level architecture of the database system.

The architecture of a database system is greatly influenced by the underlying computer system on which
the database system runs. Database systems can be centralized, or client-server, where one server
machine executes work on behalf of multiple client machines. Database systems can also be designed to
exploit parallel computer architectures. Distributed databases span multiple geographically separated
machines.
A database system is partitioned into modules that deal with each of the responsibilities of the overall
system. The functional components of a database system can be broadly divided into the storage manager
and the query processor components. The storage manager is important because databases typically
require a large amount of storage space. The query processor is important because it helps the database
system simplify and facilitate access to data. It is the job of the database system to translate updates and
queries written in a nonprocedural language, at the logical level, into an efficient sequence of operations at
the physical level.

Database applications are usually partitioned into two or three parts. In a two-tier architecture, the
application resides at the client machine, where it invokes database system functionality at the server
machine through query language statements. Application program interface standards like ODBC and
JDBC are used for interaction between the client and the server. In contrast, in a three-tier architecture, the
client machine acts as merely a front end and does not contain any direct database calls. Instead, the client
end communicates with an application server, usually through a forms interface.
The application server in turn communicates with a database system to access data. The business logic of
the application, which says what actions to carry out under what conditions, is embedded in the application
server, instead of being distributed across multiple clients. Three-tier applications are more appropriate for
large applications, and for applications that run on the World Wide Web.

There are several types of DBMS Architecture that we use according to the usage requirements. Types of
DBMS Architecture are discussed here.
 1-Tier Architecture
 2-Tier Architecture
 3-Tier Architecture
1-Tier Architecture
In 1-Tier Architecture the database is directly available to the user: the client, server, and database are
all present on the same machine. For example, to learn SQL we set up an SQL server and the database
on the local system, which lets us interact with the relational database and execute operations directly.
Industry systems rarely use this architecture; they typically go for 2-tier or 3-tier architecture.

Advantages of 1-Tier Architecture


Below mentioned are the advantages of 1-Tier Architecture.
 Simple Architecture: 1-Tier Architecture is the most simple architecture to set up, as only a single
machine is required to maintain it.
 Cost-Effective: No additional hardware is required for implementing 1-Tier Architecture, which
makes it cost-effective.
 Easy to Implement: 1-Tier Architecture can be easily deployed, and hence it is mostly used in small
projects.
2-Tier Architecture
The 2-tier architecture is similar to a basic client-server model. The application at the client end directly
communicates with the database on the server side. APIs like ODBC and JDBC are used for this
interaction. The server side is responsible for providing query processing and transaction management
functionalities. On the client side, the user interfaces and application programs are run. The application on
the client side establishes a connection with the server side to communicate with the DBMS.
An advantage of this type is that maintenance and understanding are easier, and compatible with existing
systems. However, this model gives poor performance when there are a large number of users.

DBMS 2-Tier Architecture

Advantages of 2-Tier Architecture


 Easy to Access: 2-Tier Architecture gives applications direct access to the database, which makes retrieval fast.
 Scalable: We can scale the database easily, by adding clients or upgrading hardware.
 Low Cost: 2-Tier Architecture is cheaper than 3-Tier Architecture and Multi-Tier Architecture.
 Easy Deployment: 2-Tier Architecture is easier to deploy than 3-Tier Architecture.
 Simple: 2-Tier Architecture is easily understandable as well as simple because of only two
components.
3-Tier Architecture
In 3-Tier Architecture, there is another layer between the client and the server. The client does not directly
communicate with the server. Instead, it interacts with an application server which further communicates
with the database system and then the query processing and transaction management takes place. This
intermediate layer acts as a medium for the exchange of partially processed data between the server and
the client. This type of architecture is used in the case of large web applications.

DBMS 3-Tier Architecture

Advantages of 3-Tier Architecture


 Enhanced scalability: Scalability is enhanced due to the distributed deployment of application
servers. Now, individual connections need not be made between the client and server.
 Data Integrity: 3-Tier Architecture maintains Data Integrity. Since there is a middle layer between
the client and the server, data corruption can be avoided/removed.
 Security: 3-Tier Architecture Improves Security. This type of model prevents direct interaction of the
client with the server thereby reducing access to unauthorized data.
Disadvantages of 3-Tier Architecture
 More Complex: 3-Tier Architecture is more complex in comparison to 2-Tier Architecture.
Communication Points are also doubled in 3-Tier Architecture.
 Difficult to Interact: Client-server interaction becomes more indirect due to the presence of the
middle layer.

27. What is an attribute? List different types of attribute with examples

In a relational database, data is organized strictly in row and column format. The rows are called tuples or
records; the data items within one row may belong to different data types. The columns are called
attributes, and all the data items within a single attribute are of the same data type.

Attributes: Properties or characteristics which describe entities are called attributes.

In ER modeling, notation for attribute is given below.

Domain of Attributes The set of possible values that an attribute can take is called the domain of the
attribute. For example, the attribute day may take any value from the set {Monday, Tuesday ... Friday}.
Hence this set can be termed as the domain of the attribute day.

Key attribute The attribute (or combination of attributes) which is unique for every entity instance is called
a key attribute, e.g. the employee_id of an employee, the pan_card_number of a person, etc. If the key
attribute consists of two or more attributes in combination, it is called a composite key.

In ER modeling, notation for key attribute is given below.

Simple attribute If an attribute cannot be divided into simpler components, it is a simple attribute. Example
for simple attribute : employee_id of an employee.

Composite attribute If an attribute can be split into components, it is called a composite attribute. Example
for composite attribute : Name of the employee which can be split into First_name, Middle_name, and
Last_name.

Single valued Attributes If an attribute can take only a single value for each entity instance, it is a single
valued attribute. example for single valued attribute : age of a student. It can take only one value for a
particular student.

Multi-valued Attributes If an attribute can take more than one value for each entity instance, it is a multi-
valued attribute. Example for multi-valued attribute: telephone number of an employee; a particular
employee may have multiple telephone numbers.

In ER modeling, notation for multi-valued attribute is given below.


Stored Attribute An attribute which needs to be stored permanently is a stored attribute. Example for
stored attribute: name of a student.

Derived Attribute An attribute which can be calculated or derived based on other attributes is a derived
attribute. Example for derived attribute : age of employee which can be calculated from date of birth and
current date.

In ER modeling, notation for derived attribute is given below.
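The age example above can be sketched as a small derivation function; the birthday-adjustment rule is an assumption made for illustration:

```python
from datetime import date

# Sketch of a derived attribute: Age is not stored; it is computed from the
# stored attribute Date_of_Birth and the current date.
def derive_age(date_of_birth: date, today: date) -> int:
    years = today.year - date_of_birth.year
    # Subtract one year if the birthday has not yet occurred this year.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years

print(derive_age(date(2000, 6, 15), date(2024, 6, 14)))  # 23 (day before birthday)
print(derive_age(date(2000, 6, 15), date(2024, 6, 15)))  # 24 (on the birthday)
```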

28. What is normalization? Explain 3NF.

Normalization is a database design technique that reduces data redundancy and eliminates
undesirable characteristics like insertion, update, and deletion anomalies. Normalization rules divide
larger tables into smaller tables and link them using relationships. The purpose of normalization in
SQL is to eliminate redundant (repetitive) data and ensure data is stored logically.

The inventor of the relational model, Edgar Codd, proposed the theory of normalization of data with the
introduction of the First Normal Form, and he continued to extend the theory with the Second and Third
Normal Forms. Later he joined Raymond F. Boyce to develop the theory of Boyce-Codd Normal Form.

Database Normal Forms


Here is a list of Normal Forms in SQL:

 1NF (First Normal Form)


 2NF (Second Normal Form)
 3NF (Third Normal Form)
 BCNF (Boyce-Codd Normal Form)
 4NF (Fourth Normal Form)
 5NF (Fifth Normal Form)
 6NF (Sixth Normal Form)

Database Normalization With Examples


A database normalization example can be easily understood with the help of a case study. Assume a
video library maintains a database of movies rented out. Without any normalization, all information is
stored in one table as shown below. Note that the Movies Rented column has multiple values. Now let's
move to First Normal Form:

1NF (First Normal Form) Rules

 Each table cell should contain a single value.


 Each record needs to be unique.

The above table in 1NF-

1NF Example

2NF (Second Normal Form) Rules

 Rule 1- Be in 1NF
 Rule 2- All non-key attributes must be fully functionally dependent on the primary key; no attribute
may depend on only a part (subset) of a candidate key

It is clear that we can’t move our simple database to 2nd Normal Form unless we partition the table
above.
We have divided our 1NF table into two tables viz. Table 1 and Table2. Table 1 contains member
information. Table 2 contains information on movies rented.

We have introduced a new column called Membership_id which is the primary key for table 1. Records can
be uniquely identified in Table 1 using membership id

Database – Foreign Key


In Table 2, Membership_ID is the Foreign Key

Foreign Key references the primary key of another Table! It helps connect your Tables

 A foreign key can have a different name from its primary key
 It ensures rows in one table have corresponding rows in another
 Unlike the Primary key, they do not have to be unique. Most often they aren’t
 Foreign keys can be null even though primary keys can not
3NF (Third Normal Form) Rules

 Rule 1- Be in 2NF
 Rule 2- Has no transitive functional dependencies

To move our 2NF table into 3NF, we need to divide our table again.

3NF Example
Below is a 3NF example in SQL database:
We have again divided our tables and created a new table which stores Salutations.

There are no transitive functional dependencies, and hence our table is in 3NF

In Table 3, Salutation ID is the primary key, and in Table 1, Salutation ID is a foreign key referencing the primary key in Table 3.

Now our little example is at a level that cannot be further decomposed to attain higher normal forms; in
fact, it already satisfies them. Separate efforts are normally needed to move complex databases into
higher levels of normalization. However, we will briefly discuss the next normal forms in the following.
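The decomposed design can be sketched as actual tables. The column names below are assumptions, since the original figures are not reproduced here:

```python
import sqlite3

# A hedged sketch of the video-library tables after 3NF decomposition:
# each fact is stored once, and foreign keys link the tables back together.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Salutation (
    Salutation_ID INTEGER PRIMARY KEY,
    Salutation    TEXT);
CREATE TABLE Member (
    Membership_ID INTEGER PRIMARY KEY,
    Full_Name     TEXT,
    Salutation_ID INTEGER REFERENCES Salutation(Salutation_ID));
CREATE TABLE Rental (
    Membership_ID INTEGER REFERENCES Member(Membership_ID),
    Movie_Rented  TEXT);
INSERT INTO Salutation VALUES (1, 'Ms'), (2, 'Mr');
INSERT INTO Member VALUES (1, 'Janet Jones', 1);
INSERT INTO Rental VALUES (1, 'Pirates of the Caribbean'),
                          (1, 'Clash of the Titans');
""")

# Joins reassemble the original one-table view from the normalized tables.
rows = conn.execute("""
    SELECT s.Salutation, m.Full_Name, r.Movie_Rented
    FROM Rental r
    JOIN Member m     ON m.Membership_ID = r.Membership_ID
    JOIN Salutation s ON s.Salutation_ID = m.Salutation_ID""").fetchall()
print(rows)
```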

BCNF (Boyce-Codd Normal Form)


Even when a database is in 3rd Normal Form, anomalies can still result if it has more than one
candidate key.

Sometimes BCNF is also referred to as 3.5 Normal Form.

4NF (Fourth Normal Form) Rules


A table is in 4th Normal Form if no table instance contains two or more independent, multivalued facts
describing the relevant entity.

5NF (Fifth Normal Form) Rules


A table is in 5th Normal Form only if it is in 4NF and it cannot be decomposed into any number of smaller
tables without loss of data.

29. Explain the different DDL commands in SQL.

DDL is an abbreviation of Data Definition Language.

The DDL Commands in Structured Query Language are used to create and modify the schema of the
database and its objects. The syntax of DDL commands is predefined for describing the data. The commands
of Data Definition Language deal with how the data should exist in the database.

Following are the five DDL commands in SQL:

1. CREATE Command

2. DROP Command

3. ALTER Command

4. TRUNCATE Command

5. RENAME Command

CREATE Command
CREATE is a DDL command used to create databases, tables, triggers and other database objects.

Examples of CREATE Command in SQL

Example 1: This example describes how to create a new database using the CREATE DDL command.

Syntax to Create a Database:

CREATE Database Database_Name;

Suppose, you want to create a Books database in the SQL database. To do this, you have to write the
following DDL Command:

Create Database Books;

Example 2: This example describes how to create a new table using the CREATE DDL command.

Syntax to create a new table:

CREATE TABLE table_name


(
column_Name1 data_type ( size of the column ) ,
column_Name2 data_type ( size of the column) ,
column_Name3 data_type ( size of the column) ,
...
column_NameN data_type ( size of the column )
);

Suppose, you want to create a Student table with five columns in the SQL database. To do this, you have
to write the following DDL command:

CREATE TABLE Student

(
Roll_No Int ,
First_Name Varchar (20) ,
Last_Name Varchar (20) ,
Age Int ,
Marks Int
);

DROP Command

DROP is a DDL command used to delete/remove the database objects from the SQL database. We can
easily remove the entire table, view, or index from the database using this DDL command.

Examples of DROP Command in SQL

Example 1: This example describes how to remove a database from the SQL database.
Syntax to remove a database:

DROP DATABASE Database_Name;

Suppose, you want to delete the Books database from the SQL database. To do this, you have to write the
following DDL command:

DROP DATABASE Books;

Example 2: This example describes how to remove the existing table from the SQL database.

Syntax to remove a table:

DROP TABLE Table_Name;

Suppose, you want to delete the Student table from the SQL database. To do this, you have to write the
following DDL command:

DROP TABLE Student;

ALTER Command

ALTER is a DDL command which changes or modifies the existing structure of the database, and it also
changes the schema of database objects.

We can also add and drop constraints of the table using the ALTER command.

Examples of ALTER Command in SQL

Example 1: This example shows how to add a new field to the existing table.

Syntax to add a newfield in the table:

ALTER TABLE name_of_table ADD column_name column_definition;

Suppose, you want to add a Fathers_Name column to the existing Student table. To do this, you have
to write the following DDL command:

ALTER TABLE Student ADD Fathers_Name Varchar(60);

TRUNCATE Command

TRUNCATE is another DDL command which deletes or removes all the records from the table.

This command also removes the space allocated for storing the table records.

Syntax of TRUNCATE command

TRUNCATE TABLE Table_Name;


Example

Suppose, you want to delete all the records of the Student table while keeping its structure. To do this,
you have to write the following TRUNCATE DDL command:
TRUNCATE TABLE Student;
The above query successfully removes all the records from the Student table.

RENAME Command

RENAME is a DDL command which is used to change the name of the database table.

Syntax of RENAME command

RENAME TABLE Old_Table_Name TO New_Table_Name;


Example
RENAME TABLE Student TO Student_Details ;

This query changes the name of the table from Student to Student_Details.
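The five DDL commands above can be exercised end to end. The sketch below uses SQLite, whose dialect differs slightly: it has no TRUNCATE (an unqualified DELETE plays that role) and renames with ALTER TABLE ... RENAME TO:

```python
import sqlite3

# Walk through CREATE, ALTER, TRUNCATE-equivalent, RENAME, and DROP.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (Roll_No INTEGER, First_Name TEXT)")  # CREATE
conn.execute("ALTER TABLE Student ADD COLUMN Fathers_Name TEXT")         # ALTER
conn.execute("INSERT INTO Student VALUES (1, 'Anu', 'Ravi')")
conn.execute("DELETE FROM Student")                 # SQLite's TRUNCATE equivalent
conn.execute("ALTER TABLE Student RENAME TO Student_Details")            # RENAME

# Inspect the catalog: only the renamed table exists.
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")]
conn.execute("DROP TABLE Student_Details")                               # DROP
print(tables)
```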

30. Describe the concurrency control mechanism using examples

The concurrency control concept comes under transactions in a database management system
(DBMS). It is a procedure in DBMS which manages simultaneous processes so they execute without
conflicting with each other; these conflicts occur in multi-user systems. Concurrency simply means
executing multiple transactions at a time, and it is required to increase time efficiency. If many
transactions try to access the same data, inconsistency can arise, so concurrency control is required to
maintain data consistency. For example, without concurrency control, multiple people could not
withdraw money from ATMs in different places at the same time.

The advantages of concurrency control are as follows:
 Waiting time will be decreased.
 Response time will decrease.
 Resource utilization will increase.
 System performance & efficiency is increased.

Various concurrency control techniques are:


1. Two-phase locking Protocol
2. Time stamp ordering Protocol
3. Multi version concurrency control
4. Validation concurrency control
1. Two-Phase Locking Protocol: Locking is an operation which secures: permission to read, OR
permission to write a data item. Two phase locking is a process used to gain ownership of shared
resources without creating the possibility of deadlock. The 3 activities taking place in the two phase
update algorithm are:
(i). Lock Acquisition
(ii). Modification of Data
(iii). Release Lock
Two phase locking prevents deadlock from occurring in distributed systems by releasing all the
resources it has acquired, if it is not possible to acquire all the resources required without waiting for
another process to finish using a lock. This means that no process is ever in a state where it is holding
some shared resources and waiting for another process to release a shared resource which it requires,
so deadlock cannot occur due to resource contention. A transaction in the Two Phase Locking Protocol
can assume one of the 2 phases:
 (i) Growing Phase: In this phase a transaction can only acquire locks but cannot release any lock.
The point when a transaction acquires all the locks it needs is called the Lock Point.
 (ii) Shrinking Phase: In this phase a transaction can only release locks but cannot acquire any.
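The growing/shrinking discipline can be sketched with ordinary locks. This is a toy illustration of the rule only, not a full lock manager:

```python
import threading

# Toy sketch of two-phase locking: all acquisitions (growing phase)
# must happen before any release (shrinking phase).
class Transaction:
    def __init__(self):
        self.held = []
        self.shrinking = False

    def acquire(self, lock: threading.Lock):
        # Growing phase only: once any lock is released, no more may be taken.
        assert not self.shrinking, "2PL violation: acquire after release"
        lock.acquire()
        self.held.append(lock)

    def release_all(self):
        # Entering the shrinking phase: release every lock, acquire none.
        self.shrinking = True
        while self.held:
            self.held.pop().release()

lock_x, lock_y = threading.Lock(), threading.Lock()
t = Transaction()
t.acquire(lock_x)   # growing phase
t.acquire(lock_y)   # lock point reached: all needed locks are held
# ... read/write the items protected by lock_x and lock_y here ...
t.release_all()     # shrinking phase
print(lock_x.locked(), lock_y.locked())  # False False
```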
2. Time Stamp Ordering Protocol: A timestamp is a tag that can be attached to any transaction or any
data item, which denotes a specific time on which the transaction or the data item had been used in any
way. A timestamp can be implemented in 2 ways. One is to directly assign the current value of the clock
to the transaction or data item. The other is to attach the value of a logical counter that keeps increment
as new timestamps are required. The timestamp of a data item can be of 2 types:
 (i) W-timestamp(X): This means the latest time when the data item X has been written into.
 (ii) R-timestamp(X): This means the latest time when the data item X has been read from. These 2
timestamps are updated each time a successful read/write operation is performed on the data item
X.
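The read/write timestamp tests described above can be sketched as follows. This is the basic timestamp-ordering check only, with an abort reported as a boolean return value:

```python
# Each data item keeps its latest read and write timestamps; an operation
# arriving "too late" forces its transaction to abort.
class Item:
    def __init__(self):
        self.r_ts = 0  # R-timestamp(X): latest read of X
        self.w_ts = 0  # W-timestamp(X): latest write of X

def read(item: Item, ts: int) -> bool:
    if ts < item.w_ts:               # a younger txn already overwrote X
        return False                 # abort the reading transaction
    item.r_ts = max(item.r_ts, ts)
    return True

def write(item: Item, ts: int) -> bool:
    if ts < item.r_ts or ts < item.w_ts:
        return False                 # abort: a younger txn read or wrote X
    item.w_ts = ts
    return True

x = Item()
ok_w = write(x, ts=5)     # first write succeeds
ok_r_old = read(x, ts=3)  # txn 3 is older than the write at ts=5: abort
ok_r_new = read(x, ts=7)  # txn 7 is younger: read succeeds
print(ok_w, ok_r_old, ok_r_new)  # True False True
```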
3. Multiversion Concurrency Control: Multiversion schemes keep old versions of data items to
increase concurrency. In multiversion 2-phase locking, each successful write results in the
creation of a new version of the data item written. Timestamps are used to label the versions.
When a read(X) operation is issued, an appropriate version of X is selected based on the
timestamp of the transaction.

4. Validation Concurrency Control: The optimistic approach is based on the assumption that the
majority of the database operations do not conflict. The optimistic approach requires neither locking nor
time stamping techniques. Instead, a transaction is executed without restrictions until it is committed.
Using an optimistic approach, each transaction moves through 2 or 3 phases, referred to as read,
validation and write.
 (i) During read phase, the transaction reads the database, executes the needed computations and
makes the updates to a private copy of the database values. All update operations of the
transactions are recorded in a temporary update file, which is not accessed by the remaining
transactions.
 (ii) During the validation phase, the transaction is validated to ensure that the changes made will not
affect the integrity and consistency of the database. If the validation test is positive, the transaction
goes to the write phase. If the validation test is negative, the transaction is restarted and the changes
are discarded.
 (iii) During the write phase, the changes are permanently applied to the database.
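The read, validation, and write phases can be sketched as follows. This is a toy single-item illustration; the version counters stand in for the real validation bookkeeping:

```python
# Optimistic (validation-based) control: work on a private copy,
# then validate that nothing read has changed before writing.
db = {"X": 100}
version = {"X": 1}

class OptimisticTxn:
    def __init__(self, key, fn):
        self.key, self.fn = key, fn

    def read_phase(self):
        # Read phase: record the version seen and compute on a private copy.
        self.seen = version[self.key]
        self.private = self.fn(db[self.key])

    def validate_and_write(self):
        # Validation phase: restart if another txn changed the item we read.
        if version[self.key] != self.seen:
            return False
        db[self.key] = self.private      # write phase: apply permanently
        version[self.key] += 1
        return True

t1 = OptimisticTxn("X", lambda v: v + 50)
t1.read_phase()
ok1 = t1.validate_and_write()           # True: nothing intervened

t2 = OptimisticTxn("X", lambda v: v * 2)
t2.read_phase()
db["X"] += 10; version["X"] += 1        # a concurrent txn commits in between
ok2 = t2.validate_and_write()           # False: validation fails, restart t2
print(ok1, ok2, db["X"])
```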
Real-Life Example
Scenario: A world-famous band, “The Algorithmics,” is about to release tickets for their farewell concert
through an online ticketing platform, EventBriteMax. Given the band's massive fan base, the system is
expected to face a surge in access requests, and EventBriteMax must ensure that ticket sales are
processed smoothly without double bookings or system failures.
 Two-Phase Locking Protocol (2PL):
 Usage: Mainly for premium ticket pre-sales to fan club members. These sales occur a day before
the general ticket release.
 Real-Life Example: When a fan club member logs in to buy a ticket, the system uses 2PL. It locks
the specific seat they choose during the transaction. Once the transaction completes, the lock is
released. This ensures that no two fan club members can book the same seat at the same time.
 Timestamp Ordering Protocol:
 Usage: For general ticket sales.
 Real-Life Example: As thousands rush to book their tickets, each transaction gets a timestamp. If
two fans try to book the same seat simultaneously, the one with the earlier timestamp gets priority.
The other fan receives a message suggesting alternative seats.
 Multi-Version Concurrency Control (MVCC):
 Usage: Implemented in the mobile app version of the ticketing platform.
 Real-Life Example: Fans using the mobile app see multiple versions of the seating chart. When a
fan selects a seat, they’re essentially choosing from a specific version of the seating database. If
their choice conflicts with a completed transaction, the system offers them the next best seat based
on the latest version of the database. This ensures smooth mobile user experience without
frequent transactional conflicts.
 Validation Concurrency Control:
 Usage: For group bookings where multiple seats are booked in a single transaction.
 Real-Life Example: A group of friends tries to book 10 seats together. They choose their seats
and proceed to payment. Before finalizing, the system validates that all 10 seats are still available
(i.e., no seat was booked by another user in the meantime). If there’s a conflict, the group is
prompted to choose a different set of seats. If not, their booking is confirmed.
The concert ticket sales go off without a hitch. Fans rave about the smooth experience, even with such high
demand. Behind the scenes, EventBriteMax’s effective implementation of the four concurrency control
protocols played a crucial role in ensuring that every fan had a fair chance to purchase their ticket and no
seats were double-booked. The Algorithmics go on to have a fantastic farewell concert, with not a single
problem in the ticketing process.
Conclusion
Thus, Concurrency control techniques in Database Management Systems (DBMS) are pivotal for
maintaining data integrity, consistency, and reliability in multi-user database environments. These methods
prevent multiple transactions from interfering with one another, preventing possible data inconsistencies
and clashes.

31. Describe the ACID properties of transaction

A transaction is a single logical unit of work that accesses and possibly modifies the contents of a
database. Transactions access data using read and write operations.
In order to maintain consistency in a database, before and after the transaction, certain properties
are followed. These are called ACID properties.
Atomicity:
By this, we mean that either the entire transaction takes place at once or doesn’t happen at all. There is
no midway i.e. transactions do not occur partially. Each transaction is considered as one unit and either
runs to completion or is not executed at all. It involves the following two operations.
—Abort: If a transaction aborts, changes made to the database are not visible.
—Commit: If a transaction commits, changes made are visible.
Atomicity is also known as the ‘All or nothing rule’.
Consider the following transaction T consisting of T1 and T2: transfer of 100 from account X to
account Y.

T1: read(X); X := X − 100; write(X)
T2: read(Y); Y := Y + 100; write(Y)

If the transaction fails after completion of T1 but before completion of T2 (say, after write(X) but
before write(Y)), then the amount has been deducted from X but not added to Y. This results in an
inconsistent database state. Therefore, the transaction must be executed in its entirety in order to ensure
the correctness of the database state.
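This all-or-nothing behaviour can be demonstrated with a real transaction. The sketch below is illustrative (the account values follow the example in the text, X = 500 and Y = 200); it simulates a crash between write(X) and write(Y) and shows that a rollback restores the original balances:

```python
import sqlite3

# Atomicity in practice: the transfer either commits both updates or
# rolls back to the original balances.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO Account VALUES (?, ?)", [("X", 500), ("Y", 200)])
conn.commit()

try:
    conn.execute("UPDATE Account SET balance = balance - 100 WHERE name = 'X'")
    raise RuntimeError("crash between write(X) and write(Y)")
    # The second update never runs because of the simulated crash:
    conn.execute("UPDATE Account SET balance = balance + 100 WHERE name = 'Y'")
    conn.commit()
except RuntimeError:
    conn.rollback()  # undo the partial update: all or nothing

balances = dict(conn.execute("SELECT name, balance FROM Account"))
print(balances)  # X and Y unchanged: {'X': 500, 'Y': 200}
```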

Consistency:

This means that integrity constraints must be maintained so that the database is consistent before and
after the transaction. It refers to the correctness of a database. Referring to the example above,
The total amount before and after the transaction must be maintained.
Total before T occurs = 500 + 200 = 700.
Total after T occurs = 400 + 300 = 700.
Therefore, the database is consistent. Inconsistency occurs in case T1 completes but T2 fails. As a
result, T is incomplete.

Isolation:

This property ensures that multiple transactions can occur concurrently without leading to
inconsistency of the database state. Transactions occur independently, without interference. Changes
occurring in a particular transaction will not be visible to any other transaction until that change has
been written to memory or committed. This property ensures that concurrent execution of transactions
results in a state that is equivalent to a state achieved if these were executed serially in some order.
Let X = 50,000 and Y = 500.
Consider two transactions T and T''.

Suppose T has been executed till read(Y) and then T'' starts. As a result, interleaving of operations
takes place, due to which T'' reads the correct value of X but the incorrect value of Y, and the sum
computed by
T'': (X + Y = 50,000 + 500 = 50,500)
is thus not consistent with the sum at the end of the transaction:
T: (X + Y = 50,000 + 450 = 50,450).
This results in database inconsistency, due to a loss of 50 units. Hence, transactions must take place in
isolation and changes should be visible only after they have been made to the main memory.

Durability:

This property ensures that once the transaction has completed execution, the updates and modifications
to the database are stored in and written to disk and they persist even if a system failure occurs. These
updates now become permanent and are stored in non-volatile memory. The effects of the transaction,
thus, are never lost.
Some important points:

Property      Responsibility for maintaining it
Atomicity     Transaction Manager
Consistency   Application programmer
Isolation     Concurrency Control Manager
Durability    Recovery Manager

The ACID properties, in totality, provide a mechanism to ensure the correctness and consistency of a
database in a way such that each transaction is a group of operations that acts as a single unit, produces
consistent results, acts in isolation from other operations, and updates that it makes are durably stored.

ACID properties are the four key characteristics that define the reliability and consistency of a transaction
in a Database Management System (DBMS). The acronym ACID stands for Atomicity, Consistency,
Isolation, and Durability. Here is a brief description of each of these properties:
1. Atomicity: Atomicity ensures that a transaction is treated as a single, indivisible unit of work. Either
all the operations within the transaction are completed successfully, or none of them are. If any part
of the transaction fails, the entire transaction is rolled back to its original state, ensuring data
consistency and integrity.
2. Consistency: Consistency ensures that a transaction takes the database from one consistent state to
another consistent state. The database is in a consistent state both before and after the transaction
is executed. Constraints, such as unique keys and foreign keys, must be maintained to ensure data
consistency.
3. Isolation: Isolation ensures that multiple transactions can execute concurrently without interfering
with each other. Each transaction must be isolated from other transactions until it is completed. This
isolation prevents dirty reads, non-repeatable reads, and phantom reads.
4. Durability: Durability ensures that once a transaction is committed, its changes are permanent and
will survive any subsequent system failures. The transaction’s changes are saved to the database
permanently, and even if the system crashes, the changes remain intact and can be recovered.
Overall, ACID properties provide a framework for ensuring data consistency, integrity, and reliability in
DBMS. They ensure that transactions are executed in a reliable and consistent manner, even in the
presence of system failures, network issues, or other problems. These properties make DBMS a reliable
and efficient tool for managing data in modern organizations.
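Atomicity in particular can be observed directly in practice. Below is a minimal sketch (the table and values are invented for illustration) that uses Python's built-in sqlite3 module to show it: a transfer that fails halfway is rolled back as a whole, so the database never exposes the half-done debit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("X", 500), ("Y", 500)])
conn.commit()

def transfer(conn, amount, fail=False):
    # Debit X, then credit Y; optionally simulate a crash in between.
    conn.execute("UPDATE account SET balance = balance - ? WHERE name = 'X'", (amount,))
    if fail:
        raise RuntimeError("simulated crash before the credit step")
    conn.execute("UPDATE account SET balance = balance + ? WHERE name = 'Y'", (amount,))

try:
    transfer(conn, 100, fail=True)
    conn.commit()
except RuntimeError:
    conn.rollback()  # atomicity: the half-done debit is undone as well

print(conn.execute("SELECT name, balance FROM account ORDER BY name").fetchall())
# [('X', 500), ('Y', 500)]
```

Because the rollback undoes the debit along with everything else in the transaction, the total X + Y is preserved, which is exactly the consistency guarantee discussed above.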

Advantages of ACID Properties in DBMS:

1. Data Consistency: ACID properties ensure that the data remains consistent and accurate after any
transaction execution.
2. Data Integrity: ACID properties maintain the integrity of the data by ensuring that any changes to the
database are permanent and cannot be lost.
3. Concurrency Control: ACID properties help to manage multiple transactions occurring concurrently
by preventing interference between them.
4. Recovery: ACID properties ensure that in case of any failure or crash, the system can recover the
data up to the point of failure or crash.

Disadvantages of ACID Properties in DBMS:

1. Performance: The ACID properties can cause a performance overhead in the system, as they require
additional processing to ensure data consistency and integrity.
2. Scalability: The ACID properties may cause scalability issues in large distributed systems where
multiple transactions occur concurrently.
3. Complexity: Implementing the ACID properties can increase the complexity of the system and
require significant expertise and resources.
Overall, the advantages of ACID properties in DBMS outweigh the disadvantages. They provide a
reliable and consistent approach to data
management, ensuring data integrity, accuracy, and reliability. However, in some cases, the
overhead of implementing ACID properties can cause performance and scalability issues. Therefore,
it’s important to balance the benefits of ACID properties against the specific needs and requirements
of the system.

32. Give SQL statements which create a STUDENT table consisting of the following fields: Name
CHAR(40) Class CHAR(6) Marks NUMBER(4) Rank CHAR(8)

CREATE TABLE STUDENT
(
    Name  CHAR(40),
    Class CHAR(6),
    Marks NUMBER(4),
    Rank  CHAR(8)
);
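As a quick check, the statement can be run against SQLite through Python's built-in sqlite3 module. SQLite accepts the Oracle-style CHAR/NUMBER type names, mapping them to its own type affinities; the inserted row is invented for illustration.

```python
import sqlite3

# In-memory database, for illustration only.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# SQLite accepts CHAR(n)/NUMBER(n) but applies TEXT/NUMERIC affinity.
cur.execute("""
    CREATE TABLE STUDENT (
        Name  CHAR(40),
        Class CHAR(6),
        Marks NUMBER(4),
        Rank  CHAR(8)
    )
""")

cur.execute("INSERT INTO STUDENT VALUES ('Ravi', 'BCA', 450, 'First')")
conn.commit()
print(cur.execute("SELECT Name, Marks FROM STUDENT").fetchall())
# [('Ravi', 450)]
```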

33. Define functional dependency. Give the inference rules of functional dependencies.

A functional dependency is a relationship between two sets of attributes in a database, where one
set (the determinant) determines the values of the other set (the dependent).
For example, in a database of employees, the employee ID number (determinant) would determine the
employee’s name, address, and other personal information (dependent). This means that, given an
employee ID number, we can determine the corresponding employee’s name and other personal
information, but not vice versa.

Functional dependencies can also be represented using mathematical notation. For example, the functional
dependency above can be represented as:

Employee ID → Employee Name, Address, etc.

It’s important to note that a functional dependency is a constraint on the tuples of a table: any two
tuples that agree on the determinant attributes must also agree on the dependent attributes.

What can functional dependencies be used for?

Identify and eliminate data redundancy in a database

For example, if a database contains a table with the attributes “employee ID” and “employee name”, and
another table with the attributes “employee ID” and “employee address”, then there is a functional
dependency between “employee ID” and “employee name” in the first table, and between “employee ID”
and “employee address” in the second table.

By combining these two tables into one, with the attributes “employee ID”, “employee name”, and
“employee address”, the data redundancy is eliminated.

Identify and eliminate data inconsistencies in a database

For example, if a database contains a table with the attributes “employee ID” and “employee name”, and
another table with the attributes “employee ID” and “employee address”, then there is a functional
dependency between “employee ID” and “employee name” in the first table, and between “employee ID”
and “employee address” in the second table. If the employee’s name is changed in the first table, but not in
the second table, then the data is inconsistent.

By combining these two tables into one, with the attributes “employee ID”, “employee name”, and
“employee address”, the data inconsistencies are eliminated.

What are the different types of functional dependencies?

There are several types of functional dependencies, including full functional dependencies, partial
functional dependencies, and transitive functional dependencies.

A full functional dependency is a functional dependency in which the dependent attributes are determined
by the entire determinant and not by any proper subset of it. For example, in the database of employees, the
employee ID number on its own fully determines the employee’s name, address, and other personal information.

A partial functional dependency is a functional dependency in which the dependent attributes are determined
by only part of a composite determinant. For example, if the determinant is the pair (employee ID, project ID),
the employee’s name depends only on the employee ID, so its dependency on the pair is partial.

A transitive functional dependency is a functional dependency where the dependent attributes are
determined by a set of attributes that are not included in the determinant attributes. For example, in a
database of employees, the employee ID number may determine the employee’s department, which in turn
determines the employee’s salary.


One of the most common ways to reason about functional dependencies is to use Armstrong’s Axioms.
These are a set of rules that can be used to infer new functional dependencies from a given set of functional
dependencies. The rules include reflexivity, augmentation, and transitivity.

Reflexivity states that if X is a subset of Y, then Y → X.

Augmentation states that if X → Y, then XZ → YZ for any attributes Z.

Transitivity states that if X → Y and Y → Z, then X → Z.

Normal forms in functional dependencies

Another way to represent functional dependencies is using the Normal Forms. Normal Forms are a set of
rules that are used to determine the degree of normalization of a database. There are several Normal
Forms, including First Normal Form (1NF), Second Normal Form (2NF), Third Normal Form (3NF), and so
on.

First Normal Form (1NF) requires that each table have a primary key, and that all data in the table is atomic
(indivisible).

Second Normal Form (2NF) requires that the table is in 1NF, and that all non-primary key attributes are
functionally dependent on the primary key.

Third Normal Form (3NF) requires that the table is in 2NF, and that all non-primary key attributes are not
functionally dependent on any non-primary key attributes.

It’s important to note that functional dependencies are not always easy to identify, and may require a
thorough understanding of the data and the relationships between the data. Additionally, it’s not always
possible to achieve higher Normal Forms, and trade-offs may need to be made between normalization and
performance.

Conclusion

Functional dependencies are a crucial aspect of database design and are used to ensure that the database
is in a state of normalization. They help to minimize data redundancy and improve data integrity. However,
it’s important to note that functional dependencies are not the only factor to consider when designing a
database. Other factors such as performance and scalability should also be taken into account.
1. Reflexive Rule (IR1)

In the reflexive rule, if Y is a subset of X, then X determines Y.

1. If X ⊇ Y then X → Y

Example:

1. X = {a, b, c, d, e}
2. Y = {a, b, c}

Here Y ⊆ X, so X → Y holds.

2. Augmentation Rule (IR2)

The augmentation rule states that if X determines Y, then XZ determines YZ for any set of attributes Z.

1. X → Y then XZ → YZ

Example:

1. For R(ABCD), if A → B then AC → BC

3. Transitive Rule (IR3)

In the transitive rule, if X determines Y and Y determines Z, then X must also determine Z.

1. If X → Y and Y → Z then X → Z

4. Union Rule (IR4)

Union rule says, if X determines Y and X determines Z, then X must also determine Y and Z.

1. If X → Y and X → Z then X → YZ
Proof:

1. X → Y (given)
2. X → Z (given)
3. X → XY (using IR2 on 1 by augmentation with X. Where XX = X)
4. XY → YZ (using IR2 on 2 by augmentation with Y)
5. X → YZ (using IR3 on 3 and 4)

5. Decomposition Rule (IR5)

Decomposition rule is also known as project rule. It is the reverse of union rule.

This Rule says, if X determines Y and Z, then X determines Y and X determines Z separately.

1. If X → YZ then X → Y and X → Z

Proof:

1. X → YZ (given)
2. YZ → Y (using IR1 Rule)
3. X → Y (using IR3 on 1 and 2)

6. Pseudo transitive Rule (IR6)

In Pseudo transitive Rule, if X determines Y and YZ determines W, then XZ determines W.

1. If X → Y and YZ → W then XZ → W

Proof:

1. X → Y (given)
2. YZ → W (given)
3. XZ → YZ (using IR2 on 1 by augmenting with Z)
4. XZ → W (using IR3 on 3 and 2)
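These inference rules are what attribute-closure algorithms rely on. A minimal sketch in Python (the helper name and FD encoding are my own): given a set of FDs, compute the closure X+ of an attribute set X by repeatedly applying any FD whose left-hand side is already contained in the result, which combines the reflexive, transitive, and union rules.

```python
def closure(attrs, fds):
    """Compute the closure of `attrs` under the functional dependencies `fds`.

    `fds` is a list of (lhs, rhs) pairs of attribute sets, e.g.
    ({'A'}, {'B'}) for A -> B. Repeatedly apply every FD whose
    left-hand side is already contained in the result, until no
    new attribute can be added.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# A -> B and B -> C imply A -> C (transitive rule, IR3).
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]
print(sorted(closure({"A"}, fds)))  # ['A', 'B', 'C']
```

The pseudotransitive rule falls out of the same computation: with X → Y and YZ → W, the closure of {X, Z} contains W, so XZ → W is derivable.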

34. Explain different constraints to maintain data integrity in SQL

Integrity Constraints are the protocols that a table's data columns must follow. These are used to restrict
the types of information that can be entered into a table. This means that the data in the database is
accurate and reliable. You may apply integrity Constraints at the column or table level. The table-level
Integrity constraints apply to the entire table, while the column level constraints are only applied to one
column.

Domain Integrity Constraints in SQL

Domain Integrity Constraints define the permissible values for a given column. By applying these
constraints, you can restrict the data entered into a specific column, ensuring consistent data values across
your database.

Some commonly used domain integrity constraints include:

 Data type – The column must contain values of a specific data type
 Data format – The format of the values in a column must follow a defined pattern
 Range – The values must fall within a specified range
 Enumeration – The values in the column can only be taken from a predefined set of values
For example, if you have a table containing information about employees' salaries, you might enforce a
domain integrity constraint on the "salary" column to ensure that only numeric values within a specific
range are entered.

Entity Integrity Constraints in SQL

Entity Integrity Constraints involve uniquely identifying the rows in a database table, such that there are no
duplicate or null values in a primary key column. A primary key is a unique column in a table that uniquely
identifies every row in the table. This constraint helps maintain the uniqueness and integrity of data by
preventing the existence of duplicate rows.

For instance, in a table storing customer information, a unique identification number (‘customer_id’) can
be assigned as the primary key to uniquely identify every customer.

Referential Integrity Constraint in SQL

Referential Integrity Constraint ensures that relationships between tables are maintained consistently. It is
enforced by using foreign keys, which are columns in a table that refer to a primary key in another table.
The foreign key helps to maintain the referential integrity between two related tables by making sure that
changes in one table's primary key are reflected in the corresponding foreign key in another table.

There are two main rules to uphold when it comes to Referential Integrity Constraints:

 If a primary key value is updated or deleted, the corresponding foreign key values in the related
table must be updated or deleted as well.
 Any new foreign key value added to the related table must have a corresponding primary key value
in the other table.
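A small sketch (table and column names invented) shows the three kinds of constraints being declared and enforced in SQLite through Python's sqlite3 module; note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite requires this per connection

# Entity integrity: PRIMARY KEY uniquely identifies each row.
conn.execute("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    )
""")
# Referential integrity: orders.customer_id must exist in customers.
# Domain integrity: the CHECK clause restricts the range of amount.
conn.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        amount      REAL CHECK (amount > 0)
    )
""")

conn.execute("INSERT INTO customers VALUES (1, 'Asha')")
conn.execute("INSERT INTO orders VALUES (10, 1, 250.0)")      # accepted

try:
    conn.execute("INSERT INTO orders VALUES (11, 99, 50.0)")  # no customer 99
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

Both the dangling foreign key and any non-positive amount raise `sqlite3.IntegrityError`, so invalid data never reaches the table.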

35. Describe the basic operators of relational algebra with an example for each

Relational Algebra is a procedural query language that takes relations as input and returns
relations as output. There are some basic operators that can be applied to relations to produce
the required results, which we will discuss one by one. We will use the STUDENT_SPORTS,
EMPLOYEE, and STUDENT relations given in Table 1, Table 2, and Table 3 respectively to
understand the various operators.
Table 1: STUDENT_SPORTS

ROLL_NO SPORTS

1 Badminton

2 Cricket

2 Badminton

4 Badminton

Table 2: EMPLOYEE

EMP_NO NAME ADDRESS PHONE AGE


1 RAM DELHI 9455123451 18

5 NARESH HISAR 9782918192 22

6 SWETA RANCHI 9852617621 21

4 SURESH DELHI 9156768971 18

Table 3: STUDENT

ROLL_NO NAME ADDRESS PHONE AGE

1 RAM DELHI 9455123451 18

2 RAMESH GURGAON 9652431543 18

3 SUJIT ROHTAK 9156253131 20

4 SURESH DELHI 9156768971 18

Selection operator (σ): The selection operator is used to select tuples from a relation based on some
condition. Syntax:
σ (Cond)(Relation Name)
To extract students whose age is greater than 18 from the STUDENT relation given in Table 3:
σ (AGE>18)(STUDENT)
RESULT:

ROLL_NO NAME ADDRESS PHONE AGE

3 SUJIT ROHTAK 9156253131 20

Projection Operator (∏): Projection operator is used to project particular columns from a
relation. Syntax:
∏(Column 1,Column 2….Column n)(Relation Name)
Extract ROLL_NO and NAME from STUDENT relation given in Table 3
∏(ROLL_NO,NAME)(STUDENT)
RESULT:

ROLL_NO NAME

1 RAM

2 RAMESH

3 SUJIT

4 SURESH

Note: If the resultant relation after projection has duplicate rows, it will be removed. For
Example ∏(ADDRESS)(STUDENT) will remove one duplicate row with the value DELHI and return three
rows.
Cross Product (X): The cross product is used to combine two relations: every row of Relation1 is
concatenated with each row of Relation2. If Relation1 has m tuples and Relation2 has n tuples, the
cross product of Relation1 and Relation2 will have m X n tuples. Syntax:
Relation1 X Relation2
To apply the cross product to the STUDENT relation given in Table 3 and the STUDENT_SPORTS relation
given in Table 1:
STUDENT X STUDENT_SPORTS
RESULT:

ROLL_NO NAME ADDRESS PHONE AGE ROLL_NO SPORTS

1 RAM DELHI 9455123451 18 1 Badminton

1 RAM DELHI 9455123451 18 2 Cricket

1 RAM DELHI 9455123451 18 2 Badminton

1 RAM DELHI 9455123451 18 4 Badminton

2 RAMESH GURGAON 9652431543 18 1 Badminton

2 RAMESH GURGAON 9652431543 18 2 Cricket

2 RAMESH GURGAON 9652431543 18 2 Badminton

2 RAMESH GURGAON 9652431543 18 4 Badminton

3 SUJIT ROHTAK 9156253131 20 1 Badminton

3 SUJIT ROHTAK 9156253131 20 2 Cricket

3 SUJIT ROHTAK 9156253131 20 2 Badminton

3 SUJIT ROHTAK 9156253131 20 4 Badminton

4 SURESH DELHI 9156768971 18 1 Badminton

4 SURESH DELHI 9156768971 18 2 Cricket

4 SURESH DELHI 9156768971 18 2 Badminton

4 SURESH DELHI 9156768971 18 4 Badminton

Union (U): The union of two relations R1 and R2 can only be computed if R1 and R2 are union
compatible (the two relations must have the same number of attributes, and corresponding
attributes must have the same domain). The union operator applied to R1 and R2 gives a relation
containing the tuples that are in R1 or in R2; tuples that appear in both R1 and R2 appear only once
in the result. Syntax:
Relation1 U Relation2
To find the people who are either students or employees, we can use the Union operator:
STUDENT U EMPLOYEE
RESULT:

ROLL_NO NAME ADDRESS PHONE AGE

1 RAM DELHI 9455123451 18

2 RAMESH GURGAON 9652431543 18

3 SUJIT ROHTAK 9156253131 20

4 SURESH DELHI 9156768971 18

5 NARESH HISAR 9782918192 22

6 SWETA RANCHI 9852617621 21

Minus (-): The minus of two relations R1 and R2 can only be computed if R1 and R2 are union compatible.
The minus operator applied as R1 - R2 gives a relation with the tuples that are in R1 but
not in R2. Syntax:
Relation1 - Relation2
To find the people who are students but not employees, we can use the Minus operator:
STUDENT - EMPLOYEE
RESULT:

ROLL_NO NAME ADDRESS PHONE AGE

2 RAMESH GURGAON 9652431543 18

3 SUJIT ROHTAK 9156253131 20

Rename (ρ): The rename operator is used to give another name to a relation. Syntax:


ρ(Relation2, Relation1)
To rename STUDENT relation to STUDENT1, we can use rename operator like:
ρ(STUDENT1, STUDENT)
If you want to create a relation STUDENT_NAMES with ROLL_NO and NAME from STUDENT, it can be
done using rename operator as:
ρ(STUDENT_NAMES, ∏(ROLL_NO, NAME)(STUDENT))
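To make the operators concrete, here is a minimal sketch that models a relation as a list of dictionaries and implements selection, projection, and cross product in Python (the sample rows are a fragment of the STUDENT and STUDENT_SPORTS tables above; the function names are my own).

```python
def select(relation, cond):
    """sigma(cond)(relation): keep tuples that satisfy the predicate."""
    return [t for t in relation if cond(t)]

def project(relation, columns):
    """pi(columns)(relation): keep given columns, dropping duplicate rows."""
    seen, out = set(), []
    for t in relation:
        row = tuple(t[c] for c in columns)
        if row not in seen:
            seen.add(row)
            out.append(dict(zip(columns, row)))
    return out

def cross(r1, r2):
    """r1 X r2: concatenate every pair of tuples (m x n results)."""
    return [{**t1, **t2} for t1 in r1 for t2 in r2]

STUDENT = [
    {"ROLL_NO": 1, "NAME": "RAM", "AGE": 18},
    {"ROLL_NO": 3, "NAME": "SUJIT", "AGE": 20},
    {"ROLL_NO": 4, "NAME": "SURESH", "AGE": 18},
]
STUDENT_SPORTS = [{"SPORTS": "Badminton"}, {"SPORTS": "Cricket"}]

print(select(STUDENT, lambda t: t["AGE"] > 18))   # only SUJIT survives
print(project(STUDENT, ["AGE"]))                  # duplicate 18 removed
print(len(cross(STUDENT, STUDENT_SPORTS)))        # 3 x 2 = 6
```

Note how `project` removes duplicate rows, matching the set semantics of relational algebra, and how the cross product size is exactly m x n.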

36. Explain aggregate functions in SQL with examples

SQL aggregate functions are functions that operate on a set of values and return a single value. They are
often used with the GROUP BY clause to divide the result set into groups of values and calculate a
summary statistic for each group.
Some of the commonly used SQL aggregate functions are:
 COUNT(): This function returns the number of rows in a table or a group. It can also count the number of
distinct or non-null values in a column.
 SUM(): This function returns the sum of all or distinct values in a column or a group.
 AVG(): This function returns the average of all or distinct values in a column or a group. It ignores null
values.
 MIN(): This function returns the minimum value in a column or a group. It ignores null values.
 MAX(): This function returns the maximum value in a column or a group. It ignores null values.

The following illustrates how the aggregate function is used with the GROUP BY clause:

SELECT c1, aggregate_function(c2)


FROM table
GROUP BY c1;

You can use aggregate functions as expressions only in the following:

 The select list of a SELECT statement, either a subquery or an outer query.


 A HAVING clause

AVG

The AVG() function returns the average values in a set. The following illustrates the syntax of
the AVG() function:

AVG([ALL | DISTINCT] expression)

The ALL keyword instructs the AVG() function to calculate the average of all values while
the DISTINCT keyword forces the function to operate on distinct values only. By default, the ALL option is
used.

The following example shows how to use the AVG() function to calculate the average salary of each
department:

SELECT
department_name, ROUND(AVG(salary), 0) avg_salary
FROM
employees
INNER JOIN
departments USING (department_id)
GROUP BY department_name
ORDER BY department_name;

MIN

The MIN() function returns the minimum value of a set. The following illustrates the syntax of
the MIN() function:
MIN(column | expression)

For example, the following statement returns the minimum salary of the employees in each department:

SELECT
department_name, MIN(salary) min_salary
FROM
employees
INNER JOIN
departments USING (department_id)
GROUP BY department_name
ORDER BY department_name;

MAX

The MAX() function returns the maximum value of a set. The MAX() function has the following syntax:

MAX(column | expression)

For example, the following statement returns the highest salary of employees in each department:

SELECT
department_name, MAX(salary) highest_salary
FROM
employees
INNER JOIN
departments USING (department_id)
GROUP BY department_name
ORDER BY department_name;

COUNT
The COUNT() function returns the number of items in a set. The following shows the syntax of
the COUNT() function:

COUNT([ALL | DISTINCT] column | expression | *)

For example, the following example uses the COUNT(*) function to return the headcount of each
department:

SELECT
department_name, COUNT(*) headcount
FROM
employees
INNER JOIN
departments USING (department_id)
GROUP BY department_name
ORDER BY department_name;

SUM

The SUM() function returns the sum of all values. The following illustrates the syntax of the SUM() function:

SUM([ALL | DISTINCT] column)

For example, the following statement returns the total salary of all employees in each department:

SELECT
department_id, SUM(salary)
FROM
employees
GROUP BY department_id;
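The queries above assume a sample HR schema with employees and departments tables. A self-contained sketch with a small invented table demonstrates the same GROUP BY pattern with all five aggregate functions at once, using Python's sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ram", "IT", 400), ("Sita", "IT", 600), ("Arjun", "HR", 500)],
)

# One summary row per department: headcount, total, average, min, max.
rows = conn.execute("""
    SELECT department,
           COUNT(*)    AS headcount,
           SUM(salary) AS total,
           AVG(salary) AS average,
           MIN(salary) AS lowest,
           MAX(salary) AS highest
    FROM employees
    GROUP BY department
    ORDER BY department
""").fetchall()

for row in rows:
    print(row)
# ('HR', 1, 500, 500.0, 500, 500)
# ('IT', 2, 1000, 500.0, 400, 600)
```

Each aggregate collapses the rows of its group into a single value, which is why every non-aggregated column in the select list must appear in the GROUP BY clause.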
37. Draw the system architecture of DBMS. Explain each component in detail.

A Database stores a lot of critical information to access data quickly and securely. Hence it is important to
select the correct architecture for efficient data management. DBMS Architecture helps users to get their
requests done while connecting to the database. We choose database architecture depending on several
factors like the size of the database, number of users, and relationships between the users. There are two
types of database models that we generally use, logical model and physical model. Several types of
architecture are there in the database which we will deal with in the next section.
Types of DBMS Architecture
There are several types of DBMS Architecture that we use according to the usage requirements. Types of
DBMS Architecture are discussed here.
 1-Tier Architecture
 2-Tier Architecture
 3-Tier Architecture
1-Tier Architecture
In 1-Tier Architecture the database is directly available to the user: the client, server, and database
are all present on the same machine. For example, to learn SQL we set up an SQL server and the database
on the local system, which lets us interact directly with the relational database and execute operations.
Industry rarely uses this architecture; 2-tier and 3-tier architectures are preferred instead.

Advantages of 1-Tier Architecture


 Simple Architecture: 1-Tier Architecture is the most simple architecture to set up, as only a single
machine is required to maintain it.
 Cost-Effective: No additional hardware is required for implementing 1-Tier Architecture, which
makes it cost-effective.
 Easy to Implement: 1-Tier Architecture can be easily deployed, and hence it is mostly used in small
projects.
2-Tier Architecture
The 2-tier architecture is similar to a basic client-server model. The application at the client end directly
communicates with the database on the server side. APIs like ODBC and JDBC are used for this
interaction. The server side is responsible for providing query processing and transaction management
functionalities. On the client side, the user interfaces and application programs are run. The application on
the client side establishes a connection with the server side to communicate with the DBMS.
An advantage of this type is that maintenance and understanding are easier, and compatible with existing
systems. However, this model gives poor performance when there are a large number of users.

Advantages of 2-Tier Architecture


 Easy to Access: 2-Tier Architecture provides easy access to the database, which makes data retrieval fast.
 Scalable: We can scale the database easily, by adding clients or upgrading hardware.
 Low Cost: 2-Tier Architecture is cheaper than 3-Tier Architecture and Multi-Tier Architecture.
 Easy Deployment: 2-Tier Architecture is easier to deploy than 3-Tier Architecture.
 Simple: 2-Tier Architecture is easily understandable as well as simple because of only two
components.
3-Tier Architecture
In 3-Tier Architecture, there is another layer between the client and the server. The client does not directly
communicate with the server. Instead, it interacts with an application server which further communicates
with the database system and then the query processing and transaction management takes place. This
intermediate layer acts as a medium for the exchange of partially processed data between the server and
the client. This type of architecture is used in the case of large web applications.

Advantages of 3-Tier Architecture


 Enhanced scalability: Scalability is enhanced due to the distributed deployment of application
servers. Now, individual connections need not be made between the client and server.
 Data Integrity: 3-Tier Architecture maintains Data Integrity. Since there is a middle layer between
the client and the server, data corruption can be avoided/removed.
 Security: 3-Tier Architecture Improves Security. This type of model prevents direct interaction of the
client with the server thereby reducing access to unauthorized data.
Disadvantages of 3-Tier Architecture
 More Complex: 3-Tier Architecture is more complex in comparison to 2-Tier Architecture.
Communication Points are also doubled in 3-Tier Architecture.
 Difficult to Interact: It becomes difficult for this sort of interaction to take place due to the presence
of middle layers.

38. DML Commands in SQL

DML is an abbreviation of Data Manipulation Language.

The DML commands in Structured Query Language change the data present in the SQL database. We can
easily access, store, modify, update and delete the existing records from the database using DML
commands.

Following are the four main DML commands in SQL:

1. SELECT Command

2. INSERT Command

3. UPDATE Command

4. DELETE Command
SELECT DML Command

SELECT is the most important data manipulation command in Structured Query Language. The SELECT
command shows the records of the specified table. It also shows the particular record of a particular
column by using the WHERE clause.

Syntax of SELECT DML command

SELECT column_Name_1, column_Name_2, ….., column_Name_N FROM Name_of_table;

Here, column_Name_1, column_Name_2, ….., column_Name_N are the names of those columns whose
data we want to retrieve from the table.

If we want to retrieve the data from all the columns of the table, we have to use the following SELECT
command:

SELECT * FROM table_name;


Examples of SELECT Command

Example 1: This example shows all the values of every column from the table.

SELECT * FROM Student;

This SQL statement displays the following values of the student table:

Student_ID Student_Name Student_Marks

BCA1001 Abhay 85

BCA1002 Anuj 75

BCA1003 Bheem 60

BCA1004 Ram 79

BCA1005 Sumit 80

INSERT DML Command

INSERT is another most important data manipulation command in Structured Query Language, which allows
users to insert data in database tables.

Syntax of INSERT Command

INSERT INTO TABLE_NAME ( column_Name1 , column_Name2 , column_Name3 , .... column_NameN ) VA


LUES (value_1, value_2, value_3, .... value_N ) ;

Example 1: This example describes how to insert the record in the database table.
Let's take the following student table, which consists of only 2 records of the student.

Stu_Id Stu_Name Stu_Marks Stu_Age

101 Ramesh 92 20

201 Jatin 83 19

Suppose, you want to insert a new record into the student table. For this, you have to write the following DML
INSERT command:

INSERT INTO Student (Stu_id, Stu_Name, Stu_Marks, Stu_Age) VALUES (104, 'Anmol', 89, 19);

UPDATE DML Command

UPDATE is another most important data manipulation command in Structured Query Language, which allows
users to update or modify the existing data in database tables.

Syntax of UPDATE Command

UPDATE Table_name SET [column_name1 = value_1, ….., column_nameN = value_N] WHERE CONDITION;

Here, 'UPDATE', 'SET', and 'WHERE' are the SQL keywords, and 'Table_name' is the name of the table
whose values you want to update.

Examples of the UPDATE command

Example 1: This example describes how to update the value of a single field.

Let's take a Product table consisting of the following records:

Product_Id Product_Name Product_Price Product_Quantity

P101 Chips 20 20

P102 Chocolates 60 40

P103 Maggi 75 5

P201 Biscuits 80 20

P203 Namkeen 40 50

Suppose, you want to update the Product_Price of the product whose Product_Id is P102. To do this, you
have to write the following DML UPDATE command:

UPDATE Product SET Product_Price = 80 WHERE Product_Id = 'P102' ;
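The DML commands can be exercised end to end with Python's built-in sqlite3 module; this sketch reuses a fragment of the Product table above and runs INSERT, UPDATE, DELETE, and SELECT in turn (the DELETE is an extra illustration beyond the examples given):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Product (
        Product_Id       TEXT PRIMARY KEY,
        Product_Name     TEXT,
        Product_Price    INTEGER,
        Product_Quantity INTEGER
    )
""")

# INSERT: add records to the table.
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?, ?)",
    [("P101", "Chips", 20, 20), ("P102", "Chocolates", 60, 40)],
)

# UPDATE: modify an existing record.
conn.execute("UPDATE Product SET Product_Price = 80 WHERE Product_Id = 'P102'")

# DELETE: remove a record.
conn.execute("DELETE FROM Product WHERE Product_Id = 'P101'")

# SELECT: read back what remains.
print(conn.execute("SELECT * FROM Product").fetchall())
# [('P102', 'Chocolates', 80, 40)]
```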

39. Define DBMS


A database management system is a collection of interrelated data and a set of programs to access those
data. The collection of data is referred to as the database. The primary goal of a DBMS is to provide a way to
store and retrieve database information that is both convenient and efficient. Data processing occurs when
data is collected and translated into usable information.

40. Functions of a DBA:

A database administrator, or DBA, is responsible for maintaining, securing, and operating databases and
also ensures that data is correctly stored and retrieved. In addition, DBAs often work with developers to
design and implement new features and troubleshoot any issues.

His main functions and responsibilities include:

 Decides hardware –
They decide on hardware that is economical in terms of cost, performance, and efficiency and that
best suits the organization. The hardware acts as the interface between end users and the database.
 Manages data integrity and security –
Data integrity needs to be checked and managed accurately, as it protects data and restricts it from
unauthorized use. The DBA keeps an eye on relationships within the data to maintain data integrity.
 Database Accessibility –
Database Administrator is solely responsible for giving permission to access data available in the
database. It also makes sure who has the right to change the content.
 Database design –
DBA is held responsible and accountable for logical, physical design, external model design, and
integrity and security control.
 Database implementation –
DBA implements DBMS and checks database loading at the time of its implementation.
 Query processing performance –
DBA enhances query processing by improving speed, performance, and accuracy.
 Tuning Database Performance –
If users cannot get data speedily and accurately, the organization may lose business.
By tuning SQL commands, the DBA can enhance the performance of the database.

41. Data Independence : Data Independence is defined as a property of DBMS that helps us to
change the Database schema at one level of a database system without requiring to change the
schema at the next higher level. Data independence helps us to keep data separated from all
programs that make use of it.

Types of Data Independence:

In DBMS there are two types of data independence

Physical data independence

Logical data independence.

Physical Data Independence


Physical data independence helps you to separate conceptual levels from the internal/physical levels. It
allows you to provide a logical description of the database without the need to specify physical structures.
Compared to Logical Independence, it is easy to achieve physical data independence.

With physical independence, you can easily change the physical storage structures or devices without
affecting the conceptual schema. Any change made is absorbed by the mapping between the
conceptual and internal levels. Physical data independence is achieved through the presence of the internal
level of the database and the transformation from the conceptual level of the database to the internal
level.

Examples of changes under Physical Data Independence


Due to Physical independence, any of the below change will not affect the conceptual layer.
 Using a new storage device like Hard Drive or Magnetic Tapes
 Modifying the file organization technique in the Database
 Switching to different data structures.
 Changing the access method.
 Modifying indexes.
 Changes to compression techniques or hashing algorithms.
 Change of Location of Database from say C drive to D Drive

Logical Data Independence


Logical Data Independence is the ability to change the conceptual scheme without changing

1. External views
2. External API or programs

Any change made will be absorbed by the mapping between external and conceptual levels.

When compared to Physical Data independence, it is challenging to achieve logical data independence.

Examples of changes under Logical Data Independence


Due to Logical independence, any of the below change will not affect the external layer.

1. Adding, modifying, or deleting an attribute, entity, or relationship is possible without a rewrite of
existing application programs
2. Merging two records into one
3. Breaking an existing record into two or more records

Difference between Physical and Logical Data Independence

Logical Data Independence vs Physical Data Independence:

1. Logical: Mainly concerned with the structure of the data or changing the data definition.
   Physical: Mainly concerned with the storage of the data.

2. Logical: Difficult, as the retrieval of data depends mainly on the logical structure of the data.
   Physical: Data is easy to retrieve.

3. Logical: Compared to physical independence, it is difficult to achieve logical data independence.
   Physical: Compared to logical independence, it is easy to achieve physical data independence.

4. Logical: You need to change the application program if new fields are added to or deleted from the database.
   Physical: A change at the physical level usually does not require a change at the application program level.

5. Logical: Modification at the logical level is significant whenever the logical structures of the database are changed.
   Physical: Modifications made at the internal level may or may not be needed to improve the performance of the structure.

6. Logical: Concerned with the conceptual schema.
   Physical: Concerned with the internal schema.

7. Logical: Example – add/modify/delete an attribute.
   Physical: Example – change in compression techniques, hashing algorithms, storage devices, etc.

Importance of Data Independence

 Helps you to improve the quality of the data
 Database system maintenance becomes affordable
 Enforcement of standards and improvement in database security
 You don’t need to alter data structures in application programs
 Permits developers to focus on the general structure of the database rather than worrying about the
internal implementation
 Helps keep data integrity intact
 Database inconsistency is vastly reduced
 Modifications at the physical level can easily be made when needed to improve the performance of the
system.

Summary

 Data Independence is the property of DBMS that helps you to change the Database schema at one
level of a database system without requiring to change the schema at the next higher level.
 Two levels of data independence are 1) Physical and 2) Logical
 Physical data independence helps you to separate conceptual levels from the internal/physical
levels
 Logical Data Independence is the ability to change the conceptual scheme without changing external
views or application programs
 When compared to Physical Data independence, it is challenging to achieve logical data
independence
 Data Independence Helps you to improve the quality of the data

42. What is a transaction?

A transaction is a single logical unit of work which accesses and possibly modifies the contents of a
database. Transactions access data using read and write operations. As transactions deal with accessing
and modifying the contents of the database, they must have some basic properties which help maintain
the consistency and integrity of the database before and after the transaction. Transactions follow 4
properties, namely, Atomicity, Consistency, Isolation, and Durability. Generally, these are referred to as
ACID properties of transactions in DBMS.

Operations of Transaction
A user can make different types of requests to access and modify the contents of a database. So, we have
different types of operations relating to a transaction. They are discussed as follows:
i) Read(X)
A read operation is used to read the value of X from the database and store it in a buffer in the main
memory for further actions such as displaying that value. Such an operation is performed when a user
wishes just to see any content of the database and not make any changes to it. For example, when a user
wants to check his/her account’s balance, a read operation would be performed on user’s account balance
from the database.
ii) Write(X)
A write operation is used to write the value to the database from the buffer in the main memory. For a write
operation to be performed, first a read operation is performed to bring its value in buffer, and then some
changes are made to it, e.g. some set of arithmetic operations are performed on it according to the user’s
request, then to store the modified value back in the database, a write operation is performed. For example,
when a user requests to withdraw some money from his account, his account balance is fetched from the
database using a read operation, then the amount to be deducted from the account is subtracted from this
value, and then the obtained value is stored back in the database using a write operation.
iii) Commit
This operation in transactions is used to maintain integrity in the database. Due to some failure of power,
hardware, or software, etc., a transaction might get interrupted before all its operations are completed.
This may cause ambiguity in the database, i.e. it might get inconsistent before and after the transaction.
To ensure that further operations of any other transaction are performed only after the work of the current
transaction is done, a commit operation is performed to save the changes made by a transaction
permanently to the database.
iv) Rollback
This operation is performed to bring the database to the last saved state when any transaction is interrupted
in between due to any power, hardware, or software failure. In simple words, it can be said that a rollback
operation does undo the operations of transactions that were performed before its interruption to achieve
a safe state of the database and avoid any kind of ambiguity or inconsistency.
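The read/write/commit/rollback cycle above can be sketched with Python's `sqlite3` module. This is a minimal illustration of the bank-withdrawal example, not a production pattern; the `account` table and starting balance are assumptions made for the demo.

```python
import sqlite3

# Set up an in-memory database with one account holding a balance of 100.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.commit()

def withdraw(conn, account_id, amount):
    # Read(X): fetch the balance into main memory.
    balance = conn.execute(
        "SELECT balance FROM account WHERE id = ?", (account_id,)
    ).fetchone()[0]
    if balance < amount:
        conn.rollback()  # Rollback: restore the last saved consistent state.
        return False
    # Write(X): store the modified value back in the database.
    conn.execute(
        "UPDATE account SET balance = ? WHERE id = ?",
        (balance - amount, account_id),
    )
    conn.commit()  # Commit: make the change permanent.
    return True

withdraw(conn, 1, 50)   # succeeds, balance becomes 50
withdraw(conn, 1, 200)  # fails, rollback leaves balance at 50
```

If the process crashed between the `UPDATE` and the `commit()`, the uncommitted change would be discarded on recovery, which is exactly the guarantee the commit and rollback operations provide.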
Transaction Schedules
When multiple transaction requests are made at the same time, we need to decide their order of execution.
Thus, a transaction schedule can be defined as a chronological order of execution of multiple transactions.
There are broadly two types of transaction schedules discussed as follows,
i) Serial Schedule
In this kind of schedule, when multiple transactions are to be executed, they are executed serially, i.e. at
one time only one transaction is executed while others wait for the execution of the current transaction to
be completed. This ensures consistency in the database as transactions do not execute simultaneously.
But it increases the waiting time of transactions in the queue, which in turn lowers the throughput of
the system, i.e. the number of transactions executed per unit time. To improve the throughput of the system,
another kind of schedule is used, with stricter rules that help the database remain
consistent even when transactions execute simultaneously.
ii) Non-Serial Schedule
To reduce the waiting time of transactions in the waiting queue and improve the system efficiency, we use
nonserial schedules which allow multiple transactions to start before a transaction is completely executed.
This may sometimes result in inconsistency and errors in database operation. So, these errors are handled
with specific algorithms to maintain the consistency of the database and improve CPU throughput as well.
Non-serial schedules are also sometimes referred to as parallel schedules, as transactions execute in
parallel in this kind of schedule.
Serializable
Serializability in DBMS is the property of a nonserial schedule that determines whether it would maintain
the database consistency or not. The nonserial schedule which ensures that the database would be
consistent after the transactions are executed in the order determined by that schedule is said to be
Serializable Schedules. Serial schedules always maintain database consistency, as a transaction starts
only after the execution of the previous transaction has completed. Thus, serial schedules are
always serializable.
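Conflict serializability is commonly tested by building a precedence graph and checking it for cycles. The sketch below assumes a simple schedule representation of `(transaction, operation, item)` triples; the function name and format are illustrative, not standard API.

```python
from collections import defaultdict

def is_conflict_serializable(schedule):
    """Check conflict serializability of a schedule.

    `schedule` is a list of (transaction, operation, item) triples,
    e.g. ("T1", "R", "A"). Two operations conflict when they belong to
    different transactions, touch the same item, and at least one writes.
    An edge Ti -> Tj is added when a Ti operation conflicts with a later
    Tj operation; the schedule is serializable iff the graph is acyclic.
    """
    edges = defaultdict(set)
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            if ti != tj and item_i == item_j and "W" in (op_i, op_j):
                edges[ti].add(tj)

    # Detect a cycle with depth-first search.
    visited, in_stack = set(), set()
    def has_cycle(node):
        visited.add(node)
        in_stack.add(node)
        for nxt in edges[node]:
            if nxt in in_stack or (nxt not in visited and has_cycle(nxt)):
                return True
        in_stack.discard(node)
        return False

    txns = {t for t, _, _ in schedule}
    return not any(has_cycle(t) for t in txns if t not in visited)

# Lost-update interleaving: T1 and T2 each read A, then each write A.
bad = [("T1","R","A"), ("T2","R","A"), ("T1","W","A"), ("T2","W","A")]
good = [("T1","R","A"), ("T1","W","A"), ("T2","R","A"), ("T2","W","A")]
print(is_conflict_serializable(bad))   # False
print(is_conflict_serializable(good))  # True
```

The `bad` schedule produces edges in both directions between T1 and T2 (a cycle), so no serial order is equivalent to it; the `good` schedule only has T1 → T2 and is equivalent to running T1 before T2.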
A transaction is a series of operations, so various states occur in its completion journey. They are
discussed as follows:
i) Active
It is the first stage of any transaction when it has begun to execute. The execution of the transaction takes
place in this state. Operations such as insertion, deletion, or updation are performed during this state.
During this state, the data records are under manipulation and they are not saved to the database, rather
they remain somewhere in a buffer in the main memory.
ii) Partially Committed
This state of transaction is achieved when it has completed most of the operations and is executing its
final operation. It can be a signal to the commit operation, as after the final operation of the transaction
completes its execution, the data has to be saved to the database through the commit operation. If some
kind of error occurs during this state, the transaction goes into a failed state, else it goes into the Committed
state.
iii) Committed
This state of transaction is achieved when all the transaction-related operations have been executed
successfully along with the Commit operation, i.e. data is saved into the database after the required
manipulations in this state. This marks the successful completion of a transaction.
iv) Failed
If any of the transaction-related operations cause an error during the active or partially committed state,
further execution of the transaction is stopped and it is brought into a failed state. Here, the database
recovery system makes sure that the database is in a consistent state.
v) Aborted
If the error is not resolved in the failed state, then the transaction is aborted and a rollback operation is
performed to bring the database to the last saved consistent state. When a transaction is aborted, the
database recovery module either restarts the transaction or kills it.
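The five states and their legal transitions can be sketched as a small state machine. The transition table below is an assumption drawn directly from the descriptions above; the class and method names are illustrative.

```python
# Allowed transitions between transaction states, per the descriptions above.
TRANSITIONS = {
    "active":              {"partially committed", "failed"},
    "partially committed": {"committed", "failed"},
    "failed":              {"aborted"},
    "committed":           set(),   # terminal state
    "aborted":             set(),   # terminal: recovery restarts or kills it
}

class Transaction:
    def __init__(self):
        self.state = "active"  # every transaction begins in the active state

    def move_to(self, new_state):
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state

t = Transaction()
t.move_to("partially committed")
t.move_to("committed")
print(t.state)  # committed
```

Attempting an illegal move, such as going from "committed" back to "active", raises an error, mirroring the fact that a committed transaction's effects are permanent.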

43. Privileges and Roles


Confidentiality, integrity, and availability are the pillars of database security. Authorization is the
permission given to a user or process to access a set of objects. The type of access granted can be
read-only, read, or write.

A privilege in a database management system is the permission to execute certain actions on the
database.

Privileges can be permitted to:

1. Access a table
2. Access permission to execute a database command
3. Access another user’s object

Privileges make certain actions possible, such as connecting to a database, creating a table, and executing
another user’s stored procedure.

Privileges are granted to users so they can accomplish a given task. If privileges are granted
carelessly, this can lead to a security breach in the database.

Categories of privileges

There are two main categories of privileges possible in a database:

1. System privileges
2. Object privileges

1. System privileges

A system privilege can include the following:

1. Perform an action on any object of a database
2. Create tablespaces in a database
3. Delete the rows of any table in a database

We have close to 60 system privileges in existence in the database management system.

System privileges are mostly granted to administrative personnel and application developers. This privilege
is usually not open to end-users of the database.


2. Object privileges

Object privilege is the permission to access specific database objects. Object privilege entails performing a
specific action on a particular table, function, or package.
The right to delete rows from a table is an object privilege. Object privileges are granted to normal users,
unlike system privileges.

Conclusion

Both system and object privileges are very important in a database management system, as they help
secure the data stored in a database system.

Access control is performed using the following two methods:
1. Privileges
2. Roles

Privileges :
The authority or permission to access a named object in a prescribed manner, for example,
permission to access a table. Privileges can permit a particular user to connect to the
database. In other words, privileges are grants of access to database objects.

 Database privileges —
A privilege is permission to execute one particular type of SQL statement or to access another
user’s object. Database privileges control the use of computing resources. Database privileges
do not apply to the database administrator of the database.

 System privileges —
A system privilege is the right to perform an activity on a specific type of object. For example, the
privilege to delete rows of any table in a database is a system privilege. There are a total of 60
different system privileges. System privileges allow users to CREATE, ALTER, or DROP
database objects.

 Object privilege —
An object privilege is a privilege to perform a specific action on a particular table, function, or
package. For example, the right to delete rows from a table is an object privilege. For instance, if a row
of the table GEEKSFORGEEKS contains the name of an employee who is no longer part of the
organization, then deleting that row is an exercise of an object privilege. Object privileges
allow the user to INSERT, DELETE, UPDATE, or SELECT data in a database object.

Roles:
A role is a mechanism that can be used to grant authorization. A role, or a group of roles, can be
granted to a person or a group of people. Through roles, the administrator can manage access
privileges very easily. Roles are provided by the database management system for easy and
controlled privilege management.
Properties –
The following are the properties of the roles which allow easy privilege management inside a database:
 Reduced privilege administration —
The user can grant the privilege for a group of users who are related instead of granting the same
set of privileges to the users explicitly.
 Dynamic privilege management —
If the privileges of the group change, only the privileges of the role need to be changed.
 Application-specific security —
The user can also protect the use of a role with a password. Applications can be created to enable
a role only when the correct password is entered. Users cannot assume the role if they do not
know the password.

44. How are privileges granted and revoked?


Using the Create User Statement only creates a new user but does not grant any privileges to the user
account. Therefore to grant privileges to a user account, the GRANT statement is used.
Syntax:
GRANT privileges_names ON object TO user;
Parameters Used
 privileges_name: These are the access rights or privileges granted to the user.
 object: It is the name of the database object to which permissions are being granted. In the case of
granting privileges on a table, this would be the table name.
 user: It is the name of the user to whom the privileges would be granted.

Grant Privileges on Table

Let us now learn about different ways of granting privileges to the users:
 Granting SELECT Privilege to a User in a Table:
1. To grant Select Privilege to a table named “users” where User Name is Amit, the following GRANT
statement should be executed.
2. The general syntax of specifying a username is: ‘user_name’@’address’
3. If the user ‘Amit’ is on the local host then we have to mention it as ‘Amit’@’localhost’. Or suppose if
the ‘Amit’ username is on 192.168.1.100 IP address then we have to mention it as
‘Amit’@’192.168.1.100’.
‘user_name’@’address’ – When you’re granting or revoking permissions in MySQL, you use the
‘username’@’hostname’ format to specify which users are allowed or denied. This is important for keeping
security and access control in place, so here’s why we use it:
 Granularity of Access Control
 Multi-User Environments
 User identification
GRANT SELECT ON Users TO 'Amit'@'localhost';
 Granting more than one Privilege to a User in a Table: To grant multiple Privileges to a user
named “Amit” in a table “users”, the following GRANT statement should be executed.
GRANT SELECT, INSERT, DELETE, UPDATE ON Users TO 'Amit'@'localhost';
 Granting All the Privilege to a User in a Table: To Grant all the privileges to a user named “Amit”
in a table “users”, the following Grant statement should be executed.
GRANT ALL ON Users TO 'Amit'@'localhost';
 Granting a Privilege to all Users in a Table: To Grant a specific privilege to all the users in a table
“users”, the following Grant statement should be executed.
GRANT SELECT ON Users TO '*'@'localhost';
In the above example the “*” symbol is used to grant select permission to all the users of the table
“users”.
 Granting Privileges on Functions/Procedures: While using functions and procedures, the Grant
statement can be used to grant users the ability to execute the functions and procedures in
MySQL. Granting Execute Privilege: Execute privilege gives the ability to execute a function or
procedure. Syntax:
GRANT EXECUTE ON [ PROCEDURE | FUNCTION ] object TO user;
Different ways of granting EXECUTE Privileges
 Granting EXECUTE privileges on a function in MySQL.: If there is a function named
“CalculateSalary” and you want to grant EXECUTE access to the user named Amit, then the
following GRANT statement should be executed.
GRANT EXECUTE ON FUNCTION Calculatesalary TO 'Amit'@'localhost';
 Granting EXECUTE privileges to all Users on a function in MySQL.: If there is a function named
“CalculateSalary” and you want to grant EXECUTE access to all the users, then the following
GRANT statement should be executed.
GRANT EXECUTE ON FUNCTION Calculatesalary TO '*'@'localhost';
 Granting EXECUTE privilege to a Users on a procedure in MySQL.: If there is a procedure
named “DBMSProcedure” and you want to grant EXECUTE access to the user named Amit, then the
following GRANT statement should be executed.
GRANT EXECUTE ON PROCEDURE DBMSProcedure TO 'Amit'@'localhost';
 Granting EXECUTE privileges to all Users on a procedure in MySQL.: If there is a procedure
called “DBMSProcedure” and you want to grant EXECUTE access to all the users, then the following
GRANT statement should be executed.
GRANT EXECUTE ON PROCEDURE DBMSProcedure TO '*'@'localhost';
 Checking the Privileges Granted to a User: To see the privileges granted to a user in a table, the
SHOW GRANTS statement is used. To check the privileges granted to a user named “Amit” and host
as “localhost”, the following SHOW GRANTS statement will be executed:
SHOW GRANTS FOR 'Amit'@'localhost';
Output:
GRANTS FOR Amit@localhost
GRANT USAGE ON *.* TO `Amit`@`localhost`
Revoking Privileges from a Table
The Revoke statement is used to revoke some or all of the privileges which have been granted to a user
in the past.
Syntax:
REVOKE privileges ON object FROM user;
Parameters Used:
 object: It is the name of the database object from which permissions are being revoked. In the case
of revoking privileges from a table, this would be the table name.
 user: It is the name of the user from whom the privileges are being revoked.

Revoke Privileges on Table

Different Ways of revoking privileges from a user


 Revoking SELECT Privilege from a User in a Table: To revoke the Select Privilege on a table named
“users” from the user named Amit, the following revoke statement should be executed.
REVOKE SELECT ON Users FROM 'Amit'@'localhost';
 Revoking more than one Privilege from a User in a Table: To revoke multiple Privileges from a user
named “Amit” in a table “users”, the following revoke statement should be executed.
REVOKE SELECT, INSERT, DELETE, UPDATE ON Users FROM 'Amit'@'localhost';
 Revoking All the Privileges from a User in a Table: To revoke all the privileges from a user named “Amit”
in a table “users”, the following revoke statement should be executed.
REVOKE ALL ON Users FROM 'Amit'@'localhost';
 Revoking a Privilege from all Users in a Table: To revoke a specific privilege from all the users in a
table “Users”, the following revoke statement should be executed.
REVOKE SELECT ON Users FROM '*'@'localhost';
 Revoking Privileges on Functions/Procedures: While using functions and procedures, the revoke
statement can be used to revoke the privileges from users which have been EXECUTE privileges in
the past.
Syntax:
REVOKE EXECUTE ON [ PROCEDURE | FUNCTION ] object FROM User;
 Revoking EXECUTE privileges on a function in MySQL: If there is a function called
“CalculateSalary” and you want to revoke EXECUTE access to the user named Amit, then the
following revoke statement should be executed.
REVOKE EXECUTE ON FUNCTION Calculatesalary FROM 'Amit'@'localhost';
 Revoking EXECUTE privileges to all Users on a function in MySQL: If there is a function called
“CalculateSalary” and you want to revoke EXECUTE access to all the users, then the following
revoke statement should be executed.
REVOKE EXECUTE ON FUNCTION Calculatesalary FROM '*'@'localhost';
 Revoking EXECUTE privilege to a Users on a procedure in MySQL: If there is a procedure called
“DBMSProcedure” and you want to revoke EXECUTE access to the user named Amit, then the
following revoke statement should be executed.
REVOKE EXECUTE ON PROCEDURE DBMSProcedure FROM 'Amit'@'localhost';
 Revoking EXECUTE privileges to all Users on a procedure in MySQL: If there is a procedure
called “DBMSProcedure” and you want to revoke EXECUTE access to all the users, then the
following revoke statement should be executed.
REVOKE EXECUTE ON PROCEDURE DBMSProcedure FROM '*'@'localhost';

45. Explain LIKE operator with example

The SQL LIKE Operator

The LIKE operator is used in a WHERE clause to search for a specified pattern in a column.

There are two wildcards often used in conjunction with the LIKE operator:

 The percent sign % represents zero, one, or multiple characters


 The underscore sign _ represents one, single character

Example

Select all customers whose names start with the letter "a":

SELECT * FROM Customers


WHERE CustomerName LIKE 'a%';

Syntax

SELECT column1, column2, ...


FROM table_name
WHERE columnN LIKE pattern;

Demo Database

Below is a selection from the Customers table used in the examples:

CustomerID | CustomerName                       | ContactName        | Address                       | City        | PostalCode | Country
1          | Alfreds Futterkiste                | Maria Anders       | Obere Str. 57                 | Berlin      | 12209      | Germany
2          | Ana Trujillo Emparedados y helados | Ana Trujillo       | Avda. de la Constitución 2222 | México D.F. | 05021      | Mexico
3          | Antonio Moreno Taquería            | Antonio Moreno     | Mataderos 2312                | México D.F. | 05023      | Mexico
4          | Around the Horn                    | Thomas Hardy       | 120 Hanover Sq.               | London      | WA1 1DP    | UK
5          | Berglunds snabbköp                 | Christina Berglund | Berguvsvägen 8                | Luleå       | S-958 22   | Sweden

The _ Wildcard

The _ wildcard represents a single character.

It can be any character or number, but each _ represents one, and only one, character.

Example

Return all customers from a city whose name starts with 'L', followed by one wildcard character, then 'nd',
and then two wildcard characters:

SELECT * FROM Customers


WHERE city LIKE 'L_nd__';

The % Wildcard

The % wildcard represents any number of characters, even zero characters.

Example

Return all customers from a city whose name contains the letter 'L':

SELECT * FROM Customers


WHERE city LIKE '%L%';

Starts With

To return records that start with a specific letter or phrase, add the % at the end of the letter or phrase.

Example

Return all customers whose names start with 'La':

SELECT * FROM Customers


WHERE CustomerName LIKE 'La%';

Example

Return all customers whose names start with 'a' or with 'b':

SELECT * FROM Customers


WHERE CustomerName LIKE 'a%' OR CustomerName LIKE 'b%';

Ends With

To return records that end with a specific letter or phrase, add the % at the beginning of the letter or
phrase.
Example

Return all customers whose names end with 'a':

SELECT * FROM Customers


WHERE CustomerName LIKE '%a';

Example

Return all customers whose names start with "b" and end with "s":

SELECT * FROM Customers


WHERE CustomerName LIKE 'b%s';

Contains

To return records that contain a specific letter or phrase, add the % both before and after the letter or
phrase.

Example

Return all customers whose names contain the phrase 'or':

SELECT * FROM Customers


WHERE CustomerName LIKE '%or%';

Combine Wildcards

Any wildcard, like % and _ , can be used in combination with other wildcards.

Example

Return all customers whose names start with "a" and are at least 3 characters in length:

SELECT * FROM Customers


WHERE CustomerName LIKE 'a__%';

Example

Return all customers that have "r" in the second position:

SELECT * FROM Customers


WHERE CustomerName LIKE '_r%';

Without Wildcard

If no wildcard is specified, the phrase has to have an exact match to return a result.

Example

Return all customers from Spain:

SELECT * FROM Customers


WHERE Country LIKE 'Spain';
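The wildcard patterns above can be tried end-to-end with Python's `sqlite3` module (SQLite's LIKE is case-insensitive for ASCII letters, which matches the examples). The rows below are an illustrative, ASCII-only subset of the demo Customers table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName TEXT, City TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [("Alfreds Futterkiste", "Berlin"),
     ("Around the Horn", "London"),
     ("Berglunds snabbkop", "Lulea")],
)

def names(pattern, column="CustomerName"):
    # Column name interpolated for brevity; fine for a sketch,
    # never do this with untrusted input.
    rows = conn.execute(
        f"SELECT CustomerName FROM Customers WHERE {column} LIKE ?", (pattern,)
    )
    return [r[0] for r in rows]

print(names("a%"))              # starts with 'a' (case-insensitive match)
print(names("L_nd__", "City"))  # 'L', one char, 'nd', two chars -> London
print(names("%or%"))            # names containing 'or'
```

Note that `'L_nd__'` matches "London" because each `_` consumes exactly one character: L-o-nd-on.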

46. Explain relations and tuples

A relation is informally known as a table of values.


Tuple − A single row of a table, which contains a single record for that relation, is called a tuple.

Consider a simple employee entity type.

In converting an ER-Diagram into a database schema, each entity type is transformed into a relation or
table.

All the attributes of the entity type are converted into the attributes columns of the table.

Each tuple or row of the relation or table is an entity of the entity type.

In our case, each tuple or row represents the details of an employee of the company.



Each column of the table is called an Attribute.

Each row of the table is called a tuple.

The collection of all such rows is called a relation or table.

The collection of such kinds of relations is called a database.

Types of Tuples
There are two types of tuples in a database management system:
 Physical Tuples: Physical Tuples are the actual data stored in the storage media of a database. It is
also known as a record or row.
 Logical Tuples: Logical Tuples are the data representation in memory, where data is temporarily
stored before being written to disk or during a query operation.
Both physical and logical tuples have the same attributes, but their representation and usage can differ
based on the context of the operation.
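The terminology becomes concrete in Python's `sqlite3` API, where each row fetched from a relation is literally returned as a tuple. The employee table and its attribute names below are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER, name TEXT, dept TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(1, "Asha", "Sales"), (2, "Ravi", "HR")])

cur = conn.execute("SELECT * FROM employee")
attributes = [d[0] for d in cur.description]  # column names = attributes
relation = cur.fetchall()                     # list of tuples = the relation

print(attributes)   # ['emp_id', 'name', 'dept']
print(relation[0])  # (1, 'Asha', 'Sales')
```

Here `attributes` are the columns of the relation and each element of `relation` is one tuple, i.e. one entity of the employee entity type.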
47. Different types of lock available

In database management systems, locks are used to synchronize the access of multiple transactions to
the same data item. A lock is a variable associated with a data item that describes the status of the item
with respect to possible operations that can be applied to it. There are several types of locks used in
concurrency control, including binary locks and shared/exclusive locks.
Binary locks are simple but restrictive and are not used in practice. They can have two states or values:
locked and unlocked. A distinct lock is associated with each database item. If the value of the lock on an
item is 1, the item cannot be accessed by a database operation that requests the item. If the value of the
lock on an item is 0, then the item can be accessed when requested.

Shared/exclusive locks provide more general locking capabilities and are used in practical database locking
schemes. Shared locks are acquired when only read operation is to be performed. Shared locks can be
shared between multiple transactions as there is no data being altered. Exclusive locks are acquired when
a write operation is to be performed. Exclusive locks are not shared between transactions.

1. Shared Lock (S):


 Another transaction that tries to read the same data is permitted to read, but a transaction that tries
to update the data will be prevented from doing so until the shared lock is released.
 Shared lock is also called read lock, used for reading data items only.
 Shared locks support read integrity. They ensure that a record is not in process of being updated
during a read-only request.
 Shared locks can also be used to prevent any kind of updates of record.
 It is denoted by Lock-S.
 S-lock is requested using Lock-S instruction.
For example, consider a case where initially A=100 and two transactions are reading A.
If one of the transactions wants to update A, the other transaction would read a wrong value.
However, a shared lock prevents the update until the reading transactions have finished.
2. Exclusive Lock (X) :
 When a statement modifies data, its transaction holds an exclusive lock on data that prevents other
transactions from accessing the data.
 This lock remains in place until the transaction holding the lock issues a commit or rollback.
 They can be owned by only one transaction at a time.
 With the Exclusive Lock, a data item can be read as well as written. Also called write lock.
 Any transaction that requires an exclusive lock must wait if another transaction currently owns an
exclusive lock or a shared lock against the requested resource.
 It is denoted as Lock-X.
 X-lock is requested using Lock-X instruction.
For example, consider a case where initially A=100 and a transaction needs to deduct 50 from A. We
can allow this transaction by placing an X lock on it. Therefore, when any other transaction wants to
read or write, the exclusive lock prevents it.

Lock Compatibility Matrix :

          S       X
    S    TRUE    FALSE
    X    FALSE   FALSE

 If transaction T1 is holding a shared lock on data item A, then the concurrency-control manager can grant
the shared lock to transaction T2, as the compatibility is TRUE, but it cannot grant the exclusive lock, as
the compatibility is FALSE.
 In simple words if transaction T1 is reading a data item A, then same data item A can be read by
another transaction T2 but cannot be written by another transaction.
 Similarly if an exclusive lock (i.e. lock for read and write operations) is hold on the data item in some
transaction then no other transaction can acquire Shared or Exclusive lock as the compatibility
function denoted FALSE.
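The compatibility rules in the bullets above can be captured in a small lookup table. This sketch assumes a lock request is granted only if it is compatible with every lock already held on the item; the names are illustrative.

```python
# Shared/exclusive compatibility, as described above:
# only S is compatible with S; X is compatible with nothing.
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}

def can_grant(requested, held_locks):
    """Grant `requested` ('S' or 'X') only if it is compatible with
    every lock currently held on the data item."""
    return all(COMPATIBLE[(held, requested)] for held in held_locks)

print(can_grant("S", ["S", "S"]))  # True: many readers may share
print(can_grant("X", ["S"]))       # False: writer must wait for readers
print(can_grant("S", ["X"]))       # False: reader blocked by a writer
```

A request on an unlocked item (an empty list of held locks) is always granted, since `all()` over an empty sequence is true.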
Difference between Shared Lock and Exclusive Lock :

1. Shared: Lock mode is a read-only operation.
   Exclusive: Lock mode is a read as well as write operation.

2. Shared: A shared lock can be placed on objects that do not have an exclusive lock already placed on them.
   Exclusive: An exclusive lock can only be placed on objects that do not have any other kind of lock.

3. Shared: Prevents others from updating the data.
   Exclusive: Prevents others from reading or updating the data.

4. Shared: Issued when a transaction wants to read an item that does not have an exclusive lock.
   Exclusive: Issued when a transaction wants to update an unlocked item.

5. Shared: Any number of transactions can hold a shared lock on an item.
   Exclusive: An exclusive lock can be held by only one transaction.

6. Shared: S-lock is requested using the lock-S instruction.
   Exclusive: X-lock is requested using the lock-X instruction.

7. Shared: Example – multiple transactions reading the same data row.
   Exclusive: Example – a transaction updating a table.

48. Database audit

Database auditing is the process of analyzing and monitoring a database to ensure its security,
compliance, performance, and integrity. Database auditing can help you identify vulnerabilities, track data
access and modifications, comply with regulations, and optimize database performance. There are different
types of database audits, such as security auditing, compliance auditing, data auditing, and configuration
auditing, depending on the objectives and scope of the audit.

A database audit requires analysis of your database, including users, their permissions, and access to
data to ensure compliance with regulations.

The Different Types of Database Audits


There are multiple types of database audits, including, but not limited to, the following:

 Security Auditing – Security auditing verifies that robust passwords are in place, ensures that sensitive data
is protected through encryption, and confirms that only those with proper clearance can access the
information.
 Compliance Auditing – Ensures compliance with industry regulations and legal requirements such as
GDPR, HIPAA, PCI, and SOX. It involves reviewing the database to confirm that proper measures are in
place to protect data and that the organization is adhering to relevant laws and regulations about data
management.
 Data Auditing - A data audit monitors and logs data access and modifications. It allows you to trace who
accessed the data and what changes were made, including identifying individuals responsible for adding,
modifying, or deleting data. It also enables tracking of when these changes are made.
 Configuration Auditing - Configuration auditing involves monitoring and tracking the actions taken by users
and database administrators, including creating and modifying database objects, managing user accounts,
and making changes to the database's configuration. In addition, it covers system-level changes such as
database software updates, operating system modifications, and hardware changes.
Additional types of database audits can be more granular, such as SQL statement, SQL privilege, and
schema object audits. More broadly, database audits can review administrative activity, data access and
modification, user denials or login failures, and system-wide changes.

The Benefits of Database Audit

The benefits of a database audit include security, compliance, and data integrity. A database audit can help
you ensure your organization is not vulnerable to potential threats, remain compliant with relevant laws and
regulations such as GDPR, HIPAA, PCI, and SOX, and ensure data is accurate, complete, and consistent.
A database audit can also help with business continuity by making sure the database is available and
accessible at all times. In addition, should an issue occur where a database becomes corrupt or attacked, a
database audit can ensure that a disaster recovery plan is in place.
With proper auditing and tracking, which include detailed records of all activities that have taken place in a
database, you can quickly discover common issues during a database audit. Resolving these issues can
increase the performance of your database by eliminating slow queries, blocked processes, and other
bottlenecks.

Common Issues Found During a Database Audit


Common issues found during a database audit depend on the state of your database and how it is
maintained. Some common problems you may encounter include a lack of security, such as weak
passwords and other security vulnerabilities that can lead to potential attacks, compliance violations, data
integrity issues, performance problems, and configuration errors. Some database audits reveal access by
unauthorized users, which can also lead to security vulnerabilities.

How to Perform a Database Audit

Performing a database audit depends on the needs and requirements of your organization. Below are four
key areas you should focus on when performing a database audit.

Audit Access and Authentication


Analyze data on user login attempts and review access control settings, including authentication methods.
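As an illustration, a review of login attempts might look like the following SQLite sketch. The `login_audit` table and its column names here are hypothetical assumptions, since audit-log schemas vary by DBMS:

```python
import sqlite3

# Hypothetical audit-log table; the table and column names are
# illustrative only, not a standard DBMS schema.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE login_audit (
    user_name TEXT, succeeded INTEGER, attempted_at TEXT)""")
conn.executemany(
    "INSERT INTO login_audit VALUES (?, ?, ?)",
    [("alice", 1, "2024-01-01 09:00"),
     ("bob",   0, "2024-01-01 09:01"),
     ("bob",   0, "2024-01-01 09:02"),
     ("bob",   0, "2024-01-01 09:03")])

# Flag accounts with repeated failed login attempts.
rows = conn.execute("""
    SELECT user_name, COUNT(*) AS failures
    FROM login_audit
    WHERE succeeded = 0
    GROUP BY user_name
    HAVING COUNT(*) >= 3""").fetchall()
print(rows)  # [('bob', 3)]
```

In a real audit the same kind of query would run against the DBMS's own audit trail, and the threshold would be set by the organization's security policy.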

Audit User and Administrator


Collect and analyze data on user and administrator actions to review what they’ve done, including whether
they have created and modified database objects, user accounts, and any other configuration changes.

Monitor Security Activity


Collect and analyze data on security-related events, including firewall logs, intrusion detection system
alerts, and system-wide changes, to monitor for unusual or suspicious activity.

Database Audit Vulnerability and Threat Detection


Identify and assess vulnerabilities and threats to the database, such as unpatched software, passwords
that need to be stronger, and access by unauthorized users.

Database administrators use tools such as Lynis, a security auditing tool for Linux, to help with database
audits. It is free, open source, and can be modified or extended based on your preferences. To identify and
prioritize security issues and vulnerabilities, Lynis provides detailed reports of your database’s security-
related configuration settings. As a result, database administrators can schedule regular security
assessments and identify potential issues early.

How to Correct the Issues Discovered during a Database Audit

Correcting issues discovered during a database audit depends on the type of database audited. The first
thing a database auditor needs to do is review the audit report and understand the identified issues. Then
create a plan of action to address the issues. Before making any corrections to a database, it is
recommended that you make backups in case you need to revert the database to its original state.
Resolve issues by applying patches and updates and ensuring the database is running the latest version.
Additionally, more advanced changes can be made in the database’s configuration settings for security,
such as authentication and access control. Database administrators can also reorganize database objects
such as tables and indexes to improve performance, as this can also resolve issues.
Once your changes and updates are made, monitor the database carefully to ensure no additional issues
are discovered. For best practice, performing a database audit after making changes and updates ensures
the database is running properly.

Conclusion

Performing a database audit should be done regularly. With the help of tools like Lynis, you can schedule
database audits as frequently as needed, which can help you protect your database data and help increase
database performance.

49. Aggregate functions:

In database management an aggregate function is a function where the values of multiple rows are
grouped together as input on certain criteria to form a single value of more significant meaning.

Various Aggregate Functions


1) Count()
2) Sum()
3) Avg()
4) Min()
5) Max()
Example:
Id Name Salary
-----------------------
1 A 80
2 B 40
3 C 60
4 D 70
5 E 60
6 F Null

Count():
Count(*): Returns the total number of records, i.e., 6.
Count(salary): Returns the number of non-NULL values in the salary column, i.e., 5.
Count(Distinct salary): Returns the number of distinct non-NULL values in the salary column, i.e., 4.

Sum():

Sum(salary): Sums all non-NULL values of the salary column, i.e., 310.
Sum(Distinct salary): Sums all distinct non-NULL values, i.e., 250.

Avg():

Avg(salary) = Sum(salary) / Count(salary) = 310/5 = 62.

Avg(Distinct salary) = Sum(Distinct salary) / Count(Distinct salary) = 250/4 = 62.5.

Min():

Min(salary): Minimum value in the salary column, ignoring NULL, i.e., 40.

Max():

Max(salary): Maximum value in the salary column, i.e., 80.
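The worked example above can be verified directly with SQLite's aggregate functions; this is a minimal, self-contained sketch using the same Id/Name/Salary data:

```python
import sqlite3

# Reproduce the worked example above with SQLite's aggregate functions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER, name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", [
    (1, "A", 80), (2, "B", 40), (3, "C", 60),
    (4, "D", 70), (5, "E", 60), (6, "F", None)])

row = conn.execute("""
    SELECT COUNT(*),                 -- all rows, NULL included  -> 6
           COUNT(salary),            -- non-NULL salaries        -> 5
           COUNT(DISTINCT salary),   -- distinct non-NULL values -> 4
           SUM(salary),              -- 80+40+60+70+60           -> 310
           AVG(salary),              -- 310 / 5                  -> 62.0
           MIN(salary), MAX(salary)  -- NULL is ignored          -> 40, 80
    FROM emp""").fetchone()
print(row)  # (6, 5, 4, 310, 62.0, 40, 80)
```

Note that every aggregate except COUNT(*) silently skips the NULL salary, which is exactly why Count(*) and Count(salary) differ above.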

50. Difference between primary key and candidate key

Primary Key:
A primary key is a set of attributes (or a single attribute) that uniquely identifies the tuples in a relation or
table. The primary key is a minimal super key, so there is one and only one primary key in any
relation. For example,

Student{ID, Aadhar_ID, F_name, M_name, L_name, Age}


Here only ID or Aadhar_ID can be primary key because the name, age can be same, but ID or
Aadhar_ID can’t be same.
Candidate Key:
A candidate key is a set of attributes (or a single attribute) that uniquely identifies the tuples in a relation or
table. As we know, the primary key is a minimal super key, so there is one and only one primary key
in any relation, but more than one candidate key can exist. Unlike the primary key, a candidate key's
attributes can contain a NULL value. For example,
Student{ID, Aadhar_ID, F_name, M_name, L_name, Age}
Here we can see the two candidate keys ID and Aadhar_ID. So here, there are present more than one
candidate keys, which can uniquely identify a tuple in a relation.
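A quick way to test whether an attribute (or attribute set) qualifies as a candidate key is to check that its projection contains no duplicate values. The sketch below uses made-up sample values for the Student relation discussed above:

```python
# A column set can serve as a candidate key only if its values uniquely
# identify every tuple. The sample Aadhar_ID values are illustrative.

students = [
    {"ID": 1, "Aadhar_ID": "A100", "F_name": "Ravi", "Age": 20},
    {"ID": 2, "Aadhar_ID": "A101", "F_name": "Ravi", "Age": 21},
    {"ID": 3, "Aadhar_ID": "A102", "F_name": "Asha", "Age": 20},
]

def is_unique(rows, attrs):
    """True if the projection of `rows` onto `attrs` has no duplicates."""
    seen = [tuple(r[a] for a in attrs) for r in rows]
    return len(seen) == len(set(seen))

print(is_unique(students, ["ID"]))         # True  -> can be a candidate key
print(is_unique(students, ["Aadhar_ID"]))  # True  -> can be a candidate key
print(is_unique(students, ["F_name"]))     # False -> names repeat
```

This mirrors the argument in the text: names and ages can repeat, but ID and Aadhar_ID cannot, so only those columns qualify as candidate keys.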
Difference between Primary and Candidate Key:
1. Primary Key: The primary key is a minimal super key, so there is one and only one primary key in a
relation. Candidate Key: In a relation there can be more than one candidate key.
2. Primary Key: No attribute of the primary key can contain a NULL value. Candidate Key: Any attribute
of a candidate key can contain a NULL value.
3. Primary Key: Specifying a primary key is optional for a relation. Candidate Key: Without a candidate
key, no relation can be specified.
4. Primary Key: It specifies the most important attribute(s) for the relation. Candidate Key: It specifies
the keys which can qualify as the primary key.
5. Primary Key: It is confirmed that a primary key is a candidate key. Candidate Key: It is not confirmed
that a candidate key can be a primary key.

51. BCNF

Application of the general definitions of 2NF and 3NF may identify additional redundancy caused by
dependencies that violate one or more candidate keys. However, despite these additional constraints,
dependencies can still exist that will cause redundancy to be present in 3NF relations. This weakness
in 3NF resulted in the presentation of a stronger normal form called the Boyce-Codd Normal Form
(Codd, 1974).

Although 3NF is an adequate normal form for relational databases, it may still not remove 100% of
redundancy, because of an X−>Y functional dependency where X is not a candidate key of the given
relation. This can be solved by Boyce-Codd Normal Form (BCNF).

Boyce-Codd Normal Form (BCNF)


Boyce–Codd Normal Form (BCNF) is based on functional dependencies that take into account all
candidate keys in a relation; however, BCNF also has additional constraints compared with the general
definition of 3NF.

Rules for BCNF


Rule 1: The table should be in the 3rd Normal Form.
Rule 2: X should be a superkey for every functional dependency (FD) X−>Y in a given relation.
Note: To test whether a relation is in BCNF, we identify all the determinants and make sure that they are
candidate keys.

It can be inferred that every relation in BCNF is also in 3NF. The converse, however, does not hold: a
relation in 3NF need not be in BCNF. Ponder over this statement for a while.
To determine the highest normal form of a given relation R with functional dependencies, the first step is
to check whether the BCNF condition holds. If R is found to be in BCNF, it can be safely deduced that the
relation is also in 3NF, 2NF, and 1NF as the hierarchy shows. The 1NF has the least restrictive constraint
– it only requires a relation R to have atomic values in each tuple. The 2NF has a slightly more restrictive
constraint.
The 3NF has a more restrictive constraint than the first two normal forms but is less restrictive than the
BCNF. In this manner, the restriction increases as we traverse down the hierarchy.
Examples
Here, we are going to discuss some basic examples which let you understand the properties of BCNF. We
will discuss multiple examples here.

Example 1

Let us consider the student database, in which data of the student are mentioned.

Stu_ID | Stu_Branch                              | Stu_Course           | Branch_Number | Stu_Course_No
101    | Computer Science & Engineering          | DBMS                 | B_001         | 201
101    | Computer Science & Engineering          | Computer Networks    | B_001         | 202
102    | Electronics & Communication Engineering | VLSI Technology      | B_003         | 401
102    | Electronics & Communication Engineering | Mobile Communication | B_003         | 402

Functional Dependency of the above is as mentioned:


Stu_ID −> Stu_Branch
Stu_Course −> {Branch_Number, Stu_Course_No}
Candidate Keys of the above table are: {Stu_ID, Stu_Course}
Why this Table is Not in BCNF?
The table above is not in BCNF because, as we can see, neither Stu_ID nor Stu_Course alone is a
super key. The rules mentioned above clearly state that for a table to be in BCNF, every functional
dependency X−>Y must have X as a super key; this property fails here, which is why the table is not in
BCNF.
How to Satisfy BCNF?
For satisfying this table in BCNF, we have to decompose it into further tables. Here is the full procedure
through which we transform this table into BCNF. Let us first divide this main table into two
tables Stu_Branch and Stu_Course Table.
Stu_Branch Table
Stu_ID Stu_Branch

101 Computer Science & Engineering

102 Electronics & Communication Engineering

Candidate Key for this table: Stu_ID.


Stu_Course Table
Stu_Course Branch_Number Stu_Course_No

DBMS B_001 201

Computer Networks B_001 202

VLSI Technology B_003 401

Mobile Communication B_003 402

Candidate Key for this table: Stu_Course.


Stu_ID to Stu_Course_No Table
Stu_ID Stu_Course_No

101 201

101 202

102 401

102 402

Candidate Key for this table: {Stu_ID, Stu_Course_No}.


After decomposing into further tables, now it is in BCNF, as it is passing the condition of Super Key, that in
functional dependency X−>Y, X is a Super Key.

Example 2

Find the highest normal form of a relation R(A, B, C, D, E) with FD set as:
{ BC->D, AC->BE, B->E }
Explanation:
 Step-1: As we can see, (AC)+ ={A, C, B, E, D} but none of its subsets can determine all attributes of the
relation, So AC will be the candidate key. A or C can’t be derived from any other attribute of the relation,
so there will be only 1 candidate key {AC}.
 Step-2: Prime attributes are those attributes that are part of candidate key {A, C} in this example and
others will be non-prime {B, D, E} in this example.
 Step-3: The relation R is in 1st normal form as a relational DBMS does not allow multi-valued or
composite attributes.
The relation is in 2nd normal form because BC->D is in 2nd normal form (BC is not a proper subset of
candidate key AC) and AC->BE is in 2nd normal form (AC is candidate key) and B->E is in 2nd normal form
(B is not a proper subset of candidate key AC).
The relation is not in 3rd normal form because in BC->D neither BC is a super key nor D is a prime attribute,
and in B->E neither B is a super key nor E is a prime attribute; to satisfy 3rd normal form, either the LHS of
an FD should be a super key or the RHS should be a prime attribute. So the highest normal form of the
relation will be the 2nd Normal Form.
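Step-1 above relies on attribute-closure computation. The following sketch implements the standard closure algorithm for the FD set { BC->D, AC->BE, B->E } (variable names are ours):

```python
# Attribute closure computation used in Step-1 above: (AC)+ should reach
# every attribute of R(A, B, C, D, E), while (BC)+ should not.

FDS = [({"B", "C"}, {"D"}), ({"A", "C"}, {"B", "E"}), ({"B"}, {"E"})]

def closure(attrs, fds):
    """Return the attribute closure of `attrs` under the FD set `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its LHS is covered but its RHS adds something new.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

print(sorted(closure({"A", "C"}, FDS)))  # ['A','B','C','D','E'] -> AC is a key
print(sorted(closure({"B", "C"}, FDS)))  # ['B','C','D','E'] -> BC is not a super key
```

Since (AC)+ covers all attributes and (BC)+ does not, AC is the only candidate key, exactly as argued in Step-1.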
Note: A prime attribute cannot be transitively dependent on a key in BCNF relation.
Consider these functional dependencies of some relation R
AB ->C
C ->B
AB ->B
Suppose, it is known that the only candidate key of R is AB. A careful observation is required to conclude
that the above dependency is a Transitive Dependency as the prime attribute B transitively depends on the
key AB through C. Now, the first and the third FD are in BCNF as they both contain the candidate key (or
simply KEY) on their left sides. The second dependency, however, is not in BCNF but is definitely in 3NF
due to the presence of the prime attribute on the right side. So, the highest normal form of R is 3NF as all
three FDs satisfy the necessary conditions to be in 3NF.

Example 3

For example consider relation R(A, B, C)


A -> BC,
B -> A
A and B both are super keys so the above relation is in BCNF.
Note: BCNF decomposition may always not be possible with dependency preserving, however, it always
satisfies the lossless join condition. For example, relation R (V, W, X, Y, Z), with functional dependencies:
V, W -> X
Y, Z -> X
W -> Y
It would not satisfy dependency preserving BCNF decomposition.
Note: Redundancies are sometimes still present in a BCNF relation as it is not always possible to eliminate
them completely.
There are also some higher-order normal forms, like the 4th Normal Form and the 5th Normal Form.
Difference between 3NF and BCNF:

1. 3NF: Stands for Third Normal Form. BCNF: Stands for Boyce Codd Normal Form.
2. 3NF: There should be no transitive dependency, that is, no non-prime attribute should be transitively
dependent on the candidate key. BCNF: For any functional dependency A->B, A should be a super key of
the relation.
3. 3NF: It is less strong than BCNF. BCNF: It is comparatively stronger than 3NF.
4. 3NF: The functional dependencies are already in 1NF and 2NF. BCNF: The functional dependencies
are already in 1NF, 2NF and 3NF.
5. 3NF: The redundancy is high. BCNF: The redundancy is comparatively low.
6. 3NF: There is preservation of all functional dependencies. BCNF: There may or may not be
preservation of all functional dependencies.
7. 3NF: It is comparatively easier to achieve. BCNF: It is difficult to achieve.
8. 3NF: Lossless decomposition can be achieved. BCNF: Lossless decomposition is hard to achieve.
9. 3NF: The table is in 3NF if it is in 2NF and for each functional dependency X->Y at least one of the
following conditions holds: (i) X is a super key, or (ii) Y is a prime attribute of the table. BCNF: The table
is in BCNF if it is in 3NF and for each functional dependency X->Y, X is a super key.
10. 3NF: It can be obtained without sacrificing any dependencies. BCNF: Dependencies may not be
preserved.
11. 3NF: It can be achieved without losing any information from the old table. BCNF: We may lose some
information from the old table.

Difference between BCNF and 4NF:

1. BCNF: A relation in BCNF must also be in 3NF. 4NF: A relation in 4NF must also be in Boyce Codd
Normal Form (BCNF).
2. BCNF: A relation in BCNF may have a multi-valued dependency. 4NF: A relation in 4NF must not
have any multi-valued dependency.
3. BCNF: A relation in BCNF may or may not be in 4NF. 4NF: A relation in 4NF is always in BCNF.
4. BCNF: It is less strong in comparison to 4NF. 4NF: It is stronger in comparison to BCNF.
5. BCNF: A relation in BCNF will have more redundancy as compared to 4NF. 4NF: A relation in 4NF
will have less redundancy as compared to BCNF.
6. BCNF: If a relation is in BCNF, then all redundancy based on functional dependency has been
removed. 4NF: If a relation is in 4NF, then all redundancy based on functional dependency as well as
multi-valued dependency has been removed.
7. BCNF: For a relation, the number of tables in BCNF is less than or equal to the number of tables in
4NF. 4NF: For a relation, the number of tables in 4NF is greater than or equal to the number of tables
in BCNF.
8. BCNF: Dependency preservation is hard to achieve. 4NF: Dependency preservation is even harder to
achieve as compared to BCNF.
9. BCNF: In real-world database design, generally 3NF or BCNF is preferred. 4NF: In real-world
database design, 4NF is generally not preferred by database designers.
10. BCNF: A relation in BCNF may contain multi-valued as well as join dependencies. 4NF: A relation in
4NF may only contain join dependencies.

Difference between 4NF and 5NF:

1. 4NF: A relation in 4NF must also be in BCNF (Boyce Codd Normal Form). 5NF: A relation in 5NF
must also be in 4NF (Fourth Normal Form).
2. 4NF: A relation in 4NF must not have any multi-valued dependency. 5NF: A relation in 5NF must not
have any join dependency.
3. 4NF: A relation in 4NF may or may not be in 5NF. 5NF: A relation in 5NF is always in 4NF.
4. 4NF: Fourth Normal Form is less strong in comparison to Fifth Normal Form. 5NF: Fifth Normal Form
is stronger than Fourth Normal Form.
5. 4NF: A relation in Fourth Normal Form will have more redundancy. 5NF: A relation in Fifth Normal
Form will have less redundancy.
6. 4NF: A relation in Fourth Normal Form may be decomposed further into sub-relations. 5NF: A relation
in Fifth Normal Form cannot be decomposed further into sub-relations without any modification in
meaning or facts.

52. What is an ER diagram? Explain the symbol used in it with the help of an example

The Entity Relational Model is a model for identifying entities to be represented in the database and
representation of how those entities are related. The ER data model specifies enterprise schema that
represents the overall logical structure of a database graphically.
The Entity Relationship Diagram explains the relationship among the entities present in the database. ER
models are used to model real-world objects like a person, a car, or a company and the relation between
these real-world objects. In short, the ER Diagram is the structural format of the database.
Why Use ER Diagrams In DBMS?
 ER diagrams are used to represent the E-R model of a database, which makes them easy to
convert into relations (tables).
 ER diagrams model real-world objects and the relationships between them, which makes them
particularly useful.
 ER diagrams require no technical knowledge and no hardware support.
 These diagrams are very easy to understand and easy to create, even for a naive user.
 They give a standard solution for visualizing the data logically.
Symbols Used in ER Model
ER Model is used to model the logical view of the system from a data perspective which consists of these
symbols:
 Rectangles: Rectangles represent Entities in the ER Model.
 Ellipses: Ellipses represent Attributes in the ER Model.
 Diamond: Diamonds represent Relationships among Entities.
 Lines: Lines represent attributes to entities and entity sets with other relationship types.
 Double Ellipse: Double Ellipses represent Multi-Valued Attributes.
 Double Rectangle: Double Rectangle represents a Weak Entity.

Components of ER Diagram
ER Model consists of Entities, Attributes, and Relationships among Entities in a Database System.
Entity
An Entity may be an object with a physical existence – a particular person, car, house, or employee – or it
may be an object with a conceptual existence – a company, a job, or a university course.
Entity Set: An Entity is an object of Entity Type and a set of all entities is called an entity set. For Example,
E1 is an entity having Entity Type Student and the set of all students is called Entity Set. In ER diagram,
Entity Type is represented as:

Entity Set

1. Strong Entity
A Strong Entity is a type of entity that has a key Attribute. Strong Entity does not depend on other Entity in
the Schema. It has a primary key, that helps in identifying it uniquely, and it is represented by a rectangle.
These are called Strong Entity Types.
2. Weak Entity
An entity type normally has a key attribute that uniquely identifies each entity in the entity set. But some
entity types exist for which a key attribute can't be defined. These are called Weak Entity types.
For example, a company may store the information of the dependents (parents, children, spouse) of an
employee. But the dependents cannot exist without the employee. So Dependent will be a Weak Entity
Type and Employee will be the Identifying Entity Type for Dependent, which means Employee is a Strong
Entity Type.
A weak entity type is represented by a Double Rectangle. The participation of weak entity types is
always total. The relationship between the weak entity type and its identifying strong entity type is called
identifying relationship and it is represented by a double diamond.

Strong Entity and Weak Entity

Attributes
Attributes are the properties that define the entity type. For example, Roll_No, Name, DOB, Age, Address,
and Mobile_No are the attributes that define entity type Student. In ER diagram, the attribute is represented
by an oval.
Attribute

1. Key Attribute
The attribute which uniquely identifies each entity in the entity set is called the key attribute. For
example, Roll_No will be unique for each student. In ER diagram, the key attribute is represented by an
oval with underlying lines.

Key Attribute

2. Composite Attribute
An attribute composed of many other attributes is called a composite attribute. For example, the
Address attribute of the student Entity type consists of Street, City, State, and Country. In ER diagram, the
composite attribute is represented by an oval comprising of ovals.

Composite Attribute

3. Multivalued Attribute
An attribute consisting of more than one value for a given entity. For example, Phone_No (can be more
than one for a given student). In ER diagram, a multivalued attribute is represented by a double oval.

Multivalued Attribute

4. Derived Attribute
An attribute that can be derived from other attributes of the entity type is known as a derived attribute. e.g.;
Age (can be derived from DOB). In ER diagram, the derived attribute is represented by a dashed oval.

Derived Attribute
The Complete Entity Type Student with its Attributes can be represented as:

Entity and Attributes

Relationship Type and Relationship Set


A Relationship Type represents the association between entity types. For example, ‘Enrolled in’ is a
relationship type that exists between entity type Student and Course. In ER diagram, the relationship type
is represented by a diamond and connecting the entities with lines.

Entity-Relationship Set

A set of relationships of the same type is known as a relationship set. The following relationship set depicts
S1 as enrolled in C2, S2 as enrolled in C1, and S3 as registered in C3.

Relationship Set

Degree of a Relationship Set


The number of different entity sets participating in a relationship set is called the degree of a relationship
set.
1. Unary Relationship: When there is only ONE entity set participating in a relation, the relationship is
called a unary relationship. For example, one person is married to only one person.

Unary Relationship
2. Binary Relationship: When there are TWO entities set participating in a relationship, the relationship
is called a binary relationship. For example, a Student is enrolled in a Course.

Binary Relationship

3. n-ary Relationship: When there are n entities set participating in a relation, the relationship is called
an n-ary relationship.
Cardinality
The number of times an entity of an entity set participates in a relationship set is known as cardinality.
Cardinality can be of different types:
1. One-to-One: When each entity in each entity set can take part only once in the relationship, the
cardinality is one-to-one. Let us assume that a male can marry one female and a female can marry one
male. So the relationship will be one-to-one.
the total number of tables that can be used in this is 2.

one to one cardinality

Using Sets, it can be represented as:

Set Representation of One-to-One

2. One-to-Many: In one-to-many mapping, an entity in one entity set can be related to more than one
entity in the other entity set. Let us assume that one surgeon department can accommodate many
doctors. So the cardinality will be 1 to M: one department has many doctors.
The total number of tables that can be used in this is 2.
one to many cardinality

Using sets, one-to-many cardinality can be represented as:

3. Many-to-One: When entities in one entity set can take part only once in the relationship set and entities
in other entity sets can take part more than once in the relationship set, cardinality is many to one. Let us
assume that a student can take only one course but one course can be taken by many students. So the
cardinality will be n to 1. It means that for one course there can be n students but for one student, there
will be only one course.
The total number of tables that can be used in this is 3.

many to one cardinality

Using Sets, it can be represented as:


Set Representation of Many-to-One

In this case, each student is taking only 1 course but 1 course has been taken by many students.
4. Many-to-Many: When entities in all entity sets can take part more than once in the relationship
cardinality is many to many. Let us assume that a student can take more than one course and one course
can be taken by many students. So the relationship will be many to many.
the total number of tables that can be used in this is 3.

many to many cardinality

Using Sets, it can be represented as:


Many-to-Many Set Representation

In this example, student S1 is enrolled in C1 and C3 and Course C3 is enrolled by S1, S3, and S4. So it is
many-to-many relationships.
Participation Constraint
Participation Constraint is applied to the entity participating in the relationship set.
1. Total Participation – Each entity in the entity set must participate in the relationship. If each student
must enroll in a course, the participation of students will be total. Total participation is shown by a double
line in the ER diagram.
2. Partial Participation – The entity in the entity set may or may NOT participate in the relationship. If
some courses are not enrolled by any of the students, the participation in the course will be partial.
The diagram depicts the ‘Enrolled in’ relationship set with Student Entity set having total participation and
Course Entity set having partial participation.

Total Participation and Partial Participation

Using Set, it can be represented as,

Set representation of Total Participation and Partial Participation

Every student in the Student Entity set participates in a relationship but there exists a course C4 that is not
taking part in the relationship.
How to Draw ER Diagram?
 The very first step is identifying all the entities, placing them in rectangles, and labeling them
accordingly.
 The next step is to identify the relationships between them and place them accordingly using
diamonds, making sure that relationships are not connected to each other.
 Attach attributes to the entities properly.
 Remove redundant entities and relationships.
 Add proper colors to highlight the data present in the database.
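As a rough illustration, the outcome of these drawing steps can be captured in a tiny in-memory structure. The entity and relationship names follow the Student "Enrolled in" Course example used throughout this section, and the table-counting rule shown is a simplification (assuming each many-to-many relationship gets its own table):

```python
# Minimal in-memory representation of the ER model built in the steps above.
# Names mirror the Student "Enrolled in" Course example from this section.

er_model = {
    "entities": {
        "Student": {"attributes": ["Roll_No", "Name", "Age"], "key": "Roll_No"},
        "Course":  {"attributes": ["Course_ID", "Title"],     "key": "Course_ID"},
    },
    "relationships": {
        "Enrolled_in": {
            "between": ("Student", "Course"),
            "cardinality": "M:N",   # many-to-many
        },
    },
}

def tables_needed(model):
    """One table per entity; an M:N relationship also needs its own table."""
    count = len(model["entities"])
    count += sum(1 for r in model["relationships"].values()
                 if r["cardinality"] == "M:N")
    return count

print(tables_needed(er_model))  # 3 (Student, Course, Enrolled_in)
```

This matches the cardinality discussion above: a many-to-many "Enrolled in" relationship maps to three tables, whereas a one-to-one relationship could be folded into one of the entity tables.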

53. Different types of database users

Database users are categorized based upon their interaction with the database. The following are the
main types of database users in a DBMS.
1. Database Administrator (DBA) : A Database Administrator (DBA) is a person/team who defines the
schema and also controls the 3 levels of the database. The DBA creates a new account id and
password for a user who needs to access the database. The DBA is also responsible for providing
security to the database and allows only authorized users to access/modify the database.
The DBA is responsible for problems such as security breaches and poor system response time.
 The DBA also monitors recovery and backup and provides technical support.
 The DBA has a DBA account in the DBMS, which is called a system or superuser account.
 The DBA repairs damage caused by hardware and/or software failures.
 The DBA has the privileges to perform DCL (Data Control Language) operations such as
GRANT and REVOKE, to allow/restrict a particular user from accessing the database.
2. Naive / Parametric End Users : Parametric end users are unsophisticated users who don't have any
DBMS knowledge but frequently use database applications in their daily life to get the desired
results. For example, railway ticket booking users are naive users. Clerks in any bank are naive
users because they don't have any DBMS knowledge but still use the database and perform their
given tasks.
3. System Analyst :
System Analyst is a user who analyzes the requirements of parametric end users. They check
whether all the requirements of end users are satisfied.
4. Sophisticated Users : Sophisticated users can be engineers, scientists, or business analysts who are
   familiar with the database. They can develop their own database applications according to their
   requirements. They don't write program code; instead, they interact with the database by writing SQL
   queries directly through the query processor.
5. Database Designers : Database designers are the users who design the structure of the database,
   which includes tables, indexes, views, triggers, stored procedures, and constraints, usually
   enforced before the database is created or populated with data. They control what data must be
   stored and how the data items are to be related. It is the responsibility of database designers to understand
   the requirements of different user groups and then create a design that satisfies the needs of all the
   user groups.
6. Application Programmers : Application programmers, also referred to as system analysts or simply
   software engineers, are the back-end programmers who write the code for application
   programs. They are computer professionals. These programs could be written in programming
   languages such as Visual Basic, Developer, C, FORTRAN, COBOL, etc. Application programmers
   design, debug, test, and maintain a set of programs called "canned transactions" for the naive
   (parametric) users to interact with the database.
7. Casual Users / Temporary Users : Casual users are users who access the database only
   occasionally, but each time they access it they require new information; for example,
   middle or higher-level managers.
8. Specialized users : Specialized users are sophisticated users who write
   specialized database applications that do not fit into the traditional data-processing
   framework. Among these applications are computer-aided design systems,
   knowledge-base and expert systems, etc.

54. Database security

Database security includes a variety of measures used to secure database management systems from
malicious cyber-attacks and illegitimate use. Database security programs are designed to protect not
only the data within the database, but also the data management system itself, and every application
that accesses it, from misuse, damage, and intrusion.

Database security encompasses tools, processes, and methodologies which establish security inside a
database environment.

Database security means keeping sensitive information safe and preventing the loss of data. The
security of a database is controlled by the Database Administrator (DBA).
The following are the main control measures used to provide security of data in databases:
1. Authentication
2. Access control
3. Inference control
4. Flow control
5. Database Security applying Statistical Method
6. Encryption
These are explained as following below.
1. Authentication :
   Authentication is the process of confirming that a user logs in only according to the rights
   granted to them to perform database activities. A particular user can log in only up to their
   privilege level and cannot access other sensitive data. The privilege of accessing sensitive data is
   restricted through authentication.
   Authentication tools based on biometrics, such as retina scans and fingerprints, can protect the
   database from unauthorized or malicious users.
2. Access Control :
   The security mechanism of a DBMS must include some provisions for restricting access to the
   database by unauthorized users. Access control is done by creating user accounts and controlling
   the login process through the DBMS, so that access to sensitive data is possible only for those
   people (database users) who are allowed to access it, while unauthorized persons are kept out.
   The database system must also keep track of all operations performed by a given user
   throughout the entire login session.
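The account-and-privilege idea above can be sketched in a few lines. The user names and privilege sets below are hypothetical, and a real DBMS enforces this internally rather than in application code:

```python
# Minimal illustration of DBMS-style access control: each account is
# mapped to the operations it may perform; anything else is refused.
# Account names and privilege sets here are invented for illustration.

PRIVILEGES = {
    "alice": {"SELECT", "INSERT"},   # a sophisticated user
    "clerk1": {"SELECT"},            # a naive/parametric user
}

def check_access(user, operation):
    """Return True only if the account holds the requested privilege."""
    return operation in PRIVILEGES.get(user, set())

print(check_access("alice", "INSERT"))    # True
print(check_access("clerk1", "DELETE"))   # False
print(check_access("mallory", "SELECT"))  # unknown account -> False
```

A real DBMS adds to this the audit trail mentioned above: every operation a logged-in user performs is recorded against their account.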
3. Inference Control :
   This method is known as a countermeasure to the statistical database security problem. It is used to
   prevent the user from completing any inference channel. This method protects sensitive information
   from indirect disclosure.
   Inferences are of two types: identity disclosure or attribute disclosure.
4. Flow Control :
   This prevents information from flowing in a way that reaches unauthorized users. Pathways along
   which information can flow implicitly, in ways that violate an organization's privacy policy, are
   called covert channels.
5. Database Security applying Statistical Method :
   Statistical database security focuses on the protection of confidential individual values that are
   stored and used for statistical purposes. It permits retrieval of summaries of values based on
   categories, but does not permit retrieval of individual information.
   For example, this allows access to the database to obtain statistical information such as the number
   of employees in the company, but not the detailed confidential/personal information of a specific
   individual employee.
6. Encryption :
   This method is mainly used to protect sensitive data (such as credit card numbers and OTPs).
   The data is encoded using an encryption algorithm.
   An unauthorized user who tries to access this encoded data will have difficulty decoding it, while
   authorized users are given decryption keys to decode the data.
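As a toy illustration of the encode/decode idea (not a production scheme; real databases use vetted algorithms such as AES through a crypto library), XOR with a random single-use key shows how ciphertext is unreadable without the key:

```python
import secrets

# Toy symmetric cipher: XOR each byte with a random one-time key.
# Illustrative only; production systems use vetted algorithms (AES etc.).

def xor_cipher(data, key):
    return bytes(b ^ k for b, k in zip(data, key))

message = b"4111-1111-1111-1111"          # e.g. a card number (invented)
key = secrets.token_bytes(len(message))   # the key must stay secret

ciphertext = xor_cipher(message, key)     # encode
recovered = xor_cipher(ciphertext, key)   # decode with the same key
print(recovered == message)  # True
```

Without the key, the ciphertext is just random-looking bytes; this is exactly the property the text above describes.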

Database Security Threats

Many software vulnerabilities, misconfigurations, or patterns of misuse or carelessness could result in
breaches. Here are a number of the best-known causes and types of database security cyber threats.

Insider Threats

An insider threat is a security risk from one of the following three sources, each of which has privileged
means of entry to the database:

 A malicious insider with ill-intent


 A negligent person within the organization who exposes the database to attack through careless actions
 An outsider who obtains credentials through social engineering or other methods, or gains access to the
database’s credentials

An insider threat is one of the most typical causes of database security breaches and it often occurs
because a lot of employees have been granted privileged user access.

Human Error

Weak passwords, password sharing, accidental erasure or corruption of data, and other undesirable user
behaviors are still the cause of almost half of data breaches reported.
Exploitation of Database Software Vulnerabilities

Attackers constantly attempt to isolate and target vulnerabilities in software, and database management
software is a highly valuable target. New vulnerabilities are discovered daily, and all open source database
management platforms and commercial database software vendors issue security patches regularly.
However, if you don’t use these patches quickly, your database might be exposed to attack.

Even if you do apply patches on time, there is always the risk of zero-day attacks, where attackers
exploit a vulnerability that has not yet been discovered and patched by the database vendor.

SQL/NoSQL Injection Attacks

A database-specific threat involves injecting arbitrary SQL or NoSQL attack strings into database
queries. Typically, these are queries created as an extension of web application forms, or received via
HTTP requests. Any database system is vulnerable to these attacks if developers do not adhere to secure
coding practices and if the organization does not carry out regular vulnerability testing.
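A minimal sketch of the secure-coding practice mentioned above, using Python's built-in sqlite3 module and an invented users table: parameterized queries keep attacker input as plain data instead of executable SQL:

```python
import sqlite3

# Demonstration of SQL injection vs. a parameterized query.
# The table and its single row are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

attack = "' OR '1'='1"  # classic injection payload

# Vulnerable: string concatenation lets the payload rewrite the query,
# which becomes ... WHERE name = '' OR '1'='1' and matches every row.
vuln_rows = conn.execute(
    "SELECT * FROM users WHERE name = '" + attack + "'").fetchall()

# Safe: the '?' placeholder passes the payload as a plain string value.
safe_rows = conn.execute(
    "SELECT * FROM users WHERE name = ?", (attack,)).fetchall()

print(len(vuln_rows), len(safe_rows))  # 1 0
```

The same placeholder style (with driver-specific syntax) exists in essentially every database API, which is why parameterized queries are the standard defense.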

Buffer Overflow Attacks

Buffer overflow takes place when a process tries to write a large amount of data to a fixed-length block of
memory, more than it is permitted to hold. Attackers might use the excess data, kept in adjacent memory
addresses, as the starting point from which to launch attacks.

Denial of Service (DoS/DDoS) Attacks

In a denial of service (DoS) attack, the cybercriminal overwhelms the target service—in this instance the
database server—using a large amount of fake requests. The result is that the server cannot carry out
genuine requests from actual users, and often crashes or becomes unstable.

In a distributed denial of service attack (DDoS), fake traffic is generated by a large number of computers,
participating in a botnet controlled by the attacker. This generates very large traffic volumes, which are
difficult to stop without a highly scalable defensive architecture. Cloud-based DDoS protection services can
scale up dynamically to address very large DDoS attacks.

Malware

Malware is software written to take advantage of vulnerabilities or to cause harm to a database. Malware
could arrive through any endpoint device connected to the database’s network. Malware protection is
important on any endpoint, but especially so on database servers, because of their high value and
sensitivity.

An Evolving IT Environment

The evolving IT environment is making databases more susceptible to threats. Here are trends that can
lead to new types of attacks on databases, or may require new defensive measures:

 Growing data volumes—storage, data capture, and processing are growing exponentially across almost all
organizations. Any data security practices or tools must be highly scalable to address near- and distant-
future requirements.
 Distributed infrastructure—network environments are increasing in complexity, especially as businesses
transfer workloads to hybrid cloud or multi-cloud architectures, making the deployment, management, and
choice of security solutions more difficult.
 Increasingly tight regulatory requirements—the worldwide regulatory compliance landscape is growing
in complexity, so following all mandates is becoming more challenging.
 Cybersecurity skills shortage—there is a global shortage of skilled cybersecurity professionals, and
organizations are finding it difficult to fill security roles. This can make it more difficult to defend critical
infrastructure, including databases.

How Can You Secure Your Database Server?

A database server is a physical or virtual machine running the database. Securing a database server, also
known as “hardening”, is a process that includes physical security, network security, and secure operating
system configuration.

Ensure Physical Database Security

Refrain from sharing a server for web applications and database applications, if your database
contains sensitive data. Although it could be cheaper, and easier, to host your site and database together
on a hosting provider, you are placing the security of your data in someone else’s hands.

If you do rely on a web hosting service to manage your database, you should ensure that it is a company
with a strong security track record. It is best to stay clear of free hosting services due to the possible lack of
security.

If you manage your database in an on-premise data center, keep in mind that your data center is also
prone to attacks from outsiders or insider threats. Ensure you have physical security measures, including
locks, cameras, and security personnel in your physical facility. Any access to physical servers must be
logged and only granted to authorized individuals.

In addition, do not leave database backups in locations that are publicly accessible, such as temporary
partitions, web folders, or unsecured cloud storage buckets.
Lock Down Accounts and Privileges

Let’s consider the Oracle database server. After the database is installed, the Oracle database
configuration assistant (DBCA) automatically expires and locks most of the default database user accounts.

If you install an Oracle database manually, this doesn’t happen and default privileged accounts won’t be
expired or locked. Their password stays the same as their username, by default. An attacker will try to use
these credentials first to connect to the database.

It is critical to ensure that every privileged account on a database server is configured with a strong, unique
password. If accounts are not needed, they should be expired and locked.

For the remaining accounts, access has to be limited to the absolute minimum required. Each account
should only have access to the tables and operations (for example, SELECT or INSERT) required by the
user. Avoid creating user accounts with access to every table in the database.

Regularly Patch Database servers

Ensure that patches remain current. Effective database patch management is a crucial security practice
because attackers are actively seeking out new security flaws in databases, and new viruses and malware
appear on a daily basis.

Timely deployment of up-to-date database service packs, critical security hotfixes, and
cumulative updates will improve the stability and performance of the database.

Disable Public Network Access

Organizations store their application data in databases. In most real-world scenarios, the end-user doesn't
require direct access to the database. Thus, you should block all public network access to database
servers unless you are a hosting provider. Ideally, an organization should set up gateway servers (VPN or
SSH tunnels) for remote administrators.

Encrypt All Files and Backups

Irrespective of how solid your defenses are, there is always a possibility that a hacker may infiltrate your
system. Yet, attackers are not the only threat to the security of your database. Your employees may also
pose a risk to your business. There is always the possibility that a malicious or careless insider will gain
access to a file they don’t have permission to access.

Encrypting your data makes it unreadable to both attackers and employees. Without an encryption key,
they cannot access it; this provides a last line of defense against unwelcome intrusions. Encrypt all
important application files, data files, and backups so that unauthorized users cannot read your critical
data.

Database Security Best Practices

Here are several best practices you can use to improve the security of sensitive databases.
Actively Manage Passwords and User Access

If you have a large organization, you must think about automating access management via password
management or access management software. This will provide permitted users with a short-term
password with the rights they need every time they need to gain access to a database.

It also keeps track of the activities completed during that time frame and stops administrators from sharing
passwords. While administrators may feel that sharing passwords is convenient, doing so makes
effective database accountability and security almost impossible.

In addition, the following security measures are recommended:

 Strong passwords must be enforced


 Password hashes must be salted and stored encrypted
 Accounts must be locked following multiple login attempts
 Accounts must be regularly reviewed and deactivated if staff move to different roles, leave the company, or
no longer require the same level of access
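The "salted and hashed" recommendation above can be sketched with the standard library's PBKDF2; the iteration count and salt size here are illustrative, not a policy recommendation:

```python
import hashlib
import hmac
import os

# Sketch of salted password hashing with PBKDF2 (stdlib only).
# Parameters (16-byte salt, 100,000 iterations) are illustrative.

def hash_password(password, salt=None):
    salt = salt if salt is not None else os.urandom(16)  # random salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest  # store both; the salt is not secret

def verify_password(password, salt, stored):
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored)  # constant-time compare

salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))    # False
```

Because each account gets its own random salt, identical passwords produce different stored hashes, which defeats precomputed rainbow-table attacks.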

Test Your Database Security

Once you have put in place your database security infrastructure, you must test it against a real threat.
Auditing or performing penetration tests against your own database will help you get into the mindset of a
cybercriminal and isolate any vulnerabilities you may have overlooked.

To make sure the test is comprehensive, involve ethical hackers or recognized penetration testing services
in your security testing. Penetration testers provide extensive reports listing database vulnerabilities, and it
is important to quickly investigate and remediate these vulnerabilities. Run a penetration test on a critical
database system at least once per year.

Use Real-Time Database Monitoring

Continually scanning your database for breach attempts increases your security and lets you rapidly react
to possible attacks.

In particular, File Integrity Monitoring (FIM) can help you log all actions carried out on the database’s server
and to alert you of potential breaches. When FIM detects a change to important database files, ensure
security teams are alerted and able to investigate and respond to the threat.
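A bare-bones version of the file-integrity idea (record a baseline digest and alert when it changes) might look like the sketch below; the file name and contents are invented, and real FIM tools also track permissions, ownership, and timestamps:

```python
import hashlib
import os
import tempfile

# Minimal file-integrity check: compare a file's current SHA-256
# digest against a recorded baseline and flag any difference.

def digest(path):
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

db_file = os.path.join(tempfile.mkdtemp(), "students.db")  # stand-in file
with open(db_file, "wb") as f:
    f.write(b"original contents")
baseline = digest(db_file)           # record the known-good digest

with open(db_file, "wb") as f:       # simulate an unauthorized change
    f.write(b"tampered contents")

changed = digest(db_file) != baseline
print(changed)  # True -> raise an alert for the security team
```

In practice this comparison runs continuously or on a schedule, and the baseline digests are stored somewhere the attacker cannot modify.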

Use Web Application and Database Firewalls

You should use a firewall to protect your database server from database security threats. By default, the
firewall should deny all traffic. It should also stop your database from initiating outbound
connections unless there is a particular reason for doing so.

As well as safeguarding the database with a firewall, you must deploy a web application firewall (WAF).
This is because attacks aimed at web applications, including SQL injection, can be used to gain illicit
access to your databases.

A traditional firewall will not stop most web application attacks, because it operates at the
network layer, while web application attacks operate at the application layer (layer 7 of the OSI model). A
WAF operates at layer 7 and is able to detect malicious web application traffic, such as SQL injection
attacks, and block it before it can harm your database.
55. How will you create and manage views

Views in SQL are a kind of virtual table. A view has rows and columns just like a real table in
the database. We can create a view by selecting fields from one or more tables present in the database.
A view can either contain all the rows of a table or only specific rows based on a certain condition. In
this article we will learn about creating, deleting, and updating views.
Sample Tables:
StudentDetails

StudentMarks

CREATING VIEWS
We can create View using CREATE VIEW statement. A View can be created from a single table or
multiple tables. Syntax:
CREATE VIEW view_name AS
SELECT column1, column2.....
FROM table_name
WHERE condition;

view_name: Name for the View


table_name: Name of the table
condition: Condition to select rows
Examples:
Creating View from a single table:
 In this example we will create a View named DetailsView from the table StudentDetails. Query:
CREATE VIEW DetailsView AS
SELECT NAME, ADDRESS
FROM StudentDetails
WHERE S_ID < 5;
 To see the data in the View, we can query the view in the same manner as we query a table.
SELECT * FROM DetailsView;
Output:

 In this example, we will create a view named StudentNames from the table StudentDetails. Query:
CREATE VIEW StudentNames AS
SELECT S_ID, NAME
FROM StudentDetails
ORDER BY NAME;
 If we now query the view as,
SELECT * FROM StudentNames;
Output:

 Creating View from multiple tables: In this example we will create a View named MarksView from
two tables StudentDetails and StudentMarks. To create a View from multiple tables we can simply
include multiple tables in the SELECT statement. Query:
CREATE VIEW MarksView AS
SELECT StudentDetails.NAME, StudentDetails.ADDRESS, StudentMarks.MARKS
FROM StudentDetails, StudentMarks
WHERE StudentDetails.NAME = StudentMarks.NAME;
 To display data of View MarksView:
SELECT * FROM MarksView;
 Output:

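The CREATE VIEW statements above can be exercised end-to-end with Python's built-in sqlite3 module; the sample rows below are invented stand-ins for the StudentDetails and StudentMarks tables:

```python
import sqlite3

# Run the single-table and multi-table view examples against an
# in-memory database. Sample data is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE StudentDetails (S_ID INTEGER, NAME TEXT, ADDRESS TEXT);
CREATE TABLE StudentMarks   (ID INTEGER, NAME TEXT, MARKS INTEGER);
INSERT INTO StudentDetails VALUES (1, 'Harsh', 'Kolkata'), (5, 'Suresh', 'Delhi');
INSERT INTO StudentMarks   VALUES (1, 'Harsh', 90);

-- single-table view, as in the DetailsView example
CREATE VIEW DetailsView AS
SELECT NAME, ADDRESS FROM StudentDetails WHERE S_ID < 5;

-- multi-table view, as in the MarksView example
CREATE VIEW MarksView AS
SELECT StudentDetails.NAME, StudentDetails.ADDRESS, StudentMarks.MARKS
FROM StudentDetails, StudentMarks
WHERE StudentDetails.NAME = StudentMarks.NAME;
""")

details = conn.execute("SELECT * FROM DetailsView").fetchall()
marks = conn.execute("SELECT * FROM MarksView").fetchall()
print(details)  # [('Harsh', 'Kolkata')]
print(marks)    # [('Harsh', 'Kolkata', 90)]
```

Note that SQLite views are read-only, so the later INSERT/DELETE-through-a-view examples apply to engines such as MySQL rather than to this sketch.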
LISTING ALL VIEWS IN A DATABASE


We can list views using the SHOW FULL TABLES statement or the information_schema table.
Syntax (Using SHOW FULL TABLES):
use "database_name";
show full tables where table_type like "%VIEW";
Syntax (Using information_schema) :
select * from information_schema.views where table_schema = "database_name";

OR

select table_schema,table_name,view_definition from information_schema.views where table_schema =


"database_name";

DELETING VIEWS
We have learned about creating a View, but what if a created View is no longer needed? Obviously
we will want to delete it. SQL allows us to delete an existing View. We can delete or drop a View using
the DROP statement. Syntax:
DROP VIEW view_name;

view_name: Name of the View which we want to delete.


For example, if we want to delete the View MarksView, we can do this as:
DROP VIEW MarksView;
UPDATING VIEWS
There are certain conditions needed to be satisfied to update a view. If any one of these conditions
is not met, then we will not be allowed to update the view.
1. The SELECT statement which is used to create the view should not include GROUP BY clause or
ORDER BY clause.
2. The SELECT statement should not have the DISTINCT keyword.
3. The View should have all NOT NULL values.
4. The view should not be created using nested queries or complex queries.
5. The view should be created from a single table. If the view is created using multiple tables then we
will not be allowed to update the view.
 We can use the CREATE OR REPLACE VIEW statement to add or remove fields from a
view. Syntax:
CREATE OR REPLACE VIEW view_name AS
SELECT column1,column2,..
FROM table_name
WHERE condition;
 For example, if we want to update the view MarksView and add the field AGE to this View
from StudentMarks Table, we can do this as:
CREATE OR REPLACE VIEW MarksView AS
SELECT StudentDetails.NAME, StudentDetails.ADDRESS, StudentMarks.MARKS, StudentMarks.AGE
FROM StudentDetails, StudentMarks
WHERE StudentDetails.NAME = StudentMarks.NAME;
 If we fetch all the data from MarksView now as:
SELECT * FROM MarksView;

 Output:
 Inserting a row in a view: We can insert a row into a View in the same way as we do in a table. We can
use the INSERT INTO statement of SQL to insert a row into a View.
Syntax:
INSERT INTO view_name(column1, column2 , column3,..)
VALUES(value1, value2, value3..);

view_name: Name of the View


Example: In the below example we will insert a new row in the View DetailsView which we have created
above in the example of “creating views from a single table”.
INSERT INTO DetailsView(NAME, ADDRESS)
VALUES("Suresh","Gurgaon");
 If we fetch all the data from DetailsView now as,
SELECT * FROM DetailsView;
Output:

 Deleting a row from a View: Deleting rows from a view is as simple as deleting rows from a
table. We can use the DELETE statement of SQL to delete rows from a view. Deleting a row
from a view first deletes the row from the underlying table, and the change is then reflected in the view.
Syntax:
DELETE FROM view_name
WHERE condition;

view_name:Name of view from where we want to delete rows


condition: Condition to select rows
Example: In this example, we will delete the last row from the view DetailsView which we just added in
the above example of inserting rows.
DELETE FROM DetailsView
WHERE NAME="Suresh";
 If we fetch all the data from DetailsView now as,
SELECT * FROM DetailsView;
Output:
WITH CHECK OPTION
The WITH CHECK OPTION clause in SQL is a very useful clause for views. It is applicable to an
updatable view; if the view is not updatable, there is no point in including this clause in the
CREATE VIEW statement.
 The WITH CHECK OPTION clause is used to prevent the insertion of rows in the view where the
condition in the WHERE clause in CREATE VIEW statement is not satisfied.
 If we have used the WITH CHECK OPTION clause in the CREATE VIEW statement, and if the
UPDATE or INSERT clause does not satisfy the conditions then they will return an error.
Example: In the below example we are creating a View SampleView from StudentDetails Table with
WITH CHECK OPTION clause.
CREATE VIEW SampleView AS
SELECT S_ID, NAME
FROM StudentDetails
WHERE NAME IS NOT NULL
WITH CHECK OPTION;
In this View, if we now try to insert a new row with a null value in the NAME column, it will give an error
because the view is created with the condition that the NAME column must be NOT NULL. For example,
even though the View is updatable, the following query for this View is not valid:
INSERT INTO SampleView(S_ID)
VALUES(6);
NOTE: The default value of NAME column is null.
Uses of a View: A good database should contain views due to the given reasons:
1. Restricting data access – Views provide an additional level of table security by restricting access to
a predetermined set of rows and columns of a table.
2. Hiding data complexity – A view can hide the complexity that exists in multiple tables join.
3. Simplify commands for the user – Views allow the user to select information from multiple tables
without requiring the users to actually know how to perform a join.
4. Store complex queries – Views can be used to store complex queries.
5. Rename Columns – Views can also be used to rename the columns without affecting the base
tables provided the number of columns in view must match the number of columns specified in select
statement. Thus, renaming helps to hide the names of the columns of the base tables.
6. Multiple view facility – Different views can be created on the same table for different users.

56. Advantages of DBMS

A Database Management System (DBMS) is a collection of interrelated data and a set of software
tools/programs that access, process, and manipulate data. It allows access, retrieval, and use of that
data while applying appropriate security measures. A DBMS is really useful for better data integration
and security.

Advantages of Database Management System


Some of them are given as follows below.
 Better Data Transferring: Database management creates a place where users have an advantage
of more and better-managed data. Thus making it possible for end-users to have a quick look and to
respond fast to any changes made in their environment.
 Better Data Security: The more accessible and usable the database, the more it is prone to security
issues. As the number of users increases, the data transferring or data sharing rate also increases
thus increasing the risk of data security. It is widely used in the corporate world where companies
invest large amounts of money, time, and effort to ensure data is secure and used properly. A
Database Management System (DBMS) provides a better platform for data privacy and security
policies thus, helping companies to improve Data Security.
 Better data integration: Due to the Database Management System we have access to well-
managed and synchronized forms of data thus it makes data handling very easy and gives an
integrated view of how a particular organization is working and also helps to keep track of how one
segment of the company affects another segment.
 Minimized Data Inconsistency: Data inconsistency occurs between files when different versions of
the same data appear in different places. For Example, data inconsistency occurs when a student’s
name is saved as “John Wayne” on the main computer of the school but on the teacher registered
system same student name is “William J. Wayne”, or when the price of a product is $86.95 in the
local system of the company and its National sales office system shows the same product price as
$84.95. So if a database is properly designed then Data inconsistency can be greatly reduced hence
minimizing data inconsistency.
 Faster data Access: The Database management system (DBMS) helps to produce quick answers to
database queries thus making data access faster and more accurate. For example, to read or update
the data. For example, end-users, when dealing with large amounts of sale data, will have enhanced
access to the data, enabling a faster sales cycle. Some queries may be like:
 What is the increase in sales in the last three months?
 What is the bonus given to each of the salespeople in the last five months?
 How many customers have a credit score of 850 or more?
 Better decision making: Due to DBMS we now have better-managed data and improved data
access, which lets us generate better quality information, and on this basis better decisions can
be made. Better data quality improves accuracy and validity and reduces the time it takes to
read data. A DBMS does not guarantee data quality by itself; it provides a framework that makes it
easier to improve data quality. A DBMS provides powerful data analysis and reporting tools that allow
users to make informed decisions based on data insights. This helps organizations improve their
decision-making processes and achieve better business outcomes.
 Increased end-user productivity: The data which is available with the help of a combination of
tools that transform data into useful information, helps end-users to make quick, informative, and
better decisions that can make a difference between success and failure in the global economy.
 Simple: Database management system (DBMS) gives a simple and clear logical view of data. Many
operations like insertion, deletion, or creation of files or data are easy to implement.
 Data abstraction: The major purpose of a database system is to provide users with an abstract view
of the data. Since many complex algorithms are used by the developers to increase the efficiency of
databases that are being hidden by the users through various data abstraction levels to allow users
to easily interact with the system.
 Reduction in data Redundancy: When working with a structured database, a DBMS provides the
ability to prevent duplicate entries in the database. For example, if the same student appears in two
different rows, the duplicate entry can be rejected or removed.
 Application development: A DBMS provides a foundation for developing applications that require
access to large amounts of data, reducing development time and costs.
 Data sharing: A DBMS provides a platform for sharing data across multiple applications and users,
which can increase productivity and collaboration.
 Data organization: A DBMS provides a systematic approach to organizing data in a structured way,
which makes it easier to retrieve and manage data efficiently.
 The atomicity of data can be maintained: a transaction is either performed in full or not at all, so
the database is never left with a partially applied change.
 The DBMS allows concurrent access to multiple users by using the synchronization technique.
 Data consistency and accuracy: DBMS ensures that data is consistent and accurate by enforcing
data integrity constraints and preventing data duplication. This helps to eliminate data discrepancies
and errors that can occur when data is stored and managed manually.
 Improved data security: DBMS provides a high level of data security by offering user authentication
and authorization, data encryption, and access control mechanisms. This helps to protect sensitive
data from unauthorized access, modification, or theft.
 Efficient data access and retrieval: DBMS allows for efficient data access and retrieval by
providing indexing and query optimization techniques that speed up data retrieval. This reduces the
time required to process large volumes of data and increases the overall performance of the system.
 Scalability and flexibility: DBMS is highly scalable and can easily accommodate changes in data
volumes and user requirements. DBMS can easily handle large volumes of data, and can scale up or
down depending on the needs of the organization. It provides flexibility in data storage, retrieval, and
manipulation, allowing users to easily modify the structure and content of the database as needed.
 Improved productivity: DBMS reduces the time and effort required to manage data, which
increases productivity and efficiency. It also provides a user-friendly interface for data entry and
retrieval, which reduces the learning curve for new users.
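Several of the advantages above (reduced redundancy, data consistency, data integrity) come from declarative constraints that the DBMS enforces automatically; a minimal sqlite3 sketch with invented tables:

```python
import sqlite3

# Integrity constraints enforced by the DBMS itself: PRIMARY KEY
# (entity integrity), UNIQUE (no duplicate entries), and a foreign
# key (referential integrity). Tables and values are invented.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite requires opting in
conn.executescript("""
CREATE TABLE student (
    s_id  INTEGER PRIMARY KEY,   -- entity integrity
    email TEXT UNIQUE NOT NULL   -- blocks duplicate entries
);
CREATE TABLE enrollment (
    s_id   INTEGER REFERENCES student(s_id),  -- referential integrity
    course TEXT
);
""")
conn.execute("INSERT INTO student VALUES (1, 'a@school.edu')")

dup_rejected = fk_rejected = False
try:  # duplicate email violates the UNIQUE constraint
    conn.execute("INSERT INTO student VALUES (2, 'a@school.edu')")
except sqlite3.IntegrityError:
    dup_rejected = True
try:  # unknown s_id violates the foreign-key constraint
    conn.execute("INSERT INTO enrollment VALUES (99, 'DBMS')")
except sqlite3.IntegrityError:
    fk_rejected = True

print(dup_rejected, fk_rejected)  # True True
```

In a traditional file system these checks would have to be re-implemented (and kept consistent) in every application program; in a DBMS they are declared once in the schema.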
Advantages of Database Management System over Traditional File System
 Data Integrity and Security: DBMS provides a centralized approach to data management that
ensures data integrity and security. DBMS allows defining constraints and rules to ensure that data is
consistent and accurate.
 Reduced Data Redundancy: DBMS eliminates data redundancy by storing data in a structured way.
It allows sharing data across different applications and users, reducing the need for duplicating data.
 Improved Data Consistency: DBMS ensures data consistency by enforcing data validation rules
and constraints. This ensures that data is accurate and consistent across different applications and
users.
 Improved Data Access and Availability: DBMS provides efficient data access and retrieval
mechanisms that enable quick and easy data access. It allows multiple users to access the data
simultaneously, ensuring data availability.
 Improved Data Sharing: DBMS provides a platform for sharing data across different applications
and users. It allows sharing data between different departments and systems within an organization,
improving collaboration and decision-making.
 Improved Data Integration: DBMS allows integrating data from different sources, providing a
comprehensive view of the data. It enables data integration from different systems and platforms,
improving the quality of data analysis.
 Improved Data Backup and Recovery: DBMS provides backup and recovery mechanisms that
ensure data is not lost in case of a system failure. It allows restoring data to a specific point in time,
ensuring data consistency.
 Data sharing: DBMSs enable numerous people to view and edit the same data at once while
preventing conflicts and data loss. This promotes teamwork and enhances data uniformity throughout
the company.
 Data independence: By separating the logical and physical views of data, database management
systems (DBMS) enable users to work with data without being aware of its exact location or
structure. This offers adaptability and lowers the possibility of data damage as a result of
modifications to the underlying hardware or software.
 Data integrity: To avoid data mistakes and inconsistencies, database management systems
(DBMSs) apply data integrity requirements including referential integrity, entity integrity, and domain
integrity. This guarantees the consistency, accuracy, and completeness of the data.
 Data security: To prevent illegal access, alteration, or theft, database management systems
(DBMS) include a number of security features, including encryption, authentication, and
authorization. This safeguards sensitive data against both internal and external attacks.
 Data backup and recovery: Database management systems (DBMS) offer backup and recovery
features that let businesses swiftly and effectively restore lost or damaged data. This guarantees
business continuity and lowers the chance of data loss.
 Decreased data redundancy: By keeping data centrally and offering methods for sharing and
reusing it, database management systems (DBMS) remove data redundancy. As a result, less data
storage is needed, and data consistency is increased.
Conclusion
Overall, Database management System offers several advantages over traditional file-based systems. It
ensures data integrity, security, and consistency, reduces data redundancy, and improves data access,
sharing, and integration. These benefits make DBMS an essential tool for managing and processing data
in modern organizations.

57. Relational Algebra Operations

Relational Algebra is a procedural query language that takes relations as input and returns
relations as output. There are some basic operators which can be applied to relations to produce
the required results; we will discuss them one by one. We will use the STUDENT_SPORTS,
EMPLOYEE, and STUDENT relations given in Table 1, Table 2, and Table 3 respectively to
understand the various operators.
Table 1: STUDENT_SPORTS

ROLL_NO SPORTS

1 Badminton
2 Cricket

2 Badminton

4 Badminton

Table 2: EMPLOYEE

EMP_NO NAME ADDRESS PHONE AGE

1 RAM DELHI 9455123451 18

5 NARESH HISAR 9782918192 22

6 SWETA RANCHI 9852617621 21

4 SURESH DELHI 9156768971 18

Table 3: STUDENT

ROLL_NO NAME ADDRESS PHONE AGE

1 RAM DELHI 9455123451 18

2 RAMESH GURGAON 9652431543 18

3 SUJIT ROHTAK 9156253131 20

4 SURESH DELHI 9156768971 18

Selection operator (σ): The selection operator is used to select tuples from a relation based on some
condition. Syntax:
σ(Cond)(Relation Name)
Extract students whose age is greater than 18 from the STUDENT relation given in Table 3:
σ(AGE>18)(STUDENT)
RESULT:

ROLL_NO NAME ADDRESS PHONE AGE

3 SUJIT ROHTAK 9156253131 20

Projection Operator (π): The projection operator is used to project particular columns from a
relation. Syntax:
π(Column 1,Column 2….Column n)(Relation Name)
Extract ROLL_NO and NAME from the STUDENT relation given in Table 3:
π(ROLL_NO,NAME)(STUDENT)
RESULT:

ROLL_NO NAME
1 RAM

2 RAMESH

3 SUJIT

4 SURESH

Note: If the resultant relation after projection has duplicate rows, they will be removed. For
example, π(ADDRESS)(STUDENT) will remove one duplicate row with the value DELHI and return three
rows.
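The selection and projection operators above can be sketched in plain Python (an illustrative model added here for clarity, not standard notation), treating a relation as a list of rows:

```python
# Illustrative sketch: the STUDENT relation from Table 3 as a list of
# dicts, with selection (sigma) and projection (pi) as plain functions.
STUDENT = [
    {"ROLL_NO": 1, "NAME": "RAM",    "ADDRESS": "DELHI",   "AGE": 18},
    {"ROLL_NO": 2, "NAME": "RAMESH", "ADDRESS": "GURGAON", "AGE": 18},
    {"ROLL_NO": 3, "NAME": "SUJIT",  "ADDRESS": "ROHTAK",  "AGE": 20},
    {"ROLL_NO": 4, "NAME": "SURESH", "ADDRESS": "DELHI",   "AGE": 18},
]

def select(condition, relation):
    """sigma(condition)(relation): keep tuples satisfying the condition."""
    return [row for row in relation if condition(row)]

def project(columns, relation):
    """pi(columns)(relation): keep the given columns, dropping duplicates."""
    seen, result = set(), []
    for row in relation:
        t = tuple(row[c] for c in columns)
        if t not in seen:
            seen.add(t)
            result.append(dict(zip(columns, t)))
    return result

print(select(lambda r: r["AGE"] > 18, STUDENT))  # only SUJIT's row
print(project(["ADDRESS"], STUDENT))             # 3 rows: DELHI appears once
```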
Cross Product (X): Cross product is used to join two relations. For every row of Relation1, each row
of Relation2 is concatenated. If Relation1 has m tuples and Relation2 has n tuples, the cross
product of Relation1 and Relation2 will have m x n tuples. Syntax:
Relation1 X Relation2
To apply Cross Product on the STUDENT relation given in Table 3 and the STUDENT_SPORTS relation
given in Table 1:
STUDENT X STUDENT_SPORTS
RESULT:

ROLL_NO NAME ADDRESS PHONE AGE ROLL_NO SPORTS

1 RAM DELHI 9455123451 18 1 Badminton
1 RAM DELHI 9455123451 18 2 Cricket
1 RAM DELHI 9455123451 18 2 Badminton
1 RAM DELHI 9455123451 18 4 Badminton
2 RAMESH GURGAON 9652431543 18 1 Badminton
2 RAMESH GURGAON 9652431543 18 2 Cricket
2 RAMESH GURGAON 9652431543 18 2 Badminton
2 RAMESH GURGAON 9652431543 18 4 Badminton
3 SUJIT ROHTAK 9156253131 20 1 Badminton
3 SUJIT ROHTAK 9156253131 20 2 Cricket
3 SUJIT ROHTAK 9156253131 20 2 Badminton
3 SUJIT ROHTAK 9156253131 20 4 Badminton
4 SURESH DELHI 9156768971 18 1 Badminton
4 SURESH DELHI 9156768971 18 2 Cricket
4 SURESH DELHI 9156768971 18 2 Badminton
4 SURESH DELHI 9156768971 18 4 Badminton
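The m x n behaviour of the cross product can be checked with a short Python sketch (attribute sets trimmed to two columns here purely for brevity):

```python
from itertools import product

# Sketch: cross product of two relations via itertools.product.
STUDENT = [(1, "RAM"), (2, "RAMESH"), (3, "SUJIT"), (4, "SURESH")]
STUDENT_SPORTS = [(1, "Badminton"), (2, "Cricket"),
                  (2, "Badminton"), (4, "Badminton")]

# Every STUDENT tuple is concatenated with every STUDENT_SPORTS tuple.
cross = [s + sp for s, sp in product(STUDENT, STUDENT_SPORTS)]

print(len(cross))  # 4 x 4 = 16 tuples
print(cross[0])    # (1, 'RAM', 1, 'Badminton')
```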

Union (U): Union on two relations R1 and R2 can only be computed if R1 and R2 are union
compatible (These two relations should have the same number of attributes and corresponding
attributes in two relations have the same domain). Union operator when applied on two relations R1
and R2 will give a relation with tuples that are either in R1 or in R2. The tuples which are in both R1
and R2 will appear only once in the result relation. Syntax:
Relation1 U Relation2
To find the persons who are either students or employees, we can use the Union operator:
STUDENT U EMPLOYEE
RESULT:

ROLL_NO NAME ADDRESS PHONE AGE

1 RAM DELHI 9455123451 18

2 RAMESH GURGAON 9652431543 18

3 SUJIT ROHTAK 9156253131 20

4 SURESH DELHI 9156768971 18

5 NARESH HISAR 9782918192 22

6 SWETA RANCHI 9852617621 21

Minus (-): Minus on two relations R1 and R2 can only be computed if R1 and R2 are union
compatible. Minus operator when applied on two relations as R1-R2 will give a relation with tuples
that are in R1 but not in R2. Syntax:
Relation1 - Relation2
To find the persons who are students but not employees, we can use the Minus operator:
STUDENT - EMPLOYEE
RESULT:
ROLL_NO NAME ADDRESS PHONE AGE

2 RAMESH GURGAON 9652431543 18

3 SUJIT ROHTAK 9156253131 20

Rename (ρ): The rename operator is used to give another name to a relation. Syntax:


ρ(Relation2, Relation1)
To rename the STUDENT relation to STUDENT1, we can use the rename operator like:
ρ(STUDENT1, STUDENT)
If you want to create a relation STUDENT_NAMES with ROLL_NO and NAME from STUDENT, it
can be done using the rename operator as:
ρ(STUDENT_NAMES, π(ROLL_NO, NAME)(STUDENT))
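As a rough illustration, the algebra queries above map onto SQL run through an in-memory SQLite database (the schemas are simplified here to union-compatible column sets; the PHONE column is omitted):

```python
import sqlite3

# Sketch: relational-algebra operators expressed as SQL in SQLite.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE STUDENT  (ROLL_NO INT, NAME TEXT, ADDRESS TEXT, AGE INT);
CREATE TABLE EMPLOYEE (EMP_NO  INT, NAME TEXT, ADDRESS TEXT, AGE INT);
INSERT INTO STUDENT VALUES (1,'RAM','DELHI',18),(2,'RAMESH','GURGAON',18),
                           (3,'SUJIT','ROHTAK',20),(4,'SURESH','DELHI',18);
INSERT INTO EMPLOYEE VALUES (1,'RAM','DELHI',18),(5,'NARESH','HISAR',22),
                            (6,'SWETA','RANCHI',21),(4,'SURESH','DELHI',18);
""")

# Selection:  sigma(AGE>18)(STUDENT)
older = con.execute("SELECT * FROM STUDENT WHERE AGE > 18").fetchall()
# Projection: pi(ADDRESS)(STUDENT) -- DISTINCT drops the duplicate DELHI
addrs = con.execute("SELECT DISTINCT ADDRESS FROM STUDENT").fetchall()
# Union and Minus on union-compatible relations
union = con.execute("SELECT * FROM STUDENT UNION SELECT * FROM EMPLOYEE").fetchall()
minus = con.execute("SELECT * FROM STUDENT EXCEPT SELECT * FROM EMPLOYEE").fetchall()

print(len(older), len(addrs), len(union), len(minus))  # 1 3 6 2
```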

58. Indexing. Different indexing techniques

Indexing improves database performance by minimizing the number of disk visits required to fulfill a
query. It is a data structure technique used to locate and quickly access data in databases.
Indexes are generated from one or more database fields. The first column is the Search key, which
holds a copy of the primary key or candidate key of the table; these values are stored in sorted
order to speed up data retrieval (note that the data file itself need not be sorted). The second
column is the Data Reference or Pointer, which contains a set of pointers holding the address of the
disk block where that particular key value can be found.

Attributes of Indexing
 Access Types: This refers to the type of access such as value-based search, range access, etc.
 Access Time: It refers to the time needed to find a particular data element or set of elements.
 Insertion Time: It refers to the time taken to find the appropriate space and insert new data.
 Deletion Time: Time taken to find an item and delete it as well as update the index structure.
 Space Overhead: It refers to the additional space required by the index.
In general, there are two types of file organization mechanisms that are followed by the indexing methods
to store the data:
Sequential File Organization or Ordered Index File
In this, the indices are based on a sorted ordering of the values. These are generally fast and a more
traditional type of storing mechanism. These Ordered or Sequential file organizations might store the data
in a dense or sparse format.
 Dense Index
 For every search key value in the data file, there is an index record.
 This record contains the search key and also a reference to the first data record with
that search key value.

 Sparse Index
 The index record appears only for a few items in the data file. Each item points to a block
as shown.
 To locate a record, we find the index record with the largest search key value less than
or equal to the search key value we are looking for.
 We start at that record pointed to by the index record, and proceed along with the
pointers in the file (that is, sequentially) until we find the desired record.
 Number of Accesses required=log₂(n)+1, (here n=number of blocks acquired by index
file)
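The sparse-index search described above can be sketched as follows (a simplified in-memory model; the block contents and sizes are made up for the example):

```python
from bisect import bisect_right

# Sketch of a sparse index: one index entry per block, keyed by the
# first search-key value stored in that block.
blocks = [
    [1, 2, 3],   # block 0
    [4, 5, 6],   # block 1
    [7, 8, 9],   # block 2
]
sparse_index = [blk[0] for blk in blocks]   # [1, 4, 7]

def lookup(key):
    # Find the largest index entry <= key, then scan that block
    # sequentially, as described in the text.
    i = bisect_right(sparse_index, key) - 1
    if i < 0:
        return None
    return key if key in blocks[i] else None

print(lookup(5))   # found in block 1
print(lookup(10))  # not present
```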
Hash File Organization
Indices are based on the values being distributed uniformly across a range of buckets. The buckets to
which a value is assigned are determined by a function called a hash function. There are primarily three
methods of indexing:
 Clustered Indexing: When two or more records are stored in the same file block, this type of
storing is known as clustered indexing. Clustered indexing reduces the cost of searching, since
multiple records related to the same thing are stored in one place; it also supports frequent
joins of two or more tables (records).
The clustering index is defined on an ordered data file. The data file is ordered on a non-key field. In
some cases, the index is created on non-primary key columns which may not be unique for each
record. In such cases, in order to identify the records faster, we will group two or more columns
together to get the unique values and create an index out of them. This method is known as the
clustering index. Essentially, records with similar properties are grouped together, and indexes for
these groupings are formed.
Students studying each semester, for example, are grouped together. First-semester students,
second-semester students, third-semester students, and so on are categorized.


 Primary Indexing: This is a type of Clustered Indexing wherein the data is sorted according to the
search key and the primary key of the database table is used to create the index. It is a default
format of indexing where it induces sequential file organization. As primary keys are unique and are
stored in a sorted manner, the performance of the searching operation is quite efficient.
 Non-clustered or Secondary Indexing: A non-clustered index just tells us where the data lies, i.e. it
gives us a list of virtual pointers or references to the location where the data is actually stored. Data
is not physically stored in the order of the index. Instead, data is present in leaf nodes. For eg. the
contents page of a book. Each entry gives us the page number or location of the information stored.
The actual data here(information on each page of the book) is not organized but we have an ordered
reference(contents page) to where the data points actually lie. We can have only dense ordering in
the non-clustered index as sparse ordering is not possible because data is not physically organized
accordingly.
It requires more time as compared to the clustered index because some amount of extra work is
done in order to extract the data by further following the pointer. In the case of a clustered index,
data is directly present in front of the index.

 Multilevel Indexing: As the database grows, indices grow as well. A single-level index may
become too large to store in main memory, leading to multiple disk accesses. Multilevel indexing
segregates the main index block into various smaller blocks so that each can be stored in a single
block. The outer blocks are divided into inner blocks, which in turn point to the data blocks.
This can be easily stored in main memory with fewer overheads.
Advantages of Indexing
 Improved Query Performance: Indexing enables faster data retrieval from the database. The
database may rapidly discover rows that match a specific value or collection of values by generating
an index on a column, minimizing the amount of time it takes to perform a query.
 Efficient Data Access: Indexing can enhance data access efficiency by lowering the amount of disk
I/O required to retrieve data. The database can maintain the data pages for frequently visited
columns in memory by generating an index on those columns, decreasing the requirement to read
from disk.
 Optimized Data Sorting: Indexing can also improve the performance of sorting operations. By
creating an index on the columns used for sorting, the database can avoid sorting the entire table
and instead sort only the relevant rows.
 Consistent Data Performance: Indexing can help ensure that the database performs consistently
even as the amount of data in the database grows. Without indexing, queries may take longer to run
as the number of rows in the table grows, while indexing maintains a roughly consistent speed.
 Enforced Data Integrity: By ensuring that only unique values are inserted into columns that have
been indexed as unique, indexing can also be utilized to ensure the integrity of data. This avoids
storing duplicate data in the database, which might lead to issues when running queries or reports.
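The last point, using a unique index to enforce integrity, can be demonstrated with SQLite (the table and column names here are invented for the example):

```python
import sqlite3

# Sketch: a UNIQUE index rejecting a duplicate key value.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE students (roll_no INT, name TEXT)")
con.execute("CREATE UNIQUE INDEX idx_roll ON students(roll_no)")

con.execute("INSERT INTO students VALUES (1, 'RAM')")
try:
    con.execute("INSERT INTO students VALUES (1, 'SHYAM')")  # duplicate key
except sqlite3.IntegrityError as e:
    print("rejected:", e)  # the duplicate row is never stored
```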
Overall, indexing in databases provides significant benefits for improving query performance, efficient data
access, optimized data sorting, consistent data performance, and enforced data integrity.
Disadvantages of Indexing
 Indexing necessitates more storage space to hold the index data structure, which might increase the
total size of the database.
 Increased database maintenance overhead: Indexes must be maintained as data is added,
deleted, or modified in the table, which might raise database maintenance overhead.
 Indexing can reduce insert and update performance since the index data structure must be updated
each time data is modified.
 Choosing an index can be difficult: It can be challenging to choose the right indexes for a specific
query or application and may call for a detailed examination of the data and access patterns.
Features of Indexing
 The development of data structures, such as B-trees or hash tables, that provide quick access to
certain data items is known as indexing. The data structures themselves are built on the values of
the indexed columns, which are utilized to quickly find the data objects.
 The most important columns for indexing columns are selected based on how frequently they are
used and the sorts of queries they are subjected to. The cardinality, selectivity, and uniqueness of
the indexing columns can be taken into account.
 There are several different index types used by databases, including primary, secondary, clustered,
and non-clustered indexes. Based on the particular needs of the database system, each form of
index offers benefits and drawbacks.
 For the database system to function at its best, periodic index maintenance is required. According to
changes in the data and usage patterns, maintenance work involves building, updating, and
removing indexes.
 Database query optimization involves indexing, which is essential. The query optimizer utilizes the
indexes to choose the best execution strategy for a particular query based on the cost of accessing
the data and the selectivity of the indexing columns.
 Databases make use of a range of indexing strategies, including covering indexes, index-only scans,
and partial indexes. These techniques maximize the utilization of indexes for particular types of
queries and data access.
 When non-contiguous data blocks are stored in an index, it can result in index fragmentation, which
makes the index less effective. Regular index maintenance, such as defragmentation and
reorganization, can decrease fragmentation.
Conclusion
Indexing is a very useful technique that helps in optimizing the search time of database queries. The index
table consists of a search key and a pointer. There are four main types of indexing: primary,
secondary, clustered, and multilevel indexing. Primary indexing is divided into two types, dense and
sparse: a dense index contains an index record for every search key, while a sparse index does not
contain an index record for every search key. Multilevel indexing commonly uses a B+ tree. The main
purpose of indexing is to provide better performance for data retrieval.

59. States of a transaction

A transaction passes through several states during its lifetime. These states describe the current
status of the transaction and determine how its further processing will proceed; they govern the
rules which decide the fate of the transaction, whether it will commit or abort.
Transactions also use a transaction log: a file maintained by the recovery management component to
record all the activities of the transaction. After the commit is done, the transaction log file is
removed.

These are different types of Transaction States :

1. Active State –
When the instructions of the transaction are running, the transaction is in the active state. If all the
read and write operations are performed without any error, it goes to the "partially committed
state"; if any instruction fails, it goes to the "failed state".

2. Partially Committed –
After completion of all the read and write operations, the changes exist only in main memory or the local
buffer. If the changes are made permanent on the database, the state changes to the "committed
state"; in case of failure it goes to the "failed state".

3. Failed State –
The transaction goes to the "failed state" when any instruction fails, or when a failure occurs
while making the changes permanent on the database.

4. Aborted State –
After any type of failure, the transaction goes from the "failed state" to the "aborted state". Since in
the previous states the changes were made only to the local buffer or main memory, these
changes are deleted or rolled back.

5. Committed State –
This is the state in which the changes are made permanent on the database; the transaction is
complete and therefore moves to the "terminated state".

6. Terminated State –
If there isn't any rollback, or the transaction comes from the "committed state", then the system is
consistent and ready for a new transaction, and the old transaction is terminated.
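The state diagram above can be encoded as a simple transition table (a hypothetical sketch for illustration, not a real DBMS API):

```python
# Sketch: transaction states as a (state, event) -> next-state table.
TRANSITIONS = {
    ("active", "all ops succeed"):                     "partially committed",
    ("active", "instruction fails"):                   "failed",
    ("partially committed", "changes made permanent"): "committed",
    ("partially committed", "write fails"):            "failed",
    ("failed", "rollback"):                            "aborted",
    ("committed", "done"):                             "terminated",
    ("aborted", "done"):                               "terminated",
}

def run(events):
    """Drive a transaction from 'active' through the given events."""
    state = "active"
    for event in events:
        state = TRANSITIONS[(state, event)]
    return state

# A successful commit path and a failure/rollback path both terminate.
print(run(["all ops succeed", "changes made permanent", "done"]))
print(run(["instruction fails", "rollback", "done"]))
```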

60. Different types of data models

A Data Model in Database Management System (DBMS) is the concept of tools that are developed to
summarize the description of the database. Data Models provide us with a transparent picture of data
which helps us in creating an actual database. It shows us from the design of the data to its proper
implementation of data.
Types of Data Models
Data models are basically classified into three types:
1. Conceptual Data Model
2. Representational Data Model
3. Physical Data Model
1. Conceptual Data Model

The conceptual data model describes the database at a very high level and is useful to understand the
needs or requirements of the database. It is this model that is used in the requirement-gathering process
i.e. before the Database Designers start making a particular database. One such popular model is
the entity/relationship model (ER model). The E/R model specializes in entities, relationships, and even
attributes that are used by database designers. In terms of this concept, a discussion can be made even
with non-computer science(non-technical) users and stakeholders, and their requirements can be
understood.
Entity-Relationship Model( ER Model): It is a high-level data model which is used to define the data and
the relationships between them. It is basically a conceptual design of any database which is easy to design
the view of data.
Components of ER Model:
1. Entity: An entity is referred to as a real-world object. It can be a name, place, object, class, etc. These
are represented by a rectangle in an ER Diagram.
2. Attributes: An attribute can be defined as the description of the entity. Attributes are represented by
an ellipse in an ER Diagram. Examples include the Age, Roll Number, or Marks of a Student.
3. Relationship: Relationships are used to define relations among different entities. Diamonds and
Rhombus are used to show Relationships.

Characteristics of a conceptual data model

 Offers Organization-wide coverage of the business concepts.


 This type of data model is designed and developed for a business audience.
 The conceptual model is developed independently of hardware specifications like data storage
capacity, location or software specifications like DBMS vendor and technology. The focus is to
represent data as a user will see it in the “real world.”
Conceptual data models, also known as domain models, create a common vocabulary for all stakeholders by
establishing basic concepts and scope.

2. Representational Data Model

This type of data model is used to represent only the logical part of the database and does not represent
the physical structure of the database. The representational data model allows us to focus primarily, on
the design part of the database. A popular representational model is a Relational model. The relational
Model consists of Relational Algebra and Relational Calculus. In the Relational Model, we basically use
tables to represent our data and the relationships between them. It is a theoretical concept whose practical
implementation is done in Physical Data Model.
The advantage of using a Representational data model is to provide a foundation to form the base for the
Physical model

3. Physical Data Model

The physical Data Model is used to practically implement Relational Data Model. Ultimately, all data in a
database is stored physically on a secondary storage device such as discs and tapes. This is stored in the
form of files, records, and certain other data structures. It has all the information on the format in which the
files are present and the structure of the databases, the presence of external data structures, and their
relation to each other. Here, we basically save tables in memory so they can be accessed efficiently. In
order to come up with a good physical model, we have to work on the relational model in a better
way. Structured Query Language (SQL) is used to practically implement Relational Algebra.
This Data Model describes HOW the system will be implemented using a specific DBMS system. This
model is typically created by DBA and developers. The purpose is actual implementation of the
database.

Characteristics of a physical data model:

 The physical data model describes the data needs of a single project or application, though it may be
integrated with other physical data models based on project scope.
 The data model contains relationships between tables, addressing the cardinality and nullability of
the relationships.
 It is developed for a specific version of a DBMS, location, data storage, or technology to be used in the
project.
 Columns should have exact datatypes, lengths, and default values assigned.
 Primary and foreign keys, views, indexes, access profiles, authorizations, etc. are defined.
Some Other Data Models

1. Hierarchical Model

The hierarchical model is one of the oldest data models; it was developed by IBM in the 1960s. In a
hierarchical model, data are viewed as a collection of tables, or we can say segments, that form a
hierarchical relation. In this, the data is organized into a tree-like structure where each record has
one parent record and many children. Even when the segments are connected as a chain-like structure by
logical associations, the resulting structure can be a fan structure with multiple branches. These
logical associations are called directional associations.

2. Network Model

The Network Model was formalized by the Database Task group in the 1960s. This model is the
generalization of the hierarchical model. This model can consist of multiple parent segments and these
segments are grouped as levels but there exists a logical association between the segments belonging to
any level. Mostly, there exists a many-to-many logical association between any of the two segments.

3. Object-Oriented Data Model

In the Object-Oriented Data Model, data and their relationships are contained in a single structure which
is referred to as an object in this data model. In this, real-world problems are represented as objects with
different attributes. All objects have multiple relationships between them. Basically, it is a combination of
Object Oriented programming and a Relational Database Model.

4. Flat Data Model

The flat data model basically consists of a two-dimensional array of data elements that does not contain
any duplicate elements. This data model has one drawback: it cannot store a large amount of data,
that is, the tables cannot be of large size.

5. Context Data Model

The Context data model is simply a data model which consists of more than one data model. For example,
the Context data model consists of ER Model, Object-Oriented Data Model, etc. This model allows users
to do more than one thing which each individual data model can do.
6. Semi-Structured Data Model

Semi-Structured data models deal with the data in a flexible way. Some entities may have extra attributes
and some entities may have some missing attributes. Basically, you can represent data here in a flexible
way.
Advantages of Data Models
1. Data Models help us in representing data accurately.
2. It helps us in finding the missing data and also in minimizing Data Redundancy.
3. Data Model provides data security in a better way.
4. The data model should be detailed enough to be used for building the physical database.
5. The information in the data model can be used for defining the relationship between tables, primary
and foreign keys, and stored procedures.
Disadvantages of Data Models
1. In the case of a vast database, sometimes it becomes difficult to understand the data model.
2. You must have proper knowledge of SQL to use physical models.
3. Even a small change made in the structure requires modification in the entire application.
4. There is no set data manipulation language in DBMS.
5. To develop a data model, one should know the characteristics of the physical data storage.

Conclusion

 Data modeling is the process of developing data model for the data to be stored in a Database.
 Data Models ensure consistency in naming conventions, default values, semantics, security while
ensuring quality of the data.
 Data Model structure helps to define the relational tables, primary and foreign keys and stored
procedures.
 There are three types of data models: conceptual, logical, and physical.
 The main aim of conceptual model is to establish the entities, their attributes, and their relationships.
 Logical data model defines the structure of the data elements and set the relationships between
them.
 A Physical Data Model describes the database specific implementation of the data model.
 The main goal of a designing data model is to make certain that data objects offered by the
functional team are represented accurately.
 The biggest drawback is that even a small change made in the structure requires modification in the
entire application.

61. DDL and DML


Data Definition Language helps you to define the database structure or schema. DDL commands help you
to create the structure of the database and the other database objects. Its commands are auto-committed,
so the changes are saved in the database permanently. The full form of DDL is Data Definition Language.

What is DML?
DML commands allow you to manage the data stored in the database. DML commands are not auto-committed,
so the changes they make are not permanent until committed, and it is possible to roll back an
operation. The full form of DML is Data Manipulation Language.

Difference Between DDL and DML in DBMS


Here is the main difference between DDL and DML Command in DBMS:
DDL DML

Data Definition Language (DDL) helps you to define the database structure or schema. | Data Manipulation Language (DML) allows you to manage the data stored in the database.

DDL commands are used to create the database schema. | DML commands are used to populate and manipulate the database.

DDL is not classified further. | DML is classified as Procedural and Non-Procedural DMLs.

CREATE, ALTER, DROP, TRUNCATE, COMMENT, RENAME, etc. | INSERT, UPDATE, DELETE, MERGE, CALL, etc.

It defines the columns of the table. | It adds or updates the rows of the table.

DDL statements affect the whole table. | DML statements affect one or more rows.

DDL statements cannot be rolled back. | DML statements can be rolled back.

DDL is declarative. | DML is imperative.

Important DDL commands are: 1) CREATE, 2) ALTER, 3) DROP, 4) TRUNCATE, etc. while important
DML commands are: 1) INSERT, 2) UPDATE, 3) DELETE, 4) MERGE, etc.
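The rollback difference can be demonstrated with Python's sqlite3 module. This is only a sketch: SQLite is unusual in that it can roll back DDL too, unlike many DBMSs such as Oracle or MySQL, so only the DML side of the claim is shown here.

```python
import sqlite3

# Sketch: uncommitted DML changes can be rolled back.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE students (roll_no INT, name TEXT)")
con.commit()

con.execute("INSERT INTO students VALUES (1, 'RAM')")  # DML, uncommitted
con.rollback()                                         # undo the INSERT

count = con.execute("SELECT COUNT(*) FROM students").fetchone()[0]
print(count)  # 0 -- the table is back to its committed (empty) state
```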

Why DDL?
Here are the reasons for using DDL:

 Allows you to store shared data
 Improves data independence and integrity
 Allows multiple users
 Improves security and enables efficient data access

Why DML?
Here are the benefits of DML:

 The DML statements allow you to modify the data stored in a database.
 Users can specify what data is needed.
 DML offers many different flavors and capabilities between database vendors.
 It offers an efficient human interaction with the system.

Commands for DDL


Five types of DDL commands are:

CREATE
CREATE statements is used to define the database structure schema:
Syntax:

CREATE TABLE TABLE_NAME (COLUMN_NAME DATATYPES[,....]);


For example:

CREATE DATABASE university;

CREATE TABLE students (id INT, name VARCHAR(50));
CREATE VIEW for_students AS SELECT * FROM students;
DROP
The DROP command removes tables and databases from the RDBMS.

Syntax:

DROP object_type object_name;

For example:

DROP DATABASE university;
DROP TABLE student;
ALTER
The ALTER command allows you to alter the structure of the database.

Syntax:

To add a new column in the table

ALTER TABLE table_name ADD column_name COLUMN-definition;


To modify an existing column in the table:

ALTER TABLE table_name MODIFY (column_definition...);


For example:

ALTER TABLE guru99 ADD subject VARCHAR(20);


TRUNCATE
This command is used to delete all the rows from the table and free the space containing the table.

Syntax:

TRUNCATE TABLE table_name;


Example:

TRUNCATE table students;

Commands for DML


Here are some important DML commands:

 INSERT
 UPDATE
 DELETE

INSERT
This SQL command is used to insert data into the rows of a table.
Syntax:

INSERT INTO TABLE_NAME (col1, col2, col3,.... col N)


VALUES (value1, value2, value3, .... valueN);
Or
INSERT INTO TABLE_NAME
VALUES (value1, value2, value3, .... valueN);
For example:

INSERT INTO students (RollNo, FirstName, LastName) VALUES ('60', 'Tom', 'Erichsen');
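The same INSERT can be run from Python via sqlite3. This is a sketch; the `students` table is created on the fly, and `?` placeholders are used so the driver quotes the values safely instead of pasting them into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (RollNo INTEGER, FirstName TEXT, LastName TEXT)")

# Placeholders (?) let the driver bind the values; this also guards
# against SQL injection compared with string concatenation.
conn.execute("INSERT INTO students (RollNo, FirstName, LastName) VALUES (?, ?, ?)",
             (60, "Tom", "Erichsen"))

row = conn.execute("SELECT RollNo, FirstName FROM students").fetchone()
print(row)  # (60, 'Tom')
conn.close()
```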
UPDATE
This command is used to update or modify the value of a column in the table.

Syntax:

UPDATE table_name SET [column_name1= value1,...column_nameN = valueN] [WHERE CONDITION]


For example:

UPDATE students
SET FirstName = 'John', LastName = 'Wick'
WHERE StudID = 3;
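The UPDATE example above can be sketched in sqlite3; the table contents are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (StudID INTEGER, FirstName TEXT, LastName TEXT)")
conn.execute("INSERT INTO students VALUES (3, 'Tom', 'Erichsen')")

# Without the WHERE clause every row would be updated, so always
# double-check the condition before running an UPDATE.
conn.execute("UPDATE students SET FirstName = 'John', LastName = 'Wick' "
             "WHERE StudID = 3")

name = conn.execute("SELECT FirstName FROM students WHERE StudID = 3").fetchone()[0]
print(name)  # John
conn.close()
```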
DELETE
This command is used to remove one or more rows from a table.

Syntax:

DELETE FROM table_name [WHERE condition];


For example:

DELETE FROM students
WHERE FirstName = 'John';
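A quick sqlite3 sketch of the DELETE statement; the sample rows are assumptions made for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (StudID INTEGER, FirstName TEXT)")
conn.executemany("INSERT INTO students VALUES (?, ?)",
                 [(1, "John"), (2, "Tom")])

# DELETE removes only the rows matching the WHERE clause;
# the table itself remains in the schema.
conn.execute("DELETE FROM students WHERE FirstName = 'John'")

remaining = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]
print(remaining)  # 1
conn.close()
```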

DDL Command Example

CREATE
Syntax:

CREATE TABLE tableName


(
column_1 datatype [ NULL | NOT NULL ],
column_2 datatype [ NULL | NOT NULL ],
...
);
Here,

 The parameter tableName denotes the name of the table that you are going to create.
 The parameters column_1, column_2… denote the columns to be added to the table.
 A column should be specified as either NULL or NOT NULL. If you don’t specify, SQL Server will
take NULL as the default

Example:

CREATE TABLE Students


(
Student_ID Int,
Student_Name Varchar(10)
)
ALTER
Syntax:

Alter TABLE <Table name> ADD Column1 datatype, Column2 datatype;


Example:

ALTER TABLE University.Students_Name ADD Course_Duration VARCHAR(20);


DROP
Syntax:

DROP TABLE <tableName>;


The parameter tableName is the name of the table that is to be deleted.

Example:

DROP TABLE COURSE_NAMES;

DML Command Example

INSERT
In PL/SQL, we can insert the data into any table using the SQL command INSERT INTO. This command
will take the table name, table column, and column values as the input and insert the value in the base
table.

The INSERT command can also take the values directly from another table using ‘SELECT’ statement
rather than giving the values for each column. Through ‘SELECT’ statement, we can insert as many rows
as the base table contains.

Syntax:

BEGIN
INSERT INTO <table_name>(<column1 >,<column2>,...<column_n>)
VALUES(<value1>,<value2>,...,<value_n>);
END;
The above syntax shows the INSERT INTO command. The table name and values are mandatory fields,
whereas column names are not mandatory if the insert statements have values for all the columns of the
table.

The keyword ‘VALUES’ is mandatory if the values are given separately, as shown above.

Syntax:

BEGIN
INSERT INTO <table_name>(<column1>,<column2>,...,<column_n>)
SELECT <column1>,<column2>,...,<column_n> FROM <table_name2>;
END;
The above syntax shows the INSERT INTO command that takes the values directly from the
<table_name2> using the SELECT command.

The keyword ‘VALUES’ should not be present in this case, as the values are not given separately.
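The INSERT ... SELECT form can be demonstrated with sqlite3. This is an illustrative sketch: the `old_students` and `alumni` table names and their contents are assumptions, not from the original text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE old_students (name TEXT)")
conn.execute("CREATE TABLE alumni (name TEXT)")
conn.executemany("INSERT INTO old_students VALUES (?)", [("Tom",), ("Ana",)])

# INSERT ... SELECT copies every row the query returns;
# note there is no VALUES keyword in this form.
conn.execute("INSERT INTO alumni (name) SELECT name FROM old_students")

copied = conn.execute("SELECT COUNT(*) FROM alumni").fetchone()[0]
print(copied)  # 2
conn.close()
```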

DELETE
Below is the syntax to delete rows from a table.
Syntax:

DELETE FROM <TableName> [WHERE condition];


The parameter TableName is the name of the table whose rows are to be deleted. Without a WHERE clause, every row in the table is removed.

Example:

DELETE FROM COURSE_NAMES;


SELECT
To view data in SQL Server, we use the SELECT statement.

Syntax:

SELECT expression
FROM tableName
[WHERE condition];
Example:

SELECT * FROM Course;
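A small sqlite3 sketch of SELECT with a WHERE clause, using the `Course` table from the example; the columns and rows are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Course (CourseID INTEGER, Name TEXT)")
conn.executemany("INSERT INTO Course VALUES (?, ?)",
                 [(1, "Maths"), (2, "Physics")])

# The WHERE clause filters the result set to matching rows only.
rows = conn.execute("SELECT Name FROM Course WHERE CourseID = 2").fetchall()
print(rows)  # [('Physics',)]
conn.close()
```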

62. What is an Index?


An Index is a key built from one or more columns in the database that speeds up fetching rows from the
table or view. This key helps a Database like Oracle, SQL Server, MySQL, etc. to find the row associated
with key values quickly.

Two types of Indexes are:

 Clustered Index
 Non-Clustered Index

What is a Clustered index?


A clustered index is a type of index that sorts the data rows in the table on their key values. In a database,
there is only one clustered index per table.

A clustered index defines the order in which data is stored in the table, and the data can be sorted in only one way,
so there can be only a single clustered index per table. In an RDBMS, creating a primary key usually
creates a clustered index on that specific column.

What is Non-clustered index?


A non-clustered index stores the data at one location and the index at another location. The index contains
pointers to the location of the data. A single table can have many non-clustered indexes, because the
index is stored separately from the data rows.

For example, a book can have more than one index, one at the beginning which displays the contents of a
book unit wise while the second index shows the index of terms in alphabetical order.
A non-clustering index is defined in the non-ordering field of the table. This type of indexing method helps
you to improve the performance of queries that use keys which are not assigned as a primary key. A non-
clustered index allows you to add a unique key for a table.
Characteristic of Clustered Index

 Default and sorted data storage


 Use just one or more than one columns for an index
 Helps you to store Data and index together
 Fragmentation
 Operations
 Clustered index scan and index seek
 Key Lookup

Characteristics of Non-clustered Indexes

 Store key values only


 Pointers to Heap/Clustered Index rows
 Allows Secondary data access
 Bridge to the data
 Operations of Index Scan and Index Seek
 You can create a nonclustered index for a table or view
 Every index row in the nonclustered index stores the nonclustered key value and a row locator

Clustered vs Non-clustered Index in SQL: Key Differences

Use for
Clustered: Sorts the records and stores them physically in order.
Non-clustered: Creates a logical order for data rows and uses pointers to the physical data files.

Storing method
Clustered: Stores data pages in the leaf nodes of the index.
Non-clustered: Never stores data pages in the leaf nodes of the index.

Size
Clustered: Quite large.
Non-clustered: Small compared to the clustered index.

Data accessing
Clustered: Faster.
Non-clustered: Slower compared to the clustered index.

Additional disk space
Clustered: Not required.
Non-clustered: Required, to store the index separately.

Type of key
Clustered: By default, the primary key of the table is a clustered index.
Non-clustered: Can be used with a unique constraint on the table, which acts as a composite key.

Main feature
Clustered: Can improve the performance of data retrieval.
Non-clustered: Should be created on columns that are used in joins.

An example of a clustered index


In the example below, SalesOrderDetailID is the clustered index key. A sample query to retrieve data:

SELECT CarrierTrackingNumber, UnitPrice


FROM SalesData
WHERE SalesOrderDetailID = 6
An example of a non-clustered index
In the example below, a non-clustered index is created on OrderQty and ProductID as follows:

CREATE INDEX myIndex ON


SalesData (ProductID, OrderQty)

The following query will run faster than a lookup through the clustered index alone:

SELECT ProductID, OrderQty


FROM SalesData
WHERE ProductID = 714
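The non-clustered index example can be sketched in sqlite3, whose secondary indexes behave like non-clustered indexes (key values plus row pointers, stored apart from the table). The table contents are assumptions; EXPLAIN QUERY PLAN shows the optimizer choosing `myIndex` for the filtered query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SalesData (SalesOrderDetailID INTEGER PRIMARY KEY, "
             "ProductID INTEGER, OrderQty INTEGER)")
conn.executemany("INSERT INTO SalesData VALUES (?, ?, ?)",
                 [(1, 714, 2), (2, 715, 5), (3, 714, 1)])

# A secondary (non-clustered-style) index on the query's filter columns.
conn.execute("CREATE INDEX myIndex ON SalesData (ProductID, OrderQty)")

# EXPLAIN QUERY PLAN reports whether the optimizer uses the index;
# column 3 of each plan row holds the human-readable detail string.
plan = conn.execute("EXPLAIN QUERY PLAN "
                    "SELECT ProductID, OrderQty FROM SalesData "
                    "WHERE ProductID = 714").fetchone()[3]
print(plan)
conn.close()
```

Because the index covers both selected columns, SQLite can answer the query from the index alone (a "covering index"), never touching the base table.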

Advantages of Clustered Index


The pros/benefits of the clustered index are:

 Clustered indexes are an ideal option for range or group by with max, min, count type queries
 In this type of index, a search can go straight to a specific point in data so that you can keep
reading sequentially from there.
 The clustered index uses a location mechanism to find the index entry at the start of a range.
 It is an effective method for range searches when a range of search key values is requested.
 Helps you to minimize page transfers and maximize the cache hits.

Advantages of Non-clustered index


The pros of using a non-clustered index are:

 A non-clustered index helps you retrieve data quickly from the database table.
 Helps you to avoid the overhead cost associated with the clustered index
 A table may have multiple non-clustered indexes in an RDBMS, so more than one index can be created per table.

Disadvantages of Clustered Index


Here are the cons/drawbacks of using a clustered index:

 Many inserts in non-sequential order cause constant page splits, in both data pages and index pages.
 Extra work for the database engine on inserts, updates, and deletes.
 A clustered index takes longer to update records when the fields in the clustered index are changed.
 The leaf nodes mostly contain data pages in the clustered index.

Disadvantages of Non-clustered index


Here are the cons/drawbacks of using a non-clustered index:

 A non-clustered index stores data in a logical order but does not sort the data rows physically.
 The lookup process on a non-clustered index can be costly, since each match requires an extra fetch of the actual row.
 Every time the clustering key is updated, a corresponding update is required on the non-clustered
index as it stores the clustering key.
