Dbms
403
SQL (Structured Query Language) is a computer language for storing, manipulating, and
retrieving data held in a relational database. It is the standard language for relational
database systems: Relational Database Management Systems (RDBMS) such as MySQL,
MS Access, Oracle, Sybase, Informix, PostgreSQL, and SQL Server all use SQL as their
standard database language. SQL commands are divided into five categories:
i. Data Definition Language
The Data Definition Language (DDL) consists of SQL statements used to define the
database structure or schema. It deals with descriptions of the database schema and is
used to create and modify the structure of database objects. SQL commands that come
under Data Definition Language are:
a. Create
b. Alter
c. Drop
a. Commit
b. Roll back
c. Save point
i. UNION Operation
UNION is used to combine the results of two or more SELECT statements, eliminating
duplicate rows from its result set. For a union, the number of columns and their data
types must be the same in both SELECT statements to which the UNION operation is applied.
ii. UNION ALL
This operation is similar to UNION, but it also shows the duplicate rows.
iii. INTERSECT
The INTERSECT operation combines two SELECT statements but returns only the
records that are common to both. As with UNION, the number of columns and their
data types must be the same.
iv. MINUS
The MINUS operation combines the results of two SELECT statements and returns only
those rows that belong to the first result set and do not appear in the second.
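The four set operations can be compared side by side with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for a full RDBMS; the tables first_set and second_set and their rows are hypothetical. SQLite spells the MINUS operation as EXCEPT, so that keyword is used here.

```python
import sqlite3

# Hypothetical tables for illustration; names and rows are not from the notes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE first_set  (id INTEGER, name TEXT);
    CREATE TABLE second_set (id INTEGER, name TEXT);
    INSERT INTO first_set  VALUES (1, 'Asha'), (2, 'Bina');
    INSERT INTO second_set VALUES (2, 'Bina'), (3, 'Chen');
""")


def run(operator):
    """Combine the two SELECT statements with the given set operator."""
    sql = (f"SELECT id, name FROM first_set {operator} "
           f"SELECT id, name FROM second_set ORDER BY id")
    return conn.execute(sql).fetchall()


union_rows = run("UNION")          # duplicates removed
union_all_rows = run("UNION ALL")  # duplicates kept
intersect_rows = run("INTERSECT")  # rows common to both results
minus_rows = run("EXCEPT")         # SQLite's spelling of MINUS

for label, rows in [("UNION", union_rows), ("UNION ALL", union_all_rows),
                    ("INTERSECT", intersect_rows), ("MINUS", minus_rows)]:
    print(label, rows)
```

Both SELECT statements list the same two columns, which is the compatibility requirement described above.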
SQL | PL/SQL
It is a database Structured Query Language. | It is a database programming language using SQL.
Data variables are not available. | Data variables are available.
Control structures are not supported. | Control structures such as FOR loops and WHILE loops are available.
A query performs a single operation. | A PL/SQL block performs a group of operations as a single block.
SQL is a declarative language; it can be embedded in PL/SQL. | PL/SQL is a procedural language; it cannot be embedded in SQL.
It interacts directly with the database server. | It does not interact directly with the database server.
It is a data-oriented language. | It is an application-oriented language.
It is used to write queries and DDL and DML statements. | It is used to write program blocks, functions, procedures, triggers, and packages.
i. INNER JOIN
The INNER JOIN keyword selects rows from both tables as long as the join condition is
satisfied. It creates the result set by combining each pair of rows from the two tables
for which the condition holds, i.e. for which the value of the common field is the
same.
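A minimal sketch of an inner join, assuming Python's built-in sqlite3 module; the student and enrolment tables and their rows are hypothetical. Only roll numbers present in both tables survive the join.

```python
import sqlite3

# Hypothetical tables for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student   (roll_no INTEGER, name TEXT);
    CREATE TABLE enrolment (roll_no INTEGER, course TEXT);
    INSERT INTO student   VALUES (1, 'Asha'), (2, 'Bina'), (3, 'Chen');
    INSERT INTO enrolment VALUES (1, 'DBMS'), (2, 'Networks'), (4, 'Compilers');
""")

# Rows are paired only where the common field roll_no has the same value.
inner_rows = conn.execute("""
    SELECT student.roll_no, student.name, enrolment.course
    FROM student
    INNER JOIN enrolment ON student.roll_no = enrolment.roll_no
    ORDER BY student.roll_no
""").fetchall()

print(inner_rows)
```

Student 3 has no enrolment and enrolment 4 has no student, so neither appears in the result.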
5. How does database differ from database management system? Explain the
characteristics of modern database system that differentiate it from traditional file
processing system. What do you mean by data independence?
A database is a collection of related data whose purpose is to meet the data management
needs of an institution. A Database Management System (DBMS), on the other hand, is
complex software that stores the data on secondary storage devices and is used to create
and manipulate databases.
DBMS | File System
A DBMS (Database Management System) is a software application used for accessing, creating, and managing databases. | A file system is software that manages and organizes the files in a storage medium. It controls how data is stored and retrieved.
DBMS gives an abstract view of data that hides the details. | The file system exposes the details of data representation and storage.
DBMS is efficient to use, as there is a wide variety of methods to store and retrieve data. | Storing and retrieving data cannot be done efficiently in a file system.
There is backup and recovery for data in a DBMS. | It does not offer data recovery processes.
DBMS provides a crash recovery mechanism. | The file system does not have a crash recovery mechanism.
DBMS offers a good protection mechanism. | Protecting a file system is very difficult.
The redundancy of data is low in a DBMS. | In a file management system, the redundancy of data is greater.
Data inconsistency is low in a database management system. | Data inconsistency is higher in the file system.
A database management system offers high security. | The file system offers less security.
DBMS provides backup and recovery of data even if it is lost. | It does not offer backup and recovery of data if it is lost.
Data independence is a property of a DBMS that lets you change the database schema at one
level of a database system without having to change the schema at the next higher level.
Data independence helps you to keep data separated from all programs that make use of it.
Physical data independence and logical data independence are the two kinds of data
independence in DBMS.
6. Explain briefly about physical and logical data independence? Discuss advantages of
DBMS over conventional data processing file system.
Data independence is a property of a DBMS that lets you change the database schema at one
level of a database system without having to change the schema at the next higher level.
Physical data independence helps you to separate the conceptual level from the
internal/physical level. It allows you to provide a logical description of the database
without the need to specify physical structures. Compared to logical independence, physical
data independence is easy to achieve.
Logical data independence is the ability to change the conceptual schema without changing
external views, external APIs, or programs. Any change made is absorbed by the mapping
between the external and conceptual levels. Compared to physical data independence,
logical data independence is challenging to achieve.
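Logical data independence can be illustrated with a view acting as the external level. The sketch below assumes Python's built-in sqlite3 module; the employee tables, the view name employee_view, and the row are hypothetical. The application query stays the same while the conceptual schema underneath is reorganized and the view (the external/conceptual mapping) is redefined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: one base table. External level: a view used by the application.
conn.executescript("""
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, full_name TEXT, city TEXT);
    INSERT INTO employee VALUES (1, 'Asha Rai', 'Pokhara');
    CREATE VIEW employee_view AS SELECT emp_id, full_name FROM employee;
""")

# The application only ever issues this query against the view.
APP_QUERY = "SELECT emp_id, full_name FROM employee_view"
before = conn.execute(APP_QUERY).fetchall()

# Conceptual schema change: split the base table in two, then redefine the
# view so that the mapping absorbs the change.
conn.executescript("""
    DROP VIEW employee_view;
    CREATE TABLE employee_name (emp_id INTEGER PRIMARY KEY, full_name TEXT);
    CREATE TABLE employee_city (emp_id INTEGER PRIMARY KEY, city TEXT);
    INSERT INTO employee_name SELECT emp_id, full_name FROM employee;
    INSERT INTO employee_city SELECT emp_id, city FROM employee;
    DROP TABLE employee;
    CREATE VIEW employee_view AS SELECT emp_id, full_name FROM employee_name;
""")
after = conn.execute(APP_QUERY).fetchall()

print(before, after)
```

The application query returns the same rows before and after the change, which is the behaviour that logical data independence describes.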
The advantages of DBMS over conventional data processing file system are:
i. DBMS is a software application that is used for accessing, creating, and managing
databases, whereas a file system is software that manages and organizes the files in a
storage medium.
ii. DBMS provides a crash recovery mechanism, whereas the file system does not have
one.
iii. Data inconsistency is low in a database management system, but it is higher in the
file system.
iv. In a DBMS it is easy to implement complicated transactions using SQL, while the file
system does not provide support for complicated transactions.
v. DBMS provides a concurrency facility, whereas the file system does not offer concurrency.
7. List the advantages of DBMS over File System. What are the differences between DDL
and DML?
The advantages of DBMS over File System are:
i. DBMS is a software application that is used for accessing, creating, and managing
databases, whereas a file system is software that manages and organizes the files in a
storage medium.
ii. DBMS provides a crash recovery mechanism, whereas the file system does not have
one.
iii. Data inconsistency is low in a database management system, but it is higher in the
file system.
iv. In a DBMS it is easy to implement complicated transactions using SQL, while the file
system does not provide support for complicated transactions.
v. DBMS provides a concurrency facility, whereas the file system does not offer concurrency.
DDL | DML
It defines the columns (attributes) of the table. | It adds or updates the rows of the table. These rows are called tuples.
It has no further classification. | It is further classified into procedural and non-procedural DML.
Basic DDL commands are CREATE, DROP, RENAME, ALTER, etc. | Basic DML commands are UPDATE, INSERT, MERGE, etc.
DDL statements do not use a WHERE clause. | DML statements can use a WHERE clause.
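The split between DDL and DML can be seen in a short session. This is a sketch assuming Python's built-in sqlite3 module; the student table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and alter the structure of the table (no WHERE clause involved).
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute("ALTER TABLE student ADD COLUMN marks INTEGER")
columns = [row[1] for row in conn.execute("PRAGMA table_info(student)")]

# DML: add and change rows; WHERE selects which rows are affected.
conn.execute("INSERT INTO student VALUES (1, 'Asha', 60), (2, 'Bina', 70)")
conn.execute("UPDATE student SET marks = 75 WHERE roll_no = 2")
conn.execute("DELETE FROM student WHERE roll_no = 1")
rows = conn.execute("SELECT roll_no, name, marks FROM student").fetchall()

# DDL again: remove the table definition itself.
conn.execute("DROP TABLE student")
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()

print(columns, rows, tables)
```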
8. What are integrity constraints? Why are they important? Explain its types.
OR,
What do you mean by integrity constraints? Explain all the integrity constraints you
know about.
Integrity constraints are rules used to ensure the accuracy and consistency of the data in a
relational database. Relationships between tables are kept consistent through the concept of
referential integrity.
They are important because they maintain the quality of information: integrity
constraints ensure that data insertion, updating, and other processes are performed
in such a way that data integrity is not affected. Thus, integrity constraints guard
against accidental damage to the database. Their types are:
i. Entity Integrity Constraints
The entity integrity constraint ensures that the primary key attribute of a relation
does not accept a null value. The primary key value uniquely identifies an entity
in a relation, so a null value could not serve that purpose.
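A minimal sketch of entity integrity, assuming Python's built-in sqlite3 module; the student table and the helper try_insert are hypothetical. NOT NULL is written out explicitly because SQLite, unlike most database engines, does not imply it for every PRIMARY KEY column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Explicit NOT NULL: SQLite does not always imply it for a PRIMARY KEY column.
conn.execute(
    "CREATE TABLE student (student_id TEXT PRIMARY KEY NOT NULL, name TEXT)")
conn.execute("INSERT INTO student VALUES ('S01', 'Katie')")


def try_insert(student_id, name):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?)", (student_id, name))
        return True
    except sqlite3.IntegrityError:
        return False


null_key_accepted = try_insert(None, "Ollie")        # null primary key
duplicate_key_accepted = try_insert("S01", "Ollie")  # key already in use
new_key_accepted = try_insert("S02", "Ollie")        # valid, unique key

print(null_key_accepted, duplicate_key_accepted, new_key_accepted)
```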
9. What are integrity constraints? What are domain constraints? Explain referential
integrity constraint with its function.
OR,
Explain referential integrity constraint with its function.
Integrity constraints are used to ensure accuracy and consistency of the data in a relational
database. Data integrity is handled in a relational database through the concept of referential
integrity.
Each table has a certain set of columns, and each column accepts values of a single type,
determined by its data type. The column does not accept values of any other data type. A
domain constraint restricts the values a column may hold; it can be viewed as a
user-defined data type:
Domain Constraint = data type + Constraints (NOT NULL / UNIQUE / PRIMARY KEY /
FOREIGN KEY / CHECK / DEFAULT).
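The "data type + constraints" formula can be sketched as a column definition. The example assumes Python's built-in sqlite3 module; the account table and the helper rejected are hypothetical. SQLite is loosely typed, so the sketch demonstrates the NOT NULL, DEFAULT, and CHECK parts rather than strict type checking.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# balance: data type INTEGER plus NOT NULL, DEFAULT and CHECK constraints.
conn.execute("""
    CREATE TABLE account (
        account_no INTEGER PRIMARY KEY,
        holder     TEXT    NOT NULL,
        balance    INTEGER NOT NULL DEFAULT 0 CHECK (balance >= 0)
    )
""")

# Omitting balance makes the DEFAULT apply.
conn.execute("INSERT INTO account (account_no, holder) VALUES (1, 'Asha')")
default_balance = conn.execute(
    "SELECT balance FROM account WHERE account_no = 1").fetchone()[0]


def rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


negative_rejected = rejected("INSERT INTO account VALUES (2, 'Bina', -50)")
missing_holder_rejected = rejected("INSERT INTO account (account_no) VALUES (3)")

print(default_balance, negative_rejected, missing_holder_rejected)
```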
We wish to ensure that a value that appears in one relation for a given set of attributes also
appears for a certain set of attributes in another relation. This condition is called referential
integrity.
Referential integrity constraints work on the concept of foreign keys. A foreign key is an
attribute (or set of attributes) of one relation that refers to a key attribute of another
relation or of the same relation. The referential integrity constraint states that if a
relation refers to a key attribute of a different or the same relation, then that key value
must exist.
Referential integrity (RI) is a term used with relational databases to describe the integrity
of the business relationships represented in the schema. It ensures that relationships
between tables remain consistent.
A referential integrity constraint is also known as a foreign key constraint. A foreign key
is a key whose values are derived from the primary key of another table. The table from
which the values are derived is known as the master, referenced, or parent table, and the
table into which the values are inserted accordingly is known as the child or referencing
table. In other words, the table containing the foreign key is called the child table, and
the table containing the primary key or candidate key is called the referenced or parent
table. In the relational model, a candidate key is a minimal set of one or more attributes
that uniquely identifies each tuple of a relation.
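The parent/child relationship can be sketched as follows, assuming Python's built-in sqlite3 module; the department (parent) and employee (child) tables and the helper violates are hypothetical. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON is issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        dept_id INTEGER REFERENCES department(dept_id)
    );
    INSERT INTO department VALUES (10, 'Accounts');
    INSERT INTO employee VALUES (1, 10);
""")


def violates(sql):
    """Return True if the statement breaks referential integrity."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# A child row must refer to an existing parent key.
orphan_child_rejected = violates("INSERT INTO employee VALUES (2, 99)")
# A parent row that is still referenced cannot be deleted.
parent_delete_rejected = violates("DELETE FROM department WHERE dept_id = 10")

print(orphan_child_rejected, parent_delete_rejected)
```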
10. What is normalization? Explain partial functional dependency and 2NF with suitable
example. What is BCNF.
If a table has a composite primary key and a non-key attribute depends on only part of that
composite key, the dependency is called a partial dependency. In other words, it occurs
when a non-key attribute is functionally dependent on part of a candidate key. The
2nd Normal Form (2NF) eliminates partial dependency.
For a table to be in the Second Normal Form, it must satisfy two conditions:
i. The table should be in the First Normal Form.
ii. There should be no Partial Dependency.
Example:
<StudentProject>
StudentID ProjectNo StudentName ProjectName
S01 199 Katie Geo Location
S02 120 Ollie Cluster Exploration
In the above table, we have partial dependency; let us see how:
The prime key attributes are StudentID and ProjectNo, and
<ProjectInfo>
ProjectNo ProjectName
199 Geo Location
120 Cluster Exploration
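The decomposition can be checked with a short sketch, assuming Python's built-in sqlite3 module. The StudentProject rows and the ProjectInfo table come from the example; the Student and Assignment tables are assumptions added here to hold the student-side attributes, since that part of the example is not shown. Joining the pieces back together recovers the original rows, so no information is lost.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE StudentProject (StudentID TEXT, ProjectNo INTEGER,
                                 StudentName TEXT, ProjectName TEXT);
    INSERT INTO StudentProject VALUES
        ('S01', 199, 'Katie', 'Geo Location'),
        ('S02', 120, 'Ollie', 'Cluster Exploration');

    -- Assumed table: StudentName depends on StudentID alone.
    CREATE TABLE Student AS
        SELECT DISTINCT StudentID, StudentName FROM StudentProject;
    -- Assumed table: the composite key on its own.
    CREATE TABLE Assignment AS
        SELECT DISTINCT StudentID, ProjectNo FROM StudentProject;
    -- ProjectInfo as in the example: ProjectName depends on ProjectNo alone.
    CREATE TABLE ProjectInfo AS
        SELECT DISTINCT ProjectNo, ProjectName FROM StudentProject;
""")

original = conn.execute(
    "SELECT * FROM StudentProject ORDER BY StudentID").fetchall()
rejoined = conn.execute("""
    SELECT a.StudentID, a.ProjectNo, s.StudentName, p.ProjectName
    FROM Assignment AS a
    JOIN Student AS s ON s.StudentID = a.StudentID
    JOIN ProjectInfo AS p ON p.ProjectNo = a.ProjectNo
    ORDER BY a.StudentID
""").fetchall()

print(original == rejoined)
```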
An attribute that is not part of any candidate key is known as a non-prime attribute.
In other words, 3NF can be explained like this: a table is in 3NF if it is in 2NF and, for each
functional dependency X -> Y, at least one of the following conditions holds:
i. X is a super key of the table.
ii. Y is a prime attribute of the table.
An attribute that is a part of one of the candidate keys is known as a prime attribute.
Example: Suppose a company wants to store the complete address of each employee, and
creates a table named employee_details that looks like this:
Here, emp_state and emp_district are dependent on emp_zip, and emp_zip is dependent on
emp_id. That makes the non-prime attributes (emp_state and emp_district) transitively
dependent on the super key (emp_id). This violates the rule of 3NF.
To make this table comply with 3NF, we have to break it into two tables to remove the
transitive dependency:
employee table:
employee_zip table:
12. What is the role of functional dependency in normalization? Differentiate between 3NF
and BCNF.
A functional dependency is a constraint that determines the relation of one attribute to another
attribute in a Database Management System (DBMS). Functional dependencies help to
maintain the quality of data in the database, and they play a vital role in distinguishing
good database design from bad.
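Whether a functional dependency holds in a given set of rows can be tested mechanically. The sketch below is plain Python; the function fd_holds is hypothetical, and the rows are invented values that merely reuse the column names of the employee example.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are lists of column names.
    The dependency holds when equal lhs values always come with equal rhs values.
    """
    seen = {}
    for row in rows:
        key = tuple(row[column] for column in lhs)
        value = tuple(row[column] for column in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical rows; only the column names come from the employee example.
employee_details = [
    {"emp_id": 1, "emp_zip": "Z1", "emp_state": "S1", "emp_district": "D1"},
    {"emp_id": 2, "emp_zip": "Z1", "emp_state": "S1", "emp_district": "D1"},
    {"emp_id": 3, "emp_zip": "Z2", "emp_state": "S1", "emp_district": "D2"},
]

id_determines_zip = fd_holds(employee_details, ["emp_id"], ["emp_zip"])
zip_determines_address = fd_holds(
    employee_details, ["emp_zip"], ["emp_state", "emp_district"])
state_determines_zip = fd_holds(employee_details, ["emp_state"], ["emp_zip"])

print(id_determines_zip, zip_determines_address, state_determines_zip)
```

A check like this can only refute a dependency from sample data; whether a dependency is part of the design is a decision about the meaning of the attributes.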
3NF | BCNF
In 3NF there should be no transitive dependency; that is, no non-prime attribute should be transitively dependent on the candidate key. | In BCNF, for any functional dependency A->B in the relation, A should be a super key of the relation.
It is weaker than BCNF. | It is stronger than 3NF.
A relation in 3NF is already in 1NF and 2NF. | A relation in BCNF is already in 1NF, 2NF and 3NF.
The redundancy is high in 3NF. | The redundancy is comparatively low in BCNF.
A decomposition into 3NF can preserve all functional dependencies. | A decomposition into BCNF may or may not preserve all functional dependencies.
It is comparatively easier to achieve. | It is difficult to achieve.
A lossless decomposition that also preserves dependencies can be achieved in 3NF. | A lossless decomposition into BCNF may have to give up some functional dependencies.
i. For a dependency A → B, if for a single value of A multiple values of B exist, then the
table may have a multi-valued dependency.
ii. Also, a table should have at least 3 columns for it to have a multi-valued dependency.
iii. And, for a relation R(A,B,C), if there is a multi-valued dependency between A and B,
then B and C should be independent of each other.
For a table to satisfy the Fourth Normal Form, it should satisfy the following two conditions:
i. It should be in the Boyce-Codd Normal Form.
ii. And, the table should not have any Multi-valued Dependency.
Example:
Below we have a college enrolment table with columns s_id, course and hobby.
s_id course hobby
1 Science Cricket
1 Maths Hockey
2 C# Cricket
2 Php Hockey
The two records for the student with s_id 1 give rise to two more records, as shown
below: because one student has two hobbies, each hobby has to be specified along with
each of the courses.
s_id course hobby
1 Science Cricket
1 Maths Hockey
1 Science Hockey
1 Maths Cricket
In the table above, there is no relationship between the columns course and hobby;
they are independent of each other. So there is a multi-valued dependency, which leads to
unnecessary repetition of data and other anomalies as well.
To make the above relation satisfy the 4th normal form, we can decompose the table into 2
tables.
Course Table
s_id course
1 Science
1 Maths
2 C#
2 Php
Hobbies Table
s_id hobby
1 Cricket
1 Hockey
2 Cricket
2 Hockey
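A short sketch, assuming Python's built-in sqlite3 module, shows that the two decomposed tables still carry the same information: joining them on s_id reproduces every course and hobby combination for a student. The table names course and hobbies follow the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE course  (s_id INTEGER, course TEXT);
    CREATE TABLE hobbies (s_id INTEGER, hobby TEXT);
    INSERT INTO course  VALUES (1, 'Science'), (1, 'Maths'), (2, 'C#'), (2, 'Php');
    INSERT INTO hobbies VALUES (1, 'Cricket'), (1, 'Hockey'),
                               (2, 'Cricket'), (2, 'Hockey');
""")

# Each fact is stored once; the combinations are derived only when needed.
student_1 = conn.execute("""
    SELECT c.s_id, c.course, h.hobby
    FROM course AS c
    JOIN hobbies AS h ON c.s_id = h.s_id
    WHERE c.s_id = 1
    ORDER BY c.course, h.hobby
""").fetchall()

print(student_1)
```

The four joined rows for s_id 1 are the same four combinations listed in the undecomposed table.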
16. What do you mean by PL/SQL? Explain with example how function and procedure are
created in PL/SQL.
PL/SQL is a combination of SQL with the procedural features of programming
languages. It was developed by Oracle Corporation in the early 90's to enhance the
capabilities of SQL. PL/SQL is one of three key programming languages embedded in the
Oracle Database, along with SQL itself and Java.
Creating a Function:
A standalone function is created using the CREATE FUNCTION statement. The simplified
syntax for the CREATE OR REPLACE FUNCTION statement is as follows:
Where,
i. function-name specifies the name of the function.
ii. [OR REPLACE] option allows the modification of an existing function.
iii. The optional parameter list contains the name, mode and type of each parameter. IN
represents a value that will be passed in from outside and OUT represents a parameter
that will be used to return a value outside of the function.
iv. The function must contain a return statement.
v. The RETURN clause specifies the data type you are going to return from the function.
vi. function-body contains the executable part.
vii. The AS keyword is used instead of the IS keyword for creating a standalone function.
Creating a Procedure
18. Define Database. Describe in brief, various characteristics of the database approach.
A database is an organized, systematic collection of structured information, or data, typically
stored electronically in a computer system. Databases support electronic storage and
manipulation of data and make data management easy. A database is usually controlled by a
database management system (DBMS). Various characteristics of the database approach are:
i. Manages Information
A database manages all the information that we require, since information is useful for
whatever work we do. By managing information through a database, we become more
deliberate users of our data.
ii. Easy Operation Implementation
All the operations like insert, delete, update, search etc. are carried out in a flexible and
easy way. Database makes it very simple to implement these operations. A user with little
knowledge can perform these operations. This characteristic of database makes it more
powerful.
iii. Multiple Views of Database
Basically, a view is a subset of the database. A view is defined and devoted for a
particular user of the system. Different users of the system may have different views of
the same system.
iv. Data For Specific Purpose
A database is designed for data of specific purpose. For example, a database of student
management system is designed to maintain the record of student’s marks, fees and
attendance etc. This data has a specific purpose of maintaining student record.
v. It has Users of Specific Interest
A database always has some intended group of users and applications in which these user
groups are interested.
vi. Represent Some Aspects of Real World Applications
A database represents some features of real world applications. Any change in the real
world is reflected in the database.
vii. Self Describing nature
A database is self-describing: it contains not only the data but also a description of the
whole data structure, the constraints, and the variables. This distinguishes it from a
traditional file management system, in which the data definition is typically part of the
application programs themselves.
19. What is Database, Database Management System? What are the advantages and
disadvantages of DBMS?
A database is an organized, systematic collection of structured information, or data, typically
stored electronically in a computer system. Databases support electronic storage and
manipulation of data and make data management easy. A database is usually controlled by a
database management system (DBMS).
A Database Management System (DBMS) is a collection of programs that enables its users to
access databases, manipulate data, and report on and represent data. It also helps to control
access to the database. Database management systems are not a new concept: they were
first implemented in the 1960s, and Charles Bachman's Integrated Data Store (IDS) is said to
be the first DBMS in history. Over time, database technologies have evolved considerably,
while the usage and expected functionality of databases have increased immensely.
Advantages of DBMS
i. DBMS offers a variety of techniques to store & retrieve data.
ii. DBMS serves as an efficient handler to balance the needs of multiple applications using
the same data.
iii. Uniform administration procedures for data.
iv. Application programmers are never exposed to details of data representation and storage.
v. A DBMS uses various powerful functions to store and retrieve data efficiently.
vi. Offers Data Integrity and Security.
vii. The DBMS applies integrity constraints to get a high level of protection against
prohibited access to data.
viii. A DBMS schedules concurrent access to the data in such a manner that only one user can
modify the same data item at a time.
ix. Reduced Application Development Time.
Disadvantages of DBMS
i. The cost of hardware and software for a DBMS is quite high, which increases the budget of
your organization.
ii. Most database management systems are complex, so users need training to use the
DBMS.
iii. In some organizations, all data is integrated into a single database, which can be damaged
by an electrical failure or by corruption of the storage media.
iv. Use of the same program by many users at the same time sometimes leads to the loss of
some data.
v. DBMS can't perform sophisticated calculations.
20. Define Database and Database Management System. Describe in brief, the various
types of DBMS. Explain logical and physical data independence with the help of three-
schema architecture of Database System.
A database is an organized, systematic collection of structured information, or data, typically
stored electronically in a computer system. Databases support electronic storage and
manipulation of data and make data management easy. A database is usually controlled by a
database management system (DBMS).
A Database Management System (DBMS) is a collection of programs that enables its users to
access databases, manipulate data, and report on and represent data. It also helps to control
access to the database. Database management systems are not a new concept: they were
first implemented in the 1960s, and Charles Bachman's Integrated Data Store (IDS) is said to
be the first DBMS in history. Over time, database technologies have evolved considerably,
while the usage and expected functionality of databases have increased immensely. There
are four types of DBMS:
i. Hierarchical DBMS
In a hierarchical database model, data is organized in a tree-like structure. Data is stored
hierarchically (top-down or bottom-up) and represented using parent-child
relationships. In a hierarchical DBMS a parent may have many children, but each child has
only one parent.
ii. Network DBMS
The network database model allows each child to have multiple parents. It helps you to
address the need to model more complex relationships, such as the orders/parts
many-to-many relationship. In this model, entities are organized in a graph which can be
accessed through several paths.
iii. Relational DBMS
Relational DBMS is the most widely used DBMS model because it is one of the easiest.
This model is based on normalizing data in the rows and columns of tables. Data in the
relational model is stored in fixed structures and manipulated using SQL.
iv. Object-Oriented DBMS
In the object-oriented model, data is stored in the form of objects. The structures, called
classes, hold the data within them. This model defines a database as a collection of
objects that store both data member values and operations.
21. Explain the three-schema architecture of a DBMS. Differentiate between logical and
physical data independence with the help of three-schema architecture of Database
System.
The three-schema architecture describes how the data is represented or viewed by the user in
the database. This architecture is also known as the three-level architecture and is sometimes
called the ANSI/SPARC architecture. It divides the database into three levels to create a
separation between the physical database and the user application. In simple words, this
architecture hides the details of physical storage from the user. This architecture has three
levels:
i. External level
It is also called the view level. It is called “view” because several users can view their
desired data from this level, which is fetched internally from the database with the help
of the conceptual and internal level mappings. The user does not need to know schema
details such as data structures or table definitions; the user is concerned only with the
data returned to the view level after it has been fetched from the database (present at
the internal level). The external level is the “top level” of the three-level DBMS
architecture.
Data independence is a property of a DBMS that lets you change the database schema at one
level of a database system without having to change the schema at the next higher level.
Data independence helps you to keep data separated from all programs that make use of
it.
22. Describe the components of Three-Tier Architecture of Database System. What are the
roles of a Database Administrator?
The three-schema architecture describes how the data is represented or viewed by the user in
the database. This architecture is also known as the three-level architecture and is sometimes
called the ANSI/SPARC architecture. It divides the database into three levels to create a
separation between the physical database and the user application. In simple words, this
architecture hides the details of physical storage from the user. This architecture has three
levels:
i. External level
It is also called the view level. It is called “view” because several users can view their
desired data from this level, which is fetched internally from the database with the help
of the conceptual and internal level mappings. The user does not need to know schema
details such as data structures or table definitions; the user is concerned only with the
data returned to the view level after it has been fetched from the database (present at
the internal level). The external level is the “top level” of the three-level DBMS
architecture.
A Database Administrator (DBA) is the central authority for managing a database system. A
DBA is a person or a group of people responsible for managing all the activities related to
the database system. This job requires a high level of expertise. A single person can rarely
manage all the database system activities, so companies usually have a group of people who
take care of the database system. Some of the DBA's roles are:
Database performance plays an important role for any business. If users are not able to
fetch data quickly, the company may lose business. By tuning and modifying SQL
commands, a DBA can improve the performance of the database.
v. Capacity Issues
All the databases have their limits of storing data in it and the physical memory also has
some limitations. DBA has to decide the limit and capacity of database and all the issues
related to it.
23. What is a Relational Database Management System? Explain its various components
with examples.
A relational database management system (RDBMS) is a program that allows you to create,
update, and administer a relational database. Most relational database management systems
use the SQL language to access the database. An RDBMS stores data in the form of tables:
data is managed and stored in rows and columns, which are known as tuples and attributes
respectively. RDBMS is a powerful data management system and is widely used across the
world.
A relational database has following major components:
i. Table
A table is a collection of data represented in rows and columns. Each table has a name in
database.
ii. Record or Tuple
Each row of a table is known as record. It is also known as tuple.
iii. Field or Column name or Attribute
Each column represents a field of data that occurs consistently in each record of the
table.
iv. Domain
A domain is a set of permitted values for an attribute in table.
v. Instance & Schema
The design of a database is called the schema. Schema is of three types: physical schema,
logical schema and view schema. The data stored in the database at a particular moment
is called an instance.
vi. Keys
Keys play an important role in a relational database; they are used to uniquely identify
rows in a table. They also establish relationships among tables. Different types of keys are:
a. Primary Key
b. Super Key
c. Candidate Key
d. Alternate Key
e. Composite Key
f. Foreign Key
A transaction is said to follow Two Phase Locking protocol if Locking and Unlocking can
be done in two phases.
i. Growing Phase: New locks on data items may be acquired but none can be released.
ii. Shrinking Phase: Existing locks may be released but no new locks can be acquired.
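The two-phase rule can be checked mechanically for a single transaction. The sketch below is plain Python; the function follows_two_phase_locking and the ("lock", item) / ("unlock", item) encoding of actions are hypothetical choices made for illustration.

```python
def follows_two_phase_locking(actions):
    """Check one transaction's lock actions against the two-phase rule.

    actions is a list of ("lock", item) or ("unlock", item) tuples in the
    order the transaction issues them. Returns True if every lock is
    acquired before the first unlock.
    """
    shrinking = False  # becomes True once the first lock is released
    held = set()
    for kind, item in actions:
        if kind == "lock":
            if shrinking:
                return False  # new lock requested in the shrinking phase
            held.add(item)
        elif kind == "unlock":
            if item not in held:
                raise ValueError(f"unlock of {item!r}, which is not held")
            held.remove(item)
            shrinking = True
        else:
            raise ValueError(f"unknown action {kind!r}")
    return True


# Growing phase (lock A, lock B) followed by shrinking phase (unlock A, unlock B).
two_phase = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
# Lock B is requested after A was released, which breaks the rule.
not_two_phase = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]

print(follows_two_phase_locking(two_phase))
print(follows_two_phase_locking(not_two_phase))
```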
Pre-claiming protocols evaluate their operations and create a list of data items on
which they need locks. Before initiating an execution, the transaction requests the
system for all the locks it needs beforehand.
This locking protocol divides the execution phase of a transaction into three parts. In
the first part, when the transaction starts executing, it seeks permission for the locks it
requires. The second part is where the transaction acquires all the locks.
The first phase of Strict-2PL is same as 2PL. After acquiring all the locks in the first
phase, the transaction continues to execute normally.
26. What is database transactions and concurrency control? What are the properties of
transaction? Explain.
A transaction is a single logical unit of work which accesses and possibly modifies the
contents of a database. Transactions access data using read and write operations. In order to
maintain consistency in a database, before and after the transaction, certain properties are
followed. These are called ACID properties.
The process of managing the simultaneous execution of transactions in a shared database is
known as concurrency control. It ensures that database transactions are performed
concurrently and accurately to produce correct results without violating the data integrity
of the respective database.
A transaction in a database system must maintain Atomicity, Consistency, Isolation,
and Durability, commonly known as ACID properties in order to ensure accuracy,
completeness, and data integrity.
i. Atomicity
This property states that a transaction must be treated as an atomic unit, that is, either all
of its operations are executed or none. There must be no state in a database where a
transaction is left partially completed. States should be defined either before the
execution of the transaction or after the execution/abortion/failure of the transaction.
ii. Consistency
The database must remain in a consistent state after any transaction. No transaction
should have any adverse effect on the data residing in the database. If the database was in
a consistent state before the execution of a transaction, it must remain consistent after the
execution of the transaction as well.
iii. Durability
The database should be durable enough to hold all its latest updates even if the system
fails or restarts. If a transaction updates a chunk of data in a database and commits, then
the database will hold the modified data. If a transaction commits but the system fails
before the data could be written on to the disk, then that data will be updated once the
system springs back into action.
iv. Isolation
In a database system where more than one transaction is being executed simultaneously
and in parallel, the property of isolation states that every transaction will be carried out
and executed as if it were the only transaction in the system. No transaction will interfere
with any other transaction.
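Atomicity can be illustrated with a small sketch. It uses Python's built-in sqlite3 module; the account table, the names A and B, the balances and the transfer function are all made up for this illustration. The failed transfer credits one account and then fails on the debit, and the rollback undoes the credit as well.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single transaction (hypothetical helper)."""
    try:
        with conn:  # commits if the block succeeds, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst))
            # This debit violates the CHECK constraint when funds are insufficient.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))")
conn.execute("INSERT INTO account VALUES ('A', 100), ('B', 50)")
conn.commit()

ok = transfer(conn, "A", "B", 30)       # both updates are applied
failed = transfer(conn, "A", "B", 500)  # credit is undone when the debit fails
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)
```

After the failed transfer neither account has changed, so no partially completed state is visible.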
27. What are the methods of executing a transaction? Explain dirty read and incorrect
summary problem.
A transaction is a single logical unit of work which accesses and possibly modifies the
contents of a database. Transactions access data using read and write operations. In order to
maintain consistency in a database, before and after the transaction, certain properties are
followed. These are called ACID properties.
The states a transaction passes through during its execution are:
i. Active
In this state, the transaction is being executed. This is the initial state of every
transaction.
ii. Partially Committed
When a transaction executes its final operation, it is said to be in a partially committed
state.
iii. Failed
A transaction is said to be in a failed state if any of the checks made by the database
recovery system fails. A failed transaction can no longer proceed further.
iv. Aborted
If any of the checks fails and the transaction has reached a failed state, then the recovery
manager rolls back all its write operations on the database to bring the database back to
its original state where it was prior to the execution of the transaction. Transactions in
this state are called aborted. The database recovery module can select one of the two
operations after a transaction aborts:
A dirty read occurs when a transaction is allowed to read a row that has been modified by
another transaction that has not yet committed. It arises when several transactions run at
the same time and one of them reads the uncommitted changes of another; if the writing
transaction later rolls back, the reader has used a value that was never committed.
The incorrect summary problem occurs when one transaction computes a summary (aggregate)
over all the instances of a repeated data item while a second transaction updates some
instances of that data item. The resulting summary then mixes old and new values and does
not reflect a correct result.
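Both problems can be simulated without a real DBMS. The following is a minimal sketch in plain Python in which the interleaving of two transactions, T1 and T2, is written out by hand; the data items X, Y, Z and their values are assumptions made for the illustration.

```python
# --- Dirty read ---
committed = {"X": 100}
workspace_t2 = dict(committed)
workspace_t2["X"] = 40            # T2 writes X but has not committed yet
dirty_value = workspace_t2["X"]   # T1 reads T2's uncommitted value
workspace_t2 = None               # T2 rolls back; its write is discarded
# T1 now holds 40, although the committed value of X was always 100.

# --- Incorrect summary ---
accounts = {"X": 100, "Y": 100, "Z": 100}
true_total = sum(accounts.values())

running_sum = 0
running_sum += accounts["X"]      # T1 reads X before the transfer
accounts["X"] -= 50               # T2 moves 50 from X ...
accounts["Y"] += 50               # ... to Y, then commits
running_sum += accounts["Y"]      # T1 reads Y after the transfer
running_sum += accounts["Z"]      # T1 reads Z

print(dirty_value, committed["X"])
print(running_sum, true_total, sum(accounts.values()))
```

The total held in the accounts is the same before and after the transfer, but T1 counts the transferred amount twice because it read X before the update and Y after it.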
28. Explain about the steps used in query processing and query optimization techniques
with neat diagram.
Query processing is a procedure of transforming a high-level query (such as SQL) into a
correct and efficient execution plan expressed in low-level language.
It is done in the following steps:
Step-1:
Parser:
During a parse call, the parser performs the following checks:
Syntax check: verifies that the SQL is syntactically valid. Example: the statement
SELECT * FORM employee
fails this check because the keyword FROM is misspelled as FORM.
Semantic check: determines whether the statement is meaningful. Example: a
query that names a table which does not exist fails this check.
Shared Pool check: Every query is assigned a hash code. This check looks for that
hash code in the shared pool; if it is found, the database skips the additional
optimization steps and goes straight to execution.
Step-2:
Hard Parse and Soft Parse:
If the query is new and its hash code does not exist in the shared pool, the query has
to pass through additional steps, known as hard parsing. If the hash code exists, the
query skips these additional steps and passes directly to the execution engine. This is
known as soft parsing.
Hard Parse includes following steps:
Optimizer:
During the optimization stage, the database must perform a hard parse at least once for every
unique DML statement, and it performs optimization during this parse. The database never
optimizes DDL unless it includes a DML component, such as a sub-query, that requires optimization.
Row Source Generation:
The row source generator is software that receives the optimal execution plan from the
optimizer and produces an iterative execution plan that is usable by the rest of the database.
Step-3:
Execution Engine: Finally, the execution engine runs the query and displays the required result.
Query optimization is a feature of many relational database management systems and other
databases such as graph databases. The query optimizer attempts to determine the most efficient
way to execute a given query by considering the possible query plans. Steps for Query
Optimization are:
Step 1: Query Tree Generation
A query tree is a tree data structure representing a relational algebra expression. The tables of
the query are represented as leaf nodes. The relational algebra operations are represented as the
internal nodes. The root represents the query as a whole.
Step 2: Query Plan Generation
After the query tree is generated, a query plan is made. A query plan is an extended query tree
that includes access paths for all operations in the query tree. Access paths specify how the
relational operations in the tree should be performed. For example, a selection operation can
have an access path that gives details about the use of B+ tree index for selection.
Step 3: Code Generation
Code generation is the final step in query optimization. It is the executable form of the query,
whose form depends upon the type of the underlying operating system. Once the query code is
generated, the Execution Manager runs it and produces the results.
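The effect of an access path on a query plan can be observed in a small sketch. It uses SQLite through Python's sqlite3 module, so the EXPLAIN QUERY PLAN statement and its output wording are specific to SQLite and not to the database described above; the employee table and the index name idx_employee_dept are made up for the illustration.

```python
import sqlite3

def plan(conn, sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
query = "SELECT * FROM employee WHERE dept = 'Sales'"

# Without an index the only access path is a scan of the whole table.
plan_without_index = plan(conn, query)

# With an index on dept the optimizer can choose an index search instead.
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept)")
plan_with_index = plan(conn, query)

print(plan_without_index)
print(plan_with_index)
```

The query text is identical in both cases; only the access path chosen by the optimizer changes.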
29. Explain Nested Queries with examples. Define DDL and DML with example.
A Nested query or Sub-query or Inner query is a query within another SQL query and embedded
within the WHERE clause. It is used to return data that will be used in the main query as a condition
to further restrict the data to be retrieved. It can be used with the SELECT, INSERT, UPDATE, and
DELETE statements along with the operators like =, <, >, >=, <=, IN, BETWEEN, etc.
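As an example of a nested query, the sketch below runs a sub-query inside a WHERE clause using Python's sqlite3 module. The employee table, the names and the salaries are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Asha", 40000), ("Bikash", 55000), ("Chandra", 70000)],
)

# The inner query computes the average salary; the outer query uses that
# value as a condition to restrict the rows it returns.
rows = conn.execute(
    """
    SELECT name
    FROM employee
    WHERE salary > (SELECT AVG(salary) FROM employee)
    ORDER BY name
    """
).fetchall()
print(rows)
```

The inner query is evaluated first, and the outer query returns only the employees who earn more than that average.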
DDL | DML
It defines the columns (attributes) of the table. | It adds or updates the rows of the table. These rows are called tuples.
It has no further classification. | It is further classified into Procedural and Non-Procedural DML.
Basic commands present in DDL are CREATE, DROP, RENAME, ALTER, etc. | Basic commands present in DML are UPDATE, INSERT, MERGE, etc.
DDL does not use a WHERE clause in its statements. | DML uses a WHERE clause in its statements.
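The difference between DDL and DML can be seen in a short sketch that uses Python's sqlite3 module. The student table and its rows are made up for the illustration; the DDL statements change the structure of the table, while the DML statements change its rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and alter the structure of the table.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute("ALTER TABLE student ADD COLUMN marks INTEGER")

# DML: add, change and remove rows (note the WHERE clauses).
conn.execute("INSERT INTO student (roll_no, name, marks) VALUES (1, 'A', 60)")
conn.execute("INSERT INTO student (roll_no, name, marks) VALUES (2, 'B', 70)")
conn.execute("UPDATE student SET marks = 75 WHERE roll_no = 2")
conn.execute("DELETE FROM student WHERE roll_no = 1")
rows = conn.execute("SELECT roll_no, name, marks FROM student").fetchall()

# DDL again: remove the table itself, not just its rows.
conn.execute("DROP TABLE student")
remaining = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' AND name = 'student'"
).fetchall()
print(rows, remaining)
```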
30. Highlight the importance of security in DBMS and also explain about encryption
techniques.
Database security is the technique that protects and secures the database against intentional
or accidental threats. Security is important in a DBMS because a database is exposed to threats such as:
i. Deployment failure.
ii. Excessive privileges.
iii. Privilege abuse.
iv. Platform vulnerabilities.
v. Unmanaged sensitive data.
vi. Backup data exposure.
vii. Weak authentication.
viii. Database injection attacks.
Encryption is the method by which information is converted into secret code that hides the
information's true meaning. The science of encrypting and decrypting information is
called cryptography. There are two encryption techniques:
i. Symmetric Encryption
In symmetric encryption the same key is used for encryption and decryption. It is
therefore critical that a secure method is considered to transfer the key between sender
and recipient.
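The idea that one shared key both encrypts and decrypts can be shown with a toy sketch. The XOR cipher below is an assumption chosen only because it is short; it is not secure and must not be used to protect real data. The key and the message are made up.

```python
from itertools import cycle

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: XOR each byte of data with the repeating key."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

key = b"shared-secret"        # known to both sender and recipient
plaintext = b"salary=50000"

ciphertext = xor_cipher(plaintext, key)   # sender encrypts
recovered = xor_cipher(ciphertext, key)   # recipient decrypts with the same key
print(ciphertext, recovered)
```

Because the same function and the same key reverse the operation, anyone who obtains the key can read the message, which is why the key must be transferred securely.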
31. Explain the needs of database security. Differentiate between encryption and
decryption.
OR,
What is need of database security? Explain encryption and decryption with example.
Database security is the technique that protects and secures the database against intentional
or accidental threats. Database security is needed because of:
i. Deployment failure.
ii. Excessive privileges.
iii. Privilege abuse.
iv. Platform vulnerabilities.
v. Unmanaged sensitive data.
vi. Backup data exposure.
vii. Weak authentication.
viii. Database injection attacks.
Encryption | Decryption
Encryption is the process of converting a normal message into a meaningless message. | Decryption is the process of converting a meaningless message back into its original form.
It is the process which takes place at the sender's end. | It is the process which takes place at the receiver's end.
Its major task is to convert the plain text into cipher text. | Its main task is to convert the cipher text into plain text.
Any message can be encrypted with either a secret key or a public key. | The encrypted message can be decrypted with either a secret key or a private key.
In the encryption process, the sender sends the data to the receiver after encrypting it. | In the decryption process, the receiver receives the information (cipher text) and converts it into plain text.
i. Physical
The sites containing the computer systems must be secured against armed (equipped) or
surreptitious (secret) entry by intruders (persons with criminal intent).
ii. Human
Users must be authorized carefully to reduce the chance of any such user giving access to
an intruder in exchange for a bribe (backhander) or other favors.
iii. Operating System
No matter how secure the database system is, weakness in operating system security may
serve as a means of unauthorized access to the database.
iv. Network
Since almost all database systems allow remote access through terminals or networks,
software-level security within the network software is as important as physical security,
both on the Internet and in networks private to an enterprise.
v. Database System
Some database-system users may be authorized to access only a limited portion of the
database. Other users may be allowed to issue queries, but may be forbidden to modify
the data. It is responsibility of the database system to ensure that these authorization
restrictions are not violated.
The MAC system doesn't permit end users (data owners) to have a say in the entities having
access in a unit or facility. In this mode, users or owners do not enjoy the privilege of
deciding who can access their files, rather it is enforced by the system wide set of privilege
rules (i.e. System Owner manages the access control). MAC however, is more secure than
Discretionary Access Control (DAC). This model is more static and more complex compared
to DAC. Special types of the Unix operating system are based on MAC model.
34. What are database keys? Explain all the types of keys you are aware of.
A key in a DBMS is an attribute or set of attributes that helps you identify a row (tuple) in a
relation (table). Keys also let you find the relationship between two tables. A key
uniquely identifies a row in a table by a combination of one or more columns in that table.
The different types of keys in DBMS are:
i. Candidate Key
A candidate key is a minimal set of attributes that can uniquely identify any data row
in the table. A table may have several candidate keys.
ii. Primary Key
The primary key is selected from one of the candidate keys and becomes the identifying
key of a table. It can uniquely identify any data row of the table.
iii. Super Key
A super key is any set of attributes that can uniquely identify any data row in the table.
Every candidate key (and therefore the primary key) is a super key, and adding extra
attributes to a key still gives a super key.
iv. Composite Key
If any single attribute of a table is not capable of being the key i.e. it cannot identify a
row uniquely, then we combine two or more attributes to form a key. This is known as a
composite key.
v. Secondary Key
Only one of the candidate keys is selected as the primary key. The rest of them are known
as secondary keys.
vi. Foreign Key
A foreign key is an attribute (or set of attributes) in one table that refers to the primary key of another table.
Hence, the foreign key is useful in linking together two tables. Data should be entered in
the foreign key column with great care, as wrongly entered data can invalidate the
relationship between the two tables.
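The difference between super keys and candidate keys can be checked on sample data with a short sketch in plain Python. The student rows and the helper names is_super_key and candidate_keys are made up for the illustration. A key is really a rule about every valid state of the table, so sample data can only show that an attribute set fails to be a key; it cannot prove that it always will be one.

```python
from itertools import combinations

def is_super_key(rows, attrs):
    """True if no two rows share the same values for attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def candidate_keys(rows, attributes):
    """Minimal super keys: checked smallest first, skipping supersets of known keys."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(k) <= set(attrs) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_super_key(rows, attrs):
                keys.append(attrs)
    return keys

rows = [
    {"roll_no": 1, "email": "a@example.com", "name": "A", "dept": "CS"},
    {"roll_no": 2, "email": "b@example.com", "name": "B", "dept": "CS"},
    {"roll_no": 3, "email": "c@example.com", "name": "A", "dept": "IT"},
    {"roll_no": 4, "email": "d@example.com", "name": "A", "dept": "CS"},
]

keys = candidate_keys(rows, ["roll_no", "email", "name", "dept"])
print(keys)
print(is_super_key(rows, ("roll_no", "name")), is_super_key(rows, ("name",)))
```

Here roll_no and email are both candidate keys; one would be chosen as the primary key and the other would remain a secondary key. The pair (roll_no, name) is a super key but not a candidate key, because it is not minimal.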
35. Describe schemas and instances with example. What is specialization in ER model?
A database schema is the skeleton structure that represents the logical view of the entire
database. It defines how the data is organized and how the relations among them are
associated. It formulates all the constraints that are to be applied on the data.
Example: Suppose our database, named school, has a table teacher that must store the
name, dob and doj of each teacher. We design the structure as:
Teacher table
name: String
doj: date
dob: date
An instance is the collection of information stored in the database at a particular moment.
Instances can be changed by CRUD operations such as the addition and deletion of data. Note
that a search query does not make any change to the instance.
Example: Suppose the table teacher in our database School has 50 records today, so the
current instance of the database has 50 records. If we add another fifty records tomorrow,
the instance tomorrow will have a total of 100 records. The data held at each such moment
is called an instance.
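Figure: Specialization (top-down approach) drawn as IS-A hierarchies. Surviving labels: Developer and Analyzer under one IS-A node; Account specialized into Saving and Current. Further diagram labels: Center, Course, offer, enquire.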
higher level. Data independence helps you to keep data separated from all programs that
make use of it. In DBMS there are two types of data independence:
39. What is relational algebra? How is Cartesian Product different from Natural Join?
Explain with suitable example.
Relational algebra is a procedural query language, which takes instances of relations as input
and yields instances of relations as output. It uses operators to perform queries. An operator
can be either unary or binary. They accept relations as their input and yield relations as their
output. Relational algebra is performed recursively on a relation and intermediate results are
also considered relations.
The Cartesian Product in DBMS is an operation that pairs every tuple of one relation with every
tuple of another, merging their columns. On its own, a Cartesian product is rarely a meaningful operation.
Student Table
Roll_No Name
1 A
2 B
3 C
Marks Table
Roll_No Marks
2 70
3 50
4 85
Natural Join joins two tables based on same attribute name and datatypes. The
resulting table will contain all the attributes of both the tables but only one copy of
each common column.
Student Table
Roll_No Name
1 A
2 B
3 C
Marks Table
Roll_No Marks
2 70
3 50
4 85
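The difference between the two operations on the Student and Marks tables can be worked out with a minimal sketch in plain Python. Relations are modelled as lists of dictionaries; the function names and the Student./Marks. prefixes used to keep both Roll_No columns apart are assumptions made for the illustration.

```python
from itertools import product

student = [
    {"Roll_No": 1, "Name": "A"},
    {"Roll_No": 2, "Name": "B"},
    {"Roll_No": 3, "Name": "C"},
]
marks = [
    {"Roll_No": 2, "Marks": 70},
    {"Roll_No": 3, "Marks": 50},
    {"Roll_No": 4, "Marks": 85},
]

def cartesian_product(r, s, r_name, s_name):
    """Pair every tuple of r with every tuple of s, keeping all columns."""
    return [
        {**{f"{r_name}.{k}": v for k, v in a.items()},
         **{f"{s_name}.{k}": v for k, v in b.items()}}
        for a, b in product(r, s)
    ]

def natural_join(r, s):
    """Keep only pairs that agree on the common attributes; one copy of each."""
    common = set(r[0]) & set(s[0]) if r and s else set()
    return [
        {**a, **b}
        for a, b in product(r, s)
        if all(a[k] == b[k] for k in common)
    ]

cross = cartesian_product(student, marks, "Student", "Marks")
joined = natural_join(student, marks)
print(len(cross))
print(joined)
```

The Cartesian product returns every possible pairing (3 x 3 rows with four columns), while the natural join keeps only the rows whose Roll_No matches in both tables and shows Roll_No once.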
40. Define with examples the selection, projection and Cartesian Product operators.
Selection (σ) is used to select required tuples of the relations. The SELECT operation is
used for selecting a subset of the tuples according to a given selection condition. Sigma(σ)
symbol denotes it. It is used as an expression to choose tuples which meet the selection
condition. Select operator selects tuples that satisfy a given predicate.
Example:
σ sales > 50000 (Customers)
Output: Selects tuples from Customers where sales is greater than 50000
Projection (π) is used to project required column data from a relation. The projection
eliminates all attributes of the input relation but those mentioned in the projection list. The
projection operation defines a relation that contains a vertical subset of the input relation. The pi (π)
symbol denotes it.
Example:
CustomerName Status
Google Active
Amazon Active
Apple Inactive
Alibaba Active
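Selection and projection can be sketched in plain Python by treating a relation as a list of dictionaries. The customer names and statuses come from the table above; the Sales figures and the function names select and project are made up for the illustration.

```python
customers = [
    {"CustomerName": "Google", "Status": "Active", "Sales": 80000},
    {"CustomerName": "Amazon", "Status": "Active", "Sales": 45000},
    {"CustomerName": "Apple", "Status": "Inactive", "Sales": 60000},
    {"CustomerName": "Alibaba", "Status": "Active", "Sales": 30000},
]

def select(relation, predicate):
    """Selection (sigma): keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, attrs):
    """Projection (pi): keep only the listed attributes and drop duplicate tuples."""
    result = []
    for row in relation:
        reduced = {a: row[a] for a in attrs}
        if reduced not in result:
            result.append(reduced)
    return result

high_sales = select(customers, lambda row: row["Sales"] > 50000)
name_status = project(customers, ["CustomerName", "Status"])
statuses = project(customers, ["Status"])
print([row["CustomerName"] for row in high_sales])
print(name_status)
print(statuses)
```

Selection returns a horizontal subset (fewer rows, all columns), while projection returns a vertical subset (fewer columns) and removes duplicate tuples, as the Status projection shows.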
The Cartesian Product in DBMS is an operation that pairs every tuple of one relation with every
tuple of another, merging their columns. On its own, a Cartesian product is rarely a meaningful
operation. However, it becomes meaningful when it is followed by other operations. It is also
called Cross Product or Cross Join.
Example: Consider the two tables given below:
Student Table
Roll_No Name
1 A
2 B
3 C
Marks Table
Roll_No Marks
2 70
3 50
4 85
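Student × Marks
Roll_No Name Roll_No Marks
1 A 2 70
1 A 3 50
1 A 4 85
2 B 2 70
2 B 3 50
2 B 4 85
3 C 2 70
3 C 3 50
3 C 4 85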
41. What is relational algebra? How does it differ from relational calculus?
Relational algebra is a procedural query language, which takes instances of relations as input
and yields instances of relations as output. It uses operators to perform queries. An operator
can be either unary or binary. They accept relations as their input and yield relations as their
output. Relational algebra is performed recursively on a relation and intermediate results are
also considered relations.
The Data Model is defined as an abstract model that organizes data description, data
semantics, and consistency constraints of data. The data model emphasizes what data is
needed and how it should be organized instead of what operations will be performed on data.
Data Model is like an architect's building plan, which helps to build conceptual models and
set a relationship between data items.
There are mainly three different types of data models: conceptual data models, logical data
models, and physical data models, and each one has a specific purpose. The data models are
used to represent the data and how it is stored in the database and to set the relationship
between data items.
i. One-to-One: When only one instance of an entity is associated with the relationship, then
it is known as one to one relationship.
ii. One-to-many: One entity from entity set A can be associated with more than one entity of
entity set B; however, an entity from entity set B can be associated with at most one
entity of entity set A.
iii. Many-to-one: More than one entity from entity set A can be associated with at most one
entity of entity set B, however an entity from entity set B can be associated with more
than one entity from entity set A.
iv. Many-to-many: One entity from A can be associated with more than one entity from B
and vice versa.
A participation constraint defines the number of times an object in an object class can
participate in a connected relationship set. Every connection of a relationship set must have a
participation constraint. However, participation constraints do not apply to relationships. The
two types of participation are:
i. Total Participation
Each entity is involved in the relationship. Total participation is represented by double
lines.
ii. Partial participation
Not all entities are involved in the relationship. Partial participation is represented by
single lines.
Constraints can also be added to an existing relation by using the command
alter table table-name add constraint, where constraint can be any constraint on the relation.
When such a command is executed, the system first ensures that the relation satisfies the
specified constraint. If it does, the constraint is added to the relation; if not, the command is rejected.
System security is a huge topic, one that could easily be the subject of multiple courses at the
undergraduate or graduate level, as well as being an ongoing focus of research. There are
three key concepts involved in understanding the security mechanisms of SQL.
i. "An authorization ID is a character string that is obtained by the database manager when
a connection is established between the database manager and a process."
iii. An authority (also called privilege) is the right to perform a certain operation on an
object. There are different kinds of privileges that apply to objects at different levels.
45. Represent the ER Model as equivalent relations. Use proper representation for keys.
ER model stands for an Entity-Relationship model. It is a high-level data model. This model
is used to define the data elements and relationship for a specified system. It develops a
conceptual design for the database. It also develops a very simple and easy to design view of
data. In ER modeling, the database structure is portrayed as a diagram called an entity-
relationship diagram.
i. Entity
An entity may be any object, class, person or place. In the ER diagram, an entity can be
represented as rectangles. Consider an organization as an example- manager, product,
employee, department etc. can be taken as an entity.
a. Weak Entity
An entity that depends on another entity is called a weak entity. The weak entity doesn't
contain any key attribute of its own. The weak entity is represented by a double
rectangle.
ii. Attribute
The attribute is used to describe the property of an entity. An ellipse is used to represent an
attribute. For example, id, age, contact number, name, etc. can be attributes of a student.
a. Key Attribute
b. Composite Attribute
c. Multi-valued Attribute
An attribute can have more than one value. These attributes are known as a multi-
valued attribute. The double oval is used to represent multi-valued attribute. For
example, a student can have more than one phone number.
d. Derived Attribute
An attribute that can be derived from other attribute is known as a derived attribute. It
can be represented by a dashed ellipse. For example, A person's age changes over
time and can be derived from another attribute like Date of birth.
iii. Relationship
A relationship is used to describe the relation between entities. Diamond or rhombus is
used to represent the relationship.
a. One-to-One Relationship
When only one instance of an entity is associated with the relationship, then it is
known as a one-to-one relationship. For example, a female can marry one male, and
a male can marry one female.
b. One-to-many relationship
When only one instance of the entity on the left and more than one instance of an
entity on the right are associated with the relationship, it is known as a one-to-many
relationship. For example, a scientist can make many inventions, but each
invention is made by one specific scientist.
c. Many-to-one relationship
When more than one instance of the entity on the left, and only one instance of an
entity on the right associates with the relationship then it is known as a many-to-one
relationship. For example, a student enrolls in only one course, but a course can have
many students.
d. Many-to-many relationship
When more than one instance of the entity on the left and more than one instance of
an entity on the right are associated with the relationship, it is known as a many-to-many
relationship. For example, an employee can be assigned to many projects, and a project
can have many employees.
46. What are integrity constraints? What are domain constraints? Explain referential
integrity constraint with its function.
OR,
Explain referential integrity constraint with its function.
Integrity constraints are used to ensure accuracy and consistency of the data in a relational
database. Data integrity is handled in a relational database through the concept of referential
integrity.
Each table has a certain set of columns, and each column allows only one type of data, based on
its data type. The column does not accept values of any other data type. Domain
constraints are user-defined data types, and we can define them like this:
Domain Constraint = data type + Constraints (NOT NULL / UNIQUE / PRIMARY KEY /
FOREIGN KEY / CHECK / DEFAULT).
We wish to ensure that a value that appears in one relation for a given set of attributes also
appears for a certain set of attributes in another relation. This condition is called referential
integrity. Referential integrity ensures that the values for a set of attributes in one
relation also appear for the corresponding set of attributes in another relation.
Referential integrity constraints work on the concept of Foreign Keys.
Referential integrity (RI) is a term used with relational databases to describe the integrity
of the business relationships represented in the schema. It ensures that relationships
between tables remain consistent.
A referential integrity constraint is also known as a foreign key constraint. A foreign key is
a key whose values are derived from the primary key of another table. The table from
which the values are derived is known as the master or referenced table, and the table in
which values are inserted accordingly is known as the child or referencing table. In other
words, the table containing the foreign key is called the child table, and
the table containing the primary key/candidate key is called the referenced or parent
table. In the relational model, a candidate key can be
defined as a set of attributes, which can have zero or more attributes.
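The function of a referential integrity constraint can be sketched with Python's sqlite3 module. The department (parent) and employee (child) tables and their rows are made up for the illustration. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON, and the sketch assumes an SQLite build with foreign key support, which is the usual case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

# Parent (referenced) table and child (referencing) table.
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, "
    "name TEXT, "
    "dept_id INTEGER REFERENCES department(dept_id))")

conn.execute("INSERT INTO department VALUES (10, 'Sales')")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 10)")  # parent row exists

try:
    # Department 99 does not exist, so this child row would be an orphan.
    conn.execute("INSERT INTO employee VALUES (2, 'Bikash', 99)")
    orphan_inserted = True
except sqlite3.IntegrityError:
    orphan_inserted = False

try:
    # Department 10 is still referenced by an employee row.
    conn.execute("DELETE FROM department WHERE dept_id = 10")
    parent_deleted = True
except sqlite3.IntegrityError:
    parent_deleted = False

print(orphan_inserted, parent_deleted)
```

Both the insert of an orphan child row and the delete of a referenced parent row are rejected, so the relationship between the two tables stays consistent.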