
DBMS Notes:

1. Basic Intro: a) 2-tier, b) 3-tier


i. 3-schema architecture
ii. 3 levels of abstraction
iii. Data independence
iv. Network
v. Hierarchical
vi. ER
vii. Object-oriented
viii. Relational
2. E-R model
a) Attributes relationship
b) Types of relationship
i. One to one
ii. One to many
iii. Many to many
3. Basics of keys
a) Primary key
b) Candidate key
c) Super key
d) Foreign key
4. Normalization
a) Closure method
b) Functional dependencies
c) 1st normal form
d) 2nd NF
e) 3rd NF
f) 4NF
g) 5NF
h) BCNF
5. Transaction control and concurrency
a) Acid properties
b) R-w
c) W-r
d) W-w
e) Conflict serializability
f) Recoverability
g) 2 phase locking
h) Time stamp protocol
6. SQL (Structured Query Language) and Relational Algebra
a) DDL commands
b) DCL
c) DML
d) Constraint
e) Aggregate function
f) Joins
g) Nested Query
i. In, not in, any, all
7. Indexing
a) Primary
b) Cluster
c) Secondary
8. B tree, B+ tree

Sir’s Note: the marked questions from sir's book PDF, the notes sir gave, and this copy; all of these must be studied together.

1. Write down the difference between logical and physical data independence.


Answer:

Parameters: Physical Data Independence vs. Logical Data Independence

Basics: Physical data independence is concerned mainly with how a set of data gets stored in a given system. Logical data independence is concerned mainly with changes to the definition of the data in a system or to its structure as a whole.

Ease of retrieving: With physical data independence, we can easily retrieve the data. With logical data independence, retrieving is very difficult, because the data depends mainly on its logical structure and not on its physical location.

Ease of achieving: Achieving physical data independence is much easier as compared to logical data independence.

Degree of changes required: Changes made at the physical level need not be made at the application level. Changes made at the logical level usually need to be made at the application level as well.

Internal modification: We may or may not need modifications at the internal level to improve the performance of the system. Modifications at the logical level are required whenever we want to change the database structure.

Type of schema: Physical data independence concerns the internal schema. Logical data independence concerns the conceptual schema.

Examples: For physical data independence, altering the compression techniques, changing storage devices, or changing the hashing algorithms. For logical data independence, adding, deleting, or modifying any attribute in a system.

2. Distinguish between file management system and database management system.


Answer:

File management system vs. database management system:

1. What it is: A file management system is used to manage and organise the files stored on the hard disk of the computer. A DBMS is software used to store and retrieve the user's data.

2. Redundant data: Redundant data is present in a file system. In a DBMS there is no presence of redundant data.

3. Query processing: Query processing is not so efficient in a file system. Query processing is efficient in a DBMS.

4. Data consistency: Data consistency is low in a file system. In a DBMS, due to the process of normalisation, data consistency is high.

5. Complexity: A file system is less complex and does not support complicated transactions. A DBMS involves more complexity in managing the data, but makes complicated transactions easier to implement.

6. Security: A file system offers less security. A DBMS supports more security mechanisms.

7. Expense: A file system is less expensive in comparison to a DBMS. A DBMS has a higher cost than the file system.

8. Crash recovery: A file system does not support crash recovery. In a DBMS, a crash recovery mechanism is highly supported.

3. Write the difference between procedural and non-procedural DML.


Answer:

Procedural Languages vs. Non-Procedural Languages

Procedural languages are command-driven or statement-oriented; non-procedural languages are fact-oriented.

A program in a procedural language specifies what is to be accomplished and instructs the computer exactly how the evaluation is carried out. A program in a non-procedural language specifies only what is to be done and does not state exactly how the result is to be evaluated.

Procedural languages are used for application and system programming. Non-procedural languages are used in RDBMSs, expert systems, natural language processing, and education.

Procedural languages are complex. Non-procedural languages are simpler than procedural ones.

Procedural languages are imperative programming languages. Non-procedural languages are declarative programming languages.

In a procedural language, the textual context or execution sequence must be considered. In a non-procedural language, there is no need to consider textual context or execution sequence.

As an example, in a procedural setting sorting is completed by defining in a C++ program all of the steps of some sorting algorithm for a computer with a C++ compiler; the computer, after translating the C++ program into machine code or some interpretive intermediate code, follows the instructions and produces the sorted list. In a non-procedural language, it is essential only to define the features of the sorted list; from this description, the non-procedural language system can produce the sorted list.

Machine efficiency of procedural languages is good. Logic programs that use only resolution face serious problems of machine efficiency.

The procedural paradigm leads to a large number of possible connections between functions and data if there are many functions and many global data items. There are no such connections in the non-procedural paradigm.

Examples of procedural languages are C, Ada, Pascal, C++, etc. Examples of non-procedural languages are Prolog, LISP, SQL, Scheme, etc.

4. Describe different types of attributes.


Answer:

In DBMS, there are various types of attributes available:

 Simple Attributes
 Composite Attributes
 Single Valued Attributes
 Multi-Valued Attributes
 Derived Attributes
 Complex Attributes (Rarely used attributes)
 Key Attributes
 Stored Attributes

Now we will study each of these types of attributes in DBMS in detail, along with their diagrams and examples.

Simple Attributes

Simple attributes in an ER model diagram are independent attributes that can't be classified further and also, can't
be subdivided into any other component. These attributes are also known as atomic attributes.

Example Diagram:

As we can see in the above example, Student is an entity represented by a rectangle, and it consists of
attributes: Roll_no, class, and Age. Also, there is a point to be noted that we can't further subdivide the Roll_no
attribute and even the other two attributes into sub-attributes. Hence, they are known as simple attributes of the
Student entity.

Composite Attributes

Composite attributes have opposite functionality to that of simple attributes as we can further subdivide composite
attributes into different components or sub-parts that form simple attributes. In simple terms, composite
attributes are composed of one or more simple attributes.

Example Diagram
As we can see in the above example, Address is a composite attribute represented by an elliptical shape, and it
can be further subdivided into many simple attributes like Street, City, State, Country, Landmark, etc.

Single-Valued Attributes

Single-valued attributes are those attributes that consist of a single value for each entity instance and can't store
more than one value; for example, a person has only one date of birth.

Example Diagram:

As we can see in the above example, Student is an entity instance, and it consists of attributes: Roll_no, Age,
DOB, and Gender. These attributes can store only one value from a set of possible values. Each entity instance can
have only one Roll_no, which is a unique, single DOB by which we can calculate age and also fixed gender. Also,
we can't further subdivide these attributes, and hence, they are simple as well as single-valued attributes.

Multi-Valued Attributes

Multi-valued attributes have the opposite functionality to single-valued attributes, and as the name suggests,
multi-valued attributes can take up and store more than one value at a time for an entity instance from a set of
possible values. These attributes are represented by concentric (double) ellipses, and we can also use curly braces
{ } to list the values of multi-valued attributes.

Example Diagram:

As we can see in the above example, the Student entity has four attributes: Roll_no and Age are simple as well as
single-valued attributes, as discussed above, but Mob_no and Email_id, represented by concentric ellipses, are
multi-valued attributes. Each student in the real world can provide more than one email id as well as mobile
contact number, and therefore we need these attributes to be multi-valued so that they can store multiple values at
a time for an entity instance.

Derived Attributes

Derived attributes are those attributes whose values can be derived from the values of other attributes. They are
always dependent upon other attributes for their value.
For example, as discussed above, DOB is a single-valued attribute and remains constant for an entity
instance. From DOB we can derive the Age attribute, which changes every year, and we can easily calculate the age
of a person from his/her date of birth value. Hence, the Age attribute here is a derived attribute, derived from
the DOB single-valued attribute.

Example Diagram:

Derived attributes are always represented by dashed or dotted elliptical shapes.

Key Attributes

Key attributes are special types of attributes that act as the primary key for an entity and they can uniquely identify
an entity from an entity set. The values that key attributes store must be unique and non-repeating.

Example Diagram:

As we can see in the above example, we can say that the Roll_no attribute of the Student entity is not only simple
and single-valued attribute but also, a key valued attribute as well. Roll_no of a student will always be unique to
identify the student. Also note that the Gender and Age of two or more persons can be same and overlapping in
nature and obviously, we can't identify a student on the basis of them. Hence, gender and age are not key-valued
attributes.

Complex Attributes

Complex attributes are rarely used in DBMS. They are formed by the combination of multi-valued and composite
attributes. These attributes always have many sub-sections in their values.

Example Diagram:
As we can see in the above example, Address_EmPhone (which represents Address, Email, and
Phone number altogether) is a complex attribute. Email and Phone number are multi-valued attributes
while Address is a composite attribute which is further subdivided as House number, Street,
City & State. This combination of multi-valued and composite attributes altogether forms a complex
attribute.

Stored Attributes

Values of stored attributes remain constant and fixed for an entity instance, and they help in
deriving the derived attributes. For example, the Age attribute can be derived from the Date of
Birth attribute, and the Date of Birth attribute has a fixed and constant value throughout the life of
an entity. Hence, the Date of Birth attribute is a stored attribute.

Example Diagram:

As we can see in the above image, there are different types of attributes in DBMS, each suited to a different kind of
field of an entity instance.

5. What is closure and normal cover?

Answer: see pages 129 and 131 of sir's book.

6. What is inclusion dependency?

Answer:

Let's say we take two relations, namely R and S, created from two entity sets in such a way that
every entity in R is also an entity in S. Inclusion dependency occurs when projecting R on its key attributes gives a relation
that is contained in the relation obtained by projecting S on its key attributes.

Let's name the relations: R as teacher and S as student, and take the attribute teacher_id, so we can write:

 teacher.teacher_id --> student.teacher_id

teacher:

teacher_id (primary key) | name | department
1 | Ram Kumar | DBMS

student:

student_id | name | teacher_id (foreign key) | age
1 | Rahul Singh | 1 | 18

teacher_id is the primary key of the teacher table and a foreign key of the student table, so every teacher_id value
appearing in the student table must also be present in the teacher table.

So this foreign key concept makes the inclusion dependency possible.

7. What is the 2-phase locking protocol? How does it guarantee serializability?


Answer:
The Two-Phase Locking Protocol, also known as the 2PL protocol, is a method of concurrency control in DBMS that
ensures serializability by applying locks to the transaction's data, which blocks other transactions from accessing the same
data simultaneously. The Two-Phase Locking protocol helps to eliminate concurrency problems in DBMS.
This locking protocol divides the execution of a transaction into three parts:

 In the first phase, when the transaction begins to execute, it seeks permission for the locks it needs.
 In the second part, the transaction obtains all the locks. When a transaction releases its first lock,
the third phase starts.
 In this third phase, the transaction cannot demand any new locks. Instead, it only releases the acquired
locks.

The Two-Phase Locking protocol allows each transaction to make a lock or unlock request in two steps:

 Growing Phase: In this phase transaction may obtain locks but may not release any locks.
 Shrinking Phase: In this phase, a transaction may release locks but not obtain any new lock

It is true that the 2PL protocol offers serializability. However, it does not ensure that deadlocks never happen.
When a deadlock does occur, deadlock detectors search for it and resolve it by rolling the affected transactions
back to their initial states.

Two phase locking is of two types −

Strict two phase locking protocol

A transaction can release a shared lock after the lock point, but it cannot release any exclusive lock
until the transaction commits. This protocol creates a cascadeless schedule.
Cascading schedule: in such a schedule one transaction is dependent on another transaction, so if one has
to roll back then the other has to roll back as well.

Rigorous two phase locking protocol

A transaction cannot release any lock either shared or exclusive until it commits.
The 2PL protocol guarantees serializability, but cannot guarantee that deadlock will not happen.

Example

Let T1 and T2 be two transactions.


T1 = A + B and T2 = B + A

T1 T2

Lock-X(A) Lock-X(B)

Read A; Read B;

Lock-X(B) Lock-X(A)
Here,
Lock-X(B): T1 cannot execute Lock-X(B) since B is locked by T2.
Lock-X(A): T2 cannot execute Lock-X(A) since A is locked by T1.
In the above situation T1 waits for B and T2 waits for A, and the waiting never ends. Neither
transaction can proceed unless one of them releases its lock voluntarily. This situation is called
deadlock.
The wait for graph is as follows −

Wait-for graph: it is used in the deadlock detection method. A node is created for each transaction,
and an edge Ti → Tj is created if Ti is waiting to lock an item locked by Tj. A cycle in the WFG indicates that a
deadlock has occurred. The WFG is created at regular intervals.
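
To make the detection step concrete, here is a small sketch of my own (not from the notes): the wait-for graph is a dictionary of edges, and deadlock detection is a depth-first search for a cycle. The transaction names match the T1/T2 example above.

# Minimal sketch: deadlock detection via cycle search in a wait-for graph.
# Edges Ti -> Tj mean "Ti waits for a lock held by Tj" (hypothetical data).
def has_cycle(wfg):
    visited, on_stack = set(), set()

    def dfs(node):
        visited.add(node)
        on_stack.add(node)
        for nxt in wfg.get(node, ()):
            if nxt in on_stack:          # back edge => cycle => deadlock
                return True
            if nxt not in visited and dfs(nxt):
                return True
        on_stack.discard(node)
        return False

    return any(dfs(n) for n in wfg if n not in visited)

# The schedule above: T1 waits for B (held by T2), T2 waits for A (held by T1).
wfg = {"T1": ["T2"], "T2": ["T1"]}
print(has_cycle(wfg))  # True -> deadlock detected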

8. What are the various states of a transaction? Explain with a state diagram.
Answer:

A transaction is a unit of database processing which contains a set of operations, for example deposit of money,
balance enquiry, reservation of tickets, etc.
Every transaction starts with the delimiter begin transaction and terminates with the delimiter end transaction. The set
of operations within these two delimiters constitutes one transaction.
main()
{
   begin transaction
   ...
   end transaction
}
A transaction is divided into states to handle various situations such as failure. It passes through various states
during its lifetime. The state of a transaction is defined by the current activity it is performing.
At a particular instant of time, a transaction can be in one of the following states −

 Active − Transaction is executing.

 Failed − Transaction fails to complete successfully.

 Aborted − changes made by the transaction are cancelled (rolled back).

 Partially committed − the final statement of the transaction has been executed.

 Committed − the transaction completes its execution successfully.

 Terminated − the transaction is finished.


The states of transaction are diagrammatically represented as follows −

A transaction will terminate either when it commits or when it is aborted.
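
As a rough illustration (my own sketch, not part of the notes), the state diagram can be encoded as a transition table that rejects moves the diagram does not allow:

# Minimal sketch of the transaction state diagram as a transition table.
TRANSITIONS = {
    "active":              {"partially committed", "failed"},
    "partially committed": {"committed", "failed"},
    "failed":              {"aborted"},
    "committed":           {"terminated"},
    "aborted":             {"terminated"},
}

def move(state, new_state):
    # Raise on a transition the state diagram does not allow.
    if new_state not in TRANSITIONS.get(state, set()):
        raise ValueError(f"illegal transition {state} -> {new_state}")
    return new_state

s = "active"
s = move(s, "partially committed")
s = move(s, "committed")        # transaction completed successfully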


9. What is a cascadeless schedule? Why is cascadelessness of a schedule desired?
Answer:

Cascadeless schedule: see page 163 of the book.

10. Explain log based recovery.


Answer:

o The log is a sequence of records. Log of each transaction is maintained in some stable storage so that if

any failure occurs, then it can be recovered from there.


o If any operation is performed on the database, then it will be recorded in the log.

o But the process of storing the logs should be done before the actual transaction is applied in the database.

Let's assume there is a transaction to modify the City of a student. The following logs are written for this
transaction.

o When the transaction is initiated, it writes a 'start' log:

<Tn, Start>

o When the transaction modifies the City from 'Noida' to 'Bangalore', another log record is written to the file:

<Tn, City, 'Noida', 'Bangalore'>

o When the transaction is finished, it writes another log record to indicate the end of the transaction:

<Tn, Commit>

There are two approaches to modify the database:

1. Deferred database modification:

o The deferred modification technique occurs if the transaction does not modify the database until it has

committed.
o In this method, all the logs are created and stored in the stable storage, and the database is updated when

a transaction commits.

2. Immediate database modification:

o The Immediate modification technique occurs if database modification occurs while the transaction is

still active.
o In this technique, the database is modified immediately after every operation. It follows an actual

database modification.

Recovery using Log records

When the system is crashed, then the system consults the log to find which transactions need to be undone and
which need to be redone.

1. If the log contains both the record <Ti, Start> and the record <Ti, Commit>, then the transaction Ti

needs to be redone.

2. If the log contains the record <Ti, Start> but contains neither <Ti, Commit> nor <Ti, Abort>, then

the transaction Ti needs to be undone.
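
To make the two rules concrete, here is a small sketch of my own (the log format is an assumption): scan the log once and classify each transaction as redo or undo.

# Minimal sketch: decide which transactions to redo/undo from a log.
# Log records are (txn, kind) pairs; kinds: "start", "commit", "abort", "update".
log = [("T1", "start"), ("T1", "update"), ("T1", "commit"),
       ("T2", "start"), ("T2", "update")]          # T2 never finished

started   = {t for t, k in log if k == "start"}
finished  = {t for t, k in log if k in ("commit", "abort")}
committed = {t for t, k in log if k == "commit"}

redo = committed                      # start + commit present -> redo
undo = started - finished             # start but no commit/abort -> undo
print(redo, undo)                     # {'T1'} {'T2'}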

11. Show that if a relation schema is in BCNF then it is in 3NF, but the reverse is not true.

Answer: see pages 137 to 139 of the book PDF.

12. What are metadata and data dictionary?

Answer: A data dictionary is an integral part of a database. It holds information about the database and

the data that it stores, called metadata.

Metadata is data about the data.


It is the self-describing part of a database.
It holds information about each data element in the database,

such as names, types, ranges of values, access authorization, and which application programs
use the data.
Metadata is used by developers to develop the programs and queries that manage and
manipulate the data.
Types of dictionaries
In general, DBMS data dictionaries are of two types. These dictionaries are as follows −
Active data dictionary

It is managed automatically by the data management software.


It is always consistent with the current structure of the database.
Passive data dictionary

It is mainly used for documentation purposes.


It is managed by the user of the system and is modified manually by the user.

Data in a DBMS data dictionary


The data in a DBMS data dictionary is explained below −

Schema: tables, columns, constraints, foreign keys, indexes, sequences.
Programs: views, user-defined functions, stored procedures, triggers.
Security: users, user groups, privileges.
Physical implementation: partitions, files, backups, sizes of tables and indexes in KB, number of rows in tables.
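
As a concrete illustration (my own example, not from the notes), SQLite keeps an active data dictionary in its built-in sqlite_master catalog, which the engine updates automatically whenever the schema changes:

import sqlite3

# SQLite maintains its catalog (sqlite_master) automatically -- an
# "active" data dictionary in the sense described above.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
con.execute("CREATE INDEX idx_name ON student(name)")

for type_, name, sql in con.execute(
        "SELECT type, name, sql FROM sqlite_master"):
    print(type_, name)   # table student / index idx_name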

13. Explain the terms candidate key, primary key, foreign key, super key.

Answer:
o Keys play an important role in the relational database.

o It is used to uniquely identify any record or row of data from the table. It is also used to establish and

identify relationships between tables.

For example, ID is used as a key in the Student table because it is unique for each student. In the PERSON table,
passport_number, license_number, SSN are keys since they are unique for each person.

Types of keys:

1. Primary key

o It is the first key used to identify one and only one instance of an entity uniquely. An entity can contain

multiple keys, as we saw in the PERSON table. The key which is most suitable from those lists becomes

a primary key.
o In the EMPLOYEE table, ID can be the primary key since it is unique for each employee. In the

EMPLOYEE table, we can even select License_Number and Passport_Number as primary keys since

they are also unique.


o For each entity, the primary key selection is based on requirements and developers.
2. Candidate key

o A candidate key is an attribute or set of attributes that can uniquely identify a tuple.

o The attribute sets that can uniquely identify tuples, other than the one chosen as the primary key, are the

remaining candidate keys (also called alternate keys). The candidate keys are as strong as the primary key.

For example: In the EMPLOYEE table, id is best suited for the primary key. The rest of the attributes, like SSN,
Passport_Number, License_Number, etc., are considered a candidate key.

3. Super Key

A super key is an attribute set that can uniquely identify a tuple. A super key is a superset of a candidate key.
For example: in the EMPLOYEE table above, for (EMPLOYEE_ID, EMPLOYEE_NAME), the names of two
employees can be the same, but their EMPLOYEE_ID can't be the same. Hence, this combination can also be a key.

The super keys would be EMPLOYEE_ID, (EMPLOYEE_ID, EMPLOYEE_NAME), etc.

4. Foreign key

o Foreign keys are the column of the table used to point to the primary key of another table.

o Every employee works in a specific department in a company, and employee and department are two

different entities. So we can't store the department's information in the employee table. That's why we

link these two tables through the primary key of one table.
o We add the primary key of the DEPARTMENT table, Department_Id, as a new attribute in the

EMPLOYEE table.
o In the EMPLOYEE table, Department_Id is the foreign key, and both the tables are related.

14. Short note on Armstrong's axioms.


Answer: see page 128 of the book.
15. What is a functional dependency?
Answer:

Functional Dependency

A functional dependency is a relationship that exists between two attribute sets. It typically exists
between the primary key and a non-key attribute within a table.

X → Y

The left side of the FD is known as the determinant; the right side is known as the dependent.

For example:

Assume we have an employee table with attributes: Emp_Id, Emp_Name, Emp_Address.

Here the Emp_Id attribute can uniquely identify the Emp_Name attribute of the employee table, because if we
know the Emp_Id, we can tell the employee name associated with it.

This functional dependency can be written as:

Emp_Id → Emp_Name

We say that Emp_Name is functionally dependent on Emp_Id.

Types of Functional dependency

1. Trivial functional dependency

o A → B has trivial functional dependency if B is a subset of A.


o The following dependencies are also trivial like: A → A, B → B

Example:

Consider a table with two columns, Employee_Id and Employee_Name.
{Employee_Id, Employee_Name} → Employee_Id is a trivial functional dependency, since
Employee_Id is a subset of {Employee_Id, Employee_Name}.
Also, Employee_Id → Employee_Id and Employee_Name → Employee_Name are trivial dependencies too.

2. Non-trivial functional dependency

o A → B has a non-trivial functional dependency if B is not a subset of A.


o When A ∩ B is empty, A → B is called completely non-trivial.

Example:

ID → Name
Name → DOB
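
The closure method referred to in question 5 can be sketched in a few lines of Python (my own sketch; FDs are assumed to be given as (left-set, right-set) pairs): keep adding the right-hand side of any FD whose left-hand side is already inside the closure, until nothing changes.

# Minimal sketch of attribute closure X+ under a set of FDs.
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            if left <= result and not right <= result:
                result |= right          # left side covered: pull in right side
                changed = True
    return result

fds = [({"Emp_Id"}, {"Emp_Name"}), ({"Emp_Name"}, {"DOB"})]
print(closure({"Emp_Id"}, fds))   # {'Emp_Id', 'Emp_Name', 'DOB'}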

16. What is data dictionary? What do you mean by unary operation in relational algebra?
Answer:

Data Dictionary is made up of two words, data which means the collected information through multiple sources,
and dictionary meaning the place where all this information is made available.

A data dictionary is a crucial part of a relational database as it provides additional information about the
relationships between multiple tables in a database. The data dictionary in DBMS helps the user to arrange data in
a neat and well-organized way, thus preventing data redundancy.

Below is a data dictionary describing the table containing employee details.

Attribute Name Data Type Max Field Size Description isRequired


Employee ID Integer 10 A unique ID for each Employee Yes
Name Text 25 Name of the Employee Yes
Date of Birth DateTime 10 Date of Birth of the Employee Yes
Mobile Number Integer 10 Contact Number of the Employee Yes

Some advantages of using a data dictionary are:

1. Data models provide very little information about the database, so a data dictionary is essential for
proper knowledge of the entities, relationships, and attributes present in a data model.

2. The data dictionary provides consistency by reducing data redundancy in the collection and use of data
across the members of a team.

3. The data dictionary provides structured analysis and design tools by enforcing the use of data standards.
Data standards are the set of rules that govern the way data is collected, recorded, and represented.

4. Using a data dictionary helps to define the naming conventions used in a model.
Types of Data Dictionary in DBMS

There are mainly two types of data dictionary in a database management system:

1. Integrated Data Dictionary


2. Stand Alone Data Dictionary

1. Integrated Data Dictionary

Every relational database has an Integrated Data Dictionary contained within the DBMS. This integrated data
dictionary acts as a system catalog that is accessed and updated by the relational database. In older databases, they
did not include an integrated data dictionary, so in that case, the database administrator had to use Stand Alone
Data Dictionary. In DBMS, an Integrated Data Dictionary can bind metadata to data.

The Integrated Data Dictionary can be further classified into two types:

 Active: An active data dictionary is updated automatically by the DBMS whenever any changes are
made to the database. This is also known as a self-updating dictionary as it keeps the information up-to-
date.

 Passive: In contrast to an active dictionary, a passive dictionary needs to be updated manually whenever
any changes are made to the database. This type of data dictionary is difficult to handle as it requires
proper handling. Otherwise, the database and the data dictionary will get unsynchronized.

2. Stand Alone Data Dictionary

In DBMS, this type of data dictionary is very flexible as it allows the Database Administrator to define and
manage all the confidential data. It doesn't matter whether the data is computerized or not. A stand-alone data
dictionary allows database designers to interact with end-users regardless of the data dictionary format.

There is no standard format for a data dictionary. Given below are some of the common elements:

1. Data elements: the data dictionary stores the definition of all data elements, such as name,
datatype, storage format, and validation rules.

2. Tables: all information regarding a table, such as the user who created it, the number of rows
and columns, and the dates on which it was created and accessed.

3. Indexes: indexes for the defined database tables are stored in the data dictionary. For each index, the DBMS
stores the index name, the attributes it uses, its location and characteristics, and its date of creation.

4. Programs: programs defined to access the database, including reports, application and screen formats,
SQL queries, etc., are also stored in the data dictionary.

5. Relationships between data elements: the data dictionary stores the type of relationship, for example
whether it is compulsory or optional, the cardinality of the relationship, its connectivity, etc.

6. Administrators and end users: the data dictionary stores all the information about the administrators
along with the end users.

The metadata stored in the data dictionary acts like a monitor that tracks the use of the database
and the allocation of permissions to access the database by the users.

How to Create a Data Dictionary?

As discussed above, most businesses rely on database management systems having an integrated data dictionary as
they are updated automatically and are easy to maintain. Documentation for a data dictionary can be generated in
various types of relational databases like MySQL, SQL Server, Oracle, etc.

While creating a stand-alone data dictionary, the database administrator can take the help of a template in SQL
Server, Oracle, or even Microsoft Excel.

Various notations used to create a data dictionary are:

Composition: notation "=", meaning "is composed of".
Sequence: notation "+", meaning AND.
Selection: notation "[ | ]", meaning OR.
Repetition: notation "{ }n", meaning n repetitions.
Parentheses: notation "( )", representing optional data.
Comment: notation "*…*", used to define a comment.

Challenges with a Data Dictionary

In the above sections we discussed the advantages of a data dictionary, but dealing with one has its
challenges.

A data dictionary can be difficult and time-consuming to create if no data preparation has been done.
Without proper data preparation, a data dictionary might standardize only part of a database, while data
preparation for large-scale data can be a huge maintenance burden for little value and can quickly become outdated.

17. Discuss different levels of views.


Answer:
A DBMS presents data at three levels of views (the three-schema architecture): the external level, which gives
each user or application its own view of the data; the conceptual level, which describes the logical structure of the
entire database; and the internal level, which describes how the data is physically stored.
18. What is a weak entity set? Explain with a suitable example.
Answer:

Weak entity set:

 An entity set that does not have a primary key is referred to as a weak entity set.
 The existence of a weak entity set depends on the existence of an identifying entity set:
o It must relate to the identifying entity set via a total, one-to-many relationship set from the
identifying to the weak entity set.
o Identifying relationship depicted using a double diamond.
 The discriminator (or partial key) of a weak entity set is the set of attributes that distinguishes among all
the entities of a weak entity set.
 The primary key of a weak entity set is formed by the primary key of the strong entity set on which the
weak entity set is existence dependent, plus the weak entity set’s discriminator.
 We depict a weak entity set by double rectangles. We underline the discriminator of a weak entity set
with a dashed line.
 Example:

o payment-number – discriminator of the payment entity set


o Primary key for payment – (loan-number, payment-number).

 Note: the primary key of the strong entity set is not explicitly stored with the weak entity set, since it is
implicit in the identifying relationship.
 If loan-number were explicitly stored, payment could be made a strong entity, but then the relationship
between payment and loan would be duplicated by an implicit relationship defined by the attribute loan-
number common to payment and loan.

19. ER diagram of a hospital management system.


Answer:
20. What is a transaction? Explain transaction states.
Answer: see pages 157 and 158 of the book PDF.
21. What do you mean by shadow paging?
Answer:
Shadow paging is one of the techniques used to recover from failure. Recovery means getting back
information that was lost; shadow paging helps maintain database consistency in case of failure.

Concept of shadow paging

Let us see the concept of shadow paging step by step −


Step 1 − Page is a segment of memory. Page table is an index of pages. Each table entry points to a page
on the disk.
Step 2 − Two page tables are used during the life of a transaction: the current page table and the shadow
page table. Shadow page table is a copy of the current page table.
Step 3 − When a transaction starts, both the tables look identical, the current table is updated for each
write operation.
Step 4 − The shadow page is never changed during the life of the transaction.
Step 5 − When the current transaction is committed, the shadow page entry becomes a copy of the
current page table entry and the disk block with the old data is released.
Step 6 − The shadow page table is stored in non-volatile memory. If the system crash occurs, then the
shadow page table is copied to the current page table.
The shadow paging is represented diagrammatically as follows −

Advantages

The advantages of shadow paging are as follows −

 No need for log records.

 No undo/redo algorithm needed.

 Recovery is faster.

Disadvantages

The disadvantages of shadow paging are as follows −


 Data is fragmented or scattered.

 Garbage collection problem: database pages containing old versions of modified data need to be garbage
collected after every transaction.

 Concurrent transactions are difficult to execute.
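
A toy sketch of the mechanism (my own, with a Python dictionary standing in for the disk): writes go to fresh pages and only the current page table is changed, so commit is a single atomic switch of page tables.

# Toy sketch of shadow paging: page tables are lists of page ids.
disk = {0: "old A", 1: "old B"}           # page id -> contents
shadow_table = [0, 1]                     # survives a crash (non-volatile)

# Transaction start: the current table is a copy of the shadow table.
current_table = list(shadow_table)

# A write goes to a *fresh* page; only the current table is updated.
disk[2] = "new A"
current_table[0] = 2                      # shadow_table still points at page 0

def commit():
    global shadow_table
    shadow_table = list(current_table)    # atomic switch = commit point

# On a crash before commit(), recovery just reloads shadow_table: the
# old pages 0 and 1 are untouched, so no undo/redo is needed.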

22. Deadlock handling. [book PDF, page 169]


Answer:

Deadlock Handling

Ostrich Algorithm

The Ostrich Algorithm is an approach to handling deadlock that involves ignoring the deadlock and
pretending that it will never occur. The approach gets its name from the supposed behaviour of an ostrich,
which is "to stick one's head in the sand and pretend there is no problem". Windows and UNIX-based
systems use this approach to handle deadlocks.

But now you might ask: why ignore the deadlock?

Because deadlock is a very rare case and the cost of handling it is very high. You
might have encountered a situation when your system hung and a restart was needed to fix it.
In this case the operating system ignores the deadlock, as the time required to handle it is
higher than the time to reboot Windows. Rebooting is the preferred choice, considering the rarity of
deadlocks on Windows.

Deadlock Avoidance

 Deadlock avoidance is a technique for detecting a possible deadlock in advance. Methods like the wait-
for graph can be used in smaller databases to detect deadlocks, but in the case of larger
databases deadlock prevention measures have to be used.
 When a database gets stuck in a state of deadlock, it is preferable to detect and avoid the situation
rather than aborting or rebooting the database server, which wastes both time and resources.

Deadlock Detection

During a database transaction, if a task waits indefinitely to obtain a resource, the DBMS has to
check whether that task is in a state of deadlock. A resource scheduler is
used to detect a deadlock: it keeps track of the resources allocated to a specific
transaction and requested by another transaction.

 Wait-For Graph: this method of detecting deadlocks involves creating a graph based on
transactions and the locks they have acquired (a lock is a way of preventing multiple transactions from
accessing any resource simultaneously). If the resulting graph contains a cycle/loop, it means a
deadlock exists.

The DBMS creates a wait-for graph for every transaction/task that is in a waiting state and keeps
checking whether any of the graphs contains a cycle.
For two transactions T1 and T2 in a deadlock situation, the wait-for graph contains the cycle T1 → T2 → T1.

Deadlock Prevention

1. Avoiding one or more of the Coffman conditions can lead to the prevention of a
deadlock. In larger databases, preventing deadlocks is much more feasible than handling
them after the fact.
2. The DBMS is made to efficiently analyze every database transaction to see whether it can cause a
deadlock; if any transaction can lead to a deadlock, then that transaction is never
executed.
3. Wait-Die scheme: when a transaction requests a resource that is already locked by some other
transaction, the DBMS checks the timestamps of both transactions and makes the older
transaction wait until the resource is available, while a younger requester is aborted (dies) and
restarted later with its original timestamp.
4. Wound-Wait scheme: when an older transaction demands a resource that is already locked by a
younger transaction (a transaction that is initiated later), the younger transaction is forced to stop its
processing and release the locked resource for the older transaction's own execution. The younger
transaction is restarted after a delay, but its timestamp remains the same. If a younger
transaction requests a resource held by an older one, the younger transaction is made to wait until the
older one releases the resource.

23. What are the advantages of normalization?


Answer:
The goal of relational database design is to generate a set of relation schemas that allows us
to store information without any redundant (repeated) data, while still allowing us to retrieve information
easily and efficiently.
For this we use normal forms as the set of rules; applying these rules and regulations is known
as normalization.
Database normalization is a data design and organization process, applied to data structures based
on their functional dependencies and primary keys, that helps build relational databases.
Normalization helps by:
• Minimizing data redundancy.
• Minimizing insertion, deletion, and update anomalies.
• Reducing input and output delays.
• Reducing memory usage.
• Supporting a single consistent version of the truth.
• Being an industry best practice for table and entity design.
Uses: database normalization is a useful tool for the requirements analysis and data modelling
processes of software development.
In short, normalization is the process of removing all of these undesirable problems by using functional
dependencies and keys.

24. How does BCNF differ from 3rd normal form?


Answer:
Parameters: 3NF vs. BCNF

Strength: 3NF is comparatively less strong than BCNF; BCNF is comparatively much stronger than 3NF.

Functional dependencies: the functional dependencies of 3NF already hold in 2NF and 1NF; the functional
dependencies of BCNF already hold in 3NF, 2NF, and 1NF.

Redundancy: 3NF has comparatively higher redundancy; BCNF has comparatively lower redundancy.

Dependency preservation: in the case of 3NF, all functional dependencies are preserved; in the case of BCNF,
not all functional dependencies are preserved.

Lossless decomposition: lossless decomposition is comparatively much easier to achieve with 3NF and much
harder to achieve with BCNF.

25. Difference between strict 2PL and conservative 2PL.


Answer:

1. In conservative 2PL, a transaction has to acquire locks on all the data items it requires before the
transaction begins execution. In strict 2PL, a transaction acquires locks on data items whenever it requires
them (only in the growing phase) during its execution.

2. Conservative 2PL has no growing phase; strict 2PL has a growing phase.

3. Conservative 2PL has a shrinking phase; strict 2PL has a partial shrinking phase.

4. Conservative 2PL ensures that the generated schedule is serializable and deadlock-free. Strict 2PL
ensures that the generated schedule is serializable, recoverable, and cascadeless.

5. Conservative 2PL does not ensure recoverable and cascadeless schedules. Strict 2PL does not ensure
deadlock-free schedules.

6. Conservative 2PL does not ensure strict schedules; strict 2PL ensures that the generated schedule is strict.

7. Conservative 2PL is less popular and is not used in practice; strict 2PL is the most popular variation of 2PL.

8. In conservative 2PL, a transaction can read a value written by an uncommitted transaction. In strict 2PL,
a transaction only reads values written by committed transactions.

26. Discuss the advantages and disadvantages of using the DBMS approach as compared to using a
conventional file system.
Answer: see pages 5 to 9 of the book PDF.

27. Define the concept of generalization, specialization and aggregation.


Answer:

Generalization
Generalization is the process of extracting the common attributes or properties of two or more entities into a
single, more general entity. The entity that is created contains the common features. Generalization is a bottom-up
process.
For example, we can have three sub-entities Car, Truck, and Motorcycle, and these three entities can be generalized into one
general superclass, Vehicle.

It is a form of abstraction in which two or more entities (subclasses) having common characteristics are
generalized into one single entity (superclass) at a higher level, hiding all the differences.

Specialization

Specialization is the process of identifying subsets of an entity set that share distinguishing characteristics. It breaks an
entity into multiple entities, from the higher level (superclass) to the lower level (subclasses), based on some
distinguishing characteristics of the entities in the superclass.
It is a top down approach in which we first define the super class and then sub class and then their attributes and
relationships.

Aggregation

Aggregation represents a relationship between a whole object and its components. Using aggregation we can express
relationships among relationships. Aggregation shows a 'has-a' or 'is-part-of' relationship between entities, where
one represents the 'whole' and the other the 'part'.

Consider a ternary relationship Works_On between Employee, Branch and Manager. Now the best way to model
this situation is to use aggregation, So, the relationship-set, Works_On is a higher level entity-set. Such an entity-
set is treated in the same manner as any other entity-set. We can create a binary relationship, Manager, between
Works_On and Manager to represent who manages what tasks.
28. Explain the terms 'partial functional dependency' and 'non-transitive dependency' with examples.
Answer:
Partial Dependency

Partial Dependency occurs when a non-prime attribute is functionally dependent on part of a candidate key.
The 2nd Normal Form (2NF) eliminates the Partial Dependency.
Let us see an example −

Example

<StudentProject>

StudentID ProjectNo StudentName ProjectName


S01 199 Katie Geo Location
S02 120 Ollie Cluster Exploration
In the above table, we have partial dependency; let us see how.
The prime key attributes are StudentID and ProjectNo, and:

StudentID = unique ID of the student
StudentName = name of the student
ProjectNo = unique ID of the project
ProjectName = name of the project

As stated, for partial dependency the non-prime attributes, i.e. StudentName and ProjectName, must be functionally
dependent on part of a candidate key.
StudentName can be determined by StudentID alone, which makes the relation partially dependent.
ProjectName can be determined by ProjectNo alone, which makes the relation partially dependent.
Therefore, the <StudentProject> relation violates 2NF in normalization and is considered a bad database
design.
To remove Partial Dependency and violation on 2NF, decompose the tables −
<StudentInfo>

StudentID ProjectNo StudentName


S01 199 Katie
S02 120 Ollie
<ProjectInfo>

ProjectNo ProjectName
199 Geo Location
120 Cluster Exploration
Now the relation is in the 2nd normal form of database normalization.
A transitive dependency arises when X → Y and Y → Z hold, so that Z depends on the key X only through the
non-key attribute Y. A dependency is non-transitive when the attribute is determined by the key directly, with no
such intermediate non-key attribute. The 3rd Normal Form (3NF) eliminates transitive dependencies.
29. With suitable examples show how recovery in a database system can be done using a LOG file with:
i) immediate updation
ii) deferred updation.

Answer:

1. Deferred database modification:

o The deferred modification technique occurs if the transaction does not modify the database until it has

committed.
o In this method, all the logs are created and stored in the stable storage, and the database is updated when

a transaction commits.
2. Immediate database modification:

o The Immediate modification technique occurs if database modification occurs while the transaction is

still active.
o In this technique, the database is modified immediately after every operation. It follows an actual

database modification.

30. What is metadata?


Answer:
A data dictionary is an integral part of a database. It holds information about the database and the
data that it stores, called metadata.
 Metadata is data about the data.
 It is the self-describing nature of databases.
 It holds information about each data element in the database, such as names, types, ranges of values,
access authorization, and which application programs use the data.
 Metadata is used by developers to develop the programs and queries that manage and manipulate the data.

31. What is a foreign key?


Answer:
Foreign keys link data in one table to the data in another table. A foreign key column in a table points
to a column with unique values in another table (often the primary key column) to create a way of
cross-referencing the two tables. If a column is assigned a foreign key, each row of that
column must contain a value that exists in the ‘foreign’ column it references. The referenced (i.e.
“foreign”) column must contain only unique values – often it is the primary key of its table.

For a tangible example, let's look at an orders table in our database. The user_id column here
corresponds to the user_id column in the users table, and the product_sku column corresponds to
the product_sku column in the books table.

When we’re setting up this table, it would make sense to add foreign key rules to
both orders.user_id and orders.product_sku:

 orders.user_id should reference users.user_id


 orders.product_sku should reference books.product_sku

Using these foreign keys saves us from having to store the same data repeatedly – we don’t have to
store the user’s name in the orders table, because we can use orders.user_id to reference that user’s
unique row in users.user_id to get their name and other information about them.

But the real purpose of foreign keys is that they add a restriction: entries to the table with a foreign
key must have a value that corresponds with the ‘foreign’ table column.

This restriction is called a foreign key constraint. Let’s take a look at foreign key constraints in more
detail.
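
Here is a small runnable illustration of a foreign key constraint (my own example, using SQLite with the users and orders tables described above; note that SQLite enforces foreign keys only when the pragma is switched on):

import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only if asked
con.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT)")
con.execute("""CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    user_id  INTEGER REFERENCES users(user_id))""")

con.execute("INSERT INTO users VALUES (1, 'Ada')")
con.execute("INSERT INTO orders VALUES (100, 1)")       # ok: user 1 exists
try:
    con.execute("INSERT INTO orders VALUES (101, 99)")  # no user 99
except sqlite3.IntegrityError as e:
    print("rejected by the foreign key constraint:", e)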

32. Explain the timestamp-based protocol for concurrency control.


Answer:

Timestamp Ordering Protocol

o The Timestamp Ordering Protocol is used to order transactions based on their timestamps. The order

of the transactions is simply the ascending order of their creation times.


o The older transaction has higher priority, which is why it executes first. To determine the timestamp of

a transaction, this protocol uses system time or a logical counter.


o The lock-based protocol is used to manage the order between conflicting pairs of transactions at

execution time, but timestamp-based protocols start working as soon as a transaction is created.
o Let's assume there are two transactions, T1 and T2. Suppose transaction T1 entered the system at

time 007 and transaction T2 entered the system at time 009. T1 has the higher priority, so it

executes first, as it entered the system first.


o The timestamp ordering protocol also maintains the timestamps of the last 'read' and 'write' operations on

each data item.

Basic Timestamp ordering protocol works as follows:

1. Check the following condition whenever a transaction Ti issues a Read (X) operation:

o If W_TS(X) >TS(Ti) then the operation is rejected.

o If W_TS(X) <= TS(Ti) then the operation is executed.

o Timestamps of all the data items are updated.

2. Check the following condition whenever a transaction Ti issues a Write(X) operation:

o If TS(Ti) < R_TS(X) then the operation is rejected.

o If TS(Ti) < W_TS(X) then the operation is rejected and Ti is rolled back otherwise the operation is

executed.

Where,

TS(TI) denotes the timestamp of the transaction Ti.

R_TS(X) denotes the Read time-stamp of data-item X.

W_TS(X) denotes the Write time-stamp of data-item X.

Advantages and disadvantages of the TO protocol:

o The TO protocol ensures serializability, since the precedence graph has edges only from older to younger

transactions and is therefore acyclic.

o The TO protocol ensures freedom from deadlock, since no transaction ever waits.

o But the schedule may not be recoverable and may not even be cascade-free.
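
The two rules above translate almost directly into code. A minimal sketch of my own, keeping R_TS and W_TS per data item in dictionaries:

# Minimal sketch of the basic timestamp ordering checks.
R_TS, W_TS = {}, {}    # per-item read / write timestamps

def read(ti_ts, x):
    if W_TS.get(x, 0) > ti_ts:
        return "reject"                    # Ti is too old: roll back
    R_TS[x] = max(R_TS.get(x, 0), ti_ts)
    return "ok"

def write(ti_ts, x):
    if ti_ts < R_TS.get(x, 0) or ti_ts < W_TS.get(x, 0):
        return "reject"                    # Ti is too old: roll back
    W_TS[x] = ti_ts
    return "ok"

print(write(7, "X"), read(9, "X"), write(7, "X"))  # ok ok reject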

33. Wait-Die and Wound-Wait protocols for deadlock prevention.


Answer:
In a multi-process system, deadlock is an unwanted situation that arises in a shared resource environment, where a
process indefinitely waits for a resource that is held by another process.
For example, assume a set of transactions {T0, T1, T2, ...,Tn}. T0 needs a resource X to complete its task.
Resource X is held by T1, and T1 is waiting for a resource Y, which is held by T2. T2 is waiting for resource Z,
which is held by T0. Thus, all the processes wait for each other to release resources. In this situation, none of the
processes can finish their task. This situation is known as a deadlock.
Deadlocks are not healthy for a system. In case a system is stuck in a deadlock, the transactions involved in the
deadlock are either rolled back or restarted.

Deadlock Prevention

To prevent any deadlock situation in the system, the DBMS aggressively inspects all the operations, where
transactions are about to execute. The DBMS inspects the operations and analyzes if they can create a deadlock
situation. If it finds that a deadlock situation might occur, then that transaction is never allowed to be executed.
There are deadlock prevention schemes that use timestamp ordering mechanism of transactions in order to
predetermine a deadlock situation.
Wait-Die Scheme
In this scheme, if a transaction requests to lock a resource (data item), which is already held with a conflicting lock
by another transaction, then one of the two possibilities may occur −
 If TS(Ti) < TS(Tj) − that is Ti, which is requesting a conflicting lock, is older than Tj − then Ti is allowed to
wait until the data-item is available.

 If TS(Ti) > TS(Tj) − that is, Ti is younger than Tj − then Ti dies. Ti is restarted later with a random delay but
with the same timestamp.

This scheme allows the older transaction to wait but kills the younger one.
Wound-Wait Scheme
In this scheme, if a transaction requests to lock a resource (data item), which is already held with conflicting lock
by some another transaction, one of the two possibilities may occur −
 If TS(Ti) < TS(Tj), then Ti forces Tj to be rolled back − that is Ti wounds Tj. Tj is restarted later with a
random delay but with the same timestamp.

 If TS(Ti) > TS(Tj), then Ti is forced to wait until the resource is available.
This scheme, allows the younger transaction to wait; but when an older transaction requests an item held by a
younger one, the older transaction forces the younger one to abort and release the item.
In both the cases, the transaction that enters the system at a later stage is aborted.
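
Since both schemes compare only timestamps, each fits in a one-line decision function. A small sketch of my own (smaller timestamp = older transaction):

# Minimal sketch: what happens when `requester` wants a lock held by `holder`.
# Timestamps: smaller value = older transaction.
def wait_die(requester_ts, holder_ts):
    return "wait" if requester_ts < holder_ts else "die (restart later)"

def wound_wait(requester_ts, holder_ts):
    return "wound holder (holder aborts)" if requester_ts < holder_ts else "wait"

print(wait_die(7, 9))      # older requester waits
print(wait_die(9, 7))      # younger requester dies
print(wound_wait(7, 9))    # older requester wounds the younger holder
print(wound_wait(9, 7))    # younger requester waits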

Deadlock Avoidance

Aborting a transaction is not always a practical approach. Instead, deadlock avoidance mechanisms can be used to
detect any deadlock situation in advance. Methods like "wait-for graph" are available but they are suitable for only
those systems where transactions are lightweight having fewer instances of resource. In a bulky system, deadlock
prevention techniques may work well.
Wait-for Graph
This is a simple method available to track if any deadlock situation may arise. For each transaction entering into
the system, a node is created. When a transaction Ti requests for a lock on an item, say X, which is held by some
other transaction Tj, a directed edge is created from Ti to Tj. If Tj releases item X, the edge between them is
dropped and Ti locks the data item.
The system maintains this wait-for graph for every transaction waiting for some data items held by others. The
system keeps checking if there's any cycle in the graph.

Here, we can use any of the two following approaches −


 First, do not allow any request for an item, which is already locked by another transaction. This is not always
feasible and may cause starvation, where a transaction indefinitely waits for a data item and can never
acquire it.

 The second option is to roll back one of the transactions. It is not always feasible to roll back the younger
transaction, as it may be more important than the older one. With the help of a suitable algorithm, a transaction
is chosen to be aborted. This transaction is known as the victim and the process is known as victim
selection.

34. What is an integrity constraint?


Answer:

Data integrity means that the data contained in the database


is both accurate and consistent. Integrity is enforced through constraints: consistency rules that the
database system must not violate.
Most database applications have certain integrity constraints that must hold for the data. A
DBMS should provide capabilities for defining and enforcing these constraints. The simplest type of
integrity constraint is specifying a data type for each data item.

35. What is lossless decomposition?


Answer:

What is Lossless Decomposition in DBMS?


The decomposition of a given relation X is known as a lossless decomposition when the X decomposes into two
relations X1 and X2 in a way that the natural joining of X1 and X2 gives us the original relation X in return.

Uses of Lossless Decomposition in DBMS


There are two types of decompositions in DBMS, lossless and lossy decomposition. The process of lossless
decomposition helps in the removal of data redundancy from a database while still preserving the initial/original
data.

How is a Decomposition Lossless?


In the case of lossless decomposition, one selects the common attribute. Here, the criteria used for the selection of
a common attribute as this attribute has to be a super key or a candidate key in either relation X1, relation X2, or
either of them.
In simpler words, the decomposition of the relation X into the relations X1 and X2 will be a lossless join
decomposition in DBMS when a minimum of one of these functional dependencies is in F+ (Functional
dependency closure).
X1 ∩ X2 → X1
OR
X1 ∩ X2 → X2

Conditions Required for Lossless Decomposition in DBMS


Let us consider a relation X. In case we decompose this relation into relation X1 and relation X2 sub-parts. This
decomposition will be referred to as a lossless decomposition in case it satisfies these statements:

 If we union the sub relations X1 and X2, then it should consist of all the attributes available before the
decomposition in the original relation X.
 The intersections of X1 and X2 can never be Null. There must be a common attribute in the sub relation.
This common attribute must consist of some unique data/information.
Here, the common attribute needs to be the super key of the sub relations, either X1 or X2.
In this case,
X = (P, Q, R)
X1 = (P, Q)
X2 = (Q, R)
The relation X here consists of three attributes P, Q, and R. The relation X here decomposes into two separate
relations X1 and X2. Thus, each of these X1 and X2 both have two attributes. The common attribute among each
of these is Q.
Remember that the value present in column Q has to be unique. In case it consists of a duplicate value, then a
lossless-join decomposition would not be possible here.

Lossless Decomposition in DBMS Example

Example 1
Consider a relation X with the following raw data:
X (P, Q, R)

P   Q   R
37  25  16
29  18  35
16  39  28

This relation decomposes into the following sub-relations, X1 and X2:
X1 (P, Q)

P   Q
37  25
29  18
16  39

X2 (Q, R)

Q   R
25  16
18  35
39  28

Let us now check the condition that a lossless-join decomposition must satisfy: the natural join of
the sub-relations X1 and X2 must generate the same result as the relation X.
X1 ⋈ X2 = X
Here, we get the following result:
X (P, Q, R)

P   Q   R
37  25  16
29  18  35
16  39  28

This relation is identical to the original relation X. Thus, this decomposition can be considered a lossless-join
decomposition in DBMS.

Example 2
Let us take a look at another example:
<Cand_Info>

Cand_ID   Cand_Name   Cand_Age   Cand_Location   Sec_ID   Sec_Name
G0091     Robin       25         Canada          Sec_1    HR
G0081     Ted         29         New Jersey      Sec_2    Finance
G0071     Barney      27         Las Vegas       Sec_3    Operations

Let us decompose the table mentioned above into the following tables:
<Cand_Details>

Cand_ID   Cand_Name   Cand_Age   Cand_Location
G0091     Robin       25         Canada
G0081     Ted         29         New Jersey
G0071     Barney      27         Las Vegas

<Sec_Details>

Sec_ID   Cand_ID   Sec_Name
Sec_1    G0091     HR
Sec_2    G0081     Finance
Sec_3    G0071     Operations

Let us now apply a natural join to the tables mentioned above. The result generated is:

Cand_ID   Cand_Name   Cand_Age   Cand_Location   Sec_ID   Sec_Name
G0091     Robin       25         Canada          Sec_1    HR
G0081     Ted         29         New Jersey      Sec_2    Finance
G0071     Barney      27         Las Vegas       Sec_3    Operations

Thus, the relation mentioned above has a lossless decomposition. In simpler words, no information is lost
here.
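
The same check can be run directly in SQL. The following is a minimal sketch, assuming the two decomposed tables are created as shown; NATURAL JOIN pairs rows on the common column Cand_ID and reproduces the original <Cand_Info> relation, which is exactly the lossless-join property.

CREATE TABLE Cand_Details (
    Cand_ID       VARCHAR(10) PRIMARY KEY,
    Cand_Name     VARCHAR(50),
    Cand_Age      INT,
    Cand_Location VARCHAR(50)
);

CREATE TABLE Sec_Details (
    Sec_ID   VARCHAR(10),
    Cand_ID  VARCHAR(10) REFERENCES Cand_Details(Cand_ID),
    Sec_Name VARCHAR(50)
);

-- Rejoining the decomposed tables on the common attribute Cand_ID
-- yields the original relation back, with no rows lost or invented.
SELECT *
FROM Cand_Details NATURAL JOIN Sec_Details;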
34. Define the concurrency problems - dirty read, non-repeatable read and phantom.
Answer:
•	Dirty read: a transaction reads a value written by another transaction that has not yet committed; if that writer rolls back, the reader has acted on data that never officially existed (sketched below).
•	Non-repeatable read: a transaction reads the same row twice and gets different values, because another transaction updated and committed the row between the two reads.
•	Phantom read: a transaction runs the same query twice and sees a different set of rows, because another transaction inserted or deleted rows satisfying the query's condition in between.
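
The dirty read above can be demonstrated with a short SQL sketch. The accounts table is hypothetical, and the exact SET TRANSACTION syntax varies slightly across DBMSs.

-- Session 1:
BEGIN;
UPDATE accounts SET bal = 500 WHERE id = 1;    -- change not yet committed

-- Session 2:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT bal FROM accounts WHERE id = 1;         -- may see 500: a dirty read

-- Session 1:
ROLLBACK;                                      -- the 500 that Session 2 read never existed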
35. What are the roles of analysis in DBMS?
36. When do we call a relation in 3NF?
Answer: A relation is in 3NF if it is in 2NF and no non-prime attribute is transitively dependent on any candidate key. Equivalently, for every functional dependency X -> A that holds on the relation, either X is a super key or A is a prime attribute (part of some candidate key).
37. Define deadlock prevention system. [pg - after 169 page]
38. What is indexing?
Answer:

Indexing is used to quickly retrieve particular data from the database. Formally, we can define indexing as a
technique that uses data structures to optimize the searching time of a database query. Indexing reduces the number
of disk accesses required to reach a particular piece of data by internally creating an index table.

Indexing is achieved by creating an index table, or index.

An index usually consists of two columns which form a key-value pair. The two columns of the index table (i.e., the
key-value pair) contain copies of selected columns of the tabular data of the database.

Here, the Search Key contains a copy of the Primary Key or a Candidate Key of the database table. Generally, we
store the selected primary or candidate keys in sorted order so that we can reduce the overall query or
search time (from linear to binary).

The Data Reference contains a set of pointers that hold the addresses of disk blocks. The pointed-to disk block contains
the actual data referred to by the Search Key. The Data Reference is also called a Block Pointer because it uses block-
based addressing.
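
For instance, an index can be created in SQL as below. This is a sketch with a hypothetical Employee table and index name; once the index exists, the DBMS consults it automatically for matching queries.

-- Build an index table on the Emp_Name column.
CREATE INDEX idx_emp_name ON Employee (Emp_Name);

-- Queries filtering on Emp_Name can now locate rows via the index
-- instead of scanning every disk block of the table.
SELECT * FROM Employee WHERE Emp_Name = 'Robin';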

39. Explain sparse and dense indexes with an example.


Answer:

Ordered indexing is the traditional way of storing indices to give fast retrieval. The indices are stored in a sorted manner, hence
they are also known as ordered indices.

Ordered indexing is further divided into two categories:

1. Dense Indexing: In dense indexing, the index table contains a record for every search-key value of the database. This
makes searching faster but requires much more space. It is like primary indexing but contains a record for every search
key.

2. Sparse Indexing: Sparse indexing consumes less space than dense indexing, but it is a bit slower as well. We do
not include a search key for every record; instead, we store a search key that points to a block. The pointed-to block
further contains a group of data. Sometimes we have to perform a double search, which makes sparse indexing a bit
slower.

40. Short note on - secondary index, primary index, clustered index.

Answer:

According to the attributes defined above, we divide indexing into three types:

Single Level Indexing

It is somewhat like the index (or the table of contents) found in a book. The index of a book contains topic names along
with page numbers; similarly, the index table of the database contains keys and their corresponding block
addresses.

Single Level Indexing is further divided into three categories:

1. Primary Indexing: The indexing or the index table created using primary keys is known as Primary
Indexing. It is defined on ordered data. As the index is comprised of primary keys, the keys are unique, not
null, and possess a one-to-one relationship with the data blocks.

Characteristics of Primary Indexing:

•	Search Keys are unique.
•	Search Keys are in sorted order.
•	Search Keys cannot be null, as each points to a block of data.
•	Fast and efficient searching.

2. Secondary Indexing: It is a two-level indexing technique used to reduce the mapping size of the primary index. The
secondary index points to a certain location where the data is to be found, but the actual data is not sorted as in
primary indexing. Secondary indexing is also known as non-clustered indexing.

Characteristics of Secondary Indexing:

•	Search Keys are candidate keys.
•	Search Keys are sorted, but the actual data may or may not be sorted.
•	Requires more time than primary indexing.
•	Search Keys cannot be null.
•	Faster than clustered indexing but slower than primary indexing.

3. Cluster Indexing: Clustered indexing is used when multiple related records are found at one
place. It is defined on ordered data. The important thing to note here is that the index table of clustered
indexing is created using non-key values, which may or may not be unique. To achieve faster retrieval,
we group columns having similar characteristics, and the indexes are created using these groups; this
process is known as clustering index (a SQL sketch follows the list below).

Characteristics of Clustered Indexing:

•	Search Keys are non-key values.
•	Search Keys are sorted.
•	Search Keys cannot be null.
•	Search Keys may or may not be unique.
•	Requires extra work to create the index.
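
As one concrete, hedged illustration of the difference, SQL Server's T-SQL distinguishes the two index kinds explicitly; the Orders table and the index names here are hypothetical:

-- Clustered index: the Orders rows themselves are physically stored
-- in Order_Date order (only one clustered index per table).
CREATE CLUSTERED INDEX idx_orders_date ON Orders (Order_Date);

-- Non-clustered (secondary) index: a separate structure whose entries
-- point back to the rows; the table's physical order is unchanged.
CREATE NONCLUSTERED INDEX idx_orders_cust ON Orders (Cust_ID);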

41. Why are ACID properties important?


Answer: Taken together, the ACID properties of transactions provide a
mechanism in a DBMS to ensure the consistency and correctness of any database. Every
transaction acts as a group of operations executing as a single unit, produces consistent
results, operates in isolation from all other operations, and makes durably stored
updates. Together, these properties ensure the integrity of the data in any given database.
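
A minimal sketch of these properties in SQL, assuming a hypothetical accounts table (some DBMSs spell the opener BEGIN TRANSACTION or START TRANSACTION):

BEGIN;                                              -- atomicity: one all-or-nothing unit
UPDATE accounts SET bal = bal - 100 WHERE id = 1;   -- debit one account
UPDATE accounts SET bal = bal + 100 WHERE id = 2;   -- credit the other
COMMIT;                                             -- durability: both changes persist together

-- If a failure occurs before COMMIT, ROLLBACK (or crash recovery)
-- undoes the partial work, so the database never exposes a
-- half-finished transfer and consistency is preserved.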

42. All candidate keys are super keys, but all super keys are not candidate keys. Justify with a
suitable example.
Answer:
A superkey SK is a subset of attributes of a relation schema R such that for any two
distinct tuples t1 and t2 in a relation state r of R, we have t1[SK] ≠ t2[SK]. For example,
consider a relation schema BOOK with three attributes: ISBN, Book_title, and Category. The
value of ISBN is unique for each tuple; hence, {ISBN} is a superkey. In addition, the
combination of all the attributes, that is, {ISBN, Book_title, Category}, is a default superkey for
this relation schema.
In general, not all the attributes of a superkey are required to identify each tuple uniquely
in a relation. Instead, only a subset of the superkey's attributes is sufficient to uniquely
identify each tuple; further, if any attribute is removed from this subset, the remaining set
of attributes can no longer serve as a superkey. Such a minimal set of attributes, say K, is
a candidate key (also known as an irreducible superkey). For example, the
superkey {ISBN, Book_title, Category} is not a candidate key, since its subset {ISBN} is a
minimal set of attributes that uniquely identifies each tuple of the BOOK relation. So, ISBN is a
candidate key as well as a superkey. Hence, it is concluded that all candidate keys are
superkeys, whereas not all superkeys are candidate keys.
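
The BOOK schema above can be declared in SQL; the column types are assumed for this sketch:

CREATE TABLE BOOK (
    ISBN       VARCHAR(13) PRIMARY KEY,   -- the candidate key chosen as primary key
    Book_title VARCHAR(100),
    Category   VARCHAR(50)
);

-- {ISBN} is a candidate key: unique and minimal.
-- {ISBN, Book_title, Category} is a superkey but not a candidate key,
-- because removing Book_title and Category still leaves {ISBN},
-- which uniquely identifies every tuple.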

43. What do you mean by transitive dependency?


Answer:
Whenever an indirect relationship causes a functional dependency (FD), it is
known as a transitive dependency. Thus, if A -> B and B -> C hold, then A -> C
is a transitive dependency.
To achieve 3NF, one must eliminate all transitive dependencies.
Note:
A given functional dependency can only be transitive when it is formed indirectly by
two FDs. For example,
P -> R is a transitive dependency when the following functional
dependencies hold true:
•	P -> Q
•	Q does not -> P
•	Q -> R

A transitive dependency can occur only in a relation of
three or more attributes. Identifying such dependencies helps us normalize the
database into its 3rd Normal Form (3NF).
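
As a hedged sketch (the Student/Department schema is invented for illustration), eliminating the transitive dependency Stu_ID -> Dept -> Dept_Head means decomposing into two relations, which brings the schema to 3NF:

-- Before: Student(Stu_ID, Dept, Dept_Head), where
--   Stu_ID -> Dept and Dept -> Dept_Head,
-- so Stu_ID -> Dept_Head holds only transitively.

-- After decomposition (3NF):
CREATE TABLE Department (
    Dept      VARCHAR(30) PRIMARY KEY,
    Dept_Head VARCHAR(50)                 -- Dept -> Dept_Head is now direct
);

CREATE TABLE Student (
    Stu_ID INT PRIMARY KEY,
    Dept   VARCHAR(30) REFERENCES Department(Dept)   -- Stu_ID -> Dept
);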
