DBMS 02

Solution paper

Uploaded by

Rakesh Prajapati

Roll No.                                        Total No. of Pages :
Total No. of Questions : 09

B.Tech. (CSE) (Sem.–5)


DATABASE MANAGEMENT SYSTEMS
Subject Code : BTCS-501-18
M.Code : 78320
Date of Examination : 19-12-23

Time : 3 Hrs. Max. Marks : 60

INSTRUCTIONS TO CANDIDATES :

1. SECTION-A is COMPULSORY consisting of TEN questions carrying TWO marks each.
2. SECTION-B contains FIVE questions carrying FIVE marks each and students have
to attempt any FOUR questions.
3. SECTION-C contains THREE questions carrying TEN marks each and students
have to attempt any TWO questions.

SECTION-A:

1. What do you mean by data abstraction and data independence?

Data Abstraction:
Data abstraction is the process of hiding the complexities of the database from users and
providing only essential information. It simplifies the interaction with the database by
separating data into three abstraction levels:

1. Physical Level: How data is stored physically (e.g., disk structures).


2. Logical Level: What data is stored and the relationships between them.
3. View Level: How data is presented to the users.

Data Independence:
Data independence refers to the ability to modify one level of the database schema without
affecting other levels.

1. Physical Data Independence: Changes in physical storage do not affect the logical schema.
o Example: Moving data from HDD to SSD should not impact applications.
2. Logical Data Independence: Changes in the logical schema do not affect the view schema.
o Example: Adding a new column to a table should not affect user views.
2. Differentiate between DDL and DML.

Aspect              DDL (Data Definition Language)           DML (Data Manipulation Language)
Purpose             Defines the structure of the database    Manipulates data within tables
Commands            CREATE, ALTER, DROP, TRUNCATE            SELECT, INSERT, UPDATE, DELETE
Effect on Database  Changes schema or structure              Changes or retrieves data
Example             CREATE TABLE Students (ID INT,           INSERT INTO Students VALUES
                    Name VARCHAR(50));                       (1, 'John');

3. Name a few popular open-source and commercial DBMS.

Open-Source DBMS:

1. MySQL
2. PostgreSQL
3. SQLite
4. MariaDB

Commercial DBMS:

1. Oracle Database
2. Microsoft SQL Server
3. IBM Db2
4. SAP HANA

4. What do you mean by query optimization?

Query Optimization:
Query optimization is the process of choosing the most efficient way to execute an SQL query,
performed mainly by the DBMS query optimizer, to minimize execution time and resource usage.

Key Techniques:

1. Using Indexes: Speeds up data retrieval.


2. Join Reordering: Evaluating the most efficient sequence for joins.
3. Query Rewriting: Simplifying complex queries.
4. Cost-Based Analysis: Choosing the least expensive query execution plan.

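Example:
The query below is not rewritten by the user; what changes is the execution plan the DBMS picks:

SELECT * FROM Employees WHERE Age > 30 AND Department = 'HR';

Without an index, the DBMS must scan every row of Employees. With a composite index on
(Department, Age), it can locate the matching rows directly through the index.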
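The effect of the index can be observed by asking a DBMS for its plan. The sketch below is only an illustration: it assumes SQLite (through Python's sqlite3 module) as the engine, and the sample rows and the index name idx_emp_dept_age are made up for the demonstration. Other systems expose plans through their own EXPLAIN variants.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees (ID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER, Department TEXT)"
)
# Hypothetical sample rows, not taken from the question paper.
conn.executemany(
    "INSERT INTO Employees (Name, Age, Department) VALUES (?, ?, ?)",
    [("Asha", 34, "HR"), ("Ravi", 28, "HR"), ("Meena", 41, "IT")],
)

QUERY = "SELECT * FROM Employees WHERE Age > 30 AND Department = 'HR'"


def plan(connection, sql):
    """Return the plan steps SQLite reports for a query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]  # the fourth column holds the description


plan_before = plan(conn, QUERY)  # full table scan
conn.execute("CREATE INDEX idx_emp_dept_age ON Employees (Department, Age)")
plan_after = plan(conn, QUERY)   # search through the new index
result = conn.execute(QUERY).fetchall()

print(plan_before)
print(plan_after)
```

The equality column (Department) is placed first in the index so that the range condition on Age can be applied within the matching index entries.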

5. What is the role of indices in DBMS?

Role of Indices:
An index is a data structure that improves the speed of data retrieval operations in a database
at the cost of additional storage and slower inserts, updates, and deletes, because the index
must be maintained on every write.

Advantages:

1. Faster Query Execution: Reduces the time for searching rows.


2. Efficient Sorting: Allows data to be sorted quickly.
3. Improved Join Performance: Speeds up queries with joins by indexing key columns.

Types of Indexes:

1. Clustered Index: Reorders the data in the table to match the index.
2. Non-Clustered Index: Creates a separate structure that points to the data.
3. Unique Index: Ensures that all values in a column are unique.

6. What are the problems arising out of concurrency?

Concurrency in databases allows multiple transactions to execute simultaneously. However, it
can lead to the following issues:

1. Lost Update: When two transactions update the same data, and one overwrites the
other.
o Example: T1 and T2 both read the same salary value and each writes back its own result;
whichever writes last overwrites the other's update.
2. Dirty Read: A transaction reads uncommitted data from another transaction.
o Example: T2 reads data modified by T1, but T1 rolls back.
3. Non-Repeatable Read: A transaction reads the same data twice and gets different
values due to updates by another transaction.
o Example: T1 reads a value; T2 updates it; T1 reads it again and sees a different value.
4. Phantom Read: A transaction reads a set of rows, but another transaction inserts or
deletes rows, altering the result.
o Example: T1 queries a range; T2 adds a new record in the range.
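
The lost update problem listed above can be reproduced with a fixed interleaving of two transactions. This is a minimal sketch in plain Python rather than a real DBMS; the salary figure and the two adjustments are hypothetical values chosen for illustration.

```python
# Hypothetical salary record shared by two transactions.
database = {"salary": 50000}

# Both transactions read the same starting value.
t1_read = database["salary"]
t2_read = database["salary"]

# T1 adds a 5000 raise and writes its result back.
database["salary"] = t1_read + 5000

# T2 adds a 2000 bonus based on its stale read, overwriting T1's write.
database["salary"] = t2_read + 2000

lost_update_result = database["salary"]  # T1's raise has disappeared
serial_result = 50000 + 5000 + 2000      # value if T1 and T2 ran one after another

print(lost_update_result, serial_result)
```

Concurrency control protocols exist to rule out interleavings like this one, so that the outcome matches some serial order of the transactions.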

7. What are the various types of database failures?

1. System Crash:
o Cause: Hardware or software failures.
o Example: Power outage, operating system crash.
2. Media Failure:
o Cause: Corruption of storage media.
o Example: Hard disk crash.
3. Transaction Failure:
o Cause: Errors within a transaction.
o Example: Division by zero, logical constraints violated.
4. Application Errors:
o Cause: Bugs in the application accessing the database.
o Example: Incorrect SQL logic.
5. Disk Failure (a specific case of media failure):
o Cause: Damage to disk sectors, such as a head crash or bad blocks.
o Example: Loss of a database file.

8. What is intrusion detection in database systems?

Intrusion Detection:
Intrusion detection involves monitoring database systems to identify unauthorized access or
suspicious activities.

Types:

1. Signature-Based: Detects known patterns of attacks (e.g., SQL injection).


2. Anomaly-Based: Monitors deviations from normal behavior.
3. Hybrid: Combines signature and anomaly-based techniques.

Tools and Techniques:

1. Database activity monitoring and firewalls (e.g., IBM Guardium).
2. Log analysis to track unusual queries.
3. Real-time alerts for unauthorized access attempts.

Example Threats:

• SQL injection.
• Privilege escalation attacks.

9. What are the features of a logical database?

Logical Database Features:

1. Abstract Representation: Logical databases provide an abstract view of data
independent of physical storage.
2. Schema Design: Focus on the relationships, constraints, and structure of the database.
o Example: Tables, columns, relationships.
3. Data Independence: Changes in physical storage do not affect logical design.
4. Simplified Querying: Logical structure aligns with user queries, making data access
intuitive.
5. Normalization: Reduces redundancy and improves data integrity.
10. What are web databases?
Web Databases:
Web databases are databases that are accessible through the internet or a web interface. They
store, manage, and retrieve data for web applications.

Features:

1. Online Accessibility: Accessed via web browsers.


2. Integration with Web Applications: Used for dynamic content generation.
3. Scalability: Handles large volumes of traffic.

Examples:

1. MySQL used for e-commerce websites.


2. MongoDB used for storing JSON-like data for web apps.

Advantages:

• Real-time data access.


• Seamless integration with web technologies (HTML, JavaScript).

SECTION B:

1. What are integrity constraints? Why are they important?

Integrity Constraints:
Integrity constraints are rules enforced in a database to maintain data accuracy, consistency,
and validity. They ensure that the database adheres to its schema and real-world expectations.

Types of Integrity Constraints:

1. Domain Integrity: Ensures data in a column adheres to a predefined domain or data
type.
o Example: Age must be a positive integer.

CREATE TABLE Students (
    ID INT PRIMARY KEY,
    Name VARCHAR(50),
    Age INT CHECK (Age > 0)
);

2. Entity Integrity: Ensures that each row in a table has a unique identifier (primary
key) and that it is not null.
o Example: Each student in a table must have a unique ID.
3. Referential Integrity: Ensures consistency between related tables using foreign keys.
o Example: A foreign key in the Orders table must reference a valid primary key in
the Customers table.
4. Unique Constraint: Ensures all values in a column or combination of columns are
unique.
5. Not Null Constraint: Ensures that a column cannot have null values.
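
The constraint types above can be seen being enforced by a real engine. The following is a small sketch that assumes SQLite via Python's sqlite3 module; the Customers and Orders tables follow the referential integrity example, while the Email and Age columns and the helper function violates are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE Customers (
    ID    INTEGER PRIMARY KEY,
    Email TEXT NOT NULL UNIQUE,
    Age   INTEGER CHECK (Age > 0)
);
CREATE TABLE Orders (
    ID         INTEGER PRIMARY KEY,
    CustomerID INTEGER NOT NULL REFERENCES Customers(ID)
);
INSERT INTO Customers VALUES (1, 'a@example.com', 30);
""")


def violates(sql):
    """Return True if the statement is rejected by an integrity constraint."""
    try:
        with conn:  # roll back automatically if the statement fails
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


checks = {
    "domain (CHECK)": violates("INSERT INTO Customers VALUES (2, 'b@example.com', -5)"),
    "entity (duplicate key)": violates("INSERT INTO Customers VALUES (1, 'c@example.com', 40)"),
    "unique": violates("INSERT INTO Customers VALUES (3, 'a@example.com', 40)"),
    "not null": violates("INSERT INTO Customers VALUES (4, NULL, 40)"),
    "referential": violates("INSERT INTO Orders VALUES (1, 99)"),
}
valid_order_rejected = violates("INSERT INTO Orders VALUES (1, 1)")

print(checks, valid_order_rejected)
```

Each invalid statement is rejected before it reaches the table, while the order that references an existing customer is accepted.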

Importance of Integrity Constraints:

1. Ensures Data Accuracy: Prevents invalid data entries.


2. Maintains Relationships: Enforces valid foreign key references.
3. Prevents Redundancy: Helps in maintaining normalized data.
4. Improves Query Accuracy: Ensures queries return valid and consistent results.

2. Between hashing and B-trees, which method is preferable for storing indexes in a database?

Comparison of Hashing and B-Trees:

1. Hashing:
o Hashing uses a hash function to compute the location of data in a hash table.
o Advantages:
▪ Fast retrieval for exact matches.
▪ Efficient for primary key lookups.
o Disadvantages:
▪ Not suitable for range queries.
▪ Collisions require additional handling (e.g., chaining or open addressing).
2. B-Trees:
o A B-tree is a balanced tree structure used for indexing, where data is stored in
sorted order.
o Advantages:
▪ Supports range queries and ordered traversal.
▪ Handles insertions, deletions, and updates efficiently.
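o Disadvantages:
▪ Exact-match lookups take O(log n) node visits, versus O(1) on average for hashing.
▪ Requires extra space for internal nodes and partially filled pages.

When to Use Each:

• Hashing: Ideal for exact match queries (e.g., retrieving records by primary key).
• B-Trees: Suitable for range queries, ordered traversal, and multi-level indexing.

Conclusion: B-trees (in practice, B+-trees) are generally preferable as the default index
structure because they serve both exact-match and range queries. Hashing is the better choice
only when the workload consists purely of exact-match lookups.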
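
The trade-off can be illustrated with two in-memory stand-ins: a Python dict for a hash index and a sorted key list searched with bisect for the ordered leaf level of a B-tree. This is a simplified sketch, not how a DBMS stores indexes on disk, and the employee rows are hypothetical.

```python
import bisect

# Hypothetical rows keyed by employee ID.
rows = {17: "Asha", 4: "Ravi", 42: "Meena", 8: "John", 23: "Sara"}

# Hash index: average O(1) exact-match lookups, but no key order is kept.
hash_index = dict(rows)

# Ordered index: a sorted key list stands in for the leaf level of a B-tree.
sorted_keys = sorted(rows)


def exact_lookup(key):
    """Exact-match lookup through the hash index."""
    return hash_index.get(key)


def range_lookup(low, high):
    """Return names with low <= key <= high using binary search on sorted keys."""
    start = bisect.bisect_left(sorted_keys, low)
    end = bisect.bisect_right(sorted_keys, high)
    return [rows[k] for k in sorted_keys[start:end]]


print(exact_lookup(42))
print(range_lookup(8, 23))
```

Answering the range query from the hash index alone would require checking every key, which is why ordered structures are used for range predicates.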

3. Explain the concept of authorization and authentication.

Authentication:
Authentication verifies the identity of a user or system trying to access a database. It ensures
that only legitimate users can log in.
Methods:

1. Password-Based: User provides a username and password.


2. Token-Based: User provides a token generated by a secure system.
3. Biometric: Verifies a user’s identity using fingerprints, face recognition, etc.

Example (Oracle-style syntax; other systems use statements such as CREATE USER with a password
clause):

GRANT CONNECT TO user1 IDENTIFIED BY 'password123';

Authorization:
Authorization determines what actions a user or system is allowed to perform after
authentication. It ensures that users can only access resources they have permissions for.

Types of Authorization:

1. Role-Based Access Control (RBAC): Users are assigned roles, and each role has specific
permissions.
o Example: Admin can modify data; users can only read data.
2. Discretionary Access Control (DAC): The owner of an object grants permissions on it directly
to users or roles.
o Example: A user can be given SELECT and UPDATE permissions for a table.

GRANT SELECT, UPDATE ON Employees TO user1;

3. Mandatory Access Control (MAC): Access is controlled based on security levels.

Importance:

• Enhances database security.


• Prevents unauthorized data access and modifications.

4. Write short notes on:

a) Access Control Models:


Access control models define the rules and policies for restricting user access to resources in
a database.

Key Models:

1. Discretionary Access Control (DAC):


o Permissions are assigned to specific users or groups.
o Example: A user is given access to modify a particular table.
2. Mandatory Access Control (MAC):
o Access is based on security levels (e.g., Confidential, Secret, Top Secret).
o Example: A user with "Secret" clearance cannot access "Top Secret" data.
3. Role-Based Access Control (RBAC):
o Users are assigned roles, and roles define permissions.
o Example: A "Manager" role can view and update employee records, while a "Clerk"
role can only view them.
4. Attribute-Based Access Control (ABAC):
o Access is granted based on attributes of the user, resource, or environment.
o Example: Access is allowed only during office hours.
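
Two of the models above reduce to simple checks, sketched here in Python. The role table mirrors the Manager/Clerk example and the level ordering mirrors the Confidential/Secret/Top Secret example; the user names and function names are hypothetical, and a real DBMS enforces these rules internally rather than in application code.

```python
# Hypothetical roles and permissions, mirroring the Manager/Clerk example.
ROLE_PERMISSIONS = {
    "Manager": {"view", "update"},
    "Clerk": {"view"},
}
USER_ROLES = {"alice": "Manager", "bob": "Clerk"}

# Security levels for MAC, ordered from least to most sensitive.
LEVELS = {"Confidential": 1, "Secret": 2, "Top Secret": 3}


def rbac_allows(user, action):
    """RBAC: a user may perform an action only if their role grants it."""
    role = USER_ROLES.get(user)
    return action in ROLE_PERMISSIONS.get(role, set())


def mac_allows_read(clearance, classification):
    """MAC: a subject may read data at or below its own clearance level."""
    return LEVELS[clearance] >= LEVELS[classification]


print(rbac_allows("bob", "update"))
print(mac_allows_read("Secret", "Top Secret"))
```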

b) Distributed Databases:
A distributed database is a collection of data stored across multiple locations, connected
through a network.

Key Features:

1. Data Distribution: Data is partitioned or replicated across sites.


2. Transparency: Users interact with the database as if it were centralized.

Advantages:

• Faster local access to data.


• Improved fault tolerance due to data replication.

Challenges:

• Synchronizing data across multiple sites.


• Increased complexity in query execution.

Example: A banking system with branches in multiple cities using a distributed database to
manage customer accounts locally while synchronizing globally.

5. What is the difference between object-oriented and object-relational databases?

Object-Oriented Databases (OODB):


OODB integrates object-oriented programming concepts into database systems.

Features:

1. Classes and Objects: Stores data as objects.


2. Inheritance: Supports class hierarchies.
3. Methods: Data manipulation is done through methods defined in classes.

Example:
An employee record is stored as an object with properties (Name, Salary) and methods
(CalculateBonus()).
Object-Relational Databases (ORDB):
ORDB extends relational databases by adding object-oriented features.

Features:

1. Support for User-Defined Types: Allows custom data types.


2. Inheritance: Tables can inherit structure from parent tables.
3. Standard Query Language: Uses SQL with extensions.

Example: PostgreSQL supports user-defined types and table inheritance, as well as JSON and
array data types for handling semi-structured data.

Aspect          OODB                           ORDB
Data Storage    Objects                        Tables with object support
Query Language  OQL (Object Query Language)    SQL with extensions
Use Case        CAD systems, AI applications   General-purpose databases

SECTION C:

7. What is a Data Model? State and explain various data models with suitable
examples.

A data model is an abstract representation of the structure of data, the operations that can be
performed on the data, and the relationships between different data elements. It is used to
define how data is stored, organized, and manipulated in a database. Data models help in
designing the database structure and provide a framework for managing and interacting with
data. There are several types of data models in Database Management Systems (DBMS),
each with its own approach to data organization and manipulation. Below are the most
commonly used data models:

a) Hierarchical Data Model

• Description: The hierarchical data model organizes data in a tree-like structure where each
record has a single parent, and each parent can have multiple children. The structure is a set
of hierarchical relationships between data elements.
• Example: A typical example is the organizational chart of a company. Each department can
have multiple employees, and each employee belongs to a specific department.
• Advantages: Data retrieval is fast along the predefined parent-child paths, which suits
one-to-many relationships.
• Disadvantages: It is rigid and not flexible for representing many-to-many relationships.

b) Network Data Model


• Description: The network data model is an extension of the hierarchical model. It allows
more complex relationships, with records that can have multiple parent nodes
(many-to-many relationships).
• Example: A transportation network where cities (nodes) are connected by routes (edges),
allowing for multiple routes to connect cities in various configurations.
• Advantages: More flexible than the hierarchical model in representing relationships.
• Disadvantages: Complex to manage and design, and can be difficult to query.

c) Relational Data Model

• Description: The relational data model organizes data in tables (also called relations), where
each table is made up of rows (tuples) and columns (attributes). Data is related using keys,
such as primary keys and foreign keys.
• Example: A "Student" table where each row represents a student, and columns represent
attributes like student ID, name, and age.
• Advantages: Simple and flexible, supports powerful querying using SQL (Structured Query
Language).
• Disadvantages: Not well-suited for complex hierarchical or network relationships.

d) Object-Oriented Data Model

• Description: The object-oriented data model is based on object-oriented programming
principles. It uses objects, classes, and methods to represent and manipulate data.
• Example: A software application that models real-world entities as objects, such as a class
for "Vehicle" with attributes like "make," "model," and methods like "start()" or "stop()".
• Advantages: Supports complex data structures and real-world modeling.
• Disadvantages: More complex to implement and manage.

e) Entity-Relationship (ER) Model

• Description: The ER model is used to represent the relationships between entities in a
database. Entities are objects or things in the real world, and relationships are associations
between these entities.
• Example: In a university database, "Student" and "Course" are entities, and the relationship
"Enrolls" represents a student enrolling in a course.
• Advantages: Conceptually simple and provides a visual representation of data and its
relationships.
• Disadvantages: Not suitable for querying directly; it's mainly used in database design.

f) Document Data Model (NoSQL)

• Description: This model is used in NoSQL databases, where data is stored as documents,
typically in JSON, BSON, or XML formats. It is flexible, allowing for semi-structured or
unstructured data.
• Example: A collection of documents where each document represents a product, with
attributes like "name," "price," and "description" stored in a JSON format.
• Advantages: Highly flexible and scalable, suitable for applications with large and diverse
datasets.
• Disadvantages: Queries that combine data across documents or collections (the equivalent of
joins) are often less efficient than in relational databases.
8. Write notes on the following:

a) Relational Algebra

• Description: Relational algebra is a procedural query language used to query relational
databases. It consists of a set of operations that take one or more relations as input and
produce a new relation as output, so operations can be composed into larger expressions. It
forms the theoretical basis for SQL and for the query plans a DBMS evaluates.
• Operations:
1. Select (σ): Filters rows based on a condition. Example: σ(Age > 20)(Student).
2. Project (π): Extracts specific columns from a table. Example: π(Name, Age)(Student).
3. Union (∪): Combines two union-compatible relations (same attributes) into one,
eliminating duplicates. Example: Student ∪ Teacher.
4. Set Difference (−): Returns rows that are in the first relation but not in the second.
Example: Student − Graduate.
5. Cartesian Product (×): Combines each row from the first relation with every row
from the second relation. Example: Student × Course.
6. Join (⨝): Combines related rows from two relations based on a common attribute.
Example: Student ⨝ Enrollment.
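
To make the operators concrete, here is a minimal Python sketch of select, project, and natural join over relations represented as lists of dictionaries. The representation, the function names, and the sample Student and Enrollment rows are assumptions made for illustration only.

```python
def select(relation, predicate):
    """σ: keep only the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """π: keep only the named attributes and drop duplicate rows."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


def natural_join(left, right):
    """⨝: pair rows that agree on every attribute the two relations share."""
    result = []
    for l in left:
        for r in right:
            common = set(l) & set(r)
            if all(l[a] == r[a] for a in common):
                result.append({**l, **r})
    return result


# Hypothetical sample relations.
student = [
    {"StudentID": 1, "Name": "John", "Age": 22},
    {"StudentID": 2, "Name": "Asha", "Age": 19},
]
enrollment = [
    {"StudentID": 1, "Course": "DBMS"},
    {"StudentID": 1, "Course": "OS"},
]

adults = select(student, lambda row: row["Age"] > 20)  # σ(Age > 20)(Student)
names = project(student, ["Name"])                     # π(Name)(Student)
joined = natural_join(student, enrollment)             # Student ⨝ Enrollment

print(adults, names, joined, sep="\n")
```

Because every function returns a relation in the same form, the calls can be nested, for example project(select(student, ...), ["Name"]).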

b) Normal Forms

• Description: Normalization is a process of organizing the data in a database to reduce
redundancy and avoid insertion, update, and deletion anomalies. There are several normal
forms (NF), each building on the previous one.
o First Normal Form (1NF): Ensures that all columns contain atomic values, meaning
no repeating groups or arrays.
o Second Normal Form (2NF): In 1NF with no partial dependencies, i.e., every non-key
attribute depends on the entire primary key rather than on part of a composite key.
o Third Normal Form (3NF): In 2NF with no transitive dependencies, i.e., non-key
attributes do not depend on other non-key attributes.
o Boyce-Codd Normal Form (BCNF): A stronger version of 3NF in which the determinant of
every non-trivial functional dependency is a candidate key (more precisely, a superkey).
o Fourth Normal Form (4NF): In BCNF with no non-trivial multi-valued dependencies.
o Fifth Normal Form (5NF): In 4NF, and every non-trivial join dependency is implied by the
candidate keys, so the relation cannot be losslessly decomposed any further.
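
Checking a normal form usually starts from the closure of a set of attributes under the given functional dependencies (FDs). The sketch below computes that closure and reports FDs whose left side is not a superkey, which is the BCNF condition. The relation, its FDs, and the function names are hypothetical.

```python
def closure(attributes, fds):
    """Return every attribute functionally determined by `attributes`."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(schema, fds):
    """List non-trivial FDs whose left side is not a superkey of the schema."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs)           # skip trivial dependencies
        and closure(lhs, fds) != set(schema)  # left side does not determine everything
    ]


# Hypothetical relation: Enrollment(StudentID, CourseID, Instructor, Office)
schema = {"StudentID", "CourseID", "Instructor", "Office"}
fds = [
    (("StudentID", "CourseID"), ("Instructor",)),
    (("Instructor",), ("Office",)),
]

key_closure = closure(("StudentID", "CourseID"), fds)
violations = bcnf_violations(schema, fds)

print(key_closure)
print(violations)
```

Here Instructor determines Office without being a key, which is also a transitive dependency, so the relation would be split into Enrollment(StudentID, CourseID, Instructor) and Instructor(Instructor, Office).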

c) Query Processing

• Description: Query processing is the set of steps that a DBMS follows to execute a query.
This includes parsing, optimization, and execution of the query.
o Parsing: The query is parsed to check for syntax and semantics.
o Optimization: The DBMS creates an optimized execution plan that minimizes cost
(such as disk I/O or CPU time).
o Execution: The optimized query plan is executed, and the results are returned to the
user.

d) Join Strategies
• Description: Join strategies are algorithms used by a DBMS to perform joins between two or
more tables. Some common join strategies include:
1. Nested Loop Join: For each row in one table, scan all rows in the other table.
2. Sort-Merge Join: Both tables are sorted on the join column, and the rows are
merged based on the sorted order.
3. Hash Join: A hash table is built for one table, and the other table is probed to find
matching rows.
4. Index Join: Uses indexes to speed up the search for matching rows in a table.
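
The first three strategies can be sketched in a few lines of Python. These are simplified in-memory versions operating on lists of dictionaries, with hypothetical student and enrollment rows; a real DBMS works on disk pages and picks among the strategies using cost estimates.

```python
from collections import defaultdict


def nested_loop_join(left, right, key):
    """Compare every left row with every right row."""
    return [(l, r) for l in left for r in right if l[key] == r[key]]


def hash_join(left, right, key):
    """Build a hash table on one input, then probe it with the other."""
    buckets = defaultdict(list)
    for l in left:  # build phase
        buckets[l[key]].append(l)
    return [(l, r) for r in right for l in buckets.get(r[key], [])]  # probe phase


def sort_merge_join(left, right, key):
    """Sort both inputs on the join key, then merge them in one pass."""
    left = sorted(left, key=lambda row: row[key])
    right = sorted(right, key=lambda row: row[key])
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i][key] < right[j][key]:
            i += 1
        elif left[i][key] > right[j][key]:
            j += 1
        else:
            # Pair the current left row with every right row sharing its key.
            k = j
            while k < len(right) and right[k][key] == left[i][key]:
                result.append((left[i], right[k]))
                k += 1
            i += 1
    return result


def normalize(pairs):
    """Reduce joined pairs to sorted (name, course) tuples for comparison."""
    return sorted((l["name"], r["course"]) for l, r in pairs)


# Hypothetical sample data.
students = [{"id": 1, "name": "John"}, {"id": 2, "name": "Asha"}, {"id": 3, "name": "Ravi"}]
enrollments = [{"id": 1, "course": "DBMS"}, {"id": 3, "course": "OS"}, {"id": 1, "course": "OS"}]

nl_result = normalize(nested_loop_join(students, enrollments, "id"))
hash_result = normalize(hash_join(students, enrollments, "id"))
merge_result = normalize(sort_merge_join(students, enrollments, "id"))

print(nl_result)
```

All three produce the same rows; they differ in cost. The nested loop compares every pair, the hash join reads each input once, and the sort-merge join pays for sorting but is cheap when the inputs are already ordered.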

9. a) What are the ACID properties of transactions?

ACID stands for Atomicity, Consistency, Isolation, and Durability. These are the four key
properties that ensure reliable transaction processing in a DBMS.

1. Atomicity: A transaction is atomic, meaning it is treated as a single unit of work.
Either all operations within the transaction are completed successfully, or none are. If
a transaction fails, all changes made by it are rolled back.
2. Consistency: A transaction brings the database from one consistent state to another. It
ensures that any transaction will leave the database in a valid state according to the
defined rules (constraints, triggers, etc.).
3. Isolation: Transactions are executed in isolation from one another. The intermediate
states of a transaction are invisible to other transactions, ensuring that concurrent
execution does not lead to inconsistent data.
4. Durability: Once a transaction is committed, its effects are permanent, even in the
case of a system failure. The changes made by the transaction are stored in
non-volatile storage.
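
Atomicity and consistency can be observed with a money transfer that fails halfway. The sketch below assumes SQLite through Python's sqlite3 module; the Accounts table, its balances, and the transfer function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Accounts (Name TEXT PRIMARY KEY, Balance INTEGER CHECK (Balance >= 0))"
)
conn.executemany("INSERT INTO Accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(amount, source, target):
    """Move money between accounts; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE Accounts SET Balance = Balance + ? WHERE Name = ?", (amount, target)
            )
            conn.execute(
                "UPDATE Accounts SET Balance = Balance - ? WHERE Name = ?", (amount, source)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    """Return the current balances as a dictionary."""
    return dict(conn.execute("SELECT Name, Balance FROM Accounts"))


failed = transfer(500, "A", "B")      # debit would make A negative, so the credit is undone
after_failure = balances()
succeeded = transfer(30, "A", "B")
after_success = balances()

print(failed, after_failure, succeeded, after_success)
```

The credit to B is applied first on purpose: when the debit then violates the CHECK constraint, the rollback removes the credit as well, leaving the balances untouched.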

9. b) Discuss the Lock-based and Timestamp-based protocol for concurrency control.

Concurrency control is used to manage the simultaneous execution of transactions in a
multi-user database environment to avoid conflicts and maintain consistency.

Lock-based Protocol

• Description: The lock-based protocol ensures that transactions acquire locks on data before
accessing it. These locks prevent other transactions from accessing the same data
simultaneously, thereby preventing conflicts.
• Types of Locks:
1. Shared Lock (S-lock): Allows multiple transactions to read a data item but prevents
them from writing to it.
2. Exclusive Lock (X-lock): Prevents other transactions from both reading and writing
to the data item.
• Two-Phase Locking Protocol (2PL): Ensures that transactions follow two phases: the growing
phase, where locks can be acquired but not released, and the shrinking phase, where locks are
released and no new locks can be acquired. This guarantees conflict serializability, although
deadlocks are still possible.
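
The compatibility rules for shared and exclusive locks can be sketched as a toy lock table. The class below is an illustrative assumption, not a DBMS component: it refuses conflicting requests instead of queuing them, and it releases all of a transaction's locks in one step at the end, which is one simple way to respect the two-phase rule.

```python
class LockManager:
    """Toy lock table granting shared (S) and exclusive (X) locks per data item."""

    def __init__(self):
        self.locks = {}  # item -> (mode, set of transaction ids holding it)

    def acquire(self, txn, item, mode):
        """Try to grant the lock; return False if it conflicts with another holder."""
        held = self.locks.get(item)
        if held is None:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        if holders == {txn}:
            # The sole holder may keep or upgrade its own lock.
            new_mode = "X" if "X" in (mode, held_mode) else "S"
            self.locks[item] = (new_mode, holders)
            return True
        if mode == "S" and held_mode == "S":
            holders.add(txn)  # shared locks are compatible with each other
            return True
        return False  # any combination involving X conflicts

    def release_all(self, txn):
        """Release every lock the transaction holds in one step (its shrinking phase)."""
        for item in list(self.locks):
            mode, holders = self.locks[item]
            holders.discard(txn)
            if not holders:
                del self.locks[item]


manager = LockManager()
print(manager.acquire("T1", "x", "S"))  # first reader
print(manager.acquire("T2", "x", "S"))  # second reader shares the lock
print(manager.acquire("T3", "x", "X"))  # writer must wait for the readers
```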
Timestamp-based Protocol

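• Description: The timestamp-based protocol assigns a unique timestamp to each transaction.
Transactions are ordered based on these timestamps, and conflicting operations must execute
in timestamp order, so the resulting schedule is equivalent to running the transactions
serially from oldest to youngest.
• Working:
o Each transaction is given a timestamp when it starts.
o Each data item records the largest timestamp of any transaction that has read it and of
any transaction that has written it.
o A read is allowed only if no transaction with a later timestamp has already written the
item; a write is allowed only if no transaction with a later timestamp has already read
or written it.
o If a request violates this order, the requesting transaction (the one with the earlier
timestamp) is rolled back and restarted with a new timestamp.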
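
The read and write checks of basic timestamp ordering can be sketched as follows. The class and method names are hypothetical, timestamps are plain integers starting at 1, and a return value of False stands for "roll back the requesting transaction"; refinements such as Thomas's write rule are not included.

```python
class TimestampOrdering:
    """Basic timestamp-ordering checks for reads and writes on data items."""

    def __init__(self):
        self.read_ts = {}   # item -> largest timestamp that has read it
        self.write_ts = {}  # item -> largest timestamp that has written it

    def read(self, ts, item):
        """Allow the read unless a younger transaction already wrote the item."""
        if ts < self.write_ts.get(item, 0):
            return False
        self.read_ts[item] = max(self.read_ts.get(item, 0), ts)
        return True

    def write(self, ts, item):
        """Allow the write unless a younger transaction already read or wrote the item."""
        if ts < self.read_ts.get(item, 0) or ts < self.write_ts.get(item, 0):
            return False
        self.write_ts[item] = ts
        return True


scheduler = TimestampOrdering()
print(scheduler.read(2, "x"))   # T2 (timestamp 2) reads x
print(scheduler.write(1, "x"))  # T1 (timestamp 1) arrives too late and must roll back
```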
