DBMS Mid Que Paper Solution N
Q.1
o Reflexivity: If Y is a subset of X, then X → Y. This means that any set of attributes can
always functionally determine a subset of itself. For example, if you have attributes
{A, B}, then {A, B} → {A}.
o Augmentation: If X → Y, then adding the same attributes Z to both sides preserves
the functional dependency, i.e., (X, Z) → (Y, Z). For example, if {A} → {B}, then
{A, C} → {B, C}.
These axioms help in deducing all possible dependencies in a relational schema and are
essential in normalization.
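The deduction the axioms allow can be sketched with an attribute-closure routine. This is a minimal illustration, not part of the original answer: the function name closure and the sample dependencies A → B and B → C are assumptions, and chaining the two dependencies also relies on transitivity, which the text above does not list.

```python
def closure(attrs, fds):
    """Return every attribute determined by `attrs` under the dependencies `fds`.

    `fds` is a list of (lhs, rhs) pairs; each side is a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is already derivable, so is the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


# Hypothetical dependencies: A -> B and B -> C.
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]

# Reflexivity: {A, B} -> {A} holds even with no dependencies declared.
reflexive_holds = {"A"} <= closure({"A", "B"}, [])

# Augmentation: from A -> B alone we can derive {A, C} -> {B, C}.
augmented_holds = {"B", "C"} <= closure({"A", "C"}, [({"A"}, {"B"})])

print(sorted(closure({"A"}, fds)))  # ['A', 'B', 'C']
```

A dependency X → Y follows from a set of dependencies exactly when Y is contained in the closure of X, which is how the sketch checks both axioms.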
2. How does the CHECK constraint help in maintaining data integrity? Explain with an
example. (3 Marks)
The CHECK constraint in SQL ensures that every value in a column satisfies a specified
condition, enforcing data integrity by rejecting invalid data. For example, suppose you
have a table for employee salaries and want the salary to always be greater than zero.
You can define a CHECK constraint on the salary column with the condition salary > 0.
In this example, the CHECK constraint ensures that no employee can have a zero or negative
salary, maintaining the correctness of data in the table. If someone tries to insert or update a
salary that is zero or negative, the database rejects the statement. This mechanism helps avoid
errors and ensures that business rules are respected.
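The salary rule can be tried out with Python's built-in sqlite3 module. This is a sketch under stated assumptions: the table name Employees, its columns, and the sample rows are hypothetical, and other database systems may report the violation with a different error type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Employees (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        salary REAL CHECK (salary > 0)  -- reject zero or negative salaries
    )
    """
)

# A valid row is accepted.
conn.execute("INSERT INTO Employees (name, salary) VALUES ('Asha', 52000)")

# Rows that violate the CHECK condition raise IntegrityError in SQLite.
rejected = []
for bad_salary in (0, -100):
    try:
        conn.execute(
            "INSERT INTO Employees (name, salary) VALUES ('Test', ?)",
            (bad_salary,),
        )
    except sqlite3.IntegrityError:
        rejected.append(bad_salary)

row_count = conn.execute("SELECT COUNT(*) FROM Employees").fetchone()[0]
print(rejected, row_count)  # [0, -100] 1
```

Only the valid row is stored; both invalid inserts are refused by the database itself rather than by application code.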
o Balanced Height: The tree remains balanced by keeping all leaf nodes at the same
depth, ensuring that all operations (search, insert, delete) take logarithmic time
(O(log n)).
o Efficient Disk Access: A B-tree minimizes disk access because each node holds many
keys and is typically sized to fit one disk block, so a single read brings in many keys,
unlike a binary search tree, which stores one key per node and needs many more reads.
For example, in a database index, a B-tree allows for quick lookups, even with millions of
records, because the tree remains balanced and ensures optimal access time. Inserting,
deleting, and searching operations in a B-tree are efficient, which is why it's widely used in
indexing large databases.
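The "millions of records" claim can be checked with a short calculation. The sketch below uses the textbook fact that a B-tree of height h and minimum degree t holds at least 2t^h − 1 keys; the minimum degree of 100 is an assumed figure chosen only for illustration.

```python
def max_btree_height(n_keys, min_degree):
    """Largest height h for which a B-tree with n_keys keys can exist.

    A B-tree of height h and minimum degree t holds at least 2 * t**h - 1 keys,
    so h can grow only while that minimum still fits within n_keys.
    """
    height = 0
    while 2 * min_degree ** (height + 1) - 1 <= n_keys:
        height += 1
    return height


def balanced_bst_height(n_keys):
    """Height of a perfectly balanced binary search tree with n_keys keys."""
    return n_keys.bit_length() - 1


n = 1_000_000
btree_h = max_btree_height(n, min_degree=100)  # assumed minimum degree
bst_h = balanced_bst_height(n)
print(btree_h, bst_h)  # 2 19
```

Height counts edges, so a lookup reads at most height + 1 nodes: three node reads in this B-tree against twenty in the balanced binary tree.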
4. List the relational algebra operators. Discuss any one such operator. (3 Marks)
The key relational algebra operators include:
o Set Difference (−): Retrieves rows in one table but not in another.
o Join (⨝): Combines related rows from two tables based on a condition.
Selection (σ) is one of the most important operators and is used to filter tuples (rows) based
on a given condition. For example, to find employees with a salary greater than $50,000, you
would apply σ salary > 50000 (Employee), where Employee stands for the relation that holds
the salary attribute.
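A minimal sketch of selection in Python, assuming a relation is represented as a list of dictionaries; the function name select and the sample rows are hypothetical.

```python
# Hypothetical Employee relation: one dictionary per tuple.
employees = [
    {"name": "Asha", "salary": 62000},
    {"name": "Ben", "salary": 48000},
    {"name": "Chen", "salary": 50000},
]


def select(relation, predicate):
    """Relational selection: keep only the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


high_earners = select(employees, lambda row: row["salary"] > 50000)
print([row["name"] for row in high_earners])  # ['Asha']
```

Selection never changes the columns of a relation; it only reduces the number of rows.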
Q.2
• Physical Level: This is the lowest level of abstraction that describes how the data is actually
stored in the system, including details about file organization, indexing, and disk storage.
• Logical Level: This level describes what data is stored and what relationships exist between
them. It presents a more abstract view, hiding the implementation details, and is concerned
with the database schema (e.g., table structures, data types, constraints).
• View Level: This highest level defines how individual users interact with the data. It presents
data to users in a simplified format, often restricting access to only the data they need.
Multiple views can be created for different user needs.
Data abstraction is important because it allows for flexibility in database management, where the
internal structure can change without affecting how users or applications interact with the data.
(A) State how an Entity-Relationship (ER) model represents real-world entities. (3 Marks)
An Entity-Relationship (ER) model is a diagrammatic way of representing real-world objects and their
relationships within a database. It uses several components to map out the structure of a database:
• Entities: These are objects or things that have a distinct existence in the real world and are
represented by rectangles in the ER diagram. For example, a "Student" or "Course" is an
entity.
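• Attributes: These describe the properties of entities and are represented by ovals. For
instance, a Student entity might have attributes like StudentID, Name, and Age.
• Relationships: These are associations between entities and are represented by diamonds.
For example, a Student "enrolls in" a Course.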
The ER model provides a high-level conceptual design for databases and is useful in visualizing how
data will be structured and how different entities interact with one another. It helps database
designers communicate the structure of the database in an easy-to-understand manner before actual
implementation.
(B) Explain the working of Cartesian product operation and the Division operation with an
appropriate example. (3 Marks)
• Cartesian Product (×): The Cartesian product combines every row from the first relation with
every row from the second relation. For example, if Relation A has 3 rows and Relation B has
2 rows, the Cartesian product A × B will result in 6 rows (3 × 2). It is the foundation for other
join operations. Example: If A = {1, 2} and B = {x, y}, then A × B will yield
{(1, x), (1, y), (2, x), (2, y)}.
• Division (÷): The division operation is used when we want to find tuples in one relation that
are related to all tuples in another relation. This is especially useful in queries involving "all"
conditions. For example, if a table has student-course pairs, and we want to find students
who have taken all available courses, we can use division. Example: Let A be a relation of
students and courses they are enrolled in, and B be a relation of all courses. Then A ÷ B will
return the students who are enrolled in all courses listed in B.
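Both operations can be sketched with Python sets. The function names and the enrollment data below are assumptions made for illustration; the division helper handles only the common case where the dividend is a set of pairs.

```python
from itertools import product


def cartesian_product(a, b):
    """Pair every element of a with every element of b."""
    return {(x, y) for x, y in product(a, b)}


def divide(pairs, divisor):
    """Return every x such that (x, y) is in `pairs` for all y in `divisor`."""
    candidates = {x for x, _ in pairs}
    return {x for x in candidates if all((x, y) in pairs for y in divisor)}


A = {1, 2}
B = {"x", "y"}
pairs_ab = cartesian_product(A, B)
print(sorted(pairs_ab))  # [(1, 'x'), (1, 'y'), (2, 'x'), (2, 'y')]

# Hypothetical student-course enrollments and the list of all courses.
enrolled = {("Ann", "DBMS"), ("Ann", "OS"), ("Raj", "DBMS")}
courses = {"DBMS", "OS"}
took_all = divide(enrolled, courses)
print(took_all)  # {'Ann'}
```

Raj is excluded because no (Raj, OS) pair exists, which is exactly the "for all" condition that division expresses.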
Q.2 (C)
1. Internal Level (Physical Level): The lowest level, describing how data is physically stored in
the database (disk storage, indexing).
2. Conceptual Level (Logical Level): Describes what data is stored and the relationships
between the data. It hides the physical storage details and focuses on structuring the data
(tables, relationships, constraints).
3. External Level (View Level): The highest level, which interacts directly with users. This level
provides individual users or groups a customized view of the data.
Q.3 (A)
Hashing is an efficient technique for data retrieval in databases, where each key is mapped to a
"hash" value using a hash function. The key benefit of hashing is that it enables direct access to the
data, making equality lookups faster on average than sequential search or binary search.
1. Direct Access: When a hash function processes a key, such as a student ID or product code,
it calculates a hash value that identifies the bucket in the hash table where the record is
stored. This avoids the need to search through multiple records.
2. Time Complexity: The efficiency of hashing lies in its time complexity for search, insertion,
and deletion operations, which is O(1) on average, meaning constant time. In contrast,
methods like binary search or B-trees take O(log n) time, where n is the number of
records.
3. Collision Handling: Though hashing is highly efficient, there can be hash collisions (where
two keys produce the same hash value). To manage this, techniques like chaining or open
addressing are used to ensure data is still accessible efficiently even in cases of collision.
For example, if we use hashing to store student records based on their student IDs, the hash
function maps each ID to a bucket, so retrieval usually needs only one probe; IDs that collide
are resolved by chaining or open addressing.
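Direct access and collision handling by chaining can be sketched as follows. The class name, the bucket count of 8, and the student records are all hypothetical; a real DBMS hash index works on disk pages rather than Python lists.

```python
class ChainedHashTable:
    """Hash table that resolves collisions by chaining (one list per bucket)."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, key):
        # Map the key to a bucket; different keys may land in the same bucket.
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite the existing record
                return
        bucket.append((key, value))

    def get(self, key):
        # Only the one bucket the key hashes to is scanned.
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        return None


students = ChainedHashTable(num_buckets=8)
students.put(101, "Asha")
students.put(109, "Ben")  # 101 % 8 == 109 % 8 == 5, so these two IDs collide
print(students.get(101), students.get(109))  # Asha Ben
```

Both records stay retrievable even though they share a bucket; the cost of a lookup grows only with the length of that one chain.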
Q.3 (B)
Discuss the importance of achieving a lossless design in relational database design. Is it always
possible? (3 Marks)
A lossless design in relational databases is essential to ensure that no data is lost when
decomposing a large table into smaller tables during the process of normalization. Achieving a
lossless design is crucial for maintaining data integrity and ensuring the system can reconstruct the
original data from the decomposed tables.
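1. Maintaining Data Integrity: Lossless design ensures that when relations are broken down
into smaller, normalized tables, the data can be joined back without loss of information.
Normalization removes redundancy and update anomalies, and the lossless property
guarantees that doing so does not discard or distort any facts.
2. Ensuring Accurate Joins: If a design is not lossless, joining the decomposed tables can
produce spurious tuples that were never in the original relation, which compromises the
database’s reliability. A lossless design guarantees that a natural join of the decomposed
tables returns exactly the original relation. A decomposition of R into R1 and R2 is lossless
if the attributes common to R1 and R2 functionally determine all attributes of R1 or of R2.
3. Is it always possible? Yes: a lossless decomposition into 3NF, and even into BCNF, always
exists. What cannot always be achieved together with BCNF is dependency preservation.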
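The difference between a lossless and a lossy decomposition can be seen on a small relation. Everything in this sketch is assumed for illustration: the relation (student, course, instructor), its rows, and the dependency course → instructor that the rows happen to satisfy.

```python
def as_relation(dict_rows):
    """Store each tuple as a frozenset of (attribute, value) pairs."""
    return {frozenset(row.items()) for row in dict_rows}


def project(relation, attrs):
    """Projection: keep only `attrs`, dropping duplicate tuples."""
    return {frozenset((a, dict(t)[a]) for a in attrs) for t in relation}


def natural_join(r1, r2):
    """Combine tuples that agree on every attribute the relations share."""
    joined = set()
    for t1 in r1:
        for t2 in r2:
            d1, d2 = dict(t1), dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                joined.add(frozenset({**d1, **d2}.items()))
    return joined


r = as_relation([
    {"student": "Ann", "course": "DBMS", "instructor": "Rao"},
    {"student": "Ann", "course": "OS", "instructor": "Lee"},
    {"student": "Raj", "course": "Networks", "instructor": "Rao"},
])

# Lossless: the shared attribute course determines instructor.
good = natural_join(project(r, ["student", "course"]),
                    project(r, ["course", "instructor"]))

# Lossy: the shared attribute instructor determines neither side.
bad = natural_join(project(r, ["student", "instructor"]),
                   project(r, ["course", "instructor"]))

print(good == r, len(bad))  # True 5
```

The lossy join returns five tuples instead of three; the two extra ones, such as Ann taking Networks, are spurious facts that were never recorded.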
Q.3 (C)
OR
Q.3 (A): Describe different join strategies for a query and explain how they affect performance (3
Marks)
1. Nested Loop Join:
o Description: Compares every row of one table (outer) with every row of the second
table (inner).
o When used: Suitable for small datasets or when the inner table has an index on the
join column.
o Performance Impact: Works poorly with large datasets because it makes O(n × m)
comparisons for tables of n and m rows, but requires little memory.
2. Hash Join:
o Description: Uses a hash table to store rows from one table and matches rows from
the other table based on the join condition.
o Performance Impact: Faster than nested loop for large tables but consumes more
memory for hash tables.
3. Merge Join:
o Description: Both tables are sorted on the join columns, and the sorted results are
merged to find matching rows.
o When used: Best if input tables are pre-sorted or sorted indexes exist.
o Performance Impact: Works faster with sorted data but adds overhead if sorting is
needed first.
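The three strategies can be compared on in-memory rows. This is a simplified sketch of an equality join on one column, assuming rows are dictionaries; the function names and the employee and department data are hypothetical, and a real DBMS works on disk pages and chooses among the strategies with a cost-based optimizer.

```python
from collections import defaultdict


def nested_loop_join(outer, inner, key):
    """Compare every outer row with every inner row."""
    return [(o, i) for o in outer for i in inner if o[key] == i[key]]


def hash_join(left, right, key):
    """Build a hash table on `right`, then probe it once per `left` row."""
    table = defaultdict(list)
    for r in right:
        table[r[key]].append(r)
    return [(l, r) for l in left for r in table.get(l[key], [])]


def merge_join(left, right, key):
    """Sort both inputs on the key, then advance through them together."""
    left = sorted(left, key=lambda row: row[key])
    right = sorted(right, key=lambda row: row[key])
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Pair the current left row with every right row sharing the key.
            k = j
            while k < len(right) and right[k][key] == lk:
                result.append((left[i], right[k]))
                k += 1
            i += 1
    return result


employees = [
    {"name": "Asha", "dept_id": 2},
    {"name": "Ben", "dept_id": 1},
    {"name": "Chen", "dept_id": 2},
]
departments = [
    {"dept_id": 1, "dept": "Sales"},
    {"dept_id": 2, "dept": "IT"},
    {"dept_id": 3, "dept": "HR"},
]


def names(pairs):
    return sorted((e["name"], d["dept"]) for e, d in pairs)


nl = names(nested_loop_join(employees, departments, "dept_id"))
hj = names(hash_join(employees, departments, "dept_id"))
mj = names(merge_join(employees, departments, "dept_id"))
print(nl == hj == mj, nl)
```

All three return the same pairs; they differ only in how many comparisons they make and how much memory or sorting they need along the way.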
Q.3 (C): Apply different SQL aggregate functions with examples (6 Marks)
SQL aggregate functions summarize data across multiple rows. Here are examples:
1. SUM():
o Example:
o Usage: Helps find the total salary expenditure in a company.
2. AVG():
o Example:
3. COUNT():
o Example:
o Example:
o Example:
o Usage: Aggregates data by department to find the employee count in each
department.
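The aggregate functions can be run against an in-memory SQLite database. The original query listings are not reproduced here, so the table Employees, its columns, and the rows are hypothetical, and including MIN() and MAX() is an assumption about which functions the unlabeled examples covered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?)",
    [("Asha", "IT", 60000), ("Ben", "IT", 40000), ("Chen", "Sales", 50000)],
)

# One row summarizing the whole table.
total, average, headcount, lowest, highest = conn.execute(
    "SELECT SUM(salary), AVG(salary), COUNT(*), MIN(salary), MAX(salary) "
    "FROM Employees"
).fetchone()

# One row per department: GROUP BY applies the aggregate within each group.
per_department = conn.execute(
    "SELECT department, COUNT(*) FROM Employees "
    "GROUP BY department ORDER BY department"
).fetchall()

print(total, average, headcount, lowest, highest)  # 150000 50000.0 3 40000 60000
print(per_department)  # [('IT', 2), ('Sales', 1)]
```

Without GROUP BY an aggregate collapses the whole table into a single row; with it, the same function is evaluated once per group.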
Q.4 (1): Describe how the UNIQUE constraint differs from the PRIMARY KEY constraint (3 Marks)
• UNIQUE Constraint:
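o Allows NULL values in the column. How many depends on the DBMS: SQL Server permits
only one NULL, while PostgreSQL, MySQL, and SQLite permit several, because one NULL is
not considered equal to another.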
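The behavior of UNIQUE alongside a PRIMARY KEY can be observed in SQLite. This sketch assumes a hypothetical Users table; the NULL handling shown is SQLite's and differs in some other systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (user_id INTEGER PRIMARY KEY, email TEXT UNIQUE)")
conn.execute("INSERT INTO Users (user_id, email) VALUES (1, 'a@example.com')")
conn.execute("INSERT INTO Users (user_id, email) VALUES (2, NULL)")
conn.execute("INSERT INTO Users (user_id, email) VALUES (3, NULL)")  # accepted by SQLite


def is_rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


duplicate_email_rejected = is_rejected(
    "INSERT INTO Users (user_id, email) VALUES (4, 'a@example.com')"
)
duplicate_key_rejected = is_rejected(
    "INSERT INTO Users (user_id, email) VALUES (1, 'b@example.com')"
)
null_emails = conn.execute(
    "SELECT COUNT(*) FROM Users WHERE email IS NULL"
).fetchone()[0]

print(duplicate_email_rejected, duplicate_key_rejected, null_emails)  # True True 2
```

Both constraints refuse duplicates, but only the UNIQUE column accepts rows with no value at all.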
• Query Processing refers to the steps a DBMS follows to execute SQL queries efficiently. It
includes:
• Role:
Q.4 (3): Explain how trivial and non-trivial dependencies affect the design of a database schema (3
Marks)
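• Trivial Dependency:
o Occurs when the right-hand side is a subset of the left-hand side (X → Y with Y ⊆ X).
o Example: A → A, or {A, B} → A.
o Impact on Design: These dependencies always hold, so they place no constraint on
the schema and do not affect its design.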
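• Non-Trivial Dependency:
o Occurs when the right-hand side is not a subset of the left-hand side (X → Y with
Y ⊄ X), for example StudentID → Name.
o A non-trivial dependency in which a non-key attribute depends on another attribute
that is not a key signals redundancy, which normalization removes.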
Q.4 (5): Use a B-tree to illustrate how data is inserted and searched in a database (3 Marks)
• A B-tree is a balanced tree structure used to store and access data efficiently. It keeps data
sorted and ensures fast insertion, deletion, and search operations.
1. Structure:
2. Insertion:
3. Search:
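Insertion and search can be illustrated with a small in-memory B-tree. This is a sketch under assumptions, not the original answer's worked example: it follows the common textbook scheme in which a full node is split on the way down, uses a minimum degree of 2 (at most 3 keys per node), and inserts an arbitrary set of distinct keys.

```python
class BTreeNode:
    def __init__(self, leaf=True):
        self.keys = []
        self.children = []
        self.leaf = leaf


class BTree:
    def __init__(self, t=2):
        self.t = t  # minimum degree: a node holds at most 2t - 1 keys
        self.root = BTreeNode()

    def search(self, key, node=None):
        if node is None:
            node = self.root
        i = 0
        while i < len(node.keys) and key > node.keys[i]:
            i += 1
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if node.leaf:
            return False
        return self.search(key, node.children[i])  # descend into one subtree

    def insert(self, key):
        if len(self.root.keys) == 2 * self.t - 1:
            # The root is full: split it so the tree grows upward by one level.
            new_root = BTreeNode(leaf=False)
            new_root.children.append(self.root)
            self._split_child(new_root, 0)
            self.root = new_root
        self._insert_non_full(self.root, key)

    def _split_child(self, parent, i):
        t = self.t
        full = parent.children[i]
        right = BTreeNode(leaf=full.leaf)
        median = full.keys[t - 1]
        right.keys = full.keys[t:]
        full.keys = full.keys[: t - 1]
        if not full.leaf:
            right.children = full.children[t:]
            full.children = full.children[:t]
        # The median key moves up and separates the two halves.
        parent.keys.insert(i, median)
        parent.children.insert(i + 1, right)

    def _insert_non_full(self, node, key):
        i = len(node.keys) - 1
        while i >= 0 and key < node.keys[i]:
            i -= 1
        i += 1
        if node.leaf:
            node.keys.insert(i, key)  # keys inside a node stay sorted
            return
        if len(node.children[i].keys) == 2 * self.t - 1:
            self._split_child(node, i)
            if key > node.keys[i]:
                i += 1
        self._insert_non_full(node.children[i], key)


def leaf_depths(node, depth=0):
    """Collect the depth of every leaf; a valid B-tree yields a single value."""
    if node.leaf:
        return {depth}
    return set().union(*(leaf_depths(c, depth + 1) for c in node.children))


tree = BTree(t=2)
for k in [10, 20, 5, 6, 12, 30, 7, 17]:
    tree.insert(k)

print(tree.root.keys)                      # [10, 20]
print(tree.search(6), tree.search(15))     # True False
print(leaf_depths(tree.root))              # {1}
```

Because new levels are only ever added at the root, every leaf ends up at the same depth, which is what keeps search, insertion, and deletion logarithmic.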
Q.4 (6): Describe the differences between primary and secondary indices in databases (3 Marks)