dbms assignment 2
three-level architecture?
(b) What are the contents defined by a schema?
(c) Why is achieving Second Normal Form (2NF) important, and how
does it differ from First Normal Form (1NF)?
OR
Q.4 (a) Which term is used to describe a situation where two or more
transactions are waiting indefinitely for each other to release locks,
leading to a system standstill?
(b) What are the techniques for concurrency control?
The primary function of the Database Management System (DBMS) in the three-
level architecture is to act as an intermediary between users and the actual physical
database. It manages data access, manipulation, and ensures data integrity and
security.
In a database schema, the schema defines the overall structure of the database,
including the definition of tables, their attributes (columns), data types, constraints
(like primary and foreign keys), and relationships between tables.
Associative Entities:
Imagine tables for Students and Courses. A student can enroll in multiple courses,
and a course can have many students. An associative entity named "Enrollment" can
be created with attributes like "enrollment_id" (primary key), "student_id" (foreign key
referencing Students), "course_id" (foreign key referencing Courses), and potentially
"semester" or "grade."
Entities:
o Student (attributes: student_id, name, major, etc.)
o Course (attributes: course_id, name, department, credits, etc.)
o Department (attributes: department_id, name, college, etc.)
o Professor (attributes: professor_id, name, department, etc.) (Optional)
Relationships:
o A Student ENROLLS_IN many Courses (Many-to-Many, with
Enrollment as the associative entity)
o A Course belongs to a DEPARTMENT (Many-to-One; one Department offers many Courses)
o A Professor TEACHES many Courses (Many-to-Many, optional
depending on the data model)
Cardinalities:
A Student can enroll in many Courses, and a Course can have many Students enrolled (M:N)
A Department can have many Courses (1:N)
A Course belongs to one Department (N:1)
A Professor can teach many Courses, and a Course can be taught by many Professors (M:N) (Optional)
Example: Student Enrollment
The choice between ER and hierarchical models depends on the data structure. ER
models are more flexible for complex relationships, while hierarchical models excel
at representing strict hierarchies. The university database with its many-to-many
relationships would be better suited for an ER model.
A primary key in a relational database table is a column (or a set of columns) that
uniquely identifies each row. It enforces data integrity by ensuring no duplicate
records exist.
Imagine a library database with tables for Books and Borrowings. A single column
like "book_id" might not be unique if there are multiple editions of the same book.
Here, a composite key using "book_id" and "edition_number" would ensure unique
identification of each book instance.
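The composite key can be sketched with Python's built-in sqlite3 module. The Books table layout, the title column, and the helper insert_is_rejected are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Books (
    book_id        INTEGER NOT NULL,
    edition_number INTEGER NOT NULL,
    title          TEXT NOT NULL,
    PRIMARY KEY (book_id, edition_number)  -- composite key
)""")
conn.execute("INSERT INTO Books VALUES (1, 1, 'Database Systems')")
# Same book_id with a different edition_number is a different key, so it is allowed.
conn.execute("INSERT INTO Books VALUES (1, 2, 'Database Systems')")

def insert_is_rejected(book_id, edition_number):
    """Return True if the key already exists and the insert is refused."""
    try:
        conn.execute(
            "INSERT INTO Books VALUES (?, ?, 'duplicate attempt')",
            (book_id, edition_number),
        )
        return False
    except sqlite3.IntegrityError:
        return True
```

Calling insert_is_rejected(1, 2) reports a rejection because that pair is already present, while a new pair such as (1, 3) is accepted.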
A partial dependency occurs when a non-key attribute (attribute not part of the
primary key) depends on only a part of the primary key, not the entire key. This can
lead to data redundancy and inconsistency issues.
Contribution to Normalization
Example:
Consider a table named "Sales" with columns: "Order ID", "Customer Name",
"Product ID", "Product Name", "Quantity", and "Price". If the primary key is the
composite ("Order ID", "Product ID"), this table has partial dependencies:
"Customer Name" depends only on "Order ID", and "Product Name" and "Price"
depend only on "Product ID". Only "Quantity" depends on the whole key.
(a) Deadlock
This term describes a situation where two or more transactions are waiting
indefinitely for each other to release locks on resources (data items) they need to
complete their operations. This creates a standstill as no transaction can proceed,
leading to a system halt.
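One common way to detect a deadlock is to look for a cycle in a wait-for graph, where an edge T1 -> T2 means T1 is waiting for a lock held by T2. The sketch below is an illustration of that idea in Python, not a description of any particular DBMS; the function name and the graph format are assumptions.

```python
def has_deadlock(waits_for):
    """Return True if the wait-for graph contains a cycle.

    waits_for maps a transaction name to the transactions it is waiting on.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current path, fully explored
    color = {}

    def visit(txn):
        color[txn] = GRAY
        for other in waits_for.get(txn, ()):
            state = color.get(other, WHITE)
            if state == GRAY:          # back edge: a cycle of waiting transactions
                return True
            if state == WHITE and visit(other):
                return True
        color[txn] = BLACK
        return False

    return any(color.get(t, WHITE) == WHITE and visit(t) for t in waits_for)
```

For example, has_deadlock({"T1": ["T2"], "T2": ["T1"]}) reports a deadlock because each transaction waits on the other.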
Imagine transferring funds between two bank accounts (Account A and Account B) in
a single transaction. Atomicity guarantees that either:
Both the debit (from A) and the credit (to B) complete successfully, updating both
accounts.
If any step fails (e.g., insufficient funds), the entire transaction is rolled back, leaving
both accounts unmodified.
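The transfer can be sketched with Python's built-in sqlite3 module. The accounts table, the starting balances, and the CHECK constraint standing in for "insufficient funds" are assumptions for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"  # no overdrafts
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(amount):
    """Move amount from A to B. Return True on commit, False on rollback."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Credit first, so a failed debit has something to undo.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = 'B'",
                (amount,),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = 'A'",
                (amount,),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    """Return the current balances as a dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

When the debit violates the CHECK constraint, the credit to B that already ran is undone as well, so the balances are the same as before the call.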
(d) Two-Phase Commit (2PC) in Distributed Transactions
2PC is a coordination protocol used in distributed database systems to ensure the
consistency of transactions that involve updates across multiple database nodes
(servers). It guarantees that all nodes participating in the transaction either commit
the changes or roll them back in a synchronized manner.
2PC Phases:
1. Prepare Phase: The coordinator node (a designated server) sends a "prepare"
message to all participating nodes. Each node performs the required operations
locally and logs the updates, but doesn't permanently commit them to the database.
2. Commit or Rollback: Based on the responses from all nodes (success or failure),
the coordinator decides:
o Commit: If all nodes confirm readiness, the coordinator sends a "commit"
message to all nodes, instructing them to permanently apply the updates.
o Rollback: If any node encounters an error or reports failure, the coordinator
sends a "rollback" message, instructing all nodes to undo the local changes
performed during the prepare phase.
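The two phases can be sketched as a small in-process simulation in Python. This is an illustration only: real 2PC involves network messages, durable logs, and timeouts, none of which are modeled here, and the class and function names are hypothetical.

```python
class Participant:
    """A node taking part in the transaction (hypothetical model)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: do the work locally, log it, and vote yes or no.
        self.state = "prepared" if self.can_commit else "failed"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"

def two_phase_commit(participants):
    """Coordinator logic: commit only if every participant votes yes."""
    votes = [p.prepare() for p in participants]  # phase 1: collect all votes
    if all(votes):                               # phase 2: global decision
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.rollback()
    return "rolled_back"
```

A single "no" vote is enough to roll back every node, which is what keeps the nodes consistent with one another.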
Example Scenario:
B-Trees are a widely used data structure for creating multi-level indexes. They offer
efficient searching and insertion/deletion operations:
Ordered Structure: B-Trees organize data in a sorted fashion, allowing for
efficient comparisons during query processing.
Self-Balancing: B-Trees automatically balance their structure to maintain
optimal search performance as data volume changes (inserts/deletes).
Multi-Level Search: Queries can traverse the B-Tree levels, starting from the
root node (highest level with a broader overview) to locate specific data blocks
containing the desired records.
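The multi-level search can be sketched in Python over a small hand-built tree. Only lookup is shown; insertion, deletion, and rebalancing are omitted, and the Node class and the sample keys are assumptions for illustration.

```python
from bisect import bisect_left

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys                # sorted keys stored in this node
        self.children = children or []  # empty list for leaf nodes

def search(node, key):
    """Walk from the root down one level at a time until key is found
    or a leaf is exhausted."""
    while True:
        i = bisect_left(node.keys, key)  # binary search inside the node
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if not node.children:            # reached a leaf without a match
            return False
        node = node.children[i]          # descend into the matching subtree

# Two-level example: the root separates three leaves.
root = Node([20, 40], [Node([5, 10]), Node([25, 30]), Node([45, 50])])
```

Searching for 30 compares against the root keys, follows the middle child, and finds the key there, so only two nodes are read.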
(c) Hash Functions and Data Integrity
Hash functions don't enforce data integrity in a database on their own. However, a
stored hash (checksum) of a record or data block can be recomputed and compared
later to detect corruption or unintended modification.
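A minimal sketch of hash-based verification using Python's hashlib. The record contents and the function names are made up for illustration; a database would typically do this per page or per block.

```python
import hashlib

def checksum(record: bytes) -> str:
    """Return the SHA-256 digest of a record as a hex string."""
    return hashlib.sha256(record).hexdigest()

def is_intact(record: bytes, expected_digest: str) -> bool:
    """Recompute the digest and compare it with the stored one."""
    return checksum(record) == expected_digest

# Digest computed when the record is written and stored alongside it.
stored = b"order=42;amount=100"
stored_digest = checksum(stored)
```

If the record is later read back as b"order=42;amount=900", is_intact returns False because the recomputed digest no longer matches the stored one.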
A clustered index on "price" would improve performance for queries like "find
products priced between $100 and $200". The physical clustering based on
price allows the database to scan through contiguous data blocks efficiently.
A covering index on "category" and "price" would allow filtering products by
category (e.g., "electronics") and price range (e.g., "$100-$200") to be
potentially answered using the index itself, without needing to access the
entire table data.
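The covering-index case can be sketched with Python's built-in sqlite3 module. The products table, its sample rows, and the index name are assumptions for illustration, and the exact wording of the query plan depends on the SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products ("
    "id INTEGER PRIMARY KEY, name TEXT, category TEXT, price REAL)"
)
# Composite index on the two columns used by the query.
conn.execute("CREATE INDEX idx_category_price ON products (category, price)")
conn.executemany(
    "INSERT INTO products (name, category, price) VALUES (?, ?, ?)",
    [("Phone", "electronics", 150.0),
     ("Cable", "electronics", 9.0),
     ("Desk", "furniture", 120.0)],
)

# The query reads only indexed columns, so the index alone can answer it.
query = (
    "SELECT category, price FROM products "
    "WHERE category = 'electronics' AND price BETWEEN 100 AND 200"
)
# The fourth column of each EXPLAIN QUERY PLAN row is the readable detail.
plan = " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query))
rows = conn.execute(query).fetchall()
```

Printing plan should show that SQLite searches the table using a covering index, meaning the table rows themselves are not read. Selecting name as well would force a lookup into the table, since name is not part of the index.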
Alternative: Significance of Primary Indexes
Primary Index Importance: