Dbmsco 3,4 Part 2
Transaction States
• Partially Committed – The final operation has executed, but the changes have not yet been saved permanently to the database.
ACID Properties
• A - Atomicity makes a transaction all-or-nothing: either every operation takes effect or none does.
• C - Consistency keeps the database valid before and after the transaction.
• I - Isolation makes sure each transaction runs independently, even with multiple
users.
• D - Durability guarantees that once a transaction is committed, the changes are
saved permanently, even if the system crashes.
These properties are the backbone of safe and secure transaction processing in
databases.
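As a rough illustration of these properties, here is a minimal sketch using Python's built-in sqlite3 module. The accounts table, the names alice and bob, and the transfer function are all hypothetical and only there for the example. A transfer that would leave a negative balance is rolled back completely, so the database never ends up in a half-updated state.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move money between two accounts as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")  # consistency rule violated
        return True
    except ValueError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))

# Hypothetical sample data, assumed only for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # succeeds and is committed
failed = transfer(conn, "alice", "bob", 500)  # fails, both updates are rolled back
final = balances(conn)
```

After both calls, only the first transfer is visible in the table. The second one leaves no trace, which is the atomicity idea in practice.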
When many transactions run at the same time, problems such as lost updates, temporary
inconsistencies, and reads of uncommitted data (dirty reads) can occur. For example, two
users editing the same record might overwrite each other’s work, which leaves wrong or
incomplete data. Another issue is deadlock, where two transactions wait forever
for each other to release resources. To prevent these issues, proper concurrency
control and transaction scheduling are needed.
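The lost update problem can be replayed step by step in plain Python. This is only a sketch: the stock dictionary and the two "transactions" T1 and T2 are made up, and the interleaving is written out by hand instead of using real threads so the result is always the same.

```python
# Deterministic replay of a lost update: both transactions read before either writes.
stock = {"item": 10}

# T1 and T2 each read the current value...
t1_read = stock["item"]
t2_read = stock["item"]

# ...then each writes back its own result, unaware of the other.
stock["item"] = t1_read - 3   # T1 sells 3 units
stock["item"] = t2_read - 4   # T2 sells 4 units and overwrites T1's write

lost_update_result = stock["item"]   # T1's sale has disappeared

# Serial execution: T2 starts only after T1 has finished.
stock = {"item": 10}
stock["item"] = stock["item"] - 3    # T1
stock["item"] = stock["item"] - 4    # T2
serial_result = stock["item"]        # both sales are counted
```

In the interleaved run the final stock is 6 even though 7 units were sold, while the serial run gives the correct value of 3.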
Concurrency control makes sure that multiple transactions can work together without
interfering with each other. Locking protocols are used to manage data access.
• An exclusive lock allows only one transaction to access and write the data item and blocks all others.
• Two-Phase Locking (2PL) is a popular method where a transaction acquires all the locks it needs first
(growing phase) and, once it releases any lock, acquires no new ones (shrinking phase).
This guarantees a serializable schedule, although 2PL on its own does not prevent deadlocks.
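The 2PL rule can be sketched with a small class. This is an illustration only, not how a real database implements its lock manager: the class name TwoPhaseTransaction, the shared lock_table dictionary, and the items A, B, and C are assumptions, and only exclusive locks are modelled.

```python
class TwoPhaseTransaction:
    """Tracks exclusive locks for one transaction and enforces the 2PL rule."""

    def __init__(self, name, lock_table):
        self.name = name
        self.lock_table = lock_table   # shared dict: item -> name of the holder
        self.held = set()
        self.shrinking = False         # becomes True after the first release

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(f"{self.name}: cannot lock after releasing (2PL violated)")
        holder = self.lock_table.get(item)
        if holder not in (None, self.name):
            return False               # held by another transaction, caller must wait
        self.lock_table[item] = self.name
        self.held.add(item)
        return True

    def unlock(self, item):
        self.shrinking = True          # the growing phase is over
        self.held.discard(item)
        if self.lock_table.get(item) == self.name:
            del self.lock_table[item]

locks = {}
t1 = TwoPhaseTransaction("T1", locks)
t2 = TwoPhaseTransaction("T2", locks)

got_a = t1.lock("A")          # growing phase
got_b = t1.lock("B")
t2_blocked = t2.lock("A")     # refused: T1 holds the exclusive lock on A
t1.unlock("A")                # T1 enters the shrinking phase
t2_after = t2.lock("A")       # A is free again

try:
    t1.lock("C")              # not allowed once T1 has started releasing
    violation = False
except RuntimeError:
    violation = True
```

The key point is the shrinking flag: after the first unlock, any further lock request by the same transaction is rejected.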
• Redo is used to reapply the changes of a committed transaction that had not yet been
written permanently to disk when a crash occurred.
Logs are maintained to keep track of all actions, which helps in performing
rollback and redo operations.
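Here is a simplified sketch of how a log can drive recovery. The log format (tuples of "write" and "commit" entries with old and new values) and the recover function are assumptions for illustration. Real systems use more elaborate schemes with checkpoints.

```python
def recover(disk, log):
    """Rebuild a correct database state from the on-disk data and the log."""
    db = dict(disk)
    committed = {txn for kind, txn, *_ in log if kind == "commit"}

    # Redo pass (forward): reapply every write of a committed transaction.
    for kind, txn, *rest in log:
        if kind == "write" and txn in committed:
            key, old_value, new_value = rest
            db[key] = new_value

    # Undo pass (backward): roll back writes of transactions that never committed.
    for kind, txn, *rest in reversed(log):
        if kind == "write" and txn not in committed:
            key, old_value, new_value = rest
            db[key] = old_value
    return db

# Hypothetical log: T1 committed, T2 was still running when the crash happened.
log = [
    ("write", "T1", "x", 1, 5),
    ("commit", "T1"),
    ("write", "T2", "y", 2, 9),
]
# State found on disk after the crash: T1's change is missing, T2's leaked through.
disk_after_crash = {"x": 1, "y": 9}

recovered = recover(disk_after_crash, log)
```

After recovery, x holds the committed value 5 (redo) and y is back to its old value 2 (rollback).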
Indexing Techniques
Indexing is a way to make searching and data retrieval faster in a database. Without an
index, the database has to check every row, which is slow. Indexes work like a table of
contents in a book. When you search for something, it tells you where to find it. Indexes
are created on one or more columns of a table, especially those used in WHERE or JOIN
conditions. This makes data access more efficient and reduces load on the database.
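To see the effect of an index, the sketch below asks SQLite (through Python's built-in sqlite3 module) how it plans to run the same query before and after an index is created. The students table, its columns, and the index name idx_students_roll_no are hypothetical. The exact wording of the plan text varies between SQLite versions, so the code only stores it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, roll_no INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO students (roll_no, name) VALUES (?, ?)",
    [(n, f"student{n}") for n in range(1000)],
)

query = "SELECT name FROM students WHERE roll_no = 500"

def plan(sql):
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)   # the last column holds the detail text

plan_before = plan(query)    # no index yet, so SQLite scans the table
conn.execute("CREATE INDEX idx_students_roll_no ON students (roll_no)")
plan_after = plan(query)     # now the plan refers to the new index
```

Printing plan_before and plan_after shows the switch from a full scan to an index search on the column used in the WHERE condition.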
B-Trees are a type of index used for sorted data and range queries. They allow fast
insertion, deletion, and search operations and are widely used in databases like MySQL
and Oracle.
Hash Indexing uses a hash function to convert a key (like an ID) into a bucket location
where the record can be found. It is very fast for exact match lookups but doesn’t work
well for range-based searches, because hashing does not keep the keys in sorted order.
Both indexing techniques help speed up query execution but are used in different
scenarios depending on the type of query (exact match or range).
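The difference between the two can be sketched in plain Python. This is not a real B-Tree: a sorted list searched with the bisect module stands in for the ordered keys a B-Tree keeps, and a dict stands in for a hash index. The record data is made up.

```python
import bisect

records = {17: "Ravi", 4: "Asha", 42: "Kiran", 23: "Meena", 8: "Arjun"}

# Hash index: a dict maps a key straight to its record, ideal for exact matches.
hash_index = dict(records)
exact = hash_index[23]

# Ordered index: keys kept sorted, similar to the leaf level of a B-Tree.
sorted_keys = sorted(records)

def range_lookup(low, high):
    """Return the records whose keys fall in [low, high], in key order."""
    start = bisect.bisect_left(sorted_keys, low)
    end = bisect.bisect_right(sorted_keys, high)
    return [records[k] for k in sorted_keys[start:end]]

in_range = range_lookup(5, 25)
```

The dict answers the exact lookup in one step but has no notion of "keys between 5 and 25". The sorted keys answer the range query by finding the two end points and reading everything in between.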
Query optimization is the process of writing and executing SQL queries in the most
efficient way possible. This involves choosing the best way to fetch data with minimum
time and resource usage. Strategies include:
Big Data refers to datasets that are so large and complex that traditional data
processing tools can’t handle them. It is described by the 3Vs – Volume (huge size),
Velocity (speed of data flow), and Variety (different types like text, images, videos).
Companies use big data to analyze customer behavior, detect fraud, and improve
services. Big data needs special tools and platforms for storage and processing.
Distributed File Systems (DFS) break big data into smaller parts (blocks) and store them across
multiple computers. This makes the system faster, more fault-tolerant, and scalable. Each
block is usually copied to more than one machine, so if one machine fails, another can
continue the task. DFS is the foundation for big data frameworks like Hadoop. It helps
store massive data reliably and allows parallel processing for faster results.
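The idea of splitting and spreading data can be sketched as follows. Everything here is a toy model held in memory: the node names, the block size of 16 bytes, the round-robin placement, and the replica count of 2 are assumptions for illustration and do not describe how any particular DFS chooses its placement.

```python
def split_into_blocks(data, block_size):
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_blocks(blocks, nodes, replicas):
    """Store each block on `replicas` different nodes, chosen round-robin."""
    storage = {node: {} for node in nodes}
    for index, block in enumerate(blocks):
        for r in range(replicas):
            node = nodes[(index + r) % len(nodes)]
            storage[node][index] = block
    return storage

def read_file(storage, block_count, failed=()):
    """Reassemble the file from any surviving replica of each block."""
    parts = []
    for index in range(block_count):
        holders = [n for n in storage if n not in failed and index in storage[n]]
        if not holders:
            raise IOError(f"block {index} is unavailable")
        parts.append(storage[holders[0]][index])
    return b"".join(parts)

data = b"big data is split into blocks and spread over many machines"
blocks = split_into_blocks(data, block_size=16)
storage = place_blocks(blocks, nodes=["node1", "node2", "node3"], replicas=2)

# Simulate one machine going down: every block still has a copy elsewhere.
recovered = read_file(storage, len(blocks), failed={"node2"})
```

With two copies of every block on different nodes, losing any single node still leaves the whole file readable.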
Hadoop Framework (HDFS)
Hadoop is an open-source framework used for storing and processing big data. Its storage
layer is HDFS (Hadoop Distributed File System), which stores large files across many
computers. Another part, MapReduce, processes the data in parallel. Hadoop is
fault-tolerant, scalable, and runs on inexpensive commodity hardware, which makes it
well suited to big data analytics in companies. It can store both structured and unstructured data.
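The MapReduce idea can be shown with the classic word count, written here in plain Python on one machine. This is a sketch of the programming model only, not Hadoop's API: the function names map_phase, shuffle, and reduce_phase and the sample lines are made up for the example.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)

lines = ["big data needs big tools", "hadoop stores big data"]
mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
```

In a real cluster, each line (or block of lines) would be mapped on a different machine and each key would be reduced independently, which is what makes the processing parallel.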
Role-Based Access Control (RBAC) means that users are given roles, and each role
has specific permissions. For example, an "Admin" can read, write, and delete data,
while a "User" can only read data. This makes it easy to manage large numbers of users
and ensure everyone gets only the access they need. RBAC improves security and
reduces the chances of accidental or unauthorized data changes.
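A minimal sketch of the Admin/User example above. The user names and the is_allowed function are hypothetical. A real database would manage roles with its own statements and catalog tables.

```python
# Each role maps to the set of actions it may perform.
ROLE_PERMISSIONS = {
    "admin": {"read", "write", "delete"},
    "user": {"read"},
}

# Users are assigned a role instead of individual permissions (hypothetical users).
USER_ROLES = {"priya": "admin", "rahul": "user"}

def is_allowed(username, action):
    """Check an action against the permissions of the user's role."""
    role = USER_ROLES.get(username)
    return action in ROLE_PERMISSIONS.get(role, set())

admin_can_delete = is_allowed("priya", "delete")
user_can_read = is_allowed("rahul", "read")
user_can_delete = is_allowed("rahul", "delete")
stranger_can_read = is_allowed("unknown", "read")   # no role, so no access
```

Changing what every "user" may do only requires editing one entry in ROLE_PERMISSIONS, which is why RBAC scales well to many users.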
Encryption is the process of converting data into a secret format so that only authorized
users can read it. Decryption is turning it back into its original form. These techniques
are important for protecting sensitive data like passwords, bank details, and personal
information. Even if a hacker gets access to the database, encrypted data stays safe
because they can’t read it without the key. Modern databases support both at-rest
encryption (for stored data) and in-transit encryption (for data being transferred).
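The encrypt and decrypt round trip can be sketched with a toy cipher built from Python's hashlib. This is for learning only and must not be used to protect real data: it has no nonce or integrity check, and real systems rely on vetted algorithms such as AES. The function names and the sample data are assumptions.

```python
import hashlib

def keystream(key, length):
    """Derive `length` pseudo-random bytes from the key (SHA-256 in counter mode)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(plaintext, key):
    """XOR the data with the key-derived stream (toy cipher, not secure)."""
    stream = keystream(key, len(plaintext))
    return bytes(p ^ s for p, s in zip(plaintext, stream))

decrypt = encrypt   # XOR with the same keystream reverses the operation

secret = b"card=1234-5678"
ciphertext = encrypt(secret, b"correct key")
restored = decrypt(ciphertext, b"correct key")   # original data comes back
wrong = decrypt(ciphertext, b"wrong key")        # unreadable without the right key
```

The stored ciphertext is unreadable on its own, and only the matching key turns it back into the original bytes.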