DBMS Viva
2. Transaction Management
Transaction management in databases ensures that database operations are executed reliably and
consistently, following the ACID properties: Atomicity, Consistency, Isolation, and Durability.
3. Properties of Transactions
The properties of transactions, often abbreviated as ACID, are:
Atomicity: Transactions are all-or-nothing; either all operations within a transaction are completed
successfully, or none of them are. There are no partial executions.
Consistency: Transactions ensure that the database remains in a consistent state before and after the
transaction. Integrity constraints and data validity rules are maintained.
Isolation: Transactions are isolated from each other to prevent interference. Changes made by one
transaction are not visible to other transactions until they are committed.
Durability: Once a transaction is committed, its changes are permanent and survive system failures. They are
stored in a way that ensures they can be recovered even after a crash.
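A minimal sketch of atomicity and durability in practice, using Python's built-in sqlite3 module (the accounts table and amounts are invented for illustration): if any statement in the transfer fails, rollback() undoes all of them; once commit() returns, the changes persist.

    import sqlite3

    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
    conn.commit()

    try:
        # Both updates belong to one transaction: transfer 30 from alice to bob.
        conn.execute("UPDATE accounts SET balance = balance - 30 WHERE name = 'alice'")
        conn.execute("UPDATE accounts SET balance = balance + 30 WHERE name = 'bob'")
        conn.commit()    # durability point: the transfer is now permanent
    except sqlite3.Error:
        conn.rollback()  # atomicity: neither update survives
    print(dict(conn.execute("SELECT name, balance FROM accounts")))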
4. Concept of Schedule
A schedule in a database is the chronological order in which the operations of a set of transactions are executed. It shows how transactions are interleaved when they run concurrently in a multi-user database system. The sequence of read and write operations in a schedule determines whether the data stays consistent, which is why schedules are the central object of concurrency control.
5. Serial Schedule
A serial schedule in a database executes transactions sequentially, one after another, without interleaving
or concurrency.
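A worked example in the usual notation, where Ri(X)/Wi(X) means transaction Ti reads/writes item X (items A and B are illustrative):

    Serial schedule (T1 then T2):   R1(A) W1(A) R1(B) W1(B) R2(A) W2(A)
    Interleaved schedule:           R1(A) W1(A) R2(A) W2(A) R1(B) W1(B)

Both contain the same operations; only the second lets T1 and T2 overlap.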
7. Cascaded Aborts
Cascaded aborts (cascading rollback) occur when the abort of one transaction forces the rollback of other transactions that have read its uncommitted data, potentially leading to a chain reaction of aborts throughout the system.
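A sketch of how the chain forms (illustrative schedule): T2 reads a value written by the still-uncommitted T1 (a dirty read), and T3 reads from T2, so aborting T1 drags both down:

    W1(A)  R2(A)  W2(B)  R3(B)  abort(T1)  ->  abort(T2)  ->  abort(T3)

Cascadeless schedules avoid this by letting transactions read only committed data.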
Need for Concurrency Control: Concurrency control is essential in multi-user database systems to manage simultaneous access to shared data, preventing interference between transactions and maintaining consistency.
10. Deadlocks
Deadlocks in a database occur when two or more transactions are waiting for each other to release resources
that they need, resulting in a situation where no transaction can proceed.
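Deadlocks are commonly detected by finding a cycle in the wait-for graph, where an edge Ti -> Tj means Ti is waiting for a lock held by Tj. A minimal sketch of that check in Python, with a made-up graph:

    def has_cycle(wait_for):
        # Depth-first search for a cycle in {txn: [txns it waits for]}.
        visiting, done = set(), set()
        def dfs(t):
            if t in visiting:
                return True          # back edge: a wait cycle exists
            if t in done:
                return False
            visiting.add(t)
            if any(dfs(n) for n in wait_for.get(t, [])):
                return True
            visiting.discard(t)
            done.add(t)
            return False
        return any(dfs(t) for t in wait_for)

    # T1 waits for T2 and T2 waits for T1: deadlock.
    print(has_cycle({"T1": ["T2"], "T2": ["T1"]}))   # True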
13. Checkpoints
Checkpoints record a known-consistent database state, typically by flushing modified pages to disk and writing a checkpoint record to the log, so that recovery after a crash can start from the last checkpoint instead of scanning the entire log.
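A toy sketch of why this helps (the log records here are invented for illustration): recovery redoes only what comes after the last checkpoint rather than replaying the whole log.

    log = [
        ("write", "A", 1), ("write", "B", 2),
        ("checkpoint",),               # everything above is already safely on disk
        ("write", "A", 5),             # only this needs redoing after a crash
    ]
    start = max(i for i, rec in enumerate(log) if rec[0] == "checkpoint") + 1
    state = {"A": 1, "B": 2}           # pretend this was restored from disk
    for _, item, value in log[start:]:
        state[item] = value
    print(state)                       # {'A': 5, 'B': 2}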
4. Parallel Databases
Parallel databases distribute data processing tasks across multiple processors to improve performance by
leveraging parallelism and exploiting parallel hardware architectures.
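A toy illustration of the idea in Python (a real parallel DBMS does this with query operators, not lists): partition the rows, aggregate each partition on its own process, then combine the partial results.

    from concurrent.futures import ProcessPoolExecutor

    def partial_sum(rows):          # runs independently on each partition
        return sum(rows)

    if __name__ == "__main__":
        rows = list(range(1_000_000))
        chunks = [rows[i::4] for i in range(4)]           # 4 partitions
        with ProcessPoolExecutor(max_workers=4) as pool:
            print(sum(pool.map(partial_sum, chunks)))     # == sum(rows)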
7. Distributed Databases
Distributed databases store data across multiple nodes connected via a network, enabling scalability, fault
tolerance, and decentralized data management.
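One common way to decide where each record lives is hash partitioning (sharding). A minimal sketch with invented node names (md5 is used only because it gives a hash that is stable across runs):

    import hashlib

    NODES = ["node0", "node1", "node2"]

    def node_for(key):
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return NODES[h % len(NODES)]   # same key always maps to the same node

    print(node_for("user:42"), node_for("user:43"))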
11. Basics of Distributed Transactions
Distributed transactions are fundamental units of work that span multiple databases or systems, maintaining
ACID properties (Atomicity, Consistency, Isolation, Durability) across distributed environments.
12. Failure Modes
Failure modes in distributed transactions refer to potential issues such as network failures, system crashes, or
communication errors that can disrupt the coordination and execution of transactions across multiple
databases or systems.
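The standard way to keep a transaction atomic across sites despite such failures is an atomic commit protocol such as two-phase commit (2PC). A minimal single-process sketch (participants are simulated as callables): the coordinator commits only if every participant votes yes in the prepare phase; any "no" vote, crash, or lost message leads to a global abort.

    def two_phase_commit(participants):
        votes = []
        for prepare in participants:       # phase 1: collect votes
            try:
                votes.append(prepare())    # True = "yes", False = "no"
            except Exception:              # a crash or lost message counts as "no"
                votes.append(False)
        return "commit" if all(votes) else "abort"   # phase 2: one global decision

    print(two_phase_commit([lambda: True, lambda: True]))    # commit
    print(two_phase_commit([lambda: True, lambda: False]))   # abort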
3. Document Store
Document stores are NoSQL databases that store data in a semi-structured format, typically using JSON or
BSON documents. Each document is a self-contained unit containing key-value pairs, allowing for flexible
schema and nested data structures. Examples include MongoDB, Couchbase, and Elasticsearch.
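For example, a customer and their orders can live in one self-contained document instead of being split across normalized tables (all field names below are illustrative):

    import json

    customer = {
        "_id": "c1001",
        "name": "Asha",
        "orders": [                        # nested data: no join needed to read it
            {"order_id": 1, "items": ["pen", "notebook"], "total": 120},
            {"order_id": 2, "items": ["stapler"], "total": 250},
        ],
    }
    print(json.dumps(customer, indent=2))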
4. Graph
Graph databases are NoSQL databases that represent data as nodes, edges, and properties, enabling efficient
storage and traversal of complex relationships between entities. Examples include Neo4j, Amazon Neptune,
and JanusGraph.
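A toy in-memory version of the model (a real graph database indexes and traverses far larger graphs): nodes connected by edges, and a two-hop "friends of friends" traversal.

    edges = {                              # adjacency list: node -> neighbours
        "alice": ["bob", "carol"],
        "bob": ["dave"],
        "carol": ["dave", "erin"],
    }

    def friends_of_friends(person):
        direct = set(edges.get(person, []))
        return {f2 for f in direct for f2 in edges.get(f, [])} - direct - {person}

    print(friends_of_friends("alice"))     # dave and erin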
5. Performance
Performance in NoSQL databases is characterized by high throughput, low latency, and the ability to scale horizontally while handling large volumes of data.
10. NoSQL Data Models: NoSQL data models include key-value, document-oriented, column-family, and graph
databases, each tailored to specific data storage and retrieval needs.
11. Case Study-unstructured data from social media: Analyzing unstructured data from social media platforms
involves collecting, storing, and analyzing text, images, videos, and other content to extract insights about
user behavior, sentiment analysis, and trends.
12. Introduction to Big Data: Big Data refers to large volumes of structured, semi-structured, and unstructured
data that cannot be processed or analyzed using traditional database management tools. It encompasses the
three Vs: volume, velocity, and variety.
13. HADOOP: Hadoop is an open-source framework for distributed storage and processing of Big Data across
clusters of commodity hardware. It provides scalability, fault tolerance, and parallel processing capabilities
for handling massive datasets.
14. HDFS (Hadoop Distributed File System): HDFS is the primary storage system used by Hadoop for distributed
storage of large datasets across multiple nodes in a Hadoop cluster. It stores data in a fault-tolerant manner
and enables high-throughput data access.
15. MapReduce: MapReduce is a programming model and processing engine used in Hadoop for parallel
processing and analysis of large datasets. It divides tasks into map and reduce phases, enabling distributed
computation across a Hadoop cluster.
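The classic illustration is word count. A minimal single-machine sketch of the two phases (real MapReduce distributes the map tasks and the shuffle across the cluster):

    from collections import defaultdict

    docs = ["big data big ideas", "data moves fast"]

    # Map phase: emit a (word, 1) pair for every word in every document.
    pairs = [(word, 1) for doc in docs for word in doc.split()]

    # Shuffle: group the emitted values by key.
    groups = defaultdict(list)
    for word, one in pairs:
        groups[word].append(one)

    # Reduce phase: combine each key's values into a single result.
    counts = {word: sum(vals) for word, vals in groups.items()}
    print(counts)   # {'big': 2, 'data': 2, 'ideas': 1, 'moves': 1, 'fast': 1}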