DBDM 2 Marks
UNIT-1
A database management system (DBMS) is a collection of interrelated data and a set of programs to
access those data.
Applications of DBMS
a) Banking
b) Airlines
c) Universities
e) Telecommunication
f) Finance
g) Sales
h) Manufacturing
a) Physical level
b) Logical level
c) View level
5. Define instance and schema?
Instance: The collection of data stored in the database at a particular moment is called an instance of the
database.
Schema: The overall design of the database is called the database schema.
1) Physical schema
2) Logical schema
Physical schema: The physical schema describes the database design at the physical level,
which is the lowest level of abstraction describing how the data are actually stored.
Logical schema: The logical schema describes the database design at the logical level, which
describes what data are stored in the database and what relationship exists among the data.
Entity type: An entity type defines a collection of entities that have the same attributes.
Entity set: The set of all entities of the same type is termed as an entity set.
10. What are attributes? Give examples.
Example: possible attributes of customer entity are customer name, customer id, Customer
Types of attributes
o Simple
o Composite
o Single-valued
o Multi-valued
o Derived
Single-valued attributes: Attributes with a single value for a particular entity are called single-valued
attributes.
Multivalued attributes: Attributes with a set of values for a particular entity are called
multivalued attributes.
Weak entity set: An entity set that does not have a key attribute of its own is called a weak entity set.
Strong entity set: An entity set that has a primary key is termed a strong entity set.
• Total: The participation of an entity set E in a relationship set R is said to be total if every
entity in E participates in at least one relationship in R.
• Partial: If only some entities in E participate in relationships in R, the participation of entity set
E in R is said to be partial.
Mapping cardinalities or cardinality ratios express the number of entities to which another entity
can be associated via a relationship set.
• One to one
• One to many
• Many to one
Example: A depositor relationship associates a customer with each account that he/she has.
Part-B
Part-A
These rows in the table denote a real-world entity or relationship. Each column of the table corresponds to
an attribute of the relation.
Relational model uses a collection of tables to represent both data and the relationships among those data.
The relational model is an example of a record based model.
A relation represents the associations among the attributes of an entity as well as the relationships among
different entities. A relation is a subset of the Cartesian product of a list of domains.
For each attribute there is a set of permitted values called the domain of that attribute.
4. List some relational integrity constraints.
Integrity constraints ensure that changes made to the database by authorized users do not result in a loss of
data consistency. Thus integrity constraints guard against accidental damage to the database.
1. Domain Constraints
2. Key constraints
3. Referential integrity constraints
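The three kinds of constraint listed above can be tried out with a minimal sketch using Python's built-in sqlite3 module. The dept and emp tables, their columns and the age range are hypothetical, chosen only for illustration. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Hypothetical tables: dept is referenced by emp.
conn.execute("CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE emp ("
    " emp_id INTEGER PRIMARY KEY,"                   # key constraint
    " age INTEGER CHECK (age BETWEEN 18 AND 65),"    # domain constraint
    " dept_id INTEGER REFERENCES dept(dept_id))"     # referential integrity constraint
)
conn.execute("INSERT INTO dept VALUES (1, 'Sales')")
conn.execute("INSERT INTO emp VALUES (10, 30, 1)")


def rejected(sql):
    """Return True if the database refuses the statement."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


results = {
    "domain": rejected("INSERT INTO emp VALUES (11, 12, 1)"),        # age outside the domain
    "key": rejected("INSERT INTO emp VALUES (10, 40, 1)"),           # duplicate primary key
    "referential": rejected("INSERT INTO emp VALUES (12, 40, 99)"),  # no such department
}
print(results)
```

Each of the three inserts violates exactly one constraint, so each is refused and the stored data stays consistent.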
DDL: A database schema is specified by a set of definitions expressed in a special language called a data
definition language (DDL).
DML:
A data manipulation language is a language that enables users to access or manipulate data as organized by
the appropriate data model.
A super key is a set of one or more attributes that collectively allows us to identify uniquely an entity in the
entity set.
Candidate keys are superkeys for which no proper subset is a super key, i.e. minimal superkeys. The primary key is
the candidate key chosen by the database designer to identify entities within the entity set.
To create the view, we can select the fields from one or more tables present in database.
A view can either have specific rows based on certain condition or all the rows of a table.
Syntax:
CREATE VIEW view_name AS
SELECT column1, column2, ...
FROM table_name
WHERE condition;
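As a minimal sketch of the syntax above, the following uses Python's built-in sqlite3 module. The student table and the adult_student view are hypothetical names used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical base table.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(42, "abc", 17), (43, "pqr", 18), (44, "xyz", 18)],
)

# The view keeps two of the three columns and only the rows meeting the condition.
conn.execute(
    "CREATE VIEW adult_student AS "
    "SELECT roll_no, name FROM student WHERE age >= 18"
)

view_rows = conn.execute("SELECT * FROM adult_student ORDER BY roll_no").fetchall()
print(view_rows)  # [(43, 'pqr'), (44, 'xyz')]
```

The view stores no rows of its own; it is evaluated against the base table each time it is queried.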
Structured Query Language (SQL) is a database language that can be used both to create a database and to
perform operations on an existing database.
Part-B
Axioms or inference rules provide a simple technique that can be applied to a set of functional
dependencies (FDs) to derive other FDs. The inference rules are known as Armstrong's axioms and are used to
infer functional dependencies on a relational database.
Inference rules:
Reflexivity: If X ⊇ Y then X → Y
Augmentation: If X → Y then XZ → YZ
Transitivity: If X → Y and Y → Z then X → Z
Union: If X → Y and X → Z then X → YZ
Decomposition: If X → YZ then X → Y and X → Z
Pseudotransitivity: If X → Y and YZ → W then XZ → W
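In practice the rules are rarely applied by hand one at a time. A common shortcut is to compute the closure of an attribute set: X → Y can be derived from a set of FDs exactly when Y is contained in the closure of X. The sketch below is one possible way to do this; the function names closure and implies are illustrative, not taken from the notes.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by `attrs` under `fds`.

    `fds` is a list of (lhs, rhs) pairs; each side is a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If everything on the left is already determined, so is the right.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def implies(fds, lhs, rhs):
    """Return True if lhs -> rhs can be derived from `fds`."""
    return set(rhs) <= closure(lhs, fds)


# The premises of the last rule: X -> Y and YZ -> W.
fds = [({"X"}, {"Y"}), ({"Y", "Z"}, {"W"})]

print(sorted(closure({"X", "Z"}, fds)))   # ['W', 'X', 'Y', 'Z']
print(implies(fds, {"X", "Z"}, {"W"}))    # True, so XZ -> W holds
print(implies(fds, {"X"}, {"W"}))         # False, X alone does not determine W
```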
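A functional dependency is a relationship between two sets of attributes in which the value of one determines
the value of the other, typically the primary key and a non-key attribute within a table.
FD: X → Y
Here X is the determinant and Y is the dependent: rows that agree on X must also agree on Y.
The primary key of a table is the column or group of columns (composite key) that uniquely identifies
each record in the table, so every other attribute is functionally dependent on it.
Emp_Id → Emp_Name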
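Whether a dependency such as Emp_Id → Emp_Name holds in a given table instance can be checked mechanically: no two rows may agree on the left side and differ on the right side. The sketch below assumes rows are represented as dictionaries. The sample rows and the Dept column are hypothetical.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs holds in `rows`.

    `rows` is a list of dicts; `lhs` and `rhs` are lists of column names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first row with this key fixes the value; later rows must match it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data.
employees = [
    {"Emp_Id": 1, "Emp_Name": "abc", "Dept": "CO"},
    {"Emp_Id": 2, "Emp_Name": "pqr", "Dept": "EC"},
    {"Emp_Id": 3, "Emp_Name": "abc", "Dept": "IT"},
]

print(fd_holds(employees, ["Emp_Id"], ["Emp_Name"]))  # True
print(fd_holds(employees, ["Emp_Name"], ["Dept"]))    # False: "abc" maps to CO and IT
```

A check like this can only refute a dependency on the current data. Whether an FD is a rule of the schema is a design decision.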
Lossless-join decomposition
Dependency preservation
Repetition of information
For example,
roll_no name age
42 abc 17
43 pqr 18
44 xyz 18
Example:
1. ID → Name,
2. Name → DOB
roll_no name age
42 abc 17
43 pqr 18
44 xyz 18
roll_no → name is a non-trivial functional dependency, since the dependent name is not a subset of
determinant roll_no
In a multivalued functional dependency, the attributes of the dependent set are not dependent on each other,
i.e. if a → {b, c} and there exists no functional dependency between b and c, then it is called a
multivalued functional dependency.
For example,
roll_no name age
42 abc 17
43 pqr 18
44 xyz 18
roll_no → {name, age} is a multivalued functional dependency, since the dependents name & age are not
dependent on each other
For example,
42 abc CO 4
43 pqr EC 2
44 xyz IT 1
45 abc EC 2
9. What is Transitive Dependency?
A transitive dependency exists when a non-prime attribute depends on other non-prime attributes rather than
depending directly on the prime attributes or primary key.
An attribute in a table depends on only a part of the primary key and not on the whole key.
Partial Dependency exists, when for a composite primary key, any attribute in the table depends
only on a part of the primary key and not on the complete primary key.
A minimal cover is a simplified and reduced version of the given set of functional dependencies.
1) Split the right-hand attributes of all FDs. Example: A → XY becomes A → X, A → Y
2) Remove all redundant FDs. Example: in { A → B, B → C, A → C }, A → C is redundant
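The two steps above can be sketched as follows. This is an illustrative sketch rather than a complete minimal-cover algorithm: a full procedure also removes extraneous attributes from left-hand sides, which is not among the steps listed here. Attributes are assumed to be single letters, so the string "XY" stands for the attribute set {X, Y}; all function names are hypothetical.

```python
def closure(attrs, fds):
    """Attributes determined by `attrs`; each FD is (frozenset lhs, single attribute)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, attr in fds:
            if lhs <= result and attr not in result:
                result.add(attr)
                changed = True
    return result


def split_rhs(fds):
    """Step 1: A -> XY becomes A -> X and A -> Y."""
    return [(frozenset(lhs), attr) for lhs, rhs in fds for attr in sorted(rhs)]


def drop_redundant(fds):
    """Step 2: drop every FD that the remaining FDs already imply."""
    kept = list(fds)
    for fd in list(kept):
        others = [f for f in kept if f != fd]
        lhs, attr = fd
        if attr in closure(lhs, others):
            kept = others
    return kept


def readable(fds):
    """Turn (frozenset, attr) pairs back into plain strings for printing."""
    return [("".join(sorted(lhs)), attr) for lhs, attr in fds]


step1 = readable(split_rhs([("A", "XY")]))
print(step1)  # [('A', 'X'), ('A', 'Y')]

cover = readable(drop_redundant(split_rhs([("A", "B"), ("B", "C"), ("A", "C")])))
print(cover)  # [('A', 'B'), ('B', 'C')]
```

A → C is dropped because the closure of A under the other two FDs already contains C.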
It is a process of analyzing the given relation schemas based on their Functional Dependencies
• Minimizing redundancy
13. Define Anomalies. Write their types. How are they reduced in a database?
Anomalies are problems caused by data redundancy (repetition) and data inconsistency. If a table is not properly
normalized and has data redundancy, it takes extra memory space and the database becomes difficult to handle
and update.
Types of Anomalies:
Insertion,
Updation and
Deletion
For a table to be in the First Normal Form, it should follow the following rules:
A relation schema R is in 2NF if it is in 1NF and every non-prime attribute A in R is fully functionally
dependent on primary key.
Rules of 2NF
A -> B is a trivial FD
Part-B
1. Define Normalization. Elaborate the various normal forms in detail.
2. Define Functional Dependency. Classify its types in detail.
3. Elaborate the steps involved in mapping of an ER Model-to-Relational Model.
4. State the Properties of relational decomposition
UNIT-4
Part-A
ACID is a set of properties that guarantee that database transactions are processed reliably.
A transaction is performed by a single user to carry out operations for accessing the contents of the database.
Properties of transaction
Atomicity
Consistency
Isolation
Durability
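Atomicity and consistency can be observed with a small sketch using Python's built-in sqlite3 module. The account table, the CHECK rule on the balance and the transfer function are hypothetical, chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical consistency rule: a balance may never become negative.
conn.execute(
    "CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; return True on commit."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "A", "B", 30)       # both updates are committed
failed = transfer(conn, "A", "B", 500)  # debit violates the CHECK, credit is undone
balances = dict(conn.execute("SELECT id, balance FROM account"))
print(ok, failed, balances)  # True False {'A': 70, 'B': 80}
```

In the failed transfer the credit to B had already been executed, yet B still shows 80 afterwards: either all operations of a transaction take effect or none do.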
• Active
• Partially committed
• Failed
• Aborted
• Committed
• Terminated
3. What are the two operations that access data in a transaction?
Read(x): transfers data item x from the database to a local buffer of the transaction.
Write(x): transfers data item x from the local buffer back to the database.
4. What are the steps followed in executing Read(x) and Write(x) in a transaction?
A series of operations from one transaction to another transaction is known as a schedule. It is used to
preserve the order of the operations in each of the individual transactions.
Types of Schedule:
1. Serial
2. Non-Serial
3. Serializable
Conflict Serializable
View Serializable
Serializability of schedules is used to find non-serial schedules that allow the transaction to execute
concurrently without interfering with one another.
It identifies which schedules are correct when executions of the transaction have interleaving of
their operations.
A non-serial schedule will be serializable if its result is equal to the result of its transactions
executed serially.
Any schedule produced by concurrent processing of a set of transactions will have an effect equivalent to a
schedule produced when these transactions are run serially in some order. This guarantee is called
serializability.
Types of serializability
Conflict serializability
View serializability
9. How do you find whether a schedule is conflict serializable or not using a precedence graph?
If the graph has a cycle, then the schedule is not conflict serializable. If the graph
contains no cycle, then the schedule is conflict serializable.
If a precedence graph contains a single edge Ti → Tj, then all the instructions of Ti are executed before
the first instruction of Tj is executed.
If a precedence graph for schedule S contains a cycle, then S is non-serializable. If the precedence graph
has no cycle, then S is known as serializable.
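The cycle test can be sketched in a few lines. The sketch assumes a schedule is given as a list of (transaction, action, item) triples with action "R" or "W", and adds an edge Ti → Tj whenever an operation of Ti precedes a conflicting operation of Tj (same item, different transactions, at least one write). The representation and function names are illustrative.

```python
def precedence_edges(schedule):
    """Return the set of edges (Ti, Tj) of the precedence graph."""
    edges = set()
    for i, (ti, ai, xi) in enumerate(schedule):
        for tj, aj, xj in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if ti != tj and xi == xj and "W" in (ai, aj):
                edges.add((ti, tj))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph given by `edges`."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = {}  # node -> "visiting" while on the current path, "done" afterwards

    def visit(node):
        state[node] = "visiting"
        for nxt in graph[node]:
            if state.get(nxt) == "visiting":
                return True
            if nxt not in state and visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(node not in state and visit(node) for node in graph)


def is_conflict_serializable(schedule):
    return not has_cycle(precedence_edges(schedule))


# T2 writes A between T1's read and T1's write: edges T1 -> T2 and T2 -> T1.
s1 = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]
# T1 finishes with A before T2 touches it: only T1 -> T2.
s2 = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]

print(is_conflict_serializable(s1))  # False
print(is_conflict_serializable(s2))  # True
```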
10. Define Serialization Graph. Write the conditions for drawing the graph.
Serialization Graph is used to test the Serializability of a schedule. For schedule S, we construct a
graph known as precedence graph.
This graph is a pair G = (V, E), where V consists of a set of vertices and E consists of a set of edges. The
set of vertices contains all the transactions participating in the schedule. The set of edges
contains all edges Ti → Tj for which one of the three conditions holds:
1. Ti executes write(Q) before Tj executes read(Q).
2. Ti executes read(Q) before Tj executes write(Q).
3. Ti executes write(Q) before Tj executes write(Q).
Serializability of schedules generated by concurrently executing transactions can be ensured through one of
a variety of mechanisms called concurrency control.
In a multi-user system, multiple users can access and use the same database at one time, which is known as
the concurrent execution of the database. It means that the same database is executed simultaneously on a
multi-user system by different users.
Problems with Concurrent Execution
15. Define lock.
A lock is a variable associated with a data item, used for synchronizing the access of concurrent transactions
to that data item.
Shared
Exclusive
The concurrency control protocols ensure the atomicity, consistency, isolation, durability and
serializability of the concurrent execution of the database transactions. Therefore, these protocols are
categorized as:
In this type of protocol, a transaction cannot read or write data until it acquires an appropriate lock
on it. There are two types of lock:
1. Shared lock:
It is also known as a read-only lock. Under a shared lock, the data item can only be read by the transaction.
It can be shared between transactions because a transaction holding this lock cannot
update the data item.
2. Exclusive lock:
Under an exclusive lock, the data item can be both read and written by the transaction.
This lock is exclusive: while it is held, multiple transactions cannot modify the same data
simultaneously.
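The compatibility of the two lock types can be sketched with a small lock table. The class below is a hypothetical illustration, not a real DBMS component: it only decides whether a request is granted, and does not model waiting queues, lock upgrades or deadlock handling.

```python
class LockTable:
    """Grants shared ("S") and exclusive ("X") locks on data items."""

    def __init__(self):
        self.held = {}  # item -> (mode, set of transactions holding the lock)

    def acquire(self, txn, item, mode):
        """Return True if the lock is granted, False if txn would have to wait."""
        if item not in self.held:
            self.held[item] = (mode, {txn})
            return True
        held_mode, holders = self.held[item]
        if mode == "S" and held_mode == "S":
            holders.add(txn)  # shared locks are compatible with each other
            return True
        return False  # any combination involving an exclusive lock conflicts

    def release(self, txn, item):
        """Give up txn's lock on item; the entry disappears with its last holder."""
        _mode, holders = self.held.get(item, (None, set()))
        holders.discard(txn)
        if item in self.held and not holders:
            del self.held[item]


table = LockTable()
r1 = table.acquire("T1", "A", "S")  # True: first reader
r2 = table.acquire("T2", "A", "S")  # True: readers can share
r3 = table.acquire("T3", "A", "X")  # False: writer must wait for the readers
table.release("T1", "A")
table.release("T2", "A")
r4 = table.acquire("T3", "A", "X")  # True: item is free now
r5 = table.acquire("T1", "A", "S")  # False: exclusive lock blocks readers
print(r1, r2, r3, r4, r5)
```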
The two-phase locking protocol divides the execution of a transaction into three parts.
In the third part, the transaction cannot demand any new locks; it only releases the locks it has acquired.
Growing phase: a transaction may obtain new locks but may not release any lock.
Shrinking phase: a transaction may release locks but may not obtain any new locks.
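The two-phase rule is easy to check for a single transaction: once the first lock has been released, no further lock may be requested. The sketch below assumes the transaction's locking behaviour is given as a list of ("lock" or "unlock", item) pairs. The function name is illustrative.

```python
def obeys_two_phase_locking(operations):
    """Return True if no lock is acquired after the first unlock."""
    shrinking = False
    for action, _item in operations:
        if action == "unlock":
            shrinking = True   # the shrinking phase starts with the first release
        elif action == "lock" and shrinking:
            return False       # acquiring a lock in the shrinking phase breaks 2PL
    return True


two_phase = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
not_two_phase = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]

print(obeys_two_phase_locking(two_phase))      # True
print(obeys_two_phase_locking(not_two_phase))  # False
```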
Part-B
Part-A
Object Query Language (OQL) is a version of the Structured Query Language. Like SQL, OQL is a
declarative (not procedural) language
Object Query Language is a query language standard for accessing data in object-oriented databases
Features:
SQL-like query language with special features for objects, values and methods.
NoSQL is a non-SQL or non-relational database that provides a mechanism for storage and retrieval of data.
NoSQL databases ("not only SQL") are non-tabular databases and store data differently than relational tables.
Features:
5. Object databases:
6. XML databases
Scalability
Replication Models
Master-slave replication
Master-master replication
The CAP theorem states that it is not possible to guarantee all three of the desirable properties (consistency,
availability and partition tolerance) at the same time in a distributed system with data replication.
The three letters in CAP refer to three desirable properties of distributed systems with replicated data:
Consistency (among replicated copies),
Availability (of the system for read and write operations) and
Partition tolerance (in the face of the nodes in the system being partitioned by a network fault).
Document-based NoSQL systems store data in the form of documents using well-known
formats, such as JSON (JavaScript Object Notation). Documents are accessible via their document
id, but can also be accessed rapidly using other indexes.
MongoDB documents are stored in BSON (Binary JSON) format, which is a variation of JSON with some
additional data types and is more efficient for storage than JSON. Individual documents are stored in a
collection.
Create,
Read,
Update,
Delete.
db.<collection_name>.insert(<document(s)>)
db.<collection_name>.remove(<condition>)
db.<collection_name>.find(<condition>)
7. Define Column-Based NoSQL Systems
Column-based NoSQL systems partition a table by column into column families (a form of vertical
partitioning). They form one category of NoSQL systems, known as column-based systems.
The Google distributed storage system for big data, known as BigTable, is a well-known example of this class
of NoSQL systems.
BigTable uses the Google File System (GFS) for data storage and distribution.
An open source system known as Apache HBase is similar to Google BigTable, but it typically uses HDFS
(Hadoop Distributed File System) for data storage.
HDFS is used in many cloud computing applications. HBase can also use Amazon's Simple Storage Service
(known as S3) for data storage.
The data model in HBase organizes data using the concepts of namespaces, tables, column families, column
qualifiers, columns, rows, and data cells.
create,
read,
update,
delete
A row type is a sequence of field name/data type pairs that provides a data type to represent the types of rows in
tables.
Create a Branch table consisting of branch number and address, and insert a record into the new table:
CREATE TABLE Branch (
branchNo CHAR(4),
address ROW(street VARCHAR(25),
city VARCHAR(15),
postcode ROW(cityIdentifier VARCHAR(4),
subPart VARCHAR(4))));
INSERT INTO Branch VALUES ('B005', ROW('23 Deer Rd', 'London', ROW('SW1', '4EH')));
User-defined types (UDTs), also referred to as abstract data types (ADTs), subdivide into two categories:
Distinct types.
Structured types.
The simpler type of UDT is the distinct type, which allows differentiation between the same base types.
A UDT definition consists of one or more attribute definitions, zero or more routine declarations (methods) and,
in a subsequent release, operator declarations.
Consistency means that the nodes will have the same copies of a replicated data item visible for various transactions.
Availability means that each read or write request for a data item will either be processed successfully or
receive a message that the operation cannot be completed.
Partition tolerance means that the system can continue operating if the network has a fault that results in two
or more partitions, where the nodes in each partition can only communicate among each other.
PART-B