Question Bank For ETE DBMS
:-
• MySQL
• Oracle Database
• MongoDB
• IBM Db2
• Amazon RDS
• PostgreSQL
:-
1. Physical Level: At the physical level, the information about the location of database objects
in the data store is kept. Various users of the DBMS are unaware of the locations of these
objects. In simple terms, the physical level of a database describes how the data is stored
in secondary storage devices like disks and tapes, and also gives insights into additional
storage details.
2. Conceptual Level: At the conceptual level, data is represented in the form of various database
tables. For Example, a STUDENT database may contain STUDENT and COURSE tables
which will be visible to users, but users are unaware of their storage. Also referred to as the
logical schema, it describes what kind of data is to be stored in the database.
3. External Level: An external level specifies a view of the data in terms of conceptual level
tables. Each external level view is used to cater to the needs of a particular category of
users. For Example, the FACULTY of a university is interested in looking at the course details of
students, while STUDENTS are interested in looking at all details related to academics, accounts,
courses and hostel as well. So, different views can be generated for different users.
The main focus of the external level is data abstraction.
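As a small illustration of the external level, different SQL views can be defined over the same conceptual-level table; the STUDENT table and its columns below are assumed purely for this example:

CREATE VIEW FACULTY_VIEW AS
SELECT S_ID, S_NAME, COURSE_ID, GRADE
FROM STUDENT;            -- faculty see only the course details of students

CREATE VIEW STUDENT_VIEW AS
SELECT S_ID, S_NAME, COURSE_ID, GRADE, FEES_DUE, HOSTEL_ROOM
FROM STUDENT;            -- students see academics, accounts and hostel details

Each category of user queries its own view, while the physical storage and the full conceptual schema remain hidden from them.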
Advantages of DBMS
DBMS helps in the efficient organization of data in a database, which has the following advantages over a typical
file system:
• Minimized redundancy and data inconsistency: Data is normalized in a DBMS to minimize
redundancy, which helps in keeping data consistent. For Example, student information
can be kept at one place in the DBMS and accessed by different users. This minimized
redundancy is due to the use of primary keys and foreign keys.
• Simplified Data Access: A user needs only the name of the relation, not its exact location, to access
data, so the process is very simple.
• Multiple data views: Different views of the same data can be created to cater to the needs of
different users. For Example, faculty salary information can be hidden from the student view of
the data but shown in the admin view.
• Data Security: Only authorized users are allowed to access the data in DBMS. Also, data
can be encrypted by DBMS which makes it secure.
The SQL commands that deal with the manipulation of data present in the database belong to DML or Data
Manipulation Language, and this includes most of the SQL statements. DCL, on the other hand, is the component
of SQL that controls access to data and to the database. Basically, DCL statements are grouped with DML statements.
Transactions group a set of tasks into a single execution unit. Each transaction begins with a specific task and
ends when all the tasks in the group successfully complete. If any of the tasks fail, the transaction fails.
Therefore, a transaction has only two results: success or failure.
Hence, the TCL commands COMMIT, ROLLBACK and SAVEPOINT are used to control the execution of a transaction.
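As a rough illustration, here is a minimal TCL sketch (the ACCOUNTS table, the account numbers and the amounts are assumed for the example only):

START TRANSACTION;                        -- begin the execution unit
UPDATE ACCOUNTS SET BALANCE = BALANCE - 500 WHERE ACC_NO = 101;
SAVEPOINT AFTER_DEBIT;                    -- a point we can roll back to without losing the debit
UPDATE ACCOUNTS SET BALANCE = BALANCE + 500 WHERE ACC_NO = 102;
-- If the credit fails, undo only the work done after the savepoint:
-- ROLLBACK TO AFTER_DEBIT;
COMMIT;                                   -- make the whole transaction permanent

If any task in the group fails, a plain ROLLBACK undoes the entire transaction, matching the success-or-failure behaviour described above.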
DCL includes commands such as GRANT and REVOKE which mainly deal with the rights, permissions, and other
controls of the database system.
REVOKE: This command withdraws the user’s access privileges given by using the GRANT command.
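A short DCL sketch (the table and the user name are hypothetical; the syntax shown is MySQL-style, other systems omit the @'host' part):

GRANT SELECT, INSERT ON STUDENTS TO 'report_user'@'localhost';   -- give access rights
REVOKE INSERT ON STUDENTS FROM 'report_user'@'localhost';        -- withdraw part of those rights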
Unit 2
9. Write SQL query to get the Student name whose salary is between 1000 and 2000.
:- SELECT S_NAME FROM students
WHERE salary BETWEEN 1000 AND 2000;
10. Write SQL query to get the Students Name and address from Student table.
:- SELECT S_NAME, S_ADDRESS FROM STUDENTS;
11. Write SQL query to get the Students Name who have got marks greater than 90 from Student table Explain
Union and Intersection in DBMS.
:- SELECT StudID, StudName, Marks FROM STUDENT_MARKS WHERE Marks > 90;
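For the Union and Intersection part of the question: UNION returns the tuples that appear in either of two union-compatible query results (removing duplicates), while INTERSECT returns only the tuples common to both. A hedged sketch, where the SPORTS_TEAM table is assumed for illustration:

SELECT StudID FROM STUDENT_MARKS WHERE Marks > 90
UNION
SELECT StudID FROM SPORTS_TEAM;           -- students with marks > 90 OR in the team

SELECT StudID FROM STUDENT_MARKS WHERE Marks > 90
INTERSECT
SELECT StudID FROM SPORTS_TEAM;           -- students with marks > 90 AND in the team

Both queries must return the same number of columns with compatible types; note that older MySQL versions do not support INTERSECT directly, in which case it is rewritten with a join or an IN subquery.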
We can observe that the number of tuples in the STUDENT relation is 2, and the number of tuples in DETAIL is also 2.
So the number of tuples in the resulting relation on performing CROSS PRODUCT is 2*2 = 4.
Important points on CARTESIAN PRODUCT(CROSS PRODUCT) Operation:
• The cardinality (number of tuples) of the resulting relation from a Cross Product operation is equal to the
number of tuples (say m) in the first relation multiplied by the number of tuples in the second
relation (say n).
• Cardinality = m*n
• The Cross Product of two relations A(R1, R2, R3, …, Rp) with degree p and B(S1, S2, S3, …, Sn) with degree n
is a relation C(R1, R2, R3, …, Rp, S1, S2, S3, …, Sn) with degree p + n attributes.
• Degree = p+n
• In SQL, CARTESIAN PRODUCT(CROSS PRODUCT) can be applied using CROSS JOIN.
• In general, we don’t use the Cartesian Product unnecessarily, i.e. without a proper meaning we don’t
use the Cartesian Product. Generally, we use the Cartesian Product followed by a Selection operation with a
comparison condition.
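A brief SQL sketch of the same idea, assuming the two-tuple STUDENT and DETAIL relations mentioned above share a ROLL_NO column (the column name is an assumption for the example):

SELECT *
FROM STUDENT CROSS JOIN DETAIL;            -- Cartesian Product: 2 * 2 = 4 rows

SELECT *
FROM STUDENT CROSS JOIN DETAIL
WHERE STUDENT.ROLL_NO = DETAIL.ROLL_NO;    -- Cross Product followed by Selection, i.e. a join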
14. Explain Select and Project.
:- Select (σ) retrieves the tuples (rows) of a relation that satisfy a given condition, while Project (π) retrieves only
the specified attributes (columns) of a relation, eliminating duplicate rows in the result.
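In SQL terms (the STUDENT table and its columns are assumed for illustration), Select corresponds to the WHERE clause and Project corresponds to the column list:

SELECT * FROM STUDENT WHERE MARKS > 90;            -- Selection: filters rows, keeps all columns
SELECT DISTINCT S_NAME, S_ADDRESS FROM STUDENT;    -- Projection: picks columns; DISTINCT removes duplicates as relational projection does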
Unit 3
1. What is partial functional dependency?
2. What is transitive functional dependency?
3. What is full functional dependency?
4. Explain 1NF, 2NF, 3NF, BCNF, 4NF with Example.
5. What is the difference between 3NF and BCNF.
6. Given R(C, S, Z) with the functional dependencies
CS → Z
Z → C
find the list of candidate keys and the normal form.
7. Given R(A, B, C, D, E) with the functional dependencies
A → BC
BC → E
E → DA
find the list of candidate keys and the normal form.
Unit 4
1. What is concurrent schedule?
2. What is deadlock?
3. What do you mean by serializable schedule?
4. What is serial schedule?
5. What do you mean by ACID property?
6. Check whether the given schedule is serializable or not. Consider the
following schedule for transactions T1, T2, T3:
T1 T2 T3
r(x)
r(y)
r(y)
w(y)
w(x)
w(x)
r(x)
w(x)
Unit 5
• Growing Phase − All the locks are acquired (issued) in this phase and no lock is released. Once the
transaction has obtained all the locks it needs, the second phase (shrinking phase) starts.
• Shrinking Phase − No new locks are issued in this phase; the changes made to the data items are noted
(stored) and the locks are released.
In the growing phase, the transaction reaches a point where all the locks it may need have been acquired. This
point is called the LOCK POINT.
After the lock point has been reached, the transaction enters the shrinking phase.
Types
Two phase locking is of two types −
• Strict two-phase locking − A transaction can release a shared lock after the lock point, but it cannot release any
exclusive lock until the transaction commits. This protocol creates a cascadeless schedule.
• Rigorous two-phase locking − All locks, shared and exclusive, are held until the transaction commits.
Cascading schedule: In this schedule, one transaction is dependent on another transaction, so if one has to
roll back then the other has to roll back as well.
The 2PL protocol guarantees serializability, but cannot guarantee that deadlock will not happen.
Example
Let T1 and T2 be two transactions:
T1 = A + B and T2 = B + A
T1 T2
Lock-X(A) Lock-X(B)
Read A; Read B;
Lock-X(B) Lock-X(A)
Here,
In the above situation T1 waits for B and T2 waits for A. The waiting never ends: neither transaction
can proceed unless at least one of them releases its lock voluntarily. This situation is called a deadlock.
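The same situation can be sketched in SQL with two concurrent sessions (the ACCOUNTS table and row ids are assumed for illustration):

-- Session 1 (T1):
START TRANSACTION;
UPDATE ACCOUNTS SET BAL = BAL + 10 WHERE ID = 'A';   -- T1 takes an exclusive lock on row A
-- Session 2 (T2):
START TRANSACTION;
UPDATE ACCOUNTS SET BAL = BAL + 10 WHERE ID = 'B';   -- T2 takes an exclusive lock on row B
-- Session 1 (T1):
UPDATE ACCOUNTS SET BAL = BAL + 10 WHERE ID = 'B';   -- T1 now waits for T2
-- Session 2 (T2):
UPDATE ACCOUNTS SET BAL = BAL + 10 WHERE ID = 'A';   -- T2 waits for T1: deadlock

Most database systems detect the cycle and abort one of the two transactions so that the other can proceed.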
Wait-for graph: It is used in the deadlock detection method. A node is created for each transaction, and an
edge from Ti to Tj is added if Ti is waiting to lock an item that is currently locked by Tj. A cycle in the WFG
indicates that a deadlock has occurred. The WFG is created (checked) at regular intervals.
Control concurrency
The simultaneous execution of transactions over shared databases can create several data integrity and
consistency problems.
For example, if many people are using the ATM machines at the same time, the updates on the bank servers must
be serialized and synchronized as each transaction completes; otherwise the database ends up with wrong
information and wrong data.
Main problems in using Concurrency
Updates will be lost − One transaction makes some changes and another transaction overwrites or deletes those
changes. In effect, one transaction nullifies the updates of another transaction.
Uncommitted Dependency or dirty read problem − A variable is updated in one transaction while, at the
same time, another transaction starts and reads (or changes) that variable before the update of the first
transaction has been committed. If the first transaction then fails or rolls back, the second transaction has
worked with false values or the previous values of the variable. This is a major problem.
Inconsistent retrievals − One transaction is updating multiple different variables while another transaction is in
the process of reading or updating those variables; the problem that occurs is that the second transaction sees
inconsistent values of the same variables in different instances.
Concurrency control techniques
Locking
A lock guarantees the exclusive use of a data item to the current transaction. The transaction first acquires a
lock before accessing the data item, and after completion of the transaction it releases the lock.
Types of Locks
Shared Lock [Transaction can read only the data item values]
Exclusive Lock [Used for both read and write data item values]
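A brief MySQL-flavoured sketch of the two lock types (the ACCOUNTS table is assumed; other systems use slightly different syntax, e.g. FOR SHARE in PostgreSQL):

START TRANSACTION;
SELECT BALANCE FROM ACCOUNTS WHERE ACC_NO = 101 LOCK IN SHARE MODE;  -- shared lock: other transactions may still read
SELECT BALANCE FROM ACCOUNTS WHERE ACC_NO = 101 FOR UPDATE;          -- exclusive lock: needed before writing
UPDATE ACCOUNTS SET BALANCE = BALANCE - 500 WHERE ACC_NO = 101;
COMMIT;                                                              -- the locks are released here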
Time Stamping
A time stamp is a unique identifier created by the DBMS that indicates the relative starting time of a transaction.
For every transaction that is run, the DBMS stores its starting time, so each transaction is associated with a
specific point in time.
Optimistic
It is based on the assumption that conflict is rare and it is more efficient to allow transactions to proceed
without imposing delays to ensure serializability.
6. What is Lost Update problem?
:- Consider the case where two users are about to update the same row/document in some data store. For
example, let user A retrieve some row first. After that, assume that user B retrieves the same row;
however, B writes his update immediately, and in particular before A writes his update.
In the lost update problem, an update done to a data item by a transaction is lost as it is overwritten by the
update done by another transaction.
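In SQL terms (table and values assumed), both users read BALANCE = 100, user B writes 120 first, and then user A writes 150, silently discarding B's update. Taking an exclusive row lock when reading is one way to avoid this:

START TRANSACTION;
SELECT BALANCE FROM ACCOUNTS WHERE ACC_NO = 101 FOR UPDATE;   -- user A locks the row while reading
UPDATE ACCOUNTS SET BALANCE = 150 WHERE ACC_NO = 101;
COMMIT;                                                       -- only now can user B read and apply an update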
7. Explain validation-Based Protocol.
:- Validation Based Protocol is also called the Optimistic Concurrency Control Technique. This protocol is used
in a DBMS (Database Management System) for controlling concurrency among transactions. It is called optimistic
because of the assumption it makes, i.e. that very little interference occurs, and therefore there is no need for
checking while the transaction is executed.
In this technique, no checking is done while the transaction is being executed. Until the end of the transaction is
reached, the updates in the transaction are not applied directly to the database; all updates are applied to local
copies of the data items kept for the transaction. At the end of transaction execution, a validation phase checks
whether any of the transaction's updates violate serializability. If there is no violation of serializability, the
transaction is committed and the database is updated; otherwise, the transaction's updates are discarded and the
transaction is restarted.
Optimistic Concurrency Control is a three-phase protocol. The three phases for validation based protocol:
Read Phase:
Values of committed data items from the database can be read by a transaction. Updates are only applied
to local data versions.
Validation Phase:
Checking is performed to make sure that there is no violation of serializability when the transaction
updates are applied to the database.
Write Phase:
On the success of the validation phase, the transaction's updates are applied to the database; otherwise, the
updates are discarded and the transaction is restarted.
The idea behind optimistic concurrency is to do all the checks at once; hence transaction execution
proceeds with a minimum of overhead until the validation phase is reached. If there is not much
interference among transactions, most of them will be validated successfully; otherwise, their results will be
discarded and the transactions restarted later. The latter circumstance is not favourable for optimistic techniques,
since the assumption of little interference is not satisfied.
In order to perform the validation test, each transaction should go through the various phases described
above. We must also know the following three time-stamps assigned to a transaction Ti to check its validity:
• Start(Ti) − the time when Ti started its execution.
• Validation(Ti) − the time when Ti finished its read phase and started its validation phase.
• Finish(Ti) − the time when Ti finished its write phase.
Ti passes the validation test against an earlier transaction Tj if one of the following conditions holds:
1. Tj completes its write phase before Ti starts its read phase, i.e. Finish(Tj) < Start(Ti).
2. Ti starts its write phase after Tj completes its write phase, and the read_set of Ti is disjoint with the
write_set of Tj.
3. Tj completes its read phase before Ti completes its read phase, and both the read_set and write_set of Ti are
disjoint with the write_set of Tj.
Unit 6
1. What do you mean by a research database?
:- Research databases are organized collections of computerized information or data, such as periodical
articles, books, graphics and multimedia, that can be searched to retrieve information.
2. Explain Quantitative data Analysis
:- Data analysis is defined as a process of cleaning, transforming, and modeling data to discover useful
information for business decision-making. The purpose of data analysis is to extract useful information
from data and to take decisions based upon that analysis.
If your business is not growing, then you have to look back, acknowledge your mistakes and make a
plan again without repeating those mistakes. And even if your business is growing, then you have to look
forward to making the business grow even more. All you need to do is analyze your business data and
business processes.
Data analysis tools make it easier for users to process and manipulate data, analyze the relationships and
correlations between data sets, and identify patterns and trends for interpretation.
Types of Data Analysis: Techniques and Methods
There are several types of Data Analysis techniques that exist based on business and technology. However,
the major Data Analysis methods are:
• Text Analysis
• Statistical Analysis
• Diagnostic Analysis
• Predictive Analysis
• Prescriptive Analysis
Text Analysis
Text Analysis is also referred to as Data Mining. It is one of the methods of data analysis used to discover a
pattern in large data sets using databases or data mining tools. It is used to transform raw data into business
information. Business Intelligence tools available in the market are used to take strategic business
decisions. Overall, it offers a way to extract and examine data, derive patterns from it, and finally
interpret the data.
Statistical Analysis
Statistical Analysis shows “What happened?” by using past data in the form of dashboards. Statistical Analysis
includes collection, Analysis, interpretation, presentation, and modeling of data. It analyses a set of data or
a sample of data. There are two categories of this type of Analysis – Descriptive Analysis and Inferential
Analysis.
Descriptive Analysis
analyses complete data or a sample of summarized numerical data. It shows mean and deviation for
continuous data whereas percentage and frequency for categorical data.
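As a rough SQL illustration of descriptive statistics (the SALES table with a continuous AMOUNT column and a categorical REGION column is assumed):

SELECT AVG(AMOUNT) AS MEAN_AMOUNT,
       STDDEV(AMOUNT) AS DEVIATION_AMOUNT
FROM SALES;                                    -- continuous data: mean and deviation

SELECT REGION,
       COUNT(*) AS FREQUENCY,
       100.0 * COUNT(*) / (SELECT COUNT(*) FROM SALES) AS PERCENTAGE
FROM SALES
GROUP BY REGION;                               -- categorical data: frequency and percentage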
Inferential Analysis
analyses sample from complete data. In this type of Analysis, you can find different conclusions from the
same data by selecting different samples.
Diagnostic Analysis
Diagnostic Analysis shows “Why did it happen?” by finding the cause from the insight found in Statistical
Analysis. This Analysis is useful for identifying behavior patterns in data. If a new problem arrives in your
business process, then you can look into this Analysis to find similar patterns of that problem, and there is a
chance that similar prescriptions can be used for the new problem.
Predictive Analysis
Predictive Analysis shows “what is likely to happen” by using previous data. The simplest data analysis
example: if last year I bought two dresses based on my savings, and this year my salary doubles, then I can
buy four dresses.
Prescriptive Analysis
Prescriptive Analysis combines the insight from all previous Analysis to determine which action to take in a
current problem or decision. Most data-driven companies are utilizing Prescriptive Analysis because
predictive and descriptive Analysis are not enough to improve data performance. Based on current
situations and problems, they analyze the data and make decisions.
Data Analysis Process
The Data Analysis Process is nothing but gathering information by using a proper application or tool which
allows you to explore the data and find a pattern in it. Based on that information and data, you can make
decisions or draw final conclusions.
Data Analysis consists of the following phases: