Dbms Notes

Uploaded by

arijit2004pal

Debasis kamila 9432208397 DATABASE MANAGEMENT SYSTEMS

NOTES ON
DATABASE
MANAGEMENT
SYSTEM

BY D.K

What is a database? Describe the advantages and disadvantages of using a DBMS.

Ans: Database – A database is a collection of related data and/or information stored so that it is
available to many users for different purposes.
Advantages Of DBMS
1. Centralized Management and Control - One of the main advantages of using a
database system is that the organization can exert, via the DBA, centralized management and control
over the data.
2. Reduction of Redundancies and Inconsistencies - Centralized control avoids
unnecessary duplication of data and effectively reduces the total amount of data storage required.
Removing redundancy eliminates inconsistencies.
3. Data Sharing - A database allows the sharing of data under its control by any number
of application programs or users.
4. Data Integrity - Data integrity means that the data contained in the database is both
accurate and consistent. Centralized control can also ensure that adequate checks are incorporated in the
DBMS to provide data integrity.
5. Data Security - Data is of vital importance to an organization and may be
confidential. Such confidential data must not be accessed by unauthorized persons. The DBA who has
the ultimate responsibility for the data in the DBMS can ensure that proper access procedures are
followed. Different levels of security could be implemented for various types of data and operations.
6. Data Independence - Data independence is the capacity to change the schema at one
level of a database system without having to change the schema at the next level. It is usually
considered from two points of view: physical data independence and logical data independence.
Physical data independence is the capacity to change the internal schema without having to change
conceptual schema. Logical data independence is the capacity to change the conceptual schema without
having to change external schemas or application programs.
7. Providing Storage Structures for Efficient Query Processing - Database systems
provide capabilities for efficiently executing queries and updates. Auxiliary files called indexes are used
for this purpose.
8. Backup and Recovery - These facilities are provided to recover databases from
hardware and/or software failures.
Some other advantages are:
▪ Reduced Application Development Time
▪ Flexibility
▪ Availability of up-to-date Information
Disadvantages Of DBMS
1. Cost of Software/Hardware and Migration - A significant disadvantage of a
DBMS is its cost.
2. Reduced Response and Throughput - The processing overhead introduced by the
DBMS to implement security, integrity, and sharing of the data degrades response time and
throughput.
3. Problem with Centralization - Centralization also means that the data is accessible
from a single source namely the database. This increases the potential of security breaches and
disruption of the operation of the organization because of downtimes and failures.
Explain five duties of Database Administrator. (7)
Ans:
1. DBA administers the three levels of the database and, in consultation with the
overall user community, sets up the definition of the global view or conceptual level of the database.
2. Mappings between the internal and the conceptual levels, as well as between the
conceptual and external levels, are also defined by the DBA.
3. DBA ensures that appropriate measures are in place to maintain the integrity of the
database and that the database is not accessible to unauthorized users.
4. DBA is responsible for granting permission to the users of the database and stores
the profile of each user in the database.
5. DBA is responsible for defining procedures to recover the database from failures
with minimal loss of data.

Explain the terms primary key, candidate key and foreign key. Give an example for each.
(7)

Ans: Primary Key – The primary key is the candidate key chosen to uniquely identify
each row in the relation.
Candidate Key – A candidate key of an entity set is a minimal superkey, that is, a minimal set of
attributes that uniquely identifies each row in the relation.
Foreign Key – Let R and S be two relations (tables). A candidate key of the relation R that is
referenced in the relation S is called a foreign key in S and the referenced key in R.
The relation R is also called the parent table and the relation S the child table.
For example:

STUDENT

Enrl No  Roll No  Name          City    Mobile
11       17       Ankit Vats    Delhi   9891663808
15       16       Vivek Rajput  Meerut  9891468487
6        6        Vanita        Punjab
33       75       Bhavya        Delhi   9810618396

GRADE

Roll No  Course  Grade
6        C       A
17       VB      C
75       VB      A
6        DBMS    B
16       C       B
▪ Roll No is the primary key in the relation STUDENT and Roll No + Course is the
primary key of the relation GRADE.
▪ Enrl No and Roll No are the candidate keys of the relation STUDENT.
▪ Roll No in the relation GRADE is a foreign key whose values must be one of those of
the relation STUDENT.
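
The key roles in the STUDENT and GRADE relations can be tried out with a small sketch. It uses Python's built-in sqlite3 module rather than the Oracle dialect used elsewhere in these notes, and the lower-case table and column names are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when this is on
conn.execute("""
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY,       -- candidate key chosen as primary key
        enrl_no INTEGER NOT NULL UNIQUE,   -- the other candidate key
        name    TEXT NOT NULL,
        city    TEXT,
        mobile  TEXT
    )""")
conn.execute("""
    CREATE TABLE grade (
        roll_no INTEGER NOT NULL REFERENCES student(roll_no),  -- foreign key
        course  TEXT NOT NULL,
        grade   TEXT,
        PRIMARY KEY (roll_no, course)      -- composite primary key
    )""")
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?, ?)", [
    (17, 11, "Ankit Vats", "Delhi", "9891663808"),
    (16, 15, "Vivek Rajput", "Meerut", "9891468487"),
    (6, 6, "Vanita", "Punjab", None),
    (75, 33, "Bhavya", "Delhi", "9810618396"),
])
conn.executemany("INSERT INTO grade VALUES (?, ?, ?)", [
    (6, "C", "A"), (17, "VB", "C"), (75, "VB", "A"), (6, "DBMS", "B"), (16, "C", "B"),
])

def rejected(sql):
    """Return True if the DBMS refuses the statement with an integrity error."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

# Roll No 17 already exists, so the primary key rejects a second row with it.
duplicate_primary_key = rejected("INSERT INTO student VALUES (17, 40, 'X', NULL, NULL)")
# Enrl No 11 already exists, so the other candidate key rejects it as well.
duplicate_candidate_key = rejected("INSERT INTO student VALUES (50, 11, 'X', NULL, NULL)")
# Roll No 99 is not in student, so the foreign key in grade rejects the row.
dangling_foreign_key = rejected("INSERT INTO grade VALUES (99, 'C', 'A')")
```

All three statements should be refused, one for each kind of key.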

Differentiate between logical database design and physical database design. Show how this separation
leads to data independence. (7)

Ans:
Basis      | Logical Database Design                  | Physical Database Design
-----------+------------------------------------------+-------------------------------------------
Task       | Maps or transforms the conceptual        | The specifications for the stored
           | schema (or an ER schema) from the        | database in terms of physical storage
           | high-level data model into a             | structures, record placement, and indexes
           | relational database schema.              | are designed.
Choice of  | The mapping can proceed in two stages:   | The following criteria are often used to
criteria   | ▪ System-independent mapping             | guide the choice of physical database
           |   but data model-dependent               | design options:
           | ▪ Tailoring the schemas to a             | ▪ Response Time
           |   specific DBMS                          | ▪ Space Utilization
           |                                          | ▪ Transaction Throughput
Result     | DDL statements in the language of        | An initial determination of storage
           | the chosen DBMS that specify the         | structures and the access paths for the
           | conceptual and external level schemas    | database files. This corresponds to
           | of the database system. But if the       | defining the internal schema in terms of
           | DDL statements include some              | Data Storage Definition Language.
           | physical design parameters, a            |
           | complete DDL specification must          |
           | wait until after the physical database   |
           | design phase is completed.               |
Database design is divided into several phases, of which logical database design and physical database
design are two. This separation follows the three-level architecture of a DBMS, which provides data
independence. The output of logical database design is the conceptual and external level schemas of the
database system, and these are independent of the output of physical database design, which is the
internal schema. The separation therefore leads to data independence.
Consider the following relation schemes: (2 × 7 = 14)
Project (Project#, Project_name, chief_architect)
Employee (Emp#, Empname)
Assigned_To (Project#, Emp#)
Give expression in Tuple calculus and Domain calculus for each of the queries below:
(i) Get the employee numbers of employees who work on all projects.
(ii) Get the employee numbers of employees who do not work on the COMP123
project.
Ans:
(i) Tuple Calculus:
{t[Emp#] | t ∈ ASSIGNED_TO ∧ ∀p (p ∈ PROJECT → ∃u (u ∈ ASSIGNED_TO ∧
p[Project#] = u[Project#] ∧ t[Emp#] = u[Emp#]))}
Domain Calculus:
{e | ∃p (<p, e> ∈ ASSIGNED_TO ∧ ∀p1 ∀n1 ∀c1 (<p1, n1, c1> ∈ PROJECT
→ <p1, e> ∈ ASSIGNED_TO))}
(ii) Tuple Calculus:
{t[Emp#] | t ∈ ASSIGNED_TO ∧ ¬∃u (u ∈ ASSIGNED_TO
∧ u[Project#] = 'COMP123' ∧ t[Emp#] = u[Emp#])}
Domain Calculus:
{e | ∃p (<p, e> ∈ ASSIGNED_TO ∧ ∀p1 ∀e1 (<p1, e1> ∈ ASSIGNED_TO
→ p1 ≠ 'COMP123' ∨ e1 ≠ e))}
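
The two calculus queries above can be checked against sample data. SQL has no universal quantifier, so the "for all projects" condition is written as a double NOT EXISTS. This is only a sketch using Python's sqlite3 module; the sample rows and the lower-case column names are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE project (project_no TEXT PRIMARY KEY, project_name TEXT, chief_architect TEXT);
CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, empname TEXT);
CREATE TABLE assigned_to (project_no TEXT, emp_no INTEGER, PRIMARY KEY (project_no, emp_no));
INSERT INTO project VALUES ('COMP123', 'Compiler', 'Rao'), ('DB456', 'Database', 'Sen');
INSERT INTO employee VALUES (1, 'Asha'), (2, 'Ravi'), (3, 'Mina');
INSERT INTO assigned_to VALUES ('COMP123', 1), ('DB456', 1), ('DB456', 2);
""")

# (i) Employees for whom no project exists that they are not assigned to.
all_projects = [row[0] for row in conn.execute("""
    SELECT DISTINCT t.emp_no FROM assigned_to t
    WHERE NOT EXISTS (
        SELECT 1 FROM project p
        WHERE NOT EXISTS (
            SELECT 1 FROM assigned_to u
            WHERE u.project_no = p.project_no AND u.emp_no = t.emp_no))
    ORDER BY t.emp_no""")]

# (ii) Employees appearing in assigned_to with no row for COMP123.
not_comp123 = [row[0] for row in conn.execute("""
    SELECT DISTINCT t.emp_no FROM assigned_to t
    WHERE NOT EXISTS (
        SELECT 1 FROM assigned_to u
        WHERE u.project_no = 'COMP123' AND u.emp_no = t.emp_no)
    ORDER BY t.emp_no""")]
```

As in the calculus expressions, both queries range over ASSIGNED_TO, so an employee with no assignment at all (employee 3 in the sample data) appears in neither result.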

Define the five basic operators of relational algebra with an example each.
Ans: Five basic operators of relational algebra are:
1. Union (∪) - Selects tuples that are in either P or Q or in both of them. The
duplicate tuples are eliminated.
R = P ∪ Q
2. Minus (–) - Selects the tuples of the first relation that do not appear in the second.
R = P – Q
3. Cartesian Product or Cross Product (×) - The cartesian product of two relations
is the concatenation of every tuple of the first relation with every tuple of the second, so it
consists of all possible combinations of the tuples.
R = P × Q
For Example:
P:                 Q:
ID   Name          ID   Name
101  Jones         100  John
103  Smith         104  Lalonde
104  Lalonde

R = P ∪ Q          R = P – Q
ID   Name          ID   Name
100  John          101  Jones
101  Jones         103  Smith
103  Smith
104  Lalonde

R = P × Q
P.ID  P.Name   Q.ID  Q.Name
101   Jones    100   John
101   Jones    104   Lalonde
103   Smith    100   John
103   Smith    104   Lalonde
104   Lalonde  100   John
104   Lalonde  104   Lalonde

4. Projection (π) - The projection of a relation is defined as a projection of all its tuples over
some set of attributes, i.e., it yields a vertical subset of the relation. It is used to either reduce the
number of attributes (degree) in the resultant relation or to reorder attributes. The projection of a
relation T on the attribute A is denoted by π_A(T).
5. Selection (σ) - Selects only those tuples of the relation that satisfy a given criterion.
It yields a horizontal subset of a given relation, i.e., the action is defined over a complete set of
attribute names but only a subset of the tuples is included in the result. R = σ_B(P), where B is the
selection condition.
For Example:
EMPLOYEE:          Projection of relation EMPLOYEE over attribute Name:
Id   Name          Name
101  Jones         Jones
103  Smith         Smith
104  Lalonde       Lalonde
106  Byron         Byron

EMPLOYEE:          Result of Selection over EMPLOYEE for Id > 103:
Id   Name          Id   Name
101  Jones         104  Lalonde
103  Smith         106  Byron
104  Lalonde
106  Byron
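
Because a relation is a set of tuples, the five basic operators can be imitated with Python sets. The following is a minimal sketch, not how a DBMS implements them; the helper names project and select are made up for the illustration, and attributes are addressed by position rather than by name.

```python
# Relations as sets of tuples: (ID, Name)
P = {(101, "Jones"), (103, "Smith"), (104, "Lalonde")}
Q = {(100, "John"), (104, "Lalonde")}
EMPLOYEE = {(101, "Jones"), (103, "Smith"), (104, "Lalonde"), (106, "Byron")}

union = P | Q                                # P ∪ Q, duplicates vanish in a set
minus = P - Q                                # P – Q
product = {p + q for p in P for q in Q}      # P × Q, tuples are concatenated

def project(relation, *positions):
    """π: keep only the attributes at the given positions."""
    return {tuple(row[i] for i in positions) for row in relation}

def select(relation, predicate):
    """σ: keep only the tuples that satisfy the predicate."""
    return {row for row in relation if predicate(row)}

names = project(EMPLOYEE, 1)                              # π_Name(EMPLOYEE)
id_over_103 = select(EMPLOYEE, lambda row: row[0] > 103)  # σ_{Id > 103}(EMPLOYEE)
```

The results match the example tables: the union has four tuples, the difference two, and the product six.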

Explain entity integrity and referential integrity rules in relational model. Show how these
are realized in SQL.

Ans:
Entity Integrity Rule – No primary key value can be null.
Referential Integrity Rule – In referential integrity, it is ensured that a value that appears in one
relation for a given set of attributes also appears for a certain set of attributes in another relation.
In SQL, entity integrity and referential integrity rules are implemented as constraints on the relation,
called the primary key constraint and the foreign key (REFERENCES) constraint respectively. These
constraints can be specified at the time of creation of the relations or after the creation of the
relations by altering the definition of the relations. For example:
CREATE TABLE DEPT
(DEPTNO NUMBER PRIMARY KEY,
DNAME VARCHAR2(15));
CREATE TABLE EMP
(EMPNO NUMBER PRIMARY KEY,
ENAME VARCHAR2(15),
JOB VARCHAR2(10),
DEPTNO NUMBER REFERENCES DEPT(DEPTNO));

What are the advantages of embedded query language? Give an example of an embedded SQL
query.
Ans:
Embedded query language – SQL can be used in three ways: interactively,
embedded in a host language, or through an API. The use of SQL commands within a host language (e.g.,
C, Java, etc.) program is called embedded query language or Embedded SQL. Although similar
capabilities are supported for a variety of host languages, the syntax sometimes varies. Some of the
advantages of embedded SQL are:
▪ SQL statements can be used wherever a statement in the host language is allowed.
▪ It combines the strengths of two programming environments, the procedural features of
host languages and non-procedural features of SQL.
▪ SQL statements can refer to variables (must be prefixed by a colon in SQL statements)
defined in the host program.
▪ Special program variables (called null indicators) are used to assign and retrieve the
NULL values to and from the database.
▪ The facilities available through the interactive query language are also automatically
available to the host programs.
▪ Embedded SQL along with host languages can be used to accomplish very complex and
complicated data access and manipulation tasks.
Example: The following Embedded SQL statement in C inserts a row whose column values are based
on the values of the host language variables contained in it.
EXEC SQL
INSERT INTO Sailors VALUES (:c_sname, :c_sid, :c_rating, :c_age);
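
The idea of colon-prefixed host variables can be tried without a C precompiler. The sketch below uses Python's sqlite3 module, which is an API (call-level) interface rather than true Embedded SQL, but it accepts the same :name placeholders and binds them from program variables. The Sailors column names and the sample values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sname TEXT, sid INTEGER PRIMARY KEY, rating INTEGER, age REAL)"
)

# Host-language variables, collected in a dict keyed by placeholder name.
host_vars = {"c_sname": "Dustin", "c_sid": 22, "c_rating": 7, "c_age": 45.0}

# Each :name in the statement is bound to the value of the matching key.
conn.execute(
    "INSERT INTO Sailors VALUES (:c_sname, :c_sid, :c_rating, :c_age)", host_vars
)

row = conn.execute("SELECT sname, sid, rating, age FROM Sailors").fetchone()
```

Binding values this way keeps the SQL text fixed while the data varies, which is the same separation Embedded SQL achieves with host variables.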

Consider the following relations: (3.5 x 2=7)

S (S#, SNAME, STATUS, CITY)
SP (S#, P#, QTY)
P (P#, PNAME, COLOR, WEIGHT, CITY)
Give an expression in SQL for each of queries below:
(i) Get supplier names for suppliers who supply at least one red part.
(ii) Get supplier names for suppliers who do not supply part P2.
Ans:
(i) SELECT SNAME FROM S
WHERE S# IN (SELECT S# FROM SP WHERE P# IN (SELECT P# FROM P
WHERE COLOR = 'RED'))
(ii) SELECT SNAME FROM S
WHERE S# NOT IN (SELECT S# FROM SP WHERE P# = 'P2')
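
Both answers can be run against a few sample rows. This sketch uses Python's sqlite3 module; because SQLite does not accept # in unquoted identifiers, the columns S# and P# are renamed s_no and p_no here, and the sample suppliers and parts are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE S (s_no TEXT PRIMARY KEY, sname TEXT, status INTEGER, city TEXT);
CREATE TABLE P (p_no TEXT PRIMARY KEY, pname TEXT, color TEXT, weight REAL, city TEXT);
CREATE TABLE SP (s_no TEXT, p_no TEXT, qty INTEGER, PRIMARY KEY (s_no, p_no));
INSERT INTO S VALUES ('S1', 'Smith', 20, 'London'), ('S2', 'Jones', 10, 'Paris'),
                     ('S3', 'Blake', 30, 'Paris');
INSERT INTO P VALUES ('P1', 'Nut', 'RED', 12, 'London'), ('P2', 'Bolt', 'GREEN', 17, 'Paris');
INSERT INTO SP VALUES ('S1', 'P1', 300), ('S1', 'P2', 200), ('S2', 'P2', 400);
""")

# (i) Suppliers who supply at least one red part.
red_suppliers = [row[0] for row in conn.execute("""
    SELECT sname FROM S
    WHERE s_no IN (SELECT s_no FROM SP
                   WHERE p_no IN (SELECT p_no FROM P WHERE color = 'RED'))
    ORDER BY sname""")]

# (ii) Suppliers who do not supply part P2 (including those who supply nothing).
not_p2_suppliers = [row[0] for row in conn.execute("""
    SELECT sname FROM S
    WHERE s_no NOT IN (SELECT s_no FROM SP WHERE p_no = 'P2')
    ORDER BY sname""")]
```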

Define a view and a trigger. Construct a view for the above relations which has the information
about suppliers and the parts they supply. The view contains the S#, SNAME, P# , PNAME
renamed as SNO, NAME, PNO, PNAME.

Ans:
View – A view is a virtual table which is based on one or more physical tables and/or views. In
other words, a view is a named table that is represented, not by its own physically separate stored data,
but by its definition in terms of other named tables (base tables or views).
Trigger – A trigger is a procedure that is automatically invoked by the DBMS in response to
specified changes to the database. Triggers may be used to supplement declarative referential integrity,
to enforce complex business rules or to audit changes to data.
Command:
CREATE VIEW SUP_PART (SNO, NAME, PNO, PNAME) AS
SELECT S.S#, SNAME, P.P#, PNAME
FROM S, SP, P
WHERE S.S# = SP.S# AND P.P# = SP.P#
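
The sketch below creates the SUP_PART view in SQLite through Python's sqlite3 module and shows that the view stores no rows of its own: a row added to a base table appears in the view on the next query. As assumptions for the illustration, S# and P# are renamed s_no and p_no (SQLite rejects # in unquoted identifiers) and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE S (s_no TEXT PRIMARY KEY, sname TEXT, status INTEGER, city TEXT);
CREATE TABLE P (p_no TEXT PRIMARY KEY, pname TEXT, color TEXT, weight REAL, city TEXT);
CREATE TABLE SP (s_no TEXT, p_no TEXT, qty INTEGER, PRIMARY KEY (s_no, p_no));
INSERT INTO S VALUES ('S1', 'Smith', 20, 'London'), ('S2', 'Jones', 10, 'Paris');
INSERT INTO P VALUES ('P1', 'Nut', 'RED', 12, 'London'), ('P2', 'Bolt', 'GREEN', 17, 'Paris');
INSERT INTO SP VALUES ('S1', 'P1', 300), ('S1', 'P2', 200), ('S2', 'P2', 400);

-- The view renames the columns; only its definition is stored.
CREATE VIEW SUP_PART (SNO, NAME, PNO, PNAME) AS
SELECT S.s_no, S.sname, P.p_no, P.pname
FROM S, SP, P
WHERE S.s_no = SP.s_no AND P.p_no = SP.p_no;
""")

query = "SELECT SNO, NAME, PNO, PNAME FROM SUP_PART ORDER BY SNO, PNO"
before = conn.execute(query).fetchall()

# A new shipment in the base table SP shows up in the view immediately.
conn.execute("INSERT INTO SP VALUES ('S2', 'P1', 100)")
after = conn.execute(query).fetchall()
```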

Differentiate between the following: (10)

(i) Theta Join.
(ii) Equi Join.
(iii) Natural Join.
(iv) Outer Join.
Ans:
(i) Theta Join – The theta join operation is an extension to the natural-join operation that allows us
to combine a selection and a Cartesian product into a single operation. Consider relations r(R) and s(S),
and let θ be a predicate on attributes in the schema R ∪ S. The theta join operation r ⋈θ s is defined as
follows:
r ⋈θ s = σθ (r × s)
(ii) Equi Join – It produces all the combinations of tuples from two relations that satisfy a
join condition with only equality comparisons (=).
(iii) Natural Join - Same as equi-join except that the join attributes (having the same
names) are not repeated in the resulting relation. Only one of each pair of domain-compatible attributes
involved in the natural join is present.
(iv) Outer Join - Rows in one table that have no corresponding row in the other table
are not selected by an equi-join. Such rows can be retained by
using the outer join. The columns coming from the other table hold NULLs for such a row. There are three
forms of the outer-join operation: left outer join (⟕), right outer join (⟖) and full outer join (⟗).
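
The four kinds of join can be compared on two small tables. This is a sketch using Python's sqlite3 module; the emp and dept tables and their rows are invented, and only the left outer join is shown because older SQLite versions do not support the right and full forms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dept (deptno INTEGER PRIMARY KEY, dname TEXT);
CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT, deptno INTEGER);
INSERT INTO dept VALUES (10, 'Sales'), (20, 'HR'), (30, 'IT');
INSERT INTO emp VALUES (1, 'Asha', 10), (2, 'Ravi', 20), (3, 'Mina', NULL);
""")

# Theta join: any comparison may be used in the join predicate, here <.
theta = conn.execute("""
    SELECT e.ename, d.dname FROM emp e JOIN dept d ON e.deptno < d.deptno
    ORDER BY e.empno, d.deptno""").fetchall()

# Equi join: equality only; deptno appears twice in the result.
equi = conn.execute("""
    SELECT * FROM emp e JOIN dept d ON e.deptno = d.deptno
    ORDER BY e.empno""").fetchall()

# Natural join: equality on the common column name, which appears once.
natural = conn.execute(
    "SELECT * FROM emp NATURAL JOIN dept ORDER BY empno").fetchall()

# Left outer join: Mina has no department but is kept, padded with NULL.
left_outer = conn.execute("""
    SELECT e.ename, d.dname FROM emp e LEFT OUTER JOIN dept d ON e.deptno = d.deptno
    ORDER BY e.empno""").fetchall()
```

The equi-join rows have five columns and the natural-join rows four, which is exactly the difference described in (iii).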

Draw and explain the three level architecture of the database system.
Ans:
A DBMS that describes data at three levels is said to follow the three-level architecture. The goal of
the three-schema architecture is to separate the user applications and the physical database. The view at
each of these levels is described by a schema. The processes of transforming requests and results between
levels are called mappings. In this architecture, schemas can be defined at the following three levels:

▪ External Level or Subschema – It is the highest level of database abstraction where only
those portions of the database of concern to a user or application program are included. Any number of
user views (some of which may be identical) may exist for a given global or conceptual view. Each
external view is described by means of a schema called an external schema or subschema.
▪ Conceptual Level or Conceptual Schema - At this level of database abstraction all the
database entities and the relationships among them are included. One conceptual view represents the
entire database. This conceptual view is defined by the conceptual schema. There is only one
conceptual schema per database. The description of data at this level is in a format independent of its
physical representation. It also includes features that specify the checks to retain data consistency and
integrity.
▪ Internal Level or Physical Schema – It is closest to the physical storage method used. It
indicates how the data will be stored and describes the data structures and access methods to be used by
the database. The internal view is expressed by the internal schema.
Explain (a) Heap file (b) Sorted file. Also discuss their advantages and disadvantages.
Ans: Heap File – A heap file is an unordered set of records, stored on a set of pages. It provides basic
support for inserting, selecting, updating, and deleting records. Temporary heap files are used for
external sorting and in other relational operators. A sequential scan of a heap file is
the most basic access method.
Advantages: Inserting a record is cheap, because a new record is simply placed wherever there is free
space, usually at the end of the file.
Disadvantages: Searching for a record on any condition requires a linear scan of the file, and deleted
records leave unused space that has to be reclaimed by periodic reorganization.
Sorted File – A sorted (sequential) file keeps its records physically ordered on the values of one
field, called the ordering field.
Advantages: Reading the records in the order of the ordering field is efficient, and a search on the
ordering field can use binary search over the blocks.
Disadvantages: Insertion and deletion are expensive because the order has to be maintained, and the
ordering gives no help for searches on other fields.

Describe a method for direct search? Explain how data is stored in a file so that direct
searching can be performed.
Ans: For a file of unordered fixed-length records using unspanned blocks and contiguous allocation, it
is straightforward to access any record by its position in the file. If the file records are numbered
0, 1, 2, ..., r-1 and the records in each block are numbered 0, 1, ..., bfr-1, where bfr is the blocking
factor, then the ith record of the file is located in block ⌊i/bfr⌋ and is the (i mod bfr)th record in
that block. Such a file is often called a relative or direct file because records can easily be accessed
directly by their relative positions. Accessing a record by its position does not help locate a record
based on a search condition; however, it facilitates the construction of
access paths on the file, such as indexes.
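
The position arithmetic above is short enough to write out. In this sketch the function names and the example sizes (512-byte blocks, 100-byte records) are assumptions chosen for the illustration; with unspanned blocks the blocking factor is the number of whole records that fit in one block.

```python
def locate(i, bfr):
    """Return (block number, position inside the block) of record i."""
    if i < 0 or bfr <= 0:
        raise ValueError("record number must be >= 0 and bfr must be positive")
    return divmod(i, bfr)  # (i // bfr, i % bfr)

def byte_offset(i, record_size, block_size):
    """Byte offset of record i in a contiguous file of unspanned blocks."""
    bfr = block_size // record_size      # whole records per block
    block, position = locate(i, bfr)
    return block * block_size + position * record_size

block, position = locate(13, 5)          # record 13 with 5 records per block
offset = byte_offset(13, record_size=100, block_size=512)
```

With five records per block, record 13 is record 3 of block 2, so it starts 2 × 512 + 3 × 100 bytes into the file.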
Explain the integrity constraints: Not Null, Unique, Primary Key with an example each. Is the
combination 'Not Null, Primary Key' a valid combination? Justify.

Ans: Not Null – The attribute must contain a value in every row and cannot be NULL.
Unique – An attribute or a combination of two or more attributes must have a unique value in each
row. The unique key can have NULL values.
Primary Key – It is the same as a unique key but cannot have NULL values. A table can have at most one
primary key.
For example:

STUDENT

Roll No  Name          City    Mobile
17       Ankit Vats    Delhi   9891663808
16       Vivek Rajput  Meerut  9891468487
6        Vanita        Punjab  NULL
75       Bhavya        Delhi   9810618396
▪ Roll No is a primary key.
▪ Name is defined with NOT NULL, which means each student must have a name.
▪ Mobile is unique.
'Not Null, Primary Key' is a valid combination. The primary key constraint already includes the 'Not Null'
constraint, but 'Not Null' can also be stated explicitly with it. Using 'Not Null' with 'Primary
Key' has no additional effect; it is the same as using just 'Primary Key'.
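
The three constraints on the STUDENT table can be exercised with a short sketch. It uses Python's sqlite3 module; the lower-case names and the extra test rows are assumptions. How many NULLs a UNIQUE column may hold differs between database systems, so the last check describes SQLite's behaviour only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# 'NOT NULL' next to 'PRIMARY KEY' is accepted, i.e. the combination is valid.
conn.execute("""
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY NOT NULL,
        name    TEXT NOT NULL,
        city    TEXT,
        mobile  TEXT UNIQUE
    )""")
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?)", [
    (17, "Ankit Vats", "Delhi", "9891663808"),
    (16, "Vivek Rajput", "Meerut", "9891468487"),
    (6, "Vanita", "Punjab", None),
    (75, "Bhavya", "Delhi", "9810618396"),
])

def rejected(row):
    """Return True if inserting the row violates a constraint."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?, ?)", row)
    except sqlite3.IntegrityError:
        return True
    return False

missing_name = rejected((80, None, "Pune", "9000000001"))           # NOT NULL
repeated_mobile = rejected((81, "Rohan", "Pune", "9891663808"))     # UNIQUE
repeated_roll_no = rejected((17, "Rohan", "Pune", "9000000002"))    # PRIMARY KEY
second_null_mobile = rejected((82, "Rohan", "Pune", None))          # allowed in SQLite
```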
Explain the followings :
(i) Nested Queries.
(ii) Cursors in SQL.
(iii) RDBMS.
(iv) View
(v) Application Programming Interface (14)

Ans: (i) Nested Queries – A SELECT query can have subquery(s) in it. A SELECT query
that contains another SELECT query is called a nested query. Some operations cannot be
performed with a single SELECT command or with a join operation, but
can be performed with the help of nested queries (also referred to as subqueries). For example, to
compute the second highest salary:
SELECT MAX(SAL) FROM EMP WHERE SAL < (SELECT MAX(SAL) FROM EMP)
Some operations can be performed both by a join and by a subquery. Which form is cheaper in terms
of time and space depends on the DBMS and its query optimizer, so neither is always preferable.
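
The second-highest-salary query can be checked on a few rows. This is a sketch using Python's sqlite3 module with invented salaries; two employees share the top salary on purpose, to show that the inner MAX removes all of them before the outer MAX is taken.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (ENAME TEXT, SAL INTEGER)")
conn.executemany("INSERT INTO EMP VALUES (?, ?)", [
    ("Asha", 5000), ("Ravi", 3000), ("Mina", 5000), ("Kiran", 4000),
])

# The inner query finds the highest salary; the outer query finds the
# highest salary among the rows strictly below it.
second_highest = conn.execute(
    "SELECT MAX(SAL) FROM EMP WHERE SAL < (SELECT MAX(SAL) FROM EMP)"
).fetchone()[0]
```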
(ii) Cursors in SQL – An object used to store the output of a query for row-by-row
processing by the application programs. Cursors are constructs that enable the user to name a private
memory area to hold a specific statement for access at a later time. Cursors are used to process multi-
row result sets one row at a time. Additionally, cursors keep track of which row is currently being
accessed, which allows for interactive processing of the active set.
(iii) RDBMS – RDBMS is a database management system (DBMS) that stores data in the
form of relations. Relational databases are powerful because they require few assumptions about how
data is related or how it will be extracted from the database. As a result, the same database can be
viewed in many different ways. An important feature of relational system is that a single database can be
spread across several tables. This differs from flat-file databases, in which each database is self-contained in a
single table.
(iv) View – A view is a relation (virtual rather than base) and can be used in query
expressions, that is, queries can be written using the view as a relation. In other words, a view is a
named table that is represented, not by its own physically separate stored data, but by its definition in
terms of other named tables (base tables or views). The base relations on which a view is based are
sometimes called the existing relations. The definition of a view in a create view statement is stored in
the system catalog. The syntax to create a view is:
CREATE [OR REPLACE] VIEW <view_name> [(<aliases>)] AS
<query> [WITH {READ ONLY | CHECK OPTION [CONSTRAINT <constraint_name>]}];
(v) Application Programming Interface – Commercial SQL implementations take one
of the two basic techniques for including SQL in a programming language – embedded SQL and
application program interface (API). In the application program interface approach, the program
communicates with the RDBMS using a set of functions called the Application Program Interface
(API). The program passes the SQL statements to the RDBMS using API calls and uses API calls to
retrieve the results. In this method, the precompiler is not required.
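
Cursors and the API approach can be seen together in a short sketch. Python's sqlite3 module is one such API: the program passes SQL text through function calls and reads the result through a cursor, one row at a time, with no precompiler involved. The EMP table and its rows are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (EMPNO INTEGER PRIMARY KEY, ENAME TEXT, SAL INTEGER)")
conn.executemany("INSERT INTO EMP VALUES (?, ?, ?)", [
    (1, "Asha", 5000), (2, "Ravi", 3000), (3, "Mina", 4000),
])

cur = conn.cursor()                      # the cursor tracks the current row
cur.execute("SELECT ENAME, SAL FROM EMP ORDER BY EMPNO")

names = []
total = 0
while True:
    row = cur.fetchone()                 # advance to the next row of the active set
    if row is None:                      # no more rows
        break
    names.append(row[0])
    total += row[1]
cur.close()
```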

Consider the following relational schema:

PERSON (SS#, NAME, ADDRESS)
CAR (REGISTRATION_NUMBER, YEAR, MODEL)
ACCIDENT (DATE, DRIVER, CAR_REG_NO)
OWNS (SS#, LICENSE)
Construct the following relational algebra queries:
(i) Find the names of persons who are involved in an accident.
(ii) Find the registration number of cars which were not involved in any accident.

Ans: (i) π_NAME(PERSON) ∩ π_DRIVER(ACCIDENT) (taking DRIVER to hold the person's name)
(ii) π_REGISTRATION_NUMBER(CAR) – π_CAR_REG_NO(ACCIDENT)
What is a key? Explain Candidate Key, Alternate Key and Foreign Key.
Ans:
Key – A single attribute or a combination of two or more attributes of an entity set that is
used to identify one or more instances (rows) of the set (table) is called a key.
Candidate Key – A candidate key is a minimal superkey, which can be used to uniquely identify a
tuple in the relation.
Alternate Key – All the candidate keys except the primary key are called alternate keys.
Foreign Key – Let R and S be two relations (tables). A candidate key of the relation R that is
referenced in the relation S is called a foreign key in S and the referenced key in R. The relation
R is also called the parent table and the relation S the child table.
What is data independence? Explain the difference between physical and logical data
independence.

Ans: Data independence is the capacity to change the schema at one level of a database system without
having to change the schema at the next level. The three-schema architecture allows the feature of data
independence. Data independence occurs because when the schema is changed at some level, the
schema at the next level remains unchanged; only the mapping between the two levels is changed.
Types of data independence are:
▪ Physical Data Independence – It is the capacity to change the internal schema without
having to change the conceptual schema. Hence, the external schemas need not be changed either.
Changes to the internal schema may be needed because some physical files have to be reorganized to
improve the performance of retrieval or update. If the same data as before remains in the database, the
conceptual schema need not be changed.
▪ Logical Data Independence - It is the capacity to change the conceptual schema without
having to change external schemas or application programs. The conceptual schema may be changed to
expand the database (by adding a record type or data item), to change constraints, or to reduce the
database (by removing a record type or data item). Only the view definition and the mappings need be
changed in a DBMS that supports logical data independence. Changes to constraints can be applied to
the conceptual schema without affecting the external schemas or application programs.

Write short notes on:

(i) Weak and strong entity sets.
(ii) Types of attributes.
(iii) Oracle Instance.
(iv) Mid square method of hashing.
Ans: (i) Weak and Strong entity sets: A strong entity set has a primary key. All
tuples in the set are distinguishable by that key. A weak entity set has no primary key unless attributes
of the strong entity set on which it depends are included. Tuples in a weak entity set are partitioned
according to their relationship with tuples in a strong entity set. Tuples within each partition are
distinguishable by a discriminator, which is a set of attributes.
(ii) Types of attributes: An attribute's type determines the kind of values that are allowed in
the attribute. For example, the value version 1 is not valid for an attribute defined as an integer, but the
value 1 is valid. Numeric types (such as integer or real) can also be limited to a predefined range by
their attribute definition.
Choice: An attribute with a list of predefined values.
ID Reference: An attribute with a value that is a Unique ID value from another element. It is typically
used for element-based cross-references.
ID References: An attribute with a value of one or more Unique ID values from another element.
Integer: An attribute with a whole number value (no decimal parts). Examples of valid integers are 22, -
22, and +322. An integer can be defined to fall within a range.
Integers: An attribute with a value of one or more integers. Enter each number on a separate line in the
Attribute Value text box.
Real: An attribute with a real number value, with or without a decimal part (the value can also be
expressed in scientific notation). Examples of valid real numbers are 2, 22.4, - 0.22, and 2.3e-1. A real
number can be defined to fall within a range.

Reals: An attribute with a value of one or more real numbers. Enter each number on a separate line in
the Attribute Value text box.
String: An attribute with a value of a series of characters (text).
Strings: An attribute with a value of one or more strings. Enter each string on a separate line in the
Attribute Value text box.
Unique ID: An attribute with a value of a unique text string. An element can have only one ID attribute
(which can be of type Unique ID or Unique IDs). All ID values must be unique in the document or
book. An element with a Unique ID attribute can be the source for an element-based cross-reference.
Unique IDs: An attribute with a value of one or more unique text strings. Enter each string on a
separate line in the Attribute Value text box.
(iii) Oracle Instance: An instance is the running Oracle software together with the memory it
uses. It is the instance that manipulates the data stored in the database. It can be started independently of
any database. It consists of:
1) A shared memory area that provides the communication between various processes.
2) Up to five background processes which handle various tasks. Whenever an Oracle
instance starts, the parameter file 'INIT.ORA' is read.
(iv) Mid square method of hashing: In midsquare hashing, the key is squared and the
address selected from the middle of the squared number.
Mid square method
* Square K.
* Strip predetermined digits from front and rear.
* e.g., use thousands and ten thousands places.
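
The steps above can be sketched in a few lines. The function name and its parameters are made up for the illustration; the defaults pick the thousands and ten-thousands digits of the square, as in the example, which gives an address in the range 0 to 99.

```python
def mid_square(key, low_place=3, digits=2):
    """Square the key and return `digits` digits starting at the 10**low_place place."""
    squared = key * key
    stripped_rear = squared // 10 ** low_place   # drop the digits below the chosen place
    return stripped_rear % 10 ** digits          # drop the digits above the chosen range

# 1234 squared is 1522756; its ten-thousands and thousands digits are 2 and 2.
address_a = mid_square(1234)
# 4567 squared is 20857489; its ten-thousands and thousands digits are 5 and 7.
address_b = mid_square(4567)
```

Taking the middle digits makes the address depend on every digit of the key, which is the reason for squaring first.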

Consider the following relational schemas:

EMPLOYEE (EMPLOYEE_NAME, STREET, CITY)
WORKS (EMPLOYEE_NAME, COMPANYNAME, SALARY)
COMPANY (COMPANY_NAME, CITY)
Specify the table definitions in SQL.

Ans:
CREATE TABLE EMPLOYEE
( EMPLOYEE_NAME VARCHAR2(20) PRIMARY KEY,
STREET VARCHAR2(20),
CITY VARCHAR2(15));
CREATE TABLE COMPANY
( COMPANY_NAME VARCHAR2(50) PRIMARY KEY,
CITY VARCHAR2(15));
CREATE TABLE WORKS
( EMPLOYEE_NAME VARCHAR2(20) REFERENCES EMPLOYEE(EMPLOYEE_NAME),
COMPANYNAME VARCHAR2(50) REFERENCES COMPANY(COMPANY_NAME),
SALARY NUMBER(6),
CONSTRAINT WORKS_PK PRIMARY KEY(EMPLOYEE_NAME, COMPANYNAME));
Give an expression in SQL for each of queries below:
(i) Find the names of all employees who work for First Bank Corporation.
(ii) Find the names and company names of all employees sorted in ascending order
of company name and descending order of employee names of that company.
(iii) Change the city of First Bank Corporation to 'New Delhi'.

Ans:
(i) SELECT EMPLOYEE_NAME
FROM WORKS
WHERE COMPANYNAME = 'First Bank Corporation';
(ii) SELECT EMPLOYEE_NAME, COMPANYNAME
FROM WORKS
ORDER BY COMPANYNAME, EMPLOYEE_NAME DESC;
(iii) UPDATE COMPANY
SET CITY = 'New Delhi'
WHERE COMPANY_NAME = 'First Bank Corporation';
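
The three statements can be run end to end on sample data. This sketch uses Python's sqlite3 module, so the Oracle types VARCHAR2 and NUMBER are replaced by TEXT and INTEGER; the employees, the second company and the salaries are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (EMPLOYEE_NAME TEXT PRIMARY KEY, STREET TEXT, CITY TEXT);
CREATE TABLE COMPANY (COMPANY_NAME TEXT PRIMARY KEY, CITY TEXT);
CREATE TABLE WORKS (
    EMPLOYEE_NAME TEXT REFERENCES EMPLOYEE(EMPLOYEE_NAME),
    COMPANYNAME   TEXT REFERENCES COMPANY(COMPANY_NAME),
    SALARY        INTEGER,
    PRIMARY KEY (EMPLOYEE_NAME, COMPANYNAME));
INSERT INTO EMPLOYEE VALUES ('Asha', 'MG Road', 'Pune'), ('Ravi', 'Park St', 'Kolkata'),
                            ('Mina', 'Ring Rd', 'Delhi');
INSERT INTO COMPANY VALUES ('First Bank Corporation', 'Mumbai'), ('Acme Ltd', 'Delhi');
INSERT INTO WORKS VALUES ('Asha', 'First Bank Corporation', 50000),
                         ('Ravi', 'First Bank Corporation', 40000),
                         ('Mina', 'Acme Ltd', 45000);
""")

# (i) Employees of First Bank Corporation (ORDER BY added only to fix the output order).
fbc_employees = [row[0] for row in conn.execute("""
    SELECT EMPLOYEE_NAME FROM WORKS
    WHERE COMPANYNAME = 'First Bank Corporation'
    ORDER BY EMPLOYEE_NAME""")]

# (ii) Company ascending, employee names descending within each company.
ordered = conn.execute("""
    SELECT EMPLOYEE_NAME, COMPANYNAME FROM WORKS
    ORDER BY COMPANYNAME, EMPLOYEE_NAME DESC""").fetchall()

# (iii) Change the city, then read it back.
conn.execute("UPDATE COMPANY SET CITY = 'New Delhi' "
             "WHERE COMPANY_NAME = 'First Bank Corporation'")
new_city = conn.execute(
    "SELECT CITY FROM COMPANY WHERE COMPANY_NAME = 'First Bank Corporation'"
).fetchone()[0]
```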

Discuss the correspondence between the E-R model construct and the relation model construct.
Show how each E-R model construct can be mapped to the relational model using the suitable
example?
Ans: An entity-relationship model (ERM): An entity-relationship model (ERM) is an abstract
conceptual representation of structured data. Entity-relationship modeling is a relational schema
database modeling method, used in software engineering to produce a type of conceptual data model (or
semantic data model) of a system, often a relational database, and its requirements in a top-down
fashion. Diagrams created using this process are called entity-relationship diagrams, or ER diagrams or
ERDs for short.
ER-to-Relational Mapping Algorithm:
1) Step 1: Mapping of regular entity types: For each strong entity type E, create a
relation T that includes all the simple attributes of E. For a composite attribute, include only its
simple component attributes. Choose one of the key attributes of E as the primary key of T.
2) Step 2: Mapping of weak entity types: For each weak entity type W with owner entity
type E, create a relation R and include all simple attributes (or simple components of composite
attributes) of W as attributes of R. In addition, include as foreign key attributes of R the primary key
attribute(s) of the relation(s) that correspond to the owner entity type(s). The primary key of R is the
combination of the primary key(s) of the owner(s) and the partial key of the weak entity type W, if any.
3) Step 3: Mapping of relationship types: Form a relation R for the relationship, with the primary keys of
the participating relations A and B as foreign keys in R. In addition, any attributes of the relationship
become attributes of R.
4) Step 4: Mapping of multivalued attributes: For each multivalued attribute A, create a new
relation R. This relation R includes an attribute corresponding to A, plus the primary key attribute K
(as a foreign key in R) of the relation that represents the entity type or relationship type that has A as an
attribute. The primary key of R is the combination of A and K.
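As an illustration only, the four mapping steps can be sketched in Python with the standard sqlite3 module. The entity types EMPLOYEE, PROJECT and DEPENDENT, the relationship WORKS_ON and every column name below are assumptions made for this sketch; they are not taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Step 1: strong entity types become relations with their own primary keys.
CREATE TABLE employee (ssn TEXT PRIMARY KEY, name TEXT);
CREATE TABLE project  (pno INTEGER PRIMARY KEY, title TEXT);

-- Step 2: weak entity type; its key is the owner's key plus the partial key.
CREATE TABLE dependent (
    ssn          TEXT REFERENCES employee(ssn),
    dep_name     TEXT,
    relationship TEXT,
    PRIMARY KEY (ssn, dep_name));

-- Step 3: relationship type (here many-to-many) with its own attribute "hours".
CREATE TABLE works_on (
    ssn   TEXT    REFERENCES employee(ssn),
    pno   INTEGER REFERENCES project(pno),
    hours REAL,
    PRIMARY KEY (ssn, pno));

-- Step 4: multivalued attribute "phone" of EMPLOYEE gets a relation of its own.
CREATE TABLE employee_phone (
    ssn   TEXT REFERENCES employee(ssn),
    phone TEXT,
    PRIMARY KEY (ssn, phone));
""")

def primary_key(table):
    """Return the primary-key column names of a table, in key order."""
    info = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, default, pk); pk is 0 for non-key columns.
    return [name for _, name, _, _, _, pk in sorted(info, key=lambda c: c[5]) if pk]
```

Calling primary_key("dependent") shows the composite key formed from the owner's key and the partial key, and primary_key("employee_phone") shows the key formed from K and the multivalued attribute.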
Explain the concepts of relational data model. Also discuss its advantages and
disadvantages.
Ans:
Relational Data Model – The relational model was first introduced by Prof. E.F. Codd of IBM
Research in 1970 and attracted immediate attention due to its simplicity and mathematical foundation.
The model uses the concept of a mathematical relation (like a table of values) as its basic building
block, and has its theoretical basis in set theory and first-order predicate logic. The relational model
represents the database as a collection of relations. The relational model, like all other models, consists
of three basic components:
▪ a set of domains and a set of relations
▪ operations on relations
▪ integrity rules
Advantages
▪ Ease of use – The representation of information as tables consisting of rows and columns
is quite natural, and therefore even first-time users find it attractive.
▪ Flexibility – Different tables from which information has to be linked and extracted can
be easily manipulated by operators such as project and join to give information in the form in which it is
desired.
▪ Security – Security control and authorization can also be implemented more easily by
moving sensitive attributes in a given table into a separate relation with its own authorization controls.
If the authorization requirements permit, a particular attribute can be joined back with the others to enable
full information retrieval.
▪ Data Independence – Data independence is achieved more easily with the normalized
structure used in a relational database than in the more complicated tree or network structure. It also
frees the users from details of storage structure and access methods.
▪ Data Manipulation Language – Responding to ad-hoc queries by
means of a language based on relational algebra and relational calculus is easy in the relational database
approach. The model provides simplicity in data organization and a range of query languages, from
reasonably simple to very powerful.
Disadvantages
▪ Performance – If the number of tables between which relationships are to be established
is large and the tables themselves are voluminous, the performance in responding to queries is
degraded.
▪ Unsuitable for Hierarchies – The relational database approach is a logically
attractive, commercially feasible approach, but if the data is, for example, naturally organized in a
hierarchical manner and stored as such, the hierarchical approach may give better results.

(i) Explain the Integrity Constraints.

Ans:
(i) Integrity Constraints – A database is only as good as the information stored in it, and a
DBMS must therefore help prevent the entry of incorrect information. An integrity constraint is a
condition specified on a database schema and restricts the data that can be stored in an instance of the
database. If a database instance satisfies all the integrity constraints specified on the database schema, it
is a legal instance. A DBMS enforces integrity constraints, in that it permits only legal instances to be
stored in the database. Integrity constraints are specified and enforced at different times:
▪ When the DBA or end user defines a database schema, he or she specifies the integrity
constraints that must hold on any instance of this database.
▪ When a database application is run, the DBMS checks for violations and disallows
changes to the data that violate the specified integrity constraints.
Many kinds of integrity constraints can be specified in the relational model, such as, Not Null, Check,
Unique, Primary Key, etc.
List any two significant differences between a file processing system and a DBMS.
Ans:
File Processing System vs. DBMS
Data Independence - Data independence is the capacity to change the schema at one level of a
database system without having to change the schema at the next higher level. In file processing systems the
data and applications are generally interdependent, but a DBMS provides data
independence.
Data Redundancy – Data redundancy means unnecessary duplication of data. File processing
systems store redundant data, whereas in a DBMS redundancy can be reduced by means of the
normalization process without losing any of the original data. Doing the same in a file processing
system becomes too complex.
Differentiate between various levels of data abstraction.
Ans: Data Abstraction – Abstraction is the process of hiding irrelevant details from the users and
presenting only the relevant ones. Database systems are often used by non-computer
professionals, so their complexity must be hidden from database system users. This is done by
defining levels of abstraction at which the database may be viewed: the external (or logical) view,
the conceptual view and the internal (or physical) view.

o External View – This is the highest level of abstraction, as seen by a user. It describes
only the part of the entire database that is relevant to a particular user.
o Conceptual View – This is the middle level of abstraction, which is the sum total
of all users' views. It describes what data are actually stored in the
database. It contains information about the entire database in terms of a small number of relatively simple
structures.
o Internal View – This is the lowest level of abstraction. It describes how the data are
physically stored.

Define the following terms:


a) Primary key. b) DML c) Multivalued attribute d) Relationship instance

Ans: a) Primary Key – The primary key is the candidate key chosen to identify tuples uniquely
within a relation. It should be chosen such that its attribute values are never, or very rarely, changed.
b) Data Manipulation Language (DML) – A data manipulation language is a language
that enables users to access or manipulate data as organized by the appropriate data model.
c) Multivalued Attribute – A multivalued attribute may have more than one value for an
entity. For example, PreviousDegrees of a STUDENT.
d) Relationship Instance – A relationship is an association among two or more entities.
A relationship instance is one such association: it links exactly one entity from each participating
entity type. The set of all relationship instances of the same type is a relationship set.

Consider the following relational database:
STUDENT (name, student#, class, major)
COURSE (course name, course#, credit hours, department)
SECTION (section identifier, course#, semester, year, instructor)
GRADE_REPORT (student#, section identifier, grade)
PREREQUISITE (course#, prerequisite#)
Specify the following queries in SQL on the above database schema.
i. Retrieve the names of all students majoring in ‘CS’ (Computer Science).
ii. Retrieve the names of all courses taught by Professor King in 1998
iii. Delete the record for the student whose name is ‘Smith’ and whose student
number is 17.
iv. Insert a new course <‘Knowledge Engineering’, ‘CS4390’, 3, ‘CS’>

Ans: (i) SELECT NAME FROM STUDENT WHERE MAJOR = 'CS'
(ii) SELECT COURSE_NAME FROM COURSE C, SECTION S
WHERE C.COURSE# = S.COURSE#
AND INSTRUCTOR = 'King' AND YEAR = 1998
OR (equivalently, with a subquery):
SELECT COURSE_NAME FROM COURSE
WHERE COURSE# IN (SELECT COURSE# FROM SECTION
WHERE INSTRUCTOR = 'King' AND YEAR = 1998)
(iii) DELETE FROM STUDENT WHERE NAME = 'Smith' AND STUDENT# = 17
(iv) INSERT INTO COURSE
VALUES('Knowledge Engineering', 'CS4390', 3, 'CS')

Explain the concept of a data model. What data models are used in database management systems?
Ans:
Data Model – Modelling is an abstraction process that hides irrelevant details while highlighting details
relevant to the application at hand. Similarly, a data model is a collection of concepts that can be used
to describe the structure of a database, and it provides the necessary means to achieve this abstraction.
The structure of a database means the data types, relationships, and constraints that should hold for the data.
In general a data model consists of two elements:
▪ A mathematical notation for expressing data and relationships.
▪ Operations on the data that serve to express queries and other manipulations of the
data.
Data Models used in DBMSs:
▪ Hierarchical Model - It was developed to model many types of hierarchical
organizations that exist in the real world. It uses tree structures to represent relationship among records.
In hierarchical model, no dependent record can occur without its parent record occurrence and no
dependent record occurrence may be connected to more than one parent record occurrence.
▪ Network Model - It was formalised in the late 1960s by the Database Task Group of the
Conference on Data Systems Languages (DBTG/CODASYL). It uses two different data structures to
represent the database entities and the relationships between the entities, namely record type and set type. In
the network model, the relationships as well as the navigation through the database are predefined at
database creation time.
▪ Relational Model - The relational model was first introduced by E.F. Codd of the
IBM Research in 1970. The model uses the concept of a mathematical relation (like a table of values) as
its basic building block, and has its theoretical basis in set theory and first-order predicate logic. The
relational model represents the database as a collection of relations.
▪ Object Oriented Model – This model is based on the object-oriented programming
language paradigm. It includes the features of OOP like inheritance, object-identity,
encapsulation, etc. It also supports a rich type system, including structured and collection types.
▪ Object Relational Model – This model combines the features of both relational
model and object oriented model. It extends the traditional relational model with a variety of features
such as structured and collection types.

Briefly explain the differences between a stand alone query language, embedded query language
and a data manipulation language.
Ans: Stand alone Query Language – The query language which can be used interactively is called
stand alone query language. It does not need the support of a host language.
Embedded Query Language – A query language (e.g., SQL) can be implemented in two ways. It can
be used interactively or embedded in a host language. The use of query language commands within a
host language (e.g., C, Java, etc.) program is called embedded query language. Although similar
capabilities are supported for a variety of host languages, the syntax sometimes varies.
Data Manipulation Language (DML) – A data manipulation language is a language that enables
users to access or manipulate data as organized by the appropriate data model.

Consider the following relations for a database that keeps track of business trips of
salespersons in a sales office:
SALESPERSON (SSN, Name, start_year, Dept_no)
TRIP (SSN, From_city, To_city, Departure_Date, Return_Date, Trip_ID)
EXPENSE (Trip_ID, Account#, Amount)
Specify the following queries in relational algebra:
(i) Give the details (all attributes of TRIP) for trips that exceeded $2000 in expenses.
(ii) Print the SSN of salesman who took trips to ‘Honolulu’.
(iii) Print the trip expenses incurred by the salesman with SSN = ‘234-56-7890’. Note that the
salesman may have gone on more than one trip. List them individually.

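Ans: (i) π TRIP.* (σ Amount > 2000 (TRIP ⋈ EXPENSE))
(ii) π SSN (σ To_city = 'Honolulu' (TRIP))
(iii) π Trip_ID, Amount (σ SSN = '234-56-7890' (TRIP ⋈ EXPENSE))
Here π denotes projection, σ denotes selection and ⋈ denotes the natural join of TRIP and EXPENSE
on the trip identifier.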
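The three operators used in these answers can be tried out with a small Python sketch. The sample rows are invented, only the attributes needed for the queries are kept, and the trip identifier is assumed to be named Trip_ID in both relations so that the natural join can match on it.

```python
# Toy relations as lists of dicts; every row below is invented sample data.
TRIP = [
    {"SSN": "234-56-7890", "To_city": "Honolulu", "Trip_ID": 1},
    {"SSN": "234-56-7890", "To_city": "Delhi", "Trip_ID": 2},
    {"SSN": "111-22-3333", "To_city": "Mumbai", "Trip_ID": 3},
]
EXPENSE = [
    {"Trip_ID": 1, "Account": 10, "Amount": 2500},
    {"Trip_ID": 2, "Account": 11, "Amount": 300},
    {"Trip_ID": 3, "Account": 12, "Amount": 900},
]

def select(rows, predicate):
    """Selection: keep the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]

def project(rows, attrs):
    """Projection: keep only the named attributes and drop duplicate rows."""
    result = []
    for row in rows:
        reduced = {a: row[a] for a in attrs}
        if reduced not in result:
            result.append(reduced)
    return result

def natural_join(left, right):
    """Natural join: combine rows that agree on all shared attribute names."""
    joined = []
    for l in left:
        for r in right:
            common = set(l) & set(r)
            if all(l[a] == r[a] for a in common):
                joined.append({**l, **r})
    return joined

# (i) details of trips that have an expense above 2000
big_trips = project(
    select(natural_join(TRIP, EXPENSE), lambda t: t["Amount"] > 2000),
    ["SSN", "To_city", "Trip_ID"])
# (ii) SSNs of salespersons who travelled to Honolulu
honolulu = project(select(TRIP, lambda t: t["To_city"] == "Honolulu"), ["SSN"])
# (iii) per-trip expenses of one salesperson
expenses = project(
    select(natural_join(TRIP, EXPENSE), lambda t: t["SSN"] == "234-56-7890"),
    ["Trip_ID", "Amount"])
```

Each query is written inside-out, exactly as the algebra expression is read: join first, then select, then project.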

What is the difference between a key and a superkey?

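Ans: Key – A key is a single attribute or a combination of two or more attributes of an entity set that
uniquely identifies each instance (row) of the set (table). It is a minimal combination of attributes:
removing any attribute from it destroys the uniqueness property.
Super Key – A super key is a set of one or more attributes that, taken collectively, allows us to identify
uniquely a tuple in the relation. It may contain attributes that are not needed for identification, so every
key is a super key, but not every super key is a key.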
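The difference can be checked mechanically on a sample of rows. The sketch below is only an illustration: the rows and attribute names are invented, and because a key is a property of the schema rather than of one instance, a test on sample data can refute a proposed key but cannot prove it.

```python
from itertools import combinations

# Invented sample rows; the attribute names are assumptions for illustration.
ROWS = [
    {"roll_no": 1, "name": "Asha", "city": "Delhi"},
    {"roll_no": 2, "name": "Ravi", "city": "Delhi"},
    {"roll_no": 3, "name": "Asha", "city": "Pune"},
]

def is_superkey(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def is_key(rows, attrs):
    """True if attrs is a super key and no proper non-empty subset of it is one."""
    if not is_superkey(rows, attrs):
        return False
    for size in range(1, len(attrs)):
        for subset in combinations(attrs, size):
            if is_superkey(rows, subset):
                return False
    return True
```

On these rows, {roll_no, name} is a super key but not a key, because roll_no alone already identifies every row.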

Why are cursors necessary in embedded SQL?


Ans: A cursor is an object used to store the output of a query for row-by-row processing by the
application programs. SQL statements operate on a set of data and return a set of data. On the other hand,
host language programs operate on one row at a time. Cursors are used to navigate through the set of
rows returned by an embedded SQL SELECT statement. A cursor can be compared to a pointer.
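The same row-at-a-time idea can be seen in Python's standard sqlite3 module. This is not embedded SQL in the precompiler sense, only a host-language API whose cursor object plays the analogous role; the table and its rows are invented for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [("Anil", 3000), ("Bina", 4500), ("Chitra", 5200)])

cursor = conn.cursor()
# The query produces a set of rows; the cursor hands them over one at a time.
cursor.execute("SELECT name, salary FROM employee ORDER BY name")

total = 0
names = []
while True:
    row = cursor.fetchone()   # advance the cursor by one row
    if row is None:           # the result set is exhausted
        break
    name, salary = row
    names.append(name)
    total += salary

cursor.close()
conn.close()
```

The loop mirrors the usual open, fetch, close pattern: the host program never sees the whole result set at once, only the row the cursor currently points to.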

What are views? Explain how views are different from tables.
Ans:
A view in SQL terminology is a single table that is derived from other tables. These other
tables could be base tables or previously defined views. A view does not necessarily exist in physical
form; it is considered a virtual table, in contrast to base tables, whose tuples are actually stored in the
database. This limits the possible update operations that can be applied to views, but it does not place
any limitations on querying a view. A view represents a different perspective of one or more base relations. The
definition of a view in a create view statement is stored in the system catalog. Any attribute in the view
can be updated as long as the attribute is simple and not derived from a computation involving two or
more base relation attributes. Views that involve a join may or may not be updatable. Such views are not
updatable if they do not include the primary keys of the base relations.
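A short sketch with Python's sqlite3 module shows that a view stores only its definition and always reflects the current contents of the base table. The table, the view name fbc_staff and the rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE works (employee_name TEXT, company_name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO works VALUES (?, ?, ?)",
                 [("Anil", "First Bank Corporation", 3000),
                  ("Bina", "Small Bank", 4500)])

# The view is defined once; no rows are copied into it.
conn.execute("""CREATE VIEW fbc_staff AS
                SELECT employee_name, salary FROM works
                WHERE company_name = 'First Bank Corporation'""")

query = "SELECT employee_name FROM fbc_staff ORDER BY employee_name"
before = conn.execute(query).fetchall()

# A change to the base table is visible through the view immediately.
conn.execute("INSERT INTO works VALUES ('Chitra', 'First Bank Corporation', 5200)")
after = conn.execute(query).fetchall()

# The catalog holds the definition of the view, not a second copy of the data.
catalog_entry = conn.execute(
    "SELECT type FROM sqlite_master WHERE name = 'fbc_staff'").fetchone()
```

Querying the view works like querying a table, which is the point made above: the restrictions on views concern updates, not retrieval.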
What do you mean by integrity constraints? Explain the two constraints, check and
foreign key in SQL with an example for each. Give the syntax.

Ans: Integrity Constraints – An integrity constraint is a condition specified on a database schema and
restricts the data that can be stored in an instance of the database. If a database instance satisfies all the
integrity constraints specified on the database schema, it is a legal instance. A DBMS enforces integrity
constraints, in that it permits only legal instances to be stored in the database.
CHECK constraint – A CHECK constraint specifies an expression that must always be true for every
row in the table. It can't refer to values in other rows.
Syntax:
ALTER TABLE <table_name>
ADD CONSTRAINT <constraint_name> CHECK(<expression>);
FOREIGN KEY constraint – A foreign key is a combination of columns with values based on the
primary key values from another table. A foreign key constraint, also known as referential integrity
constraint, specifies that the values of the foreign key correspond to actual values of the primary or
unique key in the other table. One can refer to a primary or unique key in the same table also.
Syntax:
ALTER TABLE <table_name>
ADD CONSTRAINT <constraint_name> FOREIGN KEY(<column_name(s)>)
REFERENCES <base_table>(<column_name>) [ON {DELETE | UPDATE} CASCADE];
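As a worked example of both constraints, the sketch below uses Python's sqlite3 module. SQLite cannot add these constraints with ALTER TABLE, so they are declared inline in CREATE TABLE here; the tables, rows and the rule "salary > 0" are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE company (company_name TEXT PRIMARY KEY, city TEXT)")
conn.execute("""CREATE TABLE works (
                    employee_name TEXT,
                    company_name  TEXT REFERENCES company(company_name) ON DELETE CASCADE,
                    salary        INTEGER CHECK (salary > 0))""")
conn.execute("INSERT INTO company VALUES ('First Bank Corporation', 'Mumbai')")
conn.execute("INSERT INTO works VALUES ('Anil', 'First Bank Corporation', 3000)")

def rejected(sql):
    """Run a statement and report whether the DBMS refused it as a constraint violation."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

# CHECK: a non-positive salary makes the expression false, so the row is refused.
check_violation = rejected(
    "INSERT INTO works VALUES ('Bina', 'First Bank Corporation', -5)")
# FOREIGN KEY: the company does not exist in the parent table.
fk_violation = rejected(
    "INSERT INTO works VALUES ('Chitra', 'No Such Company', 4000)")

# ON DELETE CASCADE: deleting the parent row removes the dependent rows too.
conn.execute("DELETE FROM company WHERE company_name = 'First Bank Corporation'")
remaining = conn.execute("SELECT COUNT(*) FROM works").fetchone()[0]
```

Both illegal inserts are refused, so only legal instances are ever stored, and the cascade keeps the child table consistent when the parent row disappears.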

Q: What are the different types of database end users? Discuss the main activities of each.
Ans:
End-Users – End-users are the people whose jobs require access to the database for querying, updating,
and generating reports; the database primarily exists for their use. The different types of end-users are:
▪ Casual end-users – occasionally access the database and need different information each
time.
▪ Naive or Parametric end-users – include tellers, clerks, etc., and make up a sizable portion
of database end-users; their main job function revolves around constantly querying and updating the
database.
▪ Sophisticated end-users – include engineers, scientists, business analysts, etc., who use the database for
their complex requirements.
▪ Stand-alone users – maintain personal databases by using ready-made program
packages that provide easy-to-use menu-based or graphics-based interfaces.

Describe cardinality ratios and participation constraints for relationship types.


Ans:
Cardinality Ratios – The cardinality ratios for a relationship type specifies the maximum number of
relationship instances that an entity can participate in. The possible cardinality ratios for relationship
types are one-to-one (1:1), one-to-many or many-to-one (1:M or M:1), and many-to-many (M:N).
Participation Constraints – The participation constraint specifies whether the existence of an entity
depends on its being related to another entity via the relationship type. This constraint specifies the
minimum number of relationship instances that each entity can participate in. It is sometimes called the
minimum cardinality constraint. There are two types of participation constraints – total and partial.

Information about a bank is about customers and their account. Customer has a name, address
which consists of house number, area and city, and one or more phone numbers. Account has
number, type and balance. We need to record customers who own an account. Account can be
held individually or jointly. An account cannot exist without a customer.
Arrive at an E-R diagram. Clearly indicate attributes, keys, the cardinality ratios and
participation constraints.
Ans:
[E-R diagram: entity CUSTOMER with attributes name, Phone_no and the composite attribute address
(House_no, area, city); entity ACCOUNT with attributes Account_no, type and balance; the two entities
are connected by the relationship "Owns account", annotated (1, M) on the CUSTOMER side and M on
the ACCOUNT side.]


Describe the static hash file with buckets and chaining and show how insertion, deletion and
modification of a record can be performed.
Ans:
In static hash file organization, the term bucket is used to denote a unit of storage that can store one or
more records. A file consists of buckets 0 through N-1, with one primary page per bucket initially and
additional overflow pages chained to the bucket, if required later. Buckets contain data entries (or data
records). In a hashing scheme, a hash function, h, is applied to the key of the record to identify the
bucket to which the data record belongs. The hash function is an important component of the hashing
approach. The main problem with a static hash file is that the number of buckets is fixed.

[Figure: Static Hash File. The key is passed to the hash function h, and h(key) mod N selects one of
the primary bucket pages 0 to N-1, each of which may have a chain of overflow pages.]

Insertion of a record – To insert a data entry, the hash function is used to identify the
correct bucket and then put the data entry there. If there is no space for this data entry, a
new overflow page is allocated, the data entry is put on this page, and the page is added to the overflow chain
of the bucket.
Deletion of a record – To delete a data entry, the hash function is used to identify the correct bucket,
locate the data entry by searching the bucket, and then remove it. If the data entry is the last in an
overflow page, the overflow page is removed from the overflow chain of the bucket and added to a list
of free pages.
Modification of a record – To modify a data entry, the hash function is used to identify the correct
bucket, the data entry is located by searching the bucket and read, the data entry is modified, and the
modified data entry is then written back in its place.
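The three operations can be sketched in Python as follows. This is a minimal in-memory model, not a disk-based implementation: the class name StaticHashFile, the page capacity of two records and the use of integer keys with h(key) = key mod N are all assumptions made for the example.

```python
PAGE_CAPACITY = 2   # records per page (assumed; kept small to force overflow pages)

class StaticHashFile:
    def __init__(self, n_buckets):
        self.n = n_buckets   # the number of buckets is fixed for the life of the file
        # Each bucket is a chain of pages: the primary page first, then overflow pages.
        self.buckets = [[[]] for _ in range(n_buckets)]

    def _chain(self, key):
        return self.buckets[key % self.n]   # h(key) mod N, for integer keys

    def insert(self, key, value):
        chain = self._chain(key)
        for page in chain:
            if len(page) < PAGE_CAPACITY:
                page.append((key, value))
                return
        chain.append([(key, value)])   # allocate a new overflow page and link it

    def search(self, key):
        for page in self._chain(key):
            for k, v in page:
                if k == key:
                    return v
        return None

    def delete(self, key):
        chain = self._chain(key)
        for i, page in enumerate(chain):
            for j, (k, _) in enumerate(page):
                if k == key:
                    del page[j]
                    if not page and i > 0:   # an emptied overflow page is unlinked
                        del chain[i]
                    return True
        return False

    def modify(self, key, new_value):
        for page in self._chain(key):
            for j, (k, _) in enumerate(page):
                if k == key:
                    page[j] = (key, new_value)   # rewrite the entry in place
                    return True
        return False

# Keys 0, 3 and 6 all hash to bucket 0 when N = 3, so the third one overflows.
f = StaticHashFile(3)
for k in (0, 3, 6):
    f.insert(k, f"rec{k}")
```

The modify method assumes the key itself does not change; if it did, the record might belong to a different bucket and would have to be deleted and inserted again.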

Define the following terms


(i) Derived and stored attribute.
(ii) Distributed system.
(iii) Interblock gap
(iv) Degree of a relation.
(v) Catalog
(vi) Conceptual schema
(vii) DDL and SDL.

Ans: (i) Derived and Stored Attribute - In some cases, two or more attribute values are related, for
example, the Age and BirthDate attributes of a person. For a particular person entity, the value of Age can be
determined from the current date and the value of that person's BirthDate. Hence, the attribute Age is
called a derived attribute and the attribute BirthDate is called a stored attribute.
(ii) Distributed System – A distributed system consists of a number of processing
elements that are interconnected by a computer network and that cooperate in performing certain
assigned tasks.
(iii) Interblock Gap – A track of a disk is divided into equal-sized disk blocks. Blocks are
separated by fixed-size gaps, called as interblock gaps, which include specially coded control
information written during disk initialization.
(iv) Degree of a Relation – The degree or arity of a relation is the number of attributes
n of its relation schema.
(v) Catalog – A relational DBMS maintains information about every table and index that
it contains. A catalog is a collection of special tables, which stores the descriptive information of every
table and index.
(vi) Conceptual Schema – Conceptual schema describes the structure of the whole
database for a community of users. It hides the details of physical storage structures and concentrates on
describing entities, data types, relationships, and constraints.
(vii) DDL and SDL – The data definition language (DDL) is used by DBA and database
designers to define conceptual schema, internal schema, and mappings between these two. In some
DBMSs, a clear separation is maintained between conceptual schema and internal schema. In that case,
DDL is used to specify the conceptual schema only. Another language, storage definition language
(SDL) is used to specify the internal schema. The mappings between the two schemas may be specified
in either one of these languages.
Define a relation.
Ans: Relation – A relation is a named two-dimensional table of data. Mathematically, a relation can be
defined as a subset of the cartesian product of a list of domains. Each relation consists of a set of named
columns and an arbitrary number of rows. The columns correspond to the fields describing each tuple
in the table or relation. The rows correspond to the instances of the entity described by the table or
relation.
Describe entity integrity and referential integrity. Give an example of each.
Ans:
Entity Integrity Rule – If the attribute A of relation R is a prime attribute of R then A
cannot accept null values.
Referential Integrity Rule – In referential integrity, it is ensured that a value that appears in one
relation for a given set of attributes also appears for a certain set of attributes in another relation.
For example:
STUDENT

Enrl No   Roll No   Name           City     Mobile
11        17        Ankit Vats     Delhi    9891663808
15        16        Vivek Rajput   Meerut   9891468487
6         6         Vanita         Punjab
33        75        Bhavya         Delhi    9810618396

GRADE

Roll No   Course   Grade
6         C        A
17        VB       C
75        VB       A
6         DBMS     B
16        C        B

▪ Roll No is the primary key in the relation STUDENT and Roll No + Course is the
primary key of the relation GRADE. (Entity Integrity)
▪ Roll No in the relation GRADE (child table) is a foreign key, which is referenced from
the relation STUDENT (parent table). (Referential Integrity).

Consider the two relations given below

R
A      B    C
a1     b1   c1
null   b2   null
a1     b1   c1

S
D    A    F
d1   a1   f1
d1   a2   null

Given that A is the primary key of R, D is the primary key of S and there is a referential integrity
between S.A and R.A, discuss all integrity constraints that are violated.
Ans: (i) The primary key of R contains the 'null' value and the value 'a1' is duplicated, hence it violates the
entity integrity constraint in the relation R.
(ii) In the primary key of S, the value 'd1' is duplicated, hence it violates the entity integrity
constraint in the relation S.
(iii) The foreign key S.A contains the value 'a2', which is not available in the parent key
R.A, hence it violates the referential integrity constraint in the relation S.
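The same three findings can be reproduced with a small Python check. The two helper functions are hypothetical names written for this sketch; R and S are encoded as tuples in the column order given in the question, with None standing for null.

```python
R = [("a1", "b1", "c1"), (None, "b2", None), ("a1", "b1", "c1")]   # columns A, B, C
S = [("d1", "a1", "f1"), ("d1", "a2", None)]                        # columns D, A, F

def entity_integrity_violations(rows, key_index):
    """Report null and duplicate values in the primary-key column."""
    problems, seen = [], set()
    for row in rows:
        key = row[key_index]
        if key is None:
            problems.append("null key")
        elif key in seen:
            problems.append(f"duplicate key {key}")
        else:
            seen.add(key)
    return problems

def referential_violations(child, fk_index, parent, pk_index):
    """Report foreign-key values that match no parent key (a null foreign key is allowed)."""
    parent_keys = {row[pk_index] for row in parent if row[pk_index] is not None}
    return [row[fk_index] for row in child
            if row[fk_index] is not None and row[fk_index] not in parent_keys]

r_problems = entity_integrity_violations(R, 0)    # A is the primary key of R
s_problems = entity_integrity_violations(S, 0)    # D is the primary key of S
fk_problems = referential_violations(S, 1, R, 0)  # S.A references R.A
```

The three result lists correspond to answers (i), (ii) and (iii) respectively.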

Given the following relations


TRAIN (NAME, START, DEST)
TICKET (PNRNO., START, DEST, FARE)
PASSENGER (NAME, ADDRESS, PNRNO.)
Write SQL expressions for the following queries:
Note: Assume NAME of Train is a column of Ticket.
(i) List the names of passengers who are travelling from the start to the
destination station of the train.
(ii) List the names of passengers who have a return journey ticket.
(iii) Insert a new Shatabdi train from Delhi to Bangalore.
(iv) Cancel the ticket of Tintin.

Ans:
(i) SELECT P.NAME FROM TRAIN T, TICKET I, PASSENGER P
WHERE P.PNRNO = I.PNRNO AND T.NAME = I.NAME
AND T.START = I.START AND T.DEST = I.DEST
(ii) SELECT NAME FROM PASSENGER
WHERE PNRNO IN (SELECT DISTINCT A.PNRNO
FROM TICKET A, TICKET B
WHERE A.PNRNO = B.PNRNO AND A.START = B.DEST AND
A.DEST = B.START)
(iii) INSERT INTO TRAIN
VALUES('Shatabdi', 'Delhi', 'Bangalore')
(iv) DELETE FROM TICKET
WHERE PNRNO IN (SELECT PNRNO FROM PASSENGER WHERE NAME = 'Tintin')

Define outer union operation of the relational algebra. Compute the outer union for the
relations R and S given below.

R
A    B    C
a1   b1   c1
a3   b2   c2

S
D    A    F
d1   a1   f1
d1   a2   null

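Ans:
Outer Union - The outer union takes the union of tuples from two relations that are only partially
union-compatible, padding the attributes that a tuple lacks with NULLs. For R and S, which share only the
attribute A, it corresponds to a full outer join on A.
Outer Join - If there are any values in one table that do not have corresponding value(s) in the other,
those rows will not be selected in an equi-join. Such rows can be forcefully selected by using the outer join.
The corresponding columns for that row will have NULLs. There are actually three forms of the outer-
join operation: left outer join (⟕), right outer join (⟖) and full outer join (⟗).

R.A    B      C      D      S.A    F
a1     b1     c1     d1     a1     f1
a3     b2     c2     Null   Null   Null
Null   Null   Null   d1     a2     Null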
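The result table can be reproduced with a short Python sketch of a full outer join on the common attribute A. The function name full_outer_join and the column prefixes "R." and "S." are choices made for this illustration; None stands for Null.

```python
R = [{"A": "a1", "B": "b1", "C": "c1"}, {"A": "a3", "B": "b2", "C": "c2"}]
S = [{"D": "d1", "A": "a1", "F": "f1"}, {"D": "d1", "A": "a2", "F": None}]

def full_outer_join(left, right, attr):
    """Join on attr; rows without a partner on either side are padded with None."""
    left_cols = list(left[0])
    right_cols = list(right[0])

    def combine(l, r):
        row = {}
        for c in left_cols:
            row[f"R.{c}" if c == attr else c] = l[c] if l else None
        for c in right_cols:
            row[f"S.{c}" if c == attr else c] = r[c] if r else None
        return row

    result, matched_right = [], set()
    for l in left:
        # A null join value never matches anything.
        partners = [i for i, r in enumerate(right)
                    if l[attr] is not None and r[attr] == l[attr]]
        if partners:
            for i in partners:
                matched_right.add(i)
                result.append(combine(l, right[i]))
        else:
            result.append(combine(l, None))        # unmatched left row
    for i, r in enumerate(right):
        if i not in matched_right:
            result.append(combine(None, r))        # unmatched right row
    return result

result = full_outer_join(R, S, "A")
```

The first row of the result is the matched pair on a1, the second is the unmatched tuple of R and the third is the unmatched tuple of S, in the same order as the table above.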

Given the following relations


Vehicle (reg_no, make, colour)
Person (eno, name, address)
Owner (eno, reg_no)
Write expressions in the relational algebra to answer the following queries:-
(i) List the reg_no of vehicles owned by John.
(ii) List the names of persons who own maruti cars.
(iii) List all the red coloured vehicles.

Ans: (i) π reg_no (σ name = 'John' (PERSON ⋈ OWNER))
(ii) π name (σ make = 'maruti' (PERSON ⋈ OWNER ⋈ VEHICLE))
(iii) σ colour = 'red' (VEHICLE)

Describe the responsibilities of the DBA and the database designer.


Ans: The responsibilities of DBA and database designer are:
1. Planning for the database's future storage requirements
2. Defining database availability and fault management architecture
3. Defining and creating environments for development and new release installation
4. Creating physical database storage structures after developers have designed an
application
5. Constructing the database
6. Determining and setting the size and physical locations of data files
7. Evaluating new hardware and software purchase
8. Researching, testing, and recommending tools for Oracle development, modeling,
database administration, and backup and recovery implementation, as well as planning for the future
9. Providing database design and implementation
10. Understanding and employing the optimal flexible architecture
to ease administration, allow flexibility in managing I/O, and increase the capability to scale the
system.
What are the four main characteristics of the database approach?
Ans: The four main characteristics of the database approach are:
1. Self-describing nature of a database system.
2. Insulation between programs and data, and data abstraction.
3. Support of multiple views of the data.
4. Sharing of data and multi-user transaction processing.

Differentiate between DDL and DML.


Ans: DDL - Data Definition Language: statements used to define the database
structure or schema. Some examples:
▪ CREATE - creates objects in the database
▪ ALTER - alters the structure of the database
▪ DROP - deletes objects from the database
▪ TRUNCATE - removes all records from a table and also releases the space allocated for
the records
▪ COMMENT - adds comments to the data dictionary
▪ RENAME - renames an object
DML - Data Manipulation Language: statements used for managing data within schema objects.
Some examples:
▪ SELECT - retrieves data from a database
▪ INSERT - inserts data into a table
▪ UPDATE - updates existing data within a table
▪ DELETE - deletes records from a table (all of them if no WHERE clause is given); the space for the
records remains allocated
▪ MERGE - UPSERT operation (insert or update)
▪ CALL - calls a PL/SQL or Java subprogram
▪ EXPLAIN PLAN - explains the access path to data
List any two disadvantages of a database system.
Ans: The disadvantages of a database system are:
▪ Database systems are complex, difficult, and time-consuming to design.
▪ Substantial hardware and software start-up costs.
▪ Damage to the database affects virtually all application programs.
▪ Extensive conversion costs in moving from a file-based system to a database system.
▪ Initial training required for all programmers and users.
Explain the utilities that help the DBA to manage the database.

Ans: Every DBA uses database utilities to manage and control their databases. But
there is a lot of confusion in the field as to what, exactly, a database utility is. Many
definitions are in circulation, and DBAs constantly refer to utilities, tools, solutions, and suites.
So, first of all, let us be clear on what a utility is and what a "tool" or "solution" is. A utility is generally
a single-purpose program for moving and/or verifying database pages; examples include LOAD,
UNLOAD, REORG, CHECK, COPY, and RECOVER. A database tool is a multi-functioned program
designed to simplify database monitoring, management, and/or administrative tasks. A solution is a
synergistic group of tools and utilities designed to work together to address a customer's business issue.
A suite is a group of tools that are sold together, but are not necessarily integrated to work with each
other in any way. These definitions are not universal, but they are useful ones that make it
easier to discuss DBA products and programs.
Differentiate between
(i) WHERE and HAVING clause in SQL.
(ii) Strong entity set and weak entity set.
Ans: (i) WHERE and HAVING clause in SQL
The WHERE clause is used to apply conditions to individual tuples of the relation, before any grouping
takes place.
The HAVING clause is used in combination with the GROUP BY clause. It can be used in a SELECT
statement to filter the groups of records that a GROUP BY returns.
The syntax for the HAVING clause is:
SELECT column1, column2, ... column_n, aggregate_function (expression)
FROM tables
WHERE predicates
GROUP BY column1, column2, ... column_n
HAVING condition1 ... condition_n;
Aggregate_function can be a function such as SUM, COUNT, MIN, or MAX.
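The order in which the two clauses act can be seen in a small example run through Python's sqlite3 module. The table works and its rows are invented for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE works (employee_name TEXT, company_name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO works VALUES (?, ?, ?)", [
    ("Anil", "First Bank Corporation", 3000),
    ("Bina", "First Bank Corporation", 7000),
    ("Chitra", "Small Bank", 2000),
    ("Dev", "Small Bank", 2500),
    ("Esha", "City Bank", 9000),
])

rows = conn.execute("""
    SELECT company_name, SUM(salary)
    FROM works
    WHERE salary >= 2500          -- filters individual rows first
    GROUP BY company_name
    HAVING COUNT(*) >= 2          -- then filters whole groups
    ORDER BY company_name
""").fetchall()
```

WHERE removes Chitra's row before grouping, so Small Bank is left with one row and is then dropped by HAVING, as is City Bank; only First Bank Corporation remains.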
(ii) Strong entity set and weak entity set: A strong entity set has a primary key. All
tuples in the set are distinguishable by that key. A weak entity set has no primary key unless attributes
of the strong entity set on which it depends are included. Tuples in a weak entity set are partitioned
according to their relationship with tuples in a strong entity set. Tuples within each partition are
distinguishable by a discriminator, which is a set of attributes.
Discuss with examples about various types of attributes present in the ER model.
Ans: Types of Attributes are:
▪ SIMPLE attributes: Attributes that are drawn from atomic value domains
E.g. Name = {John}; Age = {23}
▪ COMPOSITE attributes: Attributes that consist of a hierarchy of attributes
E.g. Address may consist of "Number", "Street" and "Suburb" → Address = {59 + 'Meek Street' +
'Kingsford'}
▪ SINGLE VALUED attributes: Attributes that have only one value for each entity
E.g. Name, Age for EMPLOYEE
▪ MULTIVALUED attributes: Attributes that have a set of values for each entity
E.g. Degrees of a person: 'BSc', 'MIT', 'PhD'
▪ DERIVED attributes: Attributes that contain values calculated from other
attributes
E.g. Age can be derived from the attribute DateOfBirth. In this situation, DateOfBirth might be called a Stored
Attribute.

What is the main goal of RAID technology?

Ans: RAID stands for Redundant Array of Inexpensive (or sometimes "Independent") Disks. RAID is a
method of combining several hard disk drives into one logical unit
(two or more disks grouped together to appear as a single device to the host system). RAID technology
was developed to address the fault-tolerance and performance limitations of conventional disk storage.
It can offer fault tolerance and higher throughput levels than a single hard drive or group of independent
hard drives. While arrays were once considered complex and relatively specialized storage solutions,
today they are easy to use and essential for a broad spectrum of client/server applications.
Define the following terms
(i) Hashing (ii) Specialization
Ans: (i) Hashing: Hashing is a method of storing data in an array so that storing, searching,
inserting and deleting data is fast (in theory O(1) on average). For this, every record needs a unique key.
The basic idea is not to search for the correct position of a record with comparisons but to compute the
position within the array. The function that returns the position is called the 'hash function' and the array
is called a 'hash table'.
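The idea of computing a position instead of searching for it can be sketched as follows. This is a minimal illustration, assuming integer keys, a table of 7 slots and a simple modulo hash function; collisions are deliberately not handled here.

```python
TABLE_SIZE = 7  # assumed small table size, for illustration only

def hash_function(key: int) -> int:
    """Compute the array position directly from the key."""
    return key % TABLE_SIZE

hash_table = [None] * TABLE_SIZE

def store(key: int, record: str) -> None:
    # No collision handling: the sample keys below map to different slots.
    hash_table[hash_function(key)] = (key, record)

def lookup(key: int):
    """Return the record stored under key, or None if it is absent."""
    slot = hash_table[hash_function(key)]
    if slot is not None and slot[0] == key:
        return slot[1]
    return None

store(10, "record for key 10")  # 10 % 7 = 3
store(22, "record for key 22")  # 22 % 7 = 1
print(lookup(10), lookup(22), lookup(5))
```

Neither store nor lookup compares the key against other records; both go straight to the computed slot.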
(ii) Specialization: In the ER model, specialization is the process of dividing an entity set into
sub-groups (lower-level entity sets) based on their distinguishing characteristics. Each sub-group inherits
the attributes of the higher-level entity set and may add attributes of its own.

Differentiate between natural join and outer join.


Ans: Natural join is a binary operator that is written as (R * S) where R and S are relations. The result
of the natural join is the set of all combinations of tuples in R and S that are equal on their common
attribute names. Only one copy of each commonly named attribute is retained in the result.
An Outer join contains those tuples and additionally some tuples formed by extending an unmatched
tuple in one of the operands by "fill" values for each of the attributes of the other operand.

For the relations R and S given below:

R            S
A  B  C      B  C  D
1  2  3      2  3  10
4  5  6      2  3  11
7  8  9      6  7  12
Compute
(i) π A,C (R) (ii) σ B=2 (S)
(iii) natural join (iv) outer join

Ans: (i)
A C
1 3
4 6
7 9
(ii)
B C D
2 3 10
2 3 11
(iii)
A B C D
1 2 3 10
1 2 3 11
(iv) Assuming left outer join
A B C D
1 2 3 10
1 2 3 11
4 5 6 NULL
7 8 9 NULL
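The four results above can be cross-checked with a short script. The sketch below loads R and S into an in-memory SQLite database (via Python's sqlite3 module) and expresses each operation in SQL; using SQL as a stand-in for the relational algebra operators is an assumption of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (A INTEGER, B INTEGER, C INTEGER)")
conn.execute("CREATE TABLE S (B INTEGER, C INTEGER, D INTEGER)")
conn.executemany("INSERT INTO R VALUES (?, ?, ?)",
                 [(1, 2, 3), (4, 5, 6), (7, 8, 9)])
conn.executemany("INSERT INTO S VALUES (?, ?, ?)",
                 [(2, 3, 10), (2, 3, 11), (6, 7, 12)])

def run(sql):
    """Execute a query and return all result rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# (i) projection of R on A and C (DISTINCT mirrors set semantics)
projection = run("SELECT DISTINCT A, C FROM R ORDER BY A")
# (ii) selection of the tuples of S with B = 2
selection = run("SELECT B, C, D FROM S WHERE B = 2 ORDER BY D")
# (iii) natural join on the common attributes B and C
natural_join = run("SELECT A, B, C, D FROM R NATURAL JOIN S ORDER BY A, D")
# (iv) left outer join: unmatched tuples of R are padded with NULL
left_outer = run(
    "SELECT A, B, C, D FROM R NATURAL LEFT OUTER JOIN S ORDER BY A, D")

for name, rows in [("projection", projection), ("selection", selection),
                   ("natural join", natural_join), ("left outer", left_outer)]:
    print(name, rows)
```

NULL values come back from sqlite3 as Python None.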

Write short notes on


(i) Data models.
(ii) Oracle database structure.
(iii) Group By clause in SQL.
Ans: (i) Data models: A data model is an abstract model that describes how data is
represented and accessed.
The term data model has two generally accepted meanings:
A data model theory, i.e. a formal description of how data may be structured and accessed.
A data model instance, i.e. applying a data model theory to create a practical data model
instance for some particular application.
(ii) Oracle database structure:
The relational model has three major aspects:
Structures: Structures are well-defined objects that store the data of a database. Structures and the data
contained within them can be manipulated by operations.
Operations: Operations are clearly defined actions that allow users to manipulate the data and
structures of a database. The operations on a database must adhere to a pre- defined set of integrity
rules.
Integrity Rules: Integrity rules are the laws that govern which operations are allowed on the data and
structures of a database. Integrity rules protect the data and the structures of a database.

An ORACLE database has both a physical and a logical structure. By separating physical and logical
database structure, the physical storage of data can be managed without affecting the access to logical
storage structures.
(iii) Group By clause in SQL: The GROUP BY clause can be used in a SELECT
statement to collect data across multiple records and group the results by one or more columns.
The syntax for the GROUP BY clause is:
SELECT column1, column2, ... column_n, aggregate_function (expression)
FROM tables
WHERE predicates
GROUP BY column1, column2, ... column_n;
aggregate_function can be a function such as SUM, COUNT, MIN, or MAX.

Explain the disadvantages of file oriented approach.


Ans: Applications are designed in isolation, and the design is optimized for one application.
Independently developed applications lead to data redundancy, which wastes storage space.
a. There is loss of data integrity because integrity checking is not automated.
b. Difficulty in accessing data.
c. Information is available only in reports.
d. Data Isolation.
e. Inadequate security
f. Limited Flexibility
g. High Maintenance cost.
h. Each data file of an application is a separate entity.

Explain the three data models namely relational, network and hierarchical and compare their
relative advantages and disadvantages.
Ans: Hierarchical Model: In hierarchical model, data elements are connected to one another through
links. Records are arranged in a top-down structure that resembles a tree or genealogy chart. The top
node is called the root, the bottom nodes are called leaves, and intermediate nodes have one parent
node and several child nodes. The root can have any number of child nodes but a child node can have
only one parent node. Data are related in a nested, one-to-many set of relationships, while many-to-
many relationship cannot be directly expressed.
A child record occurrence must have a parent record occurrence; deleting a parent record occurrence
requires deleting all its child record occurrences.
A network data model can be regarded as an extended form of the hierarchical model; the principal
distinction between the two being that in a hierarchical model, a child record has exactly one parent
whereas in a network model, a child record can have any number of parents. It may have zero also.
Data in the network model is represented by a collection of records, and relationships among data are
represented by links, which can be viewed as pointers. The records in the database are organized as
collections of arbitrary graphs, which allows one-to-many as well as many-to-many relationships. A record
is a collection of data items which can be retrieved from a database, or which can be stored in a
database, as an undivided object. Thus, a DBMS may STORE, DELETE or MODIFY records within a
database. In this way, the records within a network database are dynamically changed. The
network model can be graphically represented as follows:
A labeled rectangle represents the corresponding entity or record type. An arrow represents the set type,
which denotes the relationship between the owner record type and member record. The arrow direction
is from the owner record type to the member record type.
Each many-to-many relationship is handled by introducing a new record type to represent the
relationship, wherein the attributes, if any, of the relationship are stored. We then create two
symmetrical 1:M sets with the member in each of the sets being the newly introduced record type. In
this model, the relationships as well as the navigation through the database are predefined at database
creation time.
In the relational model the data and the relations among them are represented by a collection of tables. A
table is a collection of records and each record in a table contains the same fields. The attractiveness of
the relational approach arises from the simplicity of the data organization and the availability of
reasonably simple to very powerful query languages. The relational model is based on a technique called
"Normalization" proposed by E.F. Codd. This model reduces the complexity of the Network and
Hierarchical Models. It uses certain mathematical operations from relational algebra and
relational calculus on relations, such as projection, union and join. Where fields in two different
tables take values from the same set, a join operation can be performed to select related records in the
two tables by matching values in those fields. A description of data in terms of a data model is called a
schema. In the relational model, the schema for a relation specifies its name, the name of each field and the
type of each field.
Navigation through relations that represent an M:N relationship is just as simple as through a 1:M
relationship. This leads us to conclude that it is easier to specify how to manipulate a relational database
than a network or hierarchical one. This in turn leads to a query language for the relational model that is
correct, clear, and effective in specifying the required operations. Unfortunately, the join operation is
inherently inefficient and demands a considerable amount of processing and retrieval of unnecessary
data. The structure for the network and hierarchical model can be implemented efficiently. Such an
implementation would mean that navigating through these databases, though awkward, requires the
retrieval of relatively little unnecessary data.

In an organization several projects are undertaken. Each project can employ one or more employees.
Each employee can work on one or more projects. Each project is undertaken at the request of a
client. A client can request several projects. Each
Explain the relevance of Data Dictionary in a Database System.
Ans: A data dictionary is a database in its own right, residing on disk, which consists of metadata:
data about all entity sets, attributes, relationships among entity sets, and constraints. It
consists of a compiled form of definitions, structure and usage information on the data stored, design
decisions, usage standards, application program descriptions, and user information. It is consulted by
the DBMS before DML operations and by users to learn what each piece of data and the various synonyms of
data fields mean. A data dictionary can be an integrated system, where it is part of the DBMS, or an add-on
to the DBMS. In an integrated system the data dictionary contains information concerning the external,
conceptual and internal levels of the database, in both source and object form. It contains the source of
each data field value, the frequency of its use, an audit trail concerning updates, and cross-reference
information. Present systems are all add-ons; standards do not exist for integrating a data dictionary
with a DBMS. A data dictionary should be integrated in the database it defines, and thus include its own
definition, so that it can be queried with the same language used for querying the database.

Differentiate between

(i) Procedural and non procedural languages.
(ii) Key and superkey
(iii) Primary and secondary storage

Ans: (i) Procedural and non procedural languages - A procedural language specifies the operations
to be performed on the existing data to derive the results. It also specifies the sequence in which
the operations will be performed. A non procedural language, by contrast, specifies only the result or
information required, not how it is obtained.
(ii) Key and superkey - A key is a single attribute or a combination of two or more
attributes of an entity set that is used to identify instances (rows) of the set (table). A super key is
any set of attributes that uniquely identifies each row, so adding further attributes to a primary key
yields a super key. Therefore, the primary key is a minimal super key.
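The relationship between a key and a super key can be tried out on sample data. The sketch below is only illustrative: the rows and attribute names are hypothetical, and checking one table instance shows what holds for that instance, whereas a real key is a constraint on every legal instance.

```python
from itertools import combinations

# Hypothetical sample rows of a student table.
rows = [
    {"roll_no": 1, "email": "a@x.org", "name": "Asha", "branch": "CS"},
    {"roll_no": 2, "email": "b@x.org", "name": "Ravi", "branch": "CS"},
    {"roll_no": 3, "email": "c@x.org", "name": "Asha", "branch": "EC"},
]

def is_superkey(attrs, rows):
    """True if no two rows agree on all of the given attributes."""
    seen = {tuple(r[a] for a in attrs) for r in rows}
    return len(seen) == len(rows)

def is_minimal_key(attrs, rows):
    """True if attrs is a super key and dropping any attribute breaks it."""
    if not is_superkey(attrs, rows):
        return False
    smaller = combinations(attrs, len(attrs) - 1)
    return not any(is_superkey(sub, rows) for sub in smaller if sub)

print(is_superkey(("roll_no",), rows))            # identifies every row
print(is_superkey(("roll_no", "name"), rows))     # still a super key
print(is_minimal_key(("roll_no", "name"), rows))  # not minimal
print(is_superkey(("name",), rows))               # "Asha" repeats
```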
(iii) Primary and secondary storage – A primary storage device stores data
temporarily. Primary storage is generally used by the processing unit to temporarily store the data,
intermediate results, and the final results before they are written to secondary storage, because
secondary storage devices are not directly accessible by the CPU. If we want to store data permanently,
secondary storage devices are required. Secondary storage devices are slower than primary
storage devices.
What is the difference between a primary index and a secondary index? What are the
advantages of using an index and what are its disadvantages.

Ans: Primary Index: A primary index is an ordered file whose records are of fixed length with two
fields. The first field is the ordering key field-called primary key-of the data file, and the second field is
a pointer to a disk block. There is one index entry in the index file for each block in the data file. Each
index entry has the value of the primary key field for the first record in a block and a pointer to that
block as its two field values. A major problem with a primary index is insertion and deletion of records.
If we attempt to insert a record in its correct position in the data file, we have to not only move records
to make space for the new record but also change some index entries.
Secondary Index: A secondary index is also an ordered file with two fields. The first field is non-
ordering field of the data file that is an indexing field. The second field is either a block pointer or a
record pointer. A secondary index on a candidate key looks just like a dense primary index, except that
the records pointed to by successive values in the index are not stored sequentially.
In contrast, if the search key of a secondary index is not a candidate key, it is not enough to point to
just the first record with each search-key value. The remaining records with the same search-key
value could be anywhere in the file, since the records are ordered by the search key of the primary
index, rather than by the search key of the secondary index. Therefore, a secondary index must contain
pointers to all the records. Secondary indices improve the performance of queries that use keys other
than the search key of the primary index. However, they impose a significant overhead on modification
of the database. The designer of a database decides which secondary indices are desirable on the basis
of an estimate of the relative frequency of queries and modifications.
Some of the advantages of using an index are:
(i) Indexes speed up search on the indexed attribute(s). Without an index either a
sequential search or some sort of binary search would be needed.
(ii) Indexes can also speed up sequential processing of the file when the file is not
stored as a sequential file.
Some of the disadvantages of using an index are :
(i) An index requires additional storage. This additional storage can be significant
when a number of indexes are being used on a file.
(ii) Insertion, deletion and updates on a file with indexes take more time than on a file
without any indexes.
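The primary index described above (one entry per block, holding the key of the first record in the block) can be sketched with Python lists standing in for disk blocks. The block contents and the helper name find are assumptions for illustration only.

```python
from bisect import bisect_right

# Data file: blocks of (key, value) records, ordered by primary key.
blocks = [
    [(1, "a"), (3, "b"), (5, "c")],
    [(8, "d"), (10, "e"), (12, "f")],
    [(15, "g"), (20, "h")],
]

# Primary index: one entry per block, the key of the block's first record.
# The position in this list plays the role of the block pointer.
index_keys = [block[0][0] for block in blocks]

def find(key):
    """Binary-search the index, then scan only the one candidate block."""
    pos = bisect_right(index_keys, key) - 1
    if pos < 0:
        return None  # key is smaller than every key in the file
    for k, value in blocks[pos]:
        if k == key:
            return value
    return None

print(find(10), find(20), find(9), find(0))
```

The index has 3 entries for 8 records, which is why searching it is cheaper than searching the data file; the cost is that inserts may force index entries to change.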

Describe the function of each of the following types of keys: Primary, alternative, secondary and
foreign.
Ans: Primary Key : The primary key is an attribute or a set of attributes that uniquely identify a
specific instance of an entity. Every entity in the data model must have a primary key whose values
uniquely identify instances of the entity.
To qualify as a primary key for an entity, an attribute must have the following properties :
* It must have a non-null value for each instance of the entity.
* The value must be unique for each instance of an entity
* The values must not change or become null during the life of each entity instance.
Candidate Key and Alternate Key : In some instances, an entity will have more than one attribute
that can serve as a primary key. Any key or minimum set of keys that could be a primary key is called a
candidate key. Once candidate keys are identified, one of them is chosen as the primary key. The choice of
primary key is based on guaranteed uniqueness and minimalism.
Candidate keys which are not chosen as the primary key are known as alternate keys.
Foreign Key : The primary key of one file or table which is implanted in another file or table to
implement the relationships between them. Foreign keys are used to implement some types of
relationships. Foreign keys do not exist in information models.

Discuss the techniques for a hash file to expand and shrink dynamically. What are the
advantages and disadvantages of each?
Ans:
The hashing techniques that allow dynamic file expansion are:
(i) Extendible hashing
(ii) Linear hashing
The main advantage of extendible hashing that makes it attractive is that performance of the file does
not degrade as the file grows. Also, no space is allocated in extendible hashing for future growth, but
additional buckets can be allocated dynamically as needed. A disadvantage is that the directory must be
searched before accessing the buckets themselves, resulting in two blocks accesses instead of one in
static hashing.
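The directory-plus-buckets structure of extendible hashing can be sketched in memory as below. This is a simplified illustration under stated assumptions: the class and method names are hypothetical, buckets hold at most two records, the low-order bits of Python's built-in hash select the directory entry, and shrinking is not implemented.

```python
class ExtendibleHash:
    def __init__(self, bucket_size=2):
        self.bucket_size = bucket_size
        self.global_depth = 1
        # Each directory entry points to a bucket with its own local depth.
        self.directory = [{"depth": 1, "items": {}},
                          {"depth": 1, "items": {}}]

    def _index(self, key):
        # Use the low-order global_depth bits of the hash value.
        return hash(key) & ((1 << self.global_depth) - 1)

    def get(self, key):
        # One directory access, then one bucket access.
        return self.directory[self._index(key)]["items"].get(key)

    def put(self, key, value):
        bucket = self.directory[self._index(key)]
        if key in bucket["items"] or len(bucket["items"]) < self.bucket_size:
            bucket["items"][key] = value
            return
        # Bucket overflow: double the directory only if the bucket already
        # uses all global_depth bits.
        if bucket["depth"] == self.global_depth:
            self.directory = self.directory + self.directory
            self.global_depth += 1
        # Split the bucket on its next hash bit.
        bucket["depth"] += 1
        new_bucket = {"depth": bucket["depth"], "items": {}}
        high_bit = 1 << (bucket["depth"] - 1)
        for i, b in enumerate(self.directory):
            if b is bucket and i & high_bit:
                self.directory[i] = new_bucket
        # Redistribute the old records between the two buckets.
        old_items, bucket["items"] = bucket["items"], {}
        for k, v in old_items.items():
            self.directory[self._index(k)]["items"][k] = v
        self.put(key, value)  # retry; may trigger a further split

table = ExtendibleHash()
for n in range(20):
    table.put(n, f"record {n}")
print(table.global_depth, len(table.directory), table.get(13))
```

Only the overflowing bucket is split, so growth never rehashes the whole file; the price is the extra directory lookup in get.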

What is the difference between a database schema and a database state?


Ans: The collection of information stored in database at particular moment in time is called database state
while the overall design of database is called database schema.

Discuss the types of integrity constraints that must be checked for the update operations –Insert
and Delete. Give examples.
Ans: The insert operation can violate any of the following four constraints:
1) Domain constraints can be violated if a given attribute value does not appear in the
corresponding domain.
2) Key constraints can be violated if a key value in the new tuple t already exists in
another tuple of the relation.
3) Entity integrity can be violated if the primary key of the new tuple t is NULL.
4) Referential integrity can be violated if the value of any foreign key in t refers to a tuple
that does not exist in the referenced relation.
The delete operation can violate only referential integrity constraints, if the tuple being deleted is
referenced by the foreign keys from other tuples in the database.
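The violations listed above can be provoked deliberately to see how a DBMS reacts. The sketch below uses Python's sqlite3 module with two hypothetical tables; the table names, columns and the helper violates are assumptions, and a CHECK clause is used here as an approximation of a domain constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when on
conn.execute("CREATE TABLE department (dno INTEGER PRIMARY KEY, dname TEXT)")
conn.execute("""
    CREATE TABLE employee (
        eno TEXT PRIMARY KEY NOT NULL,
        age INTEGER CHECK (age BETWEEN 18 AND 65),
        dno INTEGER REFERENCES department(dno)
    )
""")
conn.execute("INSERT INTO department VALUES (10, 'Research')")
conn.execute("INSERT INTO employee VALUES ('E1', 30, 10)")

def violates(sql):
    """Return True if the statement is rejected as an integrity violation."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

results = {
    "domain":      violates("INSERT INTO employee VALUES ('E2', 200, 10)"),
    "key":         violates("INSERT INTO employee VALUES ('E1', 40, 10)"),
    "entity":      violates("INSERT INTO employee VALUES (NULL, 40, 10)"),
    "referential": violates("INSERT INTO employee VALUES ('E3', 40, 99)"),
    "delete":      violates("DELETE FROM department WHERE dno = 10"),
    "valid row":   violates("INSERT INTO employee VALUES ('E4', 40, 10)"),
}
print(results)
```

Every statement except the last is rejected; the delete fails because employee E1 still references department 10.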

Differentiate between the following giving advantages and disadvantages of each.


(i) Primary and secondary storage.
(ii) Open addressing and chaining for collision resolution.

Ans: (i) Primary and secondary storage

Computer storage is classified into primary (main) memory and secondary (peripheral) storage.
a) Primary storage is usually RAM; ~10ns access time
b) Secondary storage is usually hard disk drives; ~10ms access time
c) Secondary storage is a lot cheaper than primary, 3¢/Mb vs. $1/Mb.
▪ Secondary storage is persistent
a) Databases are stored in secondary memory, and large databases are manipulated in
secondary storage.
b) We need to minimise disk accesses when accessing and manipulating databases.
(ii) Open addressing and chaining for collision resolution
In open addressing, proceeding from the occupied position specified by hash address the program
checks the subsequent positions in order until an unused (empty) position is found.
Chaining: In this method various overflow locations are kept, usually by extending the array with a
number of overflow positions. Additionally, a pointer field is added to each record location. A collision
is resolved by placing the new record in an unused overflow location and setting the pointer of occupied
hash address location to the address of that overflow location. A linked list of overflow records for each
hash address is thus maintained.
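The two collision-resolution strategies can be compared on the same colliding keys. In this minimal sketch the table size, the modulo hash function and the function names are assumptions; chaining is simplified by using a Python list per hash address in place of overflow locations linked by pointers.

```python
SIZE = 5  # assumed table size

def h(key):
    """Hash address of an integer key."""
    return key % SIZE

def insert_open_addressing(table, key):
    """Linear probing: from the hash address, take the next empty position."""
    for step in range(SIZE):
        pos = (h(key) + step) % SIZE
        if table[pos] is None:
            table[pos] = key
            return pos
    raise OverflowError("hash table is full")

def insert_chaining(table, key):
    """Append the key to the chain kept for its hash address."""
    table[h(key)].append(key)

open_table = [None] * SIZE
chained_table = [[] for _ in range(SIZE)]
for key in (12, 7, 22):  # all three keys hash to address 2
    insert_open_addressing(open_table, key)
    insert_chaining(chained_table, key)

print(open_table)
print(chained_table)
```

With open addressing the colliding keys spill into neighbouring positions of the same table, while with chaining they stay grouped under their hash address at the cost of extra storage for the chains.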
Explain the EXISTS and UNIQUE functions of SQL. Give an example for each.

Ans: EXISTS: The EXISTS function takes one parameter, which is a SQL subquery. If any records exist
that match the criteria it returns true, otherwise it returns false. This gives you a clean, efficient way to
write a stored procedure that does either an insert or an update.
UNIQUE: The UNIQUE function also takes a subquery as its parameter. It returns true if the result of the
subquery contains no duplicate rows, otherwise it returns false.
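An example of each can be tried with Python's sqlite3 module. The customer and orders tables below are hypothetical. EXISTS is run directly; the UNIQUE predicate of standard SQL (true when a subquery returns no duplicate rows) is not available in SQLite, so the second query emulates it with NOT EXISTS over a grouped subquery, which is an assumption of this sketch rather than the predicate itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cid INTEGER, name TEXT)")
conn.execute("CREATE TABLE orders (oid INTEGER, cid INTEGER)")
conn.executemany("INSERT INTO customer VALUES (?, ?)",
                 [(1, "Asha"), (2, "Ravi"), (3, "Mina")])
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(100, 1), (101, 1), (102, 2)])

# EXISTS: customers for whom the subquery returns at least one row.
with_orders = [row[0] for row in conn.execute("""
    SELECT name FROM customer c
    WHERE EXISTS (SELECT 1 FROM orders o WHERE o.cid = c.cid)
    ORDER BY name
""")]

# Emulated UNIQUE (SELECT o.cid FROM orders o WHERE o.cid = c.cid):
# true when that subquery would contain no duplicate rows.
no_duplicate_orders = [row[0] for row in conn.execute("""
    SELECT name FROM customer c
    WHERE NOT EXISTS (
        SELECT o.cid FROM orders o
        WHERE o.cid = c.cid
        GROUP BY o.cid
        HAVING COUNT(*) > 1)
    ORDER BY name
""")]

print(with_orders)
print(no_duplicate_orders)
```

Mina has no orders, so the subquery is empty for her: EXISTS is false, while the uniqueness test is true because an empty result has no duplicates.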

What is NULL? Give an example to illustrate testing for NULL in SQL.

Ans: The NULL SQL keyword is used to represent either a missing value or a value that is not
applicable in a relational table.
Consider the relation:
Person(id, name, address, phone)
To find the ids and names of persons who do not have a phone:
SELECT id, name
FROM Person
WHERE phone IS NULL
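The query can be run against sample data to see why IS NULL is needed. The sketch below uses Python's sqlite3 module; the two Person rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Person (id INTEGER, name TEXT, address TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO Person VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Kolkata", "555-0101"),
     (2, "Ravi", "Delhi", None)],  # None is stored as NULL
)

# Correct test for a missing value.
no_phone = conn.execute(
    "SELECT id, name FROM Person WHERE phone IS NULL").fetchall()

# A comparison with NULL is never true, so this returns no rows.
equals_null = conn.execute(
    "SELECT id, name FROM Person WHERE phone = NULL").fetchall()

print(no_phone, equals_null)
```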

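Discuss the differences between the candidate keys and the primary key of a relation. Give
example to illustrate your answer.

Ans: A candidate key is one which can be used as a primary key, that is, the not null and unique
constraints both hold true for it. One of the candidate keys is chosen as the primary key, so every
primary key is a candidate key, but not every candidate key is the primary key. For example, in a
hypothetical relation STUDENT(roll_no, email, name), both roll_no and email are candidate keys; if roll_no
is chosen as the primary key, email remains a candidate key only.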

Ans:

Constraint     Description
PRIMARY KEY    Determines which column(s) uniquely identify each record. The primary key cannot be
               NULL, and the data value(s) must be unique.
FOREIGN KEY    In a one-to-many relationship, the constraint is added to the "many" table. The
               constraint ensures that if a value is entered into a specified column, it must already
               exist in the "one" table, or the record is not added.
UNIQUE         Ensures that all data values stored in a specified column are unique. The UNIQUE
               constraint differs from the PRIMARY KEY constraint in that it allows NULL values.
CHECK          Ensures that a specified condition is true before the data value is added to a table.
               For example, an order's ship date cannot be earlier than its order date.
NOT NULL       Ensures that a specified column cannot contain a NULL value. The NOT NULL constraint
               can only be created with the column-level approach to table creation.

What is DBMS and what are functions of DBMS (10)


Ans: A DBMS consists of a collection of integrated data and a set of programs to access those data.
The functions performed by a typical DBMS are the following:
▪ Data Definition: The DBMS provides functions to define the structure of the data in
the application. These include defining and modifying the record structure, the type and size of fields
and the various constraints/conditions to be satisfied by the data in each field.
▪ Data Manipulation: Once the data structure is defined, data needs to be inserted,
modified or deleted. The functions which perform these operations are also part of the DBMS. These
functions can handle planned and unplanned data manipulation needs. Planned queries are those which
form part of the application. Unplanned queries are ad-hoc queries which are performed on a need basis.
▪ Data Security & Integrity: The DBMS contains functions which handle the security
and integrity of data in the application.
Thus the DBMS provides an environment that is both convenient and efficient to use when there is a
large volume of data and many transactions to be processed.

Describe the various relationship constraints by giving suitable examples.

Ans: Constraints on relationships: There are two types of constraints on relationships.

1) Mapping cardinalities or cardinality ratios express the number of entities to which
another entity can be associated via a relationship set. For a binary relationship set between entity
sets A and B, the mapping cardinality may be any one of the following:
a) One to One: An entity in A is associated with at most one entity in B and an entity in B
is associated with at most one entity in A.
b) One to Many: An entity in A can be associated with any number in B but an entity in B
is associated with at most one in A.
c) Many to One: An entity in A is associated with at most one in B but an entity in B is
associated with any number in A.
d) Many to Many: An entity in A and B can be associated with any number of entities in
the other entity set.
2) Participation constraints: The participation of an entity set E in a relationship
set R is said to be total if every entity in E participates in at least one relationship in
R. If only some entities in E participate in relationships in R, the participation of entity set E in
relationship set R is said to be partial.

Define the following


(i) Record-Based Logical Models
(ii) Data Independence
Ans: (i) Record-Based Logical Models: These also describe data at the conceptual and view levels. Unlike
object-oriented models, they are used to specify the overall logical structure of the database, and to
provide a higher-level description of the implementation. They are so named because the database is
structured in fixed-format records of several types. Each record type defines a fixed number of fields, or
attributes. Each field is usually of a fixed length (this simplifies the implementation). Record-based
models do not include a mechanism for direct representation of code in the database. Separate languages
associated with the model are used to express database queries and updates. The three most widely accepted
models are the relational, network, and hierarchical.
(ii) Data Independence: Techniques that allow data to be changed without affecting the
applications that process it. There are two kinds of data independence. The first type is data
independence for data, which is accomplished in a database management system (DBMS). It allows the
database to be structurally changed without affecting most existing programs. Programs access data in a
DBMS by field and are concerned with only the data fields they use, not the format of the complete
record. Thus, when the record layout is updated (fields added, deleted or changed in size), the only
programs that must be changed are those that use the affected fields.

What are the various types of the update operations on relations? Also explain the constraints
on these update operation. Give examples in support of your answer.


Ans: There are three basic update operations on relations:

(i) Insert : It is used to insert a new tuple or tuples in a relation. Insert can violate any of
the four types of constraints: domain constraints, key constraints, entity integrity and referential
integrity.
(ii) Delete : It is used to delete tuples. The delete operation can violate only referential
integrity, if the tuple being deleted is referenced by the foreign keys from other tuples in the database.
(iii) Modify: It is used to change the value of some attributes in existing tuples. Modify can
also violate any of the four constraints as specified in the insert operation.
Examples:
(1) Inserting a tuple having a null value for the primary key violates entity integrity.
(2) Deleting a tuple on which tuples in other tables depend will violate referential
integrity.
(3) Modifying a primary key attribute to a value that already exists in the table
will violate the key constraint.

DC10 DATABASE MANAGEMENT SYSTEMS
Write short note on followings:
(i) Relational Constraints
(ii) Disadvantages of Relational Approach
(iii) Instances and Schemas

Ans: (i) Relational Constraints are:


▪ NOT NULL
▪ Unique
▪ Primary key
▪ Foreign key
▪ Table check
(ii) Disadvantages of relational approach:
▪ Substantial hardware and system software overhead
▪ May not fit all business models
▪ Can facilitate poor design and implementation
▪ May promote "islands of information" problems
(iii) Instances and schemas: Databases change over time as information is inserted
and deleted. The collection of information stored in the database at a particular moment in time is
called an instance, and the overall design of the database is called the schema.

Explain the disadvantages of a file processing system.

Ans: Disadvantages of File Processing Systems include:


1) Data Redundancy
2) Data Inconsistency
3) Difficult to access data
4) Data isolation
5) Atomicity problem
6) Concurrent access anomalies
7) Security problem

What is data model? Explain object based and record based data models.

Ans: A data model is an abstract model that describes how data is represented and accessed.
(i) Object based data models: Objects, classes and inheritance are directly supported in
database schemas and in the query language. An object-relational model, for instance, is similar to a
relational database model but adds this support.
(ii) Record based data models: The database is structured as fixed-format records of several
types; the relational, network and hierarchical models belong to this group. The relational model, for
example, is based on first-order predicate logic. Its core idea is to describe a database as a collection
of predicates over a finite set of predicate variables, describing constraints on the possible values and
combinations of values.

Q :What is Generalization , Specialization and aggregation ?

Generalization
The process of generalizing entities, where the generalized entities contain the properties of all the
generalized entities, is called generalization. In generalization, a number of entities are brought together into
one generalized entity based on their similar characteristics. For example, pigeon, house sparrow, crow and
dove can all be generalized as Birds.


Specialization
Specialization is the opposite of generalization. In specialization, a group of entities is divided into
sub-groups based on their characteristics. Take a group 'Person' for example. A person has name, date of
birth, gender, etc. These properties are common to all persons. But in a company, persons can be
identified as employee, employer, customer, or vendor, based on what role they play in the company.

Similarly, in a school database, persons can be specialized as teacher, student, or a staff, based on what role
they play in school as entities.

Aggregation

Aggregation is a process in which a relationship between two entities is treated as a single entity. For
example, the relationship between Center and Course can act as an entity in a relationship with Visitor.


What do you mean by anomalies ?

An anomaly is an irregularity, or something which deviates from the expected or normal state. When
designing databases, we identify three types of anomalies: Insert, Update and Delete.

Insertion Anomaly in Referencing Relation:

We can't insert a row in the REFERENCING RELATION if the referencing attribute's value is not present in
the referenced attribute. E.g., insertion of a student with BRANCH_CODE 'ME' in the STUDENT relation
will result in an error because 'ME' is not present in BRANCH_CODE of BRANCH.

Deletion/ Updation Anomaly in Referenced Relation:

We can't delete or update a row from the REFERENCED RELATION if the value of the REFERENCED ATTRIBUTE
is used as a value of the REFERENCING ATTRIBUTE. E.g., if we try to delete the tuple from BRANCH having
BRANCH_CODE 'CS', it will result in an error because 'CS' is referenced by BRANCH_CODE of
STUDENT, but if we try to delete the row from BRANCH with BRANCH_CODE 'CV', it will be deleted as
the value is not used by the referencing relation. It can be handled by the following methods:

ON DELETE CASCADE: It will delete the tuples from the REFERENCING RELATION if the value used by the
REFERENCING ATTRIBUTE is deleted from the REFERENCED RELATION. E.g., if we delete a row from
BRANCH with BRANCH_CODE 'CS', the rows in the STUDENT relation with BRANCH_CODE 'CS'
(ROLL_NO 1 and 2 in this case) will be deleted.

ON UPDATE CASCADE: It will update the REFERENCING ATTRIBUTE in the REFERENCING
RELATION if the attribute value used by the REFERENCING ATTRIBUTE is updated in the REFERENCED
RELATION. E.g., if we update a row from BRANCH with BRANCH_CODE 'CS' to 'CSE', the rows in the
STUDENT relation with BRANCH_CODE 'CS' (ROLL_NO 1 and 2 in this case) will be updated with
BRANCH_CODE 'CSE'.
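The behaviour described above can be reproduced with Python's sqlite3 module. The relation and attribute names follow the text (BRANCH, STUDENT, BRANCH_CODE, ROLL_NO); the remaining columns and all row contents are assumptions, since the original tables are not reproduced in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when on
conn.execute(
    "CREATE TABLE BRANCH (BRANCH_CODE TEXT PRIMARY KEY, BRANCH_NAME TEXT)")
conn.execute("""
    CREATE TABLE STUDENT (
        ROLL_NO INTEGER PRIMARY KEY,
        NAME TEXT,
        BRANCH_CODE TEXT REFERENCES BRANCH(BRANCH_CODE)
            ON DELETE CASCADE ON UPDATE CASCADE
    )
""")
conn.executemany("INSERT INTO BRANCH VALUES (?, ?)",
                 [("CS", "Computer Science"),
                  ("IT", "Information Technology"),
                  ("CV", "Civil")])
conn.executemany("INSERT INTO STUDENT VALUES (?, ?, ?)",
                 [(1, "Ram", "CS"), (2, "Ramesh", "CS"), (3, "Sujit", "IT")])

# Insertion into the referencing relation with an unknown BRANCH_CODE.
try:
    conn.execute("INSERT INTO STUDENT VALUES (4, 'Suresh', 'ME')")
    insert_rejected = False
except sqlite3.IntegrityError:
    insert_rejected = True

# ON UPDATE CASCADE: renaming 'CS' propagates to the STUDENT rows.
conn.execute("UPDATE BRANCH SET BRANCH_CODE = 'CSE' WHERE BRANCH_CODE = 'CS'")
after_update = conn.execute(
    "SELECT ROLL_NO, BRANCH_CODE FROM STUDENT ORDER BY ROLL_NO").fetchall()

# ON DELETE CASCADE: deleting the branch deletes its students.
conn.execute("DELETE FROM BRANCH WHERE BRANCH_CODE = 'CSE'")
after_delete = conn.execute(
    "SELECT ROLL_NO FROM STUDENT ORDER BY ROLL_NO").fetchall()

print(insert_rejected, after_update, after_delete)
```

Without the two CASCADE clauses, the UPDATE and DELETE on BRANCH would be rejected in the same way as the INSERT.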

What is Composite Key?

A key that consists of two or more attributes that uniquely identify an entity occurrence is called a
composite key. No single attribute that makes up the composite key is a key on its own.


Database Normalization
Database normalization is the process of organizing the attributes of a database to reduce or eliminate
data redundancy (the same data stored in different places).

Problems because of data redundancy


Data redundancy unnecessarily increases size of database as same data is repeated on many places.
Inconsistency problems also arise during insert, delete and update operations.

Functional Dependency
A functional dependency is a constraint between two sets of attributes in a relation of a database.
A functional dependency is denoted by an arrow (→). If an attribute A functionally determines B, then it is
written as A → B.
For example, employee_id → name means employee_id functionally determines the name of the employee. As
another example, in a timetable database, {student_id, time} → {lecture_room}: student ID and time
together determine the lecture room where the student should be.

What does functionally dependent mean?


A functional dependency A → B means that all rows with the same value of A have the same value of B.

For example in the below table A → B is true, but B → A is not true as there are different values of A for B
= 3.

A B
------
1 3
2 3
4 0
1 3
4 0
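
As a rough sketch, the check on the table above can be written as a few lines of Python. The function name fd_holds and the row representation are illustrative choices, not part of these notes. A table instance can only refute a functional dependency; whether it holds for the schema is a design decision.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs holds in this table instance."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first value seen for a key is remembered; any later mismatch breaks the FD.
        if seen.setdefault(key, value) != value:
            return False
    return True

table = [
    {"A": 1, "B": 3},
    {"A": 2, "B": 3},
    {"A": 4, "B": 0},
    {"A": 1, "B": 3},
    {"A": 4, "B": 0},
]

a_determines_b = fd_holds(table, ["A"], ["B"])  # every A value maps to one B value
b_determines_a = fd_holds(table, ["B"], ["A"])  # B = 3 occurs with A = 1 and A = 2
```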

Trivial Functional Dependency


X –> Y is trivial only when Y is subset of X.
Examples

ABC --> AB
ABC --> A
ABC --> ABC
Non Trivial Functional Dependencies
X –> Y is a non trivial functional dependencies when Y is not a subset of X.

X –> Y is called completely non-trivial when X ∩ Y is empty, i.e., X and Y have no attribute in common.
Examples:

Id --> Name,
Name --> DOB

Normal form
Normal forms are used to eliminate or reduce redundancy in database tables.

First Normal Form


A relation is in first normal form if every attribute in that relation is a single-valued attribute.

Example :

ID   Name   Courses
-------------------
1    A      c1, c2
2    E      c3
3    M      c2, c3

In the above table, Courses is a multi-valued attribute, so the table is not in 1NF.

The table below is in 1NF, as there is no multi-valued attribute:

ID   Name   Course
------------------
1    A      c1
1    A      c2
2    E      c3
3    M      c2
3    M      c3
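
The conversion to 1NF amounts to emitting one row per (student, course) pair. A minimal sketch, where the list-of-tuples representation and the function name to_1nf are assumptions made for illustration:

```python
# Unnormalized rows: the third field holds several courses at once.
unnormalized = [
    (1, "A", ["c1", "c2"]),
    (2, "E", ["c3"]),
    (3, "M", ["c2", "c3"]),
]

def to_1nf(rows):
    """Emit one row per single course value so every attribute is single-valued."""
    return [(sid, name, course) for sid, name, courses in rows for course in courses]

first_nf = to_1nf(unnormalized)
```

In the flattened table ID alone no longer identifies a row; the pair (ID, Course) does.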

Second Normal Form

A relation is in 2NF iff it is in 1NF and has no partial dependency, i.e., no non-prime attribute (an attribute which is not
part of any candidate key) is dependent on a proper subset of any candidate key of the table.

For example, consider the following functional dependencies in relation R(A, B, C, D):

AB -> C [A and B together determine C]
BC -> D [B and C together determine D]

In the above relation, AB is the only candidate key and there is no partial dependency, i.e., no proper subset
of AB determines any non-prime attribute, so R is in 2NF.

Third Normal Form


A relation is in 3NF iff at least one of the following conditions holds for every non-trivial functional dependency
X –> Y:
a) X is a super key.
b) Y is a prime attribute (each element of Y is part of some candidate key).

For example consider relation R(A, B, C, D, E)


A -> BC,
CD -> E,
B -> D,
E -> A

All possible candidate keys in the above relation are {A, E, CD, BC}.
Every attribute on the right-hand side of every functional dependency is prime, so the relation is in 3NF.
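
The candidate keys above can be checked mechanically with attribute closure. The following is a brute-force sketch for small textbook schemas; the function names and the encoding of dependencies as (lhs, rhs) strings of single-letter attributes are assumptions made for illustration.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """Minimal attribute sets whose closure is the whole schema (tries every subset)."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is a super key but not minimal
            if closure(combo, fds) == set(schema):
                keys.append("".join(combo))
    return keys

fds = [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]
keys = candidate_keys("ABCDE", fds)   # the keys listed above, in search order
prime = set("".join(keys))            # attributes that occur in some candidate key
```

Since prime covers all five attributes, condition (b) of the 3NF definition holds for every dependency.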

BCNF
A relation is in BCNF iff in every non-trivial functional dependency X –> Y, X is a super key.

For example consider relation R(A, B, C)


A -> BC,
B -> A

A and B both are super keys so above relation is in BCNF.

Key Points

1. BCNF is free from redundancy arising from functional dependencies.
2. If a relation is in BCNF, then 3NF is also satisfied.
3. If all attributes of a relation are prime attributes, then the relation is always in 3NF.
4. A relation in a relational database is always at least in 1NF.
5. Every binary relation (a relation with only 2 attributes) is always in BCNF.
6. If a relation has only singleton candidate keys (i.e., every candidate key consists of only 1 attribute),
then the relation is always in 2NF (because no partial functional dependency is possible).
7. Sometimes decomposing into BCNF may not preserve functional dependencies. In that case go for BCNF
only if the lost FD(s) are not required; otherwise normalize only up to 3NF.
8. There are more normal forms beyond BCNF, such as 4NF and higher. But in real-world
database systems it is generally not required to go beyond BCNF.

Exercise 1: Find the highest normal form in R (A, B, C, D, E) under following functional
dependencies.

ABC --> D
CD --> AE

Important points for solving the above type of question:

1) It is always a good idea to start checking from BCNF, then 3NF, and so on.
2) If a functional dependency satisfies a normal form, then there is no need to check it against lower normal
forms. For example, ABC –> D satisfies BCNF (note that ABC is a super key), so there is no need to check this
dependency for lower normal forms.

Candidate keys in given relation are {ABC, BCD}

BCNF: ABC -> D satisfies BCNF. Let us check CD -> AE: CD is not a super key, so this dependency violates
BCNF. So, R is not in BCNF.

3NF: We don't need to check ABC -> D, as it already satisfies BCNF. Let us consider
CD -> AE. Since E is not a prime attribute, the relation is not in 3NF.

2NF: In 2NF, we need to check for partial dependency. CD is a proper subset of a candidate key (BCD) and
it determines E, which is a non-prime attribute. So, the given relation is also not in 2NF.

So, the highest normal form is 1NF.

Equivalence of Functional Dependencies


To understand equivalence of functional dependency sets (FD sets), a basic idea of attribute
closure is required.

Given a Relation with different FD sets for that relation, we have to find out whether one FD set is subset of
other or both are equal.

How to find relationship between two FD sets?

Let FD1 and FD2 be two FD sets for a relation R.

1. If all FDs of FD1 can be derived from FDs present in FD2, we can say that FD2 ⊃ FD1.
2. If all FDs of FD2 can be derived from FDs present in FD1, we can say that FD1 ⊃ FD2.
3. If 1 and 2 both are true, FD1=FD2.

[Figure: Venn diagrams of the three cases]

Q. Let us take an example to show the relationship between two FD sets. A relation R(A,B,C,D) having two
FD sets FD1 = {A->B, B->C, AB->D} and FD2 = {A->B, B->C, A->C, A->D}

Step 1. Checking whether all FDs of FD1 are present in FD2

• A->B in set FD1 is present in set FD2.
• B->C in set FD1 is also present in set FD2.
• AB->D is present in set FD1 but not directly in FD2, so we check whether we can derive it. For set
FD2, (AB)+ = {A,B,C,D}. It means that AB can functionally determine A, B, C and D. So AB->D also
holds in set FD2.

As all FDs in set FD1 also hold in set FD2, FD2 ⊃ FD1 is true.

Step 2. Checking whether all FDs of FD2 are present in FD1

• A->B in set FD2 is present in set FD1.
• B->C in set FD2 is also present in set FD1.
• A->C is present in FD2 but not directly in FD1, so we check whether we can derive it. For set FD1,
(A)+ = {A,B,C,D}. It means that A can functionally determine A, B, C and D. So A->C also holds in set
FD1.
• A->D is present in FD2 but not directly in FD1, so we check whether we can derive it. For set
FD1, (A)+ = {A,B,C,D}. It means that A can functionally determine A, B, C and D. So A->D also holds in
set FD1.

As all FDs in set FD2 also hold in set FD1, FD1 ⊃ FD2 is true.

Step 3. As FD2 ⊃ FD1 and FD1 ⊃ FD2 are both true, FD2 = FD1. These two FD sets are semantically
equivalent.

Q. Let us take another example to show the relationship between two FD sets. A relation R2(A,B,C,D)
having two FD sets FD1 = {A->B, B->C,A->C} and FD2 = {A->B, B->C, A->D}

Step 1. Checking whether all FDs of FD1 are present in FD2

• A->B in set FD1 is present in set FD2.
• B->C in set FD1 is also present in set FD2.
• A->C is present in FD1 but not directly in FD2, so we check whether we can derive it. For set FD2,
(A)+ = {A,B,C,D}. It means that A can functionally determine A, B, C and D. So A->C also holds in set
FD2.

As all FDs in set FD1 also hold in set FD2, FD2 ⊃ FD1 is true.

Step 2. Checking whether all FDs of FD2 are present in FD1

• A->B in set FD2 is present in set FD1.
• B->C in set FD2 is also present in set FD1.
• A->D is present in FD2 but not directly in FD1, so we check whether we can derive it. For set
FD1, (A)+ = {A,B,C}. It means that A can't functionally determine D. So A->D does not hold in FD1.

As not all FDs in set FD2 hold in set FD1, FD2 ⊄ FD1.

Step 3. In this case FD2 ⊃ FD1 holds but FD1 ⊃ FD2 does not, so these two FD sets are not semantically equivalent.
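
Both worked examples follow the same routine: for each FD X->Y of one set, compute X+ under the other set and check that Y is contained in it. A sketch of that routine follows; the function names covers and equivalent and the (lhs, rhs) string encoding are illustrative assumptions.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def covers(f, g):
    """True if every FD in g can be derived from f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)

def equivalent(f, g):
    return covers(f, g) and covers(g, f)

# First example: R(A,B,C,D)
fd1 = [("A", "B"), ("B", "C"), ("AB", "D")]
fd2 = [("A", "B"), ("B", "C"), ("A", "C"), ("A", "D")]
first_equivalent = equivalent(fd1, fd2)

# Second example: R2(A,B,C,D)
fd1_b = [("A", "B"), ("B", "C"), ("A", "C")]
fd2_b = [("A", "B"), ("B", "C"), ("A", "D")]
second_result = (covers(fd2_b, fd1_b), covers(fd1_b, fd2_b))
```

In the second example FD2 covers FD1, but FD1 cannot derive A->D, so the sets are not equivalent.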

ACID Properties in DBMS


A transaction is a single logical unit of work which accesses and possibly modifies the contents of a
database. Transactions access data using read and write operations.
In order to maintain consistency in a database, before and after transaction, certain properties are followed.
These are called ACID properties.

Atomicity
By this, we mean that either the entire transaction takes place at once or it doesn't happen at all. There is no midway, i.e.,
transactions do not occur partially. Each transaction is considered as one unit and either runs to completion or is not
executed at all. It involves the following two operations:
• Abort: If a transaction aborts, changes made to the database are not visible.
• Commit: If a transaction commits, changes made are visible.
Atomicity is also known as the 'All or nothing rule'.
Consider the following transaction T consisting of T1 and T2: transfer of 100 from account X to account Y.
T1 deducts 100 from X (ending with write(X)) and T2 adds 100 to Y (ending with write(Y)).

If the transaction fails after completion of T1 but before completion of T2 (say, after write(X) but before
write(Y)), then the amount has been deducted from X but not added to Y. This results in an inconsistent
database state. Therefore, the transaction must be executed in its entirety in order to ensure correctness of the
database state.
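
A small sketch of the same transfer, assuming SQLite through Python's sqlite3 module. The table name account, the helper names and the simulated failure are made up for illustration; the starting balances X = 500 and Y = 200 are the ones used in the Consistency example.

```python
import sqlite3

# isolation_level=None: the module does not open transactions for us,
# so BEGIN / COMMIT / ROLLBACK are issued explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("X", 500), ("Y", 200)])

def transfer(conn, src, dst, amount, fail_midway=False):
    """Move amount from src to dst as one all-or-nothing unit."""
    conn.execute("BEGIN")
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                     (amount, src))                      # T1: write(X)
        if fail_midway:
            raise RuntimeError("simulated failure after write(X)")
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                     (amount, dst))                      # T2: write(Y)
        conn.execute("COMMIT")
        return True
    except Exception:
        conn.execute("ROLLBACK")                         # abort: undo the partial work
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

failed = transfer(conn, "X", "Y", 100, fail_midway=True)
after_failure = balances(conn)    # the deduction from X was rolled back
succeeded = transfer(conn, "X", "Y", 100)
after_success = balances(conn)
```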

Consistency

This means that integrity constraints must be maintained so that the database is consistent before and after the
transaction. It refers to the correctness of a database. Referring to the example above,
the total amount before and after the transaction must be maintained.
Total before T occurs = 500 + 200 = 700.
Total after T occurs = 400 + 300 = 700.
Therefore, the database is consistent. Inconsistency occurs if T1 completes but T2 fails, leaving T incomplete.

Isolation

This property ensures that multiple transactions can occur concurrently without leading to inconsistency of the database
state. Transactions occur independently, without interference. Changes made by a particular transaction will not be
visible to any other transaction until that change has been written to memory or the transaction has
committed. This property ensures that executing transactions concurrently results in a state that is equivalent
to a state achieved if they were executed serially in some order.
Let X = 500, Y = 500.
Consider two transactions T and T''. From the values used below, T updates X from 500 to 50,000 and then Y from 500
to 450, while T'' reads X and Y and computes their sum.

Suppose T has been executed till Read(Y) and then T'' starts. As a result, interleaving of operations takes
place, due to which T'' reads the correct value of X but an incorrect value of Y, and the sum computed by
T'': (X + Y = 50,000 + 500 = 50,500)
is thus not consistent with the sum at the end of transaction
T: (X + Y = 50,000 + 450 = 50,450).
This results in database inconsistency, due to a loss of 50 units. Hence, transactions must take place in
isolation, and changes should be visible only after they have been made to the main memory.

Durability:
This property ensures that once the transaction has completed execution, the updates and modifications to the database
are stored in and written to disk, and they persist even if a system failure occurs. These updates now become permanent
and are stored in non-volatile memory. The effects of the transaction, thus, are never lost.

The ACID properties, in totality, provide a mechanism to ensure correctness and consistency of a database in
such a way that each transaction is a group of operations that acts as a single unit, produces consistent results,
acts in isolation from other operations, and has its updates durably stored.

Indexing in Databases
Indexing is a way to optimize performance of a database by minimizing the number of disk accesses
required when a query is processed.

An index or database index is a data structure which is used to quickly locate and access the data in a
database table.

Indexes are created using some database columns.

• The first column is the search key, which contains a copy of the primary key or candidate key of the table. These
values are stored in sorted order so that the corresponding data can be accessed quickly (note that the data
may or may not be stored in sorted order).
• The second column is the data reference, which contains a set of pointers holding the address of the disk
block where that particular key value can be found.

There are two kinds of indices:

1. Ordered indices: Indices are based on a sorted ordering of the values.
2. Hash indices: Indices are based on the values being distributed uniformly across a range of buckets.
The bucket to which a value is assigned is determined by a function called a hash function.

Neither technique is better in every case; the choice depends on the database application to which
it is applied. Each technique is evaluated on the following factors:

• Access Types: e.g. value-based search, range access, etc.
• Access Time: Time to find a particular data element or set of elements.
• Insertion Time: Time taken to find the appropriate space and insert a new data item.
• Deletion Time: Time taken to find an item and delete it, as well as update the index structure.
• Space Overhead: Additional space required by the index.

Indexing Methods
Ordered Indices

The indices are usually sorted so that searching is faster. Indices which are sorted are known
as ordered indices.

• If the search key of an index specifies the same order as the sequential order of the file, it is known as a
primary index or clustering index.
Note: The search key of a primary index is usually the primary key, but it is not necessarily so.
• If the search key of an index specifies an order different from the sequential order of the file, it is
called a secondary index or non-clustering index.

Clustered Indexing

A clustering index is defined on an ordered data file. The data file is ordered on a non-key field. In some
cases, the index is created on non-primary-key columns, which may not be unique for each record. In
such cases, in order to identify the records faster, we group two or more columns together to get
unique values and create an index out of them. This method is known as a clustering index. Basically,
records with similar characteristics are grouped together and indexes are created for these groups.

For example, students studying in each semester are grouped together. i.e. 1st Semester students, 2nd
semester students, 3rd semester students etc are grouped.

[Figure: clustered index sorted according to first name (search key)]

Primary Index

In this case, the data is sorted according to the search key, which induces sequential file organisation.
The primary key of the database table is used to create the index. As primary keys are unique and
are stored in sorted order, the searching operation is quite efficient. The primary index is
classified into two types: Dense Index and Sparse Index.

(I) Dense Index:

• For every search key value in the data file, there is an index record.
• This record contains the search key and also a reference to the first data record with that search key value.

(II) Sparse Index:

• An index record appears only for some of the items in the data file. Each index record points to a block.
• To locate a record, we find the index record with the largest search key value less than or equal to the search
key value we are looking for.
• We start at the record pointed to by that index record, and proceed along the pointers in the file (that is,
sequentially) until we find the desired record.
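
The two lookups can be sketched in Python with a sorted "file" held in memory. The block size, keys and record labels below are made up for illustration; a real index refers to disk blocks rather than Python lists.

```python
from bisect import bisect_right

# A sorted data file split into blocks of (search key, record).
blocks = [
    [(10, "r10"), (20, "r20"), (30, "r30")],
    [(40, "r40"), (50, "r50"), (60, "r60")],
    [(70, "r70"), (80, "r80")],
]

# Dense index: one entry per search-key value -> (block number, position in block).
dense_index = {key: (b, i)
               for b, block in enumerate(blocks)
               for i, (key, _) in enumerate(block)}

# Sparse index: one entry per block, holding the first key stored in that block.
sparse_keys = [block[0][0] for block in blocks]

def sparse_lookup(key):
    """Find the largest index entry <= key, then scan that block sequentially."""
    pos = bisect_right(sparse_keys, key) - 1
    if pos < 0:
        return None                      # key is smaller than every indexed key
    for k, record in blocks[pos]:
        if k == key:
            return record
    return None                          # key falls in this block's range but is absent
```

The sparse index needs one entry per block instead of one per key, at the cost of a short sequential scan on each lookup.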
Q: What are the differences between Relational Algebra and Relational Calculus?

BASIS FOR COMPARISON  RELATIONAL ALGEBRA                      RELATIONAL CALCULUS
--------------------------------------------------------------------------------------------------
Basic                 Relational Algebra is a procedural      Relational Calculus is a declarative
                      language.                               language.
States                Relational Algebra states how to        Relational Calculus states what
                      obtain the result.                      result we have to obtain.
Order                 Relational Algebra describes the order  Relational Calculus does not
                      in which operations have to be          specify the order of operations.
                      performed.
Domain                Relational Algebra is not domain        Relational Calculus can be
                      dependent.                              domain dependent.
Related               It is close to a programming language.  It is close to natural language.


BASIS FOR COMPARISON  Sequential                        Hash
--------------------------------------------------------------------------------------------------
Method of storing     Stored as they come or sorted     Stored at the hash address generated
                      as they come
Types                 Pile file and sorted file method  Static and dynamic hashing
Design                Simple design                     Medium
Storage cost          Cheap (magnetic tapes)            Medium
Advantage             Fast and efficient when there     Faster access
                      are large volumes of data,        No need to sort
                      report generation, statistical    Handles multiple transactions
                      calculations, etc.                Suitable for online transactions
Disadvantage          Sorting of data each time for     Accidental deletion or update of data
                      insert/delete/update takes time   Inefficient use of memory
                      and makes the system slow.        Inefficient for searching a range of data,
                                                        partial data, a non-hash-key column, or a
                                                        single hash column when multiple hash keys
                                                        are present, and when a frequently updated
                                                        column is the hash key


Q: Discuss different types of anomalies.

• Update anomaly: In the above table we have two rows for employee Rick, as he belongs to two
departments of the company. If we want to update the address of Rick, then we have to update it
in both rows or the data will become inconsistent. If the correct address gets updated
in one department but not in the other, then as per the database Rick would have two different
addresses, which is not correct and would lead to inconsistent data.

• Insert anomaly: Suppose a new employee joins the company who is under training and currently
not assigned to any department. We would not be able to insert the data into the table if the emp_dept
field doesn't allow nulls.

• Delete anomaly: Suppose that at some point the company closes the department D890. Then
deleting the rows that have emp_dept as D890 would also delete the information of employee
Maggie, since she is assigned only to this department.

Q: What is self join ?


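A self join is a join in which a table is joined with itself: each row is compared with itself and with every other
row in the table, and the row pairs that satisfy the join condition are returned. The table is given two aliases so
that its two roles can be told apart.

SELECT column-names
FROM table-name T1 JOIN table-name T2
ON join-condition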
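
A typical use is a table that refers to itself, such as employees and their managers. The sketch below assumes SQLite through Python's sqlite3 module; the employee table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(1, "Asha", None), (2, "Bimal", 1), (3, "Chitra", 1), (4, "Dev", 2)])

# The same table plays two roles: e is the employee, m is that employee's manager.
pairs = conn.execute("""
    SELECT e.name, m.name
    FROM employee e JOIN employee m ON e.manager_id = m.emp_id
    ORDER BY e.emp_id
""").fetchall()
```

Asha has no manager, so no row of m matches her and she does not appear in the result.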

Q: Why is BCNF stronger than 3NF? Explain with the help of an example.

A relation R is in 3NF if and only if every dependency A->B satisfied by R meets at least ONE of the
following criteria:
1. A->B is trivial (i.e. B is a subset of A).
2. A is a superkey.
3. Every attribute of B is part of some candidate key.
BCNF doesn't permit the third of these options. Therefore BCNF is said to be stronger than 3NF, because
3NF permits some dependencies which BCNF does not.
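
One possible example, not taken from these notes: R(A, B, C) with AB -> C and C -> B. The candidate keys are AB and AC, so every attribute is prime. C -> B is allowed by 3NF (B is prime) but violates BCNF (C is not a superkey). The Python sketch below checks this; the function names are illustrative, and only the listed dependencies are tested.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) strings."""
    result, changed = set(attrs), True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """Minimal attribute sets that determine the whole schema (brute force)."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            minimal = not any(k <= set(combo) for k in keys)
            if minimal and closure(combo, fds) == set(schema):
                keys.append(set(combo))
    return keys

def violations(schema, fds):
    """Return the listed FDs that violate BCNF and those that violate 3NF."""
    prime = set().union(*candidate_keys(schema, fds))
    bcnf_bad, third_nf_bad = [], []
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs) or closure(lhs, fds) == set(schema):
            continue                          # trivial, or lhs is a superkey
        bcnf_bad.append((lhs, rhs))           # BCNF has no further escape clause
        if not (set(rhs) - set(lhs)) <= prime:
            third_nf_bad.append((lhs, rhs))   # 3NF fails only if a non-prime attribute is involved
    return bcnf_bad, third_nf_bad

bcnf_bad, third_nf_bad = violations("ABC", [("AB", "C"), ("C", "B")])
```

Here bcnf_bad contains C -> B while third_nf_bad is empty: the relation is in 3NF but not in BCNF.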

Why is SQL called a structured and a non-procedural language?
SQL is a declarative language in which the expected result or operation is given without the specific details
about how to accomplish the task. The steps required to execute SQL statements are handled transparently
by the SQL database. Sometimes SQL is characterized as non-procedural because procedural languages
generally require the details of the operations to be specified, such as opening and closing tables, loading
and searching indexes, or flushing buffers and writing data to filesystems. Therefore, SQL is considered to
be designed at a higher conceptual level of operation than procedural languages, because the lower-level
logical and physical operations aren't specified and are determined by the SQL engine or server process that
executes it.

Q: What are the advantages of hash file organization?

• Records need not be sorted after any transaction. Hence the effort of sorting is avoided in this
method.
• Since the block address is given by the hash function, accessing any record is very fast. Similarly,
updating or deleting a record is also very quick.
• This method can handle multiple transactions, as each record is independent of the others; i.e., since there
is no dependency on storage location for each record, multiple records can be accessed at the same
time.
• It is suitable for online transaction systems like online banking, ticket booking systems, etc.

What is the difference between Tuple relational and Domain relational calculus?

There is a big conceptual difference between the two. In tuple relational calculus, variables range over
tuples (rows), whereas in domain relational calculus, variables range over the values of individual attributes
(columns). Both, when restricted to safe expressions, have the same expressive power, so any result obtainable
with one can be obtained with the other.

Q: Explain transaction with example.
A transaction can be defined as a group of tasks. A single task is the minimum processing unit which cannot
be divided further.
Let's take an example of a simple transaction. Suppose a bank employee transfers Rs 500 from A's account
to B's account. This very simple and small transaction involves several low-level tasks.
A’s Account
Open_Account(A)
Old_Balance = A.balance
New_Balance = Old_Balance - 500
A.balance = New_Balance
Close_Account(A)
B’s Account
Open_Account(B)
Old_Balance = B.balance
New_Balance = Old_Balance + 500
B.balance = New_Balance
Close_Account(B)

Is SQL relationally complete?
Note: To prove that SQL is relationally complete, you need to show that for every expression of the
relational algebra, there exists a semantically equivalent expression in SQL.
Disproof by counterexample:
Alternatively, to prove it is not, you need to show there exists at least one expression of the relational
algebra for which no such SQL equivalent exists.
In order to show that SQL is relationally complete, it is sufficient to show that
a. there exist SQL expressions for each of the algebraic operators restrict, project, product, union, and
difference (all of the other algebraic operators discussed can be defined in terms of these five), and
b. the operands to those SQL expressions can be arbitrarily complex SQL expressions in turn.
Attempt: First of all, as we know, SQL effectively does support the relational algebra RENAME operator,
thanks to the availability of the optional AS specification on items in the SELECT clause. We can therefore
ensure that tables do all have proper column names, and hence that the operands to product, union, and
difference in particular satisfy the requirements of the algebra with respect to column naming.
SQL fails to support projection on no columns at all, because it does not support empty comma lists in the
SELECT clause.
As a consequence, it does not support TABLE_DEE or TABLE_DUM, and therefore it is not relationally
complete after all.
However, SQL is "nearly" relationally complete.
What is a data model? Classify it.
• A data model describes how data can be represented in and accessed from a software
system after its complete implementation.
• It is a simple abstraction of a complex real-world data gathering environment.
• It defines data elements and the relationships among various data elements for a specified system.
• The main purpose of a data model is to give an idea of how the final system or software will look after
development is completed.

1. Hierarchical Model
• The hierarchical model was developed by IBM and North American Rockwell, and is known as the Information
Management System.
• It represents the data in a hierarchical tree structure.
• This model is the first DBMS model.
• In this model, the data is organized hierarchically.
• It uses pointers to navigate between the stored data.

2. Relational Model
• The relational model is based on first-order predicate logic.
• This model was first proposed by E. F. Codd.
• It represents data as relations or tables.
• A relational database simplifies the database structure by making use of tables and columns.

3. Network Database Model
• The Network Database Model is the same as the Hierarchical Model; the only difference is that it allows a record
to have more than one parent.
• In this model, there is no need for a strict parent-to-child association as in the hierarchical model.
• It replaces the hierarchical tree with a graph.
• It represents the data as record types and one-to-many relationships.
• This model is easy to design and understand.

4. Entity Relationship Model

• The Entity Relationship Model is a high-level data model.
• It was developed by Chen in 1976.
• This model is useful in developing a conceptual design for the database.
• It makes the logical view of data simple and easy to design.
• The developer can easily understand the system by looking at the ER model constructed.

In this diagram,
• A rectangle represents an entity. E.g. Doctor and Patient.
• An ellipse represents an attribute. E.g. DocId, Dname, PId, Pname. Attributes describe each entity and become a
major part of the data stored in the database.
• A diamond represents a relationship in ER diagrams. E.g. Doctor diagnoses the Patient.

What do you mean by Lossless Decomposition?
Lossless Decomposition:

A decomposition is lossless if it is possible to reconstruct relation R from the decomposed tables using joins. This
is the preferred choice: no information is lost from the relation when it is decomposed, and the join
results in the same original relation.

Let us see an example:

<EmpInfo>

Emp_ID  Emp_Name  Emp_Age  Emp_Location  Dept_ID  Dept_Name
------------------------------------------------------------
E001    Jacob     29       Alabama       Dpt1     Operations
E002    Henry     32       Alabama       Dpt2     HR
E003    Tom       22       Texas         Dpt3     Finance

Decompose the above table into two tables:

<EmpDetails>

Emp_ID  Emp_Name  Emp_Age  Emp_Location
---------------------------------------
E001    Jacob     29       Alabama
E002    Henry     32       Alabama
E003    Tom       22       Texas

<DeptDetails>

Dept_ID  Emp_ID  Dept_Name
--------------------------
Dpt1     E001    Operations
Dpt2     E002    HR
Dpt3     E003    Finance
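
The two tables share Emp_ID, which is a key of EmpDetails, so joining them on Emp_ID should give back EmpInfo. A small Python sketch of that check follows; the tuple representation and the function name are assumptions made for illustration.

```python
emp_info = [
    ("E001", "Jacob", 29, "Alabama", "Dpt1", "Operations"),
    ("E002", "Henry", 32, "Alabama", "Dpt2", "HR"),
    ("E003", "Tom", 22, "Texas", "Dpt3", "Finance"),
]

# The decomposition shown above, obtained by projection.
emp_details = [row[:4] for row in emp_info]                    # Emp_ID, Emp_Name, Emp_Age, Emp_Location
dept_details = [(row[4], row[0], row[5]) for row in emp_info]  # Dept_ID, Emp_ID, Dept_Name

def natural_join_on_emp_id(emps, depts):
    """Combine rows of the two tables that agree on Emp_ID."""
    return [emp + (dept_id, dept_name)
            for emp in emps
            for dept_id, emp_id, dept_name in depts
            if emp[0] == emp_id]

rejoined = natural_join_on_emp_id(emp_details, dept_details)
is_lossless = sorted(rejoined) == sorted(emp_info)   # no rows lost, no spurious rows added
```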

Non-Clustered Indexing

A non-clustered index just tells us where the data lies, i.e., it gives us a list of virtual pointers or references to
the location where the data is actually stored. Data is not physically stored in the order of the index. Instead,
the data is present in leaf nodes. For example, consider the contents page of a book: each entry gives us the page
number or location of the information stored. The actual data here (the information on each page of the book) is
not organised, but we have an ordered reference (the contents page) to where the data actually lies.

It requires more time than a clustered index because some extra work is done in
order to extract the data by further following the pointer. In the case of a clustered index, the data is directly
present in front of the index.

Secondary Index

It is used to optimize query processing and to access records in a database using some information other
than the usual search key (primary key). Here two levels of indexing are used in order to reduce the
mapping size of the first level. Initially, for the first level, a large range of numbers is
selected so that the mapping size is small. Each range is then divided into further sub-ranges.
For quick access, the first level is stored in primary memory. The actual physical
location of the data is determined by the second mapping level.

DDL

Data Definition Language (DDL) statements are used to define the database structure or schema.
Some examples:

o CREATE - to create objects in the database
o ALTER - alters the structure of the database
o DROP - delete objects from the database
o TRUNCATE - remove all records from a table, including all space allocated for the records
o COMMENT - add comments to the data dictionary
o RENAME - rename an object

DML

Data Manipulation Language (DML) statements are used for managing data within schema objects.
Some examples:

o SELECT - retrieve data from a database
o INSERT - insert data into a table
o UPDATE - updates existing data within a table
o DELETE - deletes records from a table (all records if no WHERE clause is given); the space for the records remains
o MERGE - UPSERT operation (insert or update)
o CALL - call a PL/SQL or Java subprogram
o EXPLAIN PLAN - explain access path to data
o LOCK TABLE - control concurrency

DCL

Data Control Language (DCL) statements are used to control access to the database. Some examples:

o GRANT - gives users access privileges to the database
o REVOKE - withdraw access privileges given with the GRANT command

TCL

Transaction Control (TCL) statements are used to manage the changes made by DML statements. They
allow statements to be grouped together into logical transactions.

o COMMIT - save work done
o SAVEPOINT - identify a point in a transaction to which you can later roll back
o ROLLBACK - restore the database to its state as of the last COMMIT
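
The three TCL statements can be seen together in a short sketch. It assumes SQLite through Python's sqlite3 module; the table item, its rows and the savepoint name sp1 are made up, and the exact syntax varies slightly between DBMSs.

```python
import sqlite3

# isolation_level=None: transactions are controlled only by the statements below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE item (name TEXT)")            # DDL

conn.execute("BEGIN")
conn.execute("INSERT INTO item VALUES ('kept')")         # DML
conn.execute("SAVEPOINT sp1")                            # mark a point inside the transaction
conn.execute("INSERT INTO item VALUES ('discarded')")
conn.execute("ROLLBACK TO sp1")                          # undo only the work done after sp1
conn.execute("COMMIT")                                   # save the remaining work

conn.execute("BEGIN")
conn.execute("DELETE FROM item")
conn.execute("ROLLBACK")                                 # back to the state of the last COMMIT

names = [name for (name,) in conn.execute("SELECT name FROM item")]
```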

What is the difference between a DELETE command and TRUNCATE command?

DELETE command: The DELETE command is used to delete rows from a table based on the condition that we
provide in a WHERE clause.
o The DELETE command deletes only those rows which are specified by the WHERE clause.
o The DELETE command can be rolled back.
o The DELETE command maintains a log, which is why it is slow.
o DELETE uses row locks while performing the delete.

TRUNCATE command: The TRUNCATE command removes all records from a table, along with the space allocated
for them; it does not take a WHERE clause.

What is Normalization? Why is it required?

Ans: Normalization is the process of efficiently organizing data in a database. There are two goals of the
normalization process: eliminating redundant data (for example, storing the same data in more than one
table) and ensuring data dependencies make sense (only storing related data in a table). Both of these are
worthy goals, as they reduce the amount of space a database consumes and ensure that data is logically
stored.

Why is it required? Normalization reduces redundancy. Redundancy is the unnecessary repetition
of data. It can cause problems with storage, retrieval and updating of data. Redundancy can lead to:
• Inconsistencies: errors are more likely to occur when facts are repeated.
• Update anomalies: inserting, modifying and deleting data may cause inconsistencies. Inconsistency occurs
when we update or delete data in one relation while forgetting to make the corresponding changes in other
relations.

During the process of normalization, you can identify dependencies which can cause problems when deleting or
updating. Normalization also helps to simplify the structure of the tables. A fully normalized record consists
of:
• A primary key that identifies the entity.
• A set of attributes that describe the entity.

What is a Log File?

Ans: A log file is a recording of everything that goes in and out of a particular server. It is a concept much
like the black box of an airplane that records everything going on with the plane in the event of a problem.
The information is frequently recorded chronologically, and is located in the root directory, or occasionally
in a secondary folder, depending on how it is set up with the server. The only person who has regular access
to the log files of a server is the server administrator, and a log file is generally password protected, so that
the server administrator has a record of everyone and everything that wants to look at the log files for a
specific server.

What is the domain of an Attribute?

Ans: Attribute domains are rules that describe the legal values of a field type, providing a method for
enforcing data integrity. Attribute domains are used to constrain the values allowed in any particular
attribute for a table or feature class. If the features in a feature class or nonspatial objects in a table have been
grouped into subtypes, different attribute domains can be assigned to each of the subtypes. A domain is a
declaration of acceptable attribute values. Whenever a domain is associated with an attribute field, only the
values within that domain are valid for the field. In other words, the field will not accept a value that is not in
that domain. Using domains helps ensure data integrity by limiting the choice of values for a particular field.

What do you mean by Meta data?

Ans: Metadata is data about data. An item of metadata may describe an individual data item or a collection
of data items. Metadata is used to facilitate the understanding, use and management of data. Metadata
defines the nature of the data stored in the database. Metadata consists of pre-determined values that
describe various attributes of a given table or relation. Thus the part of the database which contains
information about the data stored in the database is called metadata.

What are Derived Attributes?

Ans: Derived attributes are those attributes whose values are based on and derived from other attributes,
which may belong to another table or relation. The derived attributes may contain new values or the values from
the base table from which they were derived. Derived attributes are effectively read-only, since there is no place
to write them back to. Also, because derived attributes don't directly point to anything in the database, they
cannot be used as primary keys. For example, a derived attribute "person's full name" may be derived from the
attributes "person's first name" and "person's last name".

What is the use of group by clause?

Group by clause is used to apply aggregate functions to a set of tuples. The attributes given in the group by
clause are used to form groups. Tuples with the same value on all attributes in the group by clause are placed
in one group.
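
A minimal illustration, assuming SQLite through Python's sqlite3 module; the student table, the marks column and the rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, branch_code TEXT, marks INTEGER)")
conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                 [(1, "CS", 80), (2, "CS", 70), (3, "CV", 60)])

# Tuples with the same branch_code form one group; COUNT and AVG are applied per group.
summary = conn.execute("""
    SELECT branch_code, COUNT(*), AVG(marks)
    FROM student
    GROUP BY branch_code
    ORDER BY branch_code
""").fetchall()
```

The result has one row per branch: the branch code, the number of students in it and their average marks.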
