Unit 1 DBMS
In this Chapter
DBMS basic concepts
Advantages of a DBMS over file-processing systems
Data Abstraction
Database Languages
Data Models
Data Independence
Components of a DBMS / Overall Structure of a DBMS
DBMS architecture
ERD
Schema Diagram
Relational Model
Codd’s Rule
Relational Integrity
1.1 Basic Concepts
What is Data?
Data is a collection of distinct small units of information. It can take
a variety of forms, such as text, numbers, media, or bytes, and it can be
stored on pieces of paper, in electronic memory, and so on.
The word 'data' originates from the Latin word 'datum', which means
'a single piece of information'. 'Data' is the plural of 'datum'.
In computing, data is information that has been translated into a
form that is efficient for movement and processing. Data is
interchangeable.
What is Database?
A database is an organized collection of data, so that it can be
easily accessed and managed.
What is Database Management System?
A Database Management System (DBMS) is software used to manage the
data in a database. Some popular DBMSs are MySQL, Oracle, and MongoDB.
A DBMS provides many operations, e.g. creating a database, storing data
in the database, updating existing data, and deleting data from the
database. A DBMS is a system that enables you to store, modify, and
retrieve data in an organized way. It also provides security for the
database.
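The operations listed above can be sketched with Python's built-in sqlite3 module. This is only an illustrative sketch: SQLite stands in for any DBMS, and the student table, its columns, and the sample names are hypothetical.

```python
import sqlite3

# An in-memory SQLite database stands in for any DBMS (assumption for this sketch).
conn = sqlite3.connect(":memory:")

# Create: define a hypothetical table.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")

# Store: insert rows.
conn.executemany(
    "INSERT INTO student (roll_no, name) VALUES (?, ?)",
    [(1, "Asha"), (2, "Ravi"), (3, "Meena")],
)

# Update: modify an existing row.
conn.execute("UPDATE student SET name = ? WHERE roll_no = ?", ("Ravi K", 2))

# Delete: remove a row.
conn.execute("DELETE FROM student WHERE roll_no = ?", (3,))

# Retrieve: read the remaining rows in a defined order.
rows = conn.execute("SELECT roll_no, name FROM student ORDER BY roll_no").fetchall()
print(rows)
```

Each statement is handed to the DBMS, which decides how the data is actually stored and located. The program never opens or parses a data file itself.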
In a file-processing system, files are stored in different locations and in
different formats, so the data is isolated. For example, student data may be
stored in .txt format in one location, while the same data may be stored
in .doc format in another location.
File-based system versus database system:
1. File-based system: The data and the programs are interdependent.
   Database system: The data and the programs are independent of each other.
2. File-based system: Causes data redundancy. The data may be duplicated in
   different files.
   Database system: Controls data redundancy. Each data item is stored only
   once in the system.
3. File-based system: Causes data inconsistency. Copies of the same data in
   different files may differ from one another.
   Database system: The data stays consistent because each item is stored
   only once.
4. File-based system: The data cannot be shared easily because it is
   distributed across different files.
   Database system: The data is easily shared because it is stored in one
   place.
5. File-based system: The data is widely spread, so the system provides poor
   security.
   Database system: Provides many methods to maintain data security in the
   database.
6. File-based system: Does not provide consistency constraints.
   Database system: Provides various consistency constraints to maintain data
   integrity in the system.
7. File-based system: Is a less complex system.
   Database system: Is a very complex system.
8. File-based system: Generating different reports to support a crucial
   decision is very difficult.
   Database system: Reports can be generated very easily in the required
   format.
The answer to these questions is NO. You will not ask these questions
because these questions are of no use. You do not care about these questions.
You are only concerned about a few things, such as the company, size, color,
material, and how the shoes look. That is why these unimportant details are
kept hidden from the end user. This is the process we call data abstraction.
A database system contains intricate data structures and relationships.
Developers hide this complexity from users so that they can comfortably
access the data they want, and only that data. This is done with the help of
data abstraction.
The main purpose of data abstraction is to hide irrelevant details and provide
an abstract view of the data. Users can then access the relevant data without
any hassle, and the system can also work efficiently.
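One common way to provide such an abstract view is a database view that exposes only the relevant columns. The sketch below uses Python's built-in sqlite3 module; the employee table, its columns, and the view name employee_public are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical base table holding both relevant and irrelevant details.
conn.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, name TEXT, salary REAL, password_hash TEXT)"
)
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 52000.0, 'abc123')")

# The view exposes only the columns an end user needs to see.
conn.execute("CREATE VIEW employee_public AS SELECT emp_id, name FROM employee")

cur = conn.execute("SELECT * FROM employee_public")
visible_columns = [col[0] for col in cur.description]
visible_rows = cur.fetchall()
print(visible_columns, visible_rows)
```

A user who queries only the view never sees the salary or password_hash columns, although they are still stored in the underlying table.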
In DBMS, there are three levels of data abstraction, which are as follows:
In the tree structure, the part record is superior to the supplier record.
That is, parts form the parents and suppliers form the children. Each of the
four trees in the figure consists of one part record occurrence, together with
a set of subordinate supplier record occurrences. There is one supplier record
for each supplier of a particular part. Each supplier occurrence includes the
corresponding shipment quantity.
Data Dictionary
A data dictionary is a repository of metadata. The data dictionary of Oracle
is stored in the SYS schema.
Each Oracle database has a data dictionary, which is a set of tables and views
that serve as a reference about the database.
For example, a data dictionary stores information about both
the logical and physical structure of the database.
A data dictionary also stores the valid users of an Oracle database, information
about integrity constraints defined for tables in the database, and the amount
of space allocated for a schema object and how much of that space is in use,
among much other information.
A data dictionary is created when a database is created. To accurately reflect
the status of the database at all times, the data dictionary is automatically
updated by Oracle Database in response to specific actions, such as when the
structure of the database is altered. Database users cannot modify the data
dictionary. Various database processes rely on the data dictionary to record,
verify, and conduct ongoing work. For example, during database operation,
Oracle Database reads the data dictionary to verify that schema objects exist
and that users have proper access to them.
Data Dictionary Tables
USER_TABLES
USER_VIEWS
USER_CONSTRAINTS
USER_INDEXES
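The idea that the catalog is maintained automatically can be illustrated with a small sketch. It uses SQLite through Python's built-in sqlite3 module, not Oracle: SQLite keeps its own catalog in a table called sqlite_master, which here plays a role loosely comparable to a view such as USER_TABLES. The supplier table is a hypothetical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def list_tables(connection):
    """Return the names of the tables recorded in SQLite's catalog."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


before = list_tables(conn)

# Changing the structure of the database updates the catalog automatically.
conn.execute("CREATE TABLE supplier (s_no TEXT PRIMARY KEY, sname TEXT)")
after_create = list_tables(conn)

conn.execute("DROP TABLE supplier")
after_drop = list_tables(conn)

print(before, after_create, after_drop)
```

The program never writes to the catalog directly. It only issues CREATE and DROP statements, and the system records the change.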
Data are actually stored as bits, or numbers and strings, but it is difficult to
work with data at this level.
Schema:
Description of data at some level. Each level has its own schema.
We will be concerned with three forms of schemas:
physical,
conceptual, and
external
Physical Data Level
The physical schema describes details of how data is stored: files, indices, etc.
on the random access disk system. It also typically describes the record layout
of files and the type of files (hash, B-tree, flat).
Early applications worked at this level - explicitly dealt with details. E.g.,
minimizing physical distances between related data and organizing the data
structures within the file (blocked records, linked lists of blocks, etc.)
Routines are hardcoded to deal with physical representation.
Problems:
Changes to data structures are difficult to make.
Application code becomes complex since it must deal with details.
Rapid implementation of new features is very difficult.
Conceptual Data Level
The conceptual schema hides the details of the physical level.
In the relational model, the conceptual schema presents data as a set
of tables.
The DBMS maps data access between the conceptual and physical schemas
automatically.
The physical schema can be changed without changing the application:
only the DBMS must change its mapping from the conceptual to the physical
schema.
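A minimal sketch of this independence, assuming SQLite through Python's built-in sqlite3 module: adding an index changes how the data is physically organized, but the application's query and its result stay the same. The part table and the index name idx_part_city are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE part (p_no TEXT, pname TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO part VALUES (?, ?, ?)",
    [
        ("P1", "Nut", "London"),
        ("P2", "Bolt", "Paris"),
        ("P3", "Screw", "Rome"),
        ("P4", "Screw", "London"),
    ],
)

# The application only knows this query, written against the table.
QUERY = "SELECT p_no FROM part WHERE city = ? ORDER BY p_no"

result_before = conn.execute(QUERY, ("London",)).fetchall()

# Physical change: add an index. The query text is not touched.
conn.execute("CREATE INDEX idx_part_city ON part (city)")

result_after = conn.execute(QUERY, ("London",)).fetchall()
print(result_before == result_after)
```

The DBMS decides whether to use the new index. The application code does not need to know that it exists.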
Aggregation
In aggregation, the relationship between two entities is treated as a single
entity: a relationship together with its corresponding entities is aggregated
into a higher-level entity.
For example, the Center entity offering the Course entity acts as a single
entity, which is in a relationship with another entity, Visitor. In the real
world, if a visitor visits a coaching center, he will never enquire about the
course only or about the center only; instead, he will enquire about both.
The steps below outline the logic between a relation and its domains.
1. Let D1, D2, …, Dn denote n domains.
2. Let r be a relation defined on these domains.
3. Then r ⊆ D1 × D2 × … × Dn
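The subset condition can be checked directly for small domains. The sketch below uses only Python's standard library; the two domains and the relation r are made-up sample values.

```python
from itertools import product

# Two small, hypothetical domains.
D1 = {"S1", "S2"}          # supplier numbers
D2 = {"London", "Paris"}   # cities

# The Cartesian product D1 x D2 is the set of all possible pairs.
cartesian = set(product(D1, D2))

# A relation picks some of those pairs.
r = {("S1", "London"), ("S2", "Paris")}
is_relation = r <= cartesian          # subset test

# A tuple with a value outside its domain breaks the condition.
not_a_relation = {("S1", "Rome")} <= cartesian

print(len(cartesian), is_relation, not_a_relation)
```

Here the product contains four pairs, r uses two of them, and the pair containing Rome is rejected because Rome is not in D2.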
Table
A database is composed of multiple tables and each table holds the data.
Figure 7.1 shows a database that contains three tables.
Domain
A domain is the original set of atomic values used to model data. By atomic
value, we mean that each value in the domain is indivisible as far as the
relational model is concerned. For example:
The domain of Marital Status has a set of possibilities: Married, Single,
Divorced.
Notes by Dr. Nilesh Shelke 29
SIT, Nagpur
The domain of Shift has the set of all possible days: {Mon, Tue, Wed…}.
The domain of Salary is the set of all floating-point numbers greater than
0 and less than 200,000.
The domain of First Name is the set of character strings that represents
names of people.
In summary, a domain is a set of acceptable values that a column is allowed to
contain. This is based on various properties and the data type for the column.
We will discuss data types in another chapter.
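Many systems let a table definition state the domain of a column, so that values outside it are rejected. The sketch below assumes SQLite through Python's built-in sqlite3 module and expresses the Marital Status and Salary domains from the examples above as CHECK constraints. The employee table and the try_insert helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        first_name     TEXT NOT NULL,
        marital_status TEXT CHECK (marital_status IN ('Married', 'Single', 'Divorced')),
        salary         REAL CHECK (salary > 0 AND salary < 200000)
    )
    """
)


def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


ok = try_insert(("Asha", "Single", 45000.0))
bad_status = try_insert(("Ravi", "Unknown", 45000.0))    # not in the domain
bad_salary = try_insert(("Meena", "Married", 250000.0))  # outside the range

print(ok, bad_status, bad_salary)
```

Only the first row is stored. The other two violate the domain of one column and are refused by the system.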
Degree
The degree is the number of attributes in a table.
Example of a relational data model:
S
S#   Sname   Status   City
S1   Smith   20       London
S2   Jones   10       Paris
S3   Blake   30       Paris
P
P#   PNAME   COLOR   WEIGHT   CITY
P1   Nut     Red     12       London
P2   Bolt    Green   17       Paris
P3   Screw   Blue    17       Rome
P4   Screw   Red     14       London
SP
S#   P#   SP#
S1   P1   300
S1   P2   200
S1   P3   400
S2   P1   300
S2   P2   400
S3   P2   200
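The three example tables can be loaded into a small database to compute the degree of each one. This is a sketch using Python's built-in sqlite3 module. The column names s_no, p_no, and qty are renamings assumed for the example, since "#" is awkward in SQL identifiers; in particular, the third SP column is called qty here on the assumption that it holds the shipment quantity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE s  (s_no TEXT, sname TEXT, status INTEGER, city TEXT);
    CREATE TABLE p  (p_no TEXT, pname TEXT, color TEXT, weight INTEGER, city TEXT);
    CREATE TABLE sp (s_no TEXT, p_no TEXT, qty INTEGER);
    """
)
conn.executemany("INSERT INTO s VALUES (?, ?, ?, ?)", [
    ("S1", "Smith", 20, "London"),
    ("S2", "Jones", 10, "Paris"),
    ("S3", "Blake", 30, "Paris"),
])
conn.executemany("INSERT INTO p VALUES (?, ?, ?, ?, ?)", [
    ("P1", "Nut", "Red", 12, "London"),
    ("P2", "Bolt", "Green", 17, "Paris"),
    ("P3", "Screw", "Blue", 17, "Rome"),
    ("P4", "Screw", "Red", 14, "London"),
])
conn.executemany("INSERT INTO sp VALUES (?, ?, ?)", [
    ("S1", "P1", 300), ("S1", "P2", 200), ("S1", "P3", 400),
    ("S2", "P1", 300), ("S2", "P2", 400), ("S3", "P2", 200),
])


def degree(table):
    """Number of attributes (columns). Table names here are fixed literals."""
    return len(conn.execute(f"SELECT * FROM {table}").description)


def row_count(table):
    """Number of tuples (rows) currently stored in the table."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


degrees = {t: degree(t) for t in ("s", "p", "sp")}
counts = {t: row_count(t) for t in ("s", "p", "sp")}

# Names of the suppliers who supply part P2, found by relating S and SP.
p2_suppliers = [name for (name,) in conn.execute(
    "SELECT s.sname FROM s JOIN sp ON s.s_no = sp.s_no "
    "WHERE sp.p_no = 'P2' ORDER BY s.s_no"
)]
print(degrees, counts, p2_suppliers)
```

The degree depends only on the table definition, while the row count changes whenever tuples are inserted or deleted.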
• Primary Key
• Super Key
• Candidate Key
• Foreign Key
Primary key
The primary key of a table never contains null or duplicate values. Thus, it
is used to identify each tuple of a table uniquely. For example, the roll
number attribute of a student record acts as a primary key because every
student must have a unique roll number.
Super key
A super key is a set of one or more attributes that identifies every tuple of
the table uniquely. A table may have more than one super key, depending on
the possible combinations of the attributes in the table. Any set of
attributes that contains the primary key of a table is also a super key.
Candidate key
A candidate key is a minimal super key: it combines the minimum number of
attributes needed to identify the tuples uniquely, so no attribute can be
removed from it without losing uniqueness. A table may have several candidate
keys, and one of them is chosen as the primary key of the table.
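The difference between a super key and a candidate key can be tested on sample rows. The sketch below uses plain Python and the supplier rows from the example table; the function names are hypothetical. Note the limitation: a key is a property of every valid state of a table, so a check on one set of rows can show that a set of attributes is not a key, but it cannot prove that it is one.

```python
from itertools import combinations

ATTRS = ("s_no", "sname", "status", "city")
ROWS = [
    ("S1", "Smith", 20, "London"),
    ("S2", "Jones", 10, "Paris"),
    ("S3", "Blake", 30, "Paris"),
]


def is_super_key(attrs):
    """True if no two sample rows agree on all of the given attributes."""
    idx = [ATTRS.index(a) for a in attrs]
    projected = [tuple(row[i] for i in idx) for row in ROWS]
    return len(set(projected)) == len(ROWS)


def is_candidate_key(attrs):
    """True if attrs is a super key and no proper subset of it is one."""
    if not is_super_key(attrs):
        return False
    return not any(
        is_super_key(subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )


print(is_super_key(("s_no",)))             # unique supplier numbers
print(is_super_key(("city",)))             # Paris appears twice
print(is_super_key(("s_no", "city")))      # contains s_no, so still unique
print(is_candidate_key(("s_no", "city")))  # not minimal: s_no alone suffices
```

The pair (s_no, city) is a super key but not a candidate key, because city can be removed without losing uniqueness.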
Foreign Key
A foreign key is a column of a table that is used to maintain a relationship
between two tables.
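Primary and foreign keys can be seen at work in a small sketch, assuming SQLite through Python's built-in sqlite3 module. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON is issued. The tables s and sp, the column names, and the attempt helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute("CREATE TABLE s (s_no TEXT PRIMARY KEY, sname TEXT)")
conn.execute(
    "CREATE TABLE sp (s_no TEXT REFERENCES s (s_no), p_no TEXT, qty INTEGER)"
)
conn.execute("INSERT INTO s VALUES ('S1', 'Smith')")


def attempt(sql):
    """Run one statement and report whether a key constraint rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


valid_child = attempt("INSERT INTO sp VALUES ('S1', 'P1', 300)")   # S1 exists in s
orphan_child = attempt("INSERT INTO sp VALUES ('S9', 'P1', 300)")  # S9 does not
duplicate_pk = attempt("INSERT INTO s VALUES ('S1', 'Smythe')")    # S1 already used

print(valid_child, orphan_child, duplicate_pk)
```

The foreign key stops a shipment from referring to a supplier that does not exist, and the primary key stops two suppliers from sharing one number.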
E. F. Codd, the famous mathematician, introduced 12 rules for the relational
model for databases, commonly known as Codd's rules. The rules mainly define
what is required for a DBMS to be considered relational, i.e., an RDBMS.
There is also one more rule, Rule 0, which specifies that a relational system
should use its relational facilities to manage the database. The rules and
their descriptions are as follows:
Rule 0: Foundation Rule
A relational database management system should be capable of using its
relational facilities (exclusively) to manage the database.
Rule 1: Information Rule
All information in the database is to be represented in one and only one way:
as values in column positions within rows of tables.
Rule 2: Guaranteed Access Rule
All data must be accessible with no ambiguity. That is, each and every datum
(atomic value) is guaranteed to be logically accessible by resorting to a
combination of table name, primary key value, and column name.
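The addressing scheme described above (table name, primary key value, column name) can be sketched as a small lookup function. This assumes SQLite through Python's built-in sqlite3 module; the table s, its columns, and the helper fetch_datum are hypothetical. Because table and column names cannot be passed as SQL parameters, the helper checks them against a fixed list before building the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE s (s_no TEXT PRIMARY KEY, sname TEXT, status INTEGER, city TEXT)"
)
conn.executemany("INSERT INTO s VALUES (?, ?, ?, ?)", [
    ("S1", "Smith", 20, "London"),
    ("S2", "Jones", 10, "Paris"),
    ("S3", "Blake", 30, "Paris"),
])

# Known tables and their columns (identifiers cannot be bound as parameters).
SCHEMA = {"s": {"s_no", "sname", "status", "city"}}


def fetch_datum(table, pk_column, pk_value, column):
    """Return the single value addressed by table, primary key value, and column."""
    if table not in SCHEMA or not {pk_column, column} <= SCHEMA[table]:
        raise ValueError("unknown table or column")
    row = conn.execute(
        f"SELECT {column} FROM {table} WHERE {pk_column} = ?", (pk_value,)
    ).fetchone()
    return None if row is None else row[0]


city_of_s2 = fetch_datum("s", "s_no", "S2", "city")
status_of_s3 = fetch_datum("s", "s_no", "S3", "status")
print(city_of_s2, status_of_s3)
```

No physical address, file position, or row number is needed: the three logical names are enough to reach exactly one value.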
Rule 4: Dynamic Online Catalog Based on the Relational Model
The database description is represented at the logical level in the same way as
ordinary data, so authorized users can apply the same relational language to
its interrogation as they apply to regular data. That is, authorized users can
query the database structure using a common language such as SQL.
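Rule 5: Comprehensive Data Sublanguage Rule
A relational system may support several languages, but at least one language
must comprehensively support all of the following:
a. data definition
b. view definition
c. data manipulation (interactive and by program)
d. integrity constraints
e. authorization
f. transaction boundaries (begin, commit, and rollback).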
Rule 6: View Updating Rule
All views that are theoretically updatable must also be updatable by the system.
Rule 7: High-Level Insert, Update, and Delete
The system must fully support insert, update, and delete operations. It must
also be able to perform these operations on multiple rows at once, not just on
a single row at a time.
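Operating on several rows with one statement can be sketched as follows, assuming SQLite through Python's built-in sqlite3 module. The table s and its sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s (s_no TEXT PRIMARY KEY, status INTEGER, city TEXT)")
conn.executemany("INSERT INTO s VALUES (?, ?, ?)", [
    ("S1", 20, "London"),
    ("S2", 10, "Paris"),
    ("S3", 30, "Paris"),
])

# One UPDATE statement changes every row that matches the condition.
cur = conn.execute("UPDATE s SET status = status + 10 WHERE city = ?", ("Paris",))
rows_updated = cur.rowcount

# One DELETE statement removes every row that matches the condition.
cur = conn.execute("DELETE FROM s WHERE status > ?", (30,))
rows_deleted = cur.rowcount

remaining = conn.execute("SELECT s_no, status FROM s ORDER BY s_no").fetchall()
print(rows_updated, rows_deleted, remaining)
```

The program states which rows are affected by a condition and does not loop over them one at a time.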
Q13. Draw an ER diagram for a company that has the following
description:
The company has several departments.
Each department may have several locations.
Departments are identified by a name, D_no, and location.
A manager controls a particular department.
Each department is associated with a number of projects.
Employees are identified by name, id, address, dob, and date_of_joining.
An employee works in only one department but can work on several
projects.
We also keep track of the number of hours worked by an employee on a
single project.
Each employee has dependents.
A dependent has D_name, Gender, and relationship.
SUMMARY
• The snapshot of the database at any given time is the database
instance.
• A query is a request to a database for information retrieval and data
manipulation (insertion, deletion or update). It is written in Structured
Query Language (SQL).
• A relational DBMS (RDBMS) is used to store data in related tables. Rows and
columns of a table are called tuples and attributes respectively. A table is
referred to as a relation.
• Restrictions on data stored in an RDBMS are applied by use of keys such
as Candidate Key, Primary Key, Composite Primary Key, Foreign Key.
• Primary key in a relation is used for unique identification of tuples.
• Foreign key is used to relate two tables or relations.
• Each column in a table represents a feature (attribute) of a record. A table
stores the information for an entity whereas a row represents a record.
• Each row in a table represents a record. A tuple is a collection of attribute
values that makes a record unique.
• A tuple is a unique entity whereas attribute values can be duplicated in the
table.
• SQL is the standard language for RDBMSs like MySQL.