KMBNIT03 - Unit 1
DBMS also provides protection and security to the databases. It also maintains data consistency
in case of multiple users.
MySQL
Oracle
SQL Server
IBM DB2
PostgreSQL
Amazon SimpleDB (cloud based) etc.
1. Data stored in Tables: Data is never stored directly in the database. It is
stored in tables created inside the database. A DBMS also allows
relationships between tables, which makes the data more meaningful and
connected. You can easily understand what type of data is stored where by
looking at all the tables created in a database.
2. Reduced Redundancy: Hard drives are cheap today, but when storage was
expensive, unnecessary repetition of data in a database was a big problem.
A DBMS supports Normalisation, which divides the data across tables in such
a way that repetition is kept to a minimum.
3. Data Consistency: On live data, i.e. data that is being continuously updated
and added, maintaining the consistency of data can become a challenge. A
DBMS handles this by itself.
4. Support for Multiple Users and Concurrent Access: A DBMS allows multiple
users to work on it (update, insert, delete data) at the same time and still
maintains data consistency.
5. Query Language: A DBMS provides users with a simple query language, using
which data can be easily fetched, inserted, deleted and updated in a database.
6. Security: The DBMS also takes care of the security of data, protecting the data
from unauthorised access. In a typical DBMS, we can create user accounts
with different access permissions, using which we can easily secure our data by
restricting user access.
7. Transactions: A DBMS supports transactions, which allow us to better handle
and manage data integrity in real-world applications where multi-threading is
used extensively.
Advantages of DBMS
Disadvantages of DBMS
Its complexity.
Except for open-source systems such as MySQL and PostgreSQL, licensed DBMSs are generally costly.
They are large in size.
On the other hand, when it comes to security and the appropriate management of data based on
constraints and the other requirements discussed below, the first choice of many experts
is a Database Management System (DBMS).
So what are these two approaches, and what are the parameters for deciding the best one for your
needs? Let us come to these aspects now.
A File System is the traditional way to keep your data organized so that it is easy to
access physically, whether it is on your shelf or on your drives.
Earlier, people kept records and maintained data in registers, and any alteration or retrieval of
this data was difficult. When computers arrived, the same approach was followed for storing data on
drives.
A File System stores data as isolated files, each with its own set of properties and its own
physical location on the drive, and the user manually goes to these locations to access the
files.
It is an easy way to store data in general files such as images, text, videos, and audio, but security is
weaker because the only options available for these files are those given by the operating system,
such as locks, hidden files, and sharing. These files are hard to maintain when they change
frequently.
Data redundancy is higher and cannot be controlled easily. Data integration is hard to achieve, and
data consistency is not guaranteed.
A Database Management System, abbreviated as DBMS, is an effective way to store data when
constraints are high and data maintenance and security are the primary concerns of the user.
A DBMS stores data in the form of interrelated tables and files. A DBMS setup generally consists of
the database management system software from a provider, which is used to store and manipulate
databases; the hardware where the data is physically stored; and user-friendly software developed to
meet a specific purpose, through which the user can easily access the database without
worrying about its underlying schema.
A Database Management System is a great way to manage data because data redundancy is minimized through
the interrelation of data entities, and data integration is made easier by the centralisation of data
in the database. Security of data is also improved through password protection, encryption/decryption,
granting of authorized access, and other measures.
File Management System: The user locates the physical address of the files to access data.
Database Management System: The user is unaware of the physical address where data is stored.
DBMS Architecture
A Database Management System is not always directly available for users and applications to
access and store data in it. A Database Management System can be centralised (all the data stored
at one location), decentralised (multiple copies of the database at different locations),
or hierarchical, depending upon its architecture.
1-tier DBMS architecture also exists. Here the database is directly available to the user,
who uses it to store data. Such a setup is generally used for local application development, where
programmers communicate directly with the database for a quick response.
2-tier DBMS architecture includes an Application layer between the user and the DBMS, which
is responsible for communicating the user’s request to the database management system and then
sending the response from the DBMS back to the user.
Such an architecture provides the DBMS with extra security, as it is not exposed to the End User
directly. Security can be improved further by adding security and authentication checks in the
Application layer.
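The role of the Application layer can be sketched in a few lines of Python. This is only an illustration under assumptions: the class ApplicationLayer, its get_balance method, the account table, and the API-key check are all hypothetical names invented for this example, and an in-memory SQLite database stands in for the DBMS.

```python
import sqlite3


class ApplicationLayer:
    """Hypothetical application tier that sits between users and the DBMS."""

    def __init__(self, conn, api_keys):
        self._conn = conn          # only this layer holds the DB connection
        self._api_keys = api_keys  # assumed credential store: key -> user name

    def get_balance(self, api_key, account_id):
        # Authentication check performed before the DBMS is ever contacted.
        if api_key not in self._api_keys:
            raise PermissionError("unknown API key")
        row = self._conn.execute(
            "SELECT balance FROM account WHERE account_id = ?", (account_id,)
        ).fetchone()
        return None if row is None else row[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 500)")

app = ApplicationLayer(conn, api_keys={"secret-key": "asha"})
balance = app.get_balance("secret-key", 1)
```

The user only ever calls get_balance; the SQL statement and the connection stay inside the Application layer, which is where the extra security of this architecture comes from.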
3-tier DBMS architecture is the most commonly used architecture for web applications.
It is an extension of the 2-tier architecture. In the 2-tier architecture, we have an application
layer which can be accessed programmatically to perform various operations on the DBMS. The
application generally understands the Database Access Language and passes end users’
requests to the DBMS. The 3-tier architecture adds a GUI layer on top of the application layer.
For the end user, the GUI layer is the Database System, and the end user has no idea about the
application layer and the DBMS.
If you have used MySQL, you have probably seen phpMyAdmin, which is a good example of a
3-tier DBMS architecture.
Schemas and Instances
Database Schema
A database schema is the skeleton structure that represents the logical view of the entire
database. It defines how the data is organized and how the relations among them are associated.
It formulates all the constraints that are to be applied on the data.
A database schema defines its entities and the relationship among them. It contains a descriptive
detail of the database, which can be depicted by means of schema diagrams. It’s the database
designers who design the schema to help programmers understand the database and make it
useful.
Physical Database Schema: This schema pertains to the actual storage of data
and its form of storage like files, indices, etc. It defines how the data will be
stored in a secondary storage.
Logical Database Schema: This schema defines all the logical constraints that
need to be applied on the data stored. It defines tables, views, and integrity
constraints.
Database Instance
It is important that we distinguish these two terms individually. Database schema is the skeleton
of database. It is designed when the database doesn’t exist at all. Once the database is
operational, it is very difficult to make any changes to it. A database schema does not contain
any data or information.
A database instance is the state of an operational database, with its data, at a given time. It
contains a snapshot of the database. Database instances tend to change with time. A DBMS ensures
that every instance (state) is valid, by diligently following all the validations, constraints,
and conditions that the database designers have imposed.
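The difference between schema and instance can be seen in a small sketch. It assumes an in-memory SQLite database and a hypothetical student table; the helper names schema and instance are made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " roll_number INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " age INTEGER CHECK (age >= 0))"
)


def schema(conn):
    # The schema: table definitions only, no rows.
    return [row[0] for row in conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'")]


def instance(conn):
    # The instance: the rows present at this moment.
    return conn.execute("SELECT * FROM student ORDER BY roll_number").fetchall()


schema_before = schema(conn)
instance_1 = instance(conn)                      # empty snapshot
conn.execute("INSERT INTO student VALUES (1, 'Asha', 20)")
instance_2 = instance(conn)                      # a later snapshot

# An insert that violates a constraint is rejected, so the instance stays valid.
try:
    conn.execute("INSERT INTO student VALUES (2, 'Ravi', -5)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

schema_after = schema(conn)
```

The two snapshots differ, while the schema read before and after the inserts is the same text.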
Data Independence
If a database system is not multi-layered, then it becomes difficult to make any changes in the
database system. Database systems are designed in multi-layers as we learnt earlier.
A database system normally contains a lot of data in addition to users’ data. For example, it
stores data about data, known as metadata, to locate and retrieve data easily. It is rather difficult
to modify or update a set of metadata once it is stored in the database. But as a DBMS expands,
it needs to change over time to satisfy the requirements of the users. If all the data were
interdependent, this would become a tedious and highly complex job.
Metadata itself follows a layered architecture, so that when we change data at one layer, it does
not affect the data at another layer. The data at each layer is independent but mapped to the
other layers.
Logical data is data about the database; that is, it stores information about how data is managed
inside. An example is a table (relation) stored in the database together with all the constraints
applied to that relation.
Logical data independence is a mechanism that decouples this logical description from the actual
data stored on the disk. If we make changes to the table format, the data residing on the disk
should not change.
All the schemas are logical, and the actual data is stored in bit format on the disk. Physical data
independence is the ability to change the physical data without impacting the schema or logical
data.
For example, if we want to change or upgrade the storage system itself, say by replacing
hard disks with SSDs, it should not have any impact on the logical data or schemas.
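Both kinds of independence can be illustrated with a rough sketch. It assumes an in-memory SQLite database and a hypothetical student table. Swapping disks cannot be shown in code, so adding an index is used here as a stand-in for a physical change, and adding a column stands in for a change of table format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_number INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                 [(1, "Asha", 20), (2, "Ravi", 22), (3, "Meena", 20)])

QUERY = "SELECT name FROM student WHERE age = 20 ORDER BY roll_number"
before = conn.execute(QUERY).fetchall()

# Physical change: add an index, which alters how the data is stored and found.
conn.execute("CREATE INDEX idx_student_age ON student(age)")
after_physical_change = conn.execute(QUERY).fetchall()

# Logical change: extend the table format with a new column.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
after_logical_change = conn.execute(QUERY).fetchall()
```

The same query text returns the same rows after each change, which is the practical meaning of independence between the layers.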
Database Language and Interfaces
The DBMS must support suitable languages and interfaces for each class of users. In this
topic, we explain the types of languages and interfaces supported by a DBMS.
The data definition language (DDL) is the language used to define and change the conceptual schema
of the database. DDL permits the DBA or user to describe and name the entities, attributes,
and relationships needed for the application, together with any related integrity and security
constraints.
The DBMS has a DDL compiler whose function is to process DDL statements,
analyse the definition of the schema design, and save the schema definition in the DBMS
directory.
Alter:
The structure of a table can be modified using the Alter Table command. Alter
Table permits changing the structure of an existing table.
Drop:
This command removes a table from the database, including its structure and its records.
Truncate:
This command is used to remove the records or information from the table, but its structure
remains the same.
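Rename:
This command changes the name of an existing table.
Comment:
This command adds a comment about a table or column to the data dictionary.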
The data manipulation language (DML) is the language used at the conceptual and view levels to
retrieve, insert, delete, and modify information stored in the database.
Query Language
Query Language is the part of the DML used for retrieving information. The terms query
language and data manipulation language are frequently used interchangeably.
DDL stands for Data Definition Language. It is used to define database structure or pattern.
Using the DDL statements, you can create the skeleton of the database.
Data definition language is used to store the information of metadata like the number of tables
and schemas, their names, indexes, columns in each table, constraints, etc.
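The DDL commands named above (Alter, Drop, Truncate, Rename) can be tried out in a short sketch. It assumes an in-memory SQLite database and a hypothetical student table. SQLite has no TRUNCATE statement, so DELETE without a WHERE clause is used in its place, and Comment is left out because SQLite does not provide it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def table_names(conn):
    return [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]


def column_names(conn, table):
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    return [r[1] for r in conn.execute(f"PRAGMA table_info({table})")]


# CREATE: define a new table.
conn.execute("CREATE TABLE student (roll_number INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'Asha')")

# ALTER: change the structure of the existing table.
conn.execute("ALTER TABLE student ADD COLUMN age INTEGER")
columns_after_alter = column_names(conn, "student")

# RENAME: SQLite spells this ALTER TABLE ... RENAME TO.
conn.execute("ALTER TABLE student RENAME TO learner")
tables_after_rename = table_names(conn)

# TRUNCATE: SQLite has no TRUNCATE, so DELETE without WHERE stands in for it.
conn.execute("DELETE FROM learner")
rows_after_truncate = conn.execute("SELECT COUNT(*) FROM learner").fetchone()[0]
columns_after_truncate = column_names(conn, "learner")

# DROP: remove the table together with its structure.
conn.execute("DROP TABLE learner")
tables_after_drop = table_names(conn)
```

After the truncate step the rows are gone but the columns remain; after the drop step the table itself no longer exists.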
DCL stands for Data Control Language. It is used to grant and revoke access
permissions on the stored data.
The DCL execution is transactional. It also has rollback parameters.
(But in Oracle database, the execution of data control language does not have the feature of
rolling back.)
TCL stands for Transaction Control Language. It is used to commit or roll back the changes made by
DML statements. DML statements can be grouped into a logical transaction.
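A logical transaction can be sketched as follows, assuming an in-memory SQLite database accessed through Python's sqlite3 module. The account table, the transfer function, and the balances are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("asha", 100), ("ravi", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts as one logical transaction."""
    try:
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst))
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src))
        conn.commit()      # TCL: make both updates permanent
        return True
    except sqlite3.IntegrityError:
        conn.rollback()    # TCL: undo the partial work
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


ok = transfer(conn, "asha", "ravi", 30)
after_ok = balances(conn)
failed = transfer(conn, "asha", "ravi", 500)   # would overdraw asha
after_failed = balances(conn)
```

In the failing transfer the credit to the receiver has already been applied when the debit violates the CHECK constraint; the rollback removes that credit, so the two updates either both take effect or neither does.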
Interfaces:
Forms-Based Interfaces:
A forms-based interface displays a form to each user. Users can fill out all of the form entries to
insert new data, or they can fill out only certain entries, in which case the DBMS will retrieve
matching data for the remaining entries. Forms are usually designed and programmed for users who
have no expertise in operating the system. Many DBMSs have forms specification languages, which
are special languages that help specify such forms.
Example: SQL*Forms is a form-based language that specifies queries using a form designed in
conjunction with the relational database schema.
A GUI typically displays a schema to the user in diagrammatic form. The user can then specify a
query by manipulating the diagram. In many cases, GUIs utilize both menus and forms. Most
GUIs use a pointing device, such as a mouse, to pick a certain part of the displayed schema
diagram.
Natural language interfaces accept requests written in English or some other language and attempt
to understand them. A natural language interface has its own schema, which is similar to the
database conceptual schema, as well as a dictionary of important words.
The natural language interface refers to the words in its schema, as well as to the set of standard
words in its dictionary, to interpret the request. If the interpretation is successful, the interface
generates a high-level query corresponding to the natural language request and submits it to the DBMS
for processing; otherwise, a dialogue is started with the user to clarify the request. The main
disadvantage is that the capabilities of this type of interface are not very advanced.
Limited use of speech, whether as input for a query or as output for an answer to a question or the
result of a request, is becoming commonplace. Applications with limited vocabularies, such as
inquiries for telephone directory, flight arrival/departure, and bank account information, allow
speech for input and output so that ordinary people can access this information.
The speech input is detected using predefined words and used to set up the parameters that are
supplied to the queries. For output, a similar conversion from text or numbers into speech takes
place.
DML resembles simple English and supports efficient user interaction with the system.
The functional capability of DML is organized into manipulation commands such as SELECT,
UPDATE, INSERT INTO and DELETE FROM, as described below:
SELECT: This command is used to retrieve rows from a table. The syntax is
SELECT [column name(s)] FROM [table name] WHERE [conditions]. SELECT is
the most widely used DML command in SQL.
UPDATE: This command modifies the data of one or more records. The update
command syntax is UPDATE [table name] SET [column name = value] WHERE
[condition].
INSERT: This command adds one or more records to a database table. The
insert command syntax is INSERT INTO [table name] ([column(s)]) VALUES
([value(s)]).
DELETE: This command removes one or more records from a table according
to specified conditions. The delete command syntax is DELETE FROM [table
name] WHERE [condition].
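The four commands can be run end to end in a short sketch. It assumes an in-memory SQLite database and a hypothetical student table with made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_number INTEGER PRIMARY KEY, name TEXT, age INTEGER)")

# INSERT: add records.
conn.executemany(
    "INSERT INTO student (roll_number, name, age) VALUES (?, ?, ?)",
    [(1, "Asha", 20), (2, "Ravi", 22), (3, "Meena", 19)],
)

# SELECT: retrieve rows matching a condition.
adults = conn.execute(
    "SELECT name FROM student WHERE age >= 20 ORDER BY roll_number"
).fetchall()

# UPDATE: modify the matching record.
conn.execute("UPDATE student SET age = 23 WHERE roll_number = 2")
ravi_age = conn.execute("SELECT age FROM student WHERE roll_number = 2").fetchone()[0]

# DELETE: remove the matching record.
conn.execute("DELETE FROM student WHERE roll_number = 3")
remaining = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

The ? placeholders pass values separately from the SQL text, which is the usual way to supply user input to a DML statement.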
Applications: An application can be considered a user-friendly web page where the user enters
requests. The user simply enters the required details and presses buttons to get the data.
End User: End users are the real users of the database. They can be developers, designers,
administrators, or the actual users of the database.
DDL: Data Definition Language (DDL) statements are fired to create the database, schemas, tables,
mappings, etc. These commands create objects such as tables and indexes
in the database for the first time. In other words, they create the structure of the database.
DDL Compiler: This part of the database is responsible for processing the DDL commands.
The compiler breaks each command down into machine-understandable
code. It is also responsible for storing metadata such as the table name, the space used by
it, the number of columns in it, mapping information, etc.
DML Compiler: When the user inserts, deletes, updates, or retrieves a record from the
database, the user sends requests in a familiar form, for example by pressing some buttons. For the
database to understand and act on the request, it must be broken down into object code. This is done
by the DML compiler. One can compare this to how a question put to a person is
broken down into waves that reach the brain.
Query Optimizer: When a user fires a request, the user is not concerned with how it will be
executed on the database, and is not at all aware of the database or how it performs. But whatever
the request, it should be executed efficiently enough to fetch, insert, update, or delete the data.
The query optimizer decides the best way to execute the user request received from the
DML compiler. It is similar to selecting the best nerve to carry the waves to the brain.
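The optimizer's choice can be observed in SQLite through EXPLAIN QUERY PLAN. The sketch below assumes an in-memory SQLite database, a hypothetical student table, and a hypothetical index name idx_student_age. The exact wording of the plan text varies between SQLite versions, so only the presence of the index name is checked.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_number INTEGER PRIMARY KEY, name TEXT, age INTEGER)")

QUERY = "SELECT name FROM student WHERE age = 20"


def plan(conn, sql):
    # Each EXPLAIN QUERY PLAN row ends with a human-readable detail string.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


plan_without_index = plan(conn, QUERY)   # typically a full table scan

conn.execute("CREATE INDEX idx_student_age ON student(age)")
plan_with_index = plan(conn, QUERY)      # typically a search using the index
```

The query text is identical in both cases; only the execution strategy chosen by the optimizer changes once the index exists.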
Stored Data Manager: This is also known as the Database Control System. It is one of the main
central systems of the database and is responsible for various tasks.
Data Files: These hold the real data. They can be stored on magnetic tapes, magnetic disks, or
optical disks.
Compiled DML: Some of the processed DML statements (insert, update, delete) are stored here
so that they can be reused if similar requests arrive.
Data Dictionary: It contains all the information about the database. As the name suggests, it is
the dictionary of all the data items. It contains a description of all the tables, views,
materialized views, constraints, indexes, triggers, etc.
End Users:
End users are the people who interact with the database through applications or utilities. The
various categories of end users are:
1. Casual End Users: These users access the database occasionally but may need
different information each time. They use a sophisticated database query
language to specify their requests. For example: high-level managers who
access the data weekly or biweekly.
2. Naive End Users: These users frequently query and update the database using
standard types of queries. The operations that can be performed by this class of
users are very limited and affect a precise portion of the database. For example:
reservation clerks for airlines/hotels check availability for a given request and
make reservations. Persons using Automated Teller Machines (ATMs) also
fall under this category, as they have access to a limited portion of the database.
3. Standalone End Users/On-line End Users: These end users interact with
the database directly via an on-line terminal or indirectly through menu-based or
graphics-based interfaces. Example: Library Management System.
Entity-Relationship Model
The Entity-Relationship (ER) Model is based on the notion of real-world entities and relationships
among them. When a real-world scenario is formulated as a database model, the ER Model
creates entity sets, relationship sets, general attributes, and constraints.
Mapping cardinalities
one to one
one to many
many to one
many to many
Relational Model
The most popular data model in DBMS is the Relational Model. It is a more scientific model
than the others. This model is based on first-order predicate logic and defines a table as an n-ary
relation.
The main highlights of this model are:
Entity
An entity is a real-world object, either animate or inanimate, that is easily identifiable.
For example, in a school database, students, teachers, classes, and courses offered can be
considered as entities. All these entities have some attributes or properties that give them their
identity.
An entity set is a collection of similar types of entities. An entity set may contain entities
whose attributes share similar values. For example, a Students set may contain all the students of a
school; likewise a Teachers set may contain all the teachers of a school from all faculties. Entity
sets need not be disjoint.
Attributes
Entities are represented by means of their properties, called attributes. All attributes have
values. For example, a student entity may have name, class, and age as attributes.
There exists a domain or range of values that can be assigned to attributes. For example, a
student’s name cannot be a numeric value. It has to be alphabetic. A student’s age cannot be
negative, etc.
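The idea of a domain can be sketched in Python as validation performed when an entity is created. The Student class and the is_valid helper are hypothetical names used only for this illustration; allowing spaces inside a name is an extra assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    name: str
    student_class: str   # "class" is a Python keyword, so the attribute is renamed here
    age: int

    def __post_init__(self):
        # Domain of name: alphabetic text (spaces allowed between words).
        if not self.name.replace(" ", "").isalpha():
            raise ValueError(f"name must be alphabetic: {self.name!r}")
        # Domain of age: non-negative integers.
        if self.age < 0:
            raise ValueError(f"age cannot be negative: {self.age}")


def is_valid(name, student_class, age):
    try:
        Student(name, student_class, age)
        return True
    except ValueError:
        return False
```

A value outside the domain, such as a numeric name or a negative age, is rejected before the entity can exist. In a DBMS the same role is played by column types and constraints.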
Types of Attributes
A key is an attribute or collection of attributes that uniquely identifies an entity within an
entity set. For example, the roll_number of a student makes him/her identifiable among students.
Relationship
The association among entities is called a relationship. For example, an employee works_at a
department, a student enrolls in a course. Here, Works_at and Enrolls are called relationships.
Relationship Set
A set of relationships of similar type is called a relationship set. Like entities, a relationship too
can have attributes. These attributes are called descriptive attributes.
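A relationship set with a descriptive attribute can be sketched as a table of its own. This example assumes an in-memory SQLite database; the table enrolls, the attribute enrolled_on, and all the rows are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (roll_number INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE course  (course_id   INTEGER PRIMARY KEY, title TEXT);
-- Relationship set Enrolls: one row per (student, course) relationship.
CREATE TABLE enrolls (
    roll_number INTEGER REFERENCES student(roll_number),
    course_id   INTEGER REFERENCES course(course_id),
    enrolled_on TEXT,               -- descriptive attribute of the relationship
    PRIMARY KEY (roll_number, course_id)
);
INSERT INTO student VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO course  VALUES (101, 'DBMS');
INSERT INTO enrolls VALUES (1, 101, '2024-01-10'), (2, 101, '2024-01-12');
""")

enrollments = conn.execute("""
    SELECT s.name, c.title, e.enrolled_on
    FROM enrolls AS e
    JOIN student AS s ON s.roll_number = e.roll_number
    JOIN course  AS c ON c.course_id   = e.course_id
    ORDER BY s.roll_number
""").fetchall()
```

The enrolment date belongs neither to the student nor to the course but to the pairing of the two, which is why it is stored with the relationship.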
Degree of Relationship
The number of participating entities in a relationship defines the degree of the relationship.
Binary = degree 2
Ternary = degree 3
n-ary = degree n
ISSUES
The E-R model can result in problems due to limitations in the way the entities are related in
the relational databases in a project. These problems are called connection traps. They
often occur due to a misinterpretation of the meaning of certain relationships.
ER-Model: ER Diagram
An Entity Relationship Diagram, also known as ERD, ER Diagram or ER model, is a type of
structural diagram for use in database design. An ERD contains different symbols and connectors
that visualize two important pieces of information: the major entities within the system scope, and
the inter-relationships among these entities.
So, when do we draw ERDs? While ER models are mostly developed for designing relational
databases, both for concept visualization and for physical database design, there are
other situations in which ER diagrams can help. Here are some typical use cases.
(I) Database design – Depending on the scale of change, it can be risky to alter a database structure
directly in a DBMS. To avoid ruining the data in a production database, it is important to plan out the
changes carefully. An ERD is a tool that helps. By drawing ER diagrams to visualize database design ideas,
you have a chance to identify mistakes and design flaws, and to correct them before executing the
changes in the database.
(II) Database debugging – Debugging database issues can be challenging, especially when the
database contains many tables, which requires writing complex SQL to get the information
you need. By visualizing a database schema with an ERD, you have a full picture of the entire
database schema. You can easily locate entities, view their attributes, and identify the
relationships they have with others. All of this allows you to analyze an existing database and
uncover database problems more easily.
(III) Database creation and patching – An ERD tool such as Visual Paradigm provides a database
generation tool that can automate the database creation and patching process by means of ER
diagrams. With such a tool, your ER design is no longer just a static diagram but a
mirror that truly reflects the physical database structure.
One-to-one
In one-to-one mapping, an entity in E1 is associated with at most one entity in E2, and an entity
in E2 is associated with at most one entity in E1.
One-to-many
In one-to-many mapping, an entity in E1 is associated with any number of entities in E2, and an
entity in E2 is associated with at most one entity in E1.
Many-to-one
In many-to-one mapping, an entity in E1 is associated with at most one entity in E2, and an
entity in E2 is associated with any number of entities in E1.
Many-to-Many
In many-to-many mapping, an entity in E1 is associated with any number of entities in E2, and
an entity in E2 is associated with any number of entities in E1.
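The four definitions can be turned into a small check. The function classify_mapping below is a hypothetical helper written for this illustration: it takes a relationship set as a list of (E1 entity, E2 entity) pairs and reports the tightest mapping cardinality that the given pairs satisfy. It describes the data it is given, not the constraint a designer may have declared.

```python
def classify_mapping(pairs):
    """Classify a relationship set, given as (e1, e2) pairs, by its mapping cardinality."""
    partners_of_e1 = {}   # e1 -> set of associated E2 entities
    partners_of_e2 = {}   # e2 -> set of associated E1 entities
    for e1, e2 in pairs:
        partners_of_e1.setdefault(e1, set()).add(e2)
        partners_of_e2.setdefault(e2, set()).add(e1)

    # "At most one" holds when no entity has more than one partner.
    e1_at_most_one = all(len(p) <= 1 for p in partners_of_e1.values())
    e2_at_most_one = all(len(p) <= 1 for p in partners_of_e2.values())

    if e1_at_most_one and e2_at_most_one:
        return "one-to-one"
    if e2_at_most_one:
        return "one-to-many"     # an E1 entity may have many E2 partners
    if e1_at_most_one:
        return "many-to-one"     # an E2 entity may have many E1 partners
    return "many-to-many"
```

For example, a department paired with several employees is classified as one-to-many, and the same pairs with the two sides swapped are classified as many-to-one.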