Unit 1 Introduction Lecture
Concepts and applications, Objectives and Evolution, Needs of DBMS, Data abstraction, Data
independence, Schema and Instances, Concepts of DDL, DML, and DCL, Data Manager and
Users
………………………………………………………………………………………………………
What is Data?
Raw facts are called data. The word “raw” indicates that they have not been processed.
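What is information?
Processed data is called information. It is organized and meaningful.
What is Knowledge?
When information is processed further, it becomes knowledge.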
DATA/INFORMATION PROCESSING:
The process of converting data (raw facts) into meaningful information is called
data/information processing.
Data --(when processed)--> Information --(when processed)--> Knowledge
Note: In business processing, knowledge is more useful for making decisions in an organization.
DIFFERENCE BETWEEN DATA AND INFORMATION:
DATA                              INFORMATION
1. Raw facts                      1. Processed data
2. It is in unorganized form      2. It is in organized form
3. Not meaningful                 3. Meaningful
4. Data doesn’t help in the       4. Information helps in the
   decision-making process           decision-making process
DATABASE:
(OR)
The main purpose (objective) of the database is to operate a large amount of information by
storing, retrieving, and managing data.
The file management system (FMS) is one in which all data is stored in a single large file.
The main disadvantage of this system is that searching for a record takes a long time. This led
to the introduction of indexing. Even so, the FMS had many drawbacks: updates and
modifications to the data could not be handled easily, sorting the records took a long time, and
so on. These drawbacks led to the introduction of the Hierarchical Database System.
The hierarchical system addressed the FMS drawback of slow record access and sorting by
introducing a parent-child relationship between records in the database. The origin of the data
is called the root, from which several branches hold data at different levels; the last level is
called the leaf. The main drawback was that any modification or addition to the structure
required altering the whole structure, which made the task tedious. To avoid this, the next
system was developed: the Network Database System.
The network system introduced the concept of many-to-many relationships. It still used
pointers to define relationships, with the difference that data items were grouped into sets.
To overcome the drawbacks of the previous systems, the Relational Database System was
introduced, in which data is organized as tables and each record forms a row with many fields
or attributes. Relationships between tables can also be defined in this system.
What is DBMS?
A database management system (DBMS) is system software for creating and managing
databases.
A DBMS makes it possible for end users to create, protect, read, update and delete data in a
database.
The DBMS provides a centralized view of data that can be accessed by multiple users, from
multiple locations, in a controlled manner.
Applications of DBMS
The following are the various kinds of applications and organizations that use databases for
their day-to-day business processing activities:
1. Banking: For customer information, accounts, loans, and banking transactions.
2. Airlines: For reservations and schedule information. Airlines were among the first to use
databases in a geographically distributed manner—terminals situated around the world
accessed the central database system through phone lines and other data networks.
4. Credit Card Transactions: For purchases on credit cards and generation of monthly
statements.
6. Finance: For storing information about holdings, sales, and purchases of financial
instruments such as stocks and bonds.
8. Manufacturing: For management of supply chain and for tracking production of items in
factories, inventories of items in warehouses/stores, and orders for items.
9. Human resources: For information about employees, salaries, payroll taxes and benefits,
and for generation of paychecks.
11. Web: For accessing bank accounts and checking the balance.
12. E-Commerce: For buying a book or music CD and browsing for things like watches and
mobiles on the Internet.
The objectives of DBMS are:
6. To protect the data from physical harm and unauthorized access.
Need of DBMS
In the file system, duplicate data is created in many places because every program has its
own files. This data redundancy wastes storage. In a DBMS, all the files are integrated into a
single database, which greatly reduces duplicate data.
For example: student records kept separately by the library and the examination office can
contain duplicate values, but when they are merged into a single database, the duplicate
values are removed.
Data security is high because the DBMS protects data from unauthorized access. Only
authorized users, identified by their credentials, are granted access to the database.
DBMS allows multiple users to access the same database at a time without any conflicts.
Avoidance of inconsistency.
Consider a transaction T consisting of two steps, T1 and T2: transfer of 100 from account
X to account Y.
If the transaction fails after completion of T1 but before completion of T2 (say, after
write(X) but before write(Y)), then the amount has been deducted from X but not added to
Y. This results in an inconsistent database state. Therefore, the transaction must be executed
in its entirety in order to ensure the correctness of the database state.
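The all-or-nothing behaviour described above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table name `account`, the starting balances (500 and 200), and the `fail_midway` flag are assumptions made for the example, not part of the notes.

```python
import sqlite3

def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )  # T1: write(X)
            if fail_midway:
                # hypothetical failure injected between T1 and T2
                raise RuntimeError("simulated crash between T1 and T2")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )  # T2: write(Y)
    except RuntimeError:
        pass  # the rollback has already undone T1

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
# Assumed starting balances, chosen only for the example
conn.executemany("INSERT INTO account VALUES (?, ?)", [("X", 500), ("Y", 200)])
conn.commit()

transfer(conn, "X", "Y", 100, fail_midway=True)
after_failure = balances(conn)  # T1 was rolled back, so nothing changed

transfer(conn, "X", "Y", 100)
after_success = balances(conn)  # both writes were applied together
```

After the failed attempt the balances are the same as before, so the database never shows 100 deducted from X without being added to Y.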
Consistency:
This means that integrity constraints must be maintained so that the database is consistent
before and after the transaction. It refers to the correctness of a database.
In DBMS, data is stored in a single database so data becomes more consistent in comparison
to file processing systems.
Shared data
Data can be shared between authorized users of the database in DBMS. Each user has their
own access rights. The administrator has complete access to the database and can grant
other users the right to access it.
Enforcement of standards
Because the DBMS has central control of the database, a DBA can ensure that all applications
follow certain standards, such as data formats and documentation standards. These standards
help in migrating or interchanging data.
Unauthorized persons are not allowed to access the database because of security credentials.
Data loss is a big problem for all organizations. In the file system, users have to back up
the files at regular intervals, which wastes time and resources.
A DBMS solves this problem by taking backups and recovering the database automatically.
Tunability
Tuning means adjusting something to get better performance. A DBMS provides tunability in
the same sense: the DBA adjusts the database to improve performance.
Disadvantages of DBMS
Complexity
The provision of the functionality that is expected of a good DBMS makes the DBMS an
extremely complex piece of software. Database designers, developers, database
administrators and end-users must understand this functionality to take full advantage of it.
Failure to understand the system can lead to bad design decisions, which can have serious
consequences for an organization.
Size
The functionality of a DBMS makes it a large piece of software that occupies many megabytes
of disk space.
Performance
The centralization of resources increases the vulnerability of the system. Because all users
and applications rely on the availability of the DBMS, the failure of any component can bring
operations to a halt.
Cost of DBMS
The cost of DBMS varies significantly depending on the environment and functionality
provided. There is also the recurrent annual maintenance cost.
Data Abstraction
Data abstraction is a process of hiding the implementation details (such as how the data are
stored and maintained) and representing only the essential features to simplify user's
interaction with the system.
The major purpose of a database system is to provide users with an abstract view of the
system.
To simplify user's interaction with the system, the complexity is hidden from the database
users through several levels of abstraction.
Physical Level:
View Level:
1. External level
It is also called view level. The reason this level is called “view” is because several users can
view their desired data from this level which is internally fetched from database with the help
of conceptual and internal level mapping.
The user doesn’t need to know database schema details such as data structures or table
definitions. The user is only concerned with the data that is returned to the view level
after it has been fetched from the database (present at the internal level).
External level is the “top level” of the Three Level DBMS Architecture.
2. Conceptual level
It is also called logical level. The whole design of the database, such as relationships among
data, schema of data etc., is described in this level.
Database constraints and security are also implemented in this level of architecture. This level
is maintained by DBA (database administrator).
3. Internal level
This level is also known as physical level. This level describes how the data is actually stored
in the storage devices. This level is also responsible for allocating space to the data. This is
the lowest level of the architecture.
In simple words, we can say that Data independence is a property of a database that allows
the User or Database Administrator to change the schema at one level without affecting
the data or schema at another level.
To achieve data independence, a change made at one level must not change or disrupt the
data at the next level.
The purpose of data independence is to enhance the security of the system and to save the
time and cost needed when the information is changed or altered.
To achieve data independence, we first need to ensure data abstraction. Data abstraction
can be defined as exposing the necessary data while hiding the remaining irrelevant details.
An ATM is one of the best real-world examples of data abstraction. We all use an ATM for
cash withdrawal, money transfer, etc. in our daily life, but we don't know what processes
and operations are being performed inside it.
The main purpose of data abstraction is to achieve data independence. There are three
levels of abstraction.
Physical or Internal Level - It is the lowest level of data abstraction.
Conceptual or Logical Level - defines the logical relations between the data. It
provides the link between the external schema and the internal schema of the
database.
External or View Level - It is the highest level of data abstraction. The external level
describes the user interaction with the database management system.
Based on the data abstraction, there are two levels of data independence:
Physical data independence is the ability to change the physical level without affecting the
logical or conceptual level. It gives us the freedom to modify the storage device, file
structure, location of the database, etc. without changing the definition of the conceptual or
view level.
If we take the database of the banking system and we want to scale up the database by
changing the storage size and also want to change the file structure then we can do it without
affecting any functionality of logical schema.
The following changes can be made at the physical layer without affecting the conceptual layer:
Changing the storage devices like SSD, hard disk and magnetic tapes, etc.
Changing the access technique and modifying indexes.
Changing the compression techniques or hashing algorithms.
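As a small illustration of "changing the access technique and modifying indexes", the sketch below adds an index to a table and runs the same query before and after. It uses Python's sqlite3 module; the `customer` table, its rows, and the index name are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
# Hypothetical sample rows
conn.executemany(
    "INSERT INTO customer (name, city) VALUES (?, ?)",
    [("Asha", "Pune"), ("Ravi", "Delhi"), ("Meena", "Pune")],
)

# The query an application would run; it never mentions indexes or files.
QUERY = "SELECT name FROM customer WHERE city = ? ORDER BY name"

before = conn.execute(QUERY, ("Pune",)).fetchall()

# Physical-level change: add an access path (index). The table definition
# and the query text are untouched.
conn.execute("CREATE INDEX idx_customer_city ON customer (city)")

after = conn.execute(QUERY, ("Pune",)).fetchall()
```

The query text and its result stay the same; only the way the DBMS finds the rows may change.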
Logical data independence is the property of a database that allows the logical level to be
changed without affecting the other layers of the database.
Logical data independence is usually required for changing the conceptual schema without
having to change the external schema or application programs. It allows us to make changes
in a conceptual structure like adding, modifying, or deleting an attribute in the database.
Ex-
If there is a database of a banking system and we want to add a new attribute (for example,
an email address) to the customer record, the conceptual schema changes, but the existing
views and application programs that do not use the new attribute are not affected.
These changes can be done at a logical level without affecting the application program
or external layer.
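One common way to get this effect in practice is to let applications read through a view, so the view plays the role of the external schema. The sketch below, using Python's sqlite3 module, adds an attribute to the base table and shows that the application's read is unaffected. The names `customer`, `customer_names`, `email`, and `app_read` are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, balance INTEGER)")
conn.execute("INSERT INTO customer (name, balance) VALUES ('Asha', 500)")

# External schema: the part of the database this application is allowed to see
conn.execute("CREATE VIEW customer_names AS SELECT id, name FROM customer")

def app_read(conn):
    """Stand-in for an application program that only knows the view."""
    return conn.execute("SELECT id, name FROM customer_names").fetchall()

before = app_read(conn)

# Conceptual-schema change: a new attribute is added to the table
conn.execute("ALTER TABLE customer ADD COLUMN email TEXT")

after = app_read(conn)
columns = [row[1] for row in conn.execute("PRAGMA table_info(customer)")]
```

The table now has an extra column, but `app_read` returns the same rows without any change to its code.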
Data independence allows changing the schema at one level without the need to
change the schema at another level.
It helps to improve the data security of the database.
With the help of physical data independence, we can change the storage and file
structure of the database without affecting the application program.
Logical data independence helps us to add and delete attributes in the database without
any effect on the application program.
Data independence provides us the facility of data abstraction, which means we can
use a function without worrying about its internal structure. Ex- in a vehicle, we
apply the brakes to decelerate without worrying about the internal mechanism of the
brake.
Data independence saves a lot of time and cost when we want to change the
location or file structure of the database or alter the data.
Data independence is important for handling large amounts of data and for providing security
to the data in a DBMS. But data independence also has disadvantages.
The first disadvantage of data independence is that we need a high initial investment
for the hardware and software to set up the database because we need hardware and
machine with high performance.
The second disadvantage of data independence is complexity: the database must be
designed to give high performance, and achieving this increases complexity.
Schema
Instance
Schema:
The overall design of the database is called the “Schema” or “Meta Data”. A
database schema corresponds to a type definition in a programming language. The value
of a variable in a programming language corresponds to an “Instance” of a database
schema.
The goal of this architecture is to separate the user applications from the
physical database. In this architecture, schemas can be defined at the following
three levels:
1. The internal level has an internal schema, which describes the physical storage
structure of the database. The internal schema uses a physical data model and
describes the complete details of data storage and access paths for the database.
2. The conceptual level has a conceptual schema, which describes the structure of the
whole database for a community of users. The conceptual schema hides the details of
physical storage structures and concentrates on describing entities, data types,
relationships, user operations, and constraints. A high-level data model or an
implementation data model can be used at this level.
3. The external or view level includes a number of external schemas or user views.
Each external schema describes the part of the database that a particular user group is
interested in and hides the rest of the database from that user group. A high-level data
model or an implementation data model can be used at this level.
What is Instance?
The data stored in the database at any given time is an instance of the database.
Student
In the above table, 1201, 1202, Venkat, etc. are said to be an instance of the Student table.
Schema                                        Instance
It is the overall description of the          It is the collection of information stored in a
database.                                     database at a particular moment.
Schema is the same for the whole database.    Data in instances can be changed using addition,
                                              deletion, and updation.
Does not change frequently.                   Changes frequently.
Defines the basic structure of the database,  It is the set of information stored at a particular
i.e. how the data will be stored in the       time.
database.
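The programming-language analogy from the Schema section (a schema is like a type definition, an instance is like the value of a variable) can be sketched in Python. The field names `roll_no` and `name`, and the pairing of names with roll numbers, are assumptions; "Ravi" is a made-up value.

```python
from dataclasses import dataclass, fields

@dataclass
class Student:
    """Plays the role of the schema: field names and types only, no data."""
    roll_no: int
    name: str

# Plays the role of an instance: the actual values held at this moment
students = [Student(1201, "Venkat")]
snapshot_1 = list(students)

# Adding a record changes the instance ...
students.append(Student(1202, "Ravi"))  # hypothetical second record
snapshot_2 = list(students)

# ... but the schema (the type definition) is exactly what it was before
schema_fields = [f.name for f in fields(Student)]
```

The two snapshots differ, while `schema_fields` is the same before and after the append, which mirrors the "changes frequently" versus "does not change frequently" rows of the comparison.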
Concepts of DDL, DML, and DCL
DDL
DDL is short for Data Definition Language, which deals with database schemas and
descriptions of how the data should reside in the database.
CREATE - creates a database and its objects (tables, indexes, views, stored
procedures, functions, and triggers)
ALTER - alters the structure of an existing database object
DROP - deletes objects from the database
TRUNCATE - removes all records from a table and releases the space allocated for
them
COMMENT - adds comments to the data dictionary
RENAME - renames an object
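A few of these DDL statements can be tried out with Python's sqlite3 module, as in the sketch below. The table and column names are assumptions for illustration. SQLite does not provide TRUNCATE or COMMENT, and it spells renaming as ALTER TABLE ... RENAME TO, so only CREATE, ALTER, RENAME, and DROP are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def table_names(conn):
    """List the tables recorded in SQLite's catalog."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [r[0] for r in rows]

def column_names(conn, table):
    """List the column names of a table, in definition order."""
    return [r[1] for r in conn.execute(f"PRAGMA table_info({table})")]

# CREATE: define a new table
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")

# ALTER: change the structure of the existing table
conn.execute("ALTER TABLE student ADD COLUMN branch TEXT")
cols_after_alter = column_names(conn, "student")

# RENAME: give the table a new name (SQLite syntax)
conn.execute("ALTER TABLE student RENAME TO learner")
tables_after_rename = table_names(conn)

# DROP: remove the table definition and its data
conn.execute("DROP TABLE learner")
tables_after_drop = table_names(conn)
```

Every statement here changes the schema, not the rows, which is what separates DDL from DML.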
DML
DML is short for Data Manipulation Language, which deals with data manipulation and
includes the most common SQL statements such as SELECT, INSERT, UPDATE, and DELETE.
It is used to store, modify, retrieve, and delete data in a database.
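The four DML statements named above can be seen together in a short sketch using Python's sqlite3 module. The `student` table and its rows are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")

# INSERT: store new rows (hypothetical sample data)
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [(1201, "Venkat"), (1202, "Ravi")],
)

# UPDATE: modify an existing row
conn.execute("UPDATE student SET name = ? WHERE roll_no = ?", ("Ravi Kumar", 1202))

# DELETE: remove a row
conn.execute("DELETE FROM student WHERE roll_no = ?", (1201,))

# SELECT: retrieve what is left
rows = conn.execute("SELECT roll_no, name FROM student ORDER BY roll_no").fetchall()
```

None of these statements touches the table definition; they only change or read the rows stored in it.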
DCL
DCL is short for Data Control Language, which includes commands such as GRANT and
REVOKE and is mostly concerned with rights, permissions, and other controls of the
database system.
TCL
TCL is short for Transaction Control Language, which deals with transactions within a
database.
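The notes do not list the TCL commands, but the two most widely used ones are COMMIT and ROLLBACK. The sketch below shows their effect through the commit() and rollback() methods of Python's sqlite3 module; the `log` table and its messages are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (msg TEXT)")

conn.execute("INSERT INTO log VALUES ('kept')")
conn.commit()    # COMMIT: make the pending insert permanent

conn.execute("INSERT INTO log VALUES ('discarded')")
conn.rollback()  # ROLLBACK: undo the uncommitted insert

messages = [r[0] for r in conn.execute("SELECT msg FROM log")]
```

Only the committed row survives, because the second insert was undone before it was ever committed.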
People who work with a database can be categorized as database users or database
administrators.
Database Users:
There are four different types of database-system users, differentiated by the way
they expect to interact with the system.
Naive users:
Naive users are unsophisticated users who interact with the system by invoking one
of the application programs that have been written previously.
For example, a bank teller who needs to transfer $50 from account A to account
B invokes a program called transfer. This program asks the teller for the amount of
money to be transferred, the account from which the money is to be transferred, and the
account to which the money is to be transferred.
Application programmers:
Specialized users:
Database Administrator:
One of the main reasons for using DBMSs is to have central control of both the
data and the programs that access those data. A person who has such central control over
the system is called a database administrator (DBA).
Schema definition:
The DBA creates the original database schema by executing a set of data
definition statements in the DDL.
Storage structure and access-method definition: