Dbms 1
Vocational Skills
Unit – 1
Database Concepts
• DATA
Data is a collection of facts in a raw or
unorganized form such as numbers or
characters.
Example: Ankit, Delhi, 12, 80.
• INFORMATION
Information is data that has been
converted into a more useful,
meaningful form.
Example: Name – Ankit, City – Delhi,
Class – 12, Marks – 80.
• DATABASE
Related information placed in an organized form makes a database. We can
organize related information in the form of files or folders (file-based
system) or in the form of tables (relational database).
• Properties of a Database
NEED FOR A DATABASE
1. Size of Data:
Storing a small amount of data in a spreadsheet is fine, but a spreadsheet stops working
well once the data grows large. If the records run into millions, they have to be split
across multiple spreadsheets, which creates a speed problem: finding one record across
many spreadsheet files takes a long time.
3. Accuracy: When users enter data into files, no validation is present, so incorrect data
such as wrong spellings, wrong dates, and wrong amounts can easily be entered. Data
accuracy is therefore hard to maintain. A database is more accurate because it has
built-in constraints and checks that reject invalid entries, so the information in a
database is far more likely to be correct.
4. Security:
Data in text files and spreadsheets is hard to secure. Anyone who can open the file can
read any data in it. Storing data this way does not work for banking, healthcare, or
payroll applications, where privacy must be maintained.
Databases have various methods to ensure the security of data. A user login is
required before accessing a database, and access rights can be set per user. These allow
only authorised users to access the database.
5. Redundancy: Data can easily be duplicated in text files or spreadsheets, and nothing
limits the number of copies. This leads to accuracy issues, because maintaining and
updating multiple copies is not an easy task. Databases control duplication by using
various constraints on the data. Data integrity in databases makes sure that the data
is accurate and consistent.
6. Incomplete Data: As data files are independent, combining information from
multiple files becomes very difficult.
A database is required to overcome the above problems associated with storing
data in text files or spreadsheets.
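The accuracy point above can be tried out with a small sketch. It uses Python's built-in sqlite3 module and an in-memory database. The student table, its column names, and the allowed ranges are assumptions made only for illustration; the sample values come from the Ankit example at the start of the unit.

```python
import sqlite3

# In-memory database; the "student" table and its value ranges are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE student (
        name     TEXT    NOT NULL,
        city     TEXT    NOT NULL,
        class_no INTEGER NOT NULL CHECK (class_no BETWEEN 1 AND 12),
        marks    INTEGER NOT NULL CHECK (marks BETWEEN 0 AND 100)
    )
    """
)


def try_insert(row):
    """Insert one row; return True if accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO student VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert(("Ankit", "Delhi", 12, 80))         # valid row
rejected_marks = try_insert(("Ankit", "Delhi", 12, 800))  # marks out of range
rejected_city = try_insert(("Ankit", None, 12, 80))       # city missing
row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(accepted, rejected_marks, rejected_city, row_count)
```

A spreadsheet would have accepted all three rows; here only the valid row is stored.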
A database management system (DBMS) is software used for creating and
managing the data in a database easily and effectively. It is basically a set of
programs that allow users to store, modify/update, and retrieve information from the
database as per the requirements.
Example: MySQL, MS SQL Server, Oracle, DB2, Microsoft Access, etc. are
examples of database management systems.
Database Environment
A database management system has various characteristics. The following are some of the important ones:
1. Self-Describing Nature
Before DBMS, the traditional file management system was used for storing information and data. It had no concept of a
data definition like the one we have in a DBMS. A DBMS is self-describing because it contains not only the database
itself but also the metadata. Metadata (data about data) defines and describes the extent, type, structure and format of
all data, as well as the relationships between the data. In this way the data itself indicates what actions can be
taken on it.
2. Support ACID Properties
Every DBMS should support the ACID (Atomicity, Consistency, Isolation, and Durability) properties. They make sure
that the real purpose of the data is not lost while performing transactions such as delete, insert and
update. For example, if an employee's name is updated, the DBMS should make sure that there is no duplicate data
and no mismatch of employee information.
• Atomicity − This property states that a transaction must be treated as an atomic unit, that is, either all of its
operations are executed or none. There must be no state in a database where a transaction is left partially completed.
• Consistency − The database must remain in a consistent state after any transaction. No transaction should have any
adverse effect on the data residing in the database.
• Durability − The database should be durable enough to hold all its latest updates even if the system fails or restarts.
• Isolation − In a database system where more than one transaction is being executed simultaneously and in parallel,
the property of isolation states that each transaction will be carried out and executed as if it were the only
transaction in the system. No transaction will affect the existence of any other transaction.
3. Concurrent Use of Database
Many users are likely to access the data at the same time, and they may
need to alter the database concurrently. A DBMS lets them use the database
concurrently without any problem. For example, the employees of a railway
reservation system can book and access tickets for passengers concurrently.
Every employee can see on his own interface how many seats are available or
whether a bogie is fully booked.
4. Transactions
A transaction is a group of actions that takes the database from one
consistent state to a new consistent state. The traditional file-based system
did not have this feature. A transaction is always atomic, which means it can
never be divided further: either it completes fully or it does not happen at all.
For example, a person wants to transfer money from his account to another
person's account. The transaction is complete only if the money leaves his
account and the other person receives it. Anything other than this leaves the
database in an inconsistent state.
5. Data Persistence
Persistence means that data stays in the DBMS until it is removed explicitly.
The life span of stored data is decided by the users, directly or indirectly,
and not by a system failure. If the system fails in the middle of a
transaction, the transaction is either rolled back or fully completed, so
committed data is not put at risk.
6. Backup and recovery
The whole database can fail. If that happens and no copy exists, no one will
be able to get the database back and the company will suffer a big loss.
The only solution is to take a backup of the database so that it can be
restored whenever it is needed.
7. Data integrity
This is one of the most important characteristics of a database management
system. Integrity ensures the quality and reliability of the database system. It
guards against unauthorized changes to the database and makes it more secure. It
allows only consistent and accurate data into the database.
8. Multiple Views
Users can have multiple views of the database depending on their department and interest. A DBMS
supports multiple views of the database for its users. For example, a user in the teaching department
sees one view and a user in the hostel department sees another. This feature also gives users
some security, because users of other departments cannot access their files.
9. Stores any kind of data
A database management system is able to store any kind of data. It should not be restricted to
employee name, salary and address. Any kind of data that exists in the real world can be stored in a
DBMS, because we need to work with all the kinds of data that are present around us.
10. Security
A DBMS provides security to the data stored in it because different users have different rights to access the
database. Some users can access the whole database while others can access only a small part of
it. For example, a computer teacher can access only the files related to computer
subjects, but the HOD of the department can access the files of all subjects related to the
department.
11. Represents complex relationship between data
The data items stored in a database are connected with each other, and relationships are defined between them.
A DBMS is able to represent complex relationships between data so that the data can be used
efficiently and accurately.
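The money-transfer example under Transactions (item 4 above) can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module. The account table, the owner names, and the balances are assumptions, and the CHECK constraint stands in for the rule that a balance may not go negative.

```python
import sqlite3

# Hypothetical two-account setup; names and amounts are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "owner TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?)", [("sender", 500), ("receiver", 100)]
)
conn.commit()


def transfer(amount):
    """Move money from sender to receiver as one atomic transaction."""
    try:
        with conn:  # commits if both updates succeed, rolls back otherwise
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE owner = 'receiver'",
                (amount,),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE owner = 'sender'",
                (amount,),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    """Return the current balances as a dictionary."""
    return dict(conn.execute("SELECT owner, balance FROM account"))


ok = transfer(200)        # both updates succeed
after_ok = balances()
failed = transfer(1000)   # the debit violates the CHECK, so the credit is undone too
after_failed = balances()
print(ok, after_ok, failed, after_failed)
```

In the failed transfer the receiver is credited first, but the rollback removes that credit, so the database never shows a half-completed transfer.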
Advantages of DBMS
Disadvantages of DBMS
1. Cost
A DBMS requires a high initial investment in hardware, software and trained staff.
A significant investment, based on the size and functionality of the organization,
is required. The organization also has to pay a recurring annual maintenance cost.
2. Complexity
A DBMS fulfils many requirements and solves many problems related to
databases. But all this functionality has made the DBMS an extremely complex
piece of software. Developers, designers, DBAs and end users of the database
must have the right skills if they want to use it properly. If they do not
understand this complex system, it may cause loss of data or database failure.
3. Technical staff requirement
An organization has many employees, and they can often perform tasks outside
their own domain, but working on a DBMS is not easy for them. A team of
technical staff who understand the DBMS is required, and the company has to
pay them a good salary.
4. Database Failure
In a DBMS all the files are stored in a single database, so a database failure affects everything at once.
Any accidental failure of a component may cause loss of valuable data. This is a serious concern for big
firms.
5. Extra Cost of Hardware
A DBMS requires disk storage for the data, and sometimes you need to purchase extra space to store your data.
Sometimes you also need a dedicated machine for better database performance. These machines and the extra
storage space increase the cost of hardware.
6. Size
Because of its many functionalities, a DBMS is a large piece of software and requires a lot of disk space and
memory to run efficiently. It grows further as data is fed into it.
7. Cost of Data Conversion
Data conversion may be required at any time, and the organization has to take this step. Surprisingly, data
conversion can cost more than the DBMS hardware and machines combined. Trained staff are needed to
convert the data to the new system. This is a key reason why many organizations are still working on their old
DBMS.
8. Maintenance
As new threats appear all the time, a DBMS needs to be updated regularly to keep up with the
current scenario.
9. Performance
The traditional file system works well for small organizations and gives good performance. A DBMS can
perform worse for small-scale firms because its extra functionality adds overhead.
Types of users who play different roles in DBMS
• Application Programmers
• Database Administrators
• End-Users
1. Application Programmers
The users who write application programs in programming languages
(such as Java, C++, or Visual Basic) to interact with databases are called
Application Programmers.
2. Database Administrators (DBA)
A person who manages the overall DBMS is called a database administrator or
simply DBA.
3. End-Users
End-users are those who interact with the database management system
to perform operations on the data, using database commands such as
insert, update, retrieve, and delete.
Applications of DBMS
What is File Management System?
• A file management system is a collection of programs that manage and store
data in files and folders in a computer hard disk.
• A file management system manages the way of reading and writing data to
the hard disk. It is also known as conventional file system.
• This system stores data in isolated files, each with its own physical
location on the drive, and users go to these locations manually to access
the files. It is the easiest way to store data such as text, videos, images,
and audio in general files.
• Data redundancy is high in a file management system, and it cannot be
controlled easily.
• Data consistency is not ensured, and the integration of data is hard to achieve.
• Each operating system, such as Linux or Windows, has its own file system. For
example, NTFS is the Windows file system, and EXT is the Linux file system.
These operating systems provide only limited security for files, through
options such as hiding files, locks, and file sharing.
Relational Database
➢ Relational Model: The relational model
represents the database as a collection of
relations. A relation is nothing but a table of
values. Every row in the table represents a
collection of related data values. These rows in
the table denote a real-world entity or
relationship.
➢ Relational Database: A relational database
organizes data into tables which can be linked
(or related) based on data common to each. A
database in which the data is stored in the form
of relations (also called tables) is called a
Relational Database.
Relational Database Management System
• RDBMS: A Relational database management system (RDBMS) is a
database management system (DBMS) that is based on the relational
model as introduced by E. F. Codd. Some popular RDBMS software
available are: Oracle, MySQL, Sybase.
• Properties of RDBMS
Values are atomic.
All of the values in a column have the same data type.
Each row is unique.
The sequence of columns is insignificant.
The sequence of rows is insignificant.
Each column has a unique name.
Integrity constraints maintain data consistency across multiple tables.
RDBMS TERMINOLOGY
[Figure: example CUSTOMERS table]
• Row/Record/Tuple – A record, also called a row of data, is each
individual entry that exists in a table. For example, there are 7
records in the above CUSTOMERS table.
• Following is a single row of data or record in the CUSTOMERS table −
1 Ramesh 32 Delhi
❖ DDL
1. CREATE DATABASE databasename;
2. CREATE TABLE table_name ( column1 datatype, column2 datatype,
column3 datatype );
3. ALTER TABLE table_name ADD column_name datatype;
4. ALTER TABLE table_name DROP COLUMN column_name;
5. ALTER TABLE table_name ALTER COLUMN column_name datatype;
6. DROP TABLE table_name;
7. RENAME Old_table_name TO new_table_name;
❖ DML
1. SELECT column1, column2, ... FROM table_name;
2. SELECT * FROM table_name;
3. SELECT column1, column2, ... FROM table_name WHERE condition;
4. INSERT INTO table_name (column1, column2, column3, ...) VALUES
(value1, value2, value3, ...);
5. INSERT INTO table_name VALUES (value1, value2, value3, ...);
6. UPDATE table_name SET column1 = value1, column2 = value2, ...
WHERE condition;
7. DELETE FROM table_name WHERE condition;
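The DML statements listed above can be run end to end with a small sketch. It uses Python's built-in sqlite3 module. The column names of the customers table and the second row are assumptions; the first row is the Ramesh record shown under RDBMS terminology.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column names are assumed; the source shows only the values of one record.
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, city TEXT)"
)

# INSERT with and without an explicit column list
conn.execute("INSERT INTO customers (id, name, age, city) VALUES (1, 'Ramesh', 32, 'Delhi')")
conn.execute("INSERT INTO customers VALUES (2, 'Ankit', 25, 'Delhi')")  # hypothetical row

# SELECT all columns, then selected columns with a WHERE condition
all_rows = conn.execute("SELECT * FROM customers ORDER BY id").fetchall()
delhi_names = [
    row[0]
    for row in conn.execute("SELECT name FROM customers WHERE city = 'Delhi' ORDER BY id")
]

# UPDATE and DELETE always take a WHERE condition here, so only one row is affected
conn.execute("UPDATE customers SET city = 'Chandigarh' WHERE id = 2")
conn.execute("DELETE FROM customers WHERE id = 1")
remaining = conn.execute("SELECT * FROM customers").fetchall()
print(all_rows, delhi_names, remaining)
```

Leaving out the WHERE condition in UPDATE or DELETE would change or remove every row in the table.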
❖ DCL
1. GRANT privileges_names ON object TO user;
2. REVOKE privileges ON object FROM user;
❖ TCL
1. COMMIT;
2. ROLLBACK;
3. SAVEPOINT SAVEPOINT_NAME;
4. ROLLBACK TO SAVEPOINT_NAME;
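The TCL commands above can be seen working in a short sketch. It uses Python's built-in sqlite3 module with isolation_level=None, so that BEGIN, COMMIT and ROLLBACK are issued explicitly. The log table and the savepoint name are assumptions made for illustration.

```python
import sqlite3

# isolation_level=None lets the script control transactions itself.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE log (step TEXT)")  # hypothetical table

conn.execute("BEGIN")
conn.execute("INSERT INTO log VALUES ('first')")
conn.execute("SAVEPOINT after_first")          # mark a point inside the transaction
conn.execute("INSERT INTO log VALUES ('second')")
conn.execute("ROLLBACK TO after_first")        # undo only the work done after the savepoint
conn.execute("COMMIT")                         # make 'first' permanent
committed = [row[0] for row in conn.execute("SELECT step FROM log")]

conn.execute("BEGIN")
conn.execute("INSERT INTO log VALUES ('third')")
conn.execute("ROLLBACK")                       # undo the whole transaction
final = [row[0] for row in conn.execute("SELECT step FROM log")]
print(committed, final)
```

ROLLBACK TO undoes only part of a transaction, while a plain ROLLBACK discards everything since BEGIN.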
Constraints in RDBMS
Relational constraints are the restrictions imposed on the database
contents and operations. They ensure the correctness of data in the
database.
alter table teacher add constraint fr foreign key (deptno) references department (deptid)
on delete set null on update set null;
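The effect of the foreign key above can be checked with a small sketch using Python's built-in sqlite3 module. SQLite cannot add a constraint with ALTER TABLE, so here the same constraint is declared inside CREATE TABLE, and foreign key checking has to be switched on with a PRAGMA. The column lists and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE department (deptid INTEGER PRIMARY KEY, deptname TEXT)")
conn.execute(
    """
    CREATE TABLE teacher (
        teacherid INTEGER PRIMARY KEY,
        name      TEXT,
        deptno    INTEGER,
        CONSTRAINT fr FOREIGN KEY (deptno) REFERENCES department (deptid)
            ON DELETE SET NULL ON UPDATE SET NULL
    )
    """
)
conn.execute("INSERT INTO department VALUES (1, 'Hindi')")
conn.execute("INSERT INTO teacher VALUES (10, 'Asha', 1)")  # hypothetical teacher

# A teacher pointing at a department that does not exist is rejected.
try:
    conn.execute("INSERT INTO teacher VALUES (11, 'Ravi', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Deleting the department sets the teacher's deptno to NULL instead of failing.
conn.execute("DELETE FROM department WHERE deptid = 1")
deptno_after_delete = conn.execute(
    "SELECT deptno FROM teacher WHERE teacherid = 10"
).fetchone()[0]
print(orphan_rejected, deptno_after_delete)
```

With ON DELETE SET NULL the teacher row survives the deletion of its department, but no longer refers to any department.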
Self-Referencing Tables:
A foreign key constraint can reference columns within the same table. Such tables are called
self-referencing tables. For example, consider a table Employee that contains five columns:
Employee_ID, Name, Age, Salary and Manager_ID. Because the manager is also an employee,
there is a foreign key relationship between Manager_ID and Employee_ID.
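A self-referencing Employee table of this kind might be set up as in the sketch below, which uses Python's built-in sqlite3 module. The three rows are borrowed from the Employee practice table at the end of this unit; the data types and the join query are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    """
    CREATE TABLE Employee (
        Employee_ID INTEGER PRIMARY KEY,
        Name        TEXT NOT NULL,
        Age         INTEGER,
        Salary      INTEGER,
        Manager_ID  INTEGER REFERENCES Employee (Employee_ID)  -- points back to the same table
    )
    """
)
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?, ?, ?)",
    [
        (1201, "Diya", 29, 50000, None),   # the president has no manager
        (1205, "Amtra", 26, 30000, 1201),
        (1211, "Rahul", 23, 20000, 1205),
    ],
)

# Join the table with itself: e is the employee, m is that employee's manager.
employee_manager = conn.execute(
    """
    SELECT e.Name, m.Name
    FROM Employee e LEFT JOIN Employee m ON e.Manager_ID = m.Employee_ID
    ORDER BY e.Employee_ID
    """
).fetchall()
print(employee_manager)
```

The two aliases e and m let the same table play two roles in one query.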
Functions
In SQL, the same tuple can appear more than once in a table and in the result of a query.
If the requirement is to list only the distinct values of an attribute, this can be done
by using the keyword 'DISTINCT'. The SQL DISTINCT keyword is used in conjunction with
the SELECT statement to eliminate all the duplicate records and fetch only unique
records.
1. select distinct deptno from teacher;
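The difference DISTINCT makes can be seen in a small sketch using Python's built-in sqlite3 module. The teacher rows below are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teacher (teacherid INTEGER PRIMARY KEY, deptno INTEGER)")
# Hypothetical data: five teachers spread over three departments.
conn.executemany(
    "INSERT INTO teacher VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 2), (4, 2), (5, 3)],
)

all_deptnos = [row[0] for row in conn.execute("SELECT deptno FROM teacher ORDER BY deptno")]
distinct_deptnos = [
    row[0] for row in conn.execute("SELECT DISTINCT deptno FROM teacher ORDER BY deptno")
]
print(all_deptnos, distinct_deptnos)
```

Without DISTINCT each department number is repeated once per teacher; with DISTINCT each appears only once.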
HAVING
The HAVING Clause enables you to specify conditions that filter which
group results appear in the results. The HAVING clause was added to SQL
because the WHERE keyword could not be used with aggregate functions.
Query:-
1. Select deptno, deptname, count(*) as No_of_teachers from
teacher, department where deptno = deptid group by deptno, deptname
having count(*) > 1;
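A HAVING query of this kind can be tried with a sketch using Python's built-in sqlite3 module. The department names and teacher rows are assumptions. The query lists only the departments that have more than one teacher.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (deptid INTEGER PRIMARY KEY, deptname TEXT)")
conn.execute("CREATE TABLE teacher (teacherid INTEGER PRIMARY KEY, deptno INTEGER)")
# Hypothetical data: Hindi has 3 teachers, English has 1, Maths has 2.
conn.executemany(
    "INSERT INTO department VALUES (?, ?)", [(1, "Hindi"), (2, "English"), (3, "Maths")]
)
conn.executemany(
    "INSERT INTO teacher VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 2), (4, 1), (5, 3), (6, 3)],
)

# WHERE filters rows before grouping; HAVING filters the groups afterwards.
busy_departments = conn.execute(
    """
    SELECT deptno, deptname, COUNT(*) AS No_of_teachers
    FROM teacher, department
    WHERE deptno = deptid
    GROUP BY deptno, deptname
    HAVING COUNT(*) > 1
    ORDER BY deptno
    """
).fetchall()
print(busy_departments)
```

English is left out because its group contains only one teacher.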
Alias
SQL aliases are used to give a table, or a column in a table, a temporary name. Aliases are often used to
make column/table names more readable. An alias only exists for the duration of the query.
Suppose the teacher and department table both had same names for the department number, say
Dept_ID as shown below:
CREATE TABLE Department
( Dept_ID INTEGER PRIMARY KEY,
Dept_Name VARCHAR (30) NOT NULL
);
CREATE TABLE Teacher (
Teacher_ID INTEGER,
First_Name VARCHAR(20) NOT NULL,
Last_Name VARCHAR(20),
Gender CHAR(1),
Salary DECIMAL(10,2) DEFAULT 40000,
Date_of_Birth DATE,
Dept_ID INTEGER,
CONSTRAINT TEACHER_PK PRIMARY KEY (Teacher_ID),
CONSTRAINT TEACHER_FK FOREIGN KEY (Dept_ID) REFERENCES Department (Dept_ID) );
In such case, when the join condition is specified, there will be an ambiguity about
which Dept_ID we are talking about. To resolve this problem, we have to prefix the
name of the attribute with the relation name followed by a period as shown in the
query below:
Query: To retrieve names of all the teachers who belong to Hindi department.
SELECT First_Name, Last_Name FROM Teacher, Department WHERE
Department.Dept_ID = Teacher.Dept_ID AND Dept_Name = 'Hindi';
Query:-
1. Select t.name, d.name from teacher t, department d where t.deptid = d.deptid;
Without the join condition, the query pairs every teacher with every department (a Cartesian product):
Select t.name, d.name from teacher t, department d;
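Table and column aliases can be tried with a sketch using Python's built-in sqlite3 module. It uses a cut-down version of the Teacher and Department tables defined above; the sample rows and the alias names Teacher_Name and Department_Name are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Department (Dept_ID INTEGER PRIMARY KEY, Dept_Name TEXT)")
conn.execute(
    "CREATE TABLE Teacher (Teacher_ID INTEGER PRIMARY KEY, First_Name TEXT, Dept_ID INTEGER)"
)
# Hypothetical sample rows.
conn.executemany("INSERT INTO Department VALUES (?, ?)", [(1, "Hindi"), (2, "English")])
conn.executemany(
    "INSERT INTO Teacher VALUES (?, ?, ?)", [(10, "Asha", 1), (11, "Ravi", 2)]
)

# An unqualified Dept_ID is ambiguous because both tables have that column.
try:
    conn.execute("SELECT Dept_ID FROM Teacher, Department")
    ambiguous = False
except sqlite3.OperationalError:
    ambiguous = True

# Table aliases t and d remove the ambiguity; AS renames the output columns.
cursor = conn.execute(
    """
    SELECT t.First_Name AS Teacher_Name, d.Dept_Name AS Department_Name
    FROM Teacher t, Department d
    WHERE t.Dept_ID = d.Dept_ID AND d.Dept_Name = 'Hindi'
    """
)
column_names = [column[0] for column in cursor.description]
hindi_teachers = cursor.fetchall()
print(ambiguous, column_names, hindi_teachers)
```

The aliases exist only for the duration of the query; the tables and columns keep their original names.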
Surbhi Bansal
Sometimes it is required to apply certain mathematical functions on a group of values in a
database. Such functions are called Aggregate Functions. An example is retrieving the
total number of teachers in all the departments. Following are the commonly used
built-in aggregate functions:
• SUM – It finds the sum of all the values for a selected attribute which has numeric data
type.
• MAX – It finds the maximum value out of all the values for a selected attribute which
has numeric data type.
• MIN – It finds the minimum value out of all the values for a selected attribute which has
numeric data type.
• AVG – It finds the average value of all the values for a selected attribute which has
numeric data type.
• COUNT – It counts the rows (COUNT(*)) or the non-NULL values of a selected
attribute.
• SELECT SUM(Salary) AS Total_Salary, AVG(Salary) AS Average_Salary
FROM Teacher;
• SELECT MAX(Salary) AS Max_Salary, MIN(Salary) AS Min_Salary
FROM Teacher;
• SELECT COUNT(Salary) FROM Teacher WHERE Salary > 40000;
• SELECT COUNT(*) FROM Teacher WHERE Salary > 40000;
• SELECT First_Name, Last_Name, Salary, Salary*1.1 AS New_Salary
FROM Teacher WHERE Dept_ID = 4;
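The aggregate functions above can be checked on a tiny table with a sketch using Python's built-in sqlite3 module. The salaries are made up, and one teacher is given a NULL salary to show how COUNT(Salary) differs from COUNT(*).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Teacher (Teacher_ID INTEGER PRIMARY KEY, Salary INTEGER)")
# Hypothetical salaries; teacher 4 has no salary recorded (NULL).
conn.executemany(
    "INSERT INTO Teacher VALUES (?, ?)",
    [(1, 40000), (2, 45000), (3, 50000), (4, None)],
)

total, average = conn.execute("SELECT SUM(Salary), AVG(Salary) FROM Teacher").fetchone()
highest, lowest = conn.execute("SELECT MAX(Salary), MIN(Salary) FROM Teacher").fetchone()

# COUNT(Salary) skips NULL values; COUNT(*) counts every row.
salaries_recorded = conn.execute("SELECT COUNT(Salary) FROM Teacher").fetchone()[0]
rows_in_table = conn.execute("SELECT COUNT(*) FROM Teacher").fetchone()[0]
above_40000 = conn.execute(
    "SELECT COUNT(*) FROM Teacher WHERE Salary > 40000"
).fetchone()[0]
print(total, average, highest, lowest, salaries_recorded, rows_in_table, above_40000)
```

SUM, AVG, MAX and MIN also ignore the NULL salary, so the average is taken over three teachers, not four.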
GROUP BY
The GROUP BY clause is used in the SELECT statement to group rows that
have the same values in the listed columns, usually so that an aggregate
function can be applied to each group. ORDER BY, if used, comes after GROUP BY.
• Query:-
1. Select d.deptid, count(t.teacherid) as No_of_teachers, sum(t.salary) as Total_Salary
from department d, teacher t where d.deptid = t.deptno group by d.deptid order by d.deptid;
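Grouping can be tried with a sketch using Python's built-in sqlite3 module. The teacher rows are made up, and the query counts the teachers and totals the salaries for each department number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE teacher (teacherid INTEGER PRIMARY KEY, salary INTEGER, deptno INTEGER)"
)
# Hypothetical data: two teachers in department 1, one in department 2.
conn.executemany(
    "INSERT INTO teacher VALUES (?, ?, ?)",
    [(1, 40000, 1), (2, 45000, 1), (3, 50000, 2)],
)

# One output row per deptno; the aggregates are computed inside each group.
per_department = conn.execute(
    """
    SELECT deptno, COUNT(*) AS No_of_teachers, SUM(salary) AS Total_Salary
    FROM teacher
    GROUP BY deptno
    ORDER BY deptno
    """
).fetchall()
print(per_department)
```

Every column in the SELECT list is either a grouping column or inside an aggregate function, which is what standard SQL expects of a grouped query.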
Question for Practice
Que 1: Consider the following Employee table:
• Table Name: Employee
Employee_id  Employee_Name  Job_title  Salary  Bonus  Age  Manager_Id
1201         Diya           President  50000   Null   29   Null
1205         Amtra          Manager    30000   2500   26   1201
1211         Rahul          Analyst    20000   1500   23   1205
1213         Manish         Salesman   15000   Null   22   1205
1216         Megha          Analyst    22000   1300   25   1201
1217         Mohit          Salesman   16000   Null   22   1205
Field DataType
Machine_Id CHAR(3)
Income DECIMAL(8,2)
• The primary key of the table Machine is Machine_ID. Records in the
table Sales are uniquely identified by the fields Machine_ID and
Date.
(c) Write a query to find the total ticket income of the station “New
Delhi” for each day.
(d) Write a query to find the total number of tickets sold by the
machine (Machine_ID = 122) till date.