DBMS Unit1
Unit – I
The Database Environment, Advantages and Disadvantages of DBMSs, The Three-Level ANSI-SPARC
Architecture, Database Languages, Data Models, Functions of a DBMS, Components of
a DBMS. Relational Model: Introduction, Terminology, Integrity Constraints, Views. The
Relational Algebra: Unary Operations, Set Operations, Join Operations, Division Operation,
Aggregation and Grouping Operations.
Unit – II
SQL: Introduction, Data Manipulation–Simple Queries, Sorting Results, Using the SQL
Aggregate Functions, Grouping Results, Sub-queries, ANY and ALL, Multi-table Queries,
EXISTS and NOT EXISTS, Combining Result Tables, Database Updates. SQL: The ISO SQL Data
Types, Integrity Enhancement Feature–Domain Constraints, Entity Integrity, Referential
Integrity, General Constraints, Data Definition–Creating a Database, Creating a Table, Changing
a Table Definition, Removing a Table, Creating an Index, Removing an Index, Views–Creating a
View, Removing a View, View Resolution, Restrictions on Views, View Updatability, WITH
CHECK OPTION, Advantages and Disadvantages of Views, View Materialization,
Transactions, Discretionary Access Control–Granting Privileges to Other Users, Revoking
Privileges from Users.
Advanced SQL: The SQL Programming Language–Declarations, Assignments, Control
Statements, Exceptions, Cursors, Subprograms, Stored Procedures, Functions, and Packages,
Triggers, Recursion.
Unit – III
Entity–Relationship Modeling: Entity Types, Relationship Types, Attributes, Keys, Strong and
Weak Entity Types, Attributes on Relationships, Structural Constraints, Problems with ER
Models–Fan Traps, Chasm Traps. Enhanced Entity–Relationship Modeling:
Specialization/Generalization, Aggregation, Composition. Functional Dependencies: Anomalies,
Partial Functional Dependency, Transitive Functional Dependency, Multivalued Dependency,
Join Dependency.
Unit – IV
Introduction:
Database Management System: The software used to manage a database is called a
Database Management System (DBMS). For example, MySQL and Oracle are popular
DBMSs used in many different applications. A DBMS allows users to perform the following tasks:
Data Definition: It helps in creation, modification and removal of definitions that define the
organization of data in database.
Data Update: It helps in insertion, modification and deletion of the actual data in the database.
Data Retrieval: It helps in retrieval of data from the database which can be used by applications
for various purposes.
User Administration: It helps in registering and monitoring users, enforcing data security,
monitoring performance, maintaining data integrity, dealing with concurrency control and
recovering information corrupted by unexpected failure.
A file system manages data as files on a hard disk. Users can create, delete, and
update files as required. Consider the example of a file-based
University Management System. Student data is held separately by the respective Departments,
the Academics Section, Result Section, Accounts Section, Hostel Office, and so on. Some of the
data is common to all sections, such as Roll No, Name, Father Name, Address, and Phone number,
while some is held by one section only, such as the hostel allotment number
kept by the Hostel Office. The issues with this system are as follows:
Redundancy of data: Data is said to be redundant if the same data is copied in many places. If
a student wants to change a phone number, it has to be updated in every section.
Similarly, old records of that student must be deleted from every section that holds them.
Inconsistency of Data: Data is said to be inconsistent if multiple copies of the same data do
not match each other. If the phone number differs between the Accounts Section and the
Academics Section, the data is inconsistent. Inconsistency may be caused by typing errors
or by not updating all copies of the same data.
Difficult Data Access: A user must know the exact location of a file to access its data, so the
process is cumbersome and tedious. For example, finding one student's hostel allotment
number among 10,000 unsorted student records means scanning the records one by one.
Unauthorized Access: File System may lead to unauthorized access to data. If a student
gets access to file having his marks, he can change it in unauthorized way.
No Concurrent Access: The access of same data by multiple users at same time is known
as concurrency. File system does not allow concurrency as data can be accessed by only
one user at a time.
No Backup and Recovery: File system does not incorporate any backup and recovery of
data if a file is lost or corrupted.
These are the main reasons that led to the shift from file systems to DBMSs.
A database represents some aspects of real-world applications, and any change in the real world
is reflected in the database. For example, a railway reservation system maintains, for each
train, records such as attendance, waiting lists, and arrival and departure times on a
given day.
A database is self-describing: it contains not only the data but also a description of the
whole data structure, the constraints, and the variables. This distinguishes it from a
traditional file management system, in which the data definition is part of the application
programs rather than stored with the data. These definitions are used by the users and by the
DBMS software when needed.
A database gives a logical relationship between its records and data. So a user can access various
records depending upon the logical conditions by a single query from the database.
A DBMS follows the rules of normalization, which split a relation when any of its attributes
holds redundant values. Normalization is a mathematically well-founded process that
reduces data redundancy.
5. Query Language
A DBMS is equipped with a query language, which makes it more efficient to retrieve and
manipulate data. A user can apply as many different filtering options as required to
retrieve a set of data. This was not possible with traditional file-processing systems.
A DBMS supports a multi-user environment and allows users to access and manipulate data in
parallel. There are restrictions on transactions when users attempt to handle the same data
item, but users are generally unaware of them.
A view is a subset of the database that is defined for a particular user or group of users
of the system. Different users of the system may have different views of the same system. Every
view contains only the data of interest to a user or a group of users. Users do not need
to know how or where the data of interest to them is stored.
8. Security
Features like multiple views offer security to some extent where users are unable to access data
of other users and departments. DBMS offers methods to impose constraints while entering data
into the database and retrieving the same at a later stage. DBMS offers many different levels of
security features, which enables multiple users to have different views with different features.
For example, a user in the Sales department cannot see the data that belongs to the Purchase
department. It is also possible to control how much of the Sales department's data is
displayed to that user. Because the data is not stored as ordinary, directly readable files as
in a traditional file system, it is harder for intruders to read or alter it directly.
9. ACID Properties
DBMS follows the concepts of Atomicity, Consistency, Isolation, and Durability (normally
shortened as ACID) in order to ensure accuracy, completeness, and data integrity. These
concepts are applied on transactions, which manipulate data in a database.
Atomicity
The atomicity property states that a transaction is atomic. An atomic transaction is either
fully completed or not performed at all. Any updates that a transaction effects on a system
are completed in their entirety. If for any reason an error occurs and the transaction is unable
to complete all of its steps, then the system is returned to the state it was in before the
transaction started. An example of an atomic transaction is an account transfer. The money
is removed from account A and then placed into account B. If the system fails after removing the
money from account A, the transaction processing system puts the money back into
account A, returning the system to its original state. This is known as a rollback.
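The account transfer can be sketched as a single SQL transaction. This is only an illustration: the account table, its columns, and the amounts are assumptions, and the keyword that starts a transaction varies between systems (BEGIN, BEGIN TRANSACTION, or START TRANSACTION).

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE account (
    account_id CHAR(1) PRIMARY KEY,
    balance    DECIMAL(10, 2) NOT NULL CHECK (balance >= 0)
);

INSERT INTO account (account_id, balance) VALUES ('A', 500.00);
INSERT INTO account (account_id, balance) VALUES ('B', 100.00);

-- Transfer 200 from A to B as one atomic unit of work.
BEGIN;
UPDATE account SET balance = balance - 200.00 WHERE account_id = 'A';
UPDATE account SET balance = balance + 200.00 WHERE account_id = 'B';
COMMIT;

-- If anything goes wrong before COMMIT, issuing ROLLBACK instead of COMMIT
-- undoes both updates, so the money never leaves A without reaching B.
```

Either both UPDATE statements take effect or neither does, which is what atomicity requires.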
Isolation
When a transaction runs in isolation, it appears to be the only action that the system is carrying
out at one time. If there are two transactions that are both performing the same function and are
running at the same time, transaction isolation will ensure that each transaction thinks it has
exclusive use of the system. This is important in that as the transaction is being executed, the
state of the system may not be consistent. The transaction ensures that the system remains
consistent after the transaction ends, but during an individual transaction, this may not be the
case. If a transaction was not running in isolation, it could access data from the system that may
not be consistent. By providing transaction isolation, this is prevented from happening.
Consistency
A transaction enforces consistency in the system state by ensuring that at the end of any
transaction the system is in a valid state. If the transaction completes successfully, then all
changes to the system will have been properly made, and the system will be in a valid state. If
any error occurs in a transaction, then any changes already made will be automatically rolled
back. This will return the system to its state before the transaction was started. Since the system
was in a consistent state when the transaction was started, it will once again be in a consistent
state. In the account transfer example, the system is consistent if the total of all accounts is
constant. If an error occurs and the money is removed from account A but not added to account B,
the total across all accounts will have changed and the system will no longer be consistent.
Rolling back the removal from account A restores the total to what it should be and returns the
system to a consistent state.
Durability
A transaction is durable in that once it has been successfully completed, all of the changes it
made to the system are permanent. There are safeguards that will prevent the loss of information,
even in the case of system failure. By logging the steps that the transaction performs, the state of
the system can be recreated even if the hardware itself has failed. The concept of durability
allows the developer to know that a completed transaction is a permanent part of the system,
regardless of what happens to the system later on.
Components of DBMS
Users: Users may be of any kind, such as database administrators, system developers, or
database end users.
Database application: A database application may be departmental, personal, organizational,
and/or internal.
DBMS: Software that allows users to create, access, and manipulate the database.
Database: A collection of logically related data treated as a single unit.
Database Environment
A database environment is a system of components that regulate the collection, management and
use of data. It includes software, hardware, people, procedures and the data itself.
The hardware in a database environment includes computers and computer peripherals. The
software is everything from the operating system to the application programs, including
database management software such as Microsoft Access or SQL Server.
The people in a database environment include everyone who administers and uses the system.
The procedures are the rules and instructions given to both the people and the software, and the
data is the collection of facts and information held within the database environment.
The DBMS makes it possible to produce quick answers to ad hoc queries. From a database
perspective, a query is a specific request issued to the DBMS for data manipulation—for
example, to read or update the data. Simply put, a query is a question, and an ad hoc query is a
spur-of-the-moment question. The DBMS sends back an answer (called the query result set) to
the application. For example, end users, when dealing with large amounts of sales data, might
want quick answers to questions (ad hoc queries) such as:
- What was the dollar volume of sales by product during the past six months?
- What is the sales bonus figure for each of our salespeople during the past three months?
- How many of our customers have credit balances of 3,000 or more?
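The last question in the list could be asked as an ad hoc SQL query. This is a minimal sketch: the customer table and its column names are assumptions, not part of the original notes.

```sql
-- Hypothetical table; the names are chosen only for this example.
CREATE TABLE customer (
    customer_id    INTEGER PRIMARY KEY,
    name           VARCHAR(50) NOT NULL,
    credit_balance DECIMAL(10, 2) NOT NULL
);

-- "How many of our customers have credit balances of 3,000 or more?"
SELECT COUNT(*) AS customers_with_high_balance
FROM customer
WHERE credit_balance >= 3000;
```

The single row returned by the DBMS is the query result set for this question.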
Better-managed data and improved data access make it possible to generate better-quality
information, on which better decisions are based. The quality of the information generated
depends on the quality of the underlying data. Data quality is a comprehensive approach to
promoting the accuracy, validity, and timeliness of the data. While the DBMS does not guarantee
data quality, it provides a framework to facilitate data quality initiatives.
One of the disadvantages of a DBMS is increased cost. Database systems require sophisticated hardware and
software and highly skilled personnel. The cost of maintaining the hardware, software, and
personnel required to operate and manage a database system can be substantial. Training,
licensing, and regulation compliance costs are often overlooked when database systems are
implemented.
2. Management complexity
Database systems interface with many different technologies and have a significant impact on a
company’s resources and culture. The changes introduced by the adoption of a database system
must be properly managed to ensure that they help advance the company’s objectives. Given the
fact that database systems hold crucial company data that are accessed from multiple sources,
security issues must be assessed constantly.
3. Maintaining currency
To maximize the efficiency of the database system, you must keep your system current.
Therefore, you must perform frequent updates and apply the latest patches and security measures
to all components.
Because database technology advances rapidly, personnel training costs tend to be significant.
Vendor dependence. Given the heavy investment in technology and personnel training,
companies might be reluctant to change database vendors.
As a consequence, vendors are less likely to offer pricing point advantages to existing customers,
and those customers might be limited in their choice of database system components.
DBMS vendors frequently upgrade their products by adding new functionality. Such new
features often come bundled in new upgrade versions of the software. Some of these versions
require hardware upgrades. Not only do the upgrades themselves cost money, but it also costs
money to train database users and administrators to properly use and manage the new features.
The ANSI-SPARC Architecture, where ANSI-SPARC stands for American National Standards
Institute, Standards Planning And Requirements Committee, is an abstract design standard for a
Database Management System (DBMS), first proposed in 1975.
All users should be able to access the same data but have a different, customized view of it.
These views should be independent: changes to one view should not affect the others.
Users should not need to know physical database storage details. Database storage structures
can be changed without affecting the users' views.
The internal (storage) structure of the database should be unaffected by changes made to its
logical structure. Likewise, moving the database to another hard disk, for example, does not
affect the logical structure of the database.
Levels are
Conceptual level – Describes what data is stored in the database and the relationships among the
data.
Internal Level
The physical representation of the database on the computer to achieve optimal runtime
performance and storage space utilization.
Covers data structures and file organizations used to store data on the storage device.
Conceptual Level
This level contains the logical structure of the entire database. Provides a complete view of the
data requirements of the organization that is independent of any storage considerations.
External Level
Describes the part of the database that is relevant to the user.
The external view includes only the entities, attributes, or relationships in the ‘real world’
that the user is interested in.
Conceptual schema changes (e.g. addition/removal of entities) should not require changes to
external schema or rewrites of application programs.
Internal schema changes (e.g. using different file organizations, storage structures/devices)
should not require change to conceptual or external schemas.
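The two kinds of independence can be sketched in SQL, treating a view as an external schema, the table definition as the conceptual schema, and an index as an internal-level choice. The staff table, the view, and the index below are hypothetical.

```sql
-- Conceptual level: the logical definition of the data (hypothetical table).
CREATE TABLE staff (
    staff_no  INTEGER PRIMARY KEY,
    name      VARCHAR(50) NOT NULL,
    salary    DECIMAL(9, 2),
    branch_no INTEGER
);

-- External level: a view exposing only what one group of users needs.
CREATE VIEW staff_directory AS
SELECT staff_no, name, branch_no
FROM staff;

-- Internal-level change: an index alters how rows are located in storage,
-- but neither the table definition nor the view has to change.
CREATE INDEX idx_staff_branch ON staff (branch_no);

-- Conceptual-level change: adding a column does not affect the view,
-- because the view names its columns explicitly.
ALTER TABLE staff ADD COLUMN email VARCHAR(100);

-- Users of the view run the same query before and after both changes.
SELECT staff_no, name, branch_no FROM staff_directory;
```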
Database Language
o DBMS has appropriate languages and interfaces to express database queries and updates.
o Database languages can be used to read, store and update the data in the database.
o DDL stands for Data Definition Language. It is used to define database structure or
pattern.
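A brief sketch of the three most common DDL statements follows. The department table is a made-up example, and the exact ALTER TABLE syntax differs slightly between systems.

```sql
-- Create a structure (hypothetical table).
CREATE TABLE department (
    dept_id   INTEGER PRIMARY KEY,
    dept_name VARCHAR(40) NOT NULL
);

-- Change the structure by adding a column.
ALTER TABLE department ADD COLUMN location VARCHAR(40);

-- Remove the structure, together with any data it holds.
DROP TABLE department;
```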
The very first data models were flat data models, in which all the data is kept in the
same plane. Earlier data models were not so scientific, and hence were prone to introducing a
lot of duplication and update anomalies.
1. Entity-Relationship Model
2. Relational Model
3. Hierarchical Model
4. Network Model
5. Object Model
1. Entity-Relationship Model
Entity-Relationship (ER) Model is based on the notion of real-world entities and relationships
among them. While formulating real-world scenario into the database model, the ER Model
creates entity set, relationship set, general attributes and constraints.
ER Model is based on −
Mapping cardinalities −
o one to one
o one to many
o many to one
o many to many
2. Relational Model
The most popular data model in DBMS is the Relational Model. It is more rigorously defined
than other models: it is based on first-order predicate logic and defines a table as an n-ary
relation.
The relational data model is the primary data model, used widely around the world for data
storage and processing. The model is simple and has all the properties and capabilities
required to process data with storage efficiency.
Concepts
Tables − In the relational data model, relations are stored in the form of tables. This format
stores the relation among entities. A table has rows and columns, where rows represent records
and columns represent attributes.
Tuple − A single row of a table, which contains a single record for that relation is called a tuple.
Relation instance − A finite set of tuples in the relational database system represents relation
instance. Relation instances do not have duplicate tuples.
Relation schema − A relation schema describes the relation name (table name), attributes, and
their names.
Relation key − Each row has one or more attributes, known as relation key, which can identify
the row in the relation (table) uniquely.
Attribute domain − Every attribute has some pre-defined value scope, known as attribute
domain.
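These terms map onto SQL roughly as follows. The course_catalog table and its rows are invented for illustration.

```sql
-- Relation schema: the relation name plus its attributes (hypothetical).
CREATE TABLE course_catalog (
    course_id VARCHAR(10) PRIMARY KEY,                 -- relation key
    title     VARCHAR(60) NOT NULL,
    credits   INTEGER CHECK (credits BETWEEN 1 AND 6)  -- attribute domain
);

-- Each inserted row is one tuple.
INSERT INTO course_catalog (course_id, title, credits)
VALUES ('CS101', 'Databases', 4);
INSERT INTO course_catalog (course_id, title, credits)
VALUES ('CS102', 'Programming Languages', 3);

-- The rows currently in the table form the relation instance.
SELECT course_id, title, credits FROM course_catalog;
```

Because course_id is the key, a second row with 'CS101' would be rejected, which is how the table avoids duplicate tuples.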
3. Hierarchical Model
The hierarchical model was developed by IBM and North American Rockwell and is known as the
Information Management System.
It represents the data in a hierarchical tree structure.
This model is the first DBMS model.
In this model, the data is organized hierarchically.
It uses pointers to navigate between the stored data.
4. Network Model
The network database model is similar to the hierarchical model; the only difference is that it
allows a record to have more than one parent.
In this model, there is no need for the strict parent-to-child association of the hierarchical model.
It replaces the hierarchical tree with a graph.
It represents the data as record types and one-to-many relationships.
5. Object Model
Object model stores the data in the form of objects, classes and inheritance.
This model handles more complex applications, such as Geographic Information System (GIS),
scientific experiments, engineering design and manufacturing.
It is used in File Management System.
It represents real world objects, attributes and behaviors.
It provides a clear modular structure.
Data dictionary management is one of the most important functions of a database management
system.
DBMS stores definitions of the data elements and their relationships (metadata) in a data
dictionary.
So, all programs that access the data in the database work through the DBMS.
The DBMS uses the data dictionary to look up the required data component structures and
relationships which relieves you from coding such complex relationships in each program.
Additionally, any changes made in a database structure are automatically recorded in the data
dictionary, thereby freeing you from having to modify all of the programs that access the
changed structure.
In other words, the DBMS system provides data abstraction, and it removes structural and data
dependence from the system.
A modern DBMS system provides storage not only for the data, but also for related data entry
forms or screen definitions, report definitions, data validation rules, procedural code, structures
to handle video and picture formats, and so on.
Data storage management is also important for database performance tuning. Performance tuning
relates to the activities that make the database perform more efficiently in terms of storage and
access speed. So, the data storage management is another important function of Database
Management System.
The DBMS creates a security system that enforces user security and data privacy. Security rules
determine which users can access the database, which data items each user can access, and which
data operations (read, add, delete, or modify) the user can perform. This is especially important
in multiuser database systems.
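In SQL, such security rules are usually expressed with GRANT and REVOKE. The sketch below assumes a system that supports roles (for example PostgreSQL or MySQL 8); the sales table and the sales_clerk role are hypothetical, and lightweight engines such as SQLite do not implement these statements.

```sql
-- Hypothetical table and role, used only for this illustration.
CREATE TABLE sales (
    sale_id INTEGER PRIMARY KEY,
    amount  DECIMAL(10, 2) NOT NULL
);
CREATE ROLE sales_clerk;

-- Allow the role to read and add rows, but not to modify or delete them.
GRANT SELECT, INSERT ON sales TO sales_clerk;

-- Later, withdraw the right to add rows; read access remains.
REVOKE INSERT ON sales FROM sales_clerk;
```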
The DBMS provides data access through a query language. A query language is a nonprocedural
language, one that lets the user specify what must be done without having to specify
how it is to be done.
Structured Query Language (SQL) is the de facto query language and data access standard
supported by the majority of DBMS vendors.
- End users can generate answers to queries by filling in screen forms through their preferred
Web browser.
- The DBMS can connect to third-party systems to distribute information via e-mail or other
productivity applications.
Constraints in DBMS
Constraints enforce limits to the data or type of data that can be inserted/updated/deleted from a
table. The whole purpose of constraints is to maintain the data integrity during an
update/delete/insert into a table. In this tutorial we will learn several types of constraints that can
be created in RDBMS.
Types of constraints
NOT NULL
UNIQUE
DEFAULT
CHECK
Key Constraints – PRIMARY KEY, FOREIGN KEY
Domain constraints
Mapping constraints
NOT NULL:
The NOT NULL constraint makes sure that a column does not hold a NULL value. When we do not
provide a value for a particular column while inserting a record into a table, it takes the NULL
value by default. By specifying the NOT NULL constraint, we can be sure that a particular column
(or columns) cannot have NULL values.
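A minimal sketch of the NOT NULL constraint is shown below. The enrolled_student table and its columns are assumptions made for this example.

```sql
-- Hypothetical table: three columns must always be given a value.
CREATE TABLE enrolled_student (
    roll_no  INTEGER      NOT NULL,
    stu_name VARCHAR(35)  NOT NULL,
    stu_age  INTEGER      NOT NULL,
    address  VARCHAR(235)            -- optional, may be NULL
);

-- Accepted: every NOT NULL column has a value; address defaults to NULL.
INSERT INTO enrolled_student (roll_no, stu_name, stu_age)
VALUES (101, 'Chaitanya', 22);

-- Rejected if uncommented: stu_name is missing and may not be NULL.
-- INSERT INTO enrolled_student (roll_no, stu_age) VALUES (102, 26);
```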
UNIQUE:
UNIQUE Constraint enforces a column or set of columns to have unique values. If a column has
a unique constraint, it means that particular column cannot have duplicate values in a table.
DEFAULT:
The DEFAULT constraint provides a default value to a column when there is no value provided
while inserting a record into a table.
CHECK:
This constraint is used for specifying range of values for a particular column of a table. When
this constraint is being set on a column, it ensures that the specified column must have the value
falling in the specified range.
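The UNIQUE, DEFAULT, and CHECK constraints can be combined in one table definition. The exam_result table below is a hypothetical example, not taken from the original notes.

```sql
CREATE TABLE exam_result (
    result_id INTEGER PRIMARY KEY,
    seat_no   INTEGER NOT NULL UNIQUE,                         -- no two rows share a seat number
    marks     INTEGER NOT NULL CHECK (marks BETWEEN 0 AND 100),
    status    VARCHAR(10) DEFAULT 'PENDING'                    -- used when no value is supplied
);

-- Accepted: status is not supplied, so it is stored as 'PENDING'.
INSERT INTO exam_result (result_id, seat_no, marks) VALUES (1, 5001, 78);

-- Rejected if uncommented: seat_no 5001 already exists (UNIQUE).
-- INSERT INTO exam_result (result_id, seat_no, marks) VALUES (2, 5001, 64);

-- Rejected if uncommented: 140 is outside the permitted range (CHECK).
-- INSERT INTO exam_result (result_id, seat_no, marks) VALUES (3, 5002, 140);
```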
Key constraints:
PRIMARY KEY:
Primary key uniquely identifies each record in a table. It must have unique values and cannot
contain nulls. In the below example the ROLL_NO field is marked as primary key, that means
the ROLL_NO field cannot have duplicate and null values.
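A sketch of such a table definition follows. Only ROLL_NO comes from the text; the table name student_roll and the other columns are assumptions.

```sql
CREATE TABLE student_roll (
    ROLL_NO  INTEGER PRIMARY KEY,   -- uniquely identifies each record
    STU_NAME VARCHAR(35) NOT NULL,  -- assumed column
    STU_AGE  INTEGER                -- assumed column
);

INSERT INTO student_roll (ROLL_NO, STU_NAME, STU_AGE) VALUES (101, 'Chaitanya', 22);

-- Rejected if uncommented: ROLL_NO 101 is already in use.
-- INSERT INTO student_roll (ROLL_NO, STU_NAME, STU_AGE) VALUES (101, 'Arya', 26);
```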
FOREIGN KEY:
Definition: Foreign keys are the columns of a table that point to the primary key of another
table. They act as a cross-reference between tables.
For example:
In the below example the Stu_Id column in Course_enrollment table is a foreign key as it points
to the primary key of the Student table.
Course_enrollment table:
Course_Id   Stu_Id
C01         101
C02         102
C03         101
C05         102
C06         103
C07         102
Student table:
101   Chaitanya   22
102   Arya        26
103   Bran        25
104   Jon         21
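The two tables can be declared in SQL as sketched below. Stu_Id and Course_Id come from the example; the names Stu_Name and Stu_Age are assumptions, because the header row of the Student table is not shown. Some systems need foreign key checking to be switched on explicitly (SQLite, for instance, requires PRAGMA foreign_keys = ON).

```sql
CREATE TABLE Student (
    Stu_Id   INTEGER PRIMARY KEY,
    Stu_Name VARCHAR(35) NOT NULL,  -- assumed column name
    Stu_Age  INTEGER                -- assumed column name
);

CREATE TABLE Course_enrollment (
    Course_Id CHAR(3) PRIMARY KEY,
    Stu_Id    INTEGER NOT NULL,
    FOREIGN KEY (Stu_Id) REFERENCES Student (Stu_Id)
);

INSERT INTO Student (Stu_Id, Stu_Name, Stu_Age) VALUES (101, 'Chaitanya', 22);
INSERT INTO Student (Stu_Id, Stu_Name, Stu_Age) VALUES (102, 'Arya', 26);

-- Accepted: both referenced students exist.
INSERT INTO Course_enrollment (Course_Id, Stu_Id) VALUES ('C01', 101);
INSERT INTO Course_enrollment (Course_Id, Stu_Id) VALUES ('C02', 102);

-- Rejected if uncommented: there is no student 999 to point to.
-- INSERT INTO Course_enrollment (Course_Id, Stu_Id) VALUES ('C09', 999);
```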
Note: In practice, a foreign key does not have to reference the primary key of another table. If
it points to a unique column (not necessarily the primary key) of another table, it is still
a foreign key. So a more accurate definition would be: foreign keys are the columns of a
table that point to a unique column of another table.
A table in a DBMS is a set of rows and columns that contain data. Each column in a table has a
unique name and is often referred to as an attribute. A domain is the set of values permitted
for an attribute in a table. For example, a month-of-year domain can accept January,
February, …, December as possible values, and an integer domain can accept whole numbers that
are negative, positive, or zero.
Definition: A domain constraint is a user-defined data type, which we can define like this:
Domain Constraint = data type + Constraints (NOT NULL / UNIQUE / PRIMARY KEY /
FOREIGN KEY / CHECK / DEFAULT)
Example:
For example I want to create a table “student_info” with “stu_id” field having value greater than
100, I can create a domain and table like this:
constraint acc_type_test
check(value in ('Checking', 'Saving'));
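The student_info example described in the text could be written as sketched below. CREATE DOMAIN is part of ISO SQL and is supported by PostgreSQL, but not by MySQL or SQLite. The domain name id_value, the constraint name, and the columns other than stu_id are assumptions.

```sql
-- A domain: INTEGER restricted to values greater than 100.
CREATE DOMAIN id_value AS INTEGER
    CONSTRAINT id_test CHECK (VALUE > 100);

-- The domain is then used like any other data type.
CREATE TABLE student_info (
    stu_id   id_value PRIMARY KEY,
    stu_name VARCHAR(30),  -- assumed column
    stu_age  INTEGER       -- assumed column
);

-- Accepted: 101 satisfies the domain constraint.
INSERT INTO student_info (stu_id, stu_name, stu_age) VALUES (101, 'Chaitanya', 22);

-- Rejected if uncommented: 50 is not greater than 100.
-- INSERT INTO student_info (stu_id, stu_name, stu_age) VALUES (50, 'Arya', 26);
```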
Mapping Cardinality:
One to One: An entity of entity-set A can be associated with at most one entity of entity-set B
and an entity in entity-set B can be associated with at most one entity of entity-set A.
One to Many: An entity of entity-set A can be associated with any number of entities of entity-
set B and an entity in entity-set B can be associated with at most one entity of entity-set A.
Many to One: An entity of entity-set A can be associated with at most one entity of entity-set B
and an entity in entity-set B can be associated with any number of entities of entity-set A.
Many to Many: An entity of entity-set A can be associated with any number of entities of
entity-set B and an entity in entity-set B can be associated with any number of entities of entity-
set A.
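In a relational schema these cardinalities are typically enforced with keys. The sketch below uses hypothetical publisher, title, and writer tables; many to one is simply one to many read from the other side.

```sql
CREATE TABLE publisher (
    publisher_id INTEGER PRIMARY KEY,
    name         VARCHAR(50) NOT NULL
);

-- One to many: a publisher has many titles, each title has one publisher.
CREATE TABLE title (
    title_id     INTEGER PRIMARY KEY,
    name         VARCHAR(100) NOT NULL,
    publisher_id INTEGER NOT NULL,
    FOREIGN KEY (publisher_id) REFERENCES publisher (publisher_id)
);

-- One to one: the primary key is also the foreign key, so each title
-- can have at most one ISBN row, and each ISBN row belongs to one title.
CREATE TABLE title_isbn (
    title_id INTEGER PRIMARY KEY,
    isbn     CHAR(13) NOT NULL UNIQUE,
    FOREIGN KEY (title_id) REFERENCES title (title_id)
);

CREATE TABLE writer (
    writer_id INTEGER PRIMARY KEY,
    name      VARCHAR(50) NOT NULL
);

-- Many to many: a separate table pairs titles with writers.
CREATE TABLE title_writer (
    title_id  INTEGER NOT NULL,
    writer_id INTEGER NOT NULL,
    PRIMARY KEY (title_id, writer_id),
    FOREIGN KEY (title_id)  REFERENCES title (title_id),
    FOREIGN KEY (writer_id) REFERENCES writer (writer_id)
);
```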
Abstraction is one of the main features of database systems. Hiding irrelevant details from
users and providing them with an abstract view of the data makes user-database interaction
easier and more efficient. As discussed earlier, the top level of the three-level DBMS
architecture is the “view level”. The view level provides the “view of data” to
the users and hides irrelevant details such as data relationships, the database schema,
constraints, and security from them.
To fully understand the view of data, you need a basic knowledge of data abstraction and of
instances and schemas.
Data Abstraction in DBMS
Database systems are made-up of complex data structures. To ease the user interaction with
database, the developers hide internal irrelevant details from users. This process of hiding
irrelevant details from user is called data abstraction.
Logical level: This is the middle level of 3-level data abstraction architecture. It describes what
data is stored in database.
View level: Highest level of data abstraction. This level describes the user interaction with
database system.
Example: Let’s say we are storing customer information in a customer table. At the physical
level these records can be described as blocks of storage (bytes, gigabytes, terabytes etc.) in
memory. These details are often hidden from the programmers.
At the logical level these records can be described as fields and attributes along with their
data types, and the relationships among them can be logically implemented. Programmers
generally work at this level because they are aware of such things about database systems.
At the view level, users just interact with the system through a GUI and enter details on the
screen. They are not aware of how the data is stored or what data is stored; such details are
hidden from them.
Relational Algebra
Relational database systems are expected to be equipped with a query language that can assist
its users to query the database instances. There are two kinds of query languages − relational
algebra and relational calculus.
Relational Algebra
Relational algebra is a procedural query language, which takes instances of relations as input
and yields instances of relations as output. It uses operators to perform queries. An operator can
be either unary or binary. They accept relations as their input and yield relations as their
output. Relational algebra is performed recursively on a relation and intermediate results are
also considered relations.
Select
Project
Union
Set difference
Cartesian product
Rename
Select Operation (σ)
It selects tuples that satisfy the given predicate from a relation.
Notation − σ_p(r)
Where σ stands for the selection operator, p for the selection predicate, and r for the relation.
p is a propositional logic formula which may use connectors like and, or, and not. These terms
may use relational operators like =, ≠, ≥, <, >, ≤.
For example −
σsubject = "database"(Books)
Output − Selects tuples from books where subject is 'database'.
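In SQL, the same selection corresponds to the WHERE clause. In this sketch only Books and subject come from the example; the remaining columns are assumptions.

```sql
-- Hypothetical definition of the Books relation.
CREATE TABLE Books (
    book_id INTEGER PRIMARY KEY,  -- assumed column
    title   VARCHAR(100),         -- assumed column
    subject VARCHAR(40)
);

-- Counterpart of the selection above: keep only the rows whose
-- subject is 'database'.
SELECT *
FROM Books
WHERE subject = 'database';
```

One difference is worth noting: relational algebra works on sets, while an SQL query may return duplicate rows unless DISTINCT is used.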
Union Operation (∪)
r ∪ s = { t | t ∈ r or t ∈ s }
Notation − r ∪ s
Where r and s are either database relations or relation result sets (temporary relations).
Set Difference (−)
Notation − r − s
Cartesian Product (×)
Notation − r × s
r × s = { q t | q ∈ r and t ∈ s }
Rename Operation (ρ)
Notation − ρ_x(E)
Set intersection
Assignment
Natural join
1. UNION
2. UNION ALL
3. INTERSECT
4. MINUS
UNION Operation
UNION is used to combine the results of two or more SELECT statements. It eliminates
duplicate rows from the result set. For UNION, the number of columns and their data types
must be the same in both tables to which the operation is applied.
Example of UNION
The First table,
ID   Name
1    abhi
2    adam
The Second table,
ID   Name
2    adam
3    Chester
The result of the UNION,
ID   NAME
1    abhi
2    adam
3    Chester
UNION ALL
This operation is similar to Union. But it also shows the duplicate rows.
The First table,
ID   NAME
1    abhi
2    adam
The Second table,
ID   NAME
2    adam
3    Chester
The result of the UNION ALL,
ID   NAME
1    abhi
2    adam
2    adam
3    Chester
INTERSECT
The INTERSECT operation combines two SELECT statements but returns only the records
that are common to both. For INTERSECT, the number of columns
and their data types must be the same.
Example of Intersect
The First table,
ID   NAME
1    abhi
2    adam
The Second table,
ID   NAME
2    adam
3    Chester
The result of the INTERSECT,
ID   NAME
2    adam
MINUS
The MINUS operation combines the results of two SELECT statements and returns only those
rows that belong to the first result set and do not appear in the second.
Example of Minus
The First table,
ID   NAME
1    abhi
2    adam
The Second table,
ID   NAME
2    adam
3    Chester
The result of the MINUS,
ID   NAME
1    abhi
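The four operations can be sketched as queries over the two example tables. The names first_table and second_table are assumptions, since the original table names are not shown. MINUS is Oracle's keyword; ISO SQL and most other systems call it EXCEPT.

```sql
CREATE TABLE first_table  (ID INTEGER, NAME VARCHAR(20));
CREATE TABLE second_table (ID INTEGER, NAME VARCHAR(20));

INSERT INTO first_table  (ID, NAME) VALUES (1, 'abhi');
INSERT INTO first_table  (ID, NAME) VALUES (2, 'adam');
INSERT INTO second_table (ID, NAME) VALUES (2, 'adam');
INSERT INTO second_table (ID, NAME) VALUES (3, 'Chester');

-- UNION: duplicates removed -> (1, abhi), (2, adam), (3, Chester)
SELECT ID, NAME FROM first_table
UNION
SELECT ID, NAME FROM second_table
ORDER BY ID;

-- UNION ALL: duplicates kept -> (1, abhi), (2, adam), (2, adam), (3, Chester)
SELECT ID, NAME FROM first_table
UNION ALL
SELECT ID, NAME FROM second_table
ORDER BY ID;

-- INTERSECT: rows present in both -> (2, adam)
SELECT ID, NAME FROM first_table
INTERSECT
SELECT ID, NAME FROM second_table;

-- EXCEPT (MINUS in Oracle): rows only in the first -> (1, abhi)
SELECT ID, NAME FROM first_table
EXCEPT
SELECT ID, NAME FROM second_table;
```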
Division Operation
1. If there is a bank in that particular city, that person must have an account in that bank.
2. If there is a course in the list of courses required to graduate, that person must have taken
that course.
Do not worry if all of this is not clear right away; we will explain it as we
move on with this tutorial.
We shall see the second example, mentioned above, in detail.
Table 1: Course_Taken → It consists of the names of Students against the courses that they
have taken.
Student_Name Course
Robert Databases
David Databases
Table 2: Course_Required → It consists of the courses that one is required to take in order to
graduate.
Course
Databases
Programming Languages
Unfortunately, there is no direct way by which we can express the division operator. Let's walk
through the steps, to write the query for the division operator.
Student_name
Robert
David
Hannah
Tom
Student_Name Course
Robert Databases
David Databases
Hannah Databases
Tom Databases
3. Find all the students and the required courses they have not taken
Here, we take the first step toward finding the students who cannot graduate. The idea is
simply to find the students who have not taken certain courses that are required for graduation
and hence won't be able to graduate. These are all the tuples/rows that are present
in StudentsAndRequired and not present in Course_Taken.
CREATE TABLE StudentsAndNotTaken AS
SELECT * FROM StudentsAndRequired
WHERE NOT EXISTS (
    SELECT * FROM Course_Taken
    WHERE StudentsAndRequired.Student_Name = Course_Taken.Student_Name
      AND StudentsAndRequired.Course = Course_Taken.Course
);
The table StudentsAndNotTaken comes out to be:
Student_Name Course
Hannah Databases
Tom Databases
Student_name
David
Hannah
Tom
Student_name
Robert
We have now seen how the individual steps lead to the final answer. Let us now see how to
write all five steps as one single query, so that we do not have to create so many tables.
Student_name
Robert
This gives us the same result as the five steps above.
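One common way to express division in a single statement is a double NOT EXISTS. The query below is a sketch of that technique over the Course_Taken and Course_Required tables and is not necessarily the query from the original tutorial. The column types are assumptions.

```sql
-- Table definitions matching the column names used in the example.
CREATE TABLE Course_Taken (
    Student_Name VARCHAR(40) NOT NULL,
    Course       VARCHAR(40) NOT NULL
);
CREATE TABLE Course_Required (
    Course VARCHAR(40) NOT NULL
);

-- Students for whom there is no required course that they have not taken.
SELECT DISTINCT ct.Student_Name
FROM Course_Taken AS ct
WHERE NOT EXISTS (
    SELECT 1
    FROM Course_Required AS cr
    WHERE NOT EXISTS (
        SELECT 1
        FROM Course_Taken AS ct2
        WHERE ct2.Student_Name = ct.Student_Name
          AND ct2.Course = cr.Course
    )
);
```

The inner NOT EXISTS finds a required course the student is missing, and the outer NOT EXISTS keeps only students with no such course.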
SQL | Join (Inner, Left, Right and Full Joins)
A SQL Join statement is used to combine data or rows from two or more tables based on a
common field between them. Different types of Joins are:
INNER JOIN
LEFT JOIN
RIGHT JOIN
FULL JOIN
The examples below use two tables: Student and StudentCourse.
LEFT JOIN: This join returns all the rows of the table on the left side of the join and the
matching rows of the table on the right side of the join. For rows that have no
matching row on the right side, the result-set will contain NULL. LEFT JOIN is also known as
LEFT OUTER JOIN.
Syntax:
SELECT table1.column1, table1.column2, table2.column1, ....
FROM table1
LEFT JOIN table2
ON table1.matching_column = table2.matching_column;

table1: First table.
table2: Second table.
matching_column: Column common to both the tables.
Note: We can also use LEFT OUTER JOIN instead of LEFT JOIN, both are same.
SELECT Student.NAME,StudentCourse.COURSE_ID
FROM Student
LEFT JOIN StudentCourse
ON StudentCourse.ROLL_NO = Student.ROLL_NO;
RIGHT JOIN: RIGHT JOIN is similar to LEFT JOIN. This join returns all the rows of the
table on the right side of the join and the matching rows of the table on the left side of the
join. For rows that have no matching row on the left side, the result-set will contain NULL.
RIGHT JOIN is also known as RIGHT OUTER JOIN.
Syntax:
SELECT table1.column1, table1.column2, table2.column1, ....
FROM table1
RIGHT JOIN table2
ON table1.matching_column = table2.matching_column;

table1: First table.
table2: Second table.
matching_column: Column common to both the tables.
Note: We can also use RIGHT OUTER JOIN instead of RIGHT JOIN, both are same.
FULL JOIN: FULL JOIN creates the result-set by combining the results of both LEFT JOIN
and RIGHT JOIN. The result-set will contain all the rows from both the tables. For rows
that have no match, the result-set will contain NULL values.
Syntax:
SELECT table1.column1, table1.column2, table2.column1, ....
FROM table1
FULL JOIN table2
ON table1.matching_column = table2.matching_column;

table1: First table.
table2: Second table.
matching_column: Column common to both the tables.
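The difference between an inner and a left join can be sketched with a few made-up rows. The table and column names follow the queries above, but the column types and all of the data are assumptions. The sketch is meant to be run in an empty database.

```sql
CREATE TABLE Student (
    ROLL_NO INTEGER PRIMARY KEY,
    NAME    VARCHAR(30) NOT NULL
);
CREATE TABLE StudentCourse (
    COURSE_ID INTEGER NOT NULL,
    ROLL_NO   INTEGER NOT NULL
);

-- Invented sample rows: Meena is not enrolled in any course.
INSERT INTO Student (ROLL_NO, NAME) VALUES (1, 'Asha');
INSERT INTO Student (ROLL_NO, NAME) VALUES (2, 'Ravi');
INSERT INTO Student (ROLL_NO, NAME) VALUES (3, 'Meena');
INSERT INTO StudentCourse (COURSE_ID, ROLL_NO) VALUES (10, 1);
INSERT INTO StudentCourse (COURSE_ID, ROLL_NO) VALUES (20, 2);

-- INNER JOIN: only students with a matching course row (Asha, Ravi).
SELECT Student.NAME, StudentCourse.COURSE_ID
FROM Student
INNER JOIN StudentCourse
ON StudentCourse.ROLL_NO = Student.ROLL_NO;

-- LEFT JOIN: every student; Meena appears with COURSE_ID = NULL.
SELECT Student.NAME, StudentCourse.COURSE_ID
FROM Student
LEFT JOIN StudentCourse
ON StudentCourse.ROLL_NO = Student.ROLL_NO;
```

Support for RIGHT JOIN and FULL JOIN varies: MySQL has no FULL JOIN, and SQLite added both only in version 3.39.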
Aggregate functions
Aggregate functions perform a calculation on a set of values and return a single value.
Aggregate functions ignore NULL values, except COUNT(*), which counts every row.
They are often used with the GROUP BY clause of the SELECT statement.
Following are the Aggregate functions:
1. AVG
2. MAX
3. MIN
4. SUM
5. COUNT(column)
6. COUNT(*)
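The functions listed above, and the way they treat NULL, can be sketched as follows. The employee table and its rows are invented for illustration and are not the Employee table of the original notes.

```sql
CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY,
    dept   VARCHAR(20) NOT NULL,
    salary DECIMAL(9, 2)          -- may be NULL
);

INSERT INTO employee (emp_id, dept, salary) VALUES (1, 'Sales', 3000);
INSERT INTO employee (emp_id, dept, salary) VALUES (2, 'Sales', NULL);
INSERT INTO employee (emp_id, dept, salary) VALUES (3, 'IT', 5000);

-- COUNT(*) counts all 3 rows; COUNT(salary) skips the NULL and gives 2.
-- AVG, MIN, MAX and SUM also skip the NULL: 4000, 3000, 5000 and 8000.
SELECT COUNT(*)      AS all_rows,
       COUNT(salary) AS rows_with_salary,
       AVG(salary)   AS average_salary,
       MIN(salary)   AS lowest_salary,
       MAX(salary)   AS highest_salary,
       SUM(salary)   AS total_salary
FROM employee;

-- With GROUP BY, each aggregate is computed once per department.
SELECT dept, COUNT(*) AS staff_count, SUM(salary) AS total_salary
FROM employee
GROUP BY dept;
```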
Example
<Employee> Table