DBMS
Improved Performance
Design Issues
1-tier architecture
A one-tier architecture keeps all of the elements of an
application, including the interface, middleware and back-end data, in
one place. Developers see this type of system as the simplest and
most direct design.
***
2-tier architecture:
The two-tier architecture is based on the client-server model. The
client communicates directly with the server; there is no
intermediate layer between them.
2-tier architecture
***
3-tier architecture:
The 3-tier architecture separates its tiers from each other based
on the complexity of the users and how they use the data present in the
database. It is the most widely used architecture to design a DBMS.
3-tier architecture
This architecture is used in different ways by different applications.
It can be used in web applications and distributed applications, and it
is particularly strong when deployed over distributed systems.
Database (Data) Tier − At this tier, the database resides along with
its query processing languages. We also have the relations that
define the data and their constraints at this level.
Application (Middle) Tier − At this tier reside the application
server and the programs that access the database. For a user, this
application tier presents an abstracted view of the database. End-
users are unaware of any existence of the database beyond the
application. At the other end, the database tier is not aware of any
other user beyond the application tier. Hence, the application layer
sits in the middle and acts as a mediator between the end-user and
the database.
User (Presentation) Tier − End-users operate on this tier and they
know nothing about any existence of the database beyond this layer.
At this layer, multiple views of the database can be provided by the
application. All views are generated by applications that reside in
the application tier.
***
n-tier architecture:
N-tier architecture would involve dividing an application into three
different tiers. These would be the
1. Logic tier,
2. The presentation tier, and
3. The data tier.
n- Tier architecture
It is the physical separation of the different parts of the application
as opposed to the usually conceptual or logical separation of the
elements in the model-view-controller (MVC) framework. Another
difference from the MVC framework is that n-tier layers are connected
linearly, meaning all communication must go through the middle layer,
which is the logic tier. In MVC, there is no actual middle layer
because the interaction is triangular; the control layer has access to
both the view and model layers and the model also accesses the view;
the controller also creates a model based on the requirements and
pushes this to the view. However, they are not mutually exclusive, as
the MVC framework can be used in conjunction with the n-tier
architecture, with the n-tier being the overall architecture used and
MVC used as the framework for the presentation tier.
Database Types
1. Distributed Database
In contrast to the centralized database idea, a distributed
database combines information from a common database with
information collected on local computers. The data is not held
in a single location but is distributed across various company
sites. These sites are connected to each other through
communication links that enable access to the distributed data.
2. Relational Database
Such databases are defined by a set of tables, in which data
fits into predefined categories. Each table is made up of rows
and columns: a column holds the data for one category, and a
row holds one instance of data described by those categories.
Structured Query Language (SQL) is the standard interface
between a relational database and its users and application
programs. Several basic operations can be applied to a table,
which makes these databases easy to extend, lets two databases
with a common relation be joined, and lets existing
applications be modified.
3. Object Oriented Database
An object-oriented database combines object-oriented
programming with relational database concepts. Objects created
in object-oriented programming languages such as Java and C++
can be stored in a relational database, but object-oriented
databases are better suited to them. An object-oriented
database is organized around objects rather than actions, and
around data rather than logic. For example, a multimedia record
in a relational database can be a definable data object, as
opposed to an alphanumeric value.
4. Cloud Database
Nowadays, data is often stored in a public cloud, a hybrid
cloud or a private cloud, also known as a virtual environment.
A cloud database is a database optimized or built for such a
virtualized environment. A cloud service offers various
advantages, including the ability to pay for storage capacity
and bandwidth on a per-user basis, scalability on demand, and
high availability. In addition, a cloud platform allows
companies to support enterprise applications delivered as
software as a service.
5. Centralized Database
The data is stored centrally and users from various locations
can access this data. This type of database includes
application processes that help users access the data even
from a remote location. Various authentication procedures are
applied to verify and validate end-users, and the application
processes also provide a registration number that keeps track
of and records data usage.
7. NoSQL Database
These are used for large data sets. There are certain big data
performance problems that are not handled effectively by
relational databases, and NoSQL databases can address such
problems. Large volumes of unstructured data can be analyzed
very efficiently across several virtual servers in the cloud.
8. Commercial Database
These are paid versions of very large databases, designed
for users who want to access the information for assistance.
These databases are subject-specific, and such huge amounts of
information cannot be maintained by individual users. Access
to such databases is provided through commercial links.
9. Personal Database
Data is collected and stored on small and easily manageable
personal computers. The data are usually used by the same
company department and are viewed by a small number of
individuals.
Data Model
Note:
Data object:
The data object is actually a location or region of storage that
contains a collection of attributes or groups of values that act
as an aspect, characteristic, quality, or descriptor of the
object. A vehicle is a data object which can be defined or
described with the help of a set of attributes or data.
Example: Sales databases such as customers, store items, sales.
Meta data:
Metadata in a DBMS is data (details/schema) about other data; it can
also be defined as data about data. The prefix 'meta' generally denotes
something self-referential. In other words, metadata is summarized
data that gives context to the underlying data.
Types of Data Models: There are mainly three different types of data
models: conceptual data models, logical data models, and physical data
models, and each one has a specific purpose. The data models are used
to represent the data and how it is stored in the database and to set
the relationship between data items.
1. Conceptual Data Model: This Data Model defines WHAT the system
contains. This model is typically created by Business
stakeholders and Data Architects. The purpose is to organize
scope and define business concepts and rules.
2. Logical Data Model: Defines HOW the system should be implemented
regardless of the DBMS. This model is typically created by Data
Architects and Business Analysts. The purpose is to develop a
technical map of rules and data structures.
3. Physical Data Model: This Data Model describes HOW the system
will be implemented using a specific DBMS system. This model is
typically created by DBA and developers. The purpose is actual
implementation of the database.
Types of Data Model
Customer and Product are two entities. Customer number and name
are attributes of the Customer entity
Product name and price are attributes of product entity
Sale is the relationship between the customer and product
The physical data model describes the data needs of a single project
or application, though it may be integrated with other physical
data models based on project scope.
The data model contains relationships between tables, which
address the cardinality and nullability of the relationships.
Developed for a specific version of a DBMS, location, data
storage or technology to be used in the project.
Columns should have exact datatypes, lengths assigned and default
values.
Primary and Foreign keys, views, indexes, access profiles, and
authorizations, etc. are defined.
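As a minimal sketch of what such a physical data model could look like for the Customer, Product and Sale example above, the following Python snippet creates the tables in an in-memory SQLite database. The column names, lengths, defaults, the `quantity` column and the index name are assumptions chosen for illustration, and SQLite merely stands in for whichever DBMS a project actually targets.

```python
import sqlite3

# Hypothetical physical model for the Customer/Product/Sale example.
# Names, lengths and defaults are illustrative assumptions.
SCHEMA = """
CREATE TABLE customer (
    customer_number INTEGER PRIMARY KEY,
    customer_name   VARCHAR(50) NOT NULL
);
CREATE TABLE product (
    product_name VARCHAR(50) PRIMARY KEY,
    price        NUMERIC(10, 2) NOT NULL DEFAULT 0
);
-- The Sale relationship becomes a table holding two foreign keys.
CREATE TABLE sale (
    customer_number INTEGER NOT NULL REFERENCES customer(customer_number),
    product_name    VARCHAR(50) NOT NULL REFERENCES product(product_name),
    quantity        INTEGER NOT NULL DEFAULT 1,
    PRIMARY KEY (customer_number, product_name)
);
CREATE INDEX idx_sale_product ON sale(product_name);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# List the tables that now exist in the physical schema.
tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```

The conceptual model only names the entities and the relationship; the datatypes, keys, defaults and index appear only at the physical level.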
Conclusion
Levels of Database
1. Physical/Internal
2. Conceptual/Logical
3. External/View
Levels of DBMS Architecture Diagram
With physical independence, you can change the physical storage
structures or devices without affecting the conceptual schema. Any
change made is absorbed by the mapping between the conceptual
and internal levels. Physical data independence is achieved through
the presence of the internal level of the database and the
transformation from the conceptual level of the database to the
internal level.
Due to Physical independence, any of the below change will not affect
the conceptual layer.
Any change made will be absorbed by the mapping between external and
conceptual levels.
Due to Logical independence, any of the below change will not affect
the external layer.
For example: In the following diagram, we have a schema that shows the
relationship between three tables: Course, Student and Section. The
diagram only shows the design of the database, it doesn’t show the
data present in those tables. Schema is only a structural view(design)
of a database as shown in the diagram below.
***
Logical schema or Conceptual or Logical Level:
This logical level sits between the user level and the physical
storage view. There is only a single conceptual view of a single
database.
***
DBMS Instance
Definition of instance: The data stored in database at a particular
moment of time is called instance of database. Database schema defines
the variable declarations in tables that belong to a particular
database; the value of these variables at a moment of time is called
the instance of that database.
For example, let's say we have a single table student in the database.
Today the table has 100 records, so today the instance of the database
has 100 records. If we add another 100 records to this table by
tomorrow, the instance of the database tomorrow will have 200 records
in the table. In short, the data stored in the database at a
particular moment is called the instance, and it changes over time as
we add or delete data.
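The difference between schema and instance can be sketched with a small script. This is only an illustration using Python's built-in sqlite3 module; the column names `roll` and `name` and the helper `instance_size` are assumptions, not part of the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The schema: declared once and unchanged by the inserts below.
conn.execute("CREATE TABLE student (roll INTEGER PRIMARY KEY, name TEXT)")


def instance_size(connection):
    """Return the number of rows currently stored in the student table."""
    return connection.execute("SELECT COUNT(*) FROM student").fetchone()[0]


# "Today": the instance holds 100 records.
conn.executemany(
    "INSERT INTO student (roll, name) VALUES (?, ?)",
    [(i, f"student-{i}") for i in range(1, 101)],
)
size_today = instance_size(conn)

# "Tomorrow": 100 more records are added, and the schema is still the same.
conn.executemany(
    "INSERT INTO student (roll, name) VALUES (?, ?)",
    [(i, f"student-{i}") for i in range(101, 201)],
)
size_tomorrow = instance_size(conn)
```

Only the stored rows differ between the two measurements; the table definition is identical at both moments.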
1. Centralized Database:
Advantages –
Since all data is stored at a single location, it is
easier to access and coordinate data.
The centralized database has minimal data redundancy since all
data is stored in a single place.
It is cheaper in comparison to the other types of databases.
Disadvantages –
Data traffic is heavier in a centralized database.
If a system failure occurs at the central site, the
entire data set can become unavailable or be lost.
What is a database management system (DBMS)
The DBMS serves as the intermediary between the user and the database.
The database structure itself is stored as a collection of files, and
we can access the data in those files through the DBMS.
The DBMS receives all application requests and translates them into
the complex operations required to fulfill those requests. The DBMS
hides much of the database’s internal complexity from the application
programs and users.
The more users access the data, the greater the risks of data security
breaches. Corporations invest considerable amounts of time, effort,
and money to ensure that corporate data are used properly. A DBMS
provides a framework for better enforcement of data privacy and
security policies.
3. Better data integration
- What was the dollar volume of sales by product during the past six
months?
- What is the sales bonus figure for each of our salespeople during
the past three months?
- How many of our customers have credit balances of 3,000 or more?
The availability of data, combined with the tools that transform data
into usable information, empowers end users to make quick, informed
decisions that can make the difference between success and failure in
the global economy.
1. Increased costs
2. Management complexity
3. Maintaining currency
To maximize the efficiency of the database system, you must keep your
system current. Therefore, you must perform frequent updates and apply
the latest patches and security measures to all components.
Example-
‘Enrolled in’ is a relationship that exists between entities Student and Course.
Relationship Set-
Example-
The number of entity sets that participate in a relationship set is termed as the degree of
that relationship set. Thus,
On the basis of degree of a relationship set, a relationship set can be classified into the
following types-
1. Unary relationship set
2. Binary relationship set
3. Ternary relationship set
4. N-ary relationship set
Unary relationship set is a relationship set where only one entity set participates in a
relationship set.
Example-
One person is married to only one person
Example-
Student is enrolled in a Course
3. Ternary Relationship Set-
Ternary relationship set is a relationship set where three entity sets participate in a
relationship set.
Example-
Participation constraints define the least number of relationship instances in which an entity
must compulsorily participate.
1. Total participation
2. Partial participation
1. Total Participation-
It specifies that each entity in the entity set must compulsorily participate in at least one
relationship instance in that relationship set.
That is why, it is also called as mandatory participation.
Total participation is represented using a double line between the entity set and
relationship set.
Example-
Here,
Double line between the entity set “Student” and relationship set “Enrolled in” signifies
total participation.
It specifies that each student must be enrolled in at least one course.
2. Partial Participation-
It specifies that each entity in the entity set may or may not participate in the relationship
instance in that relationship set.
That is why, it is also called as optional participation.
Partial participation is represented using a single line between the entity set and
relationship set.
Example-
Here,
Single line between the entity set “Course” and relationship set “Enrolled in” signifies
partial participation.
It specifies that there might exist some courses for which no enrollments are made.
In the database, every entity set or relationship set can be represented in tabular form.
There are some points for converting the ER diagram to the table:
In the given ER diagram, LECTURE, STUDENT, SUBJECT and COURSE form individual
tables.
In the STUDENT entity, STUDENT_NAME and STUDENT_ID form the columns of the STUDENT
table. Similarly, COURSE_NAME and COURSE_ID form the columns of the COURSE table, and so
on.
A key attribute of the entity type is represented by the primary key.
In the given ER diagram, COURSE_ID, STUDENT_ID, SUBJECT_ID, and LECTURE_ID are the
key attribute of the entity.
In the given ER diagram, student address is a composite attribute. It contains CITY, PIN,
DOOR#, STREET, and STATE. In the STUDENT table, these attributes can merge as an
individual column.
In the STUDENT table, Age is the derived attribute. It can be calculated at any point of time
by calculating the difference between current date and Date of Birth.
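As a rough illustration of why a derived attribute need not be stored, the sketch below computes Age from Date of Birth on demand. The function name `age_on` is hypothetical, and the rule used (completed years) is one common convention rather than the only possible one.

```python
from datetime import date


def age_on(date_of_birth: date, today: date) -> int:
    """Derive age in completed years from the stored date of birth."""
    # Subtract one year if this year's birthday has not happened yet.
    had_birthday = (today.month, today.day) >= (date_of_birth.month, date_of_birth.day)
    return today.year - date_of_birth.year - (0 if had_birthday else 1)
```

For example, `age_on(date(2000, 6, 15), date(2024, 6, 14))` gives 23, while the same call one day later gives 24, so the stored Date of Birth is enough to answer the question at any time.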
Using these rules, you can convert the ER diagram to tables and columns and assign the
mapping between the tables. Table structure for the given ER diagram is as below:
Following rules are used for converting an ER diagram into the tables-
A strong entity set with only simple attributes will require only one table in relational model.
Attributes of the table will be the attributes of the entity set.
The primary key of the table will be the key attribute of the entity set.
Example-
A strong entity set with any number of composite attributes will require only one table in
relational model.
While conversion, simple attributes of the composite attributes are taken into account and
not the composite attribute itself.
Example-
Types of Attributes in DBMS: Simple attributes, Composite attributes, Single valued attributes
Multi valued attributes, Derived attributes, Key attributes
Rule-03: For Strong Entity Set With Multi Valued Attributes-
A strong entity set with any number of multi valued attributes will require two tables in
relational model.
One table will contain all the simple attributes with the primary key.
Other table will contain the primary key and all the multi valued attributes.
Example-
Roll_no City
Roll_no Mobile_no
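Rule-03 can be sketched as follows, using Python's sqlite3 module. The table names `student` and `student_mobile` and the sample values are assumptions for illustration; only the two column layouts (Roll_no with City, and Roll_no with Mobile_no) come from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Table 1: primary key plus the simple attributes.
CREATE TABLE student (
    roll_no INTEGER PRIMARY KEY,
    city    TEXT
);
-- Table 2: primary key plus the multi valued attribute,
-- one row per (student, mobile number) pair.
CREATE TABLE student_mobile (
    roll_no   INTEGER NOT NULL REFERENCES student(roll_no),
    mobile_no TEXT NOT NULL,
    PRIMARY KEY (roll_no, mobile_no)
);
""")

conn.execute("INSERT INTO student VALUES (1, 'Delhi')")
conn.executemany(
    "INSERT INTO student_mobile VALUES (?, ?)",
    [(1, "9000000001"), (1, "9000000002")],
)

# One student, several mobile numbers.
mobiles = [
    row[0]
    for row in conn.execute(
        "SELECT mobile_no FROM student_mobile WHERE roll_no = 1 ORDER BY mobile_no"
    )
]
```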
Rule-04: Translating Relationship Set into a Table-
A relationship set will require one table in the relational model.
Attributes of the table are-
Primary key attributes of the participating entity sets
Its own descriptive attributes if any.
Set of non-descriptive attributes will be the primary key.
Example-
NOTE-
If we consider the overall ER diagram, three tables will be required in relational model-
One table for the entity set “Employee”
One table for the entity set “Department”
One table for the relationship set “Works in”
Table-A columns: a1 , a2
Table-R columns: a1 , b1
Table-B columns: b1 , b2
STUDENT COURSE
Table-A columns: a1 , a2
Table-BR columns: a1 , b1 , b2
NOTE- Here, combined table will be drawn for the entity set B and relationship set R.
STUDENT COLLEGE
Table-AR columns: a1 , a2 , b1
Table-B columns: b1 , b2
NOTE- Here, combined table will be drawn for the entity set A and relationship set R.
Case-04: For Binary Relationship With Cardinality Ratio 1:1
PERSON PASSPORT
Here, two tables will be required. Either combine ‘R’ with ‘A’ or ‘B’
Way-01:
1. AR ( a1 , a2 , b1 )
2. B ( b1 , b2 )
Table-AR columns: a1 , a2 , b1
Table-B columns: b1 , b2
Way-02:
1. A ( a1 , a2 )
2. BR ( a1 , b1 , b2 )
Table-A columns: a1 , a2
Table-BR columns: a1 , b1 , b2
While determining the minimum number of tables required for binary relationships with given
cardinality ratios, the following rules of thumb must be kept in mind-
For a binary relationship with cardinality ratio m : n , separate tables will be drawn
for each entity set and for the relationship set.
For a binary relationship with cardinality ratio either m : 1 or 1 : n , always remember “many side
will consume the relationship”, i.e. a combined table will be drawn for the many-side entity set and
the relationship set.
For a binary relationship with cardinality ratio 1 : 1 , two tables will be required. You can combine
the relationship set with any one of the entity sets.
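The rules of thumb above can be summarized as a small lookup. This is a sketch only: the function name is hypothetical, and A, B and R stand for the two entity sets and the relationship set as in the cases above, with the ratio read as A : B.

```python
def tables_for_binary_relationship(ratio):
    """Return the minimum set of tables for entity sets A, B and relationship R."""
    rules = {
        "m:n": ["A", "R", "B"],  # a separate table for each
        "1:n": ["A", "BR"],      # B is the many side, so B consumes R
        "m:1": ["AR", "B"],      # A is the many side, so A consumes R
        "1:1": ["AR", "B"],      # or, equally valid, ["A", "BR"]
    }
    return rules[ratio]
```

Calling `tables_for_binary_relationship("m:n")` yields three tables, while every other ratio yields two.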
Case-01: For Binary Relationship With Cardinality Constraint and Total Participation
Constraint From One Side-
Because cardinality ratio = 1 : n , so we will combine the entity set B and relationship set R.
Then, two tables will be required-
1. A ( a1 , a2 )
2. BR ( a1 , b1 , b2 )
Because of total participation, the foreign key a1 acquires a NOT NULL constraint, so it
cannot be null.
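A minimal sketch of this case, assuming SQLite through Python's sqlite3 module: table BR carries the foreign key a1, and NOT NULL expresses the total participation. The column types and sample values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE A (a1 INTEGER PRIMARY KEY, a2 TEXT);
CREATE TABLE BR (
    b1 INTEGER PRIMARY KEY,
    b2 TEXT,
    a1 INTEGER NOT NULL REFERENCES A(a1)  -- total participation: never NULL
);
""")

conn.execute("INSERT INTO A VALUES (1, 'x')")
conn.execute("INSERT INTO BR VALUES (10, 'y', 1)")  # participates in R: accepted

# A row of BR that takes part in no relationship instance is rejected.
try:
    conn.execute("INSERT INTO BR VALUES (11, 'z', NULL)")
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True
```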
Case-02: For Binary Relationship With Cardinality Constraint and Total Participation
Constraint From Both Sides-
If there is a key constraint from both sides of the relationship together with total participation,
then that binary relationship is represented using only a single table.
Weak entity set always appears in association with identifying relationship with total
participation constraint.
Here, two tables will be required-
1. A ( a1 , a2 )
2. BR ( a1 , b1 , b2 )
Entity Set in DBMS-
In ER diagram,
Attributes are associated with an entity set.
Attributes describe the properties of entities in the entity set.
Based on the values of certain attributes, an entity can be identified uniquely.
A strong entity set is an entity set that contains sufficient attributes to uniquely
identify all its entities.
In other words, a primary key exists for a strong entity set.
Primary key of a strong entity set is represented by underlining it.
Symbols Used-
Example-
In this ER diagram,
Two strong entity sets “Student” and “Course” are related to each other.
Student ID and Student name are the attributes of entity set “Student”.
Student ID is the primary key using which any student can be identified
uniquely.
Course ID and Course name are the attributes of entity set “Course”.
Course ID is the primary key using which any course can be identified uniquely.
Double line between Student and relationship set signifies total participation.
It suggests that each student must be enrolled in at least one course.
Single line between Course and relationship set signifies partial participation.
It suggests that there might exist some courses for which no enrollments are
made.
A weak entity set is an entity set that does not contain sufficient attributes to
uniquely identify its entities.
In other words, a primary key does not exist for a weak entity set.
However, it contains a partial key called as a discriminator.
Discriminator can identify a group of entities from the entity set.
Discriminator is represented by underlining with a dashed line.
NOTE-
The combination of discriminator and primary key of the strong entity set makes
it possible to uniquely identify all entities of the weak entity set.
Thus, this combination serves as a primary key for the weak entity set.
Clearly, this primary key is not formed by the weak entity set completely.
Symbols Used-
In this ER diagram,
One strong entity set “Building” and one weak entity set “Apartment” are
related to each other.
Strong entity set “Building” has building number as its primary key.
Door number is the discriminator of the weak entity set “Apartment”.
This is because door number alone can not identify an apartment uniquely as
there may be several other buildings having the same door number.
Double line between Apartment and relationship set signifies total participation.
It suggests that each apartment must be present in at least one building.
Single line between Building and relationship set signifies partial participation.
It suggests that there might exist some buildings which have no apartments.
Thus,
Primary key of Apartment
= Primary key of Building + Its own discriminator
= Building number + Door number
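The composite key of the weak entity set can be sketched in SQL, run here through Python's sqlite3 module. The table and column names are assumptions; the structure (building number plus door number as the primary key of Apartment) follows the derivation above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE building (building_no INTEGER PRIMARY KEY);
CREATE TABLE apartment (
    building_no INTEGER NOT NULL REFERENCES building(building_no),
    door_no     INTEGER NOT NULL,              -- the discriminator
    PRIMARY KEY (building_no, door_no)         -- owner key + discriminator
);
""")

conn.executemany("INSERT INTO building VALUES (?)", [(1,), (2,)])
# The same door number in two different buildings is allowed.
conn.executemany("INSERT INTO apartment VALUES (?, ?)", [(1, 101), (2, 101)])

# The same door number twice in one building is not.
try:
    conn.execute("INSERT INTO apartment VALUES (1, 101)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

apartment_count = conn.execute("SELECT COUNT(*) FROM apartment").fetchone()[0]
```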
Strong entity set: A single rectangle is used for the representation of a strong entity set.
Weak entity set: A double rectangle is used for the representation of a weak entity set.
Strong entity set: Total participation may or may not exist in the relationship.
Weak entity set: Total participation always exists in the identifying relationship.
Important Note-
In ER diagram, weak entity set is always present in total participation with the
identifying relationship set.
So, we always have the picture like shown here-
Constraints in DBMS-
1. Domain constraint
2. Tuple Uniqueness constraint
3. Key constraint
4. Entity Integrity constraint
5. Referential Integrity constraint
1. Domain Constraint-
Example-
S001 Akshay 20
S002 Abhishek 21
S003 Shashank 20
S004 Rahul A
Here, value ‘A’ is not allowed since only integer values can be taken by the age attribute.
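A domain constraint like this one can be sketched with a CHECK clause. The example assumes SQLite through Python's sqlite3 module; because SQLite is loosely typed, the check on `typeof(age)` is what actually enforces the integer domain here, and other DBMSs would typically reject the value from the column type alone. The column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE student (
    stu_id TEXT PRIMARY KEY,
    name   TEXT,
    age    INTEGER CHECK (typeof(age) = 'integer')  -- domain: integers only
)
""")

conn.execute("INSERT INTO student VALUES ('S001', 'Akshay', 20)")  # accepted

# The value 'A' is outside the domain of age, so the insert fails.
try:
    conn.execute("INSERT INTO student VALUES ('S004', 'Rahul', 'A')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```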
2. Tuple Uniqueness Constraint-
TUPLE: A single row of a table, which contains a single record for that relation, is called
a tuple.
Tuple Uniqueness constraint specifies that all the tuples must be necessarily unique in any relation.
Example-01:
Consider the following Student table-
S001 Akshay 20
S002 Abhishek 21
S003 Shashank 20
S004 Rahul 20
This relation satisfies the tuple uniqueness constraint since here all the tuples are unique.
Example-02:
Consider the following Student table-
S001 Akshay 20
S001 Akshay 20
S003 Shashank 20
S004 Rahul 20
This relation does not satisfy the tuple uniqueness constraint since not all of the tuples are unique.
3. Key Constraint-
Key constraint specifies that in any relation-
Example-
S001 Akshay 20
S001 Abhishek 21
S003 Shashank 20
S004 Rahul 20
This relation does not satisfy the key constraint as not all values of the primary key are unique.
4. Entity Integrity Constraint-
Entity integrity is the mechanism the system provides to maintain primary keys.
The primary key serves as a unique identifier for rows in the table. Entity
integrity ensures two properties for primary keys: the primary key for a row is
unique, so it does not match the primary key of any other row in the table; and
the primary key is never null.
Entity integrity constraint specifies that no attribute of the primary key may contain a null
value in any relation.
This is because a null value in the primary key would make it impossible to identify the row uniquely.
Example-
S001 Akshay 20
S002 Abhishek 21
S003 Shashank 20
Rahul 20
This relation does not satisfy the entity integrity constraint as here the primary key contains a NULL value.
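The key constraint and the entity integrity constraint can both be tried out with a short script. This sketch assumes SQLite through Python's sqlite3 module; the column names and the helper `try_insert` are hypothetical. Note that SQLite, for historical reasons, accepts NULL in a non-integer primary key unless NOT NULL is declared, so the constraint is spelled out explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE student (
    stu_id TEXT NOT NULL PRIMARY KEY,  -- unique and never NULL
    name   TEXT,
    age    INTEGER
)
""")
conn.execute("INSERT INTO student VALUES ('S001', 'Akshay', 20)")


def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


# Entity integrity: a NULL primary key is rejected.
null_key_accepted = try_insert((None, "Rahul", 20))
# Key constraint: a repeated primary key value is rejected.
duplicate_key_accepted = try_insert(("S001", "Abhishek", 21))
```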
5. Referential Integrity Constraint-
Whenever two tables contain one or more common columns, Oracle can enforce the
relationship between the two tables through a referential integrity constraint. Define a
PRIMARY or UNIQUE key constraint on the column in the parent table (the one that has
the complete set of column values).
This constraint is enforced when a foreign key references the primary key of a relation.
It specifies that all the values taken by the foreign key must either be available in the relation of the
primary key or be null.
Important Results-
The following two important results emerge from the referential integrity constraint-
We cannot insert a record into a referencing relation if the corresponding record does not exist in the
referenced relation.
We cannot delete or update a record of the referenced relation if a corresponding record exists in the
referencing relation.
Example-
Department
Dept_no Dept_name
D10 ASET
D11 ALS
D12 ASFL
D13 ASHS
Here,
The relation ‘Student’ does not satisfy the referential integrity constraint.
This is because in relation ‘Department’, no value of primary key specifies department no. 14.
Thus, referential integrity constraint is violated.
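The two results above can be reproduced with a short sketch, assuming SQLite through Python's sqlite3 module. The Department rows come from the example; the Student rows, the column names and the helper `violates` are hypothetical stand-ins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE department (dept_no TEXT PRIMARY KEY, dept_name TEXT);
CREATE TABLE student (
    stu_id  TEXT PRIMARY KEY,
    name    TEXT,
    dept_no TEXT REFERENCES department(dept_no)  -- foreign key, may be NULL
);
""")
conn.executemany(
    "INSERT INTO department VALUES (?, ?)",
    [("D10", "ASET"), ("D11", "ALS"), ("D12", "ASFL"), ("D13", "ASHS")],
)
conn.execute("INSERT INTO student VALUES ('S001', 'Akshay', 'D10')")  # D10 exists
conn.execute("INSERT INTO student VALUES ('S002', 'Abhishek', NULL)")  # NULL is allowed


def violates(sql):
    """Return True if the statement is rejected by an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Inserting a reference to a department that does not exist fails.
bad_insert = violates("INSERT INTO student VALUES ('S003', 'Shashank', 'D14')")
# Deleting a department that a student still references fails.
bad_delete = violates("DELETE FROM department WHERE dept_no = 'D10'")
```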
Referential integrity constraints are based on the concept of foreign keys. A foreign key is an
attribute of a relation that refers to a key attribute of another relation. A referential integrity
constraint arises when a relation refers to a key attribute of a different or the same relation; that key
value must exist in the referenced table.
Example:
Tuple for CustomerID =1 is referenced twice in the relation Billing. So we know CustomerName=Google
has billing amount $300
Cardinality Constraint-
Cardinality constraint defines the maximum number of relationship instances in which an entity
can participate.
1. Many-to-Many Cardinality-
Symbol Used-
Example-
Here,
One student can enroll in any number (zero or more) of courses.
One course can be enrolled by any number (zero or more) of students.
2. Many-to-One Cardinality-
Symbol Used-
Example-
Here,
One student can enroll in at most one course.
One course can be enrolled by any number (zero or more) of students.
3. One-to-Many Cardinality-
Symbol Used-
Example-
Here,
One student can enroll in any number (zero or more) of courses.
One course can be enrolled by at most one student.
4. One-to-One Cardinality-
Symbol Used-
Example-
Here,
One student can enroll in at most one course.
One course can be enrolled by at most one student.
Attributes in ER Diagram-
Attributes are the descriptive properties which are owned by each entity
of an Entity Set.
There exist a specific domain or set of values for each attribute from
where the attribute can take its values.
Types of Attributes-
In ER diagram, attributes associated with an entity set may be of the
following types-
1. Simple attributes
2. Composite attributes
3. Single valued attributes
4. Multi valued attributes
5. Derived attributes
6. Key attribute
1. Simple Attributes-
Simple attributes are those attributes which cannot be divided further.
Example-
Here, all the attributes are simple attributes as they can not be divided
further.
2. Composite Attributes-
Composite attributes are those attributes which are composed of many other
simple attributes.
Example-
Here, the attributes “Name” and “Address” are composite attributes as they
are composed of many other simple attributes.
3. Single Valued Attributes-
Single valued attributes are those attributes which can take only one
value for a given entity from an entity set.
Example-
Here, all the attributes are single valued attributes as they can take
only one specific value for each entity.
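4. Multi Valued Attributes-
Multi valued attributes are those attributes which can take more than
one value for a given entity from an entity set.
Example-
Here, the attributes “Mob_no” and “Email_id” are multi valued attributes
as they can take more than one value for a given entity.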
5. Derived Attributes-
Derived attributes are those attributes which can be derived from other
attribute(s).
Example-
6. Key Attributes-
Key attributes are those attributes which can identify an entity uniquely
in an entity set.
Example-
In the above data item, each column is a field and each row is a
record.
Types of Keys in Database Management System: Each key which has
the parameter of uniqueness is as follows:
1. Super key
2. Candidate key
3. Primary key
4. Composite key
5. Secondary or Alternative key
6. Non- key attribute
7. Non- prime attribute
8. Foreign key
9. Simple key
10. Compound key
11. Artificial key
Super keys: The above table has following super keys. All of the
following sets of super key are able to uniquely identify a row of the
employee table.
{REGD.NO}
{REGD.NO, NAME}
{REGD.NO, CLASS}
{REGD.NO, CLASS, AGE}
{REGD.NO, CLASS, AGE, SEX}…ETC
*****
2. Candidate keys
A candidate key is a minimal super key, that is, a super key
with no redundant attributes.
Candidate keys are the sets of fields from which the primary
key can be selected. Every table must have at least one candidate
key and may have several. Every candidate key is also a super key.
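The relationship between super keys and candidate keys can be illustrated with a small check over sample rows. This is a sketch under stated assumptions: the rows and the function names are hypothetical, and testing sample data only shows which attribute sets are consistent with that data. Whether a set really is a key is a decision about the schema, not about one instance.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows, attributes):
    """Return the minimal super keys of the sample, smallest first."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            # A set that contains a smaller key is a super key but not minimal.
            if any(set(key) <= set(combo) for key in keys):
                continue
            if is_superkey(rows, combo):
                keys.append(combo)
    return keys


# Hypothetical sample rows, for illustration only.
rows = [
    {"REGD_NO": 1, "NAME": "Asha", "CLASS": "X"},
    {"REGD_NO": 2, "NAME": "Ravi", "CLASS": "X"},
    {"REGD_NO": 3, "NAME": "Asha", "CLASS": "IX"},
]
keys = candidate_keys(rows, ["REGD_NO", "NAME", "CLASS"])
```

On this sample, `{REGD_NO, NAME}` is a super key but not a candidate key, because `REGD_NO` alone already identifies every row.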
Example:
3. Primary key
The candidate key chosen as the main key of a table is the
primary key.
A primary key is compulsory in every table.
The properties of a primary key are:
Model stability
Occurrence of minimum fields
Defining value for every record i.e. being definitive
Feature of accessibility
Example
*****
4.Composite key
Key that consists of two or more attributes that uniquely
identify any record in a table is called Composite key.
Below is the table that shows the diagram for the above table:
From the table, we found that no attribute is available that alone can
identify a record in the table and can become a primary key. However,
the combination of some attributes can form a key and can identify a
tuple in the table. In the above example, Cust_Id and Prod_code can
together form a primary key because they alone are not able to
identify a tuple, but together they can do so.
*****
5.Secondary or Alternative key
The candidate keys that are not selected as the primary key are
called secondary or alternative keys.
Example:
6.Non-key Attribute
The attributes excluding the candidate keys are called as
non-key attributes.
Example:
7.Non-prime Attribute
Attributes that are not prime attributes, that is, not part of
any candidate key, are non-prime attributes.
Example:
8. Foreign key
A foreign key is an attribute (or set of attributes) in one
table that refers to the primary key of another table.
Example:
9. Simple key
A simple key has a single field that uniquely identifies a
record. The single field cannot be divided into more fields. A primary
key made up of one field is also a simple key.
Example: In the example below, studentId is a simple key: it is a
single field, and no two students have the same Id.
1 Adam 34 13000
2 Alex 28 15000
3 Stuart 20 18000
4 Ross 42 19020
1 Adam 34 13000
What is an Attribute?
Attribute Domain
Hence, the attribute Name will hold the name of the employee for every
tuple. If we save the employee's address there, it will be a violation of
the relational database model.
Name
Adam
Alex
Stuart - 9/401, OC Street, Amsterdam
Ross
1. Key Constraints
2. Domain Constraints
Key Constraints
The key attribute must never be NULL and must never hold the same
value in two different rows.
Domain Constraint
Domain constraints are the rules defined for the values that can
be stored in a certain attribute.
As explained above, we cannot store the address of an employee in the
column for Name.
The table from which the values are derived is known as the Master or
Referenced table, and the table into which values are inserted
accordingly is known as the Child or Referencing table. In other words,
the table containing the foreign key is called the child table, and the
table containing the primary key/candidate key is called the referenced
or parent table. In the relational model, a candidate key is a minimal
set of one or more attributes that uniquely identifies each tuple.
CREATE TABLE Student (Roll int PRIMARY KEY, Name varchar(25), Course varchar(10));
Here column Roll is acting as Primary Key, which will help in deriving
the value of foreign key in the child table.
The syntax of Child Table or Referencing table is:
CREATE TABLE Subject (Roll int references Student, SubCode int, SubName varchar(10));
In the above table, column Roll is acting as Foreign Key, whose values
are derived using the Roll value of Primary key from Master table.
A table is a collection of data organized in rows and columns.
In DBMS terms, a table is known as a relation and a row as a tuple.
A table has a specified number of columns but can have any number of
rows.
Syntax
CREATE TABLE table_name (
column1 datatype,
column2 datatype,
column3 datatype,
....
);
The column parameters specify the names of the columns of the table.
The datatype parameter specifies the type of data the column can hold (e.g. varchar,
integer, date, etc.).
The following example creates a table called "Persons" that contains five columns:
PersonID, LastName, FirstName, Address, and City:
SQL> CREATE TABLE Persons (
PersonID int,
LastName varchar(20),
FirstName varchar(2),
Address varchar(2),
City varchar(20)
);
Table created.
The PersonID column is of type int and will hold an integer.
The LastName, FirstName, Address, and City columns are of type varchar and will hold
characters; the number in parentheses is the maximum length of each field (20 characters for LastName, for example).
Syntax
CREATE TABLE new_table_name AS
SELECT column1,column2,...
FROM existing_table_name
WHERE ....;
The following SQL creates a new table called "testTable" (a copy of three columns of the
"Persons" table):
EXAMPLE:
SQL> create table testTable as
select PersonID,
LastName,
FirstName
from Persons;
Table created.
no rows selected
1 row created.
SQL> select * from Persons;
SQL> /
Note: If values have been inserted into the Persons table before the
new table (testTable) is created from it, the new table contains the
rows of the parent table by default.
4. Subtraction (-)
Example 1:
Select 10-20 as VALUE from dual;
Output:
Example 2:
Select 20-10 as VALUE from dual;
Output:
5. Multiplication (*)
Select 10*20 as VALUE from dual;
Output:
6. Division (/)
Example 1:
Select 10/20 as VALUE from dual;
Output:
Example 2:
Select 20/10 as VALUE from dual;
Output:
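7. Modulus (%)
MySQL accepts the % operator (for example 10%20). Oracle does not
support % as an operator and provides the MOD() function instead.
Example 1:
Select MOD(10, 20) as VALUE from dual;
Output:
Example 2:
Select MOD(20, 10) as VALUE from dual;
Output: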
SQL Comparison Operators
Comparison operators compare two operand values, or one expression with
another inside a condition, and return a result that is either true or
false.
OPERATOR OPERATION EXPLANATION
SQL> CREATE TABLE FACULTY (FID INT NOT NULL, FNAME VARCHAR(10) NOT
NULL, MOBILE_NO INT, JOBID VARCHAR(5), SALARY INT);
Table created.
****
4. Less than (<)
***
5. Less than or equal to (<=)
SELECT * FROM FACULTY WHERE SALARY <= 30000;
Consider the tables “FACULTY” and “JOBS” below, which are used as a
reference for the next example.
SQL> CREATE TABLE FACULTY (FID INT NOT NULL, FNAME VARCHAR(10) NOT
NULL, MOBILE_NO INT, JOBID VARCHAR(5), SALARY INT);
Table created.
SQL> INSERT INTO JOBS VALUES('JL', 'JUNIOR LECTURER', 'PART TIME', 4);
INSERT INTO JOBS VALUES('JL', 'JUNIOR LECTURER', 'PART TIME', 4)
*
ERROR at line 1:
ORA-12899: value too large for column "DSR"."JOBS"."DESIGNATION"
(actual: 15, maximum: 10)
*****
1. ALL
Example 1:
SELECT * FROM FACULTY WHERE SALARY > ALL (SELECT SALARY FROM FACULTY
WHERE FID = 4);
Referring to the above statement, we know that the subquery returns
one record with SALARY = 30000. So, the main query returns all such
records whose SALARY is greater than 30000.
SQL> SELECT * FROM FACULTY WHERE SALARY > ALL (SELECT SALARY FROM
FACULTY WHERE FID = 4);
Example 2:
SELECT * FROM FACULTY WHERE SALARY > ALL (SELECT SALARY FROM FACULTY
WHERE FID < 2);
This time the subquery returns ONE record (with SALARY = 10000). So,
the main query returns all such records whose SALARY is greater than
all of those returned from the subquery.
SQL> SELECT * FROM FACULTY WHERE SALARY > ALL (SELECT SALARY FROM
FACULTY WHERE FID < 2);
SQL> SELECT * FROM FACULTY WHERE SALARY > ALL (SELECT SALARY FROM
FACULTY WHERE FID < 3);
no rows selected
*****
Example 3:
SELECT * FROM FACULTY WHERE SALARY < ALL (SELECT SALARY FROM FACULTY
WHERE FID > 3);
In this case, the subquery returns ONE record (with SALARY = 30000). So, the main query
returns all such records whose SALARY is less than all of those returned from the
subquery.
SQL> SELECT * FROM FACULTY WHERE SALARY < ALL (SELECT SALARY FROM
FACULTY WHERE FID > 3);
3.ANY
Example 1:
Here, the subquery returns TWO records (with SALARY = 30000, 40000).
So, the main query returns all such records whose SALARY is greater
than any one value of those returned from the subquery.
SQL> SELECT * FROM FACULTY WHERE SALARY > ANY (SELECT SALARY FROM
FACULTY WHERE FID > 2);
SELECT * FROM FACULTY WHERE SALARY = ANY (SELECT SALARY FROM FACULTY
WHERE FID > 4);
In this scenario, the subquery returns three records (with SALARY =
30000, 43000, 60000). So, the main query returns all such records
which are returned as a result of the subquery .
SQL> SELECT * FROM FACULTY WHERE SALARY = ANY (SELECT SALARY FROM
FACULTY WHERE FID > 4);
SALARY
----------
30000
30000
*****
4.BETWEEN X AND Y
SQL> SELECT * FROM FACULTY WHERE SALARY BETWEEN 20000 AND 50000;
*****
5.Not Between X AND Y
SQL> SELECT * FROM FACULTY WHERE SALARY NOT BETWEEN 20000 AND 50000;
*****
6.IN
SELECT * FROM FACULTY WHERE FID IN (3, 5);
This returns the rows whose FID value appears in the list of
numbers.
SQL> SELECT * FROM FACULTY WHERE FID IN (3, 5);
FID FNAME MOBILE_NO JOBID SALARY
---------- ---------- ---------- ----- ----------
5 SATYA 8899775566 ASIS 30000
3 KRISH 9977664422 ASO 40000
7.NOT IN
SELECT * FROM FACULTY WHERE FID NOT IN (3, 5);
All rows are returned except the ones whose FID appears in the
argument list.
SQL> SELECT * FROM FACULTY WHERE FID NOT IN (3, 5);
FID FNAME MOBILE_NO JOBID SALARY
---------- ---------- ---------- ----- ----------
1 RAM 9988776655 JL 10000
2 LAKSH 9977665544 PROF 50000
4 SIVA 9966778855 ASIS 30000
*****
8. LIKE
Example 1:
SELECT * FROM FACULTY WHERE FNAME LIKE 'RA%';
The above query returns the rows that have FNAME starting with 'RA'.
Example 2:
SELECT * FROM FACULTY WHERE FNAME LIKE '%H';
This query returns the rows that have FNAME ending with 'H'.
SQL> SELECT * FROM FACULTY WHERE FNAME LIKE '%H';
FID FNAME MOBILE_NO JOBID SALARY
---------- ---------- ---------- ----- ----------
2 LAKSH 9977665544 PROF 50000
3 KRISH 9977664422 ASO 40000
SQL Commands
o SQL commands are instructions used to communicate with the
database and to perform specific tasks, functions, and queries on
data.
o SQL can perform various tasks such as creating a table, adding data
to tables, dropping a table, modifying a table, and setting
permissions for users.
o SQL commands are divided into five subgroups: DDL, DML, DCL, DQL
and TCL.
Types of SQL Commands
There are five types of SQL commands: DDL, DML, DCL, TCL, and DQL.
Courses Table:
CourseID CourseName CustomerID
c01 DevOps 2
c02 Machine Learning 4
c03 RPA 1
c04 Tableau 3
c05 AWS 2
Now, if you observe, the customerID column in the courses table refers
to the customerID column in the customers’ table. The customerID
column from the customers’ table is the Primary Key and the customerID
column from the courses table is the Foreign Key of that table.
Starting with the first operation:
SQL> CREATE TABLE CUSTOMER( ID INT NOT NULL, NAME VARCHAR(10), PAYMENT
INT, PRIMARY KEY(ID));
Table created.
1 row created.
1 row created.
1 row created.
1 row created.
1 row created.
ERROR at line 1:
ORA-00001: unique constraint (DSR.SYS_C006090) violated
1 row created.
1 row created.
1 row created.
SQL> SELECT * FROM PRODUCTS;
SUPPLIER_NAME
------------------------------
WIPRO
REMAINDERS:
ERROR at line 1:
ORA-02449: unique/primary keys in table referenced by foreign keys
SQL | Numeric Functions
Numeric Functions are used to perform operations on numbers and return
numbers.
Following are the numeric functions defined in SQL:
1. ABS(): It returns the absolute value of a number.
Syntax: SELECT ABS(-243.5) FROM DUAL;
Output:
ABS(-243.5)
-----------
243.5
MOD(18,4)
----------
2
Syntax
Here, NOT NULL signifies that column should always accept an explicit
value of the given data type. There are two columns where we did not
use NOT NULL, which means these columns could be NULL.
A field with a NULL value is the one that has been left blank during
the record creation.
Example
The NULL value can cause problems when selecting data, because
comparing an unknown value with any other value always yields unknown,
and such rows are not included in the results. You must use the IS
NULL or IS NOT NULL operators to check for a NULL value.
Consider the following CUSTOMERS table having the records as shown
below.
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | |
| 7 | Muffy | 24 | Indore | |
+----+----------+-----+-----------+----------+
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
+----+----------+-----+-----------+----------+
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 6 | Komal | 22 | MP | |
| 7 | Muffy | 24 | Indore | |
+----+----------+-----+-----------+----------+
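The two smaller result sets above can be produced by queries of roughly the following form. This is a sketch that assumes the CUSTOMERS table shown above, with the SALARY column left blank (NULL) for IDs 6 and 7.

```sql
-- Rows that have a salary on record (IDs 1 to 5 in the sample data)
SELECT ID, NAME, AGE, ADDRESS, SALARY
FROM CUSTOMERS
WHERE SALARY IS NOT NULL;

-- Rows whose salary was left blank (IDs 6 and 7 in the sample data)
SELECT ID, NAME, AGE, ADDRESS, SALARY
FROM CUSTOMERS
WHERE SALARY IS NULL;

-- A comparison with = never matches NULL, so this returns no rows
SELECT ID, NAME, AGE, ADDRESS, SALARY
FROM CUSTOMERS
WHERE SALARY = NULL;
```

The last query shows why IS NULL is needed: the condition SALARY = NULL evaluates to unknown for every row, so no row is selected.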
Example-2:
Creation of table:
SQL> create table cust(id int not null,
name varchar(20) not null,
age int not null,
address char(25),
primary key(id)
);
Table created.
1 row created.
1 row created.
SQL> insert into cust values(3, 20, 'vizag');
insert into cust values(3, 20, 'vizag')
*
ERROR at line 1:
ORA-00947: not enough values
or
SQL> insert into cust values(3, 'krish', 'vizag');
insert into cust values(3, 'krish', 'vizag')
*
ERROR at line 1:
ORA-00947: not enough values
or
SQL> insert into cust (id, name, address)
values(3,'krish','vizag');
insert into cust (id, name, address) values(3,'krish','vizag')
*
ERROR at line 1:
ORA-01400: cannot insert NULL into ("DSR"."CUST"."AGE")
Or
SQL> insert into cust (id, name, age, address)
values(3,'krish','vizag');
insert into cust (id, name, age, address)
values(3,'krish','vizag')
*
ERROR at line 1:
ORA-00947: not enough values
1 row created.
SQL> select * from cust;
CITY
----------
***
IS NOT NULL Syntax
SELECT column_names
FROM table_name
WHERE column_name IS NOT NULL;
EX:1 SQL> SELECT CITY FROM EMP19 WHERE CITY IS NOT NULL;
CITY
----------
HYD
VIJ
EX:2 SQL> SELECT AGE FROM EMP19 WHERE AGE IS NOT NULL;
AGE
----------
20
19
19
19
19
What is DUAL table?
DUAL is a special one-row, one-column table present by default in
all Oracle databases. The owner of DUAL is SYS (SYS owns the data
dictionary, therefore DUAL is part of the data dictionary.) but DUAL
can be accessed by every user. The table has a single VARCHAR2(1)
column called DUMMY that has a value of 'X'. MySQL allows DUAL to be
specified as a table in queries that do not need data from any tables.
In SQL Server DUAL table does not exist, but you could create one.
The DUAL table was created by Charles Weiss of Oracle corporation to
provide a table for joining in internal views.
DESC DUAL;
Output:
Output:
DUMMY
----------
X
The following command displays the number of rows of DUAL table :
Output:
COUNT(*)
----------
1
The following command displays the string value from the DUAL
table :
Output:
'ABCDEF1234
-----------
ABCDEF12345
The following command displays the numeric value from the DUAL
table :
Output:
123792.52
----------
123792.52
The following command tries to delete all rows from the DUAL
table :
Output:
The following command tries to remove all rows from the DUAL
table:
Output:
UNION ALL
Output
DUMMY
----------
X
X
Example - 1
You can also check the system date from the DUAL table using the
following statement :
Output:
SYSDATE
---------
11-DEC-10
Example - 2
You can also check the arithmetic calculation from the DUAL table
using the following statement :
15+10-5*5/5
-----------
20
Example - 3
SELECT level
FROM DUAL
CONNECT BY level <= 10;
Output:
LEVEL
----------
1
2
3
4
5
6
7
8
9
10
Example - 4
In the following code, DUAL involves the use of decode with NULL.
SELECT decode(null,null,1,0)
FROM DUAL;
Output:
DECODE(NULL,NULL,1,0)
---------------------
1
We have already learned that DUAL is a special one row one column
table. For Oracle, it is useful because Oracle doesn't allow
statements like :
SELECT 15+10-5*5/5;
Output:
SELECT 15+10-5*5/5
*
ERROR at line 1:
ORA-00923: FROM keyword not found where expected
But the following command will execute (see the output of the
previous example) :
SELECT 15+10-5*5/5 FROM DUAL;
Output:
Table altered.
After altering the table (adding a new column), the table description is:
SQL> DESC EMP19;
Name Null? Type
----------------------------------------- -------- ---------------
EMPNO NUMBER(38)
ENAME VARCHAR2(20)
AGE NUMBER(38)
CITY VARCHAR2(10)
DOB DATE
CURRENT_D ONLY_CURRENT_YEAR
--------- -----------------
25-APR-21 2021
Explanation:
Useful to retrieve only year from the System date/Current date
or particular specified date.
Example-2: Extracting Month:
SQL> SELECT SYSDATE AS CURRENT_DATE_TIME, EXTRACT(Month FROM
SYSDATE) AS ONLY_CURRENT_MONTH FROM DUAL;
Output:
CURRENT_D ONLY_CURRENT_MONTH
--------- ------------------
25-APR-21 4
Explanation:
Useful to retrieve only month from the System date/Current
date or particular specified date.
2. ADD_MONTHS(date,n):
Using this function in PL/SQL you can add months to a date or
subtract them from it. A positive ‘n’ adds months and a negative
‘n’ subtracts them.
Example-4:
SQL> SELECT ADD_MONTHS(SYSDATE, -1) AS PREV_MONTH,
SYSDATE AS CURRENT_DATE,
ADD_MONTHS(SYSDATE, 1) as NEXT_MONTH FROM Dual;
Output:
PREV_MONT CURRENT_D NEXT_MONT
--------- --------- ---------
25-MAR-21 25-APR-21 25-MAY-21
Explanation:
The ADD_MONTHS function has two parameters. The first is a date,
which can be any specified date or the system date. The second is
‘n’, an integer that is positive to get an upcoming date or
negative to get a previous date.
3. LAST_DAY (date):
Using this method in PL/SQL you can get the last day in the month of
specified date.
Example-5:
SQL> SELECT SYSDATE AS CURRENT_DATE, LAST_DAY(SYSDATE) AS
LAST_DAY_OF_MONTH, LAST_DAY(SYSDATE)+1 AS FIRST_DAY_OF_NEXT_MONTH
FROM Dual;
Output:
CURRENT_D LAST_DAY_ FIRST_DAY
--------- --------- ---------
25-APR-21 30-APR-21 01-MAY-21
Explanation:
In the above example, SYSDATE returns the current date and
LAST_DAY returns the last date of the month. Adding 1 to the result
of LAST_DAY gives the first day of the next month.
Example-6: Number of Days left in the month
SQL> SELECT SYSDATE AS CURRENT_DATE, LAST_DAY(SYSDATE) - SYSDATE AS
DAYS_LEFT_IN_MONTH FROM Dual ;
Output:
CURRENT_D DAYS_LEFT_IN_MONTH
--------- ------------------
25-APR-21 5
4. MONTHS_BETWEEN(date1,date2):
Using this function in PL/SQL you can calculate the number of
months between two dates, date1 and date2. If date1 is
later than date2 the result is positive, and if date1
is earlier than date2 the result is negative.
Note:
If a fractional month is calculated, the MONTHS_BETWEEN function
calculates the fraction based on a 31-day month.
Example-7:
SQL> SELECT MONTHS_BETWEEN (TO_DATE ('01-07-2003', 'dd-mm-yyyy'),
TO_DATE ('14-03-2003', 'dd-mm-yyyy')) AS NUMBER_OF_MONTHS FROM Dual;
Output:
NUMBER_OF_MONTHS
----------------
3.58064516
Explanation:
Here date1 and date2 are not on the same day of the month, which is why
the result has a fractional part. Because date1 is later than
date2, the resulting value is positive.
The entered dates must be in a particular date format, which is the reason
for using the TO_DATE function inside the MONTHS_BETWEEN
function.
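The value 3.58064516 can be checked by hand using the 31-day rule from the note above. The query below is only a verification sketch; the alias MANUAL_RESULT is an assumed name.

```sql
-- Whole months: July (7) - March (3) = 4
-- Day difference: day 1 - day 14 = -13 days, measured against a 31-day month
-- 4 + (-13 / 31) = 4 - 0.41935484 = 3.58064516
SELECT 4 + (1 - 14) / 31 AS MANUAL_RESULT FROM DUAL;
```

The hand calculation agrees with the NUMBER_OF_MONTHS value returned by MONTHS_BETWEEN in Example-7.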
Let’s select the number of months an employee has worked for the
company.
5. NEXT_DAY(date,day_of_week):
It returns the date of the first occurrence of the given weekday that is
later than the entered date. It has two parameters: the first is a date,
which can be the system date or a specified date; the second is the day
of the week, given as a character string.
Example-9:
SELECT NEXT_DAY (SYSDATE, 'SUNDAY') AS NEXT_SUNDAY FROM Dual;
Output:
NEXT_SUND
---------
02-MAY-21
SQL Datatype
o SQL Datatype is used to define the values that a column can contain.
o Every column is required to have a name and data type in the database table.
Datatype of SQL:
1. Binary Datatypes
There are Three types of binary Datatypes which are given below:
binary It has a maximum length of 8000 bytes. It contains fixed-length binary data.
varbinary It has a maximum length of 8000 bytes. It contains variable-length binary data.
Datatype Description
timestamp It stores the year, month, day, hour, minute, and the second value.
Constraints in DBMS-
1. Domain constraint
2. Tuple Uniqueness constraint
3. Key constraint
4. Entity Integrity constraint
5. Referential Integrity constraint
1. Domain Constraint-
Example-
S001 Akshay 20
S002 Abhishek 21
S003 Shashank 20
S004 Rahul A
Here, value ‘A’ is not allowed since only integer values can be taken by the age attribute.
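A domain constraint of this kind can be sketched in SQL as follows. The table name student_demo, the column names, and the age range in the CHECK clause are assumptions made for illustration; the sample rows follow the example above.

```sql
-- Hypothetical table: the column type and the CHECK clause define the domain of age.
CREATE TABLE student_demo (
    stu_id VARCHAR(4)  PRIMARY KEY,
    name   VARCHAR(20) NOT NULL,
    age    INT CHECK (age BETWEEN 1 AND 120)
);

INSERT INTO student_demo VALUES ('S001', 'Akshay', 20);   -- accepted
INSERT INTO student_demo VALUES ('S004', 'Rahul', 'A');   -- rejected: 'A' is not a number
INSERT INTO student_demo VALUES ('S005', 'Rahul', 500);   -- rejected: outside the CHECK range
```

The data type rejects values of the wrong kind, and the CHECK clause narrows the domain further to a sensible range.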
2. Tuple Uniqueness Constraint-
TUPLE: A single row of a table, which contains a single record for that relation, is called
a tuple.
Tuple Uniqueness constraint specifies that all the tuples must be necessarily unique in any relation.
Example-01:
Consider the following Student table-
S001 Akshay 20
S002 Abhishek 21
S003 Shashank 20
S004 Rahul 20
This relation satisfies the tuple uniqueness constraint since here all the tuples are unique.
Example-02:
Consider the following Student table-
S001 Akshay 20
S001 Akshay 20
S003 Shashank 20
S004 Rahul 20
This relation does not satisfy the tuple uniqueness constraint since not all of its tuples are unique (the first two rows are identical).
3. Key Constraint-
Example-
S001 Akshay 20
S001 Abhishek 21
S003 Shashank 20
S004 Rahul 20
This relation does not satisfy the key constraint as the values of the primary key are not all unique (S001 appears twice).
4. Entity Integrity Constraint-
Entity integrity is the mechanism the system provides to maintain primary keys.
The primary key serves as a unique identifier for rows in the table. Entity
integrity ensures two properties for primary keys: the primary key for a row is
unique (it does not match the primary key of any other row in the table), and
no attribute of the primary key contains a null value in any relation.
This is because a row whose primary key is null cannot be identified or distinguished from other rows.
Example-
S001 Akshay 20
S002 Abhishek 21
S003 Shashank 20
Rahul 20
This relation does not satisfy the entity integrity constraint as here the primary key contains a NULL value.
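The key constraint and the entity integrity constraint can both be tried out with a primary key. This is a sketch only; the table name student_pk_demo and its column names are assumed for illustration.

```sql
-- Hypothetical table: PRIMARY KEY enforces both uniqueness and NOT NULL on stu_id.
CREATE TABLE student_pk_demo (
    stu_id VARCHAR(4) PRIMARY KEY,
    name   VARCHAR(20),
    age    INT
);

INSERT INTO student_pk_demo VALUES ('S001', 'Akshay', 20);     -- accepted
INSERT INTO student_pk_demo VALUES ('S001', 'Abhishek', 21);   -- rejected: duplicate key value (key constraint)
INSERT INTO student_pk_demo VALUES (NULL, 'Rahul', 20);        -- rejected: NULL in primary key (entity integrity)
```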
5. Referential Integrity Constraint-
Whenever two tables contain one or more common columns, Oracle can enforce the
relationship between the two tables through a referential integrity constraint. Define a
PRIMARY or UNIQUE key constraint on the column in the parent table (the one that has
the complete set of column values).
This constraint is enforced when a foreign key references the primary key of a relation.
It specifies that all the values taken by the foreign key must either be available in the relation of the
primary key or be null.
Important Results-
The following two important results follow from the referential integrity constraint-
We cannot insert a record into a referencing relation if the corresponding record does not exist in the
referenced relation.
We cannot delete or update a record of the referenced relation if the corresponding record exists in the
referencing relation.
Example-
Department
Dept_no Dept_name
D10 ASET
D11 ALS
D12 ASFL
D13 ASHS
Here,
The relation ‘Student’ does not satisfy the referential integrity constraint.
This is because in relation ‘Department’, no value of primary key specifies department no. 14.
Thus, referential integrity constraint is violated.
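The two results listed above can be observed with a small pair of tables. This is a sketch under assumptions: the table name student_ref_demo and the column sizes are chosen only for illustration, while the department numbers follow the example.

```sql
-- Parent (referenced) table
CREATE TABLE department (
    dept_no   VARCHAR(3) PRIMARY KEY,
    dept_name VARCHAR(10)
);

-- Child (referencing) table: dept_no is a foreign key
CREATE TABLE student_ref_demo (
    stu_id  VARCHAR(4) PRIMARY KEY,
    name    VARCHAR(20),
    dept_no VARCHAR(3) REFERENCES department (dept_no)
);

INSERT INTO department VALUES ('D10', 'ASET');
INSERT INTO department VALUES ('D11', 'ALS');

INSERT INTO student_ref_demo VALUES ('S001', 'Akshay', 'D10');    -- accepted: D10 exists
INSERT INTO student_ref_demo VALUES ('S002', 'Abhishek', NULL);   -- accepted: a foreign key may be NULL
INSERT INTO student_ref_demo VALUES ('S003', 'Shashank', 'D14');  -- rejected: no department D14

DELETE FROM department WHERE dept_no = 'D10';   -- rejected: still referenced by S001
DELETE FROM department WHERE dept_no = 'D11';   -- accepted: no student refers to D11
```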
Referential integrity constraints are based on the concept of foreign keys. A foreign key is an
attribute of a relation that refers to a key attribute of a different relation or of the same relation.
The referential integrity constraint requires that the key value being referred to
must exist in the referenced table.
Example:
Tuple for CustomerID =1 is referenced twice in the relation Billing. So we know CustomerName=Google
has billing amount $300
SQL TRUNCATE TABLE
The SQL TRUNCATE TABLE command is used to completely remove all table
records. It does not support a WHERE clause.
The TRUNCATE command is faster and uses fewer transaction log resources.
The TRUNCATE command is logically equivalent to a DELETE command that
deletes all rows, but in practice they differ in some respects.
Syntax
The following syntax shows how TRUNCATE is used,
TRUNCATE TABLE table_name;
Example
SQL> TRUNCATE TABLE emp_data;
Table Truncated.
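One practical difference between the two commands in Oracle is how they interact with ROLLBACK. The sketch below assumes that emp_data contains some committed rows and that no other uncommitted work is pending in the session.

```sql
-- DELETE is a DML command, so it can be undone before a COMMIT.
DELETE FROM emp_data;
ROLLBACK;                        -- the deleted rows are restored

-- TRUNCATE is a DDL command in Oracle and commits implicitly.
TRUNCATE TABLE emp_data;
ROLLBACK;                        -- has no effect: the rows are already gone

SELECT COUNT(*) FROM emp_data;   -- returns 0
```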
Difference between Delete and Truncate Commands
DELETE TRUNCATE
Simple INDEX
A simple INDEX is created on only one selected column of the database table.
Syntax
CREATE INDEX index_name
ON table_name (column_name)
The storage setting specifies the tablespace explicitly. It is
optional; if you do not specify it, the default storage setting is
used automatically.
Consider the following employee table
Example:
Index created.
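A simple index on one column might look like the sketch below. The index name EMP_AGE_IDX is an assumed name, and the EMPLOYEE table is assumed to have an AGE column, as the composite index example in these notes suggests.

```sql
-- Simple index on a single column of the EMPLOYEE table
CREATE INDEX EMP_AGE_IDX ON EMPLOYEE (AGE);

-- List the indexes that now exist on the table (Oracle data dictionary view)
SELECT INDEX_NAME, UNIQUENESS
FROM USER_INDEXES
WHERE TABLE_NAME = 'EMPLOYEE';
```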
Composite INDEX
A composite INDEX is created on multiple selected columns of the database
table.
Syntax
Example
SQL> CREATE INDEX EMP_INDX ON EMPLOYEE(ID, AGE);
Index created.
7 rows selected.
Example
SQL> CREATE UNIQUE INDEX EMP_INDEX ON EMPLOYEE (NAME);
Index created
SQL> INSERT INTO EMPLOYEE VALUES(777, 'GANESH', 23, 'PUNE', 15000);
ERROR at line 1:
1 row created.
We are renaming the above created index name EMP_IND to a new index
name EMP_INDEX.
DROP INDEX
Syntax
DROP INDEX index_name;
Example
SQL> DROP INDEX EMP_INDEX;
For example, the entity type EMPLOYEE describes the type (that is, the
attributes and relationships) of each employee entity, and also refers
to the current set of EMPLOYEE entities in the COMPANY database.
For example, the entities that are members of the EMPLOYEE entity type
may be distinguished further into SECRETARY, ENGINEER, MANAGER,
TECHNICIAN, SALARIED_EMPLOYEE, HOURLY_EMPLOYEE, and so on.
1. Specialization
COMMIT command
COMMIT command is used to permanently save any transaction into the
database.
When we use any DML command like INSERT, UPDATE or DELETE, the changes
made by these commands are not permanent. Until they are committed,
the changes made by these commands can be rolled back.
To avoid that, we use the COMMIT command to mark the changes as
permanent.
ROLLBACK command
This command restores the database to the last committed state. It is also
used with the SAVEPOINT command to jump to a savepoint in an ongoing
transaction.
If we have used the UPDATE command to make some changes to the
database and realise that those changes were not required, then we
can use the ROLLBACK command to roll back those changes, provided they were
not committed using the COMMIT command.
In short, using this command we can name the different states of our
data in any table and then rollback to that state using
the ROLLBACK command whenever required.
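The interplay of COMMIT, SAVEPOINT and ROLLBACK can be sketched as follows. The table class_demo and its rows are assumptions made for illustration; only the savepoint names follow the style used in these notes.

```sql
-- Hypothetical table used only for this sketch
CREATE TABLE class_demo (sid INT PRIMARY KEY, sname VARCHAR(20));

INSERT INTO class_demo VALUES (1, 'RAM');
COMMIT;                                    -- row 1 is now permanent

INSERT INTO class_demo VALUES (2, 'KRISH');
SAVEPOINT SP1;                             -- state: rows 1 and 2

INSERT INTO class_demo VALUES (3, 'SIVA');
SAVEPOINT SP2;                             -- state: rows 1, 2 and 3

UPDATE class_demo SET sname = 'GANESH' WHERE sid = 3;

ROLLBACK TO SP2;                           -- undoes only the UPDATE
ROLLBACK TO SP1;                           -- also undoes the insert of row 3
COMMIT;                                    -- rows 1 and 2 are saved

SELECT * FROM class_demo ORDER BY sid;     -- shows (1, RAM) and (2, KRISH)
```

Rolling back to a savepoint keeps the transaction open, so work done before that savepoint can still be committed afterwards.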
SQL> COMMIT;
Commit complete.
Now we update the value of SNAME from LAKSHMAN to GANESH where SID=4.
Now we create another savepoint SP3 after inserting a new value,
i.e. SID=5.
NOTE: SELECT statement is used to show the data stored in the table.
The resultant table will look like,
Now let's again use the ROLLBACK command to roll back the state of
data to the SAVEPOINT SP1;
8 rows selected.
Oracle GROUP BY Clause
In Oracle, the GROUP BY clause is used with the SELECT statement to collect
data from multiple records and group the results by one or more
columns.
Syntax: SELECT expression1, expression2, ... expression_n,
aggregate_function (aggregate_expression)
FROM tables
WHERE conditions
GROUP BY expression1, expression2, ... expression_n;
Parameters:
expression1, expression2, ... expression_n: It specifies the
expressions that are not encapsulated within aggregate function.
These expressions must be included in GROUP BY clause.
aggregate_function: It specifies the aggregate functions i.e.
SUM, COUNT, MIN, MAX or AVG functions.
aggregate_expression: It specifies the column or expression on
which the aggregate function is based.
tables: It specifies the table from where you want to retrieve
records.
conditions: It specifies the conditions that must be fulfilled
for the record to be selected.
Oracle GROUP BY Example: (with SUM function)
Let's take a table "SALESDEPT"
SQL> CREATE TABLE SALESDEPT(
ITEM VARCHAR(15),
SALE INT,
BILLING_ADDRESS VARCHAR(15)
)
/
Table created.
SQL> INSERT INTO SALESDEPT VALUES('COMPUTER', 10, 'VIJ');
1 row created.
SQL> INSERT INTO SALESDEPT VALUES('MOBILE', 5, 'ELR');
1 row created.
SQL> INSERT INTO SALESDEPT VALUES('MEDICINES', 100, 'GUNTUR');
1 row created.
SQL> INSERT INTO SALESDEPT VALUES('LAPTOP', 7, 'VIZAG');
1 row created.
SQL> INSERT INTO SALESDEPT VALUES('MOUSE', 6, 'TIRUPATI');
1 row created.
SQL> INSERT INTO SALESDEPT VALUES('MOBILE', 10, 'RJY');
1 row created.
SQL> INSERT INTO SALESDEPT VALUES('LAPTOP', 5, 'HYD');
1 row created.
SQL> INSERT INTO SALESDEPT VALUES('LAPTOP',10, 'VIJ');
1 row created.
8 rows selected.
SQL> SELECT ITEM, SUM(SALE) FROM SALESDEPT GROUP BY ITEM;
ITEM SUM(SALE)
--------------- ----------
COMPUTER 10
MEDICINES 100
MOBILE 15
MOUSE 6
LAPTOP 22
EXAMPLE-2
Consider the table of data below
SQL> CREATE TABLE "EMPLOYEES"(
EID INT NOT NULL,
ENAME VARCHAR(10) NOT NULL,
AGE INT,
DEPT VARCHAR(10),
SALARY INT
)
/
Table created.
Insert the values into the table
SQL> INSERT INTO EMPLOYEES VALUES(101, 'RAM', 20, 'SOFTWARE',
30000);
1 row created.
SQL> INSERT INTO EMPLOYEES VALUES(102, 'KRISH', 22, 'SOFTWARE',
35000);
1 row created.
SQL> INSERT INTO EMPLOYEES VALUES(103, 'LAKSH', 23, 'SOFTWARE',
40000);
1 row created.
SQL> INSERT INTO EMPLOYEES VALUES(104, 'SIVA', 22, 'MECHANICAL',
20000);
1 row created.
SQL> INSERT INTO EMPLOYEES VALUES(105, 'VENKAT', 23, 'HARDWARE',
25000);
1 row created.
SQL> INSERT INTO EMPLOYEES VALUES(106, 'SATYA', 22, 'CIVIL',
30000);
1 row created.
6 rows selected.
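With the six EMPLOYEES rows inserted above, a GROUP BY on DEPT could be written as in the following sketch. The aliases HEADCOUNT and TOTAL_SALARY are assumed names.

```sql
-- One output row per department, with the number of employees and the salary total
SELECT DEPT, COUNT(*) AS HEADCOUNT, SUM(SALARY) AS TOTAL_SALARY
FROM EMPLOYEES
GROUP BY DEPT
ORDER BY DEPT;

-- Result for the six rows inserted above:
-- CIVIL       1   30000
-- HARDWARE    1   25000
-- MECHANICAL  1   20000
-- SOFTWARE    3  105000
```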
SQL> SELECT S.S_ID, COUNT (*) AS TOTALCOURSES FROM STUDENT S LEFT JOIN
COURSE C ON S.S_ID= C.S_ID GROUP BY S.S_ID;
S_ID TOTALCOURSES
---------- ------------
102 1
101 2
104 2
105 1
103 2
106 1
6 rows selected.
S_ID TOTALCOURSES
---------- ------------
101 2
102 1
103 2
104 2
105 1
106 1
6 rows selected.
*******
Question:
Display COURSE_ID and the total number of students who selected each
course, using the student and course tables with the GROUP BY
clause.
SQL> SELECT C.COURSE_ID, COUNT(*) AS TOTALSTUDENTS FROM STUDENT S
RIGHT JOIN COURSE C ON S.S_ID = C.S_ID GROUP BY C.COURSE_ID ORDER
BY COURSE_ID;
COURSE_ID TOTALSTUDENTS
---------- -------------
1 2
2 4
3 2
HAVING Clause
The HAVING clause was added to SQL because the WHERE keyword cannot be
used with aggregate functions.
The HAVING Clause enables you to specify conditions that filter which
group results appear in the results.
The WHERE clause places conditions on the selected columns, whereas
the HAVING clause places conditions on groups created by the GROUP BY
clause.
The HAVING clause must follow the GROUP BY clause in a query and must
also precede the ORDER BY clause if used. The following code block has
the syntax of the SELECT statement including the HAVING clause –
Syntax
SELECT column1, column2
FROM table1, table2
WHERE [ conditions ]
GROUP BY column1, column2
HAVING [ conditions ]
ORDER BY column1, column2
no rows selected
SQL> SELECT COUNT (S_ID), ADDRESS FROM STUDENT GROUP BY ADDRESS HAVING
COUNT(S_ID) < 3;
COUNT(S_ID) ADDRESS
----------- ----------
1 HYD
1 TIRUPATI
2 VIJ
2 VIZAG
SQL> SELECT COUNT (S_ID), ADDRESS FROM STUDENT GROUP BY ADDRESS HAVING
COUNT(S_ID) < 2;
COUNT(S_ID) ADDRESS
----------- ----------
1 HYD
1 TIRUPATI
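The difference between WHERE and HAVING is easiest to see when both appear in one statement. A minimal sketch, assuming the STUDENT table has the S_ID, AGE and ADDRESS columns used elsewhere in this section; the age threshold is arbitrary:

```sql
-- Sketch only: assumes STUDENT has S_ID, AGE and ADDRESS columns.
SELECT ADDRESS, COUNT(S_ID) AS TOTAL
FROM STUDENT
WHERE AGE >= 19              -- row filter: applied before grouping
GROUP BY ADDRESS
HAVING COUNT(S_ID) < 3       -- group filter: applied after grouping
ORDER BY ADDRESS;
```

WHERE discards individual rows before the groups are formed, while HAVING discards whole groups after the aggregate has been computed.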
Views in SQL
o A view in SQL is a virtual table. Like a table, a view
contains rows and columns.
o To create a view, we select fields from one or more
tables present in the database.
o A view can contain either specific rows that satisfy a condition
or all the rows of a table.
o A view can be used to restrict data access.
1. Creating view
A view can be created using the CREATE VIEW statement.
We can create a view from a single table or multiple tables.
Syntax:
CREATE VIEW view_name AS
SELECT column1, column2.....
FROM table_name
WHERE condition;
OR
CREATE OR REPLACE VIEW view_name AS
SELECT column1, column2.....
FROM table_name
WHERE condition;
Just like a table, a view can be queried to display its data.
SNAME AGE
---------- ----------
KRISH 20
SATYA 20
******
EXAMPLE-2
SQL> CREATE OR REPLACE VIEW STU_VIEW AS SELECT SNAME, AGE FROM STUDENT
WHERE S_ID < 105;
View created.
SQL> SELECT * FROM STU_VIEW;
SNAME AGE
---------- ----------
RAM 19
LAKSHMAN 18
KRISH 20
VENKAT 21
The view returns the same rows as running its defining query directly
against the STUDENT table:
SQL> SELECT SNAME, AGE FROM STUDENT WHERE S_ID < 105;
SNAME AGE
---------- ----------
RAM 19
LAKSHMAN 18
KRISH 20
VENKAT 21
*****
We can use the CREATE OR REPLACE VIEW statement to add fields to or
remove fields from a view.
SYNTAX:
CREATE OR REPLACE VIEW view_name AS
SELECT column1, column2,…
FROM Table_Name
WHERE Condition;
For example, to update the view MARKS_VIEW and add the field AGE from
the STUDENT table to it, we can proceed as follows.
Before updating MARKS_VIEW, the data in the view is:
Now consider the STU_VIEW (This view is created using single table
STUDENT)
SQL> SELECT * FROM STU_VIEW;
S_ID SNAME AGE
---------- ---------- ----------
101 RAM 19
102 LAKSHMAN 18
103 KRISH 20
104 VENKAT 21
105 SIVA 22
106 SATYA 20
6 rows selected.
Non Updatable Views:
A non-updatable view does not allow INSERT, UPDATE, or DELETE
operations to be performed through it.
One of the main reasons a view becomes non-updatable is that its
defining query includes aggregate functions, DISTINCT, GROUP BY,
or (in many cases) a join.
A nested view that is built on a non-updatable view is itself
non-updatable.
The examples below use the STUDENT table.
Now we create a view named "VSTU" whose query uses DISTINCT, and see
what happens:
View created.
Display the view VSTU--
SQL> SELECT * FROM VSTU;
AGE
----------
22
20
21
18
19
Therefore, to test the updateability of the view, the next query
attempts to perform an INSERT command through the view:
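The statements themselves are not reproduced in this section. A minimal sketch, assuming VSTU was defined with SELECT DISTINCT on the AGE column (which matches the output above); the inserted value 25 is arbitrary:

```sql
-- Assumed definition of VSTU; the original statement is not shown.
CREATE OR REPLACE VIEW VSTU AS
SELECT DISTINCT AGE FROM STUDENT;

-- Attempting DML through the view. Because the view uses DISTINCT,
-- Oracle is expected to reject this with
-- ORA-01732: data manipulation operation not legal on this view
INSERT INTO VSTU VALUES (25);
```

The rejection does not depend on the value inserted: Oracle cannot map a row of a DISTINCT result back to a single row of STUDENT.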
NOTE: If a column of the base table has a NOT NULL constraint, that
column must be given a value when you insert a new record into the
table through a view. Otherwise the insert fails. The error is shown
below.
SQL> INSERT INTO STU_VIEW VALUES ('SIVA', 22);
INSERT INTO STU_VIEW VALUES ('SIVA' ,22 )
*
ERROR at line 1:
ORA-01400: cannot insert NULL into ("DSR"."STUDENT"."S_ID")
In this example we will delete the last row from the view
STU_VIEW, which we added in the insert example above.
Before doing this, we create a new view:
S_ID SNAME
---------- ----------
101 RAM
102 LAKSHMAN
103 KRISH
104 VENKAT
105 SIVA
106 SATYA
--- -----
6 rows selected.
DELETING(DROP) VIEWS
We have learned about creating a View, but what if a created View is
not needed anymore? Obviously we will want to delete it. SQL allows
us to delete an existing View. We can delete or drop a View using the
DROP statement.
SYNTAX:
DROP VIEW View_name;
View_name: Name of the view which we want to delete.
For example, to delete the view STU_VIEW, we can do this as follows:
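The statement itself is missing from this section. Following the DROP VIEW syntax given above, it would presumably read:

```sql
-- Removes only the view definition; the underlying STUDENT table
-- and its rows are not affected.
DROP VIEW STU_VIEW;
```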
USES OF A VIEW
Views are useful in a database for the following reasons:
1. Restricting data access –
Views provide an additional level of table security by restricting
access to a predetermined set of rows and columns of a table.
2. Hiding data complexity –
A view can hide the complexity that exists in a multiple table
join.
3. Simplify commands for the user –
Views allow the user to select information from multiple tables
without requiring the user to know how to perform a
join.
4. Store complex queries –
Views can be used to store complex queries.
5. Rename Columns –
Views can also be used to rename columns without affecting the
base tables, provided the number of columns in the view matches the
number of columns specified in the SELECT statement. Renaming
thus helps to hide the names of the columns of the base tables.
6. Multiple view facility –
Different views can be created on the same table for different
users.
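Uses 1 and 5 above (restricting access and renaming columns) can be sketched with a single view. The view name STU_PUBLIC and the column names STUDENT_NAME and STUDENT_AGE are hypothetical; the STUDENT table and its SNAME, AGE and S_ID columns are the ones used earlier in this section:

```sql
-- Sketch only: STU_PUBLIC, STUDENT_NAME and STUDENT_AGE are hypothetical names.
-- The column list after the view name must have as many entries
-- as the SELECT list (two here).
CREATE OR REPLACE VIEW STU_PUBLIC (STUDENT_NAME, STUDENT_AGE) AS
SELECT SNAME, AGE
FROM STUDENT
WHERE S_ID < 105;

-- Users query the view with the new names; the base table is unchanged.
SELECT STUDENT_NAME, STUDENT_AGE FROM STU_PUBLIC;
```

A user who is given access only to STU_PUBLIC sees neither the S_ID and ADDRESS columns nor the rows with S_ID of 105 and above.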
In the database diagram, a car belongs to one brand, while a brand has
one or many cars. The relationship between brand and car is one-to-many.
The following SQL statements create the CARS and BRANDS tables and
also insert sample data into these tables.
SQL> CREATE TABLE BRANDS (BRAND_ID NUMBER NOT NULL, BRAND_NAME
VARCHAR(10) NOT NULL, PRIMARY KEY(BRAND_ID));
Table created.
SQL> CREATE TABLE CARS(
CAR_ID NUMBER NOT NULL,
CAR_NAME VARCHAR (15) NOT NULL,
BRAND_ID NUMBER NOT NULL,
PRIMARY KEY (CAR_ID),
FOREIGN KEY (BRAND_ID) REFERENCES BRANDS (BRAND_ID)
);
Table created.
Create a view on the CARS table with CAR_ID and CAR_NAME as its
attributes.
SQL> CREATE OR REPLACE VIEW CAR_MASTER AS
SELECT CAR_ID, CAR_NAME FROM CARS;
View created.
SQL> SELECT * FROM CAR_MASTER;
CAR_ID CAR_NAME
---------- ---------------
1 Audi R8 Coupe
2 Audi Q2
3 Audi S1
4 BMW 2-serie
5 BMW i8
6 Ford Edge
7 Ford Mustang
8 Honda S2000
9 Honda Legend
10 Toyota GT86
11 Toyota C-HR
11 rows selected.
********
Updating through a view, including a view created from two tables using an inner join:
12 rows selected.
A new row has been inserted into the cars table. This INSERT statement
works because Oracle can decompose it to an INSERT statement against
the cars table.
The following statement deletes all Honda cars from the cars table via
the ALL_CARS view:
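Neither the definition of ALL_CARS nor the DELETE statement appears in this section. A minimal sketch, assuming ALL_CARS is an inner join of CARS and BRANDS that exposes BRAND_NAME, and that the brand is stored as 'Honda':

```sql
-- Assumed definition of ALL_CARS; the original statement is not shown.
CREATE OR REPLACE VIEW ALL_CARS AS
SELECT C.CAR_ID, C.CAR_NAME, C.BRAND_ID, B.BRAND_NAME
FROM CARS C
INNER JOIN BRANDS B ON B.BRAND_ID = C.BRAND_ID;

-- CARS is the key-preserved table of this join view (each CARS row
-- appears at most once in it), so Oracle can translate the DELETE
-- into a DELETE against CARS. Rows in BRANDS are left untouched.
DELETE FROM ALL_CARS
WHERE BRAND_NAME = 'Honda';
```

This works because BRAND_ID is the primary key of BRANDS, so each car joins to exactly one brand.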
SQL> CREATE TABLE PERSONS1 (ID INT NOT NULL, LASTNAME VARCHAR (15),
FIRSTNAME VARCHAR(15), AGE INT);
1 row created.
1 row created.
1 row created.
To insert the rows of the 'PERSONS1' table whose ID is less than 103
into the 'PERSON_BKP' table, the following SQL can be used:
CREATION OF TABLE PERSON_BKP:
SQL> CREATE TABLE PERSON_BKP(ID INT NOT NULL, LASTNAME VARCHAR(15), FIRSTNAME
VARCHAR(15), AGE INT);
Table created.
SQL> SELECT * FROM PERSON_BKP;
No rows selected
SQL Code:
SQL> INSERT INTO PERSON_BKP SELECT * FROM PERSONS1 WHERE ID < 103;
2 rows created.
Important Rule:
o - A SELECT clause
o - A FROM clause
o - A WHERE clause
You can use the comparison operators, such as >, <, or =. The
comparison operator can also be a multiple-row operator, such as
IN, ANY, or ALL.
The inner query executes first before its parent query so that
the results of an inner query can be passed to the outer query.
Syntax
SELECT column_name
FROM table_name
WHERE column_name expression operator
( SELECT column_name from table_name WHERE ... );
Consider the following EMP table:
SQL> INSERT INTO EMP VALUES (1001, 'RAM', 20, 10000);
1 row created.
1 row created.
1 row created.
1 row created.
1 row created.
1 row created
EID
----------
1003
1004
SQL> SELECT * FROM EMP WHERE EID IN (SELECT EID FROM EMP WHERE SALARY>10000);
EXAMPLE -2:
Table created.
1 row created.
1 row created.
SQL> INSERT INTO STUDENT1 VALUES (103, 'KRISH');
1 row created.
1 row created.
1 row created.
1 row created.
SID NAME
---------- ---------------
101 RAM
102 LAKSH
103 KRISH
104 VENKAT
105 SIVA
106 SATYA
6 rows selected.
SQL> CREATE TABLE MARKS (SID INT NOT NULL, TOTAL_MARKS INT NOT
NULL);
Table created.
1 row created.
1 row created.
1 row created.
SQL> INSERT INTO MARKS VALUES (104, 70);
1 row created.
1 row created.
1 row created.
SID TOTAL_MARKS
---------- -----------
101 95
102 85
103 80
104 70
105 75
106 72
Now we want to write a query to identify all students who get better marks
than the student whose SID is '102', but we do not know the marks of
'102'.
To solve the problem, we require two queries. One query returns the
marks (stored in the TOTAL_MARKS field) of '102', and a second query identifies
the students who get better marks than the result of the first query.
First query:
SQL> SELECT * FROM MARKS WHERE SID = 102;
Query result:
SID TOTAL_MARKS
---------- -----------
102 85
Second query:
SQL> SELECT A.SID, A.NAME, B.TOTAL_MARKS FROM STUDENT1 A, MARKS B WHERE A.SID
= B.SID AND B.TOTAL_MARKS>85;
Query result:
The above two queries identify the students who get better marks than the
student whose SID is '102' (LAKSH).
You can combine the above two queries by placing one query inside the other.
The subquery (also called the 'inner query') is the query inside the
parentheses. See the following code and query result:
SQL Code:
SQL> SELECT A.SID, A.NAME, B.TOTAL_MARKS FROM STUDENT1 A, MARKS B WHERE A.SID
= B.SID AND B.TOTAL_MARKS > (SELECT TOTAL_MARKS FROM MARKS WHERE SID = 102);
Query result:
SID NAME TOTAL_MARKS
---------- --------------- -----------
101 RAM 95
Subqueries: Guidelines
There are some guidelines to consider when using subqueries:
A subquery must be enclosed in parentheses.
A subquery must be placed on the right side of the comparison
operator.
Subqueries cannot manipulate their results internally, so an ORDER
BY clause cannot be added to a subquery used in a WHERE clause. You can
use an ORDER BY clause in the main SELECT statement (outer query), where
it must be the last clause.
Use single-row operators with single-row subqueries.
If a subquery (inner query) returns a null value to the outer query,
the outer query will not return any rows when using certain comparison
operators in a WHERE clause.
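The guidelines on single-row and multiple-row operators can be illustrated with the STUDENT1 and MARKS tables from Example 2. This is a sketch; the threshold of 80 is arbitrary:

```sql
-- Sketch only: uses the STUDENT1 and MARKS tables from Example 2.
-- The inner query may return several SIDs, so a multiple-row operator (IN)
-- is required; a single-row operator such as = would fail here with
-- ORA-01427: single-row subquery returns more than one row.
SELECT SID, NAME
FROM STUDENT1
WHERE SID IN (SELECT SID FROM MARKS WHERE TOTAL_MARKS >= 80)
ORDER BY SID;   -- ORDER BY belongs to the outer query
```

With the sample data shown above, the inner query returns the SIDs 101, 102 and 103, so the outer query lists RAM, LAKSH and KRISH.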
Scalar Functions
Scalar functions return a single value from an input value. Following are
some frequently used Scalar Functions in SQL.
Consider the following EMP table
1 row created.
1 row created.
1 row created.
1 row created.
1 row created.
1 row created.
6 rows selected.
1. UPPER() Function:
UPPER(NAME)
-------------------
RAM
LAKSH
KRISH
VENKAT
SIVA
SATYA
2. LOWER() Function:
LOWER function is used to convert value of string columns to Lowercase
characters.
Syntax for LOWER is,
SELECT LOWER(COLUMN_NAME) FROM TABLE_NAME;
Using LOWER() function
Consider the following EMP table
LOWER(NAME)
--------------------
ram
laksh
krish
venkat
siva
satya
SUB
---
AM
AKS
RIS
ENK
IVA
aty
SU
--
AM
AK
RI
EN
IV
at
SUBS
----
AM
AKSH
RISH
ENKA
IVA
atya
SUB
---
RAM
LAK
KRI
VEN
SIV
Sat
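The queries that produced the four result sets above are missing from this section. SUBSTR(string, start, length) returns length characters beginning at position start, counting from 1. The outputs are consistent with the following calls; treat them as a reconstruction, not as the original statements:

```sql
-- Reconstructed from the outputs above (assumption, original queries not shown).
SELECT SUBSTR(NAME, 2, 3) FROM EMP;   -- 'KRISH' -> 'RIS'
SELECT SUBSTR(NAME, 2, 2) FROM EMP;   -- 'KRISH' -> 'RI'
SELECT SUBSTR(NAME, 2, 4) FROM EMP;   -- 'KRISH' -> 'RISH'
SELECT SUBSTR(NAME, 1, 3) FROM EMP;   -- 'KRISH' -> 'KRI'
```

When fewer characters remain than were requested, SUBSTR returns what is left, which is why 'RAM' yields 'AM' in the first three result sets.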
4. REVERSE() Function: Returns the characters of a string value in reverse order.
REVERSE(NAME)
--------------------
MAR
HSKAL
HSIRK
TAKNEV
AVIS
aytas
INITCAP(NAME)
--------------------
Ram
Laksh
Krish
Venkat
Siva
Satya
LENGTH(NAME)
------------
3
5
5
6
4
5
7. RTRIM() Function: Returns a character string after truncating all
trailing blanks.
Syntax for RTRIM() is:
SELECT RTRIM(COLUMN_NAME) FROM TABLE_NAME;
Consider the following EMP table
RTRIM(NAME)
--------------------
RAM
LAKSH
KRISH
VENKAT
SIVA
Satya
Without LTRIM(), the AGE column is displayed in the following format:
SQL> SELECT AGE FROM EMP;
AGE
----------
20
21
22
19
23
22
LTRIM(AGE)
----------------
20
21
22
19
23
22
NAME RTRIM(AGE)
-------------------- ---------------------
RAM 20
LAKSH 21
KRISH 22
VENKAT 19
SIVA 23
RAM 24
6 rows selected.
9. CONCAT() Function: Returns the two given values joined into a single text string.
Syntax for CONCAT() is:
SELECT CONCAT(COLUMN1, COLUMN2) FROM TABLE_NAME;
Consider the following EMP table
CONCAT(NAME,EID)
----------------------------------------
RAM1001
LAKSH1002
KRISH1003
VENKAT1004
SIVA1005
satya1006
OR
INSTR(NAME,'A')
---------------
0
0
0
0
0
2
INSTR(NAME,'A')
---------------
2
2
0
5
4
0
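The INSTR queries are also missing here. INSTR(string, substring) returns the position of the first occurrence of substring, or 0 when it does not occur, and the comparison is case-sensitive. SQL*Plus appears to print both column headings in upper case, which would explain why the two result sets carry the same heading. They are consistent with a search for a lower-case 'a' followed by a search for an upper-case 'A' (an assumption):

```sql
-- Reconstructed from the outputs above (assumption, original queries not shown).
SELECT INSTR(NAME, 'a') FROM EMP;   -- 'Satya' -> 2, 'RAM' -> 0
SELECT INSTR(NAME, 'A') FROM EMP;   -- 'RAM' -> 2, 'VENKAT' -> 5, 'Satya' -> 0
```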
SQL Functions
SQL provides many built-in functions to perform operations on data. These
functions are useful while performing mathematical calculations, string
concatenations, sub-strings etc. SQL functions are divided into two
categories,
1. Aggregate Functions
2. Scalar Functions
Aggregate Functions
These functions return a single value after performing
calculations on a group of values. Following are some of the frequently used
aggregate functions.
1 row created.
1 row created.
1 row created.
1 row created.
1 row created.
AVG(SALARY)
-----------
10200
AVG(AGE)
----------
21
COUNT() Function:
Count returns the number of rows present in the table either based on some
condition or without condition.
Its general syntax is,
SELECT COUNT (COLUMN_NAME) FROM TABLE_NAME;
COUNT(NAME)
-----------
2
COUNT(EID)
----------
2
COUNT(AGE)
----------
2
COUNT(NAME)
-----------
1
COUNT(NAME)
-----------
5
Example of COUNT(distinct):
COUNT(DISTINCTSALARY)
---------------------
4
SQL> SELECT COUNT(DISTINCT NAME) FROM EMP;
COUNT(DISTINCTNAME)
-------------------
5
SQL> SELECT COUNT(DISTINCT AGE) FROM EMP;
COUNT(DISTINCTAGE)
------------------
5
COUNT(*) returns the total number of records in a table:
SQL> SELECT COUNT(*) FROM EMP;
COUNT(*)
----------
6
MAX() Function
The MAX function returns the maximum value from a selected column of the table.
Syntax of MAX function is,
SELECT MAX(COLUMN_NAME) FROM TABLE_NAME;
Using MAX() function
Consider the ABOVE EMP table
SQL query to find the Maximum salary will be,
SQL> SELECT MAX(SALARY) FROM EMP;
Result of the above query will be,
MAX(SALARY)
-----------
12000
MAX(AGE)
----------
23
MAX(EID)
----------
1005
SQL> SELECT MAX(NAME) FROM EMP;
MAX(NAME)
--------------------
VENKAT
MIN() Function
The MIN function returns the minimum value from a selected column of the table.
Syntax for MIN function is,
MIN(SALARY)
-----------
9000
SQL> SELECT MIN(AGE) FROM EMP;
MIN(AGE)
----------
19
MIN(NAME)
--------------------
KRISH
MIN(EID)
----------
1001
SUM() Function
The SUM function returns the total of the numeric values in a selected column.
Syntax for SUM is,
SELECT SUM (COLUMN_NAME) FROM TABLE_NAME;
Using SUM() function
Consider the ABOVE EMP table
SQL query to find sum of salaries will be,
SUM(SALARY)
-----------
51000
MAXIMUMPAY
----------
12000
MINIMUMPAY
----------
9000
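The headings MAXIMUMPAY and MINIMUMPAY above are not function names, so they were presumably produced with column aliases. A minimal sketch of queries that would give those headings (a reconstruction; the original statements are not shown):

```sql
-- Reconstructed from the column headings above (assumption).
-- AS gives the aggregate a readable heading in place of
-- MAX(SALARY) / MIN(SALARY).
SELECT MAX(SALARY) AS MAXIMUMPAY FROM EMP;
SELECT MIN(SALARY) AS MINIMUMPAY FROM EMP;
```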
SQL Set Operation
SQL set operations are used to combine the results of two or more SQL SELECT
statements.
1. Union
o The SQL UNION operation is used to combine the results of two or more
SQL SELECT queries.
o The number of columns and their datatypes must be the same in both
queries to which the UNION operation is applied.
o The UNION operation eliminates duplicate rows from its result set.
Syntax:
SELECT column_name FROM table1
UNION
SELECT column_name FROM table2;
CONSIDER THE FOLLOWING TABLES:
Table created.
1 row created.
1 row created.
1 row created.
ID NAME
-------- -----------
1 RAM
2 LAKSH
3 KRISH
Table created.
1 row created.
1 row created.
1 row created.
ID NAME
---------- ---------------
3 KRISH
4 SIVA
5 VENKAT
ID NAME
-------- -----------
1 RAM
2 LAKSH
3 KRISH
ID NAME
---------- --------------------
1 RAM
2 LAKSH
3 KRISH
4 SIVA
5 VENKAT
2. Union All
The UNION ALL operation is similar to the UNION operation, but it returns
all rows from both queries, including duplicates, and does not sort the data.
Syntax:
SELECT column_name FROM table1
UNION ALL
SELECT column_name FROM table2;
Example: Using the above TABLE1 and TABLE2 table.
ID NAME
---------- --------------------
1 RAM
2 LAKSH
3 KRISH
ID NAME
---------- ---------------
3 KRISH
4 SIVA
5 VENKAT
ID NAME
---------- --------------------
1 RAM
2 LAKSH
3 KRISH
3 KRISH
4 SIVA
5 VENKAT
6 rows selected.
To rename a table:
SQL> ALTER TABLE FIRST RENAME TO TABLE1;
MINUS
The MINUS operation combines the results of two SELECT statements and returns only
those rows of the first result that do not appear in the second.
o The MINUS operator is used to display the rows that are present in the first
query but absent from the second query.
o The result has no duplicate rows. Oracle typically returns it in ascending order,
but only an ORDER BY clause guarantees the order.
Syntax:
SELECT column_name FROM table1
MINUS
SELECT column_name FROM table2;
Table created.
1 row created.
1 row created.
1 row created.
ID NAME
-------- -----------
1 RAM
2 LAKSH
3 KRISH
Table created.
1 row created.
1 row created.
1 row created.
ID NAME
---------- ---------------
3 KRISH
4 SIVA
5 VENKAT
ID NAME
-------- -----------
1 RAM
2 LAKSH
3 KRISH
Example
Syntax
SELECT column_name FROM FIRST
INTERSECT
SELECT column_name FROM SECOND;
Table created.
1 row created.
1 row created.
1 row created.
ID NAME
-------- -----------
1 RAM
2 LAKSH
3 KRISH
SQL> CREATE TABLE SECOND (ID INT, NAME VARCHAR (15));
Table created.
1 row created.
1 row created.
1 row created.
ID NAME
---------- ---------------
3 KRISH
4 SIVA
5 VENKAT
ID NAME
-------- -----------
1 RAM
2 LAKSH
3 KRISH
Example:
ID NAME
---------- --------------------
3 KRISH
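The MINUS and INTERSECT statements that go with these tables are not reproduced above. A minimal sketch, assuming the tables FIRST and SECOND hold the rows listed in this section:

```sql
-- Sketch only: FIRST holds (1,RAM), (2,LAKSH), (3,KRISH);
-- SECOND holds (3,KRISH), (4,SIVA), (5,VENKAT).

-- Rows in FIRST that are not in SECOND: (1, RAM) and (2, LAKSH).
SELECT ID, NAME FROM FIRST
MINUS
SELECT ID, NAME FROM SECOND;

-- Rows common to both tables: (3, KRISH).
SELECT ID, NAME FROM FIRST
INTERSECT
SELECT ID, NAME FROM SECOND;
```

Both operators compare whole rows, so a row counts as common only when its ID and NAME both match.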
PL/SQL Trigger
Triggers are stored programs that are automatically executed (fired) in
response to certain events, such as DML statements on a particular table.
Advantages of Triggers
These are the following advantages of Triggers:
o {BEFORE | AFTER | INSTEAD OF}: This specifies when the trigger would be
executed. The INSTEAD OF clause is used for creating trigger on a view.
o {INSERT [OR] | UPDATE [OR] | DELETE}: This specifies the DML operation.
o [OF col_name]: This specifies the column name that would be updated.
o [ON table_name]: This specifies the name of the table associated with
the trigger.
o [REFERENCING OLD AS o NEW AS n]: This allows you to refer to the new and old
values for various DML statements, like INSERT, UPDATE, and DELETE.
o [FOR EACH ROW]: This specifies a row level trigger, i.e., the trigger
would be executed for each row being affected. Otherwise the trigger
will execute just once when the SQL statement is executed, which is
called a table level (statement level) trigger.
o WHEN (condition): This provides a condition for rows for which the
trigger would fire. This clause is valid only for row level triggers.
DECLARE --optional
<declarations>
BEGIN --mandatory
<executable statements. At least one executable statement is mandatory>
EXCEPTION --optional
<exception handles>
END; --mandatory
/
1 row created.
SQL> INSERT INTO EMPLOYEE VALUES (222, 'LAKSH', 20, 'DELHI', 12000);
1 row created.
SQL> INSERT INTO EMPLOYEE VALUES (333, 'VENKAT', 23, 'MUMBAI', 11000);
1 row created.
SQL> INSERT INTO EMPLOYEE VALUES (444, 'KRISH', 21, 'HYD', 14000);
1 row created.
SQL> INSERT INTO EMPLOYEE VALUES (555, 'SIVA', 25, 'BANGLORE', 13000);
1 row created.
SQL> INSERT INTO EMPLOYEE VALUES (666, 'SATYA', 24, 'CHENNAI', 15000);
1 row created.
Create trigger:
Let's take a program to create a row level trigger for the EMPLOYEE table
that would fire for INSERT or UPDATE or DELETE operations performed on the
EMPLOYEE table. This trigger will display the salary difference between the
old values and new values:
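The trigger's source is not reproduced in this section. A minimal sketch consistent with the output shown further below, assuming the EMPLOYEE table has a numeric SALARY column; the trigger name display_salary_changes is hypothetical:

```sql
-- Sketch only: the trigger name is hypothetical, and EMPLOYEE is
-- assumed to have a numeric SALARY column.
CREATE OR REPLACE TRIGGER display_salary_changes
BEFORE INSERT OR UPDATE OR DELETE ON employee
FOR EACH ROW
DECLARE
  sal_diff NUMBER;
BEGIN
  -- :OLD values are NULL for INSERT and :NEW values are NULL for DELETE,
  -- so the difference is only meaningful for UPDATE.
  sal_diff := :NEW.salary - :OLD.salary;
  dbms_output.put_line('Old salary: ' || :OLD.salary);
  dbms_output.put_line('New salary: ' || :NEW.salary);
  dbms_output.put_line('Salary difference: ' || sal_diff);
END;
/
```

Because the trigger is declared FOR EACH ROW, an UPDATE that touches six rows prints six groups of three lines.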
Note: Each time you execute this code, both the old and the new salary are
incremented by 500, so the salary difference is always 500.
After the execution of above code again, you will get the following result.
SQL> SET SERVEROUTPUT ON;
SQL> DECLARE
  total_rows NUMBER(2);
BEGIN
  UPDATE employee
  SET salary = salary + 500;
  IF sql%notfound THEN
    dbms_output.put_line('no employee updated');
  ELSIF sql%found THEN
    total_rows := sql%rowcount;
    -- leading space so the message reads "6 employees updated"
    dbms_output.put_line(total_rows || ' employees updated');
  END IF;
END;
/
Old salary: 11000
New salary: 11500
Salary difference: 500
Old salary: 13000
New salary: 13500
Salary difference: 500
Old salary: 12000
New salary: 12500
Salary difference: 500
Old salary: 15000
New salary: 15500
Salary difference: 500
Old salary: 14000
New salary: 14500
Salary difference: 500
Old salary: 16000
New salary: 16500
Salary difference: 500
6 employees updated
Parameters or arguments
Trigger_name – The name of the trigger to be created.
AFTER INSERT – indicates that the trigger fires after the INSERT statement is
executed.
table_name – The name of the table for which the trigger is created.
Restrictions
You cannot create an AFTER trigger on a view.
You cannot update :NEW (new) values in an AFTER trigger.
You cannot update :OLD (old) values.
Example
Here we are considering two tables SALES and PRODUCTSTOCK
SALES TABLE:
SQL> CREATE TABLE SALES (PROD_ID INT, CUSTOMERNAME VARCHAR(15));
Table created.
PROD_ID CUSTOMERNAME
---------- ---------------
1 RAM
PRODUCTSTOCK TABLE:
SQL> CREATE TABLE PRODUCTSTOCK (PROD_ID INT, PRODUCTNAME VARCHAR (15),
TOTAL INT);
Table created.
Counter : 1
Counter : 2
The condition in the EXIT WHEN clause evaluated to true when the
counter is three. Therefore, the loop body only executed two times
before it terminated.
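The code that produced this output is not shown. A minimal sketch of a loop with an EXIT WHEN clause that behaves as described; the variable name l_counter is an assumption:

```sql
-- Sketch only: l_counter is a hypothetical variable name.
-- Run SET SERVEROUTPUT ON first to see the output in SQL*Plus.
DECLARE
  l_counter NUMBER := 0;
BEGIN
  LOOP
    l_counter := l_counter + 1;
    EXIT WHEN l_counter = 3;   -- true on the third pass, before printing
    dbms_output.put_line('Counter : ' || l_counter);
  END LOOP;
END;
/
```

EXIT WHEN condition is shorthand for an IF condition THEN EXIT; END IF; block.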
int i = 1;
while (i <= 5)
{
    printf("%d", i);
    i++;
}
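For comparison, the same loop can be written with the PL/SQL WHILE LOOP statement. This is a sketch; unlike the C version, put_line prints each value on its own line:

```sql
-- Sketch only: the PL/SQL counterpart of the C-style loop above.
DECLARE
  i NUMBER := 1;
BEGIN
  WHILE i <= 5 LOOP
    dbms_output.put_line(i);
    i := i + 1;          -- PL/SQL has no ++ operator
  END LOOP;
END;
/
```

The condition is tested before every pass, so the body does not run at all if the condition is false at the start.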
PL/SQL NULL Statement
Introduction to PL/SQL NULL statement
The PL/SQL NULL statement has the following format:
NULL;
The NULL statement is a NULL keyword followed by a semicolon ( ;).
The NULL statement does nothing except that it passes control to the
next statement.
The NULL statement is useful to:
Improve code readability
Provide a target for a GOTO statement
Create placeholders for subprograms
CASE n_credit_status
WHEN 'BLOCK' THEN
request_for_aproval;
WHEN 'WARNING' THEN
send_email_to_accountant;
ELSE
NULL;
END CASE;
END;
In this example, if the credit status is not blocked or warning, the
program does nothing.
Providing a target for a GOTO statement
When using a GOTO statement, you need to specify a label followed by
at least one executable statement.
The following example uses a GOTO statement to quickly move to the end
of the program if no further processing is required:
DECLARE
b_status BOOLEAN;
BEGIN
IF b_status THEN
GOTO end_of_program;
END IF;
-- further processing here
-- ...
<<end_of_program>>
NULL;
END;
Note that an error will occur if you don’t have the NULL statement
after the end_of_program label.
PL/SQL LOOP
PL/SQL LOOP syntax
The PL/SQL LOOP statement has the following structure:
<<loop_label>> LOOP
statements;
END LOOP loop_label;
This structure is the most basic of all the loop constructs
including FOR LOOP and WHILE LOOP. This basic LOOP statement consists
of a LOOP keyword, a body of executable code, and
the END LOOP keywords.
EXIT statement
The EXIT statement allows you to unconditionally exit the current loop.
LOOP
EXIT;
END LOOP;
Typically, you use the EXIT statement with an IF statement to
terminate a loop when a condition is true:
LOOP
IF condition THEN
EXIT;
END IF;
END LOOP;
The following example illustrates how to use the LOOP statement to
execute a sequence of code and EXIT statement to terminate the loop.
DECLARE
l_counter NUMBER := 0;
BEGIN
LOOP
l_counter := l_counter + 1;
IF l_counter > 3 THEN
EXIT;
END IF;
dbms_output.put_line( 'Inside loop: ' || l_counter ) ;
END LOOP;
-- control resumes here after EXIT
dbms_output.put_line( 'After loop: ' || l_counter );
END;
Here is the output:
Inside loop: 1
Inside loop: 2
Inside loop: 3
After loop: 4
The following explains the logic of the code:
First, declare and initialize a variable l_counter to zero.
Second, increase the l_counter by one inside the loop and exit
the loop if the l_counter is greater than three. If
the l_counter is less than or equal to three, show
the l_counter value. Because the initial value of l_counter is
zero, the code in the body of the loop executes three times
before it is terminated.
Third, display the value of the l_counter after the loop.
Nested loops
It is possible to nest a LOOP statement within another LOOP statement
as shown in the following example:
DECLARE
l_i NUMBER := 0;
l_j NUMBER := 0;
BEGIN
<<outer_loop>>
LOOP
l_i := l_i + 1;
EXIT outer_loop WHEN l_i > 2;
dbms_output.put_line('Outer counter ' || l_i);
-- reset inner counter
l_j := 0;
<<inner_loop>> LOOP
l_j := l_j + 1;
EXIT inner_loop WHEN l_j > 3;
dbms_output.put_line(' Inner counter ' || l_j);
END LOOP inner_loop;
END LOOP outer_loop;
END;
Here is the output:
Outer counter 1
Inner counter 1
Inner counter 2
Inner counter 3
Outer counter 2
Inner counter 1
Inner counter 2
Inner counter 3
Introduction to PL/SQL GOTO statement
The GOTO statement allows you to transfer control to a labeled block
or statement. The following illustrates the syntax of
the GOTO statement:
GOTO label_name;
The label_name is the name of a label that identifies the target
statement. In the program, you surround the label name with double
enclosing angle brackets, without a trailing semicolon, as shown below:
<<label_name>>
When PL/SQL encounters a GOTO statement, it transfers control to the
first executable statement after the label.
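A minimal sketch of how control moves between labels; the label names and messages are hypothetical:

```sql
-- Sketch only: the label names are hypothetical.
BEGIN
  GOTO second_message;

  <<first_message>>
  dbms_output.put_line('Hello');
  GOTO the_end;

  <<second_message>>
  dbms_output.put_line('PL/SQL GOTO demo');
  GOTO first_message;

  <<the_end>>
  dbms_output.put_line('Good bye');
END;
/
```

The messages are printed in the order 'PL/SQL GOTO demo', 'Hello', 'Good bye', which is the order in which control reaches the labels, not the order in which the statements are written.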
<<outer>>
DECLARE
  l_counter PLS_INTEGER := 10;
BEGIN
  FOR l_counter IN 1..5 LOOP
    DBMS_OUTPUT.PUT_LINE ('Local counter:' || l_counter);
    -- assign to the variable declared in the block labeled outer
    outer.l_counter := l_counter;
  END LOOP;
  -- after the loop
  DBMS_OUTPUT.PUT_LINE ('Global counter' || l_counter);
END outer;
D) Referencing loop index outside the FOR LOOP
The following example causes an error because it references the loop
index, which is undefined, outside the FOR LOOP statement.
BEGIN
FOR l_index IN 1..3 loop
DBMS_OUTPUT.PUT_LINE (l_index);
END LOOP;
-- referencing index after the loop
DBMS_OUTPUT.PUT_LINE (l_index);
END;
Oracle issued the following error:
PLS-00201: identifier 'L_INDEX' must be declared
In PL/SQL, code is not executed one line at a time; it is always executed
by grouping the code into a single unit called a block. This section
describes these blocks.
Blocks contain both PL/SQL and SQL instructions. All these instructions are
executed as a whole rather than one instruction at a time.
A block has the following sections:
1. Declaration section
2. Execution section
3. Exception-Handling section
Declaration Section
This is the first section of a PL/SQL block, and it is optional. It holds
the declarations of the variables, cursors, exceptions, subprograms,
pragma instructions and collections that are needed in the block.
Below are few more characteristics of this part.
Execution Section
The execution part is the main and mandatory part, which actually executes the code
written inside it. Since PL/SQL expects executable statements in
this section, it cannot be empty, i.e., it must contain at least one valid
executable line of code. Below are a few more characteristics of this part.
Exception-Handling Section:
This is the section where the exception raised in the execution block is
handled.
This section is the last part of the PL/SQL block.
Control from this section can never return to the execution block.
This section starts with the keyword 'EXCEPTION'.
This section should always be followed by the keyword 'END'.
DECLARE --optional
<declarations>
BEGIN --mandatory
<executable statements. At least one executable statement is mandatory>
EXCEPTION --optional
<exception handles>
END; --mandatory
/
Note: A block should always be followed by '/' which sends the information to the
compiler about the end of the block.
1. Anonymous blocks
2. Named Blocks
Anonymous blocks:
Anonymous blocks are PL/SQL blocks which do not have any name assigned to
them. They need to be created and used in the same session because they are not
stored in the server as database objects.
Since they are not stored in the database, they need no separate compilation step. They
are written and executed directly, and compilation and execution happen in a single
process.
These blocks don't have any reference name specified for them.
These blocks start with the keyword 'DECLARE' or 'BEGIN'.
Since these blocks do not have any reference name, they cannot be stored
for later use. They must be created and executed in the same session.
They can call other named blocks, but an anonymous block cannot be called
because it has no name to refer to.
An anonymous block can contain nested blocks, which can be named or anonymous.
It can also be nested in other blocks.
These blocks can have all three sections of the block, of which the execution
section is mandatory and the other two sections are optional.
Named blocks:
Named blocks have a specific and unique name. They are stored as
database objects in the server. Since they are available as database objects, they
can be referred to or used as long as they are present on the server. The compilation
process for named blocks happens separately, when they are created as database
objects.
1. Procedure
2. Function
*****
How to write a simple program using PL/SQL
We need to execute "set serveroutput on" if we need to see the output of the
code.
Now we are ready to work with the SQL* Plus tool.
We are going to write a simple program for printing "Hello World" using
an anonymous block.
BEGIN
  dbms_output.put_line ('Hello World...');
END;
/
Output:
Hello World...
Code Explanation:
Note: A block should be always followed by '/' which sends the information to the
compiler about the end of the block. Till the compiler encounters '/', it will not
consider the block is completed, and it will not execute it.
EXAMPLE:
OUTPUT:HELLO
SQL>
Here we are going to print the "Hello World" using the variables.
DECLARE
  text VARCHAR2(25);
BEGIN
  text := 'Hello World';
  dbms_output.put_line (text);
END;
/
Output:
Hello World
Code Explanation:
SQL> DECLARE
TEXT VARCHAR(15);
BEGIN
TEXT := 'HELL0 PL/SQL';
DBMS_OUTPUT.PUT_LINE(TEXT);
END;
/
HELL0 PL/SQL
SQL> DECLARE
TEXT VARCHAR(15);
BEGIN
TEXT := 'HELL0 PL/SQL';
-- DBMS_OUTPUT.PUT_LINE(TEXT); //SINGLE LINE COMMENT
END;
/
Commenting code simply instructs the compiler to ignore that particular code, so it is
not executed.
Comments can be used in a program to increase its readability. In
PL/SQL, code can be commented in two ways.
Using '--' at the beginning of a line to comment out that particular line.
Using '/* ... */' to comment out multiple lines. The symbol '/*' marks the start
of the comment and the symbol '*/' marks the end of the comment. The
code between these two symbols is treated as a comment by the
compiler.
Example: In this example, we are going to print 'Hello World' and we are also
going to see how the commented lines behave in the code
BEGIN
  --single line comment
  dbms_output.put_line ('Hello World');
  /*Multi line commenting begins
  Multi line commenting ends */
END;
/
Output:
Hello World
Code Explanation:
Code line 2: Single line comment and compiler ignored this line from
execution.
Code line 3: Printing the value "Hello World."
Code line 4: Multiline commenting starts with '/*'
Code line 5: Multiline commenting ends with '*/'
The literal values should always be enclosed in single quotes while assigning them
to CHARACTER data type.
Variables are mainly used to store data during the data manipulation or data
processing. They need to be declared before using them inside the program. This
declaration needs to be done in the declarative section of the PL/SQL blocks.
Syntax
<variable_name> <datatype>;
The above syntax shows how to declare the variable in the declarative section.
Once the variable is declared, they are ready to hold the data of defined type. The
values of these variables can be assigned either in execution section or at the time
of declaring itself. The value can be either a literal or another variable's value. Once
a particular value has been assigned, it will be stored in the allocated memory
space for that variable.
Syntax
<variable_name> <datatype> := <value>;
The above syntax shows how to declare the variable and assign a value in the
declarative section.
<variable_name> <datatype>;
<variable_name> := <value>;
The above syntax shows how to assign the value to an already declared variable.
Example 1: In this example, we are going to learn how to declare a variable and
how to assign a value to it. We are going to print 'PL / SQL' in the following
program by using variables.
DECLARE
  lv_name VARCHAR2(50);
  lv_name_2 VARCHAR2(50) := 'PL / SQL';
BEGIN
  lv_name := lv_name_2;
  dbms_output.put_line(lv_name);
END;
/
Code Explanation:
Code line 2: Declaring the variable 'lv_name' of VARCHAR2 with size 50.
Code line 3: Declaring the variable 'lv_name_2' of VARCHAR2 with size 50
and assigning the default value using the literal 'PL / SQL'.
Code line 5: Value for variable 'lv_name' has been assigned from the
variable 'lv_name_2'.
Code line 6: Printing the stored value of variable 'lv_name'.
When the above code is executed, you will get the following output.
Output: PL / SQL
A trigger is a named PL/SQL module that is stored in the database and
can be invoked repeatedly. You can enable and disable a trigger, but you
cannot call it explicitly.
While a trigger is enabled, the database automatically invokes it
whenever its triggering event occurs. While the trigger is disabled, it
does not fire.
You create a trigger with the CREATE TRIGGER statement. You specify the
triggering event in terms of the triggering statements and the object
they act on. The trigger is said to be created on, or defined on, that
object, which is either a table, a view, a schema or the database.
You also specify the timing point, which determines whether the trigger
fires before or after the triggering statement runs and whether it fires
for each row affected by the triggering statement. By default, a trigger
is created in the enabled state.
1 row created.
1 row updated.
SQL> SELECT * FROM SUPERHEROS;
SP_NAME
----------
KRISHNA
1 row deleted.
no rows selected
Now we are creating a single trigger for all three DML commands
SQL>CREATE OR REPLACE TRIGGER TR_SH
BEFORE INSERT OR DELETE OR UPDATE ON SUPERHEROS
FOR EACH ROW
DECLARE
TR_USER VARCHAR2(10);
BEGIN
SELECT USER INTO TR_USER FROM DUAL;
IF INSERTING THEN
DBMS_OUTPUT.PUT_LINE ('ONE ROW INSERTED BY. '|| TR_USER);
ELSIF DELETING THEN
DBMS_OUTPUT.PUT_LINE ('ONE ROW DELETED BY. '|| TR_USER);
ELSIF UPDATING THEN
DBMS_OUTPUT.PUT_LINE ('ONE ROW UPDATED BY. '|| TR_USER);
END IF;
END;
/
Trigger created.
CHECK THE INSERT, UPDATE AND DELETE COMMANDS:
SQL> INSERT INTO SUPERHEROS VALUES ('RAM');
ONE ROW INSERTED BY. DSR
1 row created.
1 row updated.
1 row deleted.
PL/SQL IF THEN statement example
In the following example, the statements between THEN and END
IF execute because the sales revenue is greater than 100,000.
EXAMPLE
DECLARE
n_sales NUMBER := 2000000;
BEGIN
IF n_sales > 100000 THEN
DBMS_OUTPUT.PUT_LINE( 'Sales revenue is greater than 100K ' );
END IF;
END;
/
Nested IF statement:
You can nest an IF statement within another IF statement as shown
below:
IF condition_1 THEN
IF condition_2 THEN
nested_if_statements;
END IF;
ELSE
else_statements;
END IF;
PL/SQL CASE Statement
The CASE statement chooses one sequence of statements to execute out
of many possible sequences.
The CASE statement has two types: simple CASE statement and
searched CASE statement. Both types of the CASE statements support an
optional ELSE clause.
Simple CASE statement
A simple CASE statement evaluates a single expression and compares the
result with some values.
The simple CASE statement has the following structure:
CASE selector
WHEN selector_value_1 THEN
statements_1
WHEN selector_value_2 THEN
statements_2
...
ELSE
else_statements
END CASE;
Let’s examine the syntax of the simple CASE statement in detail:
1) selector
The selector is an expression which is evaluated once. The result of
the selector is used to select one of the several alternatives
e.g., selector_value_1 and selector_value_2.
2) WHEN selector_value THEN statements
The selector values i.e., selector_value_1, selector_value_2, etc.,
are evaluated sequentially. If the result of a selector value equals
the result of the selector, then the associated sequence of statements
executes and the CASE statement ends. In addition, the subsequent
selector values are not evaluated.
3) ELSE else_statements
If no values in the WHEN clauses match the result of the selector in
the CASE clause, the sequence of statements in the ELSE clause
executes.
Because the ELSE clause is optional, you can skip it. However, if you
do so, PL/SQL will implicitly use the following:
ELSE
RAISE CASE_NOT_FOUND;
In other words, PL/SQL raises a CASE_NOT_FOUND error if you don't
specify an ELSE clause and the result of the CASE expression does not
match any value in the WHEN clauses.
Note that this behavior of the CASE statement is different from the IF
THEN statement. When the IF THEN statement has no ELSE clause and the
condition is not met, PL/SQL does nothing instead of raising an error.
Simple CASE statement example
The following example compares a single value (c_grade) with the
possible values 'A', 'B', 'C', 'D', and 'F':
DECLARE
c_grade CHAR( 1 );
c_rank VARCHAR2( 20 );
BEGIN
c_grade := 'D';
CASE c_grade
WHEN 'A' THEN
c_rank := 'Excellent' ;
WHEN 'B' THEN
c_rank := 'Very Good' ;
WHEN 'C' THEN
c_rank := 'Good' ;
WHEN 'D' THEN
c_rank := 'Fair' ;
WHEN 'F' THEN
c_rank := 'Poor' ;
ELSE
c_rank := 'No such grade' ;
END CASE;
DBMS_OUTPUT.PUT_LINE( c_rank );
END;
SQL> CREATE TABLE STUDENT (S_ID INT NOT NULL, SNAME VARCHAR (20), AGE INT,
ADDRESS VARCHAR (20));
1 row created.
1 row created.
1 row created.
1 row created.
1 row created.
SQL> INSERT INTO STUDENT VALUES (106, 'SATYA', 22, 'TIRUPATI');
1 row created.
Table created.
1 row created.
1 row created.
1 row created.
1 row created.
1 row created.
1 row created.
1 row created.
COURSE_ID S_ID
---------- ----------
1 101
2 102
2 103
3 104
1 105
4
5
6
LEFT JOIN: This join returns all the rows of the table on the left side of
the join and the matching rows of the table on the right side of the join.
For rows that have no matching row on the right side, the result-set
contains NULL. LEFT JOIN is also known as LEFT OUTER JOIN.
Syntax:
SELECT table1.column1, table1.column2, table2.column1,…
FROM table1
LEFT JOIN table2
ON table1.matching_column = table2.matching_column;
Note: We can also use LEFT OUTER JOIN instead of LEFT JOIN; both are the same.
SQL> CREATE TABLE STUDENT (S_ID INT NOT NULL, SNAME VARCHAR (20), AGE INT,
ADDRESS VARCHAR (20));
1 row created.
1 row created.
1 row created.
1 row created.
1 row created.
Table created.
1 row created.
1 row created.
1 row created.
1 row created.
1 row created.
SQL> SELECT * FROM COURSE;
COURSE_ID S_ID
---------- ----------
1 101
2 102
2 103
3 104
1 105
6 rows selected.
6 rows selected.
SQL> SELECT * FROM STUDENT LEFT JOIN COURSE ON STUDENT.S_ID= COURSE.S_ID;
6 rows selected.
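The LEFT JOIN above can also be tried outside Oracle. The following is a minimal sketch using Python's standard sqlite3 module, assuming cut-down STUDENT and COURSE tables (S_ID and SNAME only) filled with the sample rows that appear in these notes. The in-memory database and the variable names are assumptions made for illustration.

```python
import sqlite3

# Throwaway in-memory database; nothing is written to disk.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE STUDENT (S_ID INT NOT NULL, SNAME VARCHAR(20))")
con.execute("CREATE TABLE COURSE (COURSE_ID INT, S_ID INT)")
con.executemany("INSERT INTO STUDENT VALUES (?, ?)", [
    (101, "RAM"), (102, "LAKSHMAN"), (103, "KRISH"),
    (104, "VENKAT"), (105, "SIVA"), (106, "SATYA"),
])
con.executemany("INSERT INTO COURSE VALUES (?, ?)", [
    (1, 101), (2, 102), (2, 103), (3, 104), (1, 105),
])

# Every STUDENT row is kept; COURSE columns are NULL where nothing matches.
rows = con.execute(
    "SELECT STUDENT.S_ID, STUDENT.SNAME, COURSE.COURSE_ID "
    "FROM STUDENT LEFT JOIN COURSE ON STUDENT.S_ID = COURSE.S_ID "
    "ORDER BY STUDENT.S_ID"
).fetchall()
con.close()

for row in rows:
    print(row)
```

The student with S_ID 106 has no row in COURSE, so COURSE_ID for that row comes back as None, which is how sqlite3 reports NULL.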
INNER JOINS
A SQL JOIN clause is used to combine rows from two or more tables, based on a
related column (a common field) between them. The types of join are:
Inner Join
Natural Join
Left Join
Right Join
Full Join
SQL> CREATE TABLE STUDENT (S_ID INT NOT NULL, SNAME VARCHAR(20), AGE INT,
ADDRESS VARCHAR(20));
1 row created.
1 row created.
1 row created.
1 row created.
1 row created.
1 row created.
SQL> SELECT * FROM STUDENT;
Table created.
1 row created.
1 row created.
1 row created.
1 row created.
1 row created.
COURSE_ID S_ID
---------- ----------
1 101
2 102
2 103
3 104
1 105
INNER JOIN:
Syntax:
SELECT table1.column1, table1.column2, table2.column1,….
FROM table1
INNER JOIN table2
ON table1.matching_column = table2.matching_column;
INNER JOIN
This query will show the student ID and age of students enrolled in
different courses.
SQL> SELECT COURSE.COURSE_ID, STUDENT.S_ID, STUDENT.AGE FROM STUDENT INNER
JOIN COURSE ON STUDENT.S_ID = COURSE.S_ID;
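As a rough check of how INNER JOIN behaves, the sketch below runs a similar query with Python's standard sqlite3 module instead of Oracle. It assumes simplified tables (the AGE and ADDRESS columns are left out because most of their values are not shown in these notes) and uses the sample rows listed above. The variable names are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE STUDENT (S_ID INT NOT NULL, SNAME VARCHAR(20))")
con.execute("CREATE TABLE COURSE (COURSE_ID INT, S_ID INT)")
con.executemany("INSERT INTO STUDENT VALUES (?, ?)", [
    (101, "RAM"), (102, "LAKSHMAN"), (103, "KRISH"),
    (104, "VENKAT"), (105, "SIVA"), (106, "SATYA"),
])
con.executemany("INSERT INTO COURSE VALUES (?, ?)", [
    (1, 101), (2, 102), (2, 103), (3, 104), (1, 105),
])

# Only rows whose S_ID appears in both tables survive an inner join.
inner_rows = con.execute(
    "SELECT COURSE.COURSE_ID, STUDENT.S_ID "
    "FROM STUDENT INNER JOIN COURSE ON STUDENT.S_ID = COURSE.S_ID "
    "ORDER BY STUDENT.S_ID"
).fetchall()
con.close()

for row in inner_rows:
    print(row)
```

Student 106 is missing from the result because no COURSE row refers to it.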
FULL JOIN: FULL JOIN creates the result-set by combining the results of both
LEFT JOIN and RIGHT JOIN. The result-set contains all the rows from both
tables. For rows that have no match in the other table, the result-set
contains NULL values.
Syntax:
SELECT COLUMN-NAME-LIST
FROM table1
FULL JOIN table2
ON table1.matching_column = table2.matching_column;
FULL JOIN
SQL> CREATE TABLE STUDENT (S_ID INT NOT NULL, SNAME VARCHAR (20), AGE INT,
ADDRESS VARCHAR (20));
1 row created.
1 row created.
SQL> INSERT INTO STUDENT VALUES (103, 'KRISH', 20, 'VIZAG');
1 row created.
1 row created.
1 row created.
1 row created.
Table created.
1 row created.
1 row created.
1 row created.
1 row created.
1 row created.
1 row created.
1 row created.
COURSE_ID S_ID
---------- ----------
1 101
2 102
2 103
3 104
1 105
4
5
6
SNAME COURSE_ID
--------------- ----------
RAM 1
LAKSHMAN 2
KRISH 2
VENKAT 3
SIVA 1
SATYA ---
--- 6
--- 5
--- 4
9 rows selected.
SQL> SELECT STUDENT.S_ID, COURSE.COURSE_ID FROM STUDENT FULL JOIN COURSE ON
(STUDENT.S_ID = COURSE.S_ID);
S_ID COURSE_ID
---------- ----------
101 1
102 2
103 2
104 3
105 1
106 ---
--- 6
--- 5
--- 4
9 rows selected.
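The statement that FULL JOIN combines LEFT JOIN and RIGHT JOIN can be checked with a small sketch. It uses Python's standard sqlite3 module and, as an assumption made for portability, emulates the full join with a LEFT JOIN plus the unmatched COURSE rows, because older SQLite versions have no FULL JOIN keyword. The table contents follow the sample output above; the aliases S and C are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE STUDENT (S_ID INT NOT NULL, SNAME VARCHAR(20))")
con.execute("CREATE TABLE COURSE (COURSE_ID INT, S_ID INT)")
con.executemany("INSERT INTO STUDENT VALUES (?, ?)", [
    (101, "RAM"), (102, "LAKSHMAN"), (103, "KRISH"),
    (104, "VENKAT"), (105, "SIVA"), (106, "SATYA"),
])
con.executemany("INSERT INTO COURSE VALUES (?, ?)", [
    (1, 101), (2, 102), (2, 103), (3, 104), (1, 105),
    (4, None), (5, None), (6, None),   # courses with no student yet
])

full_rows = con.execute(
    # Part 1: every student, with or without a course (the LEFT JOIN half).
    "SELECT S.S_ID, C.COURSE_ID "
    "FROM STUDENT S LEFT JOIN COURSE C ON S.S_ID = C.S_ID "
    "UNION ALL "
    # Part 2: courses that matched no student (what a RIGHT JOIN would add).
    "SELECT S.S_ID, C.COURSE_ID "
    "FROM COURSE C LEFT JOIN STUDENT S ON S.S_ID = C.S_ID "
    "WHERE S.S_ID IS NULL"
).fetchall()
con.close()

print(len(full_rows), "rows selected")
```

On Oracle the same result comes directly from FULL JOIN, as in the query shown above.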
OR
Table created.
1 row created.
1 row created.
1 row created.
ID NAME
---------- ----------------
1 RAM
2 LAKSH
3 KRISH
Table created.
1 row created.
1 row created.
ID ADDRESS
---------- -----------
1 VIZ
2 VIZAG
3 TIRUPATI
ID NAME ID ADDRESS
---------- -------------------- ---------- --------------------
1 RAM 1 VIZ
1 RAM 2 VIZAG
1 RAM 3 TIRUPATI
2 LAKSH 1 VIZ
2 LAKSH 2 VIZAG
2 LAKSH 3 TIRUPATI
3 KRISH 1 VIZ
3 KRISH 2 VIZAG
3 KRISH 3 TIRUPATI
9 rows selected.
ID NAME ADDRESS
---------- -------------------- --------------------
1 RAM VIZ
1 RAM VIZAG
1 RAM TIRUPATI
2 LAKSH VIZ
2 LAKSH VIZAG
2 LAKSH TIRUPATI
3 KRISH VIZ
3 KRISH VIZAG
3 KRISH TIRUPATI
9 rows selected.
ID NAME ID ADDRESS
---------- -------------------- ---------- ---------------
1 RAM 1 VIZ
1 RAM 2 VIZAG
1 RAM 3 TIRUPATI
2 LAKSH 1 VIZ
2 LAKSH 2 VIZAG
2 LAKSH 3 TIRUPATI
3 KRISH 1 VIZ
3 KRISH 2 VIZAG
3 KRISH 3 TIRUPATI
9 rows selected
Note: The number of rows in the output is always the product of the
number of rows in each table. In our example, table-1 has 3 rows and table-2
has 3 rows, so the output has 3×3 = 9 rows.
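The row count in the note can be reproduced with a short Python sketch. It models the two tables as plain lists of tuples (an assumption made for illustration, not how a DBMS stores them) and pairs every row of one with every row of the other.

```python
from itertools import product

names = [(1, "RAM"), (2, "LAKSH"), (3, "KRISH")]
addresses = [(1, "VIZ"), (2, "VIZAG"), (3, "TIRUPATI")]

# A cross join pairs each row of the first table with each row of the second.
cross = [name_row + address_row for name_row, address_row in product(names, addresses)]

for row in cross:
    print(row)
print(len(cross), "rows selected")
```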
What is ER Diagram?
An ER Diagram (Entity Relationship Diagram, also known as ERD) is a
diagram that displays the relationships between entity sets stored in a
database. In other words, ER diagrams help to explain the logical
structure of databases. ER diagrams are created based on three basic
concepts: entities, attributes and relationships.
ER Diagrams use different symbols: rectangles represent
entities, ovals define attributes and diamond shapes represent
relationships.
What is ER Model?
The ER Model (Entity Relationship Model) is a high-level conceptual
data model. It helps you analyze data requirements systematically to
produce a well-designed database. The ER Model represents real-world
entities and the relationships between them. Completing the ER model
before implementing your database is considered a best practice.
OR
An Entity–relationship model (ER model) describes the structure of a
database with the help of a diagram, which is known as Entity
Relationship Diagram (ER Diagram). An ER model is a design or
blueprint of a database that can later be implemented as a database.
The main components of E-R model are: entity set and relationship set.
Here are the geometric shapes and their meaning in an E-R Diagram.
These terms are discussed in detail in the next section (Components of
an ER Diagram), so it is enough to go through them once for now.
Rectangle: Represents Entity sets.
Ellipses: Attributes
Diamonds: Relationship Set
Lines: They link attributes to Entity Sets and Entity sets to
Relationship Set
Double Ellipses: Multivalued Attributes
Dashed Ellipses: Derived Attributes
Double Rectangles: Weak Entity Sets
Double Lines: Total participation of an entity in a relationship set
Components of a ER Diagram
What is a Surrogate key?
A surrogate key is an artificial key whose purpose is to uniquely identify
each record. It is created when a table has no natural primary key. A
surrogate key does not lend any meaning to the data in the table. It is
usually an integer, and its value is generated right before the record is
inserted into the table.
A surrogate key has no actual meaning and is used only to represent
existence. It exists only for data analysis.
The example above shows the shift timings of different employees. In this
example, a surrogate key is needed to uniquely identify each employee.
Lossless Decomposition
A decomposition is lossless if it is feasible to reconstruct
relation R from the decomposed tables using joins. This is the preferred
choice. No information is lost from the relation when it is
decomposed, and the join results in the same original relation.
Let us see an example −
<EmpInfo>
Emp_ID Emp_Name Emp_Age Emp_Location Dept_ID Dept_Name
E001 Jacob 29 Alabama Dpt1 Operations
E002 Henry 32 Alabama Dpt2 HR
E003 Tom 22 Texas Dpt3 Finance
<EmpDetails>
Emp_ID Emp_Name Emp_Age Emp_Location
E001 Jacob 29 Alabama
E002 Henry 32 Alabama
E003 Tom 22 Texas
<DeptDetails>
Dept_ID Emp_ID Dept_Name
Dpt1 E001 Operations
Dpt2 E002 HR
Dpt3 E003 Finance
Lossy Decomposition
As the name suggests, when a relation is decomposed into two or
more relational schemas, the loss of information is unavoidable when
the original relation is retrieved.
<DeptDetails>
Dept_ID Dept_Name
Dpt1 Operations
Dpt2 HR
Dpt3 Finance
1. Lossless decomposition--
Lossless decomposition ensures-
No information is lost from the original relation during decomposition.
When the sub relations are joined back, the same relation is obtained
that was decomposed.
Every decomposition must always be lossless.
2. Dependency Preservation--
Types of Decomposition-
R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn = R
Example-
Consider the following relation R( A , B , C )--
A B C
1 2 1
2 5 3
3 3 3
R( A , B , C )
A B
1 2
2 5
3 3
R1( A , B )
B C
2 1
5 3
3 3
R2( B , C )
R1 ⋈ R2 = R
Now, if we perform the natural join (⋈) of the sub relations R1 and
R2 , we get-
A B C
1 2 1
2 5 3
3 3 3
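The claim R1 ⋈ R2 = R for this example can be verified with a few lines of Python. This is a minimal sketch that models each relation as a set of tuples (an assumption made for illustration); the data is the relation R( A , B , C ) given above.

```python
# Each tuple is (A, B, C).
R = {(1, 2, 1), (2, 5, 3), (3, 3, 3)}

# Decompose by projecting onto (A, B) and (B, C).
R1 = {(a, b) for a, b, c in R}
R2 = {(b, c) for a, b, c in R}

# Natural join on the common attribute B.
rejoined = {(a, b, c) for (a, b) in R1 for (b2, c) in R2 if b == b2}

print(rejoined == R)
```

Because every value of B in this instance occurs only once, each tuple of R1 pairs with exactly one tuple of R2 and no extraneous tuples are produced.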
NOTE-
Lossless join decomposition is also known as non-additive join
decomposition.
This is because the resultant relation after joining the sub
relations is same as the decomposed relation.
No extraneous tuples appear after joining of the sub-relations.
R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn ⊃ R
Example-
Consider the following relation R( A , B , C )-
A B C
1 2 1
2 5 3
3 3 3
R( A , B , C )
Consider this relation is decomposed into two sub relations as R1( A , C )
and R2( B , C )-
A C
1 1
2 3
3 3
R1( A , C )
B C
2 1
5 3
3 3
R2( B , C )
R1 ⋈ R2 ⊃ R
Now, if we perform the natural join ( ⋈ ) of the sub relations R1 and
R2 we get-
A B C
1 2 1
2 5 3
2 3 3
3 5 3
3 3 3
This relation is not same as the original relation R and contains some
extraneous tuples.
Clearly, R1 ⋈ R2 ⊃ R.
Thus, we conclude that the above decomposition is lossy join
decomposition.
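The extraneous tuples in this lossy example can be computed directly. The sketch below is a minimal Python illustration that models relations as sets of tuples (an assumption for the example) and uses the same R( A , B , C ), decomposed into R1( A , C ) and R2( B , C ).

```python
# Each tuple is (A, B, C).
R = {(1, 2, 1), (2, 5, 3), (3, 3, 3)}

# Decompose by projecting onto (A, C) and (B, C).
R1 = {(a, c) for a, b, c in R}
R2 = {(b, c) for a, b, c in R}

# Natural join on the common attribute C.
joined = {(a, b, c) for (a, c) in R1 for (b, c2) in R2 if c == c2}

# Tuples that the join produces but that were never in R.
extraneous = joined - R

print(sorted(joined))
print(sorted(extraneous))
```

The value C = 3 occurs in two tuples on each side, so the join pairs them in all four combinations, two of which were never in R.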
NOTE-
Anomalies in DBMS
There are three types of anomalies that occur when the database is not
normalized. These are – Insertion, update and deletion anomaly. Let’s take
an example to understand this.
The above table is not normalized. We will see the problems that we face
when a table is not normalized.
Update anomaly: In the above table we have two rows for employee RAM as he
belongs to two departments of the company. If we want to update the address
of RAM then we have to update the same in two rows or the data will become
inconsistent. If somehow, the correct address gets updated in one department
but not in other then as per the database, RAM would be having two different
addresses, which is not correct and would lead to inconsistent data.
Insert anomaly: Suppose a new employee joins the company, who is under
training and currently not assigned to any department then we would not be
able to insert the data into the table if emp_dept field doesn’t allow nulls.
ID NAME ADDRESS
--- --------------- ------------
111 RAM DELHI
222 LAKSH DELHI
333 VENKAT MUMBAI
444 KRISH HYD
555 SIVA BANGLORE
666 SATYA CHENNAI
What is Normalization?
Normalization is a method of organizing the data in the database which helps
you to avoid data redundancy, insertion, update and deletion anomaly
(irregularity). It is a process of analyzing the relation schemas based on
their different functional dependencies and primary key.
*****
In the table above, we have data of 4 Computer Sci. students. As we can see,
data for the fields branch, hod(Head of Department) and office_tel is
repeated for the students who are in the same branch in the college, this
is Data Redundancy.
Insertion Anomaly
Suppose for a new admission, until and unless a student opts for a branch,
data of the student cannot be inserted, or else we will have to set the
branch information as NULL.
Also, if we have to insert data of 100 students of same branch, then the
branch information will be repeated for all those 100 students.
These scenarios are nothing but Insertion anomalies.
Updation Anomaly
What if Mr. X leaves the college? or is no longer the HOD of computer science
department? In that case all the student records will have to be updated, and
if by mistake we miss any record, it will lead to data inconsistency. This is
Updation anomaly.
Deletion Anomaly
In our Student table, two different kinds of information are kept together:
Student information and Branch information. Hence, at the end of the academic
year, if student records are deleted, we will also lose the branch information.
This is Deletion anomaly.
Types of Normal Forms
There are the four types of normal forms:
Normal Form   Description
4NF           A relation will be in 4NF if it is in Boyce Codd normal form and has
              no multi-valued dependency.
Example: STUDENT
STU_ID COURSE HOBBY
21 Computer Dancing
21 Math Singing
34 Chemistry Dancing
74 Biology Cricket
59 Physics Hockey
The given STUDENT table is in 3NF, but COURSE and HOBBY are two
independent entities. Hence, there is no relationship between COURSE and HOBBY.
So to make the above table into 4NF, we can decompose it into two tables:
STUDENT_COURSE
STU_ID COURSE
21 Computer
21 Math
34 Chemistry
74 Biology
59 Physics
STUDENT_HOBBY
STU_ID HOBBY
21 Dancing
21 Singing
34 Dancing
74 Cricket
59 Hockey
Third Normal form (3NF)
A table design is said to be in 3NF if both the following conditions hold:
Table must be in 2NF
Transitive functional dependency of non-prime attribute on any super key
should be removed.
An attribute that is not part of any candidate key is known as non-prime
attribute.
In other words 3NF can be explained like this: A table is in 3NF if it is in
2NF and for each functional dependency X-> Y at least one of the following
conditions hold:
X is a super key of table
Y is a prime attribute of table
An attribute that is a part of one of the candidate keys is known as prime
attribute.
Example: Suppose a company wants to store the complete address of each
employee, they create a table named employee_details that looks like this:
employee_zip table:
Example: Let's assume, a school can store the data of teachers and the
subjects they teach. In a school, a teacher can teach more than one subject.
TEACHER table
TEACHER_ID SUBJECT TEACHER_AGE
25 Chemistry 30
25 Biology 30
47 English 35
83 Math 38
83 Computer 38
TEACHER_ID TEACHER_AGE
25 30
47 35
83 38
TEACHER_SUBJECT table:
TEACHER_ID SUBJECT
25 Chemistry
25 Biology
47 English
83 Math
83 Computer
First Normal Form (1NF)
For a table to be in the First Normal Form, it should follow the following
rules:
o A relation is in 1NF if every attribute contains only atomic values.
EMPLOYEE table:
14 John 7272826385, 9064738238 UP
12 Sam 7390372389, 8589830302 Punjab
The decomposition of the EMPLOYEE table into 1NF has been shown below:
14 John 7272826385 UP
14 John 9064738238 UP
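The conversion to 1NF shown above amounts to emitting one row per phone number. The following Python sketch illustrates this, assuming each EMPLOYEE row is a tuple of (id, name, phone list as one string, state); the function name to_1nf is hypothetical.

```python
# Un-normalized rows: the phone column holds several values in one field.
employees = [
    (14, "John", "7272826385, 9064738238", "UP"),
    (12, "Sam", "7390372389, 8589830302", "Punjab"),
]

def to_1nf(rows):
    """Emit one row per phone number so that every column is atomic."""
    flat = []
    for emp_id, name, phones, state in rows:
        for phone in phones.split(","):
            flat.append((emp_id, name, phone.strip(), state))
    return flat

for row in to_1nf(employees):
    print(row)
```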
What is dependency?
Let's take an example of a student table with columns student_id, name,
reg_no (registration number), branch and address (student's home address).
In this table, student_id is the primary key and will be unique for every
row, hence we can use student_id to fetch any row of data from this table.
Even when student names are the same, if we know the student_id we
can easily fetch the correct record.
Hence we can say a Primary Key for a table is the column or group of
columns (composite key) which can uniquely identify each record in the table.
I can ask for the branch name of the student with student_id IT101, and I can
get it. Similarly, if I ask for the name of the student with student_id IT01 or
IT02, I will get it. So all I need is student_id, and every other column
depends on it, or can be fetched using it.
subject_id subject_name
IT01 Java
IT02 C++
IT03 Php
Let's create another table Score, to store the marks obtained by students in
the respective subjects. We will also be saving name of the teacher who
teaches that subject along with marks.
score_id student_id subject_id marks teacher
If I ask you to get me marks of student with student_id IT01, can you
get it from this table? No, because you don't know for which subject.
And if I give you subject_id, you would not know for which student.
Hence we need student_id + subject_id to uniquely identify any row.
Now as we just discussed that the primary key for this table is a
composition of two columns which is student_id & subject_id but the
teacher's name only depends on subject, hence the subject_id, and has
nothing to do with student_id.
This is Partial Dependency, where an attribute in a table depends on
only a part of the primary key and not on the whole key.
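A partial dependency can be spotted by checking which columns determine which. The Python sketch below is a minimal illustration: the column names follow the Score table above, but the rows, the student ids and the teacher names are hypothetical sample data, and the helper determines is an assumed name. A check on sample rows only shows that a dependency is not violated in those rows; it does not prove the dependency in general.

```python
def determines(rows, lhs, rhs):
    """True if rows that agree on the lhs columns always agree on the rhs columns."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        value = tuple(row[col] for col in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical Score rows; only the column names come from the notes.
score = [
    {"score_id": 1, "student_id": "S1", "subject_id": "IT01", "marks": 70, "teacher": "Teacher X"},
    {"score_id": 2, "student_id": "S1", "subject_id": "IT02", "marks": 75, "teacher": "Teacher Y"},
    {"score_id": 3, "student_id": "S2", "subject_id": "IT01", "marks": 80, "teacher": "Teacher X"},
]

# teacher is fixed by subject_id alone, which is only part of the key.
print(determines(score, ["subject_id"], ["teacher"]))
# marks is not fixed by student_id alone; it needs the whole key.
print(determines(score, ["student_id"], ["marks"]))
print(determines(score, ["student_id", "subject_id"], ["marks"]))
```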
Boyce Codd normal form (BCNF)
o BCNF is the advance version of 3NF. It is stricter than 3NF.
o A table is in BCNF if every functional dependency X → Y, X is the super
key of the table.
o For BCNF, the table should be in 3NF, and for every FD, LHS is super
key.
Example: Let's assume there is a company where employees work in more than
one department.
EMPLOYEE table:
EMP_ID → EMP_COUNTRY
EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone is a key.
To convert the given table into BCNF, we decompose it into three tables:
EMP_COUNTRY table:
EMP_ID EMP_COUNTRY
264 India
364 UK
EMP_DEPT table:
EMP_DEPT_MAPPING table:
EMP_ID EMP_DEPT
D394 283
D394 300
D283 232
D283 549
Functional dependencies:
EMP_ID → EMP_COUNTRY
EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
Candidate keys:
Now, this is in BCNF because left side part of both the functional
dependencies is a key.
Boyce-Codd Normal Form (BCNF)
Boyce-Codd Normal Form or BCNF is an extension to the third normal form, and
is also known as 3.5 Normal Form.
We learned about the third normal form and we also learned how to
remove transitive dependency from a table
Example
Below we have a college enrolment table with
columns student_id, subject and professor.
103 C# P.Chash
As you can see, we have also added some sample data to the table.
In the table above:
One student can enrol for multiple subjects. For example, student
with student_id 101, has opted for subjects - Java & C++
For each subject, a professor is assigned to the student.
And, there can be multiple professors teaching one subject like we have
for Java.
What do you think should be the Primary Key?
Well, in the table above student_id, subject together form the primary
key, because using student_id and subject, we can find all the columns
of the table.
One more important point to note here is, one professor teaches only
one subject, but one subject may have two different professors.
Hence, there is a dependency between subject and professor here,
where subject depends on the professor name.
This table satisfies the 1st Normal form because all the values are
atomic, column names are unique and all the values stored in a
particular column are of same domain.
This table also satisfies the 2nd Normal Form as there is no Partial
Dependency.
And, there is no Transitive Dependency, hence the table also satisfies
the 3rd Normal Form.
But this table is not in Boyce-Codd Normal Form.
Why this table is not in BCNF?
In the table above, student_id, subject form primary key, which means subject column is
a prime attribute.
But, there is one more dependency, professor → subject.
And while subject is a prime attribute, professor is a non-prime attribute, which is
not allowed by BCNF.
student_id p_id
101 1
101 2
and so on...
1 P.Java Java
2 P.Cpp C++
and so on...
Operations in Transaction-
1. Read Operation-
Read operation reads the data from the database and then stores it
in the buffer in main memory.
For example- Read(A) instruction will read the value of A from the
database and will store it in the buffer in main memory(RAM).
2. Write Operation-
Write operation writes the updated data value back to the database
from the buffer.
For example- Write(A) will write the updated value of A from the
buffer to the database.
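The two operations can be pictured with a toy model. In the Python sketch below, one dictionary stands in for the database on disk and another for the buffer in main memory; both, and the starting value of A, are assumptions made purely for illustration.

```python
database = {"A": 100}   # stands in for the database on disk; the value 100 is made up
buffer = {}             # stands in for the buffer in main memory (RAM)

def read(item):
    """Read(item): copy the value from the database into the buffer."""
    buffer[item] = database[item]

def write(item):
    """Write(item): copy the updated value from the buffer back to the database."""
    database[item] = buffer[item]

read("A")                    # buffer now holds A = 100
buffer["A"] -= 30            # the transaction changes A in the buffer only
before_write = database["A"] # the database still holds the old value
write("A")                   # the database now holds the updated value

print(before_write, database["A"])
```

Until Write(A) runs, the change exists only in the buffer, which is why the database value printed first is still the old one.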
Transaction States-
1. Active State-
This is the first state in the life cycle of a transaction.
A transaction is said to be in an active state as long as its
instructions are being executed.
All the changes made by the transaction now are stored in the buffer
in main memory.
3. Committed State-
NOTE-
4. Failed State-
5. Aborted State-
After the transaction has failed and entered into a failed state,
all the changes made by it have to be undone.
To undo the changes made by the transaction, it becomes necessary
to roll back the transaction.
After the transaction has rolled back completely, it enters into
an aborted state.
6. Terminated State-
Explanation:
Read(A): In T1, no subsequent writes to A, so no new edges
Read(B): In T2, no subsequent writes to B, so no new edges
Read(C): In T3, no subsequent writes to C, so no new edges
Write(B): B is subsequently read by T3, so add edge T2 → T3
Write(C): C is subsequently read by T1, so add edge T3 → T1
Write(A): A is subsequently read by T2, so add edge T1 → T2
Write(A): In T2, no subsequent reads to A, so no new edges
Write(C): In T1, no subsequent reads to C, so no new edges
Write(B): In T3, no subsequent reads to B, so no new edges
Precedence graph for schedule S1:
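The edges listed in the explanation are T2 → T3, T3 → T1 and T1 → T2. A schedule is conflict serializable only if its precedence graph has no cycle, so a cycle check settles the question. The following Python sketch is one way to do that check with a depth-first search; the function name has_cycle is hypothetical.

```python
def has_cycle(edges):
    """Return True if the directed graph given as (from, to) pairs has a cycle."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = {node: UNVISITED for node in graph}

    def visit(node):
        state[node] = IN_PROGRESS
        for nxt in graph[node]:
            if state[nxt] == IN_PROGRESS:        # back edge: we returned to the current path
                return True
            if state[nxt] == UNVISITED and visit(nxt):
                return True
        state[node] = DONE
        return False

    return any(state[node] == UNVISITED and visit(node) for node in graph)

s1_edges = [("T2", "T3"), ("T3", "T1"), ("T1", "T2")]
print(has_cycle(s1_edges))
```

The three edges form the cycle T1 → T2 → T3 → T1, so a schedule with these edges is not conflict serializable.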
o Searching a record
When a record needs to be searched, then the same hash function
retrieves the address of the bucket where the data is stored.
o Insert a Record
When a new record is inserted into the table, then we will
generate an address for a new record based on the hash key and record
is stored in that location.
o Delete a Record
To delete a record, we will first fetch the record which is
supposed to be deleted. Then we will delete the records for that
address in memory.
o Update a Record
To update a record, we will first search it using a hash
function, and then the data record is updated.
If we want to insert a new record into the file but the
address of the data bucket generated by the hash function is not empty,
that is, data already exists at that address, the situation is known in
static hashing as bucket overflow. This is a critical situation in
this method.
To overcome this situation, there are various methods. Some commonly
used methods are as follows:
1. Open Hashing
When a hash function generates an address at which data is
already stored, then the next bucket will be allocated to it. This
mechanism is called as Linear Probing.
For example: suppose R3 is a new record which needs to be inserted, and
the hash function generates the address 112 for R3. But the generated
address is already full. So the system searches for the next available data
bucket, 113, and assigns R3 to it.
2. Closed Hashing
When buckets are full, then a new data bucket is allocated for
the same hash result and is linked after the previous one. This
mechanism is known as Overflow chaining.
Index structure:
Indexes can be created using some database columns.
SEARCH KEY    DATA REFERENCE
o The first column of the index is the search key, which contains
a copy of the primary key or candidate key of the table. The
values of the primary key are stored in sorted order so that the
corresponding data can be accessed easily.
o The second column of the index is the data reference. It
contains a set of pointers holding the address of the disk block
where the value of the particular key can be found.
Indexing Methods
Ordered indices
The indices are usually sorted to make searching faster. The indices
which are sorted are known as ordered indices.
Primary Index
o If the index is created on the basis of the primary key of the
table, then it is known as primary indexing. These primary keys
are unique to each record and contain 1:1 relation between the
records.
o As primary keys are stored in sorted order, the performance of
the searching operation is quite efficient.
o The primary index can be classified into two types: Dense index
and Sparse index.
Dense index
o The dense index contains an index record for every search key
value in the data file. It makes searching faster.
o In this, the number of records in the index table is same as the
number of records in the main table.
o It needs more space to store index record itself. The index
records have the search key and a pointer to the actual record on
the disk.
Sparse index
o In a sparse index, an index record appears for only some of the
items in the data file. Each item points to a block.
o Instead of pointing to each record in the main table,
the index points to records in the main table at intervals.
Clustering Index
o A clustered index can be defined as an ordered data file.
Sometimes the index is created on non-primary key columns which
may not be unique for each record.
o In this case, to identify the record faster, we will group two or
more columns to get the unique value and create index out of
them. This method is called a clustering index.
o The records which have similar characteristics are grouped, and
indexes are created for these group.
These mappings are usually kept in the primary memory so that the address
fetch is faster. The actual data is then looked up in the secondary memory
based on the address obtained from the mapping.
If the mapping size grows then fetching the address itself becomes
slower. In this case, the sparse index will not be efficient. To
overcome this problem, secondary indexing is introduced.
In this method, the huge range for the columns is selected initially
so that the mapping size of the first level becomes small.
Then each range is further divided into smaller ranges. The mapping of
the first level is stored in the primary memory, so that address fetch
is faster. The mapping of the second level and actual data are stored
in the secondary memory (hard disk).
For example:
o If you want to find the record of roll 111 in the diagram, then
it will search the highest entry which is smaller than or equal
to 111 in the first level index. It will get 100 at this level.
o Then in the second-level index, it again finds the highest entry
that is smaller than or equal to 111 and gets 110. Using the
address 110, it goes to the data block and searches each record
until it finds 111.
o This is how a search is performed in this method. Inserting,
updating or deleting is also done in the same manner.
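The search for roll 111 described above can be sketched in Python. This is an illustration under stated assumptions: the index contents below are hypothetical except for the values 100, 110 and 111 taken from the text, and the names floor_entry and find are made up for the example. "Highest entry smaller than or equal to the target" is implemented with the standard bisect module.

```python
from bisect import bisect_right

def floor_entry(sorted_keys, target):
    """Return the highest key that is smaller than or equal to target."""
    pos = bisect_right(sorted_keys, target)
    if pos == 0:
        raise KeyError(target)
    return sorted_keys[pos - 1]

# Hypothetical index contents.
first_level = [100, 200]                                  # kept in primary memory
second_level = {100: [100, 110, 120], 200: [200, 210]}    # kept in secondary memory
data_blocks = {
    100: [100, 103, 107], 110: [110, 111, 115], 120: [120, 125],
    200: [200, 204], 210: [210, 219],
}

def find(roll):
    """Follow the first level, then the second level, then scan the data block."""
    top = floor_entry(first_level, roll)
    block_start = floor_entry(second_level[top], roll)
    found = roll in data_blocks[block_start]   # record-by-record scan of one block
    return top, block_start, found

print(find(111))
```

Only one data block is scanned, because the two index levels narrow the search down to the block that starts at 110.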
Advantages of Indexing
Important pros/ advantage of Indexing are:
Disadvantages of Indexing
Important drawbacks/cons of Indexing are:
The goal of a hashed search is to find the target data in only one
test.
the records in a sequence i.e one after other in the order in which
they are inserted into the tables.
Static Hashing –
In static hashing, when a search-key value is provided, the hash
function always computes the same address. For example, if we want to
generate the address for STUDENT_ID = 104 using the mod (5) hash function,
it always results in the same bucket address, 4. The bucket address does
not change. Hence the number of data buckets in memory for static
hashing remains constant throughout.
Operations –
Insertion – When a new record is inserted into the table, the hash
function h generates a bucket address for the new record based on
its hash key K.
Bucket address = h(K)
Searching – When a record needs to be searched, the same hash
function is used to retrieve the bucket address for the record. For
example, if we want to retrieve the whole record for ID 104, and the
hash function is mod (5) on that ID, the bucket address generated
is 4. We then go directly to address 4 and retrieve the whole
record for ID 104. Here ID acts as the hash key.
Deletion – To delete a record, we first use the hash function to
locate the record that is to be deleted. Then we remove the
record stored at that address in memory.
Updation – The data record that needs to be updated is first
searched using the hash function, and then the data record is updated.
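As a rough illustration of the four operations above, the following
Python sketch uses the mod (5) hash function and the ID 104 from the
text. The record layout (a dict with "id" and "name") is an assumption,
and each bucket is an unbounded list, so bucket overflow is not modelled
here.

```python
NUM_BUCKETS = 5                      # fixed for the lifetime of the table


def h(key):
    """Static hash function: the same key always maps to the same bucket."""
    return key % NUM_BUCKETS


buckets = [[] for _ in range(NUM_BUCKETS)]


def insert(record):
    buckets[h(record["id"])].append(record)


def search(key):
    for record in buckets[h(key)]:   # only one bucket is examined
        if record["id"] == key:
            return record
    return None


def update(key, **fields):
    record = search(key)             # locate first, then modify in place
    if record is not None:
        record.update(fields)
    return record


def delete(key):
    bucket = buckets[h(key)]
    bucket[:] = [r for r in bucket if r["id"] != key]


insert({"id": 104, "name": "student-104"})   # hypothetical record
```

Here h(104) is 4, so the record is stored in and retrieved from bucket 4
without looking at any other bucket.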
Now suppose we want to insert a new record into the file, but the data
bucket address generated by the hash function is not empty, i.e.
data already exists at that address. This is a critical
situation to handle. In static hashing, this situation is
called bucket overflow.
How will we insert data in this case?
There are several methods provided to overcome this situation. Some
commonly used methods are discussed below:
1. Open Hashing –
In the open hashing method, the next available data block is used to
enter the new record instead of overwriting the older one. This
method is also called linear probing.
For example, D3 is a new record that needs to be inserted, and the
hash function generates the address 105. But that bucket is already
full. So the system searches for the next available data bucket, 123,
and assigns D3 to it.
2. Closed hashing –
In the closed hashing method, a new data bucket is allocated with the
same address and is linked after the full data bucket. This method is
also known as overflow chaining.
For example, we have to insert a new record D3 into the tables. The
static hash function generates the data bucket address 105. But
this bucket is too full to store the new data. In this case, a new
data bucket is added at the end of the 105 data bucket and is linked
to it. Then the new record D3 is inserted into the new bucket.
Quadratic probing :
Quadratic probing is very similar to open hashing or linear
probing. The only difference is that in linear probing the gap
between the old and the new bucket is linear (fixed), whereas here a
quadratic function is used to determine the new bucket address.
Double Hashing:
Double Hashing is another method similar to linear probing. Here
the difference is fixed as in linear probing, but this fixed
difference is calculated by using another hash function. That’s
why the name is double hashing.
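A minimal sketch of how the three probing methods above choose the next
bucket. It assumes a table of 11 buckets that hold one record each and
two hypothetical hash functions h1 and h2; none of these values come
from the text. Overflow chaining is not shown.

```python
NUM_BUCKETS = 11                     # assumed table size


def h1(key):
    return key % NUM_BUCKETS


def h2(key):
    # Second hash for double hashing; never 0, so the probe always moves.
    return 1 + key % (NUM_BUCKETS - 1)


def linear_probe(key, i):
    return (h1(key) + i) % NUM_BUCKETS            # fixed step of 1


def quadratic_probe(key, i):
    return (h1(key) + i * i) % NUM_BUCKETS        # step grows quadratically


def double_hash_probe(key, i):
    return (h1(key) + i * h2(key)) % NUM_BUCKETS  # fixed step given by h2


def insert(table, key, probe):
    """Place key in the first free bucket of its probe sequence."""
    for i in range(NUM_BUCKETS):
        slot = probe(key, i)
        if table[slot] is None:
            table[slot] = key
            return slot
    raise OverflowError("no free bucket found for key %r" % key)
```

For example, keys 5 and 16 both hash to bucket 5 under h1. With linear
probing the second key lands in bucket 6, while with double hashing it
lands in bucket 1 because h2(16) is 7.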
Dynamic Hashing –
The drawback of static hashing is that it does not expand or
shrink dynamically as the size of the database grows or shrinks. In
dynamic hashing, data buckets are added or removed dynamically as the
number of records increases or decreases. Dynamic hashing is also
known as extendable hashing.
In dynamic hashing, the hash function is made to produce a large
number of values. For example, suppose there are three data records
D1, D2 and D3, and the hash function generates the addresses 1001,
0101 and 1010 respectively. This method of storing considers only
part of the address, initially only the first bit, to store the data.
So it tries to load the three records at addresses 0 and 1.
But the problem is that no bucket address remains for D3. The
buckets have to grow dynamically to accommodate D3. So the address is
changed to use 2 bits rather than 1 bit, the existing data is updated
to 2-bit addresses, and then the method tries to accommodate D3.
FAILURE CLASSIFICATION
To find out where a problem has occurred, we classify failures
into the following categories:
1. Transaction failure
2. System crash
3. Disk failure
1. Transaction failure
A transaction failure occurs when a transaction fails to execute or
reaches a point from which it cannot go any further. If only a few
transactions or processes are affected, it is called a transaction
failure.
Reasons for a transaction failure could be -
1. Logical errors: If a transaction cannot complete due to some
code error or an internal error condition, then a logical
error occurs.
2. System error: It occurs when the DBMS itself terminates an
active transaction because the database system is not able to
execute it. For example, the system aborts an active
transaction in case of deadlock or resource unavailability.
2. System Crash
o A system crash can occur due to power failure or other hardware
or software failure. Example: operating system error.
Fail-stop assumption: in a system crash, non-volatile storage
is assumed not to be corrupted.
3. Disk Failure
o It occurs when hard-disk drives or other storage drives fail.
This was a common problem in the early days of technology
evolution.
o Disk failure occurs due to the formation of bad sectors, a disk
head crash, unreachability of the disk, or any other failure
that destroys all or part of disk storage.
DYNAMIC HASHING
o The Dynamic Hashing method is used to overcome the problems of
static hashing like bucket overflow.
o In this method, data buckets grow or shrink as the number of
records increases or decreases. This method is also known as the
extendable hashing method.
o This method makes hashing dynamic, i.e., it allows insertion or
deletion without resulting in poor performance.
For example:
The last two bits of the hash addresses of keys 2 and 4 are 00, so
they go into bucket B0.
The last two bits of the hash addresses of keys 5 and 6 are 01, so
they go into bucket B1.
The last two bits of the hash addresses of keys 1 and 3 are 10, so
they go into bucket B2.
The last two bits of the hash address of key 7 are 11, so it goes
into bucket B3.
Insert key 9 with hash address 10001 into the above structure:
o Since key 9 has hash address 10001, whose last two bits are 01, it
must go into bucket B1. But bucket B1 is full, so it will get split.
o The split separates 5 and 9 from 6: the last three bits of 5 and
9 are 001, so they go into bucket B1, and the last three bits of 6
are 101, so it goes into bucket B5.
o Keys 2 and 4 are still in B0. The records in B0 are pointed to by
the 000 and 100 entries because the last two bits of both entries
are 00.
o Keys 1 and 3 are still in B2. The records in B2 are pointed to by
the 010 and 110 entries because the last two bits of both entries
are 10.
o Key 7 is still in B3. The record in B3 is pointed to by the 111
and 011 entries because the last two bits of both entries are 11.
o In this method, if the data size increases then the number of
buckets also increases. The addresses of the data are maintained in
the bucket address table, because the data addresses keep changing
as buckets grow and shrink. If there is a huge increase in data,
maintaining the bucket address table becomes tedious.
o Bucket overflow can still occur in this case, but it typically
takes longer to reach that situation than in static hashing.
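The split on inserting key 9 can be sketched in Python as below. The
5-bit hash addresses are assumptions chosen only to agree with the
trailing bits quoted in the example (only the address 10001 for key 9 is
given in the text), and a bucket capacity of 2 is assumed because B1 is
described as full with two keys. The sketch would recurse without end if
more keys than fit in one bucket shared a full address.

```python
BUCKET_CAPACITY = 2                  # assumption: B1 is full with two keys

# Hypothetical hash addresses, consistent with the bits quoted above.
ADDRESSES = {1: "11010", 2: "00000", 3: "11110", 4: "00000",
             5: "01001", 6: "10101", 7: "10111", 9: "10001"}


class Bucket:
    def __init__(self, local_depth):
        self.local_depth = local_depth
        self.keys = []


class ExtendableHash:
    def __init__(self, global_depth=2):
        self.global_depth = global_depth
        self.directory = [Bucket(global_depth) for _ in range(2 ** global_depth)]

    def _slot(self, key):
        # The directory is indexed by the last global_depth bits.
        return int(ADDRESSES[key][-self.global_depth:], 2)

    def insert(self, key):
        bucket = self.directory[self._slot(key)]
        if len(bucket.keys) < BUCKET_CAPACITY:
            bucket.keys.append(key)
            return
        if bucket.local_depth == self.global_depth:
            # Double the directory: entries i and i + 2**d share a bucket.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bucket.local_depth += 1
        sibling = Bucket(bucket.local_depth)
        high_bit = 1 << (bucket.local_depth - 1)
        for i, b in enumerate(self.directory):
            if b is bucket and i & high_bit:
                self.directory[i] = sibling
        pending, bucket.keys = bucket.keys + [key], []
        for k in pending:                # redistribute using one more bit
            self.insert(k)


table = ExtendableHash()
for key in (1, 2, 3, 4, 5, 6, 7):
    table.insert(key)
table.insert(9)                          # forces bucket B1 to split
```

After the last insert, directory entry 001 holds keys 5 and 9, entry 101
holds key 6, and the pairs 000/100, 010/110 and 011/111 still share one
bucket each.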
DBMS SERIALIZABILITY
When multiple transactions run concurrently, there is
a possibility that the database may be left in an inconsistent state.
Serializability is a concept that helps us check
which schedules are serializable. A serializable schedule is one
that always leaves the database in a consistent state.
A serial schedule doesn’t allow concurrency: only one transaction
executes at a time, and the next starts when the running
transaction has finished.
Types of Serializability
There are two types of Serializability.
1. Conflict Serializability
2. View Serializability
Conflicting Operations
Two operations are conflicting if all of the following conditions hold:
1. They belong to separate transactions.
2. They operate on the same data item.
3. At least one of them is a write operation.
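The three conditions can be expressed as a small Python predicate. The
Op tuple and its field names are illustrative choices, not notation from
the text.

```python
from typing import NamedTuple


class Op(NamedTuple):
    txn: str      # transaction name, e.g. "T1"
    action: str   # "R" for read, "W" for write
    item: str     # data item, e.g. "A"


def conflicts(a, b):
    """True if the two operations conflict under the three conditions."""
    return (
        a.txn != b.txn                        # 1. separate transactions
        and a.item == b.item                  # 2. same data item
        and "W" in (a.action, b.action)       # 3. at least one write
    )
```

Two reads of the same item never conflict, and neither do operations on
different items, so such pairs can be swapped freely.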
Example:
Swapping is possible only if S1 and S2 are logically equal.
Here, S1 = S2. That means it is non-conflict.
Conflict Equivalent
Two schedules are conflict equivalent if one can be transformed into
the other by swapping non-conflicting operations. In the given
example, S2 is conflict equivalent to S1 (S1 can be converted to S2
by swapping non-conflicting operations).
Two schedules are said to be conflict equivalent if and only if:
1. They contain the same set of transactions.
2. Each pair of conflicting operations is ordered in the same way.
Example:
T1            T2
Read(A)
Write(A)
Read(B)
Write(B)
              Read(A)
              Write(A)
              Read(B)
              Write(B)
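A sketch of the conflict-equivalence test in Python, checking the two
conditions directly instead of performing swaps. Operations are written
as (transaction, action, item) tuples, which is an illustrative
encoding, and the sketch assumes that no transaction repeats the same
operation on the same item. The interleaved schedule below is a
hypothetical one built from the same operations of T1 and T2.

```python
def conflicts(a, b):
    return a[0] != b[0] and a[2] == b[2] and "W" in (a[1], b[1])


def conflict_pairs(schedule):
    """All ordered pairs (earlier, later) of conflicting operations."""
    pairs = set()
    for i, a in enumerate(schedule):
        for b in schedule[i + 1:]:
            if conflicts(a, b):
                pairs.add((a, b))
    return pairs


def conflict_equivalent(s1, s2):
    same_operations = sorted(s1) == sorted(s2)
    return same_operations and conflict_pairs(s1) == conflict_pairs(s2)


def ops(txn, item):
    return [(txn, "R", item), (txn, "W", item)]


serial_t1_t2 = ops("T1", "A") + ops("T1", "B") + ops("T2", "A") + ops("T2", "B")
serial_t2_t1 = ops("T2", "A") + ops("T2", "B") + ops("T1", "A") + ops("T1", "B")
# Hypothetical interleaving: T1 on A, T2 on A, T1 on B, T2 on B.
interleaved = ops("T1", "A") + ops("T2", "A") + ops("T1", "B") + ops("T2", "B")
```

The interleaved schedule orders every conflicting pair the same way as
the serial schedule T1 then T2, so the two are conflict equivalent. It
is not conflict equivalent to T2 then T1.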
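View Equivalent
Two schedules S1 and S2 are said to be view equivalent if they satisfy
the following conditions:
1. Initial Read
The initial read of both schedules must be the same. Suppose there are
two schedules S1 and S2. If in schedule S1 a transaction T1 reads the
initial value of data item A, then in S2 transaction T1 should also
read the initial value of A.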
2. Updated Read
In schedule S1, if Ti reads A which was updated by Tj, then in S2
also Ti should read A which was updated by Tj.
The two schedules above are not view equivalent because, in S1, T3
reads A updated by T2, and in S2, T3 reads A updated by T1.
3. Final Write
The final write must be the same in both schedules. If in schedule
S1 a transaction T1 performs the last update of A, then in S2 the
final write operation on A should also be done by T1.
Schedule S
With 3 transactions, the total number of possible serial schedules
= 3! = 6
S1 = <T1 T2 T3> S2 = <T1 T3 T2> S3 = <T2 T3 T1> S4 = <T2 T1 T3>
S5 = <T3 T1 T2> S6 = <T3 T2 T1>
Schedule S1
Step 1: Updated Read
In both schedules S and S1, there is no read other than the initial
read, so we don't need to check that condition.
Step 2: Initial Read
The initial read operation in S is done by T1 and in S1, it is also
done by T1.
Step 3: Final Write
The final write operation in S is done by T3 and in S1, it is also
done by T3. So, S and S1 are view equivalent.
The first schedule S1 satisfies all three conditions, so we don't need
to check the other schedules.
Hence, view equivalent serial schedule is:
T1 → T2 → T3
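The three view-equivalence conditions can be checked mechanically, as in
the Python sketch below. Operations are (transaction, action, item)
tuples, an illustrative encoding. The schedule S used here is a
hypothetical one that merely matches the description above (one initial
read by T1, no other reads, final write by T3); the original schedule
table is not reproduced.

```python
def view_profile(schedule):
    """Return (reads_from, final_writer) for a schedule.

    reads_from holds (reader, item, writer) triples, where writer is
    None when the read sees the initial value of the item.
    """
    last_writer = {}
    reads_from = set()
    for txn, action, item in schedule:
        if action == "R":
            reads_from.add((txn, item, last_writer.get(item)))
        else:
            last_writer[item] = txn
    return reads_from, last_writer


def view_equivalent(s1, s2):
    # Same operations, same initial and updated reads, same final writes.
    return sorted(s1) == sorted(s2) and view_profile(s1) == view_profile(s2)


# Hypothetical schedule S (assumption).
S = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A"), ("T3", "W", "A")]
# Serial schedules built from the same operations.
S1 = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "W", "A"), ("T3", "W", "A")]
S4 = [("T2", "W", "A"), ("T1", "R", "A"), ("T1", "W", "A"), ("T3", "W", "A")]
```

Under these assumptions S is view equivalent to the serial order
<T1 T2 T3> but not to <T2 T1 T3>, where T1 would read the value written
by T2 instead of the initial value.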
Concurrency Control
Concurrency means that operations are done at the same time.
Several problems can occur when concurrent transactions are executed
in an uncontrolled manner. Following are the three problems in
concurrency control.
1. Lost updates
2. Dirty read (uncommitted data)
3. Unrepeatable read
1. Lost Update
o When two transactions that access the same database items
interleave their operations in a way that makes the value of some
database item incorrect, the lost update problem occurs.
o If two transactions T1 and T2 read a record and then update it,
the effect of the first update will be overwritten by the second
update.
Example:
Here,
o At time t2, Transaction-X reads A's value.
o At time t3, Transaction-Y reads A's value.
o At time t4, Transaction-X writes A's value on the basis of the
value seen at time t2.
o At time t5, Transaction-Y writes A's value on the basis of the
value seen at time t3.
o So at time t5, the update of Transaction-X is lost because
Transaction-Y overwrites it without looking at its current value.
o This type of problem is known as the Lost Update Problem, as the
update made by one transaction is lost here.
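The timeline above can be replayed with plain variables. In this Python
sketch the starting value of A (300) and the two updates (subtract 50
for Transaction-X, add 100 for Transaction-Y) are assumed values used
only for illustration.

```python
def interleaved(initial=300):
    """Replay t2..t5: both transactions read before either writes."""
    a = initial
    x_seen = a              # t2: Transaction-X reads A
    y_seen = a              # t3: Transaction-Y reads A
    a = x_seen - 50         # t4: Transaction-X writes A
    a = y_seen + 100        # t5: Transaction-Y writes A, overwriting X
    return a


def serial(initial=300):
    """Run Transaction-X to completion, then Transaction-Y."""
    a = initial
    a = a - 50              # Transaction-X
    a = a + 100             # Transaction-Y sees X's result
    return a
```

The interleaved run ends with 400 and the serial run with 350. The
difference of 50 is exactly the update of Transaction-X that was lost.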
2. Dirty Read
o The dirty read occurs in the case when one transaction updates an
item of the database, and then the transaction fails for some
reason. The updated database item is accessed by another
transaction before it is changed back to the original value.
o A transaction T1 updates a record which is read by T2. If T1
aborts then T2 now has values which have never formed part of the
stable database.
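A small Python simulation of the dirty read described above. It assumes
no isolation at all, so T2 can see T1's uncommitted working copy. The
starting value 300 and the update of +50 are illustrative.

```python
def dirty_read_demo(initial=300):
    committed = {"A": initial}       # the stable database
    working = dict(committed)        # T1's uncommitted changes

    working["A"] += 50               # T1 updates A but has not committed
    t2_seen = working["A"]           # T2 reads the uncommitted value
    working = dict(committed)        # T1 aborts: its update is rolled back

    return t2_seen, working["A"]
```

T2 ends up holding 350 even though the database only ever contained 300
in a committed state.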
Example:
Internal node
o An internal node of the B+ tree, other than the root node,
contains at least n/2 pointers.
o At most, an internal node of the tree contains n pointers.
Leaf node
o A leaf node of the B+ tree contains at least n/2 record
pointers and n/2 key values.
o At most, a leaf node contains n record pointers and n key values.
o Every leaf node of the B+ tree contains one block pointer P to
point to next leaf node.
Searching a record in B+ Tree
Suppose we have to search for 55 in the B+ tree structure below.
First, we look in the intermediary node, which directs us to the leaf
node that can contain the record for 55.
In the intermediary node, we find the branch between the 50 and 75
entries. We are then redirected to the third leaf node, where the
DBMS performs a sequential search to find 55.
B+ Tree Insertion
Suppose we want to insert a record 60 in the structure below. It
belongs in the 3rd leaf node, after 55. The tree is balanced and this
leaf node is already full, so we cannot insert 60 there.
In this case, we have to split the leaf node so that the record can
be inserted into the tree without affecting the fill factor, balance
and order.
With 60 added, the 3rd leaf node would hold the values (50, 55, 60,
65, 70), and its current key in the parent node is 50. We split the
leaf node in the middle so that the balance of the tree is not
altered. So we can group (50, 55) and (60, 65, 70) into 2 leaf nodes.
If these two are to be leaf nodes, the intermediate node cannot branch
from 50 alone. It should have 60 added to it, and then we can have a
pointer to the new leaf node.
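The leaf split in this example can be sketched in Python as follows. It
assumes a leaf holds at most 4 keys, which is what makes the leaf
(50, 55, 65, 70) full, and it covers only the leaf level. Updating the
parent and splitting it further is left out.

```python
from bisect import insort


def insert_into_leaf(leaf_keys, key, max_keys=4):
    """Insert key into a sorted leaf.

    Returns (left, right, separator). right and separator are None when
    the leaf did not need to split. On a split, separator is the key
    that must be added to the parent node.
    """
    keys = list(leaf_keys)
    insort(keys, key)                    # keep the leaf in sorted order
    if len(keys) <= max_keys:
        return keys, None, None
    mid = len(keys) // 2                 # split in the middle
    left, right = keys[:mid], keys[mid:]
    return left, right, right[0]         # first key of the new leaf goes up
```

Calling insert_into_leaf([50, 55, 65, 70], 60) yields the leaves
[50, 55] and [60, 65, 70] with 60 as the key to add to the intermediate
node.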
Properties:
Internal nodes point to other nodes in the tree.
Leaf nodes point to data in the database using data pointers. Leaf
nodes also contain an additional pointer, called the sibling pointer,
which is used to improve the efficiency of certain types of search.
All the nodes in a B+-Tree must be at least half full except the root
node which may contain a minimum of two entries. The algorithms that
allow data to be inserted into and deleted from a B+-Tree guarantee
that each node in the tree will be at least half full.
Searching for a value in the B+-Tree always starts at the root node
and moves downwards until it reaches a leaf node.
Both internal and leaf nodes contain key values that are used to guide
the search for entries in the index.
The B+ Tree is called a balanced tree because every path from the root
node to a leaf node is the same length. A balanced tree means that all
searches for individual values require the same number of nodes to be
read from the disc.
The figure depicts the basic structure of B+ Tree.
Algorithm
Basic operations associated with B+ Tree:
Searching a node in a B+ Tree
Perform a binary search on the records in the current node.
If a record with the search key is found, then return that
record.
If the current node is a leaf node and the key is not found, then
report an unsuccessful search.
Otherwise, follow the proper branch and repeat the process.
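A compact Python sketch of this search. It follows the B+ tree
convention used earlier in this section, where internal keys only guide
the descent and records are found in the leaves. The Node class and the
keys in the sample tree are illustrative assumptions arranged so that 55
sits in the third leaf, between the 50 and 75 entries.

```python
from bisect import bisect_left, bisect_right


class Node:
    def __init__(self, keys, children=None, records=None):
        self.keys = keys
        self.children = children     # set for internal nodes only
        self.records = records       # set for leaf nodes only


def search(node, key):
    """Return the record stored under key, or None if it is absent."""
    while node.children is not None:
        # Binary search inside the node picks the branch to follow.
        node = node.children[bisect_right(node.keys, key)]
    pos = bisect_left(node.keys, key)
    if pos < len(node.keys) and node.keys[pos] == key:
        return node.records[pos]
    return None                      # reached a leaf without finding key


def leaf(keys):
    return Node(keys, records=["record-%d" % k for k in keys])


# Hypothetical tree with one internal (root) node and four leaves.
root = Node([25, 50, 75], children=[
    leaf([5, 10, 15, 20]),
    leaf([25, 30, 35, 45]),
    leaf([50, 55, 65, 70]),
    leaf([75, 80, 85, 90]),
])
```

Every lookup reads the same number of nodes (two in this small tree),
which is what the balance property guarantees.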
INSERT
In search trees such as the binary search tree, AVL tree, etc., every
node contains only one value (key) and a maximum of two children. A
B-Tree is a search tree in which a node contains more than one value
(key) and more than two children.
Here, the number of keys in a node and the number of children for a
node depend on the order of the B-Tree. Every B-Tree has an order.
B-Tree of Order m has the following properties...
Property #1 - All leaf nodes must be at same level.
Property #2 - All nodes except root must have at least ⌈m/2⌉ - 1
keys and a maximum of m - 1 keys.
Property #3 - All non leaf nodes except root (i.e. all internal
nodes) must have at least ⌈m/2⌉ children.
Property #4 - If the root node is a non leaf node, then it must
have at least 2 children.
Property #5 - A non leaf node with n-1 keys must have n number of
children.
Property #6 - All the key values in a node must be in Ascending
Order.
For example, B-Tree of Order 4 contains a maximum of 3 key values in a
node and maximum of 4 children for a node.
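The limits implied by these properties can be computed for any order m.
The helper below is only an illustration and applies to nodes other
than the root, with the minimum taken as the ceiling of m/2.

```python
from math import ceil


def btree_limits(m):
    """Key and child limits for a non-root node of a B-Tree of order m."""
    return {
        "max_keys": m - 1,
        "max_children": m,
        "min_keys": ceil(m / 2) - 1,
        "min_children": ceil(m / 2),     # internal nodes only
    }
```

For order 4 this gives at most 3 keys and 4 children, matching the
example above, and at least 1 key and 2 children.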
Example
Operations on a B-Tree
The following operations are performed on a B-Tree...
1. Search
2. Insertion
3. Deletion
Atomicity
By this, we mean that either the entire transaction takes place at once
or doesn’t happen at all. There is no midway i.e. transactions do not
occur partially. Each transaction is considered as one unit and either
runs to completion or is not executed at all. It involves the following
two operations.
Consistency
This means that integrity constraints must be maintained so that
the database is consistent before and after the transaction. It refers
to the correctness of a database. Referring to the example above,
the total amount before and after the transaction must be maintained.
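Atomicity and consistency can be observed with Python's built-in
sqlite3 module. This is a minimal sketch and not a prescribed
implementation: the account table, the names X and Y, the starting
balance of 500 each and the non-negative balance rule are assumptions
made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER CHECK (balance >= 0))"    # integrity constraint
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("X", 500), ("Y", 500)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one all-or-nothing transaction."""
    try:
        with conn:      # commits on success, rolls back on an exception
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


ok = transfer(conn, "X", "Y", 100)         # both updates are applied
failed = transfer(conn, "X", "Y", 1000)    # debit breaks the CHECK rule
```

The second transfer credits Y before the debit of X fails, yet the
rollback undoes the credit as well. The balances stay at X = 400 and
Y = 600, and the total remains 1000 throughout.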
Isolation
This property ensures that multiple transactions can occur concurrently
without leading to inconsistency of the database state. Transactions
occur independently without interference. Changes occurring in a
particular transaction will not be visible to any other transaction
until that particular change in that transaction is written to memory or
has been committed. This property ensures that executing transactions
concurrently results in a state that is equivalent to a state achieved
if they were executed serially in some order.
Let X= 500, Y = 500.
Consider two transactions T and T’’.
Suppose T has been executed till Read (Y) and then T’’ starts. As a
result, interleaving of operations takes place due to which T’’ reads
correct value of X but incorrect value of Y and sum computed by
Durability:
This property ensures that once the transaction has completed execution,
the updates and modifications to the database are stored in and written
to disk and they persist even if a system failure occurs. These updates
now become permanent and are stored in non-volatile memory. The effects
of the transaction, thus, are never lost.
The ACID properties, in totality, provide a mechanism to ensure the
correctness and consistency of a database, such that each
transaction is a group of operations that acts as a single unit,
produces consistent results, acts in isolation from other operations,
and makes updates that are durably stored.