
Second Unit 3

Best

Uploaded by

humendrayede123
Copyright
© © All Rights Reserved
What is Relational Model?

Relational Model (RM) represents the database as a collection of relations. A
relation is nothing but a table of values. Every row in the table represents a
collection of related data values. These rows denote a real-world entity or
relationship.
The table name and column names help to interpret the meaning of the values in
each row. In the relational model, data are stored as tables. However, the
physical storage of the data is independent of the way the data are logically
organized.

The relational model is the most widely used data model. In this model, data is
maintained in two-dimensional tables, and all information is stored in rows and
columns. Because the table is the basic structure of the model, tables are also
called relations. Example: In this example, we have an Employee table.

Features of Relational Model

• Tuples: Each row in the table is called a tuple. A row contains all the
information about one instance of the object. In the above example, each
row holds all the information about one specific individual; the first
row, for instance, holds the information about John.
• Attribute or field: Attributes are the properties that define the table or
relation. The values of an attribute should come from the same domain. In
the above example, the employee has attributes such as Salary, Mobile_no,
etc.

Advantages of Relational Model

• Simple: This model is simpler than the network and hierarchical models.
• Scalable: This model scales easily, as we can add as many rows and
columns as we want.
• Structural Independence: We can change the database structure without
changing the way the data is accessed. When changes to the database
structure do not affect the DBMS's ability to access the data, structural
independence has been achieved.

Disadvantages of Relational Model

• Hardware Overheads: To hide its complexities and make things easier for
the user, this model requires more powerful computers and data storage
devices.
• Bad Design: The relational model is very easy to design and use, so users
don't need to know how the data is stored in order to access it. This ease
of design can lead to a poorly designed database that slows down as it
grows.
These disadvantages are minor compared to the advantages of the relational
model, and they can be avoided with proper implementation and organisation.

Relationship

A relationship describes the association between entities. In an ER diagram, a
diamond (rhombus) is used to represent a relationship.

Types of relationship are as follows:

a. One-to-One Relationship

When one instance of an entity is associated with at most one instance of the
other entity, the relationship is known as a one-to-one relationship.

For example, a female can marry one male, and a male can marry one female.

b. One-to-many relationship

When one instance of the entity on the left can be associated with more than
one instance of the entity on the right, but not the other way round, the
relationship is known as a one-to-many relationship.

For example, a scientist can invent many inventions, but each invention is made
by one specific scientist.

c. Many-to-one relationship

When more than one instance of the entity on the left can be associated with
the same single instance of the entity on the right, the relationship is known
as a many-to-one relationship.

For example, a student enrolls in only one course, but a course can have many
students.

d. Many-to-many relationship

When more than one instance of the entity on the left can be associated with
more than one instance of the entity on the right, the relationship is known as
a many-to-many relationship.

For example, an employee can be assigned to many projects, and a project can
have many employees.
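
The four relationship types above can be told apart mechanically. The following is a minimal Python sketch, not a definitive method: it assumes the relationship is given as a list of (left, right) pairs, and the function name `classify` and the sample pairs are hypothetical. Note that it only describes the sample data it is given; the intended cardinality is a design decision.

```python
from collections import defaultdict

def classify(pairs):
    """Classify a binary relationship given as (left, right) pairs."""
    rights_per_left = defaultdict(set)
    lefts_per_right = defaultdict(set)
    for left, right in pairs:
        rights_per_left[left].add(right)
        lefts_per_right[right].add(left)
    # Does any left instance relate to several right instances, and vice versa?
    left_has_many = any(len(r) > 1 for r in rights_per_left.values())
    right_has_many = any(len(l) > 1 for l in lefts_per_right.values())
    if left_has_many and right_has_many:
        return "many-to-many"
    if left_has_many:
        return "one-to-many"
    if right_has_many:
        return "many-to-one"
    return "one-to-one"

# Hypothetical sample data mirroring the examples in the text.
invents = [("Scientist A", "Invention 1"), ("Scientist A", "Invention 2")]
enrolls = [("Student 1", "Course X"), ("Student 2", "Course X")]
assigned = [("Emp 1", "Project P"), ("Emp 1", "Project Q"), ("Emp 2", "Project P")]
married = [("Female 1", "Male 1"), ("Female 2", "Male 2")]

for name, pairs in [("invents", invents), ("enrolls", enrolls),
                    ("assigned", assigned), ("married", married)]:
    print(name, "->", classify(pairs))
```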

Relational Model Concepts in DBMS

1. Attribute: Each column in a table. Attributes are the properties which
define a relation, e.g., Student_Rollno, NAME, etc.
2. Tables – In the relational model, relations are saved in table format. A
table is stored along with its entities. A table has two properties, rows
and columns. Rows represent records and columns represent attributes.
3. Tuple – It is nothing but a single row of a table, which contains a single
record.
4. Relation Schema: A relation schema represents the name of the relation
with its attributes.
5. Degree: The total number of attributes in the relation is called the
degree of the relation.
6. Cardinality: The total number of rows present in the table.
7. Column: The column represents the set of values for a specific attribute.
8. Relation instance – A relation instance is a finite set of tuples in the
RDBMS system. Relation instances never have duplicate tuples.
9. Relation key – One or more attributes whose values uniquely identify
each row are called the relation key.
10. Attribute domain – Every attribute has a pre-defined set of permitted
values, which is known as its attribute domain.
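
As an illustration of these terms, here is a small Python sketch that models a relation instance as a set of tuples. The schema reuses the attribute names Student_Rollno and NAME from the list above; the relation name and the data values are hypothetical.

```python
# Relation schema: the relation name together with its attributes.
relation_name = "Student"                 # hypothetical name
attributes = ("Student_Rollno", "NAME")

# Relation instance: a finite set of tuples. Using a Python set means a
# duplicate tuple simply collapses into one, as the definition requires.
instance = {
    (1, "Asha"),   # hypothetical values
    (2, "Ravi"),
    (2, "Ravi"),   # duplicate, stored only once
}

degree = len(attributes)      # number of attributes (columns)
cardinality = len(instance)   # number of tuples (rows)

print(f"{relation_name}{attributes}: degree={degree}, cardinality={cardinality}")
```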

Operations in Relational Model


Four basic operations are performed on the relational database model: insert,
update (modify), delete, and select.

• Insert is used to insert data into the relation.
• Delete is used to delete tuples from the table.
• Modify allows you to change the values of some attributes in existing
tuples.
• Select allows you to choose a specific range of data.

Whenever one of these operations is applied, the integrity constraints
specified on the relational database schema must never be violated.

Insert Operation
The insert operation gives the attribute values for a new tuple that should be
inserted into a relation.

Update Operation
In the example relation, the tuple with CustomerName = 'Apple' is updated from
Inactive to Active.

Delete Operation
To specify a deletion, a condition on the attributes of the relation selects
the tuple to be deleted. In the example, the tuple with CustomerName = 'Apple'
is deleted from the table.
The delete operation could violate referential integrity if the deleted tuple
is referenced by foreign keys from other tuples in the same database.

Select Operation
In the example, the tuple with CustomerName = 'Amazon' is selected.
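
The four operations can be tried out with Python's built-in `sqlite3` module. This is only a sketch under assumptions: the original example tables are not reproduced in the text, so the table layout, the `CustomerID` and `Status` columns, and the "Google" row are hypothetical; only the Apple and Amazon customers come from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer ("
    " CustomerID INTEGER PRIMARY KEY,"
    " CustomerName TEXT,"
    " Status TEXT)"
)

# Insert: supply attribute values for new tuples.
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?)",
    [(1, "Google", "Active"), (2, "Amazon", "Active"), (3, "Apple", "Inactive")],
)

# Update (modify): change an attribute value in an existing tuple.
conn.execute("UPDATE Customer SET Status = 'Active' WHERE CustomerName = 'Apple'")
apple_status = conn.execute(
    "SELECT Status FROM Customer WHERE CustomerName = 'Apple'"
).fetchone()[0]

# Select: choose the tuples that satisfy a condition.
selected = conn.execute(
    "SELECT CustomerName, Status FROM Customer WHERE CustomerName = 'Amazon'"
).fetchall()

# Delete: a condition on the attributes picks the tuple to remove.
conn.execute("DELETE FROM Customer WHERE CustomerName = 'Apple'")
remaining = [
    row[0]
    for row in conn.execute("SELECT CustomerName FROM Customer ORDER BY CustomerID")
]

print(apple_status, selected, remaining)
```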

Best Practices for creating a Relational Model

• Data needs to be represented as a collection of relations
• Each relation should be depicted clearly in the table
• Rows should contain data about instances of an entity
• Columns must contain data about attributes of the entity
• Cells of the table should hold a single value
• Each column should be given a unique name
• No two rows can be identical
• The values of an attribute should be from the same domain

Advantages of Relational Database Model

• Simplicity: A relational data model in DBMS is simpler than the
hierarchical and network models.
• Structural Independence: The relational database is only concerned
with data and not with a structure. This can improve the performance of
the model.
• Easy to use: The relational model in DBMS is easy to use, as tables
consisting of rows and columns are quite natural and simple to understand.
• Query capability: It makes it possible for a high-level query language
like SQL to avoid complex database navigation.
• Data independence: The structure of a relational database can be
changed without having to change any application.
• Scalable: A database can be enlarged, in both the number of records
(rows) and the number of fields, to enhance its usability.

Disadvantages of Relational Model

• A few relational databases impose limits on field lengths which can’t be
exceeded.
• Relational databases can sometimes become complex as the amount of
data grows and the relations between pieces of data become more
complicated.
• Complex relational database systems may lead to isolated databases
where the information cannot be shared from one system to another.

What are Attributes?

The Entity-Relationship (ER) model is used in a Database Management System
(DBMS) to describe the data elements of a system and the relationships between
them.

The ER model consists of entities and attributes. An entity can be an object,
person, or place. In the ER model, entities are represented as rectangles. For
example, in an organization, employees, departments, and executives can each
be taken as an entity.

Attributes give us additional information about the entity; they describe the
properties of an entity. In the ER model, attributes are represented as
ellipses. For example, if Employee is an entity, then employee id, contact
number, name, date of joining, etc. can be the attributes of an employee.

Types of attributes

The different types of attributes are as follows −


Composite attribute
It can be divided into smaller sub-parts, each of which can form an independent
attribute.
For example −
Name
FirstName MiddleName LastName

Simple or Atomic attribute
Attributes that cannot be further subdivided are called atomic attributes.
For example −
Phone number
PIN code

Single valued Attribute
An attribute that has a single value for a particular item is called a single
valued attribute.
For example: Room Number

Multi-valued Attribute
An attribute that has a set of values for a single entity is called a
multi-valued attribute.
For example −
e-mail
Tel.No
Hobbies

Derived and stored Attributes
When the value of one attribute is derived from another attribute, it is called
a derived attribute; the attribute it is derived from is the stored attribute.
For example: Age can be derived from date of birth, where,
• Age is the derived attribute.
• DOB is the stored attribute.

Complex Attribute
Nesting of composite and multi-valued attributes forms a complex attribute.
For example, if a person has more than one house and each house has more than
one phone, then the phone attribute is represented as a complex attribute.
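
The attribute types above map naturally onto a small data structure. The sketch below uses Python dataclasses; the class and field names are hypothetical, chosen only to mirror the examples (Name, PIN code, e-mail, DOB, Age, phones per house).

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Name:
    """Composite attribute: made of smaller independent parts."""
    first: str
    middle: str
    last: str

@dataclass
class Person:
    name: Name                                            # composite
    pin_code: str                                         # simple / atomic
    dob: date                                             # stored attribute
    emails: list = field(default_factory=list)            # multi-valued
    phones_by_house: dict = field(default_factory=dict)   # complex (nested)

    def age(self, today: date) -> int:
        """Derived attribute: computed from the stored DOB, never stored."""
        before_birthday = (today.month, today.day) < (self.dob.month, self.dob.day)
        return today.year - self.dob.year - int(before_birthday)

# Hypothetical person used only for illustration.
p = Person(
    name=Name("Asha", "K", "Rao"),
    pin_code="440001",
    dob=date(2000, 6, 15),
    emails=["asha@example.com", "a.rao@example.org"],
    phones_by_house={"House 1": ["111", "222"], "House 2": ["333"]},
)
print(p.age(date(2024, 6, 14)), p.age(date(2024, 6, 15)))
```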
----------------------------------------------------------------------------------

Types of Relationship in Database Table

A relational database stores different types of data sets in tables made up of
records (rows) and columns. Well-defined relationships between the tables make
it easy to store and retrieve related data. Examples of relational databases
are Microsoft SQL Server, Oracle Database, MySQL, etc.

Some important properties of a relational database are:

o It is based on the relational model (data in tables).
o Each row in a table has a unique id, or key.
o The columns of a table hold the attributes of the data.

Employee table (Relation / Table Name)

EmpID    EmpName        EmpAge  CountryName
Emp 101  Andrew Mathew  24      USA
Emp 102  Marcus dugles  27      England
Emp 103  Engidi Nathem  28      France
Emp 104  Jason Quilt    21      Japan
Emp 108  Robert         29      Italy

The following types of relationship can exist between database tables:

1. One to One relationship
2. One to many or many to one relationship
3. Many to many relationships

One to One Relationship (1:1): It is a relationship between two tables in
which a single row of the first table can be related to one and only one record
of the second table. Similarly, a row of the second table can be related to
only one row of the first table.

One to Many Relationship: It is a relationship between two tables in which any
single row of the first table can be related to one or more rows of the second
table, but each row of the second table can relate to only one row in the first
table. Seen from the second table, it is a many to one relationship.

Many to Many Relationship: It is a relationship between two tables in which
each record of the first table can relate to any number of records (or no
records) in the second table. Similarly, each record of the second table can
relate to more than one record of the first table. It is also represented as an
N:N relationship.

For example, many people are involved in each project, and every person can be
involved in more than one project.

Difference between a database and a relational database

1. Relational database: It can store and arrange the data in tabular form, as
rows and columns.
Database: It is used to store the data as files.
2. Relational database: The data normalization feature is available.
Database: It does not have normalization.
3. Relational database: It supports a distributed database.
Database: It does not support the distributed database.
4. Relational database: The values are stored as tables that require primary
keys to identify the data in the database.
Database: Generally, it stores the data in hierarchical or navigational form.
5. Relational database: It is designed to handle a huge collection of data and
multiple users.
Database: It is designed to handle a small collection of data files that
requires a single user.
6. Relational database: It uses integrity constraint rules that are defined in
ACID properties.
Database: It does not follow any integrity constraint rules, nor does it
utilize any security to protect the data from manipulation.
7. Relational database: Stored data can be accessed because there is a
relationship between the tables and their attributes.
Database: There is no relationship between the data values or tables stored in
files.

Advantages of relational databases


1. Simple Model: The relational database is the simplest model; it does not
require any complex structure or query to process the data. Its architecture
is simple compared to a hierarchical database structure, and simple SQL
queries are enough to design and access a relational database.
2. Data Accuracy: Relational databases can have multiple tables related to
each other through primary and foreign keys, so there are fewer chances of
duplicating data fields. Therefore, the accuracy of data in relational
database tables is greater than in other database systems.
3. Easy to access Data: Data can be accessed easily from a relational
database, and no fixed pattern or path has to be followed to access it.
Any data can be read from a database table using SQL queries. Related
tables can be combined through relational queries, such as joins with
conditions, to get the required data.
4. Security: It sets limits so that only specific users are allowed to use
the relational data in an RDBMS.
5. Collaborate: It allows multiple users to access the same database at the
same time.
------------------------------------------------------------------------------------------

What are Keys in DBMS?


A KEY in DBMS is an attribute or set of attributes that helps you identify a
row (tuple) in a relation (table). Keys also allow you to find the relation
between two tables. A key uniquely identifies a row in a table by a combination
of one or more columns in that table.
Example:

Employee ID  FirstName  LastName
11           Andrew     Johnson
22           Tom        Wood
33           Alex       Hale

In the above-given example, Employee ID is a primary key because it uniquely
identifies an employee record. In this table, no other employee can have the
same employee ID.

Why do we need a Key?

Here are some reasons for using keys in a DBMS:

• Keys help you to identify any row of data in a table. In a real-world
application, a table could contain thousands of records, and some records
could be duplicated. Keys in RDBMS ensure that you can uniquely identify a
table record despite these challenges.
• Keys allow you to establish and identify the relationship between tables.
• Keys help you to enforce identity and integrity in the relationship.

Types of Keys in DBMS (Database Management System)


There are mainly eight different types of keys in DBMS, and each key has its
own functionality:

1. Super Key
2. Primary Key
3. Candidate Key
4. Alternate Key
5. Foreign Key
6. Compound Key
7. Composite Key
8. Surrogate Key

Let’s look at each of the keys in DBMS with example:

• Super Key – A super key is a set of one or more attributes which uniquely
identifies rows in a table.
• Primary Key – is a column or group of columns in a table that uniquely
identifies every row in that table.
• Candidate Key – is a set of attributes that uniquely identifies tuples in a
table. A candidate key is a minimal super key, that is, a super key with no
redundant attributes.
• Alternate Key – is a candidate key that was not chosen as the primary key.
• Foreign Key – is a column that creates a relationship between two tables.
The purpose of foreign keys is to maintain data integrity and allow
navigation between two different instances of an entity.
• Compound Key – has two or more attributes that allow you to uniquely
recognize a specific record. It is possible that each column may not be
unique by itself within the database.
• Composite Key – is a combination of two or more columns that uniquely
identifies rows in a table. The combination of columns guarantees
uniqueness, though individual uniqueness is not guaranteed.
• Surrogate Key – is an artificial key, such as a generated sequence number,
that is added to a table only to identify each record uniquely.

What is the Super key?


A super key is a set of one or more attributes which uniquely identifies rows
in a table. A super key may have additional attributes that are not needed for
unique identification.

Example:

EmpSSN      EmpNum  Empname
9812345098  AB05    Shown
9876512345  AB06    Roslyn
199937890   AB07    James

In the above-given example, EmpSSN and EmpNum are super keys. Any set of
attributes that contains one of them, such as {EmpNum, Empname}, is a super key
as well.
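
A quick way to test a super key against sample data is to check that no two rows agree on all of the chosen attributes. The following is a sketch in Python using the rows of the table above; the helper name `is_superkey` is hypothetical. Passing this check on a sample is necessary but not sufficient: whether a set of attributes really is a key depends on the rules of the data, not on one instance.

```python
rows = [
    {"EmpSSN": "9812345098", "EmpNum": "AB05", "Empname": "Shown"},
    {"EmpSSN": "9876512345", "EmpNum": "AB06", "Empname": "Roslyn"},
    {"EmpSSN": "199937890", "EmpNum": "AB07", "Empname": "James"},
]

def is_superkey(rows, attrs):
    """True if no two rows share the same values for all of `attrs`."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)

print(is_superkey(rows, ["EmpSSN"]))             # a single attribute
print(is_superkey(rows, ["EmpNum", "Empname"]))  # extra attributes are allowed

# A repeated EmpSSN value means EmpSSN no longer identifies rows uniquely.
rows_with_repeat = rows + [dict(rows[0], EmpNum="AB99")]
print(is_superkey(rows_with_repeat, ["EmpSSN"]))
```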

What is a Primary Key?


PRIMARY KEY in DBMS is a column or group of columns in a table that uniquely
identifies every row in that table. A primary key value can’t be duplicated,
meaning the same value can’t appear more than once in the table. A table
cannot have more than one primary key.
Rules for defining Primary key:

• Two rows can’t have the same primary key value.
• Every row must have a primary key value.
• The primary key field cannot be null.
• The value in a primary key column should not be modified or updated if
any foreign key refers to that primary key.

Example:

In the following example, StudID is a Primary Key.

StudID  Roll No  First Name  LastName  Email
1       11       Tom         Price     [email protected]
2       12       Nick        Wright    [email protected]
3       13       Dana        Natan     [email protected]

What is the Alternate key?


An ALTERNATE KEY is a candidate key that was not chosen as the primary key. A
table can have multiple choices for a primary key, but only one of them can be
set as the primary key. All the candidate keys which are not the primary key
are called alternate keys.
Example:

In this table, StudID, Roll No, and Email are qualified to become a primary
key. But since StudID is the primary key, Roll No and Email become alternate
keys.

StudID  Roll No  First Name  LastName  Email
1       11       Tom         Price     [email protected]
2       12       Nick        Wright    [email protected]
3       13       Dana        Natan     [email protected]
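
A minimal sketch of how a primary key and an alternate key behave, using Python's built-in `sqlite3` module. Several things here are assumptions: the Email column is left out because its values are not legible in the source, the UNIQUE constraint on `RollNo` is one possible way of declaring an alternate key, and the extra rows offered to the helper `rejected` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student ("
    " StudID INTEGER PRIMARY KEY,"       # the chosen primary key
    " RollNo INTEGER NOT NULL UNIQUE,"   # an alternate key
    " FirstName TEXT,"
    " LastName TEXT)"
)
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?)",
    [(1, 11, "Tom", "Price"), (2, 12, "Nick", "Wright"), (3, 13, "Dana", "Natan")],
)

def rejected(row):
    """Try to insert `row`; report whether a key constraint refused it."""
    try:
        conn.execute("INSERT INTO Student VALUES (?, ?, ?, ?)", row)
        return False
    except sqlite3.IntegrityError:
        return True

duplicate_id = rejected((1, 14, "Ann", "Lee"))    # StudID 1 already exists
duplicate_roll = rejected((4, 11, "Ann", "Lee"))  # Roll No 11 already exists
fresh_row = rejected((4, 14, "Ann", "Lee"))       # both values are new

print(duplicate_id, duplicate_roll, fresh_row)
```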

What is a Candidate Key?


A CANDIDATE KEY is a set of attributes that uniquely identifies tuples in a
table. A candidate key is a minimal super key: a super key from which no
attribute can be removed without losing uniqueness. The primary key should be
selected from the candidate keys. Every table must have at least one candidate
key. A table can have multiple candidate keys but only a single primary key.
Properties of Candidate key:

• It must contain unique values
• A candidate key may have multiple attributes
• It must not contain null values
• It should contain the minimum number of fields needed to ensure uniqueness
• It uniquely identifies each record in a table

Candidate key Example: In the given table, StudID, Roll No, and Email are
candidate keys which help us to uniquely identify the student record in the
table.

StudID  Roll No  First Name  LastName  Email
1       11       Tom         Price     [email protected]
2       12       Nick        Wright    [email protected]
3       13       Dana        Natan     [email protected]


What is the Foreign key?

A FOREIGN KEY is a column that creates a relationship between two tables.
The purpose of foreign keys is to maintain data integrity and allow navigation
between two different instances of an entity. It acts as a cross-reference
between two tables because it references the primary key of another table.
Example:

DeptCode  DeptName
001       Science
002       English
005       Computer

Teacher ID  Fname  Lname
B002        David  Warner
B017        Sara   Joseph
B009        Mike   Brunton

In this example, we have two tables, teacher and department, in a school.
However, there is no way to see which teacher works in which department.

By adding DeptCode to the teacher table as a foreign key, we can create a
relationship between the two tables.

Teacher ID  DeptCode  Fname  Lname
B002        002       David  Warner
B017        002       Sara   Joseph
B009        001       Mike   Brunton

The rule that every foreign key value must match an existing primary key value
is known as Referential Integrity.
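
The teacher and department example can be sketched with Python's built-in `sqlite3` module. The table and column spellings (`Department`, `Teacher`, `TeacherID`) and the rejected row are assumptions made for illustration. SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.execute("CREATE TABLE Department (DeptCode TEXT PRIMARY KEY, DeptName TEXT)")
conn.execute(
    "CREATE TABLE Teacher ("
    " TeacherID TEXT PRIMARY KEY,"
    " DeptCode TEXT REFERENCES Department(DeptCode),"  # the foreign key
    " Fname TEXT,"
    " Lname TEXT)"
)
conn.executemany(
    "INSERT INTO Department VALUES (?, ?)",
    [("001", "Science"), ("002", "English"), ("005", "Computer")],
)
conn.executemany(
    "INSERT INTO Teacher VALUES (?, ?, ?, ?)",
    [("B002", "002", "David", "Warner"),
     ("B017", "002", "Sara", "Joseph"),
     ("B009", "001", "Mike", "Brunton")],
)

# The foreign key lets us navigate from a teacher to the department.
teaches_in = conn.execute(
    "SELECT Fname, DeptName FROM Teacher"
    " JOIN Department USING (DeptCode) ORDER BY TeacherID"
).fetchall()

# Referential integrity: a DeptCode with no matching department is refused.
try:
    conn.execute("INSERT INTO Teacher VALUES ('B099', '009', 'Test', 'Teacher')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(teaches_in, orphan_rejected)
```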


What is the Compound key?
A COMPOUND KEY has two or more attributes that allow you to uniquely
recognize a specific record. It is possible that each column may not be unique
by itself within the database. However, when combined with the other column
or columns, the combination becomes unique. The purpose of the compound key in
a database is to uniquely identify each record in the table.
Example:

OrderNo  ProductID  Product Name   Quantity
B005     JAP102459  Mouse          5
B005     DKT321573  USB            10
B005     OMG446789  LCD Monitor    20
B004     DKT321573  USB            15
B002     OMG446789  Laser Printer  3

In this example, neither OrderNo nor ProductID can be a primary key on its own,
because neither uniquely identifies a record. However, a compound key of
OrderNo and ProductID could be used, as together they uniquely identify each
record.
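
A minimal sketch of the same idea with Python's built-in `sqlite3` module: the pair (OrderNo, ProductID) is declared as the key, so repeated order numbers and repeated product IDs are accepted but a repeated pair is not. The table name `OrderLine` and the rejected row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE OrderLine ("
    " OrderNo TEXT,"
    " ProductID TEXT,"
    " ProductName TEXT,"
    " Quantity INTEGER,"
    " PRIMARY KEY (OrderNo, ProductID))"  # key made of two columns
)
order_lines = [
    ("B005", "JAP102459", "Mouse", 5),
    ("B005", "DKT321573", "USB", 10),
    ("B005", "OMG446789", "LCD Monitor", 20),
    ("B004", "DKT321573", "USB", 15),
    ("B002", "OMG446789", "Laser Printer", 3),
]
# All five rows are accepted although OrderNo and ProductID each repeat.
conn.executemany("INSERT INTO OrderLine VALUES (?, ?, ?, ?)", order_lines)
stored = conn.execute("SELECT COUNT(*) FROM OrderLine").fetchone()[0]

# The same (OrderNo, ProductID) pair a second time violates the key.
try:
    conn.execute("INSERT INTO OrderLine VALUES ('B005', 'JAP102459', 'Mouse', 1)")
    pair_rejected = False
except sqlite3.IntegrityError:
    pair_rejected = True

print(stored, pair_rejected)
```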

What is the Composite key?


A COMPOSITE KEY is a combination of two or more columns that uniquely
identifies rows in a table. The combination of columns guarantees uniqueness,
though the columns are not guaranteed to be unique individually. Hence, they
are combined to uniquely identify records in a table.
The difference between a compound key and a composite key is that any part of
the compound key can be a foreign key, whereas the parts of a composite key may
or may not be part of a foreign key.

Integrity Constraints
o Integrity constraints are a set of rules used to maintain the quality of
information.
o Integrity constraints ensure that data insertion, updating, and other
processes are performed in such a way that data integrity is not
affected.
o Thus, integrity constraints guard against accidental damage to the
database.

Types of Integrity Constraint

1. Domain constraints
o Domain constraints can be defined as the definition of a valid set of
values for an attribute.
o The data type of domain includes string, character, integer, time, date,
currency, etc. The value of the attribute must be available in the
corresponding domain.

Example:
2. Entity integrity constraints
o The entity integrity constraint states that primary key value can't be null.
o This is because the primary key value is used to identify individual rows
in relation and if the primary key has a null value, then we can't identify
those rows.
o A table can contain a null value other than the primary key field.

Example:
3. Referential Integrity Constraints
o A referential integrity constraint is specified between two tables.
o In the Referential integrity constraints, if a foreign key in Table 1 refers
to the Primary Key of Table 2, then every value of the Foreign Key in
Table 1 must be null or be available in Table 2.

Example:
4. Key constraints
o Keys are the attributes used to identify an entity uniquely within its
entity set.
o An entity set can have multiple keys, one of which is chosen as the
primary key. A primary key must contain unique values and cannot contain a
null value in the relational table.

Example:
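
The domain, entity integrity, and key constraints described above can be sketched with Python's built-in `sqlite3` module. Everything in the schema is an assumption made for illustration: the table `Student`, its columns, the CHECK rule standing in for a domain, and the sample rows. The primary key is declared NOT NULL explicitly because SQLite would otherwise accept a null in a non-integer primary key column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student ("
    " ID TEXT PRIMARY KEY NOT NULL,"   # entity integrity and key constraint
    " Name TEXT NOT NULL,"
    " Age INTEGER CHECK (typeof(Age) = 'integer' AND Age > 0))"  # domain
)

def violates(row):
    """Try to insert `row`; report whether an integrity constraint refused it."""
    try:
        conn.execute("INSERT INTO Student VALUES (?, ?, ?)", row)
        return False
    except sqlite3.IntegrityError:
        return True

valid_row = violates(("S1", "Asha", 19))         # satisfies every constraint
domain_error = violates(("S2", "Ravi", "A"))     # 'A' is outside the Age domain
entity_error = violates((None, "Meera", 20))     # the primary key is null
key_error = violates(("S1", "Kiran", 21))        # the primary key is repeated

print(valid_row, domain_error, entity_error, key_error)
```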

---------------------------------------------------------------------------------------------

What is Relational Algebra?

Relational algebra is a theoretical procedural query language that takes
instances of relations and performs operations on one or more relations to
define another relation without altering the original relation(s). Thus, both
the operands and the outputs are relations, so the output of one operation can
become the input to another operation. This allows expressions to be nested in
relational algebra, just as arithmetic operations can be nested. This property
is called closure: relations are closed under the algebra, just as numbers are
closed under arithmetic operations.

Relational algebra is a relation-at-a-time (or set) language in which all
tuples are manipulated in one statement without the use of a loop. There are
several variations of syntax for relational algebra commands; this document
uses a common symbolic notation and presents it informally.

The purpose of a query language is to retrieve data from a database or to
perform operations such as insert, update, and delete on the data. Calling
relational algebra a procedural query language means that it specifies both
what data is to be retrieved and how it is to be retrieved.

On the other hand, relational calculus is a non-procedural query language,
which means it specifies what data is to be retrieved but not how to retrieve
it. We will discuss relational calculus in a separate tutorial.

Types of operations in relational algebra

We have divided these operations in two categories:


1. Basic Operations
2. Derived Operations

Basic/Fundamental Operations:

1. Select (σ)
2. Project (∏)
3. Union (∪)
4. Set Difference (-)
5. Cartesian product (X)
6. Rename (ρ)

Derived Operations:

1. Natural Join (⋈)


2. Left, Right, Full outer join (⟕, ⟖, ⟗)
3. Intersection (∩)
4. Division (÷)

Select Operator (σ)


The Select operator is denoted by sigma (σ) and is used to find the tuples (or
rows) in a relation (or table) which satisfy the given condition.

If you know a little SQL, you can think of it as the WHERE clause in SQL, which
is used for the same purpose.

Syntax of Select Operator (σ)

σ condition (relation_name)
Select Operator (σ) Example

Table: CUSTOMER

Customer_Id  Customer_Name  Customer_City
C10100       Steve          Agra
C10111       Raghu          Agra
C10115       Chaitanya      Noida
C10117       Ajeet          Delhi
C10118       Carl           Delhi

Query:

σ Customer_City="Agra" (CUSTOMER)

Output:

Customer_Id  Customer_Name  Customer_City
C10100       Steve          Agra
C10111       Raghu          Agra
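
The select operator can be imitated in a few lines of Python. This is a sketch, not part of relational algebra notation: a relation is modelled as a list of dictionaries, and the helper name `select` is hypothetical.

```python
CUSTOMER = [
    {"Customer_Id": "C10100", "Customer_Name": "Steve", "Customer_City": "Agra"},
    {"Customer_Id": "C10111", "Customer_Name": "Raghu", "Customer_City": "Agra"},
    {"Customer_Id": "C10115", "Customer_Name": "Chaitanya", "Customer_City": "Noida"},
    {"Customer_Id": "C10117", "Customer_Name": "Ajeet", "Customer_City": "Delhi"},
    {"Customer_Id": "C10118", "Customer_Name": "Carl", "Customer_City": "Delhi"},
]

def select(relation, predicate):
    """σ: keep only the tuples for which the predicate holds."""
    return [row for row in relation if predicate(row)]

# σ Customer_City="Agra" (CUSTOMER)
agra = select(CUSTOMER, lambda row: row["Customer_City"] == "Agra")
for row in agra:
    print(row["Customer_Id"], row["Customer_Name"], row["Customer_City"])
```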

Project Operator (∏)


The Project operator is denoted by the ∏ symbol and is used to select the
desired columns (or attributes) from a table (or relation).

The Project operator in relational algebra is similar to the column list of the
SELECT statement in SQL.

Syntax of Project Operator (∏)

∏ column_name1, column_name2, ..., column_nameN (table_name)


Project Operator (∏) Example

In this example, we have a table CUSTOMER with three columns. We want to fetch
only two columns of the table, which we can do with the help of the Project
operator ∏.

Table: CUSTOMER

Customer_Id  Customer_Name  Customer_City
C10100       Steve          Agra
C10111       Raghu          Agra
C10115       Chaitanya      Noida
C10117       Ajeet          Delhi
C10118       Carl           Delhi

Query:

∏ Customer_Name, Customer_City (CUSTOMER)


Output:

Customer_Name Customer_City

Steve Agra
Raghu Agra
Chaitanya Noida
Ajeet Delhi
Carl Delhi
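
A small Python sketch of the project operator, with a relation modelled as a list of dictionaries and a hypothetical helper name `project`. Because the result of a relational algebra operation is a relation, duplicate rows that appear after dropping columns are removed.

```python
CUSTOMER = [
    {"Customer_Id": "C10100", "Customer_Name": "Steve", "Customer_City": "Agra"},
    {"Customer_Id": "C10111", "Customer_Name": "Raghu", "Customer_City": "Agra"},
    {"Customer_Id": "C10115", "Customer_Name": "Chaitanya", "Customer_City": "Noida"},
    {"Customer_Id": "C10117", "Customer_Name": "Ajeet", "Customer_City": "Delhi"},
    {"Customer_Id": "C10118", "Customer_Name": "Carl", "Customer_City": "Delhi"},
]

def project(relation, attributes):
    """∏: keep only the named attributes and drop duplicate tuples."""
    seen = set()
    result = []
    for row in relation:
        projected = tuple(row[a] for a in attributes)
        if projected not in seen:
            seen.add(projected)
            result.append(dict(zip(attributes, projected)))
    return result

# ∏ Customer_Name, Customer_City (CUSTOMER)
names_and_cities = project(CUSTOMER, ["Customer_Name", "Customer_City"])

# ∏ Customer_City (CUSTOMER): repeated cities collapse into one tuple each.
cities = project(CUSTOMER, ["Customer_City"])
print(len(names_and_cities), [row["Customer_City"] for row in cities])
```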

Union Operator (∪)

The Union operator is denoted by the ∪ symbol and is used to select all the
rows (tuples) from two tables (relations).

Let's discuss the union operator a bit more. Say we have two relations R1 and
R2 that both have the same columns, and we want to select all the tuples (rows)
from these relations. We can then apply the union operator to these relations.

Note: The rows (tuples) that are present in both tables will appear only once
in the union set. In short, there are no duplicates after the union operation.

Syntax of Union Operator (∪)

table_name1 ∪ table_name2
Union Operator (∪) Example

Table 1: COURSE
Course_Id Student_Name Student_Id

C101 Aditya S901


C104 Aditya S901
C106 Steve S911
C109 Paul S921
C115 Lucy S931
Table 2: STUDENT

Student_Id Student_Name Student_Age

S901 Aditya 19
S911 Steve 18
S921 Paul 19
S931 Lucy 17
S941 Carl 16
S951 Rick 18
Query:

∏ Student_Name (COURSE) ∪ ∏ Student_Name (STUDENT)


Output:

Student_Name

Aditya
Carl
Paul
Lucy
Rick
Steve
Note: As you can see, there are no duplicate names in the output even though
several names appear in both tables, and the name Aditya appears twice in the
COURSE table itself.

Intersection Operator (∩)


The Intersection operator is denoted by the ∩ symbol and is used to select the
common rows (tuples) from two tables (relations).

Say we have two relations R1 and R2 that both have the same columns, and we
want to select all those tuples (rows) that are present in both relations. In
that case we can apply the intersection operation to the two relations: R1 ∩ R2.
Note: Only those rows that are present in both tables will appear in the
result set.

Syntax of Intersection Operator (∩)

table_name1 ∩ table_name2
Intersection Operator (∩) Example

Lets take the same example that we have taken above.


Table 1: COURSE

Course_Id Student_Name Student_Id

C101 Aditya S901


C104 Aditya S901
C106 Steve S911
C109 Paul S921
C115 Lucy S931
Table 2: STUDENT

Student_Id Student_Name Student_Age

S901 Aditya 19
S911 Steve 18
S921 Paul 19
S931 Lucy 17
S941 Carl 16
S951 Rick 18
Query:

∏ Student_Name (COURSE) ∩ ∏ Student_Name (STUDENT)


Output:

Student_Name

Aditya
Steve
Paul
Lucy
Set Difference (-)

Set Difference is denoted by the – symbol. Say we have two relations R1 and R2
and we want to select all those tuples (rows) that are present in relation R1
but not present in relation R2. This can be done using set difference: R1 – R2.

Syntax of Set Difference (-)

table_name1 - table_name2
Set Difference (-) Example

Let's take the same tables COURSE and STUDENT that we have seen above.

Query:
Let's write a query to select those student names that are present in the
STUDENT table but not present in the COURSE table.

∏ Student_Name (STUDENT) - ∏ Student_Name (COURSE)


Output:

Student_Name

Carl
Rick
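
Union, intersection, and set difference correspond directly to Python's set operators. The sketch below rebuilds the three queries on the COURSE and STUDENT tables shown above; representing each relation as a list of tuples is an assumption made for brevity.

```python
COURSE = [  # (Course_Id, Student_Name, Student_Id)
    ("C101", "Aditya", "S901"),
    ("C104", "Aditya", "S901"),
    ("C106", "Steve", "S911"),
    ("C109", "Paul", "S921"),
    ("C115", "Lucy", "S931"),
]
STUDENT = [  # (Student_Id, Student_Name, Student_Age)
    ("S901", "Aditya", 19),
    ("S911", "Steve", 18),
    ("S921", "Paul", 19),
    ("S931", "Lucy", 17),
    ("S941", "Carl", 16),
    ("S951", "Rick", 18),
]

# ∏ Student_Name (...): a set comprehension also removes duplicates.
course_names = {name for _, name, _ in COURSE}
student_names = {name for _, name, _ in STUDENT}

union = course_names | student_names          # ∪
intersection = course_names & student_names   # ∩
difference = student_names - course_names     # STUDENT names not in COURSE

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
```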

Cartesian product (X)


Cartesian Product is denoted by the X symbol. Say we have two relations R1 and
R2. The cartesian product of these two relations (R1 X R2) combines each tuple
of the first relation R1 with each tuple of the second relation R2. This is
easiest to understand with an example.
Syntax of Cartesian product (X)

R1 X R2
Cartesian product (X) Example

Table 1: R

Col_A Col_B

AA 100
BB 200
CC 300
Table 2: S

Col_X Col_Y

XX 99
YY 11
ZZ 101
Query:
Let's find the cartesian product of tables R and S.

R X S
Output:

Col_A Col_B Col_X Col_Y

AA 100 XX 99
AA 100 YY 11
AA 100 ZZ 101
BB 200 XX 99
BB 200 YY 11
BB 200 ZZ 101
CC 300 XX 99
CC 300 YY 11
CC 300 ZZ 101
Note: The number of rows in the output is always the product of the number of
rows in each table. In our example, table 1 has 3 rows and table 2 has 3 rows,
so the output has 3×3 = 9 rows.
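
The cartesian product of the tables R and S above can be reproduced with `itertools.product` from Python's standard library. This is a sketch in which each relation is assumed to be a list of tuples.

```python
from itertools import product

R = [("AA", 100), ("BB", 200), ("CC", 300)]   # (Col_A, Col_B)
S = [("XX", 99), ("YY", 11), ("ZZ", 101)]     # (Col_X, Col_Y)

# Pair every tuple of R with every tuple of S and concatenate them.
R_x_S = [r + s for r, s in product(R, S)]

for row in R_x_S:
    print(*row)
print(len(R_x_S), "rows =", len(R), "x", len(S))
```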
Rename (ρ)

The Rename (ρ) operation can be used to rename a relation or an attribute of a
relation.

Rename (ρ) Syntax:
ρ(new_relation_name, old_relation_name)

Rename (ρ) Example

Let's say we have a table CUSTOMER. We fetch the customer names and rename the
resulting relation to CUST_NAMES.

Table: CUSTOMER

Customer_Id Customer_Name Customer_City

C10100 Steve Agra


C10111 Raghu Agra
C10115 Chaitanya Noida
C10117 Ajeet Delhi
C10118 Carl Delhi
Query:

ρ(CUST_NAMES, ∏(Customer_Name)(CUSTOMER))
Output:

CUST_NAMES

Steve
Raghu
Chaitanya
Ajeet
Carl
Join is a combination of a Cartesian product followed by a selection process. A
Join operation pairs two tuples from different relations, if and only if a given
join condition is satisfied.
We will briefly describe various join types in the following sections.

Theta (θ) Join

Theta join combines tuples from different relations provided they satisfy the
theta condition. The join condition is denoted by the symbol θ.
Notation
R1 ⋈θ R2
R1 and R2 are relations with attributes (A1, A2, ..., An) and (B1, B2, ..., Bm)
such that the attributes don't have anything in common, that is R1 ∩ R2 = Φ.
Theta join can use all kinds of comparison operators (=, ≠, <, ≤, >, ≥).

Student

SID Name Std

101 Alex 10

102 Maria 11

Subjects

Class Subject

10 Math

10 English

11 Music

11 Sports

Student_Detail = STUDENT ⋈ Student.Std = Subjects.Class SUBJECTS

Student_Detail

SID Name Std Class Subject

101 Alex 10 10 Math

101 Alex 10 10 English

102 Maria 11 11 Music

102 Maria 11 11 Sports
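Since a join is described above as a Cartesian product followed by a selection, the Student_Detail result can be sketched in Python in exactly that way. The function name theta_join and the tuple layout are assumptions made for this example, not part of any library.

```python
from itertools import product

# (SID, Name, Std) and (Class, Subject), copied from the tables above.
student = [(101, "Alex", 10), (102, "Maria", 11)]
subjects = [(10, "Math"), (10, "English"), (11, "Music"), (11, "Sports")]

def theta_join(left, right, theta):
    """Cartesian product of left and right, keeping pairs where theta holds."""
    return [l + r for l, r in product(left, right) if theta(l, r)]

# Join condition: Student.Std = Subjects.Class (an equijoin).
student_detail = theta_join(student, subjects, lambda s, sub: s[2] == sub[0])

for row in student_detail:
    print(row)
```

Passing a different predicate, for example `lambda s, sub: s[2] < sub[0]`, gives a theta join that is not an equijoin.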

Equijoin

When a Theta join uses only the equality comparison operator, it is said to be
an equijoin. The above example corresponds to an equijoin.

Natural Join (⋈)

Natural join does not use any comparison operator. It does not concatenate
tuples the way a Cartesian product does. We can perform a natural join only if
at least one common attribute exists between the two relations, and the common
attributes must have the same name and domain.
Natural join matches tuples whose values for those common attributes are the
same in both relations, and each common attribute appears only once in the
result.

Courses

CID Course Dept

CS01 Database CS

ME01 Mechanics ME
EE01 Electronics EE

HoD

Dept Head

CS Alex

ME Maya

EE Mira

Courses ⋈ HoD

Dept CID Course Head

CS CS01 Database Alex

ME ME01 Mechanics Maya

EE EE01 Electronics Mira
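A natural join can be sketched in Python by modelling each tuple as a dictionary keyed by attribute name, so that the common attributes are found by name. The helper natural_join below is a hypothetical function written for this example.

```python
courses = [
    {"CID": "CS01", "Course": "Database", "Dept": "CS"},
    {"CID": "ME01", "Course": "Mechanics", "Dept": "ME"},
    {"CID": "EE01", "Course": "Electronics", "Dept": "EE"},
]
hod = [
    {"Dept": "CS", "Head": "Alex"},
    {"Dept": "ME", "Head": "Maya"},
    {"Dept": "EE", "Head": "Mira"},
]

def natural_join(left, right):
    """Join on all attribute names the two relations share."""
    if not left or not right:
        return []
    common = set(left[0]) & set(right[0])
    return [
        {**l, **r}  # the common attributes appear only once in the merged row
        for l in left
        for r in right
        if all(l[a] == r[a] for a in common)
    ]

courses_hod = natural_join(courses, hod)
for row in courses_hod:
    print(row)
```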

Outer Joins

Theta Join, Equijoin, and Natural Join are called inner joins. An inner join
includes only those tuples with matching attributes and the rest are discarded in
the resulting relation. Therefore, we need to use outer joins to include all the
tuples from the participating relations in the resulting relation. There are three
kinds of outer joins − left outer join, right outer join, and full outer join.

Left Outer Join (R ⟕ S)

All the tuples from the Left relation, R, are included in the resulting relation. If
there are tuples in R without any matching tuple in the Right relation S, then the
S-attributes of the resulting relation are made NULL.
Left

A B

100 Database

101 Mechanics

102 Electronics

Right

A B

100 Alex

102 Maya

104 Mira

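Courses ⟕ HoD (the Left and Right tables above, matched on column A; the columns of Right are shown as C and D)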

A B C D

100 Database 100 Alex

101 Mechanics --- ---

102 Electronics 102 Maya


Right Outer Join (R ⟖ S)

All the tuples from the Right relation, S, are included in the resulting relation. If
there are tuples in S without any matching tuple in R, then the R-attributes of
the resulting relation are made NULL.

Courses ⟖ HoD

A B C D

100 Database 100 Alex

102 Electronics 102 Maya

--- --- 104 Mira

Full Outer Join (R ⟗ S)

All the tuples from both participating relations are included in the resulting
relation. Where a tuple in either relation has no matching tuple in the other,
the attributes of the other relation are made NULL.

Courses ⟗ HoD

A B C D

100 Database 100 Alex

101 Mechanics --- ---

102 Electronics 102 Maya

--- --- 104 Mira
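The three outer joins can be compared side by side with a small Python sketch. It assumes the rows are matched on their first column (A), uses None in place of NULL, and the function name outer_join is made up for this illustration.

```python
# (A, B) rows of the Left and Right tables above.
left = [(100, "Database"), (101, "Mechanics"), (102, "Electronics")]
right = [(100, "Alex"), (102, "Maya"), (104, "Mira")]

def outer_join(left, right, how):
    """Join on the first column; how is 'left', 'right' or 'full'."""
    rows = []
    matched_right = set()
    for l in left:
        matches = [r for r in right if r[0] == l[0]]
        for r in matches:
            rows.append(l + r)
            matched_right.add(r)
        # Keep unmatched left rows, padding the right-hand attributes with None.
        if not matches and how in ("left", "full"):
            rows.append(l + (None, None))
    # Keep unmatched right rows, padding the left-hand attributes with None.
    if how in ("right", "full"):
        rows += [(None, None) + r for r in right if r not in matched_right]
    return rows

for how in ("left", "right", "full"):
    print(how, outer_join(left, right, how))
```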


-------------------------------------------------------------------------------------------

DBMS SQL

Structured Query Language, popularly known as SQL, is a domain-specific
language used to manage data in relational database management systems
(RDBMS).

• It allows the user to perform operations such as create and drop on the
tables in a database.
• It also allows operations such as read, insert, update and delete on the data
stored in the tables.
• All popular relational database management systems, such as MySQL,
PostgreSQL, Oracle, Microsoft Access, IBM Db2 and SQLite, use SQL
to manage the database.
• SQL is a simple, English-like language used to query the database
and to perform various operations on it.

Rules for a Better SQL Schema

1. Use only lowercase letters, numbers and underscores for database, schema,
column and table names.
2. Use simple and meaningful table and column names. This matters because an
attribute of one table is sometimes referenced from another table as a foreign
key.
3. SQL keywords are not case sensitive; however, the keywords used in a
SQL query are generally written in uppercase.
4. Using SQL queries you can perform various operations on the database.

SQL Processing

• When a SQL statement executes, the system first performs various checks on
the query during the parsing phase, such as a syntax check, a semantic check
and a shared pool check.
• In the second phase, also known as the hard parsing phase, the system
performs query optimization and generates various execution plans.
• In the next phase, the row source generator receives the optimal
execution plan and generates an iterative plan that is used by the database.
• The last step is execution of the query based on the iterative plan generated
in the previous step.
Characteristics of SQL

1. Easy to Learn: SQL is a user-friendly, English-like language, which makes it
easy to learn. Learning SQL doesn't require prior programming knowledge.
2. Portable language: SQL is a portable language, which means that
software that uses SQL can be moved to another machine without
affecting its ability to interact with the database on the new
machine.
3. Supports wide variety of commands: SQL supports various useful
commands such as:
 DDL (Data Definition Language) commands like CREATE, DROP,
ALTER.
 DML (Data Manipulation Language) commands like INSERT,
DELETE, UPDATE.
 DCL (Data Control Language) commands like GRANT, REVOKE.
 TCL (Transaction Control Language) commands like COMMIT,
ROLLBACK.
 DQL (Data Query Language) commands like SELECT.
4. Reusability: SQL promotes reusability by supporting stored procedures.
Stored procedures are saved SQL statements that can be used to
perform a specific task any number of times. This makes it easier to handle
a recurring task: the saved procedure is reused to perform the same task
without rewriting the same SQL statements.
5. Supports JOIN: SQL supports joins, which are used to combine the data of
two or more tables. This is useful when we need to perform an
operation on multiple tables.
6. Supports UNION: The UNION command can be used to combine the results of
two or more DQL statements (SELECT statements).
7. Integration: SQL allows integration with non-SQL database applications as
well.
8. Performance: Better performance even if the database size is huge.
9. SQL is scalable and flexible.
10. SQL is secure.

DBMS: Advantages of SQL

In this article, we will discuss the advantages of SQL. We have already seen
the several features of SQL in characteristics of SQL. Here, we will cover some
of the advantages that we get while using SQL as a database language.

1. Fast Response Time

You can quickly retrieve a large amount of data from a database using SQL. With
suitable indexes in place, the response time of a SQL query is typically short.
2. Requires No coding

Learning SQL is easy and doesn't require any prior coding or programming
knowledge. The syntax of SQL is simple and close to English, so the learning
curve is gentle.

3. Portable

SQL is portable: it is supported on various operating systems and devices. SQL
statements can be saved as stored procedures, and these procedures can be used
on a different machine to perform the same task without rewriting the
statements.

4. Standardised language

SQL has been in use for many years and has a wide variety of well-maintained
documentation. The language is standardised (by ANSI and ISO), so largely the
same syntax can be used on different platforms.

5. Integration

SQL Server can connect to third-party backends such as Oracle, IBM Db2 and MySQL
using drivers. These drivers allow smooth integration.

6. Secure

SQL allows permissions to be set on tables. This makes it secure, as a user with
no permission cannot read, write or modify the data in the database. SQL also has
the concept of constraints, which restrict what data can be inserted into the
tables. All these features make SQL a secure database language.

7. Scalable

It is easy to add or drop tables in the database using SQL, including several
tables at a time. Also, the database size doesn't affect the performance of SQL
that much, and it works well with large databases.
8. Supports Transactions

A transaction is a logical unit of work: a sequence of tasks that must either be
completed fully or not at all in order to maintain database integrity. SQL
commands such as COMMIT, ROLLBACK and SAVEPOINT support transactions.

What is a Primary Key


A primary key is a minimal set of attributes (columns) in a table that uniquely
identifies tuples (rows) of that table.

For example, suppose you want to store student data in a table “student”. The
attributes of this table are: student_id, student_name, student_age,
student_address. The primary key is a set of one or more of these attributes
that uniquely identifies a record in the table. In this case, since student_id
is different for each student, it can be considered a primary key.

Characteristics of a primary key

Primary key has the following characteristics:

1. Minimal

The primary key should contain the minimal number of attributes. In the example
above, student_id alone is able to uniquely identify a record. A
combination of two attributes such as {student_id, student_name} can also
uniquely identify a record. However, since we should choose the minimal set of
attributes, student_id is chosen as the primary key instead of {student_id,
student_name}.

2. Unique

The value of the primary key should be unique for each row of the table. The
column(s) that make up the key cannot contain duplicate values, because a
non-unique value would not help us uniquely identify a record. If two students
had the same student_id, then updating the record of one student based on the
primary key could mistakenly update the record of the other student.
3. Non Null

The attribute(s) that is marked as primary key is not allowed to have null values.

4. Not dependent on Time

The primary key value should not change over time. It should remain as it is
until explicitly updated by the user.

5. Easily accessible

The primary key of the record should be accessible to all the users who are
performing any operations on the database.

6. Can have more than one attribute

It can be a set of more than one attribute (column). For
example, {Stu_Id, Stu_Name} collectively can identify a tuple in the
table, but we do not choose it as the primary key because Stu_Id alone is enough
to uniquely identify rows in the table, and we always go for the minimal set.
That said, we should choose more than one column as the primary key only
when there is no single column that can uniquely identify a tuple in the table.

Syntax for Creating Primary key constraint:

While creating table you can define primary key like this:

CREATE TABLE table_name
(
column_name1 datatype [ NULL | NOT NULL ],
column_name2 datatype [ NULL | NOT NULL ],
...
CONSTRAINT constraint_name PRIMARY KEY (column_nameX, column_nameY, ...)
);
For example: Here we are making stu_id primary key while creating the table
STUDENTS.

CREATE TABLE STUDENTS
(
stu_id INT NOT NULL,
first_name VARCHAR(30) NOT NULL,
last_name VARCHAR(25) NOT NULL,
dob DATE,
CONSTRAINT student_pk PRIMARY KEY (stu_id)
);
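To see the constraint in action, the statement can be run against an in-memory SQLite database from Python's standard sqlite3 module. This is only a sketch: SQLite stands in for whichever DBMS is actually used, and the sample rows and the helper try_insert are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("""
    CREATE TABLE STUDENTS (
        stu_id INT NOT NULL,
        first_name VARCHAR(30) NOT NULL,
        last_name VARCHAR(25) NOT NULL,
        dob DATE,
        CONSTRAINT student_pk PRIMARY KEY (stu_id)
    )
""")
# Hypothetical sample row.
conn.execute("INSERT INTO STUDENTS VALUES (1, 'Steve', 'Smith', '2001-04-12')")

def try_insert(row):
    """Attempt an insert and report whether the constraints allowed it."""
    try:
        conn.execute("INSERT INTO STUDENTS VALUES (?, ?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

# A second row with the same stu_id violates uniqueness.
duplicate = try_insert((1, "Paul", "Jones", "2002-01-30"))
# A row with no stu_id violates NOT NULL.
missing = try_insert((None, "Lucy", "Brown", "2000-09-05"))
print(duplicate)
print(missing)
```

Both inserts are refused, so the table still holds only the first row.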
Properties of a Primary Key

• It does not allow duplicates.
• A table can have only one primary key.
• A primary key is denoted by underlining the attribute name (column name).
• It uniquely identifies each record of the table.
• It doesn't allow null values to be inserted into the primary key column.
• A primary key can consist of more than one column; such a primary key is
known as a composite primary key.

What Are the Benefits of a Primary Key?

The following are the advantages of a primary key:

• It uniquely identifies each row of a table. This is useful when performing
any operation on the data, such as update, delete or search.
• It allows faster access to records because the DBMS builds an index on the
primary key.

Primary Key Example in DBMS


Let's take an example to understand the concept of a primary key. In the
following table, there are three attributes: Stu_Id, Stu_Name and Stu_Age. Out
of these three attributes, one attribute or a set of more than one attribute can
be a primary key.

• Attribute Stu_Name alone cannot be a primary key, as more than one
student can have the same name.
• Attribute Stu_Age alone cannot be a primary key, as more than one
student can have the same age.
• Attribute Stu_Id alone is a primary key, as each student has a unique id that
can identify the student record in the table.

Note: In some cases a single attribute cannot uniquely identify a record in a
table. In that case we try to find a set of attributes that can uniquely identify
a row in the table. We will see the example of it after this example.
Table Name: STUDENTS

-------------------------------------------------------------------------------------------

Super key in DBMS

Definition of Super Key in DBMS: A super key is a set of one or more
attributes (columns) that can uniquely identify a row in a table. DBMS
beginners often confuse super keys with candidate keys, so we will also
discuss the candidate key and its relation to the super key in this article.

How candidate key is different from super key?


The answer is simple: candidate keys are selected from the set of super keys.
The only thing we take care of while selecting a candidate key is that it should
not have any redundant attribute. That's the reason candidate keys are also
termed minimal super keys.

Let’s take an example to understand this:


Table: Employee

Emp_SSN Emp_Number Emp_Name

123456789 226 Steve


999999321 227 Ajeet
888997212 228 Chaitanya
777778888 229 Robert
Super keys: The above table has the following super keys. Each of the following
sets is able to uniquely identify a row of the Employee table.

 {Emp_SSN}
 {Emp_Number}
 {Emp_SSN, Emp_Number}
 {Emp_SSN, Emp_Name}
 {Emp_SSN, Emp_Number, Emp_Name}
 {Emp_Number, Emp_Name}

Candidate Keys: As I mentioned in the beginning, a candidate key is a minimal
super key with no redundant attributes. The following two super keys are
chosen from the above sets, as there are no redundant attributes in these sets.

 {Emp_SSN}
 {Emp_Number}

Only these two sets are candidate keys, as all the other sets contain redundant
attributes that are not necessary for unique identification.

Super key vs Candidate Key


I have been getting a lot of comments regarding the confusion between super key
and candidate key. Let me give you a clear explanation.
1. First you have to understand that all candidate keys are super keys. This is
because the candidate keys are chosen out of the super keys.
2. How do we choose candidate keys from the set of super keys? We look for those
keys from which we cannot remove any field. In the above example, we have
not chosen {Emp_SSN, Emp_Name} as a candidate key because {Emp_SSN}
alone can identify a unique row in the table and Emp_Name is redundant.

Candidate Key in DBMS

Definition of Candidate Key in DBMS: A super key with no redundant
attribute is known as a candidate key. Candidate keys are selected from the set
of super keys. The only thing we take care of while selecting a candidate key is
that it should not have any redundant attributes. That's the reason candidate
keys are also termed minimal super keys.
Candidate Key Example

Let's take an example of a table “Employee”. This table has three attributes:
Emp_Id, Emp_Number and Emp_Name. Here Emp_Id and Emp_Number have
unique values, while Emp_Name can have duplicate values, as more than
one employee can have the same name.

Emp_Id Emp_Number Emp_Name

E01 2264 Steve
E22 2278 Ajeet
E23 2288 Chaitanya
E45 2290 Robert
How many super keys can the above table have?
1. {Emp_Id}
2. {Emp_Number}
3. {Emp_Id, Emp_Number}
4. {Emp_Id, Emp_Name}
5. {Emp_Id, Emp_Number, Emp_Name}
6. {Emp_Number, Emp_Name}

Let's select the candidate keys from the above set of super keys.

1. {Emp_Id} – No redundant attributes
2. {Emp_Number} – No redundant attributes
3. {Emp_Id, Emp_Number} – Redundant attribute. Either of these attributes alone
is a minimal super key, as both columns have unique values.
4. {Emp_Id, Emp_Name} – Redundant attribute Emp_Name.
5. {Emp_Id, Emp_Number, Emp_Name} – Redundant attributes. Emp_Id or
Emp_Number alone is sufficient to uniquely identify a row of the
Employee table.
6. {Emp_Number, Emp_Name} – Redundant attribute Emp_Name.

The candidate keys we have selected are:


{Emp_Id}
{Emp_Number}
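The selection above can be checked mechanically with a short Python sketch that tests every set of attributes for uniqueness and then keeps only the minimal sets. One assumption is labelled in the code: a fifth, hypothetical row is added so that Emp_Name really does contain a duplicate, as the text says it may. Uniqueness in sample data only suggests a key; the real keys follow from the rules of the schema.

```python
from itertools import combinations

columns = ("Emp_Id", "Emp_Number", "Emp_Name")
rows = [
    ("E01", 2264, "Steve"),
    ("E22", 2278, "Ajeet"),
    ("E23", 2288, "Chaitanya"),
    ("E45", 2290, "Robert"),
    ("E77", 2301, "Steve"),  # hypothetical extra row: a second employee named Steve
]

def is_super_key(attrs):
    """True if no two rows agree on all the given attributes."""
    idx = [columns.index(a) for a in attrs]
    projected = {tuple(row[i] for i in idx) for row in rows}
    return len(projected) == len(rows)

# Every non-empty attribute set that identifies rows uniquely is a super key.
super_keys = [
    set(attrs)
    for size in range(1, len(columns) + 1)
    for attrs in combinations(columns, size)
    if is_super_key(attrs)
]

# A candidate key is a super key with no smaller super key inside it.
candidate_keys = [k for k in super_keys if not any(other < k for other in super_keys)]

print(len(super_keys), "super keys")
print("candidate keys:", candidate_keys)
```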
Foreign key in DBMS

Definition: Foreign keys are the columns of a table that point to the primary
key of another table. They act as a cross-reference between tables.

For example:
In the below example the Stu_Id column in Course_enrollment table is a foreign
key as it points to the primary key of the Student table.

Course_enrollment table:

Course_Id Stu_Id

C01 101

C02 102

C03 101

C05 102

C06 103
C07 102

Student table:

Stu_Id Stu_Name Stu_Age

101 Chaitanya 22

102 Arya 26

103 Bran 25

104 Jon 21

Note: In practice, a foreign key does not have to reference the primary key of
another table. If it points to any unique column (not necessarily the primary
key) of another table, it is still a foreign key. So a more accurate definition
of foreign key would be: Foreign keys are the columns of a table that point to
a candidate key of another table.
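The cross-reference can be demonstrated with SQLite through Python's sqlite3 module. This is a sketch under stated assumptions: the column types, the choice of Course_Id as the primary key of Course_enrollment, and the rejected row ('C09', 999) are all invented for the illustration. SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute(
    "CREATE TABLE Student (Stu_Id INTEGER PRIMARY KEY, Stu_Name TEXT, Stu_Age INTEGER)"
)
conn.execute("""
    CREATE TABLE Course_enrollment (
        Course_Id TEXT PRIMARY KEY,
        Stu_Id INTEGER NOT NULL REFERENCES Student (Stu_Id)
    )
""")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [(101, "Chaitanya", 22), (102, "Arya", 26), (103, "Bran", 25), (104, "Jon", 21)],
)

# Student 101 exists, so this enrollment is accepted.
conn.execute("INSERT INTO Course_enrollment VALUES ('C01', 101)")

# Student 999 does not exist, so the foreign key rejects the row.
try:
    conn.execute("INSERT INTO Course_enrollment VALUES ('C09', 999)")
    orphan_result = "accepted"
except sqlite3.IntegrityError as exc:
    orphan_result = f"rejected: {exc}"
print(orphan_result)
```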
Composite key in DBMS

Definition of Composite key: A key that has more than one attribute is known
as a composite key. It is also known as a compound key.

Note: Any key such as a super key, primary key or candidate key can be called
a composite key if it has more than one attribute.

Composite key Example


Let's consider a table Sales. This table has four columns (attributes): cust_Id,
order_Id, product_code and product_count.

Table – Sales

cust_Id order_Id product_code product_count

C01 O001 P007 23


C02 O123 P007 19
C02 O123 P230 82
C01 O001 P890 42
None of these columns alone can play the role of key in this table.

Column cust_Id alone cannot be a key, as the same customer can place
multiple orders, thus the same customer can have multiple entries.

Column order_Id alone cannot be a primary key, as the same order can contain
multiple products, thus the same order_Id can be present multiple
times.

Column product_code alone cannot be a primary key, as more than one customer
can place an order for the same product.

Column product_count alone cannot be a primary key, because two orders can
be placed with the same product count.

Based on this, it is safe to assume that the key must have more than one
attribute:
Key in the above table: {cust_Id, product_code}
This is a composite key, as it is made up of more than one attribute.
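The reasoning above can be tested against the four sample rows with a small Python check. The helper is_unique is a name made up for this sketch. Note that product_count happens to be unique in this small sample, which is why the argument above relies on what can happen in general rather than on the sample alone.

```python
# Rows copied from the Sales table above.
sales = [
    {"cust_Id": "C01", "order_Id": "O001", "product_code": "P007", "product_count": 23},
    {"cust_Id": "C02", "order_Id": "O123", "product_code": "P007", "product_count": 19},
    {"cust_Id": "C02", "order_Id": "O123", "product_code": "P230", "product_count": 82},
    {"cust_Id": "C01", "order_Id": "O001", "product_code": "P890", "product_count": 42},
]

def is_unique(attrs):
    """True if no two sample rows share the same values for attrs."""
    projected = {tuple(row[a] for a in attrs) for row in sales}
    return len(projected) == len(sales)

for attrs in (["cust_Id"], ["order_Id"], ["product_code"],
              ["product_count"], ["cust_Id", "product_code"]):
    print(attrs, is_unique(attrs))
```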

Alternate key in DBMS

As we have seen in the candidate key guide, a table can have multiple
candidate keys. Among these candidate keys, only one is selected
as the primary key; the remaining keys are known as alternate or secondary
keys.

Alternate Key Example


Let's take an example to understand the alternate key concept. Here we have a
table Employee with three attributes: Emp_Id, Emp_Number and
Emp_Name.

Table: Employee

Emp_Id Emp_Number Emp_Name

E01 2264 Steve


E22 2278 Ajeet
E23 2288 Chaitanya
E45 2290 Robert
There are two candidate keys in the above table:
{Emp_Id}
{Emp_Number}

The DBA (database administrator) can choose either of the above keys as the
primary key. Let's say Emp_Id is chosen as the primary key.

Since we have selected Emp_Id as the primary key, the remaining
key Emp_Number is called the alternate or secondary key.
Relational Calculus

There is an alternate way of formulating queries known as relational calculus.
Relational calculus is a non-procedural query language. In a non-procedural
query language, the user is not concerned with the details of how to obtain the
end results. Relational calculus tells what to do but never explains how to do
it. Most commercial relational languages, including SQL, QBE and QUEL, are based
on aspects of relational calculus.

Why it is called Relational Calculus?

It is based on predicate calculus, a branch of symbolic
logic. A predicate is a truth-valued function with arguments. On substituting
values for the arguments, the function results in an expression called a
proposition, which can be either true or false. Relational calculus is a
tailored version of a subset of predicate calculus for communicating with a
relational database.

Many calculus expressions involve the use of quantifiers. There are
two types of quantifiers:

o Universal Quantifier: The universal quantifier, denoted by ∀, is read as
"for all", which means that in a given set of tuples all the tuples satisfy a
given condition.
o Existential Quantifier: The existential quantifier, denoted by ∃, is read
as "there exists", which means that in a given set of tuples there is at least
one tuple whose values satisfy a given condition.

A tuple variable t is bound if it is quantified, that is, if it appears within
the scope of a quantifier. A variable that is not bound is said to be free.

Free and bound variables may be compared with the global and local variables of
programming languages.
Types of Relational calculus:

1. Tuple Relational Calculus (TRC)

It is a non-procedural query language based on finding the
tuple variables (also known as range variables) for which a predicate holds
true. It describes the desired information without giving a specific procedure
for obtaining that information. The tuple relational calculus is used to select
tuples from a relation. In TRC, the filtering variable ranges over the tuples of
a relation. The result can have one or more tuples.

Notation:

A query in the tuple relational calculus is expressed in the following notation:

{T | P (T)} or {T | Condition (T)}

Where

T is the resulting tuple

P(T) is the condition used to fetch T.

For example:

{ T.name | Author(T) AND T.article = 'database' }


Output: This query selects tuples from the Author relation. It returns the
'name' of every author who has written an article on 'database'.

TRC (tuple relational calculus) expressions can be quantified. In TRC, we can
use the Existential (∃) and Universal (∀) quantifiers.

For example:

{ R | ∃T ∈ Author (T.article = 'database' AND R.name = T.name) }

Output: This query will yield the same result as the previous one.

Table: Student

First_Name Last_Name Age

Ajeet Singh 30
Chaitanya Singh 31
Rajeev Bhatia 27
Carl Pratap 28
Let's write relational calculus queries.

Query to display the last name of those students whose age is greater than
30

{ t.Last_Name | Student(t) AND t.Age > 30 }


In the above query you can see two parts separated by | symbol. The second part
is where we define the condition and in the first part we specify the fields which
we want to display for the selected tuples.

The result of the above query would be:

Last_Name

Singh
Query to display all the details of students where Last name is ‘Singh’

{ t | Student(t) AND t.Last_Name = 'Singh' }


Output:
First_Name Last_Name Age

Ajeet Singh 30
Chaitanya Singh 31
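A TRC expression reads much like a list comprehension: the part before the | symbol says what to return, and the part after it says which tuples qualify. The two queries above can be sketched in Python as follows; the namedtuple Student and the list name students are choices made for this example.

```python
from collections import namedtuple

Student = namedtuple("Student", ["First_Name", "Last_Name", "Age"])

# Rows copied from the Student table above.
students = [
    Student("Ajeet", "Singh", 30),
    Student("Chaitanya", "Singh", 31),
    Student("Rajeev", "Bhatia", 27),
    Student("Carl", "Pratap", 28),
]

# { t.Last_Name | Student(t) AND t.Age > 30 }
older_than_30 = [t.Last_Name for t in students if t.Age > 30]

# { t | Student(t) AND t.Last_Name = 'Singh' }
singhs = [t for t in students if t.Last_Name == "Singh"]

print(older_than_30)
print(singhs)
```

In both cases the variable t ranges over whole tuples, which is what distinguishes TRC from the domain relational calculus.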

2. Domain Relational Calculus (DRC)

The second form of relational calculus is known as domain relational calculus.
In domain relational calculus, the filtering variables range over the domains of
attributes. Domain relational calculus uses the same operators as tuple
calculus. It uses the logical connectives ∧ (and), ∨ (or) and ¬ (not). It uses
the Existential (∃) and Universal (∀) quantifiers to bind variables. QBE, or
Query by Example, is a query language related to domain relational calculus.

Notation:

{ a1, a2, a3, ..., an | P (a1, a2, a3, ..., an) }

Where

a1, a2, ..., an are domain variables, one per attribute

P stands for a formula built from those variables

For example:

{ < article, page, subject > | < article, page, subject > ∈ javatpoint ∧ subject = 'database' }

First_Name Last_Name Age

Ajeet Singh 30
Chaitanya Singh 31
Rajeev Bhatia 27
Carl Pratap 28
Query to find the first name and age of students where student age is greater
than 27

{ < First_Name, Age > | ∃ Last_Name (< First_Name, Last_Name, Age > ∈ Student ∧ Age > 27) }


Note:
The symbols used for logical operators are: ∧ for AND, ∨ for OR and ¬ for
NOT.

Output:

First_Name Age

Ajeet 30
Chaitanya 31
Carl 28
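In DRC the variables stand for individual attribute values rather than whole tuples. A rough Python analogue is tuple unpacking, where each column gets its own variable; the names below are assumptions made for this sketch.

```python
# (First_Name, Last_Name, Age) rows copied from the Student table above.
student = [
    ("Ajeet", "Singh", 30),
    ("Chaitanya", "Singh", 31),
    ("Rajeev", "Bhatia", 27),
    ("Carl", "Pratap", 28),
]

# first_name, last_name and age play the role of domain variables:
# each one ranges over the values of a single attribute.
result = [
    (first_name, age)
    for (first_name, last_name, age) in student
    if age > 27
]

print(result)
```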

What is Indexing in DBMS?


Indexing is a technique for improving database performance by reducing the
number of disk accesses necessary when a query is run. An index is a form of
data structure. It’s used to swiftly identify and access data and information
present in a database table.

Structure of Index
We can create indices using some columns of the database table.

• The search key is the index's first column. It contains a
copy of the table's candidate key or primary key. The key
values are saved in sorted order so that the related data can be
accessed quickly.
• The data reference is the index's second column. It contains a set of
pointers to the disk blocks where the value of a specific key can
be found.

Methods of Indexing
Ordered Indices
To make searching easier and faster, the indices are frequently arranged/sorted.
Ordered indices are indices that have been sorted.
Example
Let's say we have a table of employees with thousands of records, each of
which is ten bytes long. If their IDs begin with 1, 2, 3, …, and we are
looking for the employee with ID 543:

• Without an index, the DBMS must scan the disk blocks from the beginning
until it reaches 543. It will read the record after reading 543*10 = 5430
bytes.
• With an index, the search is performed on the index, and the
DBMS reads the record after reading 542*2 = 1084 bytes of index entries
(assuming each index entry is 2 bytes), which is significantly less than in the
previous case.
Primary Index

• Primary indexing refers to the process of creating an index based on the
table's primary key. Primary keys are unique to each record and
establish a 1:1 relationship between index entries and records.
• The searching operation is fairly efficient because primary keys are stored
in sorted order.
• There are two types of primary indexes: dense indexes and sparse
indexes.
Dense Index
In a dense index, every search key value in the data file has an index record.
This speeds up the search. The number of records in the index table and in the
main table is the same in this case, so extra space is needed to hold
the index records. Each index record contains the search key and a pointer to
the actual record on the disk.
• For every search key value in the data file, there is an index record.
• This record contains the search key and also a reference to the first data
record with that search key value.

Sparse Index
In a sparse index, only some of the items in the data file have index records.
Each index record points to a block. Rather than pointing to every record in the
main table, the index points to records at intervals, leaving gaps between the
indexed records.
• The index record appears only for a few items in the data file. Each item
points to a block.
• To locate a record, we find the index record with the largest search key
value less than or equal to the search key value we are looking for.
• We start at the record pointed to by that index record and proceed
sequentially through the file until we find the desired
record.
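The lookup steps for a sparse index can be sketched in Python with the standard bisect module. Everything concrete here is an assumption for illustration: twenty records with keys 1 to 20, four records per block, and one index entry per block holding the first key of that block.

```python
from bisect import bisect_right

BLOCK_SIZE = 4  # records per block (assumed)

# Data file: records sorted by key, split into fixed-size blocks.
records = [(key, f"record-{key}") for key in range(1, 21)]
blocks = [records[i:i + BLOCK_SIZE] for i in range(0, len(records), BLOCK_SIZE)]

# Sparse index: one entry per block, holding the first key in that block.
index_keys = [block[0][0] for block in blocks]

def lookup(key):
    """Find a record using the sparse index, then scan within one block."""
    # Largest index entry whose key is <= the key we are looking for.
    pos = bisect_right(index_keys, key) - 1
    if pos < 0:
        return None  # key is smaller than every indexed key
    # Sequential scan from the start of the block the index points to.
    for k, payload in blocks[pos]:
        if k == key:
            return payload
    return None

print(index_keys)
print(lookup(11))
```

A dense index would instead hold one entry for every key, which removes the scan inside the block at the cost of a larger index.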
Clustering Index

A clustered index is defined on an ordered data file. Indices are sometimes
built on non-primary key columns, which may or may not be unique for each
record.
• In this situation, we group two or more columns to obtain a unique
value and build an index on them to make it easier to find the
record. This method is called a clustering index.
• Records with similar properties are grouped together, and indices for
these groups are constructed.
Example
Assume that each department in a corporation has numerous employees.
Assume we use a clustering index, in which all employees with the same
Dept_ID are grouped together into a single cluster, and index pointers refer to
the cluster as a whole. Dept_ID is a non-unique key in this case.
When one disk block is shared by records from different clusters, this
structure becomes a little confusing. Using separate disk blocks for separate
clusters is considered the better strategy.
Secondary Index
With sparse indexing, the size of the mapping grows along with the
size of the table. These mappings are usually kept in primary memory to
speed up address fetching; the actual data is then read from secondary memory
using the address obtained from the mapping. Fetching the address becomes
slower as the mapping grows, and the sparse index becomes inefficient.
Secondary indexing is used to solve this problem.
Secondary indexing introduces another level of indexing to reduce the size
of the mapping. Large ranges of the column values are chosen first,
so that the mapping at the first level is small. Each range is then
subdivided into smaller ranges. Because the first level's mapping is kept in
primary memory, fetching the addresses is faster. The second-level mapping, as
well as the actual data, is kept in secondary memory (hard disk).
PL/SQL Full Form
PL/SQL stands for “Procedural Language extensions to the Structured Query
Language.” PL/SQL is Oracle Corporation’s procedural extension for SQL and
the Oracle relational database. It is a high-performance, highly integrated
database language.

What is PL/SQL Developer?


PL/SQL Developer is an Integrated Development Environment (IDE), published by
Allround Automations, for developing software in an Oracle Database environment
and performing various database tasks with ease. The PL/SQL Developer IDE
provides a GUI and plug-ins that help end users save time on their
database tasks. (Oracle's own free IDE is a separate product, Oracle SQL
Developer.)

Architecture of PL/SQL
[Figure: PL/SQL Architecture Diagram]

The PL/SQL architecture mainly consists of following three components:

1. PL/SQL Block
2. PL/SQL Engine
3. Database Server

PL/SQL block:

• This is the component that holds the actual PL/SQL code.
• It consists of different sections that divide the code logically (a
declarative section for declarations, an execution section for processing
statements and an exception handling section for handling errors).
• It also contains the SQL instructions used to interact with the database
server.
• All PL/SQL units are treated as PL/SQL blocks. This is the
starting stage of the architecture, which serves as the primary input.

Following are the different types of PL/SQL units.

 Anonymous Block
 Function
 Library
 Procedure
 Package Body
 Package Specification
 Trigger
 Type
 Type Body

PL/SQL Engine

• The PL/SQL engine is the component where the actual processing of the
code takes place.
• The PL/SQL engine separates the PL/SQL units from the SQL part of the input.
• The separated PL/SQL units are handled by the PL/SQL engine itself.
• The SQL part is sent to the database server, where the actual interaction
with the database takes place.
• It can be installed in both the database server and the application server.

Database Server:

• This is the most important component of the PL/SQL architecture; it stores
the data.
• The PL/SQL engine uses the SQL from PL/SQL units to interact with the
database server.
• It contains the SQL executor, which parses the input SQL statements and
executes them.

Features & Advantages of PL/SQL

1. Better performance, as SQL is executed in bulk rather than one statement at
a time
2. High Productivity
3. Tight integration with SQL
4. Full Portability
5. Tight Security
6. Supports Object Oriented Programming concepts.
7. Scalability and Manageability
8. Supports Web Application Development
9. Supports Server Page Development

Disadvantages of PL/SQL

1. Stored procedures in PL/SQL use a lot of memory
2. Lacks debugging functionality for stored procedures
3. Any change in the underlying database requires a change in the
presentation layer as well
4. Does not completely separate the roles of the back-end developer and the
front-end developer
5. Difficult to separate HTML development from PL/SQL development

Difference between SQL and PL/SQL


Here are some important differences between SQL and PL/SQL:

SQL                                     PL/SQL
--------------------------------------  --------------------------------------
SQL is a single query that is used to   PL/SQL is a block of code that is used
perform DML and DDL operations.         to write entire program blocks,
                                        procedures, functions, etc.

It is declarative: it defines what      PL/SQL is procedural: it defines how
needs to be done, rather than how       things need to be done.
things need to be done.

Executes as a single statement.         Executes as a whole block.

Mainly used to manipulate data.         Mainly used to create an application.

Interacts with the database server      Reaches the database server only
directly.                               through the SQL it contains.

Cannot contain PL/SQL code in it.       It is an extension of SQL, so it can
                                        contain SQL inside it.
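The declarative versus procedural row of the comparison can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in: the `employees` table, its columns, and the sample rows are hypothetical, and the Python loop only imitates the step-by-step style of a PL/SQL block rather than being PL/SQL itself.

```python
import sqlite3

# Hypothetical sample data; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("John", 4000), ("Asha", 5000), ("Ravi", 6500)],
)

# Declarative (SQL style): one statement says WHAT result is wanted.
declarative_total = conn.execute(
    "SELECT SUM(salary) FROM employees WHERE salary >= 5000"
).fetchone()[0]

# Procedural (the style PL/SQL adds, imitated here in Python):
# the program spells out HOW to reach the result, row by row.
procedural_total = 0
for name, salary in conn.execute("SELECT name, salary FROM employees"):
    if salary >= 5000:
        procedural_total += salary

print(declarative_total, procedural_total)
```

Both approaches arrive at the same total; the difference lies in who decides the steps, the database engine or the programmer.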

Structured Query Language (SQL)

Structured Query Language is a standard database language which is used to
create, maintain, and retrieve data from relational databases. Following are
some interesting facts about SQL.
 SQL is case insensitive. But it is a recommended practice to use keywords
(like SELECT, UPDATE, CREATE, etc.) in capital letters and use user-defined
things (like table names, column names, etc.) in small letters.
 We can write comments in SQL using "--" (double hyphen) at the beginning
of any line.
 SQL is the programming language for relational databases (explained
below) like MySQL, Oracle, Sybase, SQL Server, PostgreSQL, etc. Non-
relational databases (also called NoSQL databases) like MongoDB,
DynamoDB, etc. do not use SQL.
 Although there is an ISO standard for SQL, most of the implementations
slightly vary in syntax. So we may encounter queries that work in SQL
Server but do not work in MySQL.

Rules:

SQL follows the following rules:

o Structured Query Language is not case sensitive. Generally, keywords of
SQL are written in uppercase.
o SQL statements do not depend on text lines. A single SQL statement can be
written on one line or across multiple lines.
o Using SQL statements, you can perform most of the actions in a
database.
o SQL is based on tuple relational calculus and relational algebra.
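The points about case insensitivity, comments, and multi-line statements can be checked with a small sketch. It assumes SQLite (through Python's built-in sqlite3 module) as the database, and the `employee` table and its single row are hypothetical; other database systems may differ in details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER, name TEXT)")
conn.execute("INSERT INTO employee VALUES (1, 'John')")

# Keywords in uppercase (the recommended style) ...
upper = conn.execute("SELECT name FROM employee WHERE id = 1").fetchall()

# ... and the same statement with lowercase keywords.
lower = conn.execute("select name from employee where id = 1").fetchall()

# One statement spread over several text lines, with a comment
# introduced by a double hyphen.
multi_line = conn.execute(
    """
    SELECT name      -- everything after the double hyphen is ignored
    FROM employee
    WHERE id = 1
    """
).fetchall()

print(upper, lower, multi_line)
```

All three calls return the same row, which is what the rules above predict.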

What Is SQL?

Structured Query Language (SQL) refers to a standard programming language
used to extract, organize, manage, and manipulate data stored in relational
databases. SQL is therefore referred to as a database language that can
execute activities on databases that consist of tables made up of rows and
columns.

SQL plays a crucial role in retrieving relevant data from databases, which can
later be used by various platforms such as Python or R for analysis purposes.
SQL databases can also handle many transactions simultaneously, including
workloads where large volumes of data are written concurrently.

SQL is an American National Standards Institute (ANSI) standard. It exists in
several versions and dialects and handles backend data for web applications
built on relational databases such as MySQL, SQL Server, Oracle,
PostgreSQL, and others.

Platforms owned by Meta, such as Facebook, WhatsApp, and Instagram, all rely
on SQL for data processing and backend storage.
How does SQL work?

When an SQL query is written and run, it is processed by the ‘query language
processor’, which includes a parser and a query optimizer. The SQL server
then compiles the query in three stages:

1. Parsing: This step checks the syntax of the query.

2. Binding: This step verifies the semantics of the query before it is
executed, for example that the tables and columns it names exist.

3. Optimization: The final step generates the query execution plan. The
objective here is to identify an efficient query execution plan that runs in
minimal time: the shorter the response time for the SQL query, the better the
plan. Several candidate plans are generated and compared to arrive at a
practical final execution plan.
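The three stages can be observed from the outside with a minimal sketch. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for a full SQL server, so the stage boundaries are only approximated by the errors and plan output that SQLite exposes. The `employees` table, the `idx_salary` index, and the `try_query` helper are hypothetical names introduced for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.execute("CREATE INDEX idx_salary ON employees (salary)")


def try_query(sql):
    """Run a query and return 'ok' or the error message the engine reports."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.Error as exc:
        return str(exc)


# 1. Parsing: a misspelled keyword is rejected as a syntax error.
parse_result = try_query("SELEC name FROM employees")

# 2. Binding: the syntax is valid, but the column does not exist.
bind_result = try_query("SELECT bonus FROM employees")

# 3. Optimization: ask the engine which execution plan it chose.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM employees WHERE salary = 5000"
).fetchall()

print(parse_result)
print(bind_result)
for row in plan:
    print(row[-1])  # the last field holds the human-readable plan step
```

The exact wording of the error messages and of the plan differs between database systems and versions, so the printed text should be read as indicative only.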

Essential SQL benefits

SQL offers several benefits as it is a user-friendly language accessible across
platforms.

Let’s understand the key benefits of SQL:

 Portable language: SQL, being a portable language, can be
transferred from one device to another, where the devices can range
from personal computers and servers to laptops and even some
mobile devices. The language is capable of running on local internet
and intranet systems.
 Fast query processing: Irrespective of the volume of data, SQL is
capable of inserting, deleting, retrieving, and manipulating data
quickly and efficiently while ensuring data accuracy. This enables
fast data sharing between users.
 No coding skills required: SQL does not demand the coding skills that
other programming languages do. Its user-friendly nature makes it
accessible to all users, as they can work with SQL through
keywords such as ‘create,’ ‘insert,’ ‘select,’ ‘update,’ and others
without possessing any programming skills.
 Uniform platform with standardized language: SQL uses English-like
keywords; hence, it is easy for all users to understand,
learn, write, and interpret without much difficulty. The English
words and statements make SQL accessible to everyone, including
people with little or no previous experience.
 Offers multiple data views: SQL provides a facility to create
multiple data views, where different users can see the database
structure and its content differently.
 Open source code: Open source SQL solutions such as MySQL,
MariaDB, and PostgreSQL provide accessible SQL databases.
This attracts the participation of larger communities at a lower cost.
 Top database management system (DBMS) vendors use SQL:
The DBMS products of top companies such as IBM,
Oracle, and Microsoft use SQL, considering the comprehensive
benefits it offers.
 Interactive language: SQL is an interactive and interpretive
language. As such, it reduces the chances of miscommunication or
misunderstanding between users.
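The "multiple data views" benefit listed above can be sketched as follows. The example assumes SQLite through Python's built-in sqlite3 module; the `employees` table and the `directory` and `payroll` views are hypothetical names chosen for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("John", "Sales", 4000), ("Asha", "HR", 5000)],
)

# Two views over the same table: each exposes a different slice of the data.
conn.execute("CREATE VIEW directory AS SELECT name, department FROM employees")
conn.execute("CREATE VIEW payroll AS SELECT name, salary FROM employees")

directory_rows = conn.execute(
    "SELECT * FROM directory ORDER BY name"
).fetchall()
payroll_rows = conn.execute(
    "SELECT * FROM payroll ORDER BY name"
).fetchall()

print(directory_rows)
print(payroll_rows)
```

A user who is given only the `directory` view sees names and departments but never salaries, even though all three columns live in the same underlying table.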
Elements of SQL

SQL is the go-to choice of most database users due to its ease of use and the
way its queries can carry out varied functions on vast amounts of structured
data.

The SQL programming language has the following vital elements:

1. Keywords

2. Clauses

3. Expressions

4. Predicates

5. Queries

Let’s understand the role of each element in SQL programming:

1. Keywords: Keywords refer to a set of words that allow you to perform
operations on your database. Consider the example of the keyword ‘LIKE’; it
searches for a specific data pattern in the database.

For example, let’s say we need to identify the names of families living in the
Boston area whose last name is ‘Luis’. The following SQL query will
fetch the relevant results for this problem statement:

SELECT * FROM [BOSTON] WHERE NAME LIKE '%LUIS';
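To see what the '%LUIS' pattern actually matches, here is a minimal sketch using Python's built-in sqlite3 module. The contents of the BOSTON table are hypothetical sample rows; only the table name, the NAME column, and the LIKE pattern come from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE BOSTON (NAME TEXT)")
conn.executemany(
    "INSERT INTO BOSTON VALUES (?)",
    [("MARIA LUIS",), ("LUIS GARCIA",), ("ANA LUIS",)],
)

# '%' stands for any sequence of characters, so '%LUIS' matches
# every NAME that ends with LUIS.
matches = conn.execute(
    "SELECT * FROM BOSTON WHERE NAME LIKE '%LUIS' ORDER BY NAME"
).fetchall()

print(matches)
```

'LUIS GARCIA' is not returned because in that row LUIS appears at the start of the name, not at the end.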

Similarly, different keywords perform various operations on the databases. The
following are some examples of such keywords with their functional roles:

 CREATE: This keyword helps in creating database structures such as
tables, views, and indexes
 INSERT: It adds data to the rows of a table
 SELECT: Selects data from a database or table
 FROM: Indicates the table from which data needs to be fetched
 WHERE: It filters the data so that only relevant data matching
certain conditions is fetched
 UPDATE: Updates existing rows in a table
 DELETE: It deletes the existing rows in a table
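The keywords listed above can be seen working together in a short sketch. It assumes SQLite through Python's built-in sqlite3 module, and the `customers` table with its rows is hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CREATE: define the table structure.
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)

# INSERT: add rows to the table.
conn.executemany(
    "INSERT INTO customers (id, name, city) VALUES (?, ?, ?)",
    [(1, "John", "Boston"), (2, "Asha", "Pune"), (3, "Ravi", "Boston")],
)

# SELECT ... FROM ... WHERE: fetch only the rows matching a condition.
boston = conn.execute(
    "SELECT name FROM customers WHERE city = 'Boston' ORDER BY id"
).fetchall()

# UPDATE: change an existing row.
conn.execute("UPDATE customers SET city = 'Mumbai' WHERE id = 2")

# DELETE: remove an existing row.
conn.execute("DELETE FROM customers WHERE id = 3")

remaining = conn.execute(
    "SELECT id, name, city FROM customers ORDER BY id"
).fetchall()

print(boston)
print(remaining)
```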

2. Clauses: Clauses are the built-in parts of an SQL statement that filter
data and retrieve only the required data from the database or table. They are
especially useful when handling large databases.

Let’s consider a use case where you need to select age, email, and address from
the database. The clause would then be written as ‘SELECT Age, Email,
Address,’ where SELECT is a keyword and Age, Email, and Address name the
columns the SQL query needs in order to retrieve the required data.

3. Expressions: SQL expressions represent a formula typically written in a
query format. An expression combines one or more values, operators, and SQL
functions that evaluate to a specific value. Moreover, SQL expressions are
broadly divided into three types, namely, Boolean, numeric, and date.

Let’s consider an example of a Boolean expression that fetches data by
matching single values. If you want to identify employees whose salaries are
equal to 5,000, you can use the following SQL query:

SELECT * FROM EMPLOYEES WHERE SALARY = 5000;
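The three expression types mentioned above can be sketched side by side. The example assumes SQLite through Python's built-in sqlite3 module; the rows in EMPLOYEES are hypothetical, and `date()` is SQLite's own date function, so other database systems spell date arithmetic differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEES (NAME TEXT, SALARY INTEGER)")
conn.executemany(
    "INSERT INTO EMPLOYEES VALUES (?, ?)",
    [("John", 5000), ("Asha", 6000)],
)

# Boolean expression: SALARY = 5000 is true or false for each row.
boolean_rows = conn.execute(
    "SELECT NAME FROM EMPLOYEES WHERE SALARY = 5000"
).fetchall()

# Numeric expression: SALARY * 12 computes a new number per row.
numeric_rows = conn.execute(
    "SELECT NAME, SALARY * 12 FROM EMPLOYEES ORDER BY NAME"
).fetchall()

# Date expression: add one day to a date (SQLite-specific syntax).
date_value = conn.execute(
    "SELECT date('2024-01-31', '+1 day')"
).fetchone()[0]

print(boolean_rows, numeric_rows, date_value)
```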


4. Predicates: Predicates specify a relationship between two expressions and
evaluate to TRUE, FALSE, or unknown. In other words, a predicate is an
expression that is used as a condition.

For example, consider the following SQL statement:

SELECT * FROM CUSTOMERS WHERE Product = 'Television';

Here, ‘Product = 'Television'’ is the predicate of the SQL statement.
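The TRUE, FALSE, and unknown outcomes of a predicate can be made visible with a minimal sketch. It assumes SQLite through Python's built-in sqlite3 module, where TRUE is reported as 1, FALSE as 0, and unknown as NULL (Python `None`). The `Name` column and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMERS (Name TEXT, Product TEXT)")
conn.executemany(
    "INSERT INTO CUSTOMERS VALUES (?, ?)",
    [("John", "Television"), ("Asha", "Radio"), ("Ravi", None)],
)

# Evaluate the predicate for every row instead of filtering by it.
results = conn.execute(
    "SELECT Name, Product = 'Television' FROM CUSTOMERS ORDER BY Name"
).fetchall()

# Used in WHERE, only rows for which the predicate is TRUE are returned.
matching = conn.execute(
    "SELECT Name FROM CUSTOMERS WHERE Product = 'Television'"
).fetchall()

print(results)
print(matching)
```

The row with a missing Product yields unknown rather than FALSE, and like a FALSE row it is left out of the WHERE result.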

5. Queries: SQL queries refer to statements used to request or retrieve data
from a database. For example, let’s say you want to retrieve the first name and
customer number of all customers whose last name is ‘Lobo’. The following
query will fetch the relevant data from the database:

SELECT First_Name, Customer_No FROM Customers WHER
