SQL Joins
A SQL join fetches data from two or more tables and presents it as a single set of
data. It combines columns from two or more tables by using values common to the
tables. The JOIN keyword is used in SQL queries for joining two or more tables.
Joining n tables requires at least (n-1) join conditions. A table can also be
joined to itself, which is known as a self join.
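As a rough illustration of the (n-1) rule, the sketch below joins three tables using two join conditions. It runs the SQL through Python's built-in sqlite3 module; the author, book, and review tables and their rows are hypothetical and do not come from the text above.

```python
import sqlite3

# Hypothetical schema: three tables, so at least two join conditions are needed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (author_id INTEGER, author_name TEXT);
    CREATE TABLE book   (book_id INTEGER, author_id INTEGER, title TEXT);
    CREATE TABLE review (book_id INTEGER, stars INTEGER);
    INSERT INTO author VALUES (1, 'Ann');
    INSERT INTO book   VALUES (10, 1, 'Joins');
    INSERT INTO review VALUES (10, 5);
""")

three_table_rows = conn.execute("""
    SELECT a.author_name, b.title, r.stars
    FROM author a
    JOIN book b   ON b.author_id = a.author_id   -- join condition 1
    JOIN review r ON r.book_id = b.book_id       -- join condition 2
""").fetchall()

print(three_table_rows)
```

Dropping either ON condition would leave one table unconstrained, and its rows would be paired with every row of the others.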
Types of Join
The following are the types of JOIN that we can use in SQL.
Inner
Outer
Left
Right
First table (columns ID, NAME):
NAME
abhi
adam
alex

Second table (columns ID, Address):
Address
DELHI
MUMBAI
CHENNAI

Result (ID values not shown):
NAME    Address
abhi    DELHI
adam    DELHI
alex    DELHI
abhi    MUMBAI
adam    MUMBAI
alex    MUMBAI
abhi    CHENNAI
adam    CHENNAI
alex    CHENNAI
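The result above pairs each of the three names with each of the three addresses, which is the Cartesian product that a cross join produces. A minimal sketch of that query follows, using Python's sqlite3 module; the table names student and student_address and the ID values are assumptions made for illustration.

```python
import sqlite3

# Hypothetical table names and ID values; only the names and addresses
# are taken from the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER, name TEXT);
    CREATE TABLE student_address (id INTEGER, address TEXT);
    INSERT INTO student VALUES (1, 'abhi'), (2, 'adam'), (4, 'alex');
    INSERT INTO student_address VALUES (1, 'DELHI'), (2, 'MUMBAI'), (3, 'CHENNAI');
""")

# CROSS JOIN takes no join condition: every row is paired with every row.
cross_rows = conn.execute("""
    SELECT s.name, a.address
    FROM student s CROSS JOIN student_address a
""").fetchall()

print(len(cross_rows))  # 3 rows x 3 rows
```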
First table (columns ID, NAME):
NAME
abhi
adam
alex
anu

Second table (columns ID, Address):
Address
DELHI
MUMBAI
CHENNAI

Result (ID values not shown):
NAME    Address
abhi    DELHI
adam    MUMBAI
alex    CHENNAI
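The result above keeps only the rows whose ID exists in both tables, which is the behaviour of an inner join. The following sketch reproduces it with Python's sqlite3 module; the table names and the ID values (1 to 4 and 1 to 3) are assumptions, chosen so that anu has no matching address.

```python
import sqlite3

# Hypothetical table names and ID values; names and addresses follow the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER, name TEXT);
    CREATE TABLE student_address (id INTEGER, address TEXT);
    INSERT INTO student VALUES (1, 'abhi'), (2, 'adam'), (3, 'alex'), (4, 'anu');
    INSERT INTO student_address VALUES (1, 'DELHI'), (2, 'MUMBAI'), (3, 'CHENNAI');
""")

# Only rows satisfying the ON condition in both tables are returned.
inner_rows = conn.execute("""
    SELECT s.name, a.address
    FROM student s
    INNER JOIN student_address a ON s.id = a.id
    ORDER BY s.id
""").fetchall()

print(inner_rows)
```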
Natural JOIN
Natural join is a type of inner join that is based on columns having the same name
and the same datatype in both of the tables being joined.
First table (columns ID, NAME):
NAME
abhi
adam
alex
anu

Second table (columns ID, Address):
Address
DELHI
MUMBAI
CHENNAI

Result (ID values not shown):
NAME    Address
abhi    DELHI
adam    MUMBAI
alex    CHENNAI
In the above example, both tables being joined have an ID column (same name and
same datatype), so the result of the natural join is the set of records whose ID
value matches in both tables.
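A small sketch of the same idea, run through Python's sqlite3 module, is given below. The table names and ID values are assumptions; the point it tries to show is that NATURAL JOIN needs no ON clause and that the shared column appears only once in the result.

```python
import sqlite3

# Hypothetical table names and ID values; both tables share a column named id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER, name TEXT);
    CREATE TABLE student_address (id INTEGER, address TEXT);
    INSERT INTO student VALUES (1, 'abhi'), (2, 'adam'), (3, 'alex'), (4, 'anu');
    INSERT INTO student_address VALUES (1, 'DELHI'), (2, 'MUMBAI'), (3, 'CHENNAI');
""")

# The join column is inferred from the common column name; no ON clause is written.
cursor = conn.execute(
    "SELECT * FROM student NATURAL JOIN student_address ORDER BY name"
)
natural_columns = [column[0] for column in cursor.description]
natural_rows = cursor.fetchall()

print(natural_columns)
print(natural_rows)
```

Because the match depends on column names alone, renaming a column in either table silently changes what a natural join compares, which is one reason an explicit ON condition is often preferred.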
Outer JOIN
Outer Join is based on both matched and unmatched data. Outer joins subdivide further into
left outer join, right outer join, and full outer join.
on table-name1.column-name = table-name2.column-name;
First table (columns ID, NAME):
NAME
abhi
adam
alex
anu
ashish

Second table (columns ID, Address):
Address
DELHI
MUMBAI
CHENNAI
NOIDA
PANIPAT

Result (ID values not shown):
NAME    Address
abhi    DELHI
adam    MUMBAI
alex    CHENNAI
anu     null
ashish  null
table-name2
on table-name1.column-name = table-name2.column-name;
First table (columns ID, NAME):
NAME
abhi
adam
alex
anu
ashish

Second table (columns ID, Address):
Address
DELHI
MUMBAI
CHENNAI
NOIDA
PANIPAT

Result (ID values not shown):
NAME    Address
abhi    DELHI
adam    MUMBAI
alex    CHENNAI
null    NOIDA
null    PANIPAT
table-name2
on table-name1.column-name = table-name2.column-name;
First table (columns ID, NAME):
NAME
abhi
adam
alex
anu
ashish

Second table (columns ID, Address):
Address
DELHI
MUMBAI
CHENNAI
NOIDA
PANIPAT

Result (ID values not shown):
NAME    Address
abhi    DELHI
adam    MUMBAI
alex    CHENNAI
anu     null
ashish  null
null    NOIDA
null    PANIPAT
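The three results above differ only in which unmatched rows are kept: those of the first table, those of the second table, or both. The sketch below reproduces them with Python's sqlite3 module. The table names and ID values are assumptions. Because some SQLite versions lack RIGHT and FULL OUTER JOIN, the right join is written here as a left join with the tables swapped, and the full join as a UNION of the two; this is a workaround for the sketch, not the syntax shown in the text.

```python
import sqlite3

# Hypothetical table names and ID values; names and addresses follow the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER, name TEXT);
    CREATE TABLE student_address (id INTEGER, address TEXT);
    INSERT INTO student VALUES
        (1, 'abhi'), (2, 'adam'), (3, 'alex'), (4, 'anu'), (5, 'ashish');
    INSERT INTO student_address VALUES
        (1, 'DELHI'), (2, 'MUMBAI'), (3, 'CHENNAI'), (7, 'NOIDA'), (8, 'PANIPAT');
""")

# Left outer join: all students, with NULL where no address matches.
LEFT_JOIN = """
    SELECT s.name, a.address
    FROM student s LEFT OUTER JOIN student_address a ON s.id = a.id
"""
# Right outer join, written as a left outer join with the table order swapped.
RIGHT_JOIN = """
    SELECT s.name, a.address
    FROM student_address a LEFT OUTER JOIN student s ON s.id = a.id
"""

left_rows = conn.execute(LEFT_JOIN).fetchall()
right_rows = conn.execute(RIGHT_JOIN).fetchall()
# UNION keeps each matched row once and adds both kinds of unmatched row.
full_rows = conn.execute(LEFT_JOIN + " UNION " + RIGHT_JOIN).fetchall()

print(len(left_rows), len(right_rows), len(full_rows))
```

SQL NULL values come back as Python None, so the unmatched rows appear as ('anu', None) or (None, 'NOIDA').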
Types of join in SQL Server for fetching records from multiple tables.
Introduction
In this tip, I explain the types of join.
What is a join?
An SQL JOIN clause is used to combine rows from two or more tables, based on a
common field between them.
There are many types of join.
Inner Join
    Equi-join
    Natural Join
Outer Join
    Left outer join
    Right outer join
    Full outer join
Cross Join
Self Join
Employee
Departments
create table Departments(
id int identity(1,1) primary key,
DepartmentName varchar(50)
)
1) Inner Join
The join that displays only the rows that have a match in both of the joined tables is
known as an inner join.
select e1.Username, e1.FirstName, e1.LastName, e2.DepartmentName
from Employee e1 inner join Departments e2 on e1.DepartID = e2.id
It returns the matched rows from both tables, comparing DepartID of the first table
with id of the second table.
Equi-Join
Equi join is a special type of join in which only the equality operator is used. Hence,
when you write a join query using the equality operator, that query is an equi
join.
Equi join has only (=) operator in join condition.
Equi join can be inner join, left outer join, right outer join.
Check the query for equi-join:
SELECT * FROM Employee e1 JOIN Departments e2 ON e1.DepartID = e2.id
2) Outer Join
An outer join returns the matched rows together with the unmatched rows of one or
both tables.
We have three types of outer join: left outer join, right outer join, and full outer join.
3) Cross Join
A cross join produces the Cartesian product of the tables that are involved in the join.
The size of a Cartesian product is the number of rows in the first table multiplied by
the number of rows in the second table.
SELECT * FROM Employee cross join Departments e2
4) Self Join
Joining a table to itself is called a self join. A self join is used to retrieve records
that have some relation or similarity with other records in the same table. Here, we
need to use aliases for the same table to set up the self join and retrieve the records
satisfying the condition.
SELECT e1.Username, e1.FirstName, e1.LastName from Employee e1
inner join Employee e2 on e1.id = e2.DepartID
Here, I have retrieved the rows in which id and DepartID of the Employee table
match.
Points of Interest
Here is one more example of a self join: a scenario where the manager name is
retrieved through managerid, which refers to the employee id in the same table.
For this, I have created one table named employees.
To retrieve the manager name from the manager id, a self join can be used:
select e1.empName as ManagerName, e2.empName as EmpName
from employees e1 inner join employees e2 on e1.id = e2.managerid
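The manager lookup can be tried out with the sketch below, which runs the same self join through Python's sqlite3 module. The column names id, empName, and managerid come from the query above; the column types and the three sample rows are assumptions.

```python
import sqlite3

# Assumed column types and sample rows; Asha is the manager of Ravi and Meena.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        id INTEGER PRIMARY KEY,
        empName TEXT,
        managerid INTEGER          -- refers to employees.id; NULL for the top manager
    );
    INSERT INTO employees VALUES (1, 'Asha', NULL), (2, 'Ravi', 1), (3, 'Meena', 1);
""")

# e1 plays the role of the manager, e2 the role of the employee.
manager_rows = conn.execute("""
    SELECT e1.empName AS ManagerName, e2.empName AS EmpName
    FROM employees e1
    INNER JOIN employees e2 ON e1.id = e2.managerid
    ORDER BY e2.id
""").fetchall()

print(manager_rows)
```

Since this is an inner join, an employee without a manager (Asha in the sample rows) does not appear in the EmpName column.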
11 important database designing rules which I follow
Table of Contents
Introduction
Rule 1: What is the nature of the application (OLTP or OLAP)?
Rule 2: Break your data into logical pieces, make life simpler
Rule 3: Do not get overdosed with rule 2
Rule 4: Treat duplicate non-uniform data as your biggest enemy
Rule 5: Watch for data separated by separators
Introduction
Before you start reading this article, let me confirm that I am not a guru in
database designing. The 11 points below are what I have learnt via projects, my
own experiences, and my own reading. I personally think they have helped me a
lot when it comes to DB designing. Any criticism is welcome.
So now let's apply the second rule of the 1st normal form: avoid repeating
groups. You can see in the above figure I have created a separate syllabus table
and then made a many-to-many relationship with the subject table.
With this approach the syllabus field in the main table no longer repeats and
no longer holds data separated by separators.
In the above figure you can see how the average field is
dependent on the marks and subject. This is also one form of
redundancy. So for such kinds of fields which are derived from
other fields, give a thought: are they really necessary?
This rule is also termed the 3rd normal form: no column should depend on other
non-primary key columns. My personal thought is: do not apply this rule blindly,
see the situation; it's not that redundant data is always bad. If the redundant
data is calculative data, see the situation and then decide if you want to
implement the 3rd normal form.
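One way to avoid storing a derived field such as the average is to compute it when it is read. The sketch below does this with a view, using Python's sqlite3 module; the marks table, its columns, and the sample rows are hypothetical, and whether a view is fast enough depends on the situation, as noted above.

```python
import sqlite3

# Hypothetical table and rows: only the raw marks are stored.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE marks (student TEXT, subject TEXT, marks INTEGER);
    INSERT INTO marks VALUES
        ('abhi', 'maths', 80), ('abhi', 'physics', 60), ('adam', 'maths', 90);

    -- The average is derived on demand instead of being stored in a column.
    CREATE VIEW student_average AS
        SELECT student, AVG(marks) AS average
        FROM marks
        GROUP BY student;
""")

averages = dict(conn.execute("SELECT student, average FROM student_average"))
print(averages)
```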
Rule 9: Multidimensional data is a different beast altogether
OLAP projects mostly deal with multidimensional data. For
instance you can see the below figure, you would like to get
sales per country, customer, and date. In simple words you
are looking at sales figures which have three intersections of
dimension data.
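As a small illustration of sales figures at the intersection of three dimensions, the sketch below groups a single sales table by country, customer, and date, using Python's sqlite3 module. The table layout and the rows are hypothetical; a real OLAP design would typically use separate dimension tables, which this sketch leaves out.

```python
import sqlite3

# Hypothetical flat sales table; each row is one sale amount.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (country TEXT, customer TEXT, sale_date TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('India', 'Acme',  '2024-01-01', 100),
        ('India', 'Acme',  '2024-01-01', 50),
        ('India', 'Birla', '2024-01-02', 70),
        ('Nepal', 'Acme',  '2024-01-01', 30);
""")

# One total per (country, customer, date) intersection.
sales_by_dimension = conn.execute("""
    SELECT country, customer, sale_date, SUM(amount)
    FROM sales
    GROUP BY country, customer, sale_date
    ORDER BY country, customer, sale_date
""").fetchall()

print(sales_by_dimension)
```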
License
This article, along with any associated source code and files,
is licensed under The Code Project Open License (CPOL)
Shivprasad koirala
Architect http://www.questpond.com
India
Identifying Entities
The types of information that are saved in the database are called 'entities'.
These entities exist in four kinds: people, things, events, and locations.
Everything you could want to put in a database fits into one of these
categories. If the information you want to include doesn't fit into these
categories, then it is probably not an entity but a property of an entity, an
attribute.
To clarify the information given in this article we'll use an example. Imagine
that you are creating a website for a shop, what kind of information do you
have to deal with? In a shop you sell your products to customers. The
Identifying Relationships
The next step is to determine the relationships between the entities and to
determine the cardinality of each relationship. The relationship is the
connection between the entities, just like in the real world: what does one
entity do with the other, how do they relate to each other? For example,
customers buy products, products are sold to customers, a sale comprises
products, a sale happens in a shop.
The cardinality shows how many of one side of the relationship belong to
how many of the other side of the relationship. First, you need to state for
each relationship how many of one side belong to exactly 1 of the other
side. For example: How many customers belong to 1 sale? How many
sales belong to 1 customer? How many sales take place in 1 shop?
You'll get a list like this: (please note that 'product' represents a type of
product, not an occurrence of a product)
Now we'll put the data together to find the cardinality of the whole
relationship. In order to do this, we'll draft the cardinalities per relationship.
To make this easy to do, we'll adjust the notation a bit, by noting the
'backward'-relationship the other way around:
Customers --> Sales; 1 customer can buy something several times
Sales --> Customers; 1 sale is always made by 1 customer at a time
The second relationship we will turn around so it has the same entity order
as the first. Please notice the arrow, which now faces the other way!
Customers <-- Sales; 1 sale is always made by 1 customer at a time
Cardinality exists in four types: one-to-one, one-to-many, many-to-one, and
many-to-many. In a database design this is indicated as: 1:1, 1:N, M:1, and
M:N. To find the right indication just leave the '1'. If there is a 'many' on the
left side, this will be indicated with 'M', if there is a 'many' on the right side it
is indicated with 'N'.
Customers --> Sales; 1 customer can buy something several times;
1:N.
Customers <-- Sales; 1 sale is always made by 1 customer at a
time; 1:1.
The true cardinality can be calculated by taking the biggest value on the left
side and the biggest value on the right side, where 'N' and 'M' are greater
than '1'. In this example, in both cases there is a '1' on the left side. On the
right side, there is an 'N' and a '1'; the 'N' is the biggest value. The total
cardinality is therefore '1:N'. A customer can make multiple 'sales', but each
'sale' has just one customer.
If we do this for the other relationships too, we'll get:
Customers --> Sales; --> 1:N
Customers --> Products; --> M:N
Customers --> Shops; --> M:N
Sales --> Products; --> M:N
Shops --> Sales; --> 1:N
Shops --> Products; --> M:N
So, we have two '1-to-many' relationships, and four 'many-to-many'
relationships.
Between the entities there may be a mutual dependency. This means that
the one item cannot exist if the other item does not exist. For example,
there cannot be a sale if there are no customers, and there cannot be a
sale if there are no products.
The relationships Sales --> Customers, and Sales --> Products are
mandatory, but the other way around this is not the case. A customer can
exist without sale, and also a product can exist without sale. This is of
importance for the next step.
Redundant Relationships
Sometimes in your model you will get a 'redundant relationship'. These are
relationships that are already indicated by other relationships, although not
directly.
In the case of our example there is a direct relationship between
customers and products. But there are also relationships from customers to
sales and from sales to products, so indirectly there already is a
relationship between customers and products through sales. The
relationship 'Customers <----> Products' is made twice, and one of them is
therefore redundant. In this case, products are only purchased through a
Identifying Attributes
The data elements that you want to save for each entity are called
'attributes'.
About the products that you sell, you want to know, for example, what the
price is, what the name of the manufacturer is, and what the type number
is. About the customers you know their customer number, their name, and
address. About the shops you know the location code, the name, the
address. Of the sales you know when they happened, in which shop, what
products were sold, and the sum total of the sale. Of the vendor you know
his staff number, name, and address. What will be included precisely is not
of importance yet; it is still only about what you want to save.
Derived Data
Derived data is data that is derived from the other data that you have
already saved. In this case the 'sum total' is a classical case of derived
data. You know exactly what has been sold and what each product costs,
so you can always calculate how much the sum total of the sales is. So
really it is not necessary to save the sum total.
So why is it saved here? Well, because it is a sale, and the price of the
product can vary over time. A product can be priced at 10 euros today and
at 8 euros next month, and for your administration you need to know what it
cost at the time of the sale, and the easiest way to do this is to save it here.
There are a lot of more elegant ways, but they are too profound for this
article.
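One possible alternative, sketched below with Python's sqlite3 module, is to copy the unit price into each sale line at the moment of the sale, so that the total can still be derived after the product price changes. This is an assumption made for illustration, not the approach the article settles on; the table and column names are hypothetical.

```python
import sqlite3

# Hypothetical tables: the sale line keeps its own copy of the price.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (productnr INTEGER PRIMARY KEY, price REAL);
    CREATE TABLE sale_line (
        salenr INTEGER, productnr INTEGER, quantity INTEGER, unit_price REAL
    );
    INSERT INTO product VALUES (1, 10.0);
""")

# Record sale 100: two units of product 1 at today's price of 10 euros.
conn.execute("""
    INSERT INTO sale_line
    SELECT 100, productnr, 2, price FROM product WHERE productnr = 1
""")

# Next month the product costs 8 euros; the recorded sale is not affected.
conn.execute("UPDATE product SET price = 8.0 WHERE productnr = 1")

sale_total = conn.execute(
    "SELECT SUM(quantity * unit_price) FROM sale_line WHERE salenr = 100"
).fetchone()[0]
current_price = conn.execute(
    "SELECT price FROM product WHERE productnr = 1"
).fetchone()[0]

print(sale_total, current_price)
```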
type of relationship. The side of the relationship that is mandatory for the
other to exist will be indicated through a dash on the line. Non-mandatory
entities are indicated through a circle. "Many" is indicated through a
'crow's foot': the relationship line splits up into three lines.
In this article we make use of DeZign for Databases to design and present
our database.
A 1:1 mandatory relationship is represented as follows:
Assigning Keys
Primary Keys
A primary key (PK) is one or more data attributes that uniquely identify an
entity. A key that consists of two or more attributes is called a composite
key. All attributes part of a primary key must have a value in every record
(which cannot be left empty) and the combination of the values within these
attributes must be unique in the table.
In the example there are a few obvious candidates for the primary key.
Customers all have a customer number, products all have a unique product
number and the sales have a sales number. Each of these data is unique
and each record will contain a value, so these attributes can be a primary
key. Often an integer column is used for the primary key so a record can be
easily found through its number.
Link-entities usually refer to the primary key attributes of the entities that
they link. The primary key of a link-entity is usually a collection of these
reference-attributes. For example in the Sales_details entity we could use
the combination of the PK's of the sales and products entities as the PK of
Sales_details. In this way we enforce that the same product (type) can only
be used once in the same sale. Multiple items of the same product type in a
sale must be indicated by the quantity.
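The effect of such a composite primary key can be sketched as follows, using Python's sqlite3 module. The entity name Sales_details comes from the text; the column names salenr and productnr and the sample values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed column names; the primary key combines the two reference attributes.
conn.execute("""
    CREATE TABLE Sales_details (
        salenr INTEGER,
        productnr INTEGER,
        quantity INTEGER,
        PRIMARY KEY (salenr, productnr)
    )
""")
conn.execute("INSERT INTO Sales_details VALUES (1, 42, 1)")

# Adding the same product to the same sale a second time violates the key.
try:
    conn.execute("INSERT INTO Sales_details VALUES (1, 42, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# A second item of the same product type is recorded through the quantity.
conn.execute("""
    UPDATE Sales_details SET quantity = quantity + 1
    WHERE salenr = 1 AND productnr = 42
""")
quantity = conn.execute("SELECT quantity FROM Sales_details").fetchone()[0]

print(duplicate_rejected, quantity)
```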
In the ERD the primary key attributes are indicated by the text 'PK' behind
the name of the attribute. In the example only the entity 'shop' does not
have an obvious candidate for the PK, so we will introduce a new attribute
for that entity: shopnr.
Foreign Keys
The Foreign Key (FK) in an entity is the reference to the primary key of
another entity. In the ERD that attribute will be indicated with 'FK' behind its
name. The foreign key of an entity can also be part of the primary key, in
that case the attribute will be indicated with 'PF' behind its name. This is
usually the case with the link-entities, because you usually link two
instances only once together (with 1 sale only 1 product type is sold 1
time).
If we put all link-entities, PK's and FK's into the ERD, we get the model as
shown below. Please note that the attribute 'products' is no longer
necessary in 'Sales', because 'sold products' is now included in the
link-table. In the link-table another field was added, 'quantity', that
indicates how many products were sold. The quantity field was also added in
the stock-table, to indicate how many products are still in store.
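How a foreign key ties a sale to an existing customer can be sketched as below, using Python's sqlite3 module. The table and column names are assumptions. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON; other database systems usually enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite-specific switch; without it the REFERENCES clause is not enforced.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customers (customernr INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sales (
        salenr INTEGER PRIMARY KEY,
        customernr INTEGER NOT NULL REFERENCES customers(customernr)  -- FK
    );
    INSERT INTO customers VALUES (1, 'Jan');
    INSERT INTO sales VALUES (100, 1);
""")

# A sale that points to a customer number that does not exist is refused.
try:
    conn.execute("INSERT INTO sales VALUES (101, 999)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

sale_count = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(orphan_rejected, sale_count)
```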
Normalization
Normalization makes your data model flexible and reliable. It does generate
some overhead because you usually get more tables, but it enables you to
do many things with your data model without having to adjust it.
Normalization, the First Form
The first form of normalization states that there may be no repeating groups
of columns in an entity. We could have created an entity 'sales' with
attributes for each of the products that were bought. This would look like
this:
What is wrong with this is that now only 3 products can be sold. If you
had to sell 4 products, then you would have to start a second sale
or adjust your data model by adding 'product4' attributes. Both solutions are
unwanted. In these cases you should always create a new entity that you
link to the old one via a one-to-many relationship.
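A minimal sketch of that fix, using Python's sqlite3 module, is shown below: the repeating product columns are replaced by a separate entity with one row per product sold. The table names, column names, and product names are hypothetical.

```python
import sqlite3

# Hypothetical tables: one sale row, and one detail row per product sold.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (salenr INTEGER PRIMARY KEY);
    CREATE TABLE sales_details (
        salenr INTEGER REFERENCES sales(salenr),   -- one sale, many detail rows
        product TEXT
    );
    INSERT INTO sales VALUES (1);
    -- A fourth product needs no 'product4' column, just one more row.
    INSERT INTO sales_details VALUES
        (1, 'pen'), (1, 'ink'), (1, 'paper'), (1, 'stapler');
""")

products_in_sale = conn.execute(
    "SELECT COUNT(*) FROM sales_details WHERE salenr = 1"
).fetchone()[0]

print(products_in_sale)
```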
The third form of normalization states that all attributes need to be directly
dependent on the primary key, and not on other attributes. This seems to
be what the second form of normalization states, but the second form
actually states the opposite. In the second form of normalization you point
out attributes through the PK; in the third form of normalization every
attribute needs to be dependent on the PK, and nothing else.
There are more normalization forms than the three forms mentioned above,
but those are not of great interest for the average user. These other forms
are highly specialized for certain applications. If you stick to the design
rules and the normalization mentioned in this article, you will create a
design that works great for most applications.
Normalized Data Model
If you apply the normalization rules, you will find that the 'manufacturer' in
the product table should also be a separate table:
Figure 19: Data model in accordance with 1st, 2nd and 3rd normal form.
Glossary
Attributes - detailed data about an entity, such as price, length, name
Cardinality - the relationship between two entities, in figures. For example,
a person can place multiple orders.
Entities - abstract data that you save in a database. For example:
customers, products.
Foreign key (FK) - a referral to the Primary Key of another table. Foreign
Key-columns can only contain values that exist in the Primary Key column
that they refer to.
Key - a key is used to point out records. The most well-known key is the
Primary Key (see Primary Key).