SQL - RDBMS Concepts

SQL cheat sheet

Uploaded by

memof87346
Copyright
© All Rights Reserved

What is RDBMS?

RDBMS stands for Relational Database Management System.

RDBMS is the basis for SQL and for all modern relational database
systems such as MS SQL Server, IBM DB2, Oracle, MySQL, and
Microsoft Access.

A relational database management system (RDBMS) is a database
management system (DBMS) based on the relational model
introduced by E. F. Codd in 1970.

What is a Table?
The data in an RDBMS is stored in database objects known
as tables. A table is a collection of related data entries
organized into columns and rows.

A table is the most common and simplest form of data storage in a
relational database. The following CUSTOMERS table stores each
customer's ID, Name, Age, Salary, City and Country −

ID  Name     Age  Salary   City        Country

1   Ramesh   32   2000.00  Hyderabad   India
2   Mukesh   40   5000.00  New York    USA
3   Sumit    45   4500.00  Muscat      Oman
4   Kaushik  25   2500.00  Kolkata     India
5   Hardik   29   3500.00  Bhopal      India
6   Komal    38   3500.00  Saharanpur  India
7   Ayush    25   3500.00  Delhi       India
8   Javed    29   3700.00  Delhi       India
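As a rough sketch, the CUSTOMERS table above could be created and filled as follows. The example uses Python's built-in sqlite3 module only so that it runs without a database server; the column types are assumptions, since the table above does not state them.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Column types are assumed for illustration; the source table lists only names.
conn.execute("""
    CREATE TABLE CUSTOMERS (
        ID      INTEGER PRIMARY KEY,
        Name    TEXT NOT NULL,
        Age     INTEGER,
        Salary  REAL,
        City    TEXT,
        Country TEXT
    )
""")

rows = [
    (1, "Ramesh", 32, 2000.00, "Hyderabad", "India"),
    (2, "Mukesh", 40, 5000.00, "New York", "USA"),
    (3, "Sumit", 45, 4500.00, "Muscat", "Oman"),
    (4, "Kaushik", 25, 2500.00, "Kolkata", "India"),
    (5, "Hardik", 29, 3500.00, "Bhopal", "India"),
    (6, "Komal", 38, 3500.00, "Saharanpur", "India"),
    (7, "Ayush", 25, 3500.00, "Delhi", "India"),
    (8, "Javed", 29, 3700.00, "Delhi", "India"),
]
conn.executemany("INSERT INTO CUSTOMERS VALUES (?, ?, ?, ?, ?, ?)", rows)

# Count the stored records and list the customers living in Delhi.
row_count = conn.execute("SELECT COUNT(*) FROM CUSTOMERS").fetchone()[0]
delhi_names = [r[0] for r in conn.execute(
    "SELECT Name FROM CUSTOMERS WHERE City = 'Delhi' ORDER BY ID")]

print(row_count)    # 8
print(delhi_names)  # ['Ayush', 'Javed']
```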

What is a Field?
Every table is broken up into smaller entities called fields. A field
is a column in a table that is designed to maintain specific
information about every record in the table.

For example, our CUSTOMERS table consists of the fields ID, Name,
Age, Salary, City and Country.

What is a Record or a Row?

A record, also called a row of data, is each individual entry
that exists in a table. For example, there are 8 records in the
CUSTOMERS table above. The following is a single row of data, or
record, from the CUSTOMERS table −

ID  Name    Age  Salary   City       Country

1   Ramesh  32   2000.00  Hyderabad  India

A record is a horizontal entity in a table.

What is a Column?
A column is a vertical entity in a table that contains all
information associated with a specific field in a table.

For example, our CUSTOMERS table has separate columns for ID,
Name, Age, Salary, City and Country.
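The difference between a record (horizontal) and a column (vertical) can be sketched with two queries. This is a minimal illustration using Python's built-in sqlite3 module with the first three CUSTOMERS rows; the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE CUSTOMERS (
        ID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER,
        Salary REAL, City TEXT, Country TEXT
    )
""")
conn.executemany("INSERT INTO CUSTOMERS VALUES (?, ?, ?, ?, ?, ?)", [
    (1, "Ramesh", 32, 2000.00, "Hyderabad", "India"),
    (2, "Mukesh", 40, 5000.00, "New York", "USA"),
    (3, "Sumit", 45, 4500.00, "Muscat", "Oman"),
])

# A record (row): every field for one customer.
record = conn.execute("SELECT * FROM CUSTOMERS WHERE ID = 1").fetchone()

# A column: one field across all records.
names = [r[0] for r in conn.execute("SELECT Name FROM CUSTOMERS ORDER BY ID")]

# The field names themselves, read from the cursor metadata.
fields = [d[0] for d in conn.execute("SELECT * FROM CUSTOMERS").description]

print(record)
print(names)
print(fields)
```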

What is a NULL Value?

A NULL value in a table is a value in a field that appears to be
blank, which means a field with a NULL value is a field with no
value.

It is very important to understand that a NULL value is different
from a zero value or a field that contains spaces. A field with a
NULL value is one that was left blank when the record was created.
The following table has three records: the first has a NULL value
for the salary and the second has a zero value for the salary.

ID  Name    Age  Salary   City       Country

1   Ramesh  32            Hyderabad  India
2   Mukesh  40   00.00    New York   USA
3   Sumit   45   4500.00  Muscat     Oman
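The distinction between NULL and zero shows up directly in queries. The following is a small sketch using Python's built-in sqlite3 module and a cut-down CUSTOMERS table (only three columns, with assumed types) that mirrors the three records above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMERS (ID INTEGER PRIMARY KEY, Name TEXT, Salary REAL)")
conn.executemany("INSERT INTO CUSTOMERS VALUES (?, ?, ?)", [
    (1, "Ramesh", None),    # Python None is stored as SQL NULL
    (2, "Mukesh", 0.0),     # a real value that happens to be zero
    (3, "Sumit", 4500.0),
])

def ids(sql):
    """Return the ID column of a query as a plain list."""
    return [r[0] for r in conn.execute(sql)]

null_ids = ids("SELECT ID FROM CUSTOMERS WHERE Salary IS NULL")
zero_ids = ids("SELECT ID FROM CUSTOMERS WHERE Salary = 0")
# Comparing with "= NULL" never matches any row; IS NULL is required.
eq_null_ids = ids("SELECT ID FROM CUSTOMERS WHERE Salary = NULL")

# Aggregates skip NULLs: only the two non-NULL salaries are averaged.
avg_salary = conn.execute("SELECT AVG(Salary) FROM CUSTOMERS").fetchone()[0]

print(null_ids, zero_ids, eq_null_ids, avg_salary)
```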

SQL Constraints
Constraints are rules enforced on the data columns of a table.
They limit the type of data that can go into a table, which
ensures the accuracy and reliability of the data in the
database.

Constraints can be either column level or table level. Column
level constraints apply to a single column, whereas table level
constraints apply to the entire table.

Following are some of the most commonly used constraints
available in SQL −

S.No.  Constraint

1      NOT NULL Constraint
       Ensures that a column cannot have a NULL value.

2      DEFAULT Constraint
       Provides a default value for a column when none is specified.

3      UNIQUE Key
       Ensures that all the values in a column are different.

4      PRIMARY Key
       Uniquely identifies each row/record in a database table.

5      FOREIGN Key
       Refers to the primary (or unique) key of another table, so
       each value must match an existing row/record in that table.

6      CHECK Constraint
       Ensures that all values in a column satisfy certain conditions.

7      INDEX
       Used to retrieve data from the database very quickly. Strictly
       an access structure rather than a rule on the data, but often
       listed alongside constraints.
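A short sketch of how several of these constraints behave in practice, using Python's built-in sqlite3 module. The Email column and the rule that Age must be at least 18 are hypothetical, added only to have something for UNIQUE and CHECK to act on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE CUSTOMERS (
        ID      INTEGER PRIMARY KEY,        -- PRIMARY KEY
        Name    TEXT NOT NULL,              -- NOT NULL
        Email   TEXT UNIQUE,                -- UNIQUE (hypothetical column)
        Age     INTEGER CHECK (Age >= 18),  -- CHECK (hypothetical rule)
        Country TEXT DEFAULT 'India'        -- DEFAULT
    )
""")
conn.execute(
    "INSERT INTO CUSTOMERS (ID, Name, Email, Age) "
    "VALUES (1, 'Ramesh', 'r@example.com', 32)")

def rejected(sql):
    """Return True if the database refuses the statement."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

results = {
    "NOT NULL": rejected(
        "INSERT INTO CUSTOMERS (ID, Name) VALUES (2, NULL)"),
    "UNIQUE": rejected(
        "INSERT INTO CUSTOMERS (ID, Name, Email) "
        "VALUES (3, 'Mukesh', 'r@example.com')"),
    "PRIMARY KEY": rejected(
        "INSERT INTO CUSTOMERS (ID, Name) VALUES (1, 'Sumit')"),
    "CHECK": rejected(
        "INSERT INTO CUSTOMERS (ID, Name, Age) VALUES (4, 'Komal', 12)"),
}

# Country was not supplied for row 1, so the DEFAULT value was used.
default_country = conn.execute(
    "SELECT Country FROM CUSTOMERS WHERE ID = 1").fetchone()[0]

print(results)
print(default_country)
```

Each violating statement is rejected as a whole, so the table still contains only the first, valid row afterwards.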

Data Integrity
The following categories of data integrity exist with each RDBMS −

• Entity Integrity − Ensures that there are no duplicate rows
  in a table.
• Domain Integrity − Enforces valid entries for a given column
  by restricting the type, the format, or the range of values.
• Referential Integrity − Rows that are referenced by other
  records cannot be deleted.
• User-Defined Integrity − Enforces specific business rules
  that do not fall into entity, domain or referential integrity.
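Referential integrity in particular is easy to see in action. The sketch below uses Python's built-in sqlite3 module; the two-column ORDERS table is a hypothetical stand-in for "other records" that reference a customer. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`, which is specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE CUSTOMERS (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE ORDERS (
        ORDER_ID INTEGER PRIMARY KEY,
        CUST_ID  INTEGER NOT NULL REFERENCES CUSTOMERS (ID)
    )
""")
conn.execute("INSERT INTO CUSTOMERS VALUES (1, 'Ramesh')")
conn.execute("INSERT INTO ORDERS VALUES (100, 1)")

def rejected(sql):
    """Return True if the database refuses the statement."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

# A customer that still has orders cannot be deleted ...
delete_blocked = rejected("DELETE FROM CUSTOMERS WHERE ID = 1")
# ... and an order cannot point at a customer that does not exist.
orphan_blocked = rejected("INSERT INTO ORDERS VALUES (101, 99)")

print(delete_blocked, orphan_blocked)  # True True
```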

Database Normalization
Database normalization is the process of efficiently organizing
data in a database. The normalization process has two goals −

• Eliminating redundant data, for example, the same data
  stored in more than one table.
• Ensuring data dependencies make sense.

Both are worthy goals, as they reduce the amount of space a
database consumes and ensure that data is logically stored.
Normalization consists of a series of guidelines that help
you create a good database structure.

Normalization guidelines are divided into normal forms; think of a
form as the format or the way a database structure is laid out.
The aim of normal forms is to organize the database structure so
that it complies with the rules of first normal form, then second
normal form and finally third normal form.

You can choose to go further, to Fourth Normal Form, Fifth Normal
Form and so on, but in general Third Normal Form is sufficient
for a typical database application.

• First Normal Form (1NF)
• Second Normal Form (2NF)
• Third Normal Form (3NF)

First Normal Form (1NF)

Database normalization is the process of efficiently organizing
data in a database to eliminate redundant data and to ensure
that data dependencies make sense. The various normal forms are
used to eliminate or reduce data redundancy in database tables.

What is First Normal Form (1NF)?

First Normal Form (1NF) sets the basic rules for organizing the data
in a database. A table is said to be in first normal form if it
satisfies the following conditions −

• Rule 1 (Atomic Values) − Every column of a table should
  contain only atomic values. An atomic value is a value that
  cannot be divided further.
• Rule 2 (No Repeating Groups) − There are no repeating groups
  of data. This means a table should not contain repeating
  columns.

While designing your database tables, make sure they comply with
at least the First Normal Form; otherwise you will run into
serious problems during database operations.

Rule 1 - Atomic Values

Every column of a table should contain only atomic values. An
atomic value is a value that cannot be divided further.

Consider the following CUSTOMERS table, which is used to store
customer data −

ID  Name     Age  Salary   City              Country

1   Ramesh   32   2000.00  Hyderabad, Delhi  India
2   Mukesh   40   5000.00  New York          USA
3   Sumit    45   4500.00  Muscat            Oman
4   Kaushik  25   2500.00  Kolkata           India

This table is not in first normal form because the City column can
contain multiple values. For example, the first row includes the
values "Hyderabad" and "Delhi".

To bring this table to first normal form, we have to address the
real problem: a customer can stay in several cities, which may be
in the same or in different countries. So we split the table into
two separate tables as below −

CUSTOMERS Table

ID  Name     Age  Salary

1   Ramesh   32   2000.00
2   Mukesh   40   5000.00
3   Sumit    45   4500.00
4   Kaushik  25   2500.00

CUSTOMERS_ADDRESS Table

ID  City       Country

1   Hyderabad  India
1   Delhi      India
2   New York   USA
3   Muscat     Oman
4   Kolkata    India
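The split shown above could be scripted roughly as follows. This is a sketch using Python's built-in sqlite3 module; CUSTOMERS_RAW is a hypothetical name for the original, non-atomic table, only a few columns are kept for brevity, and the composite key on CUSTOMERS_ADDRESS is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical starting point: City holds a comma-separated list.
conn.execute("CREATE TABLE CUSTOMERS_RAW (ID INTEGER, Name TEXT, City TEXT, Country TEXT)")
conn.executemany("INSERT INTO CUSTOMERS_RAW VALUES (?, ?, ?, ?)", [
    (1, "Ramesh", "Hyderabad, Delhi", "India"),
    (2, "Mukesh", "New York", "USA"),
])

# Target tables in first normal form.
conn.execute("CREATE TABLE CUSTOMERS (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE CUSTOMERS_ADDRESS (
        ID      INTEGER NOT NULL,
        City    TEXT NOT NULL,
        Country TEXT NOT NULL,
        PRIMARY KEY (ID, City)   -- assumed key: one row per customer and city
    )
""")

raw_rows = conn.execute(
    "SELECT ID, Name, City, Country FROM CUSTOMERS_RAW").fetchall()
for cust_id, name, cities, country in raw_rows:
    conn.execute("INSERT INTO CUSTOMERS VALUES (?, ?)", (cust_id, name))
    # One address row per city, so every stored value is atomic.
    for city in cities.split(","):
        conn.execute("INSERT INTO CUSTOMERS_ADDRESS VALUES (?, ?, ?)",
                     (cust_id, city.strip(), country))

addresses = conn.execute(
    "SELECT ID, City, Country FROM CUSTOMERS_ADDRESS ORDER BY ID, City").fetchall()
print(addresses)
```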

Rule 2 - No Repeating Groups

There are no repeating groups of data. This means a table should
not contain repeating columns.

Consider the following CUSTOMERS table, which is used to store
customer data −

ID  Name     Age  Salary   City1      City2  Country

1   Ramesh   32   2000.00  Hyderabad  Delhi  India
2   Mukesh   40   5000.00  New York          USA
3   Sumit    45   4500.00  Muscat            Oman
4   Kaushik  25   2500.00  Kolkata           India

This table is not in first normal form because the City column is
repeated (City1 and City2), which causes problems. The table
always reserves space for two cities, whether the person stays in
two cities or not, and it cannot hold a third city at all.

To eliminate the repeating columns and bring the table to first
normal form, separate the table into two tables and move the
repeating columns into one of them as below −

CUSTOMERS Table

ID  Name     Age  Salary

1   Ramesh   32   2000.00
2   Mukesh   40   5000.00
3   Sumit    45   4500.00
4   Kaushik  25   2500.00

CUSTOMERS_ADDRESS Table

ID  City       Country

1   Hyderabad  India
1   Delhi      India
2   New York   USA
3   Muscat     Oman
4   Kolkata    India

We now have normalized tables that meet the requirements of
First Normal Form, and we can assign multiple cities to the same
customer without wasting space.
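One way to move repeating columns into their own table is a UNION ALL over each repeated column. The sketch below uses Python's built-in sqlite3 module; CUSTOMERS_WIDE is a hypothetical name for the table with City1 and City2, and the empty City2 cells are assumed to be NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical wide table with a repeating group (City1, City2).
conn.execute("""
    CREATE TABLE CUSTOMERS_WIDE (
        ID INTEGER PRIMARY KEY, Name TEXT,
        City1 TEXT, City2 TEXT, Country TEXT
    )
""")
conn.executemany("INSERT INTO CUSTOMERS_WIDE VALUES (?, ?, ?, ?, ?)", [
    (1, "Ramesh", "Hyderabad", "Delhi", "India"),
    (2, "Mukesh", "New York", None, "USA"),
    (3, "Sumit", "Muscat", None, "Oman"),
])

conn.execute("""
    CREATE TABLE CUSTOMERS_ADDRESS (
        ID INTEGER NOT NULL, City TEXT NOT NULL, Country TEXT NOT NULL
    )
""")

# Turn each repeated column into rows; unused (NULL) slots produce no row.
conn.execute("""
    INSERT INTO CUSTOMERS_ADDRESS (ID, City, Country)
    SELECT ID, City1, Country FROM CUSTOMERS_WIDE WHERE City1 IS NOT NULL
    UNION ALL
    SELECT ID, City2, Country FROM CUSTOMERS_WIDE WHERE City2 IS NOT NULL
""")

cities_per_customer = dict(conn.execute(
    "SELECT ID, COUNT(*) FROM CUSTOMERS_ADDRESS GROUP BY ID"))
print(cities_per_customer)  # {1: 2, 2: 1, 3: 1}
```

A third city for any customer is then just another row in CUSTOMERS_ADDRESS, with no change to the table structure.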

Second Normal Form (2NF)

A table is in Second Normal Form when it meets all the rules for
1NF and no non-key column has a partial dependency on the primary
key, that is, depends on only part of a composite key.

Consider a customer-order relation in which you want to store the
customer ID, customer name, order ID, order detail and the date
of purchase −

CREATE TABLE CUSTOMERS(
    CUST_ID       INT           NOT NULL,
    CUST_NAME     VARCHAR (20)  NOT NULL,
    ORDER_ID      INT           NOT NULL,
    ORDER_DETAIL  VARCHAR (20)  NOT NULL,
    SALE_DATE     DATETIME,
    PRIMARY KEY (CUST_ID, ORDER_ID)
);

This table is in first normal form because it obeys all the 1NF
rules. Its primary key is the combination of CUST_ID and ORDER_ID,
which is unique on the assumption that the same customer would
hardly order the same thing twice.

However, the table is not in second normal form, because some
non-key columns depend on only part of that composite key.
CUST_NAME depends on CUST_ID alone; there is no real link between
a customer's name and what was purchased. Likewise, ORDER_DETAIL
depends on ORDER_ID alone, not on CUST_ID. Only SALE_DATE, which
records when a given customer made a given purchase, depends on
the whole key.

To make this table comply with the second normal form, you
need to separate the columns into three tables.

First, create a table to store the customer details as shown in the
code block below −

CREATE TABLE CUSTOMERS(
    CUST_ID    INT           NOT NULL,
    CUST_NAME  VARCHAR (20)  NOT NULL,
    PRIMARY KEY (CUST_ID)
);

The next step is to create a table to store the details of each
order −

CREATE TABLE ORDERS(
    ORDER_ID      INT           NOT NULL,
    ORDER_DETAIL  VARCHAR (20)  NOT NULL,
    PRIMARY KEY (ORDER_ID)
);

Finally, create a third table that stores the CUST_ID and the
ORDER_ID, together with the SALE_DATE, to keep track of all the
orders for a customer −

CREATE TABLE CUSTOMERORDERS(
    CUST_ID    INT  NOT NULL,
    ORDER_ID   INT  NOT NULL,
    SALE_DATE  DATETIME,
    PRIMARY KEY (CUST_ID, ORDER_ID)
);

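
To see how the three tables work together, the sketch below loads them into an in-memory SQLite database through Python's built-in sqlite3 module and joins them back into one report. The link table is spelled CUSTOMERORDERS here, and the sample names, order details and dates are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMERS(
        CUST_ID INT NOT NULL, CUST_NAME VARCHAR (20) NOT NULL,
        PRIMARY KEY (CUST_ID));
    CREATE TABLE ORDERS(
        ORDER_ID INT NOT NULL, ORDER_DETAIL VARCHAR (20) NOT NULL,
        PRIMARY KEY (ORDER_ID));
    CREATE TABLE CUSTOMERORDERS(
        CUST_ID INT NOT NULL, ORDER_ID INT NOT NULL, SALE_DATE DATETIME,
        PRIMARY KEY (CUST_ID, ORDER_ID));

    -- Invented sample data.
    INSERT INTO CUSTOMERS VALUES (1, 'Ramesh'), (2, 'Mukesh');
    INSERT INTO ORDERS VALUES (10, 'Keyboard'), (11, 'Monitor');
    INSERT INTO CUSTOMERORDERS VALUES
        (1, 10, '2024-01-05'), (1, 11, '2024-01-09'), (2, 10, '2024-02-01');
""")

# The name is stored once, so a single-row update fixes it everywhere.
conn.execute("UPDATE CUSTOMERS SET CUST_NAME = 'Ramesh K' WHERE CUST_ID = 1")

report = conn.execute("""
    SELECT c.CUST_NAME, o.ORDER_DETAIL, co.SALE_DATE
    FROM CUSTOMERORDERS AS co
    JOIN CUSTOMERS AS c ON c.CUST_ID = co.CUST_ID
    JOIN ORDERS AS o ON o.ORDER_ID = co.ORDER_ID
    ORDER BY co.CUST_ID, co.ORDER_ID
""").fetchall()

for line in report:
    print(line)
```

In the original single-table design the same rename would have had to touch every order row belonging to that customer.
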
Third Normal Form (3NF)

A table is in third normal form when the following conditions are
met −

• It is in second normal form.
• Every non-key field depends directly on the primary key, and
  not on another non-key field.

A dependency between non-key fields violates the second
condition. For example, in the following table the street name,
city and state are treated as bound to the zip code rather than
to CUST_ID.

CREATE TABLE CUSTOMERS(
    CUST_ID    INT           NOT NULL,
    CUST_NAME  VARCHAR (20)  NOT NULL,
    DOB        DATE,
    STREET     VARCHAR(200),
    CITY       VARCHAR(100),
    STATE      VARCHAR(100),
    ZIP        VARCHAR(12),
    EMAIL_ID   VARCHAR(256),
    PRIMARY KEY (CUST_ID)
);

The dependency between the zip code and the address is called
a transitive dependency. To comply with the third normal form,
move the Street, City and State fields into their own table keyed
by zip code, called ADDRESS here −

CREATE TABLE ADDRESS(
    ZIP     VARCHAR(12),
    STREET  VARCHAR(200),
    CITY    VARCHAR(100),
    STATE   VARCHAR(100),
    PRIMARY KEY (ZIP)
);

The next step is to redefine the CUSTOMERS table so that it keeps
only the ZIP column, as shown below −

CREATE TABLE CUSTOMERS(
    CUST_ID    INT           NOT NULL,
    CUST_NAME  VARCHAR (20)  NOT NULL,
    DOB        DATE,
    ZIP        VARCHAR(12),
    EMAIL_ID   VARCHAR(256),
    PRIMARY KEY (CUST_ID)
);

The advantages of removing transitive dependencies are mainly
two-fold. First, the amount of data duplication is reduced and
therefore your database becomes smaller.

The second advantage is data integrity. When duplicated data
changes, there is a big risk of updating only some of the copies,
especially if they are spread out over many different places in
the database.

For example, if the address and the zip code data were stored in
three or four different tables, then any change to a zip code
would need to ripple out to every record in those three or four
tables.
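
The update-once benefit can be sketched with the ADDRESS and CUSTOMERS tables defined above, run here through Python's built-in sqlite3 module. The zip code, street, city and customer values are invented placeholders, not real address data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ADDRESS(
        ZIP VARCHAR(12), STREET VARCHAR(200),
        CITY VARCHAR(100), STATE VARCHAR(100),
        PRIMARY KEY (ZIP));
    CREATE TABLE CUSTOMERS(
        CUST_ID INT NOT NULL, CUST_NAME VARCHAR (20) NOT NULL,
        DOB DATE, ZIP VARCHAR(12), EMAIL_ID VARCHAR(256),
        PRIMARY KEY (CUST_ID));

    -- Invented placeholder data: two customers share one zip code.
    INSERT INTO ADDRESS VALUES ('00001', 'Example Street', 'Old City', 'Sample State');
    INSERT INTO CUSTOMERS (CUST_ID, CUST_NAME, ZIP)
        VALUES (1, 'Ramesh', '00001'), (2, 'Komal', '00001');
""")

# The city is stored in exactly one row, so one UPDATE is enough.
rows_changed = conn.execute(
    "UPDATE ADDRESS SET CITY = 'New City' WHERE ZIP = '00001'").rowcount

customer_cities = conn.execute("""
    SELECT c.CUST_NAME, a.CITY
    FROM CUSTOMERS AS c
    JOIN ADDRESS AS a ON a.ZIP = c.ZIP
    ORDER BY c.CUST_ID
""").fetchall()

print(rows_changed)
print(customer_cities)
```

Both customers see the new city after a single-row change, which is the integrity benefit described above.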
