SQL

The SQL programming language was developed in the 1970s by IBM researchers
Raymond Boyce and Donald Chamberlin. The language, known then as SEQUEL, was
created following Edgar Frank Codd’s 1970 paper, “A Relational Model of Data
for Large Shared Data Banks.”
SQL (Structured Query Language) is a programming language for storing, managing,
manipulating, and processing data in relational databases. It was standardized by
the American National Standards Institute (ANSI) in 1986.
Online transaction processing (OLTP) vs. online analytical processing (OLAP): Both
OLTP and OLAP systems store and process data in large volumes, but they serve
different purposes.
The primary purpose of OLAP is to analyze aggregated data, while the primary
purpose of OLTP is to process database transactions.
SQL Commands: DDL, DML, DCL, TCL, DQL

DDL: DDL stands for Data Definition Language. These are the statements used to create
and maintain tables. The commands are CREATE, DROP, ALTER, TRUNCATE, and RENAME.

DML: DML stands for Data Manipulation Language. These are the statements that populate
and change the data in your tables: INSERT, UPDATE, and DELETE.

DCL: DCL stands for Data Control Language. DCL controls who can access the stored
data (GRANT/REVOKE).

TCL: TCL stands for Transaction Control Language. TCL statements manage the changes
made by DML statements by grouping them into transactions. The TCL commands in SQL
are COMMIT and ROLLBACK.

DQL: DQL stands for Data Query Language. DQL commands fetch data from a relational
database and perform read-only queries. The only command is SELECT.
Proc SQL Syntax
PROC SQL;
CREATE TABLE tablename AS
SELECT column(s)
FROM table(s) | view(s)
WHERE expression
GROUP BY column(s)
HAVING expression
ORDER BY column(s);
QUIT;

SQL Syntax
CREATE TABLE tablename AS
SELECT column(s)
FROM table(s) | view(s)
WHERE expression
GROUP BY column(s)
HAVING expression
ORDER BY column(s);
Easy way to remember syntax order in SQL
Create table: The SQL CREATE TABLE statement allows you to create and define a table.

Syntax: create table table_name

Select: The SELECT statement retrieves records from one or more tables in your database.
You can select specific columns (listed with commas between them) or all columns (*).

Syntax: select column1, column2, ... or select *

From: The SQL FROM clause lists the tables, and any joins, that the SQL statement reads from.
The tables named here must already exist.

Syntax: from table_name

Where: The SQL WHERE clause is used to filter the results and apply conditions in a SELECT, INSERT, UPDATE, or
DELETE statement.

Syntax: Where condition

Group BY: The SQL GROUP BY clause can be used in a SELECT statement to collect data across multiple records
and group the results by one or more columns.

Syntax: group by column1, column2, ..., column n


HAVING: The SQL HAVING clause is used in combination with the GROUP BY
clause to restrict the returned groups to only those for which the condition is
TRUE.

Syntax: having condition

ORDER BY: The SQL ORDER BY clause is used to sort the records in the result
set for a SELECT statement.

Syntax: order by column1, column2,...,column n [ASC | DESC]
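
The clause order above can be seen working in a single query. The following is a minimal sketch, assuming Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for any SQL engine; the sales table and its rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('East', 100), ('East', 250), ('West', 40),
        ('West', 30), ('North', 500), ('North', -5);
""")

rows = conn.execute("""
    SELECT region, SUM(amount) AS total   -- columns to return
    FROM sales                            -- source table
    WHERE amount > 0                      -- filter rows before grouping
    GROUP BY region                       -- one output row per region
    HAVING SUM(amount) > 100              -- filter groups after aggregation
    ORDER BY total DESC                   -- sort the final result
""").fetchall()

print(rows)
```

WHERE drops the negative row before grouping, and HAVING then drops the West group because its total is not above 100.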


Data Types

Data Type | Category | Description
NUMBER | Numeric | Used to store numeric values with high precision
FLOAT | Numeric | Used to store approximate numeric values
CHAR | Character | Used to store fixed-length character strings
VARCHAR2 | Character | Used to store variable-length character strings
DATE | Date and Time | Used to store date and time information
TIMESTAMP | Date and Time | Used to store more precise date and time information

Data types in Proc SQL are the same as in the SAS data step (numeric and character).


Different ways to create a table in SQL & Proc SQL
1. Create a table with the CREATE TABLE statement
Syntax:
CREATE TABLE table_name(
column1 datatype [ NULL | NOT NULL ],
column2 datatype [ NULL | NOT NULL ],
...
);
2. Create an empty table based on the definition of another table. Where it is
supported (for example in MySQL), CREATE TABLE ... LIKE copies the column attributes
and indexes defined in the original table. CREATE TABLE ... AS SELECT with a condition
that is never true copies only the columns.
Syntax:
CREATE TABLE new_tbl LIKE orig_tbl;
SQL: CREATE TABLE new_tbl AS SELECT * FROM orig_tbl WHERE 1=0;
SAS: PROC SQL INOBS=0; CREATE TABLE new_tbl AS SELECT * FROM orig_tbl; QUIT;
3. Create a table from an existing table with a SELECT statement and FROM clause,
with or without data
Syntax:
CREATE TABLE new_tbl AS SELECT * FROM orig_tbl;
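
As a rough illustration of creating a table from an existing one, the sketch below makes one empty copy and one full copy. It assumes Python's built-in sqlite3 module and SQLite's CREATE TABLE ... AS SELECT form; orig_tbl, empty_copy, and full_copy are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orig_tbl (id INTEGER NOT NULL, name TEXT);
    INSERT INTO orig_tbl VALUES (1, 'a'), (2, 'b');

    -- A condition that is never true copies the columns but no rows.
    CREATE TABLE empty_copy AS SELECT * FROM orig_tbl WHERE 1 = 0;

    -- Without the condition, the rows are copied as well.
    CREATE TABLE full_copy AS SELECT * FROM orig_tbl;
""")

# PRAGMA table_info is SQLite-specific; the second field of each row is the column name.
empty_cols = [row[1] for row in conn.execute("PRAGMA table_info(empty_copy)")]
empty_rows = conn.execute("SELECT COUNT(*) FROM empty_copy").fetchone()[0]
full_rows = conn.execute("SELECT COUNT(*) FROM full_copy").fetchone()[0]
```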
INSERT Statement
The SQL INSERT statement is used to insert one or more records into a table. There are 2 syntaxes
for the INSERT statement, depending on whether you are inserting one record (VALUES) or the records returned by a query (SELECT).
Syntax:
INSERT INTO table (column1, column2, ... )
VALUES (expression1, expression2, ... );
Or
INSERT INTO table (column1, column2, ... )
SELECT expression1, expression2, ...
FROM source_tables
[WHERE conditions];
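
Both INSERT forms can be tried side by side. This is a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory SQLite database; the staff and staff_archive tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (id INTEGER, name TEXT, dept TEXT);
    CREATE TABLE staff_archive (id INTEGER, name TEXT);
""")

# Form 1: insert a single record with VALUES.
conn.execute("INSERT INTO staff (id, name, dept) VALUES (1, 'Asha', 'IT')")
conn.execute("INSERT INTO staff (id, name, dept) VALUES (2, 'Ben', 'HR')")

# Form 2: insert every row returned by a SELECT.
conn.execute("""
    INSERT INTO staff_archive (id, name)
    SELECT id, name
    FROM staff
    WHERE dept = 'IT'
""")

archived = conn.execute("SELECT id, name FROM staff_archive").fetchall()
```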
UPDATE Statement

The SQL UPDATE statement is used to update existing records in a table. There are 2 syntaxes for the
UPDATE statement, depending on whether the new values come from expressions or from another table.
Syntax:
UPDATE table
SET column1 = expression1, column2 = expression2,...
[WHERE conditions];
Or
UPDATE table1
SET column1 = (SELECT expression1 FROM table2 WHERE conditions)
[WHERE conditions];
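
The two UPDATE forms can be sketched as follows, assuming Python's built-in sqlite3 module; products and new_prices are hypothetical tables. In the second form the outer WHERE matters: without it, rows with no match in the other table would be set to NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER, price REAL);
    CREATE TABLE new_prices (id INTEGER, price REAL);
    INSERT INTO products VALUES (1, 10.0), (2, 20.0), (3, 30.0);
    INSERT INTO new_prices VALUES (2, 25.0);
""")

# Form 1: set a column from an expression on the same table.
conn.execute("UPDATE products SET price = price * 2 WHERE id = 1")

# Form 2: set a column from another table with a correlated subquery.
# The outer WHERE limits the update to rows that have a match.
conn.execute("""
    UPDATE products
    SET price = (SELECT n.price FROM new_prices n WHERE n.id = products.id)
    WHERE id IN (SELECT id FROM new_prices)
""")

result = conn.execute("SELECT id, price FROM products ORDER BY id").fetchall()
```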
SQL CASE Expression
The CASE expression goes through conditions and returns a value when the first condition is met (like an if-
then-else statement). So, once a condition is true, it will stop reading and return the result. If no conditions
are true, it returns the value in the ELSE clause.
If there is no ELSE part and no conditions are true, it returns NULL.
Syntax:
CASE
WHEN condition1 THEN result1
WHEN condition2 THEN result2
WHEN conditionN THEN resultN
ELSE result
END AS column_name;
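
The "first condition that is met wins" rule and the NULL result without ELSE can be checked with a small sketch. It assumes Python's built-in sqlite3 module; the scores table and the grade labels are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scores (name TEXT, score INTEGER);
    INSERT INTO scores VALUES ('a', 95), ('b', 72), ('c', 40);
""")

graded = conn.execute("""
    SELECT name,
           CASE
               WHEN score >= 50 THEN 'pass'
               WHEN score >= 90 THEN 'distinction'  -- never reached: the first match wins
               ELSE 'fail'
           END AS grade,
           CASE WHEN score >= 90 THEN 'top' END AS note  -- no ELSE, so NULL when unmatched
    FROM scores
    ORDER BY name
""").fetchall()
```

Because the conditions are read in order, the more specific test (score >= 90) would have to come first for 'distinction' to ever be returned.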
SELECT LIMIT Statement
The SQL SELECT LIMIT statement is used to retrieve records from one or more
tables in a database and limit the number of records returned based on a limit value.
Syntax:
SELECT expressions
FROM tables
[WHERE conditions]
[ORDER BY expression [ ASC | DESC ]]
LIMIT number_rows [ OFFSET offset_value ];
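
One common use of LIMIT with OFFSET is picking the n-th value in a sorted list. The sketch below, assuming Python's built-in sqlite3 module and a hypothetical emp table, finds the third highest distinct salary. LIMIT is not available in every database, so treat this as SQLite/MySQL/PostgreSQL-style syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (name TEXT, salary INTEGER);
    INSERT INTO emp VALUES
        ('a', 100), ('b', 90), ('c', 90), ('d', 80), ('e', 70);
""")

# Sort the distinct salaries from highest to lowest, skip the first two,
# and keep one row: the third highest distinct salary.
third_highest = conn.execute("""
    SELECT DISTINCT salary
    FROM emp
    ORDER BY salary DESC
    LIMIT 1 OFFSET 2
""").fetchone()[0]
```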
DISTINCT Clause
The SQL DISTINCT clause is used to remove duplicates from the result set of a
SELECT statement.
Syntax:
SELECT DISTINCT expressions
FROM tables
[WHERE conditions];
DELETE Statement
The SQL DELETE statement is used to delete one or more records from a table.
Syntax:
DELETE FROM table
[WHERE conditions];
TRUNCATE TABLE Statement
The SQL TRUNCATE TABLE statement is used to remove all records from a
table. It performs the same function as a DELETE statement without a WHERE
clause.
Syntax:
TRUNCATE TABLE table_name;
ALTER TABLE Statement
The SQL ALTER TABLE statement is used to add, modify, or drop/delete columns
in a table. The SQL ALTER TABLE statement is also used to rename a table.
Add column in table
Modify column in table
Drop column in table
Rename column in table
Add constraints
Add column in table
We can add one or more columns with add statement along with alter table statement.
Syntax:
ALTER TABLE table_name
ADD column_name column_definition;
OR
ALTER TABLE table_name
ADD (column_1 column_definition, column_2 column_definition, … column_n
column_definition);
Modify column in table
To modify a column in an existing table, use the SQL ALTER TABLE statement with the MODIFY clause.
Syntax:
ALTER TABLE table_name
MODIFY column_name column_type;
OR
ALTER TABLE table_name
MODIFY (column_1 column_type, column_2 column_type, … column_n
column_type);
Drop column in table
Drop/remove column from a table.
Syntax:
ALTER TABLE table_name
DROP COLUMN column_name;
Rename column in table
Rename a column in an existing table
Syntax:
ALTER TABLE table_name
RENAME COLUMN old_name TO new_name;
*RENAME COLUMN does not work in SAS Proc SQL.
DROP TABLE Statement
The SQL DROP TABLE statement allows you to remove or delete a table from the
SQL database.
Syntax:
DROP TABLE table_name;
UNION & UNION ALL Operators
The SQL UNION operator is used to combine the result sets of 2 or more SELECT statements.
UNION removes duplicate rows from the combined result.
UNION ALL does not remove duplicate rows.
Each SELECT statement within the UNION must have the same number of fields in the result sets
with similar data types.
Syntax:
SELECT expression1, expression2, ... expression_n FROM tables [WHERE conditions]
UNION
SELECT expression1, expression2, ... expression_n FROM tables [WHERE conditions];
INTERSECT Operator
The SQL INTERSECT operator is used to return the results of 2 or more SELECT
statements. However, it only returns the rows selected by all queries or data sets.
If a record exists in one query and not in the other, it will be omitted from the
INTERSECT results.
Syntax:
SELECT expression1, expression2, ... expression_n FROM tables [WHERE
conditions]
INTERSECT
SELECT expression1, expression2, ... expression_n FROM tables [WHERE
conditions];
MINUS Operator
The SQL MINUS operator is used to return all rows in the first SELECT statement that are not
returned by the second SELECT statement. Each SELECT statement will define a dataset. The
MINUS operator will retrieve all records from the first dataset and then remove from the results all
records from the second dataset.
Syntax:
SELECT expression1, expression2, ... expression_n FROM tables [WHERE conditions]
MINUS
SELECT expression1, expression2, ... expression_n FROM tables [WHERE conditions];
Note: MINUS is the Oracle equivalent of the standard EXCEPT operator, and neither keyword is
supported in all SQL databases. EXCEPT can be used in databases such as SQL Server, PostgreSQL,
and SQLite. For databases such as Oracle, use the MINUS operator to perform this type of query.
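
The set operators can be compared on the same pair of tables. This is a minimal sketch, assuming Python's built-in sqlite3 module; because SQLite is used, the difference is written with EXCEPT rather than MINUS. Tables a and b and the run helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (x INTEGER);
    CREATE TABLE b (x INTEGER);
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (2), (3);
""")

def run(operator):
    """Combine the two SELECTs with the given set operator and return sorted values."""
    sql = f"SELECT x FROM a {operator} SELECT x FROM b ORDER BY x"
    return [row[0] for row in conn.execute(sql)]

union = run("UNION")          # duplicates removed
union_all = run("UNION ALL")  # duplicates kept
intersect = run("INTERSECT")  # rows present in both
difference = run("EXCEPT")    # rows in a that are not in b
```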
Constraints
SQL constraints are used to specify rules for the data in a table. Constraints are used
to limit the type of data that can go into a table. This ensures the accuracy and
reliability of the data in the table. If there is any violation between the constraint and
the data action, the action is aborted.
There are 6 types of constraints
1. Not Null
2. Unique
3. Primary key
4. Foreign key
5. Check
6. Default
Not Null
This constraint means that a null value cannot be stored in the column. That is, once a column is
specified as NOT NULL, any attempt to store null in that column is rejected.
Syntax:
CREATE TABLE table_name ( column1 datatype NOT NULL,...);
Or
ALTER TABLE table_name MODIFY Column_name datatype NOT NULL;
In SAS:
ALTER TABLE Table_name ADD Column_name datatype NOT NULL;
Or
ALTER TABLE table_name ADD CONSTRAINT constraint_name NOT NULL(column_name);
Unique
When specified on a column, this constraint requires all the values in the column
to be unique. That is, no value may be repeated across the rows of that column.
Syntax:
CREATE TABLE table_name (Column_name data_type UNIQUE);
Or
ALTER TABLE table_name ADD CONSTRAINT constraint_name UNIQUE
(column_name);
Primary Key
Primary Key: A primary key is a special attribute or field within a database table
that uniquely identifies each record or row in that table. Primary key column
cannot have NULL values.
Primary key syntax
CREATE TABLE table_name ( column1 datatype PRIMARY KEY);
Or
ALTER TABLE table_name ADD CONSTRAINT constraint_name PRIMARY KEY
(column1, column2, ... column_n);
Foreign Key
A foreign key is a column or group of columns in a relational database table that
provides a link between data in two tables. It is a column (or columns) that
references a column (most often the primary key) of another table.
Foreign Key syntax
CREATE TABLE table_name ( column1 datatype
NOT NULL,... CONSTRAINT constraint_name
FOREIGN KEY(Column_name) REFERENCES
reference_tablename(Column_name));
Or
ALTER TABLE table_name ADD CONSTRAINT
constraint_name FOREIGN KEY (Column_names)
REFERENCES
reference_tablename(Column_names);
Check
This constraint helps to validate the values of a column to meet a particular
condition. That is, it helps to ensure that the value stored in a column meets a
specific condition.
Syntax:
CREATE TABLE table_name (Column_name data_type CONSTRAINT
constraint_name CHECK (Condition));
Or
ALTER TABLE table_name ADD CONSTRAINT constraint_name CHECK
(Condition);
Default
This constraint provides a default value for a field. That is, if the user does not
specify a value for the field when entering a new record, the default value is
assigned to it.
Syntax:
CREATE TABLE table_name(column_name datatype DEFAULT default_value);
Or
ALTER TABLE table_name ALTER column_name SET DEFAULT default_value;
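
To see a violating action being aborted, the sketch below defines all six constraint types and then tries one bad statement for each of the first five. It assumes Python's built-in sqlite3 module; the dept and emp tables and the constraint names are hypothetical, and the PRAGMA line is a SQLite-specific switch that turns on foreign key enforcement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when enabled
conn.executescript("""
    CREATE TABLE dept (
        dept_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL UNIQUE
    );
    CREATE TABLE emp (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        salary  INTEGER CONSTRAINT salary_positive CHECK (salary > 0),
        status  TEXT DEFAULT 'active',
        dept_id INTEGER,
        CONSTRAINT fk_emp_dept FOREIGN KEY (dept_id) REFERENCES dept (dept_id)
    );
    INSERT INTO dept VALUES (1, 'IT');
    INSERT INTO emp (emp_id, name, salary, dept_id) VALUES (1, 'Asha', 500, 1);
""")

bad_statements = [
    "INSERT INTO emp (emp_id, name, salary) VALUES (2, NULL, 100)",               # NOT NULL
    "INSERT INTO dept VALUES (2, 'IT')",                                          # UNIQUE
    "INSERT INTO emp (emp_id, name, salary) VALUES (1, 'Ben', 100)",              # PRIMARY KEY
    "INSERT INTO emp (emp_id, name, salary, dept_id) VALUES (3, 'Cy', 100, 99)",  # FOREIGN KEY
    "INSERT INTO emp (emp_id, name, salary) VALUES (4, 'Di', -1)",                # CHECK
]

rejected = 0
for sql in bad_statements:
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        rejected += 1  # the statement was aborted and no row was stored

# DEFAULT filled in the status that the first insert left out.
status = conn.execute("SELECT status FROM emp WHERE emp_id = 1").fetchone()[0]
emp_count = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
```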
Indexes
An index is a performance-tuning method of allowing faster retrieval of records. An
index creates an entry for each value that appears in the indexed columns. Each
index name must be unique in the database.
Syntax:
CREATE [UNIQUE] INDEX index_name
ON table_name (column1, column2, ... column_n);
ALIASES
SQL ALIASES can be used to create a temporary name for columns or tables.
COLUMN ALIASES are used to make column headings in your resultset easier to
read.
TABLE ALIASES are used to shorten your SQL to make it easier to read or when
you are performing a self join (ie: listing the same table more than once in the FROM
clause).
Syntax:
COLUMN: column_name [AS] alias_name
TABLE: table_name [AS] alias_name
SQL JOINS
SQL JOINS are used to retrieve data from multiple tables. A SQL JOIN is
performed whenever two or more tables are listed in a SQL statement.
There are 4 different types of SQL joins:
1. SQL INNER JOIN (sometimes called simple join)
2. SQL LEFT OUTER JOIN (sometimes called LEFT JOIN)
3. SQL RIGHT OUTER JOIN (sometimes called RIGHT JOIN)
4. SQL FULL OUTER JOIN (sometimes called FULL JOIN)
SQL INNER JOIN (sometimes called simple join)
Chances are, you've already written a SQL statement that uses an SQL INNER JOIN. It is the most
common type of SQL join. SQL INNER JOINS return all rows from multiple tables where the join
condition is met.
Syntax:
SELECT columns
FROM table1
INNER JOIN table2
ON table1.column = table2.column
INNER JOIN table3
ON table1.column = table3.column;
SQL LEFT OUTER JOIN (sometimes called LEFT JOIN)
Another type of join is called a LEFT OUTER JOIN. This type of join returns all
rows from the LEFT-hand table specified in the ON condition and only those rows
from the other table where the joined fields are equal (join condition is met).
Syntax:
SELECT columns
FROM table1
LEFT [OUTER] JOIN table2
ON table1.column = table2.column;
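
The difference between an inner join and a left outer join shows up when a row has no match. This is a minimal sketch, assuming Python's built-in sqlite3 module; customers and orders are hypothetical tables, and Bob has no orders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (id INTEGER, customer_id INTEGER, total INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 50), (11, 1, 70);
""")

# INNER JOIN: only rows where the join condition is met.
inner = conn.execute("""
    SELECT c.name, o.total
    FROM customers c
    INNER JOIN orders o ON c.id = o.customer_id
    ORDER BY c.name, o.total
""").fetchall()

# LEFT OUTER JOIN: every customer, with NULL where no order matches.
left = conn.execute("""
    SELECT c.name, o.total
    FROM customers c
    LEFT OUTER JOIN orders o ON c.id = o.customer_id
    ORDER BY c.name, o.total
""").fetchall()
```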
SQL RIGHT OUTER JOIN (sometimes called RIGHT JOIN)

Another type of join is called a SQL RIGHT OUTER JOIN. This type of join returns
all rows from the RIGHT-hand table specified in the ON condition and only those
rows from the other table where the joined fields are equal (join condition is met).
Syntax:
SELECT columns
FROM table1
RIGHT [OUTER] JOIN table2
ON table1.column = table2.column;
SQL FULL OUTER JOIN (sometimes called FULL JOIN)
Another type of join is called a SQL FULL OUTER JOIN. This type of join returns
all rows from the LEFT-hand table and RIGHT-hand table with NULL values in
place where the join condition is not met.
Syntax:
SELECT columns
FROM table1
FULL [OUTER] JOIN table2
ON table1.column = table2.column;
SQL Self Join
A self join is a regular join, but the table is joined with itself.
Syntax:
SELECT column_name(s)
FROM table1 T1, table1 T2
WHERE condition;
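
A typical self join pairs each employee with their manager when the table stores only a manager ID. The sketch below assumes Python's built-in sqlite3 module; the employees table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (emp_id INTEGER, name TEXT, manager_id INTEGER);
    INSERT INTO employees VALUES
        (1, 'Asha', NULL), (2, 'Ben', 1), (3, 'Cy', 1), (4, 'Di', 2);
""")

# The same table appears twice under two aliases:
# e is the employee row, m is the row of that employee's manager.
pairs = conn.execute("""
    SELECT e.name AS employee, m.name AS manager
    FROM employees e, employees m
    WHERE e.manager_id = m.emp_id
    ORDER BY e.name
""").fetchall()
```

Asha has no manager, so she does not appear as an employee in the result; a left join would be needed to keep her.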
Subqueries
A subquery is a query within a query. You can create subqueries within your SQL
statements. These subqueries can reside in the WHERE clause, the FROM clause, or the SELECT clause.
EXISTS Condition
The SQL EXISTS condition is used in combination with a subquery and is
considered to be met, if the subquery returns at least one row. It can be used in a
SELECT, UPDATE, or DELETE statement.
Syntax:
WHERE EXISTS ( subquery );
IN Condition
Most often, the subquery will be found in the WHERE clause. These subqueries
are also called nested subqueries.
Syntax:
SELECT *
FROM all_tables tabs
WHERE tabs.column_name IN (SELECT cols.column_name
FROM all_tab_columns cols
WHERE cols.column_name = Column_value);
From Clause Subquery
A subquery can also be found in the FROM clause
Syntax:
FROM table_name,
(subquery) subquery1
WHERE subquery1.column_name = table_name.column_name;
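
The three subquery placements above can be compared on the same data. This is a minimal sketch, assuming Python's built-in sqlite3 module; customers and orders are hypothetical tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (customer_id INTEGER, total INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cat');
    INSERT INTO orders VALUES (1, 50), (1, 70), (3, 20);
""")

# EXISTS: true for a customer if the subquery returns at least one row.
via_exists = conn.execute("""
    SELECT name FROM customers c
    WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)
    ORDER BY name
""").fetchall()

# IN: the subquery supplies the list of values to match.
via_in = conn.execute("""
    SELECT name FROM customers
    WHERE id IN (SELECT customer_id FROM orders)
    ORDER BY name
""").fetchall()

# FROM clause subquery: the aggregated result is used like a table.
totals = conn.execute("""
    SELECT c.name, t.total
    FROM customers c,
         (SELECT customer_id, SUM(total) AS total
          FROM orders
          GROUP BY customer_id) t
    WHERE t.customer_id = c.id
    ORDER BY c.name
""").fetchall()
```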
SQL Functions
Aggregate functions: Count, Sum, Average,Min and Max
Mathematical functions: Sign, Absolute, Ceil, Floor, Truncate and Modulo
String functions: Length, Lower, Upper, Concat and Trim
Date and Time functions: Date, Time, Extract
Window functions: Rank and Nth value
Miscellaneous functions: Coalesce
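
A few of these functions can be tried directly. The sketch below assumes Python's built-in sqlite3 module and uses SQLite's spellings (for example the % operator for modulo); function names differ between databases, so check the reference for the one you use. Table t is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (v INTEGER);
    INSERT INTO t VALUES (1), (2), (3);
""")

# Aggregate functions collapse many rows into one value each.
aggregates = conn.execute(
    "SELECT COUNT(*), SUM(v), AVG(v), MIN(v), MAX(v) FROM t"
).fetchone()

# Scalar functions work on one value at a time.
scalars = conn.execute("""
    SELECT ABS(-5),
           7 % 3,
           UPPER('sql'),
           LENGTH('sql'),
           TRIM('  x  '),
           COALESCE(NULL, 'fallback')
""").fetchone()
```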

Reference link for available functions in Oracle:


https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/Functions.html#GUID-D079EFD3-C683-441F-977E-2C9503089982
Connecting External Databases in SAS
Most of the time while working with SAS, we need to pull in data from an
external database or push results back into external databases such as DB2,
ORACLE, Teradata, etc.
SAS has two methods for this:
1. Explicit: SQL Pass through
2. Implicit: LIBNAME
Explicit: SQL Pass through Method
With this method SAS makes an explicit connection to the external database,
meaning we write SQL code that is native to that specific database. SAS just
hands the query over to that database, and the code is executed on the database
engine. After the query finishes, the results are handed back to SAS.
Syntax: For queries that return rows, such as SELECT (optionally wrapped in CREATE TABLE ... AS)
PROC SQL;
CONNECT TO TERADATA(USER= PASSWORD= SERVER/PATH=
DATABASE/SCHEMA= );
CREATE TABLE table_name AS
SELECT * FROM CONNECTION TO TERADATA
(SQL Query);
DISCONNECT FROM TERADATA;
QUIT;
Syntax: For statements that do not return rows, such as DROP, UPDATE, etc.

PROC SQL;
CONNECT TO TERADATA(USER= PASSWORD= SERVER= DATABASE= );
EXECUTE(SQL Query) BY TERADATA;
DISCONNECT FROM TERADATA;
QUIT;
Implicit: LIBNAME Method
With this method SAS makes an implicit connection to the external database, meaning
we use SAS syntax to operate on external database tables. SAS optimizes
the code and converts it into a database-specific SQL query, which is handed over
to the database engine.
The external database engine then executes the query and hands the results
back to the SAS engine.
Syntax: LIBNAME Method
LIBNAME tera TERADATA SERVER=XXXX USER=XXXX PWD=XXXXXX DATABASE=XXXX;
Commit and Rollback

Commit: Use the COMMIT statement to make the changes done in the current
transaction permanent.


Rollback: Use the ROLLBACK statement to undo work done in the current
transaction or to manually undo the work done by an in-doubt distributed
transaction.

Many databases and client tools enable autocommit by default, in which case each
statement is committed as soon as it succeeds.
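
The effect of COMMIT and ROLLBACK can be sketched with Python's built-in sqlite3 module, which is an assumption made here for illustration. In its default mode this module opens a transaction before a data-changing statement, so nothing is permanent until commit() is called. The accounts table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 100)")
conn.commit()  # COMMIT: the inserted row is now permanent

conn.execute("UPDATE accounts SET balance = 0 WHERE id = 1")
conn.rollback()  # ROLLBACK: the uncommitted update is undone
after_rollback = conn.execute("SELECT balance FROM accounts").fetchone()[0]

conn.execute("UPDATE accounts SET balance = 150 WHERE id = 1")
conn.commit()  # this update is kept
after_commit = conn.execute("SELECT balance FROM accounts").fetchone()[0]
```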


Comments
In SQL, you can comment your code just like any other language. Comments can
appear on a single line or span across multiple lines. Let's explore how to
comment your SQL statements.
There are two syntaxes that you can use to create a comment in SQL.
Syntax:
-- comment text (single line)
OR
/* comment text (can span multiple lines) */
Interview Questions
1. Which is faster: Data Step or Proc SQL?
2. How do you remove duplicates using PROC SQL?
3. How do you perform a NODUPKEY kind of operation with PROC SQL?
4. How do you count unique values by a grouping variable?
5. How do you select random samples with PROC SQL?
6. How do you assign an incremental value by group in PROC SQL?
7. We have an employee table that holds all employee details but only the manager ID.
How do we get manager details along with employee information?
8. How do you get the third highest salary among employees?
9. Joins
10. Explicit Vs Implicit
