Reda Rafi: Data Engineer
SQL
A database is:
data
SQL | REDA RAFI | 18-MAI-2023 Company Confidential © Capgemini 2023. All rights reserved | 3
SQL is the language you use to
ask the questions.
TABLE NAME:
CONTACTS
Summary
SQL is a powerful declarative language that you can grasp by understanding a few basic concepts
Understanding Basic SQL Syntax
[Figure: anatomy of a SQL statement. A SELECT CLAUSE and a FROM CLAUSE, each made up of a KEYWORD followed by an IDENTIFIER, with a semicolon (;) ending the statement.]
BASIC SQL COMMANDS
id first_name last_name
1 Jon Flanders
SELECT
BASIC SQL COMMANDS
id first_name last_name
1 Jon Flanders
UPDATE
BASIC SQL COMMANDS
id first_name last_name
1 Jon Flanders
2 Fritz Onion
DELETE
EXAMPLE QUESTIONS
Querying Data with the SELECT Statement
A SIMPLE QUERY WITH SELECT
SELECT 'Hello', 'World';
SELECT CLAUSE: the SELECT keyword followed by the SELECT LIST ('Hello', 'World')
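The simple query above can be tried without creating any table. This is a minimal sketch that assumes Python's built-in sqlite3 module and an in-memory SQLite database; neither is named in the slides.

```python
import sqlite3

# In-memory SQLite database (an assumption; any SQL engine would do).
conn = sqlite3.connect(":memory:")

# A SELECT list made only of literals needs no FROM clause.
row = conn.execute("SELECT 'Hello', 'World';").fetchone()
print(row)
conn.close()
```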
• MOST OF THE TIME THE SELECT LIST CONTAINS A LIST OF COLUMNS
• FROM A TABLE YOU WANT TO QUERY.
SELECT <COLUMN_NAME>, <COLUMN_NAME> FROM <TABLE_NAME>;
SELECT list: *
This wildcard character pulls all the columns from a table
BAD PRACTICE!
FROM CLAUSE
Defines the table you want to query
QUALIFYING COLUMN NAME WITH TABLE NAME
Good Practice
ALIASING THE TABLE NAME
Best Practice
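The three styles (wildcard, table-qualified columns, aliased table) can be compared side by side. This is a sketch only, assuming Python's sqlite3 module and a small person table whose layout is inferred from later slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (person_id INTEGER, first_name TEXT, last_name TEXT)")
conn.execute("INSERT INTO person VALUES (1, 'Jon', 'Flanders')")

# Wildcard: returns every column, whatever the table happens to contain.
wildcard = conn.execute("SELECT * FROM person;").fetchall()

# Qualified: each column name is prefixed with its table name.
qualified = conn.execute(
    "SELECT person.first_name, person.last_name FROM person;"
).fetchall()

# Aliased: the table gets the short name p, which then serves as the prefix.
aliased = conn.execute("SELECT p.first_name, p.last_name FROM person p;").fetchall()

print(wildcard, qualified, aliased)
conn.close()
```

The qualified and aliased forms return the same rows; the alias only shortens the prefix.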
WAYS TO CONSTRAIN THE NUMBER OF RESULTS
NOT DISTINCT
What are the first names of all the people I know?
SELECT p.first_name FROM person p;
first_name
Jon
Jon
Fritz
Shannon
Jason
Jason
Brian
Brian
DISTINCT
What are all unique first names of the people I know?
first_name
Jon
Fritz
Shannon
Jason
Brian
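The difference between the two result sets can be reproduced with the DISTINCT qualifier. This is a minimal sketch assuming Python's sqlite3 module; the single-column person table is a simplification made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (first_name TEXT)")
names = ["Jon", "Jon", "Fritz", "Shannon", "Jason", "Jason", "Brian", "Brian"]
conn.executemany("INSERT INTO person (first_name) VALUES (?)", [(n,) for n in names])

# Without DISTINCT every row comes back, duplicates included.
all_names = [r[0] for r in conn.execute("SELECT p.first_name FROM person p;")]

# With DISTINCT each first name appears once (row order is not guaranteed).
unique_names = [r[0] for r in conn.execute("SELECT DISTINCT p.first_name FROM person p;")]

print(len(all_names), sorted(unique_names))
conn.close()
```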
• QUERY DATA WITH THE SELECT COMMAND
WHERE CLAUSE
• Constrains the result set
SELECT p.last_name
FROM person p
WHERE p.first_name = 'Jon';
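A quick way to see the WHERE clause constrain a result set is to run it against two rows. This sketch assumes Python's sqlite3 module and a two-column person table built just for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (first_name TEXT, last_name TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [("Jon", "Flanders"), ("Fritz", "Onion")],
)

# Only rows whose first_name equals 'Jon' survive the WHERE clause.
last_names = conn.execute(
    "SELECT p.last_name FROM person p WHERE p.first_name = 'Jon';"
).fetchall()
print(last_names)
conn.close()
```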
BOOLEAN OPERATORS
Operator   Name                 Description
<>         Not Equal To         True if the values on both sides are not equal
>          Greater Than         True if left side is larger than right side
<          Less Than            True if left side is smaller than right side
<=         Less Than or Equal   True if left side is smaller than or equal to right side
A Single Expression is Quite Limiting
• WE’D LIKE TO ASK MORE COMPLEX QUESTIONS
• ADDITIONAL KEYWORDS ARE NEEDED
These can chain multiple expressions
AND
COMBINES TWO EXPRESSIONS
IF BOTH ARE TRUE, ROW IS INCLUDED
IF EITHER IS FALSE, ROW IS EXCLUDED
Who are all the people in my contact list that have the first name Jon and a birthdate after 12/31/1965?
SELECT p.first_name, p.last_name
FROM person p
WHERE p.first_name = 'Jon'
AND p.birthdate > '12/31/1965';
Callouts: SELECT CLAUSE, WHERE CLAUSE, AND
OR
ALSO COMBINES TWO EXPRESSIONS
IF EITHER IS TRUE, ROW IS INCLUDED
IF BOTH ARE FALSE, ROW IS EXCLUDED
Who are all the people in my contact list that have the first name Jon or a last name of Flanders?
SELECT p.first_name, p.last_name
FROM person p
WHERE p.first_name = 'Jon' OR p.last_name = 'Flanders';
Callouts: SELECT CLAUSE, FROM CLAUSE, WHERE CLAUSE, OR
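The effect of AND versus OR can be checked on a handful of rows. This is a sketch under stated assumptions: Python's sqlite3 module, hypothetical sample rows, and birthdates stored as ISO YYYY-MM-DD text so that string comparison follows date order (the slide's MM/DD/YYYY literal would not compare correctly as text).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (first_name TEXT, last_name TEXT, birthdate TEXT)")
# Hypothetical rows; only the names Jon Flanders and Fritz Onion come from the slides.
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [
        ("Jon", "Flanders", "1970-03-01"),
        ("Jon", "Smith", "1960-07-15"),
        ("Fritz", "Flanders", "1980-01-20"),
        ("Fritz", "Onion", "1975-05-05"),
    ],
)
select = "SELECT p.first_name, p.last_name FROM person p "

# AND: both expressions must be true for a row to be included.
both = conn.execute(
    select + "WHERE p.first_name = 'Jon' AND p.birthdate > '1965-12-31';"
).fetchall()

# OR: a row is included when either expression is true.
either = conn.execute(
    select + "WHERE p.first_name = 'Jon' OR p.last_name = 'Flanders';"
).fetchall()

print(both)
print(sorted(either))
conn.close()
```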
OTHER OPERATORS
BETWEEN LIKE IN
IS IS NOT
BETWEEN
ACTS ON COLUMN AND TWO VALUES
TRUE IF COLUMN VALUE IS BETWEEN TWO VALUES
INCLUSIVE: INCLUDES THE TWO VALUES (LIKE >= AND <=)
Who are all the people in my contact list that I have contacted at least once but no more than 20 times?
SELECT p.first_name, p.last_name
FROM person p
Callouts: SELECT CLAUSE, FROM CLAUSE
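The WHERE clause for this question did not survive on the slide. One plausible way to finish it, assuming the contact count lives in the contacted_number column that appears in the later INSERT example, is sketched here with Python's sqlite3 module and hypothetical rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (first_name TEXT, last_name TEXT, contacted_number INTEGER)")
# Hypothetical contact counts chosen to sit on and around both boundaries.
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [("Jon", "Flanders", 0), ("Fritz", "Onion", 1), ("Shannon", "Ahern", 20), ("Jason", "Lee", 21)],
)
select = "SELECT p.first_name, p.last_name FROM person p "

# BETWEEN is inclusive: 1 and 20 both match.
between = conn.execute(select + "WHERE p.contacted_number BETWEEN 1 AND 20;").fetchall()

# The same filter spelled out with >= and <=.
explicit = conn.execute(
    select + "WHERE p.contacted_number >= 1 AND p.contacted_number <= 20;"
).fetchall()

print(sorted(between))
conn.close()
```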
Who are all the people in my
contact list that have a first name
that begins with the letter J?
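The query for this question is missing from the slide; a LIKE pattern with the % wildcard is the usual way to express "begins with". This is a sketch assuming Python's sqlite3 module and hypothetical rows. Note that SQLite's LIKE ignores ASCII case by default, which other engines may not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (first_name TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?)",
    [("Jon",), ("Fritz",), ("Jason",), ("Brian",)],
)

# 'J%' matches a J followed by any number of characters.
j_names = [
    r[0]
    for r in conn.execute("SELECT p.first_name FROM person p WHERE p.first_name LIKE 'J%';")
]
print(sorted(j_names))
conn.close()
```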
Who are all the people in my contact list that are named Jon or Fritz?
SELECT p.first_name, p.last_name
FROM person p
WHERE p.first_name IN ('Jon', 'Fritz');
Callouts: SELECT CLAUSE, FROM CLAUSE, WHERE CLAUSE, IN
IS
SPECIAL OPERATOR
Like an equals operator
But just for values that might be NULL
Who are all the people in my contact list that don’t have a last name?
SELECT p.first_name, p.last_name
FROM person p
Callouts: SELECT CLAUSE, FROM CLAUSE
Who are all the people in my contact list that have a last name?
SELECT p.first_name, p.last_name
FROM person p
WHERE p.last_name IS NOT NULL;
Callouts: SELECT CLAUSE, FROM CLAUSE, WHERE CLAUSE, IS NOT
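IS NULL and IS NOT NULL can be compared with a plain equals test, which never matches NULL. This is a minimal sketch assuming Python's sqlite3 module; the row without a last name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (first_name TEXT, last_name TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [("Jon", "Flanders"), ("Fritz", "Onion"), ("Shannon", None)],
)
select = "SELECT p.first_name, p.last_name FROM person p "

missing = conn.execute(select + "WHERE p.last_name IS NULL;").fetchall()
present = conn.execute(select + "WHERE p.last_name IS NOT NULL;").fetchall()

# Comparing with = NULL is never true, so no row is returned.
equals_null = conn.execute(select + "WHERE p.last_name = NULL;").fetchall()

print(missing, sorted(present), equals_null)
conn.close()
```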
Summary
The WHERE clause enables us to restrict the result set of our queries
The more complex the question we want to ask, the more complex the WHERE clause becomes
Shaping Results with ORDER BY and GROUP BY
Sometimes you want the result set
to be different than the data
returned by a simple SELECT
statement.
ORDER BY
ALLOWS SORTING OF RESULT SET
Comes after the WHERE clause (if there is one)
Who are all the people in my contact list, ordered by last name?
SELECT p.last_name, p.first_name
FROM person p
ORDER BY p.last_name;
Callouts: SELECT CLAUSE, FROM CLAUSE, ORDER BY CLAUSE
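Running the ORDER BY query on rows inserted out of order shows the sort. This sketch assumes Python's sqlite3 module; the third contact is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (first_name TEXT, last_name TEXT)")
# Inserted deliberately out of alphabetical order.
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [("Fritz", "Onion"), ("Jon", "Flanders"), ("Shannon", "Ahern")],
)

ordered = conn.execute(
    "SELECT p.last_name, p.first_name FROM person p ORDER BY p.last_name;"
).fetchall()
print(ordered)
conn.close()
```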
Set Function
• COMPUTES NEW VALUES FROM COLUMN VALUES
• USE IN PLACE OF COLUMNS IN SELECT CLAUSE
• PASSES COLUMN NAME TO FUNCTION
• HELPS US TO ASK MORE INTERESTING QUESTIONS
• OFTEN USED WITH THE DISTINCT QUALIFIER
SET FUNCTIONS
Function   Description
SUM        Sum of all the values of the column (does not include NULL values; numeric columns only)
What is the total number of times I’ve
contacted my contacts?
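The slide's query is missing; SUM over the contact counter is one way to answer it. This is a sketch assuming Python's sqlite3 module, the contacted_number column from the later INSERT example, and hypothetical counts. One row holds NULL to show that SUM skips it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (first_name TEXT, contacted_number INTEGER)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [("Jon", 0), ("Fritz", 3), ("Shannon", None), ("Jason", 5)],
)

# SUM adds up the non-NULL values of the column: 0 + 3 + 5.
total = conn.execute("SELECT SUM(p.contacted_number) FROM person p;").fetchone()[0]
print(total)
conn.close()
```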
SET FUNCTIONS + QUALIFIERS
Often used together
Add inside of the function
Run against DISTINCT column values
What is the count of unique first names among my contacts?
SELECT COUNT(DISTINCT p.first_name) FROM person p;
GROUP BY
ALLOWS MULTIPLE COLUMNS WITH A SET FUNCTION
BREAKS RESULT SET INTO SUBSETS
Runs set function against each subset
What is the count of every
unique first name among my
contacts?
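The query for this question is not on the slide. A GROUP BY with COUNT is a natural fit; this sketch assumes Python's sqlite3 module and reuses the first names from the earlier DISTINCT slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (first_name TEXT)")
names = ["Jon", "Jon", "Fritz", "Shannon", "Jason", "Jason", "Brian", "Brian"]
conn.executemany("INSERT INTO person VALUES (?)", [(n,) for n in names])

# GROUP BY splits the rows into one subset per first name;
# COUNT then runs once against each subset.
counts = dict(
    conn.execute(
        "SELECT p.first_name, COUNT(p.first_name) FROM person p GROUP BY p.first_name;"
    ).fetchall()
)
print(counts)
conn.close()
```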
What is the count of unique
first names among my
contacts that appear at least 5
times?
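Filtering on a set function's result is typically done with a HAVING clause after GROUP BY; the slide's own query is not preserved, so this is only a sketch. It assumes Python's sqlite3 module and lowers the threshold from 5 to 2 so that the small sample data returns rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (first_name TEXT)")
names = ["Jon", "Jon", "Fritz", "Shannon", "Jason", "Jason", "Brian", "Brian"]
conn.executemany("INSERT INTO person VALUES (?)", [(n,) for n in names])

# HAVING filters whole groups after counting, unlike WHERE, which filters rows before grouping.
# The threshold is 2 here (not the slide's 5) because the sample is tiny.
frequent = dict(
    conn.execute(
        "SELECT p.first_name, COUNT(p.first_name) FROM person p "
        "GROUP BY p.first_name HAVING COUNT(p.first_name) >= 2;"
    ).fetchall()
)
print(frequent)
conn.close()
```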
Matching Different Data Tables with JOINs
Merges multiple tables into one result set
CROSS JOIN
SIMPLEST JOIN
All rows from both tables
No WHERE clause
Least useful
Inefficient
Cartesian Product
CROSS keyword implied
What are all the first names and
email addresses I have?
BAD
PRACTICE!
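Why this is bad practice becomes visible once the Cartesian product is counted. This is a sketch assuming Python's sqlite3 module; the email addresses are hypothetical example.com values, not the ones elided on the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (person_id INTEGER, first_name TEXT)")
conn.execute("CREATE TABLE email_address (email_address_person_id INTEGER, email_address TEXT)")
conn.executemany("INSERT INTO person VALUES (?, ?)", [(1, "Jon"), (2, "Fritz")])
conn.executemany(
    "INSERT INTO email_address VALUES (?, ?)",
    [(1, "jon@example.com"), (2, "fritz@example.com")],
)

# Listing two tables with no join condition implies a CROSS JOIN:
# every person is paired with every email address, related or not.
pairs = conn.execute(
    "SELECT p.first_name, e.email_address FROM person p, email_address e;"
).fetchall()
print(len(pairs))
conn.close()
```

Two people and two addresses already yield four rows, two of which pair a person with someone else's address.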
INNER JOIN
MOST TYPICAL JOIN
EMPHASIZES RELATIONAL NATURE OF DATABASE
MATCHES COLUMN IN FIRST TABLE TO SECOND
PRIMARY KEY TO FOREIGN KEY IS MOST COMMON
What are my contacts’ email addresses?
SELECT p.first_name, p.last_name, e.email_address
FROM person p
INNER JOIN email_address e
ON p.person_id = e.email_address_person_id;
Callouts: INNER JOIN, ON CLAUSE
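The INNER JOIN above can be run against a small data set to see that unmatched rows drop out. This is a sketch assuming Python's sqlite3 module; the third contact and all email addresses are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (person_id INTEGER, first_name TEXT, last_name TEXT)")
conn.execute(
    "CREATE TABLE email_address "
    "(email_address_id INTEGER, email_address_person_id INTEGER, email_address TEXT)"
)
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [(1, "Jon", "Flanders"), (2, "Fritz", "Onion"), (3, "Shannon", "Ahern")],
)
conn.executemany(
    "INSERT INTO email_address VALUES (?, ?, ?)",
    [(1, 1, "jon@example.com"), (2, 2, "fritz@example.com")],
)

# Only people with a matching email_address row are returned; Shannon has none.
matched = conn.execute(
    "SELECT p.first_name, p.last_name, e.email_address "
    "FROM person p INNER JOIN email_address e "
    "ON p.person_id = e.email_address_person_id;"
).fetchall()
print(sorted(matched))
conn.close()
```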
OUTER JOIN
INNER JOIN DOESN’T DEAL WITH NULL VALUES
OUTER JOIN WORKS EVEN WHEN NO MATCH
NULL COLUMNS IF NO MATCH IN SECOND TABLE
FULL OUTER JOIN RETURNS ALL JOINED ROWS
NULL WHEN NO MATCH IN EITHER TABLE
LEFT OUTER JOIN
ANOTHER NULL-RELATED JOIN
ALL ROWS FROM THE LEFT SIDE WILL BE RETURNED
NULL FOR NON-MATCHING RIGHT SIDE TABLE
What are my contacts and their email addresses, including those I don’t have an email for?
SELECT p.first_name, p.last_name, e.email_address
FROM person p
LEFT OUTER JOIN email_address e
ON p.person_id = e.email_address_person_id;
first_name   last_name   email_address
Jon          Flanders    jon@...
Fritz        Onion       fritz@...
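The LEFT OUTER JOIN keeps every person, filling the email column with NULL where nothing matches. This sketch assumes Python's sqlite3 module; the contact without an email and the example.com addresses are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (person_id INTEGER, first_name TEXT, last_name TEXT)")
conn.execute(
    "CREATE TABLE email_address "
    "(email_address_id INTEGER, email_address_person_id INTEGER, email_address TEXT)"
)
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [(1, "Jon", "Flanders"), (2, "Fritz", "Onion"), (3, "Shannon", "Ahern")],
)
conn.executemany(
    "INSERT INTO email_address VALUES (?, ?, ?)",
    [(1, 1, "jon@example.com"), (2, 2, "fritz@example.com")],
)

# Every row from the left table (person) is kept; Python shows SQL NULL as None.
rows = conn.execute(
    "SELECT p.first_name, p.last_name, e.email_address "
    "FROM person p LEFT OUTER JOIN email_address e "
    "ON p.person_id = e.email_address_person_id;"
).fetchall()
print(rows)
conn.close()
```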
What are the email addresses I have, including those emails I don’t have a person for?
SELECT p.first_name, p.last_name, e.email_address
FROM person p
RIGHT OUTER JOIN email_address e
ON p.person_id = e.email_address_person_id;
first_name   last_name   email_address
Jon          Flanders    jon@...
Fritz        Onion       fritz@...
What are all my contacts and their email addresses, including the ones missing an email address and the ones with an email address but missing a contact name?
SELECT p.first_name, p.last_name, e.email_address
FROM person p
FULL OUTER JOIN email_address e
ON p.person_id = e.email_address_person_id;
first_name   last_name   email_address
Jon          Flanders    jon@...
Fritz        Onion       fritz@...
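Not every engine accepts FULL OUTER JOIN; SQLite, for instance, only gained RIGHT and FULL OUTER JOIN in version 3.39.0. The sketch below therefore emulates the full outer join with two LEFT OUTER JOINs glued by UNION ALL. This is a workaround, not the slide's statement, and it assumes Python's sqlite3 module and hypothetical rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (person_id INTEGER, first_name TEXT, last_name TEXT)")
conn.execute(
    "CREATE TABLE email_address "
    "(email_address_id INTEGER, email_address_person_id INTEGER, email_address TEXT)"
)
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [(1, "Jon", "Flanders"), (2, "Fritz", "Onion"), (3, "Shannon", "Ahern")],
)
conn.executemany(
    "INSERT INTO email_address VALUES (?, ?, ?)",
    [(1, 1, "jon@example.com"), (2, 2, "fritz@example.com"), (3, None, "orphan@example.com")],
)

query = """
SELECT p.first_name, p.last_name, e.email_address
FROM person p
LEFT OUTER JOIN email_address e ON p.person_id = e.email_address_person_id
UNION ALL
-- Second half: emails with no person (what a RIGHT OUTER JOIN would add).
SELECT p.first_name, p.last_name, e.email_address
FROM email_address e
LEFT OUTER JOIN person p ON p.person_id = e.email_address_person_id
WHERE p.person_id IS NULL;
"""
rows = conn.execute(query).fetchall()
print(rows)
conn.close()
```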
SELF JOIN
YOU CAN JOIN A TABLE ON ITSELF
ODD BUT SOMETIMES USEFUL
NO SPECIAL SYNTAX
SAME TABLE ON LEFT AND RIGHT SIDE OF JOIN
USEFUL WHEN TABLE CONTAINS HIERARCHICAL DATA
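A self join on hierarchical data might look like the following. The employee table, its manager_id column, and the rows are all hypothetical, introduced only for illustration; the sketch assumes Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical hierarchy: manager_id points at another row of the same table.
conn.execute("CREATE TABLE employee (employee_id INTEGER, name TEXT, manager_id INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Jon", None), (2, "Fritz", 1), (3, "Shannon", 1)],
)

# The same table appears on both sides; two aliases keep the roles apart.
reports = conn.execute(
    "SELECT e.name, m.name FROM employee e "
    "INNER JOIN employee m ON e.manager_id = m.employee_id;"
).fetchall()
print(sorted(reports))
conn.close()
```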
Summary
JOINs make the relational model come to life by associating tables together
Adding, Changing, and Removing Data
So far, all we’ve done is
query data, but SQL has
commands for adding
and modifying data as
well.
ALL THE COMMANDS – AKA CRUD
UPDATE DELETE
INSERT
COMMAND IS ACTUALLY INSERT INTO
TABLE NAME AFTER COMMAND
Only one table allowed
INSERT INTO person (
    person_id, first_name, last_name, contacted_number,
    date_last_contacted, date_added
)
VALUES (
    1, 'Jon', 'Flanders', 0, NULL, '2016-05-14 11:43:31'
);
Callouts: INSERT INTO, TABLE, COLUMN NAMES, VALUES KEYWORD, VALUES
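The INSERT statement can be executed and read back to confirm what was stored. This is a sketch assuming Python's sqlite3 module; the column types in CREATE TABLE are assumptions, since the slides do not define the person table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed for the example.
conn.execute(
    "CREATE TABLE person (person_id INTEGER, first_name TEXT, last_name TEXT, "
    "contacted_number INTEGER, date_last_contacted TEXT, date_added TEXT)"
)
conn.execute(
    "INSERT INTO person (person_id, first_name, last_name, contacted_number, "
    "date_last_contacted, date_added) "
    "VALUES (1, 'Jon', 'Flanders', 0, NULL, '2016-05-14 11:43:31');"
)

# Values line up with the column list by position; NULL comes back as None.
stored = conn.execute("SELECT * FROM person;").fetchone()
print(stored)
conn.close()
```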
INSERT ALLOWS ONLY ONE TABLE AND COLUMN LIST
COLUMN NAMES
(person_id, first_name, last_name, contacted_number, date_last_contacted, date_added)
UPDATE
MODIFIES COLUMN(S) IN A SINGLE TABLE
WHERE CLAUSE DICTATES WHICH ROWS
SET KEYWORD FOLLOWS TABLE NAME
UPDATE email_address e
SET e.email_address = '[email protected]'
WHERE e.email_address_id = 5;
Callouts: UPDATE COMMAND, TABLE NAME, SET KEYWORD, VALUES, WHERE CLAUSE
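An UPDATE restricted by a WHERE clause changes only the matching row. This sketch assumes Python's sqlite3 module and hypothetical example.com addresses. It drops the table alias used on the slide, because SQLite does not accept a qualified column name after SET.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE email_address "
    "(email_address_id INTEGER, email_address_person_id INTEGER, email_address TEXT)"
)
conn.executemany(
    "INSERT INTO email_address VALUES (?, ?, ?)",
    [(5, 1, "old@example.com"), (6, 2, "fritz@example.com")],
)

# Only the row with email_address_id 5 is modified.
changed = conn.execute(
    "UPDATE email_address SET email_address = 'new@example.com' WHERE email_address_id = 5;"
).rowcount

addresses = dict(
    conn.execute("SELECT email_address_id, email_address FROM email_address;").fetchall()
)
print(changed, addresses)
conn.close()
```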
DELETE
DELETES ONE OR MORE ROWS IN A TABLE
PERMANENT!
DELETE FROM IS ACTUAL FULL COMMAND
WHERE CLAUSE IS CRITICAL!
DELETE COMMAND
DELETE FROM person p;
BAD PRACTICE! Without a WHERE clause, every row in the table is deleted.
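The difference a WHERE clause makes to DELETE can be shown in a few lines. This is a sketch assuming Python's sqlite3 module and the two contacts from the earlier slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (person_id INTEGER, first_name TEXT, last_name TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [(1, "Jon", "Flanders"), (2, "Fritz", "Onion")],
)

# Targeted delete: the WHERE clause limits the damage to one row.
deleted = conn.execute("DELETE FROM person WHERE person_id = 2;").rowcount
remaining = conn.execute("SELECT first_name FROM person;").fetchall()

# No WHERE clause: whatever is left in the table is removed.
conn.execute("DELETE FROM person;")
left = conn.execute("SELECT COUNT(*) FROM person;").fetchone()[0]

print(deleted, remaining, left)
conn.close()
```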
SUMMARY
INSERT, UPDATE, and DELETE are the three SQL commands you want to learn if you are going to be modifying data in your database tables
Creating Database Tables
DATA DEFINITION LANGUAGE (DDL)
SQL subset for creating databases and tables
Most tools have a visual method
Good to have an idea of what they are doing
CREATE DATABASE
ODDLY NOT PART OF THE SQL STANDARD
Is supported by most implementations
USE DATABASE to scope future queries
Can also fully qualify table name to database
CREATE DATABASE COMMAND
CREATE DATABASE Contact;
CREATE TABLE
IS PART OF SQL STANDARD
CREATE TABLE email_address
(
    email_address_id INTEGER,
    email_address_person_id INTEGER,
    email_address VARCHAR(55)
);
Callouts: CREATE TABLE, TABLE NAME, COLUMNS
STANDARD SQL DATA TYPES
Data Type           Value Space
CHARACTER           Can hold N character values; set to N statically
CHARACTER VARYING   Can hold N character values; set to N dynamically; storage can be less than N
BINARY              Hexadecimal data
SMALLINT            -2^15 (-32,768) to 2^15-1 (32,767)
INTEGER             -2^31 (-2,147,483,648) to 2^31-1 (2,147,483,647)
BIGINT              -2^63 (-9,223,372,036,854,775,808) to 2^63-1 (9,223,372,036,854,775,807)
BOOLEAN             TRUE or FALSE
DATE                YEAR, MONTH, and DAY in the format YYYY-MM-DD
TIME                HOUR, MINUTE, and SECOND in the format HH:MM:SS[.sF] where F is the fractional part of the SECOND value
TIMESTAMP           Both DATE and TIME
NULL VALUES
NULL IS A SPECIAL VALUE IN SQL
INDICATES A LACK OF A VALUE
Columns can be required or not required
NULL OR NOT NULL
CREATE TABLE COMMAND
CREATE TABLE email_address
(
    email_address_id INTEGER NOT NULL,
    email_address_person_id INTEGER,
    email_address VARCHAR(55) NOT NULL
);
Callouts: NOT NULL (email_address_id, email_address), NULL allowed (email_address_person_id)
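What NOT NULL means in practice is easiest to see by trying to violate it. This is a sketch assuming Python's sqlite3 module, where a violated constraint surfaces as sqlite3.IntegrityError; the inserted values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE email_address ("
    "email_address_id INTEGER NOT NULL, "
    "email_address_person_id INTEGER, "
    "email_address VARCHAR(55) NOT NULL)"
)

# The nullable column accepts NULL.
conn.execute("INSERT INTO email_address VALUES (1, NULL, 'jon@example.com')")

# The required column rejects it.
try:
    conn.execute("INSERT INTO email_address VALUES (2, 1, NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM email_address;").fetchone()[0]
print(rejected, row_count)
conn.close()
```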
PRIMARY KEY
Must be a unique value per row
CONSTRAINT
WAY TO ADD KEYS IN ONE GROUPING
PRIMARY OR FOREIGN KEYS
CREATE TABLE COMMAND
CREATE TABLE phone_number
(
    phone_number_id INTEGER NOT NULL,
    phone_number_person_id INTEGER NOT NULL,
    phone_number VARCHAR(55) NOT NULL,
    CONSTRAINT PK_phone_number PRIMARY KEY (phone_number_id)
);
Callouts: CONSTRAINT, PRIMARY KEY
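The uniqueness guaranteed by the PRIMARY KEY constraint can be checked by inserting the same key twice. This sketch assumes Python's sqlite3 module; the phone numbers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE phone_number ("
    "phone_number_id INTEGER NOT NULL, "
    "phone_number_person_id INTEGER NOT NULL, "
    "phone_number VARCHAR(55) NOT NULL, "
    "CONSTRAINT PK_phone_number PRIMARY KEY (phone_number_id))"
)
conn.execute("INSERT INTO phone_number VALUES (1, 1, '555-0100')")

# A second row with the same phone_number_id breaks the primary key.
try:
    conn.execute("INSERT INTO phone_number VALUES (1, 2, '555-0101')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicate_rejected)
conn.close()
```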
ALTER TABLE
USED TO CHANGE AN EXISTING TABLE
ADD/REMOVE COLUMN
CHANGE COLUMN DATA TYPE
CHANGE COLUMN CONSTRAINTS
ALTER TABLE email_address
ADD CONSTRAINT FK_email_address_person
FOREIGN KEY (email_address_person_id)
REFERENCES person (person_id);
Callouts: ALTER TABLE, TABLE NAME, ADD KEYWORD, FOREIGN KEY
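The foreign key's effect can be demonstrated with a row that points at a missing person. This is a sketch under several assumptions: Python's sqlite3 module; SQLite does not support ALTER TABLE ... ADD CONSTRAINT, so the same constraint is declared inside CREATE TABLE instead; SQLite enforces foreign keys only after PRAGMA foreign_keys = ON; and the PK_person name and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.execute(
    "CREATE TABLE person ("
    "person_id INTEGER NOT NULL, "
    "CONSTRAINT PK_person PRIMARY KEY (person_id))"
)
conn.execute(
    "CREATE TABLE email_address ("
    "email_address_id INTEGER NOT NULL, "
    "email_address_person_id INTEGER, "
    "email_address VARCHAR(55) NOT NULL, "
    "CONSTRAINT FK_email_address_person FOREIGN KEY (email_address_person_id) "
    "REFERENCES person (person_id))"
)
conn.execute("INSERT INTO person VALUES (1)")
conn.execute("INSERT INTO email_address VALUES (1, 1, 'jon@example.com')")

# person_id 99 does not exist, so the foreign key rejects the row.
try:
    conn.execute("INSERT INTO email_address VALUES (2, 99, 'nobody@example.com')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)
conn.close()
```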
DROP TABLE
REMOVES TABLE AND ALL DATA FROM DATABASE
BE CAREFUL!
Error if the table is referenced by a foreign key in another table
DROP TABLE COMMAND
DROP TABLE person;
Summary
UNDERSTANDING DDL IS A GOOD FOUNDATION FOR WORKING WITH SQL, EVEN IF YOU USE IT RARELY
CREATE TABLE is the command to configure your columns and relations
ALTER TABLE lets you change existing definitions
DROP TABLE removes the table and all its rows from the database