DataCamp - Data Analyst in SQL
Table of Contents
Course 1 - Introduction to SQL
1 – Relational Databases
1.1 Databases
1.2 Tables
1.3 Data Types
2 – Querying
2.1 Introducing Queries
2.2 Writing Queries
2.3 SQL “Flavours”
Course 2 - Intermediate SQL
1 – Selecting Data
1.1 Querying a database
1.3 SQL style
2 – Filtering Records
2.1 Filtering numbers
2.2 Multiple criteria
2.3 Filtering Text
2.4 NULL values
3 – Aggregate functions
3.1 Summarizing data
3.2 Summarizing subsets
3.3 Aliasing and arithmetic
4 – Sorting and Grouping
4.1 Sorting Results
4.2 Grouping data
4.3 Filtering grouped data
Course 3 - Joining Data in SQL
1 Introducing Inner Joins
1.1 The ins and outs of INNER JOIN
1.2 Defining relationships
1.3 Multiple Joins
2 Outer Joins, Cross Joins and Self Joins
2.1 Left and Right Joins
2.2 Full Joins
2.3 Crossing into Cross Joins
2.4 Self Joins
3 Set Theory for SQL Joins
3.1 Union and Union All
3.2 At the Intersect
3.3 Except
4 Subqueries
4.1 Subquerying with semi joins and anti joins
4.2 Subquerying inside WHERE and SELECT
4.3 Subquerying inside FROM
4.4 Summary
Course 4 - Data Manipulation in SQL
1 – CASE
1.1 Introduction to CASE
1.2 CASE
1.3 CASE WHEN
2 – Short and Simple Subqueries
2.1 WHERE and subqueries
2.2 Subqueries in FROM
2.3 Subqueries in SELECT
2.4 Subqueries everywhere
3 – Correlated Queries, Nested Queries, and Common Table Expressions
3.1 Correlated subqueries
3.2 Nested subqueries
3.3 Common Table Expressions
3.4 Deciding on techniques to use
4 – Window Functions
4.1 OVER
4.2 OVER with a PARTITION
4.3 Sliding windows
4.4 Bringing all together
Course 1 - Introduction to SQL
1 – Relational Databases
1.1 Databases
Data in a database is organized into tables made up of rows and columns; in the world of databases, rows are called records and columns are called fields.
Relational databases define the relationships between the tables of data inside the database.
Database advantages
1.2 Tables
Table names should:
1. Be lowercase
2. Use underscores instead of spaces
3. Refer to a collective group or be plural
Assigned seats
Different types of data are stored differently and take up different amounts of memory:
String = a sequence of characters such as letters or punctuation (e.g. the name field). One popular string data type in SQL is VARCHAR.
Integer = a whole number (e.g. member_year). One popular integer data type in SQL is INT.
Float = a number that includes a fractional part (e.g. total_fine). One popular float data type in SQL is NUMERIC.
A schema is like a blueprint of the database: it shows the database design (the tables included, the relationships between them, and the data type each field can hold).
2 – Querying
Keywords:
SELECT (chooses the fields that will be included in the result set)
FROM (picks the table in which the fields are listed)
E.g.
SELECT name
FROM patrons;
Following best practice, the keywords are written in uppercase and the table and field names in lowercase.
Note that this query does not change anything in the database.
It is possible to save a query so that other users can access it.
We can select multiple fields by separating them with commas, in the order we want them to appear in the result.
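A minimal sketch of selecting several fields in a chosen order. The patrons table name comes from the example above; the card_num field and the sample rows are assumptions made up for illustration.

```sql
-- Inline sample data (hypothetical) so the query runs on its own
WITH patrons (card_num, name) AS (
    VALUES (1001, 'Ana'),
           (1002, 'Ben')
)
-- Fields appear in the result in the order they are listed
SELECT name, card_num
FROM patrons;
```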
To select all fields, use *:
SELECT *
FROM patrons;
How do we use DISTINCT with multiple fields? We put DISTINCT in front of the fields, and the query returns each unique combination of their values:
1st: Without DISTINCT
2nd: With DISTINCT, dept_id 3 no longer has two records for 2021.
The real-world question: which departments hired in recent years? How many people each one hired does not matter.
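The DISTINCT behaviour described above can be sketched as follows. Only dept_id comes from the notes; the employees table name, the year_hired field, and the rows are assumptions.

```sql
-- Inline sample data (hypothetical): dept 3 hired twice in 2021
WITH employees (year_hired, dept_id) AS (
    VALUES (2020, 1),
           (2021, 3),
           (2021, 3),
           (2021, 2)
)
SELECT DISTINCT year_hired, dept_id
FROM employees;
-- Returns 3 rows: the repeated (2021, 3) combination appears only once
```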
VIEW = a virtual table that is the result of a saved SQL SELECT statement; it is updated automatically every time we access it.
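A small sketch of creating and using a view. The employees_demo table, the hired_2021 view, and all the data are hypothetical names chosen for this illustration.

```sql
-- Hypothetical table and data
CREATE TABLE employees_demo (id INT, name VARCHAR(50), year_hired INT);
INSERT INTO employees_demo VALUES (1, 'Ana', 2020), (2, 'Ben', 2021);

-- Save the query as a view; the view stores the query, not a copy of the data
CREATE VIEW hired_2021 AS
SELECT id, name
FROM employees_demo
WHERE year_hired = 2021;

-- Query the view like a table
SELECT name FROM hired_2021;   -- Ben

-- New rows show up the next time the view is accessed
INSERT INTO employees_demo VALUES (3, 'Cleo', 2021);
SELECT name FROM hired_2021;   -- Ben and Cleo
```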
LIMIT = limits the result to just a few records, for example to test code before selecting thousands of results from the database:
SELECT name
FROM patrons
LIMIT 10;
Course 2 - Intermediate SQL
1 – Selecting Data
Keywords:
We can count multiple fields by repeating COUNT() for each field, separated by commas.
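Counting two fields in one query might look like the sketch below. The people table, its fields, and the rows are assumptions for illustration.

```sql
-- Inline sample data (hypothetical)
WITH people (name, birth_year) AS (
    VALUES ('Ana', 1980),
           ('Ben', 1990),
           ('Cleo', 1975)
)
SELECT COUNT(name) AS count_names,
       COUNT(birth_year) AS count_birth_years
FROM people;
-- count_names = 3, count_birth_years = 3
```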
Comma
Keywords
Misspelling
Formatting is not required, but it is best practice (see the SQL style guide by Simon Holywell):
Capitalization
New line for each KEYWORD
Semicolon (at the end of the query)
No spaces in field names; otherwise the name has to be quoted every time we work with the field.
2 – Filtering Records
+ WHERE
+OR
To combine AND with OR, we need to enclose the individual clauses in parentheses. For example, to filter films released in 1994 or 1995 and certified PG or R, each OR pair goes in its own parentheses.
BETWEEN = selects a range and is inclusive, that is, BETWEEN 1994 AND 2000 includes both 1994 and 2000.
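The two filters above can be sketched as follows. The films_demo table, the release_year and certification field names, and the rows are assumptions for illustration.

```sql
-- Hypothetical sample data
CREATE TEMP TABLE films_demo (
    title VARCHAR(50),
    release_year INT,
    certification VARCHAR(5)
);
INSERT INTO films_demo VALUES
    ('Film A', 1994, 'PG'),
    ('Film B', 1995, 'R'),
    ('Film C', 1995, 'G'),
    ('Film D', 2000, 'PG');

-- Parentheses group each OR pair before AND is applied
SELECT title
FROM films_demo
WHERE (release_year = 1994 OR release_year = 1995)
  AND (certification = 'PG' OR certification = 'R');
-- Film A, Film B

-- BETWEEN is inclusive: films from 1994 and from 2000 are both kept
SELECT title
FROM films_demo
WHERE release_year BETWEEN 1994 AND 2000;
-- Film A, Film B, Film C, Film D
```

Without the parentheses, AND would be evaluated before OR, so the first query would return a different set of films.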
% matches zero, one, or many characters. For example, WHERE name LIKE 'Ade%' returns the name Ade itself (if it exists) and every name that starts with Ade.
_ matches exactly one character. For example, WHERE name LIKE 'Ev_' returns the names that start with Ev followed by exactly one more character.
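A short sketch of both wildcards. The people_demo table and the names in it are made up for illustration.

```sql
-- Hypothetical sample data
CREATE TEMP TABLE people_demo (name VARCHAR(50));
INSERT INTO people_demo VALUES
    ('Ade'), ('Adel'), ('Adelaide'), ('Ev'), ('Eve'), ('Evan');

-- % matches zero, one, or many characters
SELECT name
FROM people_demo
WHERE name LIKE 'Ade%';
-- Ade, Adel, Adelaide

-- _ matches exactly one character
SELECT name
FROM people_demo
WHERE name LIKE 'Ev_';
-- Eve (Ev has no extra character, Evan has two)
```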
Missing values are represented as NULL. They can result from human error, or from information that is unavailable or unknown.
IS NOT NULL
When we COUNT() a field name, only the non-NULL values are counted, so it is not necessary to add IS NOT NULL.
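The difference between counting rows and counting a field can be sketched like this. The people table, the birth_year field, and the rows are assumptions.

```sql
-- Inline sample data (hypothetical): Ben has no birth year recorded
WITH people (name, birth_year) AS (
    VALUES ('Ana', 1980),
           ('Ben', NULL),
           ('Cleo', 1975)
)
SELECT COUNT(*) AS total_rows,               -- 3: every record is counted
       COUNT(birth_year) AS with_birth_year  -- 2: the NULL is skipped
FROM people;
```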
In sum:
3 - Aggregate functions
MAX ()
COUNT ()
3.2 Summarizing subsets
Arithmetic
SELECT (4 / 3) gives 1 because dividing two integers returns an integer; if we want the exact result, we need to use decimals: SELECT (4.0 / 3.0)
The main difference between aggregate functions and arithmetic is that aggregate functions operate vertically (down the values of a field), while arithmetic operates horizontally (across the fields of a record).
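Both divisions side by side, as a quick check. This assumes a flavour that performs integer division on integers, such as PostgreSQL; other flavours may return a decimal for 4 / 3.

```sql
SELECT (4 / 3)     AS integer_division,  -- 1
       (4.0 / 3.0) AS decimal_division;  -- 1.33...
```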
ORDER BY = sorts the results by one or more fields; the default order is ascending (smallest to largest, or A-Z).
We do not have to include the field we ORDER BY in the SELECT, but best practice advises including it.
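A minimal sorting sketch; the films table and its rows are hypothetical.

```sql
-- Inline sample data (hypothetical)
WITH films (title, release_year) AS (
    VALUES ('Film A', 2000),
           ('Film B', 1994),
           ('Film C', 1995)
)
SELECT title, release_year   -- the sort field is selected too, as best practice advises
FROM films
ORDER BY release_year;       -- ascending by default
-- Film B (1994), Film C (1995), Film A (2000)
```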
4.2 Grouping data
SQL returns an error if we SELECT a field that is not in the GROUP BY clause, so we have to wrap that field in an aggregate function.
We can group by multiple fields: the 1st field is the main grouping and the 2nd field splits it further, just as a 2nd field in ORDER BY is only used when we have a tie.
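Grouping by two fields with aggregate functions could look like the sketch below. The films table, its fields, and the rows are assumptions for illustration.

```sql
-- Inline sample data (hypothetical)
WITH films (title, certification, release_year, duration) AS (
    VALUES ('Film A', 'PG', 1994, 100),
           ('Film B', 'PG', 1994, 120),
           ('Film C', 'R',  1994, 90),
           ('Film D', 'PG', 1995, 110)
)
-- Selecting title directly would raise an error here (e.g. in PostgreSQL):
-- it is neither in the GROUP BY nor inside an aggregate function
SELECT certification,
       release_year,
       COUNT(title)  AS film_count,
       AVG(duration) AS avg_duration
FROM films
GROUP BY certification, release_year
ORDER BY certification, release_year;
-- PG | 1994 | 2 films | average 110
-- PG | 1995 | 1 film  | average 110
-- R  | 1994 | 1 film  | average 90
```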
4.3 Filtering grouped data
HAVING: filters groups of data. It takes the place of WHERE for grouped data, since WHERE cannot filter on aggregate functions.
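A short sketch of filtering groups with HAVING; the films table and its rows are hypothetical.

```sql
-- Inline sample data (hypothetical)
WITH films (title, release_year) AS (
    VALUES ('Film A', 1994),
           ('Film B', 1994),
           ('Film C', 1995)
)
SELECT release_year,
       COUNT(title) AS film_count
FROM films
GROUP BY release_year
HAVING COUNT(title) > 1;   -- the same condition in WHERE would be an error
-- 1994 | 2
```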
In Sum from Intermediate SQL:
Course 3 - Joining Data in SQL
INNER JOIN = looks for records in both tables which match on a given field
Example:
SELECT *
FROM left_table
INNER JOIN right_table
ON left_table.id = right_table.id;
One-to-many relationships:
With a LEFT JOIN, the result keeps every record of the left table, with a NULL in the right_val field wherever the right table has no match.
The code for a RIGHT JOIN is the same as for a LEFT JOIN but with RIGHT, and with the tables after FROM and JOIN swapped: FROM x LEFT JOIN y becomes FROM y RIGHT JOIN x.
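The LEFT JOIN behaviour above can be sketched as follows. The right_val field comes from the notes; left_val and all the rows are assumptions.

```sql
-- Inline sample data (hypothetical)
WITH left_table (id, left_val) AS (
    VALUES (1, 'L1'), (2, 'L2'), (3, 'L3')
),
right_table (id, right_val) AS (
    VALUES (1, 'R1'), (3, 'R3'), (5, 'R5')
)
SELECT l.id, l.left_val, r.right_val
FROM left_table AS l
LEFT JOIN right_table AS r
ON l.id = r.id
ORDER BY l.id;
-- 1 | L1 | R1
-- 2 | L2 | NULL   (no match in right_table)
-- 3 | L3 | R3
-- Same result with the tables swapped:
--   FROM right_table AS r RIGHT JOIN left_table AS l ON l.id = r.id
```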
2.2 Full Joins
2.3 Crossing into Cross Joins
CROSS JOIN can be incredibly helpful when asking questions that involve looking at all possible
combinations or pairings between two sets of data.
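A minimal sketch of pairing every record of one table with every record of another. The sizes and colors tables are hypothetical.

```sql
-- Inline sample data (hypothetical)
WITH sizes (size) AS (
    VALUES ('S'), ('M')
),
colors (color) AS (
    VALUES ('red'), ('blue'), ('green')
)
SELECT size, color
FROM sizes
CROSS JOIN colors;   -- no ON clause: every size is paired with every color
-- 2 x 3 = 6 rows
```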
UNION ALL works the same as UNION but keeps duplicates. In the example above, 1A and 4A from the right table are not shown by UNION because they are already in the left table; with UNION ALL, 1A and 4A are shown twice. UNION = 7 records vs UNION ALL = 9 records.
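The difference can be tried with the sketch below. The rows are an assumption, chosen so that the counts match the 7 and 9 records noted above; the original example tables may have differed.

```sql
-- Inline sample data (hypothetical): 1A and 4A exist in both tables
WITH left_table (id, val) AS (
    VALUES (1, 'A'), (1, 'B'), (2, 'A'), (3, 'A'), (4, 'A')
),
right_table (id, val) AS (
    VALUES (1, 'A'), (4, 'A'), (5, 'A'), (6, 'A')
)
SELECT id, val FROM left_table
UNION            -- swap for UNION ALL to keep the duplicates
SELECT id, val FROM right_table;
-- UNION:     7 records (1A and 4A appear once)
-- UNION ALL: 9 records (1A and 4A appear twice)
```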
3.2 At the Intersect
We can INTERSECT on two fields, but a record is only shown if both fields match between the left and right tables.
3.3 Except
ID 4 is present in both tables, but because the val field does not match, the record still appears in the result.
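INTERSECT and EXCEPT on two fields can be sketched as follows. The rows are made up so that id 4 exists in both tables with a different val.

```sql
-- Inline sample data (hypothetical)
WITH left_table (id, val) AS (
    VALUES (1, 'A'), (2, 'B'), (4, 'A')
),
right_table (id, val) AS (
    VALUES (1, 'A'), (4, 'B'), (5, 'A')
)
SELECT id, val FROM left_table
EXCEPT           -- swap for INTERSECT to keep only records matching on both fields
SELECT id, val FROM right_table;
-- EXCEPT:    (2, B) and (4, A); id 4 stays because its val differs
-- INTERSECT: (1, A) only
```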
4 Subqueries
WHERE - subquery
SELECT – subquery
4.3 Sub querying inside FROM
4.4 Summary
Course 4 - Data Manipulation in SQL
1 – CASE
4.1 OVER