Data Camp-Data Analyst in SQL

Table of Contents

Course 1 - Introduction to SQL
1 – Relational Databases
1.1 Databases
1.2 Tables
1.3 Data Types
2 – Querying
2.1 Introducing Queries
2.2 Writing Queries
2.3 SQL “Flavours”

Course 2 - Intermediate to SQL
1 – Selecting Data
1.1 Querying a database
1.3 SQL style
2 - Filtering Records
2.1 Filtering numbers
2.2 Multiple criteria
2.3 Filtering Text
2.4 NULL values
3 - Aggregate functions
3.1 Summarizing data
3.2 Summarizing subsets
3.3 Aliasing and arithmetic
4 – Sorting and Grouping
4.1 Sorting Results
4.2 Grouping data
4.3 Filtering grouped data

Course 3 - Joining Data in SQL
1 Introducing Inner Joins
1.1 The ins and outs of INNER JOIN
1.2 Defining relationships
1.3 Multiple Joins
2 Outer Joins, Cross Joins and Self Joins
2.1 Left and Right Joins
2.2 Full Joins
2.3 Crossing into Cross Joins
2.4 Self Joins
3 Set Theory for SQL Joins
3.1 Union and Union All
3.2 At the Intersect
3.3 Except
4 Subqueries
4.1 Sub querying with semi joins and anti joins
4.2 Sub querying inside WHERE and SELECT
4.3 Sub querying inside FROM
4.4 Summary

Course 4 - Data Manipulation in SQL
1 – CASE
1.1 Introduction to CASE
1.2 CASE
1.3 CASE WHEN
2 – Short and Simple Subqueries
2.1 WHERE and subqueries
2.2 Subqueries in FROM
2.3 Subqueries in SELECT
2.4 Subqueries everywhere
3 – Correlated Queries, Nested Queries, and Common Table Expression
3.1 Correlated subqueries
3.2 Nested subqueries
3.3 Common Table Expression
3.4 Deciding on techniques to use
4 – Window Functions
4.1 OVER
4.2 OVER with a PARTITION
4.3 Sliding windows
4.4 Bringing all together

Course 1 - Introduction to SQL

1 – Relational Databases

1.1 Databases

Databases are organized into tables made up of rows and columns. In database terminology:

• Rows = records (each holds data on an individual observation)
• Columns = fields (each holds one piece of information about all records)

Relational databases define the relationships between the tables of data inside the database.

Database advantages

• Store more data
• More security
• Several users can query the database at the same time without changing the data

SQL = Structured Query Language

1.2 Tables

Good table manners (for table names):

1. Should be lowercase
2. Use underscores (_) instead of spaces
3. Refer to a collective group or be plural

Assigned seats

• Unique identifiers are used to identify records in a table
• They are unique and often numbers

1.3 Data Types

Different types of data are stored differently and take up different amounts of memory.

Some operations only apply to certain data types.

• String = a sequence of characters such as letters or punctuation (e.g. the name field). One popular string data type in SQL is VARCHAR
• Integer = stores whole numbers (e.g. member_year). One popular integer data type in SQL is INT
• Float = stores numbers that include a fractional part (e.g. total_fine). One popular float data type in SQL is NUMERIC

A schema is like a blueprint of the database: it shows the database design (the tables included, the relationships between them, and the data type each field can hold).

Database storage = the data is stored on a server

2 – Querying

2.1 Introducing Queries

Keywords:

SELECT (is used to choose the fields that will be included in the result set)

FROM (is used to pick the table in which the fields are listed)

E.g.
SELECT name
FROM patrons;

• This follows best practice: keywords in capitals, table and field names in lowercase.
• Note that this query does not change anything in the database.
• It is possible to save a query so that other users have access to it.

We can select multiple fields by separating them with commas, listed in the order we want to see them in the result.
And if we want to select all fields? Use *, that is:

SELECT *

FROM patrons;
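The two kinds of selection described above can be tried out with a small sketch. It uses Python's built-in sqlite3 module only to make the SQL runnable; SQLite as the engine and the sample rows are assumptions, while the patrons table and the name, member_year and total_fine fields come from the notes.

```python
import sqlite3

# Hypothetical rows in an in-memory database; nothing here touches a real server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patrons (name TEXT, member_year INTEGER, total_fine REAL)")
conn.executemany(
    "INSERT INTO patrons VALUES (?, ?, ?)",
    [("Jasmin", 2022, 2.05), ("James", 2021, 0.0)],
)

# Fields come back in the order they are listed after SELECT.
two_fields = conn.execute("SELECT member_year, name FROM patrons;").fetchall()

# * returns every field, in the table's own column order.
all_fields = conn.execute("SELECT * FROM patrons;").fetchall()

print(two_fields)
print(all_fields)
```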

2.2 Writing Queries

AS = aliasing (renames columns in the result)

Aliasing changes the field names in the output to the ones we want, without changing the original database.
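As a rough illustration of aliasing, the sketch below (Python's sqlite3, with hypothetical alias names first_name and year_joined) compares the column names of the result with the column names stored in the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patrons (name TEXT, member_year INTEGER)")
conn.execute("INSERT INTO patrons VALUES ('Jasmin', 2022)")

# The aliases only rename the columns of this result set.
cur = conn.execute(
    "SELECT name AS first_name, member_year AS year_joined FROM patrons;"
)
result_columns = [col[0] for col in cur.description]

# The table itself keeps its original field names.
table_columns = [row[1] for row in conn.execute("PRAGMA table_info(patrons)")]

print(result_columns)
print(table_columns)
```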

DISTINCT = selects only the distinct (unique) records

How do we use DISTINCT with multiple fields? DISTINCT goes right after SELECT and returns each unique combination of the listed fields, that is:
1st: Without DISTINCT, every record is returned.
2nd: With DISTINCT on dept_id and the hiring year, dept_id 3 no longer has two records in 2021.
The real-world question: which departments hired in recent years? So it doesn't matter how many people were hired.
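A small sketch of the department example, assuming a hypothetical employees table with dept_id and year_hired fields and made-up rows (run through Python's sqlite3):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per hire.
conn.execute("CREATE TABLE employees (dept_id INTEGER, year_hired INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(1, 2020), (3, 2021), (3, 2021), (2, 2021)],
)

without_distinct = conn.execute(
    "SELECT dept_id, year_hired FROM employees;"
).fetchall()

# DISTINCT keeps one row per unique (dept_id, year_hired) combination.
with_distinct = conn.execute(
    "SELECT DISTINCT dept_id, year_hired FROM employees;"
).fetchall()

print(len(without_distinct), len(with_distinct))
```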

VIEW = creates a virtual table that is the result of a saved SQL SELECT statement. Each time we access the view, its results are updated automatically to reflect the underlying data.

We can query a view we named previously using SELECT and FROM, just like a table:
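A minimal sketch of creating and querying a view, assuming a hypothetical employees table and a hypothetical view name hiring_depts (Python's sqlite3 is used just to run the SQL). Querying the view again after an insert shows that it reflects the current data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (dept_id INTEGER, year_hired INTEGER)")
conn.executemany("INSERT INTO employees VALUES (?, ?)", [(1, 2020), (3, 2021)])

# Save a SELECT statement under a name; no data is copied.
conn.execute(
    "CREATE VIEW hiring_depts AS SELECT DISTINCT dept_id FROM employees;"
)

# The view is queried like a table.
before = sorted(conn.execute("SELECT dept_id FROM hiring_depts;").fetchall())

# New data in the underlying table shows up the next time the view is read.
conn.execute("INSERT INTO employees VALUES (2, 2022)")
after = sorted(conn.execute("SELECT dept_id FROM hiring_depts;").fetchall())

print(before, after)
```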


2.3 SQL “Flavours”

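LIMIT = limits the result to just a few records, which is useful to test out code before selecting thousands of results from the database (LIMIT is the keyword in flavours such as PostgreSQL; SQL Server uses TOP instead)
SELECT ...
FROM ...
LIMIT ...;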
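A short sketch of LIMIT with made-up patron names, run through Python's sqlite3 (SQLite is an assumption here; it happens to use the same LIMIT keyword).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patrons (name TEXT)")
all_names = ["Jasmin", "James", "Lisa", "Maria", "Izzy"]  # hypothetical rows
conn.executemany("INSERT INTO patrons VALUES (?)", [(n,) for n in all_names])

# Only the first two records the database returns are kept.
first_two = conn.execute("SELECT name FROM patrons LIMIT 2;").fetchall()

print(first_two)
```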
Course 2 - Intermediate to SQL

1 – Selecting Data

1.1 Querying a database

Keywords:

COUNT() = counts the number of records with a value in a field

The alias (AS) just makes the result clearer; it is not required.

We can count multiple fields by repeating COUNT() for each field, separated by commas.

To count all the records in a table, use COUNT(*)

It is possible to combine COUNT with DISTINCT: SELECT COUNT(DISTINCT birthdate)
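The COUNT variants above can be compared side by side. This is a sketch with a hypothetical people table and made-up rows (only the birthdate field comes from the notes), run through Python's sqlite3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, birthdate TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Ann", "1990-01-01"), ("Bob", "1990-01-01"), ("Cy", "1985-05-05")],
)

# Several counts in one query, each with an alias for readability.
counts = conn.execute(
    "SELECT COUNT(name) AS count_names, "
    "COUNT(birthdate) AS count_birthdates, "
    "COUNT(*) AS total_records, "
    "COUNT(DISTINCT birthdate) AS unique_birthdates "
    "FROM people;"
).fetchone()

print(counts)
```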

1.2 Query Execution

Debugging (common errors):

• Missing or misplaced commas
• Misspelled or missing keywords
• Misspelled field or table names

1.3 SQL style

Formatting is not required, but it is best practice (see the SQL style guide by Simon Holywell):

• Capitalize keywords
• Start a new line for each keyword
• End the query with a semicolon
• Avoid spaces in field names; otherwise the name has to be wrapped in double quotes every time we work with the field

2- Filtering Records

2.1 Filtering numbers

WHERE = filters the records that meet a condition

e.g. WHERE color = 'green'

WHERE release_year >= 1960
WHERE release_year <> 1960 (<> means "not equal to")
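A minimal sketch of the two numeric filters, assuming a hypothetical films table with made-up titles (only release_year appears in the notes), run through Python's sqlite3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE films (title TEXT, release_year INTEGER)")
conn.executemany(
    "INSERT INTO films VALUES (?, ?)",
    [("Film A", 1959), ("Film B", 1960), ("Film C", 1975)],
)

# Films released in 1960 or later.
from_1960 = sorted(
    row[0] for row in conn.execute(
        "SELECT title FROM films WHERE release_year >= 1960;"
    )
)

# Films released in any year except 1960.
not_1960 = sorted(
    row[0] for row in conn.execute(
        "SELECT title FROM films WHERE release_year <> 1960;"
    )
)

print(from_1960, not_1960)
```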
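2.2 Multiple criteria

• OR = at least one of the conditions must be satisfied (used together with WHERE)

When OR is combined with other conditions in the WHERE clause, put the OR conditions in parentheses; without them the filter will not work as intended.

• AND = all of the criteria must be satisfied (used together with WHERE, and can be combined with OR)

If we want to use AND together with OR, we need to enclose the individual clauses in parentheses. For example, to filter films released in 1994 or 1995, and certified PG or R: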
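The film filter described above can be sketched as follows. The films table, its title and certification fields and the rows are assumptions for illustration; the second query drops the parentheses to show how the result changes, because AND is evaluated before OR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE films (title TEXT, release_year INTEGER, certification TEXT)")
conn.executemany(
    "INSERT INTO films VALUES (?, ?, ?)",
    [("A", 1994, "PG"), ("B", 1994, "G"), ("C", 1995, "R"),
     ("D", 2000, "R"), ("E", 1995, "PG-13")],
)

with_parens = sorted(row[0] for row in conn.execute(
    "SELECT title FROM films "
    "WHERE (release_year = 1994 OR release_year = 1995) "
    "AND (certification = 'PG' OR certification = 'R');"
))

# Same conditions without parentheses: read as
# 1994 OR (1995 AND PG) OR R, which lets extra films through.
without_parens = sorted(row[0] for row in conn.execute(
    "SELECT title FROM films "
    "WHERE release_year = 1994 OR release_year = 1995 "
    "AND certification = 'PG' OR certification = 'R';"
))

print(with_parens, without_parens)
```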

• BETWEEN = selects a range (it's inclusive, that is, BETWEEN 1994 AND 2000 will include 1994 and 2000)

BETWEEN works as a shortcut for a pair of >= and <= conditions joined by AND.

A BETWEEN condition can itself be combined with further conditions using AND or OR.
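A quick sketch comparing BETWEEN with the equivalent pair of comparisons, using a hypothetical films table and made-up years (Python's sqlite3).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE films (title TEXT, release_year INTEGER)")
conn.executemany(
    "INSERT INTO films VALUES (?, ?)",
    [("A", 1993), ("B", 1994), ("C", 1997), ("D", 2000), ("E", 2001)],
)

between_rows = sorted(row[0] for row in conn.execute(
    "SELECT title FROM films WHERE release_year BETWEEN 1994 AND 2000;"
))

# The long form that BETWEEN abbreviates; both endpoints are included.
long_form_rows = sorted(row[0] for row in conn.execute(
    "SELECT title FROM films "
    "WHERE release_year >= 1994 AND release_year <= 2000;"
))

print(between_rows)
```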


2.3 Filtering Text

• LIKE = searches for a pattern in a field

% matches zero, one, or many characters. For example, WHERE name LIKE 'Ade%' returns the name Ade (if it exists) and all the names that start with Ade.

_ matches a single character. For example, WHERE name LIKE 'Ev_' returns the three-character names that start with Ev.

The wildcards can be used in any position, for example:
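A sketch of the wildcard patterns with made-up names in a hypothetical people table. One caveat on the assumption: SQLite (used here through Python's sqlite3) matches LIKE case-insensitively for ASCII letters by default, which may differ from other SQL flavours; the sample data is chosen so that this does not affect the results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT)")
names = ["Ade", "Adel", "Adelaide", "Bea", "Ev", "Eva", "Evan", "Eve"]
conn.executemany("INSERT INTO people VALUES (?)", [(n,) for n in names])

def names_from(sql):
    """Run a query that returns names and give them back as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

# % at the end: "Ade" followed by zero or more characters.
starts_with_ade = names_from("SELECT name FROM people WHERE name LIKE 'Ade%';")

# _ stands for exactly one character: "Ev" plus one more.
ev_plus_one = names_from("SELECT name FROM people WHERE name LIKE 'Ev_';")

# Wildcards on both sides: "el" anywhere in the name.
contains_el = names_from("SELECT name FROM people WHERE name LIKE '%el%';")

print(starts_with_ade, ev_plus_one, contains_el)
```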


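• NOT LIKE = returns the records that do not match the pattern (matching is case-sensitive)

• IN = searches for records that match any value in a list

IN lets us replace a chain of OR conditions on the same field with a single list of values.

IN also works with text values.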
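A sketch of NOT LIKE and IN on a hypothetical films table (titles, years and countries are made up), run through Python's sqlite3. The OR chain and the IN list are written out next to each other to show that they select the same records.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE films (title TEXT, release_year INTEGER, country TEXT)")
conn.executemany(
    "INSERT INTO films VALUES (?, ?, ?)",
    [("Alpha", 1920, "Germany"), ("Beta", 1930, "France"),
     ("Gamma", 1935, "Germany"), ("Delta", 1940, "Spain")],
)

def titles(sql):
    """Run a query that returns titles and give them back as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

or_chain = titles(
    "SELECT title FROM films "
    "WHERE release_year = 1920 OR release_year = 1930 OR release_year = 1940;"
)
in_list = titles("SELECT title FROM films WHERE release_year IN (1920, 1930, 1940);")

# IN with text values.
in_countries = titles("SELECT title FROM films WHERE country IN ('Germany', 'France');")

# NOT LIKE keeps the titles that do not start with A.
not_a = titles("SELECT title FROM films WHERE title NOT LIKE 'A%';")

print(in_list, in_countries, not_a)
```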

2.4 NULL values

Missing values = NULL. They can be caused by human error, or because the information is not available or unknown.

COUNT(field_name) = counts only the non-missing values

COUNT(*) = counts all records, including those with missing values

IS NULL = retrieves the records where the field is NULL

IS NOT NULL = retrieves the records where the field is not NULL

When we count a field by name, only the non-NULL values are counted, so it is not necessary to add IS NOT NULL in that case.
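A minimal sketch of how NULLs affect counting and filtering, assuming a hypothetical people table where two of four birthdates are missing (Python's sqlite3; None is stored as NULL).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, birthdate TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Ann", "1990-01-01"), ("Bob", None), ("Cy", "1985-05-05"), ("Di", None)],
)

# COUNT(field) skips NULLs, COUNT(*) counts every record.
count_field, count_all = conn.execute(
    "SELECT COUNT(birthdate), COUNT(*) FROM people;"
).fetchone()

missing = sorted(row[0] for row in conn.execute(
    "SELECT name FROM people WHERE birthdate IS NULL;"
))

# Adding IS NOT NULL gives the same number as COUNT(birthdate) alone.
count_not_null = conn.execute(
    "SELECT COUNT(*) FROM people WHERE birthdate IS NOT NULL;"
).fetchone()[0]

print(count_field, count_all, missing, count_not_null)
```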
In sum:

3 - Aggregate functions

3.1 Summarizing data

- AVG() – gives us the average value
- SUM() – returns the sum of all records in the field
- MIN() – returns the lowest value
- MAX() – returns the highest value
- COUNT() – returns the number of records
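The five aggregate functions can be run in one query. This sketch assumes a hypothetical films table with a budget field and made-up values (Python's sqlite3).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE films (title TEXT, budget REAL)")
conn.executemany(
    "INSERT INTO films VALUES (?, ?)",
    [("A", 100.0), ("B", 200.0), ("C", 600.0)],
)

# Each function collapses the whole budget field into a single value.
summary = conn.execute(
    "SELECT AVG(budget), SUM(budget), MIN(budget), MAX(budget), COUNT(budget) "
    "FROM films;"
).fetchone()

print(summary)
```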
3.2 Summarizing subsets

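(Keywords above + WHERE): aggregate functions can be combined with WHERE to summarize only the records that pass the filter.

ROUND() = rounds our outputs to a given number of decimal places:

We can also round with a negative parameter, which rounds to the left of the decimal point (e.g. -2 rounds to the nearest hundred), for example: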
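A sketch of summarizing a subset and rounding the result, with a hypothetical films table and made-up budgets (Python's sqlite3). Only a positive number of decimal places is shown, because SQLite's ROUND treats a negative second argument as 0, unlike the behaviour the notes describe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE films (release_year INTEGER, budget REAL)")
conn.executemany(
    "INSERT INTO films VALUES (?, ?)",
    [(2005, 100.0), (2010, 10.0), (2012, 20.0), (2015, 25.0)],
)

# WHERE filters first, then AVG summarizes the remaining three records,
# and ROUND trims the average (18.333...) to two decimal places.
avg_budget = conn.execute(
    "SELECT ROUND(AVG(budget), 2) AS avg_budget "
    "FROM films WHERE release_year >= 2010;"
).fetchone()[0]

print(avg_budget)
```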

3.3 Aliasing and arithmetic

Arithmetic

Dividing two integers gives an integer, so SELECT (4 / 3) returns 1. If we want the real result, we need to add decimal places: SELECT (4.0 / 3.0)
The main difference between aggregate functions and arithmetic is that aggregate functions operate vertically (down the records of a field), while arithmetic operates horizontally (across the fields of a record).
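A small sketch of integer versus decimal division, and of the vertical/horizontal distinction. The films table with gross and budget fields and its values are assumptions for illustration (Python's sqlite3).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

int_division = conn.execute("SELECT (4 / 3);").fetchone()[0]
real_division = conn.execute("SELECT (4.0 / 3.0);").fetchone()[0]

conn.execute("CREATE TABLE films (title TEXT, gross INTEGER, budget INTEGER)")
conn.executemany(
    "INSERT INTO films VALUES (?, ?, ?)", [("A", 150, 100), ("B", 80, 90)]
)

# Arithmetic works horizontally: one result per record.
profits = conn.execute(
    "SELECT title, (gross - budget) AS profit FROM films ORDER BY title;"
).fetchall()

# An aggregate works vertically: one result for the whole field.
total_gross = conn.execute("SELECT SUM(gross) FROM films;").fetchone()[0]

print(int_division, real_division, profits, total_gross)
```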

4 – Sorting and Grouping


4.1 Sorting Results

ORDER BY = sorts the results as we want; the default is ascending (smallest to largest, or A-Z). Add DESC to sort in descending order.
We do not have to SELECT the field we sort by, but best practice is to include it.
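A minimal sketch of ascending and descending sorts on a hypothetical films table with made-up rows (Python's sqlite3).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE films (title TEXT, release_year INTEGER)")
conn.executemany(
    "INSERT INTO films VALUES (?, ?)",
    [("B", 2000), ("A", 1990), ("C", 2010)],
)

# Ascending is the default direction.
ascending = conn.execute(
    "SELECT title, release_year FROM films ORDER BY release_year;"
).fetchall()

# DESC reverses it.
descending = conn.execute(
    "SELECT title, release_year FROM films ORDER BY release_year DESC;"
).fetchall()

print(ascending)
print(descending)
```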
4.2 Grouping data
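GROUP BY = groups the records that share the same value in a field, usually so that an aggregate function can summarize each group.
It will give an error if we try to SELECT a field directly that is not in the GROUP BY, so we have to wrap that field in an aggregate function, for example:

We can group by multiple fields, in which case each distinct combination of values forms its own group. When sorting by multiple fields, the 1st field is the main one and the 2nd field is used only when there is a tie.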
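A sketch of grouping by one field and by two fields, using a hypothetical films table with made-up rows (Python's sqlite3). Note one assumption about the engine: SQLite tolerates selecting a field that is neither grouped nor aggregated, so the error mentioned above is not reproduced here; the queries simply follow the rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE films (title TEXT, certification TEXT, release_year INTEGER)")
conn.executemany(
    "INSERT INTO films VALUES (?, ?, ?)",
    [("A", "PG", 1994), ("B", "PG", 1994), ("C", "R", 1994),
     ("D", "R", 1995), ("E", "R", 1995)],
)

# One group per certification; title is wrapped in an aggregate function.
by_cert = conn.execute(
    "SELECT certification, COUNT(title) FROM films "
    "GROUP BY certification ORDER BY certification;"
).fetchall()

# One group per (certification, release_year) combination.
by_cert_year = conn.execute(
    "SELECT certification, release_year, COUNT(title) FROM films "
    "GROUP BY certification, release_year "
    "ORDER BY certification, release_year;"
).fetchall()

print(by_cert)
print(by_cert_year)
```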
4.3 Filtering grouped data
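HAVING = filters groups of data. WHERE cannot filter on aggregate functions, so HAVING takes its place for grouped results: WHERE filters individual records, HAVING filters groups.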
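A minimal sketch of filtering groups, with a hypothetical films table and made-up rows (Python's sqlite3): only the years with more than two films are kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE films (title TEXT, release_year INTEGER)")
conn.executemany(
    "INSERT INTO films VALUES (?, ?)",
    [("A", 1994), ("B", 1994), ("C", 1994), ("D", 1995), ("E", 1995)],
)

# HAVING is applied after the groups are formed, so it can use COUNT().
busy_years = conn.execute(
    "SELECT release_year, COUNT(title) AS film_count FROM films "
    "GROUP BY release_year "
    "HAVING COUNT(title) > 2 "
    "ORDER BY release_year;"
).fetchall()

print(busy_years)
```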
In Sum from Intermediate SQL:
Course 3 - Joining Data in SQL

1 Introducing Inner Joins


1.1 The ins and outs of INNER JOIN

INNER JOIN = looks for records in both tables which match on a given field

Example:

SELECT *
FROM left_table
INNER JOIN right_table
ON left_table.id = right_table.id; -- id is an assumed name for the shared field


An example with the following data:

The INNER JOIN will be like this:


To keep the query shorter, we can alias the tables, like the example below:

Instead of ON, we can use USING when the joining field has the same name in both tables:
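A sketch of an INNER JOIN written with table aliases, once with ON and once with USING. The prime_ministers and presidents tables, their fields and the placeholder rows are assumptions for illustration (Python's sqlite3).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prime_ministers (country TEXT, prime_minister TEXT)")
conn.execute("CREATE TABLE presidents (country TEXT, president TEXT)")
conn.executemany(
    "INSERT INTO prime_ministers VALUES (?, ?)",
    [("Egypt", "PM 1"), ("Portugal", "PM 2"), ("Norway", "PM 3")],
)
conn.executemany(
    "INSERT INTO presidents VALUES (?, ?)",
    [("Egypt", "President 1"), ("Portugal", "President 2"), ("Chile", "President 3")],
)

# p1 and p2 are aliases that shorten the table names.
with_on = conn.execute(
    "SELECT p1.country, p1.prime_minister, p2.president "
    "FROM prime_ministers AS p1 "
    "INNER JOIN presidents AS p2 "
    "ON p1.country = p2.country "
    "ORDER BY p1.country;"
).fetchall()

# USING works because the joining field is called country in both tables.
with_using = conn.execute(
    "SELECT p1.country, p1.prime_minister, p2.president "
    "FROM prime_ministers AS p1 "
    "INNER JOIN presidents AS p2 "
    "USING (country) "
    "ORDER BY p1.country;"
).fetchall()

print(with_on)
```

Only the countries present in both tables come back; Norway and Chile are dropped.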

1.2 Defining relationships

One-to-many relationships:

For example, one artist can produce many songs.

Another relationship is One-to-One:


Many-to-many relationships:

1.3 Multiple Joins


2 Outer Joins, Cross Joins and Self Joins
2.1 Left and Right Joins

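LEFT JOIN = returns all the records from the left table, plus the matching records from the right table. Where there is no match, the result keeps the left table's record with a null in the field right_val.
RIGHT JOIN works the same way but keeps all the records from the right table. The code is the same as for LEFT JOIN but with the RIGHT keyword and with the tables after FROM and JOIN swapped, that is, FROM x LEFT JOIN y returns the same records as FROM y RIGHT JOIN x.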
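A sketch of a LEFT JOIN on hypothetical left_table and right_table data (Python's sqlite3; None is how NULL appears in Python). RIGHT JOIN is not run here because older SQLite versions lack it; instead the tables are swapped in a second LEFT JOIN, which keeps all the records of right_table the way a RIGHT JOIN would.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE left_table (id INTEGER, left_val TEXT)")
conn.execute("CREATE TABLE right_table (id INTEGER, right_val TEXT)")
conn.executemany("INSERT INTO left_table VALUES (?, ?)", [(1, "L1"), (2, "L2"), (3, "L3")])
conn.executemany("INSERT INTO right_table VALUES (?, ?)", [(1, "R1"), (3, "R3"), (4, "R4")])

# Every left record is kept; id 2 has no match, so right_val is NULL.
left_join = conn.execute(
    "SELECT l.id, l.left_val, r.right_val "
    "FROM left_table AS l "
    "LEFT JOIN right_table AS r ON l.id = r.id "
    "ORDER BY l.id;"
).fetchall()

# Swapping the tables keeps every right record instead.
swapped = conn.execute(
    "SELECT r.id, l.left_val, r.right_val "
    "FROM right_table AS r "
    "LEFT JOIN left_table AS l ON r.id = l.id "
    "ORDER BY r.id;"
).fetchall()

print(left_join)
print(swapped)
```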
2.2 Full Joins
2.3 Crossing into Cross Joins

CROSS JOIN can be incredibly helpful when asking questions that involve looking at all possible
combinations or pairings between two sets of data.
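A minimal sketch of a CROSS JOIN producing every pairing of two small, made-up tables (hypothetical sizes and colors tables, run through Python's sqlite3). No ON or USING clause is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sizes (size TEXT)")
conn.execute("CREATE TABLE colors (color TEXT)")
conn.executemany("INSERT INTO sizes VALUES (?)", [("S",), ("M",)])
conn.executemany("INSERT INTO colors VALUES (?)", [("red",), ("blue",)])

# 2 sizes x 2 colors = 4 combinations.
pairs = sorted(conn.execute(
    "SELECT size, color FROM sizes CROSS JOIN colors;"
).fetchall())

print(pairs)
```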

2.4 Self Joins


SELF JOIN = there is no dedicated syntax for a self join, so we normally use INNER JOIN and join the table to itself, giving each copy its own alias. Self joins are very useful for comparing data from one part of a table with another part of the same table.
For example, we can pair up the countries within the same region; this can be interpreted as the PM of each country wanting to see all the other PMs in the region.
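A sketch of a self join that pairs countries within the same region. The countries table, its fields and the rows are assumptions for illustration (Python's sqlite3); the extra <> condition stops a country from being paired with itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE countries (country TEXT, region TEXT)")
conn.executemany(
    "INSERT INTO countries VALUES (?, ?)",
    [("Portugal", "Europe"), ("Spain", "Europe"), ("Egypt", "Africa")],
)

# The same table appears twice, told apart by the aliases c1 and c2.
pairs = conn.execute(
    "SELECT c1.country, c2.country, c1.region "
    "FROM countries AS c1 "
    "INNER JOIN countries AS c2 "
    "ON c1.region = c2.region AND c1.country <> c2.country "
    "ORDER BY c1.country;"
).fetchall()

print(pairs)
```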

3 Set Theory for SQL Joins

3.1 Union and Union All

UNION = stacks the records of two queries on top of each other and removes the duplicates.
UNION ALL is the same as UNION but keeps the duplicates. That is, in the example above, records 1A and 4A of the right table are not shown by UNION because they are already in the left table; if we use UNION ALL, 1A and 4A are shown twice. UNION = 7 records vs UNION ALL = 9 records
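A small sketch of the difference, using made-up data rather than the tables from the notes (hypothetical left_table and right_table with one shared record, run through Python's sqlite3).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE left_table (id INTEGER, val TEXT)")
conn.execute("CREATE TABLE right_table (id INTEGER, val TEXT)")
conn.executemany("INSERT INTO left_table VALUES (?, ?)", [(1, "A"), (2, "B"), (3, "C")])
conn.executemany("INSERT INTO right_table VALUES (?, ?)", [(1, "A"), (4, "D")])

# UNION drops the second copy of (1, 'A').
union_rows = sorted(conn.execute(
    "SELECT id, val FROM left_table UNION SELECT id, val FROM right_table;"
).fetchall())

# UNION ALL keeps both copies.
union_all_rows = sorted(conn.execute(
    "SELECT id, val FROM left_table UNION ALL SELECT id, val FROM right_table;"
).fetchall())

print(len(union_rows), len(union_all_rows))
```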
3.2 At the Intersect

INTERSECT = returns only the records that appear in both queries. We can intersect on two (or more) fields, but a record is only returned if all of those fields match between the left and right tables.
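A sketch of INTERSECT on one field versus two fields, with made-up data in hypothetical left_table and right_table (Python's sqlite3). ID 4 exists in both tables but with different val values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE left_table (id INTEGER, val TEXT)")
conn.execute("CREATE TABLE right_table (id INTEGER, val TEXT)")
conn.executemany("INSERT INTO left_table VALUES (?, ?)", [(1, "A"), (2, "B"), (4, "X")])
conn.executemany("INSERT INTO right_table VALUES (?, ?)", [(1, "A"), (4, "D")])

# On id alone, both 1 and 4 are shared.
shared_ids = sorted(conn.execute(
    "SELECT id FROM left_table INTERSECT SELECT id FROM right_table;"
).fetchall())

# On id and val together, only (1, 'A') matches in full.
shared_records = sorted(conn.execute(
    "SELECT id, val FROM left_table INTERSECT SELECT id, val FROM right_table;"
).fetchall())

print(shared_ids, shared_records)
```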
3.3 Except

EXCEPT = returns only the records of the left table that are not present in the right table. The ID 4 is present in both tables, but since the val field does not match, that record still appears in the result.
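A sketch of EXCEPT with made-up data (hypothetical left_table and right_table, run through Python's sqlite3), including an ID that exists in both tables with different val values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE left_table (id INTEGER, val TEXT)")
conn.execute("CREATE TABLE right_table (id INTEGER, val TEXT)")
conn.executemany("INSERT INTO left_table VALUES (?, ?)", [(1, "A"), (2, "B"), (4, "X")])
conn.executemany("INSERT INTO right_table VALUES (?, ?)", [(1, "A"), (4, "D")])

# (1, 'A') is removed because it matches in full; (4, 'X') stays because
# the right table only has (4, 'D').
left_only = sorted(conn.execute(
    "SELECT id, val FROM left_table EXCEPT SELECT id, val FROM right_table;"
).fetchall())

print(left_only)
```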

4 Subqueries

4.1 Sub querying with semi joins and anti joins


4.2 Sub querying inside WHERE and SELECT

WHERE - subquery
SELECT – subquery
4.3 Sub querying inside FROM

4.4 Summary
Course 4- Data Manipulation in SQL

1 – CASE

1.1 Introduction to CASE


1.2 CASE

1.3 CASE WHEN

2– Short and Simple Subqueries

2.1 WHERE and subqueries

2.2 Subqueries in FROM

2.3 Subqueries in SELECT

2.4 Subqueries everywhere

3– Correlated Queries, Nested Queries, and Common Table Expression

3.1 Correlated subqueries

3.2 Nested subqueries

3.3 Common Table Expression

3.4 Deciding on techniques to use


4– Window Functions

4.1 OVER

4.2 OVER with a PARTITION

4.3 Sliding windows

4.4 Bringing all together
