WELCOME TO GENERAL ASSEMBLY
Let’s get started...
General Assembly
INTRO TO SQL
Your Instructor
Nigel Caldon
Data Analytics Instructor, General Assembly
Economics, New York University
Worked at:
General Assembly: Intro to SQL
Today’s Learning Objectives
1. Understand the foundations of databases and
when to use SQL
2. Learn and practice the basic syntax of SQL
3. Explore more advanced features of SQL, such as joining,
grouping, filtering, and functions
FUNDAMENTALS OF
DATABASES AND SQL
What’s a Database?
A database is a set of data that has a regular structure and that is organized in
such a way that a computer can easily find the desired information.
Source: The Linux Information Project
What’s SQL?
SQL (pronounced “see-kwel”, or spelled out as “S-Q-L”) is short for Structured Query Language
and was developed at IBM in the early 1970s. It is your “language” to interact
with, modify, and retrieve data from databases.
Data Analytics Workflow
SQL supports the second step of the Data
Analytics Workflow: obtaining your data.
As an analyst, obtaining data often
means accessing a SQL server using the
SQL language to get the data you need.
Data Analytics Workflow
There are also connections to other steps
in the workflow.
‣ Step 1: Before accessing a database, you
should already have an idea of what you need
to solve your problem
‣ Step 2: There are various ways to understand
what data is stored in a database
‣ Steps 3 and 4: We will learn how to query,
structure, clean, and aggregate data using SQL,
as well as calculate useful statistics and build
visualizations after exporting our data
Why do we need SQL?
Why couldn’t we just use Excel?
● Excel is limited by your computer’s available memory and system resources.
● Excel has a fixed upper limit of 1,048,576 rows and 16,384 columns per worksheet.
In other words, Excel is a local tool that cannot comfortably manage or interact
with very large datasets. This is where SQL steps in!
Why do we need SQL?
What can SQL do that Excel can’t?
● SQL can rapidly navigate databases and query, retrieve, and aggregate millions of records.
● SQL is also more adept than Excel at creating data flows for cleaning and preparing data at
high volumes.
● SQL is the industry standard for data query and retrieval.
Why do we need SQL?
However...
● SQL is not a data visualization tool.
● SQL can query and organize data, but is not typically used to analyze it.
In other words, SQL is not a replacement for Excel. Instead, SQL is often used in
conjunction with Excel and other Business Intelligence tools.
SQL Dialects
You’ll come across many variants of SQL out in the wild:
SQL Clients
You’ll have many options for tools to write and execute SQL queries:
Before
we go
further...
WHAT WE’RE
USING TODAY
Tutorial Republic Online SQL Editor
For this workshop we’ll be using Tutorial Republic’s
online SQL editor, because:
● It’s far easier than each of us installing clients and
connecting to a live SQL server
● It’s an easy place to learn and practice SQL with
sample data
But keep in mind that:
● It’s never going to be this easy to set up and configure your environment
● There are many small (and sometimes important) differences between SQL dialects
Let’s fire up CodeLab!
http://bit.ly/codelabsql
Remember, this only works in Chrome, Safari, and Opera
BASIC SQL SYNTAX
Phrases to Remember
✓ SELECT: Selects the fields (i.e., columns).
✓ FROM: Points to the table.
✓ WHERE: Filters on rows.
✓ GROUP BY: Aggregates across values of a variable.
✓ HAVING: Filters groups.
✓ ORDER BY: Sorts or arranges the results.
✓ LIMIT: Limits result to the first n rows.
A Basic Query
SELECT *        -- Retrieves all fields available...
FROM orders;    -- ...from the table “orders”
Entity Relationship Diagram
An ERD shows the relationships of
entity sets stored in a database. An
entity, in this context, is a component of
data. In other words, ERDs illustrate the
logical structure of databases.
Source: Dataedo
Activity: Mapping the Data Structure
DIRECTIONS
1. Click on each table to explore the data (and notice that CodeLab
generates a SQL query each time).
2. Examining the column headers, figure out the relationship between
the orders, order_details, products, and customers tables.
3. Document these tables as an Entity Relationship Diagram.
DELIVERABLE
Entity Relationship Diagram for the sample data set.
Activity: Mapping the Data Structure
customers      orders         order_details   products
cust_id        order_id       order_id        product_id
cust_name      cust_id        product_id      product_name
address        order_date     units           supplier_id
city           order_value                    category_id
postal_code    shipper_id                     quantity_per_unit
country                                       price
SQL “Grammar”
● Carriage returns and tabs are ignored, and are often used to enhance readability.
● Queries are not case sensitive (except within strings).
● Strings are enclosed in ‘single quotes’.
● A semicolon ends your query.
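The grammar rules above can be tried outside the workshop editor. The following is a minimal sketch, assuming Python's built-in sqlite3 module and two made-up customer rows (not the workshop's sample data). It runs the same query in two layouts and then shows that case inside a string still matters in SQLite; other dialects and collations may compare strings differently.

```python
import sqlite3

# In-memory SQLite database with made-up rows (not the workshop's data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (cust_name TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Anna", "Germany"), ("Bruno", "Brazil")],
)

# The same query written two ways: keyword case and line breaks do not matter.
compact = "select cust_name from customers where country = 'Germany';"
spread_out = """
SELECT cust_name
FROM   customers
WHERE  country = 'Germany';
"""
compact_rows = conn.execute(compact).fetchall()
spread_rows = conn.execute(spread_out).fetchall()

# Case inside a string does matter here: 'germany' matches nothing in SQLite.
lowercase_rows = conn.execute(
    "SELECT cust_name FROM customers WHERE country = 'germany';"
).fetchall()

print(compact_rows, spread_rows, lowercase_rows)
```

Both layouts return the single German customer, while the lowercase string returns no rows.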
Adding to the Query
SELECT product_name, price   -- Retrieve just the “product_name” and “price” fields...
FROM products;               -- ...from the table “products”
Adding ORDER BY and LIMIT
SELECT product_name, price   -- Retrieve just the “product_name” and “price” fields
FROM products                -- From the table “products”
ORDER BY price               -- Sort “price” from low to high (use DESC for high to low)
LIMIT 5;                     -- Display first 5 rows
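To see the effect of sort direction together with LIMIT, here is a small sketch that assumes Python's built-in sqlite3 module and four invented products (names and prices are illustrative only). LIMIT is supported by SQLite, MySQL, and PostgreSQL; some dialects, such as SQL Server, use TOP instead.

```python
import sqlite3

# In-memory SQLite database; product names and prices are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product_name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("Chai", 18.0), ("Tofu", 23.25), ("Konbu", 6.0), ("Ikura", 31.0)],
)

# Default sort is ascending, so LIMIT 2 keeps the two cheapest products.
cheapest = conn.execute(
    "SELECT product_name, price FROM products ORDER BY price LIMIT 2;"
).fetchall()

# DESC reverses the sort, so LIMIT 2 now keeps the two most expensive.
priciest = conn.execute(
    "SELECT product_name, price FROM products ORDER BY price DESC LIMIT 2;"
).fetchall()

print(cheapest)  # [('Konbu', 6.0), ('Chai', 18.0)]
print(priciest)  # [('Ikura', 31.0), ('Tofu', 23.25)]
```

Because LIMIT is applied after sorting, changing only the sort direction changes which rows survive.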
Adding DISTINCT
SELECT DISTINCT country   -- Retrieve just the “country” field, and only return unique values
FROM customers            -- From the table “customers”
ORDER BY country;         -- Sort in alphabetical order
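A quick way to check what DISTINCT removes is to compare the result with and without it. This sketch assumes Python's built-in sqlite3 module and four made-up customers, two of whom share a country.

```python
import sqlite3

# In-memory SQLite database; customer rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (cust_name TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Anna", "Germany"), ("Bruno", "Brazil"),
     ("Clara", "Germany"), ("Denis", "France")],
)

# Without DISTINCT there is one row per customer, so Germany appears twice.
all_countries = conn.execute(
    "SELECT country FROM customers ORDER BY country;"
).fetchall()

# With DISTINCT each country appears once.
unique_countries = conn.execute(
    "SELECT DISTINCT country FROM customers ORDER BY country;"
).fetchall()

print(len(all_countries))  # 4
print(unique_countries)    # [('Brazil',), ('France',), ('Germany',)]
```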
Activity: Practicing Basic Queries
DIRECTIONS
1. Create a table that includes just the order_id and order_value fields.
2. Display the top 10 largest orders.
DELIVERABLE
Executed SQL query that results in the top 10 largest orders and their
associated order IDs.
WHERE Conditions
WHERE clauses filter the rows a query returns, and can be applied to
different types of data (numbers, text, dates).
Remember, when combining logical operators (AND, OR), be aware of the order of
operations: AND is evaluated before OR. Use (parentheses) to create groupings or priorities.
WHERE Conditions: Order of Operations
Order of Operations = PEMDAS
P    Parentheses: these go first.
E    Exponents: powers and square roots are next.
MD   Multiplication & Division: left to right.
AS   Addition & Subtraction: left to right.
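PEMDAS covers arithmetic; for logical operators, SQL evaluates AND before OR unless parentheses say otherwise. The sketch below illustrates both cases, assuming Python's built-in sqlite3 module and four made-up products (category numbers and prices are invented).

```python
import sqlite3

# In-memory SQLite database; products, categories, and prices are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (product_name TEXT, category_id INTEGER, price REAL)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [("Chai", 1, 18.0), ("Ikura", 2, 31.0), ("Konbu", 2, 6.0), ("Tofu", 3, 23.25)],
)

# Arithmetic follows PEMDAS: multiplication first unless parentheses intervene.
arithmetic = conn.execute("SELECT 2 + 3 * 4, (2 + 3) * 4;").fetchone()

# AND binds tighter than OR, so this reads as:
# category 1 (any price) OR (category 2 AND price over 20).
no_parens = conn.execute(
    "SELECT product_name FROM products "
    "WHERE category_id = 1 OR category_id = 2 AND price > 20 "
    "ORDER BY product_name;"
).fetchall()

# Parentheses force the OR first: (category 1 or 2) AND price over 20.
with_parens = conn.execute(
    "SELECT product_name FROM products "
    "WHERE (category_id = 1 OR category_id = 2) AND price > 20 "
    "ORDER BY product_name;"
).fetchall()

print(arithmetic)   # (14, 20)
print(no_parens)    # [('Chai',), ('Ikura',)]
print(with_parens)  # [('Ikura',)]
```

The two WHERE clauses differ only in parentheses, yet the unparenthesized version also returns Chai, which costs less than 20.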
WHERE Conditions: Operators
Let’s learn about a few operators and then apply them to our sample data set:
=          Equal to.
!=         Not equal to (also written <>).
>, >=      Greater than, greater than or equal to.
<, <=      Less than, less than or equal to.
BETWEEN    Within a range (both endpoints included).
IN ( )     Found in a list of items.
LIKE       Matches a text pattern.
%          Wildcard used with LIKE: matches any sequence of characters.
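A short sketch of BETWEEN, IN, and LIKE in action, assuming Python's built-in sqlite3 module and four made-up products. The helper `names` is hypothetical and only exists to keep the example compact. Note that LIKE's case sensitivity varies between dialects.

```python
import sqlite3

# In-memory SQLite database; products, categories, and prices are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (product_name TEXT, category_id INTEGER, price REAL)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [("Chai", 1, 18.0), ("Ikura", 2, 31.0), ("Konbu", 2, 6.0), ("Tofu", 3, 23.25)],
)

def names(where_clause):
    """Hypothetical helper: product names matching a WHERE clause, sorted A-Z."""
    query = (
        "SELECT product_name FROM products WHERE "
        + where_clause
        + " ORDER BY product_name;"
    )
    return [row[0] for row in conn.execute(query)]

# BETWEEN includes both endpoints, so the 6.0 product qualifies.
in_range = names("price BETWEEN 6 AND 20")

# IN checks membership in a list of values.
in_list = names("category_id IN (1, 3)")

# LIKE with the % wildcard: any name ending in the letter u.
ends_with_u = names("product_name LIKE '%u'")

print(in_range)     # ['Chai', 'Konbu']
print(in_list)      # ['Chai', 'Tofu']
print(ends_with_u)  # ['Konbu', 'Tofu']
```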
Adding WHERE
SELECT cust_name, country     -- Retrieve just the “cust_name” and “country” fields
FROM customers                -- From the table “customers”
WHERE country = 'Germany';    -- Only show rows where the “country” field has the value “Germany”
Activity: Practicing WHERE clauses
DIRECTIONS
1. Create a table that includes just the order_id and order_value fields.
2. Display the orders where the order_value is greater than $4,000.
DELIVERABLE
Executed SQL query that results in orders over $4,000 and their associated
order IDs.
A FEW ADVANCED
SQL TOPICS
Aggregation Functions
There are many SQL commands that can be used to aggregate your data, such as:
MIN, MAX, SUM, COUNT, and AVG.
A valuable SQL command to use with aggregations is GROUP BY. It indicates the
dimension you want to group your data by (e.g., a category whose values define the
subgroups), so that the aggregation is calculated once per subgroup.
Adding aggregation and GROUP BY
SELECT COUNT(cust_id), country   -- Create one column counting “cust_id” values and one for “country”
FROM customers                   -- From the table “customers”
GROUP BY country                 -- Perform the aggregation (in this case, COUNT) for each unique value in “country”
ORDER BY 1;                      -- Sort by the first column from low to high
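The sketch below runs a grouped count and then adds HAVING, which the “Phrases to Remember” slide describes as a filter on groups. It assumes Python's built-in sqlite3 module and six made-up customers; the data and the threshold of 2 are illustrative only.

```python
import sqlite3

# In-memory SQLite database; customer rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (cust_id INTEGER, country TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Germany"), (2, "Germany"), (3, "Germany"),
     (4, "Brazil"), (5, "Brazil"), (6, "France")],
)

# One output row per country; ORDER BY 2 refers to the second selected column.
per_country = conn.execute(
    "SELECT country, COUNT(cust_id) FROM customers "
    "GROUP BY country ORDER BY 2 DESC;"
).fetchall()

# HAVING filters the groups after aggregation (WHERE filters rows before it).
at_least_two = conn.execute(
    "SELECT country, COUNT(cust_id) FROM customers "
    "GROUP BY country HAVING COUNT(cust_id) >= 2 ORDER BY country;"
).fetchall()

print(per_country)   # [('Germany', 3), ('Brazil', 2), ('France', 1)]
print(at_least_two)  # [('Brazil', 2), ('Germany', 3)]
```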
Activity: Practicing Aggregation
DIRECTIONS
1. Create a table that only displays order_value and shipper_id.
2. Display the average order_value for each shipper_id.
DELIVERABLE
Executed SQL query that results in the average value of orders each
shipper ID was asked to handle.
JOINing Tables
JOINs are similar to
VLOOKUPs in Excel.
That is, you are connecting two
tables together so you can use
information from both tables
for your final result.
JOINing Tables
When you create a JOIN, each field in the SELECT clause needs to specify which table it
comes from:
table1 → table1.field1    here, the field is in table1
table2 → table2.field1    this field is in table2
After the FROM clause, add a JOIN and specify the table to which you want to link.
Finally, specify the fields on which you want to link the tables, which looks like this:
ON a.column_name = b.column_name
Adding a JOIN
-- Create columns for “product_name” and “category_name” from their respective tables
SELECT products.product_name, categories.category_name
FROM products
-- JOIN the “products” and “categories” tables
JOIN categories
-- Specify which fields should be used to combine the tables
ON products.category_id = categories.category_id;
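Here is the same kind of JOIN run end to end, as a sketch assuming Python's built-in sqlite3 module and made-up rows for the “products” and “categories” tables. The second query uses the short aliases p and c, which are arbitrary names chosen for this example. One product points at a category that does not exist, to show that a plain JOIN only returns rows with a match in both tables.

```python
import sqlite3

# In-memory SQLite database; all rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE categories (category_id INTEGER, category_name TEXT)")
conn.execute("CREATE TABLE products (product_name TEXT, category_id INTEGER)")
conn.executemany(
    "INSERT INTO categories VALUES (?, ?)",
    [(1, "Beverages"), (2, "Seafood")],
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("Chai", 1), ("Ikura", 2), ("Konbu", 2), ("Unlisted item", 9)],
)

# Full table names in front of every field.
full_names = conn.execute(
    "SELECT products.product_name, categories.category_name "
    "FROM products "
    "JOIN categories ON products.category_id = categories.category_id "
    "ORDER BY products.product_name;"
).fetchall()

# The same query with table aliases (p and c) to shorten the field references.
with_aliases = conn.execute(
    "SELECT p.product_name, c.category_name "
    "FROM products AS p "
    "JOIN categories AS c ON p.category_id = c.category_id "
    "ORDER BY p.product_name;"
).fetchall()

print(full_names)
# [('Chai', 'Beverages'), ('Ikura', 'Seafood'), ('Konbu', 'Seafood')]
```

“Unlisted item” is missing from the result because category 9 has no row in “categories”, much like a VLOOKUP that finds no match.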
Activity: Practicing Aggregation and JOINs
DIRECTIONS
1. Create a table that only displays order_value and shipper_name.
2. Display the total order_value for each shipper_name.
DELIVERABLE
Executed SQL query that results in the total value of orders each shipper
was asked to handle.
CONCLUSION
Recap of Learning Objectives
1. Understand the foundations of databases and
when to use SQL
2. Learn and practice the basic syntax of SQL
3. Explore more advanced features of SQL, such as joining,
grouping, filtering, and functions
Resources
● TutorialRepublic SQL Tutorial:
https://www.tutorialrepublic.com/sql-tutorial
● SQL Tutorial Cheat Sheet:
http://www.sqltutorial.org/sql-cheat-sheet
● Khan Academy Intro to SQL:
https://www.khanacademy.org/computing/computer-programming/sql
● Basic Business Statistics: Concepts and Applications by Berenson, Levine, Krehbiel. 13th Edition.
http://a.co/d/j1j1RtQ