What's in the database?
EXPLORATORY DATA ANALYSIS IN SQL
Christina Maimone
Data Scientist
PostgreSQL
Database client
Select a few rows
SELECT *
FROM company
LIMIT 5;
id | exchange | ticker | name | parent_id
----+----------+--------+-----------------------+-----------
1 | nasdaq | PYPL | PayPal Holdings, Inc. |
2 | nasdaq | AMZN | Amazon.com, Inc. |
3 | nasdaq | MSFT | Microsoft Corporation |
4 | nasdaq | MDB | MongoDB Inc. |
5 | nasdaq | DBX | Dropbox, Inc. |
(5 rows)
A few reminders
Code                            | Note
--------------------------------+-------------------------------------
NULL                            | missing
IS NULL, IS NOT NULL            | don't use = NULL
count(*)                        | number of rows
count(column_name)              | number of non-NULL values
count(DISTINCT column_name)     | number of different non-NULL values
SELECT DISTINCT column_name ... | distinct values, including NULL
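The reminders above can be checked against a tiny inline data set. This is only a sketch: the sample CTE and its val column are hypothetical and not part of the course database.

```sql
-- Hypothetical sample data: four rows, one repeated value and one NULL
WITH sample (val) AS (
    VALUES ('a'), ('b'), ('b'), (NULL)
)
SELECT count(*)                            AS n_rows,          -- 4: every row
       count(val)                          AS n_non_null,      -- 3: the NULL is skipped
       count(DISTINCT val)                 AS n_distinct,      -- 2: 'a' and 'b'
       count(*) FILTER (WHERE val IS NULL) AS is_null_rows,    -- 1
       count(*) FILTER (WHERE val = NULL)  AS equals_null_rows -- 0: = NULL never matches
FROM sample;
```

With the same data, SELECT DISTINCT val would return three rows ('a', 'b', and NULL), one more than count(DISTINCT val) reports.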
Let's start exploring
The keys to the database
Foreign keys
Reference another row
In a different table or the same table
Via a unique ID: the primary key column, containing unique, non-NULL values
Values restricted to values in referenced column OR NULL
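As a rough illustration of the rules above, the following sketch defines a foreign key that points back at its own table. The company_demo table and its rows are hypothetical, loosely modeled on the id and parent_id columns shown earlier.

```sql
-- Hypothetical table; parent_id references id in the same table
CREATE TABLE company_demo (
    id        integer PRIMARY KEY,                  -- unique, non-NULL
    name      text,
    parent_id integer REFERENCES company_demo (id)  -- foreign key
);

INSERT INTO company_demo VALUES (1, 'Parent Co.', NULL);  -- NULL is allowed
INSERT INTO company_demo VALUES (2, 'Child Co.', 1);      -- 1 exists as an id

-- This insert would be rejected: no row has id = 99
-- INSERT INTO company_demo VALUES (3, 'Orphan Co.', 99);
```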
Coalesce function
coalesce(value_1, value_2 [, ...])
Operates row by row
Returns first non-NULL value
SELECT *
FROM prices;
 column_1 | column_2
----------+----------
          |       10
          |
       22 |
        3 |        4
SELECT coalesce(column_1, column_2)
FROM prices;
 coalesce
----------
       10

       22
        3
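coalesce accepts more than two arguments, so a constant can serve as a final fallback when every column is NULL. The sketch below assumes a hypothetical prices_demo CTE holding the same four rows as the prices table above.

```sql
-- Hypothetical inline copy of the four rows shown above
WITH prices_demo (column_1, column_2) AS (
    VALUES (NULL::integer, 10),
           (NULL, NULL),
           (22, NULL),
           (3, 4)
)
SELECT column_1,
       column_2,
       coalesce(column_1, column_2, 0) AS first_non_null  -- 10, 0, 22, 3
FROM prices_demo;
```

The second row, where both columns are NULL, now yields 0 instead of NULL.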
Time to keep exploring!
Column Types and Constraints
Column constraints
Foreign key: value that exists in the referenced column, or NULL
Primary key: unique, not NULL
Unique: values must all be different, except for NULL
Not null: NULL is not allowed; the column must have a value
Check constraints: conditions on the values
column1 > 0
columnA > columnB
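One way these constraints might appear together in a table definition is sketched below. The exchange_demo and listing_demo tables and all of their columns are hypothetical and only loosely echo the company table shown earlier.

```sql
-- Hypothetical lookup table for the foreign key to reference
CREATE TABLE exchange_demo (
    code text PRIMARY KEY                            -- primary key: unique, not NULL
);

CREATE TABLE listing_demo (
    id       integer PRIMARY KEY,
    exchange text REFERENCES exchange_demo (code),   -- foreign key: existing code or NULL
    ticker   text UNIQUE,                            -- unique: all different, NULL excepted
    name     text NOT NULL,                          -- not null: must have a value
    low      numeric CHECK (low > 0),                -- check on a single column
    high     numeric,
    CHECK (high > low)                               -- check comparing two columns
);
```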
Data types
Common    | Special
----------+-----------------
Numeric   | Arrays
Character | Monetary
Date/Time | Binary
Boolean   | Geometric
          | Network Address
          | XML
          | JSON
          | and more!
Numeric types: PostgreSQL documentation
Types in entity relationship diagrams
Casting with CAST()
Format
-- With the CAST function
SELECT CAST (value AS new_type);
Examples
-- Cast 3.7 as an integer
SELECT CAST (3.7 AS integer);
-- Cast a column called total as an integer
SELECT CAST (total AS integer)
FROM prices;
Casting with ::
Format
-- With :: notation
SELECT value::new_type;
Examples
-- Cast 3.7 as an integer
SELECT 3.7::integer;
-- Cast a column called total as an integer
SELECT total::integer
FROM prices;
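Both notations can be compared side by side on literals, with no table required. The values below are illustrative only. Note that PostgreSQL rounds a numeric value when casting it to integer rather than truncating it.

```sql
SELECT CAST(3.7 AS integer) AS cast_function,   -- 4 (rounded, not truncated)
       3.7::integer         AS cast_colons,     -- 4, same result as CAST
       '42'::integer + 1    AS text_to_integer, -- 43
       10 / 4               AS integer_division,-- 2: both operands are integers
       10::numeric / 4      AS numeric_division;-- 2.5, shown with trailing zeros
```

Casting one operand to numeric before dividing is a common reason to reach for :: during exploration, since integer division silently drops the fractional part.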
Time to practice!