01 Python 03 SQL Basics

This document provides an introduction to SQL and relational databases. It discusses different types of databases, including relational databases that use SQL and NoSQL databases. It then describes some key SQL concepts like tables, primary keys, and relationships. The document uses an example soccer database and shows how to query it using both a GUI client and Python. It provides examples of common SQL queries involving selection, projection, aggregation, sorting, and joining multiple tables.

Uploaded by

AyoubENSAT

09/05/2023 09:49 01-Python_03-SQL-Basics

SQL Basics

Databases
SQL (1970-now):
SQLite
PostgreSQL
MySQL / MariaDB
Microsoft SQL Server
Oracle
IBM DB2
NoSQL (https://en.wikipedia.org/wiki/NoSQL) (2000-now):
Document (CouchDB, MongoDB, etc.)
Key-value (Couchbase, Dynamo, Redis, Riak, etc.)
Graph (Neo4j, etc.)

Relational Databases
A relational database is a digital database based on the relational model of data, as proposed
by E. F. Codd in 1970.

[...]

This model organizes data into one or more tables (or "relations") of columns and rows,
with a unique primary key identifying each row.

[...]

The primary keys within a database are used to define the relationships among the tables.
When a PK is used in another table, it is named a foreign key. This design pattern can
represent either a one-to-one or one-to-many relationship.

Source: Wikipedia (https://en.wikipedia.org/wiki/Relational_database)
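
To make the primary key / foreign key idea concrete, here is a minimal sketch using an in-memory SQLite database. The table and column names mirror the soccer example used later in this lecture, but the rows and league names are made up for illustration. One assumption worth flagging: SQLite only enforces foreign keys once `PRAGMA foreign_keys` is switched on for the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this is enabled per connection
conn.execute("PRAGMA foreign_keys = ON")

# Country.id is a primary key; League.country_id is a foreign key pointing at it
conn.execute("CREATE TABLE Country (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE League (
        id INTEGER PRIMARY KEY,
        name TEXT,
        country_id INTEGER REFERENCES Country(id)
    )
""")

# Made-up sample rows
conn.execute("INSERT INTO Country (id, name) VALUES (1, 'Belgium')")
conn.execute("INSERT INTO League (id, name, country_id) VALUES (1, 'Sample League', 1)")

# A league pointing at a country that does not exist should be rejected
try:
    conn.execute("INSERT INTO League (id, name, country_id) VALUES (2, 'Orphan League', 999)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

print(fk_rejected)
```

One country can be referenced by many leagues, which is the one-to-many relationship mentioned in the quote.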

SQLite
A database stored in a single file.

👉 sqlite.org (https://www.sqlite.org/index.html)
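
A small sketch of what "stored in a single file" means in practice: connecting to a path creates the file, and reopening the same path gives the data back. The file name `demo.sqlite` and the temporary directory are arbitrary choices for this example.

```python
import os
import sqlite3
import tempfile

# Arbitrary location for the demo database file
db_path = os.path.join(tempfile.mkdtemp(), "demo.sqlite")

conn = sqlite3.connect(db_path)  # creates the file if it does not exist yet
conn.execute("CREATE TABLE Country (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO Country VALUES (1, 'Belgium')")
conn.commit()  # persist the changes to the file
conn.close()

file_exists = os.path.exists(db_path)

# Reopening the same file gives access to the same data
reopened = sqlite3.connect(db_path)
names = [row[0] for row in reopened.execute("SELECT name FROM Country")]
reopened.close()

print(file_exists, names)
```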

Approaching a new Database


As a Data Scientist

https://kitt.lewagon.com/camps/1173/lectures/content/01-Python_03-SQL-Basics.html

SQLite DB Example
👉 European Soccer Database (https://www.kaggle.com/hugomathien/soccer/) on Kaggle

Exploration
Let's use DBeaver (https://dbeaver.io/), a universal database client for developers, SQL programmers,
database administrators and analysts.

ERD Diagram
When discovering a new database, a data scientist should explore and draw the Entity Relationship
Diagram (https://www.visual-paradigm.com/guide/data-modeling/what-is-entity-relationship-diagram/).

👉 Useful tool: kitt.lewagon.com/db (https://kitt.lewagon.com/db) (Save XML)
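
Besides a GUI, the schema can also be explored from Python. The sketch below is one possible approach, not the lecture's method: it lists tables from `sqlite_master` and reads columns and foreign keys with SQLite's `PRAGMA table_info` and `PRAGMA foreign_key_list`. It runs against a tiny in-memory schema invented for the example rather than the real soccer file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Tiny made-up schema standing in for the real database
conn.executescript("""
    CREATE TABLE Country (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE League (
        id INTEGER PRIMARY KEY,
        name TEXT,
        country_id INTEGER REFERENCES Country(id)
    );
""")

# 1. List the tables
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
)]

# 2. List the columns of each table
schema = {}
for table in tables:
    columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    # each row is (cid, name, type, notnull, dflt_value, pk)
    schema[table] = [(col[1], col[2], bool(col[5])) for col in columns]

# 3. List the foreign keys of one table
foreign_keys = conn.execute('PRAGMA foreign_key_list("League")').fetchall()
# each row is (id, seq, table, from, to, on_update, on_delete, match)
links = [(fk[3], fk[2], fk[4]) for fk in foreign_keys]

print(tables)
print(schema)
print(links)
```

The `links` list holds the information an ERD draws as arrows between tables: which column points at which table and column.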

Querying the Database

With DBeaver

Open the SQL Editor and write your first SQL query.

SELECT * FROM Country

Execute the query (click on ▶️ or use the keyboard shortcut Ctrl + Enter)

id |name |
-----|-----------|
1|Belgium |
1729|England |
4769|France |
[...]

With Python
Let's reach for the sqlite3 (https://docs.python.org/3.7/library/sqlite3.html) module from Python's standard library.

In [ ]:

import sqlite3

conn = sqlite3.connect('data/soccer.sqlite')
c = conn.cursor()

In [ ]:

c.execute("SELECT * FROM Country")
rows = c.fetchall()
rows

Out[ ]:

[(1, 'Belgium'),
(1729, 'England'),
(4769, 'France'),
(7809, 'Germany'),
(10257, 'Italy'),
(13274, 'Netherlands'),
(15722, 'Poland'),
(17642, 'Portugal'),
(19694, 'Scotland'),
(21518, 'Spain'),
(24558, 'Switzerland')]

With Python (Advanced)


You can also fetch a list of sqlite3.Row (https://docs.python.org/3/library/sqlite3.html#sqlite3.Row)
elements:

In [ ]:

conn = sqlite3.connect('data/soccer.sqlite')
conn.row_factory = sqlite3.Row
c = conn.cursor()

In [ ]:

c.execute("SELECT * FROM Country")
rows = c.fetchall()
first_row = rows[0]

In [ ]:

first_row['name']

Out[ ]:

'Belgium'

In [ ]:

tuple(first_row)

Out[ ]:

(1, 'Belgium')
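
A short sketch of a few more things a `sqlite3.Row` supports, shown on a one-row in-memory table created just for this example: `keys()` lists the column names, and `dict(row)` turns the row into a plain dictionary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows behave like tuples and like mappings
conn.execute("CREATE TABLE Country (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO Country VALUES (1, 'Belgium')")

row = conn.execute("SELECT * FROM Country").fetchone()

column_names = row.keys()  # list of column names
as_dict = dict(row)        # plain dictionary keyed by column name
by_index = row[1]          # positional access still works
by_name = row["name"]      # access by column name

print(column_names, as_dict, by_index, by_name)
```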

Fetching only one element


Sometimes you know that your query will return only one (or zero!) row. In that case, use fetchone
(https://docs.python.org/3/library/sqlite3.html#sqlite3.Cursor.fetchone), which returns None when no row matches:

In [ ]:

c.execute("SELECT * FROM Country WHERE Country.id = 1")
row = c.fetchone()
print(row[0], '-', row[1])

1 - Belgium

In [ ]:

c.execute("SELECT * FROM Country WHERE Country.id = 2")
row = c.fetchone()
print(row)

None
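
When the id comes from a variable rather than being typed into the query, a common pattern is to pass it as a parameter with a `?` placeholder instead of building the SQL string by hand. The sketch below assumes a small in-memory `Country` table; `find_country` is a hypothetical helper written for this example, not part of `sqlite3`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Country (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO Country (id, name) VALUES (?, ?)",
    [(1, "Belgium"), (1729, "England")],
)


def find_country(connection, country_id):
    """Hypothetical helper: return the matching (id, name) row, or None."""
    cursor = connection.execute(
        "SELECT id, name FROM Country WHERE id = ?",
        (country_id,),  # parameters are passed as a tuple
    )
    return cursor.fetchone()


found = find_country(conn, 1)
missing = find_country(conn, 2)

print(found)
print(missing)
```

Checking the result against `None` before indexing into it avoids a `TypeError` when no row matches.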

SQL

Projection
Choosing which columns the query shall return.

🤔 Retrieve id, season, stage and date of all matches

SELECT "Match".id, "Match".season, "Match".stage, "Match".date
FROM "Match"

💡 Tip: You can alias table names for better readability

SELECT matches.id, matches.season, matches.stage, matches.date
FROM "Match" AS matches

Selection
Selecting which rows the query shall return.

🤔 Retrieve matches which happened in France

SELECT *
FROM "Match" AS matches
WHERE matches.country_id = 4769

🤔 Retrieve matches which happened in Belgium or England

SELECT *
FROM "Match" AS matches
WHERE matches.country_id = 1
OR matches.country_id = 1729

Alternative:

SELECT *
FROM "Match" AS matches
WHERE matches.country_id IN (1, 1729)

🤔 Retrieve players named John

SELECT *
FROM Player
WHERE UPPER(Player.player_name) LIKE 'JOHN %'
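
The trailing space in `'JOHN %'` matters: it restricts the match to names whose first word is exactly "John". The sketch below illustrates this on a handful of invented player names in an in-memory table, passing the pattern as a query parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Player (id INTEGER PRIMARY KEY, player_name TEXT)")
# Invented names, chosen to exercise the pattern
conn.executemany(
    "INSERT INTO Player (id, player_name) VALUES (?, ?)",
    [(1, "John Smith"), (2, "john doe"), (3, "Johnny Example"), (4, "Paul Johnson")],
)

query = """
    SELECT Player.player_name
    FROM Player
    WHERE UPPER(Player.player_name) LIKE ?
    ORDER BY Player.id
"""

# With the space: first name is exactly "John" (any letter case)
first_name_john = [row[0] for row in conn.execute(query, ("JOHN %",))]
# Without the space: any name starting with "John", including "Johnny"
starts_with_john = [row[0] for row in conn.execute(query, ("JOHN%",))]

print(first_name_john)
print(starts_with_john)
```

In SQLite, `LIKE` is already case-insensitive for ASCII letters by default, so `UPPER` mainly keeps the query portable to databases where `LIKE` is case-sensitive.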

Counting
Counting the number of rows matching the selection

🤔 How many players are taller than 2.00 meters?

SELECT COUNT(Player.id)
FROM Player
WHERE Player.height > 200
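
The choice of comparison operator decides whether a player measuring exactly 200 is counted. A minimal sketch on three invented heights (assumed to be in centimeters, as the query above implies):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Player (id INTEGER PRIMARY KEY, height REAL)")
# Invented heights in centimeters; one sits exactly on the boundary
conn.executemany(
    "INSERT INTO Player (id, height) VALUES (?, ?)",
    [(1, 182.88), (2, 200.0), (3, 203.2)],
)

# Strictly taller than 2.00 m
strictly_taller = conn.execute(
    "SELECT COUNT(Player.id) FROM Player WHERE Player.height > 200"
).fetchone()[0]

# 2.00 m or taller
at_least = conn.execute(
    "SELECT COUNT(Player.id) FROM Player WHERE Player.height >= 200"
).fetchone()[0]

print(strictly_taller, at_least)
```

Since `COUNT` always yields exactly one row, `fetchone()[0]` is a convenient way to read the number.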

Sorting
Sorting the rows based on a column (or a group of columns)

🤔 Who are the 10 heaviest players?

SELECT *
FROM Player
ORDER BY Player.weight DESC
LIMIT 10
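
When several rows share the same value, their relative order is not guaranteed unless a second sort column breaks the tie. A small sketch with invented players and weights, sorting by weight and then by name:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Player (id INTEGER PRIMARY KEY, player_name TEXT, weight INTEGER)"
)
# Invented rows; two players share the heaviest weight
conn.executemany(
    "INSERT INTO Player (id, player_name, weight) VALUES (?, ?, ?)",
    [(1, "Player A", 150), (2, "Player C", 243), (3, "Player B", 243), (4, "Player D", 190)],
)

rows = conn.execute("""
    SELECT Player.player_name, Player.weight
    FROM Player
    ORDER BY Player.weight DESC, Player.player_name ASC
    LIMIT 3
""").fetchall()

heaviest = [name for name, weight in rows]
print(heaviest)
```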

Grouping
Grouping rows that share the same value in a given column C, then aggregating each group with a
function (such as COUNT)

🤔 How many matches were played on a per-country basis?

SELECT COUNT(matches.id), matches.country_id
FROM "Match" AS matches
GROUP BY matches.country_id

🤔 What if we want to sort those results? We need an alias:

SELECT COUNT(matches.id) AS match_count, matches.country_id
FROM "Match" AS matches
GROUP BY matches.country_id
ORDER BY match_count DESC
ORDER BY match_count DESC

🤔 How many matches were played on a per-country basis, ignoring countries with fewer than 3000 matches?

SELECT COUNT(matches.id) AS match_count, matches.country_id
FROM "Match" AS matches
GROUP BY matches.country_id
HAVING match_count >= 3000
ORDER BY match_count DESC
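
The grouping queries above can be tried out from Python on a toy dataset. The sketch below uses six invented matches and a threshold of 2 instead of 3000. It also spells out `COUNT(matches.id)` in the `HAVING` clause: SQLite accepts the `match_count` alias there, but some other databases (PostgreSQL, for instance) do not, so repeating the aggregate is the more portable form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "Match" (id INTEGER PRIMARY KEY, country_id INTEGER)')
# Six invented matches: 3 in country 1, 2 in country 1729, 1 in country 4769
conn.executemany(
    'INSERT INTO "Match" (id, country_id) VALUES (?, ?)',
    [(1, 1), (2, 1), (3, 1), (4, 1729), (5, 1729), (6, 4769)],
)

min_matches = 2  # stands in for the 3000 used on the real dataset

rows = conn.execute("""
    SELECT COUNT(matches.id) AS match_count, matches.country_id
    FROM "Match" AS matches
    GROUP BY matches.country_id
    HAVING COUNT(matches.id) >= ?
    ORDER BY match_count DESC
""", (min_matches,)).fetchall()

print(rows)
```

`HAVING` filters groups after aggregation, whereas `WHERE` filters individual rows before they are grouped.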

🤔 How many matches were:

1. won by the home team
2. won by the away team
3. finished with a draw

SELECT
    COUNT(matches.id) AS outcome_count,
    CASE
        WHEN matches.home_team_goal > matches.away_team_goal
            THEN 'home_win'
        WHEN matches.home_team_goal = matches.away_team_goal
            THEN 'draw'
        ELSE 'away_win'
    END AS outcome
FROM "Match" AS matches
GROUP BY outcome
ORDER BY outcome_count DESC
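
To see the `CASE` expression at work, here is a sketch that runs the same query on four invented scores and collects the result into a dictionary. Grouping by the `outcome` alias works in SQLite; this is assumed here rather than guaranteed for every database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE "Match" (
        id INTEGER PRIMARY KEY,
        home_team_goal INTEGER,
        away_team_goal INTEGER
    )
""")
# Four invented results: 2-1, 0-0, 1-3, 3-0
conn.executemany(
    'INSERT INTO "Match" (id, home_team_goal, away_team_goal) VALUES (?, ?, ?)',
    [(1, 2, 1), (2, 0, 0), (3, 1, 3), (4, 3, 0)],
)

rows = conn.execute("""
    SELECT
        COUNT(matches.id) AS outcome_count,
        CASE
            WHEN matches.home_team_goal > matches.away_team_goal
                THEN 'home_win'
            WHEN matches.home_team_goal = matches.away_team_goal
                THEN 'draw'
            ELSE 'away_win'
        END AS outcome
    FROM "Match" AS matches
    GROUP BY outcome
    ORDER BY outcome_count DESC
""").fetchall()

# Each row is (outcome_count, outcome); index the counts by outcome label
outcome_counts = {outcome: count for count, outcome in rows}
print(outcome_counts)
```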

Querying multiple tables


It's time to JOIN.

🤔 Retrieve leagues with their respective country.

SELECT League.name, Country.name
FROM League
JOIN Country ON League.country_id = Country.id

🤔 How many matches were played in each league (with their respective country)?

SELECT
    League.id,
    League.name AS league_name,
    COUNT(matches.id) AS match_count,
    Country.name AS country_name
FROM "Match" AS matches
JOIN League ON matches.league_id = League.id
JOIN Country ON League.country_id = Country.id
GROUP BY League.id
ORDER BY
    match_count DESC,
    country_name ASC
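
The two-step join can be checked on a toy dataset. The sketch below builds three small in-memory tables with the same column names as the soccer database; the league names and the three matches are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Country (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE League (id INTEGER PRIMARY KEY, name TEXT, country_id INTEGER);
    CREATE TABLE "Match" (id INTEGER PRIMARY KEY, country_id INTEGER, league_id INTEGER);

    -- Invented sample data
    INSERT INTO Country VALUES (1, 'Belgium'), (1729, 'England');
    INSERT INTO League VALUES (1, 'League A', 1), (1729, 'League B', 1729);
    INSERT INTO "Match" VALUES (1, 1, 1), (2, 1, 1), (3, 1729, 1729);
""")

rows = conn.execute("""
    SELECT
        League.id,
        League.name AS league_name,
        COUNT(matches.id) AS match_count,
        Country.name AS country_name
    FROM "Match" AS matches
    JOIN League ON matches.league_id = League.id
    JOIN Country ON League.country_id = Country.id
    GROUP BY League.id
    ORDER BY
        match_count DESC,
        country_name ASC
""").fetchall()

for league_id, league_name, match_count, country_name in rows:
    print(league_id, league_name, match_count, country_name)
```

Each match row is first paired with its league, then each league with its country, and only then are the combined rows grouped and counted.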

The order of SQL clauses matters

👉 Source article (https://www.sisense.com/blog/sql-query-order-of-operations/)
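
As a rough illustration, the sketch below annotates a query with the logical evaluation order as it is usually described (FROM, WHERE, GROUP BY, HAVING, SELECT, ORDER BY, LIMIT), which differs from the order in which the clauses are written. It also shows one practical consequence: an aggregate cannot be used in `WHERE`, because rows are filtered before they are grouped. The six matches are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "Match" (id INTEGER PRIMARY KEY, country_id INTEGER)')
# Invented matches: 3 in country 1, 1 in country 1729, 2 in country 4769
conn.executemany(
    'INSERT INTO "Match" (id, country_id) VALUES (?, ?)',
    [(1, 1), (2, 1), (3, 1), (4, 1729), (5, 4769), (6, 4769)],
)

# WHERE is evaluated before GROUP BY, so COUNT() is not available there yet
try:
    conn.execute("""
        SELECT matches.country_id, COUNT(matches.id) AS match_count
        FROM "Match" AS matches
        WHERE COUNT(matches.id) >= 2
        GROUP BY matches.country_id
    """)
    where_error = None
except sqlite3.OperationalError as error:
    where_error = str(error)

# Written order on the left, usual logical order in the comments
rows = conn.execute("""
    SELECT matches.country_id, COUNT(matches.id) AS match_count  -- 5. SELECT
    FROM "Match" AS matches                                       -- 1. FROM
    WHERE matches.country_id != 1729                              -- 2. WHERE
    GROUP BY matches.country_id                                   -- 3. GROUP BY
    HAVING COUNT(matches.id) >= 2                                 -- 4. HAVING
    ORDER BY match_count DESC                                     -- 6. ORDER BY
    LIMIT 1                                                       -- 7. LIMIT
""").fetchall()

print(where_error)
print(rows)
```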

Your turn!
You will learn to:

Explore a new database and draw its schema with kitt.lewagon.com/db (https://kitt.lewagon.com/db)
Query the database with SQL with a GUI client like DBeaver
Use Python to execute SQL queries against the database
