Col362 HW1
Instructions
• You can install PostgreSQL in the default directory. If you prefer, you can instead
install it in a separate directory (for ease of removal later) by passing the appropriate
arguments to the configure script. Please read the configure help for more details.
• On some machines – namely, Macs with an M1/M2 chip – you may get an
error saying that the platform does not support spinlocks. You can disable spinlocks by
passing the appropriate argument to the configure script. On all other machines you will
not have this issue.
4. After successful installation, run psql in the terminal and, at the prompt, run the query
SELECT version(); and submit the output in the report. All subsequent SQL queries should
also be run at the psql prompt.
2 Dataset Description
We all still fondly remember the exciting final of FIFA WC'22 between Argentina and
France, which is considered one of the greatest football matches ever. In this homework, we take
a short history tour and go back in time to look at the past FIFA WCs.
This is a comprehensive database about the FIFA World Cups that covers 21 World Cup
tournaments (1930-2018) in 7 tables. The description is given only for a subset of columns;
the other columns are self-explanatory. The schema of the database can be found here
1. players: This table has details about the players who participated in FIFA WCs.
Column          Description
team_id         The unique ID number for the team that scored the goal. For own
                goals, this is the team that is awarded the goal, not the team of
                the player who scored the own goal
player_id       The unique ID number for the player who scored the goal
player_team_id  The unique ID number for the team of the player who scored the
                goal
Column                      Description
stage_name                  The stage of the tournament in which the match occurred
home_team_id                The unique ID number for the home team
away_team_id                The unique ID number for the away team
home_team_score             Number of goals scored by the home team
away_team_score             Number of goals scored by the away team
home_team_score_penalties   The score of the home team in the penalty shootout
away_team_score_penalties   The score of the away team in the penalty shootout
4. penalty_kicks: This table records all penalty kicks taken during penalty shootouts. It
does not include penalty kicks attempted during regular play.
Column     Description
converted  Whether the penalty kick was converted
5. stadiums: This table records all stadiums that have hosted a World Cup match.
6. teams: This table records all teams who have participated in a World Cup match.
3 Loading the Dataset
The 'data' folder provided has 7 CSV files. Run 'create_table.sql' (also provided in the data
folder) to create the tables. To load the data from the CSV files into these tables, use the command:
COPY table_name FROM 'path/to/file.csv' DELIMITER ',' CSV HEADER;
Note that the CSV files must be loaded in a particular order. It is left as an exercise for
you to identify the right order for loading the data.
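As a hedged illustration of the COPY command above, the sketch below loads a single table. The table name teams, the file name teams.csv, and both paths are placeholders assumed for illustration, not names confirmed by the handout; substitute your own. It also shows psql's \copy meta-command, which is an alternative not mentioned above: it reads the file on the machine where psql runs rather than on the server.

```sql
-- Placeholder names: "teams", "teams.csv", and the paths are assumptions;
-- replace them with the actual table and the location of your data folder.

-- Server-side COPY: the file is read by the PostgreSQL server process,
-- so the path should be absolute and readable by the server's OS user.
COPY teams FROM '/absolute/path/to/data/teams.csv' DELIMITER ',' CSV HEADER;

-- Client-side alternative (psql meta-command, written on a single line):
-- the file is read by psql itself, so a path relative to the directory
-- where psql was started also works.
\copy teams FROM 'data/teams.csv' DELIMITER ',' CSV HEADER
```

If a load fails with a permission error on the file, the client-side form is usually the simpler thing to try, since it does not require the server process to have access to your files.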
4 SQL
4.1 Exercise - 1
Run the following queries, submit the output, and describe in the report what each query does.
1. SELECT COUNT(*) FROM Matches JOIN Tournaments ON Matches.tournament_id =
Tournaments.tournament_id AND tournament_name = '2014 FIFA World Cup';
2. SELECT COUNT(*) FROM (SELECT DISTINCT Goals.match_id FROM Players JOIN Goals
ON Players.player_id = Goals.player_id AND Players.family_name = 'Mbappé' AND given_name
= 'Kylian') AS t;
3. SELECT DISTINCT team_name FROM Teams JOIN Matches ON (Teams.team_id = Matches.home_team_id
OR Teams.team_id = Matches.away_team_id) AND Matches.stage_name = 'final';
4. SELECT COUNT(*) FROM Teams JOIN (SELECT * FROM Matches JOIN Teams ON ((Teams.team_id
= Matches.home_team_id OR Teams.team_id = Matches.away_team_id) AND team_name =
'Germany')) AS t ON ((Teams.team_id = t.home_team_id OR Teams.team_id = t.away_team_id)
AND Teams.team_name = 'France' AND t.stage_name != 'group stage');
5. SELECT DISTINCT player_id FROM Goals JOIN (SELECT * FROM Matches JOIN Tournaments
ON Matches.tournament_id = Tournaments.tournament_id AND tournament_name
= '1930 FIFA World Cup') AS t1 ON Goals.match_id = t1.match_id AND Goals.own_goal =
FALSE;
4.2 Exercise - 2
Write SQL queries for the following:
1. Tournaments in which the host country is the winner of the tournament
Output: tournament_id, tournament_name, year, winner
2. Players who played in at least 4 WCs
Output: player_id, family_name, given_name, count_tournaments
3. Number of drawn matches Croatia was a part of
Output: num_matches
4. Stadium in which the final match of the "1990 FIFA World Cup" tournament was held
Output: stadium_name, city_name, country_name
5. Number of goals scored by Ronaldo (family name: Ronaldo, given name: Cristiano) in all
WCs (don't include self goals)
Output: num_goals
6. Player with the highest number of goals in WCs from 2002 to 2018 (both years inclusive, don't
include self goals)
Output: player_id, family_name, given_name, num_goals
7. Team which scored the highest number of self goals (conceding a goal to the opponent) in the last 3 WCs
Output: team_id, team_name, num_self_goals