SQL Notes
What is Data?
Data refers to raw facts, figures, or information that can be processed or analyzed to derive
meaningful insights.
What is a Database?
A database is an organized collection of data that allows efficient storage, retrieval, and
management.
What is a DBMS?
A DBMS (Database Management System) is software that manages and controls databases, ensuring data integrity, security,
and efficient processing. Examples include MySQL, PostgreSQL, and Oracle.
DBMS Operations:
1. Data Storage
2. Data Retrieval
3. Data Modification
4. Data Security
5. Backup & Recovery
Advantages:
Disadvantages:
• Complex Implementation
• Expensive Hardware & Software
• Requires Skilled Personnel
• Performance Issues with Large Databases
What is RDBMS?
An RDBMS (Relational Database Management System) stores data in tables made of rows and columns,
with relationships between tables defined through keys. Examples: MySQL, SQL Server, PostgreSQL.
What is SQL?
SQL (Structured Query Language) is the standard language for defining, querying, and modifying
data in relational databases.
WHERE Clause
Syntax:
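A minimal sketch of the WHERE clause in MySQL. The table demo_students and its columns are hypothetical, made up only for this illustration.

```sql
-- Hypothetical table, used only for this illustration.
DROP TABLE IF EXISTS demo_students;
CREATE TABLE demo_students (
    id    INT PRIMARY KEY,
    name  VARCHAR(50),
    age   INT,
    marks INT
);
INSERT INTO demo_students (id, name, age, marks) VALUES
    (1, 'Asha', 20, 82),
    (2, 'Ben',  22, 67),
    (3, 'Chen', 21, 91);

-- General form: SELECT columns FROM table WHERE condition;
SELECT name, marks
FROM demo_students
WHERE marks > 80 AND age >= 20;
-- Matches Asha (82) and Chen (91); Ben fails the marks condition.
```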
SQL Operators
SQL provides various operators to use with the WHERE clause.
1. Comparison Operators
o = (Equal to)
o != or <> (Not equal to)
o > (Greater than)
o < (Less than)
o >= (Greater than or equal to)
o <= (Less than or equal to)
2. Logical Operators
o AND (Both conditions must be true)
o OR (Either condition can be true)
o NOT (Negates a condition)
3. Special Operators
o BETWEEN (Range condition)
o LIKE (Pattern matching)
o IS NULL (Checks if value is NULL)
o IN (Matches values in a list)
o DISTINCT (Removes duplicate values; used in the SELECT list rather than in WHERE)
Special Operators
BETWEEN
Syntax:
LIKE Operator
Syntax:
IS NULL
Syntax:
IN
Syntax:
DISTINCT
Syntax:
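The five special operators listed above can be sketched in one short script. The table demo_students and its data are assumptions chosen so that each query matches a different subset of rows.

```sql
-- Hypothetical table, used only for this illustration.
DROP TABLE IF EXISTS demo_students;
CREATE TABLE demo_students (
    id   INT PRIMARY KEY,
    name VARCHAR(50),
    age  INT,
    city VARCHAR(50)
);
INSERT INTO demo_students (id, name, age, city) VALUES
    (1, 'Asha', 20, 'Pune'),
    (2, 'Ben',  22, NULL),
    (3, 'Chen', 25, 'Delhi'),
    (4, 'Anil', 19, 'Pune');

-- BETWEEN is inclusive at both ends. Matches Asha and Ben.
SELECT name FROM demo_students WHERE age BETWEEN 20 AND 22;

-- LIKE: % matches any run of characters, _ matches exactly one.
-- Matches Asha and Anil.
SELECT name FROM demo_students WHERE name LIKE 'A%';

-- IS NULL: "= NULL" never matches, so NULL needs its own test. Matches Ben.
SELECT name FROM demo_students WHERE city IS NULL;

-- IN: matches any value in the list. Matches Asha, Chen, and Anil.
SELECT name FROM demo_students WHERE city IN ('Pune', 'Delhi');

-- DISTINCT: one row per distinct value. Returns Pune, Delhi, and NULL.
SELECT DISTINCT city FROM demo_students;
```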
Aggregation Functions
Aggregation functions perform calculations on multiple rows and return a single value.
GROUP BY Clause - The GROUP BY clause groups rows with the same values in specified
columns and applies aggregation functions.
HAVING Clause
Used to filter aggregated results (similar to WHERE, but for grouped data).
Syntax:
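A small sketch of how WHERE, GROUP BY, and HAVING work together. The table demo_sales is hypothetical; the point is that WHERE filters rows before grouping, while HAVING filters the groups afterwards.

```sql
-- Hypothetical table, used only for this illustration.
DROP TABLE IF EXISTS demo_sales;
CREATE TABLE demo_sales (
    region VARCHAR(20),
    amount INT
);
INSERT INTO demo_sales (region, amount) VALUES
    ('North', 100), ('North', 250),
    ('South', 80),  ('South', 40),
    ('East',  300);

SELECT region,
       COUNT(*)    AS num_sales,
       SUM(amount) AS total,
       AVG(amount) AS average
FROM demo_sales
WHERE amount > 50            -- drops the South sale of 40 before grouping
GROUP BY region
HAVING SUM(amount) > 200;    -- drops the South group (total 80) after grouping
-- Keeps North (2 sales, total 350) and East (1 sale, total 300).
```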
SQL Constraints
SQL constraints are rules applied to table columns to enforce data integrity and ensure the
accuracy and reliability of data. Constraints prevent invalid data entry and maintain database
consistency.
1. NOT NULL Constraint
The NOT NULL constraint ensures that a column cannot store NULL values.
2. UNIQUE Constraint
The UNIQUE constraint ensures that all values in a column are distinct (no duplicate values).
3. PRIMARY KEY Constraint
The PRIMARY KEY uniquely identifies each record in a table. It combines the NOT
NULL and UNIQUE constraints.
4. FOREIGN KEY Constraint
The FOREIGN KEY constraint ensures that values in a column match values in the
referenced table. It maintains referential integrity between tables.
5. CHECK Constraint
The CHECK constraint ensures that every value in a column satisfies a specified condition.
6. DEFAULT Constraint
The DEFAULT constraint assigns a default value to a column when no value is provided.
7. AUTO_INCREMENT Constraint
AUTO_INCREMENT automatically generates the next sequential number for a column when a new row
is inserted.
8. Composite Key
A Composite Key is a primary key that consists of multiple columns to uniquely identify
each record.
9. INDEX
An INDEX is used to improve query performance. It does not enforce data integrity but helps
in faster retrieval of records.
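The constraints described above can be combined in a few table definitions. This is a sketch under assumptions: the demo_ tables and their columns are hypothetical, and it targets MySQL 8.0.16 or later, where CHECK constraints are actually enforced.

```sql
-- Hypothetical tables, used only for this illustration.
-- Drop the child table first so the foreign keys do not block the drops.
DROP TABLE IF EXISTS demo_enrollments;
DROP TABLE IF EXISTS demo_students;
DROP TABLE IF EXISTS demo_courses;

CREATE TABLE demo_courses (
    course_id INT AUTO_INCREMENT PRIMARY KEY,   -- AUTO_INCREMENT + PRIMARY KEY
    title     VARCHAR(100) NOT NULL UNIQUE      -- NOT NULL + UNIQUE
);

CREATE TABLE demo_students (
    student_id INT AUTO_INCREMENT PRIMARY KEY,
    email      VARCHAR(100) NOT NULL UNIQUE,
    age        INT CHECK (age >= 16),           -- CHECK
    status     VARCHAR(20) DEFAULT 'active'     -- DEFAULT
);

CREATE TABLE demo_enrollments (
    student_id  INT,
    course_id   INT,
    enrolled_on DATE,
    PRIMARY KEY (student_id, course_id),        -- composite key
    FOREIGN KEY (student_id) REFERENCES demo_students (student_id),
    FOREIGN KEY (course_id)  REFERENCES demo_courses (course_id)
);

-- An index speeds up lookups by date; it adds no integrity rule.
CREATE INDEX idx_enrolled_on ON demo_enrollments (enrolled_on);
```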
ORDER BY Clause
The ORDER BY clause is used to sort the result set in either ascending (ASC) or descending (DESC)
order. By default, it sorts the results in ascending order.
Syntax:
This query sorts the student records by age in ascending order (smallest to largest).
This query sorts the student records by age in descending order (largest to smallest).
First, it sorts by age in ascending order. If multiple students have the same age, it sorts by
marks in descending order.
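The three sorts described above might look like the following. The table demo_students and its rows are assumptions; only the age and marks columns come from the descriptions in the text.

```sql
-- Hypothetical table, used only for this illustration.
DROP TABLE IF EXISTS demo_students;
CREATE TABLE demo_students (
    id    INT PRIMARY KEY,
    name  VARCHAR(50),
    age   INT,
    marks INT
);
INSERT INTO demo_students (id, name, age, marks) VALUES
    (1, 'Asha', 21, 75),
    (2, 'Ben',  20, 88),
    (3, 'Chen', 21, 92);

-- Ascending by age (ASC is the default): Ben (20) comes first.
SELECT * FROM demo_students ORDER BY age;

-- Descending by age: Ben (20) comes last.
SELECT * FROM demo_students ORDER BY age DESC;

-- Age ascending, then marks descending within the same age:
-- Ben (20, 88), Chen (21, 92), Asha (21, 75).
SELECT * FROM demo_students ORDER BY age ASC, marks DESC;
```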
UNION and UNION ALL in SQL
The UNION and UNION ALL operators are used to combine the results of multiple SELECT queries.
The UNION operator merges results from two queries and removes duplicates.
Syntax:
Example:
Tables:
students_2023
students_2024
Result:
The UNION ALL operator merges results from two queries and keeps duplicates.
Syntax:
Example:
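A sketch contrasting UNION with UNION ALL. The table names echo students_2023 and students_2024 from the text, prefixed with demo_ so nothing real is overwritten; the single name column and the rows are assumptions.

```sql
-- Hypothetical tables, used only for this illustration.
DROP TABLE IF EXISTS demo_students_2023, demo_students_2024;
CREATE TABLE demo_students_2023 (name VARCHAR(50));
CREATE TABLE demo_students_2024 (name VARCHAR(50));
INSERT INTO demo_students_2023 (name) VALUES ('Alice'), ('Bob');
INSERT INTO demo_students_2024 (name) VALUES ('Bob'), ('Carol');

-- UNION removes duplicates: 3 rows (Alice, Bob, Carol).
SELECT name FROM demo_students_2023
UNION
SELECT name FROM demo_students_2024;

-- UNION ALL keeps duplicates: 4 rows (Bob appears twice).
SELECT name FROM demo_students_2023
UNION ALL
SELECT name FROM demo_students_2024;
```

Both SELECT statements must return the same number of columns with compatible types. UNION ALL is usually cheaper because it skips the duplicate-removal step.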
Conclusion
INNER JOIN
Returns only the rows that have matching values in both tables.
Syntax:
Example:
Tables:
employees
departments
Query:
Result:
LEFT JOIN
Returns all records from the left table and matching records from the right table. If there is no
match, NULL is returned for the right table's columns.
Syntax:
Example:
Result:
RIGHT JOIN
Returns all records from the right table and matching records from the left table. If there is no
match, NULL is returned for the left table's columns.
Syntax:
Example:
Result:
FULL JOIN
Returns all records from both tables, with NULLs where there is no match.
MySQL does not support FULL JOIN directly, but it can be emulated with a UNION of a LEFT JOIN and a
RIGHT JOIN.
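The emulation just described can be sketched as follows. The tables demo_employees and demo_departments and their columns are hypothetical; the data is chosen so that one employee has no department and one department has no employees.

```sql
-- Hypothetical tables, used only for this illustration.
DROP TABLE IF EXISTS demo_employees, demo_departments;
CREATE TABLE demo_departments (
    department_id   INT PRIMARY KEY,
    department_name VARCHAR(50)
);
CREATE TABLE demo_employees (
    employee_id   INT PRIMARY KEY,
    name          VARCHAR(50),
    department_id INT
);
INSERT INTO demo_departments VALUES (1, 'Sales'), (2, 'HR'), (3, 'Logistics');
INSERT INTO demo_employees   VALUES (1, 'Alice', 1), (2, 'Bob', 2), (3, 'Charlie', NULL);

-- LEFT JOIN keeps every employee; RIGHT JOIN keeps every department.
-- UNION merges the two results and removes the rows they share.
SELECT e.name, d.department_name
FROM demo_employees e
LEFT JOIN demo_departments d ON e.department_id = d.department_id
UNION
SELECT e.name, d.department_name
FROM demo_employees e
RIGHT JOIN demo_departments d ON e.department_id = d.department_id;
-- 4 rows: (Alice, Sales), (Bob, HR), (Charlie, NULL), (NULL, Logistics)
```

One caveat of this approach: because UNION removes duplicates, rows that are genuinely identical in the selected columns are also collapsed into one.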
CROSS JOIN
Returns the Cartesian product of both tables (every row from table1 is paired with every row
from table2).
Syntax:
Example:
SELF JOIN
A table is joined with itself. Useful for hierarchical data like employees and managers.
Syntax:
Example:
Result:
This query finds each employee’s manager from the same table.
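A minimal sketch of such a self join. The table demo_employees and its manager_id column are assumptions; the same table appears twice under the aliases e (employee) and m (manager).

```sql
-- Hypothetical table, used only for this illustration.
DROP TABLE IF EXISTS demo_employees;
CREATE TABLE demo_employees (
    employee_id INT PRIMARY KEY,
    name        VARCHAR(50),
    manager_id  INT              -- points at another row's employee_id
);
INSERT INTO demo_employees VALUES
    (1, 'Alice',   NULL),
    (2, 'Bob',     1),
    (3, 'Charlie', 2);

-- LEFT JOIN keeps employees that have no manager.
SELECT e.name AS employee, m.name AS manager
FROM demo_employees e
LEFT JOIN demo_employees m ON e.manager_id = m.employee_id;
-- (Alice, NULL), (Bob, Alice), (Charlie, Bob)
```

With an INNER JOIN instead of a LEFT JOIN, Alice would be missing from the result because her manager_id is NULL.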
Summary of MySQL Joins
Conclusion
The CASE expression in MySQL is used to implement conditional logic within SQL queries. It works
like an IF-ELSE statement in programming.
Syntax:
Example:
Result:
Searched CASE Expression
Syntax:
Example:
Result:
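Both CASE forms can be shown side by side. In this sketch the table demo_students, its columns, and the labels are assumptions: the simple form compares one expression against fixed values, while the searched form evaluates independent conditions in order.

```sql
-- Hypothetical table, used only for this illustration.
DROP TABLE IF EXISTS demo_students;
CREATE TABLE demo_students (
    name  VARCHAR(50),
    grade CHAR(1),
    marks INT
);
INSERT INTO demo_students VALUES
    ('Asha', 'A', 91),
    ('Ben',  'B', 72),
    ('Chen', 'C', 48);

SELECT name,
       -- Simple CASE: one expression compared with each WHEN value.
       CASE grade
           WHEN 'A' THEN 'Excellent'
           WHEN 'B' THEN 'Good'
           ELSE 'Needs work'
       END AS remark,
       -- Searched CASE: the first condition that is true wins.
       CASE
           WHEN marks >= 90 THEN 'Distinction'
           WHEN marks >= 50 THEN 'Pass'
           ELSE 'Fail'
       END AS result
FROM demo_students;
-- Asha: Excellent, Distinction
-- Ben:  Good, Pass
-- Chen: Needs work, Fail
```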
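Common Table Expressions (CTE)
A Common Table Expression (CTE) is a named temporary result set, defined with the WITH keyword,
that exists only for the duration of a single query.
• Improves readability
• Can be referenced multiple times in a query
• Useful for recursive queries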
CTE Syntax
Result:
Recursive CTEs are useful for hierarchical data like employee-manager relationships.
Syntax:
Example: Employee Hierarchy
Result:
Alice is the top manager, Bob reports to Alice, and Charlie reports to Bob.
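The hierarchy described above (Alice, then Bob, then Charlie) can be walked with a recursive CTE. This sketch assumes MySQL 8.0 or later; the table demo_employees, its columns, and the CTE name hierarchy are hypothetical.

```sql
-- Hypothetical table, used only for this illustration.
DROP TABLE IF EXISTS demo_employees;
CREATE TABLE demo_employees (
    employee_id INT PRIMARY KEY,
    name        VARCHAR(50),
    manager_id  INT
);
INSERT INTO demo_employees VALUES
    (1, 'Alice',   NULL),
    (2, 'Bob',     1),
    (3, 'Charlie', 2);

WITH RECURSIVE hierarchy AS (
    -- Anchor member: the top manager has no manager of their own.
    SELECT employee_id, name, manager_id, 1 AS depth
    FROM demo_employees
    WHERE manager_id IS NULL
    UNION ALL
    -- Recursive member: everyone who reports to a row found so far.
    SELECT e.employee_id, e.name, e.manager_id, h.depth + 1
    FROM demo_employees e
    JOIN hierarchy h ON e.manager_id = h.employee_id
)
SELECT name, depth
FROM hierarchy
ORDER BY depth;
-- (Alice, 1), (Bob, 2), (Charlie, 3)
```

The recursion stops on its own when the recursive member returns no new rows.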
Summary
Window Functions in MySQL 🚀
Window functions allow performing calculations across a specific range of rows without collapsing
them into a single result (unlike aggregate functions).
Syntax
Ranking Functions
1. ROW_NUMBER()
2. RANK()
3. DENSE_RANK()
4. PERCENT_RANK()
5. NTILE(n)
Aggregate Functions
1. SUM()
2. AVG()
3. COUNT()
4. MAX()
5. MIN()
Value Functions
1. LEAD()
2. LAG()
3. FIRST_VALUE()
4. LAST_VALUE()
✅ Ranking Functions
(a) ROW_NUMBER()
Assigns a unique number to each row within a partition based on the specified ORDER BY.
Example
Result
(b) RANK()
Assigns a rank to each row, but skips numbers when there are ties.
Example
Result
Alice and Charlie have the same salary, so they both get Rank 1. The next rank is 3 (skipping 2).
(c) DENSE_RANK()
Assigns a rank to each row without skipping numbers after ties.
Example
Result
(d) PERCENT_RANK()
Calculates the relative rank of a row as a fraction between 0 and 1. The formula is:
(rank - 1) / (total rows in the partition - 1)
Example :-
Result :-
Alice and Charlie have the same salary, so they share the lowest rank (0.00). Dave has the
highest percentage rank (1.00).
(e) NTILE(n)
Example
Result
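The five ranking functions can be compared in a single query. This is a sketch assuming MySQL 8.0 or later; the table demo_employees and its salaries are hypothetical, chosen so that Alice and Charlie tie on salary and Dave comes last, as in the examples above.

```sql
-- Hypothetical table, used only for this illustration.
DROP TABLE IF EXISTS demo_employees;
CREATE TABLE demo_employees (
    name   VARCHAR(50),
    salary INT
);
INSERT INTO demo_employees VALUES
    ('Alice', 5000), ('Charlie', 5000), ('Bob', 4000), ('Dave', 3000);

SELECT name, salary,
       -- name is added as a tie-breaker so the numbering is deterministic
       ROW_NUMBER()   OVER (ORDER BY salary DESC, name) AS row_num,
       RANK()         OVER w AS rnk,
       DENSE_RANK()   OVER w AS dense_rnk,
       PERCENT_RANK() OVER w AS pct_rnk,
       NTILE(2)       OVER (ORDER BY salary DESC, name) AS bucket
FROM demo_employees
WINDOW w AS (ORDER BY salary DESC)
ORDER BY salary DESC, name;
-- name     row_num  rnk  dense_rnk  pct_rnk  bucket
-- Alice    1        1    1          0        1
-- Charlie  2        1    1          0        1
-- Bob      3        3    2          0.667    2
-- Dave     4        4    3          1        2
```

The aliases avoid the names rank and row_number because those are reserved words in MySQL 8.0.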
(a) SUM()
(b) AVG()
(a) LEAD()
(b) LAG()
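LEAD() reads a value from a following row and LAG() from a preceding row, without a self join. A minimal sketch, assuming MySQL 8.0 or later and a hypothetical demo_monthly_sales table:

```sql
-- Hypothetical table, used only for this illustration.
DROP TABLE IF EXISTS demo_monthly_sales;
CREATE TABLE demo_monthly_sales (
    sale_month DATE,
    amount     INT
);
INSERT INTO demo_monthly_sales VALUES
    ('2024-01-01', 100),
    ('2024-02-01', 150),
    ('2024-03-01', 120);

SELECT sale_month, amount,
       LAG(amount)  OVER (ORDER BY sale_month) AS previous_amount,
       LEAD(amount) OVER (ORDER BY sale_month) AS next_amount,
       amount - LAG(amount) OVER (ORDER BY sale_month) AS change_from_previous
FROM demo_monthly_sales;
-- 2024-01-01: previous NULL, next 150,  change NULL
-- 2024-02-01: previous 100,  next 120,  change 50
-- 2024-03-01: previous 150,  next NULL, change -30
```

Both functions return NULL when there is no such row, unless a default is supplied as the third argument, for example LAG(amount, 1, 0).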
(c) FIRST_VALUE()
Returns the value from the first row of the window frame.
Syntax
Example
Scenario: Retrieve the highest salary (first in order) for each department.
Sample Data
Result
Explanation:
(d) LAST_VALUE()
Returns the value from the last row of the window frame.
Syntax
Example
Scenario: Retrieve the lowest salary (last in order) for each department.
Result
Explanation:
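The two scenarios above (highest and lowest salary per department) can be sketched in one query. The table demo_employees and its data are assumptions, and MySQL 8.0 or later is assumed. The explicit frame matters: with ORDER BY and no frame clause, the default frame ends at the current row, so LAST_VALUE() would return the current row's own salary rather than the lowest in the department.

```sql
-- Hypothetical table, used only for this illustration.
DROP TABLE IF EXISTS demo_employees;
CREATE TABLE demo_employees (
    name       VARCHAR(50),
    department VARCHAR(20),
    salary     INT
);
INSERT INTO demo_employees VALUES
    ('Alice',   'HR', 5000),
    ('Bob',     'HR', 4000),
    ('Charlie', 'IT', 7000),
    ('Dave',    'IT', 6000),
    ('Eve',     'IT', 5500);

SELECT name, department, salary,
       FIRST_VALUE(salary) OVER w AS highest_salary,
       LAST_VALUE(salary)  OVER w AS lowest_salary
FROM demo_employees
WINDOW w AS (
    PARTITION BY department
    ORDER BY salary DESC
    -- Make the frame cover the whole partition, not just rows up to the current one.
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
);
-- Every HR row shows highest 5000 and lowest 4000.
-- Every IT row shows highest 7000 and lowest 5500.
```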
7. Summary
Window functions in SQL allow you to perform calculations across a set of rows related to the
current row within a partition. The ROWS BETWEEN clause helps define which rows to consider for
the calculation.
• UNBOUNDED PRECEDING → The window starts from the first row of the partition.
• UNBOUNDED FOLLOWING → The window extends to the last row of the partition.
Example:
This creates a cumulative sum from the first row to the current row.
3. Cumulative Sum
A cumulative sum (running total) is the sum of values from the first row up to the current row.
Sample Data
Result
Explanation
• For each row, the sum accumulates from the first row to the current row.
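A running total with an explicit frame might look like this. The table demo_sales is hypothetical, and MySQL 8.0 or later is assumed.

```sql
-- Hypothetical table, used only for this illustration.
DROP TABLE IF EXISTS demo_sales;
CREATE TABLE demo_sales (
    sale_date DATE,
    amount    INT
);
INSERT INTO demo_sales VALUES
    ('2024-01-01', 100),
    ('2024-01-02', 200),
    ('2024-01-03', 150);

SELECT sale_date, amount,
       SUM(amount) OVER (
           ORDER BY sale_date
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS running_total
FROM demo_sales;
-- running_total: 100, 300, 450
```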
Moving Average
Sample Data
Result
Explanation
• The AVG(salary) function considers 2 previous rows, the current row, and 2 next rows.
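A sketch of that five-row moving average, assuming MySQL 8.0 or later. The table demo_salaries is hypothetical; near the edges of the result the frame simply contains fewer rows.

```sql
-- Hypothetical table, used only for this illustration.
DROP TABLE IF EXISTS demo_salaries;
CREATE TABLE demo_salaries (
    id     INT PRIMARY KEY,
    salary INT
);
INSERT INTO demo_salaries VALUES
    (1, 10), (2, 20), (3, 30), (4, 40), (5, 50);

SELECT id, salary,
       AVG(salary) OVER (
           ORDER BY id
           ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING
       ) AS moving_avg
FROM demo_salaries;
-- id 1: (10+20+30) / 3       = 20
-- id 2: (10+20+30+40) / 4    = 25
-- id 3: (10+20+30+40+50) / 5 = 30
-- id 4: (20+30+40+50) / 4    = 35
-- id 5: (30+40+50) / 3       = 40
```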
The NTILE(n) function is a window function that distributes rows into n number of approximately
equal groups. This is useful when dividing data into percentiles, quartiles, or rankings.
Explanation:
The LIMIT and OFFSET clauses are used to control the number of rows returned and skip rows
when fetching results.
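A minimal paging sketch with LIMIT and OFFSET. The table demo_products is hypothetical. An ORDER BY is included because, without one, the database is free to return rows in any order and the pages would not be stable.

```sql
-- Hypothetical table, used only for this illustration.
DROP TABLE IF EXISTS demo_products;
CREATE TABLE demo_products (
    id   INT PRIMARY KEY,
    name VARCHAR(50)
);
INSERT INTO demo_products VALUES
    (1, 'Pen'), (2, 'Pencil'), (3, 'Eraser'), (4, 'Ruler'), (5, 'Marker');

-- Skip the first 2 rows, then return the next 2: ids 3 and 4.
SELECT id, name
FROM demo_products
ORDER BY id
LIMIT 2 OFFSET 2;

-- MySQL shorthand with the same meaning: LIMIT offset, row_count.
SELECT id, name
FROM demo_products
ORDER BY id
LIMIT 2, 2;
```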
Explanation:
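SQL logically evaluates the clauses of a query in the following order, which is different from the order in which we write them:
FROM (and JOIN) → WHERE → GROUP BY → HAVING → SELECT → DISTINCT → ORDER BY → LIMIT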
Example Query
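As an illustration, the query below is annotated with the logical step at which each clause is evaluated. The table demo_orders and its data are assumptions.

```sql
-- Hypothetical table, used only for this illustration.
DROP TABLE IF EXISTS demo_orders;
CREATE TABLE demo_orders (
    customer VARCHAR(20),
    amount   INT
);
INSERT INTO demo_orders VALUES
    ('Asha', 120), ('Asha', 80), ('Ben', 40), ('Chen', 300), ('Chen', 30);

SELECT customer, SUM(amount) AS total   -- 5. compute the output columns
FROM demo_orders                        -- 1. pick the source rows
WHERE amount >= 50                      -- 2. filter individual rows
GROUP BY customer                       -- 3. form groups
HAVING SUM(amount) > 100                -- 4. filter groups
ORDER BY total DESC                     -- 6. sort; the alias from step 5 is visible here
LIMIT 2;                                -- 7. trim the result
-- (Chen, 300), (Asha, 200)
```

Ben's only order (40) and Chen's small order (30) are removed in step 2, so they never reach the grouping step.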
Explanation:
A NATURAL JOIN automatically joins tables based on columns with the same name.
• If both tables have a department_id column, NATURAL JOIN automatically joins on that
column.
• No need to write ON employees.department_id = departments.department_id.
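A sketch of the NATURAL JOIN just described. The demo_ tables are hypothetical and are built so that department_id is the only column name the two tables share.

```sql
-- Hypothetical tables, used only for this illustration.
DROP TABLE IF EXISTS demo_employees, demo_departments;
CREATE TABLE demo_departments (
    department_id   INT PRIMARY KEY,
    department_name VARCHAR(50)
);
CREATE TABLE demo_employees (
    employee_id   INT PRIMARY KEY,
    name          VARCHAR(50),
    department_id INT
);
INSERT INTO demo_departments VALUES (1, 'Sales'), (2, 'HR');
INSERT INTO demo_employees   VALUES (1, 'Alice', 1), (2, 'Bob', 2), (3, 'Charlie', NULL);

-- Joins on department_id, the only column name present in both tables.
SELECT name, department_name
FROM demo_employees
NATURAL JOIN demo_departments;
-- (Alice, Sales), (Bob, HR); Charlie has no department and is left out.
```

Because the join columns are chosen by name alone, adding an unrelated column with the same name to both tables (for example name) would silently change the result. An explicit ON or USING clause avoids that.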
A subquery is a query nested inside another query. The outer query uses its result, for example as a value to compare against or as a list to filter by.
Example: Get Employees with Salary Higher than the Company’s Average
Explanation:
• The subquery (SELECT AVG(salary) FROM employees) calculates the average salary.
• The outer query selects employees who earn more than the average salary.
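The query explained above can be sketched like this. The table demo_employees and its salaries are assumptions; the notes use a table named employees.

```sql
-- Hypothetical table, used only for this illustration.
DROP TABLE IF EXISTS demo_employees;
CREATE TABLE demo_employees (
    name   VARCHAR(50),
    salary INT
);
INSERT INTO demo_employees VALUES
    ('Alice', 3000), ('Bob', 4000), ('Charlie', 8000);

-- The inner query runs first and yields one value: the average, 5000.
SELECT name, salary
FROM demo_employees
WHERE salary > (SELECT AVG(salary) FROM demo_employees);
-- (Charlie, 8000)
```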
A Stored Procedure is a named block of SQL statements saved in the database that can be executed repeatedly with CALL.
Execute Procedure
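A minimal sketch of creating and calling a stored procedure in MySQL. The procedure name, its dept parameter, and the table demo_employees are all hypothetical. DELIMITER is a command of the mysql command-line client; it is needed there so that the semicolons inside the procedure body do not end the CREATE PROCEDURE statement early.

```sql
-- Hypothetical table, used only for this illustration.
DROP TABLE IF EXISTS demo_employees;
CREATE TABLE demo_employees (
    name       VARCHAR(50),
    department VARCHAR(20),
    salary     INT
);
INSERT INTO demo_employees VALUES
    ('Alice', 'HR', 5000), ('Charlie', 'IT', 7000), ('Dave', 'IT', 6000);

DROP PROCEDURE IF EXISTS demo_get_employees_by_department;

DELIMITER //
CREATE PROCEDURE demo_get_employees_by_department(IN dept VARCHAR(20))
BEGIN
    -- The parameter is named dept so it cannot be confused with the column.
    SELECT name, salary
    FROM demo_employees
    WHERE department = dept;
END //
DELIMITER ;

-- Execute the procedure: returns Charlie (7000) and Dave (6000).
CALL demo_get_employees_by_department('IT');
```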
USE practice_table;
-- Employees Table
CREATE TABLE Employees (
EmployeeName VARCHAR(100),
DepartmentID INT,
HireDate DATE
);
VALUES
-- Departments Table
CREATE TABLE Departments (
DepartmentName VARCHAR(100)
);
VALUES
(1, 'Sales'),
(3, 'Logistics');
-- Orders Table
CREATE TABLE orders (
order_date DATE,
status VARCHAR(50),
shipping_address VARCHAR(255)
);
VALUES
(10, '2024-08-10', 10, 225.00, 'Pending', '707 Spruce St, San Jose'),
(13, '2024-08-13', 13, 149.49, 'Pending', '1010 Oak St, Fort Worth'),
(14, '2024-08-14', 14, 135.00, 'Shipped', '1111 Maple St, Columbus'),
(16, '2024-08-16', 16, 180.75, 'Pending', '1313 Cedar St, San Francisco'),
(25, '2024-08-25', 25, 180.00, 'Pending', '2222 Birch St, Oklahoma City');
-- Customers Table
CREATE TABLE customers (
customer_id INT,
name VARCHAR(100),
email VARCHAR(100),
phone_number VARCHAR(15),
address VARCHAR(255),
city VARCHAR(100),
created_at DATE
);
-- Customers Table Data
INSERT INTO customers (customer_id, name, email, phone_number, address, city, created_at)
VALUES
(1, 'Customer 1', '[email protected]', '123-456-7890', '123 Main St', 'New York', '2024-01-01'),
(2, 'Customer 2', '[email protected]', '234-567-8901', '456 Elm St', 'Los Angeles', '2024-01-02'),
(3, 'Customer 3', '[email protected]', '345-678-9012', '789 Oak St', 'Chicago', '2024-01-03'),
(4, 'Customer 4', '[email protected]', '456-789-0123', '101 Maple St', 'Houston', '2024-01-04'),
(5, 'Customer 5', '[email protected]', '567-890-1234', '202 Birch St', 'Phoenix', '2024-01-05'),
(6, 'Customer 6', '[email protected]', '678-901-2345', '303 Cedar St', 'Philadelphia', '2024-01-06'),
(7, 'Customer 7', '[email protected]', '789-012-3456', '404 Walnut St', 'San Antonio', '2024-01-07'),
(8, 'Customer 8', '[email protected]', '890-123-4567', '505 Chestnut St', 'San Diego', '2024-01-08'),
(9, 'Customer 9', '[email protected]', '901-234-5678', '606 Pine St', 'Dallas', '2024-01-09'),
(10, 'Customer 10', '[email protected]', '012-345-6789', '707 Spruce St', 'San Jose', '2024-01-10'),
(11, 'Customer 11', '[email protected]', '123-456-7890', '808 Fir St', 'Austin', '2024-01-11'),
(13, 'Customer 13', '[email protected]', '345-678-9012', '1010 Oak St', 'Fort Worth', '2024-01-13'),
(14, 'Customer 14', '[email protected]', '456-789-0123', '1111 Maple St', 'Columbus', '2024-01-14'),
(15, 'Customer 15', '[email protected]', '567-890-1234', '1212 Birch St', 'Charlotte', '2024-01-15'),
(16, 'Customer 16', '[email protected]', '678-901-2345', '1313 Cedar St', 'San Francisco', '2024-01-16'),
(18, 'Customer 18', '[email protected]', '890-123-4567', '1515 Chestnut St', 'Seattle', '2024-01-18'),
(19, 'Customer 19', '[email protected]', '901-234-5678', '1616 Pine St', 'Denver', '2024-01-19'),
(21, 'Customer 21', '[email protected]', '123-456-7890', '1818 Fir St', 'Boston', '2024-01-21'),
(22, 'Customer 22', '[email protected]', '234-567-8901', '1919 Redwood St', 'El Paso', '2024-01-22'),
(23, 'Customer 23', '[email protected]', '345-678-9012', '2020 Oak St', 'Nashville', '2024-01-23'),
(24, 'Customer 24', '[email protected]', '456-789-0123', '2121 Maple St', 'Detroit', '2024-01-24'),
(25, 'Customer 25', '[email protected]', '567-890-1234', '2222 Birch St', 'Oklahoma City', '2024-01-25'),
(26, 'Customer 26', '[email protected]', '678-901-2345', '2323 Cedar St', 'Portland', '2024-01-26'),
(27, 'Customer 27', '[email protected]', '789-012-3456', '2424 Walnut St', 'Las Vegas', '2024-01-27'),
(29, 'Customer 29', '[email protected]', '901-234-5678', '2626 Pine St', 'Louisville', '2024-01-29');
-- Cinema Table
Create table If Not Exists cinema (id int, movie varchar(255), description varchar(255), rating float(2, 1));
insert into cinema (id, movie, description, rating) values ('2', 'Science', 'fiction', '8.5');
insert into cinema (id, movie, description, rating) values ('3', 'irish', 'boring', '6.2');
insert into cinema (id, movie, description, rating) values ('4', 'Ice song', 'Fantacy', '8.6');
insert into cinema (id, movie, description, rating) values ('5', 'House card', 'Interesting', '9.1');
-- Sales_data Table
CREATE TABLE sales_data (
customer_id INT,
product_id INT,
quantity INT,
sale_date DATE,
sale_amount DECIMAL(10, 2)
);
VALUES
-- Data_for_sorting Table
CREATE TABLE data_for_sorting (
name VARCHAR(50),
age INT,
gender VARCHAR(10),
city VARCHAR(50),
join_date DATE,
department VARCHAR(50)
);
-- Data_for_sorting Table Data
INSERT INTO data_for_sorting (id, name, age, gender, city, salary, join_date, department) VALUES
insert into Employee (id, name, department, managerId) values ('102', 'Dan', 'A', '101');
insert into Employee (id, name, department, managerId) values ('103', 'James', 'A', '101');
insert into Employee (id, name, department, managerId) values ('104', 'Amy', 'A', '101');
insert into Employee (id, name, department, managerId) values ('105', 'Anne', 'A', '101');
insert into Employee (id, name, department, managerId) values ('106', 'Ron', 'B', '101');
-- Project Table
Create table If Not Exists Project (project_id int, employee_id int);
-- Employee Table
Create table If Not Exists Employee (employee_id int, name varchar(10), experience_years int);
insert into Employee (employee_id, name, experience_years) values ('2', 'Ali', '2');
insert into Employee (employee_id, name, experience_years) values ('3', 'John', '1');
insert into Employee (employee_id, name, experience_years) values ('4', 'Doe', '2');
-- Emp Table
CREATE TABLE emp (
);
-- Employees Table
CREATE TABLE Employees (
Name VARCHAR(50),
DepartmentID INT,
Position VARCHAR(50),
Salary DECIMAL(10, 2)
);
-- Departments Table
CREATE TABLE Departments (
DepartmentName VARCHAR(50),
Location VARCHAR(50)
);
-- Projects Table
CREATE TABLE Projects (
ProjectName VARCHAR(50),
DepartmentID INT
);
(7, 'Project Eta', 109), -- Project with department ID 109, no matching employees
(12, 'Project Mu', 106); -- Project with department ID 106, no matching in Departments
-- EmployeeProjects Table
CREATE TABLE EmployeeProjects (
EmployeeID INT,
ProjectID INT,
HoursWorked INT
);
INSERT INTO EmployeeProjects (EmployeeID, ProjectID, HoursWorked) VALUES
(1, 1, 120),
(2, 2, 100),
(3, 3, 110),
(4, 4, 95),
(5, 5, 130),
(6, 6, 50),
(7, 1, 60),
(8, 3, 115),
(9, 2, 80),
(10, 5, 70),
(11, 9, 85),
(12, 8, 65),
(16, 4, 110),
(18, 6, 105),
(19, 1, 130),
(20, 9, 75),
(23, 3, 60),
(24, 2, 50),