-- Homework 02
-- Student name: Nguyễn Đức Duy
-- Student ID: 22125019

1.​ SQL
●​ Q1
SELECT cid, cname
FROM Customers
WHERE cid IN (
    SELECT cid
    FROM Orders
    WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01'
) -- placed at least one order in the year 2024
AND cid NOT IN (
    SELECT cid
    FROM Orders
    WHERE order_date >= '2025-01-01' AND order_date < '2026-01-01'
); -- did not place any orders in the year 2025
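
To check the Q1 logic on a small case, the query can be run against an in-memory SQLite database from Python. This is only a sketch: the sample rows are hypothetical, and dates are stored as ISO-8601 text, which SQLite compares correctly as strings. Only the table and column names come from the query itself.

```python
import sqlite3

# Hypothetical sample data; only table/column names come from the homework query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (cid INTEGER PRIMARY KEY, cname TEXT);
CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, cid INTEGER, order_date TEXT);
INSERT INTO Customers VALUES (1, 'An'), (2, 'Binh'), (3, 'Chi');
INSERT INTO Orders VALUES
  (10, 1, '2024-03-05'),   -- An: ordered in 2024 only
  (11, 2, '2024-07-19'),   -- Binh: ordered in 2024 ...
  (12, 2, '2025-02-01'),   -- ... and again in 2025
  (13, 3, '2025-06-30');   -- Chi: ordered in 2025 only
""")

q1 = """
SELECT cid, cname
FROM Customers
WHERE cid IN (SELECT cid FROM Orders
              WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01')
  AND cid NOT IN (SELECT cid FROM Orders
                  WHERE order_date >= '2025-01-01' AND order_date < '2026-01-01')
"""
q1_rows = conn.execute(q1).fetchall()
print(q1_rows)  # only An ordered in 2024 and not in 2025
```

The half-open ranges (`>= '2024-01-01'` and `< '2025-01-01'`) also behave correctly if order_date ever carries a time component.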

●​ Q2
SELECT DISTINCT c.cid, c.cname
FROM Customers c
JOIN Orders o ON c.cid = o.cid
WHERE o.order_id IN (
SELECT ol.order_id
FROM Orderlists ol
JOIN Books b ON ol.isbn = b.isbn
GROUP BY ol.order_id
HAVING SUM(b.price * ol.quantity) > 200
);
-- The subquery finds all orders whose total value is greater than 200.
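
The Q2 subquery totals each order separately, so a customer qualifies only if a single order exceeds 200. The sketch below, using hypothetical rows and integer prices in an in-memory SQLite database, shows a customer whose orders add up to 230 but who is still excluded because no single order passes the threshold.

```python
import sqlite3

# Hypothetical sample data; only table/column names come from the homework query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (cid INTEGER PRIMARY KEY, cname TEXT);
CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, cid INTEGER);
CREATE TABLE Books (isbn TEXT PRIMARY KEY, price INTEGER);
CREATE TABLE Orderlists (order_id INTEGER, isbn TEXT, quantity INTEGER);
INSERT INTO Customers VALUES (1, 'An'), (2, 'Binh');
INSERT INTO Orders VALUES (10, 1), (11, 2), (12, 2);
INSERT INTO Books VALUES ('A', 60), ('B', 50);
INSERT INTO Orderlists VALUES
  (10, 'A', 2), (10, 'B', 2),  -- order 10: 60*2 + 50*2 = 220
  (11, 'A', 3),                -- order 11: 180
  (12, 'B', 1);                -- order 12: 50
""")

# Per-order totals, i.e. what the subquery aggregates before HAVING filters it.
order_totals = conn.execute("""
SELECT ol.order_id, SUM(b.price * ol.quantity) AS total
FROM Orderlists ol JOIN Books b ON ol.isbn = b.isbn
GROUP BY ol.order_id
ORDER BY ol.order_id
""").fetchall()

# The full Q2 query.
big_order_customers = conn.execute("""
SELECT DISTINCT c.cid, c.cname
FROM Customers c JOIN Orders o ON c.cid = o.cid
WHERE o.order_id IN (
    SELECT ol.order_id
    FROM Orderlists ol JOIN Books b ON ol.isbn = b.isbn
    GROUP BY ol.order_id
    HAVING SUM(b.price * ol.quantity) > 200
)
""").fetchall()
print(order_totals, big_order_customers)
```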

●​ Q3

SELECT isbn, title, author
FROM Books
WHERE isbn NOT IN (
SELECT isbn FROM Recommendations
);
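
One caveat with NOT IN: if the subquery can return a NULL, the comparison is never true and the outer query returns no rows. The sketch below assumes, purely for illustration, that Recommendations.isbn is nullable (the homework schema may well forbid this) and compares NOT IN with an equivalent NOT EXISTS form on hypothetical data.

```python
import sqlite3

# Hypothetical data; the NULL isbn is an assumption made to show the pitfall.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Books (isbn TEXT PRIMARY KEY, title TEXT, author TEXT);
CREATE TABLE Recommendations (isbn TEXT, cid INTEGER);
INSERT INTO Books VALUES ('111', 'Title A', 'Author X'), ('222', 'Title B', 'Author Y');
INSERT INTO Recommendations VALUES ('111', 1), (NULL, 2);
""")

with_not_in = conn.execute("""
SELECT isbn, title, author
FROM Books
WHERE isbn NOT IN (SELECT isbn FROM Recommendations)
""").fetchall()

with_not_exists = conn.execute("""
SELECT b.isbn, b.title, b.author
FROM Books b
WHERE NOT EXISTS (SELECT 1 FROM Recommendations r WHERE r.isbn = b.isbn)
""").fetchall()

print(with_not_in)      # empty: '222' NOT IN ('111', NULL) evaluates to unknown
print(with_not_exists)  # book '222' is returned
```

If isbn is declared NOT NULL in Recommendations, both forms return the same rows.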

●​ Q4
SELECT b.isbn, b.title, AVG(ol.quantity) AS avg_quantity
FROM Orderlists ol
JOIN Books b ON ol.isbn = b.isbn
WHERE ol.isbn IN (
SELECT isbn
FROM Orderlists
GROUP BY isbn
HAVING SUM(quantity) >= 50
)-- find all books with total ordered quantity at least 50
GROUP BY b.isbn, b.title;
●​ Q5
SELECT c.cid, c.cname, MAX(o.order_date) AS latest_order
FROM Customers c
JOIN Orders o ON c.cid = o.cid
GROUP BY c.cid, c.cname;
●​ Q6
SELECT c.cid, c.cname
FROM Customers c
JOIN Orders o ON c.cid = o.cid
JOIN Orderlists ol ON o.order_id = ol.order_id
JOIN Books b ON ol.isbn = b.isbn
WHERE b.author = 'Stephen King'
GROUP BY c.cid, c.cname
HAVING COUNT(DISTINCT b.isbn) = (
SELECT COUNT(DISTINCT isbn)
FROM Books
WHERE author = 'Stephen King'
);
For each customer, count the distinct Stephen King books they have ordered, then compare that count with the total number of Stephen King books in the Books table.
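
The counting approach relies on COUNT(DISTINCT ...): a customer who orders the same Stephen King book twice must not be credited with two different books. The sketch below uses hypothetical rows in an in-memory SQLite database and runs the query once with COUNT(DISTINCT b.isbn) and once with a plain COUNT(b.isbn) to show the difference. The helper function name is illustrative.

```python
import sqlite3

# Hypothetical sample data; only table/column names come from the homework query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (cid INTEGER PRIMARY KEY, cname TEXT);
CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, cid INTEGER);
CREATE TABLE Orderlists (order_id INTEGER, isbn TEXT);
CREATE TABLE Books (isbn TEXT PRIMARY KEY, author TEXT);
INSERT INTO Customers VALUES (1, 'An'), (2, 'Binh'), (3, 'Chi');
INSERT INTO Books VALUES ('s1', 'Stephen King'), ('s2', 'Stephen King'), ('o1', 'Other Author');
INSERT INTO Orders VALUES (10, 1), (11, 2), (12, 2);
INSERT INTO Orderlists VALUES
  (10, 's1'), (10, 's2'),   -- An ordered both Stephen King books
  (11, 's1'), (11, 'o1'),   -- Binh ordered s1 ...
  (12, 's1');               -- ... twice, but never s2
""")

def customers_with_all_sk(count_expr):
    """Run the Q6 query with the given counting expression in HAVING."""
    sql = f"""
    SELECT c.cid, c.cname
    FROM Customers c
    JOIN Orders o ON c.cid = o.cid
    JOIN Orderlists ol ON o.order_id = ol.order_id
    JOIN Books b ON ol.isbn = b.isbn
    WHERE b.author = 'Stephen King'
    GROUP BY c.cid, c.cname
    HAVING {count_expr} = (SELECT COUNT(DISTINCT isbn)
                           FROM Books WHERE author = 'Stephen King')
    ORDER BY c.cid
    """
    return conn.execute(sql).fetchall()

distinct_count = customers_with_all_sk("COUNT(DISTINCT b.isbn)")
plain_count = customers_with_all_sk("COUNT(b.isbn)")
print(distinct_count)  # An only
print(plain_count)     # An and Binh: Binh's repeated s1 is counted twice
```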
●​ Q7
SELECT BookWithCount.author, BookWithCount.title
FROM (
SELECT
b.isbn,
b.title,
b.author,
COUNT(DISTINCT ol.order_id) AS order_count
FROM Orderlists ol
JOIN Books b ON ol.isbn = b.isbn
GROUP BY b.isbn, b.title, b.author
) AS BookWithCount
WHERE BookWithCount.order_count = (
SELECT MAX(AuthorBookWithCount.order_count)
FROM (
SELECT
b.isbn,
COUNT(DISTINCT ol.order_id) AS order_count
FROM Orderlists ol
JOIN Books b ON ol.isbn = b.isbn
WHERE b.author = BookWithCount.author
GROUP BY b.isbn
) AS AuthorBookWithCount
);

● Step 1 – Count Orders per Book
The inner subquery BookWithCount computes how many distinct orders each book has received.
● Step 2 – Filter by Author's Max
For each author, we compare the book's order count with the maximum order count among all their books.
● Step 3 – Final Selection
Only keep the books whose order count equals the author's max → these are the top-selling books per author.
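
The three steps can also be sketched with a common table expression, so the per-book counts are written once and reused for the per-author maximum. This is an alternative formulation, not the query above, and it runs on hypothetical rows in an in-memory SQLite database. Note that ties are kept: an author whose books share the maximum count gets several rows.

```python
import sqlite3

# Hypothetical sample data; only table/column names come from the homework query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Books (isbn TEXT PRIMARY KEY, title TEXT, author TEXT);
CREATE TABLE Orderlists (order_id INTEGER, isbn TEXT);
INSERT INTO Books VALUES
  ('a', 'Alpha', 'X'), ('b', 'Beta', 'X'),
  ('c', 'Gamma', 'Y'), ('d', 'Delta', 'Y');
INSERT INTO Orderlists VALUES
  (1, 'a'), (2, 'a'),   -- Alpha appears in 2 orders
  (1, 'b'),             -- Beta in 1
  (3, 'c'), (4, 'd');   -- Gamma and Delta tie with 1 each
""")

top_per_author = conn.execute("""
WITH BookWithCount AS (              -- Step 1: distinct orders per book
    SELECT b.isbn, b.title, b.author,
           COUNT(DISTINCT ol.order_id) AS order_count
    FROM Orderlists ol
    JOIN Books b ON ol.isbn = b.isbn
    GROUP BY b.isbn, b.title, b.author
)
SELECT bc.author, bc.title
FROM BookWithCount bc
WHERE bc.order_count = (             -- Steps 2 and 3: keep the author's maximum
    SELECT MAX(x.order_count)
    FROM BookWithCount x
    WHERE x.author = bc.author
)
ORDER BY bc.author, bc.title
""").fetchall()
print(top_per_author)
```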

●​ Q8
SELECT c.cname
FROM Customers c
JOIN (
    -- r is not visible outside the derived table, so refer to its alias here
    SELECT OrderedRecommendedBooks.cid AS cid,
           COUNT(*) AS recommended_books_ordered
    FROM (
        SELECT r.cid, r.isbn
        FROM Recommendations r
        JOIN Orders o ON r.cid = o.cid
        JOIN Orderlists ol ON o.order_id = ol.order_id
        WHERE r.isbn = ol.isbn
          AND o.order_date > r.rec_date
          AND r.cid IN (
              SELECT cid
              FROM Recommendations
              GROUP BY cid
              HAVING COUNT(*) >= 5
          )
        GROUP BY r.cid, r.isbn
    ) AS OrderedRecommendedBooks
    GROUP BY OrderedRecommendedBooks.cid
) AS RecommendedBooksOrderedCount
    ON c.cid = RecommendedBooksOrderedCount.cid
JOIN (
    SELECT o.cid AS cid, COUNT(DISTINCT ol.isbn) AS total_books_ordered
    FROM Orders o
    JOIN Orderlists ol ON o.order_id = ol.order_id
    GROUP BY o.cid
) AS TotalBooksOrderedCount
    ON c.cid = TotalBooksOrderedCount.cid
WHERE RecommendedBooksOrderedCount.recommended_books_ordered * 1.0
      / TotalBooksOrderedCount.total_books_ordered > 0.5;

OrderedRecommendedBooks subquery:

●​ Gets customer–book pairs where the book was recommended and later
ordered.
●​ Filters to only include customers with ≥5 recommendations.
●​ Groups by cid, isbn to avoid duplicate counts of the same book.

RecommendedBooksOrderedCount:

● Counts how many unique recommended books each customer has actually ordered.

TotalBooksOrderedCount:

●​ Counts how many distinct books each customer has ordered in total.

Final WHERE clause:

● Selects customers where the ratio of recommended-then-ordered books to total books ordered is greater than 0.5.
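
The ratio logic can be checked on a small case. The sketch below runs the Q8 structure on hypothetical rows in an in-memory SQLite database. The short aliases (orb, rc, tc) are illustrative stand-ins for the longer subquery names, and the outer levels reference the derived-table aliases rather than the inner alias r.

```python
import sqlite3

# Hypothetical sample data; only table/column names come from the homework query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (cid INTEGER PRIMARY KEY, cname TEXT);
CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, cid INTEGER, order_date TEXT);
CREATE TABLE Orderlists (order_id INTEGER, isbn TEXT, quantity INTEGER);
CREATE TABLE Recommendations (isbn TEXT, cid INTEGER, rec_date TEXT, rec_type TEXT);
INSERT INTO Customers VALUES (1, 'An'), (2, 'Binh'), (3, 'Chi');
""")
# An and Binh each receive 5 recommendations; Chi only 2 (below the threshold).
recs = [(f"r{i}", cid, "2024-01-01", "demo") for cid in (1, 2) for i in range(1, 6)]
recs += [("r1", 3, "2024-01-01", "demo"), ("r2", 3, "2024-01-01", "demo")]
conn.executemany("INSERT INTO Recommendations VALUES (?, ?, ?, ?)", recs)
conn.executemany("INSERT INTO Orders VALUES (?, ?, ?)",
                 [(10, 1, "2024-02-01"), (11, 2, "2024-02-01"), (12, 3, "2024-02-01")])
conn.executemany("INSERT INTO Orderlists VALUES (?, ?, ?)", [
    (10, "r1", 1), (10, "r2", 1), (10, "r3", 1), (10, "z", 1),  # An: 3 of 4 books
    (11, "r1", 1), (11, "y", 1), (11, "z", 1),                  # Binh: 1 of 3 books
    (12, "r1", 1), (12, "r2", 1),                               # Chi: 2 of 2, but < 5 recs
])

q8_rows = conn.execute("""
SELECT c.cname
FROM Customers c
JOIN (
    SELECT orb.cid, COUNT(*) AS recommended_books_ordered
    FROM (
        SELECT r.cid, r.isbn
        FROM Recommendations r
        JOIN Orders o ON r.cid = o.cid
        JOIN Orderlists ol ON o.order_id = ol.order_id
        WHERE r.isbn = ol.isbn
          AND o.order_date > r.rec_date
          AND r.cid IN (SELECT cid FROM Recommendations
                        GROUP BY cid HAVING COUNT(*) >= 5)
        GROUP BY r.cid, r.isbn
    ) AS orb
    GROUP BY orb.cid
) AS rc ON c.cid = rc.cid
JOIN (
    SELECT o.cid, COUNT(DISTINCT ol.isbn) AS total_books_ordered
    FROM Orders o JOIN Orderlists ol ON o.order_id = ol.order_id
    GROUP BY o.cid
) AS tc ON c.cid = tc.cid
WHERE rc.recommended_books_ordered * 1.0 / tc.total_books_ordered > 0.5
""").fetchall()
print(q8_rows)  # An: 3/4 = 0.75; Binh: 1/3; Chi is filtered by the >= 5 rule
```

Multiplying by 1.0 forces real division; without it, SQLite and several other engines would truncate 3 / 4 to 0.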

●​ Q9
-- Step 1: Identify the top 3 best-selling books by total quantity ordered
WITH Top3Bestsellers AS (
    SELECT TOP 3 b.isbn
    FROM Orderlists ol
    JOIN Books b ON ol.isbn = b.isbn
    GROUP BY b.isbn
    ORDER BY SUM(ol.quantity) DESC
),

-- Step 2: Find (customer, isbn) pairs where the customer has NOT ordered
-- that top-3 book
CustomerBookPairs AS (
SELECT c.cid, t.isbn
FROM Customers c
CROSS JOIN Top3Bestsellers t
WHERE NOT EXISTS (
SELECT 1
FROM Orders o
JOIN Orderlists ol ON o.order_id = ol.order_id
WHERE o.cid = c.cid AND ol.isbn = t.isbn
)
)

-- Step 3: Insert those pairs into the Recommendations table
INSERT INTO Recommendations (isbn, cid, rec_date, rec_type)
SELECT t.isbn, t.cid, CAST(GETDATE() AS DATE), 'bestseller'
FROM CustomerBookPairs t;
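
The same three steps can be tried in an in-memory SQLite database. This is a sketch under two assumptions: the rows are hypothetical, and SQLite syntax is swapped in for the row limit and the current date (LIMIT 3 and DATE('now')), since GETDATE() is not available there.

```python
import sqlite3

# Hypothetical sample data; only table/column names come from the homework query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (cid INTEGER PRIMARY KEY, cname TEXT);
CREATE TABLE Books (isbn TEXT PRIMARY KEY);
CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, cid INTEGER);
CREATE TABLE Orderlists (order_id INTEGER, isbn TEXT, quantity INTEGER);
CREATE TABLE Recommendations (isbn TEXT, cid INTEGER, rec_date TEXT, rec_type TEXT);
INSERT INTO Customers VALUES (1, 'An'), (2, 'Binh');
INSERT INTO Books VALUES ('b1'), ('b2'), ('b3'), ('b4');
INSERT INTO Orders VALUES (10, 1), (11, 2);
INSERT INTO Orderlists VALUES
  (10, 'b1', 5), (10, 'b2', 8), (10, 'b3', 6), (10, 'b4', 1),  -- An ordered everything
  (11, 'b1', 5), (11, 'b4', 1);                                -- Binh: b1 and b4 only
""")
# Total quantities: b1 = 10, b2 = 8, b3 = 6, b4 = 2, so the top 3 are b1, b2, b3.

conn.execute("""
WITH Top3Bestsellers AS (
    SELECT b.isbn
    FROM Orderlists ol JOIN Books b ON ol.isbn = b.isbn
    GROUP BY b.isbn
    ORDER BY SUM(ol.quantity) DESC
    LIMIT 3                                  -- SQLite spelling of "top 3"
),
CustomerBookPairs AS (
    SELECT c.cid, t.isbn
    FROM Customers c CROSS JOIN Top3Bestsellers t
    WHERE NOT EXISTS (
        SELECT 1
        FROM Orders o JOIN Orderlists ol ON o.order_id = ol.order_id
        WHERE o.cid = c.cid AND ol.isbn = t.isbn
    )
)
INSERT INTO Recommendations (isbn, cid, rec_date, rec_type)
SELECT p.isbn, p.cid, DATE('now'), 'bestseller'
FROM CustomerBookPairs p
""")

new_recs = conn.execute(
    "SELECT isbn, cid, rec_type FROM Recommendations ORDER BY cid, isbn"
).fetchall()
print(new_recs)  # Binh is recommended b2 and b3; An already owns all three
```

If two books tie for third place, which one is picked depends on the engine; an extra ORDER BY key such as isbn would make the choice deterministic.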
●​ Q10
-- Step 1: Identify frequently co-ordered pairs (>= 10 orders)
WITH FrequentPairs AS (
SELECT
a.isbn AS isbn1,
b.isbn AS isbn2,
COUNT(DISTINCT a.order_id) AS co_order_count
FROM Orderlists a
JOIN Orderlists b
ON a.order_id = b.order_id
AND a.isbn < b.isbn
GROUP BY a.isbn, b.isbn
HAVING COUNT(DISTINCT a.order_id) >= 10
),

-- Step 2: All customers and their purchased books
CustomerBooks AS (
SELECT DISTINCT o.cid, ol.isbn
FROM Orders o
JOIN Orderlists ol ON o.order_id = ol.order_id
),

-- Step 3: Recommend the missing book of the pair to customers
PairsToRecommend AS (
SELECT
cb.cid,
CASE
WHEN cb.isbn = fp.isbn1 AND NOT EXISTS (
SELECT 1
FROM CustomerBooks cb2
WHERE cb2.cid = cb.cid AND cb2.isbn = fp.isbn2
) THEN fp.isbn2
WHEN cb.isbn = fp.isbn2 AND NOT EXISTS (
SELECT 1
FROM CustomerBooks cb2
WHERE cb2.cid = cb.cid AND cb2.isbn = fp.isbn1
) THEN fp.isbn1
END AS isbn_to_recommend
FROM FrequentPairs fp
JOIN CustomerBooks cb
ON cb.isbn IN (fp.isbn1, fp.isbn2)
)
-- Step 4: Insert into Recommendations
INSERT INTO Recommendations (isbn, cid, rec_date, rec_type)
SELECT DISTINCT -- the same book can complete several pairs for one customer
p.isbn_to_recommend,
p.cid,
CAST(GETDATE() AS DATE),
'frequently co-ordered'
FROM PairsToRecommend p
WHERE p.isbn_to_recommend IS NOT NULL;
Explanation for step 3

● Go through each customer and the books they've purchased (CustomerBooks).
●​ Check if the customer has bought only one book from a frequently
co-ordered pair (FrequentPairs).
●​ If so, we recommend the other book in that pair.
●​ The CASE logic:
○​ If the customer owns isbn1 and has not purchased isbn2, we
recommend isbn2.
○​ If the customer owns isbn2 and has not purchased isbn1, we
recommend isbn1.​
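
The pair logic can be exercised on a small case. The sketch below builds hypothetical orders in an in-memory SQLite database so that exactly one pair, (a, b), reaches the threshold of 10 co-orders, then runs the same CTE structure with DATE('now') substituted for GETDATE(), which SQLite does not provide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, cid INTEGER);
CREATE TABLE Orderlists (order_id INTEGER, isbn TEXT);
CREATE TABLE Recommendations (isbn TEXT, cid INTEGER, rec_date TEXT, rec_type TEXT);
""")
# Hypothetical data: customers 1..10 each order 'a' and 'b' together (10 co-orders).
for cid in range(1, 11):
    conn.execute("INSERT INTO Orders VALUES (?, ?)", (cid, cid))
    conn.executemany("INSERT INTO Orderlists VALUES (?, ?)", [(cid, "a"), (cid, "b")])
# Customer 11 owns only 'a'; customer 12 owns 'b' and 'c' ((b, c) is co-ordered once).
conn.executemany("INSERT INTO Orders VALUES (?, ?)", [(11, 11), (12, 12)])
conn.executemany("INSERT INTO Orderlists VALUES (?, ?)", [(11, "a"), (12, "b"), (12, "c")])

conn.execute("""
WITH FrequentPairs AS (
    SELECT a.isbn AS isbn1, b.isbn AS isbn2
    FROM Orderlists a
    JOIN Orderlists b ON a.order_id = b.order_id AND a.isbn < b.isbn
    GROUP BY a.isbn, b.isbn
    HAVING COUNT(DISTINCT a.order_id) >= 10
),
CustomerBooks AS (
    SELECT DISTINCT o.cid, ol.isbn
    FROM Orders o JOIN Orderlists ol ON o.order_id = ol.order_id
),
PairsToRecommend AS (
    SELECT cb.cid,
           CASE
               WHEN cb.isbn = fp.isbn1 AND NOT EXISTS (
                   SELECT 1 FROM CustomerBooks cb2
                   WHERE cb2.cid = cb.cid AND cb2.isbn = fp.isbn2
               ) THEN fp.isbn2
               WHEN cb.isbn = fp.isbn2 AND NOT EXISTS (
                   SELECT 1 FROM CustomerBooks cb2
                   WHERE cb2.cid = cb.cid AND cb2.isbn = fp.isbn1
               ) THEN fp.isbn1
           END AS isbn_to_recommend
    FROM FrequentPairs fp
    JOIN CustomerBooks cb ON cb.isbn IN (fp.isbn1, fp.isbn2)
)
INSERT INTO Recommendations (isbn, cid, rec_date, rec_type)
SELECT p.isbn_to_recommend, p.cid, DATE('now'), 'frequently co-ordered'
FROM PairsToRecommend p
WHERE p.isbn_to_recommend IS NOT NULL
""")

pair_recs = conn.execute(
    "SELECT isbn, cid FROM Recommendations ORDER BY cid"
).fetchall()
print(pair_recs)  # customer 11 gets 'b', customer 12 gets 'a'
```

Customers 1 to 10 own both books, so both CASE branches fall through to NULL and the final WHERE drops those rows.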

2.​RA
●​ Q1
r1 ← π[cid](σ[order_date ≥ '2024-01-01' ∧ order_date ≤ '2024-12-31'](Orders))
r2 ← π[cid](σ[order_date ≥ '2025-01-01' ∧ order_date ≤ '2025-12-31'](Orders))
Result ← π[cname](Customers ⨝ (r1 - r2))
●​ Q2
RA does not support aggregation (SUM)
●​ Q3
RecommendedISBNs ← π[isbn](Recommendations)
NonRecommendedBooks ← π[title, author](Books ⨝ (π[isbn](Books) - π[isbn](Recommendations)))
●​ Q4
RA does not support aggregation (SUM, COUNT, AVG)
●​ Q5
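RA has no MAX operator. The latest order per customer can still be expressed without it, by removing every order for which the same customer has a later one.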
●​ Q6
SKBooks ← σ[author='Stephen King'](Books)
SKISBNs ← π[isbn](SKBooks)
CustISBN ← π[cid, isbn](Orders ⨝ Orderlists)
AllPairs ← π[cid](Customers) × SKISBNs
NotAll ← π[cid](AllPairs - CustISBN)
Result ← π[cname](Customers ⨝ (π[cid](Customers) - NotAll))

Explanation:
● SKBooks ← σ[author='Stephen King'](Books)
→ Select all books written by Stephen King.

● SKISBNs ← π[isbn](SKBooks)
→ Get the ISBNs of Stephen King's books.

● CustISBN ← π[cid, isbn](Orders ⨝ Orderlists)
→ Find which customer ordered which books.

● AllPairs ← π[cid](Customers) × SKISBNs
→ All (cid, isbn) pairs that would exist if every customer had ordered every Stephen King book.

● AllPairs - CustISBN
→ The required (cid, isbn) pairs that were never ordered. The difference is taken on full (cid, isbn) pairs; projecting each side onto cid first would instead yield the customers who ordered no Stephen King book at all.

● NotAll ← π[cid](AllPairs - CustISBN)
→ Find customers missing at least one Stephen King book.

● Final result
Result ← π[cname](Customers ⨝ (π[cid](Customers) - NotAll))
→ Get the names of customers who have ordered every Stephen King book.
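
The division pattern (build every required (cid, isbn) pair, subtract the pairs that were actually ordered, then project the remaining cid values) can be sketched with Python sets. The relations below are hypothetical toy data, and the variable names simply mirror the RA steps.

```python
# Hypothetical toy relations represented as sets of tuples.
customers = {(1, "An"), (2, "Binh"), (3, "Chi")}           # (cid, cname)
sk_isbns = {"s1", "s2"}                                    # ISBNs of Stephen King books
cust_isbn = {(1, "s1"), (1, "s2"), (2, "s1"), (2, "o1")}   # (cid, isbn) actually ordered

# Projection of Customers onto cid.
cids = {cid for cid, _ in customers}

# Cartesian product: every pair that must exist for a customer to qualify.
all_pairs = {(cid, isbn) for cid in cids for isbn in sk_isbns}

# Set difference on full pairs: required pairs that were never ordered.
missing = all_pairs - cust_isbn

# Customers missing at least one Stephen King book.
not_all = {cid for cid, _ in missing}

# Names of the remaining customers.
result = {cname for cid, cname in customers if cid in cids - not_all}
print(result)
```

Binh is missing s2 and Chi ordered nothing, so only An remains.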

●​ Q7
RA does not support aggregation (COUNT, MAX)
● Q8
RA cannot express this: the ratio requires counting (aggregation) and arithmetic
● Q9
RA does not support top-k ranking (in this case k = 3)
● Q10
RA does not support aggregation (COUNT)
3.​TRC
●​ Q1
{ c.cname |
Customers(c) ∧
∃o (Orders(o) ∧ o.cid = c.cid ∧ o.order_date ≥ '2024-01-01' ∧ o.order_date ≤ '2024-12-31') ∧
¬∃o2 (Orders(o2) ∧ o2.cid = c.cid ∧ o2.order_date ≥ '2025-01-01' ∧ o2.order_date ≤ '2025-12-31')
}
●​ Q2
TRC does not support aggregation (SUM, GROUP BY)
●​ Q3
{ b.title, b.author |
Books(b) ∧
¬∃r (Recommendations(r) ∧ r.isbn = b.isbn)
}
●​ Q4
TRC does not support aggregation (SUM, COUNT, AVG)
●​ Q5
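TRC has no MAX operator. The latest order can still be described as an order o for which no order o2 of the same customer has a later date.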
●​ Q6
{ c.cname |
Customers(c) ∧
¬∃b (Books(b) ∧ b.author = 'Stephen King' ∧
¬∃o (Orders(o) ∧ o.cid = c.cid ∧
∃l (Orderlists(l) ∧ l.order_id = o.order_id ∧ l.isbn = b.isbn)))
}
Explanation:
Customers(c)
→ We look at each customer c.

¬∃b (... Stephen King ...)
→ For that customer, there does not exist a Stephen King book b such that...

¬∃o (... ∧ ∃l (...))
→ ...the customer has no order containing it.
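
The double negation ("there is no Stephen King book that the customer has not ordered") maps directly onto nested any() calls in Python. The sketch below uses hypothetical toy relations as lists of tuples. The function name has_ordered is illustrative.

```python
# Hypothetical toy relations as lists of tuples.
customers = [(1, "An"), (2, "Binh"), (3, "Chi")]                  # (cid, cname)
books = [("s1", "Stephen King"), ("s2", "Stephen King"),
         ("o1", "Other Author")]                                  # (isbn, author)
orders = [(10, 1), (11, 2)]                                       # (order_id, cid)
orderlists = [(10, "s1"), (10, "s2"), (11, "s1"), (11, "o1")]     # (order_id, isbn)

def has_ordered(cid, isbn):
    """Exists an order of this customer with an order line for this book."""
    return any(
        o_cid == cid
        and any(l_oid == oid and l_isbn == isbn for l_oid, l_isbn in orderlists)
        for oid, o_cid in orders
    )

# Keep a customer when no Stephen King book exists that they have not ordered.
result = [
    cname
    for cid, cname in customers
    if not any(
        author == "Stephen King" and not has_ordered(cid, isbn)
        for isbn, author in books
    )
]
print(result)
```

With this data only An qualifies: Binh lacks s2 and Chi has no orders at all.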
●​ Q7
TRC does not support aggregation (COUNT, MAX)
● Q8
TRC cannot express this: the ratio requires counting (aggregation) and arithmetic
● Q9
TRC does not support top-k ranking (in this case k = 3)
● Q10
TRC does not support aggregation (COUNT)
