SQL - Part I
SQL
• Structured Query Language (SQL) is the standard relational database language.
• It was originally developed at IBM and later became an official standard (ANSI SQL).
• SQL can be divided into:
DML: Querying and manipulating data
DDL: Creation and deletion of schemas
Triggers: Actions executed based on some conditions
Embedded SQL
Security and Transaction management
SQL Queries
• Queries fetch information from the database tables based on the
conditions specified in the SQL statement.
• We shall work with the relations given below:
Student (sid: integer, sname : string, age: integer)
Course (cid: integer, cname: string, duration: integer)
Enrollment (sid: integer, cid: integer, edate: date)
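The three relations above can be declared with DDL. The following is a minimal sketch using Python's built-in sqlite3 module; the choice of SQLite, the key constraints, and storing edate as ISO-8601 text are assumptions, not part of the slides.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student    (sid INTEGER PRIMARY KEY, sname TEXT, age INTEGER);
    CREATE TABLE Course     (cid INTEGER PRIMARY KEY, cname TEXT, duration INTEGER);
    -- Assumption: SQLite has no dedicated DATE type, so edate is kept as text.
    CREATE TABLE Enrollment (sid INTEGER REFERENCES Student(sid),
                             cid INTEGER REFERENCES Course(cid),
                             edate TEXT);
""")

# List the tables that now exist in the schema.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```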
Sample Data
Student
SID  SNAME        AGE
----------------------------------------
1    Mary Molle   20
2    John July    21
3    Lilly Lolly  20
4    Bill Will    22
Enrollment
BASICS
Syntax
Simple Join
• Query 5: Find the sids and names of the students who have enrolled for at
least one course.
SELECT Student.sid, sname FROM Student, Enrollment
WHERE Student.sid = Enrollment.sid
The WHERE clause here is the join condition.
• The same query using table aliases:
SELECT s.sid, sname FROM Student s, Enrollment e
WHERE s.sid= e.sid
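A student with several enrollments appears once per matching Enrollment row in this join, which is where DISTINCT becomes useful. The sketch below runs Query 5 with and without DISTINCT in SQLite; the Enrollment rows are assumed for illustration and are not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (sid INTEGER, sname TEXT, age INTEGER);
    CREATE TABLE Enrollment (sid INTEGER, cid INTEGER, edate TEXT);
    INSERT INTO Student VALUES (1,'Mary Molle',20), (2,'John July',21),
                               (3,'Lilly Lolly',20), (4,'Bill Will',22);
    -- Assumed sample enrollments (hypothetical data).
    INSERT INTO Enrollment VALUES (1,1,'2006-07-01'), (1,2,'2006-09-01'),
                                  (2,1,'2006-07-15'), (3,1,'2006-08-01'),
                                  (3,2,'2006-09-01');
""")

join_sql = """
    SELECT {distinct} s.sid, sname FROM Student s, Enrollment e
    WHERE s.sid = e.sid ORDER BY s.sid
"""
# Without DISTINCT: one output row per matching enrollment.
plain = conn.execute(join_sql.format(distinct="")).fetchall()
# With DISTINCT: each enrolled student appears once.
unique = conn.execute(join_sql.format(distinct="DISTINCT")).fetchall()
print(len(plain), len(unique))
print(unique)
```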
DISTINCT/UNIQUE
• Query 6: Find the names of the students and the names of the
courses in which they have enrolled.
This query needs two join conditions.
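One possible way to write Query 6 is sketched below: Student is linked to Enrollment on sid, and Enrollment is linked to Course on cid. This is an illustrative reconstruction, not the original slide's query, and the Course and Enrollment rows are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (sid INTEGER, sname TEXT, age INTEGER);
    CREATE TABLE Course (cid INTEGER, cname TEXT, duration INTEGER);
    CREATE TABLE Enrollment (sid INTEGER, cid INTEGER, edate TEXT);
    INSERT INTO Student VALUES (1,'Mary Molle',20), (2,'John July',21),
                               (3,'Lilly Lolly',20), (4,'Bill Will',22);
    -- Assumed sample rows (hypothetical data).
    INSERT INTO Course VALUES (1,'Java',30), (2,'RDBMS',40);
    INSERT INTO Enrollment VALUES (1,1,'2006-07-01'), (1,2,'2006-09-01'),
                                  (2,1,'2006-07-15'), (3,1,'2006-08-01'),
                                  (3,2,'2006-09-01');
""")

# Two join conditions: s.sid = e.sid and e.cid = c.cid.
pairs = conn.execute("""
    SELECT sname, cname
    FROM Student s, Enrollment e, Course c
    WHERE s.sid = e.sid AND e.cid = c.cid
    ORDER BY sname, cname
""").fetchall()
for sname, cname in pairs:
    print(sname, "->", cname)
```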
Patterns: LIKE
• The LIKE operator is used with wildcard characters to search for patterns.
• Wild cards:
% : any number of characters
_ : one character
• Query 7: Find all the students whose names end with the letter ‘y’.
SELECT sname FROM student WHERE sname LIKE '%y'
• Query 8: Find all the enrollments that happened in July.
SELECT * FROM enrollment WHERE edate LIKE '%JUL%'
• Query 9: Find the two-letter course names that end with ‘S’.
SELECT * FROM course WHERE cname LIKE '_S'
• Query 10: Find all the students whose names have exactly two characters
before ‘ll’.
SELECT sname FROM student WHERE sname LIKE '__ll%'
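The wildcard queries can be tried against the Student sample data. The sketch below uses SQLite, where LIKE is case-insensitive for ASCII letters by default. Query 8 as written depends on how the database renders dates as text (for example, a DD-MON-YYYY display format); here dates are assumed to be stored as ISO-8601 text, so July is matched with '2006-07-%' instead. The Enrollment rows are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (sid INTEGER, sname TEXT, age INTEGER);
    CREATE TABLE Enrollment (sid INTEGER, cid INTEGER, edate TEXT);
    INSERT INTO Student VALUES (1,'Mary Molle',20), (2,'John July',21),
                               (3,'Lilly Lolly',20), (4,'Bill Will',22);
    -- Assumed sample enrollments with ISO-8601 dates (hypothetical data).
    INSERT INTO Enrollment VALUES (1,1,'2006-07-01'), (1,2,'2006-09-01'),
                                  (2,1,'2006-07-15'), (3,1,'2006-08-01'),
                                  (3,2,'2006-09-01');
""")

def names(pattern):
    """Return student names matching a LIKE pattern, in sid order."""
    rows = conn.execute(
        "SELECT sname FROM Student WHERE sname LIKE ? ORDER BY sid", (pattern,))
    return [r[0] for r in rows]

ends_with_y = names("%y")      # Query 7: % matches any number of characters
two_before_ll = names("__ll%") # Query 10: each _ matches exactly one character
july = conn.execute(
    "SELECT sid, cid FROM Enrollment WHERE edate LIKE '2006-07-%' ORDER BY sid"
).fetchall()
print(ends_with_y, two_before_ll, july)
```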
Ranges: BETWEEN
• Allows querying within ranges.
• Query 11: Find all enrollments that took place from July through
September.
SELECT * FROM enrollment
WHERE edate BETWEEN '01-JUL-2006' AND '30-SEP-2006'
• Query 12: Find all courses in the range of 30 to 35 hours.
SELECT * FROM course WHERE duration BETWEEN 30 AND 35
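BETWEEN includes both endpoints, so `x BETWEEN a AND b` behaves like `x >= a AND x <= b`. The sketch below checks this in SQLite for Query 12 and runs Query 11 on ISO-8601 text dates (an assumption, since SQLite has no DATE type). The Course and Enrollment rows are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Course (cid INTEGER, cname TEXT, duration INTEGER);
    CREATE TABLE Enrollment (sid INTEGER, cid INTEGER, edate TEXT);
    -- Assumed sample rows (hypothetical data).
    INSERT INTO Course VALUES (1,'Java',30), (2,'RDBMS',40);
    INSERT INTO Enrollment VALUES (1,1,'2006-07-01'), (1,2,'2006-09-01'),
                                  (2,1,'2006-07-15'), (3,1,'2006-08-01'),
                                  (3,2,'2006-09-01');
""")

# Query 12: a 30-hour course is included because the bounds are inclusive.
between = conn.execute(
    "SELECT cname FROM Course WHERE duration BETWEEN 30 AND 35").fetchall()
explicit = conn.execute(
    "SELECT cname FROM Course WHERE duration >= 30 AND duration <= 35").fetchall()

# Query 11: ISO-8601 text sorts chronologically, so BETWEEN works on it.
jul_to_sep = conn.execute("""
    SELECT COUNT(*) FROM Enrollment
    WHERE edate BETWEEN '2006-07-01' AND '2006-09-30'
""").fetchone()[0]
print(between, explicit, jul_to_sep)
```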
Logical Operators
• AND , OR , NOT
• Query 13: Find all courses not in the range of 30 to 35 hours.
SELECT * FROM course WHERE duration NOT BETWEEN 30 AND 35
• Query 14: Find the names of all students who have enrolled in at least
one course and whose age is 20.
SELECT sname FROM student s, enrollment e
WHERE s.sid=e.sid AND s.age=20
• UNION
• INTERSECT
• EXCEPT (or MINUS)
• UNION ALL
UNION, INTERSECT, and EXCEPT eliminate the duplicate rows from the final set; UNION ALL keeps them.
Nested and Correlated Subqueries
• IN
• Op ANY and Op ALL
• EXISTS
UNION
• Query 15: Find the sids of all the students who have taken Java or
RDBMS courses.
INTERSECT
• Query 16: Find the sids of all the students who have taken both
Java and RDBMS courses.
• Query 16 revisit: For the same question in the previous slide, would
the query below work?
EXCEPT or MINUS
• Query 17: Find the sids of all the students who have taken Java
but not RDBMS courses.
Set Difference
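Queries 15 to 17 can each be written by combining the same two sub-selects with a different set operator. The sketch below is one possible formulation, not the original slides' SQL; it uses SQLite, which spells set difference EXCEPT (Oracle uses MINUS). The Course and Enrollment rows are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Course (cid INTEGER, cname TEXT, duration INTEGER);
    CREATE TABLE Enrollment (sid INTEGER, cid INTEGER, edate TEXT);
    -- Assumed sample rows (hypothetical data).
    INSERT INTO Course VALUES (1,'Java',30), (2,'RDBMS',40);
    INSERT INTO Enrollment VALUES (1,1,'2006-07-01'), (1,2,'2006-09-01'),
                                  (2,1,'2006-07-15'), (3,1,'2006-08-01'),
                                  (3,2,'2006-09-01');
""")

took = ("SELECT e.sid FROM Enrollment e, Course c "
        "WHERE e.cid = c.cid AND c.cname = '{}'")
java, rdbms = took.format("Java"), took.format("RDBMS")

def sids(operator):
    """Combine the Java and RDBMS sub-selects with the given set operator."""
    return sorted(r[0] for r in conn.execute(f"{java} {operator} {rdbms}"))

either = sids("UNION")         # Query 15: Java or RDBMS
both = sids("INTERSECT")       # Query 16: Java and RDBMS
java_only = sids("EXCEPT")     # Query 17: Java but not RDBMS
with_dups = sids("UNION ALL")  # duplicates are retained
print(either, both, java_only, with_dups)
```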
Nested Query
Set-Comparison
• Query 18: Find all the courses whose duration is at least that of every
other course.
SELECT cname FROM course
WHERE duration >= ALL (SELECT duration FROM course)
• Query 19: Find all the students whose age is less than the age of at least
one of the students whose names end with ‘y’.
SELECT * FROM student
WHERE age < ANY (SELECT age FROM student WHERE sname LIKE '%y')
IN
• Query 20: Find the names of all students who have taken the Java
course.
SELECT sname FROM Student s
WHERE sid IN (SELECT sid FROM Enrollment e, Course c
WHERE cname LIKE 'Java%' AND e.cid=c.cid)
• Query 21: Get the names of the students whose sids are 2 or 4.
SELECT sname FROM Student WHERE sid IN (2,4)
• Query 22: Get the names of the students whose sids are neither 2 nor 4.
SELECT sname FROM Student WHERE sid NOT IN (2,4)
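The IN queries above can be checked against the sample data. This sketch runs Queries 20 to 22 in SQLite; IN accepts either a subquery or a literal list. The Course and Enrollment rows are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (sid INTEGER, sname TEXT, age INTEGER);
    CREATE TABLE Course (cid INTEGER, cname TEXT, duration INTEGER);
    CREATE TABLE Enrollment (sid INTEGER, cid INTEGER, edate TEXT);
    INSERT INTO Student VALUES (1,'Mary Molle',20), (2,'John July',21),
                               (3,'Lilly Lolly',20), (4,'Bill Will',22);
    -- Assumed sample rows (hypothetical data).
    INSERT INTO Course VALUES (1,'Java',30), (2,'RDBMS',40);
    INSERT INTO Enrollment VALUES (1,1,'2006-07-01'), (1,2,'2006-09-01'),
                                  (2,1,'2006-07-15'), (3,1,'2006-08-01'),
                                  (3,2,'2006-09-01');
""")

def names(sql):
    """Run a single-column query and return the values as a list."""
    return [r[0] for r in conn.execute(sql)]

# Query 20: the subquery yields the sids enrolled in a Java course.
java_students = names("""
    SELECT sname FROM Student s
    WHERE sid IN (SELECT sid FROM Enrollment e, Course c
                  WHERE cname LIKE 'Java%' AND e.cid = c.cid)
    ORDER BY sid
""")
in_list = names("SELECT sname FROM Student WHERE sid IN (2,4) ORDER BY sid")
not_in_list = names("SELECT sname FROM Student WHERE sid NOT IN (2,4) ORDER BY sid")
print(java_students, in_list, not_in_list)
```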
Correlated subqueries
EXISTS
• EXISTS is similar to the IN operator, except that the top-level query
selects a row based on whether the subquery returns any record for it.
• Query 23 revisited: Find the names of the students who have
enrolled for a course on 1 September 2006.
SELECT sname FROM Student s
WHERE EXISTS (SELECT * FROM Enrollment e
WHERE e.sid=s.sid AND e.edate='1-SEP-2006')
This returns the same result as the previous query.
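To see that the correlated EXISTS form and an IN form agree, both can be run side by side. In this SQLite sketch the IN version is a reconstruction for comparison, the date is written as ISO-8601 text, and the Enrollment rows are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (sid INTEGER, sname TEXT, age INTEGER);
    CREATE TABLE Enrollment (sid INTEGER, cid INTEGER, edate TEXT);
    INSERT INTO Student VALUES (1,'Mary Molle',20), (2,'John July',21),
                               (3,'Lilly Lolly',20), (4,'Bill Will',22);
    -- Assumed sample enrollments (hypothetical data).
    INSERT INTO Enrollment VALUES (1,1,'2006-07-01'), (1,2,'2006-09-01'),
                                  (2,1,'2006-07-15'), (3,1,'2006-08-01'),
                                  (3,2,'2006-09-01');
""")

# Correlated subquery: evaluated once per Student row via e.sid = s.sid.
with_exists = conn.execute("""
    SELECT sname FROM Student s
    WHERE EXISTS (SELECT * FROM Enrollment e
                  WHERE e.sid = s.sid AND e.edate = '2006-09-01')
    ORDER BY sid
""").fetchall()

# Uncorrelated subquery: the inner SELECT does not refer to the outer row.
with_in = conn.execute("""
    SELECT sname FROM Student
    WHERE sid IN (SELECT sid FROM Enrollment WHERE edate = '2006-09-01')
    ORDER BY sid
""").fetchall()
print(with_exists, with_exists == with_in)
```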
DIVISION
• The division operation finds the subset of items in one set that are
related to all items in another set.
• SQL has no direct division operator.
• However, the operation can be performed using correlated subqueries.
• Let us look at an example to understand the strategy.
Division by example
Analysis:
Round 1: sid = 1.
A: SELECT cid FROM course : (1,2)
B: SELECT cid FROM enrollment e WHERE e.sid=s.sid : (1,2)
A-B=().
Outer query selects sid=1.
Round 2: sid = 2.
A: SELECT cid FROM course : (1,2)
B: SELECT cid FROM enrollment e WHERE e.sid=s.sid : (1)
A-B=(2).
Outer query does not select sid=2.
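The rounds above suggest a query of the form "select a student when A minus B is empty", that is, when no course is missing from the student's enrollments. The sketch below is one way to write it with NOT EXISTS and EXCEPT in SQLite; it is a reconstruction under that reading, and the enrollments for sid 3 are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (sid INTEGER, sname TEXT, age INTEGER);
    CREATE TABLE Course (cid INTEGER, cname TEXT, duration INTEGER);
    CREATE TABLE Enrollment (sid INTEGER, cid INTEGER, edate TEXT);
    INSERT INTO Student VALUES (1,'Mary Molle',20), (2,'John July',21),
                               (3,'Lilly Lolly',20), (4,'Bill Will',22);
    -- Assumed sample rows (hypothetical data); sid 1 has courses (1,2)
    -- and sid 2 has course (1), matching the rounds in the text.
    INSERT INTO Course VALUES (1,'Java',30), (2,'RDBMS',40);
    INSERT INTO Enrollment VALUES (1,1,'2006-07-01'), (1,2,'2006-09-01'),
                                  (2,1,'2006-07-15'), (3,1,'2006-08-01'),
                                  (3,2,'2006-09-01');
""")

# A = all course ids; B = course ids taken by the current outer student.
# The student qualifies when A EXCEPT B returns no rows.
enrolled_in_all = conn.execute("""
    SELECT s.sid FROM Student s
    WHERE NOT EXISTS (SELECT cid FROM Course
                      EXCEPT
                      SELECT e.cid FROM Enrollment e WHERE e.sid = s.sid)
    ORDER BY s.sid
""").fetchall()
print(enrolled_in_all)
```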
Aggregate Operators
• COUNT([DISTINCT] a)
• SUM([DISTINCT] a)
• AVG([DISTINCT] a)
• MAX(a)
• MIN(a)
COUNT
• Query 25: Find the total number of unique students who have
enrolled for a course.
SELECT count(distinct sid) FROM Enrollment
MAX, MIN
• Query 18 revisit: Find all the courses whose duration is longer than
that of any other course.
SELECT cname FROM Course
WHERE duration = (SELECT MAX(duration) FROM Course)
SUM, AVG
• Query 27: What is the minimum number of hours a person needs to
complete all the courses?
SELECT SUM(duration) FROM Course
• Query 28: What is the average age of a student?
SELECT AVG(age) FROM Student
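The aggregate queries above each return a single value (or, for the MAX subquery, the rows matching that value). A minimal SQLite sketch follows; the Course durations and Enrollment rows are assumed, so the SUM result reflects only this hypothetical data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (sid INTEGER, sname TEXT, age INTEGER);
    CREATE TABLE Course (cid INTEGER, cname TEXT, duration INTEGER);
    CREATE TABLE Enrollment (sid INTEGER, cid INTEGER, edate TEXT);
    INSERT INTO Student VALUES (1,'Mary Molle',20), (2,'John July',21),
                               (3,'Lilly Lolly',20), (4,'Bill Will',22);
    -- Assumed sample rows (hypothetical data).
    INSERT INTO Course VALUES (1,'Java',30), (2,'RDBMS',40);
    INSERT INTO Enrollment VALUES (1,1,'2006-07-01'), (1,2,'2006-09-01'),
                                  (2,1,'2006-07-15'), (3,1,'2006-08-01'),
                                  (3,2,'2006-09-01');
""")

def scalar(sql):
    """Return the single value produced by an aggregate query."""
    return conn.execute(sql).fetchone()[0]

enrolled = scalar("SELECT COUNT(DISTINCT sid) FROM Enrollment")  # Query 25
total_hours = scalar("SELECT SUM(duration) FROM Course")         # Query 27
average_age = scalar("SELECT AVG(age) FROM Student")             # Query 28
longest = conn.execute("""
    SELECT cname FROM Course
    WHERE duration = (SELECT MAX(duration) FROM Course)
""").fetchall()                                                  # Query 18 revisit
print(enrolled, total_hours, average_age, longest)
```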
Challenge Questions
• Query 30: Find the names of the second-oldest student(s), i.e., those
whose age is next below that of the oldest student(s).
Order By
• ORDER BY causes the result of the query to be displayed in a specific
order (ascending or descending) based on one or more attributes.
• Display the student records sorted by name.
SELECT * FROM student ORDER BY sname ASC
• Display the course records from the highest to the lowest course
duration, and sort courses of equal duration by course name.
SELECT * FROM course ORDER BY duration DESC, cname
The default sort order is ASC.
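The second sort key only matters when rows tie on the first. The SQLite sketch below adds a hypothetical third course ('C++', 30 hours) purely to create such a tie; all Course rows are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (sid INTEGER, sname TEXT, age INTEGER);
    CREATE TABLE Course (cid INTEGER, cname TEXT, duration INTEGER);
    INSERT INTO Student VALUES (1,'Mary Molle',20), (2,'John July',21),
                               (3,'Lilly Lolly',20), (4,'Bill Will',22);
    -- Assumed rows (hypothetical data); 'C++' exists only to tie with 'Java'.
    INSERT INTO Course VALUES (1,'Java',30), (2,'RDBMS',40), (3,'C++',30);
""")

by_name = [r[0] for r in conn.execute(
    "SELECT sname FROM Student ORDER BY sname ASC")]
# duration DESC is the primary key; cname (ASC by default) breaks ties.
by_duration = conn.execute(
    "SELECT cname, duration FROM Course ORDER BY duration DESC, cname"
).fetchall()
print(by_name)
print(by_duration)
```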
GROUP BY
CID COUNT(*)
--------------------------------
1 3
2 2
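A result of this shape (one row per cid with its row count) would come from grouping Enrollment by cid. The sketch below is a plausible query for it in SQLite, not the original slide's SQL; the Enrollment rows are assumed and chosen so that the counts match the table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Enrollment (sid INTEGER, cid INTEGER, edate TEXT);
    -- Assumed sample enrollments (hypothetical data).
    INSERT INTO Enrollment VALUES (1,1,'2006-07-01'), (1,2,'2006-09-01'),
                                  (2,1,'2006-07-15'), (3,1,'2006-08-01'),
                                  (3,2,'2006-09-01');
""")

# GROUP BY forms one group per distinct cid; COUNT(*) counts rows per group.
per_course = conn.execute("""
    SELECT cid, COUNT(*) FROM Enrollment
    GROUP BY cid
    ORDER BY cid
""").fetchall()
print(per_course)
```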
Challenge Question
HAVING
Challenge Question
• Find the sids of all students who have taken a course that the student
with sid=1 has taken.
NULL
JOINS
Natural Join
USING and ON
• When the tables share more than one common column and we need to
join on only one of them, the USING keyword can be used.
SELECT DISTINCT sid, sname
FROM Student JOIN Enrollment USING (sid)
Or
SELECT DISTINCT Student.sid, sname
FROM Student JOIN Enrollment ON Student.sid = Enrollment.sid
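Both forms express the same join, which can be confirmed by comparing their results. A minimal SQLite sketch, with assumed Enrollment rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (sid INTEGER, sname TEXT, age INTEGER);
    CREATE TABLE Enrollment (sid INTEGER, cid INTEGER, edate TEXT);
    INSERT INTO Student VALUES (1,'Mary Molle',20), (2,'John July',21),
                               (3,'Lilly Lolly',20), (4,'Bill Will',22);
    -- Assumed sample enrollments (hypothetical data).
    INSERT INTO Enrollment VALUES (1,1,'2006-07-01'), (1,2,'2006-09-01'),
                                  (2,1,'2006-07-15'), (3,1,'2006-08-01'),
                                  (3,2,'2006-09-01');
""")

# USING (sid) merges the shared column, so sid needs no table qualifier.
with_using = conn.execute("""
    SELECT DISTINCT sid, sname
    FROM Student JOIN Enrollment USING (sid)
    ORDER BY sid
""").fetchall()

# ON keeps both sid columns, so the reference must be qualified.
with_on = conn.execute("""
    SELECT DISTINCT Student.sid, sname
    FROM Student JOIN Enrollment ON Student.sid = Enrollment.sid
    ORDER BY Student.sid
""").fetchall()
print(with_using, with_using == with_on)
```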
• In an outer join, the records that do not match the join condition also
appear in the result.
• Query 36: List all the students. Also list the courses, if any,
that each student has enrolled for.
SELECT * FROM student, enrollment
WHERE student.sid = enrollment.sid(+)
Or, in standard SQL (the (+) notation above is Oracle-specific):
SELECT * FROM student LEFT OUTER JOIN enrollment
ON student.sid = enrollment.sid
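In the left outer join, a student with no enrollments still appears, with NULL in the Enrollment columns. The SQLite sketch below uses the standard LEFT OUTER JOIN form (SQLite does not accept the (+) notation); the Enrollment rows are assumed, and sid 4 is deliberately left without any.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (sid INTEGER, sname TEXT, age INTEGER);
    CREATE TABLE Enrollment (sid INTEGER, cid INTEGER, edate TEXT);
    INSERT INTO Student VALUES (1,'Mary Molle',20), (2,'John July',21),
                               (3,'Lilly Lolly',20), (4,'Bill Will',22);
    -- Assumed sample enrollments (hypothetical data); none for sid 4.
    INSERT INTO Enrollment VALUES (1,1,'2006-07-01'), (1,2,'2006-09-01'),
                                  (2,1,'2006-07-15'), (3,1,'2006-08-01'),
                                  (3,2,'2006-09-01');
""")

rows = conn.execute("""
    SELECT s.sid, s.sname, e.cid
    FROM Student s LEFT OUTER JOIN Enrollment e ON s.sid = e.sid
    ORDER BY s.sid, e.cid
""").fetchall()
for row in rows:
    print(row)  # SQL NULL is returned to Python as None
```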
• Query 38: List all the courses and all the students. Link the
students and courses wherever an enrollment connects them.
SELECT c.cid, cname, s.sid, sname
FROM enrollment e FULL OUTER JOIN course c ON c.cid = e.cid
FULL OUTER JOIN Student s ON e.sid = s.sid
Self Join
• A self-join joins a table to itself. It is used when a query must relate
the values of two columns of the same table.
• Assume that we have a table called customer.
1  Neeta Shyam
2  Dolly Dilly   1
3  Meena Kumari  2
• Query 39: Find the names of all customers who have referred other
customers.
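One way to approach Query 39 is to list the customer table twice under different aliases and match one copy's referrer column against the other copy's id. The sketch below assumes the third column of the table holds the id of the referring customer; the column names cid, cname, and referred_by are hypothetical, since the original headers are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Column names are assumptions; the third column is read as "referred by".
    CREATE TABLE Customer (cid INTEGER, cname TEXT, referred_by INTEGER);
    INSERT INTO Customer VALUES (1,'Neeta Shyam',NULL), (2,'Dolly Dilly',1),
                                (3,'Meena Kumari',2);
""")

# c is the referred customer, r is the customer who referred them.
referrers = conn.execute("""
    SELECT DISTINCT r.cid, r.cname
    FROM Customer c, Customer r
    WHERE c.referred_by = r.cid
    ORDER BY r.cid
""").fetchall()
print(referrers)
```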