SQL commands
SQL is a language used in relational databases. A DBMS and SQL support CRUD (Create, Read,
Update, Delete) commands.
Notes:
• SQL keywords are case insensitive
• Mathematics can be done in SQL
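The two notes above can be checked with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for the DBMS (the notes themselves appear to use MySQL); the variable names are illustrative only.

```python
import sqlite3

# In-memory SQLite database, used here only as a convenient stand-in DBMS.
conn = sqlite3.connect(":memory:")

# Keywords are case insensitive: both statements are the same query.
upper = conn.execute("SELECT 2 + 3 * 4").fetchone()[0]
lower = conn.execute("select 2 + 3 * 4").fetchone()[0]

# Arithmetic follows the usual precedence (multiplication before addition).
print(upper, lower)  # 14 14
```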
Language
SQL provides the following capabilities:
Capability                         Purpose                                       Commands
Data Definition Language (DDL)     Define and set up database                    CREATE, ALTER, DROP
Data Manipulation Language (DML)   Maintain and use database                     SELECT, INSERT, DELETE, UPDATE
Data Control Language (DCL)        Control access to database                    GRANT, REVOKE
Other commands                     Administer database and transaction control   N/A
DDL Commands
CREATE TABLE entity_name (
Attribute type,
...
) ENGINE=InnoDB;
e.g.
CREATE TABLE Account (
AccountID smallint auto_increment,
AccountName varchar(100) NOT NULL,
OutstandingBalance DECIMAL(10,2) NOT NULL,
CustomerID smallint NOT NULL,
PRIMARY KEY (AccountID),
FOREIGN KEY (CustomerID) REFERENCES Customer(CustomerID)
ON DELETE RESTRICT
ON UPDATE CASCADE
);
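The effect of ON DELETE RESTRICT and ON UPDATE CASCADE can be sketched as follows. This is an illustration under assumptions, not the notes' own setup: it runs on SQLite through Python's sqlite3 module, so auto_increment becomes INTEGER PRIMARY KEY and the ENGINE clause is dropped, and the Customer table definition and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE Customer (              -- hypothetical parent table
    CustomerID INTEGER PRIMARY KEY,
    CustLastName VARCHAR(100) NOT NULL
);
CREATE TABLE Account (
    AccountID INTEGER PRIMARY KEY,
    AccountName VARCHAR(100) NOT NULL,
    OutstandingBalance DECIMAL(10,2) NOT NULL,
    CustomerID SMALLINT NOT NULL,
    FOREIGN KEY (CustomerID) REFERENCES Customer(CustomerID)
        ON DELETE RESTRICT
        ON UPDATE CASCADE
);
INSERT INTO Customer VALUES (1, 'Smith');
INSERT INTO Account VALUES (10, 'Savings', 250.00, 1);
""")

# RESTRICT: a customer that still owns an account cannot be deleted.
try:
    conn.execute("DELETE FROM Customer WHERE CustomerID = 1")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

# CASCADE: changing the customer's key is propagated to the account row.
conn.execute("UPDATE Customer SET CustomerID = 2 WHERE CustomerID = 1")
account_customer = conn.execute("SELECT CustomerID FROM Account").fetchone()[0]
print(delete_blocked, account_customer)  # True 2
```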
Records can also be inserted from an existing table (see next section)
INSERT INTO target_entity
SELECT * FROM source_entity;
e.g. inserting explicit values:
INSERT INTO Customer
(CustFirstName, CustLastName, CustType)
VALUES ('Peter', 'Smith', 'Personal');
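A small sketch of inserting records from an existing table, assuming SQLite via Python's sqlite3 as a stand-in DBMS. The CustomerArchive table, the CustomerID column and the second customer row are hypothetical and exist only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (
    CustomerID INTEGER PRIMARY KEY,
    CustFirstName VARCHAR(50),
    CustLastName VARCHAR(50),
    CustType VARCHAR(20)
);
-- Hypothetical target table with the same columns as Customer.
CREATE TABLE CustomerArchive (
    CustomerID INTEGER PRIMARY KEY,
    CustFirstName VARCHAR(50),
    CustLastName VARCHAR(50),
    CustType VARCHAR(20)
);
INSERT INTO Customer (CustFirstName, CustLastName, CustType)
VALUES ('Peter', 'Smith', 'Personal');
INSERT INTO Customer (CustFirstName, CustLastName, CustType)
VALUES ('Jane', 'Doe', 'Company');

-- Copy every row of the existing table into the target table.
INSERT INTO CustomerArchive
SELECT * FROM Customer;
""")

archived = conn.execute(
    "SELECT CustFirstName, CustLastName FROM CustomerArchive ORDER BY CustomerID"
).fetchall()
print(archived)  # [('Peter', 'Smith'), ('Jane', 'Doe')]
```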
FROM entity
e.g. :
SELECT CustLastName FROM Customer;
WHERE where_condition
e.g. :
SELECT CustLastName FROM Customer
WHERE CustLastName = "Smith";
Aggregate functions
Aggregate functions operate on the set of values in a column of a relation and return a
single value, e.g. AVG(), COUNT(), MIN(), MAX(), SUM() etc.
HAVING where_condition
➢ Specifies a condition on groups - needed because the 'WHERE' keyword filters rows before
grouping and so cannot be used with aggregate functions
e.g. List the number of customers of each country only including countries with
more than 5 customers.
SELECT COUNT(CustomerID), CountryName
FROM Customers
GROUP BY CountryName
HAVING COUNT(CustomerID)>5;
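The HAVING query above can be tried on a tiny data set. This sketch assumes SQLite via Python's sqlite3; the country names and row counts are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CountryName VARCHAR(50))"
)

# Hypothetical data: 6 customers in one country, 2 in another.
rows = [("Australia",)] * 6 + [("Japan",)] * 2
conn.executemany("INSERT INTO Customers (CountryName) VALUES (?)", rows)

# COUNT() is computed per group; HAVING then keeps only the large groups.
result = conn.execute("""
    SELECT COUNT(CustomerID), CountryName
    FROM Customers
    GROUP BY CountryName
    HAVING COUNT(CustomerID) > 5
""").fetchall()
print(result)  # [(6, 'Australia')]
```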
➢ LIMIT limits the output size and OFFSET skips a number of records
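A minimal sketch of LIMIT and OFFSET, assuming SQLite via Python's sqlite3 and five hypothetical last names. An ORDER BY is included because without it the rows skipped and returned are not guaranteed to be the same on every run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (CustLastName VARCHAR(50))")
conn.executemany(
    "INSERT INTO Customer VALUES (?)",
    [("Adams",), ("Brown",), ("Clark",), ("Davis",), ("Evans",)],
)

# Skip the first record (OFFSET 1), then return at most two (LIMIT 2).
page = conn.execute("""
    SELECT CustLastName FROM Customer
    ORDER BY CustLastName
    LIMIT 2 OFFSET 1
""").fetchall()
print(page)  # [('Brown',), ('Clark',)]
```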
Left outer join    SELECT * FROM Rel1 LEFT OUTER JOIN Rel2 ON condition
Right outer join   SELECT * FROM Rel1 RIGHT OUTER JOIN Rel2 ON condition
Full outer join    SELECT * FROM Rel1 FULL OUTER JOIN Rel2 ON condition
                   (not supported by MySQL; emulate with a UNION of a left and a right outer join)
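A left outer join keeps every row of the left relation and pads missing matches with NULL. The sketch below assumes SQLite via Python's sqlite3; the columns and rows of Rel1 and Rel2 are hypothetical. Only the left variant is shown because older SQLite versions lack RIGHT and FULL OUTER JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Rel1 (id INTEGER, a TEXT);
CREATE TABLE Rel2 (id INTEGER, b TEXT);
INSERT INTO Rel1 VALUES (1, 'x');
INSERT INTO Rel1 VALUES (2, 'y');
INSERT INTO Rel2 VALUES (1, 'p');
""")

# Row 2 of Rel1 has no partner in Rel2, so Rel2's columns come back as NULL (None).
joined = conn.execute("""
    SELECT * FROM Rel1 LEFT OUTER JOIN Rel2
    ON Rel1.id = Rel2.id
    ORDER BY Rel1.id
""").fetchall()
print(joined)  # [(1, 'x', 1, 'p'), (2, 'y', None, None)]
```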
SQL allows queries to be nested: a subquery (another SELECT query inside the outer query) is
commonly used to perform set tests.
ANY      Must satisfy at least one of the inner conditions.
         WHERE sal > ANY(200, 300, 400);
         is equivalent to
         WHERE sal>200 OR sal>300 OR sal>400;
ALL      Must satisfy all inner conditions.
         WHERE sal > ALL(200, 300, 400);
         is equivalent to
         WHERE sal>200 AND sal>300 AND sal>400;
EXISTS   Inner query returns at least one record.
         SELECT * FROM Buyer
         WHERE EXISTS (SELECT * FROM Offer WHERE
         Buyer.BuyerID = Offer.BuyerID AND
         ArtefactID = 1)
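The logic of ANY and ALL can be mirrored in plain Python, which may help when reasoning about the equivalences above. This is only an analogy using Python's built-in any() and all(), not SQL itself; the function names are made up.

```python
# The inner set of values from the example above.
values = [200, 300, 400]

def greater_than_any(sal):
    # sal > ANY(...): true if sal exceeds at least one value (OR of comparisons).
    return any(sal > v for v in values)

def greater_than_all(sal):
    # sal > ALL(...): true only if sal exceeds every value (AND of comparisons).
    return all(sal > v for v in values)

print(greater_than_any(250), greater_than_all(250))  # True False
print(greater_than_any(450), greater_than_all(450))  # True True
```

Put differently, "> ANY" behaves like "greater than the minimum" and "> ALL" like "greater than the maximum" of the inner values.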
IN alternatives
There is often a more efficient alternative to an IN operation.
e.g. List the BuyerID, Name and Phone number for all bidders on artefact 1.
SELECT * FROM Buyer
WHERE BuyerID IN
(SELECT BuyerID FROM Offer WHERE ArtefactID = 1)
Offer
SellerID ArtefactID BuyerID Date
1 1 1 2012-06-20
1 1 2 2012-06-20
2 2 1 2012-06-20
2 2 2 2012-06-20
Buyer
BuyerID Name Phone
1 Maggie 0333333333
2 Nicole 0444444444
3 Oleg 0555555555
Artefact
ID Name Description
1 Vase Old Vase
2 Knife Old Knife
3 Pot Old Pot
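Using the Buyer and Offer data above, the IN query can be compared with a join that returns the same rows. The notes do not say which alternative they have in mind, so treating a join as the alternative is an assumption; the sketch runs on SQLite via Python's sqlite3, and the column types are guesses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Buyer (BuyerID INTEGER PRIMARY KEY, Name TEXT, Phone TEXT);
CREATE TABLE Offer (SellerID INTEGER, ArtefactID INTEGER, BuyerID INTEGER, Date TEXT);
INSERT INTO Buyer VALUES (1, 'Maggie', '0333333333');
INSERT INTO Buyer VALUES (2, 'Nicole', '0444444444');
INSERT INTO Buyer VALUES (3, 'Oleg', '0555555555');
INSERT INTO Offer VALUES (1, 1, 1, '2012-06-20');
INSERT INTO Offer VALUES (1, 1, 2, '2012-06-20');
INSERT INTO Offer VALUES (2, 2, 1, '2012-06-20');
INSERT INTO Offer VALUES (2, 2, 2, '2012-06-20');
""")

# Version 1: subquery with IN, as in the notes.
with_in = conn.execute("""
    SELECT * FROM Buyer
    WHERE BuyerID IN
        (SELECT BuyerID FROM Offer WHERE ArtefactID = 1)
    ORDER BY BuyerID
""").fetchall()

# Version 2: the same question answered with a join (DISTINCT guards against
# a buyer appearing once per offer).
with_join = conn.execute("""
    SELECT DISTINCT Buyer.BuyerID, Name, Phone
    FROM Buyer INNER JOIN Offer ON Buyer.BuyerID = Offer.BuyerID
    WHERE Offer.ArtefactID = 1
    ORDER BY Buyer.BuyerID
""").fetchall()

print(with_in)  # [(1, 'Maggie', '0333333333'), (2, 'Nicole', '0444444444')]
print(with_in == with_join)  # True
```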
UPDATE entity
SET attribute = new_attribute
WHERE condition;
A single UPDATE that uses a CASE expression is better than two separate statements such as:
UPDATE Salaried
SET AnnualSalary = AnnualSalary * 1.10
WHERE AnnualSalary > 100000;
UPDATE Salaried
SET AnnualSalary = AnnualSalary * 1.05
WHERE AnnualSalary <= 100000;
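A sketch of the single-statement form, using a CASE expression so that each row is visited once and can never receive both raises, whatever order the statements would have run in. It assumes SQLite via Python's sqlite3; the table is spelled Salaried here, and the EmpID column and the two salaries are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Salaried (EmpID INTEGER PRIMARY KEY, AnnualSalary REAL)")
conn.executemany(
    "INSERT INTO Salaried VALUES (?, ?)",
    [(1, 80000), (2, 120000)],
)

# One pass over the table: 5% raise up to 100000, 10% raise above it.
conn.execute("""
    UPDATE Salaried
    SET AnnualSalary = CASE
        WHEN AnnualSalary <= 100000 THEN AnnualSalary * 1.05
        ELSE AnnualSalary * 1.10
    END
""")

# Round to cents so floating-point noise does not affect the comparison.
salaries = [
    (emp, round(sal, 2))
    for emp, sal in conn.execute(
        "SELECT EmpID, AnnualSalary FROM Salaried ORDER BY EmpID"
    )
]
print(salaries)
```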
REPLACE
➢ Works like INSERT, except that if an existing row in the table has the same primary key or
unique key value as the new row, the old row is overwritten
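REPLACE can be tried out with a short sketch. It assumes SQLite via Python's sqlite3, which also accepts REPLACE INTO; the table layout and names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, CustLastName VARCHAR(50))"
)
conn.execute("INSERT INTO Customer VALUES (1, 'Smith')")

# Same key as an existing row: the old row is overwritten.
conn.execute("REPLACE INTO Customer VALUES (1, 'Jones')")
# Key not present yet: behaves like a plain INSERT.
conn.execute("REPLACE INTO Customer VALUES (2, 'Lee')")

rows = conn.execute("SELECT * FROM Customer ORDER BY CustomerID").fetchall()
print(rows)  # [(1, 'Jones'), (2, 'Lee')]
```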
➢ A view is any relation that is not in the physical model but is made available to the user as a virtual relation
○ Helps hide query complexity and data from users
○ Once a view is defined its definition is stored in the database and can be used
like any other table
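A minimal sketch of defining and querying a view, assuming SQLite via Python's sqlite3. The view name PersonalCustomer and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (
    CustomerID INTEGER PRIMARY KEY,
    CustLastName VARCHAR(50),
    CustType VARCHAR(20)
);
INSERT INTO Customer VALUES (1, 'Smith', 'Personal');
INSERT INTO Customer VALUES (2, 'Doe', 'Company');

-- The view stores the query, not the data; it hides the filter and the
-- CustType column from whoever queries it.
CREATE VIEW PersonalCustomer AS
SELECT CustomerID, CustLastName
FROM Customer
WHERE CustType = 'Personal';
""")

# Once defined, the view is queried like any other table.
view_rows = conn.execute("SELECT * FROM PersonalCustomer").fetchall()
print(view_rows)  # [(1, 'Smith')]
```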
A DBMS stores information on disks (normally hard disks), and accessing data involves many read
and write operations, which is costly. Storage organisation and indexing are therefore needed.
Terminology
Conceptual modelling     Entity     Attribute      Instance of an entity
Logical modelling        Relation   Attribute      Tuple
Physical modelling/SQL   Table      Column/field   Row
Disk storage             File       Field          Record
Sorted files
Similar in structure to heap files, except that pages and records are ordered.
• Sequential order based on the search key (by columns)
• Quick search (especially on a range) but slow insert, because records need to be
reshuffled
• Can be good for range searches, e.g. conditions using less than (<)
Cost of storage
Data is typically stored in pages on hard disks. To process and analyse it, data needs
to be brought into RAM.
• Access to hard disks is much slower than access to memory
• I/O cost dominates overall cost and accounts for much more than CPU cost, so the
cost of access to memory is treated as negligible
For all operations, the DBMS models the cost as the number of pages (or disk I/O operations)
needed to bring data from disk to memory.
• e.g. a table of 100 records with each page storing 10 records, the cost of accessing the
entire file is 10 I/O (or 10 pages)
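The page-count arithmetic in the example can be written as a small helper. The function name is made up for illustration, and the model is the simple one described above (one I/O per page, full scan of the file).

```python
def pages_to_scan(num_records, records_per_page):
    """Number of page I/Os needed to read the whole file (ceiling division)."""
    return (num_records + records_per_page - 1) // records_per_page

# 100 records at 10 records per page: 10 pages, so 10 I/Os.
cost = pages_to_scan(100, 10)
print(cost)  # 10

# A partially filled last page still costs a full I/O.
print(pages_to_scan(101, 10))  # 11
```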
An index is a data structure built over specific fields of a relation, called search key fields.
It is made up of data entries which refer back to the data in the relation.
• Built on top of data files
• Speeds up selections on the search key fields
• Stored in an index file, in contrast to the data file, which contains the actual records
• Data file pages are not necessarily organised in the same manner as index pages
• Any subset of the fields of a relation can be the search key for an index
Index classification
Clustering
Clustered - data records in the data file have the same order as the data entries (sorted)
Properties
• There is only one search key combination for a clustered index
• Clustered indexes are more efficient for range and equality queries but
expensive to maintain
• Clustered indexes can only be applied to a table that uses sorted file
organisation
Primary vs secondary index
Primary - records are retrieved based on the value of the primary key.
Secondary - any other index that isn't on the primary key, often on fields that are
frequently queried.
Properties
• A primary index never contains duplicates whereas a secondary index
may contain duplicates
Composite
An index built over a combination of search key fields.
Hash-based index
Represents the index as a collection of buckets. A hash function maps the search key to the
corresponding bucket.
• Better for equality selections
• Can't perform range selections
e.g. Suppose you are given 5 buckets and keys placed as follows (the placements are
consistent with the hash function key mod 5):
Bucket   Key
0        200
1
2        22
3        8, 33
4        119
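The bucket table above can be reproduced with a short sketch. The hash function is not stated in the notes; key mod 5 is an assumption chosen because it matches every placement shown. The function and variable names are illustrative.

```python
NUM_BUCKETS = 5

def bucket_for(key):
    # Assumed hash function: remainder of the key divided by the bucket count.
    return key % NUM_BUCKETS

# Build the index: each bucket holds the keys that hash to it.
buckets = {b: [] for b in range(NUM_BUCKETS)}
for key in [200, 22, 8, 33, 119]:
    buckets[bucket_for(key)].append(key)

print(buckets)  # {0: [200], 1: [], 2: [22], 3: [8, 33], 4: [119]}

def lookup(key):
    # Equality selection: hash the key and inspect a single bucket.
    return key in buckets[bucket_for(key)]

print(lookup(33), lookup(34))  # True False
```

An equality lookup touches one bucket, whereas a range such as "keys between 8 and 33" would have to inspect every bucket, because the hash function does not preserve key order.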