Crash Course in SQL
Crash Course in SQL
If you want to work with databases, you must learn to speak their language.
Databases speak Structured Query Language, better known as SQL, which
was invented by E. F. Codd in the 1970s. Instead of working with tables one
record at a time, SQL manages groups of records as a single entity, which
makes it suitable for creating queries of any complexity. This language has
been standardized, and now most database servers, and ADO itself, accept
its ANSI-92 dialect.
Most of the examples in the following sections are based on the Biblio.mdb
or the NWind.mdb databases. You can recall previous queries using the less-
than and greater-than buttons, and you can also safely perform action queries
that delete, insert, or modify records because all your operations are wrapped
in a transaction that's rolled back when you close the form.
Basic selections
The simplest SELECT command returns all the records and all the fields
from a database table:
SELECT * FROM Publishers
(In all the examples in this section, I use uppercase characters for SQL
keywords, but you can write them in lowercase characters if you like.) You
can refine a SELECT command by specifying the list of fields you want to
retrieve. If the field name includes spaces or other symbols, you must
enclose it within square brackets:
You can often speed up a query by retrieving only the fields you're actually
going to use in your application. SQL supports simple expressions in the
field list portion of a SELECT. For example, you determine the age of each
author at the turn of the century using the following command:
This statement returns only one record with two fields: AuthorCnt is the
number of records in the Authors table, and AvgAge is the average of the
age of all authors in the year 2000.
The Count(*) syntax is an exception to the general rule in that it returns the
number of all the records in the result. In a real application, you rarely
retrieve all the records from a table, though, for a good reason: If the table
contains thousands of records, you're going to add too much overhead to
your system and the network. You filter a subset of the records in the table
using the WHERE clause. For example, you might want to retrieve the
names of all the publishers in California:
You can also combine multiple conditions using AND and OR Boolean
operators, as in the following query, which retrieves only the Californian
publishers whose names begin with the letter M:
In the WHERE clause, you can use all the comparison operators (=, <, <=, >,
>=, and <>) and the LIKE, BETWEEN, and IN operators. The BETWEEN
operator is used to select all the values in a range:
The IN operator is useful when you have a list of values. The query below
returns all the publishers located in California, Texas, and New Jersey.
SQL lets you embed strings in single quotes or in double quotes. Because
you usually pass these statements as Visual Basic strings, using single
quotes is often more practical. But when the string contains a single quote,
you must use two consecutive single quotes. For example, to search for an
author named O'Hara, you must use the following query:
You can also specify multiple sort keys by separating the keys with commas.
Furthermore, for each sort key you can add the DESC keyword to sort in
descending order. For example, you can list publishers sorted by state in
ascending order and at the same time list all the publishers in the same state
by city in descending order, as you can see in the following statement. (This
is, admittedly, not a really useful thing to do, but it works as an example.)
When the results are sorted, you can decide to take just the first records
returned by the SELECT, which you do with the TOP clause. For example,
you can retrieve the five titles published more recently using the following
query:
Keep in mind, however, that the TOP clause always returns all the records
with a given value in the field for which the results are sorted. For example,
the version of the Biblio.mdb database I'm working with includes seven
titles published in the most recent year (1999), and therefore the preceding
query will return seven records, not five.
You can define the number of returned records in terms of the percentage of
the total number of the records that would be returned, using the TOP
PERCENT clause:
The GROUP BY clause lets you create summary records that include
aggregate values from groups of other records. For example, you can create
a report with the number of titles published in each year by using the
following query:
The next query displays the number of titles published in the last 10 years:
You can prepare more sophisticated groupings with the HAVING clause.
This clause is similar to the WHERE clause, but it acts on the fields
produced by the GROUP BY clause and is often followed by an aggregate
expression. The next query is similar to the previous one but returns a record
only for those years when more than 50 titles have been published:
You can have both a WHERE clause and a HAVING clause in the same
query. If you use both, SQL applies the WHERE clause first to filter the
records from the original table. The GROUP BY clause then creates the
groups, and finally the HAVING clause filters out the grouped records that
don't meet the condition it specifies.
The following statement returns the names of all the cities where there's at
least one publisher:
This query presents some problems in that it also returns records for which
the City field is Null, and, in addition, it returns duplicate values when
there's more than one publisher in a city. You can solve the first problem by
testing the field with the ISNULL function, and you can filter out duplicates
using the DISTINCT keyword:
Subqueries
All the examples I've shown you so far retrieved their records from just one
table. Most of the time, however, the data you're interested in is distributed
in multiple tables. For example, to print a list of titles and their publishers
you must access both the Titles and the Publishers tables. You do this easily
by specifying both table names in the FROM clause and then setting a
suitable WHERE clause:
You use the tablename.fieldname syntax to avoid ambiguities when the two
tables have fields with the same name. Here's another example, which
retrieves all the titles published by a given publisher. You need to specify
both tables in the FROM clause, even if the returned fields come only from
the Titles table:
But you can use another—and often more efficient—way to retrieve the
same information, based on the fact that the SELECT statement can return a
value, which you can use to the left of the = operator. The next query uses a
nested SELECT query to get a list of all the titles from a given publisher:
If you aren't sure whether the subquery returns only one record, use the IN
operator instead of the equal sign. You can make these subqueries as
complex as you like. For example, the following query returns the titles
published by all the publishers from California, Texas, and New Jersey:
You can also use aggregate functions, such as SUM or AVG. The next query
returns all the items from the Orders table in NWind.mdb for which the
Freight value is higher than the average Freight value:
Joins
The join operation is used to retrieve data from two tables that are related to
each other through a common field. Conceptually, the result of the join is a
new table whose rows consist of some or all the fields from the first table
followed by some or all the fields from the second table; the expression in
the ON clause in a JOIN command determines which rows from the second
table will match a given row from the first table. For example, the following
query returns information about all titles, including the name of their
publisher. I already showed that you can complete this task using a SELECT
command with multiple tables, but an INNER JOIN command is often
better:
This is an important detail: the previous statement retrieves only those titles
for which there is a publisher, that is, those whose PubID field isn't Null.
While the INNER JOIN (also known as equi-join) is the most common form
of join operation, SQL also supports two other types of joins, the LEFT
JOIN and the RIGHT JOIN operations. The LEFT JOIN operation retrieves
all the records in the first table, regardless of whether there's a corresponding
record in the other table. For example, the following command retrieves all
the titles, even if their publisher isn't known:
The RIGHT JOIN operation retrieves all the records in the second table,
even if there isn't a related record in the first table. The following statement
selects all the publishers, whether or not they have published any titles:
Of course, you can filter records using a WHERE clause in both the nested
and the external join. For example, you can return only titles published
before 1960:
The two tables can have different structures, provided that the fields returned
by each SELECT command are of the same type.
If the table has a key field that's automatically generated by the database
engine—as is true for the Au_Id field in the Authors table—you don't have
to include it in the field list. Null values are inserted in all the columns that
you omit from the field list and that aren't automatically generated by the
database engine. If you want to insert data that's already stored in another
table, you can append a SELECT command to the INSERT INTO statement.
For example, the following command copies all the records from a table
called NewAuthors into the Authors table:
You often need a WHERE clause to limit the number of records that are
inserted. You can copy from tables with a different structure or with
different name fields, but in this case you need to use aliases to make field
names match. The statement which follows copies an entry from the Contact
table to the Customer table but accounts for different field names.
INSERT INTO Customers SELECT ContactName AS Name,
Address, City, State
FROM Contacts WHERE Successful = True
You can also use expressions in SET clauses. For example, the following
statement increments the discount for all the items in the Order Details table
that have been ordered by customer LILAS. (Run this query against the
NWind.mdb database.)
After that, you can delete the records in the Employees table: