Retrieving Data With SQL Queries
The Structured Query Language offers database users a powerful and flexible data retrieval mechanism -- the SELECT statement. In this article, we'll take a look at the general form of the SELECT statement and compose a few sample database queries together. If this is your first foray into the world of the Structured Query Language, you may wish to review the article SQL Fundamentals before continuing. If you're looking to design a new database from scratch, the article Creating Databases and Tables in SQL should prove a good jumping-off point.

Now that you've brushed up on the basics, let's begin our exploration of the SELECT statement. As with previous SQL lessons, we'll continue to use statements that are compliant with the ANSI SQL standard. You may wish to consult the documentation for your DBMS to determine whether it supports advanced options that may enhance the efficiency and/or efficacy of your SQL code.

The general form of the SELECT statement appears below:

SELECT select_list
FROM source
WHERE condition(s)
GROUP BY expression
HAVING condition
ORDER BY expression

The first line of the statement tells the SQL processor that this command is a SELECT statement and that we wish to retrieve information from a database. The select_list allows us to specify which columns we wish to retrieve. The FROM clause in the second line specifies the database table(s) involved, and the WHERE clause gives us the capability to limit the results to those records that meet the specified condition(s). The final three clauses represent advanced features outside the scope of this article -- we'll explore them in future SQL lessons.

The easiest way to learn SQL is by example. With that in mind, let's begin looking at some database queries. Throughout this article, we'll use the employees table from the fictional XYZ Corporation human resources database to illustrate all of our queries. Here's the entire table:
EmployeeID  LastName  FirstName  Salary  ReportsTo
----------  --------  ---------  ------  ---------
1           Smith     John       32000   2
2           Scampi    Sue        45000   NULL
3           Kendall   Tom        29500   2
4           Jones     Abraham    35000   2
5           Allen     Bill       17250   4
6           Reynolds  Allison    19500   4
7           Johnson   Katie      21000   3
Retrieving an Entire Table

XYZ Corporation's Director of Human Resources receives a monthly report providing salary and reporting information for each company employee. The generation of this report is an example of the SELECT statement's simplest form. It simply retrieves all of the information contained within a database table -- every column and every row. Here's the query that will accomplish this result:

SELECT *
FROM employees

Pretty straightforward, right? The asterisk (*) appearing in the select_list is a wildcard used to inform the database that we would like to retrieve information from all of the columns in the employees table identified in the FROM clause. We wanted to retrieve every row in the table, so it wasn't necessary to use a WHERE clause to restrict the rows selected. Here's what our query results look like:
EmployeeID  LastName  FirstName  Salary  ReportsTo
----------  --------  ---------  ------  ---------
1           Smith     John       32000   2
2           Scampi    Sue        45000   NULL
3           Kendall   Tom        29500   2
4           Jones     Abraham    35000   2
5           Allen     Bill       17250   4
6           Reynolds  Allison    19500   4
7           Johnson   Katie      21000   3
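For readers who would like to run the query themselves, here is a minimal sketch that loads the employees table into an in-memory SQLite database through Python's standard sqlite3 module. SQLite is only a convenient stand-in for whatever DBMS XYZ Corporation uses, and the column types and the make_db helper are assumptions made for illustration.

```python
import sqlite3

# Rows copied from the article's employees table; None stands for NULL.
EMPLOYEES = [
    (1, "Smith", "John", 32000, 2),
    (2, "Scampi", "Sue", 45000, None),
    (3, "Kendall", "Tom", 29500, 2),
    (4, "Jones", "Abraham", 35000, 2),
    (5, "Allen", "Bill", 17250, 4),
    (6, "Reynolds", "Allison", 19500, 4),
    (7, "Johnson", "Katie", 21000, 3),
]


def make_db():
    """Build an in-memory copy of the table (column types are assumed)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees ("
        "EmployeeID INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT, "
        "Salary INTEGER, ReportsTo INTEGER)"
    )
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?, ?, ?)", EMPLOYEES)
    return conn


conn = make_db()
# The asterisk asks for every column; no WHERE clause means every row.
cursor = conn.execute("SELECT * FROM employees")
column_names = [d[0] for d in cursor.description]
rows = cursor.fetchall()

print(column_names)
for row in rows:
    print(row)
```

Without an ORDER BY clause the order of the rows is not guaranteed by the SQL standard, so treat the printed order as incidental.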
In the next section of this lesson, we'll look at some more powerful queries that allow you to restrict the information retrieved from the database. Read on!
In the first part of this feature, we looked at the general form of the SELECT statement and a simple query that retrieved all of the information contained within a table. Let's go a step further and look at some queries that restrict the information retrieved.

Retrieving Selected Columns from a Table

Our last example produced a report for the Director of Human Resources that contained all of the salary and reporting information for every employee of XYZ Corporation. There are several mid-level managers within the department who also require access to reporting information as part of their duties. These managers do not need access to salary information, so we'd like to provide them with a report containing limited information from the database -- each employee's name, ID number and the ID number of their manager. Here's a SQL SELECT statement that accomplishes the desired result:

SELECT EmployeeID, LastName, FirstName, ReportsTo
FROM employees

This query looks somewhat different from the previous one. Notice that the asterisk wildcard has been replaced with a list of the column names we would like to include in our query results. The Salary column is omitted to satisfy privacy concerns by limiting the information provided to mid-level managers. Here's the output of this query:
EmployeeID  LastName  FirstName  ReportsTo
----------  --------  ---------  ---------
1           Smith     John       2
2           Scampi    Sue        NULL
3           Kendall   Tom        2
4           Jones     Abraham    2
5           Allen     Bill       4
6           Reynolds  Allison    4
7           Johnson   Katie      3
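The column-list form of the query can be tried out with a short sketch like the one below. It again assumes an in-memory SQLite database driven from Python's sqlite3 module rather than the article's own DBMS, with assumed column types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumptions; the article only shows names and values.
conn.execute(
    "CREATE TABLE employees ("
    "EmployeeID INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT, "
    "Salary INTEGER, ReportsTo INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Smith", "John", 32000, 2),
        (2, "Scampi", "Sue", 45000, None),
        (3, "Kendall", "Tom", 29500, 2),
        (4, "Jones", "Abraham", 35000, 2),
        (5, "Allen", "Bill", 17250, 4),
        (6, "Reynolds", "Allison", 19500, 4),
        (7, "Johnson", "Katie", 21000, 3),
    ],
)

# Naming the columns explicitly leaves Salary out of the result set.
cursor = conn.execute(
    "SELECT EmployeeID, LastName, FirstName, ReportsTo FROM employees"
)
column_names = [d[0] for d in cursor.description]
rows = cursor.fetchall()

print(column_names)
for row in rows:
    print(row)
```

Every row is still returned; only the set of columns has shrunk.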
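Employees who report to Sue Scampi (ReportsTo = 2):

EmployeeID  LastName  FirstName  Salary  ReportsTo
----------  --------  ---------  ------  ---------
1           Smith     John       32000   2
3           Kendall   Tom        29500   2
4           Jones     Abraham    35000   2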
After reviewing this report, Sue decides that she would like to further limit the results to those employees who earn a salary in excess of $30,000. We can use a compound condition in the WHERE clause to achieve these results. Here's the revised SQL query:

SELECT *
FROM employees
WHERE ReportsTo = 2 AND Salary > 30000

And the results of this query:
EmployeeID  LastName  FirstName  Salary  ReportsTo
----------  --------  ---------  ------  ---------
1           Smith     John       32000   2
4           Jones     Abraham    35000   2
Notice that Tom Kendall's record dropped out of the results because his salary did not meet the minimum requirement of $30,000. In the final section of this article we'll look at two techniques used to enhance query results.
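To see the effect of the compound WHERE condition, the sketch below runs the revised query next to a simpler one that filters on ReportsTo = 2 alone (included here only for comparison). It assumes an in-memory SQLite database accessed through Python's sqlite3 module, with assumed column types, and sorts the results in Python because ORDER BY is outside the scope of this lesson.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumptions; the article only shows names and values.
conn.execute(
    "CREATE TABLE employees ("
    "EmployeeID INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT, "
    "Salary INTEGER, ReportsTo INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Smith", "John", 32000, 2),
        (2, "Scampi", "Sue", 45000, None),
        (3, "Kendall", "Tom", 29500, 2),
        (4, "Jones", "Abraham", 35000, 2),
        (5, "Allen", "Bill", 17250, 4),
        (6, "Reynolds", "Allison", 19500, 4),
        (7, "Johnson", "Katie", 21000, 3),
    ],
)

# Single condition: everyone whose manager is employee 2.
reports_to_sue = conn.execute(
    "SELECT * FROM employees WHERE ReportsTo = 2"
).fetchall()

# Compound condition: both tests must be true for a row to be kept.
well_paid_reports = conn.execute(
    "SELECT * FROM employees WHERE ReportsTo = 2 AND Salary > 30000"
).fetchall()

ids_all = sorted(row[0] for row in reports_to_sue)
ids_filtered = sorted(row[0] for row in well_paid_reports)
print(ids_all)
print(ids_filtered)
```

Employee 3 (Tom Kendall, salary 29500) satisfies the first condition but not the second, so he appears only in the first list.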
In the first two sections of this lesson, we examined basic database queries and more powerful statements that restrict the query results. Now let's look at two techniques used to enhance the display of query output.

Renaming Columns in Query Results

All too often, database tables contain cryptic column headings that don't make sense to users outside of a company's IT department. Fortunately, SQL provides a mechanism that allows us to change the headings displayed in query output for enhanced readability. For example, let's look at a database query that displays the name of every employee of XYZ Corporation along with their annual salary. Our database simply labels the compensation column "Salary" but doesn't clarify the time period involved. Here's a query that straightens things out:
SELECT LastName, FirstName, Salary AS 'Annual Salary'
FROM employees

Notice that the third field in the SELECT clause is slightly different from previous examples. The AS keyword modifies the column heading used in the query output. The heading must be quoted because it contains a space character. Single quotes, as shown here, are accepted by some DBMSs; the SQL standard itself uses double quotes for this purpose ("Annual Salary"), so check which form your DBMS expects. Here's the output of this query:
LastName  FirstName  Annual Salary
--------  ---------  -------------
Smith     John       32000
Scampi    Sue        45000
Kendall   Tom        29500
Jones     Abraham    35000
Allen     Bill       17250
Reynolds  Allison    19500
Johnson   Katie      21000
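The renamed heading can be inspected programmatically. The sketch below assumes an in-memory SQLite database reached through Python's sqlite3 module (a stand-in for the article's DBMS, with assumed column types) and writes the alias with the SQL standard's double quotes; the headings are read back from the cursor's description attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumptions; the article only shows names and values.
conn.execute(
    "CREATE TABLE employees ("
    "EmployeeID INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT, "
    "Salary INTEGER, ReportsTo INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Smith", "John", 32000, 2),
        (2, "Scampi", "Sue", 45000, None),
        (3, "Kendall", "Tom", 29500, 2),
        (4, "Jones", "Abraham", 35000, 2),
        (5, "Allen", "Bill", 17250, 4),
        (6, "Reynolds", "Allison", 19500, 4),
        (7, "Johnson", "Katie", 21000, 3),
    ],
)

# AS changes only the heading of the output column, not the stored data.
cursor = conn.execute(
    'SELECT LastName, FirstName, Salary AS "Annual Salary" FROM employees'
)
headings = [d[0] for d in cursor.description]
rows = cursor.fetchall()

print(headings)
for row in rows:
    print(row)
```

The underlying table is untouched: a later query that selects Salary without the alias still gets the original column name.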
The duplicate ID numbers have been eliminated and we're left with the desired report. Notice that the NULL value appears in the query output. When the DISTINCT keyword is used, all NULL values are treated as duplicates of one another, so NULL appears once (and only once) in the query output.

That's it for this lesson. Now get out there and practice some SQL!
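As a sketch of the DISTINCT behavior described above, the example below lists each manager ID once. The exact query from the lesson is not reproduced in this text, so SELECT DISTINCT ReportsTo FROM employees is an assumption, as are the use of an in-memory SQLite database through Python's sqlite3 module and the column types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumptions; the article only shows names and values.
conn.execute(
    "CREATE TABLE employees ("
    "EmployeeID INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT, "
    "Salary INTEGER, ReportsTo INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Smith", "John", 32000, 2),
        (2, "Scampi", "Sue", 45000, None),
        (3, "Kendall", "Tom", 29500, 2),
        (4, "Jones", "Abraham", 35000, 2),
        (5, "Allen", "Bill", 17250, 4),
        (6, "Reynolds", "Allison", 19500, 4),
        (7, "Johnson", "Katie", 21000, 3),
    ],
)

# Assumed query: DISTINCT collapses repeated values in the result set.
rows = conn.execute("SELECT DISTINCT ReportsTo FROM employees").fetchall()
manager_ids = [row[0] for row in rows]  # NULL arrives in Python as None

print(manager_ids)
```

Seven employees yield four distinct values: the manager IDs 2, 3 and 4, plus a single NULL for the employee who reports to no one.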
In several recent articles we explored the fundamental concepts of SQL, the process of creating databases and database tables and the art of retrieving data from a database using simple queries. This article expands on these topics and looks at using join techniques to retrieve data from multiple tables. By way of example, let's return to our fictitious XYZ Corporation. XYZ utilizes an Oracle database to track the movements of their vehicle fleet and drivers between their facilities. Some employees are assigned to drive trucks while others are assigned to drive cars. Take a moment to examine the following two tables from their vehicle management database:
drivers

(id)   lastname  firstname  location   class
-----  --------  ---------  ---------  -----
13232  Baker     Roland     New York   Car
18431  Smythe    Michael    Miami      Truck
41948  Jacobs    Abraham    Seattle    Car
81231  Ryan      Jack       Annapolis  Car

vehicles

tag     location  class
------  --------  -----
D824HA  Miami     Truck
H122JM  New York  Car
J291QR  Seattle   Car
L990MT  Seattle   Truck
P091YF  Miami     Car
In the previous article, we looked at methods used to retrieve data from single tables. For example, we could use simple SELECT statements to answer questions such as:
Practical applications often require the combination of data from multiple tables. Our vehicle managers might make requests like the following:
List all of the vehicle/driver pairings possible without relocating a vehicle or driver
Granted, it would be possible to create complex SELECT statements using subqueries to fulfill these requests. However, there's a much simpler method -- the use of inner and outer joins. We'll explore each of these concepts in the next two sections of this article.
Inner joins (of which equijoins are the most common form) are used to combine information from two or more tables. The join condition determines which records are paired together and is specified in the WHERE clause. For example, let's create a list of driver/vehicle match-ups where both the vehicle and driver are located in the same city. The following SQL query will accomplish this task:

SELECT lastname, firstname, tag
FROM drivers, vehicles
WHERE drivers.location = vehicles.location
And let's take a look at the results:

lastname  firstname  tag
--------  ---------  ------
Baker     Roland     H122JM
Smythe    Michael    D824HA
Smythe    Michael    P091YF
Jacobs    Abraham    J291QR
Jacobs    Abraham    L990MT

Notice that the results are exactly what we sought. It is possible to further refine the query by specifying additional criteria in the WHERE clause. Our vehicle managers took a look at the results and noticed that the query matches drivers to vehicles that they are not authorized to drive (e.g. truck drivers to cars and vice versa). We can use the following query to resolve this problem:

SELECT lastname, firstname, tag, vehicles.class
FROM drivers, vehicles
WHERE drivers.location = vehicles.location
AND drivers.class = vehicles.class

Notice that in this example we needed to specify the source table for the class attribute in the SELECT clause. This is because class is ambiguous: it appears in both tables, and we need to specify which table's column should be included in the query results. In this case it does not make a difference, as the join condition requires the two columns to be equal. However, if the columns contained different data this distinction would be critical. Here are the results of this query:

lastname  firstname  tag     class
--------  ---------  ------  -----
Baker     Roland     H122JM  Car
Smythe    Michael    D824HA  Truck
Jacobs    Abraham    J291QR  Car

Notice that the rows pairing Michael Smythe to a car and Abraham Jacobs to a truck have been removed. You can also use inner joins to combine data from three or more tables.

Outer joins allow database users to include additional information in the query results. We'll explore them in the next section of this article.
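Both inner-join queries from this section can be run with a sketch like the following. It assumes an in-memory SQLite database driven from Python's sqlite3 module instead of XYZ's Oracle database; the column types are assumptions, and driver_id is a hypothetical name for the first drivers column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "driver_id" is a hypothetical name; column types are assumed as well.
conn.execute(
    "CREATE TABLE drivers (driver_id INTEGER, lastname TEXT, "
    "firstname TEXT, location TEXT, class TEXT)"
)
conn.execute("CREATE TABLE vehicles (tag TEXT, location TEXT, class TEXT)")
conn.executemany(
    "INSERT INTO drivers VALUES (?, ?, ?, ?, ?)",
    [
        (13232, "Baker", "Roland", "New York", "Car"),
        (18431, "Smythe", "Michael", "Miami", "Truck"),
        (41948, "Jacobs", "Abraham", "Seattle", "Car"),
        (81231, "Ryan", "Jack", "Annapolis", "Car"),
    ],
)
conn.executemany(
    "INSERT INTO vehicles VALUES (?, ?, ?)",
    [
        ("D824HA", "Miami", "Truck"),
        ("H122JM", "New York", "Car"),
        ("J291QR", "Seattle", "Car"),
        ("L990MT", "Seattle", "Truck"),
        ("P091YF", "Miami", "Car"),
    ],
)

# Join on city only: every driver paired with every vehicle in that city.
same_city = conn.execute(
    "SELECT lastname, firstname, tag "
    "FROM drivers, vehicles "
    "WHERE drivers.location = vehicles.location"
).fetchall()

# Join on city and class: only pairings the driver is authorized for.
same_city_and_class = conn.execute(
    "SELECT lastname, firstname, tag, vehicles.class "
    "FROM drivers, vehicles "
    "WHERE drivers.location = vehicles.location "
    "AND drivers.class = vehicles.class"
).fetchall()

for row in sorted(same_city):
    print(row)
for row in sorted(same_city_and_class):
    print(row)
```

Jack Ryan appears in neither result because no vehicle shares his location; that gap is what the outer join in the next section addresses.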
Take a moment and review the database tables shown earlier in this article. Notice that we have a driver -- Jack Ryan -- who is located in a city where there are no vehicles. Our vehicle managers would like this information to be included in their query results to ensure that drivers do not sit idly by waiting for a vehicle to arrive. We can use outer joins to include records from one table that have no corresponding record in the joined table. Let's create a list of driver/vehicle pairings that includes records for drivers with no vehicles in their city. We can use the following query:

SELECT lastname, firstname, drivers.location, tag
FROM drivers, vehicles
WHERE drivers.location = vehicles.location (+)

Notice that the outer join operator "(+)" is included in this query. This operator is placed in the join condition next to the table whose columns are allowed to come back as NULL when no matching row exists (here, vehicles). This query would produce the following results:

lastname  firstname  location   tag
--------  ---------  ---------  ------
Baker     Roland     New York   H122JM
Smythe    Michael    Miami      D824HA
Smythe    Michael    Miami      P091YF
Jacobs    Abraham    Seattle    J291QR
Jacobs    Abraham    Seattle    L990MT
Ryan      Jack       Annapolis

This time our results include the stranded Jack Ryan, and our vehicle management department can now dispatch a vehicle to pick him up.

Note that there are other possible ways to accomplish the results seen in this article, and syntax may vary slightly from DBMS to DBMS. These examples were designed to work with Oracle databases, so your mileage may vary. Furthermore, as you advance in your knowledge of SQL you'll discover that there is often more than one way to accomplish a desired result, and oftentimes one way is just as good as another. Case in point, it is also possible to specify a join condition in the FROM clause rather than the WHERE clause.
For example, we used the following SELECT statement earlier in this article:

SELECT lastname, firstname, tag
FROM drivers, vehicles
WHERE drivers.location = vehicles.location
AND drivers.class = vehicles.class
The same query could be rewritten as:

SELECT lastname, firstname, tag
FROM drivers INNER JOIN vehicles ON drivers.location = vehicles.location
WHERE drivers.class = vehicles.class
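The sketch below checks that the two forms return the same rows, and also shows how the outer join from the previous section might be written with the LEFT OUTER JOIN keywords for a DBMS that does not accept Oracle's (+) operator. It assumes an in-memory SQLite database accessed through Python's sqlite3 module; the column types are assumptions, and driver_id is a hypothetical name for the first drivers column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "driver_id" is a hypothetical name; column types are assumed as well.
conn.execute(
    "CREATE TABLE drivers (driver_id INTEGER, lastname TEXT, "
    "firstname TEXT, location TEXT, class TEXT)"
)
conn.execute("CREATE TABLE vehicles (tag TEXT, location TEXT, class TEXT)")
conn.executemany(
    "INSERT INTO drivers VALUES (?, ?, ?, ?, ?)",
    [
        (13232, "Baker", "Roland", "New York", "Car"),
        (18431, "Smythe", "Michael", "Miami", "Truck"),
        (41948, "Jacobs", "Abraham", "Seattle", "Car"),
        (81231, "Ryan", "Jack", "Annapolis", "Car"),
    ],
)
conn.executemany(
    "INSERT INTO vehicles VALUES (?, ?, ?)",
    [
        ("D824HA", "Miami", "Truck"),
        ("H122JM", "New York", "Car"),
        ("J291QR", "Seattle", "Car"),
        ("L990MT", "Seattle", "Truck"),
        ("P091YF", "Miami", "Car"),
    ],
)

# Join condition in the WHERE clause.
where_rows = conn.execute(
    "SELECT lastname, firstname, tag FROM drivers, vehicles "
    "WHERE drivers.location = vehicles.location "
    "AND drivers.class = vehicles.class"
).fetchall()

# Join condition in the FROM clause.
on_rows = conn.execute(
    "SELECT lastname, firstname, tag "
    "FROM drivers INNER JOIN vehicles "
    "ON drivers.location = vehicles.location "
    "WHERE drivers.class = vehicles.class"
).fetchall()

# Outer join: keep every driver, with NULL (None) where no vehicle matches.
outer_rows = conn.execute(
    "SELECT lastname, firstname, drivers.location, tag "
    "FROM drivers LEFT OUTER JOIN vehicles "
    "ON drivers.location = vehicles.location"
).fetchall()

print(sorted(where_rows) == sorted(on_rows))
for row in outer_rows:
    print(row)
```

In the LEFT OUTER JOIN form, the table named on the left (drivers) is the one whose rows are always kept, which plays the same role as leaving drivers without the (+) marker in the Oracle syntax.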