SQL Lesson 5 - Set Operators

The document discusses set operators in SQL such as INTERSECT, EXCEPT, UNION, and OUTER UNION. It explains that set operators vertically combine the results of two queries. INTERSECT returns rows common to both queries, EXCEPT returns rows unique to the first query, UNION returns unique rows from both queries, and OUTER UNION returns all rows. The document uses examples analyzing customer sales data from multiple tables to demonstrate how to use set operators to answer business questions.

Uploaded by

augustocgn
Copyright
© All Rights Reserved

SQL Essentials - Lesson 5 – Set Operators

Lesson 5: Set Operators

5.1. Introduction to Set Operators


5.1.1. Combining Data Using Set Operators

Suppose your manager has requested four reports on all target sales contacts made by phone or email.
The company wants to contact each target customer and determine which customers need to be
contacted again to obtain a response. Our goal is to analyze our contact with our customers. The reports
should answer four questions: Which customers responded to both email and phone requests? Which
customers responded to phone, email, or both? Which customers responded to only the email request?
And is there a complete list of all customer responses?

If we had a single table that contained all the information about our customers and contact methods, our
programming task would be straightforward. However, the data required to answer these questions is
stored in three tables: saleslist, salesemail, and salesphone. The saleslist table contains contact information
for customers we want to target in our campaign. It has the UserID (or email), the CustomerID, and phone
numbers where the customer can be reached.

Review the saleslist data.

The salesemail table contains responses from customers who have either accepted or declined our offer via
email, and the salesphone table contains responses from phone sales. If the customer isn't listed in these
tables, that means the customer hasn't responded to us at this time. Customers can be listed in the
salesphone table twice if a call back was requested by the customer.

Review the salesemail data.

Review the salesphone data.

Let's now compare how the data is organized. The salesemail and salesphone tables both contain a
CustomerID column and a response column, but the response column is named differently in each
(EmailResp and PhoneResp). These two tables also store responses differently. Salesemail has one
response per email, and the response can be either Accepted or Declined. In salesphone, the SalesRep
column indicates the sales rep who made the call, and the PhoneResp column indicates the response.
A response can be Declined, Call Back, or Accepted.

All three tables contain a CustomerID column, but the columns are not in the same position.

Now that you've seen the data, what do you think? Can you answer any of the manager's four questions by
querying only one table? All of these questions require you to query multiple tables. You can use set
operators to vertically combine queries and create reports that answer the four questions.

5.1.2. What Are Set Operators?

How does a set operator work? A set operator vertically combines the intermediate result sets from
two queries to produce a final result set. The intermediate result sets have rows and columns, and
a set operator acts on the intermediate result sets, not directly on the input tables.

To vertically combine the results of two queries, you can use one of four set operators--
INTERSECT, EXCEPT, UNION, and OUTER UNION. The INTERSECT, EXCEPT, and UNION
operators are specified in the ANSI standard for SQL. The OUTER UNION operator is a SAS
enhancement.

To explain the results of each operator, we'll represent the two query results as circles in a
simplified Venn diagram. The INTERSECT operator returns rows from the first query that also occur
in the second. In other words, it returns the unique rows that are common to both queries. This is
the overlapping area of the Venn diagram.

The EXCEPT operator returns rows that result from the first query, but not from the second query.
In other words, it returns the unique rows that occur in the first query only. This is the part of
the first (top) circle that lies outside the overlap.

The UNION operator combines two query results. It produces all the unique rows that result from
both queries. That is, it returns a row if it occurs in the first query result, the second, or both.
This is the full area of both circles in the Venn diagram. UNION does not return duplicate rows. If a
row occurs more than once, then only one occurrence is returned.

The OUTER UNION operator combines the results of both queries. It includes all rows and
columns and nothing overlaps. Therefore, the diagram shows two separated circles.

The default handling of columns differs slightly between the set operators. The INTERSECT,
EXCEPT, and UNION set operators align columns by position in both result sets. That is, they
combine columns from the two queries based on their position in each query's result set,
without regard to the individual column names. Columns in the same relative position in the two
queries must have the same data type. The column names from the first query become
the column names of the output. The OUTER UNION set operator includes all columns from
both result sets.
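
As a rough illustration of the four behaviors described above, the following sketch builds two small tables and runs each set operator against them. The table names work.q1 and work.q2 and their values are made up for this example and are not part of the course data.

```sas
/* Hypothetical demo tables (not course data): one numeric column each. */
data work.q1;
   input ID;
   datalines;
1
2
2
3
;
run;

data work.q2;
   input ID;
   datalines;
2
3
4
;
run;

proc sql;
   title 'INTERSECT: unique rows found in both queries';
   select ID from work.q1
   intersect
   select ID from work.q2;

   title 'EXCEPT: unique rows found only in the first query';
   select ID from work.q1
   except
   select ID from work.q2;

   title 'UNION: unique rows from either query';
   select ID from work.q1
   union
   select ID from work.q2;

   title 'OUTER UNION: all rows, columns not overlaid';
   select ID from work.q1
   outer union
   select ID from work.q2;
quit;
title;
```

With this sample data, INTERSECT should return 2 and 3, EXCEPT should return 1, UNION should return 1 through 4, and OUTER UNION should return all seven rows spread across two separate ID columns.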

5.1.3. Using Set Operators



A set operation consists of two queries that are combined by one of the four set operators.
The entire set operation is a single SELECT statement, so you place a semicolon only after the last
query.

SELECT query
UNION | EXCEPT | INTERSECT | OUTER UNION <ALL> <CORR>
SELECT query...;

When you work with set operators, you're not limited to their default behaviors. Remember that the
INTERSECT, EXCEPT, and UNION set operators produce only the unique rows by default. PROC SQL
must make a second pass through the data to eliminate the duplicate rows.

To change the default behavior for rows, you can add the ALL keyword to the code, and SAS won't remove
the duplicate rows. You should consider using the ALL keyword when either of the following conditions
applies: duplicates in the final results will not cause problems, or duplicates are not possible
(for example, because there's a unique or primary key constraint on the column).

Again, using the ALL keyword improves efficiency of the set operators, because SAS doesn't make a
second pass to remove duplicates.
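
To make the effect of the ALL keyword concrete, here is a minimal sketch that counts the rows returned by UNION and by UNION ALL. The tables work.q1 and work.q2 are hypothetical and exist only for this example.

```sas
/* Hypothetical demo tables (not course data). The value 2 is duplicated
   in work.q1, and 2 and 3 appear in both tables. */
data work.q1;
   input ID;
   datalines;
1
2
2
3
;
run;

data work.q2;
   input ID;
   datalines;
2
3
4
;
run;

proc sql;
   /* Default behavior: duplicate rows are removed. */
   title 'UNION: duplicates removed';
   select count(*) as RowsReturned
      from (select ID from work.q1
            union
            select ID from work.q2);

   /* ALL keeps every row from both queries. */
   title 'UNION ALL: duplicates kept';
   select count(*) as RowsReturned
      from (select ID from work.q1
            union all
            select ID from work.q2);
quit;
title;
```

With this sample data, the first query should report 4 rows and the second should report 7.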

You can use the CORR keyword to modify the default behavior for columns. Remember that the
INTERSECT, EXCEPT, and UNION set operators align columns by their position in the intermediate result
sets. With these operators, the CORR keyword aligns columns that have the same name in both
intermediate result sets and drops columns that are not found in both. CORR also aligns the columns
by name in the OUTER UNION set operator, but there the columns that are not matched are kept.
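
The following sketch shows CORR with the UNION operator on two tables whose columns are in different positions. The tables work.left_demo and work.right_demo, and the columns ID, Name, and Region, are invented for this illustration.

```sas
/* Hypothetical demo tables (not course data). The shared columns ID and
   Name are in different positions, and Region exists in only one table. */
data work.left_demo;
   input ID Name $;
   datalines;
1 Ann
2 Bob
;
run;

data work.right_demo;
   input Name $ ID Region $;
   datalines;
Bob 2 East
Cat 3 West
;
run;

proc sql;
   /* CORR matches columns by name, so ID and Name line up even though
      their positions differ. Region is dropped because it is not in both
      result sets. Without CORR, SELECT * would pair the numeric ID with
      the character Name by position and the query would fail. */
   title 'UNION CORR: columns aligned by name';
   select * from work.left_demo
   union corr
   select * from work.right_demo;
quit;
title;
```

With this sample data, the result should contain the columns ID and Name and three rows: 1 Ann, 2 Bob, and 3 Cat.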

5.2. Using the INTERSECT, EXCEPT and UNION Set Operators
5.2.1. Using the INTERSECT Operator
We want to start by finding customers who responded to both our email and phone call attempts, regardless
of whether they accepted or declined our offer. We're looking for our highly responsive customers. The
customers who responded by email are in the salesemail table, and those who responded by phone are in
the salesphone table, so we want to find the matching CustomerIDs. To find the CustomerIDs that are
common to the two queries, we're going to use the INTERSECT set operator.

SELECT query...
INTERSECT <ALL> <CORR>
SELECT query...;

In this example, we're referencing the CustomerID column only. The first query returns all CustomerIDs
from the salesemail table. We then want to intersect those results with the result set 2, the list of
CustomerIDs from the salesphone table.

proc sql;
select CustomerID
from sq.salesemail
intersect
select CustomerID
from sq.salesphone;
quit;

The INTERSECT operator has two steps. The first step is to remove any duplicates in each of the
intermediate result sets. In this case, there are no duplicate rows in Result Set 1, but there is a duplicate
row in Result Set 2, so it's removed. Next, the INTERSECT operator selects rows from the first intermediate
result set that are also in the second intermediate result set, the equivalent rows.

The results of the INTERSECT set operator list the CustomerIDs of customers who are extremely
responsive to our sales attempts.

Review the output.

5.2.2. Activity
Open s105a01.sas from the activities folder and perform the following tasks to find unique customers who
have responded by phone and email:

1. Run the first queries to preview the sq.salesemail and sq.salesphone tables. Examine the
columns in both tables.

2. In the Intersect section, examine and run the query. Did the query run successfully? Why
not?

3. Add the CORR keyword after the INTERSECT set operator. Run the query. Did the query
run successfully? Why?

5.2.3. Using a Set Operator Versus a Join
Here's the code from the previous Activity:

proc sql;
select *
from sq.salesemail
intersect corr
select *
from sq.salesphone;
quit;

You can use an INNER JOIN to produce identical results. This example uses an inner join to find all
matches of the two tables. The DISTINCT keyword removes any duplicate rows, resulting in the same
result as the INTERSECT set operator.

proc sql;
select distinct e.CustomerID
from sq.salesemail as e inner join
sq.salesphone as p
on e.CustomerID = p.CustomerID;
quit;

5.2.4. Using the EXCEPT Operator
In this scenario, we sent only email offers to our target customers from the saleslist table. We haven't had a
chance to call them yet, but we only want to spend time calling customers who haven't responded to our
initial email. One method to create a list of these customers is to use the EXCEPT set operator.

We'll select all target CustomerIDs from the saleslist table as our first result set, and then select
CustomerIDs from the salesemail table for our second result set. We use the EXCEPT set operator to
retrieve all customers in the first results set but not the second.

SELECT query...
EXCEPT <ALL> <CORR>
SELECT query...;

title "Customers Who Haven't Responded to the Sales Email";

proc sql;
select CustomerID
from sq.saleslist
except
select CustomerID
from sq.salesemail;
quit;

The EXCEPT set operator follows the same steps as the INTERSECT operator. It first searches for
duplicate rows in each of the intermediate result sets and removes them. In this case, there are no
duplicate rows in either result set, because we have currently sent only one email to each customer in our
sales list. Next, rows from the first intermediate result set that are not in the second intermediate result set
are selected.

Our results are a list of customers who have not responded to our email. With this information, we can
follow up with only the customers who still need to be contacted.

Review the output.

5.2.5. Activity
Open s105a02.sas from the activities folder and perform the following tasks to find all target customers
who have not responded to our sales phone call:

1. Run the first queries in the Preview Tables section to preview the sq.salesemail and
sq.salesphone tables. Examine the columns in both tables.

2. In the EXCEPT section, complete the query to find all customers from the sq.saleslist table
who have not responded to our sales call in sq.salesphone.

3. How many customers have not responded to our phone call?

5.2.6. Using a Set Operator Versus a Subquery
Here's the code from the previous Activity:

proc sql;
select CustomerID
from sq.saleslist
except
select CustomerID
from sq.salesphone;
quit;

You can use subqueries to produce identical results. This example uses a subquery to return all
CustomerIDs in the salesphone table. The WHERE clause subsets for all CustomerIDs in the saleslist table
that are not in the list returned by the subquery. The DISTINCT keyword removes any duplicate rows,
resulting in the same result as the EXCEPT set operator.

proc sql;
select distinct CustomerID
from sq.saleslist
where CustomerID not in(select CustomerID
from sq.salesphone);
quit;
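
To see why the DISTINCT keyword matters in the subquery version, the sketch below runs both approaches on a small table that contains a duplicate. The tables work.list_demo and work.phone_demo and their values are hypothetical stand-ins for the course tables.

```sas
/* Hypothetical demo tables (not course data). CustomerID 1 is listed twice. */
data work.list_demo;
   input CustomerID;
   datalines;
1
1
2
3
4
;
run;

data work.phone_demo;
   input CustomerID;
   datalines;
2
3
;
run;

proc sql;
   title 'EXCEPT: duplicates removed automatically';
   select CustomerID from work.list_demo
   except
   select CustomerID from work.phone_demo;

   title 'NOT IN subquery with DISTINCT: same rows as EXCEPT';
   select distinct CustomerID
      from work.list_demo
      where CustomerID not in (select CustomerID from work.phone_demo);

   title 'NOT IN subquery without DISTINCT: duplicate kept';
   select CustomerID
      from work.list_demo
      where CustomerID not in (select CustomerID from work.phone_demo);
quit;
title;
```

With this sample data, the first two queries should each return 1 and 4, while the third should return 1 twice and then 4.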

5.2.7. Using the UNION Operator
We want to find customers who responded to either our email or phone call attempts, no matter if they
accepted or declined our offer. Then we want to use this information to find the total number of unique
customers who responded to either phone or email.

We want to select the CustomerIDs from the salesemail table and use the results of the query to combine
with a query of CustomerIDs from the salesphone table. The UNION set operator first combines the result
sets, and then removes duplicate values from the result set to return all unique rows.

SELECT query...
UNION <ALL> <CORR>
SELECT query...;

proc sql;
select CustomerID
from sq.salesemail
union
select CustomerID
from sq.salesphone;
quit;

5.2.8. Demo: Using the UNION Operator to Find All Unique Rows
We're going to use a UNION set operator to count the number of unique customers who responded to an
email or a phone sales attempt.

Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your
practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In
Enterprise Guide, run the Autoexec process flow.

1. Open the s105d01.sas program in the demos folder and find the Demo section. Under Explore the
salesemail and salesphone tables, run the two queries.

proc sql;
select *
from sq.salesemail;
select *
from sq.salesphone;
quit;

2. Review the output.


The first table has CustomerID and EmailResp. We can see Accepted or Declined, and then our
second table is the phone table. We have CustomerID, the sales rep who called, and then the
response. We want to combine these tables to see the distinct customers that responded.

3. In the next section, complete the query to find all unique customers who responded to either an
email or phone call. Begin with the SELECT statement and select all columns from the
sq.salesemail table. Use the UNION set operator followed by another SELECT statement to select
all columns from the sq.salesphone table. Run the query and examine the syntax error.

proc sql;
select *
from sq.salesemail
union
select *
from sq.salesphone;
quit;

4. Review the log.


The log shows that column 2 from the first contributor is not the same type as its counterpart from
the second. Recall that EmailResp (column 2 of the first table) is character, and SalesRep (column 2
of the second table) is numeric.

5. Add the keyword CORR after the UNION set operator. Note: The CORR keyword aligns the
columns that have the same name in both tables, which is CustomerID, and removes any columns
not found in both tables. Run the query and examine the log and results.

proc sql;
select *
from sq.salesemail
union corr
select *
from sq.salesphone;
quit;

6. Review the output.


We can see our unique list of customers.

7. Remove the CORR keyword and specify the CustomerID column in both SELECT clauses. Run the
query and examine the log and results.

proc sql;
select CustomerID
from sq.salesemail
union
select CustomerID
from sq.salesphone;
quit;

8. Review the output.


The results are the same as before. We see our unique list of customers. Our final goal is to count
the number of unique customers.

9. Add another SELECT statement at the first line in the SQL procedure. Use the COUNT(*) function to
count all rows. Name the column TotalNum. Add a FROM clause and use the previous query as a
subquery in the FROM clause (in-line view). Be sure to add parentheses around the subquery. Run
the query and examine the results.

proc sql;
select count(*) as TotalNum
from (select CustomerID
from sq.salesemail
union
select CustomerID
from sq.salesphone);
quit;

10. Review the output.


Eight distinct customers have responded.

5.2.9. Default Behavior of the UNION Operator
The UNION set operator works in a different order than the INTERSECT and EXCEPT operators. The
UNION set operator first combines the result sets, then removes duplicate rows. INTERSECT and EXCEPT
remove duplicate rows first, then combine the result sets. If the two intermediate result sets have a
different number of columns, then SAS extends one result set with null columns so that the two
intermediate result sets have the same number of columns. If result set 1 is extended with null columns,
then the name of the column in result set 2 is used in the final results. In these cases, SAS writes a
message in the log.

proc sql;
select CustomerID, EmailResp
from sq.salesemail
union
select CustomerID, PhoneResp, SalesRep
from sq.salesphone;
quit;

Review the output.

5.2.10. Combining Set Operators

What if you want a list of customers who haven't responded to either email or phone sales attempts? You
can combine the salesemail and salesphone tables using the UNION set operator to find a unique list of
customers. Then you specify that you'd like everyone in the saleslist table EXCEPT the unique list of
customers who have responded.

In the code, the second query (in parentheses) returns a unique list of customers who have responded to
your sales call or email. Then we select all customers in the saleslist table except the results of the
UNION. The results leave us with two customers who have not responded to our sales attempts.

proc sql;
select CustomerID
from sq.saleslist
except
(select CustomerID
from sq.salesemail
union
select CustomerID
from sq.salesphone);
quit;
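
The parentheses in the query above matter. As far as PROC SQL's documented precedence goes, INTERSECT is evaluated first, while UNION, EXCEPT, and OUTER UNION share the same precedence and are evaluated in the order they appear unless parentheses override that. The sketch below, which uses hypothetical tables work.list_demo, work.email_demo, and work.phone_demo, compares the two forms.

```sas
/* Hypothetical demo tables (not course data). */
data work.list_demo;
   input CustomerID;
   datalines;
1
2
3
4
;
run;

data work.email_demo;
   input CustomerID;
   datalines;
1
2
;
run;

data work.phone_demo;
   input CustomerID;
   datalines;
2
3
;
run;

proc sql;
   /* The UNION runs first, then its result is removed from the list. */
   title 'With parentheses: list EXCEPT (email UNION phone)';
   select CustomerID from work.list_demo
   except
   (select CustomerID from work.email_demo
    union
    select CustomerID from work.phone_demo);

   /* Evaluated top to bottom: (list EXCEPT email) UNION phone. */
   title 'Without parentheses: operators applied in order';
   select CustomerID from work.list_demo
   except
   select CustomerID from work.email_demo
   union
   select CustomerID from work.phone_demo;
quit;
title;
```

With this sample data, the first query should return only 4 (the one customer who never responded), while the second should return 2, 3, and 4, which is not the list we want.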

5.2.11. Practice

Practice Level 1: Using the EXCEPT Set Operator


Use the EXCEPT set operator to generate a report listing the merchants in the sq.merchant table who are
not listed in the sq.transaction table.

Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your
practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In
Enterprise Guide, run the Autoexec process flow.

1. Write a query using the following requirements:

   • Select MerchantID from the sq.merchant table.
   • Use the EXCEPT set operator.
   • Select MerchantID from the sq.transaction table.
   • Order the results by MerchantID.
   • Add an appropriate title.
   • Run the query and view the results.

2. How many merchants do not have a transaction?

1.
/*s105s01.sas*/

title 'Merchants without Transactions';


proc sql;
select MerchantID
from sq.merchant
except
select MerchantID
from sq.transaction
order by MerchantID;
quit;
title;

2. six

Practice Level 1: Using the UNION Set Operator
Using the sq.employee table as input, create a report that displays the total salary and total number of
employees. In addition, add rows to display the salary information for US and AU employees separately.
Use the UNION set operator to combine the three rows of output for the final report.

Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your
practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In
Enterprise Guide, run the Autoexec process flow.

1. Write a query to create a report for all employees. Use the following requirements:

   • Create the first row of the report.
   • Include the constant text Total Paid to All Employees, the sum of Salary formatted with the
     DOLLAR. format, and the total number of employees using the COUNT function. Name the
     count column Total.
   • Run the query and view the results.

2. Write a query to create a report for all US employees. Use the following requirements:

   • Create the second row of the report.
   • Include the constant text Total Paid to US Employees, the sum of Salary formatted with the
     DOLLAR. format, and the total number of employees using the COUNT function. Name the
     count column Total.
   • Filter the data for US employees by using the UPCASE function to select rows with the
     Country value US.
   • Run the query and view the results.

3. Write another query to create a report for all AU employees. Use the following requirements:

   • Follow the previous steps to create the last row of the report for AU employees.
   • Run the query and view the results.

4. Combine the results of all three queries into a single query using UNION set operators.

   • Order the results by descending total salary.
   • Add an appropriate title.
   • Run the query and view the results.

5. Which country has more employees?


1.
/*s105s02.sas*/

/*a*/
proc sql;
select 'Total Paid to All Employees',
sum(Salary) format=dollar16.,
count(*) as Total
from sq.employee;
quit;

2.
/*b*/
proc sql;
select 'Total Paid to US Employees',
sum(Salary) format=dollar16.,
count(*) as Total
from sq.employee
where upcase(Country)='US';
quit;

3.
/*c*/
proc sql;
select 'Total Paid to AU Employees',
sum(Salary) format=dollar16.,
count(*) as Total
from sq.employee
where upcase(Country)='AU';
quit;

4.
/*d*/
title 'Country Specific Salary Information';
proc sql;
select 'Total Paid to All Employees',
sum(Salary) format=dollar16.,
count(*) as Total
from sq.employee
union
select 'Total Paid to US Employees',
sum(Salary) format=dollar16.,
count(*) as Total
from sq.employee
where upcase(Country)='US'
union
select 'Total Paid to AU Employees',
sum(Salary) format=dollar16.,
count(*) as Total
from sq.employee
where upcase(Country)='AU'
order by 2 desc;
quit;
title;

5. The US has more employees.

Practice Level 2: Using the EXCEPT Set Operator with the DISTINCT Keyword
Using the sq.statepopulation table, generate a list of state codes for states without any customers.

Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your
practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In
Enterprise Guide, run the Autoexec process flow.

1. Write a query to list the unique Name values in the sq.statepopulation table. This list represents all
available states.

2. Write a separate query to list the unique State values from the sq.customer table. This list
represents all states where customers reside.

3. Combine the queries using the EXCEPT set operator to display states with no customers. Add an
appropriate title and run the query.

4. For which value (or values) of State are there no customers?

1.
proc sql;
select distinct Name
from sq.statepopulation;
quit;

2.
proc sql;
select distinct State
from sq.customer;
quit;

3.
title 'States with No Customers';
proc sql;
select Name
from sq.statepopulation
except
select distinct State
from sq.customer;
quit;
title;

4. PR


Challenge Practice: Using Set Operators to Summarize Data


Determine what percentage of customers have accepted either the phone or email offer. The sq.saleslist
table contains the full list of customers presented with an offer. The sq.salesemail and sq.salesphone
tables contain email and phone responses.

Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your
practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In
Enterprise Guide, run the Autoexec process flow.

1. Write a query that uses a UNION set operation to combine the CustomerID values for customers who
accepted either the phone or email offer. This will form the basis of your in-line view.

2. Write a query to count the number of customers who have accepted either offer (step 1). Use the
following requirements:

   • Use the following formula to calculate the rate of offer acceptance:

        select count(*)/(select count(*) from sq.saleslist)
        from (your in-line view code from step 1)

   • Name the calculated column PctResp. Format it as a percent with no decimals.
   • Label the new column Offer Acceptance Rate.
   • Add an appropriate title to the report.
   • Run the query and view the results.

3. What is the value of Offer Acceptance Rate?

1.
proc sql;
select *
from(select CustomerID
from sq.salesemail
where EmailResp = 'Accepted'
union
select CustomerID
from sq.salesphone
where PhoneResp = 'Accepted');
quit;

2.
/*s105s04.sas*/

title 'Acceptance Rate';


proc sql;
select count(*)/(select count(*)
from sq.saleslist) as PctResp format=percent5.
label='Offer Acceptance Rate'
from(select CustomerID
from sq.salesemail
where EmailResp = 'Accepted'
union
select CustomerID
from sq.salesphone
where PhoneResp = 'Accepted');
quit;
title;
3. 50%
5.3. Using an OUTER UNION

5.3.1. Using the OUTER UNION Operator


Suppose you need to create a report that lists all customer phone and email responses. To keep all rows
and all columns from both the salesemail and salesphone tables, you can use the OUTER UNION set
operator. Recall that the OUTER UNION set operator combines two query results, and then produces all
rows and columns from both queries. So, if we select everything from the salesemail table, use the OUTER
UNION set operator, and select everything from the salesphone table, will it provide us with our desired
results?

SELECT query...
OUTER UNION <CORR>
SELECT query...;

proc sql;
select *
from sq.salesemail
outer union
select *
from sq.salesphone;
quit;

When using the OUTER UNION operator, by default, all rows and columns from both intermediate result
sets are combined and duplicate rows are NOT removed from the final result set. Because same-named
columns are not overlaid, these are not the desired results. How can we overlay columns?

Review the output.
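
If the course tables are not at hand, the default behavior described above can be reproduced with a minimal sketch like the one below. The tables work.email_demo and work.phone_demo and their rows are hypothetical; only the column names follow the lesson.

```sas
/* Hypothetical stand-ins for sq.salesemail and sq.salesphone.
   The rows are invented for illustration. */
data work.email_demo;
   input CustomerID EmailResp :$8.;
   datalines;
101 Accepted
102 Declined
;
run;

data work.phone_demo;
   input CustomerID SalesRep PhoneResp :$8.;
   datalines;
102 7 Accepted
103 9 Declined
;
run;

proc sql;
   /* Default OUTER UNION: every column from both queries is kept as its
      own column, so CustomerID appears twice and nothing is overlaid. */
   title 'OUTER UNION: all rows and all columns';
   select * from work.email_demo
   outer union
   select * from work.phone_demo;
quit;
title;
```

With this sample data, the report should have four rows and five columns. The email rows have missing values in the three phone columns, and the phone rows have missing values in the two email columns.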


5.3.2. Demo: Using the OUTER UNION Operator to Combine Tables


We're going to use the OUTER UNION set operator with the CORR keyword to combine the salesemail and
salesphone tables.

Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your
practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In
Enterprise Guide, run the Autoexec process flow.

1. Open the s105d02.sas program in the demos folder and find the Demo section. Run the query to
perform an OUTER UNION concatenation of the sq.salesemail and sq.salesphone tables.
Examine the results.

proc sql;
select *
from sq.salesemail
outer union
select *
from sq.salesphone;
quit;

2. Review the output.


We can see this is not what we want. By default, OUTER UNION combines all rows and all
columns. So what we need to do is overlay the columns. We want to overlay CustomerID, and then
we want to overlay the response and the SalesRep.

3. Add the CORR keyword after the OUTER UNION set operator. Run the query and examine the
results.

proc sql;
select *
from sq.salesemail
outer union corr
select *
from sq.salesphone;
quit;

4. Review the output.


We can see CustomerID was overlaid, because both columns were named the same. But we want
to overlay EmailResp and PhoneResp. Those columns are not the same name. One method to fix
this is to use the RENAME= data set option. This is a SAS enhancement.

5. Add the RENAME= option after the salesemail table and rename the column EmailResp to Resp.
After the salesphone table, rename the column PhoneResp to Resp. Run the query and examine
the results.

proc sql;
select *
from sq.salesemail(rename=(EmailResp=Resp))
outer union corr
select *
from sq.salesphone(rename=(PhoneResp=Resp));
quit;

6. Review the output.


We can see that the Resp column has been overlaid. We can also see that the SalesRep column is missing
for all rows from the salesemail table. This is what we expected. Let's look at another solution.

7. Remove the RENAME= option after each table. Modify the first SELECT clause and select the
column CustomerID. In the first clause, also select EmailResp and change the column name to
Resp using the AS keyword. Modify the second SELECT clause and select the CustomerID,
SalesRep, and PhoneResp columns. Change the PhoneResp column name to Resp using the AS
keyword. Run the query and examine the results.

proc sql;
select CustomerID, EmailResp as Resp
from sq.salesemail
outer union corr
select CustomerID, SalesRep, PhoneResp as Resp
from sq.salesphone;
quit;

And again, the code produces the exact same results. It's just another method for you to use.

8. Add the CREATE TABLE statement to create a table from the query results. Name the table
response1. Run the query and examine the results.

proc sql;
create table response1 as
select CustomerID, EmailResp as Resp
from sq.salesemail
outer union corr
select CustomerID, SalesRep, PhoneResp as Resp
from sq.salesphone;
quit;

9. Review the log.


We created a new table with 12 rows and three columns.

10. Find the SAS DATA Step section. Another method to create this table and rename the columns is
to use the DATA step. This will produce the same results as the previous query.

data response2;
length Resp $12;
set sq.salesemail(rename=(EmailResp=Resp))
sq.salesphone(rename=(PhoneResp=Resp));
run;


5.3.3. SQL Versus Traditional SAS Programming


As you saw in the demo, the OUTER UNION set operator is similar to the DATA step with multiple tables
listed in the SET statement, and they can produce similar results. Using either method allows you to use the
flexibility of SAS programming or SQL. Use the method that works best for you.

proc sql;
create table response1 as
select CustomerID, EmailResp as Resp
from sq.salesemail
outer union corr
select CustomerID, SalesRep,
PhoneResp as Resp
from sq.salesphone;
quit;

data response2;
length Resp $12;
set sq.salesemail(rename=(EmailResp=Resp))
sq.salesphone(rename=(PhoneResp=Resp));
run;
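
The comparison above can also be tried without the course library. The sketch below uses hypothetical tables work.email_demo and work.phone_demo with invented rows, builds the combined table both ways, and prints each result. Because both response columns in this sample have the same length, the LENGTH statement from the course code is not needed here.

```sas
/* Hypothetical stand-ins for sq.salesemail and sq.salesphone. */
data work.email_demo;
   input CustomerID EmailResp :$8.;
   datalines;
101 Accepted
102 Declined
;
run;

data work.phone_demo;
   input CustomerID SalesRep PhoneResp :$8.;
   datalines;
102 7 Accepted
103 9 Declined
;
run;

/* Method 1: PROC SQL with OUTER UNION CORR and column aliases. */
proc sql;
   create table work.resp_sql as
   select CustomerID, EmailResp as Resp
      from work.email_demo
   outer union corr
   select CustomerID, SalesRep, PhoneResp as Resp
      from work.phone_demo;
quit;

/* Method 2: DATA step concatenation with the RENAME= data set option. */
data work.resp_data;
   set work.email_demo(rename=(EmailResp=Resp))
       work.phone_demo(rename=(PhoneResp=Resp));
run;

title 'PROC SQL result';
proc print data=work.resp_sql;
run;

title 'DATA step result';
proc print data=work.resp_data;
run;
title;
```

With this sample data, both tables should contain four rows and the three columns CustomerID, Resp, and SalesRep, with SalesRep missing for the email rows.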

5.3.4. Practice

Practice Level 1: Using the OUTER UNION Set Operator


Create a report that shows the email and phone offer responses along with the sales representative if
available.

Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your
practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In
Enterprise Guide, run the Autoexec process flow.

1. Using the sq.salesphone table as input, write a query to list the following columns:

   • CustomerID
   • a new column named Response based on the existing PhoneResp column
   • SalesRep labeled Sales Rep
   • a new column named Channel with the constant text Phone
   • Run the query and view the results.

2. Using the sq.salesemail table as input, write a query to list the following columns:

   • CustomerID
   • a new column named Response based on the existing EmailResp column
   • a new column named Channel with the constant text Email
   • Run the query and view the results.

3. Combine the two query results using the OUTER UNION set operation. Use the following
requirements:

   • Be mindful of the column alignment. Use the set operator modifiers as needed.
   • Order the results by CustomerID and Response.
   • Add an appropriate title.
   • Run the query and view the final results.

4. To how many sales representatives can we attribute an accepted offer?

1.
proc sql;
   select CustomerID,
          PhoneResp as Response,
          SalesRep 'Sales Rep',
          'Phone' as Channel
      from sq.salesphone;
quit;

2.
proc sql;
   select CustomerID,
          EmailResp as Response,
          'Email' as Channel
      from sq.salesemail;
quit;

3.
/*s105s05.sas*/

title 'Offer Results with Sales Rep';


proc sql;
   select CustomerID,
          PhoneResp as Response,
          SalesRep 'Sales Rep',
          'Phone' as Channel
      from sq.salesphone
   outer union corr
   select CustomerID,
          EmailResp as Response,
          'Email' as Channel
      from sq.salesemail
   order by 1,2;
quit;
title;

4. two
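
The count in step 4 can be read from the report, but it could also be queried directly. The following is only a sketch: the text stored for an accepted offer is not shown in this lesson, so the value 'ACCEPTED' below is an assumption. The first query lists the actual values so that the filter can be adjusted.

```sas
proc sql;
   /* List the stored response values before filtering on one of them */
   select distinct PhoneResp
      from sq.salesphone;

   /* Count the distinct sales representatives with an accepted offer. */
   /* 'ACCEPTED' is an assumed value; replace it with the real one.    */
   select count(distinct SalesRep) as RepsWithAcceptedOffer
      from sq.salesphone
      where upcase(PhoneResp) = 'ACCEPTED';
quit;
```

Only sq.salesphone is queried because, in this practice, SalesRep is selected only from that table.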

Quiz: Lesson 5

Select the best answer for each question.

1. How many rows will this query produce?

proc sql;
   select *
      from a
   intersect
   select *
      from b;
quit;

Table a

ID   Pet
1    Cat
1    Dog
1    Pig

Table b

ID   Pet
1    Cat
2    Cow
3    Dog

a. 1

b. 2

c. 5

d. 6

2. By default, the EXCEPT, INTERSECT, and UNION set operators remove duplicate rows from the
query results.

a. true

b. false

3. How many rows will this query produce?

proc sql;
   select *
      from a
   except
   select *
      from b;
quit;

Table a

ID   Pet
1    Cat
1    Dog
1    Pig

Table b

ID   Pet
1    Cat
2    Cow
3    Dog

a. 1

b. 2

c. 5

d. 6

4. Adding the CORR and ALL keywords to the EXCEPT operator would change the number of rows
that this query produces.

proc sql;
   select *
      from a
   except
   select *
      from b;
quit;

Table a

ID   Pet
1    Cat
1    Dog
1    Pig

Table b

ID   Pet
1    Cat
2    Cow
3    Dog

a. true

b. false

5. How many rows will this query produce?

proc sql;
   select *
      from a
   outer union
   select *
      from b;
quit;

Table a

ID   Pet
1    Cat
1    Dog
1    Pig

Table b

ID   Pet
1    Cat
2    Cow
3    Dog

a. 1

b. 2

c. 5

d. 6

6. How many rows will this query produce?

proc sql;
   select *
      from a
   union
   select *
      from b;
quit;

Table a

ID   Pet
1    Cat
1    Dog
1    Pig

Table b

ID   Pet
1    Cat
2    Cow
3    Dog

a. 2

b. 3

c. 4

d. 5

7. How many rows will this query produce?

proc sql;
   select *
      from a
   union all
   select *
      from b;
quit;

Table a

ID   Pet
1    Cat
1    Dog
1    Pig

Table b

ID   Pet
1    Cat
2    Cow
3    Dog

a. 2

b. 4

c. 6

d. 8

8. Which statement is true about using the keyword CORR in a set operation?

a. When used with the EXCEPT operator, the CORR keyword instructs PROC SQL to position by
same-name columns and eliminate columns that do not have the same name in both tables from the final
results.
b. When used with the INTERSECT operator, the CORR keyword instructs PROC SQL to overlay
columns by position in the final results.
c. When used with the OUTER UNION operator, the CORR keyword instructs PROC SQL to
eliminate columns that do not have the same name in both tables from the final results.

9. Which statement is true about using the keyword ALL in a set operation?

a. Generally, using the keyword ALL is less efficient because it requires an extra pass through the
data.
b. Generally, using the keyword ALL is more efficient because it avoids an extra pass through the
data.
c. PROC SQL does not allow duplicate rows to remain eligible for processing.

d. The keyword ALL is not an implied part of the OUTER UNION set operator and is not
necessary.

10. PROC SQL with set operators handles columns and rows depending on the specific set operator
and keywords used in the set operation.

a. true

b. false


1. Correct answer: a

The INTERSECT operator returns the unique rows that occur in both queries. Here, only the row 1 Cat
appears in both tables.

2. Correct answer: a

By default, the EXCEPT, INTERSECT, and UNION set operators all remove duplicate rows, although they
do not remove them at the same point in processing.

3. Correct answer: b

The EXCEPT set operator selects unique rows from the first query that are not found in the second query.
Here, the rows 1 Dog and 1 Pig appear only in table a.

4. Correct answer: b

Because the tables have no duplicate rows and the columns are named the same, the keywords cause no
changes.

5. Correct answer: d

The OUTER UNION set operator selects all rows from both query results.

6. Correct answer: d

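The UNION set operator returns unique rows from both queries. The row 1 Cat appears in both tables but is
returned only once, so five of the six input rows remain.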

7. Correct answer: c

Because the columns in both tables correspond, no columns are removed. The ALL keyword instructs
PROC SQL to retain all duplicate rows in the results.

8. Correct answer: a

When the CORR keyword is used with EXCEPT and INTERSECT, it instructs PROC SQL to align columns by
name and to keep only the columns that the query results have in common. OUTER UNION CORR also aligns
like-named columns, but it keeps the columns that do not have the same name in both tables.
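
The difference is easier to see with tables whose columns only partly match. The following is a minimal sketch. The tables t1 and t2 and their Age and Owner columns are hypothetical and are not part of the course data.

```sas
/* Hypothetical tables that share ID and Pet but differ in a third column */
data t1;
   input ID Pet $ Age;
   datalines;
1 Cat 3
2 Dog 5
;
run;

data t2;
   input ID Pet $ Owner $;
   datalines;
1 Cat Ann
3 Pig Bob
;
run;

proc sql;
   /* Only the same-name columns, ID and Pet, appear in the result */
   title 'EXCEPT CORR';
   select * from t1
   except corr
   select * from t2;

   /* ID and Pet are overlaid; Age and Owner are both kept */
   title 'OUTER UNION CORR';
   select * from t1
   outer union corr
   select * from t2;
quit;
title;
```

With these sample rows, the EXCEPT CORR report contains the single row 2 Dog, because 1 Cat is matched on the two shared columns.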

9. Correct answer: b

Not using the keyword ALL is generally less efficient because it requires PROC SQL to make an extra pass
through the data to eliminate duplicate rows.

10. Correct answer: a

The INTERSECT, EXCEPT and UNION set operators align columns by position and remove duplicate rows
by default. The OUTER UNION set operator includes all columns and rows by default.
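
The row counts in questions 1, 3, 5, 6, and 7 can be confirmed by running the queries. The following is a sketch that assumes the quiz tables a and b are created as temporary tables in the Work library with the values shown in the quiz.

```sas
/* Recreate the two quiz tables */
data a;
   input ID Pet $;
   datalines;
1 Cat
1 Dog
1 Pig
;
run;

data b;
   input ID Pet $;
   datalines;
1 Cat
2 Cow
3 Dog
;
run;

proc sql;
   title 'INTERSECT: 1 row';
   select * from a
   intersect
   select * from b;

   title 'EXCEPT: 2 rows';
   select * from a
   except
   select * from b;

   title 'UNION: 5 rows';
   select * from a
   union
   select * from b;

   title 'UNION ALL: 6 rows';
   select * from a
   union all
   select * from b;

   /* Without CORR, the columns are not overlaid: 6 rows and 4 columns */
   title 'OUTER UNION: 6 rows';
   select * from a
   outer union
   select * from b;
quit;
title;
```

Each title states the row count expected from the answer key, so every report can be checked against it at a glance.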
