SAS Programming 1:
Essentials
Course Notes
SAS® Programming 1: Essentials Course Notes was developed by Stacey Syphus and Beth Hardin.
Additional contributions were made by Bruce Dawless, Brian Gayle, Anita Hillhouse, Marty Hultgren,
Mark Jordan, Eva-Maria Kegelmann, Gina Repole, Gemma Robson, Samantha Rowland, Allison
Saito, Prem Shah, Charu Shankar, Kristin Snyder, Peter Styliadis, Su Chee Tay, and Kitty Tjaris.
Instructional design, editing, and production support was provided by the Learning Design and
Development team.
SAS and all other SAS Institute Inc. product or service names are registered trademarks or
trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.
Other brand and product names are trademarks of their respective companies.
Copyright © 2018 SAS Institute Inc. Cary, NC, USA. All rights reserved. Printed in the United States
of America. No part of this publication may be reproduced, stored in a retrieval system, or
transmitted, in any form or by any means, electronic, mechanical, photocopying, or otherwise,
without the prior written permission of the publisher, SAS Institute Inc.
Book code E71466, course code LWPG1M6/PG1M6, prepared date 31Jan2019. LWPG1M6_001
ISBN 978-1-64295-229-2
Table of Contents
To learn more…
For information about other courses in the curriculum, contact the
SAS Education Division at 1-800-333-7660, or send e-mail to
[email protected]. You can also find this information on the web at
http://support.sas.com/training/ as well as in the Training Course
Catalog.
For a list of SAS books (including e-books) that relate to the topics
covered in these course notes, visit https://www.sas.com/sas/books.html or
call 1-800-727-0025. US customers receive free shipping to US
addresses.
Lesson 1 Essentials
1.1 The SAS Programming Process
Demonstration: SAS Programming Process
Copyright © 2018, SAS Institute Inc., Cary, North Carolina, USA. ALL RIGHTS RESERVED.
1.1 The SAS Programming Process
It is impossible to understand data without using tools that help you derive meaning from numbers
and text. SAS offers a huge collection of tools and solutions to handle all your data needs. At the
core of all that SAS offers is the SAS programming language. Regardless of the SAS suite of tools
that you licensed, the Base SAS programming language is included. This course teaches you how
to write SAS code to handle the most common data processing tasks.
Access data -> Explore data -> Prepare data -> Analyze and report on data -> Export results
As you go through the process of making data meaningful and actionable, you will likely follow these
basic steps: access, explore, prepare, analyze and report, and export. SAS has programming tools
for each of these steps in the process. You follow this process as you learn the fundamentals of the
SAS programming language.
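As a rough illustration only (this is not the course program p101d01.sas), the five steps might look like the sketch below. It assumes that the sashelp.class sample table is available. The output path in the last step is hypothetical and must be changed to a location that you can write to.

```sas
/* Access: sashelp.class is a sample table provided by SAS */

/* Explore: print the first five rows */
proc print data=sashelp.class(obs=5);
run;

/* Prepare: create a new table with a computed column */
data work.myclass;
    set sashelp.class;
    HeightCM=Height*2.54;
run;

/* Analyze and report: summary statistics for each group */
proc means data=work.myclass mean min max maxdec=1;
    var Age HeightCM;
    class Sex;
run;

/* Export: write the prepared table to a CSV file.
   Hypothetical path; replace it with a writable location. */
proc export data=work.myclass
    outfile="/home/student/output/myclass.csv"
    dbms=csv replace;
run;
```

Each step in the sketch is covered in more detail later in the course.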
In this course, you follow the SAS programming process to start with raw data and turn it into helpful,
deliverable results.
Scenario
Examine the international storm data that is used in course demonstrations. Open and run a SAS
program that follows the SAS programming process. The code included in the program is covered
throughout this course.
Files
• p101d01.sas
• Storm.xlsx – a Microsoft Excel workbook containing detail and summary data about international
storms
b. Enterprise Guide: Click the Results - Excel tab and click Download.
Sample data used in this course: US National Park data, class, cars, shoes, international storm and weather data, and Europe tourism and trade data.
1.2 Using SAS Programming Tools
SAS provides several programming interfaces that can be used to interactively write and submit
code.
• SAS Studio – a web-based interface to SAS that you can use on any computer. SAS Studio is
the interface that is used in SAS OnDemand for Academics and SAS University Edition. SAS
University Edition is a free download for personal use. SAS OnDemand for Academics is
cloud-based software. For more information, visit
https://www.sas.com/en_us/learn/academic-programs/software.html.
• SAS Enterprise Guide – a Windows client application that runs on your PC and accesses SAS
on a local or remote server.
• SAS windowing environment – a legacy interface that is part of SAS.
Note: This course uses SAS Enterprise Guide and SAS Studio because these are
the SAS interfaces that have the most modern programming tools.
SAS Studio, SAS Enterprise Guide, and the SAS windowing environment enable you to write code
and view the log and output in the same interface.
Programs can also be submitted to the operating environment behind the scenes. This is referred to
as batch processing or background submit. The log and results are saved by default as separate
files in the same location as the SAS program. Background submission is often used for programs
that run regular jobs on a routine basis. These programs have typically been tested and can run
unattended.
SAS Studio also enables you to submit programs by right-clicking a .sas file in the Files and
Folders section and selecting Background Submit. You can view the status of background
programs and access the associated log and results files by clicking the More application options
icon and selecting Background Job Status.
Scenario
Write and submit a simple SAS program in SAS Enterprise Guide and examine the log and results.
Files
• sashelp.class – a sample table provided by SAS that includes information about 19 students
Notes
• Programs can be submitted by clicking Run or pressing the F3 key.
• A program generates a log. Depending on the code, a program might also generate results and
output data.
• To run a subset of a program, highlight the desired code. Then click the drop-down arrow next to
Run and select Run Selection, or press F3.
• When you rerun a program, a prompt appears. If you click Yes, the existing log and results are
replaced. If you click No, a copy of the program is created, and a new log and results
are generated.
Demo
1. View Sashelp sample tables.
Note: Sashelp is a collection of sample data files provided by SAS that are useful for testing
and practicing. This course references various data files in Sashelp to illustrate
programming syntax.
a. Open SAS Enterprise Guide. In the Welcome window, click New Project.
Note: In Enterprise Guide, your work is organized in projects. As you open tables and
programs or create new programs, you will notice shortcuts added to your project in
the Project Tree window. The project can be saved by selecting File > Save Project.
b. In the Servers window in the lower left corner, expand Servers > Local > Libraries >
SASHELP.
c. Double-click the CLASS table to open and view the data. You do not need to close the table.
2. Write and submit a program in SAS Enterprise Guide.
a. Select File > New > Program, or click New on the toolbar and select Program.
b. Type or copy and paste the program below on the Program tab and click Run.
Note: If you copy and paste the program, select Edit > Format Code to improve the
program spacing.
data myclass;
    set sashelp.class;
run;

proc print data=myclass;
run;
c. Click the Log tab. The log includes the program and messages that are returned from SAS.
The Log Summary is displayed by default at the bottom of the window. You can click on any
of the messages in the Log Summary to find the message in the log.
Note: If the Log Summary is closed, click Log Summary on the toolbar.
d. Click the Output Data and Results tabs to examine the output.
e. Return to the Program tab. Highlight the PROC PRINT and RUN statements and click the
drop-down arrow next to Run and select Run Selection, or press F3.
Note: In many of the programs in this course, you need to run only a portion of a SAS
program.
f. Click Yes when you are prompted to replace the results. Confirm that the log and the results
were replaced.
g. Often it is helpful to view two project items at the same time. For example, you might want to
view a program and the results, or possibly compare two tables. To view two windows at the
same time, click the Workspace Layout toolbar button and select either Stacked or Side
By Side. You can select different items in each section of the layout. Close one of the
sections when you are finished.
Note: If you have more than one program open in Enterprise Guide, you can select
the program to view in the drop-down list at the top of each section.
Scenario
Write and submit a simple SAS program in SAS Studio and examine the log and results.
Files
• sashelp.class – a sample table provided by SAS that includes information about 19 students
Notes
• Programs can be submitted by clicking Run or pressing the F3 key.
• A program generates a log. Depending on the code, a program might also generate results and
output data.
• To run a subset of a program, highlight the desired code and click Run or press F3.
• When you rerun a program, the existing log, results, and output data are replaced.
Demo
1. View Sashelp sample tables.
Note: Sashelp is a collection of sample data files provided by SAS that are useful for testing
and practicing. This course references various data files in Sashelp to illustrate
programming syntax.
a. Open SAS Studio. In the Navigation pane on the left side of the window, select Libraries.
Expand My Libraries > SASHELP.
b. Double-click the CLASS table to open and view the data. A panel to the left of the data lists
the columns in the table. The Column panel can be collapsed by clicking .
c. Close the SASHELP.CLASS tab.
2. Write and submit a program in SAS Studio.
a. A new program window labeled Program 1 is open. Notice that there are tabs labeled CODE,
LOG, and RESULTS.
Note: If you do not have a new Program tab, press F4 or click New and select
SAS Program.
b. Type or copy and paste the program below on the CODE tab and click Run.
Note: If you copy and paste the program, click the Format Code button to improve the
program spacing.
data myclass;
    set sashelp.class;
run;
Practice
Note: Please choose either the SAS Enterprise Guide or SAS Studio practice to further explore
your interface of choice.
b. Select File > New > Program (or click the New tool and select Program) to start writing
a SAS program. On the Program tab, type or copy and paste the following code. This is a
simple SAS program called a DATA step.
Note: If you copy and paste the program, select Edit > Format Code to improve the
program spacing.
data work.shoes;
    set sashelp.shoes;
    NetSales=Sales-Returns;
run;
c. Click Run or press F3 to submit the code. Examine the Log and Output Data tabs.
d. Click the Log tab. Notice that there are additional statements included before and after the
DATA step. This is called wrapper code, and it includes statements added by Enterprise
Guide to set up the environment and results. To make the log easier to read, the wrapper
code statements can be hidden. Select Tools > Options > Results General and clear the
Show generated wrapper code in SAS log check box. Click OK.
e. Return to the Program tab and rerun the program. Select Yes when you are prompted to
replace the results.
Note: To eliminate the prompt to replace results when you rerun a program, select Tools >
Options > Results General. Change the drop-down list with Replace results to
Replace without prompting and click OK.
f. On the Program tab, add code to compute summary statistics. At the end of the program,
begin by typing pr. Notice that a prompt appears with valid keywords. Press the Enter key
or the spacebar to add the word proc to the program. Press the spacebar and type me.
Press Enter again to add means to the program.
g. Press the spacebar, use the prompts to select data=work.shoes, and press Enter. Notice
that the prompt lists all valid options. Type or select options in the window to complete the
following statement:
proc means data=work.shoes mean sum maxdec=2;
Note: Autocomplete prompts can be modified or disabled by selecting Program >
Editor Options and then clicking the Autocomplete tab. On the tab, you can adjust
the prompts.
h. Complete the program by adding the highlighted statements below. Notice that after VAR
and CLASS, the autocomplete prompt includes a list of the columns from the work.shoes
table.
proc means data=work.shoes mean sum maxdec=2;
    var NetSales;
    class region;
run;
i. Highlight the code from PROC MEANS through RUN, and click the down arrow next to Run
and select Run Selection, or press F3.
Note: The default output format in Enterprise Guide is SAS Report. SAS Report output is
an XML file, which can be viewed within SAS applications.
j. To view two tabs at the same time, click the Workspace Layout button on the toolbar and
select either Stacked or Side By Side. View the Program tab in one section and Results in
the other.
k. To return to a single window, click the Workspace Layout toolbar button and select Single.
You can also click the X in the upper right corner of either window.
l. In addition to creating SAS Report output, you can create other output types. Click the
Program tab and click Properties. Select Results > Customize result formats, styles,
and behavior. Clear any selected check boxes and then select the PDF and Excel check
boxes. Click OK.
m. Run the program again. A separate tab is added for each of the results that are created.
Note: PowerPoint, Excel, PDF, and RTF results must be viewed outside of Enterprise
Guide. Click the View button in the Excel or PDF results window to open the file.
n. To save the program, return to the Program tab and select Save > Save Program As.
Navigate to the output folder in the course files. Enter shoesprogram in the File name field
and click Save. The .sas file extension is automatically added to the file name.
b. Options are available in the banner area to customize your SAS Studio environment.
New Options: New program, new import data, new query, close all tabs, and maximize view.
c. On the Program 1 tab, type or copy and paste the following code. This is a simple SAS
program called a DATA step.
Note: If you copy and paste the program, click Format Code to improve the program
spacing.
data work.shoes;
    set sashelp.shoes;
    NetSales=Sales-Returns;
run;
d. Click Run or press F3 to submit the code. Examine the LOG and OUTPUT DATA tabs.
The RESULTS tab is empty because the program did not create a report.
e. On the CODE tab, add code to compute summary statistics. At the end of the program, begin
by typing pr. Notice that a prompt appears with valid keywords and syntax help. Press Enter
to add the word proc to the program. Press the spacebar and type me and press Enter
again to add means to the program.
Note: The Autocomplete prompts also include a window with syntax help and links to
documentation and examples.
f. Press the spacebar, use the prompt to select data=, and then type work.shoes. Press the
spacebar and notice that the prompt lists all valid options. Type or select options in the
window to complete the following statement:
proc means data=work.shoes mean sum;
j. Highlight the code from PROC MEANS through RUN and click Run, or press F3
to run only the selected portion. Confirm the results.
Note: The default output format in SAS Studio is HTML.
k. To view multiple tabs at the same time, click the RESULTS tab and drag it to the right side
of the work area until a highlighted region appears. To return to a single window, drag the
RESULTS tab back to the main tab area.
l. On the RESULTS tab, click the HTML, PDF, or Word icon to open results in the
corresponding file format. You are prompted to open the file in the browser.
Note: Additional options for the output formats are available in More application options >
Preferences > Results.
m. To save the program, return to the CODE tab and click the Save As toolbar button.
Navigate to the output folder in the course files. Enter shoesprogram in the Name field and
click Save. The .sas file extension is automatically added to the file name.
Make a note of the location of your course files folder. It contains the activities, data, demos, practices, and output folders.
Program file names identify the course, lesson, and type. For example, p104d01.sas is Programming 1, Lesson 4, demo 1.
The course files include the cre8data.sas program.
1.3 Understanding SAS Syntax
data myclass;
    set sashelp.class;
    heightcm=height*2.54;
run;
If a RUN or QUIT statement is not used at the end of a step, the beginning of a new step implies the
end of the previous step. If a RUN or QUIT statement is not used at the end of the last step, SAS
Studio and Enterprise Guide automatically submit RUN and QUIT statements after the submitted
code.
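A minimal sketch of these step boundaries, assuming the sashelp.class sample table. PROC SQL is shown only as one example of a procedure that ends with QUIT rather than RUN.

```sas
/* No RUN statement here: the PROC PRINT statement that follows
   marks the end of this DATA step. */
data myclass;
    set sashelp.class;
    heightcm=height*2.54;

proc print data=myclass;
run;   /* explicit step boundary */

/* PROC SQL ends with QUIT instead of RUN */
proc sql;
    select Name, Age
        from myclass
        where Age>14;
quit;
```

Ending every step with an explicit RUN or QUIT statement makes the step boundaries easier to see in both the program and the log.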
data myclass;
    set sashelp.class;
    heightcm=height*2.54;
run;

proc print data=myclass;
run;

proc means data=myclass;
    var age heightcm;
run;

Most statements begin with a keyword, and all statements end with a semicolon.
Most statements begin with an identifying keyword. In addition to DATA, PROC, and RUN
statements, this program also includes SET and VAR statements. The one statement that does not
begin with a keyword is the assignment statement that creates the new column heightcm. The most
important thing to remember is that all statements end with a semicolon.
Global Statements
TITLE . . . ;
OPTIONS . . . ;
LIBNAME . . . ;
Global statements are typically outside of steps and do not need a RUN statement.
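The following sketch shows two global statements placed outside of a step. The title text and the system options are chosen only for illustration. A LIBNAME statement is not shown because it requires a path that is specific to your environment.

```sas
/* Global statements: outside of steps, no RUN statement needed */
options nodate nonumber;
title "Students in sashelp.class";

proc print data=sashelp.class;
run;

title;   /* a TITLE statement with no text clears the title */
```

Because global statements stay in effect until they are changed or cleared, the final TITLE statement keeps the title from carrying over to later results.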
1.03 Activity
Open p101a03.sas from the activities folder and perform the following tasks:
1. View the code. How many steps are in the program?
2. How many statements are in the PROC PRINT step?
3. How many global statements are in the program?
4. Run the program and view the log.
5. How many observations were read by the PROC PRINT step?
Enterprise Guide: Select Edit > Format Code. You can also right-click in the program and select
Format Code or press Ctrl+I to format a SAS program.
SAS Studio: Click Format Code. You can also right-click in the program and select Format
Code to format a SAS program.
data under13;
    set sashelp.class;
    where AGE<13;
    drop heIGht Weight;
run;
Unquoted values can be in any case.
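By contrast, values enclosed in quotation marks are compared exactly as typed. The sketch below assumes the sashelp.class sample table, where the Sex column is stored as an uppercase M or F.

```sas
/* Keywords, table names, and column names can be in any case */
DATA Under13;
    SET SASHELP.CLASS;
    WHERE age<13;
RUN;

/* Quoted values are case sensitive */
proc print data=sashelp.class;
    where Sex="M";    /* matches rows */
run;

proc print data=sashelp.class;
    where Sex="m";    /* matches no rows */
run;
```

When a quoted value returns no rows, checking the case of the stored data values is a reasonable first step.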
To comment out a block of code using the /* */ technique in the SAS interfaces, you can highlight
the code and then press Ctrl+/ (forward slash).
• To uncomment a block of code in SAS Studio, highlight the block and then press Ctrl+/ again.
• To uncomment a block of code in SAS Enterprise Guide, highlight the block, and then press
Ctrl+Shift+/.
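A short sketch of both comment styles, assuming the sashelp.cars sample table. The statements that are commented out were chosen only for illustration.

```sas
/* Block comment: can span lines and can contain semicolons; */
* Statement comment: starts with an asterisk and ends at the next semicolon;

data mycars;
    set sashelp.cars;
    /* AvgMPG=mean(mpg_city, mpg_highway); */
    *where Type="SUV";
run;
```

Both statements inside the DATA step are ignored when the program runs, so mycars is a copy of sashelp.cars.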
Scenario
Examine program statements, improve program spacing, and add comments.
Files
• p101d02.sas
• sashelp.cars – a sample table provided by SAS that includes basic information about 428 cars
Syntax
/*comment*/
*comment;
Notes
• All statements end with a semicolon.
• Spacing does not matter in a SAS program.
• Values not enclosed in quotation marks can be lowercase, uppercase, or mixed case.
• Consistent program spacing is a good practice to make programs legible.
• Use the following automatic spacing features:
SAS Studio: Click Format Code.
Enterprise Guide: Select Edit > Format Code or press Ctrl+I.
• Comments can be added to prevent text in the program from executing.
Demo
1. Open the p101d02.sas program from the demos folder. Run the program. Does it run
successfully?
2. Use the Format Code feature to improve the program spacing.
• Enterprise Guide: Select Edit > Format Code. You can also right-click in the program
and select Format Code or press Ctrl+I.
• SAS Studio: Click Format Code. You can also right-click in the program
and select Format Code.
3. Add the following text as a comment before the DATA statement: Program created by
<your-name>
Note: Select the comment text and press Ctrl+/ to surround it with /* and */.
4. Comment out the first TITLE statement and the WHERE statement in PROC PRINT.
Run the code and verify that 428 rows are included in the results.
/*Program created by <name>*/
data mycars;
    set sashelp.cars;
    AvgMPG=mean(mpg_city, mpg_highway);
run;
title;
Syntax errors, such as misspelled keywords, unmatched quotation marks, missing semicolons, and invalid options, produce a WARNING or ERROR message in the log.
Scenario
Find and resolve some common syntax errors.
Files
• p101d03.sas
• sashelp.cars – a sample table provided by SAS that includes basic information about 428 cars
Notes
• Some common syntax errors are unmatched quotation marks, missing semicolons, misspelled
keywords, and invalid options.
• Syntax errors might result in a warning or error in the log.
• Refer to the log to help diagnose and resolve syntax errors.
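The sketch below is not the p101d03.sas demo program. It is a hypothetical corrected program, assuming the sashelp.cars sample table, in which each comment names a common mistake that could occur in the statement that follows it.

```sas
/* Misspelled keyword: typing daat instead of data */
data mycars;
    set sashelp.cars;
    /* Missing semicolon: leaving off the semicolon that ends this statement */
    AvgMPG=mean(mpg_city, mpg_highway);
run;

/* Unmatched quotation marks: leaving off the closing quotation mark */
title "Cars with average MPG over 35";

/* Invalid option: adding a word that PROC PRINT does not recognize */
proc print data=mycars;
    var Make Model Type AvgMPG;
    where AvgMPG>35;
run;

title;
```

Introducing one of these mistakes at a time and reading the resulting log is a useful way to learn what each WARNING or ERROR message looks like.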
Demo
1. Open the p101d03.sas program from the demos folder. Identify the three syntax errors but do
not fix them. Run the program.
2. Carefully review the messages in the log.
Note: The Log Summary is available to view the notes, warnings, and errors.
3. Fix the code and rerun the program.
data mycars;
    set sashelp.cars;
    AvgMPG=mean(mpg_city, mpg_highway);
run;
title;
1.04 Activity
Open p101a04.sas from the activities folder and perform the following tasks:
1. Format the program to improve the spacing. What syntax error is
detected? Fix the error and run the program.
2. Read the log and identify any additional syntax errors or warnings. Correct
the program and format the code again.
3. Add a comment to describe the changes that you made to the program.
4. Run the program and examine the log and results. How many rows are in
the canadashoes data?
data canadashoes set sashelp.shoes;
where region="Canada;
Profit=Sales-Returns;run;
The Extended Learning Page is designed to supplement your learning for SAS Programming 1.
The Extended Learning Page includes the following resources:
• PDF version of the course notes in English and other languages
• course files
• case studies for additional practice and application
• links to papers, videos, blogs, and other resources to learn more about related topics
Links
• Watch the video Getting Started with SAS Studio.
• View additional free video tutorials about using SAS Studio tasks.
• Watch the video Writing and Submitting SAS Code: Choosing an Editor.
• Take the SAS Enterprise Guide 1: Querying and Reporting course.
Links
• Read the blog post How to run SAS programs in Jupyter Notebook.
• Read instructions and download the Jupyter kernel for SAS on the SAS GitHub page.
• Watch the video An Introduction to SAS Viya Programming for SAS 9 Programmers.
• Take the Programming for SAS Viya course after SAS Programming 1.
• Take the free SAS Programming for R Users course.
• Use the Getting Started with SAS Viya for R documentation.
1.4 Solutions
Solutions to Activities and Questions
Confirm that 22 SAS tables were created.
Lesson 2 Accessing Data
2.1 Understanding SAS Data
2.1 Understanding SAS Data
Access data -> Explore data -> Prepare data -> Analyze and report on data -> Export results
Accessing data is the first step in the SAS programming process. There are many types of data files,
and SAS makes it easy to access these different types of data and use them for reporting and
analysis.
Types of Data
• structured data (for example, SAS tables stored as .sas7bdat files)
• unstructured data
SAS table
Descriptor portion:
• table name
• number of rows
• date created
• column names
• column attributes
Data portion:
• data values
SAS Terminology
• SAS table or data set
• column or variable
• row or observation
Column Name
• 1 – 32 characters
• starts with a letter or underscore
• can be uppercase, lowercase, or mixed case
a. month6
b. 6month
c. month#6
d. month 6
e. month_6
f. Month6
Type: Numeric
SAS Dates
01Jan1959    01Jan1960    01Jan1961
-365         0            366
Storing dates as numbers makes calculations easy!
SAS date values represent the number of days between January 1, 1960, and a specified date.
SAS can perform calculations on dates ranging from A.D. 1582 to A.D. 19,900.
SAS time values represent the number of seconds since midnight of the current day.
SAS datetime values represent the number of seconds between midnight on January 1, 1960,
and an hour/minute/second within a specified date.
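The date arithmetic described above can be sketched with a short DATA step that writes a few date constants to the log as raw numbers. This is a minimal illustration. The variable names are made up for the example and do not come from the course files.

```sas
/* Illustrative only: variable names are assumptions, not course data */
data _null_;
   before = '01JAN1959'd;   /* 365 days before the reference date  */
   origin = '01JAN1960'd;   /* the reference date                  */
   after  = '01JAN1961'd;   /* 1960 was a leap year, so 366 days   */
   put before= origin= after=;   /* before=-365 origin=0 after=366 */

   /* Because dates are stored as numbers, subtraction gives elapsed days */
   elapsed = '31DEC2018'd - '01JAN2018'd;
   put elapsed=;                 /* elapsed=364 */
run;
```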
Type         Length
Numeric      8 bytes (~16 significant digits)
Character    1 - 32,767 bytes (1 byte = one character)
2.02 Activity
1. Navigate to the location of your course files and open the data folder.
Enterprise Guide: Expand Servers > Local > Files.
SAS Studio: Expand Files and Folders.
2. Double-click the storm_summary.sas7bdat SAS table to view it.
How are missing character and numeric values represented in the data?
2.03 Question
Click Table Properties above the storm_summary data to view the table
and column attributes. Examine the length of the Basin column. Could East
Pacific be properly stored as a data value in the Basin column?
Yes
No
PROC CONTENTS DATA=data-set;
RUN;
PROC CONTENTS creates a report about the descriptor portion of the data.
The path provided in the program must be relative to where SAS is running. If SAS is on a remote
server, the path points to the server, not the local machine.
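As a minimal sketch of a PROC CONTENTS step that references a table by its full path, the example below assumes that the course data is stored in s:/workshop/data. Adjust the path to a location that your SAS session (local or remote) can actually reach.

```sas
/* The path is an assumption; it must be visible to the machine where SAS runs */
proc contents data="s:/workshop/data/storm_summary.sas7bdat";
run;
```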
2.04 Activity
Open p102a04.sas from the activities folder and perform the following tasks:
1. Write a PROC CONTENTS step to generate a report of the
storm_summary.sas7bdat table properties. Highlight the step and run
only the selected code.
2. How many observations are in the table?
3. How is the table sorted?
PROC CONTENTS DATA=data-set;
RUN;
Discussion
What challenges might arise if you use
a fixed path in your program?
2.2 Accessing Data through Libraries
"I hope I don't need to write that file path again and again in a long program!"
"Think of the editing I'll need to do if the data changes location!"
"What if I want to access another type of data?"
Libref naming rules:
• eight-character maximum
• starts with a letter or underscore
• continues with letters, numbers, or underscores
LIBNAME is a global statement and does not need a RUN statement.
SAS complies with operating system permissions that are assigned to the data files referenced by
the library. If you have Write access to the files, you are able to use SAS code to add, modify, or
delete data files. If you have Read access but do not have Write access, you can read data files
via the library, but you cannot make any changes to the files with SAS code.
To prevent SAS from making changes to tables in a library, you can add ACCESS=READONLY
at the end of the LIBNAME statement.
libname mylib base "s:/workshop/data" access=readonly;
libref.table-name
Create the library mylib, and then use the library:
proc contents data=mylib.class;
run;
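Putting the two pieces together, a sketch of creating and using a library might look like the following. The path s:/workshop/data and the table name class are taken from the surrounding examples and are assumptions about your environment.

```sas
/* Create the library: the libref mylib points to a folder of SAS tables */
libname mylib base "s:/workshop/data";

/* Use the library: reference a table as libref.table-name */
proc contents data=mylib.class;
run;

/* Optionally remove the libref when it is no longer needed */
libname mylib clear;
```

Clearing the libref does not delete any data files. It only removes the association between the name and the folder.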
2. Run the code and verify that the library was successfully assigned
in the log.
3. Go back to your program and save it as libname.sas in the main course
files folder. Replace the file if it exists.
Note: The log might indicate that the pg1 libref refers to the same physical library as another libref,
such as TMP0001 or _TEMP0. When a table is opened to view in the data grid, SAS creates
a library that points to the folder where the table is located. You do not need to clear the libref
that is created by SAS.
2.06 Activity
1. Enterprise Guide: Select Libraries in the resources pane and click Refresh.
SAS Studio: Select Libraries in the navigation pane and expand
My Libraries.
2. Expand the PG1 library. Why are the Excel and text files in the data folder
not included in the library?
Work
• temporary
• contents deleted at the end of the SAS session
data=work.test
data=test
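The equivalence of the two references above can be sketched as follows. The table name test comes from the slide, and copying sashelp.class into it is only an illustrative choice.

```sas
/* Two-level name: explicitly writes to the Work library */
data work.test;
   set sashelp.class;
run;

/* One-level name: by default SAS resolves test to work.test */
proc contents data=test;
run;
```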
Work and Sashelp are also known as SAS system libraries. For more information about system
libraries, access this page in SAS Help.
Sashelp
• includes sample data that you can use
data=sashelp.cars
Libraries defined by a SAS administrator: sales, research
sales.quarter1
research.field_trial9
If your SAS platform has an administrator, other automatic libraries might be defined when you open
your SAS interface. If libraries are defined for you, you do not need to submit a LIBNAME statement.
You can use the libref that was created by your administrator and the table name to reference data in
your program.
Scenario
Use the Work and Sashelp libraries that are automatically created by SAS. Determine what
happens with libraries and tables when SAS restarts.
Files
• p102d01.sas
• sashelp.class – a sample table provided by SAS that includes basic information about
19 students
Notes
• Work and Sashelp are system libraries that are automatically defined by SAS.
• Tables stored in the Work library are deleted at the end of each SAS session.
• Work is the default library, so if a table name without a libref is provided in the program, the table
is read from or written to the Work library.
• Sashelp contains a collection of sample tables and other files that include information about your
SAS session.
Demo
1. Open the p102d01.sas program from the demos folder and find the Demo section. Run the
demo program and use the navigation pane to examine the contents of the Work and Out
libraries.
2. Which table is in the Work library? Which table is in the Out library?
3. Restart SAS.
a. Enterprise Guide: In the Servers list, select Local and click Disconnect. Click Yes in the
confirmation window. Expand Local to start SAS again, and then expand Libraries.
b. SAS Studio: Select More application options > Reset SAS Session.
4. Discuss the following questions:
a. What is in the Work library?
b. Why are the out and pg1 libraries not available?
c. Is class_copy2 saved permanently?
d. What must be done to re-establish the out library?
5. To re-establish the pg1 library, open and run the libname.sas program that was saved
previously in the main course files folder.
Note: Whenever you restart SAS Studio or SAS Enterprise Guide, you need to run the
libname.sas program to re-establish the pg1 library.
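The contents of p102d01.sas are not reproduced in these notes. The following is a hypothetical sketch of a program that would behave as the demo describes. The out path and the table name class_copy1 are assumptions. Only class_copy2, the out libref, and sashelp.class are named in the text.

```sas
/* Hypothetical sketch: the out path and class_copy1 are assumptions */
libname out "s:/workshop/output";

/* Write one temporary table (Work) and one permanent table (Out) */
data class_copy1 out.class_copy2;
   set sashelp.class;
run;
```

After a restart, a table written to Work is gone and the out libref is no longer assigned, but a file written through out remains in its folder until the library is assigned again.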
The XLSX engine requires a license for SAS/ACCESS Interface to PC Files, and it also requires
SAS 9.4M2 or later.
Note: SAS/ACCESS Interface to PC Files is included with SAS University Edition.
In SAS Studio and Enterprise Guide, the VALIDVARNAME= option is set to ANY by default. ANY
enables column names to contain special characters, including spaces. If a column name contains
special characters, the column name must be expressed as a SAS name literal.
"var-name"n
VALIDVARNAME can be set to V7 during the SAS session by submitting the OPTIONS statement.
You can also change the default value for VALIDVARNAME in the interface options.
Enterprise Guide: Select Tools > Options > Data General and change Valid variable names to
Basic variable names.
SAS Studio: Select More application options > Preferences and change SAS variable name
policy to V7.
Note: The SAS windowing environment sets VALIDVARNAME=V7 by default.
options validvarname=v7;
libname xlclass xlsx "s:/workshop/data/class.xlsx";
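For contrast, here is a sketch of how a column name with a space would be referenced as a name literal when VALIDVARNAME=ANY is in effect. It assumes the Storm.xlsx workbook used later in this lesson is stored in s:/workshop/data and that its Storm_Summary worksheet has the column headings Hem NS and Hem EW.

```sas
/* Assumed path; Hem NS and Hem EW are column headings in the worksheet */
options validvarname=any;
libname xlstorm xlsx "s:/workshop/data/Storm.xlsx";

proc print data=xlstorm.storm_summary(obs=5);
   var 'Hem NS'n 'Hem EW'n;   /* name literals: quoted name followed by n */
run;

libname xlstorm clear;
```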
Scenario
Create a library to connect to an Excel workbook and reference an Excel worksheet in the program.
Files
• p102d02.sas
• Storm.xlsx – an Excel workbook with multiple worksheets that contain storm data
Syntax
OPTIONS VALIDVARNAME=V7;
Notes
• The XLSX engine enables you to read data directly from Excel workbooks. The XLSX engine
requires the SAS/ACCESS Interface to PC Files license.
• The VALIDVARNAME=V7 system option forces table and column names read from Excel to follow
SAS naming conventions. Spaces and special symbols are replaced with underscores, and names
greater than 32 characters are truncated.
• Date values are automatically converted to numeric SAS date values and formatted for easy
interpretation.
• Worksheets from the Excel workbook can be referenced in a SAS program as
libref.worksheet-name.
• When you define a connection to a data source other than a SAS data source, such as Excel or
other databases, it is a good practice to delete the libref at the end of your program with the
CLEAR option.
Demo
1. Open the Storm.xlsx file in Excel to view the data. Notice that, in the Storm_Summary
worksheet, there are spaces in the Hem NS and Hem EW column headings. Close the Excel file
after you finish viewing it.
Note: The file must be closed before you assign a library to the file.
2. Open p102d02.sas from the demos folder and find the Demo section. Complete the OPTIONS
statement to ensure that column names follow SAS naming conventions.
3. Complete the LIBNAME statement to define a library named xlstorm that connects to the
Storm.xlsx workbook.
4. Highlight the OPTIONS and LIBNAME statements and run the selected code. Use the navigation
pane to find the xlstorm library. Open the storm_summary table. Notice that the Hem_NS and
Hem_EW columns include underscores. Close the storm_summary table.
*Complete the OPTIONS statement;
options validvarname=v7;
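The remaining demo steps are not shown in these notes. A sketch of how they might be completed follows, assuming that Storm.xlsx is stored in s:/workshop/data. The PROC CONTENTS step is added here only to illustrate referencing a worksheet as libref.worksheet-name.

```sas
options validvarname=v7;

/* Assumed path: point the XLSX engine at the workbook */
libname xlstorm xlsx "s:/workshop/data/Storm.xlsx";

/* Reference a worksheet as libref.worksheet-name */
proc contents data=xlstorm.storm_summary;
run;

/* Release the connection to the Excel file */
libname xlstorm clear;
```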
2.07 Activity
Open p102a07.sas from the activities folder and perform the following tasks:
1. If necessary, update the path of the course files in the LIBNAME
statement.
2. Complete the PROC CONTENTS step to read the parks table in the NP
library.
3. Run the program. Navigate to your list of libraries and expand the NP
library. Confirm that three tables are included: Parks, Species, and Visits.
4. Examine the log. Which column names were modified to follow SAS
naming conventions?
5. Uncomment the final LIBNAME statement and run it to clear the NP
library.
2.3 Importing Data into SAS
PROC IMPORT
Scenario
Using PROC IMPORT, import a comma-delimited file and create a new SAS table.
Files
• p102d03.sas
• storm_damage.csv – a comma-delimited file that includes a description and damage estimates for
storms in the US with damages over one billion dollars
Syntax
Notes
• The IMPORT procedure can be used to read delimited text files.
• The DBMS option identifies the file type. The CSV value is included with Base SAS.
• The OUT= option provides the library and name of the SAS output table.
• The REPLACE option is necessary to overwrite the SAS output table if it exists.
• SAS assumes that column names are in the first line of the text file and data begins on the
second line.
• Date values are automatically converted to numeric SAS date values and formatted for easy
interpretation.
• The GUESSINGROWS= option can be used to increase the number of rows that SAS scans to
determine each column’s type and length from the default of 20 rows to a maximum of 32,767.
Demo
The storm_damage.csv file is in the data folder. In this display of the data, notice that column
names are in the first row, the data is comma-delimited, and there is a Date column. Data values
that include commas are enclosed in quotation marks.
1. Open the p102d03.sas program in the demos folder and find the Demo section. Complete the
PROC IMPORT step to read storm_damage.csv and create a temporary SAS table named
storm_damage_import. Replace the table if it exists.
2. Complete the PROC CONTENTS step to examine the properties of storm_damage_import.
3. Highlight the demo program and submit the selected code.
*Complete the PROC IMPORT step;
proc import datafile="s:/workshop/data/storm_damage.csv" dbms=csv
out=storm_damage_import replace;
run;
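Step 2 of the demo is not shown above. A minimal sketch of the PROC CONTENTS step, assuming that the import created storm_damage_import in the Work library, might be:

```sas
/* Examine the descriptor portion of the imported table */
proc contents data=storm_damage_import;
run;
```

The report lets you check the type and length that PROC IMPORT assigned to each column, including whether the Date column was read as a numeric SAS date.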
2.08 Activity
Open p102a08.sas from the activities folder and perform the following tasks:
1. This program imports a tab-delimited file. Run the program twice and
carefully read the log. What is different about the second submission?
2. Fix the program and rerun it to confirm that the import is successful.
If the Excel file is open when PROC IMPORT runs, an error occurs.
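As a rough sketch of importing an Excel worksheet with PROC IMPORT, the step below reads one sheet into a SAS table. The workbook path follows the class.xlsx example used earlier. The sheet name class_test and the output table name are hypothetical.

```sas
/* Hypothetical sheet and output names; close the workbook in Excel first */
proc import datafile="s:/workshop/data/class.xlsx"
            dbms=xlsx
            out=work.class_test_import
            replace;
   sheet="class_test";   /* worksheet to read (assumed name) */
run;
```

Unlike the XLSX LIBNAME engine, which reads the workbook directly each time, this creates a copy of the data as a SAS table at the time the step runs.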
Links
• SAS/ACCESS courses (http://support.sas.com/training/us/paths/dmgt.html#acc)
• SAS/ACCESS documentation (https://support.sas.com/documentation/onlinedoc/access/)
Practice
If you restarted your SAS session, open and submit the libname.sas program in the course files.
Level 1
1. Importing Excel Data from a Single Worksheet
Create a table that contains a copy of the data that is in an Excel workbook. The Excel workbook
contains a single worksheet.
a. Open p102p01.sas from the practices folder. Complete the PROC IMPORT step to read
eu_sport_trade.xlsx. Create a SAS table named eu_sport_trade and replace the table
if it exists.
b. Modify the PROC CONTENTS code to display the descriptor portion of the eu_sport_trade
table. Submit the program, and then view the output data and the results.
Level 2
2. Importing Data from a CSV File
Create a table from a comma-delimited CSV file.
np_traffic.csv
ParkName,UnitCode,ParkType,Region,TrafficCounter,ReportingDate,TrafficCount
Big Hole NB,BIHO,National Battlefield,Pacific West,TRAFFIC COUNT AT BATTLE ROAD,31JAN2016,0
Big Hole NB,BIHO,National Battlefield,Pacific West,TRAFFIC COUNT AT BATTLE ROAD,29FEB2016,0
Big Hole NB,BIHO,National Battlefield,Pacific West,TRAFFIC COUNT AT BATTLE ROAD,31MAR2016,0
Big Hole NB,BIHO,National Battlefield,Pacific West,TRAFFIC COUNT AT BATTLE ROAD,30APR2016,183
Big Hole NB,BIHO,National Battlefield,Pacific West,TRAFFIC COUNT AT BATTLE ROAD,31MAY2016,289
a. Create a new program. Write a PROC IMPORT step to read the np_traffic.csv file and
create the traffic SAS table. Add a PROC CONTENTS step to view the descriptor portion of
the newly created table. Submit the program.
b. Examine the data interactively. Scroll down to row 37. Notice that the values of ParkName
and TrafficCounter seem to be truncated. Modify the program to resolve this issue.
c. Submit the program and verify that ParkName and TrafficCounter are no longer truncated.
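One possible approach to this practice is sketched below, assuming that the file is stored in s:/workshop/data. It uses the GUESSINGROWS statement so that SAS scans more rows before it decides each column's length, which is the likely cause of the truncation.

```sas
/* Assumed path; GUESSINGROWS=MAX asks SAS to scan all rows */
proc import datafile="s:/workshop/data/np_traffic.csv"
            dbms=csv
            out=traffic
            replace;
   guessingrows=max;
run;

proc contents data=traffic;
run;
```

Scanning every row takes longer on large files, so a specific number that is large enough to cover the longest values is a reasonable alternative.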
Challenge
3. Importing Data with a Specific Delimiter
Create a table from np_traffic.dat. The values in the text file are delimited with a pipe (that is,
a vertical bar).
ParkName|UnitCode|ParkType|Region|TrafficCounter|ReportingDate|TrafficCount
Big Hole NB|BIHO|National Battlefield|Pacific West|TRAFFIC COUNT AT BATTLE ROAD|31JAN2016|0
Big Hole NB|BIHO|National Battlefield|Pacific West|TRAFFIC COUNT AT BATTLE ROAD|29FEB2016|0
Big Hole NB|BIHO|National Battlefield|Pacific West|TRAFFIC COUNT AT BATTLE ROAD|31MAR2016|0
Big Hole NB|BIHO|National Battlefield|Pacific West|TRAFFIC COUNT AT BATTLE ROAD|30APR2016|183
Big Hole NB|BIHO|National Battlefield|Pacific West|TRAFFIC COUNT AT BATTLE ROAD|31MAY2016|289
a. Access the SAS Procedures Guide. Expand Procedures and find the IMPORT Procedure
section. Review the syntax and examples to determine how to read a file that is delimited
with a specific symbol.
b. Use PROC IMPORT to import the np_traffic.dat file and create the temporary traffic2 SAS
table.
Partial Results (rows 37-46 of 2,784)
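One possible approach to the challenge is sketched below, assuming that the file is stored in s:/workshop/data. DBMS=DLM together with the DELIMITER statement tells PROC IMPORT which symbol separates the values.

```sas
/* Assumed path; the pipe symbol is given in the DELIMITER statement */
proc import datafile="s:/workshop/data/np_traffic.dat"
            dbms=dlm
            out=traffic2
            replace;
   guessingrows=max;   /* scan all rows to avoid truncated values */
   delimiter="|";
run;
```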
c. To test the library, select More application options > Reset SAS Session. Expand the
Libraries section of the navigation pane and verify that the pg1 library is available.
5. SAS Enterprise Guide: Assigning a Library Automatically at Start-Up
Recall that when SAS shuts down, library references are deleted. It might be helpful to have
certain libraries that are automatically defined when SAS starts.
a. Select Tools > Options > SAS Programs. Select the Submit SAS code when server is
connected check box and click Edit. You can include any SAS code that you want to
execute each time SAS starts. Type a LIBNAME statement, click Save, and then click OK.
Note: Change the path if necessary to match the location of your course data.
libname pg1 base "s:/workshop/data";
b. To test the library, select Local in the Servers list, click Disconnect, and then click Yes.
Expand Local to start SAS again, and then expand Libraries to confirm that pg1 is
available.
2.4 Solutions
Solutions to Practices
1. Importing Excel Data from a Single Worksheet
*Modify the path if necessary;
proc import datafile="s:/workshop/data/eu_sport_trade.xlsx"
dbms=xlsx
out=eu_sport_trade
replace;
run;
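Part b of the practice is not shown above. A minimal sketch of the PROC CONTENTS step, assuming that the import created eu_sport_trade in the Work library, might be:

```sas
/* Display the descriptor portion of the imported table */
proc contents data=eu_sport_trade;
run;
```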
a. month6
b. 6month
c. month#6
d. month 6
e. month_6
f. Month6
Month6 and month6 are actually the same column name.
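The point about case can be checked with a small sketch. The table name name_demo and the values are made up for illustration.

```sas
/* Illustrative only: table name and values are assumptions */
data work.name_demo;
   month6  = 1;
   Month6  = 2;   /* same column as month6, so this overwrites the value */
   month_6 = 3;   /* a different, valid column name */
run;

/* The report lists two columns: month6 and month_6 */
proc contents data=work.name_demo;
run;
```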
How are missing character and numeric values represented in the data?
Yes
No: Basin is two bytes, so East Pacific would be truncated, and the value would be Ea.
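The truncation can be reproduced with a short sketch. The DATA step below defines a two-byte character column and assigns a longer value. It does not use the course table.

```sas
/* Illustrative only: a two-byte character column cannot hold a longer value */
data _null_;
   length Basin $ 2;
   Basin = "East Pacific";
   put Basin=;   /* Basin=Ea */
run;
```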
2. Run the code and verify that the library was successfully assigned
in the log.
25 libname pg1 base "s:/workshop/data";
NOTE: Libref PG1 was successfully assigned as follows:
Engine: BASE
Physical Name: s:\workshop\data
2. Fix the program and rerun it to confirm that the import is successful.
proc import datafile="s:/workshop/data/storm_damage.tab"
dbms=tab out=storm_damage_tab replace;
run;
Lesson 3 Exploring and Validating Data
3.1 Exploring Data
Demonstration: Exploring Data with Procedures
Practice
3.1 Exploring Data
The SAS programming process: Access data > Explore data > Prepare data > Analyze and report on data > Export results
Exploring data can include learning about the columns and values that you have, as well as
validating data to look for incorrect or inconsistent values. In this lesson, you learn to use some
procedures that give you some of this insight. You also learn to subset the data so that you can
focus on particular segments, format data so you can easily understand it, sort data, and identify and
clean up duplicate values.
MEANS
UNIVARIATE
FREQ
PRINT Procedure
a. BY
b. ID
c. SUM
d. VAR
PRINT Procedure
proc print data=sashelp.cars (obs=10);
var Make Model Type MSRP;
run;
MEANS Procedure
PROC MEANS DATA= input-table;
VAR col-name(s);
RUN;
proc means data=sashelp.cars;
var EngineSize Horsepower MPG_City MPG_Highway;
run;
UNIVARIATE Procedure
PROC UNIVARIATE DATA=input-table;
VAR col-name(s);
RUN;
proc univariate data=sashelp.cars;
var MPG_Highway;
run;
FREQ Procedure
PROC FREQ DATA=input-table;
TABLES col-name(s);
RUN;
proc freq data=sashelp.cars;
tables Origin Type DriveTrain;
run;
Scenario
Use the PRINT, MEANS, UNIVARIATE, and FREQ procedures to explore and validate data.
Files
• p103d01.sas
• storm_summary – a SAS table that contains one row per storm for the 1980 through 2016 storm
seasons
Syntax
Notes
• PROC PRINT lists all columns and rows in the input table by default. The OBS= data set option
limits the number of rows read from the input data. The VAR statement limits and orders the
columns that are listed.
• PROC MEANS generates simple summary statistics for each numeric column in the input data
by default. The VAR statement limits the columns to analyze.
• PROC UNIVARIATE also generates summary statistics for each numeric column in the data
by default, but it includes more detailed statistics related to distribution and extreme values.
The VAR statement limits the columns to analyze.
• PROC FREQ creates a frequency table for each column in the input table by default. You can limit
the columns that are analyzed by using the TABLES statement.
Demo
1. Open p103d01.sas from the demos folder and find the Demo section of the program. Complete
the PROC PRINT statement to list the data in pg1.storm_summary. Print the first 10
observations. Highlight the step and run the selected code.
proc print data=pg1.storm_summary (obs=10);
run;
2. Add a VAR statement to include only the following columns: Season, Name, Basin,
MaxWindMPH, MinPressure, StartDate, and EndDate. Add list first 10 rows as a comment
before the PROC PRINT statement. Highlight the step and run the selected code.
Enterprise Guide Note: To easily add column names, use the autocomplete prompts to view
and select columns. You can either double-click on a column to add it in the program,
or start to type the column name and press the spacebar when the correct column is
highlighted.
SAS Studio Note: To easily add column names, place your cursor after the keyword VAR.
Use the Library section of the navigation pane to find the pg1 library. Expand the
storm_summary table to see a list of column names. Hold down the Ctrl key and select
the columns in the order in which you want them to appear in the statement. Drag the
selected columns to the VAR statement.
/*list first 10 rows*/
proc print data=pg1.storm_summary(obs=10);
var Season Name Basin MaxWindMPH MinPressure StartDate
EndDate;
run;
3. Copy the PROC PRINT step and paste it at the end of the program. Change PRINT to MEANS.
Remove the OBS= data set option to analyze all observations. Modify the VAR statement to
calculate summary statistics for MaxWindMPH and MinPressure. Add calculate summary
statistics as a comment before the PROC MEANS statement. Highlight the step and run the
selected code.
/*calculate summary statistics*/
proc means data=pg1.storm_summary;
var MaxWindMPH MinPressure;
run;
4. Copy the PROC MEANS step and paste it at the end of the program. Change MEANS to
UNIVARIATE. Add examine extreme values as a comment before the PROC UNIVARIATE
statement. Highlight the step and run the selected code.
/*examine extreme values*/
proc univariate data=pg1.storm_summary;
var MaxWindMPH MinPressure;
run;
5. Copy the PROC UNIVARIATE step and paste it at the end of the program. Change UNIVARIATE
to FREQ. Change the VAR statement to a TABLES statement to produce frequency tables for
Basin, Type, and Season. Add list unique values and frequencies as a comment before the
PROC FREQ statement. Highlight the step and run the selected code.
/*list unique values and frequencies*/
proc freq data=pg1.storm_summary;
tables Basin Type Season;
run;
Practice
If you restarted your SAS session, open and submit the libname.sas program in the course files.
Level 1
1. Exploring Data with Procedures
The pg1.np_summary table contains public use statistics from the National Park Service.
Use the PRINT, MEANS, UNIVARIATE, and FREQ procedures to explore the data for possible
inconsistencies.
a. Open p103p01.sas from the practices folder. Complete the PROC PRINT statement to list
the first 20 observations in pg1.np_summary.
b. Add a VAR statement to include only the following variables: Reg, Type, ParkName,
DayVisits, TentCampers, and RVCampers. Highlight the step and run the selected code.
Do you observe any possible inconsistencies in the data?
c. Copy the PROC PRINT step and paste it at the end of the program. Change PRINT to
MEANS and remove the OBS= data set option. Modify the VAR statement to calculate
summary statistics for DayVisits, TentCampers, and RVCampers. Highlight the step and
run the selected code.
What is the minimum value for tent campers? Is that value unexpected?
d. Copy the PROC MEANS step and paste it at the end of the program. Change MEANS
to UNIVARIATE. Highlight the step and run the selected code.
Are there negative values for any of the columns?
e. Copy the PROC UNIVARIATE step and paste it at the end of the program. Change
UNIVARIATE to FREQ. Change the VAR statement to a TABLES statement to produce
frequency tables for Reg and Type. Highlight the step and run the selected code.
Are there any lowercase codes? Are there any codes that occur only once in the table?
f. Add comments before each step to document the program. Save the program as
np_validate.sas in the output folder.
Level 2
2. Using Procedures to Validate Data
The pg1.np_summary table contains information about US national parks, monuments,
preserves, rivers, and seashores. Valid values for the columns Reg and Type are as follows:
Reg   Description          Type   Description
A     Alaska               NM     National Monument
IM    Intermountain        NP     National Park
MW    Midwest              NS     National Seashore
NC    National Capital     PRE    National Preserve
NE    Northeast            RVR    National River
PW    Pacific West
SE    Southeast
a. Create a new program. Write a PROC FREQ step to produce frequency tables for the Reg
and Type columns in the pg1.np_summary table. Submit the step and look for invalid
values.
b. Write a PROC UNIVARIATE step to generate statistics for the Acres column in the
pg1.np_summary table. Notice the observation numbers for the smallest park and the
largest park.
c. View the pg1.np_summary table to identify the names of the smallest and largest parks.
Challenge
3. Generating Extreme Observations Output
The pg1.eu_occ table includes monthly occupancy counts for European countries between
January 2004 and September 2017.
The SAS Output Delivery System (ODS) gives you options for controlling the type and format
of the output that is generated by SAS code. The ODS SELECT statement is used to specify
output objects for results. The ODS SELECT statement can be used to generate a report
containing only the Extreme Observations output.
Note: To specify an output object, you need to know which output objects your SAS program
produces. The ODS TRACE statement writes to the SAS log a trace record that includes
the path, the label, and other information about each output object that your SAS
program produces. You can find documentation about the ODS TRACE and ODS
SELECT statements in the SAS Help Facility and in the online documentation.
a. Create a new program. Write a PROC UNIVARIATE step to examine Camp in the
pg1.eu_occ table.
b. Add the ODS TRACE statements before and after PROC UNIVARIATE as follows.
ods trace on;
proc univariate data=pg1.eu_occ;
var camp;
run;
ods trace off;
c. Submit the program and notice the trace information in the SAS log. Determine the name
of the Extreme Observations output object.
d. Delete the ODS TRACE statements. Add an ODS SELECT statement immediately before
the PROC UNIVARIATE step and provide the name of the Extreme Observations output
object.
Note: This method can be used with other procedures that create multiple tables (such as
PROC CONTENTS) to select a portion of the output.
e. Using the SAS documentation or the syntax Help in the editor, identify the option that
specifies the number of extreme observations that are listed in the table. Use the option
to change the number of extreme observations from five to 10. Submit the program.
3.2 Filtering Rows
PROC procedure-name . . . ;
    WHERE expression;
RUN;
The WHERE statement filters rows in the results based on the expression. If the expression is true,
the row is included in the results.
Comparison operators:
Symbol          Mnemonic
=               EQ
^= or ~=        NE
>               GT
<               LT
>=              GE
<=              LE
Examples:
Type = "SUV"
Type EQ "SUV"
MSRP <= 30000
MSRP LE 30000
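The symbol and mnemonic forms of each operator are interchangeable. The following is a minimal sketch that assumes the sashelp.cars sample table used elsewhere in this lesson; the cutoff value 20000 is chosen only for illustration.

```sas
/* Mnemonic form: LE is equivalent to <= */
proc print data=sashelp.cars;
    var Make Model Type MSRP;
    where MSRP le 20000;
run;

/* Mnemonic form: NE is equivalent to ^= or ~= */
proc print data=sashelp.cars;
    var Make Model Type MSRP;
    where Type ne "Sedan";
run;
```

Either form can be used; choosing one style and using it consistently tends to make a program easier to read.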
Character values are case sensitive and must be enclosed in double or single quotation marks.
Type = "SUV"
Type = 'Wagon'
Numeric values must be standard numeric (that is, no symbols).
MSRP <= 30000
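To see the effect of case sensitivity, the two steps below differ only in the case of the quoted value. This is a sketch that assumes the sashelp.cars sample table, where the value is stored as uppercase SUV.

```sas
/* Matches the stored value exactly, so rows are selected */
proc print data=sashelp.cars;
    var Make Model Type;
    where Type = "SUV";
run;

/* Lowercase does not match the stored value, so no rows are selected */
proc print data=sashelp.cars;
    var Make Model Type;
    where Type = "suv";
run;
```

Checking the log after the second step is a quick way to confirm that a filter returned fewer rows than expected.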
Combining Expressions
proc print data=sashelp.cars;
var Make Model Type MSRP MPG_City MPG_Highway;
where Type="SUV" and MSRP <= 30000;
run;
Expressions can be combined with AND or OR.
Scenario
Use the WHERE statement and basic operators to subset rows in a procedure.
Files
• p103d02.sas
• storm_summary – a SAS table that contains one row per storm for the 1980 through 2016 storm
seasons
Syntax
WHERE expression;
Basic Operators:
=, EQ
^= , ~= , NE
> , GT
< , LT
>= , GE
<= , LE
IN(value1, …, valuen)
Notes
• The WHERE statement is used to filter rows. If the expression is true, rows are read.
If the expression is false, they are not.
• Character values are case sensitive and must be enclosed in quotation marks.
• Numeric values are not in quotation marks and must include only digits, decimal points,
and negative signs.
• Compound conditions can be created with AND or OR.
• The logic of an operator can be reversed with the NOT keyword.
• When an expression includes a fixed date value, use the SAS date constant syntax:
"ddmmmyyyy"d.
− dd represents a one- or two-digit day
− mmm represents a three-letter month in uppercase, lowercase, or mixed case
− yyyy represents a two- or four-digit year
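The notes above can be sketched with sample tables. This example assumes that sashelp.cars and sashelp.stocks are available; the specific values (25000, IBM, and the date) are illustrative only.

```sas
/* IN operator combined with AND */
proc print data=sashelp.cars;
    var Make Model Type MSRP;
    where Type in ("SUV", "Wagon") and MSRP < 25000;
run;

/* NOT reverses the logic of the IN operator */
proc print data=sashelp.cars;
    var Make Model Type MSRP;
    where Type not in ("SUV", "Wagon") and MSRP < 15000;
run;

/* SAS date constant: quoted ddmmmyyyy value followed by the letter d */
proc print data=sashelp.stocks;
    var Stock Date Close;
    where Stock = "IBM" and Date >= "01jan2005"d;
run;
```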
Demo
1. Open p103d02.sas from the demos folder and find the Demo section of the program.
Write a PROC PRINT step to list the data in pg1.storm_summary.
2. Write a WHERE statement to include rows with MaxWindMPH values greater than or equal to
156 (Category 5 storms). Highlight the PROC PRINT step and run the selected code.
where MaxWindMPH >= 156;
3. Modify the WHERE statement for each of the conditions below. Highlight the PROC PRINT step
and run the selected code after each condition.
a. Basin equal to WP (West Pacific)
where Basin = "WP";
b. Basin equal to SI or NI (South Indian or North Indian)
where Basin in ("SI" "NI");
c. StartDate on or after January 1, 2010
where StartDate >= "01jan2010"d;
d. Type equal to TS (tropical storm) and Hem_EW equal to W (west)
where Type = "TS" and Hem_EW = "W";
e. MaxWindMPH greater than 156 or MinPressure less than 920
where MaxWindMPH > 156 or MinPressure < 920;
4. In the final WHERE statement, are missing values included for MinPressure? How can you
exclude missing values?
where MaxWindMPH>156 or 0<MinPressure<920;
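SAS treats a missing numeric value as smaller than any number, which is why the condition MinPressure < 920 is true for rows where MinPressure is missing. The sketch below shows one alternative way to write the same filter; it assumes the pg1 library from the course files is assigned, and the parentheses are added only for readability.

```sas
/* Alternative to 0<MinPressure<920: test for missing explicitly */
proc print data=pg1.storm_summary;
    where MaxWindMPH > 156 or
          (MinPressure is not missing and MinPressure < 920);
run;
```

The explicit test can be easier to read than a range comparison when the lower bound has no meaning of its own.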
IS NULL is another special operator that can be used with DBMS data. It distinguishes between null
and missing values. IS NULL and IS MISSING are the same when they are used with a SAS table.
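As an illustration of these operators on a SAS table, the sketch below assumes the sashelp.heart sample table, in which AgeAtDeath is missing for some rows. The table and column are not part of the course files and are used here only as an example.

```sas
/* Rows where the value is missing */
proc print data=sashelp.heart;
    var Status Sex AgeAtStart AgeAtDeath;
    where AgeAtDeath is missing;
run;

/* Rows where the value is present; IS NOT NULL behaves the same on a SAS table */
proc print data=sashelp.heart;
    var Status Sex AgeAtStart AgeAtDeath;
    where AgeAtDeath is not null;
run;
```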
3.02 Activity
Open p103a02.sas from the activities folder and perform the following tasks:
1. Uncomment each WHERE statement one at a time and run the step to
observe the rows that are included in the results.
2. Comment all previous WHERE statements. Add a new WHERE statement
to print storms that begin with Z. How many storms are included in the
results?
Suppose you have a program with multiple procedures, and you want to filter each where the value
of Type is Wagon. After you look at the results, you decide that you want similar reports where
Type=SUV and Type=Sedan. Find and replace is an option, but it would be preferable to change
that repeating value in one place.
A SAS macro variable stores text (for example, Wagon, Sedan, or SUV) that is substituted in your
code when it runs. It's like an automatic find-and-replace.
The SAS macro language enables you to design dynamic programs that you can easily update
or modify. A macro variable enables you to store text that you want to use in your program.
It is recommended that you do not include quotation marks when you define the macro variable
value. Use quotation marks when necessary after the macro variable is resolved.
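The scenario described above might be sketched as follows. The macro variable name CarType is hypothetical, and the choice of PROC MEANS and PROC FREQ is only an example of a program with more than one procedure; sashelp.cars is assumed to be available.

```sas
/* Define the value once, without quotation marks */
%let CarType=Wagon;

proc means data=sashelp.cars;
    where Type="&CarType";    /* double quotation marks so that &CarType resolves */
    var MSRP MPG_Highway;
run;

proc freq data=sashelp.cars;
    where Type="&CarType";
    tables Origin Make;
run;
```

To produce the same reports for SUV or Sedan, only the %LET statement needs to change.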
Scenario
Modify a program to use SAS macro variables to filter data in multiple procedures.
Files
• p103d03.sas
• storm_summary – a SAS table that contains one row per storm for the 1980 through 2016 storm
seasons
Syntax
%LET macrovar=value;
WHERE numvar=&macrovar;
WHERE charvar="&macrovar";
WHERE datevar="&macrovar"d;
Notes
• A macro variable stores a text string that can be substituted into a SAS program.
• The %LET statement defines the macro variable name and assigns a value.
• Macro variable names must follow SAS naming rules.
• Macro variables can be referenced in a program by preceding the macro variable name with an &
(ampersand).
• If a macro variable reference is used inside quotation marks, double quotation marks must be
used.
Demo
1. Open p103d03.sas from the demos folder and find the Demo section of the program. Highlight
the demo program and run the selected code.
2. Write three %LET statements to create macro variables named WindSpeed, BasinCode, and
Date. Set the initial values of the variables to match the WHERE statement.
3. Modify the WHERE statement to reference the macro variables. Highlight the demo program and
run the selected code. Verify that the same results are produced.
%let WindSpeed=156;
%let BasinCode=NA;
%let Date=01JAN2000;
proc print data=pg1.storm_summary;
where MaxWindMPH>=&WindSpeed and Basin="&BasinCode" and
StartDate>="&Date"d;
var Basin Name StartDate EndDate MaxWindMPH;
run;
4. Change the values of the macro variables to values that you select. Possible values for Basin
include NA, WP, SP, NI, and SI. Highlight the demo program and run the selected code.
3.03 Activity
Open p103a03.sas from the activities folder and perform the following tasks:
1. Change the value in the %LET statement from NA to SP.
2. Run the program and carefully read the log.
Which procedure did not produce a report?
What is different about the WHERE statement in that step?
Practice
If you restarted your SAS session, open and submit the libname.sas program in the course files.
Level 1
4. Filtering Rows in a Listing Report Using Character Data
The pg1.np_summary table contains public use statistics from the National Park Service.
The park type codes are inconsistent for national preserves. Examine these inconsistencies
by producing a report that lists any national preserve.
a. Open p103p04.sas from the practices folder. Add a WHERE statement to print only the
rows where ParkName includes Preserve.
Note: ParkName contains character values. These values are case sensitive.
b. Submit the program and view the results. Which codes are used for preserves?
Note: If you use double quotation marks in the WHERE statement, you receive a warning
in the log. To eliminate the warning, use single quotation marks.
Level 2
6. Using Macro Variables to Subset Data in Procedures
a. Create a new program. Write a PROC FREQ step to analyze rows from pg1.np_species.
Include only rows where Species_ID starts with YOSE (Yosemite National Park) and
Category equals Mammal. Generate frequency tables for Abundance and
Conservation_Status.
b. Write a PROC PRINT step to list the same subset of rows from pg1.np_species. Include
Species_ID, Category, Scientific_Name, and Common_Names in the report. Run the
program.
c. Create a macro variable named ParkCode to store YOSE, and another macro variable
named SpeciesCat to store Mammal. Modify the code to reference the macro variables.
Run the program and confirm that the same results are generated.
Note: The macro variable values are case sensitive when they are used in a WHERE
statement.
d. Change the values of the macro variables to ZION (Zion National Park) and Bird. Run the
program.
Challenge
7. Eliminating Case Sensitivity in WHERE Conditions
Character comparisons in a WHERE statement are case sensitive. Use SAS functions to make
comparisons case insensitive.
a. Open pg1.np_traffic. Notice that the case of Location values is inconsistent.
b. Create a new program. Write a PROC PRINT step that lists ParkName, Location, and
Count. Print rows where Count is not equal to 0 and Location includes MAIN ENTRANCE.
Submit the program. Use the log to confirm that 38 rows are listed.
Note: If you use double quotation marks in the WHERE statement, you receive a warning
in the log. To eliminate the warning, use single quotation marks.
c. The UPCASE function can be used to eliminate case sensitivity in character WHERE
expressions. Use the UPCASE function on the Location column to include any case of
MAIN ENTRANCE. Run the program and verify that 40 rows are listed.
UPCASE(column)
Note: The UPCASE function in a WHERE statement does not permanently convert the
values of the column to uppercase.
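The general pattern can be sketched on a different table so that the practice remains an exercise. This example assumes the sashelp.cars sample table, where the value is stored in mixed case as Sedan.

```sas
/* Compare the uppercased column value with an uppercase literal */
proc print data=sashelp.cars;
    var Make Model Type;
    where upcase(Type) = "SEDAN";
run;
```

The literal on the right side must be written in uppercase, because UPCASE changes only the column value that is being compared.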
3.3 Formatting Columns
$w.
w.d
Changing how values appear makes it easier to interpret them.
3.04 Activity
1. Go to support.sas.com/documentation. Click 9.4 after SAS Language
Elements by Name, Product, and Category.
2. Expand the Formats section and click Alphabetical Listing.
3. What does the Zw.d format do?
Scenario
Use the FORMAT statement in a procedure to display data values as dates and currency.
Files
• p103d04.sas
• storm_damage – a SAS table that contains a description and damage estimates for storms in the
US with damages greater than one billion dollars
Syntax
<$>format-name<w>.<d>
Notes
• Formats are used to change how values are displayed in data and reports.
• Formats do not change the underlying data values.
• Formats can be applied in a procedure using the FORMAT statement.
• Visit SAS Language Elements documentation to access a list of available SAS formats.
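One way to see that formats do not change the underlying values is to filter on the raw value while displaying the formatted one. The sketch below assumes the sashelp.stocks sample table; the widths are chosen for illustration.

```sas
proc print data=sashelp.stocks;
    var Stock Date Close Volume;
    /* The WHERE statement compares stored values, not formatted text */
    where Stock = "IBM" and Date >= "01jan2005"d;
    /* The FORMAT statement changes only how the values are displayed */
    format Date mmddyy10. Close dollar10.2 Volume comma12.;
run;
```

The FORMAT statement in a procedure applies only to that step; the table itself is unchanged.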
Demo
1. Open p103d04.sas from the demos folder and find the Demo section of the program. Write
a PROC PRINT step to list the data in pg1.storm_damage. Highlight the step and run the
selected code.
2. Add a FORMAT statement to apply the MMDDYY10. format to Date, DOLLAR16. to Cost, and
COMMA5. to Deaths. Highlight the step and run the selected code.
proc print data=pg1.storm_damage;
format Date mmddyy10. Cost dollar16. Deaths comma5.;
run;
3. Change the width of MMDDYY to 8 and DOLLAR to 14. Highlight the step and run the selected
code. Change MMDDYY to 6 and DOLLAR to 10. Highlight the step and run the selected code
again. What happens to the formatted values?
3.05 Activity
Open p103a05.sas from the activities folder and perform the following tasks:
1. Highlight the PROC PRINT step and run the selected code. Notice how
the values of Lat, Lon, StartDate, and EndDate are displayed in the report.
2. Change the width of the DATE format to 7 and run the PROC PRINT step.
How does the display of StartDate and EndDate change?
3. Change the width of the DATE format to 11 and run the PROC PRINT
step. How does the display of StartDate and EndDate change?
4. Highlight the PROC FREQ step and run the selected code. Notice that the
report includes the number of storms for each StartDate.
5. Add a FORMAT statement to apply the MONNAME. format to StartDate
and run the PROC FREQ step. How many rows are in the report?
3.4 Sorting Data and Removing Duplicates
Sorting Data
PROC SORT DATA=input-table <OUT=output-table>;
    BY <DESCENDING> col-name(s);
RUN;
proc sort data=pg1.class_test2 out=test_sort;
by Name;
run;
The rows are sorted in ascending order by Name.
proc sort data=pg1.class_test2 out=test_sort;
by Name TestScore;
run;
The rows are sorted in ascending order by Name and then, within Name, by ascending TestScore.
proc sort data=pg1.class_test2 out=test_sort;
by Subject descending TestScore;
run;
The rows are sorted in ascending order by Subject and then, within Subject, by descending TestScore.
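The same pattern can be tried on a table that ships with SAS. This sketch assumes the sashelp.class sample table; the output table name class_sort is hypothetical.

```sas
/* Sort by Sex (ascending), then by Height from tallest to shortest within Sex */
proc sort data=sashelp.class out=class_sort;
    by Sex descending Height;
run;

/* List the sorted copy; the input table is left unchanged because OUT= is used */
proc print data=class_sort;
    var Name Sex Height;
run;
```

The DESCENDING keyword applies only to the column that immediately follows it.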
3.06 Activity
Open p103a06.sas from the activities folder and perform the following tasks:
1. Modify the OUT= option in the PROC SORT statement to create
a temporary table named storm_sort.
2. Complete the WHERE and BY statements to answer the following
question: Which storm in the North Atlantic basin (NA or na) had
the strongest MaxWindMPH?
The NODUPKEY option keeps only the first occurrence of each unique value of the BY variable.
This removes duplicate values of the column listed in the BY statement.
Scenario
Use the NODUPRECS and NODUPKEY options in PROC SORT to identify and remove duplicates.
Files
• p103d05.sas
• storm_detail – a SAS table that contains multiple rows per storm for the 1980 through 2016 storm
seasons. Each row represents one measurement for each six hours of a storm.
Syntax
Remove duplicate rows:
Notes
• The NODUPRECS option removes adjacent rows that are entirely duplicated.
• The DUPOUT= option creates an output table that contains the duplicate rows that were removed.
• Using _ALL_ in the BY statement sorts by all columns and ensures that duplicate rows are
adjacent in the sorted table and are removed.
• The NODUPKEY option keeps only the first row for each unique value of the column or columns
listed in the BY statement.
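The notes above can be sketched on a small sample table. This example assumes sashelp.class is available; the output table names first_per_sex and removed_rows are hypothetical.

```sas
/* NODUPKEY keeps the first row for each distinct value of Sex.    */
/* DUPOUT= collects the rows that were removed from the OUT= table. */
proc sort data=sashelp.class out=first_per_sex
          nodupkey dupout=removed_rows;
    by Sex;
run;

proc print data=first_per_sex;
run;
```

After the step runs, first_per_sex holds one row per value of Sex and removed_rows holds all the other rows, so no input rows are lost.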
Demo
1. Open p103d05.sas from the demos folder and find the Demo section of the program. Modify the
first PROC SORT step to sort by all columns and remove any duplicate rows. Write the removed
rows to a table named storm_dups. Highlight the step and run the selected code. Confirm that
there are 107,821 rows in storm_clean and 214 rows in storm_dups.
proc sort data=pg1.storm_detail out=storm_clean
noduprecs dupout=storm_dups;
by _all_;
run;
2. The second PROC SORT step is filtering for nonmissing values of Name and Pressure and then
sorting by descending Season, Basin, Name, and Pressure. Run the second PROC SORT step
and confirm that the first row for each storm represents the minimum value of Pressure.
Note: Because storm names can be reused in multiple years and basins, unique storms are
grouped by sorting by Season, Basin, and Name.
3. Modify the third PROC SORT step to sort the min_pressure table from the previous PROC
SORT step, and keep the first row for each storm. You do not need to keep the removed
duplicates. Highlight the step and run the selected code.
proc sort data=min_pressure nodupkey;
by descending Season Basin Name;
run;
Links
• Visit the SAS 9.4 Procedures Help page.
• Browse or ask questions in the SAS Procedures community and see responses from other SAS
programmers.
• Take the SAS Macro 1 course.
• Read the SAS Macro Programming Made Easy book.
• Learn about PROC FORMAT in SAS Help.
• Take the SAS Programming 2 course.
Practice
If you restarted your SAS session, open and submit the libname.sas program in the course files.
Level 1
8. Sorting Data and Creating an Output Table
Create the np_sort table that contains data for national parks ordered by regional code
and decreasing numbers of daily visitors.
a. Open p103p08.sas from the practices folder. Modify the PROC SORT step to read
pg1.np_summary and create a temporary sorted table named np_sort.
b. Add a BY statement to order the data by Reg and descending DayVisits.
c. Add a WHERE statement to select Type equal to NP. Submit the program.
Level 2
9. Sorting Data to Remove Duplicate Rows
The pg1.np_largeparks table contains gross acreage for large national parks. There are
duplicate rows for some locations.
a. Open and review the pg1.np_largeparks table. Notice that there are exact duplicate rows
for some parks.
b. Create a new program. Write a PROC SORT step that creates two tables (park_clean and
park_dups), and removes the duplicate rows. Submit the program.
Challenge
10. Creating a Lookup Table from a Detailed Table
The pg1.eu_occ table includes multiple rows from each country code and country name.
Create a lookup table that includes a single row for each country code and name.
a. Create a new program. Write a PROC SORT step to sort pg1.eu_occ and create an output
table named countrylist. Remove duplicate key values. Sort by Geo and then Country.
b. To read only Geo and Country from the pg1.eu_occ table, you can use the KEEP= data set
option. Add the KEEP= option immediately after the input table and list Geo and Country.
data-set (KEEP=varlist)
c. Run the program and verify that only one row per country is included.
3.5 Solutions
Solutions to Practices
1. Exploring Data with Procedures
/*Parts A and B*/
/*list first 20 rows*/
proc print data=pg1.np_summary(obs=20);
var Reg Type ParkName DayVisits TentCampers RVCampers;
run;
/*Part C*/
/*calculate summary statistics*/
proc means data=pg1.np_summary;
var DayVisits TentCampers RVCampers;
run;
/*Part D*/
/*examine extreme values*/
proc univariate data=pg1.np_summary;
var DayVisits TentCampers RVCampers;
run;
/*Part E*/
/*list unique values and frequency counts*/
proc freq data=pg1.np_summary;
tables Reg Type;
run;
b. Do you observe any possible inconsistencies in the data?
Yes. The Type column has inconsistencies. Notice that national preserve locations
have the code PRES and PRESERVE.
c. What is the minimum value for tent campers? Is that value unexpected?
The minimum value is zero. No, because it is possible that a park had zero tent
campers.
d. Are there negative values for any of the columns?
No
e. Are there any lowercase codes? Are there any codes that occur only once in the table?
There are no lowercase codes. NC, NPRE, and RIVERWAYS occur once in the table.
*Part B;
proc univariate data=pg1.np_summary;
var Acres;
run;
a. What invalid values exist for Reg? None
What invalid values exist for Type? NPRE, PRESERVE, RIVERWAYS
c. What are the smallest and largest parks? Observation 78 (African Burial Ground
Monument, .35 acres) and observation 6 (Noatak National Preserve, 6,587,071.39
acres)
3. Generating Extreme Observations Output
*Part A and B;
ods trace on;
proc univariate data=pg1.eu_occ;
var camp;
run;
ods trace off;
*Part D and E;
ods select extremeobs;
proc univariate data=pg1.eu_occ nextrobs=10;
var camp;
run;
4. Filtering Rows in a Listing Report Using Character Data
proc print data=pg1.np_summary;
var Type ParkName;
where ParkName like '%Preserve%';
run;
5. Creating a Listing Report for Missing Data
*Part A;
proc print data=pg1.eu_occ;
where Hotel is missing and ShortStay is missing and
Camp is missing;
run;
*Part B;
proc print data=pg1.eu_occ;
where Hotel > 40000000;
run;
a. How many rows are included? 101
b. Which months are included in the report? The months are July or August.
6. Using Macro Variables to Subset Data in Procedures
%let ParkCode=ZION;
%let SpeciesCat=Bird;
a. BY
b. ID
c. SUM
d. VAR
NOTE: There were 24 observations read from the data set PG1.STORM_SUMMARY.
      WHERE name like 'Z%';
3.05 Activity – Correct Answer (continued)
2. Change the width of the DATE format to 7 and run the PROC PRINT step.
How does the display of StartDate and EndDate change?
3. Change the width of the DATE format to 11 and run the PROC PRINT
step. How does the display of StartDate and EndDate change?
The new report has 12 rows. Formats are an easy way to group data in procedures!
Lesson 4 Preparing Data
4.1 Reading and Filtering Data
Practice
The SAS programming process: access data, explore data, prepare data, analyze and report on data,
export results.
After you explore your data, you likely want to make some adjustments based on what you find and
what you need. This is where the DATA step really shines. In this lesson, you learn various ways to
subset data, and you use expressions and functions to compute new columns. You also learn how
to use conditional processing to obtain the results that you want in your output data.
DATA Step
The DATA step is a robust, yet simple programming tool that can do everything from simple querying
to providing structure to messy weblogs. Although you do not see everything that the DATA step can
do in this class, you become familiar with the most common data manipulation actions, such as
filtering rows and columns, computing new columns, and performing conditional processing. Beyond
these features, the DATA step also enables you to merge or join tables, read complex raw data, and
perform repetitive processing with DO loops or arrays. These topics and many others are covered in
SAS Programming 2: Data Manipulation Techniques and other advanced programming courses.
DATA output-table;
    SET input-table;
RUN;
The DATA statement specifies the table to create. The SET statement specifies the table to read.
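In its simplest form, a DATA step with only a SET statement copies a table. The sketch below assumes the sashelp.cars sample table; the output table name cars_copy is hypothetical.

```sas
/* Create a temporary copy of sashelp.cars in the Work library */
data work.cars_copy;
    set sashelp.cars;
run;
```

Because no other statements are present, every row and column of the input table is written to the output table.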
What happens behind the scenes when a DATA step runs?
Compilation
• Check syntax for errors.
• Identify column attributes.
• Establish new table metadata.
Execution
• Read and write data.
• Perform data manipulations, calculations, and so on.
Execution
data myclass;
    set sashelp.class;
    ...other statements...
run;
1) Read a row from the input table.
2) Sequentially process statements.
3) At the end, write the row to the output table.
4) Loop back to the top of the DATA step to read the next row from the input table.
Automatic looping makes processing data easy!
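The four execution steps above can be observed in the log. The following is a minimal sketch, assuming that the sashelp.class sample table is available; the output table name class_trace is only illustrative. It uses the automatic variable _N_, which counts DATA step iterations, to write one log line per row that is read.

```sas
data class_trace;
    set sashelp.class;
    /* _N_ is an automatic variable: 1 on the first iteration, 2 on the second, and so on */
    putlog "Iteration " _N_ "read the row for " Name;
run;
```

The log should show one line per input row, which suggests that the statements between SET and RUN execute once for each row without any explicit loop in the code.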
4.01 Activity
Open p104a01.sas from the activities folder and perform the following tasks:
1. Complete the DATA step to create a temporary table named storm_new
and read pg1.storm_summary. Run the program and read the log.
2. Define a library named out pointing to the output folder in the main
course files folder.
3. Change the program to save a permanent version of storm_new
in the out library. Run the modified program.
DATA output-table;
    SET input-table;
    WHERE expression;
RUN;
The WHERE statement filters rows based on the expression. The DATA step reads rows only from the input table where the expression is true.
p104d01
data myclass;
set sashelp.class;
where age >= 15;
run;
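A WHERE expression can also combine several comparisons. The following is a minimal sketch, assuming the sashelp.class sample table; the output table name teen_girls is only illustrative.

```sas
data teen_girls;
    set sashelp.class;
    /* Both comparisons must be true for a row to be read. */
    /* Character values are case sensitive, so "F" is not the same as "f". */
    where age >= 13 and sex = "F";
run;
```

After the step runs, the log note reports how many rows were read from the input table under the WHERE condition, which is a quick way to check the filter.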
data myclass;
    set sashelp.class;
    keep name age height;    /* or: drop sex weight; */
run;
The KEEP statement and the DROP statement shown here have the same result in the output table.
4.03 Activity
Modify the program that you opened in the previous activity or open
p104a03.sas from the activities folder and perform the following tasks:
1. Change the name of the output table to storm_cat5.
2. Include only Category 5 storms (MaxWindMPH greater than or equal
to 156) with StartDate on or after 01JAN2000.
3. Add a statement to include the following columns in the output data:
Season, Basin, Name, Type, and MaxWindMPH. How many Category 5
storms occurred since January 1, 2000?
DATA output-table;
    SET input-table;
    FORMAT col-name format;
RUN;
col-name is the name of the column that you want to format, and format is the name of the format that you want to apply. Formats in the DATA step are permanently assigned to the columns.
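To see that a format assigned in the DATA step is stored with the table, you can inspect the table's descriptor afterward. The following is a minimal sketch, assuming the sashelp.class sample table; the table name class_fmt is only illustrative.

```sas
data class_fmt;
    set sashelp.class;
    /* w.d format: a total width of 5 with one decimal place */
    format Height Weight 5.1;
run;

/* The variable list in the PROC CONTENTS report includes a Format column */
proc contents data=class_fmt;
run;
```

Any later procedure that reads class_fmt displays Height and Weight with this format unless another FORMAT statement overrides it. The stored values themselves are not changed.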
Practice
If you restarted your SAS session, open and submit the libname.sas program in the course files.
Level 1
1. Creating a SAS Table
The pg1.eu_occ SAS table contains monthly occupancy rates for European countries from
January 2004 through September 2017.
a. Open the pg1.eu_occ table and examine the column names and values.
b. Open p104p01.sas from the practices folder. Modify the code to create a temporary table
named eu_occ2016 and read pg1.eu_occ.
c. Complete the WHERE statement to select only the stays that were reported in 2016. Notice
that YearMon is a character column and the first four positions represent the year.
d. Complete the FORMAT statement in the DATA step to apply the COMMA17. format to the
Hotel, ShortStay, and Camp columns.
e. Complete the DROP statement to exclude Geo from the output table.
Level 2
2. Creating a Permanent SAS Table
The np_species table includes one row for each species that is found in each national park.
a. Create a new program. Write a DATA step to read the pg1.np_species table and create a
new permanent table named fox. Write the new table to the output folder.
b. Include only the rows where Category is Mammal and Common_Names includes Fox.
c. Exclude the Category, Record_Status, Occurrence, and Nativeness columns. Run the
program.
d. Notice that Fox Squirrels are included in the output table. Add a condition in the WHERE
statement to exclude rows that include Squirrel.
Challenge
3. Creating a SAS Table Using Macro Variables
The np_species table includes one row for each species that is found in each national park.
a. Write a new program that creates a temporary table named Mammal that includes only the
mammals from the pg1.np_species table. Do not include Abundance, Seasonality, or
Conservation_Status in the output table.
b. Use PROC FREQ to determine how many species there are for each unique value of
Record_Status.
c. Modify the program to use a macro variable to change Mammal to other values of Category.
Change the macro variable value to Bird and run the program.
Note: Use PROC FREQ to determine the unique values of Category.
4.2 Computing New Columns
DATA output-table;
    SET input-table;
    new-column = expression;
RUN;
The line new-column = expression; is an assignment statement. The expression can be an arithmetic expression or a constant. The assignment statement can create or update a column.
p104d02
Files
• p104d02.sas
• storm_summary – a SAS table that contains one row per storm for the 1980 through 2016 storm
seasons
Syntax
DATA output-table;
SET input-table;
new-column = expression;
RUN;
Notes
• The name of the column to be created or updated is listed on the left side of the equal sign.
• Provide an expression on the right side of the equal sign.
• SAS automatically defines the required attributes (name, type, and length) if the column is new.
• A new numeric column has a length of 8.
• The length of a new character column is determined based on the length of the assigned string.
• Character strings must be enclosed in quotation marks and are case sensitive.
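The notes above can be illustrated with a short sketch. It assumes the sashelp.class sample table and assumes that Height is recorded in inches and Weight in pounds; the table name class_metric and the columns HeightCM and Units are hypothetical.

```sas
data class_metric;
    set sashelp.class;
    HeightCM = Height*2.54;        /* new numeric column, created with a length of 8 */
    Weight   = Weight*0.453592;    /* existing column: the assignment updates its values */
    Units    = "metric";           /* new character column, length 6 from the assigned string */
    format HeightCM Weight 6.1;
run;
```

HeightCM and Units are new, so SAS defines their attributes from the assignment statements. Weight already exists in the input table, so only its values change.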
Demo
1. Open p104d02.sas from the demos folder and find the Demo section of the program. Add
an assignment statement to create a numeric column named MaxWindKM by multiplying
MaxWindMPH by 1.60934.
2. Add a FORMAT statement to round MaxWindKM to the nearest whole number.
3. Add an assignment statement to create a new character column named StormType that is equal
to Tropical Storm. Highlight the DATA step and run the selected code.
data tropical_storm;
set pg1.storm_summary;
drop Hem_EW Hem_NS Lat Lon;
where Type="TS";
*Add assignment and FORMAT statements;
MaxWindKM=MaxWindMPH*1.60934;
format MaxWindKM 3.;
StormType="Tropical Storm";
run;
4.04 Activity
Open p104a04.sas from the activities folder and perform the following tasks:
1. Add an assignment statement to create StormLength that represents
the number of days between StartDate and EndDate.
2. Run the program. In 1980, how long did the storm named Agatha last?
data storm_length;
set pg1.storm_summary;
drop Hem_EW Hem_NS Lat Lon;
*Add assignment statement;
run;
Functions
A function is a routine that returns a value.
DATA output-table;
    SET input-table;
    new-column=function(arguments);
RUN;
Numeric Functions
SUM(num1, num2, ...)
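One practical difference between the SUM function and the + operator is how each handles missing values. The following is a minimal sketch that uses made-up values in a DATA _NULL_ step, so no input table is needed; the column names are hypothetical.

```sas
data _null_;
    a = 10;
    b = .;      /* a missing numeric value */
    c = 5;
    Total_Fn = sum(a, b, c);   /* SUM ignores the missing value: 15 */
    Total_Op = a + b + c;      /* the + operator returns missing if any operand is missing */
    putlog Total_Fn= Total_Op=;
run;
```

The log should show Total_Fn=15 and Total_Op=. along with a note that missing values were generated. Descriptive statistic functions such as MEAN treat missing arguments the same way as SUM.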
data cars_new;
set sashelp.cars;
MPG_Mean=mean(MPG_City, MPG_Highway);
format MPG_Mean 4.1;
keep Make Model MPG_City MPG_Highway MPG_Mean;
run;
p104d03
4.05 Activity
Open p104a05.sas from the activities folder and perform the following tasks:
1. Open the pg1.storm_range table and examine the columns. Notice that
each storm has four wind speed measurements.
2. Create a new column named WindAvg that is the mean of Wind1, Wind2,
Wind3, and Wind4.
3. Create a new column WindRange that is the range of Wind1, Wind2,
Wind3, and Wind4.
data storm_windavg;
set pg1.storm_range;
*Add assignment statements;
run;
Character Functions
The default delimiters for the PROPCASE function are a blank, forward slash, hyphen, open
parenthesis, period, and tab. To use a different list of delimiters, specify a list of characters in a
single set of quotation marks as the second argument in the function.
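The effect of the second argument can be sketched as follows. The sample name and the column names DefaultCase and CustomCase are made up for illustration, and the step uses DATA _NULL_ so that no table is created.

```sas
data _null_;
    Name = "o'brien-smith";
    /* Default delimiters do not include the apostrophe: O'brien-Smith */
    DefaultCase = propcase(Name);
    /* Delimiter list of blank, apostrophe, and hyphen: O'Brien-Smith */
    CustomCase = propcase(Name, " '-");
    putlog DefaultCase= CustomCase=;
run;
```

Because the second argument replaces the default list rather than adding to it, the blank and hyphen are repeated in the custom list here.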
data cars_new;
    set sashelp.cars;
    Type=upcase(Type);
    keep Make Model Type;
run;
Type is an existing column.
Files
• p104d03.sas
• storm_summary – a SAS table that contains one row per storm for the 1980 through 2016 storm
seasons
Syntax
UPCASE(char)
PROPCASE(char, <delimiters>)
CATS(char1, char2, ...)
SUBSTR(char, position, <length>)
Notes
• The UPCASE function converts character values to uppercase.
• The PROPCASE function changes the first letter of each word to uppercase and other letters
to lowercase.
• The CATS function concatenates character values and removes any leading or trailing blanks.
• The SUBSTR function extracts a string from a character value.
Demo
1. Open p104d03.sas from the demos folder and find the Demo section of the program.
Add an assignment statement to convert Basin to all uppercase letters using the UPCASE
function.
2. Add an assignment statement to convert Name to proper case using the PROPCASE function.
3. Add an assignment statement to create Hemisphere, which concatenates Hem_NS and
Hem_EW using the CATS function.
4. Add an assignment statement to create Ocean, which extracts the second letter of Basin using
the SUBSTR function. Highlight the DATA step and run the selected code.
data storm_new;
set pg1.storm_summary;
drop Type Hem_EW Hem_NS MinPressure Lat Lon;
*Add assignment statements;
Basin=upcase(Basin);
Name=propcase(Name);
Hemisphere=cats(Hem_NS, Hem_EW);
Ocean=substr(Basin,2,1);
run;
4.06 Activity
Open p104a06.sas from the activities folder and perform the following tasks:
1. Add a WHERE statement that uses the SUBSTR function to include rows
where the second letter of Basin is P (Pacific Ocean storms).
2. Run the program and view the log and data. How many storms were in
the Pacific basin?
data pacific;
set pg1.storm_summary;
drop Type Hem_EW Hem_NS MinPressure Lat Lon;
*Add a WHERE statement that uses the SUBSTR function;
run;
Date Functions
The optional third argument in the YRDIF function is called the basis. The basis value describes how
SAS calculates a date difference or a person’s age. When calculating the age of a person or event,
'AGE' should be used as the basis. Visit the SAS documentation for the YRDIF function to learn
about other values for the basis.
Files
• p104d04.sas
• storm_damage – a SAS table that contains a description and damage estimates for storms in the
US with damages greater than one billion dollars
Syntax
YEAR(SAS-date)
MONTH(SAS-date)
DAY(SAS-date)
WEEKDAY(SAS-date)
TODAY()
MDY(month, day, year)
YRDIF(startdate, enddate, 'AGE')
Notes
• The YEAR, MONTH, DAY, and WEEKDAY functions return a numeric value. For WEEKDAY, 1
represents Sunday.
• The TODAY function returns the current date based on the system clock as a SAS date value.
• The MDY function creates a SAS date based on numeric month, day, and year values.
• The YRDIF function calculates a precise age between two dates. There are various values for the
third argument. However, 'AGE' should be used for accuracy.
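The date functions listed above can be tried on a fixed date constant, so the results do not depend on the day the code runs. The following is a minimal sketch in a DATA _NULL_ step; the dates and column names are made up for illustration.

```sas
data _null_;
    d  = '15MAR2020'd;                        /* SAS date constant */
    Yr = year(d);                             /* 2020 */
    Mo = month(d);                            /* 3 */
    Dy = day(d);                              /* 15 */
    Wd = weekday(d);                          /* 1, because 15MAR2020 was a Sunday */
    FirstOfMonth = mdy(Mo, 1, Yr);            /* SAS date for 01MAR2020 */
    Years = yrdif('15MAR2000'd, d, 'AGE');    /* 20 */
    putlog Yr= Mo= Dy= Wd= FirstOfMonth= date9. Years=;
run;
```

FirstOfMonth is stored as a number, so the DATE9. format in the PUTLOG statement is what makes it readable in the log.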
Demo
1. Open p104d04.sas from the demos folder and find the Demo section of the program. Create
the column YearsPassed and use the YRDIF function. The difference in years should be based
on each Date value and today’s date.
2. Create Anniversary as the day and month of each storm in the current year.
3. Format YearsPassed to round the value to one decimal place, and Date and Anniversary as
MM/DD/YYYY. Highlight the DATA step and run the selected code.
data storm_damage2;
set pg1.storm_damage;
drop Summary;
*Add assignment and FORMAT statements;
YearsPassed=yrdif(Date,today(),'age');
Anniversary=mdy(month(Date),day(Date),year(today()));
format YearsPassed 4.1 Date Anniversary mmddyy10.;
run;
Note: Values for YearsPassed and Anniversary will be different based on the current date.
Practice
If you restarted your SAS session, open and submit the libname.sas program in the course files.
Level 1
4. Creating New Columns
Create a new table named np_summary_update from pg1.np_summary. Create two new
columns: SqMiles and Camping.
a. Open p104p04.sas from the practices folder. Create a new column named SqMiles by
multiplying Acres by .0015625.
b. Create a new column named Camping as the sum of OtherCamping, TentCampers,
RVCampers, and BackcountryCampers.
c. Format SqMiles and Camping to include commas and zero decimal places.
d. Modify the KEEP statement to include the new columns. Run the program.
Level 2
5. Creating New Columns with Character and Date Functions
The pg1.eu_occ table contains individual columns for nights spent at hotels, short stay
accommodations, or camps for each year and month. The YearMon column is character.
a. Open a new program. Write a DATA step to create a temporary table named eu_occ_total
based on the pg1.eu_occ table. Create the following new columns:
• Year – the four-digit year extracted from YearMon.
• Month – the two-digit month extracted from YearMon.
• ReportDate – the first day of the reporting month.
Note: Use the MDY function and the new Year and Month columns.
• Total – the total nights spent at any establishment. Format the new column to display
the values with commas.
b. Format Hotel, ShortStay, Camp, and Total with commas. Format ReportDate to display
the values in the form JAN2018.
c. Keep Country, Hotel, ShortStay, Camp, ReportDate, and Total in the new table.
Challenge
6. Creating a New Column with the SCAN Function
a. Access SAS Help to learn about the SCAN function.
b. Create a new program. Create a new temporary table named np_summary2 based on the
pg1.np_summary table. Use the SCAN function to create a new column named ParkType
that is the last word of the ParkName column.
Note: Use a negative number for the second argument to count words from right to left
in the character string.
c. Keep Reg, Type, ParkName, and ParkType in the output table.
data cars2;
set sashelp.cars;
if MSRP<30000 then Cost_Group=1;
if MSRP>=30000 then Cost_Group=2;
keep Make Model Type MSRP Cost_Group;
run;
p104d05
4.3 Conditional Processing
Files
• p104d05.sas
• storm_summary – a SAS table that contains one row per storm for the 1980 through 2016 storm
seasons
Syntax
IF expression THEN statement;
Notes
• The expression following IF defines a condition that is evaluated as true or false for each row.
• If the condition is true, the statement following THEN is executed.
• Only one statement is permitted after THEN.
Demo
1. Open p104d05.sas from the demos folder and find the Demo section of the program.
Create a column named PressureGroup that is based on the following assignments:
MinPressure<=920 1
MinPressure>920 0
data storm_new;
set pg1.storm_summary;
keep Season Name Basin MinPressure PressureGroup;
*Add IF-THEN statements;
if MinPressure<=920 then PressureGroup=1;
if MinPressure>920 then PressureGroup=0;
run;
2. Highlight the DATA step, run the selected code, and examine the data. What value is assigned
to PressureGroup when MinPressure is missing?
3. Add a new IF-THEN statement before the existing IF-THEN statements to assign
PressureGroup=. if MinPressure is missing.
data storm_new;
set pg1.storm_summary;
keep Season Name Basin MinPressure PressureGroup;
*Add IF-THEN statements;
if MinPressure=. then PressureGroup=.;
if MinPressure<=920 then PressureGroup=1;
if MinPressure>920 then PressureGroup=0;
run;
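4. Highlight the DATA step and run the selected code. What value is assigned to PressureGroup?
When MinPressure is missing, the first two IF conditions are true, because SAS treats a missing numeric value as smaller than any number, so MinPressure<=920 is also true. The last assignment statement that executes determines the value of PressureGroup, so these rows end up with PressureGroup=1.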
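The comparison behavior behind this result can be checked in isolation. The following is a minimal sketch in a DATA _NULL_ step; the variable x is hypothetical and stands in for a missing MinPressure value.

```sas
data _null_;
    x = .;    /* missing numeric value */
    /* A missing numeric value compares as lower than any number */
    if x <= 920   then putlog "x<=920 is true for a missing value";
    if x > 920    then putlog "This line is not written";
    /* Testing for missing explicitly avoids the surprise */
    if missing(x) then putlog "x is missing";
run;
```

This is why separate IF-THEN statements can overwrite the value that was assigned for missing data. Each condition is tested on its own, regardless of which earlier conditions were true.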
data cars2;
set sashelp.cars;
if MSRP<20000 then Cost_Group=1;
else if MSRP<40000 then Cost_Group=2;
else if MSRP<60000 then Cost_Group=3;
else Cost_Group=4;
keep Make Model Type MSRP Cost_Group;
run;
p104d06
Example: MSRP=35000
data cars2;
    set sashelp.cars;
    if MSRP<20000 then Cost_Group=1;          /* false */
    else if MSRP<40000 then Cost_Group=2;     /* true: execute */
    else if MSRP<60000 then Cost_Group=3;     /* skip */
    else Cost_Group=4;                        /* skip */
    keep Make Model Type MSRP Cost_Group;
run;
Example: MSRP=75000
data cars2;
    set sashelp.cars;
    if MSRP<20000 then Cost_Group=1;          /* false */
    else if MSRP<40000 then Cost_Group=2;     /* false */
    else if MSRP<60000 then Cost_Group=3;     /* false */
    else Cost_Group=4;                        /* execute */
    keep Make Model Type MSRP Cost_Group;
run;
The final ELSE statement executes if all previous conditions were false.
4.07 Activity
Open p104a07.sas from the activities folder and perform the following tasks:
1. Add the ELSE keyword to test conditions sequentially until a true
condition is met.
2. Change the final IF-THEN statement to an ELSE statement.
3. How many storms are in PressureGroup 1?
LENGTH char-column $ length;
length is the number of bytes or characters.
data cars2;
    set sashelp.cars;
    length CarType $ 6;
    if MSRP<60000 then CarType="Basic";
    else CarType="Luxury";
    keep Make Model MSRP CarType;
run;
The LENGTH statement explicitly creates a new character column with a length of 6.
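For comparison, here is a sketch of the same step without the LENGTH statement. The table name cars_trunc is only illustrative. Because the length of a new character column is set by the first assignment statement that SAS encounters, CarType gets a length of 5 from "Basic".

```sas
data cars_trunc;
    set sashelp.cars;
    /* No LENGTH statement: the first assignment defines CarType with a length of 5 */
    if MSRP<60000 then CarType="Basic";
    /* "Luxury" needs 6 characters, so it is stored as "Luxur" */
    else CarType="Luxury";
    keep Make Model MSRP CarType;
run;
```

SAS does not report an error or warning for this truncation, so it is worth checking the values of any character column that is built from conditional assignments.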
4.08 Activity
Open p104a08.sas from the activities folder and perform the following tasks:
1. Run the program and examine the results. Why is Ocean truncated?
What value is assigned when Basin='na'?
2. Modify the program to add a LENGTH statement to declare the name,
type, and length of Ocean before the column is created.
data cars2;
set sashelp.cars;
if MPG_City>26 and MPG_Highway>30 then Efficiency=1;
else if MPG_City>20 and MPG_Highway>25 then Efficiency=2;
else Efficiency=3;
keep Make Model MPG_City MPG_Highway Efficiency;
run;
AND: Both conditions must be true.
OR: One condition must be true.
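The program above uses AND. As a contrast, the following sketch uses OR, so a car is flagged when either mileage figure exceeds its threshold. The table name cars_flag and the column Efficient are hypothetical.

```sas
data cars_flag;
    set sashelp.cars;
    /* OR: the condition is true if at least one comparison is true */
    if MPG_City>26 or MPG_Highway>30 then Efficient="Yes";
    else Efficient="No";
    keep Make Model MPG_City MPG_Highway Efficient;
run;
```

When AND and OR appear in the same expression, parentheses make the intended grouping explicit.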
data cars2;
set sashelp.cars;
length Cost_Type $ 4;
if MSRP<20000 then Cost_Group=1 and Cost_Type="Low";
else if MSRP<40000 then Cost_Group=2 and Cost_Type="Mid";
else Cost_Group=3 and Cost_Type="High";
run;
Compound statements are not allowed. This program doesn’t work because only one statement is permitted after THEN.
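One way to run more than one statement for a condition is to wrap the statements in a DO group that is closed with an END statement. The following is a sketch of how the program above could be rewritten; the KEEP statement is an addition for readability.

```sas
data cars2;
    set sashelp.cars;
    length Cost_Type $ 4;
    if MSRP<20000 then do;
        Cost_Group=1;
        Cost_Type="Low";
    end;
    else if MSRP<40000 then do;
        Cost_Group=2;
        Cost_Type="Mid";
    end;
    else do;
        Cost_Group=3;
        Cost_Type="High";
    end;
    keep Make Model MSRP Cost_Group Cost_Type;
run;
```

Each DO needs its own END. Indenting the statements inside each group makes a missing END easier to spot.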
4.09 Activity
Open p104a09.sas from the activities folder. Run the program. Why does the
program fail?
data girls boys;
set sashelp.class;
if sex="F" then do;
Gender="Female";
output girls;
else do;
Gender="Male";
output boys;
run;
Files
• p104d07.sas
• storm_summary – a SAS table that contains one row per storm for the 1980 through 2016 storm
seasons
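Syntax
IF expression THEN DO;
    executable statements
END;
ELSE IF expression THEN DO;
    executable statements
END;
ELSE DO;
    executable statements
END;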
Notes
• After the IF-THEN/DO statement, list any number of executable statements.
• Close each DO block with an END statement.
Demo
Open p104d07.sas from the demos folder and find the Demo section of the program. Modify the
IF-THEN statements to use IF-THEN/DO syntax to write rows to either the indian, atlantic, or
pacific table based on the value of Ocean. Highlight the DATA step and run the selected code.
data indian atlantic pacific;
set pg1.storm_summary;
length Ocean $ 8;
keep Basin Season Name MaxWindMPH Ocean;
Basin=upcase(Basin);
OceanCode=substr(Basin,2,1);
*Modify the program to use IF-THEN-DO syntax;
if OceanCode="I" then do;
Ocean="Indian";
output indian;
end;
else if OceanCode="A" then do;
Ocean="Atlantic";
output atlantic;
end;
else do;
Ocean="Pacific";
output pacific;
end;
run;
indian Table
atlantic Table
pacific Table
Links
• SAS Programming 2: Data Manipulation Techniques
• SAS Programming 3: Advanced Techniques and Efficiencies
• Stick around for the last lesson!
• Take the SAS SQL 1 course.
• Read this blog post: Reasons to love PROC DS2.
• Take the DS2 Programming course.
• Look for Reading Text Files with the DATA Step on the Extended Learning page.
Links
• Take the SAS SQL 1 course.
• Read this blog post: Reasons to love PROC DS2.
• Take the DS2 Programming course.
Practice
If you restarted your SAS session, open and submit the libname.sas program in the course files.
Level 1
7. Processing Statements Conditionally with IF-THEN/ELSE
The pg1.np_summary table contains public use statistics from the National Park Service. The
values of the Type column represent park type as a code. Create a new column, ParkType,
that contains full descriptive values.
a. Open p104p07.sas from the practices folder. Submit the program and view the generated
output.
b. In the DATA step, use IF-THEN/ELSE statements to create a new column, ParkType,
based on the value of Type.
Type ParkType
NM Monument
NP Park
NS Seashore
c. Modify the PROC FREQ step to generate a frequency report for ParkType.
Level 2
8. Processing Statements Conditionally with DO Groups
Use conditional processing to split pg1.np_summary into two tables: parks and monuments.
a. Create a new program. Write a DATA step to create two temporary tables named parks and
monuments based on the pg1.np_summary table. Read only national parks or monuments
from the input table. (Type is either NP or NM.)
b. Create a new column named Campers that is the sum of all columns containing counts of
campers. Format the column to include commas.
c. When Type is NP, create a new column named ParkType that is equal to Park, and write the
row to the parks table. When Type is NM, assign ParkType as Monument and write the row
to the monuments table.
d. Keep Reg, ParkName, DayVisits, OtherLodging, Campers, and ParkType in both output
tables.
parks Table
monuments Table
Challenge
9. Processing Statements Conditionally with SELECT-WHEN Groups
SELECT and WHEN statements can be used in a DATA step as an alternative to IF-THEN
statements to process code conditionally.
a. Use SAS Help or online documentation to read about using SELECT and WHEN statements
in the DATA step.
b. Repeat Practice 8 using SELECT groups and WHEN statements.
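As a starting point for this challenge, here is a minimal sketch of the general SELECT-WHEN pattern on a different table. It assumes the sashelp.class sample table; the table name class_groups, the column AgeGroup, and the age cutoffs are made up for illustration.

```sas
data class_groups;
    set sashelp.class;
    length AgeGroup $ 8;
    select;
        /* WHEN conditions are tested in order; the first true one executes */
        when (Age<13) AgeGroup="Child";
        when (Age<16) AgeGroup="Teen";
        /* OTHERWISE handles rows that match no WHEN condition */
        otherwise AgeGroup="Older";
    end;
run;
```

A WHEN statement can be followed by a DO group in the same way as THEN, which is one way to assign a value and write the row to a specific output table.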
4.4 Solutions
Solutions to Practices
1. Creating a SAS Table
data eu_occ2016;
set pg1.eu_occ;
where YearMon like "2016%";
format Hotel ShortStay Camp comma17.;
drop geo;
run;
2. Creating a Permanent SAS Table
libname out "s:/workshop/output";
data out.fox;
set pg1.np_species;
where Category='Mammal' and Common_Names like '%Fox%'
and Common_Names not like '%Squirrel%';
drop Category Record_Status Occurrence Nativeness;
run;
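3. Creating a SAS Table Using Macro Variables
/* The %LET statement is assumed here so that the step runs; */
/* use Mammal for part a and Bird for part c. */
%let cat=Bird;
data &cat;
set pg1.np_species;
where Category="&cat";
drop Abundance Seasonality Conservation_Status;
run;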
temporary table
data storm_new;
set pg1.storm_summary;
run;
permanent table
/* Path assumed to match the libname statement in the practice solutions */
libname out "s:/workshop/output";
data out.storm_new;
set pg1.storm_summary;
run;
data out.storm_cat5;
set pg1.storm_summary;
where StartDate>="01jan2000"d and MaxWindMPH>=156;
keep Season Basin Name Type MaxWindMPH;
run;
There were 18 Category 5 storms since January 1, 2000.
NOTE: There were 18 observations read from the data set PG1.STORM_SUMMARY.
      WHERE (StartDate>='01JAN2000'D) and (MaxWindMPH>=156);
How is the KEEP statement different from the VAR statement in PROC PRINT?
data storm_length;
set pg1.storm_summary;
drop Hem_EW Hem_NS Lat Lon;
StormLength = EndDate-StartDate;
run;
data storm_windavg;
set pg1.storm_range;
WindAvg=mean(wind1, wind2, wind3, wind4);
WindRange=range(of wind1-wind4);
run;
OF col1 - coln
That's a good shortcut for listing a range of columns!
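The OF keyword matters. Without it, SAS reads wind1-wind4 as a subtraction. The following is a minimal sketch with made-up values in a DATA _NULL_ step; the column names Avg_List, Avg_Of, and Oops are hypothetical.

```sas
data _null_;
    wind1=50; wind2=65; wind3=.; wind4=80;
    Avg_List = mean(wind1, wind2, wind3, wind4);  /* 65: the missing value is ignored */
    Avg_Of   = mean(of wind1-wind4);              /* 65: same columns, listed as a numbered range */
    Oops     = mean(wind1-wind4);                 /* -30: without OF, this is 50 minus 80 */
    putlog Avg_List= Avg_Of= Oops=;
run;
```

The numbered range shortcut works when the column names share a prefix and end in consecutive numbers.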
NOTE: There were 1958 observations read from the data set
PG1.STORM_SUMMARY.
WHERE SUBSTR(basin, 2, 1)='P';
NOTE: The data set WORK.PACIFIC has 1958 observations and 6 variables.
data storm_cat;
set pg1.storm_summary;
keep Name Basin MinPressure StartDate PressureGroup;
*add ELSE keyword and remove final condition;
if MinPressure=. then PressureGroup=.;
else if MinPressure<=920 then PressureGroup=1;
else PressureGroup=0;
run;
56 else Ocean="Pacific";
57 length Ocean $ 8;
WARNING: Length of character variable Ocean has already been set.
Use the LENGTH statement as the very first statement in the DATA STEP to declare the length of a character variable.
58 run;
The order of KEEP, DROP, and WHERE statements does not matter in the DATA step.
Lesson 5 Analyzing and Reporting
on Data
5.1 Enhancing Reports with Titles, Footnotes, and Labels................................................. 5-3
Demonstration: Enhancing Reports ........................................................................... 5-9
5.1 Enhancing Reports with Titles, Footnotes, and Labels
Figure: The SAS programming process: Access data, Explore data, Prepare data, Analyze and report on data, Export results. Topics shown: TITLE, FOOTNOTE, LABEL, MEANS, FREQ.
Now that data access, validation, and manipulation are behind you, you are ready to address the
peak of the programming process: analyzing and reporting on the data. Analyzing your data can
mean a lot of different things. It could be basic summarization to examine what happened in the
past, or it could be complex data mining or machine learning algorithms to predict what might
happen in the future. In this lesson, you concentrate on summarizing data. Specifically, you explore
in more depth the procedures that you can use for exploration: PRINT, MEANS, and FREQ.
5.01 Activity
Open p105a01.sas from the activities folder and perform the following tasks:
1. In the program, notice that there is a TITLE statement followed by two
procedures. Run the program. Where does the title appear in the output?
2. Add a TITLE2 statement above PROC MEANS to print a second line:
Summary Statistics for MaxWind and MinPressure
3. Add another TITLE2 statement above PROC FREQ with this title:
Frequency Report for Basin
4. Run the program. Which titles appear above each report?
5.02 Activity
Open p105a02.sas from the activities folder. Notice that there are no TITLE
statements in the code. Run the program. Does the report have the same
titles assigned in the previous activity?
Yes
No
%let age=13;
title;
footnote;
Note: Remember to use double quotation marks when you reference macro variables in text
strings.
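The effect of the quotation marks can be seen in a short sketch. It assumes the sashelp.class sample table is available in your SAS session; the title and footnote text are hypothetical.

```sas
%let age=13;

title "Students Older Than &age";      /* double quotes: &age resolves to 13 */
footnote 'Filter value: &age';         /* single quotes: &age is printed literally */
proc print data=sashelp.class noobs;
    where Age > &age;
run;

title;       /* clear all titles */
footnote;    /* clear all footnotes */
```

The footnote is written with single quotation marks only to show the contrast; in practice, double quotation marks are used wherever the macro variable should resolve.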
LABEL col-name="label-text";
In PROC PRINT, you must use either the LABEL or SPLIT= option in the PROC PRINT statement
to display labels in the report. When you use the LABEL option, SAS determines whether to split the
labels to multiple lines, and if so, where to make the split. The SPLIT= option enables you to define
a character that forces labels to split in specific locations.
proc print data=sashelp.cars split="*";
var Make Model MSRP MPG_Highway MPG_City;
label MSRP="Manufacturer Suggested*Retail Price"
MPG_Highway="Highway Miles*per Gallon";
run;
Segmenting Reports
Enhancing Reports
Scenario
Use titles, footnotes, labels, and grouping to enhance a report.
Files
• p105d01.sas
• storm_final – a SAS table that contains one row per storm for the 1980 through 2017 storm
seasons. The data was cleaned and prepared previously using the DATA step.
Syntax
TITLEn "title-text";
FOOTNOTEn "footnote-text";
LABEL col-name="label-text"
col-name="label-text";
Notes
• TITLE is a global statement. A title stays in effect for all reports that are created in
your SAS session until you replace or clear it.
• You can have a maximum of 10 titles. You use a number 1 through 10 after the keyword TITLE
to indicate the line number. TITLE and TITLE1 are equivalent.
• Titles can be replaced with an additional TITLE statement with the same number. TITLE; clears
all titles.
• You can also add footnotes to any report with the FOOTNOTE statement. The same rules for titles
apply to footnotes.
• Labels can be used to provide more descriptive column headings. A label can include any text
at a maximum of 256 characters.
• All procedures automatically display labels except for PROC PRINT. You must add the LABEL
option in the PROC PRINT statement.
• To create a grouped report, first use PROC SORT to arrange the data by the grouping column,
and then use the BY statement in the reporting procedure.
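The notes about replacing and clearing titles can be sketched as follows. The example assumes the sashelp.cars sample table is available; the title and footnote text are hypothetical.

```sas
title1 "Vehicle Report";
title2 "Summary Statistics for MSRP";
footnote "Source: sashelp.cars";
proc means data=sashelp.cars;
    var MSRP;
run;

/* Replaces TITLE2 only; TITLE1 and the footnote are still in effect. */
title2 "Frequency Report for Origin";
proc freq data=sashelp.cars;
    tables Origin;
run;

title;       /* clear all titles */
footnote;    /* clear all footnotes */
```

A TITLEn statement also clears any titles with a higher number, so submitting a new TITLE1 here would remove the existing TITLE2 as well.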
Demo
1. Open p105d01.sas from the demos folder and find the Demo section of the program. Add a
PROC SORT step before PROC PRINT to sort pg1.storm_final by BasinName and descending
MaxWindMPH. Create a temporary table named storm_sort. Filter the rows to include only
MaxWindMPH>156.
proc sort data=pg1.storm_final out=storm_sort;
by BasinName descending MaxWindMPH;
where MaxWindMPH > 156;
run;
2. Modify the PROC PRINT step to read the storm_sort table and group the report
by BasinName.
3. Add the following title: Category 5 Storms. Clear the title for future results.
4. Add labels for the following columns and ensure that PROC PRINT displays the labels:
MaxWindMPH Max Wind (MPH)
MinPressure Min Pressure
StartDate Start Date
StormLength Length of Storm (days)
5. Add the NOOBS option in the PROC PRINT statement to suppress the OBS column. Highlight
the demo program and run the selected code.
title "Category 5 Storms";
proc print data=storm_sort label noobs;
by BasinName;
var Season Name MaxWindMPH MinPressure StartDate StormLength;
label MaxWindMPH="Max Wind (MPH)"
MinPressure="Min Pressure"
StartDate="Start Date"
StormLength="Length of Storm (days)";
run;
title;
When the LABEL statement is used in a DATA step, labels are assigned as permanent attributes
in the descriptor portion of the table. When procedures create reports using that data, labels are
automatically displayed. Notice that the LABEL option is still required in PROC PRINT.
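A minimal sketch of this behavior, assuming the sashelp.class sample table is available; the table name class_labeled and the label text are hypothetical.

```sas
data class_labeled;
    set sashelp.class;
    label Height="Height (inches)"     /* stored in the descriptor portion */
          Weight="Weight (pounds)";
run;

proc means data=class_labeled;         /* labels are displayed automatically */
    var Height Weight;
run;

proc print data=class_labeled label noobs;   /* LABEL option is still needed */
    var Name Height Weight;
run;
```

Because the labels are stored with the table, the procedures do not need their own LABEL statements.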
5.03 Activity
Open p105a03.sas from the activities folder and perform the following tasks:
1. Modify the LABEL statement in the DATA step to label the Invoice column
as Invoice Price.
2. Run the program. Why do the labels appear in the PROC MEANS report
but not in the PROC PRINT report? Fix the program and run it again.
Figure: PROC FREQ report with callouts: number of unique values, change statistics, graphs to view distribution.
PROC FREQ was used with the TABLES statement for data validation. However, many more
statements and options are available in PROC FREQ to customize the output and include additional
statistics.
5.2 Creating Frequency Reports
Start with the frequency reports that are based on individual columns. By default, each column that
is listed in the TABLES statement generates a separate frequency table that includes the number
and percentage of rows for each value in the data, as well as a cumulative frequency and percent.
The numbers that are included in this report can be customized using options in the PROC FREQ
and TABLES statements.
See SAS Help for full documentation about PROC FREQ:
http://support.sas.com/documentation/cdl/en/procstat/66703/HTML/default/viewer.htm#procstat_freq_syntax.htm
Files
• p105d02.sas
• storm_final – a SAS table that contains one row per storm for the 1980 through 2017 storm
seasons. The data was cleaned and prepared previously using the DATA step.
Syntax
Notes
• One or more TABLES statements can be used to define frequency tables and options.
• ODS Graphics enables graph options to be used in the TABLES statement.
• WHERE, FORMAT, LABEL, and BY statements can be used in PROC FREQ to customize
the report.
Demo
Note: Highlight the demo program and run the selected code after each step.
1. Open p105d02.sas from the demos folder and find the Demo section of the program.
Highlight the PROC FREQ step and run the selected code. Examine the default results.
2. In the PROC FREQ statement, add the ORDER=FREQ option to sort results by descending
frequency. Add the NLEVELS option to include a table with the number of distinct values.
proc freq data=pg1.storm_final order=freq nlevels;
3. Add the NOCUM option in the TABLES statement to suppress the cumulative columns.
tables BasinName Season / nocum;
4. Change Season to StartDate in the TABLES statement. Add a FORMAT statement to display
StartDate as the month name (MONNAME.).
proc freq data=pg1.storm_final order=freq nlevels;
tables BasinName StartDate / nocum;
format StartDate monname.;
run;
5. Add the ODS GRAPHICS ON statement before PROC FREQ. Use the PLOTS=FREQPLOT
option in the TABLES statement to create a bar chart. Add the chart options
ORIENT=HORIZONTAL and SCALE=PERCENT.
ods graphics on;
proc freq data=pg1.storm_final order=freq nlevels;
tables BasinName StartDate /
nocum plots=freqplot(orient=horizontal scale=percent) ;
format StartDate monname.;
run;
6. Add the title Frequency Report for Basin and Storm Month. Turn off the procedure title with
the ODS NOPROCTITLE statement. Add a LABEL statement to display BasinName as Basin
and StartDate as Storm Month. Clear the titles and turn the procedure titles back on.
ods graphics on;
ods noproctitle;
title "Frequency Report for Basin and Storm Month";
proc freq data=pg1.storm_final order=freq nlevels;
tables BasinName StartDate /
nocum plots=freqplot(orient=horizontal scale=percent);
format StartDate monname.;
label BasinName="Basin"
StartDate="Storm Month";
run;
title;
ods proctitle;
5.04 Activity
Open p105a04.sas from the activities folder and perform the following tasks:
1. Create a temporary output table named storm_count by completing the
OUT= option in the TABLES statement.
2. Add the NOPRINT option in the PROC FREQ statement to suppress the
printed report.
3. Run the program. Which statistics are included in the output table?
Which month has the highest number of storms?
Files
• p105d03.sas
• storm_final – a SAS table that contains one row per storm for the 1980 through 2017 storm
seasons. The data was cleaned and prepared previously using the DATA step.
Syntax
Notes
• When you place an asterisk between two columns in the TABLES statement, PROC FREQ
produces a two-way frequency or crosstabulation report. The values of the first listed column are
the rows of the report, and the values of the second column are the columns.
• Use options in the TABLES statement to customize the table structure and the statistics that are
included in the output.
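The row and column roles described in these notes can be sketched with the sashelp.cars sample table, which is assumed to be available; the choice of Origin and DriveTrain is illustrative only.

```sas
proc freq data=sashelp.cars;
    /* Origin values form the rows; DriveTrain values form the columns. */
    tables Origin*DriveTrain / norow nocol;
    /* The same combinations displayed as one row per pair of values. */
    tables Origin*DriveTrain / list;
run;
```

Swapping the order of the two columns around the asterisk swaps the rows and columns of the crosstabulation.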
Demo
Note: Highlight the PROC FREQ step and run the selected code after each step.
1. Open p105d03.sas from the demos folder and find the Demo section of the program.
Highlight the PROC FREQ step, run the selected code, and examine the default results.
2. Add the NOPERCENT, NOROW, and NOCOL options in the TABLES statement.
tables StartDate*BasinName / norow nocol nopercent;
3. Delete the options in the TABLES statement and add the CROSSLIST option.
tables StartDate*BasinName / crosslist;
4. Change the CROSSLIST option to the LIST option in the TABLES statement.
tables StartDate*BasinName / list;
5. Delete the previous options and add OUT=STORMCOUNTS. Add NOPRINT to the PROC FREQ
statement to suppress the report.
proc freq data=pg1.storm_final noprint;
tables StartDate*BasinName / out=stormcounts;
format StartDate monname.;
label BasinName="Basin"
StartDate="Storm Month";
run;
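After an output table is created, it can be sorted and printed like any other table. The following sketch assumes the sashelp.cars sample table is available; the table name car_counts is hypothetical.

```sas
proc freq data=sashelp.cars noprint;
    tables Origin*DriveTrain / out=car_counts;   /* no report, table only */
run;

proc sort data=car_counts;
    by descending Count;     /* most frequent combinations first */
run;

proc print data=car_counts noobs;
run;
```

The table created by OUT= holds one row per combination of values, with the frequency in the COUNT column and the percentage of the total in the PERCENT column.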
Practice
If you restarted your SAS session, open and submit the libname.sas program in the course files.
Level 1
1. Creating One-Way Frequency Reports
The pg1.np_species table provides a detailed species list for selected national parks.
Use this table to analyze categories of reported species.
a. Create a new program. Write a PROC FREQ step to analyze rows from pg1.np_species.
1) Use the TABLES statement to generate a frequency table for Category.
2) Use the NOCUM option to suppress the cumulative columns.
3) Use the ORDER=FREQ option in the PROC FREQ statement to order the results
by descending frequency.
4) Use Categories of Reported Species as the report title.
5) Run the program and review the results.
3) Add in the Everglades as a second title. Run the program and review the results.
Level 2
2. Creating Two-Way Frequency Reports
The pg1.np_codelookup table is primarily used to look up a park name or park code. However,
the table also includes columns for the park type and park region. Use this table to analyze the
frequency of park types by the various regions.
a. Create a new program. Write a PROC FREQ step to analyze rows from
pg1.np_codelookup. Generate a two-way frequency table for Type by Region. Exclude any
park type that contains the word Other. The levels with the most rows should come first in
the order. Suppress the display of column percentages. Use Park Types by Region as the
report title.
b. Run the program and review the results. Identify the top three park types based on total
frequency count.
Note: Statistics labels appear in the main table in Enterprise Guide if SAS Report is the
output format.
c. Modify the PROC FREQ step by limiting the park types to the three that were determined in
the previous step. In addition to suppressing the display of column percentages, display the
table using the CROSSLIST option. Add a frequency plot that groups the bars by the row
variable, displays row percentages, and has a horizontal orientation. Use Selected Park
Types by Region as the report title. Run the program and review the results.
Note: Use SAS documentation to learn how the GROUPBY=, SCALE=, and ORIENT=
options can be used to control the appearance of the plot.
Challenge
3. Creating a Customized Graph of a Two-Way Frequency Table
The SGPLOT procedure can be used to create statistical graphics such as histograms and
regression plots, in addition to simple graphics such as bar charts and line plots. Statements and
options enable you to control the appearance of your graph and add additional features such as
legends and reference lines.
a. Open p105p03.sas from the practices folder. Highlight the first TITLE statement and PROC
FREQ step, run the selected code, and examine the generated plot. The program subsets
the pg1.np_codelookup table for three park types: National Historic Site, National
Monument, and National Park. The plot uses a stacked layout with a horizontal orientation.
b. To create a more customized frequency bar chart, the SGPLOT procedure can be used with
the pg1.np_codelookup table. Examine the PROC SGPLOT step in the demo program.
1) The HBAR statement creates a horizontal bar chart with separate bars for each Region.
The GROUP= option segments each bar by the distinct values of Type.
2) The KEYLEGEND statement customizes the appearance and position of the legend.
3) The XAXIS statement adds reference lines on the horizontal axis.
c. Use SAS Help or autocomplete prompts to look for additional options in the HBAR statement
to customize the appearance of the chart.
1) Display labels on each segment of the bars.
2) Change the fill attributes for each bar to make the color 50% transparent.
3) Apply different values for the DATASKIN option to change the color effect on the bars.
5.3 Creating Summary Statistics Reports
Figure callouts: group data. PROC MEANS makes it easy to summarize your data in reports or tables!
PROC MEANS is a very useful procedure for calculating basic summary statistics and looking for
numeric values that might be outside of an expected range. Now that you are beyond validation, you
can use PROC MEANS to generate complex reports that include various statistics and groupings
within the data.
Files
• p105d04.sas
• storm_final – a SAS table that contains one row per storm for the 1980 through 2017 storm
seasons. The data was cleaned and prepared previously using the DATA step.
Syntax
Notes
• Options in the PROC MEANS statement control the statistics that are included in the report.
• The CLASS statement specifies columns to group the data before calculating statistics.
• The WAYS statement specifies the number of ways to make unique combinations of class
columns.
Demo
Note: Highlight the PROC MEANS step and run the selected code after each step.
1. Open p105d04.sas from the demos folder and find the Demo section of the program.
Run the step and examine the starting report.
2. List the following statistics in the PROC MEANS statement: MEAN, MEDIAN, MIN, and MAX.
Add the MAXDEC=0 option to round statistics to the nearest integer.
proc means data=pg1.storm_final mean median min max maxdec=0;
3. The CLASS statement can be used to calculate statistics for groups. Add a CLASS statement
and list the BasinName column.
Note: The CLASS statement does not require the data to be sorted.
proc means data=pg1.storm_final mean median min max maxdec=0;
var MaxWindMPH;
class BasinName;
run;
4. Add StormType as an additional column in the CLASS statement. Run the program and notice
that one report is created with statistics that are calculated for the combination of BasinName
and StormType values.
class BasinName StormType;
5. The WAYS statement can be used to indicate the combinations of class columns to use for
creating the report. Add the WAYS statement and provide a value of 1.
proc means data=pg1.storm_final mean median min max maxdec=0;
var MaxWindMPH;
class BasinName StormType;
ways 1;
run;
6. Change the WAYS statement to list 0, 1, and 2.
proc means data=pg1.storm_final mean median min max maxdec=0;
var MaxWindMPH;
class BasinName StormType;
ways 0 1 2;
run;
5.05 Activity
Open p105a05.sas from the activities folder and perform the following tasks:
1. Add options to include N (count), MEAN, and MIN statistics. Round each
statistic to the nearest integer.
2. Add a CLASS statement to group the data by Season and Ocean. Run the
program.
3. Modify the program to add the WAYS statement so that separate reports
are created for Season and Ocean statistics. Run the program.
Which ocean had the lowest mean for minimum pressure?
Which season had the lowest mean for minimum pressure?
5.06 Activity
Open p105a06.sas from the activities folder and perform the following tasks:
1. Run the PROC MEANS step and compare the report and the wind_stats
table. Are the same statistics in the report and table? What do the first
five rows in the table represent?
2. Uncomment the WAYS statement. Delete the statistics listed in the PROC
MEANS statement and add the NOPRINT option. Run the program. Notice
that a report is not generated and the first five rows from the previous
table are excluded.
3. Add the following options in the OUTPUT statement and run the program
again. How many rows are in the output table?
output out=wind_stats mean=AvgWind max=MaxWind;
5.07 Activity
Open p105a07.sas from the activities folder. Run the program and examine
the results to see examples of other procedures that analyze and report
on the data.
The map is created using the SGMAP procedure, which requires SAS 9.4M5 or later.
Links
• Review the SAS 9.4 ODS Graphics documentation.
• Take the ODS Graphics: Essentials course.
• Use this ODS Graphics tip sheet as a reference.
• Take the free e-learning Statistics 1: Introduction to ANOVA, Regression, and Logistic Regression
course.
• Check out other training options for advanced analytics.
• Learn to use PROC REPORT and PROC TABULATE in the SAS Report Writing 1: Essentials
course.
• Read PROC REPORT by Example: Techniques for Building Professional Reports Using SAS.
Practice
If you restarted your SAS session, open and submit the libname.sas program in the course files.
Level 1
4. Producing a Descriptive Statistic Report
The pg1.np_westweather table contains weather-related information for four national parks:
Death Valley National Park, Grand Canyon National Park, Yellowstone National Park,
and Zion National Park. Use the MEANS procedure to analyze the data in this table.
a. Create a new program. Write a PROC MEANS step to analyze rows from
pg1.np_westweather with the following specifications:
1) Generate the mean, minimum, and maximum statistics for the Precip, Snow, TempMin,
and TempMax columns.
2) Use the MAXDEC= option to display the values with a maximum of two decimal
positions.
3) Use the CLASS statement to group the data by Year and Name.
4) Use Weather Statistics by Year and Park as the report title. Run the program
and review the results.
Level 2
5. Creating an Output Table with Custom Columns
The pg1.np_westweather table contains weather-related information for four national parks:
Death Valley National Park, Grand Canyon National Park, Yellowstone National Park,
and Zion National Park. Use the MEANS procedure to analyze the data in this table.
a. Create a new program. Write a PROC MEANS step to analyze rows from
pg1.np_westweather where values for Precip are not equal to zero. Analyze precipitation
amounts grouped by Name and Year. Create only an output table, named rainstats, with
columns for the N and SUM statistics. Name the columns RainDays and TotalRain
respectively. Keep only those rows that are the combination of Year and Name.
b. Write a PROC PRINT step to print the rainstats table. Suppress the printing of observation
numbers, and display column labels. Display the columns in the following order: Name, Year,
RainDays, and TotalRain. Label Name as Park Name, RainDays as Number of Days
Raining, and TotalRain as Total Rain Amount (inches). Use Rain Statistics by Year and
Park as the report title.
c. Run the program and review the results.
Challenge
6. Identifying the Top Three Extreme Values with the Output Statistics
a. Create a new program. Write a PROC MEANS step to analyze rows from pg1.np_multiyr
and create a table named top3parks with the following attributes:
1) Suppress the display of the PROC MEANS report.
2) Analyze Visitors grouped by Region and Year.
3) Drop the _FREQ_ and _TYPE_ columns from top3parks and keep only rows that are
a result of a combination of Region and Year.
4) Create a column for TotalVisitors in the output table.
5) Include in the output table the top three parks in terms of the number of visitors.
Automatically resolve conflicts in the column names when names are assigned
to the new columns in the output table.
Note: Use SAS Help to learn about the IDGROUP option in the OUTPUT statement.
5.4 Solutions
Solutions to Practices
1. Creating One-Way Frequency Reports
/*part a*/
title1 "Categories of Reported Species";
proc freq data=pg1.np_species order=freq;
tables Category / nocum;
run;
/*part b*/
ods graphics on;
ods noproctitle;
title1 "Categories of Reported Species";
title2 "in the Everglades";
proc freq data=pg1.np_species order=freq;
tables Category / nocum plots=freqplot;
where Species_ID like "EVER%" and
Category ne "Vascular Plant";
run;
title;
2. Creating Two-Way Frequency Reports
What are the top three park types based on total frequency?
National Historic Site, National Monument, and National Park
/*part a, b*/
title1 'Park Types by Region';
proc freq data=pg1.np_codelookup order=freq;
tables Type*Region / nocol;
where Type not like '%Other%';
run;
/*part c*/
title1 'Selected Park Types by Region';
ods graphics on;
proc freq data=pg1.np_codelookup order=freq;
tables Type*Region / nocol crosslist
plots=freqplot(groupby=row scale=grouppercent
orient=horizontal);
where Type in ('National Historic Site', 'National Monument',
'National Park');
run;
title;
/*part b */
title1 'Counts of Selected Park Types by Park Region';
ods graphics on;
proc freq data=pg1.np_codelookup order=freq noprint;
tables Type*Region / out=park_freq;
where Type in ('National Historic Site', 'National Monument',
'National Park');
run;
/*part c*/
proc sgplot data=pg1.np_codelookup;
where Type in ('National Historic Site', 'National Monument',
'National Park');
hbar region / group=type;
keylegend / opaque across=1 position=bottomright
location=inside;
xaxis grid;
run;
/*part d*/
proc sgplot data=pg1.np_codelookup;
where Type in ('National Historic Site', 'National Monument',
'National Park');
hbar region / group=type seglabel
fillattrs=(transparency=0.5) dataskin=crisp;
keylegend / opaque across=1 position=bottomright
location=inside;
xaxis grid;
run;
title;
4. Producing a Descriptive Statistic Report
title1 'Weather Statistics by Year and Park';
proc means data=pg1.np_westweather mean min max maxdec=2;
var Precip Snow TempMin TempMax;
class Year Name;
run;
5.03 Activity – Correct Answer
1. Modify the LABEL statement in the DATA step to label the Invoice column
as Invoice Price.
data cars_update;
set sashelp.cars;
keep Make Model MSRP Invoice AvgMPG;
AvgMPG=mean(MPG_Highway, MPG_City);
label MSRP="Manufacturer Suggested Retail Price"
AvgMPG="Average Miles per Gallon"
Invoice="Invoice Price";
run;
5.06 Activity – Correct Answer
1. Run the PROC MEANS step and compare the report and the wind_stats
table. Are the same statistics in the report and table? What do the first
five rows in the table represent?
The statistics are different. The first five rows in the table summarize the
entire input table.
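The same behavior can be reproduced with the sashelp.cars sample table, which is assumed to be available. This is a sketch, not the activity program; the table names mpg_default and mpg_stats and the column names AvgMPG and MaxMPG are hypothetical.

```sas
/* No statistic keywords in OUTPUT: the table holds a default set of
   statistics, identified by the _STAT_ column, for each _TYPE_ level. */
proc means data=sashelp.cars noprint;
    var MPG_City;
    class Origin;
    output out=mpg_default;
run;

proc print data=mpg_default;
run;

/* WAYS 1 drops the overall (_TYPE_=0) rows; the statistic keywords
   name the columns that are written to the output table. */
proc means data=sashelp.cars noprint;
    var MPG_City;
    class Origin;
    ways 1;
    output out=mpg_stats mean=AvgMPG max=MaxMPG;
run;

proc print data=mpg_stats noobs;
run;
```

In the first table, the rows with _TYPE_=0 are the five default statistics (N, MIN, MAX, MEAN, and STD) for the entire input table. The second table has one row for each value of Origin.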
View SAS documentation for more options to customize the output table.
Lesson 6 Exporting Results
6.1 Exporting Data
Demonstration: Exporting Data to an Excel Workbook
6.1 Exporting Data
Figure: The SAS programming process: Access data, Explore data, Prepare data, Analyze and report on data, Export results.
You have clean data and accurate, interesting reports, so now you need to share what you created
with others. You realize that not everyone who needs access to your results uses SAS, so you need
methods to save the data and reports in formats that are easy to view.
Both SAS Enterprise Guide and SAS Studio provide easy point-and-click options to export data
to a variety of formats.
Enterprise Guide: Open a table in the data grid and select Export.
SAS Studio: Right-click on a table in the Libraries section of the Navigation pane and select Export.
Common DBMS identifiers that are included with Base SAS are as follows:
• CSV – comma-separated values
• JMP – JMP files, JMP 7 or later
• TAB – tab-delimited values
• DLM – delimited files, default delimiter is a space. To use a different delimiter,
use the DELIMITER= statement.
Additional DBMS identifiers that are included with SAS/ACCESS Interface to PC Files:
• XLSX – Microsoft Excel 2007, 2010, and later
• ACCESS – Microsoft Access 2000 and later
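As a minimal sketch of how these identifiers are used, the PROC EXPORT step below writes a pipe-delimited text file with DBMS=DLM and the DELIMITER= statement. It assumes that the outpath macro variable already holds a writable folder path (it is created in Activity 6.01) and uses the sashelp.class sample table that ships with SAS. The file name class.txt is only illustrative.

```sas
/* Export a SAS table to a pipe-delimited text file.                    */
/* Assumes &outpath resolves to a writable folder (see Activity 6.01).  */
proc export data=sashelp.class
            outfile="&outpath/class.txt"  /* illustrative file name */
            dbms=dlm                      /* generic delimited file */
            replace;                      /* overwrite the file if it exists */
    delimiter='|';                        /* override the default space */
run;
```

Changing DBMS=DLM to DBMS=CSV or DBMS=TAB, and dropping the DELIMITER= statement, produces comma-delimited or tab-delimited output instead.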
Remember that the path is relative to the location of SAS.
If SAS Studio or Enterprise Guide is configured to connect to SAS on a remote server, both
interfaces provide a method to download files from the remote server to your local machine.
SAS Studio – Select the file in the Files and Folders section of the Navigation pane and click
Download.
Enterprise Guide – Select Tasks > Data > Copy Files.
6.01 Activity
1. Open the libname.sas program in the course files folder.
2. Create a macro variable named outpath that stores the location
of the output folder in your course files location.
3. Run the code and save the program.
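One way to define the macro variable is with a %LET statement. The path shown here is a placeholder, not the real course location; replace it with the output folder on your own system. Because later examples append a slash and a file name (for example, "&outpath/wind.xlsx"), the value should not end with a slash.

```sas
/* Hypothetical path: substitute the output folder in your course files. */
%let outpath=/home/student/EPG1V2/output;

/* Optional check: write the resolved value to the log. */
%put NOTE: outpath resolves to &outpath;
```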
6.02 Activity
Open p106a02.sas from the activities folder and perform the following tasks:
1. Complete the PROC EXPORT step to read the pg1.storm_final SAS table
and create a comma-delimited file named storm_final.csv. Use &outpath
to substitute the path of the output folder.
2. Run the program and view the text file:
Files
• p106d01.sas
• storm_final – a SAS table that contains one row per storm for the 1980 through 2017 storm
seasons. The data was cleaned and prepared previously using the DATA step.
Syntax
Notes
• The XLSX engine requires a license for SAS/ACCESS Interface to PC Files.
• The XLSX engine can read and write data in Excel files.
• To write data to a new or existing Excel workbook, use the LIBNAME statement to assign a libref
that points to the Excel file. Use the libref when you name output tables. The table name is the
worksheet label in the Excel file.
Demo
1. Open p106d01.sas from the demos folder and find the Demo section of the program. Examine
the DATA and PROC MEANS steps and identify the temporary SAS tables that will be created.
Highlight the demo program and run the selected code.
2. Add a LIBNAME statement to create a library named xlout that points to an Excel file named
southpacific.xlsx in the output folder of the course data.
Note: Use the outpath macro variable to substitute the path of the output folder. If you did not
define the outpath macro variable, run the libname.sas program that was completed in
Activity 6.01.
libname xlout xlsx "&outpath/southpacific.xlsx";
3. Modify the DATA and PROC steps to write output tables to the xlout library.
libname xlout xlsx "&outpath/southpacific.xlsx";
data xlout.South_Pacific;
set pg1.storm_final;
where Basin="SP";
run;
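The listing above shows only the DATA step. As a hedged sketch of the whole pattern (not the contents of p106d01.sas), the program below also sends PROC MEANS output to the workbook and then clears the libref. The worksheet name Season_Stats and the output column names are illustrative assumptions; Basin, Season, and MaxWindMPH are columns of pg1.storm_final that appear elsewhere in these notes.

```sas
libname xlout xlsx "&outpath/southpacific.xlsx";

/* Detail rows: becomes the South_Pacific worksheet. */
data xlout.South_Pacific;
    set pg1.storm_final;
    where Basin="SP";
run;

/* Summary rows: becomes a second worksheet (illustrative name). */
proc means data=pg1.storm_final noprint;
    where Basin="SP";
    class Season;
    ways 1;                       /* keep only the per-season rows */
    var MaxWindMPH;
    output out=xlout.Season_Stats
           n=Count mean=AvgWindMPH max=PeakWindMPH;
run;

/* Release the workbook so that it can be opened in Excel. */
libname xlout clear;
```

Clearing the libref matters because SAS keeps its connection to the Excel file until the library is cleared or the session ends.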
6.03 Activity
Open p106a03.sas from the activities folder and perform the following tasks:
1. Complete the LIBNAME statement using the XLSX engine to create
an Excel workbook named storm.xlsx in the output folder.
2. Modify the DATA step to write the storm_final table to the storm.xlsx file.
3. After the DATA step, write a statement to clear the library.
4. Run the program and view the log to confirm that storm.xlsx was exported
with 3092 rows.
5. If possible, open the storm.xlsx file. How do dates appear in the
storm_final worksheet?
6.2 Exporting Reports
SAS provides the Output Delivery System (ODS) to create customized output in a variety of formats.
In SAS, procedures that generate reports actually generate output objects. These can easily be
rendered in one or more output formats that are designed to be viewed in SAS or in other software
applications. In ODS terminology, each of these formats is called a destination. Some ODS
destinations produce very simple output files, such as text files that conform to the comma-separated
values (CSV) standard. Others produce complex output files that are designed to be viewed and
manipulated using external software applications. Common destinations of this type include Excel
(XLSX), Microsoft Word (RTF), Microsoft PowerPoint (PPTX), and Adobe PDF. Many other
destinations are available in SAS.
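The general ODS pattern is to open a destination, run the reporting steps, and then close the destination. A minimal sketch, assuming the outpath macro variable from Activity 6.01 and the sashelp.class sample table (the file names are illustrative), sends one report to two destinations at once:

```sas
/* Open two destinations; both receive the output objects that follow. */
ods csvall file="&outpath/class_report.csv";
ods pdf    file="&outpath/class_report.pdf";

title "Class Roster";
proc print data=sashelp.class noobs;
run;
title;

/* Close each destination to finish writing its file. */
ods csvall close;
ods pdf close;
```

The reporting step is written once. Only the surrounding ODS statements decide which file formats are produced.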
ODS Excel was experimental in SAS 9.4M1 and M2. It is fully supported in SAS 9.4M3 and later.
Files
• p106d02.sas
• storm_final – a SAS table that contains one row per storm for the 1980 through 2017 storm
seasons. The data was cleaned and prepared previously using the DATA step.
Syntax
Notes
• The ODS EXCEL destination creates an XLSX file.
• By default, each procedure output is written to a separate worksheet with a default worksheet
name. The default style is also applied.
• Use the STYLE= option in the ODS EXCEL statement to apply a different style.
• Use the OPTIONS(SHEET_NAME='label') option in the ODS EXCEL statement to provide
a custom label for each worksheet.
Demo
1. Open p106d02.sas from the demos folder and find the Demo section in the program. Add an
ODS statement to create an Excel file named wind.xlsx in the output folder of the course files.
Close the Excel destination at the end of the program. Highlight the demo program and run the
selected code.
Note: Use the outpath macro variable to substitute the path of the output folder. If you did not
define the outpath macro variable, run the libname.sas program that was completed in
Activity 6.01.
ods excel file="&outpath/wind.xlsx";
title "Wind Statistics by Basin";
...
title;
ods proctitle;
ods excel close;
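The STYLE= and OPTIONS(SHEET_NAME=) settings described in the notes can be combined as in the sketch below. It is not the p106d02.sas program: it uses the sashelp.class sample table, an illustrative file name, and the sapphire style, and it assumes that outpath is already defined.

```sas
/* First worksheet: custom style and label. */
ods excel file="&outpath/class_demo.xlsx" style=sapphire
          options(sheet_name='Height Stats');
ods noproctitle;                  /* suppress "The MEANS Procedure" */

proc means data=sashelp.class mean min max maxdec=1;
    var Height;
run;

/* Second worksheet: change only the label; the file stays open. */
ods excel options(sheet_name='Roster');

proc print data=sashelp.class noobs;
run;

ods proctitle;                    /* restore the default */
ods excel close;
```

An ODS EXCEL statement without FILE= modifies the destination that is already open, which is how the second worksheet gets its own label.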
6.04 Activity
Open p106a04.sas from the activities folder and perform the following tasks:
1. Add ODS statements to create an Excel file named pressure.xlsx
in the output folder. Be sure to close the ODS destination at the end
of the program. Run the program and open the Excel file.
SAS Studio: Navigate to the output folder in the Files and Folders section
of the Navigation pane. Select pressure.xlsx and click Download.
Enterprise Guide: Click the Results - Excel tab and click Download.
6.05 Activity
Open p106a05.sas from the activities folder and perform the following tasks:
1. Run the program and open the pressure.pptx file.
2. Modify the ODS statements to change the output destination to RTF.
Change the style to sapphire.
3. Add the STARTPAGE=NO option in the first ODS RTF statement
to eliminate a page break between the procedure results.
4. Rerun the program and open the pressure.rtf file.
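Switching destinations usually means changing only the ODS statements that surround the reporting steps. The sketch below is not the p106a05.sas program; it uses the sashelp.class sample table and illustrative file names to show the shape of such a change.

```sas
/* PowerPoint version (for comparison):                          */
/*   ods powerpoint file="&outpath/class_report.pptx";           */
/*   ...reporting steps...                                       */
/*   ods powerpoint close;                                       */

/* RTF version: STARTPAGE=NO keeps both results on the same page. */
ods rtf file="&outpath/class_report.rtf" style=sapphire startpage=no;

proc means data=sashelp.class mean maxdec=1;
    var Height Weight;
run;

proc freq data=sashelp.class;
    tables Sex / nocum;
run;

ods rtf close;
```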
Files
• p106d03.sas
• storm_final – a SAS table that contains one row per storm for the 1980 through 2017 storm
seasons. The data was cleaned and prepared previously using the DATA step.
Syntax
Notes
• The ODS PDF destination creates a PDF file.
• The PDFTOC=n option controls the level of the expansion of the table of contents in PDF
documents.
• The ODS PROCLABEL statement enables you to change a procedure label.
Demo
1. Open p106d03.sas from the demos folder and find the Demo section of the program. Run the
program and open the PDF file to examine the results. Notice that bookmarks are created, and
they are linked to each procedure’s output.
Note: Use the outpath macro variable to substitute the path of the output folder. If you did not
define the outpath macro variable, run the libname.sas program that was completed in
Activity 6.01.
2. Add the STARTPAGE=NO option to eliminate page breaks between procedures. Add the
STYLE=JOURNAL option.
ods pdf file="&outpath/wind.pdf" startpage=no style=journal;
3. To customize the PDF bookmarks, add the PDFTOC=1 option to ensure that bookmarks are
expanded only one level when the PDF is opened. To customize the bookmark labels, add the
ODS PROCLABEL statement before each PROC with custom text. Run the program and open
the PDF file.
ods pdf file="&outpath/wind.pdf" startpage=no style=journal
pdftoc=1;
ods noproctitle;
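The listing above stops before the procedure steps. As a hedged illustration of where ODS PROCLABEL statements go (this is not the p106d03.sas program), the sketch below uses the sashelp.class sample table, an illustrative file name, and made-up bookmark text.

```sas
ods pdf file="&outpath/class_demo.pdf" startpage=no style=journal
        pdftoc=1;
ods noproctitle;

/* Each ODS PROCLABEL applies to the next procedure's bookmark. */
ods proclabel "Height Statistics";
title "Height Statistics by Sex";
proc means data=sashelp.class mean min max maxdec=1;
    class Sex;
    var Height;
run;

ods proclabel "Student Listing";
title "Student Listing";
proc print data=sashelp.class noobs;
run;

title;
ods proctitle;
ods pdf close;
```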
Links
• Take the Exporting SAS Data Sets and Creating ODS Files for Microsoft Excel course.
• View the following Help pages:
– Base SAS EXPORT Procedure
– SAS Output Delivery System: User’s Guide
– SAS/ACCESS Interface to PC Files: Reference
• Take the SAS Report Writing 1: Essentials course.
• Explore the SAS Output Delivery System resource page.
Practice
If you restarted your SAS session, open and submit the libname.sas program in the course files.
Level 1
1. Creating an Excel File Using ODS EXCEL
Create an Excel workbook named StormStats.xlsx that includes the results of SAS procedures.
Customize the names of the Excel worksheets.
a. Open p106p01.sas from the practices folder. Before the PROC MEANS step, add an ODS
EXCEL statement to do the following:
1) Write the output file to "&outpath/StormStats.xlsx".
Note: If you did not define the outpath macro variable, run the libname.sas program
that was completed in Activity 6.01.
2) Set the style for the Excel file to snow.
3) Set the sheet name for the first tab to South Pacific Summary.
b. Turn off the procedure titles and report titles at the start of the program. Turn the procedure
titles on at the end of the program.
c. Immediately before the PROC PRINT step, add an ODS EXCEL statement to set the sheet
name to Detail.
d. At the end of the program, add an ODS EXCEL statement to close the Excel destination.
e. Submit the program. If possible, open the StormStats.xlsx workbook in Excel.
Level 2
2. Creating a Word Document with ODS RTF
Generate an RTF file that can be opened in Microsoft Word. The file should include the results
of three procedures and use different styles to change the appearance.
a. Open p106p02.sas from the practices folder. Modify the program to write the output file
to &outpath/ParkReport.rtf. Set the style for the output file to Journal and remove page
breaks between procedure results. Suppress the printing of procedure titles.
Note: If you did not define the outpath macro variable, run the libname.sas program that
was completed in Activity 6.01.
b. Run the program. Open the output file in Microsoft Word. Notice that the Journal style is
applied to the results, but the graph is now gray scale instead of color. Also notice that the
date and time the program ran is printed in the upper right corner of the page. Close
Microsoft Word.
c. Modify your SAS program so that both tables are created using the Journal style, but the
graph is created using the SASDOCPRINTER style.
Note: An ODS destination statement enables you to specify a style without requiring you
to redefine the output file location.
d. Add an OPTIONS statement with the NODATE option at the beginning of the program
to suppress the date and time in the RTF file. Restore the option for future submissions
by adding an OPTIONS statement with the DATE option at the end of the program.
e. Run the program. Open the new output file using Microsoft Word. Ensure that the style for
both tables is the same, but that the graph is now displayed in color. Close the report.
Challenge
3. Creating a Landscape Report with ODS PDF
Generate a PDF document in landscape orientation. Print a report and map side by side.
a. Open p106p03.sas from the practices folder. Run the program and examine the output.
The program produces a table and map for North Atlantic region storms in the 2016 season.
b. Modify the program to produce a PDF file named StormSummary.pdf in the output folder
in the course files. Set the output style to Journal.
c. Use SAS Help to find a SAS system option that changes the page layout to landscape.
d. Use SAS Help to learn about the ODS LAYOUT GRIDDED statement as a way that you can
control the layout of multiple result objects. Force the results to be arranged in one row and
two columns.
e. Reset the system option at the end of the program so that future results have a portrait
layout.
f. Run the program and open the StormSummary.pdf file to confirm the results.
Note: SAS Studio generates a warning in the log because the wrapper code is creating
an RTF file behind the scenes. LAYOUT is not supported in RTF. The warning can
be ignored because it does not impact the PDF results.
6.3 Solutions
Solutions to Practices
1. Creating an Excel File Using ODS EXCEL
ods excel file="&outpath/StormStats.xlsx"
style=snow
options(sheet_name='South Pacific Summary');
ods noproctitle;
title;
proc means data=pg1.storm_detail maxdec=0 median max;
class Season;
var Wind;
where Basin='SP' and Season in (2014,2015,2016);
run;
ods excel options(sheet_name='Detail');
proc print data=pg1.storm_detail noobs;
where Basin='SP' and Season in (2014,2015,2016);
by Season;
run;
ods excel close;
ods proctitle;
2. Creating a Word Document with ODS RTF
ods rtf file="&outpath/ParkReport.rtf" style=Journal startpage=no;
ods noproctitle;
options nodate;
title "US National Park Regional Usage Summary";
proc freq data=pg1.np_final;
tables Region / nocum;
run;
proc means data=pg1.np_final mean median max nonobs maxdec=0;
class Region;
var DayVisits Campers;
run;
...
3. Creating a Landscape Report with ODS PDF
...
ods region;
proc print data=pg1.storm_final noobs;
var name StartDate MaxWindMPH StormLength;
where Basin="NA" and Season=2016;
format StartDate monyy7.;
run;
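The ODS REGION statement above belongs to a gridded layout. As a hedged sketch of the overall pattern (not the course's solution file), the program below places two results side by side in a landscape PDF. It uses the sashelp.class sample table and an illustrative file name, and it assumes that outpath is already defined.

```sas
options orientation=landscape;

ods pdf file="&outpath/layout_demo.pdf" style=journal;
ods layout gridded columns=2 rows=1;

ods region;                       /* left cell */
proc print data=sashelp.class(obs=10) noobs;
    var Name Age Height;
run;

ods region;                       /* right cell */
proc means data=sashelp.class mean min max maxdec=1;
    var Height Weight;
run;

ods layout end;
ods pdf close;

/* Restore the default so that later results are portrait. */
options orientation=portrait;
```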
The STARTPAGE= option controls page breaks in the file.
Lesson 7 Using SQL in SAS®
7.1 Using Structured Query Language (SQL) in SAS
Demonstration: Reading and Filtering Data with SQL
7.1 Using Structured Query Language (SQL) in SAS
In addition to working with other types of data, SAS also enables you to use other programming
languages and APIs, including Python, REST, SQL, Java, R, and Lua.
To learn more about how these languages and APIs can be integrated in the SAS platform,
visit http://developer.sas.com.
Access data -> Explore data -> Prepare data -> Analyze and report on data -> Export results
Structured Query Language (SQL)
You saw how the DATA step and procedures can be used to prepare, summarize, and report on
data. Structured Query Language (SQL) is a common language that is used by many programmers
in a wide variety of software. SAS enables you to write SQL code as part of a SAS program. It is
likely that you will encounter SQL as you progress as a SAS programmer, so it is important to
understand how SQL can be a beneficial tool and how it compares to the SAS code that you have
written so far.
SQL is available in Base SAS. Because SQL is a separate language, it is
implemented in SAS as a procedure. Many programmers who are new to SAS have prior
experience with SQL, so it provides an easy, familiar entry point for programming on the SAS
Platform.
There are two procedures to choose from for executing SQL in Base SAS: PROC SQL and PROC
FedSQL. Each has different extensions and strengths. PROC SQL is more tightly integrated with the
SAS system and has several unique extensions that are useful when processing on the SAS
Platform. PROC FedSQL is written to a more modern ANSI SQL standard and is more ANSI
compliant, which means that it has fewer SAS extensions. Because PROC SQL has been available
longer, it is more commonly encountered in existing SAS code, so PROC SQL was chosen for
executing SQL programs in this class.
PROC SQL;
SELECT col-name, col-name
FROM input-table;
QUIT;
proc sql;
select Name, Age, Height, Birthdate format=date9.
from pg1.class_birthdate;
quit;
proc sql;
select Name, Age, Height*2.54 as HeightCM format=5.1,
Birthdate format=date9.
from pg1.class_birthdate;
quit;
7.01 Activity
Open p107a01.sas from the activities folder.
1. What are the similarities and differences in the syntax of the two steps?
2. Run the program. What are the similarities and differences in the results?
proc sql;
select Name, Age, Height*2.54 as HeightCM format=5.1,
Birthdate format=date9.
from pg1.class_birthdate;
quit;
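For comparison, a rough non-SQL counterpart of this query can be written with a DATA step and PROC PRINT. This is a sketch only and is not necessarily the second step in p107a01.sas; the table name work.class_cm is an illustrative assumption.

```sas
/* Compute the new column and attach formats in a DATA step. */
data work.class_cm;
    set pg1.class_birthdate;
    HeightCM=Height*2.54;
    format HeightCM 5.1 Birthdate date9.;
    keep Name Age HeightCM Birthdate;
run;

/* Produce the report in a separate step. */
proc print data=work.class_cm noobs;
    var Name Age HeightCM Birthdate;
run;
```

The SQL version computes, formats, and reports in a single query, whereas this version creates an intermediate table first.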
WHERE expression
proc sql;
select Name, Age, Height, Birthdate format=date9.
from pg1.class_birthdate
where age > 14;
quit;
proc sql;
select Name, Age, Height, Birthdate format=date9.
from pg1.class_birthdate
where age > 14
order by Height desc;
quit;
The default sort order is ascending.
Files
• p107d01.sas
• storm_final - a SAS table that contains one row per storm for the 1980 through 2017 storm
seasons. The data was cleaned and prepared previously using the DATA step.
Syntax
PROC SQL;
SELECT col-name, col-name FORMAT=fmt
FROM input-table
WHERE expression
ORDER BY col-name <DESC>;
QUIT;
Notes
• PROC SQL creates a report by default.
• The SELECT statement describes the query. After the SELECT keyword, list columns to include in
the results, separated by commas.
• Computed columns can be included in the SELECT clause.
• The FROM clause lists one or more input tables.
• The ORDER BY clause arranges rows based on the listed columns. The default order is
ascending. Use DESC after a column name to reverse the sort sequence.
• PROC SQL ends with a QUIT statement.
Demo
1. Open p107d01.sas from the demos folder and find the Demo section of the program. Add a
SELECT statement to retrieve all columns from pg1.storm_final. Highlight the step and run the
selected code. Examine the log and results.
proc sql;
select *
from pg1.storm_final;
quit;
2. Modify the query to retrieve only the Season, Name, StartDate, and MaxWindMPH columns.
Format StartDate with MMDDYY10. Highlight the step and run the selected code.
proc sql;
select Season, Name, StartDate format=mmddyy10., MaxWindMPH
from pg1.storm_final;
quit;
3. Modify Name in the SELECT clause to convert the values to proper case.
proc sql;
select Season, propcase(Name) as Name,
StartDate format=mmddyy10., MaxWindMPH
from pg1.storm_final;
quit;
4. Add a WHERE clause to include storms during or after the 2000 season with MaxWindMPH
greater than 156.
5. Add an ORDER BY clause to arrange rows by descending MaxWindMPH, and then by Name.
6. Add TITLE statements to describe the report. Highlight the step and run the selected code.
title "International Storms since 2000";
title2 "Category 5 (Wind>156)";
proc sql;
select Season, propcase(Name) as Name,
StartDate format=mmddyy10., MaxWindMPH
from pg1.storm_final
where MaxWindMPH > 156 and Season >= 2000
order by MaxWindMPH desc, Name;
quit;
title;
7.02 Activity
Open p107a02.sas from the activities folder and perform the following tasks:
1. Complete the SQL query to display Event and Cost from
pg1.storm_damage. Format the values of Cost.
2. Add a new column named Season that extracts the year from Date.
3. Add a WHERE clause to return rows where Cost is greater than 25 billion.
4. Add an ORDER BY clause to arrange rows by descending Cost.
Which storm had the highest cost?
PROC SQL;
SELECT col-name, col-name <FORMAT=fmt.>, expression AS col-name
FROM input-table
WHERE expression
ORDER BY col-name <DESC>;
QUIT;
proc sql;
create table work.myclass as
select Name, Age, Height
from pg1.class_birthdate
where age > 14
order by Height desc;
quit;
Adding CREATE TABLE at the beginning of the query turns a report into a table.
proc sql;
drop table work.myclass;
quit;
This is helpful if you are working with DBMS tables that don’t allow you to overwrite existing tables.
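CREATE TABLE, SELECT, and DROP TABLE can all be used in one PROC SQL step. The sketch below uses the sashelp.class sample table so that it does not depend on the course data; the table name work.tall_students and the height cutoff are illustrative assumptions.

```sas
proc sql;
    /* No report is produced: the query result is stored as a table. */
    create table work.tall_students as
        select Name, Age, Height
        from sashelp.class
        where Height > 60
        order by Height desc;

    /* Query the new table to produce a report. */
    select * from work.tall_students;

    /* Remove the table when it is no longer needed. */
    drop table work.tall_students;
quit;
```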
7.2 Joining Tables Using SQL in SAS
class_combine: Only students in both input tables are included.
proc sql;
select Grade, Age, Teacher
from pg1.class_update inner join pg1.class_teachers
on class_update.Name = class_teachers.Name;
quit;
The ON clause specifies the matching criteria.
proc sql;
select class_update.Name, Grade, Age, Teacher
from pg1.class_update inner join pg1.class_teachers
on class_update.Name = class_teachers.Name;
quit;
Because Name occurs in both tables, you must use the table prefix to indicate which column
you want to select.
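To see the inner join behavior without the course data, the sketch below builds two tiny tables with DATALINES. The table names and values are made up for illustration.

```sas
/* Two tiny illustrative tables; the names and values are made up. */
data work.students;
    input Name $ Grade;
    datalines;
Alfred 8
Carol 7
Henry 8
;
run;

data work.teachers;
    input Name $ Teacher $;
    datalines;
Alfred Thomas
Carol Evans
Janet Wong
;
run;

proc sql;
    /* Name exists in both tables, so it must be qualified. */
    select students.Name, Grade, Teacher
        from work.students inner join work.teachers
        on students.Name = teachers.Name;
quit;
```

Only Alfred and Carol appear in the report. Henry has no row in work.teachers and Janet has no row in work.students, so the inner join excludes both.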
Files
• p107d02.sas
• storm_summary – a SAS table that contains one row per storm for the 1980 through 2016 storm
seasons
• storm_basincodes – a SAS table that includes each two-letter basin code and the corresponding
full basin name
Syntax
PROC SQL;
SELECT col-name, col-name
FROM input-table1 INNER JOIN input-table2
ON table1.col-name=table2.col-name;
QUIT;
Notes
• An SQL inner join combines matching rows between two tables.
• The two tables to be joined are listed in the FROM clause separated by INNER JOIN.
• The ON expression indicates how rows should be matched. The column names must be qualified
as table-name.col-name.
Demo
1. Open pg1.storm_summary and pg1.storm_basincodes and compare the columns. Identify
the matching column.
2. Open the p107d02.sas program in the demos folder and find the Demo section of the program.
Add pg1.storm_basincodes to the FROM clause to perform an inner join on Basin. Qualify the
Basin columns as table-name.col-name in the ON expression only.
3. Add the BasinName column to the query after Basin. Highlight the step, run the selected code,
and examine the log. Why does the program fail?
proc sql;
select Season, Name, Basin, BasinName, MaxWindMPH
from pg1.storm_summary inner join pg1.storm_basincodes
on storm_summary.basin=storm_basincodes.basin
order by Season desc, Name;
quit;
4. Modify the query to qualify the Basin column in the SELECT clause. Highlight the step and run
the selected code.
proc sql;
select Season, Name, storm_summary.Basin, BasinName, MaxWindMPH
from pg1.storm_summary inner join pg1.storm_basincodes
on storm_summary.basin=storm_basincodes.basin
order by Season desc, Name;
quit;
proc sql;
select u.Name, Grade, Age, Teacher
from pg1.class_update as u
inner join pg1.class_teachers as t
on u.Name=t.Name;
quit;
7.03 Activity
Open p107a03.sas from the activities folder and perform the following tasks:
1. Define aliases for storm_summary and storm_basincodes in the FROM
clause.
2. Use one table alias to qualify Basin in the SELECT clause.
3. Complete the ON expression to match rows when Basin is equal in the
two tables. Use the table aliases to qualify Basin in the expression. Run
the step.
Links
• Take the SAS SQL 1 course.
• Read PROC SQL by Example.
• Take the SAS SQL Methods and More course.
• Read Practical and Efficient SAS Programming.
• Take the DS2 Programming Essentials course.
• Read Mastering the SAS DS2 Procedure.
https://communities.sas.com/sas-training
7.3 Solutions
Solutions to Activities and Questions
7.03 Activity – Correct Answer
proc sql;
select Season, Name, s.Basin, BasinName, MaxWindMPH
from pg1.storm_summary as s
inner join pg1.storm_basincodes as b
on s.basin=b.basin
order by Season desc, Name;
quit;
The storm_summary table includes some lowercase Basin values. Are they
in the results?
proc sql;
select Season, Name, s.Basin, BasinName, MaxWindMPH
from pg1.storm_summary as s
inner join pg1.storm_basincodes as b
on upcase(s.basin)=b.basin
order by Season desc, Name;
quit;
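The effect of UPCASE in the ON expression can be reproduced on a small scale. The sketch below uses made-up tables (work.storms and work.basins are illustrative names, not course tables) in which one Basin value is lowercase.

```sas
data work.storms;
    input Name $ Basin $;
    datalines;
Alex NA
Bonnie na
Cilla SP
;
run;

data work.basins;
    length Basin $ 2 BasinName $ 20;
    infile datalines dlm=',';
    input Basin $ BasinName $;
    datalines;
NA,North Atlantic
SP,South Pacific
;
run;

proc sql;
    title "Exact match: the lowercase row is dropped";
    select s.Name, s.Basin, BasinName
        from work.storms as s inner join work.basins as b
        on s.Basin = b.Basin;

    title "UPCASE match: all three storms are returned";
    select s.Name, s.Basin, BasinName
        from work.storms as s inner join work.basins as b
        on upcase(s.Basin) = b.Basin;
quit;
title;
```

Character comparisons in SAS are case sensitive, so "na" does not equal "NA" until the value is converted with UPCASE. The Basin column in the report still shows the original lowercase value because the conversion is applied only in the ON expression.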