
SAS Tutorial: Structure of a SAS Program

A SAS program consists of DATA and PROC steps that are executed sequentially. DATA steps create or modify SAS data sets, and PROC steps run statistical procedures on the data. SAS statements are interpreted in the order they appear, but the statements that make up a DATA or PROC step are executed together as a group.

Uploaded by: amietgupta
Copyright: © Attribution Non-Commercial (BY-NC)

SAS TUTORIAL
Mervyn Marasinghe

Structure of a SAS Program

• SAS programs consist of SAS statements
  - General Form: SAS_keyword operand;
    e.g. PROC ANOVA DATA=CORN;
• SAS statements are interpreted and executed in their order of appearance
• However, certain blocks of statements called steps, which define one of two basic activities, are executed as groups
  - DATA step: statements leading to the creation of a SAS data set
  - PROC step: statements needed to tell a SAS procedure to perform a statistical analysis
• A SAS program consists of several logically related DATA and PROC steps

Sample SAS Program

data oranges;
  input variety $ flavor texture looks;
  total=flavor+texture+looks;
  datalines;
navel 9 8 6
temple 7 7 7
valencia 8 9 9
mandarin 5 7 8
;
run;
proc sort data=oranges;
  by descending total;
run;
proc print data=oranges;
  title 'Taste Test Results for Oranges';
run;

Some Syntax Rules

• SAS names: Names you give to SAS variables, SAS data sets etc.
  - Max. of 8 characters long (the limit in SAS Version 6; Version 7 and later allow up to 32 characters for variable and data set names)
  - Must begin with a letter or an underscore
  - Can use the notation Q1-Q12 to name a list of variables Q1, Q2, …, Q12
• SAS statements:
  - Begin and end anywhere in a line
  - Must end with a semicolon (;)
  - More than one statement may appear in a line
  - A statement may take more than one line
  - Items in SAS statements must be separated by at least one blank (unless already separated by a SAS symbol)
    e.g., x = y; x =y; x= y; x=y;
SAS Data Sets

• Consist of data values arranged in a rectangular array
  - Columns of this array represent the SAS variables
  - Rows represent observations
• In addition to the data values, the attributes associated with each variable are also kept in the SAS data set
• A SAS data set has a special structure which is different from data sets you create using an editor, such as an ASCII data file
• Most SAS procedures can access only SAS data sets
• SAS data sets are usually created in a data step and may be temporary or permanent

SAS Data Sets

A SAS data set (or table) is a rectangular table of rows and columns.

[Figure: grid of data values; rows are labeled "Rows (observations)" and columns are labeled "Columns (variables)"]

SAS Data Set

A format is used to control how values are displayed. Formats do not affect how values are stored.

Format: MMDDYY   Width: 10                     Stored value: 15766
Format: DOLLAR   Width: 8   Decimal Places: 2  Stored value: 234.60

Missing Values

If a data value is not present for a variable in a particular observation, it is considered missing.
• A missing character value is displayed as a blank.
• A missing numeric value is displayed as a period.
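The two slides above can be tried out with a short step. This is a minimal sketch, not part of the original tutorial: the data set name DEMO and the variable names are hypothetical, and the two stored values are the ones shown on the format slide. The second record supplies missing values for every variable.

```sas
/* Hypothetical data set DEMO: stored values versus formatted display */
data demo;
  input name $ saledate amount;
  /* formats change only how the values are displayed */
  format saledate mmddyy10. amount dollar8.2;
  datalines;
Ann 15766 234.60
. . .
;
run;

/* Listing with formats applied; missing numerics print as a period */
proc print data=demo;
run;

/* Same listing with formats removed, showing the stored values */
proc print data=demo;
  format _all_;
run;
```

Comparing the two listings shows that the format affects only the printed form: the stored numbers are the same in both.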
SAS Data Step

• Begins with the statement
  - DATA name ;
• Followed by one of these statements:
  - INPUT … ;
  - SET … ;
  - MERGE … ;
  - UPDATE … ;
• A SAS data step is (usually) used to create a new SAS data set from
  - external data (using an INPUT statement)
  - another SAS data set (using a SET statement)
• SAS program statements are used in a SAS data step to modify input data, if necessary

Flow of Operations: Data Step

[Flowchart]
START
• SAS reads a line of data
  - If no more lines of data to input: SAS closes the data set and goes on to next DATA or PROC statement
• SAS carries out program statements for this data line and creates new observation
• SAS adds observation to new data set
• SAS returns for a new line of data
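As an illustration of the two common sources for a data step, the sketch below first reads external data with INPUT and then builds a second data set from the first with SET. The data set name BEST and the cutoff of 23 are assumptions made for this example only.

```sas
/* Step 1: create a data set from raw data lines with INPUT */
data oranges;
  input variety $ flavor texture looks;
  datalines;
navel 9 8 6
temple 7 7 7
valencia 8 9 9
mandarin 5 7 8
;
run;

/* Step 2: create a new data set from an existing one with SET */
data best;                      /* hypothetical name */
  set oranges;                  /* reads one observation per iteration */
  total = flavor + texture + looks;
  if total >= 23;               /* subsetting IF: keep only these rows */
run;

proc print data=best;
run;
```

Each iteration of the second step follows the flow described above: read one observation, run the program statements, then add the observation to the new data set.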

SAS Data Step: Flow of Operations

data oranges;
  input variety $ flavor texture looks;
  total=flavor+texture+looks;
  datalines;
navel 9 8 6
temple 7 7 7
valencia 8 9 9
mandarin 5 7 8
;
run;

Variety   Flavor   Texture   Looks   Total
navel     9        8         6       23
temple    7        7         7       21

SAS Variables

• Data values stored in SAS data sets are of two types
  - Character
  - Numeric
• SAS variables that contain these values have several attributes associated with them
  - Name
  - Type
  - Length (in bytes)
  - Relative position
  - Informat
  - Format
  - Label
Simple INPUT Statements

• List Input
  INPUT ID SEX $ AGE WEIGHT ;
  1342 F 27 121.2

  INPUT SCORE1-SCORE4 ;
  63.1 94 87.5 72

• Formatted Input
  INPUT ID 4. STATE $2. FERT 5.2 PERCENT 3.2 ;
  0001IA_ _504089

  INPUT @10 ITEM $4. +5 PRICE 6.2;
  xxxxxxxxxR2D2xxxxx_91350

  INPUT (ID SEX AGE WT HT) (3. $1. 2. 2*5.1);
  123M21_1650_ _721

• Column Input
  INPUT ID 1-4 STATE $ 5-6 FERT 7-12 PERCENT 13-15 .2 ;
  0001IA_ _5.04_89

SAS Program Statements

Y1 = X1+X2**2;
Y2 = ABS(X3) ;
Y3 = SQRT(X4 + 4.0*X5**2) - X6 ;
X7=3.14156*LOG(X7);

IF INCOME = . THEN DELETE ;
IF STATE = 'CA' | STATE = 'OR' THEN
   REGION = 'PACIFIC COAST' ;
IF SCORE < 0 THEN SCORE = 0;

IF SCORE < 80 THEN WEIGHT=.67;
ELSE WEIGHT=.75;

WEIGHT = (SCORE < 80 ) * .67 + (SCORE >=80) * .75;

IF SCORE <80 THEN DO;
   WEIGHT =.67;
   RATE=5.70;
END;
ELSE DO;
   WEIGHT =.75;
   RATE=6.50;
END;

SAS Program Statements (continued)

IF 6.5<=RATE<=7.5 THEN GO TO USEIT;
   :
   SAS statements to calculate a new value for RATE
   :
USEIT: COST = HOURS*RATE;
   :

DATA ;
  INPUT X1-X5 ;
  X6 = (X4+X5) / 2 ;
  DROP X4 X5 ;
  DATALINES ;
  …

Order of Evaluating Expressions

• Rule 1: Expressions within parentheses are evaluated first
• Rule 2: Higher priority operators are performed first
    Group I     **, + (prefix), - (prefix), ^ (NOT), ><, <>
    Group II    *, /
    Group III   + (infix), - (infix)
    Group IV    ||
    Group V     <, <=, =, ^=, >=, >, ^>, ^<
    Group VI    & (AND)
    Group VII   | (OR)
• Rule 3: For operators with the same priority, the operations take place from left to right of the expression (except for Group I operators, which are executed right to left.)
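The three rules can be checked with a small DATA _NULL_ step that writes its results to the log. This is a sketch added for illustration; the variable names and the value assigned to RATE are arbitrary.

```sas
/* Operator precedence: each comment gives the value implied by the rules */
data _null_;
  a = 2 + 3 * 4;              /* Rule 2: * before +         -> 14  */
  b = (2 + 3) * 4;            /* Rule 1: parentheses first  -> 20  */
  c = 2 ** 3 ** 2;            /* Rule 3: Group I right to left,
                                 2 ** (3 ** 2)              -> 512 */
  d = -3 ** 2;                /* prefix minus and ** are both Group I,
                                 so -(3 ** 2)               -> -9  */
  rate = 7;
  e = (6.5 <= rate <= 7.5);   /* true comparison evaluates to 1    */
  put a= b= c= d= e=;
run;
```

The last assignment also shows why an expression such as (SCORE < 80) * .67 works: a comparison yields 1 when true and 0 when false.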
SAS Functions

A SAS function is a routine that returns a value that is determined from specified arguments.

General form of a SAS function:

function-name(argument1,argument2, . . .)

• Examples:
  date=mdy(month,day,year)
  ave=mean(flavor,texture,looks)
  id=substr(item,1,2)

Using SAS Functions

• SAS functions can do the following:
  - perform arithmetic operations
  - compute sample statistics (for example: sum, mean, and standard deviation)
  - manipulate SAS dates
  - process character values
  - perform many other tasks
• Sample statistics functions ignore missing values.
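The point about missing values is easiest to see by comparing a function with the equivalent arithmetic. The sketch below uses made-up values, with TEXTURE deliberately set to missing, and writes the results to the log.

```sas
/* MEAN skips missing arguments; the + operator propagates them */
data _null_;
  flavor = 9;
  texture = .;                            /* missing on purpose */
  looks = 6;
  ave   = mean(flavor, texture, looks);   /* (9 + 6) / 2 = 7.5 */
  total = flavor + texture + looks;       /* missing (.) */
  date  = mdy(3, 2, 2003);                /* SAS date value: days since 01JAN1960 */
  id    = substr('R2D2', 1, 2);           /* first two characters: R2 */
  put ave= total= date= id=;
run;
```

When a total is needed despite missing components, SUM(FLAVOR, TEXTURE, LOOKS) is the usual alternative to the + operator.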

The Proc Step

A simple proc step:
  PROC PRINT;
Prints a listing of
• the most recently created data set
• all variables contained in the data set
• all observations in the data set

General form of the proc step:
  PROC proc-name options-list;
    procedure information statements
    variable attribute statements

Examples of options:
  DATA=NEW STDERR KENDALL
Examples of procedure information statements:
  VAR … ;
  BY … ;
  CLASS … ;
  MODEL … ;
  OUTPUT … ;
Examples of variable attribute statements:
  DROP … ;
  FORMAT … ;
  LABEL … ;

The BY Statement

  BY variables_list;

• Allows the processing of subsets of the same data set in one proc step
• Procedure is executed repeatedly on each subset of observations defined by values of variables listed in the BY statement
• Subset of observations to be processed together is defined as those having the same values for the specified variables
• These observations must occur together (i.e., contiguously) in the data set
• Usually the given data set to be processed in subsets must be rearranged so that the observations are in the required groupings
• This rearrangement is most easily achieved using PROC SORT
Simple Uses of Proc SORT

• PROC SORT ;
  BY STATE ;
  - Input data set is the one created most recently
  - Output data set replaces the input data set
  - Observations will be arranged in the increasing order of the values for STATE
  - More importantly, the observations with the same value for STATE will appear together

• PROC SORT ;
  BY STATE CITY ;
  - Observations will be arranged in the increasing order of STATE. Within each STATE, the observations will be arranged in the increasing order of CITY
  - Observations with the same values for STATE and CITY will be grouped together

Simple Uses of Proc SORT (cont)

• PROC SORT DATA=SURVEY ;
  BY GENDER INCOME ;

  PROC PRINT ;
  BY GENDER ;

  PROC MEANS MEAN VAR ;
  BY GENDER INCOME ;
  VAR AGE FOOD RENT ;

• Output 1
  GENDER=F
    Listing of observations in the increasing order of INCOME
  GENDER=M
    Listing of observations in the increasing order of INCOME

Simple Uses of Proc SORT (cont)

Output 2
  GENDER=F INCOME=1
    Means and variances of each of the variables AGE etc.
  GENDER=F INCOME=2
    …
  GENDER=F INCOME=3
    …
  GENDER=M INCOME=1
    …
  GENDER=M INCOME=2
    …
  GENDER=M INCOME=3

Examples of SAS Procedures

PROC MEANS MEAN VAR ;
  VAR AGE INCOME ;
Other Options: DATA= , NOPRINT, MAXDEC=, N, MISS, MEAN, STD, MIN, MAX, RANGE, SUM, VAR, STDERR, CV, T, PRT etc.

PROC PLOT ;
  PLOT Y*X ;
  PLOT Y1*X='*' Y2*X='.' / OVERLAY ;
  PLOT HEIGHT*AGE/HPOS=30 VPOS=40 ;
Other Options: DATA= , UNIFORM, NOLEGEND

PROC CORR ;
  VAR X1 X2 ;
  WITH X3 X4 ;
Some Options: DATA= , PEARSON, SPEARMAN, BEST= , NOSIMPLE, NOPRINT, NOPROB etc.

BY statement could be used with all 3 of the above procedures as well as PROC PRINT
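The SURVEY example from the PROC SORT slides can be made runnable by supplying a small data set. The data values below are invented purely for illustration; only the variable names and the three proc steps come from the slides.

```sas
/* Invented survey records: gender, income class, age, food and rent costs */
data survey;
  input gender $ income age food rent;
  datalines;
M 2 34 310 900
F 1 27 250 650
F 2 45 400 1100
M 1 22 200 500
F 1 31 270 700
;
run;

/* Sort so that observations with equal BY values are contiguous */
proc sort data=survey;
  by gender income;
run;

/* One listing per gender; rows within each listing are ordered by income */
proc print data=survey;
  by gender;
run;

/* One set of means and variances per gender/income combination */
proc means data=survey mean var;
  by gender income;
  var age food rent;
run;
```

Sorting by GENDER INCOME also satisfies the weaker requirement of BY GENDER, which is why a single PROC SORT serves both later steps. Groups with only one observation report a missing variance.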
LABEL and FORMAT Statements

LABEL variable = 'label' … ;
  e.g., LABEL PTS='AEROBIC POINTS'
              PERCENT='% OF TOTAL FAT' ;
FORMAT variables [format] … ;
  e.g., FORMAT WEIGHT HEIGHT 8.2
               INCOME DOLLAR12.2;

• These 2 statements can be used in both DATA and PROC steps
• In a DATA step, these statements will associate labels and formats permanently with the variable specified
• Also look up the
  - LENGTH
  - INFORMAT
  - ATTRIB
  statements …

Some Special Pointer Controls

• #n : for reading multiple data lines when inputting an observation
  - INPUT A 5. B 6.2 #2 @15 C 7.3 ;
  - INPUT AGE GRADE/SCORE1-SCORE5 ;
• Trailing @ : for holding a data line for another INPUT statement in the same data step
  - DATA ;
    INPUT TYPE $1 @ ;
    IF TYPE='C' THEN
       INPUT … ;
    ELSE
       INPUT … ;
• Trailing @@ : for holding a data line for repeated execution of the same data step
  - DATA ;
    INPUT NAME $ VERBAL MATH @@ ;
    TOTAL=VERBAL+MATH;
    DATALINES ;
    Sue 610 560 John 720 640 Mary 580 590
    Jim 650 760 Bernard 690 670 Gary 570 680
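Both trailing pointer controls can be seen in action in the sketch below. The first step completes the @@ example from the slide; the data set name SAT is an assumption. The second step fills in the elided INPUT statements with a hypothetical record layout (city or state records), which is not from the original slide.

```sas
/* Trailing @@: several observations are read from each data line */
data sat;                                /* hypothetical name */
  input name $ verbal math @@;
  total = verbal + math;
  datalines;
Sue 610 560 John 720 640 Mary 580 590
Jim 650 760 Bernard 690 670 Gary 570 680
;
run;

proc print data=sat;                     /* six observations */
run;

/* Trailing @: hold the line, then choose the INPUT by record type */
data mixed;                              /* hypothetical layout */
  input type $1. @;
  if type = 'C' then
    input @3 city $ pop;                 /* C records: city and population */
  else
    input @3 state $ area;               /* other records: state and area */
  datalines;
C Ames 60000
S IA 56000
;
run;

proc print data=mixed;
run;
```

In the second step each observation has missing values for the variables that its record type does not supply.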

OUTPUT Statement

• OUTPUT [ datasetname] ;
• Example
  DATA FALL01;
    INPUT NAME $ TEST1-TEST3 ;
    DROP TEST1-TEST3 ;
    TEST=1; SCORE=TEST1; OUTPUT;
    TEST=2; SCORE=TEST2; OUTPUT;
    TEST=3; SCORE=TEST3; OUTPUT;
    DATALINES;
  SMITH 67 92 74
  JONES 76 81 72
  ;
  This data step creates the dataset
    NAME    TEST   SCORE
    SMITH   1      67
    SMITH   2      92
    SMITH   3      74
    JONES   1      76
    JONES   2      81
    JONES   3      72
• Example
  DATA YR81 YR82 ;
    INPUT YEAR PRODUCT $ QTY SALES ;
    IF YEAR=81 THEN OUTPUT YR81 ;
    ELSE OUTPUT YR82 ;
    DATALINES;
  data lines
  ;

Arrays

• Example 1:
  DATA SCORES;
    INPUT QUIZ1-QUIZ5 TEST1-TEST3;
    ARRAY SCORES {8} QUIZ1-QUIZ5 TEST1-TEST3;
    DO I= 1 TO 8 ;
      IF SCORES{I}=. THEN SCORES{I}=0 ;
    END ;
    DATALINES;
  Can use with SAS Expressions and can be used in assignment statements and in DO WHILE, DO UNTIL, IF-THEN, INPUT, PUT, ETC.
• Example 2:
  DATA LOAD;
    INPUT D1-D7;
    ARRAY DAY {7} D1-D7 ;
    ARRAY HOUR {7} H1-H7 ;
    DO I= 1 TO 7 ;
      IF DAY{I}=999 THEN DAY{I}=. ;
      HOUR{I}=DAY{I}*12 ;
    END ;
  Could have nested DO structures and multidimensional arrays.
  See ARRAY statement in SAS Language Reference: Dictionary
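The remark about nested DO structures and multidimensional arrays can be illustrated with a short sketch. The data set name GRID, the array name CELL, and the fill rule 10*R + K are all made up for this example.

```sas
/* Two-dimensional array filled by nested DO loops */
data grid;                      /* hypothetical name */
  array cell{2,3} c1-c6;        /* 2 rows by 3 columns, stored row by row */
  do r = 1 to 2;
    do k = 1 to 3;
      cell{r,k} = 10*r + k;     /* e.g. row 2, column 3 -> 23 */
    end;
  end;
  drop r k;                     /* keep only C1-C6 */
run;

proc print data=grid;
run;
```

Because the elements are assigned to C1-C6 in row order, CELL{1,1} to CELL{1,3} map to C1-C3 and CELL{2,1} to CELL{2,3} map to C4-C6.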
Arrays (continued)

• Example 3:
  DATA FALL01;
    INPUT NAME $ TEST1-TEST10 ;
    ARRAY TEST {*} TEST1-TEST10 ;
    DROP TEST1-TEST10 ;
    DO TESTNO=1 TO 10 ;
      SCORE=TEST{TESTNO} ; OUTPUT ;
    END ;
    DATALINES ;
  SMITH 8 7 9 . 3 10 9 7 5 7
  JONES 4 5 6 8 4 5 . 8 7 6
  ;
  Creates the following dataset
    NAME    TESTNO   SCORE
    SMITH   1        8
    SMITH   2        7
    …       …        …
    JONES   1        4
    …       …        …
    JONES   10       6

• Could be used with DO, DO WHILE or DO UNTIL statements
