SAS Tutorial: Structure of a SAS Program
data oranges;
   input variety $ flavor texture looks;
   total=flavor+texture+looks;
   datalines;
navel 9 8 6

SAS names: Names you give to SAS variables, SAS data sets, etc.
● Max. of 8 characters long (the limit in older SAS releases; Version 7 and later allow up to 32)
● Must begin with an alpha character
● Can use the notation Q1-Q12 to name a list of numbered variables (Q1, Q2, ..., Q12)
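The DATA step fragment above can be completed into a small program that runs end to end. This is a minimal sketch: only the `navel 9 8 6` record comes from the slide, and the other two records are illustrative rows added here so that the listing has more than one observation.

```sas
/* Sketch of a complete program: a DATA step followed by a PROC step. */
data oranges;
   input variety $ flavor texture looks;   /* $ marks VARIETY as character */
   total = flavor + texture + looks;       /* computed for every observation */
   datalines;
navel 9 8 6
temple 7 7 7
valencia 8 9 9
;
run;

/* List the data set that was just created. */
proc print data=oranges;
run;
```

The rows after `navel` are assumptions for illustration; any records with one name followed by three numbers would work the same way.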
SAS Data Sets
● A SAS data set (or table) consists of data values arranged in a rectangular table of rows and columns.
● If a data value is not present for a variable in a particular observation, it is considered missing.
   A missing character value is displayed as a blank.
   A missing numeric value is displayed as a period (.).
● A format is used to control how values are displayed. Formats do not affect how values are stored.

Format    Width   Decimal Places   Stored value
DOLLAR    8       2                234.60
MMDDYY    10                       15766
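To see that a format changes only the display and not the stored value, the two stored values from the slide can be written to the log with and without a format. This is a small sketch; the variable names `amount` and `sasdate` are made up for the example, and the pairing of widths with formats (DOLLAR8.2 and MMDDYY10.) is read off the slide layout.

```sas
data _null_;
   amount  = 234.60;   /* stored value from the DOLLAR example */
   sasdate = 15766;    /* stored value from the MMDDYY example: days since 01JAN1960 */

   /* With formats: controls only how the values are displayed. */
   put amount= dollar8.2 sasdate= mmddyy10.;

   /* Without formats: the stored values are unchanged. */
   put amount= sasdate=;
run;
```

The first PUT statement should show `$234.60` and `03/02/2003` in the log, while the second shows the raw numbers 234.6 and 15766.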
SAS Data Step
Flow of Operations: Data Step
Simple INPUT Statements

List Input
   INPUT ID SEX $ AGE WEIGHT ;
   1342 F 27 121.2

   INPUT SCORE1-SCORE4 ;
   63.1 94 87.5 72

Formatted Input
   INPUT ID 4. STATE $2. FERT 5.2 PERCENT 3.2 ;
   0001IA__504089      (each _ appears to mark a blank column)

SAS Program Statements
   Y1 = X1+X2**2;
   Y2 = ABS(X3) ;
   Y3 = SQRT(X4 + 4.0*X5**2) - X6 ;
   X7=3.14156*LOG(X7);

   IF INCOME = . THEN DELETE ;
   IF STATE = 'CA' | STATE = 'OR' THEN
      REGION = 'PACIFIC COAST' ;

   IF SCORE < 0 THEN SCORE = 0;

   IF SCORE < 80 THEN WEIGHT=.67;
   ELSE WEIGHT=.75;
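The formatted INPUT statement can be tried on the sample record to see how the `w.d` informats split a line with no delimiters. This is a sketch under one assumption: the two marker characters after `IA` in the slide's data line stand for blank columns, so the record is typed here with two spaces. The data set name `fertdata` is made up.

```sas
data fertdata;
   /* Columns 1-4: ID; 5-6: STATE; 7-11: FERT; 12-14: PERCENT */
   input id 4. state $2. fert 5.2 percent 3.2;
   datalines;
0001IA  504089
;
run;

proc print data=fertdata;
run;
```

Because the fields contain no decimal point, the `d` part of each informat supplies it: `  504` read with `5.2` becomes 5.04 and `089` read with `3.2` becomes 0.89, while ID is read as 1 and STATE as IA.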
SAS Functions
A SAS function is a routine that returns a value that is determined from specified arguments.

General form of a SAS function:
   function-name(argument1,argument2, . . .)

● Examples:
   date=mdy(month,day,year)
   ave=mean(flavor,texture,looks)
   id=substr(item,1,2)

Using SAS Functions
SAS functions can do the following:
● perform arithmetic operations
● compute sample statistics (for example: sum, mean, and standard deviation)
● manipulate SAS dates
● process character values
● perform many other tasks

Sample statistics functions ignore missing values.
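The three example functions, and the rule that sample statistics functions ignore missing values, can be checked in a short DATA step that writes to the log. The input values below are made up for illustration; the contrast with the `+` operator is added here to show why the rule matters.

```sas
data _null_;
   flavor = 9;  texture = .;  looks = 6;     /* TEXTURE is missing */

   ave   = mean(flavor, texture, looks);     /* missing value ignored: (9+6)/2 */
   total = flavor + texture + looks;         /* plain arithmetic yields missing */

   date = mdy(3, 2, 2003);                   /* builds a SAS date value */
   id   = substr('AB1234', 1, 2);            /* first two characters */

   put ave= total= date= mmddyy10. id=;
run;
```

Here `ave` comes out as 7.5 while `total` is missing, `date` displays as 03/02/2003, and `id` is AB.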
Simple Uses of Proc SORT

   PROC SORT ;
   BY STATE ;

● Input data set is the one created most recently
● Output data set replaces the input data set
● Observations will be arranged in the increasing order of the values for STATE
● More importantly, the observations with the same value for STATE will appear together

   PROC SORT ;
   BY STATE CITY ;

● Observations will be arranged in the increasing order of STATE. Within each STATE, the observations will be arranged in the increasing order of CITY
● Observations with the same values for STATE and CITY will be grouped together

Simple Uses of Proc SORT (cont)

   PROC SORT DATA=SURVEY ;
   BY GENDER INCOME ;
   PROC PRINT ;
   BY GENDER ;
   PROC MEANS MEAN VAR ;
   BY GENDER INCOME ;
   VAR AGE FOOD RENT ;

Output 1
   GENDER=F
   Listing of observations in the increasing order of INCOME
   GENDER=M
   Listing of observations in the increasing order of INCOME
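The sort-then-process-by-group pattern can be tried on a small data set. This is a sketch with made-up survey records; it also uses the `OUT=` option of PROC SORT, which is not on the slides, to write the sorted copy to a new data set (`survey_sorted`, a hypothetical name) instead of replacing the input.

```sas
/* Hypothetical survey records, for illustration only. */
data survey;
   input gender $ income age food rent;
   datalines;
M 42000 35 400 900
F 38000 29 350 850
F 51000 44 500 1200
M 30000 23 300 700
;
run;

/* Sort by GENDER, then by INCOME within each GENDER. */
proc sort data=survey out=survey_sorted;
   by gender income;
run;

/* One listing per GENDER, each in increasing order of INCOME. */
proc print data=survey_sorted;
   by gender;
run;

/* Mean and variance computed separately for each GENDER. */
proc means data=survey_sorted mean var;
   by gender;
   var age food rent;
run;
```

Naming the data set explicitly with `DATA=` in every step avoids relying on "the one created most recently", which is easy to get wrong in longer programs.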
LABEL and FORMAT Statements

   LABEL variable = 'label' ;
   e.g., LABEL PTS='AEROBIC POINTS'
               PERCENT='% OF TOTAL FAT' ;

   FORMAT variables [format] ;
   e.g., FORMAT WEIGHT HEIGHT 8.2
                INCOME DOLLAR12.2;

● These 2 statements can be used in both DATA and PROC steps
● In a DATA step, these statements will associate labels and formats permanently with the variable specified
● Also look up the LENGTH, INFORMAT, and ATTRIB statements ….

Some Special Pointer Controls

#n : for reading multiple data lines when inputting an observation
   INPUT A 5. B 6.2 #2 @15 C 7.3 ;
   INPUT AGE GRADE/SCORE1-SCORE5 ;

Trailing @ : for holding a data line for another INPUT statement in the same data step
   DATA ;
   INPUT TYPE $1 @ ;
   IF TYPE='C' THEN
      INPUT ;
   ELSE
      INPUT ;

Trailing @@ : for holding a data line for repeated execution of the same data step
   DATA ;
   INPUT NAME $ VERBAL MATH @@ ;
   TOTAL=VERBAL+MATH;
   DATALINES ;
   Sue 610 560 John 720 640 Mary 580 590
   Jim 650 760 Bernard 690 670 Gary 570 680
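The trailing @ pattern is easier to follow with the two INPUT statements filled in. The slide leaves their variable lists blank, so everything below is an assumption made for illustration: the record layouts, the variable names (`name`, `credits`, `hours`, `rate`), the data set name `mixed`, and the data lines are all hypothetical.

```sas
data mixed;
   input type $1. @;                   /* read the type code and hold the line */
   if type = 'C' then
      input @3 name $ credits;         /* hypothetical layout for type C records */
   else
      input @3 name $ hours rate;      /* hypothetical layout for all other records */
   datalines;
C Stat101 3
H Tutor 10 12.5
;
run;

proc print data=mixed;
run;
```

The first INPUT reads only the type code; because of the trailing @, the second INPUT continues on the same data line rather than moving to the next one. Variables that a record type does not read are left missing for that observation.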
Arrays (continued)

Example 3:
   DATA FALL01;
   INPUT NAME $ TEST1-TEST10 ;
   ARRAY TEST {*} TEST1-TEST10 ;
   DROP TEST1-TEST10 ;
   DO TESTNO=1 TO 10 ;
      SCORE=TEST{TESTNO} ; OUTPUT ;
   END ;
   DATALINES ;
   SMITH 8 7 9 . 3 10 9 7 5 7
   JONES 4 5 6 8 4 5 . 8 7 6
   ;

Creates the following dataset

   NAME    TESTNO   SCORE
   SMITH      1       8
   SMITH      2       7
   . . .
   JONES      1       4
   . . .
   JONES     10       6