SAS Programming by Example (14): Chapter 14, Efficiency: Making Your Programs More Efficient
   IF 0 LE AGE LE 20 THEN AGEGRP='0 TO 20';
   ELSE IF 21 LE AGE LE 40 THEN AGEGRP='21 TO 40';
   ELSE IF AGE GT 40 THEN AGEGRP='>40';
DATALINES;
1 23 68 155
...
;
RUN;

PROC MEANS N MEAN STD DATA=ONE;
   VAR AGE HEIGHT WEIGHT;
RUN;

PROC FREQ;
   TABLES AGEGRP;
RUN;


Example 3 Processing Selected Raw Data Records Using IF Statements

Inefficient

DATA SUBSET;
   INFILE 'input_file_spec';
   INPUT @1 ID 3.
         @5 GENDER $1.
         @6 (Q1-Q10) ($1.);
   IF GENDER='M';
RUN;

Efficient

DATA SUBSET;
   INFILE 'input_file_spec';
   INPUT @5 GENDER $1. @;   /* read GENDER only and hold the record */
   IF GENDER='M';           /* skip the rest of the step for other records */
   INPUT @1 ID 3.
         @6 (Q1-Q10) ($1.);
RUN;


Example 4 Processing Selected SAS Data Set Observations Using WHERE Statements in Procedures

Inefficient

DATA SUBSET;
   SET OLD;
   WHERE GROUP='A';
RUN;

PROC UNIVARIATE DATA=SUBSET;
   VAR X Y Z;
RUN;

Efficient

PROC UNIVARIATE DATA=OLD;
   WHERE GROUP='A';
   VAR X Y Z;
RUN;


Example 5 Dropping Unnecessary Variables When Building a SAS Data Set from Raw Data

Inefficient

DATA SCORE;
   ARRAY KEY[5] $ 1;
   ARRAY Q[5] $ 1;
   RETAIN KEY1 "A" KEY2 "B" KEY3 "C" KEY4 "D" KEY5 "E";
   INPUT (Q1-Q5) ($1.);
   RAW=0;   /* reset for each observation: the sum statement retains RAW */
   DO I=1 TO 5;
      RAW+(Q[I]=KEY[I]);
   END;
   PERCENT=100*RAW/5;
DATALINES;
ABCDA
BBCAC
EBCAD
...
;

Efficient

DATA SCORE;
   ARRAY KEY[5] $ 1;
   ARRAY Q[5] $ 1;
   RETAIN KEY1 "A" KEY2 "B" KEY3 "C" KEY4 "D" KEY5 "E";
   INPUT (Q1-Q5) ($1.);
   RAW=0;   /* reset for each observation: the sum statement retains RAW */
   DO I=1 TO 5;
      RAW+(Q[I]=KEY[I]);
   END;
   PERCENT=100*RAW/5;
   KEEP RAW PERCENT;   *or DROP KEY1-KEY5 Q1-Q5 I;
DATALINES;
ABCDA
BBCAC
EBCAD
...
;


Example 6 Dropping Unnecessary Variables When Building a SAS Data Set by Setting an Existing One

Inefficient

DATA NEW;
   SET OLD;
   ...
   DROP X1-X20;
RUN;

Efficient

DATA NEW;
   SET OLD (DROP=X1-X20);
   ...
RUN;


Example 7 Using a LENGTH Statement

Inefficient

DATA LONG;
   INPUT ID 1-3
         @4  (Q1-Q10) (1.)
         @15 HEIGHT 2.
         @17 WEIGHT 3.;
DATALINES;
...

Efficient

DATA LONG;
   LENGTH HEIGHT WEIGHT 4;
   INPUT ID 1-3
         @4  (Q1-Q10) ($1.)
         @15 HEIGHT 2.
         @17 WEIGHT 3.;
DATALINES;
...


Example 8 Using IF-THEN/ELSE Statements Instead of Multiple IF Statements

Inefficient

DATA ONE;
   SET TWO;
   IF 0 LE AGE LE 10 THEN AGEGRP=1;
   IF 10 LT AGE LE 20 THEN AGEGRP=2;
   IF 20 LT AGE LE 30 THEN AGEGRP=3;
   IF 30 LT AGE LE 40 THEN AGEGRP=4;
   IF AGE GT 40 THEN AGEGRP=5;
RUN;

Efficient

DATA ONE;
   SET TWO;
   IF 0 LE AGE LE 10 THEN AGEGRP=1;
   ELSE IF 10 LT AGE LE 20 THEN AGEGRP=2;
   ELSE IF 20 LT AGE LE 30 THEN AGEGRP=3;
   ELSE IF 30 LT AGE LE 40 THEN AGEGRP=4;
   ELSE IF AGE GT 40 THEN AGEGRP=5;
RUN;


Example 9 Arranging the Order of Your IF Statements

Inefficient

DATA ONE;
   SET TWO;
   IF 0 LE AGE LE 10 THEN AGEGRP=1;
   ELSE IF 10 LT AGE LE 20 THEN AGEGRP=2;
   ELSE IF 20 LT AGE LE 30 THEN AGEGRP=3;
   ELSE IF 30 LT AGE LE 40 THEN AGEGRP=4;
   ELSE IF AGE GT 40 THEN AGEGRP=5;
RUN;

Efficient

DATA ONE;
   SET TWO;
   /* Test the most likely conditions first. This order assumes that */
   /* most observations have AGE between 30 and 40, then over 40.    */
   IF 30 LT AGE LE 40 THEN AGEGRP=4;
   ELSE IF AGE GT 40 THEN AGEGRP=5;
   ELSE IF 0 LE AGE LE 10 THEN AGEGRP=1;
   ELSE IF 10 LT AGE LE 20 THEN AGEGRP=2;
   ELSE IF 20 LT AGE LE 30 THEN AGEGRP=3;
RUN;


Example 10 Using Multiple OR Operators Instead of the IN Operator

Inefficient

DATA NEW;
   SET OLD;
   IF GROUP IN ('A','C','E','Z');
RUN;

Efficient

DATA NEW;
   SET OLD;
   IF GROUP='A' OR GROUP='C' OR GROUP='E' OR GROUP='Z';
RUN;


Example 11 Using DATA _NULL_ When Creating Reports

Inefficient

DATA UNNECESS;
   FILE PRINT;
   SET OLD;
   IF AGE GT 50 THEN PUT ID= AGE=;
RUN;

Efficient

DATA _NULL_;
   FILE PRINT;
   SET OLD;
   IF AGE GT 50 THEN PUT ID= AGE=;
RUN;


Example 12 Saving Data in SAS System Files

Inefficient

DATA TEMP;
   INFILE 'input_file_spec';
   INPUT ...;
RUN;

PROC anyproc DATA=TEMP;
   (SAS Statements)
RUN;

(at a future time)

DATA TEMP;
   INFILE 'input_file_spec';
   INPUT ...;
RUN;

PROC anotherproc DATA=TEMP;
   (SAS Statements)
RUN;

Efficient

LIBNAME libref 'SAS_data_library';

DATA libref.PERMAN;
   INFILE 'input_file_spec';
   INPUT ...;
RUN;

PROC anyproc DATA=libref.PERMAN;
   (SAS Statements)
RUN;

(at a future time)

PROC anotherproc DATA=libref.PERMAN;
   (SAS Statements)
RUN;


Example 13 Using PROC DATASETS to Modify Variables

Inefficient

LIBNAME libref 'SAS_data_library';

DATA NEW;
   SET libref.OLD (RENAME=(X=NEWX Y=NEWY));
   LABEL HT='Height of Subject';
   FORMAT DOB MMDDYY8.;
RUN;

Efficient

LIBNAME libref 'SAS_data_library';

PROC DATASETS LIBRARY=libref;
   MODIFY OLD;
   RENAME X=NEWX Y=NEWY;
   LABEL HT='Height of Subject';
   FORMAT DOB MMDDYY8.;
RUN;
QUIT;


Example 14 Using PROC DATASETS to Modify SAS Data Sets

Inefficient

DATA NEWNAME;
   SET OLDNAME;
RUN;

Efficient

PROC DATASETS;
   CHANGE OLDNAME=NEWNAME;
RUN;
QUIT;


Example 15 Using PROC APPEND to Join Similar Data Sets

Inefficient

DATA UPDATE;
   SET MASTER NEW;
RUN;

Efficient

PROC APPEND BASE=MASTER DATA=NEW;
RUN;


Example 16 Using a RETAIN Statement to Initialize Constants

Inefficient

DATA TEST;
   INFILE 'file specification';
   A=.10;
   B=1.57;
   PI=3.14159;
   INPUT X @@;
   Y=A*X+B*PI+X;
   DROP A B PI;
RUN;

Efficient

DATA TEST;
   INFILE 'file specification';
   RETAIN A .10 B 1.57 PI 3.14159;
   INPUT X @@;
   Y=A*X+B*PI+X;
   DROP A B PI;
RUN;


Example 17 Avoiding Unnecessary Sorts: Performing a Two-level Sort Instead of a One-level and a Two-level Sort

Inefficient

PROC SORT DATA=TEST;
   BY YEAR;
RUN;

PROC anyprocs;
   BY YEAR;
   (SAS Statement)
RUN;

PROC SORT DATA=TEST;
   BY YEAR MONTH;
RUN;

PROC otherprocs;
   BY YEAR MONTH;
   (SAS Statement)
RUN;

Efficient

PROC SORT DATA=TEST;
   BY YEAR MONTH;
RUN;

PROC anyprocs;
   BY YEAR;
   (SAS Statement)
RUN;

PROC otherprocs;
   BY YEAR MONTH;
   (SAS Statement)
RUN;


Example 18 Avoiding Unnecessary Sorts: Using a CLASS Statement When Possible

Inefficient

PROC SORT DATA=TEST;
   BY YEAR;
RUN;

PROC MEANS NOPRINT DATA=TEST;
   BY YEAR;
   VAR COST;
   OUTPUT OUT=MEANS MEAN=;
RUN;

Efficient

PROC MEANS NOPRINT NWAY DATA=TEST;
   CLASS YEAR;   /* CLASS does not require sorted input, so no BY statement */
   VAR COST;
   OUTPUT OUT=MEANS MEAN=;
RUN;


Example 19 Making Your Sorts More Efficient: Sort Only What You Have to Sort

Inefficient

PROC SORT DATA=OLD;
   BY ID DATE;
RUN;

DATA NEW;
   SET OLD(DROP=X1-X10);
   WHERE YEAR BETWEEN '01JAN90'D AND '31DEC93'D;
RUN;

Efficient

PROC SORT DATA=OLD(DROP=X1-X10) OUT=NEW;
   BY ID DATE;
   WHERE YEAR BETWEEN '01JAN90'D AND '31DEC93'D;
RUN;


Example 20 Making Your Sorts More Efficient: Using the NOEQUALS Option

Inefficient

PROC SORT DATA=TEST;
   BY YEAR;
RUN;

Efficient

PROC SORT DATA=TEST NOEQUALS;
   BY YEAR;
RUN;

Note: The NOEQUALS option tells PROC SORT that observations with the same BY-variable values do not have to keep the relative order they had in the data set before sorting.

JZone.com 2000