Functions in SAS
Frank C. DiIorio
Advanced Integrated Manufacturing Solutions, Co.
Durham NC
Introduction
Let's start by admitting that programmers are, at heart,
rather lazy. We want procedures to do our sorting, printing, and analysis. We want formats to display the number
5 as "Well above average". In essence, we don't want to
code any more statements than we have to. With that in
mind, let's look at another code- and time-saving feature of
the SAS System.
This paper gives an overview of functions, a powerful set
of tools in the SAS System. Functions are predefined routines that come with the SAS System. They perform a wide range of activities, and often reduce complex
computations that would require arduous and error-prone
DATA step coding to a single, simple statement. Not even
a novice SAS programmer's toolbox is complete without a
basic knowledge of system functions.
This paper introduces SAS novices to functions. Basic
terminology is reviewed first, followed by usage issues
common to nearly all functions. The last section of the
paper describes the purpose and syntax of some of the
more commonly used functions. Bear in mind that this paper is simply an overview of a broad and sometimes complex topic. The reader should consult SAS Institute documentation for the definitive, exhaustive description of the
purpose, limitations, and uses of the functions.
Fundamentals
The logic that's common to all functions is straightforward. Two components of function syntax, the function
name and its parameters, identify what is to be performed. The first component is the function name. It
identifies the action that the function performs: identify
the minimum of a list of numbers (the MIN function), locate the third word in a character variable (the SCAN
function), and so on. The name usually gives some idea of
the activities performed by the function.
The second function component is a list of parameters
(sometimes referred to as arguments) enclosed in parentheses. Notice that in the above description of the function
names we said "minimum of a list" and "third word in a
character variable". The "of" and "in" identify what the
function should operate on. Putting these two pieces together, let's look at two complete uses of these functions:
min_year = min(fy2001q1, fy2001q2, fy2001q3);
file_name = scan(dir_line, 3, ' ');
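The two statements above can be tried in a small DATA step. This is only a sketch: the quarterly figures and the contents of DIR_LINE are hypothetical values made up for illustration, and a blank is assumed to be the word delimiter for SCAN.

```sas
/* Hypothetical inputs, for illustration only */
data _null_;
   fy2001q1 = 120;
   fy2001q2 = 95;
   fy2001q3 = 143;
   dir_line = 'rw-r 20480 report.sas';

   /* Function name says what to do; the arguments say what to do it to */
   min_year  = min(fy2001q1, fy2001q2, fy2001q3);   /* smallest of the three: 95 */
   file_name = scan(dir_line, 3, ' ');              /* third blank-delimited word: report.sas */

   put min_year= file_name=;
run;
```

The PUT statement writes both results to the SAS log, which is usually the quickest way to check what a function returned.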
Bottom line: there are lots of words in the language. Define arrays with names that don't conflict with function
names.
Names Matter (Gotcha #2). Programmers new to SAS,
or those regularly moving back and forth between SAS and
other languages, must be careful not to assume that function "x" in one language does the same thing in another language. Don't make assumptions. Read the documentation
carefully, and be sure the functionality is identical.
If you need motivation in this regard, consider the subtle
difference in this example: the TRIM function exists in both
Excel and SAS, and performs basically the same activity
(trimming blanks from a character value). In Excel,
both leading and trailing blanks are trimmed, but SAS
trims only the trailing blanks. It's usually easier to review
documentation for the functions than it is to debug their
ill-advised usage.
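A short sketch of the SAS side of this comparison follows. The variable names and the bracket characters are illustrative assumptions; the brackets are only there to make the blanks visible in the log. Combining LEFT with TRIM is one common way to remove blanks from both ends.

```sas
data _null_;
   length padded $12;
   padded = '  SAS  ';   /* two leading blanks; SAS pads the rest with trailing blanks */

   /* TRIM alone removes trailing blanks only */
   trimmed_only = '[' || trim(padded) || ']';         /* [  SAS] */

   /* LEFT moves the leading blanks to the end, then TRIM removes them */
   both_sides   = '[' || trim(left(padded)) || ']';   /* [SAS] */

   put trimmed_only= / both_sides=;
run;
```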
Look at Data Types Carefully. Some functions require
numeric arguments, others require character arguments.
Yet others require a mix of these data types. The types
of the arguments do not determine the type of the value returned by
the function. The LENGTH function, for example, returns
the location of the last non-blank character in a variable.
The function requires a character argument but returns a
numeric value.
The Number of Arguments Varies. The number of arguments required by a function will, of course, depend on
the type of work the function performs. Even using the
same function, though, the number of arguments can vary.
MIN and other descriptive statistic functions can handle a
varying number of arguments, provided there are enough
values to perform the required task (you need at least three
arguments to calculate skewness, for example). Other
functions expect n arguments and will make assumptions
if they do not receive the full n. The third argument to
the SUBSTR function, for example, is the number of positions to extract from a character variable. If it is omitted from
the SUBSTR call, the default behavior is to extract through the
rightmost position in the variable.
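Both behaviors described above can be seen in a few statements. This is a minimal sketch; PART_NO and its value are hypothetical.

```sas
data _null_;
   part_no = 'AB-12345-X';             /* hypothetical 10-character value */

   /* MIN accepts a varying number of arguments */
   low2 = min(7, 3);                   /* 3 */
   low4 = min(7, 3, 9, 1);             /* 1 */

   /* SUBSTR with and without its third argument */
   middle = substr(part_no, 4, 5);     /* 5 characters from position 4: 12345 */
   rest   = substr(part_no, 4);        /* length omitted, runs to the end: 12345-X */

   put low2= low4= middle= rest=;
run;
```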
Parameter Order May Matter. Some functions do not
care about the order in which arguments are specified. The
SUM function, for example, performs an action (addition)
that is, by nature, indifferent to order. Other functions are
not so forgiving, and assign specific meanings to arguments. The first argument to the ROUND function, for
example, is a numeric value. The second argument is the
rounding unit ("round to the nearest ..."). SAS is often
unable to detect misspecified parameters because they may
make syntactic sense but lack logical validity.
Consider the following statements:
inc1 = round(income, 1000);
inc2 = round(1000, income);
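To see why the second statement is a silent error, the two calls can be run with a sample value. The value of INCOME below is an assumption chosen for illustration. Neither statement produces a warning or error in the log.

```sas
data _null_;
   income = 45678;                 /* hypothetical value */

   inc1 = round(income, 1000);     /* income to the nearest 1000: 46000 */
   inc2 = round(1000, income);     /* 1000 to the nearest multiple of 45678: 0 */

   put inc1= inc2=;
run;
```

The swapped call is syntactically valid, so the only sign of trouble is the implausible result.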
q4 = .;
quoted = quote(text);
run;
v4 = 6;
Commonly-Used Functions
A complete list of functions, grouped by category, is found
in the Appendix. This section takes a closer look at some
of the more commonly used functions and gives examples
of their use. The order and category names correspond to
the SAS Online Doc. Yet again, refer to SAS Institute
documentation for a description of parameters and other
usage notes.
Category/Name    Description and example of usage

array
dim
   Number of elements in an array.
   do i = 1 to dim(list);

character
compbl
   Removes consecutive blanks from a string.
   old = 'much    extra   space';
   new = compbl(old);
   NEW becomes 'much extra space'
compress
   Removes characters from a string.
   old = 'Chapel Hill, NC - 27516';
   new1 = compress(old);
   new2 = compress(old, ',-');
   NEW1 becomes 'ChapelHill,NC-27516'
   NEW2 becomes 'Chapel Hill NC  27516'
   You could use COMPBL on NEW2 to remove consecutive blanks.
index
   Gives the starting position of a string within a string.
   string = '\temp\examples1.sas';
   loc1 = index(string, '.sas');
   loc2 = index(string, '.SAS');
   loc3 = index(upcase(string), '.SAS');
   LOC1 equals 16, LOC2 equals 0 (not found), LOC3 equals 16
left
   Left-aligns a string.
   old = '   leading blanks';
   new = left(old);
   NEW equals 'leading blanks   '
length
   Returns the length (position of the rightmost non-blank character) of a string.
   length old $40;
   old = 'Short';
   len = length(old);
   LEN equals 5
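Several of the character functions listed above can be tried together on one value. This is a sketch only: the string and the result variable names are assumptions chosen so that the positions are easy to count by hand.

```sas
data _null_;
   length old $40;
   old = 'Chapel   Hill, NC';            /* three blanks between the words */

   squeezed = compbl(old);               /* Chapel Hill, NC */
   no_blank = compress(old);             /* ChapelHill,NC */
   no_comma = compress(old, ',');        /* comma removed, blanks kept */
   pos_nc   = index(old, 'NC');          /* 16 */
   len      = length(old);               /* 17, even though OLD is declared $40 */

   put squeezed= / no_blank= / no_comma= / pos_nc= len=;
run;
```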
substr
   old = 'J*V87';
   if substr(old, 2, 1) = '*' then do;
translate
   Changes all occurrences of one character in a string to another.
   old = 'Line1*Line2*Line3';
   new = translate(old, '/', '*');
   NEW equals 'Line1/Line2/Line3'
tranwrd
   Similar to TRANSLATE, but at the word level. Parameter order is different (from-to, rather than to-from!)
   old = 'Mrs. Smith';
   new = tranwrd(old, 'Mrs.', 'Sra.');
   NEW equals 'Sra. Smith'
upcase
   Upper-cases a string.
   old = 'Mixed Case';
   new = upcase(old);
   NEW equals 'MIXED CASE'

date and time
date
   Returns the current date as a SAS date value.
   today = date();
   TODAY is 15138 (June 12, 2001 if formatted)
datetime
   Returns the current date-time as a SAS datetime value.
   rightnow = datetime();
   RIGHTNOW is 1307985980.7 (12JUN2001:17:26:21 if formatted)
day / month / year
   Extract the day, month, and year numbers from a SAS date value.
   curr_day = day(today());
   CURR_DAY equals 12 if today is June 12, 2001.
hour / minute / second
   Extract the hour, minute, and second from a SAS time value.
   curr_hr = hour(onset);
   CURR_HR equals 7 if ONSET is 7:04.
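The argument-order difference between TRANSLATE and TRANWRD, and the date and time extraction functions, can be checked with fixed values rather than the current clock. In this sketch the variable names are illustrative assumptions, and date and time literals stand in for values that would normally come from a data set.

```sas
data _null_;
   old  = 'Line1*Line2*Line3';
   new1 = translate(old, '/', '*');        /* "to" first, then "from": Line1/Line2/Line3 */
   new2 = tranwrd(old, 'Line', 'Row');     /* "from" first, then "to": Row1*Row2*Row3 */

   visit = '12JUN2001'd;                   /* SAS date literal, stored as 15138 */
   d = day(visit);                         /* 12 */
   m = month(visit);                       /* 6 */
   y = year(visit);                        /* 2001 */

   onset = '07:04't;                       /* SAS time literal */
   h  = hour(onset);                       /* 7 */
   mi = minute(onset);                     /* 4 */

   put new1= / new2= / visit= d= m= y= h= mi=;
run;
```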
macro
symput
   CALL routine
   if eof then call symput('goodones', put(_n,3.));
   Macro variable GOODONES contains the value of variable _N.
symget
   Retrieves the value of a macro variable.
   status = symget('stat');
   DATA step variable STATUS equals the value of macro variable STAT.

math
abs
   Returns the absolute value of a numeric variable.
   old = -3;
   new = abs(old);
   NEW equals 3
fact
   Returns the factorial of an integer.
   fact = fact(5);
   FACT equals 120
log / log10 / log2
   Return the natural, base 10, and base 2 logarithms of a positive number.
mod
   Returns the remainder of a division.
   old = 100;
   new1 = mod(old, 10);
   new2 = mod(old, 8);
   NEW1 equals 0 (no remainder)
   NEW2 equals 4 (4 left over when 100 is divided by 8)

random number
rannor / ranuni (CALL routines)
   As CALL routines, they return random variates from normal and uniform distributions.
   seed = -1;
   call ranuni(seed, x);
rannor / ranuni (functions)
   As functions, they return random variates from normal and uniform distributions. The CALL routines give greater control over seed values.
   x = rannor(-1);

special
system
   As a CALL routine, it submits a host operating system command for execution.
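The math and macro entries above can be combined in one short step. This is a sketch under stated assumptions: the variable names are illustrative, and reading the macro variable back with SYMGET in the same step is done here only to show the round trip.

```sas
data _null_;
   new = abs(-3);           /* 3 */
   f5  = fact(5);           /* 120 */
   lg  = log10(1000);       /* 3 */
   rem = mod(100, 8);       /* 4 */
   put new= f5= lg= rem=;

   /* Pass a DATA step value to the macro facility, then read it back */
   call symput('goodones', put(f5, 3.));   /* macro variable GOODONES holds 120 */
   status = symget('goodones');            /* character value 120 */
   put status=;
run;

/* The macro variable remains available after the step ends */
%put GOODONES is &goodones;
```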
Questions? Comments?
Your feedback is always welcome. Contact the author at
[email protected].
Appendix
Bitwise Logical Operations
BAND
BLSHIFT
BNOT
BOR
BRSHIFT
BXOR
Character
BYTE
COLLATE
COMPBL
COMPRESS
DEQUOTE
INDEX
INDEXC
INDEXW
LEFT
LENGTH
LOWCASE
MISSING
QUOTE
RANK
REPEAT
REVERSE
RIGHT
SCAN
SOUNDEX
SPEDIS
SUBSTR (left of =)
SUBSTR
TRANSLATE
TRANWRD
TRIM
TRIMN
UPCASE
VERIFY
Descriptive Statistics
CSS
CV
KURTOSIS
MAX
MEAN
MIN
MISSING
N
NMISS
ORDINAL
RANGE
SKEWNESS
STD
STDERR
SUM
USS
VAR
External Files
DCLOSE
DINFO
DNUM
DOPEN
DOPTNAME
DOPTNUM
DREAD
DROPNOTE
FAPPEND
FCLOSE
FCOL
FDELETE
FEXIST
FGET
FILEEXIST
FILENAME
FILEREF
FINFO
FNOTE
FOPEN
FOPTNAME
FOPTNUM
FPOINT
FPOS
FPUT
FREAD
FREWIND
FRLEN
FSEP
FWRITE
MOPEN
PATHNAME
SYSMSG
SYSRC
External Routines
CALL MODULE
CALL MODULEI
MODULEC
MODULEIC
MODULEIN
MODULEN
Financial
COMPOUND
CONVX
CONVXP
DACCDB
DACCDBSL
DACCSL
DACCSYD
DACCTAB
DEPDB
DEPDBSL
DEPSL
DEPSYD
DEPTAB
DUR
DURP
INTRR
IRR
MORT
NETPV
NPV
PVP
SAVING
YIELDP
Hyperbolic
COSH
SINH
TANH
Macro
CALL EXECUTE
CALL SYMPUT
RESOLVE
SYMGET
Mathematical
ABS
AIRY
CNONCT
COMB
CONSTANT
DAIRY
DEVIANCE
DIGAMMA
ERF
ERFC
EXP
FACT
FNONCT
GAMMA
IBESSEL
JBESSEL
LGAMMA
LOG
LOG10
LOG2
MOD
PERM
SIGN
SQRT
TNONCT
TRIGAMMA
Probability
CDF
LOGPDF
LOGSDF
PDF
POISSON
PROBBETA
PROBBNML
PROBBNRM
PROBCHI
PROBF
PROBGAM
PROBHYPR
PROBMC
PROBNEGB
PROBNORM
PROBT
SDF
Quantile
BETAINV
CINV
FINV
GAMINV
PROBIT
TINV
Random Number
CALL RANBIN
CALL RANCAU
CALL RANEXP
CALL RANGAM
CALL RANNOR
CALL RANPOI
CALL RANTBL
CALL RANTRI
CALL RANUNI
NORMAL
RANBIN
RANCAU
RANEXP
RANGAM
RANNOR
RANPOI
RANTBL
RANTRI
RANUNI
UNIFORM
SAS File I/O
ATTRC
ATTRN
CEXIST
CLOSE
CUROBS
DROPNOTE
DSNAME
EXIST
FETCH
FETCHOBS
GETVARC
GETVARN
IORCMSG
LIBNAME
LIBREF
NOTE
OPEN
PATHNAME
POINT
REWIND
SYSMSG
SYSRC
VARFMT
VARINFMT
VARLABEL
VARLEN
VARNAME
VARNUM
VARTYPE
Special
ADDR
CALL POKE
CALL SYSTEM
DIF
GETOPTION
INPUT
INPUTC
INPUTN
LAG
PEEK
PEEKC
POKE
PUT
PUTC
PUTN
SYSGET
SYSPARM
SYSPROD
SYSTEM
State and ZIP Code
FIPNAME
FIPNAMEL
FIPSTATE
STFIPS
STNAME
STNAMEL
ZIPFIPS
ZIPNAME
ZIPNAMEL
ZIPSTATE
Trigonometric
ARCOS
ARSIN
ATAN
COS
SIN
TAN
Truncation
CEIL     Returns the smallest integer that is greater than or equal to the argument
FLOOR    Returns the largest integer that is less than or equal to the argument
FUZZ     Returns the nearest integer if the argument is within 1E-12 of that integer
INT      Returns the integer value
ROUND    Rounds to the nearest round-off unit
TRUNC    Truncates a numeric value to a specified length
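The differences among the truncation functions are easiest to see side by side. The sketch below uses an arbitrary value; the variable names are illustrative assumptions.

```sas
data _null_;
   x = 2.7;

   up    = ceil(x);         /* smallest integer >= 2.7: 3 */
   down  = floor(x);        /* largest integer <= 2.7: 2 */
   whole = int(-x);         /* fraction dropped: -2 (FLOOR(-2.7) would give -3) */
   near  = round(x, 0.5);   /* nearest multiple of 0.5: 2.5 */

   put up= down= whole= near=;
run;
```

INT and FLOOR agree for positive values and differ for negative ones, which is worth checking before choosing between them.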
Variable Control
CALL LABEL
CALL SET
CALL VNAME
Variable Information
VARRAY
VARRAYX
VFORMAT
VFORMATD
VFORMATDX
VFORMATN
VFORMATNX
VFORMATW
VFORMATWX
VFORMATX
VINARRAY
VINARRAYX
VINFORMAT
VINFORMATD
VINFORMATDX
VINFORMATN
VINFORMATNX
VINFORMATW
VINFORMATWX
VINFORMATX
VLABEL
VLABELX
VLENGTH
VLENGTHX
VNAME
VNAMEX
VTYPE
VTYPEX
Web Tools
HTMLDECODE
HTMLENCODE
URLDECODE
URLENCODE