SAS Programming
DATA STEP:
This step loads the required data set into SAS memory and identifies its variables (also called
columns) and its records (also called observations or subjects). In the INPUT statement, character (string)
variables are followed by a $; numeric variables are not.
Syntax:
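Data data_set_name; /* name the data set */
Input var1 var2 $ var3; /* define the variables in this data set; $ marks a character variable */
new_var = expression; /* create new variables */
/* enter the data after the Datalines statement and end it with a semicolon */
Datalines;
<data lines>
;
Run;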
Example:
Data myfuel;
Input CPRatio EQRatio Fuel $ NOx;
Datalines;
13 5.90 indian 7.6
12 3.54 hp 0.34
12 4.77 hp 5.12
14 2.54 indian 4.32
9 3.14 hp 4.81
12 6.54 hp 4.81
;
Run;
PROC STEP:
This step involves invoking a SAS built-in procedure to analyse the data.
Syntax:
Proc procedure_name options; /* the name of the procedure */
Run;
Example:
Proc means;
Run;
THE OUTPUT STEP:
The data from the data sets can be displayed with conditional output statements.
Syntax:
Proc print data=data_set;
Options;
Run;
Example:
Proc print data=myfuel;
Where CPRatio>=10;
Run;
ACCESSING THE DATA:
PROC CONTENTS:
The proc contents procedure provides a summary of a dataset's contents, including details such as
variable names, types, and attributes (such as formats, informats, and labels). It also tells you the
number of observations and variables present in the dataset, as well as the creation date of the
dataset.
Syntax:
Proc contents data=dataset_name;
Run;
Example:
proc contents data=sashelp.cars;
run;
Accessing the data through library:
A libref is the name of the library that can be used in a SAS program to read data files.
The engine provides instructions for reading SAS files and other types of files.
The path provides the directory where the collection of tables is located.
The libref remains active until you clear it, delete it, or shut down SAS.
Syntax:
Libname libref engine "path";
Example:
Libname mylib base "s:/workshop/data";
This LIBNAME statement creates a libref (library name) mylib that uses the base engine to read SAS tables
located in s:/workshop/data. base is the default engine.
Using a Library to Read Excel Files:
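As a minimal sketch of reading an Excel workbook through a library: the XLSX engine (which requires a SAS/ACCESS Interface to PC Files license) treats each worksheet as a table. The workbook path, the libref xlclass, and the sheet name below are hypothetical placeholders.

```sas
/* Hypothetical workbook path and sheet name; replace with real ones. */
options validvarname=v7;    /* make column names follow SAS naming rules */
libname xlclass xlsx "s:/workshop/data/class.xlsx";

/* Each worksheet is referenced as libref.sheet-name */
proc contents data=xlclass.class_birthdate;
run;

libname xlclass clear;      /* release the connection to the workbook */
```

Clearing the libref when you are done releases the lock on the workbook.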
Exploring Data:
PROC PRINT:
PROC PRINT lists all columns and rows in the input table by default.
The OBS= data set option limits the number of rows listed; the VAR statement limits and orders the columns listed.
Syntax:
Proc print data=input_table(obs=n);
Var col_name(s);
Run;
Example:
Run;
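A short illustration of the syntax above, assuming the sashelp.cars sample table that ships with SAS; the row limit and column choice are arbitrary.

```sas
/* List the first 10 rows and only four columns, in the order given on VAR */
proc print data=sashelp.cars(obs=10);
    var Make Model Type MSRP;
run;
```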
PROC MEANS:
PROC MEANS generates simple summary statistics for each numeric column in the input data by default.
Syntax:
Proc means data=input_table;
Var col_name(s);
Run;
Example:
Run;
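For instance, assuming the sashelp.cars sample table, the following sketch limits the summary to three numeric columns.

```sas
/* Default statistics (N, mean, standard deviation, minimum, maximum) for three columns */
proc means data=sashelp.cars;
    var MSRP MPG_City MPG_Highway;
run;
```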
PROC UNIVARIATE:
PROC UNIVARIATE also generates summary statistics for each numeric column in the data by default, but
includes more detailed statistics related to distribution and extreme values.
Syntax:
Proc univariate data=input_table;
Var col_name(s);
Run;
Example:
Run;
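A minimal sketch, again assuming the sashelp.cars sample table; the column is chosen only for illustration.

```sas
/* Moments, quantiles, and extreme observations for one numeric column */
proc univariate data=sashelp.cars;
    var MPG_Highway;
run;
```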
PROC FREQ:
PROC FREQ creates a frequency table for each variable in the input table by default. You can limit the
variables analyzed by using the TABLES statement.
Syntax:
Proc freq data=input_table;
Tables col_name(s) </ options>;
Run;
Example:
Run;
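As an illustration, assuming the sashelp.cars sample table, the TABLES statement below requests one frequency table per listed column.

```sas
/* One-way frequency tables for two columns only */
proc freq data=sashelp.cars;
    tables Origin Type;
run;
```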
FILTERING ROWS:
WHERE STATEMENT:
The WHERE statement filters rows. If the expression is true for a row, the row is read; if it is false, the row
is not read.
Syntax:
Proc procedure_name;
Where expression;
Run;
Example:
Run;
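A small sketch of row filtering, assuming the sashelp.cars sample table; the price cutoff is arbitrary.

```sas
/* Only rows where the expression is true are read by the procedure */
proc print data=sashelp.cars;
    where MSRP<20000;
    var Make Model MSRP;
run;
```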
Syntax:
Where expression;
Basic Operators:
= (or) EQ
^= (or) ~= (or) NE
> (or) GT
< (or) LT
>= (or) GE
<= (or) LE
When an expression includes a fixed date value, use the SAS date constant syntax: "ddmmmyyyy"d, where
dd represents a 1- or 2-digit day, mmm represents a 3-letter month in any case, and yyyy represents a 2- or
4-digit year.
Example:
Run;
Example:
Run;
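To illustrate the date constant syntax, the sketch below assumes the sashelp.stocks sample table, whose Date column holds SAS date values; the cutoff date is arbitrary.

```sas
/* "01jan2005"d is converted to the SAS date value for 1 January 2005 */
proc print data=sashelp.stocks;
    where Date>="01jan2005"d;
    var Stock Date Close;
run;
```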
Combine Expression using “ OR ”:
Example:
Run;
IN Operator:
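The IN operator tests whether a value appears in a list, which is equivalent to chaining equality comparisons with OR. A minimal sketch, assuming the sashelp.cars sample table:

```sas
/* Two comparisons combined with OR */
proc print data=sashelp.cars;
    where Type="SUV" or Type="Wagon";
    var Make Model Type;
run;

/* The same filter written with the IN operator */
proc print data=sashelp.cars;
    where Type in ("SUV","Wagon");
    var Make Model Type;
run;
```

Both steps select the same rows; IN is usually easier to read as the list grows.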
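MACRO VARIABLES:
Syntax:
%LET macro-variable=value;
Macro variables can be referenced in a program by preceding the macro variable name with an &.
If a macro variable reference is used inside quotation marks, double quotation marks must be used.
Example:
%let CarType=Wagon;
/* the procedures and input table are assumed for illustration; the same filter is reused in each step */
Proc print data=sashelp.cars;
Where Type="&CarType";
Run;
Proc means data=sashelp.cars;
Where Type="&CarType";
Run;
Proc freq data=sashelp.cars;
Where Type="&CarType";
Tables Origin;
Run;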
FORMATTING COLUMNS:
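Formats are used to change the way values are displayed in data and reports.
Syntax:
Proc print data=input_table;
Format col_name(s) format;
Run;
A format has the form <$>format-name<w>.<d>, where $ indicates a character format, w is the total width, and d is the number of decimal places.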
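As a sketch of the FORMAT statement, assuming the sashelp.stocks sample table; the specific formats are chosen only for illustration.

```sas
/* Formats change how values are displayed, not the stored values */
proc print data=sashelp.stocks(obs=5);
    var Stock Date Close Volume;
    format Date mmddyy10. Close dollar10.2 Volume comma12.;
run;
```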
SORTING DATA:
PROC SORT sorts the rows in a table on one or more character or numeric columns.
The out= option specifies an output table. Without this option, PROC SORT changes the order of rows in
the input table.
The BY statement specifies one or more columns in the input table whose values are used to sort the rows.
By default, SAS sorts in ascending order.
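Syntax:
Proc sort data=input_table <out=output_table>;
By <descending> col_name(s);
Run;
Example:
/* sashelp.class and the output table name are assumed for illustration */
Proc sort data=sashelp.class out=class_sorted;
By descending name;
Run;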
REMOVING THE DUPLICATE DATA FROM PARTICULAR COLUMN USING SORT STATEMENT:
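SYNTAX:
PROC SORT DATA=INPUT_TABLE <OUT=OUTPUT_TABLE>
NODUPKEY <DUPOUT=OUTPUT_TABLE>;
BY COL_NAME(S);
RUN;
EXAMPLE:
/* input and output table names are assumed for illustration */
Proc sort data=sashelp.class out=test_clean
Nodupkey dupout=test_dups;
By name;
Run;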
REMOVING THE DUPLICATE DATA FROM ALL COLUMNS USING SORT STATEMENT:
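SYNTAX:
PROC SORT DATA=INPUT_TABLE <OUT=OUTPUT_TABLE>
NODUPKEY <DUPOUT=OUTPUT_TABLE>;
BY _ALL_;
RUN;
EXAMPLE:
/* input and output table names are assumed for illustration */
Proc sort data=sashelp.class out=test_clean
Nodupkey dupout=test_dups;
By _all_;
Run;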
Syntax:
Data output_table;
Set input_table;
Where expression ;
Run;
Example:
Data myclass;
Set sashelp.class;
Where age>=15;
Keep Name Age Weight Height; /* keep only these columns in the new table */
Run;
Syntax:
Data output_table;
Set input_table;
New_column=expression;
Run;
Example:
Data cars_new;
Set sashelp.cars;
Where origin ne "USA";
Profit=MSRP-Invoice;
Source="Non-US cars";
Run;
Syntax:
Data output_table;
Set input_table;
New_column=function(arguments);
Run;
CHARACTER FUNCTION:
Data storm_new;
Set pg1.storm_summary;
Basin=upcase(basin);
Name=propcase(name);
Hemisphere=cat(Hem_NS,Hem_EW);
Ocean=substr(Basin,2,1);
Run;
Data Storm_new;
Set pg1.storm_damage;
Drop summary;
Yearspassed=yrdif(Date,today(),"age");
Anniversary=mdy(month(Date),day(Date),year(today()));
Run;
CONDITIONAL STATEMENTS:
Syntax:
IF expression THEN statement;
Example:
Data cars;
Set sashelp.cars;
Run;
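Syntax:
IF expression THEN statement;
ELSE IF expression THEN statement;
ELSE statement;
Example:
Data cars;
Set sashelp.cars;
/* the MSRP cutoffs for groups 1-3 are assumed for illustration */
If MSRP<20000 then Cost_Group=1;
Else if MSRP<40000 then Cost_Group=2;
Else if MSRP<60000 then Cost_Group=3;
Else Cost_Group=4;
Keep Make Model Type MSRP Cost_Group;
Run;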
Syntax:
IF expression THEN DO;
<executable statements>
END;
ELSE IF expression THEN DO;
<executable statements>
END;
ELSE DO;
<executable statements>
END;
Example:
Data cars2;
Set sashelp.cars;
Length Cost_Type $ 4;
Run;
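A minimal sketch of IF-THEN/DO processing, assuming the sashelp.cars sample table. The MSRP cutoffs and the Low/Mid/High labels are illustrative assumptions, not values taken from the original example.

```sas
data cars2;
    set sashelp.cars;
    /* Declare the length before the first assignment so longer values are not truncated */
    length Cost_Type $ 4;
    /* Cutoffs and labels below are illustrative assumptions */
    if MSRP<20000 then do;
        Cost_Group=1;
        Cost_Type="Low";
    end;
    else if MSRP<40000 then do;
        Cost_Group=2;
        Cost_Type="Mid";
    end;
    else do;
        Cost_Group=3;
        Cost_Type="High";
    end;
    keep Make Model MSRP Cost_Group Cost_Type;
run;
```

Each DO block lets one condition run several statements; without the LENGTH statement, Cost_Type would take the length of the first value assigned ("Low", 3 characters) and "High" would be truncated.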
ANALYZING AND REPORTING ON DATA
TITLE is a global statement that establishes a permanent title for all reports created in your SAS session.
You can have up to 10 titles, in which case you would just use a number 1-10 after the keyword TITLE to
indicate the line number. TITLE and TITLE1 are equivalent.
Titles can be replaced with an additional TITLE statement with the same number. TITLE; clears all titles.
Syntax:
TITLE<n> "title_text";
FOOTNOTE<n> "footnote_text";
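Syntax:
LABEL col_name="label_text";
Example:
/* PROC MEANS and the label text are assumed for illustration */
Proc means data=sashelp.cars;
Where Type="Sedan";
Var MSRP MPG_Highway;
Label MSRP="Manufacturer Suggested Retail Price";
Run;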
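Syntax:
Proc freq data=input_table <proc-options>;
Tables col_name(s) </ options>;
Run;
PROC FREQ statement options:
Order=freq|formatted|data
Nlevels
TABLES statement options:
Nocum
Nopercent
Out=output_table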
Example:
Ods noproctitle;
Run;
Title;
Ods proctitle;
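The options above can be combined as in the following sketch, which assumes the sashelp.cars sample table; the title text is made up for illustration.

```sas
title "Frequency of car types";
ods noproctitle;                 /* suppress the default procedure title */
proc freq data=sashelp.cars order=freq nlevels;
    tables Type / nocum;         /* drop the cumulative columns */
run;
title;                           /* clear the title */
ods proctitle;                   /* restore procedure titles */
```

ORDER=FREQ lists the most frequent values first, and NLEVELS adds a table with the number of distinct values.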
CREATING TWO-WAY FREQUENCY REPORTS:
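Syntax:
Proc freq data=input_table <noprint>;
Tables col_name*col_name </ options>;
Run;
PROC FREQ statement option:
Noprint
TABLES statement options:
Norow, nocol, nopercent
Crosslist, list
Out=output_table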
Example:
Format StartDate monname.;
Run;
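A two-way frequency report can be sketched as follows, assuming the sashelp.cars sample table; the asterisk between the two columns requests the crosstabulation.

```sas
/* Rows are Origin values, columns are Type values; only counts are shown */
proc freq data=sashelp.cars;
    tables Origin*Type / norow nocol nopercent;
run;

/* The same counts in a list layout with subtotals */
proc freq data=sashelp.cars;
    tables Origin*Type / crosslist;
run;
```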
<stat-list> specifies which statistics to calculate and how they should be displayed.
Syntax:
Proc means data=input_table <stat-list>;
Var col_name(s);
Class col_name(s);
Ways n;
Run;
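Example:
/* the input table and CLASS column are assumed for illustration */
Proc means data=pg1.storm_summary;
Var MaxWindMPH;
Class Basin;
Ways 0 1;
Run;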
Syntax:
Example:
/* sashelp.heart is assumed; it contains Weight and Chol_Status */
Proc means data=sashelp.heart;
Var Weight;
Class Chol_Status;
Ways 1;
Run;
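The statistics list, CLASS, and WAYS can be combined as in this sketch, which assumes the sashelp.cars sample table; the statistics chosen are arbitrary.

```sas
/* WAYS 0 1 requests the overall summary (0 class columns)
   plus one summary per single class column (Origin, then Type) */
proc means data=sashelp.cars mean median min max maxdec=0;
    var MSRP;
    class Origin Type;
    ways 0 1;
run;
```

Adding 2 to the WAYS list would also produce the summary for each Origin and Type combination.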
EXPORTING DATA
PROC EXPORT can export a SAS table to a variety of file formats outside SAS.
Syntax:
Proc export data=input_table outfile="output_file"
<dbms=identifier> <replace>;
Run;
Example:
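A minimal sketch of exporting to CSV. It assumes the sashelp.class sample table and writes to the WORK library's directory so that no fixed path is needed; the macro variable outpath and the file name are hypothetical.

```sas
/* Use the WORK directory as a writable output location (an assumption for this sketch) */
%let outpath=%sysfunc(pathname(work));

proc export data=sashelp.class
            outfile="&outpath/class.csv"
            dbms=csv replace;    /* REPLACE overwrites an existing file */
run;
```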
EXPORTING REPORTS:
The SAS Output Delivery System (ODS) is programmable and flexible, making it simple to automate the
entire process of exporting reports to other formats.
Ods <destination> <destination-specifications>;
/* SAS code that produces output */
Ods <destination> close;
By default, each procedure output is written to a separate worksheet with a default worksheet name. The
default style is also applied.
Use the style= option on the ods excel statement to apply a different style.
Use options(sheet_name='label') on the ods excel statement to provide a custom label for each
worksheet.
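Syntax:
Ods excel file="filename.xlsx" <style=style> <options(sheet_name='label')>;
/* SAS code that produces output */
Ods excel close;
Example:
/* the output path (&outpath) and the input table are assumed for illustration */
Ods excel file="&outpath/wind.xlsx" options(sheet_name="windstats");
Ods noproctitle;
Proc means data=pg1.storm_final;
Class BasinName;
Var MaxWindMPH;
Run;
Proc sgplot data=pg1.storm_final;
Histogram MaxWindMPH;
Density MaxWindMPH;
Run;
Title;
Ods proctitle;
Ods excel close;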
With the CSVALL destination, the ODS statements export the results of a procedure to a CSV file. By using ods
csvall with proc print, you can specify the order and format of the columns in your output csv file.
Syntax:
Ods csvall file="filename.csv";
/* SAS code that produces output */
Ods csvall close;
Example:
Run;
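As an illustration of controlling column order and display in the CSV output, the sketch below assumes the sashelp.cars sample table and writes to the WORK directory; the macro variable outpath and the file name are hypothetical.

```sas
%let outpath=%sysfunc(pathname(work));   /* writable location assumed for this sketch */

ods csvall file="&outpath/cars.csv";
proc print data=sashelp.cars noobs;      /* NOOBS drops the row-number column */
    var Make Model MSRP;                 /* column order in the CSV file */
    format MSRP dollar10.;               /* display format written to the file */
run;
ods csvall close;
```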
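The pdftoc=n option controls the level of expansion of the table of contents in PDF documents.
The ods proclabel statement sets the label that a procedure's output gets in the table of contents.
Syntax:
Ods pdf file="filename.pdf" style=style pdftoc=n;
Ods proclabel "label";
/* SAS code that produces output */
Ods pdf close;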
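A sketch of a two-part PDF report with custom table-of-contents labels. It assumes the sashelp.cars sample table and the journal style; the output location, file name, and labels are hypothetical.

```sas
%let outpath=%sysfunc(pathname(work));   /* writable location assumed for this sketch */

ods pdf file="&outpath/cars_report.pdf" style=journal pdftoc=1;
ods noproctitle;

ods proclabel "MSRP statistics";         /* bookmark label for the next procedure */
proc means data=sashelp.cars;
    var MSRP;
run;

ods proclabel "Car types";
proc freq data=sashelp.cars;
    tables Type;
run;

ods proctitle;
ods pdf close;
```

With pdftoc=1, only the first level of bookmarks (the two labels above) is expanded when the PDF is opened.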
SQL can be used in SAS to prepare data and to analyse and report on data.
The SELECT statement describes the query. List the columns to include in the results after SELECT, separated by
commas. The FROM clause lists the input table(s).
The ORDER BY clause arranges rows based on the columns listed. The default order is ascending. Use DESC
after a column name to reverse the sort sequence.
Syntax:
proc sql;
select col-name, col-name
from input-table;
quit;
Example:
Proc sql;
Select Name, Age, Height
From pg1.class_birthdate;
Quit;
SUBSETTING DATA:
The WHERE expression is not a separate statement in SQL; it is a clause added to the SELECT statement.
Syntax:
WHERE expression;
SORTING DATA
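Use the ORDER BY clause to sort the data in ascending or descending order.
Syntax:
ORDER BY col-name <DESC>;
Example:
Proc sql;
/* the column list and sort column are assumed for illustration */
Select Name, Age, Height
From pg1.class_birthdate
Where age>14
Order by Height desc;
Quit;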
Syntax:
Proc sql;
Select Name,Age,Height
From pg1.class_birthdate;
Quit;
Syntax:
Example:
Proc sql;
Quit;
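Syntax:
FROM table1 INNER JOIN table2
ON table1.column = table2.column
Example:
INNER JOIN:
Proc sql;
Select class_update.Name, Age, Grade, Teacher
From pg1.class_update inner join pg1.class_teacher
On class_update.Name = class_teacher.Name;
Quit;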
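LEFT JOIN:
Proc sql;
Select class_update.Name, Age, Grade, Teacher
From pg1.class_update left join pg1.class_teacher
On class_update.Name = class_teacher.Name;
Quit;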
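OUTER JOIN:
Proc sql;
Select class_update.Name, Age, Grade, Teacher
From pg1.class_update full join pg1.class_teacher
On class_update.Name = class_teacher.Name;
Quit;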
Assign an alias (or nickname) to a table in the FROM clause by adding the keyword AS and the alias of your
choice. Then you can use the alias in place of the full table name to qualify columns in the other clauses of
a query.
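Syntax:
FROM table1 AS alias1 INNER JOIN table2 AS alias2
ON alias1.column = alias2.column
Example:
Proc sql;
Select u.Name, Age, Grade, Teacher
From pg1.class_update as u inner join pg1.class_teacher as t
On u.name = t.name;
Quit;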
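To see how the join types differ, the following self-contained sketch builds two small tables and joins them with aliases. The table names, column values, and aliases are all made up for illustration.

```sas
/* Two hypothetical tables that share the Name column */
data work.students;
    input Name $ Grade;
    datalines;
Alice 6
Barbara 7
Carol 7
;
run;

data work.teachers;
    input Name $ Teacher $;
    datalines;
Alice Thomas
Carol Evans
Duke Smith
;
run;

proc sql;
    /* Inner join: only names found in both tables (Alice, Carol) */
    select s.Name, Grade, Teacher
        from work.students as s inner join work.teachers as t
        on s.Name=t.Name;

    /* Left join: every student is kept; Barbara has a missing Teacher */
    select s.Name, Grade, Teacher
        from work.students as s left join work.teachers as t
        on s.Name=t.Name;

    /* Full join: rows from both tables are kept; COALESCE takes
       whichever Name is not missing (Barbara and Duke both appear) */
    select coalesce(s.Name, t.Name) as Name, Grade, Teacher
        from work.students as s full join work.teachers as t
        on s.Name=t.Name;
quit;
```

Name must be qualified with an alias because it exists in both tables, while Grade and Teacher each exist in only one table and can be referenced directly.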