SAS Chapter 10

This document provides examples of using the SAS procedures PROC MEANS and PROC UNIVARIATE to produce descriptive statistics and summary data sets from sample data. Example 1 computes totals and creates a summary data set with PROC MEANS. Example 2 computes several statistics at once (counts, means, and sums). Examples 3 and 4 create unweighted summary statistics in two steps. Example 5 produces a formatted summary report. Example 6 computes each value as a percentage of the mean of all observations. Example 7 uses PROC UNIVARIATE to create a summary data set that contains a median and merges it with each patient's last record.

Uploaded by mihirhota
Copyright: © Attribution Non-Commercial (BY-NC)

SAS Programming by Example [10]

Chapter 10: PROC MEANS and PROC UNIVARIATE
Producing Descriptive Statistics and Summary Data Sets
Example 1 Computing Totals and Using PROC MEANS to Create a Summary Data Set

Features: PROC MEANS, CLASS and OUTPUT Statements

Observations from the SALES Data Set

    PO_NUM  ITEM  REGION  PRICE  QUANTITY

    PROC MEANS DATA=SALES;
       TITLE "Sample Output from PROC MEANS";
       CLASS REGION ITEM;
       VAR QUANTITY;
       OUTPUT OUT=QUAN_SUM SUM=TOTAL;
    RUN;

    PROC PRINT DATA=QUAN_SUM;
       TITLE "Summary Data Set";
    RUN;

Example 2 Computing More Than One Statistic

Features: N=, MEAN=, and SUM= OUTPUT Options

    PROC MEANS DATA=ORIGDATA;
       CLASS A B;
       VAR X Y Z;
       OUTPUT OUT =STATS
              N   =NUM_X NUM_Y NUM_Z
              MEAN=MEAN_X MEAN_Y MEAN_Z
              SUM =TOT_X TOT_Y TOT_Z;
    RUN;

Note: In data set STATS, the variables are A, B, _TYPE_, _FREQ_,
      NUM_X, NUM_Y, NUM_Z, MEAN_X, MEAN_Y, MEAN_Z, TOT_X, TOT_Y,
      and TOT_Z.
      The first row represents the entire population.

Example 3 Creating Unweighted Summary Statistics (Step 1)

Features: PROC MEANS, NWAY and NOPRINT Options, CLASS Statement

Variables in the PRESSURE data set

    SUBJ  YEAR  SBP  DBP

    PROC MEANS DATA=PRESSURE NOPRINT NWAY;
       CLASS SUBJ YEAR;
       VAR SBP DBP;
       OUTPUT OUT=MEANOUT MEAN=;
    RUN;

Note: NOPRINT means not to print the resulting statistics.
      MEAN= with no names after it in the OUTPUT statement gives the
      means the same names as the individual variables in the
      original data set.
      NWAY tells the system to produce output for only the highest
      level of class interactions.
      The combination of the NOPRINT and NWAY options makes a very
      powerful and frequently used data production tool.

Example 4 Creating Unweighted Summary Statistics (Step 2)

Features: PROC MEANS, MEAN and MAXDEC= Options, CLASS Statement

    PROC MEANS DATA=MEANOUT MEAN MAXDEC=2;
       TITLE 'Average Computed from Person Yearly Means';
       CLASS YEAR;
       VAR SBP DBP;
    RUN;

Note: MAXDEC=2 limits output values to two decimal places.
      If you ask for only the MEAN statistic, that is all you get.

Example 5 Producing a Formatted Summary Report

Features: PROC MEANS, PROC PRINT, N=, NMISS=, SUM=, and CLASS Variables, NWAY, NOPRINT, LABEL, and DOUBLE Options, COMMAn. and DOLLARn. Formats

Data Set FUND

    NAME  TOWNSHIP  AMOUNT

    OPTIONS LS=72 NONUMBER NODATE;

    PROC MEANS DATA=FUND NOPRINT NWAY;
       CLASS TOWNSHIP;
       VAR AMOUNT;
       OUTPUT OUT  =SUMMARY
              N    =RETURNED
              NMISS=NOT_RETN
              SUM  =TOTAL;
    RUN;

    DATA REPORT;
       SET SUMMARY;
       MAILED=RETURNED+NOT_RETN;   * Alternative: MAILED=_FREQ_;
       PER_RETN=TOTAL/RETURNED;
       PER_MAIL=TOTAL/MAILED;
       LABEL MAILED  ='LETTERS MAILED'
             RETURNED='NUMBER OF DONATIONS'
             TOTAL   ='TOTAL DONATION'
             PER_RETN='MEAN DONATION'
             PER_MAIL='MEAN DONATION PER LETTER MAILED';
    RUN;

    PROC PRINT DATA=REPORT LABEL DOUBLE;
       TITLE 'Fund Drive Summary Report';
       ID TOWNSHIP;
       VAR MAILED RETURNED TOTAL PER_RETN PER_MAIL;
       FORMAT MAILED RETURNED   COMMA5.
              TOTAL             DOLLAR7.
              PER_RETN PER_MAIL DOLLAR5.;
    RUN;

Note: The LABEL option instructs the system to use the labels
      connected to the variables for column headings instead of the
      variable names.
      The DOUBLE option double-spaces the report.

Example 6 Computing Values as Percentage of All Observations

Features: PROC MEANS, NOPRINT Option, OUTPUT Statement, Conditional SET Statement (in DATA Step)

    DATA TEST;
       INPUT HR SBP DBP;
    DATALINES;
    80 160 100
    70 150 90
    60 140 80
    ;

    PROC MEANS NOPRINT DATA=TEST;
       VAR HR SBP DBP;
       OUTPUT OUT =MOUT
              MEAN=MHR MSBP MDBP;
    RUN;

Data set MOUT

    OBS  _TYPE_  _FREQ_  MHR  MSBP  MDBP
      1       0       3   70   150    90

    DATA PERCENT;
       SET TEST;
       DROP MHR MSBP MDBP _TYPE_ _FREQ_;
       IF _N_=1 THEN SET MOUT;
       HRPER =100*HR/MHR;
       SBPPER=100*SBP/MSBP;
       DBPPER=100*DBP/MDBP;
    RUN;

    PROC PRINT NOOBS DATA=PERCENT;
       TITLE "Listing of PERCENT Data Set";
    RUN;

Listing of PERCENT Data Set

    HR  SBP  DBP    HRPER   SBPPER   DBPPER
    80  160  100  114.286  106.667  111.111
    70  150   90  100.000  100.000  100.000
    60  140   80   85.714   93.333   88.889
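For comparison with Example 6, the same percentages can probably be produced in a single step with PROC SQL. This is a different technique from the conditional SET statement the example teaches, and it is only a sketch: when a summary function such as MEAN appears next to ordinary columns with no GROUP BY clause, PROC SQL remerges the overall mean with every row (the log normally carries a note saying so). The data set name PERCENT_SQL is hypothetical, and PROC SQL does not promise to keep the rows in their original order.

```sas
* Sketch only: PERCENT_SQL is a hypothetical name used for illustration;
PROC SQL;
   CREATE TABLE PERCENT_SQL AS
   SELECT HR, SBP, DBP,
          100*HR /MEAN(HR)  AS HRPER,   /* HR as a percent of mean HR   */
          100*SBP/MEAN(SBP) AS SBPPER,  /* SBP as a percent of mean SBP */
          100*DBP/MEAN(DBP) AS DBPPER   /* DBP as a percent of mean DBP */
   FROM TEST;
QUIT;
```

The conditional SET version in Example 6 remains the more general tool, because the one-observation data set from PROC MEANS can hold any statistic the procedure offers, and variables read with SET are retained across iterations of the DATA step.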

Example 7 Creating a Summary Data Set That Contains a Median

Features: PROC UNIVARIATE, BY and OUTPUT Statements, TRANSLATE and INPUT Functions, Merging Data Sets, LAST. Variable, Clever Comment Boxes

Data set CLINICAL

    PATNUM  DATE      DRUGGRP  CHOL  SBP  DBP  HR  ROUTINE
    01      01/05/89  D         400  160   90  88  Y

    /*******************************************************\
    | First, create a new data set NEW_CLIN from CLINICAL,  |
    | which contains a numeric variable RATIO that has      |
    | values of 0 for "N" and 1 for "Y".                    |
    \*******************************************************/
    DATA NEW_CLIN;
       SET CLINICAL;
       RATIO=INPUT(TRANSLATE(ROUTINE,"01","NY"),1.);
    /*******************************************************\
    |                                                       |
    | Two alternatives would be:                            |
    |                                                       |
    |    IF ROUTINE = "N" THEN RATIO = 0;                   |
    |    ELSE RATIO = 1;                                    |
    |                                                       |
    | Or                                                    |
    |                                                       |
    |    SELECT (ROUTINE);                                  |
    |       WHEN ("N") RATIO = 0;                           |
    |       WHEN ("Y") RATIO = 1;                           |
    |    END;                                               |
    |                                                       |
    \*******************************************************/
    RUN;

    PROC SORT DATA=NEW_CLIN;
       BY PATNUM DATE;
    RUN;

    * Create a data set with the last record for each patient;
    DATA LAST (RENAME=(DATE=LASTDATE CHOL=LASTCHOL));
       SET NEW_CLIN (KEEP=PATNUM DATE CHOL);
       BY PATNUM;
       IF LAST.PATNUM;
    RUN;

    * Output means and medians for each patient to a data set;
    PROC UNIVARIATE DATA=NEW_CLIN NOPRINT;
       BY PATNUM;
       VAR CHOL SBP DBP RATIO;
       OUTPUT OUT=STATS
              MEAN  =MEANCHOL MEANSBP MEANDBP RATIO
              MEDIAN=MEDCHOL;
    RUN;

    * Combine the LAST data set with the STATS data set;
    DATA FINAL;
       MERGE STATS LAST;
       BY PATNUM;
    RUN;

    * Print a final report;
    PROC PRINT DATA=FINAL LABEL DOUBLE;
       TITLE "Listing of data set FINAL in Example 7";
       ID PATNUM;
       VAR LASTDATE LASTCHOL MEANCHOL MEDCHOL MEANSBP MEANDBP RATIO;
       LABEL LASTDATE="Date of Last Visit"
             MEANCHOL="Mean Chol"
             MEANSBP ="Mean SBP"
             MEANDBP ="Mean DBP"
             MEDCHOL ="Median Chol"
             LASTCHOL="Last Chol"
             RATIO   ="Proportion of visits that were routine";
       FORMAT MEANCHOL MEANSBP MEANDBP MEDCHOL LASTCHOL 5.0
              RATIO 3.2;
    RUN;
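Example 7 uses PROC UNIVARIATE to obtain the median. As a hedged alternative, assuming a SAS release in which PROC MEANS accepts the MEDIAN keyword in its OUTPUT statement, the per-patient means and the cholesterol median might be produced as sketched below. STATS2 is a hypothetical data set name chosen so that the STATS data set from the example is left untouched.

```sas
* Sketch only: STATS2 is a hypothetical name used for illustration;
PROC MEANS DATA=NEW_CLIN NOPRINT NWAY;
   CLASS PATNUM;                           * one output row per patient;
   VAR CHOL SBP DBP RATIO;
   OUTPUT OUT=STATS2 (DROP=_TYPE_ _FREQ_)  /* drop the automatic variables */
          MEAN=MEANCHOL MEANSBP MEANDBP RATIO
          MEDIAN(CHOL)=MEDCHOL;            /* median of CHOL only */
RUN;
```

Because CLASS is used instead of BY, the input does not have to be sorted for this step, and the NWAY rows should come out in PATNUM order, so STATS2 could be merged with LAST by PATNUM in the same way as STATS.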
