SAS Procedures For Common Statistical Analyses: Contents
Contents:
1. Introduction/Data Set Up
2. Describing Quantitative Variables
3. Describing Qualitative Variables
4. Two-Sample Tests (Independent Samples)
5. Completely Randomized Design (1-Way ANOVA)
6. Randomized Block Design
7. 2-Factor ANOVA
8. Chi-Square Tests
9. Linear Regression
10. Correlation
11. Generalized Linear Models
a) Logistic Regression
b) Poisson Regression
c) Negative Binomial Regression
1. Introduction/Data Set-Up
For all descriptions, each line of the dataset represents an individual
case, with 3 quantitative variables (X, Y, Z) measured and 2 qualitative
variables (A, B) given, unless otherwise noted.
DATA ONE;
INPUT X Y Z A B;
CARDS;
Data Here
;
RUN;
NOTE: Any procedure can be run separately for each level of one or more
factors (BY statement), or restricted to the cases that meet some
criterion (WHERE statement).
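As a minimal sketch of the note above, assuming the dataset ONE with variables X, A and B from the introduction: PROC MEANS is used here purely for illustration, and the criterion B = 1 is a hypothetical example.

```sas
/* BY-group processing requires the data to be sorted by the BY variable */
PROC SORT DATA=ONE;
  BY A;
RUN;

/* Summary statistics for X, computed separately for each level of A */
PROC MEANS DATA=ONE;
  BY A;
  VAR X;
RUN;

/* Same summary, restricted to cases with B = 1 (illustrative criterion) */
PROC MEANS DATA=ONE;
  WHERE B = 1;
  VAR X;
RUN;
```

The same BY and WHERE statements can be added to the other procedures in this handout in the same way.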
DATA ONE;
INPUT A B NUMCASE;
CARDS;
1 1 25
1 2 32
2 1 17
2 2 42
;
RUN;
/* Each data line is one cell of the table; WEIGHT makes it count as NUMCASE cases */
PROC FREQ;
TABLES A*B;
WEIGHT NUMCASE;
RUN;
4. Two-Sample Tests (Independent Samples)
PROC TTEST;
CLASS A;
VAR X;
RUN;
5. Completely Randomized Design (1-Way ANOVA)
PROC GLM;
CLASS A;
MODEL Y = A;
MEANS A / BON TUKEY HOVTEST;
OUTPUT OUT=AOVOUT R=E;
RUN;
6. Randomized Block Design
/* A = treatment factor, B = block factor */
PROC GLM;
CLASS A B;
MODEL Y = A B;
MEANS A / BON TUKEY;
OUTPUT OUT=AOVOUT R=E;
RUN;
/* Friedman's test (rank-based alternative): CMH2 with rank scores, stratified by block B */
PROC FREQ;
TABLES B*A*Y / CMH2 SCORES=RANK NOPRINT;
RUN;
7. 2-Factor ANOVA
Statistical Model:
$Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}, \quad i=1,\dots,a; \; j=1,\dots,b; \; k=1,\dots,n$
The dataset AOVOUT will contain the original dataset and residuals
(with variable name E).
/* Additive model: main effects of A and B only */
PROC GLM;
CLASS A B;
MODEL Y = A B;
MEANS A B / BON TUKEY;
OUTPUT OUT=AOVOUT R=E;
RUN;
/* Model with the A*B interaction term */
PROC GLM;
CLASS A B;
MODEL Y = A B A*B;
MEANS A B / BON TUKEY;
OUTPUT OUT=AOVOUT R=E;
RUN;
8. Chi-Square Tests
Cases are classified on two qualitative variables, A and B.
We want to test whether the classifications are independent (equivalently,
whether the conditional distribution of variable B is the same for every
level of A).
PROC FREQ;
TABLES A*B / CHISQ EXPECTED;
RUN;
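As a reminder of what the CHISQ and EXPECTED options report, the standard Pearson statistic can be written as follows. The notation is introduced here and is not from the original: $n_{ij}$ is the observed count in row $i$, column $j$ of an $r \times c$ table, $n_{i\cdot}$ and $n_{\cdot j}$ are the row and column totals, and $n$ is the overall total.

```latex
\[
E_{ij} = \frac{n_{i\cdot}\, n_{\cdot j}}{n},
\qquad
X^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(n_{ij} - E_{ij})^2}{E_{ij}},
\qquad
df = (r-1)(c-1)
\]
```

Under independence, $X^2$ is approximately chi-square distributed with $(r-1)(c-1)$ degrees of freedom when the expected counts are not too small.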
/* MEASURES requests measures of association between A and B */
PROC FREQ;
TABLES A*B / MEASURES;
RUN;
9. Linear Regression
Simple Linear Regression
Statistical Model: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i, \quad i=1,\dots,n$
The dataset REGOUT will contain the original dataset and residuals
(with variable name E).
PROC REG;
MODEL Y = X;
OUTPUT OUT=REGOUT R=E;
RUN;
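One possible use of the saved residuals is to check the model assumptions. The following is a sketch only, assuming the dataset REGOUT and residual variable E created by the step above; the choice of PROC UNIVARIATE and PROC SGPLOT is an illustration, not part of the original handout.

```sas
/* Normality tests and summary statistics for the residuals E */
PROC UNIVARIATE DATA=REGOUT NORMAL;
  VAR E;
RUN;

/* Residuals against the predictor X, to look for curvature or unequal variance */
PROC SGPLOT DATA=REGOUT;
  SCATTER X=X Y=E;
  REFLINE 0 / AXIS=Y;
RUN;
```

The same checks can be applied to the AOVOUT datasets produced by the PROC GLM steps, since they store residuals under the same variable name E.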
Multiple Linear Regression
PROC REG;
MODEL Y = X1 X2 … Xk;
OUTPUT OUT=REGOUT R=E;
RUN;
10. Correlation
Data: Variables Y1,…,Yk
Pairwise Bivariate Correlations
PROC CORR; VAR Y1 … Yk; RUN;
11. Generalized Linear Models
Logistic Regression
PROC GENMOD;
MODEL Y = X / DIST=BIN LINK=LOGIT;
RUN;
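The DIST=BIN LINK=LOGIT call above corresponds to the usual logistic regression model for a binary outcome. It is written out here as a reminder, assuming Y is coded 0/1 and writing $\pi_i$ for the probability of the modeled level (notation introduced here):

```latex
\[
Y_i \sim \mathrm{Bernoulli}(\pi_i),
\qquad
\log\!\left(\frac{\pi_i}{1-\pi_i}\right) = \beta_0 + \beta_1 X_i,
\qquad
\pi_i = \frac{e^{\beta_0 + \beta_1 X_i}}{1 + e^{\beta_0 + \beta_1 X_i}}
\]
```

Note that with a 0/1 response, PROC GENMOD by default models the probability of the lower-ordered level (here 0); the DESCENDING option on the PROC GENMOD statement reverses this so that the probability of Y=1 is modeled.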
Poisson Regression
Statistical Model: Y is a count outcome:
$Y_i \sim \mathrm{Poisson}(\lambda_i), \quad \log(\lambda_i) = \beta_0 + \beta_1 X_i, \quad E(Y_i) = \lambda_i, \quad V(Y_i) = \lambda_i$
PROC GENMOD;
MODEL Y = X / DIST=POI LINK=LOG;
RUN;
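Because the model is linear on the log scale, the slope reported by PROC GENMOD is usually interpreted after exponentiating. A short derivation, writing $\lambda(x)$ for the mean at $X = x$ (notation introduced here):

```latex
\[
\log \lambda(x+1) - \log \lambda(x)
  = \bigl(\beta_0 + \beta_1 (x+1)\bigr) - \bigl(\beta_0 + \beta_1 x\bigr)
  = \beta_1
\quad\Longrightarrow\quad
\frac{\lambda(x+1)}{\lambda(x)} = e^{\beta_1}
\]
```

So a one-unit increase in X multiplies the expected count by $e^{\beta_1}$.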
Negative Binomial Regression
PROC GENMOD;
MODEL Y = X / DIST=NB LINK=LOG;
RUN;
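The DIST=NB fit above keeps the same log-linear mean as the Poisson model but allows extra variation. As a sketch in the parameterization PROC GENMOD uses, with $\mu_i$ for the mean and $k$ for the dispersion parameter (symbols introduced here):

```latex
\[
\log(\mu_i) = \beta_0 + \beta_1 X_i,
\qquad
E(Y_i) = \mu_i,
\qquad
V(Y_i) = \mu_i + k\,\mu_i^2, \quad k > 0
\]
```

As $k \to 0$ the variance approaches $\mu_i$, which is the Poisson case, so a clearly positive estimate of $k$ suggests overdispersion relative to the Poisson model.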