Freq Procedure: For Creating Frequency Tables & Contingency Tables

This document provides examples of using SAS procedures to analyze data. It demonstrates how to use PROC FREQ to create one-way and two-way tables, PROC UNIVARIATE to analyze variable distributions and obtain descriptive statistics, and PROC FREQ with the CHISQ option to conduct a chi-square test of association on a 2x2 contingency table. It also shows using ODS tables and options like NOPRINT, OUTPUT, and SELECT with these procedures.

Uploaded by

SadafShaikh

FREQ PROCEDURE
For creating frequency tables & contingency tables

FREQ PROCEDURE
proc freq data=new;
tables a / missprint;
title '1-WAY FREQUENCY TABLE WITH MISSPRINT OPTION';
run;

Two-way contingency tables

proc freq data=new;
tables a*b;
title '2-WAY CONTINGENCY TABLE';
run;

Data SummerSchool
Background
High school students applied for courses in a summer enrichment program; these courses included journalism, art history, statistics, graphic arts, and computer programming. The students accepted were randomly assigned to classes with and without internships in local companies.
Investigation
Researchers are interested in whether there is an association between internship status and summer program enrollment. The Pearson chi-square statistic is an appropriate statistic to assess the association in the corresponding table.
proc freq data=SummerSchool order=data;
tables Internship*Enrollment / chisq;
weight Count;
run;

SummerSchool: 2X2 contingency table

SummerSchool: Chi-square test of association

The Pearson chi-square statistic is labeled 'Chi-Square' and has a value of 0.8189 with 1 degree of freedom. The associated p-value is 0.3655, which means that there is no significant evidence of an association between internship status and program enrollment.
The other chi-square statistics have similar values and are asymptotically equivalent.
The other statistics (phi coefficient, contingency coefficient, and Cramer's V) are measures of association derived from the Pearson chi-square. For Fisher's exact test, the two-sided p-value is 0.4122, which also shows no association between internship status and program enrollment.
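
For reference, the statistics named above can be written out as follows. This is a sketch using the standard textbook definitions; the symbols (observed counts n_ij, expected counts e_ij, total n, R rows, C columns) are notation introduced here, not taken from the slides, and the cell counts themselves are not reproduced in this text.

```latex
% Pearson chi-square for an R x C table
\chi^2 = \sum_{i=1}^{R}\sum_{j=1}^{C} \frac{(n_{ij} - e_{ij})^2}{e_{ij}},
\qquad e_{ij} = \frac{n_{i\cdot}\, n_{\cdot j}}{n},
\qquad \mathrm{df} = (R-1)(C-1) = (2-1)(2-1) = 1

% Measures of association derived from it (magnitudes)
|\phi| = \sqrt{\chi^2 / n}, \qquad
\text{contingency coefficient} = \sqrt{\frac{\chi^2}{\chi^2 + n}}, \qquad
\text{Cram\'er's } V = \sqrt{\frac{\chi^2}{n \,\min(R-1,\, C-1)}}
```

For a 2x2 table, min(R-1, C-1) = 1, so Cramer's V and the phi coefficient have the same magnitude.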

Analysis of distributions of variables
PROC UNIVARIATE

Data Gains

PROC UNIVARIATE: Vanilla Flavor


Proc univariate data=gains ;
var height;
run;
Part of SAS output is shown below:


PROC UNIVARIATE: OUTPUT


PROC UNIVARIATE: OUTPUT (Cont.)

PROC UNIVARIATE: NOPRINT OPTION

proc univariate noprint data=gains;
var height weight;
output out=unigain                        /* Output statement */
       mean=hmean wmean
       pctlpts=1 5 10 25 75 90 95 99      /* Percentile points */
       pctlpre=h w;                       /* Prefix for percentile variables */
run;
Note: Use proc print to print data UNIGAIN
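
The note above can be carried out with a short PROC PRINT step. This is a minimal sketch; the NOOBS option is an addition here, used only to suppress the observation-number column. Given the PCTLPRE= and PCTLPTS= values in the OUTPUT statement, UNIGAIN should contain hmean, wmean, and percentile variables named h1, h5, ..., h99 and w1, w5, ..., w99.

```sas
/* Print the one-observation summary data set created by the OUTPUT statement */
proc print data=unigain noobs;
run;
```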

SAS output is shown below:

ODS Tables Produced with the PROC UNIVARIATE Statement

ODS Table Name      Description                                          Option
BasicIntervals      Confidence intervals for mean, standard              CIBASIC
                    deviation, variance
BasicMeasures       Measures of location and variability                 Default
ExtremeObs          Extreme observations                                 Default
ExtremeValues       Extreme values                                       NEXTRVAL=
Frequencies         Frequencies                                          FREQ
LocationCounts      Counts used for sign test and signed rank test       LOCCOUNT
MissingValues       Missing values                                       Default, if missing values exist
Modes               Modes                                                MODES
Moments             Sample moments                                       Default
Plots               Line printer plots                                   PLOTS
Quantiles           Quantiles                                            Default
RobustScale         Robust measures of scale                             ROBUSTSCALE
SSPlots             Line printer side-by-side box plots                  PLOTS (with BY statement)
TestsForLocation    Tests for location                                   Default
TestsForNormality   Tests for normality                                  NORMALTEST
TrimmedMeans        Trimmed means                                        TRIMMED=

SAS data set used in examples

Getting estimates of basic measures and quantiles o


Requesting tables of interest - ODS SELECT statement
ods listing close;                        /* Opening and closing ODS destinations */
ods html;
data BPressure;
set BPressure;
run;
title 'Systolic and Diastolic Blood Pressure';
ods select BasicMeasures Quantiles;       /* Requested ODS tables */
proc univariate data=BPressure;
var Systolic Diastolic;
run;
ods html close;                           /* Opening and closing ODS destinations */
ods listing;

ODS Output: Basic Measures & Quantiles

Systolic and Diastolic Blood Pressure

Basic Statistical Measures
Location                Variability
Mean      121.2727      Std Deviation          14.28346
Median    120.0000      Variance              204.01732
Mode      120.0000      Range                  69.00000
                        Interquartile Range    13.00000

Quantiles (Definition 5)
Quantile      Estimate
100% Max      165
99%           165
95%           140
90%           134
75% Q3        125
50% Median    120
25% Q1        112
10%           108
5%            100
1%             96
0% Min         96

Proc Univariate: Robust measures of location and scale

Data BPressure;
Set BPressure;
Run;
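
The PROC UNIVARIATE call that produced the robust-measure tables shown after this point is not present in the extracted text. A minimal sketch, assuming the tables were requested with the TRIMMED=, WINSORIZED=, and ROBUSTSCALE options; the option values and the analysis variable are inferred from the output (a 1-observation trimmed mean, a 0.1 trimmed mean, and a 13.64% Winsorized mean), not taken from the slides.

```sas
/* Assumed reconstruction: option values inferred from the output tables */
proc univariate data=BPressure
                trimmed=1 .1        /* trim 1 observation, then 10%, from each tail */
                winsorized=.1       /* Winsorize 10% in each tail */
                robustscale;        /* request the Robust Measures of Scale table */
   var Systolic;                    /* assumed analysis variable */
run;
```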

Robust measures of location

Winsorized Means
Percent      Number       Winsorized   Std Error    95% Confidence         DF   t for H0:   Pr > |t|
Winsorized   Winsorized   Mean         Winsorized   Limits                      Mu0=0.00
in Tail      in Tail                   Mean
13.64        3            120.64       2.42         115.48     125.78      15   49.9102     <.0001

Trimmed Means
Percent      Number       Trimmed      Std Error    95% Confidence         DF   t for H0:   Pr > |t|
Trimmed      Trimmed      Mean         Trimmed      Limits                      Mu0=0.00
in Tail      in Tail                   Mean
4.55         1            120.3500     2.573536     114.9635   125.7365    19   46.76446    <.0001
13.64        3            120.3125     2.395387     115.2069   125.4181    15   50.22675    <.0001

First row of Trimmed Means: 1-observation trimmed mean.
Second row of Trimmed Means: 0.1 trimmed mean (10% of n=22 is 2.2 observations, so 3 observations are trimmed from each tail).
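
The percentages, degrees of freedom, and t statistics in the tables above can be checked by hand. A short worked sketch with n = 22; here k denotes the number of observations trimmed (or Winsorized) in each tail, a symbol introduced for illustration. The degrees-of-freedom expression n - 2k - 1 is the usual one for trimmed and Winsorized t tests and is consistent with the DF column.

```latex
% Number trimmed per tail for the 0.1 trimmed mean, and tail percentages
k = \lceil 0.1 \times 22 \rceil = \lceil 2.2 \rceil = 3,
\qquad \frac{3}{22} \times 100\% \approx 13.64\%,
\qquad \frac{1}{22} \times 100\% \approx 4.55\%

% Degrees of freedom
\mathrm{DF} = n - 2k - 1: \qquad 22 - 2(1) - 1 = 19, \qquad 22 - 2(3) - 1 = 15

% t statistic for H_0: \mu_0 = 0 (1-observation trimmed mean)
t = \frac{120.3500}{2.573536} \approx 46.764
```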

Robust measures of scale

Robust Measures of Scale
Measure                       Value       Estimate of Sigma
Interquartile Range (Q3-Q1)   13.00000     9.63691
Gini's Mean Difference        15.03030    13.32026
MAD                            6.50000     9.63690
Sn                             9.54080     9.54080
Qn                            13.33140    11.36786
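
The "Estimate of Sigma" column rescales each measure so that it estimates the standard deviation when the data are normal. A worked sketch for three of the rows, using the standard normal-consistency constants; that these are the constants behind the table is an assumption, although they do reproduce the tabulated values.

```latex
\hat{\sigma}_{\mathrm{IQR}} = \frac{\mathrm{IQR}}{1.34898} = \frac{13}{1.34898} \approx 9.6369

\hat{\sigma}_{\mathrm{MAD}} = 1.4826 \times \mathrm{MAD} = 1.4826 \times 6.5 \approx 9.6369

\hat{\sigma}_{G} = \frac{\sqrt{\pi}}{2}\, G \approx 0.88623 \times 15.0303 \approx 13.3203
```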

Quantile-Quantile Plots: Agreement of empirical distribution with theoretical distribution

Q-Q Plots
Reference theoretical distribution : Normal with estimated mean an

Use estimated mean and SD


For normal distribution
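
The code for this slide did not survive extraction. A minimal sketch of a normal Q-Q plot with the reference line based on the estimated mean and standard deviation; the data set and variable names are borrowed from the earlier blood pressure slides and are hypothetical here.

```sas
/* Hypothetical example: BPressure and Systolic are assumed names */
proc univariate data=BPressure noprint;     /* NOPRINT suppresses the tables, not the plot */
   qqplot Systolic / normal(mu=est sigma=est);   /* use estimated mean and SD */
run;
```

Points that fall close to the reference line suggest that the normal distribution describes the data reasonably well.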
