Proc Univariate HTML

This document is a comprehensive guide to the PROC UNIVARIATE procedure in SAS, detailing its capabilities for descriptive statistics and normality testing. It compares PROC UNIVARIATE with PROC MEANS, highlighting its superior features such as custom percentiles, normality tests, and graphical outputs. The guide includes code examples and explanations for various statistical analyses, including calculating extreme values, checking normality, and generating plots.

Uploaded by

urielusb

COMPLETE GUIDE TO PROC UNIVARIATE

Deepanshu Bhalla 14 Comments SAS

This tutorial explains how to explore data with PROC UNIVARIATE. It is one of the most powerful
SAS procedures for running descriptive statistics and for checking important assumptions of
statistical techniques, such as normality and the absence of outliers. Despite these features,
PROC UNIVARIATE is less popular than PROC MEANS. Most SAS analysts are comfortable running
PROC MEANS for summary statistics such as count, mean, median and number of missing values.
In reality, PROC UNIVARIATE surpasses PROC MEANS in the range of options it supports. The main
differences between the two procedures are listed below.

PROC UNIVARIATE vs. PROC MEANS

1. PROC MEANS can calculate a fixed set of percentile points such as the 1st, 5th, 10th, 25th,
50th, 75th, 90th, 95th and 99th percentiles, but it cannot calculate arbitrary custom percentiles
such as the 97.5th or 99.5th. PROC UNIVARIATE, on the other hand, can calculate custom percentiles.

2. PROC UNIVARIATE can list extreme observations - the five lowest and five highest
values. PROC MEANS, on the other hand, can only report the MIN and MAX values.

3. PROC UNIVARIATE supports normality tests to check normal distribution. Whereas, PROC
MEANS does not support normality tests.

4. PROC UNIVARIATE generates multiple plots such as histograms, box plots and stem-and-leaf
diagrams, whereas PROC MEANS does not support graphics.
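For contrast with the PROC UNIVARIATE examples that follow, here is a minimal sketch of a typical PROC MEANS call on the same sashelp.shoes dataset. The particular statistic keywords are only an illustrative selection, not a recommendation.

```sas
/* PROC MEANS: a compact summary limited to built-in statistic keywords. */
/* There is no table of extreme observations, no normality test and      */
/* no plot in this output.                                               */
proc means data = sashelp.shoes n nmiss mean median p5 p95 min max;
var sales;
run;
```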


Basic PROC UNIVARIATE Code

In the example below, we use the sashelp.shoes dataset. SALES is the numeric (or
measured) variable.

proc univariate data = sashelp.shoes;

var sales;
run;

Default Output of PROC UNIVARIATE

1. Moments : Count, Mean, Standard Deviation, SUM etc

2. Basic Statistics : Mean, Median, Mode etc

Default Output : PART I

3. Tests for Location : one-sample t-test, Signed Rank test.

4. Percentiles (Quantiles)

5. Extreme Observations - the five smallest and five largest values along with their observation (row) numbers.

Default Output : Part II

Example 1 : Analysis of Sales by Region

Suppose you are asked to calculate basic statistics of sales by region. In this case, region is a
grouping (or categorical) variable. The CLASS statement is used to specify the categorical variable.

proc univariate data = sashelp.shoes;

var sales;
class region;
run;

See the output shown below -

PROC UNIVARIATE Class Statement

Similar output is generated for the other regions - Asia, Canada, Eastern Europe, Middle East
etc.

2. Generating only Percentiles in Output

Suppose you want only the percentiles to appear in the output window. By default, PROC
UNIVARIATE creates five output tables : Moments, BasicMeasures, TestsForLocation, Quantiles,
and ExtremeObs. The ODS SELECT statement can be used to select only one of these tables.
Quantiles is the standard table name of PROC UNIVARIATE for percentiles, which is what we want.
ODS stands for Output Delivery System.

ods select Quantiles;

proc univariate data = sashelp.shoes;
var sales;
class region;
run;

How to know the table names generated by SAS procedure

The ODS TRACE ON statement writes the name and label of each table that a SAS procedure
generates to the log window.

ods trace on;

proc univariate data = sashelp.shoes;
var sales;
run;
ods trace off;

How to write Percentile Information in SAS Dataset

The ODS OUTPUT statement is used to write a table from the results window to a SAS dataset. In
the code below, temp is the name of the dataset that holds all the percentile information.

ods output Quantiles = temp;

proc univariate data = sashelp.shoes;
var sales;
class region;
run;
ods output close;

3. Calculating Extreme Values

Just as we wrote percentiles to a dataset in the previous example, we can save the extreme
values using the ExtremeObs table. The ODS OUTPUT statement tells SAS to write the extreme
values to a dataset named outlier. ExtremeObs is the standard table name of PROC UNIVARIATE
for extreme values.

ods output extremeobs = outlier;

proc univariate data = sashelp.shoes;
var sales;
class region;
run;
ods output close;
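By default the ExtremeObs table lists five observations in each tail. As a hedged sketch, the NEXTROBS= option of the PROC UNIVARIATE statement changes that count. The value 10 and the dataset name outlier10 are arbitrary choices made for illustration.

```sas
/* Show only the extreme observations table and save it to a dataset */
ods select ExtremeObs;
ods output ExtremeObs = outlier10;

/* NEXTROBS=10 requests the 10 lowest and 10 highest values of SALES */
proc univariate data = sashelp.shoes nextrobs = 10;
var sales;
run;
```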

4. Checking Normality

Many statistical techniques assume that the data is normally distributed. It is important
to check this assumption before running a model.

There are multiple ways to check Normality :

1. Plot Histogram and see the distribution

2. Calculate Skewness

3. Normality Tests

I. Plot Histogram

A histogram shows visually whether the data is roughly normally distributed.

proc univariate data=sashelp.shoes NOPRINT;

var sales;
HISTOGRAM / NORMAL (COLOR=RED);
run;

It also helps to check whether there is an outlier or not.

II. Skewness

Skewness is a measure of the degree of asymmetry of a distribution. If skewness is close to 0,
the distribution is approximately symmetric, which is consistent with (but does not by itself
prove) normality.

Skewness

Positively skewed data has a few extremely large values that pull the mean upward. It is also
called right skewed.

Positive Skewness : If skewness > 0, data is positively skewed. Another way to see
positive skewness : the mean is typically greater than the median, and the median is greater
than the mode.

Negatively skewed data has a few extremely small values that pull the mean downward. It is also
called left skewed.

Negative Skewness : If skewness < 0, data is negatively skewed. Another way to
see negative skewness : the mean is typically less than the median, and the median is less than
the mode.

Rule :

1. If skewness < −1 or > +1, the distribution is highly skewed.

2. If skewness is between −1 and −0.5 or between 0.5 and +1, the distribution is moderately
skewed.

3. If skewness > −0.5 and < 0.5, the distribution is approximately symmetric.

ods select Moments;

proc univariate data = sashelp.shoes;
var sales;
run;

Skewness and Normality

Since skewness is greater than 1, the data is highly skewed and non-normal.
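The rule of thumb above can also be applied programmatically. The sketch below is one possible way to do it: it saves the skewness with the SKEWNESS= keyword of the OUTPUT statement and labels it in a DATA step. The dataset names skew_out and skew_flag and the variable names skew and shape are hypothetical.

```sas
/* Save the skewness of SALES in a one-row dataset */
proc univariate data = sashelp.shoes noprint;
var sales;
output out = skew_out skewness = skew;
run;

/* Apply the rule-of-thumb cut-offs: |skew| > 1, 0.5 to 1, below 0.5 */
data skew_flag;
set skew_out;
length shape $ 24;
if abs(skew) > 1 then shape = 'highly skewed';
else if abs(skew) >= 0.5 then shape = 'moderately skewed';
else shape = 'approximately symmetric';
run;

proc print data = skew_flag noobs;
run;
```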

III. Normality Tests

The NORMAL keyword tells SAS to generate normality tests.

ods select TestsforNormality;

proc univariate data = sashelp.shoes normal;
var sales;
run;

Tests for Normality

The two main tests for normality are as follows :

1. Shapiro Wilk Test [Sample Size <= 2000]

The null hypothesis is that the distribution is normal.

In the example above, the p-value is less than 0.05, so we reject the null hypothesis. It
implies the distribution is not normal. If the p-value is > 0.05, we fail to reject the null
hypothesis, that is, there is no evidence against normality.

PROC UNIVARIATE reports this test only when the sample size is 2000 or less.

2. Kolmogorov-Smirnov Test [Sample Size > 2000]

In this test too, the null hypothesis states that the data is normally distributed.

If the p-value is > 0.05, we cannot reject normality. In the example above, the p-value is less
than 0.05, which means the data is not normal.

This test can be used for larger samples, with more than 2000 observations.
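The normality test results can also be captured in a dataset, using the same ODS OUTPUT approach shown earlier for percentiles. This is a minimal sketch; the dataset name normtests is hypothetical.

```sas
/* Write the TestsForNormality table to a dataset */
ods output TestsForNormality = normtests;

proc univariate data = sashelp.shoes normal;
var sales;
run;

/* Inspect the saved test statistics and p-values */
proc print data = normtests noobs;
run;
```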

5. Calculate Custom Percentiles

With the PCTLPTS= option, we can calculate custom percentiles. Suppose you need to generate the
10th, 20th, 30th, 40th, 50th, 60th, 70th, 80th, 90th and 100th percentiles.

proc univariate data = sashelp.shoes noprint;

var sales;
output out = temp
pctlpts = 10 to 100 by 10 pctlpre = p_;
run;

The OUTPUT OUT= statement tells SAS to save the percentile information in the TEMP
dataset. The PCTLPRE= option specifies the prefix of the variable names that hold
the PCTLPTS= percentiles.

Suppose you want to calculate 97.5 and 99.5 percentiles.

proc univariate data = sashelp.shoes noprint;

var sales;
output out = temp
pctlpts = 97.5,99.5 pctlpre = p_;
run;
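Custom percentiles can be combined with the CLASS statement used earlier to get one row of percentiles per region. The sketch below assumes the dataset name region_pctl, which is hypothetical. In the output dataset, the decimal point in a percentile is expected to be replaced by an underscore in the variable name (for example p_97_5).

```sas
/* 2.5th and 97.5th percentiles of SALES, computed separately per REGION */
proc univariate data = sashelp.shoes noprint;
class region;
var sales;
output out = region_pctl
pctlpts = 2.5 97.5 pctlpre = p_;
run;

/* One observation per region, holding the two percentile variables */
proc print data = region_pctl noobs;
run;
```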

6. Calculate Winsorized and Trimmed Means

The Winsorized and Trimmed Means are insensitive to outliers. They should be reported instead
of the mean when the data is highly skewed.

Trimmed Mean : Remove the extreme values and then calculate the mean of the remaining values.
For example, a 10% trimmed mean removes the lowest 10% and the highest 10% of values (roughly
those below the 10th percentile and above the 90th percentile) before averaging.

Winsorized Mean : Cap the extreme values and then calculate the mean. It is the same as the
trimmed mean except that, instead of removing the extreme values, we replace them with the
nearest remaining value (that is, we cap them at the kth percentile level).
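The two definitions above are easier to follow on a tiny dataset that can be checked by hand. The sketch below uses made-up values (the dataset toy and variable x are hypothetical) and trims or winsorizes one observation from each tail. An integer value of TRIMMED= or WINSORIZED= is read as a number of observations rather than a proportion.

```sas
/* Ten made-up values with one obvious outlier (100) */
data toy;
input x @@;
datalines;
1 2 3 4 5 6 7 8 9 100
;
run;

/* Ordinary mean   : (1 + 2 + ... + 9 + 100) / 10 = 14.5                   */
/* Trimmed mean    : drop 1 and 100, mean of 2..9 = 44 / 8 = 5.5           */
/* Winsorized mean : 1 becomes 2 and 100 becomes 9, total 55 / 10 = 5.5    */
ods select BasicMeasures TrimmedMeans WinsorizedMeans;
proc univariate data = toy trimmed = 1 winsorized = 1;
var x;
run;
```

The single outlier pulls the ordinary mean to 14.5, while both robust means stay at 5.5.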

Winsorized Mean

In the example below, we are calculating 20% Winsorized Mean.

ods select winsorizedmeans;

ods output winsorizedmeans=means;
proc univariate winsorized = 0.2 data=sashelp.shoes;
var sales;
run;

Winsorized Means

Percent Winsorized in Tail : 20% of values winsorized from each tail (upper and lower side)
Number Winsorized in Tail : 79 values winsorized from each tail

Trimmed Mean

In the example below, we are calculating 20% trimmed Mean.

ods select trimmedmeans;

ods output trimmedmeans=means;
proc univariate trimmed = 0.2 data=sashelp.shoes;
var sales;
run;

7. Calculate One-Sample T-test

It tests the null hypothesis that the mean of the variable is equal to 0. The alternative
hypothesis is that the mean is not equal to 0. When you run PROC UNIVARIATE, it generates the
one-sample t-test by default in the 'Tests for Location' section of the output.

ods select TestsForLocation;

proc univariate data=sashelp.shoes;
var sales;
run;

Since the p-value is less than 0.05, we reject the null hypothesis and conclude that the mean
value of the variable is significantly different from zero.
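Testing against zero is rarely informative for a variable such as sales. As a hedged sketch, the MU0= option of the PROC UNIVARIATE statement changes the hypothesized value used in the tests for location. The value 50000 is an arbitrary figure chosen for illustration.

```sas
/* Null hypothesis: the mean of SALES equals 50000 (instead of 0) */
ods select TestsForLocation;

proc univariate data = sashelp.shoes mu0 = 50000;
var sales;
run;
```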

Ttest with PROC Univariate

8. Generate Plots

PROC UNIVARIATE generates the following plots :

1. Histogram

2. Box Plot

3. Normal Probability Plot

The PLOT keyword is used to generate plots.

proc univariate data=sashelp.shoes PLOT;

var sales;
run;
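Individual plots can also be requested with dedicated statements instead of the PLOT keyword. The sketch below is one possible combination: a histogram with a fitted normal curve and a normal Q-Q plot whose reference line uses the estimated mean and standard deviation.

```sas
/* NOPRINT suppresses the statistics tables so only the graphs are shown */
proc univariate data = sashelp.shoes noprint;
var sales;
histogram sales / normal;
qqplot sales / normal(mu = est sigma = est);
run;
```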

About Author:
Deepanshu founded ListenData with a simple objective - Make analytics
easy to understand and follow. He has over 10 years of experience in data
science. During his tenure, he has worked with global clients in various
domains like Banking, Insurance, Private Equity, Telecom and Human
Resource.
