SAS Learning Module Collapsing Across Observations Using Proc SQL
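2. Creating a new variable of group mean

We will continue to use the data set kids from the previous example. Now we want to use the variable famid as the grouping variable and create a new variable that holds the group mean of the variable age. Because the select list includes columns that are not in the group by clause, proc sql remerges each group mean back onto every row of its group.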
proc sql;
  create table kids2 as
  select *, mean(age) label="group average" as mean_age
  from kids
  group by famid;
quit;
http://www.ats.ucla.edu/stat/sas/modules/sqlcollapse.htm
1/17/2008
title 'New Variable of Group Mean';
proc print data=kids2 noobs;
run;

title 'Label at Work';
proc freq data=kids2;
  table mean_age;
run;
In the proc print output below we see the new group-mean variable we just created. The proc freq output shows the label we assigned to that variable.
New Variable of Group Mean

kidname    sex    famid    birth    age    wt    mean_age
Barb        f       1        3       3     20     6.00000
Bob         m       1        2       6     40     6.00000
Beth        f       1        1       9     60     6.00000
Ann         f       2        3       2     20     5.33333
Al          m       2        2       6     50     5.33333
Andy        m       2        1       8     80     5.33333
Pete        m       3        1       6     60     4.00000
Phil        m       3        3       2     20     4.00000
Pam         f       3        2       4     40     4.00000

Label at Work

The FREQ Procedure

group average

                                       Cumulative    Cumulative
    mean_age    Frequency    Percent    Frequency       Percent
------------------------------------------------------------------
           4           3       33.33            3         33.33
5.3333333333           3       33.33            6         66.67
           6           3       33.33            9        100.00
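The same remerging behavior can also produce each child's deviation from the family mean in a single query. The following is only a sketch and is not part of the original module: the data step rebuilds kids from the proc print listing above (an assumption about how the data set was created), and the names kids_dev and age_dev are made up for illustration.

```sas
/* Assumption: kids is rebuilt here from the printed listing
   so that the example can run on its own. */
data kids;
  input kidname $ sex $ famid birth age wt;
  datalines;
Barb f 1 3 3 20
Bob  m 1 2 6 40
Beth f 1 1 9 60
Ann  f 2 3 2 20
Al   m 2 2 6 50
Andy m 2 1 8 80
Pete m 3 1 6 60
Phil m 3 3 2 20
Pam  f 3 2 4 40
;
run;

/* age_dev (hypothetical name) is each child's age minus
   the mean age of that child's family. */
proc sql;
  create table kids_dev as
  select kidname, famid, age,
         mean(age) as mean_age,
         age - mean(age) as age_dev
  from kids
  group by famid;
quit;

proc print data=kids_dev noobs;
run;
```

When this runs, SAS writes a note to the log saying that the query requires remerging summary statistics back with the original data. That note is expected for this kind of query.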
3. Creating multiple variables of summary statistics at once

Sometimes we only need summary statistics by a group variable, similar to the output of proc means. This can also be done in proc sql, as shown in our next example.
proc sql;
  create table kids3 as
  select famid,
         mean(age) as mean_age, std(age) as std_age,
         mean(wt) as mean_wt, std(wt) as std_wt
  from kids
  group by famid;
quit;

proc print data=kids3 noobs;
run;

famid    mean_age    std_age    mean_wt    std_wt
  1       6.00000    3.00000       40        20
  2       5.33333    3.05505       50        30
  3       4.00000    2.00000       40        20
If you only want to see the statistics rather than create a new data set, you can omit the create table ... as part and run only the select statement. The result is then shown in the output window.
proc sql;
  select famid,
         mean(age) as mean_age, std(age) as std_age,
         mean(wt) as mean_wt, std(wt) as std_wt
  from kids
  group by famid;
quit;

From the Output Window:

famid    mean_age    std_age    mean_wt    std_wt
------------------------------------------------
    1           6          3         40        20
    2    5.333333    3.05505         50        30
    3           4          2         40        20
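For comparison, a proc means step that reports the same statistics might look like the sketch below. It is not part of the original module and assumes that the kids data set used throughout already exists.

```sas
/* Assumes the kids data set from the earlier examples exists. */
/* Requests only the mean and standard deviation of age and wt,
   computed separately for each family. */
proc means data=kids mean std;
  class famid;
  var age wt;
run;
```

The proc sql version is convenient when the statistics should end up in a data set with column names of your choosing, while proc means gives a quick printed report.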
5. Creating variables and their summary statistics on-the-fly

Let's say that we want to know the number of boys and girls in each family. We can use the variable sex to figure this out in one step using proc sql, as shown below.
proc sql;
  create table my_count as
  select famid, sum(boy) as num_boy, sum(girl) as num_girl
  from (select famid, (sex='m') as boy, (sex='f') as girl from kids)
  group by famid;
quit;

proc print data=my_count noobs;
run;

From the Output Window:

famid    num_boy    num_girl
  1         1          2
  2         2          1
  3         2          1
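The inner query works because SAS evaluates a comparison such as sex='m' to 1 when it is true and 0 when it is false. Under that same rule, the comparison can be summed directly without a subquery. The following is a sketch rather than the module's own code; the table name my_count2 is made up for illustration, and kids is assumed to exist already.

```sas
/* Assumes the kids data set from the earlier examples exists. */
/* Each comparison yields 1 (true) or 0 (false), so summing it
   counts the rows in each family that satisfy the condition. */
proc sql;
  create table my_count2 as
  select famid,
         sum(sex='m') as num_boy,
         sum(sex='f') as num_girl
  from kids
  group by famid;
quit;

proc print data=my_count2 noobs;
run;
```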
6. Creating a grand mean and saving it into a SAS macro variable

Sometimes we want to get a summary statistic for a variable and use it later for other purposes. We can save the summary statistic in a macro variable, which can then be accessed throughout the entire SAS session. proc sql is very handy for this, as shown in the following example, where we save the grand mean of the variable age into the macro variable meanage.
proc sql noprint;
  select mean(age) into :meanage from kids;
quit;
%put &meanage;

From Log Window:

3027  proc sql noprint;
3028  select mean(age) into :meanage from kids;
3029  quit;
NOTE: PROCEDURE SQL used:
      real time           0.00 seconds
      cpu time            0.00 seconds

3030  %put &meanage;
5.111111
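As one possible later use, the macro variable can be referenced in a data step to center age at the grand mean. This is a sketch only: the names kids_centered and age_c are made up for illustration, and the step assumes that kids exists and that &meanage was created by the proc sql step above.

```sas
/* Assumes kids exists and &meanage holds the grand mean of age. */
data kids_centered;           /* hypothetical data set name */
  set kids;
  age_c = age - &meanage;     /* hypothetical variable: age minus grand mean */
run;

proc print data=kids_centered noobs;
  var kidname age age_c;
run;
```

Note that a macro variable stores text, not a number. As the log shows, the stored value is 5.111111, so the centered values are accurate only to about six decimal places.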
7. Creating group means and saving them into a sequence of SAS macro variables
proc sql noprint;
  select mean(age) into :meanage1 - :meanage3
  from kids
  group by famid;
quit;
%put _user_;
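The statement %put _user_ lists every user-defined macro variable in the log. To see which mean belongs to which family, one option is to store famid in a parallel sequence of macro variables and loop over both. The sketch below is an illustration, not part of the original module: the macro show_means and the macro variables fam1 to fam3 are hypothetical names, and kids is assumed to exist with exactly three families.

```sas
/* Assumes the kids data set exists and has three families. */
proc sql noprint;
  select famid, mean(age)
    into :fam1 - :fam3, :meanage1 - :meanage3
  from kids
  group by famid;
quit;

/* show_means (hypothetical macro) writes one log line per family.
   &&fam&i resolves first to &fam1, &fam2, ... and then to its value. */
%macro show_means(n);
  %local i;
  %do i = 1 %to &n;
    %put Family &&fam&i: mean age = &&meanage&i;
  %end;
%mend show_means;

%show_means(3)
```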