Rsudio Problems

The document contains multiple SAS programs that create, format, and analyze various temporary and permanent datasets for educational purposes. The programs cover topics such as randomly generating data, importing data from external files, calculating summary statistics, formatting values, and filtering observations based on certain criteria. The document appears to be a collection of example programs and exercises completed by a student to learn SAS fundamentals.

Uploaded by

Ben Van Neste

*Problem: 11.12 #8
Purpose: To create a temporary data set of 1,000 observations with
random but equally likely integers from 1 to 5
Programmer: Ben Van Neste
Date Written: May 19th, 2021;

data WORK.INPUT;
do i=1 to 1000;
x=int(rand('uniform')*5)+1;
output;
end;
run;

proc freq data=WORK.INPUT;


tables x/missing;
run;

Problem 12.17 #9:


*Problem: 12.17 #9
Purpose: To observe any customer value with the string SPIRIT in the
Sales data file
Programmer: Ben Van Neste
Date Written: May 19th, 2021;
proc import datafile=
'/home/u58352038/my_shared_file_links/sue.mcdaniel/Wks 5_6/Sales.xls'
dbms=xls out=Work.Import;
GETNAMES = YES;
run;

Proc contents data = WORK.IMPORT;


run;

data Spirited;
set WORK.IMPORT;
where find(Customer,'spirit','i');
run;

title "SPIRITED Customers";


proc print data=Spirited noobs;
run;

Problem 13.10 #5:


*Problem: 13.10 #5
Purpose: To create a data set of test scores and observe which grades
are passing
Programmer: Ben Van Neste
Date Written: May 19th, 2021;

libname learn 'c:sue.mcdaniel';


data PassingScore;
array pass_score{5} _temporary_
(65,70,60,62,68);
array Score{5};
input ID : $3. Score1-Score5;
ScoresPassed = 0;
do Test = 1 to 5;
ScoresPassed + (Score{Test} ge pass_score{Test});
end;
drop Test;
datalines;
001 90 88 92 95 90
002 64 64 77 72 71
003 68 69 80 75 70
004 88 77 66 77 67
;

title "Passing Grades";


proc print data=PassingScore;
id ID;
run;
Problem 9.11 #2:
*Problem: 9.11 #2
Purpose: To create a permanent and specifically organized data set of
dates
Programmer: Ben Van Neste
Date Written: May 3rd, 2021;

data ThreeDates;
input @1 Date1 mmddyy10.
@12 Date2 mmddyy10.
@23 Date3 date9.;
format Date1 Date2 Date3 mmddyy10.;

datalines;
01/03/1950 01/03/1960 03Jan1970
05/15/2000 05/15/2002 15May2003
10/10/1998 11/12/2000 25Dec2005
;
run;

data ThreeDates;
set ThreeDates;
year12=round(yrdif(Date1,Date2,'Actual'));
year23=round(yrdif(Date2,Date3,'Actual'));
run;
Title "ThreeDates";
proc print data=threedates;
run;

Problem 9.11 #8:


*Problem: 9.11 #8
Purpose: To create a temporary dataset formatted into MMDDYY10
Programmer: Ben Van Neste
Date Written: May 3rd, 2021;

data FormattedDates;
input Day Month Year;
datalines;
25 12 2005
1 1 1960
21 10 1946
;
run;

data FormattedDates;
set FormattedDates;
Date = mdy(Month,Day,Year);
format Date mmddyy10.;
run;

title "FormattedDates";
proc print data=FormattedDates;
run;

Problem 10.16 #3:


*Problem: 10.16 #3
Purpose: To create two temporary data sets to separately list the men
and women that have low cholesterol
Programmer: Ben Van Neste
Date Written: May 3rd, 2021;

libname learn 'c:sue.mcdaniel';


data WORK.Blood;
infile '/home/u58352038/my_shared_file_links/sue.mcdaniel/Wks 7&8/blood.txt' truncover;
length Gender $ 6 BloodType $ 2 AgeGroup $ 5;
input Subject
Gender
BloodType
AgeGroup
WBC
RBC
Chol;
label Gender = 'Gender'
BloodType = 'Blood Type'
AgeGroup = 'Age Group'
Chol = 'Cholesterol';
run;

data LowMale LowFemale;


set WORK.Blood;
where Chol lt 100 and not missing(Chol);
if Gender = 'Female' then output LowFemale;
else if Gender = 'Male' then output LowMale;
run;

title 'Women with Low Cholesterol';


proc print data=LowFemale noobs;
run;

title 'Men with Low Cholesterol';


proc print data=LowMale noobs;
run;
Problem 10.16 #7:
*Problem: 10.16 #7
Purpose: To create a temporary data set for the Gym that calculates
Cost Percent for all subjects
Programmer: Ben Van Neste
Date Written: April 12th, 2021;

data Gym;
infile '/home/u58352038/my_shared_file_links/sue.mcdaniel/Wks 7&8/gym.txt' truncover;
input Subj : $3.
Date : mmddyy10.
Fee;
format Date mmddyy8. Fee Dollar6.;
run;

proc means data=Gym noprint;
var Fee;
output out=MeanFee(drop=_type_ _freq_)
Mean=AveFee;
run;

data Gym;
set Gym;
if _n_ = 1 then set MeanFee;
FeePercent = round(100*fee / AveFee);
drop AveFee;
run;

title "Gym Cost Percent";


proc print data=Gym;
run;
Problem 7.10 #3:
*Purpose: To list observations from certain employees
Problem 7.10 #3
Programmer: Ben Van Neste
Date Written: April 20th, 2021;

proc import datafile=


'/home/u58352038/my_shared_file_links/sue.mcdaniel/Wks 5_6/Sales.xls'
dbms=xls out=Work.Import;
GETNAMES = YES;
run;

Title "Employee Observations";


Proc print data = Work.Import;
Where EmpID = "9888" OR EmpID = "0177";
run;

*Purpose: To list observations from certain employees
Problem 7.10 #3
Programmer: Ben Van Neste
Date Written: April 20th, 2021;

proc import datafile=


'/home/u58352038/my_shared_file_links/sue.mcdaniel/Wks 5_6/Sales.xls'
dbms=xls out=Work.Import;
GETNAMES = YES;
run;

Title "Employee Observations";


Proc print data = Work.Import;
Where EmpID in ("9888", "0177");
run;

Problem 7.10 #4:


*Purpose: To create a new dataset highlighting total sales in 4
regions
Problem 7.10 #4
Programmer: Ben Van Neste
Date Written: April 20th, 2021;

proc import datafile=


'/home/u58352038/my_shared_file_links/sue.mcdaniel/Wks 5_6/Sales.xls'
dbms=xls out=Work.Import;
GETNAMES = YES;
run;

Proc contents data = Work.import;


run;

Data refile;
set work.import (keep = TotalSales Region);
Select;
when (Region = 'North') Weight = 1.5;
when (Region = 'South') Weight = 1.7;
when (Region = 'East') Weight = 2.0;
when (Region = 'West') Weight = 2.0;
otherwise;
end;
run;

Title "Regional Sales";

proc print data = refile;


run;

Problem 7.10 #6
*Purpose: To specify North regions under 60 quantity and compare to
Pet’s are Us Customer
Problem 7.10 #6
Programmer: Ben Van Neste
Date Written: April 20th, 2021;

proc import datafile=


'/home/u58352038/my_shared_file_links/sue.mcdaniel/Wks 5_6/Sales.xls'
dbms=xls out=Work.Import;
GETNAMES = YES;
run;

Proc contents data = Work.import;


run;

Data refile;
set work.import;

where (Region = 'North' and Quantity < 60) or Customer eq "Pet's are Us";
run;

proc print data = refile;


run;

Problem 8.9 #4
*Purpose: To count the Missing Values in the Missing.txt data
Problem 8.9 #4
Programmer: Ben Van Neste
Date Written: April 26th, 2021;
data missing;

input A $ B $ C $;
if missing(A) then MissA + 1;
if missing(B) then MissB + 1;
if missing(C) then MissC + 1;

datalines;
1 2 3
4 5 .
6 7 8
9 10 11
;
run;

title "Missing Values";


proc print data= missing;
run;

Problem 8.9 #14


*Purpose: To generate a list of square values under the value of 100
Problem 8.9 #14
Programmer: Ben Van Neste
Date Written: April 26th, 2021;

libname learn 'c:sue.mcdaniel';


data SquareVals;
do num = 1 to 20 until (sqr ge 100);
sqr = num * num;
output; end;
run;

title "Square Values under 100";


proc print data = SquareVals;
run;
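
The DO UNTIL loop above tests its condition at the bottom of each iteration, so the row with sqr = 100 is still written. If the listing should stop strictly below 100, as the title suggests, one possible sketch uses DO WHILE, which tests the condition before each iteration. The data set name SquaresBelow100 is a hypothetical name chosen for this example.

```sas
data SquaresBelow100;
num = 1;
/* DO WHILE checks the condition before each pass,
   so a square of 100 or more is never output */
do while (num * num lt 100);
sqr = num * num;
output;
num = num + 1;
end;
run;

title "Square Values strictly under 100";
proc print data = SquaresBelow100 noobs;
run;
```

Because the step contains an explicit OUTPUT statement, SAS does not write an extra observation when the loop ends.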

Problem 4.10 #3:


*Problem: 4.10 #3
Purpose: To create a permanent data set and compute the mean age
Programmer: Ben Van Neste
Date Written: April 12th, 2021;
libname learn 'c:sue.mcdaniel';
data Survey2007 ;
input Age Gender $ (Ques1-Ques5)($1.);
/* See Chapter 21, Section 14 for a discussion
of variable lists and format lists used above */
datalines;
23 M 15243
30 F 11123
42 M 23555
48 F 55541
55 F 42232
62 F 33333
68 M 44122
;
* libname learn 'c:sue.mcdaniel';
proc means data= Survey2007;
var Age;
run;

Problem 5.9 #2:

*Problem: 5.9 #2
Purpose: To see the frequencies of each question and also change to
just three categories
Programmer: Ben Van Neste
Date Written: April 12th, 2021;

data voter;
input Age Party : $1. (Ques1-Ques4)($1. + 1);
datalines;
23 D 1 1 2 2
45 R 5 5 4 1
67 D 2 4 3 3
39 R 4 4 4 4
19 D 2 1 2 1
75 D 3 3 2 3
57 R 4 3 4 4
;

proc format;
value age low-30 = '0-30'
31-50 = '31-50'
51-70 = '51-70'
71-high = '71+';
value $party 'D' = 'Democrat'
'R' = 'Republican';
value $Ques '1','2' = "Generally Disagree"
'3' = "No opinion"
'4','5' = "Generally Agree";
run;

title "Voter Questions";
proc print data=voter label;
label Ques1 = "The president is doing a good job"
Ques2 = "Congress is doing a good job"
Ques3 = "Taxes are too high"
Ques4 = "Government should cut spending";
format Age age.
Party $Party.
Ques1-Ques4 $Ques.;
run;

title "Question Frequency";


proc freq data = voter;
tables Ques1 Ques2 Ques3 Ques4;
label Ques1 = "The president is doing a good job"
Ques2 = "Congress is doing a good job"
Ques3 = "Taxes are too high"
Ques4 = "Government should cut spending";
format Ques1-Ques4 $Ques.;
run;

/* Note on Problem 5.9 #2
Purpose: To change the categories into just three: Generally
Disagree, No Opinion, and Generally Agree.
The $Ques format used above replaces the original five-category format:
value $Ques 1 = "Strongly disagree"
2 = "Disagree"
3 = "No opinion"
4 = "Agree"
5 = "Strongly agree";
Programmer: Ben Van Neste
Date Written: April 12th, 2021 */

Problem 5.9 #3:


*Problem: 5.9 #3
Purpose: To create "Colors" data set, format into groups, and list
frequencies
Programmer: Ben Van Neste
Date Written: April 12th, 2021;

data colors;
input Color : $1. @@;
datalines;
R R B G Y Y . . B G R B G Y P O O V V B
;

proc format;
value $Color 'R','B','G' = 'Group 1'
'Y','O' = 'Group 2'
' ' = 'Not Given'
other = 'Group 3';
run;

proc freq data = colors;


format color $color.;
Tables Color;
run;
Problem 6.7 #2:

*Purpose: The program converts weights from pounds to kilograms (WtKg)


Programmer: Ben Van Neste
Date Written: March 29, 2021;

Data InputOutput;
Infile "/home/u58352038/my_shared_file_links/sue.mcdaniel/Wks 1_2/mydata1.txt";
input Gender $ Age Height Weight;
*Compute the weight from pounds to kilos (WtKg);
WtKg = (Weight / 2.2);
run;

Proc Print Data=InputOutput noobs;


run;

b)
*Purpose: The program converts heights from inches to centimeters
(HtCm)
Programmer: Ben Van Neste
Date Written: March 29, 2021;

Data InputOutput;
Infile "/home/u58352038/my_shared_file_links/sue.mcdaniel/Wks 1_2/mydata1.txt";
input Gender $ Age Height Weight;
*Compute the height from inches to centimeters (HtCm);
HtCm = (Height * 2.54);
run;

Proc Print Data=InputOutput noobs;


run;

c)
I looked for a way to calculate blood pressure from weight, height, and/or age but
could not find one. Not knowing how else to approach this problem, I included
code for the case where diastolic and systolic blood pressure are in this data set.
*Purpose: The program converts weights from pounds to kilograms (WtKg)
Programmer: Ben Van Neste
Date Written: March 29, 2021;

Data InputOutput;
Infile "/home/u58352038/my_shared_file_links/sue.mcdaniel/Wks 1_2/mydata1.txt";
input Gender $ Age Height Weight;
*Compute the weight from pounds to kilos (WtKg);
WtKg = (Weight / 2.2);
run;

Proc Print Data=InputOutput noobs;


run;
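
The note for part c) refers to the case where systolic and diastolic blood pressure are available, which the INPUT statement above does not read. A minimal sketch of that case follows, assuming two hypothetical variables, SBP and DBP, and made-up inline values used purely for illustration. The derived quantities (pulse pressure, and mean arterial pressure by the common approximation DBP + (SBP - DBP)/3) are illustrative choices and are not taken from the original assignment.

```sas
/* Hypothetical example: SBP, DBP and the data values are assumptions */
data BloodPressure;
input Subject $ SBP DBP;
/* Pulse pressure: difference between systolic and diastolic */
PulsePressure = SBP - DBP;
/* Mean arterial pressure, common approximation */
MAP = DBP + (PulsePressure / 3);
datalines;
A 120 80
B 135 85
C 110 70
;

title "Blood Pressure (illustrative)";
proc print data=BloodPressure noobs;
run;
```

If the real file carried these two columns, the same assignment statements could be placed in the InputOutput data step after extending its INPUT statement.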

d)
*Purpose: The program calculates heights into polynomials
(HtPolynomial)
Programmer: Ben Van Neste
Date Written: March 29, 2021;

Data InputOutput;
Infile "/home/u58352038/my_shared_file_links/sue.mcdaniel/Wks 1_2/mydata1.txt";
input Gender $ Age Height Weight;
*Compute the height with polynomials (HtPolynomial);
HtPolynomial = ((Height**2)*2)+((Height**3)*1.5);
run;

Proc Print Data=InputOutput noobs;


run;

C)
*Purpose: To create a temporary SAS data set from the political.csv
file.
Age is numerical, State and Party are character variables
Programmer: Ben Van Neste
Date Written: March 29, 2021;

data InputOutput;
Infile "/home/u58352038/my_shared_file_links/sue.mcdaniel/Wks 1_2/political.csv" dsd;
input State $ Party $ Age;
run;

title "Vote";
proc print data=InputOutput noobs;
run;
D)
*Purpose: To create a temporary SAS data set from the political.csv
file.
Age is numerical, State and Party are character variables
Programmer: Ben Van Neste
Date Written: March 29, 2021;

*FILENAME - nickname for filepath to VoteData;


filename VoteData
"/home/u58352038/my_shared_file_links/sue.mcdaniel/Wks 1_2/political.csv";
data InputOutput;
Infile VoteData dsd;
input State $ Party $ Age;
run;

title "Vote";
proc print data=InputOutput noobs;
run;

E)
*Purpose: To create a SAS data set named Bank from the bankdata.txt
file.
Using column input for specification
Including a computed interest variable (Balance * Rate)
Programmer: Ben Van Neste
Date Written: March 29, 2021;
*FILENAME - nickname for filepath to BankData;
filename BankData
"/home/u58352038/my_shared_file_links/sue.mcdaniel/Wks 1_2/bankdata1.txt";
data Bank;
Infile BankData;
input Name $ 1-15
Account $ 16-20
Balance 21-26
Rate 27-30;
*Included variable to compute Interest;
Interest = Balance*Rate;
run;

title "Bank";
proc print data=Bank noobs;
run;
