ACMT 311 Assignment

Uploaded by

shaenza402

ACMT 311

R Assignment #1 Due before the date of the final exam

Instructions:

• In RStudio, go to File and open a new R script.

• Save the file as A1_Lastname.R. If your last name is longer than 8 letters, you can stop at
8 characters. Be sure to put A1 underscore before your last name so I know that this is for
Assignment #1. For example, my file will be called A1_Onsoti.R.
• Always write your code in the Source panel (not the Console panel). View your results in
the Console panel and your graphs in the Plots panel. Note that different people may write
different code for the same question. One solution is not necessarily more correct than
another. One may be better written than another, but I will not mark it as
wrong as long as I am able to generate the needed result.
• If you make a mistake, correct your code in the Source panel so that the submission does
not contain code you don’t want me to grade. Basically, your submission should be a
clean, working file. Do not submit errors, practice attempts, etc.
• Before answering or writing the code for each question, start by commenting the question
number. For example, before answering question 1, write # Q1. Then go to the next line
and start answering the question. If an answer is a sentence, the first line will be #
Question number. The next line will be # Answer in sentence form. If the question calls
for a code, the first line will be # Question number. The next line will be the code.
• Upload your file in the google drive link I will provide. Be sure your file has the
extension .R or .r.
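As a rough illustration of the layout described above, a submission might be organized like the sketch below. The toy vector and the sample answer are placeholders made up for this example; they are not part of the assignment.

```r
# A1_Lastname.R  (hypothetical file name following the naming rule above)

# Q1
# Toy stand-in for a question that calls for code.
toy_values <- c(2, 4, 6, 8)
mean(toy_values)

# Q2
# Toy stand-in for a question answered in a sentence: the values are all even.
```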
Guidelines:

• Each question needs R code unless it is italicized, in which case you need to answer
the question in a complete sentence.
• I need to see your script and not the result. So do not cut and paste results. I will run each
script to see the result.
• Be sure your code works. Do NOT submit code that does not run.
Data Background
The Behavioral Risk Factor Surveillance System (BRFSS) is an annual telephone survey
of 350,000 people in the United States. As its name implies, the BRFSS is designed to identify
risk factors in the adult population and report emerging health trends. For example, respondents
are asked about their diet and weekly physical activity, their HIV/AIDS status, possible tobacco
use, and even their level of healthcare coverage. The BRFSS Web site
(http://www.cdc.gov/brfss) contains a complete description of the survey, including the research
questions that motivate the study and other interesting results.
We will focus on a random sample of 20,000 people from the BRFSS survey conducted
in 2000. While there are over 200 variables in this data set, we will work with a small subset.
Begin by loading the dataset into your R workspace by using this command:
source("http://www.openintro.org/stat/data/cdc.R"). Make sure the data, cdc, is showing in the
Environment panel.
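Once a data frame is loaded, a few base R functions show whether it arrived and what it contains. The sketch below uses a small made-up data frame called toy (an assumption for illustration, not the cdc data) so that it runs without an internet connection.

```r
# Hypothetical stand-in for a loaded data frame; the real data comes from
# the source() command in the assignment.
toy <- data.frame(
  height = c(70, 64, 60, 66),
  weight = c(175, 125, 105, 132),
  gender = factor(c("m", "f", "f", "f"))
)

exists("toy")   # TRUE once the object is in the workspace
dim(toy)        # number of rows (cases) and columns (variables)
names(toy)      # variable names
head(toy, 3)    # first 3 rows
str(toy)        # storage type of each variable
```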
Questions:
Q1. How many cases and variables does the data set, cdc, have?
Q2. What are the variables of the data set, cdc? (Your code should return: genhlth, exerany,
hlthplan, smoke100, height, weight, wtdesire, age, and gender. Each one of these
variables corresponds to a question that was asked in the survey. For example, for
genhlth, respondents were asked to evaluate their general health, responding either
excellent, very good, good, fair or poor. The exerany variable indicates whether the
respondent exercised in the past month (1) or did not (0). Likewise, hlthplan indicates
whether the respondent had some form of health coverage (1) or did not (0). The
smoke100 variable indicates whether the respondent had smoked at least 100 cigarettes in
their lifetime. The other variables record the respondent’s height in inches,
weight in pounds as well as their desired weight, wtdesire, age in years, and gender.)
Q3. Generate the first 10 lines of the dataset.
Q4. For each variable, identify whether it is numerical or categorical. Be sure to list the
variables. Use the same variable name as given in the dataset.
Q5. Create a numerical summary of the variable, weight, that shows the minimum, Q1,
median, mean, Q3, and maximum values.
Q6. Calculate the inter-quartile range for the variable, weight.
Q7. Make a histogram of the weight distribution. Include a title and label for both axes so that
the graphic is understandable to anyone.
Q8. From the histogram, what is the shape of the weight distribution? That is, state whether
the histogram is right-skewed, left-skewed, or symmetric, and whether it is unimodal,
bimodal, or multimodal.
Q9. Make a horizontal boxplot of the weight distribution. Include a title and label for the axes
so that the graphic is understandable to anyone.
Q10. Describe one feature you see in the histogram that is difficult to see in the boxplot.
Q11. Describe one feature you see in the boxplot that is difficult to see in the histogram.
Q12. Draw the boxplot for the variable, wtdesire (desired weight). Include a title and label for
the axes so that the graphic is understandable to anyone.
Q13. Why do you think there are 2 extreme values showing in the boxplot for the variable,
wtdesire?
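The numerical summaries and plots asked for above can be sketched with base R as follows. The vector toy_weight is made up for illustration, and the titles and labels are only one possible wording.

```r
# Toy weights in pounds (made up for illustration only).
toy_weight <- c(105, 125, 132, 150, 160, 175, 180, 194, 210, 300)

summary(toy_weight)   # minimum, Q1, median, mean, Q3, maximum
IQR(toy_weight)       # Q3 minus Q1

# Histogram with a title and both axes labeled.
hist(toy_weight,
     main = "Toy Weights",
     xlab = "Weight (pounds)",
     ylab = "Number of people")

# Horizontal boxplot of the same values.
boxplot(toy_weight,
        horizontal = TRUE,
        main = "Toy Weights",
        xlab = "Weight (pounds)")
```

A histogram shows the overall shape of the values, while a boxplot marks the quartiles and flags points beyond the whiskers, so the two are often read together.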
R Assignment #2
Using the data from assignment #1
Questions
Q1. Change all the entries in the variable, smoke100, from 0 to No and 1 to Yes. See Sec.
16.4 on how to do this.
Q2. Generate the last 8 lines of the dataset to check that all the changes took place.
Q3. Calculate how many “Yes” and how many “No” responses there were.
Q4. Calculate the relative frequency of the response distribution. To do this, take your code in
Q3 and divide it by the number of observations. Notice that R automatically divides all
entries.
Q5. Make a barplot of the responses by putting the table( ) command inside the barplot
function. Include a title and label for both axes so that the graphic is understandable to
the general audience.
Q6. Now split the smokers by gender. Do a cross-tab count by gender and smoker. (In your
code, put gender first, then smoker.)
Q7. Calculate the relative frequency of the response distribution.
Q8. Graph a side-by-side barplot with smoker on the horizontal axis and gender on the
vertical axis. Include a title, legend, and label for both axes so that the graphic is
understandable to the general audience. (Note: the legend might cover part of your
graph. Do not worry about it. If you are bothered by the coverage, a fast way is to
extend the vertical axis or you can research how to reposition the legend.)
Q9. By separating gender and smokers, what feature do you see in the relative frequency
calculation or barplot that you do not see when gender was not separated? Give at least
one feature.
Q10. Graph a mosaic plot using the function mosaicplot( ) by putting table( ) inside the
mosaicplot( ) function. Include a title to your plot. Be sure labels are clear to the general
audience. (Although not necessary, you may try adding colors to make the plot more
visually pleasing.)
Q11. Name one feature that you see in the side-by-side barplot that is easier to see than in the
mosaic plot.
Q12. Name one feature that you see in the mosaic plot that is easier to see than in the side-by-
side plot.
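The recoding, counting, and plotting steps above can be sketched on a small made-up data frame. The name toy and its values are assumptions for illustration, and factor() is only one way to relabel 0/1 values; Sec. 16.4 of the R Guide may describe a different approach.

```r
# Toy data frame (made up); smoke100 is coded 0/1 as in the assignment.
toy <- data.frame(
  smoke100 = c(0, 1, 1, 0, 1, 0),
  gender   = c("m", "f", "m", "f", "m", "f")
)

# Relabel 0 as "No" and 1 as "Yes".
toy$smoke100 <- factor(toy$smoke100, levels = c(0, 1), labels = c("No", "Yes"))
tail(toy, 3)                       # check the last rows after recoding

counts <- table(toy$smoke100)      # counts per response
counts / nrow(toy)                 # relative frequencies (every entry is divided)

both <- table(toy$gender, toy$smoke100)   # rows: gender, columns: smoker
prop.table(both)                          # each cell divided by the grand total

# Side-by-side barplot; legend.text = TRUE labels the bars by row (gender).
barplot(both, beside = TRUE, legend.text = TRUE,
        main = "Toy Smoking Status by Gender",
        xlab = "Smoked at least 100 cigarettes", ylab = "Count")

# Mosaic plot of the same table.
mosaicplot(both, main = "Toy Smoking Status by Gender",
           xlab = "Gender", ylab = "Smoked at least 100 cigarettes",
           color = TRUE)
```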
Assignment #3
Background
For this assignment, we are going to see how the Central Limit Theorem works.
We will consider the real estate data from the city of Ames, Iowa. The details of every real
estate transaction in Ames are recorded by the City Assessor’s office. Our particular focus
will be all residential home sales in Ames between 2006 and 2010. This collection
represents our population of interest.
Download the real estate data from Ames, Iowa by entering the following code:

• download.file("http://www.openintro.org/stat/data/ames.RData", destfile =
"ames.RData")
• load("ames.RData")
(There are lots of quotation marks here. If the code does not work, you may have to play around
with the quotation marks.)
This data set has 82 variables. We are only going to be focusing on one variable, the Sale Price
of homes in Ames, Iowa.
Questions:
Q1. Rename the variable SalePrice as price.
Q2. Draw a histogram of price. Label the axes and give a title to your histogram so that it is
understandable to a general audience.
Q3. From the histogram, what kind of distribution is price?
Q4. Set the seed first. Then take any sample size you want from price by using the code:
sample(price, size), although I suggest using a small sample size for easy viewing. Call
that sample_size_size. For example, if your chosen sample size is 25, then your vector
will be called sample_size_25. (This part is optional: You may want to call out
sample_size_size to view what samples R generates. Try the code sample( ) several times
using different seeds or without seeding to see the samples vary.)
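Renaming a column and drawing a seeded sample might look like the sketch below. The data frame toy_ames and its prices are made up, and renaming through names() is just one option; pulling the column into a separate vector with another name would also work.

```r
# Toy stand-in for the ames data frame (prices are made up).
toy_ames <- data.frame(
  SalePrice = c(105000, 129900, 172000, 189900, 244000, 538000)
)

# Rename the column by matching its current name.
names(toy_ames)[names(toy_ames) == "SalePrice"] <- "price"
price <- toy_ames$price

set.seed(311)                       # any integer; fixes the random draw
sample_size_3 <- sample(price, 3)   # 3 values drawn without replacement
sample_size_3

set.seed(311)                       # same seed again
identical(sample(price, 3), sample_size_3)   # the draw repeats exactly
```

Resetting the seed before each call reproduces the same sample; without a seed, each call to sample( ) generally returns different values.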
We will now take different sample sizes of price and calculate the sample means. For
uniformity, call the variable sample_means_size. You can follow the directions below or
refer to Chapter 21 (Samples and Distributions) of the R Guide. Any word in red and
italicized means you enter your own value. Any word in green is the code function name.
Q5. Use sample size = 5. Begin each code as follows: # 5a, # 5b, # 5c…
a. Do each of the following in order.
• set.seed(any integer)
• sample_means_size <- rep(NA, number of repetitions) # I suggest 1000 or more repetitions.
# Play around with the number of repetitions.
• for (i in 1:number of repetitions) {
sample_means_size[i] <- mean(sample(price, size))
}
# This loop takes the mean of each sample and puts it in the ith entry of the
# vector, sample_means_size, each time.
b. Do a histogram of sample_means_size. Be sure to label the axes. Add the title:
Sample Size of size.
c. What kind of distribution is sample_means_size?
d. Calculate the mean of sample_means_size.
e. Calculate the standard deviation of sample_means_size.
f. Calculate the standard deviation of price divided by the square root of the sample
size.
Q6. Repeat all of a – f in #5 using size = 10. Start your code with # 6a, # 6b, # 6c …
Q7. Repeat all of a – f in #5 using size = 30. Start your code with # 7a, # 7b, # 7c …
Q8. Repeat all of a – f in #5 using size = 50. Start your code with # 8a, # 8b, # 8c …
Q9. Calculate the mean of price.
Q10. As the sample size increases from 5 to 50, how does the mean of sample_means_size
compare with the mean of price?
Q11. As the sample size increases from 5 to 50, are your answers to #5 – 8, (e) and (f), getting
closer?
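The simulation described in Q5 can be sketched as below for a sample size of 5. To keep the example runnable without the download, it uses a made-up right-skewed population, toy_price, generated with rexp( ); that population, the seeds, and the number of repetitions are all assumptions for illustration.

```r
# Toy right-skewed population (made up; stands in for price).
set.seed(1)
toy_price <- rexp(2000, rate = 1 / 180000)

size <- 5       # sample size
reps <- 1000    # number of repetitions

set.seed(2)
sample_means_5 <- rep(NA, reps)       # empty vector to hold the means
for (i in 1:reps) {
  # Draw one sample of the chosen size and store its mean in entry i.
  sample_means_5[i] <- mean(sample(toy_price, size))
}

hist(sample_means_5,
     main = "Sample Size of 5",
     xlab = "Sample mean of toy price",
     ylab = "Frequency")

mean(sample_means_5)          # compare with mean(toy_price)
sd(sample_means_5)            # spread of the sample means
sd(toy_price) / sqrt(size)    # population sd divided by sqrt(sample size)
```

Changing size to 10, 30, or 50 and rerunning shows how the histogram and the two standard deviation values change with the sample size.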
Assignment #4
Background
This is a made-up dataset, but we will assume the data was randomly collected. We want to see what
happens when a two-sample means hypothesis test is performed on a matched-pair dataset.
Questions:
Q1. Upload the dataset called “assignment4” into RStudio.
Situation 1 – Analyze the data as a two-sample means problem
Q2. Draw a side-by-side boxplot. There is no need for title or axes labels.
Q3. Draw the quantile plots for each variable. Include a line for each plot. Put the title “Group
1” and “Group 2” in the appropriate quantile plot.
Q4. From the boxplots and quantile plots, would you say the distributions of the variables are
approximately symmetric, right-skewed, left-skewed or none of the given?
Q5. Write the null and alternative hypotheses in symbols, if a two-sample means test is
performed. Identify the symbols used.
Q6. Perform a two-sample hypothesis test to determine if there is any difference between the
population means.

Q7. Using the significance level, 𝛼 = 0.10, write a conclusion for your hypothesis test in the
context of the given situation.
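The plots and the two-sample test in Situation 1 can be sketched with base R as below. The data frame toy4 is made up for illustration (it is not the assignment4 file), and t.test( ) is used with its defaults, which means a two-sided Welch test that does not assume equal variances.

```r
# Toy data (made up): Group2 is roughly Group1 plus a small shift.
toy4 <- data.frame(
  Group1 = c(12.1, 15.3, 9.8, 20.4, 17.6, 11.2, 14.9, 18.3),
  Group2 = c(12.9, 15.8, 10.9, 21.0, 18.5, 11.6, 15.9, 19.2)
)

# Side-by-side boxplot of the two columns.
boxplot(toy4$Group1, toy4$Group2, names = c("Group 1", "Group 2"))

# Normal quantile plots, each with a reference line.
par(mfrow = c(1, 2))      # two plots in one row
qqnorm(toy4$Group1, main = "Group 1"); qqline(toy4$Group1)
qqnorm(toy4$Group2, main = "Group 2"); qqline(toy4$Group2)
par(mfrow = c(1, 1))      # back to one plot per page

# Two-sample t-test treating the columns as independent groups.
two_sample <- t.test(toy4$Group1, toy4$Group2)
two_sample$p.value        # compare with the significance level
```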
Situation 2 - Analyze the data as a matched pair
Q8. Add a column of differences between Group1 and Group2 to the data frame. One way to
add a new column to the existing data frame is:
data_frame$new_column_name <- data_frame$Group1 - data_frame$Group2.
View your dataset to make sure the column is added correctly.
Q9. Draw a boxplot for the “differences” variable.
Q10. Draw a quantile plot of the “differences” variable and include a line.
Q11. From the boxplot and quantile plot, would you say the distribution of the “differences”
variable is approximately symmetric, right-skewed, left-skewed or none of the given?
Q12. Perform a one-sample hypothesis test to determine if there are any differences in the
population means.

Q13. Using the significance level, 𝛼 = 0.10, write a conclusion for your hypothesis test in the
context of the given situation.
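The matched-pair analysis in Situation 2 can be sketched as below. The data frame toy4 and the column name differences are made up for illustration. The sketch also runs t.test( ) with paired = TRUE on the original columns to show that it gives the same p-value as the one-sample test on the differences.

```r
# Toy data (made up): Group2 is roughly Group1 plus a small shift.
toy4 <- data.frame(
  Group1 = c(12.1, 15.3, 9.8, 20.4, 17.6, 11.2, 14.9, 18.3),
  Group2 = c(12.9, 15.8, 10.9, 21.0, 18.5, 11.6, 15.9, 19.2)
)

# Add a column of pairwise differences and check it.
toy4$differences <- toy4$Group1 - toy4$Group2
head(toy4)

boxplot(toy4$differences, horizontal = TRUE)
qqnorm(toy4$differences); qqline(toy4$differences)

# One-sample t-test on the differences with a null value of 0.
one_sample <- t.test(toy4$differences, mu = 0)

# Equivalent paired test on the original columns.
paired <- t.test(toy4$Group1, toy4$Group2, paired = TRUE)

one_sample$p.value
all.equal(one_sample$p.value, paired$p.value)
```

With these toy values the pairwise differences vary far less than the groups themselves, which is why a paired analysis can detect a shift that a two-sample analysis of the same numbers would miss.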
