Creating a Project in RStudio: Steps

This document provides instructions for creating an R project in RStudio and analyzing the diamonds dataset within that project. The key steps are: 1. Create a new RStudio project directory called "Diamonds" and add subfolders for data, code, documents, output and plots. 2. Copy the diamonds dataset into the data folder and create an R script in the code folder to read in the data, subset it, summarize it and produce plots which are saved to the plots folder. 3. Analyze the diamonds data by calculating summary statistics like the mean, standard deviation, and number of observations for variables like price and carat.

Uploaded by

Milind Joshi
Copyright © All Rights Reserved
 CREATING A PROJECT IN RSTUDIO

This exercise requires that you create and use a project in RStudio.

Steps

1. Download the data set Diamonds.csv from the Canvas module.


2. Open up RStudio.
3. Click on File > New Project
a. The following dialog appears:

b. Select New Directory


c. Select New Project
d. In the following dialog, enter Diamonds in the Directory Name box and then
Browse to the Subdirectory where you want to create the project. Leave the other
checkboxes blank.
e. Click Create Project.


f. You should see the following in the Files pane in RStudio:

g. Click on New Folder and create the following five folders in the Files pane:
• Data
• Code
• Documents
• Output
• Plots
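
The same five folders can also be created from the R console instead of the New Folder button. This is a minimal sketch, assuming the working directory is the project root (check with getwd()); the variable name folders is illustrative and not part of the handout.

```r
# Names of the project subfolders listed above.
folders <- c("Data", "Code", "Documents", "Output", "Plots")

# Create each folder under the current working directory.
# dir.create() warns if a folder already exists, so create only
# the ones that are missing.
for (f in folders) {
  if (!dir.exists(f)) {
    dir.create(f)
  }
}

# Confirm that all five folders now exist.
print(dir.exists(folders))
```

After running this, the folders should appear in the Files pane once it refreshes.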

h. Copy the file Diamonds.csv into the Data folder of the project.
i. Open a new script file: File > New File > R Script.
j. Copy and paste the following code into the script pane of RStudio (usually the
pane in the upper left).
getwd() # Check the default directory.
diamonds <- read.csv("./Data/Diamonds.csv") # Read Diamonds.csv into diamonds.
head(diamonds,10) # Print out the first 10 rows.
tail(diamonds,10) # Print the last 10 rows
#
summary(diamonds) # Summarize the data in diamonds.
#
#### Subset the data to include only observations where carat <= 2.5.
diamonds <- diamonds[which(diamonds$carat <= 2.5),]
#
#
summary(diamonds) # Run the summary again.
#
hist(diamonds$carat) # Produce a histogram of the carat sizes.
# Save the plot to your project Plots folder.
dev.copy(jpeg,'./Plots/carat_hist.jpg', width = 800, height = 600)
dev.off() # Turn off the output to files.
#
table(diamonds$clarity) # Produce a table of the clarity values.
#
#### Install the ggplot2 package on your computer.
#### (You may be asked to select a repository; any repository will work,
#### but if you select one geographically close, things will go faster.)

# install.packages("ggplot2") # Note: the quote marks are necessary.


library(ggplot2) # Load ggplot2; note the quote marks are optional.
# Generate a scatterplot of price versus carat using ggplot.
# aes stands for "aesthetics"; geom_point() plots a point for each observation.
#
# The + must end a line, not start one; otherwise R treats the first line as complete.
ggplot(data = diamonds, aes(x = carat, y = price, color = clarity)) + geom_point() +
geom_smooth(se = FALSE)
# Save the plot to your project Plots folder.
dev.copy(jpeg,"./Plots/scatterplot.jpg", width = 800, height = 600)
dev.off() # Turn off the output to files.
#
#
ggplot(data = diamonds, aes(x = carat, y = price, color = clarity)) +
geom_smooth(se = FALSE)
# Save the plot to your project Plots folder.
dev.copy(jpeg,'./Plots/lines.jpg', width = 800, height = 600)
dev.off() # Turn off the output to files.
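
The subsetting line in the script uses which() around the condition. The sketch below illustrates why that matters, using a hypothetical miniature data frame (toy, with made-up values) rather than the real Diamonds.csv: which() and subset() both drop rows where carat is missing, while a bare logical index keeps an all-NA row in their place.

```r
# Hypothetical miniature data frame standing in for the diamonds data.
toy <- data.frame(
  carat = c(0.5, 3.0, NA, 2.5, 1.2),
  price = c(1500, 16000, 900, 12000, 5000)
)

# Approach used in the script: which() returns the positions where the
# condition is TRUE, so the row with a missing carat is dropped.
kept_which <- toy[which(toy$carat <= 2.5), ]

# subset() evaluates the condition inside the data frame and also drops
# rows where the condition is NA.
kept_subset <- subset(toy, carat <= 2.5)

# A bare logical index keeps an all-NA row for the missing carat instead.
kept_logical <- toy[toy$carat <= 2.5, ]

print(nrow(kept_which))   # rows kept by which()
print(nrow(kept_subset))  # rows kept by subset()
print(nrow(kept_logical)) # includes one all-NA row
```

If the real data has no missing carat values, all three approaches should select the same rows.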

k. Select the first few lines and click Run.


l. Fix any errors.
m. Run the rest of the code. Check for and fix any errors.
n. Determine the following statistics from the diamonds data frame:
a. Mean of price; use the following code: mean(diamonds$price).


b. Mean of carat.
c. Number of observations; use the code: nrow(diamonds).
d. Standard deviation of price; use the code: sd(diamonds$price).
e. Standard deviation of carat.
f. Record the statistics on paper and then run the quiz exercise which asks you
to enter each of the statistics.
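
As a rough illustration of how these statistics can be gathered in one pass, the sketch below computes them on a hypothetical four-row data frame (toy, with made-up values). In the exercise itself, the same calls would be applied to the diamonds data frame after the subsetting step.

```r
# Hypothetical values; in the exercise, use the diamonds data frame instead.
toy <- data.frame(
  carat = c(0.5, 1.0, 1.5, 2.0),
  price = c(1000, 3000, 5000, 7000)
)

# Number of observations (rows).
n_obs <- nrow(toy)

# Mean and standard deviation of each column of interest,
# returned as a matrix with one column per variable.
stats <- sapply(
  toy[, c("price", "carat")],
  function(x) c(mean = mean(x), sd = sd(x))
)

print(n_obs)
print(round(stats, 2))
```

Collecting the results in a single matrix makes it easier to copy the values down before entering them in the quiz.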