Lab4Instructions Knitr
2024-09-17
Begin by setting a working directory. Remember that you can also set this using the menu in RStudio:
Session > Set Working Directory > Choose Directory...
Note that your files will not be visible in the dialog. Navigate to the folder that contains your file and hit ‘Open’.
RStudio will then post something similar to the line below in your console. NOTE: Do not run the following line in
your console. The directories on your PC are different, so R will output an error.
##Not run
setwd("~/Library/CloudStorage/OneDrive-DePaulUniversity/DePaul/Teaching/2025WQ/BIO206/Labs/Day_5")
list.files()
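To confirm that the working directory was set as intended, a quick check along these lines can help. This is a minimal sketch: the file name is the one read in later in this lab, and file.exists() simply returns FALSE if that file is not in the current folder.

```r
# Show the folder R is currently working in.
getwd()

# List only the CSV files in that folder.
list.files(pattern = "\\.csv$")

# TRUE if the lab's diet file is in the working directory, FALSE otherwise.
file.exists("CanidsData_DietPart.csv")
```

If the last line returns FALSE, the working directory is probably not the folder that holds the lab files.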
The data is in CSV (comma-separated values) format. This format cannot store plots, so Excel will only
save the text data.
Read in our data and name it something meaningful.
CanidDiet<-read.csv("CanidsData_DietPart.csv")
CanidForce<-read.csv("CanidsData_MassForcePart.csv")
You can view your data by clicking on it in the ‘Environment’ panel, in the top-right of the RStudio window.
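The later code in this lab refers to a single data frame called CanidData, while the lines above read two separate files. One possible way to combine two such tables is merge(). The sketch below uses made-up stand-in data, and the key column name Species is an assumption for illustration, not something taken from the lab files.

```r
# Toy stand-ins for the two CSV files; every value here is made up.
diet_part <- data.frame(
  Species = c("sp_A", "sp_B", "sp_C"),  # hypothetical key column
  Diet    = c("Carnivore", "Omnivore", "Carnivore")
)
force_part <- data.frame(
  Species    = c("sp_A", "sp_B", "sp_C"),
  Mass_KG    = c(30, 6, 12),
  BiteForceN = c(150, 95, 120)
)

# Join the rows that share the same Species value.
toy_canids <- merge(diet_part, force_part, by = "Species")

# Inspect the result from the console instead of the Environment panel.
str(toy_canids)
head(toy_canids)
```

str() lists each column with its type, and head() prints the first few rows.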
Now, let’s make a histogram.
As we saw in the other labs, the dollar sign allows us to access a variable directly.
As you start to type out a variable name, RStudio will offer options that you can click as a shortcut.
You can hit the ‘tab’ key to autocomplete what RStudio believes should be entered.
We add two other arguments, separated by commas: xlab and main.
xlab lets us change the x-axis label.
main is the title. I set it to NULL, which removes the title.
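Before the lab's own call, here is a small sketch on made-up data showing that the dollar sign and column indexing return the same values, and how xlab and main are passed to hist(). The data frame toy and its values are assumptions for illustration only.

```r
# Made-up data frame for illustration.
toy <- data.frame(
  Diet    = c("A", "B", "A"),
  Mass_KG = c(5, 12, 30)
)

by_name  <- toy$Mass_KG  # access the column by name
by_index <- toy[, 2]     # access the same column by position

identical(by_name, by_index)  # TRUE: both are the same numeric vector

# xlab sets the x-axis label; main = NULL removes the title.
hist(toy$Mass_KG, xlab = "Mass (KG)", main = NULL)
```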
par(mfrow = c(1,2))
hist(CanidData$Mass_KG, xlab="Mass (KG)", main = NULL)
hist(CanidData[,3]) #You can also use indexing to access a variable.
[Figure: two histograms side by side. Left: x-axis "Mass (KG)", no title. Right: titled "Histogram of CanidData[, 3]". Both y-axes show Frequency.]
If you need to change your plotting window back to showing a single chart, use par and mfrow again.
par(mfrow = c(1,1))
This tells the plotting window to place only a single plot as you ask for 1 row and 1 column. Before we asked
for 1 row and two columns.
Note that the distribution above is not normal: it is right-skewed.
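A rough numeric check for right skew is to compare the mean with the median, since a long right tail pulls the mean upward. The sketch below uses a made-up vector. It is only an informal indication, not a formal test of normality.

```r
# Made-up values with one large observation in the right tail.
toy_mass <- c(2, 3, 3, 4, 5, 6, 25)

mean(toy_mass)    # pulled upward by the value 25
median(toy_mass)  # 4

# TRUE here, which is consistent with a right-skewed distribution.
mean(toy_mass) > median(toy_mass)
```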
Now, let’s subset the data
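The histogram call that follows uses a data frame named Omn, but the line that creates it is not shown here. As a minimal sketch of how such a subset could be built, the example below filters a made-up data frame by diet. The level name "Omnivore" is an assumption about how the Diet column is coded.

```r
# Made-up data frame for illustration.
toy <- data.frame(
  Diet    = c("Carnivore", "Omnivore", "Omnivore", "Carnivore"),
  Mass_KG = c(30, 6, 9, 12)
)

# Keep the rows where Diet equals "Omnivore".
# Leaving the column position blank keeps every column.
toy_omn <- toy[toy$Diet == "Omnivore", ]

# subset() selects the same rows with less typing.
toy_omn2 <- subset(toy, Diet == "Omnivore")

nrow(toy_omn)  # 2
```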
hist(Omn$Mass_KG, xlab="Omnivore Mass (KG)", main = NULL)
[Figure: histogram of omnivore mass. x-axis "Omnivore Mass (KG)", y-axis Frequency.]
We can measure central tendency and spread for the whole dataset, or separately for each level of a categorical
variable, in this case diet.
## [1] 118.9944
## [1] 115.1652
## [1] 27.84065
aggregate(cbind(Mass_KG, BiteForceN) ~ Diet, data = CanidData, FUN = sd)
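The whole-dataset and by-group summaries described above can be sketched on a small made-up data frame. The values are invented for illustration, and swapping FUN for median or sd gives the other summaries.

```r
# Made-up data frame for illustration.
toy <- data.frame(
  Diet       = c("Carnivore", "Carnivore", "Omnivore", "Omnivore"),
  Mass_KG    = c(10, 30, 6, 10),
  BiteForceN = c(100, 140, 90, 110)
)

# Whole-dataset summaries of one variable.
mean(toy$Mass_KG)    # 14
median(toy$Mass_KG)  # 10
sd(toy$Mass_KG)

# Mean of both variables within each diet group.
toy_means <- aggregate(cbind(Mass_KG, BiteForceN) ~ Diet, data = toy, FUN = mean)
toy_means
```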
Boxplots are a great way to illustrate a continuous variable grouped by a discrete variable.
The general format is as follows:
boxplot(continuous~categorical)
boxplot(Dependent~Independent)
You pass the function your whole data frame (data = CanidData), so you do not need to use the $ here.
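To see what boxplot() computes from a continuous ~ categorical formula, you can ask it to return the statistics without drawing anything. This is a minimal sketch on made-up data, and the data frame toy is an assumption for illustration.

```r
# Made-up data frame for illustration.
toy <- data.frame(
  Diet    = c("Carnivore", "Carnivore", "Carnivore",
              "Omnivore", "Omnivore", "Omnivore"),
  Mass_KG = c(10, 20, 30, 4, 6, 8)
)

# plot = FALSE returns the summary statistics instead of drawing the boxes.
bp <- boxplot(Mass_KG ~ Diet, data = toy, plot = FALSE)

bp$names  # group labels, in plotting order
bp$stats  # one column per group: lower whisker, lower hinge, median,
          # upper hinge, upper whisker
```

The third row of bp$stats holds the medians, which are the thick lines inside the boxes.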
par(pty='s',mfrow=c(1,2))
boxplot(Mass_KG~Diet, data = CanidData, xlab = "Diet", ylab = "Mass (KG)")
boxplot(BiteForceN~Diet, data = CanidData, xlab = "Diet", ylab = "Bite Force (N)")
[Figure: two boxplots side by side, grouped by Diet. Left: y-axis "Mass (KG)". Right: y-axis "Bite Force (N)".]
Finally, we can use R to calculate a z-score and then add the data back into our data frame. We create a new
column for both mass and bite force.
The general formula for a z-score is: (value - mean) / standard deviation.
Then create the pair of box plots again from the z-scores. Although the two variables started on different
scales, the box plots are now more comparable.
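As a small worked example of the z-score formula, the sketch below standardizes a made-up vector by hand and compares the result with R's built-in scale() function. The vector x is an assumption for illustration.

```r
# Made-up values: mean is 4 and standard deviation is 2.
x <- c(2, 4, 6)

# (value - mean) / standard deviation, applied to every value at once.
z_manual <- (x - mean(x)) / sd(x)
z_manual  # -1 0 1

# scale() centers and scales by default; as.numeric() drops the matrix attributes.
z_scale <- as.numeric(scale(x))

all.equal(z_manual, z_scale)  # TRUE
```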
CanidData$Mass_KG_Z <- (CanidData$Mass_KG-mean(CanidData$Mass_KG))/
sd(CanidData$Mass_KG)
par(pty='s',mfrow=c(1,2))
boxplot(Mass_KG_Z~Diet, data = CanidData, xlab = "Diet", ylab = "Mass Z-Score")
[Figure: two boxplots side by side, grouped by Diet, on z-score scales. The recovered y-axis label is "Mass Z-Score".]
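The text mentions a z-score column for both mass and bite force, but only the mass column is shown in the code above. A minimal sketch of the full pair, using a made-up data frame, might look like this. The column name BiteForceN_Z and the label "Bite Force Z-Score" are assumptions, chosen to mirror the mass version.

```r
# Made-up data frame for illustration.
toy <- data.frame(
  Diet       = c("Carnivore", "Carnivore", "Omnivore", "Omnivore"),
  Mass_KG    = c(10, 30, 6, 10),
  BiteForceN = c(100, 140, 90, 110)
)

# Add one z-score column per variable.
toy$Mass_KG_Z    <- (toy$Mass_KG - mean(toy$Mass_KG)) / sd(toy$Mass_KG)
toy$BiteForceN_Z <- (toy$BiteForceN - mean(toy$BiteForceN)) / sd(toy$BiteForceN)

# Draw the two z-score boxplots side by side, then return to a single panel.
par(pty = "s", mfrow = c(1, 2))
boxplot(Mass_KG_Z ~ Diet, data = toy, xlab = "Diet", ylab = "Mass Z-Score")
boxplot(BiteForceN_Z ~ Diet, data = toy, xlab = "Diet", ylab = "Bite Force Z-Score")
par(mfrow = c(1, 1))
```

Each z-score column has a mean of 0 and a standard deviation of 1, which is why the two panels share a similar scale.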