Lab4Instructions Knitr

Lab #4 focuses on data analysis using R, specifically working with canid dietary data. It covers setting a working directory, reading CSV files, creating histograms, calculating central tendency measures, and using boxplots to visualize data grouped by diet. Additionally, it demonstrates how to calculate Z-scores and add them to the dataset for further analysis.

Uploaded by

Jai Calatrava

Lab #4

2024-09-17

Begin by setting a working directory. Remember that you can also set this using the menu in R-Studio:
Session > Set Working Directory > Choose Directory...
Note that your file will not be visible in the dialog. Navigate to the folder it resides within and hit ‘Open’.
R-Studio will then post something similar to the line below in your console. NOTE: Do not run the following line in
your console. The directories on your PC are completely different, so R will output an error.

##Not run
setwd("~/Library/CloudStorage/OneDrive-DePaulUniversity/DePaul/Teaching/2025WQ/BIO206/Labs/Day_5")

Is my file in the directory I just selected?

list.files()

##  [1] "CanidsData_DietPart.csv"      "CanidsData_MassForcePart.csv"
##  [3] "Day_5_Script.R"               "FULLDataCanids.csv"
##  [5] "Lab4Instructions_Knitr.pdf"   "Lab4Instructions_Knitr.Rmd"
##  [7] "LabWorksheet_4_Complete.docx" "LabWorksheet_4_Complete.pdf"
##  [9] "LabWorksheet_4.docx"          "Worksheet_Boxplot.pdf"
## [11] "Worksheet_BoxplotZscore.pdf"  "Worksheet_Hist.pdf"
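
Reading the listing by eye works, but you can also ask R directly whether one specific file is present. The sketch below is only an illustration: it builds a temporary folder and a placeholder file (both hypothetical, so nothing in your lab folder is touched). In the lab you would simply call file.exists() with your own file name after setting the working directory.

```r
# Hypothetical demo folder; in the lab this would be your working directory.
demo_dir <- file.path(tempdir(), "lab4_demo")
dir.create(demo_dir, showWarnings = FALSE)

# Create a tiny placeholder CSV so that there is something to find.
demo_file <- file.path(demo_dir, "CanidsData_DietPart.csv")
writeLines(c("SpeciesID,Diet", "1,Carnivore"), demo_file)

# list.files() returns the file names in a folder as a character vector.
files_found <- list.files(demo_dir)

# file.exists() answers TRUE or FALSE for one specific path.
has_diet_file <- file.exists(demo_file)
has_missing_file <- file.exists(file.path(demo_dir, "NotThere.csv"))
```

If file.exists() returns FALSE for a file you expect, the working directory is probably not the folder you think it is.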

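The data are in CSV (comma-separated values) format. A CSV file holds only plain text, so Excel cannot save
any plots in this format. It will only save the text data.
Read in our data and name it something meaningful. The data come in two parts, which we merge on their shared SpeciesID column.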

CanidDiet<-read.csv("CanidsData_DietPart.csv")
CanidForce<-read.csv("CanidsData_MassForcePart.csv")

CanidData<-merge(CanidDiet, CanidForce, by = "SpeciesID")
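
To see what merge is doing, it can help to try it on two very small tables. The sketch below uses invented values (not the lab data) and assumes only that both tables share a SpeciesID column, as in the lab files.

```r
# Toy tables with invented values (not the lab data).
diet <- data.frame(SpeciesID = c(1, 2, 3),
                   Diet = c("Carnivore", "Omnivore", "Carnivore"))
force <- data.frame(SpeciesID = c(3, 1, 2),
                    Mass_KG = c(150, 120, 95))

# merge() matches rows on the shared key column, whatever order they are in.
merged <- merge(diet, force, by = "SpeciesID")

# By default only keys present in both tables are kept.
print(merged)
```

Note that the rows of the second table were deliberately shuffled. The merge still pairs each mass with the right species because it matches on SpeciesID rather than on row position.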

You can view your data by clicking on it in the ‘Environment’ panel, in the top-right of the R-Studio window.
Now, let’s make a histogram.
As we saw in the other labs, the dollar sign allows us to access a variable directly.
As you start to type out the variable name, R-Studio will offer options that you can click as a shortcut.
You can also hit the ‘tab’ key to autocomplete what R-Studio believes should be entered.
We add two other arguments, separated by commas: xlab and main.
xlab lets us change the x-axis label.
main is the title. I set it to NULL, which removes the title.

par(mfrow = c(1,2))
hist(CanidData$Mass_KG, xlab="Mass (KG)", main = NULL)
hist(CanidData[,3]) #You can also use indexing to access a variable.

[Figure: two histograms of Frequency against mass. Left panel: no title, x-axis "Mass (KG)". Right panel: title "Histogram of CanidData[, 3]", x-axis "CanidData[, 3]".]
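
The two histograms look the same because the dollar sign and the index point at the same column. A minimal sketch with an invented data frame (the column order here is an assumption chosen to mirror the lab, where mass is the third column):

```r
# Toy data frame with invented values; Mass_KG is the third column.
toy <- data.frame(SpeciesID = 1:3,
                  Diet = c("Carnivore", "Omnivore", "Carnivore"),
                  Mass_KG = c(120, 95, 150))

by_dollar <- toy$Mass_KG        # access by name with the dollar sign
by_position <- toy[, 3]         # access by column position
by_name <- toy[, "Mass_KG"]     # access by column name inside the brackets

# All three return the same numeric vector.
same <- identical(by_dollar, by_position) && identical(by_dollar, by_name)
```

Accessing by name is usually the safer habit: if the columns are ever reordered, position 3 may silently become a different variable.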

If you need to change your plotting window back to showing a single chart, use par and mfrow again:
par(mfrow = c(1,1))
This tells the plotting window to place only a single plot, as you ask for 1 row and 1 column. Before, we asked
for 1 row and 2 columns.
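
An alternative to resetting mfrow by hand is to save the old settings when you change them and restore them afterwards. This is a sketch with invented values; the pdf(NULL) line is there only so that the example runs without opening a plot window, and you would leave it (and dev.off()) out in R-Studio.

```r
# Off-screen device so the sketch runs anywhere; not needed in R-Studio.
pdf(NULL)

# par() returns the previous settings, so we can keep them for later.
old_par <- par(mfrow = c(1, 2))

hist(c(80, 95, 100, 120, 180), main = NULL, xlab = "Toy values")
hist(c(1, 2, 2, 3, 3, 3), main = NULL, xlab = "More toy values")

# Restore whatever layout was in place before (here, 1 row and 1 column).
par(old_par)
layout_after <- par("mfrow")

dev.off()
```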
Note that the distribution above is not normal: it is right-skewed.
Now, let’s subset the data.

Can<-subset(CanidData, subset = CanidData$Diet == "Carnivore")

Omn<-subset(CanidData, subset = CanidData$Diet == "Omnivore")
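
There are other ways to get the same rows. The sketch below, using an invented data frame, compares subset with logical indexing and with split, which produces one data frame per diet in a single step.

```r
# Toy data frame with invented values.
toy <- data.frame(Diet = c("Carnivore", "Omnivore", "Carnivore", "Omnivore"),
                  Mass_KG = c(120, 95, 150, 101))

# Inside subset() the column can be named directly, without the dollar sign.
carn_subset <- subset(toy, Diet == "Carnivore")

# The same rows selected with a logical index in square brackets.
carn_index <- toy[toy$Diet == "Carnivore", ]

# split() returns a named list holding one data frame per diet.
by_diet <- split(toy, toy$Diet)
```

After the split, by_diet$Carnivore and by_diet$Omnivore play the same role as Can and Omn in the lab.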

And then produce a histogram of the newly separated data.

hist(Can$Mass_KG, xlab="Carnivore Mass (KG)", main = NULL)

[Figure: histogram of Frequency against Carnivore Mass (KG).]

hist(Omn$Mass_KG, xlab="Omnivore Mass (KG)", main = NULL)

[Figure: histogram of Frequency against Omnivore Mass (KG).]

We can measure central tendency and spread for the whole dataset, or separately for each level of a categorical
variable (in this case, diet).

mean(CanidData$Mass_KG) #Calculate a mean

## [1] 118.9944

median(CanidData$Mass_KG) #Calculate a median

## [1] 115.1652

#Note that the median is lower than the mean, as expected for right-skewed data

sd(CanidData$Mass_KG) #Calculate standard deviation

## [1] 27.84065
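
A small invented example shows why a right skew pulls the mean above the median, and what sd computes underneath. The values below are made up for illustration only.

```r
# Invented values with one large observation on the right.
x <- c(80, 90, 95, 100, 105, 180)

x_mean <- mean(x)      # pulled upwards by the single large value
x_median <- median(x)  # midpoint of the two middle values, 95 and 100

# Sample standard deviation written out by hand (note the n - 1 divisor).
x_sd_manual <- sqrt(sum((x - x_mean)^2) / (length(x) - 1))

# Should agree with the built-in function.
sd_matches <- isTRUE(all.equal(x_sd_manual, sd(x)))
```

Removing the value 180 would move the mean a lot and the median only a little, which is why the median is the more robust summary for skewed data.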

What if I wanted a mean for a given diet?

I can use the aggregate function.
First, you join all the continuous variables that you’re interested in together using the function cbind. Then
you tell it which categorical variable you want to find the mean/median/sd for: Diet in this case. FUN means
‘function’, that is, the summary function to apply to each group.

aggregate(x = cbind(Mass_KG,BiteForceN)~Diet, FUN="mean", data = CanidData)

## Diet Mass_KG BiteForceN

## 1 Carnivore 138.8240 46.0
## 2 Omnivore 99.1648 27.6

aggregate(x = cbind(Mass_KG,BiteForceN)~Diet, FUN="median", data = CanidData)

## Diet Mass_KG BiteForceN

## 1 Carnivore 129.91779 45
## 2 Omnivore 96.28355 30

aggregate(x = cbind(Mass_KG,BiteForceN)~Diet, FUN="sd", data = CanidData)

## Diet Mass_KG BiteForceN

## 1 Carnivore 25.52264 9.617692
## 2 Omnivore 10.46614 12.660964
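
The same grouped summary can be checked on a tiny invented table, where the answers are easy to work out by hand. The sketch passes the formula as the first, unnamed argument, and also shows tapply as an alternative for a single variable.

```r
# Toy data frame with invented values.
toy <- data.frame(Diet = c("Carnivore", "Omnivore", "Carnivore", "Omnivore"),
                  Mass_KG = c(120, 95, 150, 101),
                  BiteForceN = c(40, 25, 50, 31))

# One row per diet, one column per summarised variable.
group_means <- aggregate(cbind(Mass_KG, BiteForceN) ~ Diet,
                         data = toy, FUN = mean)

# tapply() does the same job for one variable and returns a named vector.
mass_means <- tapply(toy$Mass_KG, toy$Diet, mean)

print(group_means)
```

By hand, the carnivore mean mass is (120 + 150) / 2 = 135 and the omnivore mean mass is (95 + 101) / 2 = 98, which is what both calls return.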

Boxplots are a great way to illustrate a continuous variable grouped by a categorical variable.
The general format is as follows:
boxplot(continuous~categorical)
boxplot(Dependent~Independent)
You pass the function your whole data frame (data = CanidData), so you do not need to use the $ here.

par(pty='s',mfrow=c(1,2))
boxplot(Mass_KG~Diet, data = CanidData, xlab = "Diet", ylab = "Mass (KG)")
boxplot(BiteForceN~Diet, data = CanidData, xlab = "Diet", ylab = "Bite Force (N)")

[Figure: two boxplots grouped by Diet (Carnivore, Omnivore). Left panel: Mass (KG). Right panel: Bite Force (N).]
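
If you want the numbers behind a boxplot rather than the picture, boxplot can return them without drawing anything. The sketch below uses invented masses and the same formula format as above.

```r
# Toy data frame with invented values, five animals per diet.
toy <- data.frame(Diet = rep(c("Carnivore", "Omnivore"), each = 5),
                  Mass_KG = c(110, 120, 130, 150, 180,
                              85, 90, 95, 100, 120))

# plot = FALSE returns the summary statistics instead of drawing the plot.
bp <- boxplot(Mass_KG ~ Diet, data = toy, plot = FALSE)

# Rows of bp$stats: lower whisker, lower hinge, median, upper hinge,
# upper whisker. Columns follow the group order in bp$names.
medians <- bp$stats[3, ]
names(medians) <- bp$names

# Points beyond 1.5 times the box length from the box are listed as outliers.
outliers <- bp$out
```

Here the omnivore value 120 lies more than 1.5 box lengths above the upper hinge, so it is reported in bp$out and would be drawn as a separate point.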

Finally, we can use R to calculate a Z-score and then add the data back into our data frame. We create a new
column for both mass and bite force.
The general formula for a Z-score is: (value - mean) / standard deviation.
Then create a pair of box plots of the Z-scores. These box plots, despite initially being on different scales,
are now more comparable.

CanidData$Mass_KG_Z <- (CanidData$Mass_KG-mean(CanidData$Mass_KG))/
  sd(CanidData$Mass_KG)

CanidData$BiteForceN_Z <- (CanidData$BiteForceN-mean(CanidData$BiteForceN))/
  sd(CanidData$BiteForceN)

par(pty='s',mfrow=c(1,2))
boxplot(Mass_KG_Z~Diet, data = CanidData, xlab = "Diet", ylab = "Mass Z-Score")

boxplot(BiteForceN_Z~Diet, data = CanidData, xlab = "Diet", ylab = "Bite Force Z-Score")

[Figure: two boxplots grouped by Diet (Carnivore, Omnivore). Left panel: Mass Z-Score. Right panel: Bite Force Z-Score.]
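
As a check on the Z-score formula, the hand calculation can be compared with R's built-in scale function. This is a sketch on invented masses, not the lab data.

```r
# Invented masses for illustration.
mass <- c(80, 100, 120, 140, 160)

# Z-score by hand: (value - mean) / standard deviation.
z_manual <- (mass - mean(mass)) / sd(mass)

# scale() centres and divides by the standard deviation by default.
# It returns a one-column matrix, so convert it back to a plain vector.
z_scale <- as.numeric(scale(mass))

# After standardising, the mean is 0 and the standard deviation is 1.
z_mean <- mean(z_manual)
z_sd <- sd(z_manual)
```

This is why the two Z-score boxplots can share a scale: both variables now have mean 0 and standard deviation 1 across the whole dataset, regardless of their original units.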
