On Building An R Report (Problem Set 0) : Objective

This document provides instructions for creating an R report that includes R code, text, and outputs. It instructs the reader to install packages, load data, create text and code blocks, and format text and plots. Examples are provided for creating tables, histograms, and scatterplots within the report. Mathematical notation is also demonstrated using LaTeX syntax. The goal is to help the reader feel comfortable generating professional-looking reports in R markdown.

On building an R report (Problem Set 0)

Casey Crisman-Cox

Fall 2023

Objective
Our goal here is to get you comfortable with generating an R report, in order to give you practice with
programming and writing up results. For 2 bonus points on your final grade, complete all the tasks and examples
listed here and turn them in with problem set 1. That’s 2 extra points for just learning a neat and simple way to
generate professional-looking reports that include R code and text. This is not a real problem set; it is only
to be used as a guide for creating an R report. Before starting, install tinytex with the following R code
(YOU ONLY NEED TO DO THIS ONCE THIS WHOLE SEMESTER!):
install.packages('tinytex')
tinytex::install_tinytex()

Also make sure you have the following packages installed: knitr, dplyr, ggplot2
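If you are not sure which of these packages are already on your machine, a quick check along the following lines may help. This is only a sketch: the variable names `needed`, `installed`, and `missing_pkgs` are made up for illustration, and nothing is installed by the code itself.

```r
# Packages this guide relies on
needed <- c("knitr", "dplyr", "ggplot2")

# requireNamespace() returns TRUE when a package can be loaded, FALSE otherwise
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)

# Names of the packages that still need to be installed (possibly none)
missing_pkgs <- needed[!installed]
print(missing_pkgs)
```

Passing `missing_pkgs` to install.packages() would then install only what is absent.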
1. Open a new R script file. Save it as markdownExample_LASTNAME.R.
2. Create a title block (like this one) at the top of your file.
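A title block in a script like this is usually a small YAML header written on text lines. The sketch below shows one plausible layout; the author value is a placeholder, and the `output: pdf_document` line is an assumption about how you want to compile (it can be left out if you pick the format when compiling).

```r
#' ---
#' title: "On building an R report (Problem Set 0)"
#' author: "Your Name"
#' date: "Fall 2023"
#' output: pdf_document
#' ---
```

The lines of three dashes open and close the block, and everything between them is read as document settings rather than as text.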

Basically, you can now write around your R code so long as you begin your comments with #' instead of
the regular #. Pressing the compile button will produce a pdf, html, or Word document that
contains your R code and your written responses. This setup also lets you format things very easily
using some basic commands. For example, you can:
• Start a numbered list by typing 1.
• Create a bullet list with simple * bullets
– Indent your list by using 4 spaces to move in
• Place a * on either side of some text to italicize it
• Place a ** on either side of some text to make it bold
This takes us to our next task:
3. Create a numbered list of your top four favorite anything (e.g., movies,
animals, books, beers, ice cream flavors). Boldface one of them.
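To see how these formatting commands look inside a script, here is a short sketch using placeholder content (not an answer to the task). Every line is a text line, so each one starts with #'. The bare #' lines leave a blank line, which markdown generally needs before a list begins.

```r
#' Things to check before compiling:
#'
#' 1. The script is saved
#' 2. The **title block** is at the top
#' 3. Every text line starts with the right marker
#'
#' A bullet list with a nested item:
#'
#' * First bullet
#'     * Nested bullet, indented by 4 spaces
#' * Second bullet with some *italic* text
```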

Packages
4. Load the knitr, dplyr, and ggplot2 packages with the code

library(knitr)
library(ggplot2)
library(dplyr)

Note: You can get rid of the stupid package start-up messages that come with some of these and ugly up
your documents by adding the line #+ message=FALSE before the code. The #+ at the start of a line instead
of #' tells R that you are writing options for the upcoming code rather than text or code. To recap:
1. #' This line is text (including math and title block)
2. #+ This line is options for code you’re about to write
3. Nothing at the start of a line is code.

Using data
# Ordinary code comments without the extra tick are included as part of a code block
# loading mtcars data
data(mtcars)
print(head(mtcars)) #ugly and unprofessional

##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
kable(head(mtcars), format="pandoc")

                     mpg   cyl   disp    hp   drat      wt    qsec   vs   am   gear   carb
-----------------  -----  ----  -----  ----  -----  ------  ------  ---  ---  -----  -----
Mazda RX4           21.0     6    160   110   3.90   2.620   16.46    0    1      4      4
Mazda RX4 Wag       21.0     6    160   110   3.90   2.875   17.02    0    1      4      4
Datsun 710          22.8     4    108    93   3.85   2.320   18.61    1    1      4      1
Hornet 4 Drive      21.4     6    258   110   3.08   3.215   19.44    1    0      3      1
Hornet Sportabout   18.7     8    360   175   3.15   3.440   17.02    0    0      3      2
Valiant             18.1     6    225   105   2.76   3.460   20.22    1    0      3      1

# better and the report knows what to do with it

You can also use format="html" to get something you can export to a Word document. This process requires
more steps and is a bit hacky. First, you get the html code by using kable(head(mtcars), format="html").
Second, you copy-paste the html output into a blank text file and save it as an html file. Third, you open the
html file in a web browser. Fourth, you copy and paste that formatted table into Word. Like I said, a bit
hacky, which is why we like to compile to pdf.
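The first two steps can also be done from R rather than by copy-pasting. The following is a rough sketch under some assumptions: the file name `mtcars_table.html` and the choice of a temporary folder are made up for illustration, and you would probably save to your own project folder instead.

```r
library(knitr)

data(mtcars)

# Step 1: build the html code for the table
html_table <- kable(head(mtcars), format = "html")

# Step 2: write that html to a file (hypothetical name, temporary folder)
out_file <- file.path(tempdir(), "mtcars_table.html")
writeLines(as.character(html_table), out_file)

# Step 3 could then be done with browseURL(out_file), which opens the file in a browser
```

The last step, copying the rendered table from the browser into Word, still has to be done by hand.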
5. Create a table that reports the mean MPG for each number of cylinders. You can aggregate using the
summarize function, as mentioned in the working-with-data lecture (dataviz1.pdf); if we haven’t gotten
to that lecture yet, sit tight. Report the results using the kable function. Make sure that you use
well-formatted names.
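As a rough illustration of the aggregate-then-report pattern, here is a sketch on a different pair of variables (mean horsepower by number of forward gears) so that the task itself is left to you. The object name `hp_by_gear` and the column labels are made up for this example. It also shows the three kinds of lines together: a #' text line, a #+ options line, and plain code.

```r
#' The table below reports average horsepower by number of forward gears.

#+ message=FALSE
library(dplyr)
library(knitr)

data(mtcars)

# One row per gear count, holding the mean horsepower within that group
hp_by_gear <- mtcars %>%
  group_by(gear) %>%
  summarize(mean_hp = mean(hp))

# col.names swaps the raw variable names for readable headers;
# digits rounds the reported means to one decimal place
kable(hp_by_gear,
      format = "pandoc",
      col.names = c("Forward gears", "Mean horsepower"),
      digits = 1)
```

Setting `col.names` inside kable is one way to get well-formatted names without renaming the underlying variables.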

Plotting
We can plot data and have the figures appear in the text.
ggplot(mtcars)+
  geom_point(aes(x=disp, y=mpg))+
  ylab("Miles per gallon")+
  xlab("Displacement (cu. in.)")+ # disp in mtcars is measured in cubic inches
  ggtitle("Engine displacement and fuel efficiency")

[Figure: scatterplot titled "Engine displacement and fuel efficiency"; x-axis: Displacement, 100 to 400; y-axis: Miles per gallon, 10 to 35]
This plot is a bit too big. We can change the size options by using a comment that starts with #+. For
example, to change the plot size, we can specify #+ fig.width=4, fig.height=4 before plotting.
Note that these changes only apply to the figure you’re currently plotting. Here is an example of changing
the size of the figure using #+ fig.width=4, fig.height=4:
ggplot(mtcars)+
  geom_point(aes(x=disp, y=mpg))+
  ylab("Miles per gallon")+
  xlab("Displacement (cu. in.)")+ # disp in mtcars is measured in cubic inches
  ggtitle("Engine displacement and fuel efficiency")

[Figure: the same scatterplot at fig.width=4, fig.height=4; x-axis: Displacement, 100 to 400; y-axis: Miles per gallon, 10 to 35]
Be careful with figure size: always be sure that it looks good, and always give proper labels.
Bad plot: #+ fig.width=2, fig.height=2
ggplot(mtcars)+
geom_histogram(aes(x=mpg))

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.


[Figure: histogram at fig.width=2, fig.height=2 with default labels; x-axis: mpg, 10 to 35; y-axis: count]
Better plot: #+ fig.width=4, fig.height=4
ggplot(mtcars)+
geom_histogram(aes(x=mpg))

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

[Figure: histogram at fig.width=4, fig.height=4 with default labels; x-axis: mpg, 10 to 35; y-axis: count]
Best plot: #+ fig.width=4, fig.height=4,fig.align='center'
ggplot(mtcars)+
geom_histogram(aes(x=mpg))+
ggtitle("Histogram of MPG ratings")+
xlab("Miles per gallon")+
ylab("Frequency")

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

[Figure: centered histogram titled "Histogram of MPG ratings"; x-axis: Miles per gallon, 10 to 35; y-axis: Frequency]
Mac users sometimes have trouble with saving R plots. The reason for this often has to do with a piece of
graphing software called XQuartz that used to come standard on Macs and then didn’t. You can download it
here: https://www.xquartz.org/.
6. Plot a well-labeled scatterplot that has MPG on the y-axis and cylinders on the x-axis. Use triangular
points. You can find the right pch by checking the help file using the command ?points.
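In ggplot2, the point symbol is set with the shape argument, which accepts the same numeric codes that ?points lists for pch. The sketch below uses a different pair of variables and a different symbol (filled squares, code 15) so the task is still yours to do. The object name `p` is made up for this example.

```r
library(ggplot2)

data(mtcars)

# shape is set outside aes() because it is a fixed choice, not mapped to data
p <- ggplot(mtcars)+
  geom_point(aes(x=wt, y=hp), shape=15)+
  xlab("Weight (1000 lbs)")+
  ylab("Gross horsepower")+
  ggtitle("Vehicle weight and horsepower")

# Printing the object draws the plot
p
```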

Math support
It’s not just code and text that are easier this way. We also have an easy-to-use math typing system. For
inline math we wrap $ symbols around the math. Here, $x+1$ gives x + 1. Using two dollar signs gives us
display math; save this for important points. For example, $$E[X] = \mu$$ produces

E[X] = µ

Note that this typed math is text. Any line that includes math starts with #'.
Here, we see that Greek letters are at our disposal. The basic ones we use:
• \mu, µ
• \sigma, σ
• \beta, β
• \epsilon, ϵ
• \rho, ρ
• \alpha, α
We can convert these to estimates using $\hat{}$, as in the sample mean is $\hat{\mu}$ or µ̂. We can also
add exponents and subscripts using ^ and _, as in the sample variance is $\hat{\sigma}^2$ (σ̂²). Fractions
are easy: $\frac{1}{2}$ gives 1/2. Square roots are another common thing we run into; those are given as
$\sqrt{\sigma^2}=\sigma$ (√(σ²) = σ). You can look up other symbols as needed, but the last common
one we use is the sum operator, given by \sum_{}^{}. Putting it all together, we can write the sample mean
formula as $$\hat{\mu}=\frac{1}{N}\sum_{i=1}^{N}x_i$$, which gives us

µ̂ = (1/N) Σ_{i=1}^{N} x_i
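The same pieces (hats, exponents, fractions, and sums) combine into longer formulas. As one more sketch, the sample variance could be typed as below. This assumes the usual N − 1 denominator, which is not stated in this guide, so adjust it to whatever definition the course uses.

```latex
$$\hat{\sigma}^2=\frac{1}{N-1}\sum_{i=1}^{N}(x_i-\hat{\mu})^2$$
```

In your script this would sit on a line that starts with #', like any other text.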

7. Write a fraction of α over β in display math.


8. Compile your document as a pdf and turn it in along with your R script file.
