
Intro2R Wk2


Introduction to R for Air Quality Research

WK2

Dr. Ross Edwards - Senior Scientist


Wisconsin State Laboratory of Hygiene, University of Wisconsin Madison
Acknowledgements and disclaimer
Disclaimer
This presentation was funded by a grant from the United States Department of
State to the University of Wisconsin. The opinions, findings and conclusions
stated herein are those of the presenter and do not necessarily reflect those of
the United States Department of State.

Funding
• U.S. Department of State awards SMLAQM19CA2361
Partnerships
• The U.S. Embassy in Bangladesh and U.S. Consulate General, Kolkata.

Academic collaborations
• Dhaka University and Bose Institute, Kolkata.
Introduction to R

Subjects to be covered

1. CRAN R installation, setup and navigation.
2. R Studio 1.
3. R Studio 2.
4. Data types and structures 1.
5. Data types and structures 2.
6. Data types and structures 3.
7. Importing and exporting data.
8. Datetime objects
9. Plotting.
10. Plotting.
12. Writing functions.
13. Writing functions.
R GUI – Some alternatives

R Studio
https://www.rstudio.com

Rattle – an R package
https://cran.r-project.org/web/packages/rattle/index.html

R Commander
https://socialsciences.mcmaster.ca/jfox/Misc/Rcmdr/
RKWard
https://rkward.kde.org/About.html
More alternatives exist.

Note: We do not endorse any of these options.


• Integrated Development Environment (IDE) for R.
• Extra tools for viewing and importing and exporting data.
• Includes tutorials.
• Has four panes for running scripts, entering commands in the console,
viewing the environment, and plotting graphics.
R Studio – Seven reasons to use it

1. Lots of powerful features for editing and running code. Easier to write scripts.
2. Easier to import and inspect data (in table form) and see objects in the
working environment.
3. Version control through R Studio projects.
4. Improved organization of workspaces, scripts, data etc.
5. Graphics can be more accessible.
6. Can execute long-running scripts in the background (if memory is available).
7. Simplifies monitoring memory usage.
R Studio – getting started

1. Install R: Before installing RStudio, you need to have R installed on your
computer. You can download R from the Comprehensive R Archive Network
(CRAN) at https://cran.r-project.org/. Follow the instructions for your
operating system (Windows, macOS, or Linux) to complete the installation.

2. Install RStudio: After installing R, you can download and install RStudio from
the RStudio website at https://www.rstudio.com/products/rstudio/download/.
Choose the appropriate installer for your operating system and follow the
installation instructions.
R Studio – Menu bar

Menu Bar – R Studio

Menu Bar – R GUI


R Studio interface panes

1. Source
• This pane is where you write and edit your R scripts. You can create a new
script by clicking on "File" > "New File" > "R Script" or open an existing
script using "File" > "Open File".

2. Console
• The console pane is where you can enter R commands interactively and see the
output. You can execute a single line of code by typing it in the console and
pressing Enter.

3. Environment
• The Environment tab shows the variables, data frames, and other objects in
your current R workspace. The History tab displays a log of previously
executed commands.

4. Files / Plots / Packages / Help / ...
• These tabs provide access to your file system, plots generated by your R
code, installed and available R packages, and help documentation.
Write and run R code:

In RStudio, you can write R code in the Source pane and run it
using the Console pane.

To run a single line of code or a selected block of code from
the Source pane, press Ctrl+Enter (Cmd+Enter on macOS).

To run the entire script, press Ctrl+Shift+Enter
(Cmd+Shift+Enter on macOS). The output will be displayed in
the Console pane.
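
A whole script can also be run from the console with base R's source() function. The following is a minimal sketch, assuming a throwaway script written to a temporary folder; the file name and its contents are hypothetical and only serve the illustration.

```r
# Write a tiny script to a temporary file (made-up content, for illustration only)
script_path <- file.path(tempdir(), "hello_example.R")
writeLines(c(
  "x <- 1:10",
  "mean_x <- mean(x)",
  "print(mean_x)"
), script_path)

# Run the whole script; echo = TRUE prints each command before its output
source(script_path, echo = TRUE)
```

After the call, the objects created by the script (here x and mean_x) are available in the workspace, just as if the lines had been run from the Source pane.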
Install and load packages
To install an R package, you can use the "Packages" tab or run
the install.packages() function in the Console pane.

e.g. install.packages("ggplot2")

To load a package, tick the package's name in the
"Packages" tab or use the library() function in the Console
pane.

e.g. library(ggplot2)
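
Scripts that are shared with others often install a package only when it is missing. A minimal sketch of that pattern follows; ensure_package() is a hypothetical helper written for this example, not a function from base R or RStudio.

```r
# Install a package only if it is not already available, then report success.
# ensure_package() is a hypothetical helper, not part of base R.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  requireNamespace(pkg, quietly = TRUE)
}

# "stats" ships with R, so this call never triggers a download
ensure_package("stats")

# Typical use with the package from the example above:
# ensure_package("ggplot2")
# library(ggplot2)
```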
Save your work

To save your workspace (including data and variables), click on
"Session" > "Save Workspace As..." in the top menu.

Save your R script by clicking on "File" > "Save" in the top menu
or pressing Ctrl+S (Cmd+S on macOS).
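
Objects can also be saved and restored from code. The sketch below uses base R's save() and load() on two made-up objects and a temporary file; save.image() is the base R function that saves every object in the workspace instead of a named selection.

```r
# Create two example objects (hypothetical values)
pm25 <- c(35.2, 41.8, 28.9)
site <- "Dhaka"

# Save only the named objects to an .RData file in a temporary folder
workspace_file <- file.path(tempdir(), "example_workspace.RData")
save(pm25, site, file = workspace_file)

# Remove the objects, then restore them from the file
rm(pm25, site)
load(workspace_file)
print(pm25)
print(site)
```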
Example
1. Open RStudio and create a new R script by clicking on "File" >
"New File" > "R Script" or pressing Ctrl+Shift+N (Cmd+Shift+N on macOS).

2. In the Source pane, type or copy-paste the following R code:

# Load required packages
library(ggplot2)

# Create sample data
set.seed(42)
data <- data.frame(x = rnorm(50), y = rnorm(50))

# Generate scatter plot using ggplot2
scatter_plot <- ggplot(data, aes(x = x, y = y)) +
  geom_point() +
  labs(title = "Scatter Plot of Sample Data",
       x = "X-Axis",
       y = "Y-Axis")

# Display the plot
print(scatter_plot)
Example

3. Save the R script by clicking on "File" > "Save" or pressing Ctrl+S (Cmd+S on
macOS). Choose a location on your computer and give your script a name, such as
"scatter_plot_example.R".

4. Install the ggplot2 package if you haven't already. You can do this in the Console
pane by typing install.packages("ggplot2") or by clicking on the "Packages" tab,
clicking "Install", typing "ggplot2", and pressing Enter.

5. To run the entire script, press Ctrl+Shift+Enter (Cmd+Shift+Enter on macOS),
or click on the "Source" button at the top-right corner of the Source pane.
Alternatively, you can run the code line by line or in blocks by selecting the desired
lines and pressing Ctrl+Enter (Cmd+Enter on macOS) or clicking on the "Run" button.

6. After running the script, you should see the scatter plot in the "Plots" tab in the
lower right pane of RStudio. The Environment tab in the upper right pane will show
the variables created during the execution, such as data and scatter_plot.
R Studio – Workflows

R workflows

• An R workflow refers to the systematic process of organizing, writing,
executing, and maintaining R code to accomplish a data analysis or
visualization task.

• A well-structured workflow can improve your efficiency, help you better
understand your own code, and make it easier for others to understand and
reproduce your work.
R Studio – Workflows
Best practices and recommendations for creating an
effective R workflow:

1. Project organization: Organize your project files in a clear and
logical folder structure. This can include separate folders for
raw data, cleaned data, scripts, output files, and documentation.

2. Script organization: Break your code into smaller, modular scripts
or functions that each perform a specific task.

3. Reproducible analysis: Make your analysis reproducible by using
relative file paths, setting seeds for random number generation, and
documenting your code with comments and markdown files.

4. Automation: Use R scripts or R Markdown documents to automate
repetitive tasks, such as data cleaning, analysis, or report generation.
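
Two of these recommendations can be sketched in a few lines of base R: creating a folder structure and setting a seed. The folder names and the project location below are illustrative assumptions, not a layout prescribed by the course.

```r
# Hypothetical project root; tempdir() keeps the sketch from touching real folders
project_root <- file.path(tempdir(), "aq_analysis")

# One sub-folder per kind of file (names are illustrative)
subfolders <- c("data_raw", "data_clean", "scripts", "output", "docs")
for (folder in subfolders) {
  dir.create(file.path(project_root, folder), recursive = TRUE, showWarnings = FALSE)
}
print(list.files(project_root))

# Setting a seed makes random numbers repeatable between runs
set.seed(42)
first_draw <- rnorm(3)
set.seed(42)
second_draw <- rnorm(3)
print(identical(first_draw, second_draw))
```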
R Studio – Workflows

Enhanced Data analysis workflows

• Create R projects to house self-contained data analysis projects.

• Create literate code documents - text documents with embedded code
chunks such as R Notebook and R Markdown.
WK3 R Studio – R Projects

R Projects
R Studio – R Projects

R Projects
• Having to set working directories and absolute paths to data
folders rapidly becomes a problem and slows workflows.

• A better solution is to work with R projects.

• An R project's directory automatically becomes the working
directory for the project.

• An R project can contain all relevant scripts, data files,
figures/outputs, and history in sub-folders.

• Enables analysis etc. to be organized in a portable,
self-contained folder.
R Studio – R Projects

Creating an R project

• Created from File drop down menu.

Menu bar ▶
File ▶
New project ▶
New Directory
New project

• Fill in the directory name and click on "Create Project".
R Studio – R Projects

R project dir/folder contents

• R project - a folder containing a .Rproj file.

• The .Rproj file is a shortcut which will open the project.

• Can also open a project by selecting “Open Project” from the File menu.

• Or by using the R project icon and its drop-down menu on the right side
of the Environment panel.
R Studio – R Projects

R project settings

• It’s good practice to start RStudio each time with a “clean slate”,
that is, a clean workspace etc.

• In this case objects etc. will not be preserved and will need to be
recreated by running scripts etc.

• Select “Project Options” from the Tools menu. In the “General” tab,
set RStudio to not restore .RData into workspace at startup, and to not
save workspace to .RData on exit.
R Studio – R Projects

R project subfolders
• Enables work to be organized in a portable, self-contained folder.

• Note that the R project folder automatically becomes the working
directory.

• Functions and script files can be added to the project folder or
organized in sub-folders.

See example in WK3 folder
R Studio – R Projects
Good practices

• Create separate R Studio projects for each data analysis project.

• Keep raw and processed data files in project sub-folders.

• Keep scripts and functions in the project.

• Save outputs to a project sub-folder.

• Use relative directory paths, not absolute paths, to maintain portability.

Relative directory path
Use “.” = project folder, i.e. working directory.
E.g. relative path for a project subfolder:
“./subfolder”
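
A short sketch of reading and writing with relative paths follows. A temporary folder stands in for the R project folder so the example can run anywhere; the folder name, file name, and data values are made up.

```r
# Stand-in for an R project folder; inside a real project the working
# directory is already set, so the setwd() calls would not be needed
project_dir <- file.path(tempdir(), "demo_project")
dir.create(project_dir, showWarnings = FALSE)
old_wd <- setwd(project_dir)

# Create a sub-folder and write a small (made-up) data file using relative paths
dir.create("./data", showWarnings = FALSE)
example <- data.frame(hour = 1:3, pm25 = c(30.1, 28.4, 33.7))
write.csv(example, "./data/pm25_example.csv", row.names = FALSE)

# Read it back with the same relative path
reloaded <- read.csv("./data/pm25_example.csv")
print(reloaded)

# Restore the original working directory
setwd(old_wd)
```

Because the paths start from “.”, the same script keeps working when the whole project folder is copied to another computer.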
WK3 R Studio – R Notebooks & Markdown

R Notebooks and R Markdown


R Studio – R Notebooks & R Markdown

Enhanced workflow with R Notebooks & R Markdown


• Literate code / dynamic documents!

• Fusion of text and executable code.

• Results can be saved as Word, PDF and other formats.

• Can be a great way to share code and generate reports etc.


R Studio – R Notebooks & R Markdown

Knitr
• Both R Notebooks and R Markdown use a package called knitr.

• The basic idea of knitr is that you construct a normal text
document with blocks of code in it.

• Formatting of the text typically uses Markdown symbols.

• knitr converts the document to a rendered form with the R
code shown or hidden and the code output. Output forms
include webpages, PDFs, Word documents, slides etc.
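
The idea can be tried from the console. The sketch below writes a very small .Rmd file (made-up title, text and numbers) to a temporary folder and, if the knitr package is installed, calls knitr::knit() to run the chunk and produce a Markdown file containing the code and its output. The Knit button in RStudio goes further and converts the result to HTML, PDF or Word.

```r
# Build a minimal R Markdown document line by line (content is made up)
fence <- strrep("`", 3)  # three back ticks
rmd_lines <- c(
  "---",
  "title: \"Example report\"",
  "output: html_document",
  "---",
  "",
  "## Summary",
  "",
  "Some text with **bold** formatting.",
  "",
  paste0(fence, "{r pm-summary, echo=TRUE}"),
  "pm25 <- c(30.1, 28.4, 33.7)",
  "mean(pm25)",
  fence
)
rmd_file <- file.path(tempdir(), "example_report.Rmd")
writeLines(rmd_lines, rmd_file)

# knit() runs the chunks and writes a Markdown file with code and output;
# it is only called here if the knitr package is installed
if (requireNamespace("knitr", quietly = TRUE)) {
  md_file <- knitr::knit(rmd_file,
                         output = file.path(tempdir(), "example_report.md"),
                         quiet = TRUE)
  cat(readLines(md_file), sep = "\n")
}
```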
R Studio – R Notebooks & R Markdown

Differences between R Notebooks and R Markdown

• R Notebooks and R Markdown documents are similar but differ in some
options and in how code is executed.

• R Markdown: when the file is knitted, all the code chunks are
run and their results rendered.

• In R Notebooks, when the file is previewed, only the code
chunks that have already been run will show their results.
R Studio – R Notebooks and R Markdown

R Notebooks
• Workflow where analyses are performed within a document which
acts as the final report.

Advantages

• Code and output are adjacent, so you don’t have to switch between
“panes”.
• Combine documentation and reporting next to code, with
formatted text elements.
• Code can be shown or hidden.
• Code can also be Python, SQL etc. in addition to R.
• Easy to collaborate with human-readable documents.
• Output in different formats including HTML (web page), PDF,
Microsoft Word, presentations.
R Studio – R Notebook / Markdown
R Notebook & R Markdown anatomy

YAML Header: Metadata: title, author, date information, options for output.

Text: Use markdown symbols to format text.

Code chunks: Executable R code. Options for rendering in output.
Run 1 chunk at a time or all of them.

Code output: Data, calculations, figures etc.

.Rmd file ▶ Outputs: HTML (web page), PDF, Word doc, presentation slides etc.

* Save R Notebook in R project for easier organization
R Studio – R Notebook / Markdown
R Notebook & R Markdown anatomy

.Rmd file: YAML Header, Text, Code chunks, Code output

Outputs: HTML (web page), PDF, Word doc, presentation slides etc.

Dynamic report generation engine: the package that renders R Notebooks
and R Markdown.
R Studio – R Notebook document

YAML Header
• Includes title, author, and date information as
well as options for customizing output.

• The section is fenced by “---” before and after the section.

• In this case the output is saved as an HTML file alongside the .Rmd file.

• YAML spacing must be correct! No tabs! If indentation is needed,
use a single space.

https://www.cloudbees.com/blog/yaml-tutorial-everything-you-need-get-started
R Studio – R Notebook document
Markdown formatting
Text formatting - titles
• Use “#” for title of section
• Use ## for sub-section title (smaller font size)
• Use ### ……..

E.g.
# Title
## Title
### Title

Paragraphs, bold, italics and hyperlink
• For a line break, use two spaces at the end of a line; leave a blank
line to start a new paragraph.
• Use the “-” to create bullet points.
• Use **for bold** text.
• Use *for italics*.
• Use [text](URL) for webpage links.

E.g.
- Title
**Title**
*Title*
[Title](URL)

See
https://bookdown.org/yihui/bookdown/markdown-syntax.html
for detailed discussion.
R Studio – R Notebook document
Code chunk
• Fenced by three back ticks ``` before and after code.
• An R code block is identified using {r} after the opening back ticks.
• An optional chunk label and options can be included inside the {r}.

```{r chunk-label, options}
Code goes here
```

Options

• message: displays (TRUE) or not (FALSE) messages.

• warning: displays (TRUE) or not (FALSE) warnings.

• echo: TRUE by default; it hides the code when changed to FALSE.

• eval: TRUE by default; the code is not run (so there is no output) when changed to FALSE.

• include: TRUE by default; when changed to FALSE the code still runs, but the code and its output are hidden.
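
Instead of repeating options in every chunk, defaults for the whole document can be set once with knitr's opts_chunk object. A minimal sketch, assuming the knitr package is installed (the call is skipped otherwise); the chosen values are only an example.

```r
# Set default chunk options for a whole document (normally placed in the
# first "setup" chunk of an .Rmd file); skipped if knitr is not installed
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::opts_chunk$set(
    echo = FALSE,     # hide code in the output
    message = FALSE,  # hide messages
    warning = FALSE   # hide warnings
  )
  # Inspect the current default for one option
  print(knitr::opts_chunk$get("echo"))
}
```

Options written in an individual chunk header still override these defaults for that chunk.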


R Studio – R Notebook document

Code chunk

• Figure options can be included inside the {r}.

e.g.

{r fig2, fig.height=3, fig.width=8, fig.align='center',
fig.cap='Figure 2. Six hour average Dhaka TSI2 PM2.5 2022.'}

See https://sebastiansauer.github.io/figure_sizing_knitr/
R Studio – R Notebooks and R Markdown
R Notebooks & R Markdown: web page output
[Screenshot annotations: click to run a code chunk; run all code chunks;
Knit to render output (Preview in an R Notebook); rendered output.]


R Studio – R Notebooks & Markdowns

How to make a new R Markdown

Menu bar ▶
File ▶
New File ▶
R Markdown...

• Select the class of output you would like to make with
your .Rmd file (document, presentation, etc.)

• Select the type of output with the radio buttons (you can
change this later), e.g. HTML, PDF, Word …

• Click OK
R Studio – R Notebooks

R Notebook
• When you save an R Notebook, knitr will automatically render
an HTML version of the notebook and save it alongside
the .Rmd file (the Notebook).

• Note that only the results of code chunks that have been
previously run will be rendered.
R Studio – R Notebooks

R Notebook

• R Studio can also output R Notebooks and R Markdown files as
MS Word docs etc.
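
Word output can also be requested from code with the rmarkdown package's render() function. The sketch below is an illustration under assumptions: the file name and document text are made up, and rendering only runs if rmarkdown and pandoc (which is bundled with RStudio) are available.

```r
# Write a minimal R Markdown file to a temporary folder (made-up content)
rmd_file <- file.path(tempdir(), "word_example.Rmd")
writeLines(c(
  "---",
  "title: \"Word output example\"",
  "---",
  "",
  "Plain text rendered to a Word document."
), rmd_file)

# render() needs the rmarkdown package and pandoc; skip quietly if missing
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  docx_file <- rmarkdown::render(rmd_file,
                                 output_format = "word_document",
                                 quiet = TRUE)
  print(docx_file)
}
```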
