Getting Started With R and RStudio

R is a free and open-source programming language and software environment for statistical analysis, graphics, and statistical computing. It was primarily developed for data analysis and statistical graphics. R can be used through the R console or through RStudio, which provides a graphical user interface. Key features of R include being free, running on multiple platforms, having a large community of users and contributors, and having many add-on packages available. When using RStudio, scripts can be created and run, and the interface provides panes for editing code, running code in the console, and viewing plots and other output. Common tasks like installing and loading packages can be done within R.

Getting started with R and RStudio

1.1 Why R?
• R is not a programming language like C or Java.

• It was not created by software engineers for software development.

• Instead, it was developed by statisticians as an interactive environment for data analysis.

• R was developed by Ross Ihaka and Robert Gentleman.

• You can read the full history in the paper A Brief History of S.

• The interactivity is an indispensable feature in data science because, as you will soon learn, the ability to quickly explore data is a necessity for success in this field.

• However, like in other programming languages, you can save your work as scripts that can be easily executed at any moment.

• These scripts serve as a record of the analysis you performed, a key feature that facilitates reproducible work.
Attractive features of R 

• R is free and open source.

• It runs on all major platforms: Windows, Mac OS, UNIX/Linux.

• Scripts and data objects can be shared seamlessly across platforms.

• There is a large, growing, and active community of R users and, as a result, there are numerous resources for learning and asking questions.

• It is easy for others to contribute add-ons, which enables developers to share software implementations of new data science methodologies.

• This gives R users early access to the latest methods and to tools which are developed for a wide variety of disciplines, including ecology, molecular biology, social sciences, and geography, just to name a few examples.


1.2 The R console
• Interactive data analysis usually occurs on the R console that executes commands as you type them.

• There are several ways to gain access to an R console.

• One way is to simply start R on your computer.
As a quick example, try using the console to calculate a 15% tip on
a meal that cost $19.71:

0.15 * 19.71

#> [1] 2.96
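
As a small extension of the tip calculation above, the sketch below stores the inputs in variables and rounds the result to cents. The names meal_cost, tip_rate, tip, and total are chosen here for illustration only and are not part of the original example.

```r
# Inputs for the tip calculation (illustrative variable names)
meal_cost <- 19.71
tip_rate <- 0.15

# Compute the tip and round it to two decimal places (cents)
tip <- round(meal_cost * tip_rate, 2)
tip

# Total amount to pay, meal plus rounded tip
total <- meal_cost + tip
total
```

Typing a variable name on its own line, as with tip and total here, asks the console to print its value.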


1.3 Scripts
• One of the great advantages of R over point-and-click analysis software is that you can save your work as scripts.

• You can edit and save these scripts using a text editor.

• RStudio includes an editor with many R-specific features, a console to execute your code, and other useful panes, including one to show figures.
1.4 RStudio
• RStudio will be our launching pad for data science projects.

• It not only provides an editor for us to create and edit our scripts but also provides many other
useful tools.

• In this section, we go over some of the basics.


1.4.1 The panes
• When you start RStudio for the first time, you will see three panes.

• The left pane shows the R console.

• On the right, the top pane includes tabs such as Environment and History, while the bottom pane shows five tabs: Files, Plots, Packages, Help, and Viewer (these tabs may change in new versions).

• You can click on each tab to move across the different features.

• To start a new script, you can click on File, then New File, then R Script.

• This starts a new pane on the left and it is here where you can start writing your script.
1.4.2 Key bindings
• Many tasks we perform with the mouse can be achieved with a combination of key strokes instead.

• These keyboard versions for performing tasks are referred to as key bindings.

• For example, we just showed how to use the mouse to start a new script, but you can also use a key binding: Ctrl+Shift+N on Windows and command+shift+N on the Mac.
1.4.3 Running commands while editing scripts

• There are many editors specifically made for coding.

• These are useful because color and indentation are automatically added to make code more
readable.

• RStudio is one of these editors, and it was specifically developed for R.

• One of the main advantages provided by RStudio over other editors is that we can test our code
easily as we edit our scripts. 
• Let’s start by opening a new script as we did before.

• A next step is to give the script a name.

• We can do this through the editor by saving the current new unnamed script.

• To do this, click on the save icon or use the key binding Ctrl+S on Windows and command+S on
the Mac.
• When you ask for the document to be saved for the first time, RStudio will prompt you for a name.

• A good convention is to use a descriptive name, with lower case letters, no spaces, only hyphens to separate words, and then followed by the suffix .R.

• We will call this script my-first-script.R.


• Another feature you may have noticed is that when you type library( the second parenthesis is
automatically added.

• This will help you avoid one of the most common errors in coding: forgetting to close a
parenthesis.

• As an example, we will make a graph showing murder totals versus population totals by state.

• Once you are done writing the code needed to make this plot, you can try it out by executing the
code.

• To do this, click on the Run button on the upper right side of the editing pane.

• You can also use the key binding: Ctrl+Shift+Enter on Windows or command+shift+return on the
Mac.
• Once you run the code, you will see it appear in the R console and, in this case, the generated plot appears in the Plots pane.

• Note that the Plots pane has a useful interface that permits you to click back and forward across different plots, zoom in to the plot, or save the plots as files.
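
The slides describe a plot of murder totals versus population totals by state but do not show the code for it. The following is a minimal sketch of what such a script could look like, not necessarily the code used in the original slides. It assumes the dslabs package is installed and uses its murders dataset with base R graphics.

```r
# Assumption: the dslabs package is already installed,
# e.g. via install.packages("dslabs")
library(dslabs)
data(murders)

# Scatter plot of total murders against population, one point per state
plot(
  murders$population, murders$total,
  xlab = "Population",
  ylab = "Total murders",
  main = "Murder totals versus population by state"
)

# Label each point with the state abbreviation
text(murders$population, murders$total,
     labels = murders$abb, pos = 3, cex = 0.6)
```

Saving these lines in my-first-script.R and running them should show the code in the console and the figure in the Plots pane.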
1.4.4 Changing global options
• You can change the look and functionality of RStudio quite a bit.

• To change the global options you click on Tools then Global Options….

• As an example we show how to make a change that we highly recommend.

• This is to change the Save workspace to .RData on exit option to Never and to uncheck Restore .RData into workspace at startup.

• By default, when you exit, R saves all the objects you have created into a file called .RData.

• This is done so that when you restart the session in the same folder, it will load these objects.

• We find that this causes confusion, especially when we share code with colleagues and assume they have this .RData file.

• To change these options, open the General settings and apply the two changes described above.


1.5 Installing R packages

• What you get after your first install is referred to as base R.

• Extra functionality comes from add-on packages, many available from CRAN and many others shared via other repositories such as GitHub.

• Rather than including everything by default, R instead makes different components available via packages.

• R makes it very easy to install packages from within R.

• For example, to install the dslabs package:

install.packages("dslabs")

• In RStudio, you can also navigate to the Tools menu and select Install Packages. We can then load the package into our R session using the library function:

library(dslabs)
• We can install more than one package at once by feeding a character vector to install.packages:

install.packages(c("tidyverse", "dslabs"))

• You can see all the packages you have installed using the following function:

installed.packages()
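
The functions above can be combined to install only the packages that are not yet present. The sketch below is one possible way to do this. The helper name missing_packages is hypothetical, introduced here for illustration, and the actual install call is left commented out so that running the example does not change your system.

```r
# Hypothetical helper: which of the wanted packages are not installed yet?
# installed.packages() returns a matrix whose row names are package names.
missing_packages <- function(wanted,
                             installed = rownames(installed.packages())) {
  setdiff(wanted, installed)
}

wanted <- c("tidyverse", "dslabs")
to_install <- missing_packages(wanted)
to_install

# Uncomment to install whatever is missing:
# if (length(to_install) > 0) install.packages(to_install)
```

Passing the installed argument explicitly, for example missing_packages(c("a", "b"), installed = "a"), makes the helper easy to try out without touching the real package library.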
Installing R and RStudio
Installing R
• RStudio is an interactive desktop environment, but it is not R, nor does it include R when you download and install it.

• Therefore, to use RStudio, we first need to install R.

• You can download R from the Comprehensive R Archive Network (CRAN). Search for CRAN on your browser.

• Once at the CRAN download page, you will have several choices.

• You want to install the base subdirectory. This installs the basic packages you need to get started.

• We will later learn how to install other needed packages from within R, rather than from this webpage.
• Click on the link for the latest version to start the download.

• You can now click through different choices to finish the installation. We recommend you select
all the default choices.
Installing RStudio
• Download RStudio.

• Once you select this option, it will take you to a page in which the operating system options are provided. Click the link showing your operating system.
R - The very basics
Objects

• We will write out general code for the quadratic equation below, but if we are asked to solve x^2 + x - 1 = 0, then we define:

a <- 1

b <- 1

c <- -1

We use <- to assign values to the variables.

To see the value stored in a variable, we simply ask R to evaluate a and it shows the stored value:

a
#> [1] 1

If you want a quick look at the arguments of a function, such as log, without opening the help system, you can type:
args(log)
#> function (x, base = exp(1))
#> NULL
You can change the default values by simply assigning another object:
log(8, base = 2)
#> [1] 3
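Note that we have not been specifying the argument x. It can be named explicitly:
log(x = 8, base = 2)
#> [1] 3
If no argument names are used, R assumes the arguments are given in the order shown by args, here x followed by base:
log(8, 2)
#> [1] 3
If names are used, the arguments can be given in any order:
log(base = 2, x = 8)
#> [1] 3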
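
The argument-matching rules shown above can be checked directly. The following sketch compares the three calling styles and also shows what happens when base is left at its default. The variable names are illustrative.

```r
# Three equivalent ways to compute the base-2 logarithm of 8
by_position <- log(8, 2)             # matched by position: x, then base
by_name     <- log(x = 8, base = 2)  # matched by name
reordered   <- log(base = 2, x = 8)  # names allow any order

# All three give the same value
all.equal(by_position, by_name)
all.equal(by_name, reordered)

# Leaving out base uses the default, exp(1), i.e. the natural logarithm
natural <- log(8)
all.equal(natural, log(8, base = exp(1)))
```

all.equal is used instead of == because it compares numbers up to a small tolerance, which is the safer habit with floating-point results.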
Other prebuilt objects
• There are several datasets that are included for users to practice and test out functions. You can see all the available datasets by typing:

data()
• This shows you the object name for these datasets. These datasets are objects that can be used by simply typing the name. For example, if you type:

co2
• R will show you Mauna Loa atmospheric CO2 concentration data.

• Other prebuilt objects are mathematical quantities, such as the constants π and ∞:

pi
#> [1] 3.14
Inf+1
#> [1] Inf
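
A few quick checks on these prebuilt objects may help make them concrete. The sketch below inspects the built-in co2 dataset (note the lowercase name) and shows how pi and Inf behave. How many digits the console prints depends on the digits option, so the output on your machine may show more decimals than the slides do.

```r
# The built-in co2 dataset is a time series object
class(co2)
head(co2)

# Print pi with three significant digits, then with the session default
print(pi, digits = 3)
pi

# Inf behaves like mathematical infinity in arithmetic
is.infinite(Inf + 1)
1 / 0        # dividing a positive number by zero gives Inf
-Inf < 0     # negative infinity is smaller than any number
```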
Variable names

• We have used the letters a, b, and c as variable names, but variable names can be
almost anything.
• Some basic rules in R are that variable names have to start with a letter, can’t
contain spaces, and should not be variables that are predefined in R. For example,
don’t name one of your variables install.packages by typing something
like install.packages <- 2.
• A nice convention to follow is to use meaningful words that describe what is
stored, use only lower case, and use underscores as a substitute for spaces. For the
quadratic equations, we could use something like this:
solution_1 <- (-b + sqrt(b^2 - 4*a*c)) / (2*a)

solution_2 <- (-b - sqrt(b^2 - 4*a*c)) / (2*a)
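
To connect the pieces, here is a minimal sketch that wraps the quadratic formula in a function and verifies the solutions by substituting them back into the equation. The function name solve_quadratic is hypothetical, and the sketch only handles the case of real solutions.

```r
# Hypothetical helper: real solutions of a*x^2 + b*x + c = 0
solve_quadratic <- function(a, b, c) {
  discriminant <- b^2 - 4 * a * c
  if (discriminant < 0) {
    stop("no real solutions: the discriminant is negative")
  }
  solution_1 <- (-b + sqrt(discriminant)) / (2 * a)
  solution_2 <- (-b - sqrt(discriminant)) / (2 * a)
  # Even though an argument is named c, the call c(...) still finds
  # R's combine function, because R looks for a function in a call
  c(solution_1, solution_2)
}

# Solve x^2 + x - 1 = 0
roots <- solve_quadratic(a = 1, b = 1, c = -1)
roots

# Check: plugging the roots back in should give values very close to 0
roots^2 + roots - 1
```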


Saving your workspace

• Values remain in the workspace until you end your session or erase them with the function rm. But
workspaces also can be saved for later use. In fact, when you quit R, the program asks you if you want to
save your workspace. If you do save it, the next time you start R, the program will restore the workspace.
• We actually recommend against saving the workspace this way because, as you start working on different
projects, it will become harder to keep track of what is saved. Instead, we recommend you assign the
workspace a specific name. You can do this by using the function save or save.image. To load, use the
function load. When saving a workspace, we recommend the suffix rda or RData. In RStudio, you can also
do this by navigating to the Session tab and choosing Save Workspace as. You can later load it using the Load
Workspace options in the same tab. You can read the help pages on save, save.image, and load to learn
more.
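
The save and load functions mentioned above can be tried out safely with a temporary file. This is a sketch under stated assumptions: the object names coef_a and coef_b and the file name quadratic-workspace.RData are made up for illustration, and tempdir() is used so nothing is written to your project folder.

```r
# Create two example objects
coef_a <- 1
coef_b <- 1

# Save them under a specific, descriptive file name
workspace_file <- file.path(tempdir(), "quadratic-workspace.RData")
save(coef_a, coef_b, file = workspace_file)

# Erase the objects from the workspace with rm
rm(coef_a, coef_b)
exists("coef_a")

# Restore them from the saved file
load(workspace_file)
coef_a
coef_b
```

save.image works the same way but stores every object in the workspace rather than a chosen few.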
Motivating scripts
Commenting your code
Example
