
Module 4 PDF

This document discusses installing and using R and RStudio for statistical analysis. It describes how to download and install both R and RStudio software, providing step-by-step instructions. The RStudio interface is explained, with its four panes that include the Console, Script, Environment and History, and Files/Plots windows. Basic use of R commands in RStudio is covered.

Uploaded by

ABAGAEL CACHO

MODULE 4: R AND RSTUDIO FOR STATISTICAL ANALYSIS

(For SEPTEMBER 14-16)

Learning Outcomes:
(1) Install the R and RStudio software for statistical data analysis in your
personal computer or laptop, in preparation for its use in the succeeding
modules.
(2) Practice basic RStudio commands to get familiarized with the syntax and
language of RStudio.

In this module, you will learn about R and RStudio, how to install these software
applications, and how to perform basic commands in RStudio.

4.1. INTRODUCTION TO R AND RSTUDIO

R is free, non-commercial statistical software. It is an implementation of the statistical
programming language S, which was developed at AT&T Bell Laboratories by Rick Becker,
John Chambers and co-workers. R is a programming language for statistics and graphics
that can be installed on many computers without restriction.

R is a function-based computer language, which means that all actions are initiated by
calling functions. R provides a powerful and comprehensive environment for data
manipulation and analysis. You will encounter many examples as we use it for
statistical analysis.
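To make the idea of "calling functions" concrete, here is a minimal sketch. The object names (total, root, same_as_plus) are made up for illustration, and the assignment arrow <- is explained later in this module.

```r
# Every action in R is a function call: a function name followed by
# its inputs inside parentheses.
total <- sum(4, 9, 16)      # sum() adds its inputs
root  <- sqrt(81)           # sqrt() takes the square root

# Even arithmetic operators are functions and can be called by name.
same_as_plus <- `+`(5, 3)   # identical to writing 5 + 3

print(total)         # 29
print(root)          # 9
print(same_as_plus)  # 8
```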

4.2. INSTALLING R AND RSTUDIO

How to Install R

Step 1. Download the R installer from https://cran.r-project.org/. The necessary files for
installing R in Windows, Mac OS X, or Linux are found on the website.

Property of and for the exclusive use of SLU. Reproduction, storing in a retrieval system, distributing, uploading or posting online, or transmitting in any form or by any
means, electronic, mechanical, photocopying, recording, or otherwise of any part of this document, without the prior written permission of SLU, is strictly prohibited. 40
Figure 1. Screenshot of the CRAN website https://cran.r-project.org/

Select the installer file which is appropriate for your laptop's or computer's operating
system. For those using the Windows operating system, click on “Download R for Windows.”
For those using Mac OS X or Linux, just click on the corresponding link to proceed with the
download.

Upon clicking on that option, you will be directed to the next page as shown in the
following figure.

Figure 2. Screenshot for Step 2 of the installation process.

Step 2. On the page shown in Figure 2, click on “Install R for the first time.”

Step 3. On the next webpage, as shown in Figure 3, click on “Previous releases” and from
the list, select R 3.6.3. The same R installer version is available for download for Mac
OS X. For those using Linux as their operating system, follow the steps outlined on the
website. You will be prompted to save the installer file. You can opt to save it in
your “Downloads” folder.

Figure 3. Screenshot of webpage for Step 3 of the installation process.

Step 4. Once the installer file has been downloaded, you can see it at the bottom left
portion of your screen. Click on the icon to start the installation process. Follow
the on-screen prompts and accept the default settings.

How to Install RStudio

Step 1. Download the RStudio installer file from the site
https://www.rstudio.com/products/rstudio/download/

Just like the R software, the necessary installer files for RStudio for the different operating
systems (Windows, Mac OS X, and Linux) are available at the website.

Figure 4. The RStudio website.

Step 2. Click on the appropriate download link for the installer file depending on your
laptop's or computer's operating system. Again, assuming most of us are on
Windows, we click on the download link for Windows 10/8/7. You will be prompted
to save the installer file. Again, you can opt to save it in your “Downloads” folder.

Figure 5. Saving the RStudio installer file on the “Downloads” folder.

Step 3. The installer file will be downloaded and you can see the installer icon at the
bottom left part of your screen. Click on the installer icon to proceed with the
installation process. Follow the on-screen prompts and accept the default
settings.

Check that R and RStudio are working.

Figure 6 shows the R interface. It is a simple graphical user interface (GUI) where you can
enter R commands directly in the R Console window. This works for simple computations
but not for real data analysis, which should be well documented and which we may want
to repeat, in the same or a slightly modified way, on a different dataset. It is therefore
recommended to write the R commands used in an analysis in a text file.
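As a quick check that the installation works, you can type a couple of commands in the console. This is only a sketch; the object names r_version and check are made up for illustration, and the exact version text depends on the release you installed.

```r
# Report which version of R is running. If you installed the release
# suggested in this module, the text should mention 3.6.3.
r_version <- R.version.string
print(r_version)

# A trivial calculation confirms that the console evaluates commands.
check <- 1 + 1
print(check)   # 2
```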

Figure 6. The R Console window.

For data analysis of any complexity, it is good practice to use a text editor or an
integrated development environment (IDE). This is where RStudio comes into play.

RStudio is an IDE, that is, a user interface that adds a more user-friendly and
streamlined working environment to R. Figure 7 shows the RStudio interface.

Figure 7. The RStudio interface.

The Four Panes of the RStudio Interface.

1. The Console pane. This is where the R software is running. It is considered the
heart of RStudio. In Figure 7, the console shows that the running version of R is R 3.6.3.
You can type commands directly in the console. You can see in the above figure
that a simple arithmetic calculation was performed in the console.

2. The Script or Source pane. This pane is where you can write scripts of R commands
as well as make notes about the analysis that you are going to perform. When a
command is run in this pane, it is sent over to the console pane where the
command is executed. Multiple scripts can be opened in this pane, and each
script will have its own tab. If this pane is not shown initially, you can
open it by clicking on File → New File → R Script.

3. The Environment and History pane. The Environment tab shows all R objects that are
currently loaded or generated during the current session. The History tab, on the
other hand, contains a history of the R commands that were executed.

4. The Files, Plots, Packages, Help, and Viewer windows. The Files window shows a file
browser which, at startup, shows the current working directory. The Plots window
shows plots generated in the current session. It is empty immediately after starting
RStudio. In the Packages window, all packages installed on the system are shown.
The Help window provides several ways of getting help for both R and RStudio. The Viewer
window displays local websites or web applications.
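Much of what the panes display can also be requested with commands typed in the console. The following is a rough sketch of that correspondence; the object names demo_value, objects_in_session and installed are made up for illustration.

```r
# Environment tab: objects created in the session appear there.
# ls() lists the same objects by name.
demo_value <- 42
objects_in_session <- ls()
print(objects_in_session)

# Files window: it starts in the current working directory,
# which getwd() reports as text.
print(getwd())

# Packages window: it lists the installed packages.
# installed.packages() returns the same information as a table.
installed <- rownames(installed.packages())
print(head(installed))

# Plots and Help windows: in an interactive session, plot(1:10) would draw
# in the Plots window and help("mean") would open a page in the Help window.
```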

4.3. SOME RSTUDIO BASICS

We will be running all the R commands in RStudio. For this section, we are going to have a
feel of how it is to use RStudio and learn some of the basics in relation to its operation.

RStudio as a Basic Calculator

We can use R as a calculator. The following tables show how arithmetic and some
mathematical operations are written in R.

Arithmetic Operations

Arithmetic Operation    R syntax    Result

Addition                5 + 3       8
Subtraction             5 - 3       2
Multiplication          5 * 3       15
Division                5 / 3       1.666667

Exponents

Operation               R syntax    Result

Positive exponent       5^3         125
Negative exponent       5^(-3)      0.008
Fractional power        81^(1/2)    9
Square root             sqrt(81)    9

Mathematical Constants

Mathematical Constant   R syntax    Result

π                       pi          3.1415927
e                       exp(1)      2.7182818

In the RStudio console pane, execute the R commands for the examples in the
preceding tables. Locate the flashing cursor after the > symbol. Type each command
and press Enter after each one.
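For reference, the calculator examples can be entered as shown in the sketch below. One detail worth knowing: by default R prints 7 significant digits, so the constants will look slightly shorter in the console than in the table unless you ask for more digits with print() and its digits argument.

```r
5 + 3        # 8
5 - 3        # 2
5 * 3        # 15
5 / 3        # 1.666667
5^3          # 125
5^(-3)       # 0.008
81^(1/2)     # 9
sqrt(81)     # 9

# Default printing uses 7 significant digits.
print(pi)                   # 3.141593

# Asking for 8 significant digits reproduces the table values.
print(pi, digits = 8)       # 3.1415927
print(exp(1), digits = 8)   # 2.7182818
```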

Text and Error Messages in RStudio

Text is often referred to as “strings” in programming and even in Statistics. Strings
must be enclosed in quotation marks (" ") so that RStudio can tell them apart from a
command or function.

In the RStudio console pane, enter the phrase Hello World with and without quotation
marks to see the difference. What have you observed?

Now, try to perform a basic calculation involving a number and a text. Enter the command
5 - "two". What happened?

Error messages in RStudio mean that it wasn't able to execute the command. These errors
inform you that you need to check your command, and it might not be appropriate to
proceed to the next command until you are able to resolve the error.
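The sketch below shows both situations. Typing Hello World without quotation marks is rejected before anything runs (R reports an unexpected symbol), so it is only described in a comment here. For the subtraction, tryCatch() is used so that the error message is captured as text instead of stopping the script; the name msg is made up for illustration.

```r
# With quotation marks, R treats the phrase as text and simply prints it.
print("Hello World")

# Without quotation marks, R reads Hello and World as two object names
# placed side by side and reports an "unexpected symbol" error.

# Subtracting text from a number is an error. tryCatch() captures the
# error so that its message can be inspected.
msg <- tryCatch(
  5 - "two",
  error = function(e) conditionMessage(e)
)
print(msg)   # in an English locale: "non-numeric argument to binary operator"
```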

True/False Questions in RStudio (Comparing quantities or items)

You can ask RStudio true/false questions. These true/false questions in RStudio are actually
useful when we deal with 'objects'. The following are the commonly used commands in
RStudio in stating true/false questions. You can try these in the console pane of RStudio.

True/False Question                                   RStudio Command   Example

Are two items equal?                                  ==                1 == 2
Are two items not equal?                              !=                9 != 8
Is one item less than another item?                   <                 5 < 9
Is one item greater than another item?                >                 5 > 9
Is one item less than or equal to another item?       <=                15 <= 20
Is one item greater than or equal to another item?    >=                15 >= 20

For the “equal” and “not equal” True/False questions, you can also compare strings
as well as quantities. For example, "school" == "university".
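Entered in the console, the examples above give the following answers. R replies to every true/false question with either TRUE or FALSE.

```r
1 == 2                       # FALSE
9 != 8                       # TRUE
5 < 9                        # TRUE
5 > 9                        # FALSE
15 <= 20                     # TRUE
15 >= 20                     # FALSE

# Strings can be compared for equality as well.
"school" == "university"     # FALSE
"school" != "university"     # TRUE
```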

Creating and Modifying Objects in RStudio

As you move through the modules for this course, you will become familiar with the
assignment operator ( <- ), which assigns whatever is on the right of the symbol to the
name written on its left. We use this to create objects in RStudio.

1. Suppose we want to assign the quantity 15 to a certain variable, say, x. Also, we
want to assign the value -10 to y. We can now write the following commands in the
console pane of RStudio.

x <- 15    # Assigns the value 15 to x
y <- -10   # Assigns the value -10 to y

2. Now, notice the Environment pane. The two commands that you have entered and
executed in the console pane have created new objects called x and y, and these
new objects appear in the environment pane. Now, when you just type the object
names in the console, the value of these objects will be shown or given.

3. Since both these new objects contain a single numerical value, mathematical
operations as well as True/False questions can now be performed to these objects.

Since these objects contain numerical information, quotation marks are
not required when these objects are used in any command. Please do note
that RStudio is case-sensitive, that is, X is not the same as x.

Try entering each of these commands in the console pane and see what happens.
x + y; x - y; x * y; x / y; x == y; x != y; x > y; x < y; x >= y; x <= y

4. We can also build new objects by using other objects. Type each of the following
commands in the console pane. Then, print out the values of the new objects by just
typing each object's name.

a <- x + y; b <- x * y / a

Please note that object names cannot begin with a number and cannot be one of
R's reserved words, such as if, for, next, break, in, etc.
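Here is what the steps above produce when x is 15 and y is -10. The last two lines illustrate case sensitivity; exists() is used only to ask whether an object with that name has been created in the current workspace.

```r
x <- 15
y <- -10

x + y     # 5
x - y     # 25
x * y     # -150
x / y     # -1.5
x == y    # FALSE
x > y     # TRUE

# New objects built from existing ones.
a <- x + y        # 5
b <- x * y / a    # -150 / 5, which is -30
a
b

# Names are case-sensitive: x was created above, X was not.
exists("x", inherits = FALSE)   # TRUE
exists("X", inherits = FALSE)   # FALSE, assuming X was never created in this session
```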

Creating Vectors and Data Frames

Most often, it's not just a single value or number that is being assigned to an object in
RStudio. You can also assign a vector to an object. A vector is basically an ordered list of
items; a list in which the order of the items could matter. A vector could represent a single
column or a single row of entries within a spreadsheet in MS Excel.

Suppose you are going to the grocery store and would like to make a list of your grocery
items using RStudio. To generate a list of the items, we use the combine command, c( ).
The combine command makes a vector out of all the items inside its parentheses.

Enter this command in the console pane in order for RStudio to generate your vector of
grocery items.

grocery.list <- c("soap", "pork", "eggs", "flour", "milk")
grocery.list

If you want to modify the contents of an existing object, just use the original object's name
and the assignment operator ( <- ). For example, if you wish to replace the contents of object
x with a vector, you can use the following command:

x <- c(2, 5, 6, 8, 12)   # x is a vector with 5 elements

When you modify the contents of an object, you would notice in the environment pane
that the object's name does not change, but its value does.
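The sketch below shows a few ways to inspect a vector once it has been created. The functions length(), sum() and class(), and the square-bracket notation for picking out one element, are not introduced in this module and are included only as illustrations.

```r
grocery.list <- c("soap", "pork", "eggs", "flour", "milk")
length(grocery.list)     # 5 items in the vector
grocery.list[2]          # "pork", the item in the second position

x <- 15                  # x starts as a single number
x <- c(2, 5, 6, 8, 12)   # same name, new contents
length(x)                # 5
sum(x)                   # 33
class(x)                 # "numeric"
```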

In RStudio, data sets are usually stored as objects of the class data frame. A data frame is
composed of columns, where each column represents a particular variable, and rows,
where each row corresponds to one item or unit. We can create small data sets in RStudio
directly.

Suppose we gather data on different variables from 5 individuals. We use the following
commands to create the hypothetical data in RStudio.

# These commands create individual data vectors, each with 5 elements

salary <- c(35000, 52000, 45000, 68000, 37000)

education <- c("Bachelor's Degree", "MS Degree", "Bachelor's Degree",
               "PhD Degree", "MS Degree")

industry <- c("Government", "Banking", "Manufacturing", "R&D", "Government")

# Combine vectors to form a data frame

data <- data.frame(salary, education, industry)

data

# Output data frame

  salary         education      industry
1  35000 Bachelor's Degree    Government
2  52000         MS Degree       Banking
3  45000 Bachelor's Degree Manufacturing
4  68000        PhD Degree           R&D
5  37000         MS Degree    Government
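Once the data frame exists, a few commands help you check that it has the rows and columns you expect. The functions nrow(), ncol(), names(), mean(), str() and the $ notation are not covered in this module; they are shown here only as a sketch of what is possible.

```r
salary <- c(35000, 52000, 45000, 68000, 37000)
education <- c("Bachelor's Degree", "MS Degree", "Bachelor's Degree",
               "PhD Degree", "MS Degree")
industry <- c("Government", "Banking", "Manufacturing", "R&D", "Government")
data <- data.frame(salary, education, industry)

nrow(data)          # 5 rows, one per individual
ncol(data)          # 3 columns, one per variable
names(data)         # "salary" "education" "industry"

data$salary         # the salary column on its own, as a vector
mean(data$salary)   # 47400

# str() gives a compact summary of each column. How the text columns are
# described depends on the R version, so no output is shown here.
str(data)
```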

One great feature of RStudio is that it can store and work with multiple objects. Since it is
object-based, it can work on multiple spreadsheets of data as long as each of these
spreadsheets is assigned to an object. Hence, the number of data sets that you can work
with in a session of RStudio is only limited by the computer's memory and what you can
keep track of.

To learn more about the basics of R and RStudio, you can use the interactive tutorial
provided by the swirl package, which you can access anytime by following these steps.

Step 1: Install the 'swirl' package. Type the command in the console pane.
install.packages("swirl")
Step 2: Load the installed swirl package in RStudio.
library(swirl)
Step 3: You can now start the tutorial. Just enter the following command
and follow the on-screen instructions. Enjoy.
swirl()

The next time you continue with the tutorial, you don't need to install the swirl package
again. Once a package has been installed, it remains on your computer. All
you have to do is load the package and type the command to start the tutorial.
> library(swirl)
> swirl()
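If you are not sure whether a package is already installed, you can let R check before installing. The helper below is a hypothetical convenience function (the name ensure_installed is made up for illustration); it relies on requireNamespace(), which reports TRUE when a package is available. The tutorial itself is started only in an interactive session such as the RStudio console.

```r
# Hypothetical helper: install a package only when it is missing.
ensure_installed <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)   # needs an internet connection
  }
  # Report (invisibly) whether the package is now available.
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# Start the tutorial only when typing commands interactively.
if (interactive()) {
  ensure_installed("swirl")
  library(swirl)
  swirl()
}
```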

So, there it is! You have learned some of the basics regarding the use of RStudio and I hope
that at this point, you are even more excited to learn about other things that we can do
with the software, particularly in the field of Statistics. The other important concepts
regarding the use of RStudio will be introduced in the succeeding modules.

Now, before we move on to the next modules, let us first set up our
working directory in RStudio so that our data files stay organized.

Setting up working directory in RStudio

Even if you have used RStudio before, let us have a separate working directory
folder for our files for the course. Follow the steps outlined below.
Step 1. Open RStudio. Click on the File menu.
Step 2. Click on New Project…
Step 3. On the pop-up window as shown in Figure 8, click on New Directory.

Figure 8. Creating a New Directory in RStudio.

Step 4. Click on New Project on the next pop-up window shown in Figure 9.

Figure 9. Create a New Project

Step 5. On the Create New Project pop-up window, enter AE311 for the
Directory name. For the next entry, click on Browse and select your
main directory. It is suggested that you use a folder in drive D as the
main directory. After filling in these items, click on Create Project.

Figure 10. Entering Directory name

Step 6. Check your working directory by entering the command getwd() in
the RStudio console.
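A short sketch of this check is given below. The name project_dir is made up for illustration, and basename() and list.files() are additional functions not covered in this module. If the project from Step 5 is open, the folder name reported should be AE311.

```r
# Full path of the current working directory.
project_dir <- getwd()
print(project_dir)

# Only the last part of the path, that is, the folder name.
print(basename(project_dir))

# Files currently stored in the working directory.
print(list.files(project_dir))
```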

Congratulations! You just completed Module 4. You are now ready to use R for
statistical analysis. Take a break and get ready for Module 5.
