R Seminar 1


Introduction to R

Introduction to R: A programming language for data analysis

David Leiva

Quantitative Psychology Section

Session 1

David Leiva Quantitative Psychology Section



Contents

Introduction to R
The R Environment
RStudio: An IDE for R
Getting started with R
R Help System
R packages
Exercises


The R Environment

What is R?

R is the GNU implementation of the S programming language.
It is available for Windows, Linux/UNIX and macOS.
You can download it from CRAN through the R Project web page:
http://www.r-project.org/


Some advantages of using R:
It is free and open source.
Hundreds of manuals, tutorials and examples are available, most of them for free.
Thousands of additional packages are available in the CRAN repository.
It allows users to produce excellent graphs.
It allows users to create new functions and packages.

But R also has some drawbacks:
Its graphical user interface (GUI) is quite limited.
It mainly works through a command-line interface (CLI).
Learning it can be cumbersome for novice users.
What is this R code for?

x <- rnorm(100)
hist(x, col = "red")

[Figure: "Histogram of x"; x-axis: x, y-axis: Frequency]


Working with R

Once installed, R can be started by clicking its desktop icon.

The R GUI consists of:
1 Main menu.
2 R console.
3 Other windows.


Typing R commands on the Console vs. using a New Script:

[Screenshots: R Console and R Script Editor]


A text editor with syntax highlighting for R code is more suitable.
Many compatible editors have been developed since R first appeared: Tinn-R, Vim, gedit, Kate, WinEdt, and so on.
I strongly recommend RStudio, a really nice multiplatform IDE.


RStudio: An IDE for R

Installing and running RStudio

Available at http://www.rstudio.com/.


The interface consists of 4 main panels.


RStudio allows users to create, edit and manage:
R scripts.
Sweave files (a tool for combining R and LaTeX).
R Markdown files (a package that combines R and Markdown).
Projects (a combination of data, scripts and output reports).
R packages.
A cheatsheet for the RStudio IDE is also available.


Getting started with R

R: an object-oriented programming language

How does R work?


x <- sqrt(9) # What does sqrt mean?
x # What is x?
[1] 3

y # Why does this not work?

As we will see, R works with a remarkable variety of objects: arrays, matrices, lists, data frames, plots...


Working with Data

c() function
students <- c(84,74,67,90,78)
students
[1] 84 74 67 90 78

Functions for computing the mean, minimum, maximum, length or variance can be applied to this numeric vector.
Main drawbacks:
All values in a vector must share the same data type.
Typing values by hand is only suitable for small vectors.
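As a brief illustration of the points above, the following sketch applies the summary functions to the students vector and shows what happens when data types are mixed. The object name mixed is only an illustrative choice.

```r
students <- c(84, 74, 67, 90, 78)

mean(students)    # arithmetic mean: 78.6
min(students)     # smallest value: 67
max(students)     # largest value: 90
length(students)  # number of elements: 5
var(students)     # sample variance (n - 1 in the denominator): 78.8

# A vector holds a single data type, so mixing types forces a conversion:
mixed <- c(84, "A")
class(mixed)      # "character"
```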


scan() function
income <- scan()
# COPY THESE DATA:
# 12 11 10 14 15 16 18 23 24 27 13 21 12
# THEN ENTER

This function is also appropriate for small data sets.
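Interactive input cannot be reproduced in a script, so a possible alternative is to pass the values through the text argument of scan(). This is only a sketch of the same idea; the quiet = TRUE option is added here merely to suppress the "Read N items" message.

```r
# Same values as above, supplied as a string instead of typed at the prompt
income <- scan(text = "12 11 10 14 15 16 18 23 24 27 13 21 12", quiet = TRUE)

length(income)  # 13 values were read
class(income)   # "numeric"
```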


: operator, rep() and seq() functions


1:5 # Sequence 1 to 5
[1] 1 2 3 4 5
(seq(1,5)) # The same
[1] 1 2 3 4 5
(rep(c("A","B"),c(2,3)))
[1] "A" "A" "B" "B" "B"
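Both seq() and rep() accept further arguments that are often useful. The sketch below shows a few of them; the object names are illustrative only.

```r
odd_numbers <- seq(1, 9, by = 2)           # step of 2: 1 3 5 7 9
quarters    <- seq(0, 1, length.out = 5)   # 5 equally spaced values: 0.00 0.25 0.50 0.75 1.00
alternating <- rep(c("A", "B"), times = 2) # whole vector repeated: "A" "B" "A" "B"
blocks      <- rep(c("A", "B"), each = 2)  # each element repeated: "A" "A" "B" "B"
```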


Importing data

read.table() function
df1 <- read.table(file="visualMemory.txt",header=TRUE)
head(df1)

  before after difference
1      1    11         10
2      2    11          9
3      3    11          8
4      3    10          7
5      2    11          9
6      1    12         11
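For readers who do not have visualMemory.txt at hand, the following self-contained sketch writes a few rows to a temporary file and reads them back. It assumes a whitespace-separated file with a header row, which is what the call above suggests; the names tmp_txt and demo_txt are hypothetical.

```r
# Create a small whitespace-separated file with a header line
tmp_txt <- tempfile(fileext = ".txt")
writeLines(c("before after difference",
             "1 11 10",
             "2 11 9",
             "3 11 8"), tmp_txt)

# header = TRUE tells read.table() that the first line holds the variable names
demo_txt <- read.table(file = tmp_txt, header = TRUE)
str(demo_txt)   # a data frame with 3 observations of 3 variables

unlink(tmp_txt) # remove the temporary file
```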


read.csv() function
df2 <- read.csv(file="visualMemory.csv",
header=TRUE,sep=";")
head(df2)

  before after difference
1      1    11         10
2      2    11          9
3      3    11          8
4      3    10          7
5      2    11          9
6      1    12         11
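Semicolon-separated files can also be read with read.csv2(), whose defaults are sep = ";" and dec = ",". The sketch below compares both calls on a small temporary file; the file contents and object names are made up for illustration, and the two results coincide here only because the data contain no decimals.

```r
# Create a small semicolon-separated file
tmp_csv <- tempfile(fileext = ".csv")
writeLines(c("before;after;difference",
             "1;11;10",
             "2;11;9",
             "3;11;8"), tmp_csv)

via_csv  <- read.csv(file = tmp_csv, header = TRUE, sep = ";")
via_csv2 <- read.csv2(file = tmp_csv)   # sep = ";" and dec = "," by default

all.equal(via_csv, via_csv2)            # compare the two data frames

unlink(tmp_csv)                         # remove the temporary file
```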


read_sav() function
#install.packages('haven')
library(haven)
df3 <- read_sav(file="visualMemory.sav")
head(df3,n=3)

# A tibble: 3 x 3
  before after difference
   <dbl> <dbl>      <dbl>
1      1    11         10
2      2    11          9
3      3    11          8


read_xlsx() function
#install.packages('readxl')
library(readxl)
df4 <- read_xlsx(path="visualMemory.xlsx",sheet=1)
colnames(df4)

[1] "before" "after" "difference"


Data in R: miscellanea

[] operator
Accessing vector elements: vector[index]
students[1]
[1] 84
Accessing data frame (and matrix) elements: dataframe[row,column]
df1[1,2]
[1] 11

NOTE: The [[]] operator can also be used, for example with lists.
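A few more indexing patterns, sketched on small made-up objects (demo_df and demo_list are hypothetical names, not part of the course data):

```r
students <- c(84, 74, 67, 90, 78)
first_third <- students[c(1, 3)]        # several positions at once: 84 67
without_1st <- students[-1]             # negative index drops an element: 74 67 90 78
above_80    <- students[students > 80]  # logical condition: 84 90

demo_df <- data.frame(before = c(1, 2, 3), after = c(11, 11, 10))
cell       <- demo_df[1, 2]        # row 1, column 2: 11
second_row <- demo_df[2, ]         # empty column index keeps all columns
after_col  <- demo_df[, "after"]   # columns can be selected by name: 11 11 10

demo_list <- list(scores = students, label = "group A")
element <- demo_list[[1]]  # [[ ]] extracts the element itself (a numeric vector)
sublist <- demo_list[1]    # [ ] returns a list of length one
```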


$ operator
Accessing data frame variables: dataframe$variable
df1$before

 [1] 1 2 3 3 2 1 1 2 3 2 1 1 2
[14] 3 2 3 1 1 1 2 2 1 2 1 2 3
[27] 2 2 1 1

NOTE: The $ operator can also be used with lists.


Other functions for accessing data sets


Accessing data frame variables: dataframe["variable"].
Accessing data frame variables through the attach() function. Nevertheless, its use should be avoided.
A more convenient way: with(dataframe, command).
Some built-in packages include data sets. They can be loaded with the data() function.
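The options listed above can be compared on a built-in data set. This sketch uses attitude, from the datasets package that ships with R; the object names are illustrative.

```r
data(attitude)                               # load a built-in data set

mean_dollar <- mean(attitude$rating)         # explicit reference with $
mean_with   <- with(attitude, mean(rating))  # with() evaluates the expression inside the data frame

one_column <- attitude["rating"]    # single brackets return a one-column data frame
as_vector  <- attitude[["rating"]]  # double brackets return the variable as a vector
```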


A first look at environments

Dealing with objects


search() shows the path along which the R interpreter looks for existing objects.
New environments can be created and accessed in R sessions.
ls() lists all objects in a specified environment. NOTE: objects() is equivalent.
rm() removes objects from a specified environment. To delete all objects in the Workspace: rm(list=ls()).
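A minimal sketch of these functions working on a newly created environment rather than the Workspace. The names my_env, a and b are hypothetical.

```r
search_path <- search()    # first entry is ".GlobalEnv", i.e. the Workspace

my_env <- new.env()                    # create a new, empty environment
assign("a", 1, envir = my_env)         # create objects inside it
assign("b", 2, envir = my_env)

before_rm <- ls(envir = my_env)        # "a" "b"
rm(list = "a", envir = my_env)         # remove one object from that environment
after_rm <- ls(envir = my_env)         # "b"

value_b <- get("b", envir = my_env)    # retrieve an object by name: 2
```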


WARNING: Masking objects can be problematic.


sd <- 2
sd
stats::sd

What is the sd object: a scalar or a function?


Try not to mask existing objects when creating new ones.
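The following sketch explores the question above. It relies on the fact that, when a name is used in a function call, R skips objects that are not functions; the name spread is only illustrative.

```r
sd <- 2                          # creates a scalar that masks the name sd

masked_is_number <- is.numeric(sd)        # TRUE: the name now finds the scalar first
original_is_fun  <- is.function(stats::sd) # TRUE: the function is still in its package

spread <- sd(c(1, 2, 3))         # the call still works and returns 1

rm(sd)                           # remove the masking object
restored <- is.function(sd)      # TRUE: sd refers to stats::sd again
```

Even though the call still works here, code such as sapply(x, sd) would pick up the scalar instead, which is why masking is best avoided.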


Working directory
A default directory is used in any R working session when
loading and saving data files and scripts.
Where is the working directory? getwd().
How can I change it? setwd(DIRNAME).
RStudio allows users to easily deal with:
Objects in global environment (i.e. Workspace).
Directories and files.
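A short sketch of changing the working directory and restoring it afterwards. It uses the session's temporary directory so that nothing is written elsewhere; note.txt is a made-up file name.

```r
old_dir <- getwd()                  # remember the current working directory

setwd(tempdir())                    # switch to the session's temporary directory
writeLines("hello", "note.txt")     # relative paths now refer to the new directory
was_created <- file.exists("note.txt")
unlink("note.txt")                  # clean up

setwd(old_dir)                      # go back to where we started
```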


R Help System

Help Documentation

R Online Help System


With thousands of packages containing thousands of functions, a good documentation system is necessary.
R online help includes information on functions, packages, data sets, classes and methods...
Its structure is defined by convention, and all R developers are expected to follow it.
The help() function and the ? operator give access to R documentation.
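Some typical calls, as a sketch. The result of help() is stored in an object here only so that the example runs quietly in a script; at the console you would normally type the call directly and the help page would open.

```r
# At the console, these are equivalent ways of opening the page for mean():
#   help("mean")
#   ?mean
# Documentation index for a whole package:
#   help(package = "stats")

mean_help <- help("mean")   # stored instead of displayed
class(mean_help)            # printing this object is what displays the help page
```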


Help structure: functions


Description: What the function does.
Usage: How to call the function.
Arguments: Input parameters and options in detail.
Details: A more detailed explanation of the function, its algorithms, related functions and additional information.
Value: What the function returns as output.
References: Optional.
Examples: Good examples are very useful for learning how the function works.


Vignettes
Another form of documentation is the so-called vignette.
Vignettes offer a more detailed description of packages.
vignette(package="xtable")
vignette("xtableGallery")


Other available manuals


Some manuals are included in the basic installation of R.
Some of these manuals are the standard references for many R developers.
The help.start() function gives users access to these manuals.
Fuzzy searches can be done with help.search() or, alternatively, with ??.


Users’ discussion groups and forums


R-project: https://www.r-project.org/help.html
Stack Overflow: http://stackoverflow.com/questions/tagged/r
RStudio Community: https://community.rstudio.com/
R-bloggers: https://www.r-bloggers.com/
R-UCA: http://knuth.uca.es/moodle/


apropos() function
Suppose you are interested in drawing a boxplot of the frontal lobe size variable (FL) included in the crabs data set.
The apropos() function finds every object whose name matches the given string.
apropos("boxplot")

Now you can check the documentation of the boxplot() function.
RStudio offers handy syntax auto-completion and a pop-up system with information about objects.
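Putting the steps together, a possible sketch follows. It assumes that crabs is the data set distributed with the MASS package, which is normally installed alongside R; the axis label and title are illustrative choices.

```r
matches <- apropos("boxplot")   # names of visible objects containing "boxplot"
matches

library(MASS)                   # assumption: crabs is taken from MASS
boxplot(crabs$FL,
        ylab = "Frontal lobe size (mm)",
        main = "crabs: FL")
```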


example() function
Function documentation generally includes examples.
They can be run by means of the example() function.
example(boxplot)


R packages

R repositories

R itself and add-on packages live in repositories.

The main repositories are:
1 CRAN
2 Bioconductor
3 R-Forge
4 Omegahat
These repositories are replicated on mirrors around the world.


Number of packages in CRAN


The number of contributed packages has grown exponentially over the last decade.
Currently there are 18,244 packages.
The main problem is keeping up to date with so many new releases.


Installing and loading packages

There are some questions you need to bear in mind:
1 What package do you need?
2 Is it installed?
3 Is it loaded in the working session?
Let's work with the psych package.


install.packages() function
Contributed packages must be installed before they can be used. Steps:
1 install.packages("psych"). NOTE: The package name is placed between quotation marks.
2 Choose a mirror.
The default repository is CRAN, but users can change it.
The installed.packages() function checks which packages are already installed.
The update.packages() function looks for new versions of installed packages and then installs them.
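The "Is it installed?" check can be automated. The helpers below, is_installed() and install_if_missing(), are hypothetical names written for this sketch; they are not part of R. The actual installation call is left commented out because it needs an internet connection.

```r
# TRUE if the package name appears among the installed packages
is_installed <- function(pkg) {
  pkg %in% rownames(installed.packages())
}

# Install a package only when it is not already available
install_if_missing <- function(pkg) {
  if (!is_installed(pkg)) {
    install.packages(pkg)
  }
  invisible(is_installed(pkg))
}

is_installed("stats")            # a base package, so this is TRUE
# install_if_missing("psych")    # uncomment to install psych if needed
```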


library() function
Every time you need a package in a new R working session, you have to load it.
This is done with library(PACKAGE). NOTE: Quotation marks are not required.
To find out which packages are already loaded in the current session, use the loadedNamespaces() function.
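A small sketch of loading a package and checking the result. The tools package is used only because it ships with every R installation; the same calls apply to psych once it is installed.

```r
library(tools)                                   # load and attach the package

is_attached <- "package:tools" %in% search()     # TRUE: it is now on the search path
is_loaded   <- "tools" %in% loadedNamespaces()   # TRUE: its namespace is loaded

# Namespaces can be loaded without being attached (e.g. as dependencies):
loaded_not_attached <- setdiff(loadedNamespaces(),
                               sub("^package:", "", search()))
```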


Exercises

Exercise 1

caregivers.sav contains simulated data on a sample of informal caregivers of relatives with dementia and non-dementia related problems. Tasks:
a) Import the data into R, creating a new data frame called caregivers.
b) Check the data frame structure:
Sample size
Type of variables
Presence of missing data
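As a hint for task b), the sketch below shows functions that could be used for these checks. It runs on a made-up data frame called toy, not on the caregivers data, and the variable names are invented.

```r
# A small invented data frame with one missing value
toy <- data.frame(age   = c(54, 61, NA, 47),
                  group = factor(c("dementia", "other", "dementia", "other")))

n_rows    <- nrow(toy)            # sample size: 4
var_types <- sapply(toy, class)   # type of each variable
str(toy)                          # compact overview of the whole structure
n_missing <- colSums(is.na(toy))  # number of missing values per variable
has_missing <- anyNA(toy)         # TRUE if there is any missing value
```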


Exercise 2

The file qlife.xls includes simulated data about quality of life in children with Congenital Limb Defects (CLD). Tasks:
a) Import the data into R, creating a new data frame called quality.
b) Check the data frame structure:
Sample size
Type of variables
Presence of missing data


Exercise 3

Install the stargazer package, check its help documentation and look for information regarding the following:
a) What is the main purpose of this package?
b) What is the main function included in this package?
c) Check the Usage section for this function. How many
arguments does it have?
d) Create a summary table for dataset attitude, printing 3
decimal digits, to be exported to a LaTeX file and to an
HTML file.
