ECON 323

R Tutorial
Dr. Lucija Muehlenbachs
Reid Fortier
What We’ll Cover Today:

● Getting Started with R

● Reading data
● Types of data
● Storing and manipulating data
● Using packages
● Linear models
● Summary statistics
● Plotting data
Getting Started with R

● For those on their own computers, download R and RStudio here: RStudio Desktop - Posit

● R is required; RStudio is highly recommended

● With RStudio we can write and save scripts, view data frames, and have a
much easier interface to navigate
● Common coding etiquette:
○ Put your name and the date at the top of your script
○ Provide comments in the script by prefacing a line with #
○ Name variables/data/files clearly and concisely
○ Set a working directory for your files to be read from
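
The etiquette points above could look like this at the top of a script. This is only a sketch: the author name, date, and purpose are placeholders, and tempdir() stands in for your own project folder.

```r
# Author:  Your Name (placeholder)
# Date:    2024-01-15 (placeholder)
# Purpose: clean the gas price data and estimate a simple model

# Set the working directory so that relative file names are read from here.
# tempdir() is a stand-in; replace it with the path to your own folder.
setwd(tempdir())
getwd()  # confirm where R will look for files

# Clear, concise names are easier to follow than x1, x2, ...
gas_price <- c(1.25, 1.31, 1.28)  # made-up prices for illustration
```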
Reading Data

● R can read various file types, including

○ CSV files (read.csv())
○ TSV files (read_tsv(), from the readr package)
○ Other delimited files (read.delim() or read.table())
○ Excel files (read_excel() or read_xls(); install the readxl package first)
○ Stata .dta files (read_dta() from the haven package)
○ And more (SAS, SPSS, etc.)
● Entering a read command in the R console only prints the data; to keep it,
we assign the result to a data frame
○ I.e. instead of simply running read.csv("gasdata.csv"), we can store it in a data frame by running
mydata <- read.csv("gasdata.csv")
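
As a rough illustration of reading and storing a file, the sketch below first writes a tiny CSV to a temporary location so it can be run anywhere. The column names and values are made up; in practice you would point read.csv() at your own file.

```r
# Write a small CSV to a temporary file (made-up data for illustration).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("month,gasprice", "1,1.25", "2,1.31", "3,1.28"), csv_path)

# The file name must be a quoted string; <- stores the result as a data frame.
mydata <- read.csv(csv_path)

head(mydata)  # first rows of the data
str(mydata)   # column names and types
```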
Types of Data

● Strings: character data, treated as text rather than numbers (e.g. “Hello World!”, “Alberta”,
“100”, etc.)
● Numeric: integers and floating point numbers (doubles)
● Factors: data that is characterized by levels/categories. Numeric and string
variables can be encoded as factor variables with the factor() command

In short, string variables will typically be for descriptive data. If they are needed for
mathematical/empirical applications, they must be encoded as factors.
Numeric/factor variables are typically for more “raw” data that we will work with.
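
A short sketch of how these types show up in R, using made-up values. class() reports the type, and factor() and as.numeric() convert between them.

```r
province <- c("Alberta", "Ontario", "Alberta")  # string (character) data
price    <- c(1.25, 1.31, 1.28)                 # numeric data
text_num <- "100"                               # looks numeric, but is a string

class(province)  # "character"
class(price)     # "numeric"
class(text_num)  # "character"

# Encode the strings as a factor so R treats them as categories.
province_f <- factor(province)
levels(province_f)  # "Alberta" "Ontario"

# Convert a numeric-looking string before doing arithmetic with it.
as.numeric(text_num) + 1  # 101
```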
Storing Data/Variables

● R stores data in multiple forms:

○ Data frames
○ Matrices
○ Vectors
■ Can be manually inputted (e.g. myvector <- c(1, 2, 3))
○ Lists
● To save a dataset as a data frame in R, use the <- operator
○ E.g. mydata <- read.csv("welldata.csv")
○ Alternatively, you can combine multiple vectors of equal length into a single data frame with
data.frame() (e.g. mydata <- data.frame(vector1, vector2, vector3)); cbind() on plain vectors returns a matrix instead
● To save a column from a data frame as its own vector, use the same process
○ E.g. price <- mydata$gasprice
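
The sketch below builds a data frame from vectors and pulls one column back out. The vector names and values are invented for illustration. It also shows why data.frame() is the safer choice for combining vectors: cbind() on plain vectors produces a matrix.

```r
# Three made-up vectors of equal length.
well_id  <- c("A1", "B2", "C3")
gasprice <- c(1.25, 1.31, 1.28)
output   <- c(100, 250, 175)

# data.frame() keeps each column's own type (character, numeric, ...).
mydata <- data.frame(well_id, gasprice, output)
is.data.frame(mydata)  # TRUE

# cbind() on plain vectors gives a matrix, not a data frame.
as_matrix <- cbind(gasprice, output)
is.matrix(as_matrix)   # TRUE

# Extract one column as its own vector with $.
price <- mydata$gasprice
```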
Data Manipulation
● Using the $ operator, we can add variables to our dataset
○ E.g. mydata$logprice <- log(mydata$price)
● To keep or remove variables from our data, we can use subset()
○ E.g. mydata <- subset(mydata, select = c(variable1, variable2)) to keep variable1 and
variable2
○ mydata <- subset(mydata, select = -c(variable1, variable2)) to drop variable1 and variable2
● Sometimes we may have missing data that we cannot impute. To remedy this
we can, if desired, delete these observations
○ The package tidyr is useful for this using the command drop_na()
○ E.g. mydata <- mydata %>% drop_na(variable). If no variable is provided, the command drops
every row that has a missing value in any column.
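
A minimal sketch of these steps on a made-up data frame. The base R lines (is.na() and na.omit()) are substitutes added here so the example runs without extra packages; the tidyr calls run only if tidyr happens to be installed.

```r
# Made-up data with some missing values (NA).
mydata <- data.frame(
  well  = c("A1", "B2", "C3", "D4"),
  price = c(1.25, NA, 1.28, 1.40),
  depth = c(900, 1200, 1100, NA)
)

# Add a new variable with $.
mydata$logprice <- log(mydata$price)

# Keep only some variables, or drop some variables.
kept    <- subset(mydata, select = c(well, price))
dropped <- subset(mydata, select = -c(depth, logprice))

# Base R: drop rows where price is missing, or rows with any missing value.
no_na_price <- mydata[!is.na(mydata$price), ]
no_na_any   <- na.omit(mydata)

# tidyr equivalents, run only if the package is available.
if (requireNamespace("tidyr", quietly = TRUE)) {
  no_na_price_tidy <- tidyr::drop_na(mydata, price)
  no_na_any_tidy   <- tidyr::drop_na(mydata)
}
```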
Data Manipulation

● We can also use subset() to create new data frames without erasing the
original from our environment:
○ mysubset <- subset(mydata, subset = criteria), where criteria is a logical expression indicating
which observations to keep and which to remove
○ Comparison operators used to build the logical expression:
■ >: strictly greater than (< for strictly less than)
■ >=: greater than or equal to (<= for less than or equal to)
■ !=: not equal to
■ ==: equal to
■ E.g. mysubset <- subset(mydata, subset = Date > "1990-01-01"), assuming Date is stored as a date in YYYY-MM-DD form
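
The operators above can be tried on a small made-up data frame. The column names and values are assumptions for illustration, and the & (and) operator in the last subset is an addition that the list above does not cover.

```r
# Made-up data: four observations with a Date column stored as Date.
mydata <- data.frame(
  Date  = as.Date(c("1985-06-01", "1990-01-01", "1995-03-15", "2001-11-30")),
  price = c(0.90, 1.10, 1.25, 1.60)
)

# Each call returns a new data frame; mydata itself is left unchanged.
after_1990 <- subset(mydata, subset = Date > as.Date("1990-01-01"))
from_1990  <- subset(mydata, subset = Date >= as.Date("1990-01-01"))
not_cheap  <- subset(mydata, subset = price != 0.90)

# Conditions can be combined with & (and).
both <- subset(mydata, subset = Date > as.Date("1990-01-01") & price < 1.50)

nrow(mydata)  # still 4
```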
Data Manipulation

● For your assignment, you will need to aggregate observations to a monthly level
● The command aggregate() is useful for this as it allows us to specify which
variables to keep, which variable to aggregate over, and the function we want
to aggregate by (taking the mean, summing, etc.)
● E.g. aggData <- aggregate(cbind(X1, X2) ~ aggVar, data = mydata, FUN = mean)
○ X1 and X2 are the variables (columns of mydata) to aggregate; cbind() is needed because
c(X1, X2) would stack them into one long vector
○ aggVar is the variable we are aggregating over (examples could be by well cohort or by
month)
○ FUN takes the function by which we are aggregating (mean takes the mean of the chosen
variables, sum adds the variables together, max returns the maximum value, etc.)
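
As a hedged sketch of monthly aggregation, the example below uses made-up daily observations and builds a month variable with format(). The column names (date, price, output, month) are assumptions, not the assignment's actual variables.

```r
# Made-up observations: two in January 2020, two in February 2020.
mydata <- data.frame(
  date   = as.Date(c("2020-01-05", "2020-01-20", "2020-02-03", "2020-02-25")),
  price  = c(2.00, 4.00, 3.00, 5.00),
  output = c(10, 30, 20, 40)
)

# Build the variable to aggregate over: year-month as text, e.g. "2020-01".
mydata$month <- format(mydata$date, "%Y-%m")

# Mean of price and output within each month.
aggData <- aggregate(cbind(price, output) ~ month, data = mydata, FUN = mean)
aggData
```

Swapping FUN = mean for FUN = sum would give monthly totals instead of monthly averages.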
Using Packages

● Any packages you want to use in R need to be both installed on the local disk
and loaded in R
● To install packages, go to the Packages tab in RStudio (in the lower-right pane
by default) and click Install, or run install.packages() in the console
● To load packages, run the library() command with the package name
○ E.g. run library(tidyverse) before executing any commands from the tidyverse packages
● Some useful packages: readxl, haven, tidyverse (data cleaning), ggplot2
(data plotting; included in tidyverse), stargazer (output tables), lubridate (date formatting), etc.
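
One possible way to handle installation from a script rather than the Packages tab is sketched below. It only checks which packages are missing; the install and load lines are left commented out because installing needs an internet connection. The variable names are made up.

```r
# Packages this script would like to use.
wanted <- c("readxl", "haven", "tidyverse")

# requireNamespace() returns TRUE if a package is installed, FALSE otherwise.
installed    <- vapply(wanted, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- wanted[!installed]
missing_pkgs  # packages that still need to be installed

# Install once (uncomment to run), then load in every new R session:
# install.packages(missing_pkgs)
# library(tidyverse)
```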
Linear Models

● The lm() command is used for basic linear modelling (i.e. y = 𝛽0 + 𝛽1X + 𝜖)
● Hint: use help(lm) to find what arguments the function lm takes (useful for
when you remember the command name, but not the necessary arguments)
● E.g. mymodel <- lm(Y ~ X1 + X2, data = mydata, subset = criterion)
○ Note that we can choose to subset the data within our linear model without having to create a
new dataframe with the subset command
● For cross-sectional and time series data, lm() should be sufficient
○ glm() is more flexible as it can fit some limited dependent variable (LDV) models such as
logit and probit; tobit models need a separate package
○ plm() is useful for panel datasets and requires the plm package
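
A minimal sketch of lm() on simulated data, including the subset argument. The data-generating process, the year variable, and the seed are all made up for illustration.

```r
set.seed(323)  # make the simulated data reproducible
n <- 200

# Simulated data: Y depends linearly on X1 and X2 plus noise.
mydata <- data.frame(
  X1   = rnorm(n),
  X2   = rnorm(n),
  year = rep(c(1985, 1995), each = n / 2)
)
mydata$Y <- 1 + 2 * mydata$X1 - 0.5 * mydata$X2 + rnorm(n)

# Fit the model on all observations.
mymodel <- lm(Y ~ X1 + X2, data = mydata)

# Fit the same model on a subset without creating a new data frame.
recent <- lm(Y ~ X1 + X2, data = mydata, subset = year > 1990)

coef(mymodel)  # estimated intercept and slopes
nobs(recent)   # 100 observations used in the subset model
```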
Summarizing Data

● Once we run the lm() command in R and assign the result to a name, the model is
saved in our environment. We can see the model results using the summary() command:
○ summary(mymodel) returns the coefficients, standard errors, t-statistics, and p-values for all
included independent variables
● summary() is also used to report summary statistics of variables:
○ summary(X1) returns the min/max, quartiles, median, and mean of the variable X1
● Summary statistics can alternatively be called through their own function
○ mean(X1)
○ sd(X1)
○ var(X1)
○ max(X1)
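
The commands above can be tried on simulated data, as in the sketch below. The variables and the seed are made up. coef(summary(...)) is shown as one way to pull the coefficient table out as a matrix.

```r
set.seed(323)  # reproducible made-up data
X1 <- rnorm(50, mean = 10, sd = 2)
Y  <- 3 + 0.5 * X1 + rnorm(50)

mymodel <- lm(Y ~ X1)

# Full regression output: coefficients, standard errors, t-statistics, p-values.
summary(mymodel)

# Just the coefficient table, as a matrix.
coef_table <- coef(summary(mymodel))
colnames(coef_table)

# Summary statistics for a single variable.
summary(X1)  # min, 1st quartile, median, mean, 3rd quartile, max
mean(X1)
sd(X1)
var(X1)      # equals sd(X1)^2
max(X1)
```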
Graphical Analysis: Plotting Data

● Base R uses the command plot() to plot two data vectors against each other
● plot(x, y, type, main, xlab, ylab, xlim, ylim)
○ x and y are the variables to be plotted on the x- and y-axes, respectively
○ type gives the type of plot to be drawn (“l” for lines, “p” for points, “b” for both, etc.)
○ main is the title of the figure
○ xlab and ylab are the x- and y-axis labels, respectively
○ xlim and ylim give the range of values for the x and y variables to be restricted to in the plot
■ E.g. xlim = c(xmin, xmax)
● x and y are the only necessary arguments to be passed, but it is a good
convention to appropriately name and label your figures
● Additional options for your figures include colouring the plot, choosing the size
of points, and choosing a specific aspect ratio (help(plot)!)
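
A rough example of a labelled plot using the arguments above. The data are made up, and col and pch are two of the additional options (colour and point symbol).

```r
# Made-up monthly prices for one year.
month <- 1:12
price <- c(1.20, 1.25, 1.31, 1.28, 1.40, 1.55,
           1.60, 1.58, 1.45, 1.38, 1.30, 1.27)

plot(month, price,
     type = "b",                                # both points and lines
     main = "Monthly Gas Price (made-up data)", # figure title
     xlab = "Month",
     ylab = "Price",
     xlim = c(1, 12),                           # range shown on the x-axis
     ylim = c(1.0, 2.0),                        # range shown on the y-axis
     col  = "blue",                             # colour of points and lines
     pch  = 19)                                 # solid circle as point symbol
```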
Graphical Analysis: Other Visualizations

● Histograms: use hist()

● Box-and-whisker plots: use boxplot()
● Bar plots: use barplot()
● Pie charts: use pie()
● Remember to use the help() command to find arguments for each of these commands
● As noted earlier, the package ggplot2 is great for making nice looking figures,
but is a bit more complicated to learn.
● To save your figures, click on Export in the lower right of RStudio in the figure
preview window
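
The sketch below calls each of these commands on made-up data. The last part shows a scripted alternative to the Export button, using pdf() and dev.off() from base R; the file name and the temporary folder are placeholders.

```r
set.seed(323)  # reproducible made-up data
price  <- rnorm(100, mean = 1.5, sd = 0.2)
region <- rep(c("North", "South"), each = 50)
counts <- table(region)  # number of observations per region

hist(price, main = "Histogram of Price", xlab = "Price")
boxplot(price ~ region, main = "Price by Region", ylab = "Price")
barplot(counts, main = "Observations per Region")
pie(counts, main = "Share of Observations")

# Save a figure from a script: open a file device, draw, then close it.
pdf_path <- file.path(tempdir(), "price_hist.pdf")
pdf(pdf_path)
hist(price, main = "Histogram of Price", xlab = "Price")
dev.off()
```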
Additional Topics We Can Cover if Time Permits

● Using ggplot2 for nicer looking figures

● Plotting multiple trends on top of each other
● Logging output
