R Tutorial

The document outlines a plan for learning R programming, covering topics such as using R as a calculator, creating and manipulating objects, loading data, generating graphics, and performing regressions. It includes examples of arithmetic operations, object creation, data frame manipulation, and basic plotting techniques. Additionally, it discusses simple and multiple linear regression, including how to interpret model outputs and diagnostic plots.

Uploaded by

yuruoqianqy

Today’s plan:

1. R as a calculator
2. Objects in R
3. Loading data
4. Graphics
5. Regressions
6. R markdown

1. Using R as a calculator (Arithmetic Operations)

5+3

## [1] 8
5/3

## [1] 1.666667
5^3

## [1] 125
5*(10-3)

## [1] 35
sqrt(4)

## [1] 2
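A few further operators behave as one would expect from a calculator. The lines below are an illustrative addition to the examples above, using only base R operators; the object names a, b, q, and r are chosen here for the example.

```r
# Exponentiation is evaluated before multiplication, and multiplication before addition
a <- 2 + 3 * 2^2      # evaluated as 2 + (3 * (2^2))
b <- (2 + 3) * 2^2    # parentheses change the order of evaluation

# Integer division and remainder
q <- 17 %/% 5         # integer quotient
r <- 17 %% 5          # remainder

c(a, b, q, r)         # show all four results together
```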

2 Creating and manipulating various objects in R

2.1 Objects
R can store information as an object.
• Objects are “shortcuts” to some piece of information or data.
• We use the assignment operator <- to assign some value to an object.
– < Left Angle Bracket
– - Dash
• We can store the result as an object, and then access the value by referring to the object’s name.
– either enter the object name and press Enter, or use the print( ) function
• Object names are case sensitive.
• In the upper-right window, called Environment, we will see the objects we created.

result <- 5+3
result

## [1] 8
print(result)

## [1] 8
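Case sensitivity is easy to overlook, so a short sketch may help. The second object name, Result, is made up for this illustration; ls(), rm(), and exists() are base R functions.

```r
result <- 5 + 3
Result <- 10          # a different object: R distinguishes upper and lower case

result                # the value stored under "result"
Result                # the value stored under "Result"

ls()                  # list the objects in the current environment
rm(Result)            # remove an object that is no longer needed
exists("Result")      # check whether an object with this name can still be found
```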

2.2 Vectors
A vector or a one-dimensional array simply represents a collection of information stored in a specific order.

x <- 1:5 # generate a sequence from 1 to 5
x

## [1] 1 2 3 4 5
y <- c(1,2,3,4,5) # use the function c( ), which stands for “concatenate,” to enter a data vector
y

## [1] 1 2 3 4 5

Indexing:
To access specific elements of a vector, we use square brackets [ ].
• Within square brackets, the dash, -, removes the corresponding element from a vector.
• Within square brackets, multiple elements can be extracted via a vector of indices.

x[2] # access the second element of the vector

## [1] 2
x[-2] # access all elements except for the second one

## [1] 1 3 4 5
x[c(1,3)] # access the first and the third elements

## [1] 1 3
x[1:3] # access the first three elements

## [1] 1 2 3
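Square brackets also accept a logical condition and can appear on the left-hand side of an assignment. This is a small sketch beyond the examples above; the object name big is chosen here for illustration.

```r
x <- 1:5
big <- x[x > 3]       # keep only the elements greater than 3
big

x[2] <- 10L           # replace the second element (10L is the integer 10)
x
```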

2.3 Functions
A function often takes multiple input objects and returns an output object.
• e.g., sqrt( ), print( ), and c( )
• funcname(input): funcname is the function name and input is the input object (arguments)
• To find out more information about a function (e.g. sqrt( )), type ?sqrt.

length(x) # display the length of vector x

## [1] 5
mean(x) # display the mean value

## [1] 3
max(x) # display the maximum value

## [1] 5
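Many functions take additional arguments that can be passed by name. The sketch below uses two base R functions, mean() and round(); the vector z and its values are made up for illustration.

```r
z <- c(2.5, NA, 4.25)              # NA marks a missing value

m1 <- mean(z)                      # NA, because one element is missing
m2 <- mean(z, na.rm = TRUE)        # drop missing values before averaging
r1 <- round(10 / 3, digits = 2)    # pass the second argument by name

m1
m2
r1
```

The help page of each function (for example ?mean) lists the arguments it accepts and their default values.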

3 Loading data sets into R

3.1 Working Directory
The working directory is the folder from which R loads data, and to which it saves data, by default.
1. getwd( ): get working directory
2. setwd( ): change working directory
• e.g. setwd("/Users/Desktop")
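A common pattern is to remember the current working directory before changing it, so that it can be restored afterwards. In this sketch the temporary folder returned by tempdir() stands in for a real project folder, and old_wd is a name chosen for the example.

```r
old_wd <- getwd()      # remember the current working directory
setwd(tempdir())       # switch to R's per-session temporary folder (example target only)
getwd()                # confirm the change
list.files()           # show the files in the new working directory
setwd(old_wd)          # switch back to where we started
```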

3.2 Data Files

• CSV or comma-separated values files: tabular data

• RData files: a collection of R objects

ads <- read.csv("Advertising.csv")

class(ads)

## [1] "data.frame"
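To try both file types without the Advertising.csv file at hand, one can write a small data frame to disk and read it back. Everything in this sketch is made up for illustration: the data frame toy, its values, and the file names; the files are written to R's temporary folder.

```r
# A tiny made-up data frame (not the advertising data)
toy <- data.frame(id = 1:3, score = c(4.5, 3.0, 5.0))

# CSV: plain-text tabular data
csv_path <- file.path(tempdir(), "toy.csv")
write.csv(toy, csv_path, row.names = FALSE)   # save as CSV without row names
toy_csv <- read.csv(csv_path)                 # read it back as a data frame
class(toy_csv)

# RData: one or more R objects stored under their own names
rdata_path <- file.path(tempdir(), "toy.RData")
save(toy, file = rdata_path)                  # save the object "toy"
rm(toy)                                       # delete it from the environment
load(rdata_path)                              # restores "toy" under its original name
toy
```

Note that read.csv() returns a value that must be assigned to a name, whereas load() recreates the saved objects directly.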

3.3 Data Frame Object

A data frame object is a collection of vectors of equal length, one per column (variable).

names(ads) # return variable names

## [1] "X" "TV" "radio" "newspaper" "sales"

nrow(ads) # return the number of rows

## [1] 200

ncol(ads) # return the number of columns

## [1] 5
dim(ads) # return the dimensions of the data

## [1] 200 5
summary(ads) # produce a summary of the data.

## X TV radio newspaper
## Min. : 1.00 Min. : 0.70 Min. : 0.000 Min. : 0.30
## 1st Qu.: 50.75 1st Qu.: 74.38 1st Qu.: 9.975 1st Qu.: 12.75
## Median :100.50 Median :149.75 Median :22.900 Median : 25.75
## Mean :100.50 Mean :147.04 Mean :23.264 Mean : 30.55
## 3rd Qu.:150.25 3rd Qu.:218.82 3rd Qu.:36.525 3rd Qu.: 45.10
## Max. :200.00 Max. :296.40 Max. :49.600 Max. :114.00
## sales
## Min. : 1.60
## 1st Qu.:10.38
## Median :12.90
## Mean :14.02
## 3rd Qu.:17.40
## Max. :27.00

Indexing (extraction):

• square brackets [ ]
• two indexes: one for rows and the other for columns
• call specific rows (columns) by either row (column) numbers/names.
• If we use row (column) numbers, the sequencing tools : and c() are useful.
• If we do not specify a row (column) index, then the syntax will return all rows (columns).
• To access an individual variable in a data frame: use the $ operator.

ads[2,"sales"] # extract the second row of the "sales" column

## [1] 10.4
ads_subset1 <- ads[1:3,] # extract the first three rows (and all columns)
ads_subset1

## X TV radio newspaper sales

## 1 1 230.1 37.8 69.2 22.1
## 2 2 44.5 39.3 45.1 10.4
## 3 3 17.2 45.9 69.3 9.3
ads_subset2 <- ads[c(1,3),] # extract the first and the third rows (and all columns)
ads_subset2

## X TV radio newspaper sales

## 1 1 230.1 37.8 69.2 22.1
## 3 3 17.2 45.9 69.3 9.3
ads_subset1[,"sales"] # extract the column called "sales" from ads_subset1

## [1] 22.1 10.4 9.3
ads_subset1$sales # another way of retrieving individual variable "sales": operator $

## [1] 22.1 10.4 9.3
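Rows can also be selected by a logical condition. The sketch below rebuilds the first three rows of the advertising data by hand from the output shown above, so that it runs without the CSV file; the name ads3 is chosen here for illustration.

```r
# First three rows of the advertising data, typed in from the printed output
ads3 <- data.frame(
  TV        = c(230.1, 44.5, 17.2),
  radio     = c(37.8, 39.3, 45.9),
  newspaper = c(69.2, 45.1, 69.3),
  sales     = c(22.1, 10.4, 9.3)
)

ads3[ads3$sales > 10, ]                 # rows where sales exceed 10, all columns
ads3[ads3$TV < 100, c("TV", "sales")]   # selected rows and selected columns
ads3[["sales"]]                         # same result as ads3$sales
```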

3.4 Packages
• R packages are collections of functions and data sets developed by the community.
• To use the package, we must load it into the workspace using the library() function.
• In some cases, a package needs to be installed before being loaded: we can use the install.packages()
function.

install.packages("foreign") # install package

library("foreign") # load package
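Installing a package is only needed once, so scripts often check first whether it is already available. The helper below is a hypothetical convenience function (load_or_install is not part of R); it relies on the base functions requireNamespace(), install.packages(), and library(). It is demonstrated with "stats", which ships with R, so that no download is triggered.

```r
# Hypothetical helper: install a package only if it is missing, then load it
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)                 # needs an internet connection and a CRAN mirror
  }
  library(pkg, character.only = TRUE)     # character.only: pkg holds the name as a string
}

load_or_install("stats")                  # "stats" comes with R, so nothing is installed
```

With the package from the example above, the call would be load_or_install("foreign").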

4. Graphics
4.1 Scatterplot

• The plot( ) function is the primary way to plot data in R.

• plot(x, y) produces a scatterplot of the numbers in x versus the numbers in y.

x <- rnorm(100) # generates a vector of 100 random normal variables

y <- rnorm(100)
plot(x, y) # scatterplot of x and y

[Figure: scatterplot of y against x]

plot(x, y, xlab = "this is the x-axis",
ylab = "this is the y-axis", main = "Plot of X vs Y") # add xlabel, ylabel, title

[Figure: scatterplot titled "Plot of X vs Y", with axes labeled "this is the x-axis" and "this is the y-axis"]

We return to the advertising data. To show the relationship between TV and sales in a scatterplot, we need
to specifically indicate the variable name using $.

plot(ads$TV, ads$sales)

[Figure: scatterplot of ads$sales against ads$TV]

Alternatively, we can use the attach( ) function in order to tell R to make the variables in this data frame
available by name.

attach(ads)
plot(TV, sales)
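Because attach() keeps the variables available until detach() is called, attached names can be confused with other objects of the same name (such as the x and y created earlier). Two alternatives that limit the lookup to a single call are with() and the data argument of the formula interface. This is a sketch on made-up data; toy, u, and v are names chosen for the example.

```r
set.seed(1)                               # make the random numbers reproducible
toy <- data.frame(u = rnorm(20), v = rnorm(20))

plot(toy$u, toy$v)                        # explicit $ notation
with(toy, plot(u, v))                     # u and v are looked up in toy for this call only
plot(v ~ u, data = toy)                   # formula interface with a data argument

m <- with(toy, mean(u))                   # with() works for any expression, not only plots
m
```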

4.2 Histogram

• The hist( ) function can be used to plot a histogram.

attach(ads) # tell R to make the variables in "ads" available by name
hist(sales) # histogram of the variable "sales"

[Figure: histogram titled "Histogram of sales" (x-axis: sales, y-axis: Frequency)]
hist(sales, col = 2, breaks = 15) # color 2 is red in the default palette; breaks = 15 requests about 15 bins (R treats the number as a suggestion)

[Figure: histogram titled "Histogram of sales" in red with narrower bins (x-axis: sales, y-axis: Frequency)]

5. Regressions
5.1 Simple Linear Regression
• We use the lm() function to fit a simple linear regression model.
• The basic syntax is lm(y ~ x, data), where y is the response, x is the predictor, and data is the data set
in which these two variables are kept.
• Again, we use $, the data argument, or attach() to tell R explicitly which data frame to use.
• We store the regression results in an object called lm.fit.
• If we type lm.fit, some basic information about the model is output.
• If we want to get detailed information about the model, we use the summary( ) function.

lm.fit <- lm(sales ~ TV, data = ads) # simple linear regression of sales on TV

attach(ads) # alternative
lm.fit <- lm(sales ~ TV)
lm.fit # basic information about the model

##
## Call:
## lm(formula = sales ~ TV)
##
## Coefficients:
## (Intercept) TV
## 7.03259 0.04754
summary(lm.fit) # detailed information about the model

##
## Call:
## lm(formula = sales ~ TV)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.3860 -1.9545 -0.1913 2.0671 7.2124
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.032594 0.457843 15.36 <2e-16 ***
## TV 0.047537 0.002691 17.67 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.259 on 198 degrees of freedom
## Multiple R-squared: 0.6119, Adjusted R-squared: 0.6099
## F-statistic: 312.1 on 1 and 198 DF, p-value: < 2.2e-16
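Some entries of the summary table can be reproduced by hand, which may help in reading it. The sketch below copies the estimates and standard errors from the output above; the value TV = 100 used for the prediction is an arbitrary illustration, and the object names are chosen here.

```r
# Estimates and standard errors copied from the summary output
b0 <- 7.032594; se_b0 <- 0.457843     # intercept
b1 <- 0.047537; se_b1 <- 0.002691     # slope on TV

# Each t value is the estimate divided by its standard error
t_b0 <- b0 / se_b0
t_b1 <- b1 / se_b1
round(c(t_b0, t_b1), 2)

# Fitted value of sales for a hypothetical TV value of 100
pred_100 <- b0 + b1 * 100
pred_100
```

With the fitted model object available, predict(lm.fit, newdata = data.frame(TV = 100)) should give the same fitted value up to rounding of the copied coefficients.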

• To obtain a confidence interval for the coefficient estimates, we can use the confint() command:
confint(lm.fit, level=0.95)

## 2.5 % 97.5 %
## (Intercept) 6.12971927 7.93546783
## TV 0.04223072 0.05284256
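The interval reported by confint() for a linear model is the estimate plus or minus a t quantile times the standard error. As a check, this sketch recomputes the intervals from the rounded numbers printed in the summary output above, so small differences in the last digits are expected; the object names are chosen for the example.

```r
# Estimates, standard errors, and residual degrees of freedom from the summary output
est <- c("(Intercept)" = 7.032594, TV = 0.047537)
se  <- c("(Intercept)" = 0.457843, TV = 0.002691)
df_resid <- 198

# For a 95% interval we need the 97.5% quantile of the t distribution
t_crit <- qt(0.975, df = df_resid)

ci <- cbind(lower = est - t_crit * se, upper = est + t_crit * se)
ci
```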

• We can plot sales and TV along with the least squares regression line using the plot() and abline()
functions.
• Diagnostic plots (optional):
– use the par() function with its mfrow argument to tell R to split the display screen into separate panels so
that multiple plots can be viewed simultaneously
– e.g., par(mfrow = c(2, 2)) divides the plotting region into a 2 × 2 grid of panels

plot(TV, sales)
abline(lm.fit)
[Figure: scatterplot of sales against TV with the least squares regression line]

par(mfrow = c(2, 2))
plot(lm.fit)

[Figure: 2 × 2 grid of diagnostic plots for lm.fit. Panels: Residuals vs Fitted, Q-Q Residuals, Scale-Location, Residuals vs Leverage. Observations 26, 36, and 179 are labeled.]

The diagnostic plots show residuals in four different ways:

1. Residuals vs Fitted. Used to check the linearity assumption. A roughly horizontal line without
distinct patterns indicates a linear relationship, which is good.
2. Normal Q-Q (shown as "Q-Q Residuals" in the output above). Used to examine whether the residuals are
normally distributed. It is good if the residual points follow the straight dashed line.
3. Scale-Location (or Spread-Location). Used to check the homogeneity of variance of the residuals
(homoscedasticity). A horizontal line with equally spread points is a good indication of homoscedasticity.
This is not the case in our example, where we have a heteroscedasticity problem.
4. Residuals vs Leverage. Used to identify influential cases, that is, extreme values that might influence
the regression results when included in or excluded from the analysis.
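The quantities behind these plots can also be extracted directly, and plot() can draw a single diagnostic panel through its which argument. This is a sketch on simulated data, not the advertising data: the noise is constructed to grow with x, so the Scale-Location panel should show an upward trend. All object names are chosen for the example.

```r
set.seed(42)                               # make the simulation reproducible
n <- 100
x <- runif(n, 0, 10)
y <- 1 + 2 * x + rnorm(n, sd = x / 2)      # noise grows with x: heteroscedastic by construction

fit  <- lm(y ~ x)
res  <- residuals(fit)                     # raw residuals
fv   <- fitted(fit)                        # fitted values
sres <- rstandard(fit)                     # standardized residuals

sum(res)                                   # numerically zero when the model has an intercept

plot(fit, which = 1)                       # Residuals vs Fitted only
plot(fit, which = 3)                       # Scale-Location only
```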

5.2 Multiple Linear Regression

• The syntax lm(y ~ x1 + x2 + x3) is used to fit a model with three predictors, x1, x2, and x3.
• The summary( ) function outputs the regression coefficients for all the predictors.
• We can access the individual components of a summary object by name.

lm.fit2 <- lm(sales ~ TV + radio + newspaper, ads)

summary(lm.fit2)

##
## Call:
## lm(formula = sales ~ TV + radio + newspaper, data = ads)
##
## Residuals:
## Min 1Q Median 3Q Max

## -8.8277 -0.8908 0.2418 1.1893 2.8292
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.938889 0.311908 9.422 <2e-16 ***
## TV 0.045765 0.001395 32.809 <2e-16 ***
## radio 0.188530 0.008611 21.893 <2e-16 ***
## newspaper -0.001037 0.005871 -0.177 0.86
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.686 on 196 degrees of freedom
## Multiple R-squared: 0.8972, Adjusted R-squared: 0.8956
## F-statistic: 570.3 on 3 and 196 DF, p-value: < 2.2e-16
names(summary(lm.fit2))

## [1] "call" "terms" "residuals" "coefficients"

## [5] "aliased" "sigma" "df" "r.squared"
## [9] "adj.r.squared" "fstatistic" "cov.unscaled"
summary(lm.fit2)$r.squared # the R-squared (the abbreviation $r.sq also works through partial matching)

## [1] 0.8972106
summary(lm.fit2)$fstatistic # the F-statistic

## value numdf dendf

## 570.2707 3.0000 196.0000
names(ads) # The data frame "ads" contains 5 variables.

## [1] "X" "TV" "radio" "newspaper" "sales"

lm(sales ~ ., ads) # a regression on all of the predictors
lm(sales ~ .-X, ads) # a regression excluding the predictor X
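To see what the dot and the minus sign do in a formula, one can compare the coefficient names of the two fits. The sketch uses a small simulated data frame in place of ads; the column id plays the role of the row index X, and all names are made up for illustration.

```r
set.seed(7)                                         # make the simulation reproducible
toy <- data.frame(id = 1:30, a = rnorm(30), b = rnorm(30))
toy$y <- 1 + 2 * toy$a - toy$b + rnorm(30)

fit_all  <- lm(y ~ ., data = toy)                   # all other columns as predictors
fit_noid <- lm(y ~ . - id, data = toy)              # all other columns except id

names(coef(fit_all))                                # includes "id"
names(coef(fit_noid))                               # "id" is dropped
```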

5.3 Interaction Terms

• The syntax TV:radio tells R to include an interaction term between TV and radio.
• The syntax TV*radio simultaneously includes TV, radio, and the interaction term TV×radio as
predictors.
– it is a short-hand for TV + radio + TV:radio

summary(lm(sales ~ TV*radio, data = ads))

##
## Call:
## lm(formula = sales ~ TV * radio, data = ads)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.3366 -0.4028 0.1831 0.5948 1.5246
##

## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.750e+00 2.479e-01 27.233 <2e-16 ***
## TV 1.910e-02 1.504e-03 12.699 <2e-16 ***
## radio 2.886e-02 8.905e-03 3.241 0.0014 **
## TV:radio 1.086e-03 5.242e-05 20.727 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9435 on 196 degrees of freedom
## Multiple R-squared: 0.9678, Adjusted R-squared: 0.9673
## F-statistic: 1963 on 3 and 196 DF, p-value: < 2.2e-16
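Two points may be worth checking by hand. First, the * shorthand and the written-out formula give the same fit; this is shown on simulated data with made-up names (toy, u, v). Second, in a model with an interaction the slope on TV depends on the level of radio; the helper tv_slope is a hypothetical function that plugs in the coefficients printed above.

```r
set.seed(3)                                   # make the simulation reproducible
toy <- data.frame(u = runif(50), v = runif(50))
toy$y <- 1 + 2 * toy$u + 3 * toy$v + 4 * toy$u * toy$v + rnorm(50, sd = 0.1)

fit_star <- lm(y ~ u * v, data = toy)         # shorthand
fit_long <- lm(y ~ u + v + u:v, data = toy)   # written out
all.equal(coef(fit_star), coef(fit_long))     # same coefficients

# Slope of sales with respect to TV at a given radio value,
# using the estimates for TV and TV:radio from the output above
tv_slope <- function(radio) 1.910e-02 + 1.086e-03 * radio
tv_slope(c(0, 10, 40))
```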

6. R Markdown (Optional)
install.packages("rmarkdown")

An R Markdown (.Rmd) file contains three types of content:
• A YAML (YAML Ain't Markup Language; originally Yet Another Markup Language) header surrounded by ---
– A YAML header contains YAML arguments, such as "title", "author", and "output".
• R code chunks
– All code chunks start and end with three backticks; for an R chunk, the opening line also carries {r}.
On your keyboard, the backtick can usually be found on the same key as the tilde (~).
• text, tables, figures, images
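To make the three parts concrete, the sketch below writes a minimal .Rmd file from R. The title, the text, the file name, and the use of the built-in cars data set are all made up for illustration. The render step is left commented out because it needs the rmarkdown package and pandoc to be installed.

```r
fence <- strrep("`", 3)                       # three backticks, built here to keep the example readable

rmd_lines <- c(
  "---",                                      # YAML header starts
  'title: "My first report"',
  "output: html_document",
  "---",                                      # YAML header ends
  "",
  "Some ordinary text.",                      # text
  "",
  paste0(fence, "{r}"),                       # code chunk starts
  "summary(cars)",
  fence                                       # code chunk ends
)

rmd_path <- file.path(tempdir(), "example.Rmd")
writeLines(rmd_lines, rmd_path)               # write the file to R's temporary folder
cat(readLines(rmd_path), sep = "\n")          # print the file to inspect it

# rmarkdown::render(rmd_path)                 # would knit the file to HTML
```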
