R Programming 101 Part 1


R Programming 101:

Nuts and Bolts


Jocelyn Mara
Discipline of Sport and Exercise Science
R
• A free programming language and software environment
• Primarily used for statistical computing and graphics
• Uses command line interface for most processes
• RStudio is a graphical interface for R (but still relies heavily on the command line)
• Runs on Windows, macOS and Linux
• Users can use the built-in functions or create their own

https://www.r-project.org
First step

Download and install R and RStudio using the guide provided
RStudio
The Prompt >

• Informally stands for “what’s next”


• R is waiting for you to give it some instructions
Calculations in R
• We can use R as a calculator
> 4 + 3
[1] 7

> 20 / 5
[1] 4

> 5 * 4
[1] 20

> 64 - 57
[1] 7

> 8 ^ 2
[1] 64
Value assignment
• The <- symbol is the assignment operator

> x <- 7
> print(x)
[1] 7
> x
[1] 7

• The [1] is the index of the first value printed on that line: x is a vector of length one and 7 is its first element


Value assignment

> x <- 1:20
> x
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

• The : operator is used to create integer sequences
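
A minimal sketch of the : operator, using made-up values rather than anything from the slides. The variable names (up, down, n, a, b) are only for illustration. It also shows a common trap: : is evaluated before arithmetic, so 1:n - 1 and 1:(n - 1) are different sequences.

```r
# Integer sequences with the : operator
up <- 1:5         # 1 2 3 4 5
down <- 5:1       # 5 4 3 2 1 (counts down when the left value is larger)
class(up)         # "integer"

# : is evaluated before -, so these two lines give different results
n <- 4
a <- 1:n - 1      # same as (1:4) - 1, i.e. 0 1 2 3
b <- 1:(n - 1)    # 1 2 3
```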


Value assignment
> x <- 7
> x + 3
[1] 10
> x + y
Error: object 'y' not found
> y <- 3
> x + y
[1] 10
Objects
• Anything we manipulate/analyse/encounter in R is an object
• Single values (e.g. x <- 7)
• Vectors (e.g. numeric vectors, matrices, data frames)
• Plots
Object Classes
Classes in R describe the type of values within an object
• Numeric (real numbers, e.g. 2.73)
• Integer (whole numbers, e.g. 2, 7, 68)
• Factor (categories, e.g. 1 = Male, 2 = Female)
• Logical (TRUE/FALSE)
• Character (e.g. "Hello World")
• Complex (e.g. 2 + 1i)
Object Classes
• If you want a number to be an integer you need to use the suffix ‘L’
Object Classes
• If you want to check the class of an object you can use the class function

> class(x)
[1] "numeric"
> class(y)
[1] "integer"
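
The definitions of x and y are not shown with the output above, so the values below are an assumption. This is a small sketch of how the L suffix changes the class of a number without changing its value.

```r
x <- 7       # stored as numeric (double) by default
y <- 7L      # the L suffix stores the value as an integer

class(x)     # "numeric"
class(y)     # "integer"

same_value <- x == y            # TRUE: the values are equal
same_object <- identical(x, y)  # FALSE: the classes differ
```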
Vectors
• Vectors are objects that contain multiple values of the same class (lists and data frames are the exceptions)

> x <- rnorm(n = 20)
> x
 [1]  1.31  2.35  0.91 -1.06 -0.54  0.86 -0.15  1.19 -1.40 -0.60 -1.44 -1.70  2.02
[14] -0.50  1.72  0.23 -0.61 -2.78  1.41 -1.57
> class(x)
[1] "numeric"
Vectors

> y <- x > 0
> y
 [1]  TRUE  TRUE  TRUE FALSE FALSE  TRUE FALSE  TRUE FALSE FALSE FALSE
[12] FALSE  TRUE FALSE  TRUE  TRUE FALSE FALSE  TRUE FALSE
> class(y)
[1] "logical"
Creating Vectors
• Use the c() function to create vectors (combine values)

> x <- c(12.3, 27.8)
> x
[1] 12.3 27.8

> x <- c(TRUE, FALSE, TRUE)
> x
[1]  TRUE FALSE  TRUE

> x <- c("This", "is", "Fun!")
> x
[1] "This" "is"   "Fun!"
Mixing Classes
• When values of different classes are mixed in a vector, coercion occurs so
that every element in the vector is of the same class

> x <- c(12.3, TRUE, "foo")
> class(x)
[1] "character"
> x
[1] "12.3" "TRUE" "foo"

> x <- c(TRUE, 1.7, FALSE)
> class(x)
[1] "numeric"
> x
[1] 1.0 1.7 0.0
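
The two examples above follow one ordering: logical gives way to integer, integer to numeric, and numeric to character. Here is a short sketch with made-up values (the variable names are only for illustration) that checks each step, plus one practical use of the TRUE-becomes-1 rule.

```r
# Each pair is coerced to the more general of the two classes
step1 <- class(c(TRUE, 2L))      # "integer"
step2 <- class(c(2L, 3.5))       # "numeric"
step3 <- class(c(3.5, "foo"))    # "character"

# TRUE becomes 1 and FALSE becomes 0, so sum() counts the TRUE values
hits <- c(TRUE, FALSE, TRUE, TRUE)
n_hits <- sum(hits)              # 3
```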
Explicit Coercion
• Objects can be explicitly coerced from one class to another using the as.* functions
> x <- 0:6
> class(x)
[1] "integer"
> as.numeric(x)
[1] 0 1 2 3 4 5 6
> as.logical(x)
[1] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
> as.character(x)
[1] "0" "1" "2" "3" "4" "5" "6"
Explicit Coercion
• A coercion that doesn’t make sense will result in NAs

> x <- c("a", "b", "c")
> as.numeric(x)
[1] NA NA NA
Warning message:
NAs introduced by coercion
> as.logical(x)
[1] NA NA NA
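
Coercion works element by element, so a vector can be partly converted. A hedged sketch with made-up strings: only the elements that cannot be interpreted become NA. suppressWarnings() is used here just to keep the example quiet; in real work the warning is a useful signal.

```r
mixed <- c("1", "2.5", "abc")
nums <- suppressWarnings(as.numeric(mixed))   # 1.0 2.5 NA

# as.logical() understands strings such as "TRUE", "false" and "T", but not "yes"
flags <- as.logical(c("TRUE", "false", "T", "yes"))   # TRUE FALSE TRUE NA

bad_positions <- which(is.na(nums))   # 3
```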
Matrices
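• A matrix is a vector with a dimensions attribute (nrow, ncol)
• Here x is the numeric vector of 20 values created with rnorm() on the Vectors slide; matrix() fills the values in column by column

> mat <- matrix(x, nrow = 5, ncol = 4)
> mat
      [,1]  [,2]  [,3]  [,4]
[1,]  1.31  0.86 -1.44  0.23
[2,]  2.35 -0.15 -1.70 -0.61
[3,]  0.91  1.19  2.02 -2.78
[4,] -1.06 -1.40 -0.50  1.41
[5,] -0.54 -0.60  1.72 -1.57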
Matrices
• Use the dim function to check the dimensions of a matrix

> mat <- matrix(x, nrow = 5, ncol = 4)
> dim(mat)
[1] 5 4
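
To make "a vector with a dimensions attribute" concrete, here is a small sketch using the integers 1 to 6 (made-up values, not the rnorm data above). Setting dim on a plain vector gives the same object as matrix(). The byrow argument of matrix() is shown as an alternative to the default column-by-column fill.

```r
v <- 1:6
dim(v) <- c(2, 3)      # adding a dim attribute turns the vector into a 2 x 3 matrix

m <- matrix(1:6, nrow = 2, ncol = 3)                      # filled column by column
same <- identical(v, m)                                   # TRUE

m_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)    # filled row by row

m[1, 2]        # 3 (row 1, column 2 of the column-wise fill)
m_row[1, 2]    # 2 (row 1, column 2 of the row-wise fill)
```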
Dataframes
• Are lists of equal-length vectors (one per column), with variable names and row names stored as attributes

• Arranged with each column as a variable and each row a case

> df <- as.data.frame(mat)

> df
     V1    V2    V3    V4
1  1.31  0.86 -1.44  0.23
2  2.35 -0.15 -1.70 -0.61
3  0.91  1.19  2.02 -2.78
4 -1.06 -1.40 -0.50  1.41
5 -0.54 -0.60  1.72 -1.57
Dataframes
• Different columns can have different classes

• But all values within one column (variable) share the same class

> df
Subject Position Distance
1 Centre 1200
2 Back 1759
3 Forward 1680
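
The slides do not show how this data frame was built, so the following is one possible way to create it. The Subject values 1 to 3 are an assumption read off the printed output, and stringsAsFactors = FALSE is set explicitly so that Position is a character column in any R version.

```r
df <- data.frame(
  Subject  = 1:3,                              # assumed subject IDs
  Position = c("Centre", "Back", "Forward"),
  Distance = c(1200, 1759, 1680),
  stringsAsFactors = FALSE                     # keep Position as character
)

# One class per column: integer, character, numeric
col_classes <- sapply(df, class)

dim(df)      # 3 rows, 3 columns
names(df)    # "Subject" "Position" "Distance"
```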
Dataframes
• Use the dim function to check dimensions
• Use the names function to check variable names

> dim(df)

[1] 5 4

> names(df)

[1] "V1" "V2" "V3" "V4"


Lists
• Vectors that can contain elements of different classes

> x <- list(c(17.1, 23.2), TRUE, "a")
> x
[[1]]
[1] 17.1 23.2

[[2]]
[1] TRUE

[[3]]
[1] "a"
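
A short sketch contrasting list() with c() on the same values. With list() each element keeps its own class; with c() the values are flattened and coerced to a single class. The names element_classes, first and flat are only for illustration.

```r
x <- list(c(17.1, 23.2), TRUE, "a")

length(x)                               # 3 elements
element_classes <- sapply(x, class)     # "numeric" "logical" "character"
first <- x[[1]]                         # double brackets return the element itself

# The same values combined with c() are coerced to one class
flat <- c(c(17.1, 23.2), TRUE, "a")
class(flat)                             # "character"
```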
Attributes
• Names (variable names or dim names)
• Dimensions (nrow, ncol)
• Length (n values if vector with no dim or a matrix, ncol if dataframe)
• Class
Attributes
• Use the attributes function to check the attributes of an object

> attributes(df)
$names
[1] "V1" "V2" "V3" "V4"

$row.names
[1] 1 2 3 4 5

$class
[1] "data.frame"
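
The length rule from the Attributes list can be checked directly. This sketch uses the integers 1 to 20 in place of the rnorm values, purely so the result is reproducible.

```r
mat <- matrix(1:20, nrow = 5, ncol = 4)
df <- as.data.frame(mat)

length(mat)                 # 20: a matrix counts every value
length(df)                  # 4: a data frame counts its columns

mat_attrs <- names(attributes(mat))   # "dim"
attributes(mat)$dim                   # 5 4
```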
Missing Values
• Missing values represented by NA

• NaN (not a number) is used for undefined mathematical operations (e.g. 0/0)

• is.na( ) is used to test if there are missing values in an object

• is.nan( ) is used to test for NaN

• A NaN value is also NA, but an NA value is not NaN


Missing Values
> x <- c(1, 2, NA, 10, 3)
> is.na(x)
[1] FALSE FALSE TRUE FALSE FALSE
> is.nan(x)
[1] FALSE FALSE FALSE FALSE FALSE
> y <- c(1, 2, NaN, NA, 4)
> is.na(y)
[1] FALSE FALSE TRUE TRUE FALSE
> is.nan(y)
[1] FALSE FALSE TRUE FALSE FALSE
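
In practice the logical vector from is.na() is usually summarised rather than read by eye. A minimal sketch using the same x as above: it counts the missing values and shows that a single NA makes mean() return NA unless na.rm = TRUE is set.

```r
x <- c(1, 2, NA, 10, 3)

any_missing <- anyNA(x)          # TRUE
n_missing <- sum(is.na(x))       # 1

mean(x)                          # NA: the missing value propagates
m <- mean(x, na.rm = TRUE)       # 4: NA removed before averaging

is.nan(0 / 0)                    # TRUE: 0/0 is undefined
```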
Basic Functions
So far in this lesson I’ve used:

• class( )
• rnorm( )
• c( )
• as.numeric( )
• as.logical( )
• as.character( )
• as.data.frame( )
• matrix( )
• dim( )
• attributes( )
• is.na( )
Basic Functions
Rather than doing this to find the mean...

> (1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10) / 10
[1] 5.5

... I can do this...

> x <- 1:10
> mean(x)
[1] 5.5
Basic Functions
Some other examples:

• sd( )
• min( )
• max( )
• median( )
• range( )
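
A quick sketch of these functions applied to the values 1 to 10 (an assumed example vector, chosen to match the mean example).

```r
x <- 1:10

s <- sd(x)          # about 3.03 (sample standard deviation)
lo <- min(x)        # 1
hi <- max(x)        # 10
mid <- median(x)    # 5.5
r <- range(x)       # 1 10 (the minimum and maximum together)
```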
Basic Functions
Functions have this format...

function_name(arg1, arg2, ...)

Example:
mean(x, trim = 0, na.rm = FALSE)

• mean: the function name, describes what we’re doing
• x: the object we’re applying the function to
• trim = 0, na.rm = FALSE: other arguments
Function Arguments
• Functions have named arguments which sometimes have default values

mean(x, trim = 0, na.rm = FALSE)

• If I just typed mean(x), this would be equivalent to mean(x, trim = 0, na.rm = FALSE)
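
Defaults only matter once you override them. The sketch below uses a made-up vector with one extreme value to show what changing trim and na.rm does; the variable names are only for illustration.

```r
x <- c(1, 2, 3, 4, 100)

plain <- mean(x)                  # 22: defaults trim = 0, na.rm = FALSE
trimmed <- mean(x, trim = 0.2)    # 3: drops the lowest and highest 20% first

# na.rm = TRUE removes missing values before the mean is calculated
with_na <- c(x, NA)
cleaned <- mean(with_na, na.rm = TRUE)   # 22
```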


Argument Matching
• Function arguments can be matched by position or by name

• E.g. the following calls are all equivalent:

> mydata <- 1:20


> mean(x = mydata, trim = 0, na.rm = FALSE)
> mean(mydata, 0, FALSE)
> mean(na.rm = FALSE, trim = 0, x = mydata)
> mean(mydata, trim = 0, FALSE)
> mean(mydata)

But don’t mix the two styles more than you need to; unnamed arguments in unusual positions make a call hard to read
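
To see the matching rules at work, here is a sketch with a hypothetical helper called describe(), which is not part of R or the slides. It simply pastes its three arguments together so you can see which value ended up where. Named arguments are matched first, then the remaining unnamed values fill the remaining arguments in order.

```r
# Hypothetical helper, defined only to make the matching visible
describe <- function(first, second = "b", third = "c") {
  paste(first, second, third)
}

r1 <- describe("a")                     # "a b c": defaults used for second and third
r2 <- describe("a", third = "z")        # "a b z": the named argument skips over second
r3 <- describe(third = "z", "a", "y")   # "a y z": unnamed values fill first, then second
```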


Function Arguments
• To see the arguments for a function you can use the args( ) function

> args(lm)
function (formula, data, subset, weights, na.action, method = "qr", model = TRUE, x =
FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE, contrasts = NULL, offset, ...)
Function Arguments
• You can also use ?function-name to see more information about the function and its arguments

> ?mean
The “...” Argument
• Generic functions such as plot use “...” so that extra arguments can be passed on to other functions

> x <- rnorm(n = 20)
> y <- rnorm(n = 20)
> args(plot)
function (x, y, ...)
NULL
> plot(x, y)
• Extra arguments such as col are not listed by args(plot); they are passed through “...”

> plot(x, y, col = "red")


The “...” Argument
• The “...” argument is also necessary when the number of arguments
passed to the function is not known in advance

> args(paste)
function (..., sep = " ", collapse = NULL)
NULL
> paste("This", "is", "Fun", sep = " ", collapse = NULL)
[1] "This is Fun"
The “...” Argument
• The catch – any arguments coming after the “...” must be explicitly named

> paste("This", "is", "Fun"," ", NULL)


[1] "This is Fun "
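
A small sketch of both sides of “...”. The first part shows the catch with paste(): an unnamed value is treated as one more string to paste, while sep = must be named to act as the separator. The second part defines a hypothetical wrapper, mean_with_trim(), which is not from the slides and only illustrates how “...” forwards extra arguments to another function.

```r
# Named: "-" is used as the separator
named <- paste("This", "is", "Fun", sep = "-")     # "This-is-Fun"

# Unnamed: "-" is swallowed by ... and pasted as a fourth string
unnamed <- paste("This", "is", "Fun", "-")         # "This is Fun -"

# Hypothetical wrapper: anything passed in ... is handed on to mean()
mean_with_trim <- function(x, ...) {
  mean(x, trim = 0.2, ...)
}

res <- mean_with_trim(c(1, 2, 3, 4, 100, NA), na.rm = TRUE)   # 3
```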
Summary
• Value assignment

• Objects, classes, attributes

• Missing values

• Basic functions and their arguments
