R - Lecture 2
Fundamentals
Ivan Belik
• We can give names to R objects and statements and recall them later
• Create variables
• Refer to variables
R: Basics
• Basic arithmetic example:
> 1 + 2
[1] 3
• Limitations:
• Please use descriptive names:
> Answer <- 3
• We can print assigned variables (or any results) using the print() function
• print() is often needed in an R script, where results or messages must be printed explicitly as part of your code:
> Answer <- 3
> print(Answer)
[1] 3
• In the console, of course, you can just type the name of the object and R will print it:
> Answer <- 3
> Answer
[1] 3
• In RStudio, it is easy to track which objects have been created: they are listed in the Environment pane.
• If you want to remove an object, use the rm() function
• Or you can use the “Clear objects” button in RStudio to remove all objects
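A minimal sketch of removing objects from the console. The object names Answer and Other are only illustrative.

```r
Answer <- 3
Other <- "temporary"

rm(Other)                                            # remove a single object by name
other_exists  <- exists("Other", inherits = FALSE)   # FALSE: the object is gone
answer_exists <- exists("Answer", inherits = FALSE)  # TRUE: other objects are untouched

# rm(list = ls()) would remove every object in the current environment at once
```

Calling ls() at any point lists the objects that currently exist.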
We have learned how to work with single values and statements
Now, let’s consider more interesting data objects that are very useful in practice
R: Objects
• R is an object-oriented language
• The most basic OBJECTS are:
1. Vectors:
- For example, this could be a vector of length 26 (i.e., one containing 26 elements), where each element is a character
Examples:
Numeric vector: 5.0 5.4 5.8 6.2 6.6 7.0 7.4 7.8 8.2 8.6 9.0
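The two kinds of vector described above can be sketched as follows. This assumes the 26-element character vector is the alphabet (R's built-in constant letters) and rebuilds the numeric example with seq(); the names Alphabet and Numbers are illustrative.

```r
# A character vector of length 26: the built-in constant `letters`
Alphabet <- letters
length(Alphabet)    # 26
mode(Alphabet)      # "character"

# The numeric vector from the example: 5.0 to 9.0 in steps of 0.4
Numbers <- seq(5, 9, by = 0.4)
length(Numbers)     # 11
mode(Numbers)       # "numeric"
```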
2. Matrices & Arrays:
- Matrices: two dimensional rectangular objects (special type of vector with two dimensions: rows and columns)
- Arrays: higher-dimensional rectangular objects (similar to a matrix, but can store data in more than two dimensions)
NOTE: All elements of matrices or arrays have to be of the same mode (i.e., type)
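A short sketch of a matrix, an array, and the same-mode rule. The object names and values are illustrative.

```r
M <- matrix(1:6, nrow = 2, ncol = 3)   # 2 rows, 3 columns, filled column by column
dim(M)                                 # 2 3

A <- array(1:24, dim = c(2, 3, 4))     # three dimensions
dim(A)                                 # 2 3 4

# Mixing types forces every element into a single mode
Mixed <- matrix(c(1, "a", TRUE, 4), nrow = 2)
mode(Mixed)                            # "character"
```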
3. Lists:
- Lists are like vectors but they do not have to contain elements of the same mode
Example 1 (descriptive):
- The first element of a list could be a vector of the 26 letters of the alphabet.
- The second element could contain a vector of all the prime numbers below 1000.
Example 2 (R-list):
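One possible way to write the descriptive example as an actual R list. IsPrime is a hypothetical helper written for this sketch, not a built-in function, and the name Example is illustrative.

```r
# Hypothetical helper: TRUE if n (an integer >= 2) is prime
IsPrime <- function(n) {
  n == 2 || all(n %% (2:max(2, floor(sqrt(n)))) != 0)
}

Primes  <- Filter(IsPrime, 2:999)    # all prime numbers below 1000
Example <- list(letters, Primes)     # a character vector and a numeric vector

length(Example)        # 2 elements
mode(Example[[1]])     # "character"
mode(Example[[2]])     # "numeric"
```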
4. Data Frames:
- They are two dimensional containers with rows corresponding to ‘observations’ and columns corresponding to ‘variables.’
Example:
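A small data frame as a sketch. The data (names, ages, results) are invented purely for illustration.

```r
Students <- data.frame(
  Name   = c("Anna", "Ben", "Carl"),
  Age    = c(21, 23, 22),
  Passed = c(TRUE, FALSE, TRUE)
)

nrow(Students)    # 3 observations (rows)
ncol(Students)    # 3 variables (columns)
Students$Age      # one variable, returned as a vector
```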
5. Functions:
- Functions are objects that take other objects as inputs and return some new object.
Example:
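A minimal sketch of a user-defined function. The name Square is illustrative.

```r
# Takes a numeric object as input and returns a new object
Square <- function(x) {
  x^2
}

Square(4)             # 16
Square(c(1, 2, 3))    # 1 4 9
class(Square)         # "function": functions are objects too
```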
R: Modes
• All objects have a certain mode (the type of data they can contain)
• Some objects can only deal with one mode at a time (e.g., matrices)
• Others can store elements of multiple modes (e.g., lists)
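The mode() function reports the mode of an object. A short sketch with illustrative values:

```r
mode(3)         # "numeric"
mode("a")       # "character"
mode(TRUE)      # "logical"

# A list keeps the mode of each element
ElementModes <- sapply(list(3, "a", TRUE), mode)
ElementModes    # "numeric" "character" "logical"
```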
R: Vectors
• So far, we have created only trivial vectors of length 1:
> Answer <- 3
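Longer vectors are usually built with the c() function, which combines its arguments into one vector. A minimal sketch, using the Vector1 that appears later in this lecture:

```r
Vector1 <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
Vector1
length(Vector1)    # 10
```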
• The c() function also works for character vectors:
> Vector2 <- c("a", "b", "c", "d")
> Vector2
[1] "a" "b" "c" "d"
• OR:
> Vector3 <- c("1", "2", "3", "4")
> Vector3
[1] "1" "2" "3" "4"
• You can combine several vectors into one longer vector using the c() function (it concatenates them):
> Vector4 <- c(Vector2, Vector3, Vector2, Vector2, Vector2)
> Vector4
 [1] "a" "b" "c" "d" "1" "2" "3" "4" "a" "b" "c" "d" "a" "b" "c"
[16] "d" "a" "b" "c" "d"
R: Vector operations
• Most standard mathematical functions work with vectors:
> Vector1 <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
> Vector1 + Vector1
 [1]  2  4  6  8 10 12 14 16 18 20
> log(Vector1)
 [1] 0.0000000 0.6931472 1.0986123 1.3862944 1.6094379 1.7917595 1.9459101
 [8] 2.0794415 2.1972246 2.3025851
• Here we are nesting the log() function inside the round() function:
> round(log(Vector1))
 [1] 0 1 1 1 2 2 2 2 2 2
• round() can take an optional argument, digits
• digits specifies the number of decimal places to round to (i.e., the number of digits after the decimal point)
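A short sketch of the digits argument, using the same Vector1 as above. The name Rounded is illustrative.

```r
Vector1 <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

# Round the logarithms to two decimal places
Rounded <- round(log(Vector1), digits = 2)
Rounded    # 0.00 0.69 1.10 1.39 1.61 1.79 1.95 2.08 2.20 2.30
```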
• A variety of built-in functions exist for working with vectors:
• Examples:
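A few commonly used built-in vector functions, as a sketch. This selection is an assumption and may differ from the functions originally listed on the slide.

```r
Vector1 <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

length(Vector1)    # number of elements: 10
sum(Vector1)       # 55
mean(Vector1)      # 5.5
min(Vector1)       # 1
max(Vector1)       # 10
sd(Vector1)        # standard deviation, about 3.03
rev(Vector1)       # elements in reverse order
```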
R: Simplifying Vector Creation
• Sometimes the c() function is not convenient
• For example, you may not want to type all the elements of a vector manually
• The colon operator generates a sequence of consecutive integers:
> 1:10
 [1]  1  2  3  4  5  6  7  8  9 10
• Assigning to an object:
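A minimal sketch; the object names Vector5 and Countdown are hypothetical.

```r
Vector5 <- 1:10       # store the sequence in an object
Vector5

Countdown <- 10:1     # the colon operator also counts downwards
Countdown
```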
• You can also use the seq() function:
> seq(0, 10, by = 2) # the 'by' argument lets you set the increment
[1]  0  2  4  6  8 10
> seq(0, 10, length.out = 25)
• The 'length.out' argument specifies the length of the vector
• The rep() function allows you to repeat things:
• Repeating Vector1 twice:
> Vector1
 [1]  1  2  3  4  5  6  7  8  9 10
> rep(Vector1, 2)
 [1]  1  2  3  4  5  6  7  8  9 10  1  2  3  4  5  6  7  8  9 10
• Repeating each element of Vector2 twice:
> Vector2
[1] "a" "b" "c" "d"
> rep(Vector2, each = 2)
[1] "a" "a" "b" "b" "c" "c" "d" "d"
R: Length
> Vector6 <- c("The", "Course", "Fellow", "is", "smart.")
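The length() function returns the number of elements in a vector. A minimal sketch with the same Vector6:

```r
Vector6 <- c("The", "Course", "Fellow", "is", "smart.")
length(Vector6)    # 5
```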
R: Indexing
• We reference the third element using square brackets:
> Vector6[3]
[1] "Fellow"
• To change elements:
> Vector6
[1] "The"    "Course" "Fellow" "is"     "smart."
> Vector6[5] <- "great."
> Vector6
[1] "The"    "Course" "Fellow" "is"     "great."
R: Logical Operators
• Logical operators come in handy when indexing:
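A sketch of indexing with logical operators. A comparison returns a vector of TRUE/FALSE values, and only the elements marked TRUE are kept. The values and names here are illustrative.

```r
Vector1 <- 1:10

Vector1 > 5                                      # a logical vector, one value per element

Big    <- Vector1[Vector1 > 5]                   # 6 7 8 9 10
Middle <- Vector1[Vector1 >= 3 & Vector1 <= 6]   # AND: 3 4 5 6
Ends   <- Vector1[Vector1 < 2 | Vector1 > 9]     # OR: 1 10
NotFive <- Vector1[Vector1 != 5]                 # everything except 5
```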
R: More Functions
• Let’s consider the following three functions:
• na.omit()
• subset()
• sample()
• Let’s create a new vector called V:
• Calling a function such as max() on V won't give a useful result, because many functions can't deal with NAs (they simply return NA)
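A sketch of the problem, using the values of V shown later in this lecture:

```r
V <- c(2, 3, 4, 3, NA, NA, 6, 6, 10, 11, 2, NA, 4, 3)

max(V)     # NA: the missing values propagate into the result
mean(V)    # NA
```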
• This is where the na.omit() function comes in.
> V
 [1]  2  3  4  3 NA NA  6  6 10 11  2 NA  4  3
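A minimal sketch of na.omit() applied to V. The name V_clean is hypothetical.

```r
V <- c(2, 3, 4, 3, NA, NA, 6, 6, 10, 11, 2, NA, 4, 3)

V_clean <- na.omit(V)    # drops the three NAs
length(V_clean)          # 11
max(V_clean)             # 11
mean(V_clean)            # about 4.909
```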
• Now, we can apply all those functions that break when they encounter NAs:
> V
 [1]  2  3  4  3 NA NA  6  6 10 11  2 NA  4  3
• The summary() function is a useful way to check whether NAs are present in your object:
> summary(V)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
  2.000   3.000   4.000   4.909   6.000  11.000       3
• The is.na() function is more powerful: it returns TRUE for every element that is NA
• Combined with the subset() function, it lets us remove the NAs manually
• The first argument you need to supply to subset() is the object you want to subset (i.e., V in our case)
> is.na(V)
 [1] FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE  TRUE
[13] FALSE FALSE
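A sketch of removing the NAs manually. The second argument of subset() is assumed here to be the logical condition !is.na(V), which keeps only the elements that are not NA; the name V_kept is hypothetical.

```r
V <- c(2, 3, 4, 3, NA, NA, 6, 6, 10, 11, 2, NA, 4, 3)

V_kept <- subset(V, !is.na(V))    # keep elements where is.na() is FALSE
V_kept                            # 2 3 4 3 6 6 10 11 2 4 3

# Plain logical indexing gives the same result
identical(V_kept, V[!is.na(V)])   # TRUE
```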
• The sample() function draws a random sample from a vector
• size = 10 means that we pick 10 numbers at random to form our sample
• replace = TRUE means that after we pick a number, we place it back in the bowl, so it CAN be selected again
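A sketch of sample() with and without replacement. The population 1:20 and the seed are assumptions chosen for illustration; set.seed() only makes the random draw repeatable.

```r
set.seed(1)    # assumption: fixed seed so the draw can be repeated

# Draw 10 numbers; the same number may appear more than once
WithReplacement <- sample(1:20, size = 10, replace = TRUE)

# Draw 10 numbers; each number can be picked only once
WithoutReplacement <- sample(1:20, size = 10, replace = FALSE)

length(WithReplacement)               # 10
anyDuplicated(WithoutReplacement)     # 0: no repeats are possible
```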
• The print() function
• We already know that the print() function prints an object to the screen
> print(0.2)
[1] 0.2
• You can paste multiple objects together with paste() and print them to the screen
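A minimal sketch of combining paste() with print(). The message text and the name Message are illustrative.

```r
Answer <- 3

Message <- paste("The answer is", Answer)    # joined with a space by default
print(Message)                               # [1] "The answer is 3"

paste("a", "b", sep = "-")                   # "a-b": a custom separator
paste0("x", 1:3)                             # "x1" "x2" "x3": no separator
```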
R: Lists
• A list is a data structure that can contain elements of different modes (i.e., types).
• A vector whose elements are all of the same type is called a VECTOR (i.e., an atomic vector), but a list may mix types.
• We created a list with three elements of the following types:
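A sketch of such a list. The list name a and the tags X, Y and Z follow the lecture; the stored values are assumptions chosen for illustration.

```r
a <- list(X = 1.5, Y = "text", Z = TRUE)

Types <- sapply(a, class)
Types       # X: "numeric", Y: "character", Z: "logical"
names(a)    # "X" "Y" "Z"
```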
• In our list a, we specified a tag for each element: “X”, “Y” and “Z”
• We can also create lists without any tags:
• Then, to retrieve the element (component of the list), we use indexes in single square brackets [ ]:
• To retrieve the content of the element, we use indexes in double square brackets [[ ]]:
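The difference between the two kinds of brackets can be sketched as follows. The list b and its values are hypothetical.

```r
b <- list(10, "ten", c(TRUE, FALSE))    # a list without tags

Element <- b[2]      # single brackets: a list of length 1
class(Element)       # "list"

Content <- b[[2]]    # double brackets: the content itself
class(Content)       # "character"
Content              # "ten"
```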
• We can use tags (if they were defined) instead of indexes to access the content of the element:
• This way:
• Or, instead of double square brackets [[ ]], we can use the $ sign to access the content of the element:
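A short sketch of both forms of tag access. The list values are assumptions chosen for illustration.

```r
a <- list(X = 1.5, Y = "text", Z = TRUE)

a[["Y"]]    # "text": tag inside double square brackets
a$Y         # "text": the same content with the $ sign
a["Y"]      # single brackets still return a list of length 1
```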
• A few words about tags:
R: LISTS: Modifying
• To modify a list, we reassign the element we want to change:
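A minimal sketch of reassignment by tag and by index. The list values are assumptions chosen for illustration.

```r
a <- list(X = 1.5, Y = "text", Z = TRUE)

a$X <- 100               # reassign by tag
a[["Y"]] <- "changed"    # reassign by tag in double brackets
a[[3]] <- FALSE          # reassign by index

a$X    # 100
a$Y    # "changed"
a$Z    # FALSE
```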
R: LISTS: adding
• To add a new component to the list we just need to declare a new component:
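A sketch of adding components by a new tag and by the next free index. The list values are assumptions chosen for illustration.

```r
a <- list(X = 1.5, Y = "text")

a$Z <- TRUE            # new tag: the list grows by one component
a[[4]] <- "fourth"     # next free index: another component is added

length(a)    # 4
a$Z          # TRUE
a[[4]]       # "fourth"
```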
• If we specify an index that is greater than the next expected index, R fills the gap with NULL elements:
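A sketch of what happens when an index is skipped. The list b is hypothetical.

```r
b <- list(1, 2)

b[[5]] <- "five"    # the next expected index would have been 3

length(b)           # 5
is.null(b[[3]])     # TRUE: the gap was filled with NULL
is.null(b[[4]])     # TRUE
```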
• We can also use the function append(x, values, after = ):
• values: to be added to x
• after = : index of the element after which the values will be added
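A minimal sketch of append() on a list. The list contents are illustrative.

```r
x <- list("a", "b", "d")

y <- append(x, list("c"), after = 2)    # insert after the second element
unlist(y)                               # "a" "b" "c" "d"

z <- append(x, list("e"))               # without 'after', values go to the end
unlist(z)                               # "a" "b" "d" "e"
```

append() returns a new list, so the result has to be assigned to keep it.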
R: LISTS: deleting
• To delete an element from the list, we can just assign the NULL value to it:
• We can also use indexing: a negative index means "don't include this element".
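Both ways of deleting can be sketched as follows. The list values are assumptions chosen for illustration.

```r
a <- list(X = 1.5, Y = "text", Z = TRUE)

a$Y <- NULL    # assigning NULL removes the element
names(a)       # "X" "Z"

a <- a[-1]     # negative index: everything except the first element
names(a)       # "Z"
```

A negative index returns a new list, so the result is assigned back to a.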
R: Interactive Input
• Use the readline() function to take input from the user (in an interactive session):
> A
[1] "20"
• We can convert Age back to string format:
> A
[1] "20"
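A sketch of the whole round trip: read a string, convert it to a number, and convert it back. The prompt text and the fallback value "20" are assumptions; readline() only waits for input in an interactive session, so the fallback is used when the code runs as a script.

```r
# Read the input as a string (fallback value when not interactive)
A <- if (interactive()) readline(prompt = "Enter your age: ") else "20"
class(A)                  # "character": readline() always returns a string

Age <- as.integer(A)      # convert the string to a number
Age + 1                   # arithmetic now works (21 for the input "20")

A <- as.character(Age)    # convert Age back to string format
class(A)                  # "character"
```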