R Language C4


Statistical language

4
Working in More Dimensions
In the previous chapters, you worked with one-dimensional vectors. The
data could be represented by a single row or column in a Microsoft Excel
spreadsheet. Many datasets, however, contain values of different types
for multiple variables and observations, so you need a two-dimensional
table to represent this data. In Excel, you would do that in a spreadsheet; in
R, you use a specific object called a data frame for the task.
In addition to vectors, R can represent matrices as objects you can work and
calculate with.
Discovering a new dimension
Vectors are closely related to a bigger class of objects, arrays. Arrays have
two very important features:
✓✓They contain only a single type of value.
✓✓They have dimensions.
The dimensions of an array determine the type of the array. You know
already that a vector has only one dimension. An array with two dimensions
is a matrix. Anything with more than two dimensions is simply called an
array. Technically, a vector has no dimensions at all in R: R returns NULL
if you call dim(), nrow(), or ncol() on a vector.
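A quick way to see this for yourself is to ask a plain vector for its dimensions. The following is only a small sketch; the object name x is an illustration and is not part of the original examples.

```r
# A plain vector carries no dim attribute
x <- 1:12

dim(x)     # NULL
nrow(x)    # NULL
ncol(x)    # NULL
length(x)  # 12, the only size information a vector has
```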
Creating your first matrix
Creating a matrix is almost as easy as writing the word: You simply use the
matrix() function. You do have to give R a little bit more information, though.
R needs to know which values you want to put in the matrix and how you
want to put them in. The matrix() function has several arguments for this:
✓✓data is a vector of values you want in the matrix.
✓✓ncol takes a single number that tells R how many columns you want.
✓✓nrow takes a single number that tells R how many rows you want.
✓✓byrow takes a logical value that tells R whether you want to fill the
matrix row-wise (TRUE) or column-wise (FALSE). Column-wise is the
default.
> first.matrix <- matrix(1:12, ncol = 4)
> first.matrix
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
 
> matrix(1:12, ncol = 4, byrow = TRUE)
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    9   10   11   12
Looking at the properties
You can look at the structure of an object using the str() function.
> str(first.matrix)
int [1:3, 1:4] 1 2 3 4 5 6 7 8 9 10 ...
The output tells you that first.matrix holds integers (int) in three rows and four
columns. If you want the number of rows and columns without looking at
the structure, you can use the dim() function.
> dim(first.matrix)
[1] 3 4
To get only the number of rows, you use the nrow() function. The ncol()
function gives you the number of columns of a matrix.
You can find the total number of values in a matrix exactly the same way
as you do with a vector, using the length() function:
> length(first.matrix)
[1] 12
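The relationship between these functions can be checked directly. This is a minimal sketch that re-creates first.matrix so it runs on its own; the names n.rows and n.cols are illustrative only.

```r
first.matrix <- matrix(1:12, ncol = 4)

n.rows <- nrow(first.matrix)  # 3
n.cols <- ncol(first.matrix)  # 4

# dim() returns both values at once, rows first
identical(dim(first.matrix), c(n.rows, n.cols))  # TRUE

# The total number of values is rows times columns
length(first.matrix) == n.rows * n.cols          # TRUE
```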
Combining vectors into a matrix
You can combine two vectors as the rows of a matrix with the rbind()
function:
> baskets.of.Granny <- c(12, 4, 5, 6, 9, 3)
> baskets.of.Geraldine <- c(5, 4, 2, 4, 12, 9)
> baskets.team <- rbind(baskets.of.Granny, baskets.of.Geraldine)
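To see what rbind() produced, you can inspect the result. The sketch below repeats the three lines above so that it is self-contained, and then looks at the dimensions and names of the new matrix.

```r
baskets.of.Granny <- c(12, 4, 5, 6, 9, 3)
baskets.of.Geraldine <- c(5, 4, 2, 4, 12, 9)
baskets.team <- rbind(baskets.of.Granny, baskets.of.Geraldine)

dim(baskets.team)       # 2 6: one row per vector
rownames(baskets.team)  # "baskets.of.Granny" "baskets.of.Geraldine"
colnames(baskets.team)  # NULL: the vectors had no element names
```

Note that rbind() takes the row names from the names of the vectors it receives.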
The cbind() function does something similar. It binds the vectors as
columns of a matrix:
> cbind(1:3, 4:6, matrix(7:12, ncol = 2))
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
Here you bind together three different nameless objects:
✓✓A vector with the values 1 to 3 (1:3)
✓✓A vector with the values 4 to 6 (4:6)
✓✓A matrix with two columns and three rows, filled column‐wise with the
values 7 through 12 (matrix(7:12, ncol = 2))
Extracting values from a matrix
Using numeric indices
Whereas a vector has only one dimension that can be indexed, a matrix has
two. You separate the indices for both dimensions by a comma — you give
the index for the row before the comma, and the index for the column after
the comma.
> first.matrix[1:2, 2:3]
     [,1] [,2]
[1,]    4    7
[2,]    5    8
R gives you an easy way to extract complete rows and columns from a
matrix. You simply don’t specify the other dimension.
> first.matrix[2:3, ]
     [,1] [,2] [,3] [,4]
[1,]    2    5    8   11
[2,]    3    6    9   12
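Dropping values using negative indices
Negative indices drop the corresponding rows or columns. For example, to drop
the second row and the third column:
> first.matrix[-2, -3]
     [,1] [,2] [,3]
[1,]    1    4   10
[2,]    3    6   12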
If you want to drop only the element at the second row and the third
column, you have to treat the matrix like a vector.
> nr <- nrow(first.matrix)
> # nr * 2 counts the values in the two columns before the element;
> # + 2 is its position within the third column
> id <- nr * 2 + 2
> first.matrix[-id]
 [1]  1  2  3  4  5  6  7  9 10 11 12
You can do this in one line, like this:
> first.matrix[-(2 * nrow(first.matrix) + 2)]
 [1]  1  2  3  4  5  6  7  9 10 11 12
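The same calculation can be generalized to any row and column. The function linear.index() below is a hypothetical helper written for this illustration, not something R provides; it assumes the matrix is read column by column, as described above.

```r
first.matrix <- matrix(1:12, ncol = 4)

# Hypothetical helper: position of element [row, col] when the matrix is
# read column by column as a vector
linear.index <- function(row, col, n.rows) {
  (col - 1) * n.rows + row
}

id <- linear.index(2, 3, nrow(first.matrix))  # 8
first.matrix[id] == first.matrix[2, 3]        # TRUE
first.matrix[-id]                             # all values except that one
```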
If you drop the first and third rows of the matrix, only one row remains, and
R returns a vector instead of a matrix:
> first.matrix[-c(1, 3), ]
[1]  2  5  8 11
By default, R always tries to simplify the objects to the smallest number of
dimensions possible when you use the brackets to extract values from an
array. To keep the result as a matrix, add the argument drop = FALSE:
> first.matrix[2, , drop = FALSE]
     [,1] [,2] [,3] [,4]
[1,]    2    5    8   11
Actually, the square brackets work like a function, and the row index and
column index are arguments for the square brackets.
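One way to make this visible is to call the bracket function by its name, written between backticks. This is a small sketch for illustration; in everyday code you would normally keep the bracket notation.

```r
first.matrix <- matrix(1:12, ncol = 4)

# The usual bracket notation ...
a <- first.matrix[2, 3]
# ... and the same extraction written as an explicit function call
b <- `[`(first.matrix, 2, 3)
identical(a, b)  # TRUE

# drop is simply one more argument of that function
one.row <- `[`(first.matrix, 2, 1:4, drop = FALSE)
dim(one.row)  # 1 4
```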
Replacing values in a matrix
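You replace values with the assignment operator, using the same indices you use
to extract them. For example, to set the value in the third row and second
column to 4:
> first.matrix[3, 2] <- 4
> first.matrix
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    4    9   12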
You can change an entire row or column of values by not specifying the other
dimension. Note that values are recycled.
> first.matrix[2, ] <- c(1, 3)
> first.matrix
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    1    3    1    3
[3,]    3    4    9   12
You can replace a subset of values within the matrix by another matrix. You don't
even have to specify the values as a matrix; a vector will do. R reads and writes
matrices column-wise by default, so the vector fills the subset column by column.
> first.matrix[1:2, 3:4] <- c(8, 4, 2, 1)
> first.matrix
     [,1] [,2] [,3] [,4]
[1,]    1    4    8    2
[2,]    1    3    4    1
[3,]    3    4    9   12
Naming matrix rows and columns
The matrix baskets.team from an earlier section already has some row names, taken
from the names of the vectors you combined with rbind(). It would be better if the
row names simply read "Granny" and "Geraldine".
> rownames(baskets.team) <- c("Granny", "Geraldine")
> rownames(baskets.team)
[1] "Granny" "Geraldine"
> colnames(baskets.team) <- c("1st", "2nd", "3th", "4th", "5th", "6th")
This gives you the following matrix:
> baskets.team
          1st 2nd 3th 4th 5th 6th
Granny     12   4   5   6   9   3
Geraldine   5   4   2   4  12   9
The third column name contains a typo ("3th"). You can correct a single name by
indexing the result of colnames():
> colnames(baskets.team)[3] <- "3rd"
> colnames(baskets.team)
[1] "1st" "2nd" "3rd" "4th" "5th" "6th"
If you want to get rid of either column names or row names, the only thing you need to do is
set their value to NULL.
> baskets.copy <- baskets.team
> colnames(baskets.copy) <- NULL
> baskets.copy
          [,1] [,2] [,3] [,4] [,5] [,6]
Granny      12    4    5    6    9    3
Geraldine    5    4    2    4   12    9
R stores the row and column names in an attribute called dimnames. Use the dimnames()
function to extract or set those values.
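As a sketch of how dimnames() can be used, the example below sets both sets of names in one step. It builds its own copy of baskets.team from the values used earlier so that it runs independently.

```r
baskets.team <- rbind(c(12, 4, 5, 6, 9, 3), c(5, 4, 2, 4, 12, 9))

# Set row and column names in one step: a list with the row names first
dimnames(baskets.team) <- list(
  c("Granny", "Geraldine"),
  c("1st", "2nd", "3rd", "4th", "5th", "6th")
)

dimnames(baskets.team)[[1]]  # same as rownames(baskets.team)
dimnames(baskets.team)[[2]]  # same as colnames(baskets.team)
```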
Using names as indices
These row and column names can be used just like you use names for values in a
vector.
> baskets.team[, c("2nd", "5th")]
          2nd 5th
Granny      4   9
Geraldine   4  12
Selecting a single row by name simplifies the result to a named vector:
> baskets.team["Granny", ]
1st 2nd 3rd 4th 5th 6th
 12   4   5   6   9   3
To keep the result as a matrix, add drop = FALSE:
> baskets.team["Granny", , drop = FALSE]
       1st 2nd 3rd 4th 5th 6th
Granny  12   4   5   6   9   3
Using standard operations with matrices
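The examples in this section use first.matrix with its original values, so
re-create it first:
> first.matrix <- matrix(1:12, ncol = 4)
> first.matrix + 4
     [,1] [,2] [,3] [,4]
[1,]    5    8   11   14
[2,]    6    9   12   15
[3,]    7   10   13   16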
You can use all other arithmetic operators in exactly the same way to perform an
operation on all elements of a matrix.
The addition of two matrices is the addition of their corresponding elements. So, you
need to make sure both matrices have the same dimensions.
Say you want to add 1 to the first row, 2 to the second row, and 3 to the third row of
the matrix first.matrix.
> second.matrix <- matrix(1:3, nrow = 3, ncol = 4)
> second.matrix
     [,1] [,2] [,3] [,4]
[1,]    1    1    1    1
[2,]    2    2    2    2
[3,]    3    3    3    3
> first.matrix + second.matrix
     [,1] [,2] [,3] [,4]
[1,]    2    5    8   11
[2,]    4    7   10   13
[3,]    6    9   12   15
If the dimensions of both matrices are not the same, R will complain and refuse to
carry out the operation, as you can see:
> first.matrix + second.matrix[, 1:3]
Error in first.matrix + second.matrix[, 1:3] : non-conformable arrays
But what happens if you add a vector instead of a matrix?
> first.matrix + 1:3
     [,1] [,2] [,3] [,4]
[1,]    2    5    8   11
[2,]    4    7   10   13
[3,]    6    9   12   15
R does not complain about the dimensions, and recycles the vector over the values of
the matrix! In fact, R treats the matrix as a vector by simply ignoring the
dimensions. So, in this case, you don't use matrix addition but simple (vectorized)
addition.
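Recycling only gives the row-wise result above because the vector is as long as one column. The following sketch, which re-creates both matrices so it runs on its own, compares the two kinds of addition and shows a case where recycling does something you might not expect. The names by.recycling and by.column are illustrative only.

```r
first.matrix <- matrix(1:12, ncol = 4)
second.matrix <- matrix(1:3, nrow = 3, ncol = 4)

# Recycling 1:3 down the columns gives the same result as the matrix sum
identical(first.matrix + 1:3, first.matrix + second.matrix)  # TRUE

# A vector of length 4 is still recycled column-wise, so this does NOT
# add one value per column
by.recycling <- first.matrix + 1:4
by.column <- first.matrix + matrix(1:4, nrow = 3, ncol = 4, byrow = TRUE)
identical(by.recycling, by.column)  # FALSE
```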
Calculating row and column summaries
A matrix is simply a vector with dimensions attached to it, so the summary functions
you use on vectors work on a matrix as a whole. You also can summarize the rows or
columns of a matrix using some specialized functions.
To get the total number of baskets each woman made during the last six games, use the
function rowSums():
> rowSums(baskets.team)
   Granny Geraldine
       39        36
The rowSums() function returns a named vector with the sums of each row. You can
get the means of each row with rowMeans(), and the respective sums and means of
each column with colSums() and colMeans().
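The other three functions work the same way. The sketch below builds baskets.team with the row names already in place so that it is self-contained.

```r
baskets.team <- rbind(
  Granny    = c(12, 4, 5, 6, 9, 3),
  Geraldine = c(5, 4, 2, 4, 12, 9)
)

rowMeans(baskets.team)  # average per player: 6.5 and 6
colSums(baskets.team)   # team total per game
colMeans(baskets.team)  # team average per game
```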
Creating an array
You have two different options for constructing matrices or arrays. Either you
use the creator functions matrix() and array(), or you simply change the
dimensions using the dim() function.
> my.array <- array(1:24, dim = c(3, 4, 2))
> my.array
, , 1

     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12

, , 2

     [,1] [,2] [,3] [,4]
[1,]   13   16   19   22
[2,]   14   17   20   23
[3,]   15   18   21   24
This array has three dimensions (3 rows, 4 columns, 2 slices). Notice that,
although the rows are given as the first dimension, each slice is filled
column-wise. In other words, the first index varies fastest: R fills down each
column, then moves across the columns, and then moves on to the next slice.
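Indexing an array works like indexing a matrix, with one index per dimension. This is a minimal sketch; the name second.slice is only an illustration.

```r
my.array <- array(1:24, dim = c(3, 4, 2))

my.array[2, 3, 1]  # 8: row 2, column 3 of the first slice
my.array[2, 3, 2]  # 20: the same position in the second slice

# Leaving out the row and column indices returns a whole slice as a matrix
second.slice <- my.array[, , 2]
dim(second.slice)  # 3 4
```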
Changing the dimensions of a vector
Alternatively, you could just add the dimensions using the dim() function. This is a
little hack that goes a bit faster than using the array() function; it's especially
useful if you have your data already in a vector.
> my.vector <- 1:24
> dim(my.vector) <- c(3, 4, 2)
You can check whether two objects are identical by using the identical() function. To check
whether my.vector and my.array are identical, try:
> identical(my.array, my.vector)
[1] TRUE
Happy Coding!
Instructor: Amaury Beltrán Mendez
