R Codes For Data Analysis
The function rowSums() calculates the total for each row of a matrix and returns the result as a
new vector. colSums() does the same for each column:
rowSums(my_matrix)
colSums(my_matrix)
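As a small sketch of both functions, assuming a hypothetical 2-by-3 matrix whose values are made up for illustration:

```r
# Hypothetical 2 x 3 matrix, filled row by row
my_matrix <- matrix(1:6, nrow = 2, byrow = TRUE)
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6

row_totals <- rowSums(my_matrix)  # one total per row:    6 15
col_totals <- colSums(my_matrix)  # one total per column: 5 7 9
```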
Add a column or multiple columns to a matrix with the cbind() function, which merges matrices
and/or vectors together by column. For example:
big_matrix <- cbind(matrix1, matrix2, vector1 ...)
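A minimal runnable version of the call above might look like this. The contents of matrix1, matrix2 and vector1 are invented here purely for illustration:

```r
# Hypothetical inputs: two matrices with 2 rows each and a vector of length 2
matrix1 <- matrix(1:4, nrow = 2)   # columns (1, 2) and (3, 4)
matrix2 <- matrix(5:8, nrow = 2)   # columns (5, 6) and (7, 8)
vector1 <- c(9, 10)

# cbind() places its inputs side by side; the matrices must have the
# same number of rows, and the vector becomes one extra column
big_matrix <- cbind(matrix1, matrix2, vector1)
dim(big_matrix)                    # 2 rows, 5 columns
```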
my_matrix[1,2] selects the element at the first row and second column.
my_matrix[1:3,2:4] returns a matrix with the data in rows 1, 2, 3 and columns 2, 3, 4.
If you want to select all elements of a column or a row, no number is needed before or after the
comma, respectively:
my_matrix[,1] selects all elements of the first column.
my_matrix[1,] selects all elements of the first row.
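The selection rules above can be tried out on a small matrix. This sketch assumes a hypothetical 3-by-4 matrix holding the numbers 1 to 12:

```r
# Hypothetical 3 x 4 matrix, filled column by column (R's default)
my_matrix <- matrix(1:12, nrow = 3)
#      [,1] [,2] [,3] [,4]
# [1,]    1    4    7   10
# [2,]    2    5    8   11
# [3,]    3    6    9   12

single_element <- my_matrix[1, 2]      # 4
sub_matrix     <- my_matrix[1:3, 2:4]  # 3 x 3 matrix: rows 1-3, columns 2-4
first_column   <- my_matrix[, 1]       # 1 2 3
first_row      <- my_matrix[1, ]       # 1 4 7 10
```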
Calculate means
mean(my_variable)
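For instance, with a made-up numeric vector (the values and the name with_missing are assumptions for this sketch), mean() behaves as follows, including when a value is missing:

```r
# Hypothetical numeric vector
my_variable <- c(2, 4, 6, 8)
average <- mean(my_variable)                         # 5

# A missing value makes the result NA unless na.rm = TRUE is set
with_missing <- c(2, 4, NA)
average_na   <- mean(with_missing)                   # NA
average_ok   <- mean(with_missing, na.rm = TRUE)     # 3
```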
Factors:
The term factor refers to a statistical data type used to store categorical variables. Create one
by passing a vector to factor():
factor(my_vector)
There are two types of categorical variables: a nominal categorical variable and an ordinal
categorical variable.
A nominal variable is a categorical variable without an implied order. This means that it is
impossible to say that 'one is worth more than the other'. For example, think of the categorical
variable animals_vector with the categories "Elephant", "Giraffe", "Donkey" and "Horse".
Here, it is impossible to say that one stands above or below the other. (Note that some of you
might disagree ;-) ).
Example:
animals_vector <- c("Elephant", "Giraffe", "Donkey", "Horse")
factor_animals_vector <- factor(animals_vector)
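To see what R stores for a nominal factor, one can inspect its levels. The sketch below repeats the example and adds two checks:

```r
animals_vector <- c("Elephant", "Giraffe", "Donkey", "Horse")
factor_animals_vector <- factor(animals_vector)

# Levels are listed alphabetically by default; this is bookkeeping only
# and does not imply any ranking between the animals
animal_levels <- levels(factor_animals_vector)
# "Donkey" "Elephant" "Giraffe" "Horse"

is.ordered(factor_animals_vector)  # FALSE: the factor is nominal
```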
In contrast, ordinal variables do have a natural ordering. Consider for example the categorical
variable temperature_vector with the categories: "Low", "Medium" and "High". Here it is
obvious that "Medium" stands above "Low", and "High" stands above "Medium".
Example:
temperature_vector <- c("High", "Low", "High","Low", "Medium")
factor_temperature_vector <- factor(temperature_vector, ordered = TRUE,
                                    levels = c("Low", "Medium", "High"))
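One practical consequence of the ordering is that elements of an ordered factor can be compared. A short sketch, reusing the temperature example (the name first_is_higher is an assumption for illustration):

```r
temperature_vector <- c("High", "Low", "High", "Low", "Medium")
# ordered = TRUE marks the levels as ranked, from lowest to highest
factor_temperature_vector <- factor(temperature_vector, ordered = TRUE,
                                    levels = c("Low", "Medium", "High"))

levels(factor_temperature_vector)  # "Low" "Medium" "High"

# Compare the first element ("High") with the second ("Low")
first_is_higher <- factor_temperature_vector[1] > factor_temperature_vector[2]
# TRUE
```

The same comparison on an unordered factor is not meaningful; R returns NA with a warning.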
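summary() gives a quick overview of a variable. For a factor, it counts the observations in each
level:
summary(my_factor)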
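As a sketch of what summary() reports for a factor, assume my_factor holds the temperature data used earlier (the name level_counts is introduced here for illustration):

```r
temperature_vector <- c("High", "Low", "High", "Low", "Medium")
my_factor <- factor(temperature_vector, ordered = TRUE,
                    levels = c("Low", "Medium", "High"))

# For a factor, summary() returns the number of observations per level
level_counts <- summary(my_factor)
#    Low Medium   High
#      2      1      2
```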