R is a command-line language: all commands are entered directly into the console.
Often it is useful to store the result of an operation for later use. This can be done using the assignment operator <-. The command
test <- 2 * 3
performs the operation on the right-hand side (2 * 3) and then stores the result as an object named test. One can also use = or even -> for assignments; with ->, the object name is written on the right, e.g. 2 * 3 -> test. Further operations can be carried out on objects:
2 * test
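The three assignment forms mentioned above can be tried side by side. This is a minimal sketch; the object names a, b and d are chosen here purely for illustration.

```r
# Three equivalent ways of storing 2 * 3 in an object
a <- 2 * 3      # leftward assignment (the usual form)
b = 2 * 3       # '=' also assigns at the top level
2 * 3 -> d      # rightward assignment: the target is on the right

# Reusing a name overwrites the old value silently
a <- a + 1      # a is now 7; no warning is issued

print(c(a, b, d))   # 7 6 6
print(ls())         # lists the names of the currently defined objects
```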
Note that objects are overwritten without notice. The command ls() outputs the list of currently defined objects.

Data Types:
The basic data types in R include numeric, character and logical. As the name indicates, numeric is used for numerical values (double precision). The type character is used for text and is entered using quotation marks; it is not possible (nor meaningful) to apply arithmetic operators to character data. The data type logical is used for boolean values: TRUE (or T) and FALSE (or F).
> ls()
character(0)
> myname <- "Vikrant"
> myname
[1] "Vikrant"
> ls()
[1] "myname"
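The data types described above can be inspected with class(). The following sketch uses made-up object names (x, s, b) and shows that arithmetic on character data raises an error, which is caught here with tryCatch() so that the script keeps running.

```r
x <- 3.5          # numeric (double precision)
s <- "3.5"        # character: the quotation marks make it text, not a number
b <- TRUE         # logical

print(class(x))   # "numeric"
print(class(s))   # "character"
print(class(b))   # "logical"

# Arithmetic on character data fails; tryCatch() captures the error message
msg <- tryCatch(s * 2, error = function(e) conditionMessage(e))
print(msg)

# An explicit conversion makes the arithmetic possible
print(as.numeric(s) * 2)          # 7

# TRUE and FALSE behave as 1 and 0 in arithmetic
print(sum(c(TRUE, FALSE, TRUE)))  # 2
```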
Object Types:
Depending on the structure of the data, R recognizes four standard object types: vectors, matrices, data frames and lists. Vectors are one-dimensional arrays of data; the ith element of a vector is accessed via the subscript notation, e.g. v[i]. Matrices are two-dimensional data arrays; the element at the intersection of the ith row and jth column is accessed with m[i, j]. Arrays have k dimensions, and each element of an array is accessed with k indices, a[i1, i2, ..., ik]. An array object of dimension 1 differs from a vector object by virtue of having a dimension vector. The dimension vector is a vector of positive integers whose length gives the dimension of the array.

Creating Vectors in R:
There are various means of creating vectors in R. For example, to save the numbers 3, 5, 6, 7, 1 as mynumbers, one can use the c() command:
mynumbers <- c(3, 5, 6, 7, 1)
Further operations can then be carried out on the R object mynumbers. Note that arithmetic operations on vectors (and matrices) are carried out componentwise; e.g. mynumbers * mynumbers returns the square of each component of mynumbers. Sequences can be created using either : or seq():
> 1:10
 [1]  1  2  3  4  5  6  7  8  9 10
> seq(1,10)
 [1]  1  2  3  4  5  6  7  8  9 10
> seq(0.5, 2.5, 0.5)
[1] 0.5 1.0 1.5 2.0 2.5
> seq(0.5, 2.5, length = 10)
 [1] 0.5000000 0.7222222 0.9444444 1.1666667 1.3888889 1.6111111 1.8333333 2.0555556 2.2777778 2.5000000
The seq() command allows the increments of the sequence to be specified. Alternatively, one can specify the length of the sequence: seq(0.5, 2.5, length = 10) creates a sequence from 0.5 to 2.5 with the increments chosen such that the resulting sequence contains 10 equally spaced values.
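The componentwise arithmetic and the two ways of calling seq() can be checked with a short sketch. The argument length used in the text is matched by R to the full argument name length.out, which is spelled out here; the names squares, s_by and s_len are illustrative only.

```r
mynumbers <- c(3, 5, 6, 7, 1)

# Arithmetic is applied element by element
squares <- mynumbers * mynumbers
print(squares)                 # 9 25 36 49 1

# A single number is recycled across all components
print(2 * mynumbers)           # 6 10 12 14 2

# seq() with a fixed increment (by) or a fixed number of values (length.out)
s_by  <- seq(0.5, 2.5, by = 0.5)
s_len <- seq(0.5, 2.5, length.out = 10)
print(length(s_by))            # 5
print(diff(s_len))             # constant increment of (2.5 - 0.5) / 9
```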
Creating Matrices in R:
One way of creating a matrix in R is to convert a vector of length n·m into an n × m matrix:
> (mynumbers <- 1:12)
 [1]  1  2  3  4  5  6  7  8  9 10 11 12
> matrix(mynumbers, nrow = 4)
     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7   11
[4,]    4    8   12
Note that the matrix is filled columnwise. For rowwise construction one has to use the option byrow = TRUE:
> matrix(mynumbers, nrow = 4, byrow = TRUE)
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
[4,]   10   11   12
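The difference between the two fill orders can also be verified programmatically. In this sketch the names m_col and m_row are illustrative; the last line checks that a rowwise fill gives the same matrix as transposing a columnwise fill with the dimensions swapped.

```r
mynumbers <- 1:12

m_col <- matrix(mynumbers, nrow = 4)                 # filled column by column
m_row <- matrix(mynumbers, nrow = 4, byrow = TRUE)   # filled row by row

print(dim(m_col))      # 4 3: the number of columns is inferred as 12 / 4
print(m_col[2, 1])     # 2: the second value goes down the first column
print(m_row[1, 2])     # 2: the second value goes along the first row

# Rowwise filling equals the transpose of a columnwise fill with 3 rows
print(all(m_row == t(matrix(mynumbers, nrow = 3))))   # TRUE
```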
An alternative way of constructing matrices is to use the functions cbind() and rbind(), which combine vectors columnwise or rowwise, respectively, into a matrix:
> mynumbers1 <- 1:4
> mynumbers2 <- 11:14
> cbind(mynumbers1, mynumbers2)
     mynumbers1 mynumbers2
[1,]          1         11
[2,]          2         12
[3,]          3         13
[4,]          4         14
> rbind(mynumbers1, mynumbers2)
           [,1] [,2] [,3] [,4]
mynumbers1    1    2    3    4
mynumbers2   11   12   13   14
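As a small check of the two functions, the following sketch compares the dimensions and labels of both results. The names by_col and by_row are introduced here for illustration.

```r
mynumbers1 <- 1:4
mynumbers2 <- 11:14

by_col <- cbind(mynumbers1, mynumbers2)   # vectors become columns: 4 x 2
by_row <- rbind(mynumbers1, mynumbers2)   # vectors become rows:    2 x 4

print(dim(by_col))        # 4 2
print(dim(by_row))        # 2 4
print(colnames(by_col))   # the vector names label the columns
print(rownames(by_row))   # the vector names label the rows

# Both results hold the same data; one is the transpose of the other
print(all(by_row == t(by_col)))   # TRUE
```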
Particular elements of R vectors and matrices can be accessed using square brackets:
> (vector1 <- seq(-3, 3, 0.5))
 [1] -3.0 -2.5 -2.0 -1.5 -1.0 -0.5  0.0  0.5  1.0  1.5  2.0  2.5  3.0
> (matrix1 <- matrix(1:20, nrow = 5))
     [,1] [,2] [,3] [,4]
[1,]    1    6   11   16
[2,]    2    7   12   17
[3,]    3    8   13   18
[4,]    4    9   14   19
[5,]    5   10   15   20
> vector1[5] # returns the 5th element of vector1
[1] -1
> vector1[1:3] # returns the first three elements of vector1
[1] -3.0 -2.5 -2.0
> vector1[c(2, 4, 5)] # returns the 2nd, 4th and 5th element of vector1
[1] -2.5 -1.5 -1.0
> vector1[-5] # returns all elements except for the 5th one
 [1] -3.0 -2.5 -2.0 -1.5 -0.5  0.0  0.5  1.0  1.5  2.0  2.5  3.0
> matrix1[2,] # returns the 2nd row of matrix1
[1]  2  7 12 17
> matrix1[,3] # returns the 3rd column of matrix1
[1] 11 12 13 14 15
> matrix1[2, 3] # returns the value from matrix1 in the 2nd row and 3rd column
[1] 12
> matrix1[1:2, 3] # returns the values from matrix1 in the first two rows of the 3rd column
[1] 11 12
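The same bracket notation extends to arrays with k indices and makes the role of the dimension vector visible. The sketch below uses the base functions array() and dim(); the object names a, v and a1 are illustrative. The drop = FALSE argument in the last line is an addition not discussed in the text: it keeps the matrix structure when a single row is selected.

```r
# A 2 x 3 x 4 array filled with 1..24 (the first index varies fastest)
a <- array(1:24, dim = c(2, 3, 4))
print(dim(a))          # 2 3 4: the dimension vector
print(a[1, 2, 1])      # 3
print(a[2, 3, 4])      # 24, the last element

# A vector has no dimension vector, whereas a one-dimensional array does
v  <- 1:5
a1 <- array(1:5, dim = 5)
print(is.null(dim(v)))   # TRUE
print(dim(a1))           # 5

# Selecting a single row returns a plain vector unless drop = FALSE is given
matrix1 <- matrix(1:20, nrow = 5)
print(dim(matrix1[2, ]))                 # NULL
print(dim(matrix1[2, , drop = FALSE]))   # 1 4
```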
Example of Plotting Functions:
Assume that you were to plot a function by hand. One way of doing it is to
1. Select some x-values from the range to be plotted
2. Compute the corresponding y = f(x) values
3. Plot x against y
4. Add a (more or less) smooth line connecting the (x, y) points
Graphs of functions are created in essentially the same way in R; e.g. plotting the function f(x) = sin(x) in the range from -π to π can be done as follows:
> par(mfrow = c(1,2)) # creates a row with two columns for the two figures
> x1 <- seq(-pi, pi, length = 10) # defines 10 values from -pi to pi
> y1 <- sin(x1) # computes the corresponding y-values
> x2 <- seq(-pi, pi, length = 1000) # defines 1000 values from -pi to pi
> y2 <- sin(x2) # computes the corresponding y-values
> plot(x1,y1, main = 'Length(x) = 10'); lines(x1,y1) # plots the points; joins them with lines
> plot(x2,y2, type = 'l', main = 'Length(x) = 1000'); lines(x2,y2) # draws the curve as a line
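The visual difference between the two panels can also be quantified without plotting. The following sketch is an illustration rather than part of the original example: the helper f_err is hypothetical, and it uses approx() to mimic the straight-line segments that lines() draws, then measures how far they deviate from sin(x).

```r
# Largest gap between sin(x) and the straight lines joining n grid points
f_err <- function(n) {
  x <- seq(-pi, pi, length.out = n)        # n grid points, as in the plots
  y <- sin(x)
  fine <- seq(-pi, pi, length.out = 5001)  # dense grid used for checking
  # approx() interpolates linearly between the (x, y) points;
  # rule = 2 guards against rounding at the two end points
  y_lin <- approx(x, y, xout = fine, rule = 2)$y
  max(abs(y_lin - sin(fine)))
}

err10   <- f_err(10)
err1000 <- f_err(1000)
print(err10)      # clearly visible kinks with 10 points
print(err1000)    # far smaller with 1000 points
```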
Note the use of the option type="l", which causes the graph to be drawn with connecting lines rather than points.

Statistical Distributions:
The names of the R functions for distributions comprise two parts. The first part (the first letter) indicates the function group, and the second part (the remainder of the function name) indicates the distribution. The following function groups are available:
  probability density function (d)
  cumulative distribution function (p)
  quantile function (q)
  random number generation (r)
Common distributions have their corresponding R names; for example, norm denotes the normal and binom the binomial distribution, so that rnorm() draws normal random numbers and dbinom() returns binomial probabilities.
The following examples illustrate the use of the R functions for computations involving statistical distributions:
> rnorm(10) # draws 10 random numbers from a standard normal distribution
 [1] -1.22344887  0.39122103 -0.05510522 -2.82353210 -0.36081333  1.11091760  0.26517245
 [8] -0.06404960 -0.78331341  0.31062647
> rnorm(10, 5, 2) # draws 10 random numbers from a N(μ = 5, σ = 2) distribution
 [1] 6.362673 7.096071 7.979135 3.645824 5.987376 6.277054 4.547036 3.840034 6.917535 7.256952
> pnorm(0) # returns the value of the standard normal cdf at t = 0
[1] 0.5
> qnorm(0.5) # returns the 50% quantile of the standard normal distribution
[1] 0
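The four function groups are linked to one another, which a short sketch can make concrete. The seed value is arbitrary and only serves reproducibility; integrate() is the base R numerical integration routine and is used here as an independent check, not as part of the original examples.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
draws <- rnorm(10, mean = 5, sd = 2)
print(length(draws))              # 10

# q is the inverse of p: the quantile function undoes the cdf
p <- pnorm(1.5)
print(all.equal(qnorm(p), 1.5))   # TRUE

# p is the integral of d: integrating the density up to 0 gives pnorm(0)
area <- integrate(dnorm, lower = -Inf, upper = 0)$value
print(all.equal(area, pnorm(0), tolerance = 1e-4))   # TRUE

# The same prefixes work for other distributions, e.g. the binomial
print(dbinom(0, size = 2, prob = 0.5))   # 0.25
```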
Assume that we want to generate 50 (standard) normally distributed random numbers and to display them as a histogram. Additionally, we want to add the pdf of the (fitted) normal distribution to the plot.
mysample <- rnorm(50)            # generates random numbers
hist(mysample, prob = TRUE)      # draws the histogram with hist()
mu <- mean(mysample)             # computes the sample mean with mean()
sigma <- sd(mysample)            # computes the sample std deviation with sd()
x <- seq(-4, 4, length = 500)    # defines x-values for the pdf
y <- dnorm(x, mu, sigma)         # computes the normal pdf
lines(x, y, col = "red")         # adds the pdf as lines to the plot
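The option prob = TRUE matters because it rescales the bars to densities, so that the histogram and the pdf share the same vertical scale. A minimal sketch of this, assuming an arbitrary seed and using hist() with plot = FALSE to obtain the bins without drawing:

```r
set.seed(42)                          # arbitrary seed, only for reproducibility
mysample <- rnorm(50)

h <- hist(mysample, plot = FALSE)     # compute the bins without drawing
widths <- diff(h$breaks)              # width of each bar

# The densities are scaled so that the bar areas sum to 1, like a pdf
print(sum(h$density * widths))        # 1

# The raw counts, by contrast, sum to the sample size
print(sum(h$counts))                  # 50
```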
Another example is the visualization of the approximation of the binomial distribution by the normal distribution, e.g. for n = 50 and success probability 0.25. The following code compares this binomial distribution with a normal distribution that has the same mean, 50 * 0.25, and the same standard deviation, sqrt(50 * 0.25 * (1 - 0.25)):
> x <- 0:50 # defines the x-values
> y <- dbinom(x, 50, 0.25) # computes the binomial probabilities
> plot(x, y, type = "h") # plots binomial probabilities
> x2 <- seq(0, 50, length = 500) # defines x-values (for the normal pdf)
> y2 <- dnorm(x2, 50*0.25,
+ sqrt(50*0.25*(1-0.25))) # computes the normal pdf
> lines(x2, y2, col = "red") # draws the normal pdf
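The quality of the approximation can also be assessed numerically rather than by eye. The sketch below is an illustration under stated assumptions: the variable names are made up, and the continuity correction of 0.5 in the cumulative comparison is a standard refinement that the text above does not mention.

```r
n <- 50
p <- 0.25
mu    <- n * p                     # mean of the binomial: 12.5
sigma <- sqrt(n * p * (1 - p))     # standard deviation of the binomial

x <- 0:n
exact      <- dbinom(x, n, p)      # binomial probabilities
approx_pdf <- dnorm(x, mu, sigma)  # normal pdf at the same points

# Largest pointwise gap between the binomial probabilities and the normal pdf
print(max(abs(exact - approx_pdf)))

# Cumulative comparison for P(X <= 12), with a continuity correction of 0.5
print(pbinom(12, n, p))
print(pnorm(12.5, mu, sigma))      # 0.5, since 12.5 is the mean
```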