Introduction To R
INTRODUCTION TO R PROGRAMMING
Data Analytics with R
Today’s discussion…
History of R
R resources
• http://www.r-project.org/
• http://cran.r-project.org/doc/contrib/Verzani-SimpleR.pdf
• Download R:
http://cran.r-project.org/bin/
• Download RStudio:
http://www.rstudio.com/ide/download/desktop
Installation
Installing R on a Windows PC:
Installing R on Linux:
sudo apt-get install r-base-core
Installation
Installing RStudio:
Click on the version recommended for your system, or the latest Windows
version, and save the executable file. Run the .exe file and follow the
installation instructions.
Version
• Get R version
R.Version()
• You can type your own program at the prompt line >.
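As a minimal sketch of how the version information can be inspected: R.Version() returns a list, and the built-in variable R.version.string holds a one-line summary. The variable name `info` is only illustrative.

```r
# R.Version() returns a named list describing the running R installation.
info <- R.Version()

# One-line summary; the exact text depends on the installed version.
print(info$version.string)

# The same summary is also available as a built-in character variable.
print(R.version.string)

# Individual components, such as the major and minor version numbers.
print(paste(info$major, info$minor, sep = "."))
```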
▫ help.start()
▫ help(topic)
▫ ?topic
▫ ??topic
Identifiers naming
• The preferred form for variable names is all lower case letters
and words separated with dots (variable.name) but
variableName is also accepted.
• Examples:
avg.clicks GOOD
avgClicks OK
avg_Clicks BAD
Using the c() command
• [1] 4 5 7 8 2 9 4 3
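The call that produced the output above is not shown in the source. One call that would yield it, assuming the values were entered directly, is sketched here; the variable name `clicks` is hypothetical.

```r
# c() combines its arguments into a single vector.
# The values are copied from the output shown above.
clicks <- c(4, 5, 7, 8, 2, 9, 4, 3)

print(clicks)
# [1] 4 5 7 8 2 9 4 3

# Number of elements in the vector.
print(length(clicks))
# [1] 8
```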
• Vector
• Matrix
• Data Frame
• List
Vectors in R
• > x = c(1, 2, 3, 4, 56)
• > x
• > x[2]
• > x = c(3, 4, NA, 5)
• > mean(x)
• [1] NA
• > mean(x, na.rm = TRUE)
• [1] 4
• > x = c(3, 4, NULL, 5)
• > mean(x)
• [1] 4
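The difference between NA and NULL inside a vector can be sketched as follows: NA occupies a position and propagates through mean() unless it is removed with na.rm = TRUE, while NULL is dropped by c() and never becomes an element. The names `scores` and `sparse` are illustrative.

```r
# NA is a placeholder for a missing value and counts as an element.
scores <- c(3, 4, NA, 5)
print(length(scores))              # 4
print(mean(scores))                # NA, because one value is missing
print(mean(scores, na.rm = TRUE))  # 4, computed from 3, 4 and 5
print(is.na(scores))               # FALSE FALSE TRUE FALSE

# NULL is an empty object; c() drops it, so the vector has only 3 elements.
sparse <- c(3, 4, NULL, 5)
print(length(sparse))              # 3
print(mean(sparse))                # 4
```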
More on Vectors in R
• > y = c(x, c(-1, 5), x)
• > length(x)
• > length(y)
• There are useful methods to create long vectors whose elements
are in arithmetic progression:
• > x = 1:20
• > x
• If the common difference is not 1 or -1 then we can use
the seq function
• > y = seq(2, 5, 0.3)
• > y
• [1] 2.0 2.3 2.6 2.9 3.2 3.5 3.8 4.1 4.4 4.7 5.0
• > length(y)
• [1] 11
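A short sketch of the arithmetic-progression helpers discussed above. The named arguments `from`, `to`, `by` and `length.out` belong to base R's seq(); the variable names are illustrative.

```r
# The colon operator steps by 1 (or by -1 when counting down).
up <- 1:5
down <- 5:1
print(up)
print(down)

# seq() with an explicit step: 2.0, 2.3, ..., 5.0
y <- seq(from = 2, to = 5, by = 0.3)
print(y)
print(length(y))   # 11

# Alternatively, fix the number of points instead of the step size.
z <- seq(from = 2, to = 5, length.out = 4)
print(z)           # 2 3 4 5
```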
More on Vectors in R
• > x = 1:5
• > mean(x)
• [1] 3
• > x
• [1] 1 2 3 4 5
• > x^2
• [1] 1 4 9 16 25
• > x + 1
• [1] 2 3 4 5 6
• > 2*x
• [1] 2 4 6 8 10
• > exp(sqrt(x))
• [1] 2.718282 4.113250 5.652234 7.389056 9.356469
It is very easy to add/subtract/multiply/divide two vectors entry by entry. When the lengths differ, R recycles the shorter vector and warns if the longer length is not a multiple of the shorter one.
• > y = c(0, 3, 4, 0)
• > x + y
• [1] 1 5 7 4 5
Warning message:
In x + y : longer object length is not a multiple of shorter object length
• > y = c(0, 3, 4, 0, 9)
• > x + y
• [1] 1 5 7 4 14
• > x = 1:6
• > y = c(9, 8)
• > x + y
• [1] 10 10 12 12 14 14
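The recycling behaviour behind these element-wise operations can be sketched as below. The handler that captures the warning is an illustrative addition, not part of the original slides; the variable names are hypothetical.

```r
# Lengths 6 and 2: the shorter vector is recycled three times, no warning.
even.fit <- 1:6 + c(9, 8)
print(even.fit)    # 10 10 12 12 14 14

# Lengths 5 and 4: R still recycles, but warns because 5 is not a multiple of 4.
warning.text <- NULL
uneven.fit <- withCallingHandlers(
  1:5 + c(0, 3, 4, 0),
  warning = function(w) {
    # Record the message, then suppress the default warning output.
    warning.text <<- conditionMessage(w)
    invokeRestart("muffleWarning")
  }
)
print(uneven.fit)  # 1 5 7 4 5
print(warning.text)

# Equal lengths: plain entry-by-entry addition.
same.length <- 1:5 + c(0, 3, 4, 0, 9)
print(same.length) # 1 5 7 4 14
```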
Matrices in R
• All elements of a matrix must have the same data type (mode): numeric, character, or logical.
• a.matrix <- matrix(vector, nrow = r, ncol = c, byrow = FALSE,
dimnames = list(char-vector-rownames, char-vector-col-names))
## dimnames is an optional argument that provides labels for rows and columns.
## byrow = FALSE (the default) fills the matrix column by column.
• > y <- matrix(1:20, nrow = 4, ncol = 5)
• > A = matrix(c(1, 2, 3, 4), nrow = 2, byrow = TRUE)
• > A
• > A = matrix(c(1, 2, 3, 4), ncol = 2)
• > B = matrix(2:7, nrow = 2)
• > C = matrix(5:2, ncol = 2)
• > mr <- matrix(1:20, nrow = 5, ncol = 4, byrow = TRUE)
• > mc <- matrix(1:20, nrow = 5, ncol = 4)
• > mr
• > mc
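To make the effect of byrow concrete, the following sketch compares the two 5 x 4 matrices defined above and adds row and column labels through dimnames. The label values and the name `labelled` are invented for illustration.

```r
# Same data, filled row by row versus column by column.
mr <- matrix(1:20, nrow = 5, ncol = 4, byrow = TRUE)
mc <- matrix(1:20, nrow = 5, ncol = 4)

print(mr[1, ])   # first row when filled by row:    1 2 3 4
print(mc[1, ])   # first row when filled by column: 1 6 11 16
print(mc[, 1])   # first column when filled by column: 1 2 3 4 5

# dimnames takes a list: row labels first, then column labels.
labelled <- matrix(1:4, nrow = 2,
                   dimnames = list(c("r1", "r2"), c("c1", "c2")))
print(labelled)
print(labelled["r2", "c1"])   # 2
```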
More on matrices in R
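• > dim(B) # Dimensions (rows, columns)
• > nrow(B)
• > ncol(B)
• > A + C
• > A - C
• > A %*% C # Matrix multiplication. What will the result be?
• > A * C # Entry-wise multiplication
• > t(A) # Transpose
• > A[1, 2] # Element in row 1, column 2
• > A[1, ] # Entire first row
• > B[1, c(2, 3)] # Row 1, columns 2 and 3
• > B[, -1] # All columns except the first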
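The operations listed above can be worked through as follows. This sketch assumes A, B and C hold the values from their most recent definitions on the previous slide (A built with ncol = 2, so it is filled column by column).

```r
A <- matrix(c(1, 2, 3, 4), ncol = 2)   # rows: (1 3), (2 4)
B <- matrix(2:7, nrow = 2)             # rows: (2 4 6), (3 5 7)
C <- matrix(5:2, ncol = 2)             # rows: (5 3), (4 2)

print(dim(B))        # 2 3

# Matrix product: rows of A combined with columns of C.
print(A %*% C)       # rows: (17 9), (26 14)

# Entry-wise product: each element multiplied by its counterpart.
print(A * C)         # rows: (5 9), (8 8)

print(t(A))          # rows: (1 2), (3 4)
print(A[1, 2])       # 3
print(A[1, ])        # 1 3
print(B[1, c(2, 3)]) # 4 6
print(B[, -1])       # rows: (4 6), (5 7)
```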
Lists in R
• Vectors and matrices in R are two ways to work with a
collection of objects. A list is a third way: unlike a vector or a matrix,
a list can hold elements of different types.
Examples of lists in R
• > x = list(name = "Arun Patel",
nationality = "Indian", height = 5.5,
marks = c(95, 45, 80))
• > names(x)
• > x$name
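A brief sketch of how list elements can be accessed, reusing the example values above. The variable name `student` is illustrative; `$` and `[[ ]]` are the standard base R accessors.

```r
# A list can mix character, numeric and vector elements.
student <- list(name = "Arun Patel",
                nationality = "Indian", height = 5.5,
                marks = c(95, 45, 80))

print(names(student))          # "name" "nationality" "height" "marks"

# $ and [[ ]] both extract a single element.
print(student$name)            # "Arun Patel"
print(student[["marks"]][2])   # 45, second entry of the marks vector

# Extracted elements keep their own type, so vector functions still apply.
print(mean(student$marks))

# Each element has its own class.
print(sapply(student, class))
```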
Data frame in R
• A data frame is more general than a matrix, in that different
columns can have different modes (numeric, character, factor, etc.).
• > d <- c(1, 2, 3, 4)
• > e <- c("red", "white", "red", NA)
• > f <- c(TRUE, TRUE, TRUE, FALSE)
• > myframe <- data.frame(d, e, f)
• > names(myframe) <- c("ID", "Color", "Passed") # Variable names
• > myframe
• > myframe[1:3, ] # Rows 1, 2, 3 of the data frame
• > myframe[, 1:2] # Columns 1 and 2 of the data frame
• > myframe[c("ID", "Color")] # Columns ID and Color of the data frame
• > myframe$ID # Variable ID in the data frame
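As a small extension of the example above, the sketch below builds the same data frame with column names supplied directly and then selects rows by a logical condition. The name `results` is illustrative, and stringsAsFactors = FALSE is set explicitly because the default differs between R versions.

```r
# Columns of different modes: numeric, character, logical.
results <- data.frame(ID = c(1, 2, 3, 4),
                      Color = c("red", "white", "red", NA),
                      Passed = c(TRUE, TRUE, TRUE, FALSE),
                      stringsAsFactors = FALSE)

# Compact overview of column types.
str(results)

# Keep only the rows where Passed is TRUE.
passed <- results[results$Passed, ]
print(passed)
print(nrow(passed))               # 3

# Count missing values in one column.
print(sum(is.na(results$Color)))  # 1
```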
Factors in R
• In R we can mark a variable as nominal (categorical) by making it a factor.
• A factor stores the nominal values as a vector of integers in the range [1... k]
(where k is the number of unique values in the nominal variable), together with
the text labels (levels) that those integers map to.
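The integer coding described above can be seen in a minimal sketch. The colour values and variable names are invented for illustration; by default factor() orders the levels by sorting the unique values.

```r
colors <- c("red", "white", "red", "blue")

# Convert the character vector into a factor.
color.factor <- factor(colors)

# The k unique values become the levels, in sorted order.
print(levels(color.factor))       # "blue" "red" "white"
print(nlevels(color.factor))      # 3

# Internally each value is stored as an integer code in 1..k.
print(as.integer(color.factor))   # 2 3 2 1

# Factors work directly with tabulation functions.
print(table(color.factor))
```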
Functions in R
REFERENCES
Web:
https://cran.r-project.org/bin/windows/base/ (for Windows)
https://cran.r-project.org/bin/macosx/ (for Mac)
https://cran.r-project.org/bin/linux/ (for Linux)
https://www.tutorialspoint.com/r/index.htm
https://www.rstudio.com/products/rstudio/download/#download
Books:
Practical Statistics for Data Scientists: 50 Essential Concepts (Peter Bruce
and Andrew Bruce)
Think Stats: Exploratory Data Analysis (Allen B. Downey)
Data Analysis with R - Second Edition: A comprehensive guide to
manipulating, analyzing, and visualizing data in R (Tony Fischetti)