R Programming for Beginners

This document provides an introduction to R programming including: - Why R is commonly used for statistical computing and graphics, with some major companies that use R listed. - The different data types and structures in R, including vectors, lists, matrices, arrays, factors, and data frames. - How to create basic charts and graphs in R like pie charts, bar charts, box plots, histograms, line graphs, and scatter plots. - Common statistical functions in R like mean, median, mode, correlation, linear regression, and ANOVA functions. The document then goes on to discuss running R, the RStudio interface, basic R syntax like vectors and indexing, and compares R to other statistical

Uploaded by

Tim Johnson

Introduction to R:

• Why do we use R for statistical computing and graphics?
• Which companies are using R?
• Applications of R in the real world

R reserved words

R data types and constants

R data types:

• Logical, numeric, integer, complex, character and raw data

R constants:

• Numeric
• Character
• Built-in

R data structures

• Vectors
• Lists
• Matrices
• Arrays
• Factors
1. How to create factors?
2. How to access components of a factor?
• Data frames
1. How to create a data frame in R?
2. How to access components of a data frame?

Using row bind rbind() and column bind cbind() / Installing packages

R flow controls (loops and if-then-else) / Importing data sets / How to read CSV files?

R charts and graphs

• Pie chart
• Bar chart
• Box plot
• Histogram
• Histogram with added parameters
• Line graphs
• Scatter plots
• Strip charts

R statistical functions

• Basic statistical functions: mean, median, mode, average, min, max
• Correlation, linear regression and multiple linear regression functions
• ANOVA functions

Mode of teaching: Lab sessions

Introduction to R programming:

• R is a programming language and environment commonly used in statistical computing, data analytics and scientific research.

• It is one of the most popular languages used by statisticians, data analysts, researchers and marketers to retrieve, clean, analyze, visualize and present data.

• Due to its expressive syntax and easy-to-use interface, it has grown in popularity in recent years.

Why do we use R programming for statistical computing and graphics?

• R is open source and free!

• R is popular – and increasing in popularity

• R runs on all platforms

• Learning R will increase your chances of getting a job

• R is being used by the biggest tech giants

History of R

• R was created by Ross Ihaka and Robert Gentleman at the University of Auckland. It is an implementation of the S programming language, which John Chambers and colleagues developed at Bell Laboratories, combined with lexical scoping semantics inspired by Scheme.
• R was named partly after the first names of its two authors. The project was conceived in 1992, with an initial version released in 1995 and a stable beta version in 2000.

Companies using R, and for what purpose?
Application of R programming in the real world:

• Data Science: Programming languages like R let data scientists collect data in real time, perform statistical and predictive analysis, create visualizations and communicate actionable results to stakeholders.

• Statistical computing: R has a rich package repository, with more than 9100 packages covering a very wide range of statistical functions.

• Machine Learning: Everyone from machine learning enthusiasts to researchers uses R to implement machine learning algorithms in fields like finance, genetics research, retail, marketing and health care.

Alternatives to R programming:

SAS

SPSS

Python

Features of R

• It supports procedural programming with functions and object-oriented programming with generic functions. Procedural programming is built from procedures, records, modules and procedure calls, while object-oriented programming is built from classes, objects and functions.
• Packages are part of R programming. They collect sets of R functions into a single unit.
• R programming features include database input, exporting data, viewing data, variable labels, missing data, etc.
• R is an interpreted language, so we can use it through a command-line interpreter.
• R supports matrix arithmetic.
• It has effective data handling and storage facilities.
• R supports a large pool of operators for performing operations on arrays and matrices.
• It has facilities to output the results of an analysis as graphs, either on-screen or on hardcopy.

The installation files for R are available from the official R website (www.r-project.org). The website has general documentation related to R along with the libraries of routines. We can simply download and install the R program from there.

Run R programming in Windows

• Go to the official site of R programming

• Click on the CRAN link on the left sidebar

• Select a mirror

• Click “Download R for Windows”

• Click on the link that downloads the base distribution

• Run the file and follow the steps in the instructions to install R.

RStudio GUI

a. Features of RStudio
• Code highlighting that gives different colors to keywords and variables, making code easier to read
• Automatic bracket matching
• Code completion, which reduces the effort of typing commands in full
• Easy access to R Help, with additional features for exploring functions and parameters of functions
• Easy exploration of variables and values

Because RStudio is available free of charge for Linux, Windows, and Mac devices, it is a good option to use with R. To open RStudio, click the RStudio icon in the menu system or on the desktop.

b. Components of RStudio
• Source: The top left corner of the screen contains a text editor that lets the user work with source script files. Multiple lines of code can be entered here. Users can save R script files to disk and perform other tasks on the script.
• Console: The bottom left corner is the R console window. The console in RStudio is identical to the console in RGui. All the interactive work of R programming is performed in this window.
• Workspace and History: The top right corner is the R workspace and history window. It provides an overview of the workspace, where the variables created in the session can be inspected along with their values. This is also where the user can see a history of the commands issued in R.
• Files, Plots, Packages, and Help: The bottom right corner gives access to the following tools:

• Files: This is where the user can browse folders and files on a computer.
• Plots: This is where R displays the user's plots.
• Packages: This is where the user can view a list of all the installed packages.
• Help: This is where the user can browse the built-in Help system of R.

R reserved words
Comparison of R with other technologies:

• Data handling capabilities: R has good data handling capabilities and options for parallel computation.
• Availability / Cost: R is open source and free to use anywhere.
• Advancement in tool: If you work with the latest techniques, R tends to get new features early.
• Ease of learning: R has a learning curve. Analyses are written as code rather than chosen from menus, so even simple procedures can take several lines.
• Job scenario: R is a good option for start-ups and companies looking for cost efficiency.
• Graphical capabilities: R has advanced graphical capabilities.
• Customer service support and community: R has a large and growing online community.

R code and explanation

Vectors:

Vectors are created with the c() function. A vector must have elements of the same type, so c() will try to coerce the elements to a common type if they differ.
Coercion goes from lower to higher types: from logical to integer to double to character.
Example 1:
Code:
x <- c(1, 5, 4, 9, 0)
typeof(x)   # "double"
length(x)   # 5

Example 2:
Code:
x <- c(1, 5.4, TRUE, "hello")
x           # every element has been coerced to character
typeof(x)   # "character"
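The coercion order described above can be checked one step at a time. The following is a minimal sketch; the variable names are illustrative and do not come from the original notes.

```r
# Each c() call mixes two types; typeof() reports the type R coerced to.
lgl_int <- c(TRUE, 2L)        # logical + integer
int_dbl <- c(2L, 3.5)         # integer + double
dbl_chr <- c(3.5, "hello")    # double + character

typeof(lgl_int)   # "integer"
typeof(int_dbl)   # "double"
typeof(dbl_chr)   # "character"

dbl_chr           # the number 3.5 is now the string "3.5"
```

Each step moves only as far up the hierarchy as the "highest" element requires, which is why a single character element turns the whole vector into character.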
If we want to create a vector of consecutive numbers, the : operator is very helpful.
Code:
x <- 1:7; x
y <- 2:-2; y

Creating a vector using the seq() function

Code:
seq(1, 3, by=0.2) # specify step size

seq(1, 5, length.out=4) # specify length of the vector

Using integer vector as index

• Vector indices in R start from 1, unlike most programming languages, where indices start from 0.
• We can use a vector of integers as an index to access specific elements.
• We can also use negative integers to return all elements except those specified.
• We cannot mix positive and negative integers while indexing, and real numbers, if used, are truncated to integers.
Code:

x <- c(0, 2, 4, 6, 8, 10); x

[1] 0 2 4 6 8 10

x[3] # access 3rd element

[1] 4

x[c(2, 4)] # access 2nd and 4th element

[1] 2 6

x[-1] # access all but 1st element

[1] 2 4 6 8 10

x[c(2, -4)] # cannot mix positive and negative integers

Error in x[c(2, -4)] : only 0's may be mixed with negative subscripts
x[c(2.4, 3.54)] # real numbers are truncated to integers

[1] 2 4

Using logical vector as index

• When we use a logical vector for indexing, the elements at the positions where the logical vector is TRUE are returned.
• This useful feature helps us filter a vector, as shown below.
x <- c(-3, -1, 0, 3)

x[c(TRUE, FALSE, FALSE, TRUE)]

[1] -3 3

x[x < 0] # filtering vectors based on conditions

[1] -3 -1

x[x > 0]

[1] 3

Using character vector as index

• This type of indexing is useful when dealing with named vectors. We can name each element of a vector.

x <- c("first"=3, "second"=0, "third"=9)

names(x)

[1] "first" "second" "third"

x["second"]

second
0

x[c("first", "third")]

first third
3 9

How to modify a vector in R?

• We can modify a vector using the assignment operator.
• We can use the techniques discussed above to access specific elements and modify them.
• If we want to truncate the vector, we can reassign it to a subset of itself.
x <- -3:2; x

[1] -3 -2 -1 0 1 2

x[2] <- 0; x # modify 2nd element

[1] -3 0 -1 0 1 2

x[x<0] <- 5; x # modify elements less than 0

[1] 5 0 5 0 1 2

x <- x[1:4]; x # truncate x to first 4 elements

[1] 5 0 5 0

How to delete a Vector?

We can delete the contents of a vector by assigning NULL to it.

x

[1] -3 -2 -1 0 1 2

x <- NULL

x

NULL

x[4]

NULL
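Assigning NULL empties the vector but leaves the name bound in the workspace; rm() removes the name itself. The sketch below illustrates the difference. The variable name temp_readings is hypothetical and chosen only for this example.

```r
temp_readings <- c(-3, -2, -1, 0, 1, 2)

# Assigning NULL keeps the name but drops the contents.
temp_readings <- NULL
still_bound <- exists("temp_readings")   # TRUE: the name is still defined
is_empty    <- is.null(temp_readings)    # TRUE: it now holds NULL

# rm() removes the name from the workspace altogether.
rm(temp_readings)
bound_after_rm <- exists("temp_readings")  # FALSE
```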

Matrix:

• A matrix is a two-dimensional data structure in R programming.
• A matrix is similar to a vector but additionally contains the dimension attribute.
• All attributes of an object can be checked with the attributes() function (the dimension can be checked directly with the dim() function).
• We can check if a variable is a matrix or not with the class() function.
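The points above can be tried on a small matrix. This is a minimal sketch; the object names m and v are illustrative.

```r
m <- matrix(1:6, nrow = 2, ncol = 3)

attributes(m)   # a list whose only element is $dim
dim(m)          # 2 3
class(m)        # "matrix" "array" in R 4.0.0 and later, "matrix" in older versions
is.matrix(m)    # TRUE

# A plain vector becomes a matrix once it is given a dimension attribute.
v <- 1:6
was_matrix <- is.matrix(v)   # FALSE
dim(v) <- c(2, 3)
is_matrix_now <- is.matrix(v)  # TRUE
```

Because class() returns two values in recent R versions, is.matrix() is often the more convenient test inside an if() condition.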

R code for practice:

* Character constants

'example'

typeof("5")
* Numeric constants

Types of operators

Arithmetic operators

+    Add two vectors
-    Subtract the second vector from the first
*    Multiply both vectors
/    Divide the first vector by the second
%%   Give the remainder of dividing the first vector by the second
%/%  Give the integer quotient of dividing the first vector by the second
^    Raise the first vector to the exponent of the second vector

u <- c(2,3,4)

v <- c(9,8,7)

print (u+v)

b <- c(1,2,3)

c <- c(9,8,7)

print(b-c)

print (u-v)

print(v-u)

g <- c(1,2)

h <- c(2,3,4)
print(g*h)

g <- c(1,2,3)

h <- c(3,5,6)

print (g*h)

g <- c(1,2,3)

h <- c(3,5,6)

print (g%%h)

print(g %/% h)

print (g^h)
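One of the examples above multiplies vectors of different lengths (g has two elements, h has three). In that case R recycles the shorter vector and warns when the longer length is not a multiple of the shorter one. A small sketch of that behaviour, with illustrative variable names:

```r
g <- c(1, 2)
h <- c(2, 3, 4)

# g is recycled to c(1, 2, 1); R warns because 3 is not a multiple of 2.
uneven <- suppressWarnings(g * h)
uneven   # 2 6 4

# No warning when the longer length is a multiple of the shorter length.
even <- c(1, 2) * c(10, 20, 30, 40)
even     # 10 40 30 80
```

suppressWarnings() is used here only to keep the sketch quiet; in real analyses the warning is usually a sign that the vector lengths should be checked.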

Built in constants

LETTERS

letters

[Link]

Vector:

Basic statistical operations (code)

mean(c(0, 5, 1, -10, 6))

median(c(0, 5, 1, -10, 6))

var(c(0, 5, 1, -10, 6))

length(c(1, 5, 6, -2))

quantile(c(5,6,7))

sd(c(5,6,7,8))

max(c(5,6,7,8))

min(c(5,6,7,8))

sqrt(c(2, 4))
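Mode function:

R has no built-in function for the statistical mode, so we define one.

# Create the function.
getmode <- function(v) {
  uniqv <- unique(v)
  # count how often each unique value occurs and return the most frequent one
  uniqv[which.max(tabulate(match(v, uniqv)))]
}

# Create the vector with numbers.
v <- c(2,1,2,3,1,2,3,4,1,5,5,3,2,3)

# Calculate the mode using the user function.
result <- getmode(v)
print(result)

# Create the vector with characters.
charv <- c("o","it","the","it","it")

# Calculate the mode using the user function.
result <- getmode(charv)
print(result)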
Vector:

• A vector is a basic data structure in R. It contains elements of the same type. The data types can be logical, integer, double, character, complex or raw.
• A vector's type can be checked with the typeof() function.
• Another important property of a vector is its length. This is the number of elements in the vector and can be checked with the function length().

apple <- c('red','green',"yellow")

print(apple)

# Get the class of the vector.

print(class(apple))

Bschools <- c('MMS','PGPM','PGDM')

Bschools

print(class(Bschools))

List:

• A list is a data structure having components of mixed data types.
• A vector having all elements of the same type is called an atomic vector, but a vector having elements of different types is called a list.
• We can check whether an object is a list with the typeof() function and find its length using length().
Here is an example of a list having three components, each of a different data type.

list1 <- list(c(2,5,3),21.3,sin)

# Print the list.

print(list1)

# named list2 rather than list, to avoid masking the built-in list() function
list2 <- list('MMS students',21.5,c(3,7,8,9))

list2
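The components of a list can be accessed by position or by name. The sketch below uses a named version of the list above; the component names (name, score, marks) and the object name student are assumptions made for illustration.

```r
student <- list(name = "MMS students", score = 21.5, marks = c(3, 7, 8, 9))

typeof(student)   # "list"
length(student)   # 3

# Single brackets return a sub-list; double brackets or $ return the component.
sub_list <- student["marks"]     # a list of length 1
marks    <- student[["marks"]]   # the numeric vector itself
score    <- student$score        # 21.5

second_mark <- student[["marks"]][2]   # 7
```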

Matrix:

Create a matrix
Matrix can be created using the matrix() function.
Dimension of the matrix can be defined by passing appropriate value for
arguments nrow and ncol.

M = matrix( c('k','a','v','i','t','a'), nrow = 2, ncol = 3, byrow = TRUE)

print(M)

matrix(1:9, nrow = 3, ncol = 3)

matrix(1:9, nrow = 3)

matrix(1:9, nrow=3, byrow=TRUE) # fill matrix row-wise

x <- matrix(1:9, nrow = 3, dimnames = list(c("India","USA","UK"), c("C1","C2","C3")))

Changing and accessing column names and row names

colnames(x)

[1] "C1" "C2" "C3"

rownames(x)

[1] "India" "USA" "UK"

# It is also possible to change names

colnames(x) <- c("C1","C2","C3")

rownames(x) <- c("R1","R2","R3")

Column bind and row bind

cbind(c(1,2,3),c(4,5,6))

rbind(c(1,2,3),c(4,5,6))

cbind(c('t','e','a','c','h'),c(1,2,3,4,5))

rbind(c('t','e','a','c','h'),c(1,2,3,4,5))

How to modify a matrix?

x[2,2] <- 10; x # modify a single element

x[x<5] <- 0; x # modify elements less than 5

x[-1,] # select all rows except first

x[c(1,2),c(2,3)] # select rows 1 & 2 and columns 2 & 3

x[c(3,2),] # leaving column field blank will select entire columns

x[,] # leaving row as well as column field blank will select entire matrix

Factors

A factor is a data structure used for fields that take only a predefined, finite number of values (categorical data).
For example, a data field such as marital status may contain only values from single, married, separated, divorced, or widowed. In such a case, we know the possible values beforehand, and these predefined, distinct values are called levels. Following is an example of a factor in R.

seeds_rice <- c('IR 20','Basmati','IR 60','Kolam','kolam nasik','IR idli rice','IR 20','Basmati','wada kolam')

seeds_rice

factor_seeds <- factor(seeds_rice)

print(factor_seeds)

print(nlevels(factor_seeds))
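Components of a factor are accessed much like elements of a vector. The following sketch uses the marital status example from the text; the data values and the object name status are made up for illustration.

```r
status <- factor(c("single", "married", "married", "single", "divorced"))

levels(status)     # "divorced" "married" "single" (sorted alphabetically)
nlevels(status)    # 3

third       <- status[3]         # one element; all levels are kept
first_last  <- status[c(1, 5)]   # elements 1 and 5
all_but_one <- status[-1]        # everything except the first element

table(status)      # counts per level
```

Subsetting keeps the full set of levels even when some no longer occur in the result; droplevels() can be used to discard unused levels if that is not wanted.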

Data frames

BMI <- data.frame(
  gender = c("Male", "Male", "Female"),
  height = c(152, 171.5, 165),
  weight = c(81, 93, 78),
  Age = c(42, 38, 26)
)

print(BMI)

Temp <- data.frame(
  Min = c(23, 12, 13, 5),
  Max = c(23, 45, 45, 65)
)

print(Temp)

x <- data.frame("SN" = 1:2, "Age" = c(47,75), "Name" = c("kavita","ramalingam"))

str(x) # structure of x

x["Name"]   # a data frame with one column

x$Name      # the column as a vector

x[["Name"]] # the same column, selected by name

x[[3]]      # the same column, selected by position

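Combining two data frames

# smartbind() comes from the gtools package, which must be installed first
library(gtools)

df1 = data.frame(a = c(1:5), b = c(6:10))

df2 = data.frame(a = c(11:15), b = c(16:20), c = LETTERS[1:5])

# rows are stacked; columns missing from one data frame are filled with NA
smartbind(df1,df2)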

emp.data <- data.frame(
  emp_id = c(1:5),
  emp_name = c("Ricky","Danish","Mini","Ryan","Gary"),
  salary = c(643.3,515.2,671.0,729.0,943.25),
  start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11", "2015-03-27")),
  stringsAsFactors = FALSE
)

print(emp.data)

Get the Structure of the R Data Frame

The structure of the data frame can be seen by using the str() function.

str(emp.data)

Extract specific columns

result <- data.frame(emp.data$emp_name, emp.data$salary)

print(result)

Extract first two rows

result <- emp.data[1:2,]

print(result)

3rd and 5th row and 2nd and 4th column

result <- emp.data[c(3,5),c(2,4)]

print(result)

Add the dept column

emp.data$dept <- c("IT","Operations","IT","HR","Finance")

v <- emp.data

print(v)

Create a 2nd data frame

emp.newdata <- data.frame(
  emp_id = c(6:8),
  emp_name = c("Rasmi","Pranab","Tusar"),
  salary = c(578.0,722.5,632.8),
  start_date = as.Date(c("2013-05-21","2013-07-30","2014-06-17")),
  dept = c("IT","Operations","Finance"),
  stringsAsFactors = FALSE
)

# Bind the two data frames by row; both must have the same columns
emp.finaldata <- rbind(emp.data, emp.newdata)

print(emp.finaldata)

Using Loops

Exercise 1:

How to print a multiplication table

# readline() only prompts for input in an interactive session
num = as.integer(readline(prompt = "Enter a number: "))

# use for loop to iterate 10 times
for(i in 1:10) {
  print(paste(num,'x', i, '=', num*i))
}

Exercise 2:

How to print an addition table

num = as.integer(readline(prompt = "Enter a number: "))

for(i in 1:10) {
  print(paste(num,'+', i, '=', num+i))
}

Exercise 3:

To check whether a given number is even or odd

num = as.integer(readline(prompt="Enter a number: "))

if((num %% 2) == 0) {
  print(paste(num,"is Even"))
} else {
  print(paste(num,"is Odd"))
}
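The exercises above rely on readline(), which returns an empty string when a script is run non-interactively. As a sketch of one alternative, the loop can be wrapped in a function that takes the number as an argument. The function name times_table and its upto parameter are hypothetical, and the while loop shows the same idea with a different loop construct.

```r
# Build the lines of a multiplication table instead of reading input.
times_table <- function(num, upto = 10) {
  lines <- character(upto)
  for (i in seq_len(upto)) {
    lines[i] <- paste(num, "x", i, "=", num * i)
  }
  lines
}

print(times_table(7))

# The first three rows again, this time with a while loop.
i <- 1
while (i <= 3) {
  print(paste(7, "x", i, "=", 7 * i))
  i <- i + 1
}
```

Returning the lines rather than printing inside the function makes the result easy to reuse or check.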

Charts and their types:

max.temp <- c(22, 27, 26, 24, 23, 26, 28)

barplot(max.temp)

Bar chart with added parameters:

barplot(max.temp,
  main = "Maximum Temperatures in a Week",
  xlab = "Degree Celsius",
  ylab = "Day",
  names.arg = c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"),
  col = "darkred",
  horiz = TRUE)

Plotting categorical data:

age <- c(17,18,18,17,18,19,18,16,18,18)

table(age)

barplot(table(age),
  main="Age Count of 10 Students",
  xlab="Age",
  ylab="Count",
  border="red",
  col="blue",
  density=10
)

Histogram

Built-in data sets

str(airquality) # structure of the data set

Temperature <- airquality$Temp

hist(Temperature)

Histogram with added parameters

hist(Temperature,
  main="Maximum daily temperature at La Guardia Airport",
  xlab="Temperature in degrees Fahrenheit",
  xlim=c(50,100),
  col="darkmagenta",
  freq=FALSE
)

Return value of hist()

h <- hist(Temperature)

Using the return value to add labels with text()

h <- hist(Temperature,ylim=c(0,40))

text(h$mids,h$counts,labels=h$counts, adj=c(0.5, -0.5))

Histogram using different breaks

hist(Temperature, breaks=4, main="With breaks=4")

hist(Temperature, breaks=20, main="With breaks=20")

Histogram with non-uniform width:

hist(Temperature,
  main="Maximum daily temperature at La Guardia Airport",
  xlab="Temperature in degrees Fahrenheit",
  xlim=c(50,100),
  col="chocolate",
  border="brown",
  breaks=c(55,60,70,75,80,100)
)

Box plot

str(airquality)

boxplot(airquality$Ozone) # ozone readings

boxplot(airquality$Ozone,
  main = "Mean ozone in parts per billion at Roosevelt Island",
  xlab = "Parts Per Billion",
  ylab = "Ozone",
  col = "orange",
  border = "brown",
  horizontal = TRUE,
  notch = TRUE
)

# boxplot() also returns the statistics it used to draw the plot
b <- boxplot(airquality$Ozone)

boxplot(Temp~Month,
  data=airquality,
  main="Different boxplots for each month",
  xlab="Month Number",
  ylab="Degree Fahrenheit",
  col="orange",
  border="brown"
)

Strip chart

str(airquality)

stripchart(airquality$Ozone)

Using jitter as a method

stripchart(airquality$Ozone,
  main="Mean ozone in parts per billion at Roosevelt Island",
  xlab="Parts Per Billion",
  ylab="Ozone",
  method="jitter",
  col="orange",
  pch=1
)

To draw multiple strips, we first prepare the data set

# prepare the data
temp <- airquality$Temp

# generate a normal distribution with the same mean and sd
tempNorm <- rnorm(200, mean=mean(temp, na.rm=TRUE), sd = sd(temp, na.rm=TRUE))

# make a list
x <- list("temp"=temp, "norm"=tempNorm)

stripchart(x,
  main="Multiple stripchart for comparison",
  xlab="Degree Fahrenheit",
  ylab="Temperature",
  method="jitter",
  col=c("orange","red"),
  pch=16
)

Strip chart from a formula

stripchart(Temp~Month,
  data=airquality,
  main="Different strip chart for each month",
  xlab="Months",
  ylab="Temperature",
  col="brown3",
  group.names=c("May","June","July","August","September"),
  vertical=TRUE,
  pch=16
)

TYPES OF CHARTS

Data set

[Link] frequency
11.5-16.5 2

16.5-21.5 6

21.5-26.5 7

26.5-31.5 5

31.5-36.5 3

hist(CHARTS1$frequency,right = FALSE)
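The notes do not show how CHARTS1 is created. The sketch below builds it from the table above; the column name class_interval is an assumption, since the original header is not legible. Because the data are already binned, hist() applied to the frequency column bins the counts themselves, so a bar plot of frequency per class is shown as one possible way to display the distribution.

```r
# Assumed layout: one row per class interval from the table above.
CHARTS1 <- data.frame(
  class_interval = c("11.5-16.5", "16.5-21.5", "21.5-26.5",
                     "26.5-31.5", "31.5-36.5"),
  frequency = c(2, 6, 7, 5, 3),
  stringsAsFactors = FALSE
)

# One bar per class interval; space = 0 makes the bars touch like a histogram.
barplot(CHARTS1$frequency,
        names.arg = CHARTS1$class_interval,
        space = 0,
        xlab = "Class interval",
        ylab = "Frequency")
```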

histogram

v <- c(9,13,21,8,36,22,12,41,31,33,19)

hist(v,xlab = "Weight",col = "yellow",border = "blue")

hist(v,xlab = "Weight",col = "green",border = "red", xlim = c(0,40), ylim =

c(0,5),

breaks = 5)

plot

v <- c(7,12,28,3,41)

plot(v,type = "o")
line chart

v <- c(7,12,28,3,41)

plot(v,type = "o", col = "red", xlab = "Month", ylab = "Rain fall",

main = "Rain fall chart")

multiple lines in a chart

v <- c(7,12,28,3,41)

t <- c(14,7,6,19,3)

plot(v,type = "o",col = "red", xlab = "Month", ylab = "Rain fall",

main = "Rain fall chart")

lines(t, type = "o", col = "blue")

stem(CHARTS1$frequency)

  The decimal point is at the |

  2 | 00
  4 | 0
  6 | 00

dotchart(CHARTS1$frequency)
SCATTER PLOT

plot(CHARTS1$frequency)
barplot(CHARTS1$frequency)

Values <- matrix(c(28,40,38,50,53,55,38,30,53),
  nrow=3, ncol=3, byrow=TRUE,
  dimnames = list(c("A","B","C"), c("1947","1957","1967")))

State <- c("A","B","C")

colors <- c("darkblue","red","yellow")

# grouped bars: one bar per state within each year
barplot(Values, main="Production of paddy",
  xlab="Years", col=colors,
  beside=TRUE, ylab="Production of paddy in lakh tonnes")

legend("bottomright", State, cex=1.3, fill=colors)

# stacked bars: the same data without beside=TRUE
barplot(Values, main="Production of paddy",
  xlab="Years", col=colors,
  ylab="Production of paddy in lakh tonnes")

legend("bottomright", State, cex=1.3, fill=colors)

slices <- c(10, 12,4, 16, 8)

lbls <- c("US", "UK", "Australia", "Germany", "France")

pie(slices, labels = lbls, main="Pie Chart of Countries")

slices <- c(10, 12, 4, 16, 8)

lbls <- c("US", "UK", "Australia", "Germany", "France")

pct <- round(slices/sum(slices)*100)

lbls <- paste(lbls, pct) # add percents to labels

lbls <- paste(lbls,"%",sep="") # add % to labels

pie(slices,labels = lbls, col=rainbow(length(lbls)),

main="Pie Chart of Countries")

x <- seq(-pi,pi,0.1)
plot(x, sin(x))

plot(x, sin(x),

main="The Sine Function",

ylab="sin(x)")

plot(x, sin(x),

main="The Sine Function",

ylab="sin(x)",

type="l",

col="blue")

plot(x, sin(x),

main="Overlaying Graphs",

ylab="",

type="l",

col="blue")

lines(x,cos(x), col="red")

legend("topleft",
  c("sin(x)","cos(x)"),
  fill=c("blue","red")
)

max.temp <- c(Sun = 22, Mon = 27, Tue = 26, Wed = 24, Thu = 23, Fri = 26, Sat = 28)

max.temp # a named vector used for plotting

Sun Mon Tue Wed Thu Fri Sat
 22  27  26  24  23  26  28

par(mfrow=c(1,2)) # set the plotting area into a 1*2 array

barplot(max.temp, main="Barplot")

pie(max.temp, main="Piechart", radius=1)

Temperature <- airquality$Temp

Ozone <- airquality$Ozone

par(mfrow=c(2,2))

hist(Temperature)

boxplot(Temperature, horizontal=TRUE)

hist(Ozone)

boxplot(Ozone, horizontal=TRUE)

# make labels and margins smaller

par(cex=0.7, mai=c(0.1,0.1,0.2,0.1))

Temperature <- airquality$Temp

# define area for the histogram

par(fig=c(0.1,0.7,0.3,0.9))

hist(Temperature)

# define area for the boxplot

par(fig=c(0.8,1,0,1), new=TRUE)

boxplot(Temperature)

# define area for the stripchart

par(fig=c(0.1,0.67,0.1,0.25), new=TRUE)

stripchart(Temperature, method="jitter")
Drawing a 3D plot

cone <- function(x, y){
  sqrt(x^2+y^2)
}

# prepare our variables
x <- y <- seq(-1, 1, length.out = 20)

z <- outer(x, y, cone)

persp(x, y, z)

persp(x, y, z,

main="Perspective Plot of a Cone",

zlab = "Height",

theta = 30, phi = 15,

col = "springgreen", shade = 0.5)

Read csv file:

mydata <- read.csv("[Link]", header = TRUE) # "[Link]" stands for the path to the CSV file

mydata
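Since the CSV file used in the notes is not available, the following sketch writes a small data frame to a temporary file and reads it back, so the read.csv() call can be tried without any external data. The data values and the names sample_data and csv_path are made up for illustration.

```r
# Create a small example file in a temporary location.
csv_path <- tempfile(fileext = ".csv")
sample_data <- data.frame(
  id    = 1:3,
  name  = c("a", "b", "c"),
  score = c(90.5, 82, 77.25)
)
write.csv(sample_data, csv_path, row.names = FALSE)

# Read it back; header = TRUE treats the first line as column names.
mydata <- read.csv(csv_path, header = TRUE)

str(mydata)    # column types as detected by read.csv()
head(mydata)   # first rows of the data

unlink(csv_path)   # remove the temporary file
```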

output:

Anova

Analysis of Variance

Anova code:

y1 = c(18.2, 20.1, 17.6, 16.8, 18.8, 19.7, 19.1)

y2 = c(17.4, 18.7, 19.1, 16.4, 15.9, 18.4, 17.7)

y3 = c(15.2, 18.8, 17.7, 16.5, 15.9, 17.1, 16.7)

y = c(y1, y2, y3)

n = rep(7, 3)

group = rep(1:3, n)

group

tmp = tapply(y, group, stem)

tmpfn = function(x) c(sum = sum(x), mean = mean(x), var = var(x),

n = length(x))

tapply(y, group, tmpfn)

data = data.frame(y = y, group = factor(group))

fit = lm(y ~ group, data)

anova(fit)

df = anova(fit)[, "Df"]

names(df) = c("trt", "err")

anova(fit)["Residuals", "Sum Sq"]

anova(fit)["Residuals", "Sum Sq"]/qchisq(c(0.025, 0.975), 18,
  lower.tail = FALSE)

output:

> y1 = c(18.2, 20.1, 17.6, 16.8, 18.8, 19.7, 19.1)

> y2 = c(17.4, 18.7, 19.1, 16.4, 15.9, 18.4, 17.7)
> y3 = c(15.2, 18.8, 17.7, 16.5, 15.9, 17.1, 16.7)
> y = c(y1, y2, y3)
> n = rep(7, 3)
> n
[1] 7 7 7
> group = rep(1:3, n)
> group
[1] 1 1 1 1 1 1 1 2 2 2 2 2 2 2 3 3 3 3 3 3 3
> tmp = tapply(y, group, stem)

The decimal point is at the |

16 | 8
17 | 6
18 | 28
19 | 17
20 | 1

The decimal point is at the |

15 | 9
16 | 4
17 | 47
18 | 47
19 | 1

The decimal point is at the |

15 | 29
16 | 57
17 | 17
18 | 8

>
> tmpfn = function(x) c(sum = sum(x), mean = mean(x), var = var(x),
+ n = length(x))
> tapply(y, group, tmpfn)
$`1`
       sum       mean        var          n
130.300000  18.614286   1.358095   7.000000

$`2`
       sum       mean        var          n
123.600000  17.657143   1.409524   7.000000

$`3`
       sum       mean        var          n
117.900000  16.842857   1.392857   7.000000

>
> data = data.frame(y = y, group = factor(group))
> fit = lm(y ~ group, data)
> anova(fit)
Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq F value  Pr(>F)
group      2 11.007  5.5033  3.9683 0.03735 *
Residuals 18 24.963  1.3868
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> df = anova(fit)[, "Df"]
> names(df) = c("trt", "err")
> df
trt err
  2  18
>
> anova(fit)["Residuals", "Sum Sq"]
[1] 24.96286
>
> anova(fit)["Residuals", "Sum Sq"]/qchisq(c(0.025, 0.975), 18,
+ lower.tail = FALSE)
[1] 0.7918086 3.0328790

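Interpretation:

If the p-value from the F test is greater than or equal to 0.05, the null hypothesis of equal group means is not rejected; otherwise it is rejected. Here the p-value is 0.03735, which is below 0.05, so the null hypothesis is rejected at the 5% level.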
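The decision rule can also be applied in code by reading the p-value out of the ANOVA table. This is a minimal sketch that rebuilds the same data; the names p_value, alpha and decision are illustrative, and the 0.05 threshold is the one used in the text.

```r
y1 <- c(18.2, 20.1, 17.6, 16.8, 18.8, 19.7, 19.1)
y2 <- c(17.4, 18.7, 19.1, 16.4, 15.9, 18.4, 17.7)
y3 <- c(15.2, 18.8, 17.7, 16.5, 15.9, 17.1, 16.7)

y     <- c(y1, y2, y3)
group <- factor(rep(1:3, each = 7))   # group labels 1, 2, 3

fit <- lm(y ~ group)

# The ANOVA table is a data frame, so the p-value can be indexed by name.
p_value <- anova(fit)["group", "Pr(>F)"]

alpha <- 0.05
decision <- if (p_value < alpha) {
  "reject the null hypothesis"
} else {
  "do not reject the null hypothesis"
}
print(decision)
```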

Correlation:

Data set CORRELATION:

   X  Y
1 10 20
2 12 13
3  9 12
4 13  5
5  6  9
6  8  2
7 12  5
8 13  6

Pearson correlation:

cor(CORRELATION, use="complete.obs", method="pearson")

OUTPUT:

            X           Y
X  1.00000000 -0.09610721
Y -0.09610721  1.00000000

Spearman correlation:

cor(CORRELATION, use="complete.obs", method="spearman")

OUTPUT:

            X           Y
X  1.00000000 -0.09697148
Y -0.09697148  1.00000000

Kendall correlation and covariance:

cor(CORRELATION, use="complete.obs", method="kendall")

cov(CORRELATION, use="complete.obs")
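The notes do not show how the CORRELATION object is created. As a sketch, it can be built as a data frame from the X and Y values listed above, after which the correlation calls run as written. The names pearson and spearman are illustrative.

```r
CORRELATION <- data.frame(
  X = c(10, 12, 9, 13, 6, 8, 12, 13),
  Y = c(20, 13, 12, 5, 9, 2, 5, 6)
)

# Correlation between the two columns, by method.
pearson  <- cor(CORRELATION$X, CORRELATION$Y, method = "pearson")
spearman <- cor(CORRELATION$X, CORRELATION$Y, method = "spearman")

pearson    # about -0.0961
spearman   # about -0.0970

# The full correlation matrix, as in the output above.
cor(CORRELATION, method = "pearson")
```

Both coefficients are close to zero, which suggests little linear or monotonic association between X and Y in this small sample.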
Regression

alligator = data.frame(
  lnLength = c(3.87, 3.61, 4.33, 3.43, 3.81, 3.83, 3.46, 3.76,
               3.50, 3.58, 4.19, 3.78, 3.71, 3.73, 3.78),
  lnWeight = c(4.87, 3.93, 6.46, 3.33, 4.38, 4.70, 3.50, 4.50,
               3.58, 3.64, 5.90, 4.43, 4.38, 4.42, 4.25)
)
alligator # view data

alligator_regression = lm(lnWeight ~ lnLength, data = alligator)

summary(alligator_regression)

output:

> alligator_regression = lm(lnWeight ~ lnLength, data = alligator)
> lm(formula = lnWeight ~ lnLength, data = alligator)

Call:
lm(formula = lnWeight ~ lnLength, data = alligator)

Coefficients:
(Intercept)     lnLength
     -8.476        3.431

> summary(alligator_regression)

Call:
lm(formula = lnWeight ~ lnLength, data = alligator)

Residuals:
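Once the model is fitted, its coefficients can be extracted and used for prediction. The sketch below refits the same model; the new observation (lnLength = 4.0) and the names fit, new_obs and predicted are made up for illustration.

```r
alligator <- data.frame(
  lnLength = c(3.87, 3.61, 4.33, 3.43, 3.81, 3.83, 3.46, 3.76,
               3.50, 3.58, 4.19, 3.78, 3.71, 3.73, 3.78),
  lnWeight = c(4.87, 3.93, 6.46, 3.33, 4.38, 4.70, 3.50, 4.50,
               3.58, 3.64, 5.90, 4.43, 4.38, 4.42, 4.25)
)

fit <- lm(lnWeight ~ lnLength, data = alligator)

# Named vector with the intercept and the slope for lnLength.
coefs <- coef(fit)
coefs

# Predicted lnWeight for a hypothetical alligator with lnLength = 4.0.
new_obs   <- data.frame(lnLength = 4.0)
predicted <- predict(fit, newdata = new_obs)
predicted
```

With the coefficients printed above, the prediction is the intercept plus the slope times 4.0, that is roughly -8.476 + 3.431 * 4.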

Learn R Programming in A Day
100% (10)
Learn R Programming in A Day
229 pages
R Programming Presentation
100% (1)
R Programming Presentation
23 pages
R Programming
100% (4)
R Programming
163 pages
Data Analysis Using R
100% (1)
Data Analysis Using R
78 pages
R Programming
No ratings yet
R Programming
92 pages
Statistics with R Programming Course
No ratings yet
Statistics with R Programming Course
2 pages
R Programming Tutorial
100% (2)
R Programming Tutorial
66 pages
R Programming Notes
100% (1)
R Programming Notes
32 pages
R for NGS Data Analysis Beginners
No ratings yet
R for NGS Data Analysis Beginners
5 pages
R Programming A Step-by-Step Guide For Absolute Beginners by Daniel Bell
100% (1)
R Programming A Step-by-Step Guide For Absolute Beginners by Daniel Bell
145 pages
R for Statistics and Data Analysis
No ratings yet
R for Statistics and Data Analysis
91 pages
Statistical Analysis with R Basics
100% (1)
Statistical Analysis with R Basics
45 pages
Introduction to R Programming Guide
100% (1)
Introduction to R Programming Guide
101 pages
R Programming Course Notes
No ratings yet
R Programming Course Notes
28 pages
RSTUDIO
No ratings yet
RSTUDIO
44 pages
Statistics For Data Science by Mihir Patnaik
100% (1)
Statistics For Data Science by Mihir Patnaik
103 pages
R Tutorial
100% (2)
R Tutorial
196 pages
R Programming Cheatsheet
100% (2)
R Programming Cheatsheet
6 pages
Shiny Server Setup & App Development Guide
No ratings yet
Shiny Server Setup & App Development Guide
2 pages
The Art of R Programming
100% (2)
The Art of R Programming
193 pages
Data Manipulation with dplyr in R
100% (1)
Data Manipulation with dplyr in R
22 pages
Shiny
No ratings yet
Shiny
21 pages
R Programming Unit 1
No ratings yet
R Programming Unit 1
83 pages
R Programming by Rober D. Peng
No ratings yet
R Programming by Rober D. Peng
179 pages
FDS Unit I
No ratings yet
FDS Unit I
17 pages
Introduction to R Programming Basics
No ratings yet
Introduction to R Programming Basics
67 pages
RStudio Basics for Beginners
No ratings yet
RStudio Basics for Beginners
47 pages
R Programming Lab Manual
No ratings yet
R Programming Lab Manual
41 pages
Standard Deviation in R Programming
No ratings yet
Standard Deviation in R Programming
23 pages
Intro to Basic R Programming
No ratings yet
Intro to Basic R Programming
38 pages
Intro To Data Science Lecture 3
No ratings yet
Intro To Data Science Lecture 3
18 pages
Unit 1
No ratings yet
Unit 1
16 pages