R Statistical Package

Dec 2024
Wolaita Sodo,
Ethiopia
Outline
Introduction
Important Commands in R
Data Modes & Types in R
Operators in R
Creating Vectors, Matrices, Lists & Data frames in R
Importing Data into R
Descriptive Statistics & Graphics in R
Statistical Models in R
Introduction
R is a powerful computer program for performing statistical analysis and graphics.
It is free and easy to learn.
R is both a programming language and a platform for object-oriented statistical computing.
R has an excellent built-in help system.
R has excellent graphing capabilities.
R was initially written by Ross Ihaka and
Robert Gentleman at Dep. of Statistics of
It is "open-source" software (which for our
purposes means that it can be freely
downloaded);
Download R at http://cran.r-project.org/
It is available for a number of different
operating systems, including Windows, Linux,
and Macintosh;
By itself it is fairly powerful, and it is extensible
(meaning that procedures for analyzing data
Getting Started
Once you have installed R, there will be an
icon on your desktop. Double click it and R
will start up.
Type 'q()' to quit R.
R does have a few pull-down menus, but
mostly commands in R are entered on the
command line (>).
The > is a prompt symbol displayed by R, not typed by you. This is R’s way of telling you it’s ready for you to type a command.
To see the list of available datasets, call the data function with no arguments:
> data()
Installing and using R packages
(pkg_name below is a placeholder for the name of a package)
install.packages("pkg_name") # To install a package from CRAN
installed.packages() # To view the list of installed packages
library(pkg_name) # To load and use an R package
search() # To view loaded R packages
detach("package:pkg_name", unload = TRUE) # To unload an R package
remove.packages("pkg_name") # To remove installed packages
update.packages() # To update installed packages

Some Important Commands
>sort():- Sorts a vector of values in ascending order (use decreasing=TRUE for descending order).
>rev(x):- Reverses the elements of x
>rev(sort(x)):- Sorts the elements of x in decreasing order.
>rank():- Returns the ranks of the values in a vector.
>rep():- Repeats the same value several times, e.g., rep(pi,12)
>seq():- Generates regular sequences of values, e.g., seq(from=5,to=30,by=5)
>print():- Prints an object, optionally in a different format than simply typing its name, e.g., print(pi,digits=20)
>rm():- Removes (i.e., deletes) an object
>ls():- Lists all existing objects
>round(x, n):- Rounds the elements of x to n decimal places
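The commands on this slide can be tried on a small vector. The following is only a sketch; the vector x and the result names are made up for illustration.

```r
# A small made-up vector to try the commands on
x <- c(12, 5, 9, 5, 20)

sorted_up   <- sort(x)                     # ascending order
sorted_down <- sort(x, decreasing = TRUE)  # descending order
ranks       <- rank(x)                     # tied values share an average rank
repeated    <- rep("a", times = 3)         # "a" "a" "a"
steps       <- seq(from = 5, to = 30, by = 5)
rounded     <- round(pi, 2)

print(sorted_up)
print(ranks)
print(steps)
```

Note that rev(sort(x)) and sort(x, decreasing = TRUE) give the same result.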
Cont’d
>length():- Returns the number of values in an R
object
>mean():- Returns the arithmetic average of a vector
>median():- Returns the median of a vector
>max():- Returns the maximum value of a vector
>min():- Returns the minimum value of a vector
>range():- Returns the range of a vector, i.e.,
return the minimum and maximum values
>var():- Returns the variance of a vector
(computed with n-1 as a denominator)
>sd():- Returns the standard deviation of the
vector, i.e., the square root of the variance
>cor():- Returns the correlation coefficient
between two vectors
>summary():- Gives several descriptive statistics of an
object
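As a quick check of the descriptive functions above, the sketch below compares var() with the n-1 formula computed by hand. The vectors x and y are made-up illustration data.

```r
# Made-up data for illustration
x <- c(2, 4, 4, 5, 7, 8)
y <- c(1, 3, 2, 5, 6, 9)

n <- length(x)
manual_var <- sum((x - mean(x))^2) / (n - 1)  # variance with n-1 denominator

print(c(mean = mean(x), median = median(x)))
print(range(x))            # minimum and maximum
print(var(x))              # should match manual_var
print(sd(x))               # square root of the variance
print(cor(x, y))           # correlation between x and y
print(summary(x))          # several descriptive statistics at once
```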
Cont’d
>sqrt():- Returns the square root of values of a vector
>log():- Returns the natural logarithm of values of a
vector
>log10():- Returns the base-10 logarithm of values of
a vector
>exp():- Returns the exponential of values of a vector
>abs():- Returns the absolute values of a vector
>sin():- Returns the sine of values of a vector (in radians)
>cos():- Returns the cosine of values of a vector (in radians)
>tan():- Returns the tangent of values of a vector (in radians)
Finding help (e.g., on functions)
You will also find several help resources in the “Help” menu of the main R window.
Cont’d
For help on a specific function or command, use the function help() or the ? operator; either one opens the help page for that function, e.g.,
>help(mean) or ?mean
>help(var) or ?var
>help(t.test) or ?t.test, etc.
The help page gives all the information on how to use the function, with examples, references, etc.
Data Modes
Logical - Binary data mode, with values represented as TRUE or FALSE (abbreviated T or F).
Numeric - Numeric data mode includes integer and double-precision representations of numeric values.
Complex - Complex numeric values (real and imaginary parts).
Character - Character values represented
Data Types
Vector : A set of elements in a specified
order.
Matrix : is a two-dimensional array of
elements of the same mode.
Factor : is a vector of categorical data.
Data frame : is a two-dimensional array
whose columns may represent data of
different modes.
R as a calculator
Arithmetic: R can function as a calculator for scalar arithmetic, performing addition +, subtraction -, multiplication *, division /, exponentiation ^, modulus %%, and integer division %/%. Parentheses () specify the order of operations.
Example
> (17*0.35)^(1/3)
> log(10)
> exp(1)
> 3^-1
> 2+5
> (3+5/78)^3*7
[1] 201.3761
> 89%%13 # modulus (remainder)
[1] 11
> 89%/%13 # integer division
[1] 6
Assigning Values to variables
Values are assigned to variables using '<-' or '='
> x<-12.6
> x
[1] 12.6
Variables can also hold many values (vectors), created e.g. with the concatenate function c():
> y<-c(3,7,9,11)
> y
[1] 3 7 9 11
Assigning Values to variables
The operator ':' means "a series of integers between":
> x<-1:6
> x
[1] 1 2 3 4 5 6
Object names cannot contain special symbols like !, +, -, #.
A dot (.) and an underscore (_) are allowed; a name may also start with a dot.
Object names can contain a number but cannot start with a number.
R is case sensitive: X and x are two different objects, as are temp and temP.
> x = sin(9)/75
> y = log(x) + x^2
> x
> y
> m <- matrix(c(1,2,4,1), ncol=2)
> m
> solve(m) # inverse of m
To list the objects that you have in your current R session, use the function ls or the function objects.
> ls()
[1] "m" "x" "y"
Operators in R
I. Arithmetic Operators
* : Multiply
+ : Add
- : Subtract
/ : Divide
^ : Exponentiation
%% : Modulus
II. Comparison Operators
!= Not Equal To
< Less Than
<= Less Than or Equal To
== Equal
> Greater Than
>= Greater Than or Equal To
III. Logical Operators
!: Not
| : Or (element-wise, for vectors and arrays of logicals)
||: Sequential Or (for evaluating conditionals)
& : And (element-wise, for vectors and arrays of logicals)
&&: Sequential And (for evaluating conditionals)
Creating Vectors
Vectors in the mathematical sense are one-dimensional arrays.
Let's create two small vectors with data and a scatter plot.
>z1 <- c(1,2,3,4,5,6)
>z2 <- c(6,8,3,5,7,1)
>plot(z1,z2)
>title("My first scatterplot")
>a <- c(1,2,5.3,6,-2,4) # numeric vector
>b <- c("one","two","three") # character vector
>c <- c(TRUE,TRUE,TRUE,FALSE,TRUE,FALSE) #logical vector
Refer to elements of a vector using subscripts.
>a[c(2,4)] # 2nd and 4th elements of vector
Alternatively, create a vector as follows:
>d=1:12 # the integers from 1 to 12
>e=seq(10,20,0.1) # from 10 to 20 in steps of 0.1
MATRIX
Matrices are two-dimensional arrays; arrays with more dimensions are also possible.
All columns in a matrix must have the same mode (numeric, character, etc.) and the same length.
The general format is:
>mymatrix <- matrix(vector, nrow=r, ncol=c, byrow=FALSE,
dimnames=list(char_vector_rownames, char_vector_colnames))
byrow=TRUE indicates that the matrix should be filled by rows.
byrow=FALSE indicates that the matrix should be filled by columns (the default).
dimnames provides optional labels for the rows and columns.
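The effect of byrow and dimnames can be seen by filling the same values both ways. A minimal sketch, with object and label names chosen only for illustration:

```r
# Same six values, filled by columns (the default) and by rows
by_col <- matrix(1:6, nrow = 2, ncol = 3)
by_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE,
                 dimnames = list(c("r1", "r2"), c("a", "b", "c")))

print(by_col)   # first row is 1 3 5
print(by_row)   # first row is 1 2 3, with row and column labels

# With dimnames set, elements can be addressed by label
print(by_row["r2", "c"])
```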
Creating Matrix
An array is simply a vector with an associated dimension attribute, which gives its shape.
Arrays are similar to matrices but can have more than two dimensions. See help(array) for details.
Example
Use the following vector to create matrix
>cm <-
c(35,14,11,1,4,11,3,0,12,9,38,4,2,5,12,2)
>cm
>dim(cm) <- c(4, 4)
>cm
Cont’d
     [,1] [,2] [,3] [,4]
[1,]   35    4   12    2
[2,]   14   11    9    5
[3,]   11    3   38   12
[4,]    1    0    4    2
> dim(cm)
[1] 4 4
Cont’d
# generates a 5 x 3 numeric matrix filled by rows
>m <- matrix(1:15, 5, 3, byrow=TRUE)
# generates a 5 x 4 numeric matrix
>y<-matrix(1:20, nrow=5,ncol=4)
# another example
>cells <- c(1,26,24,68)
>rnames <- c("R1", "R2")
>cnames <- c("C1", "C2")
>mymatrix <- matrix(cells, nrow=2, ncol=2,
byrow=TRUE, dimnames=list(rnames, cnames))
#Identify rows, columns or elements using subscripts.
>y[,4] # 4th column of matrix
>y[3,] # 3rd row of matrix
>y[2:4,1:3] # rows 2,3,4 of columns 1,2,3
>y[2,3] # entry in row 2 and column 3
cbind and rbind can also be used to create matrices:
> x1 = 1:3
> x2 = c(7,6,6)
> x3 = c(12,19,21)
> A = cbind(x1,x2,x3) # bind vectors x1, x2, and x3 into a matrix, treating each as a column
> A = rbind(x1,x2,x3) # bind vectors x1, x2, and x3 into a matrix, treating each as a row
Other matrix commands are:
> dim(A) # get the dimensions of a matrix
> nrow(A) # number of rows
> ncol(A) # number of columns
> apply(A,1,sum) # apply the sum function to the rows of A
> apply(A,2,sum) # apply the sum function to the columns of A
> sum(diag(A)) # trace of A
> A = diag(1:3)
> solve(A) # inverse of A
> det(A) # determinant of A
Data frames
A data frame is more general than a matrix, in that different columns can have different modes (numeric, character, factor, etc.).
>d <- c(1,2,3,4)
>e <- c("red", "white", "red", NA)
>f <- c(TRUE,TRUE,TRUE,FALSE)
>mydata <- data.frame(d,e,f)
>names(mydata) <- c("ID","Color","Passed") # variable names
There are a variety of ways to identify the elements of a data frame (myframe below stands for any data frame that has the columns referred to).
>myframe[3:5] # columns 3,4,5 of data frame
>myframe[c("ID","Age")] # columns ID and Age from data frame
>myframe$X1 # variable X1 in the data frame
Cont’d
Example
# create a data frame
>age <- c(25, 30, 56)
>gender <- c("male", "female", "male")
>weight <- c(160, 110, 220)
>mydata <- data.frame(age,gender,weight)
Creating a spreadsheet to input data from the keyboard
# enter data using editor
>mydata <- data.frame(age=numeric(0), gender=character(0), weight=numeric(0))
>mydata <- edit(mydata)
# note that without the assignment in the line above,
# the edits are not saved!
Data frame

Create the following data frame:
Use attach to make the variables accessible by name:
> attach(worms)
Use names to get a list of variable names:
> names(worms)
Data frame
Selecting Parts of a Data frame: Subscripts
Subscripts within square brackets select part of a data frame: [, means “all the rows” and ,] means “all the columns”.
To select the first three columns of the data frame worms:
> worms[,1:3]
                Area Slope Vegetation
Nashs.Field      3.6    11  Grassland
Silwood.Bottom   5.1     2     Arable
Nursery.Field    2.8     3  Grassland
Rush.Meadow      2.4     5     Meadow
Gunness.Thicket  3.8     0      Scrub
(…)
Lists
Creating Lists
Lists can be created using the list function. Like data frames, they can incorporate a mixture of modes, and each component can be of a different length or size.
For example, we might create a list from scratch as follows.
> L1 <- list(x = sample(1:5, 20, replace=TRUE), y = rep(letters[1:5], 4), z = rpois(20, 1))
> L1
> L1[1] # indexing: a list holding only the first component
Working with Lists
The length of a list is equal to the number of components in that list.
>length(L1)
To determine the names assigned to a list, the names function can be
used. Names of lists can also be altered in a similar way to that
shown for data frames.
> names(L1) <- c("Item1","Item2","Item3")
Joining two lists can be achieved using the concatenation function c():
> L2 <- list(x=c(1,5,6,7), y=c("apple","orange","melon","grapes"))
> c(L1,L2)
List vs Vector vs Data frame

list: an ordered collection of data of arbitrary types.

vector: an ordered collection of data of the same
type
Data frame: represents the typical data table that researchers work with, like a spreadsheet.
It is a rectangular table with rows and columns; data within each column has the same type (e.g. number, text, logical), but different columns may have different types.
Subsetting:
Individual elements of a vector, matrix, array or data frame are accessed with “[ ]” by specifying their index or their name.
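The comparison above can be illustrated with one object of each kind. This is a sketch only; the objects v, l and df and their contents are made up for the example.

```r
v  <- c(a = 10, b = 20, c = 30)                      # vector: one type
l  <- list(id = 1L, tags = c("x", "y"), ok = TRUE)   # list: arbitrary types
df <- data.frame(age = c(25, 30, 56),
                 gender = c("male", "female", "male"))  # one type per column

print(v["b"])          # by name
print(v[2])            # by index
print(l["tags"])       # single brackets keep the list wrapper
print(l[["tags"]])     # double brackets extract the component itself
print(df[2, "age"])    # row 2 of column age
print(df[df$age > 26, ])  # rows selected by a logical condition
```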
Useful Functions
>length(object) # number of elements or components
>names(object) # names
>c(object,object,...) # combine objects into a vector
>cbind(object, object, ...) # combine objects as columns
>rbind(object, object, ...) # combine objects as rows
>ls() # list current objects
>rm(object) # delete an object
>newobject <- edit(object) # edit a copy and save it as a new object
>fix(object) # edit in place
Data Import
From the keyboard, one by one
c( )
weight <- c(160, 110, 220)
From a file
read.table(); read.csv(); read.dta(); read.spss(); …
# To read data saved as tab-delimited text on local disk D
# (folder name = Rtraining, file name = datatry, exported from Excel):
dat1<-read.table("D:/Rtraining/datatry.txt", header=TRUE)
dat1
attach(dat1)
# To import Excel data saved as comma-separated values (csv)
getwd() # to get the working directory
dat2<-read.csv("D:/Rtraining/mastmo.csv") # saved on local disk D, folder=Rtraining, file name=mastmo, 54 african data
dat2
Data Import…
# How to import data from SPSS into R
library(foreign) # first you have to load the package foreign
# We need to know the current working directory
getwd()
dat3<-read.spss("D:/Rtraining/employee.sav", to.data.frame=TRUE) # saved on local disk D, folder=Rtraining, file name=employee; read.spss has no header argument
dat3
attach(dat3) # to make the variables accessible by name
By a spreadsheet
data.entry()
edit()
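Because the file paths above only exist on the presenter's machine, a round trip through a temporary file is a convenient way to try read.csv. This is a sketch under that assumption; the small data frame and the temporary file are made up for illustration.

```r
# Write a small made-up table to a temporary csv file
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(age = c(25, 30, 56),
                     gender = c("male", "female", "male")),
          tmp, row.names = FALSE)

# Read it back; header = TRUE is the default for read.csv
dat <- read.csv(tmp, header = TRUE)
str(dat)

unlink(tmp)  # remove the temporary file
```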
Value Labels
You can use the factor function to create your own value labels.
# variable v1 is coded 1, 2 or 3
# we want to attach value labels 1=red, 2=blue, 3=green
>mydata$v1 <- factor(mydata$v1,
levels = c(1,2,3),
labels = c("red", "blue", "green"))
# mydata$sex <- factor(mydata$sex, levels = c(1,2), labels = c("male", "female"))
# variable y is coded 1, 3 or 5
# we want to attach value labels 1=Low, 3=Medium, 5=High
>mydata$y <- ordered(mydata$y, levels = c(1,3,5),
labels = c("Low", "Medium", "High"))
Note: factor and ordered are used the same way, with the same arguments. The former creates factors and the latter creates ordered factors.
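The practical difference between the two is that the levels of an ordered factor can be compared. A minimal sketch, using a made-up vector of codes:

```r
codes <- c(1, 3, 5, 3, 1)   # made-up coded values

f <- factor(codes,  levels = c(1, 3, 5), labels = c("Low", "Medium", "High"))
o <- ordered(codes, levels = c(1, 3, 5), labels = c("Low", "Medium", "High"))

print(table(f))        # counts per label
print(levels(o))       # "Low" "Medium" "High", in that order
below_high <- o < "High"   # comparisons are meaningful for ordered factors
print(below_high)
```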
Creating new variables
Use the assignment operator <- to create new variables.
A wide array of operators and functions are available
here.
# Three examples for doing the same computations
>mydata$sum <- mydata$x1 + mydata$x2
>mydata$mean <- (mydata$x1 + mydata$x2)/2
>attach(mydata)
>mydata$sum <- x1 + x2
>mydata$mean <- (x1 + x2)/2
>detach(mydata)
>mydata <- transform( mydata, sum = x1 + x2,mean =
(x1+ x2)/2 )
Recoding variables
In order to recode data, you will probably use one or more of R's control structures.
# create 2 age categories
>mydata$agecat <- ifelse(mydata$age > 70, "older", "younger")
# another example: create 3 age categories
>attach(mydata)
>mydata$agecat[age > 75] <- "Elder"
>mydata$agecat[age > 45 & age <= 75] <- "Middle Aged"
>mydata$agecat[age <= 45] <- "Young"
>detach(mydata)
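The same three age categories can also be produced in one step with cut(), which is a different technique from the indexed assignments above. The sketch below uses a made-up mydata with only an age column.

```r
mydata <- data.frame(age = c(30, 45, 46, 75, 80))  # made-up ages

# Intervals are closed on the right by default:
# (-Inf, 45], (45, 75], (75, Inf)
mydata$agecat <- cut(mydata$age,
                     breaks = c(-Inf, 45, 75, Inf),
                     labels = c("Young", "Middle Aged", "Elder"))
print(mydata)
```

cut() returns a factor, so the result is ready for use in tables and models.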
Merging Data frames/files/data
To merge two data frames (datasets) horizontally, use the merge function. In most cases, you join two data frames by one or more common key variables (i.e., an inner join).
# merge two data frames by ID
>total <- merge(dataframeA, dataframeB, by="ID")
# merge two data frames by ID and Country
>total <- merge(dataframeA, dataframeB, by=c("ID","Country"))
ADDING ROWS
To join two data frames (datasets) vertically, use the rbind function.
The two data frames must have the same variables, but they do not have to be in the same order.
>total <- rbind(dataframeA, dataframeB)
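A small worked example may help here. The data frames below are made up for illustration; the all = TRUE variant is shown only to contrast it with the default inner join.

```r
dataframeA <- data.frame(ID = c(1, 2, 3), age = c(25, 30, 56))
dataframeB <- data.frame(ID = c(2, 3, 4), weight = c(110, 220, 150))

inner <- merge(dataframeA, dataframeB, by = "ID")              # IDs in both
outer <- merge(dataframeA, dataframeB, by = "ID", all = TRUE)  # all IDs, NA where missing
print(inner)
print(outer)

# rbind matches columns by name, so their order may differ
top     <- data.frame(ID = 1:2, age = c(25, 30))
bottom  <- data.frame(age = 56, ID = 3L)
stacked <- rbind(top, bottom)
print(stacked)
```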
Statistical Hypothesis Tests
>t.test(): -> Student's t-tests (one and two samples)
>var.test(): -> Fisher's F test (equality of two variances)
>cor.test(): -> correlation tests
>chisq.test(): -> χ² (chi-squared) test
>prop.test(): -> proportion tests (one proportion & difference of two proportions)
>wilcox.test(): -> Wilcoxon tests (one and two samples)
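All of these functions return an object of class "htest" whose components (p-value, confidence interval, and so on) can be extracted by name. The following sketch uses simulated data; the group means, sample sizes and the 2 x 2 table are assumptions made for illustration.

```r
set.seed(1)
g1 <- rnorm(20, mean = 50, sd = 5)   # simulated group 1
g2 <- rnorm(20, mean = 55, sd = 5)   # simulated group 2

tt <- t.test(g1, g2)        # two-sample t-test (Welch version by default)
vt <- var.test(g1, g2)      # F test for equality of variances
wt <- wilcox.test(g1, g2)   # two-sample Wilcoxon test

print(tt$p.value)
print(tt$conf.int)

# Chi-squared test on a made-up 2 x 2 table of counts
tab <- matrix(c(12, 8, 5, 15), nrow = 2)
ct  <- chisq.test(tab)
print(ct$p.value)
```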
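Graphical Procedures
> plot(x) # the generic plotting function; plots the values of x against their index
> plot(xvalues,yvalues) # scatter plot of yvalues against xvalues
Histograms
Histograms are a useful graphic for displaying univariate data.
>hist(x)
>boxplot(x) # to produce a box plot
Q-Q Plot: to check normality
> qqnorm(resid,main="Normal Q-Q plot") # resid holds the residuals of a fitted model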
Changing the Look of Graphics
> plot(xvalues,yvalues, ylab = "Label for y axis", xlab = "Label for x axis", las = 1, cex.lab = 1.5)
las: a number in {0,1,2,3} that changes the orientation of the axis labels;
cex.lab: magnification to be used for the x and y labels;
To get the full list of graphical parameters:
>?par
Cont’d
Each function has its own set of arguments. The most common ones are
xlim, ylim: range of the variable plotted on the x and y axis respectively
pch, col, lty: plotting character, colour and line type
xlab, ylab: labels of the x and y axis respectively
main, sub: main title and sub-title of the graph
type="l" (line), "p" (point), "h" (vertical line)…
Example:
## plot the graph of f(x)=x^2+2x+9 between x=-3 and x=3
> x<-seq(-3,3,0.01)
> f<-x^2+2*x+9
> plot(x,f,type="l",main="Graph of Quadratic",xlab="Xvalue",ylab="Function value",col="red")
Cont’d
[Figure: "Graph of Quadratic", a line plot of the function value (y axis) against Xvalue (x axis, from -3 to 3).]
> #plot of histogram

> xx<-rnorm(100,36,10)
> hist(xx,main="Histogram",nclass=25,col=5)

[Figure: "Histogram" of xx, with Frequency on the y axis.]
Example
The plot of x^3 −3x between x=−2 and x=2:
>curve(x^3-3*x, -2, 2)
Here is the more cumbersome code to do the
same thing using plot:
>x<-seq(-2,2,0.01)
>y<-x^3-3*x
>plot(x,y,type="l")
More Graphical Parameters
Color options and their descriptions
col # Default plotting color. Some functions (e.g. lines) accept a vector of values that are recycled.
col.axis # color for axis annotation
col.lab # color for x and y labels
col.main # color for titles
col.sub # color for subtitles
fg # plot foreground color (axes, boxes - also sets col= to same)
bg # plot background color
Scatterplot Matrices
# Basic Scatterplot Matrix
>pairs(~mpg+disp+drat+wt,data=mtcars,
main="Simple Scatterplot Matrix")
Statistical Models in R
Regression Model in R
>fit1<-lm(y ~ x): -> Simple regression
>lm(y ~ 1+x): -> Explicit intercept
>lm(y ~ -1 + x): -> Through the origin
>fit<-lm(y ~ x + x2): -> Quadratic regression (with x2 = x^2; equivalently lm(y ~ x + I(x^2)))
>fit<-lm(y ~ x1 + x2 + x3): -> Multiple regression
>coef(fit) -> to find regression coefficients
>resid(fit) -> to find residuals
>fitted(fit) -> to find fitted values
>summary(fit) -> to find analysis summary
>predict(fit) -> fitted values; supply newdata= to predict for new data
>anova(fit) # to get anova table
>deviance(fit) -> residual sum of squares
>plot(fitted(fit), resid(fit)) # to check constant variance assumption
>qqnorm(resid(fit)) # to check normality assumption
>X <- model.matrix(~ y - 1, Data) # design matrix for the formula, without an intercept column
Cont’d
Fitting the Model
# Multiple Linear Regression Example
>fit <- lm(y ~ x1 + x2 + x3, data=mydata)
>summary(fit) # show results
# Other useful functions
>coefficients(fit) # model coefficients
>confint(fit, level=0.95) # CIs for model parameters
>fitted(fit) # predicted values
>residuals(fit) # residuals
>anova(fit) # anova table
>vcov(fit) # covariance matrix for model parameters
>influence(fit) # regression diagnostics
# diagnostic plots provide checks for heteroscedasticity,
# normality, and influential observations.
>plot(fit) # diagnostic plots
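The functions listed above can be tried on the built-in mtcars data. This is only a sketch: the choice of mpg as response and of wt and hp as predictors, and the new_cars values, are assumptions made for illustration.

```r
# Multiple linear regression on a built-in data set
fit <- lm(mpg ~ wt + hp, data = mtcars)

print(coef(fit))                    # model coefficients
print(confint(fit, level = 0.95))   # confidence intervals for the parameters

# Prediction for new data: a data frame with the same predictor names
new_cars <- data.frame(wt = c(2.5, 3.5), hp = c(100, 150))
pred <- predict(fit, newdata = new_cars)
print(pred)

# Fitted values plus residuals reproduce the observed response
recon <- unname(fitted(fit) + resid(fit))
```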
Example of Simple LRM from R data set
>data() # to view the data sets available in R
>edit(cars) # to view the data frame cars in the data editor (close it to continue); built-in data sets can be used without importing them
>names(cars)
[1] "speed" "dist"
> y<-cars$speed
> x<-cars$dist
> fit<-lm(y~x)
> fit
>plot(resid(fit),fitted(fit),main="CCVA", ylab="fitted", xlab="resid")
>qqnorm(resid(fit),main="QQ plot")
Example of Multiple LRM from R data set
>data() # to view the data sets available in R
>edit(rock) # to view the data frame rock in the data editor (close it to continue)
>names(rock) # names in data frame rock
[1] "area" "peri" "shape" "perm"
> Y<-rock$area
> X1<-rock$peri
> X2<-rock$shape
> X3<-rock$perm
> fit1<-lm(Y~X1+X2+X3) # fitting multiple linear regression model
> fit1
Tree data example
>data() # to view data set in R
>edit(trees) # to view data trees
> names(trees)
[1] "Girth" "Height" "Volume"
> Y<-trees$Girth
> x1<-trees$Height
> x2<-trees$Volume
> fit1<-lm(Y~x1+x2)
>fit1
>coef(fit1)
>anova(fit1)
Extracting Statistics from the Regression
The most important statistics and parameters of a regression are stored in the lm object or the summary object.
# result is a fitted model, e.g. result <- lm(y ~ x)
> output <- summary(result)
> SSR <- deviance(result)
> LL <- logLik(result)
> DegreesOfFreedom <- result$df.residual
> Yhat <- result$fitted.values
> Coef <- result$coefficients
> Resid <- result$residuals
> s <- output$sigma
> RSquared <- output$r.squared
> CovMatrix <- s^2*output$cov.unscaled
> aic <- AIC(result)
>vcov(result) # variance-covariance matrix of the coefficients
linear model
>lm(y~x): -> to fit a simple regression model
>lm(y~x1+x2): -> to fit a multiple linear regression model using two regressors x1 & x2
>aov(y~x): -> to fit a one way ANOVA model
>f=as.factor(f): -> transforms f into a factor
>lm(y~f): -> one factor ANOVA
>lm(y~f1+f2): -> two factors ANOVA
>lm(y~x+f): -> analysis of covariance
Families:
?family # to see the available model families
Logistic regression
glm.out=glm(y~x, binomial)
Poisson regression
glm.out=glm(y~x, poisson)
Remark:
>lm(y~x) is equivalent to >glm(y~x, gaussian)
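The remark above can be checked numerically on built-in data. A minimal sketch; the use of the cars and warpbreaks data sets and the chosen formulas are assumptions made for illustration.

```r
# lm and glm with the gaussian family give the same coefficients
fit_lm  <- lm(dist ~ speed, data = cars)
fit_glm <- glm(dist ~ speed, family = gaussian, data = cars)
print(coef(fit_lm))
print(coef(fit_glm))

# A Poisson regression for count data uses a log link by default
fit_pois <- glm(breaks ~ wool + tension, family = poisson, data = warpbreaks)
print(family(fit_pois))
```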
ANOVA MODEL
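Partition of variation into
Between groups
Within groups
The model (One Way ANOVA Model):
$Y_{ij} = \mu + \alpha_j + \varepsilon_{ij}$
Assumptions:
Normality
Independence
Homogeneity of variance
$\mathrm{Var}(Y) = \mathrm{Var}(\mu) + \mathrm{Var}(\alpha) + \mathrm{Var}(\varepsilon) = \mathrm{Var}(\alpha) + \mathrm{Var}(\varepsilon)$, since $\mu$ is a constant.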
Example: Perform one way ANOVA for the data given in the table below:
A   B   C
43  41  39
40  47  34
35  54  37
>treat <- c(1,1,1,2,2,2,3,3,3)
>y <- c(43, 40, 35, 41, 47, 54, 39, 34, 37)
>treat <- as.factor(treat)
>fit <- aov(y ~ treat)
>summary(fit)
>anova(fit)
>treat <- c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5)
>yield <- c(13.25,25.61,37.9,38.65,2.14,14.05,23.8,37.5,38.93,1.91,14.21,24.76,36.44,37.8,1.78)
>treat <- as.factor(treat)
>fit <- aov(yield ~ treat)
>summary(fit)
>anova(fit)
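The partition of variation into between-group and within-group parts can be verified by hand for the three-group data (A, B, C) of the first example. This is a sketch; the helper names grand, group_means and so on are illustrative.

```r
y     <- c(43, 40, 35, 41, 47, 54, 39, 34, 37)
treat <- factor(rep(c("A", "B", "C"), each = 3))

grand       <- mean(y)
group_means <- tapply(y, treat, mean)     # one mean per group
group_sizes <- tapply(y, treat, length)   # observations per group

ss_between <- sum(group_sizes * (group_means - grand)^2)
ss_within  <- sum((y - ave(y, treat))^2)  # ave() gives each value's group mean
ss_total   <- sum((y - grand)^2)

tab <- anova(aov(y ~ treat))
print(c(between = ss_between, within = ss_within, total = ss_total))
print(tab)   # the "Sum Sq" column holds the same two sums of squares
```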
ANOVA - Fit a Model
# One Way Anova (Completely Randomized Design)
>fit <- aov(y ~ group)
# Randomized Block Design (B is the blocking factor)
>fit <- aov(y ~ A + B)
# Two Way Factorial Design
>fit <- aov(y ~ A + B + A*B, data=mydataframe)
>fit <- aov(y ~ A*B, data=mydataframe) # same thing
# Analysis of Covariance
>fit <- aov(y ~ A + x, data=mydataframe)
For within subjects designs, the dataframe has to be rearranged
so that each measurement on a subject is a separate observation
# One Within Factor
>fit <- aov(y~A+Error(Subject/A),data=mydataframe)
# Two Within Factors W1 W2, Two Between Factors B1 B2
>fit <- aov(y~(W1*W2*B1*B2)+Error(Subject/(W1*W2))+(B1*B2),
data=mydataframe)
Factorial ANOVA
Example: Perform factorial ANOVA for the following data
Variety   Pesticide               Total
            1     2     3     4
B1         29    50    43    53    175
B2         41    58    42    73    214
B3         66    85    69    85    305
Total     136   193   154   211    694
Model: $\text{product} = \mu\,(\text{mean}) + \beta\,(\text{variety}) + \gamma\,(\text{pesticide}) + \varepsilon$
>variety <- c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3)
>pesticide <- c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4)
>product <- c(29,50,43,53,41,58,42,73,66,85,69,85)
>variety <- as.factor(variety)
>pesticide <- as.factor(pesticide)
>fit<-aov(product~variety+pesticide)
>anova(fit)
Cont’d
>analysis <- aov(product ~ variety + pesticide)
>anova(analysis)
Analysis of Variance Table
Response: product
          Df  Sum Sq Mean Sq F value   Pr(>F)
variety    2 2225.17 1112.58  44.063 0.000259 ***
pesticide  3 1191.00  397.00  15.723 0.003008 **
Residuals  6  151.50   25.25
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Lab Activity on One Way ANOVA
Perform a one-way ANOVA using the following data obtained from four treatments
Low Sugar -    Low Sugar -     High Sugar -   High Sugar -
Low Acidity    High Acidity    Low Acidity    High Acidity
65             63              72             60
72             69              80             61
77             65              83             55
68             73              75             62
81             71              78             64
59             55              62             50
72             65              84             57
80             66              91             58
66             73              79             60
77             64              84             65
71             60              79             60
91             72              88             63
74             80              86             65
69             62              78             57
73             59              84             64
The Logistic Model
Data:
Binary outcomes (e.g. disease status)
Aim:
To identify which factors influence the outcome
$\Pr(Y_i = 0) = \dfrac{\exp(\eta_i)}{1 + \exp(\eta_i)}$
$\eta_i = \sum_j x_{ij} \beta_j$ - Linear Predictor
$x_{ij}$ – Design Matrix (genotypes etc)
$\beta_j$ – Model Parameters (to be estimated)
The model is investigated by
estimating the $\beta_j$'s by maximum likelihood
testing if the estimates are different from 0
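The link between the linear predictor and the outcome probability can be sketched with simulated data. The coefficients -0.5 and 1.2, the sample size and the variable names are assumptions made for illustration. Note that with a 0/1 response R's glm models the probability of the outcome coded 1, so which outcome is coded 1 is a choice made when preparing the data.

```r
set.seed(42)
x   <- rnorm(200)                  # a single simulated covariate
eta <- -0.5 + 1.2 * x              # linear predictor with assumed coefficients
p   <- exp(eta) / (1 + exp(eta))   # inverse logit; plogis(eta) gives the same
y   <- rbinom(200, size = 1, prob = p)

# Maximum likelihood estimation of the coefficients
fit <- glm(y ~ x, family = binomial)

# Estimates, standard errors and Wald tests of "coefficient = 0"
print(summary(fit)$coefficients)
```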
Fitting the Model
# additive() and genotype() are not base R functions; they are assumed to be defined elsewhere
>afit <- glm(y ~ additive(x), family='binomial')
Model Comparison
> afit <- glm(t$y ~ additive(t$m20), family='binomial')
> gfit <- glm(t$y ~ genotype(t$m20), family='binomial')
> anova(afit,gfit)
R> data("plasma", package = "HSAUR2")
R> plasma_glm_1 <- glm(ESR ~ fibrinogen, data = plasma, family = binomial()) # simple logistic regression
R> data("womensrole", package = "HSAUR2")
R> fm1 <- cbind(agree, disagree) ~ gender + education
R> womensrole_glm_1 <- glm(fm1, data = womensrole, family = binomial())
> no.yes <- c("No","Yes")
> smoking <- gl(2,1,8,no.yes)
> obesity <- gl(2,2,8,no.yes)
> snoring <- gl(2,4,8,no.yes)
> n.tot <- c(60,17,8,2,187,85,51,23)
> n.hyp <- c(5,2,1,0,35,13,15,8)
> data.frame(smoking,obesity,snoring,n.tot,n.hyp)
The gl function is used to “generate levels”.
R is able to fit logistic regression analyses for tabular data in two different ways.
The first way is to give a two-column matrix of counts (diseased, not diseased) as the response:
> hyp.tbl <- cbind(n.hyp,n.tot-n.hyp)
> glm(hyp.tbl~smoking+obesity+snoring,family=binomial("logit"))
The other way to specify a logistic regression model is to give the proportion of diseased in each cell, with the cell totals as weights:
> prop.hyp <- n.hyp/n.tot
> glm.hyp <- glm(prop.hyp~smoking+obesity+snoring, binomial, weights=n.tot)
> summary(glm.hyp)
> confint(glm.hyp)
> exp(confint(glm.hyp)) # confidence intervals on the odds-ratio scale
Thank You!
