STA 421 Notes II
WACHANA
3. More on R Graphics
R not only has fancy graphical tools, it also has all sorts of useful commands that allow users
to control almost every aspect of their graphical output in the finest detail.
3.1 Histogram
We will use a data set fuel.frame which is based on makes of cars taken from the April 1990
issue of Consumer Reports.
> fuel.frame<-read.table("c:/fuel-frame.txt", header=T, sep=",")
> names(fuel.frame)
[1] "row.names" "Weight" "Disp." "Mileage" "Fuel" "Type"
> attach(fuel.frame)
attach() allows you to reference variables in fuel.frame without the cumbersome fuel.frame$
prefix.
In general, graphic functions are very flexible and intuitive to use. For example, hist()
produces a histogram, boxplot() does a boxplot, etc.
> hist(Mileage)
> hist(Mileage, freq=F) # if probability instead of frequency is desired
Let us look at the Old Faithful geyser data, which is a built-in R data set.
> data(faithful)
> attach(faithful)
> names(faithful)
[1] "eruptions" "waiting"
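The eruption durations can be plotted in the same way as Mileage. A minimal sketch, assuming only the built-in faithful data; the density overlay and the choice breaks=20 are illustrative additions:

```r
# Histogram of Old Faithful eruption durations on the density scale
data(faithful)
h <- hist(faithful$eruptions, breaks = 20, freq = FALSE,
          main = "Old Faithful eruptions", xlab = "Duration (minutes)")
# Overlay a kernel density estimate to show the two modes
lines(density(faithful$eruptions), lty = 2)
```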
STA 421 NOTES BATCH II R. WACHANA
3.2 Boxplot
> boxplot(Weight) # usual vertical boxplot
> boxplot(Weight, horizontal=T) # horizontal boxplot
> rug(Weight, side=2)
If you want to get the statistics involved in the boxplots, the following commands show them. In
this example, a$stats gives the value of the lower end of the whisker, the first quartile (25th
percentile), second quartile (median=50th percentile), third quartile (75th percentile), and the
upper end of the whisker.
> a<-boxplot(Weight, plot=F)
> a$stats
[,1]
[1,] 1845.0
[2,] 2567.5
[3,] 2885.0
[4,] 3242.5
[5,] 3855.0
> a #gives additional information
> fivenum(Weight) #directly obtain the five number summary
[1] 1845.0 2567.5 2885.0 3242.5 3855.0
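The whisker ends reported in a$stats coincide with the minimum and maximum only when no points are flagged as outliers. A small sketch with a made-up vector (the values in x.demo are purely illustrative) shows the difference:

```r
# Hypothetical data with one extreme value
x.demo <- c(1, 2, 3, 4, 100)
b <- boxplot(x.demo, plot = FALSE)
b$stats[, 1]      # whisker ends and hinges: 1 2 3 4 4
b$out             # point beyond the whiskers: 100
fivenum(x.demo)   # five-number summary: 1 2 3 4 100
```

Here the upper whisker stops at 4 because 100 lies more than 1.5 times the interquartile range above the upper hinge.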
Boxplots are more useful when comparing grouped data. For example, side-by-side boxplots of
weights grouped by vehicle type are shown below:
> boxplot(Weight ~Type)
> title("Weight by Vehicle Types")
3.3 plot()
plot() is a general graphic command with numerous options.
> plot(Weight)
The following command produces a scatterplot with Weight on the x-axis and Mileage on the
y-axis.
> plot(Weight, Mileage, main="Weight vs. Mileage")
A fitted straight line is shown in the plot by executing two more commands.
> fit<-lm(Mileage~Weight)
> abline(fit)
3.4 matplot()
matplot() is used to plot two or more vectors of equal length.
> y60<-c(316.27, 316.81, 317.42, 318.87, 319.87, 319.43, 318.01, 315.74,
314.00, 313.68, 314.84, 316.03)
> y70<-c(324.89, 325.82, 326.77, 327.97, 327.91, 327.50, 326.18, 324.53,
322.93, 322.90, 323.85, 324.96)
> y80<-c(337.84, 338.19, 339.91, 340.60, 341.29, 341.00, 339.39, 337.43,
335.72, 335.84, 336.93, 338.04)
> y90<-c(353.50, 354.55, 355.23, 356.04, 357.00, 356.07, 354.67, 352.76,
350.82, 351.04, 352.69, 354.07)
> y97<-c(363.23, 364.06, 364.61, 366.40, 366.84, 365.68, 364.52, 362.57,
360.24, 360.83, 362.49, 364.34)
> CO2<-data.frame(y60, y70, y80, y90, y97)
> row.names(CO2)<-c("Jan", "Feb",
"Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec")
> CO2
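The data can then be passed to matplot(), which draws one line per column. A sketch using two of the columns above; the line types, axis labels and legend position are illustrative choices rather than the settings of any original figure:

```r
# Two of the yearly series from above (monthly CO2 values)
y60 <- c(316.27, 316.81, 317.42, 318.87, 319.87, 319.43, 318.01, 315.74,
         314.00, 313.68, 314.84, 316.03)
y97 <- c(363.23, 364.06, 364.61, 366.40, 366.84, 365.68, 364.52, 362.57,
         360.24, 360.83, 362.49, 364.34)
co2.sub <- cbind(y60, y97)

# One line per column; x is the month index 1..12
matplot(1:12, co2.sub, type = "l", lty = 1:2, col = 1,
        xlab = "Month", ylab = "CO2")
legend("right", legend = colnames(co2.sub), lty = 1:2)
```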
4. Plot Options
If the main title is too long, you can split it into two lines with \n. Adding a subtitle below the
horizontal axis label is just as easy:
> title(main="Title is too long \n so split it into two", sub="subtitle goes here")
By default, when you issue a plot command R inserts the variable name(s) as axis labels if they are
available and works out the range of the x-axis and y-axis by itself. Sometimes you may want to change these:
> plot(Fuel, Weight, ylab="Weight in pounds", ylim=c(1000,6000))
Similarly, you can specify xlab and xlim to change the x-axis. If you do not want the default labels
to appear, specify xlab=" ", ylab=" ". This gives you a plot with no axis labels. Of course, you
can add the labels afterwards using the appropriate arguments of title().
You can also specify the line type using the lty argument of plot():
> plot(Fuel, type="l", lty=1) # the usual series plot (solid line)
> plot(Fuel, type="l", lty=2) # dashed line instead; R's built-in line types are numbered 1 to 6
> plot(Fuel, type="l", lty=1); title(main="Fuel data", sub="lty=1")
> plot(Fuel, type="l", lty=2); title(main="Fuel data", sub="lty=2")
> plot(Fuel, type="l", lty=3); title(main="Fuel data", sub="lty=3")
> plot(Fuel, type="l", lty=4); title(main="Fuel data", sub="lty=4")
Note that we can control the thickness of the lines with lwd: lwd=1 is the default, and larger values such as lwd=5 give thicker lines.
depends on the system you are using. You may want to experiment with different numbers. Of
course, you can specify the col option together with other options such as type or lty. The pch option
allows you to choose alternative plotting characters when making a points-type plot. For
example:
> plot(Fuel, pch="*") # plots with * characters
> plot(Fuel, pch="M") # plots with M.
bty ="n"; No box is drawn around the plot, although the x and y axes are still drawn.
bty="o"; The default box type; draws a four-sided box around the plot.
bty="c"; Draws a three-sided box around the plot in the shape of an uppercase "C."
bty="l"; Draws a two-sided box around the plot in the shape of an uppercase "L."
bty="7"; Draws a two-sided box around the plot in the shape of a square numeral "7."
> par(mfrow = c(2,2))
> plot(Fuel)
> plot(Fuel, bty="l")
> plot(Fuel, bty="7")
> plot(Fuel, bty="c")
4.6 Legend
legend() is useful when adding more information to the existing plot.
In the following example, the legend() command says
(1) put a box whose upper left corner coordinates are x=30 and y=3.5;
(2) write the two texts Fuel and Smoothed Fuel within the box together with corresponding
symbols described in pch and lty arguments.
> par(mfrow = c(1,1))
> plot(Fuel)
> lines(lowess(Fuel))
> legend(30, 3.5, c("Fuel","Smoothed Fuel"), pch="* ", lty=c(0,1))
If you want to keep the legend box from appearing, add bty="n" to the legend command.
mtext() allows you to put text in the margins on the four sides of the plot. Starting from the bottom (side=1),
the sides are numbered clockwise up to side 4. The plot command in the example suppresses the axis labels and the
plotted points themselves; it just gives the frame. Also shown is the use of the cex (character expansion) argument,
which controls the relative size of the text characters. By default, cex is set to 1, so graphics text
and symbols appear in the default font size. With cex=2, text appears at twice the default font
size. The text() function allows precise positioning of text at any specified point. The first text()
call puts the text within the quotation marks centered at x=15, y=4.3. By using the optional
argument adj, you can align the text to the left (adj=0) so that the specified coordinates are the
starting point of the text.
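A sketch of the kind of calls described above. The stand-in vector fuel.demo and the label strings are assumptions made so that the example runs on its own; with the fuel.frame data attached, Fuel would take its place.

```r
# Stand-in for Fuel: 60 made-up values between 3 and 5.5
set.seed(1)
fuel.demo <- runif(60, min = 3, max = 5.5)

# Draw only the frame: no points, no axis labels
plot(fuel.demo, type = "n", xlab = "", ylab = "")

# Margin text, clockwise from the bottom
mtext("side 1 (bottom)", side = 1, line = 2)
mtext("side 2 (left)",   side = 2, line = 2)
mtext("side 3 (top)",    side = 3, line = 1, cex = 2)  # twice the default size
mtext("side 4 (right)",  side = 4, line = 1)

# Text centered at (15, 4.3), then left-aligned starting at (15, 3.8)
text(15, 4.3, "centered at x=15, y=4.3")
text(15, 3.8, "starts at x=15", adj = 0)
```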
> plot(Fuel)
> identify(Fuel, n=3)
After you press return, R waits for you to identify three points (n=3) with the mouse. Move the mouse
cursor over the graphics window and click on a data point. The observation number then appears
next to the point, thus making the point identifiable.
> dev.cur()
windows
4
> dev.set(3) #change the current window to window 3
windows
3
> dev.cur() #check it
windows
3
> dev.off() #close the current window and window 4 is active
windows
4
> dev.list()
windows windows
2 4
> graphics.off() # now close all three
> dev.list()
Correlations of write, read, math and science with listwise deletion of missing values. By default,
cor() returns NA for any pair of variables that contains missing values, so it is important to set the
use argument (for example use="complete.obs") to indicate how the missing values should be handled.
# correlation matrix
cor(read.sci, use="complete.obs")
cor(read.sci, use="pairwise.complete.obs")
plot(math, write)
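How the two settings differ can be seen on a small made-up data frame (scores.demo and its values are hypothetical, not the hs0 data):

```r
scores.demo <- data.frame(
  read  = c(57, 68, 44, 63, 47, NA),
  write = c(52, 59, 33, 44, 52, 52),
  math  = c(41, 53, NA, 47, 57, 51)
)
cor(scores.demo)   # default: NA for every pair involving read or math

# Listwise deletion: only rows 1, 2, 4 and 5 are complete
c1 <- cor(scores.demo, use = "complete.obs")

# Pairwise deletion: each pair uses all rows complete for that pair
c2 <- cor(scores.demo, use = "pairwise.complete.obs")
```

With pairwise deletion the read/write correlation is based on five rows, while listwise deletion uses four for every entry of the matrix.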
Unless you are going to continue working with the hs0 data frame it is generally a good idea to
detach all attached data frames.
detach()
detach()
Data Modification in R
It is a good practice to label the data sets or variables that we have been working on. This can be
accomplished by using the comment function.
# cleaning up
rm(list=ls())
# reading in data
hs0 <- read.table("hs0.csv", header=T, sep=",")
# commenting the data set
comment(hs0)<-"High school and beyond data"
# checking
comment(hs0)
# variable labels using comment
comment(hs0$write)<-"writing score"
comment(hs0$read) <-"reading score"
# more checking to make sure that our comments stay with the data frame
save(hs0,file="hs0.rda")
rm(list=ls())
load(file="hs0.rda")
comment(hs0)
comment(hs0$write)
For the rest of this section, we are going to attach hs0 so our syntax will look cleaner. The
search() function displays what is currently on the search path.
search()
attach(hs0)
search()
We use the sapply function with the is.factor function to check if any of the variables in the hs0
data frame are factor variables.
sapply(hs0, is.factor)
Creating a factor (categorical) variable called schtyp.f for schtyp and a factor variable female
for gender with value labels.
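A minimal sketch of such calls. The codings used here (1 = public, 2 = private for schtyp; 0 = male, 1 = female for gender) are assumptions about the data, and the short vectors stand in for the hs0 columns:

```r
# Hypothetical codes standing in for hs0$schtyp and hs0$gender
schtyp <- c(1, 1, 2, 1, 2)
gender <- c(0, 1, 1, 0, 1)

# factor() attaches value labels to the numeric codes
schtyp.f <- factor(schtyp, levels = c(1, 2), labels = c("public", "private"))
female   <- factor(gender, levels = c(0, 1), labels = c("male", "female"))

table(schtyp.f)
table(female)
```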
table(hs0$race)
hs0$race[hs0$race==5] <-NA
table(hs0$race)
total<-read+write+math+science
# noticing the missing values generated
summary(total)
# initializing a variable
grade<-0
grade[total <=140]<-0
grade[total > 140 & total <= 180] <-1
grade[total > 180 & total <= 210] <-2
grade[total > 210 & total <= 234] <-3
grade[total > 234] <-4
table(grade)
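The same recoding can be written in one step with cut(). A sketch on made-up totals (total.demo is hypothetical); the breaks mirror the cut-offs above, and the default right=TRUE makes each interval closed on the right, matching the <= comparisons:

```r
total.demo <- c(120, 140, 141, 180, 200, 234, 235, NA)
grade.demo <- cut(total.demo,
                  breaks = c(-Inf, 140, 180, 210, 234, Inf),
                  labels = 0:4)
table(grade.demo, useNA = "ifany")
```

Note that cut() returns a factor, and a missing total stays missing in the result.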
Creating mean scores in two ways - working with missing values differently.
m1<-(read+write+math+science)/4
m2<-rowMeans(cbind(read, write, math, science))
m2<-rowMeans(cbind(read, write, math, science), na.rm=T)
At this point, we might want to combine the new variables we have created with the original data
set. We can use the cbind function for this.
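A small sketch of both steps on hypothetical scores (the data frame d and its values are made up). It shows how the two ways of averaging treat a missing value and how cbind() attaches the results:

```r
d <- data.frame(read = c(50, 60, NA), write = c(40, 70, 55),
                math = c(45, 65, 50), science = c(55, 75, 45))
m1 <- (d$read + d$write + d$math + d$science) / 4  # NA if any score is missing
m2 <- rowMeans(d, na.rm = TRUE)                    # mean of the available scores
d.new <- cbind(d, m1, m2)
d.new
```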
Data Management in R
Read in the hs1 data using the read.table function and store it in an object called hs1.
Comparing means of read in the original hs1 data frame and the new smaller hs1.read.well data
frame. To keep from getting confused we will use the convention of using the data name, dollar
sign, variable name. For example, hs1$read is the read variable from the hs1 data.
mean(hs1.read.well$read)
mean(hs1$read)
Keeping only the variables read and write from the hs1 data frame.
Dropping the variables read and write from the hs1 data frame by using the column indices
corresponding to these two variables with a negative sign.
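The three operations just described (selecting rows by a condition, keeping columns, dropping columns) can be sketched as follows. The data frame hs.toy, its values, and the cut-off read >= 60 are all hypothetical stand-ins for hs1:

```r
hs.toy <- data.frame(id = 1:5,
                     read  = c(57, 68, 44, 63, 47),
                     write = c(52, 59, 33, 44, 52),
                     math  = c(41, 53, 54, 47, 57))

# Rows: keep only students with read >= 60
hs.toy.read.well <- hs.toy[hs.toy$read >= 60, ]
mean(hs.toy.read.well$read)   # mean of read in the subset
mean(hs.toy$read)             # mean of read in the full data

# Columns: keep only read and write
hs.keep <- hs.toy[, c("read", "write")]

# Columns: drop read and write by negative column indices (columns 2 and 3)
hs.drop <- hs.toy[, -c(2, 3)]
names(hs.drop)
```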
We will subset hs1 to two data sets, one for female and one for male. We then put them back
together.
hsfemale<-hs1[female==1, ]
hsmale<-hs1[female==0, ]
dim(hsfemale)
dim(hsmale)
hs.all<-rbind(hsfemale, hsmale)
dim(hs.all)
We will create two data sets from hs1, one contains demographic variables and the other one
contains test scores. We then merge the two data sets by the id variable.
If the variable that we were merging on had different names in each data frame, then we could
use the by.x and by.y arguments. In the by.x argument we would list the name of the variable(s)
in the data frame listed first in the merge function (in this case hs.demo), and in the
by.y argument we would name the variable(s) in the data frame listed second (in this
case hs.scores).
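A sketch of both forms of the call on two small made-up data frames (the column values and the alternative key name sid are hypothetical):

```r
hs.demo   <- data.frame(id = c(1, 2, 3), female = c(0, 1, 1))
hs.scores <- data.frame(id = c(3, 1, 2), read = c(44, 57, 68))

# Same key name in both data frames
hs.merge <- merge(hs.demo, hs.scores, by = "id")

# Different key names: the key is called sid in the second data frame
hs.scores2 <- data.frame(sid = c(3, 1, 2), read = c(44, 57, 68))
hs.merge2  <- merge(hs.demo, hs.scores2, by.x = "id", by.y = "sid")
hs.merge2
```

In both cases the rows are matched on the key, so the order of the rows in the two inputs does not matter.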
1.0 Functions used in this session, the data set and the syntax file
Read in the hs1 data using the read.table function. We also use the attach function to
place the data set on the search path of R.
rm(list=ls())
hs1 <- read.table("c://Users/Richard Wachana/SAS 205/hs1.csv",
header=T, sep=",")
attach(hs1)
3.0 t-tests
This is the one-sample t-test, testing whether the sample of writing scores was drawn from a
population with a mean of 50.
t.test(write, mu=50)
This is the paired t-test, testing whether or not the mean of write equals the mean of read.
This is the two-sample independent t-test. We can use either the by function or the tapply
function to look at the variances of the variable write for each group of female. The output from
the first t.test call assumes equal variances (var.equal=TRUE); the output from the second t.test
call uses the default in the t.test function, which does not assume equal variances.
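A sketch of these calls on simulated data (write.sim and female.sim are made-up stand-ins for write and female):

```r
set.seed(421)
female.sim <- rep(c(0, 1), each = 20)
write.sim  <- round(rnorm(40, mean = 50 + 4 * female.sim, sd = 9))

# Group variances, two equivalent ways
tapply(write.sim, female.sim, var)
by(write.sim, female.sim, var)

# Pooled-variance test, then the default Welch test
t.eq   <- t.test(write.sim ~ female.sim, var.equal = TRUE)
t.uneq <- t.test(write.sim ~ female.sim)
t.eq$parameter    # df = 38 for the pooled test
t.uneq$parameter  # Welch df, at most 38
```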
4.0 Anova
In R you can use either the aov function or the anova function combined with the lm function.
Both alternatives will give you the same results. The anova function extracts the anova table
from the linear model fitted by the lm function. The aov function only fits an anova model and
we use the summary function to see all the output.
anova(lm(write~factor(prog)))
summary(aov(write~factor(prog)))
The following is an example of a two-way factorial ANOVA. Notice that in R, the sum of
squares is type I.
m2<-lm(write~factor(prog)*factor(female))
anova(m2)
Here is an analysis of covariance (ANCOVA). In this example, prog is the categorical predictor
and read is the continuous covariate.
anova(lm(write~factor(prog) + read))
summary(aov(write~factor(prog) + read))
5.0 Regression
summary(lm(write~female+read))
The generic plot function will produce multiple diagnostic plots when applied to an lm object.
These plots include a residuals versus fitted plot and a normal Q-Q plot of the residuals, as well as
scale-location and residuals versus leverage plots. If you are only interested in one or a few of the
plots, it might be useful to use the which argument of the plot function.
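For instance, assuming a model fitted to R's built-in cars data (used here only because hs1 is not bundled with R), the first two diagnostic plots can be requested like this:

```r
fit.cars <- lm(dist ~ speed, data = cars)
par(mfrow = c(1, 2))
plot(fit.cars, which = 1:2)   # 1: residuals vs fitted, 2: normal Q-Q
par(mfrow = c(1, 1))
```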
Let's take a closer look at the object lm2 we just created. Notice that an object can have many
components of different types and different sizes.
class(lm2)
names(lm2)
length(lm2)
length(lm2$residuals)
length(lm2$coefficients)
lm2$coefficients
The fitted function will extract the fitted values from the lm object and the resid function will
extract the residuals.
write[1:20]
fitted(lm2)[1:20]
resid(lm2)[1:20]
The glm function fits a generalized linear model including a logistic regression. In order to fit a
logistic model we need to specify that the distribution of the dependent variable is binomial in
the family argument and the default link function used will then be the logit function.
# odds ratios
exp(coef(lr))
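A minimal sketch of such a fit on simulated data. The variables honors.sim and read.sim, and the coefficients used to simulate them, are hypothetical:

```r
set.seed(421)
read.sim <- rnorm(200, mean = 52, sd = 10)
# Binary outcome whose log-odds increase with read.sim
honors.sim <- rbinom(200, size = 1, prob = plogis(-5 + 0.09 * read.sim))

# family = binomial uses the logit link by default
lr <- glm(honors.sim ~ read.sim, family = binomial)
summary(lr)

# odds ratios
exp(coef(lr))
```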
The Wilcoxon signed-rank test is a nonparametric analog to the single-sample t-test and is obtained by using the
wilcox.test function. The value that is being tested is specified by the mu argument.
wilcox.test(write, mu=50)
The signed-rank test for matched pairs is the nonparametric analog to the paired t-test. This test can be obtained by
also using the wilcox.test function and specifying paired=TRUE.
The rank-sum test is the nonparametric analog to the independent two-sample t-test. The formula
write ~ female splits write by the two groups of female.
wilcox.test(write ~ female)
The kruskal wallis test is the nonparametric analog to the one-way anova.
kruskal.test(write, ses)
Unless you are going to continue working with the hs1 data it is generally a good idea to detach
all attached data frames.
detach()
Functions
Almost everything in R is done through functions. Here I'm only referring to numeric and
character functions that are commonly used in creating or recoding variables.
1. Numeric Functions
Function                       Description
round(x, digits=n)             Rounds the value to the specified number of decimal places.
                               round(3.475, digits=2) is 3.48
signif(x, digits=n)            Rounds the value to the specified number of significant digits.
                               signif(3.475, digits=2) is 3.5
cos(x), sin(x), tan(x),        Trigonometric functions. They compute, respectively, the cosine,
acos(x), asin(x), atan(x),     sine, tangent, arc-cosine, arc-sine, arc-tangent, and the
atan2(y, x)                    two-argument arc-tangent.
exp(x)                         e^x
2. Character Functions
Function                          Description
sub(pattern, replacement, x,      Find pattern in x and replace with replacement text.
  ignore.case=FALSE,              If fixed=FALSE then pattern is a regular expression.
  fixed=FALSE)                    If fixed=TRUE then pattern is a text string.
                                  sub("\\s",".","Hello There") returns "Hello.There"
paste(..., sep="")                Concatenate strings, using the sep string to separate them.
                                  paste("x",1:3,sep="") returns c("x1","x2","x3")
                                  paste("x",1:3,sep="M") returns c("xM1","xM2","xM3")
                                  paste("Today is", date())
toupper(x)                        Uppercase
tolower(x)                        Lowercase
The following table describes functions related to probability distributions. You need to type at
the command prompt set.seed(1234) to create reproducible pseudo-random numbers below.
Function                 Description
dbinom(x, size, prob)    Binomial distribution, where size is the sample size
pbinom(q, size, prob)    and prob is the probability of a head (pi).
qbinom(p, size, prob)    # prob of 0 to 5 heads of fair coin out of 10 flips
rbinom(n, size, prob)    dbinom(0:5, 10, .5)
                         # prob of 5 or less heads of fair coin out of 10 flips
                         pbinom(5, 10, .5)
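The two example calls are related: summing the point probabilities for 0 to 5 heads gives the cumulative probability. A short check, with set.seed() shown for the random draws (the qbinom and rbinom calls are illustrative additions):

```r
p.each <- dbinom(0:5, 10, .5)   # P(X = 0), ..., P(X = 5)
sum(p.each)                     # 0.6230469
pbinom(5, 10, .5)               # same value from the cumulative distribution
qbinom(.5, 10, .5)              # median number of heads: 5

set.seed(1234)                  # makes the draws below reproducible
rbinom(3, 10, .5)               # three simulated counts of heads in 10 flips
```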
Other useful statistical functions are provided in the following table. Most have the option na.rm
to strip missing values before calculations (diff() is an exception). Otherwise the presence of missing values will lead to
a missing result. Object can be a numeric vector or data frame.
Function              Description
sd(x)                 Standard deviation of object x. Also look at var(x) for variance and
                      mad(x) for median absolute deviation.
median(x)             Median
quantile(x, probs)    Quantiles, where x is the numeric vector whose quantiles are desired
                      and probs is a numeric vector with probabilities in [0,1].
                      # 30th and 84th percentiles of x
                      y <- quantile(x, c(.3,.84))
range(x)              Range
sum(x)                Sum
diff(x, lag=1)        Lagged differences, with lag indicating which lag to use
min(x)                Minimum
max(x)                Maximum
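The effect of na.rm on a vector with a missing value can be seen in a short sketch (v is a made-up vector):

```r
v <- c(2, 4, NA, 8)
sd(v)                        # NA: the missing value propagates
sd(v, na.rm = TRUE)          # standard deviation of 2, 4, 8
median(v, na.rm = TRUE)      # 4
range(v, na.rm = TRUE)       # 2 8
diff(c(1, 4, 9, 16))         # 3 5 7
quantile(1:101, c(.3, .84))  # 30th and 84th percentiles: 31 and 85
```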
Note that while the examples on this topic apply functions to individual variables, many can be
applied to vectors and matrices as well.
Numerical Analysis in R
Integration
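integrate() evaluates the function over the specified range (lower to upper) by
passing a vector of these values to the function being integrated. Note that any
function passed to integrate() must therefore accept a vector and return a vector of the same length.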
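A minimal sketch with two integrands chosen for illustration (they are not necessarily the function used in these notes):

```r
# Integral of x^2 from 0 to 1 (exact value 1/3)
sq <- function(x) x^2          # works element-wise on a vector
res <- integrate(sq, lower = 0, upper = 1)
res$value

# Area under the standard normal density between -1.96 and 1.96
integrate(dnorm, lower = -1.96, upper = 1.96)$value   # about 0.95
```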
Now, let's say you wanted to integrate this function for a series of values of b.
To make it work, you need to force vectorization of the function, so it can cycle
piecewise through the elements of the vector and evaluate the function for each
one:
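A sketch of this with a hypothetical integrand x^b on [0, 1], whose exact integral is 1/(b+1). Vectorize() is one way to force the vectorization; sapply() over b would work as well:

```r
f.b <- function(x, b) x^b

# Integral over x for a single value of b
int.one <- function(b) integrate(f.b, lower = 0, upper = 1, b = b)$value

# integrate() does not loop over a vector of b itself, so vectorize over b
int.vec <- Vectorize(int.one)
b.vals  <- c(1, 2, 3, 4)
int.vec(b.vals)        # 1/2, 1/3, 1/4, 1/5
```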
Differentiation
Differential equations
install.packages("deSolve")
library("deSolve")
library(help="deSolve") # see information on package
Parameter optimization
To find the minimum value of a function within some interval, we use optimize
(optimise is a valid alias):
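A minimal sketch with a hypothetical function whose minimum is known to be at x = 2:

```r
fun.demo <- function(x) (x - 2)^2 + 1
res <- optimize(fun.demo, interval = c(0, 5))
res$minimum     # location of the minimum, close to 2
res$objective   # function value there, close to 1
```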
Now, let's say you want to find the x value of the function at which the y value
equals some number (1.234, say):
Of course, there are 2 solutions in this case, as seen by plotting the function to be
minimized:
We can get the other solution by giving a skewed search interval (type at the
command prompt ?optimize to see how the start point is determined):
res2 <- optimize(fun.aux, interval=c(-10, 100), target=1.234) # force higher start value
points(res2$minimum, res2$objective)# plot other minimum
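The helper fun.aux is not defined above. One common construction, shown here as an assumption with a hypothetical fun, minimizes the squared distance between the function and the target; restricting the search interval then selects one solution or the other:

```r
fun <- function(x) (x - 3)^2                         # hypothetical function
fun.aux <- function(x, target) (fun(x) - target)^2   # zero where fun(x) equals target

# fun(x) = 1.234 has two solutions, 3 - sqrt(1.234) and 3 + sqrt(1.234)
res.lo <- optimize(fun.aux, interval = c(0, 3), target = 1.234)
res.hi <- optimize(fun.aux, interval = c(3, 6), target = 1.234)
c(res.lo$minimum, res.hi$minimum)
fun(c(res.lo$minimum, res.hi$minimum))   # both close to 1.234
```

Splitting the interval at x = 3 keeps the auxiliary function unimodal on each piece, which is the situation optimize is designed for.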
Interpolating data
f <- smooth.spline(x, y)
lines(predict(f), lty=2)
Why not also add the best-fit sine curve predicted from a linear regression with lm:
Note that, by default, the predicted values are evaluated at the (X) positions of the
raw data. This means that you can end up with rather coarse curves, as seen above.
To get round this, you need to work with functions for the splines, which can be
supplied with more finely-spaced X values for plotting:
plot(y ~ x)
curve(fun.spline, add=TRUE) # } "curve" uses n=101 points by default
curve(fun.smooth, add=TRUE, lty=2)# }at which to evaluate the function
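The functions fun.spline and fun.smooth are not defined above. One way to build them, sketched here on a made-up noisy sine series, is splinefun() for the interpolating spline and a small wrapper around predict() for the smoothing spline:

```r
set.seed(421)
x <- 1:20
y <- sin(x / 3) + rnorm(20, sd = 0.1)   # hypothetical data

fun.spline <- splinefun(x, y)                 # interpolating spline as a function
f <- smooth.spline(x, y)
fun.smooth <- function(xx) predict(f, xx)$y   # smoothing spline as a function

plot(y ~ x)
curve(fun.spline, add = TRUE)
curve(fun.smooth, add = TRUE, lty = 2)
```

Because both objects are ordinary functions of x, curve() can evaluate them on its own fine grid instead of only at the observed x positions.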
A wider range of splines is available in the package of the same name, accessed via
library("splines"), including B splines, natural splines and so on.
Consider a simple survey. You ask 100 people (randomly chosen) and 42 say "yes" to your
question. Does this support the hypothesis that the true proportion is 50%?
prop.test(42,100,p=.5)
Testing a mean
Suppose a car manufacturer claims a model gets 25 mpg. A consumer group asks 10 owners of
this model to calculate their mpg and the mean value was 22 with a standard deviation of 1.5. Is
the manufacturer's claim supported?
## Compute the t statistic. Note we assume mu=25 under H_0
xbar=22;s=1.5;n=10
t = (xbar-25)/(s/sqrt(n))
t
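The statistic can be turned into a p-value with pt(). A sketch, assuming a one-sided alternative (the true mean is below 25) and n - 1 = 9 degrees of freedom:

```r
xbar <- 22; s <- 1.5; n <- 10
t.stat <- (xbar - 25) / (s / sqrt(n))
t.stat                            # about -6.32
p.val <- pt(t.stat, df = n - 1)   # lower-tail probability
p.val                             # far below 0.05, so the claim is not supported
```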
Two-sample t-tests
Equal variances
When the two samples are assumed to have equal variances, then the data can be pooled to find
an estimate for the variance. By default, R assumes unequal variances. If the variances are
assumed equal, then you need to specify var.equal=TRUE when using t.test.
After a side-by-side boxplot (boxplot(x,y), but not shown), it is determined that the
assumptions of equal variances and normality are valid. A one-sided test for equality of
means using the t-test is carried out. This tests the null hypothesis of equal means against the
one-sided alternative that the drug group has a smaller mean ($\mu_1 - \mu_2 < 0$). Here are the results:
x = c(15, 10, 13, 7, 9, 8, 21, 9, 14, 8)
y = c(15, 14, 12, 8, 14, 7, 16, 10, 15, 12)
t.test(x,y,alt="less",var.equal=TRUE)
Unequal variances
If the variances are unequal, the denominator in the t-statistic is harder to compute
mathematically. But not with R. The only difference is that you don't have to specify
var.equal=TRUE (so it is actually easier with R).
t.test(x,y,alt="less")
Matched samples
Matched or paired t-tests use a different statistical model. Rather than assume the two samples
are independent normal samples albeit perhaps with different means and standard deviations, the
matched-samples test assumes that the two samples share common traits.
The basic model is that $Y_i = X_i + \epsilon_i$, where $\epsilon_i$ is the randomness. We want to test if the $\epsilon_i$ have mean
0 against the alternative that they do not have mean 0. In order to do so, one subtracts the X's from the
Y's and then performs a regular one-sample t-test.
Actually, R does all that work. You only need to specify paired=TRUE when calling the t.test
function.
Clearly there are differences. Are they described by random fluctuations (the mean of $\epsilon_i$ is 0), or is
there a bias of one grader over another (the mean of $\epsilon_i$ is not 0)? A matched-sample test will give us some
insight. First we should check the assumption of normality, with normal plots say. (The data is
discrete due to necessary rounding, but the general shape is seen to be normal.) Then we can
apply the t-test as follows:
x = c(3, 0, 5, 2, 5, 5, 5, 4, 4, 5)
y = c(2, 1, 4, 1, 4, 3, 3, 2, 3, 5)
t.test(x,y,paired=TRUE)
Regression Analysis
Regression analysis forms a major part of the statistician's toolbox. This section discusses statistical
inference for the regression coefficients.
The basic model for linear regression is that pairs of data, $(x_i, y_i)$, are related through the equation
$y_i = \beta_0 + \beta_1 x_i + \epsilon_i$
The values of $\beta_0$ and $\beta_1$ are unknown and will be estimated from the data. The value of $\epsilon_i$ is the amount
by which the y observation differs from the straight-line model.
x = c(18,23,25,35,65,54,34,56,72,19,23,42,18,39,37)
y = c(202,186,187,180,156,169,174,172,153,199,193,174,198,183,178)
plot(x,y) # make a plot
abline(lm(y ~ x)) # plot the regression line
lm.result = lm(y ~ x) # store the basic values of the regression analysis
summary(lm.result)
es = resid(lm.result) # the residuals of lm.result
b1 = (coef(lm.result))[['x']] # the x part of the coefficients
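Inference for the coefficients can be read from the fitted object. A brief sketch using the same x and y; the 95% level is the default of confint():

```r
x <- c(18,23,25,35,65,54,34,56,72,19,23,42,18,39,37)
y <- c(202,186,187,180,156,169,174,172,153,199,193,174,198,183,178)
lm.result <- lm(y ~ x)

coef(summary(lm.result))   # estimates, standard errors, t values, p-values
confint(lm.result)         # 95% confidence intervals for intercept and slope
```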
Analysis of Variance
Recall, the t-test was used to test hypotheses about the means of two independent samples. For
example, to test if there is a difference between control and treatment groups. The method called
analysis of variance (ANOVA) allows one to compare means for more than 2 independent
samples.
To illustrate, suppose we have just 24 tests and 3 graders (not 300 and 6, to simplify data entry).
Furthermore, suppose the grading scale is on the range 1-5 with 5 being the best and the scores
are reported as
grader 1 4 3 4 5 2 3 4 5
grader 2 4 4 5 5 4 5 4 4
grader 3 3 4 2 4 5 5 4 4
We enter this into our R session as follows and then make a data frame
> x = c(4,3,4,5,2,3,4,5)
> y = c(4,4,5,5,4,5,4,4)
> z = c(3,4,2,4,5,5,4,4)
> scores = data.frame(x,y,z)
> boxplot(scores)
Before beginning, we made a side-by-side boxplot which allows us to compare the three
distributions. From this graph (not shown) it appears that grader 2 is different from graders 1 and
3.
Analysis of variance allows us to investigate if all the graders have the same mean. The R
function to do the analysis of variance hypothesis test (oneway.test) requires the data to be in a
different format. It wants to have the data with a single variable holding the scores, and a factor
describing the grader or category. The stack command will do this for us:
scores = stack(scores) # look at scores if not clear
names(scores)
Looking at the names, we get the values in the variable values and the category in ind. To call
oneway.test we need to use the model formula notation as follows
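A sketch of the call, rebuilding the stacked data from the grader scores above so that it runs on its own:

```r
x <- c(4,3,4,5,2,3,4,5)
y <- c(4,4,5,5,4,5,4,4)
z <- c(3,4,2,4,5,5,4,4)
scores <- stack(data.frame(x, y, z))

# values holds the scores, ind the grader
ow <- oneway.test(values ~ ind, data = scores, var.equal = TRUE)
ow$statistic   # F is about 1.13 on 2 and 21 degrees of freedom
ow$p.value     # about 0.34
```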
Now, we have formulas and could do all the work ourselves, but we're here to learn how to let the
computer do as much work for us as possible. Two functions are useful in this example:
oneway.test to perform the hypothesis test, and anova to give more detail.
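By default, oneway.test returns the value of F and the p-value but that's it. For the scores entered above, F is about
1.13 on 2 and 21 degrees of freedom and the p-value is about 0.34, which is not small: despite the impression given by
the boxplot, the test gives no significant evidence that the graders' means differ. Notice, we set explicitly that the variances
are equal with var.equal=T.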
The function anova gives more detail. You need to call it on the result of lm
anova(lm(values ~ ind, data=scores))