Univariate/Bi Variate Analysis

This document discusses univariate and bivariate analysis and basic graphics in R. It covers accessing and subsetting data, creating histograms, boxplots, scatterplots, and line plots. It also demonstrates how to add titles, labels, colors, legends, and panel multiple graphs.

Uploaded by

jbsimha3629

Univariate/Bivariate Analysis
Working with data.
• Accessing columns.
• D has our data in it, but you can’t see it until you ask for it: type D or head(D) to print it.
• To select a column use D$column.
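A minimal sketch of column access. The data frame below is a made-up stand-in for D: only the column names (Gender, wg) come from the slides, and the values are invented for illustration.

```r
# Hypothetical stand-in for D; the values are invented for illustration.
D <- data.frame(
  Gender = c("M", "F", "F", "M"),
  wg     = c(2.5, -1.0, 4.0, 0.5)
)

head(D)        # print the first rows to inspect the data
names(D)       # list the column names
wg <- D$wg     # extract one column as a plain vector
mean(wg)       # summarise that column
```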
Working with data.
• Subsetting data.
• Use a logical operator to do this.
• ==, !=, >, <, <=, >= are all logical operators.
• Note that the “equals” logical operator is two = signs, and “not equal” is !=.
• Example:
• D[D$Gender == "M",]
• This will return the rows of D where Gender is "M".
• Remember R is case sensitive!
• This code does nothing to the original dataset.
• D.M <- D[D$Gender == "M",] gives a dataset with the appropriate rows.
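The subsetting rules above can be sketched as follows, again on an invented stand-in for D. The lowercase "m" row is included on purpose to show the effect of case sensitivity.

```r
# Hypothetical stand-in for D; the values are invented for illustration.
D <- data.frame(
  Gender = c("M", "F", "F", "M", "m"),
  wg     = c(2.5, -1.0, 4.0, 0.5, 3.0)
)

D.M    <- D[D$Gender == "M", ]   # rows where Gender is exactly "M"
D.notM <- D[D$Gender != "M", ]   # != means "not equal"
D.gain <- D[D$wg > 0, ]          # numeric comparison

nrow(D.M)   # the lowercase "m" row is not matched: R is case sensitive
nrow(D)     # the original data frame is unchanged
```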
Basic Graphics
• Histogram
• hist(D$wg)
Basic Graphics
• Add a title…
• The “main” argument gives the plot an overall heading.
• hist(D$wg, main='Weight Gain')
Basic Graphics
• Adding axis labels…
• Use “xlab” and “ylab” to label the X and Y axes, respectively.
• hist(D$wg, main='Weight Gain', xlab='Weight Gain', ylab='Frequency')
Basic Graphics
• Changing colors…
• Use the col argument.
• ?colors will give you help on the colors.
• Common colors can simply be given by name.
• hist(D$wg, main="Weight Gain", xlab="Weight Gain", ylab="Frequency", col="blue")
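Putting the title, axis labels, and color together, a minimal sketch might look like this. The wg values are randomly generated placeholders, not the course data.

```r
set.seed(1)                                        # reproducible made-up data
D <- data.frame(wg = rnorm(50, mean = 2, sd = 3))  # invented weight gains

h <- hist(D$wg,
          main = "Weight Gain",   # overall title
          xlab = "Weight Gain",   # x-axis label
          ylab = "Frequency",     # y-axis label
          col  = "blue")          # bar fill color

"blue" %in% colors()   # check that a color name is one R recognises
```

hist() also returns the bin counts invisibly, so assigning the result (h here) gives access to the numbers behind the picture.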
Basic Graphics – Colors
Basic Plots
• Box Plots
• boxplot(D$wg)
Boxplots
• Customize it with a title and a Y axis label.
• boxplot(D$wg, main='Weight Gain', ylab='Weight Gain (lbs)')
Box-Plots - Groupings
• What if we want several box plots side by side so that we can compare them?
• First, subset the data into separate variables.
• wg.m <- D[D$Gender=="M",]
• wg.f <- D[D$Gender=="F",]
• Then create the box plot.
• boxplot(wg.m$wg, wg.f$wg)
Boxplots - Groupings

boxplot(wg.m$wg, wg.f$wg, main='Weight Gain (lbs)',
        ylab='Weight Gain', names = c('Male','Female'))
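As an alternative to subsetting by hand, boxplot() also accepts a formula. This is a different route from the one on the slides, sketched here on invented data.

```r
# Hypothetical stand-in for D; the values are invented for illustration.
D <- data.frame(
  Gender = c("M", "F", "F", "M", "F", "M"),
  wg     = c(2.5, -1.0, 4.0, 0.5, 1.5, 3.0)
)

# Formula interface: one box per level of Gender, no manual subsetting.
# Groups are ordered alphabetically ("F" before "M"), so the labels
# passed to names must follow that order.
b <- boxplot(wg ~ Gender, data = D,
             main = "Weight Gain (lbs)", ylab = "Weight Gain",
             names = c("Female", "Male"))
```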
Boxplot Groupings
• Do it by shift
• wg.7a <- D[D$Shift=="7am",]
• wg.8a <- D[D$Shift=="8am",]
• wg.9a <- D[D$Shift=="9am",]
• wg.10a <- D[D$Shift=="10am",]
• wg.11a <- D[D$Shift=="11am",]
• wg.12p <- D[D$Shift=="12pm",]
• boxplot(wg.7a$wg, wg.8a$wg, wg.9a$wg, wg.10a$wg, wg.11a$wg, wg.12p$wg, main='Weight Gain', ylab='Weight Gain (lbs)', xlab='Shift', names = c('7am','8am','9am','10am','11am','12pm'))
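With many groups, one subset per shift gets long. A shorter sketch, assuming Shift is stored as text, uses a factor to fix the display order and the formula interface to draw all boxes at once. The data are invented and use only four shifts to keep the example small.

```r
# Hypothetical stand-in for D; the values are invented for illustration.
D <- data.frame(
  Shift = c("7am", "8am", "12pm", "10am", "7am", "12pm", "8am", "10am"),
  wg    = c(1.0, 2.0, -0.5, 3.0, 0.0, 1.5, 2.5, 4.0)
)

# Fix the display order; otherwise the groups sort alphabetically
# and "10am" would come before "7am".
D$Shift <- factor(D$Shift, levels = c("7am", "8am", "10am", "12pm"))

b <- boxplot(wg ~ Shift, data = D,
             main = "Weight Gain", ylab = "Weight Gain (lbs)", xlab = "Shift")
```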
Scatter Plots
• Suppose we have two variables and we wish to see
the relationship between them.
• A scatter plot works very well.
• R code:
• plot(x,y)
• Example
• plot(D$metmin,D$wg)
Scatterplots

plot(D$metmin, D$wg, main='Met Minutes vs. Weight Gain',
     xlab='Mets (min)', ylab='Weight Gain (lbs)')
Scatterplots

plot(D$metmin, D$wg, main='Met Minutes vs. Weight Gain',
     xlab='Mets (min)', ylab='Weight Gain (lbs)', pch=2)
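A self-contained version of the scatter plot, using randomly generated placeholder values for metmin and wg. The negative trend is built into the fake data and says nothing about the real study.

```r
set.seed(2)                                   # reproducible made-up data
metmin <- runif(30, min = 0, max = 600)
D <- data.frame(metmin = metmin,
                wg     = 5 - 0.01 * metmin + rnorm(30))

plot(D$metmin, D$wg,
     main = "Met Minutes vs. Weight Gain",
     xlab = "Mets (min)", ylab = "Weight Gain (lbs)",
     pch  = 2)     # pch selects the plotting symbol; 2 is an open triangle

r <- cor(D$metmin, D$wg)   # numeric summary of the linear association
```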
Line Plots
• Often data are collected over time.
• Consider Dell stock prices.
• D2 <- read.csv("H:\\Dell.csv",header=TRUE)
• t1 <- 1:nrow(D2)
• plot(t1,D2$DELL)
Line Plots

plot(t1,D2$DELL,type="l")
Line Plots

plot(t1, D2$DELL, type="l", main='Dell Closing Stock Price',
     xlab='Time', ylab='Price $')
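The line plot can be tried without the CSV file. In this sketch D2 is a simulated random walk standing in for the Dell prices; only the column name DELL comes from the slides.

```r
set.seed(3)
D2 <- data.frame(DELL = 40 + cumsum(rnorm(100)))   # made-up price series

t1 <- 1:nrow(D2)                 # time index: 1, 2, ..., number of rows
plot(t1, D2$DELL, type = "l",    # "l" joins the points with a line
     main = "Dell Closing Stock Price",
     xlab = "Time", ylab = "Price $")
```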
Overlaying Plots
• Often we have more than one variable measured against the same predictor (X).
• plot(t1, D2$DELL, type="l", main='Dell Closing Stock Price', xlab='Time', ylab='Price $')
• lines(t1, D2$Intel)
Overlaying Graphs

lines(t1,D2$Intel,lty=2)
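One thing to watch when overlaying: lines() draws on the existing axes and does not rescale them, so a second series outside the first plot's range is clipped. A sketch on simulated prices (both series are invented) that sizes the Y axis for both series up front:

```r
set.seed(4)
D2 <- data.frame(DELL  = 40 + cumsum(rnorm(100)),
                 Intel = 25 + cumsum(rnorm(100)))   # invented prices
t1 <- 1:nrow(D2)

# Compute a y-range that covers both series before plotting the first one.
yr <- range(D2$DELL, D2$Intel)

plot(t1, D2$DELL, type = "l", ylim = yr,
     main = "Closing Stock Prices", xlab = "Time", ylab = "Price $")
lines(t1, D2$Intel, lty = 2)   # dashed line for the second series
```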
Adding a Legend
• Adding a legend is a bit tricky in R.
• Syntax
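• legend(x, y, names, line types)
• x: X coordinate of the legend.
• y: Y coordinate of the legend.
• names: names of the series, as a vector, e.g. c('Intel','Dell').
• line types: the corresponding line types, e.g. lty=c(1,2).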
Adding a Legend

legend(60,45,c('Intel','Dell'),lty=c(1,2))
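Names and line types in legend() are matched by position, so their order has to agree with how each line was actually drawn. A sketch on simulated prices, using the keyword position "topleft" instead of numeric coordinates (a convenience that avoids guessing x and y for invented data):

```r
set.seed(4)
D2 <- data.frame(DELL  = 40 + cumsum(rnorm(100)),
                 Intel = 25 + cumsum(rnorm(100)))   # invented prices
t1 <- 1:nrow(D2)

plot(t1, D2$DELL, type = "l", lty = 1,
     ylim = range(D2$DELL, D2$Intel),
     main = "Closing Stock Prices", xlab = "Time", ylab = "Price $")
lines(t1, D2$Intel, lty = 2)

# Dell was drawn solid (lty = 1) and Intel dashed (lty = 2),
# so the names are listed in that same order.
lg <- legend("topleft", legend = c("Dell", "Intel"), lty = c(1, 2))
```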
Paneling Graphics
• Suppose we want more than one graphic on a panel.
• We can partition the graphics panel to give us a framework in which to panel our plots.
• par(mfrow = c(nrow, ncol))
• nrow: number of rows.
• ncol: number of columns.
Paneling Graphics
• Consider the following:
• par(mfrow=c(2,2))
• hist(D$wg, main='Histogram', xlab='Weight Gain', ylab='Frequency', col=heat.colors(14))
• boxplot(wg.7a$wg, wg.8a$wg, wg.9a$wg, wg.10a$wg, wg.11a$wg, wg.12p$wg, main='Weight Gain', ylab='Weight Gain (lbs)', xlab='Shift', names = c('7am','8am','9am','10am','11am','12pm'))
• plot(D$metmin, D$wg, main='Met Minutes vs. Weight Gain', xlab='Mets (min)', ylab='Weight Gain (lbs)', pch=2)
• plot(t1, D2$Intel, type="l", main='Closing Stock Prices', xlab='Time', ylab='Price $')
• lines(t1, D2$DELL, lty=2)
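A runnable sketch of the 2 x 2 panel. All data here are simulated stand-ins for D and D2, and the boxplot uses the formula interface rather than per-shift subsets to keep the example short. Saving the value returned by par() makes it easy to restore the single-plot layout afterwards.

```r
set.seed(5)
# Invented data standing in for D (weight gain) and D2 (stock prices).
D <- data.frame(
  wg     = rnorm(40, mean = 2, sd = 3),
  metmin = runif(40, min = 0, max = 600),
  Shift  = factor(rep(c("7am", "8am", "9am", "10am"), each = 10),
                  levels = c("7am", "8am", "9am", "10am"))
)
D2 <- data.frame(DELL  = 40 + cumsum(rnorm(60)),
                 Intel = 25 + cumsum(rnorm(60)))
t1 <- 1:nrow(D2)

old <- par(mfrow = c(2, 2))      # 2 x 2 grid, filled row by row
layout_during <- par("mfrow")    # record the active layout

hist(D$wg, main = "Histogram", xlab = "Weight Gain", col = heat.colors(14))
boxplot(wg ~ Shift, data = D, main = "Weight Gain", xlab = "Shift")
plot(D$metmin, D$wg, main = "Met Minutes vs. Weight Gain", pch = 2)
plot(t1, D2$Intel, type = "l", ylim = range(D2$DELL, D2$Intel),
     main = "Closing Stock Prices", xlab = "Time", ylab = "Price $")
lines(t1, D2$DELL, lty = 2)      # drawn on the fourth panel, not a new one

par(old)                         # restore the previous layout
```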
