Introduction To Machine Learning
#cor() prints a correlation table showing the correlation between each pair of variables
cor(bp[,2:8])
#pairs() draws a scatterplot matrix, one scatterplot for each pair of variables;
#pch = 19 means use filled circles, pch = 21 means use open circles;
#lower.panel = NULL leaves the lower triangle of the matrix empty
pairs(bp[,2:8], pch = 19, lower.panel = NULL)
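#we want to display the Pearson correlation coefficient in the graph,
#so we need to customize what will appear in the upper triangle
upper.panel <- function(x, y){
  points(x, y, pch = 19) #draws the points for this pair of variables
  r <- round(cor(x, y), digits = 2) #cor(x, y) computes the Pearson correlation between x and y
  #and assigns it, rounded to 2 decimal places, to variable r
  displayRtxt <- paste0("R = ", r) #builds the label to display, e.g. "R = 0.95"
  usr <- par("usr"); on.exit(par(usr = usr)) #saves the plot coordinates and restores them on exit
  par(usr = c(0, 1, 0, 1)) #rescales the panel to the unit square
  text(0.5, 0.9, displayRtxt) #writes the label near the top centre of the panel
}
#pass the function to pairs() through its upper.panel argument
pairs(bp[,2:8], lower.panel = NULL, upper.panel = upper.panel)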
#names(bp) lists the variable (column) names of bp
names(bp)
#str(bp) compactly displays the internal structure of our R object, bp
str(bp)
#summary(bp) displays summary statistics for each variable in bp: minimum and maximum values,
#mean, median, 1st quartile and 3rd quartile
summary(bp)
#fit the linear model using lm(Y~X, data=dataframe)
bplm = lm(BP ~ Weight, data = bp)
with(bp, plot(Weight, BP, main = "Scatterplot of BP vs Weight", xlab = "Weight in kg",
ylab = "BP in mm Hg"))
#to display the linear regression line we pass the fitted model bplm to the abline() function
abline(bplm, col="red")
Findings
• There is a high correlation between
1. BP and Weight (correlation coefficient = 0.950)
2. BP and BSA (correlation coefficient = 0.866)
3. Weight and BSA (correlation coefficient = 0.875)
• There seems to be little or no correlation between
1. BSA and Stress (correlation coefficient = 0.018)
2. Stress and Weight (correlation coefficient = 0.034)
3. Duration and BSA (correlation coefficient = 0.131)
4. BP and Stress (correlation coefficient = 0.164)
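The grouping above can also be produced programmatically from a correlation matrix. The sketch below is only an illustration: it uses a small simulated data frame (`demo`, with made-up columns `A`, `B`, `C`, not the bp data), and the cut-offs of 0.8 for "high" and 0.2 for "little or no" correlation are assumptions chosen for the example, not values taken from the analysis.

```r
# Simulated stand-in data (hypothetical; NOT the bp dataset)
set.seed(1)
n <- 50
a <- rnorm(n)
demo <- data.frame(
  A = a,
  B = a + rnorm(n, sd = 0.2),  # built to correlate strongly with A
  C = rnorm(n)                 # generated independently of A and B
)

cm <- cor(demo)                               # Pearson correlation matrix
idx <- which(upper.tri(cm), arr.ind = TRUE)   # visit each pair of variables once
pairs_tbl <- data.frame(
  var1 = rownames(cm)[idx[, "row"]],
  var2 = colnames(cm)[idx[, "col"]],
  r    = round(cm[idx], 3)
)

# Assumed cut-offs, chosen only for illustration
high <- subset(pairs_tbl, abs(r) >= 0.8)   # strongly correlated pairs
low  <- subset(pairs_tbl, abs(r) <= 0.2)   # pairs with little or no correlation
print(high)
print(low)
```

With the real data, the same steps would start from cor(bp[,2:8]) instead of cor(demo).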
#let's use the linear model function lm(Y~X, data=dataframe) to check whether there is a linear
#relationship between the variables BP and Weight
bplm = lm(BP ~ Weight, data = bp)
summary(bplm)
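For a linear model with a single predictor, the R-squared reported by summary() equals the square of the Pearson correlation between the two variables, which ties the model output back to the correlation table. A minimal sketch, assuming simulated data (the names `sim`, `x`, `y` and `fit` are hypothetical and the numbers are not from bp):

```r
# Simulated data (hypothetical; NOT the bp dataset)
set.seed(42)
sim <- data.frame(x = runif(30, 85, 102))
sim$y <- 2 + 1.2 * sim$x + rnorm(30, sd = 2)

fit <- lm(y ~ x, data = sim)   # same form as lm(BP ~ Weight, data = bp)
s <- summary(fit)

coef(fit)                      # intercept and slope estimates
s$r.squared                    # proportion of the variance in y explained by x
cor(sim$x, sim$y)^2            # squared Pearson r; equals r.squared for one predictor
```

Applied to the real model, the corresponding values would be read from coef(bplm) and summary(bplm)$r.squared.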