Exp 1 145
ROLL NO – 21BME145
EXERCISE 1
Answer the following questions by understanding the data set
1. How many rows are there in the data set? How many
columns? What do the rows and columns represent?
2. Make some pairwise scatterplots of the predictors (columns) in
this data set. Describe your findings.
3. Do any of the suburbs of Boston appear to have
particularly high crime rates? Tax rates? Pupil-teacher
ratios? Comment on the range of each predictor.
4. What is the median pupil-teacher ratio among the towns in this data
set?
5. Which suburb of Boston has the lowest median value of
owner-occupied homes? What are the values of the other predictors for
that suburb, and how do those values compare to the overall
ranges for those predictors? Comment on your findings.
6. In this data set, how many of the suburbs average more than
seven rooms per dwelling? More than eight rooms per dwelling?
Comment on the suburbs that average more than eight rooms
per dwelling.
CODE
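library(MASS)     # provides the Boston data set
library(ggplot2)  # provides qplot(); deprecated in recent ggplot2 releases but still available
head(Boston)
dim(Boston)       # question 1: number of rows and columns
str(Boston)
pairs(Boston)     # question 2: pairwise scatterplots
# Question 3: crime rate, tax rate and pupil-teacher ratio
summary(Boston$crim)
summary(Boston$tax)
summary(Boston$ptratio)  # also reports the median asked for in question 4
qplot(Boston$crim, binwidth=2, xlab="Crime rate", ylab="Number of Suburbs")
qplot(Boston$tax, binwidth=5, xlab="Full-value property-tax rate per $10,000", ylab="Number of Suburbs")
qplot(Boston$ptratio, binwidth=0.1, xlab="Pupil-teacher ratio by town", ylab="Number of Suburbs")
# Share of suburbs above or below selected thresholds
selection <- subset(Boston, crim > 10)
nrow(selection) / nrow(Boston)
selection <- subset(Boston, crim > 50)
nrow(selection) / nrow(Boston)
selection <- subset(Boston, tax < 600)
nrow(selection) / nrow(Boston)
selection <- subset(Boston, tax > 600)
nrow(selection) / nrow(Boston)
# Question 5: sort by median home value; the first row is the lowest
selection <- Boston[order(Boston$medv),]
selection[1,]
summary(selection)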
rm_over_7 <- subset(Boston, rm>7)
nrow(rm_over_7)
rm_over_8 <- subset(Boston, rm>8)
nrow(rm_over_8)
summary(rm_over_8)
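Questions 3 and 4 ask for the range of each predictor and the median pupil-teacher ratio. The listing above reads these off summary() one column at a time; a minimal sketch of computing them for all columns at once follows. It assumes the Boston data frame from MASS, and the names ranges and med_ptratio are illustrative only, not part of the original code.

```r
library(MASS)  # provides the Boston data frame

# Minimum and maximum of every column (all Boston columns are numeric).
# sapply() returns a 2-row matrix with one column per predictor.
ranges <- sapply(Boston, range)
rownames(ranges) <- c("min", "max")
print(t(ranges))  # transposed so each predictor is a row

# Median pupil-teacher ratio across towns (question 4).
med_ptratio <- median(Boston$ptratio)
print(med_ptratio)
```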
[Output for questions 2, 3 and 5 not reproduced]
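For question 5, one way to compare the lowest-medv suburb with the overall ranges is to express each of its values as an empirical percentile within the data set. This is only a sketch under the assumption that the Boston data frame from MASS is used; the names lowest and pct are hypothetical, and which.min() is swapped in for the order() call in the listing (both pick the first row when several suburbs tie).

```r
library(MASS)  # provides the Boston data frame

# Row with the lowest median home value; which.min() returns the first
# such row if several suburbs tie for the minimum.
lowest <- Boston[which.min(Boston$medv), ]

# For each predictor, the fraction of suburbs whose value is at or below
# the value observed in the lowest-medv suburb (a percentile in (0, 1]).
pct <- sapply(names(Boston), function(v) mean(Boston[[v]] <= lowest[[v]]))

print(lowest)
print(round(pct, 2))
```

Values of pct close to 1 mean the suburb sits near the top of that predictor's range, and values close to 0 mean it sits near the bottom.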
EXERCISE 2
1. Remove the rows containing missing values.
2. Using the full data set, investigate the predictors graphically, using
scatterplots or other tools of your choice. Create some plots highlighting the
relationships among the predictors. Comment on your findings.
3. Suppose that we wish to predict the gas mileage (mpg) on the basis of the
other variables. Do your plots suggest that any of the other variables might be
useful in predicting mpg? Justify your answer.
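Step 1 can be illustrated on a small example before applying it to the real data. The sketch below uses a hypothetical toy data frame (the values are made up purely for illustration) to show what na.omit() removes and how complete.cases() identifies the same rows.

```r
# Hypothetical toy data; the values are invented for illustration only.
toy <- data.frame(
  mpg        = c(18, NA, 24, 31),
  horsepower = c(130, 165, NA, 65),
  cylinders  = c(8, 8, 4, 4)
)

# na.omit() drops every row that contains at least one NA.
clean <- na.omit(toy)

# complete.cases() flags exactly the rows that na.omit() keeps.
kept <- complete.cases(toy)

print(nrow(toy) - nrow(clean))  # number of rows removed
print(clean)
```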
R Code
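library(ISLR)         # assumed source of the Auto data set
head(Auto)
New <- na.omit(Auto)  # question 1: drop rows with missing values
noname <- New[,-9]    # drop column 9 (the car name) before plotting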
pairs(noname)
plot(noname$mpg,noname$cylinders, xlab="mpg", ylab="No.of cylinders")
plot(noname$mpg,noname$displacement, xlab="mpg", ylab="Displacement")
plot(noname$mpg,noname$horsepower, xlab="mpg", ylab="Horse Power")
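The scatterplots can be backed up with a numeric check: the correlation of mpg with each other numeric variable. The sketch below is hedged in two ways. cor_with() is a hypothetical helper, not part of the original code, and it is demonstrated on mtcars, which ships with base R and serves here only as a stand-in so the example runs without extra packages.

```r
# cor_with() is a hypothetical helper: it returns the Pearson correlation of
# every other numeric column with the target column, strongest first.
cor_with <- function(df, target) {
  num <- df[sapply(df, is.numeric)]       # keep numeric columns only
  others <- setdiff(names(num), target)   # every column except the target
  r <- sapply(others, function(v) cor(num[[v]], num[[target]]))
  r[order(abs(r), decreasing = TRUE)]     # sort by absolute correlation
}

# Stand-in data: mtcars is NOT the Auto data set used in this exercise.
r_mpg <- cor_with(mtcars, "mpg")
print(round(r_mpg, 2))
```

With the Auto data loaded as in the listing above, the equivalent call would be cor_with(noname, "mpg"). Variables with a large absolute correlation are candidates for predicting mpg, although correlation only captures linear association.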