Data Visualization
In this homework students will create and enhance scatterplots and bubble charts using the
ggplot function of ggplot2. Students will use the R dataset 'mtcars'. Each of the questions is
worth four points (correct = 4 points, almost correct = 2 points, otherwise 0 points). Turn in both
this edited Rmd file and a PDF of the rendered Rmd file.
```{r}
library(ggplot2)
head(mtcars)
str(mtcars)
dim(mtcars)
```
We will explore relationships between car attributes. These attributes are designed
intentionally, usually within the constraints of engineering principles, efficiency, and cost.
Q1) Create a scatterplot by first producing a ggplot object called mtcars.p with the data
and the basic scatterplot aesthetic mappings, using displacement (disp) as x and horsepower (hp)
as y. Then add a point geom to produce the scatterplot.
```{r Q1: scatterplot displacement and horsepower}
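# Base plot object: data plus x/y aesthetic mappings, no layers yet
mtcars.p <- ggplot(data = mtcars, aes(x = disp, y = hp))
# Explicit layer specification; draws the same plot as the geom_point() shortcut below
mtcars.p + layer(geom = 'point', stat = 'identity', position = 'identity')
mtcars.p + geom_point()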
```
Q2) Render the same graph as above, map color to weight (wt), and increase the glyph size.
Add a smooth geom using the default method; do not include the
confidence interval.
```{r Q2: scatterplot with lines}
mtcars.p + geom_point(aes(color = wt, size = wt), alpha = 3/4) + geom_smooth(se = FALSE)
```
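With only 32 rows in mtcars, the default method that geom_smooth() picks should be loess, and ggplot2 prints a message naming the method and formula it chose. The sketch below names both explicitly, which should draw the same curve without the message. The object name q2.p is introduced here for illustration only.

```{r Q2-explicit-smooth}
library(ggplot2)

# Same point layer as Q2, with the smoothing method and formula spelled out
# (assumption: loess is what the default resolves to for a dataset this small)
q2.p <- ggplot(data = mtcars, aes(x = disp, y = hp)) +
  geom_point(aes(color = wt, size = wt), alpha = 3/4) +
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE)
q2.p
```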
Q3) To the x, y, and color aesthetics add shape to the ggplot command using cylinders (cyl).
Note that shape cannot accept a continuous variable, so you will need to change the data type
(within the ggplot2 command) - hint: turn it into a type factor. Do not include the line but do
include the increased size aesthetic as a property of the point geom.
```{r Q3: bubble plot}
mtcars.p + geom_point(aes(color = wt, size = wt, shape = as.factor(cyl)), alpha = 3/4)
```
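The type issue behind the Q3 hint can be checked directly. A small sketch using only base R and the built-in mtcars data: cyl is stored as a number, and as.factor() turns it into a discrete variable with one level per cylinder count. The names cyl.class and cyl.levels are illustrative.

```{r Q3-factor-check}
# cyl is stored as a number, which ggplot2 treats as continuous
cyl.class <- class(mtcars$cyl)
cyl.class

# Converting to a factor yields discrete levels that the shape aesthetic can use
cyl.levels <- levels(as.factor(mtcars$cyl))
cyl.levels
```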
Q4) Create a plot identical to Q3 except only include data points for cars with 4 cylinders.
```{r Q4: plot similar to Q3 with datapoints}
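# Keep only the rows for 4-cylinder cars (all columns, so disp, hp, and wt stay available)
cyl4 <- mtcars[mtcars$cyl == 4, ]
# Same aesthetics as Q3, drawn on the subset only; alpha is a fixed property, so it sits outside aes()
ggplot(data = cyl4, aes(x = disp, y = hp)) +
  geom_point(aes(color = wt, size = wt, shape = as.factor(cyl)), alpha = 3/4)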
```
Q5) To further review the relationship between displacement and horsepower, divide the data
into two segments along the x axis. Create a 'constant' called kCutOff and set it to 250. Create a
ggplot object with displacement and horsepower and then add four layers - two point geoms
and two smooth geoms. One point and smooth should be set for the mtcars rows where
displacement is less than or equal to kCutOff (use the color cyan1 for both layers) and the other
two for values greater than kCutOff (use the color coral1 for both layers). Do not include
confidence intervals, extend the lines across the full graph, and use the lm method for both
smooth geoms.
```{r Q5: bubblechart with lines for disp ranges}
kCutOff <- 250
```
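One possible way to finish Q5 is sketched below. It is an illustration rather than the official answer. It assumes that each layer gets its own subset through the data argument and that fullrange = TRUE is what "extend the lines across the full graph" asks for. The names low, high, and q5.p are introduced here for illustration.

```{r Q5-sketch}
library(ggplot2)

kCutOff <- 250  # same value as the chunk above, repeated so this sketch runs on its own

# Split the rows at the cutoff (hypothetical helper data frames)
low  <- mtcars[mtcars$disp <= kCutOff, ]
high <- mtcars[mtcars$disp >  kCutOff, ]

# Four layers: a point geom and an lm smooth for each segment
q5.p <- ggplot(data = mtcars, aes(x = disp, y = hp)) +
  geom_point(data = low, color = "cyan1") +
  geom_smooth(data = low, method = "lm", formula = y ~ x,
              se = FALSE, fullrange = TRUE, color = "cyan1") +
  geom_point(data = high, color = "coral1") +
  geom_smooth(data = high, method = "lm", formula = y ~ x,
              se = FALSE, fullrange = TRUE, color = "coral1")
q5.p
```

Because color is set as a constant outside aes(), neither segment produces a legend entry. If a legend is wanted, an alternative is to map color to a factor derived from disp <= kCutOff.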