---

title: "Advanced data visualization with R and ggplot2"

author: "a practical by [Yan Holtz](https://github.com/holtzy)"
date: "`r format(Sys.time(), '%d %B %Y')`"
mail: "[email protected]"
linkedin: "yan-holtz-2477534a"
twitter: "r_graph_gallery"
github: "holtzy"
home: "www.yan-holtz.com"
output:
  epuRate::epurate:
    toc: TRUE
    number_sections: FALSE
    code_folding: "show"
---

```{r global options, include = FALSE}

knitr::opts_chunk$set( warning=FALSE, message=FALSE)

library(rmarkdown)
library(epuRate)

# If necessary
# library(devtools)
# install_github("holtzy/epuRate")
```

> This practical follows the previous basic [introduction to ggplot2](https://www.yan-holtz.com/teaching). It goes further with `ggplot2`: annotation, theme customization, color palettes, output formats, scales, and more.

# Get ready
***
The following libraries are needed throughout the practical. Install them with `install.packages()` if you do not have them already, then load them with `library()`.
```{r, echo=TRUE}
# Load it
library(ggplot2)
library(dplyr)
library(hrbrthemes)
library(viridis)
library(plotly)
```

# 1- General appearance
***

## → Titles

Q1.1 The code below builds a basic histogram of Airbnb apartment prices on the French Riviera. It shows only values under 300 euros. Add code to:

- add a title with `ggtitle()`
- change axis labels with `xlab()` and `ylab()`
- change axis limits with `xlim()` and `ylim()`

```{r, eval=FALSE, class.source="Question",echo=TRUE }

# Load dataset from github
data <- read.table("https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/1_OneNum.csv", header=TRUE)

# Make the histogram

data %>%
filter( price<300 ) %>%
ggplot( aes(x=price)) +
geom_histogram() +
...
```

```{r, class.source="Correction",fig.show="hide",echo=FALSE, fig.show='asis'}

# Libraries
library(ggplot2)

# Load dataset from github
data <- read.table("https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/1_OneNum.csv", header=TRUE)

# Make the histogram
data %>%
  filter( price<300 ) %>%
  ggplot( aes(x=price)) +
    geom_histogram() +
    ggtitle("Night price distribution of Airbnb apartments") +
    xlab("Night price") +
    ylab("Number of apartments") +
    xlim(0,400)
```

## → Chart components

All `ggplot2` chart components can be changed using the `theme()` function. You can see a complete list of components in the official [documentation](https://ggplot2.tidyverse.org/reference/theme.html).

Note: each component is set with the matching element function: `element_text()` for text, `element_line()` for lines, and so on.
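
As a rough illustration of how `theme()` pairs each component with an `element_*()` function, here is a minimal sketch on the built-in `mtcars` dataset. The dataset, the object name `p_theme`, the colors and the sizes are arbitrary choices made for this example and are not part of the practical.

```r
library(ggplot2)

# Scatterplot on a built-in dataset (chosen only for illustration)
p_theme <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  ggtitle("Fuel consumption vs. weight") +
  theme(
    plot.title       = element_text(size = 14, face = "bold"),  # text component
    axis.line        = element_line(color = "grey30"),          # line component
    panel.background = element_rect(fill = "grey95"),           # rectangle component
    panel.grid.minor = element_blank()                          # remove a component
  )
p_theme
```

Text components take `element_text()`, lines take `element_line()`, backgrounds and borders take `element_rect()`, and `element_blank()` removes a component altogether.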

Q1.2 Reproduce the previous histogram and change:

- plot title size and color with `plot.title`
- X axis title size and color with `axis.title.x`
- Grid appearance with `panel.grid.major`

```{r, eval=FALSE, class.source="Question",echo=TRUE }

# Make the histogram
data %>%
filter( price<300 ) %>%
ggplot( aes(x=price)) +
... +
theme(
plot.title = element_text(size=..., color=...),
...,
...
)
```

```{r, class.source="Correction",fig.show="hide",echo=FALSE, fig.show='asis'}

# Make the histogram
data %>%
filter( price<300 ) %>%
ggplot( aes(x=price)) +
geom_histogram() +
ggtitle("Night price distribution of Airbnb apartments") +
xlab("Night price") +
ylab("Number of apartments") +
xlim(0,400) +
theme(
plot.title = element_text(size=13, color="orange"),
axis.title.x = element_text(size=13, color="purple"),
panel.grid.major = element_line(colour = "red")
)
```

## → Themes

Q1.3 `ggplot2` offers a set of pre-built themes. Try the following ones to see which you like the most:

- `theme_bw()`
- `theme_dark()`
- `theme_minimal()`
- `theme_classic()`

See a complete list [here](https://www.r-graph-gallery.com/192-ggplot-themes/).

```{r, eval=FALSE, class.source="Question",echo=TRUE }

# Load dataset from github
data <- read.table("https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/1_OneNum.csv", header=TRUE)

# Make the histogram

data %>%
... +
theme_classic()
```

```{r, class.source="Correction",fig.show="hide",echo=FALSE, fig.show='asis'}

# Load dataset from github
data <- read.table("https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/1_OneNum.csv", header=TRUE)

# Make the histogram
data %>%
  filter( price<300 ) %>%
  ggplot( aes(x=price)) +
    geom_histogram(fill="#69b3a2", color="#e9ecef", alpha=0.9) +
    ggtitle("Night price distribution of Airbnb apartments") +
    theme_classic()
```

Q1.4 The `hrbrthemes` package provides my favourite style. Install the package, load it, and apply `theme_ipsum()`. Documentation is [here](https://github.com/hrbrmstr/hrbrthemes).

```{r, eval=FALSE, class.source="Question",echo=TRUE }

# Libraries
library(tidyverse)
library(hrbrthemes)
library(viridis)

# Load dataset from github
data <- read.table("https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/1_OneNum.csv", header=TRUE)

# Make the histogram
data %>%
  filter( price<300 ) %>%
  ggplot( aes(x=price)) +
    stat_bin(breaks=seq(0,300,10), fill="#69b3a2", color="#e9ecef", alpha=0.9) +
    ggtitle("Night price distribution of Airbnb apartments") +
    theme_ipsum()
```

```{r, class.source="Correction",fig.show="hide",echo=FALSE, fig.show='asis'}

# Libraries
library(tidyverse)
library(hrbrthemes)
library(viridis)

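# Load dataset from github
data <- read.table("https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/1_OneNum.csv", header=TRUE)

# Make the histogram (same code as in the question chunk above)
data %>%
  filter( price<300 ) %>%
  ggplot( aes(x=price)) +
    stat_bin(breaks=seq(0,300,10), fill="#69b3a2", color="#e9ecef", alpha=0.9) +
    ggtitle("Night price distribution of Airbnb apartments") +
    theme_ipsum()
```
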
# 2- Annotation
***
Annotation is a crucial component of a good dataviz. It can turn a boring graphic into an interesting and insightful way to convey information. Dataviz is often split into two main types: exploratory and explanatory. Annotation is used for the second type.

## → Text
The most common type of annotation is text. Let's say you have a spike in a line plot. It makes sense to highlight it and explain in more detail what it is about.

Q1.1 Build a line plot showing the bitcoin price evolution between 2013 and 2018. The dataset is located [here](https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/3_TwoNumOrdered.csv) and can be read directly with `read.table()`. What part of the chart would you highlight?

```{r, eval=FALSE, class.source="Question",echo=TRUE }

# Load dataset from github
data <- read.table("https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/3_TwoNumOrdered.csv", header=TRUE)
data$date <- as.Date(data$date)

# plot
...
```

```{r, class.source="Correction",fig.show="hide",echo=FALSE}
# Load dataset from github
data <- read.table("https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/3_TwoNumOrdered.csv", header=TRUE)
data$date <- as.Date(data$date)

# plot
data %>%
ggplot( aes(x=date, y=value)) +
geom_line(color="#69b3a2")
```

Q1.2 Use the `annotate()` function to add text. `annotate()` takes several arguments:

- `geom`: type of annotation, use `text`
- `x`: position on the X axis
- `y`: position on the Y axis
- `label`: what you want to write
- Optional: `color`, `size`, `angle` and [more](https://www.r-graph-gallery.com/233-add-annotations-on-ggplot2-chart/).

```{r, eval=FALSE, class.source="Question",echo=TRUE }

# plot
data %>%
ggplot( aes(x=date, y=value)) +
geom_line(color="#69b3a2") +
annotate( x=as.Date("2017-01-01"), ...)
```

```{r, class.source="Correction",fig.show="hide",echo=FALSE}
# plot
data %>%
ggplot( aes(x=date, y=value)) +
geom_line(color="#69b3a2") +
annotate(geom="text", x=as.Date("2017-01-01"), y=19000,
label="Bitcoin price reached 20k $\nat the end of 2017")
```

## → Shape

Q1.3 Find the exact spike `date` and its `value`. Use this information to add a circle around the spike. This is done with the `annotate()` function once more:

- `geom`: use `point`
- `x`: position on the X axis
- `y`: position on the Y axis
- `shape`: use 21 to be able to change both the `fill` and the `color` arguments (fill = inside, color = stroke)
- `size`

```{r, eval=FALSE, class.source="Question",echo=TRUE }

# Find spike date and value:
# data %>% arrange(...) ...

# plot
data %>%
ggplot( aes(x=date, y=value)) +
geom_line(color="#69b3a2") +
annotate(geom="text", ...) +
annotate(geom="point", ...)
```

```{r, class.source="Correction",fig.show="hide",echo=FALSE}
# Find spike date and value:
# data %>% arrange(desc(value)) %>% head(1)

# plot
data %>%
ggplot( aes(x=date, y=value)) +
geom_line(color="#69b3a2") +
ylim(0,22000) +
annotate(geom="text", x=as.Date("2017-01-01"), y=20089,
label="Bitcoin price reached 20k $\nat the end of 2017") +
annotate(geom="point", x=as.Date("2017-12-17"), y=20089, size=10, shape=21,
fill="transparent")
```
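
The same text-plus-circle pattern can be reused on any time series. The sketch below is only an illustration: it assumes the `economics` dataset that ships with `ggplot2` instead of the bitcoin file, finds the peak programmatically rather than hard-coding it, and uses the hypothetical names `peak` and `p_annot`.

```r
library(ggplot2)

# `economics` ships with ggplot2; `unemploy` is a monthly US series
peak <- economics[which.max(economics$unemploy), ]

p_annot <- ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line(color = "#69b3a2") +
  # Text placed one year to the left of the peak; hjust = 1 right-aligns the label
  annotate(geom = "text", x = peak$date - 365, y = peak$unemploy,
           label = "Highest value\nin the series", hjust = 1) +
  # Hollow circle centred exactly on the peak
  annotate(geom = "point", x = peak$date, y = peak$unemploy,
           size = 10, shape = 21, fill = "transparent")
p_annot
```

Reading the coordinates from the data keeps the annotation in place if the dataset is updated.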

## → Abline

Q1.4 Add a horizontal line to show which part of the curve is over 5000 $. This is possible thanks to the `geom_hline()` function, which requires a `yintercept` argument.

```{r, eval=FALSE, class.source="Question",echo=TRUE }

# Find spike date and value:
# data %>% arrange(desc(value)) %>% head(1)
# plot
data %>%
...
annotate(...) +
annotate(...) +
geom_hline(..., color=..., size=...)
```

```{r, class.source="Correction",fig.show="hide",echo=FALSE}
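# Find spike date and value:
# data %>% arrange(desc(value)) %>% head(1)

# plot (the color and size of the horizontal line are arbitrary choices)
data %>%
  ggplot( aes(x=date, y=value)) +
    geom_line(color="#69b3a2") +
    ylim(0,22000) +
    annotate(geom="text", x=as.Date("2017-01-01"), y=20089,
      label="Bitcoin price reached 20k $\nat the end of 2017") +
    annotate(geom="point", x=as.Date("2017-12-17"), y=20089, size=10, shape=21,
      fill="transparent") +
    geom_hline(yintercept=5000, color="orange", size=0.5)
```
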
## → Color

Q1.5 Build a scatterplot based on the `gapminder` dataset. Use `gdpPercap` for the X axis, `lifeExp` for the Y axis, and `pop` for bubble size. Keep only the year 2007.

```{r, class.source="Correction",fig.show="hide",echo=FALSE}
# Data are available in the gapminder package
library(gapminder)
data <- gapminder %>% filter(year=="2007") %>% select(-year)

# Basic scatterplot
ggplot( data, aes(x=gdpPercap, y=lifeExp, size = pop, color = continent)) +
geom_point(alpha=0.7)

```

Q1.6 Highlight South Africa in the chart: draw it in red, with all other circles in grey. Follow these steps:

- create a new column with `mutate()`: this new column has the value `yes` if `country=="South Africa"`, `no` otherwise. This is possible thanks to the `ifelse()` function.
- in the aesthetics part of the ggplot call, use this new column to control dot colors
- use `scale_color_manual()` to control the color of both groups. Use a bright color for the country to highlight, and grey for the others.

```{r, eval=FALSE, class.source="Question",echo=TRUE }

# Basic scatterplot
data %>%
mutate(isSouthAfrica = ... ) %>%
ggplot( .., color = isSouthAfrica)) +
geom... +
scale_color_manual(values=c("grey", "red")) +
theme(legend.position="none")
```

```{r, class.source="Correction",fig.show="hide",echo=FALSE}
# Basic scatterplot
data %>%
mutate(isSouthAfrica = ifelse(country=="South Africa", "yes", "no")) %>%
ggplot( aes(x=gdpPercap, y=lifeExp, size = pop, color = isSouthAfrica)) +
geom_point(alpha=0.7) +
scale_color_manual(values=c("grey", "red")) +
theme(legend.position="none")

```
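
`scale_color_manual()` assigns an unnamed `values` vector to the groups in order, which for a character column means alphabetical order (`no`, then `yes`). A named vector makes the mapping explicit. The sketch below is a hedged illustration on the built-in `mtcars` dataset; the `isHeavy` column, the 5 (thousand lbs) threshold and the name `p_highlight` are hypothetical.

```r
library(ggplot2)
library(dplyr)

p_highlight <- mtcars %>%
  mutate(isHeavy = ifelse(wt > 5, "yes", "no")) %>%   # hypothetical flag column
  ggplot(aes(x = wt, y = mpg, color = isHeavy)) +
  geom_point(size = 3) +
  # Names tie each color to a group, whatever the order of the vector
  scale_color_manual(values = c(yes = "red", no = "grey")) +
  theme(legend.position = "none")
p_highlight
```

With names in place, reordering the vector or renaming a level no longer swaps the highlight color by accident.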

## → Multiple text

Q1.7 Highlight every country with `gdpPercap > 5000` & `lifeExp < 60` in red. Write their names using the `geom_text_repel()` function of the `ggrepel` package to avoid text overlapping.

```{r, eval=FALSE, class.source="Question",echo=TRUE }

# ggrepel
library(ggrepel)

# prepare data
tmp <- data %>%
mutate( annotation = ifelse(...))

# plot
tmp %>%
ggplot( ...) +
geom... +
theme(...) +
geom_text_repel(data=tmp %>% filter(annotation=="yes"), aes(label=country),
size=4 )
```

```{r, class.source="Correction",fig.show="hide",echo=FALSE}
# ggrepel
library(ggrepel)

# prepare data
tmp <- data %>%
mutate( annotation = ifelse(gdpPercap > 5000 & lifeExp < 60, "yes", "no"))

# plot
tmp %>%
ggplot( aes(x=gdpPercap, y=lifeExp, size = pop, color = continent)) +
geom_point(alpha=0.7) +
theme(legend.position="none") +
geom_text_repel(data=tmp %>% filter(annotation=="yes"), aes(label=country),
size=4 )
```

# 3- Faceting
***

Faceting is a very powerful data visualization technique. It splits the figure into small subsets, usually one per level of a categorical variable. `ggplot2` offers 2 functions to build small multiples: `facet_wrap()` and `facet_grid()`.

## → facet_wrap()

Q3.1 Build a [spaghetti chart](https://www.data-to-viz.com/caveat/spaghetti.html) showing the evolution of 9 baby names in the US. (See code [here](https://www.data-to-viz.com/caveat/spaghetti.html)). What's wrong with this chart?

```{r, eval=FALSE, class.source="Question",echo=TRUE }

# Libraries
library(babynames)

# Keep 9 female names from the babynames dataset

data <- babynames %>%
filter(name %in% c("Ashley", "Amanda", "Jessica", "Patricia", "Linda",
"Deborah", "Dorothy", "Betty", "Helen")) %>%
filter(sex=="F")

...

```

```{r, class.source="Correction",fig.show="hide",echo=FALSE}
# Libraries
library(babynames)

# Keep 9 female names from the babynames dataset

data <- babynames %>%
filter(name %in% c("Ashley", "Amanda", "Jessica", "Patricia", "Linda",
"Deborah", "Dorothy", "Betty", "Helen")) %>%
filter(sex=="F")
# line plot = spaghetti chart
data %>%
ggplot( aes(x=year, y=n, group=name, color=name)) +
geom_line() +
ggtitle("Popularity of American names in the previous 30 years")
```

Q3.2 Use the `facet_wrap()` function to build one area chart for each name. Basically, you have to provide a categorical variable to the function. It will build a chart for each of its levels.

Have a look at the Y axis. What do you observe? Is it a good option?

```{r, eval=FALSE, class.source="Question",echo=TRUE }

...
geom_area() +
... +
facet_wrap(~name)
```

You should get something like this:

```{r, class.source="Correction",fig.show="hide",echo=FALSE, fig.show='asis'}

data %>%
ggplot( aes(x=year, y=n, group=name, fill=name)) +
geom_area() +
ggtitle("Popularity of American names in the previous 30 years") +
theme(legend.position="none") +
facet_wrap(~name)
```

Q3.3 Find out how to use the `scales` argument to have different Y axis limits for each subset. Does it make sense? Under which conditions?

```{r, eval=FALSE, class.source="Question",echo=TRUE }

...
geom_area() +
... +
facet_wrap(~name, scales=)
```

You should get something like this:

```{r, class.source="Correction",fig.show="hide",echo=FALSE, fig.show='asis'}
data %>%
ggplot( aes(x=year, y=n, group=name, fill=name)) +
geom_area() +
ggtitle("Popularity of American names in the previous 30 years") +
theme(legend.position="none") +
facet_wrap(~name, scales="free_y")
```

## → facet_grid()

Bonus Find out what the `facet_grid()` function does. How does it differ from `facet_wrap()`?

Bonus Load [this](https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/10_OneNumSevCatSubgroupsSevObs.csv) dataset in R. Build a histogram for every combination of day and sex using `facet_grid()`.

You should get something like:

```{r, class.source="Correction",fig.show="hide",echo=FALSE, fig.show='asis'}

# Load dataset from github
data <- read.table("https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/10_OneNumSevCatSubgroupsSevObs.csv", header=TRUE, sep=",")

# Plot
ggplot(data, aes(x=total_bill)) +
geom_histogram() +
facet_grid(sex~day)
```
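
To compare the two faceting functions side by side, here is a minimal sketch. It assumes the `mpg` dataset that ships with `ggplot2` rather than the practical's files, and the object names `base`, `p_wrap` and `p_grid` are hypothetical.

```r
library(ggplot2)

# `mpg` ships with ggplot2; `drv` and `cyl` are two categorical variables
base <- ggplot(mpg, aes(x = hwy)) +
  geom_histogram(bins = 20)

# facet_wrap(): one variable, panels wrapped into as many rows as needed
p_wrap <- base + facet_wrap(~drv)

# facet_grid(): two variables laid out as rows ~ columns,
# every combination gets a cell, even when it holds no data
p_grid <- base + facet_grid(drv ~ cyl)

p_wrap
p_grid
```

`facet_wrap()` is usually the better fit for a single variable with many levels, while `facet_grid()` makes it easy to read along one variable per row and another per column.
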
# 4- Saving plots
***

Q4.1 - Save the previous chart as a `PNG` file using the `ggsave()` function. Where is the file saved?

```{r}
# save the plot in an object called p
p <- ggplot(data, aes(x=total_bill)) +
geom_histogram() +
facet_grid(sex~day)

# Save the plot

ggsave(p, filename = "chartFromRPractical.png")
```

Q4.2 - Specify the complete path before the file name to save the chart at a specific location.
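
A possible way to do this is sketched below. It is only an illustration: `tempdir()` stands in for a folder of your choice, the plot is built on the built-in `mtcars` dataset, and the dimensions and resolution are arbitrary.

```r
library(ggplot2)

p_save <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Without a directory, ggsave() writes to the current working directory
getwd()

# Build a full path; tempdir() stands in for a folder of your choice
out_file <- file.path(tempdir(), "chartFromRPractical.png")

# width and height are in inches by default; dpi sets the resolution
ggsave(filename = out_file, plot = p_save, width = 6, height = 4, dpi = 300)

file.exists(out_file)
```

Using `file.path()` rather than pasting strings keeps the path valid across operating systems.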

# 5- Colors
***
Picking the right colors is a crucial step for a good dataviz. R offers awesome
options and packages to make the right choices. Here is an overview of the main
options.

## → One color

Q5.1 Several options exist to pick a single color. Change the histogram color using the `fill` argument on the chart below, trying each of the following options:

- a plain color name. Type `colors()` to see all the options.
- `rgb()`. This function takes the quantity of red, green and blue used to build the color, plus an optional argument for the opacity. For example, try `rgb(.7, .6, .3, .2)`.
- an `HTML` color. Use [this website](https://www.w3schools.com/colors/colors_picker.asp) to pick one you like.
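
The three ways of writing a color listed above can be explored directly in the console. This is a small sketch using base R only; the variable names `semi_red` and `html_green` are hypothetical.

```r
# 1. Named colors: colors() returns every name R knows
"steelblue" %in% colors()

# 2. rgb(): red, green, blue and an optional opacity, each between 0 and 1
semi_red <- rgb(1, 0, 0, 0.5)
semi_red   # an 8-digit hex string: RRGGBB followed by the alpha channel

# 3. HTML (hex) codes are plain strings and can be passed to `fill` directly
html_green <- "#69b3a2"

# col2rgb() converts any of the three forms back to 0-255 components
col2rgb(c("steelblue", semi_red, html_green), alpha = TRUE)
```

Whichever form you choose, the result is a string that `fill` and `color` accept directly.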

```{r, eval=FALSE, class.source="Question",echo=TRUE }

# Load dataset from github
data <- read.table("https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/1_OneNum.csv", header=TRUE)

# Make the histogram
data %>%
  filter( price<300 ) %>%
  ggplot( aes(x=price)) +
    geom_histogram(fill="steelblue") +
    ggtitle("Night price distribution of Airbnb apartments") +
    theme_ipsum()
```

```{r, class.source="Correction",fig.show="hide",echo=FALSE, fig.show='asis'}

# Load dataset from github
data <- read.table("https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/1_OneNum.csv", header=TRUE)

# Make the histogram
data %>%
  filter( price<300 ) %>%
  ggplot( aes(x=price)) +
    geom_histogram(fill="steelblue") +
    ggtitle("Night price distribution of Airbnb apartments") +
    theme_ipsum()
```

## → Discrete color palette

Q5.2 Build a scatterplot based on the `iris` dataset. Use `Sepal.Length` for the X axis and `Petal.Length` for the Y axis. Use `color=Species` to color groups.

```{r, class.source="Correction",fig.show="hide",echo=FALSE}
ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Species)) +
geom_point()
```

Q5.3 It is possible to set the color scale manually using `scale_color_manual()`. Use the hint below to see how it works and apply it to the previous scatterplot.

Note: it is bad practice to pick colors randomly. Your palette will be ugly and will probably not be colorblind friendly.

```{r, eval=FALSE, class.source="Question",echo=TRUE }

ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Species)) +
geom_point() +
scale_color_manual( values=c("red","green","blue"))
```

```{r, class.source="Correction",fig.show="hide",echo=FALSE}
ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Species)) +
geom_point() +
scale_color_manual( values=c("red","green","blue"))
```

Q5.4 Fortunately, people have already tackled this issue for us and created packages offering nice color palettes. The most famous one is `RColorBrewer`. Its palettes are already available in `ggplot2`. See all of them [here](https://www.r-graph-gallery.com/38-rcolorbrewers-palettes/), and use one on your chart with `scale_color_brewer()`.

Pick the one you like the most and apply it to the previous scatterplot. Use it to color the `Species`.

```{r, eval=FALSE, class.source="Question",echo=TRUE }

... +
scale_color_brewer(palette = )
```

```{r, class.source="Correction",fig.show="hide",echo=FALSE}
ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Species)) +
geom_point(size=4) +
scale_color_brewer(palette = "Set3")
```
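
Another ready-made option is the viridis family, which was designed to stay readable for most forms of color blindness. The sketch below assumes a recent `ggplot2` (3.0 or later), where `scale_color_viridis_d()` is built in, so the separate `viridis` package is not needed for it. The object name `p_viridis` is hypothetical.

```r
library(ggplot2)

p_viridis <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, color = Species)) +
  geom_point(size = 4) +
  # Discrete viridis scale shipped with ggplot2; `option` picks the variant
  scale_color_viridis_d(option = "viridis")
p_viridis
```

The `_d` suffix stands for discrete; `scale_color_viridis_c()` is the continuous counterpart.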

## → Continuous color palette

Q5.5 `RColorBrewer` also offers continuous color palettes. However, they must be called through the `scale_color_distiller()` function. Use the palette you like the most to color circles depending on `Sepal.Length`.

```{r, eval=FALSE, class.source="Question",echo=TRUE }

... +
scale_color_distil...
```

```{r, class.source="Correction",fig.show="hide",echo=FALSE, fig.show='asis'}

ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Sepal.Length)) +
geom_point() +
scale_color_distiller(palette = "RdPu")
```
# 6- Interactive charts
***
An interactive chart is a chart on which you can zoom, hover over shapes to get tooltips, click to trigger actions, and more. Building interactive charts requires JavaScript under the hood, but it is relatively easy to do using R packages that wrap the JavaScript for you. Packages of this type are called [HTML widgets](https://www.htmlwidgets.org).

## → Plotly

Q6.1 Build the `gapminder` bubble plot you made in the annotation part of this practical. Store it in an object called `p`.
```{r, eval=FALSE, class.source="Question",echo=TRUE }
# load data
library(gapminder)
data <- gapminder %>% filter(year=="2007") %>% select(-year)

# Basic ggplot
p <- data %>%
ggplot( ...
p
```

```{r, class.source="Correction",fig.show="hide",echo=FALSE}
# load data
library(gapminder)
data <- gapminder %>% filter(year=="2007") %>% select(-year)

# Basic ggplot
p <- data %>%
ggplot( aes(x=gdpPercap, y=lifeExp, size = pop, color = continent)) +
geom_point(alpha=0.7)
p
```

Q6.2 Install and load the `plotly` package. Build an interactive chart using the `ggplotly()` function. What are the new functionalities of this chart? Is it useful? What could be better?
```{r, fig.align="center"}
# Interactive version
library(plotly)
ggplotly(p)
```

Q6.3 Let's improve the tooltip of the chart:

- build a new column called `myText`. Fill it with whatever you want to show in the tooltip.
- add a new aesthetic: `text=myText`
- in the `ggplotly()` call, add `tooltip="text"`

```{r, eval=FALSE, class.source="Question",echo=TRUE }

# Basic ggplot
p <- data %>%
mutate(myText=...) %>%
ggplot( aes(...text=myText)) +
...

ggplotly(p, tooltip="text")
```

```{r, class.source="Correction",fig.show="hide",echo=FALSE}
# Basic ggplot
p <- data %>%
mutate(myText=paste("This country is: " , country )) %>%
ggplot( aes(x=gdpPercap, y=lifeExp, size = pop, color = continent, text=myText)) +
geom_point(alpha=0.7)

ggplotly(p, tooltip="text")

```

## → Leaflet

Q6.4 Use the HTML widget called `leaflet` to build an interactive map showing the earthquakes described in the dataset called `quakes`. Code is fully provided here, since cartography with R could deserve an entire practical. The idea is just to discover the potential offered in a few lines of code:

```{r}
# Library
library(leaflet)

# load example data (Fiji Earthquakes) + keep only 100 first lines
data(quakes)
quakes = head(quakes, 100)

# Create a color palette with handmade bins.

mybins=seq(4, 6.5, by=0.5)
mypalette = colorBin( palette="YlOrBr", domain=quakes$mag, na.color="transparent",
bins=mybins)

# Final Map
leaflet(quakes) %>%
addTiles() %>%
setView( lat=-27, lng=170 , zoom=4) %>%
addProviderTiles("Esri.WorldImagery") %>%
addCircleMarkers(~long, ~lat,
fillColor = ~mypalette(mag), fillOpacity = 0.7, color="white", radius=8,
stroke=FALSE
) %>%
addLegend( pal=mypalette, values=~mag, opacity=0.9, title = "Magnitude", position
= "bottomright" )
```

## → Heatmap
The `d3heatmap` package allows you to build interactive heatmaps in a few lines of code. Let's see how it works.

Q6.5 Load [this](http://datasets.flowingdata.com/ppg2008.csv) dataset in R. Have a look at the first rows. Describe it. ([source](https://flowingdata.com/2010/01/21/how-to-make-a-heatmap-a-quick-and-easy-solution/))

```{r, eval=FALSE, class.source="Question",echo=TRUE }

# Load data
data <- read.csv(...)
```

```{r, class.source="Correction",fig.show="hide",echo=FALSE}
# Load data
data <- read.csv("http://datasets.flowingdata.com/ppg2008.csv", row.names = 1)

# head(data)
# summary(data)
```

Q6.6 R offers a `heatmap()` function to build... heatmaps! Apply it to the dataset. What do you observe? Are you happy with this heatmap? What's wrong with it? How can we solve the issue?

Note: the input dataset must be in `matrix` format to be accepted by the function. Use `as.matrix()` to get this format.
```{r, class.source="Correction",fig.show="hide",echo=FALSE}
# Make the heatmap
heatmap( as.matrix(data) )
```

Q6.7 Check the `scale` option of the `heatmap()` function. What is it for? Can it help us? How? Use it to improve the heatmap.

```{r, class.source="Correction",fig.show="hide",echo=FALSE}
# Make the heatmap
heatmap( as.matrix(data), scale = "column")
```
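
To see what column scaling does to the numbers before they are colored, here is a minimal sketch. It assumes the built-in `mtcars` dataset instead of the basketball file; the names `m` and `z` are hypothetical.

```r
# mtcars columns live on very different ranges (e.g. hp vs. wt)
m <- as.matrix(mtcars)
round(apply(m, 2, range), 1)

# scale() centers each column on 0 and divides it by its standard deviation
z <- scale(m)
round(colMeans(z), 10)   # all columns now centered
apply(z, 2, sd)          # and share the same spread

# heatmap(scale = "column") applies this kind of column standardization
# to the values before coloring them
heatmap(m, scale = "column")
```

Without scaling, the column with the largest values dominates the color range and the other columns look uniform.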

Q6.8 `d3heatmap()` uses exactly the same syntax as `heatmap()`. Use the function to get an interactive version of the previous heatmap!

```{r, eval=FALSE, class.source="Question",echo=TRUE }

# Load library
library(d3heatmap)

# Build heatmap
d3heatmap(...)
```

```{r, class.source="Correction",fig.show="hide",echo=FALSE}
# Load library
library(d3heatmap)

# Build heatmap
d3heatmap(data, scale = "column")
```

## → Time Series

Q6.9 - Use the HTML widget called `dygraphs` to build an interactive line plot of the [bitcoin price evolution](https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/3_TwoNumOrdered.csv). Try to reproduce the example below.

```{r, fig.align="center", out.width="100%"}

# Library
library(dygraphs)
library(xts) # To convert between data frame and xts format

# Load dataset from github
data <- read.table("https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/3_TwoNumOrdered.csv", header=TRUE)
data$date <- as.Date(data$date)

# Then you can create the xts format, and thus use dygraph
don <- xts(x = data$value, order.by = data$date)

# Use the dygraph HTML widget

dygraph(don) %>%
dyOptions(labelsUTC = TRUE, fillGraph=TRUE, fillAlpha=0.1, drawGrid = FALSE,
colors="#D8AE5A") %>%
dyRangeSelector() %>%
dyCrosshair(direction = "vertical") %>%
dyHighlight(highlightCircleSize = 5, highlightSeriesBackgroundAlpha = 0.2,
hideOnMouseOut = FALSE) %>%
dyRoller(rollPeriod = 1)
```

## → HTML widgets

BONUS - The packages showcased above are just a sample of the possibilities offered by HTML widgets. Visit [this website](https://www.htmlwidgets.org/showcase_leaflet.html) for an overview of the kinds of interactive charts you can build with `R`. Pick your favorite example and try to reproduce it.

# 7- Scales
***
Scales control the details of how data values are translated to visual properties. [Many different scales](https://ggplot2.tidyverse.org/reference/index.html#section-scales) are offered by ggplot2. The most widely used one is probably the log scale.

Q7.1 Build a histogram showing the night price distribution of the French Riviera apartments ([data here](https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/1_OneNum.csv)). Keep all the data, including extreme values.

```{r, eval=FALSE, class.source="Question",echo=TRUE }

# Load dataset from github
data <- read.table("https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/1_OneNum.csv", header=TRUE)

# Make the histogram

data %>%
...
geom_histogram() +
...
```

```{r, class.source="Correction",fig.show="hide",echo=FALSE, fig.show='asis'}

# Libraries
library(ggplot2)

# Load dataset from github
data <- read.table("https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/1_OneNum.csv", header=TRUE)

# Make the histogram
data %>%
  ggplot( aes(x=price)) +
    geom_histogram(color="white", fill="steelblue4") +
    ggtitle("Night price distribution of Airbnb apartments") +
    xlab("Night price") +
    ylab("Number of apartments")
```

Q7.2 A common practice to avoid the effect of extreme values is to filter the data, or to use `xlim` to zoom in on a part of the axis. Another approach is to use `scale_x_log10()` to apply a log transformation. Apply this function to the histogram.

```{r, eval=FALSE, class.source="Question",echo=TRUE }

# Load dataset from github
data <- read.table("https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/1_OneNum.csv", header=TRUE)

# Make the histogram

data %>%
...
geom_histogram() +
... +
scale_x_log10()
```
```{r, class.source="Correction",fig.show="hide",echo=FALSE, fig.show='asis'}
# Libraries
library(ggplot2)

# Load dataset from github
data <- read.table("https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/1_OneNum.csv", header=TRUE)

# Make the histogram
data %>%
  ggplot( aes(x=price)) +
    geom_histogram(color="white", fill="steelblue4") +
    ggtitle("Night price distribution of Airbnb apartments") +
    xlab("Night price") +
    ylab("Number of apartments") +
    scale_x_log10()
```

Q7.3 What's the difference between `scale_x_log10()` and applying the `log()` function to the dataset before building the chart? Why is it better?
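
One way to explore this question is to draw both versions next to each other. The sketch below is only an illustration under stated assumptions: it uses simulated right-skewed prices (hypothetical data, not the Airbnb file), `log10()` for the data transformation so that both charts use the same base, and the hypothetical names `prices`, `p_scale` and `p_data`.

```r
library(ggplot2)

# Simulated right-skewed prices (hypothetical data, not the Airbnb file)
set.seed(1)
prices <- data.frame(price = rlnorm(1000, meanlog = 4.5, sdlog = 0.8))

# Option 1: transform the scale; axis labels stay in the original unit
p_scale <- ggplot(prices, aes(x = price)) +
  geom_histogram(bins = 30) +
  scale_x_log10()

# Option 2: transform the data; axis labels now show log10(price)
p_data <- ggplot(prices, aes(x = log10(price))) +
  geom_histogram(bins = 30)

p_scale
p_data
```

The bars have the same shape in both charts, but only the first one keeps an axis that can be read directly in euros.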

Leandro Calado
No ratings yet
MIT 302 - Statistical Computing II - Tutorial 04
No ratings yet
MIT 302 - Statistical Computing II - Tutorial 04
7 pages
Ga Irrsp Study Guide
100% (3)
Ga Irrsp Study Guide
7 pages
Nano Fluids PDF
No ratings yet
Nano Fluids PDF
22 pages
Lesson3 Sandbox - RMD
No ratings yet
Lesson3 Sandbox - RMD
4 pages
How To Make Any Plot in Ggplot2?: Topics
No ratings yet
How To Make Any Plot in Ggplot2?: Topics
18 pages
# Fake Some Missing Data: Library Ifelse Ggplot Aes Log Geom - Miss - Point Stat - Summary Facet - Wrap Towebgl Ggplotly
No ratings yet
# Fake Some Missing Data: Library Ifelse Ggplot Aes Log Geom - Miss - Point Stat - Summary Facet - Wrap Towebgl Ggplotly
4 pages
Ggplot2 Exercise
No ratings yet
Ggplot2 Exercise
6 pages
GB ENEXIO SKS Lamella Clarifier
No ratings yet
GB ENEXIO SKS Lamella Clarifier
4 pages
MELSEC iQ-R PROFINET IO Controller Module Function Block Reference
No ratings yet
MELSEC iQ-R PROFINET IO Controller Module Function Block Reference
36 pages
Content: Dplyr, Readr, TM, Ggplot2/+ggforce/, Tidyr, Broom Dplyr
No ratings yet
Content: Dplyr, Readr, TM, Ggplot2/+ggforce/, Tidyr, Broom Dplyr
8 pages
VacStar OP Manual 55151 RevJ
No ratings yet
VacStar OP Manual 55151 RevJ
12 pages
Evaluation Reporting of Results Annex 2a Examples of Re Test Programmes For Quantitative Tests PDF
No ratings yet
Evaluation Reporting of Results Annex 2a Examples of Re Test Programmes For Quantitative Tests PDF
17 pages
TinyG Report - Final
No ratings yet
TinyG Report - Final
44 pages
Create Elegant Data Visualisations Using The Grammar of Graphics - Ggplot2
No ratings yet
Create Elegant Data Visualisations Using The Grammar of Graphics - Ggplot2
5 pages
7 Series Fpgas Data Sheet: Overview: General Description
No ratings yet
7 Series Fpgas Data Sheet: Overview: General Description
18 pages
Cheat Sheet Ggplot2
No ratings yet
Cheat Sheet Ggplot2
2 pages
6.3.1.10 Packet Tracer - Exploring Internetworking Devices Instructions
100% (1)
6.3.1.10 Packet Tracer - Exploring Internetworking Devices Instructions
4 pages
Under The Guidance Of:-Mr. Prahakant Dwivedi (Assistant Professor)
No ratings yet
Under The Guidance Of:-Mr. Prahakant Dwivedi (Assistant Professor)
17 pages
No Ph.D. Game Design With Three.js
From Everand
No Ph.D. Game Design With Three.js
Nikiforos Kontopoulos
No ratings yet
Exercise-9..Study and Implementation of Data Visulization With Ggplot
No ratings yet
Exercise-9..Study and Implementation of Data Visulization With Ggplot
1 page
Battery Charger or Battireis PM
No ratings yet
Battery Charger or Battireis PM
4 pages
How to a Developers Guide to 4k: Developer edition, #3
From Everand
How to a Developers Guide to 4k: Developer edition, #3
Xinc Cyberwizard
No ratings yet
Plotting With Ggplot: Install - Packages ("Ggplot2") Library (Ggplot2)
No ratings yet
Plotting With Ggplot: Install - Packages ("Ggplot2") Library (Ggplot2)
3 pages
Proposal Work Defense Prince
No ratings yet
Proposal Work Defense Prince
19 pages
10 1093@ijlit@eaz004 PDF
No ratings yet
10 1093@ijlit@eaz004 PDF
33 pages
Oppe-2 (24 July) Java
No ratings yet
Oppe-2 (24 July) Java
16 pages
Prospectus The-Africa-Epidemic-Services ENG v2 2
No ratings yet
Prospectus The-Africa-Epidemic-Services ENG v2 2
4 pages
Existing Control Probability (Before Risk ID Analyses Module/ Compone Potential Failure Potential Cause(s) Potential Effect of Severity (Prior To
No ratings yet
Existing Control Probability (Before Risk ID Analyses Module/ Compone Potential Failure Potential Cause(s) Potential Effect of Severity (Prior To
25 pages
Ggplot2 Cheat Sheet
No ratings yet
Ggplot2 Cheat Sheet
1 page
Operating System Exercises - Chapter 5-Exr
No ratings yet
Operating System Exercises - Chapter 5-Exr
2 pages
10 1136@bmj m1326 PDF
No ratings yet
10 1136@bmj m1326 PDF
2 pages
Computer Engineering Laboratory Solution Primer
From Everand
Computer Engineering Laboratory Solution Primer
Karan Bhandari
No ratings yet
4.2 Variance and Covariance of Random Variables: Definition 4.4
No ratings yet
4.2 Variance and Covariance of Random Variables: Definition 4.4
5 pages
7 0 0 4 MBA (Sem I) Theory Examination 2017-18 Business Statistics
No ratings yet
7 0 0 4 MBA (Sem I) Theory Examination 2017-18 Business Statistics
3 pages
Profound Python Data Science
From Everand
Profound Python Data Science
Onder Teker
No ratings yet
MH 1000 MT Magnetic Hyperthermia Test System
No ratings yet
MH 1000 MT Magnetic Hyperthermia Test System
3 pages
Micrometer
No ratings yet
Micrometer
3 pages
Accounts Receivable Management and Finan Quoted Firms Nigeria
No ratings yet
Accounts Receivable Management and Finan Quoted Firms Nigeria
5 pages
.300 Win. Magnum Ballistics Calcs (QuickTarget Unlimited Lapua Edition)
No ratings yet
.300 Win. Magnum Ballistics Calcs (QuickTarget Unlimited Lapua Edition)
4 pages
Product Datasheet: Circuit Breaker Compact Ns800N, 50 Ka at 415 Vac, Micrologic 2.0 A Trip Unit, 800 A, Fixed, 3 Poles 3D
No ratings yet
Product Datasheet: Circuit Breaker Compact Ns800N, 50 Ka at 415 Vac, Micrologic 2.0 A Trip Unit, 800 A, Fixed, 3 Poles 3D
3 pages
Load Tables
No ratings yet
Load Tables
3 pages
Maddox2018 PDF
No ratings yet
Maddox2018 PDF
2 pages
Assignment 04
No ratings yet
Assignment 04
2 pages