Interactive Data Visualization with Plotly Express in R
Last Updated : 04 Sep, 2024
Data Visualization in R is the process of representing data so that it is easy to understand and interpret. Various packages are present in the R Programming Language for data visualization.
Plotly’s R graphing library makes interactive, publication-quality graphs. Plotly can be used to make various interactive graphs such as scatter plots, line plots, bar charts, histograms, heatmaps, and many more. It is built on the Plotly.js JavaScript library, which renders the interactive graphics.
Plotly supports a wide range of features including animation, legends, and tooltips.
Installation
To create interactive data visualizations, first install R and RStudio on your machine. Then install Plotly by running the command below in RStudio:
install.packages("plotly")
Now we can load the Plotly package in R using the code below:
library(plotly)
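Calling install.packages() every time a script runs downloads the package again. A minimal sketch of a guard that installs Plotly only when it is missing, assuming a CRAN mirror is already configured for the session:

```r
# Install plotly only if it is not already available (assumes a CRAN
# mirror is configured), then load it.
if (!requireNamespace("plotly", quietly = TRUE)) {
  install.packages("plotly")
}
library(plotly)

# Show which version of the package is loaded
packageVersion("plotly")
```

The same pattern can be used for the other packages installed in this article, such as dplyr and gapminder.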
Creating a basic scatter plot using the iris data set and plot_ly function:
R
# Install the package (run only once)
install.packages("plotly")
# Load the library
library(plotly)
# Create the plotly object
variable <- plot_ly(data = iris, x = ~Petal.Length, y = ~Petal.Width)
# Display the plot
variable
Output:
- We first installed the plotly package
- Then we loaded it using the library function
- Then we plotted a scatter plot using plot_ly function
- In plot_ly function we specified the iris dataset and the x and y axis
- Then we printed the plot
Scatter Plot
A scatter plot shows the variation of one variable with respect to another. We plot one variable on the x-axis and another on the y-axis, and each observation is shown as a dot. We can change the size and color of the dots to encode additional variables.
For plotting the scatter plot we are going to use the mtcars dataset. We are going to map the mpg column to the x-axis and the disp column to the y-axis.
R
install.packages("plotly") # Install plotly if not already installed
install.packages("dplyr") # Install dplyr if not already installed
library(plotly)
library(dplyr)
graph <- mtcars %>%
  plot_ly(x = ~mpg, y = ~disp, type = "scatter", mode = "markers", color = ~cyl) %>%
  layout(
    title = "Miles per gallon vs Displacement",
    xaxis = list(
      title = "Miles per gallon",
      range = c(0, 50)
    ),
    yaxis = list(
      title = "Displacement",
      range = c(0, 500)
    )
  )
# Display the plot
graph
Output:
- First we used the pipe operator to pass the mtcars dataset to the plot_ly function
- Then we defined the x and y axes.
- We set type to "scatter" and mode to "markers" explicitly. If the type is left out, plot_ly picks a trace type based on the data supplied.
The points are colored based on the cyl attribute of the mtcars dataset.
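Because cyl is numeric, mapping it to color gives a continuous color scale. The sketch below is one possible variation, not part of the original example: it wraps cyl in factor() so that each cylinder count gets its own color and legend entry, and it sets a fixed marker size and opacity. The palette name "Set1" and the marker values are illustrative choices.

```r
library(plotly)

# Treat cyl as a categorical variable so each cylinder count becomes a
# separate trace with its own color and legend entry.
fig <- plot_ly(
  data = mtcars,
  x = ~mpg,
  y = ~disp,
  type = "scatter",
  mode = "markers",
  color = ~factor(cyl),
  colors = "Set1",                          # ColorBrewer palette (illustrative)
  marker = list(size = 10, opacity = 0.7)   # fixed dot size and transparency
) %>%
  layout(
    title = "Miles per gallon vs Displacement",
    xaxis = list(title = "Miles per gallon"),
    yaxis = list(title = "Displacement"),
    legend = list(title = list(text = "Cylinders"))
  )

# Display the plot
fig
```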
R
install.packages("gapminder")
library(gapminder)
library(plotly)
library(dplyr)
animatedscatter <- gapminder %>%
  plot_ly(x = ~log(gdpPercap), y = ~lifeExp, frame = ~year, color = ~continent, type = "scatter", mode = "markers") %>%
  layout(
    title = list(
      text = "GDP per Capita vs Life Expectancy",
      font = list(color = "black"),
      pad = list(t = 100)
    ),
    paper_bgcolor = 'rgb(128,128,128)',
    plot_bgcolor = 'rgb(128,128,128)',
    xaxis = list(
      title = "log(GdpPerCapita)",
      color = "black",
      linecolor = "black"
    ),
    yaxis = list(
      title = "LifeExp",
      color = "black",
      linecolor = "black"
    )
  )
# Display the plot
animatedscatter
Output:
In the above code we used the gapminder dataset to draw an animated scatter plot. It shows life expectancy against GDP per capita for each year in the dataset.
- First we installed the gapminder package
- Then we loaded it into our session
- We used the plot_ly function to specify the x and y axes and the frame. The frame argument specifies that we want a separate scatter plot for each year.
- The color of each dot identifies the continent, and the dots move as the animation steps through the years.
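The speed of an animation and the label on its slider can be adjusted with plotly's animation_opts and animation_slider functions. The sketch below uses a small made-up data frame (toy, with hypothetical columns id, x and y) instead of gapminder so that it runs without extra packages; the timing values are arbitrary.

```r
library(plotly)

# Hypothetical toy data: three points tracked over four years
toy <- data.frame(
  year = rep(2001:2004, each = 3),
  id   = rep(c("a", "b", "c"), times = 4),
  x    = c(1, 2, 3, 1.5, 2.5, 3.5, 2, 3, 4, 2.5, 3.5, 4.5),
  y    = c(3, 2, 1, 3.5, 2.5, 1.5, 4, 3, 2, 4.5, 3.5, 2.5)
)

animated <- toy %>%
  plot_ly(x = ~x, y = ~y, frame = ~year, color = ~id,
          type = "scatter", mode = "markers") %>%
  # Show each frame for 1000 ms, with a 500 ms transition between frames
  animation_opts(frame = 1000, transition = 500, easing = "linear") %>%
  # Prefix the slider's current value so it reads "Year: 2001", etc.
  animation_slider(currentvalue = list(prefix = "Year: "))

# Display the plot
animated
```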
Line Plot
A line plot is similar to a scatter plot, but consecutive points are connected to form a line. We can draw multiple lines with different colors to show how several y variables change against the same x variable.
For drawing the line plot we are going to use the economics dataset (from the ggplot2 package, which is attached when plotly is loaded). We plot the date on the x-axis and then see how the number of unemployed (the unemploy column) changes over time.
R
install.packages("plotly") # Install plotly if not already installed
install.packages("dplyr") # Install dplyr if not already installed
library(plotly)
library(dplyr)
graph <- economics %>%
  plot_ly(x = ~date) %>%
  add_trace(y = ~unemploy / 400, type = "scatter", mode = "lines")
# Display the plot
graph
Output:
- First we loaded the plotly library
- Then we passed the economics data to the plot_ly function
- We also mapped the x-axis to the date attribute
- Then we used the add_trace function to specify the y-axis, the type of plot and the mode
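plotly also provides add_lines, a shorthand for a scatter trace drawn in line mode. A minimal sketch of the same kind of chart, plotting unemploy unscaled; the axis titles are illustrative, and the dataset is referenced explicitly as ggplot2::economics so its origin is clear:

```r
library(plotly)

# economics ships with ggplot2; unemploy is the number of unemployed
# in thousands.
econ <- ggplot2::economics

fig <- plot_ly(econ, x = ~date, y = ~unemploy) %>%
  add_lines(name = "Unemployed") %>%
  layout(
    xaxis = list(title = "Date"),
    yaxis = list(title = "Unemployed (thousands)")
  )

# Display the plot
fig
```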
Multiline Plot:
We can also add multiple lines to the same plot using the add_trace function. The line plot created will be of different color for each different y attribute we specify in the add_trace function.
R
install.packages("plotly") # Install plotly if not already installed
install.packages("dplyr") # Install dplyr if not already installed
library(plotly)
library(dplyr)
graph <- economics %>%
  plot_ly(x = ~date) %>%
  # unemploy is the number of unemployed (in thousands); dividing by 400
  # brings it onto a scale comparable with uempmed
  add_trace(y = ~unemploy / 400, type = "scatter", mode = "lines", name = "Unemployed (thousands / 400)") %>%
  # uempmed is the median duration of unemployment, in weeks
  add_trace(y = ~uempmed, type = "scatter", mode = "lines", name = "Median Unemployment Duration (weeks)") %>%
  layout(
    title = list(
      text = "Date vs Unemployed and Median Unemployment Duration",
      font = list(color = "white"),
      pad = list(t = 100)
    ),
    margin = list(t = 50),
    paper_bgcolor = 'rgb(0,0,0)',
    plot_bgcolor = 'rgb(0,0,0)',
    legend = list(
      bgcolor = "white",
      font = list(
        family = "sans-serif",
        color = "red"
      )
    ),
    xaxis = list(
      title = "Date",
      rangeslider = list(type = "date"),
      color = "white",
      linecolor = "white",
      tickangle = -45
    ),
    yaxis = list(
      # standoff sets the gap between the axis title and the tick labels
      title = list(
        text = "Unemployed and Median Unemployment Duration",
        standoff = 10
      ),
      color = "white",
      linecolor = "white",
      tickangle = -45
    )
  )
# Display the plot
graph
Output:
- Here we passed the economics data to the plot_ly function and mapped the x-axis to the date attribute
- Then we used the add_trace function to specify the y attribute, the type and mode of the plot and the name to give to this specified plot
- Then we added another add_trace function to add another line in the graph. The type and mode property is same as the first add_trace function but this line will have a different name.
- Then we added the labels to the graph.
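The two series are measured in different units: unemploy counts people in thousands, while uempmed is a median duration in weeks. Rather than rescaling one series, a second y-axis can be overlaid on the first. The following is a sketch of that alternative, not the approach used above; the axis titles are illustrative.

```r
library(plotly)

econ <- ggplot2::economics

fig <- plot_ly(econ, x = ~date) %>%
  add_lines(y = ~unemploy, name = "Unemployed (thousands)") %>%
  # Attach the second series to a separate axis named "y2"
  add_lines(y = ~uempmed, name = "Median duration (weeks)", yaxis = "y2") %>%
  layout(
    xaxis = list(title = "Date"),
    yaxis = list(title = "Unemployed (thousands)"),
    # Draw y2 on top of the first y-axis, with its ticks on the right
    yaxis2 = list(
      title = "Median duration (weeks)",
      overlaying = "y",
      side = "right"
    )
  )

# Display the plot
fig
```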
Box Plot:
A box plot is used to see the distribution of data across a variety of classes. A box plot displays five pieces of information about a class: the minimum, first quartile, median, third quartile and maximum. The box is drawn between the first and third quartiles of the data.
R
install.packages("plotly") # Install plotly if not already installed
install.packages("dplyr") # Install dplyr if not already installed
library(plotly)
library(dplyr)
boxplot <- mtcars %>%
  plot_ly(x = ~factor(cyl), y = ~mpg) %>%
  add_trace(type = "scatter", mode = "markers", name = "Scatter") %>%
  add_boxplot(name = "Boxplot") %>%
  layout(
    title = "Fuel Efficiency",
    xaxis = list(title = "Number of Cylinders"),
    yaxis = list(title = "Miles Per Gallon")
  )
# Display the plot
boxplot
Output:
- First we passed the mtcars dataset to the plot_ly function
- In the plot_ly function we mapped the x-axis to the cyl attribute. Here factor is used so that a separate box and set of dots is drawn for each number of cylinders. Then we mapped the y-axis to the mpg attribute.
- We then added the individual points as a scatter trace using the add_trace function.
- add_boxplot function is used to draw the boxplot.
- layout is used to label the graph.
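The five values that a box plot summarizes can be computed directly in base R, which is a useful check on what the chart shows. A minimal sketch for mpg grouped by cyl follows; note that plotting libraries may use a slightly different quartile method or draw whiskers to a limit other than the extremes, so the drawn boxes may not match these numbers exactly.

```r
# Minimum, first quartile, median, third quartile and maximum of mpg
# for each cylinder count (columns are the cyl groups 4, 6 and 8)
summary_by_cyl <- sapply(
  split(mtcars$mpg, mtcars$cyl),
  quantile,
  probs = c(0, 0.25, 0.5, 0.75, 1)
)

summary_by_cyl
```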
Now we can plot multiple box plots side by side by setting the boxmode layout attribute to "group".
R
install.packages("plotly") # Install plotly if not already installed
library(plotly)
fig <- plot_ly(diamonds, x = ~cut, y = ~price, color = ~clarity, type = "box") %>%
  layout(boxmode = "group", title = "CUT vs PRICE")
# Display the plot
fig
Output:
Here we drew box plots using the diamonds dataset (from ggplot2). The diamonds dataset contains the price and other attributes of almost 54,000 diamonds. We drew a box plot of price for each cut of diamond.
- We passed the diamonds dataset to the plot_ly function
- We then mapped the x-axis to cut and the y-axis to price
- The color attribute creates a separate box plot for each clarity level
- Then we specified the type of plot to be box
- The boxmode attribute is set to group, which places the box plots for the different clarity levels side by side within each cut.
3d Scatter Plot
In a 3D plot we map the x, y and z axes to three different attributes of the dataset. We are going to use the iris dataset, mapping Sepal.Length to the x-axis, Petal.Length to the y-axis and Sepal.Width to the z-axis. The code below sets the type to scatter3d explicitly; if the type is omitted, plot_ly infers a 3D scatter plot from the x, y and z values supplied.
R
install.packages("plotly") # Install plotly if not already installed
library(plotly)
plot <- plot_ly(
  data = iris,
  x = ~Sepal.Length,
  y = ~Petal.Length,
  z = ~Sepal.Width,
  color = ~Species,
  type = "scatter3d",
  mode = "markers"
)
# Display the plot
plot
Output:
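In a 3D chart the axes belong to a scene object, so axis titles are set through layout(scene = ...) rather than through xaxis and yaxis directly. A minimal sketch that adds titles to the same iris plot; the title strings and the marker size are illustrative choices.

```r
library(plotly)

fig3d <- plot_ly(
  data = iris,
  x = ~Sepal.Length,
  y = ~Petal.Length,
  z = ~Sepal.Width,
  color = ~Species,
  type = "scatter3d",
  mode = "markers",
  marker = list(size = 4)   # smaller dots reduce overlap in 3D
) %>%
  layout(
    title = "Iris measurements in 3D",
    # 3D axes are configured inside the scene
    scene = list(
      xaxis = list(title = "Sepal length"),
      yaxis = list(title = "Petal length"),
      zaxis = list(title = "Sepal width")
    )
  )

# Display the plot
fig3d
```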
Heatmap:
A heatmap is a two-dimensional graphical representation of data where the individual values that are contained in a matrix are represented as colors.
R
install.packages("plotly") # Install plotly if not already installed
library(plotly)
# Load the iris dataset
data(iris)
# Calculate the correlation matrix
cor_matrix <- cor(iris[, 1:4])
# Create a heatmap using Plotly
heatmap <- plot_ly(
  x = colnames(cor_matrix),
  y = colnames(cor_matrix),
  z = cor_matrix,
  type = "heatmap",
  colorscale = "Viridis"
) %>%
  layout(title = "Correlation Heatmap of Iris Dataset")
# Display the heatmap
heatmap
Output:
- First we calculate the correlation matrix of the numerical attributes (columns 1 to 4) using the cor function.
- Then we create a heatmap using the plot_ly function. We specify the x and y axes as column names, the z values as the correlation matrix, the type as “heatmap,” and the colorscale as “Viridis” (you can choose other color scales as well).
- We customize the layout of the heatmap by setting the title using the layout function.
- We display the heatmap by typing the name of the object, which prints it.
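Correlations always lie between -1 and 1, so it can help to fix the color range to those limits and use a diverging color scale, which makes positive and negative correlations easy to tell apart. The sketch below is one way to do this using the zmin and zmax trace attributes; the "RdBu" color scale is an illustrative choice.

```r
library(plotly)

# Correlation matrix of the four numeric iris columns
cor_matrix <- cor(iris[, 1:4])

fig <- plot_ly(
  x = colnames(cor_matrix),
  y = rownames(cor_matrix),
  z = cor_matrix,
  type = "heatmap",
  zmin = -1,             # pin the color range to the full correlation range
  zmax = 1,
  colorscale = "RdBu"    # diverging scale (illustrative choice)
) %>%
  layout(title = "Correlation Heatmap of Iris Dataset")

# Display the heatmap
fig
```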