Joining Points on Scatter plot using Smooth Lines in R
Last Updated: 24 Apr, 2025
A smooth line, also known as a smoothed line, is a line that is drawn through a set of data points in such a way that it represents the overall trend of the data while minimizing the effects of random fluctuations or noise. In other words, it is a way to represent a general pattern or trend in a dataset while reducing the impact of individual data points that deviate from that pattern.
There are several methods that can be used to create a smooth line, such as linear regression, loess, and splines. Each method has its own pros and cons, and the choice of method will depend on the specific characteristics of the data and the goals of the analysis.
When we plot a smooth line on a scatter plot, it helps us identify the underlying pattern in the data and make predictions about future values based on that pattern. Points that lie far from the line stand out as potential outliers, and the scatter of the points around the line gives a general idea of the spread of the data.
It is also useful for exploring relationships between two or more variables, especially when the data points are dense or overlapping.
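As a rough illustration of how these methods differ, the sketch below fits a straight line, a loess curve, and a smoothing spline to the same data using only the stats package that ships with R. The simulated data (a noisy sine curve), the seed, and all variable names are assumptions chosen for this example.

```r
# Simulated data (an assumption for illustration): a sine curve plus noise
set.seed(1)
x <- seq(0, 2 * pi, length.out = 100)
truth <- sin(x)
y <- truth + rnorm(100, sd = 0.3)

# Three smoothers from the stats package
fit_lm     <- lm(y ~ x)                 # straight line
fit_loess  <- loess(y ~ x, span = 0.5)  # local regression
fit_spline <- smooth.spline(x, y)       # smoothing spline

# Root mean squared error of a fitted curve against the noise-free signal
rmse <- function(fitted_values) sqrt(mean((fitted_values - truth)^2))

errors <- c(
  lm     = rmse(predict(fit_lm)),
  loess  = rmse(predict(fit_loess)),
  spline = rmse(predict(fit_spline, x)$y)
)
print(round(errors, 3))
```

On curved data like this, the straight line should show the largest error, which is why a flexible smoother is often the better choice when the trend is not linear.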
How to Install the ggplot2 Library?
You can install the ggplot2 library by running the following command in the R console:
install.packages("ggplot2")
Then, you can load the library by running the following command:
library(ggplot2)
Make sure that you are connected to the internet while installing the package. Once the package is installed and loaded successfully, you can proceed with the code below, and library(ggplot2) should no longer fail with the error "there is no package called 'ggplot2'".
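If you run the same script on several machines, it can be convenient to install a package only when it is missing. The helper below is a minimal sketch of that idea; ensure_package is a hypothetical name introduced here, not a function provided by R or ggplot2.

```r
# Hypothetical helper (not part of R or ggplot2): install a package only if it is missing
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # TRUE when the package is available after the check
  requireNamespace(pkg, quietly = TRUE)
}

# "stats" ships with R, so this returns TRUE without installing anything
ensure_package("stats")
```

Calling ensure_package("ggplot2") before library(ggplot2) would follow the same pattern.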
The geom_smooth() function is used to plot a smooth line with ggplot2 in the R programming language. This function is a geom, which is a kind of plotting layer in ggplot2, and it is added to a plot with the + operator.
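To make the idea of layers concrete, the sketch below builds a plot one layer at a time and counts the layers after each step. The tiny data frame and the object names are assumptions made up for this illustration.

```r
library(ggplot2)

# Small illustrative data set (an assumption, not taken from the article)
df <- data.frame(x = c(1, 2, 3, 4, 5), y = c(2.1, 3.9, 6.2, 7.8, 10.1))

# Start with an empty plot, then add one layer at a time with `+`
p0 <- ggplot(df, aes(x, y))                                         # no layers yet
p1 <- p0 + geom_point()                                             # points layer
p2 <- p1 + geom_smooth(method = "lm", formula = y ~ x, se = FALSE)  # smooth layer

# Each `+` appends one layer to the plot object
layer_counts <- sapply(list(p0, p1, p2), function(p) length(p$layers))
print(layer_counts)
```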
Syntax
geom_smooth(mapping = NULL, data = NULL, stat = "smooth", position = "identity",
..., method = NULL, formula = NULL, se = TRUE, na.rm = FALSE,
orientation = NA, show.legend = NA, inherit.aes = TRUE)
Parameters
- mapping: Aesthetic mapping, usually constructed with aes().
- data: Data frame containing the data to be plotted.
- stat: The statistical transformation to use on the data for this layer.
- position: The position adjustment to use for this layer.
- ... : Other arguments passed on to the layer. These include fixed aesthetics such as color, linetype and linewidth, and the stat_smooth() arguments n, fullrange, level, span and method.args described below.
- method: The smoothing method to use, given as a name such as "lm" or "loess" or as a function. The default, NULL (written "auto" in older ggplot2 versions), uses "loess" when there are fewer than 1,000 observations and "gam" otherwise.
- formula: A formula used to specify the relationship between x and y. The default, NULL, resolves to y ~ x for most methods and to y ~ s(x, bs = "cs") for "gam".
- se: Whether to draw a confidence band around the smooth line.
- na.rm: If FALSE (the default), missing values are removed with a warning. If TRUE, they are removed silently.
- orientation: The orientation of the layer. The default, NA, lets ggplot2 determine it from the aesthetic mapping.
- n: Number of points at which the smoother is evaluated (default 80).
- fullrange: If TRUE, the smooth line spans the full range of the plot rather than only the range of the data.
- level: Confidence level used for the band (default 0.95).
- span: The amount of smoothing for the "loess" method (default 0.75). Smaller values give wigglier lines and larger values give smoother lines.
- method.args: A list of additional arguments passed to the function named in method.
- show.legend: Whether to show a legend for this layer.
- inherit.aes: If TRUE, the aesthetic mappings of the layer are inherited from the plot defaults.
You can set these arguments to customize the appearance and behavior of the smooth line. The most commonly used are method, se and span, which control the smoothing method, whether the confidence band is drawn, and how closely a loess curve follows the data.
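The effect of span and se is easiest to see by comparing two versions of the same plot. The sketch below is an illustration under assumed data (a noisy sine curve on a fine grid) and uses layer_data() from ggplot2 to read back the values computed for the smooth layer.

```r
library(ggplot2)

# Illustrative data (an assumption): a noisy sine curve on a fine, non-integer grid
set.seed(42)
df <- data.frame(x = seq(0, 10, length.out = 200))
df$y <- sin(df$x) + rnorm(200, sd = 0.2)

base_plot <- ggplot(df, aes(x, y)) + geom_point(alpha = 0.4)

# Small span: the loess curve follows local wiggles; no confidence band
p_wiggly <- base_plot +
  geom_smooth(method = "loess", formula = y ~ x, span = 0.2, se = FALSE)

# Large span: a much flatter curve, drawn with a 90% confidence band
p_flat <- base_plot +
  geom_smooth(method = "loess", formula = y ~ x, span = 0.9,
              se = TRUE, level = 0.90)

# layer_data() returns the values computed for a layer (layer 2 is the smooth)
smooth_wiggly <- layer_data(p_wiggly, 2)
smooth_flat   <- layer_data(p_flat, 2)

# The smoother is evaluated at n points (80 unless n is changed)
nrow(smooth_wiggly)
```

Printing p_wiggly and p_flat side by side shows the trade-off: a small span tracks the data closely, while a large span gives a smoother but less detailed line.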
Creating a Simple Smooth Line
R
library(ggplot2)
# Create some example data
x <- 1:100
y <- sin(x)
df <- data.frame(x, y)
# Create the plot
ggplot(df, aes(x, y)) +
  # Add points to the plot
  geom_point() +
  # Add a dashed red loess line without the confidence band
  # (linewidth replaces size for lines in ggplot2 3.4.0 and later)
  geom_smooth(method = "loess", se = FALSE,
              linewidth = 1.2, color = "red",
              linetype = "dashed") +
  ggtitle("Smooth Line Plot") +
  xlab("X-axis") +
  ylab("Y-axis")
Output:
This code creates a plot of a sine wave with a smooth line using the "loess" method, a dashed red line with a width of 1.2, and without showing the standard error. It also adds axis labels and titles to the plot.
You can also use your own data in place of the example data, and you can adjust the line type, color, and other properties to customize the plot as per your requirements.
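As a sketch of what that looks like, the example below swaps in the mtcars data set that ships with R as a stand-in for your own data. The choice of columns (wt and mpg), the labels, and the styling are assumptions for illustration, and linewidth assumes ggplot2 3.4.0 or later.

```r
library(ggplot2)

# mtcars ships with R; it stands in here for "your own data" (an assumption)
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  # Blue long-dashed loess line without the confidence band
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE,
              linewidth = 1, color = "blue", linetype = "longdash") +
  ggtitle("Fuel Economy vs. Weight") +
  xlab("Weight (1000 lbs)") +
  ylab("Miles per gallon")

print(p)
```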
Different Methods of plotting
The geom_smooth() function in ggplot2 provides several methods for plotting a smooth line through a set of data points. These methods include:
- "loess": locally weighted regression. It is a non-parametric method that fits a low-degree polynomial to the points near each x value and gives nearby points more weight than distant ones. It is useful when the relationship is non-linear and the dataset is not too large.
- "lm": linear regression. It fits a linear model to the data and is useful when the relationship between x and y is roughly linear.
- "glm": generalized linear model. It is an extension of linear regression that allows the response variable to follow a non-normal distribution and relates the response to the predictor through a link function. The family is supplied through method.args.
- "gam": generalized additive model, fitted with the mgcv package. It is a flexible framework for fitting non-linear relationships between predictor and response variables.
- "rlm": robust linear model, from the MASS package (load MASS first or pass MASS::rlm). It is a linear model fitted in a way that is resistant to outliers.
- "auto": automatically selects "loess" when there are fewer than 1,000 observations and "gam" for larger datasets.
- "rq": quantile regression, from the quantreg package. It is an extension of linear models that estimates quantiles of the conditional distribution of the response variable. ggplot2 also provides geom_quantile() for this purpose.
You can specify the method to use by providing the appropriate argument to the geom_smooth() function. For example, to use the "loess" method:
R
library(ggplot2)
# Create some example data
x <- 1:100
y <- sin(x)
df <- data.frame(x, y)
# Create the plot
ggplot(df, aes(x, y)) +
  # Add points to the plot
  geom_point() +
  # Smooth the points with local regression (loess)
  geom_smooth(method = "loess", se = FALSE,
              linewidth = 1.2, color = "red",
              linetype = "dashed") +
  ggtitle("Smooth Line Plot") +
  xlab("X-axis") +
  ylab("Y-axis")
and to use the "lm" method:
R
library(ggplot2)
# Create some example data
x <- 1:100
y <- sin(x)
df <- data.frame(x, y)
# Create the plot
ggplot(df, aes(x, y)) +
  # Add points to the plot
  geom_point() +
  # Fit a straight line with linear regression (lm)
  geom_smooth(method = "lm", se = FALSE,
              linewidth = 1.2, color = "red",
              linetype = "dashed") +
  ggtitle("Smooth Line Plot") +
  xlab("X-axis") +
  ylab("Y-axis")
Output differences between the two methods:
Comparison between the lines drawn by using "loess" and "lm" method
You should choose the method that best represents the underlying pattern of your data and that is consistent with the goals of your analysis. Similarly, you can try all the available methods mentioned above.
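As one example of trying another method, the sketch below uses "glm" with a binomial family, which turns the smooth into a logistic regression curve. The built-in mtcars data and the choice of am (transmission type) as a binary response are assumptions made for this illustration.

```r
library(ggplot2)

# am (0 = automatic, 1 = manual) is a binary column in the built-in mtcars data
p_glm <- ggplot(mtcars, aes(x = wt, y = am)) +
  geom_point() +
  # family = binomial is passed to glm() through method.args
  geom_smooth(method = "glm", formula = y ~ x,
              method.args = list(family = binomial), se = FALSE)

# The fitted curve is a probability, so its values stay between 0 and 1
glm_curve <- layer_data(p_glm, 2)
range(glm_curve$y)
```

The same method.args mechanism can carry options for the other methods, since the list is handed to whichever fitting function method names.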
Different types of Line available
The geom_smooth() function in ggplot2 allows you to change the line type of the smooth line with the linetype argument. The possible values for linetype include:
- "solid" (default): a solid line.
- "dashed": a line composed of dashes.
- "dotted": a line composed of dots.
- "dotdash": a line composed of alternating dots and dashes.
- "longdash": a line composed of long dashes.
- "twodash": a line composed of alternating short and long dashes.
To change the line type, set linetype to one of these values in the geom_smooth() call.
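The line types listed above can be previewed side by side before you pick one. The sketch below draws one horizontal line per type; the data frame, the object names, and the use of geom_hline() with scale_linetype_identity() are choices made for this illustration, and linewidth assumes ggplot2 3.4.0 or later.

```r
library(ggplot2)

# The six named line types from the list above
line_types <- c("solid", "dashed", "dotted", "dotdash", "longdash", "twodash")

# One row per line type; the type column holds the linetype names as text
df_types <- data.frame(y = seq_along(line_types), type = line_types,
                       stringsAsFactors = FALSE)

p_types <- ggplot(df_types) +
  # Draw one horizontal line per row, using the name in `type` as the linetype
  geom_hline(aes(yintercept = y, linetype = type), linewidth = 1) +
  scale_linetype_identity() +
  # Label each line with the name of its line type
  scale_y_continuous(breaks = df_types$y, labels = df_types$type) +
  ylab("linetype")

print(p_types)
```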
For example, the code below redraws the earlier plot with a "solid" line in place of the "dashed" one:
R
library(ggplot2)
# Create some example data
x <- 1:100
y <- sin(x)
df <- data.frame(x, y)
# Create the plot
ggplot(df, aes(x, y)) +
  # Add points to the plot
  geom_point() +
  # Same loess line as before, now drawn solid
  geom_smooth(method = "loess", se = FALSE,
              linewidth = 1.2, color = "red",
              linetype = "solid") +
  ggtitle("Smooth Line Plot") +
  xlab("X-axis") +
  ylab("Y-axis")
Output Differences in both Line Types:
Comparison between the lines drawn by using "dashed" and "solid" linetype
Example 1:
R
library(ggplot2)
# Fix the random seed so the example is reproducible
set.seed(1)
# Create some example data: y follows x with random noise
x <- rnorm(100)
y <- x + rnorm(100)
df <- data.frame(x, y)
# Create the plot using geom_smooth
ggplot(df, aes(x, y)) +
  # Add points to the plot
  geom_point() +
  # Add a dashed red loess line without the confidence band
  geom_smooth(method = "loess", se = FALSE,
              linewidth = 1.2, color = "red",
              linetype = "dashed") +
  ggtitle("Smooth Line Plot") +
  xlab("X-axis") +
  ylab("Y-axis")
Output:
This code creates a scatter plot of the data and adds a smooth line to the plot, using the "loess" method, a red color and width of 1.2, and with a dashed line type. It also adds axis labels and titles to the plot.
Example 2:
R
library(ggplot2)
# Fix the random seed so the example is reproducible
set.seed(1)
# Create some example data: y follows x with random noise
x <- rnorm(100)
y <- x + rnorm(100)
df <- data.frame(x, y)
# Create the plot using geom_smooth
ggplot(df, aes(x, y)) +
  geom_point() + # Add points to the plot
  # Add a dotted purple gam line without the confidence band
  geom_smooth(method = "gam", se = FALSE,
              linewidth = 1.2, color = "purple",
              linetype = "dotted") +
  ggtitle("Smooth Line Plot") +
  xlab("X-axis") +
  ylab("Y-axis")
Output:
This code creates a scatter plot of the data and adds a smooth line to the plot, using the "gam" method, a purple color, a width of 1.2, and a dotted line type. It also adds axis labels and titles to the plot.
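When the "gam" method is used, the smooth term can also be written out explicitly and the confidence band kept. The sketch below is one way to do that, assuming the mgcv package (normally installed alongside R) is available; the seed and object names are assumptions for this illustration.

```r
library(ggplot2)

# Reproducible example data: y follows x with random noise
set.seed(123)
x <- rnorm(100)
y <- x + rnorm(100)
df <- data.frame(x, y)

p_gam <- ggplot(df, aes(x, y)) +
  geom_point() +
  # Spell out the spline formula and keep a 95% confidence band
  geom_smooth(method = "gam", formula = y ~ s(x, bs = "cs"),
              se = TRUE, level = 0.95, color = "purple")

# Inspect the fitted curve and the limits of its confidence band
gam_fit <- layer_data(p_gam, 2)
head(gam_fit[, c("x", "y", "ymin", "ymax")])
```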