Drawing Only Boundaries of stat_smooth in ggplot2 using R
Last Updated :
23 Sep, 2024
When creating plots with ggplot2, you often use stat_smooth() to add a smooth curve that visualizes trends in your data. By default, stat_smooth() draws both the smoothed line and a shaded confidence interval. In some cases, however, you may want to show only the boundaries of the confidence interval, without filling the area between them. This article covers how to draw only those boundaries (without shading) using stat_smooth() in ggplot2.
Overview of stat_smooth()
The stat_smooth() function in ggplot2 adds a smoothed conditional mean to a plot. It typically draws:
- A line representing the estimated trend.
- A shaded region representing the confidence interval around the smooth line.
The goal here is to display only the boundaries of the confidence interval and remove the shaded region.
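To see what stat_smooth() actually computes, it can help to inspect the layer's data. The sketch below uses ggplot2's layer_data() helper on the mtcars example; the object names p and smooth_data are illustrative, and the level = 0.95 argument simply spells out the default confidence level.

```r
library(ggplot2)

# Build the default plot; level = 0.95 is the default, written out for clarity
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ x, se = TRUE, level = 0.95)

# Layer 1 is geom_point(), layer 2 is stat_smooth()
smooth_data <- layer_data(p, 2)

# y is the fitted value; ymin and ymax are the confidence bounds
head(smooth_data[, c("x", "y", "ymin", "ymax", "se")])
```

The ymin and ymax columns are the two boundaries that the rest of this article draws without the fill between them.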
Method 1: Default Behavior of stat_smooth()
Let’s first see how stat_smooth() behaves by default, with both the trend line and the shaded confidence interval. We will use the mtcars dataset as an example.
R
# Load the required libraries
library(ggplot2)
# Create a basic scatter plot with stat_smooth()
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  stat_smooth(method = "lm", se = TRUE) +
  labs(title = "Default stat_smooth() with Confidence Interval")
Output:
Default Behavior of stat_smooth()
- geom_point(): Creates a scatter plot.
- stat_smooth(): Adds a smoothed line (method = "lm" specifies linear regression) and the shaded confidence interval (since se = TRUE).
By default, the plot shows a regression line with a shaded confidence interval.
Method 2: Drawing Only the Boundaries of the Confidence Interval
To show only the boundaries of the confidence interval without filling the region, we can use the following steps:
- Turn off the shading by setting se = FALSE in stat_smooth().
- Manually add the confidence interval boundaries using geom_ribbon() or geom_line().
Step 1: Remove the Shading with se = FALSE
Setting se = FALSE in stat_smooth() removes the confidence interval, and with it the shaded region:
R
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  stat_smooth(method = "lm", se = FALSE) +
  labs(title = "Smoothed Line without Confidence Interval Shading")
Output:
This removes the confidence interval entirely, but the goal is to add just the boundaries back.
Step 2: Add Boundaries Using geom_line() and predict()
To manually add the boundaries of the confidence interval, you can use the predict() function to calculate the upper and lower bounds and then plot them using geom_line().
R
# Fit a linear model
fit <- lm(mpg ~ wt, data = mtcars)
# Create a data frame with fitted values and confidence intervals.
# predict() returns a matrix with columns fit, lwr and upr; passing it as `mpg`
# makes data.frame() name the columns mpg.fit, mpg.lwr and mpg.upr.
pred_data <- data.frame(
  wt = mtcars$wt,
  mpg = predict(fit, newdata = mtcars, interval = "confidence")
)
# Create the plot (linewidth replaces size for lines in ggplot2 >= 3.4.0)
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_line(aes(y = mpg.fit), data = pred_data, color = "blue", linewidth = 1) + # Smoothed line
  geom_line(aes(y = mpg.lwr), data = pred_data, linetype = "dashed", color = "red") + # Lower boundary
  geom_line(aes(y = mpg.upr), data = pred_data, linetype = "dashed", color = "red") + # Upper boundary
  labs(title = "Linear Regression with Confidence Interval Boundaries",
       x = "Weight",
       y = "Miles per Gallon") +
  theme_minimal()
Output:
- predict() is used to calculate the fitted values and the confidence intervals.
- geom_line() is used to add both the smoothed line and the upper/lower boundaries of the confidence interval as dashed lines.
- The smoothed line is blue, and the boundaries are represented by dashed red lines.
This plot shows the smoothed regression line with the upper and lower confidence interval boundaries, but without shading between the two boundaries.
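The steps above mention geom_ribbon() as an alternative to geom_line(). A minimal sketch of that variant follows, assuming the same linear model; the names grid, band and p_ribbon, and the choice of 100 grid points, are illustrative. Predicting on an evenly spaced grid rather than on the observed weights keeps the boundaries smooth even where the data are sparse.

```r
library(ggplot2)

fit <- lm(mpg ~ wt, data = mtcars)

# Evenly spaced grid over the observed weight range (100 points is arbitrary)
grid <- data.frame(wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 100))

# predict() returns a matrix with columns fit, lwr and upr
ci <- predict(fit, newdata = grid, interval = "confidence", level = 0.95)
band <- cbind(grid, as.data.frame(ci))

p_ribbon <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_line(aes(y = fit), data = band, color = "blue") +
  # Unfilled ribbon: fill = NA hides the shading, color draws the outline.
  # inherit.aes = FALSE avoids looking up mpg, which `band` does not contain.
  geom_ribbon(aes(x = wt, ymin = lwr, ymax = upr), data = band,
              inherit.aes = FALSE, fill = NA,
              color = "red", linetype = "dashed")
p_ribbon
```

A single ribbon layer keeps both boundaries in one place, which is convenient when they share the same styling. Two geom_line() layers remain the better fit when the upper and lower bounds need different colors or line types.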
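Method 3: Using stat_smooth() with geom = "ribbon" for Built-In Confidence Interval Boundaries
If you want a simpler approach that stays within ggplot2, you can let stat_smooth() compute the interval and draw it with the ribbon geom instead of the default smooth geom. Setting fill = NA on geom_smooth() alone is not enough: its ribbon is drawn without an outline, so removing the fill hides the interval completely and leaves only the trend line. A ribbon with fill = NA and a color keeps its outline, which gives the boundaries without the shading.
R
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  # Trend line only
  stat_smooth(method = "lm", se = FALSE, color = "blue") +
  # Confidence interval drawn as an unfilled ribbon, so only its outline shows
  stat_smooth(method = "lm", geom = "ribbon", fill = NA,
              linetype = "dashed", color = "red") +
  labs(title = "Confidence Interval Boundaries Without Shading",
       x = "Weight",
       y = "Miles per Gallon") +
  theme_minimal()
Output:
- geom = "ribbon": Draws the interval computed by stat_smooth() as a ribbon whose outline can be styled.
- fill = NA: Removes the shading of the confidence interval.
- linetype = "dashed": Changes the confidence interval lines to dashed lines.
- color = "red": Changes the color of the confidence interval boundaries.
This plot displays the trend line together with the dashed boundaries of the confidence interval, without shading.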
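When the smoother is not a linear model, or when the two boundaries need separate styling, one option is to reuse the values ggplot2 has already computed and draw them as ordinary lines. The sketch below assumes a loess smoother purely for illustration; base_plot, bounds and p_loess are hypothetical names.

```r
library(ggplot2)

# Plot with the usual shaded interval, used only to obtain the computed bounds
base_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  stat_smooth(method = "loess", formula = y ~ x, se = TRUE)

# Data computed by the smooth layer (layer 2): x, y, ymin, ymax, se
bounds <- layer_data(base_plot, 2)

p_loess <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  stat_smooth(method = "loess", formula = y ~ x, se = FALSE, color = "blue") +
  # Lower and upper boundaries drawn as separate dashed lines
  geom_line(aes(x = x, y = ymin), data = bounds, inherit.aes = FALSE,
            linetype = "dashed", color = "red") +
  geom_line(aes(x = x, y = ymax), data = bounds, inherit.aes = FALSE,
            linetype = "dashed", color = "red")
p_loess
```

Because the bounds come from the same stat_smooth() call that would draw the shading, the lines match the default interval for that smoother and confidence level.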
Conclusion
In this guide, we learned how to draw only the boundaries of the confidence interval without shading using stat_smooth() and geom_line() in ggplot2. Whether you prefer to manually calculate the confidence intervals and plot them or use a simpler built-in method, R provides flexible options for customizing your plots.