
Drawing Only Boundaries of stat_smooth in ggplot2 using R

Last Updated : 23 Sep, 2024

When creating plots with ggplot2, you often use stat_smooth() to add a smooth curve to visualize trends in your data. By default, stat_smooth() includes both the smoothed line and the shaded confidence interval. However, in certain cases, you may only want to show the boundaries of the confidence interval without filling the area between them. In this article, we will cover how to draw only the boundaries of the confidence interval (without shading) using stat_smooth() in ggplot2.

Overview of stat_smooth()

The stat_smooth() function in ggplot2 adds a smoothed conditional mean to a plot. With its default settings, it draws:

  • A line representing the estimated trend.
  • A shaded region representing the confidence interval around the smooth line.

The goal here is to display only the boundaries of the confidence interval and remove the shaded region.
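It can help to see that the boundaries are already part of what stat_smooth() computes. The following is a minimal sketch, assuming the mtcars data and a linear smoother; the object names p and smooth_data are illustrative only. It uses layer_data() to pull out the values behind the first layer of a plot.

```r
library(ggplot2)

# Build a plot with the default interval (se = TRUE).
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  stat_smooth(method = "lm", formula = y ~ x)

# Extract the data frame that stat_smooth() computed for layer 1.
smooth_data <- layer_data(p, 1)

# y is the fitted trend; ymin and ymax are the lower and upper interval bounds.
head(smooth_data[, c("x", "y", "ymin", "ymax")])
```

The ymin and ymax columns are the two curves this article aims to draw as lines instead of as a filled band.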

Method 1: Default Behavior of stat_smooth()

Let’s first see how stat_smooth() behaves by default, with both the trend line and the shaded confidence interval. We will use the mtcars dataset as an example.

R
# Load the required libraries
library(ggplot2)

# Create a basic scatter plot with stat_smooth()
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  stat_smooth(method = "lm", se = TRUE) +
  labs(title = "Default stat_smooth() with Confidence Interval")

Output:

[Figure: Default behavior of stat_smooth()]

  • geom_point(): Creates a scatter plot.
  • stat_smooth(): Adds a smoothed line (method = "lm" specifies linear regression) and the shaded confidence interval (since se = TRUE).

By default, the plot shows a regression line with a shaded confidence interval.
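The width of the band is controlled by the level argument of stat_smooth(), which defaults to 0.95. As a small sketch (the names p95, p99, width95, and width99 are made up for illustration), the same smoother can be built at two levels and the average band width compared:

```r
library(ggplot2)

# Build the same linear smoother at two confidence levels.
p95 <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  stat_smooth(method = "lm", formula = y ~ x, level = 0.95)
p99 <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  stat_smooth(method = "lm", formula = y ~ x, level = 0.99)

# Mean vertical distance between the upper and lower bounds of each band.
width95 <- with(layer_data(p95, 1), mean(ymax - ymin))
width99 <- with(layer_data(p99, 1), mean(ymax - ymin))
c(level_95 = width95, level_99 = width99)
```

A higher level gives a wider band, so whichever boundaries are drawn later should be computed at the same level the reader expects.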

Method 2: Drawing Only the Boundaries of the Confidence Interval

To show only the boundaries of the confidence interval without filling the region, we can use the following steps:

  • Turn off the shading by setting se = FALSE in stat_smooth().
  • Manually add the confidence interval boundaries using geom_ribbon() or geom_line().

Step 1: Remove the Shading with se = FALSE

By setting the se = FALSE argument in stat_smooth(), you can remove the shaded region:

R
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  stat_smooth(method = "lm", se = FALSE) +
  labs(title = "Smoothed Line without Confidence Interval Shading")

Output:

[Figure: Smoothed line without confidence interval shading]

This removes the confidence interval entirely, but the goal is to add just the boundaries back.

Step 2: Add Boundaries Using geom_line() and predict()

To manually add the boundaries of the confidence interval, you can use the predict() function to calculate the upper and lower bounds and then plot them using geom_line().

R
# Fit a linear model
fit <- lm(mpg ~ wt, data = mtcars)

# Create a data frame with fitted values and confidence intervals.
# predict() returns a matrix with columns fit, lwr, and upr; naming it "mpg"
# makes data.frame() expand it into mpg.fit, mpg.lwr, and mpg.upr.
pred_data <- data.frame(
  wt = mtcars$wt,
  mpg = predict(fit, newdata = mtcars, interval = "confidence")
)

# Create the plot (linewidth needs ggplot2 >= 3.4.0; use size on older versions)
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_line(aes(y = mpg.fit), data = pred_data, color = "blue", linewidth = 1) +  # Fitted line
  geom_line(aes(y = mpg.lwr), data = pred_data, linetype = "dashed", color = "red") +  # Lower boundary
  geom_line(aes(y = mpg.upr), data = pred_data, linetype = "dashed", color = "red") +  # Upper boundary
  labs(title = "Linear Regression with Confidence Interval Boundaries",
       x = "Weight",
       y = "Miles per Gallon") +
  theme_minimal()

Output:

[Figure: Linear regression with confidence interval boundaries]

  • predict() is used to calculate the fitted values and the confidence intervals.
  • geom_line() is used to add both the smoothed line and the upper/lower boundaries of the confidence intervals as dashed lines.
  • The smoothed line is blue, and the boundaries are represented by dashed red lines.

This plot shows the smoothed regression line with the upper and lower confidence interval boundaries, but without shading between the two boundaries.
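The geom_ribbon() route mentioned earlier can be sketched in a similar way. The example below is only an illustration under a few assumptions: it predicts on an evenly spaced grid of 80 weights (an arbitrary choice), computes a 95% interval by hand as fit ± t × standard error, and checks the result against predict(). The names grid, t_crit, bounds_match, and p_ribbon are hypothetical.

```r
library(ggplot2)

fit <- lm(mpg ~ wt, data = mtcars)

# Evenly spaced x values across the observed range (80 points is arbitrary).
grid <- data.frame(wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 80))

# Standard errors of the fitted mean, then a 95% t-based interval by hand.
pred <- predict(fit, newdata = grid, se.fit = TRUE)
t_crit <- qt(0.975, df = pred$df)
grid$fit <- as.numeric(pred$fit)
grid$lwr <- grid$fit - t_crit * as.numeric(pred$se.fit)
grid$upr <- grid$fit + t_crit * as.numeric(pred$se.fit)

# Cross-check the hand-computed bounds against predict()'s own interval.
builtin <- predict(fit, newdata = grid, interval = "confidence", level = 0.95)
bounds_match <- isTRUE(all.equal(grid$lwr, as.numeric(builtin[, "lwr"]))) &&
  isTRUE(all.equal(grid$upr, as.numeric(builtin[, "upr"])))

# A single geom_ribbon() layer with fill = NA draws both boundary lines.
# inherit.aes = FALSE keeps the ribbon from looking for mpg in grid.
p_ribbon <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_ribbon(aes(x = wt, ymin = lwr, ymax = upr), data = grid,
              inherit.aes = FALSE, fill = NA,
              color = "red", linetype = "dashed") +
  geom_line(aes(x = wt, y = fit), data = grid,
            inherit.aes = FALSE, color = "blue")
p_ribbon
```

Compared with two geom_line() calls, the ribbon keeps both boundaries in one layer, so their color and line type cannot drift apart.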

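Method 3: Using stat_smooth() with geom = "ribbon" for Built-In Confidence Interval Boundaries

If you want a simpler approach that stays within ggplot2, let stat_smooth() compute the interval and draw it with the ribbon geom instead of the default smooth geom. Setting fill = NA on geom_smooth() alone is not enough: geom_smooth() draws its band without an outline, so only the fitted line would remain. With geom = "ribbon", fill = NA removes the shading while color and linetype style the upper and lower boundaries. A second stat_smooth() layer with se = FALSE adds the fitted line.

R
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  stat_smooth(method = "lm", se = FALSE, color = "blue") +  # Fitted line only
  stat_smooth(method = "lm", geom = "ribbon", fill = NA,
              linetype = "dashed", color = "red") +  # Interval boundaries only
  labs(title = "Confidence Interval Boundaries Without Shading",
       x = "Weight",
       y = "Miles per Gallon") +
  theme_minimal()

Output:

[Figure: Confidence interval boundaries without shading]

  • geom = "ribbon": Draws the interval computed by stat_smooth() as a ribbon, whose outline can be styled.
  • fill = NA: Removes the shading of the confidence interval.
  • linetype = "dashed": Draws the confidence interval boundaries as dashed lines.
  • color = "red": Sets the color of the confidence interval boundaries.

This plot displays the fitted line together with the dashed boundaries of the confidence interval, without shading.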
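The same idea carries over to smoothers where computing the bounds by hand is less convenient. Here is a minimal sketch, assuming a loess smoother and assuming stat_smooth() is drawn with geom = "ribbon" so that its computed ymin and ymax are rendered as outline lines; p_loess and loess_bounds are illustrative names.

```r
library(ggplot2)

p_loess <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  # Fitted loess curve without any interval.
  stat_smooth(method = "loess", formula = y ~ x, se = FALSE, color = "blue") +
  # Interval from the same smoother, drawn as an unfilled ribbon outline.
  stat_smooth(method = "loess", formula = y ~ x, geom = "ribbon",
              fill = NA, color = "red", linetype = "dashed")

# Layer 3 holds the bounds that are drawn as the dashed lines.
loess_bounds <- layer_data(p_loess, 3)
head(loess_bounds[, c("x", "ymin", "ymax")])

p_loess
```

Because the line and the ribbon come from two separate stat_smooth() calls, both calls need the same method and formula for the boundaries to match the curve.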

Conclusion

In this guide, we learned how to draw only the boundaries of the confidence interval without shading using stat_smooth() and geom_line() in ggplot2. Whether you prefer to manually calculate the confidence intervals and plot them or use a simpler built-in method, R provides flexible options for customizing your plots.

