How to Calculate Confidence Intervals in Python?
Last Updated: 28 Apr, 2025
A confidence interval (CI) is a range, computed from sample data, that estimates an unknown population parameter such as the population mean. The confidence level (e.g., 95%) describes how reliable the procedure is: if the sampling were repeated many times, about 95% of the intervals built this way would contain the true value. For the mean of a sample whose population standard deviation is unknown, the formula is:
$$\text{CI} = \bar{x} \pm t \cdot \frac{s}{\sqrt{n}}$$
- $\bar{x}$: sample mean
- $t$: critical t-value for the chosen confidence level, with $n - 1$ degrees of freedom
- $s$: sample standard deviation
- $n$: sample size
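The formula can be checked by hand on the ten-value sample used in the code examples of this article (12, 15, 14, 10, 13, 17, 14, 15, 16, 14). The values sum to 140 and the squared deviations from the mean sum to 36. A worked sketch for a 95% interval, with the t-value rounded to three decimals:

```latex
\bar{x} = \frac{140}{10} = 14, \qquad
s = \sqrt{\frac{36}{9}} = 2, \qquad
n = 10, \qquad
t_{0.975,\,9} \approx 2.262
\\[4pt]
\text{CI} = 14 \pm 2.262 \cdot \frac{2}{\sqrt{10}} \approx 14 \pm 1.431 = (12.569,\ 15.431)
```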
Using scipy.stats.t.interval
This method takes the confidence level, the degrees of freedom (n - 1), the sample mean as loc and the standard error of the mean as scale, and returns the lower and upper bounds directly. It is particularly useful for small samples where the population standard deviation is unknown and you want to estimate the range within which the true mean lies at a specified confidence level.
Python
import numpy as np
import scipy.stats as stats
d = [12, 15, 14, 10, 13, 17, 14, 15, 16, 14]
cl = 0.95 # confidence level
# confidence interval
ci = stats.t.interval(cl, df=len(d)-1, loc=np.mean(d), scale=np.std(d, ddof=1) / np.sqrt(len(d)))
print(ci)
Output
(np.float64(12.56928618802332), np.float64(15.43071381197668))
Explanation:
- np.mean(d) computes the sample mean of the data.
- np.std(d, ddof=1) calculates the sample standard deviation, using ddof=1 for sample standard deviation (Bessel's correction).
- len(d) determines the number of data points (sample size).
- stats.t.interval() calculates the confidence interval using the confidence level, degrees of freedom, sample mean and standard error (sample standard deviation divided by the square root of sample size).
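To see how the confidence level affects the result, the same call can be repeated at several levels. The following is a minimal sketch, assuming the sample from this article; the levels 90%, 95% and 99% and the dictionary name intervals are illustrative choices, and scipy.stats.sem is used as a shortcut for the standard error.

```python
import numpy as np
import scipy.stats as stats

d = [12, 15, 14, 10, 13, 17, 14, 15, 16, 14]  # sample used in this article

mean = np.mean(d)
sem = stats.sem(d)  # standard error of the mean (ddof=1 by default)

# Compute the interval at several illustrative confidence levels
intervals = {}
for level in (0.90, 0.95, 0.99):
    low, high = stats.t.interval(level, df=len(d) - 1, loc=mean, scale=sem)
    intervals[level] = (float(low), float(high))
    print(f"{level:.0%} CI: ({low:.3f}, {high:.3f})")
```

A higher confidence level yields a wider interval, since more certainty about covering the true mean requires a larger range.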
Using statsmodels
This approach fits an Ordinary Least Squares (OLS) model and reads the confidence intervals of its coefficients from conf_int(). In a model that contains only an intercept, the fitted intercept equals the sample mean, so its confidence interval is the t-based confidence interval for the mean. When the model also includes predictors, the same call returns one interval per coefficient, which makes it well suited to judging the uncertainty of regression estimates.
Python
import numpy as np
import statsmodels.api as sm
d = [12, 15, 14, 10, 13, 17, 14, 15, 16, 14]
# Intercept-only design matrix: a single column of ones
X = np.ones((len(d), 1))
model = sm.OLS(d, X)
res = model.fit()
# 95% confidence interval for the intercept (the sample mean)
ci = res.conf_int(alpha=0.05)
print(ci)
Output
[[12.56928619 15.43071381]]
Explanation:
- np.ones((len(d), 1)) builds a design matrix with only an intercept column, so the model estimates a single coefficient: the mean of d.
- sm.OLS(d, X) creates an OLS regression model with d as the dependent variable and X as the design matrix.
- model.fit() fits the regression model to the data.
- res.conf_int(alpha=0.05) returns the 95% confidence interval for each coefficient, one row per coefficient with the lower and upper bounds as columns.
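When the quantity of interest is a plain sample mean rather than a regression coefficient, statsmodels also offers a more direct route through DescrStatsW and its tconfint_mean() method. A minimal sketch, assuming the same sample as in the other sections:

```python
from statsmodels.stats.weightstats import DescrStatsW

d = [12, 15, 14, 10, 13, 17, 14, 15, 16, 14]  # sample used in this article

# t-based confidence interval for the mean; alpha=0.05 corresponds to 95%
low, high = DescrStatsW(d).tconfint_mean(alpha=0.05)
print(low, high)
```

The bounds should agree with the t-interval computed by the other methods in this article, since the same formula is applied underneath.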
Using numpy and scipy
This method computes the confidence interval manually by first calculating the t-value, sample standard deviation and standard error. The margin of error is then subtracted from and added to the sample mean to form the lower and upper bounds. This hands-on approach is straightforward and works well when you need to compute the interval using basic statistical formulas.
Python
import numpy as np
import scipy.stats as stats
d = [12, 15, 14, 10, 13, 17, 14, 15, 16, 14]
m, s, n = np.mean(d), np.std(d, ddof=1), len(d) # Mean, SD, Size
t = stats.t.ppf(0.975, df=n-1) # t-value
e = t * (s / np.sqrt(n)) # Margin
print(m - e, m + e)
Output
12.56928618802332 15.43071381197668
Explanation:
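- stats.t.ppf(0.975, df=n-1) returns the critical t-value for a two-sided 95% interval with n-1 degrees of freedom. The value 0.975 is 1 - 0.05/2, because the remaining 5% is split equally between the two tails.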
- t * (s / np.sqrt(n)) computes the margin of error by multiplying the t-value with the standard error of the mean.
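The hard-coded 0.975 only fits a 95% interval. The sketch below derives the quantile from the confidence level instead and cross-checks the result against stats.t.interval. The helper name mean_ci and its parameters are hypothetical, introduced here purely for illustration.

```python
import numpy as np
import scipy.stats as stats

def mean_ci(data, confidence=0.95):
    """Two-sided t-based confidence interval for the mean (hypothetical helper)."""
    x = np.asarray(data, dtype=float)
    n = x.size
    alpha = 1 - confidence                          # probability left in both tails
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)   # quantile 0.975 when confidence is 0.95
    margin = t_crit * x.std(ddof=1) / np.sqrt(n)    # t-value times standard error
    return float(x.mean() - margin), float(x.mean() + margin)

d = [12, 15, 14, 10, 13, 17, 14, 15, 16, 14]  # sample used in this article
manual = mean_ci(d, 0.95)
library = stats.t.interval(0.95, df=len(d) - 1, loc=np.mean(d), scale=stats.sem(d))
print(manual)
print(np.allclose(manual, library))  # the two approaches should agree
```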
Using pandas
This approach is similar to the previous one but uses pandas to hold the data. It calculates the sample mean and standard deviation from a DataFrame column and uses the t-value and margin of error to compute the confidence interval. This method is helpful when the data already lives in a DataFrame, for example after loading it from a file.
Python
import pandas as pd
import numpy as np
import scipy.stats as stats
d = [12, 15, 14, 10, 13, 17, 14, 15, 16, 14] # Data
df = pd.DataFrame(d, columns=['data'])
m, s, n = df['data'].mean(), df['data'].std(ddof=1), len(df)
t = stats.t.ppf(0.975, df=n-1) # t-value
e = t * (s / np.sqrt(n)) # Margin
print(m - e, m + e)
Output
12.56928618802332 15.43071381197668
Explanation:
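- df['data'].mean() and df['data'].std(ddof=1) compute the sample mean and sample standard deviation of the column; pandas already uses ddof=1 by default, so the argument is written out only for clarity.
- stats.t.ppf(0.975, df=n-1) finds the t-value for a 95% confidence level with n-1 degrees of freedom.
- t * (s / np.sqrt(n)) calculates the margin of error as the t-value multiplied by the standard error of the mean.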
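With pandas, the same calculation extends naturally to several groups at once. The following is a minimal sketch under stated assumptions: group "a" is the sample from this article, while group "b" and the column names group, value, ci_low and ci_high are made up for illustration.

```python
import numpy as np
import pandas as pd
import scipy.stats as stats

# Group "a" is the article's sample; group "b" is made-up illustrative data
df = pd.DataFrame({
    "group": ["a"] * 10 + ["b"] * 8,
    "value": [12, 15, 14, 10, 13, 17, 14, 15, 16, 14,
              21, 19, 24, 22, 20, 23, 18, 25],
})

# Per-group mean, standard error (ddof=1 by default) and sample size
summary = df.groupby("group")["value"].agg(["mean", "sem", "count"])

# Each group gets its own t-value because the degrees of freedom differ
t_crit = stats.t.ppf(0.975, df=summary["count"].to_numpy() - 1)
summary["ci_low"] = summary["mean"] - t_crit * summary["sem"]
summary["ci_high"] = summary["mean"] + t_crit * summary["sem"]
print(summary)
```

Using the built-in sem aggregation avoids computing the standard deviation and the square root of the sample size by hand for every group.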