
Orthogonal distance regression using SciPy

Last Updated : 23 Jul, 2025

Orthogonal Distance Regression (ODR) is a powerful statistical technique used to fit a model to data when both independent (X) and dependent (Y) variables are subject to error. Unlike traditional Ordinary Least Squares (OLS), which assumes that only the dependent variable has measurement errors, ODR accounts for errors in both directions, making it ideal for scientific and engineering data where all measurements can be noisy.

Why Use ODR Instead of OLS?

In many real-world scenarios, both the independent variable (X) and the dependent variable (Y) may be affected by measurement errors. In such cases, ODR becomes more suitable because it:

  • Accounts for errors in both X and Y
  • Provides a more geometrically accurate fit
  • Is capable of handling non-linear models

Mathematical Formulation

For a linear model (the Deming regression case), the objective function minimized in ODR is:

\sum_{i=1}^{n} \left[ \frac{(y_i - \alpha - \beta x_i)^2}{\eta} + (x_i - X_i)^2 \right]

Where:

  • y_i: observed dependent variable
  • x_i: true (unknown) value of the independent variable
  • X_i: observed value of the independent variable
  • \alpha, \beta: regression coefficients (intercept and slope)
  • \eta: weighting factor between Y and X errors

And the weighting factor \eta is defined as:

\eta = \frac{\sigma_\xi^2}{\sigma_\mu^2}

Where:

  • \sigma_\xi^2: variance of error in the dependent variable (Y-axis)
  • \sigma_\mu^2: variance of error in the independent variable (X-axis)
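To make the formulation concrete, the objective above can be minimized directly with a general-purpose optimizer, treating the latent "true" x values as extra parameters. This is an illustrative sketch, not SciPy's ODR API; the synthetic data, the choice \eta = 1, and all variable names here are assumptions for the demo.

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic data: y ~ 2x + 1 with noise (assumed for illustration)
rng = np.random.default_rng(0)
X_obs = np.linspace(0.0, 10.0, 20)
y_obs = 2.0 * X_obs + 1.0 + rng.normal(0.0, 0.5, 20)

eta = 1.0  # assumed ratio of Y-error variance to X-error variance

def objective(params):
    alpha, beta = params[:2]
    x_true = params[2:]  # latent "true" X values, estimated jointly
    return np.sum((y_obs - alpha - beta * x_true) ** 2 / eta
                  + (x_true - X_obs) ** 2)

# Initial guess: alpha=0, beta=1, x_true = observed X
p0 = np.concatenate(([0.0, 1.0], X_obs))
result = minimize(objective, p0, method="BFGS")
alpha_hat, beta_hat = result.x[:2]
```

In practice you would not minimize this by hand; the point is only that ODR's fit jointly adjusts the coefficients and the per-point x estimates, which is exactly what `scipy.odr` does efficiently internally.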

Implementation in SciPy

SciPy provides the scipy.odr module to implement ODR using the ODRPACK library, a well-established FORTRAN-77 based package. SciPy wraps this functionality in an object-oriented interface for ease of use.
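Because ODR's distinguishing feature is handling errors in both variables, it is worth noting that `scipy.odr.RealData` accepts per-point standard deviations for X and Y (`sx`, `sy`), which ODR uses to weight the residuals. A minimal sketch, with illustrative data and uncertainty values assumed for the demo:

```python
import numpy as np
from scipy import odr

# Observations roughly following y = 2x (assumed for illustration)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
sx = np.full_like(x, 0.1)  # assumed uncertainty in each x measurement
sy = np.full_like(y, 0.2)  # assumed uncertainty in each y measurement

def linear(p, x):
    return p[0] * x + p[1]

# RealData carries the measurement uncertainties along with the data
data = odr.RealData(x, y, sx=sx, sy=sy)
fit = odr.ODR(data, odr.Model(linear), beta0=[1.0, 0.0]).run()
slope, intercept = fit.beta
```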

Step-by-Step Approach

  1. Import required libraries
  2. Create input data arrays (feature, target)
  3. Define a model function (e.g., linear)
  4. Use odr.Model() to wrap the model function
  5. Wrap data using odr.Data()
  6. Create and configure odr.ODR() instance
  7. Run the regression using .run()
  8. Display results with .pprint()
Python
import numpy as np
from scipy import odr

# Independent variable: integers 1..10 in random order. The shuffle is
# unseeded, so the x-y pairing (and hence the fit) changes between runs.
x = np.arange(1, 11)
np.random.shuffle(x)

# Dependent variable: one observation per x value
y = np.array([0.65, -0.75, 0.90, -0.5, 0.14,
              0.84, 0.99, -0.95, 0.41, -0.28])

# Linear model y = m*x + c; ODR passes the parameter vector first
def model_fn(p, x):
    m, c = p
    return m * x + c

model = odr.Model(model_fn)                       # wrap the model function
data = odr.Data(x, y)                             # wrap the data
odr_run = odr.ODR(data, model, beta0=[0.2, 1.0])  # initial guess for [m, c]
res = odr_run.run()                               # perform the regression
res.pprint()                                      # print the fit summary

Output (from one run; the unseeded shuffle means exact values vary)

Beta: [ 0.11545417 -0.48999795]
Beta Std Error: [0.07475684 0.46382517]
Beta Covariance: [[ 0.01228991 -0.06759452]
 [-0.06759452  0.4731028 ]]
Residual Variance: 0.45472947791705537
Inverse Condition #: 0.06923218954368635
Reason(s) for Halting:
  Sum of squares convergence
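The fitted coefficients live in the result's beta attribute and can be used for prediction. A short sketch rerunning the fit on the same data (without the shuffle step, so the exact numbers differ from the output above):

```python
import numpy as np
from scipy import odr

x = np.arange(1, 11).astype(float)
y = np.array([0.65, -0.75, 0.90, -0.5, 0.14,
              0.84, 0.99, -0.95, 0.41, -0.28])

def model_fn(p, x):
    m, c = p
    return m * x + c

res = odr.ODR(odr.Data(x, y), odr.Model(model_fn), beta0=[0.2, 1.0]).run()
m, c = res.beta                   # fitted slope and intercept
y_fit = model_fn(res.beta, x)     # fitted line evaluated at the data points
```

From here, `y_fit` could be plotted against the raw points with matplotlib to visualize the orthogonal fit.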

