Simple Explanation of Statsmodel Linear Regression Model Summary

The document discusses the summary output of linear regression models in Python's statsmodels library. It explains key terms in the model summary like degrees of freedom, covariance type, R-squared, t-statistics, p-values, F-statistics, and information criteria like AIC and BIC. An example using salary data demonstrates how to interpret the summary to evaluate which variables are statistically significant predictors of salary.

9/17/23, 11:47 AM Simple Explanation of Statsmodel Linear Regression Model Summary | by Md Sohel Mahmood | Towards Data Science

Simple Explanation of Statsmodel Linear Regression Model Summary
Statsmodel library model summary explanation

Md Sohel Mahmood · Follow

Published in Towards Data Science
7 min read · Apr 22, 2022

Introduction


Regression analysis is the bread and butter for many statisticians and data
scientists. We perform simple and multiple linear regression for the purpose of
prediction and want to obtain a robust model that is as free from bias as possible.
In this article, I am going to discuss the summary output of Python’s statsmodels
library using a simple example and explain how the reported values reflect the
model performance.

Typical model summary

For the purpose of demonstration, I will use Kaggle’s Salary dataset (Apache 2.0
open source license). This dataset has two columns: years of experience and salary.
I have added two more columns: Projects and People_managing.


Sample data

When we use statsmodels with all three variables to predict Salary, we get the
following summary result.


I am going to explain all these parameters in the summary below.

Dep variable

The dependent variable is “Salary”, which is the only dependent variable in the data.

Model and Method

OLS, which stands for Ordinary Least Squares. The model tries to find a linear
expression for the dataset that minimizes the sum of squared residuals.

DF residuals and DF model

We have a total of 30 observations and 4 features. Of these 4 features, 3 are
independent variables, so DF Model is 3. DF Residuals is calculated as total
observations - DF Model - 1 (the 1 accounts for the intercept), which is
30 - 3 - 1 = 26 in our case.

Covariance type


Covariance type is nonrobust by default. Here, covariance refers to the covariance
matrix of the estimated coefficients, from which the standard errors are calculated.
Nonrobust means that this matrix is computed under the assumption that the errors
have constant variance (homoscedasticity), with no correction for heteroscedasticity.
In general, covariance shows how two variables move with respect to each other. If
this value is greater than 0, both move in the same direction, and if it is less
than 0, the variables move in opposite directions. Covariance is different from
correlation. The magnitude of a covariance depends on the units of the variables, so
it mainly conveys the direction of movement, whereas correlation is normalized to
range from -1 to +1 and therefore conveys the strength of the relationship. If we
want to obtain a robust covariance, we can declare cov_type as HC0, HC1, HC2 or HC3
when fitting the model. However, the statsmodels documentation does not explain
these options in much detail. HC stands for heteroscedasticity consistent, and HC0
implements the simplest version among them.

R-squared

The R-squared value is the coefficient of determination, which indicates the
percentage of the variability in the data explained by the selected independent
variables.

Adj. R-squared

As we add more and more independent variables to our model, the R-squared
value increases (or at least never decreases), but in reality, those variables do
not necessarily make any contribution towards explaining the dependent variable.
Therefore, the addition of each unnecessary variable needs some sort of penalty.
The original R-squared value is adjusted for the number of independent variables
incorporated. In essence, we should always look at the adjusted R-squared value
while performing multiple linear regression. For a single independent variable, the
R-squared and adjusted R-squared values are close but not identical, since the
adjustment still applies with one predictor.

Before moving to the F-statistic, we need to understand the t-statistics first. The
t-statistics are provided in the table shown below.


coef and std err

The coef column represents the coefficients for each independent variable along
with the intercept value. Std err is the standard error of the corresponding
coefficient, that is, the estimated standard deviation of the coefficient estimate,
which measures how precisely the coefficient is determined from the data. When using
only one predicting variable, the standard error can be visualized in
two-dimensional space as shown below.


Image by Author

t-values and P>|t|

The t column provides the t-value corresponding to each independent variable.
For example, here YearsExperience, Projects and People_managing all have different
t-values as well as different p-values. Each t-value is the coefficient divided by
its standard error, and the t-values are used to calculate the p-values. Typically,
a p-value less than 0.05 indicates strong evidence against the null hypothesis,
which states that the corresponding independent variable has no effect on the
dependent variable. The p-value of 0.249 for Projects tells us that, if Projects
truly had no effect on Salary, a t-value at least this extreme would still occur
about 24.9% of the time, so we cannot reject the null hypothesis for Projects.
YearsExperience has a p-value of 0 (to the displayed precision), indicating that
YearsExperience is statistically significant since it is less than the critical
limit (0.05). In this case, we can reject the null hypothesis and say that
YearsExperience has a significant effect on Salary.


Years of Experience against Salary showing strong correlation

F-statistics

The F-test provides a way to check all the independent variables together to see
whether any of them is related to the dependent variable. If Prob(F-statistic) is
greater than 0.05, there is insufficient evidence of a relationship between any of
the independent variables and the output. If it is less than 0.05, we can say that
at least one variable is significantly related to the output. In our example, the
p-value is less than 0.05 and therefore one or more of the independent variables
are related to the output variable Salary. We have seen previously that
YearsExperience is significantly related to Salary but the others are not.
Therefore, the F-test result supports the t-test outcomes. However, there may be
some cases where Prob(F-statistic) is greater than 0.05 even though one of the
independent variables appears significant in its t-test. This is because each
t-test checks a single coefficient with the other variables held in the model,
whereas the F-test checks the combined effect of all variables jointly.

Log-likelihood

The log-likelihood value is a measure of how well the model fits the given data. It
is useful when we compare two or more models. The higher the value of
log-likelihood, the better the model fits the given data. It can range from negative
infinity to positive infinity.

log-likelihood when all three variables are included

log-likelihood when only “Projects” is included

When all three independent variables are incorporated in the model, the
log-likelihood value is -310.21, which is higher than the -334.95 obtained when only
the Projects data is included. This means the first model fits the data better. It
also goes hand in hand with the R-squared values as seen above.

AIC and BIC

AIC (Akaike Information Criterion, developed by the Japanese statistician Hirotugu
Akaike) and BIC (Bayesian Information Criterion) are also used as criteria for model
selection. Both reward goodness of fit through the log-likelihood and penalize the
number of parameters. The goal is to minimize these values to get a better model. I
have another article where I have discussed these topics.
Simple Stepwise and Weighted Regression Model (towardsdatascience.com)

Omnibus and Prob(Omnibus)

The Omnibus test checks the normality of the residuals once the model is fitted. An
Omnibus value close to zero is consistent with normally distributed residuals. Here,
in the example, Prob(Omnibus) is 0.357, which is greater than 0.05, so we cannot
reject the null hypothesis that the residuals are normally distributed. For a model
to be robust, besides checking R-squared and other rubrics, the residual
distribution should ideally be normal. In addition, the residuals should not follow
any pattern when plotted against the fitted values.

Skew and Kurtosis

The skew value tells us the skewness of the residual distribution. Normally
distributed variables have a skew of 0. Kurtosis is a measure of how light-tailed or
heavy-tailed the distribution is compared to a normal distribution. High kurtosis
indicates heavier tails (more extreme residuals) and low kurtosis indicates lighter
tails. The summary reports kurtosis on a scale where a normal distribution has a
value of 3. As a rule of thumb, an excess kurtosis (kurtosis minus 3) between -2 and
+2 is considered consistent with normality.

Durbin-Watson

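The Durbin-Watson statistic provides a measure of autocorrelation in the residuals.
If the residuals are autocorrelated, the standard errors, and therefore the t-tests
and F-test, become unreliable. This simply means that one residual should not depend
on any of the previous residuals. The statistic ranges from 0 to 4. A value close to
2 indicates no autocorrelation, values toward 0 indicate positive autocorrelation
and values toward 4 indicate negative autocorrelation.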

Jarque-Bera (JB) and Prob(JB)

Jarque-Bera (JB) and Prob(JB) are similar to the Omnibus test: they also test the
normality of the residuals, using their skewness and kurtosis.

Condition Number


A high condition number indicates possible multicollinearity in the dataset. If
only one variable is used as a predictor, multicollinearity between predictors
cannot occur, so this value can be ignored. We can proceed as in stepwise regression
and see whether any multicollinearity is added when additional variables are
included.

Conclusion

We have discussed all the summary parameters from the statsmodels output. This will
be useful for readers who are interested in checking all the rubrics for a robust
model. Most of the time, we look at the R-squared value to make sure that the model
explains most of the variability, but we have seen that there is much more to it
than that.

Thanks for reading

