
Exp1c

The document is a step-by-step guide to the features of statsmodels, including linear regression, summary statistics, and ANOVA. It covers importing libraries, creating a dataset, building a regression model, predicting new values, and performing residual analysis, with sample input and output.

Uploaded by

bharath91589

Exp:1c

Explore the features of statsmodels


Aim:
To demonstrate various features of statsmodels, such as linear
regression, summary statistics, predicting new values, checking
model residuals, and ANOVA (Analysis of Variance).

Algorithm:

Step 1: Import Required Libraries


• Import numpy, pandas, and statsmodels for statistical modeling
and analysis.
Step 2: Create a Dataset
• Generate a dataset with two variables:
  o Experience (independent variable)
  o Salary (dependent variable)
Step 3: Prepare Data for Regression
• Define X as the independent variable (Experience).
• Define Y as the dependent variable (Salary).
• Add a constant (intercept) column to X using sm.add_constant(X).
Step 4: Build the Regression Model
• Use sm.OLS(Y, X).fit() to create and fit an Ordinary Least
Squares (OLS) model.
Step 5: Display Model Summary
• Use model.summary() to view regression coefficients, the R-squared
value, p-values, and other statistics.
Step 6: Predict Salary for New Experience Values
• Define new values for experience.
• Add a constant term using sm.add_constant().
• Use model.predict() to generate predicted salaries.
Step 7: Residual Analysis
• Extract residuals using model.resid.
• Residuals help check if errors are randomly distributed.
Step 8: Perform ANOVA Test (Analysis of Variance)
• Fit the same model through the formula API (smf.ols), which
anova_lm requires, and use sm.stats.anova_lm(model, typ=2) to check
model significance.
Step 9: Display All Results
• Print regression summary, predictions, residuals, and ANOVA
table.

Program:
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

np.random.seed(42)  # Ensure reproducibility

# Step 2: create the dataset
data = {
    "Experience": np.random.randint(1, 20, 10),      # Years of experience
    "Salary": np.random.randint(40000, 100000, 10),  # Salary in dollars
}
df = pd.DataFrame(data)
print("Original DataFrame:\n", df)

# Step 3: prepare data for regression
X = df["Experience"]    # Independent variable
Y = df["Salary"]        # Dependent variable
X = sm.add_constant(X)  # Adds a column of ones to X (intercept term)

# Steps 4 and 5: fit the OLS model and display its summary
model = sm.OLS(Y, X).fit()  # Ordinary Least Squares (OLS)
print("\nRegression Model Summary:")
print(model.summary())

# Step 6: predict salary for new experience values
new_experience = pd.DataFrame({"Experience": [5, 10, 15]})
new_experience = sm.add_constant(new_experience, has_constant="add")
predicted_salary = model.predict(new_experience)
print("\nPredicted Salaries for New Experience Values:")
print(predicted_salary)

# Step 7: residual analysis
residuals = model.resid
print("\nModel Residuals:")
print(residuals)

# Step 8: ANOVA. anova_lm requires a model fitted through the formula API
anova_model = smf.ols("Salary ~ Experience", data=df).fit()
anova_table = sm.stats.anova_lm(anova_model, typ=2)
print("\nANOVA Test Results:")
print(anova_table)

Sample input/output:
Original DataFrame:
   Experience  Salary
0           7   77817
1           4   67682
2          11   64299
3           9   90636
4           2   67017
5          12   73126
6          11   89921
7           9   51479
8          14   47292
9           3   64997

Regression Model Summary:
                            OLS Regression Results
==============================================================================
Dep. Variable:                 Salary   R-squared:                       0.258
Model:                            OLS   Adj. R-squared:                  0.166
Method:                 Least Squares   F-statistic:                     2.795
                                        Prob (F-statistic):              0.136
==============================================================================
                  coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const       80351.0280  10514.128      7.642      0.000   56823.836  103878.220
Experience  -1035.3023    620.075     -1.671      0.136   -2498.663     428.058
==============================================================================

Predicted Salaries for New Experience Values:
0    75174.5171
1    69897.6062
2    64620.6953
dtype: float64

Model Residuals:
0    -2606.488476
1     -699.431093
2      459.471269
3    12644.062759
...

ANOVA Test Results:
                 sum_sq   df      F  PR(>F)
Experience  209084039.3  1.0  2.795   0.136
Residual    671645249.5  8.0    NaN     NaN

Result:
The statsmodels features, including linear regression, summary
statistics, predicting new values, checking model residuals, and
ANOVA (Analysis of Variance), were successfully implemented,
executed, and verified.
