Regression Stat Assignment
May 4, 2024
Hours   Sulfate
70      5.77
80      5.64
90      5.39
110     5.09
130     4.87
150     4.6
160     4.5
170     4.36
180     4.27
Let's represent the data table as two NumPy arrays for further computation:
[3]: x_hours = np.array([2, 4, 6, 8, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90,
                         110, 130, 150, 160, 170, 180])
y_sulfate = np.array([15.11, 11.36, 9.77, 9.09, 8.48, 7.69, 7.33, 7.06, 6.7,
                      6.43, 6.16, 5.99, 5.77, 5.64, 5.39, 5.09, 4.87, 4.6,
                      4.5, 4.36, 4.27])
#Question 2: Prepare a plot showing - 1. the data points and 2. the regression curve in the original coordinates.
First, plot the data points as is:
[12]: plt.scatter(x_hours, y_sulfate, color='black', label='Data Points')
plt.xlabel('Hours')
plt.ylabel('Sulfate')
plt.title('Plot of Sulfate Concentration vs. Time')
Here, for the regression curve we use the curve_fit() function from scipy.optimize. We
need to assume a functional form for the curve, e.g. sinusoidal, linear (y = mx + c), or
exponential.
In this dataset, the points resemble a decaying exponential, so we assume the function
f(x) = a * e ^ (bx) + c
We then pass this function and the data points to curve_fit(), which returns the fitted
constants a, b and c.
[ ]: constants, _ = curve_fit(e_to_the_power, x_hours, y_sulfate)
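The model function e_to_the_power is not defined in the cells shown above. A minimal self-contained sketch of the fit, assuming the exponential form f(x) = a * e^(bx) + c stated in the text, and an initial guess p0 that is my assumption (without it, curve_fit starts at (1, 1, 1) and exp(180) overflows):

```python
import numpy as np
from scipy.optimize import curve_fit

def e_to_the_power(x, a, b, c):
    # Assumed model from the text: f(x) = a * e^(b*x) + c
    return a * np.exp(b * x) + c

x_hours = np.array([2, 4, 6, 8, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80,
                    90, 110, 130, 150, 160, 170, 180])
y_sulfate = np.array([15.11, 11.36, 9.77, 9.09, 8.48, 7.69, 7.33, 7.06,
                      6.7, 6.43, 6.16, 5.99, 5.77, 5.64, 5.39, 5.09,
                      4.87, 4.6, 4.5, 4.36, 4.27])

# A rough decay guess (assumption) keeps the optimizer from overflowing.
constants, _ = curve_fit(e_to_the_power, x_hours, y_sulfate,
                         p0=(10.0, -0.05, 4.0))
print(constants)  # fitted a, b, c
```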
[18]: plt.scatter(x_hours, y_sulfate, color='black', label='Data Points')
plt.plot(x_hours, e_to_the_power(x_hours, *constants), label='Regression Curve')
plt.xlabel('Hours')
plt.ylabel('Sulfate')
plt.title('Plot of Sulfate Concentration vs. Time')
plt.legend()
#Question 3: Plot the residual against the fitted values in log-log and in original coordinates.
The residual is the difference between the observed value of the dependent variable (in this case,
the sulfate concentration) and the value predicted by the regression model. In other words, it
represents the error or deviation of each data point from the fitted regression line or curve.
[19]: # slope, intercept, log_hours and log_sulfate come from the
# log-log linear fit in Question 1 (not shown here)
regression_line_log = slope * log_hours + intercept
residual_log = log_sulfate - regression_line_log
Now we plot the residuals against the fitted values:
[21]: plt.scatter(regression_line_log, residual_log, color='black')
plt.xlabel('Fitted Values (Log)')
plt.ylabel('Residual (Log)')
plt.title('Plot 5: Residual vs Fitted Values (Log-Log)')
For the original coordinates it is the same idea: the residual is the difference between the
observed data points and the fitted curve. We then plot it against the fitted values:
[24]: fit_curve = e_to_the_power(x_hours, *constants)
residual_original = y_sulfate - fit_curve
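The plotting cell for the original coordinates does not appear above. A sketch mirroring Plot 5; the fit is reproduced so the snippet stands alone (in the notebook, fit_curve and residual_original come from the cell above), and the headless Agg backend plus the initial guess p0 are my assumptions:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

def e_to_the_power(x, a, b, c):
    # Exponential model from the text: f(x) = a * e^(b*x) + c
    return a * np.exp(b * x) + c

x_hours = np.array([2, 4, 6, 8, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80,
                    90, 110, 130, 150, 160, 170, 180])
y_sulfate = np.array([15.11, 11.36, 9.77, 9.09, 8.48, 7.69, 7.33, 7.06,
                      6.7, 6.43, 6.16, 5.99, 5.77, 5.64, 5.39, 5.09,
                      4.87, 4.6, 4.5, 4.36, 4.27])

constants, _ = curve_fit(e_to_the_power, x_hours, y_sulfate,
                         p0=(10.0, -0.05, 4.0))
fit_curve = e_to_the_power(x_hours, *constants)
residual_original = y_sulfate - fit_curve

plt.scatter(fit_curve, residual_original, color='black')
plt.axhline(0, linestyle='--', linewidth=1)  # zero-residual reference line
plt.xlabel('Fitted Values')
plt.ylabel('Residual')
plt.title('Plot 6: Residual vs Fitted Values (Original Coordinates)')
plt.savefig('plot6_residuals.png')
```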
#Question 4: Use your plots to explain whether your regression is good or bad and why.
In Plot 5, the regression line we’ve calculated shows residuals distributed around zero, indicating
a good fit. Random scattering suggests the model captures data variation well, making it a strong
predictor.
In contrast, Plot 6 displays residuals clustered away from zero, particularly around values like 4-6.
This indicates a poor fit, with systematic errors in predictions. The model struggles to accurately
forecast values, making it unreliable for future predictions.
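The visual judgment above can be backed with a number. A small sketch of the coefficient of determination R^2 (1.0 is a perfect fit; values near 0 or below mean the model does no better than predicting the mean); the helper name and the example values are illustrative, not from the assignment:

```python
import numpy as np

def r_squared(y, y_pred):
    ss_res = np.sum((y - y_pred) ** 2)       # residual sum of squares
    ss_tot = np.sum((y - np.mean(y)) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot

# Illustrative example: a perfect prediction and a slightly-off one.
y = np.array([3.0, 5.0, 7.0, 9.0])
print(r_squared(y, y))                                  # → 1.0
print(r_squared(y, np.array([3.5, 5.0, 7.0, 8.5])))     # → 0.975
```

Applying the same helper to residual_log and residual_original would quantify which of the two fits is stronger.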