Excel Regression

The document discusses spreadsheet problem solving techniques including fitting linear, multilinear, polynomial, and nonlinear regression models to data. It provides examples of using Excel's Data Analysis Regression tool and Trendline feature to analyze straight-line, polynomial, and multi-linear regression models. It also discusses using the Solver tool to perform nonlinear regression to fit parameters in the van der Waals equation of state to pressure-volume data.

Uploaded by

Steve Wan


Spreadsheet Problem Solving

fitting models to data

straight-line regression
multilinear regression
nonlinear regression
model building and selection
using
Data Analysis Regression tool
Trendline
Solver

Review of Straight-line Linear Regression

[ from Class #6 ]
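[Figure: data points and the model line y = ax + b on x-y axes; for a data point, the error e is the vertical gap between the measured y and the model line]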
For each data point, there is an error between that point and the model line.
Fitting the model has to do with minimizing these errors.

Finding the model parameters that give the best fit

For the straight-line model, the model parameters are
the slope (a) and the intercept (b).
The problem is then to find the values of a and b that
give the best fit. What is meant by the best fit?
The standard measure of goodness of fit is the sum
of squares of the errors:
$$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \qquad \hat{y}_i = a x_i + b$$

So, the problem reduces to finding the minimum of SSE by adjusting a and b.

Fitting a straight-line model to data

The minimization of SSE can be solved by calculus
to give formulas for the best values of a and b:

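$$a = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}, \qquad b = \frac{\sum_{i=1}^{n} y_i - a \sum_{i=1}^{n} x_i}{n}$$

and Excel solves problems like this with either formulas or built-in tools (Data Analysis Regression & Trendline).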
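The closed-form result for the slope and intercept can be sketched outside Excel as follows. The function name `fit_line` and the sample data are assumptions made for illustration; the data are chosen to lie exactly on y = 2x + 1 so the answer is easy to check by hand.

```python
def fit_line(xs, ys):
    """Least-squares slope a and intercept b for the model y = a*x + b."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical; slope is undefined")
    a = (n * sum_xy - sum_x * sum_y) / denom
    b = (sum_y - a * sum_x) / n
    return a, b

# Illustrative data (an assumption): points exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

a, b = fit_line(xs, ys)
print(a, b)
```

In a worksheet, the built-in SLOPE and INTERCEPT functions return the same two values for a pair of y and x ranges.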

Example: straight-line fit

Transfer the data to an Excel spreadsheet and create a graph.

[Figure: CO2 Emissions for the US; y-axis: CO2 Emissions (MMT C), 1320 to 1520; x-axis: Year, 1989 to 2000]

Calculating the slope and intercept using Excel formulas

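$$a = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}, \qquad b = \frac{\sum_{i=1}^{n} y_i - a \sum_{i=1}^{n} x_i}{n}$$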

The formulas behind the numbers

Use the model straight-line equation to compute the predictions, and copy these to the graph, displaying them as a straight line.

[Figure: CO2 Emissions for the US, data with the fitted line y = 21.32x - 41090; y-axis: CO2 Emissions (MMT C), 1300 to 1550; x-axis: Year, 1989 to 2000]

Using an alternate, shortcut approach

Trendline

Start with a simple graph of the data

Select the data series by clicking on it
Right-click on a data point to get the context-sensitive menu
Select the Add Trendline option

[Figure: CO2 Emissions for the US; y-axis: CO2 Emissions (MMT C), 1320 to 1520; x-axis: Year, 1989 to 2000]

The Add Trendline dialog box

Linear is selected by default; OK for this problem
Click on the Options tab

Options tab

Set for Display equation on chart
Click OK
Fix up the equation display

Initial form of graph with straight-line added

[Figure: CO2 Emissions for the US, data with the trendline y = 21.315x - 41090, shown as first added and again after fixing up the equation display; y-axis: CO2 Emissions (MMT C), 1300 to 1550; x-axis: Year, 1989 to 2000]

Looks just like before, but we got there quicker

But neither of these approaches gives us much information about the model, how good it is, etc.

A 2nd alternate approach

Data Analysis Regression tool

Tools > Data Analysis

Recall that, if Data Analysis does not appear on the Tools menu, you will need to check Analysis ToolPak in the Add-ins dialog box [if it's not there, you will have to go back to Microsoft Office/Excel set-up].

Initial, empty Regression dialog box

Regression dialog box set up for our problem

Checking Residuals will also give us the model predictions.

Initial (poorly formatted) Regression output display

[ on new worksheet ]

Format > AutoFormat > OK, and fix up the display for appropriate significant figures

Final Display of Regression Output

[ tons of info, most of which you will not understand for a couple of years ]

used to judge goodness of fit
intercept and slope values
used to judge whether terms belong in the model
add to data graph for visual comparison with model

Judging Goodness of Fit

correlation coefficient: if close to +1 or -1, indicates strong correlation between x and y [something we already know from the original graph!]
coefficient of determination (R2): percentage of the variability in y that's accounted for by the model
adjusted R2: adjustment to R2 that penalizes the value for using a model with too many terms
standard error: gives an idea of how far off the model predictions will be

Adjusted R2 or Standard Error can be used to compare different models and choose which fits best. The higher the value of Adjusted R2 the better; the lower the value of Standard Error the better.
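The goodness-of-fit numbers described above can be reproduced by hand for a straight-line fit. This is a sketch under the usual textbook definitions (R2 = 1 - SSE/SST, with n - k - 1 degrees of freedom for k x-terms); the function names and the data are illustrative assumptions, not taken from the slides.

```python
import math

def fit_line(xs, ys):
    """Least-squares slope a and intercept b for y = a*x + b."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    a = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    b = (sy - a * sx) / n
    return a, b

def goodness_of_fit(xs, ys, a, b, n_terms=1):
    """Return (R2, adjusted R2, standard error) for the fitted line."""
    n = len(xs)
    y_mean = sum(ys) / n
    sse = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))  # unexplained
    sst = sum((y - y_mean) ** 2 for y in ys)                   # total
    dof = n - n_terms - 1            # degrees of freedom left for error
    r2 = 1 - sse / sst
    adj_r2 = 1 - (1 - r2) * (n - 1) / dof
    std_err = math.sqrt(sse / dof)
    return r2, adj_r2, std_err

# Illustrative data (an assumption): points scattered about a straight line.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]

a, b = fit_line(xs, ys)
r2, adj_r2, std_err = goodness_of_fit(xs, ys, a, b)
print(r2, adj_r2, std_err)
```

Adjusted R2 is never larger than R2, and the gap widens as more terms are added for the same number of data points, which is why it is the fairer number for comparing models.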

Judging whether terms belong in the model

P-values indicate how plausible it is that the true value of the coefficient is zero: each is the probability of getting an estimate at least this far from zero if the true coefficient were zero.

A P-value of 5% (0.05) or greater causes suspicion that the coefficient may not be significant and that the term should probably be dropped from the model.

P-values that are quite small, like these, indicate that there is little question about the significance of the term coefficients. In our case here, that means that both the intercept term and the slope term belong in the model.
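For a straight-line fit, the P-value columns of a regression output can be approximated by hand. The sketch below uses the standard formulas for the coefficient standard errors and a numerical (Simpson's rule) integration of the Student t density with n - 2 degrees of freedom; the data and all function names are illustrative assumptions, and the slope is computed from centered sums, which is algebraically the same as the least-squares formula.

```python
import math

def t_pdf(t, dof):
    """Density of the Student t distribution."""
    c = math.gamma((dof + 1) / 2) / (math.sqrt(dof * math.pi) * math.gamma(dof / 2))
    return c * (1 + t * t / dof) ** (-(dof + 1) / 2)

def two_sided_p(t_stat, dof, steps=2000):
    """P(|T| >= |t_stat|), integrating the density on [0, |t_stat|] by Simpson's rule."""
    t = abs(t_stat)
    if t == 0:
        return 1.0
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, dof) + t_pdf(t, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    return max(0.0, 1 - 2 * total * h / 3)

def coefficient_tests(xs, ys):
    """Return {name: (coefficient, standard error, t statistic, P-value)}."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    a = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
    b = y_mean - a * x_mean
    dof = n - 2
    sse = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / dof)                          # standard error of the fit
    se_a = s / math.sqrt(sxx)                         # standard error of the slope
    se_b = s * math.sqrt(1 / n + x_mean ** 2 / sxx)   # standard error of the intercept
    out = {}
    for name, coef, se in (("slope", a, se_a), ("intercept", b, se_b)):
        t_stat = coef / se
        out[name] = (coef, se, t_stat, two_sided_p(t_stat, dof))
    return out

# Illustrative data (an assumption, not from the slides).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]

results = coefficient_tests(xs, ys)
for name, (coef, se, t_stat, p) in results.items():
    print(name, coef, se, t_stat, p)
```

For this small data set both P-values come out below 0.05, so by the rule of thumb above both terms would stay in the model.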

The Data Analysis Regression tool appears much more complicated and involved than the shortcut Trendline tool, so . . .
Why use Data Analysis Regression?
1) It provides more information that lets us judge the goodness of fit and significance of model terms
2) It can handle model forms that cannot be handled by Trendline
So, generally, when using Excel, we prefer the Data Analysis Regression tool over Trendline, but Trendline is still quite good for quick and dirty looks at the data.
Learn to use both!

More complicated models

Polynomial models

$$y = a + bx + cx^2 + dx^3 + \cdots$$

Note: it is called linear regression, even when there are nonlinear terms in x, because the terms are linear in the model parameters, a, b, c, etc.

General linear models

$$y = a\,f_1(x) + b\,f_2(x) + c\,f_3(x) + d\,f_4(x) + \cdots$$

Examples:

polynomial models above

$$y = a + b\,\frac{1}{x} + c \ln x$$
Multilinear models

$$y = a\,f_1(x_1, x_2, \ldots) + b\,f_2(x_1, x_2, \ldots) + c\,f_3(x_1, x_2, \ldots) + \cdots$$

Examples:

$$y = a + b x_1 + c x_2 + d x_1 x_2$$

$$y = a\,e^{x_1 / x_2}$$

Nonlinear models
Transformable to linear

$$y = a\,e^{bx} \quad\Longrightarrow\quad \ln y = \ln a + b x$$ (straight-line regression!)

Not transformable

$$P = 10^{A - \frac{B}{T + C}}$$

We can use the Data Analysis Regression tool for everything except the nonlinear models that can't be transformed into linear. For those, we can use the Solver.
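As a sketch of the "transformable to linear" case, an exponential model y = a e^(bx) can be fitted by running a straight-line regression of ln y against x and converting the intercept back. The function names and the data (generated exactly from a = 2, b = 0.5) are assumptions for illustration.

```python
import math

def fit_line(xs, ys):
    """Least-squares slope and intercept for a straight line."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    slope = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    intercept = (sy - slope * sx) / n
    return slope, intercept

def fit_exponential(xs, ys):
    """Fit y = a*exp(b*x) via the transformed model ln y = ln a + b*x."""
    log_ys = [math.log(y) for y in ys]      # requires every y > 0
    slope, intercept = fit_line(xs, log_ys)
    return math.exp(intercept), slope       # a, b

# Illustrative data (an assumption): generated from y = 2*exp(0.5*x).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]

a_fit, b_fit = fit_exponential(xs, ys)
print(a_fit, b_fit)
```

One caveat worth keeping in mind: the transformed fit minimizes the squared errors in ln y rather than in y, so with noisy data it generally gives slightly different parameters than a direct nonlinear fit would.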

Example: polynomial regression

[Figure: Viscosity of Water at Atmospheric Pressure; y-axis: Viscosity (cp), 0 to 2; x-axis: Temperature (degF), 0 to 250; curvature evident]

Setting up for polynomial fits

Select for quadratic model, etc

Data Analysis Regression tool

check Labels because headings are included in selections for Y and X
check Residuals

Quadratic model regression results

model performance
adjR2

model coefficients
copy to graph

Quadratic model really doesn't capture behavior of data

[Figure: Viscosity of Water at Atmospheric Pressure; series: Data, Quadratic; y-axis: Viscosity (cp), 0 to 2; x-axis: Temperature (degF), 0 to 250]

Continue with fits of cubic, 4th- & 5th-order polynomials

Summary of results

Looks like 5th-order offers best performance, but improvement is marginal over 4th-order.
Resulting model:
$$\text{Visc} = 3.161 - 0.05699\,T + 5.023 \times 10^{-4}\,T^2 - 2.162 \times 10^{-6}\,T^3 + 3.593 \times 10^{-9}\,T^4$$
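A fitted polynomial like this one is cheap to evaluate with Horner's scheme. The sketch below takes the coefficient magnitudes from the slide and assumes alternating signs (minus on the T and T^3 terms), an assumption that reproduces familiar values for the viscosity of water; the function name is hypothetical.

```python
# 4th-order model coefficients, lowest power first.
# Magnitudes are from the slide; the alternating signs are an assumption.
COEFFS = [3.161, -0.05699, 5.023e-4, -2.162e-6, 3.593e-9]

def viscosity_cp(temp_f):
    """Evaluate the polynomial at temp_f (degF) using Horner's scheme."""
    result = 0.0
    for c in reversed(COEFFS):
        result = result * temp_f + c
    return result

for temp_f in (50.0, 100.0, 150.0, 200.0):
    print(temp_f, round(viscosity_cp(temp_f), 3))
```

The model should only be trusted inside the temperature range of the data it was fitted to; a polynomial gives no guarantee of sensible behavior outside that range.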

[Figure: Viscosity of Water at Atmospheric Pressure; series: Data, Quadratic, Cubic, 4th Order; y-axis: Viscosity (cp), 0 to 2; x-axis: Temperature (degF), 20 to 220]

Precautions on polynomial fitting

Try to use the lowest-order model that gives a good fit.
Higher-order models will have wiggles between data
points that will cause prediction errors.
In fact, an (n-1)th-order polynomial will provide a perfect
fit to the n data points, but it will usually do bizarre things
in between the data points.
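The "perfect fit but bizarre in between" behavior can be seen with a small experiment. In this sketch the (n-1)th-order polynomial through n points is built with Lagrange interpolation (a stand-in for running the regression tool, since the polynomial through n points is unique) and compared with a straight-line fit. The data and function names are illustrative assumptions.

```python
def lagrange_eval(xs, ys, x):
    """Value at x of the (n-1)th-order polynomial through all n points."""
    total = 0.0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        term = yj
        for k, xk in enumerate(xs):
            if k != j:
                term *= (x - xk) / (xj - xk)
        total += term
    return total

def fit_line(xs, ys):
    """Least-squares slope a and intercept b for y = a*x + b."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    a = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    b = (sy - a * sx) / n
    return a, b

# Illustrative data (an assumption): roughly linear with some scatter.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.0, 1.3, 1.8, 3.2, 3.7, 5.3]

a, b = fit_line(xs, ys)
x_between = 0.5                        # halfway between the first two points
poly_value = lagrange_eval(xs, ys, x_between)
line_value = a * x_between + b
print(poly_value, line_value)
```

The 5th-order polynomial reproduces every data point exactly, yet halfway between the first two points it lands well away from the straight-line trend, by more than any of the line's errors at the data points themselves.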

Example: multi-linear regression

Model 1: $y = a + b x_1 + c x_2$

Model 2: $y = b x_1 + c x_2$

X-input range includes two independent variables: x1 and x2

High P value for intercept in Model 1 suggests Model 2 without intercept, but there is a significant loss in adjR2

Multilinear Model Performance

[Figure: Predicted y vs Measured y for Model 1 and Model 2; y-axis: Predicted y, 0 to 12; x-axis: Measured y]

Model performance isn't that great for either model, and Model 1 doesn't appear dramatically better than Model 2.

Note: for multi-linear models, we plot Predicted vs Measured y. A perfect model would place points directly on the 45-degree line.
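A multi-linear fit with and without an intercept can be sketched by solving the normal equations directly. This is only one way to do the arithmetic, not a description of what Excel does internally; the helper names and the data (generated exactly from y = 1 + 2 x1 + 3 x2) are assumptions for illustration.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(matrix)
    m = [row[:] + [r] for row, r in zip(matrix, rhs)]   # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: x columns are not independent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def least_squares(rows, ys):
    """Coefficients minimizing SSE for y ~ sum(coef[i] * row[i])."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)

def sse(rows, ys, coefs):
    return sum((y - sum(c * v for c, v in zip(coefs, r))) ** 2
               for r, y in zip(rows, ys))

# Illustrative data (an assumption): y = 1 + 2*x1 + 3*x2 exactly.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
ys = [1 + 2 * u + 3 * v for u, v in zip(x1, x2)]

rows1 = [[1.0, u, v] for u, v in zip(x1, x2)]   # Model 1: a + b*x1 + c*x2
rows2 = [[u, v] for u, v in zip(x1, x2)]        # Model 2: b*x1 + c*x2

coefs1 = least_squares(rows1, ys)
coefs2 = least_squares(rows2, ys)
sse1, sse2 = sse(rows1, ys, coefs1), sse(rows2, ys, coefs2)
print(coefs1, sse1)
print(coefs2, sse2)
```

The predicted values for each model are the row-by-coefficient sums computed inside `sse`; plotting them against the measured y gives the Predicted vs Measured chart described above.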

Nonlinear Regression
Fitting the parameters of the van der Waals equation of state
Data for SO2

$$P = \frac{RT}{V - b} - \frac{a}{V^2}$$

Find the values of a and b that give the best predictions for P, when compared to the measured values of P

Strategy for Nonlinear Regression

1) estimate initial values for a and b
2) compute predicted Ps using data for V and T
3) compute errors between predicted Ps and measured Ps
4) sum the squares of these errors to compute SSE
5) have the Solver minimize SSE
by adjusting the values of a and b
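The five steps above can be sketched in code. Here a few Gauss-Newton iterations stand in for the Solver (which uses its own optimization method), the P-V-T data are synthetic values generated from hypothetical "true" parameters rather than the SO2 measurements on the slides, and all names are illustrative. Units are assumed to be SI: P in Pa, V in m^3/mol, T in K.

```python
R = 8.314  # gas constant, J/(mol K)

def vdw_pressure(T, V, a, b):
    """van der Waals pressure; a = b = 0 reduces to the ideal gas law."""
    return R * T / (V - b) - a / V ** 2

def sse(data, a, b):
    """Steps 2 to 4: predict P, take errors, sum their squares."""
    return sum((P - vdw_pressure(T, V, a, b)) ** 2 for T, V, P in data)

def fit_vdw(data, a=0.0, b=0.0, iterations=50):
    """Step 5: adjust a and b to minimize SSE (Gauss-Newton, with b kept >= 0)."""
    v_min = min(V for _, V, _ in data)
    for _ in range(iterations):
        saa = sab = sbb = sar = sbr = 0.0
        for T, V, P in data:
            r = P - vdw_pressure(T, V, a, b)   # error for this data point
            ja = -1.0 / V ** 2                 # dP/da
            jb = R * T / (V - b) ** 2          # dP/db
            saa += ja * ja
            sab += ja * jb
            sbb += jb * jb
            sar += ja * r
            sbr += jb * r
        det = saa * sbb - sab * sab
        da = (sar * sbb - sab * sbr) / det     # 2x2 normal equations, Cramer's rule
        db = (saa * sbr - sab * sar) / det
        a += da
        b = min(max(b + db, 0.0), 0.9 * v_min)  # keep 0 <= b < V
        if abs(da) <= 1e-12 * abs(a) and abs(db) <= 1e-12 * abs(b):
            break
    return a, b

# Synthetic data from hypothetical "true" parameters (assumptions, not SO2 data).
A_TRUE, B_TRUE = 0.68, 5.6e-5
data = [(T, V, vdw_pressure(T, V, A_TRUE, B_TRUE))
        for T in (300.0, 400.0, 500.0)
        for V in (5e-4, 1e-3, 2e-3)]

# Step 1: initial estimates a = 0, b = 0 (the ideal gas law).
a_fit, b_fit = fit_vdw(data, a=0.0, b=0.0)
print(a_fit, b_fit)
print(sse(data, 0.0, 0.0), sse(data, a_fit, b_fit))
```

Starting from a = b = 0 means the first SSE printed is the ideal gas law's, so the two printed values show how much the fitted van der Waals parameters improve on it for this synthetic data.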

Basic data

Calculated Pressure by both ideal gas law and van der Waals

Ideal Gas Calculation
van der Waals Calculation
Error Calculation
Sum of Squares Calculation: sum of squares of this column

Setting up Solver Parameters

SSE as Target Cell
Minimize
by adjusting a and b
with b>=0 constraint

Results

[Figure: Fit of van der Waals Eqn for SO2 and Comparison to Ideal Gas Law; series: van der Waals, Ideal Gas; y-axis: Predicted Pressure (Pa), 0 to 12000000; x-axis: Measured Pressure (Pa), 0 to 12000000]

Note departure of ideal gas predictions at higher pressures
