CVE156 Chap5 Data Analysis Functions


CVE156

COMPUTER PROGRAMMING FOR CE

MDP_CVE156 1
DATA ANALYSIS
CHAPTER 5

Content

5.1 Data Analysis Functions


5.2 Statistical Analysis Functions
5.3 Analysis ToolPak

Objectives

1. Import, explore, clean, analyze, and visualize data using Data Analysis Functions.
2. Employ Excel built-in functions to obtain the theoretical parameters that give the best relationship between theory and experimental results.
3. Use the Analysis ToolPak to develop complex statistical or engineering analyses.
Data Analysis Functions
CHAPTER 5
DATA ANALYSIS FUNCTIONS
Excel has many built-in functions that are used to analyze data obtained from experimental results.

➢ Sort
➢ Filter
➢ Conditional Formatting
➢ Charts
➢ Pivot Tables
➢ Tables
➢ What-If Analysis
➢ Solver
➢ Analysis ToolPak
CVE 156_MDP
5.1.1 Sort
Sort your Excel data on one column or multiple columns. You can sort in
ascending or descending order.

One Column
To sort on one column, execute the following steps.

1. Click any cell in the column you want to sort.

Figure 4.1

2. To sort in ascending order, on the Data tab, in the Sort & Filter group, click AZ.

Note: To sort in descending order, click ZA.

Figure 4.2

Multiple Columns
To sort on multiple columns, execute the following steps.

1. On the Data tab, in the Sort & Filter group, click Sort.

The Sort dialog box appears.

2. Select Last Name from the 'Sort by' drop-down list.

3. Click on Add Level.
4. Select Sales from the 'Then by' drop-down list.

5. Click OK.
Result: Records are sorted by Last Name first and Sales second.
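
Outside Excel, the same two-level sort can be sketched in Python. The records below are hypothetical and only stand in for the Last Name and Sales columns of the worksheet; the field names `last_name` and `sales` are assumptions made for this illustration.

```python
from operator import itemgetter

# Hypothetical records standing in for the worksheet rows.
records = [
    {"last_name": "Smith", "sales": 300},
    {"last_name": "Jones", "sales": 150},
    {"last_name": "Smith", "sales": 120},
    {"last_name": "Brown", "sales": 410},
    {"last_name": "Jones", "sales": 90},
]

# Sort by Last Name first and Sales second (both ascending),
# mirroring the 'Sort by' and 'Then by' levels of the Sort dialog.
sorted_records = sorted(records, key=itemgetter("last_name", "sales"))

# reverse=True gives the descending (ZA) order on the same keys.
descending = sorted(records, key=itemgetter("last_name", "sales"), reverse=True)

for row in sorted_records:
    print(row["last_name"], row["sales"])
```

The sort key is a tuple, so ties on the first level are broken by the second level, which is what Add Level does in the dialog.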

5.1.2 What-If Analysis

What-If Analysis in Excel allows you to try out different values (scenarios)
for formulas. The following example helps you master what-if analysis
quickly and easily.

Assume you own a bookstore and have 100 books in storage. You sell a certain % for the highest price of $50 and the rest for the lower price of $20.

If you sell 60% for the highest price, cell D10 calculates a total profit of 60 * $50 + 40 * $20 = $3800.

Figure 4.3

Create Different Scenarios

But what if you sell 70% for the highest price? And what if you sell 80% for the
highest price? Or 90%, or even 100%? Each different percentage is a
different scenario. You can use the Scenario Manager to create these scenarios.

Note: You can simply type in a different percentage into cell C4 to see the
corresponding result of a scenario in cell D10. However, what-if analysis
enables you to easily compare the results of different scenarios.
1. On the Data tab, in the Forecast group, click What-If Analysis.

2. Click Scenario Manager.

The Scenario Manager dialog box appears.

3. Add a scenario by clicking on Add.

4. Type a name (60% highest), select cell C4 (% sold for the highest price)
for the Changing cells and click on OK.

5. Enter the corresponding value 0.6 and click on OK again.

Figure 4.3

6. Next, add 4 other scenarios (70%, 80%, 90% and 100%).
Finally, your Scenario Manager should be consistent with the picture
below:

Note: to see the result of a scenario, select the scenario and click on
the Show button. Excel will change the value of cell C4 accordingly for
you to see the corresponding result on the sheet.

Scenario Summary

To easily compare the results of these scenarios, execute the following steps.
1. Click the Summary button in the Scenario Manager.
2. Next, select cell D10 (total profit) for the result cell and click on OK.

Result:

Conclusion: If you sell 70% for the highest price, you obtain a total profit of $4100; if you sell 80% for the highest price, you obtain a total profit of $4400; and so on. That's how easy what-if analysis in Excel can be.
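
The scenario comparison can also be sketched in Python. This is a minimal illustration of the arithmetic behind cell D10, not a reproduction of the worksheet; the function name `total_profit` and the constant names are assumptions.

```python
BOOKS = 100       # books in storage
HIGH_PRICE = 50   # highest price in dollars
LOW_PRICE = 20    # lower price in dollars


def total_profit(share_high):
    """Total profit when the fraction share_high sells at the highest price."""
    sold_high = BOOKS * share_high
    sold_low = BOOKS - sold_high
    return sold_high * HIGH_PRICE + sold_low * LOW_PRICE


# One entry per scenario, as in the Scenario Manager.
scenarios = {
    "60% highest": 0.6,
    "70% highest": 0.7,
    "80% highest": 0.8,
    "90% highest": 0.9,
    "100% highest": 1.0,
}

# round() removes floating-point noise from the fractional shares.
summary = {name: round(total_profit(share)) for name, share in scenarios.items()}

for name, profit in summary.items():
    print(f"{name}: ${profit}")
```

Each dictionary entry plays the role of one scenario, and `summary` corresponds to the Scenario Summary sheet.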
5.1.3 Goal Seek
What if you want to know what percentage of the books you need to sell for the highest price to obtain a total profit of exactly $4700? You can use Excel's Goal Seek feature to find the answer.

1. On the Data tab, in the Forecast group, click What-If Analysis.

2. Click Goal Seek.

The Goal Seek dialog box appears.

3. Select cell D10.
4. Click in the 'To value' box and type 4700.
5. Click in the 'By changing cell' box and select cell C4.
6. Click OK.

Result. You need to sell 90% of the books for the highest price to obtain a total profit of exactly $4700.
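
The idea behind Goal Seek can be sketched with a simple bisection search. Excel's own algorithm is not described in these slides, so the bisection below is a swapped-in technique used only for illustration; `goal_seek_bisection` and `total_profit` are hypothetical names.

```python
def total_profit(share_high, books=100, high_price=50, low_price=20):
    """Total profit when the fraction share_high sells at the highest price."""
    sold_high = books * share_high
    return sold_high * high_price + (books - sold_high) * low_price


def goal_seek_bisection(func, target, low, high, tol=1e-9):
    """Find x in [low, high] with func(x) close to target.

    Assumes func is increasing on the interval and that the target
    lies between func(low) and func(high).
    """
    if not func(low) <= target <= func(high):
        raise ValueError("target is not reachable on this interval")
    while high - low > tol:
        mid = (low + high) / 2
        if func(mid) < target:
            low = mid    # the answer lies in the upper half
        else:
            high = mid   # the answer lies in the lower half
    return (low + high) / 2


share = goal_seek_bisection(total_profit, 4700, 0.0, 1.0)
print(f"Share to sell at the highest price: {share:.1%}")
```

The target value 4700 corresponds to the 'To value' box, and the search interval from 0 to 1 corresponds to the allowed values of the changing cell C4.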

Example 3: Goal seek

Quadratic Equation

A quadratic equation is of the form $ax^2 + bx + c = 0$ where $a \neq 0$. A quadratic equation can be solved by using the quadratic formula. You can also use Excel's Goal Seek feature to solve a quadratic equation.

4. You can use Excel's Goal Seek feature to obtain the exact same result. On
the Data tab, in the Forecast group, click What-If Analysis.

5. Click Goal Seek.

The Goal Seek dialog box appears.

6. Select cell B2.
7. Click in the 'To value' box and type 24.5
8. Click in the 'By changing cell' box and select cell A2.
9. Click OK.
Result.

Note: Excel returns the solution x = 5. Excel finds the other solution if you start with an x-value closer to x = -1. For example, enter the value 0 into cell A2 and repeat steps 5 to 9. To find the roots, set y = 0 and solve the quadratic equation $3x^2 - 12x + 9.5 = 0$. In this case, set 'To value' to 0.
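
The dependence on the starting value can be illustrated with a Newton-style iteration. The formula y = 3x² - 12x + 9.5 is inferred from the note above, since the first steps of the example do not survive in the text. Excel's actual Goal Seek algorithm may differ, and the starting values 4 and 0 are assumptions chosen for this sketch.

```python
def goal_seek(func, target, x0, tol=1e-9, max_iter=100, h=1e-6):
    """Adjust x, starting from x0, until func(x) is within tol of target."""
    x = x0
    for _ in range(max_iter):
        error = func(x) - target
        if abs(error) < tol:
            return x
        # Central-difference estimate of the slope at x.
        slope = (func(x + h) - func(x - h)) / (2 * h)
        if slope == 0:
            raise ZeroDivisionError("flat slope; try another starting value")
        x -= error / slope
    raise RuntimeError("goal seek did not converge")


def y(x):
    return 3 * x**2 - 12 * x + 9.5


# Different starting values lead to different solutions of y = 24.5.
root_high = goal_seek(y, 24.5, x0=4.0)
root_low = goal_seek(y, 24.5, x0=0.0)
print(round(root_high, 6), round(root_low, 6))
```

Starting exactly at x = 2 would fail in this sketch, because the parabola has zero slope at its vertex.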
Statistical Data
CHAPTER 5
5.2.1 STATISTICAL ANALYSIS FUNCTIONS

Excel has many built-in functions that are used to analyze data obtained from experimental results.

➢ Linear Regression
➢ Polynomial Regression
➢ Interpolation
➢ Statistical Data

5.2.2 LINEAR REGRESSION

Linear regression determines the straight line that most closely fits a set of data points, providing a linear relationship between two variables. The method used to obtain the line is called the least squares method. For a number (n) of data points (xi, yi), the fitted line is written as a straight-line equation:

y = Ax + B where:

A = slope of the line,

B = the intercept of the straight line with the Y-axis

The accuracy of the straight line over the data points is evaluated by a total deviation E, which is the sum of squares of the distances e between the data points and the fitted points:

$$E = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left(y_i - (A x_i + B)\right)^2$$

Figure 4.1

The $R^2$ value is also used to measure accuracy, through the relationship below, where $\bar{y}$ is the mean of the $y_i$ values:

$$R^2 = 1 - \frac{E}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

The $R^2$ value varies from 0 to 1. $R^2 = 1$ when the regression line coincides with the data. In polynomial regression, the higher the order of the polynomial, the closer the $R^2$ value is to 1.

Regression in Excel can be obtained using the TREND, SLOPE, INTERCEPT and LINEST functions, or with a Trendline, a regression line drawn through the data in an XY coordinate system. A Trendline is created by the following steps: right-click one of the data points in the chart to display a menu > click Add Trendline to display the Format Trendline dialog box > select Linear for linear regression.
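
As a cross-check on what SLOPE, INTERCEPT and the Trendline R-squared option report, the least squares line can be computed directly. The sketch below uses the standard closed-form least squares formulas; the data values are hypothetical and `linear_fit` is a name chosen for this illustration.

```python
def linear_fit(xs, ys):
    """Return slope A, intercept B and R^2 of the least squares line y = Ax + B."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Total deviation E: sum of squared distances to the fitted line.
    deviation = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    total = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - deviation / total
    return slope, intercept, r_squared


# Hypothetical measurements, not taken from the slides.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept, r_squared = linear_fit(xs, ys)
print(f"y = {slope:.3f}x + {intercept:.3f}, R^2 = {r_squared:.4f}")
```

The slope and intercept returned here play the roles of A and B in y = Ax + B.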

Example 1: Linear Regression

A laboratory test to determine soil shear strength gives the result shown in Figure 4.2. In general, the plotted test data rarely fall on a straight line that shows the failure envelope, so a fitted straight line is needed to represent the data.

Regression is used to obtain the shear strength parameters of the soil mass, which are cohesion (c) and shear angle (φ).

Figure 4.2:
Shear strength in this model is a linear function of the normal stress, where c and tan φ are, respectively, the Y-axis intercept and the slope of the line.

In a chart, a regression can be done with a Trendline by the following steps: right-click one of the data points on the chart to display a menu. From the menu select Add Trendline to display the Format Trendline dialog box > Linear > Backward (to extend the line to intercept the Y-axis) > click Display Equation on Chart. The result is as shown in Figure 4.2.

Figure 4.3

The values of A and B shown in the line equation y = Ax + B for the direct shear test are equal to those calculated with the following functions:

A = slope of the line = tan φ = SLOPE(Y,X)

B = Y-intercept = c = INTERCEPT(Y,X)

The shear angle φ can therefore be calculated with the ATAN function (in degrees):
=ATAN(SLOPE(Y,X))*180/PI()
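
The same calculation can be sketched in Python. The normal stress and shear strength values below are hypothetical (assumed to be in kPa) and are not the test data behind Figure 4.2; `slope_intercept` is a helper name chosen for this illustration.

```python
import math


def slope_intercept(xs, ys):
    """Least squares slope and intercept, as SLOPE(Y,X) and INTERCEPT(Y,X) return."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical direct shear results (kPa assumed).
normal_stress = [100, 200, 300]
shear_strength = [75, 130, 185]

tan_phi, cohesion = slope_intercept(normal_stress, shear_strength)

# Equivalent of =ATAN(SLOPE(Y,X))*180/PI()
phi_degrees = math.degrees(math.atan(tan_phi))

print(f"c = {cohesion:.1f}, tan(phi) = {tan_phi:.3f}, phi = {phi_degrees:.2f} degrees")
```

`math.degrees` performs the same radian-to-degree conversion as multiplying by 180/PI() in the worksheet formula.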

5.2.3 Polynomial Regression

Example 2. Polynomial Regression

Figure 4.4 shows two data series from the relationship between water density and temperature, where the X-coordinate is the temperature (℃) and the Y-coordinate is the water density (g/cm³). Plotting the Y-values against the X-values shows clearly that the result follows a curved trend.

Figure 4.4:

Figure 4.5: Regression analysis using Trendline for data series in Figure 4.4
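
A polynomial least squares fit can be sketched without Excel by solving the normal equations. The data below are synthetic, generated from a known quadratic so that the result can be checked, and are not the water density data of Figure 4.4. `polyfit` is a name chosen for this illustration.

```python
def polyfit(xs, ys, degree):
    """Least squares polynomial coefficients, lowest power first."""
    n = degree + 1
    # Normal equations (V^T V) a = V^T y, where V[i][j] = xs[i] ** j.
    ata = [[sum(x ** (r + c) for x in xs) for c in range(n)] for r in range(n)]
    aty = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for row in range(col + 1, n):
            factor = ata[row][col] / ata[col][col]
            for k in range(col, n):
                ata[row][k] -= factor * ata[col][k]
            aty[row] -= factor * aty[col]
    # Back substitution.
    coeffs = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(ata[row][k] * coeffs[k] for k in range(row + 1, n))
        coeffs[row] = (aty[row] - known) / ata[row][row]
    return coeffs


# Synthetic data from y = 1 + 2x + 3x^2.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1 + 2 * x + 3 * x ** 2 for x in xs]

coeffs = polyfit(xs, ys, degree=2)
print([round(c, 6) for c in coeffs])
```

Normal equations become poorly conditioned for high polynomial orders, so this sketch is best kept to low orders such as 2 or 3.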
5.2.4 INTERPOLATION

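$$y = y_1 + \frac{(x - x_1)(y_2 - y_1)}{x_2 - x_1} \tag{2.7}$$

[Figure: a straight line through $(x_1, y_1)$ and $(x_2, y_2)$ with an intermediate point $(x, y)$, showing the differences $x - x_1$, $x_2 - x_1$, $y - y_1$ and $y_2 - y_1$.]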
Example 3: Linear Interpolation
Given below are two data series, an X-array and a Y-array, for interpolation:

Table 4.1: Two Data Series for Linear Interpolation

Plotted in chart:

Figure 4.6

Worksheets below show formulas to obtain the interpolation of the data in
Table 4.1, using TREND function and Equation 2.7, respectively.

Using TREND function:

Figure 4.7
Figure 4.8

The interpolated values are in column D and correspond to the new values entered in column C. The y-value that corresponds to an x-value between 0 and 1 in cell C3 is shown in cell D3; the y-value that corresponds to an x-value between 1 and 2 in cell C4 is shown in cell D4, and so forth.

The TREND function and Equation 2.7 can also be used to interpolate x-values from known y-values, by swapping the variables x and y in Equation 2.7, or by switching the x- and y-arguments in the TREND function. The formulas are shown in the worksheet above at row 12, columns C and D.
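
Linear interpolation between tabulated points can be sketched as follows. The table values are hypothetical and do not reproduce Table 4.1; the function names are assumptions made for this illustration.

```python
def interpolate(x, x1, y1, x2, y2):
    """Linear interpolation between the points (x1, y1) and (x2, y2)."""
    return y1 + (x - x1) * (y2 - y1) / (x2 - x1)


def interpolate_table(x, xs, ys):
    """Interpolate y at x from tabulated data; xs must be in ascending order."""
    points = list(zip(xs, ys))
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        if x1 <= x <= x2:
            return interpolate(x, x1, y1, x2, y2)
    raise ValueError("x lies outside the tabulated range")


# Hypothetical X-array and Y-array.
xs = [0, 1, 2, 3]
ys = [0.0, 2.0, 3.0, 7.0]

y_at_half = interpolate_table(0.5, xs, ys)   # x between 0 and 1
y_at_1_5 = interpolate_table(1.5, xs, ys)    # x between 1 and 2

# Swapping the arrays interpolates x from a known y, provided
# the y-values are also in ascending order.
x_at_2_5 = interpolate_table(2.5, ys, xs)

print(y_at_half, y_at_1_5, x_at_2_5)
```

Swapping the two arrays in the last call mirrors switching the x- and y-arguments of the TREND function.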

5.2.5 Histogram and Cumulative Distribution

In statistical data presentation, it is convenient to arrange observed data into a histogram to see how the data are distributed. In a histogram, the observed data are plotted against their frequency distribution, which gives a quick visual summary of the observed data.

Example 4: Histogram and Cumulative Distribution
The following concrete compressive strength data were selected randomly from 40 samples obtained from a test of the characteristic compressive strength of concrete:

Table 4.2

Table 4.3: Data distribution to create a histogram

The histogram chart in Figure 4.9 is based on the data in Table 4.3. To create a histogram, take the following steps: click the Insert tab > Column > Clustered Column > Select Data > Add > Series Values: select range "Percentage" > OK. Click Edit for Horizontal Axis Labels > insert range "Mid Point".

To create the cumulative frequency chart in Figure 4.10, select the XY scatter chart type with smooth lines and markers > Select Data > Add > Series X-values: select range "Mid Point" > Y-values: select range "Cumulative".
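
The "Mid Point", "Percentage" and "Cumulative" columns used in these steps can be computed as sketched below. The strength values and the class limits are hypothetical and are not the 40 samples of Table 4.2; `frequency_table` is a name chosen for this illustration.

```python
def frequency_table(values, lower, width, n_bins):
    """Return (mid_point, count, percentage, cumulative) for each class.

    Assumes every value lies between lower and lower + n_bins * width.
    """
    counts = [0] * n_bins
    for value in values:
        if value < lower or value > lower + n_bins * width:
            raise ValueError(f"{value} lies outside the class limits")
        # A value on the top limit goes into the last class.
        index = min(int((value - lower) // width), n_bins - 1)
        counts[index] += 1
    rows = []
    cumulative = 0.0
    for i, count in enumerate(counts):
        percentage = 100.0 * count / len(values)
        cumulative += percentage
        mid_point = lower + (i + 0.5) * width
        rows.append((mid_point, count, percentage, cumulative))
    return rows


# Hypothetical compressive strengths (MPa assumed).
strengths = [21, 23, 24, 26, 27, 27, 28, 31, 33, 34]

table = frequency_table(strengths, lower=20, width=5, n_bins=3)
for mid_point, count, percentage, cumulative in table:
    print(f"{mid_point:5.1f} {count:3d} {percentage:6.1f} {cumulative:6.1f}")
```

Plotting the percentage against the mid point gives the histogram, and plotting the cumulative column against the mid point gives the cumulative frequency curve.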

Figure 4.9: Histogram of concrete compressive strength

Figure 4.10: Cumulative frequency of concrete compressive strength

Range, Mean and Standard Deviation

Analysis ToolPak
CHAPTER 5
5.3.1 Analysis ToolPak

The Analysis ToolPak is an Excel add-in program that provides data analysis
tools for financial, statistical and engineering data analysis.

To load the Analysis ToolPak add-in, execute the following steps.

1. On the File tab, click Options.

2. Under Add-ins, select Analysis ToolPak and click on the Go button.

3. Check Analysis ToolPak and click on OK.

4. On the Data tab, in the Analysis group, you can now click on Data Analysis.

The following dialog box appears.

5. For example, select Histogram and click OK to create a histogram in Excel.

5.3.2 Descriptive statistics

You can use the Analysis ToolPak add-in to generate descriptive statistics. For example, you may have the scores of 14 participants for a test.

To generate descriptive statistics for these scores, execute the
following steps.

1. On the Data tab, in the Analysis group, click Data Analysis.

2. Select Descriptive Statistics and click OK.

3. Select the range A2:A15 as the Input Range.
4. Select cell C1 as the Output Range.
5. Make sure Summary statistics is checked.

6. Click OK.
Result:
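
Several of the summary statistics can be reproduced with Python's standard `statistics` module. The scores below are hypothetical, since the 14 scores of the worksheet are not listed in the text, and the dictionary keys are labels chosen for this sketch.

```python
import statistics

# Hypothetical test scores, not the worksheet data.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

summary = {
    "Mean": statistics.mean(scores),
    "Median": statistics.median(scores),
    "Mode": statistics.mode(scores),
    # Sample (n - 1) standard deviation and variance.
    "Standard Deviation": statistics.stdev(scores),
    "Sample Variance": statistics.variance(scores),
    "Range": max(scores) - min(scores),
    "Minimum": min(scores),
    "Maximum": max(scores),
    "Sum": sum(scores),
    "Count": len(scores),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

`statistics.stdev` and `statistics.variance` use the n - 1 denominator; `statistics.pstdev` would give the population value instead.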

5.3.3 Regression
This example teaches you how to run a linear regression analysis in Excel and how to
interpret the Summary Output.

Below you can find our data. The big question is: is there a relation between Quantity Sold (Output) and Price and Advertising (Input)? In other words: can we predict Quantity Sold if we know Price and Advertising?

1. On the Data tab, in the Analysis group, click Data Analysis.

2. Select Regression and click OK.

3. Select the Y Range (A1:A8). This is the response variable (also called the dependent variable).

4. Select the X Range (B1:C8). These are the explanatory variables (also called independent variables). These columns must be adjacent to each other.

5. Check Labels.

6. Click in the Output Range box and select cell A11.

7. Check Residuals.

8. Click OK.

Excel produces the following Summary Output (rounded to 3 decimal places).

R Square

R Square equals 0.962, which is a very good fit: 96% of the variation in Quantity Sold is explained by the independent variables Price and Advertising. The closer R Square is to 1, the better the regression line fits the data.

Significance F and P-values

To check if your results are reliable (statistically significant), look at Significance F (0.001). If this value is less than 0.05, the results are statistically significant. If Significance F is greater than 0.05, it's probably better to stop using this set of independent variables. Delete a variable with a high P-value (greater than 0.05) and rerun the regression until Significance F drops below 0.05. Most or all P-values should be below 0.05. In our example this is the case (0.000, 0.001 and 0.005).

Coefficients

The regression line is: y = Quantity Sold = 8536.214 - 835.722 * Price + 0.592 * Advertising. In other words, for each unit increase in Price, Quantity Sold decreases by 835.722 units. For each unit increase in Advertising, Quantity Sold increases by 0.592 units. This is valuable information.

You can also use these coefficients to do a forecast. For example, if Price equals $4 and Advertising equals $3000, you might be able to achieve a Quantity Sold of 8536.214 - 835.722 * 4 + 0.592 * 3000 ≈ 6970.

Residuals

The residuals show you how far the actual data points are from the predicted data points (using the equation). For example, the first data point equals 8500. Using the equation, the predicted data point equals 8536.214 - 835.722 * 2 + 0.592 * 2800 = 8523.009, giving a residual of 8500 - 8523.009 = -23.009. Excel evaluates this with the unrounded coefficients; the coefficients rounded to 3 decimal places give about 8522.370.
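
The forecast and the residual can be reproduced with a short Python sketch. It uses the coefficients as rounded in the Summary Output, so its results differ slightly from the values Excel reports from unrounded coefficients. The names `predict_quantity` and the constants are assumptions made for this illustration.

```python
# Coefficients as rounded in the Summary Output.
INTERCEPT = 8536.214
PRICE_COEF = -835.722
ADVERTISING_COEF = 0.592


def predict_quantity(price, advertising):
    """Predicted Quantity Sold from the fitted regression line."""
    return INTERCEPT + PRICE_COEF * price + ADVERTISING_COEF * advertising


# Forecast for Price = 4 and Advertising = 3000.
forecast = predict_quantity(4, 3000)

# Residual of the first data point: actual minus predicted.
first_actual = 8500
first_predicted = predict_quantity(2, 2800)
first_residual = first_actual - first_predicted

print(f"Forecast: {forecast:.3f}")
print(f"Predicted: {first_predicted:.3f}, residual: {first_residual:.3f}")
```

A negative residual means the regression line overestimates that data point.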

You can also create a scatter plot of these residuals.

