CVE156 Chap5 Data Analysis Functions
COMPUTER
PROGRAMMING FOR CE
MDP_CVE156 1
DATA ANALYSIS
CHAPTER 5
Content
Objectives
One Column
To sort on one column, execute the
following steps.
Figure 4.1
2. To sort in ascending order, on the Data tab, in the Sort & Filter group,
click AZ.
Figure 4.2
Multiple Columns
To sort on multiple columns, execute the following steps.
1. On the Data tab, in the Sort & Filter group, click Sort.
3. Click on Add Level.
4. Select Sales from the 'Then by' drop-
down list.
5. Click OK.
Result. Records are sorted by Last Name
first and Sales second.
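The same two-level sort can be sketched outside Excel. This is a minimal Python illustration, assuming a small hypothetical set of records (the names and sales figures are made up, not taken from the worksheet); the tuple key plays the role of the 'Sort by' and 'Then by' levels.

```python
# Hypothetical records; the names and sales figures are illustrative only.
records = [
    {"last_name": "Smith", "sales": 300},
    {"last_name": "Jones", "sales": 500},
    {"last_name": "Smith", "sales": 150},
    {"last_name": "Brown", "sales": 400},
]

# Sort by Last Name first, then by Sales (both ascending),
# mirroring the 'Sort by' and 'Then by' levels of the Sort dialog.
sorted_records = sorted(records, key=lambda r: (r["last_name"], r["sales"]))

for r in sorted_records:
    print(r["last_name"], r["sales"])
```

Records that share a Last Name stay grouped together and are ordered by Sales within the group.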
5.1.2 What-If Analysis
What-If Analysis in Excel allows you to try out different values (scenarios)
for formulas. The following example helps you master what-if analysis
quickly and easily.
Assume you own a bookstore and have 100 books in storage. You
sell a certain % for the highest price of $50 and the rest for the
lower price of $20.
Figure 4.3
Create Different Scenarios
But what if you sell 70% for the highest price? And what if you sell 80% for the
highest price? Or 90%, or even 100%? Each different percentage is a
different scenario. You can use the Scenario Manager to create these scenarios.
Note: You can simply type in a different percentage into cell C4 to see the
corresponding result of a scenario in cell D10. However, what-if analysis
enables you to easily compare the results of different scenarios.
1. On the Data tab, in the Forecast group, click What-If Analysis.
2. Click Scenario Manager.
4. Type a name (60% highest), select cell C4 (% sold for the highest price)
for the Changing cells and click on OK.
Figure 4.3
6. Next, add 4 other scenarios (70%, 80%, 90% and 100%).
Finally, your Scenario Manager should be consistent with the picture
below:
Note: to see the result of a scenario, select the scenario and click on
the Show button. Excel will change the value of cell C4 accordingly for
you to see the corresponding result on the sheet.
Scenario Summary
Result:
Conclusion: If you sell 70% for the highest price, you obtain a total
profit of $4100, if you sell 80% for the highest price, you obtain a total
profit of $4400, etc. That's how easy what-if analysis in Excel can be.
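The scenario comparison can be reproduced with a short calculation. The sketch below assumes that every book not sold at the highest price is sold at the lower price, which is consistent with the $4100 and $4400 totals quoted above; the function and constant names are illustrative.

```python
TOTAL_BOOKS = 100
HIGH_PRICE = 50  # dollars
LOW_PRICE = 20   # dollars


def total_profit(share_high):
    """Total profit when the fraction share_high sells at the highest price.

    Assumption: the remaining books all sell at the lower price.
    """
    high = TOTAL_BOOKS * share_high * HIGH_PRICE
    low = TOTAL_BOOKS * (1 - share_high) * LOW_PRICE
    return high + low


# One line per scenario, as in the Scenario Summary.
for pct in (60, 70, 80, 90, 100):
    print(f"{pct}% highest: ${total_profit(pct / 100):.0f}")
```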
5.1.3 Goal Seek
What if you want to know how many books you need to sell for the highest price, to
obtain a total profit of exactly $4700? You can use Excel's Goal Seek feature to find the
answer.
3. Select cell D10.
4. Click in the 'To value' box and type 4700.
5. Click in the 'By changing cell' box and select cell C4.
6. Click OK.
Result. You need to sell 90% of the books for the highest price to obtain a total profit of exactly $4700.
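Goal Seek searches for the input that makes a formula hit a target value. The sketch below illustrates the idea with bisection, which is a stand-in technique and not necessarily the method Excel uses internally. It assumes the bookstore model in which unsold-at-highest-price books go for the lower price; `goal_seek` and its parameters are hypothetical names.

```python
def total_profit(share_high, books=100, high_price=50, low_price=20):
    """Bookstore profit; the remainder is assumed to sell at the lower price."""
    return books * (share_high * high_price + (1 - share_high) * low_price)


def goal_seek(func, target, lo, hi, tol=1e-9, max_iter=200):
    """Find x in [lo, hi] with func(x) close to target, by bisection.

    Assumes func(x) - target changes sign once on the interval.
    """
    f_lo = func(lo) - target
    f_hi = func(hi) - target
    if f_lo * f_hi > 0:
        raise ValueError("target is not bracketed by [lo, hi]")
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        f_mid = func(mid) - target
        if abs(f_mid) < tol or (hi - lo) / 2 < tol:
            return mid
        if f_lo * f_mid <= 0:
            hi = mid            # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid  # root lies in the upper half
    return (lo + hi) / 2


share = goal_seek(total_profit, 4700, 0.0, 1.0)
print(f"Share sold at the highest price: {share:.0%}")
```

The interval [0, 1] plays the role of the changing cell C4, and 4700 is the 'To value'.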
Example 3: Goal seek
Quadratic Equation
4. You can use Excel's Goal Seek feature to obtain the exact same result. On
the Data tab, in the Forecast group, click What-If Analysis.
6. Select cell B2.
7. Click in the 'To value' box and type 24.5
8. Click in the 'By changing cell' box and select cell A2.
9. Click OK.
Result.
➢ Linear Regression
➢ Polynomial Regression
➢ Interpolation
➢ Statistical Data
5.2.2 Linear Regression
Linear regression determines the straight line that most closely fits a set
of data points, giving a linear relationship between two variables. The line
is obtained by the least squares method. For a number (n) of data points
(xi, yi), the fitted line is written as a straight-line equation:
y = Ax + B where:
Figure 4.1
The R² value is also used to measure the accuracy of the fit, through the
relationship below:
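A minimal sketch of the least squares fit follows. It assumes the standard closed-form expressions A = (nΣxy − ΣxΣy) / (nΣx² − (Σx)²) and B = (Σy − AΣx) / n, and takes R² = 1 − SS_res / SS_tot; the data points and the name `linear_fit` are illustrative, not from the course material.

```python
def linear_fit(xs, ys):
    """Return (A, B, R2) for the least squares line y = A*x + B."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)

    # Closed-form least squares coefficients.
    a = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    b = (sum_y - a * sum_x) / n

    # Coefficient of determination, assumed here as 1 - SS_res / SS_tot.
    mean_y = sum_y / n
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1 - ss_res / ss_tot
    return a, b, r2


# Hypothetical data points.
xs = [1, 2, 3, 4]
ys = [2.1, 3.9, 6.2, 7.8]
A, B, R2 = linear_fit(xs, ys)
print(f"A = {A:.3f}, B = {B:.3f}, R2 = {R2:.4f}")
```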
Regression in Excel can be obtained with the TREND, SLOPE, INTERCEPT and
LINEST functions, or with a Trendline, a regression line fitted to data
plotted in an XY coordinate system. A Trendline is created by the
following steps: right-click one of the data points in the chart to
display the context menu > click Add Trendline to display the Format
Trendline dialog box > select Linear for linear regression.
Example 1: Linear Regression
The soil shear strength determined from a laboratory test gives the result
shown in Figure 4.2. In general, the data generated from the test rarely plot as a
straight line showing the failure envelope, so a fitted straight line is needed to
represent the data.
Regression is used to obtain the shear strength parameters of the soil mass, which
are cohesion (c) and shear angle (φ).
Figure 4.2:
Shear strength in this model is a linear
function of the normal stress, where c
and tan φ are the Y-axis intercept and
the slope of the line, respectively.
Figure 4.3
The values of A and B shown in the line equation y = Ax + B for the direct shear
test are equal to those calculated with the SLOPE and INTERCEPT functions.
The shear angle φ can therefore be calculated with the ATAN function (converted to degrees):
=ATAN(SLOPE(Y,X))*180/PI()
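The calculation can be sketched in Python as follows. The stress values are hypothetical (they are not the test data from the figure), and `slope_intercept` is an illustrative helper standing in for Excel's SLOPE and INTERCEPT.

```python
import math

# Hypothetical direct shear results in kPa; not the data from the slides.
normal_stress = [50.0, 100.0, 150.0]
shear_stress = [40.0, 65.0, 90.0]


def slope_intercept(xs, ys):
    """Least squares slope and intercept of y against x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# The slope is tan(phi) and the intercept is the cohesion c.
tan_phi, cohesion = slope_intercept(normal_stress, shear_stress)

# Same conversion as =ATAN(SLOPE(Y,X))*180/PI().
phi_deg = math.degrees(math.atan(tan_phi))

print(f"c = {cohesion:.1f} kPa, phi = {phi_deg:.2f} degrees")
```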
5.2.3 Polynomial Regression
Example 2. Polynomial Regression
Figure 4.4:
Figure 4.5
Linear interpolation of the point (x, y) between (x_1, y_1) and (x_2, y_2):
$$\frac{y - y_1}{x - x_1} = \frac{y_2 - y_1}{x_2 - x_1} \qquad (2.7)$$
Example 3: Linear Interpolation
Given below are two data series, the X and Y arrays, for interpolation:
Plotted in chart:
Figure 4.6
The worksheets below show the formulas used to interpolate the data in
Table 4.1 with the TREND function and with Equation 2.7, respectively.
Figure 4.7
Figure 4.8
The interpolated values are in column D and correspond to the new values
entered in column C. The y-value that corresponds to an x-value lying between
0 and 1 in cell C3 is shown in cell D3; the y-value that corresponds to an x-value
lying between 1 and 2 in cell C4 is shown in cell D4, and so forth.
The TREND function and Equation 2.7 can also be used to interpolate x-values from
known y-values, by swapping the variables x and y in Equation 2.7, or by
switching the x and y arguments in the TREND function. The formulas are
shown in the worksheet above at row 12, columns C and D.
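The interpolation can be sketched in Python using the relation y = y1 + (x − x1)(y2 − y1) / (x2 − x1). The data series below is hypothetical (Table 4.1 is not reproduced here), and the function names are illustrative. Swapping the two series interpolates x from a known y, which assumes the y-values are strictly increasing.

```python
def interpolate(x, x1, y1, x2, y2):
    """Linear interpolation between the points (x1, y1) and (x2, y2)."""
    return y1 + (x - x1) * (y2 - y1) / (x2 - x1)


def interpolate_series(x, xs, ys):
    """Piecewise linear interpolation; xs must be strictly increasing."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the data range")
    for i in range(len(xs) - 1):
        # Use the pair of neighbouring points that brackets x.
        if xs[i] <= x <= xs[i + 1]:
            return interpolate(x, xs[i], ys[i], xs[i + 1], ys[i + 1])


# Hypothetical data series.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 2.0, 3.0, 7.0]

y_new = interpolate_series(1.5, xs, ys)  # y for a known x
x_new = interpolate_series(5.0, ys, xs)  # x for a known y (series swapped)
print(y_new, x_new)
```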
5.2.5 Histogram and Cumulative Distribution
In presenting statistical data, it is convenient to arrange the observed data into a
histogram to see how the data are distributed. In a histogram, the observed data are
plotted against their frequency distribution, which gives a visual summary and a quick
impression of the observed data.
Example 4 Histogram and Cumulative Distribution
The following concrete compressive strength data were selected randomly from 40 samples
obtained in the characteristic compressive strength test of concrete:
Table 4.2
Table 4.3: Data distribution to create a histogram
The histogram chart in Figure 4.9 is based on the data in Table 4.3. To create a
histogram, take the following steps: click the Insert tab > Column > Clustered
Column > Select Data > Add > Series Values: select range "Percentage" > OK.
Click Edit for Horizontal Axis Labels > insert range "Mid Point".
To create the cumulative frequency chart in Figure 4.10, select the XY scatter
chart type with smooth lines and markers > Select Data > Add > Series X-values:
select range "Mid Point" > Y-values: select range "Cumulative".
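The "Mid Point", "Percentage" and "Cumulative" columns behind these charts can be sketched in Python. The strength values and the bin layout below are hypothetical (they are not the 40 samples of Table 4.2), and equal-width bins are an assumption.

```python
from itertools import accumulate

# Hypothetical compressive strengths in MPa; not the data of Table 4.2.
strengths = [22.5, 24.1, 25.3, 26.8, 27.2, 28.9, 29.4, 31.0, 32.6, 34.7]

# Assumed equal-width bins: 20-25, 25-30, 30-35.
bin_start, bin_width, n_bins = 20.0, 5.0, 3

# Count how many observations fall into each bin.
counts = [0] * n_bins
for s in strengths:
    index = int((s - bin_start) // bin_width)
    counts[index] += 1

mid_points = [bin_start + (i + 0.5) * bin_width for i in range(n_bins)]
percentages = [100 * c / len(strengths) for c in counts]
cumulative = list(accumulate(percentages))

for row in zip(mid_points, counts, percentages, cumulative):
    print(row)
```

The percentages plotted against the mid points give the histogram, and the running total gives the cumulative frequency curve.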
Figure 4.9: Histogram of concrete compressive strength
Figure 4.10: Cumulative frequency of concrete compressive strength
Range, Mean and Standard Deviation
Analysis ToolPak
CHAPTER 5
5.3.1 Analysis ToolPak
The Analysis ToolPak is an Excel add-in program that provides data analysis
tools for financial, statistical and engineering data analysis.
2. Under Add-ins, select Analysis ToolPak and click on the Go button.
3. Check Analysis ToolPak and click on OK.
4. On the Data tab, in the Analysis group, you can now click on Data Analysis.
5.3.2 Descriptive statistics
To generate descriptive statistics for these scores, execute the
following steps.
3. Select the range A2:A15 as the Input Range.
4. Select cell C1 as the Output Range.
5. Make sure Summary statistics is checked.
6. Click OK.
Result:
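A comparable summary can be sketched with Python's standard `statistics` module. The 14 scores below are hypothetical stand-ins for the range A2:A15, and the sample (n − 1) forms of standard deviation and variance are used on the assumption that this matches the Summary statistics output.

```python
import statistics

# Hypothetical scores standing in for the 14 values in A2:A15.
scores = [82, 93, 91, 69, 96, 61, 92, 75, 88, 79, 85, 72, 90, 64]

summary = {
    "Mean": statistics.mean(scores),
    "Median": statistics.median(scores),
    "Standard Deviation": statistics.stdev(scores),  # sample form (n - 1)
    "Sample Variance": statistics.variance(scores),
    "Range": max(scores) - min(scores),
    "Minimum": min(scores),
    "Maximum": max(scores),
    "Sum": sum(scores),
    "Count": len(scores),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```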
5.3.3 Regression
This example teaches you how to run a linear regression analysis in Excel and how to
interpret the Summary Output.
Below you can find our data. The big question is: is there a relation between Quantity Sold
(Output) and Price and Advertising (Input)? In other words: can we predict Quantity Sold if
we know Price and Advertising?
1. On the Data tab, in the Analysis group, click Data Analysis.
3. Select the Y Range (A1:A8). This is the response variable (also called the
dependent variable).
5. Check Labels.
7. Check Residuals.
8. Click OK.
Excel produces the following Summary Output (rounded to 3 decimal places).
R Square
Significance F and P-values
Coefficients
The regression line is: y = Quantity Sold = 8536.214 - 835.722 * Price + 0.592 *
Advertising. In other words, for each unit increase in Price, Quantity Sold decreases
by 835.722 units. For each unit increase in Advertising, Quantity Sold increases
by 0.592 units. This is valuable information.
You can also use these coefficients to make a forecast. For example, if Price equals $4
and Advertising equals $3000, you might be able to achieve a Quantity Sold of
8536.214 - 835.722 * 4 + 0.592 * 3000 ≈ 6970.
Residuals
The residuals show you how far the actual data points are from the
predicted data points (using the equation). For example, the first data point
equals 8500. Using the equation, the predicted data point equals 8536.214
- 835.722 * 2 + 0.592 * 2800 ≈ 8523.009, giving a residual of 8500 -
8523.009 = -23.009.
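The forecast and the residual can be checked with a short sketch that uses the coefficients rounded to 3 decimal places. Because of that rounding, the results differ slightly from the worksheet values quoted above, which presumably come from unrounded coefficients; the function name is illustrative.

```python
# Coefficients from the Summary Output, rounded to 3 decimal places.
INTERCEPT = 8536.214
COEF_PRICE = -835.722
COEF_ADVERTISING = 0.592


def predict_quantity(price, advertising):
    """Predicted Quantity Sold from the fitted regression line."""
    return INTERCEPT + COEF_PRICE * price + COEF_ADVERTISING * advertising


# Forecast for Price = $4 and Advertising = $3000.
forecast = predict_quantity(4, 3000)

# Residual of the first data point: actual minus predicted.
residual = 8500 - predict_quantity(2, 2800)

print(f"forecast = {forecast:.3f}, residual = {residual:.3f}")
```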
You can also create a scatter plot of these residuals.