FDA Session 5
Session: 5
Data Analysis
Explain Analysis ToolPak
Explain single factor and two factor tool with and without replication
Describe Correlation and Covariance
Explain F-Test two-sample for variances
Describe Histogram and moving average
Explain Random number generation
Describe Goal seek and Solver for what-if analysis
Explain one-input and two-input data table and scenario
Describe Regression, Sampling, Rank, and Percentile
Describe Exponential Smoothing
analysis.
To enable Analysis ToolPak, perform the following steps:
1. Click File → Options. The Excel Options dialog box opens.
2. Select Add-Ins from the left pane. The View and manage Microsoft Office Add-ins pane
is displayed.
4. Select Analysis ToolPak option from the Inactive Application Add-ins list and click Go.
following figure:
6. After activation, the Analysis ToolPak can be accessed under the Data tab in the
Analysis group as shown in the following figure:
Following table lists some of these tools along with their description:
Tool | Description
Anova: Single Factor, Anova: Two-Factor with Replication, Anova: Two-Factor without Replication | Determines if there is a statistically significant difference between two or more given sets of data by analyzing their variances.
Correlation | Computes correlation coefficients for two or more sets of data.
Covariance | Computes pair-wise covariance coefficients for two or more sets of data.
Descriptive Statistics | Generates a report of univariate statistics for a given set of data.
F-Test Two-Sample for Variances | Tests whether the variances of the two given sets of data differ or not.
Fourier Analysis | Computes the Discrete Fourier Transform and its inverse for a given set of data.
Histogram | Creates a histogram for a given set of data.
Moving Average | Computes moving averages for a given set of data.
Rank and Percentile | Computes rank and percentile for the values of a given set of data.
z-Test: Two Sample for Means | Performs z-Test on two sets of data with known variances.
the following figure:
To calculate the variance, perform the following steps:
1. Click Data → Data Analysis from the Analysis group. The Data Analysis dialog box
appears.
2. Select Anova: Single Factor from the Analysis Tools list. The Anova: Single Factor dialog
box is displayed.
Following figure shows the report generated by Anova: Single Factor tool:
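The calculation behind the Anova: Single Factor report can be sketched in Python. This is a minimal illustration, not the tool's own implementation: the function name `one_way_anova` and the three small samples are hypothetical and do not come from the worksheet in the figure.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F statistic, df between groups, df within groups)."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-groups sum of squares: distance of each group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: spread of values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical samples, one list per group (assumed data).
course_a = [1, 2, 3]
course_b = [2, 3, 4]
course_c = [3, 4, 5]
f_stat, df_between, df_within = one_way_anova([course_a, course_b, course_c])
print(f_stat, df_between, df_within)
```

A large F value relative to the critical value suggests that at least one group mean differs from the others. The sketch does not compute the p-value that Excel also reports.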
similar to Anova: Two-Factor Without Replication.
Anova: Two-Factor Without Replication tool deals with data sets having one sample
for each combination.
Anova: Two-Factor With Replication tool deals with data sets having more than one
sample for each combination.
Consider the revised student count details as shown in the following figure:
1. Click Data → Data Analysis from the Analysis group. The Data Analysis dialog box is
displayed.
2. Select Anova: Two-Factor With Replication from the Analysis Tools list and click OK. The
Anova: Two-Factor With Replication dialog box is displayed.
3. Specify the Input Range, Rows per sample, and Output options as shown in the following
figure and click OK.
tool:
using worksheet functions such as CORREL and PEARSON.
Example: Consider the production and sales data as shown in the following figure:
To calculate the correlation coefficient of production and sales variables, perform
the following steps:
1. Click Data → Data Analysis from the Analysis group. The Data Analysis dialog box is
displayed.
2. Select Correlation from the Analysis Tools list and click OK. The Correlation dialog box is
displayed.
OK.
Following figure shows the report generated by Correlation tool:
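The correlation coefficient produced by the Correlation tool (and by CORREL or PEARSON) can be sketched as follows. The production and sales numbers are hypothetical stand-ins for the worksheet data, and `pearson` is an illustrative helper name.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical production and sales figures (assumed data).
production = [100, 200, 300, 400]
sales = [90, 210, 290, 410]
r = pearson(production, sales)
print(round(r, 4))
```

A value close to +1 indicates that the two variables rise together, a value close to -1 that one falls as the other rises, and a value near 0 that there is little linear relationship.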
verify if the two measurement variables have a tendency to move together.
For example, consider the product production and sales data as shown in the
following figure:
To calculate the covariance of production and sales variables, perform the following
steps:
1. Click Data → Data Analysis from the Analysis group. The Data Analysis dialog box is
displayed.
2. Select Covariance from the Analysis Tools list and click OK. The Covariance dialog
box is displayed.
Following figure shows the report generated by Covariance tool:
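The covariance of two variables can be sketched in Python as shown below. The sketch divides by n (population covariance), which is assumed here to correspond to what the Covariance tool reports; the data values and the helper name `population_covariance` are hypothetical.

```python
def population_covariance(xs, ys):
    """Average product of deviations from the means (divides by n)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

# Hypothetical production and sales figures (assumed data).
production = [100, 200, 300, 400]
sales = [90, 210, 290, 410]
cov = population_covariance(production, sales)
print(cov)
```

A positive result indicates that the two variables tend to move in the same direction. Unlike correlation, covariance depends on the units of the data.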
variances are different or not.
For example, consider the product sales data as shown in the following figure:
To perform F-Test Two-Sample for Variances of production and sales variables,
perform the following steps:
1. Click Data → Data Analysis from the Analysis group. The Data Analysis dialog box is
displayed.
2. Select F-Test Two-Sample for Variances from the Analysis Tools list and click OK. The
F-Test Two-Sample for Variances dialog box is displayed.
Following figure shows the report generated by F-Test Two-Sample for Variances
tool:
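The F statistic at the centre of this report is the ratio of the two sample variances. A minimal Python sketch, using hypothetical data and an illustrative helper name `f_statistic`, is shown below. It leaves out the p-value and the critical value that the tool also reports.

```python
from statistics import variance

def f_statistic(sample_1, sample_2):
    """Ratio of the sample variances (n - 1 in the denominator) of two samples."""
    return variance(sample_1) / variance(sample_2)

# Hypothetical samples (assumed data).
variable_1 = [2, 4, 6, 8]
variable_2 = [1, 2, 3, 4]
f_value = f_statistic(variable_1, variable_2)
print(f_value)
```

An F value far from 1 suggests that the two variances differ.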
the number of measurements that belong to several intervals.
For example, consider the marks of three medical students in various subjects as
shown in the following figure:
To create a histogram of the data, perform the following steps:
1. Click Data → Data Analysis from the Analysis group. The Data Analysis dialog box is
displayed.
2. Select Histogram from the Analysis Tools list and click OK. The Histogram dialog box is
displayed.
checkbox as shown in the following figure and click OK.
Following figure shows the histogram generated by the Histogram tool:
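The frequency counts behind a histogram can be sketched in Python. The sketch assumes that a value is counted in the first bin whose upper boundary is greater than or equal to it, with a final "More" bucket for anything above the last boundary. The marks, bin boundaries, and the name `histogram_counts` are hypothetical.

```python
from bisect import bisect_left

def histogram_counts(values, bin_limits):
    """Count values per bin; the extra last slot is the 'More' bucket."""
    limits = sorted(bin_limits)
    counts = [0] * (len(limits) + 1)
    for value in values:
        # Index of the first limit that is >= value (len(limits) if none is).
        counts[bisect_left(limits, value)] += 1
    return counts

# Hypothetical marks and bin boundaries (assumed data).
marks = [35, 50, 62, 75, 75, 88, 91]
bins = [50, 75, 100]
counts = histogram_counts(marks, bins)
print(counts)
```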
For example, consider five years of student enrolment details for a certification
course as shown in the following figure:
To display the moving average of the data, perform the following steps:
1. Click Data → Data Analysis from the Analysis group. The Data Analysis dialog box is
displayed.
2. Select Moving Average from the Analysis Tools list and click OK. The Moving Average
dialog box is displayed.
© Aptech Ltd. Data Analysis/Session 5 19
3. Specify the Input Range, Interval, and Output Range as shown in the following figure
and click OK.
Following figure shows the output generated by the Moving Average tool:
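A moving average with a given interval can be sketched in Python as follows. Positions that do not yet have enough earlier values are returned as `None`, on the assumption that this mirrors the #N/A cells in the tool's output. The enrolment figures and the function name `moving_average` are hypothetical.

```python
def moving_average(values, interval):
    """Trailing average over the last `interval` values at each position."""
    result = []
    for i in range(len(values)):
        if i + 1 < interval:
            result.append(None)  # not enough history yet
        else:
            window = values[i + 1 - interval:i + 1]
            result.append(sum(window) / interval)
    return result

# Hypothetical five-year enrolment figures (assumed data).
enrolment = [120, 150, 180, 210, 240]
averages = moving_average(enrolment, 3)
print(averages)
```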
The Random Number Generation analysis tool is used to fill a range with random
numbers generated from one of the seven types of distribution methods.
To generate random numbers from a range, perform the following steps:
1. Click Data → Data Analysis from the Analysis group. The Data Analysis dialog box is
displayed.
2. Select Random Number Generation from the Analysis Tools list and click OK. The
Random Number Generation dialog box is displayed.
3. Specify the Number of Variables, Number of Random Numbers, Distribution, Between
range, Random Seed, and Output Range as shown in the following figure and click OK.
tool:
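The idea of filling a range with seeded, uniformly distributed random numbers can be sketched in Python. The parameter names loosely follow the dialog box fields, but the function `generate_uniform` and all values are hypothetical. Python uses a different generator from Excel, so the same seed will not reproduce Excel's numbers.

```python
from random import Random

def generate_uniform(variables, count, low, high, seed):
    """Return `count` rows of `variables` uniform random numbers in [low, high]."""
    rng = Random(seed)  # a fixed seed makes the output repeatable
    return [[rng.uniform(low, high) for _ in range(variables)]
            for _ in range(count)]

# Hypothetical settings: 2 variables, 5 random numbers each, between 1 and 100.
table = generate_uniform(variables=2, count=5, low=1, high=100, seed=42)
for row in table:
    print([round(value, 2) for value in row])
```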
the effect on the formulas and the result.
Values in one or more cells can be changed to observe the effect on specific
formula cells.
Following table lists the various tools provided by Excel that help a user to find
answers to ‘What-if’ type of questions:
Tool | Description
Goal Seek | Goal Seek is used when the required output is known but one of the input values for the output is not known. For example, if the required output is 5, and the first number is 40, then what could be the second number? Should it be 40 divided by 8 or 9? In such cases, Goal Seek can be used to find the value.
Solver | Solver is useful in finding the best solution to a complex problem involving manipulation of multiple variables and constraints.
Data Table | Data Table is used for viewing different results by altering the value of an input cell in a formula.
Scenario | A scenario is a set of values and formulas that are saved for later use. They can be substituted automatically in worksheet cells according to requirement. A user can create and save different groups of values on a worksheet and then change to any defined scenarios to view the different results.
about the end result but not about the data required to achieve the result.
Example: Consider a situation to find the number to be multiplied with 4232 to
achieve the result 63480. This can be done by performing the following steps:
1. Create a worksheet as shown in the following figure:
2. To find out the number, which when multiplied with Number1, that is, 4232, gives the
desired output 63480, type the following formula in cell B3: =(B1*B2).
3. Click Data → What-if Analysis and then select Goal Seek from the drop-down menu.
and click OK.
The value 15 is displayed in cell B2 along with the Goal Seek Status dialog box as
shown in the following figure:
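The search that Goal Seek performs can be imitated with a simple bisection loop. This is only a sketch of the idea and is not claimed to be the method Excel uses. The function `goal_seek` and the search bounds 0 and 100 are assumptions; the values 4232 and 63480 come from the example above.

```python
def goal_seek(formula, target, low, high, tolerance=1e-9, max_steps=200):
    """Find x between low and high such that formula(x) is close to target.

    Assumes formula(low) and formula(high) lie on opposite sides of target.
    """
    error_low = formula(low) - target
    for _ in range(max_steps):
        mid = (low + high) / 2
        error_mid = formula(mid) - target
        if abs(error_mid) <= tolerance:
            return mid
        if (error_low < 0) == (error_mid < 0):
            low, error_low = mid, error_mid  # target lies in the upper half
        else:
            high = mid                       # target lies in the lower half
    return (low + high) / 2

number1 = 4232                                 # cell B1
number2 = goal_seek(lambda b2: number1 * b2,   # formula in cell B3
                    target=63480, low=0, high=100)
print(round(number2, 6))
```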
Following figure shows the Solver Parameters dialog box:
To enable Solver, click File → Options → Add-ins. Select Excel add-ins from the Manage list
and click Go. Select Solver Add-in from the Add-ins dialog box and click OK.
along with their description:
Parameter | Description
Set Objective | This represents the target cell in the worksheet that has to be set to a specified, minimum, or maximum value. This cell must contain a formula.
To | This option allows treating the target cell in three ways namely, Max, Min, and Value of. Max is the default option that tells Solver to maximize the target cell value with the specified constraints. Min tells Solver to minimize the target cell value with the specified constraints. Value of tries to reach the value specified in the box.
By Changing Variable Cells | This represents the cells that can be adjusted until the constraints specified in the problem are satisfied and the cell in the Set Objective box reaches its target. The adjustable cells must be related, directly or indirectly, to the target cell.
Subject to the Constraints | This lists the restrictions on a problem.
Add | Displays the Add Constraint dialog box to add constraints.
Change | Displays the Change Constraint dialog box to change the constraints.
Delete | Removes the selected constraint from the Subject to the Constraints list.
Solve | Starts the solution process for the defined problem and opens the Solver Results dialog box after the process is completed.
along with their description:
Parameter | Description
Close | Closes the Solver Parameters dialog box without solving the problem. It retains the changes made by using the Options, Add, Change, or Delete buttons.
Reset All | Clears the current problem settings and resets all the settings to original values.
Select Solving Model | Allows a user to select the desired solving model namely, Generalized Reduced Gradient (GRG) Nonlinear, Simplex LP, and Evolutionary.
Options | Displays the Solver Options dialog box, to set additional constraints for the solution process. Some options have default settings that are appropriate for most of the problems.
Following figure shows the Solver Options dialog box:
values.
The Add Constraint dialog box is shown in the following figure:
The Add Constraint dialog box consists of the following fields:
Cell Reference: Specifies the cell references for which the values have to be
restricted.
Operator list drop-down: Consists of the operators that can be used to specify
(ROI) affects a monthly instalment by using the PMT function. To do this, perform
the following steps:
1. Create an Excel sheet to calculate Equated Monthly Installments (EMIs) as shown in
the following figure:
2. Select the range C2:D5 that contains the formula in cell D2, the different interest rates
in cells C3:C5, and the cells in which the installments have to be displayed, D3:D5.
3. Click Data → What-if Analysis from the Data Tools group. Select Data Table option from
the drop-down menu. The Data Table dialog box is displayed.
figure:
5. Click OK. Excel fills the cells D3 to D5 with the Payment values corresponding to the
ROIs as shown in the following figure:
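What the one-input data table computes can be sketched in Python: the same payment formula is evaluated once per interest rate. The `pmt` helper below follows the standard annuity formula for a loan repaid at the end of each period with no balance left, which is assumed to match PMT(rate, nper, pv). The loan amount, duration, and rates are hypothetical and are not the figures from the worksheet.

```python
def pmt(rate, nper, pv):
    """Payment per period for a loan of `pv` over `nper` periods.

    The result is negative because it represents money paid out.
    """
    if rate == 0:
        return -pv / nper
    return -pv * rate / (1 - (1 + rate) ** -nper)

# Hypothetical loan details (assumed data).
principal = 100000
months = 60
annual_rates = [0.08, 0.09, 0.10]  # plays the role of the input column C3:C5

# One result per input value, like the filled cells D3:D5.
payments = {rate: pmt(rate / 12, months, principal) for rate in annual_rates}
for rate, payment in payments.items():
    print(f"{rate:.0%}: {payment:.2f}")
```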
one formula lead to a change in the result of that formula.
In a two-input data table, the formula depends on two input cells, and the input values
are arranged along a row and a column.
Example: One can use a two-input data table to observe how different
combinations of rate of interest and loan duration affect the monthly installments.
To do this, perform the following steps:
1. Create an Excel sheet as shown in the following figure:
following figure:
3. Select the range C2:E5 where C2 contains the formula, cells C3:C5 and D2:E2 are the
rows and columns of variable values, and the cells D3:E5 are the cells in which the
calculated values have to be displayed.
4. Click Data → What-If Analysis in the Data Tools group. Select Data Table from the
drop-down menu.
5. Type the reference of cell $B$4 in the Row input cell box.
figure:
7. Click OK. Excel fills the cells from D3 to E5 with the payment values corresponding to
the combination of ROIs and Durations as shown in the following figure:
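A two-input data table evaluates the formula for every combination of the two inputs. The sketch below illustrates this with a payment grid over interest rates and loan durations. The `pmt` helper uses the standard annuity formula, assumed to match PMT(rate, nper, pv), and all loan figures are hypothetical.

```python
def pmt(rate, nper, pv):
    """Payment per period for a loan of `pv` over `nper` periods (negative = paid out)."""
    if rate == 0:
        return -pv / nper
    return -pv * rate / (1 - (1 + rate) ** -nper)

# Hypothetical inputs (assumed data).
principal = 100000
annual_rates = [0.08, 0.09, 0.10]  # column input values, like C3:C5
durations_in_years = [5, 10]       # row input values, like D2:E2

# One payment per (rate, duration) combination, like the filled cells D3:E5.
grid = {
    (rate, years): pmt(rate / 12, years * 12, principal)
    for rate in annual_rates
    for years in durations_in_years
}
for rate in annual_rates:
    print(f"{rate:.0%}", [round(grid[(rate, years)], 2) for years in durations_in_years])
```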
A scenario is a collection of cell values and formulas that can be saved as a group
and then swapped automatically for another group of cell values in a worksheet.
Scenarios can accept more than two input values.
Example: One might want to create a monthly expenditure budget. One can then
make changes to the amounts of individual items such as fuel, groceries, clothes, and so
on and observe how these changes affect the overall budget.
Consider the worksheet consisting of the details of monthly expenses as shown in
the following figure:
budget, perform the following steps:
1. Select cells A3:B9 and click Formulas → Create from Selection. The Create Names
from Selection dialog box is displayed.
2. Select Left Column check box as shown in the following figure and click OK. This will
internally assign the names in column A cells to the corresponding column B values.
from the drop-down menu. The Scenario Manager dialog box is displayed as shown in
the following figure:
4. Click Add to create a new scenario. The Add Scenario dialog box is displayed.
monthly budget, the user has to create a scenario and supply the new values.
7. Type the cell range B5:B7 in the Changing cells box as shown in the following figure.
On selecting the range, the dialog box title changes to Edit Scenario.
9. Click OK as it is the initial budget and no values need to be changed. The Scenario
Manager dialog box is displayed with the name of the Initial Budget scenario added in
the Scenarios box as shown in the following figure:
monthly budget, the user has to re-create a scenario and supply the new values.
11. Repeat steps 1 to 7 to create another scenario called ‘Revised Budget.’
12. From the Scenario Manager dialog box, select the Revised Budget scenario and click
Show. The values in the respective fields will change based on the changes in the Food,
Clothes, and Phone Bill values, as shown in the following figure:
steps:
1. Click Data → What-If Analysis from the Data Tools group. Select Scenario Manager
from the drop-down list. The Scenario Manager dialog box is displayed.
2. Select the Initial Budget scenario and click Edit. The Edit scenario dialog box will be
displayed.
To delete an existing scenario, perform the following steps:
1. Click Data → What-If Analysis from the Data Tools group. Select Scenario Manager
from the drop-down list. The Scenario Manager dialog box is displayed.
2. Select the Initial Budget scenario and click Delete. The scenario will be removed from
the following steps:
1. Click Data → What-If Analysis from the Data Tools group. Select Scenario Manager
from the drop-down list. The Scenario Manager dialog box is displayed.
2. Click Summary. The Scenario Summary dialog box is displayed.
3. Click Result cells box and select cells D12, B11, and D3 individually while pressing the
CTRL key as shown in the following figure:
the following figure:
Regression analysis is useful in determining the change in a dependent entity when
one or more independent entities change. Regression analysis provides various data
models and analyzing techniques to determine trends from large amounts of data.
Consider the case of an educational organization that needs to analyze whether an
increase in literacy rate will lead to increase in per capita income.
To perform regression analysis, perform the following steps:
1. Create an Excel worksheet, as shown in the following figure:
2. Click Data → Data Analysis and then select Regression. The Regression dialog box is
displayed.
figure:
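The core of a simple linear regression (one independent variable) can be sketched in Python using the least-squares formulas. The Regression tool reports many more statistics than this; the sketch covers only the slope, intercept, and R squared. The literacy and income values and the name `linear_regression` are hypothetical.

```python
def linear_regression(xs, ys):
    """Least-squares fit of y = intercept + slope * x; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared: share of the variation in y explained by the fitted line.
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_residual / ss_total

# Hypothetical literacy rates (%) and per capita incomes (assumed data).
literacy_rate = [60, 70, 80, 90]
per_capita_income = [1000, 1200, 1400, 1600]
slope, intercept, r_squared = linear_regression(literacy_rate, per_capita_income)
print(slope, intercept, r_squared)
```

In this made-up data set, each additional percentage point of literacy corresponds to an increase of 20 in income.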
The Sampling technique is used to analyze data when there is a large volume of data.
From this data, random samples are selected to perform sampling.
Consider a scenario where a water supplying agency needs to determine the threat
of water-borne diseases in households in different states.
For this purpose, the agency has collected data about different states where the
residential areas and polluted industrial areas are located close to each other.
To determine sampling results on this data in Excel, perform the following tasks:
1. Create a worksheet as shown in the following figure:
for this cell.
3. Copy this cell value to all the cells in column A to generate a list of random numbers for
the sample data, as shown in the following figure:
6. Select all the values in the Random column and click Data → Sort → Expand the
selection to sort the random values. The worksheet displays the values as shown in the
following figure:
8. Enter the values in the dialog box, as shown in the following figure. You can select any
input range as the data is random.
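The approach used above (give each row a random number, sort by it, then take some of the rows) can be sketched in Python. The household labels, the sample size, and the function name `random_sample` are hypothetical. The seed is used only to make the example repeatable.

```python
from random import Random

def random_sample(rows, sample_size, seed=None):
    """Attach a random key to each row, sort by the key, and keep the first rows."""
    rng = Random(seed)
    keyed_rows = [(rng.random(), row) for row in rows]
    keyed_rows.sort(key=lambda pair: pair[0])
    return [row for _, row in keyed_rows[:sample_size]]

# Hypothetical list of 20 households (assumed data).
households = [f"Household {i}" for i in range(1, 21)]
sample = random_sample(households, 5, seed=7)
print(sample)
```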
Ranking is a derivative of the percentile value. Depending on the percentile, the
user can rank a particular entity from highest to lowest. To use the Rank and
Percentile function in Excel, consider the example of a class of 30 students who
need to be ranked on the basis of their percentage of marks in the 1st term, 2nd
term, and 3rd term.
To get this output, perform the following tasks:
1. Create a worksheet as shown in the following figure:
=AVERAGE(C2,D2,E2).
3. To calculate rank and percentile based on this data, click Data → Data Analysis →
Rank and Percentile. The Rank and Percentile dialog box is displayed.
4. Enter the values in the Rank and Percentile dialog box as shown in the following
figure:
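The kind of output the Rank and Percentile tool produces can be sketched in Python. The sketch makes two assumptions: the highest value gets rank 1 and tied values share a rank, and the percentile is the share of the other values that are lower. The marks and the function name `rank_and_percentile` are hypothetical.

```python
def rank_and_percentile(values):
    """Return (value, rank, percentile) rows sorted from highest to lowest."""
    n = len(values)
    rows = []
    for value in values:
        rank = 1 + sum(1 for other in values if other > value)
        lower = sum(1 for other in values if other < value)
        rows.append((value, rank, lower / (n - 1)))
    return sorted(rows, key=lambda row: row[1])

# Hypothetical average percentages for five students (assumed data).
averages = [72, 85, 91, 60, 85]
report = rank_and_percentile(averages)
for value, rank, percentile in report:
    print(value, rank, f"{percentile:.0%}")
```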
To use the INDEX() and MATCH() functions, perform the following tasks:
1. Create two columns, Student Name and Marks, in columns M and N respectively, in the same
worksheet.
2. Copy the values from Column 1 header of the Rank and Percentile report to the Marks
column header.
3. Apply the following formula to the Student Name Column:
=INDEX($B$2:$B$30,MATCH(H2,$A$2:$A$30))
Replicate this formula in all the cells of the Student Name column.
The mark sheet with the Student Names and Marks arranged according to
percentage in descending order is displayed, as shown in the following figure:
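The INDEX and MATCH combination looks up the position of a value in one range and returns the item at the same position in another range. The Python sketch below assumes an exact-match lookup, which in Excel corresponds to passing 0 as the third argument of MATCH. This is an assumption, because the formula shown above omits that argument. The names, marks, and the helper `index_match` are hypothetical.

```python
def index_match(lookup_value, lookup_range, return_range):
    """Find lookup_value in lookup_range and return the matching item of return_range."""
    position = lookup_range.index(lookup_value)  # first exact match
    return return_range[position]

# Hypothetical worksheet columns (assumed data).
marks = [72, 91, 60]
student_names = ["Asha", "Ben", "Carla"]

# Marks sorted from highest to lowest, as in the Rank and Percentile report.
sorted_marks = sorted(marks, reverse=True)
ranked_names = [index_match(mark, marks, student_names) for mark in sorted_marks]
print(list(zip(ranked_names, sorted_marks)))
```

If two students have the same marks, this lookup returns the first matching name for both, so tied values need separate handling.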
Exponential smoothing technique is used to mitigate irregularity or extremities in
data collected over a period of time to analyze trends and forecast.
Consider a situation where the sales data is accumulated for a product for the past
20 years.
In this case, there would be varied product demand based on many factors which
may act as temporary catalysts in causing change in demand.
To even out this data set, the exponential smoothing technique can be used.
To use exponential smoothing in Excel, perform the following tasks:
1. Create a worksheet as shown in the following figure:
3. In the Exponential Smoothing dialog box, provide the Input Range, Damping Factor,
and the Output Range, as shown in the following figure:
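The smoothing calculation can be sketched in Python. The sketch assumes that each smoothed value is (1 - damping factor) times the previous actual value plus the damping factor times the previous smoothed value, that the first position has no value (shown as `None`), and that the second position simply repeats the first actual value. The sales figures and the function name `exponential_smoothing` are hypothetical.

```python
def exponential_smoothing(values, damping_factor):
    """Return the smoothed series; the first entry is None (no earlier data)."""
    smoothing_weight = 1 - damping_factor
    smoothed = [None]
    forecast = values[0]  # the first forecast is the first actual value
    for actual in values[1:]:
        smoothed.append(forecast)
        # Blend the latest actual value with the previous forecast.
        forecast = smoothing_weight * actual + damping_factor * forecast
    return smoothed

# Hypothetical yearly sales figures (assumed data).
sales = [100, 200, 300]
result = exponential_smoothing(sales, damping_factor=0.5)
print(result)
```

A higher damping factor gives more weight to the earlier forecast and so produces a smoother line that reacts more slowly to recent changes.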
financial, and engineering functions for analysis of complex problems.
The Anova tool is used to determine if there is a statistically significant difference between two or
more given sets of data by analyzing their variances.
Covariance tool can be used to examine each pair of measurement variables to verify if the two
measurement variables have a tendency to move together.
Goal Seek is a what-if analysis tool in Excel used in situations when a user is clear about the end
result but not about the data required to achieve the result.
Solver is an add-in in Excel used for the simulation and optimization of business and engineering
problems. It helps to solve linear and non-linear mathematical problems.
The Data Table tool allows a user to see different results of a formula by changing an input cell
used in the formula.
A scenario is a collection of cells and formulas that can be saved as a group and then swapped
automatically for another group of cell values in a worksheet.
Regression determines the effect of an independent variable on a dependent variable when the