Uploaded by

TEJAS THORAT

Sub: Data Analytics

Unit: II
Descriptive Statistics and data Validations, Data Analysis Techniques
Statistical Functions



Before looking at the statistical functions in Excel, let us first understand what statistics is and why we need it. Statistics is a branch of science that deals with collecting, organizing, analyzing, and presenting data, and it lets us describe the properties of a sample. Karl Pearson, a great mathematician often called the father of modern statistics, said that "statistics is the grammar of science".

Ways to approach statistical function in Excel:

To understand the statistical functions, we will divide them into two sets:
1. Basic statistical functions
2. Intermediate statistical functions

1. COUNT function

The COUNT function counts the number of cells that contain a number. Always remember that it counts only numeric values; text and empty cells are ignored.
Formula for COUNT function = COUNT(value1, [value2], …)
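
The counting rule above can be illustrated outside Excel. The following is a minimal Python sketch, assuming a range is represented as a plain list in which None stands for an empty cell; the name excel_count and the sample values are hypothetical and are not part of Excel.

```python
def excel_count(values):
    """Count the entries that are numbers, the way COUNT skips text and blanks in a range."""
    # bool is a subclass of int in Python, so it is excluded explicitly;
    # None stands in for an empty cell (an assumption of this sketch).
    return sum(
        1
        for v in values
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    )


cells = [10, "apple", 3.5, None, "", 42]  # hypothetical cell contents
print(excel_count(cells))  # 3
```

Only 10, 3.5, and 42 are counted, because the other entries are text or empty.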
Intermediate Statistical Function
Let's discuss some intermediate statistical functions in Excel. These functions are used more often by analysts. They include the AVERAGE function, MEDIAN function, MODE function, STANDARD DEVIATION function, VARIANCE function, QUARTILES function, and CORRELATION function.

2. SUM Function

The SUM function adds up the specified values. The values can be numbers, arrays, ranges, or cell references. It accepts up to 255 arguments.

The syntax of the SUM function is as follows:

=SUM(number1, [number2], ...)

1. AVERAGE function

The AVERAGE function is one of the most used intermediate functions. It returns the arithmetic mean (average) of the cells in a given range.
Formula for AVERAGE function = AVERAGE(number1, [number2], …)

Example of statistical function.

So the average total revenue is Rs.144326.6667

3. MEDIAN function

The MEDIAN function will return the central value of the data. Its syntax
is similar to the AVERAGE function.
Formula for MEDIAN function = MEDIAN(number1, [number2], …)

Example of statistical function.

Thus, the median quantity sold is 300.

4. MODE function

The MODE function returns the most frequently occurring value in a given range.
Formula for MODE function = MODE.SNGL(number1, [number2], …)

Example of statistical function.

Thus, the most frequent or repetitive cost is Rs. 250.

5. STANDARD DEVIATION

This function helps us determine how much the observed values deviate from the average. It is one of the most useful functions in Excel. STDEV.P treats the data as the entire population; for a sample, Excel provides STDEV.S.
Formula for STANDARD DEVIATION function = STDEV.P(number1, [number2], …)

9. MAX function

The MAX function will return the largest numeric value within a given
set of data or an array.
Formula for MAX function = MAX (number1, [number2], ...)
The textbook with the maximum quantity is Physics, with 620 copies.

10. MIN function

The MIN function will return the smallest numeric value within a given
set of data or an array.
Formula for MIN function = MIN (number1, [number2], ...)
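
The basic and intermediate functions above can be cross-checked outside Excel. The following is a minimal Python sketch using the standard statistics module; the list quantities is hypothetical sample data, not the data from the examples in these notes.

```python
import statistics

quantities = [250, 300, 250, 400, 350]  # hypothetical sample data

total = sum(quantities)                  # like SUM
average = statistics.mean(quantities)    # like AVERAGE
median = statistics.median(quantities)   # like MEDIAN
mode = statistics.mode(quantities)       # like MODE.SNGL
std_pop = statistics.pstdev(quantities)  # like STDEV.P (population standard deviation)
largest = max(quantities)                # like MAX
smallest = min(quantities)               # like MIN

print(total, average, median, mode, largest, smallest)
print(round(std_pop, 2))
```

Note that statistics.pstdev divides by the number of values, as STDEV.P does, while statistics.stdev divides by one less, as STDEV.S does.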

Frequency Distribution in Excel

Frequency Distribution in Excel is used to give an impression of how the data is spread out. This can be done using a Histogram, which gives a clear picture of how the data is distributed. To create a Frequency Distribution in Excel, we must have the Data Analysis ToolPak, which we can activate from the Add-Ins option in the Developer menu tab. Once it is activated, select Histogram from Data Analysis, and select the data we want to project.

Frequency Formula in Excel


Below is the Frequency Formula in Excel:
=FREQUENCY(data_array, bins_array)
The FREQUENCY function has two arguments, as below:

• Data array: The set of values whose frequencies are to be counted. If the data array contains no values, the FREQUENCY function returns an array of zeros.

• Bins array: The set of values used to group the values in the data array. If the bins array contains no values, FREQUENCY returns the number of elements in the data array.
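
The grouping rule behind the two arguments can be sketched in Python. This is a minimal illustration, assuming the bins are supplied in ascending order; the function name frequency and the sample scores are hypothetical. Each value is counted in the first bin that is greater than or equal to it, and one extra slot at the end counts the values above the last bin.

```python
from bisect import bisect_left


def frequency(data, bins):
    """Sketch of FREQUENCY: counts[i] is the number of values v with
    bins[i-1] < v <= bins[i]; the final slot counts values above the last bin."""
    edges = sorted(bins)  # assumption: bins are meant to be in ascending order
    counts = [0] * (len(edges) + 1)
    for v in data:
        # bisect_left returns the index of the first edge that is >= v
        counts[bisect_left(edges, v)] += 1
    return counts


scores = [79, 85, 78, 85, 50, 81, 95, 88, 97]  # hypothetical data array
print(frequency(scores, [70, 79, 89]))  # [1, 2, 4, 2]
```

The result always has one more element than the bins array, which is why an empty bins array yields a single count equal to the number of data values.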

How to Make Frequency Distribution in Excel?


Frequency Distribution in Excel is simple to use. Let's understand how it works with some examples.

In Excel, the FREQUENCY function is found in the Formulas menu under the Statistical category. To reach it, follow these steps:
• Go to the Formulas menu.
• Click on More Functions.
• Under the Statistical category, choose the FREQUENCY function.
• The Function Arguments dialog box for FREQUENCY appears.

Here, Data_array is the array or set of values whose frequencies we want to count, and Bins_array is the array or set of values used to group the values in the data array.

Histogram:

A histogram is a chart that shows the frequency distribution of a set of values. The values are arranged into specified ranges known as bins.

A histogram is an accurate representation of the distribution of numerical data. It is used to summarize discrete or continuous data. In other words, it provides a visual interpretation of numerical data by showing the number of data points that fall within a specified range of values (called "bins"). It is similar to a vertical bar graph. However, a histogram, unlike a vertical bar graph, shows no gaps between the bars. A histogram allows the inspection of the data for its underlying distribution (e.g., normal distribution), outliers, skewness, etc.
How to Install Data Analysis Tool Pak in Excel
Step 1: Click on the File tab and then select 'Options'.
Step 2: Select Add-ins in the navigation pane of the Excel Options dialog box.
Step 3: In the Manage drop-down, select Excel Add-ins and click Go.
Step 4: In the Add-ins dialog box, check Analysis ToolPak and click OK.
This installs the Analysis ToolPak, and you can access it in the Data tab in the Analysis group.
How to Create a Histogram using Analysis Tool Pak in Excel
Step 1: Put the data in the Excel sheet.
Step 2: Go to the Data tab.
Step 3: In the Analysis group, click on Data Analysis.
Step 4: In the 'Data Analysis' dialog box, select Histogram from the list and click OK.
In the Histogram dialog box:

After highlighting your data, follow these steps:

1. Click on the Insert tab.

2. Click on “Insert Statistical Charts” from the Charts section.

3. Select the Histogram chart.

To change the look of the chart:

1. Double-click on the chart.

2. Click on the Design tab.

3. Select Change Colors.

4. In this section, you can change the color range of the chart.

5. From the Format Chart Area, you also have other options such as Fill & Line, Effects, and Size.

Pivot Tables

You can use a PivotTable to summarize, analyze, explore, and present summary data. PivotCharts complement PivotTables by adding visualizations to the summary data in a PivotTable, and allow you to easily see comparisons, patterns, and trends.

PivotTables are especially useful for:

• Querying large amounts of data in many user-friendly ways.
• Subtotaling and aggregating numeric data, summarizing data by categories and subcategories, and creating custom calculations and formulas.
• Expanding and collapsing levels of data to focus your results, and drilling down to details from the summary data for areas of interest to you.
• Moving rows to columns or columns to rows (or "pivoting") to see different summaries of the source data.
• Filtering, sorting, grouping, and conditionally formatting the most useful and interesting subset of data, enabling you to focus on just the information you want.
• Presenting concise, attractive, and annotated online or printed reports.
Insert a Pivot Table
To insert a pivot table, execute the following steps.
1. Click any single cell inside the data set.
2. On the Insert tab, in the Tables group, click PivotTable.

The following dialog box appears. Excel automatically selects the data
for you. The default location for a new pivot table is New Worksheet.
3. Click OK.
Drag fields
The PivotTable Fields pane appears. To get the total amount exported
of each product, drag the following fields to the different areas.
1. Product field to the Rows area.
2. Amount field to the Values area.
3. Country field to the Filters area.
Below you can find the pivot table. Bananas are our main export
product. That's how easy pivot tables can be!

Sort
To get Banana at the top of the list, sort the pivot table.
1. Click any cell inside the Sum of Amount column.
2. Right click and click on Sort, Sort Largest to Smallest.

Result.

Filter
Because we added the Country field to the Filters area, we can filter this
pivot table by Country. For example, which products do we export the
most to France?
1. Click the filter drop-down and select France.
Result. Apples are our main export product to France.

Note: you can use the standard filter (triangle next to Row Labels) to
only show the amounts of specific products.
Change Summary Calculation
By default, Excel summarizes your data by either summing or counting
the items. To change the type of calculation that you want to use,
execute the following steps.
1. Click any cell inside the Sum of Amount column.
2. Right click and click on Value Field Settings.
3. Choose the type of calculation you want to use. For example, click
Count.

4. Click OK.
Result. 16 out of the 28 orders to France were 'Apple' orders.
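
The pivot table operations described above (Rows, Values, Filters, sorting, and changing the summary calculation) can be sketched in Python to show what Excel computes. This is a minimal sketch; the orders list and the function name pivot are hypothetical and do not reproduce the data set behind the results quoted above.

```python
from collections import defaultdict

# Hypothetical source rows: (product, country, amount)
orders = [
    ("Banana", "France", 300),
    ("Apple", "France", 500),
    ("Banana", "Germany", 700),
    ("Apple", "France", 200),
    ("Mango", "Germany", 100),
]


def pivot(rows, country=None, aggregate="sum"):
    """Group rows by product (Rows area), optionally filter by country
    (Filters area), and summarize the amount (Values area) by sum or count."""
    totals = defaultdict(int)
    for product, row_country, amount in rows:
        if country is not None and row_country != country:
            continue  # row is hidden by the country filter
        totals[product] += amount if aggregate == "sum" else 1
    # Sort Largest to Smallest on the summarized value
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


print(pivot(orders))                                      # sum of amount per product
print(pivot(orders, country="France"))                    # filtered by country
print(pivot(orders, country="France", aggregate="count")) # count instead of sum
```

Switching aggregate from "sum" to "count" corresponds to choosing Count in Value Field Settings.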

Data Visualization in Excel

Data Visualization is the representation of data in a graphical format. It makes the data easier to understand. Data Visualization can be done using tools like Tableau, Google charts, DataWrapper, and many more. Excel is a spreadsheet that is used for data organization and data visualization as well. In this article, let's understand Data Visualization in Excel.
Excel provides various types of charts, such as Column charts, Bar charts, Pie charts, Line charts, Area charts, Scatter charts, Surface charts, and many more.
Steps for visualizing data in Excel:
• Open the Excel spreadsheet and enter the data, or select the data you want to visualize.
• Click on the Insert tab and select a chart from the list of available charts. Alternatively, select a cell in the Excel data and press the F11 function key as a shortcut for creating a chart.
• A chart with the data entered in the Excel sheet is obtained.
• You can design and style your chart with different styles and colors by selecting the Design tab.
• In Excel 2010, the Design tab becomes visible when you click on the chart.

Simple graphs are only the tip of the iceberg. There are many more visualization methods for presenting data in effective and interesting ways.

1. Line Chart:

A Line Chart is a type of chart which displays information as a series of data points called 'markers' connected by straight line segments. It is typically used to show how things change over time or across ordered categories.
2. Spider Chart:

A spider chart (also called a radar chart) is a graphical method of displaying multivariate data in the form of a two-dimensional chart of three or more quantitative variables represented on axes starting from the same point. The relative position and angle of the axes is typically uninformative.
3. Pie Chart

A pie chart (or a circle chart) is a circular statistical graphic, which is divided into slices to illustrate numerical proportion. In a pie chart, the arc length of each slice (and consequently its central angle and area) is proportional to the quantity it represents. While it is named for its resemblance to a pie which has been sliced, there are variations on the way it can be presented. The pie chart is not suitable for multiple series of data, because as the number of series increases, each slice becomes smaller, and finally the size distinction is not obvious.
4. Column Chart

A column chart is a bar chart whose bars are plotted vertically. A bar chart or bar graph presents categorical data with rectangular bars whose heights or lengths are proportional to the values that they represent. The bars can be plotted vertically or horizontally.
6. Scatter Plot
The scatter plot shows two variables in the form of points on a
rectangular coordinate system. The position of the point is determined
by the value of the variable. By observing the distribution of the data
points, we can infer the correlation between the variables.

7. Bubble Chart
A bubble chart is a variation of a scatter chart in which the data points
are replaced with bubbles.

Advanced Excel Formulas

Microsoft Excel is a powerful tool that is widely used for official and personal work. By default, Excel includes various formulas and functions, which help to calculate data effectively and quickly. It saves the user's time and offers high computational power and efficiency. MS Excel is used in various fields, such as education, hospitals, and government and private organizations. It is also used to create budgets and balance sheets, prepare sales reports, and support business decisions. Some of the applications of Excel are as follows:

o Preparing a Chart for the data


o Conditional Formatting with Visual support
o Identifying the Trend
o Single Solution for the Various Types of Data

Advanced Excel formulas

Excel includes some advanced formulas for calculating data. Some of these formulas are explained as follows:

1. VLOOKUP

As the name suggests, VLOOKUP stands for Vertical Lookup. This Excel function searches for a value in the first column of a table and returns a value from the same row in a column you specify.

The syntax of the VLOOKUP function is as follows:

=VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup])
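
The lookup logic behind this syntax can be sketched in Python. This is a minimal illustration of the exact-match case only (range_lookup set to FALSE); the function name vlookup and the price_table rows are hypothetical.

```python
def vlookup(lookup_value, table_array, col_index_num):
    """Exact-match VLOOKUP sketch: scan the first column from top to bottom
    and return the value in column col_index_num of the first matching row."""
    for row in table_array:
        if row[0] == lookup_value:
            return row[col_index_num - 1]  # col_index_num is 1-based, as in Excel
    return None  # Excel would show #N/A when nothing matches


# Hypothetical table: code, item, price
price_table = [
    ("P001", "Pen", 10),
    ("P002", "Book", 50),
    ("P003", "Bag", 400),
]

print(vlookup("P002", price_table, 3))  # 50
```

Because the search always runs down the first column, the lookup value must sit to the left of the value being returned, which is one of the limitations that INDEX MATCH avoids.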


8. COUNTIF Function

The COUNTIF function in Excel counts the cells in the selected range that meet a single criterion. The criterion can apply to text, numbers, or dates.

The syntax for the COUNTIF function is:

=COUNTIF(range, criteria)

INDEX MATCH
Formula = INDEX(C3:E9, MATCH(B13, C3:C9, 0), MATCH(B14, C3:E3, 0))

INDEX MATCH is an excellent alternative to the VLOOKUP and HLOOKUP formulas in Excel, which have some drawbacks when performing lookup tasks. INDEX MATCH is a combination of two separate functions: INDEX and MATCH.
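
The two-way lookup in the formula above can be sketched in Python. This is a minimal illustration, assuming the grid includes its header row and its first column, as the ranges C3:E9, C3:C9, and C3:E3 suggest; the function names match and index and the sample grid are hypothetical.

```python
def match(lookup_value, lookup_list):
    """MATCH with match_type 0: 1-based position of the first exact match.
    Raises ValueError if the value is absent (Excel would show #N/A)."""
    return lookup_list.index(lookup_value) + 1


def index(grid, row_num, col_num):
    """INDEX: value at the given 1-based row and column of the grid."""
    return grid[row_num - 1][col_num - 1]


# Hypothetical grid, including the header row and the name column
grid = [
    ["Name", "Height", "Weight"],
    ["Asha", 160, 55],
    ["Ravi", 172, 70],
]

first_column = [row[0] for row in grid]  # plays the role of C3:C9
header_row = grid[0]                     # plays the role of C3:E3

result = index(grid, match("Ravi", first_column), match("Weight", header_row))
print(result)  # 70
```

One MATCH finds the row, the other finds the column, and INDEX returns the value where they cross.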

3. SUMIF

This formula in Excel is written as SUMIF(range, criteria, [sum_range]). It returns the sum of the values within the given range of cells that meet the criteria you set. For example, =SUMIF(C3:C12, ">70000") returns the sum of only those cells between C3 and C12 whose value is more than 70,000.
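
The conditional logic of SUMIF, and of the COUNTIF function described earlier, can be sketched in Python. This is a minimal illustration in which the criterion is passed as a Python function rather than as an Excel criteria string such as ">70000"; the names sumif and countif and the sample values are hypothetical.

```python
def sumif(values, criterion, sum_values=None):
    """Sum the entries of sum_values whose matching entry in values meets the
    criterion. If sum_values is omitted, values itself is summed."""
    if sum_values is None:
        sum_values = values
    return sum(s for v, s in zip(values, sum_values) if criterion(v))


def countif(values, criterion):
    """Count the entries of values that meet the criterion."""
    return sum(1 for v in values if criterion(v))


salaries = [65000, 72000, 80000, 70000, 91000]  # hypothetical C3:C7

print(sumif(salaries, lambda v: v > 70000))    # 243000
print(countif(salaries, lambda v: v > 70000))  # 3

# With a separate sum range: total amount for the "East" region
regions = ["East", "West", "East"]
amounts = [100, 200, 300]
print(sumif(regions, lambda v: v == "East", amounts))  # 400
```

The value 70000 itself is excluded because the criterion is strictly greater than 70,000.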

What-if Analysis in Excel

In Excel, What-if analysis is the process of changing the values in cells to see how those changes affect the worksheet's outcome. You can use several different sets of values to explore all the different results in one or more formulas. For example:

o You can propose different budgets based on revenue.
o You can predict future values based on the given historical values.
o If you expect a certain value as the result of a formula, you can find different sets of input values that produce the desired result.

Excel provides three What-if Analysis tools:

o Scenario Manager
o Goal Seek
o Data Tables

1. Scenario Manager

A scenario is a set of values that Excel saves and can substitute automatically in cells on a worksheet. Its key features are as follows:

o You can create and save different groups of values on a worksheet and then switch to any of these new scenarios to view different results.
o A scenario can have multiple variables, but it can accommodate only up to 32 values.
o You can also create a scenario summary report, which combines all the scenarios on one worksheet. For example, you can create several different budget scenarios that compare various possible income levels and expenses, and then create a report that lets you compare the scenarios side-by-side.
o Scenario Manager is a dialog box that allows you to save the values as a scenario and name the scenario.

2. Goal Seek

Goal Seek is useful if you know the result you want from a formula but are unsure what input value the formula needs to produce that result. For example, if you want to take a loan and know the loan amount, the tenure of the loan, and the EMI that you can pay, you can use Goal Seek to find the interest rate at which you can avail of the loan.

Goal Seek can be used only with one variable input value. If you have more than one variable input value, you can use the Solver add-in.
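
The idea behind the loan example can be sketched in Python. This is a minimal illustration that uses bisection as a stand-in search method; it is not a description of how Excel's Goal Seek works internally. The function names emi and goal_seek, the loan figures, and the search interval are all assumptions made for this sketch.

```python
def emi(principal, annual_rate, years):
    """Monthly payment on a fully amortizing loan (standard annuity formula)."""
    r = annual_rate / 12  # monthly interest rate
    n = years * 12        # number of monthly payments
    if r == 0:
        return principal / n
    return principal * r / (1 - (1 + r) ** -n)


def goal_seek(func, target, low, high, tol=1e-9, max_iter=200):
    """Find x in [low, high] with func(x) close to target by bisection.
    Assumes func is increasing on the interval and the target is reachable."""
    for _ in range(max_iter):
        mid = (low + high) / 2
        if func(mid) < target:
            low = mid   # payment too small, so the rate must be higher
        else:
            high = mid  # payment large enough, so the rate is at most mid
        if high - low < tol:
            break
    return (low + high) / 2


# Hypothetical loan: amount 100000, tenure 5 years, affordable EMI taken
# as the payment at a 10% annual rate so the answer is known in advance.
affordable_emi = emi(100000, 0.10, 5)
rate = goal_seek(lambda r: emi(100000, r, 5), affordable_emi, 0.0, 1.0)
print(round(rate, 4))  # 0.1
```

Only one input (the rate) is varied while the loan amount and tenure stay fixed, which mirrors the one-variable restriction of Goal Seek.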

3. Data Table

A Data Table is a range of cells where you can change values in some of the cells and get different answers to a problem. For example, you might want to know how much loan you can afford for a home by analyzing different loan amounts and interest rates. You can put these different values and the PMT function in a Data Table and get the desired result.

A Data Table works only with one or two variables, but it can accept many different values for those variables.

What-If Analysis Scenario Manager

Scenario Manager is one of the What-if Analysis tools in Excel. Scenario


Manager is useful in a case where you have more than two variables in
the sensitivity analysis. Scenario Manager creates scenarios for each set
of the input values for the variables under consideration. Scenarios help
you to explore a set of possible outcomes, supporting the following:

o Varying as many as 32 input sets.


o Merging the scenarios from several different worksheets or
workbooks.

If you want to analyze more than 32 input sets, and the values
represent only one or two variables, you can use Data Tables.

Step 1: Define the cells that contain the input values.


Step 2: Name the cells Metals_name and Cost.

Step 3: Define the cells that contain the results.

Step 4: Name the result cell Total_cost.


Step 5: Place the formula in the result cell.

Step 6: Below is the created table.

To create an analysis report with Scenario Manager, follow these steps:

Step 1: Click the Data tab.

Step 2: Go to the What-If Analysis button and click on Scenario Manager in the dropdown list.

Step 3: The Scenario Manager dialog box appears. Click on the Add button to create a scenario.

Step 4: Create the scenario, name it, enter the value for each changing input cell for that scenario, and then click the OK button.

Step 5: Now, B3, B4, B5, B6, and B7 appear in the Changing cells box.

Step 6: Now, change the value of B3 to 500 and click the Add button.

Step 7: After you click the Add button, the Add Scenario dialog box appears again.

o In the Scenario name box, type Scenario 2.
o Select Prevent changes.
o Click the OK button.

Step 8: The Scenario Values box appears again with the changed value of the B3 cell.

Step 9: Change the value of B5 to 20000 and click the OK button.

Step 10: Similarly, create Scenario 3 and click the OK button.

Step 11: The Scenario Values box appears again with the changed value of the B5 cell.

Step 12: Change the value of B7 to 10000 and click the OK button.
The Scenario Manager dialog box appears. In the box under Scenarios, you will find the names of all the scenarios that you have created.

Step 13: Now, click on the Summary button. The Scenario Summary dialog box appears.

What-If Analysis Goal Seek

Goal Seek is a What-If Analysis tool that helps you find the input value that produces the target value you want. Goal Seek requires a formula that uses the input value to produce the result. Then, by varying the formula's input value, Goal Seek tries to find the input value that gives the target result.

Goal Seek works only with one variable input value. If you have more than one input value to be determined, you have to use the Solver add-in. Follow the steps below to use the Goal Seek feature in Excel.

Step 1: On the Data tab, go to What-If Analysis and click on the Goal Seek option.

Step 2: The Goal Seek dialog box appears.


Step 3: Type C9 in the Set cell box. This box holds the reference to the cell that contains the formula that you want to resolve.

Step 4: Type 57000 in the To value box. This is the formula result that you want.

Step 5: Type B9 in the By changing cell box. This box holds the reference to the cell that contains the value you want to adjust.

Step 6: The cell that Goal Seek changes must be referenced by the formula in the cell that you specified in the Set cell box. Click OK.

Step 7: Goal Seek box produces the following result.

As you can observe, Goal Seek found the solution using B9, and it
returns 0 in the B9 cell because the target value and current value are
the same.

What-If Analysis Data Tables

With a Data Table in Excel, you can easily vary one or two inputs and perform a What-if analysis. A Data Table is a range of cells where you can change values in some of the cells and get different answers to a problem. There are two types of Data Tables:

1. One-variable data tables
2. Two-variable data tables

If you have more than two variables in your analysis problem, you need to use the Excel Scenario Manager tool.

One-variable Data Tables



A one-variable Data Table can be used to see how different values of one variable in one or more formulas will change those formulas' results. In other words, with a one-variable Data Table, you can determine how changing one input changes any number of outputs. Below is an example of creating a one-variable data table.

A good example of a data table employs the PMT function with different loan amounts and interest rates to calculate the loan payment.

There is a loan of 1,00,000 for a tenure of 5 years. You want to know the monthly payments (EMI) for varied interest rates. You also want to know the amount of interest and principal that is paid in the second year.

Step 1: Create the required table.

o Assume that the interest rate is 10%.


o List all the required values.
o Name the cells containing the values.
o Set the calculation for EMI, Cumulative Interest and Cumulative
Principal with the Excel functions PMT, CUMIPMT and CUMPRINC,
respectively.
o Below is the created table.

Step 2: Type the list of interest rate values that you want to substitute
in the input cell.

As you can observe, there is an empty row above the Interest Rate values. This row is for the formulas.
Step 3: Type the first function (PMT) in the cell one row above and one cell to the right of the column of values. Type the other functions (CUMIPMT and CUMPRINC) in the cells to the right of the first function.

Step 4: The Data Table looks as given below.

Step 5: Select the range of cells that contains the formulas and values
that you want to substitute, E2:H13.

Step 6: Go to the Data tab, select What-if Analysis and click on the
Data Table tool in the dropdown list.

Step 7: Data Table dialog box appears.

o Click in the Column input cell box.


o And click on the Interest_Rate cell, which is C2.

You can see that the Column input cell is taken as $C$2.

Step 8: Click on the Ok button.
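
The result of this one-variable Data Table can be approximated in Python to show what is being computed. This is a minimal sketch, assuming a loan of 1,00,000 (written 100000 in code) over 5 years and a hypothetical list of interest rates; the function names pmt and second_year_totals are made up for this example. Unlike Excel's PMT, CUMIPMT, and CUMPRINC, which return negative numbers for outgoing payments, these helpers return positive amounts.

```python
def pmt(principal, annual_rate, years):
    """Monthly payment (EMI), returned as a positive number."""
    r = annual_rate / 12
    n = years * 12
    return principal * r / (1 - (1 + r) ** -n)


def second_year_totals(principal, annual_rate, years):
    """Interest and principal paid in months 13 to 24, found by walking
    through the amortization schedule month by month."""
    r = annual_rate / 12
    payment = pmt(principal, annual_rate, years)
    balance = principal
    interest_paid = 0.0
    principal_paid = 0.0
    for month in range(1, 25):
        interest = balance * r       # interest due on the outstanding balance
        repaid = payment - interest  # remainder of the EMI reduces the balance
        balance -= repaid
        if month >= 13:
            interest_paid += interest
            principal_paid += repaid
    return interest_paid, principal_paid


rates = [0.08, 0.09, 0.10, 0.11, 0.12]  # hypothetical column of interest rates

# One row per rate: (rate, EMI, second-year interest, second-year principal)
table = [
    (rate, pmt(100000, rate, 5), *second_year_totals(100000, rate, 5))
    for rate in rates
]

for rate, payment, interest, principal in table:
    print(f"{rate:.0%}  {payment:10.2f}  {interest:10.2f}  {principal:10.2f}")
```

Each row reuses the same formulas with a different rate, which is what the Column input cell does in Excel.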
