FDA Unit 4
Introduction to Pivot Table :- A PivotTable is an extremely powerful tool that you can
use to slice and dice data. You can track and analyze hundreds of thousands of data
points with a compact table that can be changed dynamically to enable you to find the
different perspectives of the data. It is a simple tool to use, yet powerful.
The primary goal of using a PivotTable normally is to explore the data to extract significant and
required information. You have several options to do this that include Sorting, Filtering,
Nesting, Collapsing and Expanding, Grouping and Ungrouping, etc.
Summarizing Values: Once you have organized the required data using the different
exploration techniques, the next step is to summarize the data.
Excel provides you with a variety of calculation types that you can apply based on
suitability and requirement. You can also switch across different calculation types and
view the results in a matter of seconds.
Updating a PivotTable: Once you have explored the data and summarized it, you need
not repeat the exercise if and when the source data gets updated. You can refresh the
PivotTable so that it reflects the changes in the source data.
PivotTable Reports: After exploring and summarizing the data with a PivotTable, you
would be presenting it as a report. PivotTable reports are interactive in nature, with the
specialty that even a person not familiar with Excel can use them intuitively. Because of
their inherently dynamic nature, they enable you to quickly change the perspective of
the report to show the required level of detail or to focus on the specific items in which
the audience expresses interest.
Further, you can structure a PivotTable report for standalone presentation or as an integral part
of a broad report as the case may be.
A pivot table is a table of values that aggregates groups of individual values from a
more extensive table (such as one from a database, spreadsheet, or business intelligence program)
within one or more discrete categories. The aggregations, or summaries, of the groups of
individual values might include sums, averages, counts, or other statistics. A pivot table is the
result of statistically processing tabular raw data, and it can be used for decision
making.
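The definition above can be illustrated with a minimal Python sketch. The rows, field names and amounts are hypothetical, and the code only imitates the idea of a Sum pivot (row labels by column labels); it is not how Excel implements it.

```python
from collections import defaultdict

# Hypothetical source rows: (region, month, order_amount)
rows = [
    ("East", "Jan", 100),
    ("East", "Feb", 150),
    ("West", "Jan", 200),
    ("East", "Jan", 50),
    ("West", "Feb", 75),
]

def pivot_sum(records):
    """Aggregate amounts by (row label, column label), like a Sum pivot."""
    table = defaultdict(lambda: defaultdict(int))
    for region, month, amount in records:
        table[region][month] += amount
    # Convert to plain dicts so the result prints cleanly
    return {region: dict(months) for region, months in table.items()}

pivot = pivot_sum(rows)
for region, months in pivot.items():
    print(region, months)
```

Each cell of the result is the sum of all source rows that share the same region and month, which is the aggregation a pivot table performs.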
Pivot tables are one of Excel's most powerful features. A pivot table allows us to extract the
significance from a large, detailed data set.
To insert a pivot table, execute the following steps.
1. Click any single cell inside the data set.
2. On the Insert tab, in the Tables group, click PivotTable.
Field Calculations: In a pivot table, if the data area contains numerical values, then the
Sum function is used by default. If the data area contains non-numerical values, then the
Count function is used. We can specify which fields to include and the type of
calculations used on those fields.
For each combination of values in the row and column fields, the data field takes on a different
value and this value appears in the data area.
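The default choice described above can be sketched in Python. The function name default_summary and the use of None for a blank cell are assumptions made for this illustration only.

```python
def default_summary(values):
    """Guess the default summary function for a field's cells.

    None stands for a blank cell (an assumption of this sketch).
    """
    def is_number(v):
        # bool is excluded because Python treats True/False as integers
        return isinstance(v, (int, float)) and not isinstance(v, bool)

    # Sum only when every cell is numeric; otherwise fall back to Count
    return "Sum" if values and all(is_number(v) for v in values) else "Count"

print(default_summary([5, 12, 8]))
print(default_summary([5, None, 8]))
print(default_summary(["Pens", "Paper"]))
```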
Default Calculations
It is possible to use other functions to summarize the data. There is a choice of
eleven different aggregate functions that can be used in the pivot table.
Pivot table Summarize Functions:
SUM: The total value of the numbers in a list or cell range. This is the default function
used when the data area contains numeric values.
COUNT: The number of non-blank cells in a list or cell range (like the COUNTA worksheet
function). This is the default function used when the data area contains non-numeric values.
COUNT NUMS: The number of numeric values in a list or cell range (like the COUNT worksheet
function).
AVERAGE: The arithmetic mean of a list or array of numbers.
MAX: The largest value in a list or array of numbers.
MIN: The smallest value in a list or array of numbers.
PRODUCT: The product of all the numbers in a list or cell range.
STDDEV: The standard deviation based on a sample.
STDDEVP: The standard deviation based on an entire population.
VAR: The variance based on a sample.
VARP: The variance based on an entire population.
When we change the function, the Data area will reflect the changes automatically.
It is possible to customize the selected function by adding some calculation options in the pivot
table field dialog box. In addition to the eleven functions that are provided by default, we can also
create our own custom calculations.
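As a rough illustration, the eleven summary functions can be approximated with Python's standard library. The cell values are hypothetical, None stands for a blank cell, and the mapping to Python functions is an approximation for study purposes, not Excel's own implementation.

```python
import math
import statistics

# Hypothetical field values; None stands for a blank cell.
cells = [10, 20, "n/a", None, 30]

numbers = [c for c in cells if isinstance(c, (int, float))]
non_blank = [c for c in cells if c is not None]

summary = {
    "Sum": sum(numbers),
    "Count": len(non_blank),          # text counts, blanks do not
    "Count Nums": len(numbers),       # numeric cells only
    "Average": statistics.mean(numbers),
    "Max": max(numbers),
    "Min": min(numbers),
    "Product": math.prod(numbers),
    "StdDev": statistics.stdev(numbers),     # sample
    "StdDevp": statistics.pstdev(numbers),   # population
    "Var": statistics.variance(numbers),     # sample
    "Varp": statistics.pvariance(numbers),   # population
}

for name, value in summary.items():
    print(f"{name}: {value}")
```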
Custom Calculations: There are also a large number of custom calculations which we
can use, including running totals and item percentages. Some of these calculations require
a base field as well as a base item (a value of that field). To apply a custom calculation, go to
PivotTable -> Value Field Settings -> Show Values As.
When we select a member of the Base Field, the corresponding items will automatically be
displayed in the Calculation tab.
We can summarize a PivotTable by placing a field in ∑ VALUES area in the PivotTable Fields
Task pane. By default, Excel takes the summarization as sum of the values of the field in ∑
VALUES area. However, we have other calculation types, such as, Count, Average, Max, Min,
etc.
Normal: The default. Values are shown with no custom calculation.
Difference From: Calculates the difference between the cell value and a selected base value.
% Of: Calculates the cell value as a percentage of a selected base value.
% Difference From: Calculates the percentage difference between the cell value and a selected
base value.
Running Total In: Calculates and displays the running total in each cell.
% of Row: Calculates the cell value as a percentage of its row total.
% of Column: Calculates the cell value as a percentage of its column total.
% of Total: Calculates the cell value as a percentage of the grand total.
Index: Calculates the index value of the cell, which weights the cell value against its row total,
its column total and the grand total.
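Several of these custom calculations can be sketched in Python. The pivot values and function names are hypothetical. The Index formula used here is (cell value x grand total) / (row total x column total).

```python
# Hypothetical pivot of order amounts: rows are regions, columns are months.
pivot = {
    "East": {"Jan": 100, "Feb": 300},
    "West": {"Jan": 200, "Feb": 400},
}
months = ["Jan", "Feb"]

row_totals = {r: sum(vals.values()) for r, vals in pivot.items()}
col_totals = {m: sum(pivot[r][m] for r in pivot) for m in months}
grand_total = sum(row_totals.values())

def pct_of_row(r, m):
    return pivot[r][m] / row_totals[r]

def pct_of_column(r, m):
    return pivot[r][m] / col_totals[m]

def pct_of_total(r, m):
    return pivot[r][m] / grand_total

def index_value(r, m):
    # Index = (cell * grand total) / (row total * column total)
    return pivot[r][m] * grand_total / (row_totals[r] * col_totals[m])

def running_total(r):
    # Running total across the months for one row
    total, out = 0, []
    for m in months:
        total += pivot[r][m]
        out.append(total)
    return out

print(f"{pct_of_row('East', 'Jan'):.0%}")
print(running_total("West"))
```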
Introduction
When we add a field to the pivot table's Values area, 11 different functions, such as Sum, Count
and Average, are available to summarize the data.
The summary functions in a pivot table are similar to the worksheet functions with the same
names, with a few differences as noted in the descriptions that follow.
Totals and Subtotals: The selected summary function will automatically be used in
the subtotals and grand totals for that field. We can select a different function for the
totals. However, the totals are calculated on the source data, not on the values showing
in the pivot table. For example, if a field uses the MAX summary function, and the
subtotal shows the AVERAGE, it will be an average from the values in the source
data, not an average of the MAX values. (To calculate the Average of the Max values,
we could use formulas outside of the pivot table, or create a new pivot table, based
on the original one.)
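The difference between a total computed from the source data and an average of the displayed Max values can be seen in a small Python sketch. The products and quantities are hypothetical.

```python
import statistics

# Hypothetical source quantities per product.
source = {
    "Pens": [10, 40],
    "Paper": [20, 30, 100],
}

# Item rows use the Max summary function.
max_per_item = {item: max(qty) for item, qty in source.items()}

# A total set to Average is taken over all the source values...
all_values = [q for qty in source.values() for q in qty]
total_from_source = statistics.mean(all_values)

# ...not over the Max values displayed in the item rows.
average_of_max = statistics.mean(list(max_per_item.values()))

print(max_per_item)
print(total_from_source, average_of_max)
```

Here the two results differ (40 versus 70), which is why the average of the Max values has to be calculated outside the pivot table.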
Sum Function: The pivot table's Sum function totals all the underlying values for each
item in the field. The result is the same as using the SUM function on the worksheet to
total the values. Blank cells, and cells with text are ignored. When we add a numerical
field to the pivot table's Values area, Sum will be the default summary function. (Note: If
the field contains text or blank cells, Count will be the default.)
In the screen shot below, we can see the source data for a small pivot table, and the total
quantity, using the worksheet's SUM function, is 317.
With a pivot table, we can quickly see the total sum for each product that was sold, and the
grand total -- 317 -- which matches the worksheet total.
Count Function
Count is the default summary function when fields with nonnumeric or blank cells are added to
the Values area. The Count function's name is slightly confusing, because it's like the
COUNTA worksheet function, not the COUNT worksheet function. The pivot
table Count function counts: text, numbers, errors. Blank cells are NOT counted.
1. In the PivotTable Fields list, check the Qty field, to add it to the Values area
2. Qty appears in the pivot table as Sum of Qty
3. Right-click a cell in the Sum of Qty column
4. Point to Summarize Values By, then click Count
Fix the Problem: To get the count of all orders, even if the Qty cells are blank, follow
these steps:
1. In the PivotTable Fields list, uncheck the Qty field, to remove it from the Values area
2. Drag another copy of the Product field into the pivot table, and place it in the Values area
3. Because Product is a text field, it will automatically summarize as Count.
Because none of the Product cells are blank, the count includes all the orders.
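A minimal Python sketch of why counting the Product field gives the full order count. The orders are hypothetical and None stands for a blank Qty cell.

```python
# Hypothetical orders: (product, qty); None stands for a blank Qty cell.
orders = [
    ("Pens", 12),
    ("Pens", None),
    ("Paper", 30),
    ("Paper", 25),
]

def count_non_blank(values):
    """Pivot-table Count: counts every non-blank cell."""
    return sum(1 for v in values if v is not None)

count_of_qty = count_non_blank(qty for _, qty in orders)
count_of_product = count_non_blank(product for product, _ in orders)

print(count_of_qty, count_of_product)
```

The Qty count misses the order with the blank quantity, while the Product count includes every order.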
If we have formatted the worksheet to hide zero values, remember that those zero values will be
included in the averages, even if the cells appear blank.
Format the Results: When we use the Average summary function, the results will probably
show a strange mixture of decimal places, as shown in the pivot table at the left, in the screen
shot below. Format the field to have a consistent number of decimal places (as in the pivot table
at the right, below), so the numbers are easy to compare.
Max Function: The Max summary function shows the maximum value from the
underlying values in the Values area. The result is the same as using the MAX
function on the worksheet to calculate the maximum of the values. In the screen
shot below, we can see the source data for a small pivot table, and the maximum
quantity, using the worksheet's MAX function, is 97.
With a pivot table, we can quickly see the maximum for each product that was sold, and the
grand total -- 97 -- which matches the worksheet maximum.
Min Function: The Min summary function shows the minimum value from the
underlying values in the Values area. The result is the same as using the MIN
function on the worksheet to calculate the minimum of the values. In the screen shot
below, we can see the source data for a small pivot table, and the minimum quantity,
using the worksheet's MIN function, is 8.
With a pivot table, we can quickly see the minimum for each product that was sold, and the
grand total -- 8 -- which matches the worksheet minimum. In both the worksheet and the pivot
table, the blank cell is ignored when calculating the minimum amount.
Product Function: The Product summary function shows the result of multiplying
all the underlying values in the Values area. The result is the same as using the
PRODUCT function on the worksheet to calculate the product of the values. In the
screen shot below, we can see the pivot table source data, with the PRODUCT
calculated for each product group. At the bottom of the source data is the overall
PRODUCT calculation.
The results of the Product function may be very large numbers and default to a scientific
number format. We can format the numbers as Number format, instead of scientific format.
Note: Excel only stores and calculates with 15 significant digits of precision, so after the 15th
digit we'll only see zeros.
Count Numbers vs. Count: In the pivot table shown below, the Qty field has been added twice
to the Values area. In column B, the summary function is Count Numbers, and the Grand Total
is 7. In column C, the summary function is Count, which includes text, so the Grand Total for
that column is 8.
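The two counting rules can be compared in a short Python sketch. The Qty values are hypothetical, chosen only so that the totals match the 7 and 8 mentioned above.

```python
# Hypothetical Qty column: seven numbers and one text entry.
qty = [5, 12, 8, "pending", 97, 40, 33, 21]

def count_numbers(values):
    """Count Numbers: only numeric cells (like the worksheet COUNT function)."""
    return sum(
        1 for v in values
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    )

def count(values):
    """Count: all non-blank cells (like the worksheet COUNTA function)."""
    return sum(1 for v in values if v is not None)

print(count_numbers(qty), count(qty))
```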
StdDev Function and StdDevP Function: Like the STDEV.P and STDEV.S worksheet
functions, the StdDevp and StdDev summary functions calculate the standard
deviation for the underlying data in the Values area. The standard deviation is a
measure of how widely the values vary from the average of the values. The StdDevP
summary function should be used when the entire population is used in the
calculation. When a sample of the data is used, not the entire population, then use
the StdDev summary function.
In the screen shot below, we can see example pivot table source data, and the STDEV.P
worksheet function is calculating the standard deviation for each product type. For the File
Folders, there is a large difference between the quantities sold, and the standard deviation is
high -- 44.5. For Paper, the difference in quantity is much smaller, and the standard deviation is
low -- 4.7.
When the Qty field is added to the pivot table, change the summary calculation to StdDevp.
In the screen shot below, we can see that the standard deviations in the pivot table are the same
as those that were calculated on the worksheet.
Note: If the count of items is one, a #DIV/0! error is displayed when using the StdDev summary
function, because one is subtracted from the count when calculating the standard deviation.
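The population and sample standard deviations can be checked with Python's statistics module. The two quantities below (8 and 97) are assumed values, chosen so that the population result equals the 44.5 quoted for File Folders; the actual source data is not shown in these notes.

```python
import statistics

# Assumed quantities for two File Folders sales (illustrative only).
file_folders = [8, 97]

std_population = statistics.pstdev(file_folders)  # like StdDevp / STDEV.P
std_sample = statistics.stdev(file_folders)       # like StdDev / STDEV.S

def sample_std_or_error(values):
    """Mimic the #DIV/0! result when only one value is available."""
    try:
        return statistics.stdev(values)
    except statistics.StatisticsError:
        return "#DIV/0!"

print(std_population)
print(sample_std_or_error([8]))
```

The sample version divides by the count minus one, so a single value leaves nothing to divide by; Python raises an error where Excel shows #DIV/0!.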
In the screen shot below is the example pivot table source data, with the VAR.P worksheet
function calculating the variance for each product type. For the File Folders, where there is a
wide difference between the two quantities, the variance is large -- 1980.25. For the paper sales,
there is a small difference in quantity, and the variance is only 22.22.
To show the variance, when the Qty field is added to the pivot table, change the summary
calculation to Varp.
As we can see, the variances shown in the pivot table are the same as those that were calculated
on the worksheet.
Note: If the count of items is one, a #DIV/0! error is displayed when using the Var summary
function, because one is subtracted from the count when calculating the variance.
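The variance is the square of the standard deviation, which a short Python sketch can confirm. The quantities 8 and 97 are assumed values, chosen so that the population variance equals the 1980.25 quoted for File Folders.

```python
import statistics

# Assumed quantities for two File Folders sales (illustrative only).
file_folders = [8, 97]

var_population = statistics.pvariance(file_folders)  # like Varp / VAR.P
var_sample = statistics.variance(file_folders)       # like Var / VAR.S

# The population variance equals the squared population standard deviation.
squared_std = statistics.pstdev(file_folders) ** 2

print(var_population, var_sample)
```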
Errors with Count and Count Numbers: These two summary functions either count the errors (Count)
or ignore them (Count Numbers). The errors are not shown in the item totals.
Count Numbers: Blank cells, errors, and text are not counted.
Count: Text, numbers and errors are counted. Blank cells are not counted.
Errors with Other Summary Functions: For all other Summary Functions, if errors are in the
source data field:
the first error encountered in the source data is displayed in the pivot table
the total is not calculated - it shows the first error from the source data.
In the data, #VALUE! is the first error listed, so it appears in the pivot table.
However, if we sort the data with the latest dates at the top, the #DIV/0! error is first. Then,
refresh the pivot table, and it shows the #DIV/0! error.
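The "first error wins" behaviour described above can be imitated in Python. The function name summarize_sum and the sample values are hypothetical; this is a sketch of the rule, not of Excel's internals.

```python
ERRORS = {"#VALUE!", "#DIV/0!"}  # a subset of Excel error values, for illustration

def summarize_sum(values):
    """Sum the values, but return the first error encountered, if any."""
    total = 0
    for v in values:
        if v in ERRORS:
            return v
        total += v
    return total

data = [10, "#VALUE!", 5, "#DIV/0!"]
print(summarize_sum(data))                   # first error in the original order
print(summarize_sum(list(reversed(data))))   # order changed, so a different error
print(summarize_sum([10, 5]))
```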
Totals and Subtotals: If subtotals, or row and column totals, are displayed, affected totals and
subtotals display the error. And even though they don't show errors in the item totals, the Count
and Count Numbers functions will also display errors in their totals, if both of these conditions
are met:
other summary functions are included in the pivot table, and those fields contain errors in
the data
the Count and Count Number fields contain errors in the data
For example, in the screen shot below, an Average for the Price field has been added, and that
field contains a #DIV/0! error. As a result:
The Count Nums and Count Grand Totals show the #VALUE! error, because they're
based on the Total field, which contains errors in the data
However, the "Count of Date" Grand Total is correct, because the Date field does not
contain any errors in the data
With Value Field Settings, we can set the calculation type in the PivotTable. We can also
decide on how we want to display the values.
In the box Show Values As, No Calculation is displayed. Click the Show Values As box. We
can find several ways of showing our total values.
% of Grand Total
% of Column Total: Suppose we want to summarize the values as % of each month total.
Click on Sum of Order Amount in ∑ VALUES area.
Select Value Field Settings from the dropdown list. The Value Field Settings dialog box
appears.
In the Custom Name box, type % of Month Total.
Click on the Show values as box.
Select % of Column Total from the dropdown list.
Click OK.
The PivotTable summarizes the values as % of the Column Total. In the Month columns, we
will find the values as % of the specific month total.
Count: Suppose we want to summarize the values by the number of Accounts region wise,
salesperson wise and month wise.
Deselect Order Amount.
Drag Account to ∑ VALUES area. The Sum of Account will be displayed in the ∑
VALUES area.
Click on Sum of Account.
Select Value Field Settings from the dropdown list. The Value Field Settings dialog box
appears.
In the Summarize value field by box, select Count. The Custom Name changes to Count
of Account.
Click OK.
The Count of Account will be displayed as shown below −
Grouping and Ungrouping Field Values
You can group and ungroup field values to define your own clustering. For example, you might
want to know the data combining East and North regions.
Select the East and North items of the Region field in the PivotTable, along with the
nested Salesperson field items.
Click the ANALYZE tab on the Ribbon.
Click Group Selection in the group – Group.
The items East and North will be grouped under the name Group1. In addition, a new group named
South is created, under which South is nested, and a new group named West is created, under which
West is nested.
You can also observe that a new field – Region2 is added in the PivotTable Fields list, which
appears in the ROWS area.
Select the South and West items of the Region2 field in the PivotTable, along with the
nested Region and Salesperson field items.
Click the ANALYZE tab on the Ribbon.
Click Group Selection in the group – Group.
The items – South and West of the field Region will be grouped under the name Group2.
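Grouping items amounts to mapping each item to a group name and then aggregating by that name. A minimal Python sketch, with hypothetical sales figures:

```python
# Hypothetical order totals per region.
sales = {"East": 100, "North": 50, "South": 70, "West": 30}

# Mapping from each Region item to its group, as in the steps above.
groups = {
    "East": "Group1",
    "North": "Group1",
    "South": "Group2",
    "West": "Group2",
}

grouped = {}
for region, amount in sales.items():
    group = groups[region]
    grouped[group] = grouped.get(group, 0) + amount

print(grouped)
```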
Consider the following PivotTable, wherein you have the employee data summarized by Count
of EmployeeID, hiredate wise and title wise.
Suppose you want to group this data by the HireDate field that is a Date field into years and
quarters.
If you want to ungroup this grouping, you can do as shown earlier, by clicking Ungroup in the
group – Group on the Ribbon.
Pivot Table- Filtering Data
You might have to do in-depth analysis on a subset of your PivotTable data. This might be
because you have large data and your focus is required on a smaller portion of the data or
irrespective of the size of the data, your focus is required on certain specific data. You can filter
the data in the PivotTable based on a subset of the values of one or more fields. There are
several ways to do that as follows −
Consider the following PivotTable wherein you have the summarized sales data region wise,
salesperson wise and month wise.
Manual Filtering
You can also filter the PivotTable by picking the values of a field manually. You can do this by
clicking on the arrow in the Row Labels or Column Labels cell.
Suppose you want to analyze only February data. You need to filter the values by the field
Month. As you can observe, Month is part of Column Labels.
As you can observe, there is a Search box in the dropdown list and below the box, you have the
list of the values of the selected field, i.e. Month. The boxes of all the values are checked,
showing that all the values of that field are selected.
Uncheck the (Select All) box at the top of the list of values.
Check the boxes of the values you want to show in your PivotTable, in this case February
and click OK.
The PivotTable displays only those values that are related to the selected Month field value –
February. You can observe that the filtering arrow changes to a filter icon to indicate that a filter
is applied. Place the cursor on the icon.
You can observe that a text box is displayed indicating that the Manual Filter is applied on the field
Month.
If all the values of the field are not visible in the list, drag the handle in the bottom-right corner
of the dropdown to enlarge it. Alternatively, if you know the value, type it in the Search box.
Suppose you want to apply another filter on the above filtered PivotTable. For example, you
want to display the data for Walters, Chris for the month of February. You need to refine
your filtering by adding another filter for the field Salesperson. As you can observe,
Salesperson is part of Row Labels.
Click Salesperson from the dropdown list. The list of the values of the field – Salesperson
will be displayed.
Uncheck (Select All) and check Walters, Chris.
Click OK.
The PivotTable displays only those values that are related to the selected Month field value –
February and Salesperson field value - Walters, Chris.
The filtering arrow for Row Labels also changes to a filter icon to indicate that a filter is applied.
Place the cursor on the icon on either Row Labels or Column Labels.
A text box is displayed indicating that the Manual Filter is applied on the fields – Month, and
Salesperson.
You can thus filter the PivotTable manually based on any number of fields and on any number
of values.
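Manual filtering on several fields can be sketched in Python as keeping only the rows whose values are among the checked items. The rows, the amounts and the second salesperson are hypothetical.

```python
# Hypothetical summarized sales rows.
sales = [
    {"salesperson": "Walters, Chris", "month": "January", "amount": 500},
    {"salesperson": "Walters, Chris", "month": "February", "amount": 700},
    {"salesperson": "Doe, Jane", "month": "February", "amount": 300},
]

def manual_filter(rows, **checked):
    """Keep rows whose value for each named field is among the checked values."""
    return [
        row for row in rows
        if all(row[field] in values for field, values in checked.items())
    ]

february = manual_filter(sales, month={"February"})
chris_february = manual_filter(
    sales, month={"February"}, salesperson={"Walters, Chris"}
)

print(len(february), chris_february)
```

Each keyword argument plays the role of one field's checkbox list, so any number of fields and values can be combined.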
Filtering by Text
If you have fields that contain text, you can filter the PivotTable by Text, provided the
corresponding field label is text-based. For example, consider the following Employee data.
The data has the details of the employees – EmployeeID, Title, BirthDate, MaritalStatus,
Gender and HireDate. Additionally, the data also has the manager level of the employee (levels
0 – 4).
Suppose you have to do some analysis on the number of employees reporting to a given
employee by title. You can create a PivotTable as given below.
You might want to know how many employees with ‘Manager’ in their title have employees
reporting to them. As the Label Title is text-based, you can apply the Label Filter on the Title
field as follows −
Filtering by Values
You might want to know the titles of the employees who have more than 25 employees
reporting to them. For this, you can apply the Value Filter on the Title field as follows −
The PivotTable will be filtered to display the employee titles who have more than 25 employees
reporting to them.
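The Label Filter and the Value Filter differ in what they test: the first looks at the row label, the second at the summarized value. A Python sketch with hypothetical titles and counts:

```python
# Hypothetical pivot rows: title -> number of employees reporting to that title.
reports_by_title = {
    "Production Manager": 30,
    "Marketing Manager": 8,
    "Vice President": 40,
    "Production Technician": 0,
}

# Label filter: keep titles whose text contains 'Manager'.
label_filtered = {
    title: n for title, n in reports_by_title.items() if "Manager" in title
}

# Value filter: keep titles whose summarized value is greater than 25.
value_filtered = {
    title: n for title, n in reports_by_title.items() if n > 25
}

print(label_filtered)
print(value_filtered)
```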
Filtering by Dates
You might want to display the data of all the employees who were hired in the fiscal year 2014-15.
You can use Date Filters for this as follows −
Include the HireDate field in the PivotTable. Now, you do not require manager data and
so remove ManagerLevel field from the PivotTable.
Now that you have a Date field in the PivotTable, you can use Date Filters.
The PivotTable will be filtered to display only the data with HireDate between 1st April 2014
and 31st March 2015.
You can group the dates into Quarters as follows −
Right-click any of the dates and click Group. The Grouping dialog box appears.
Type 4/1/2014 in the box Starting at. Check the box.
Type 3/31/2015 in the box Ending at. Check the box.
Click Quarters in the box under By.
The dates will be grouped into quarters in the PivotTable. You can make the table look compact
by dragging the field HireDate from ROWS area to COLUMNS area.
You will be able to know how many employees were hired during the fiscal year, quarter wise.
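The date range and quarter grouping can be imitated in Python. The hire dates are hypothetical, and the sketch assumes that the quarter labels follow calendar quarters (Qtr1 = January to March); dates outside the range are simply dropped here.

```python
from collections import Counter
from datetime import date

START, END = date(2014, 4, 1), date(2015, 3, 31)

# Hypothetical hire dates; two fall outside the fiscal year.
hire_dates = [
    date(2013, 12, 1),
    date(2014, 4, 15),
    date(2014, 8, 1),
    date(2015, 1, 10),
    date(2015, 2, 20),
    date(2015, 6, 1),
]

def quarter_label(d):
    """Calendar-quarter label in the Qtr1..Qtr4 style."""
    return f"Qtr{(d.month - 1) // 3 + 1}"

in_range = [d for d in hire_dates if START <= d <= END]
hires_per_quarter = Counter(quarter_label(d) for d in in_range)

print(dict(hires_per_quarter))
```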
Filtering Using Top 10 Filter
You can use the Top 10 Filter to display the top few or bottom few values of a field in the
PivotTable.
In the first box, click on Top (You can choose Bottom also).
In the second box, enter a number, say, 7.
In the third box, you have three options by which you can filter.
o Click on Items to filter by number of items.
o Click on Percent to filter by percentage.
o Click on Sum to filter by sum.
As you have count of EmployeeID, click Items.
In the fourth box, click on the field Count of EmployeeID.
Click OK.
The top seven values by count of EmployeeID will be displayed in the PivotTable.
As you can observe, the highest number of hires in the fiscal year is that of Production
Technicians and a predominant number of these are in Qtr1.
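A Top N filter by Items can be sketched in Python as sorting by the summarized value and keeping the first N rows. The titles and counts are hypothetical, and this sketch does not reproduce how Excel treats ties.

```python
# Hypothetical count of EmployeeID per title.
hires_by_title = {
    "Production Technician": 40,
    "Buyer": 3,
    "Janitor": 2,
    "Engineer": 5,
}

def top_items(counts, n):
    """Return the n (title, count) pairs with the largest counts."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]

print(top_items(hires_by_title, 2))
```

Sorting in ascending order instead (reverse=False) would give the Bottom option.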
If your PivotTable has a date field, you can filter the PivotTable using Timeline.
Create a PivotTable from the Employee Data that you used earlier and add the data to the Data
Model in the Create PivotTable dialog box.
As you can observe, All Periods – in Months are displayed on the Timeline.
You might have to clear the filters you have set from time to time to switch across different
combinations and projections of your data. You can do this in several ways as follows −
You can clear all the filters set in a PivotTable at one go as follows −
You can assign a Filter to one of the fields so that you can dynamically change the PivotTable
based on the values of that field.
The Filter with the label Region appears above the PivotTable (if you do not have
empty rows above your PivotTable, the PivotTable is pushed down to make space for the Filter).
You will observe that
A drop-down list with the values of the field Region appears. Check the box Select Multiple
Items.
By default, all the boxes are checked. Uncheck the box (All). All the boxes will be unchecked.
Then check the boxes - South and West and click OK.
The data pertaining to South and West regions only will get summarized.
In the cell next to the Filter Region - (Multiple Items) is displayed, indicating that you have
selected more than one item. However, how many items and / or which items is not known from
the report that is displayed. In such a case, using Slicers is a better option for filtering.