
Measures of Forecast Accuracy:

There are many measures of forecast accuracy, ranging from the simple sum of errors to more
complex calculations such as the Durbin-Watson statistic and MASE. Many of these measures have
biases that would drive forecasters to over- or under-forecast, depending on the measure
used. Some measures are scale-dependent, some break down when demand is intermittent, and
some produce results that are only useful as a tracking statistic, not as an actual measure of
performance. I have identified 14 ways of measuring forecast performance, and I will examine
each of them in turn.

Methods discovered:
Mean of Error
Mean of Absolute Error
MAPE (Mean Absolute Percentage Error)
MAPEf (MAPE with respect to Forecast amount)
MsAPE (Symmetric MAPE)
MmaxAPE (MAPE using highest of forecast and actual values)
MtAPE (truncated MAPE; caps each error at 100%)
Average of MAPE and MAPEf
RMS (Root Mean Squared)
MRMS (RMS divided by MAPE)
Durbin-Watson Statistic
Tracking Signal (Indicator of bias in forecasts)
MASE (Mean Absolute Scaled Error)
MSSE (Mean Squared Scaled Error)
Evaluation of methods:
20 random dice rolls (1 to 6) were taken to generate actuals. From this range of 1-6, an
average of 3.5 would be expected (1+2+3+4+5+6 = 21; 21 / 6 = 3.5). Constant forecasts
of 2, 3 and 5 were evaluated to check for bias of a measure. Some measures use a base level
(naïve) forecast to measure themselves against. The naïve forecast used in the simulation was
one where the forecast for the next time period was whatever actually occurred in the previous
time period.
Using Excel, the forecast measures were recalculated hundreds of times to see if any patterns of bias
could be seen. As all measures can be produced in an Excel spreadsheet, any and all of the
measures may be produced from the OPSDB once the forecasts are loaded into that system.
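
For illustration, here is a minimal Python sketch of the same experiment (the helper name bias_check and the run counts are ours, not part of the original workbook). Averaging MAPE and MAPEf over many simulated runs makes the opposing biases of the two measures visible:

```python
import random
import statistics

def bias_check(forecast_level, n_rolls=20, runs=1000):
    # Average MAPE and MAPEf over many simulated runs of 20 dice rolls.
    mape_runs, mapef_runs = [], []
    for _ in range(runs):
        actuals = [random.randint(1, 6) for _ in range(n_rolls)]
        # MAPE divides each error by the actual; MAPEf divides by the forecast.
        mape_runs.append(statistics.mean(
            abs(forecast_level - a) / a for a in actuals))
        mapef_runs.append(statistics.mean(
            abs(forecast_level - a) / forecast_level for a in actuals))
    return statistics.mean(mape_runs), statistics.mean(mapef_runs)

for level in (2, 3, 5):
    mape, mapef = bias_check(level)
    print(f"constant forecast {level}: MAPE={mape:.0%}, MAPEf={mapef:.0%}")
```

With enough runs, MAPE comes out lowest for the low constant forecast and MAPEf lowest for the high one, which is the bias pattern described in the observations below.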

Observations on Methods:
1. Mean of actual error.
This measure is useful to see if forecasts are generally higher or lower than actuals.
While it can show if forecasts are biased, a single large error can hide smaller ones. Also, this
measure can reduce to zero if over- and under-forecast errors cancel each other out. This
excludes the measure from serious consideration.
2. Mean of Absolute Error.
MAE gives the average absolute error. This removes the cancelling-out effect of the first
measure, but it is susceptible to large errors skewing the overall output.
3. MAPE.
This measure can be used to compare forecasts that have different scales of volume.
It shows a definite bias towards under-forecasting, which can be detrimental
to customer experience (and, by extension, the bottom line, as customers could abandon
calls and go to other businesses).
4. MAPEf.
This measure can also be used to compare forecasts on different scales of volume. As
it measures the error against the forecast rather than against the actuals, it shows
a bias towards over-forecasting, which would help customer experience but add to
overheads.
5. MsAPE.
This measure compares the error against the average of forecast and actual. It tries
to combine the best of MAPE and MAPEf, but it still has an inherent bias towards
under-forecasting, although less than MAPE alone.
6. MmaxAPE.
This measure divides the error by the higher of the forecast and actual values, which
keeps each error percentage bounded. It removes the under-forecast bias of MAPE, but
doesn't remove the over-forecast bias that you get from MAPEf.
7. MtAPE.
This measure truncates the maximum error to 100%, which reduces the effect outliers
have on the overall measure. It has the effect of changing the under-forecast bias seen
with MAPE into an over-forecast bias.
8. Average of MAPE and MAPEf.
Averaging the MAPE and MAPEf values tends to cancel out most of the bias of each,
but does not produce a meaningful value that can be used for measuring
performance.
9. RMS.
RMS produces a measure that is unbiased, but it can't be used to compare forecasts, as
the result is not scaled.
10. MRMS.
Dividing RMS by MAPE gives a figure that can be used to compare forecasts,
though its use is in comparing forecast methodologies; it cannot be used to give a rating.
11. Durbin-Watson Statistic.
The statistic shows whether forecast errors are correlated from one period to the next:
a value of 2 indicates no pattern, values towards 0 indicate errors that stay persistently
on one side, and values towards 4 indicate errors that swing back and forth. This measure
is useful to show the consistency of the forecast, but not how big the error is, as the
number will always fall between 0 and 4.
12. Tracking Signal.
Another measure to show if the forecast is swinging too far away from the actual. A
value between -4 and +4 indicates no adjustment is needed. This measure is useful as a
forecast adjustment indicator only.
13. MASE.
MASE compares your forecast to a naïve forecast. A score of 0 means your forecast is
perfect, a score of 1 indicates your forecast is no better than the naïve forecast, and a
score greater than 1 means you have some adjustment to do.
14. MSSE.
This operates in the same manner as MASE, but squares the errors instead of taking
their absolute values. It is the least biased of the measures, and its score carries the
same indications as MASE. (A minimal sketch of both measures follows this list.)
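
Here is a minimal sketch of MASE and MSSE, assuming the standard in-sample naïve denominator (the function names are ours):

```python
def mase(actuals, forecasts):
    # Mean Absolute Scaled Error: mean absolute forecast error divided by
    # the mean absolute error of a naive previous-period forecast.
    mae = sum(abs(f - a) for f, a in zip(forecasts, actuals)) / len(actuals)
    naive_mae = sum(abs(actuals[t] - actuals[t - 1])
                    for t in range(1, len(actuals))) / (len(actuals) - 1)
    return mae / naive_mae

def msse(actuals, forecasts):
    # Mean Squared Scaled Error: as MASE, but with squared errors throughout.
    mse = sum((f - a) ** 2 for f, a in zip(forecasts, actuals)) / len(actuals)
    naive_mse = sum((actuals[t] - actuals[t - 1]) ** 2
                    for t in range(1, len(actuals))) / (len(actuals) - 1)
    return mse / naive_mse
```

In both cases a result of 0 is a perfect forecast, 1 matches the naïve forecast, and anything above 1 is worse than simply repeating the previous period.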
Measures of Workforce Management Team Success
By Maggie Klenke, The Call Center School
A frequent question from workforce management (WFM) professionals is "how do I measure
the success of the WFM team?" This is a challenge, since WFM does not work in isolation but is
an integral part of the overall delivery of service from the contact center. Some would say that
service level or ASA is the ultimate metric to use while others argue that WFM has little control
over whether operations follows the plan, whether the schedule options people are hired for
produce a good match to the requirements, etc. There is no single metric that can address the
wide variety of roles that WFM teams are asked to play. Some focus entirely on forecasting and
scheduling while others might add real-time adherence management, ACD routing script
management, development and delivery of reports, and so on. However, basic forecasting,
scheduling, and managing the intra-day changes needed to match the staff to the workload
seem to be in virtually every WFM team's area of responsibility.
The Call Center School offers the following as the four basic metrics that can be applied to get a
good picture of the WFM team's success. These are designed not just as a scorecard for the
WFM operation, but to provide the analysis tools needed to identify opportunities for
continuous improvement in managing the staffing resource. Knowing that the process is not as
good as desired isn't enough; the team must have the mechanisms to see exactly where the
problems are occurring so that they can be more readily analyzed and improved. As expected,
not all of these recommended metrics are totally within the control of the WFM team. They
involve the operations team, human resources, and others. It is the ultimate delivery of the
right number of staff, in the right place, with the right skills, at the right time to meet customer
demand with quality and reasonable cost control that is the goal.
FORECASTING ACCURACY
A forecast is the foundation of the entire WFM process, so forecast accuracy must be a top
success measure. The operations team has a role here in ensuring that any anomalies in the
actual workload are identified for cause so that future forecasts are better able to
accommodate and predict these drivers. Communication with departments that drive activity
for the contact center is also key to predicting the impacts of such things as marketing
campaigns, mailings, billing cycles, etc. Full understanding of what makes customers and staff
behave differently is essential to improving the accuracy of the forecast in many cases. There
are three major elements to consider in measuring forecast accuracy:
The elements of the forecast: the volume of work units (call volume, emails, chats, white mail,
etc.) and the average amount of time required to complete each unit (AHT) as well as the total
workload (volume multiplied by AHT). While it is common for centers to analyze the call volume
forecast, it is less common to see analysis of the AHT. Both are equal partners in the workload
calculation and should be analyzed separately as well as combined to help identify
opportunities for improving accuracy. For example, we have worked with centers where the
AHT is relatively steady across all times of day and others that double from morning to night
shift. It's critical to understand if the difference is a function of less supervision, more new hires
in unattractive shifts, customers calling with more difficult problems when they have time to
talk, or a combination of effects. Digging into that type of question can not only improve the
accuracy of forecasting the actual workload in each time period of the day, but can also help in
identifying opportunities to reduce AHT through other measures.
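For illustration (using made-up numbers): 1,200 forecast calls at an average of 300 seconds AHT is 360,000 seconds, or 100 staff-hours, of workload for that period; a 10% miss on either the volume or the AHT forecast moves the total workload by the same 10%, which is why both deserve separate analysis.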
The timeframe over which the forecast accuracy is analyzed: monthly, weekly, daily, and/or
half-hourly. Analyzing accuracy at the monthly or weekly level serves as a reasonable scorecard
but does little to help the analyst discover where the forecast may be consistently over or
under the actual demand. Analysis at the interval level is required to focus attention on those
elements of the forecast that can be improved. This kind of analysis is a bit like measuring
service level or ASA over long periods. There can be rather dramatic fluctuations within that
period that offset each other making the overall average look good. However, dealing with wide
swings at the daily or half-hourly interval level puts an unrealistic demand on the operations
team. The goal is a consistent level of accuracy, not an average level over time.
The methodologies for analyzing the accuracy: percent variation, standard deviation of the
variation and correlation coefficient to identify pattern anomalies. The percent that the actual
varies from the forecast (forecast minus actual divided by forecast) is the most commonly used
analysis of forecasting accuracy. At the interval level, it is also a pretty accurate picture.
However, where there is a wealth of data to analyze, it is helpful to have an easier way to put a
finger on the pulse of the accuracy over a long period, and that is best found by calculating the
standard deviation of the variation percentages. A small deviation is better than wide swings in
the variation and this will show up even if the average variation seems quite small. Another tool
to analyze the variations over time is the correlation coefficient which analyzes the patterns
from one period to another. The correlation coefficient analysis can be applied to the variation
percentages but is probably most useful when applied to the arrival patterns of work volume
and the changes in AHT over the intervals. It compares two periods to see if the patterns are a
match or not. For example, the typical Monday might adhere to a relatively consistent pattern,
but the correlation analysis may reveal that one particular Monday varies in pattern even if the
total volume of workload is within normal boundaries. This would suggest that further
understanding of what happened on that Monday is useful. This level of detail is also critical to
determining which historical data is normal and which is not when deciding to allow the data
to average into the history kept for forecasting. Data which is outside of an acceptable range
should be considered for adjustment, storing separately as a sample of a particular repeatable
event, or even being discarded as an anomaly unlikely to reoccur.
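As a sketch of these three calculations (the helper name is hypothetical; inputs are plain lists of interval values, and statistics.correlation requires Python 3.10 or later):

```python
import statistics

def accuracy_profile(forecast, actual):
    # Percent variation per interval: (forecast - actual) / forecast.
    variation = [(f - a) / f for f, a in zip(forecast, actual)]
    return {
        # The usual headline number: average percent variation.
        "mean_variation": statistics.mean(variation),
        # A small standard deviation means consistent accuracy, even when
        # over- and under-forecast intervals offset each other in the mean.
        "stdev_variation": statistics.stdev(variation),
        # Correlation compares the patterns, not the totals: a value near
        # 1.0 means the arrival shape matched even if total volume was off.
        "pattern_correlation": statistics.correlation(forecast, actual),
    }
```

Applied to two Mondays' interval-level volumes, a low pattern_correlation flags the day whose shape broke from the norm even when its total volume looked ordinary.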
SCHEDULING EFFICIENCY
Sometimes referred to as schedule inflexibility, this measures how effectively the scheduling
plan has matched the number of agents available to handle the work to the required staffing
based on the forecast workload and the shrinkage assumptions. The ability to make a perfect
match of staff to requirements is subject to the limitations in the schedule options the WFM
analyst has to work with and the creativity of applying the flexibility that is available to the
process. It can also be influenced by whether the shrinkage assumptions are detailed by
interval or applied as a consistent percentage of loss across all periods. Where there are no
part-time staff plus rigid rules about scheduling (e.g., days off must be consecutive, breaks must
be at a certain interval, etc.), it is difficult to match the constantly changing demand. Keeping
this mismatch as low as possible ensures the best utilization of the personnel resource, more
consistent speed of answer for customers, more consistent occupancy levels for staff, and
minimal cost. Knowing where the mismatches are occurring can point to more creative ways to
work within the rules in place, but also encourage the entire operation to identify new shift
types to consider, hiring options to be offered, etc. If the forecast is reasonably accurate and
the schedules match up to that forecast reasonably well, the intra-day chaos of constant
adjustments can be minimized.
The calculation requires the following elements:
Total required bodies in chairs based on the forecast plus the assumed shrinkage adjustment.
(Where a skill-based routing configuration is in place, the output of a sophisticated algorithm is
needed to identify the requirements across all skill combinations while in a traditional single-
skilled assignment situation, basic Erlang C will be acceptable.) The shrinkage percentage
assumption can be added in each interval to raise the bodies in chairs requirement to the
number that is actually needed on the schedule so that absenteeism and other factors can
occur without impacting the service delivery.
Total scheduled staff in each period compared to the requirement. This is the staff scheduled
including the shrinkage assumption not yet covered by the schedule elements. For example,
looking at a schedule for next week, the expectations for vacations and training might already
be in the schedules but daily absenteeism would still need to be a shrinkage assumption. In a
single-skilled environment, this will be relatively easy to determine, but more complex in skill-
based configurations where a single agent may be capable of serving more than one type of call
at any moment in time. (If preferred, the shrinkage can be deducted from the scheduled staff
rather than added to the required bodies in chairs.)
Total variation of scheduled staff compared to required staff: all extra and short staff totals
by period. When the "overs" and "unders" are added together, the total for the period is
divided by the total requirements for that period. The result is the percent of variation that is
caused by the mismatch of staffing to requirements; a minimal sketch of this calculation
follows this list. (A standard deviation analysis of the variances can be done if the over- and
understaffing numbers are converted to percentages of variation by period. When the staffing
requirements by period vary significantly by day, week, or time of day, the raw numbers of
staff variance will provide an inaccurate result.)
Analysis period: This is generally analyzed at the day or week level, although analysis of the
patterns of variations intraday can be effective in determining trends of mismatch.
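
Here is a minimal sketch of the variation calculation (the names are hypothetical; it grosses the requirement up by dividing by 1 minus shrinkage, which is one common convention for "adding" the shrinkage percentage, so treat that as an assumption):

```python
def scheduling_efficiency(required, scheduled, shrinkage=0.0):
    # Gross up the bodies-in-chairs requirement so that shrinkage
    # (absenteeism, training, etc.) can occur without hurting service.
    adjusted = [r / (1 - shrinkage) for r in required]
    # Overs and unders both count as mismatch, so use absolute differences.
    mismatch = sum(abs(s - r) for s, r in zip(scheduled, adjusted))
    return mismatch / sum(adjusted)

# Example: 30% shrinkage, three intervals of required bodies in chairs.
print(f"{scheduling_efficiency([10, 12, 15], [15, 17, 20], 0.30):.1%}")
```

The same function works at daily or weekly granularity; the closer the result is to zero, the better the schedule plan matched the requirement.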
OPERATIONAL EFFECTIVENESS
This measures the ability of the WFM and Operations teams to identify and adjust to the
situations that develop within the week and day so that the staffing levels are as near ideal as
possible in each period and the speed of answer goal (service level or ASA) is met as
consistently as possible. The following elements are needed:
The revised bodies in chairs requirement by period based on the actual workload (volume and
AHT). This is the calculated staffing requirement once the true workload is known after the fact.
The actual positions staffed for each period: This is the true bodies in chairs by period, or the
number of staff who actually logged in and were available for work. (In a single-skill
environment, this is easily reported by the ACD, but in skill-based environments may be an
estimate based on allocation of staff by skill.)
The forecast speed of answer by period: When the forecast is created and the staffing model
applied, a forecast for the speed of answer for each period can be calculated. This is the
expectation if the work arrives exactly as predicted and the staff follows the plan perfectly.
The actual speed of answer by period: Once the actual workload and staffing are known, the
ACD typically reports the actual speed of answer for that period.
The total variation of staffing and speed of answer results (plus and minus) for each period,
as well as the standard deviation and correlation coefficient to identify pattern anomalies. This
is where everything comes together. Regardless of the accuracy of the forecast or the efficiency
of the planned schedule, when the day arrives, the whole center does what it can to make
adjustments to deliver the speed of answer goal. Measuring the actual results in terms of the
speed of answer variation and the match of the actual staffing to the final requirements will
identify where the challenges occurred so that they may be more readily analyzed for possible
improvements.
EMPLOYEE SATISFACTION WITH THE WFM PROCESS
Like any operation within a contact center, customer satisfaction is a key element of judging
performance, and for the WFM team the customer is operations, including the agents and
supervisory staff. Regular employee satisfaction surveys should be conducted to measure the
operations team's satisfaction with such WFM processes as:
Shift choices available
Shift bidding/assignment process fairness and effectiveness
Time-off management fairness
Schedule adherence management fairness
Willingness of the WFM team to accommodate requests
When all of these elements have been analyzed, the WFM team should have a good picture of
not only the current level of performance but where the opportunities are to focus attention
for improvements. This level of analysis can also be used to support the business cases needed
when schedules need to be more flexible, coordination between WFM and other teams is
lacking, or employee retention is a challenge.

Maggie Klenke is a Founding Partner of The Call Center School. She can be reached at 615-812-
8411 or by email at [email protected].
© 2011 Society of Workforce Planning Professionals
