Call Center Simulation Modeling: Methods, Challenges, and Opportunities
Vijay Mehrotra
University of San Francisco
All content following this page was uploaded by Vijay Mehrotra on 17 May 2014.
Mehrotra and Fama
makes it difficult for decision makers to understand system dynamics without effective modeling.

The remainder of this tutorial is organized as follows. In Section 2, we motivate the need for and value of simulation in the context of effective call center management. In Section 3, we discuss how call centers make use of simulation models, focusing on the key output statistics that are used for system performance evaluation. In Section 4, we provide a modeling framework for call center simulation, and discuss the key inputs associated with simulation models, introducing the concepts through the formulation of a simulation model. In Section 5, we discuss business decisions associated with this model and explore some of the results of our analysis. Finally, in Section 6, we propose likely future directions for call center simulation.

Note: Throughout this paper, we will use the term “call center” and focus our discussion on centers that are processing only phone calls (either inbound, outbound, or both). Another common term in this industry is “contact center,” which refers to centers handling not only phone calls but other types of customer contacts such as email, fax, paper, and/or chat sessions. We have chosen to focus on call centers here for clarity of exposition. However, extending the ideas presented here from phone-only call centers to multi-channel contact centers is straightforward, and we have done so extensively.

2 CALL CENTER MANAGEMENT CHALLENGES AND THE NEED FOR MODELS

Those responsible for managing call centers face a very difficult set of challenges. At a high level, they must strike a balance between three powerful competing interests, as shown in Figure 1 below.

On a day-to-day basis, while simultaneously balancing costs, service quality, and employee satisfaction, these executives and managers must (implicitly or explicitly) answer a number of important questions for which decision support models are valuable:
• How many agents should we have on staff with which particular skills? How should we schedule these agents’ shifts, breaks, lunches, training, meetings and other activities?
• How many calls of which type do we expect at which times?
• How quickly do we want to respond to each type of inbound call?
• How should we cross-train our agents? How should we route our calls to make the best use of these resources?
• Given a forecast, a routing design, and an agent schedule, how well will our system perform?
• What is our overall capacity? How will a spike in call volumes impact our overall performance?
• How is our center doing right now? What has changed since we did our last forecast and published our schedules? If the changes are significant, what can I do to respond to minimize the impact on the rest of the day or week?

There are a variety of mathematical methods (see Grossman et al. 2001 and Mandelbaum 2001 for more discussion of this) and associated software to help call center personnel as they try to address these types of questions, most notably workload forecasting models based on time series and agent scheduling optimization solutions.

However, over the past several years, simulation has emerged to play an important role in the call center design and management arena.

3 HOW CALL CENTERS USE SIMULATION
notified about the state of their credit by mail; and (c) additional limitations may be placed on the account.

The call center itself features two queues: an Inbound queue and an Outbound queue. The time period that we are using for the analysis is one week.

The two agent groups and the basic routing logic are illustrated in Figure 3. Calls from the Inbound queue will arrive and be served by an agent from Group #1, the Inbound-Only group. If no agent from this group is available, the calls will wait in queue. If, after some pre-defined period of time, the Inbound call has not yet been served, it will then also queue for an agent from Group #2, the Cross-Trained Outbound group.

[Figure 3. Labels recovered from the graphic: Inbound Calls: Arriving Randomly Based on FCs; Outbound Call Prospects (Effectively Unlimited Pool); Abandonment]

val, it has been customary to translate call volume forecasts into λ values for Poisson arrivals and AHT forecasts into µ values for Exponential service times.

A great deal of research has been conducted on call volume forecasting models, and the interested reader is referred to Mabert (1985) and Andrews and Cunningham (1995) for valuable discussions on this topic.

Forecasts must be created for each queue for each time interval in the simulation period.

The most common call center forecasting approach is to create weighted averages of historical data for specific time intervals over the course of a week. For example, the initial call volume forecast for 8:15 a.m. - 8:30 a.m. next Tuesday might be computed as the average of call volume for the 8:15 a.m. - 8:30 a.m. period on the past several Tuesdays. From here, alterations may be (or more commonly, should be!) made based on additional information (e.g. specific marketing activities for a sales center or emerging product issues for a technical support center) that may cause volume to differ substantially from previous patterns.
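As an illustration of the forecasting step, the following is a minimal Python sketch of how an interval forecast might be built from past weeks and then translated into a Poisson arrival rate λ and an exponential service rate µ. The function names, the equal default weights, the sample volumes, and the 5-minute AHT are all hypothetical; only the 15-minute interval and the Poisson/Exponential convention come from the text.

```python
import random


def weighted_average_forecast(history, weights=None):
    """Forecast one interval's call volume from the same interval in past weeks.

    history: observed volumes (e.g. 8:15-8:30 on past Tuesdays), oldest first.
    weights: optional non-negative weights; equal weights are assumed if omitted.
    """
    if weights is None:
        weights = [1.0] * len(history)
    if len(weights) != len(history) or sum(weights) <= 0:
        raise ValueError("weights must match history and sum to a positive value")
    return sum(w * v for w, v in zip(weights, history)) / sum(weights)


def arrival_rate(volume_forecast, interval_minutes=15.0):
    """lambda: expected arrivals per minute for a Poisson arrival process."""
    return volume_forecast / interval_minutes


def service_rate(aht_minutes):
    """mu: completions per minute when service times are exponential with mean AHT."""
    return 1.0 / aht_minutes


def sample_arrival_times(lam, interval_minutes=15.0, rng=random):
    """Draw Poisson arrival times in one interval via exponential inter-arrival gaps."""
    if lam <= 0:
        return []
    times, t = [], rng.expovariate(lam)
    while t < interval_minutes:
        times.append(t)
        t += rng.expovariate(lam)
    return times


# Hypothetical volumes for the same interval on four past Tuesdays.
past_tuesdays = [112, 120, 104, 124]
forecast = weighted_average_forecast(past_tuesdays)   # 115.0 calls in the interval
lam = arrival_rate(forecast)                          # about 7.67 calls per minute
mu = service_rate(5.0)                                # 0.2 calls per minute (AHT assumed)
arrivals = sample_arrival_times(lam, rng=random.Random(42))
```

Manual adjustments of the kind the text recommends (marketing activity, emerging product issues) would be applied to `forecast` before it is converted to a rate.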
From a simulation perspective, each agent is viewed as a resource to perform certain types of work. Note that in the call center context, agents are productive only during the intervals in which they are scheduled to be handling phone calls.

In addition, it is conventional to model agents as completing the task that they are engaged in, even if it extends past the time at which they are to switch activities. That is, an agent within our simulation will be modeled as completing the phone call that he is working on before leaving for a break or a lunch.

A common step in call center simulation is to translate a set of individual agent schedules into a matrix of resources, where the dimensions of the matrix are defined by the number of Agent Groups and the number of Time Intervals.

In our example, we have leveraged the fact that our schedules are at a 15-minute level of granularity, and therefore prior to running the simulation we have converted these schedules into a number of on-phone agents for each group for each 15-minute interval.

4.5 Key Inputs: Abandonment Model and Parameters

Abandonment is one of the most hotly debated topics in call center management and research. There are two basic questions that must be answered in order to effectively model customer abandonment behavior:
1. What is the customer’s tolerance for waiting, and at what point will this customer hang up and thereby leave the queue?
2. How likely is the customer to call back, and after how long?

Many researchers (e.g. Hoffman and Harris 1986, Andrews and Parsons 1993) have examined the challenge of modeling these problems from both an empirical point of view and from an analytic perspective.

From our experience, these questions are difficult to answer not only because of the mathematical complexity of the queue dynamics but also because of a lack of observable data about customer abandonment and retrial. While many surveys have been done, we have observed great differences in customer behavior across different industries and different companies’ operations. In addition, information provided to callers about expected waiting time and/or position in queue can have a marked impact on abandonment behavior.

In our example model, simulated customers arrive at the call center and are served by an agent if one is available. If not, they join the queue, at which point they are also assigned a “life span” which is drawn from an exponential distribution. If a customer’s life span expires while they are still waiting in queue, they then abandon the queue.

That is, we represent customers’ tolerance for waiting in queue as an exponential random variable (as suggested by Garnett et al. 2002). We refer to the mean of this distribution as “the patience factor.”

Given this modeling choice, we must still deal with the challenge of selecting the patience factor, which we estimate from historical data about callers’ time in queue.

We do not include caller retrial in the example model.

4.6 Key Inputs: Agent Skills

Our definition of “Agent Skills” is comprised of three major types of inputs for each agent or group of agents:
1. What calls is the agent capable of handling?
2. Given a choice of multiple calls waiting, which will the agent handle (“call priority”)?
3. How fast will the agent be able to handle each type of call, and how often will the agent resolve the issue successfully (“call proficiency”)?

When combined with routing logic and call forecasts, these attributes fully specify the queueing model to be simulated.

In our example, we have three distinct groups of agents, each with different skills:
• Agent Group #1 (Inbound Only) handles only Inbound calls on a First-Come-First-Served basis. These agents have a call proficiency of 1.0 for Inbound Calls, so that their AHT is equal to the forecasted AHT for Inbound Calls.
• Agent Group #3 (Outbound Specialists) handles only Outbound calls. These agents have a call proficiency of 1.0 for Outbound Calls, meaning that their AHT is equal to the forecasted AHT for Outbound Calls.
• Agent Group #2 (Cross-Trained Outbound) handles both Inbound and Outbound calls. These agents have a call proficiency of 1.0 for Outbound Calls, meaning that their AHT is equal to the forecasted AHT for Outbound Calls. However, these agents will give priority to Inbound Callers if there are any waiting in queue, and have a call proficiency of 2.0 for Inbound calls (that is, their Inbound AHT is twice the forecasted AHT), reflecting the relative inefficiency of cross-training (see Pinker and Shumsky 2000 for more discussion of this phenomenon both in and out of the contact center).

4.7 Other Modeling Considerations

4.7.1 Shrinkage

It is well known that a certain amount of agent time will be lost, either in large blocks (unanticipated shift cancellations, partial day absences for personal reasons) or in small blocks (late arrivals to the call center, extra-long breaks, trips to the bathroom).

There is an important distinction between two different kinds of lost agent time. On one hand, agent time that
is known to be lost prior to the creation and publication of a schedule has essentially no additional impact on the simulation model beyond the fact that this particular agent is not included in the schedule.

On the other hand, scheduled time that is not worked, either because of unexpected absences or because of lack of rigorous adherence to agent schedules, is time that should be accounted for in the simulation if this represents a known phenomenon (e.g. higher absenteeism on Mondays). In the call center industry, this is known as “shrinkage” and it is a major management problem as well as a significant modeling challenge.

Most call centers have significant levels of shrinkage; we have seen many sites with over 30% overall. We have included a shrinkage level of 10% in our example model.

4.7.2 Additional Detail for Outbound Queues

As we have discussed earlier, the workflow associated with Outbound calls is very different from the logic for Inbound queues. At heart, this modeling difference stems from the fact that inbound calls are characterized by a random arrival pattern; in contrast, the outbound dialing pattern can be scheduled but each call features a random outcome (right party connect, wrong party connect, no answer).

In addition, as discussed in Section 3 above, the performance metrics associated with Outbound queues are quite different (overall RPCs achieved, rather than the queue and abandonment statistics that are typically used to evaluate Inbound queues). In order to effectively estimate the number and pattern of RPCs, simulation models require information about the probability that a given dial achieves an RPC, which typically varies by time of day, as well as the AHT associated with an RPC.

To model one level deeper, one might consider actually representing the detailed logic of the predictive dialer (see Samuelson 1999 for more on predictive dialer logic). However, this level of detail was not necessary for the types of business decisions being addressed by our example model, and so we have not included detailed dialer logic in it.

5 EXAMPLE: ROUTING STRATEGIES FOR A COLLECTIONS CALL CENTER

5.1 Operational Problem and Business Decisions

Throughout Section 4, we have described parts of the simulation model associated with this example. The call center of interest is illustrated in Figure 3, and the formulation was motivated by discussions with several blended inbound-outbound centers about optimal system design.

In our example, the call center is open Monday - Friday from 7:00 am to 6:00 pm. There are 50 Inbound agents (Group #1) and a total of 150 Outbound agents (Group #2 and Group #3). We treat agent schedules for each of the three agent groups, as well as call forecasts (a total of about 20,000 calls for the week) for the Inbound calls, as fixed inputs for this simulation model. In addition, we assume that there is an effectively unlimited pool of customers to contact with Outbound calls.

The operational problem facing the management of this call center is focused on call routing and agent skilling. Underlying this problem is the classical tension between specialization and cost.

In terms of specialization, Inbound agents are far more effective in handling Inbound calls than Outbound agents, while Inbound calls disrupt the rhythm and effectiveness of Outbound agents; for both of these reasons, it would be far better to have specialized agents for Inbound and Outbound calls respectively.

In terms of cost, there is a management objective of handling 80% of Inbound calls within 60 seconds for each interval of the day. With dedicated agents, this translates into a larger number of Inbound agents required than are actually available. Current staffing levels, therefore, will result in longer than desirable waiting times, which in turn are correlated with higher abandonment rates.

Specifically, the business decisions to be addressed are as follows:
• Of the 150 Outbound-skilled agents, how many of them should be enabled to handle Inbound calls and included in the Cross-Trained Outbound group?
• If no Inbound agents are available, how quickly should an Inbound call be offered to the Cross-Trained Outbound group?

In an ideal world, there would be an empirical “right answer” to these questions, a mathematically optimal solution that could be determined through sequential simulation runs.

In practice, however, such decisions typically involve substantial trade-offs that are difficult to value in relation to one another, and simulation’s role is to quantify the impact of different possible decisions.

The key output metrics for these simulations are:
1. Phone Service Levels (% of Inbound calls handled within 60 seconds).
2. Abandonment Levels (% of Inbound callers hanging up prior to receiving service).
3. Right Party Connects (total number of Outbound calls completed to the correct individuals).
4. Number of Overflows (of Inbound calls to Cross-Trained Outbound group).
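To make the Inbound side of the example model concrete, here is a minimal discrete-event sketch in Python. It is not the authors' implementation. It assumes constant staffing and a constant arrival rate over the week (no 15-minute schedule matrix and no shrinkage), it does not model Outbound work at all (so Right Party Connects are not produced and Cross-Trained agents count as idle whenever they are not on an Inbound call), and it measures Service Level against all offered calls. The function name, the AHT, and the patience factor are hypothetical; the agent counts, the roughly 20,000 weekly calls, the 55-hour week, the 60-second target, and the proficiency of 2.0 come from the text.

```python
import heapq
import itertools
import random
from collections import deque


def simulate_inbound(n_inbound, n_cross, calls_per_min, aht_min, patience_min,
                     overflow_wait_min, horizon_min, sl_threshold_min=1.0,
                     cross_proficiency=2.0, seed=0):
    """Simulate the Inbound queue only; Outbound work is NOT modeled (assumption)."""
    rng = random.Random(seed)
    tie = itertools.count()            # tie-breaker so the heap never compares payloads
    events = []                        # heap of (time, tie, kind, payload)
    free = {1: n_inbound, 2: n_cross}  # idle agents in Group #1 and Group #2
    queue = deque()                    # caller ids in arrival (FCFS) order
    waiting = {}                       # caller id -> (arrival time, overflow-eligible time)
    n = {"offered": 0, "answered": 0, "within_sl": 0, "abandoned": 0, "overflows": 0}

    def push(time, kind, payload):
        heapq.heappush(events, (time, next(tie), kind, payload))

    def dispatch(now):
        """Assign waiting callers to idle agents, trying Group #1 first."""
        while True:
            while queue and queue[0] not in waiting:   # skip callers who abandoned
                queue.popleft()
            if not queue:
                return
            caller = queue[0]
            arrived, eligible = waiting[caller]
            if free[1] > 0:
                group = 1
            elif free[2] > 0 and now >= eligible:      # overflow only after the wait
                group = 2
            else:
                return
            queue.popleft()
            del waiting[caller]
            free[group] -= 1
            n["answered"] += 1
            n["within_sl"] += (now - arrived) <= sl_threshold_min
            n["overflows"] += group == 2
            # Proficiency is treated as a multiplier on the forecasted AHT.
            mean_service = aht_min * (cross_proficiency if group == 2 else 1.0)
            push(now + rng.expovariate(1.0 / mean_service), "done", group)

    push(rng.expovariate(calls_per_min), "arrival", 0)
    while events:
        now, _, kind, payload = heapq.heappop(events)
        if kind == "arrival" and now <= horizon_min:
            n["offered"] += 1
            waiting[payload] = (now, now + overflow_wait_min)
            queue.append(payload)
            # Exponential "life span"; ignored if the caller is answered first.
            push(now + rng.expovariate(1.0 / patience_min), "abandon", payload)
            push(now + overflow_wait_min, "eligible", payload)
            push(now + rng.expovariate(calls_per_min), "arrival", payload + 1)
        elif kind == "abandon" and payload in waiting:
            del waiting[payload]       # life span expired while still in queue
            n["abandoned"] += 1
        elif kind == "done":
            free[payload] += 1
        dispatch(now)                  # "eligible" events only trigger a re-dispatch

    offered = max(n["offered"], 1)
    return {**n, "service_level": n["within_sl"] / offered,
            "abandon_rate": n["abandoned"] / offered}


# One replication of a 55-hour week (3,300 minutes); AHT and patience are made up.
result = simulate_inbound(n_inbound=50, n_cross=30, calls_per_min=20000 / 3300,
                          aht_min=6.0, patience_min=3.0, overflow_wait_min=0.5,
                          horizon_min=3300)
print(result["service_level"], result["abandon_rate"], result["overflows"])
```

Varying `n_cross` and `overflow_wait_min` corresponds to the two business decisions listed above. A fuller model would also track the Outbound dials that Cross-Trained agents give up when they take Inbound calls.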
5.2 Numerical Results

5.2.1 Determining the Number of Replications

For each of the individual scenarios that are discussed below, we ran multiple replications of the simulation model and computed estimates for performance measures based on the average of the run length.

For purposes of determining the number of runs for each scenario, we focused on average weekly Service Level for the Inbound queue as the statistic of interest. After each run, we would examine overall standard deviation of this statistic across all runs to date. We continued to run additional iterations until this overall standard deviation was under 2.5%, which we had set arbitrarily as our confidence threshold.

5.2.2 Base Case

Our baseline scenario is one with no Outbound Cross-Trained agents. This base case is listed as Scenario 1 in Table 1 below.

Table 1: Simulation Results For Base Case and Initial Cross-Training Scenarios
Scenario #   Number Cross-Trained   SL %   Abandon %   Number Interrupts   Total RPCs for Week
1.           0                      56.3   16          0                   2621.8
2.           10                     70.1   10          1288                2494.4
3.           20                     80     6.3         2211                2400.5
4.           30                     86.7   4           2816                2339
5.           40                     92.1   2.3         3242                2301.7
6.           50                     93.5   1.7         3374                2284
7.           75                     97.1   0.8         3715                2255.1
8.           100                    98     0.6         3726                2250.4

From this base case, it was clear that the Inbound Agent Group alone cannot deliver the desired Service Level (80% within 60 seconds), and that the Abandonment Rate is also much higher than desired.

5.2.3 Varying Cross-Training Levels

We then began to vary the number of Outbound-Skilled Agents who were included in the Cross-Trained Outbound group, assuming for these initial experiments that Inbound calls would immediately overflow to Cross-Trained Outbound agents whenever all Inbound Only agents were busy. The impact of this cross training on the population of Inbound callers is dramatic, as even limited cross training has a big impact on Service Levels and Abandonment Rates. In addition, there is an equally obvious negative impact of this cross training on the Outbound call statistics. These trade-offs are evident in Table 2 below.

Based on these preliminary simulations, we chose to focus on cross-training a total of 30-40 Outbound agents. From here, we turned our attention to defining the parameter for how long Inbound calls should wait before being made available to Cross-Trained Outbound agents.

5.2.4 Varying the Wait Time Parameter for Overflowing Inbound Calls

Results for different scenarios associated with 30 and 40 Cross-Trained Outbound agents are shown in Tables 2 and 3.

Table 2: Simulation Results - 30 Cross-Trained Agents
Scenario #   Wait Until Overflow   SL %   Abandon %   Number Interrupts   RPCs for Week
4            0 seconds             86.7   4           2816                2339
9.           15 seconds            85.9   4.2         2592                2360.3
10.          30 seconds            82.6   4.7         2403                2380.6
11.          45 seconds            78.3   5.7         2173                2414.3
12.          60 seconds            67.9   7.6         1769                2453.8

Table 3: Simulation Results - 40 Cross-Trained Agents
Scenario #   Wait Until Overflow   SL %   Abandon %   Number Interrupts   RPCs for Week
5            0 seconds             92.1   2.3         3242                2301.7
13.          15 seconds            90.2   2.7         3057                2324.7
14.          30 seconds            87.7   3.3         2727                2347.2
15.          45 seconds            81.9   4.3         2428                2377.9
16.          60 seconds            70.1   6.3         1998                2427.2

5.2.5 Summary

The different scenarios that we have simulated have enabled us to (a) home in on the right levels of cross training to meet the Service Level goals with the current staffing levels and (b) examine trade-offs between different scenarios in terms of the key model outputs.

For example, consider Scenarios 3, 10, 14, and 15, all of which deliver SLs at or above the 80% target. The answer to which of these is the “best” choice will of course depend on the relative value of RPCs, Service Levels, and Abandoned customers. However, it is interesting to note that Scenario 3 produces essentially the same SL and RPC values as Scenarios 10 and 15, but with a substantially higher abandonment rate. In turn, the tangible difference between Scenarios 14 and 15 enables managers to explicitly quantify the level of increased Service Level and decreased abandonment rates against the decreased number of RPCs.

Finally, it is worth mentioning that while we have shown summary statistics for sixteen scenarios here, it is relatively easy for us to produce more detailed statistics
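The scenario comparison discussed in Section 5.2.5 can be reproduced directly from the tabulated results. The short Python sketch below transcribes Tables 1 through 3 and shows one way to list the scenarios that meet the 80% Service Level target and to quantify the difference between two of them. The `Scenario` type and the helper names are hypothetical, and the overflow wait of 0 seconds recorded for Scenario 1 is a placeholder, since that scenario has no Cross-Trained agents.

```python
from typing import NamedTuple


class Scenario(NamedTuple):
    number: int
    cross_trained: int
    overflow_wait_s: int
    sl_pct: float
    abandon_pct: float
    interrupts: int
    rpcs: float


# Values transcribed from Tables 1-3; Table 1 scenarios overflow immediately.
SCENARIOS = [
    Scenario(1, 0, 0, 56.3, 16.0, 0, 2621.8),
    Scenario(2, 10, 0, 70.1, 10.0, 1288, 2494.4),
    Scenario(3, 20, 0, 80.0, 6.3, 2211, 2400.5),
    Scenario(4, 30, 0, 86.7, 4.0, 2816, 2339.0),
    Scenario(5, 40, 0, 92.1, 2.3, 3242, 2301.7),
    Scenario(6, 50, 0, 93.5, 1.7, 3374, 2284.0),
    Scenario(7, 75, 0, 97.1, 0.8, 3715, 2255.1),
    Scenario(8, 100, 0, 98.0, 0.6, 3726, 2250.4),
    Scenario(9, 30, 15, 85.9, 4.2, 2592, 2360.3),
    Scenario(10, 30, 30, 82.6, 4.7, 2403, 2380.6),
    Scenario(11, 30, 45, 78.3, 5.7, 2173, 2414.3),
    Scenario(12, 30, 60, 67.9, 7.6, 1769, 2453.8),
    Scenario(13, 40, 15, 90.2, 2.7, 3057, 2324.7),
    Scenario(14, 40, 30, 87.7, 3.3, 2727, 2347.2),
    Scenario(15, 40, 45, 81.9, 4.3, 2428, 2377.9),
    Scenario(16, 40, 60, 70.1, 6.3, 1998, 2427.2),
]
BY_NUMBER = {s.number: s for s in SCENARIOS}


def meeting_target(scenarios, sl_target_pct=80.0):
    """Scenarios whose weekly Service Level is at or above the target."""
    return [s for s in scenarios if s.sl_pct >= sl_target_pct]


def trade_off(a, b):
    """Change in each key output when moving from scenario b to scenario a."""
    return {
        "sl_pct": round(a.sl_pct - b.sl_pct, 1),
        "abandon_pct": round(a.abandon_pct - b.abandon_pct, 1),
        "rpcs": round(a.rpcs - b.rpcs, 1),
    }


feasible = [s.number for s in meeting_target(SCENARIOS)]
# feasible -> [3, 4, 5, 6, 7, 8, 9, 10, 13, 14, 15]
delta = trade_off(BY_NUMBER[14], BY_NUMBER[15])
# delta -> +5.8 SL points, -1.0 abandonment points, -30.7 RPCs for the week
```

Which feasible scenario is preferable still depends on how RPCs, Service Level, and abandoned callers are valued against one another; the sketch only makes the differences explicit.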