Problem Management Process
design guide
Orlando release
Updated: January 2020
Table of contents
Introduction
Principles and basic concepts
Process scope
Process objectives
Roles and responsibilities
    Process owner
    Problem coordinator
    Problem manager
    Technical support
    Specialist roles
How problems are initiated
Problem Management lifecycle
    Process overview
    State: New
    State: Assess
    State: Root Cause Analysis
    State: Fix in Progress
    State: Resolved
    State: Closed
Other processes
    Incident Management
    Change Management
    Configuration Management
    Knowledge Management
    User experience
Process governance
    Measurement
    Metrics
© 2020 ServiceNow, Inc. All rights reserved. ServiceNow, the ServiceNow logo, Now, Now Platform, and other ServiceNow marks are trademarks and/or registered trademarks of
ServiceNow, Inc. in the United States and/or other countries. Other company and product names may be trademarks of the respective companies with which they are associated.
Introduction
This process guide explains in detail how the problem management process is enabled
within the Now Platform. The process should be followed as closely as possible.
ServiceNow® encourages simple, lean ITSM processes, and that is reflected in the
out-of-the-box design. Additional functionality can be incorporated into what is offered;
however, this should be done only where a required business outcome cannot be achieved
using an out-of-the-box method. Following this approach should also ease upgrade paths
and make it easier to expand the use of the platform.
There are three primary goals of the problem management process. The first goal is to
provide a mechanism to understand the cause and permanently solve incidents that have
occurred. These are cases where the incident process may have only identified a temporary
solution purely to restore service. The second goal of the problem management process is to
prevent incidents and service impacts from occurring. Finally, problem management
attempts to minimize the impact of incidents that cannot be prevented.
A known error article is a type of knowledge article that the problem management team can
create to help with incident deflection. At the time the article is created, a workaround
or root cause may not yet be known.
A root cause is the underlying or original cause of an incident or problem.
A workaround is a temporary way to restore a failed service to a usable level.
ServiceNow focuses on the use of automation and information to speed the path to root
cause identification and permanent resolution.
Problem Management relies heavily on:
• The CMDB for problem assignment and impact analysis
• The incident management process for providing details of individual related incidents
• The change management process for controlling changes needed to solve problems
• The knowledge management process for sharing information about known errors
Process scope
The scope of Problem Management includes:
• The identification and diagnosis of problems through Event Management, technical
identification, and proactive Problem Management
• The diagnosis of all problems as quickly as possible using:
– Problem and error control
– Event and incident trends
– Identifying workarounds to reduce incident duration
– Identifying and implementing permanent solutions to stop incidents from recurring
Process objectives
The objectives of Problem management are to:
• Determine the root cause of incidents, identify viable workarounds, and drive to
permanent solutions that prevent recurrence
• Maintain information about problems, associated workarounds and permanent solutions
• Communicate information appropriately to reduce and eliminate the number and
impact of incidents over time
• Identify and solve problems proactively to improve IT services and prevent potential
incidents from occurring
Problem coordinator
The problem coordinator owns each problem and is responsible for getting it permanently
resolved or prevented as soon as possible. They also manage and coordinate all
problems through the process.
Responsible for:
• Assessing problems to ensure they are genuine
• Monitoring and controlling the detection, recording, assignment, escalation, and
resolution of problems
• Coordinating technical and service subject matter experts (SMEs)
• Documenting problem information
• Publishing workarounds
• Publishing known error articles
• Coordinating decisions on whether to apply a fix
• Deciding whether to accept the risk of not applying a fix
• Reviewing problems to check for quality and completeness
• Driving the efficiency and effectiveness of the problem management process
ServiceNow role required: The problem_coordinator role is required in ServiceNow.
Problem manager
The problem manager is responsible for the overall Problem Management process. They can
configure Problem Management settings and can also act as a problem coordinator.
Responsible for:
• Configuring whether a problem or problem task can be reopened and, if so, by which
roles
• Configuring whether accepting the risk of not fixing a problem moves the problem to
Resolved (still active) or Closed
ServiceNow role required: The problem_manager role is required in ServiceNow.
Technical support
Technical support teams assist the problem coordinator in investigating problems and in
identifying and implementing solutions.
Responsible for:
• Providing subject matter expertise
• Conducting investigation into problems
• Identifying the root cause of problems
• Identifying workarounds, and notifying Service Desk and Technical Support of
workaround availability
• Publishing known error articles
• Identifying technical solutions to eliminate faults
• Providing stakeholder communication on active problems
• Resolving problems, through Change Management where applicable
ServiceNow role required: The problem_task_analyst role is required in ServiceNow.
Specialist roles
Request read and write roles
When the ITSM Roles - Problem Management plugin (com.snc.itsm.roles.problem_management) is
activated, the following roles are added to assign access permissions at a granular level:
• sn_problem_read: The user with this role has read access to the Problem Management
application and related records.
• sn_problem_write: The user with this role has write access to the Problem Management
application and related records.
Process overview
State: New
When a problem is first created it is in a state of New. This is where the basic information
that suggests a problem may exist is added. All known information about the symptoms
experienced is captured: at the very least, enough to warrant some kind of investigation.
The mandatory fields are:
• Problem statement
If other fields such as CIs are known at this point, they can still be added, and they are
populated automatically when the problem is created from an existing incident record;
however, they are not mandatory to progress.
Problem assignment
At the New state it is necessary to identify the appropriate assignment group to assess the
problem in the next phase of the lifecycle. This needs to be a Problem Management group
or, if a dedicated function does not exist in the organization, the group that performs
that function, such as the service owner or incident management team. These users
require the problem_coordinator role. Assignment is best achieved by automatically
updating the Assignment group field, because letting the user pick the group manually
is prone to error.
At this point the assignment group (problem management group) will need to choose an
individual problem coordinator to take responsibility for the problem on behalf of the group.
This is done by populating the Assigned to field.
Once the mandatory fields are populated the problem coordinator is expected to click the
Assess button. This will move the problem into the lifecycle where it is considered live and
something that requires attention.
State: Assess
At the Assess state the problem coordinator primarily determines whether the problem is
genuine.
The newly assigned individual (the problem coordinator) conducts an initial review of the
problem. If the problem coordinator determines that it is a duplicate, they click Mark
Duplicate and select the problem that it duplicates. If they determine that it is not a
genuine problem, they click Cancel and add a Work Note explaining why the problem does
not require further investigation. If they are comfortable proceeding with the
investigation, they update the Priority, Business Service, and CI before clicking Confirm.
Establishing priority
Problem prioritization typically drives the criticality associated with the handling of the
problem and the order in which problems will be focused on. Priority is calculated through a
combination of impact and urgency.
Impact is the effect that a problem has on the business.
Urgency is the extent to which the problem’s resolution can bear delay.
Priority is generated from urgency and impact according to the following table.
[Table: priority by impact and urgency]
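The impact and urgency lookup can be sketched in code as follows. This is a minimal illustration, not the platform's implementation: the numeric scales (1 = highest, 3 = lowest), the matrix values, and the priority labels other than Critical are assumptions, and calculate_priority is a hypothetical helper. Substitute the values from your organization's own priority table.

```python
# Assumed priority labels; only "Critical" is named in this guide.
PRIORITY_LABELS = {1: "Critical", 2: "High", 3: "Moderate", 4: "Low", 5: "Planning"}

# Assumed lookup: (impact, urgency) -> priority level, where 1 is the highest.
PRIORITY_MATRIX = {
    (1, 1): 1, (1, 2): 2, (1, 3): 3,
    (2, 1): 2, (2, 2): 3, (2, 3): 4,
    (3, 1): 3, (3, 2): 4, (3, 3): 5,
}


def calculate_priority(impact: int, urgency: int) -> str:
    """Return the priority label for an impact/urgency pair."""
    try:
        level = PRIORITY_MATRIX[(impact, urgency)]
    except KeyError:
        raise ValueError(
            f"Unsupported impact/urgency combination: {impact}/{urgency}"
        ) from None
    return PRIORITY_LABELS[level]
```

Keeping the matrix in a single lookup table means a change to the prioritization policy is a data change rather than a logic change.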
It is possible to automatically establish the priority of the problem based on the CI that is
identified in the problem record. With this technique, the business criticality value of the CI is
used to determine the priority of the problem. For example, an online banking service would
be considered critical to a financial organization. If this CI is related to the problem the
priority can be automatically set to Critical as a result. This ensures more accurate and
consistent prioritization of problems, because determining impact and urgency manually can
be a subjective call. If this automated method is used, it can be applied at the New state
when the problem is first raised, because it helps the problem coordinator to see the
priority immediately.
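A minimal sketch of this CI-driven prioritization is shown below. The criticality labels, the mapping to priorities, and the priority_from_ci helper are all hypothetical and are not taken from this guide; the sketch only illustrates deriving the priority from the CI and falling back to the manually assessed value when the CI carries no criticality.

```python
from typing import Optional

# Assumed criticality scale and mapping; replace with your own CMDB values.
CRITICALITY_TO_PRIORITY = {
    "most critical": "Critical",
    "somewhat critical": "High",
    "less critical": "Moderate",
    "not critical": "Low",
}


def priority_from_ci(business_criticality: Optional[str], manual_priority: str) -> str:
    """Return the priority implied by the CI's business criticality.

    Falls back to the manually assessed priority when the CI has no
    criticality value or the value is not in the mapping.
    """
    if business_criticality is None:
        return manual_priority
    key = business_criticality.strip().lower()
    return CRITICALITY_TO_PRIORITY.get(key, manual_priority)
```

For example, an online banking service recorded as "Most critical" would yield Critical regardless of the manually entered value.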
offers the use of the dependency views feature that displays the relationship map from the
service/CI in question to other related CMDB components and will display any ongoing
incidents or changes that may exist to aid with root cause analysis.
If the ServiceNow Agile Development application is in use, existing defects should be
searched to look for possible causes or relationships to the problem.
Once this data has been entered the problem coordinator will click Confirm to move to the
Root Cause Analysis state.
The problem coordinator may now need to engage one or more technical support teams to
investigate and potentially help fix the problem. This is achieved using problem tasks.
Problem Tasks
The parent problem record remains assigned to the problem coordinator throughout the
entire process. The coordinator creates and assigns individual tasks out to the various
technical support teams to aid in the investigation and diagnosis. Each team will capture
their own investigation and discoveries in their individual tasks and the problem coordinator
will review and coordinate them all. There are two types of problem tasks: Root Cause
Analysis and General. The RCA type should be selected for specific tasks required to
investigate the root cause and should be created by the problem coordinator at this point.
During the Root Cause Analysis state four main activities are intended to occur:
1. Discover a workaround (if possible) – The problem coordinator should enter it into the
Workaround field and communicate it to all open related incidents using the
Communicate Workaround related link. This populates the text from the Workaround field
into the Activity Log of all related open incidents explaining that it is a workaround from
the problem record.
2. Discover the root cause and document it using the Cause notes field on the problem – If
the problem coordinator needs help to discover the root cause, they can assign a Root
Cause Analysis problem task to the relevant team who will attempt to document the
cause code, cause notes, proposed fix and provide a workaround on the problem task.
3. Discover a permanent fix for the problem to prevent it from happening in the future –
Enter it into the Fix notes field and communicate it to all open incidents using the
Communicate Fix related link. As with the root cause, if the problem coordinator needs
help to discover a permanent fix, they can use a problem task to gather that information.
4. Communicate that this problem is known and currently being worked on, which can help to
deflect incidents – This is useful because it can take time to implement a permanent fix.
The problem coordinator should create and publish a known error knowledge
article. Known error articles show up when an end user goes to create an Incident via the
Service Portal. Click the Create Known Error article related link to create a known error
article from a problem which creates a link between the two records displayed in the
Primary Known Error article field in the problem record. The Problem statement,
Description, Workaround, and Cause notes are copied over when creating a known error
article. The known error article will be in the draft state and can then be published. If it is
not deemed necessary to publish the known error in the Knowledge Base, service desk
analysts and other technical support users can simply search for all problems to find these
and use the information as they require it. Most commonly this will be used by the service
desk if they are trying to resolve an incident, find a workaround or if they recognize a
pattern of similar incidents.
These activities can all happen in parallel, and their results may be discovered at different
times during root cause analysis. It is possible that none, some, or all four are completed.
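The behavior described for the Communicate Workaround related link can be sketched as follows. This is an illustrative model only: the Incident and Problem classes, their field names, and the wording of the note are assumptions, not the platform's data model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Incident:
    number: str
    is_open: bool
    activity_log: List[str] = field(default_factory=list)


@dataclass
class Problem:
    number: str
    workaround: str
    related_incidents: List[Incident] = field(default_factory=list)


def communicate_workaround(problem: Problem) -> int:
    """Copy the workaround text to every open related incident.

    Returns the number of incidents that were updated.
    """
    if not problem.workaround.strip():
        raise ValueError("The Workaround field is empty")
    # The note states that the text comes from the problem record.
    note = f"Workaround from problem {problem.number}: {problem.workaround}"
    updated = 0
    for incident in problem.related_incidents:
        if incident.is_open:  # closed incidents are left untouched
            incident.activity_log.append(note)
            updated += 1
    return updated
```

The same pattern would apply to the Communicate Fix related link, using the Fix notes field as the source text.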
If the root cause and a permanent fix are discovered, then the problem can be moved to
the Fix in Progress state using the Start Fix button. If a fix cannot be discovered or the fix
cannot be implemented, the problem coordinator can use the Accept Risk button and
provide a reason why they are accepting the risk of not fixing this problem for now.
State: Fix in Progress
The Fix in Progress state represents a problem that has been investigated and now needs
a fix. In this state it becomes mandatory to enter Cause notes and Fix notes.
The problem coordinator can create or relate to one or more change requests to show the
clear path to resolution.
Once the change(s) are implemented and the problem is considered resolved, the State
field is updated to Resolved using the Resolve button.
Risk Accepted
From the Root Cause Analysis or Fix in Progress states, the problem coordinator can accept
the risk of not fixing the problem for now:
• Due to cost implications the business may determine it is not worth the cost of fixing
• A fix cannot be determined
Closed suggests that it is no longer a problem and can be misleading since closed records
are considered inactive as a standard across the platform.
Risk Accepted problems should be set to Resolved and can remain in Resolved indefinitely. It
is important to remember that this is a key difference between Incident and Problem
Management. Incident Management is concerned with the restoration of service as quickly
as possible using whatever means possible. Problem Management is concerned with
permanently resolving the issue and ensuring it will not recur; it is therefore acceptable
for this to take as much time as is required. There should be no driver to close out unresolved
problems purely because they are not going to be fixed at that time and are sitting in a list of
active problems.
State: Resolved
Once the problem has been moved to Resolved state the Assigned to individual will need to
populate the mandatory Fix notes field with text to describe exactly what has been done to
solve the issue.
At the Resolved state it is also possible for an organization to conduct a review of the
problem if their process requires it. Additional fields can be added here to capture that
information. Alternatively, problem tasks can be assigned out by the problem coordinator as
required.
A set period can also be observed before setting the problem to Closed state to confirm that
the known error has been solved. If evidence suggests that the issue persists, the state can be
set back to Root Cause Analysis using the Re-analyze button. The process can then be
worked through again to continue to find the solution.
If the problem is confidently considered solved, then the Assigned to individual will close the
problem using the Complete button.
State: Closed
At Closed state several resolution codes are displayed. These are automatically determined
by certain actions that occurred during the process:
• Duplicate – The problem was marked as a duplicate of another problem. Any related
incidents will be moved over to the problem that this one is a duplicate of. This is
configurable in the problem properties.
• Canceled – The problem was canceled. There are very few scenarios where a problem is
genuinely canceled. This will only occur when a problem was raised in error usually
prematurely before realizing there is no real problem.
• Fix Applied – A permanent fix was applied and the problem was resolved.
• Risk Accepted – The problem has not been resolved but it has been accepted that the
solution is not to be applied at this time. If that decision changes the problem can revert
Note: The problem management properties determine which roles can re-analyze a closed
problem, for example when additional incidents are added to the problem after the fix was
applied. Set the role to Nobody if you do not want closed problems to be re-analyzed. In
that case, should the problem recur, create a new problem ticket and set its First
reported by field to refer back to this problem so that you can trace it later.
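The lifecycle described in the state sections can be summarized as a small state machine. The sketch below is an illustration built from the buttons named in this guide, not the platform's implementation; the press helper is hypothetical, the assumption that Mark Duplicate and Cancel move the problem directly to Closed is inferred from the resolution codes listed for that state, and Accept Risk is modeled as moving to Resolved even though the guide notes that this is configurable.

```python
from enum import Enum
from typing import Optional, Tuple


class State(Enum):
    NEW = "New"
    ASSESS = "Assess"
    RCA = "Root Cause Analysis"
    FIX_IN_PROGRESS = "Fix in Progress"
    RESOLVED = "Resolved"
    CLOSED = "Closed"


# (current state, button) -> (next state, resolution code set by the action)
TRANSITIONS = {
    (State.NEW, "Assess"): (State.ASSESS, None),
    (State.ASSESS, "Confirm"): (State.RCA, None),
    (State.ASSESS, "Mark Duplicate"): (State.CLOSED, "Duplicate"),
    (State.ASSESS, "Cancel"): (State.CLOSED, "Canceled"),
    (State.RCA, "Start Fix"): (State.FIX_IN_PROGRESS, None),
    (State.RCA, "Accept Risk"): (State.RESOLVED, "Risk Accepted"),
    (State.FIX_IN_PROGRESS, "Accept Risk"): (State.RESOLVED, "Risk Accepted"),
    (State.FIX_IN_PROGRESS, "Resolve"): (State.RESOLVED, "Fix Applied"),
    (State.RESOLVED, "Re-analyze"): (State.RCA, None),
    (State.RESOLVED, "Complete"): (State.CLOSED, None),
}


def press(state: State, button: str,
          resolution: Optional[str] = None) -> Tuple[State, Optional[str]]:
    """Return the (state, resolution code) that results from pressing a button."""
    try:
        next_state, code = TRANSITIONS[(state, button)]
    except KeyError:
        raise ValueError(
            f"{button!r} is not available in state {state.value!r}"
        ) from None
    if button == "Complete":
        code = resolution  # closing keeps the code set when the problem was resolved
    return next_state, code
```

For example, pressing Re-analyze on a resolved problem returns it to Root Cause Analysis and clears the resolution code, so the process can be worked through again.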
Other processes
Incident Management
Most problem records are triggered in reaction to one or more incidents. Incident history
helps identify trends or potential weaknesses as part of proactive problem management.
Incident records related to problem records that are pending resolution are
automatically updated when the problem is resolved.
Change Management
For problems, implementing the workaround or the permanent solution will require work on a
service, hardware, or software. Conducting this work will require a change record to be
raised. This is done by selecting the Create Normal Change or Create Emergency Change
option in the context menu.
Emergency changes typically require an incident record to be related to prove that they are
urgent enough to bypass the full process and lead times.
Configuration Management
The Configuration Management system underpins all records and activities related to any CI.
It contains details of the infrastructure vital to services, CIs and their relationships.
The CMDB is used within the problem management process by relating CIs, including
business services to the problem. This allows dependency views to be used which display the
relationship between the selected CI and other CIs related up and downstream.
Knowledge Management
Knowledge is a vital part of the problem process. Known errors are documented and
published in the knowledge base including workaround information to allow end users to see
knowledge articles and, if appropriate, make use of workaround information to help
themselves while the known error is being fixed.
User experience
Mobile platforms and virtual technology can have a positive impact on how end users
interact with the end-to-end process and ultimately how the entire user experience is
perceived. Consider which touchpoints in the process can use the mobile platform to
minimize delays. Tasks such as chat can be performed on mobile devices.
Consider also how Virtual Agent can be deployed to assist users in common actions and
tracking progress.
Process governance
Measurement
Key performance indicators (KPIs) evaluate the success of a particular activity toward
meeting the critical success factors. Successfully managing KPIs can be either through
repeatedly meeting an objective (maintain) or by making progress toward an objective
(increase/decrease). The benchmarks feature gives you instant visibility into your KPIs and
trends, as well as comparative insight relative to industry averages of your peers. You can
contrast the performance of your organization with recognized industry standards and view
a side-by-side comparison of performance with global benchmarks.
The main point of note when creating any KPIs or metrics for problem management is not to
be driven by the same measurements used for incident management. With incidents, the
purpose of the process is to restore service as quickly as possible using whatever means
available. Therefore, speed of resolution is a key measurement for this process. With problem
management, the purpose is to understand the underlying cause of issues and permanently
fix them no matter how long that takes or if it is even possible. Therefore, in problem
management speed of resolution is not something that should be measured. This would drive
the wrong behavior for the process and focus on closing records rather than finding the
permanent fix. Process owners need to feel comfortable with problem records potentially
remaining open for months or even years.
Metrics
Process KPIs
• Provide information on the effectiveness of the process and the impact of continuous
improvement efforts
• Are best represented as trend lines and tracked over time
• Monitored by the process owner
Item | Purpose
Mean time to first respond to problems, by priority | Measures how well response SLAs are achieved
% of problems with a root cause identified for the failure | Measures the effectiveness of problem management in defining root cause
% of problems with a workaround defined | Measures the effectiveness of problem management in defining and communicating workarounds
% of incidents resolved by fixing known errors | Measures the effectiveness of problem management in supporting the timely resolution of incidents
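Two of the percentage KPIs above can be computed along the following lines. This is a sketch under stated assumptions: the ProblemRecord class is hypothetical, and treating a non-empty Cause notes or Workaround field as "identified" or "defined" is one possible interpretation, not a rule from this guide.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class ProblemRecord:
    number: str
    cause_notes: str = ""
    workaround: str = ""


def percentage(problems: Iterable[ProblemRecord],
               predicate: Callable[[ProblemRecord], bool]) -> float:
    """Return the percentage of problems that satisfy the predicate."""
    records = list(problems)
    if not records:
        return 0.0  # avoid dividing by zero when there are no problems
    matching = sum(1 for record in records if predicate(record))
    return 100.0 * matching / len(records)


def pct_with_root_cause(problems: Iterable[ProblemRecord]) -> float:
    # Assumption: a non-empty Cause notes field means the root cause is identified.
    return percentage(problems, lambda p: bool(p.cause_notes.strip()))


def pct_with_workaround(problems: Iterable[ProblemRecord]) -> float:
    # Assumption: a non-empty Workaround field means a workaround is defined.
    return percentage(problems, lambda p: bool(p.workaround.strip()))
```

Because these KPIs are best read as trend lines, the functions would typically be run per reporting period and the results plotted over time.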
Operational data
Active problems that require visibility, oversight, and possible management
intervention are best tracked on a dashboard or homepage that is monitored by the
problem management team.
Item | Purpose
Problems ready to be assessed | Shows all problems that require a problem coordinator to assess them
List of active problems that have missed target response times | Highlights where there may be a process issue in assessing new problems
Aged list of backlogged problems | Provides visibility to unassigned work
Risk Accepted problems with new incidents | Highlights where a risk accepted problem may need to be reviewed for further work