
Problem Management

This document provides guidelines for a Problem Management Process for an organization's IT infrastructure. It outlines the key components of the Problem Management Process, including problem notification, determination, resolution, tracking, reporting, and roles and responsibilities. The goal of the Problem Management Process is to minimize the impact of problems and outages, manage problems within agreed timeframes, and prevent future recurrences. It aims to provide a standardized process for handling service-impacting issues across the organization's IT systems and services.

XXX

Problem Management
Process Guide
Process Re-engineering
Problem Management Process

Version Control
Document Version, Name, Status, Date
Author, Signature, Date

Table of Contents
1. INTRODUCTION
2. PURPOSE OF THE DOCUMENT
2.1 SERVICE DESCRIPTION
2.2 TERMINOLOGY
2.3 PROBLEM MANAGEMENT OBJECTIVE
2.4 SCOPE
3. OVERVIEW OF THE PROBLEM MANAGEMENT PROCESS
3.1 PROBLEM MANAGEMENT - OVERVIEW DESCRIPTION
3.2 PROBLEM MANAGEMENT PROCESS FLOW (ITIL)
3.3 NOTIFICATION
3.4 PROBLEM DETERMINATION
3.5 WORKAROUND AND RECOVERY
3.6 PROBLEM RESOLUTION
3.7 PROBLEM TRACKING
3.8 REPORT AND CONTROL
3.9 GROUPED LEVEL 2 XXX PROBLEM MANAGEMENT PROCESS
3.10 ITIL PROBLEM MANAGEMENT OVERVIEW
4. PROBLEM MANAGEMENT MEASURES
4.1 PROBLEM MANAGEMENT PROCESS MEASUREMENTS
5. ROLES AND RESPONSIBILITIES
5.1 PROBLEM MANAGEMENT PROCESS OWNER
5.2 PROBLEM MANAGEMENT CONTROLLER (HELPDESK LEVEL 1)
5.3 PROBLEM MANAGEMENT ANALYSTS
5.4 KNOWLEDGE ENGINEER
5.5 LEVEL 2 SUPPORT (OPERATIONS/OTHER)
5.6 VENDORS (LEVEL 3)
6. APPENDICES
6.1 APPENDIX A: ASSIGNING SEVERITY CODES
6.2 APPENDIX B: MANAGING ESCALATION
6.3 APPENDIX C: SUPPORT LEVELS
6.4 APPENDIX D: PROBLEM MANAGEMENT SYSTEM PARTICIPANTS

1. Introduction
This document sets out the overall Problem Management Process for the XXX IT
infrastructure and environment. This procedure takes existing XXX Problem
Management practices, the AAA Process Model and ITIL best practices as input.

2. Purpose of the Document
This document contains high-level process flows for the Problem Management
Service in the XXX IT environment. It provides a framework and roadmap from
which lower-level operational procedures can be defined and implemented by the
Service Improvement Team and IT Service Delivery staff. It also provides
material for high-level training and education for the end-user and IT
communities, supporting a high-level understanding of process-based service
delivery and of the specific process-based tasks for the Problem Management
Service.
Every participant in the process is expected to understand and implement the
guidelines described in this document.

2.1 Service Description
Problem Management is the ongoing service concerned with minimising the impact of
problems affecting the availability and services of the service delivery environment,
whilst minimising expenditure of resource and maintaining the highest level of client
satisfaction.
This process captures information about problems and resolves them according to
XXX standards and policies. Problems flow in from, and out to, the XXX Incident
Management process.
The process identifies, documents, analyses, tracks and resolves all problems within
the XXX IT environment.
The suggested Problem Definition is: any deviation from an expected norm. That is, a
problem is any event resulting in a loss or potential loss of the availability or
performance of a service delivery resource and/or its supporting environment. This
includes errors related to systems, networks, workstations and their connectivity,
hardware, software, and applications. Problems can be recognised at any point in the
environment and can be identified using a variety of automated and non-automated
methods.

2.2 Terminology
Terms: INCIDENT, KNOWN PROBLEM, PROBLEM, REQUEST FOR CHANGE

The following are ITIL descriptions:
ITIL recommends a clear demarcation between incident control and problem
management. If the help desk cannot resolve an incident, it is progressed to problem
management.
An incident is a single occurrence of a difficulty, which is affecting the normal or
expected service of the user. The usual priority when an incident occurs must be to
restore normal service as quickly as possible, with minimum disruption to the users.
A problem is the underlying cause of one or more incidents, the exact nature of which
has not yet been diagnosed. Restoring normal service to the users should normally
take priority over investigating and diagnosing problems, although this may not
always be possible.
A known problem is a problem which has been diagnosed and for which a resolution
or circumvention exists. There may be good reasons for leaving a problem outstanding
even though a resolution is possible, for example if the problem is minor and the
resolution will impact on normal service provision.
ITIL refers to Problem Management as:
Incident Control
- Restoring normal service when service has gone wrong
Problem Control
- Getting to the root cause of the problems
- Correcting problems
Management Information
- Resulting from the other areas

Problem Management is also concerned with proactively preventing problems
occurring.
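
The relationship between these terms can be sketched as a small data model. This is an illustrative sketch only: the class and field names (Incident, Problem, is_known_problem and so on) are assumptions made for the example and are not defined by the XXX process or by ITIL.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Incident:
    """A single occurrence of a difficulty affecting normal service."""
    incident_id: str
    description: str
    service_restored: bool = False


@dataclass
class Problem:
    """The underlying cause of one or more incidents."""
    problem_id: str
    incidents: List[Incident] = field(default_factory=list)
    diagnosed: bool = False
    workaround: Optional[str] = None  # circumvention, if one exists
    resolution: Optional[str] = None  # permanent fix, if one exists

    @property
    def is_known_problem(self) -> bool:
        # Known problem: diagnosed, and a resolution or circumvention exists.
        has_fix = self.workaround is not None or self.resolution is not None
        return self.diagnosed and has_fix


# Hypothetical example data.
problem = Problem(
    "PBM-1",
    incidents=[Incident("INC-1", "Login fails"), Incident("INC-2", "Login slow")],
)
problem.diagnosed = True
problem.workaround = "Restart the authentication service"
print(problem.is_known_problem)
```

A problem can stay open as a known problem even when a resolution exists, which is why the sketch keeps the workaround and resolution fields separate from any closure status.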

2.3 Problem Management Objective
The objective of the Problem Management Process is to provide a straightforward
and workable process for handling all types of service-inhibiting situations with
minimum effort for the XXX IT clients. The goal is to maximise client satisfaction
with IT systems and services.
The Problem Management process and the Incident Management process are closely
linked, with many of the Problem sub-process activities performed by the Helpdesk.
Problem Management Process Objectives:
- Minimize the impact of problems
- Minimize the duration of any related outages
- Manage problems within agreed-to time frames
- Reduce the number of problems
- Prevent reoccurrence
- Perform trend analyses
- Assure performance of root cause analyses
- Maximize productivity of resources
- Monitor and measure the service
- Automate tasks wherever possible.

2.4 Scope
The Problem Management service begins with receipt of a problem record.
The assumption upon entering Problem Management is that the problem has already
been logged, as a problem, via the Incident Management Process.
In order to resolve problems, the service includes the following activities:
- Notification
- Problem Determination
- Workaround and Recovery
- Problem Resolution
- Tracking
- Report and Control.

3. Overview of the Problem Management Process
The overall Problem Management process comprises a number of tasks or activities.

3.1 Problem Management - Overview Description
The Problem Management Process consists of the following sub process activities:
1. Notification
The identification of a problem. Examples of a problem might be an outage, or an
incorrect or unusual result. This sub process also includes notifying the
appropriate support structure that there is a problem and a need for assistance.
The initial recording of a problem, including all relevant information that is
available when the problem occurs. This is the introduction of the problem into
the management system. (Source: Incident Management Process)

2. Problem Determination
The collection, analysis, and correlation of data to determine and isolate the
cause of the problem.

3. Workaround and Recovery
Activity to recover, work around or circumvent the problem, and notification to
the affected clients of action taken.

4. Problem Resolution
The identification, implementation, and verification of solutions, and notification
to affected clients.

5. Tracking
The assignment of ownership for resolving problems and the follow-up activity
to ensure that the goals for problem resolution are being met. It includes setting
priorities and escalating issues via the appropriate system.

6. Reporting and Control
The production and analysis of reports, over time, to determine if the problem
management process is working effectively and to identify changes that may be
necessary. It also helps identify significant results and problem trends.

Note:
- Emergency Changes will always relate to a problem record
- There will be known problems that will not be fixed
- There will be known problems for which XXX will be waiting on a vendor to
provide the fix.

3.2 Problem Management Process Flow (ITIL)
The following process flow shows an overall ITIL-based version of the Problem
Management process for resolving client problems. This illustration is meant to
provide the reader with an understanding of the general functional flow of the problem
process.

[Figure: ITIL Problem Management Process. An incident from the Incident process
enters Notification (Allocate & Prioritise) and then passes through Problem
Determination, Workaround & Recovery, Problem Resolution, Tracking, and
Reporting & Control. Problem Record and Updated Problem Record pass between
the stages. Other labels: Problem Workaround, Validated Severity, Level 2 Priority,
Escalate to Problem Manager, To Incident Process, Project Requests, Escalations,
Communications, Closed Problems, Management Information.]
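
The order of the six sub processes listed in section 3.1 can be sketched in code. This is a deliberately simplified, linear sketch: the process flow also contains escalations and returns to the Incident process, which are not modelled here, and the names SubProcess and next_sub_process are assumptions made for the example.

```python
from enum import Enum
from typing import Optional


class SubProcess(Enum):
    """The six sub processes of section 3.1, in their listed order."""
    NOTIFICATION = 1
    PROBLEM_DETERMINATION = 2
    WORKAROUND_AND_RECOVERY = 3
    PROBLEM_RESOLUTION = 4
    TRACKING = 5
    REPORTING_AND_CONTROL = 6


def next_sub_process(current: SubProcess) -> Optional[SubProcess]:
    """Return the sub process listed after `current`, or None after the last one."""
    if current is SubProcess.REPORTING_AND_CONTROL:
        return None
    return SubProcess(current.value + 1)


# Walk the listed order from the first sub process to the last.
stage: Optional[SubProcess] = SubProcess.NOTIFICATION
while stage is not None:
    print(stage.name)
    stage = next_sub_process(stage)
```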

3.3 Notification
Inputs:
- Problem Record
- External Notification
Outputs:
- User Communication
- Assigned Problem
- Escalated record
Roles:
- Problem Management Controller / Problem Manager

[Figure: AIB IT - Notification. Labels: Incident from Incident process; Assign
Severity; Validate Assignment (Correct / Incorrect); Incorrect Assignment, To
Incident process; Problem Record; Validate Severity Level; Allocate & Prioritise;
Severity 1 & 2, Critical Situation Management; Escalate to Problem Manager;
Escalate to Problem Management.]

The identification and notification sub process includes the following steps:
1. Raise Problem Record
The Problem Management Controller raises the problem record, copying relevant
details from the incidents and expanding as required. This should include
verification or modification of the severity and impact, escalating to the problem
manager if the problem is found to be high severity.

2. Validate Severity Level
The problem manager assesses the incident forwarded from the first level support
area (e.g. Helpdesk) via the Incident Management process, checking the noted
severity, verifying or modifying it as required, and identifying further handling
requirements.

3. Directly Manage Critical Situations
If the situation is severe (severity 1 & 2), the problem manager directly initiates
and co-ordinates resolution actions, or designates someone else to co-ordinate.
Ownership of the Incident process remains with the incident manager / Helpdesk,
although management has passed to the problem manager for the duration of the
critical situation.
The problem manager, or designate, continues co-ordination of the major incident
through to resolution, or until severity or impact has been reduced sufficiently to
progress as per other incident standards. Regular updates must be provided to
service delivery groups and / or user management. If necessary, the problem
manager should convene problem/critical situation meetings with the relevant
experts to determine the best course of action and maintain progress in line with
severity and impact.

4. Allocate and Assign Incidents for Further Investigation
If further investigation of the incident is required, the problem manager allocates
the incident to a problem analyst, who progresses to identify the nature of the
problem.
If further investigation is not required prior to assignment to a specialist support
function (and a problem has already been created), the problem manager
progresses to assign the problem.
If a Level 2 to Level 2 reassignment takes place, the group passing the problem
on will notify the incident manager / Helpdesk of the move.
The problem manager will also assign significant problems that have been
externally notified, for example urgent notification of virus signatures from the
relevant external agencies.
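
The routing decisions in steps 3 and 4 above can be sketched as a single function. This is a minimal sketch under stated assumptions: severity is represented as an integer where 1 is the most severe (the actual severity codes are defined in Appendix A and are not reproduced here), and the function name and returned labels are hypothetical.

```python
def route_problem(severity: int, needs_investigation: bool) -> str:
    """Return the next handling step for a problem record with a validated severity.

    Assumption: severity is an integer, 1 being the most severe.
    """
    if severity < 1:
        raise ValueError("severity must be 1 or higher")
    if severity <= 2:
        # Severity 1 & 2: the problem manager (or a designate) co-ordinates directly.
        return "critical situation management"
    if needs_investigation:
        # Further investigation is needed before the problem can be assigned.
        return "allocate to problem analyst"
    # No further investigation needed: assign to a specialist support function.
    return "assign to specialist support"


print(route_problem(1, needs_investigation=False))
print(route_problem(3, needs_investigation=True))
print(route_problem(4, needs_investigation=False))
```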

3.4 Problem Determination
Inputs:
- Updated Problem
Outputs:
- Problem Status (Updated Problem Record)
Roles:
- Problem Management Controller / Level 2/3 support

[Figure: AIB IT - Problem Determination Process. Labels: Problem Record; Validated
Severity; Collect & Analyse Data; Cause Identified? (Yes / No); Level 3 Required
for Problem Determination? (Yes / No); Invoke Level 3 Support; Level 2 Priority;
Update Problem Record; Problem Identification Complete? (Yes / No); Escalate to
Problem Manager; Problem Workaround.]
The Problem Determination sub process includes the following steps:
1. Collect and Analyse Available Problem Data; Identify Related Occurrences
Collect all available data about the problem, its symptoms, and associated
configuration data. Identify any related occurrences of the problem from the
knowledge database. (This may be performed at Level 1 Helpdesk.) Analyse
available problem data using normal problem determination procedures.

2. Is this a problem requiring a specialist service (Level 2 or 3 support)?
Based on the available problem data, decide whether this problem is of a
specialist nature, for example a performance problem.

3. Correct Owner?
Determine if the problem has been referred to the correct owner (work
group/queue).

4. Escalate Problem as Appropriate
If the correct owner is not known, escalate the problem to the Problem
Management Co-ordinator or, if necessary, the Problem Management Controller
for resolution.

5. Reassign Problem Record to Correct Owner
Note: If the correct owner is known, the Problem Controller does this. If the
correct owner is not known, the Problem Management Co-ordinator or Controller
does this.

6. Resolve Incident
The Problem Management Controller co-ordinates resolution actions with the
assistance and participation of relevant support groups.
The Problem Controller organises communication of the resolution to users via the
Help Desk.

7. Update Call Record with Additional Details
Update the Call Record with additional detail to help with future assignment of
problem records. If the incident is still affecting users, then its record should stay
open pending circumvention or resolution.

8. Identify Probable Cause
Identify the probable cause of the problem, isolating the problem to a single
point of failure if possible. Perform an initial root cause analysis.

9. Is it a Problem?
Determine if the reported problem is actually a problem.

10. Action Required?
If the reported problem is not actually a problem, determine if any action is
required.

11. Perform Appropriate Action
If action is required for a non-problem, perform the appropriate action.
Note: For example, if a customer calls about a service outage, and service has
already been restored, ensure that the customer is able to use the service.

Update Problem Record to Indicate that Reported Problem is not Actually a Problem
Update the problem record to indicate why the reported problem is not a problem.
Note: The problem record is then closed by way of the Close Request activity of the
Incident Management service.

Adjust Initial Severity/Priority if Required

Notify Severity/Priority Change
This activity invokes Incident Management to register the severity/priority change and
carries on in parallel to Update Problem Record.

Incident Management
Call Management is responsible for resetting the severity/priority.

Update Problem Record
3.5 Workaround and Recovery
Inputs:
- Problem record
- Available workaround
- Operational procedures
- Change management
Outputs:
- Problem status (Updated problem record)
- Project request
- Change Intention
- Configuration Update Details
Roles:
- Problem Co-ordinator / Team Leader

[Figure: AIB IT - Workaround and Recovery. Labels: Problem Record; Project
Request (Yes / No); Change Management Required (Yes / No); Appropriate?
(Yes / No); Escalate According to Severity; Emergency Change Management
Process; Operational Procedures (Implement Bypass, Apply Temp Fix, Recover
Resources / Services, Verify Recovery Actions, Backout Bypass); Successful
Bypass / Recovery (Yes / No); Update Problem Record with details.]
The Problem Workaround & Recovery sub process includes the following steps:
1. Review/Develop Bypass/Recovery Plan with Affected Parties

2. Project Required?
Based on Policy.
Determine if a project is required to implement the bypass.
- If Yes, proceed to Project Request.
- If No, proceed to Change Management Required?

3. Project Request Management
If a project is required, invoke the Project Request to implement the workaround.
- Proceed to Successful Bypass?

4. Change Control Required?
Based on Policy.
Determine if Change Control is required to implement the bypass.
- If Yes, proceed to Change Management.
- If No, proceed to Operational Processes.
Emergencies will be handled according to XXX IT Change Policies; for example,
this may mean that Change Control is invoked retrospectively, i.e. after the
workaround or recovery has been implemented.

5. Change Management
If required, invoke Change Control to approve and schedule the workaround.

6. Change Management Appropriate?
Determine if Change Management is required and, if so, whether it is appropriate
to the situation.
- If No, proceed to Operational Processes.
- If Yes, proceed to Escalate According to Severity.

7. Operational Processes
If a project is not required, start the implementation of the workaround or
recovery plan by way of the operational procedures that perform implementation
tasks such as:
- Emergency Change Management
- Implement the bypass
- Apply temporary fixes
- Recover resources and services
- Verify that the bypass/recovery actions work
- Back out the bypass if it was unsuccessful
Note: Operational Processes include Desk-side Support, Software Distribution,
Server Management, Applications Management, and so on.

8. Successful Workaround?
- If Yes, proceed to Update Problem Record to Indicate Workaround was
Successful.
- If No, proceed to Update Problem Record to Indicate Workaround was
Unsuccessful.

9. Update Problem Record to Indicate Workaround was Successful
If the workaround was successful, update the problem record to indicate that the
workaround was successfully implemented.

10. Update Problem Record to Indicate Workaround was Unsuccessful
If the workaround was unsuccessful, update the problem record to indicate that
the workaround was not successfully implemented.

11. Update Problem Record to Indicate Workaround was not Approved
If the change (workaround) was not approved, update the problem record to
indicate that the workaround was not approved.
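
The decision points in steps 2 and 4 and the record updates in steps 9 to 11 can be sketched as two small functions. This is an illustrative sketch: the function names and returned strings are assumptions made for the example, and the policy checks behind "project required" and "change control required" are represented only as boolean inputs.

```python
def workaround_path(project_required: bool, change_control_required: bool) -> str:
    """Pick the route used to implement a bypass (steps 2 and 4)."""
    if project_required:
        return "Project Request"
    if change_control_required:
        return "Change Management"
    return "Operational Processes"


def workaround_outcome(approved: bool, successful: bool) -> str:
    """Return the text recorded on the problem record (steps 9 to 11)."""
    if not approved:
        return "Workaround was not approved"
    if successful:
        return "Workaround was successful"
    return "Workaround was unsuccessful"


print(workaround_path(project_required=False, change_control_required=True))
print(workaround_outcome(approved=True, successful=False))
```

The emergency case, where Change Control is invoked retrospectively, is not modelled here because it depends on the XXX IT Change Policies.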

3.6 Problem Resolution
Inputs:
- Problem record
- Knowledge database
- Change Approval
- Change Status Report
Outputs:
- Problem Status (Updated Problem Record)
- Problem Resolution Plan
- Project Request
- Change Intention
Roles:
- Problem Co-ordinator / Level 2 / Level 3

[Figure: AIB IT - Problem Resolution. Labels: Problem Record; Investigate
Solutions; Level 2 Resolution? (Yes / No); Escalate to Level 3; Select Problem
Solution; Review / Specify Solution; Design Solution; Project Required?
(Yes / No); Project Proposal; Project Deferred? (Yes / No); Project Work; Develop
Resolution Plan; Update PBM Record.]
The Problem Resolution sub process includes the following steps:
1. Investigate Possible Solutions
Investigate possible permanent solutions for the problem.
The problem owner assesses alternative resolution approaches, with the assistance
of other support areas (including Change Management), and identifies the preferred
approach.
The potential to combine resolutions into a scheduled upgrade should be
actively considered.

2. Select Problem Solution
Select the best permanent solution for the problem.

3. Review/Design Solution
Review or design the permanent solution for the problem.

4. Develop Plan to Create, Test, Apply & Verify the Fix
Develop a resolution plan to create, test, apply, and verify the permanent fix.

5. Project Required?
Based on Policy.
Determine if a project is required to implement the solution.
- If Yes, proceed to Project Request (in tracking).
- If No, proceed to Select Problem Solution.

6. Project Request
If a project is required, invoke Project Request to implement the solution.

7. Provide Service
After handling the entitlement failure, determine if service is to be provided; that
is, will the recommended solution or an acceptable alternative be implemented.

8. Develop Resolution Plan
Follow XXX operational procedures to develop a resolution plan to install the fix
to the problem.

9. Update Problem Record.
3.7 Problem Tracking
Inputs:
- Problem record
- Knowledge database
- Configuration Information
Outputs:
- Problem Status (Updated Problem Record)
- Problem analysis information
- Root cause analysis
- Possible Problem Solution
Roles:
- Problem Management Co-ordinator / Team Leader
- Problem Management Controller

[Figure: AIB IT - Problem Tracking. Labels: Problem Record; Co-ordinate /
Communicate Incident Resolution; Check status of call, provide feedback; Identify
issues for investigation; Monitor progress of problems; Follow up enquiries on
actions; Update users via Helpdesk; Ascertain Trends; Escalate Problem; Advise
Problem Manager; Confirm Resolution; Review Problem Record (Satisfied / Not
Satisfied); Communicate Resolution; Route to Problem Resolution; Close Problem;
Re-drive Problem; Project Required.]
The Problem Tracking sub process includes the following steps:
1. Actively Monitor or Manage Progress on Significant Problems
The problem manager actively monitors (or directly manages) actions on problems
and known errors related to major incidents.
As appropriate, the problem manager convenes and chairs specific co-ordination
meetings with participants in the resolution.

2. Follow Up on Specific Progress on Problems and Known Problems
The problem manager, in response to an enquiry or a trigger from the problem
management system (e.g. a change in problem status), checks on the latest status
and documentation of related problems and provides feedback as required.

3. Monitor Overall Progress on Problems and Known Problems
The problem manager maintains an overall awareness of the incident, problem, and
known problem environment, identifying any issues for further investigation and
either following up directly or initiating follow-up by other Problem Management
staff. Follow-up may include enquiries on actions, initiating updates to users (via
the helpdesk), ascertaining trends (to feed into the production of management
information) and potential escalation.

Resolve and Close Problem
Identify Resolution
The problem owner verifies that the solution has successfully resolved the problem or
known error.
The problem owner completes the associated resolution details on the problem or
known error record.
The problem owner advises Problem Management of the resolution.
Confirm Resolution
The problem manager checks for resolution details, confirming details and resolving
any inconsistency with the problem owner, change owner and Change Manager as
required.
If the resolution is not satisfactorily complete, the problem manager returns the
problem to the problem owner as not resolved. The problem manager will monitor for
pervasive problem records and will initiate project activity, where required, to
resolve the root cause.
Close Problem
The problem manager completes the closure details and the closure of associated
incident links (unless done by the helpdesk as part of Incident Management, in which
case the problem manager advises the Help Desk manager of completion), and closes
the problem.
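
The Confirm Resolution and Close Problem checks can be sketched as follows. This is a minimal sketch under stated assumptions: the record fields (solution_verified, resolution_details, status) and the status labels are hypothetical, and the real check may involve the change owner and Change Manager, which are not modelled here.

```python
from dataclasses import dataclass


@dataclass
class ProblemRecord:
    problem_id: str
    solution_verified: bool = False  # set by the problem owner
    resolution_details: str = ""     # completed by the problem owner
    status: str = "open"


def confirm_resolution(record: ProblemRecord) -> str:
    """Close the record if the resolution is complete, else return it to the owner."""
    complete = record.solution_verified and bool(record.resolution_details.strip())
    record.status = "closed" if complete else "returned to problem owner"
    return record.status


# Hypothetical example records.
done = ProblemRecord("PBM-7", solution_verified=True, resolution_details="Patched driver")
pending = ProblemRecord("PBM-8", solution_verified=True)
print(confirm_resolution(done))
print(confirm_resolution(pending))
```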

[Figure: AIB IT - Project Required. Labels: Problem Records; Recurring Issue /
Bigger Problem; Long Term; Short Term; Project Required / Project Request;
Project Proposal; Long Term Review; Project Work / Change Mgmt etc.; Update
Problem Record.]

3.8 Report and Control
Inputs:
- Problem Record
Outputs:
- Requirement(s) for process improvement
- Documented non-compliances
- Problem Management Measurements and Reports, including trend analyses
Roles:
- Problem Management Controller

[Figure: AIB IT - Report & Control. Labels: Problem Record; Root Cause Resolved
(Yes / No); Process Not Working?; Process Improvement Required; Project
Required?; Problem Resolution Sub-Process; Project Identified, Not Implemented;
Report / Escalate; Document service improvements; Management Information.]
The Report and Control Problems sub process includes the following steps:
1. Root Causes Resolved?
Tests whether or not a resolution has been implemented for the root cause.

2. Problem Management Sub-Processes not working effectively?
Invoke sub-process improvement activity as required.

3. Project Required?
Based upon the outcome of the analysis of generic incidents or problems, determine
whether or not specific project activities are required.

4. Project Identified but not implemented?
Project activities were identified but rejected or deferred. Escalate to
the problem manager via exception reporting.

5. Perform Projects / Actions Required

6. Process Improvements Required?

7. Document Recommended Process Improvements
This task documents the required process improvement. This could be an
improvement to the Problem Management service itself, or an improvement to
any other service within the XXX Service.

3.9 Grouped Level 2 XXX Problem Management Process
[Figure: AIB Problem 'ITIL' based problem process. A single combined flow chart
that groups the Notification, Problem Determination, Workaround and Recovery,
Problem Resolution, Problem Tracking, and Report and Control diagrams shown in
sections 3.3 to 3.8.]
3.10 ITIL Problem Management Overview
[Figure: Problem Management Overview flow chart. Recoverable boxes and decisions:
Customer calls Service Desk; Automatic incident recognition; IR group detect
incident; Open new record, complete details; SD resolve problem?; Set severity &
priority & advise customer with ref no.; Severity 1?; Service Desk informs Problem
Manager; Assign to support group; Group accept?; 2nd pass?; Service Desk escalate
to Problem Manager to decide assignment; Support group ring customer within SLA
to discuss problem / give fix time / confirm priority; Priority change needed?;
Inform Service Desk who will change priority; Support group perform problem
determination (PD) and develop fix; Change needed?; Create change record, update
problem record, inform user of status; Change implemented?; Support group inform
customer of solution, update problem record with full description and cause code,
set record to "open, resolved" status; Record completed correctly?; SD refer
incident record back to support group; Close record; Cust Sat questionnaire
needed?; SD complete questionnaire with customer; Major Incident?; Problem
Manager informs Service Continuity Manager; SD close record; END.]
4. Problem Management Measures


The reports produced for the problem management system are designed to
help manage the process. Daily reports identify results from the previous day and any
problems that must be addressed during the day. Weekly reports provide a
summary of the previous week's success, current status and weekly trend information.
Daily reports are primarily for technicians. Weekly reports enable effective
management of the process. Monthly reports enable IT to evaluate the
effectiveness of the problem management system.

4.1 Problem Management Process Measurements


The problem management process measurements are used to determine if adjustments
must be made to the process:

• Cost and resource time to support the process
• Number of problems raised
• Number of known errors identified
• Number of incidents linked to problems
• Number of rejected resolutions
• Number of changes resulting in problems
• Number and cost of problems caused by changes
• Number of problems fixed by changes
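As a rough illustration, most of the measurements above could be tallied from problem records. The sketch below is a minimal example only; the record fields (`known_error`, `linked_incidents`, `rejected_resolutions`, `caused_by_change`, `fixed_by_change`, `cost`) are hypothetical and do not come from any particular problem management tool. Counting the changes that resulted in problems would need change records and is left out.

```python
from dataclasses import dataclass


@dataclass
class ProblemRecord:
    # All field names are illustrative assumptions, not a real tool's schema.
    problem_id: str
    known_error: bool = False        # a known error has been identified
    linked_incidents: int = 0        # incidents linked to this problem
    rejected_resolutions: int = 0    # proposed resolutions that were rejected
    caused_by_change: bool = False   # the problem was introduced by a change
    fixed_by_change: bool = False    # the problem was fixed through a change
    cost: float = 0.0                # cost attributed to the problem


def process_measurements(problems):
    """Tally process measurements for a list of ProblemRecord objects."""
    caused = [p for p in problems if p.caused_by_change]
    return {
        "problems_raised": len(problems),
        "known_errors_identified": sum(p.known_error for p in problems),
        "incidents_linked_to_problems": sum(p.linked_incidents for p in problems),
        "rejected_resolutions": sum(p.rejected_resolutions for p in problems),
        "problems_caused_by_changes": len(caused),
        "cost_of_problems_caused_by_changes": sum(p.cost for p in caused),
        "problems_fixed_by_changes": sum(p.fixed_by_change for p in problems),
    }
```

The function returns a plain dictionary so the figures can be dropped into a daily, weekly or monthly report without further conversion.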

Some examples of reports are listed below:

• Daily Turnover Report of all Problems
  Showing at the detail level all problems opened and closed in the previous 24 hours

• Weekly Report of Problems Resolved by the Help Desk
  Showing detail of those problems resolved by the Help Desk

• Weekly Report of Problems Resolved by Level 2 Departments
  Showing detail of all resolved problems assigned by the Help Desk to any Level 2
  department

It should be noted that these reports are not intended to replace the normal operations
reports for systems and network availability/outages. These reports are to monitor the
progress of the problem management system and provide guidance in the effectiveness
of the problem management activity.
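To make the Daily Turnover Report concrete, the following minimal sketch selects the problems opened and closed in the 24 hours before a given time. It assumes each problem is a dictionary with an `opened` timestamp and an optional `closed` timestamp; those key names are assumptions made for the example.

```python
from datetime import datetime, timedelta


def daily_turnover(problems, now):
    """Return (opened, closed) problems from the 24 hours before `now`.

    Each problem is a dict with an "opened" datetime and an optional
    "closed" datetime; both keys are assumptions made for this sketch.
    """
    start = now - timedelta(hours=24)
    opened = [p for p in problems if start <= p["opened"] < now]
    closed = [
        p for p in problems
        if p.get("closed") is not None and start <= p["closed"] < now
    ]
    return opened, closed
```

A problem that was both opened and closed inside the window appears in both lists, which matches a report that shows all activity for the period.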


5. Roles and Responsibilities

5.1 Problem Management Process Owner

Job Purpose

This position is a senior service delivery co-ordination and development
role for Problem Management and the underpinning technical services.
The process owner is responsible for ensuring that the problem management
system is in place and effective.

Major Tasks

• Is responsible for and owns the overall Problem Management service
• Builds the process. This includes defining what a problem is, setting the goals
  and objectives of the problem management process, understanding what
  severities, priorities and service levels are required, and setting up the
  information flows
• Is responsible for overall performance to target service levels for
  Problem Management and underpinning technical services
• Is ultimately responsible for resolving Problem Management and
  technology service dissatisfaction issues
• Escalates exceptions to senior management as appropriate
• Has a nominated deputy to cover for service owner absence
• Develops requirements for Problem Management standards,
  procedures, measurements, tools and technology in conjunction with
  the Incident Management service owner
• Sponsors and / or manages internal improvement projects to
  implement new technology and process improvement, ensuring
  compatibility and integration with other XXX services and non-XXX
  service providers
• Communicates Problem Management procedures and working
  practices, and changes to internal standards, processes, procedures
  and technology
• Co-ordinates and sets annual service requirements, objectives and
  targets for Problem Management and underpinning technical
  services in conjunction with technology service owners
• Approves and sponsors Problem Management and technical service
  improvement ideas
• Attends senior management level service support and
  development reviews as appropriate
• Is involved in the development and subsequent agreement of service level
  targets and target improvements related to Problem
  Management and the underpinning technical services

5.2 Problem Management Controller (Helpdesk Level 1)

Job Purpose

The Helpdesk personnel play the key role in the day-to-day operation of
the problem management process and, in the majority of incidents,
become the problem owner.
The problem owner/controller assumes responsibility for all
communications and for co-ordinating resolution activity on that problem,
in accordance with severity.

Major Tasks

• Act as the initial point of contact for the client community
• Perform the initial problem logging and problem determination
• Resolve most level 1 problems
• Contact vendors for most hardware problems
• Perform the problem tracking
• Provide feedback to the client who reported the problem
• Record all calls that require a problem or incident to be opened
• Complete the initial descriptive portion of the problem record for all
  problems
• Assign problem severity level and the initial priority
• Update the problem record and maintain a list for tracking all problems
  that have been assigned problem numbers
• Assign the problem and send a copy of the problem record to the
  appropriate group(s) for additional problem determination and problem
  resolution
• Reassign the problem if the Level 2 group that was first assigned is not the
  correct group to fix the problem
• Summarise daily, weekly and monthly statistics and provide reports to
  interested departments
• Provide the problem management co-ordination. In that role, the
  responsibilities are:
    • Oversee and track all exception problems affecting clients, from initial
      recording, through management review, through escalation, through
      closing
    • Notify management of the requirements to schedule
      escalation/problem review meetings
    • Prepare problem reports
    • Review closed problems for validity

5.3 Problem Management Analysts

Job Purpose

The problem analyst is a member of the Problem Management function
and is responsible for examining incidents escalated from first level
support to identify their cause. Incidents are either related to existing
problems or known problems, or recorded as new problems which will
normally be allocated to a support area and subsequently progressed by
the problem owner / controller.

Major Tasks

• Responsible for effective implementation and maintenance of Problem
  Management procedures and working practices
• Defines training and development needs for individuals within the team
• Ensures adherence to staff training plan
• Undertakes performance review meetings with team members in
  compliance with XXX policy
• Invokes escalation procedures and communicates with management
  as appropriate
• Identifies and reports exception items to management as appropriate
• Identifies incident and problem trends to anticipate potential service
  outages and duplicated problems
• Co-ordinates / undertakes appropriate action as a result of service
  deterioration
• Participates in customer satisfaction surveys, obtaining feedback from
  customers with respect to service level attainment and service quality
  and feeding information into the service improvement process
• Provides first line escalation point for customer service dissatisfaction
• Recommends working practice improvement ideas with the team,
  passing them to the Problem Management Controller and / or Service
  Owner for approval and action
• Provides individual input to Problem Management service
  improvement

5.4 Knowledge Engineer

Job Purpose

Responsible for providing all aspects of knowledge insertion into the
appropriate tools. These responsibilities include identifying knowledge
bases to build, finding sources of expertise, acquiring the knowledge, and
implementing the knowledge systems.

Major Tasks

• Creating a process to easily identify the knowledge needed
• Implementing the knowledge, which includes inserting, quality
  assuring, delivering and supporting the knowledge
• The Knowledge Engineer may also be responsible for routine
  follow-up, an occasional backup for Level 1, contacting vendors, and minor
  bug fixes
• The Knowledge Engineer will participate in out of hours support

5.5 Level 2 Support (Operations/Other)

Job Purpose

Level 2 support is responsible for problem determination and resolution,
and for bypass, recovery and / or circumvention when the Helpdesk
(Level 1) or operations functions are unable to resolve the problem.
Operations have specific responsibility for identifying those problems
that are caused by systems and operational activities.

Major Tasks

• Timely acceptance of responsibility for resolving problems which are
  assigned by the Helpdesk
• Timely reaction based on priority of the problem
• Meeting the established objectives for the problem resolution priority
• Determining the failing component or the cause of the problem
• Creating bypass/recovery/circumvention procedures, making the
  decisions as to when they need to be invoked, and invoking them
  when necessary
• Providing the solution to the problem or contacting the vendor to
  resolve
• Updating the resolution section of the problem record; working with
  the Helpdesk when the problem status changes, when there is
  activity, and when the problem is resolved
• Assisting with Problem Determination when requested by others

Operations

• Notify the Helpdesk of problems, in the operations environment,
  which will affect the user community
• Identify the failing component or the cause of the problem
• Assist the Helpdesk with problem determination when requested
• Help determine the availability of Bypass/Recovery Procedures
• Obtain approval for Bypass/Recovery procedures and execute them
  when necessary or contact the appropriate group to perform
  Bypass/Recovery
• Update the problem record or have the Helpdesk update it

5.6 Vendors (Level 3)

Job Purpose

Vendors are a critical part of the problem management support
process.

Major Tasks

• Provide timely, skilled service dispatch and resolution
• Provide feedback on the results of each assigned problem

Need to add more of a Level 3 description, so that level 3 can be integrated into the
process.


6. Appendices

6.1 Appendix A: Assigning Severity Codes


The impact of a problem is a composite of many factors: the number of clients
affected, the type of service disrupted, the length of outage, the number of times the
problem has recurred, the availability of a workaround, and the length of time the
problem has been open.
Severity codes provide the means for assigning a value to a problem so that the impact
of the problem can be communicated to the people involved in the Problem
Management Process. Help Desk personnel assign the severity code
for client problems when the problem record is created.

Severity Level: Severity 1
TSD Keyword: Critical
Impact Description:
• All BBI branches and/or all ATMs
• Based on the Bank's ability to process value through these channels
• Escalated Severity 2 incidents

Severity Level: Severity 2
TSD Keyword: Severe
Impact Description:
• Major service impact to any Group business (Ark Life, F&L, CM etc., up to 50
  branches or ATMs)
• Based on the Bank's ability to process value through these channels or businesses
• Escalated Severity 3 incidents

Severity Level: Severity 3
TSD Keyword: Significant
Impact Description:
• Major problems
• Large business or systems
• Large non-branch business units, big branch or support department

Severity Level: Severity 4
TSD Keyword: High
Impact Description:
• Small user group within a business unit, small branch

Severity Level: Severity 5
TSD Keyword: Medium
Impact Description:
• Single user
• Low customer impact

Severity Level: Severity 6
TSD Keyword: Low
Impact Description:
• Default
• All requests and batch fails
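
The severity levels and TSD keywords can be held in a simple lookup. The sketch below assumes the keywords map to Severity 1 through 6 in the order they are listed (Critical, Severe, Significant, High, Medium, Low). The `escalate` helper is hypothetical: the table only mentions escalated Severity 2 and Severity 3 incidents, so applying the same one-level step to every level is an assumption.

```python
# Assumed mapping of severity level to TSD keyword, in the order listed.
TSD_KEYWORDS = {
    1: "Critical",
    2: "Severe",
    3: "Significant",
    4: "High",
    5: "Medium",
    6: "Low",
}

# The table lists "Default" under the lowest severity level.
DEFAULT_SEVERITY = 6


def tsd_keyword(severity):
    """Return the TSD keyword for a severity level from 1 to 6."""
    if severity not in TSD_KEYWORDS:
        raise ValueError(f"unknown severity level: {severity}")
    return TSD_KEYWORDS[severity]


def escalate(severity):
    """Move an incident up one severity level; Severity 1 stays at 1.

    Assumption: every level escalates by one step, as the table shows
    for Severity 3 to 2 and Severity 2 to 1.
    """
    if severity not in TSD_KEYWORDS:
        raise ValueError(f"unknown severity level: {severity}")
    return max(1, severity - 1)
```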

Sample Incident/Problem Close Codes


A = User Error
B = Request for Information / Education / Advice
C = Desktop Hardware
D = Desktop Software
E = System Hardware
F = System Software
G = Network
H = Security
I = Change
J = Duplicate Call
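
Close codes become useful for trend reporting once they are counted. The following minimal sketch tallies closed records by close code using the sample codes above; the function name and the idea of passing a plain list of code letters are assumptions made for illustration.

```python
from collections import Counter

# Sample close codes as listed above.
CLOSE_CODES = {
    "A": "User Error",
    "B": "Request for Information / Education / Advice",
    "C": "Desktop Hardware",
    "D": "Desktop Software",
    "E": "System Hardware",
    "F": "System Software",
    "G": "Network",
    "H": "Security",
    "I": "Change",
    "J": "Duplicate Call",
}


def close_code_summary(codes):
    """Count closed records per close code description.

    `codes` is an iterable of single-letter close codes; an unknown
    letter raises KeyError so that bad data is not silently dropped.
    """
    counts = Counter()
    for code in codes:
        counts[CLOSE_CODES[code.upper()]] += 1
    return dict(counts)
```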


6.2 Appendix B: Managing Escalation


Escalation is a normal part of the problem management process, which recognises that
some problems will not be resolved within established time frames.
The Helpdesk, with the participation of the appropriate level 2 departments and
managers, manages the escalation process. The purpose of the escalation process is to
bring additional resources to a problem which is not meeting the resolution objective
for any number of reasons, such as lack of resource, problem more difficult to resolve
than anticipated, lack of attention on the part of the client etc.
The escalation process is the means for bringing additional effort and emphasis to a
problem.
Level 2 support is responsible for responding to the escalation and negotiating a
solution. Level 2 is responsible for the technical quality of the resolution plan,
ensuring that:

• The plan will result in problem resolution
• The problem can be resolved in the projected time frame
• The resolution will be acceptable to the client. If not, an acceptable agreement
  must be made with the client.

The escalation process is triggered when:

• The target time for problem resolution will be missed
• The identified Level 2 department does not accept responsibility for the resolution
• The client escalates the problem
• There is a critical application or system exposure
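
The four triggers above can be expressed as a simple check. This is a minimal sketch under stated assumptions: the `ProblemStatus` fields are hypothetical names chosen for the example, and "target time will be missed" is approximated by comparing an expected resolution time with the target.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ProblemStatus:
    # Field names are illustrative assumptions, not a real tool's schema.
    target_resolution: datetime     # agreed target time for resolution
    expected_resolution: datetime   # current estimate of when it will be resolved
    level2_accepted: bool           # Level 2 department accepted responsibility
    client_escalated: bool          # the client has escalated the problem
    critical_exposure: bool         # critical application or system exposure


def escalation_reasons(status):
    """Return the list of escalation triggers that apply to a problem."""
    reasons = []
    if status.expected_resolution > status.target_resolution:
        reasons.append("target time for problem resolution will be missed")
    if not status.level2_accepted:
        reasons.append("Level 2 department does not accept responsibility")
    if status.client_escalated:
        reasons.append("client escalated the problem")
    if status.critical_exposure:
        reasons.append("critical application or system exposure")
    return reasons
```

Escalation is triggered whenever the returned list is not empty; returning the reasons, not just a yes/no flag, gives the Helpdesk the details to record against the problem.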

Escalation is meant to focus management attention on a specific problem. Escalating a
problem should ensure that:

• The problem is resolved or bypassed
• The client is satisfied
• Responsibility is assigned, a plan is put in place, and a target for resolution exists
• The required resources are available

The process works as follows:

• The Helpdesk determines that escalation is needed and identifies the departments
  to be involved
• An Escalation Action Line is added with the relevant details
• The Helpdesk Manager identifies the appropriate managers or supervisors to be
  involved. They set the objectives of the escalation and identify who needs to be
  involved as part of the resolution team
• The Helpdesk provides the history of the problem (via the Helpdesk call record)
  and ensures that an action plan is developed
• The team develops an action plan that outlines the action and sets target times and
  ensures resource commitments
• If there is no agreement on a plan, or if the objective is missed, the problem is
  escalated to the next level of management
• The assignee ensures that the affected department/clients are notified and are in
  agreement with the plan. If they are not, then agreement must be obtained
• The Helpdesk documents the results of the escalation
• The assignee notifies the appropriate management of the situation and plan
• The assigned Level 2 department is responsible for updating the problem call in a
  timely fashion.

6.3 Appendix C: Support Levels


Support levels define the problem management functions to be performed by the staff
and departments. Example support levels are described below. They can help each
department determine how well prepared it is to support the problem
management process.

Level 1

• Act as the first point of contact for clients
• Perform problem logging and tracking
• Answer basic operational and product knowledge questions
• Resolve most procedural and usage problems
• Perform problem determination for some applications and some hardware, and
  network usage problems. Level 1 should be able to perform routine Problem
  Determination for PC workstations, key generic applications, and the network
• Dispatch problems to level 2 or vendors

Level 2

• Be able to operate and install
• Take responsibility for problem resolution
• Isolate complex problems to failing component
• Fix routine technical problems
• Identify bypass and recovery procedures
• Work with vendors to resolve problems
• Use diagnostic tools
• Update problem tracking system

Level 3 (usually the vendor)

• Work with level 2 to resolve complex problems
• Supply solutions with target time frames

6.4 Appendix D: Problem Management System Participants

The process participants are the XXX IT departments and groups identified below.

• Client
• Problem Management Process Owner
• Helpdesk (Level 1)
• Operations
• Other XXX IT departments (Level 2), e.g. ITD, Networking
• Management
• Vendors (Level 3)
