Problem Management
Process Guide
Process Re-engineering
Problem Management Process
Version Control
Document
Version
Name
Status
Date
Author
Signature
Date
Table of Contents
1. Introduction
6. Appendices
1. Introduction
This document sets out the overall Problem Management Process for the XXX IT
infrastructure and environment. It takes existing XXX Problem Management
practices, the AAA Process Model and ITIL best practices as input.
2.
2.1 Service Description
Problem Management is the ongoing service concerned with minimising the impact of
problems affecting the availability and services of the service delivery environment,
whilst minimising expenditure of resource and maintaining the highest level of client
satisfaction.
This process captures information about problems and resolves them according to
XXX standards and policies. Problems flow in from, and out to, the XXX Incident
Management process.
The process identifies, documents, analyses, tracks and resolves all problems within
the XXX IT environment.
The suggested Problem Definition is: any deviation from an expected norm. That is, a
problem is any event resulting in a loss or potential loss of the availability or
performance of a service delivery resource and/or its supporting environment. This
includes errors related to systems, networks, workstations and their connectivity;
hardware, software, and applications. Problems can be recognised at any point in
the environment and can be identified using a variety of automated and
non-automated methods.
An incident is a single occurrence of a difficulty that affects the normal or
expected service of the user. The usual priority when an incident occurs must be to
restore normal service as quickly as possible, with minimum disruption to the users.
A problem is the underlying cause of one or more incidents, the exact nature of which
has not yet been diagnosed. Restoring normal service to the users should normally
take priority over investigating and diagnosing problems, although this may not
always be possible.
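The relationship between incidents and problems described above can be sketched as a small data model. This is an illustrative sketch only: the class names, field names and identifier formats are assumptions and are not taken from any XXX system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Incident:
    """A single occurrence of a difficulty affecting a user's service."""
    incident_id: str               # hypothetical identifier format
    description: str
    service_restored: bool = False


@dataclass
class Problem:
    """The underlying, not yet diagnosed cause of one or more incidents."""
    problem_id: str                # hypothetical identifier format
    summary: str
    incidents: List[Incident] = field(default_factory=list)
    root_cause: str = ""           # stays empty until diagnosis is complete

    def link(self, incident: Incident) -> None:
        """Attach an incident believed to share this underlying cause."""
        self.incidents.append(incident)

    def users_still_affected(self) -> bool:
        """True while any linked incident has not had service restored."""
        return any(not i.service_restored for i in self.incidents)


problem = Problem("PRB-1", "Intermittent print server failure")
problem.link(Incident("INC-1", "Cannot print from branch PC", service_restored=True))
problem.link(Incident("INC-2", "Print queue stalled"))
print(problem.users_still_affected())  # True: INC-2 is not yet restored
```

Keeping service restoration on the incident and the root cause on the problem mirrors the priority stated above: service can be restored for every linked incident while the problem itself remains undiagnosed.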
2.2 Terminology
INCIDENT
KNOWN PROBLEM
PROBLEM
REQUEST FOR CHANGE
Correcting Problems
Management Information
- Resulting from the other areas
2.3
Prevent reoccurrence
2.4 Scope
The Problem Management service begins with receipt of a problem record.
The assumption upon entering Problem Management is that the problem has already
been logged, as a problem, via the Incident Management Process.
In order to resolve problems, the service includes the following activities:
Notification
Problem Determination
Problem Resolution
Tracking
3.
3.1
1. Notification
The identification of a problem. Examples of a problem might be an outage or an
incorrect or unusual result. This sub-process also includes notifying the
appropriate support structure that there is a problem and a need for assistance.
The initial recording of a problem, including all relevant information that is
available when the problem occurs. This is the introduction of the problem into
the management system. (Source: Incident Management Process)
2. Problem Determination
The collection, analysis and correlation of data to determine and isolate the
cause of the problem.
3.
4. Problem Resolution
The identification, implementation and verification of solutions, and notification
to affected clients.
5. Tracking
The assignment of ownership for resolving problems and the follow-up activity
to ensure that the goals for problem resolution are being met. It includes setting
priorities and escalating issues via the appropriate system.
6.
Note: There will be known problems for which XXX will be waiting on a vendor to
provide the fix.
3.2
[Figure: Problem Management process overview. Sub-processes shown: Notification; Problem Determination; Workaround & Recovery; Problem Resolution; Tracking; Reporting & Control. Flows shown: Problem Record; Updated Problem Record; Problem Workaround; Validated Severity; Level 2 Priority; Escalate to Problem Manager; Project Requests; Escalations; Communications; Closed Problems; Management Information; To Incident Process.]
3.3 Notification
Inputs and Outputs:
- Problem Record
- External Notification
- User Communication
- Assigned Problem
- Escalated record
Roles:
- Problem Management Controller / Problem Manager
[Figure: AIB IT - Notification. Steps shown: Incident from Incident process; Assign Severity; Validate Assignment (Correct / Incorrect); Incorrect Assignment, To Incident process; Validate Severity Level; Allocate & Prioritise; Severity 1 & 2, Critical Situation Management; Escalate to Problem Manager; Escalate to Problem Management; Problem Record.]
The Identification and Notification sub-process includes the following steps:
1.
2.
3.
3.4 Problem Determination
Inputs:
- Updated Problem
Outputs:
- Problem Status (Updated Problem Record)
Roles:
- Problem Management Controller / Level 2/3 support
[Figure: Problem Determination flow. Steps shown: Validated Severity; Cause Identified (Yes / No); Invoke Level 3 Support; Level 2 Priority; Update Problem Record; Prob. Identification Complete (Yes / No); Escalate to Problem Manager; Prob. Workaround.]
2.
3.
Correct Owner?
Determine if the problem has been referred to the correct owner (work
group/queue).
4.
5.
6.
Resolve Incident
The Problem Management Controller co-ordinates resolution actions with the
assistance and participation of the relevant support groups.
The Problem Controller organises communication of the resolution to users via the
Help Desk.
7.
8.
9.
Is it a Problem?
Determine if the reported problem is actually a problem.
Note: The problem record is then closed by way of the Close Request activity of the
Incident Management service.
Adjust Initial Severity/Priority if Required
Notify Severity/Priority Change
This activity invokes Incident Management to register the severity/priority change and
carries on in parallel to Update Problem Record.
Incident Management / Call Management is responsible for resetting the severity/priority.
Update Problem Record
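The severity/priority adjustment steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the `ProblemRecord` fields, the convention that a lower number means a more severe problem, and the `notify_incident_management` callback are all hypothetical. The sketch also runs the notification and the record update one after the other, whereas the process describes them as proceeding in parallel.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ProblemRecord:
    problem_id: str                          # hypothetical identifier
    severity: int                            # assumed: lower number = more severe
    history: List[str] = field(default_factory=list)


def adjust_severity(
    record: ProblemRecord,
    new_severity: int,
    notify_incident_management: Callable[[str, int, int], None],
) -> bool:
    """Apply a severity change and report whether anything changed."""
    if new_severity == record.severity:
        return False                         # no adjustment required
    old = record.severity
    # Notify Severity/Priority Change: Incident Management registers the change.
    notify_incident_management(record.problem_id, old, new_severity)
    # Update Problem Record: keep the new value and an audit entry.
    record.severity = new_severity
    record.history.append(f"severity {old} -> {new_severity}")
    return True


sent = []
record = ProblemRecord("PRB-7", severity=4)
changed = adjust_severity(record, 2, lambda pid, old, new: sent.append((pid, old, new)))
print(changed, record.severity, sent)
```

Passing the notification as a callback keeps the sketch independent of whatever interface Incident Management actually exposes.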
3.5 Problem Workaround & Recovery
Inputs:
- Problem record
- Available workaround
- Operational procedures
- Change management
Outputs:
- Problem status (Updated problem record)
- Project request
- Change Intention
- Configuration Update Details
Roles:
- Problem Co-ordinator / Team Leader
[Figure: Problem Workaround & Recovery flow. Steps shown: Problem Record; Appropriate? (Yes / No); Project Request; Change Management Required; Emergency Change Management Process; Operational Procedures (Implement Bypass; Apply Temp fix; Recover Resources / Services; Verify recovery actions; Backout bypass); Successful Bypass / Recovery; Escalate According to Severity; Update Problem record with details.]
The Problem Workaround & Recovery sub-process includes the following steps:
1.
2.
Project Required?
Based on Policy.
Determine if a project is required to implement the bypass.
- If Yes, proceed to Project Request.
- If No, proceed to Change Management Required?
3.
4.
5.
Change Management
If required, invoke Change Control to approve and schedule the workaround.
6.
7.
Operational Processes
If a project is not required, start the implementation of the workaround or
recovery plan by way of the operational procedures that perform implementation
tasks such as:
- Emergency Change Management
- Implement the bypass
- Apply temporary fixes
- Recover resources and services
- Verify that the bypass/recovery actions work
- Back out the bypass if it was unsuccessful
Note: Operational Processes include Desk-side Support, Software Distribution,
Server Management, Applications Management, and so on.
8.
Successful Workaround?
- If Yes, proceed to Update Problem Record to Indicate Workaround was
Successful
- If No, proceed to Update Problem Record to Indicate Workaround was
Unsuccessful.
9.
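The decision points in the steps above (Project Required?, Change Management, Operational Processes, Successful Workaround?) can be sketched as a small routing function. This is an illustrative sketch, not the XXX implementation: the function names are hypothetical, and it assumes that Change Control approval, when required, comes before the operational implementation.

```python
from typing import List


def workaround_route(project_required: bool, change_required: bool) -> List[str]:
    """Return the ordered activities for implementing a workaround."""
    if project_required:
        # A project is raised instead of implementing through operations.
        return ["Project Request"]
    steps = []
    if change_required:
        # Assumed order: Change Control approves and schedules first.
        steps.append("Change Management")
    steps.append("Operational Processes")
    return steps


def record_update(workaround_successful: bool) -> str:
    """Text for the problem record once the workaround has been attempted."""
    outcome = "Successful" if workaround_successful else "Unsuccessful"
    return f"Workaround was {outcome}"


print(workaround_route(project_required=False, change_required=True))
print(record_update(False))
```

Returning the activities as an ordered list makes the route easy to log on the problem record alongside the outcome text.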
3.6 Problem Resolution
Inputs:
- Problem record
- Knowledge database
- Change Approval
- Change Status Report
Outputs:
- Problem Status (Updated Problem Record)
- Problem Resolution Plan
- Project Request
- Change Intention
Roles:
- Problem Co-ordinator / Level 2 / Level 3
[Figure: Problem Resolution flow. Steps shown: Investigate Solutions; Level 2 Resolution (Yes / No); Escalate to Level 3; Review, Specify and Design Solution; Select Problem Solution; Project Required? (Yes / No); Project Proposal; Project Deferred (Yes / No); Project Work; Develop Resolution Plan; Update PBM Record.]
2.
3.
Review/Design Solution
Review or design the permanent solution for the problem.
4.
5.
Project Required?
Based on Policy.
Determine if a project is required to implement the solution.
- If Yes, proceed to Project Request (in Tracking).
- If No, proceed to Select Problem Solution.
6.
Project Request
If a project is required, invoke Project Request to implement the solution.
7.
Provide Service
After handling the entitlement failure, determine if service is to be provided; that
is, whether the recommended solution or an acceptable alternative will be implemented.
8.
9.
3.7 Problem Tracking
Inputs:
- Problem record
- Knowledge database
- Configuration Information
Outputs:
- Problem Status (Updated Problem Record)
- Problem analysis information
- Root cause analysis
- Possible Problem Solution
Roles:
- Problem Management Co-ordinator / Team Leader
- Problem Management Controller
[Figure: Problem Tracking flow. Steps shown: Co-ordinate / Communicate Incident resolution; Check status of call, provide feedback; Identify issues for investigation; Monitor progress of problems; Follow up enquiries on actions; Update users via Helpdesk; Ascertain Trends; Escalate Problem; Advise Problem Manager; Confirm Resolution; Review Problem Record (Satisfied / Not Satisfied); Communicate Resolution; Route to Problem Resolution; Close Problem; Re-drive Problem; Project Required.]
2.
3.
[Figure: labels shown are Long Term; Short Term; Recurring Issue; Bigger Problem; Project Required / Project Request; Project Proposal; Project Work / Change Mgmt etc.; Update Problem Record.]
3.8 Report and Control Problems
Inputs:
- Problem Record
- Requirement(s) for process improvement
- Documented non-compliances
- Problem Management Measurements and Reports, including trend analyses
Roles:
[Figure: Report and Control Problems flow. Steps shown: Problem Record; Root Cause Resolved (Yes / No); Process Not Working?; Process Improvement required; Project Required?; Problem Resolution Sub-Process; Project Identified Not Implemented; Report / Escalate; Document service improvements; Management Information.]
The Report and Control Problems sub-process includes the following steps:
1.
2.
3.
Project Required?
Based upon the outcome of the analysis of generic incidents or problems, determine
whether or not specific project activities are required.
4.
5.
6.
7.
3.9
[Figure: End-to-end Problem Management process flow, combining the Notification, Problem Determination, Workaround & Recovery, Problem Resolution, Tracking, and Report and Control flows shown in sections 3.3 to 3.8.]
3.10
[Figure: SD problem handling flowchart. Box and decision labels recovered from the diagram:]
- SD resolve problem?
- Set severity & priority & advise customer with ref no.
- Severity 1?
- 2nd pass?
- Group accept?
- Support group ring customer within SLA to discuss problem / give fix time / confirm priority
- Priority change needed?
- Support group perform problem determination (PD) and develop fix
- Change needed?
- Support group inform customer of solution, update problem record with full description and cause code; set record to "open, resolved" status
- Close record
- Record completed correctly?
- Change implemented?
- SD close record
- Cust Sat questionnaire needed?
- Major Incident?
- SD complete questionnaire with customer
- Problem Manager informs Service Continuity Manager
- END
4.
4.1
It should be noted that these reports are not intended to replace the normal operations
reports for systems and network availability/outages. They monitor the progress of the
problem management system and provide guidance on the effectiveness of the problem
management activity.
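As an illustration of the kind of progress figures such a report might contain, the sketch below summarises a set of problem records. The record fields and the chosen measures (open problems per severity, number closed, mean hours to close) are assumptions made for this example; the actual report contents are not specified here.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class ProblemRecord:
    problem_id: str                    # hypothetical identifier
    severity: int                      # hypothetical severity level
    opened: datetime
    closed: Optional[datetime] = None  # None while the problem is still open


def progress_report(records: List[ProblemRecord]) -> Dict[str, object]:
    """Summarise open problems by severity and time taken to close."""
    open_by_severity = Counter(r.severity for r in records if r.closed is None)
    hours = [
        (r.closed - r.opened).total_seconds() / 3600
        for r in records
        if r.closed is not None
    ]
    return {
        "open_by_severity": dict(open_by_severity),
        "closed_count": len(hours),
        # None when nothing has been closed yet, to avoid dividing by zero.
        "mean_hours_to_close": sum(hours) / len(hours) if hours else None,
    }


records = [
    ProblemRecord("PRB-1", 1, datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 13)),
    ProblemRecord("PRB-2", 3, datetime(2024, 1, 2, 9), datetime(2024, 1, 2, 17)),
    ProblemRecord("PRB-3", 3, datetime(2024, 1, 3, 9)),
]
report = progress_report(records)
print(report)
```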
5.
5.1
Major Tasks
- The process owner must build the process. This includes defining
5.2
The Helpdesk personnel play the key role in the day-to-day operation of
the problem management process and, in the majority of incidents,
become the problem owner.
The problem owner/controller assumes responsibility for all
communications and for co-ordinating resolution activity on that problem,
in accordance with severity.
Major Tasks
5.3
Major Tasks
- Defines training and development needs for individuals within the team
- Ensures adherence to staff training plan
- Undertakes performance review meetings with team members in
5.4 Knowledge Engineer
Job Purpose
Major Tasks
5.5 Operations
Major Tasks
5.6 Vendors (Level 3)
Job Purpose
Major Tasks
Need to add more of a Level 3 description, so that Level 3 can be integrated into the
process.
6. Appendices
6.1
Impact Description:
- All BBI Branches and/or all ATMs
- Based on Bank's ability to process value
- Major Problems
- Large Business or Systems
- Large non-branch business units, big branch or support department
- branch
- Single User
- Low Customer Impact
- Default
- All Requests & Batch Fails
TSD Keyword: Critical, Severe, Significant, High, Medium, Low
Severity levels listed: Severity 3, Severity 4, Severity 5, Severity 6
6.2
The identified Level 2 department does not accept responsibility for the resolution.
Responsibility is assigned, a plan is put in place, and a target for resolution exists.
The Helpdesk determines that escalation is needed and identifies the departments
to be involved.
The Helpdesk provides the history of the problem (via the Helpdesk call record)
and ensures that an action plan is developed.
The team develops an action plan that outlines the action and sets target times and
ensures resource commitments.
The assignee ensures that the affected department/clients are notified and are in
agreement with the plan. If they are not, then agreement must be obtained.
The assignee notifies the appropriate management of the situation and plan.
The assigned Level 2 department is responsible for updating the problem call in a
timely fashion.
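The action plan requirements above (actions outlined, target times set, resources committed, affected clients in agreement, management notified) can be expressed as a simple completeness check. This is an illustrative sketch: the `ActionPlan` fields and the wording of the outstanding items are assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class ActionPlan:
    actions: List[str] = field(default_factory=list)
    target_time: Optional[datetime] = None
    resources_committed: bool = False
    clients_agree: bool = False
    management_notified: bool = False


def outstanding_items(plan: ActionPlan) -> List[str]:
    """List what is still missing before the escalation plan is complete."""
    missing = []
    if not plan.actions:
        missing.append("actions")
    if plan.target_time is None:
        missing.append("target time")
    if not plan.resources_committed:
        missing.append("resource commitments")
    if not plan.clients_agree:
        missing.append("client agreement")
    if not plan.management_notified:
        missing.append("management notification")
    return missing


plan = ActionPlan(
    actions=["Replace failed network card"],   # hypothetical action
    target_time=datetime(2024, 1, 5, 17, 0),   # hypothetical target
)
print(outstanding_items(plan))
```

An empty list means every requirement named in the steps above has been met for that plan.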
6.3
Perform problem determination for some applications, some hardware, and
network usage problems. Level 1 should be able to perform routine Problem
Determination for PC workstations, key generic applications, and the network.
Level 2
- Be able to operate and install
6.4
Client
Helpdesk (Level 1)
Operations
Management
Vendors (Level 3)