
ITIL v3 Incident Management Process: ... Restoring Normal Service Operation As Soon As Possible

The document discusses the ITIL v3 Incident Management process. It defines key terms such as incident and service request, and describes the incident lifecycle of open, in progress, resolved and closed. The purpose of incident management is to restore normal service operation as quickly as possible while minimizing impact. It provides value to the business by reducing downtime and improving alignment between IT and business priorities. Incident priority is assigned based on urgency and impact, and targets are set for resolution times. Escalation procedures bring in more specialist support or management authority when timely resolution is at risk. Standard models are used to handle recurring incidents efficiently.

Uploaded by

Manjunatha Rao
Copyright
© All Rights Reserved

Incident Management

ITIL v3 Incident Management Process

...restoring normal service operation as soon as possible
Incident Management

Content
• Key definitions
• Incident Lifecycle
• Purpose and Objectives
• Value to business
• Incident Priority
• Incident Priority and Target resolution times
• Major Incidents
• Escalations – Hierarchical & Functional
• Standard Incident Models
• Process Workflow
• Process Interfaces
• Information Management
• Challenges
• Risks
• Critical success factors (CSF)
• Key Performance Indicators (KPIs)
• Roles and Responsibilities
Incident Management

Key definitions

Incident
• unplanned interruption to an IT service
• reduction in the quality of an IT service
• failure of a CI that has not yet impacted an IT service (e.g. redundant component failure)

Service Request
Formal request from a user for something to be provided.
... e.g. a request for information or advice; to reset a password; or to install a workstation for a new user
... NOT a disruption to the agreed service
... Request Fulfilment process manages the lifecycle of Service Requests

Workaround
Method of bypassing an Incident or Problem (temporary fix).
* It is not a permanent solution but something that is used to get the service up and running till the real solution is found.
Incident Management

Incidents have a Lifecycle!...

"Lifecycle is the series of changes that happen to a living creature/project/product/etc. over the course of its lifetime."

Status codes indicate where Incidents are in relation to the lifecycle. E.g.:
• Open
• In progress
• Resolved
• Closed
• (Pending)

... Incident management is the process responsible for managing the lifecycle of all incidents.
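
The status codes above can be read as a small state machine. The following is a minimal sketch in Python, not part of the original slides. The slide lists the status codes but does not say which moves between them are allowed, so the transition table `ALLOWED` and the function name `advance` are assumptions made for illustration.

```python
from enum import Enum


class Status(Enum):
    OPEN = "Open"
    IN_PROGRESS = "In progress"
    PENDING = "Pending"
    RESOLVED = "Resolved"
    CLOSED = "Closed"


# Assumed transition table: the slide names the statuses only, so the
# permitted moves below are an illustrative guess, not an ITIL rule.
ALLOWED = {
    Status.OPEN: {Status.IN_PROGRESS},
    Status.IN_PROGRESS: {Status.PENDING, Status.RESOLVED},
    Status.PENDING: {Status.IN_PROGRESS},
    Status.RESOLVED: {Status.CLOSED, Status.IN_PROGRESS},  # re-opened
    Status.CLOSED: set(),
}


def advance(current: Status, new: Status) -> Status:
    """Return the new status, or raise if the move is not allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {new.value}")
    return new


status = Status.OPEN
for step in (Status.IN_PROGRESS, Status.RESOLVED, Status.CLOSED):
    status = advance(status, step)
print(status.value)  # Closed
```

A real Service Management tool would normally make this table configurable; the point here is only that each status change is checked against the lifecycle.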
Incident Management

Incident Management is like fire-fighting!


Incident Management

Purpose and objectives

Purpose
...restore normal service operation as quickly as possible
...minimize the adverse impact on business
...ensuring best possible levels of service quality and availability are maintained according to SLAs

Objectives
• standardized methods and procedures
• increased visibility and better communication
• priorities aligned with business
• user satisfaction with the quality of IT services.
Incident Management

Value to business

• reducing service downtime
• reducing Incident impact to business
• aligning IT to business priority
• identifying possible improvements to service
• identification of additional requirements (e.g. training, new service) as a result of handling multiple incidents.

* Value of IM is highly visible to the business. For this reason, IM is often one of the first processes to be implemented in Service Management projects.
Incident Management

Incident Priority

Incident priority
... assigned to ensure that the support groups will pay the required attention to the incident.
...based on the Urgency and Impact.

IMPACT + URGENCY = PRIORITY

Impact: How much damage, if not fixed soon?
Urgency: How fast does it need to be fixed?
Incident Management

Incident Priority & Timescales

Incident priority & Timescales
• must be agreed for all incident-handling stages
• based upon the overall incident response and resolution targets within SLAs
• captured as targets within OLAs and Underpinning Contracts (UCs).
• support groups should be made aware of these timescales.
• Service Management tools automate timescales and escalate as required

Example of priority coding system:

URGENCY \ IMPACT   High   Medium   Low
High               1      2        3
Medium             2      3        4
Low                3      4        5

Priority code   Description   Target Resolution Time
1               Critical      1 hour
2               High          8 hours
3               Medium        24 hours
4               Low           48 hours
5               Planning      Planned
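
The example priority coding system can be expressed as a lookup. The following is a minimal sketch in Python, not part of the original slides. It relies on one observation about the example matrix: if High, Medium and Low are numbered 1, 2 and 3, every cell equals impact + urgency - 1. The names `LEVELS`, `TARGETS` and `priority` are illustrative; the codes and target times are copied from the example table.

```python
# Numeric rank for each impact/urgency level (an assumption made so that
# the example matrix can be computed instead of stored cell by cell).
LEVELS = {"High": 1, "Medium": 2, "Low": 3}

# Priority code -> (description, target resolution time), from the table.
TARGETS = {
    1: ("Critical", "1 hour"),
    2: ("High", "8 hours"),
    3: ("Medium", "24 hours"),
    4: ("Low", "48 hours"),
    5: ("Planning", "Planned"),
}


def priority(impact: str, urgency: str) -> int:
    """Priority code for an impact/urgency pair (1 is the most urgent)."""
    return LEVELS[impact] + LEVELS[urgency] - 1


code = priority("High", "Medium")
print(code, *TARGETS[code])  # 2 High 8 hours
```

An organisation whose matrix is not symmetric in this way would store the cells explicitly rather than compute them.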
Incident Management

Major Incidents

Major Incident = High Impact + High Urgency

Major Incident
... highest category of impact for an incident.
...results in significant disruption to the business.
...should have separate procedure

A separate procedure !!! (for major incidents)
... shorter timescales and greater urgency
... separate major incident team under the direct leadership of the incident manager
... Informing Management and Customer
... Service Desk ensures that all activities are recorded and users are kept fully informed of progress.
Incident Management

Functional and Hierarchical Escalation

Escalation
Escalation is the mechanism that assists timely resolution of an Incident.

Hierarchical Escalation
... can take place at any moment during resolution.
Reasons might be:
• SLA threat
• Extra resources required
• Need to inform higher management

Functional Escalation
... means involving more specialist personnel or access privileges to solve the incident. Departmental boundaries may be exceeded.

Diagram: the hierarchical (authority) axis rises from the support teams to the Service Desk Manager, 2nd Line Manager and 3rd Line Manager, and on to the IT Service Manager; the functional (competence) axis runs from the Service Desk Support Team to the 2nd Line and 3rd Line Support Teams.
Incident Management

Standard Incident models

Standard Incident Models
Standard Incident Models are designed and implemented for handling standard (recurring) incidents more efficiently.

An Incident Model should include the following:
• Steps required to handle the incident and their order
• Responsibilities
• Timescales and thresholds for completion
• Any escalation procedure
• Any evidence preservation activities

...Support tools can then be used to automate handling of standard incidents.
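
The elements of an Incident Model map naturally onto a record that a support tool could store. The following is a minimal sketch in Python, not part of the original slides. The class `IncidentModel`, its field names and the `password_lockout` example are hypothetical and only mirror the listed elements.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class IncidentModel:
    name: str
    steps: List[str]                    # handling steps, in order
    responsibilities: Dict[str, str]    # step -> responsible role
    timescale_hours: float              # threshold for completion
    escalation_contact: str             # who to involve past the threshold
    evidence_activities: List[str] = field(default_factory=list)

    def next_step(self, completed: int) -> Optional[str]:
        """Return the step after `completed` finished steps, or None."""
        return self.steps[completed] if completed < len(self.steps) else None


# Hypothetical model for a recurring incident; every value is illustrative.
password_lockout = IncidentModel(
    name="Account locked out",
    steps=["Verify user identity", "Unlock account", "Confirm login works"],
    responsibilities={
        "Verify user identity": "Service Desk",
        "Unlock account": "Service Desk",
        "Confirm login works": "User",
    },
    timescale_hours=1.0,
    escalation_contact="2nd Line Support Team",
)

print(password_lockout.next_step(1))  # Unlock account
```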
Incident Management

Process Workflow
Incident Management

Process Workflow – Incident Identification

TRIGGERS: Incidents ... from Event Mgmt, from web interface, from users, from suppliers, from technical staff

Incident Identification:

• Usually it's unacceptable to wait until a user logs an incident

• Monitoring assures:
  • Early detection of failure/potential failure
  • Quick start of Incident Management

* IDEAL SITUATION: Incident is resolved before it has an influence on users!


Incident Management

Process Workflow – Incident Logging

• All incidents must be
  • fully logged (INC. NO., DATE, TIME, OWNER, CONTACT... )
  • ... and date/time stamped;
• Full historical record of all incidents must be maintained

... Opportunity fixes must also be logged !!! ALL must be logged

* Incident is logged -> Resolution time count starts !
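
The logging rules can be illustrated with a small record type. The following is a minimal sketch in Python, not part of the original slides. The `INC` number format, the field names and the in-memory `history` list are assumptions; a real tool would persist records in its own database.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from itertools import count
from typing import List

_numbers = count(1)  # simple in-memory incident number sequence


@dataclass
class IncidentRecord:
    owner: str
    contact: str
    description: str
    # Incident number assigned at logging (the format is an assumption).
    number: str = field(default_factory=lambda: f"INC{next(_numbers):06d}")
    # Date/time stamp taken at logging; the resolution clock starts here.
    logged_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


history: List[IncidentRecord] = []  # full historical record of all incidents


def log_incident(owner: str, contact: str, description: str) -> IncidentRecord:
    """Create a stamped record and keep it in the history."""
    record = IncidentRecord(owner, contact, description)
    history.append(record)
    return record


first = log_incident("Service Desk", "j.doe", "Printer offline")
print(first.number)  # INC000001
```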


Incident Management

Process Workflow – Incident Categorization

• Categorization indicates the type of incident being logged

* Category is often related to team that will handle the incident from the Service Desk

* Categories are often multi-level. For example:

- Hardware – Server – Memory Board – Card Failure
- Software – Application – Finance Suite – Purchase Order System

* It's useful if incident and problem categories are alike
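
Multi-level categories are easier to compare when they are split into levels. The following is a minimal sketch in Python, not part of the original slides. It assumes categories are written as paths separated by an en dash, as in the examples above; the function names `parse_category` and `shares_prefix` are illustrative.

```python
from typing import List


def parse_category(path: str) -> List[str]:
    """Split a category path such as 'Hardware – Server' into its levels."""
    return [level.strip() for level in path.split("–")]


def shares_prefix(a: str, b: str, depth: int) -> bool:
    """True if two category paths agree on their first `depth` levels."""
    return parse_category(a)[:depth] == parse_category(b)[:depth]


incident_cat = "Hardware – Server – Memory Board – Card Failure"
print(parse_category(incident_cat)[0])  # Hardware
```

If incident and problem records share one category scheme, a check like `shares_prefix` is enough to find candidate matches between them.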


Incident Management

Process Workflow – Service Request?

• A part of categorisation will be to check if it's a Service Request

• If it is -> It will be transferred to Request Fulfilment process

* Requests are not incidents and should be handled differently


Incident Management

Process Workflow – Incident Prioritization

• Prioritisation determines how the incident will be handled by support staff and by support tools

* Remember: PRIORITY = Urgency + Impact (+ SLA)

URGENCY \ IMPACT   High   Medium   Low
High               1      2        3
Medium             2      3        4
Low                3      4        5

Priority code   Description   Target Resolution Time
1               Critical      1 hour
2               High          8 hours
3               Medium        24 hours
4               Low           48 hours
5               Planning      Planned
Incident Management

Process Workflow – Major Incident?

• If priority indicates Major Incident it must be handled by following the Major Incident
Procedure

* Staff must be familiar with the procedure !


Incident Management

Process Workflow – Incident Diagnosis

• Service Desk Analyst will determine with the user:


• Full symptoms (what has gone wrong)
• How to correct it ?
• Using:
• Diagnostic scripts
• Known Error information

If the incident is resolved -> it will be closed (after informing the user !!!)
Incident Management

Process Workflow – Functional Escalation

If the incident is NOT resolved -> it will be escalated (and user informed !!!)

• FUNCTIONAL ESCALATION (to the next level of support) occurs when:

  • Current level of support can't resolve the incident
  • Current level of support has reached the timescale for resolving the incident

* Ownership of the incident stays with the Service Desk!


* Service Desk will track and monitor progress !
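
The two functional escalation triggers can be written as a single decision. The following is a minimal sketch in Python, not part of the original slides. The list `SUPPORT_LINES`, the function `next_support_line` and the hour values are illustrative assumptions.

```python
SUPPORT_LINES = ["1st line (Service Desk)", "2nd line", "3rd line"]


def next_support_line(current: int, can_resolve: bool,
                      hours_spent: float, timescale_hours: float) -> int:
    """Return the index of the support line that should work the incident."""
    # Escalate when this level cannot resolve the incident, or when it
    # has used up the time agreed for its level.
    needs_escalation = (not can_resolve) or hours_spent >= timescale_hours
    if needs_escalation and current < len(SUPPORT_LINES) - 1:
        return current + 1
    return current


owner = "Service Desk"  # ownership does not move when the incident escalates
line = next_support_line(0, can_resolve=False, hours_spent=0.5,
                         timescale_hours=2)
print(SUPPORT_LINES[line], "| owner:", owner)  # 2nd line | owner: Service Desk
```

Only the working level changes; the owner stays fixed, which matches the note that the Service Desk keeps ownership and tracks progress.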
Incident Management

Process Workflow – Hierarchical Escalation

If the incident is NOT resolved -> it will be escalated (and user informed !!!)

• HIERARCHICAL ESCALATION (up the management chain) occurs when:

  • SLA breaches are threatened
  • Extra resources are needed to resolve the incident
  • Senior Management needs to be aware of / approve the steps required

* May also be initiated by the customer / user if they consider it necessary !
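
The hierarchical escalation triggers can be collected into one check. The following is a minimal sketch in Python, not part of the original slides. The slides do not define when an SLA breach counts as "threatened", so the 80% threshold (`warn_fraction`) is an assumption, as are the function and parameter names.

```python
from typing import List


def hierarchical_reasons(hours_elapsed: float, target_hours: float,
                         extra_resources_needed: bool,
                         approval_needed: bool,
                         requested_by_user: bool = False,
                         warn_fraction: float = 0.8) -> List[str]:
    """List the reasons (if any) to escalate up the management chain."""
    reasons = []
    # Assumption: a breach is "threatened" once 80% of the target is used.
    if hours_elapsed >= warn_fraction * target_hours:
        reasons.append("SLA breach threatened")
    if extra_resources_needed:
        reasons.append("Extra resources needed")
    if approval_needed:
        reasons.append("Senior management awareness/approval needed")
    if requested_by_user:
        reasons.append("Requested by customer/user")
    return reasons


print(hierarchical_reasons(7.0, 8.0, False, False))  # ['SLA breach threatened']
```

An empty list means no hierarchical escalation is called for at this point.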


Incident Management

Process Workflow – Investigation & Diagnosis

• More detailed information might be collected on:


• Exactly what has gone wrong
• Understanding the chronological order of events
• Confirming the full impact
• Identifying events that may have triggered the incident
• Knowledge searches
• Previous incidents
• Changes made

* All actions and findings must be recorded !

* As many actions as possible should be performed in parallel to save time
Incident Management

Process Workflow – Resolution and Recovery

• When the resolution has been identified it should be applied and tested
• If satisfactory a time / date stamp is recorded as this is the end of downtime
• The incident record must be updated with the details of actions taken
• The incident should be returned to the Service Desk for closure action
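
The time/date stamp recorded at resolution makes it possible to measure how long the incident took and compare that with its target. The following is a minimal sketch in Python, not part of the original slides. It assumes the clock starts at the logging stamp; the function names and the example dates are illustrative.

```python
from datetime import datetime, timedelta


def resolution_time(logged_at: datetime, resolved_at: datetime) -> timedelta:
    """Elapsed time between the logging stamp and the resolution stamp."""
    if resolved_at < logged_at:
        raise ValueError("resolution stamp precedes logging stamp")
    return resolved_at - logged_at


def met_target(logged_at: datetime, resolved_at: datetime,
               target_hours: float) -> bool:
    """True if the incident was resolved within its target time."""
    elapsed = resolution_time(logged_at, resolved_at)
    return elapsed <= timedelta(hours=target_hours)


# Illustrative stamps for a priority 2 incident with an 8 hour target.
logged = datetime(2024, 1, 15, 9, 0)
resolved = datetime(2024, 1, 15, 15, 30)
print(resolution_time(logged, resolved))  # 6:30:00
print(met_target(logged, resolved, 8))    # True
```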
Incident Management

Process Workflow – Incident Closure

• Before closing the incident the SD Analyst must:

  • Make sure the user is informed and happy with the solution
  • Check that the assigned incident category is the correct one (if not, correct it)
  • Check that the incident documentation is complete

• If there is indication the incident might recur, a Problem record should be raised

* The Incident is closed by Service Desk !


* Re-opening incidents – strict rules must exist for this action !!!
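
The closure checks can be expressed as a short checklist function. The following is a minimal sketch in Python, not part of the original slides. The function `closure_check`, its parameters and the message texts are illustrative assumptions.

```python
from typing import List, Tuple


def closure_check(user_satisfied: bool, category_correct: bool,
                  documentation_complete: bool,
                  may_recur: bool) -> Tuple[List[str], bool]:
    """Return (outstanding items, whether to raise a Problem record)."""
    outstanding = []
    if not user_satisfied:
        outstanding.append("Confirm the user is informed and satisfied")
    if not category_correct:
        outstanding.append("Correct the incident category")
    if not documentation_complete:
        outstanding.append("Complete the incident documentation")
    # An incident that might recur leads to a Problem record, whatever
    # the state of the other checks.
    return outstanding, may_recur


todo, raise_problem = closure_check(True, True, False, may_recur=True)
print(todo, raise_problem)  # ['Complete the incident documentation'] True
```

The incident would be closed only once the list of outstanding items is empty.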
Incident Management

Process Interfaces

Event Mgmt
• Event can (automatically) raise incident

Request Fulfilment
• Request handling can also be handled by IM process

Problem Management
• Incidents (repeated) often point to problems
• Solving the problems should reduce the number of incidents

Asset & Configuration Mgmt
• Provides data used to identify and progress incidents
• IM assists in verification of CMS

Change Management
• Changes are often reasons why incidents occur
• Incidents can lead to changes required for resolutions/workarounds
Incident Management

Process Interfaces – cont.

Service Level Management
• IM must restore service as agreed in SLAs – thus, targets for IM are determined considering SLM and vice-versa

Service Catalogue Management
• Service Desk will consult Service Catalogue in handling incidents

Capacity Management
• IM may trigger monitoring of a system or service performed by Capacity Management
• Workarounds used by Incident Management can come from Capacity Management

Availability Management
• Incident data is important in determining availability.

...
• ...
Incident Management

Involvement in Information Management

Incident Record
• Reference Number
• CI Impacted
• Dates and Times
• Originator
• Affected users
• Symptoms
• Category, Priority
• Actions Taken
• Relationships
• Closure details

Diagram: the Incident Record is shown alongside the CMS, CMDB, Diagnostic Script and KEDB.
Incident Management

Challenges
• Ability to detect incidents as early as possible.
• Convincing all staff that ALL incidents must be logged.
• Making information available about known errors to ensure staff learn from previous incidents.
• Configuration Management System integration
• Integration into the Service Level Management processes in order to correctly assess the impact and priority of incidents
• Defining escalation procedures.


Incident Management

Risks
• Incidents not being handled in appropriate timescales
• Incident backlog (incidents bogged down and not progressed as intended)
• Poor information availability (for resolving/escalating...)
• Mismatch in objectives/expectations for Incident Management

...due to a lack of or inappropriate training ?
...due to inadequate support tools ?
...due to lack of support tools integration ?
...due to poorly-aligned or non-existent OLAs or UCs ...SLAs ?

....due to ...... ?
Incident Management

Critical Success Factors (CSF) & Key Performance Indicators (KPI)

CSF & KPI Examples

• CSF Resolve incidents as quickly as possible minimizing impacts to the business
  • KPI Mean elapsed time to achieve incident resolution or circumvention, broken down by impact code
  • KPI Breakdown of incidents at each stage (e.g. logged, work in progress, closed etc.)
  • KPI Percentage of incidents closed by the service desk without reference to other levels of support (often referred to as "first point of contact")
  • KPI Number and percentage of incidents resolved remotely, without the need for a visit
• CSF Maintain quality of IT services
  • KPI Total numbers of incidents (as a control measure)
  • KPI Size of current incident backlog for each IT service
  • KPI Number and percentage of major incidents for each IT service
• CSF Maintain user satisfaction with IT services
  • KPI Average user/customer survey score (total and by question category)
  • KPI Percentage of satisfaction surveys answered versus total number of satisfaction surveys sent
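
Two of the KPIs above can be computed directly from closed incident records. The following is a minimal sketch in Python, not part of the original slides. The sample rows and the function names are invented for illustration; real figures would come from the Service Management tool.

```python
from collections import defaultdict
from statistics import mean

# Illustrative closed incidents:
# (impact code, hours to resolve, closed by the service desk alone?)
incidents = [
    ("High", 0.5, False),
    ("Medium", 6.0, True),
    ("Medium", 10.0, False),
    ("Low", 20.0, True),
]


def mean_resolution_by_impact(rows):
    """KPI: mean elapsed resolution time, broken down by impact code."""
    grouped = defaultdict(list)
    for impact, hours, _ in rows:
        grouped[impact].append(hours)
    return {impact: mean(hours) for impact, hours in grouped.items()}


def first_point_of_contact_rate(rows):
    """KPI: percentage of incidents closed without other support levels."""
    return 100.0 * sum(1 for _, _, first in rows if first) / len(rows)


print(mean_resolution_by_impact(incidents))
# {'High': 0.5, 'Medium': 8.0, 'Low': 20.0}
print(first_point_of_contact_rate(incidents))  # 50.0
```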
Incident Management

Roles

Incident Management Process Owner
IM Process Owner - accountable for the process

Incident Manager
Incident Manager ...manages the work of Incident Support Staff
• Developing and maintaining IM process and procedures (driving efficiency and effectiveness)
• Managing the work of incident support staff (first- and second-line)
• Managing Major Incidents
• Monitoring the effectiveness of IM ...recommending improvement
• Developing and maintaining the IM systems
• Producing management information

1st, 2nd and 3rd line support
1st line support (normally the Service Desk)
• Identify, log, categorize, prioritize, diagnose, resolve/escalate and close an incident.
2nd line support * (generally Technical/Application Management)
• Investigate, diagnose, resolve (recover) an incident.
3rd line support * (External experts or Internal ones)
• Investigate, diagnose, resolve (recover) an incident.
Incident Management

THE END

ITIL v3 Incident Management Process

...restoring normal service operation as soon as possible
