Service Operation

The document provides an overview of service operation concepts, processes, roles, and objectives. It defines key terminology and explains the goals of service operation as coordinating activities to deliver services at agreed levels while managing supporting technology. Specific processes covered include incident management, event management, problem management, and request fulfillment. The document emphasizes achieving balance between internal technology management and external customer service delivery.

Uploaded by

Aniela Olteanu
Copyright
© All Rights Reserved

Service Operation

Run the system


Course objectives
On completion of this session you will be able to:
• Define and explain some of the key terminology and concepts of
Service Operation
• Explain the high level objectives, scope, basic concepts, process
activities, key metrics, roles, and challenges for Incident
Management
• State the objectives, basic concepts and roles for Event
Management, Request Fulfillment, Problem Management and
Access Management
• Explain the role, objectives, organizational structures, staffing, and
metrics of the Service Desk function
• State the role, objectives, and organizational overlap of:
– Technical Management
– Application Management
– IT Operations Management - includes two functions:
• IT Operations Control
• Facilities Management
Goals
• To coordinate and carry out the activities and
processes required to deliver and manage services at
agreed levels to business users and customers
• To manage the technology that is used to deliver and
support services
• To properly conduct, control and manage the day to
day operations using the well designed and
implemented processes from Service Design and
Service Transition
• To monitor performance, assess metrics and gather
data systematically to enable continual service
improvement
Principles and scope
• Principles
– Managing day-to-day activities and technology
– Executing processes to optimize cost and quality
– Enabling the business to meet its objectives
– Effective functioning of components
• Scope
– Includes the execution of all ongoing activities required to deliver and
support services:
– The services themselves, whether performed by the service provider, an
external supplier, or the user or customer of that service
– Service Management processes which, regardless of where they originate
(such as Change Management in Service Transition), are in use in SO
or receive input from SO
– Technology - the management of the infrastructure used to deliver
services
– The People who manage the technology, processes, and services
Value
• "Service Operation is where the value is seen."
• Service value is modeled in Service Strategy. The
cost of the service is designed, predicted and validated
in Service Design and Service Transition. Measures
for optimization are identified in Continual Service
Improvement. But SO is where any value is actually
realized! Until a service is operational, there is no value
being delivered.
– Services run within budget and ROI targets
– Design flaws fixed and unforeseen requirements satisfied
– Efficiency gains achieved
– Services optimized
Basic Terminology
• Function - a logical concept referring to the
people and automated measures that execute
a defined process, activity or combination of activities
• Group - a number of people who perform
similar activities
• Team - a more formal group
• Department - formal organization structure
• Division - group of departments
• Role - set of connected behaviors or actions
performed by a person, group, or team in a
specific context
Achieving Balance
• One of the key responsibilities for Service Operation is achieving
balance. There are many aspects of balance, each of which must be
carefully considered when deciding upon the optimum way to respond to
a set of requirements.
• SO addresses the conflicts between the status quo and responding to
changes in the business and technological environment.
– Internal (IT as a set of technology components)
Versus
– External (IT as a set of services)
• The external view of IT is the way in which services are experienced by
users and customers. They do not always understand, nor care about, the
details of what technology is used to deliver or manage the services. All
they care about is the service meeting their requirements (Utility and
Warranty).
• The internal view is of the way the IT components and systems are
managed to deliver the services. Since IT systems are complex and
diverse, this often means multiple teams managing their own aspects of
the total solution. They will tend to focus on achieving good performance
and availability of "their" systems.
Achieving balance - continued
• Both views are relevant and necessary, but an
organization that focuses on one extreme or the other
will not achieve value. Too far toward the external view
and promises are often made that can't be met; too far
toward the internal view and expensive services
delivering little customer value result.
– Achieving this balance should consider:
– Maturity of the organization
– Culture of the organization
– Role of IT in the business
– Level of integration of management processes and tools
– Maturity of Knowledge Management (for example,
Problem/Availability data)
Various choices
• Stability versus Responsiveness
– SO needs to ensure stability of the infrastructure and availability of
services. SO also has to respond to change which may at times be
unexpected, or have to happen quickly whilst under pressure.
• Costs versus Quality
– Costs will increase in proportion to the requirement for higher quality.
Marginal improvements for high quality services can be expensive.
SO has to deliver agreed service levels at optimal cost. This
depends to some degree on good strategy and design
activities, especially financial management.
• Reactive versus Proactive
– Reactive activity - fire fighting - is still a reality in some IT
organizations
– Proactive activity, whilst seen as a good thing, can in the extreme
be costly and lose focus
Processes
• Event Management
• Incident Management
• Problem Management
• Request Fulfillment
• Access Management
• The processes listed above are part of SO. Others
that will be carried out or supported during the
SO phase of the Service Lifecycle are Change,
Capacity, Availability, Financial, Continuity, and
Knowledge Management.
Event Management Process
• An Event is any detectable or discernible occurrence that has
significance for the management of the IT infrastructure or the
delivery of service and evaluation of the impact a deviation might
cause to the services.
• An Alert is a warning that a threshold has been reached,
something has changed, or a failure has occurred.
• Events are typically notifications created by an IT service, CI, or
monitoring tool. Events are provided by good monitoring and
control systems, using both active tools generating exceptions and
passive tools detecting and correlating alerts generated by CIs.
• Alerts are often created and managed by System Management
tools and are managed by the Event Management Process.
Objectives and concepts
• Objectives
– To detect events, make sense of them, and determine appropriate
control action
– To act as a basis for automating routine Operations Management
activities
• Concepts
– The following types of events can be identified:
• Information - signifying regular operations
• Warning - unusual but not exceptional
• Exception - would require intervention
– Information events signify regular operation:
– A device indicates that it is still "alive"
– A message indicates an activity has completed normally
– An authorized user logs in to an application
– An email reaches its intended recipient
– Typically no action is required, though clearly all the messages need
to be recorded in case later forensic investigation is required.
Further definitions
• A warning event signifies something unusual has occurred, but
not something that constitutes an exception. Some form of
intervention may or may not be required. At the least, closer
monitoring is probably required. The situation may resolve itself,
for example, an unusual workload mix causes one or more
thresholds to be reached without breaching any service targets,
but when one or more of the tasks completes, normal operational
state is returned.
– An exception signifies that an action is or will be required; for
example:
– An unauthorized attempt to access an application/service
– Some component of a solution is unavailable
– A threshold is breached which is unacceptable
– A scan reveals unauthorized software on a PC
• All types of events generate event notifications. The distinction
between unusual and exception should be clearly defined.
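
As a rough illustration of the three event types above, the sketch below classifies a monitored value as Information, Warning, or Exception. The threshold values, the metric name, and the action strings are assumptions made for this example; they are not taken from the course material, and each organization would define its own.

```python
from dataclasses import dataclass
from enum import Enum


class EventType(Enum):
    INFORMATION = "information"  # regular operation, record only
    WARNING = "warning"          # unusual but not exceptional
    EXCEPTION = "exception"      # intervention is or will be required


@dataclass
class Event:
    source: str   # CI or monitoring tool that raised the notification
    metric: str   # hypothetical metric name, e.g. "cpu_percent"
    value: float


# Hypothetical thresholds; the boundary between "unusual" and
# "exception" must be defined by the organization itself.
WARNING_THRESHOLD = 80.0
EXCEPTION_THRESHOLD = 95.0


def classify(event: Event) -> EventType:
    """Map a measured value onto one of the three event types."""
    if event.value >= EXCEPTION_THRESHOLD:
        return EventType.EXCEPTION
    if event.value >= WARNING_THRESHOLD:
        return EventType.WARNING
    return EventType.INFORMATION


def handle(event: Event) -> str:
    """Return an illustrative control action; every event is logged."""
    kind = classify(event)
    if kind is EventType.EXCEPTION:
        return "log and raise incident"
    if kind is EventType.WARNING:
        return "log and monitor closely"
    return "log only"
```

For example, `handle(Event("srv01", "cpu_percent", 97.0))` would hand the event over to Incident Management, while a value of 50.0 is only recorded.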
Roles
• Typically Event Management tasks would be
delegated to these functions:
– Service Desk (or operations bridge) - if the event is
within the scope of their function (as an incident)
– Technical and Application Management -
involved in design and transition so that the
appropriate events are designed and built into the
services and that procedures are in place
– IT Operations - may well monitor and respond
where delegated; actions to be taken included in
SOPs for teams
Incident Management Process
• An Incident is:
– "An unplanned interruption to an IT Service or a reduction in
the quality of an IT Service."
– Failure of a Configuration Item that has not yet had an impact
on a Service is also an Incident; for example, failure of one
disk from a mirror set.
– Normal Service is defined as in line with the SLA.
• Objectives
– To restore normal service operation as quickly as possible
– To minimize the adverse impact upon business operations
– To ensure that the best possible levels of service quality and
availability are maintained.
Scope
• Any Event which disrupts or could disrupt a service
• Incidents can be reported/logged by:
– Technical staff
– Users
– Event Management
• Note:
• Not all Events are Incidents
• Service Requests are not Incidents (Service Requests are handled by
Request Fulfillment)
• Covers the activities of the IM process:
– Identification - has the incident occurred?
– Logging - all must be logged; relevant information recorded to ensure full
history recorded
– Categorization - must be relevant, consistent, supported by the tool set
– Prioritization - impact and urgency together give the priority code; it can
be dynamic, with guidance given to support staff
– Initial diagnosis - SD analyst
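
The prioritization step above can be sketched as a simple lookup from impact and urgency to a priority code. The three-level scale and the specific codes (1 = highest) are assumptions for illustration only; the course material does not prescribe a particular matrix.

```python
# Hypothetical three-level scale for both impact and urgency.
LEVELS = ("high", "medium", "low")

# Hypothetical matrix: (impact, urgency) -> priority code, 1 = highest.
PRIORITY_MATRIX = {
    ("high", "high"): 1,
    ("high", "medium"): 2,
    ("medium", "high"): 2,
    ("high", "low"): 3,
    ("medium", "medium"): 3,
    ("low", "high"): 3,
    ("medium", "low"): 4,
    ("low", "medium"): 4,
    ("low", "low"): 5,
}


def priority(impact: str, urgency: str) -> int:
    """Return the priority code for an incident."""
    if impact not in LEVELS or urgency not in LEVELS:
        raise ValueError(f"impact and urgency must be one of {LEVELS}")
    return PRIORITY_MATRIX[(impact, urgency)]
```

Because priority can be dynamic, calling `priority` again after the impact or urgency of an incident changes yields the updated code.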
Major Incidents
• Major Incidents represent the highest Category of Impact for an
Incident. A Major Incident results in significant disruption to the
business. Special procedures need to be followed to ensure that
all resources are available to deal with the incident speedily. The
organization defines what constitutes a major incident.
Metrics
• Metrics are monitored and reported to assess the effectiveness
and efficiency of the process and its operation.
• Key metrics may be used for identifying process compliance to
support Critical Success Factors (CSF).
– Number of incidents
– Breakdown by stage
– Backlogs
– Number and percentage of major incidents
– Mean time to resolve by impact code
– Percentage handled within time
– Number and percentage of incidents handled per SD agent
– Number and percentage resolved remotely
– Number of incidents per incident model
– Breakdown by time of day
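
Two of the metrics listed above, mean time to resolve by impact code and percentage handled within time, could be computed along these lines. The `Incident` record and its field names are hypothetical, and the target time is assumed to come from the agreed service levels.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class Incident:
    impact: str              # impact code, e.g. "high" (hypothetical values)
    hours_to_resolve: float  # elapsed time from logging to resolution
    target_hours: float      # agreed resolution target (assumed from the SLA)


def mean_time_to_resolve_by_impact(incidents):
    """Group incidents by impact code and average their resolution times."""
    grouped = defaultdict(list)
    for inc in incidents:
        grouped[inc.impact].append(inc.hours_to_resolve)
    return {impact: mean(hours) for impact, hours in grouped.items()}


def percent_within_target(incidents):
    """Percentage of incidents resolved within their target time."""
    if not incidents:
        return 0.0
    on_time = sum(1 for inc in incidents
                  if inc.hours_to_resolve <= inc.target_hours)
    return 100.0 * on_time / len(incidents)
```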
Roles
• Incident manager
• First line - Service Desk
• Second line - greater technical skills
• Third line - specialist groups
• An Incident Manager has responsibility for:
– Driving the efficiency and effectiveness of the process
– Producing management information
– Managing the work of support staff
– Monitoring the effectiveness of IM and recommending
improvements
– Developing and maintaining IM systems
– Managing Major Incidents
– Developing and maintaining the IM process and procedures
Challenges
• The challenges to successful Incident
Management are:
– Early detection ability
– Need for logging and use of Self Help
– Availability of Problem and Known Error information
– Integration into Configuration Management System
– Integration into Service Level Management process
• Self Help is being increasingly used as a
consistent option to take pressure off first line
staff and reduce cost. Web-based self help
requires technology linked to Request
Fulfillment and Event Management.
Request Fulfillment Process
• The term Service Request describes varying demands placed on
IT by users. Many are actually small changes - low risk, frequently
occurring, low cost, and so on (such as password reset or
software installation on a single PC) - or information requests.
• As service requests can occur frequently and are low risk, they are
better handled as a separate process. This removes pressure
from Incident and Change Management.
• Objectives
– To provide a regular channel for users to request and receive
standard services
– To provide information to users about service availability and access
– To source and deliver components of standard services
– To assist with general information, complaints or comments
Definition and Concepts
• A request from a User for information, advice,
or for a Standard Change or for Access to an
IT Service, such as to reset a password or to
provide standard IT Services for a new User.
• Many requests recur frequently, so standard
mechanisms can be defined for dealing with
them. These are referred to as Request
Models.
• Concepts
– Predefined process-flow (model)
– Standard changes
– Request models
Roles
• Service Desk - initially requests will be handled
by the service desk
• Incident Management - service requests can
be part of the IM process
• Service Operation teams - fulfilling the specific
request may involve other Ops teams from
within support; however, it is unlikely that there
will be any new or separate roles created for
handling requests
Self help
• Self Help is the use of technology to handle Service
Requests and some Incidents.
• The characteristics of self help are that it is web-based,
available 24 x 7, and that the customer is able to pick
and choose from a variety of services on a menu, and
then place them in a "shopping cart" for check-out.
Problem Management Process
• The Problem Management process manages problems through
their lifecycle. It includes all activities required to diagnose the
root cause of incidents and to determine and implement resolutions
through change and release processes. PM will maintain
information about problems, workarounds, and resolutions. PM has
a close relationship with Incident Management.
• Objectives
– To prevent problems and resulting incidents from happening
– To eliminate recurring incidents
– To minimize the impact of incidents that cannot be prevented
• Problem Definition
– "The unknown cause of one or more incidents"
Concepts
• Concepts
– Problem Models - pre-defined steps for handling
recurring types of problems
– Workaround - a temporary way of overcoming an
incident, reducing or eliminating it where a full
resolution is not yet available
– Known Errors - where diagnosis is complete and a
workaround or permanent resolution found
– Known Error Database - stores Known Error
records; available during Incident and Problem
Diagnosis
– Resolution - action taken to repair an Incident or
Problem or implement a workaround
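
To make the Known Error Database concept more concrete, here is a minimal sketch of how a KEDB might be consulted during incident diagnosis. Matching records by overlapping symptom keywords is an assumption chosen for simplicity; the class names, fields, and sample record are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KnownError:
    error_id: str
    symptoms: frozenset  # keywords describing the fault (hypothetical scheme)
    workaround: str      # temporary way of overcoming the incident


class KnownErrorDatabase:
    """Toy in-memory KEDB; a real one would be part of the toolset."""

    def __init__(self):
        self._records = []

    def add(self, record: KnownError) -> None:
        self._records.append(record)

    def find(self, observed: set) -> Optional[KnownError]:
        """Return the record sharing the most symptoms, or None."""
        best, best_overlap = None, 0
        for record in self._records:
            overlap = len(record.symptoms & observed)
            if overlap > best_overlap:
                best, best_overlap = record, overlap
        return best


kedb = KnownErrorDatabase()
kedb.add(KnownError("KE-1", frozenset({"printer", "offline"}),
                    "restart the print spooler"))
match = kedb.find({"printer", "offline", "floor2"})
```

Here `match.workaround` gives the analyst a documented workaround to apply while a permanent resolution is still pending.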
Activities
– Detection
– Logging
– Categorization
– Prioritization
– Diagnosis
– Create KE record
– Resolution
• The reactive process is similar in its flow of
activities to the Incident process.
• Proactive PM is considered as part of
Continual Service Improvement.
Roles
• The Problem Manager role is responsible for:
– Liaison with all resolving groups to ensure speedy
resolution within SLA targets
– Ownership of the KEDB including protection, KE
inclusion, and search algorithms
– Formal closure of Problem records
– Liaison with suppliers and third parties regarding
their obligations
– All aspects related to major problems
• Resolving Groups are specialist teams with
in-depth knowledge and skills that will typically deal
with particular problems.
Access Management Process
• Access Management (also known as Rights or Identity
Management) is the process of granting users the right to use a
service while preventing unauthorized access.
• Security and Availability define policies for who is allowed to
access what under which conditions. Collectively they are defining
the Confidentiality, Integrity, and Availability (CIA)
requirements/constraints for the users, services and the data.
Access management is the execution level of ensuring/enforcing
those policies.
• Objectives
– Execute the policies and actions defined in Information Security and
Availability Management
– To provide the right for users to be able to use a service or group of
services
Concepts
• Access - the level and extent of a service's functionality
or data that a user is entitled to use
• Identity - the information about users that distinguishes
them as an individual and verifies their status within the
organization
• Rights (privileges) - settings whereby a user is
provided access - read, write, execute, change, delete
• Service groups - aggregation of a set of users
accessing a common set of services
• Directory services - a specific type of tool used to
manage access and rights
Activities
• Requesting access
• Verification
• Providing rights
• Monitoring identity status
• Logging and tracking access
• Removing or restricting rights
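
The activities above can be sketched as a small class that verifies a request, provides rights, logs each decision, and removes rights. This is only an illustration under stated assumptions: the class, its method names, and the `approved` flag (standing in for whatever authorization the policies require) are all hypothetical.

```python
class AccessManager:
    """Toy model of the Access Management activities."""

    def __init__(self, known_users):
        self._known_users = set(known_users)  # identities on record
        self._rights = {}                     # user -> set of (service, right)
        self.audit_log = []                   # logging and tracking access

    def request_access(self, user, service, right, approved):
        """Verify the request, then provide the right if it is legitimate."""
        # Verification: the user must be known and the request approved.
        if user not in self._known_users or not approved:
            self.audit_log.append(("denied", user, service, right))
            return False
        # Providing rights.
        self._rights.setdefault(user, set()).add((service, right))
        self.audit_log.append(("granted", user, service, right))
        return True

    def is_allowed(self, user, service, right):
        """Check whether the user currently holds the right."""
        return (service, right) in self._rights.get(user, set())

    def remove_rights(self, user):
        """Remove all rights, e.g. when the user leaves the organization."""
        self._rights.pop(user, None)
        self.audit_log.append(("removed", user, None, None))
```

The policies themselves come from Information Security and Availability Management; a class like this only executes them.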
Roles
• Service Desk - main point of contact for
requesting access to services and
communication with users, as well as handling
access related incidents
• Technical and Application Management -
have roles throughout the service lifecycle:
ensuring access control is designed into
services and tested so that it performs as
designed, and performing access management
activities during Service Operation
• IT Operations Management - monitors and
supports the process
Service Operation Functions

• A function is a logical concept that refers to people and automated
measures, and that executes a defined process, an activity or
combination of activities.
• Logical functions perform specific activities and processes - not
necessarily mapping to organizational structures or individuals.
– Service Desk
– Technical Management
– Application Management
– IT Operations Management - includes two functions:
• IT Operations Control
• Facilities Management
Service Desk Function
• The Service Desk should be a Single Point Of Contact (SPOC)
for IT users, handling incidents and service requests. Normal
operations are defined by SLAs, procedures, and so on.
• Objectives
– To restore normal operations as quickly as possible
• Responsibilities/Activities
– Logging all incidents/service requests, and allocating categorization and
prioritization codes
– First line investigation and diagnosis
– Resolving incidents/service requests
– Escalation
– Closing all resolved incidents and requests
– Conducting customer satisfaction surveys
– Communication with users - progress, information
– Updating CMS as agreed and authorized
Types of service desks
• Local
• Centralized
• Virtual
• Follow the sun - global solution that can provide 24 x 7 support
• Specialized
Staffing and metrics
• Staffing
– Staffing levels - must meet the demands of the business at any given time
– Skill levels - balance between response and resolution times, and cost; basic
requirements are typically a balance of technical skills, communication skills, and business
knowledge
– Training - must be adequately trained and kept up to date
– Staff retention - loss of staff can be disruptive; incentives and environment should be
considered
– Super users - can be used to liaise with the Service Desk, can filter requests and issues,
and cascade information
• Metrics
– Customer/user satisfaction
– First-line resolution rate
– Average time to resolve an incident
– Average time to escalate an incident
– Average cost of handling an incident
– Percentage of customer/user update completed on time
– Average time to review and close a resolved call
– Breakdown of number of calls by time/day
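
As a small worked example of two Service Desk metrics from the list above, the sketch below computes the first-line resolution rate and a breakdown of calls by hour. The `Call` record and its fields are hypothetical, introduced only for this illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Call:
    hour_logged: int              # hour of day, 0-23
    resolved_at_first_line: bool  # True if the Service Desk resolved it


def first_line_resolution_rate(calls):
    """Percentage of calls resolved without escalation."""
    if not calls:
        return 0.0
    resolved = sum(1 for c in calls if c.resolved_at_first_line)
    return 100.0 * resolved / len(calls)


def calls_by_hour(calls):
    """Count calls per hour of day, e.g. to inform staffing levels."""
    return Counter(c.hour_logged for c in calls)
```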
Technical Management Function
• Technical Management is a function that covers groups, teams,
and departments that provide technical expertise and overall
management of the IT infrastructure.
• Technical Management is the custodian of technical knowledge
and expertise related to managing the IT infrastructure. It ensures
the knowledge required to design, test, manage and improve IT
services is identified, developed, and refined.
• Technical Management provides the resources to support the IT
management lifecycle. It ensures that resources are effectively
trained and deployed.
• Technical Management is responsible for:
– Maintenance of the technical infrastructure
– Documenting and maintaining the technical skills required to manage
and support the IT infrastructure
– Diagnosis of, and recovery from, technical failures
Application Management Function
• "To help plan, implement, and maintain a
stable technical infrastructure to support
the organization's business processes."
• Application Management is similar to
Technical Management but manages
applications throughout their lifecycle.
Application management -
objectives
• To support the organization's business processes by helping to identify
functional and manageability requirements for application software
• To assist in the design and deployment of applications
• To assist in the ongoing support and improvement of applications
• The Application Management function:
– Contributes to the decision on whether to buy an application or build it
– Is the custodian of technical knowledge and expertise relating to the
management of applications
– Provides resources to support the Service Management Lifecycle
• Application Management's objectives are achieved through:
– Well designed and highly resilient, cost-effective technical topology
– Required functionality is available to achieve business outcomes
– Use of adequate technical skills to maintain infrastructure in optimum condition
– Swift use of technical skills to diagnose and resolve any technical failures
Operations Management Function
• The IT Operations Management function applies to
the group/team/department that performs day-to-day
operational activities. These activities involve the day-
to-day running of the IT infrastructure to deliver IT
services at agreed levels to meet stated business
requirements.
• The IT Operations function:
– Executes the ongoing activities and procedures required to
manage and maintain the IT infrastructure so as to deliver and
support IT Services at the agreed service levels
– Continually adapts to business requirements and demand
Objectives and concepts
• Objectives
– To maintain the 'status quo' to achieve stability of the organization's day-to-day
processes and activities
– To regularly scrutinize and improve service at reduced cost, while maintaining
stability
– To swiftly apply operational skills to diagnose and resolve any IT operations
failures that occur
• Concepts
Operations Management has two functions which are generally formal
structures:
– IT Operations Control:
• Ensures routine operational tasks are carried out
• Provides centralized monitoring and control activities
• Based on an Operations Bridge or Network Operations Centre
– Facilities Management:
• Covers the management of the physical environment
• Refers to Data Centers, computer rooms
• May include management of outsourced facilities
Overlap of Functions and
Operational Activities
– Technical Management and Operations Management both
play a role in management and maintenance of the IT
infrastructure
– Technical Management and Application Management both
play a role in the design, testing, and improvement of CIs that
form part of IT services
– Application Management and Operations Management
both play a role in application support
• In addition to being responsible for the specific
processes covered in this session, Service Operation is
responsible for executing many processes that are
owned and managed by other areas of Service
Management. This shows the processes in which
operations becomes involved.
