Patch Management Automation for Enterprise Cloud

Hai Huang, Salman Baset, Chunqiang Tang

IBM TJ Watson Research Center

Ashu Gupta, Madhu Sudhan KN, Fazal Feroze, Rajesh Garg, Sumithra Ravichandran
IBM Global Technology Services

Abstract
Applying patches to operating systems, middleware, and applications is considered a major IT pain
point for several reasons. Operating systems and software come in myriad types, there are
interdependencies among the updates, operating systems, and applications, there is a lack of
standardization among different enterprise customers, and testing the applications and
operating systems after an update is challenging. As a result, human operators are involved in different
stages of the patching process, making it costly and cumbersome. Cloud can help standardize
various offerings to customers, and potentially remove human operators. However, it introduces other
challenges, such as VM time zones and restoring VMs from snapshots, which are not present in
traditional enterprise environments. We discuss the challenges of achieving patch automation in a
Cloud, and then describe our solution.
What is patching?
• Software patches
– Security: Fix vulnerabilities
– Functional: Add features, improve existing functions, change software behaviors
– Impacted systems: Hypervisor, OS, middleware, applications
• Vendor Tools: Windows Server Update Services, Red Hat Network, etc.
– Only handle Windows/Red Hat systems
– Standalone tools with no integration with other management tools, e.g., change management
• 3rd Party Tools: IBM TEM, VMware vSphere Update Manager
– TEM: Hypervisor, guests, middleware, apps. vSphere: hypervisor & guests
– Scheduling & deployment made easy but largely rely on human involvement, no tool integration
• Amazon EC2, Microsoft Azure, many private Clouds
– User manages patching. VMs can be potentially vulnerable when first provisioned
• Rackspace
– Patching is a part of managed service ($0.12 per hour, $100 per month)

NOMS 2012 Application Track 2

Section 1: Introduction
Operating systems, middleware, and applications need to be regularly patched to guard against
newly found vulnerabilities or to provide additional functionality. In the non-enterprise space, updates
are typically handled by turning on the auto-update feature of the operating system or middleware,
which can apply the patches as they are released by vendors. However, these mechanisms for
automatically updating are problematic in an enterprise environment. First, IT administrators need to
assess the impact of newly released patches before rolling them out to their IT infrastructure. Such
pre-assessment is clearly not possible using the automatic update feature. Second, IT administrators
need to have a consistent view of their IT infrastructure, including a vulnerability assessment, which
cannot be achieved through the automatic updates of individual ISVs (independent software vendors).
Lastly, IT administrators need to assess the post-patch impact, including failures of existing
applications running on their IT infrastructure. All these changes should be recorded in a change
management system for audit and failure recovery purposes, which requires integration with other
management tools.
Section 1.1: Patch Management Tools
Broadly, there are two types of patch management tools: ISV-specific tools and third-party tools.
As discussed in the Introduction, each ISV has its own mechanism for
updating its software. However, by definition, those mechanisms are vendor specific. Examples
include Windows Server Update Services (WSUS) [WSUS’11] and Red Hat Network
(RHN) [RHN’11]. Third-party tools include Tivoli Endpoint Manager (formerly BigFix) [TEM’11] and
VMware vCenter Update Manager [VMwareUpdate’11]. Although these tools make the scheduling
and deployment of patches relatively easy, they still rely on humans to schedule the patches.
Moreover, these tools are not integrated with other management tools such as asset databases,
change management databases, and failure recovery databases. As a result, they require additional
human involvement during the end-to-end patch management process.
Among the public Cloud providers, EC2 and Azure VMs leave it to the customer to patch or update
the virtual machines. As a result, newly provisioned VMs can be vulnerable from the moment they are created.
Rackspace and Microsoft Azure (web and worker roles) provide patching of VMs as part of managed
services, but there is a significant cost associated with it. For example, Rackspace offers the
managed services at $100 per VM per month. It is unknown whether the patch management process
in Rackspace and Azure is fully automated or partially manual.

Patch management in enterprise
• Patch management challenges in traditional enterprise
– Lack of standardization in software/hardware/services
• Human intervention in every step (notification, approval, scheduling, deployment, post-deployment)
– Every customer wants a different patch management policy to suit their needs
• Patch management solution provided to one customer can be completely different from another
• Little or no solution reuse
– Labor intensive and cost ineffective


Section 1.2: Enterprise Requirements for Patch Management

Patch management in an enterprise environment is far more complicated than just using a patch
deployment tool to apply patches to endpoints, because making any change to a working system
might cause failures. In an enterprise environment, such failures can impact business in catastrophic
ways if not carefully controlled. As a result, there is often a safeguarding process around patching to
ensure risk is minimized. We describe this process in more detail in Section 2. Automating the patch
management process is challenging for several reasons. First, there is a lack of standardization in
IT services. This happens because each customer has their own unique hardware and software
stacks on which customized services are run. Moreover, each customer often requires a different
patch policy to suit their business and infrastructure needs. Some customers may want to patch their
machines immediately while other customers may want to fully understand the impact of patches
before scheduling and applying them.

Patch management in Cloud
• Cloud opportunities
– Highly standardized software/hardware stacks
• Aggressively increase the level of services automation in many processes of IT delivery,
including patch management
– A clean slate to offer customers more standardized solutions and services at a lower cost
• Lower provider’s operational costs get passed to customers so they are likely willing to accept standardized services such as patch management

• Cloud challenges
– Extremely large, multi-tenancy, deeper vertical stack due to virtualization


Section 1.3: Cloud Challenges and Opportunities for Patch Management

Cloud presents unique opportunities for patch management. Cloud offers a highly standardized,
clean-slate solution and can substantially lower costs for customers. However,
Cloud also poses unique challenges because it can be an extremely large, multi-tenant environment,
with a deep vertical stack due to virtualization. First, virtual machines can be in different time zones.
Second, since virtual machines run on hypervisors, the scheduling of patches on VMs can interfere with the
maintenance of the data center physical infrastructure. Third, unlike physical machines, a customer
can take snapshots of virtual machines and restore a virtual machine from a snapshot. Restoring a
virtual machine from a snapshot is challenging from a patching perspective, because a compliant VM
that is restored from an earlier snapshot may immediately become non-compliant.

Patch management process workflow
[Figure: sequence diagram of the patch management process in three phases: patch notification, patch scheduling, and patch deployment and post-deployment. Participants: vendor, patch advisory system, system admin team, application team, change management system, deployment tools, and target system. Steps shown, in order: security notification; vulnerability notification; initiate change; approve change; request for middleware/app shutdown; shutdown completed; initiate patch cycle; actual patching; patch cycle completed; OS health check; middleware/app health check; health check results; update/close change; update/close CIRATS record; create incident for failed systems.]


Section 2: Patch Management Process Workflow

To improve and automate the patch management process, one must first understand how it works today.
The above figure shows an example of a three-phase patch management process: i) notification, ii)
scheduling, and iii) deployment and post-deployment. In the notification phase, when new patches
are released by vendors, their availability is detected either by subscribing to vendors’
notification systems or by manually checking patch bulletins periodically. For each new patch, the patch
advisory system is responsible for notifying the system administration teams whose managed assets are
affected so that the patch can be scheduled, applied, and tested in the least intrusive manner and
timeframe. The patch advisory system is also used to keep track of the lifecycle of a patch, e.g., whether
a patch has succeeded or failed, on which endpoints the patch was applied, who did the
work, at what time the work was done, and, if a patch was not applied on time, whether the customer
was notified.
In the scheduling phase, various parties such as system administrators, support teams,
application owners, and customers are notified of the new patch. A change management system is used
by the various teams to negotiate which systems will be patched, at what time the patch will be
applied, whether any exceptions should be made, etc. Information exchange and negotiation
between multiple teams result in unavoidable wait times. This is especially true nowadays as teams
are usually situated in different geographic locations with different time zones. One simple back-and-forth
between two parties can take 24 hours or longer, and when multiple parties are involved, it is
not hard to imagine that a simple routine task can take a very long time to reach a decision,
with wait times dominating.
In the deployment and post-deployment phase, the actions agreed upon in the previous phase are
executed by the delivery teams, which must coordinate their efforts, e.g., when an OS patch is to be applied,
the application team first needs to shut down the running applications and middleware to avoid data
corruption and other potential inconsistency problems. When patching is completed, various health
checking tasks and regression tests are performed to verify that the applied changes did not
adversely impact anything. As one can see, using patch deployment tools to apply patches onto
endpoint systems is only one of many steps in the end-to-end process.
Major areas of labor and cost inefficiencies
[Figure: the patch management workflow sequence diagram, annotated with four problem areas: waiting for customer approval can be long; the human operator is highly involved; highly scalable tools are needed to support a large number of endpoints; manual interaction with other management tools is time consuming.]

Section 2.1: Major Areas of Labor and Cost Inefficiencies

From analyzing the existing patch management workflow, we identified several major areas of labor
and cost inefficiencies. The above figure shows that the system administrator is involved in almost
every step of this process, from receiving patch notifications, to negotiating approval
for applying patches to endpoints, to coordinating deployment actions with other support teams and
customers. Even the most dedicated system administrators need to take breaks from time to time,
cannot work 24x7, and are often handling multiple concurrent tasks. As a result, a human
operator cannot stay fully synchronized with the current progress of the workflow. When
multiple human operators are collaborating, these small time gaps add up quickly and cause a
seemingly simple task to take much longer to complete than most people would expect. The
key to automating this process is to take human operators (e.g., system administrators, support
teams, customers) out of the critical path and involve them only to handle
complicated problems or exceptions that are hard to automate. We found that improvements in four areas
are critical to streamlining the workflow.
Patch management workflow automation

• Solution standardization
– Customer accepts one of the standardized change windows and patch management policies at
contract time
– No more waiting for customer approval for change requests

• Tools integration
– Enterprise system management tools have well-defined interfaces and can often be interfaced
programmatically
– Integration is critical to streamlining the end-to-end patch management workflow

• Tool automation, scalability, and multi-platform support
– Labor and cost associated with patch management increase with number of endpoints, number of
customer accounts, number of policies, etc. Need to make Δcost as close to zero as possible
• Eager Patch Testing and Lazy Health Checking
– To minimize post-deployment problems, patches should be thoroughly tested before automated deployment
– Post-deployment health checking can be delayed until problems actually occur


An automated solution first requires the process to be standardized. A major hurdle that has
prevented patch management from being automated in existing managed environments is the lack of
service standardization, e.g., a patch management solution tailored to one customer often cannot be
used or easily adopted by another customer. Secondly, patch management is not simply using a
patch tool to apply patches to endpoint systems, but rather a collaboration of multiple management
tools and teams, e.g., change management and patch advisory tools. Thirdly, in a large enterprise
environment, patch tools need to be able to manage a large number of managed entities in a
scalable way while handling the heterogeneity that is unavoidable in any large environment.
Lastly, to avoid problems caused by automatically applying patches to endpoints, thorough testing of
patches beforehand is mandatory. However, post-deployment health checking can be
done lazily, as there is a plethora of monitoring tools, e.g., at the platform, middleware, and application
layers, that help detect patch-related problems, in addition to any ad hoc health checking
procedures.
Standardize patch management services
• Patch severities (examples) – tradeoffs between timeliness and downtime
– Low/Medium severity patches to be applied every 3 months
– High/Critical severity patches to be applied every month

• VM categories (examples) – staged patching minimizes bad patches from propagating to production systems
[Figure: timeline of staged patching. A new patch is first auto-patched and tested on the service provider’s testbed VMs. It then reaches the customer’s VMs in stages: Test VMs when the change window starts, Dev VMs at change window +1 week, and Prod VMs at change window +2 weeks.]

We implement the means to deliver a standardized patching solution based on an agreed set of
policies with the customer. We leave the actual policy to be defined by the customer and will
honor the policy as long as it is within our pre-defined guidelines.


Section 2.2: Solution Standardization for Patch Management

Patch management services are often provided to customers in a one-off fashion, as each customer
has different business needs and compliance requirements and uses a different set of management
tools. To ensure all the needs and requirements are met, support teams and customers
need to negotiate very carefully, for each new patch, which patches to apply, when to apply them,
what dependencies exist between systems, and whether any exceptions should be made. We
devised an alternative solution, which consists of a policy agreed upon between the service
provider and the customer at contract time. The policy semantics are general enough to capture
many different business needs, yet flexible enough that customers can request exceptions as
the need arises. This allows an automated solution to perform tasks based on the agreed-upon
policy, and when the customer’s needs change, it adjusts its actions accordingly. In summary, the
service provider implements the means to deliver a standardized patch management solution
based on an agreed set of policies with customers, and leaves the definition of each policy to the
customer to best match their business needs.

Some parameters that customers can use to define a patch scheduling policy are patch severity and
virtual machine category. Patching causes downtime, and a customer can use patch severity to
balance the amount of downtime due to patching against the benefits of applying a patch. For
example, low and medium severity patches can be applied every 3 months, while high and critical severity
patches can be applied every month. Orthogonal to patch severity, the customer can also put
virtual machines into different categories, e.g., Test, Development, and Production. A patch is
scheduled to VMs of different categories in a staged fashion to minimize the chance that a
problematic patch is propagated to production machines.
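
As an illustration only, the two policy parameters above could be encoded roughly as follows. This is a minimal sketch, not the PAE implementation: the names `PATCH_INTERVAL_MONTHS`, `STAGE_OFFSET`, `VM`, `is_due`, and `schedule_patch` are hypothetical, and the values simply reuse the examples given in this section (3-month and 1-month cycles; Test at the start of the change window, Dev one week later, Prod two weeks later).

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical policy values taken from the examples in the text:
# low/medium patches every 3 months, high/critical patches every month.
PATCH_INTERVAL_MONTHS = {"low": 3, "medium": 3, "high": 1, "critical": 1}

# Staged rollout: Test VMs first, Dev one week later, Prod two weeks later.
STAGE_OFFSET = {
    "test": timedelta(weeks=0),
    "dev": timedelta(weeks=1),
    "prod": timedelta(weeks=2),
}


@dataclass(frozen=True)
class VM:
    name: str
    category: str  # "test", "dev", or "prod"


def is_due(severity: str, months_since_last_cycle: int) -> bool:
    """True if a patch of this severity should go into the next cycle."""
    return months_since_last_cycle >= PATCH_INTERVAL_MONTHS[severity]


def schedule_patch(window_start: date, vms: list[VM]) -> dict[str, date]:
    """Map each VM name to the date its change window opens for one patch."""
    return {vm.name: window_start + STAGE_OFFSET[vm.category] for vm in vms}


vms = [VM("web-test", "test"), VM("web-dev", "dev"), VM("web-prod", "prod")]
plan = schedule_patch(date(2012, 4, 2), vms)
```

With this encoding, a production VM is never patched before the same patch has spent two weeks on the customer's Test and Dev VMs, which is the staged behavior described above.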
Tool automation: TEM architecture and APIs
• Tivoli Endpoint Manager (TEM)
– Hierarchical infrastructure: Server, relay, clients
– Management through a console
• Core technologies
– Fixlets: action scripts to test patch applicability and to apply patch
• The script language is flexible and fixlets can be used beyond patching for other automation tasks
– Hierarchical broadcast network overlay for content distribution and data collection
• Programmatic APIs
– Implements Platform APIs and SOAP APIs to allow external entities to control its actions programmatically
• Extensions needed to fully benefit Cloud
– Fully leverage automation enabled through Cloud
– Integrate with other management tools


Section 2.3: Tool Automation

The patching tool we use in our solution is IBM Tivoli Endpoint Manager (TEM) [TEM’11],
formerly known as BigFix. It uses a centralized management infrastructure for managing patches,
endpoints, and subscriptions, and for scheduling and performing actions. Patches are distributed to
endpoints over a hierarchical broadcast network, where caching servers (relays) are placed at strategic
locations to reduce network traffic. These relays are also used to deduplicate data sent back from
endpoints to the central server to minimize reverse traffic. This infrastructure has been shown to
scale to hundreds of thousands of endpoints in a loosely coupled network. One of TEM’s core
technologies is the fixlet, a script that describes the actions a human operator would perform to apply
a patch, e.g., checking the relevance of a patch to an endpoint,
downloading the patch from a URL, verifying the checksum of the downloaded patch, applying the
patch, etc. Writing a fixlet is a one-time effort, and the fixlet can then be reused by many less experienced system
administrators to apply a patch in their own environment in a safer manner.
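
The sequence of steps a fixlet encodes can be pictured with the following sketch. It is written in Python purely for illustration and is not the fixlet language or any TEM API; the function `apply_patch_steps`, the `patch` dictionary layout, the placeholder URL, and the use of SHA-256 are all assumptions made for this example.

```python
import hashlib
from typing import Callable


def apply_patch_steps(
    endpoint_os: str,
    patch: dict,
    download: Callable[[str], bytes],
    install: Callable[[bytes], None],
) -> str:
    """Run the fixlet-like steps in order and return a short status string."""
    # 1. Relevance: skip endpoints the patch does not apply to.
    if endpoint_os not in patch["applies_to"]:
        return "not relevant"
    # 2. Download the patch payload from its URL.
    payload = download(patch["url"])
    # 3. Verify the checksum before touching the system.
    if hashlib.sha256(payload).hexdigest() != patch["sha256"]:
        return "checksum mismatch"
    # 4. Apply the patch.
    install(payload)
    return "applied"


# Usage with stand-in download/install functions (no network access).
payload = b"example patch bytes"
patch = {
    "applies_to": {"rhel6"},
    "url": "https://patches.example.invalid/p1",  # placeholder URL
    "sha256": hashlib.sha256(payload).hexdigest(),
}
installed = []
status = apply_patch_steps("rhel6", patch, lambda url: payload, installed.append)
```

Passing the download and install steps in as functions keeps the ordering logic separate from the platform-specific work, which is one way to read the "write once, reuse many times" property described above.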
Out of the box, however, TEM is an interactive tool, and it is expected that a system administrator
uses a graphical console to perform various actions. If we are to enable end-to-end
automation, the patch deployment tool needs to be driven programmatically. TEM provides two
sets of APIs for this purpose: Platform and SOAP. The SOAP APIs are mainly used to read data from the TEM server, whereas
performing any action (e.g., scheduling a fixlet on an endpoint) requires the Platform APIs. The Platform
APIs are provided via a client-side DLL library on Windows platforms.

Tool integration

[Figure: five tools, Vendor Advisory, Patch Advisory Mgmt, Change Mgmt, Patch Deployment, and Health Check, all connected through the Patch Automation Engine (PAE).]

PAE orchestrates patch management flow from end-to-end

• Patch Automation Engine

– A common substrate that allows orchestration of patch management
workflow from start to end across different management tools and
facilitates communications across tools


Section 2.4: Tool Integration

Various management tools are used during the lifecycle of a patch, e.g., patch advisory, change
management, patch deployment, and health checking. Ideally, there should be a common protocol
and communication bus that would allow tools to communicate with each other via well defined APIs.
However, in practice, such APIs are rarely implemented by all tools in the tool chain. We built a
common substrate layer, called the Patch Automation Engine (PAE), that speaks with each tool using
whichever interface that tool implements. This allows PAE to orchestrate the patch
management workflow from start to end across different management tools. For example, for PAE to
talk to the TEM server, it uses the Platform APIs and SOAP APIs that the TEM server implements, and to
communicate with the change management system (in our case, IBM Integrated Service
Management (ISM)), it uses the SOAP APIs implemented by ISM. The interfaces through which PAE
communicates with the tools in the tool chain are well defined and modular, so if a tool is
replaced by another, only the interface functions need to be replaced while the workflow process stays
the same.
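
The modularity argument can be sketched as follows. This is an assumption-laden illustration, not PAE code: `ChangeMgmt`, `PatchTool`, `InMemoryChangeMgmt`, `AlwaysSucceedsPatchTool`, and `patch_vm` are hypothetical names, and the in-memory classes merely stand in for real ISM and TEM adapters.

```python
from typing import Protocol


class ChangeMgmt(Protocol):
    """Interface the workflow expects from any change management tool."""
    def open_change(self, vm: str, patch: str) -> str: ...
    def close_change(self, change_id: str) -> None: ...


class PatchTool(Protocol):
    """Interface the workflow expects from any patch deployment tool."""
    def schedule(self, vm: str, patch: str) -> bool: ...


class InMemoryChangeMgmt:
    """Stand-in adapter; a real one would call the tool's own API."""
    def __init__(self) -> None:
        self.records: dict[str, str] = {}

    def open_change(self, vm: str, patch: str) -> str:
        change_id = f"CR-{len(self.records) + 1}"
        self.records[change_id] = "open"
        return change_id

    def close_change(self, change_id: str) -> None:
        self.records[change_id] = "closed"


class AlwaysSucceedsPatchTool:
    """Stand-in adapter that pretends every patch applies cleanly."""
    def schedule(self, vm: str, patch: str) -> bool:
        return True


def patch_vm(vm: str, patch: str, cm: ChangeMgmt, tool: PatchTool) -> str:
    """Workflow logic; it depends only on the two interfaces above."""
    change_id = cm.open_change(vm, patch)
    if tool.schedule(vm, patch):
        cm.close_change(change_id)
    return change_id


cm = InMemoryChangeMgmt()
cid = patch_vm("vm-1", "patch-42", cm, AlwaysSucceedsPatchTool())
```

Swapping in a different change management or patch tool would mean writing a new adapter class; `patch_vm` itself would not change.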
Patch Automation Engine Architecture
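[Figure: PAE architecture. The Patch Automation Engine, together with its database (PAE DB) and user interface (PAE UI), connects to the CIRATS server, the ISM server, and the TEM server through the CIRATS, ISM, and TEM interface APIs. Operations shown on these interfaces: open/update/close CR; open/update IR; update/close/markNA; patch->fixlets; test fixlet applicability; schedule/cancel fixlets.]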


Section 3: Design and Implementation

The Patch Automation Engine (PAE) is able to streamline the end-to-end patch management
process largely because of its integration with various management tools, mimicking how a human
operator would use these tools to carry out various tasks. Various interface functions/libraries are
implemented to integrate PAE with these tools. Not all tools can be interfaced with standard protocols
such as SOAP. For example, TEM uses its own proprietary protocol. The interface functions are
defined such that when a tool is changed, e.g., TEM being replaced by another tool, only the
interface functions need to be replaced. This plug-and-play capability allows PAE to be reused
in other environments with all changes contained within a limited scope. PAE uses a local database
instance to record persistent state information. This is used for auditing purposes, and the
intermediate state information can also be used for crash recovery when a failure occurs either in PAE
itself or in one of the services it depends on. In the next section, we take a look at each of the PAE
components in detail.
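
To make the role of the persistent state concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table layout, the state names, and the helper functions are hypothetical; the source does not describe the actual PAE DB schema.

```python
import sqlite3


def open_state_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the local state database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS patch_state ("
        " vm TEXT, patch TEXT, state TEXT,"
        " PRIMARY KEY (vm, patch))"
    )
    return conn


def record_state(conn: sqlite3.Connection, vm: str, patch: str, state: str) -> None:
    """Persist the latest workflow state for one (VM, patch) pair."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO patch_state (vm, patch, state) VALUES (?, ?, ?)",
            (vm, patch, state),
        )


def unfinished(conn: sqlite3.Connection) -> list[tuple[str, str, str]]:
    """After a crash, list the work items that still need attention."""
    rows = conn.execute(
        "SELECT vm, patch, state FROM patch_state"
        " WHERE state != 'closed' ORDER BY vm, patch"
    )
    return rows.fetchall()


conn = open_state_db()
record_state(conn, "vm-1", "patch-42", "change_opened")
record_state(conn, "vm-1", "patch-42", "scheduled")
record_state(conn, "vm-2", "patch-42", "closed")
pending = unfinished(conn)
```

On restart, a workflow engine built this way could resume each pending item from its last recorded state instead of starting the patch cycle over; the same rows double as an audit trail.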
Patch Automation Engine Components

• Steady-state operations: PAE Scheduler, PAE Baseliner, PAE Checker
– VMs are up-to-date to the latest patch level. Need to continue to keep them up-to-date without causing compliance issues
• One-time operations: PAE Activation/Deactivation Module
– When VMs are provisioned, they might not be patched to the latest patch level. Adjustments need to be done before handing these VMs to customers.


Section 3.1: PAE components

PAE supports both one-time operations and steady-state operations. Examples of one-time
operations are VM provisioning and de-provisioning. When a VM is provisioned, it needs to have the
latest patches applied before being handed to the customer. Otherwise, the VM can be vulnerable
from the start. The PAE Activation/Deactivation Module is responsible for detecting newly provisioned
VMs and applying the latest baseline to them. For steady-state operations, which keep VMs in a
compliant state, the Scheduler component detects newly released patches, opens the necessary
change tickets, and once the tickets are approved, schedules the patches onto the relevant VMs
during an acceptable change window. However, things can change dynamically in a large
managed environment, e.g., a customer can change the category of a VM from Test to Production. If
change records have already been opened against this VM and patches have been scheduled, these actions
need to be rolled back and rescheduled accordingly without disturbing other existing actions. The Baseliner
deals with these dynamic aspects of PAE. When patches are applied during change windows, the
Checker component detects whether or not they were applied successfully. If so, change
records are closed and evidence is collected for audit purposes. However, if a patch has failed, it
opens an incident ticket to alert system administrators so that the problem can be resolved within the
allowable time frame.
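
The Checker's decision can be summarized in a few lines. The sketch below is illustrative only: `Tickets` and `check_results` are hypothetical names, and plain lists stand in for the change management and incident systems.

```python
from dataclasses import dataclass, field


@dataclass
class Tickets:
    """Stand-in for the change management and incident systems."""
    closed_changes: list = field(default_factory=list)
    incidents: list = field(default_factory=list)


def check_results(
    results: dict[str, bool],
    change_ids: dict[str, str],
    tickets: Tickets,
) -> None:
    """Close the change record on success, open an incident on failure."""
    for vm, succeeded in results.items():
        if succeeded:
            tickets.closed_changes.append(change_ids[vm])
        else:
            tickets.incidents.append(f"patch failed on {vm} ({change_ids[vm]})")


tickets = Tickets()
check_results(
    results={"vm-1": True, "vm-2": False},
    change_ids={"vm-1": "CR-1", "vm-2": "CR-2"},
    tickets=tickets,
)
```

Only the failed VM produces work for a human operator, which matches the goal of involving operators in exceptions rather than in every step.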
Discussions
• Time synchronization and time zones
– In a Cloud environment, even VMs located in the same physical location can be
configured to use different time zones
– Patch correctness highly depends on the correctness of time on all management
and managed components
• Time synchronization
• Different time zones and daylight saving time

• VM suspension and snapshot

– Suspension and resume
• Suspended VMs cannot be patched, which may cause a resumed VM to be non-compliant
– Snapshot and revert back
• Reverting back to an earlier snapshot may also make a VM non-compliant as it may not have the latest patches


Section 4: Discussion
In this section, we discuss some more complex scenarios related to patching that are Cloud specific
or need to be more carefully handled in a Cloud environment.
Section 4.1: Multiple Time Zones
In a Cloud environment, virtual machines that are physically located in the same time zone can be
configured to operate in different time zones. When a customer’s VMs span multiple time
zones, patches need to be scheduled carefully so that the correct behavior is implemented.
For some patches, the correct behavior is to apply the patch at the same local time on each virtual
machine, e.g., applying MS10-081 from Microsoft to all Windows machines at 11pm in their
respective local time. For other patches, the correct behavior is to apply the patch at the same absolute time to
avoid the mixed-mode problem, where multiple versions of a piece of software run concurrently, resulting
in data corruption. TEM supports both methods of scheduling patches, and it is up to the customer
to specify the intended behavior. The default behavior is to use local time.
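
The difference between the two scheduling modes can be shown with a short sketch. It does not use any TEM API; the function names and VM names are hypothetical, and fixed UTC offsets are used instead of real time zone rules, so daylight saving time is deliberately ignored here.

```python
from datetime import datetime, timedelta, timezone


def local_time_schedule(
    local_start: datetime, vm_offsets: dict[str, int]
) -> dict[str, datetime]:
    """Same wall-clock time on every VM; returns the UTC instant per VM."""
    return {
        vm: local_start.replace(tzinfo=timezone(timedelta(hours=off))).astimezone(
            timezone.utc
        )
        for vm, off in vm_offsets.items()
    }


def absolute_time_schedule(
    utc_start: datetime, vm_offsets: dict[str, int]
) -> dict[str, datetime]:
    """Same instant on every VM; returns the local wall-clock time per VM."""
    return {
        vm: utc_start.astimezone(timezone(timedelta(hours=off)))
        for vm, off in vm_offsets.items()
    }


# Illustrative fixed offsets in hours from UTC (no DST handling).
vm_offsets = {"vm-ny": -5, "vm-tokyo": 9}

# Local-time mode: 11pm on each VM's own clock, so the instants differ.
by_local = local_time_schedule(datetime(2012, 1, 16, 23, 0), vm_offsets)

# Absolute-time mode: one instant everywhere, so the local clocks differ.
by_abs = absolute_time_schedule(
    datetime(2012, 1, 17, 4, 0, tzinfo=timezone.utc), vm_offsets
)
```

In local-time mode the two VMs are patched 14 hours apart, which is harmless for independent machines but is exactly the window in which a mixed-mode problem could arise for cooperating ones.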
Section 4.2: VM Suspension and Snapshot
In a virtualized environment, there are additional modes of operation available to system
administrators and users, such as VM suspension and resume, and snapshot and revert back. The
management console that offers these operations to users needs to be tightly integrated with the patch
management and compliance processes; otherwise, a VM could become noncompliant
unexpectedly. For example, a VM will normally have been patched to the latest
patch level by the automated patch management process described previously before it is suspended. When it
is resumed after an extended amount of time, however, it will most likely be in a noncompliant state with
missing patches. Therefore, it is important that the patch management system catches it up to the
latest patch level before handing the VM back to the user’s control. Likewise, when a VM is reverted back to
an earlier snapshot, baselining the VM to the latest patch level is required.
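
The catch-up step can be sketched as a comparison between the current baseline and what the resumed or reverted VM actually has installed. The function names and the representation of patches as plain strings are assumptions made for this illustration.

```python
from typing import Callable


def missing_patches(baseline: list[str], installed: set[str]) -> list[str]:
    """Patches in the baseline that the VM lacks, in baseline order."""
    return [p for p in baseline if p not in installed]


def on_resume_or_revert(
    installed: set[str], baseline: list[str], apply: Callable[[str], None]
) -> bool:
    """Catch the VM up before handing it to the user.

    Returns True if the VM is compliant with the baseline afterwards.
    """
    for patch in missing_patches(baseline, installed):
        apply(patch)          # hypothetical hook into the patch tool
        installed.add(patch)  # record the new patch level
    return not missing_patches(baseline, installed)


baseline = ["p1", "p2", "p3"]
snapshot_state = {"p1"}  # VM reverted to a snapshot taken before p2 and p3
applied = []
compliant = on_resume_or_revert(snapshot_state, baseline, applied.append)
```

The same routine covers both cases in this section, since a long-suspended VM and a reverted VM differ only in why their installed set lags behind the baseline.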

Related work
• Live Patching
– OS [Lowell’04] [Potter’05] [Arnold’09]
– Application [Tewksbury’01]

• Patching in distributed environment

– Staged upgrade: [Crameri’07]
– Mixed-mode: [Dumitras’10] [Choi’09] [Segal’89]

• Offline patching
– VM image [Zhou’10]


Section 5: Related Work

Live patching is a powerful technique that patches running systems without causing service
downtime. This is particularly useful for enterprise applications where downtime translates directly to
loss in revenue. Lowell et al. [Lowell’04] and Potter et al. [Potter’05] explored OS virtualization
techniques to migrate running applications so that the underlying operating system can be patched
without disturbing the running workload. Arnold and Kaashoek described Ksplice [Arnold’09], where
the source code of a patch is converted to a hot update, which can then be applied to a running kernel
without restarting it. Tewksbury et al. proposed the Eternal system [Tewksbury’01] to enable
distributed CORBA applications to continue to provide service during upgrades, but it does not provide
a general solution. Even though live patching is a powerful technology, it provides no guarantee on
the correctness of the patch, so its benefit is limited in practice and it has not been widely adopted.
Patching complex distributed applications is even more challenging than patching standalone applications.
Crameri et al. proposed the Mirage [Crameri’07] framework to upgrade a distributed application in a
staged manner to significantly reduce upgrade overhead. Several other works [Dumitras’10, Choi’09,
Segal’99] further explored the problem of mixed mode, where multiple versions of an application can
operate concurrently during the upgrade process. Patching complex applications is beyond the scope
of this paper as we focus on typical Cloud workloads.
Zhou et al. described Nuwa [Zhou’10], a patching tool that applies patches directly to VM images. A
benefit of this approach is that it does not require the VM to be started from the image, which would take
time, before the patch can be applied. However, this also implies that the technique needs to
internally handle multiple types of image formats, multiple types of patch package formats, multiple
operating systems, etc. In this paper, we focus more on patching running instances.
Conclusion
• Automated Patch Management Solution
- Fully automated and integrated patch management solution
- Schedule patches on VMs according to customer-agreed policies, while allowing
user-initiated exceptions
- Support multiple categories of VMs (e.g., test, dev, production): staged patching lets the
patch schedule better fit the customer’s business needs and minimizes patching errors
- Support patching of newly provisioned VMs as part of Services Activation & Deactivation
- Detect suspended and resumed VMs and automatically catch them up to the latest patch level
- Support VMs that are reverted to an earlier snapshot
- Support patching VMs located in different time zones, including daylight saving time (DST) changes

• Future Work
- Automated post-patch testing using runtime signatures
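
The time zone and DST requirement listed above can be illustrated with a minimal Python sketch. It is not the PAE implementation; the function name next_change_window, the default 02:00 local start hour, and the example zone names are assumptions made for illustration. The idea shown is only that a change window defined in a VM's local wall-clock time maps to a different UTC instant per zone and per season.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # standard library since Python 3.9


def next_change_window(vm_tz: str, now_utc: datetime, local_hour: int = 2) -> datetime:
    """Return the start of the next change window as a UTC datetime.

    The window is defined in the VM's local wall-clock time (hypothetical
    policy: local_hour o'clock), so the UTC instant shifts when DST starts
    or ends in that zone.
    """
    tz = ZoneInfo(vm_tz)
    local_now = now_utc.astimezone(tz)
    # Today's window start in local wall-clock time.
    candidate = local_now.replace(hour=local_hour, minute=0, second=0, microsecond=0)
    if candidate <= local_now:
        # Window already started or passed today: use tomorrow's.
        # Adding a timedelta keeps the local wall-clock hour unchanged.
        candidate += timedelta(days=1)
    # Note: a local hour that is skipped or repeated on a DST transition day
    # needs an explicit policy decision; this sketch does not handle it specially.
    return candidate.astimezone(timezone.utc)


if __name__ == "__main__":
    now = datetime(2012, 7, 15, 12, 0, tzinfo=timezone.utc)
    for zone in ("America/New_York", "Asia/Kolkata"):
        print(zone, next_change_window(zone, now).isoformat())
```

A scheduler built this way would store the policy in local time and resolve it to UTC only when dispatching, so that a DST change does not silently move the window by an hour.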

NOMS 2012 Application Track 15

Section 6: Conclusion and Future Direction

In this paper, we presented our experience in building an automated patch management system for Cloud using
TEM. The solution centers on three key areas: standardization of the patch management solution, automation of
patch tools, and integration of management tools. Together, these remove human operators from the critical
path of the patch management workflow and involve them only to resolve exceptions and complicated
problems that are hard to automate. The initial version of our solution is geared toward standalone Cloud
workloads. For more complex application setups, more advanced policies are needed to capture application
dependencies and better orchestrate the patching of such applications. Additionally, we are working toward an
automated solution that performs application/OS health checking by collecting and comparing runtime signatures of
workloads.
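
As a rough illustration of the health-checking direction mentioned above, the following Python sketch compares a signature taken before patching with one taken afterwards. The paper does not define what a runtime signature contains; the RuntimeSignature fields (running process names and listening ports) and the rule that any missing process or port counts as unhealthy are assumptions chosen only to make the comparison concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuntimeSignature:
    """Hypothetical runtime signature of a workload (fields are assumptions)."""
    processes: frozenset[str]
    listening_ports: frozenset[int]


def diff_signatures(before: RuntimeSignature, after: RuntimeSignature) -> dict[str, set]:
    """Report what disappeared and what appeared between two signatures."""
    return {
        "missing_processes": set(before.processes - after.processes),
        "missing_ports": set(before.listening_ports - after.listening_ports),
        "new_processes": set(after.processes - before.processes),
        "new_ports": set(after.listening_ports - before.listening_ports),
    }


def is_healthy(before: RuntimeSignature, after: RuntimeSignature) -> bool:
    """Assumed rule: healthy if nothing that ran before the patch is missing."""
    d = diff_signatures(before, after)
    return not d["missing_processes"] and not d["missing_ports"]


if __name__ == "__main__":
    pre = RuntimeSignature(frozenset({"sshd", "httpd"}), frozenset({22, 80}))
    post = RuntimeSignature(frozenset({"sshd"}), frozenset({22}))
    print(is_healthy(pre, post), diff_signatures(pre, post))
```

In such a scheme, a failed comparison would be raised as an exception for a human operator or used to trigger a revert to the pre-patch snapshot, rather than being resolved automatically.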

[References]
[Lowell’04] D. E. Lowell, Y. Saito, and E. J. Samberg. Devirtualizable virtual machines enabling general single-node, online maintenance. In
Proceedings of the 11th International Conference on Architectural Support for Programming Languages and Operating Systems. Boston, MA,
USA, 2004.
[Potter’05] S. Potter and J. Nieh. Reducing downtime due to system maintenance and upgrades. In Proceedings of the 19th
Large Installation System Administration Conference. San Diego, CA, 2005.
[Zhou’10] W. Zhou, P. Ning, X. Zhang, G. Ammons, R. Wang, V. Bala. Always up-to-date: scalable offline patching of VM images in a compute
Cloud. In Proceedings of the 26th Annual Computer Security Applications Conference. Austin, TX, 2010.
[Tewksbury’01] L. Tewksbury, L. Moser, and M. Melliar-Smith. Live upgrades of CORBA applications using object replication. In International
Conference on Software Maintenance. Florence, Italy, Nov, 2001.
[Segal’89] M. E. Segal and O. Frieder. Dynamically updating distributed software: supporting change in uncertain and mistrustful environments.
In IEEE Conference on Software Maintenance, Oct, 1989.
[Choi’09] Online application upgrade using edition-based redefinition. In ACM Workshop on Hot Topics in Software Upgrades, Oct, 2009.
[Crameri’07] O. Crameri, N. Knezevic, D. Kostic, R. Bianchini, and W. Zwaenepoel. Staged deployment in Mirage: an integrated software
upgrade testing and distribution system. In Symposium on Operating Systems Principles, Oct, 2007.
[Dumitras’10] T. Dumitras, P. Narasimhan, E. Tilevich. To Upgrade or Not to Upgrade: Impact of Online Upgrades across Multiple
Administrative Domains. Onward, Oct, 2010.
[Arnold’09] J. Arnold, M. F. Kaashoek. Ksplice: Automatic Rebootless Kernel Updates. In EuroSys, 2009.
[WSUS’10] Windows Server Update Services (WSUS), accessed August, 2011. http://technet.microsoft.com/en-us/windowsserver/bb332157
[RHN’10] Red Hat Network (RHN), accessed August, 2011. https://www.redhat.com/red_hat_network/
[TEM’11] IBM Tivoli Endpoint Manager, accessed August, 2011. https://www-01.ibm.com/software/tivoli/solutions/endpoint/?s_pkg=bfwm
[VMwareUpdate’11] VMware Update Manager, accessed August, 2011. http://www.vmware.com/products/update-manager/overview.html

