
Determine Maintenance Strategy

Determine Maintenance Strategy identifies and analyzes maintenance needs by identifying risks to business continuity from system malfunctions. Key risks come from natural events, human threats, and technical issues that can cause hardware or software failures. A risk analysis should identify threats, their impacts, and probabilities to quantify risks and prioritize mitigation strategies. This helps develop business recovery plans to maintain operations if critical systems experience failures. Preventative maintenance is recommended to minimize downtime impacts on business functions and customer service.

Uploaded by

Asib Kassaye

Unit One
Identify and Analyze Maintenance Needs
Introduction:

Why is maintenance needed?

Nowadays it is difficult to run any business without the help of ICT. People use IT
components in one way or another, and everyone from small enterprises to large
companies relies on this technology to improve their operations. However, the
components used in these business environments do not always work as intended.
Hardware can fail for different reasons, such as age, power problems, improper
handling, or malicious users; software can become corrupted; and data can be lost or
corrupted. The problem is that when an ICT component fails, it can affect the normal
functioning of the whole business. It is therefore worthwhile to incorporate a
maintenance plan alongside the other business plans, so that the business can return
to its normal condition after a failure occurs. Depending on the type of business and
people's preferences, there are different maintenance strategies. Some people wait
until a failure occurs and react immediately once the fault is identified. Others take
preventive action in order to avoid faults. The choice of method is up to the business,
but as an IT professional it is recommended to prevent failure.

Risks to Business Continuity Due to System Malfunction

Identifying Likely Areas of Risk

A risk analysis program entails the identification of the most likely threats to business
continuity and deciding which areas of a company are most susceptible to these
threats. The consequences of failure once these threats are realized are then
considered in Business Impact forums.

The principal objective in emergency recovery planning is to maintain continuity if
critical systems experience any level of failure. Risk analysis is the first stage in
constructing both recovery and downtime coverage plans, and every area of a
company needs to be assessed to determine the potential risk and impact of
perceivable threats.

Threats to Continuity
Possible threats to business continuity can be external or internal and can be
natural, technical or human related. Even though it can be difficult to determine the
exact nature of potential failures, it is important that risks be assessed and if possible
quantified.

Consideration should include the geography of the business location including the
proximity of rivers, landslide areas, power stations, airports, highways that may carry
hazardous waste, and potential terrorist targets or accident zones. The history of the
local area should be investigated to ascertain the level and regularity of natural
disasters. Accessibility is another factor with security being an aspect affecting the
likelihood of any attack on premises or infrastructure.

The track record of any utilities used should also be factored in, since older power
stations are more susceptible to failure and therefore more likely to be responsible for
downtime.

Possible Threats to Business Continuity

Natural
• Flooding
• Fire
• Snow storms
• High winds
• Hurricanes and typhoons
• Tornadoes
• Landslides
• Seismic activity
• Epidemics

Technical
• Power failure or fluctuation
• Heating, ventilation and air conditioning failure
• Malfunction or failure of hardware
• System software bugs
• Application software failure
• Communications failure
• Gas leaks
• Chemical spill
• Nuclear accident

Human
• Hacking
• Vandalism
• Sabotage
• Burglary
• Staff on strike
• Industrial action
• Supplier disruption
• Partner company down
• Terrorism
• Civil disorder
• War
• Explosion
• Bomb threat
• Biological contamination
• Hazardous waste spillage
• Radiation
• Embezzlement or extortion
• Vehicular accident

Expanding the Scope of the Analysis


A comprehensive analysis of risk to business continuity can also include the internal
structure of the organization. Impact on the various departments and services of
perceived threats can be ascertained and included in the assessment.

Varying levels of automation and the amount of technology used will result in varying
susceptibility to these threats. Existing backup systems and services need to be
included when deciding on the final level of risk for each area.

Communications is an area that often needs specialist analysis due to the nature and
sophistication of the technology. Telecommunications companies can assist with
telecom risk analysis and the evaluation of telecom recovery options.

Quantifying Risk Analysis

It's a worthwhile exercise to quantify the various threats in terms of overall impact.
There is an array of methods used, plus the option for professional help, but a simple
analysis could involve a combination of an impact level and probability ratings.

A scale of 1 to 4 could be applied to impact assessment such as:

1: Minor impact with disruption up to 2 hours. This would cover the more usual
threats such as power outages and internal application failures.
2: Disruptive impact up to 8 hours. Hardware failures and malicious damage would
usually fall into this category.
3: Serious outage up to 2 days. Cut communications or staff disputes may be
involved at this level.
4: Major outage over 2 days. Natural disasters such as flooding and fire are the
most likely causes of extended outages.
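
The scale above can be expressed as a small lookup. The following is a minimal sketch, assuming outage duration is measured in hours and that "2 days" means 48 hours; the function name impact_level is illustrative and not part of the original material.

```python
def impact_level(outage_hours: float) -> int:
    """Map an outage duration in hours to the 1-4 impact scale."""
    if outage_hours < 0:
        raise ValueError("outage duration cannot be negative")
    if outage_hours <= 2:
        return 1  # minor: disruption up to 2 hours
    if outage_hours <= 8:
        return 2  # disruptive: up to 8 hours
    if outage_hours <= 48:
        return 3  # serious: up to 2 days (assumed to mean 48 hours)
    return 4      # major: over 2 days


for hours in (1, 6, 30, 72):
    print(f"{hours} h outage -> impact level {impact_level(hours)}")
```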

It is of course necessary to apply the scale uniformly and to disregard compound
threats, such as flooding leading to a landslide; rate each threat individually.

A probability value then needs to be assigned to each threat, from 1 for low to 10
for high.

To create a weighted risk rating, multiply the impact value by the probability
factor. For example, if vandalism such as the cabling into the premises being cut has
an impact rating of 2, and the probability is assessed at 7 because it is a high-crime
area, the weighted risk rating is 14.

Using a system such as this, the threats can be ranked, and resources and priorities
applied accordingly.
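
The weighting described above is a single multiplication, so it is easy to tabulate. The sketch below assumes the 1 to 4 impact scale and the 1 to 10 probability scale from the text. Only the vandalism figures (impact 2, probability 7) come from the example above; the other two threats and their ratings are made-up placeholders, and the names Threat and weighted_risk are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threat:
    name: str
    impact: int       # 1 (minor) to 4 (major)
    probability: int  # 1 (low) to 10 (high)


def weighted_risk(threat: Threat) -> int:
    """Weighted risk rating = impact value x probability factor."""
    if not 1 <= threat.impact <= 4:
        raise ValueError("impact must be between 1 and 4")
    if not 1 <= threat.probability <= 10:
        raise ValueError("probability must be between 1 and 10")
    return threat.impact * threat.probability


threats = [
    Threat("Vandalism (cabling cut)", impact=2, probability=7),  # from the text
    Threat("Power outage", impact=1, probability=9),             # placeholder values
    Threat("Flooding", impact=4, probability=2),                 # placeholder values
]

# Highest rating first, so resources go to the biggest risks.
ranked = sorted(threats, key=weighted_risk, reverse=True)
for t in ranked:
    print(f"{t.name}: {weighted_risk(t)}")
```

Sorting by the rating gives a simple priority list; ties would need a further rule, such as ranking the higher impact first.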

Readiness and Disaster Preemption


The risk of disaster affecting business continuity is uncertain and difficult to assess,
but by being thorough and attempting to quantify the threats, a complete and
comprehensive business recovery plan can be developed. It should identify all the
critical areas and functions of the business, rate the risks, assess the subsequent
damages and costs, and make recommendations to protect services and data.

Activity 1

Discuss the following case studies and try to answer the questions that follow.

1. Assume that you run a barbershop in Arbaminch town, where power outages are a
big problem: the power can be out for a day or even two days.
• List all the possible problems that a power outage can cause.
• Calculate the risk to the business due to power outages.
• Try to quantify the amount of money lost, in birr, if the power is out for two
days.
• If the power goes out twice a week, how much money in birr is lost
in one month? In a year?
• If there are 100 barbershops, how much money in birr will be lost in a year?
2. Assume that you provide a service for customers using your laptop
computer. Unfortunately, your computer's hard disk has crashed and you can't
run your business because of the failure.
• What possible losses may your business face?
• Try to quantify the loss in economic value.
• What are the effects of your computer's failure on your customers?
3. How can you prevent internal system threats?

System Architecture and Configuration


Systems Architecture is a generic discipline to handle objects (existing or to be
created) called "systems", in a way that supports reasoning about the structural
properties of these objects.

Systems Architecture is a response to the conceptual and practical difficulties of the
description and the design of complex systems.

Architecture is the design and structure of a computer system (dictionary definition).

The differences between 32-bit vs. 64-bit operating systems

The terms 32-bit and 64-bit refer to the way a computer's processor (also called a
CPU), handles information. The 64-bit version of Windows handles large amounts of
random access memory (RAM) more effectively than a 32-bit system.

So you just bought a fancy new computer, and it’s got a big sticker on it that says “64-
bit!” Have you found yourself wondering why this particular computing buzzword is so
prominently featured on your new hardware, and what exactly it means? Modern
computing has been shifting towards 64-bit for a few years now, and it has saturated
the market to a point where even entry-level computers are equipped with these new,
more powerful processors. Even with the manufacturers pushing the new CPUs, your
computer may not be able to take full advantage of the technology, and getting to that
point may cost you more money in software than it’s worth.

What are bits?

The number of bits in a processor refers to the size of the data types that it handles
and the size of its registers. A 64-bit processor can represent 2^64 distinct
values, including memory addresses, which means that in principle it can address over
four billion times as much memory as a 32-bit processor. The key difference: 32-bit
processors are perfectly capable of handling a limited amount of RAM, and 64-bit
processors are capable of utilizing much more. Of course, in order to achieve this,
your operating system also needs to be designed to take advantage of the greater
access to memory.
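
The "four billion times" figure follows directly from the address widths. As a quick check, assuming each address refers to one byte and ignoring the lower limits that real processors and operating systems impose:

```python
# The number of distinct addresses an n-bit value can represent is 2**n.
addresses_32 = 2 ** 32
addresses_64 = 2 ** 64

GIB = 2 ** 30  # bytes in one gibibyte

print(f"32-bit: {addresses_32:,} addresses = {addresses_32 // GIB} GiB")
print(f"64-bit: {addresses_64:,} addresses = {addresses_64 // GIB:,} GiB")
print(f"ratio : {addresses_64 // addresses_32:,} times larger")
```

The 4 GiB result for 32 bits is where the 4 GB threshold quoted elsewhere in this unit comes from.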

How many bits?

As a general rule, if you have under 4 GB of RAM in your computer, you don’t need a
64-bit CPU, but if you have 4 GB or more, you do. While many users may find that a
32-bit processor provides them with enough performance and memory access,
applications that tend to use large amounts of memory may show vast improvements
with the upgraded processor. Image and video editing software, 3D rendering utilities,
and video games will make better use of a 64-bit architecture and operating system,
especially if the machine has 8 or even 16 GB of RAM that can be divided among the
applications that need it.

Because 64-bit processors remain backward compatible, it's possible to run 32-bit
software and operating systems on a machine with a 64-bit processor. The opposite
isn't true, however: 32-bit processors cannot run software designed with 64-bit
architecture in mind. This means that if you want to take full advantage of your new
processor you also need a new operating system; otherwise you won't experience any
marked benefits over the 32-bit version of your hardware.

Operating System Differences

With an increase in the availability of 64-bit processors and larger capacities of RAM,
Microsoft and Apple both have begun to develop and release upgraded versions of their
operating systems that are designed to take full advantage of the new technology. In
the case of Microsoft Windows, the basic versions of the operating systems put
software limitations on the amount of RAM that can be used by applications, but even
in the ultimate and professional version of the operating system, 4 GB is the
maximum usable memory the 32-bit version can handle. While a 64-bit operating
system can increase the capabilities of a processor drastically, the real jump in power
comes from software designed with this architecture in mind.

Software and Drivers

Applications with high performance demands already take advantage of the increase in
available memory, with companies releasing 64-bit versions of their programs. This is
especially useful on programs that can store a lot of information for immediate access,
like image editing and software that opens multiple large files at the same time.

Video games are also uniquely equipped to take advantage of 64-bit processing and
the increased memory that comes with it. Being able to handle more computations at
once means more spaceships on screen without lagging and smoother performance
from your graphics card, which doesn’t have to share memory with other processes
anymore.

Most software is backwards compatible, allowing you to run applications that are 32-
bit in a 64-bit environment without any extra work or issues. Virus protection
software and drivers tend to be the exception to this rule, with hardware mostly
requiring the proper version be installed in order to function correctly.

Here are answers to some common questions about the 32-bit and 64-bit
versions of Windows.

How can I tell if my computer is running a 32-bit or a 64-bit version of Windows?

To see if your computer is running 32-bit or 64-bit Windows, do the following:

1. Click to open System information

2. Under System, you can view the system type.
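
The same check can be scripted. This is a minimal sketch using only the Python standard library. Note that it reports the bitness of the running Python interpreter, which can be 32-bit even on a 64-bit version of Windows, so the System information panel remains the authoritative answer.

```python
import platform
import struct
import sys

# Machine type reported by the operating system, e.g. "AMD64" or "x86_64".
# The exact string varies by platform, so treat it as informational only.
print("Machine type      :", platform.machine())

# Pointer size of the running Python interpreter, in bits (32 or 64).
interpreter_bits = struct.calcsize("P") * 8
print("Interpreter bits  :", interpreter_bits)

# sys.maxsize exceeds 2**32 only on a 64-bit interpreter.
is_64bit_interpreter = sys.maxsize > 2 ** 32
print("64-bit interpreter:", is_64bit_interpreter)
```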

Can my computer run a 64-bit version of Windows?

To run a 64-bit version of Windows, your computer must have a 64-bit-capable
processor. If you are unsure whether your processor is 64-bit-capable, do the
following:

1. Click to open Performance Information and Tools.

2. Click View and print detailed performance and system information.

3. In the System section, you can see what type of operating system you're
currently running under System type, and whether or not you can run a 64-bit
version of Windows under 64-bit capable. (If your computer is already running
a 64-bit version of Windows, you won't see the 64-bit capable listing.)

Can I upgrade from a 32-bit version of Windows to a 64-bit version of Windows?

No. If you are currently running a 32-bit version of Windows, you can only perform an
upgrade to another 32-bit version of Windows. Similarly, if you are running a 64-bit
version of Windows, you can only perform an upgrade to another 64-bit version of
Windows.

If you want to move from a 32-bit version of Windows to a 64-bit version of Windows,
you'll need to back up your files and then perform a custom installation of the 64-bit
version of Windows.

Can I run 32-bit programs on a 64-bit computer?

Most programs designed for a computer running a 32-bit version of Windows will work
on a computer running 64-bit versions of Windows. Notable exceptions are many
antivirus programs, and some hardware drivers.
Drivers designed for 32-bit versions of Windows do not work on computers running a
64-bit version of Windows. If you're trying to install a printer or other device that only
has 32-bit drivers available, it won't work correctly on a 64-bit version of Windows.
For information about updating drivers and troubleshooting issues with device drivers
for 64-bit versions of Windows, contact the manufacturer of the device or program.

Would I benefit from using a 64-bit computer?

The benefits are most apparent when you have a large amount of random access
memory (RAM) installed on your computer, typically 4 GB of RAM or more. In such
cases, because a 64-bit operating system can handle large amounts of memory more
efficiently than a 32-bit operating system can, a 64-bit system can be more responsive
when running several programs at the same time and switching between them
frequently.

If I'm running a 64-bit version of Windows, do I need 64-bit drivers for my devices?

Yes. All hardware devices need 64-bit drivers to work on a 64-bit version of Windows.
Drivers designed for 32-bit versions of Windows don't work on computers running 64-
bit versions of Windows.

Architecture Evaluation and Review Practices


You likely have made it to where you are by paying your dues: starting out writing
maintenance code, moving on to some green-field development, eventually leading
successful development projects—and, usually, a few not so successful ones—before
becoming an architect. After several years, you might think that you have seen it all:
"Bring on new technologies, stormy political waters, and ugly legacy systems to
contend with!"

Confidence is an important trait for a technical leader, but it should be accompanied
by an ongoing willingness to evolve your skills and evaluate your work. Experienced
architects know that they are going to miss things. And the earlier you can detect
a problem with your architecture, the better off the project will be, because the
longer a fault goes undetected, the costlier it will be to correct. If you have
indeed "seen it all," you know that architecture evaluations are your best friends.

Risk Mitigation

In recent years, many organizations have introduced architecture evaluation as a
critical component of the software-development life cycle. The objective is to identify
potential issues with a proposed architecture, prior to the construction phase, to
determine its architectural feasibility and to evaluate its ability to meet its quality
requirements. So, before you throw up your hands thinking that this is yet another
layer of process with the potential to slow you down, take time to understand the
reasoning behind it. This is all about risk mitigation. It's good for your organization,
and it's good for you as an architect.

Imagine that you are an investigative journalist. For each assignment, your job is to
research a topic deeply to uncover the hidden facts, and to report the story in such a
way that it provides context to your readers. How does what you uncovered have the
potential to affect their daily lives? Unlike traditional analytical journalism, which
simply reports a story from the data that is available, investigative journalism
attempts to determine if what has been presented is, in fact, reality. Architecture
evaluation shares that objective. The purpose of the evaluation is not simply to review
and communicate the candidate-architecture specification to the stakeholders. The
objective is to review and evaluate the architecture, assess its ability to meet quality
requirements, detect design errors early in the software-development life cycle (SDLC),
and identify potential risks to the project. In other words, the objective is to determine
if the reality of the specification measures up to its claims.

Like investigative journalism, architecture evaluation is based on the old Journalism
101 fundamentals: who, what, when, where, why, and how. Whether you are
preparing to have one of your candidate architectures reviewed or you are conducting
an evaluation yourself, these questions address the major components of the process.
Throughout your career, you will be exposed to specific methods of architecture
evaluation that have emerged in this important domain. While each method has
its own flavor, they all share key concepts that are relevant in any context.

Five Ws and an H: An Evaluation Toolset

The following sections examine an approach to software evaluation and review,
organized by each of the fundamental journalism questions.

Why?

Why should an organization review and evaluate software architecture? The bottom
line is that architecture review produces better architectures—resulting in the delivery
of better systems. Too often, systems are released with performance issues, security
risks, and availability problems as a result of inappropriate architectures. The
architectures were defined early in the project life cycle, but the resulting flaws were
discovered much later. They were exposed when the project was affected most
negatively by change, when downstream artifacts were too costly to overhaul.

The most significant benefit of evaluation is to reassure stakeholders that the
candidate architecture is capable of supporting the current and future business
objectives; specifically, it can meet its functional and nonfunctional requirements. The
quality attributes of a system—such as performance, availability, extensibility, and
security—are a direct result of its architecture; therefore, quality cannot be introduced
easily to your system late in the game. An evaluation of the architecture while it is still
a candidate specification can reduce project risk greatly.

There are also some positive side effects of evaluation. First, the process necessitates
the unambiguous articulation of the system's quality requirements. If the
requirements are too vague to evaluate an architecture against, they must be
elaborated upon. Poorly specified requirements result in hit-or-miss architectures.
Evaluation also forces you to document the architecture clearly, so that it can be
reviewed. Furthermore, as you participate in regular evaluations of your work, you
learn to anticipate the questions that will be asked and the typical criteria against
which your work will be measured. Over time, this process promotes stronger
architectural skills.

Going further, an investigative journalist would ask why an organization wouldn't
conduct software evaluations and reviews. A common response would be concern over
the cost of the effort. It should be noted that, as with any process, evaluations should
be right-sized for the target effort. Other reasons for not conducting architecture
evaluations that you might have to overcome include a fear of exposing limitations in
skill or experience, or reluctance to provide a client with visibility to the work.

What?

What is a software architecture evaluation and review? Basically, it is a process by
which conclusions can be drawn about the suitability of an architecture. Architectural
decisions are evaluated to determine how they enable or restrict the ability of a system
to meet its architecturally significant requirements.

The objectives for a review are based upon stakeholder concerns and focus on specific
aspects of the architecture. Objectives will vary from project to project, according to
each system's specific requirements, but there are a few general categories under
which most tend to fall. Typically, stakeholders want to ensure the quality and
suitability of the architecture, identify areas in which improvement is required, open a
dialogue between decision makers to address areas of risk, and negotiate any
necessary trade-offs.

What are the outputs of an architectural evaluation and review? The primary output is
a comprehensive report that describes the evaluation-and-review findings. This
document need only be as formal as required by the project, but it should serve as a
concise summary of the assessment that can be communicated to the project team, as
well as the stakeholders. The report should include the scope of the review,
evaluation-and-review objectives, architecturally significant requirements list, findings
and recommendations, and an action plan.

What is the scope of an architecture evaluation and review? The scope describes the
boundaries of a specific instance of a review. For example, the architecture of the
entire system can be evaluated, or only part of the system. A review can evaluate the
architecture against all of the system's quality requirements, or only the most critical
ones. Discover the appropriate scope by prioritizing the goals of the evaluation, based
on its defined objectives.

What exactly should be reviewed? Based on the defined objectives and scope, create
a list of the specific criteria against which the architecture will be measured. The list
might include system-wide properties, significant functional requirements to deliver,
and general attributes of quality architectures. The goal is to review and assess how
each item on the list is affected by the architectural decisions that are made.

For example, performance is a quality objective that ends up on most evaluation
criteria lists. Working from a typical business requirement, the architecture could be
expected to execute predictably within its required performance profile. To actually
evaluate the architecture, however, the performance criteria must be stated explicitly.
An example could be the architecture's ability to deliver 3,000 lookup requests and
4,000 transactions within a four-hour period, with a peak load of 15 percent of the
transactions taking place in a 45-minute window.
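
Turning a criterion like this into rates makes it easier to test against. The sketch below simply works through the figures quoted above; the variable names are illustrative, and it assumes the 15 percent applies to the 4,000 transactions only and that "peak" refers to the 45-minute window.

```python
LOOKUPS = 3_000
TRANSACTIONS = 4_000
WINDOW_MINUTES = 4 * 60  # the four-hour period
PEAK_SHARE = 0.15        # 15 percent of the transactions
PEAK_MINUTES = 45

# Average load over the whole four-hour period.
avg_requests_per_min = (LOOKUPS + TRANSACTIONS) / WINDOW_MINUTES
avg_transactions_per_min = TRANSACTIONS / WINDOW_MINUTES

# Load inside the 45-minute window.
peak_transactions = TRANSACTIONS * PEAK_SHARE
peak_transactions_per_min = peak_transactions / PEAK_MINUTES

print(f"Average requests/min     : {avg_requests_per_min:.2f}")
print(f"Average transactions/min : {avg_transactions_per_min:.2f}")
print(f"Transactions in window   : {peak_transactions:.0f}")
print(f"Window transactions/min  : {peak_transactions_per_min:.2f}")
```

Working the numbers is part of making a criterion explicit. With the figures as quoted, the 45-minute window comes out at a lower per-minute transaction rate than the four-hour average, so a reviewer would want to confirm what "peak" is meant to describe before testing against it.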

Reliability, security, availability, extensibility, manageability, and portability are all
quality attributes that can be considered in an architecture evaluation and review.
Keep in mind the scope and objectives of the evaluation, to keep the list manageable
and useful to your project.

A true investigative approach, however, takes time to ask, "What criteria have been
excluded, and why?" Are there political agendas at stake that selectively ignore aspects
of the architecture? Have software and other technologies been mandated that
constrain the architecture's ability to meet its objectives? While some of these
scenarios cannot be avoided in the real business world, it is always appropriate for the
architect and the reviewers to acknowledge any limitations of the architecture, even if
they cannot be removed.

Who?

Who participates in a software architecture evaluation and review? The objective of the
selection process is to ensure that people with the right skills and relevance to the
project are assigned to support the effort effectively, without creating a crowd that is
too large to be efficient. Ideally, there should be active representation from three
constituencies: an evaluation team, project stakeholders, and project practitioners.

The evaluation team conducts the actual evaluation and documents all findings. In
large organizations, an evaluation team often comprises practitioners who rotate
through the team in between other projects. Staffing the evaluation team with
practitioners from the target project should be avoided, if possible, to maintain the
highest degree of objectivity. For very small projects, however, self-assessments and
peer reviews are completely acceptable. It is critical that members of the evaluation
team have respect and credibility as architects, so that their conclusions will carry
weight with the project representatives and stakeholders.

Stakeholders are the people who have specific architectural concerns and a vested
interest in the resulting system. Most of the architectural requirements were specified
by these stakeholders, so their participation in the evaluation is critical.

System architects and component designers are the key project representatives and
are responsible for communicating the architecture and presenting their motivations
for design decisions. Other project representatives to include are project and program
managers, developers, system administrators, and component vendors.

The follow-up step for an investigative approach is to ask, "Who is missing from the
participant list?" What stakeholders or project representatives intentionally were not
included? Occasionally, practitioners and stakeholders are excluded because of past
experiences. Perhaps they were not supportive of a previous evaluation effort—not
dedicating enough time, not taking the evaluation as seriously as they should have, or
exhibiting defensive or contentious behavior. Part of the evaluation process is coaching
the participants. If someone is important to an evaluation for the knowledge that they
have or the requirements that they represent, it is worth the effort to try to influence
their behavior, so that they can contribute to the process.

When?

When should an architecture evaluation and review take place? If only one evaluation
can be performed, it should ideally take place as early in the life cycle as is
reasonable and possible. Generally, you want to conduct the evaluation when the
architecture is
specified, but before anything has been implemented. The goal is to identify any areas
of concern as early as possible, while they are still relatively easy and cheap to correct.

That being said, an evaluation and review can be conducted at any stage in the life
cycle. For projects using an iterative development approach, evaluation can take place
within each iteration—whenever architectural decisions have been made. Evaluations
also can be conducted on legacy systems, to assess their ability to support future
business objectives.

Your investigative instincts should be getting sharper by now. How can we take the
"when" question a step further? Beware of stakeholders or project representatives
balking at the timing of an evaluation. The reasons could be completely valid; maybe
they are unavailable, or they truly feel that the timing is inappropriate. Digging a little
deeper might reveal project issues. The architecture team might be struggling. They
might not see the evaluation as their chance to get valuable input and advice.
Stakeholders might not be ready and willing to negotiate any conflicting requirements.
Take the time to uncover the true reasons behind any postponement attempts. You
might find a critical risk hidden behind that reluctance.

How?

How is an architecture evaluation and review performed? Prior to the review, you
should gather inputs that describe the architecture and explain the rationale behind
the architectural decisions that are made. Examples of typically selected inputs are
the architecturally significant requirements, an architectural description or software
architecture document, an architectural-decisions document, and an architectural
proof of concept.

The primary activity of the evaluation-and-review process is the assessment of the
architecture. A proven technique involves the use of scenarios, which allow the quality
attributes of the architecture to be evaluated in specific contexts. Walking through the
steps of a scenario provides you with the opportunity to describe how an architecture
will respond to specific demands that are placed upon it. If you want to assess how
easily a system that is built upon the candidate architecture could be modified, you
could create a scenario that describes a set of specific changes to implement in the
system. You then could analyze the architecture, looking for modifiability tactics such
as semantic coherence and generalized modules. For small-scale evaluations not
requiring such a detailed technique, a simple questionnaire or checklist could suffice.

The final step of the evaluation-and-review process is to document the findings, and
communicate them to the project team and stakeholders. When architectural concerns
or deficiencies are exposed, it is critical to provide recommendations for improvement
that are actionable. The whole point of the investigative approach is to uncover issues
that otherwise might have been overlooked. If recommendations are too generic to be
implemented, the evaluation cannot contribute much to the success of the project.

Where?

After the review—where do you go from here? When the evaluation report is complete,
you typically are given an opportunity to respond to the findings and
recommendations. The report then is forwarded to the stakeholders for use in
planning the next steps for the project. Sometimes, an evaluation will identify the need
for trade-offs. For example, if the architecture cannot support a specific performance
requirement, stakeholders must determine if the benefit of strengthening the
architecture to achieve that requirement is worth the cost. Following an evaluation,
the architectural decisions should be updated, requirements refined and prioritized,
and the project adjusted as necessary.

While each evaluation produces different results, the goal is always the same: to
produce a better architecture. For you, the architect: Consider an evaluation of your
work as a way to produce improved specifications by tapping into the experiences of
veteran architects. See each evaluation as a valuable learning opportunity. Your
projects will benefit, your organization will benefit, and so, too, will your career.

Critical-Thinking Questions

 A validated architecture does not guarantee the quality of the resulting system.
How can downstream design decisions undermine the architecture's ability to
meet its quality objectives?

 How can the introduction of evaluations help your organization adopt a
standard method of architectural description?

Unit Two

Develop Service Level Agreement


Unit Three

Formulate Maintenance Strategy

Introduction: In industry, there is a constant demand for increased productivity in
order to stay competitive. One important factor in increasing equipment
utilization is effective maintenance of production assets. Within the process industry,
a strategic view of maintenance activities is common, and most companies regard
maintenance as a profit centre. Meanwhile, the discrete-unit manufacturing industry
still in many cases views maintenance as a cost driver. However, with the spread of
Toyota-inspired production concepts, the manufacturing industry is beginning to view
maintenance as a strategic asset. Even so, many companies have no formulated
maintenance strategy. The main purpose of the research presented in this unit has
been to develop a work process for formulating effective maintenance strategies for
enterprises in the manufacturing industry.

Maintenance Strategy Formulation 

Purpose

You can use this business process to do the following:

 Formulate maintenance strategies that conform to company objectives
 Review, confirm, and update requirements and assumptions

1. Initiate process

You initiate the identification of maintenance processes.

This process could be time-based (yearly, quarterly), campaign-based (field, reservoir),
or executed on an as-needed basis.

2. Review maintenance assumptions

In this process step, you:

 Review all assumptions that have led to the formulation of strategies and the
applicability of any policies to the particular asset
 Assess whether they have any significant impact with respect to well/facility
maintenance and intervention
 Identify assumptions to be reviewed and any departures from policy
 Formulate assumptions into change proposals for the asset reference plan

3. Determine maintenance strategies

You determine maintenance strategies in line with company objectives and the
requirements of the asset holder, as specified in the asset reference plan. That plan
includes any changes arising from the Review maintenance assumptions step (see
above).

In addition, you do the following:

 Review maintenance and intervention strategies for other strategies on the
asset and ensure that they are consistent and mutually supportive
 Contribute to confirming or updating the asset reference plan

4. Approve assessment

Well or Facility Maintenance Requirements Determination 

Purpose

You can use this business process to do the following:

 Undertake fault modes, effects and criticality analysis on equipment
 Identify activities required to safeguard technical integrity and to optimize
cash flow
 Develop, evaluate, and select maintenance options
 Determine maintenance requirements for all items of equipment and enter
these requirements into the work packages

Examining Maintenance options

Reliability Centered Maintenance (RCM) analysis provides a structured framework for
analyzing the functions and potential failure modes of a physical asset (such as a
computer, a manufacturing production line, etc.) in order to develop a scheduled
maintenance plan that will provide an acceptable level of operability, with an
acceptable level of risk, in an efficient and cost-effective manner.

RCM techniques often utilize a logic diagram approach for evaluating the potential
effects of failure and selecting the appropriate maintenance strategy. As an example,
Figure 1 shows a portion of one of the decision-making flowcharts presented in the
SAE JA1012 document, A Guide to the Reliability-Centered Maintenance (RCM)
Standard. Similar diagrams are provided in other published RCM guidelines. (Some of
the major RCM publications are listed in the References section of this article.)

In addition to, or instead of, a logic diagram approach, the RCM analyst may wish to
use cost- and availability-based comparisons of potential maintenance strategies when
selecting and assigning maintenance tasks. This article provides an overview of these
comparison techniques along with a couple of demonstration examples.

Types of Maintenance Strategies to Consider

Although there is variation among practitioners regarding the terminology used to
describe the available maintenance techniques, in general, the RCM analyst may
consider any of the following maintenance strategies to address a potential failure
mechanism:

 Run-to-Failure - fix the equipment when it fails but do not perform any
scheduled maintenance.
 Scheduled Inspections
     Failure Finding Inspections - inspect the equipment on a scheduled
    basis to discover hidden failures. If the equipment is found to be failed,
    initiate corrective maintenance.
     On-Condition Inspections - inspect the equipment on a scheduled or
    ongoing basis to discover conditions indicating that a failure is about to
    occur. If the equipment is found to be about to fail, initiate preventive
    maintenance.
 Scheduled Preventive Maintenance
     Service - perform lubrication or other servicing actions on a scheduled
    basis.
     Repair - repair or overhaul the equipment on a scheduled basis.
     Replace - replace the equipment on a scheduled basis.
 Design Change - redesign the equipment, select different equipment or make
some other one-time change to improve the reliability/availability of the
equipment.

Using Simulation to Compare Maintenance Strategies

Given certain information about how the equipment will be operated, the probability of
occurrence for the failure mode and the maintenance characteristics, the analyst can
use simulation to estimate the cost and average availability that can be expected over
the operational life of the equipment when a particular maintenance strategy is
employed. The calculations can then be used to compare available maintenance
strategies so that the analyst can select the most cost-effective strategy that provides
an acceptable level of performance.

Run-to-Failure (Corrective Maintenance Only)

To estimate the cost and average availability that can be expected for a run-to-failure
(corrective maintenance only) maintenance strategy, the analyst must provide the
following information:

 The amount of time that the equipment will operate.
 The probability density function (pdf) that describes the probability that the
equipment will fail due to a particular failure cause.
 An indication of whether the failure is detectable during normal operation.
 The amount of time that the equipment is expected to be down each time
corrective maintenance is required. This can include the time to perform the
maintenance as well as any logistical delays (i.e., waiting for labor and/or
materials required).
 The cost each time corrective maintenance is required, including the
downtime, labor, materials and other costs.
 The degree to which the equipment will be restored by corrective maintenance
(e.g., "as good as new," "as bad as old," etc.).

The analyst can then simulate the operation of the equipment for the specified
operating time, given the specified reliability/maintainability characteristics, in order
to estimate:

A. The expected number of corrective maintenance actions that will be performed, and
B. The amount of time that the equipment is expected to be operating (uptime)
over the specified time.

These estimates can then be used to calculate the total operating cost, cost per
uptime and average availability, as follows:

Total Operating Cost = Number of CMs × Cost per CM
Cost per Uptime = Total Operating Cost / Uptime
Average Availability = Uptime / Operating Time
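
The simulation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of any particular RCM tool: it assumes a failure that is detected as soon as it occurs, Weibull-distributed times to failure, "as good as new" restoration and a fixed downtime per repair. The function name simulate_run_to_failure and its parameters are hypothetical; the sample inputs are the run-to-failure figures given in Example 1 of this unit.

```python
import random


def simulate_run_to_failure(beta, eta, operating_time, cm_downtime, cm_cost,
                            runs=20_000, seed=1):
    """Monte Carlo estimate for a run-to-failure strategy (illustrative only).

    Assumes Weibull failure times (shape beta, scale eta) and that every
    corrective repair restores the item to "as good as new".
    """
    rng = random.Random(seed)
    total_cms = 0
    total_uptime = 0.0
    for _ in range(runs):
        clock = 0.0
        while clock < operating_time:
            # random.weibullvariate takes (scale, shape), in that order.
            time_to_failure = rng.weibullvariate(eta, beta)
            if clock + time_to_failure >= operating_time:
                # The item survives to the end of the operating period.
                total_uptime += operating_time - clock
                break
            total_uptime += time_to_failure
            total_cms += 1
            clock += time_to_failure + cm_downtime
    cms = total_cms / runs          # expected number of CMs
    uptime = total_uptime / runs    # expected uptime
    total_cost = cms * cm_cost
    return {
        "cms": cms,
        "uptime": uptime,
        "total_cost": total_cost,
        "cost_per_uptime": total_cost / uptime,
        "availability": uptime / operating_time,
    }


# Inputs from Example 1: 120,000 miles per year, beta = 2.3, eta = 72,000 miles,
# 3,500 miles of downtime and $4,650 per corrective replacement.
result = simulate_run_to_failure(2.3, 72_000, 120_000, 3_500, 4_650)
print(result)
```

Because the result is a sample average, it varies slightly with the seed and the number of runs, and it should not be expected to match the output of a commercial tool exactly.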

Scheduled Repair/Replacement

To calculate the cost and availability that can be expected from a maintenance
strategy that involves preventive repair/replacement of the equipment, the following
information is required (in addition to the inputs described previously):

 The time interval at which the preventive maintenance will be performed.
 The amount of time that the equipment is expected to be down each time
preventive maintenance is performed.
 The cost each time preventive maintenance is performed.
 The degree to which the equipment will be restored by preventive maintenance.

With this additional information, simulation can be used to estimate the expected
number of corrective maintenance (CM) and preventive maintenance (PM) actions, along
with the uptime. The total operating cost for this maintenance strategy includes the
cost of all CMs plus the cost of all PMs:

Total Operating Cost = (Number of CMs × Cost per CM) + (Number of PMs × Cost per PM)

Note that the Cost per Uptime and Average Availability calculations are the same,
regardless of task type.

Calculations for Service and Failure Finding tasks are performed in a similar manner
except that the assumptions of the simulation will vary to fit the conditions of the
task. For example, if the failure is undetectable during normal operation and the
equipment is found to be failed during a scheduled Service task, then the simulation
will assume that corrective maintenance will be initiated. Likewise, a Failure Finding
task can initiate corrective action if the equipment is found to be failed but does not
restore the equipment to any degree if it is found to be running.

On-Condition Inspections

On-Condition Inspection tasks (which are designed to monitor the equipment at
scheduled intervals or on an ongoing basis and initiate preventive maintenance only if
a specific condition is detected) require additional information and a more complex
simulation/calculation method. In addition to operating life, probability of failure and
corrective maintenance characteristics, the analyst must describe the characteristics
of the scheduled inspections that will be performed:

 The time interval at which the inspection will be performed.
 The amount of time that the equipment is expected to be down each time an
inspection is performed.
 The cost each time an inspection is performed.
 An indication of when the approaching failure will become detectable during
inspection (which could be described as a percentage of item life or as a fixed
time interval).

For the cases in which the inspection detects that a failure is approaching, the
analysis also requires the downtime, cost and restoration factor associated with the
preventive maintenance that will be initiated.

Simulation of this scenario will return 1) the expected number of corrective
maintenance actions, 2) the expected number of inspections, 3) the expected number
of preventive maintenance actions and 4) the amount of uptime. The total operating
cost then includes the cost of all CMs plus all inspections plus all PMs:

Total Operating Cost = (Number of CMs × Cost per CM) + (Number of Inspections ×
Cost per Inspection) + (Number of PMs × Cost per PM)

This total operating cost is then used to calculate cost per uptime and average
availability as described previously.
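
An on-condition simulation can be sketched as follows. This is a simplified illustration, not the algorithm of any specific tool. It assumes that inspections are scheduled by item age, that an approaching failure becomes detectable once the item has used a fixed fraction of its life, and that both CM and PM restore the item to "as good as new". The function name, the (downtime, cost) pairs and the inspection figures (every 2,000 miles, no downtime, $25 each, detectable at 90% of life) are hypothetical; the Weibull parameters, CM figures and PM figures are borrowed from Example 1 in this unit, with 1 work day taken as 500 miles.

```python
import random


def simulate_on_condition(beta, eta, operating_time, inspect_interval,
                          detect_fraction, cm, insp, pm, runs=5_000, seed=1):
    """Monte Carlo sketch of an on-condition inspection strategy.

    cm, insp and pm are (downtime, cost) pairs. An approaching failure is
    assumed detectable once age >= detect_fraction * life.
    """
    rng = random.Random(seed)
    n_cm = n_insp = n_pm = 0
    total_uptime = 0.0
    for _ in range(runs):
        clock = 0.0
        while clock < operating_time:
            life = rng.weibullvariate(eta, beta)  # (scale, shape)
            age = 0.0
            renewed = False
            while not renewed and clock < operating_time:
                next_inspection = age + inspect_interval
                fails_first = life <= next_inspection
                event_age = life if fails_first else next_inspection
                run = event_age - age
                if clock + run >= operating_time:
                    # Operating period ends before the next event.
                    total_uptime += operating_time - clock
                    clock = operating_time
                    break
                total_uptime += run
                clock += run
                age = event_age
                if fails_first:
                    n_cm += 1
                    clock += cm[0]
                    renewed = True
                else:
                    n_insp += 1
                    clock += insp[0]
                    if age >= detect_fraction * life:
                        # Inspection found the approaching failure.
                        n_pm += 1
                        clock += pm[0]
                        renewed = True
    cms, inspections, pms = n_cm / runs, n_insp / runs, n_pm / runs
    uptime = total_uptime / runs
    total_cost = cms * cm[1] + inspections * insp[1] + pms * pm[1]
    return {"cms": cms, "inspections": inspections, "pms": pms,
            "uptime": uptime, "total_cost": total_cost,
            "cost_per_uptime": total_cost / uptime,
            "availability": uptime / operating_time}


outcome = simulate_on_condition(
    beta=2.3, eta=72_000, operating_time=120_000,
    inspect_interval=2_000, detect_fraction=0.9,
    cm=(3_500, 4_650), insp=(0, 25), pm=(500, 2_050))
print(outcome)
```

With a short inspection interval relative to the detection window, most approaching failures are caught and converted into PMs; lengthening the interval shifts the balance back toward CMs while reducing inspection cost.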

Example 1: Mechanical Component with Wearout

Consider an RCM analysis for a large truck that is intended to operate for 120,000
miles per year. A critical failure mode has been identified for a mechanical component
and reliability analysis indicates that the failure behavior follows a Weibull
distribution with beta = 2.3 and eta = 72,000 miles. Considering logistical factors,
downtime penalties and the actual repair resources, it takes 7 work days (3,500 miles
of lost “production”) and costs $4,650 each time the component must be replaced
when it fails. The component will be “as good as new” after the maintenance action.
The RCM analysis team is considering whether to incorporate a scheduled preventive
replacement task into the maintenance plan. Because there are no additional logistical
delays/costs for a planned replacement, the PM task will take only 1 work day and
cost $2,050.

Using the RCM++ software, the team can first estimate the optimum preventive
replacement time for the component and then simulate the operation of the equipment
for 120,000 miles to estimate the cost and average availability that can be expected in
a year from the two maintenance strategies that are under consideration. The optimum
PM interval is the replacement age t that minimizes the cost per unit of operating
time, which depends on the cost of corrective maintenance (C_CM), the cost of
preventive maintenance (C_PM) and the reliability function R(t):

Cost per unit time(t) = [C_PM × R(t) + C_CM × (1 - R(t))] / (integral of R(s) ds from 0 to t)

For this component, the optimum PM interval is determined to be 60,330.25 miles.
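
The optimum interval can be approximated numerically. The sketch below assumes the standard age-replacement cost rate, [C_PM × R(t) + C_CM × (1 - R(t))] divided by the integral of R from 0 to t, with a Weibull reliability function; it scans a grid of candidate intervals and integrates with the trapezoid rule. The function names, grid step and search limit are hypothetical choices made for illustration, not part of the RCM++ software.

```python
import math


def weibull_reliability(t, beta, eta):
    """Probability that the item survives beyond age t."""
    return math.exp(-((t / eta) ** beta))


def optimum_pm_interval(beta, eta, cost_pm, cost_cm, t_max, step):
    """Scan replacement ages up to t_max and return (best_age, best_cost_rate)."""
    best_age, best_rate = None, math.inf
    integral = 0.0   # running trapezoid estimate of the integral of R
    prev_r = 1.0     # R(0) = 1
    t = 0.0
    while t < t_max:
        t += step
        r = weibull_reliability(t, beta, eta)
        integral += 0.5 * (prev_r + r) * step
        rate = (cost_pm * r + cost_cm * (1.0 - r)) / integral
        if rate < best_rate:
            best_age, best_rate = t, rate
        prev_r = r
    return best_age, best_rate


# Example 1 inputs: beta = 2.3, eta = 72,000 miles, PM $2,050, CM $4,650.
age, rate = optimum_pm_interval(2.3, 72_000, 2_050, 4_650,
                                t_max=200_000, step=10)
print(f"Optimum PM interval: about {age:,.0f} miles")
```

With the Example 1 inputs, the scan settles close to the 60,330 miles quoted above. For a shape parameter below 1 the cost rate keeps falling as the interval grows, so the scan simply returns the largest age it was allowed to try.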

Rounding to 60,000 miles and performing the simulation yields the following results
per vehicle per year:

Run-to-Failure

 1.43 CMs and Uptime = 115,114.20 miles
 Total Operating Cost = $6,626.25
 Cost per Uptime = $0.058 per mile
 Average Availability = 95.93%

Preventive Replacement

 0.98 CMs, 0.79 PMs and Uptime = 116,246.80 miles
 Total Operating Cost = $6,188.95
 Cost per Uptime = $0.053 per mile
 Average Availability = 96.87%

The analysis indicates that the scheduled replacement strategy provides both lower
cost and better availability. Note that the differences between the two strategies will be
even greater when applied to the entire fleet of vehicles over multiple years of
operation.
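
The cost per uptime and average availability figures in this example can be checked directly from the reported totals. The short script below is only an arithmetic check: it takes cost per uptime as total operating cost divided by uptime and average availability as uptime divided by the 120,000-mile operating period. The helper name cost_metrics is hypothetical.

```python
def cost_metrics(total_cost, uptime, operating_time):
    """Return (cost per unit of uptime, average availability)."""
    return total_cost / uptime, uptime / operating_time


OPERATING_MILES = 120_000

# (total operating cost in dollars, uptime in miles) as reported in Example 1.
strategies = {
    "Run-to-Failure": (6_626.25, 115_114.20),
    "Preventive Replacement": (6_188.95, 116_246.80),
}

for name, (cost, uptime) in strategies.items():
    per_mile, availability = cost_metrics(cost, uptime, OPERATING_MILES)
    print(f"{name}: ${per_mile:.3f} per mile, {availability:.2%} availability")
```

Both lines reproduce the rounded values listed above ($0.058 and 95.93%, then $0.053 and 96.87%).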

Example 2: Electrical Component with Infant Mortality

Another critical failure mode has been identified for an electrical component of the
truck described in Example 1. Its failure behavior follows a Weibull distribution with
beta = 0.76 and eta = 100,000 miles. The RCM analysis team is considering a planned
replacement for
this component at 60,000 miles to coincide with the PM for the mechanical
component. For this failure mode, the CM downtime is 4 work days; the CM cost is
$2,800; the PM cost would be $1,200 and there would be no additional PM downtime
because the equipment is already down for the other maintenance task. The analysis
yields the following results:

Run-to-Failure

 1.21 CMs and Uptime = 117,617.74 miles
 Total Operating Cost = $3,374.00
 Cost per Uptime = $0.029 per mile
 Average Availability = 98.02%

Preventive Replacement

 1.39 CMs, 0.88 PMs and Uptime = 117,252.30 miles
 Total Operating Cost = $4,934.80
 Cost per Uptime = $0.042 per mile
 Average Availability = 97.71%

In this case, the analysis indicates that a run-to-failure maintenance strategy will be
more cost-effective and provide better availability. In fact, because the beta parameter
of the failure distribution is less than 1, the failure rate decreases with age, so the
equipment does not experience wearout and there is no optimum preventive replacement
time. The team
could repeat the analysis for other maintenance intervals and would always determine
that run-to-failure is more cost-effective.
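
The role of the beta parameter can be seen by evaluating the Weibull failure (hazard) rate, h(t) = (beta / eta) × (t / eta)^(beta - 1), at a few ages. The snippet below is a small illustration using the parameters of the two examples; the function name weibull_hazard and the chosen ages are hypothetical.

```python
def weibull_hazard(t, beta, eta):
    """Instantaneous failure rate of a Weibull distribution at age t (t > 0)."""
    return (beta / eta) * (t / eta) ** (beta - 1)


AGES = (10_000, 60_000, 120_000)  # miles

components = {
    "Mechanical (beta = 2.3, eta = 72,000)": (2.3, 72_000),
    "Electrical (beta = 0.76, eta = 100,000)": (0.76, 100_000),
}

for label, (beta, eta) in components.items():
    rates = [weibull_hazard(age, beta, eta) for age in AGES]
    trend = "increasing" if rates[0] < rates[-1] else "decreasing"
    print(label, [f"{r:.2e}" for r in rates], trend)
```

For the mechanical component the failure rate rises with age, so replacing an old part with a new one lowers the risk of failure. For the electrical component the rate falls with age, so a scheduled replacement swaps a proven part for one that is more likely to fail early.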

Conclusion

As this topic demonstrates, cost-based comparisons can help RCM analysts select
the most appropriate maintenance strategy for a particular piece of
equipment/failure mode. ReliaSoft's RCM++ software automatically performs the
maintenance task cost calculations described here. This functionality relies on the
same powerful simulation engine available in ReliaSoft's BlockSim software, which can
also be used for maintenance planning and other more complex system reliability,
maintainability and availability analyses.

Preventive Maintenance Schedules


This chapter contains the following topics:

 Understanding PM Schedules
 Understanding PM Cycle Events

Understanding PM Schedules

When you manage the equipment maintenance needs, you define the type and
frequency of each maintenance task for each piece of equipment in the organization.
The PM cycle refers to the sequence of events that make up a maintenance task, from
its definition to its completion. Because most PM tasks are commonly performed at
scheduled intervals, parts of the PM cycle repeat, based on those intervals.

You should be familiar with the following terms and concepts that are related to the
PM cycle.

Service Type

You define service types to describe individual preventive maintenance tasks. You can
define as many service types as you need. You can set up service types to apply to a
particular piece of equipment or a class of equipment. Examples of service types
include:

 250-hour inspection
 Clutch adjustment
 Lubricate ventilation fan
 10,000-hour engine rebuild
 Installing antivirus software
 Cleaning computer parts
 Installing updates and patches

Preventive Maintenance (PM)

A PM refers to one or more service types that are scheduled to be performed for a piece
of equipment. You typically specify that a PM be performed at a predefined point in
time. The point in time can be based on days, date, or when a piece of equipment
accumulates a predefined number of statistical units, such as hours, miles, and so on.
You identify how many units have accumulated for each piece of equipment by
periodically entering equipment meter readings.

Preventive Maintenance Schedule

You create one PM schedule for each piece of equipment for which you want to
perform PMs. The PM schedule defines which service types apply to a piece of
equipment. The PM schedule also defines the service interval for each service type. A
service interval refers to the frequency at which the service types are performed.

You schedule maintenance by periodically updating PM schedule information. When
you update PM schedule information, the system determines which service types are
due to be performed based on meter readings, dates, and other user-defined criteria. If
service types are due to be performed, the system updates the PM status. In addition,
depending on how you set up the system, the system generates a PM work order.
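
The due-date check described above can be illustrated with a small sketch. It is a hypothetical data model, not the logic of any particular maintenance system: each service type records when it was last performed and carries an optional meter interval and an optional calendar interval, and it is treated as due when either interval has elapsed. The class name, field names, dates and meter readings are all invented for the example; the service names come from the list earlier in this unit.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ScheduledService:
    """One service type on a PM schedule (hypothetical structure)."""
    name: str
    last_done_on: date
    last_done_meter: float
    meter_interval: Optional[float] = None  # hours or miles between services
    day_interval: Optional[int] = None      # calendar days between services


def services_due(schedule, meter_reading, today):
    """Return the names of service types that have reached an interval."""
    due = []
    for service in schedule:
        by_meter = (service.meter_interval is not None and
                    meter_reading - service.last_done_meter
                    >= service.meter_interval)
        by_date = (service.day_interval is not None and
                   (today - service.last_done_on).days >= service.day_interval)
        if by_meter or by_date:
            due.append(service.name)
    return due


schedule = [
    ScheduledService("250-hour inspection", date(2024, 1, 10), 1000.0,
                     meter_interval=250),
    ScheduledService("Lubricate ventilation fan", date(2024, 1, 10), 1000.0,
                     day_interval=90),
    ScheduledService("10,000-hour engine rebuild", date(2024, 1, 10), 0.0,
                     meter_interval=10_000),
]

# Meter reading of 1,260 hours entered on 1 March 2024.
print(services_due(schedule, meter_reading=1260.0, today=date(2024, 3, 1)))
# prints ['250-hour inspection']
```

A real system would typically go on to update the PM status and, depending on configuration, raise a work order for each service returned.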
