CAAT ENG02 Condition Monitoring Maintenance Handbook
CAAT (ENG-02)
This document is the property of The Civil Aviation Authority of Thailand. All rights
reserved. No part of this publication may be reproduced, stored in a retrieval system
or transmitted in any form or by any means, without prior permission from The Civil
Aviation Authority of Thailand.
1 INTRODUCTION
1.1 "Airworthiness" which, for the purposes of this publication, is defined as "the
continuing capability of the aircraft to perform in a satisfactory manner the flight
operations for which it was designed", is based on the expectation that flight operations
will be performed with acceptable reliability in respect of flight crew workload; flight
handling characteristics; flight performance/envelope availability; safety margins;
welfare of occupants; punctuality; economics.
1.2 Time has not changed the objectives of airworthiness. What has changed is the size,
complexity and increased performance of aircraft, together with improved design
techniques and a more knowledgeable approach to the control of maintenance.
Confidence in continued airworthiness has long been based on the traditional method
of maintaining safety margins by the prescription of fixed component lives and by
aircraft 'strip-down' policies. The call for changes to the basic philosophy of aircraft
maintenance has been greatly influenced by the present day economic state of the
industry as well as by changes in aircraft design philosophy allied to progress in
engineering technology. These changes have, in turn, resulted in the necessity for the
management and control of expensive engineering activities to take a new and more
effective form.
1.3 It is from this background that a maintenance process known as Condition Monitoring*
has evolved. It is necessary to attempt to correct a misunderstanding which has arisen
about the term Condition Monitoring. Condition Monitoring is not a separate activity but
a complete process which cannot be separated from the complete maintenance
programme. It is not just an identification of a single maintenance action but is a basic
maintenance philosophy.
1.4 Maximum use can be made of the Condition Monitoring process, which includes a
statistical reliability element (see 3.3), when it is applied to aircraft meeting the following
criteria.
NOTE: These safeguards are provided by the provision of either Active Redundancy* or Standby
Redundancy*. In simple terms the safeguards take the form of more than one means of
accomplishing a given function. Systems (or functions within systems) beyond those
necessary for immediate requirements are installed. These are so designed that with
an Active Redundancy philosophy all the redundant Items* are operating simultaneously
and, in simple terms, sharing the load to meet the demand. Thus in the event of failure
of one of the redundant Items, the demand will continue to be met by the remaining
(b) Aircraft for which the initial scheduled maintenance programme has been
specified by a Maintenance Review Board and to which a Maintenance Steering
Group Logic Analysis has been applied.
NOTES: (1) Examples of this class of aircraft are the Boeing 747, Lockheed L-1011,
Douglas DC-10 and Concorde.
(2) For an aircraft type introduced into service by Maintenance Review Board and
Maintenance Steering Group procedures and where Condition Monitoring tasks
are prescribed, a Condition Monitored Maintenance Programme (the
Programme) will have to be established, even for a single aircraft.
1.5 For aircraft not covered by the criteria of 1.4, the statistical reliability element of
Condition Monitoring may, nevertheless, be applied for the purpose of monitoring
system or component performance (but not be prescribed in the Maintenance Schedule
as a primary maintenance process).
NOTE: For a statistical reliability element of a programme to be effectively used, a fleet minimum of
five aircraft is normally necessary, but this can vary dependent upon the aircraft type and
utilization. To date, in Hong Kong, reliability elements of these Programmes have not been
applied to rotorcraft, although there is no fundamental reason why they should not.
2 PRIMARY MAINTENANCE
2.1 The Civil Aviation Authority of Thailand (CAAT) recognizes three primary maintenance
processes. They are Hard Time*, On-Condition* and Condition Monitoring.
See Appendix A. Should fuller details of the Maintenance Steering Group process in respect of a
specific aircraft be required, they would have to be obtained from the regulatory
authority responsible for the initial certification of that aircraft.
2.2.2 On Condition
This also is a preventative process but one in which the Item is inspected or
tested, at specified periods, to an appropriate standard in order to determine
whether it can continue in service (such an inspection or test may reveal a need
for servicing actions). The fundamental purpose of On Condition is to remove
an Item before its failure in service. It is not a philosophy of 'fit until failure'
or 'fit and forget it'.
Condition Monitoring is not a preventative process, having neither Hard Time nor On
Condition elements, but one in which information on Items gained from
operational experience is collected, analysed and interpreted on a continuing
basis as a means of implementing corrective procedures.
2.3 Where a Maintenance Steering Group Logic Analysis has not been applied to a
particular aircraft to establish and allocate the primary maintenance processes for each
Item, the considerations of (a), (b) and (c) will be applied separately to all Items to
determine the acceptability of the primary maintenance process.
(i) Where the failure of the Item has a direct adverse effect on
airworthiness and where evidence indicates that the Item is subject to
wear or deterioration.
(b) On-Condition
Where a failure of an Item does not have a direct adverse effect on operating
safety, and where (a) and (b) are not prescribed and no adverse age reliability
relationship has been identified as the result of analysis of the data arising from
a formalized monitoring procedure or programme.
3.1 Introduction
3.2.3 Maintenance of a particular Item could well be some combination of the three
primary maintenance processes (Hard Time, On-Condition and Condition
Monitoring). There is no hierarchy of the three processes; they are applied to the
various Items according to need and feasibility. Maintenance Schedules which
are based on the Maintenance Steering Group principles will have Hard Time,
On-Condition, or Condition Monitoring specified as the primary maintenance
process for specific systems and sub-systems as well as for individual
Maintenance Significant Items.* Condition Monitoring can, therefore, be the
primary maintenance process prescribed for an Item, in which case it has also
to be used for controlling the availability of those functions which are not
directly controlled by a prescribed On-Condition or Hard Time process; this
control is provided by the statistical reliability element of Condition Monitored
Maintenance. Items for which Hard Time and On-Condition are prescribed may,
however, have the statistical reliability element of Condition Monitored
Maintenance applied, not as a primary maintenance process, but as a form of
Quality Surveillance.*
3.3.1 The assessment of defect/removal/failure rate trend, of age bands at which items
fail, or the probability of survival to a given life are, in most cases, used to
measure the effect or suitability of the primary maintenance processes applied
to Items. The assessment is made by examination of rates of occurrence of
events such as in-flight defects, incidents, delays, use of Redundancy
capability, engine unscheduled shut-downs, air turn backs, etc., which are
reported in accordance with the procedure associated with the
reliability element of Condition Monitored Maintenance.
3.3.3 If the mystery of numbers and the various theories of probability are
discounted, a statistical reliability programme, as an element of Condition
Monitoring, is, in practical terms, the continuous monitoring, recording and
analysing of the functioning and condition of aircraft components and systems.
The results are then measured or compared against established normal
behaviour levels so that the need for corrective action may be assessed and,
where necessary, taken.
3.4.1 A maintenance programme which provides for the application of Hard Time,
On-Condition and Condition Monitoring is known as a Condition Monitored
Maintenance Programme. A Programme has two basic functions. Firstly, by
means of the statistical reliability element, to provide a summary of aircraft
fleet reliability and thus reflect the effectiveness of the way in which
maintenance is being done. Secondly, to provide significant and timely
technical information by which improvement of reliability may be achieved
through changes to the Programme or to the practices for implementing it.
3.4.3 The fundamental factors of a successful Programme are the manner in which
it is organized and the continuous monitoring of it by responsible personnel.
Because of differences in the size and structure of the various airlines, the
organizational side of any Programme is individual to each Operator. Hence,
it is necessary to detail the organisation and responsibilities in the Programme
control documentation.
3.5.1 Every Programme is required to have a controlling body (usually known as the
Reliability Control Committee), which is responsible for the implementation,
decision making and day-to-day running of the Programme. It is essential that
the Reliability Control Committee should ensure that the Programme
establishes not only close co-operation between all relevant departments and
personnel within the Operator's own Organisation, but also liaison with other
appropriate Organizations. Lines of communication are to be defined and fully
understood by all concerned. A typical Organisation and Data Flow Chart is
shown in Appendix B.
3.5.2 The Reliability Control Committee is responsible for, and will have full
authority to take, the necessary actions to implement the objectives and
processes defined in the Programme. It is normal for the Quality Manager or the
Engineering Manager to head the Committee and to be responsible to the
Director for the operation of the Programme.
3.5.3 The formation of the Committee and the titles of members will vary between
Operators. The structure and detailed terms of reference of the Committee and
its individual members will be fully set out in the documentation for each
Programme. The Committee will usually comprise the Quality or Engineering
Manager, the Reliability Engineer or Co-ordinator, the Chief Development
Engineer, and the Chief Production Engineer.
3.5.4 The Committee should meet frequently to review the progress of the
Programme and to discuss and, where necessary, resolve current problems.
The Committee should also ascertain that appropriate action is being taken, not
only in respect of normal running of the Programme, but also in respect of
corrective actions.
3.5.5 Formal review meetings are held with the CAAT at agreed intervals to assess
the effectiveness of the Programme. An additional function of the formal review
meeting is to consider the policy of, and any proposed changes to, the
Programme.
3.6.1 Data (or more realistically, collected information) will vary in type according
to the needs of each Programme. For example, those parts of the Programme
based on data in respect of systems and sub-systems will utilize inputs from
reports by pilots, reports on engine unscheduled shutdowns and also, perhaps,
reports on mechanical delays and cancellations. Those parts of the Programme
based on data in respect of components will generally rely upon inputs from
reports on component unscheduled removals and on workshop reports. Some
of the larger Programmes embrace both 'systems' and 'component' based data
inputs.
3.6.2 The principle behind the data collection process is that the information to be
collected has to be adequate to ensure that any adverse defect rate, trend, or
apparent reduction in failure resistance, is quickly identified for specialized
attention. Some aircraft systems will function acceptably after specific
component or sub-system failures; reports on such failures in such systems
will, nevertheless, act as a source of data which may be used as the basis of action
either to prevent the recurrence of such failures, or to control the failure rates.
3.6.3 Typical sources of data are reports on delays, in-flight defects, authorized
operations with known defects (i.e. equipment inoperative at a level compatible
with the Minimum Equipment List*), flight incidents and accidents, air turn-backs,
and the findings of line, hangar and workshop investigations. Other typical
sources include reports resulting from On Condition tasks and in-flight
monitoring (Airborne Integrated Data Systems); Service Bulletins; other
Operators' experience, etc. The choice of a source of data, and the processes
for data collection, sifting and presentation (either as individual events or as
rates of occurrence) should be such as to permit adequate condition assessment
to be made relative both to the individual event and to any trend.
(a) These are flight crew reports of engine shut-downs and usually include
details of the indications and symptoms prior to shut down. When
analyzed, these reports provide an overall measure of propulsion
system reliability, particularly when coupled with the investigations
and records of engine unscheduled removals.
(a) These are normally daily reports, made by the Operator's line
maintenance staff, of delays and cancellations resulting from
mechanical defects. Normally each report gives the cause of delay and
clearly identifies the system or component in which the defect occurred.
The details of any corrective action taken and the period of the delay are
also included.
(b) The reports are monitored by the Reliability Section and are classified
(usually in Air Transport Association of America, Specification 100
(ATA 100) Chapter sequence), recorded and passed to the appropriate
engineering staffs for analysis. At prescribed periods, recorded delays
and cancellations for each system are plotted, usually as events per 100
departures.
At the end of the prescribed reporting period the unscheduled removals and/or
confirmed failure rates for each component are calculated to a base of 1,000
hours flying, or, where relevant, to some other base related to component
running hours, cycles, landings, etc.
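The rate calculations described above (events per 100 departures, removals or failures per 1,000 flying hours) reduce to one normalisation. The following is a minimal Python sketch of it; the function name `rate_per_base` and the sample figures are illustrative assumptions and are not taken from any Programme.

```python
def rate_per_base(events: int, exposure: float, base: float) -> float:
    """Return the number of events normalised to `base` units of exposure.

    `exposure` is the flying hours, departures, cycles or landings for the
    reporting period; `base` is, for example, 1,000 for flying hours or
    100 for departures.
    """
    if exposure <= 0:
        raise ValueError("exposure must be positive")
    return events * base / exposure


# Illustrative figures only (assumed, not from the source document).
removal_rate = rate_per_base(events=6, exposure=4800.0, base=1000.0)  # per 1,000 flying hours
delay_rate = rate_per_base(events=9, exposure=1500.0, base=100.0)     # per 100 departures
```

The same function serves any other base (cycles, landings, component running hours) provided that the exposure figure is expressed in the matching unit.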
3.7.1 To assist in the assessment of reliability, Alert Levels are established for the
Items which are to be controlled by the Programme. The most commonly used
data and units of measurement (Pireps per 1,000 hours, Component
Removals/Failures per 1,000 hours, Delays/Cancellations per
3.7.2 There are arguments for and against the choice of the various sources of data to
be used in the Programme for the purpose of statistical reliability measurement.
Are statistics derived from Pireps better than those derived from reports on
Delays/Cancellations? Are the statistics derived from reports on Component
Unscheduled Removals better than those from reports on Confirmed Failures?
And so on.
3.7.3 The value of Pireps can vary where flight crews within the fleet have differing
standards of vigilance, or where differing standards occur in the abilities of
engineering staff. Where reasonable uniformity of reporting is not present then
the difference between the number of Component Unscheduled Removals
and those which are confirmed as failures can result in reports being
unrepresentative of true reliability.
3.7.4 Information collected over many years has been analysed and statistically tested,
and the following statements may be accepted as valid.
(e) The number of reports normally follows a 'seasonal' pattern and can be
statistically unrealistic during periods of aircraft low utilization.
3.7.5 When considering data based on components, it is useful to note that where
a Programme is introduced for an aircraft fleet for the first time and in the early
'settling in' period, the number of failures which are not confirmed after an
unscheduled removal can be as high as 40% for all components taken
together. For individual components this can range from 5% for landing gear
and flying control components to 65% for some communications and avionic
components; thus indicating the need for inclusion of data on both
unscheduled removal and confirmed failure of components.
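The percentage of unscheduled removals that are not confirmed as failures can be derived directly from the two counts that the paragraph above recommends collecting. A minimal Python sketch follows; the function name and the sample counts are assumptions made for illustration.

```python
def unconfirmed_percentage(unscheduled_removals: int, confirmed_failures: int) -> float:
    """Percentage of unscheduled removals not confirmed as failures."""
    if unscheduled_removals <= 0:
        raise ValueError("at least one unscheduled removal is required")
    if not 0 <= confirmed_failures <= unscheduled_removals:
        raise ValueError("confirmed failures must lie between 0 and the removals count")
    unconfirmed = unscheduled_removals - confirmed_failures
    return 100.0 * unconfirmed / unscheduled_removals


# Illustrative counts only: 20 removals, of which 12 were confirmed as failures.
example = unconfirmed_percentage(unscheduled_removals=20, confirmed_failures=12)
```

Tracking this figure per component shows where removal data alone would overstate the true failure rate.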
3.8.1 A reliability alert level (or equivalent title, e.g. Performance Standard, Control
Level, Reliability Index, Upper Limit) hereinafter referred to as an 'Alert
Level', is purely an 'indicator' which when exceeded indicates that there has
been an apparent deterioration in the normal behaviour pattern of the Item
with which it is associated. When an Alert Level is exceeded the appropriate
action has to be taken. It is important to realize that Alert Levels are not
minimum acceptable airworthiness levels. When Alert Levels are based on a
representative period of safe operation (during which failures may well have
occurred) they may be considered as a form of protection against erosion of the
design aims of the aircraft in terms of system function availability. In the case
of a system designed to a multiple Redundancy philosophy it has been a common
misunderstanding that, as Redundancy exists, an increase in failure rate can
always be tolerated without corrective action being taken.
3.8.2 Alert Levels can range from a 0.00 failure rate per 1,000 hours, both for
important components and where failures in service have been extremely rare,
to perhaps as many as 70 Pireps per 1,000 hours on a systems basis
for ATA 100 Chapter 25 - Equipment/Furnishings, or 20 removals of passenger
entertainment units in a like period.
(a) Alert Levels should, where possible, be based on the number of events
which have occurred during a representative period of safe operation of
the aircraft fleet. They should be updated periodically to reflect operating
experience, product improvement, changes in procedures, etc.
(i) For a new aircraft type during the first two years of operation
all malfunctions should be considered significant and should be
investigated, and although Alert Levels may not be in use,
Programme data will still be accumulated for future use.
(d) There are several recognized methods of calculating Alert Levels, any one
of which may be used provided that the method chosen is fully defined
in the Operator's Programme documentation. It is not necessary for
elaborate mathematical proofs or statistical methods to be explored in
this publication; in fact neither is necessary for the operation of a
Programme. The methods given herein as examples, and many more, may
be found in any standard textbook on statistics.
Calculation 1.
The three-monthly running average Pirep rate per 1,000 hours
for each system (or sub-system), as in the Table of Example 1,
is averaged over the sample operating period and is known as the
Mean; the Mean is multiplied by 1.30 to produce the Alert Level
for the given system. This is sometimes known as the '1.3
Mean' or '1.3x' method.
Calculation 2.
The Mean, as in Calculation 1, plus 3 Standard Deviations of the
Mean (as illustrated in Appendix C - Example 1).
Calculation 3.
The Mean, as in Calculation 1, plus the Standard Deviation of
the 'Mean of the Means', plus 3 Standard Deviations of the
Mean (as illustrated in Appendix C - Example 2).
Calculation 4.
The Mean of the individual quarterly Component Unscheduled
Removal rates for the period of seven quarters, plus 2 Standard
Deviations of the Mean.
Calculation 5.
The maximum acceptable number of 'Expected Component
Unscheduled Removals' in a given quarter, as calculated using a
statistical process in association with the Poisson Distribution
of Cumulative Probabilities (as illustrated in Appendix C -
Example 3).
Calculation 6.
The Number of 'predicted Component Unscheduled Removals
(or failures)' in a given quarter, as determined by the Weibull
or other suitable statistical method.
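Calculations 1, 2, 4 and 5 above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method of Appendix C: the function names are hypothetical, the standard deviation is taken to be the sample standard deviation (N - 1 denominator), and the 0.95 confidence used for the Poisson case is an assumed value that an Operator would define in the Programme documentation. Calculations 3 and 6 are not sketched.

```python
import math
from statistics import mean, stdev


def running_average(monthly_rates, window=3):
    """Three-monthly running average of monthly rates per 1,000 hours."""
    return [mean(monthly_rates[i - window + 1:i + 1])
            for i in range(window - 1, len(monthly_rates))]


def alert_level_scaled_mean(rates, factor=1.30):
    """Calculation 1: the Mean multiplied by 1.30."""
    return factor * mean(rates)


def alert_level_mean_plus_sd(rates, k):
    """Calculation 2 (k=3) or Calculation 4 (k=2): Mean plus k standard deviations.

    Assumption: sample standard deviation (N - 1 denominator).
    """
    return mean(rates) + k * stdev(rates)


def poisson_max_removals(expected, confidence=0.95):
    """Calculation 5 sketch: smallest n for which P(X <= n) >= confidence,
    where X is Poisson distributed with mean `expected`."""
    if expected < 0 or not 0 < confidence < 1:
        raise ValueError("expected must be >= 0 and 0 < confidence < 1")
    term = math.exp(-expected)  # P(X = 0)
    cumulative = term
    n = 0
    while cumulative < confidence:
        n += 1
        term *= expected / n    # P(X = n) from P(X = n - 1)
        cumulative += term
    return n
```

For Calculation 1 the input to `alert_level_scaled_mean` would be the output of `running_average` over the sample operating period; for Calculation 4 the input to `alert_level_mean_plus_sd` would be the seven quarterly removal rates.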
(a) Both the method used for establishing an Alert Level, and the associated
qualifying period, apply also when the level is re-calculated to reflect current
operating experience. However, if, during the period between re-calculations of
an Alert Level, a significant change in the reliability of an Item is experienced
which may be related to the introduction of a known action (e.g. modification,
changes in maintenance or operating procedures) then the Alert Level
applicable to the Item would be re-assessed and revised on the data
subsequent to the change.
(b) All changes in Alert Levels are normally required to be approved by the Director
and the procedures, periods and conditions for re-calculation are required to be
defined in each Programme.
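Re-assessment of an Alert Level on the data subsequent to a known change can be sketched as follows. The function name, the use of integer period indices and the choice of the 1.3 Mean method are assumptions made for illustration; the Programme documentation would define the actual method and qualifying period.

```python
from statistics import mean


def recalculated_alert_level(history, change_period, method):
    """Re-calculate an Alert Level using only data after a known change.

    `history` is a list of (period_index, rate) pairs; `change_period` is
    the index of the period in which the change was introduced; `method`
    is the same calculation used for the original Alert Level.
    """
    subsequent = [rate for period, rate in history if period > change_period]
    if not subsequent:
        raise ValueError("no data subsequent to the change")
    return method(subsequent)


def one_point_three_mean(rates):
    """The 1.3 Mean method, used here only as an example method."""
    return 1.30 * mean(rates)


# Illustrative rates only: a modification introduced in period 2.
history = [(1, 8.0), (2, 9.0), (3, 4.0), (4, 6.0)]
revised = recalculated_alert_level(history, change_period=2, method=one_point_three_mean)
```

Passing the method as a parameter reflects the requirement that the same method applies both when a level is established and when it is re-calculated.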
3.10.1 General
3.10.2 The main purpose of displaying the information is to provide the Operator and the
Director with an indication of aircraft fleet reliability in such a manner that
the necessity for corrective actions may be assessed. The format, frequency
of preparation and the distribution of displays and reports are fully detailed in
the Programme documentation. Typical data displays are described in 3.10.3 to
3.10.9 and some examples are illustrated in Appendix D.
This display (see Fig. D1), which is related to all aircraft of the same type in the
fleet, is usually produced in tabular form, and should contain the following
minimum information for the defined reporting period:-
The purpose of this type of display is to indicate the aircraft systems which
have caused delay to or cancellation of flights as a result of mechanical
malfunctions. It is normal for each display to show the delays/cancellations as
a total for all systems (to represent fleet overall reliability, as in Fig. D2) as
well as separately for the individual systems. The displays for the separate
systems will usually show the delay/cancellation rate for the defined reporting
period, the three-monthly moving average rate and, where appropriate, the
Alert Level, and will present the information for a minimum period of 12
months.
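The content of a per-system display of this kind can be sketched as a list of rows, one per reporting period. The function name and row layout are hypothetical, and comparing the three-monthly moving average (rather than the monthly rate) with the Alert Level is an assumption; a Programme may define the comparison differently.

```python
def display_rows(monthly_rates, alert_level, window=3):
    """Build rows of (month, rate, moving average, alert exceeded).

    The moving average is None until `window` months of data exist.
    """
    rows = []
    for i, rate in enumerate(monthly_rates):
        if i + 1 >= window:
            average = sum(monthly_rates[i - window + 1:i + 1]) / window
            exceeded = average > alert_level
        else:
            average = None
            exceeded = False
        rows.append((i + 1, rate, average, exceeded))
    return rows


# Illustrative delay/cancellation rates per 100 departures (assumed values).
rows = display_rows([1.0, 2.0, 3.0, 6.0], alert_level=3.0)
```

A display covering the minimum period of 12 months would simply pass twelve or more monthly rates.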
This display (see Fig. D3) is the prime indication of engine in-service
reliability and also, to a large degree, of total power-plant reliability. Because
of the high level of reliability of engines and the consequently relatively low
numbers of unscheduled shut-downs per fleet, both the actual number of shut-
downs and the shut-down rate per 1,000 hours for the defined reporting period
as a three-monthly running average, shown as a graphical display, will provide
useful information in addition to that of Fig. D3. To be of most use, when
dealing with small numbers of unscheduled shut-downs, it is usual to present
both types of information in such a way as to show the trend over a two-to-
three-year period.
Having collected the information, and having presented it in a timely manner, it should
now be possible to identify any problems and to assess the necessity for corrective
actions. The information, having been sifted and categorized (normally in ATA 100
Chapter order) as individual events and/or rates of occurrence, can be analysed using
engineering and/or statistical methods. The analysis can be made at various stages in
the handling of the data to differing degrees. Initially, reports on flight defects, delay
causes, engine unscheduled shut-downs, workshop and hangar findings, other operators'
experience, etc., should be analysed individually to see if any immediate action is
desirable. This initial individual
3.12.1 The effectiveness of corrective action will normally be monitored by the very
process which revealed the need for it - the Condition Monitoring process.
*See Appendix E
3.13.2 In setting the number of samples and any other qualifying conditions, both
engineering assessment of the design and service experience are taken into
account. Evidence derived from other activities (e.g. unscheduled removals or
removals scheduled for other purposes) will supplement scheduled sampling
and the removal itself may, if representative, be substituted for a scheduled
sampling removal.
3.13.3 When the optimum period for a particular workshop activity has been
determined, threshold sampling will be discontinued and a Hard Time
limitation for workshop activity (e.g. Overhaul) will be prescribed.
3.13.4 A typical example of the use of threshold sampling is the control of the 'release
for service' periods of certain gas-turbine engines, where some of the units on
the engines are subject to individual Hard Time limitations (e.g. turbine disc
lives, refurbishing intervals). These individual limitations are, in most cases,
established and varied by the process described in 3.13.1 to 3.13.3. The
outcome is that the engine release period for installation in the aircraft is then
fixed by the expiration of the lowest unit Hard Time limitation.
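The outcome described above, in which the engine release period is fixed by the lowest unit Hard Time limitation, can be sketched as a minimum over the remaining life of each unit. The function name, the unit names and the figures are hypothetical, and the sketch assumes that each limitation and its accumulated usage are expressed in the same unit (e.g. hours).

```python
def engine_release_period(units):
    """Return (limiting unit, remaining life) for an engine.

    `units` maps a unit name to (hard_time_limit, accumulated_usage),
    both in the same unit of measurement.
    """
    if not units:
        raise ValueError("at least one unit is required")
    remaining = {name: limit - used for name, (limit, used) in units.items()}
    limiting_unit = min(remaining, key=remaining.get)
    return limiting_unit, remaining[limiting_unit]


# Illustrative figures only (assumed, not from the source document).
unit, period = engine_release_period({
    "turbine disc": (15000, 9000),
    "combustor refurbishment": (6000, 2500),
})
```

Here the refurbishing interval, not the disc life, fixes the release period, because it expires first.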
3.14.1 With the major issues of airworthiness and the economical allocation of vast
sums of money being involved, it is essential that Quality Control* should be
applied as an overall control of the Maintenance Programme. Each Programme
will describe the managerial responsibilities and procedures for continuous
monitoring of the Programme at progressive and fixed periods. Reviews, to
assess the effectiveness of the Programme, will also be prescribed.
3.14.2 There are various methods, both engineering and statistical, by which the
effectiveness of the Programme may be evaluated, and these include :-
NOTE: Generally there would be two levels of committee activity, functional and
managerial; the functional activity covering the practicality of corrective
actions, and the managerial activity covering the overall Quality
management of the Programme.
4.1 Approval
Approval of the Programme (as identified by the 'Document') will depend on the results
of an assessment as to whether or not the stated objectives can be achieved. The
approval of the Document then becomes a recognition of the potential ability of the
Organisation to achieve the stated objectives of the Programme.
NOTE: The Quality Department of the Organisation, together with the CAAT, monitors both the
performance of the Programme in practice as well as its continuing effectiveness in achieving
the stated objectives.
Condition Monitored Maintenance Programmes can vary from the very simple to the
very complex, and thus it is impractical to describe their content in detail. However,
the Document has to be such that the considerations in (a) to (j) are adequately
covered.
(a) It generates a precise, specific and logical Quality assessment by the Operator
of the ability of the Organisation to achieve the stated objectives.
(b) It enables the Director initially to accept, and, with subsequent continued
monitoring, to have confidence in, the ability of the Organisation to such an
extent that the Director can renew Certificates of Airworthiness,
(c) It ensures that the Operator provides himself with Quality management of his
Organisation.
(d) It provides the Operator with a basis for the discharge of his moral and legal
obligations in respect of the operation of aircraft.
(e) It enables the Director (as the Airworthiness Authority) to discharge its duties
and legal obligations in respect of the maintenance aspects of airworthiness,
and, where applicable, to delegate certain tasks to the Operator.
(g) With (a) to (f) in mind, it states the objectives of the Programme as precisely
as is possible, e.g. "maintenance of designated components by reliability
management in place of routine overhaul", "Condition Monitoring as a primary
maintenance process".
(h) The depth of description of the details of the Programme is such that :-
(ii) the procedures for revision of the Document, are clearly stated.
(b) The Document may either be physically contained within the Approved
Maintenance Schedule, or be identified in the Approved Maintenance Schedule
by reference and issue number, in such a manner that the Approved
Maintenance Schedule could be deemed to contain it by specific statement and
cross-reference.
(b) Are the objectives of the Programme clearly defined? e.g. 'Maintenance of
designated Items by reliability management in place of routine overhaul',
'Confidence assessment of overhaul periods', 'Condition monitoring as a
primary maintenance process', 'Airworthiness/economic Quality management
of maintenance'.
(c) Does the Approved Maintenance Schedule clearly state to which Items the
Programme is applicable?
(e) What types of data are to be collected? How? By whom? When? How is this
information to be sifted, grouped, transmitted and displayed?
(f) What reports/displays are provided? By whom? To whom? When? How soon
following data collection? How are delays in publishing controlled?
(g) How is all information and data analysed and interpreted to identify aircraft
actual and potential condition? By whom? When?
(j) Is there a requirement that the Approved Maintenance Schedule be amended, and
is the method of doing so included in the Programme, e.g. variation of time
limitations, additional checks?
(k) Is there a requirement that Maintenance Manuals be amended and is the method
of doing so included in the Programme, e.g. maintenance practices, tools and
equipment, materials?
(m) What provision is made for corrective action follow-up and for checks on
compliance with original intention, e.g. those which are not working out
(o) Is there a diagram of the relationship between the departments and groups
concerned with the Programme and does it show the flow of Condition
Monitoring data, its handling and the prescribed reaction to it?
(p) Are all of the departments involved in the Programme included and are there
any responsibilities not allocated?
(q) What Quality management processes are contained within the Programme in
respect of :-
(i) Responsibility for the Document itself and the procedure for its
amendment?
5.1 Maintenance based solely on the traditional methods of fixed component lives and
'strip-down' policies constitutes a very simple condition control process. Its
administration, effectiveness and the legal obligations of all concerned are easily defined.
When, for any Item, these traditional processes are replaced by Condition Monitored
Maintenance, confidence in the unmanifest condition of the Item can only be through
confidence in the procedure for controlling that condition, i.e. the Condition Monitoring
process.
5.2 Most of the latest generation of aircraft have been so designed that their reliability is based
on the extensive use of multiple Redundancy, thus achieving the continued availability
of system function, even in the event of failures. The scope of this 'System Redundancy'
and 'multiplicity of system function' (see 1.4(a) NOTE) is such that it allows
maintenance to be almost totally controlled by Condition Monitoring as the primary
maintenance process, with a few items controlled by the On-Condition process and
even fewer controlled by the Hard Time process. This, in turn, has meant that the
maintenance of the aircraft as a
5.4 In addition to the obvious advantages which are generated by the achievement of the
objectives of the Programme, the formalized structure and operation of a Programme
can provide the Airworthiness Authority with confidence that the Condition
Monitoring processes are effectively contributing to continuing airworthiness, as well
as informing all concerned about the reliability of the aircraft in question.
Airline and manufacturer experience in developing scheduled maintenance programs for new aircraft
has shown that more efficient programs can be developed through the use of logical decision
processes.
Subsequently, it was decided that experience gained on this project should be applied to update the
decision logic and to delete certain 747 detailed procedural information so that a universal document
could be made applicable for later new type aircraft. This was done and resulted in the document,
entitled, "Airline/Manufacturer Maintenance Program Planning Document", MSG-2. MSG-2 decision
logic was used to develop scheduled maintenance programs for the aircraft of the 1970's.
In 1979, a decade after the publication of MSG-2, experience and events indicated that an update of
MSG procedures was both timely and opportune in order for the document to be used to develop
maintenance programs for new aircraft, systems or powerplants.
An ATA Task Force reviewed MSG-2 and identified various areas that were likely candidates for
improvement. Some of these areas were the rigor of the decision logic, the clarity of the distinction
between economics and safety, and the adequacy of treatment of hidden functional failures.
Additionally:
A. The development of new generation aircraft provided a focus, as well as motivation, for an
evolutionary advancement in the development of the MSG concept.
B. New regulations which had an effect on maintenance programs had been adopted and
therefore needed to be reflected in MSG procedures. Among those were new damage
tolerance rules for structures and the Supplemental Structural Inspection program for high time
aircraft.
C. The high price of fuel and the increasing cost of materials created trade-off evaluations which
had great influences on maintenance program development. As a result, maintenance
programs required careful analysis to ensure that only those tasks were selected which
provided genuine retention of the inherent designed level of safety and reliability, or provided
economic benefit.
Against this background, ATA airlines decided that a revision to existing MSG-2 procedures was both
timely and appropriate. The active participation and combined efforts of the FAA, CAA/UK, AEA,
U.S. and European aircraft and engine manufacturers, U.S. and foreign airlines, and the U.S. Navy
generated the document, MSG-3. As a result there were a number of
differences between MSG-2 and MSG-3, which appeared both in the organization/presentation of the
material and in the detailed procedural content. However, MSG-3 did not constitute a fundamental
departure from the previous version, but was built upon the existing framework of MSG-2 which had
been validated by ten years of reliable aircraft operation using maintenance programs based thereon.
The following reflects some of the major improvements and enhancements generated by MSG-3 as
compared to MSG-2.
1. Systems/Powerplant Treatment:
MSG-3 adjusted the decision logic flow paths to provide a more rational procedure for task
definition and a more straightforward and linear progression through the decision logic.
MSG-3 logic took a "from the top down" or consequence of failure approach. At the outset,
the functional failure was assessed for consequence of failure and was assigned one of two
basic categories:
A. SAFETY
B. ECONOMIC
Further classification determined sub-categories based on whether the failure was evident to or
hidden from the operating crew. (For structures, category designation was "significant" or
"other" structure, and all functional failures were considered safety consequence items).
With the consequence category established for systems/powerplants, only those task
selection questions pertinent to the category needed to be asked. This eliminated unnecessary
assessments and expedited the analysis. Definite applicability and effectiveness criteria were
developed to provide more rigorous selection of tasks. In addition, this approach helped to
eliminate items from the analytical procedure whose failures had no significant consequence.
Task selection questions were arranged in a sequence such that the most preferred, most easily
accomplished task was considered first. In the absence of a positive indication concerning
the applicability and effectiveness of a task, the next task in sequence was considered, down
to and including possible redesign.
2. Structures Treatment:
Structures logic evolved into a form which more directly assessed the possibility of
structural deterioration processes. Considerations of fatigue, corrosion, accidental damage,
age exploration programs and others, were incorporated into the logic diagram and were
routinely considered.
MSG-3 recognized the new damage tolerance rules and the supplemental inspection
programs, and provided a method by which their intent could be adapted to the Maintenance
data certificate restraints. Concepts such as multiple failures, effect of failure on adjacent
structures, crack growth from detectable to critical length, and threshold exploration for
potential failure, were covered in the decision logic of the procedural material.
3. The MSG-3 logic was task-oriented and not maintenance process oriented (MSG-2). This
eliminated the confusion associated with the various interpretations of Condition Monitoring
(CM), On-Condition (OC), Hard-time (HT) and the difficulties encountered when attempting
to determine what maintenance was being accomplished on an item that carried one of the
process labels.
By using the task-oriented concept, one would be able to view the MRB document and see
the initial scheduled maintenance program reflected for a given item (e.g., an item might
show a lubrication task at the "A" frequency, an inspection/functional check at the "C"
frequency and a restoration task at the "D" frequency).
4. Servicing/Lubrication was included as part of the logic diagram to ensure that this important
category of task was considered each time an item was analyzed.
5. The selection of maintenance tasks, as output from the decision logic, was enhanced by a
clearer and more specific delineation of the task possibilities contained in the logic.
6. The logic provided a distinct separation between tasks applicable to either hidden or evident
functional failures; therefore, treatment of hidden functional failures was more thorough than
that of MSG-2.
7. The effect of concurrent or multiple failure was considered. Sequential failure concepts were
used as part of the hidden functional failure assessment (Systems/Powerplant), and multiple
failure was considered in structural evaluation (Structures).
8. There was a clear separation between tasks that were economically desirable and those that
were required for safe operation.
9. The structures decision logic no longer contained a specific numerical rating system. The
responsibility for developing rating systems was assigned to the appropriate manufacturer with
approval of the Industry Steering Committee.
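The consequence-of-failure approach and the ordered task selection described in items 1, 3 and 6 above can be illustrated with a short sketch. It is an illustration only and not part of the MSG-3 document: the names `FunctionalFailure`, `consequence_category`, `select_task` and `TASK_SEQUENCE` are hypothetical, the task names are taken from the examples quoted in the text, and their ordering is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class FunctionalFailure:
    """Hypothetical record of one functional failure under analysis."""
    description: str
    affects_safety: bool    # consequence of failure: safety or economic
    evident_to_crew: bool   # evident to, or hidden from, the operating crew


def consequence_category(failure: FunctionalFailure) -> Tuple[str, str]:
    """Assign the basic category first, then the evident/hidden sub-category."""
    basic = "SAFETY" if failure.affects_safety else "ECONOMIC"
    visibility = "evident" if failure.evident_to_crew else "hidden"
    return basic, visibility


# Assumed ordering, most preferred and most easily accomplished task first.
TASK_SEQUENCE = [
    "lubrication/servicing",
    "inspection/functional check",
    "restoration",
]


def select_task(
    failure: FunctionalFailure,
    is_applicable_and_effective: Callable[[FunctionalFailure, str], bool],
) -> str:
    """Walk the task sequence and return the first task that passes the
    applicability and effectiveness criteria, falling back to redesign."""
    for task in TASK_SEQUENCE:
        if is_applicable_and_effective(failure, task):
            return task
    return "redesign"
```

The caller supplies the applicability and effectiveness judgement as a function, because the text does not define those criteria numerically; the sketch shows only the order in which the questions are asked.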
MSG-3, REVISION 1:
In 1987, after using MSG-3 procedures on a number of new aircraft and powerplants in the first half
of the 1980's, it was decided that the benefits of the experience so gained should be used to improve
the document for future application; thus, Revision 1 was undertaken.
This revised document includes changes developed by American and European airframe
manufacturers, American and European airworthiness authorities, supplemented and agreed to
by the Air Transport Association of America and other airline representatives.
The major improvements and enhancements reflected in items one through nine above were
basically unchanged and remain applicable to this revised document.
The following are some of the more noteworthy revisions that have been incorporated:
10. Inspections:
13. Reference to a "User's Guide" for procedures related to administration and forms added.
15. Operating Crew Normal Duties - "Normal Duties" revised to delete pre-flight and post flight
check list; added "on a daily basis" for frequency of usage with respect to normal crew duties.
16. Added that procedures for handling composite or other new materials may have to be
developed.
19. Defined logic for failures which may affect dispatch capability or involve the use of abnormal
or emergency procedures. Failure-effect Category 6 is now identified as "Operational -
Evident".
20. Noted that each MSI and SSI should be recorded for tracking purposes whether or not a task
was derived therefrom.
MSG-3, REVISION 2:
In 1993, MSG-3 Revision 2 was incorporated. The most significant changes introduced were:
2. To provide guidelines which ensure that a consistent approach be taken with respect to
tasks/intervals required to maintain compliance with Type Certification requirements.
MSG-3 Section 2.4 and its respective logic diagrams have been revised to add an evaluation process
to ensure the Corrosion Prevention and Control Program (CPCP) is considered in the evaluation of
each Structural Significant Item (SSI) and every zone.
Damage Sources Section 2.4.3.1 now includes a discussion of non-metallic materials (composites).
Procedures Section 2.4.4.1 has been revised to add Procedure and Decision blocks for the CPCP
evaluation and edited to produce a more ordered flow of the Procedure and Decision block numbers.
The Glossary - Appendix A Inspection Level Definitions have been revised to apply to Systems,
Powerplants and Structures, and definitions related to CPCP have been added.
It is suggested, in order to fully comprehend the MSG-3 concept, that the entire MSG-3 document be
reviewed and considered prior to accepting or modifying its approaches to maintenance programs
development. A User's Guide or Policies and Procedures Handbook may
be adopted with guidance and approval of the Industry Steering Committee.
[Figure: Programme data flow chart. Labels recoverable from the figure:
Data sources: Delays/Cancellations; In-Flight Shut-downs; Line Maintenance (Defects, Component
Removals); Workshop Reports (Unjustified Removals, Failure Reports, Sampling Reports).
Reliability Control Committee members: Quality Control Manager; Engineering Manager; Development
Engineering; Production Engineering; Reliability Control; Maintenance; CAAT; Others (as required).
Corrective Action: Maintenance Procedures; Workshop Procedures; Flight Procedures; Product
Improvement; Provisioning.]
Method: Alert Level = The 'corrected' Mean of the Quarterly Failure Rates plus 1 Standard Deviation of this
mean, based on the past seven calendar quarters of confirmed component failure rates per 1,000 hours,
to provide an Alert Level for use as a quarterly period of comparison.
N = 7
• Where an individual Quarterly Failure Rate falls outside plus or minus 50% of the uncorrected Mean Quarterly Failure Rate
(0·63 in this case), then this Mean is to be used as a Corrected Rate in place of the uncorrected Quarterly Rate.
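The Alert Level method above can be expressed as a short calculation. The following is a minimal sketch under stated assumptions, not the handbook's own worked example: the function name `alert_level` and the seven quarterly rates are hypothetical (the rates were chosen only so that their uncorrected mean is 0·63, the value quoted above), and "Standard Deviation of this mean" is assumed to be the sample standard deviation of the corrected rates.

```python
from statistics import mean, stdev


def alert_level(quarterly_rates, band=0.5, k=1.0):
    """Corrected mean of the quarterly failure rates plus k standard deviations.

    quarterly_rates: confirmed failure rates per 1,000 hours, one per quarter.
    band: fractional tolerance about the uncorrected mean (0.5 = plus or minus 50%).
    k: number of standard deviations added to the corrected mean.
    """
    uncorrected_mean = mean(quarterly_rates)
    low = (1 - band) * uncorrected_mean
    high = (1 + band) * uncorrected_mean
    # Any rate outside the band is replaced by the uncorrected mean.
    corrected = [r if low <= r <= high else uncorrected_mean
                 for r in quarterly_rates]
    # Assumption: sample standard deviation (N - 1 divisor) of the corrected rates.
    return mean(corrected) + k * stdev(corrected)


# Hypothetical rates for N = 7 quarters; uncorrected mean is 0.63.
rates = [0.60, 0.70, 0.50, 0.66, 0.64, 1.10, 0.21]
level = alert_level(rates)
```

With these hypothetical rates, the last two values fall outside the band of 0·315 to 0·945 and are replaced by the mean before the Alert Level is computed. If the population standard deviation is intended instead, `statistics.pstdev` can be substituted for `stdev`.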
AIRCRAFT TYPE:
                              ALERT  JANUARY 1971      1970 FIRST HALF   1970 LAST HALF
ATA 100 CHAPTER               LEVEL   UR   URR    FR    UR   URR    FR    UR   URR    FR
21 - Air Conditioning           ·35    2   ·53   ·33    14   ·34   ·32    15   ·36   ·31
22 - Auto-pilot                 ·80    4  1·33   ·33    16   ·98   ·29    19   ·98   ·32
23 - Communications             ·92    2   ·67   ·48    10   ·57   ·48     8   ·56   ·37
24 - Electric Power             ·20    2   ·08   ·02     8   ·06   ·02     9   ·07   ·03
27 - Flight Controls            ·30    1   ·20   ·09     7   ·12   ·10     6   ·10   ·08
28 - Fuel                       ·23    0   ·00   ·00     2   ·64   ·30     1   ·09   ·06
29 - Hydraulic                  ·38    1   ·42   ·40     2   ·26   ·18     4   ·46   ·22
30 - Ice & Rain Protection      ·15    0   ·00   ·00     2   ·14   ·08     2   ·14   ·08
31 - Instruments                ·65    4   ·63   ·34    20   ·61   ·31    16   ·57   ·20
32 - Landing Gear               ·33    1   ·04   ·02     7   ·05   ·03     9   ·09   ·04
34 - Navigation                 ·73    3   ·66   ·21    20   ·69   ·24    24   ·71   ·29
35 - Oxygen                     ·30    2   ·66   ·32    11   ·65   ·31     9   ·64   ·30
36 - Pneumatic                  ·20        ·00   ·00     2   ·01   ·01     4   ·02   ·02
38 - Water/Waste                ·24    0   ·09   ·06     6   ·16   ·15     7   ·17   ·16
49 - APU                        ·48        ·33   ·32     7   ·34   ·34     4   ·26   ·29
73 - Engine Fuel & Control      ·39    0   ·00   ·02     4   ·10   ·06     2   ·06   ·05
75 - Engine Air                 ·28    1   ·17   ·16     5   ·16   ·14     3   ·12   ·12
77 - Engine Indicating          ·30    5   ·42   ·17    26   ·46   ·18    22   ·44   ·17
79 - Oil                        ·22    0   ·00   ·00     2   ·04   ·02     3   ·06   ·04
80 - Starting                   ·50    1   ·17   ·11     6   ·18   ·12     3   ·09   ·10
UR  - Unscheduled Removals
URR - Unscheduled Removal Rate
FR  - Confirmed Failure Rate (3 months cum. av.)
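A table of this kind lends itself to a simple screening step. The sketch below flags ATA chapters whose January 1971 Unscheduled Removal Rate exceeds the Alert Level. It rests on two assumptions that the table itself does not state: that the Alert Level is compared against the URR column, and that an entry such as ·35 denotes 0.35. Only a few rows are copied from the table, and the names `ROWS` and `chapters_over_alert` are hypothetical.

```python
# (ATA chapter, title, alert level, January 1971 URR), copied from the table.
ROWS = [
    (21, "Air Conditioning", 0.35, 0.53),
    (22, "Auto-pilot", 0.80, 1.33),
    (23, "Communications", 0.92, 0.67),
    (24, "Electric Power", 0.20, 0.08),
    (29, "Hydraulic", 0.38, 0.42),
    (31, "Instruments", 0.65, 0.63),
]


def chapters_over_alert(rows):
    """Return the ATA chapter numbers whose rate exceeds the alert level."""
    return [chapter for chapter, _title, alert, rate in rows if rate > alert]


flagged = chapters_over_alert(ROWS)
```

A chapter flagged in this way would be a candidate for review under the Programme; the comparison alone does not indicate what corrective action, if any, is appropriate.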
1 INTRODUCTION
Those terms and abbreviations in the text which have a specific meaning are brought together
in this Appendix E for ease of reference. Where a definition has been derived from British
Standard 4778 "Glossary of Terms used in Quality Assurance" or the "World Airlines
Technical Operations Glossary", the source of the definition is indicated by the addition of
"(BS)" or "(WATOG)", as appropriate, at the end of the text.
2.6 Condition Monitoring. A primary maintenance process under which data on the whole
population of specified items in service is analyzed to indicate whether some allocation
of technical resources is required. Not a preventive maintenance process, condition
monitored maintenance allows failures to occur, and relies upon analysis of operating
experience information to indicate the need for appropriate action.
NOTE: Failure modes of condition monitored items do not have a direct adverse effect on
operating safety. (WATOG).
2.8 Failure Mode. The way in which the failure of an item occurs. (WATOG).
2.9 Hard Time Limit. A maximum interval for performing maintenance tasks. This interval
can apply to Overhaul of an Item, and also to removal following the expiration of
life of an Item.
2.10 Item. Any level of hardware assembly (i.e. part, sub-system, system, accessory,
component, unit, material, etc.). (sic) (WATOG).
2.11 Maintenance Significant Items. Maintenance items that are judged to be relatively the
most important from a safety, reliability or economic stand-point. (sic) (WATOG).
2.12 Minimum Equipment List. An approved list of items which may be inoperative for
flight under specified conditions. (WATOG).
2.14 Overhaul. The restoration of an item in accordance with the instructions defined in the
relevant manual. (WATOG).
2.15 Partial Overhaul. The overhaul of a sub-assembly of an item with a time controlled
overhaul to permit the longer-lifed item to achieve its authorized overhaul life.
(WATOG).
2.18 Quality. The totality of features and characteristics of a product or service that bear
on its ability to satisfy a given need. (BS).
2.19 Quality Control. A system of programming and co-ordinating the efforts of the various
groups in an organization to maintain or improve quality, at an economical level
which allows for customer satisfaction. (BS).
2.21 Redundancy. The existence of more than one means for accomplishing a given
function. Each means of accomplishing the function need not necessarily be
identical. (WATOG).
2.22 Redundancy, Active. That redundancy wherein all redundant items are operating
simultaneously rather than being activated when needed. (WATOG).
2.23 Redundancy, Standby. That redundancy wherein the alternative means of performing
the function is inoperative until needed and is activated upon failure of the primary
means of performing the function. (WATOG).
2.24 Replace. The action whereby an item is removed and another item is installed in its
place for any reason. (WATOG).
2.27 Test. An examination of an item in order to ensure that the item meets specified
requirements. (WATOG).