Failure Reporting, Analysis, and Corrective Action System
SEMATECH and the SEMATECH logo are registered trademarks of SEMATECH, Inc.
WorkStream is a registered trademark of Consilium, Inc.
FAILURE REPORTING,
ANALYSIS, AND
CORRECTIVE ACTION SYSTEM
CONTENTS
Acknowledgments
EXECUTIVE SUMMARY
FAILURE REPORTING
Collecting the Data
Reporting Equipment Failures
Reporting Software Problems
Logging Data
Using FRACAS Reports
ANALYSIS
Failure Analysis Process
Failure Review Board
Root Cause Analysis
Failed Parts Procurement
CORRECTIVE ACTION
GLOSSARY
ACKNOWLEDGMENTS
In putting together this document, there were many obstacles that kept getting in the way of
completing it. Without the support of my family, Sonia, Jason, and Christopher, this would
never have been accomplished.
Mario Villacourt
External Total Quality and Reliability
SEMATECH
I would like to thank my wife, Monique I. Govil, for her support and encouragement in
completing this work. Also, I want to thank SVGL for supporting my participation with this
effort.
Pradeep Govil
Reliability Engineering
Silicon Valley Group Lithography Systems
EXECUTIVE SUMMARY
[Figure: FRACAS process diagram. Labels: Reliable Equipment; In-house
Test & Inspections; Engineering Change Control; Corrective Action;
Failure Reporting; Analysis]
Since factors vary among installation sites, equipment users must
work closely with each of their suppliers to ensure that proper data
is being collected, that the data is being provided to the correct
supplier, and that the resulting solutions are feasible.
1. Quoted from MIL-STD-2155(AS), Failure Reporting, Analysis and Corrective Action System.
FRACAS objectives include the following:
FAILURE REPORTING
All events (failures) that occur during inspections and tests should
be reported through an established procedure that includes
collecting and recording corrective maintenance information and
times. The data included in these reports should be verified and
then the data should be submitted on simple, easy-to-use forms that
are tailored to the respective equipment or software.
[Figure labels: In-house Test & Inspections; Field Operations;
Failure Reporting]
Collecting the Data
Many problems go unnoticed because insufficient information was
provided. The FRB must know if, for example, someone was able
to duplicate the problem being reported. There are three common
causes for missing essential data:
The person who filled out the form had not been trained.
Reporting Equipment Failures
Collecting and sharing appropriate data through event reports are
essential components of an effective FRACAS process, both for the
supplier and for the user. There are common elements in every
report (when the event occurred, what item failed, etc.) that the
user and the supplier both use to analyze failures. Other crucial
information includes the duration of the failure, the time it took to
repair, and the type of metric used (time or wafer cycles).
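The common report elements listed above can be pictured as a small record type. The following is only a minimal sketch: the class name, the field names, and the sample values are all hypothetical and are not taken from any SEMATECH form. It also shows one way a computerized form could flag blank fields before a report is submitted.

```python
from dataclasses import dataclass, fields
from datetime import datetime
from typing import List, Optional


@dataclass
class EventReport:
    """One failure event. All field names are illustrative assumptions."""
    occurred_at: Optional[datetime]   # when the event occurred
    failed_item: Optional[str]        # what item failed
    downtime_hours: Optional[float]   # duration of the failure
    repair_hours: Optional[float]     # time it took to repair
    metric: Optional[str]             # type of metric: "time" or "wafer cycles"


def missing_fields(report: EventReport) -> List[str]:
    """Return the names of fields left blank, in declaration order."""
    return [
        f.name
        for f in fields(report)
        if getattr(report, f.name) in (None, "")
    ]


# Made-up sample values, for illustration only.
report = EventReport(
    occurred_at=datetime(1993, 11, 4, 9, 30),
    failed_item="limit sensor",
    downtime_hours=2.5,
    repair_hours=None,   # the originator did not record the repair time
    metric="time",
)
print(missing_fields(report))
```

A check of this kind only catches blank fields; it cannot tell whether the recorded values are accurate, so verification of the report data is still needed.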
In addition, make full use of reporting tools that already exist (to
which little or no modification may be necessary), such as
Customer activity logs
Table 1 shows the types of detail that the supplier needs to see in
the event report (in-house or field). Figure 2 is an example of an
event report form. The format of the form is important only to
simplify the task of the data recorder. You may want to
computerize your data entry forms to expedite the process and also
minimize failure description errors.
TABLE 1 EVENT REPORT FIELDS
Event Report Fields     Example     Description
EVENT RECORD
Serial Number: 9003 Reliability Code: DMD-XS-HRN-M3S-002
[Figure: equipment state breakdown of total time (168 hrs/wk).
States: non-scheduled, unscheduled downtime, scheduled downtime,
engineering, standby, productive. Groupings: equipment downtime,
equipment uptime, operations time.]
Reporting Software Problems
When reporting software problems, too much detail can be
counterproductive. An error in software does not necessarily mean
that there will be a problem at the system (equipment) level. To
quote Sam Keene, 1992 IEEE Reliability Society president,
FRACAS helps you focus on those errors that the customer may
experience as problems. One focusing mechanism is to track the
reason for a corrective action. This data can be collected by
reviewing the repair action notes for each problem in the problem report.
Engineering and management can categorize these reasons to
determine the types of errors that occur most often and address
them by improving procedures that most directly cause a particular
type of error. The process of analyzing this data is continuous.
Unclear requirements
Misinterpreted requirements
Changing requirements
Bad nesting
Missing code
Excess code
Previous maintenance
Misuse of variables
Conflicting parameters
Specification error
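Ranking reasons such as those listed above by how often they occur can be sketched in a few lines. This is an illustrative sketch only: the sample list of reasons is made up, and the function name rank_reasons is hypothetical. It counts each reason and reports a running cumulative share, which is the basis of a Pareto-style ranking.

```python
from collections import Counter

# Hypothetical reasons pulled from repair action notes, one per problem report.
reasons = [
    "Missing code",
    "Unclear requirements",
    "Missing code",
    "Changing requirements",
    "Misuse of variables",
    "Missing code",
    "Unclear requirements",
]


def rank_reasons(reasons):
    """Return (reason, count, cumulative share) tuples, most frequent first."""
    counts = Counter(reasons)
    total = sum(counts.values())
    ranked, running = [], 0
    for reason, count in counts.most_common():
        running += count
        ranked.append((reason, count, running / total))
    return ranked


for reason, count, cumulative in rank_reasons(reasons):
    print(f"{reason:<24}{count:>3}{cumulative:>8.0%}")
```

Because the analysis is continuous, a tally like this would be rerun as new problem reports arrive rather than computed once.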
1. Catastrophic
2. Severe
4. Negligible
Logging Data
Whether software or equipment errors are being reported, once the
supplier receives the reports, all data should be consolidated into
one file. The supplier's reliability/quality organization should
oversee the data logging.
PROBLEMS RECORD
Reliability Code Title/Description
ATF-STK-PRI-RT-XAX-001 LIMIT FLA SENSORS FAIL
System: ATF Initial Status: I
Subsystem: STK Fault Category: Facilities Elec
Assembly: PRI Error Codes: errcode 2
Subassembly: RT Assignee: Eric Christie
Sub-subassembly: XAX
DATES
ROOT CAUSE SOLUTION (RCS)
Insufficient Info: 11/04/93
Original Plan: / /
Sufficient Info: / /
Current Plan: / /
Defined: / /
Complete: / /
Contained: / /
Retired: / /
RCS Documentation:
Using FRACAS Reports
The FRACAS database management system (DBMS) can display
data in different ways. This section describes some of the report
types you may find useful. A complete failure summary and other
associated reports that this DBMS can provide are described in
more detail in the sections on the database tables and report types.
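As an illustration of one such report, a count of events per week for each reliability code could be computed as follows. This is a sketch under stated assumptions, not the DBMS's actual logic: the event dates are made up, the function name events_per_week is hypothetical, and week 1 is assumed to begin on the first day of the reporting period. The reliability codes are ones that appear elsewhere in this document.

```python
from collections import defaultdict
from datetime import date

# Hypothetical event log: (event date, reliability code).
events = [
    (date(1993, 8, 30), "DMD-OP-IBIGA-SRC-002"),
    (date(1993, 9, 2), "DMD-OP-IBIGA-SRC-002"),
    (date(1993, 9, 7), "DMD-VS-HC-001"),
    (date(1993, 9, 21), "DMD-OP-IBIGA-SRC-003"),
    (date(1993, 9, 23), "DMD-OP-IBIGA-SRC-002"),
]


def events_per_week(events, start):
    """Count events per (week number, reliability code).

    Week 1 is the seven days beginning on `start` (an assumption).
    """
    table = defaultdict(int)
    for when, code in events:
        week = (when - start).days // 7 + 1
        table[(week, code)] += 1
    return dict(table)


report = events_per_week(events, start=date(1993, 8, 30))
for (week, code), count in sorted(report.items()):
    print(f"week {week}  {code:<22}{count}")
```

Events dated before the start of the period would receive a week number of zero or less, so a real report would filter them out first.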
[Figure: Reject % by month, Feb through Jan 86, plotted against a
target rate of 8%. Monthly volumes:]
Month    Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec  Jan
Volume   566   97  679  598   22  123  126  333   98   98  145   88
[Figure: Events by reliability code, per week, from 08/30/93 to
10/03/93 (weeks 1 to 5). Reliability codes shown:
DMD-OP-IBIGA-SRC-002, DMD-OP-IBIGA-SRC-003, DMD-VS-HC-001. A further
panel is labeled Phase II and Phase III.]
FAILURE/REPAIR REPORT NO.
[ ] H/W  [ ] S/W  [ ] SE  [ ] TE  [ ] OTHER
I ORIGINATOR
1. PROJECT
2. FAILURE DATE
3. FAILURE GMT: DAY ___ HR ___ MIN ___
4. LOG NO.
5. REPORTING LOCATION: [ ] SUPPLIER  [ ] OTHER
6. SUBSYSTEM
7. 1ST TIER (H/W, S/W)
8. 2ND TIER (H/W, S/W)
9. 3RD TIER (H/W, S/W)
10. 4TH TIER (H/W, S/W)
A REFERENCE DESIGNATIONS
B NOMENCLATURE
C SERIAL NUMBER
D OPERATING TIME CYCLES
11. DETECTION AREA: [ ] BENCH  [ ] ACCEPT  [ ] FAB/ASSY  [ ] QUAL
[ ] INTEG  [ ] SYSTEM  [ ] FIELD  [ ] OTHER
13. ENVIRONMENT: [ ] ACOUSTIC  [ ] VIE  [ ] SHOCK  [ ] HUMID
[ ] AMB  [ ] TEMP  [ ] EMC/EMI  [ ] OTHER
14. ORIGINATOR
15. DATE
16. COGNIZANT ENGINEER
18. ASSOCIATED DOCUMENTATION
II VERIFICATION
18. ITEM DATA: A. ITEM NAME  B. NUMBER  C. SERIAL NO.
D. CIRCUIT DESIGNATION  E. MANUFACTURER  G. DEFECT
b DATE
c HRS/MINS
d LOCATION
e WORK PERFORMED
28. SIGNATURE COG ENGR  29. SEC  30. DATE
31. SIGNATURE COG SEC ENGR  32. DATE
33. SUBSYSTEM RATING
34. SYSTEM ENGR  35. DATE
36. PROJECT RELIABILITY ASSURANCE  37. DATE
38. RISK ASSESSMENT
COPY DISTRIBUTION: WHITE PFA CENTER, CANARY PROJECT OFFICE,
PINK ACTION COPY, GOLDENROD ACCOMPANY FAILED ITEM
ANALYSIS
Failure analysis is the process that determines the root cause of the
failure. Each failure should be verified and then analyzed to the
extent that you can identify the cause of the failure and any
contributing factors. The methods used can range from a simple
investigation of the circumstances surrounding the failure to a
sophisticated laboratory analysis of all failed parts.
[Figure labels: Failure Reporting; Analysis]
[Form residue. Recoverable fields:]
USED ON: EQUIPMENT S.NO., ASSEMBLY NO., SITE NUMBER
ORIGINATOR, DATE
DEFECT DESCRIPTION (FILLED BY THE ORIGINATOR)
DISTRIBUTION: [ ] RELIABILITY  [ ] MDC  [ ] FS  [ ] QUALITY
PROBLEM ANALYSIS REPORT
PROBLEM NUMBER: ___ ___ ___ ___ ___ ___ ___
COMMENTS:
[ ] DEFINITION OF PROBLEM:
DEFINED: [ ] DATE
[ ] CONTAINMENT:
CONTAINED: [ ] DATE
[ ] ROOT CAUSE:
[ ] CORRECTIVE ACTION:
REVISION: A: [ ] B: [ ] C: [ ] D: [ ] DATE:
ENTERED IN DB:
*** IF MORE SPACE IS NEEDED, PLEASE ATTACH SEPARATE PIECE OF PAPER ***
Failure Review Board
The FRB reviews failure trends, facilitates and manages the failure
analysis, and participates in developing and implementing the
resulting corrective actions. To do these jobs properly, the FRB
must be empowered with the authority to require investigations,
analyses, and corrective actions by other organizations. The FRB
has much in common with quality circles: both are self-managed
teams directed to improve methods under their control (see
Figure 11).
[Figure: FRB process flowchart. Recoverable steps: FRB meets
regularly; review in-house and field reports; FRB confirms problem
assignments within 24 hours; decision "Sufficient information?"
(Yes/No); enter into DBMS and write PARs; branches labeled
Operator/field services and Others; establish containment and RCA
plan/schedule; update DBMS; RCA complete; begin ECN process.]
The makeup of the FRB and the scope of authority for each member
should be identified in the FRACAS procedures. The FRB is
typically most effective when it is staffed corporate-wide, with all
functional and operational disciplines within the supplier organization
participating. The user may also be represented. Members
should be chosen by function or activity to let the composition
remain dynamic enough to accommodate personnel changes within
specific activities.
* Hardware, software, process, and/or materials design, depending on the type of system
being analyzed.
Conduct trend analysis and inform management on the types
and frequency of observed failures
Review and analyze failure reports from in-house tests and field
service reports
Furthermore, each FRB member should
Actively participate
[Figure: FRACAS organization chart. Boxes: Reliability Manager;
Failure Review Board; Reliability Coordinator; Database Support;
Problem Owner (three boxes); In-house Test Inspections; Field
Operations.]
Brainstorming
Histogram
Flow chart
Pareto analysis
FMEA [4]
Trend analysis
These tools are directly associated with the analysis and
problem-solving process [5]. Advanced methods, such as statistical
design of experiments, can also be used to assist the FRB. The Canada
and Webster divisions of Xerox Corporation extensively use design of
experiments as their roadmap for problem solving [6]; they find this
method helps their FRBs make unbiased decisions.
4. Failure Mode and Effects Analysis. More information is available in Failure
Mode and Effects Analysis (FMEA): A Guide for Continuous Improvement for the
Semiconductor Industry, which is available from SEMATECH as technology
transfer #92020963A-ENG.
5. Analysis and problem-solving tools are described in Partnering for Total
Quality: A Total Quality Toolkit, Vol. 6, which is available from SEMATECH
as technology transfer #90060279A-GEN.
6. A fuller description of the Xerox process can be found in Proceedings of
Workshop on Accelerated Testing of Semiconductor Manufacturing Equipment,
which is available from SEMATECH as technology transfer #91050549AWS.
Failed part reports from each location
7. Refer to Equipment Change Control: A Guide for Customer Satisfaction, which is available
from SEMATECH as technology transfer #93011448AGEN.
An effective change control system incorporates the following characteristics.
An established equipment baseline exists with current and accurate
  - Design specs and drawings
  - Manufacturing process instructions
  - Equipment purchase specifications
Configuration Table
Events Table
Problems Table
Reports
8. Available from SEMATECH Total Quality division. Software transfer and accompanying
documentation in press.
Configuration Table
The configuration table accepts information about each machine or
software instance. There are six standard fields, as shown in the
following table. In the SEMATECH FRACAS database, there are
seven optional fields that can be customized for your purposes,
such as customer identification number, software revision, process
type, engineering change number, site location, etc.
Phase Date User input. Enter the date the machine went into
the current Life Cycle Phase.
Events Table
The primary purpose of the FRACAS application is to record
downtime events for a particular system. The fields of the Events
Table are described in Table 3. To understand these fields fully, you
need to understand the relationship between the problem fields
(System, Subsystem, Assembly, Subassembly, and/or Sub-subassembly)
and the reliability code.
The reliability code field contains one or more of the problem field
names. Data (if any) from the problem fields automatically helps
you select or create the reliability code. Once you select a code,
associated fields are updated automatically.
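The relationship between the problem fields and the reliability code can be sketched as a simple join of the level codes, top level first. This is a hypothetical illustration and not the SEMATECH application's actual logic: the function name and keyword names are invented, the choice to stop at the first blank level is an assumption, and any trailing sequence number (as in codes such as ATF-STK-PRI-RT-XAX-001) is not handled.

```python
# Problem fields in top-down order; the keyword names are illustrative.
LEVELS = ("system", "subsystem", "assembly", "subassembly", "sub_subassembly")


def build_reliability_code(**parts):
    """Join the problem fields with hyphens, stopping at the first blank level."""
    unknown = set(parts) - set(LEVELS)
    if unknown:
        raise TypeError(f"unknown problem field(s): {sorted(unknown)}")
    codes = []
    for level in LEVELS:
        value = (parts.get(level) or "").strip().upper()
        if not value:
            break  # assumption: lower levels are ignored once a level is blank
        codes.append(value)
    if not codes:
        raise ValueError("at least the System field is required")
    return "-".join(codes)


# The glossary's example: Automated Handler-Stocker-Robot-X_Axis-Motor.
print(build_reliability_code(system="AH", subsystem="STK", assembly="ROT",
                             subassembly="XAX", sub_subassembly="MOT"))
print(build_reliability_code(system="ATF", subsystem="STK"))
```

Stopping at the first blank level keeps the code a valid path down the tree; an implementation could equally reject such input as an entry error.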
Assembly User input. The optional assembly code (if the part
is at an assembly level) involved in the event.
Related to the selected System and Subsystem
codes.
TABLE 3 EVENTS TABLE FIELD DESCRIPTIONS, CONTINUED
Problems Table
In the SEMATECH implementation, the Problem Record allows you
to view, modify, or add records for problems. The fields of the
Problems Table are described in Table 4.
TABLE 4 PROBLEMS FIELD DESCRIPTIONS, CONTINUED
Report Types
Your reporting mechanism for this database should be flexible
enough to accommodate various customer types and their needs.
For example, you need to be able to pull reports for an FRB analysis
as well as for the users, sub-tier suppliers, and those who are
addressing corrective actions. At a minimum, the following reports
should be available:
GLOSSARY
Failure Cause The circumstance that induces or activates a failure.
Examples of a failure cause are defective soldering,
design weakness, assembly techniques, and software
error.
Laboratory Analysis The determination of a failure mechanism
using destructive and nondestructive laboratory
techniques such as X-ray, dissection, spectrographic
analysis, or microphotography.
derived in a tree fashion from top to bottom as
System, Subsystem, Assembly, Subassembly, and
Sub-subassembly (typically lowest replaceable
component). Codes vary in size; however, a descriptor of at
most three alphanumeric characters per level is
recommended. For example, AH-STK-ROT-XAX-MOT
for Automated Handler-Stocker-Robot-X_Axis-Motor.
Root Cause The reason for the primary and most fundamental
failures, faults, or errors that have induced other
failures and for which effective permanent corrective
action can be implemented.
Glossary References:
MIL-STD-721C Definitions of Terms for Reliability and Maintainability
(1981).
FRACAS Functions and Responsibilities

Responsibilities   Function               Description
Data Logging       Reliability            Log all failure reports. Validate failures
                                          and forms. Classify failures (inherent,
                                          induced, false alarm, etc.).
Failure Review     Failure Review Board   Determine failure trends. Prepare plans
                                          for action. Identify failures to the root
                                          cause.
Failure Analysis   Reliability and/or     Review operating procedures for error.
                   Problem Owners         Procure failed parts. Decide which parts
                                          will be destructively analyzed. Perform
                                          failure analysis to determine root cause.
                   Quality                Inspect incoming test data for item.