
D2 L5, Module-7 ABC of Product Reliability

The document discusses assigning, building, and conforming product reliability. It covers reliability fundamentals like quality vs reliability, reliability indices, reliability block diagrams, failure rates, and causes of failures. The document also discusses the challenges of quantifying human reliability.

Uploaded by

RAVI SHANKAR

Assigning, Building and Conforming

Product Reliability
(Module- 7)
By
P. NARASIMHA RAO
Retd. Scientist-G, DLRL(DRDO)
FIETE, MSEMCI, MISCA, MISOI, MIDST
Chartered Engineer (IETE)
IPMA Certified Project Management Professional
STQC Certified Reliability Professional
Adjunct Faculty, NI-MSME
Mentor, BYST (CII)
at a 5-Day Course on
“Design Thinking and New Product Innovation
in the DRDO Context”
(9 – 13 October 2023)
Organised by:
Institute of Defence Scientists & Technologists
DLRL Campus, Hyderabad – 500 005.
Agenda

• Refreshing the fundamentals
• Assigning Reliability
• Building Reliability
• Conforming Reliability
• Conclusion
• Discussions
Quality and Reliability

Quality is the degree of conformance to applicable specifications and workmanship standards. It is associated with manufacturing.

Reliability is the probability that a unit performs its intended function adequately for a given period of time under the stated operating conditions or environment. It is primarily associated with design.

More simply, Reliability is the ability of the unit to maintain its Quality under specified conditions for a specified time.

Unlike Quality, Reliability is measurable, and several metrics have been defined to express it. They include failure rate, MTBF, MTTF, MTTR and so on.

A good-quality product need not always be reliable. (A high-quality smartphone may not work reliably when you are travelling in a train.)

And sometimes a reliable product may not be of good quality. (The walkie-talkie sets used by the guard and the driver of a train are definitely not of good quality, but they work reliably.)
Reliability Indices
Reliability is the probability that a unit performs its intended function adequately for a given period of time under the stated operating conditions or environment.
Reliability R(t) of a unit under test (UUT) can be mathematically expressed as:
R(t) = e^{-λt}, where λ = failure rate = 1/MTBF and t = mission time.
The reliability Rsys of a complex system whose constituent units are all in series is the product of the individual reliabilities R1, R2, R3, ..., Rn of its constituent units:
Rsys = R1 * R2 * R3 * ... * Rn
Mean Time Between Failures (MTBF) is the average time for which the equipment performs its intended functions between failures, i.e. the productive time divided by the number of failures during that time.
Mean Time to Repair (MTTR) is the average time to correct a failure and return the equipment to a condition where it can perform its intended function. It is the sum of all repair time (elapsed time) incurred during a specified period (including equipment and process test time, but not including maintenance delay), divided by the number of failures during that period.
Failure Rate is the ratio of the number of failures reported / experienced by a device to the total equipment operating time. Failure rate is the reciprocal of the mean time between failures and is typically expressed in failures per million hours.
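The exponential relation between MTBF, mission time and reliability can be sketched in a few lines of Python. This is a minimal sketch that assumes a constant failure rate; the function names and the example figures (an MTBF of 5000 hours and a 100 hour mission) are illustrative assumptions, not values from the course material.

```python
import math


def reliability(mtbf_hours: float, mission_hours: float) -> float:
    """Return R(t) = exp(-lambda * t), with lambda = 1 / MTBF."""
    if mtbf_hours <= 0 or mission_hours < 0:
        raise ValueError("MTBF must be positive and mission time non-negative")
    failure_rate = 1.0 / mtbf_hours  # failures per hour
    return math.exp(-failure_rate * mission_hours)


def failures_per_million_hours(mtbf_hours: float) -> float:
    """Express the failure rate in failures per million operating hours."""
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    return 1.0e6 / mtbf_hours


if __name__ == "__main__":
    # Hypothetical unit: MTBF of 5000 h, mission time of 100 h.
    print(f"R(100 h) = {reliability(5000, 100):.4f}")
    print(f"Failure rate = {failures_per_million_hours(5000):.0f} per million hours")
```

Note that when the mission time equals the MTBF, the reliability is e^{-1}, roughly 0.37, so a unit is more likely than not to have failed by the time its MTBF has elapsed.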
Reliability Block Diagram (RBD)

A Reliability Block Diagram (RBD) is a graphical representation of the constituents of a product that indicates how they are connected reliability-wise (not necessarily physically). Each block of an RBD can have its own reliability block diagram. The level of granularity is based on the availability of data and the lowest-actionable-item concept.

Reliability engineers draw the high-level reliability block diagram as a series system with all its essential minimum constituents (unavoidable single-point-failure items) in place. Such a series system will fail if any one of its constituents fails.

[R1] -- [R2] -- [R3] -- ... -- [Rn]

The reliability Rsys of such a series system is the product of the individual reliabilities R1, R2, R3, ..., Rn of all its constituent units:

Rsys = R1 * R2 * R3 * ... * Rn

Since each Ri ≤ 1, Rsys ≤ Ri for all 1 ≤ i ≤ n: a series system is never more reliable than its weakest constituent.
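The series rule can be checked numerically with a short sketch. The unit reliabilities below are hypothetical, and the function name is an assumption made for illustration.

```python
from math import prod


def series_reliability(unit_reliabilities):
    """Reliability of a series RBD: the product of the unit reliabilities."""
    units = list(unit_reliabilities)
    if any(not 0.0 <= r <= 1.0 for r in units):
        raise ValueError("each reliability must lie between 0 and 1")
    return prod(units)


if __name__ == "__main__":
    units = [0.99, 0.98, 0.95]  # hypothetical R1, R2, R3
    r_sys = series_reliability(units)
    # The system figure never exceeds the weakest unit.
    print(f"Rsys = {r_sys:.4f}, weakest unit = {min(units):.2f}")
```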


Reliability of redundant systems
Types of redundancy
S/N | Type | Description | Examples
1 | No redundancy | Any block in the series RBD | Motorcycle without any spare tyre
2 | Standby redundancy | Only one unit works at a time; on failure, the failure is sensed and the unit is replaced with the redundant unit | Scooter or LMV with a spare tyre; maintaining spare LRUs and SRUs on line
3 | Active redundancy (also called hot standby) | Each one of the elements performs the required function. If one element fails, the other element provides the function | Rear wheels of a truck or HMV; receivers in a multi-channel receiver
4 | Voting redundancy | A special case of active redundancy where some decision making is involved in pressing the redundant unit into service after the working unit fails. The reliability of this decision making also plays a role | Travel agencies deputing transport vehicles on duty; antenna switching units
5 | Hybrid redundancy | Systems using one or more redundant configurations | Trucks and HMVs carrying spare tyres
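The table does not give a formula for redundant blocks. As a hedged illustration, the sketch below uses the standard result for active redundancy with independent units, where the block fails only if every unit fails. The independence assumption, the function name and the example values are not from the slides.

```python
from math import prod


def active_redundancy_reliability(unit_reliabilities):
    """Reliability of units in active (parallel) redundancy.

    Assumes independent failures and that any single surviving unit
    is enough to provide the function.
    """
    units = list(unit_reliabilities)
    if not units:
        raise ValueError("at least one unit is required")
    # The block fails only when all units fail.
    return 1.0 - prod(1.0 - r for r in units)


if __name__ == "__main__":
    # Two hypothetical units of reliability 0.90 each.
    print(f"{active_redundancy_reliability([0.90, 0.90]):.4f}")
```

Standby and voting redundancy need additional terms for the sensing, switching or decision-making element, which this sketch deliberately leaves out.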
Conventional bathtub curve
Anomalies in the bathtub curve

• The conventional bathtub curve has certain anomalies when used to depict the failure rate of a single item. It describes the relative failure rate of an entire population of products over time. Some individual units will fail relatively early (infant mortality failures); others will last until wear-out.

• The bathtub curve is typically used as a visual model to illustrate the three key periods of product failure; it is not calibrated to depict a graph of expected or actual behavior for a particular product family. It is rare to have enough short-term and long-term failure information to actually model a population of products with a calibrated bathtub curve.
Bathtub Curve
Revised for the R&D (first time developed) products
Causes of Failures / Defects
S/N | Cause | Distribution | Precautions
1 | Design Failures | 9% | Reliability Engineering & Design Reviews
2 | Software Failure | 9% | Review of Requirement Specs; Verification & Validation of s/w design, development, coding, testing, documentation
3 | Hardware Failure (Part Defect) | 22% | Component (Hardware) Selection and screening
4 | Process Failure (Manufacturing Defect) | 15% | Quality Control of Fabrication sequence and fabrication process; Quality Audits on the Fabrication Facilities
5 | System Management | 4% | Quality Procedures, Quality Documentation, Quality Testing / Acceptance before delivery
6 | Wear out | 9% | Operational De-rating
7 | No visible, identified Defects (random failures) | 12% | Extensive BIT (Built-in Test) Mechanism
8 | Induced (vandalism, inter-system disturbances) | 20% | Fool-proof Design, Implementation of Fault Tolerance mechanisms
Reliability Dilemma

Electronic Reliability: Mature
• Derating Guidelines
• Computerized Predictions
• Part Screening Techniques
• Reliability Design Guides
• EE Reliability Engineers

Mechanical Reliability: Maturing
• Sensory
  - See it
  - Feel it
  - Hear it

Human Reliability
• Operator error is inevitable
• Human error rates are high
• Documented rates are often wrong, often misapplied
• Error rate “Controls” are available, e.g.:
  – Training / retraining / drilling
  – Using checklists / inspectors / backups
  – Using Performance Shaping Factors
  – Exploiting “stereotypical” behavior
• BUT: Person-to-person variability precludes “hard rules.”

It is easy for someone to ask “What is the reliability of your system?” But it is very difficult for them to quantify the reliability figure they expect.

The field of Human Factors is a complex one with many aspects. Much has been studied. Much has been documented. Certainties are few, save one: Human Operators WILL ERR!
Reliability Engineering
[Flow diagram. Box labels: Reliability Goal; Proposed System Configuration; 1st Level RBD (all in series); Reliability Estimation; FMECA at module level; FTA at module level; Revised RBD; Revised RBD (with redundancy); Parts Stress Analysis; Reliability apportionment at module level; BOM of each module; Reliability Parts Count Method; FMECA / FTA of components; Reliability improvement by change / replacement of components]
Assigning Reliability – A case study of COMINT System

[Block diagram. Blocks: System Controller (within MARS LRU); COMINT/OWS Simulator; CSM LAN Switch (Gigabit Ethernet); HLSC; Scan-DF (3 LRUs); MARS (1 LRU); WBSAR (3 LRUs)]

Scan-DF (SDF): Performs the search and direction finding function in the V/UHF band.
MARS (Monitoring, Analysis and Recording Subsystem): Performs the monitoring, analysis and recording of communication signals in the V/UHF band.
WBSAR (Wideband Surveillance and Recording): Performs interception and recording of wideband communication signals in the microwave band.
System Controller: Commands and controls all the COMINT subsystems and also communicates with the Higher Level System Controller for accomplishing the mission.
Assigning Reliability
Given that Reliability Goal (of COMINT System) = 0.75
Sl No | Name of the Subsystem | Functionality | Number of Subsystems per system
1 | Scan-DF (SDF) | Performs the search and direction finding function in the V/UHF band | 1
2 | MARS (Monitoring, Analysis and Recording Subsystem) | Performs the monitoring, analysis and recording of communication signals in the V/UHF band | 1
3 | WBSAR (Wideband Surveillance and Recording Subsystem) | Performs interception and recording of wideband communication signals in the microwave band | 1
4 | COMINT Controller | Commands and controls all the COMINT subsystems and also communicates with the HLSC for accomplishing the mission | 1
5 | COMINT LAN Switch | Provides intra- and inter-system connectivity among the subsystems within and outside the COMINT | 1
Total Number of Functional Subsystems: 5
If we distribute the reliability uniformly among all 5 functional subsystems, each subsystem is expected to have a minimum reliability of (0.75)^{1/5}, which equals 0.9440875113.
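The fifth-root figure follows directly from the series model. The short derivation below assumes the five subsystems are in series and receive equal shares; the symbols R_goal, R and n are introduced here only for illustration.

```latex
R_{sys} = \prod_{i=1}^{n} R_i = R^{\,n} \ge R_{goal}
\quad\Longrightarrow\quad
R \ge R_{goal}^{1/n},
\qquad
R \ge (0.75)^{1/5} \approx 0.9441 \quad (n = 5)
```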
Assigning Reliability
Given that Reliability Goal (of CSM in AEW&C) = 0.75
Sl No | Functional Subsystems | Number of Modules
1 | Scan-DF Subsystem |
1.1 | V/UHF Antenna Array Elements (for DF) | 5 (array)
1.2 | DF Receivers (5 Channel) | 5
1.3 | DF Synthesizer Module | 1
2 | MARS (Monitoring, Analysis and Recording Subsystem) |
2.1 | V/UHF Antenna Element (Monitoring) | 1
2.2 | Monitoring Receivers (2 Channel) | 2
2.3 | Analysis and Recording Module (ARM) | 2
3 | WBSAR (Wideband Surveillance and Recording Subsystem) |
3.1 | Microwave Antenna Element (Surveillance) | 1
3.2 | Microwave Receiver | 1
3.3 | Wideband Recorder | 1
4 | COMINT Controller (CC) Module | 1
5 | COMINT LAN Switch | 1
Total Number of Modules/Packages: 21
If we distribute the reliability uniformly among all 21 modules / packages, each module / package is expected to have a minimum reliability of (0.75)^{1/21}, which equals 0.9863942.
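The module count and the apportioned figure can be reproduced with a small sketch. The module names and counts are copied from the table; the function name is an assumption, and the calculation presumes all modules are in series with equal shares.

```python
# Module counts taken from the table above.
MODULE_COUNTS = {
    "V/UHF Antenna Array Elements (for DF)": 5,
    "DF Receivers (5 Channel)": 5,
    "DF Synthesizer Module": 1,
    "V/UHF Antenna Element (Monitoring)": 1,
    "Monitoring Receivers (2 Channel)": 2,
    "Analysis and Recording Module (ARM)": 2,
    "Microwave Antenna Element (Surveillance)": 1,
    "Microwave Receiver": 1,
    "Wideband Recorder": 1,
    "COMINT Controller (CC) Module": 1,
    "COMINT LAN Switch": 1,
}


def equal_apportionment(goal: float, n_items: int) -> float:
    """Minimum reliability of each of n series items so that their product meets the goal."""
    if not 0.0 < goal <= 1.0 or n_items < 1:
        raise ValueError("goal must be in (0, 1] and n_items at least 1")
    return goal ** (1.0 / n_items)


if __name__ == "__main__":
    n_modules = sum(MODULE_COUNTS.values())
    per_module = equal_apportionment(0.75, n_modules)
    print(f"{n_modules} modules, each needs R >= {per_module:.7f}")
```

Going from 5 subsystems to 21 modules raises the per-item requirement from about 0.944 to about 0.986, which is why finer granularity makes each individual target harder to meet.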
Types of Reliability Analyses
S.No | Type | Description
1 | Failure Mode Effects Analysis (FMEA) | Examines most potential failure modes of a system or equipment and their causes in order to determine the effect of each failure mode on operation. FMEA is also used to identify design modifications, if any are needed. It provides a basis for troubleshooting procedures and for built-in diagnostic features.
2 | Failure Mode Effects and Criticality Analysis (FMECA) | Developed by NASA for space program hardware, it is an advanced design review technique used to eliminate design weaknesses. FMECA is primarily used to avoid costly modifications through identification of potential failure modes by considering operational and maintenance requirements.
3 | Fault Tree Analysis (FTA) | FTA uses deductive logic and focuses on the total system. It assumes a system failure and determines the possible causes or fault occurrences at the lower level that directly or indirectly contribute to the major fault or undesired event.
Important: It may be noted that FMEA follows a bottom-up approach while FTA adopts a top-down approach. Both FMEA (or FMECA) and FTA are to be performed in the initial design phase of product development.
Analysis and Identification of Single Point Failures
Sl. No. | Package / Subsystem | Failure Mode | Impact / Effect | Classification | Criticality
1 | 1st of 5 DF Antennae | Disconnect Failure | No signal for 1st DF channel | Degradation | DF inaccuracy
2 | 2nd of 5 DF Antennae | Disconnect Failure | No signal for 2nd DF channel | Degradation | DF inaccuracy
3 | 3rd of 5 DF Antennae | Disconnect Failure | No signal for 3rd DF channel | Degradation | DF inaccuracy
4 | 4th of 5 DF Antennae | Disconnect Failure | No signal for 4th DF channel | Degradation | DF inaccuracy
5 | 5th of 5 DF Antennae | Disconnect Failure | No signal for 5th DF channel | Degradation | DF inaccuracy
6 | V/UHF Monitoring Antenna | Disconnect Failure | No signals for 2 Monitor channels | Degradation | Monitoring, analysis and recording affected
7 | Microwave Antenna | Disconnect Failure | DF inaccuracy in elevation | Degradation | Wideband surveillance and recording function affected
Analysis and Identification of Single Point Failures
Sl. No. | Package / Subsystem | Failure Mode | Impact / Effect | Classification | Criticality
8 | 1st of five DF Rxs | Disconnect Failure | No DFing on 1st channel | Degradation | DF inaccuracy
9 | 2nd of five DF Rxs | Disconnect Failure | No DFing on 2nd channel | Degradation | DF inaccuracy
10 | 3rd of five DF Rxs | Disconnect Failure | No DFing on 3rd channel | Degradation | DF inaccuracy
11 | 4th of five DF Rxs | Disconnect Failure | No DFing on 4th channel | Degradation | DF inaccuracy
12 | 5th of five DF Rxs | Disconnect Failure | No DFing on 5th channel | Degradation | DF inaccuracy
13 | DF Synthesizer | Disconnect Failure | No DFing on any channel | Degradation | Mission failure for DF (critical)
14 | 1st of two Mon. Rxs | Disconnect Failure | No Monitoring on 1st channel | Degradation | Monitoring, analysis and recording function affected in one channel
15 | 2nd of two Mon. Rxs | Disconnect Failure | No Monitoring on 2nd channel | Degradation | Affected in other channel
Analysis and Identification of Single Point Failures
Sl. No. | Package / Subsystem / Item | Failure Mode | Impact / Effect | Classification | Criticality
16 | Analysis & Recording Module | Disconnect Failure | No analysis of narrowband signals | Degradation | Analysis and recording function affected
17 | Microwave Receiver | Disconnect Failure | No interception of Mw signals | Degradation | Wideband signal interception affected
18 | Wideband Recorder | Disconnect Failure | No recording of Mw signals | Degradation | Wideband recording affected
19 | Signal Conditioner Unit | Disconnect Failure | No signal input to mw receiver | Degradation | Loss of WBSAR function
20 | Antenna Switching Unit | Disconnect Failure | No signal input to DF receivers | Degradation | Loss of DF function
21 | CSM Controller | Disconnect Failure | No control of CSM subsystems | Mission Critical | CSM mission failure
Targeting the Components for Reliability improvement
1. Assign the system reliability RSYS to all systems.
2. Choose the system RSYS1 whose reliability is less than RSYS.
3. Assign the subsystem reliability RSS to all subsystems of RSYS1.
4. Choose the subsystem RSS1 whose reliability is less than RSS.
5. Assign the subassembly reliability RSA to all subassemblies of RSS1.
6. Choose the subassembly RSA1 whose reliability is less than RSA.
7. Assign the component reliability RC to all components of RSA1.
8. Choose the component RC1 whose reliability is less than RC.
9. Improve the reliability of component C1 by replacing it or adding redundancy.
10. Recalculate the reliabilities at all levels (RSA1, RSS1 and RSYS1) and repeat steps 1 through 9 till RCi ≈ RC, RSAj ≈ RSA, RSSk ≈ RSS and RSYSl ≈ RSYS for all values of i, j, k and l.
11. Thus achieve the overall system reliability goal.

[Diagram labels. System level: RSYS1 < RSYS, RSYS2 ≈ RSYS, RSYS2 > RSYS. Subsystem level: RSS1 << RSS, RSS2 ≈ RSS, RSS2 >> RSS. Subassembly level: RSA1 << RSA, RSA2 >> RSA. Component level: RC1 << RC, RC2 >> RC]
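The drill-down in steps 1 to 8 can be sketched as a walk down a tree that always follows the weakest branch. This is only an illustrative sketch: the `Node` class, the `weakest_path` function, the assumption that children are in series, the equal apportionment at each level, and the example reliabilities are all assumptions, not part of the course material.

```python
from dataclasses import dataclass, field
from math import prod
from typing import List, Tuple


@dataclass
class Node:
    """A system, subsystem, subassembly or component in the hierarchy."""
    name: str
    reliability: float = 1.0  # only meaningful for leaf components
    children: List["Node"] = field(default_factory=list)

    def value(self) -> float:
        """Reliability of this node, assuming its children are in series."""
        if not self.children:
            return self.reliability
        return prod(child.value() for child in self.children)


def weakest_path(root: Node, goal: float) -> Tuple[List[str], float]:
    """Follow the weakest child at every level down to a component.

    Returns the path of names and the reliability apportioned to the
    component at the end of that path.
    """
    node, target, path = root, goal, [root.name]
    while node.children:
        # Equal apportionment among the children is an assumption of this sketch.
        target = target ** (1.0 / len(node.children))
        node = min(node.children, key=lambda child: child.value())
        path.append(node.name)
    return path, target


if __name__ == "__main__":
    # Hypothetical two-level system.
    system = Node("SYS", children=[
        Node("A", children=[Node("a1", 0.99), Node("a2", 0.90)]),
        Node("B", children=[Node("b1", 0.99), Node("b2", 0.98)]),
    ])
    path, target = weakest_path(system, goal=0.75)
    print(" -> ".join(path), f"(apportioned target {target:.4f})")
```

After improving the component found this way (step 9), the same function can be called again on the updated tree, which corresponds to the repetition in step 10.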
Building Reliability

Product Assurance

Design
• Reliability Engg.
• Design Review

Parts, Material & Process control
• Parts Reliability
• Material Certification
• Process Qualification

Fabrication & manufacturing
• Mech. Drawing Review
• Fabrication Sequence
• Online QC
• Quality Audit

Test & Evaluation
• Sub-System Level
• Test matrix
• Integrated payload testing

Non-Conformance Management
• Material Review Board
• T&E Committee
• Configuration Change Control Mgt.
Building Reliability in the Design
Design Margins (Adding six Sigma Flavor)

Successful designs have sufficient design margins with respect to specifications and tolerances, so that manufacturing imperfections have little effect on the final performance of the product.

Choice of components/parts

The components or parts used in the realization of the product shall first meet the functional specifications, and then the reliability metrics in line with the operational requirements. A good Parts, Materials / Process Control Plan should be worked out as per the applicable and customized standards and shall be strictly adhered to.

Component De-rating

Selected components shall be subjected only to de-rated values (always less than the rated stress value) of power, voltage, temperature, memory and clock frequency.
Building Reliability in the Design

Fool Proof Design

The design shall be fool-proof to prevent any intentional or unintentional vandalism. This should be achieved by providing sufficient inter-locking mechanisms, both in hardware and in the user interfaces, to protect the product from permanent failures during its usage by untrained or unauthorized operators.

Choice of Manufacturing processes (As per the applicable standards)

The choice and sequence of the manufacturing processes (mechanical, chemical) should be strictly as per the applicable standards, which are specific to the particular building block (components / devices / subassemblies) of a product. Strict online quality control and quality audits shall be in place to ensure the same.
Design Margins (Adding six Sigma Flavor)
The essence of Six Sigma is that the ability to produce defect-free products lies in keeping the process width (variation / tolerance), centered on the target (the mean of the USL and LSL), much smaller (6 times less) than the specification width (range). In other words, once the process width has been minimized as far as possible, the best way to achieve Six Sigma is to increase the specification width, and thereby the design margins, by extending the range or tolerance on either side of the specification limits.
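One common way to put numbers on the relationship between process width and specification width is the process capability index Cp together with the sigma level. Both are standard definitions, but using them here is an assumption, since the slides do not define a metric. The limits and standard deviation in the example are hypothetical.

```python
def process_capability(usl: float, lsl: float, sigma: float) -> float:
    """Cp = specification width / process width, with process width taken as 6 * sigma."""
    if usl <= lsl or sigma <= 0:
        raise ValueError("need usl > lsl and sigma > 0")
    return (usl - lsl) / (6.0 * sigma)


def sigma_level(usl: float, lsl: float, mean: float, sigma: float) -> float:
    """Number of standard deviations between the process mean and the nearer limit."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return min(usl - mean, mean - lsl) / sigma


if __name__ == "__main__":
    # Hypothetical process centered at 6.0 with sigma = 1.0.
    print(process_capability(usl=12.0, lsl=0.0, sigma=1.0))
    print(sigma_level(usl=12.0, lsl=0.0, mean=6.0, sigma=1.0))
    # Widening the specification (larger design margin) with the same process.
    print(process_capability(usl=14.0, lsl=-2.0, sigma=1.0))
```

For a fixed process spread, widening the specification limits raises Cp, which matches the point made above about increasing design margins.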
Maintenance – Design for Maintainability
Maintenance refers to all actions performed to keep a product in, or bring it back to, a specified working condition so that it is available for use by the user. Maintenance helps in extending the normal useful life period (shown at the bottom of the bathtub curve) of a product.

While preventive (proactive) maintenance involves the periodic or non-periodic overhauling of the product during its non-operational hours, corrective (reactive) maintenance is taken up once a problem or defect in the product is reported.

A good design should make its preventive (proactive) maintenance simple and easy enough to be carried out by the user himself, who is neither skilled nor trained for it.

A good design shall cater for the quickest possible corrective maintenance by providing easy access to the inside of the product.

A good maintenance doctrine helps in enhancing the ‘availability’ of a product for its intended use by the user whenever he needs it.
Availability
Availability is a measure of the degree to which an item is in the operable
and committable state at the start of a mission when the mission is called for at an
unknown time.
Sl No | Measure | Description | Mathematical Representation
1 | Inherent Availability (Ai) | This is the ideal state for analyzing availability. This measure takes into account only corrective maintenance and assumes that repair begins immediately upon the failure of the system. | MTBF / (MTTR + MTBF)
2 | Achieved Availability (AA) | Achieved availability is somewhat more realistic and takes into account both preventive and corrective maintenance, assuming that no time is lost in beginning the maintenance action. | MTBMA / (MTBMA + MMT)
3 | Operational Availability (Ao) | This is what generally occurs in practice. Operational availability takes into account that maintenance response is not instantaneous, that repair parts may not be in stock, and other logistic issues. | MTBMA / (MTBMA + MDT)
MTBMA: Mean Time Between Maintenance Actions (both preventive and corrective); MMT: Mean Maintenance Action Time; MDT: Mean Down Time.
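The three availability ratios translate directly into code. The sketch below is illustrative; the function names and the example hours are assumptions, while the formulas are those given in the table.

```python
def _uptime_fraction(up_time: float, down_time: float) -> float:
    """Shared ratio up / (up + down) used by all three measures."""
    if up_time <= 0 or down_time < 0:
        raise ValueError("up time must be positive and down time non-negative")
    return up_time / (up_time + down_time)


def inherent_availability(mtbf: float, mttr: float) -> float:
    """Ai = MTBF / (MTTR + MTBF): corrective maintenance only."""
    return _uptime_fraction(mtbf, mttr)


def achieved_availability(mtbma: float, mmt: float) -> float:
    """AA = MTBMA / (MTBMA + MMT): preventive and corrective maintenance."""
    return _uptime_fraction(mtbma, mmt)


def operational_availability(mtbma: float, mdt: float) -> float:
    """Ao = MTBMA / (MTBMA + MDT): includes logistic and response delays."""
    return _uptime_fraction(mtbma, mdt)


if __name__ == "__main__":
    # Hypothetical figures, all in hours.
    print(f"Ai = {inherent_availability(mtbf=500, mttr=2):.4f}")
    print(f"AA = {achieved_availability(mtbma=300, mmt=3):.4f}")
    print(f"Ao = {operational_availability(mtbma=300, mdt=12):.4f}")
```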
Availability and its Dependence on
Reliability & Maintainability

Reliability | Maintainability | Availability
Constant | Decreases | Decreases
Constant | Increases | Increases
Increases | Constant | Increases
Decreases | Constant | Decreases


Design Reviews
Aspect | System Design Review | Package Design Review
Inputs Required | Technical Requirements; Functional Requirements; Operational Requirements; Input / Output specifications; Platform considerations; Environmental Specifications / Conditions | Conceptual Drawings; Materials used; Components Details (CoCs); Processes used; Environmental Specifications; Structural / Thermal analysis
Considerations | Feasibility / Functional Block Diagram; Heritage / Technology availability; Infrastructure / Test facilities availability; Implementation Methodology; Weight / Power Budgets; Time Schedules / Funds; Qualification / Certification Plan | Dynamic Stresses; Climatic Stresses; EMI/EMC; Weight Budgets; Power Budgets
Review by | Subject Experts; Reliability Expert; User Group; QA Group; Design Group | Reliability Expert; QA Group (Electrical); QA Group (Mechanical); Mechanical Fabrication Group; Quality Control (QC) Group; Design Group
Fabrication / Manufacturing
Fabrication drawing Review / Approval
All fabrication drawings generated by the design group for all items shall be reviewed by QA for adequacy and correctness of BOM / processes, thereby avoiding rejection during fabrication.
Fabrication Sequence approval
The fabrication sequence prepared by the design group for all items shall be reviewed and approved by QC before the start of their fabrication. The fabrication sequence shall be prepared for smooth flow of assembly and to avoid rework / rejection due to improper assembly procedure.
Online Quality Control
Online inspection of the various fabrication activities at the workplace shall be carried out on a 100% basis as per the approved guidelines / standards. Non-conformances and deficiencies, if any are identified, shall be corrected before the UUT is allowed to proceed for further processing.
Quality Audits
Quality audits shall be carried out on a regular basis for all components / subsystems / packages to verify whether the quality activities in place comply with the planned arrangements and to determine the effectiveness of the quality system.
Conforming to Product Reliability
The word ‘conforming’ means that a product shall act or perform in accordance with a set of standards, expectations or specifications.
The best and only way to establish that a product conforms to such expected performance is to test it against the defined set of standards, expectations or specifications.
To establish conformance to product reliability, the following types of tests need to be conducted.

S/N | Nature of Tests | Purpose of the test
1 | Performance (Functional tests) | To ensure and establish that the product meets its performance / functional requirements / specs.
2 | Qualification Tests | To ensure and establish that the product performs to its specifications in its intended operating environment.
3 | Highly Accelerated Tests | To precipitate hardware failures, if any, that could occur in field use during the product life cycle.
4 | Reliability Demonstration Tests | To establish quantitatively that the product meets its reliability goals.
5 | Reliability Accounting Tests | To account for the reliability of the product against its design goals / objectives.
Relevance / Applicability of Tests at Various Levels of
Product Assurance
Sl No | System Architecture Level | Constituents | Design Qualification Tests: HW Level | Design Qualification Tests: SW Level | Design Qualification Tests: Func. Level | Product Acceptance Tests: HW Level | Product Acceptance Tests: SW Level | Product Acceptance Tests: Func. Level
1 | System | Subsystems | Physical Availability; Electrical Inter-connectivity | Performance Verification Checks | Functional Demo | Physical Availability; Electrical Inter-connectivity | Performance Verification Checks | Functional Demo
2 | Subsystem | LRUs | Physical Availability; Electrical Inter-connectivity | Performance Verification Checks | Functional Demo | Physical Availability; Electrical Inter-connectivity | Performance Verification Checks | Functional Demo
3 | LRU | SRUs | Physical; Mechanical; ETS Tests; EMI/EMC; Electrical | As per SQA | Functional Tests | Physical; Mechanical; ESS Tests; Electrical | As per SQA | Functional Tests
Relevance / Applicability of Tests at Various Levels of
Product Assurance
Sl No | System Architecture Level | Constituents | Design Qualification Tests: HW Level | Design Qualification Tests: SW Level | Design Qualification Tests: Func. Level | Product Acceptance Tests: HW Level | Product Acceptance Tests: SW Level | Product Acceptance Tests: Func. Level
4 | SRU | Boards | Physical; Mechanical; ETS Tests; EMI/EMC; Electrical | As per SQA | Functional Tests | Physical; Mechanical; ESS Tests; Electrical | As per SQA | Functional Tests
5 | Board | Devices / Components | Physical; Screening; COC’s; Electrical | Not Done | Not Done | Physical; Screening; BOM/MDI; Electrical | Not Done | Not Done
6 | Device | Components | Physical; Screening; Datasheets; Electrical | Not Done | Not Done | Physical; Screening; BOM/MDI; Electrical | Not Done | Not Done
7 | Component | - | Physical; Screening; Datasheets | Not Done | Not Done | Physical; Screening; BOM | Not Done | Not Done
Highly Accelerated Life Testing (HALT)
What?
• HALT is a method of surfacing hardware design (not paper design) flaws and potential process flaws more quickly and effectively in order to undertake corrective action.
• HALT is an operational test that runs a product far in excess of its specification limits to determine the design margin.
• HALT is discovery testing, whereas QT is success testing. Thus HALT is a total paradigm shift.
Why?
• To design and develop a fault-free and failure-free product in a cost-effective manner.
• To achieve multi-fold improvement in product reliability.
• To achieve time compression in launching the product into the market.
Philosophy
If a flaw / weakness is found in HALT, it is probably relevant. Every flaw / weakness found represents an opportunity for improvement. Opportunities not taken will probably lead to field failures, which are much more expensive than the improvement would have been. HALT is proactive, but no action means no improvement.
Highly Accelerated Stress Screening(HASS)
What?
• HASS is an accelerated aging process to eliminate infant mortality failures without removing a significant amount of the product’s life.
• HASS is an enhanced ESS using the highest possible stresses in order to attain time compression in the screens.
• HASS is the process for precipitating and detecting relevant defects which would show up in normal use.
Why?
• To precipitate relevant defects from latent to patent at minimum cost and in minimum time.
• Used to detect / correct process and design changes.
• To prevent the manufacturing and shipment of defective units by giving early feedback.
• To achieve a reduction in the total cost of production, screening, maintenance and warranty.
Philosophy
HASS is not a test. It is a production process. It is used for 100% screening of production items. The stresses applied are higher than those experienced by the product in normal use.
Reliability Testing
What?
• A method of testing the system in a simulated operational life-cycle environmental profile.
• A method of determining compliance with the quantitative reliability requirements of a system in the absence of field performance data.
Why?
• To verify the potential reliability characteristics of any system estimated by prediction techniques.
• To improve confidence in the reliability performance of the hardware.
• To provide measured reliability data as inputs for estimates of operational readiness, task accomplishment, maintenance and logistics support.
Philosophy
Overstress is a valid way to accelerate the discovery of deficiencies and defects, but it is not a valid means of compressing test time when reliability is to be measured. Ideal testing for reliability is seldom practical and cost-effective.
Categories of Reliability Testing
S.No | Name of the Test | Description
1 | Reliability Qualification Testing (RQT) | Reliability Qualification Testing is carried out for reliability design qualification and is performed on pre-production or initial production hardware to determine design compliance with the specified reliability requirements.
2 | Production Reliability Acceptance Testing (PRAT) | Production Reliability Acceptance Testing is a periodic series of tests to indicate continuing production of acceptable equipment; it is used to assure individual item or lot compliance with reliability requirements.
3 | Reliability Compliance Testing (RCT) | This test is an experiment used to show whether or not the value of a reliability characteristic of an item / component meets its stated reliability requirement. This test is used as a condition of acceptance of the units by the customer.
4 | Reliability Determination Testing (RDT) | This test is an experiment used to determine the value of a reliability characteristic of an item / component. This test is normally used to provide information where a specific reliability requirement has not been stated.
Reliability Testing – Parameters measured
S.No | Name of the Parameter | Description
1 | Upper Test MTBF (θ0) | An effective test plan will ‘accept’, with high probability, equipment with a true MTBF which approaches or exceeds θ0.
2 | Lower Test MTBF (θ1) | An effective test plan will ‘reject’, with high probability, equipment with a true MTBF which approaches or is less than θ1.
3 | Discrimination ratio (ϒ) | Indicates the capacity of the test to discriminate between good and bad equipment.
4 | Producer’s Risk (α) | Probability of ‘rejecting’ equipment which has a true MTBF equal to the Upper Test MTBF (θ0).
5 | Consumer’s Risk (β) | Probability of ‘accepting’ equipment which has a true MTBF equal to the Lower Test MTBF (θ1).
Important: Reliability tests are either failure-terminated or time-terminated. Failure-terminated tests are performed on a sample (of that particular production batch) and give realistic figures for the reliability test parameters. Time-terminated tests are performed over a large population and present less realistic figures for the reliability test parameters. It is important to know whether the reliability test parameters presented by a manufacturer are derived from time-terminated tests or failure-terminated tests.
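How the five parameters interact can be illustrated for a simple time-terminated plan: run for a fixed total test time and accept if no more than a given number of failures occur. The sketch below assumes exponentially distributed times between failures (so the failure count is Poisson) and takes the discrimination ratio as θ0/θ1, its usual definition. The function names, test time and acceptance number are hypothetical.

```python
import math


def accept_probability(true_mtbf: float, test_hours: float, max_failures: int) -> float:
    """P(accept) for a time-terminated test: accept if failures <= max_failures.

    Assumes exponentially distributed times between failures, so the number
    of failures in the test time is Poisson with mean test_hours / true_mtbf.
    """
    if true_mtbf <= 0 or test_hours < 0 or max_failures < 0:
        raise ValueError("invalid test plan parameters")
    expected = test_hours / true_mtbf
    return sum(
        math.exp(-expected) * expected ** k / math.factorial(k)
        for k in range(max_failures + 1)
    )


def plan_risks(theta0: float, theta1: float, test_hours: float, max_failures: int):
    """Return (discrimination ratio, producer's risk alpha, consumer's risk beta)."""
    alpha = 1.0 - accept_probability(theta0, test_hours, max_failures)
    beta = accept_probability(theta1, test_hours, max_failures)
    return theta0 / theta1, alpha, beta


if __name__ == "__main__":
    # Hypothetical plan: theta0 = 200 h, theta1 = 100 h, 300 test hours, accept on <= 2 failures.
    ratio, alpha, beta = plan_risks(200.0, 100.0, 300.0, 2)
    print(f"discrimination ratio = {ratio:.1f}, alpha = {alpha:.3f}, beta = {beta:.3f}")
```

Lengthening the test or tightening the acceptance number trades one risk against the other, which is why both risks are quoted together with the two test MTBFs.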
Defect Management – An Example & Case Study
Problem Definition : Electronic packaging of modules with different rated operating
temperature ranges. One SRU is specified for –10 °C to +50 °C, whereas the specified
operating temperature range of –20 °C to +55 °C is met by all other SRUs.
Design Approach : Use appropriate heaters and fans to keep the environment inside the
LRU within the operational limits of all SRUs.
Initial Design logic : 1. Fans ‘on’ by default at power ‘on’. 2. Heater ‘on’ when the inside
temperature is between –20 °C and –10 °C.
Outcome : The unit failed in low temperature testing at –20 °C. The inside temperature of
the unit did not rise because the fans were continuously ‘on’.
Design logic-2 : 1. Heater ‘on’ and fans ‘off’ when the inside temperature is between –20 °C
and +10 °C. 2. Heater ‘off’ and fans ‘on’ when the inside temperature is above +10 °C.
Outcome : The unit passed the low/high temperature tests but failed in the high altitude
test (11,800 m at –20 °C). The defect found was the burn-out of one power supply module
inside the LRU. Unlike what was assumed in the first two iterations, this was not a random
failure. The cause was identified as ineffective cooling by the fans owing to air stagnation
at high altitude, which led to excessive heating by the heaters (switched on at –20 °C) and,
in turn, to thermal runaway of the zener diode in the power supply module.
Design logic-3 : 1. A larger number of heaters and fans, each of lower rating, employed.
2. Different sets of heaters and fans act in different temperature ranges, viz.,
–20 °C to –15 °C, –15 °C to –10 °C, –10 °C to –5 °C, –5 °C to 0 °C, 0 °C to +5 °C, and
+5 °C to +10 °C. 3. Heater ‘off’ and fans ‘on’ when the inside temperature is above +10 °C.
Outcome : The unit worked defect-free over its rated operating temperature range.
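The banded switching of Design logic-3 can be sketched as a simple lookup. This is an illustration only, not the controller used in the LRU: the function name and return values are hypothetical, the case study does not say which heaters and fans make up each set, and the handling of temperatures below –20 °C and of exact band boundaries is an assumption made here.

```python
from bisect import bisect_right
from typing import Optional, Tuple

# Upper edges (deg C) of the six bands listed in Design logic-3:
# -20..-15, -15..-10, -10..-5, -5..0, 0..+5, +5..+10.
BAND_UPPER_EDGES_C = [-15.0, -10.0, -5.0, 0.0, 5.0, 10.0]


def select_thermal_action(inside_temp_c: float) -> Tuple[str, Optional[int]]:
    """Pick the heater/fan set for the measured inside temperature.

    Assumptions (not stated in the case study):
    - a boundary value belongs to the band above it, so +10 deg C itself
      is treated like "above +10 deg C";
    - temperatures below -20 deg C fall into the lowest band (index 0).
    """
    band = bisect_right(BAND_UPPER_EDGES_C, inside_temp_c)
    if band == len(BAND_UPPER_EDGES_C):
        # Rule 3: heaters 'off', fans 'on'
        return ("fans_only", None)
    # Rule 2: the set of heaters and fans assigned to this band is active
    return ("heater_fan_set", band)


if __name__ == "__main__":
    for temp in (-20.0, -12.5, 4.0, 9.9, 25.0):
        print(temp, select_thermal_action(temp))
```

A real controller would normally add hysteresis around each boundary so that the sets do not chatter when the temperature hovers at a band edge.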
Hardware Components of Monitoring, Analysis and Recording Subsystem
[Figure: MARS + CC LRU hardware. Labels: Analysis & Recording Module, Monitoring Receivers, CSM Controller, Flash Memory, G-bit Switch, Quad PPC, ADC & DDC.]
CSM AEW&C - Operational Requirements
Monitoring (Narrow Band)
Mechanical Details assembly wise
[Figure: MARS+CC Enclosure.]
Non-Conformance Management
The following committees / boards shall be constituted for effective
non-conformance management :
Material Review Board : A Material Review Board (MRB) shall give dispositions for
various non-conformances, which could be deviations in parts, materials, fabrication
process, plating / painting process, dimensional deviations in mechanical
components, PCBs etc.
Test & Evaluation Committee : Apart from activities related to test and evaluation,
like generation of the test plan, review of test results etc., the T&E committee shall
also look into the non-conformities arising from module / subsystem testing.
Configuration Change Control Management : Handled by the Configuration Management
Review Board (CMRB). This board also reviews and approves the test document of each
subsystem. Any changes in design subsequent to drawing approval shall go through
the Design Change Control Procedure.
System Review Board (SRB) : The SRB shall critically look into any configuration
changes proposed / required and dispose of them appropriately at the system level.
Any issue related to intra-system and inter-system interfaces shall be referred to
the SRB at the system level.
Waivers / Dispositions : Waivers in respect of any specifications shall not be
allowed. However, if such waivers / dispositions are inevitable, they shall be approved
by the Project Director / Programme Director.
Design Approvals and Type Certification
S/N | Activity | Stage of interaction | Purpose and Outcome
1 | Preliminary Design Review | System level, after the Project sanction | To ensure QA aspects are properly and adequately reflected in the System level Design. Design Go-ahead given for the project.
2 | Detailed Design Review | Subsystem level, after the preliminary Design is cleared | To ensure QA aspects are properly and adequately reflected in the Subsystem level Design. Design Go-ahead given at the subsystem level.
3 | Critical Design Review | Carried out on the Engg. Model with Hardware developed, integrated and tested, meeting the Operational Requirements | To ensure QA aspects are properly and adequately reflected in the finalized Design.
4 | Type Certification | Full Range Qualification Tests on the Qualification model with the involvement of all concerned | Certifies the design for Realization / Production of subsequent units for operational use in the project.
5 | Provisional Clearance | Full Range Acceptance Tests on the Flight model with the involvement of all concerned | Clearance given for the usability of the SRU / LRU in the application / platform specified for the project.
6 | Production Clearance | The UUT has successfully completed its field trials and is proven for operation in the intended environment | Clearance given for the usability of the SRU / LRU for bulk production by the PSU and induction into services.
Conclusion
In most projects, we are quick to prove the proto model at the earliest
but slow in slating it for development flight / user trials. The delays are
purely in our own field trials, for want of reliable working of the system.
Somehow, we are under the impression that reliability either comes automatically
from heaven or is the responsibility of the manufacturer or Production Agency
and can be added at a later stage. While emphasizing the need to deliver
reliable products, it is important to realise that reliability needs to
be built into the design and not in the manufacturing process.

Quality & Reliability shall be built in right from component level to system
level. They need to be addressed at both the hardware level and the software level.
There is no dearth of Reliability Prediction / Estimation Tools in the market
place; they need to be selected for relevance and utilised optimally.

Our aim should be to realize products in the very first attempt, without
any time and cost overruns, and to deliver them to a customer who is not only
satisfied but delighted with their performance, quality and reliability.
Thank You
