Dependability: Dependability Proper Improper Failure Restoration

The document discusses dependability as the ability of a system to deliver specified service. A system provides proper service if the service is delivered as specified, otherwise it is improper. System failure occurs when transitioning from proper to improper service. Dependability is measured through availability, reliability, safety, and other metrics.

Uploaded by

Wesal Refat

Dependability

• Dependability is the ability of a system to deliver a specified service.

• System service is classified as proper if it is delivered as specified; otherwise it
is improper.
• System failure is a transition from proper to improper service.
• System restoration is a transition from improper to proper service.

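[Figure: two-state diagram. "Proper service" moves to "improper service" on failure; "improper service" moves back to "proper service" on restoration.]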

⇒ The “properness” of service depends on the user’s viewpoint!
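The two transitions defined above can be illustrated with a minimal sketch. It assumes the service is observed at discrete steps and that a Boolean already records whether each observation met the specification; the function name and the sample trace are hypothetical, not part of the original slides.

```python
from typing import List, Tuple


def classify_transitions(proper: List[bool]) -> Tuple[int, int]:
    """Count failures and restorations in a sequence of service observations.

    proper[i] is True when the service observed at step i met its specification.
    """
    failures = 0
    restorations = 0
    for prev, curr in zip(proper, proper[1:]):
        if prev and not curr:
            failures += 1        # proper -> improper: a failure
        elif not prev and curr:
            restorations += 1    # improper -> proper: a restoration
    return failures, restorations


# Hypothetical trace of observations (True = proper service).
trace = [True, True, False, False, True, False, True]
print(classify_transitions(trace))
```

What counts as "proper" is decided before the trace is built, which is where the user's viewpoint enters.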

Reference: J.C. Laprie (ed.), Dependability: Basic Concepts and Terminology, Springer-Verlag, 1992.

ECE/CS 541: Computer System Analysis, Instructor William H. Sanders. ©2006 William H. Sanders. All rights reserved. Do not duplicate without permission of the author. Module 1, Slide 1

Examples of Specifications of Proper Service

• k out of N components are functioning.

• every working processor can communicate with every other working processor.
• every message is delivered within t milliseconds from the time it is sent.
• all messages are delivered in the same order to all working processors.
• the system does not reach an unsafe state.
• 90% of all remote procedure calls return within x seconds with a correct result.
• 99.999% of all telephone calls are correctly routed.

⇒ Notion of “proper service” provides a specification by which to evaluate a system’s dependability.
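Two of the specifications listed above can be written as executable predicates. This is only a sketch under stated assumptions: the function names, the argument layout, and the choice to treat "within x seconds" as `<=` are illustrative and do not come from the slides.

```python
from typing import Sequence


def k_out_of_n_ok(component_up: Sequence[bool], k: int) -> bool:
    """Proper service if at least k of the N components are functioning."""
    return sum(component_up) >= k


def rpc_spec_ok(latencies_s: Sequence[float],
                correct: Sequence[bool],
                x_seconds: float,
                required_fraction: float = 0.90) -> bool:
    """Proper service if the required fraction of remote procedure calls
    returned a correct result within x_seconds."""
    if len(latencies_s) != len(correct) or not latencies_s:
        raise ValueError("need equally long, non-empty observation lists")
    good = sum(1 for lat, ok in zip(latencies_s, correct)
               if ok and lat <= x_seconds)
    return good / len(latencies_s) >= required_fraction


# Hypothetical observations: 3 components, 2 of them up.
print(k_out_of_n_ok([True, False, True], k=2))
```

Changing `k`, `x_seconds`, or `required_fraction` changes the specification, and therefore which observed behavior counts as a failure.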

Dependability Concepts
• Measures - properties expected from a dependable system
– Availability
– Reliability
– Safety
– Confidentiality
– Integrity
– Maintainability
– Coverage
• Impairments - causes of undependable operation
– Faults
– Errors
– Failures
• Means - methods to achieve dependability
– Fault Avoidance
– Fault Tolerance
– Fault Removal
– Dependability Assessment

Faults, Errors, and Failures can Cause Improper Service

• Failure - transition from proper to improper service

• Error - that part of system state that is liable to lead to subsequent failure
• Fault - the hypothesized cause of error(s)

[Figure: fault → error → failure chains shown for Module 1, Module 2, and Module 3.]

Dependability Measures: Availability
Availability - quantifies the alternation between deliveries of proper and improper
service.
– A(t) is 1 if service is proper at time t, 0 otherwise.

– E[A(t)] (expected value of A(t)) is the probability that service is proper at time t.

– A(0,t) is the fraction of time the system delivers proper service during [0,t].

– E[A(0,t)] is the expected fraction of time service is proper during [0,t].

– P[A(0,t) > t*] (0 ≤ t* ≤ 1) is the probability that service is proper more than
100t*% of the time during [0,t].

– lim_{t→∞} A(0,t) is the fraction of time that service is proper in steady state.

– E[lim_{t→∞} A(0,t)] and P[lim_{t→∞} A(0,t) > t*] are defined as above.
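The sample-path measures A(t) and A(0,t) can be computed directly from an observed timeline. The sketch below assumes one sample path is given as a list of non-overlapping intervals during which service was proper; that representation, the half-open interval convention, and all names are assumptions made for illustration.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # [start, end) during which service is proper


def instantaneous_availability(up: List[Interval], t: float) -> int:
    """A(t): 1 if service is proper at time t, 0 otherwise."""
    return int(any(start <= t < end for start, end in up))


def interval_availability(up: List[Interval], t: float) -> float:
    """A(0,t): fraction of [0,t] during which service is proper."""
    if t <= 0:
        raise ValueError("t must be positive")
    # Clip each up-interval to [0, t] and add up the remaining lengths.
    up_time = sum(max(0.0, min(end, t) - max(start, 0.0)) for start, end in up)
    return up_time / t


def estimate_expected_availability(paths: List[List[Interval]], t: float) -> float:
    """Estimate E[A(t)] as the share of sample paths that are proper at time t."""
    return sum(instantaneous_availability(p, t) for p in paths) / len(paths)


# Hypothetical sample path over [0, 100]: down during [40, 50) and [90, 95).
path = [(0.0, 40.0), (50.0, 90.0), (95.0, 100.0)]
print(instantaneous_availability(path, 45.0), interval_availability(path, 100.0))
```

E[A(0,t)] and P[A(0,t) > t*] can be estimated in the same way by averaging, or thresholding, `interval_availability` over many sample paths.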

Other Dependability Measures
• Reliability - a measure of the continuous delivery of service
– R(t) is the probability that a system delivers proper service throughout [0,t].

• Safety - a measure of the time to catastrophic failure

– S(t) is the probability that no catastrophic failures occur during [0,t].
– Analogous to reliability, but concerned with catastrophic failures.

• Time to Failure - measure of the time to failure from last restoration. (Expected
value of this measure is referred to as MTTF - Mean time to failure.)

• Maintainability - measure of the time to restoration from last experienced failure. (Expected value of this measure is referred to as MTTR - Mean time to repair.)

• Coverage - the probability that, given a fault, the system can tolerate the fault
and continue to deliver proper service.
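As a rough illustration of the time-to-failure, maintainability, and coverage definitions, the sketch below estimates MTTF, MTTR, and coverage from observed data. It assumes a time-ordered log of strictly alternating "failure" and "restoration" events for a system that starts in proper service at time 0; the log format, function names, and numbers are hypothetical.

```python
from statistics import mean
from typing import List, Tuple


def mttf_mttr(events: List[Tuple[float, str]]) -> Tuple[float, float]:
    """Estimate (MTTF, MTTR) from a time-ordered event log.

    Each event is (time, kind) with kind "failure" or "restoration".
    Time to failure is measured from the last restoration (or from time 0),
    and time to restoration from the last failure.
    """
    times_to_failure = []
    times_to_restoration = []
    last_time = 0.0
    for time, kind in events:
        if kind == "failure":
            times_to_failure.append(time - last_time)
        elif kind == "restoration":
            times_to_restoration.append(time - last_time)
        else:
            raise ValueError(f"unknown event kind: {kind}")
        last_time = time
    # statistics.mean raises StatisticsError if either list is empty.
    return mean(times_to_failure), mean(times_to_restoration)


def coverage_estimate(faults_tolerated: int, faults_total: int) -> float:
    """Estimate coverage as the fraction of observed faults that were tolerated."""
    if faults_total <= 0:
        raise ValueError("need at least one observed fault")
    return faults_tolerated / faults_total


# Hypothetical log, times in hours.
log = [(100.0, "failure"), (102.0, "restoration"),
       (302.0, "failure"), (306.0, "restoration")]
print(mttf_mttr(log), coverage_estimate(999, 1000))
```

These are simple sample means; they say nothing about the underlying distributions, which a model-based analysis would have to assume.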

Illustration of the Impact of Coverage on Dependability
• Consider two well-known architectures: simplex and duplex.

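[Figure: block diagrams of the simplex system (one component with failure rate λ) and the duplex system (two components, each with failure rate λ).]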

• The Markov model for both architectures is:

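[Figure: Markov models. Simplex: state 1 → failed at rate λ. Duplex: state 2 → state 1 at rate 2cλ, state 1 → state 2 at rate µ, state 2 → failed at rate 2(1-c)λ, state 1 → failed at rate λ.]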
• The analytical expression of the MTTF can be calculated for each architecture
using these Markov models.
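One way to obtain the MTTF is to solve for the expected time to absorption in each Markov model. The sketch below assumes the usual reading of the rate labels in the figure: in the duplex model, state 2 (both units up) moves to state 1 at rate 2cλ and to the failed state at rate 2(1-c)λ, while state 1 moves back to state 2 at rate µ and to the failed state at rate λ. That arc structure, like the function names, is an assumption and not confirmed by the extracted text.

```python
def mttf_simplex(lam: float) -> float:
    """Single unit with failure rate lam: expected time to failure is 1/lam."""
    return 1.0 / lam


def mttf_duplex(lam: float, mu: float, c: float) -> float:
    """Expected time to absorption starting from state 2 (both units up).

    With T2 and T1 the expected times to failure from states 2 and 1:
        2*lam*T2      = 1 + 2*c*lam*T1
        (lam + mu)*T1 = 1 + mu*T2
    The 2x2 linear system is solved with Cramer's rule.
    """
    a11, a12 = 2.0 * lam, -2.0 * c * lam
    a21, a22 = -mu, lam + mu
    det = a11 * a22 - a12 * a21
    # Right-hand side is (1, 1), so T2 = (1*a22 - a12*1) / det.
    return (a22 - a12) / det


# Hypothetical parameters: failure rate 1e-4 per hour, repair rate 1 per hour.
lam, mu = 1e-4, 1.0
for c in (1.0, 0.999, 0.99, 0.95):
    print(c, mttf_duplex(lam, mu, c) / mttf_simplex(lam))
```

Two sanity checks follow from the equations: with perfect coverage and no repair (c = 1, µ = 0) the duplex MTTF is 1/(2λ) + 1/λ, and with zero coverage it collapses to 1/(2λ), which is worse than simplex.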

Illustration of the Impact of Coverage, cont.
• The following plot shows the ratio of MTTF (duplex)/MTTF (simplex) for
different values of coverage (all other parameter values being the same).
• The ratio shows the dependability gain by the duplex architecture.

[Figure: log-log plot of MTTF (duplex)/MTTF (simplex), from 1E+00 to 1E+04, against the ratio of failure rate to repair rate (λ/µ), from 1E-04 to 1E-02, with one curve each for c = 1, c = 0.999, c = 0.99, and c = 0.95.]
• We observe that the coverage of the detection mechanism has a significant impact on the gain: a change in coverage of only 10^-3 (from c = 1 to c = 0.999) reduces the dependability gain of the duplex system by a full order of magnitude.
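The sensitivity to coverage can be checked numerically. The sketch below uses a closed-form ratio derived under an assumed duplex Markov model (both-up state leaves at rate 2cλ to the one-up state and at rate 2(1-c)λ to failure; the one-up state is repaired at rate µ and fails at rate λ). Under that assumption, with ρ = λ/µ, the ratio is (1 + ρ + 2cρ) / (2(ρ + 1 - c)). The model, the formula, and the function name are reconstructions for illustration, not taken verbatim from the slides.

```python
def mttf_gain(rho: float, c: float) -> float:
    """MTTF(duplex) / MTTF(simplex) for rho = lam/mu and coverage c,
    under the assumed duplex Markov model described in the lead-in."""
    return (1.0 + rho + 2.0 * c * rho) / (2.0 * (rho + 1.0 - c))


rho = 1e-4  # failure rate is 10^-4 of the repair rate
for c in (1.0, 0.999, 0.99, 0.95):
    print(f"c = {c:<6} gain = {mttf_gain(rho, c):.1f}")

# Gain lost when coverage drops from 1 to 0.999 at this rho.
print(mttf_gain(rho, 1.0) / mttf_gain(rho, 0.999))
```

At small ρ the denominator is dominated by 1 - c as soon as coverage is imperfect, which is why a tiny loss of coverage outweighs the repair rate in determining the gain.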
Failure Sources and Frequencies
Non-Fault-Tolerant Systems
– Japan, 1383 organizations (Watanabe 1986, Siewiorek & Swarz 1992)
– USA, 450 companies (FIND/SVP 1993)
– Mean time to failure: 6 to 12 weeks
– Average outage duration after failure: 1 to 4 hours

Fault-Tolerant Systems
– Tandem Computers (Gray 1990)
– Bell Northern Research (Cramp et al. 1992)
– Mean time to failure: 21 years (Tandem)

[Figure: pie charts of failure sources for the two system classes, with categories Hardware, Software, Communications, Environment, and Operations-Procedures. The extracted percentage labels are 10%, 10%, 8%, 7%, 15%, 50%, 25%, and 65%; their assignment to categories is not recoverable from the extracted text.]
