Acta Scientific COMPUTER SCIENCES

Volume 4 Issue 1 January 2022

Literature Review

Fault Tolerance in Distributed System: A Review

Gajendra Sharma* and Santosh Sah

Department of Computer Science and Engineering, Kathmandu University, Dhulikhel, Kathmandu, Nepal

*Corresponding Author: Gajendra Sharma, Department of Computer Science and Engineering, Kathmandu University, Dhulikhel, Kathmandu, Nepal.

Received: October 19, 2021
Published: December 17, 2021
© All rights are reserved by Gajendra Sharma and Santosh Sah.

Abstract
Distributed systems consist of several hardware and software components connected together, any of which may eventually fail. Therefore, the system should be designed with a proper fault tolerance technique, so that in case of a fault the system is able to recover from the failure without any loss of service. Faults may occur for various reasons, such as communication failure, resource or hardware failure, process faults, software errors, etc. Any of these faults may put the system into a faulty environment, in which it may not perform its task as expected and will produce faulty output or no output. This paper attempts to introduce faults, fault tolerance and fault tolerance techniques in detail with the help of previous research in the field of fault tolerance in distributed systems.

Keywords: Fault; Fault-Tolerance; Distributed System; Fault Tolerance Techniques

Introduction

Distributed computing systems consist of a variety of hardware and software components. Failure of any of these components can lead to unanticipated, potentially disruptive behavior and service unavailability [1]. In the event of a failure in the system or in any of its components, the system must be capable of continuing to operate as it does under normal conditions. This ability to operate normally in the presence of failures is the fault tolerance of the system, and it is what ensures high availability. Incorrect results or unpredictable service cannot be accepted in a real-time distributed system; examples include online transactions, process control and computer-based communication systems. A system's fault-tolerant capability minimizes the loss that may be caused by unpredictable system behavior. The demand for fault tolerance in systems is increasing day by day.

Common characteristics of a distributed system are resource sharing, openness, scalability, transparency and, most importantly, fault tolerance. In a distributed system the individual workstations communicate with each other by passing messages. There is always a chance that a fault will occur, whether due to communication failure, hardware failure, shortage of memory, software bugs, etc.
Figure 1: Distributed system [2].

Citation: Gajendra Sharma and Santosh Sah. “Fault Tolerance in Distributed System: A Review". Acta Scientific Computer Sciences 4.1 (2022): 36-40.

Figure 1 shows an example of a real-time system in a distributed environment. A real-time system depends heavily on deadlines: a task given to the system must be completed within the allocated amount of time, and a result obtained after that period is of no use. Some examples of real-time systems are nuclear systems, robotic controls, medical equipment and defense systems. Distributed systems are presented to the user as a single system image and can easily expand their resources based on the load on the system. Additional components can be added to the system, and these can take over in case of a fault, helping to reduce the impact of the fault and to produce better output.

Below are a few terms related to fault tolerance in distributed systems.

Fault - At the lowest level of abstraction, a fault can be termed a "defect". It can lead to an inaccurate system state. Faults in the system can be categorized by their behavior over time as follows:
• Transient: The fault occurs once and then disappears.
• Intermittent: The fault occurs many times in an irregular way.
• Permanent: The fault persists and brings the system to a halt.

Error - An undesirable state of the system that may lead to failure of the system.

Failure - May be defined as faults due to unintentional intrusion, as a result of which the system no longer delivers the expected service. The different types of failure are as follows:
• Crash Failure: A server halts, but is working correctly until it halts.
• Omission Failure: A server fails to respond to incoming requests.
    • Receive Omission: A server fails to receive incoming messages.
    • Send Omission: A server fails to send messages.
• Timing Failure: A server's response lies outside the specified time interval.
• Response Failure: The server's response is incorrect.
    • Value Failure: The value of the response is wrong.
    • State Transition Failure: The server deviates from the correct flow of control.
• Arbitrary (Byzantine) Failure: A server may produce arbitrary responses at arbitrary times.

Fault Tolerance - The ability of a system to behave in a well-defined manner upon the occurrence of faults.

Recovery - A passive approach in which the state of the system is maintained and is used to roll back the execution to a predefined checkpoint.

Redundancy - With respect to fault tolerance, the replication of hardware, software components or computation.

Security - Robustness of the system, characterized by secrecy, integrity, availability, reliability and safety during its operation.

Fault tolerance is needed for the availability, reliability, safety, maintainability and security of the system.
• Availability: The system must be usable immediately at any time.
• Reliability: The system must work for a long period of time without error.
• Safety: There should be no catastrophic consequences of a temporary failure.
• Maintainability: It must be possible to repair the system and fix faults quickly and easily.
• Security: The system should be able to resist attacks against its integrity.

Failure masking by redundancy
• Information redundancy: Extra bits are added (e.g. CRC).
• Time redundancy: An action may be redone (e.g. a transaction after an abort).
• Physical redundancy: Hardware and software components may be replicated (e.g. adding an extra disk, replicating the database, or TMR).

Triple modular redundancy (TMR)

TMR uses the principle of majority voting. Each device is replicated three times, and the signal passes through all three devices. If one device fails, a voter can reproduce the correct value based on the two correct signals. It is assumed that at every stage at most one device and one voter may fail.
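The majority-voting principle behind TMR can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions rather than an implementation taken from the reviewed literature: the names `majority_vote`, `tmr`, `healthy` and `faulty` are hypothetical, the replicas are modelled as plain functions, and the voter itself is assumed to be fault-free.

```python
from collections import Counter
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def majority_vote(outputs: Sequence[T]) -> T:
    """Return the value produced by a strict majority of replicas.

    Raises ValueError when no value has a strict majority, i.e. when
    more replicas disagree than the scheme can mask.
    """
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise ValueError("no majority among replica outputs")
    return value


def tmr(replicas: Sequence[Callable[[int], int]], x: int) -> int:
    """Feed the same input to three replicas and vote on their outputs."""
    if len(replicas) != 3:
        raise ValueError("TMR expects exactly three replicas")
    return majority_vote([replica(x) for replica in replicas])


def healthy(x: int) -> int:
    # Hypothetical correct device: doubles its input.
    return x * 2


def faulty(x: int) -> int:
    # Hypothetical failed device: always returns a wrong value.
    return -1


# One faulty replica out of three is outvoted by the two healthy ones.
print(tmr([healthy, faulty, healthy], 21))  # prints 42
```

The sketch masks a single faulty replica only. If two replicas fail with the same wrong value, the voter returns that wrong value, which is why the scheme rests on the assumption that at most one device per stage fails.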

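Two of the redundancy forms listed above, information redundancy (extra check bits such as a CRC) and time redundancy (redoing an action), can be combined in a short Python sketch. It is a hedged illustration and not a protocol from the reviewed papers: `frame`, `unframe`, `send_with_retry` and `make_flaky_channel` are hypothetical names, the channel is simulated in memory, and only `zlib.crc32` is a real library call.

```python
import zlib
from typing import Callable


def frame(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 to the payload (information redundancy)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def unframe(data: bytes) -> bytes:
    """Return the payload if its CRC matches, otherwise raise ValueError."""
    if len(data) < 4:
        raise ValueError("frame too short")
    payload, received = data[:-4], int.from_bytes(data[-4:], "big")
    if zlib.crc32(payload) != received:
        raise ValueError("CRC mismatch")
    return payload


def send_with_retry(
    payload: bytes,
    channel: Callable[[bytes], bytes],
    max_attempts: int = 3,
) -> bytes:
    """Resend the frame until it arrives intact (time redundancy)."""
    for _ in range(max_attempts):
        try:
            return unframe(channel(frame(payload)))
        except ValueError:
            continue  # corrupted frame detected: redo the action
    raise RuntimeError("all attempts failed")


def make_flaky_channel(corrupt_first: int) -> Callable[[bytes], bytes]:
    """Hypothetical channel that flips one bit in its first frames."""
    remaining = corrupt_first

    def channel(data: bytes) -> bytes:
        nonlocal remaining
        if remaining > 0:
            remaining -= 1
            return bytes([data[0] ^ 0x01]) + data[1:]
        return data

    return channel


# Two corrupted transmissions are detected and retried; the third succeeds.
print(send_with_retry(b"debit 100", make_flaky_channel(2)))  # prints b'debit 100'
```

Retrying masks transient faults only. With a permanent fault every attempt fails, and the sketch raises `RuntimeError` once `max_attempts` is exhausted.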

Literature Review

A great deal of research has been performed, and is ongoing, in the field of fault tolerance in distributed systems. Research and experimentation efforts began in earnest in the 1970s and continued through the 1990s, with focused interest peaking in the late 1980s.

A number of distributed operating systems were introduced during this period; however, very few of these implementations achieved even modest commercial success.

Different authors have reviewed the concept of fault-tolerant computing systems, including Ramamoorthy [1967], Short [1968], Avizienis [1971], Kuhl and Reddy [1980, 1981], and Bagchi and Hakimi [1991]. The SAPO computer, built in Prague, Czechoslovakia, was probably the first fault-tolerant computer. It was built in 1950-1954 under the supervision of Antonin Svoboda, using relays and a magnetic drum, and was operated in 1957-1960.

According to Leslie Lamport [3], time should be used instead of timeouts to increase fault tolerance. He describes a general method for implementing a distributed system with any desired degree of fault tolerance: instead of relying upon explicit timeouts, processes execute a simple clock-driven algorithm. For the solution to the Byzantine problem, the author assumes reliable clock synchronization.

According to Pavan et al. [4], fault resilience techniques can be broadly classified into three categories:
• Hardware Resilience
• Resilient System Software
• Application Based Resilience

According to Diego Zuquim Guimarães Garcia et al. [5], the web service architecture still lacks the facilities to support fault tolerance. The authors propose an architecture that provides mediation and monitoring for web services.

Arvind Kumar et al. [7] discuss the types of faults occurring in a system, together with fault detection and recovery techniques. After a failure, a system can be in one of the three states below:
• Fail Stop System
• Byzantine System
• Fail-Fast System

They also mention some approaches for fault tolerance in real-time distributed systems:
• Replication
    • Job Replication
    • Component Replication
    • Data Replication
• Checkpointing
• Scheduling/Redundancy
    • Space Scheduling/Redundancy
    • Time Scheduling/Redundancy
    • Hybrid Redundancy

A common way to handle crashes involves two steps: (1) detect the failure; and (2) recover, by restarting or failing over the crashed component. Failure recovery has received a lot more attention than failure detection. Joshua B. Leners et al. [9] present a failure detection mechanism named Falcon. According to the authors, Falcon achieves this by coordinating a network of spies, each monitoring a layer of the system.

Padmakumari (2015) [10] provides ideas for diverse fault tolerance and monitoring mechanisms to enhance reliability in cloud computing environments. It surveys various techniques and methods used for fault tolerance and also focuses on future research directions in cloud fault tolerance. Joshi (2014) [11] introduces the concept of virtual data centres (VDC), which is based on the migration technique. In this methodology, if a virtual machine is overloaded, some of its resources are migrated to another virtual machine to handle the server failure. FTT-based designs have proved to be a viable solution in the scope of adaptive systems, and recent works in this area show that there is an ongoing interest in continuing to improve the real-time (RT) features of the FTT protocol [12].

Fault tolerance model

Fault model

A fault model describes which faults, and which associated rates of occurrence, are assumed by the system being designed. From different viewpoints, faults can be classified by system boundary (internal or external), phenomenological cause (natural or human-made), intent (deliberate or non-deliberate), capacity (accidental or due to incompetence) and persistence (permanent or transient). Two examples of the faults considered by the presented fault model are physical deterioration and physical interference, as seen in Figure 2. These faults are caused by processes such as radiation, power transients, noisy input lines, etc.

Figure 2: Fault tolerance model.

Summary and Conclusion

When a fault occurs in a system, the system requires a fault tolerance method to detect the fault and recover to its normal state. Fault tolerance techniques are required to predict these failures and take appropriate action before the faults actually occur. Fault detection is as important as failure recovery for a better fault tolerance mechanism in a system.

All the activities executed by a fault-tolerant system have to be synchronized. For example, node replicas first have to execute certain application tasks in order to produce their results, then exchange the produced results by transmitting and receiving messages using the network protocol, and finally execute application tasks that vote on the locally produced result and the results received through the channel.

A review of previous work in the field of fault tolerance shows that several fault tolerance models exist. Transient link faults may affect the capacity of a node for transmitting or receiving, but they are transparently tolerated by using the proactive retransmission mechanism. Replication of hardware and software is the most common technique used for fault tolerance. Faults may lead a node replica to become desynchronized at the communication and/or the application level beyond the error recovery capacity. Thus, it was realized that it was necessary to propose more sophisticated recovery mechanisms to prevent undesirable fault attrition. These replication techniques may not be economical. Many fault tolerance techniques have been proposed, but no single fault tolerance mechanism can fulfill all aspects of fault tolerance. The model can be used with the necessary customization according to the system that is being designed.

Bibliography

1. Flaviu Cristian. “Understanding Fault-Tolerant Distributed Systems”. Computer Science and Engineering, University of California, San Diego (1993).

2. Lakshmi PS. “Distributed Fault Tolerance System in Real Time Environment”. Kundal Kr. Medhi, International Journal of Advance Research in Computer Science and Software Engineering (2013).

3. Leslie Lamport. “Using Time instead of Timeout for Fault Tolerant Distributed System”. SRI International (2017).

4. Pavan B., et al. “Fault Tolerance Techniques for Scalable Computing”. Mathematics and Computer Science Division, Argonne National Laboratory (2014).

5. Diego Z., et al. “A Fault Tolerant Web Service Architecture”. Institute of Computing, University of Campinas, São Paulo, Brazil (2016).

6. Avizienis A. “Fault Tolerance Computing-An overview”. IEEE Computer 3 (2011).

7. Arvind Kumar., et al. “Fault Tolerance in Real Time Distributed System”. International Journal on Computer Science and Engineering 3 (2011): 933-939.

8. Mitvin S. “Fault Tolerant Distributed System”. Department of Computer Science and Engineering, University of Texas at Arlington (2019).

9. Joshua B L., et al. “Detecting Failure in Distributed Systems with FALCON spy network”. The University of Texas at Austin, Microsoft Research Silicon Valley (2011).

10. Padmakumari P. “Methodical Review on Various Fault Tolerant and Monitoring Mechanisms to improve Reliability on Cloud Environment”. Indian Journal of Science and Technology 8 (2015).

11. Joshi SC and Sivalingam KM. “Fault tolerance mechanisms for virtual data center architectures”. Photonic Network Communications 28 (2014): 154-164.

12. Garibay-Martínez R. “Improved Holistic Analysis for Fork–Join Distributed Real-Time Tasks Supported by the FTT-SE Protocol”. In: IEEE Transactions on Industrial Informatics 12.5 (2016): 1865-1876.
