
Fault-Injection and Dependability Benchmarking for Grid Computing Middleware

Sébastien Tixeuil1, Luis Moura Silva2,
William Hoarau1, Gonçalo Jesus2, João Bento2, Frederico Telles2

1 LRI – CNRS UMR 8623 & INRIA Grand Large, Université Paris Sud XI, France
Email: [email protected]
2 Departamento Engenharia Informática, Universidade de Coimbra, Polo II, 3030-Coimbra, Portugal
Email: [email protected]

Abstract. In this paper we present some work on dependability benchmarking for Grid Computing that represents a common view between two groups of CoreGRID: INRIA Grand Large and the University of Coimbra. We present a brief overview of the state of the art, followed by a presentation of the FAIL-FCI system from INRIA, which provides a tool for fault injection in large distributed systems. Then we present DBGS, a Dependability Benchmark for Grid Services. We conclude the paper with some considerations about the avenues of research ahead to which both groups would like to contribute on behalf of the CoreGRID network.

1 Introduction

One of the topics of paramount importance in the development of Grid middleware is the impact of faults, since their probability of occurrence in a Grid infrastructure and in large-scale distributed systems is actually very high. It is therefore mandatory that Grid middleware be reliable itself and provide comprehensive support for fault-tolerance mechanisms, such as failure detection, checkpointing, replication, software rejuvenation, and component-based reconfiguration, among others. One technique to evaluate the effectiveness of those fault-tolerance mechanisms and the reliability level of the Grid middleware is to use a fault-injection tool and robustness tester to conduct an experimental assessment of the dependability metrics of the target system. In this paper, we present a software fault-injection tool and a workload generator for Grid Services that can be used for dependability benchmarking in Grid Computing.
The final goal of our common work is to contribute to the definition of a dependability benchmark for Grid computing and to provide a set of tools and techniques that can be used by the developers of Grid middleware and Grid-based applications to conduct dependability benchmarking of their systems.
In this paper we present a fault-injection tool for large-scale distributed systems (developed by INRIA Grand Large) and a workload generator for Grid Services (being developed by the University of Coimbra) that together cover the four components of a dependability benchmark: a workload, a faultload, a set of dependability measures, and the benchmark procedures. To the best of our knowledge, the combination of these two tools represents the most complete testbed for dependability benchmarking of Grid applications.
The remainder of this paper is organized as follows. Section 2 summarizes the related work. Section 3 describes the FAIL-FCI infrastructure from INRIA. Section 4 briefly describes DBGS, a dependability benchmarking tool for Grid Services. Section 5 concludes the paper.

2 Related Work

In this section we present a summary of the state-of-the-art in the two main topics of
this paper: dependability benchmarking and fault-injection tools.

2.1 Dependability Benchmarking

The idea of dependability benchmarking is now a hot topic of research [1] and there are already several publications in the literature. In [2], a dependability benchmark for transactional systems (DBench-OLTP) is proposed. Another dependability benchmark for transactional systems is proposed in [3]; this one considers a faultload based on hardware faults. A dependability benchmark for operating systems is proposed in [4]. Research work developed at Berkeley University has led to the proposal of a dependability benchmark to assess human-assisted recovery processes [5]. The work carried out in the context of the Special Interest Group on Dependability Benchmarking (SIGDeB), created by the IFIP WG 10.4, has resulted in a set of standardized availability classes to benchmark database and transactional servers [6]. Research work at Sun Microsystems defined a high-level framework [7] dedicated specifically to availability benchmarking. Within this framework, they have developed two benchmarks: one [8] addresses specific aspects of a system's robustness in handling maintenance events, such as the replacement of a failed hardware component or the installation of a software patch, while the other is related to system recovery [9]. At IBM, the Autonomic Computing initiative is also developing benchmarks to quantify a system's level of autonomic capability, addressing the four main areas of IBM's self-management model: self-configuration, self-healing, self-optimization, and self-protection [10]. We are looking at this initiative in detail, and our aim is to introduce some of these metrics into Grid middleware in order to reduce the maintenance burden and increase the availability of Grid applications in production environments. Finally, in [11] the authors present a dependability benchmark for web servers. In some way we follow this trend by developing a benchmark for SOAP-based Grid services.

2.2 Fault-injection Tools

When considering solutions for software fault injection in distributed systems, there are several important parameters to consider. The main criterion is the usability of the fault-injection platform: if it is more difficult to write fault scenarios than to write the tested applications themselves, those fault scenarios are likely to be dropped from the set of performed tests. The issues in testing component-based distributed systems have already been described, and methodologies for testing components and systems have already been proposed [12-13]. However, testing for fault tolerance remains a challenging issue. Indeed, in available systems the fault-recovery code is rarely executed in the test-bed, because faults rarely get triggered. Since the ability of a system to perform well in the presence of faults depends on the correctness of this fault-recovery code, it is mandatory to actually test it. Testing based on fault injection can be used to test for fault tolerance by injecting faults into a system under test and observing its behavior. The most obvious requirement is that simple tests (e.g. every few minutes or so, a randomly chosen machine crashes) should be simple to write and deploy. On the other hand, it should be possible to inject faults in very specific cases (e.g. in a particular global state of the application), even if this requires a better understanding of the tested application. Also, decoupling the fault-injection platform from the tested application is a desirable property, as it lets different groups concentrate on different aspects of fault tolerance.

Decoupling requires that no modification of the tested application's source code be necessary to inject faults. Moreover, having experts in fault tolerance test particular scenarios for applications they have no knowledge of favors describing fault scenarios in a high-level language that abstracts practical issues such as communication and scheduling. Finally, to properly evaluate a distributed application in the presence of faults, the impact of the fault-injection platform should be kept low, even if the number of machines is high. Of course, this impact is doomed to increase with the complexity of the fault scenario: when every action of every processor is likely to trigger a fault action, for example, injecting those faults will induce an overhead that is certainly not negligible.
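As an illustration of the first requirement, the kind of "simple test" mentioned above (every few minutes, a randomly chosen machine crashes) should take no more than a few lines to express. The following Python sketch is purely illustrative and is not FAIL code; it assumes passwordless SSH access to the nodes and a tested process hypothetically named "worker".

import random
import subprocess
import time

NODES = ["node01", "node02", "node03", "node04"]   # machines taking part in the experiment (assumption)
TARGET = "worker"        # name of the tested process (hypothetical)
PERIOD = 300             # seconds between two injected crashes

while True:
    time.sleep(PERIOD)
    victim = random.choice(NODES)
    # Crash the target process on the chosen node (requires passwordless SSH).
    subprocess.run(["ssh", victim, "pkill", "-9", TARGET], check=False)
    print("injected crash of", TARGET, "on", victim)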
Several fault injectors for distributed systems already exist. Some of them are dedicated to distributed real-time systems, such as DOCTOR [14]. ORCHESTRA [15] is a fault-injection tool that allows the user to test the reliability and the liveness of distributed protocols. ORCHESTRA is a "message-level fault injector", because a fault-injection layer is inserted between two layers in the protocol stack. This kind of fault injector makes it possible to inject faults without requiring any modification of the protocol source code. However, the expressiveness of the fault scenarios is limited, because there is no communication between the various state machines executed on each node. Moreover, as fault injection is based on exchanged messages, knowledge of the type and the size of these messages is required. In any case, these approaches do not fit the cluster and Grid category of applications.

The NFTAPE project [16] arose from the double observation that no single tool is sufficient to inject all fault models and that it is difficult to port a particular tool to different systems. Although NFTAPE is modular and very portable, its completely centralized decision process makes it very intrusive: its execution strongly perturbs the system being tested. Finally, writing a scenario quickly becomes complex because of the centralized nature of the decisions taken during tests that involve numerous nodes.
LOKI [17] is a fault injector dedicated to distributed systems. It is based on a partial view of the global state of the distributed system. An a posteriori analysis is executed at the end of the test to infer a global schedule from the various partial views and then verify whether faults were injected correctly (i.e. according to the planned scenario). However, LOKI requires modification of the source code of the tested application. Furthermore, fault scenarios are based only on the global state of the system, and it is difficult (if not impossible) to specify more complex fault scenarios (for example, injecting "cascading" faults). Also, LOKI provides no support for randomized fault injection.
Mendosus [18] is a fault-injection tool for system-area networks that is based on the emulation of clusters of computers and of different network configurations. This tool made some first steps in the injection and assessment of faults in large distributed systems, although FCI has gone some steps further. Finally, [19] presents a fault-injection tool that was specially developed to assess the dependability of Grid (OGSA) middleware. This is the work most closely related to ours, and we welcome the first contributions made by those authors in the area of grid middleware dependability. However, the tool described in that paper is very limited, since it only allows the injection of faults into the XML messages of the OGSA middleware, which seems rather far from the real faults experienced in real systems.
In the rest of the paper we present two tools, for fault injection and workload generation, that complement each other quite well and, used together, may represent an interesting package for developers of Grid middleware and applications.

3 FAIL-FCI Framework from INRIA

In this section, we describe the FAIL-FCI framework from INRIA. First, FAIL (FAult Injection Language) is a language that makes it easy to describe fault scenarios. Second, FCI (FAIL Cluster Implementation) is a distributed fault-injection platform whose input language for describing fault scenarios is FAIL. Both components are developed as part of the Grid eXplorer project [20], which aims at emulating large-scale networks on smaller clusters or grids.

The FAIL language allows the definition of fault scenarios. A scenario describes, using a high-level abstract language, state machines which model fault occurrences. The FAIL language also describes the association between these state machines and a computer (or a group of computers) in the network. The FCI platform (see Figure 1) is composed of several building blocks:

1. The FCI compiler: The fault scenarios written in FAIL are pre-compiled
by the FCI compiler which generates C++ source files and default
configuration files.
2. The FCI library: The files generated by the FCI compiler are bundled
with the FCI library into several archives, and then distributed across the
network to the target machines according to the user-defined
configuration files. Both the FCI compiler generated files and the FCI
library files are provided as source code archives, to enable support for
heterogeneous clusters.
3. The FCI daemon: The source files that have been distributed to the target
machines are then extracted and compiled to generate specific executable
files for every computer in the system. Those executables are referred to
as the FCI daemons. When the experiment begins, the distributed
application to be tested is executed through the FCI daemon installed on
every computer, to allow its instrumentation and its handling according to
the fault scenario.
Our approach is based on the use of a software debugger. Like the Mantis parallel debugger [21], FCI communicates to and from gdb (the Free Software Foundation's portable sequential debugging environment) through Unix pipes. But contrary to the Mantis approach, communication with the debugger is kept to a minimum to guarantee a low overhead of the fault-injection platform: in our approach, the debugger is only used to trigger and inject software faults. The tested application can be interrupted when it calls a particular function or when it executes a particular line of its source code, and its execution can be resumed depending on the considered fault scenario.

With FCI, every physical machine is associated with a fault-injection daemon. The fault scenario is described in a high-level language and compiled into C++ code, which is distributed to the machines participating in the experiment. This C++ code is compiled on every machine to generate the fault-injection daemon. Once this preliminary task has been performed, the experiment is ready to be launched. The daemon associated with a particular computer consists of:
1. a state machine implementing the fault scenario,
2. a module for communicating with the other daemons (e.g. to inject faults based on a global state of the system),
3. a module for time management (e.g. to allow time-based fault injection),
4. a module to instrument the tested application (by driving the debugger), and
5. a module for managing events (to trigger faults).

FCI is thus a debugger-based fault injector, because the injection of faults and the instrumentation of the tested application are performed through a debugger. This makes it possible to avoid modifying the source code of the tested application, while still enabling the injection of arbitrary faults (modification of the program counter or of local variables to simulate a buffer-overflow attack, etc.). From the user's point of view, specifying a fault scenario written in FAIL is sufficient to define an experiment. The source code of the fault-injection daemons is generated automatically. These daemons communicate with each other explicitly according to the user-defined scenario. This allows the injection of faults based either on a global state of the system or on more complex mechanisms involving several machines (e.g. a cascading fault injection). In addition, the fully distributed architecture of the FCI daemons makes the platform scalable, which is necessary in the context of emulating large-scale distributed systems.


[Figure 1 (diagram): a fault scenario written in FAIL is translated by the FAIL compiler into source code, which a C++ compiler turns into the FCI daemons deployed on every node of the experiment.]

Figure 1: The FCI platform.

FCI daemons have two operating modes: a random mode and a deterministic mode. These two modes allow fault injection based either on a probabilistic fault scenario (in the first case) or on a deterministic and reproducible fault scenario (in the second case). Using a debugger to trigger faults also limits the intrusiveness of the fault injector during the experiment. Indeed, the debugger places breakpoints which correspond to the user-defined fault scenario and then runs the tested application. As long as no breakpoint is reached, the application runs normally and the debugger remains inactive.

FAIL-FCI has been used to assess the dependability of XtremWeb [22], and results are being collected that allow us to assess the effectiveness of some fault-tolerance techniques that can be applied to desktop grids.

4 DBGS: Dependability Benchmark for Grid Services

DBGS is a dependability benchmark for Grid Services that follow the OGSA specification [23]. Since the OGSA model is based on SOAP technology, we have developed a benchmark tool for SOAP-based services. This benchmark includes the four components mentioned in Section 1: (a) the definition of a workload for the system under test (SUT); (b) the optional definition of a faultload for the SUT; (c) the collection and definition of the benchmark measures; and (d) the definition of the benchmark procedures. The DBGS is composed of the components presented in Figure 2.

 
V WV
T XH
3 5H
6 2$

Figure 2: Experimental setup overview of the DBGS benchmark.

The system under test (SUT) consists of a SOAP server running some Grid or Web Service. From the point of view of the benchmark, the SUT corresponds to an application server, a SOAP router, and a Grid service that will execute under some workload and, optionally, will be affected by some faultload.
The Benchmark Management System (BMS) is a collection of software tools that allows the automatic execution of the benchmark. It includes a module for the definition of the benchmark (its procedures and rules) and of the workload that will be applied to the SUT, as well as a module that collects all the benchmark results and produces the final results, expressed as a set of dependability metrics. The BMS may activate a set of clients (running on separate machines) that inject the defined workload into the SUT by making SOAP requests to the target Grid Service. The execution of these client machines is synchronized in time, and the partial results collected by each individual client are merged into a global set of results from which the final dependability metrics are computed. The BMS also includes a reporting tool that presents the final results in a readable, graphical format.

The results generated by each benchmark run are expressed as throughput over time (requests per second along a time axis), the total turnaround time of the execution, the average latency, the functionality of the services, the occurrence of failures in the Grid service/server, the characterization of those failures (crash, hang, zombie server), the correctness of the final results at the server side, and the failure scenarios observed at the client machines (explicit SOAP error messages or time-outs).
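The sketch below shows how some of these measures (throughput over time, average latency, and a breakdown of client-side failure scenarios) could be derived from the per-request records collected by the clients; the record format is the one assumed in the previous sketch.

from collections import Counter

def summarize(results):
    # results: list of (timestamp, latency, outcome) records merged from all clients.
    if not results:
        return {}
    t0 = min(ts for ts, _, _ in results)
    per_second = Counter(int(ts - t0) for ts, _, _ in results)   # requests per second
    latencies = [lat for _, lat, _ in results]
    outcomes = Counter(outcome for _, _, outcome in results)     # e.g. ok, time-out, SOAP error
    return {
        "throughput_over_time": dict(sorted(per_second.items())),
        "average_latency": sum(latencies) / len(latencies),
        "failure_scenarios": {k: v for k, v in outcomes.items() if k != "ok"},
    }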

On the SUT side, four modules are also part of the DBGS benchmark: a faultload injector, a configuration manager, a collector of benchmark results, and a watchdog of the SUT system.

The faultload injector does not inject faults directly into the software, unlike the fault-injection tools previously mentioned in Section 2. This injector only produces an impact at the operating-system level: it consumes operating-system resources such as memory, threads, file handles, database connections, and sockets. We have observed that Grid and Web-Services middleware is not robust enough, because the underlying middleware (e.g. the application server and the SOAP implementation) becomes very unreliable when operating-system resources are scarce, for example due to memory leakage, memory exhaustion, or the over-creation of threads. These are the scenarios we want to generate with this faultload module. This means that software bugs are not directly emulated by this module, but rather by a tool like FAIL-FCI.
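A minimal sketch of the kind of resource-exhaustion faultload described above is shown below; the chunk sizes, thread counts and durations are arbitrary assumptions and would have to be tuned to the SUT machine.

import threading
import time

def exhaust_memory(chunk_mb=50, chunks=100, hold_seconds=60):
    # Gradually allocate memory and hold it, emulating memory leakage/exhaustion.
    hog = [bytearray(chunk_mb * 1024 * 1024) for _ in range(chunks)]
    time.sleep(hold_seconds)
    del hog

def over_create_threads(count=2000, hold_seconds=60):
    # Spawn a large number of idle threads to stress the middleware.
    threads = [threading.Thread(target=time.sleep, args=(hold_seconds,))
               for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

# Example: run both stressors on the SUT machine while the workload is active.
# exhaust_memory(); over_create_threads()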

The configuration manager helps in the definition of the configuration parameters of the SUT middleware. It is clear that the configuration parameters may have a considerable impact on the robustness of the SUT system. By changing those parameters in different runs of the benchmark, we can assess their impact on the results expressed as dependability metrics.

Finally, the SUT system should also be installed with a module to collect raw data from the benchmark execution. This data is then sent to the BMS server, which merges and compares it with the data collected from the client machines. The final module is a SUT watchdog that detects when the SUT system crashes or hangs while the benchmark is executing. When a crash or hang is detected, the watchdog restarts the SUT system and its associated applications, thereby allowing an automatic execution of the benchmark runs without user intervention.
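A minimal watchdog of this kind could be sketched as follows; the health-check URL and the restart command are hypothetical and depend on the SUT installation.

import subprocess
import time
import urllib.error
import urllib.request

HEALTH_URL = "http://localhost:8080/grid-service"    # hypothetical health-check endpoint
RESTART_CMD = ["/etc/init.d/app-server", "restart"]  # hypothetical restart command

def sut_alive(timeout=5):
    try:
        urllib.request.urlopen(HEALTH_URL, timeout=timeout)
        return True
    except urllib.error.HTTPError:
        return True      # the server answered, even if with an HTTP error code
    except Exception:
        return False     # crash, hang (time-out) or refused connection

while True:
    if not sut_alive():
        # Restart the SUT and its associated applications, then give it time to come up.
        subprocess.run(RESTART_CMD, check=False)
        time.sleep(30)
    time.sleep(10)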

We have been collecting a large set of experimental results with this tool. The results are not presented here for lack of space but, in summary, this benchmark tool has allowed us to spot some of the resource leaks that can be found in SOAP implementations currently used in Grid services; these problems may completely undermine the dependability of Grid applications.

5 Conclusions and Future Work

This paper presented a fault-injection tool for large-scale distributed systems that is currently being used to measure the fault-tolerance capabilities included in XtremWeb, and a second tool that can be used directly for dependability benchmarking of Grid Services that follow the OGSA model and are therefore implemented using SOAP technology. These two tools fit together quite well, since their targets are truly complementary. We believe that these two CoreGRID groups will provide a valuable contribution in the area of dependability benchmarking for Grid Computing, and our cooperative work has a long avenue ahead with several research challenges. At the end of the road, we hope to have contributed to increasing the dependability of Grid middleware and applications through the deployment of these tools to the community.

6 Acknowledgements

This research work is carried out in part under the FP6 Network of Excellence
CoreGRID funded by the European Commission (Contract IST-2002-004265).

References

1. P. Koopman, H. Madeira. "Dependability Benchmarking & Prediction: A Grand Challenge Technology Problem", Proc. 1st IEEE Int. Workshop on Real-Time Mission-Critical Systems: Grand Challenge Problems, Phoenix, Arizona, USA, Nov. 1999.
2. M. Vieira and H. Madeira, “A Dependability Benchmark for OLTP Application
Environments”, Proc. 29th Int. Conf. on Very Large Data Bases (VLDB-03), Berlin,
Germany, 2003.
3. K. Buchacker and O. Tschaeche, “TPC Benchmark-c version 5.2 Dependability
Benchmark Extensions”, http://www.faumachine.org/papers/tpcc-depend.pdf, 2003.
4. A. Kalakech, K. Kanoun, Y. Crouzet and A. Arlat. “Benchmarking the Dependability
of Windows NT, 2000 and XP”, Proc. Int. Conf. on Dependable Systems and
Networks (DSN 2004), Florence, Italy, IEEE CS Press, 2004.
5. A. Brown, L. Chung, W. Kakes, C. Ling, D. A. Patterson, "Dependability
Benchmarking of Human-Assisted Recovery Processes", Dependable Computing and
Communications, DSN 2004, Florence, Italy, June, 2004

6. D. Wilson, B. Murphy and L. Spainhower. "Progress on Defining Standardized Classes for Comparing the Dependability of Computer Systems", Proc. DSN 2002 Workshop on Dependability Benchmarking, Washington, D.C., USA, 2002.
7. J. Zhu, J. Mauro, I. Pramanick. "R3 – A Framework for Availability Benchmarking", Proc. Int. Conf. on Dependable Systems and Networks (DSN 2003), USA, 2003.
8. J. Zhu, J. Mauro, and I. Pramanick. "Robustness Benchmarking for Hardware Maintenance Events", Proc. Int. Conf. on Dependable Systems and Networks (DSN 2003), pp. 115-122, San Francisco, CA, USA, IEEE CS Press, 2003.
9. J. Mauro, J. Zhu, I. Pramanick. “The System Recovery Benchmark,” in Proc. 2004
Pacific Rim Int. Symp. on Dependable Computing, Papeete, Polynesia, 2004.
10. S. Lightstone, J. Hellerstein, W. Tetzlaff, P. Janson, E. Lassettre, C. Norton, B. Rajaraman and L. Spainhower. "Towards Benchmarking Autonomic Computing Maturity", 1st IEEE Conf. on Industrial Informatics (INDIN 2003), Canada, August 2003.
11. J. Durães, M. Vieira and H. Madeira. "Dependability Benchmarking of Web-
Servers", Proc. 23rd International Conference, SAFECOMP 2004, Potsdam,
Germany, September 2004. Lecture Notes in Computer Science, Volume 3219/2004
12. S. Ghosh, A. P. Mathur. "Issues in Testing Distributed Component-Based Systems", 1st Int. ICSE Workshop on Testing Distributed Component-Based Systems, 1999.
13. H. Madeira, M. Zenha Rela, F. Moreira, and J. G. Silva. "RIFLE: A General Purpose Pin-level Fault Injector". In European Dependable Computing Conference, pages 199–216, 1994.
14. S. Han, K. Shin, and H. Rosenberg. "DOCTOR: An Integrated Software Fault Injection Environment for Distributed Real-Time Systems", Proc. Computer Performance and Dependability Symposium, Erlangen, Germany, 1995.
15. S. Dawson, F. Jahanian, and T. Mitton. "ORCHESTRA: A Fault Injection Environment for Distributed Systems". Proc. 26th International Symposium on Fault-Tolerant Computing (FTCS), pages 404–414, Sendai, Japan, June 1996.
16. D. T. Stott et al. "NFTAPE: A Framework for Assessing Dependability in Distributed Systems with Lightweight Fault Injectors". In Proceedings of the IEEE International Computer Performance and Dependability Symposium, pages 91–100, March 2000.
17. R. Chandra, R. M. Lefever, M. Cukier, and W. H. Sanders. "LOKI: A State-Driven Fault Injector for Distributed Systems". In Proc. of the Int. Conf. on Dependable Systems and Networks, June 2000.
18. X. Li, R. Martin, K. Nagaraja, T. Nguyen, B. Zhang. "Mendosus: A SAN-based Fault-Injection Test-Bed for the Construction of Highly Available Network Services", Proc. 1st Workshop on Novel Uses of System Area Networks (SAN-1), 2002.
19. N. Looker, J. Xu. "Assessing the Dependability of OGSA Middleware by Fault-Injection", Proc. 22nd Int. Symposium on Reliable Distributed Systems (SRDS), 2003.
20. http://www.lri.fr/~fci/GdX
21. S. Lumetta and D. Culler. “The Mantis parallel debugger”. In Proceedings of
SPDT’96: SIGMETRICS Symposium on Parallel and Distributed Tools, pages 118–
126, Philadelphia, Pennsylvania, May 1996.
22. G. Fedak, C. Germain, V. Néri, and F. Cappello. “XtremWeb: A generic global
computing system”. Proc. of IEEE Int. Symp. on Cluster Computing and the Grid,
2001.
23. I.Foster, C. Kesselman, J.M. Nick and S. Tuecke. “Grid Services for Distributed
System Integration”, IEEE Computer June 2002.
24. J. Kephart. “Research Challenges of Autonomic Computing”, Proc. ICSE05,
International Conference on Software Engineering, May 2005
