Xavier Défago, École Polytechnique Fédérale de Lausanne
Total order broadcast and multicast (also called atomic broadcast/multicast) present an important problem in distributed systems, especially with respect to fault-tolerance. In short, the primitive...
Rami Yared, Xavier Défago, Matthias Wiesmann
prevention using group communication for asynchronous cooperative
Unreliable Compasses for Robust Gathering of Asynchronous Mobile Robots (2008)
Samia Souissi, Xavier Défago, Masafumi Yamashita
Making a set of mobile robots behave as a coherent system is a fundamental question in distributed robotic systems. This problem is often illustrated...
Definition and properties of accrual failure detectors: an overview (2008)
Xavier Défago, Naohiro Hayashibara, Péter Urbán, Rami Yared, Takuya Katayama
Ensuring dependability in distributed systems is difficult to achieve efficiently. In particular, it requires the ability to detect process failures in a manner...
Other Systems General Terms Algorithms, Theory (2008)
Xavier Défago, Akihiko Konagaya
This paper proposes a distributed algorithm by which a collection of mobile robots roaming on a plane move to form a circle. The algorithm operates under the premises that robots (1) are unable to...
Total order broadcast and multicast (also called atomic broadcast/multicast) present an important problem in distributed systems, especially with respect to fault-tolerance. In short, the primitive...
Reliability with CORBA Event Channels (2008)
Xavier Défago, Pascal Felber, Benoît Garbinato, Rachid Guerraoui
Several application domains such as finance, process control, and telecommunications, have strong reliability requirements. Typically, such applications tend to avoid having a single point of...
Other Systems General Terms Algorithms, Theory (2007)
Xavier Défago, Akihiko Konagaya
This paper proposes a distributed algorithm by which a collection of mobile robots roaming on a plane move to form a circle. The algorithm operates under the premises that robots (1) are unable to...
Péter Urbán, Xavier Défago, André Schiper
Fault tolerance can be achieved in distributed systems by replication. However, Fischer, Lynch and Paterson have proven an impossibility result about consensus in the asynchronous system model....
Semi-passive replication and Lazy Consensus (2006)
Défago, Xavier, Schiper, André
This paper presents two main contributions: semi-passive replication and Lazy Consensus. The former is a replication technique with parsimonious processing. It is based on the latter; a variant of...
Fault-tolerant and Self-stabilizing Mobile Robots Gathering - Feasibility Study - (2006)
Défago, Xavier, Gradinariu, Maria, Messika, Stéphane, Raïpin-Parvédy, Philippe
Gathering is a fundamental coordination problem in cooperative mobile robotics. In short, given a set of robots with arbitrary initial location and no initial agreement on a global coordinate system,...
An SNMP based failure detection service (2006)
Matthias Wiesmann, Péter Urbán, Xavier Défago
In this paper, we present the SNMP-FD service, a novel failure detection service entirely based on the Simple Network Management Protocol (SNMP). This approach promises better interoperability with...
An SNMP based failure detection service (2006)
Matthias Wiesmann, Péter Urbán, Xavier Défago
In this paper, we present the SNMP-FD system. This system is a novel failure detection service entirely based on the SNMP standard. The advantage of this approach is better interoperability, and the...
Anonymous Stabilizing Leader Election using a Network Sequencer (2006)
Matthias Wiesmann, Xavier Défago
In this paper, we present an anonymous, stable, communication efficient, stabilizing leader election algorithm that works using anonymous communication primitives. The algorithm offers properties...
Agreement-related problems: from semi-passive replication to totally ordered broadcast (2005)
Agreement problems constitute a fundamental class of problems in the context of distributed systems. All agreement problems follow a common pattern: all processes must agree on some common decision,...
Fault-tolerant group membership protocols using physical robot messengers (2005)
Rami Yared, Xavier Défago, Takuya Katayama
In this paper, we consider a distributed system that consists of a group of teams of worker robots that rely on physical robot messengers for the communication between the teams. Unlike traditional...
A Sowing Routing Protocol for Dense Mobile Ad-Hoc Networks (2005)
Julien Cartigny, Xavier Défago
To reduce the number of control messages in dense ad-hoc networks, some protocols reduce the redundancy (or overlap) between communication radii to limit the routing overhead. In this...
Definition and specification of accrual failure detectors (2005)
Xavier Défago, Péter Urbán, Naohiro Hayashibara, Takuya Katayama
For many years, people have been advocating the development of failure detection as a basic service, but, unfortunately, without meeting much success so far. We believe that this comes from the fact...
Samia Souissi, Xavier Défago, Masafumi Yamashita
Reaching agreement among a set of mobile robots is one of the most fundamental issues in distributed robotic systems. This problem is often illustrated by the gathering problem, where the robots must...
Total Order Broadcast and Multicast Algorithms: Taxonomy and Survey (2004)
Défago, Xavier, Schiper, André, Urbán, Péter
Total order broadcast and multicast (also called atomic broadcast/multicast) present an important problem in distributed systems, especially with respect to fault-tolerance. In short, the primitive...
Semi-passive replication and Lazy Consensus (2004)
Défago, Xavier, Schiper, André
This paper presents two main contributions: semi-passive replication and Lazy Consensus. The former is a replication technique with parsimonious processing. It is based on the latter; a variant of...
The ϕ accrual failure detector (2004)
Xavier Défago, Péter Urbán, Naohiro Hayashibara, Takuya Katayama
Traditionally, failure detectors have considered a binary model whereby a given process can be either trusted or suspected. This paper defines a family of failure detectors, called accrual failure...
Total order broadcast and multicast algorithms: Taxonomy and survey (2004)
Xavier Défago, André Schiper, Péter Urbán
Total order multicast algorithms constitute an important class of problems in distributed systems, especially in the context of fault-tolerance. In short, the problem of total order multicast...
Total order broadcast and multicast algorithms: Taxonomy and survey (2004)
Xavier Défago, André Schiper, Péter Urbán
"Information and Systems," PRESTO,
Flexible failure detection with κ-fd (2004)
Naohiro Hayashibara, Xavier Défago, Takuya Katayama
Many people rightly consider that failure detection should be provided as a generic distributed system service to be used to support fault-tolerance within many applications, in spite of possibly...
The ϕ accrual failure detector (2004)
Xavier Défago, Naohiro Hayashibara, Rami Yared, ...
Detecting failures is a fundamental issue for fault-tolerance in distributed systems. Recently, many people have come to realize that failure detection ought to be provided as some form of generic...
Comparative performance analysis of ordering strategies in atomic broadcast algorithms (2003)
Défago, Xavier, Schiper, André, Urbán, Péter
In this paper, we present the results of a comparative analysis of...
Total Order Broadcast and Multicast Algorithms: Taxonomy and Survey (2003)
Défago, Xavier, Schiper, André, Urbán, Péter
Total order broadcast and multicast (also called atomic broadcast/multicast) is an important...
On the Design of a Failure Detection Service for Large-Scale Distributed Systems (2003)
Xavier Défago, Naohiro Hayashibara, Takuya Katayama
It is widely recognized that distributed systems would greatly benefit from the availability of a generic failure detection service. There are however several issues that must be addressed before...
Group Communication based on Standard Interfaces (2003)
Matthias Wiesmann, Xavier Défago, André Schiper
While group communication systems have been proposed for some time, they are still not used much in actual systems. We believe that one reason for this is the lack of standardisation of group...
Implementation and Performance Analysis of the ϕ-Failure Detector (2003)
Naohiro Hayashibara, Xavier Défago, Takuya Katayama
Failure detection is a fundamental building block for ensuring fault tolerance in distributed systems. However, providing accurate and flexible failure detection in off-the-shelf distributed systems...
Neko: A Single Environment to Simulate and Prototype Distributed Algorithms (2002)
Urbán, Péter, Défago, Xavier, Schiper, André
Designing, tuning, and analyzing the performance of distributed algorithms and...
Specification of Replication Techniques, Semi-Passive Replication, and Lazy Consensus (2002)
Défago, Xavier, Schiper, André
This paper brings the following three main contributions: a hierarchy of specifications for replication techniques, semi-passive replication, and Lazy Consensus. Based on the definition of the...
Specification of replication techniques, semi-passive replication, and lazy consensus (2002)
Replication is one of the main techniques for building reliable applications in a distributed environment. Although the question has been extensively studied for several decades, some fundamental...
Bernadette Charron-Bost, Xavier Défago, André Schiper
This paper investigates the two main and seemingly antagonistic approaches to reliably broadcasting messages in fault-tolerant distributed systems: the approach based on Reliable Broadcast, and the...
Specification of replication techniques, semi-passive replication, and lazy consensus (2002)
This paper brings the following three main contributions: a hierarchy of specifications for replication techniques, semi-passive replication, and Lazy Consensus. Based on the definition of the...
Broadcasting Messages in Fault-Tolerant Distributed Systems: the benefit of (2002)
Bernadette Charron-Bost, Xavier Défago, André Schiper
This paper investigates the two main and seemingly antagonistic approaches to broadcasting messages in fault-tolerant distributed systems: the approach based on Reliable Broadcast, and the one based...
Time vs. Space in Fault-Tolerant Distributed Systems (2001)
Charron-Bost, Bernadette, Défago, Xavier, Schiper, André
Algorithms for solving agreement problems can be classified in two categories: (1) those relying on failure detectors that we call FD-based, and (2) those that rely on a Group Membership...
Neko: A Single Environment to Simulate and Prototype Distributed Algorithms (2001)
Urbán, Péter, Défago, Xavier, Schiper, André
Designing, tuning, and analyzing the performance of distributed algorithms and...
Neko: A single environment to simulate and prototype distributed algorithms (2001)
Péter Urbán, Xavier Défago, André Schiper
Designing, tuning, and analyzing the performance of distributed algorithms and protocols are complex tasks. A major factor that contributes to this complexity is the fact that there is no single...
Time vs. space in fault-tolerant distributed systems (2001)
Bernadette Charron-Bost, Xavier Défago, André Schiper
Algorithms for solving agreement problems can be classified in two categories: (1) those relying on failure detectors that we call FD-based, and (2) those that rely on a Group Membership Service that...
Agreement-related problems: from semi-passive replication to totally ordered broadcast (2000)
Agreement problems constitute a fundamental class of problems in the context of distributed systems. All agreement problems follow a common pattern: all processes must agree on some common decision,...
Totally Ordered Broadcast and Multicast Algorithms: A Comprehensive Survey (2000)
Défago, Xavier, Schiper, André, Urbán, Péter
Total order multicast algorithms constitute an important class of problems in distributed systems, especially in the context of fault-tolerance. In short, the problem of total order multicast...
Contention-aware metrics: analysis of distributed algorithms (2000)
Urbán, Péter, Défago, Xavier, Schiper, André
Resource contention is widely recognized as having a major impact on the performance of distributed algorithms. Nevertheless, the metrics that are commonly used to predict their performance take...
Contention-aware metrics: Analysis of distributed algorithms (2000)
Péter Urbán, Xavier Défago, André Schiper
Resource contention is widely recognized as having a major impact on the performance of distributed algorithms. Nevertheless, the metrics that are commonly used to predict their performance take...
Agreement-Related Problems: From Semi-Passive Replication To Totally Ordered Broadcast (2000)
Xavier Défago
Agreement problems constitute a fundamental class of problems in the context of distributed systems. All agreement problems follow a common pattern: all processes must agree on some common decision,...
Contention-Aware Metrics: Analysis of Distributed Algorithms (2000)
Péter Urbán, Xavier Défago, André Schiper
Resource contention is widely recognized as having a major impact on the performance of distributed algorithms. Nevertheless, the metrics that are commonly used to predict their performance take...
Replicating CORBA objects: A marriage between active and passive replication (1999)
Pascal Felber, Xavier Défago, Patrick Eugster, André Schiper
Replication is a key mechanism for developing fault-tolerant and highly available applications. In this paper, we present a replication framework for replicating CORBA objects that combines...
Failure detectors as first class objects (1999)
Pascal Felber, Xavier Défago, Rachid Guerraoui
One of the fundamental differences between a centralized system and a distributed one is the notion of partial failures. The ability to efficiently and accurately detect failures is a key element...
Failure detectors: Implementation issues and impact on consensus performance (1999)
Nicole Sergent, Xavier Défago, André Schiper
Due to their nature, distributed systems are vulnerable to failures of some of their parts. Conversely, distribution also provides a way to increase the fault tolerance of the overall system....
Semi-passive replication (1998)
Xavier Défago, André Schiper, Nicole Sergent
This paper presents the semi-passive replication technique – a variant of passive replication – that can be implemented in the asynchronous system model without requiring a membership service to...
Highly available trading system: Experiments with CORBA (1998)
Xavier Défago, Karim R. Mazouni, André Schiper
The Swiss Exchange system (SWX system) was the first stock exchange system in service to be fully computerised. For high availability, the trading system is built as a replicated service based on...