When consensus meets self-stabilization: Self-stabilizing failure-detector, consensus and replicated state-machine

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    7 Scopus citations

    Abstract

    This paper presents a self-stabilizing failure detector, asynchronous consensus and replicated state-machine algorithm suite, the components of which can be started in an arbitrary state and converge to act as a virtual state-machine. Self-stabilizing algorithms can cope with transient faults. Transient faults can alter the system state to an arbitrary state and hence cause a temporary violation of the safety property of the consensus. New requirements for consensus that fit the on-going nature of self-stabilizing algorithms are presented. The wait-free consensus (and the replicated state-machine) algorithm presented is a classic combination of a failure detector and a (memory bounded) rotating coordinator consensus that satisfies both eventual safety and eventual liveness. Several new techniques and paradigms are introduced. The bounded memory failure detector abstracts away synchronization assumptions using bounded heartbeat counters combined with a balance-unbalance mechanism. The practically infinite paradigm is introduced in the scope of self-stabilization, where an execution of, say, 2^64 sequential steps is regarded as (practically) infinite. Finally, we present the first self-stabilizing wait-free reset mechanism that ensures eventual safety and can be used in other scopes.
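
    The "practically infinite" idea in the abstract can be made concrete with a little arithmetic. The sketch below estimates how long 2^64 sequential steps would take. The rate of one step per nanosecond is an illustrative assumption, not a figure from the paper, and the names `STEPS`, `STEP_SECONDS` and `years_to_exhaust` are hypothetical.

```python
# Rough estimate of how long 2^64 sequential steps would take.
# ASSUMPTION (not from the paper): each step takes one nanosecond.

STEPS = 2 ** 64                        # step count named in the abstract
STEP_SECONDS = 1e-9                    # assumed duration of one step
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, in seconds


def years_to_exhaust(steps: int = STEPS, step_seconds: float = STEP_SECONDS) -> float:
    """Return the number of years needed to execute `steps` sequential steps."""
    return steps * step_seconds / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(f"{years_to_exhaust():.1f} years")
```

    Under that assumption the run lasts several centuries, which is why a counter of this size can plausibly be treated as never wrapping around during the lifetime of a system.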

    Original language: English
    Title of host publication: Principles of Distributed Systems - 10th International Conference, OPODIS 2006, Proceedings
    Publisher: Springer Verlag
    Pages: 45-63
    Number of pages: 19
    ISBN (Print): 9783540499909
    State: Published - 1 Jan 2006
    Event: 10th International Conference on Principles of Distributed Systems, OPODIS 2006 - Bordeaux, France
    Duration: 12 Dec 2006 to 15 Dec 2006

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 4305 LNCS
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Conference

    Conference: 10th International Conference on Principles of Distributed Systems, OPODIS 2006
    Country/Territory: France
    City: Bordeaux
    Period: 12/12/06 to 15/12/06

    Keywords

    • Consensus
    • Distributed Reset
    • Failure Detector
    • Self-Stabilization
    • State-Machine
    • Wait-Free

    ASJC Scopus subject areas

    • Theoretical Computer Science
    • General Computer Science
