TY - GEN
T1 - Self-stabilizing Uniform Reliable Broadcast
AU - Lundström, Oskar
AU - Raynal, Michel
AU - Schiller, Elad M.
N1 - Publisher Copyright: © 2021, Springer Nature Switzerland AG.
PY - 2021/1/1
Y1 - 2021/1/1
N2 - We study a well-known communication abstraction called Uniform Reliable Broadcast (URB). URB is central to the design and implementation of fault-tolerant distributed systems, as many non-trivial fault-tolerant distributed applications require communication with provable guarantees on message delivery. Our study focuses on fault-tolerant implementations for time-free message-passing systems that are prone to node failures. Moreover, we aim at the design of an even more robust communication abstraction. We do so through the lens of self-stabilization, a very strong notion of fault tolerance. In addition to node and communication failures, self-stabilizing algorithms can recover after the occurrence of arbitrary transient faults; these faults represent any violation of the assumptions according to which the system was designed to operate (as long as the algorithm code stays intact). We propose the first self-stabilizing URB algorithm for asynchronous (time-free) message-passing systems that are prone to node failures. The algorithm recovers from transient faults within O(bufferUnitSize) asynchronous cycles, where bufferUnitSize is a predefined constant. Also, the communication costs are similar to those of the non-self-stabilizing URB. The main differences are that our proposal considers repeated gossiping of O(1)-bit messages and deals with bounded space (which is a prerequisite for self-stabilization). Moreover, each node stores up to bufferUnitSize · n records of size O(ν + n log n) bits, where n is the number of nodes and ν is the number of bits needed to encode a single URB instance.
AB - We study a well-known communication abstraction called Uniform Reliable Broadcast (URB). URB is central to the design and implementation of fault-tolerant distributed systems, as many non-trivial fault-tolerant distributed applications require communication with provable guarantees on message delivery. Our study focuses on fault-tolerant implementations for time-free message-passing systems that are prone to node failures. Moreover, we aim at the design of an even more robust communication abstraction. We do so through the lens of self-stabilization, a very strong notion of fault tolerance. In addition to node and communication failures, self-stabilizing algorithms can recover after the occurrence of arbitrary transient faults; these faults represent any violation of the assumptions according to which the system was designed to operate (as long as the algorithm code stays intact). We propose the first self-stabilizing URB algorithm for asynchronous (time-free) message-passing systems that are prone to node failures. The algorithm recovers from transient faults within O(bufferUnitSize) asynchronous cycles, where bufferUnitSize is a predefined constant. Also, the communication costs are similar to those of the non-self-stabilizing URB. The main differences are that our proposal considers repeated gossiping of O(1)-bit messages and deals with bounded space (which is a prerequisite for self-stabilization). Moreover, each node stores up to bufferUnitSize · n records of size O(ν + n log n) bits, where n is the number of nodes and ν is the number of bits needed to encode a single URB instance.
UR - http://www.scopus.com/inward/record.url?scp=85101522093&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-67087-0_19
DO - 10.1007/978-3-030-67087-0_19
M3 - Conference contribution
AN - SCOPUS:85101522093
SN - 9783030670863
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 296
EP - 313
BT - Networked Systems - 8th International Conference, NETYS 2020, Proceedings
A2 - Georgiou, Chryssis
A2 - Majumdar, Rupak
PB - Springer Science and Business Media Deutschland GmbH
T2 - 8th International Conference on Networked Systems, NETYS 2020
Y2 - 3 June 2020 through 5 June 2020
ER -