TY - GEN
T1 - An adaptive technique for constructing robust and high-throughput shared objects
AU - Hendler, Danny
AU - Kutten, Shay
AU - Michalak, Erez
PY - 2010/12/1
Y1 - 2010/12/1
N2 - Shared counters are the key to solving a variety of coordination problems on multiprocessor machines, such as barrier synchronization and index distribution. It is desired that they, like shared objects in general, be robust, linearizable and scalable. We present the first linearizable and wait-free shared counter algorithm that achieves high throughput without a-priori knowledge about the system's level of asynchrony. Our algorithm can be easily adapted to any other combinable objects as well, such as stacks and queues. In particular, in an N-process execution E, our algorithm achieves high throughput of Ω(N/φE2 log2 φE log N), where φE is E's level of asynchrony. Moreover, our algorithm stands any constant number of faults. If E contains a constant number of faults, then our algorithm still achieves high throughput of Ω(φ′E2 log2 φ′E log N), where φ′E bounds the relative speeds of any two processes, at a time that both of them participated in E and none of them failed. Our algorithm can be viewed as an adaptive version of the Bounded-Wait-Combining (BWC) prior art algorithm. BWC receives as an input an argument φ as a (supposed) upper bound of φE, and achieves optimal throughput if φ = φE. However, if the given φ happens to be lower than the actual φE, or much greater than φE, then the throughput of BWC degraded significantly. Moreover, whereas BWC is only lock-free, our algorithm is more robust, since it is wait-free. To achieve high throughput and wait-freedom, we present a method that guarantees (for some common kind of procedures) the procedure's successful termination in a bounded time, regardless of shared memory contention. This method may prove useful by itself, for other problems.
AB - Shared counters are the key to solving a variety of coordination problems on multiprocessor machines, such as barrier synchronization and index distribution. It is desired that they, like shared objects in general, be robust, linearizable and scalable. We present the first linearizable and wait-free shared counter algorithm that achieves high throughput without a-priori knowledge about the system's level of asynchrony. Our algorithm can be easily adapted to any other combinable objects as well, such as stacks and queues. In particular, in an N-process execution E, our algorithm achieves high throughput of Ω(N/φE2 log2 φE log N), where φE is E's level of asynchrony. Moreover, our algorithm stands any constant number of faults. If E contains a constant number of faults, then our algorithm still achieves high throughput of Ω(φ′E2 log2 φ′E log N), where φ′E bounds the relative speeds of any two processes, at a time that both of them participated in E and none of them failed. Our algorithm can be viewed as an adaptive version of the Bounded-Wait-Combining (BWC) prior art algorithm. BWC receives as an input an argument φ as a (supposed) upper bound of φE, and achieves optimal throughput if φ = φE. However, if the given φ happens to be lower than the actual φE, or much greater than φE, then the throughput of BWC degraded significantly. Moreover, whereas BWC is only lock-free, our algorithm is more robust, since it is wait-free. To achieve high throughput and wait-freedom, we present a method that guarantees (for some common kind of procedures) the procedure's successful termination in a bounded time, regardless of shared memory contention. This method may prove useful by itself, for other problems.
UR - http://www.scopus.com/inward/record.url?scp=78650872731&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-17653-1_24
DO - 10.1007/978-3-642-17653-1_24
M3 - Conference contribution
AN - SCOPUS:78650872731
SN - 3642176526
SN - 9783642176524
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 318
EP - 332
BT - Principles of Distributed Systems - 14th International Conference, OPODIS 2010, Proceedings
T2 - 14th International Conference on Principles of Distributed Systems, OPODIS 2010
Y2 - 14 December 2010 through 17 December 2010
ER -