TY - JOUR
T1 - On the impact of serializing contention management on STM performance
AU - Heber, Tomer
AU - Hendler, Danny
AU - Suissa, Adi
N1 - Funding Information:
A preliminary version of this paper appeared in the proceedings of the 13th International Conference On Principles Of Distributed Systems, Nimes, France, 2009, pages 225–239. This work was supported in part by the Israel Science Foundation (grant number 1227/10).
PY - 2012/6/1
Y1 - 2012/6/1
AB - Transactional memory (TM) is an emerging concurrent programming abstraction. Numerous software-based transactional memory (STM) implementations have been developed in recent years. STM implementations must guarantee transaction atomicity and isolation. To ensure progress, an STM implementation must resolve transaction collisions by consulting a contention manager (CM). Recent work established that serializing contention management, a technique in which the execution of colliding transactions is serialized to eliminate repeat collisions, can dramatically improve STM performance in high-contention workloads. In low-contention and highly parallel workloads, however, excessive serialization of memory transactions may limit concurrency too much and hurt performance. It is therefore important to better understand how the impact of serialization on STM performance varies as a function of workload characteristics. We investigate how serializing CM influences the performance of STM systems. Specifically, we study serialization's influence on STM throughput (number of committed transactions per time unit) and efficiency (ratio between the extent of "useful" work done by the STM and work "wasted" by aborts) as the workload's level of contention changes. To this end, we implement CBench, a synthetic benchmark that generates workloads in which transactions have a pre-determined (parameterized) length and probability of being aborted in the absence of contention reduction mechanisms. CBench facilitates evaluating the efficiency of contention management algorithms across the full spectrum of contention levels. The characteristics of TM workloads generated by real applications may vary over time. To achieve good performance, CM algorithms need to monitor these characteristics and change their behavior accordingly. We implement adaptive algorithms that control the activation of serializing CM according to the measured contention level, based on a novel low-overhead serialization mechanism. We then evaluate our new algorithms on CBench-generated workloads and on additional well-known STM benchmark applications. Our results shed light on the manner in which serializing CM should be used by STM systems. We show that adaptive contention managers are susceptible to mode oscillations, in which serialization is repeatedly turned on and off, and that these oscillations hurt performance. We implement a simple stabilizing mechanism that solves this problem. We also compare the performance of local and global adaptive CM algorithms and demonstrate that local adaptive algorithms are superior for applications with asymmetric workloads.
KW - Conflict detection
KW - Contention management
KW - Transactional memory
UR - http://www.scopus.com/inward/record.url?scp=84859927476&partnerID=8YFLogxK
U2 - 10.1016/j.jpdc.2012.02.009
DO - 10.1016/j.jpdc.2012.02.009
M3 - Article
AN - SCOPUS:84859927476
SN - 0743-7315
VL - 72
SP - 739
EP - 750
JO - Journal of Parallel and Distributed Computing
JF - Journal of Parallel and Distributed Computing
IS - 6
ER -