TY - GEN
T1 - Scalable flat-combining based synchronous queues
AU - Hendler, Danny
AU - Incze, Itai
AU - Shavit, Nir
AU - Tzafrir, Moran
PY - 2010/12/13
Y1 - 2010/12/13
N2 - In a synchronous queue, producers and consumers handshake to exchange data. Recently, new scalable unfair synchronous queues were added to the Java JDK 6.0 to support high-performance thread pools. This paper applies flat combining to the problem of designing a synchronous queue algorithm. We first use the original flat-combining algorithm, in which a single "combiner" thread acquires a global lock and services the other threads' combined requests with very low synchronization overhead. As we show, this single-combiner approach delivers superior performance up to a certain level of concurrency, but does not continue to scale beyond that point. To continue delivering scalable performance as concurrency increases, we introduce a new parallel flat-combining algorithm. The new algorithm dynamically adds concurrently executing flat combiners that coordinate their work. It enjoys the low coordination overhead of sequential flat combining, with the added scalability that comes with parallelism. Our novel unfair synchronous queue using parallel flat combining scales far beyond the JDK 6.0 algorithm: it matches that algorithm's performance with a single producer and consumer, and is superior throughout the concurrency range, delivering up to 11 times the throughput at high concurrency.
AB - In a synchronous queue, producers and consumers handshake to exchange data. Recently, new scalable unfair synchronous queues were added to the Java JDK 6.0 to support high-performance thread pools. This paper applies flat combining to the problem of designing a synchronous queue algorithm. We first use the original flat-combining algorithm, in which a single "combiner" thread acquires a global lock and services the other threads' combined requests with very low synchronization overhead. As we show, this single-combiner approach delivers superior performance up to a certain level of concurrency, but does not continue to scale beyond that point. To continue delivering scalable performance as concurrency increases, we introduce a new parallel flat-combining algorithm. The new algorithm dynamically adds concurrently executing flat combiners that coordinate their work. It enjoys the low coordination overhead of sequential flat combining, with the added scalability that comes with parallelism. Our novel unfair synchronous queue using parallel flat combining scales far beyond the JDK 6.0 algorithm: it matches that algorithm's performance with a single producer and consumer, and is superior throughout the concurrency range, delivering up to 11 times the throughput at high concurrency.
UR - http://www.scopus.com/inward/record.url?scp=78649807456&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-15763-9_8
DO - 10.1007/978-3-642-15763-9_8
M3 - Conference contribution
AN - SCOPUS:78649807456
SN - 3642157629
SN - 9783642157622
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 79
EP - 93
BT - Distributed Computing - 24th International Symposium, DISC 2010, Proceedings
T2 - 24th International Symposium on Distributed Computing, DISC 2010
Y2 - 13 September 2010 through 15 September 2010
ER -