TY - GEN

T1 - Massively parallel computations of the LZ-complexity of strings

AU - Belousov, Alexander

AU - Ratsaby, Joel

N1 - Publisher Copyright:
© Copyright 2015 IEEE All rights reserved.

PY - 2014/1/1

Y1 - 2014/1/1

N2 - We introduce a new parallel algorithm LZMP for computing the Lempel-Ziv complexity of a string of characters from a finite alphabet. The LZ-complexity is a mathematical quantity that is related to the amount of non-redundant information that a string contains. It has been recently shown to be useful in pattern recognition in measuring the distance between various kinds of input patterns [1],[2]. We implement the algorithm on a software parallel-computing platform (CUDA) which works with a Graphical Processing Unit (GPU) on a TESLA K20c board. The GPU has 2, 496 computing cores and 5G bytes of global memory. This platform enables high increase in computing performance. We exploit multiple places in the sequential version of the code and use CUDA to implement these code sections to run in parallel on the GPU cores. For example, the string XYZMXZXYZKR has a complexity of 7. The sequential version of the algorithm takes 27 time steps to compute this while the parallel version takes only 11 time steps. The advantage of the parallel implementation over the sequential one becomes more relevant as the string's length increases. Our results show that for single strings of length up to 5k bytes there is no significant improvement but for strings of length greater than 5k the speedup factor grows at a linear rate with respect to the string length. When we compute the LZ-complexity of multiple strings all of length 48k bytes in a parallel our results indicate that the speedup ratio is approximately 100. This is advantageous for pattern recognition where the object to be recognized can be represented as a collection of substrings and LZ-complexity based distance ca be computed to each of them in parallel.

AB - We introduce a new parallel algorithm LZMP for computing the Lempel-Ziv complexity of a string of characters from a finite alphabet. The LZ-complexity is a mathematical quantity that is related to the amount of non-redundant information that a string contains. It has been recently shown to be useful in pattern recognition in measuring the distance between various kinds of input patterns [1],[2]. We implement the algorithm on a software parallel-computing platform (CUDA) which works with a Graphical Processing Unit (GPU) on a TESLA K20c board. The GPU has 2, 496 computing cores and 5G bytes of global memory. This platform enables high increase in computing performance. We exploit multiple places in the sequential version of the code and use CUDA to implement these code sections to run in parallel on the GPU cores. For example, the string XYZMXZXYZKR has a complexity of 7. The sequential version of the algorithm takes 27 time steps to compute this while the parallel version takes only 11 time steps. The advantage of the parallel implementation over the sequential one becomes more relevant as the string's length increases. Our results show that for single strings of length up to 5k bytes there is no significant improvement but for strings of length greater than 5k the speedup factor grows at a linear rate with respect to the string length. When we compute the LZ-complexity of multiple strings all of length 48k bytes in a parallel our results indicate that the speedup ratio is approximately 100. This is advantageous for pattern recognition where the object to be recognized can be represented as a collection of substrings and LZ-complexity based distance ca be computed to each of them in parallel.

UR - http://www.scopus.com/inward/record.url?scp=84941242055&partnerID=8YFLogxK

U2 - 10.1109/EEEI.2014.7005885

DO - 10.1109/EEEI.2014.7005885

M3 - Conference contribution

AN - SCOPUS:84941242055

T3 - 2014 IEEE 28th Convention of Electrical and Electronics Engineers in Israel, IEEEI 2014

BT - 2014 IEEE 28th Convention of Electrical and Electronics Engineers in Israel, IEEEI 2014

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 2014 28th IEEE Convention of Electrical and Electronics Engineers in Israel, IEEEI 2014

Y2 - 3 December 2014 through 5 December 2014

ER -