## Abstract

In this paper we consider the p-Norm Hamming Centroid problem, which asks whether some given strings have a centroid with a bound on the p-norm of its Hamming distances to the strings. Specifically, given a set $S$ of strings and a real $k$, we consider the problem of determining whether there exists a string $s^{*}$ with $\left(\sum_{s \in S} d^{p}(s^{*}, s)\right)^{1/p} \le k$, where $d(\cdot,\cdot)$ denotes the Hamming distance metric. This problem has important applications in data clustering and multi-winner committee elections, and is a generalization of the well-known polynomial-time solvable Consensus String ($p = 1$) problem, as well as the NP-hard Closest String ($p = \infty$) problem. Our main result shows that the problem is NP-hard for all fixed rational $p > 1$, closing the gap for all rational values of $p$ between $1$ and $\infty$. Under standard complexity assumptions the reduction also implies that the problem has no $2^{o(n+m)}$-time or $2^{o(k^{p/(p+1)})}$-time algorithm, where $m$ denotes the number of input strings and $n$ denotes the length of each string, for any fixed $p > 1$. The first bound matches a straightforward brute-force algorithm. The second bound is tight in the sense that for each fixed $\varepsilon > 0$, we provide a $2^{k^{p/(p+1)+\varepsilon}}$-time algorithm. In the last part of the paper, we complement our hardness result by presenting a fixed-parameter algorithm and a factor-2 approximation algorithm for the problem.
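To make the problem statement concrete, the following is a minimal Python sketch of the objective and of an exhaustive search, assuming all input strings have the same length. The function names (`hamming`, `p_norm_cost`, `brute_force_centroid`, `has_centroid`) are hypothetical and chosen for illustration. This is not the paper's algorithm, only a direct reading of the definition. It restricts each position of the candidate to characters that occur at that position in the input, which is safe because a character occurring nowhere in a column mismatches every input string there.

```python
from itertools import product
from math import inf


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def p_norm_cost(candidate: str, strings: list[str], p: float) -> float:
    """p-norm of the vector of Hamming distances from candidate to each string."""
    dists = [hamming(candidate, s) for s in strings]
    if p == inf:
        # p = infinity corresponds to the Closest String objective (max distance).
        return float(max(dists, default=0))
    return sum(d ** p for d in dists) ** (1.0 / p)


def brute_force_centroid(strings: list[str], p: float) -> tuple[str, float]:
    """Try every candidate built from per-column characters; exponential in n."""
    columns = [sorted(set(col)) for col in zip(*strings)]
    best, best_cost = "", inf
    for chars in product(*columns):
        cand = "".join(chars)
        cost = p_norm_cost(cand, strings, p)
        if cost < best_cost:
            best, best_cost = cand, cost
    return best, best_cost


def has_centroid(strings: list[str], p: float, k: float) -> bool:
    """Decision version: is there a string whose p-norm cost is at most k?"""
    # Floating-point comparison; a tolerance may be needed when the cost equals k.
    return brute_force_centroid(strings, p)[1] <= k
```

For the two strings `000` and `111`, every candidate costs 3 under $p = 1$, while under $p = 2$ a candidate at distances 1 and 2 costs $\sqrt{5}$ and beats either input string, which costs 3. This illustrates how larger $p$ favours balanced distances.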

| Field | Value |
|---|---|
| Original language | English GB |
| Number of pages | 22 |
| Journal | Leibniz International Proceedings in Informatics, LIPIcs |
| Volume | 144 |
| State | Published - 1 Sep 2019 |
| Event | 27th Annual European Symposium on Algorithms, ESA 2019 - Munich/Garching, Germany. Duration: 9 Sep 2019 → 11 Sep 2019 |

## Keywords

- Clustering
- Hamming distance
- Multiwinner election
- Strings

## ASJC Scopus subject areas

- Software