TY - GEN
T1 - Unimodal Strategies in Density-Based Clustering
AU - Nir, Oron
AU - Tenenbaum, Jay
AU - Shamir, Ariel
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
PY - 2026/1/1
Y1 - 2026/1/1
N2 - Density-based clustering methods often surpass centroid-based counterparts, when addressing data with noise or arbitrary data distributions common in real-world problems. In this study, we reveal a key property intrinsic to density-based clustering methods regarding the relation between the number of clusters and the neighborhood radius of core points—we empirically show that it is nearly unimodal, and support this claim theoretically in a specific setting. We leverage this property to devise new strategies for finding appropriate values for the radius more efficiently based on the Ternary Search algorithm. This is especially important for large scale data that is high-dimensional, where parameter tuning is computationally intensive. We validate our methodology through extensive applications across a range of high-dimensional, large-scale NLP, Audio, and Computer Vision tasks, demonstrating its practical effectiveness and robustness. This work not only offers a significant advancement in parameter control for density-based clustering but also broadens the understanding regarding the relations between their guiding parameters. Our code is available at https://github.com/oronnir/UnimodalStrategies.
AB - Density-based clustering methods often surpass centroid-based counterparts, when addressing data with noise or arbitrary data distributions common in real-world problems. In this study, we reveal a key property intrinsic to density-based clustering methods regarding the relation between the number of clusters and the neighborhood radius of core points—we empirically show that it is nearly unimodal, and support this claim theoretically in a specific setting. We leverage this property to devise new strategies for finding appropriate values for the radius more efficiently based on the Ternary Search algorithm. This is especially important for large scale data that is high-dimensional, where parameter tuning is computationally intensive. We validate our methodology through extensive applications across a range of high-dimensional, large-scale NLP, Audio, and Computer Vision tasks, demonstrating its practical effectiveness and robustness. This work not only offers a significant advancement in parameter control for density-based clustering but also broadens the understanding regarding the relations between their guiding parameters. Our code is available at https://github.com/oronnir/UnimodalStrategies.
KW - Density-based clustering
KW - Efficient parameter search
UR - https://www.scopus.com/pages/publications/105020011577
U2 - 10.1007/978-3-032-05962-8_29
DO - 10.1007/978-3-032-05962-8_29
M3 - Conference contribution
AN - SCOPUS:105020011577
SN - 9783032059611
T3 - Lecture Notes in Computer Science
SP - 498
EP - 513
BT - Machine Learning and Knowledge Discovery in Databases. Research Track - European Conference, ECML PKDD 2025, Proceedings
A2 - Ribeiro, Rita P.
A2 - Jorge, Alípio M.
A2 - Soares, Carlos
A2 - Gama, João
A2 - Pfahringer, Bernhard
A2 - Japkowicz, Nathalie
A2 - Larrañaga, Pedro
A2 - Abreu, Pedro H.
PB - Springer Science and Business Media Deutschland GmbH
T2 - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2025
Y2 - 15 September 2025 through 19 September 2025
ER -