TY - GEN
T1 - Robust Alignment via Partial Gromov-Wasserstein Distances
AU - Gong, Xiaoyun
AU - Nietert, Sloan
AU - Goldfeld, Ziv
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025/1/1
Y1 - 2025/1/1
N2 - The Gromov-Wasserstein (GW) problem provides a powerful framework for aligning heterogeneous datasets by matching their internal structures in a way that minimizes distortion. However, GW alignment is sensitive to data contamination by outliers, which can greatly distort the resulting matching scheme. To address this issue, we study robust GW alignment, where upon observing contaminated versions of the clean data distributions, our goal is to accurately estimate the GW alignment cost between the original (uncontaminated) measures. We propose an estimator based on the partial GW distance, which trims out a fraction of the mass from each distribution before optimally aligning the rest. The estimator is shown to be minimax optimal in the population setting and is near-optimal in the finite-sample regime, where the optimality gap originates only from the suboptimality of the plug-in estimator in the empirical estimation setting (i.e., without contamination). Towards the analysis, we derive new structural results pertaining to the approximate pseudo-metric structure of the partial GW distance. Overall, our results endow the partial GW distance with an operational meaning by posing it as a robust surrogate of the classical distance when the observed data may be contaminated.
UR - https://www.scopus.com/pages/publications/105021982826
U2 - 10.1109/ISIT63088.2025.11195327
DO - 10.1109/ISIT63088.2025.11195327
M3 - Conference contribution
AN - SCOPUS:105021982826
T3 - IEEE International Symposium on Information Theory - Proceedings
BT - ISIT 2025 - 2025 IEEE International Symposium on Information Theory, Proceedings
PB - Institute of Electrical and Electronics Engineers
T2 - 2025 IEEE International Symposium on Information Theory, ISIT 2025
Y2 - 22 June 2025 through 27 June 2025
ER -