Abstract
In this paper, we investigate the problem of deciding whether two standard normal random vectors X ∈ ℝⁿ and Y ∈ ℝⁿ are correlated or not. This is formulated as a hypothesis testing problem: under the null hypothesis, the two vectors are statistically independent, while under the alternative, X and a uniformly random permutation of Y are correlated with correlation ρ. We analyze the thresholds at which optimal testing is information-theoretically impossible and possible, as functions of n and ρ. To derive our information-theoretic lower bounds, we develop a novel technique for evaluating the second moment of the likelihood ratio using an orthogonal polynomial expansion, which, among other things, reveals a surprising connection to integer partition functions. We also study a multi-dimensional generalization of this setting in which, rather than two vectors, we observe two databases (matrices), and we further allow for partial correlations between the two.
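To make the two hypotheses concrete, the following is a minimal sketch of how one sample pair could be generated under each of them. It is an illustration under stated assumptions, not the authors' code: the function name `sample_pair`, its parameters, and the use of Python's standard `random` module are all hypothetical choices made here. The correlated pair is built with the standard construction Y = ρX + √(1 − ρ²)Z, where Z is an independent standard normal, and the alignment is then hidden by a uniformly random permutation.

```python
import math
import random


def sample_pair(n, rho, correlated, rng):
    """Draw (X, Y) under the null (correlated=False) or the alternative (correlated=True).

    Hypothetical helper for illustration only; names and signature are assumptions.
    """
    # X always has i.i.d. standard normal entries.
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]

    if not correlated:
        # Null hypothesis: Y is drawn independently of X.
        y = [rng.gauss(0.0, 1.0) for _ in range(n)]
        return x, y

    # Alternative: build entries that are standard normal and have
    # correlation rho with the matching entry of X.
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    scale = math.sqrt(1.0 - rho * rho)
    y_aligned = [rho * xi + scale * zi for xi, zi in zip(x, z)]

    # Hide the alignment with a uniformly random permutation of the entries.
    perm = list(range(n))
    rng.shuffle(perm)
    y = [y_aligned[perm[i]] for i in range(n)]
    return x, y


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is reproducible
    x0, y0 = sample_pair(5, 0.9, correlated=False, rng=rng)
    x1, y1 = sample_pair(5, 0.9, correlated=True, rng=rng)
    print("null:       ", x0, y0)
    print("alternative:", x1, y1)
```

In both cases each vector is marginally standard normal, so a test cannot rely on the marginals alone; any information lies in the joint structure that the permutation obscures.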
Original language | English |
---|---|
Pages (from-to) | 8942-8960 |
Number of pages | 19 |
Journal | IEEE Transactions on Information Theory |
Volume | 70 |
Issue number | 12 |
State | Published - 1 Jan 2024 |
Keywords
- Hypothesis testing
- integer partitions
- planted structure
- random permutations
ASJC Scopus subject areas
- Information Systems
- Computer Science Applications
- Library and Information Sciences