TY - GEN
T1 - The Communication-Aware Clustered Federated Learning Problem
AU - Shlezinger, Nir
AU - Rini, Stefano
AU - Eldar, Yonina C.
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/6/1
Y1 - 2020/6/1
N2 - Federated learning (FL) refers to the adaptation of a central model based on data sets available at multiple remote users. Two of the common challenges encountered in FL are the fact that training sets obtained by different users are commonly heterogeneous, i.e., arise from different sample distributions, and the need to communicate large amounts of data between the users and the central server over the typically expensive up-link channel. In this work, we formulate the problem of FL in which different clusters of users observe labeled samples drawn from different distributions, while operating under constraints on the communication overhead. For such settings, we identify that the combination of statistical heterogeneity and communication constraints induces a tradeoff between the ability of the users of each cluster to learn a proper model and the accuracy in aggregating these models into a global inference rule. We propose an algorithm based on multi-source adaptation methods for such communication-aware clustered FL scenarios that allows these performance measures to be balanced, and demonstrate its ability to achieve improved inference over conventional federated averaging without inducing additional communication overhead.
UR - http://www.scopus.com/inward/record.url?scp=85090403725&partnerID=8YFLogxK
U2 - 10.1109/ISIT44484.2020.9174245
DO - 10.1109/ISIT44484.2020.9174245
M3 - Conference contribution
AN - SCOPUS:85090403725
T3 - IEEE International Symposium on Information Theory - Proceedings
SP - 2610
EP - 2615
BT - 2020 IEEE International Symposium on Information Theory, ISIT 2020 - Proceedings
PB - Institute of Electrical and Electronics Engineers
T2 - 2020 IEEE International Symposium on Information Theory, ISIT 2020
Y2 - 21 July 2020 through 26 July 2020
ER -