Accelerated Gradient Descent Learning over Multiple Access Fading Channels

Raz Paul, Yuval Friedman, Kobi Cohen

Research output: Contribution to journal › Article › peer-review

13 Scopus citations


We consider a distributed learning problem in a wireless network consisting of N distributed edge devices and a parameter server (PS). The objective function is the sum of the edge devices' local loss functions; the devices aim to train a shared model by communicating with the PS over a multiple access channel (MAC). This problem has attracted growing interest in distributed sensing systems and, more recently, in federated learning, where it is known as over-the-air computation. In this paper, we develop a novel Accelerated Gradient-descent Multiple Access (AGMA) algorithm that uses momentum-based gradient signals over a noisy fading MAC to improve the convergence rate compared to existing methods. Furthermore, AGMA does not require power control or beamforming to cancel the fading effect, which reduces implementation complexity. We analyze AGMA theoretically and establish a finite-sample error bound for both convex and strongly convex loss functions with Lipschitz gradient. For the strongly convex case, we show that AGMA approaches the best-known linear convergence rate as the network size increases. For the convex case, we show that AGMA significantly improves the sub-linear convergence rate compared to existing methods. Finally, we present simulation results using real datasets that demonstrate the superior performance of AGMA.
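The momentum-based over-the-air update described in the abstract can be illustrated with a toy sketch. Everything here is our own assumption for illustration, not the AGMA specification: the quadratic local losses, the magnitude-only Rayleigh fading model, the heavy-ball form of momentum, and all hyperparameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (illustrative, not from the paper): N edge devices, each
# holding a local least-squares loss f_n(x) = 0.5 * ||A_n x - b_n||^2.
N, d = 20, 5
A = rng.standard_normal((N, 8, d))
b = rng.standard_normal((N, 8))

def local_grad(n, x):
    # Gradient of device n's local loss.
    return A[n].T @ (A[n] @ x - b[n])

def global_loss(x):
    # Average of the local losses (the objective the PS wants to minimize).
    return 0.5 * sum(np.sum((A[n] @ x - b[n]) ** 2) for n in range(N)) / N

x = np.zeros(d)          # shared model held at the PS
v = np.zeros(d)          # momentum (heavy-ball) term
eta, beta = 0.01, 0.9    # step size and momentum factor (hand-picked)
noise_std = 0.01         # additive receiver noise level (assumed)

loss_start = global_loss(x)
for t in range(300):
    # Each device transmits its analog gradient signal; the MAC sums the
    # signals, each scaled by an i.i.d. Rayleigh fading magnitude |h_n|
    # (phase assumed compensated in this simplified model).
    h = np.abs(rng.normal(size=N) + 1j * rng.normal(size=N)) / np.sqrt(2)
    rx = sum(h[n] * local_grad(n, x) for n in range(N))
    rx = rx + noise_std * rng.standard_normal(d)  # channel noise at the PS

    # The PS does NOT invert per-device fading (no power control or
    # beamforming); it only normalizes by N * E[|h|], which makes the
    # estimate unbiased for the average gradient as N grows.
    g_hat = rx / (N * np.sqrt(np.pi) / 2)  # E[|h|] = sqrt(pi)/2 here

    # Momentum-based update on the noisy over-the-air gradient estimate.
    v = beta * v + g_hat
    x = x - eta * v

print(f"loss: {loss_start:.4f} -> {global_loss(x):.4f}")
```

The point of the sketch is the division of labor: devices send raw gradient signals through the fading channel, and the PS recovers an approximate averaged gradient from the superimposed waveform alone, applying acceleration on top of it.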

Original language: English
Pages (from-to): 532-547
Number of pages: 16
Journal: IEEE Journal on Selected Areas in Communications
Issue number: 2
State: Published - 1 Feb 2022


Keywords

  • Distributed learning
  • federated learning
  • gradient descent (GD) learning
  • multiple access channel (MAC)
  • over-the-air computation
  • wireless edge networks

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Electrical and Electronic Engineering


