A distributed learning problem over a multiple access channel (MAC) in a large wireless network is considered. The objective function is the sum of the nodes' local loss functions. The inference decision is made at the network edge, based on data received from the distributed nodes, which transmit over a noisy fading MAC. We develop a novel Gradient-Based Multiple Access (GBMA) algorithm to solve the distributed learning problem over the MAC. Specifically, each node transmits an analog function of its local gradient using common shaping waveforms. The network edge receives a superposition of the analog transmitted signals, which represents a noisy, distorted gradient used to update the estimate. We analyze the performance of GBMA theoretically, and prove that it can approach the convergence rate of the centralized gradient descent (GD) algorithm in large networks, for both convex and strongly convex loss functions with Lipschitz gradient.
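The update described above can be illustrated with a minimal numerical sketch: each node forms its local gradient, the channel delivers a faded, noisy superposition of the analog transmissions, and the edge normalizes the aggregate to drive a GD-style update. This is not the paper's exact scheme; the quadratic local losses, the Rayleigh-magnitude fading model, the noise level, and the normalization by the expected fading gain are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d, T, lr = 100, 5, 200, 0.1   # nodes, dimension, iterations, step size

# Hypothetical local losses f_n(theta) = 0.5 * ||theta - b_n||^2,
# so grad f_n(theta) = theta - b_n and the global optimum is mean(b_n).
b = rng.normal(loc=2.0, size=(N, d))
opt = b.mean(axis=0)

theta = np.zeros(d)
for t in range(T):
    grads = theta - b                              # each node's local gradient
    h = rng.rayleigh(scale=1.0, size=(N, 1))       # per-node fading magnitude (assumed model)
    noise = rng.normal(scale=0.1, size=d)          # additive receiver noise
    rx = (h * grads).sum(axis=0) + noise           # superposition received over the MAC
    # Normalize by N * E[h] (Rayleigh mean = sqrt(pi/2)) so rx approximates
    # the average gradient, then take a centralized-GD-style step.
    theta = theta - lr * rx / (N * np.sqrt(np.pi / 2))
```

In large networks the fading fluctuations and noise average out across the N superimposed transmissions, which is the intuition behind the convergence guarantee stated above: the aggregated received signal concentrates around the true (scaled) global gradient.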