Traditional deep learning models are trained at centralized servers using labeled data collected from edge devices. Such data often contains private information that users may be unwilling to share. Federated learning (FL) is an emerging approach for training such models without requiring users to share their possibly private labeled data. In FL, each user trains its copy of the learning model locally; the server then collects the individual updates and aggregates them into a global model. A major challenge in this setting is that each user must efficiently transmit its learned model over the throughput-limited uplink channel. In this work, we tackle this challenge using tools from quantization theory. In particular, we identify the unique characteristics of conveying trained models over rate-constrained channels, and characterize a quantization scheme suitable for such setups. We show that combining universal vector quantization methods with FL yields a decentralized training system that is both efficient and feasible. We also derive theoretical performance guarantees for the system. Our numerical results demonstrate the substantial performance gains of our scheme over FL with previously proposed quantization approaches.
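To make the setting concrete, the following is a minimal sketch of one FL round in which clients compress their model updates before uploading them. It uses simple subtractive dithered scalar quantization purely as an illustration of rate-constrained uplinks; it is not the universal vector quantization scheme analyzed in this work, and the function names, step size, and toy clients are assumptions for the example.

```python
import numpy as np

def dithered_quantize(x, step, rng):
    """Subtractive dithered scalar quantization: an illustrative stand-in for
    the universal (vector) quantizers considered in the paper."""
    dither = rng.uniform(-step / 2, step / 2, size=x.shape)
    q = step * np.round((x + dither) / step)   # quantize the dithered signal
    return q, dither                           # in practice the dither is reproduced at the server via a shared seed

def client_update(global_weights, local_grad_fn, lr=0.1, steps=10):
    """Local training on one user's private data; only the model update is sent."""
    w = global_weights.copy()
    for _ in range(steps):
        w -= lr * local_grad_fn(w)
    return w - global_weights                  # model update (difference), not raw data

def federated_round(global_weights, clients, step=0.05, seed=0):
    """One FL round: each client quantizes its update; the server averages."""
    rng = np.random.default_rng(seed)          # shared seed emulates a common dither source
    decoded = []
    for grad_fn in clients:
        update = client_update(global_weights, grad_fn)
        q, dither = dithered_quantize(update, step, rng)
        decoded.append(q - dither)             # server subtracts the known dither
    return global_weights + np.mean(decoded, axis=0)

# Toy usage: two clients with quadratic losses centered at different points.
if __name__ == "__main__":
    clients = [lambda w: w - np.array([1.0, 2.0]),
               lambda w: w - np.array([3.0, -1.0])]
    w = np.zeros(2)
    for _ in range(50):
        w = federated_round(w, clients)
    print(w)  # approaches the average of the two optima, [2.0, 0.5], up to quantization noise
```

Subtracting the dither at the server turns the quantization error into noise that is independent of the transmitted update, which is the property that universal quantization approaches exploit when compressing model updates over rate-constrained channels.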