We consider the problem of multi-user dynamic spectrum access (DSA) in cognitive radio networks. The shared bandwidth is divided into K orthogonal channels, and M (secondary) users aim to access the spectrum, where K ≥ M. Each user may choose a single channel for transmission in each time slot. The state of each channel is modeled by a restless unknown Markovian process. In contrast to existing studies, which analyzed the special case of this setting in which each channel yields the same expected rate for all users, this paper considers the more general model in which each channel yields a different expected rate for each user. This general model adds the significant challenge of efficiently learning a channel allocation in a distributed manner so as to achieve a global system-wide objective. We adopt the stable matching utility as the system objective, which is known to yield strong performance in multichannel wireless networks, and develop a novel Distributed Stable Strategy Learning (DSSL) algorithm to achieve this objective. We prove theoretically that the DSSL algorithm converges to the stable matching allocation, and that the regret, defined as the loss in total rate with respect to the stable matching solution, grows logarithmically with time. Finally, we present numerical examples that support the theoretical results and demonstrate the strong performance of the DSSL algorithm.
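To make the notion of a stable matching allocation concrete, the following is a minimal centralized sketch, not the DSSL algorithm itself. It assumes the expected rates are already known and distinct, and it assumes that a channel "prefers" the user who achieves the higher rate on it; that channel-side preference, along with the names `greedy_stable_matching` and `blocking_pairs`, is an illustrative assumption rather than something stated above. Under these assumptions, repeatedly assigning the highest-rate remaining user-channel pair yields an allocation with no blocking pair.

```python
from itertools import product


def greedy_stable_matching(rates):
    """Return {user: channel} for a rate matrix rates[user][channel].

    Assumes known, distinct expected rates and at least as many
    channels as users (K >= M). Hypothetical helper for illustration.
    """
    num_users = len(rates)
    num_channels = len(rates[0])
    # Visit all (user, channel) pairs from highest to lowest expected rate.
    pairs = sorted(
        product(range(num_users), range(num_channels)),
        key=lambda pair: rates[pair[0]][pair[1]],
        reverse=True,
    )
    assignment = {}
    taken_channels = set()
    for user, channel in pairs:
        # Assign the pair only if both the user and the channel are still free.
        if user not in assignment and channel not in taken_channels:
            assignment[user] = channel
            taken_channels.add(channel)
    return assignment


def blocking_pairs(rates, assignment):
    """List (user, channel) pairs that would both gain by deviating."""
    owner = {channel: user for user, channel in assignment.items()}
    blocks = []
    for user, channel in product(range(len(rates)), range(len(rates[0]))):
        if assignment[user] == channel:
            continue
        user_gains = rates[user][channel] > rates[user][assignment[user]]
        # Assumption: a free channel accepts anyone; an occupied channel
        # prefers the user with the higher expected rate on it.
        channel_gains = (
            channel not in owner
            or rates[user][channel] > rates[owner[channel]][channel]
        )
        if user_gains and channel_gains:
            blocks.append((user, channel))
    return blocks


# Illustrative rates for M = 2 users and K = 3 channels (made-up numbers).
example_rates = [[0.9, 0.8, 0.1],
                 [0.85, 0.2, 0.3]]
example_allocation = greedy_stable_matching(example_rates)
```

In this made-up example, user 0 takes channel 0 and user 1 takes channel 2, for a total rate of 1.2, whereas the sum-rate-maximizing allocation (user 0 on channel 1, user 1 on channel 0) would give 1.65. The stable allocation is therefore not necessarily the one that maximizes total rate, which is why the regret above is measured against the stable matching solution.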