Abstract
We informally call a stochastic process learnable if it admits a generalization error approaching zero in probability for any concept class with finite VC-dimension (IID processes are the simplest example). A mixture of learnable processes need not be learnable itself, and certainly its generalization error need not decay at the same rate. In this paper, we argue that it is natural in predictive PAC to condition not on the past observations but on the mixture component of the sample path. This definition not only matches what a realistic learner might demand, but also allows us to sidestep several otherwise grave problems in learning from dependent data. In particular, we give a novel PAC generalization bound for mixtures of learnable processes with a generalization error that is not worse than that of each mixture component. We also provide a characterization of mixtures of absolutely regular (β-mixing) processes, of independent probability-theoretic interest.
Original language | English |
---|---|
Journal | Advances in Neural Information Processing Systems |
State | Published - 1 Jan 2013 |
Event | 27th Annual Conference on Neural Information Processing Systems, NIPS 2013 - Lake Tahoe, NV, United States Duration: 5 Dec 2013 → 10 Dec 2013 |
ASJC Scopus subject areas
- Computer Networks and Communications
- Information Systems
- Signal Processing