Acquiring a precise model is challenging for many important robotic tasks and systems, including in-hand manipulation with underactuated, adaptive hands. Learning stochastic, data-driven models is a promising alternative, as such models not only propagate the system dynamics forward but also express the uncertainty present in the collected data. They therefore enable planning in the space of state distributions, i.e., in the belief space. This paper proposes a planning framework for solving Non-Observable Markov Decision Process (NOMDP) problems that employs learned stochastic models and expresses a distribution of states as a set of particles. The integration achieves anytime behavior: it returns paths of increasing quality subject to constraints on the probability of successfully reaching a goal. The focus of this effort is on pushing the efficiency of the overall methodology despite the notorious computational hardness of belief-space planning. Experiments show that the proposed framework reaches a desired goal with a higher success rate than alternatives in simple benchmarks. This work also provides an application to the motivating domain of in-hand manipulation with underactuated, adaptive hands, both in physically simulated experiments and in demonstrations with a real hand.
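
As a rough illustration of the particle-based belief representation, the sketch below propagates a set of particles through a stochastic transition model without any observations and estimates the probability of success as the fraction of particles that end inside a goal region. The additive Gaussian model, the function names, and the circular goal test are assumptions made for this example; they stand in for a learned model and are not the framework's actual implementation.

```python
import math
import random


def stochastic_step(state, action, rng, noise_std):
    """Hypothetical stand-in for a learned stochastic model.

    Assumes next_state = state + action + Gaussian noise, per dimension.
    """
    return tuple(s + a + rng.gauss(0.0, noise_std) for s, a in zip(state, action))


def propagate_belief(particles, actions, rng, noise_std=0.05):
    """Push every particle through the model for each action (open loop)."""
    for action in actions:
        particles = [stochastic_step(p, action, rng, noise_std) for p in particles]
    return particles


def success_probability(particles, goal, radius):
    """Estimate success as the fraction of particles within `radius` of `goal`."""
    hits = sum(1 for p in particles if math.dist(p, goal) <= radius)
    return hits / len(particles)


def satisfies_constraint(particles, goal, radius, min_success):
    """Check a path's final belief against a minimum success probability."""
    return success_probability(particles, goal, radius) >= min_success


if __name__ == "__main__":
    rng = random.Random(0)
    belief = [(0.0, 0.0)] * 200          # initial belief: all particles at the origin
    path = [(0.5, 0.0), (0.5, 0.0)]      # candidate action sequence
    final_belief = propagate_belief(belief, path, rng)
    print(success_probability(final_belief, (1.0, 0.0), radius=0.2))
```

A planner built on this idea could compare candidate paths by cost while discarding those whose final belief fails `satisfies_constraint`; how the framework actually ranks and refines paths is not specified here.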