We consider the problem of detecting a change in a time series quickly and reliably when only a few training instances are available. Examples include identifying changes in network traffic caused by zero-day attacks, and computer vision applications in which changes in a series of images that represent significant events need to be detected. These are cases of one-shot learning. We develop a novel Deep Reinforcement One-shot Learning (DeROL) framework to address this challenge. The basic idea of the DeROL algorithm is to train a deep Q-network to obtain a policy that is oblivious to the unseen classes in the testing data. Then, in real time, DeROL maps the current state of the one-shot learning process to operational actions based on the trained deep Q-network, so as to maximize the objective function. We tested the algorithm on the OMNIGLOT dataset to demonstrate the efficiency of the DeROL framework.
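To make the decision step concrete, the following is a minimal sketch of how a trained Q-network could map the current state of a one-shot learning process to an operational action. All specifics here are illustrative assumptions, not the paper's implementation: the state features, the action set (classify, delay, or query an expert), and the randomly initialized two-layer network standing in for a trained deep Q-network.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM = 8   # assumed: features summarizing the one-shot learning state
N_ACTIONS = 3   # assumed actions: 0 = classify, 1 = delay, 2 = query expert

# Randomly initialized two-layer network; a stand-in for a trained deep Q-network.
W1 = rng.standard_normal((STATE_DIM, 16)) * 0.1
b1 = np.zeros(16)
W2 = rng.standard_normal((16, N_ACTIONS)) * 0.1
b2 = np.zeros(N_ACTIONS)

def q_values(state):
    """Forward pass: map a state vector to one Q-value per operational action."""
    h = np.maximum(0.0, state @ W1 + b1)  # ReLU hidden layer
    return h @ W2 + b2

def select_action(state, epsilon=0.1):
    """Epsilon-greedy selection over the Q-values (greedy when epsilon=0)."""
    if rng.random() < epsilon:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(q_values(state)))

state = rng.standard_normal(STATE_DIM)
action = select_action(state, epsilon=0.0)  # act greedily w.r.t. the Q-values
```

In a full system, the network weights would come from offline training on classes disjoint from those seen at test time, which is what makes the resulting policy oblivious to the unseen classes.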