Learning-Based Optimal Admission Control in a Single-Server Queuing System

Asaf Cohen, Vijay Subramanian, Yili Zhang

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

We consider a long-term average profit–maximizing admission control problem in an M/M/1 queuing system with unknown service and arrival rates. With a fixed reward collected upon service completion and a cost per unit of time enforced on customers waiting in the queue, a dispatcher decides upon each arrival whether or not to admit the arriving customer, based on the full history of observations of the queue length of the system. Naor [Naor P (1969) The regulation of queue size by levying tolls. Econometrica 37(1):15–24] shows that, if all the parameters of the model are known, then it is optimal to use a static threshold policy: admit if the queue length is less than a predetermined threshold and otherwise reject. We propose a learning-based dispatching algorithm and characterize its regret with respect to optimal dispatch policies for the full-information model of Naor (1969). We show that the algorithm achieves an O(1) regret when all optimal thresholds with full information are nonzero and achieves an O(ln^{1+ε}(N)) regret for any specified ε > 0 in the case that an optimal threshold with full information is 0 (i.e., an optimal policy is to reject all arrivals), where N is the number of arrivals.
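The full-information threshold policy described in the abstract can be illustrated with a minimal sketch. It is not the paper's learning algorithm: it assumes the arrival rate `lam`, service rate `mu`, reward `R`, and cost rate `c` are all known, and it further assumes the cost is charged on every customer in the system (including the one in service), which the abstract does not specify. The function names and the search bound `k_max` are hypothetical. Under a threshold K the system behaves as an M/M/1/K queue, so the long-run average profit can be computed from its stationary distribution and maximized by brute force.

```python
def average_profit(K, lam, mu, R, c):
    """Long-run average profit per unit time under threshold K.

    Assumption (not from the source): the holding cost c is charged per
    unit time on every customer in the system, including the one in service.
    """
    rho = lam / mu
    # Unnormalized stationary weights of the M/M/1/K queue: pi_n ∝ rho^n.
    weights = [rho ** n for n in range(K + 1)]
    total = sum(weights)
    pi = [w / total for w in weights]
    # Arrivals are admitted only when fewer than K customers are present.
    throughput = lam * (1.0 - pi[K])
    mean_in_system = sum(n * p for n, p in enumerate(pi))
    return R * throughput - c * mean_in_system


def best_thresholds(lam, mu, R, c, k_max=50, tol=1e-12):
    """Brute-force search over thresholds 0..k_max (k_max is a hypothetical cap).

    Returns all maximizing thresholds, since ties are possible, and the
    corresponding average profit.
    """
    profits = [average_profit(K, lam, mu, R, c) for K in range(k_max + 1)]
    best = max(profits)
    return [K for K, p in enumerate(profits) if abs(p - best) <= tol], best


if __name__ == "__main__":
    # Illustrative parameters only; they do not come from the paper.
    thresholds, profit = best_thresholds(lam=1.0, mu=2.0, R=5.0, c=1.0)
    print("optimal threshold(s):", thresholds, "average profit:", profit)
```

A threshold of 0 corresponds to rejecting every arrival and yields zero profit, which is the case where the abstract reports the logarithmic regret bound. In the learning setting of the paper, `lam` and `mu` are unknown and would have to be estimated from observations before any such computation.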

Original language: English
Pages (from-to): 69-107
Number of pages: 39
Journal: Stochastic Systems
Volume: 14
Issue number: 1
State: Published - 1 Mar 2024
Externally published: Yes

Keywords

  • queueing systems with uncertainty
  • reinforcement learning

ASJC Scopus subject areas

  • Statistics and Probability
  • Modeling and Simulation
  • Statistics, Probability and Uncertainty
  • Management Science and Operations Research
