Queue configurations and customer ownership: An analytical and experimental investigation
New York University Stern School of Business
Contrary to traditional queueing theory, recent field studies in health care and call centers indicate that pooling queues may not lead to operational efficiencies relative to dedicated queues. We use an equilibrium queueing model analysis and a series of controlled experiments to examine the conditions under which this may be the case and to test several behavioral mechanisms that may explain why. We also examine whether the order in which servers are exposed to one type of queue configuration versus another has an impact on their performance, and why.
Joint work with Guillaume Roels (INSEAD) and Hummy Song (Wharton)
The University of Auckland
A survey of parameter and state estimation in queues: A tutorial
We present a broad literature survey of parameter and state estimation for queueing systems. Our approach is based on various inference activities, queueing models, observations schemes, and statistical methods. We categorize these into branches of research that we call estimation paradigms. These include the classical sampling approach, inverse problems, inference for non-interacting systems, inference with discrete sampling, inference with queueing fundamentals, queue inference engine problems, Bayesian approaches, online prediction, implicit models, and control, design, and uncertainty quantification. For each of these estimation paradigms, we outline the principles and ideas, while surveying key references.
Joint work with Yoni Nazarathy and Peter Taylor.
University of New South Wales
Cellular Queueing Systems: Distance measurements for parameter inference
The regulation of protein movements in cells is essential for life. The hormone insulin regulates glucose uptake in mammalian fat and muscle cells – controlling the distribution of the glucose transporter protein GLUT4.
Mean field models have been able to identify some of the dominant processes that act in this cellular regulation system. These deterministic models were optimised using data from multiple protocols and repeated experiments to simultaneously constrain the parameters and network structure. Stochastic models are required however, to test hypotheses about how the cell controls these processes at the molecular scale.
Here we are interested in methods to quantitatively compare stochastic data with stochastic models – one candidate model is a closed queueing network. We present a preliminary study using synthetic data to explore some measures of difference between the experimental datasets and the queueing model, with a view to informing parameter inference and model selection for this system.
Joint work with Maria Vlasiou and Marko Boon.
Cornell University and CUHK-Shenzhen
Deep reinforcement learning for stochastic processing networks
Stochastic processing networks (SPNs) provide high fidelity mathematical modeling for operations of many service systems such as data centers. It has been a challenge to find a scalable algorithm for approximately solving the optimal control of large-scale SPNs, particularly when they are heavily loaded. We demonstrate that a class of deep reinforcement learning algorithms known as Proximal Policy Optimization (PPO) can generate control policies for SPNs that consistently beat the performance of known state-of-arts control policies in the literature. PPO is an approximate policy iteration algorithm that can naturally be implemented in a pure data-driven fashion. In this talk, I will present both the motivating theory and the critical components of the algorithm that make PPO a success in our setting.
This talk is based on the paper written jointly with Mark Gluzman at Cornell University:
“Queueing Network Controls via Deep Reinforcement Learning”.
Statistics, Stochastics, and Queueing
In this talk, we will discuss how statistical modeling and insights gained from limit theorems for queues can be fruitfully used in combination with one another to improve decision-making, as illustrated by many server queueing applications. In particular, the rich stochastic modeling literature identifies the key statistical features in the underlying observed data that drive performance in such systems, and this impacts the types of statistical models that one should adopt. In addition, considerations related to computational, analytical, and statistical tractability shape one’s modeling choices. This talk will discuss this modeling perspective, and some of the recent theoretical, modeling, and computational tools that support this framework.
Carnegie Mellon University
New Directions in Stochastic Scheduling
The massive expansion of datacenter computing has led to a plethora of new queueing models and new scheduling problems. This talk will discuss several new open problems of practical importance. Topics include:
- Advancing the state of scheduling for the M/G/1
- Scheduling for multiserver (M/G/k) systems
- Queueing for today’s multiserver jobs
- Scheduling of malleable jobs, with flexible parallelizability.
In addition we will characterize today’s datacenter workloads, particularly their extremely heavy tails.
This talk is based on the following paper: “Open problems in queueing theory inspired by datacenter computing.” Queueing Systems, vol. 97, no. 1, 2021, pp. 3-37.
Volunteer programs in Emergency Medical Services
Typical ambulance response times are inadequate for improving out-of-hospital cardiac arrest (OHCA) survival rates. Recently, volunteer systems such as GoodSam and Pulsepoint have arisen in which volunteers are alerted by their smartphones to a nearby OHCA and can choose to respond or not. With a sufficient density of volunteers, response times to OHCA can be dramatically reduced, thereby materially increasing survival rates. How many volunteers are needed? And what policy should be followed to alert volunteers to ensure a timely volunteer response while not overly burdening volunteers with too many alerts? Ideally we would use volunteer location data to estimate volunteer density and thence infer survival rates, but privacy considerations preclude this approach. We use a combination of Poisson point processes, convex optimization and other modelling techniques to provide answers to these questions.
Joint work with Caroline Jagtenberg, Hemeng (Maggie) Li and Pieter van den Berg.
Adaptive design of switchback experiments: A Markov chain perspective
Suppose an online platform wants to compare a treatment and control policy, e.g., two different
matching algorithms in a ridesharing system, or two different inventory management algorithms in an
online retail site. Standard randomized controlled trials are typically not feasible, since the goal
is to estimate policy performance on the entire system. Instead, the typical current practice
involves dynamically alternating between the two policies for fixed lengths of time, and comparing
the average performance of each over the intervals in which they were run as an estimate of the
treatment effect, called a “switchback” design.
In this talk, we cast switchback design optimization as a problem of estimating the difference in
steady state reward between two Markov chains: a “control” chain and a “treatment” chain. We
discuss the rich structure exhibited by this formulation of the problem, including reduction of the
variance optimal design to a convex optimization problem, in the case where parametric or
nonparametric maximum likelihood is used for estimation. We’ll close with a discussion of open
issues, novel directions of application, and discussion of alternative experimental designs of interest.
Joint work with Jose Blanchet, Peter Glynn, Mohammad Rasouli, and Linjia Wu.
Resource-Driven Activity Networks (RANs)
Jointly with Mor Armony, Nitzan Carmeli, Petar Momcilovic, Galit Yom-Tov.
I shall describe a data-based modelling framework that supports analysis and optimization of large and complex, congestion-prone service-operations (e.g. hospitals, contact-centers, courts, banks). Drawing from many-server asymptotic regimes, all participants in the service process (e.g. customers, servers) are equally considered resources that are either busy or await each others’ availability. Models are then activity-oriented, where each activity (e.g. service in a hospital) first consumes a subset of the resources at specific states (e.g. waiting patient, available doctor, idle exam-room); and then, upon completion, produces possibly other resources at other states (e.g. served patient, available doctor, exam-room that requires cleaning). We hence refer to our models as Resource-Driven Activity Networks, or RANs for short.
University of Amsterdam
Data-driven queueing challenges
In this overview talk I’ll give my perspective on the area of data-driven queueing. I’ll distinguish between (i) in-depth data-studies to soundly characterise the stochastic flows in a network and the underlying queueing mechanisms, (ii) methods to estimate model primitives from queueing observations, (iii) techniques dealing with parameter uncertainty, (iv) techniques to jointly estimate and optimise, and (v) congestion-level dependent queueing mechanisms. For each of these branches I discuss a few of my favourite papers, that provided me with useful conceptual understanding, as well as some of my own work in the area.
University of Haifa
Estimating customer impatience for a queueing system with unobserved balking
We study a service system in which arriving customers are provided with information about the delay they will experience. Based on this information they decide to wait for service or to leave the system. The main objective is to estimate the customers’ patience-level distribution and the corresponding potential arrival rate, using knowledge of the actual queue-length process only. The main complication, and distinguishing feature of our setup, lies in the fact that customers who decide not to join are not observed, but, remarkably, we manage to devise a procedure to estimate the load they would generate. We express our system in terms of a multi-server queue with a Poisson stream of customers, which allows us to evaluate the corresponding likelihood function. Estimating the unknown parameters relying on a maximum likelihood procedure, we prove strong consistency and derive the asymptotic distribution of the estimation error. Several applications and extensions of the method are discussed. The performance of our approach is further assessed through a series of numerical experiments. By fitting parameters of hyperexponential and generalized-hyperexponential distributions our method provides a robust estimation framework for any continuous patience-level distribution.
Joint work with Yoshiaki Inoue and Michel Mandjes
The University of Melbourne
A call centre server allocation problem
Presented by Keith Royston (Probegroup) and Peter Taylor (University of Melbourne).
In this session we shall discuss a real data-driven queueing problem faced by a call centre management company. The Excel spreadsheet (see link below) gives data on callers who rang a call centre in each of 72 quarter of an hour periods from 6.00am until midnight on September 15, 2021.
There are four classes of enquiry Tier 0.5, Tier 1, Tier 2 and Tier 3, which roughly correspond to the complexity of the issue that the caller has. Servers with different skill sets handle the enquiries: Tier 0.5 enquiries can be handled by servers who have just started, while Tier 3 enquiries require the most knowledgeable servers.
University of Technology Sydney
Network modelling and measurement: Whence, Where and Whither
There is a long established history of the interplay between telecommunication networks and queueing theory. The rise of the Internet brought with it the availability of large data sets (with surprising features), and highlighted new problems of practical importance for network operators and end applications. This led to a burst of activity in applied queueing, and interest in new kinds of problems, including those with closer links to inference. The discipline of network measurement meanwhile explored many network inference problems, including those directly related to queueing systems, as well as others inspired by, or at least informed by, queueing considerations and insights.
Twenty years on, networks have changed, and the literature has matured. We are now entering into a new era where old assumptions fail, the questions are different and more complex, but where many more opportunities are opening up. In this talk I will describe this changing landscape, beginning with an overview of the old approaches, and their limitations. I will then try to map out some of the new directions of interest in the space, which should contain rich opportunities for expansion of the queueing frontier.
University of Manchester
Learning and Information in Stochastic Networks and Queues: A tutorial
We review the role of information and learning in the stability and optimization of queueing systems. In recent years, techniques from supervised learning, bandit learning and reinforcement learning have been applied to queueing systems supported by increasing role of information in decision making. We present observations and new results that help rationalize the application of these areas to queueing systems. We prove that the MaxWeight and BackPressure policies are an application of Blackwell’s Approachability Theorem. This connects queueing theoretic results with adversarial learning. We then discuss the requirements of statistical learning for service parameter estimation. As an example, we show how queue size regret can be bounded when applying a perceptron algorithm to classify service. Next, we discuss the role of state information in improved decision making. Here we contrast the roles of epistemic information (information on uncertain parameters) and aleatoric information (information on an uncertain state). Finally we review recent advances in the theory of reinforcement learning and queueing, as well as, provide discussion on current research challenges.
Technical University of Darmstadt
Statistical inference for unreliable queueing systems
In practical applications of queueing systems there are typically processes or parameters which are unknown and cannot be observed. Thus, there is great interest in statistical inference for system characteristics depending on incomplete information of the stochastic components. Further, real service regimes are never completely reliable, the service is interrupted from time to time due to human or technical failures. In this talk we will present the statistical analysis for a queueing system with Poisson arrivals, a general service time distribution and parallel servers which are unreliable; the servers interrupt the service process from time to time and take a random repair period. Based only on observations of the arrival and departure epochs of the customers (without the possibility of their correct matching) we want to obtain information about the unknown repair time distributions and the breakdown rates of the servers. Explicit constructions of the estimators are presented as well as their analytic performance. Decompounding techniques for random sums play a major role in the proofs. By simulation results we illustrate the practicability of our analysis.
The Co-Production of Service: Modeling Service Times in Contact Centers Using Hawkes Processes
In customer support contact centers, a successful service interaction involves a messaging dialogue between a customer and an agent. Both parties depend on one another for information and problem solving, and this interaction defines a co-produced service process. In this paper, we propose, develop, and compare new stochastic models for service co-production in a contact center. A key observation is that the customer and agent’s co-produced service has cross- and self-exciting dynamics within each conversation. The cross-excitation stems from the two parties responding to one another, and the self-excitation captures one party sending follow-ups to their own prior message. Hence, messages beget messages, and we capture this phenomenon by introducing Hawkes point process models of the conversational services. These models distinguish between the role of the customer and of the agent, reflect the service process’s dynamic evolution over time based on its own history, and include additional behavioral and operational aspects, including the agent’s number of simultaneous assignments and measures of the amount of information and sentiment each message contains.
Georgia Institute of Technology
Learning traffic correlations in multi-class queueing systems by sampling workloads
We consider a service system consisting of parallel single server queues of infinite capacity. Work of different classes arrives as correlated Gaussian processes with known drifts but unknown covariances, and it is deterministically routed to the different queues according to some routing matrix. In this setting we show that, under some conditions, the covariance matrix of the arrival processes can be directly recovered from the large deviations behavior of the queue lengths. Also, we show that in some cases this covariance matrix cannot be directly recovered this way, as there is an inherent loss of information produced by the dynamics of the queues. Finally, we show how this can be used to quickly learn an optimal routing matrix with respect to some utility function.