Inside Queue Models: Markov Chains

Apr 27, 2016
9 minute read

So far in this series on queueing theory, we’ve seen single server queues,
bounded queues, multi-server queues, and most recently queue
networks. A fascinating result from queueing theory is that wait time
degrades significantly as utilisation tends towards 100%. We saw that
unbounded queues have degenerate behaviour under heavy load, when
utilisation hits dangerous levels.

Perhaps more interestingly, we saw that bounded queues limit
customer ingress to prevent this undesirable behaviour; in essence we learned
that we can reject some customers to ensure good service for those who do make
it into the system. Those customers who get rejected might not be entirely
happy with this arrangement, so the question is: can we do better?

Over the course of the next two entries, I want to dig deeper into the internals
of queueing models so that we can explore sophisticated ways to better capture
how modern systems behave. The ultimate goal of this series is to learn how we
can build models of reactive systems, paying particular attention to how we
model back pressure in those systems.

A reactive system, using back pressure, signals to its source of traffic when
it’s ready for more customers. At first glance, this might sound incredibly
impractical, but in practice back pressure works for any system that is
completely in control of both the traffic source and the processing service.

For a typical service-based architecture, back pressure is the perfect mechanism
for regulating traffic flow between services. One can imagine that as services
pull more traffic in from their upstream provider, this pull propagates
towards the outside of the system until finally it hits the boundary where
traffic is coming from the outside world. Even here, back pressure can be
applied to a certain degree: both TCP and HTTP traffic can be limited with
back pressure. Eventually though, back pressure will no longer suffice to
limit resource usage and we'll need to start dropping customers.

In practice then, a reactive system bounds all in-flight processing, but uses
back pressure to regulate the amount of in-flight work, and thus, reduce the
number of cases where work must be rejected. We can model queues with
back pressure by replacing the Poisson arrival process used by all of the
queues we've seen so far with something more sophisticated.

Before we look at other arrival processes though, we should first make sure
that we really understand how a simple queue, like the M/M/1 queue, functions.
In particular, we are interested in analysing our queue as a particular type of
continuous-time Markov chain called a birth-death process.

A Quick Recap

Before we proceed, let's remind ourselves of the basics of the M/M/1 queue
model. Arrivals into the queue are modelled as a Poisson process where the
arrival rate is designated λ. Service times have rate μ and
are exponentially distributed with a mean service time of 1/μ.

The ratio of arrivals to service completions is denoted
ρ = λ/μ. For unbounded queues, ρ < 1 ensures that the queue is
stable; if ρ ≥ 1, then both queue size and latency tend towards
infinity.
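As a quick sanity check, these definitions are easy to encode. The arrival and service rates below are arbitrary illustrative values, not figures from this series:

```python
# Basic M/M/1 parameters. lam and mu are assumed values for illustration.
lam = 8.0   # arrival rate (customers per unit time)
mu = 10.0   # service rate (customers per unit time)

rho = lam / mu                 # utilisation: must be < 1 for stability
mean_service_time = 1.0 / mu   # exponential service times have mean 1/mu

assert rho < 1, "unstable: queue size and latency tend towards infinity"
print(f"rho = {rho:.2f}, mean service time = {mean_service_time:.2f}")
```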

Markov Chains in Two Minutes

A Markov chain is a random process described by states and the transitions
between those states. Transitions between states are probabilistic and exhibit a
property called memorylessness. The memorylessness property ensures that the
probability distribution for the next state depends only on the current state.
Put another way, the history of a Markov process is unimportant when considering
what the next transition will be.

The diagram above shows a simple Markov chain with three states: in bed, at
the gym, and at work. The transitions from each state to the next are
labelled with their respective probabilities. For example, the probability of
going from in bed to at work is 30%. Note also that the probability of
remaining in bed is 20%; there's no requirement that we actually leave the
current state.

We can represent these transition probabilities using a transition probability
matrix P:

The probability of moving from state i to state j is given by the entry
P_ij. Each row in the matrix must sum to 1, indicating that the
probability of doing something when in a given state is always 1.
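We can sketch this chain in a few lines of Python. Only the in bed row is fully determined by the example above (stay 20%, work 30%, which leaves 50% for the gym); the gym and work rows below are made-up placeholders so that the sketch runs:

```python
import random

states = ["bed", "gym", "work"]

# Transition probability matrix P. The "bed" row follows the example in the
# text; the "gym" and "work" rows are assumed values for illustration.
P = {
    "bed":  {"bed": 0.2, "gym": 0.5, "work": 0.3},
    "gym":  {"bed": 0.1, "gym": 0.2, "work": 0.7},  # assumed
    "work": {"bed": 0.6, "gym": 0.3, "work": 0.1},  # assumed
}

# Each row must sum to 1: from any state we always do *something*.
for row in P.values():
    assert abs(sum(row.values()) - 1.0) < 1e-9

def step(state, rng=random):
    """Memoryless transition: the next state depends only on the current one."""
    r = rng.random()
    for nxt, prob in P[state].items():
        if r < prob:
            return nxt
        r -= prob
    return nxt  # guard against floating-point rounding

random.seed(1)
path = ["bed"]
for _ in range(5):
    path.append(step(path[-1]))
print(path)
```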

This kind of Markov chain is called a discrete-time Markov chain (DTMC), where
the time parameter is discrete and the state changes randomly between each
discrete step in the process. The models we’ve seen so far have a continuous
time parameter resulting in continuous-time Markov chains (CTMC).

We can recast our discrete-time process as a continuous-time process. We use a
slightly different representation for our continuous-time chains. Rather than
modelling the transition probabilities, we model the transition rates:

Note that we omit rates for staying in the same state: it makes little sense to
talk about the rate at which a process remains stationary. Just as we used a
transition probability matrix for the discrete-time chain, we use a transition
rate matrix Q for the continuous-time chain:

Here, q_ij is the rate of transition from state i to state j.
Diagonals (q_ii) are constructed such that each row sums to 0, unlike
the diagonals of the transition probability matrix, which ensure that each row
sums to 1. The diagram and the matrix show that our continuous-time chain
moves from the in bed state to the at the gym state with rate q_12.
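The same idea works in code if we swap probabilities for rates and sample exponential holding times. The rates below are assumptions for illustration, not the values from the diagram:

```python
import random

# Off-diagonal transition rates for a continuous-time bed/gym/work chain.
# All rate values here are assumed, purely for illustration.
Q = {
    "bed":  {"gym": 0.5, "work": 0.3},
    "gym":  {"bed": 0.2, "work": 0.7},
    "work": {"bed": 0.6, "gym": 0.1},
}

def diagonal(state):
    # q_ii is minus the sum of the off-diagonal rates, so each row sums to 0.
    return -sum(Q[state].values())

for s in Q:
    assert abs(sum(Q[s].values()) + diagonal(s)) < 1e-9

def next_jump(state, rng=random):
    """Hold for an exponential time, then jump in proportion to the rates."""
    total = -diagonal(state)            # total rate of leaving this state
    hold = rng.expovariate(total)       # exponential holding time
    r = rng.random() * total
    for nxt, rate in Q[state].items():
        if r < rate:
            return hold, nxt
        r -= rate
    return hold, nxt

random.seed(2)
t, state = 0.0, "bed"
for _ in range(4):
    dt, state = next_jump(state)
    t += dt
print(f"after {t:.2f} time units the chain is in state {state}")
```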

Poisson Processes

Now that we understand how to construct continuous-time Markov chains, we can
explore Markovian queues in more detail. Recall that for an M/M/1 queue, both
arrivals and service completions are Poisson processes; that is, they are
stochastic processes in which the number of events observed in any interval
follows a Poisson distribution.

We can model a Poisson process, and thus the arrivals and service processes, as
a CTMC where each state in the chain corresponds to a given population size.
Consider the arrivals process in an M/M/1 queue. We know that arrivals are a
Poisson process with rate λ. At the start of the process, there have
been no arrivals. The first arrival occurs with rate λ, as does the
second, the third, and so on for as long as the process continues. We can model
this as a Markov chain where the states correspond to the arrival count:

When we translate this into a transition rate matrix we get:

    Q = [ −λ    λ    0    0  … ]
        [  0   −λ    λ    0  … ]
        [  0    0   −λ    λ  … ]
        [  ⋮                 ⋱ ]

This matrix continues unbounded, since the number of arrivals is effectively
unbounded.
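One way to convince ourselves of this picture is to simulate the counting process directly: inter-arrival times in a Poisson process with rate λ are exponentially distributed with mean 1/λ. A minimal sketch, with an arbitrary choice of rate:

```python
import random

# Simulate a Poisson arrival process by summing exponential inter-arrival
# times. lam and the horizon are arbitrary values for illustration.
lam = 4.0          # arrival rate
horizon = 10_000.0 # total simulated time

random.seed(0)
t, count = 0.0, 0
while True:
    t += random.expovariate(lam)  # time until the next "birth"
    if t > horizon:
        break
    count += 1

# The observed arrival rate should be close to lam.
print(count / horizon)
```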

Birth-Death Processes

An M/M/1 queue is composed of two Poisson processes working in tandem: the
arrivals process and the service process. As we saw, each of these processes can
be described by a Markov chain. We can go further and describe the queue as a
whole using a special kind of Markov chain process called a birth-death
process. Birth-death processes are processes where the states represent the
population count and transitions correspond to either births, which
increment the population count by one, or deaths which decrease the
population count by one. Note that Poisson processes are themselves birth-death
processes, just with zero deaths.

This diagram shows the Markov chain for an M/M/1 queue with arrival rate
λ and service rate μ. As you can see, the population state
increases as customers arrive at the queue and decreases as customers are
served. We can translate this simple diagram into a transition rate matrix for
the queue:

    Q = [ −λ       λ       0       0  … ]
        [  μ  −(λ+μ)       λ       0  … ]
        [  0       μ  −(λ+μ)       λ  … ]
        [  ⋮                          ⋱ ]

When the process starts, the only possible transition is from zero customers to
one customer with rate λ (q_01 = λ). After this, at each state, the
process can transition to having one more customer, again at rate λ, or
to having one fewer customer with rate μ.
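The construction rule for this matrix (λ on the superdiagonal, μ on the subdiagonal, diagonals balancing each row to zero) is mechanical enough to code up. Here is a sketch that builds a truncated version; the truncation size and the rates are arbitrary choices:

```python
# Build a truncated transition rate matrix for the M/M/1 birth-death chain:
# births (arrivals) at rate lam, deaths (service completions) at rate mu.
# lam, mu and n_states are assumed values for illustration.
lam, mu, n_states = 3.0, 4.0, 6

Q = [[0.0] * n_states for _ in range(n_states)]
for n in range(n_states):
    if n + 1 < n_states:
        Q[n][n + 1] = lam      # birth: a customer arrives
    if n - 1 >= 0:
        Q[n][n - 1] = mu       # death: a customer is served
    Q[n][n] = -sum(Q[n])       # diagonal makes the row sum to 0

for row in Q:
    assert abs(sum(row)) < 1e-9

print(Q[0])  # → [-3.0, 3.0, 0.0, 0.0, 0.0, 0.0]
```

The first row shows that from zero customers the only transition is to one customer, at rate λ, exactly as in the diagram.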

Steady-State Probabilities

With the transition rate matrix in hand, we can calculate the steady-state
probabilities p_n for the M/M/1 queue. Recall that the steady-state
probabilities tell us the probability of the queue being in state n,
that is, the probability of having n customers in the system. More formally:

    p_n = lim_{t→∞} P_n(t)

Where P_n(t) is the probability of having n customers in the system at
time t. Note that the steady-state probabilities are time-independent and,
as the name implies, steady. More precisely, we expect that:

    lim_{t→∞} dP_n(t)/dt = 0
That is, we expect the rate of change of the probabilities to be zero in the
limit. Let's think about dP_n(t)/dt for a while. The transition rate matrix
tells us how the process flows between states. We can see that each state n
can be entered from state n−1 and state n+1. Entry from state n−1
corresponds to a customer arriving in the system and has rate λ. Entry
from state n+1 corresponds to a customer completing service and leaving the
system, and has rate μ.

Each state n can also exit to states n−1 and n+1 as customers are
served (with rate μ) and arrive (with rate λ). This gives us:

    dP_n(t)/dt = λ·P_{n−1}(t) + μ·P_{n+1}(t) − (λ + μ)·P_n(t),    n ≥ 1

Using our limit condition lim_{t→∞} dP_n(t)/dt = 0, we find these
steady-state flow equations:

    λ·p_{n−1} + μ·p_{n+1} = (λ + μ)·p_n,    n ≥ 1
    λ·p_0 = μ·p_1

Solving this recurrence relation with dependence on p_0 gives us:

    p_n = ρ^n · p_0

Since we know that all probabilities must sum to 1, we can derive p_0:

    Σ_{n=0}^∞ p_n = p_0 · Σ_{n=0}^∞ ρ^n = p_0 / (1 − ρ) = 1,  so  p_0 = 1 − ρ

and therefore p_n = (1 − ρ)·ρ^n.

Coming full circle

You might recall that, in my first post in this series, I mentioned that the
equation for the mean number of customers L in an M/M/1 queue follows from
the steady-state probabilities. Let's see how that works. The mean number
of customers for an M/M/1 queue is:

    L = ρ / (1 − ρ)
To get here from the steady-state probabilities, let's start by simply defining
L in terms of p_n:

    L = Σ_{n=0}^∞ n·p_n

We're saying that the mean number of customers is simply the sum of each
possible value of n weighted by its probability. Let's expand on this:

    L = Σ_{n=0}^∞ n·(1 − ρ)·ρ^n = (1 − ρ) · Σ_{n=0}^∞ n·ρ^n
We know that M/M/1 queues have divergent behaviour if ρ ≥ 1, and
indeed the series Σ_{n=0}^∞ n·ρ^n only converges, to
ρ / (1 − ρ)², for ρ < 1. So, assuming we have ρ < 1
(otherwise L is undefined):

    L = (1 − ρ) · ρ / (1 − ρ)² = ρ / (1 − ρ)

And thus we arrive at the definition of L, the mean number of customers in
the system for M/M/1 queues.
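The closed form agrees with the direct weighted sum Σ n·p_n, which we can confirm numerically for an arbitrary ρ below 1:

```python
# Compare L = rho / (1 - rho) with the direct sum L = sum(n * p_n).
# rho is an assumed illustrative value below 1.
rho = 0.8

def p(n):
    return (1 - rho) * rho ** n

closed_form = rho / (1 - rho)
direct = sum(n * p(n) for n in range(10_000))  # the tail beyond is negligible

print(closed_form, direct)
assert abs(closed_form - direct) < 1e-6
```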

What’s next?

With an understanding of how Markov chains are used to construct queue models,
we can start looking at some more complex models. In particular, the next post
in this series will introduce Markov-modulated Arrival Processes (MMAP). An MMAP
composes two or more Markov arrival processes and switches between them. The
switching is itself modelled as a Markov chain. MMAPs are a great way of
creating a rudimentary model of how back pressure works.