Using Markov Diagrams in BlockSim for Reliability Analysis

Invented by Russian mathematician Andrey Markov, Markov chains are used across a broad range of applications to represent a "memoryless" stochastic
process. Such a process is made up of random variables that represent its evolution through various states. "Memoryless," also
called the Markov property, means that the probability of being in a given state at the next step depends only on the information present in the current step
and not on information from any earlier step. This article presents a way of using Markov chains
in BlockSim 10 (if this feature is supported by your license).

When Markov chains are used in reliability analysis, the process usually represents the various stages (states) that a system can be in at any given time.
The states are connected via transitions that represent the probability, or rate, that the system will move from one state to another during a step or a
given time interval. When probabilities and steps are used, the Markov chain is referred to as a discrete Markov chain, while a Markov chain that uses rates and the
time domain is referred to as a continuous Markov chain. In this article we will limit ourselves to discrete Markov chains.

In a discrete Markov chain, we have to define each possible state that the system can be in at any given time, and also the transition probabilities per
step that link the states together. The steps can represent time, but they do not have to. Lastly, we must also define the initial state probabilities that
give us the starting point(s) of the system.

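Mathematically, we can represent the initial state probabilities as a row vector $X(0)$ such
that $X_i$ represents the initial probability of being in state $i$ (for a system with $m$ states):

$$X(0) = \begin{bmatrix} X_1 & X_2 & \cdots & X_m \end{bmatrix}$$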

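The transitions between the states can be represented by an $m \times m$ matrix $P$, where $m$ is the number of states:

$$P = \begin{bmatrix} P_{11} & P_{12} & \cdots & P_{1m} \\ P_{21} & P_{22} & \cdots & P_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ P_{m1} & P_{m2} & \cdots & P_{mm} \end{bmatrix}$$

where, for example, the term $P_{12}$ is the transition probability from state 1 to state 2. Each row of $P$ sums to 1.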

Then, if we want to know the probability of being in a particular state after $n$ steps, we can use the Chapman-Kolmogorov equation to arrive at
the following equation:

$$X(n) = X(0) \, P^n$$

where $X(0)$ is the vector of initial state probabilities, $P$ is the transition probability matrix and $X(n)$ is the vector that represents the probability of being in each state
after $n$ steps. Using this methodology, we can find the point probability of being in a state at each step and from there also calculate
the mean probability of being in a state over a certain number of steps.
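
To make the step-by-step calculation concrete, here is a minimal Python sketch that propagates a row vector of state probabilities through a transition matrix and then averages the point probabilities. It is an illustration only, not how BlockSim performs the calculation. The function names, the two-state chain and the choice to average over steps 1 through n (excluding step 0) are all assumptions made for this example.

```python
def step(x, P):
    """Advance one step: multiply the row vector x by the matrix P."""
    m = len(x)
    return [sum(x[i] * P[i][j] for i in range(m)) for j in range(m)]


def point_probabilities(x0, P, n):
    """Return [X(0), X(1), ..., X(n)] by applying X(k+1) = X(k) P repeatedly."""
    history = [list(x0)]
    for _ in range(n):
        history.append(step(history[-1], P))
    return history


def mean_probabilities(history):
    """Average the point probabilities over steps 1..n (assumed convention)."""
    steps = history[1:]
    return [sum(x[j] for x in steps) / len(steps) for j in range(len(steps[0]))]


# Hypothetical two-state chain, used only to exercise the functions.
P = [[0.9, 0.1],
     [0.5, 0.5]]
x0 = [1.0, 0.0]  # start in state 1 with certainty

history = point_probabilities(x0, P, 2)
print(history[1])                   # approximately [0.9, 0.1]
print(history[2])                   # approximately [0.86, 0.14]
print(mean_probabilities(history))  # approximately [0.88, 0.12]
```

Repeated multiplication by the matrix gives the same result as multiplying the initial vector by the matrix raised to the power n, and it yields every intermediate point probability along the way.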

Example

In BlockSim 10, we are doing an initial estimation analysis on the life cycle of a complex drilling system that starts off as brand new
(100% initial probability in the full capacity state). The system has a probability to degrade into various states of capacity with time
and can eventually enter a salvage state. There is also a probability of being returned to the as-good-as-new condition from each degraded
state, except from the salvage state. The salvage state is considered to be a "sink," a state from which there are no transitions to any other
state and therefore we have zero probability of leaving. We want to determine, on average, what percentage of the time the system will spend in
each state over a 10-year period. To perform the analysis we will use a discrete Markov chain diagram. Our initial setup looks like this:

We estimate the following probabilities per month to move between states:

1% chance to degrade from 100% to 80% capacity.

10% chance to be restored from 80% to 100% capacity.

3% chance to degrade from 80% to 60% capacity.

8% chance to be restored from 60% to 100% capacity.

6% chance to degrade from 60% to 40% capacity.

5% chance to be restored from 40% to 100% capacity.

8% chance to degrade from 40% capacity to salvage.

Based on these percentages, the final diagram that is ready for analysis looks like this:

Since our estimated probabilities are on a month scale, we will take each step of the analysis to be the equivalent of one month.
This means that we will run our calculation for 120 steps. After we calculate the diagram, we can see that the transition probability
matrix between the states looks like this (which we can easily use to verify our inputs):
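
As a cross-check on the inputs, the matrix implied by the monthly probabilities listed above can also be written out by hand. The Python sketch below does this under one assumption that is not stated explicitly in the list: whatever probability is not assigned to leaving a state is taken as the probability of remaining in that state for the step. The state labels and variable names are chosen for this illustration.

```python
states = ["100%", "80%", "60%", "40%", "Salvage"]

# Monthly transition probabilities from the list above: (from, to) -> probability.
transitions = {
    ("100%", "80%"): 0.01,
    ("80%", "100%"): 0.10,
    ("80%", "60%"): 0.03,
    ("60%", "100%"): 0.08,
    ("60%", "40%"): 0.06,
    ("40%", "100%"): 0.05,
    ("40%", "Salvage"): 0.08,
}

index = {name: k for k, name in enumerate(states)}
P = [[0.0] * len(states) for _ in states]
for (src, dst), p in transitions.items():
    P[index[src]][index[dst]] = p

# Assumption: the leftover probability in each row is the chance of staying put.
for i in range(len(states)):
    P[i][i] = 1.0 - sum(P[i][j] for j in range(len(states)) if j != i)

# Print one row per "from" state, columns in the same order as `states`.
for name, row in zip(states, P):
    print(f"{name:>8}: " + "  ".join(f"{p:.2f}" for p in row))
```

Every row sums to 1, and the salvage row has a 1 on the diagonal and zeros elsewhere, which is what makes it a sink.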

We can use the state point probability plot to see if our system has reached steady state within our time frame.

In this example, because we have a "sink" state, we do not reach steady state, where all the probabilities
have reached a constant value, but rather a pseudo-steady state where the probabilities are changing at a roughly constant rate.
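
One way to see the effect of the sink numerically is to track the salvage probability step by step and look at how much it grows per step. The sketch below is an illustration only; the diagonal entries of the matrix (the probability of staying in a state) are an assumption, computed as 1 minus the listed probabilities of leaving that state.

```python
# Rows and columns are ordered 100%, 80%, 60%, 40%, salvage.
# Diagonal entries are assumed: 1 minus the listed probabilities of leaving.
P = [
    [0.99, 0.01, 0.00, 0.00, 0.00],
    [0.10, 0.87, 0.03, 0.00, 0.00],
    [0.08, 0.00, 0.86, 0.06, 0.00],
    [0.05, 0.00, 0.00, 0.87, 0.08],
    [0.00, 0.00, 0.00, 0.00, 1.00],
]

x = [1.0, 0.0, 0.0, 0.0, 0.0]  # brand new: 100% probability of full capacity
salvage = [x[4]]               # salvage probability at steps 0, 1, ..., 120
for _ in range(120):
    x = [sum(x[i] * P[i][j] for i in range(5)) for j in range(5)]
    salvage.append(x[4])

# Growth of the salvage probability during each individual step.
growth = [salvage[k] - salvage[k - 1] for k in range(1, 121)]
for k in (30, 60, 90, 120):
    print(f"step {k:3d}: P(salvage) = {salvage[k]:.4f}, "
          f"change in last step = {growth[k - 1]:.6f}")
```

The salvage probability can only increase, because nothing ever leaves that state. If the per-step changes printed for the later steps are close to one another, that is consistent with the pseudo-steady-state behavior described above.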

Afterwards, we can check the results summary to determine the mean probabilities in each state and the point probabilities
after 120 steps (10 years).
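
For readers who want to reproduce this kind of summary outside BlockSim, the following Python sketch computes mean and point probabilities over 120 steps. It rests on two assumptions: the diagonal of the matrix is 1 minus the listed probabilities of leaving each state, and the mean is a plain average of the point probabilities at steps 1 through 120. BlockSim's averaging convention may differ, so small differences from its results summary are possible.

```python
states = ["100%", "80%", "60%", "40%", "Salvage"]

# Assumed transition matrix (rows = from, columns = to, same order as `states`).
P = [
    [0.99, 0.01, 0.00, 0.00, 0.00],
    [0.10, 0.87, 0.03, 0.00, 0.00],
    [0.08, 0.00, 0.86, 0.06, 0.00],
    [0.05, 0.00, 0.00, 0.87, 0.08],
    [0.00, 0.00, 0.00, 0.00, 1.00],
]

n_steps = 120                  # one step per month over 10 years
x = [1.0, 0.0, 0.0, 0.0, 0.0]  # system starts brand new
totals = [0.0] * len(states)

for _ in range(n_steps):
    # X(k+1) = X(k) P
    x = [sum(x[i] * P[i][j] for i in range(len(states)))
         for j in range(len(states))]
    totals = [t + xi for t, xi in zip(totals, x)]

# Assumed convention: average the point probabilities over steps 1..n_steps.
mean = [t / n_steps for t in totals]

for name, m, p in zip(states, mean, x):
    print(f"{name:>8}: mean = {m:.4f}, point at step {n_steps} = {p:.4f}")
```

The mean values estimate the fraction of the 10-year period spent in each state, while the point values give the probability of being in each state at the end of the period.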

Conclusions

From the results we can conclude that the majority of the time (89.4%) our system should be running at 100% capacity and
that by the end of the 10-year period there is about a 5.3% chance that the system will have degraded to a point from which it cannot be
restored (the salvage state).