where, for any i and j, P(ai,bj) is the probability of the joint event ai and bj.

In general, if A and B are variables with possible states {a1, a2, ..., an} and {b1, ..., bm} respectively, then the joint probability distribution P(A,B) is the set of probabilities

{ P(ai,bj) | i=1,...,n; j=1,...,m }
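
As a minimal sketch of this definition, a joint distribution can be stored as a table with one entry per pair (ai,bj). The state names and the probability values below are hypothetical, chosen only for illustration; they do not come from the text.

```python
from itertools import product

# Hypothetical states for A (n = 2) and B (m = 3); illustrative only.
states_a = ["a1", "a2"]
states_b = ["b1", "b2", "b3"]

# Hypothetical joint distribution P(A,B): one probability per pair (ai, bj).
joint = {
    ("a1", "b1"): 0.10, ("a1", "b2"): 0.20, ("a1", "b3"): 0.10,
    ("a2", "b1"): 0.30, ("a2", "b2"): 0.05, ("a2", "b3"): 0.25,
}

# The table must cover every combination i = 1..n, j = 1..m exactly once.
has_all_pairs = set(joint) == set(product(states_a, states_b))

# The n * m probabilities of a joint distribution sum to 1.
total = sum(joint.values())
```

Here the table has n * m = 6 entries, which is why joint distributions grow quickly as variables or states are added.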

If we know the joint probability distribution P(A,B) then we can calculate P(A) by a formula (called marginalisation) which comes straight from the third axiom, namely, for each state a of A:

P(a) = P(a,b1) + P(a,b2) + ... + P(a,bm)

This is because the events (a,b1), (a,b2), ..., (a,bm) are mutually exclusive and together make up the event a. When we calculate P(A) in this way from the joint probability distribution we say that the variable B is marginalised out of P(A,B). It is a very useful technique because in many situations it may be easier to calculate P(A) from P(A,B) than to determine it directly.
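
Marginalising B out of P(A,B) can be sketched in a few lines: for each state a, add up the joint probabilities over all states of B. The function name `marginalise_out_b` and the numbers in the table are assumptions made for this illustration only.

```python
from collections import defaultdict


def marginalise_out_b(joint):
    """Return P(A) from a joint table {(a, b): P(a, b)} by summing over b."""
    marginal = defaultdict(float)
    for (a, _b), p in joint.items():
        # The events (a, b1), ..., (a, bm) are mutually exclusive,
        # so their probabilities add up to P(a).
        marginal[a] += p
    return dict(marginal)


# Hypothetical joint distribution P(A,B); values are illustrative only.
joint = {
    ("a1", "b1"): 0.10, ("a1", "b2"): 0.20, ("a1", "b3"): 0.10,
    ("a2", "b1"): 0.30, ("a2", "b2"): 0.05, ("a2", "b3"): 0.25,
}

p_a = marginalise_out_b(joint)
```

With this table, `p_a["a1"]` is the sum of the first row and `p_a["a2"]` the sum of the second, and the two marginal probabilities again sum to 1 (up to floating-point rounding).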