Mutual Information and Kullback-Leibler Divergence

Questions

Q1: Which is a better measure to report: KL Divergence or Mutual Information?

Q2: Is it true that the mutual information of a variable with itself is 1?

Answers

Q1: Mutual Information vs KL Divergence

The Mutual Information between two variables X and Y is defined as follows: $$I(X,Y)=\sum_{x \in X}\sum_{y \in Y} p(x,y)\log_2 \frac{p(x,y)}{p(x)p(y)}$$ The KL Divergence compares two probability distributions, P and Q: $$D_{KL}(P({\cal X})\|Q({\cal X}))=\sum_{\cal X}P({\cal X})\log_2\frac{P({\cal X})}{Q({\cal X})}$$ We use the KL Divergence in BayesiaLab to measure the strength of a direct relationship between two variables: P is then the Bayesian network with the link and Q is the one without the link. The Mutual Information can be rewritten as: $$I(X,Y)=D_{KL}(p(x,y)\|p(x)p(y))$$


Therefore, Mutual Information (I) and KL Divergence are identical when there are no spouses (co-parents) involved in the measured relation.
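As a quick numerical sanity check of this identity, the sketch below computes the Mutual Information from its double-sum definition and as a KL Divergence between the joint distribution and the product of its marginals (the joint distribution is made up for illustration):

```python
import math

def kl_divergence(P, Q):
    """D_KL(P || Q) in bits, over a shared support."""
    return sum(p * math.log2(p / Q[k]) for k, p in P.items() if p > 0)

# Made-up joint distribution p(x, y) over two binary variables.
p_xy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

# Marginals p(x) and p(y).
p_x = {x: sum(p for (xx, _), p in p_xy.items() if xx == x) for x in (0, 1)}
p_y = {y: sum(p for (_, yy), p in p_xy.items() if yy == y) for y in (0, 1)}

# I(X,Y) straight from the double-sum definition.
mi = sum(p * math.log2(p / (p_x[x] * p_y[y])) for (x, y), p in p_xy.items())

# I(X,Y) as D_KL(p(x,y) || p(x)p(y)).
product_xy = {(x, y): p_x[x] * p_y[y] for x in (0, 1) for y in (0, 1)}
kl = kl_divergence(p_xy, product_xy)

print(abs(mi - kl) < 1e-12)  # True: the two formulas agree
```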

The percentage value in blue in the Mutual Information analysis corresponds to the Normalized Mutual Information $$I_N(X,Z)=\frac{I(X,Z)}{H(Z)}$$ and the one in red corresponds to $$I_N(X,Z)=\frac{I(X,Z)}{H(X)}$$ where H() is the entropy, defined as: $$H(X)=-\sum_{x\in X}p(x)\log_{2}(p(x))$$


The percentage in blue in the Arc Force analysis is the relative weight of the link compared to the sum of all the arc forces.


However, as soon as other variables are involved in the relation as co-parents, the KL Divergence takes them into account, leading to a more precise result.

Example


Let's take the following deterministic example where Z is an Exclusive Or between X and Y, i.e., true when X and Y are different.

The analysis of the relations with Mutual Information (Validation Mode: Analysis | Visual | Arcs' Mutual Information) returns the following graph, where the mutual information values between X and Z and between Y and Z are both null.

Indeed, X and Y do not have any impact on Z when they are analyzed separately.
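This can be verified numerically. The sketch below assumes X and Y are independent, uniform binary variables with Z = X XOR Y, then marginalizes the joint down to the (X, Z) and (Y, Z) pairs:

```python
import math
from itertools import product

# Z is the Exclusive Or of two independent, uniform binary variables.
joint = {}  # p(x, y, z)
for x, y in product((0, 1), repeat=2):
    joint[(x, y, x ^ y)] = 0.25

def mutual_information(pairs):
    """I in bits from a dict {(a, b): p(a, b)}."""
    pa, pb = {}, {}
    for (a, b), p in pairs.items():
        pa[a] = pa.get(a, 0) + p
        pb[b] = pb.get(b, 0) + p
    return sum(p * math.log2(p / (pa[a] * pb[b]))
               for (a, b), p in pairs.items() if p > 0)

# Marginalize the triplet down to the (X, Z) and (Y, Z) pairs.
p_xz, p_yz = {}, {}
for (x, y, z), p in joint.items():
    p_xz[(x, z)] = p_xz.get((x, z), 0) + p
    p_yz[(y, z)] = p_yz.get((y, z), 0) + p

print(mutual_information(p_xz))  # 0.0: X alone tells us nothing about Z
print(mutual_information(p_yz))  # 0.0: likewise for Y
```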

On the other hand, the force of the arcs computed with KL (Validation Mode: Analysis | Visual | Arc Force) perfectly reflects the deterministic effect of X and Y on Z.
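To see why accounting for the co-parent changes the picture, the sketch below computes the conditional mutual information I(X; Z | Y) for the XOR example. (Treating the KL-based arc force as a conditional mutual information here is an assumption made for illustration; BayesiaLab's exact computation may differ.)

```python
import math
from itertools import product

# Same XOR setup: p(x, y, z) with Z = X XOR Y, X and Y uniform and independent.
joint = {(x, y, x ^ y): 0.25 for x, y in product((0, 1), repeat=2)}

def cond_mi(joint):
    """I(X; Z | Y) in bits:
    sum over p(x,y,z) of log2 [ p(x,y,z) p(y) / (p(x,y) p(y,z)) ]."""
    p_y, p_xy, p_yz = {}, {}, {}
    for (x, y, z), p in joint.items():
        p_y[y] = p_y.get(y, 0) + p
        p_xy[(x, y)] = p_xy.get((x, y), 0) + p
        p_yz[(y, z)] = p_yz.get((y, z), 0) + p
    return sum(p * math.log2(p * p_y[y] / (p_xy[(x, y)] * p_yz[(y, z)]))
               for (x, y, z), p in joint.items() if p > 0)

print(cond_mi(joint))  # 1.0 bit: once Y is known, X fully determines Z
```

Once the co-parent Y is taken into account, the relation between X and Z carries a full bit of information, even though the pairwise mutual information is null.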


Q2: Normalized Mutual Information

Two clones will have a Normalized Mutual Information I_N(X, X) = 1 but not necessarily a Mutual Information I(X, X) = 1. Since I(X, X) = H(X), the normalized value is always 1, whereas I(X, X) = 1 requires the initial entropy H(X) to be exactly 1 bit. You will get it with a binary variable X that has a uniform marginal distribution.