This is a short HOWTO describing the data analysis performed to build the structural model
of class III malocclusion in “Bayesian Networks Analysis of Malocclusion Data” by
Scutari, Auconi, Caldarelli and Franchi (Scientific Reports, 2017).

The data

The raw data contains 143 patients with two measurements at ages T1 and
T2 (measured in years) for the following variables:

A second data set can be constructed by subtracting the population reference values for each
variable at a specific age given by Bathia. The resulting adjusted data can be loaded from
prepd-bathia.rda (link).

> load("prepd-bathia.rda")

The analysis for both data sets follows the same steps, so we will just cover the raw, unadjusted
data in the following.

Preprocessing and exploratory data analysis

Firstly, we create a data frame with the differences for all the variables and
with Growth and Treatment.

The Growth and Treatment variables carry redundant information on
the prognosis of the patient, as evidenced by the difference in the proportions of patients with
good Growth between TB and TG.

> table(ortho[,c("Treatment","Growth")])

Growth
Treatment Bad Good
NT 51 26
TB 10 3
TG 24 29

To avoid the confounding that would result from including both variables in the model we
re-code Treatment as a binary variable for which 0 means NT
and 1 means either TB or TG. Similarly, we re-code
Growth with 0 meaning Bad and 1 meaning
Good.

> table(diff[,c("Treatment","Growth")])

Growth
Treatment 0 1
0 51 26
1 34 32

We can then explore the (pair-wise) correlation structure of the data on the scale of change
rate (that is, difference in value divided by the difference in age, ΔX / ΔT).
Treatment is the same for both T1 and T2, and
Growth only refers to T2; so they are not divided by ΔT
and keep their original values.

> library(gplots)

Attaching package: 'gplots'
The following object is masked from 'package:stats':
lowess

We can see that there is a single cluster of (pair-wise) correlated variables: (dANB,
dCoA, dGoPg and Treatment); all the other variables are
isolated. This already suggests which variables are most likely to be the target of the treatment:
dANB and dCoA, both related to point A.

Learning the Bayesian network

Firstly, we encode the available prior knowledge in sets of whitelisted and blacklisted arcs to
use in learning the structure of the Bayesian network.

We blacklist any arc pointing to dT, Treatment and
Growth from the orthodontic variables.

We blacklist any arc from dT and Treatment. This means that whether
a patient is treated does not change over time.

We whitelist the dependence structure dANB → dIMPA ←
dPPPM.

We whitelist the arc from dT to Growth which allows the prognosis
to change over time.

Secondly, we produce a consensus model by learning 200 Bayesian networks and keeping the arcs
that appear at least ≈ 50% of the time (the threshold is estimated from the data). The
algorithm used is hill-climbing with the BIC score.

The resulting consensus network is as below. Arcs that are forced to be present in the graph
by the whitelist are highlighted in red, and arc thickness is in proportion to the frequency
reported in the str.raw object.

All the directions of the arcs seem to be well established; this can probably be attributed to
the use of a whitelist and a blacklist, as they force the directions of nearby arcs to cascade
into place.

Furthermore, a cursory examination of the arc strengths above the threshold confirms that
14 arcs in the consensus network appear in fact with a frequency of at least 0.85. All arc
directions are also clearly established (all frequencies are equal to 1).

Model validation

In order to validate the model learning strategy we perform 10 runs of 10-fold
cross-validation and measure the predictive accuracy for Growth given all the
other variables. The result is a classification error of ≈ 0.28.

Interesting questions

An excessive growth of CoGo should induce a reduction in
PPPM.
We test this hypothesis by generating samples for the Bayesian network stored in
fitted.raw.simpler for both dCoGo and dPPPM
and assuming no treatment is taking place.

We note that as dCoGo increases (which indicates an increasingly rapid
growth) dPPPM becomes increasingly negative (which indicates a reduction
in the angle assuming the angle is originally positive; in ortho the
variable PPPM ranges between 10.30 and 39.10).

If ANB decreases, IMPA decreases to compensate.
This hypothesis can be tested by simulation as before, and looking for negative values of
dANB (which indicate a decrease assuming the angle is originally positive)
associated with negative values of IMPA (same). In ortho the
original IMPA is indeed between 70.60 and 109.60, but ANB is often
negative, in which case the angle actually increases compared to a 90° reference position.

From the figure above dANB is proportional to dIMPA, so a
decrease in one suggests a decrease in the other; the mean trend (the black line) is
negative for both at the same time.

If GoPg increases strongly, then both ANB and
IMPA decrease.
If we simulate dGoPg, dANB and dIMPA from the
Bayesian network assuming dGoPg > 5 (i.e. GoPg is increasing)
we estimate the probability that dANB < 0 (i.e. ANB is
decreasing) at ≈ 0.71 and that dIMPA < 0 at ≈ 0.59.

Therapy attempts to stop the decrease of ANB. If we fix ANB
is there any difference treated and untreated patients?
First, we can check the relationship between treatment and growth for patients that
have dANB ≈ 0 without any intervention (i.e. using the model we
learned from the data).

The estimated P(GOOD.GROWTH | Treatment) is different for
treated and untreated patients (≈ 0.63 versus ≈ 0.51).
If we simulate a formal intervention (a la Judea Pearl) and externally set
dANB = 0 (thus making it independent from its parents and removing the
corresponding arcs), we have that GOOD.GROWTH has practically the same
distribution for both treated and untreated patients and thus becomes independent
from Treatment. This suggests that a favourable prognosis is indeed
determined by preventing changes in ANB and that other components of the
treatment (if any) then become irrelevant.

Does treatment affect point B after controlling for point A (using
GoPg as a proxy for point B due to its closeness)?
One way of assessing this is to check whether the angle between point A and point B
(ANB) changes between treated and untreated patients while keeping
GoPg fixed.

Assuming GoPg does not change, the angle between point A and point B increases
for treated patients (strongly negative values denote horizontal imbalance, so a positive
rate of changes indicate a reduction in imbalance) and decreases for untreated
patients (imbalance slowly worsens over time).