We set up the PSAnalysis with our list of Universes and labels. path_select determines which atoms the path similarities are calculated for, while ref_select determines which atoms are used to align each Universe to the reference structure.

For each Universe, we will generate a transition path containing each conformation from a trajectory.

First, we will do a mass-weighted alignment of each trajectory to the reference structure (reference), using the atoms selected by ref_select. To turn off the mass weighting, set weights=None. If your trajectories are already aligned, you can skip the alignment with align=False.

Now we can compute the similarity of each path. The default metric is the Hausdorff distance. [5] The Hausdorff distance between two conformation transition paths \(P\) and \(Q\) is:

\[\delta_H(P,Q) = \max{(\delta_h(P|Q), \delta_h(Q|P))}\]

\(\delta_h(P|Q)\) is the directed Hausdorff distance from \(P\) to \(Q\), and is defined as:

\[\delta_h(P|Q) = \max_{p \in P}\min_{q \in Q} d(p,q)\]

The directed Hausdorff distance from \(P\) to \(Q\) is the distance between a point \(p \in P\) and its structural nearest neighbour \(q \in Q\), for the point \(p\) where this distance is greatest. It is not symmetric, i.e. the directed Hausdorff distance from \(Q\) to \(P\) is generally not the same. (See scipy.spatial.distance.directed_hausdorff for more information.)
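The asymmetry of the directed distance can be illustrated with a minimal pure-Python sketch. This is not how MDAnalysis implements it; the function names, the toy one-dimensional "paths" and the absolute-difference metric are assumptions made only for illustration.

```python
def directed_hausdorff(P, Q, d):
    """Largest distance from any point in P to its nearest neighbour in Q."""
    return max(min(d(p, q) for q in Q) for p in P)


def hausdorff(P, Q, d):
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(P, Q, d), directed_hausdorff(Q, P, d))


def dist(a, b):
    """Toy point metric (assumption): absolute difference of two numbers."""
    return abs(a - b)


# Toy paths: Q contains every point of P plus one outlier.
P = [0.0, 1.0]
Q = [0.0, 1.0, 5.0]

d_PQ = directed_hausdorff(P, Q, dist)  # every p has an exact match in Q
d_QP = directed_hausdorff(Q, P, dist)  # q = 5.0 is 4.0 away from its nearest p
d_H = hausdorff(P, Q, dist)

print(d_PQ, d_QP, d_H)
```

Here the directed distance from \(P\) to \(Q\) is zero, while the reverse direction is dominated by the outlier in \(Q\); taking the maximum of both is what makes \(\delta_H\) symmetric.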

In MDAnalysis, the Hausdorff distance is the RMSD between a pair of conformations in \(P\) and \(Q\), where one of the conformations in the pair has the least similar nearest neighbour.
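To make the "pair of conformations" reading concrete, here is a rough sketch that uses RMSD as the point metric and also reports which frames realise the Hausdorff distance. It assumes the frames are already aligned (no superposition is done), uses unweighted RMSD, and works on hypothetical two-atom conformations; the helper names are illustrative and are not part of the MDAnalysis API.

```python
import math


def rmsd(a, b):
    """Unweighted RMSD between two conformations, each a list of (x, y, z)."""
    sq = sum((ai - bi) ** 2 for pa, pb in zip(a, b) for ai, bi in zip(pa, pb))
    return math.sqrt(sq / len(a))


def directed_hausdorff_pair(P, Q):
    """Return (distance, i, j): frame i of P whose nearest frame j in Q is farthest."""
    best = (-1.0, -1, -1)
    for i, p in enumerate(P):
        # Nearest neighbour of frame i among the frames of Q.
        nearest, j = min((rmsd(p, q), j) for j, q in enumerate(Q))
        if nearest > best[0]:
            best = (nearest, i, j)
    return best


def hausdorff_pair(P, Q):
    """Return (distance, frame index in P, frame index in Q)."""
    d_pq, i, j = directed_hausdorff_pair(P, Q)
    d_qp, j2, i2 = directed_hausdorff_pair(Q, P)
    return (d_pq, i, j) if d_pq >= d_qp else (d_qp, i2, j2)


def frame(x):
    """Hypothetical two-atom conformation translated by x along the x-axis."""
    return [(x, 0.0, 0.0), (x + 1.0, 0.0, 0.0)]


P = [frame(0.0), frame(1.0), frame(2.0)]
Q = [frame(0.0), frame(1.0), frame(2.0), frame(5.0)]

distance, frame_p, frame_q = hausdorff_pair(P, Q)
print(distance, frame_p, frame_q)
```

In this toy case the last frame of \(Q\) has no close counterpart in \(P\), so it and its nearest neighbour in \(P\) form the pair that defines the distance.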

[6]:

ps.run(metric='hausdorff')

/Users/lily/anaconda3/envs/mdanalysis/lib/python3.7/site-packages/MDAnalysis/analysis/psa.py:1556: DeprecationWarning: `save_result` is deprecated!
`save_result` will be removed in release 1.0.0.
You can save the distance matrix :attr:`D` to a numpy file with ``np.save(filename, PSAnalysis.D)``.
self.save_result(filename=filename)

psa.PSAnalysis provides two convenience methods for plotting this data. The first is to plot a heat-map dendrogram from clustering the trajectories based on their path similarity. You can use any clustering method supported by scipy.cluster.hierarchy.linkage; the default is ‘ward’.

The discrete Fréchet distance between two conformation transition paths \(P\) and \(Q\) is:

\[\delta_{dF}(P,Q) = \min_{C \in \Gamma_{P,Q}} \|C\|\]

where \(C\) is a coupling in the set of all couplings \(\Gamma_{P,Q}\) between \(P\) and \(Q\). A coupling \(C(P,Q)\) is a sequence of pairs of conformations in \(P\) and \(Q\), where the first/last pairs are the first/last points of the respective paths, and for each successive pair, at least one point in \(P\) or \(Q\) must advance to the next frame.

The coupling distance \(\|C\|\) is the largest distance between a pair of points in such a sequence, where the \(i\)-th of its \(L\) pairs matches frame \(a_i\) of \(P\) with frame \(b_i\) of \(Q\):

\[\|C\| \equiv \max_{i=1, ..., L} d(p_{a_i}, q_{b_i})\]

In MDAnalysis, the discrete Fréchet distance is the lowest possible RMSD between a conformation from \(P\) and a conformation from \(Q\), where the two frames are at similar points along the trajectory, and they are the least structurally similar in that particular coupling sequence. [6-9]
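The minimisation over couplings does not need to enumerate them explicitly. A common approach is dynamic programming over frame pairs; the sketch below assumes that approach and is not taken from the MDAnalysis source. The function names, the toy one-dimensional paths and the absolute-difference metric are illustrative assumptions.

```python
def discrete_frechet(P, Q, d):
    """Discrete Frechet distance between paths P and Q under point metric d."""
    n, m = len(P), len(Q)
    # ca[i][j] = smallest coupling distance over couplings of P[:i+1] and Q[:j+1].
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            dij = d(P[i], Q[j])
            if i == 0 and j == 0:
                ca[i][j] = dij  # couplings must start at the first frames
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], dij)  # only Q can advance
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], dij)  # only P can advance
            else:
                # Advance P, Q, or both; keep the best predecessor.
                prev = min(ca[i - 1][j], ca[i][j - 1], ca[i - 1][j - 1])
                ca[i][j] = max(prev, dij)
    return ca[n - 1][m - 1]


def hausdorff(P, Q, d):
    """Hausdorff distance, shown only for comparison."""
    d_pq = max(min(d(p, q) for q in Q) for p in P)
    d_qp = max(min(d(p, q) for p in P) for q in Q)
    return max(d_pq, d_qp)


def dist(a, b):
    """Toy point metric (assumption): absolute difference of two numbers."""
    return abs(a - b)


# The same points visited in opposite order.
P = [0.0, 1.0, 2.0]
Q = [2.0, 1.0, 0.0]

d_F = discrete_frechet(P, Q, dist)
d_H = hausdorff(P, Q, dist)
print(d_F, d_H)
```

Because a coupling must pair the first frames with each other and the last frames with each other, the reversed path gives a non-zero Fréchet distance even though the Hausdorff distance, which ignores ordering, is zero.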

[10]:

ps.run(metric='discrete_frechet')
ps.D

/Users/lily/anaconda3/envs/mdanalysis/lib/python3.7/site-packages/MDAnalysis/analysis/psa.py:1556: DeprecationWarning: `save_result` is deprecated!
`save_result` will be removed in release 1.0.0.
You can save the distance matrix :attr:`D` to a numpy file with ``np.save(filename, PSAnalysis.D)``.
self.save_result(filename=filename)