Suppose I have two models whose marginal likelihoods (evidences) were calculated through thermodynamic integration. `thermodynamic_integration_log_evidence` in `PTSampler` returns both an estimate of the integral and an error term for it, which the docs say arises from sampling at a finite number of temperatures.

If I were confident in the likelihoods $L_1, L_2$ for my models $M_1, M_2$, then I could just compute the Bayes factor $L_1/L_2$ (or $\exp(LL_1 - LL_2)$ for log likelihoods) to see how many times more likely $M_1$ is than $M_2$.
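For concreteness, here is a minimal sketch of that computation in Python. It works from the log values, because exponentiating each one separately would underflow for typical magnitudes. The function name `bayes_factor` and the numbers are made up for illustration; they are not output from a real run.

```python
import math

def bayes_factor(ll_1: float, ll_2: float) -> float:
    """Return L_1 / L_2 given the two log likelihoods LL_1 and LL_2."""
    # Subtract in log space first, then exponentiate once.
    return math.exp(ll_1 - ll_2)

# Hypothetical log values, chosen only to illustrate the arithmetic.
ll_1 = -1000.0
ll_2 = -1002.0

# math.exp(-1000.0) underflows to 0.0, so L_1 / L_2 cannot be formed directly,
# but the difference of the logs is perfectly well behaved.
print(bayes_factor(ll_1, ll_2))
```

With these values the difference of the logs is 2, so the result is $e^2$, meaning $M_1$ would be favoured by a factor of about 7.4 if the inputs were exact.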

But since $L_1, L_2$ come with corresponding errors $\sigma_{L_1}, \sigma_{L_2}$, how can I take these into account in the comparison?