Mayo on S. Senn: “How Can We Cultivate Senn’s-Ability?”–reblogs

Since Stephen Senn will be leading our seminar at the LSE tomorrow morning (see PH500 page), I’m reblogging my deconstruction of his paper (“You May Believe You Are a Bayesian But You Probably Are Wrong”) from Jan. 15, 2012 (though it is not his main topic tomorrow). At the end I link to other “U-Phils” on Senn’s paper (by Andrew Gelman, Andrew Jaffe, Christian Robert), Senn’s response, and my response to them. Queries: write me at error@vt.edu

Mayo Philosophizes on Stephen Senn: “How Can We Cultivate Senn’s-Ability?”

Where’s Mayo?

Although, in one sense, Senn’s remarks echo the passage of Jim Berger’s that we deconstructed a few weeks ago, Senn at the same time seems to reach an opposite conclusion. He points out how, in practice, people who claim to have carried out a (subjective) Bayesian analysis have actually done something very different—but that then they heap credit on the Bayesian ideal. (See also “Who Is Doing the Work?”)

“A very standard form of argument I do object to is the one frequently encountered in many applied Bayesian papers where the first paragraphs laud the Bayesian approach on various grounds, in particular its ability to synthesize all sources of information, and in the rest of the paper the authors assume that because they have used the Bayesian machinery of prior distributions and Bayes theorem they have therefore done a good analysis. It is this sort of author who believes that he or she is Bayesian but in practice is wrong.” (Senn 58)

Why in practice is this wrong? For starters, Senn points out, the analysis seems to violate such strictures as temporal coherence:

Attempts to explain away the requirement of temporal coherence always seem to require an appeal to a deeper order of things—a level at which inference really takes place that absolves one of the necessity of doing it properly at the level of Bayesian calculation. (ibid.)
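
The requirement can be illustrated with a toy conjugate model. The sketch below is an illustration only, not taken from Senn’s paper: it assumes a Beta prior on a success probability and made-up counts for two days of data, and every name in it is hypothetical. Under strict conditioning, updating day by day gives the same posterior as updating once on the pooled data; quietly swapping in a different prior after seeing the first day’s data gives a posterior that conditioning on the original prior could not have produced.

```python
from fractions import Fraction


def condition(beta_params, successes, failures):
    """Bayes update of a Beta(a, b) belief about a success probability."""
    a, b = beta_params
    return (a + successes, b + failures)


def mean(beta_params):
    """Exact mean of a Beta(a, b) distribution."""
    a, b = beta_params
    return Fraction(a, a + b)


# Hypothetical prior and data, chosen only for illustration.
prior = (1, 1)   # Beta(1, 1), i.e. uniform
day1 = (3, 1)    # (successes, failures) observed on day 1
day2 = (2, 4)    # (successes, failures) observed on day 2

# Temporally coherent: today's posterior is tomorrow's prior.
sequential = condition(condition(prior, *day1), *day2)
# Same prior, all the data at once.
batch = condition(prior, day1[0] + day2[0], day1[1] + day2[1])

# Not coherent: the prior is replaced after the day-1 data have been seen,
# and the analysis is then rerun as if that had been the prior all along.
revised_prior = (1, 5)
revised = condition(condition(revised_prior, *day1), *day2)

print(sequential, batch, mean(sequential))
print(revised, mean(revised))
```

Nothing in the arithmetic flags the third analysis as different in kind from the first two; the departure from the formal rules is visible only if the change of prior is reported.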

So even if they come out with sensible analyses, Senn is saying, it is despite rather than because they followed strict Bayesian rules and requirements. It is thanks to certain unconscious interventions, never made explicit, and perhaps not even noticed by the Bayesian reasoner. “This is problematic,” Senn thinks, “because it means that the informal has to come to the rescue of the formal.” Not that there is anything wrong with informality . . .

“Indeed, I think it is inescapable. I am criticising claims to have found the perfect system of inference as some form of higher logic because the claim looks rather foolish if the only thing that can rescue it from producing silly results is the operation of the subconscious.” (59)

Now, many Bayesians would concede to Senn that in arriving at their outputs they violate strict norms laid down by De Finetti or other subjective Bayesians. But why then do they credit these outputs to some kind of philosophical Bayesianism? The answer, I take Senn to be suggesting, is the fact that they assume that there is but one philosophically righteous position—that of being a Bayesian deep down, where “Bayesian deep down” alludes to a fundamental subjective Bayesian position.

Senn’s idea may be that their belief in Bayesianism deep down is a priori, so it’s little wonder that no empirical facts can shatter their standpoint. (The very definition of an a priori claim is that it’s not open to empirical appraisal.) I think this is generally the case. Many have simply been taught the Bayesian catechism: that subjective Bayesianism is at the foundation of all adequate statistical analyses, and offers the only way to capture uncertainty. Others are true-blue believers (not only in the Bayesian ideal but in the frequentist howlers regularly trotted out). Either way, one can understand why so many Bayesian articles follow the pattern Senn describes: begin by saying grace and end by thanking the Bayesian account for its offer to house all their uncertainties within prior probability distributions, even if in between, the analysis immediately turns to non-Bayesian means that can more ably grapple with both the limits and the goals of the actual inquiry.

Yet Senn, as I understand him, finds this Bayesian “grace and amen routine”—my term not his—disingenuous and utterly insufficient as a foundation for statistical research. We ought to be able to look into the black box and recognize that the methods used scarcely toe the (subjective) Bayesian line, or so Senn seems to be saying:

In a paper published in Statistics in Medicine in 2005 Lambert et al. considered thirteen different Bayesian approaches to the estimation of the so-called random effects variance in meta-analysis. . . .

The paper begins with a section in which the authors make various introductory statements about Bayesian inference. For example, “In addition to the philosophical advantages of the Bayesian approach, the use of these methods has led to increasingly complex, but realistic, models being fitted,” and “an advantage of the Bayesian approach is that the uncertainty in all parameter estimates is taken into account” (Lambert et al. 2005, 2402), but whereas one can neither deny that more complex models are being fitted than had been the case until fairly recently, nor that the sort of investigations presented in this paper are of interest, these claims are clearly misleading in at least two respects. (Senn 2011, 62)

First, the “philosophical” advantages to which the authors refer must surely be to the subjective Bayesian approach outlined above, yet what the paper considers is no such thing. None of the thirteen prior distributions considered can possibly reflect what the authors believe about the random effect variance.[i] Second, the degree of uncertainty must be determined by the degree of certainty and certainty has to be a matter of belief so that it is hard to see how prior distributions that do not incorporate what one believes can be adequate for the purpose of reflecting certainty and uncertainty. (62-3)

Now let’s compare this with Jim Berger. Berger, I take it, holds to philosophical Bayesianism, while granting that, in practice, we need conventional priors that are not claimed to be expressions of uncertainty or degree of belief (see also Dec 19, Dec 26, Jan 3). Senn’s second point says to Berger that, in that case, one cannot claim that the Bayesian analysis reflects uncertainty or degree of belief (be it actual or rational). But one who holds to Bayesianism Deep Down (DD?) can appeal to the position we crafted to resolve the paradox in Berger’s notion that the use of conventional priors is a way of becoming more subjective: Since being a philosophical Bayesian DD (BADD?) is assumed (a priori), and since replacing “terrible” priors with default priors is deemed an improvement, it must therefore be closer to the subjective Bayesian ideal.

Although Senn at times seems almost to grant that subjective Bayesianism is perfect in theory (or he at least admits to having a love-hate relationship with it), he’s clearly “criticising the claim that it is the only system of inference and in particular I am criticising the claim that because it is perfect in theory it must be the right thing to use in practice” (59).[ii]

Despite these occasional whiffs of being BADD, Senn’s critique would seem to locate him outside the Bayesian (and perhaps any other) formal paradigm. Yet why suppose that this “metastatistical standpoint” admits of no general, non-trivial, empirical standards and principles? It seems to me that one should not suppose this, but instead try to unearth these general arguments, however “informal” or “quasi-formal” they may be. Moreover, I will argue that unless we do so, a Senn-style position in praise of eclecticism will fail at its intended aim.

Noting that another Bayesian paper a few years later effectively concedes his point, Senn remarks:

This latter paper by the by is also a fine contribution to practical data-analysis but it is not, despite the claim in the abstract, “We conclude that the Bayesian approach has the advantage of naturally allowing for full uncertainty, especially for prediction,” a Bayesian analysis in the De Finetti sense. Consider, for example this statement, “An effective number of degrees of freedom for such a t-distribution is difficult to determine, since it depends on the extent of the heterogeneity and the sizes of the within-study standard errors as well as the number of studies in the meta-analysis.” This may or may not be a reasonable practical approach but it is certainly not Bayesian. (63)

Here, as elsewhere, Senn seems to have no trouble regarding the work as “a fine contribution” to statistical analysis, but one wonders: what criteria is he using to approve it? Is he content to leave those criteria at the unconscious level without making them explicit? If so, isn’t he open to the same kinds of subliminal appraisals made by the Bayesians he takes to task? Can we not learn the basis for Senn’s sensibility (senn’s-ibility?)? Does he think that the standards he uses for critically appraising, interpreting, and using statistical methods are ephemeral? Can we say nothing more than that they shouldn’t be too terribly awful on any of the four strands of statistical methodology? Senn takes the Bayesian to task for showing us only how to be perfect, but not how to be “good.” Let’s move on to this.

To make this more concrete: How, specifically, would Senn have those authors describe what they actually did, given that it’s “certainly not Bayesian”? Now, Senn is not really crediting any overarching or underlying philosophical standpoint for his expertise—but shouldn’t he? Is the choice between adopting an a priori standpoint and adopting eclecticism “all the way down”—even at the level of critically appraising, interpreting, and using statistical methods? If, as Senn himself suggests, most of the Bayesians writing the papers he takes to task are doing what they do more or less unconsciously, then how will he raise their consciousness? Saying it’s not really Bayesian doesn’t quite tell them what it is.

One might question my presumption that there are some overarching standards, principles, or criteria used in judging work from different schools. But we should at least try to articulate them before assuming it’s not possible. And anyway, Senn’s remarks suggest he is senn-sitive to applying a “second-order” scrutiny.

The account would be far more complex than the neat and tidy accounts often sought: it ranges from determining what one wants to learn and breaking it up into piecemeal questions, to collecting, modeling, and interpreting data and feeding results from one stage into others. Nevertheless, I have suggested there are overarching criteria and patterns of inference (based on identifying the error or threat at the particular stage). (See Nov. 5 post.)

To conclude these remarks, then, I want to laud Senn for courageously calling attention to the widespread practice of erroneously describing research as Bayesian, as well as to the tendency toward a priori adulation of philosophical Bayesianism Deep Down. But now that nearly no Bayesians explicitly advocate the one true subjective Bayesian ideal, more is needed[iv]. Their position has shifted. While adhering to the BADD ideal, they will still describe their methods as mere approximations of that ideal. After all, they will (and do) say, they can’t be perfect, but the Bayesian ideal still lights the way, and therefore discredits all Senn-ible criticism of their claim that all you need is Bayes.

Unless Senn identifies the non-Bayesian work in between the “grace and amen” Bayesianism, the worry (my worry) is that there will be no obligation to amend this practice. Nor is it enough, it seems to me, to merely point out that they are using tools from standard frequentist schools, since these can always be reinterpreted Bayesianly, or so they will say. If it’s just a name game, the new-styled Bayesians can say, as some already do about their favorite methods, “I dub thee Bayesian,” since “Bayesian” is in the title of my book, or since a conditional probability is used somewhere. That’s the challenge I am posing to those who would advance the current state of statistical foundations.

[i] He continues: “One problem, which seems to be common to all thirteen prior distributions, is that they are determined independently of belief about the treatment effect. This is unreasonable since large variation in the treatment effect is much more likely if the treatment effect is large” (Senn 2007b).

[ii] In at least one place Senn slips into the tendency to equate the use of background knowledge to being Bayesian in a subjective sense: Senn declares that a frequentist statistician who chose to set a carry-over effect to zero, in a clinical trial where it fairly obviously warranted being ignored, “would be being more Bayesian in the De Finetti sense than one who used conventional uninformative prior distributions or even Bayes’ factor” (p. 62). (See, in this connection, the discussion in Cox and Mayo [also RMM 2011] on the use of background knowledge.) But there is no evidence that this background knowledge was or needs to be translated into a prior probability distribution.


11 thoughts on “Mayo on S. Senn: “How Can We Cultivate Senn’s-Ability?”–reblogs”

“Senn’s idea may be that their belief in Bayesianism deep down is a priori, so it’s little wonder that no empirical facts can shatter their standpoint. (The very definition of an a priori claim is that it’s not open to empirical appraisal.) I think this is generally the case.”

This reminds me of that time when Clint Eastwood projected an imaginary personality on a chair so that he could argue with it.

As I often find myself, here as elsewhere, in agreement with Prof. Senn (significantly more often than with the vast majority of writers on foundations of statistics), I can see the challenge to elaborate the standards also as a challenge to myself, and a well-placed one.

Just as a first attempt, I’d see clarity/transparency as an important standard violated by the mentioned Bayesians (or “Bayesians”), apparent in the inconsistency between the philosophy to which they refer and their own way of justifying their priors (or the absence of such a justification).

Christian: Thanks for this. I concur that their clarity/transparency needs shoring up, but on the apparent inconsistency—couldn’t they/one say we just introspect in both cases? The introspection might be shallow and not very deeply self-critical, but still a similar appeal?

“couldn’t they/one say we just introspect in both cases?”
They could but
a) they don’t (as far as I know), and
b) of course if they did, their whole analysis would be useless for everyone who says “but my beliefs are different”. This could be improved by them trying to justify why their beliefs are what they are, if they do this convincingly (that’s the thing about doing proper subjectivist analysis in a useful manner; you know I’m not as critical of de Finetti as some others…).
What I dislike most in what some Bayesians do is the appeal to subjectivist philosophy combined with a deep wish not to appear subjective.

Christian: But I still want to know what the subjectivist is measuring. Perhaps something like the degree to which one believes in the occurrence of event x, or the truth of a proposition or hypothesis about a phenomenon (on given evidence and background), but to be warranted belief, this judgment must be well supported in some way. So one is back to needing an account of inferential warrant, or reliable method or the like, perhaps imagining it in the background. Even aside from this, I think probability theory gets the logic wrong. (Hence all the fuss about avoiding support of some sort for irrelevant conjunctions. Disjunctions and conditionals are even worse.)
But I’m prepared to grant people their logics of belief, since they love them so much, and yet have them realize that they need another type of reasoning account (even to get the warranted evidence claim, and certainly to avoid classic problems of underdetermination of theory by data, Duhemian problems of pinpointing blame, distinguishing which aspect of a large-scale theory is supported, justifying intuitions about ad hocness and novel evidence). (See my “continuing the conversation with Corey, not long ago, on characterizing a “poor test”.)

This seems to me like a collection of what’s wrong with a number of approaches and if I just wanted to defend a purist de Finetti-stance some of these issues wouldn’t apply, though of course others would.
Anyway, that’s not the topic of the current posting. I’d think we agree that the people Senn is discussing can be charged on grounds of missing clarity. That I find de Finetti much better in at least this respect (although not necessarily in all other relevant departments) is a different story.
Well not so different perhaps because I should still try to be clear at least to myself what the problems are where it’s not clarity.

I was recently reading Nate’s popular book, so some of those ideas have been on my mind lately.

I think a point of departure is that from the frequentist perspective the most important “measurement” is often the interval estimate – it must have a well-defined objective interpretation, since that will be the basis for any sort of hypothesis test. From this perspective, the fact that a 95% credible interval derived from a posterior distribution doesn’t have a well-defined objective interpretation is problematic.

By contrast, according to Nate’s perspective the “measurement” one is ultimately interested in is the estimate of the parameter, which usually has an inherent well-defined real-world meaning. Credible interval estimates are needed to help interpret his estimate, but are not the measurement of interest. He defines a “good” credible interval in a purely utilitarian way based on calibration, which Larry has been arguing means he’s effectively a frequentist who applies Bayes’ theorem.
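
One way to make the calibration idea concrete is to ask how often a 95% credible interval contains the true parameter under repeated sampling. The sketch below is a hedged illustration, not anything from Nate’s book: it assumes a normal mean with known standard deviation and a normal prior, and the function names, prior settings, sample size and true values are all hypothetical. Judging the interval this way is a frequentist check on a Bayesian output.

```python
import random
from statistics import NormalDist


def credible_interval(xbar, n, sigma, prior_mean, prior_sd, level=0.95):
    """Central credible interval for a normal mean (known sigma, normal prior)."""
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = n / sigma ** 2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * xbar)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * post_var ** 0.5
    return post_mean - half_width, post_mean + half_width


def coverage(true_theta, n=10, sigma=1.0, prior_mean=0.0, prior_sd=1.0,
             reps=20000, seed=0):
    """Fraction of repeated samples whose 95% credible interval covers true_theta."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # The mean of n draws from N(theta, sigma^2) is N(theta, sigma^2 / n).
        xbar = rng.gauss(true_theta, sigma / n ** 0.5)
        lo, hi = credible_interval(xbar, n, sigma, prior_mean, prior_sd)
        hits += lo <= true_theta <= hi
    return hits / reps


# Hypothetical true values: at, near, and far from the prior mean of 0.
for theta in (0.0, 3.0, 6.0):
    print(theta, coverage(theta))
```

In this setup the long-run coverage depends on how far the true value lies from the prior mean: it is close to the nominal level when the prior is well centred and falls away as the prior becomes misleading.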

rv: People have lots of strange ideas about frequentists. If, as you say, the “estimate of the parameter, which usually has an inherent well-defined real-world meaning” does have a real-world meaning, then it’s precisely the kind of thing we are interested in, whether I look at the inference as outputting an estimate or testing a claim. The objective part, for the frequentist error statistician, is being able to appraise and control, at least approximately, the properties of the estimation or test method to have made various discriminations, rule out/reveal discrepancies, etc. Anyway, what you say sounds in sync with Normal Deviate’s claim that Nate is a frequentist. I don’t know Nate.

Christian: By the way, I find your remark interesting because the very relativism that leads you to claim “their whole analysis would be useless” (i.e., everyone is free to have their own beliefs and evidential interpretations) seems to me just what subjective Bayesian epistemologists say. Isn’t that what being a subjectivist is all about? Isn’t that why Lindley says he doesn’t know what right or wrong could even mean (being a subjective Bayesian)? ref is in EGEK.

Mayo: True, in principle the subjectivist’s analysis is aimed at being useful only for the subjectivist herself. But if a subjectivist explains convincingly how she arrived at her prior and somebody else thinks that this makes sense, this other person can basically adopt the analysis. That’s why in science (at least) the prior should always be justified and as good as possible.

According to de Finetti, the aim of subjectivist analysis is to obtain predictive distributions for future events, based on the subjective prior assessment and data. The quality of this can be checked. De Finetti wanted to have people betting on their predictions. I guess this is hardly new to you so I’m not really sure what you think is missing in this account. My personal problem with it is that a) it’s often something other than predictive distributions we’re interested in and b) in most cases it is difficult if not impossible to come up with a really convincing prior.
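
To sketch what “checking the quality” of predictive distributions could look like, the example below scores one-step-ahead predictive probabilities against what actually happened. It is an illustration under stated assumptions only: a Beta prior on a success probability, a made-up sequence of binary outcomes, and the Brier score standing in for the betting idea; none of the names or numbers come from de Finetti.

```python
def predictive_brier(outcomes, a=1.0, b=1.0):
    """Mean Brier score of one-step-ahead Beta-Bernoulli predictions.

    Before each outcome is revealed, the predictive probability of a
    success is a / (a + b); the Beta(a, b) belief is then updated.
    """
    total = 0.0
    for y in outcomes:
        p = a / (a + b)          # predictive probability of y == 1
        total += (p - y) ** 2    # squared error of that prediction
        a += y                   # conjugate update on the observed outcome
        b += 1 - y
    return total / len(outcomes)


# Hypothetical data: 1 = event occurred, 0 = it did not.
outcomes = [1, 1, 0, 1, 1, 1, 0, 1]

uniform_score = predictive_brier(outcomes, a=1.0, b=1.0)    # Beta(1, 1) prior
skeptic_score = predictive_brier(outcomes, a=1.0, b=9.0)    # prior expecting few events

print(round(uniform_score, 3), round(skeptic_score, 3))
```

A lower score means the predictions were closer to the outcomes, so two analysts with different priors can be compared on the same data without either having to accept the other’s beliefs.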