(The prod that got me thinking about this was a light-hearted blog post by Rasmus Baath regarding a "mascot" for Bayesian data analysis. My comment on that post is the beginning of this post. Please note that the icons presented here are not intended as advocacy or cheerleading or as mascots. Instead, I would like the icons to capture key ideas succinctly.)

What is "the essence" of Bayesian data analysis? And what is "the essence" of frequentist data analysis? Any answer will surely provoke disagreement, but that is not my goal. The questions are earnest and often asked by beginners and by experienced practitioners alike. As an educator, I think that the questions deserve earnest answers, with the explicit caveat that they will be incomplete and subject to discussion and improvement. So, here goes.

The essence of Bayesian data analysis is inferring the uncertainty (i.e., relative credibility) of parameters in a model space, given the data. Therefore, an icon should represent the data, the form of the model space, and the credible parameters.

The simplest example I can think of is linear regression: Many people are familiar with x-y scatter plots as diagrams of data, and many people are familiar with lines as a model form. The credible parameters can then be suggested by a smattering of lines sampled from the posterior distribution, like this (Fig. 1):

Figure 1. An icon for Bayesian data analysis.

The icon in Figure 1 represents the data as black dots. The icon represents the model form by the obvious linearity of every trend line. The icon represents uncertainty, or relative credibility, by the obvious range of slopes and intercepts, with greater density in the middle of the quiver of lines. I like Figure 1 as a succinct representation of Bayesian analysis because the figure makes it visually obvious that there is a particular model space being considered, and that there is a range of credible possibilities in that space, given the data.
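
The post does not say how the posterior lines in Figure 1 were generated, so the following is only a minimal sketch of one way to produce such a quiver. It assumes a normal noise distribution and the standard noninformative prior (flat on intercept and slope, proportional to 1/sigma^2 on the noise variance), for which the posterior can be sampled directly without MCMC. The data set and the names `toy_data`, `least_squares`, and `posterior_lines` are invented for illustration.

```python
import random


def toy_data(n=20, seed=0):
    """Invented x-y data with a linear trend plus normal noise (illustration only)."""
    rng = random.Random(seed)
    x = [rng.uniform(0, 10) for _ in range(n)]
    y = [2 + 1.0 * xi + rng.gauss(0, 1) for xi in x]
    return x, y


def least_squares(x, y):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def posterior_lines(x, y, n_lines=30, seed=1):
    """Sample (intercept, slope) pairs from the posterior under the assumed
    noninformative prior. Requires more than two data points."""
    rng = random.Random(seed)
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    a_hat, b_hat = least_squares(x, y)
    sse = sum((yi - (a_hat + b_hat * xi)) ** 2 for xi, yi in zip(x, y))
    lines = []
    for _ in range(n_lines):
        # sigma^2 given the data: SSE divided by a chi-square draw with n - 2 df
        sigma2 = sse / rng.gammavariate((n - 2) / 2, 2.0)
        # Given sigma^2, the slope and the height of the line at xbar
        # are independent normals centred on their least-squares values.
        slope = rng.gauss(b_hat, (sigma2 / sxx) ** 0.5)
        height_at_xbar = rng.gauss(ybar, (sigma2 / n) ** 0.5)
        lines.append((height_at_xbar - slope * xbar, slope))
    return lines


x, y = toy_data()
for intercept, slope in posterior_lines(x, y, n_lines=5):
    print(f"credible line: y = {intercept:.2f} + {slope:.2f} * x")
```

Each returned pair can be drawn as one line over the scatter plot. With an informative prior, or a different noise distribution, this direct sampler no longer applies and the lines would have to come from some other posterior sampler.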

Are there infelicities in Figure 1? Of course. For example, the shape and scale of the noise distribution are not represented. But many ancillary details must be suppressed in any icon.

Perhaps a more important infelicity in Figure 1 is that the prior distribution is not represented, other than the form of the model space. That is, the lines indicate that the prior distribution puts zero probability on quadratic or sinusoidal or other curved trends, but the lines do not indicate the form of the prior distribution on the allowed parameters.

Some people may feel that this lack of representing the prior is a failure to capture an essential part of Bayesian data analysis. Perhaps, therefore, a better icon would include a representation of the prior -- maybe as a smattering of grey lines set behind the data and the blue posterior lines. For the vague prior used in this example, the prior would be a background of randomly criss-crossing grey lines, which might be confusing (and ugly).

Now, an analogous icon for frequentist data analysis.

The essence of frequentist data analysis is inferring the extremity of a data property in the space of possibilities sampled in a specified way from a given hypothesis. (That is, inferring a p value.) Note that the data property is often defined with respect to a model family, such as the best fitting slope and intercept in linear regression. Therefore, an icon should represent the data, the data property, and the space of possibilities sampled from the hypothesis (with the extremity of the data property revealed by its visual relation to the space of possibilities).

Keeping with the linear regression scenario, an icon for frequentist analysis might look like this (Fig. 2):

Figure 2. An icon for frequentist data analysis.

As before, the data are represented by black dots. The data property is represented by the single blue line, which shows the least-squares fit to the data. The space of possibilities is represented by the smattering of red lines, which were created as least-squares fits to randomly resampled x and y values (with replacement, using fixed sample size equal to the data sample size). In other words, the hypothesis here is a "null" hypothesis that there is no systematic covariation between the x and y values. I like Figure 2 as a succinct representation of frequentist data analysis, especially when juxtaposed with Figure 1, because Figure 2 shows that there is a point estimate to describe the data (i.e., the single blue line) and a sample of hypothetical descriptions unrelated to the data.
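
The resampling described above can be sketched as follows. This is a hedged illustration rather than the code behind Figure 2: the data set and the names `toy_data`, `least_squares`, `null_lines`, and `p_value_slope` are invented here, and the two-sided p value on the slope is just one possible choice of data property.

```python
import random


def toy_data(n=20, seed=0):
    """Invented x-y data with a linear trend plus normal noise (illustration only)."""
    rng = random.Random(seed)
    x = [rng.uniform(0, 10) for _ in range(n)]
    y = [2 + 1.0 * xi + rng.gauss(0, 1) for xi in x]
    return x, y


def least_squares(x, y):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def null_lines(x, y, n_lines=30, seed=1):
    """Least-squares fits to x and y values resampled independently of each
    other, with replacement, at the original sample size (the red lines)."""
    rng = random.Random(seed)
    n = len(x)
    lines = []
    while len(lines) < n_lines:
        xs = rng.choices(x, k=n)
        ys = rng.choices(y, k=n)
        if max(xs) == min(xs):
            continue  # slope is undefined when every resampled x is identical
        lines.append(least_squares(xs, ys))
    return lines


def p_value_slope(x, y, n_sim=2000, seed=1):
    """Proportion of null slopes at least as far from zero as the observed slope."""
    _, b_obs = least_squares(x, y)
    null = null_lines(x, y, n_lines=n_sim, seed=seed)
    return sum(abs(b) >= abs(b_obs) for _, b in null) / n_sim


x, y = toy_data()
print("blue line (intercept, slope):", least_squares(x, y))
print("simulated two-sided p value for the slope:", p_value_slope(x, y))
```

Because x and y are resampled separately, any pairing between them is destroyed, which is what makes the red lines a sample from the "no systematic covariation" hypothesis.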

Are there infelicities in Figure 2? Of course. Perhaps the most obvious is that there is no representation of a confidence interval. In my opinion, to properly represent a confidence interval in the format of Figure 2, there would need to be two additional figures, one for each limit of the 95% confidence interval. One additional figure would show a quiver of red lines generated from the lower limit of the confidence interval, which would show the single blue line among the 2.5% steepest red lines. The second additional figure would show a quiver of red lines generated from the upper limit of the confidence interval, which would show the single blue line among the 2.5% shallowest red lines. The key is that there would be a single unchanged blue line in all three figures; what changes across figures is the quiver of red lines sampled from the changing hypothesis.
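
The idea of a changing hypothesis with an unchanged blue line can be sketched in code by scanning hypothesized slopes and keeping those under which the observed slope is not in either 2.5% tail. The post does not specify how to sample red lines from a non-null hypothesis, so the scheme below is an assumption: for a hypothesized slope b0, the quantities y - b0*x are treated as unrelated to x, and x and those quantities are resampled independently (for b0 = 0 this reduces to resampling x and y separately). All function names, the grid, and the data are invented for illustration.

```python
import random


def toy_data(n=20, seed=0):
    """Invented x-y data with a linear trend plus normal noise (illustration only)."""
    rng = random.Random(seed)
    x = [rng.uniform(0, 10) for _ in range(n)]
    y = [2 + 1.0 * xi + rng.gauss(0, 1) for xi in x]
    return x, y


def least_squares(x, y):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def slopes_under_hypothesis(x, y, b0, n_sim=500, seed=1):
    """Slopes of least-squares fits to data simulated under slope b0
    (assumed scheme: resample x and y - b0*x independently, then recombine)."""
    rng = random.Random(seed)
    n = len(x)
    offsets = [yi - b0 * xi for xi, yi in zip(x, y)]
    slopes = []
    while len(slopes) < n_sim:
        xs = rng.choices(x, k=n)
        if max(xs) == min(xs):
            continue  # slope is undefined when every resampled x is identical
        sampled_offsets = rng.choices(offsets, k=n)
        ys = [b0 * xi + oi for xi, oi in zip(xs, sampled_offsets)]
        slopes.append(least_squares(xs, ys)[1])
    return slopes


def interval_by_inversion(x, y, half_width=0.5, n_grid=41, alpha=0.05):
    """Scan hypothesized slopes around the observed slope and return the
    smallest and largest ones that are not rejected."""
    _, b_obs = least_squares(x, y)
    step = 2 * half_width / (n_grid - 1)
    kept = []
    for i in range(n_grid):
        b0 = b_obs - half_width + i * step
        sims = slopes_under_hypothesis(x, y, b0)
        # Share of red lines at least as steep as the unchanged blue line.
        upper_tail = sum(s >= b_obs for s in sims) / len(sims)
        if alpha / 2 <= upper_tail <= 1 - alpha / 2:
            kept.append(b0)
    return min(kept), max(kept)


if __name__ == "__main__":
    x, y = toy_data()
    print("observed slope:", least_squares(x, y)[1])
    print("approximate 95% interval for the slope:", interval_by_inversion(x, y))
```

The grid limits its resolution, and simulation noise can make the retained set slightly ragged near its ends, so the result is only a rough interval. The observed slope is computed once and never changes; only the hypothesis generating the simulated slopes moves.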

Well, there you have it. I should have been spending this Saturday morning on a thousand other pressing obligations (my apologies to colleagues who know what I'm referring to!). Hopefully it will have taken you less time to have read this far than it took me to have written this far.

Appended 12:30pm, 28 Dec 2013: My wife suggested that the red lines of the frequentist sampling distribution ought to be more distinct from the data, and from the best fitting line. So here are modified versions that might be better:

Description of data is the single blue line. Red lines show the sampling distribution from the null hypothesis.

Blue lines show the distribution of credible descriptions from the posterior.

11 comments:

This is great! Super succinct illustration of the difference between the two approaches that would fit on a t-shirt :)

I'm thinking that one reason why the confidence interval is harder to visualize is because it is a pretty difficult concept. A confidence interval created by finding the space of "null lines" that are not rejected could perhaps be visualized using an animation in which many different nulls are tested and only those not rejected are retained. A bootstrap CI could be visualized similarly to the Bayesian posterior distribution, but I don't know how one then could make the difference between the two approaches clear...

Rasmus: Capturing the essence of bootstrapping/resampling in an icon, in a way that clearly distinguishes it from Bayesian, could indeed be challenging. I personally find resampling to be simultaneously attractive and mysterious. Attractive because it's often so straightforward to execute. Mysterious because I can't really get a conceptual grip on exactly how to interpret its results, especially with respect to bootstrapped confidence intervals.

I've been playing around with an icon. My goal was to create something simple enough that it could serve as a small icon or similar emblem and be sufficiently clear, yet still contain some key talking points. It's not technically perfect, only a starting point (for instance, the alternative best fit lines should end at the edges of the data). You can generate it using the code at: https://gist.github.com/bryanhanson/8176875

I feel the same way about bootstrapping, a seductively simple concept but what does it really do. Bootstrapping was my first love who unfortunately isn't yet on friendly terms with my current girlfriend Bayesia...

I've experimented with an animation showing "the other way" of constructing confidence intervals, that is, as the interval of nulls where corresponding null hypothesis tests are "accepted". It turned out a bit messy, but I'm fairly happy with it :)

In this sense, the bootstrap distribution represents an (approximate) nonparametric, noninformative posterior distribution for our parameter. But this bootstrap distribution is obtained painlessly, without having to formally specify a prior and without having to sample from the posterior distribution. Hence we might think of the bootstrap distribution as a “poor man’s” Bayes posterior. By perturbing the data, the bootstrap approximates the Bayesian effect of perturbing the parameters, and is typically much simpler to carry out.

I think the Bayesian icon could be improved by showing samples from the prior in a lighter color. That way the move from prior to posterior would be emphasized (especially if the latter is more concentrated).

In my view your frequentist icon is not as good, because it focuses on Neyman-Pearson NHST reasoning. I would instead have put an underlying true regression line (in black, say). Then one could add, in different colors, samples of size n from the true distribution, along with the associated least squares line for each sample.

This emphasizes that frequentism concerns frequency properties (i.e., across hypothetical alternative datasets) of estimators when sampling from some underlying true distribution.

If I understand your suggestion, the red lines are already doing what you suggest should be emphasized, but what should be added is a dark red line in the middle of them that represents the hypothesis (i.e., "true" parameter value). Good point.

Hi John, Congratulations for the very creative way of spending your Saturday mornings! This is really inspiring and could be used for e.g. teaching purposes, as long as you make things clear. As Leon suggests, this is not a fair comparison. The Bayesian approach you suggest focuses on inference while the frequentist one focuses on hypothesis testing. Plotting the sampling distribution of the coefficients in a frequentist manner (using confidence intervals) would yield a picture identical to the Bayesian one.

The Bayesian icon could emphasize decision about a null value by putting in a single heavy red horizontal line. This would be the same single heavy red horizontal line proposed for the frequentist display. Having a single heavy red horizontal line in both icons might help emphasize that the frequentist approach makes a decision based on a sampling distribution from the null hypothesis, while the Bayesian approach makes a decision based on the posterior distribution from the data. I like this idea and will implement it soon.

If instead of emphasizing decisions about null values, you/we want the emphasis to be on parameter estimation, then neither icon needs a representation of the null value, and the frequentist icon instead needs a representation of confidence interval, which gets difficult to display in a single-panel static icon, as was discussed in the blog post and in comments such as Rasmus'.

I am very much in favor of making things clear, and of avoiding the harm done by oversimplifying. To really make these ideas clear takes a lot more space than two minimalist icons! The icons are meant to be graphically clean and simplistic and suggestive -- hopefully suggestive of correct ideas, not wrong ones.

My comment was referring to your introduction to the frequentist figure: "Now, an analogous icon for frequentist data analysis." I am just saying that the two concepts are not comparable (and thus not "analogous"), since one refers to Bayesian inference and the other to frequentist decision making, which are two related but distinct aspects of statistical theory. For them to be comparable, either one should depict frequentist inference by picturing sampling distributions, which would be identical to the Bayesian inference assuming uninformative priors, or the other should show Bayesian decision making with explicitly defined loss (or utility) functions. Therefore I find the images slightly misleading.