Is there a way to create scatterplots with marginal histograms in ggplot2, just like in the sample below? In MATLAB this is the scatterhist() function, and equivalents exist for R as well. However, I haven't seen one for ggplot2.

I started an attempt by creating the single graphs but don't know how to arrange them properly.

+1 for demonstrating the placement, but you should not redo the random sampling if you want the interior scatter to "line up" with the marginal histograms.
– BondedDust, Dec 17 '11 at 16:35

You're right. They're sampled from the same distribution though, so the marginal histograms should theoretically match the scatter plot.
– oeo4b, Dec 17 '11 at 17:03

In "theory" they will asymptotically "match"; in practice the chance that they match exactly is vanishingly small. It's very easy to use the example provided, xy <- data.frame(x=rnorm(300), y=rt(300,df=2)), and pass data=xy in the ggplot calls.
– BondedDust, Dec 17 '11 at 17:10
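
The suggestion in the comment above can be sketched roughly as follows: one data frame is sampled once and reused for the scatterplot and both histograms, so the margins describe the same points. This is only an illustration, not the original answer's code; it assumes the gridExtra package is installed, and the seed, bin count, and layout proportions are arbitrary choices. The panels are not guaranteed to align exactly.

```r
library(ggplot2)
library(gridExtra)  # assumption: used here only to arrange the panels

set.seed(1)  # arbitrary seed so the sketch is reproducible
xy <- data.frame(x = rnorm(300), y = rt(300, df = 2))

# All three plots read from the same data frame
scatter    <- ggplot(xy, aes(x = x, y = y)) + geom_point()
hist_top   <- ggplot(xy, aes(x = x)) + geom_histogram(bins = 30)
hist_right <- ggplot(xy, aes(x = y)) + geom_histogram(bins = 30) + coord_flip()
empty      <- ggplot() + theme_void()  # blank filler for the top-right corner

# Panels are filled row by row: top histogram, blank, scatter, right histogram
grid.arrange(hist_top, empty, scatter, hist_right,
             ncol = 2, nrow = 2, widths = c(4, 1), heights = c(1, 4))
```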

I wouldn't recommend this solution, as the plots' axes usually don't align exactly. Hopefully future versions of ggplot2 will make it easier to align the axes, or even allow for custom annotations on the sides of a plot panel (like customized secondary axis functions in lattice).
– baptiste, Dec 18 '11 at 6:33

No, they would not, in general. ggplot2 currently produces a panel width that varies with the extent of the axis labels, among other things. Have a look at ggExtra::align.plots to see the kind of hack that is currently required to align axes.
– baptiste, Dec 18 '11 at 18:51

This is not a completely responsive answer but it is very simple. It illustrates an alternate method to display marginal densities and also how to use alpha levels for graphical output that supports transparency:
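
A minimal sketch of this idea, assuming the same xy example data mentioned in the comments; the seed and the alpha value of 0.3 are illustrative choices, not necessarily those of the original answer:

```r
library(ggplot2)

set.seed(42)  # arbitrary seed so the sketch is reproducible
xy <- data.frame(x = rnorm(300), y = rt(300, df = 2))

p_rug <- ggplot(xy, aes(x = x, y = y)) +
  geom_point(alpha = 0.3) +  # semi-transparent points show where data pile up
  geom_rug(alpha = 0.3)      # one tick per observation along the bottom and left axes
print(p_rug)
```

Because each rug tick is semi-transparent, dense regions of either variable appear darker along the corresponding axis.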

That's an interesting way to show the density. Thanks for adding this answer. :)
– Michelle, Dec 17 '11 at 18:54

It should be noted that this method is much more common than adding marginal histograms. In fact, rug plots are common in published articles, whereas I have never seen a published article with marginal histograms.
– Xu Wang, Dec 17 '11 at 23:26

Just a very minor variation on BondedDust's answer, in the general spirit of marginal indicators of distribution.

Edward Tufte has called this use of rug plots a 'dot-dash plot', and has an example in VDQI of using the axis lines to indicate the range of each variable. In my example the axis labels and grid lines also indicate the distribution of the data. The labels are located at the values of Tukey's five number summary (minimum, lower-hinge, median, upper-hinge, maximum), giving a quick impression of the spread of each variable.

These five numbers are thus a numerical representation of a boxplot. It's a bit tricky because the unevenly spaced grid-lines suggest that the axes have a non-linear scale (in this example they are linear). Perhaps it would be best to omit grid lines or force them to be in regular locations, and just let the labels show the five number summary.
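
A rough sketch of how such a plot might be built, assuming the xy example data from the comments above. It uses base R's fivenum() to place the axis breaks; the seed, transparency, and label rounding are illustrative assumptions rather than the original answer's settings:

```r
library(ggplot2)

set.seed(1)  # arbitrary seed so the sketch is reproducible
xy <- data.frame(x = rnorm(300), y = rt(300, df = 2))

# Format break labels to one decimal place
one_decimal <- function(b) sprintf("%.1f", b)

p_five <- ggplot(xy, aes(x = x, y = y)) +
  geom_point(alpha = 0.3) +
  geom_rug(alpha = 0.3) +
  # fivenum() returns minimum, lower hinge, median, upper hinge, maximum
  scale_x_continuous(breaks = fivenum(xy$x), labels = one_decimal) +
  scale_y_continuous(breaks = fivenum(xy$y), labels = one_decimal) +
  theme(panel.grid.minor = element_blank())  # keep only grid lines at the five numbers
print(p_five)
```

To keep the labels but drop the unevenly spaced grid lines, panel.grid.major could be set to element_blank() as well.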

This might be a bit late, but I decided to make a package (ggExtra) for this, since it involves a fair amount of code that can be tedious to write. The package also tries to address some common issues, such as ensuring that the plots stay in line with one another even if there is a title or the text is enlarged.

The basic idea is similar to what the answers here gave, but it goes a bit beyond that. Here is an example of how to add marginal histograms to a random set of 1000 points. Hopefully this makes it easier to add histograms/density plots in the future.
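
A minimal sketch of that usage, assuming ggExtra is installed and using its ggMarginal() function; the seed and the distributions of the 1000 random points are illustrative choices:

```r
library(ggplot2)
library(ggExtra)

set.seed(30)  # arbitrary seed so the sketch is reproducible
df <- data.frame(x = rnorm(1000, mean = 50, sd = 10),
                 y = runif(1000, min = 0, max = 50))

# Build an ordinary scatterplot first, then let ggMarginal() add the margins
p <- ggplot(df, aes(x = x, y = y)) + geom_point()
p_marg <- ggMarginal(p, type = "histogram")
print(p_marg)
```

Changing type to "density" or "boxplot" swaps the kind of marginal plot drawn.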