Luis Verde Arregoitia

Yesterday I tweeted this gif showing how the ‘unheadr’ package deals with non-data grouping rows embedded in the data rectangle: we can, and should, move them into their own variable for a tidier structure. Please ignore the typo in the tweet.

There was some interest in the code behind the animation, and I wanted to share it anyway because it’s based on actual data and I think that’s pretty cool.

This is all made possible thanks to Thomas Lin Pedersen’s ‘gganimate’ package, a cool use case with geom_tile() plots by @mikefc, and this post by David Robinson, in which he melts a table into long format with indices for each row and column and a variable holding the value of each cell.

We can use real data from this table, originally from a book chapter about rodent sociobiology by Ojeda et al. (2016). I had a PDF version of the chapter, and I got the data into R following this post by Bob Rudis. I highly recommend ‘pdftools’ and ‘readr’ for importing PDF tables.

The book cover.

cute!

The first few lines of the table looked like this, and for this demo we can just set up the data directly as a tibble.

There are grouping values for the taxonomic families that the different genera belong to, and these are interspersed within the taxon variable. All taxonomic families end with “dae”, so we can match this with regex easily. Install ‘unheadr’ from GitHub before proceeding.
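To make the idea concrete, here is a minimal base-R sketch of what "untangling" does conceptually. It is not the ‘unheadr’ implementation, and the toy data frame (`toy`, with placeholder values in `value`) is made up for illustration; only the "dae$" pattern and the Taxon/Family names come from this post.

```r
# Toy table: family names are interspersed with genera in the Taxon column.
# The contents of `value` are placeholders, not data from the chapter.
toy <- data.frame(
  Taxon = c("Cricetidae", "Akodon", "Phyllotis", "Octodontidae", "Octodon"),
  value = c(NA, "a", "b", NA, "c"),
  stringsAsFactors = FALSE
)

# Rows whose Taxon ends in "dae" are grouping rows, not observations.
is_group <- grepl("dae$", toy$Taxon)

# Number each block of rows by the grouping row that precedes it.
group_id <- cumsum(is_group)

# Carry each family name down to the rows below it.
# Rows that appear before the first grouping row (group_id == 0) get NA.
families <- toy$Taxon[is_group]
toy$Family <- c(NA, families)[group_id + 1]

# Drop the grouping rows now that their values live in their own column.
toy_tidy <- toy[!is_group, ]
toy_tidy
```

The `cumsum()` trick works because every grouping row starts a new block, so each observation inherits the most recent family name above it.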

table1_tidy <- table1 %>% untangle2("dae$", Taxon, Family)

Once we have the original and ‘untangled’ version of the table, we define a function (inspired by @drob) to melt the data and apply it to each one.
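The melting step can be sketched roughly as follows. This is a base-R approximation rather than the function used for the animation: `melt_table()`, the two toy tables, and the `value` column are hypothetical names chosen for illustration, while `row`, `column`, and `tstep` mirror the variables used in the plotting code.

```r
# Melt a data frame into long format: one record per cell,
# with the cell's row index, column index, and value (as character).
melt_table <- function(df) {
  m <- as.matrix(df)
  data.frame(
    row = as.vector(row(m)),
    column = as.vector(col(m)),
    value = as.vector(m),
    stringsAsFactors = FALSE
  )
}

# Two toy tables standing in for the original and untangled versions.
before <- data.frame(
  Taxon = c("Cricetidae", "Akodon"),
  value = c(NA, "a"),
  stringsAsFactors = FALSE
)
after <- data.frame(
  Taxon = "Akodon",
  value = "a",
  Family = "Cricetidae",
  stringsAsFactors = FALSE
)

# Melt each table and label it with the animation state it belongs to.
long_before <- cbind(melt_table(before), tstep = "a")
long_after <- cbind(melt_table(after), tstep = "b")

# Stack both states into one long table, ready for geom_tile().
long_both <- rbind(long_before, long_after)
long_both
```

Each cell becomes one tile in the plot, and the `tstep` label tells the animation which version of the table a tile belongs to.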

For now, ‘gganimate’ is only available on GitHub. Once we have installed it, ‘transition_states’ does all the magic.

ut_animation <-
  ggplot(longTabs_both, aes(column, -row, fill = celltype)) +
  geom_tile(color = "black") +
  theme_void() +
  scale_fill_manual(
    values = c("#247ba0", "#70c1b3", "#b2dbbf", "#ff1654", "#ead2ac", "gray"),
    name = "",
    labels = c(paste("group", 1:4), "data", "NA")
  ) +
  transition_states(
    states = tstep,         # variable in data
    transition_length = 1,  # all transitions take 1 time unit
    state_length = 1        # all states display for 1 time unit
  ) +
  enter_fade() +            # How new blocks appear
  exit_fade() +             # How blocks disappear
  ease_aes('sine-in-out')

Check it out!

Once the animation is rendered we can save it to disk using anim_save().

This approach seems like a good way to animate various types of common steps in data munging, and it should work nicely to illustrate how several ‘dplyr’ or ‘tidyr’ verbs work. I’ll make more animations in the near future.