If so, do I have to start getting ready for a life without do() and rowwise() right away? Or is it too early? I'm fairly confident that list-columns and purrr can take their place, but I want to know whether there are any cases that only do() can handle at the moment.

@hadley I like that you are striving to create the best approach to doing analysis. Maintaining backwards compatibility is always a pain. It would help if R had a richer package versioning approach where you could make backwards incompatible changes in major updates while making it easy for people to stick to the old major version.

Packrat and checkpoint help a bit with that but aren't part of the core.

though in this (every?) case it’s really equivalent to calling collect beforehand, and is thus not really that useful.

Thanks, I totally agree with you at this point. I think do() could be useful if it supported database backends and iterated the computation group by group before pulling the data into R's memory, but it actually doesn't...

I don't think this is too inconvenient, but I feel it would be great if we had a data.frame-specific version of pmap().

For example, a data.frame usually has many columns, and not all of them are involved in every computation. In that case, you need a function with ... in its arguments to ignore the irrelevant columns. I often forget this.
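To make the `...` point concrete, here is a minimal sketch. The data frame `df` and the helper `add_xy()` are made up for illustration; only `purrr::pmap_dbl()` is a real function.

```r
library(purrr)

# Hypothetical data: `label` is not needed by the computation.
df <- data.frame(
  x = 1:3,
  y = c(10, 20, 30),
  label = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# pmap() passes every column as a named argument, so a function that
# only accepts x and y errors on the unexpected `label` argument.
strict <- function(x, y) x + y
failed <- inherits(try(pmap_dbl(df, strict), silent = TRUE), "try-error")

# Adding `...` absorbs the columns the function does not use.
add_xy <- function(x, y, ...) x + y
result <- pmap_dbl(df, add_xy)
```

Selecting only the needed columns before calling pmap(), e.g. `pmap_dbl(df[c("x", "y")], strict)`, should avoid the need for `...` as well.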

You could do the same thing with a non-data.frame list, but creating and manipulating sub-elements gets tricky, and the print method takes way too much space (and can't be salvaged by str if there's a model inside).

Thanks! Agreed, nest() can do well and the nicer print method is a thing.

(Still, I sometimes feel more comfortable with a list of data.frames than with a data.frame that has a nested column, since a list is more flexible than a data.frame. I think this is mostly a matter of preference.)
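For comparison, a rough sketch of the two styles side by side, using the built-in `mtcars` data and a per-group `lm()` fit chosen purely for illustration (neither comes from the posts above):

```r
library(dplyr)
library(tidyr)
library(purrr)

# Style 1: a data.frame with a nested list-column.
# group_by() + nest() creates one row per group with a `data` list-column.
nested <- mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  ungroup() %>%
  mutate(
    model = map(data, ~ lm(mpg ~ wt, data = .x)),
    slope = map_dbl(model, ~ coef(.x)[["wt"]])
  )

# Style 2: a plain named list of data.frames.
as_list <- split(mtcars, mtcars$cyl)
models_list <- map(as_list, ~ lm(mpg ~ wt, data = .x))
slopes_list <- map_dbl(models_list, ~ coef(.x)[["wt"]])
```

Printing `nested` gives a compact tibble summary even with models inside, whereas printing `models_list` dumps every model in full, which is the print-method difference mentioned above.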

I'm a big fan of do() and particularly like the progress bar. In light of this thread I decided to explore alternatives to do() using purrr::map(). However, it seems to me that the purrr::map() approach is slower than do() as illustrated by this example:

In my actual work I call functions that are more complex than mean_and_sd() above and that return lists, so the difference between the approaches becomes much more pronounced.
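This is not the original benchmark, only a minimal sketch of the kind of comparison being described. It assumes a hypothetical `mean_and_sd()` that returns a one-row data.frame and uses the built-in `mtcars` data. Timings will depend on package versions and data size, so none are shown.

```r
library(dplyr)
library(purrr)

# Hypothetical stand-in for the mean_and_sd() mentioned above.
mean_and_sd <- function(d) {
  data.frame(mean = mean(d$mpg), sd = sd(d$mpg))
}

# Approach 1: do() on a grouped data frame.
via_do <- mtcars %>%
  group_by(cyl) %>%
  do(mean_and_sd(.)) %>%
  ungroup()

# Approach 2: split into a list, map over it, then bind the rows.
via_map <- split(mtcars, mtcars$cyl) %>%
  map(mean_and_sd) %>%
  bind_rows(.id = "cyl")

# Rough timing of each approach over repeated runs.
time_do <- system.time(
  for (i in 1:20) mtcars %>% group_by(cyl) %>% do(mean_and_sd(.))
)
time_map <- system.time(
  for (i in 1:20) split(mtcars, mtcars$cyl) %>% map(mean_and_sd) %>% bind_rows(.id = "cyl")
)
```

Both approaches should produce the same per-group values. Note that `via_map$cyl` is a character column because it comes from the list names.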

So, the question is, is there a way of using purrr::map() (or something else) in this sort of context that is as efficient as do()? I'm aware that data.table might be faster, but I'm more interested in a tidyverse method.