Retries in API packages and reinventing the wheel

[This article was first published on Posts on R-hub blog, and kindly contributed to R-bloggers].

Web APIs can sometimes fail for no particular reason;
therefore packages accessing them often add some robustness to their code by retrying the API call a few times when an error occurs.
The two high-level R HTTP clients, httr and crul, offer ready-made sub-routines for such cases, but some developers like me have rolled their own out of ignorance. 😅
In this post I shall present the retry sub-routines of httr and crul, and more generally reflect on (not) reinventing the wheel in your R package. 🎡

The few figures of this post come from the funny HTTP Cats website and are hyperlinked.

Retry in httr and crul

Relying on internet resources might make a package fragile, since the connection or interfaced web API can fail.
Therefore, in packages wrapping APIs, one can find some variation of the following pseudo-code that retries a few times:
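A minimal sketch of such a retry loop in base R could look like this. The helper name retry_simple() and the stand-in flaky_call() are hypothetical, made up for illustration; in a real package the function passed as f would call the web API.

```r
# Hypothetical helper: call f() up to max_tries times, returning the first
# successful result and re-throwing the last error if every try fails.
retry_simple <- function(f, max_tries = 3) {
  last_error <- NULL
  for (i in seq_len(max_tries)) {
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    last_error <- result
    message("Try ", i, " of ", max_tries, " failed: ", conditionMessage(result))
  }
  stop(last_error)
}

# Stand-in for an API call that fails twice, then succeeds.
n_calls <- 0
flaky_call <- function() {
  n_calls <<- n_calls + 1
  if (n_calls < 3) stop("server unavailable")
  "ok"
}

res <- retry_simple(flaky_call, max_tries = 5)
```

Note that this sketch retries immediately, without any waiting time between tries.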

As underlined in httr’s excellent “Best practices for API packages” vignette, “it’s extremely important to make sure to do this with some form of exponential backoff: if something’s wrong on the server-side, hammering the server with retries may make things worse, and may lead to you exhausting quota (or hitting other sorts of rate limits).”
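To make the idea of exponential backoff concrete, here is a small sketch of how a waiting time could be computed. The function name backoff_wait() and its defaults are assumptions chosen for illustration, not code taken from any package.

```r
# Hypothetical helper: waiting time in seconds before retry number `attempt`.
# The upper bound doubles with each attempt and is capped at `cap`; the actual
# wait is drawn at random below that bound ("jitter") so that many clients do
# not all retry at the same moment.
backoff_wait <- function(attempt, base = 1, cap = 60) {
  upper <- min(cap, base * 2^attempt)
  stats::runif(1, min = 0, max = upper)
}

# Upper bounds for the first seven attempts, leaving out the random part.
upper_bounds <- vapply(1:7, function(i) min(60, 1 * 2^i), numeric(1))
upper_bounds
```

In a retry loop, one would typically call Sys.sleep(backoff_wait(i)) after the i-th failed try and before the next one.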

Now, if you need such a pattern in your API package, you could use a shortcut rather than patiently ingesting examples and best practice… by using ready-made features of either httr or crul.

Retry in httr

The httr package contains a handy RETRY() function that, well, safely retries a request until it succeeds or until the maximal number of tries is reached.
To define the increasing waiting time, it follows best practice written up by AWS: exponential backoff with full jitter.

If there’s no error, it simply behaves like the corresponding verb would.
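As a sketch, a package function could delegate its retries to httr::RETRY() like this. The wrapper name get_with_retry(), the chosen argument values and the URL in the commented call are placeholders for illustration.

```r
# Hypothetical wrapper around httr::RETRY(); the function name and the
# default values are illustrative choices.
get_with_retry <- function(url, times = 3) {
  httr::RETRY(
    verb = "GET",
    url = url,
    times = times,      # maximum number of requests to attempt
    pause_base = 1,     # base of the exponential backoff, in seconds
    pause_cap = 30,     # upper limit on any single wait, in seconds
    quiet = FALSE       # print a message before each wait
  )
}

# Not run here because it needs internet access; the URL is a placeholder.
# resp <- get_with_retry("https://api.example.org/v1/status")
# httr::status_code(resp)
```

The returned object is a regular httr response, so the rest of the package code can treat it as if it came from httr::GET().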

Retry in crul

In crul, the retry method of an HttpClient object plays a similar role. It does not wrap the HTTP calls in tryCatch, so the only errors it handles gracefully are HTTP errors.

Through its onwait argument, it accepts a callback function to be called “if the request will be retried and a wait time is being applied. The function will be passed two parameters, the response object from the failed request, and the wait time in seconds.” For instance, before retrying, you could query an API status endpoint if such a thing exists.
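With crul, a comparable sketch uses the retry method of an HttpClient object together with a callback passed as onwait. The wrapper name retry_with_crul(), the argument values and the URL in the commented call are assumptions for illustration; here the callback only logs a message.

```r
# Hypothetical wrapper around the retry method of crul::HttpClient.
retry_with_crul <- function(url, times = 3) {
  client <- crul::HttpClient$new(url = url)
  client$retry(
    "GET",
    times = times,      # maximum number of requests to attempt
    pause_base = 1,     # base of the exponential backoff, in seconds
    pause_cap = 30,     # upper limit on any single wait, in seconds
    # Callback run before each wait: it receives the failed response
    # and the upcoming waiting time in seconds.
    onwait = function(resp, wait_time) {
      message(
        "Request failed with status ", resp$status_code,
        "; waiting ", round(wait_time, 1), " s before retrying."
      )
    }
  )
}

# Not run here because it needs internet access; the URL is a placeholder.
# res <- retry_with_crul("https://api.example.org/v1/status")
# res$status_code
```

Time spent inside the callback adds to the waiting time, so it is best kept short.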

On not reinventing the wheel

Once I heard about httr::RETRY() and the crul retry method, I was a bit disappointed at having reinvented the wheel.
Could one avoid doing that too often?

How to not reinvent the wheel in your code

As an R package developer, how do you know about functions and methods already existing in packages your package depends on, or could depend on, or could draw inspiration from?
Sometimes you might guess your problem is something others encountered but you might not even know the right words to present it (mocking for instance!).

In a blog post, Jeff Atwood states: “If anything, ‘Don’t Reinvent The Wheel’ should be used as a call to arms for deeply educating yourself about all the existing solutions”.
General strategies for learning more and more about the R ecosystem include

reading the whole reference of packages your package depends on, and even their changelogs once in a while, because you might as well use all the gems of a package once you’ve decided to trust it;

Of course, “deeply educating yourself” takes time one doesn’t necessarily have and which no one should feel guilty about.
Sometimes you’ll re-implement something that already exists elsewhere, and it’s fine!

Lastly, you might even want to create your own (better) version, which is obviously neat. 😎

How to help users of your package not reinvent the wheel

As the developer of a package, you might help users find useful features by… working on its docs.
A good time investment could be to create a pkgdown website with a well-organized reference index.

Furthermore, some features could be added to your package if they’re often implemented downstream.

Conclusion

In this post we’ve presented useful functions implementing retries for API packages in httr and crul.