Top takeaways from R Studio conf 2019

Introduction

During the most recent decade, the force originating from both the scholarly community and industry has lifted the R programming language. Also, they have worked hard to end up the absolute most significant tool for computational statistics, perception, and data science.

Due to the growth of R in the data science community, there is a constant need to upgrade and develop both R and RStudio. R studio conference is a platform where scholars around the globe come and share their knowledge and developments.

What is the R studio Conference?

Rstudio conference 2019 is all about R and RStudio. Hundreds of advanced and new R users in Austin, Texas from Tuesday, January 15 thru Friday, January 18 came together to become better at data science with R through this conference.

This conference happens every year where the latest trends, developments and goals for next year take place.

Agendas

This year conference happens in two parts i.e workshops and conferences. Conferences emphasise on different speakers sharing their experiences and advancements they have done in the previous year in R. Though, in workshops, hands-on experience happens over various topics like Big data in R, getting hands-on R-shiny on the production level.

The main theme of this session was R in production. Most of the conferences and workshops focus on the production aspect of the application in R whether be it Rshiny, sparkR or even different packages that rolled out in the conference

Conferences

1. Shiny in production: Principles, practices, and tools

Shiny is a web framework for R, a language not traditionally known for web frameworks, to say the least. As such, Shiny has always faced questions about whether it can or should be used “in production”. In this talk, they explored what “production” even means, reviewed some of the historical obstacles and objections to using Shiny for production purposes, and discussed practices and tools that can help your Shiny apps flourish.

2. A guide to modern reproducible data science with R

The session focussed on creating custom computing environments that can be shared and instantly with remote users, packaging small to medium data inside and outside packages, and creating simple to complex workflows to track the provenance of your results. Additionally, this solves the problem developers used to face in while reproducing their work in R in terms of package dependencies, mismatched versions etc

3. R in Production

With the increase in people using R for data science comes an associated increase in the number of people and organisations wanting to put models or other analytic code into “production”. We often hear it said that R isn’t suitable for production workloads, but is that true? Also, in this talk, Mark looked at some of the misinformation around the idea of what “putting something into production” actually means, as well as provided tips on overcoming the obstacles put in your path.

4. Databases using R: The latest

Many of the techniques covered are based on our personal and the community’s experiences of implementing concepts introduced last year, such as offloading most of the data wrangling to the database using dplyr, and using the RStudio IDE to preview the database’s layout and data.

5. Scaling R with Spark

This talk introduced new features in sparklyr that enable real-time data processing, brand new modelling extensions and significant performance improvements. The sparklyr package provides an interface to Apache Spark to enable data analysis and modelling in large datasets through familiar packages like dplyr and broom.

6. Democratizing R with Plumber APIs

The Plumber package provides an approachable framework for exposing R functions as HTTP API endpoints. This allows R developers to create code that can be consumed by downstream frameworks, which may be R agnostic. In this talk, an existing Shiny application that uses an R model was taken and turned that model into an API endpoint so it can be used in applications that don’t speak R.

7. tidy time series analysis

Time series can be frustrating to work with, particularly when processing raw data into model-ready data. But, this work presented two new packages that address a gap in existing methodology for time series analysis (raised in rstudio:: conf 2018). Furthermore, the tsibble package supports organizing and manipulating modern time series, leveraging tidy data principles along with contextual semantics: index and key. The tsibble data structure seamlessly flows into forecasting routines. Also, the fable package is a tidy renovation of the forecast package. It promotes transparent forecasting practices and concise model representations, to empower analysts tackling a broad domain of forecasting problems. This collection of packages form the tidyverts, which facilitates a fluent and fluid workflow for analyzing time series.

8. 3D mapping, plotting, and printing with rayshade

In this talk, speakers emphasised on the creation of beautiful 3D maps and visualizations with the rayshader package. In addition, talk about the value of 3D plotting, how interactions with the R community helped drive the development of rayshader, and how writing/blogging about your projects can vastly improve your code was taken.

9. Spatial Data Science in the Tidyverse

Package sf (simple feature) and ggplot2::geom_sf have caused a fast uptake of tidy spatial data analysis by data scientists. But, they do not handle important spatial data science challenges, including raster and vector data cubes and out-of-memory datasets. Powerful methods to analyse such datasets are now put in packages stars and tidync. This talk discussed how the simple feature and tidy data frameworks can handle these challenging data types. Also, it shows how R can be used for out-of-memory spatial and spatiotemporal datasets using tidy concepts.