Build Your Own R Modules in Azure ML

This post is by Roope Astala, Senior Program Manager in Microsoft’s Information Management and Machine Learning team.

Azure ML currently offers almost 100 modules that cover a wide spectrum of data science problems our customers may encounter. But what if you need more, or something a bit different from what we have to offer?

Custom R Modules

Custom R Modules give you a way to extend the built-in module set with your own. You can share these modules with friends or co-workers by putting them on GitHub.

Custom R modules are first-class citizens – they can be used in experiments and operationalized in web services just like built-in modules. Typical uses include:

Handling of domain-specific data formats.

Flexible data transformations.

Customized feature construction and extraction.

Within your R script, you can use hundreds of R packages preinstalled in Azure ML. You can even bundle your own packages with the module.
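If a script depends on a particular package, it can help to check for it up front. The following is a minimal sketch using only standard R utilities; the helper name `require_package` is hypothetical, and the set of packages you see will depend on the environment in which the script runs.

```r
# List the packages visible to the current R session (utils ships with R).
available <- rownames(utils::installed.packages())

# Hypothetical helper: stop early with a clear message if a package is missing.
require_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(sprintf("Package '%s' is not installed in this environment", pkg))
  }
  invisible(TRUE)
}

# 'stats' is part of every standard R installation, so this succeeds.
require_package("stats")
```

Failing early like this tends to give a clearer error than a missing-function message deep inside the script.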

Example

As an example, let’s create a module that takes some JSON-formatted data and parses it into an Azure ML dataset. The module consists of 3 parts:

An XML file that defines which inputs, outputs, and parameters the module will have. In a sense, the XML is the skeleton of the module, and the R code its muscle.

The module takes one input, a dataset consisting of JSON-formatted strings, and produces one output, the contents of the JSON objects as a flattened dataset. It also has one parameter: a string that specifies the null replacement value. The accompanying R script does the parsing.
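As a rough illustration of what such a script might look like, here is a minimal sketch rather than the module's actual code. It assumes the jsonlite package is available, that the input data frame carries one JSON object per row in its first column, and that fields absent from a given object should also receive the replacement value. The function names `parse_json_dataset` and `replace_nulls` are hypothetical.

```r
library(jsonlite)  # assumption: jsonlite is installed in the R environment

# Recursively swap JSON nulls (parsed as NULL) for the replacement string,
# so that unlist() does not silently drop them.
replace_nulls <- function(x, replacement) {
  if (is.null(x)) return(replacement)
  if (is.list(x)) return(lapply(x, replace_nulls, replacement = replacement))
  x
}

# Hypothetical entry point: one input dataset, one string parameter,
# one output dataset with all values as character columns.
parse_json_dataset <- function(dataset, null_replacement = "NA") {
  json_strings <- as.character(dataset[[1]])

  # Parse each string into a nested list, then flatten to a named vector.
  # Nested fields get dotted names such as "address.zip".
  rows <- lapply(json_strings, function(s) {
    obj <- fromJSON(s, simplifyVector = FALSE)
    unlist(replace_nulls(obj, null_replacement))
  })

  # The output columns are the union of all field names, in first-seen order.
  columns <- unique(unlist(lapply(rows, names)))

  # Build one character row per object; absent fields keep the replacement.
  cells <- lapply(rows, function(r) {
    values <- rep(as.character(null_replacement), length(columns))
    values[match(names(r), columns)] <- as.character(r)
    values
  })

  result <- as.data.frame(do.call(rbind, cells), stringsAsFactors = FALSE)
  names(result) <- columns
  result
}

# Usage with two small JSON objects, one containing a null and one nested.
input <- data.frame(
  json = c('{"name": "Ann", "age": 30, "city": null}',
           '{"name": "Bob", "address": {"zip": "98052"}}'),
  stringsAsFactors = FALSE
)
out <- parse_json_dataset(input, null_replacement = "missing")
print(out)
```

The sketch keeps every value as a string for simplicity; a fuller version might convert columns back to numeric types where all values allow it.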

To add the module to Azure ML, you simply put the different files into a zip package and upload the package by selecting +NEW > Module in your Azure ML Studio workspace. Once uploaded, your module appears in the “Custom” category in the module palette, alongside all the built-in modules.

You can now use the new R module to build experiments, and deploy it to production by publishing your experiment as a web service.

Summary

Custom R Modules are a great way for you to extend Azure ML’s built-in modules. Such modules can be used in experiments, operationalized in web services, and shared with your colleagues and the community. Although the example provided in this blog post is a simple one, custom R modules can be far more complex and can take multiple inputs, outputs, and parameters of different types. They also have access to the same user interface elements as built-in modules, e.g. column selectors and drop-down menus for parameters. In the future, we plan to add support for input and output types beyond datasets, such as learners and transformations.