What are web services in Machine Learning Server?

Data scientists can deploy R and Python code and models as web services in Machine Learning Server so that other users can consume their code and predictive models. Once hosted there, these web services are exposed and available for consumption.

Web services can be consumed directly in R or Python, programmatically using REST APIs, or via Swagger-generated client libraries. They can be consumed synchronously, in real time, or in batch mode. They can also be deployed from one platform and consumed on another.

Web services facilitate the consumption and integration of the operationalized models and code they contain. Once you've built a predictive model, the next step in many cases is to operationalize it, that is, to generate predictions from the pre-trained model on demand. In this scenario, where new data often becomes available one row at a time, latency is the critical metric: it is important to respond with the single prediction (or score) as quickly as possible.

Standard web services

These web services offer fast execution and scoring of arbitrary Python or R code and models. They can contain code, models, and model assets. They can also take specific inputs and provide specific outputs for those users who are integrating the services inside their applications.

Standard web services, like all web services, are identified by their name and version. They are also defined by their Python or R code, models, and any necessary model assets. When deploying a standard web service, you should also define the required inputs and any outputs that application developers use to integrate the service into their applications.
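As a rough illustration of the pieces that make up a standard web service, the sketch below models them with a plain Python dataclass. The `ServiceDefinition` class, its field names, and the `add_one` function are hypothetical; they are not part of any Machine Learning Server library and only mirror the concepts described above (name, version, code, models, inputs, outputs).

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class ServiceDefinition:
    """Hypothetical container for the parts of a standard web service."""
    name: str
    version: str
    code: Callable[..., Any]       # arbitrary Python code to execute
    inputs: Dict[str, type]        # required inputs, keyed by name
    outputs: Dict[str, type]       # outputs handed back to callers
    models: Dict[str, Any] = field(default_factory=dict)  # model assets

    def consume(self, **kwargs: Any) -> Any:
        """Check the declared inputs, then run the service code."""
        missing = set(self.inputs) - set(kwargs)
        if missing:
            raise ValueError(f"missing inputs: {sorted(missing)}")
        for key, expected in self.inputs.items():
            if not isinstance(kwargs[key], expected):
                raise TypeError(f"{key} must be {expected.__name__}")
        return self.code(**kwargs)


def add_one(x: float) -> float:
    """Stand-in for the code or model behind the service."""
    return x + 1.0


service = ServiceDefinition(
    name="add-one",
    version="v1.0.0",
    code=add_one,
    inputs={"x": float},
    outputs={"answer": float},
)
result = service.consume(x=2.0)
```

Declaring inputs and outputs up front is what lets an application developer integrate the service without reading the code behind it.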

Real-time web services

Real-time web services do not support arbitrary code and only accept models created with the supported functions from packages installed with the product. See the following sections for the list of supported functions by language and package.

Real-time web services offer even lower latency, so they produce results faster and can score more models in parallel. The performance boost comes from the fact that these web services do not depend on an interpreter at consumption time, even though the services use the objects created by the model. Therefore, fewer resources and less time are spent spinning up a session for each call. Additionally, the model is loaded only once in the compute node and can be scored multiple times.
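The "load once, score many times" idea can be sketched in a few lines of plain Python. This is only an analogy and does not show how Machine Learning Server implements real-time services; `load_model`, `score`, and the linear stand-in model are all hypothetical names chosen for the example.

```python
import functools

LOAD_COUNT = 0  # counts how often the simulated model is loaded


@functools.lru_cache(maxsize=None)
def load_model(name: str):
    """Simulate an expensive model load; the cache makes it run once per name."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    # Stand-in "model": a linear function with fixed coefficients.
    slope, intercept = 2.0, 1.0
    return lambda x: slope * x + intercept


def score(name: str, x: float) -> float:
    """Score a single row against the cached model."""
    return load_model(name)(x)


# Three scoring calls, but the model is loaded only on the first one.
scores = [score("demo-model", x) for x in (0.0, 1.0, 2.0)]
```

Because the load cost is paid once, each later call pays only for the scoring itself, which is where the latency gain comes from.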

Versioning

Every time a web service is published, a version is assigned to it. Versioning enables you to better manage the release of your web services and helps the people consuming your services find them easily.

At publish time, specify an alphanumeric string that is meaningful to those users who consume the service. For example, you could use '2.0', 'v1.0.0', 'v1.0.0-alpha', or 'test-1'. Meaningful versions are helpful when you intend to share services with others. We highly recommend a consistent and meaningful versioning convention across your organization or team such as semantic versioning. Learn more about semantic versioning here: http://semver.org/.

If you do not specify a version, a globally unique identifier (GUID) is automatically assigned. GUIDs are long, which makes them harder to remember and use.
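The fallback rule can be pictured with a small sketch. The `resolve_version` helper is hypothetical and only imitates the behavior described here using Python's standard `uuid` module; the server's own GUID format may differ.

```python
import uuid
from typing import Optional


def resolve_version(version: Optional[str] = None) -> str:
    """Return the caller's version string, or a generated GUID if none is given."""
    if version:
        return version
    return str(uuid.uuid4())


named = resolve_version("v1.0.0")   # meaningful and easy to share
generated = resolve_version()       # 36-character GUID, hard to remember
```

Comparing the two values side by side shows why an explicit version such as 'v1.0.0' is easier to communicate to consumers.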

Who consumes web services

After a web service has been published, authenticated users can consume that web service on various platforms and in various languages. You can consume services directly in R or Python, through the REST APIs, or in your preferred language via Swagger-generated client libraries.

You can make it easy for others to find your web services by providing them with the name and version of the web service.

Typical consumers of web services include:

Data scientists who want to explore and consume the services directly in R and in Python.

Quality engineers who want to bring the models in these web services into validation and monitoring cycles.

Application developers who want to call and integrate a web service into their applications. Developers can generate client libraries for integration using the Swagger-based JSON file generated during service deployment. Read "How to integrate web services and authentication into your application" for more details. Services can also be consumed using the RESTful APIs, which provide direct programmatic access to a service's lifecycle.
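For application developers, a direct REST call might look roughly like the sketch below, which only builds the request and does not send it. Several details are assumptions rather than facts taken from this article: the `/api/{name}/{version}` path, the bearer-token header, the `localhost:12800` host, and the `build_consume_request` helper itself. Check the Swagger file generated for your service for the actual route and payload shape.

```python
import json
import urllib.request
from typing import Any, Dict


def build_consume_request(
    host: str, name: str, version: str, token: str, inputs: Dict[str, Any]
) -> urllib.request.Request:
    """Build (but do not send) a POST request that consumes a service.

    Assumption: the service is exposed at {host}/api/{name}/{version} and
    expects a JSON body plus a bearer token obtained at login.
    """
    url = f"{host}/api/{name}/{version}"
    body = json.dumps(inputs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


request = build_consume_request(
    host="http://localhost:12800",   # assumed host and port
    name="add-one",                  # hypothetical service name
    version="v1.0.0",
    token="<access-token>",          # placeholder, not a real token
    inputs={"x": 2.0},
)
```

Passing `request` to `urllib.request.urlopen` would send the call; that step is left out so the sketch runs without a server.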

How are web services consumed

Web services can be consumed using one of these approaches:

Request Response: The service is consumed directly using a single synchronous consumption call. Learn how in R or in Python.

Asynchronous Batch: Users send a single asynchronous request to the server, which in turn makes multiple service calls on their behalf. Learn how in R.
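The difference between the two approaches can be sketched with the standard library alone. Here `consume` is a hypothetical stand-in for one synchronous service call and `run_batch` imitates the server fanning a single batch request out into several calls; neither is a Machine Learning Server API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def consume(record: Dict[str, float]) -> Dict[str, float]:
    """Stand-in for one synchronous request-response call."""
    return {"answer": record["x"] + 1.0}


def run_batch(records: List[Dict[str, float]], workers: int = 2) -> List[Dict[str, float]]:
    """Stand-in for the server making one service call per record."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(consume, records))  # map keeps input order


# Request-response: one call, and the caller waits for one result.
single = consume({"x": 1.0})

# Asynchronous batch: one request is submitted, the caller gets a handle
# back immediately and collects all results later.
records = [{"x": 0.0}, {"x": 1.0}, {"x": 2.0}]
with ThreadPoolExecutor(max_workers=1) as server:
    future = server.submit(run_batch, records)
    batch = future.result()
```

Request-response suits the one-row-at-a-time scoring case, while batch suits scoring many records without holding a connection open for each one.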

Permissions

By default, any authenticated Machine Learning Server user can:

Publish a new service

Update and delete web services they have published

Retrieve any web service object for consumption

Retrieve a list of any or all web services

Destructive tasks, such as deleting a web service, are available only to the user who initially created the service. However, your administrator can also assign role-based authorization to further control the permissions around web services. When you list services, you can see your role for each one of them.
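The default rule can be written down as a small check. This sketch covers only the defaults listed above and ignores any role-based authorization an administrator may add; the `is_allowed` function and the action names are hypothetical.

```python
def is_allowed(user: str, action: str, owner: str) -> bool:
    """Apply the default permissions for an authenticated user.

    Publishing, retrieving, and listing are open to everyone who is
    authenticated; updating and deleting are limited to the user who
    published the service. `owner` is ignored for the open actions.
    """
    if action in {"publish", "get", "list"}:
        return True
    if action in {"update", "delete"}:
        return user == owner
    raise ValueError(f"unknown action: {action}")


owner_can_delete = is_allowed("alice", "delete", owner="alice")
other_can_delete = is_allowed("bob", "delete", owner="alice")
other_can_consume = is_allowed("bob", "get", owner="alice")
```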