Everything about Oracle Data Mining, a component of the Oracle Advanced Analytics Option - News, Technical Information, Opinions, Tips & Tricks. All in One Place

Tuesday Mar 18, 2014

The free Oracle
Data Miner GUI is an extension to Oracle SQL Developer that enables data
analysts to work directly with data inside the database, explore the data
graphically, build and evaluate multiple data mining models, apply Oracle Data
Mining models to new data and deploy Oracle Data Mining's predictions and
insights throughout the enterprise.The
product enables a complete workflow deployment to a production system via
generated PL/SQL scripts (See Generate
a PL/SQL script for workflow deployment).This time I want to focus on the model
scoring side, especially the single record real-time scoring.Would it be nice if the scoring function can
be accessed by different systems on different platforms?How about deploying the scoring function as a
Web Service?This way any system that
can send HTTP request can invoke the scoring Web Service, and consume the
returning result as they see fit.For
example, you can have a mobile app that collects customer data, and then
invokes the scoring Web Service to determine how likely the customer is going
to buy a life insurance.This blog shows
a complete demo from building predictive models to deploying a scoring function
as a Web Service.However, the demo does
not take into account of any authentication and security consideration related
to Web Services, which is out of the scope of this blog.

Note: The APEX is pre-installed with the Oracle database 12c, but you need to
configure it in order to use it.

For this demo, create a Workspace called DATAMINER that is
based on an existing user account that has already been granted access to the
Data Miner (this blog assumes DMUSER is the Data Miner user account).Please refer to the Oracle
By Example Tutorials to review how to create a Data Miner user account and
install the Data Miner Repository.In
addition, you need to create an APEX user account (for simplicity I use
DMUSER).

Build Models to Predict BUY_INSURANCE

This demo uses the demo data set, INSUR_CUST_LTV_SAMPLE,
that comes with the Data Miner installation.Now, let’s use the Classification Build node to build some models using
the CUSTOMER_ID as the case id and BUY_INSURANCE as the target.

Evaluate the Models

Nice thing about the Build node is that it builds a set of
models with different algorithms within the same mining function by default, so
we can select the best model to use.Let’s look at the models in the Test Viewer; here we can compare the
models by looking at their Predictive Confidence, Overall Accuracy, and Average
Accuracy values.Basically, the model
with the highest values across these three metrics is the good one to use.As you can see, the winner here is the
CLAS_DT_3_6 decision tree model.

Next, let’s see what input data columns are used as
predictors for the decision tree model.You can find that information in the Model Viewer below.Surprisingly, it only uses a few columns for
the prediction.These columns will be
our input data requirement for the scoring function, the rest of the input
columns can be ignored.

Score the Model

Let’s complete the workflow with an Apply node, from which
we will generate the scoring SQL statement to be used for the Web Service.Here we reuse the INSUR_CUST_LTV_SAMPLE data
as input data to the Apply node, and select only the required columns as found
in the previous step.Also, in the Class
Build node we deselect the other models as output in the Property Inspector
(Models tab), so that only decision tree model will be used for the Apply node.The generated scoring SQL statement will use
only the decision tree model to score against the limited set of input columns.

Generate SQL Statement for Scoring

After the workflow is run successfully, we can generate the
scoring SQL statement via the “Save SQL” context menu off the Apply node as
shown below.

We need to modify
the first SELECT SQL statement to change the data source from a database table
to a record that can be constructed on the fly, which is crucial for real-time
scoring.The bind variables (e.g. :funds)
are used; these variables will be replaced with actual data (passed in by the
Web Service request) when the SQL statement is executed.

Create Scoring Web Service

Assume the Oracle APEX and Oracle REST Data Services have
been properly installed and configured; we can proceed to create a RESTful web
service for real-time scoring.The
followings describe the steps to create the Web Service in APEX:

1. APEX Login

You can bring up the APEX login screen by pointing your
browser to http://<host>:<port>/ords.Enter your Workspace name and account info to
login.The Workspace should be based on
the Data Miner DMUSER account for this demo to work.

2. Select SQL
Workshop

Select the SQL
Workshop icon to proceed.

3. Select RESTful
Services

Select the RESTful
Services to create the Web Service.

Click the “Create”
button to continue.

4. Define Restful
Services

Enter the following
information to define the scoring Web Service in the RESTful Services Module
form:

Lastly, create the
following parameters that are used to pass the data from the Web Service request
(URI) to the bind variables used in the scoring SQL statement.

The final RESTful
Services Module definition should look like the following.Make sure the “Requires Secure Access” is set
to “No” (HTTPS secure request is not addressed in this demo).

Test the Scoring Web Service

Let’s create a simple web page using your favorite HTML
editor (I use JDeveloper to create this web page).The page includes a form that is used to
collect customer data, and then fires off the Web Service request upon
submission to get a prediction and associated probability.

Notice the {funds},
{checking}, {credit}, {atm}, {payments} will be replaced with actual data from
the form.This URI matches the URI
Template specified in the RESTful Services Module form above.

Let’s test out the
scoring Web Service by entering some values in the form and hit the Score
button to see the prediction.

The prediction along
with its probability and cost is returned as shown below.Unfortunately, this customer is less likely to
buy insurance.

Let’s change some
values and see if we have any luck.

Bingo!This customer is more likely to buy
insurance.

Conclusion

This blog shows how to deploy Data Miner generated scoring
SQL as Web Service, which can be consumed by different systems on different
platforms from anywhere.In theory, any
SQL statement generated from the Data Miner node could potentially be made as
Web Services.For example, we can have a
Web Service that returns Model Details info, and this info can be consumed by
some BI tool for application integration purpose.