In this release we continue our strategy of making open source-based algorithms available in the Modeler GUI without requiring any additional installation. In Modeler 18.1.1, we are adding two Python algorithms. The first is t-SNE, a dimensionality reduction method that lets end users easily visualize groupings in their data. The second is a Python-based random forest algorithm, which is in addition to the existing random trees node. While random trees provides some options not available in the Python algorithm, in many cases the Python version will build the model more quickly. We wanted to let the data scientist using Modeler make that decision, which is why we are including both.

We are also adding three algorithms that run on Spark and can be run locally in Modeler or in Hadoop through Analytic Server. The first is K-Means on Spark; K-Means is a widely used clustering algorithm that is now available for Analytic Server for the first time. The second is XGBoost (already available in Python), now offered as a Spark node. The third is a new algorithm, Isotonic Regression-AS. Isotonic regression relaxes the constraint in linear regression that the model be completely linear; instead, the only constraint is that the prediction is non-decreasing as an input field increases.
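To make the non-decreasing constraint concrete, here is a minimal sketch of the classic pool-adjacent-violators approach to isotonic regression in plain Python. It is only an illustration of the idea, not the Isotonic Regression-AS implementation; the function name `isotonic_fit` and the sample values are hypothetical, and the sketch assumes the target values are already ordered by the input field.

```python
def isotonic_fit(y, weights=None):
    """Least-squares non-decreasing fit to y (pool adjacent violators).

    Assumes y is already sorted by the input field. Illustrative only.
    """
    if weights is None:
        weights = [1.0] * len(y)
    # Each block holds [weighted mean, total weight, number of points].
    blocks = []
    for value, w in zip(y, weights):
        blocks.append([float(value), float(w), 1])
        # Merge backwards while the last two blocks violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean2, w2, n2 = blocks.pop()
            mean1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(mean1 * w1 + mean2 * w2) / total, total, n1 + n2])
    # Expand each block back to one fitted value per original point.
    fitted = []
    for mean, _, n in blocks:
        fitted.extend([mean] * n)
    return fitted


# Hypothetical targets, ordered by increasing input value.
targets = [1.0, 3.0, 2.0, 4.0, 3.5, 5.0]
fitted = isotonic_fit(targets)
print(fitted)  # [1.0, 2.5, 2.5, 3.75, 3.75, 5.0]
```

Wherever neighboring values go down instead of up, they are pooled into their average, so the fitted curve is a step function that never decreases but is otherwise free to follow the data.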

We have also improved the CPLEX node in this release. We added the CPLEX node in Modeler 18.1 to allow end users to run OPL code (OPL is a language used in IBM’s CPLEX Optimization Studio) in Modeler, so that they can run optimization directly as part of the Modeler flow. However, in Modeler 18.1 the CPLEX node only allowed a single input, making it difficult to add data around constraints or costs. In this release we allow multiple inputs to the CPLEX node, making it much easier to use. An end user still needs a separately purchased CPLEX Optimization Studio license for the optimization model to involve more than 1000 optimization variables and 1000 constraints (which generally, but not always, translates into 1000 rows of data in Modeler).

We have received a lot of feedback that the visualization in Modeler could be improved. We are therefore exposing a new type of visualization as a beta feature. It is interactive and lets you change the colors, the scaling and even the type of graph on the fly. Since this is a beta feature, it should not be used for production implementations. We are looking for feedback on this node, so please feel free to leave a comment on the blog about how you like it.

The following are the new features in Analytic Server 3.1.1 (in addition to the new Spark nodes mentioned above):

1) A more automated offline installation procedure is now available for Hortonworks HDP.
2) You can now configure a separate YARN queue for each Analytic Server tenant, with a specific queue name and resource allocation that match the requirements of different types of AS users and jobs.
3) Data (from the same data source) can now be shared across Analytic Server jobs at runtime using a global RDD in SparkContext. A checkbox labeled “is global share” has been added to the Data Source page of the AS Admin Console. It’s recommended to enable this feature when there are multiple streams using the same data source.
4) Two more data processing operations, “Sampling” and “Distinct”, now support SQL pushback to Hive for execution.

The following are the related software support changes for Analytic Server 3.1.1:

Platform support

Support for Cloudera 5.11 and 5.12.
Support for Ubuntu Linux 16.04 (with Hortonworks Data Platform 2.6 and Cloudera 5.11).
Cloudera 5.8 and 5.9 are no longer supported.
Big Insights 4.1, 4.2, and 4.2.5 are no longer supported.
MapR 5.0 is no longer supported.

Data sources

Support for Apache Hive 2.1
MongoDB 2.6 is no longer supported
MySQL 5.1 is no longer supported

I am the offering manager (IBM's term for product manager) for IBM SPSS Modeler and IBM SPSS Collaboration and Deployment Services. Before moving to offering management, I worked as a data scientist consultant for many years building and deploying predictive models using IBM SPSS Modeler mostly for U.S. government customers. The models I worked on were very successful -- resulting in large savings for a U.S. government agency.

Hi Ted, do you know why OneHub’s free year trial of Modeler for academics only has version 18.0 and not 18.1? I was wondering if there was an oversight somewhere. I tried calling IBM, but that didn’t help. IBM directs me to OneHub and OneHub directs me to IBM. I thought you might be able to help clear things up.

There is a defect which applies if one installs Modeler 18.1.1 on top of Modeler 18.1 but there is a workaround. Go to Tools, Options, User Options and then the Mode tab. Switch to Analytic Server mode and click Apply. Then go back to “Traditional mode” and you should be able to see the new graph node.