(1) Five vendors are out.

Gartner drops the hammer on five of the weaker players.

— Accenture made the analysis last year, according to Gartner, because it acquired i4C Analytics, a small Milan-based company. Accenture rebranded the software as the Accenture Analytics Applications Platform. However, it appears that Accenture gives away the software in bundled consulting deals since it reports no revenue from software licensing.

— Lavastorm is an ETL and data blending tool that does not claim to offer native predictive analytics, so its presence in last year’s MQ was and is a mystery.

— Megaputer, a text mining vendor, made it into the MQ for two years running despite being so marginal that they lack a record in Crunchbase. Last year, Gartner noted that “Megaputer scores low on viability and visibility and there is a lack of awareness of the company outside of text analytics in the advanced analytics market.” Which begs the question: why were they included in the first place?

— Prognoz appeared in the MQ for two years running and, like Megaputer, inspired WTF reactions from folks in the know. Primarily a BI tool with some time-series and analytics functionality included, Prognoz lacks the predictive analytics capabilities that Gartner says are minimally required. It also appears to lack customers west of Moscow.

— Predixion Software was acquired by Greenwave Systems in what seems to be a fire sale. So we won’t have Predixion to kick around anymore.

(2) Five new vendors are in.

Three real data science platforms enter the MQ this year as visionaries.

— Dataiku moved headquarters to New York earlier this year, so competitors can no longer dismiss it as “that French company.” The company markets Data Science Studio, a drag-and-drop interface that runs on top of open source platforms and has a slew of database connectors.

— Domino Data Lab is a data science platform with collaboration and reproducibility features. Originally marketed as a cloud-based managed service, Domino now offers its platform for on-premises implementation.

— H2O.ai develops and supports H2O, a scalable machine learning package. H2O.ai operates on a pure open source model, which makes it unique among the vendors included in this year’s MQ.

MathWorks is a welcome addition. According to IDC, it’s the #3 vendor in the advanced and predictive analytics segment, with 10% of the market, and it has held that position for years. Thus, its exclusion from the previous MQs is a mystery. Exactly how do you miss a vendor with a quarter-billion in category revenue? Just asking.

Teradata debuts as a niche vendor, thereby demonstrating the adage that the only thing worse than not making it into the Gartner MQ is to make it in as a niche vendor.

(3) Six vendors’ ratings changed markedly.

Six returning vendors declined markedly on one or both dimensions in Gartner’s assessment.

— Alpine continues to sink towards the bottom of the chart. Alpine’s close ties to Greenplum and EMC were once a feature, now a bug.

Due to its small size, Alpine is struggling to gain significant market visibility, which accounts for the drop in its Ability to Execute score. Of the vendors surveyed, it submitted the fewest reference customers, and 20% of those expressed concern about the small size of the community of users with whom they could network and share knowledge.

Recently, I asked an Alpine executive to disclose the company’s current customer count; he declined to do so. Since Alpine bragged about its 60 customers in 2015, I suspect that the comparison is not favorable.

— Alteryx and KNIME scored significantly lower on “Completeness of Vision” due to limited visualization tools and some scalability issues. These product attributes don’t change from year to year; the implication is that Gartner put more weight on them in this year’s MQ.

— FICO scored poorly on critical capabilities, open source tool support, algorithm selection, and innovation. It makes you wonder why anyone buys the software, or why FICO is still in the MQ.

— Quest (Statistica) declined because it’s hard to learn and use, has performance and stability issues, lacks key features, does not have an elastic cloud capability and lags in Spark integration. Other than that, it’s a winning product.

— RapidMiner’s rating on “Ability to Execute” increased, but its score on “Completeness of Vision” declined for no apparent reason.

(4) Hadoop and Spark integration are table stakes.

In the Market Overview section, Gartner writes:

All the vendors in this Magic Quadrant — indeed, in the market as a whole — have moved to include data in the open-source Hadoop ecosystem, which is now considered first-class. As such, it is equal in status to the proprietary stores that are predominantly used for traditional data warehouses.

Of course, while all vendors can use Hadoop as a data source, not all can leverage Hadoop as a computing platform. Moreover, vendors differ widely in their degree of integration with Hadoop.

Spark is becoming a de facto data science foundation for the vendors in this Magic Quadrant, as well as for other participants in this market.

Seven vendors in the 2016 MQ had no discernible Spark story. Five of them are now gone from the MQ. Pardon me while I take a victory lap.

(5) Gartner’s standards are strangely “flexible.”

Three examples:

Example 1: You can run an existing Python script in Enterprise Miner, and you can run an existing Python script in Alteryx. Neither application, however, provides authoring tools. This is rarely a problem since people who want to run a Python script usually already have a preferred IDE. Gartner singles out Alteryx, not SAS, however, for “lack of Python integration.”

(6) Data scientists don’t use Gartner’s top “data science” platforms.

If you wanted to create a Magic Quadrant based on the tools data scientists actually use, you might produce something like this:

Just 5% of data scientists surveyed by O’Reilly use any SAS software, and 0% use any IBM analytics software. In the slightly broader KDnuggets poll, 6% use SAS Enterprise Miner, and 8% say they use IBM SPSS Modeler.

Gartner’s obsession with “Citizen Data Scientists” leads it to criticize Domino and H2O because they are “hard to use”:

Without programming skills, prospective users are likely to struggle to learn the H2O stack of products.

Imagine that! If you want to use a data science platform, you need to know how to do data science.

(7) Gartner is clueless about open source software.

Data scientists use open source software. Gartner seems to get that:

Open-source languages — Python, R and Scala — dominate this market. Almost all data science platform vendors support Python and R, and many of the vendors in this Magic Quadrant also support Scala.

But:

Gartner’s research methodology prevents evaluation of pure open-source platforms (such as Python and R) in a Magic Quadrant, as there are no vendors behind them that offer commercially licensable products.

That is BS on two levels.

First, it’s not true. Continuum Analytics distributes and supports Anaconda, the most widely used Python distribution for data science. Microsoft and Oracle distribute and support R, and there are ample community and vendor support resources. Multiple vendors support Apache Spark.

Second, while you can argue that commercial support is an important attribute of a data science platform, you can’t claim that it is the only attribute that matters. Many data scientists function quite well without vendor support. And some of the vendors Gartner rates as “leaders” offer low-quality support, so it doesn’t seem to carry much weight in the ratings.

(8) The assessments of SAS and IBM are misleading.

Gartner says that for SAS, it evaluated SAS Enterprise Miner and the SAS Visual Analytics suite. What Gartner does not say is that it is impossible for SAS to score highly on the functional assessment of those products alone. SAS’ score depends on many other SAS software products, including:

Customers must license all of these products separately from SAS; doing so will more than triple the cost, and significantly increase the complexity of the architecture. Few SAS customers actually do this; about 15% of all SAS customers license SAS Enterprise Miner, and a tiny fraction of these customers license all of the software reflected in Gartner’s assessment.

The vast majority of SAS customers use its legacy software, whose code base dates to the mid-1990s. Thus, Gartner’s evaluation of SAS assumes a software configuration that hardly anyone uses.

The same story holds for IBM. Gartner rates IBM a leader on the strength of SPSS Modeler and, to a lesser extent, SPSS Statistics. Once again, however, the Gartner assessment depends on other IBM products. For example, Gartner praises IBM for its model management capabilities:

In short, by failing to disclose what products the customer must license to realize the attributed functionality, Gartner creates a false impression about the software most customers will actually license.

It’s like rating a hotel based solely on an examination of the Presidential Suite. Or evaluating a Chevrolet by test-driving a Cadillac.

(9) Gartner has a warm and fuzzy for IBM.

This gem lurks in the bowels of the report:

Customers are often confused by mismatches between (IBM’s) marketing messages and actual, purchasable products.

In other words, IBM is a giant hype machine. Gartner quaffs the Kool-Aid about IBM’s new Data Science Experience (DSx):

DSx is likely to be one of the most attractive platforms in the future — modern, open, flexible and suitable for a range of users, from expert data scientists to business people.

Keep in mind that DSx is a managed service for Spark and R in IBM Cloud. It includes Jupyter and RStudio IDEs. That’s all it is — a vanilla managed service, with fewer capabilities than managed services provided by Altiscale, Databricks, Domino, Microsoft Azure, or Qubole.

And, since it runs in IBM Cloud, there is about a 5% chance that any of your organization’s data is there already.

But Gartner thinks it will solve world hunger.

Most of IBM’s customers in this space use the legacy products, about which Gartner says this:

To many new users, IBM SPSS Modeler and Statistics seem outdated and overpriced.

IBM may be expensive, but you get blue-chip technical support, right?

Reference customers expressed dissatisfaction with IBM’s support and bureaucracy; they reported difficulties finding the right liaisons and technical help, despite high maintenance fees.

Okay, so the software is outdated and overpriced, and the support stinks. But you still get “white glove” service from IBM Client Executives, right?

Customers expressed concerns about purchasing products from IBM, as the company reportedly often tries to bring its consulting organization, IBM Global Business Services, into data science projects.

17 comments

I would argue that kdb+ and q/k should also be in the upper right quadrant, because they are the best-in-class database and vector language for speed and scalability. Ask any financial player of import (Deutsche Bank, Bank of America, etc.). Other than that, this article is completely correct. Gartner can’t hide its affection for IBM because that is where they came from.

kdb+ is currently the fifth-ranked time series database tracked in DB-Engines. The q programming language ranks 44th in the TIOBE Programming Language Index, just below Erlang and just above Bash; it doesn’t register at all in the RedMonk or IEEE indices. On OpenHub, kdb+ shows virtually no development activity in the past several years. Not one respondent in either the O’Reilly or KDnuggets surveys said they use q/kdb+, even a little bit.

Thanks for reading and for commenting. Yes, you can run a single-node instance of Spark locally on the SPSS server directly from Modeler. But if you want to run Spark in a cluster, you must license SPSS Analytic Server.

Really good read on a product space I don’t follow that closely but need to know more about. I rather enjoyed some of your jabs at Gartner (although I am on the verge of buying stock in them, because… well, check out the charts for the last several years. In a time of uncertainty, selling advice that is Generally Recognized as Safe is a great racket to be in). I work in the business, and I deal with some of these analysts. No slur on them as people or on their integrity, but they have their caprices, their preconceived notions and, dare I say it, their biases. And they work on limited information and can be skewed by what vendors tell them (I’ve been on the vendor side of a number of MQs over the years).

Posting under a pseudonym for reasons that should be plain enough… and not representing my employer, if you figure out who that is from your server logs.