Predixion Wants to Make Data Mining Easier, More Pervasive

Predixion proposes to take data mining -- long the province of statisticians and complex predictive models -- and make it pervasive.

By Stephen Swoyer

July 6, 2011

If the goal of pervasive business intelligence (BI) remains elusive, that of pervasive data mining seems even tougher.

Few BI pros may think data mining can be simplified, much less made pervasive. Simon Arkell, CEO of data mining specialist Predixion Software, begs to differ. Arkell says he doesn't see any reason why data mining -- long the province of SAS or SPSS geeks, statisticians, and those able to tackle complex predictive models -- can't be made pervasive.

Predixion, he claims, aims to do just that. On the back end, Arkell explains, Predixion uses the data mining algorithms built into Microsoft Corp.'s SQL Server Analysis Services (SSAS).

"There are already algorithms within Analysis Services that we just turn into a one-click, easy operation, so I can take my data and get some really easy insight without having to build a model. We're trying to abstract the complexity away from the end user. [We can] even create a model in real-time and score the model with a result set without the user having to know that's happening."

From the user's perspective, Predixion exposes a predictive modeling (PM) environment that Arkell says is similar to a WYSIWYG HTML design tool. The upshot, he claims, is that many more BI professionals can use Predixion to build predictive models.

"[T]his doesn't mean that overnight my grandmother is going to start building predictive models," Arkell concedes. In the Predixion scheme, he suggests, an analyst could conceivably "view some documentation, maybe take a bit of training, and be up and running as a predictive modeler without having to be a Ph.D.." This isn't as far-fetched as it might sound, he argues.

"The thing about predictive modeling is that your effectiveness is determined by how much domain experience you've had as well. I think it's about allowing people with domain experience to get business value out of predictive modeling," he suggests. "I think ... our value proposition is that we give companies access to predictive analytics that they previously just couldn't or didn't [have]."

Isn't building a predictive model the most complex (and frequently the most costly, given the required skills) aspect of developing an effective data mining program? Isn't this why data mining and predictive analytics are -- to use the language of computational complexity theory -- hard problems?

Yes, Arkell concedes, it is, but it doesn't have to be.

"We've come up with some really nice stuff on the data preparation side, [with regard to] the creation of the model comparison and the tuning of the model, which can happen over time, [so] as you use it [the model] gets smarter. It's all wizard-driven. Each step of the way, there's a description of what that particular model or algorithm does and how you should use it," he explains.

This doesn't preclude input or assistance from statisticians or data mining experts. Instead, Predixion proposes to focus and commoditize data mining by eliminating the so-called "Ph.D. bottleneck," without doing away with the experts themselves. "We can give those [data mining] Ph.D.s a [wider] audience. We say, 'Let's raise the collective IQ of an organization by having [the organization] leverage those [existing] models and use the predictive [models] that we generate to permeate data mining [throughout an organization] automatically,'" Arkell points out.

Even if you're a SAS or SPSS shop, Arkell isn't suggesting you do away with your existing data mining programs. Predixion can import SAS or SPSS models and transform them for use with SSAS. "We can compare a SAS model with one of our own models and see which one is more accurate and tune that model over time and make it more accurate," he explains.
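To make the comparison Arkell describes concrete, here is a minimal Python sketch of scoring two models on the same held-out records and keeping whichever is more accurate. It is an illustration only, not Predixion's method: the function names, the two toy models, and the sample data are all hypothetical.

```python
from typing import Callable, Dict, Sequence, Tuple

# A "model" here is any function that maps one record to a predicted label.
Model = Callable[[Sequence[float]], int]


def accuracy(model: Model, rows: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Fraction of held-out records the model labels correctly."""
    correct = sum(1 for row, label in zip(rows, labels) if model(row) == label)
    return correct / len(labels)


def pick_more_accurate(
    models: Dict[str, Model],
    rows: Sequence[Sequence[float]],
    labels: Sequence[int],
) -> Tuple[str, float]:
    """Score every model on the same records and return the best name and score."""
    scores = {name: accuracy(m, rows, labels) for name, m in models.items()}
    best = max(scores, key=scores.get)  # on a tie, the first model listed wins
    return best, scores[best]


# Hypothetical held-out data: (age, has_contract) records with a churn label.
rows = [(25, 1), (40, 0), (35, 1), (52, 0), (23, 0), (61, 1)]
labels = [1, 0, 1, 0, 0, 1]

# Two stand-in models: one plays the imported model, the other a native one.
models = {
    "imported": lambda row: int(row[0] > 30),
    "native": lambda row: int(row[1] == 1),
}

best_name, best_score = pick_more_accurate(models, rows, labels)
print(best_name, best_score)
```

In practice the comparison would be rerun as new labeled data arrives, which is one plausible reading of tuning a model "over time."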

Why would a shop opt to dump best-of-breed data mining technologies from SAS or SPSS for a (relatively) unproven technology like Predixion?

They probably won't, Arkell admits, but there's no reason Predixion can't be deployed alongside SAS or SPSS -- particularly if (as Arkell and other officials claim) it's able to push data mining out to more consumers.

"We're not going to displace SAS," explaining that Predixion uses a Predictive Modeling Mark-up Language (PMML) connector to get information from SAS. "Within our interface, you can automatically import a SAS model, you can query that model, you can share that model, [and] you can publish the results of that model ... in the same way [that you'd do] with our own [model]," Arkell continues. "This allows us to not have to go head-to-head with the 800 pound gorilla [i.e., SAS]."

Yes We Can

Arkell's account isn't the pipe dream it might sound like.

In his Advanced Analytics seminar at a recent TDWI World Conference, Mark Madsen, a veteran data warehouse architect and a consultant with Third Nature Inc., cited Predixion as a way for SQL Server-centric shops to quickly build an effective data mining program.

The point, Madsen said, isn't that Predixion's mix of automation, guided wizardry, and learn-as-you-go can't possibly produce data mining results that rival those of, say, a fully optimized SAS or SPSS configuration. Nor is it that Predixion delivers the kind of "good enough" results that should suffice for data mining newbies. In the former case, the bar's set absurdly high; in the latter, arguably, too low.

What's more, Madsen pointed out, there's a chance Predixion could improve results or performance in shops that already have SSAS-based data mining programs. "If Predixion sees a continuous variable where it should be [a] categorical [variable], they'll automatically [correct] the data for you," Madsen told TDWI conference attendees. The Predixion brain trust, he said, consists of "all the people who designed the Microsoft data mining software that went into SQL Server and the Excel Analysis Services plug-ins."
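Madsen did not say how Predixion decides that a variable has been mistyped. As a rough illustration of the general idea, the Python sketch below flags a numeric column as probably categorical when it holds only whole numbers and very few distinct values. The heuristic, the thresholds, and the function names are assumptions made for this example, not Predixion's logic.

```python
from typing import List, Sequence


def looks_categorical(
    values: Sequence[float],
    max_levels: int = 10,      # assumed cutoff for "few" distinct values
    max_ratio: float = 0.05,   # assumed cutoff for distinct values per row
) -> bool:
    """Guess whether a numeric column is really a set of category codes."""
    if not values:
        return False
    all_whole = all(float(v).is_integer() for v in values)
    distinct = len(set(values))
    return all_whole and distinct <= max_levels and distinct / len(values) <= max_ratio


def as_categories(values: Sequence[float]) -> List[str]:
    """Recode numeric codes as string labels so a model treats them as categories."""
    return [f"level_{int(v)}" for v in values]


# Hypothetical columns: region codes stored as numbers, and a true measurement.
region_code = [1, 2, 3] * 100
income = [30000.5 + i * 17.25 for i in range(300)]

for name, column in {"region_code": region_code, "income": income}.items():
    print(name, "categorical" if looks_categorical(column) else "continuous")
```

A check like this is only a guess. Small integer counts, for example, can look like category codes, so a real tool would likely let the analyst confirm or override the result.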

Predixion's employee directory reads like a Who's Who of Successful Software Ventures Past. There's Stuart Frost, former CEO of DATAllegro, which Microsoft acquired three years ago for its massively parallel processing (MPP) database engine. Another DATAllegro veteran, Stephan DeSantis, manages Predixion's books as CFO. Its CTO is Jamie MacLennan, who formerly made significant contributions to the design and development of SSAS, the data mining engine that Predixion itself exploits.

The presence of so many start-up veterans, to say nothing of former Microsoft movers and shakers, coupled with Predixion's explicit dependence on SQL Server, raises a question, particularly from the perspective of prospective customers: isn't it just a matter of time before Redmond gobbles up Predixion?

Microsoft, after all, hasn't distinguished itself in its handling of BI-related acquisitions. At TDWI's World Conference in Las Vegas, for example, a prominent analyst cringed when she was asked if she believed Microsoft would purchase a particular best-of-breed data visualization company. "I hope not," this person said. "Look at what they did with ProClarity."

Arkell says the question hasn't ever come up.

"We have never been asked that question by a customer[,] but if it came up we would state that we are confident that anyone buying our company would do so for the great technology we have created and our hope is that it's development would continue and even accelerate with a larger partner," he indicates.

Microsoft is very much aware of Predixion. In a March interview, for example, Herain Oberoi, director of product management for SQL Server with Microsoft, described Predixion's product as a "world-class" offering. He said that Microsoft had attempted something similar -- albeit on a less ambitious scale -- with its PowerPivot add-in for Excel.

"[T]he whole objective [with PowerPivot] is to provide some level of abstraction from the deep predictive analytics and models that exist in Analysis Services. We have a whole lot of expertise around how exactly those models work," he said. "What Predixion is doing is taking that same idea and taking it one step further and really building a world-class environment around that."