Slide 16 reminds us that Sybase IQ's in-database data mining is quite competitive with what SAS has actually delivered with its DBMS partners, even if Sybase IQ doesn't have the nice architectural approach of Aster or Netezza. (I.e., Sybase IQ's more-than-SQL advanced analytics story relies on C++ UDFs — User-Defined Functions — running in-process with the DBMS.) In particular, there's a data mining/predictive analytics library — covering both modeling and scoring — licensed from a small third party.

A number of the other later slides also have quite a bit of technical crunch. (More on some of those points below too.)

Sybase IQ may have a bit of a funky architecture (e.g., no MPP), but the age of the product and the substantial revenue it generates have allowed Sybase to put in a bunch of product features that newer vendors haven’t gotten around to yet.

More recently, Sybase volunteered permission for me to preannounce Sybase IQ Version 15.2 by a few days (it's scheduled to come out this week). Sybase IQ 15.2 seems to be focused in large part on the government/intelligence market, with three major features being:

A kind of data federation, querying external databases, that makes sense mainly in the context of rigorous security rules. (I find that confusing, since Sybase IQ’s indexes tend to hold all the information in the database, but I didn’t push the point.)

An upgrade to Sybase IQ’s built-in text indexing. I doubt anybody would confuse this with best-of-breed text search, but evidently the intelligence community is satisfied with less. Even before 15.2, Sybase IQ could do both LIKE and WHERE CONTAINS searching.
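To make the LIKE vs. WHERE CONTAINS distinction concrete, here's a toy Python sketch (my own illustration, not Sybase IQ's implementation): a LIKE-style predicate scans every row for a substring, while a CONTAINS-style predicate resolves a word via a prebuilt inverted index.

```python
# Illustrative only -- a toy model of the difference between
# LIKE '%pattern%' scanning and index-backed CONTAINS search.
# None of this reflects Sybase IQ internals.

def like_scan(rows, pattern):
    """Emulate WHERE col LIKE '%pattern%': substring-scan every row."""
    return [i for i, text in enumerate(rows) if pattern.lower() in text.lower()]

def build_text_index(rows):
    """Build a toy inverted index: word -> set of row ids."""
    index = {}
    for i, text in enumerate(rows):
        for word in text.lower().split():
            index.setdefault(word, set()).add(i)
    return index

def contains(index, word):
    """Emulate WHERE CONTAINS(col, 'word'): a single index lookup."""
    return sorted(index.get(word.lower(), set()))

rows = [
    "quarterly risk report",
    "Risk and compliance summary",
    "marketing brief",
]
idx = build_text_index(rows)
print(like_scan(rows, "risk"))   # full scan      -> [0, 1]
print(contains(idx, "risk"))     # index lookup   -> [0, 1]
print(contains(idx, "report"))   # index lookup   -> [0]
```

The scan touches every row regardless of selectivity; the index lookup touches only the matching postings, which is the usual argument for maintaining a text index at all.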

Improved LOB (Large OBject) management.

One part of my Sybase IQ conversations I haven’t blogged yet in much detail is scale-out, concurrency, and “multiplexing.”

Sybase feels that Sybase IQ’s competitive sweet spot, especially in terms of performance, is reached when there are 20 or more concurrent queries.

In general, Sybase asserts that a shared-everything architecture is great for concurrency — just run different queries on different boxes, all against the same data.

The ability to use a bunch of boxes to run Sybase IQ is called “multiplexing.” This is a chargeable option; without it, one is limited to a single SMP box.

Just under 20% of the top 250 Sybase IQ customers have multi-node scale-out configurations (vs. single-node SMP scale-up), and around 8% of all customers do.

Sybase IQ nodes can be heterogeneous (e.g., in compute power).

Sybase IQ nodes can be dedicated to be read-only, or can be read-write. Indeed, Sybase IQ nodes can change roles dynamically, for example becoming write-only during nightly batch load. (I didn’t clarify whether all this applies just to nodes-as-boxes, or if some parts apply to specific processors or cores within the same box.)

Finally, along the way in the discussions I picked up various tidbits about the Sybase IQ user base. Unfortunately, Sybase is pretty vague in discussing database sizes — are they user data? Are they compressed? What do the numbers mean? With that huge caveat:

By some metric or other, a couple of classified customers are approaching petabyte scale.

The largest commercial Sybase IQ customer — a credit card company — has a couple hundred terabytes or so.

The largest financial services Sybase IQ databases are 50-70 terabytes. This sounds low, frankly, so maybe those are compressed figures, with user data being 200+ terabytes. But I’m just speculating there.

Sybase IQ has a little less than 100 customers in the “data aggregator” market, which is a lot like what I call “data mart outsourcer.”

Sybase IQ’s ILM technology is a chargeable option, with Sybase being “cautious” about sales. Compliance is a big market driver for it.

Sybase IQ’s #1 vertical market is financial services. Other biggies are government, telecom, marketing services, and to some extent retail.

As of February, there were 40-45 production users of Sybase IQ 15.0 and 15.1.