Machine Learning, TDA and the Future of Invention

Last week, Ayasdi came out of stealth mode and told the world it had a new way
to analyze big data, and I think the implications for CRM and social are very large
indeed. The new way is called "topological data analysis" (TDA), and hearing about it
feels like hearing about relativity (or Salesforce.com) for the first time and
learning that space is curved.

Who would have thought that Big Data is not some amorphous
mass but something with topology -- an entity with curves and folds and shapes?

Why is that important? Well, understanding the shape of data turns out to be,
mathematically, a shortcut to understanding it or to extracting meaning from
it. Shapes include clusters, and they can tell us where the interesting bits are.

Consider the implications. No longer does one have to be inspired to ask good
questions of data so as to write queries that deliver information. With topological
data analysis, you can first identify the interesting clusters of data and then ask,
"What's so interesting about that?"
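
To make the "find the clusters first, ask questions second" idea concrete, here is a minimal sketch in Python. It is not Ayasdi's method and it is far simpler than real topological data analysis: it just links points that sit within a chosen radius of each other (connected components, the most basic notion of "shape") and then reports which feature sets each cluster apart. The records, the feature names (monthly_spend, support_tickets) and the radius are all made up for illustration, and the features are left unscaled to keep the example short.

```python
from math import dist
from statistics import mean

def find_clusters(points, radius):
    """Group points into connected components: two points are linked
    when they lie within `radius` of each other."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        start = unvisited.pop()
        component, frontier = [start], [start]
        while frontier:
            current = frontier.pop()
            # Collect neighbours first, then remove them, so the set is
            # never modified while it is being iterated.
            near = [j for j in unvisited
                    if dist(points[current], points[j]) <= radius]
            for j in near:
                unvisited.remove(j)
            component.extend(near)
            frontier.extend(near)
        clusters.append(sorted(component))
    return sorted(clusters)

def most_distinctive_feature(points, cluster, names):
    """Name the feature whose cluster mean sits farthest from the overall mean."""
    gaps = []
    for k, name in enumerate(names):
        overall = mean(p[k] for p in points)
        inside = mean(points[i][k] for i in cluster)
        gaps.append((abs(inside - overall), name))
    return max(gaps)[1]

# Hypothetical records: (monthly_spend, support_tickets)
names = ["monthly_spend", "support_tickets"]
points = [(10, 1), (11, 2), (12, 1), (50, 9), (51, 8), (52, 9)]

# Step 1: let the data show where the clusters are.
clusters = find_clusters(points, radius=3.0)

# Step 2: only now ask what is interesting about each one.
for cluster in clusters:
    print(cluster, most_distinctive_feature(points, cluster, names))
```

The point of the sketch is the order of operations: nobody had to guess in advance that spending was the thing worth querying.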

I'll Ask the Questions, Dave

It's a big shift in perspective and maybe philosophy. Certainly, it takes the human race down a notch in its own esteem. Now we don't rack our brains to ask piercing questions of our data -- we have machines that do it better, so we have to stand back and watch.

This may seem odd, but what if there's a bombshell lurking in your data that
you were never inspired to ask about? Would the data hold its secrets forever?
Well, not anymore.

Right now, topological data analysis is a very geeky mathematical concept -- just a
couple of years removed from Stanford and a DARPA lab -- but the potential it holds
is big.

The Next New Age

I believe that the Information Age is winding down, just like the Age of Steam did
and just as all "Ages" do. That's not to be feared -- it's something to be embraced.
What will take the place of information as the major disruptor and economic
driver? Whatever it is, it will have to stand on the shoulders of the Information
Age and use the latest and greatest tools.

Part of that toolkit will be topological data analysis, for the simple reason
that our ability to exploit discoveries in both pharmaceuticals and oil and
gas -- to take just two examples -- is maxing out.

It costs upwards of US$100 million to drill an oil well in the Gulf of Mexico; it takes
a team of people a few billion dollars and a decade to bring a new drug to market.
It hardly gets said, but these investments cost the same whether or not the oil well
has oil at the bottom of it, and it's the same story if the pharmaceutical comes a
cropper.

Those numbers are big -- so big that they represent ceilings to further
discovery unless we find breakthroughs that will reduce the costs and the risks of
getting it all wrong.

All Roads Lead to Discovery

Already we're seeing topological data analysis crack some amazingly hard
nuts, not only in the aforementioned pharmaceuticals, oil and gas, but also in financial
services and government. Anywhere there's big data there is an opportunity for
topological analysis, and that means the mass of social data we generate too.

People at Ayasdi tell me that when they apply topological data analysis to 20-year-old data from pharmaceutical research, they find new and interesting information. So far, I don't think they've come up with any new drugs, but it's
early days.

The market has other entrants too, and while Ayasdi might be taking the
highest road to the biggest customers and perhaps the hardest problems,
other companies using machine learning are implementing roughly the same
idea.

'CustomerDNA' by Any Other Name

Consider Mintigo, for example. This company focuses on identifying sales
prospects, which is not the same as generating leads, but it's a cool and important
idea nonetheless and essential in many industries.

Mintigo analyzes existing customers to build a sophisticated data model of what
a successful customer looks like for your organization. That is to say, Mintigo
looks at the data surrounding those customers and identifies the clusters of
relevant data that qualify them as a match for your company and its products.

From there, it's a simple matter of pointing the machine's model at the general
marketplace to see what it drags in. They call it identifying your "CustomerDNA."
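
As a rough illustration of the "model your best customers, then point the model at the market" idea, here is a minimal Python sketch. It is not Mintigo's actual approach, which I have not seen; it simply averages a few features across existing customers to form a profile and ranks prospects by how close they sit to it. The company names, the two features (company_size, web_activity) and their 0-to-1 scaling are all hypothetical.

```python
from math import dist
from statistics import mean

def build_profile(customers):
    """Average each feature across known good customers."""
    return tuple(mean(column) for column in zip(*customers))

def rank_prospects(profile, prospects):
    """Sort prospect names from closest to farthest from the profile."""
    return sorted(prospects, key=lambda name: dist(profile, prospects[name]))

# Hypothetical features, each already scaled to 0..1:
# (company_size, web_activity)
customers = [(0.8, 0.9), (0.7, 0.8), (0.9, 0.7)]
prospects = {
    "Acme": (0.75, 0.85),
    "Globex": (0.2, 0.1),
    "Initech": (0.6, 0.5),
}

profile = build_profile(customers)
ranking = rank_prospects(profile, prospects)
print(ranking)
```

A production system would presumably use many more signals and a learned model rather than a plain average, but the shape of the workflow is the same: learn from the customers you have, then score the ones you don't.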

Call it "CustomerDNA" or "TDA" or, more broadly, "machine learning." Whatever you
call it, we're on the cusp of another revolution that simplifies a major headache
and reduces the cost of important business processes to manageable levels again.
With these as catalysts, can new discoveries and economic growth be far behind?