I'm Managing Partner at gPress, a marketing, publishing, research and education consultancy. Previously, I held senior marketing and research management positions at NORC, DEC and EMC. Most recently, I was Senior Director, Thought Leadership Marketing at EMC, where I launched the Big Data conversation with the “How Much Information?” study (2000 with UC Berkeley) and the Digital Universe study (2007 with IDC). I blog at http://whatsthebigdata.com/ and http://infostory.com/. Twitter: @GilPress

Scaling Users, Not Data: SiSense’s New Take on Machine Learning and Crowdsourcing

In a new Saturday Night Live sketch, “Secretary Sebelius” explains that HealthCare.gov is so slow because it was designed to handle only six users at a time. We’ve become accustomed to everything online slowing down with additional users, but SiSense today announced the general availability of patent-pending technology that makes its big data analytics engine respond faster as the number of its users grows.

SiSense has positioned itself as the big data analytics company focused not so much on how big the data is but on how fast it responds to the queries of its business users. The speed of the response depends on how fast you can get the data from where it is stored to the Central Processing Unit (CPU), which performs the analysis and displays the results. While other big data players process data either by spreading it over many commodity servers or by cramming it into the computer’s memory (RAM), SiSense dynamically manages the flow of data from disk to RAM to the cache on the CPU itself.

With SiSense technology, all queries are broken down to more granular, machine-level “instructions.” This detailed knowledge of the nature of each query helps it understand which data to keep on the disk and which data to bring to memory, at times even anticipating the next query before the user asks the question. It then uses sophisticated mathematical calculations to process the query in parallel on the CPU.

There is nothing new in relying on cache (memory) to perform repetitive tasks quickly. SiSense’s breakthrough was the attention it paid to the cache on the CPU itself and its granular understanding of the queries. This allowed it to demonstrate the analysis of 10 terabytes of data in 10 seconds on a $10,000 server at Strata earlier this year, where it won the “people’s choice” award.

Now SiSense is taking its technology (and user-friendly philosophy) further by allowing more users to join in the fun without the need to add memory resources. “The problem with analytics is that people don’t ask exactly the same question,” SiSense CEO Amit Bendov told me last week. “If the question is slightly different, then the [traditional] caching system is not effective. Someone may ask about ‘sales by product by quarter’ and another user may ask for ‘sales by product by category.’ Similar but not identical.”

This is where “translating” each query into machine-level instructions helps in accommodating a larger number of users. “Instead of looking for identical queries like other caching systems we look for similar queries with 80% overlap in the instructions,” says Bendov. “Every query is broken down into a very large tree structure and we look if this tree has sub-trees that are identical or similar to other queries. It’s actually a learning system that stores all the answers to the most difficult question.” Instead of slowing down as more users come up with slightly different queries, SiSense learns from the results of these similar queries and increases its efficiency and speed.
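To make the sub-tree idea concrete, here is a minimal Python sketch of caching results per sub-tree rather than per whole query. It is an illustration only, not SiSense's implementation, which the article does not describe in detail. The names Node, subtrees, overlap, SubtreeCache and fake_compute are all hypothetical, and the "computation" is a stand-in that just builds a string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One step of a (hypothetical) query plan; frozen so it can be a dict key."""
    op: str
    args: tuple = ()  # child Nodes or literal strings


def subtrees(node):
    """Yield every sub-tree of a plan, the root included."""
    yield node
    for arg in node.args:
        if isinstance(arg, Node):
            yield from subtrees(arg)


def overlap(a, b):
    """Fraction of a's distinct sub-trees that also appear in b."""
    sa, sb = set(subtrees(a)), set(subtrees(b))
    return len(sa & sb) / len(sa)


class SubtreeCache:
    """Stores the result of every sub-tree so later queries can reuse it."""

    def __init__(self):
        self.results = {}
        self.hits = 0
        self.misses = 0

    def evaluate(self, node, compute):
        if node in self.results:
            self.hits += 1
            return self.results[node]
        self.misses += 1
        # Evaluate children first; shared children are answered from the cache.
        children = tuple(
            self.evaluate(arg, compute) if isinstance(arg, Node) else arg
            for arg in node.args
        )
        result = compute(node.op, children)
        self.results[node] = result
        return result


def fake_compute(op, children):
    """Stand-in for real work: returns a string describing the step."""
    return f"{op}({', '.join(map(str, children))})"


# The two questions from Bendov's example share the expensive join.
join = Node("join", (Node("scan", ("sales",)), Node("scan", ("products",))))
q1 = Node("group_by", (join, "product", "quarter"))
q2 = Node("group_by", (join, "product", "category"))

print(overlap(q1, q2))  # 0.75: three of q1's four sub-trees also occur in q2

cache = SubtreeCache()
cache.evaluate(q1, fake_compute)
cache.evaluate(q2, fake_compute)
print(cache.hits, cache.misses)
```

In this toy version only exact sub-tree matches are reused; the second query recomputes just its final grouping step and takes the join from the cache. Matching "similar" rather than identical sub-trees, as Bendov describes, would need additional logic that is not shown here.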

The company is on track to triple its revenues this year, the same rate of growth it saw in each of the previous two years and expects to repeat next year. It currently has more than 500 customers, “adding dozens every quarter,” says Bendov. While some of these customers are large companies such as Merck and Target, the typical SiSense engagement is with a business analyst in a department such as marketing. The new “more users=faster results” capability, however, may help accelerate its plan to offer enterprise-wide licenses. Says Bendov: “This positions SiSense as a tool that is very nimble and very much for the business user but also keeps IT happy because it can grow to an enterprise-scale solution.”

What SiSense calls “crowd accelerated analytics” may also help accelerate Bendov’s plan to go public in 2017. Calling me from the new office SiSense just opened at 14 Wall Street, he says: “I can see the New York Stock Exchange just across the street so it will be a short walk when we go to ring the bell…”
