AWS, Redshift and Co-Opetition

Long-time technology commentators understand the tension between platform companies, intent on both growing their business and creating a healthy ecosystem, and ecosystem partners who leverage what the platform brings but remain apprehensive about the long-term intentions of the platform vendor. Case in point – the growing number of services offered by Amazon Web Services that encroach upon what AWS partners provide. With the launch of Redshift at AWS' re:Invent conference in Las Vegas, this tension was brought into stark relief yet again.

Redshift, according to AWS, is a service that offers:

…a fast and powerful, fully managed, petabyte-scale data warehouse service in the cloud. Amazon Redshift offers you fast query performance when analyzing virtually any size data set using the same SQL-based tools and business intelligence applications you use today. With a few clicks in the AWS Management Console, you can launch a Redshift cluster, starting with a few hundred gigabytes of data and scaling to a petabyte or more, for under $1,000 per terabyte per year.

The obvious targets AWS has in its sights are the large data warehouse and analytics vendors like Oracle, IBM and HP. But there are also companies using AWS infrastructure to build products that are now directly competitive with Redshift – companies like BitYota and Kognitio. Over on GigaOm there was a short piece covering BitYota's reaction; essentially it went along expected lines:

Now, BitYota CEO Dev Patel, a former Yahoo exec, says there are data warehouses and then there are data warehouses. Bityota built its Software-as-a-Service offering from the ground up with its own technology and crafted it so users won’t have to sweat how to configure compute instances or storage. And doesn’t include Hadoop, so it will sport a performance advantage there. And they won’t have to hire Hadoop eggheads, who are expensive and hard to find.

Now of course there are differences between Redshift and other analysis/warehouse tools – Redshift itself works with existing analytics products from the likes of Jaspersoft and Cognos, whereas BitYota and Kognitio bundle warehousing and analytics – but that's shaky ground on which to build a defensible product play. I wanted to dive into what all this means for the ecosystem, so I spent some time with Kognitio to get their take on it. Kognitio was putting on a brave face, talking mostly about how its product is ready to be used with Amazon's upgraded instance types – but the meat of the conversation comes when the discussion turns to Redshift.

Kognitio told me that they expected something from Amazon at the event, but that in their view Redshift is clearly targeted at pulling Oracle data warehouse users from on-premises deployments to the cloud in an AWS environment, rather than the Oracle Cloud environment. They pointed out that AWS keynote speakers even "poked fun" at vendors like Oracle as being "cloud washed". As such, their view is that the Redshift introduction supports Kognitio's position that there is a need for specialist public cloud-based options for information management.

In their view, Redshift is just another data source from which they can read, load data into RAM and execute queries. As such, Kognitio is a complementary offering to Redshift, in that data can be stored efficiently in an AWS cloud environment and then loaded rapidly into Kognitio's in-memory structures for high-speed analytical processing. They pointed out that Kognitio runs in both public and private clouds, and hence delivers on the real enterprise need for a hybrid offering.

Kognitio was quick to sing the complementary praises of Redshift:

While we don’t see head-to-head comparisons happening often, we would clarify that Redshift is an excellent replacement for existing data warehouses – those systems that store data – while Kognitio is specialized to accelerate the overall BI infrastructure as a “middle tier” between that persistence layer and the visualization/user interface layer in which it is presented.

Clearly there is some substance to this perspective. Kognitio pointed out several differences between in-memory analysis and column-based approaches:

Data naturally occurs in rows, so it can be loaded much faster this way

In-memory RAM processing is thousands of times faster than conventional spinning hard disks

Kognitio has superior load and complex-query performance as a result of a considerably lower data-to-core ratio, with no need to build columnar structures, which require extensive administration and tuning
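The row-versus-column point above can be made concrete with a toy sketch. To be clear, this is illustrative only – it is not how Kognitio or Redshift are actually implemented, and the table and column names are invented for the example. It shows why loading is cheaper into a row layout (data arrives as rows) while a columnar layout pays an up-front cost splitting each row across per-column structures, which is the "building columnar structures" overhead Kognitio alludes to:

```python
# Toy comparison of row-oriented vs. column-oriented storage.
# Illustrative only -- not Kognitio's or Redshift's actual design.

rows = [
    ("2012-11-28", "widget", 3),
    ("2012-11-28", "gadget", 5),
    ("2012-11-29", "widget", 2),
]

# Row store: data naturally arrives in rows, so each load is one append.
row_store = []
for r in rows:
    row_store.append(r)

# Column store: every incoming row must be split across per-column lists,
# the extra structure-building work done at load time.
col_store = {"date": [], "product": [], "qty": []}
for date, product, qty in rows:
    col_store["date"].append(date)
    col_store["product"].append(product)
    col_store["qty"].append(qty)

# An analytical query (sum of qty) scans only one column in the columnar
# layout, but must touch every full row in the row layout.
total_from_rows = sum(r[2] for r in row_store)
total_from_cols = sum(col_store["qty"])
assert total_from_rows == total_from_cols == 10
```

The trade-off runs the other way for queries: the columnar sum reads only the `qty` list, while the row-store sum drags every row through memory, which is why columnar engines shine on wide tables with narrow queries and row/in-memory engines shine on fast loading.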

All valid points that speak to a real difference between what Redshift is today and what the analytics vendors provide. But AWS moves fast, and I'd be surprised if these vendors weren't spending lots of time strategizing about what they'll do if, and when, AWS decides to eat their lunch. The joys of a rapidly developing ecosystem, huh?

Ben Kepes is a technology evangelist, an investor, a commentator and a business adviser. Ben covers the convergence of technology, mobile, ubiquity and agility, all enabled by the Cloud. His areas of interest extend to enterprise software, software integration, financial/accounting software, platforms and infrastructure as well as articulating technology simply for everyday users.

2 Comments

Ben,
A correction on your post above – We don’t “bundle warehousing and analytics”; rather, we play nicely with our customers’ ecosystem of BI tools like Jaspersoft, Tableau etc. Any BI tool can use our ODBC API to submit analytical queries to BitYota’s data warehouse as a service. This is standard functionality for any data warehouse, traditional or in the cloud, and we are no different. Where we differentiate is through our SaaS – we manage the entire process and underlying infrastructure of bringing large volumes of data together from multiple sources, and enabling submission of analytics against it, fast and at scale, while managing the underlying data, storage and infrastructure for you. This is a pretty expensive, time-consuming and labor-intensive process today. Other data warehouse vendors require customers to have trained IT staff and spend $$$ to build a data pipeline, to model the data, to extract-transform-load it via custom programs into a data warehouse and then finally manage the warehouse infrastructure and data. Redshift has not removed the customer requirement to do ANY of this – they’ve just moved it to their cloud infrastructure.
You can learn more about our product here http://www.bityota.com or better yet, try the product out for free from the Free Trial link there.

With Redshift, Amazon, yet again, shows that its long term strategy is still favoring the ‘startups’ and those who are setting up new ‘asset-light’ businesses. Given the rate and volume of data growth, both traditional enterprise as well as ‘cloud-based’ SMBs would have enough data and need to analyze/warehouse/crunch it so both Redshift as well as Oracles/IBMs have enough opportunity to gain from. This is true until the enterprises onboard Amazon-like public clouds on a large scale and for non-legacy applications that don’t need huge data xfers, going forward. Once that inflection point is reached (~5yrs maybe) – data volume growth and the need to warehouse/analyze it would tilt in favor of such cloud based services from Amazon or other SaaS offerings.