Keeping You Up On The Latest

Posts tagged ‘NoSQL’

A 2011 McKinsey & Co. survey pointed out that many organizations lack both the skilled personnel needed to mine big data for insights and the structures and incentives required to use big data to make informed decisions and act on them.

Big data is a mixture of distributed data architectures and tools like Hadoop, NoSQL, Hive and R. Data scientists serve as the gatekeepers and mediators between these systems and the people who run the business – the domain experts.

The data scientist serves three main roles: data architecture, machine learning, and analytics. While these roles are important, not every company actually needs a highly specialized data team of the sort you’d find at Google or Facebook.

Most of the standard challenges that require big data, like recommendation engines and personalization systems, can be abstracted out. Feature creation could likewise be templatized on a per-domain basis. What if domain experts could directly encode their ideas and representations of their domains into the system, bypassing the data scientist as middleman and translator?
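To make the idea of templatized feature creation a little more concrete, here is a minimal sketch in Python. Everything in it is hypothetical and for illustration only: the `feature` decorator, the `FEATURE_TEMPLATES` registry, and the retail-style field `orders` are assumptions, not part of any tool named in this post. The point is only that a domain expert could declare what matters as small named rules, and a generic system could apply them without a translator in between.

```python
from typing import Any, Callable, Dict

Record = Dict[str, Any]

# Hypothetical registry of feature templates, keyed by feature name.
FEATURE_TEMPLATES: Dict[str, Callable[[Record], float]] = {}


def feature(name: str):
    """Register a function as a named feature template (illustrative helper)."""
    def decorator(fn: Callable[[Record], float]) -> Callable[[Record], float]:
        FEATURE_TEMPLATES[name] = fn
        return fn
    return decorator


# A domain expert encodes knowledge as small, named rules.
# The "orders" field is an assumed list of order amounts.
@feature("avg_order_value")
def avg_order_value(customer: Record) -> float:
    orders = customer.get("orders", [])
    return sum(orders) / len(orders) if orders else 0.0


@feature("is_repeat_buyer")
def is_repeat_buyer(customer: Record) -> float:
    return 1.0 if len(customer.get("orders", [])) > 1 else 0.0


def build_features(record: Record) -> Dict[str, float]:
    """Apply every registered template to one record."""
    return {name: fn(record) for name, fn in FEATURE_TEMPLATES.items()}


if __name__ == "__main__":
    print(build_features({"orders": [10.0, 30.0]}))
```

Adding a new feature then means writing one more decorated function; the generic `build_features` step does not change.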

Currently, big data is synonymous with technologies like Hadoop and the “NoSQL” class of databases, such as MongoDB (a document store) and Cassandra (a wide-column store). Today it’s possible to stream real-time analytics with ease. Spinning clusters up and down is a (relative) cinch, accomplished in 20 minutes or less.

Business intelligence applications have begun to transition from OLAP to a new type of service that connects different data sources: social networks, third-party apps and others. NoSQL has emerged as a popular option because it scales across cheap commodity nodes. That is much cheaper than scaling with vertically integrated systems that require attaching expensive storage arrays.

A new generation of big data applications is turning up, which in turn has put pressure on enterprise vendors to modify existing software suites. Venture capitalists will continue to invest in data infrastructure and big data apps that represent the manifest disruption in IT.