Wednesday, October 24, 2012

Michael is a mid-tier specialist and Adam is with Greenplum. What is responsible for the big jump in data? It is the proliferation of smart devices recording and uploading images and data around the globe.

This explosion in data introduces new opportunities for business. For example, a retailer can tailor smartphone ads to a customer as they enter the store, based on their prior buying habits. This provides a localized experience for consumers.

The first thing you need is lots of space. For example, the Broad Institute is using Isilon to store data for genome sequencing. Isilon provides a single management interface that amalgamates petabytes of potential storage space.

Once all that data is stored, what do you do with it? You need to apply analytics to the data to turn it into business value. Segmenting massive amounts of data into fine-grained groups is called micro-segmentation.
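To make the idea of micro-segmentation concrete, here is a minimal sketch in Python. It is not from the session: the purchase records, the field layout, and the `micro_segment` helper are all hypothetical, and a real deployment would run this kind of grouping inside the analytics platform rather than in a script.

```python
from collections import defaultdict

# Hypothetical purchase records: (customer_id, region, category, amount).
purchases = [
    ("c1", "north", "shoes", 120.0),
    ("c2", "north", "shoes", 80.0),
    ("c3", "south", "books", 15.0),
    ("c1", "north", "books", 30.0),
    ("c4", "south", "books", 25.0),
]


def micro_segment(records):
    """Group customers into fine-grained segments keyed by (region, category)."""
    segments = defaultdict(lambda: {"customers": set(), "total": 0.0})
    for customer_id, region, category, amount in records:
        segment = segments[(region, category)]
        segment["customers"].add(customer_id)
        segment["total"] += amount
    return dict(segments)


segments = micro_segment(purchases)
for key, segment in sorted(segments.items()):
    print(key, sorted(segment["customers"]), segment["total"])
```

Each segment is small and specific (for example, shoe buyers in the north region), which is what makes targeted offers possible.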

Greenplum provides data analytics for structured or unstructured data (Hadoop) and adds nodes for linear scalability and performance. Queries run in parallel and are tuned to scale.
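The general pattern behind running a query in parallel across nodes can be sketched as scatter and gather: split the data across nodes, let each node compute a partial result, then combine the partials. The Python below is only an illustration of that pattern using threads on one machine; it is not Greenplum's implementation, and `partition`, `local_sum`, and `parallel_sum` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, num_nodes):
    """Distribute rows round-robin across a number of hypothetical nodes."""
    shards = [[] for _ in range(num_nodes)]
    for index, row in enumerate(rows):
        shards[index % num_nodes].append(row)
    return shards


def local_sum(shard):
    """Work done independently on each node: aggregate its own shard."""
    return sum(shard)


def parallel_sum(rows, num_nodes):
    """Scatter rows to nodes, aggregate each shard in parallel, then combine."""
    shards = partition(rows, num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(local_sum, shards))
    return sum(partials)


total = parallel_sum(list(range(1, 101)), num_nodes=4)
print(total)
```

Because each shard is processed independently, adding nodes shrinks the work per node, which is the intuition behind scaling out by adding nodes.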

The layers of this model are the Isilon platform, which presents the data using the HDFS protocol, and Greenplum, which uses an HDFS API to access this data.

Big data analytics requires data science; essentially you are running mathematical algorithms to predict what will be needed next. In the example, these algorithms were used to predict what the customer would be interested in and to target marketing to them.
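As a rough illustration of predicting what a customer might be interested in, the sketch below counts which items appear together in past baskets and recommends the most frequent companions. This is a deliberately simple co-occurrence technique chosen for illustration, not the algorithm used in the session's example; the basket data and function names are hypothetical.

```python
from collections import Counter
from itertools import permutations

# Hypothetical past shopping baskets.
baskets = [
    {"tent", "stove", "lantern"},
    {"tent", "stove"},
    {"tent", "lantern"},
    {"stove", "fuel"},
]


def build_cooccurrence(baskets):
    """Count how often each ordered pair of items appears in the same basket."""
    counts = {}
    for basket in baskets:
        for item, other in permutations(basket, 2):
            counts.setdefault(item, Counter())[other] += 1
    return counts


def recommend(counts, item, k=2):
    """Return up to k items most often bought with `item` (ties broken by name)."""
    ranked = sorted(counts.get(item, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:k]]


counts = build_cooccurrence(baskets)
print(recommend(counts, "tent"))
print(recommend(counts, "fuel"))
```

A production system would use far richer models, but the shape is the same: learn from past behaviour, then score candidate items for the next interaction.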

How many packaged apps are built around big data? Very few, so they are all custom built at this point in time. Developing these custom interfaces can be very advantageous: as an example, there are a few online retailers enabling partners to query their customer data to understand shopping habits.

Greenplum Chorus brings a social-networking-style interface that lets you interact with big data. You can create a workspace, create a team of users to interact with it, and add a sandbox to store your data. Once your workspace is created, you can grab an instance of data to interact with. You can tag it to associate it with your workspace (vs. moving it). This is beneficial because you can associate data with a workspace without creating copies or moving large amounts of data. You can also join relational and Hadoop-based data sets. The point of this flexibility is to enable a business to do things like customer profiling.
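To show what joining a relational data set with a Hadoop-style one can look like for customer profiling, here is a small Python sketch. It does not use Chorus or any Greenplum API: an in-memory SQLite table stands in for the relational side, a list of tab-separated log lines stands in for files stored in Hadoop, and `join_profiles` is a hypothetical helper.

```python
import sqlite3

# Relational side: a small customer table (stand-in for a database table).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id TEXT PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("c1", "Ana"), ("c2", "Ben")],
)

# Hadoop-style side: raw tab-separated click log lines (stand-in for HDFS files).
click_log = ["c1\t/shoes", "c2\t/books", "c1\t/books", "c9\t/toys"]


def join_profiles(conn, log_lines):
    """Join log lines to customers on id and collect pages viewed per customer."""
    names = dict(conn.execute("SELECT id, name FROM customers"))
    profiles = {}
    for line in log_lines:
        customer_id, page = line.split("\t")
        if customer_id in names:  # inner join: skip ids with no customer row
            profiles.setdefault(names[customer_id], []).append(page)
    return profiles


profiles = join_profiles(conn, click_log)
print(profiles)
```

The result is a simple per-customer profile built from both sources, without first copying either data set into the other system's format.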

Pivotal Labs helped build these interfaces, and EMC liked them so much they essentially acquired the company. They bring the application development piece that was missing in the EMC big data message.

About Me

I am a Principal Cloud Architect at Long View Systems and have spent 16 years designing, implementing, and managing IT Infrastructures in highly available computing environments. My primary areas of focus are the deployment of virtualization (Server, Storage, Desktop, Application and WAN Optimization).