How do you know whether you are dealing with Big Data or Small Data? I’m constantly asked for my definition of “Big Data”. Well, here it is…for batch analytics, now addressed by technologies such as Hadoop.

Batch Analytics

Batch Analytics

Small Data

Big Data

Data Volume

Gigabytes

Terabytes – Petabytes

Data Velocity

Updated periodically with non-real-time intervals

Updated both in real-time and through bulk timed intervals

Data Variety

1-6 structured sources

6+ structured AND 6+ unstructured sources

Data Models

Store data without cleaning, transforming, or normalizing.

Store data without cleaning, transforming, and normalizing. Then apply schemas based on application needs.

Business Functions

One line of business (e.g. sales)

Several lines of business – to – 360 view

Business Intelligence

Queries are complex requiring many concurrent data modifications, a rich breadth of operators, and many selectivity constraints. However, they are applied to a simpler data structure.Response times are in minutes to hours, issued by one or maybe two experts.Example: determine how much profit is made on a given line of parts, broken out by supplier, by geography, by year.

Queries are complex requiring many concurrent data modifications, a rich breadth of operators, and many selectivity constraints. Queries span across business functions.Response times are in minutes to hours, issued by a small group of experts.

Example: determine how much profit is made on a given line of parts, broken out by supplier, by geography, by year; and then determining which customers purchased the higher profit parts, by geography, by year; determining the profile of those high-profit customers; finding out what products purchased by high-profit customers were NOT purchased by other similar customers in order to cross-sell / up-sell.

Subscribe

About Jim Kaskade

Jim currently leads Janrain, the category creator of Consumer Identity & Access Management (CIAM). We believe that your identity is the most important thing you own, and that your identity should not only be easy to use, but it should be safe to use when accessing your digital world. Janrain [...]more →