You Know You Have Big Data When…(Humor)

One of the more philosophical questions analysts like to ask is “What is Big Data?” It’s relative – it begs the question, “what’s big?” And that is a constantly moving number, always assessed by comparison to the ridiculous amounts some companies work with. But Big Data as a concept in IT parlance today tends to mean something fairly specific, not just about size but also about composition and the nature of the processing. So I considered a serious attempt at a fairly rigorous discussion about the nature of the workload, structure of the data and the kinds of analytics that comprise what people think of as Big Data… and then I thought of Steve Martin, who would have considered this carefully and then looked into the camera and said “Naaaahh.” So I determined to emulate him and have a bit of fun instead, by crowdsourcing some help completing the sentence “You know you have Big Data when…” Here’s what some Twitter folks said. Some are funny, some more serious …

You know you have Big Data when….

… you get a call from the utility company asking you not to run ‘that brownout query’ again. (@aristippus303 at Datawatch)

… your IT spends more time purchasing storage capacity than making sure the business has the data they need – @judyiko (Informatica)

… EMC name a new product after you (@aristippus303 at Datawatch)

… it piles up so high that it disappears into the clouds (@evertlammerts – I assume pun was intended?)

… the SAN undergoes gravitational collapse and you get cited by OSHA for an unlicensed singularity. (@datamartist)

… a query is long enough to require a couple of DBA generations to see it returning first data. (@Stray_Cat)

… your datacenter manager divides time between installing a new NAS in the kitchen and googling for vacant aircraft hangars. (@alanjharrison)

And a few of mine:

… you conduct an audit, including external files, and add more in to the databases than you take out.

… you think Flomax is a new ETL product.

… the first item on your bucket list is “finish data model.”

… you’ve never gotten to the “Reduce” part.

… your Dad won’t let you have the keys to the table you want to join to because he’s still doing the schema update he started on your birthday. No, your BIRTH day.

OK – that’s way more than enough. Don’t you have a schema to update? Get back to work. If you get bored, send me some more.

8 Responses to You Know You Have Big Data When…(Humor)

– You find that your ERP product generates 200 GB of LOG files.. PER DAY!!! And your users do not even create 1% of that in REAL data….
– Your ERP Vendor cannot tell you what part of these LOG files is “critical” and what is “nice to have, just in case…”
– A simple audit finds as many copies of this year’s 9 MB “Strategy Presentation.PPT” as there are employees.
– People are allowed to put daily “versions” of their personal 2GB PST files on the file-server.
– When the daily extract of the Production Mainframe Database creates Distributed Databases that contain (when combined) 200 times the amount of data as the original production database.

Thanks for the nice comment! Interestingly enough, I did a SERIOUS discussion of this at Gartner Symposium this week. And will do another in 2 weeks in Australia. It’s certainly heated up a lot since this post was written.

You know you have big data when ….
… Your HOURLY log data exceeds the size of your ENTIRE product catalogue
… Instead of a small ETL window you have a small transaction window and the rest is used by ETL processes
… You apply CAP theorem to your DBAs
Cheers

Follow me at Gartner

I am a Gartner analyst, covering information management with a strong focus these days on big data and NoSQL-related issues. I'll continue to post here, subject to the guidelines there, as well as in my Gartner blog. Posts here will link there.