Last year I was asked if I would contemplate taking on the management of Brexit. It was a hypothetical question. No one is actually going to offer me the job. I had no qualms in stating that, given the complexity, uncertainties, risks and costs, I would hope that I would have the good sense to turn down such an offer, with the recommendation to the client that they should not undertake such a reckless, costly and strategically stupid project.

So, that’s my going-in position. As other strategists will have picked up on.

A data warehousing superhero is something to be

Not all that glitters is Big Data, and Big Data has a long way to go before it can deliver anything like the same satisfying results, tangible benefits and organisational agility that a properly implemented Inmon Enterprise Data Warehouse can provide.

Every year I ask myself the same question. Will there be any tangible, coherent and verifiable Big Data success stories in the coming year? Every year I come up with nothing. Nothing at all. “Sorry, no rooms at the Big Data Success Inn, as we are closed for vacations.”

However, this year things are different. More positive, more alive and more fantastic.

As you can probably guess, I am well excited to be able to reach out and tell you about the twelve amazingly fab Big-Data stories that will appear during the course of 2016. The year of the incredible, startling and awesome Big Data monkey.

To this end, and as this is a magically special occasion, I have made an extra-special effort to deliver the goods, to do full justice to the task, and to go that extra Big Data kilometer for my demanding readership.

So, I gazed into Madame Frufru’s crystal ball, I opened up the kimono with the Ouija spirits of Von Neumann, Babbage and Jobs, and I pushed the envelope in the vast disruptive solution-spaces inhabited by Ada Augusta, Audrey Tautou and Jennifer Saunders… and, I came back with the best of the best.

Big Data leads to massive government savings – 2016 will be the year in which right-thinking, common sense and pragmatic governments around the world will leverage Big Data to bring about a radical reduction in government expenditure. Unchained from the dogma of professionality, administrations will replace overpaid, over-educated and over-bearing statisticians, with Data Scientists who can produce ‘the required numbers’, a priori, and at a tenth of the cost. If this works well, as no doubt it will, other professions, such as medicine, teaching and the law enforcement agencies, will also be subjected to the Big Data treatment. Why pay a professional Doctor, Teacher or Police Officer their exorbitant fees and salaries, when a Quack Scientist, Chalky Scientist or Plod Scientist can fill their places, and for a fraction of the cost.

Big Data clamps down on gum chewing in Singapore – Radical Polymer masticating criminals are the bane of the upstanding street-walking citizens of Singapore. However, in 2016, this will change. Why? Because Big Data will be used to identify, track-down and apprehend gum-chewing, sidewalk spoiling and anti-social spearmint-breathed offenders. Yes, capital punishment for such offenses may seem harsh, but remember, if Hadoop says it is a heinous crime, especially if it’s backed up by expert social media opinion, then it must be right.

Big Data solves the Climate Change conundrum – Following the amazingly successful climate talks in Paris this year, 2016 will herald a period of fantastic adjustment in how climate change is seen, measured and addressed. No longer will Climate Change be seen as a threat or a problem, but as a seriously good opportunity for market capitalism in general, and Big Data in particular. Measurement of temperature changes will no longer be made, but massive Big Data technologies will collect climate change opinions from global social media, and that will be our unique guide to the actual effectiveness of the fight against things ‘getting too hot’. Big Data will lead the way, and factor 10,000 sun blocker and super-mega walk-in fridge-freezers will follow.

Big Data helps put Real Madrid back in the top tier – The BBC* might not like it, but Big Data will triumph in sport in 2016, thanks in the main to its innate ability to help Real Madrid win the Champions League, the Spanish League, and the Spanish King’s Cup. Even though the mighty-whites have already been eliminated from the last of these competitions (for fielding a Big Data player who was under a match ban). Okay, so Big Data can’t get it all right, but no one is perfect.

*Bale, Benzema and Cristiano.

Big Data knocks out Data Warehousing – In 2016, Big Data will finally put Data Warehousing to bed. It’s been on the cards for a while now, but in 2016 it will be proven beyond any shadow of doubt that the best input into strategic, tactical and operational decision making are massive concatenations of simple word counts, done on a vast array of what people are now describing as commodity hardware. Commodity hardware, to distinguish it from the other hardware that we were using up until now, which was also confusingly termed ‘commodity hardware’.

LinkedIn publishes its first ever Big Data success story – Incredible, but true. In 2016, LinkedIn will get its resident Big Data guru, data master and influencer to document a tangible, coherent and verifiable Big Data success story. It will matter not a jot that it is a knock-off plagiarism of a late nineteen-nineties Data Warehousing partial success-story, as it’s the thoughts, and not the facts, that count.

Queen Brenda inaugurates the Lady Di Memorial Big Data Lake – During 2016, HRH will inaugurate the former Windermere Lake as the new Lady Di Memorial Big Data Lake. Millions of subjects will hail this as a clear success story for Big Data and for Britain. The inauguration day will be slightly marred (no pun intended) by a gushing Big Data guru being told to beggar ‘orf by none other than Phil the Greek.

Big Data housing becomes an issue of significant importance to the EU – Because of the incredible speeds amazingly valuable Big Data is being created at, the EU will move to take measures to capture and more importantly store all of this new Big Data. There will exist an existential realization that none of this life-giving Big Data should be lost or compromised, or both. Chancellor Merkel has already come out strongly and offered to take much of the generated Big Data in 2016, which will be housed in both public and private premises. For example, each German household will be asked to house volumes of Big Data based on the size of the family abode, the internet bandwidth and the number of smart phones im haus. France and Spain will follow suit, but with modestly reduced quotas. The UK will spend most of 2016 trying to opt out, and will even threaten a Big Data Referendum if the onus on them to take so much Big Data is not radically reduced. So, in net, a win-win for Europe and Big Data.

The CIA will be charged with custodianship of all Big Data success stories – During 2016, together with the custodians of Fort Knox, the CIA will be charged with custodianship of all tangible, coherent and verifiable Big Data success stories, and only those who should know, and can handle the power of information, will have access to the files. This will be done to avoid information of global importance from falling into the hands of evil-doers, delinquents and busy-bodies. This is a success story because it will demonstrate once again a truly tangible, coherent and verifiable Big Data success story. The Head of the FBI was unavailable for comment.

Big Data solves world issues – For years we have struggled to see the elephants in the global room. Now with the help of Big Data, not only will we finally be able to see them, but also we will have a key component of the solution within a click of the mouse and a rapid stroke of a smartphone gesture. Yes, hunger, poverty and the refugee crisis can all be identified in 2016, thanks mainly to Big Data. What’s more, if we get the political will to do so we can also think of ways of partially, or wholly, fixing those problems. Although admittedly that is a ‘big ask’ of Big Data, especially in one enormously hectic year, where the focus of attention will be mainly on the UEFA European Championship, the Olympics and the war on terror. Now if that isn’t a Big Data success story then I really don’t know what is.

Big Data success stories to top a million by the end of 2016 – Thanks to a global and socially responsible market-driven initiative to reclassify Microsoft Access and Microsoft Excel as Big Data repositories, the number of Big Data success stories for 2016 will amazingly exceed a million, and that’s just in Milton Keynes.

Democratic Elections replaced by Mega-Democratic Big Data Social Media Mining – Sick and tired of having to turn out to vote every four years? Tense, nervous pre-election headaches over not being able to think, weigh or decide? Worry no more. Thanks to advanced social-media mining techniques, from 2016 the election of politicians will be decided not by you – at least not in the legacy way – but by a broad interconnected raft of machine learning, sentiment analysis and other data science gizmos – guaranteed 100% democratic. This is what we have all been waiting for. The end of old fashioned and boringly DIY-democratic elections, and the heralding of a brave new world of online interactive social-media politics. Don’t look at it as the trivialization of democracy, the puerility of post-modernity and the throwing away of centuries of fights for civil and human rights, look upon it as being real progress – progress with a capital pee.

On the other hand, what 2016 might really herald might just be The Golden Age of Big Data Bullshit.

Let’s wait and see.

Thank you so much for reading.

Also, if you are of a mind, then please join The Big Data Contrarians on LinkedIn:

First things first. The Big Data Contrarians (“a hype free Agora for Big Data dialogue”) is now a community of over one thousand professionals.

Since its LinkedIn group registration on the 1st of July 2015, the Big Data Contrarians has grown to become, without a shadow of doubt, the nicest, friendliest and most well informed Big Data group that you will ever come across in your entire life.

The Big Data Contrarians is a community of professionals who enjoy talking about data, statistics, analytics, data-centric applications, ideas, opinions and insights.

The Big Data Contrarians is a great place to contrast ideas about data. It is a group that passively encourages discourse. Especially discourse that comes with a touch of humour, a hint of disbelief and a delightful bouquet of subtle cynicism.

Also, no data, analytics or visualisation related subject, however tenuous the relationship might be, is out of bounds. This is a forum by professional adults for professional adults, with all its attendant facets and all that this implies. Indeed, who knows what the next topic of conversation will be on The Big Data Contrarians forum. But here are some ideas:

We may call it Big Data, Smart Data or Small data, but in reality isn’t the only intrinsic quality of data in its being and in its symbolism, if indeed it has any?

If data were a religion would Big Data be a craven image, a sect or a schism?

Does ascribing qualities to data, such as Big, Small or Smart, place us at risk of outdoing the degrees of anthropomorphism of some pet lovers?

To be a Big Data guru, is it necessary to know the difference between Hadoop and Spark?

Did Big Data hype fall off the radar because it’s gone, or did Big Data hype turn ‘pro’?

How do we measure the qualitative and quantitative value of data?

Do we really need The Big Data Contrarians community?

After four weeks of the group’s existence, I wrote a piece for Data Science Central (July 23rd, 2015), in which I itemised some of the reasons why I believed that The Big Data Contrarians group was necessary. Those reasons were:

To alert people to interesting but ultimately dubious Big Data claims

To share lessons learned, good sense and practical data and Big Data principles

I genuinely believe that those reasons are as valid now as they were then. Perhaps even more so.

So, to get back to basics, I will leave you where I started.

The Big Data Contrarians. Is it quite simply the best Big Data community on the whole wild-wild-wild internet, anywhere? Yes, anywhere. The Big Data Contrarians is an amazingly great data community that promotes inclusivity, interaction and coherence in data, statistics, analytics and Big Data discourse.

Anyone who is anyone in Big Data and data is a member of THE BIG DATA CONTRARIANS, either now or in the near future.

You love data. You eat, breathe and sleep data! You source it, clean it, integrate and then analyse it until it confesses. You represent, invent and present results. Data is your life and Big Data is your prophet. The Big Data Big Top is the place to be, and (passively) that is where you are headed. For you, Big Data drives everything we do! Is that the case?

You may have heard of us before, we are the group that others dare not name, but let me go over some of the remarkable benefits, features and side effects of joining in the fun.

As well as giving you access to some of the brightest, most well informed and experienced practitioners in the fields of statistics, analytics, data management, architecture, data governance, data technologies and a plethora of etceteras, there are other amazingly fabulous, exciting and compelling reasons for joining The Big Data Contrarians.

The Big Data Contrarians is the most fascinating data-community in the ‘whole wide world’ – forget what you’ve been told elsewhere, this group is just quite possibly the most sophisticated, friendly and enlightened data and Big Data community that you’ll ever have the privilege of being a part of.

Being part of The Big Data Contrarians lets you tell the world just exactly what you think – If you think the hype has gone way too far, bring the offender down a peg or two, no matter who they are, or what they profess to know. So, do your own thing, without fear or favour. If the pundits and their mates threaten you, do not worry, Rab C Nesbitt knows where they live.

Associating yourself with The Big Data Contrarians places you amongst the Big Data elite – you will stand out not as just any ten a dime fad-follower but as a thought-leader, a professional and a critical-thinker. People will come from far and wide, just to hear what you have to say.

The Big Data Contrarians group is unique and universal – There are many groups with Big Data in their titles, but they are little better than anaemic and poverty-stricken representations of the worst data unemployment centres in the world.

The Big Data Contrarians is a convergence of opinions, ideas and more – but it isn’t a melting pot, neither is it about conformance or value-sapping integration or uniformity. We are all unique and wonderful, and we don’t need to be fed through a cookie cutter.

The Big Data Contrarians can question everything – For example, you don’t like this piece and have a question, then fire away. If we can’t handle the questions then we shouldn’t be creating them in the first place. Just ask away… “Mart, why is Bernice telling Big Data porkies again?”, “Mart, why are Cloudera talking out of their arse?” or “Mart, why do the same babblers like Java, Agile and Big Data hype?”

The Big Data Contrarians is quite possibly the sexiest group on the internet – but that’s not the point, the point is I don’t care, we shouldn’t care, because that sort of talk is for the trashy blogs, the Big Data society dopes and the socially confounded and confronted.

The Big Data Contrarians are not the best – I’m not stating that the Big Data Contrarians are the best, even if we are, but that we should look at this position as one of responsibility as well as power, and take it from there.

The Big Data Contrarians is the nicest smelling data community on the world wide web – We all know that, right? However, what are the wider ramifications for Big Data? I don’t think we need to answer that rhetorical question. When one knows, then one simply knows.

Has that whetted your appetite?

Do you want to know more about the group?

Do you want to learn more about the disciplines we cover?

Do you want to interact with some of the best independent minds in their respective fields?

Want to be an integral part of a group that the Big Data BS babblers, flim-flam artists and snake-oil medicine merchants auto-exclude themselves from?

When I first started The Big Data Contrarians group on LinkedIn I was thinking that maybe we would get 100 members within three or four months. Well, I was mistaken. Since the 1st of July, the membership ranks of The Big Data Contrarians have risen to over 500 members. However, it’s not about the quantity, it’s about the quality, and The Big Data Contrarians is ‘the nicest Big Data community that you are ever likely to encounter in your entire life’. However, many people hesitate before joining us.

I ‘kinda’ know what some people might be thinking. Are they Big Data anarchists? Are they pro or anti-establishment data stirrers? Will membership of The Big Data Contrarians put me on some sort of McCarthyite death-wish list? Will my data look small in this group? Will I ever work again? Will I ever be able to sing like Elvis?

Don’t worry, I assure you that none of the previously mentioned ‘potential sticking points’ have anything to do with The Big Data Contrarians; isn’t that right, Comrade Leon?

Therefore, if you think that this could be the group for you, but are hesitating in your decision, then hesitate no further, and join today. If you find that you do not like it, then you can always leave. It’s not the Hotel California.

So, onwards and upwards.

In order to celebrate the 500 members of the community, I decided to talk a little bit about one of the memes doing the rounds of the Big Data hype circus. “Hello big boy is that big data you have in your pen-drive, or are you just happy to visualise me?” If you like, dress it up, take it out on the town and call it a Rantasaurus Rex. If you use the material mentioned here you will win friends, influence people and break the ice at parties – guaranteed! Now, if that is not a deal that cannot be refused, I don’t know what is.

Many people who are ‘bigging up’ Big Data, but without talking about the tangible, refer to the massive volumes of data that have been created in the past few years as palpable evidence of a great, growing and massively beneficial phenomenon. In the absence of more reliable evidence, we have to suffer the continual exhumation, resuscitation and recycling of Eric Schmidt’s musings on the issue of the data explosion. For example, I seem to vaguely recall a Mister Whatshisname or Mister Whatever, stating on this very same self-publication platform, that in 2010:

<<Eric Schmidt, executive chairman of Google, tells a conference that as much data is now being created every two days, as was created from the beginning of human civilization to the year 2003.>>

Now, this isn’t the first or the last time that Eric has been mentioned in connection with this reference and the context of the massive data explosion. Indeed, if I had a Ben Franklin for every time that Eric Schmidt has been used to justify the Big Data hoopla, then I’d have quite a few greenbacks by now.

Now, call me old fashioned and even cynical, but I do not see what the data generation explosion has to do with anything other than the explosion of the creation of data. It’s not a revolution, it’s an increase in output, and may not be any more of an innate benefit to humankind and the planet than massive landfills filled with trillions of plastic bags, some of them even with “save the planet” and “we’re all in this together” printed on them in toxic inks.

But, what will future generations say about that, and the passing Big Data phenomenon? “Ah, they went a bit crazy with the old Big Data lark, those old timers did, but at least they made throwaway packaging that would last a trillion years. That’s foresight, that is.”

However, I like the latest line of attack used by those who really are over-egging the Big Data and Big Data analytics hype, it’s a pseudo-intellectual rendition of “if Big Data is misunderstood – and YOU clearly don’t understand it – then it’s not because of Big Data, but of your own wilful ignorance. You choose not to believe in Big Data, because you are anchored in the past. That’s where you and your kind are! Yesterday’s men! Oh, yes! You and people like Franklin and his outdated theories of gravity, and that Marx, and his archaic explanations of whatever, or whomever and her outdated theories of all that jazz. Yesterday’s token men and women, entrenched in your passé European logic and ensconced in your modernist liberal studies Bleh! Bleh! Dee! Bleh! Bleh!”

How can one reply to that? Thank you so much Professor for the exposition of that brilliant counter argument. A strong, solid and coherent counter argument to the vain, naïve and self-interested question of “Can we have some tangible, detailed and verifiable Big Data Analytics success stories or case studies, please?”

I find that after asking all of these questions, and yet not getting many or any reasonable or tangible answers, that trying to get a straight answer to a straight question on Big Data and Big Data Analytics is like trying to pin Jell-O to an elephant.

I have issues with the claims that more data is necessarily better data, and therefore massive amounts of data means massive amounts of data benefits; so, I would just like to wrap things up with a quote from alternative comedian Stewart Lee.

“The eighteenth-century polymath Thomas Young was the last person to have read all the books published in his lifetime. That means that he would’ve read all the Shakespeare and all the Greek and Roman classics and all the theology and all the philosophy and all the science. But the same man today, a man who had read all the books published today, would’ve had to have read all Dan Brown’s novels, two volumes of Chris Moyles’ autobiography, The World According to Clarkson by Jeremy Clarkson, The World according to Clarkson II by Jeremy Clarkson, The World according to Clarkson III by Jeremy Clarkson… his mind would be awash with bad metaphors and unsustainable, reactionary opinion; one long anecdote about the time that Comedy Dave put pound coins in the urinal. In short, the man who had read everything published today would be more stupid than a man who had read nothing. That’s not a good state of affairs.”

Which is pretty much what I think about all of the gratuitous and promiscuous hype surrounding Big Data and Big Data Analytics. There is so much rich data that still has not been exploited, and so much more significant data that we could be capturing (much of it of more humanitarian, social and scientific value than commercial) yet we are now cajoled into dedicating significant efforts and resources on harvesting commercial junk data of potentially marginal or no value.

Many thanks for reading.

If you enjoy this piece or find it useful then please consider joining The Big Data Contrarians:

In subsequent blog pieces I will be sharing my views on the evolution of information management in general, and the incorporation of novel and innovative techniques, technologies and methods into well architected mainstream information supply frameworks, for primarily strategic and tactical objectives.

As always, please reach out and share your questions, views and criticisms on this piece using the comment box below. I frequently write about strategy, organisational, leadership and information technology topics, trends and tendencies. You are more than welcome to keep up with my posts by clicking the ‘Follow’ link and perhaps you will even consider sending me a LinkedIn invite if you feel our data interests coincide. Also feel free to connect via Twitter, Facebook and the Cambriano Energy website.

Big Data, together with Cloud computing, the Internet of Things and Machine Learning, are topics that are very much to the fore in contemporary trends in Information Management. But is Big Data really the revolution that people have been waiting for or is it simply about the next steps in the evolution of business data architecture and management?

Whilst it is true that we are increasingly generating more and more data, do we really need to rethink our whole approach to data demand and data supply, or can we build on the best of what we have already accomplished in order to move with intelligence, rigour and stability to the next level?

We are told that all organisations, small, medium, large and gigantic, will be overwhelmed with an existential threat if they don’t recognise and act on the fundamental need to leverage all data – massive data, in all formats, all volumes and all qualities. From wherever it may come, legally or illegally.

But is this really the case? Are we being urged to action by caring, principled and disinterested good data Samaritans? Or are we being railroaded into a shallow, anti-intellectual and mass-reactionary fervour that will run ultimately against the grain of our own best interests? Whether as individuals or as stakeholders.

I am not entirely convinced by the breadth or depth or intellectual veracity of the claims for Big Data, which frequently border on the hyperbolic and the absurd, but I also think it is wise to be forearmed as well as forewarned.

That is why I am encouraging a discussion about where Big Data and Data Science fit into a much larger picture of business information supply and demand.

To begin at the beginning

There are many dimensions to business data and information, and here I will touch on some of those dimensions, albeit with a broad brush.

There are also many potential sources of data; these can be divided into two broad dimensions: analogue data and digital data.

Analogue data – Any type of data not stored in a digital format. This covers the data found in analogue form. For the purpose of this piece this dimension is overlooked.

Digital data – Any type of data stored in a digital format. The source may be a conventional database, a log file, an audit trail, a spreadsheet, document, presentation, project plan or other similar source.

What is being contemplated is data that could help us to interpret history; manage the present; and, plan for the future.

Past – What happened and what can we learn from it?

Present – What is happening and what, if anything, should we do differently?

Future – What might or might not happen and what might we do or not do if this happens, or to prevent this from happening?

If the data in focus cannot help us with any of these dimensions does that data really matter?

The third dimension of data that I want to identify concerns place (internal and external):

Internal data – This is the information that belongs to the organisation and is obtained from internal sources, such as operational systems.

External data – This is the data from external sources, such as market data providers, that an organisation can legally and legitimately acquire and use.

And finally, for the purpose of brevity, the dimensions of demand and supply:

Demand driven data – This is when business is the driver for the flow of information. Data is delivered because there is a business demand. In simple terms this is about giving people what they want, continuously.

Supply driven data – This is about delivering data and information based upon what is available. If done on the right scale and with the aim of creating new internal markets for data, then it can be beneficial. This is where IT needs to be far more sales, marketing and Ad-land oriented.
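As a minimal sketch, the dimensions described above could be captured in a small Python model. All of the names used here (`DataSource`, `Form`, `Horizon`, `Place`, `Driver` and the `matters` check) are illustrative assumptions of mine, not part of any formal framework definition, and the two example sources are invented purely for the sake of the illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Form(Enum):
    ANALOGUE = "analogue"
    DIGITAL = "digital"


class Horizon(Enum):
    PAST = "past"        # what happened, and what can we learn from it?
    PRESENT = "present"  # what is happening, and what should we do differently?
    FUTURE = "future"    # what might happen, and how might we respond?


class Place(Enum):
    INTERNAL = "internal"
    EXTERNAL = "external"


class Driver(Enum):
    DEMAND = "demand"
    SUPPLY = "supply"


@dataclass(frozen=True)
class DataSource:
    """One data source, tagged along the four dimensions (hypothetical model)."""
    name: str
    form: Form
    place: Place
    driver: Driver
    horizons: frozenset = field(default_factory=frozenset)

    def matters(self) -> bool:
        # A source that informs neither past, present nor future is suspect.
        return len(self.horizons) > 0


# Invented examples, for illustration only.
sources = [
    DataSource("sales ledger", Form.DIGITAL, Place.INTERNAL, Driver.DEMAND,
               frozenset({Horizon.PAST, Horizon.PRESENT})),
    DataSource("scraped chatter", Form.DIGITAL, Place.EXTERNAL, Driver.SUPPLY),
]

worth_keeping = [s.name for s in sources if s.matters()]
```

Tagging each source in this way makes the question of whether the data really matters something that can at least be asked systematically, even if the answer remains a business judgement.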

But what do all of these considerations mean to a manager’s need for adequate, appropriate and timely information? Let’s take a look.

The following diagram illustrates the overall drivers for the Information Supply Framework.

From the right, the consumers and prospective consumers of data and information create the demand for information and data.

The middle process of ‘data processing, enrichment and information creation’ strives to meet the business demands and also ‘provokes’ business demands for data and information.

From the left the data sources provide data to meet real and secondary demands.
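The interplay of demand and supply described above can be sketched, very roughly, as simple set arithmetic. This is only an illustration under my own assumptions: the function name `plan_supply`, the three labels it returns and the example topics are all hypothetical.

```python
def plan_supply(demanded: set, available: set) -> dict:
    """Compare what consumers ask for with what the sources can provide."""
    return {
        # Demand that existing sources can meet right away.
        "deliver": sorted(demanded & available),
        # Demand with no source behind it yet.
        "unmet": sorted(demanded - available),
        # Data nobody has asked for, which might 'provoke' new demand.
        "promote": sorted(available - demanded),
    }


# Invented example topics, for illustration only.
plan = plan_supply(
    demanded={"sales", "churn"},
    available={"sales", "web logs"},
)
```

The "promote" bucket is where supply driven data lives: it is available, nobody has asked for it, and it will only create value if someone goes out and sells it internally.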

So how do we ‘make it happen’?

How do we try and obtain the business advantages supposedly accruable to Big Data without succumbing to yet another anarchic information management aberration that will take us one step forward, and two steps back?

Let me explain.

Is this the sort of diagram that gives you a warm and fuzzy feeling?

Of course not, it’s a mess. Print it out on a plotter and place it on the office wall, by all means. You can even call it art, but it’s also a recipe for data bedlam hell.

Now contrast that colourful fun-loving anarchical approach to something I prepared earlier.

There, isn’t that better?

It’s clean, cohesive and coherent. It’s a rational, structurally sound and practical model to support the business demands for data and for the speculative supply of data. The model is based on unifying strands of business data requirements (with specific regard to operational, tactical and strategic aspects) into an integrated, agile and flexible whole.

In short, it is a coalescing model for business data and information.

That’s all folks

Is it absolutely necessary in this day and age to take a maverick, piecemeal and DIY approach to the provision of non-traditional business data and information?

No, evidently it’s not.

There is no rational business reason to use speculative, quick and dirty – and characteristically ‘throw away’ approaches to data supply, when quick, clean, useful, maintainable, cost effective and usable alternatives are available.

The Information Supply Framework provides a strategic approach to meeting the data and information needs of an organisation. It accurately reflects business process, it can change in step with the changing needs of the organisation, and it is capable of handling the emerging requirements associated with ever increasing data volumes, data variety and data velocity.

Simply stated, the DW 3.0 Information Supply Framework provides a nascent opportunity for businesses to easily transition to the inclusion of more formal and rigorous methods of statistical analysis, analysis that can now be powered by the inclusion of new, alternative and complementary sources of data.

Naturally the choice is yours. But ensure that you weigh up your options with your ‘eyes wide open’, or it may well all end in tears.

For your amusement, delectable enjoyment and delight, I bring you the first in a series of Big Data Quizzes from The Big Data Contrarians – the nicest, most civilised and congenial Big Data community on the entire World Wide Web.

So here’s today’s top 20 Big Data Bafflers

Question 1: The British all-round politician, writer and good-egg Denis Healey is on record as stating in the British Parliament “I think we have all enjoyed another lugubrious concatenation of meaningless clichés from…” and went on to name the target of his acerbity. Of course, he could have been talking about any Big Data bullshit babbler, but on this occasion who was the unlucky target of his cutting wit? Was it:

Dolly McClonic?

Geoffrey Howe?

Clark Stanley?

Mother Teresa of Calcutta?

Question 2: The original idea for the remarkable yellow-Hadoop-elephant came from where? Was it:

In praise of the size of the Java King’s ego?

Based on a character from Doctor Goebbels’ ABC?

Inspired by a child’s toy elephant? (How contrived and obviously false is that?)

A tribute to all the IT jobs that have been offshored since the birth of Java?

Question 3: Data was obviously invented. It stands to reason that something so awesome couldn’t have just materialized. However, when and who made it so? Was it:

In 1998 when Larry Page and Sergey Brin invented data at Google?

God herself, in a fit of self-deprecating humour, possibly around 1st April 1956 BC (Before Computers)?

Bill Inmon and Ralph Kimball after a night on the razzle-dazzle at a 1993 Florida Data Warehouse conference?

A. N. Other, the given-name used on the tomb of the unknown Big Data warrior, circa 1066 BD (before digital)?

Question 4: Who not only conclusively proved the existence of water on Mars (using the content of old Google web logs), but discovered a completely awesome data lake that the amazing Martians were using for ‘absolutely fabulous’ inter-galactic brand sentiment analysis? Was it:

Jastor Gollusty the Dangerous?

NASA’s Michael Meyer?

Cherry Coke?

Martes y Trece?

Question 5: After a secret poll of Incredible Big Data Gurus what remarkable adjective was thought to best describe the term Big Data? Is it…

Dopey?

Awesome?

Amazing?

Bullshit?

Question 6: The amazing 3Vs from Doug Laney are very handy when it comes to describing data in terms of the absolutely key aspects of volumes, velocities and varieties. However, what is everyone’s favourite 4th V? Is it…

Vacantness?

Vagueness?

Vacuity?

Voracity?

Question 7: What term best describes those who insist on expanding Doug Laney’s 3 Vs of data ad infinitum? Is it…

Enlightened?

Obnoxious?

Open minded?

Obtuse?

Question 8: Before the advent of amazing Big Data we were almost as dumb as rocks, however what archaic artefactual gadget did we have at our disposal to help us to formulate strategy? Was it…

A crystal ball?

A box of PG Tips?

A kit for the casting of sortes?

A data warehouse?

Question 9: Big Data analytics are famous and notorious for the inherent capability that they are gifted with. Nothing surpasses Big Data science when it comes to the amazing magic of being able to spot emerging viral socially-mobile content even before its creation – nay, even before it has been thought of. But, joviality aside, who was the first person to go viral with that astonishingly famous Brit Pop anthem ‘Greensleeves’? Was it…

Oasis?

Justin Beaver?

Henry Tudor?

Lucky Charms?

Question 10: Many people talk about the highest value of Big Data being the gold standard, but what is the generally accepted principle of Big Data? Is it…

From each according to their ability to engage with brands, from each according to their ability to pay for crap they don’t need, and even less understand?

The meek shall inherit the world of social media, but the rich shall inherit all the Big Data rights?

It is not the consciousness of the brand-botherer that determines their social-media existence but their social-media existence that determines their consciousness?

It doesn’t matter how much Big Data one has, it never seems to be enough?

Question 11: According to an unverified and unofficial UN report, wrongly ascribed to the UNFCCC, what Big Data app has created the most benefit for humankind? Is it…

Identifying playful kittens in YouTube videos?

The ‘Help yourself to our money’ app?

The Big Data Contrarians quiz?

The app that drowns micro-nations in the watery care, sentimentality and love of the empathic first world?

Question 12: It is clearly obvious to the trained eye that Big Data contains nuggets of 24 karat gold, but what data has the highest content value of all? Is it…

Social media chit-chat?

Web logs of inter-species brand engagement?

Operational databases? (Boo! Down with the data aristos!)

WikiLeaks and Snowden leaks?

Question 13: Using Big Data and a graph model database engine, a group of fervent and disinterested Data Scientists were able to uncover previously hidden geo-political connections between insignificant leaders and their roles and responsibilities in ‘difficult times’. Who were those enigmatic leaders? Were they…

The painter and decorator, the ice cream vendor and the pint-sized fascist with the squeaky voice?

Itchy, Scratchy and Tom & Jerry?

JR, Bobby and Pame?

Dubya, Tony and Josemar?

Question 14: Data Scientists have proved time and time again that Big Data can be turned into amazing 24 karat Big Gold, but how is this achieved? Is it with…

High-mass and transubstantiation?

Alchemy?

A proletarian revolution?

Class A drugs?

Question 15: As we all know, Statistics is an artificial movement of belligerent troublemakers specifically created as a reactionary counter-movement to the eminently respectable field of Data Science and the Gentlemen Amateur Data Scientists exercising their favourite hobby within it. In which country did the anarchical group of Statisticians first raise their banner? Was it in…

England?

France?

Switzerland?

Canada?

Question 16: Apart from curing the incurable, creating strategies to end world hunger and fixing the common cold, what other feat of massive Nobel-inspired awesomeness has Big Data also been responsible for? Is it…

Non-violence. Bringing about world peace?

Reducing the production and emission of CO2 and equivalent gasses into the atmosphere to acceptable levels that can be monitored and validated by the UNFCCC?

Helping to shift valuable and necessary resources to any part of the world where they are most needed?

Something else. Solving the critical world shortage of amazingly cheap crap that no-one really wants or needs or can even use for any productive purpose?

Question 17: Who is rightly (or wrongly) attributed as saying “Big Data is negative and dialectical, because it resolves the determinations of the understanding of things into nothings”? Was it…

Thomas Davenport?

Benny Hill?

Dawn French?

Georg Hegel?

Question 18: If Big Data was cuisine, in what type of restaurant could you quite possibly find it on offer? Is it…

Restaurante Martín Berasategui in Lasarte-Oria?

At the Hard Rock Café?

In an ‘all the fast-food you can lift’ free buffet?

At a themed eatery designed along the lines of La Grande Bouffe?

Question 19: Regardless of the actual outcomes, what is the most common anti-business business imperative for embracing Big Data technologies? Is it…

Fear?

Uncertainty?

Doubt?

Emotional blackmail?

Question 20: Forget amazing Big Data for one moment. When considering a capital IT purchase, whom do you suspect the least? Is it…

The manufacturer?

The provider?

Industry pundits?

Your Mom?

How to calculate your score?

Loosely speaking, and seen in the grand scheme of things, there aren’t any right and wrong answers in this quiz. Gain maximum points by simply contemplating a question or two. If you did more than that, then consider it a ‘more perfect’ result.

As a child, I had a great love of stories of Spain, of the idea of travelling through the Iberian Peninsula and of mastering, and not just learning, the classical Spanish guitar. One of the phrases that stuck with me from those days was the unattributable quote, “amateurs practice until they get it right; professionals practice until they can’t get it wrong.”

In my professional working life, I have striven to identify those things that I want to be sufficiently competent at doing and those things that I consider a fundamental part of my professional competence, and then to make a clear distinction between the two.

As many of those who know me will know, a significant part of my professional life has been dedicated to the architecture and management of data, information and structured intellectual capital. Therefore, in the light of this fact and with reference to the previous bit of whimsical fancy, I will address the following question posed to me some time ago: What makes a great Data Architect, truly great?

What follows is by no means an exhaustive list of essential elements, but it should give you a flavour of what a great Data Architect is.

ONE – Establish a clear, cohesive and communicable idea of the theoretical, technical, philosophical and practical nature of data and information. Learn it inside, out, upside down and back to front. Then learn it well.

Put it another way. A great Data Architect should be able to answer the question “what is data?” from almost any viewpoint and then be able to give a simple, precise and understandable reply.

The internet abounds with content on ‘data’ and ‘information’. You may be familiar with the way Wikipedia describes ‘data’, and you may even agree with it, even though it is (in its current form – 3/8/2015[1]) a naïve, sloppy and circular definition, which only serves as an example of how not to define data.

TWO – Know your audiences, understand their motivations, have empathy with them, and develop a keen ability to spot what the audience wants and then sell that back to them as if it were their own idea.

One of the greatest architects of the 20th century, Ludwig Mies van der Rohe, had this to say about the relationship between an architect and a client: “Never talk to a client about architecture. Talk to him about his children. That is simply good politics. He will not understand what you have to say about architecture most of the time. An architect of ability should be able to tell a client what he wants. Most of the time a client never knows what he wants.”

THREE – Learn to communicate clearly, simply and effectively, and remember who the most important members of your audience are at any given moment and speak mainly to them.

The mantra of ‘keep it simple’ is what separates a great Data Architect from the swathe of sycophantic worriers, software train-spotters and smart-ass wannabes that make up much of the world of IT. So do not even try to appeal to that segment, they don’t matter. Speak to those that do matter.

The job of the Data Architect is not to impress their colleagues, get likes on Facebook or be the manager’s pet. A great Data Architect uses language that is appropriate for the occasion, not to flaunt their extensive knowledge and experience but to communicate ideas, concepts and architectures in the language and manner that the listener can immediately grasp. A Data Architect who aspires to greatness does not need to prove themselves to their peers, but just needs to strive to be a true professional, and the greatness will come along in its own good time.

The eighteenth century English theologian, dissenter, philosopher and scientist Joseph Priestley wrote, “The more elaborate our means of communication, the less we communicate”. With such influences in mind I try to encourage my team members and other collaborators to use appropriate channels of communication, and one of the ways I get this message across is with a list of options. I find that doing this early on can help to really simplify things and bring a greater degree of clarity to the table. However, as with many other aspects of life, one has to be flexible and realistic with this approach too, and allow for the choice of the most appropriate option according to the circumstances. My preference list is:

Face-to-face

Video conference

Telephone/mobile

Post-it note – or similar

Texting/SMS/Wassup

Email

Smoke signals

Social Media

FOUR – Be a great listener. Data Architects must nurture and hone effective listening skills; otherwise, they place themselves at a serious disadvantage.

Here are the four listening aspects that a Data Architect should aspire to dominate:

Cultivate a self-awareness of the importance of listening.

Understand the barriers to listening and learn how to overcome them.

Identify poor listening habits and practices that you have adopted – ask people about how they see your listening skills.

Improve your own responsive listening skills.

Take this as an open-ended continuous improvement programme.

Put it this way, as a leader you might be the most amazing talker this side of the Rockies, but if you can’t listen effectively then it would be like Nadal, Federer or Djokovic, having a great world-class tennis serve, but with a cultivated inability to accurately read the play or to return any difficult shot.

FIVE – Understand how data is generated; why it is generated; who or what triggers the generation; how it flows; how it is used; who uses it and why. Understand the life-cycles of data and information.

A great Data Architect must understand the public and private life of data before actually trying to do anything with it.

I’ll cut to the chase on this topic and leave you with a comment on The Social Life of Information by John Seely Brown and Paul Duguid.

“To see the future we can build with information technology, we must look beyond mere information to the social context that creates and gives meaning to it. For years, pundits have predicted that information technology will obliterate the need for almost everything—from travel to supermarkets to business organizations to social life itself. Individual users, however, tend to be more sceptical. Beaten down by info-glut and exasperated by computer systems fraught with software crashes, viruses, and unintelligible error messages, they find it hard to get a fix on the true potential of the digital revolution.”

That’s just another indication of what we have to learn to avoid.

On the up side, “The Social Life of Information gives us an optimistic look beyond the simplicities of information and individuals. It shows how a better understanding of the contribution that communities, organizations, and institutions make to learning, working and innovating can lead to the richest possible use of technology in our work and everyday lives.”

SIX – Get a great understanding of all the data oriented vices and bad data architecture practice that goes on in the IT application world, and most especially in the web application-development world.

Some of the most atrocious examples of bad data architecture, engineering and management are in web applications. Learn from them, and learn how not to repeat such gross and wilful incompetence in your own Data Architecture work. Look at them as extreme examples of lessons learned, i.e. how not to do it.

SEVEN – Cultivate a well-developed sixth sense for the appreciation of the intrinsic values of data and information.

No, I am not arguing the case for the idea that all data has value. That extreme notion is clearly absurd, but fortunately it is one that has limited adherence. However, I am saying that we should develop a ‘nose’ for understanding what data could be of value, and for measuring, in qualitative and pseudo-quantitative terms, what that value actually represents.

I would also encourage people to check out the Wikipedia piece on Infonomics (URL: https://en.wikipedia.org/wiki/Infonomics), a term coined by Gartner’s Doug Laney and based on work carried out at Bill Inmon’s Prism Solutions, which incidentally is one of my former employers.

Here’s a snippet:

“Infonomics is the theory, study and discipline of asserting economic significance to information. It provides the framework for businesses to value, manage and wield information as a real asset. Infonomics endeavors to apply both economic and asset management principles and practices to the valuation, handling and deployment of information assets.”

When you are a Data Architect, you should really be aware of such stuff, and at least be able to carry out a reasonable conversation about it.

EIGHT – Strive to be the best of all data modellers you are ever going to meet in your entire life.

I say that I’m a lean data modeller. What does that mean?

The first things I model are the data flows.

Then I will create the conceptual, logical and physical models.

Then I will repeat until I get consensus, or until I become the Data Dictator – this usually occurs when the Portfolio Director demands closure and delivery.

Simples!

Nevertheless, not so fast. You will also need to know how to design physical data models for OLTP as well as for Enterprise Data Warehousing, and no, they are not the same, even if they are similar in many aspects.

Not only will a great Data Architect have polished skills in the art of data modelling according to the divine tenets of Codd and Date and later extended by blasphemers and acolytes alike, but they must also be comfortable designing dimensional models.
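To make the difference between a normalised model and a dimensional one a little more concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. Every table and column name in it (customer, sales_order, dim_customer, dim_date, fact_sales) is hypothetical and invented purely for illustration; it is a toy under stated assumptions, not a recipe for a real warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Normalised, OLTP-style tables (hypothetical names): each fact is stored
# once, which suits many small inserts and updates.
cur.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    country     TEXT NOT NULL
);
CREATE TABLE sales_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    order_date  TEXT NOT NULL,
    amount      REAL NOT NULL
);
""")
cur.executemany("INSERT INTO customer VALUES (?, ?, ?)",
                [(1, "Acme", "ES"), (2, "Globex", "UK")])
cur.executemany("INSERT INTO sales_order VALUES (?, ?, ?, ?)",
                [(10, 1, "2015-08-01", 100.0),
                 (11, 1, "2015-08-02", 50.0),
                 (12, 2, "2015-08-02", 75.0)])

# Dimensional (star) style, again with hypothetical names: one fact table
# of measures, keyed to descriptive dimension tables, which suits
# read-heavy slicing and aggregation.
cur.executescript("""
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    country      TEXT NOT NULL
);
CREATE TABLE dim_date (
    date_key  INTEGER PRIMARY KEY,  -- e.g. 20150801
    full_date TEXT NOT NULL,
    month     TEXT NOT NULL         -- e.g. 2015-08
);
CREATE TABLE fact_sales (
    customer_key INTEGER NOT NULL REFERENCES dim_customer(customer_key),
    date_key     INTEGER NOT NULL REFERENCES dim_date(date_key),
    amount       REAL NOT NULL
);
""")

# A deliberately naive load from the normalised tables into the star.
cur.execute("INSERT INTO dim_customer "
            "SELECT customer_id, name, country FROM customer")
cur.execute("""
INSERT INTO dim_date
SELECT DISTINCT CAST(REPLACE(order_date, '-', '') AS INTEGER),
       order_date,
       SUBSTR(order_date, 1, 7)
FROM sales_order
""")
cur.execute("""
INSERT INTO fact_sales
SELECT customer_id,
       CAST(REPLACE(order_date, '-', '') AS INTEGER),
       amount
FROM sales_order
""")

# A typical analytical question: total sales by customer country.
sales_by_country = cur.execute("""
SELECT c.country, SUM(f.amount)
FROM fact_sales AS f
JOIN dim_customer AS c ON c.customer_key = f.customer_key
GROUP BY c.country
ORDER BY c.country
""").fetchall()

date_rows = cur.execute("SELECT COUNT(*) FROM dim_date").fetchone()[0]
```

The same business facts sit in both shapes; what changes is which kind of workload each shape makes easy. A real design would also have to deal with surrogate keys, history and conformed dimensions, none of which this sketch attempts.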

Some other models that will separate the competent from the great Data Architect would be working knowledge of the Hierarchical database model; the Network model; the Object model; the Document model; and the Entity–attribute–value model. It would also be of interest to have a passing acquaintance with the Inverted index; flat file usage; the Associative model; the Multidimensional model; the Multivalue model; the Semantic model; the XML database; the Named graph; and the Triplestore. Knowing stuff about stuff like this is where the killer skills differentiator comes into play.
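As one small illustration from that list, the Entity–attribute–value model can be sketched in a few lines of Python. The entity identifiers, attribute names and the pivot function below are all hypothetical, made up for this example; the point is only to show the shape of the model, not how any particular product implements it.

```python
from collections import defaultdict

# Hypothetical EAV rows: one (entity, attribute, value) triple per fact.
eav_rows = [
    ("customer:1", "name", "Acme"),
    ("customer:1", "country", "ES"),
    ("customer:2", "name", "Globex"),
    ("customer:2", "vat_number", "GB123"),
]


def pivot(rows):
    """Fold EAV triples into one dictionary of attributes per entity."""
    entities = defaultdict(dict)
    for entity, attribute, value in rows:
        entities[entity][attribute] = value
    return dict(entities)


records = pivot(eav_rows)
```

Notice that the two customers end up with different sets of attributes. That flexibility is the attraction of the model, and the lack of a fixed, enforceable structure is the price paid for it.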

I have been fortunate in that I can name some of the greatest data people of all time as my own personal mentors, and I appreciate that for many, well everyone now, this is not an option. However, there are ways and ways.

There is some great material out there about data modelling; unfortunately, there is an awful lot of crap as well. If you are unsure how to differentiate, then ask an expert. There are a number of data experts commenting on the data related groups on LinkedIn.

In the old days it was quite easy to spot a data pro – slightly dishevelled look, tweed jacket, patches on the sleeves and a pipe, matches and tobacco in one of the pockets, Doctor Watson style, etc. – but now, in the virtual and aseptic worlds, it’s not so obvious who is who. What a pity those days have passed, but such is life.

Lastly, consider this quote from Ove Arup. “Engineering is not a science. Science studies particular events to find general laws. Engineering design makes use of the laws to solve particular practical problems. In this it is more closely related to art or craft.”

NINE – Understand the database and data related technologies and products out there, and the pros and cons of using them. Also, strive to be technology agnostic.

This is probably the one aspect of the life of the Data Architect that most people will be familiar with… the tools and technologies. Probably for this reason alone there are recruiting agencies that cannot tell the difference between a technology product and the entire vast field of data architecture and management, or between knowing the version of a piece of software and knowing how to competently manage the Data Architecture of a global business.

Nonetheless, it’s good to have a grasp of the vast array of data related technologies and products out there, and to keep that knowledge as up to date as possible.

Therefore, this list is more for the aspiring Data Architect than for the experienced professional. Nevertheless, make sure you have a handle on these:

Please also note that there is a surfeit of data products in addition to those mentioned or referenced above.

TEN – Absolutely dominate the subject of Data Governance. Make Data Governance one of your master subjects, and be ready to bring it into play at a moment’s notice.

Take heed of the wise words of Sun Tzu: “If you know your enemies and know yourself, you will not be imperiled in a hundred battles… if you do not know your enemies nor yourself, you will be imperiled in every single battle.”

The DAMA Dictionary of Data Management defines Data Governance as “The exercise of authority, control and shared decision making (planning, monitoring and enforcement) over the management of data assets.” DAMA has identified 10 major functions of Data Management in the DAMA-DMBOK (Data Management Body of Knowledge). Data Governance is identified as the core component of Data Management, tying together the other 9 disciplines, such as Data Architecture Management, Data Quality Management, Reference & Master Data Management, etc., as shown in the following diagram:

Whilst we are at it, I would encourage everyone with a professional interest in Data Architecture to check out ‘Data Architecture: A Primer for the Data Scientist’. This is a bit of blurb from the Amazon site:
“Today, the world is trying to create and educate data scientists because of the phenomenon of Big Data. And everyone is looking deeply into this technology. But no one is looking at the larger architectural picture of how Big Data needs to fit within the existing systems (data warehousing systems). Taking a look at the larger picture into which Big Data fits gives the data scientist the necessary context for how pieces of the puzzle should fit together. Most references on Big Data look at only one tiny part of a much larger whole. Until data gathered can be put into an existing framework or architecture it can’t be used to its full potential. Data Architecture a Primer for the Data Scientist addresses the larger architectural picture of how Big Data fits with the existing information infrastructure, an essential topic for the data scientist.”

That’s all folks

Now, the clock on the wall is really telling me that I should wrap up this baby, warts and all, accidents, omissions and typos included, and put it to bed.

This is far from being an exhaustive list of the things which a Data Architect should cultivate, hone and excel in. And yes, I know I ‘missed a bit, there’ as well. And yes, I know I started a new sentence with an ‘And’, and, and, and. And yes I… But, anyways… hey ho! upwards and onwards!

Nonetheless, I hope this little piece was informative or entertaining, or even both. At some level of abstraction or another.

If you spot any glaring errors in this piece then please let me know in the comments section below and I will revise as necessary. Thanks in advance for that.

I will leave you with the words of one of my favourite contemporary architects, Zaha Hadid**:

“I started out trying to create buildings that would sparkle like isolated jewels; now I want them to connect, to form a new kind of landscape, to flow together with contemporary cities and the lives of their peoples”

Many thanks for reading.

In subsequent blog pieces I will be sharing my views on the evolution of information management in general, and the incorporation of novel and innovative techniques, technologies and methods into well architected mainstream information supply frameworks, for primarily strategic and tactical objectives.

As always, please reach out and share your questions, views and criticisms on this piece using the comment box below. I frequently write about strategy, organisation, leadership and information technology topics, trends and tendencies. You are more than welcome to keep up with my posts by clicking the ‘Follow’ link, and perhaps you will even consider sending me a LinkedIn invite if you feel our data interests coincide. Also feel free to connect via Twitter, Facebook and the Cambriano Energy website.