However, if you’re like me, hearing anybody repeat that curly fries data point as fact likely sends a shiver down your spine. It’s not that it’s not true (it very well might be), but that it’s nearly useless information without more background.

That’s right, the old correlation versus causation argument is front and center once again. In the entire big data world, it’s probably the biggest fallacy there is, no matter how you look at it. No, getting value from big data doesn’t always require giving greater credence to correlation than causation. And, no, relying on correlation isn’t inherently an ethically or scientifically questionable practice.

Really, the choice between relying on correlation and striving to find causation probably depends on what you’re trying to do.

You visit my site, my system sees you’re using a Mac (or that you like curly fries, or any other attribute it can associate with you) and it shows you content that it thinks you’ll want to see. It’s not a perfect approach, but it’s probably a far cry better than the old method of just showing everybody the exact same content.

And when you’re collecting potentially petabytes of user data and trying to serve ads in near real time, strong correlations might be about the best things you can hope to find. It’s a volume-and-velocity business, and heavy examinations of why any two (or more) things are related to one another might not always provide a high return on investment.

A more extreme example of when correlations might suffice would be something like machine-to-machine systems that need to make decisions in real time in order to prevent disasters. The people charged with running these systems might not know why a certain series of events often precedes a particular outcome, but when it does, acting on the pattern is better safe than sorry.

Many of the reasons for not acting on correlations alone are based on privacy and a whole collection of civil, constitutional and human rights. You simply can’t, for example, profile and then arrest people based on what their Likes suggest they might be. You probably shouldn’t make decisions about people’s financial, health or general well-being based on mere correlations, either.

Heck, I wouldn’t even serve ads that delve into personal information such as health, sexual orientation or intelligence without a very strong reason to believe I was accurate (and express consent to serve those ads). And the Facebook-curly-fries study is full of correlations that could be potential landmines, a small portion of which are visible in the chart below.

More correlations from the “curly fries” study. Source: Proceedings of the National Academy of Sciences

But these are all situations where the fear of incorrectly profiling someone occasionally — and being sued as a result — might overpower the desire to do good most of the time. The data Darwinism that my colleague Om Malik wrote about recently extends beyond just peer reviews and social-media ratings, and one shouldn’t take the role of playing God (or catalyst for evolutionary change, to continue the Darwin metaphor) lightly.

Sometimes, though, correlations aren’t enough because you really want to solve a problem or perhaps build a great product. As Gourley explained at Structure: Data, even using correlative data to predict insurgent attacks in a place like Iraq is relatively easy, but predicting the likelihood of events doesn’t stop them. Stopping them requires really understanding and addressing the root causes of the attacks.

So feel free to try selling a documentary about Dostoevsky to the next guy you see eating curly fries, but don’t expect him to care. It might be that there’s some strong connection between curly fries and intelligence; of course, it might also be that intelligent people, entirely coincidentally, tend to live within walking distance of an Arby’s. But no one has asked about that.