Wednesday, February 11

The Psychology Of Facebook Can Get A Little Bit Crazy

As much as marketers hold on to hope for the promised land of big data — one algorithm to rule them all and in the darkness bind them — the information they covet remains convoluted. Big data can't crack what consumers don't share because algorithms play by the program rules and people never do.

One such study making the rounds even proves the point in its attempt to demonstrate the opposite. Despite headline-capturing claims that Facebook "likes" can assess your personality as accurately as your spouse can (and better than your friends can), most people were miffed when they accepted an open invitation to take the algorithm for a test run. It seems that results vary.

The algorithm developed by Michal Kosinski at the Stanford University Computer Science Department, for example, pinned me down as a 25-year-old single female who is unsatisfied with life (among other things). And it wasn't the only data fail; friends who tried it fared no better. The model missed and missed and missed. There are reasons why, with mine being the easiest to decipher.

My personal usage of Facebook is best described as treating a few minutes out of every day as casual Friday. My connections are mostly limited to friends, family, and long-time online acquaintances. My principal activities include catching up with what they are doing, sharing stories about my children, and posting the occasional baked-goods picture. Why? Because I don't really do that anywhere else.

I also make a conscious effort to avoid controversy, not because I'm "agreeable" but because that social network isn't a place where I want to invite deep discussion, debate, or any drama. In sum, only a sliver of my personality comes across on Facebook. For others, I'm told, the assessments are wrong for a different reason. Not everyone is completely honest on Facebook, not all profiles are complete, and people "like" pages and things for reasons you would never expect.

Why big data models miss the mark with psychological stereotypes.

Beyond the most obvious — that any algorithm is only as good as the input it is allowed to compile — there is always unexpected trouble when stereotypes are introduced into a psychological test. According to the aforementioned model, the algorithm assumes people who like "Snooki" or "Beer Pong" are outgoing and people who like "Doctor Who" and "Wikipedia" are not. Men who like "Wicked, The Musical" were defined as more likely to be homosexual and those liking "WWE" or "Bruce Lee" were not. Those who like "the Bible" are said to be more cooperative while those who like "Atheism" are competitive. And so on, and so forth.

Says who? Says some of the data that came from the myPersonality project designed by David Stillwell, deputy director of the Psychometrics Centre at the University of Cambridge. Between 2008 and 2012, myPersonality users agreed to take a survey, which asked participants about their personal details and personality traits. Their answers were then assigned to buckets such as openness, conscientiousness, extroversion, agreeableness and emotional stability. (The new test delivers those results too.)
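The scoring described above can be imagined as a weighted tally: each liked page nudges one or more of the five trait buckets up or down. The sketch below is purely illustrative; the page names come from the article, but the weights, the hand-coded rule table, and the function names are invented here. The real model was fit statistically over survey data, not written as rules.

```python
# Toy sketch of a likes-to-traits scorer. The weights are invented for
# illustration; the actual model was derived by fitting survey results,
# not by hand-coding rules like these.

TRAITS = ("openness", "conscientiousness", "extroversion",
          "agreeableness", "emotional stability")

# Hypothetical per-page trait weights (positive = more of the trait).
LIKE_WEIGHTS = {
    "Snooki":     {"extroversion": +1.0},
    "Beer Pong":  {"extroversion": +1.0},
    "Doctor Who": {"extroversion": -1.0},
    "Wikipedia":  {"extroversion": -1.0},
    "the Bible":  {"agreeableness": +1.0},
    "Atheism":    {"agreeableness": -1.0},
}

def score_profile(likes):
    """Sum each liked page's weights into the five trait buckets."""
    scores = {trait: 0.0 for trait in TRAITS}
    for page in likes:
        for trait, weight in LIKE_WEIGHTS.get(page, {}).items():
            scores[trait] += weight
    return scores

# By this model's lights, a quiet, cooperative reader -- whether or not
# that is true of the actual person behind the profile.
profile = score_profile(["Doctor Who", "Wikipedia", "the Bible"])
```

Note what the sketch makes plain: a page the table has never seen contributes nothing, and a page liked ironically counts exactly the same as one liked sincerely.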

But no matter how those results are derived, the best an algorithm can do is capture a data point and put it in a bucket. It has a much harder time recognizing intent, the kind of thing a human might notice: that Joey isn't pregnant, but his cousin June might be. The baby shower is coming up, and he has been searching for and liking pages that might give him an idea of what to buy.
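That failure mode is easy to reproduce in miniature. The rule, the page names, and Joey's likes below are all invented for illustration; the point is only that a data-point bucketer has no way to see the gift-shopping intent behind the likes.

```python
# A naive bucketer sees data points, not intent: anyone who likes
# baby-related pages gets filed as an expectant parent -- including
# Joey, who is shopping for his cousin June's baby shower.
# All names and rules here are invented for illustration.

BABY_PAGES = {"Pampers", "BabyBjorn", "Nursery Ideas"}

def bucket_user(likes):
    """File a user into a bucket from liked pages alone."""
    if BABY_PAGES & set(likes):
        return "expectant parent"  # wrong for gift shoppers like Joey
    return "general audience"

joey_likes = ["Pampers", "Nursery Ideas", "Bruce Lee"]
bucket_user(joey_likes)  # files Joey as an "expectant parent"
```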

Stripped of any overreach that would paint Joey as an expectant mother, there is one area where analytics sometimes succeed. They might recognize that Joey is in the market for some baby gifts (assuming this deduction is made before, and not after, he finds a gift). Or perhaps, if Joey has also liked certain television shows, one might deduce that he would be interested in similar shows. Or perhaps some data might be employed to fine-tune the tone of a message, much as direct mail writers once did using PRIZM research data.
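The "similar shows" deduction is the more defensible kind of inference, because it stays inside the data: shows whose audiences overlap with what Joey already likes. A minimal sketch, with show names and viewer sets invented here for illustration:

```python
# Rank other shows by how many viewers they share with a show the user
# already likes. The audiences below are invented toy data.

SHOW_AUDIENCES = {
    "Doctor Who": {"ann", "bob", "joey"},
    "Sherlock":   {"ann", "joey", "dee"},
    "WWE Raw":    {"ed", "flo"},
}

def similar_shows(liked, audiences):
    """Return other shows ordered by audience overlap with `liked`."""
    base = audiences[liked]
    ranked = sorted(
        ((len(base & viewers), show)
         for show, viewers in audiences.items() if show != liked),
        reverse=True,
    )
    return [show for overlap, show in ranked if overlap > 0]

similar_shows("Doctor Who", SHOW_AUDIENCES)  # "Sherlock" ranks first
```

Unlike the personality guesswork, this deduction claims nothing about who Joey is; it only observes what audiences he shares.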

But even then, such minor research advantages were tempered by advertising's old Rule No. 7, the rule that reminded commercial writers that people tend to lie. They are predisposed to "like" things (or even "list" things in a Nielsen ratings book) that make themselves look a little brighter, better, smarter, or savvier, regardless of what they really watch, like, or do. They also tend to share more positive life events than negative ones, connect and disconnect with people more easily, and like pages that friends recommend because they think they are doing their friends a favor. Maybe.

And while all three studies might make for interesting reading, marketers could probably learn more about their markets from the SizeUp tool provided by the Small Business Administration; from affordable data sources such as SEC filings, BizStats, or the United States Census; or from proven research methods such as consumer interviews, focus groups, and tests with a control group.

Does that sound too time consuming, cumbersome, or expensive? Then just wait until you see how expensive a product or service launch can be based on social network data alone. It's a little bit crazy.