Mining Company Data for PR Gold

Jason Del Rey was a senior reporter covering technology, branding, and company culture for Inc. magazine. Before joining Inc., his work appeared in Newsday, The (Newark) Star-Ledger, and the Staten Island Advance, and on ESPN.com. He lives in New Jersey.

After weeks of digging through hundreds of gigabytes of information, Christian Rudder set out to tackle one of life's unanswered questions: Which smartphone users have the most sex? Rudder, the co-founder of OkCupid, a New York City-based dating site, regularly mines the company's database of customer profiles and surveys to find amusing trends like this particular smartphone statistic. (Answer: iPhone owners.)

Companies are gathering more and more information about their customers, but perhaps none is as obsessed with statistics as OkCupid. The dating site, which was founded by four Harvard-educated mathematicians, has come up with a way to spot interesting trends in its data – and garner free publicity in the process.

The company's foray into statistics started last year, when the founders set out to create a company blog that would attract an audience outside of OkCupid's existing customer base. The founders had an idea: What if they took survey answers, messaging habits, self-descriptions, and other statistics from OkCupid's millions of members and compared that information with external data? Perhaps they could uncover larger truths about the online dating world – or even society as a whole.

The blog, OkTrends, intentionally stoked controversy. The inaugural post was titled "Rape Fantasies and Hygiene by State." It revealed members' collective answers to some of the more intimate questions used in OkCupid's matchmaking algorithm. On the blog, Chris Coyne, one of the founders, unveiled multicolored maps that displayed the results by state. The provocative post received more than 30,000 hits.

After the success of that post, Rudder dug deeper into OkCupid's data. He found some startling trends when he studied how various traits of OkCupid members seemed to affect their interactions with potential mates. Last October, Rudder wrote a highly controversial post: "How Your Race Affects the Messages You Get." In the opening paragraph, Rudder wrote, "We've processed the messaging habits of over a million people and are about to basically prove that…racism is alive and well."

The results included evidence that white men get more responses to their romantic missives than any other group. The company also determined that its male users write back to black women "far less often than they should" when taking into account OkCupid's compatibility algorithm. The post generated some 1,300 comments. Several media outlets, including Salon.com, NPR, and The New Republic, reported on the findings. "We could have very easily produced lists of our users' favorite cars or sports teams," says Rudder, "but people don't care about that."

Since then, Rudder has crafted about one lengthy post a month and hired a full-time data scientist to help him. Posts have included service-type articles geared toward helping online daters improve their chances of connecting with other members. One gave tips on how to write an initial message to another member. (Don't use physical compliments such as "hot" and "sexy"; do use "How's it going?" as a greeting.) Another offered statistics and advice on the best poses for profile photos. (Guys, look away from the camera; ladies, make a flirty face.) Hundreds of people commented on both posts.

But OkCupid's blog has done more than just spur conversation. The blog's popularity has boosted OkCupid's ranking in search-engine results. "It's helped us a ton in search-engine optimization and traffic," says CEO Sam Yagan. He says nearly two million people have visited the blog since it launched. In that time, OkCupid's monthly unique visitor count has nearly doubled, to 5.5 million. "Having a blog about data helps create the impression that OkCupid treats this like a science," Rudder says. "And we do."

On the blog, OkCupid makes a point of addressing possible privacy concerns. In the first post, Coyne wrote, "though we plan to discuss and manipulate user data on this blog, it will always be anonymized." Later, in a post about how to break the ice with other users, Rudder explained that sender and recipient information was removed during the data mining. "In addition," he wrote, "our sifting program looks at the content of messages only two or three words at a time...no human has read any actual user messages."
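For readers curious what "two or three words at a time" might look like in practice, here is a minimal sketch in Python. It is not OkCupid's actual sifting program, which the company has not published. The record layout (`sender`, `recipient`, `text`) and the function names are hypothetical, and the sample messages are invented. The idea is simply to discard who wrote to whom, then tally short word sequences so that no full message is ever read by a person.

```python
from collections import Counter
import re


def strip_identities(message):
    """Keep only the message text; the sender and recipient fields are dropped."""
    return message["text"]


def ngrams(text, n):
    """Return every run of n consecutive words in the text, lowercased."""
    words = re.findall(r"[a-z']+", text.lower())
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def count_phrases(messages, sizes=(2, 3)):
    """Tally two- and three-word phrases across all messages."""
    counts = Counter()
    for message in messages:
        text = strip_identities(message)
        for n in sizes:
            counts.update(ngrams(text, n))
    return counts


# Invented sample records; the field names are assumptions for illustration.
sample = [
    {"sender": "u1", "recipient": "u2", "text": "How's it going?"},
    {"sender": "u3", "recipient": "u4", "text": "Hey, how's it going today"},
]

phrase_counts = count_phrases(sample)
print(phrase_counts.most_common(3))
```

Only the phrase tallies leave the function, which is one simple way to keep the analysis at the level of aggregate counts rather than individual conversations.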

Every month, OkCupid's data scientist writes code to query the company's databases for relevant information. Rudder then dumps the findings into an Excel spreadsheet, which often includes as many as 30 columns and 500,000 rows of information. Over a two-week period, he runs calculations and identifies interesting statistics, then molds them into charts and graphs in Excel. "I'm a long-term power user of Excel," Rudder says, "but it's probably not the easiest way."
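The kind of calculation Rudder describes running in Excel can also be expressed in a few lines of code. The sketch below is an illustration only, not the company's method: the column names (`greeting`, `got_reply`) and the rows are made up, and the function `reply_rate_by` is hypothetical. It groups rows by one column and computes the share of messages in each group that received a reply, roughly what a pivot table would do.

```python
from collections import defaultdict


def reply_rate_by(rows, column):
    """Group rows by the given column and return the reply rate for each group."""
    sent = defaultdict(int)
    replied = defaultdict(int)
    for row in rows:
        key = row[column]
        sent[key] += 1
        if row["got_reply"]:
            replied[key] += 1
    # Every key in `sent` has at least one row, so the division is safe.
    return {key: replied[key] / sent[key] for key in sent}


# Made-up rows standing in for an exported query result.
rows = [
    {"greeting": "how's it going", "got_reply": True},
    {"greeting": "how's it going", "got_reply": False},
    {"greeting": "hi", "got_reply": False},
    {"greeting": "how's it going", "got_reply": True},
]

rates = reply_rate_by(rows, "greeting")
for greeting, rate in sorted(rates.items()):
    print(f"{greeting}: {rate:.0%}")
```

With half a million rows, a script like this would likely run faster than a spreadsheet, though the charts would still need to be drawn somewhere.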

Other companies are taking notice of OkCupid's creative use of customer statistics. Gizmodo, a technology blog owned by Gawker Media, has begun running some of OkCupid's posts. And OkCupid has been approached about possible content- and data-sharing projects by several companies, including Bundle, a new personal finance site. The start-up is interested in working with OkCupid to find new trends using its information on how singles spend money.

"I think businesses are sitting on all kinds of interesting data," Yagan says. "The thing to consider is, How does it drive your business? In our case, because we're kind of an unknown site, just having people discovering OkCupid is really powerful."