Susan Etlinger is an industry analyst at Altimeter Group, where she works with global companies to develop social data strategies that support their business objectives. Etlinger has a diverse background in marketing and strategic planning within both corporations and agencies. Her work focuses on helping clients develop strategic, actionable plans that support their unique objectives and organizations through periods of rapid change.

Etlinger is a frequent speaker on social media and analytics and has been extensively quoted in media outlets including The Wall Street Journal, the BBC, Marketplace, and The New York Times. She holds a bachelor’s degree in rhetoric from the University of California at Berkeley. Etlinger talked with us about the importance of utilizing data analytics to inform decision making.

WHAT DOES IT MEAN WHEN WE REFER TO “BIG DATA”?

IBM has said that 90 percent of the data in the world was created in the last two years alone, and more data has been created in the past few years than existed in all of human history before. So we are living in a world where there’s just a lot to know. And what is happening is that people obviously want to derive insight from that data, but the nature of it makes it very complex.

The phenomenon of “Big Data” was first explained in depth in 2001 by Doug Laney, then an analyst at META Group (which Gartner later acquired). And if you think about 2001, it’s pre-social media, so what could big data have been back then? Well, according to Laney, there are three characteristics that are ascribed to big data, and we refer to them as the three V’s: volume, velocity, and variety. And so the idea is that big data is high volume moving at high speed, but what makes it the most complex is that it’s different kinds of data.

So if you think back to 2001, the biggest collection of data would have been things like stock transactions. When you think about how quickly a stock market moves on a high trading day, that fulfills the criteria of volume and velocity. But the variety is kind of limited, because really all you have is somebody saying “it’s a buy order” or “it’s a sell order” in a stock transaction. Compare that to today’s collections of data—for example, the data derived from social media transactions. With social data, you could have a Tweet that could be a comment. Or it could be a social action like somebody pressing the Like button, or Share, or a +1, or a Pin on Pinterest. Or it could be a video, a sound file, a slide show, or an image. And so then you think about, “Well, how do we interpret the meaning of all these different things, and what data do we have to tell us what to do with it?”

IN PRESENTATIONS YOU’VE DESCRIBED SOCIAL DATA AS THE “CANARY IN THE COAL MINE” FOR BIG DATA. WHAT DO YOU MEAN BY THAT?

A lot of organizations are looking at social data, responding to it, interpreting it, and they don’t even think of big data as being related in the slightest. But, in fact, if you’re interpreting social data, then you’re interpreting big data. Social data is the largest data set that most organizations are actively dealing with right now, so that’s why it’s the canary in the coal mine. If you can actually create the processes, the governance structures, and the guidelines and policies to deal with social data, then you have also started to prepare yourself for other types of big data streams.

WHAT ARE SOME EXAMPLES OF OTHER TYPES OF BIG DATA STREAMS?

Think of sensors, for example. What if you had an identification card that had a chip in it that tells you when students check in to the student union or purchase something in the cafeteria? Or it could work like these fitness devices, such as a Nike FuelBand or a Fitbit, that collect exercise information. You start thinking about how, if you had data about traffic patterns at the university or consumption patterns in the cafeteria, you could start to understand, based on what’s happening, that you need to order more supplies. Or you thought you were going to build a dorm in one area, but in fact the data might show you that you need it in another area. There are all sorts of ways in which you can integrate those different data sets to get at a more complete picture of what needs exist, what issues arise, and that sort of thing.

SO HOW DO YOU GET FROM JUST CAPTURING DATA TO GAINING INSIGHTS FROM IT THAT YOU CAN ACTUALLY USE TO MEET BUSINESS OBJECTIVES?

That’s the billion-dollar question! Let’s go back to social data as our example. The first thing to think about is, “What is my objective?” Say it’s something like, “We want to begin to have conversations with students and their families around the time the student turns 14, so we can start to appeal to them and show them a little bit about who we are and what we have to offer at our institution. And we’ll do that on Facebook.” (Probably the parents are there; the kids are on Facebook less these days. A lot of them are moving on to Twitter, and a lot of them use Instagram and other social platforms.) And then we say, “Okay, now that we’ve done this, how are people actually talking back to us?” And that’s the idea of engagement. Are people sharing the content about our basketball team? Are they sharing the content about our computer science lab or our history graduate program or whatever it might be? What kinds of content are they sharing the most?

Then you get into the question of, “So where do we invest our resources?” And I think one of the things data analytics can do is start to show you where those hotspots are. If we know that we’re famous for our sports teams and we post little clips of team video on our Facebook page, or put up images on Instagram, or use Vine with Twitter, then do we start to see any impact from that? Does having images and videos tend to increase the likelihood that people will share that content? And do we start to see impact from people coming back multiple times? Because one of the ways Facebook can be used, particularly, is for people to understand whether they’re going to like the people who are hanging out at the institution where they’re going to spend at least the next four years. So you want to create an ability to begin a lifecycle of relationships with people and have a kind of hub where conversations can occur.

WHAT ARE SOME OF THE BARRIERS THAT ARE INHIBITING THE GOOD USE OF SOCIAL DATA?

Right now the biggest challenge for organizations is pulling insight out of the data, because what you need is a good tool with solid text analytics. A lot of the first-generation social media listening tools were created and launched, let’s say, between 2006 and 2010, which is now ancient history in social media terms. Those tools enabled you to do things like keyword searches. So, for example, I want to search for every instance of “government shutdown.” And now I want to search for every instance of “government shutdown” related to “national parks.” And now I want to understand the sentiment around the impact of the government shutdown on the national parks. So it’s kind of the “20 Questions” method of deriving meaning. And it requires a trained analyst who is able to ask those kinds of thought-experiment questions that lead to some kind of insight. So it’s a scientific process, and it’s very, very time-consuming.
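
To make the “20 Questions” style of keyword searching concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular listening tool works: the sample posts are invented, and the `search` helper is a hypothetical name introduced for this example.

```python
# Hypothetical sample posts, invented for illustration; not real data.
posts = [
    "The government shutdown closed our favorite national parks this week",
    "Government shutdown talks continue in Washington",
    "Great hike today, national parks are the best",
    "Sad that the GOVERNMENT SHUTDOWN means national parks are closed",
]


def search(posts, *phrases):
    """Return the posts that contain every phrase, ignoring case."""
    lowered = [phrase.lower() for phrase in phrases]
    return [
        post for post in posts
        if all(phrase in post.lower() for phrase in lowered)
    ]


# Question 1: every instance of "government shutdown".
shutdown = search(posts, "government shutdown")

# Question 2: narrow to posts that also mention "national parks".
shutdown_parks = search(posts, "government shutdown", "national parks")

print(len(shutdown), len(shutdown_parks))
```

Each narrowing step is a new question asked by the analyst. Judging the sentiment of the posts that remain is the harder part, and this sketch does not attempt it.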

What’s happening now is that the market is moving toward trying to simplify that process with better algorithms and propensity models that can determine, for example, that people who like hybrid cars might also like organic cleaning products. And then also in terms of building in alerts: “Please tell me every time there are more than five mentions of food poisoning related to my product so I can understand whether I might have a crisis on my hands.” The tools are becoming more sophisticated in that they’re still complex on the backend, but intended to be easier to use. Yet it’s still an immature market for understanding social data, which is highly, highly complex.
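
The alert described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the function name `should_alert`, the product name “acme soup,” and the sample mentions are all hypothetical, and a real tool would match language far more flexibly than exact phrases.

```python
def should_alert(mentions, phrase, product, threshold=5):
    """Return True when more than `threshold` mentions contain both
    the phrase and the product name (case-insensitive)."""
    phrase, product = phrase.lower(), product.lower()
    count = sum(
        1 for mention in mentions
        if phrase in mention.lower() and product in mention.lower()
    )
    return count > threshold


# Hypothetical mentions, invented for illustration.
quiet_day = ["Loved the Acme Soup at lunch"] * 10
bad_day = quiet_day + ["Got food poisoning from Acme Soup"] * 6

print(should_alert(quiet_day, "food poisoning", "acme soup"))
print(should_alert(bad_day, "food poisoning", "acme soup"))
```

The threshold of five comes from the example in the text; in practice it would be tuned to the normal volume of conversation about the product.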

When you think about the way we’ve traditionally conducted marketing in the past, as marketers we would create a survey to understand customer attitudes, or we would bring people in for focus groups, and we would be asking structured questions. The real difference when you think about social data is that it is created and sustained completely outside our organization, and it consists of spontaneous articulations of people’s desires, hopes, feelings, frustrations, etc. Trying to structure that in a way that makes sense and that you can act on is a complicated matter.

Let’s say, for example, that somebody took a picture of herself sitting in the middle of Sproul Plaza on the Berkeley campus with a protest sign. The tools that can interpret images are pretty new (and probably more expensive than a lot of universities would want to pay at this point, although those prices will come down), so you’re most likely going to miss that signal. You probably will find it if people talk about “boycott” and “Berkeley” on Twitter, but you’ll be missing some context in the conversation. So that’s one of the challenges.

And in order to have solid text analytics, you really need intellectual property in the form of a deep understanding of linguistics and how language works (and not just in English, because if you think about an academic institution, it’s not as if the conversation about that institution is only going to be happening in one language). There are some really, really complex challenges to human language that technology can’t always sufficiently address, and so you’re not going to get 100 percent accuracy ever. Part of the reason for that is if you were to have 20 people individually code conversations they see on Twitter, there will be conversations they disagree about. Some are going to say, “That’s pretty positive,” and somebody else is going to say, “No, no, no, that’s totally negative, and we need to be worried about it,” because humans don’t even agree on interpretation. (If they did, we wouldn’t have literary criticism!)
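
One simple way to put a number on that disagreement is to measure how often pairs of human coders give the same label to the same conversation. The sketch below is an illustration, not a method described in the interview: the function name `pairwise_agreement` and the labels are hypothetical, and it computes raw percent agreement without correcting for chance.

```python
from itertools import combinations


def pairwise_agreement(labels_by_coder):
    """Share of (coder pair, conversation) comparisons where both
    coders chose the same label. Expects one equal-length list of
    labels per coder."""
    pairs = list(combinations(labels_by_coder, 2))
    matches = sum(a == b for x, y in pairs for a, b in zip(x, y))
    comparisons = sum(len(x) for x, _ in pairs)
    return matches / comparisons


# Hypothetical labels from three coders for the same four conversations.
coder_1 = ["positive", "negative", "positive", "negative"]
coder_2 = ["positive", "negative", "negative", "negative"]
coder_3 = ["positive", "positive", "negative", "negative"]

agreement = pairwise_agreement([coder_1, coder_2, coder_3])
print(round(agreement, 2))
```

If people coding by hand agree on only part of the material, that level of agreement is a reasonable upper bound on what to expect from an automated tool scored against their labels.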

And when you think about some of the natural language processing work that’s going on in social media, a lot of it has to do with understanding the way people really communicate. Let’s say I have a 16-year-old kid with a car. If he says, “My car is sick,” what it means is it’s all tricked out, it’s beautiful, it’s been detailed, it has these rims that sparkle, and when it drives by you’re going to be completely impressed. And if I, a 40-something mother, tell you my car is sick, it means it’s got to go to the shop. So that’s another challenge.

WHAT SKILLS AND TRAITS SHOULD ADMINISTRATORS LOOK FOR WHEN HIRING A DATA ANALYST?

I think there are two things you need, and this sort of speaks to how “social data” is almost an oxymoron. Because on the one hand, you have the human piece with “social,” and then on the other hand, you have the supposedly quantitative aspect of “data.” And we’ve seen this a lot, that data scientists come with a very strong statistics orientation and background and understand how to do complex regression analysis and create the algorithms to indicate the probability of certain things over other things.

But then you also have to have a sense of what question you should be asking in the first place. So you need somebody who has that humanistic orientation to be able to identify the cultural and social implications of a lot of the data that we see—because as everybody knows, it’s perfectly easy to draw the wrong conclusions from a beautiful set of data if you ask the wrong question. And so that’s why you need that interplay. You need someone who understands the business of higher education, the temperament of students, and the issues of their day, and also someone who can do the hard-core math.

THE CULTURE AT COLLEGES AND UNIVERSITIES TYPICALLY REQUIRES CONSENSUS DECISION MAKING IN ORDER TO BRING ABOUT CHANGE, AND SO LOTS OF PEOPLE’S OPINIONS COME INTO PLAY. COULD HAVING A MORE DATA-DRIVEN CULTURE BENEFIT THEM?

Yes, absolutely. There’s that whole notion of “the HiPPO phenomenon.” Harvard Business Review did a special issue on big data in October 2012, and they had a wonderful article talking about how it’s changing organizational culture. In the past you would have a leader of the organization, and that person made decisions based on experience and what they had known to be true in the past. “HiPPO” stands for “highest-paid person’s opinion,” so that’s why it’s called the HiPPO phenomenon.

But what if you have data that directly contradicts that person’s opinion? How do you handle that? Suppose somebody says, “I think we should close our Near Eastern Languages department,” but you have social data that actually indicates there’s tremendous demand; it just wasn’t articulated in a place that person was aware of it. Certainly, data can provide context to decision making.

IT SEEMS TECHNOLOGY OFTEN OUTPACES STRATEGY IN THE DIGITAL ERA, SO WHAT’S THE BEST THING ADMINISTRATORS CAN DO TO GET AHEAD OF THE CURVE WHEN IT COMES TO DATA ANALYTICS?

You have to think about the objective. For example, there’s a company called Gild based in San Francisco that’s a start-up founded by some data scientists. They were able to build an algorithm to help technology companies figure out which software engineers to hire. When you’re a Facebook or a Google or a Procter & Gamble or other large, prestigious company and you’re looking for software developers, the first question you ask is, “Did you go to Stanford or MIT?” And if the answer is “no,” then that person loses credibility in the mind of the recruiter. But Gild’s thought was “I bet there are lots of people who are amazing software engineers who didn’t go to one of those colleges, or maybe didn’t even finish college because they didn’t have the financial resources, and who are out there doing brilliant things.” And the truth of the matter is there are people like that, but you can’t find them unless you optimize your algorithm for something in addition to education. So Gild wrote code to consider their online reputation, the quality of their code, and all sorts of other things.

I think it’s a wonderful example of how what’s really important is to figure out what you are trying to accomplish. A lot of what I do is around trying to demystify this social data stuff, because it’s really about looking at the nature of the social content out there and asking yourself a series of questions about what you can learn from it. And so anyone in an academic institution who has a scientific bent or a bent toward literary criticism or linguistics should be able to start to have that conversation. I would encourage academic institutions not to limit themselves in terms of who participates in this process, because I think this is an opportunity in which art and science can work beautifully together.