How India’s favorite TV show uses data to change the world

Every Sunday morning, millions of people in India tune in to watch Bollywood star Aamir Khan host one of the country’s highest-rated television shows, Satyamev Jayate. But unlike so many popular programs, Satyamev Jayate doesn’t involve a singing competition or a collection of volatile strangers living under the same roof. It’s a documentary program tackling some of the country’s most sensitive topics, and it has the whole country, indeed the whole world, talking. In order to funnel millions of messages a week into something valuable, the show’s producers have turned to big data.

Aside from Khan’s star power, the show is so popular because of the types of issues it tackles — female feticide, caste discrimination, dowry deaths, child abuse and medical practice among them. According to one of the show’s producers, the amount of engagement and the number of responses from viewers is “completely unprecedented.” Here’s a sample of what we’re talking about, just 13 episodes into the show’s existence:

400 million viewers on Indian television and across the world on YouTube.

More than 1.2 billion people have connected with Satyamev Jayate across its website, Facebook, Twitter, YouTube and mobile devices.

More than 8 million people have contributed a total of more than 14 million responses to the show’s content via Facebook, web comments, text-message votes and a telephone hotline. More than 100,000 new people respond each week.

The responses take all sorts of forms, from votes on a weekly poll question to long, heartfelt letters explaining a viewer’s experience with an issue or how the show has changed their thinking on an issue. And although 95 percent of responses come from India, the show has received them from 5,000 locations in 165 countries, including as far away as northern Canada and Alaska. The show’s topics regularly rank among the top trends on Twitter shortly after each episode airs.

Surprisingly, the producer said, the India-created Satyamev Jayate has not received a single piece of hate mail from bitter geopolitical rival Pakistan. In fact, there have been numerous requests for an episode on India-Pakistan unity. (If you have 90 minutes, here’s an episode on human dignity.)

Parsing through millions of messages

In order to keep up with all the messages, Satyamev Jayate turned to Persistent Systems, an Indian IT consultancy with offices around the world, which created a system for automating their analysis. Here’s how the process works.

About a day and a half before each show, Satyamev Jayate’s production company tells Persistent what the issue will be, and the two groups come up with a taxonomy that will help the system sort through messages based on what topics will be brought up during Sunday’s show. But it’s not by any means the definitive list. As activity ramps up on Twitter while the show airs (tweet rates are highest during commercials and immediately after it ends, by the way), the team gets a sense of what topics are resonating with viewers and what themes they can expect in the nearly a million responses that will follow.

When the responses actually do start pouring in after lunch, they hit a system designed by Persistent to automatically tag them and score them based on interest level and sentiment. So, as Mukund Deshpande, head of business intelligence and analytics at Persistent, told me, a long message with an interesting story will be marked as higher quality, while a short, congratulatory note will be scored lower. Because so many viewers write in “Hinglish,” a combination of Hindi and English, an off-the-shelf system wouldn’t have been as accurate for processing these messages.

In the future, he’d like to train the system to recognize various gradients of emotion, too, beyond just simple sentiment. That means not just “positive” or “negative,” but also “happy,” “sad,” “angry” and any other way a viewer might be feeling.
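The tagging and scoring step described above can be illustrated with a minimal Python sketch. It is not Persistent's system: the taxonomy, the keyword lists (including the romanized Hindi words "dahej" and "beti"), the praise-word list and the scoring formula are all hypothetical choices made for illustration, and the sketch only approximates "interest level" by length and praise content without attempting real sentiment analysis.

```python
import re

# Hypothetical taxonomy for one episode: topic -> keywords in English and
# romanized Hindi. These lists are illustrative, not the show's own.
TAXONOMY = {
    "dowry": {"dowry", "dahej"},
    "daughters": {"daughter", "daughters", "beti"},
}

# Words assumed to signal a short congratulatory note.
PRAISE_WORDS = {"congratulations", "congrats", "great", "thanks", "salute"}


def tokenize(message):
    """Lowercase the message and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", message.lower())


def tag_message(message, taxonomy=TAXONOMY):
    """Return the sorted list of topics whose keywords appear in the message."""
    tokens = set(tokenize(message))
    return sorted(topic for topic, words in taxonomy.items() if tokens & words)


def interest_score(message, full_length=50):
    """Score a message between 0 and 1: longer texts score higher, and a
    high share of praise words lowers the score (an assumed heuristic)."""
    tokens = tokenize(message)
    if not tokens:
        return 0.0
    length_part = min(len(tokens) / full_length, 1.0)
    praise_share = sum(t in PRAISE_WORDS for t in tokens) / len(tokens)
    return round(length_part * (1.0 - praise_share), 3)
```

With these assumptions, a long personal story that mentions "dahej" is tagged under the dowry topic and outscores a three-word congratulatory note. A keyword list that mixes English and romanized Hindi is one simple way to cope with Hinglish, though a production system would likely need far more than keyword matching.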

The best messages are then sent to a team of trained analysts (often college students and graduates, along with some Persistent employees) who decide which ones are worth following up on for a Friday radio show Khan does, and for placement on Satyamev Jayate’s web site. These analysts try to ensure that the stories shared are truthful and that the messages don’t contain personal information that could get viewers in trouble or affect their privacy. Data visualizations about how many people have responded and where they come from are available on the Impact section of the show’s site, as well as on separate Impact pages for each episode.
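The hand-off to human analysts might look something like the following sketch, which picks the highest-scoring messages and masks obvious personal details before review. The function names, the regular expressions and the assumption that phone numbers are 10 digits long are all hypothetical; the article does not say how the show's analysts actually screen for personal information, and pattern matching alone would miss names and addresses.

```python
import heapq
import re

# Assumed patterns for personal details; real screening would need more.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\b\d{10}\b")  # assumes 10-digit mobile numbers


def redact(text):
    """Replace e-mail addresses and 10-digit numbers with placeholders."""
    text = EMAIL_RE.sub("[email]", text)
    return PHONE_RE.sub("[phone]", text)


def shortlist(scored_messages, k=3):
    """Return the k highest-scoring (score, text) pairs, redacted,
    in descending score order, ready for human review."""
    top = heapq.nlargest(k, scored_messages, key=lambda pair: pair[0])
    return [(score, redact(text)) for score, text in top]
```

Automatic masking of this kind would only be a first pass. Judging whether a story is truthful, as the analysts do, remains a human task.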

Making a difference with data

All this feedback has an impact, both on the show itself and on India. Satyamev Jayate’s voting process, in particular, has yielded some impressive results. After the first episode about female feticide, or the selective abortion of female fetuses, 99.8 percent of viewers said they agreed with the idea of a fast-track court to prosecute doctors who perform such operations. When Khan presented the results to the Indian government, officials agreed almost immediately to amend the court system accordingly, the producer told me.

Sometimes, though, the results simply present an interesting, if not troubling, view into the Indian subconscious. Almost 32 percent of respondents, for example, voted in favor of the right of families to use force to prevent the marriage of two willing adults (subsequent analysis uncovered some reasons why, including continuing opposition to inter-caste marriage), while almost 14 percent of respondents one week said that beating a woman is a sign of masculinity. And although women make up only about 32 percent of the show’s audience, they have accounted for the majority of responses on shows addressing issues important to them.

The producer said his team also uses the data to inspire ideas for future shows and to populate a weekly radio show that Khan does with a local journalist. The Satyamev Jayate team analyzes the week’s messages in order to pick the most powerful and determine trends in viewers’ feelings, and Khan shares them during the interview. The second season, he said, will be shaped in part by how viewers responded to the format during the first season and the issues they want covered next.

Beyond just the next season, though — and the occasional political victory — the hope is that all the data Satyamev Jayate generates will have continuing utility. Deshpande said he’d like to see it used for ethnographic and social science research, because the dataset is larger than most academic studies could generate (something that’s already happening with crowdsourced medical research) and it’s very high quality because of the demographic and geographic information attached to it.

However, the producer with whom I spoke seems perfectly content right now with the way Satyamev Jayate is resonating with the public. For example, he said, viewers are reporting crimes they previously might not have considered too big a deal and are reaching out to disabled citizens. This is the first time many people are speaking openly about these issues, he said, and they’re able to track the effects because they’re able to ensure no message is left behind.

Aamir Khan should seriously consider an SMJ Season 2 to take a closer look at why educated people are frittering away their right to vote . . . as a consequence of which we have corrupt politicians coming to power by buying up the votes from impoverished vote banks.
Anna Hazare is history. Baba Ramdev is a flash in the pan. Aamir not only has the credibility, but also has the capability to get people to vote.

now, this is where i find it confusing. 50% (more or less) used facebook to either correct or respond to the show, but only 1.9% were community members, or facebook fans as they call them, through facebook??!! sounds very odd to me.

data does provide you an insight but then again statistics is a science that can be easily used to mislead people. like the power outage that recently affected 600 million people in India, but surprisingly cell phone services were still up, and not one report mentioned that homes in India do have power backups. as far as mining realtime data is concerned, i would say most of us today use facebook/twitter as a means to show we are aware of what’s going on. don’t believe me? remember the recent incident involving a habitual twitterazzi, who btw happens to be a big bollywood star, tweeting about an Indian athlete after she won a medal at the London Olympics?

i haven’t mentioned it anywhere but i’ll do so here since we are talking about data. i have watched a few episodes and one that gave me chills was on domestic violence. most of the people interviewed in those small clips played on tv during the show about their views on beating their wives were from the weaker section of the economy and they all agreed it was okay to beat your wife every now and then. shocking, but at least they admitted it honestly on camera. a question popped up in my head straightaway. why nobody from the middle or higher section of the economy? did these men believe what they said because they were men and the breadwinners of the family? or did the lack of education never really allow them to confront their parental beliefs about treating a woman? i don’t know the answers to these questions, but what i do know, since my job involves data analysis, is that this analysis can’t be used to bring about the change we are looking for. just like registering to vote is still done on foot and not through online portals, a change has to be based on the ground realities and not on realtime data from someone who feels the need to relay everything they are doing to the entire world ‘real time’ and just can’t put their phone down and listen to that individual on the show who’s holding back their tears to explain something that they have not told any other human being before. if they can’t be part of that moment, which may not last for more than a few seconds, do you really expect them to be a part of a bigger moment that may last longer?

SMJ was a wonderful, much-needed show, and I am so glad to find that Persistent Systems has done such an excellent job of collecting highly useful and usable data, based on all the feedback the show generated.

Aamir has talked about a Pune-based software company doing collection, analysis and presentation of feedback material. So this is it. Great!

However, I am disturbed to read this: ‘Sometimes, though, the results simply present an interesting, if not troubling, view into the Indian subconscious. Almost 32 percent of respondents, for example, voted in favor of the right of families to use force preventing the marriage of two willing adults (subsequent analysis uncovered some reasons why, including continuing opposition to inter-caste marriage), while almost 14 percent of respondents one week said that beating a woman is a sign of masculinity. And although women comprise only about 32 percent of the show’s audience, they have accounted for the majority of responses on shows addressing issues important to them.’

Well, we can only hope that the Indian mindset is persuaded to change enough in due course, especially with continued education brought about through avenues like positive-minded community TV shows, besides education at the grassroots level.

the claimed popularity and its effect seem to be exaggerated. we need to appreciate the indians’ voyeur mentality and our tch..tch..mentality..our disbelief in positive change by directed actions..if there is a mass Y mentality population in the world..it is in india..it’s not criticism..it’s my analysis..maybe Aamir can do a show on our chalta hai…kuch nahin badlega attitude..

Ravi, you still seem to be living in the euphoria of SMJ. Let’s see in the coming times how much change it brings in society, the procedures (government), and the thinking. So let’s stop trying to prove each other right/wrong.

I’ve nothing against SMJ or the efforts that Aamir has put in; however, remember that this is not the first agitation of its kind, except for the fact that it is being done at such a wide scale. Use of technology is at its best.

Ravi:
Why do you take umbrage at criticism? It seems valid enough. A reach of 400 mn on TV is an exaggeration unless you are counting anybody who has watched any one minute of any of the episodes. 1.2 billion on the net, Twitter etc. is a misleading figure. It cannot be a count of the number of unique people. Think and work out why.
Sankara Pillai

it’s an interesting analysis of what the show is doing throughout the world. at least it is making us stop for a while and think about the issues we are facing as a society, which we were ignoring all these years.

This is a very interesting analysis. I would love to see this data made available in the open domain so that researchers around the world can do more analysis. This data can bring a LOT of change in India and it will be sad to see it kept behind closed doors!