How Atreyee Dey redefined both Big-Data problems and Gender Norms

Atreyee Dey has worked in Tech R&D for more than ten years. Now she’s solving complex Big-Data problems at Zapr Media Labs, and profiling media consumption at a scale that has never been attempted before. As a female software engineer, she is pushing the boundaries of data search and retrieval in an industry where women have had less space so far.

I’ve been with Zapr a little over a year. I wanted to see what a startup culture feels like and how I can contribute. I had almost 10 years of experience in research and product labs, but I felt there is a lot more dynamism and freedom in the way we shape our products here. At the same time, a lot of urgency and constraints come into the picture since we’re dealing directly with clients. Additionally, as a startup, we want to be frugal in our design choices when we build products. It was challenging to balance all these initially, but it’s worked out pretty well so far.

There’s a myth that startups are too demanding of time and difficult for women engineers with kids. So I wanted to go against the tide and give it a try.

Since my experience revolves around building products to facilitate query and search on big data, I started working primarily on creating the Zapr data lake, where all our data can be stored, queried and processed at scale. We collect TV viewership information for over 40 million smartphone users and cannot afford to lose even a single second of data. One of our biggest challenges is to profile each of these millions of users on a daily basis and make those profiles searchable. This is important because the kind of media a person consumes actually defines them, right? And viewership patterns will change for every single user over time as they go through different phases of life.

Our problems are twofold. First, we need to capture this varying user data at scale, so we need to figure out both the solution and the right fit for it. Much of the work goes into putting together various tools and products that give us the level of accuracy we want. Second, we need to understand the very notion of ‘change’: how do we even attribute or find context behind every change?

If a person changes their mobile or shifts location, they still have the same interests and watch the same things, right? Users never really leave the Zapr universe because they have opted back in through one app install or another, so we capture this movement and tie up the loose ends. There is no dearth of good and completely new Big Data problems here.

Atreyee getting code to work at the Zapr Hackathon

There are two different aspects to the user profiles that we are capturing. One is the static profile, where we look at a particular time period to understand each user’s affinity to the kind of media they consume. The other is what in data science we call concept drift: let’s say a particular person has been watching a lot of Hindi entertainment, and suddenly they migrate to watching more kids’ shows. So there is very likely a child who has come into their household. Capturing this event in near real time is essential for us to reach them at the right time, when they need help with certain services and products.
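To make the idea of concept drift concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a description of how Zapr does it: it assumes each user's viewing history is available as a plain list of genre labels, compares a baseline window against a recent window using the Jensen-Shannon divergence, and flags drift above a threshold. The function names, the genre labels and the threshold of 0.3 are all hypothetical.

```python
import math
from collections import Counter


def genre_distribution(events, genres):
    """Turn a list of watched-genre labels into a probability distribution."""
    counts = Counter(events)
    total = sum(counts[g] for g in genres)
    if total == 0:
        # No viewing data in this window: fall back to a uniform distribution.
        return {g: 1.0 / len(genres) for g in genres}
    return {g: counts[g] / total for g in genres}


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; 0 means identical distributions."""
    def kl(a, b):
        return sum(a[g] * math.log2(a[g] / b[g]) for g in a if a[g] > 0)

    # The midpoint distribution is non-zero wherever p or q is non-zero.
    m = {g: 0.5 * (p[g] + q[g]) for g in p}
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def has_drifted(baseline_events, recent_events, threshold=0.3):
    """Flag a user whose recent genre mix differs sharply from the baseline.

    The threshold is an assumed value and would need tuning on real data.
    """
    genres = sorted(set(baseline_events) | set(recent_events))
    p = genre_distribution(baseline_events, genres)
    q = genre_distribution(recent_events, genres)
    return js_divergence(p, q) > threshold


# Hypothetical user: mostly Hindi entertainment before, mostly kids' shows now.
baseline = ["hindi_entertainment"] * 18 + ["kids"] * 2
recent = ["hindi_entertainment"] * 4 + ["kids"] * 16
print(has_drifted(baseline, recent))
```

A windowed comparison like this only says that the mix has changed, not why. Attributing the change to a life event would need additional context on top of it.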

A major part of our business runs on our ad tech platform, where we carry out advanced targeting and optimize bidding for both advertisers (brands, agencies) and publishers (apps). The major data concern here is detecting fraud, both in the ad tech ecosystem and in our TV viewership database. At Zapr Media Labs we’ve spent a lot of time ensuring our systems are sanitized against erroneous data practices, for which we’ve also been verified by the global anti-fraud entity TAG.

Every piece of data that enters the Zapr lake goes into defining some reporting parameter or metric which our clients and partners use on a daily basis. So we cannot let garbage data come in. How do we do this? By fixing extremely challenging data problems in real time and at scale.

But it’s not as tough as it sounds. In fact, it’s the opposite here at Zapr. In the past, I’ve seen most people in bigger organisations get very territorial about work and credit sharing. But here we’re constantly encouraging each other. When something is not going well, we try to solve it instead of looking for someone to blame.

Having a good time with colleagues at Slurp Cooking Studio, Bangalore

I love coming to work; every day there are all these smiling faces there to greet me. The kind of personal attention I receive from my coworkers (and friends) is something I’ve never experienced before and it’s very heartening.

Moving forward, we’re building a Semantic User-Search platform. Queries that used to be just a bunch of words have now become concepts. For example, how do we distinguish Shahrukh Khan’s fan-base from Rajinikanth’s, or Action movie fans from RomCom fans? Our semantic search would combine varying patterns of a user’s TV viewing habits to deliver the most relevant results.
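As a rough illustration of searching users by concept rather than by keyword, the Python sketch below ranks users by cosine similarity between a concept vector and each user's affinity vector. This is a simple stand-in, not the platform described above: the tag names, the affinity scores and the `search_users` helper are all hypothetical, and a real semantic search would also need a way to map free-text queries onto such concept vectors.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(tag, 0.0) for tag, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_users(concept, profiles, top_k=2):
    """Return up to top_k user ids whose viewing affinities best match the concept."""
    scored = [(cosine(concept, profile), user_id)
              for user_id, profile in profiles.items()]
    # Highest score first; ties are broken by user id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [user_id for score, user_id in scored[:top_k] if score > 0.0]


# Hypothetical affinity profiles built from viewing history.
profiles = {
    "u1": {"action": 0.8, "romcom": 0.1},
    "u2": {"romcom": 0.9, "action": 0.05},
    "u3": {"news": 1.0},
}

print(search_users({"action": 1.0}, profiles))
```

Users with no overlap with the concept score zero and are left out of the results, so the query for action fans never returns the news-only viewer.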

There is no one actually telling you what to do; it’s your open playground. We’re the only people in India who are working on media consumption information for every smartphone user. So all the Big Data and Search problems have to be rethought and redone here. It’s really exciting for me and anybody who wants to work in this field.
