Managing a Data Science Team

Executive Summary

Great data scientists have career options and won’t abide bad managers for very long. If you want to retain great data scientists, you need to care about your team members, connect their work to the business, and design a diverse, resilient, high-performing team. The best way to build trust is to make sure your team members have interesting projects to work on and that they’re not overburdened by projects with vague requirements or unrealistic timelines. To get the most from a data scientist’s time, make sure they have a clear understanding of the business goal behind the project. Finally, as a data science manager, you’ll get lots of applications. Take advantage of that to be picky in the right ways.


Many managers of data science teams become managers because they were great individual contributors and not necessarily because they have the skills or training to lead a team. (I include myself in that group.) But management is a skill unto itself, and relying on your experience as a successful individual contributor is not enough to ensure that you are able to retain and develop great talent while delivering valuable learnings, products, and outcomes back to the organization. Great data scientists have career options and won’t abide bad managers for very long. If you want to retain great data scientists you’d better commit to being a great manager.

What does it take to become a great manager? Volumes have been written on that subject, of course, including from HBR. But in my experience, a few areas are particularly important for those who lead data science teams. Great management means caring about your team members, connecting their work to the business, and designing diverse, resilient, high-performing teams.

Build trust and be candid

Trust, authenticity, and loyalty are essential to good management. That’s particularly true in data science where confusion around the discipline and its role in the organization means the team manager is responsible for insulating team members from unreasonable requests and for explaining the team’s role to the rest of the organization. Your team needs to trust that you will have their back.

Having your employees’ back doesn’t mean blindly defending them at all costs. It means making sure they know that you value their contributions. The best way to do that is to make sure your team members have interesting projects to work on and that they’re not overburdened by projects with vague requirements or unrealistic timelines (which is all too common given the high demand for data scientists).

To build trust over time, you should invest in candor. Data scientists are smart people who are trained in how to interrogate and handle information. Therefore, my heuristic is to be about 20% more direct and candid than you think you should be. Be transparent with the good and the bad during the entire process, from recruiting, to onboarding, to the day-to-day, to performance reviews, and when discussing the team’s, department’s and organization’s strategy. It’s painful but critical for success. The moment you start “being nice” to avoid a tough conversation, you and your team have begun to lose.

Finally, feedback should be consistent and bi-directional, and great data scientists will smell bullshit a mile away. If you say you’re a believer in candor but become defensive or (worse!) don’t actually act on feedback, then your best reports will want to leave.

Connect the work to the business

To get the most from a data scientist’s time, make sure they have a clear understanding of the business goal behind the project. Anchoring your team’s work in the context of the broader organizational strategy is among the most important jobs a manager of data science has. Unfortunately, it’s not always easy to do.

Data science projects often start with a question from someone outside the team. But often the question that the person asks isn’t exactly what they actually want to know. A lot of managing data science involves discussing and fine-tuning questions from stakeholders to better understand the information they actually want and how it will be used. Don’t let questions or requests become projects for your team until you know exactly what the stakeholder wants to understand and how they’ll use it. Having very clear objectives for the data-related questions that come your way is one of the most important things you can provide for your team.

Of course, stakeholders can’t always answer these questions on their own. They might not have a clear idea of what a finished data science product would look like (or how they would apply it). To fill this gap, make sure members of the data science team are regularly invited to product and strategy meetings. This way they can contribute to the creative process rather than merely responding to requests.

Design great teams

There are many professionals trying to break into the “sexiest profession of the 21st century” and so, as a data science manager, you’ll get lots of applications and will have to be picky. Take advantage of that to be picky in the right ways. Care about your hiring process.

One of the biggest areas where people fail as managers is in the tradeoff between the short- and the long-term. For instance, it’s easy to start thinking that you don’t have time to recruit. This is a huge mistake. If you don’t have the time to find great team members and to scrutinize your interview and onboarding processes to ensure that you have good ones in place, then you don’t have time to manage a new direct report. Creating a great hiring process will pay off in the long term.

What does a great hiring process look like? For one thing, it doesn’t just focus on technical skills. Social skills like empathy and communication are undervalued in data science and the disciplines from which data scientists usually emerge, but they’re critical for a team. Make this a part of your hiring (but not in a way that amounts to hiring just for ‘culture fit’ and reinforces your affinity and confirmation biases). Instead of focusing on whether you can get along with a candidate, ask yourself if there is a lens through which this person sees the world that expands the boundaries of the team’s knowledge sphere, and value that dimension as highly as you value other attributes such as technical ability and domain expertise. This is why it is important to prioritize diversity. That includes diversity of academic discipline and professional experience but also of lived experience and perspective.

A few areas in particular stand out as important for data science. First, don’t just hire senior people. Not only are they in high demand and expensive, but less experienced employees have the “luxury of ignorance” and can ask “dumb” questions. These questions are not actually dumb, of course, but are unencumbered by the usual assumptions that more experienced professionals stop being aware they are making. It’s not hard to become infatuated with a particular way of doing things and to forget to question whether a favored approach is still the best solution to a new task.

Second, data scientists come from a variety of academic backgrounds: computer science, physics, statistics, and many others. What matters most is having a creative mind coupled with first-rate critical thinking skills. I have a team member who studied marine biology and this diversity of expertise has proven extremely valuable. (The ability to translate domain knowledge about how pods of dolphins behave in the wild can be surprisingly useful when modeling a fleet of robots.)

Third, it’s important to hire individuals whose strengths complement one another, rather than building a team whose members all excel in the same area. A “big picture” person, someone who can articulate stories with data, and a visualization wizard can together produce things none of them could independently. To take full advantage of these complementary skills, make sure that the team actually works as a team and collaborates. You want your team members working with each other and not just alongside each other. Regularly requiring members to read each other’s code and reports and fostering team activities centered around technical discussions ensure that you get the most out of this sort of diversity.

Finally, it’s also important to build a team that reflects the people whose data you’re analyzing. This is the only way to ensure that you have a resilient team that will ask better questions and have a wider aperture of perspectives from which to ask these questions. This way, each individual’s blind spots are covered by another’s past experiences and skill set.

When to specialize

One final piece of advice: When a data science team is just starting out, everyone on it will “wear many hats” and do lots of different kinds of data science. That’s OK; it’s like when someone joins a startup. But as your team matures and proves its value, recognize that roles will become more defined and some activities will move to other teams (infrastructure, ops, etc.).

Having said this, I would caution against specializing too soon. Specialization only works when well-defined and clear requirements are available to offset the coordination delays and costs associated with multiple teams working together. “Full stack” data scientists are very hard to find, but it is possible to find smart and driven “partial stack” data scientists who can learn, with a little dedicated coaching, how to appropriately frame a problem, manage a small project, develop and train a model, integrate with APIs, and push to production.

If you’ve done your job right as manager, this evolution will proceed relatively smoothly. You’ll have been picky in your hiring and created a great team with a balanced skillset. Your employees will trust you, and they’ll understand how changes support the organization and its goals.