
Get to Know Data Scientist Michael Schmidt, founder & CTO of Nutonian

Companies are seemingly measuring everything these days, but it's often the case that they end up with mountains of information no one can make sense of.

Enter the data scientist, one of the hottest roles in tech today and one that more and more organizations are hiring for. In a nutshell, these people are digging into those mountains of data to find trends and patterns that impact the broader business.

But what does that really mean?

I wanted to find out, so I caught up with Michael Schmidt, founder and CTO of Boston-based Nutonian. Named by Forbes as one of the world's most powerful data scientists, Schmidt's other accolades include Boston Business Journal's 40 Under 40, MIT Technology Review’s 35 Innovators Under 35, The Boston Globe’s Top Innovators, and more.

I recently sat down with Schmidt at Nutonian's Summer Street offices to learn more about his background, what it means to be a data scientist, and how advances like artificial intelligence (AI) and machine learning are changing the data game. Read more in our interview below.

Kaite Rosa: Tell me a bit about yourself and what led you to found Nutonian.

Michael Schmidt: Back when I was in grad school, I was passionate about machine learning and AI. A lot of the work being done was around systems that could crunch numbers, crunch data, and beat the best human players in chess, for example. But even though we built these systems, we didn’t learn anything about how to become better chess players. We had this new level of automation - and it was incredible - but it was hard to use or learn from.

I was passionate about helping to advance science and wanted AI systems that could help scientists. That’s where I got started, helping people bootstrap findings from data. I spent a ton of time working with people in pure sciences: physics, medicine, bio, chemistry, etc. I was helping them interpret the data they collected so they could scale up the complexity of the problems they analyzed.

The connection is that the problem scientists face with data is the same problem every business faces today — especially enterprises that have invested in collecting and storing data. They’re storing lots of information with the hope it will be valuable down the road. It will be, but one of the challenges is they haven’t been able to unlock a lot of that information.

KR: Can you explain what Nutonian does in basic terms?

MS: Nutonian is the only software AI company that can automatically interpret your data for you. We can tell you what looks to be causal, what’s not, and what happens under different scenarios. Instead of just predicting, we’re actually explaining what’s going on with the data. And we’re the first AI software that’s able to do that…

There's so much hype around data science in general, and that comes from the frustration of not being able to apply some of the data companies have invested in collecting. There’s a lack of technology to interpret that data.

That’s where Nutonian fits. Helping people do that.

KR: How are AI and machine learning revolutionizing the way we interpret and use data? How do they help data scientists and others in data-driven roles do their jobs better?

MS: There have been enormous investments in collecting this data. Before AI and machine learning, the way to unlock value from that data was just collecting stats and doing standard business intelligence on it.

AI allows you to do something new with the data. It allows you to do things a human couldn’t do, that even experts couldn’t do on their own.

KR: In the past few years, we’ve seen data scientist roles cropping up everywhere. What does this kind of role entail?

MS: Data science is a broad field. A lot of people have adopted that name because it commands a lot of respect and high salaries.

But to me, what does it really mean to be a data scientist? At a high level, the data scientist is the guy responsible for getting value out of the data. Some people would disagree, but I think what most people are interested in is uncovering some of the value in data.

With [our software] Eureqa, we automate some major parts of that. It alleviates the need to program and develop and unlock data. There’s a limited number of data scientists who can do that kind of work. About 20K to 30K. But there are around 3M analysts in the world who, if they had the right tech and software, could do it on their own.

Right now, it’s a lot of heavy lifting. Data scientists in any of the companies we work with are going back and forth with business stakeholders, getting feedback on results, and feeding that back into the algorithms they’re running.

KR: Can you tell me what your day-to-day as a co-founder/CTO/data scientist looks like?

MS: I own the technical vision and direction of the company. I work with some of our more high-profile customers to learn about their challenges and how to incorporate those back into our technology and product. I see their data science approaches, what data they have, and what problems they have, and find the best solutions for them.

At the same time, I also go out and give talks at conferences on data science. That’s a good chunk of my time. I’d say purely public-facing talks make up 25 percent of what I do. But that’s varied over the past few years. I was very heads down when we first founded the company. Since we’ve grown the team, I’ve had time to do that stuff [speaking opportunities].

KR: What are some of the biggest pain points you face as a data scientist?

MS: One of the biggest obstacles is that it’s easy to run algorithms and do analysis. To apply that, though, you actually need to interpret it, give reasons for your findings, and communicate that. Today, it takes a tremendous amount of manual research and analysis. We’re trying to solve that with machine learning...

The other challenge is that, a lot of the time, the data sucks. That’s symptomatic of the way companies have approached collecting data. They have been collecting things with the hope it would be valuable, rather than collecting the type of information they need to solve a problem.

But it’s exciting to see what comes out of the data you’re working on. The opportunity is to make a profound discovery in a new set of data that hasn’t been found before, because you have a new advantage. You have information and are applying it in ways it’s never been applied before. It’s always exciting to discover something that wasn’t known.

RAPID FIRE Q&A

KR: What time do you get to the office? What time do you leave?

MS: This varies a lot depending on what stage of a project I’m working on. But generally speaking, I usually arrive between 8 and 10 a.m. and leave between 6 and 8 p.m.

KR: Coffee or tea?

MS: I have a cup of coffee most days, but the main thing I do to focus is listen to music – particularly something simple and without vocals so it doesn’t distract me.

KR: Tell me about your morning routine.

MS: I usually get ready as quickly as possible, grab something to eat on my way to the office, catch up on emails, and then finally get back to actual work.

KR: Once you get to the office, what do you tackle first?

MS: When working on a project, I like to use the morning to explore new approaches to the problem while feeling fresh.

KR: How do you organize your day to stay productive?

MS: I try to do less important items/meetings after lunch, since I tend to have the best focus in the morning and evening.

KR: Where are you most productive?

MS: As a data scientist, you have to balance mentoring and collaborating with implementing and finishing tasks. For me, I get the most done while at home because it has the fewest distractions. But I usually need to be in the office to collaborate with and mentor others.

KR: How do you deal with stress?

MS: Working out is the best stress management for me. Most recently, I’ve been running or doing intervals to reset my thoughts.

KR: Are you a written note taker or do you use an app or something else?

MS: I will always take notes online, usually in Google Docs, so that it’s version controlled, shareable, and easily editable into formal docs later on.