Interview: David Steinmetz, Machine Learning Engineer at Capital One

I recently caught up with David Steinmetz, who is a Machine Learning Engineer with Capital One Bank, to discuss how to get a job at Capital One, the types of skills they are looking for, and what his typical day looks like. David is a strategic-minded analyst with progressive experience in research, analytics, modeling, process improvement, machine learning, and developing data products. He is organized, detail-oriented, and driven, and he takes initiative. David works well in a team environment, is adept at delivering presentations and public speaking, and has excellent interpersonal, written, and oral communication skills. He holds a Ph.D. in materials science from RWTH Aachen University and attended a bootcamp at the NYC Data Science Academy. The views expressed are solely his own and do not reflect those of his employer.

Daniel D. Gutierrez – Managing Editor, insideBIGDATA

insideBIGDATA: Tell us about your background. Did it naturally lead you to data science? If not, what gave you the push to enter the field?

David Steinmetz: I studied materials science, which is a blend of math, physics, and chemistry. I noticed that programming was integral to solving many of the materials science problems, which also tended to be steeped in sophisticated math. I used a genetic algorithm during my undergrad and a particle swarm algorithm during my PhD. After I graduated, I joined a management consultancy and learned about business and data analysis at companies. That background coupled with a genuine interest in computers naturally led to data science.

insideBIGDATA: What were things that made you consider a bootcamp? Were there other things you were considering at the time?

David Steinmetz: I took online data science courses to get my feet wet. Upon realizing that I really enjoyed the work, I looked at how to learn as much material as possible as quickly as possible. A friend mentioned the bootcamps, and I quickly decided it was the right move for me. I had been considering jobs in materials science, but data science drew me in.

insideBIGDATA: What skills were most useful in helping you land your position at Capital One?

David Steinmetz: The skills that were most useful were practical experience with a number of machine learning algorithms, project work shown on GitHub, and a working knowledge of data structures and algorithms. The question an interviewer is really asking is “Can this person do the job?” The more project work you have on your public profile, the less of a risk it will be to hire you, because the hiring manager can already see your abilities.

insideBIGDATA: What are the tools you find most relevant in your position? What are the skills you thought that were most important?

David Steinmetz: Python, AWS, GitHub, Scala, and Spark are the tools that are most relevant to my current position and project. I use Pandas and Spark Datasets often, and GitHub always. I thought R would be used more, but it’s not, because it’s harder to use R in production. I also thought I would rely more on the standard machine learning libraries, but we don’t hesitate to implement an algorithm that doesn’t exist in scikit-learn or MLlib if it suits our purposes.

insideBIGDATA: Can you describe your day to day job as a Data Scientist?

David Steinmetz: Often I spend time reading original research papers and books in an attempt to find state-of-the-art approaches to the problem I am trying to solve. The rest of the time is spent coding, visualizing data, bouncing ideas off colleagues, and creating new products to meet our clients’ needs. I use cloud services and open source software extensively, allowing me to iterate quickly and try new approaches.

insideBIGDATA: What do you find most enjoyable about your job?

David Steinmetz: It’s varied, mentally challenging, and at the cutting edge of implemented machine learning. The people I work with are amazingly fascinating, and it’s motivating and an honor to be able to work with them.

insideBIGDATA: What are skills your team looks for in a Data Scientist?

David Steinmetz: We look for someone who is curious, passionate, and well-rounded in the sense that they have experience both with data engineering and distributed systems as well as data science and machine learning. Since we work so much in the cloud, knowledge of cloud services is a plus. A lot of work is done in Scala and Java, so knowledge of one of those two also helps.

insideBIGDATA: What advice do you have for people looking to enter the field?

David Steinmetz: There is so much to learn in the field, so pick one thing and learn it well before moving on to another. Learning many things superficially will backfire once you get into the interview or onto the job. There, deep understanding and the capacity for further learning are necessary. A bootcamp is a great way to both gain that deep understanding and cover the breadth of material necessary to get you started in the field. Whatever you do, get advice on what to learn; otherwise, what you are learning might not be best suited to your situation.

