David Talby
CTO, Pacific AI

David Talby is the chief technology officer at Pacific AI, helping fast-growing companies apply big data and data science techniques to solve real-world problems in healthcare, life science, and related fields. David has extensive experience in building and operating web-scale data science and business platforms, as well as building world-class, Agile, distributed teams. Previously, he was with Microsoft’s Bing Group, where he led business operations for Bing Shopping in the US and Europe. Earlier, he worked at Amazon both in Seattle and the UK, where he built and ran distributed teams that helped scale Amazon’s financial systems. David holds a PhD in computer science and master’s degrees in both computer science and business administration.

Natural language processing is a key component in many data science systems that must understand or reason about text. David Talby offers an overview of the NLP library for Apache Spark, which natively extends Spark ML to provide open source, fully distributed, and optimized versions of state-of-the-art NLP algorithms. He covers the library's design and shares working code samples in PySpark.

To achieve high accuracy when reasoning about text, you generally need to understand specific languages, jargon, domain-specific documents, and writing styles. David Talby explains how to train custom word embeddings, named entity recognition, and question-answering models with the NLP library for Apache Spark.