End-to-end Goal-oriented Question Answering Systems

As an interesting research problem in conversational AI, Question Answering (QA) has recently been surveyed from various perspectives [1]. In this tutorial, we focus on goal-oriented QA systems [2, 3, 4], which aim to guide users through a specific task, such as finding a job or booking a ticket, via a conversational bot, in contrast to the large number of social bots in Alexa [5]. We also cover the engineering designs and all of the components involved in building an end-to-end system in practice.

Depending on the setting of answer retrieval, QA systems can be categorized as structured data based systems [6, 16, 17, 18] and unstructured data based systems [7, 15]. Structured data based QA systems produce answers from a closed, structured data source, such as a modern knowledge graph or a traditional database. Unstructured data based QA systems extract answers from free text by combining information retrieval and information extraction. Structured data based QA systems can provide well-formatted, accurate answers in closed domains and are thus more suitable for goal-oriented bots, whereas unstructured data based QA systems can cover a wider range of user intents via search. In industry, a practical goal-oriented bot leverages both, with a focus on structured data based QA systems.
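The combination described above can be sketched as a structured lookup with an unstructured fallback. This is a minimal illustration under stated assumptions, not the system presented in the tutorial: the table `TICKET_TABLE`, the passages in `PASSAGES`, and all function names are hypothetical toy data and helpers, and simple word overlap stands in for a real retrieval and extraction pipeline.

```python
import re
from typing import Optional, Tuple

# Hypothetical structured source: facts keyed by (entity, attribute).
TICKET_TABLE = {
    ("flight 101", "departure"): "09:30",
    ("flight 101", "gate"): "B4",
}

# Hypothetical unstructured source: free-text passages.
PASSAGES = [
    "Tickets can be refunded up to 24 hours before departure.",
    "Each passenger may bring one carry-on bag.",
]


def tokens(text: str) -> set:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def answer_structured(entity: str, attribute: str) -> Optional[str]:
    """Exact lookup in the closed, structured source."""
    return TICKET_TABLE.get((entity, attribute))


def answer_unstructured(question: str) -> Optional[str]:
    """Return the passage sharing the most words with the question, if any."""
    q = tokens(question)
    best, best_overlap = None, 0
    for passage in PASSAGES:
        overlap = len(q & tokens(passage))
        if overlap > best_overlap:
            best, best_overlap = passage, overlap
    return best


def answer(question: str, entity: Optional[str] = None,
           attribute: Optional[str] = None) -> Tuple[str, str]:
    """Prefer the structured source; fall back to free-text search."""
    if entity is not None and attribute is not None:
        hit = answer_structured(entity, attribute)
        if hit is not None:
            return ("structured", hit)
    hit = answer_unstructured(question)
    if hit is not None:
        return ("unstructured", hit)
    return ("none", "")
```

Calling `answer("When does flight 101 leave?", "flight 101", "departure")` is served from the table, while `answer("Can tickets be refunded?")` has no parsed entity and falls through to the passages. Trying the structured source first mirrors the stated focus on structured data, with search used only to widen coverage.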

Structured data based QA systems semantically parse the question into a formal query that can be executed against a knowledge graph or database. Depending on the semantic parsing process, structured data based QA systems are grouped into ontology-based systems [8] and intent classification based systems [9]. Ontology-based QA systems take questions and an ontology as input, and associate question features with answer patterns of a knowledge base via the ontology file. Despite their generic framework for semantic understanding, ontology-compliant systems [10] tend not to outperform simpler models on public benchmark datasets [13, 14]. Intent classification based systems, on the other hand, simply classify questions into the predefined answer patterns in a knowledge base that goal-oriented bots need. The challenge then boils down to preparing training data for each predefined answer pattern, which can be tackled more easily in an industrial environment.
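To make the intent classification route concrete, here is a minimal sketch that maps a question to one of a few predefined answer patterns, each represented by a query template. Everything in it is an assumption for illustration: the intents, the example questions in `TRAINING`, the SQL-like strings in `QUERY_TEMPLATES`, and the word-count scoring are hypothetical, and a production system would use a trained classifier and fill the template's parameters from the question.

```python
import re
from collections import Counter
from typing import Dict, List, Optional

# Hypothetical training data: a few example questions per predefined intent.
TRAINING: Dict[str, List[str]] = {
    "job_search": [
        "find software engineer jobs",
        "show me open jobs for nurses",
        "jobs hiring designers",
    ],
    "book_ticket": [
        "book a ticket to boston",
        "reserve a flight ticket",
        "i need a train ticket",
    ],
}

# Hypothetical answer patterns: one formal query template per intent.
QUERY_TEMPLATES: Dict[str, str] = {
    "job_search": "SELECT title, company FROM jobs WHERE title LIKE :keyword",
    "book_ticket": "SELECT id, price FROM tickets WHERE destination = :keyword",
}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(training: Dict[str, List[str]]) -> Dict[str, Counter]:
    """Count how often each token appears in each intent's examples."""
    return {
        intent: Counter(tok for ex in examples for tok in tokenize(ex))
        for intent, examples in training.items()
    }


def classify(question: str, model: Dict[str, Counter]) -> Optional[str]:
    """Pick the intent whose examples best cover the question's tokens."""
    toks = tokenize(question)
    scores = {
        intent: sum(counts[t] for t in toks) / sum(counts.values())
        for intent, counts in model.items()
    }
    best = max(scores, key=scores.get)
    # Abstain when no token of the question was seen for any intent.
    return best if scores[best] > 0 else None


def to_query(question: str, model: Dict[str, Counter]) -> Optional[str]:
    """Map a question to the query template of its predicted intent."""
    intent = classify(question, model)
    return QUERY_TEMPLATES[intent] if intent is not None else None
```

With `model = train(TRAINING)`, a call such as `to_query("please book me a ticket", model)` selects the ticket template. The sketch also shows why training data matters: adding a new answer pattern here means adding example questions for it, not changing the parser.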

In this tutorial, we first introduce a variety of QA systems based on knowledge graphs and intent classification proposed by pioneering researchers, so that the audience can see which common technical components remain challenging and which engineering heuristics are useful. For the first time, the audience can learn in depth not only the scientific methods that boost the precision of question understanding and answer retrieval / generation [11], but also our practical experiences and the engineering designs that enable an end-to-end system [12]. Then, drawing on three real LinkedIn scenarios, we share our hands-on experience with the end-to-end process of building goal-oriented bots, including problem analysis from scratch, architecture design, training data collection, paraphrase generation, intent modeling and dialogue management. Our goal is that, after this tutorial, the audience knows how to efficiently build a goal-oriented bot without getting stuck in unrealistic solutions.

Deepak Agarwal is the VP of Artificial Intelligence at LinkedIn. He is an expert in Artificial Intelligence technologies and engineering leadership with more than twenty years of experience developing and deploying state-of-the-art machine learning and statistical methods for improving the relevance of web applications. He has held a range of positions, serving as chief scientist of large projects and managing both small, highly technical teams and large teams. He is experienced in conducting novel scientific research to solve notoriously difficult AI problems. He is a Fellow of the American Statistical Association, a member of the Board of Directors of SIGKDD, a past program chair of KDD, and an associate editor of two top-tier journals in statistics. He regularly serves on senior program committees of top-tier conferences such as KDD, NIPS, CIKM, ICDM, SIGIR and WSDM.

Bee-Chung Chen is a Principal Staff Engineer at LinkedIn with extensive industrial and research experience in recommender systems, some of which is summarized in the book Statistical Methods for Recommender Systems. He currently leads the development of LinkedIn's machine learning technology. He was a key designer of the recommendation algorithms that power the LinkedIn news feed, the Yahoo! homepage, Yahoo! News and other sites. His research interests include recommender systems, machine learning and big data processing.

Qi He is a Director of Engineering at LinkedIn, leading a team of machine learning scientists, software engineers and linguists to standardize LinkedIn data, build the LinkedIn Knowledge Graph, and develop next-gen AI technologies to enable Q&A and deeper conversations on the site. Before that, he managed the LinkedIn Feed Relevance team, with a focus on developing and deploying personalized machine learning and statistical methods for improving the relevance of the LinkedIn Feed. Prior to LinkedIn, he was a Research Staff Member at IBM Almaden Research Center until 2013. He completed two years of postdoctoral work on citation recommendations in CiteSeer at PSU from 2008 to 2010, and completed his PhD in information retrieval at NTU with a Microsoft Research Fellowship from 2005 to 2008. He was the General Chair of ACM CIKM 2013, is the Program Committee Chair of ACM CIKM 2019, serves as Associate Editor of IEEE Transactions on Knowledge and Data Engineering (TKDE) and Neurocomputing Journal, and has regularly served on the (senior) program committees of SIGKDD, SIGIR, WWW, CIKM and WSDM for 10+ years. He is a member of the Board of Directors of ACM CIKM, a senior member of ACM and a senior member of IEEE. He received the 2008 SIGKDD Best Application Paper Award and has consistently published in top-tier international conferences and journals.

Jaewon Yang is a staff software engineer at LinkedIn, where he leads various projects on developing cutting-edge AI techniques for the LinkedIn Knowledge Graph, including Q&A models. Jaewon obtained his Ph.D. degree from the Computer Science department of Stanford University in 2014, and a Master's degree in Statistics from Stanford University in 2012. He received an honorable mention for the SIGKDD Doctoral Dissertation Award in 2014 and the Best Application Award at ICDM 2010.

Liang Zhang is currently a Principal Staff AI Researcher at LinkedIn, where he has led many critical AI projects to success and significantly improved the experience of LinkedIn's 500M+ professional users through cutting-edge AI technology. Liang obtained his Ph.D. degree from the Department of Statistical Science, Duke University in 2008, worked at Yahoo! Labs as a Scientist from 2008 to 2012, and has been working at LinkedIn since 2012. Liang has published extensively in top-tier computer science conferences as well as statistics journals, and has co-authored 20+ AI-related patents. His research mainly focuses on bringing cutting-edge AI technologies to user-facing products at scale. Liang has also served as a program committee member for various data mining and machine learning venues.

LinkedIn AI Department
