People acquire language through social interaction. Computers learn linguistic models from data, and increasingly, from language-based exchange with people. How do computational linguistic techniques and interactive visualizations work in concert to improve linguistic data processing for humans and computers? How can statistical learning models be best paired with interactive interfaces? How can the increasing quantity of linguistic data be better explored and analyzed? These questions span statistical natural language processing (NLP), human-computer interaction (HCI), and information visualization (Vis), three fields with natural connections but infrequent meetings. Vis and HCI remain niche topics within NLP, while Vis and HCI research has not fully utilized the statistical techniques developed in NLP. This workshop aims to assemble an interdisciplinary community that promotes collaboration across these fields.

Workshop Themes

Three themes will define this workshop.

1. Active, Online, and Interactive Machine Learning

Statistical machine learning (ML) has yielded tremendous gains in coverage and robustness for many tasks, but there is a growing sense that additional error reduction might require a fresh look at the human role. In current ML research, human input is often restricted to passive annotation. However, the fields of ML and HCI are both developing new techniques (such as active learning, incremental/online learning, and crowdsourcing) that attempt to engage people in novel and productive ways.

The first theme of this workshop focuses on advancing interactive machine learning. How do we jointly solve the learning questions that have been the domain of NLP and address research topics in HCI such as managing human workers and increasing the quality of their responses?

2. Language-based User Interfaces

NLP techniques have entered mainstream use, but the field currently focuses more on building and improving systems and less on understanding how users interact with them in real-world environments. User interface (UI) design decisions can affect the perceived or actual performance of a system. For example, while machine translation (MT) quality has improved considerably over the last decade, studies have found that human translators disliked MT output for reasons unrelated to translation quality. Many existing systems present sentence-level translations in the absence of relevant context and disrupt rather than contribute to a translator's workflow.

The second theme of this workshop focuses on improving people's ability to engage with language-based UIs and work with linguistic data. How do we best integrate learning methods, user behavior understanding, and human-centered design methodology?

3. Text Visualization and Analysis

The quantity and diversity of linguistic corpora are growing. Recent work on visualizing text data annotated with linguistic structures (e.g., syntactic trees, hypergraphs, and sequences) has produced tools that enable exploration of thematic and recurrence patterns in text. Visual representations built on the outputs of word-level models (e.g., sentiment classifiers, topic models, and continuous word embedding models) now power exploratory analysis of legal documents, political text, and social media content. Beyond adding analytic value, interactive visualization can also reduce the upfront effort needed to set up, configure, and learn a tool, as well as promote adoption.

The third theme of this workshop centers on improving the utility and accessibility of text analysis tools. How do we pair appropriate NLP techniques and visualizations to assist both expert and non-technical users, who encounter a growing amount of linguistic data in their professional and everyday lives?