Participants did not need to build a new system for the challenge; they could instead report on the use of existing systems. All entries had to use the challenge data set. Interface sketches, mockups, wireframes, and the like were not permitted.

Data

The data set used was the New York Times (NYT) Annotated Corpus, a collection of over 1.8 million articles published by The NYT between January 1, 1987 and July 19, 2007 and annotated with rich metadata. Use of the NYT corpus for the HCIR challenge was appealing for several reasons:

The content is broadly accessible without any special domain expertise.

The annotations are detailed enough to support rich interactive approaches without requiring sophisticated information extraction techniques.

The size of the collection is large enough to be interesting without being so large as to cause scale challenges.

The focus of the challenge was on the development or use of interactive techniques, not on data wrangling. As such, we indexed the collection and provided a baseline retrieval system. The NYT corpus was available to challenge participants free of charge thanks to the generosity of the Linguistic Data Consortium (LDC). We are very grateful to the LDC for covering the cost of shipping the NYT corpus to challenge participants.

Baseline

A baseline search system for The NYT corpus can be built using Solr. Solr scripts for building a searchable index of The NYT corpus are available here.
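As a rough illustration of how such a baseline might be queried, the sketch below builds a request URL for Solr's standard /select handler. The host, port, core name ("nyt"), and field name ("body") are assumptions made for this example and are not taken from the challenge scripts; only the q, rows, and wt parameters are standard Solr query parameters.

```python
from urllib.parse import urlencode

# Assumptions (not from the challenge materials): a Solr instance at
# localhost:8983, a core named "nyt", and a text field named "body".
SOLR_SELECT_URL = "http://localhost:8983/solr/nyt/select"


def build_query_url(terms, rows=10, field="body"):
    """Return a URL that asks Solr's /select handler for matching articles."""
    params = {
        "q": f"{field}:({terms})",  # search the (hypothetical) body field
        "rows": rows,               # number of results to return
        "wt": "json",               # ask for a JSON response
    }
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_query_url("subway crime"))
```

The function only constructs the URL, so it can be inspected without a running Solr server; fetching it with any HTTP client would return the matching documents if an index with these names existed.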

Task Scenarios

A pilot evaluation of the system was optional. Participants were requested to consider some or all task scenarios from a set of historical exploration tasks based on the NYT corpus:

Learn about a topic that has a long history:

Draw a rough chart of how subway crime in New York has varied over the past two decades.

Draw a rough chart of how the price of a slice of pizza in New York has varied over the past two decades.

Understand the competing perspectives on a controversial topic:

Enumerate the main arguments that have been made for and against rent control in New York.

Enumerate the main arguments that have been made for and against the impeachment of U.S. president Bill Clinton.

Answer a question that requires looking at more than one document:

Enumerate the major venues in New York City that offer free concerts.

Determine if a member of the Communist Party has ever held a legislative or executive post in New York State.
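For the charting scenarios above, one possible starting point is to count the articles matching a query in each year and plot those counts. The sketch below is a minimal example of that idea, assuming publication dates are available as "YYYY-MM-DD" strings; the function names and sample dates are hypothetical. Article counts are only a proxy for the underlying quantity (coverage of subway crime is not the crime rate itself), so a user would still need to read the retrieved articles.

```python
from collections import Counter


def articles_per_year(publication_dates):
    """Count articles per year from 'YYYY-MM-DD' date strings."""
    return Counter(int(date[:4]) for date in publication_dates)


def text_chart(counts, first_year, last_year):
    """Render one line per year, with one '#' per matching article."""
    lines = []
    for year in range(first_year, last_year + 1):
        bar = "#" * counts.get(year, 0)
        lines.append(f"{year} {bar}".rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up dates for illustration only; not drawn from the corpus.
    sample = ["1990-03-01", "1990-07-15", "1992-01-20"]
    print(text_chart(articles_per_year(sample), 1990, 1992))
```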

Challenge Reports

Each participant in the HCIR challenge submitted a four-page challenge report describing their work. All accepted challenge papers were included in the proceedings. At the workshop, participants presented their systems so that attendees could evaluate them based on the following HCIR evaluation criteria:

Effectiveness: Is a user able to complete the task?

Efficiency: How efficiently does the user complete the task?

Control: To what extent does the system give the user control over the information seeking process?

Transparency: Does the user understand what the system is doing?

Guidance: How much direction does the system provide to help the user refine their search strategy or reach their search goal?