Revision as of 02:06, 25 June 2012

This page lists sets of structured data to be used as input for natural language generation tasks, or to inform research on NLG.

Focus on content selection, aggregation

SumTime Meteo

These data contain predictions for meteorological parameters such as precipitation, temperature, wind speed, and cloud cover at various altitudes, at regular intervals for various points in the area of interest.

The weather corpus currently exists as an Access database and, alternatively, in the form of CSV (ASCII) files.

CLASSiC is a project on Computational Learning in Adaptive Systems for Spoken Conversation. Its Wizard-of-Oz corpus on Information Presentation in Spoken Dialogue Systems records the wizards' choices of information presentation strategy (summary, compare, recommend, or a combination of those) and of attribute selection. The domain is restaurant search in Edinburgh. Both objective measures (such as dialogue length, number of database hits, and number of sentences generated) and subjective measures (the user scores) were logged.

Focus on generating referring expressions

Referring expression generation is a sub-task of NLG with an active research community.

COCONUT Corpus

COCONUT was a project on “Cooperative, coordinated natural language utterances”. The COCONUT corpus is a collection of computer-mediated dialogues in which two subjects collaborate on a simple task, namely buying furniture. SGML annotations were added according to the COCONUT-DRI coding scheme. (direct download link)

GRE3D3: Spatial Relations in Referring Expressions

A Web-based production experiment was conducted by Jette Viethen under the supervision of Robert Dale. The resulting corpus contains 720 referring expressions for simple objects in simple 3D scenes. (direct download link)

TUNA Reference Corpus

The TUNA Reference Corpus (http://www.csd.abdn.ac.uk/~agatt/tuna/corpus/) is a semantically and pragmatically transparent corpus of identifying references to objects in visual domains. It was constructed via an online experiment and has since been used in a number of evaluation studies on referring expression generation, as well as in two Shared Tasks: the Attribute Selection for Referring Expressions Generation task (2007) and the Referring Expression Generation task (2008). Main authors: Kees van Deemter, Albert Gatt, Ielka van der Sluis. (direct download link: http://www.csd.abdn.ac.uk/~agatt/tuna/corpus/corpus.zip)

Focus on lexicalization

...

Focus on syntax, realization

...

This page was imported semi-automatically from the NLG Resources Wiki which was run by ACL SIGGEN in the years 2005–2009. Please correct conversion errors and help update its contents.