== Project Reporter ==

Revision as of 18:04, 16 February 2009

The natural language generation systems listed below are available for download over the web.
If you know of a system which is not listed here, please click on Edit in the upper left corner of this page and add the system yourself.

GenI

surface realiser for (Feature-Based Lexicalised) Tree Adjoining Grammar and a flat MRS-like semantics (sans top handle and underspecification). Toy example grammars provided for English and French. Largish core grammar for French is under development (contact us for details). GPL, known to work under Linux and Mac OS X (potential for making it work on Windows as well). Written in Haskell. Source code available via darcs.

Grammar Explorer

provides a means of exploring large-scale systemic-functional grammars in order to see how they are
organized and what kinds of things they cover. It can be used to explore the KPML resources.
Downloadable standalone executables of the grammar explorer are available for Windows 95/98/NT.
These already include a version of the Nigel grammar of English and pre-installed examples.

HALogen

HALogen is a general-purpose natural language generation system developed by Irene Langkilde-Geary and Kevin Knight at the USC Information Sciences Institute.
The download package consists of the symbolic generator, the forest ranker, and some sample inputs. The symbolic generator includes the Sensus Ontology dictionary (which is based on WordNet). The forest ranker includes a 250-million-word n-gram language model (unigram, bigram, and trigram) trained on WSJ newspaper text. The symbolic generator is written in Lisp and requires a Common Lisp interpreter.
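
The idea of ranking candidate realisations with an n-gram model can be illustrated with a small sketch. This is not HALogen's code or its forest representation: it is a toy bigram model with add-one smoothing over a three-sentence corpus, and the function names (`train_bigram`, `log_prob`, `rank`) are hypothetical.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count context words and bigrams, padding with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        tokens = ["<s>"] + s.split() + ["</s>"]
        unigrams.update(tokens[:-1])              # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def log_prob(sentence, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a candidate string."""
    vocab = len(unigrams) + 1                     # rough vocabulary size (assumption)
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab))
        for a, b in zip(tokens, tokens[1:])
    )

def rank(candidates, unigrams, bigrams):
    """Order candidates from most to least probable under the model."""
    return sorted(candidates, key=lambda c: log_prob(c, unigrams, bigrams), reverse=True)

corpus = ["the dog barks", "the dog sleeps", "a cat sleeps"]
uni, bi = train_bigram(corpus)
best = rank(["dog the barks", "the dog barks"], uni, bi)[0]
```

A real ranker would score a packed forest of alternatives rather than enumerating strings, and would use a far larger model with better smoothing.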

KPML

The KPML system offers a robust, mature platform for large-scale grammar engineering that is particularly oriented to multilingual grammar development and generation. It is targeted at providing resources for realistic but broad-coverage generation applications, where both flexibility of expression and speed of generation are at issue, for example in online webpage generation or spoken dialogue. KPML is also used extensively in multilingual text generation research and for teaching. It is based on systemic functional linguistics.

The KPML system is a direct descendant of the Penman text generation system, developed further
in a multilingual direction through cooperative work between
the Komet (http://www.darmstadt.gmd.de/publish/komet/index.html)
project in Darmstadt and the Systemic Modelling Group
at Macquarie University. Downloadable standalone executables of the system are available for
PCs running Windows. The source code is written in ANSI Common Lisp and uses the
Common Lisp Interface Manager (CLIM).
The system has been compiled and tested
under Franz Allegro Common Lisp (4.2, 4.3, 4.3.1, 5.0, 6.0, 7.0)
for Unix, and under Franz Allegro Common Lisp 3.0
and Harlequin Lispworks 4.0, 4.1, 4.2 for Windows.
The system can also be used without the window interface, as a generator serving requests for generation across sockets or via files.

LKB

LKB (Linguistic Knowledge Builder) is a grammar engineering environment for unification-based formalisms, typically HPSG.
It includes a realiser that takes as input Minimal Recursion Semantics (MRS). LKB is implemented in Common Lisp, and is freely available under an open source license.

Multimodal Unification Grammar

MUG Workbench is a development and debugging tool for Multimodal NLG. The grammar formalism supported is
Multimodal Functional Unification Grammar (MUG). The MUG system runs MUG grammars with fixed (test cases)
and arbitrary input specifications to produce output in a natural language, graphical user interface and
possibly in other modes. It is designed to do three things:
- Multimodal Fission (distributing output to interaction/communication modes)
- Some sentence planning (choosing information to include in the utterance)
- Natural Language and graphical user interface realization (producing some form of output)
The MUG system does these three jobs in parallel. MUG Workbench can serve to inspect the data-structures
used during generation. It should help you to learn more about the nature of unification grammars used
for parsing or natural language generation. Furthermore, the MUG Workbench is helpful in debugging your grammars.

NaturalOWL

Generates descriptions of entities and classes from OWL ontologies that have been annotated with linguistic and user modeling resources expressed in RDF. Currently supports English and Greek. Extensions for other languages welcome. NaturalOWL can also be used as a Protégé plug-in. See here for publications describing NaturalOWL. (GPL)

OpenCCG

OpenCCG, the OpenNLP CCG Library (formerly Grok), is both a parser and a realizer for Combinatory Categorial Grammar. It has been used in several dialog systems. The realizer has been enhanced with n-gram models and a supertagging approach called hypertagging. OpenCCG is implemented in Java, and is freely available under the LGPL.

RSTTool

is a tool which allows you to graphically annotate the
rhetorical structure of your text. The structure can be saved in an XML format, or exported as
EPS versions of the structure diagram for inclusion in LaTeX, etc. Written in Tcl/Tk.
Runs on any machine.

Simplenlg

is an ultra-simple Java-based realiser. Its
grammatical coverage and syntactic knowledge are
minuscule compared to KPML or FUF/SURGE.
However, because it is so simple, it is relatively
easy for people to learn how to use it. It has
been used by many people in Aberdeen, and also
for teaching. It is set up as a Java package,
so it can only be used by Java programs.

Suregen-2

Suregen is “a hybrid, ontology based and NLG-oriented formalism for generating text for documents in clinical medicine.”
The system Suregen-2 is written in (Allegro) Common Lisp. A demo system which runs under Windows is available for download. A screencast video shows data being entered into computer forms using mouse and keyboard while a feedback text is continually updated and shown below. (Try playing the AVI file in VLC if you run into problems.) Perhaps this system could be considered an instance of the WYSIWYM approach.

SURGE

Syntactic realization package. (A CommonLisp package providing an interpreter for a functional
unification formalism called FUF and SURGE, a large grammar of English written in FUF.) Offers download of SURGE 2.2.
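
For readers unfamiliar with unification-based formalisms, the core operation can be sketched as a recursive merge of feature structures that fails when atomic values clash. The sketch below uses plain Python dictionaries and a hypothetical `unify` function; it does not reproduce FUF's syntax or its support for paths, alternation, or control annotations.

```python
def unify(a, b):
    """Unify two feature structures (nested dicts with atomic leaves).

    Returns the merged structure, or None if the two are incompatible.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)                      # shallow copy so the input is not mutated
        for key, value in b.items():
            if key in result:
                merged = unify(result[key], value)
                if merged is None:            # clash somewhere below this feature
                    return None
                result[key] = merged
            else:
                result[key] = value           # feature only present in b
        return result
    return a if a == b else None              # atoms unify only if identical

noun_phrase = {"cat": "np", "agr": {"num": "sg"}}
constraint = {"agr": {"per": 3}}
merged = unify(noun_phrase, constraint)
clash = unify({"agr": {"num": "sg"}}, {"agr": {"num": "pl"}})
```

Using None as the failure value keeps the sketch short; a fuller implementation would use a dedicated failure marker so that None could remain a legal feature value.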

SURG-SP

Systemic Unification Reusable Grammar for Spanish is a large scale
Spanish grammar allowing systems which already use FUF/SURGE for English NLG to be able
to generate syntactically (and many times semantically) equivalent text in Spanish when
new lexical items are introduced. SURG-SP makes use of inputs almost identical to the
English version Surge 2.3.

SURG-IT

TG/2

is a shallow verbalizer that can be quickly adapted to new domains and tasks.
It combines context-free grammars with templates and canned
text in a single formalism. Thus the granularity of the language model may depend on the application
needs. The system currently runs under Solaris 2.5. It is available freely under a research license.
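
The general idea of mixing context-free rules, templates, and canned text in one rule set can be sketched as follows. This is an illustration only, not TG/2's rule language: the `GRAMMAR` table, the `expand` function, and the weather example are all hypothetical.

```python
# Each symbol maps to one production. A list is a context-free rule over
# other symbols; a string is either canned text or a template with slots.
GRAMMAR = {
    "REPORT": ["GREETING", "FORECAST"],                                 # context-free rule
    "GREETING": "Good morning.",                                        # canned text
    "FORECAST": "Expect {condition} with a high of {high} degrees.",    # template
}

def expand(symbol, data):
    """Expand a symbol into text, filling template slots from data."""
    production = GRAMMAR[symbol]
    if isinstance(production, list):
        # Context-free rule: expand each child symbol and join the results.
        return " ".join(expand(child, data) for child in production)
    # Canned text has no slots, so format() returns it unchanged.
    return production.format(**data)

text = expand("REPORT", {"condition": "rain", "high": 12})
```

Because all three kinds of rule live in the same table, an application can start with coarse canned text and refine individual symbols into templates or grammar rules as needed.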

This page was imported semi-automatically from the NLG Resources Wiki which was run by ACL SIGGEN in the years 2005–2009. Please correct conversion errors and help update its contents.