An Gramadóir

This is the home page for An Gramadóir,
an open source grammar checking engine. It is intended
as a platform for the development of sophisticated
natural language processing tools for languages with limited
computational resources.
It is currently implemented for the Irish language (Gaeilge);
this is, to the best of my knowledge, the first grammar checker
developed for any minority language. Ports for Afrikaans, Akan,
Cornish, Esperanto, French, Hiligaynon, Icelandic, Igbo,
Languedocien, Scottish Gaelic, Samoan,
Tagalog, Walloon, and Welsh
are currently underway; see the
Developers' Guide for more information
on porting.

The word gramadóir is Irish for "grammarian" or
"grammar expert". If you're curious about the pronunciation, you can
now listen to the word as it's pronounced by the wonderful Irish
speech synthesizer abair.ie.

News

(June 2010): Our friends at the site scriobh.ie are seeking volunteers to help integrate
An Gramadóir into OpenOffice.org Writer.

(March 2010): The first edition of Gaelscéal carries an article about "Anois", a package that combines the grammar checker and my thesaurus.

Portable. As of version 0.5, the core engine is
written entirely in Perl, which means it will run on just about
anything that plugs in.
It has been tested on a variety of
platforms and operating systems:
Linux (x86, ppc, amd64),
*BSD (x86),
Sun Solaris,
DEC Alpha,
Mac OS X,
and Windows.

Modular. The Perl module provides separate interfaces
for sentence segmentation, spell checking, part-of-speech tagging,
and grammar checking. These components provide a platform for the development of applications for much more complex natural language processing tasks
(e.g. parsing, machine translation).
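
To illustrate what separate stages buy you, here is a minimal sketch in Python (not the Perl module itself) of a segmenter and a spell checker that can be called on their own or chained. The function names, the naive sentence-splitting rule, and the toy lexicon are all assumptions made for illustration; they are not the interfaces of An Gramadóir.

```python
import re

# Hypothetical stage names; these are NOT the functions of the real Perl module.

def segment_sentences(text):
    """Split text after '.', '!' or '?' followed by whitespace (a naive rule)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def tokenize(sentence):
    """Return runs of letters, optionally joined by apostrophes or hyphens."""
    return re.findall(r"[^\W\d_]+(?:['-][^\W\d_]+)*", sentence)

def spell_check(sentence, lexicon):
    """Return the tokens whose lowercase form is missing from the lexicon."""
    return [t for t in tokenize(sentence) if t.lower() not in lexicon]

def check_text(text, lexicon):
    """Chain the stages: one (sentence, unknown_words) pair per sentence."""
    return [(s, spell_check(s, lexicon)) for s in segment_sentences(text)]

if __name__ == "__main__":
    toy_lexicon = {"tá", "an", "lá", "go", "maith", "dia", "duit"}
    for sentence, unknown in check_text("Tá an lá go maith. Dia dhuit!", toy_lexicon):
        print(sentence, unknown)
```

Because each stage takes and returns plain values, a later tool such as a parser could reuse `segment_sentences` without touching the spell checker.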

Easy to use. There is a simple command line interface
that reads text from standard input and writes errors to standard
output (you can see some actual output on
the Sampler page).
Or, you can try the software now using the
Web Interface.
There are also interfaces that allow the grammar checker to
be called from the text editors Emacs and Vim, and from OpenOffice.org.
Last but not least, if you have a Mac you can try
the Java front-end "Ceart"
developed by Cruinneog.

Corpus-based. Various components of the engine can be
bootstrapped from corpora harvested by my web crawling
software An Crúbadán. This is
essential for languages with severely limited resources, allowing
rapid development with a minimum of effort.

Easy to develop. I've tried to design the language
developers' pack so that no programming experience is needed.
All Perl code is generated automatically from a number
of (hopefully simple) plain text input files.
This is especially important for minority languages, where in
many cases there is a lack of trained linguists, software
developers, or both.

Scalable. With as little as an hour or two of work (editing
word lists output by my web crawler) a developer can have a decent
spell checking package up and running. On the other hand, the
engine is flexible enough to allow for full-scale grammar checking
(as evidenced by the rococo Irish version).
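
As a rough picture of the quick end of that scale, the sketch below (Python, for illustration only) turns raw corpus text into a candidate word list that a developer could then edit by hand. The frequency threshold and the name `build_word_list` are my assumptions; this is not how An Crúbadán produces its lists.

```python
from collections import Counter
import re

def build_word_list(corpus_text, min_count=2):
    """Return the sorted lowercase words seen at least min_count times.

    The threshold is an assumed heuristic: rare forms are more likely to be
    typos, so they are left out and can be reviewed separately.
    """
    words = re.findall(r"[^\W\d_]+", corpus_text)  # runs of Unicode letters
    counts = Counter(w.lower() for w in words)
    return sorted(w for w, n in counts.items() if n >= min_count)

if __name__ == "__main__":
    corpus = "Tá an cat ann. Tá an madra ann. Ta an bó ann."
    print(build_word_list(corpus))
```

With this toy corpus the one-off form "Ta" is excluded, but so are the valid words "cat", "madra" and "bó", which is why the hand-editing step still matters.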

Language independent (more or less). Most open source
language technology is designed with (Indo-)European languages in mind.
To counter this trend, I've included things like full Unicode support and
better support for
the rich morphological phenomena found in many non-European languages.

Free Software.
It is released under the
GNU General Public License,
which means (roughly speaking)
that you are free to copy, modify, or even sell this
software, as long as redistributed versions preserve these same freedoms.
Please consider
contributing to
help improve later releases.