Fri Feb 21 1997

Disc: Parsers


>After reading the recent postings on FUNKNET about the parser
>challenge, I went to the Ergo parser site and tried it out. I was
>particularly interested since I have worked with the Link Grammar
>parser extensively, and other parsers, and so I have a pretty good
>idea what the state of the art looks like.
>
>The functionality built into the Ergo interface is very nice:
>certainly it is an advantage, for the purposes of evaluating parsers,
>to be able to get the grammatical analysis output directly in a
>simple and easily understood format. And such functionalities as
>getting transformational variants of sentences (especially
>question-answer pairs) are of obvious commercial benefit. (Though there
>are certainly other sites with such functionality.
I thought we had a very good collection of all the parsers that are on
the web, yet I have not seen any besides ours with such functionality.
If someone could send us these addresses we would appreciate it.
>Usually, though,
>that is something built for a particular application on top of a
>parser engine, rather than being built into the parser. It would be
>nice as a standard parser feature though.)
Certainly, that makes sense, but then it is more difficult for others
to look at and judge the state of the art. This more easily read
format allows the world to see where the technology stands.
>
>Leaving that aside, I found the performance of the Ergo parser
>substantially below state of the art in the most important criterion:
>being able to parse sentences reliably - at least, judging by the web
>demo (though there are some risks in doing so, of course, since it is
>always possible that performance problems are the result of incidental
>bugs rather than the fundamental engine or its associated database.)
This may or may not be so. What's important (and the main point of
the original challenge) is to compare parsers using the same
sentences. A report of the Ergo parser alone is not sufficient. Users
need to try these same sentences on ours and the other parsers to get
a good sense of the state of the art, and the advantages and
disadvantages of different parsers.
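As a rough illustration of that "same sentences" comparison, a harness
like the following could run one sentence list through several parsers
and record success and timing. Everything here is a hypothetical
stand-in, not any real parser's API: any callable that returns a parse
(or None on failure) can be plugged in.

```python
import time

TEST_SENTENCES = [
    "Part of the problem was that our dictionary was corrupted.",
    "Even minor repairs take months.",
]

def toy_parser(sentence):
    # Placeholder "parser": succeeds on anything of 14 words or fewer.
    words = sentence.split()
    return words if len(words) <= 14 else None

def compare(parsers, sentences):
    """Run every parser on the same sentences; record success and time."""
    results = {}
    for name, parse in parsers.items():
        for s in sentences:
            start = time.perf_counter()
            tree = parse(s)
            elapsed = time.perf_counter() - start
            results[(name, s)] = (tree is not None, elapsed)
    return results

report = compare({"toy": toy_parser}, TEST_SENTENCES)
```

With real parse functions substituted for `toy_parser`, the same table
would show exactly the per-sentence successes and timings reported in
the quoted test above.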
>Quite frankly, though, the self-imposed limitation of 12-14 words
>concerned me right off the bat, since most of the nastiest problems
>with parsers compound exponentially with sentence length. But I
>decided to try it out within those limitations.
Our reason for the limit is that that is the size at which we can
handle the very thorough analysis required by our web site. While
other parsers may provide large, complex tree diagrams or labelled
bracketings, the theories used to create these complex trees do not
lend themselves to practical applications; thus, on long or short
sentences they cannot reliably label parts of the sentence, tense,
and so on. Most importantly, they cannot manipulate strings of any
size (e.g. change active to passive or statement to question). We
currently do analyze much larger sentences than we allow on our web
site, but we cannot yet do so for all the criteria imposed by our
challenge. As soon as we can, we will offer larger sentences on the
site. At the time of the challenge we anticipated that it would take
approximately six months to get to these longer sentences, but we
have been fortunate enough to bring that time down to a number of
weeks, so we should be able to offer our very thorough analysis of
longer strings as well in about eight weeks.
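To make concrete what a string manipulation such as "statement to
question" involves at its very simplest, here is a toy sketch using
naive auxiliary fronting. It is an illustrative assumption, not the
Ergo method: a real parser works from a full analysis, while this
handles only simple declaratives built on "was".

```python
def statement_to_question(sentence):
    """Front the auxiliary 'was' in a simple declarative sentence."""
    words = sentence.rstrip(".").split()
    if "was" not in words:
        return None  # no auxiliary this toy rule knows how to front
    i = words.index("was")
    if i == 0:
        return None  # no subject to invert around
    subject, rest = words[:i], words[i + 1:]
    subject[0] = subject[0].lower()  # crude; mishandles proper nouns
    return " ".join(["Was"] + subject + rest) + "?"
```

Even this trivial rule shows why such manipulations require a parse:
without knowing which words form the subject, the inversion cannot be
done reliably for anything beyond the simplest clauses.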
>
>As a practical test, I took one of the emails sent out from Ergo, and
>tried variants of the sentences in it. By doing this, I avoided the
>trap of trying simple garden-variety "example sentences" (which just
>about any parser can handle) in favor of the variety of constructions
>you can actually get in natural language text. But I reworded it
>slightly where necessary to eliminate fragments and colloquialisms and
>to get it into the 12-14 word length limit. That meant in most cases I
>had to try a couple of variants involving parts of sentences, since
>most of the sentences in the email were over the 12-14 word limit.
>
>Here were the results:
>
>I didn't realize it but our head programmer was here last night.
> -- did not parse
>
>I fixed the sentences that Mr. Sleator said didn't work.
> -- failed to return a result at all within a reasonable time;
>    I turned it off and tried another sentence after about ten minutes.
>
>Our verb section of our dictionary on the web was corrupted.
> -- parsed in a reasonable time.
>
>Part of the problem was that our dictionary was corrupted.
> -- took 74.7 seconds to parse.
>
>It is easy for us to update and repair problems with our parser.
> -- again, it failed to return a result in a reasonable time.
>
>This is something that most others cannot handle.
> -- did not parse.
>
>Even minor repairs take months.
> -- again, it failed to return a result in a reasonable time.
I am not sure when these were a problem, but I have run them all and
they work just fine.
>
>I am not particularly surprised by these results. Actual normal use of
>language has thousands of particular constructions that have to be
>explicitly accounted for in the lexicon, so even if the parser engine
>Ergo uses is fine, the database could easily be missing a lot of the
>constructions necessary to handle unrestricted input robustly. Even the
>best parsers I have seen need significant work on minor constructions;
>but these sentences ought to parse. They are perfectly ordinary English
>text (and in fact all but one parses in less than a second on the
>parser I am currently using).
Yes, we agree, and they do. However, the real test of any parser is
not against the ideal for parsers but against the state of the art
today, and this should be done by individuals running the same
sentences on different parsers.
>No doubt the particular problems causing trouble with these sentences
>can be fixed quickly (any parser which properly separates parse engine
>from rule base should be easy to modify quickly) but the percentage of
>sentences that parsed suggests that there's a fair bit of work left to
>be done here.
Yes, actually we agree. But that is true for all parsers, and the
challenge, I believe, stimulates growth in all of them.
The main point of our challenge was to stimulate discussion and
awareness of the state of the art of parsing technology by proposing
standards that could be understood and judged by members inside the CL
community and outside of it. That is, we felt it necessary for all to
look at how a parser works in terms of analysis, interpretation, and
the manipulation of strings; thus, we felt it reasonable to propose
that a parser be able to label parts of speech and parts of the
sentence, tense and voice, internal clauses, and sentence type as well
as perform some standard set of manipulations such as changing active
to passive and statement to question. Further, the criteria were
chosen to allow a parser to demonstrate its usefulness for the
development of practical applications. It is important to note
that no parser in existence today comes up to the level of what an
outsider might call the "ideal" for parsing technology. It is
important to recognize that parsers are at the level of biplanes here,
not spaceships. Many of the critical comments we have received tended
to compare us against some sort of unstated ideal rather than with the
state of the art as judged by looking at other parsers using the same
criteria.
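Two of the proposed criteria, labeling sentence type and voice, can be
sketched with naive surface rules, just to show what is being asked
for. The word lists below are illustrative assumptions; a real parser
derives these labels from a full grammatical analysis rather than from
string patterns.

```python
BE_FORMS = {"is", "are", "was", "were", "be", "been", "being"}
PARTICIPLES = {"corrupted", "fixed", "repaired", "handled", "parsed"}

def sentence_type(sentence):
    """Statement vs. question, judged only by final punctuation."""
    return "question" if sentence.rstrip().endswith("?") else "statement"

def voice(sentence):
    """Passive if a form of 'be' directly precedes a known participle."""
    words = sentence.lower().rstrip(".?!").split()
    for aux, nxt in zip(words, words[1:]):
        if aux in BE_FORMS and nxt in PARTICIPLES:
            return "passive"
    return "active"
```

The gap between these toy heuristics and reliable labeling over
unrestricted text is precisely the gap the challenge criteria are
meant to measure.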
In any case, we appreciate the comments and will use them to improve
our parser.
Phil Bralich