Automatic Classification for English Verbs

Brief description

This resource is an automatically acquired classification of
English verbs. The classification was obtained by spectral
clustering, using subcategorization frames parameterized by
selectional preferences as features. The features and the method
are described in Sun and Korhonen (2008).
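
To illustrate the general idea of spectral clustering on verb
feature vectors, here is a minimal sketch in plain Python. It is
not the released clustering code: the cosine similarity, the
two-way split, the function names and the toy frame counts are all
assumptions made for this example. The actual similarity measure
and algorithm are those described in the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def spectral_bipartition(vectors, iterations=200):
    """Split items into two clusters using the second eigenvector
    of the symmetrically normalised affinity matrix."""
    n = len(vectors)
    # Affinity matrix from pairwise cosine similarity (an assumption).
    w = [[cosine(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in w]
    # Normalised affinity, shifted so all eigenvalues lie in [0, 1].
    m = [[0.5 * (w[i][j] / math.sqrt(deg[i] * deg[j]) + (1.0 if i == j else 0.0))
          for j in range(n)] for i in range(n)]
    # The leading eigenvector is known in closed form: sqrt(degree).
    norm = math.sqrt(sum(deg))
    top = [math.sqrt(d) / norm for d in deg]
    # Power iteration with deflation converges to the second eigenvector.
    x = [(-1.0) ** i * (i + 1) for i in range(n)]  # deterministic start
    for _ in range(iterations):
        proj = sum(a * b for a, b in zip(x, top))
        x = [a - proj * b for a, b in zip(x, top)]
        x = [sum(m[i][j] * x[j] for j in range(n)) for i in range(n)]
        length = math.sqrt(sum(a * a for a in x))
        x = [a / length for a in x]
    # The sign of each entry decides the cluster.
    return [0 if a >= 0 else 1 for a in x]

# Hypothetical counts for three frames: NP-NP, NP-PP, sentential complement.
toy_verbs = ["give", "send", "hand", "think", "believe", "know"]
toy_vectors = [[8, 1, 0], [7, 2, 0], [9, 1, 1],
               [0, 1, 9], [1, 0, 8], [0, 2, 7]]
toy_labels = spectral_bipartition(toy_vectors)
print(dict(zip(toy_verbs, toy_labels)))
```

A full clustering into many classes would use several eigenvectors
followed by a standard clustering step such as k-means; the
two-way split above only shows the core of the technique.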

The classification contains two parts:

General domain: 1510 verbs from VerbNet (those that occur more
than 1000 times in the English Gigaword corpus) are clustered
into 170, 144 and 60 clusters, with the numbers of clusters set
according to the gold standard extracted from VerbNet. Evaluated
against VerbNet, the clusterings obtain F-measures of 59.1, 54.8
and 52.3, respectively.

Biomedical domain: 399 verbs, including both general and
biomedical verbs. They comprise the verbs from Korhonen and
Krymolowski (2008) and other verbs that are frequent in
biomedical publications. The verbs are clustered into 78, 46 and
17 clusters, with F-measures of 68.8, 64.7 and 62.5,
respectively.
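
As a rough illustration of how a clustering can be scored against
a gold standard, the sketch below computes an F-measure as the
harmonic mean of cluster purity (precision) and inverse purity
(recall). This definition is an assumption made for the example;
the scores reported above follow the definition given in the
paper, which may differ in detail. The function names and the toy
verbs and class labels are hypothetical.

```python
from collections import Counter

def purity(groups, labels):
    """Fraction of items that carry the majority label of their group."""
    total = sum(len(g) for g in groups)
    hits = sum(max(Counter(labels[v] for v in g).values()) for g in groups)
    return hits / total

def f_measure(clusters, gold):
    """Harmonic mean of purity and inverse purity.

    clusters: list of lists of verbs; gold: dict mapping verb -> class.
    Every verb in gold is assumed to appear in exactly one cluster.
    """
    # Gold classes as lists of verbs.
    classes = {}
    for verb, label in gold.items():
        classes.setdefault(label, []).append(verb)
    # Map each verb to the index of its cluster.
    assigned = {v: i for i, c in enumerate(clusters) for v in c}
    precision = purity(clusters, gold)
    recall = purity(list(classes.values()), assigned)
    return 2 * precision * recall / (precision + recall)

# Toy example with two hypothetical gold classes.
toy_gold = {"give": "transfer", "send": "transfer", "hand": "transfer",
            "think": "cognition", "believe": "cognition", "know": "cognition"}
toy_clusters = [["give", "send", "think"], ["believe", "know"], ["hand"]]
print(round(100 * f_measure(toy_clusters, toy_gold), 1))
```

In the toy example, purity is 5/6 and inverse purity is 4/6, so
the score is printed on the same 0 to 100 scale as the figures
above.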

Download

The first release of this resource: vc.zip
The clustering code is available on request (send mail to ls418@cam.ac.uk).
You can acknowledge this resource by citing this publication: