Abstract

Methods for estimating people's conceptual knowledge have
the potential to be very useful to theoretical research on conceptual
semantics. Traditionally, feature-based conceptual representations
have been estimated using property norm data;
however, computational techniques have the potential to build
such representations automatically. The automatic acquisition
of feature-based conceptual representations from corpora is a
challenging task, given the unconstrained nature of what can
constitute a semantic feature. Existing computational methods
typically do not target the full range of concept-relation-feature
triples occurring in human-generated norms (e.g. tiger have
stripes) but rather focus on concept-feature tuples (e.g. tiger
stripes) or triples involving specific relations only. We investigate
the large-scale extraction of concept-relation-feature
triples and the usefulness of encyclopedic, syntactic and semantic
information in guiding the extraction process. Our
method extracts candidate triples (e.g. tiger have stripes, flute
produce sound) from parsed corpus data and ranks them on
the basis of semantic information. Our investigation shows the
usefulness of external knowledge in guiding feature extraction
and highlights issues of methodology and evaluation which
need to be addressed in developing models for this task.