Computer Science > Computation and Language

Title: Geometry of Polysemy

Abstract: Vector representations of words have heralded a transformational approach to
classical problems in NLP; the most popular example is word2vec. However, a
single vector does not suffice to model the polysemous nature of many
(frequent) words, i.e., words with multiple meanings. In this paper, we propose
a three-fold approach for unsupervised polysemy modeling: (a) context
representations, (b) sense induction and disambiguation, and (c) lexeme (a
word and sense pair) representations. A key feature of our work is the finding
that a sentence containing a target word is well represented by a low-rank
subspace, instead of a point in a vector space. We then show that the subspaces
associated with a particular sense of the target word tend to intersect over a
line (one-dimensional subspace), which we use to disambiguate senses using a
clustering algorithm that harnesses the Grassmannian geometry of the
representations. The disambiguation algorithm, which we call $K$-Grassmeans,
leads to a procedure to label the different senses of the target word in the
corpus -- yielding lexeme vector representations, all in an unsupervised manner
starting from a large (Wikipedia) corpus in English. Apart from several
prototypical target (word, sense) examples and a host of empirical studies to
intuit and justify the various geometric representations, we validate our
algorithms on standard sense induction and disambiguation datasets and present
new state-of-the-art results.
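
The geometric pipeline described in the abstract can be illustrated with a small sketch. This is not the authors' implementation; it is one plausible reading under stated assumptions: each context is summarized by the top principal directions of its word vectors, the distance from a unit vector to a subspace is taken as one minus the squared norm of its projection, and a cluster's "intersection line" is estimated as the top eigenvector of the summed projection matrices. All function names (`context_subspace`, `common_direction`, `sq_distance`, `k_grassmeans_sketch`) and parameters are hypothetical.

```python
import numpy as np

def context_subspace(vectors, rank):
    """Orthonormal basis (d x rank) for the top-`rank` principal directions
    of the context word vectors (one vector per row, no centering)."""
    _, _, vt = np.linalg.svd(vectors, full_matrices=False)
    return vt[:rank].T

def common_direction(bases):
    """Unit vector maximizing the total squared projection onto all subspaces.
    If the subspaces share a line, this recovers it (up to sign)."""
    d = bases[0].shape[0]
    total = np.zeros((d, d))
    for b in bases:
        total += b @ b.T  # projection matrix onto span(b)
    _, eigvecs = np.linalg.eigh(total)  # eigenvalues in ascending order
    return eigvecs[:, -1]

def sq_distance(u, basis):
    """Squared distance from unit vector u to the subspace spanned by basis."""
    return 1.0 - np.sum((basis.T @ u) ** 2)

def k_grassmeans_sketch(bases, k, n_iter=20, seed=0):
    """Assumed k-means-style alternation: update each cluster's direction,
    then reassign every context subspace to its closest direction."""
    rng = np.random.default_rng(seed)
    d = bases[0].shape[0]
    labels = rng.integers(0, k, size=len(bases))  # random initial assignment
    centers = np.zeros((k, d))
    for _ in range(n_iter):
        for j in range(k):
            members = [b for b, lab in zip(bases, labels) if lab == j]
            if members:
                centers[j] = common_direction(members)
            else:
                # Empty cluster: fall back to a random unit direction.
                v = rng.standard_normal(d)
                centers[j] = v / np.linalg.norm(v)
        new_labels = np.array(
            [np.argmin([sq_distance(c, b) for c in centers]) for b in bases]
        )
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, centers
```

In this sketch, each returned center plays the role of a sense direction for the target word, and the label of a context subspace is its induced sense. Like ordinary k-means, the alternation can stop at a local optimum, so results depend on the random initialization.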