Abstract: Word Sense Disambiguation (WSD) is the task of determining the meaning of an ambiguous word within a given context. It is an open problem that has to be solved effectively in order to meet the needs of other natural language processing tasks. Both supervised and unsupervised algorithms have been tried throughout the history of WSD research. Up to now, supervised systems have achieved the best accuracies. However, these systems with the first sense heuristic have come to a natural limit. To improve WSD further, the benefits of unsupervised systems should be examined.

In this thesis, an unsupervised algorithm based on sense similarity and syntactic context is presented. The algorithm relies on the intuition that two different words are likely to have similar meanings if they occur in similar local contexts. With the help of a principle-based broad-coverage parser, a 100-million-word training corpus is parsed and local context features are extracted according to a set of rules. Similarity values are then evaluated between the ambiguous word and the words that occurred in local contexts similar to that of the ambiguous word. Polysemous words are disambiguated using a similarity maximization algorithm. The performance of the algorithm is tested on the SENSEVAL-2 and SENSEVAL-3 English all-words task data, and an accuracy of 59% is obtained.
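
The similarity maximization step can be illustrated with a minimal Python sketch. It is not the implementation used in the thesis: the abstract does not specify the sense similarity measure, the weighting scheme, or the sense inventory, so the function `disambiguate`, the sense labels, the neighbour weights, and the `TOY_SIM` table below are all hypothetical and chosen only to make the idea concrete. The sketch assumes that each candidate sense is scored by its weighted similarity to the best-matching sense of every word found in a similar local context, and that the highest-scoring sense is selected.

```python
from typing import Callable, Dict, List, Tuple


def disambiguate(
    target_senses: List[str],
    neighbours: List[Tuple[str, float]],
    senses_of: Dict[str, List[str]],
    sense_sim: Callable[[str, str], float],
) -> str:
    """Return the target sense with the highest weighted similarity
    to the words that share the target's local context.

    neighbours holds (word, weight) pairs; the weight is an assumed
    measure of how similar the word's local contexts are to the target's.
    """

    def score(sense: str) -> float:
        total = 0.0
        for word, weight in neighbours:
            candidates = senses_of.get(word, [])
            if candidates:
                # Compare against the neighbour's best-matching sense.
                total += weight * max(sense_sim(sense, s) for s in candidates)
        return total

    return max(target_senses, key=score)


# Hypothetical toy similarity table (illustrative values, not thesis data).
TOY_SIM = {
    frozenset(["bank/finance", "lender/finance"]): 0.9,
    frozenset(["bank/finance", "institution/organisation"]): 0.7,
    frozenset(["bank/river", "shore/land"]): 0.8,
}


def toy_sense_sim(a: str, b: str) -> float:
    """Symmetric lookup; unrelated sense pairs score 0.0."""
    if a == b:
        return 1.0
    return TOY_SIM.get(frozenset([a, b]), 0.0)


senses_of = {
    "lender": ["lender/finance"],
    "institution": ["institution/organisation"],
    "shore": ["shore/land"],
}
# Words assumed to occur in local contexts similar to those of "bank".
neighbours = [("lender", 0.6), ("institution", 0.5), ("shore", 0.2)]

chosen = disambiguate(
    ["bank/finance", "bank/river"], neighbours, senses_of, toy_sense_sim
)
print(chosen)
```

With this toy data the financial sense is selected, because the words sharing the local context of "bank" are mostly financial terms. In a real system the sense similarity function would come from a lexical resource and the neighbour list from the parsed corpus.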