Joint Transition-Based Models for Morpho-Syntactic Parsing: Parsing Strategies for MRLs and a Case Study from Modern Hebrew

Amir More, Amit Seker, Victoria Basmova, Reut Tsarfaty

Abstract

In standard NLP pipelines, morphological analysis and disambiguation (MA&D) precede syntactic analysis and downstream semantic tasks. However, for languages with complex and ambiguous word-internal structure, known as morphologically rich languages (MRLs), it has been hypothesized that syntactic context may be crucial for accurate MA&D. In this work we empirically confirm this hypothesis for Modern Hebrew, an MRL with complex morphology and severe word-level ambiguity, using a novel transition-based parsing framework. Specifically, we propose a joint morphosyntactic transition-based parsing framework, which formally unifies two distinct transition systems, morphological and syntactic, into a single transition-based system with joint training and joint inference. We empirically show that MA&D results obtained in our joint settings outperform those obtained by the respective standalone components, and that end-to-end parsing results obtained by our joint system set a new state of the art for Hebrew end-to-end dependency parsing.