Abstract

Speech recognition typically involves three types of models: an acoustic model, a phonetic dictionary, and a language model. The primary purpose of the language model is to decide whether a sentence belongs to the language and, optionally, how likely it is.

The N-gram model is a common type of language model that predicts upcoming words based on a sequence of prior words. While efficient, this type of model does not handle long-distance dependencies well. Tree-based models address this issue since they allow hierarchical constructs. The probabilistic lexicalized tree insertion grammar (PLTIG) is particularly interesting since it also limits the computational complexity of parsing sentences compared to similar formalisms such as tree-adjoining grammars or context-free grammars.
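To illustrate the N-gram idea concretely, the following is a minimal sketch of a bigram (N = 2) model estimated by maximum likelihood from a toy corpus; the corpus, function names, and the lack of smoothing are illustrative assumptions, not taken from the thesis.

```python
from collections import defaultdict

def train_bigram(sentences):
    # Count how often each word follows each preceding word,
    # padding sentences with start/end markers. (Illustrative sketch.)
    counts = defaultdict(lambda: defaultdict(int))
    for sent in sentences:
        tokens = ["<s>"] + sent.split() + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            counts[prev][word] += 1
    # Normalize counts into conditional probabilities P(word | prev).
    model = {}
    for prev, nexts in counts.items():
        total = sum(nexts.values())
        model[prev] = {w: c / total for w, c in nexts.items()}
    return model

corpus = ["the cat sat", "the dog sat", "the cat ran"]
model = train_bigram(corpus)
print(model["the"]["cat"])  # "cat" follows "the" in 2 of 3 sentences -> 2/3
```

A model like this assigns a probability to a whole sentence by multiplying the conditional probabilities of successive words, which is exactly where the fixed context window prevents it from capturing dependencies spanning more than N - 1 words.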

This thesis takes a closer look at the N-gram model and compares it with PLTIGs. Although PLTIGs tend to achieve higher accuracy, their computational complexity remains very large in comparison.