RNNLM Toolkit

Introduction

Neural network based language models are nowadays among the most successful techniques for statistical language modeling. They can be easily applied to a wide range of tasks, including automatic speech recognition and machine translation, and provide significant improvements over classic backoff n-gram models. The 'rnnlm' toolkit can be used to train, evaluate, and use such models.
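
For illustration, a minimal training and evaluation run might look like the following sketch. The flag names follow the example scripts distributed with the toolkit, but the exact options, parameter values, and file names here are assumptions and may differ between versions; consult the documentation shipped with your copy.

  # train a model with 100 hidden neurons, 100 output classes and BPTT over 4 steps
  # (train.txt and valid.txt are hypothetical tokenized text files, one sentence per line)
  ./rnnlm -train train.txt -valid valid.txt -rnnlm model -hidden 100 -class 100 -bptt 4 -debug 2

  # evaluate the trained model: reports perplexity on the test data
  ./rnnlm -rnnlm model -test test.txt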

The goal of this toolkit is to speed up research progress in the language modeling field: first, by providing a useful implementation that demonstrates some of the principles; second, by supporting empirical experiments in speech recognition and other applications; and third, by providing strong state-of-the-art baseline results against which future research that aims to "beat state of the art techniques" can be compared.

Frequently asked questions

Contact

Tomas Mikolov - tmikolov@gmail.com

Stefan Kombrink - kombrink@fit.vutbr.cz

Acknowledgements

We would like to thank all who have helped us with the development of this toolkit, either by providing advice or by testing it. Special thanks to Anoop Deoras, Sanjeev Khudanpur, Scott Novotney, Stefan Kombrink, Dan Povey, YongZhe Shi, and Geoff Zweig.

Mikolov Tomáš, Deoras Anoop, Kombrink Stefan, Burget Lukáš, Černocký Jan: Empirical Evaluation and Combination of Advanced Language Modeling Techniques. In: Proceedings of the 12th Annual Conference of the International Speech Communication Association (INTERSPEECH 2011), Florence, Italy.

Comparison to other LMs shows that RNN LMs are state of the art by a large margin. Improvements increase with more training data.