Tidying up the Basement: A Tale of Large-Scale Parsing on National eInfrastructure

Description: Language is the fabric of the Web, and language technologies arguably provide the grease for the weaving loom, as evidenced for example by automated online translation, spoken-language interfaces to mobile devices, or the advertising and content recommendation systems that drive the monetization of Web services and thus their availability at no charge to the end user. In this presentation, I will give a high-level impression of the core techniques used in a variety of language technologies, with special emphasis on their computational properties. I will then review my own experience, and that of my research group at the University of Oslo, in migrating from operating a dedicated server farm in the basement of our department to taking advantage of a national ‘throughput’ supercomputer, the ABEL cluster at Oslo. As a direct consequence of this happy development, the research profile of the group today is far more computation-heavy than would otherwise have been possible, and we work experimentally and empirically on a scale that would have been impossible to imagine five years ago.