(Semi-)Automatic Categorization of Natural Language Requirements

Paper in proceedings, 2014

Context and motivation: Requirements in today's industry specifications need to be categorized for multiple reasons, including the analysis of certain requirement types (such as non-functional requirements) and the identification of dependencies among requirements. This is a prerequisite for effective communication and prioritization of requirements in industry-size specifications.
Question/problem: Because of the size and complexity of these specifications, categorization tasks must be specifically supported in order to minimize manual effort and to ensure high classification accuracy. Approaches that use (supervised) automatic classification algorithms face the problem of providing enough high-quality training data.
Principal ideas/results: In this paper, we discuss the requirements engineering team and their requirements management tool as a socio-technical system that allows consistent classification of requirements with a focus on organizational learning. We compare a manual, a semi-automatic, and a fully automatic approach to classifying requirements in this environment. We evaluate the performance of these approaches by measuring the effort and accuracy of the automatic classification recommendations and the combined performance of user and tool, and by capturing the opinions of the expert participants in a questionnaire. Our results show that a semi-automatic approach is the most promising, as it offers the best ratio of quality to effort and the best learning performance.
Contribution: Our contribution is the definition of a socio-technical system for requirements classification and its evaluation in an industrial setting at Mercedes-Benz with a team of ten practitioners.