LEXIA - A Data Science Environment for Legal Texts

Abstract

The analysis of textual data is a common and important task in legal science and practice. This knowledge- and data-intensive task is essential but time-consuming. Through contributions in information retrieval and artificial intelligence, legal informatics has added considerable value to the legal domain, enabling efficient workflows and accurate results. However, little effort has been devoted to designing a data science environment tailored to German legislation.

Drawing on state-of-the-art technologies and recent developments in software engineering and text mining, we developed a flexible reference architecture for the software-supported analysis and annotation of semantic and linguistic properties of legal texts. We provide a web-based software application (LEXIA) that is tailored to German legal texts and can easily be extended with arbitrary text mining modules. It implements a lean data model for the internal representation of legal texts that reflects the characteristics of German legislation. The architectural design focuses on extensibility and adaptability.
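To make the idea of a lean data model with pluggable text mining modules more concrete, the following is a minimal sketch in Python. It is not LEXIA's actual implementation: the names `LegalText`, `Annotation`, `register`, and `run_pipeline` are hypothetical, and the regular expression for section references merely stands in for a real text mining module.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Annotation:
    """A labelled character span produced by one module (hypothetical type)."""
    start: int
    end: int
    label: str
    source: str  # name of the module that created the annotation


@dataclass
class LegalText:
    """Lean internal representation: raw text plus stand-off annotations."""
    title: str
    text: str
    annotations: List[Annotation] = field(default_factory=list)


# A module is any callable that maps a document to a list of annotations.
Module = Callable[[LegalText], List[Annotation]]
MODULES: Dict[str, Module] = {}


def register(name: str) -> Callable[[Module], Module]:
    """Decorator that adds a module to the registry under the given name."""
    def decorator(func: Module) -> Module:
        MODULES[name] = func
        return func
    return decorator


@register("section_reference")
def find_section_references(doc: LegalText) -> List[Annotation]:
    """Assumed example module: mark references such as '§ 543' or '§ 573c'."""
    pattern = re.compile(r"§\s*\d+[a-z]?")
    return [
        Annotation(m.start(), m.end(), "SectionReference", "section_reference")
        for m in pattern.finditer(doc.text)
    ]


def run_pipeline(doc: LegalText, names: List[str]) -> LegalText:
    """Apply the named modules in order and collect their annotations."""
    for name in names:
        doc.annotations.extend(MODULES[name](doc))
    return doc
```

Under these assumptions, a new analysis step is added by registering one more function, without changing the data model or the pipeline code, which is one simple way to obtain the kind of extensibility described above.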

We conducted a case study based on German tenancy law, including relevant judgments of the German Federal Court of Justice, to demonstrate the feasibility and added value of our approach.