Enriched Event Streams: A General Platform For Empirical Studies On In-IDE Activities Of Software Developers

Language:

English

Abstract:

Current studies on software development either focus on the change history of source code from version-control systems or on an analysis of simplistic in-IDE events without context information. Each of these approaches contains valuable information that is unavailable in the other case. This work proposes enriched event streams, a solution that combines the best of both worlds and provides a holistic view on the in-IDE software development process. Enriched event streams not only capture developer activities in the IDE, but also specialized context information, such as source-code snapshots for change events. To enable the storage of such code snapshots in an analyzable format, we introduce a new intermediate representation called Simplified Syntax Trees (SSTs) and build CARET, a platform that offers reusable components to conveniently work with enriched event streams. We implement FeedBaG++, an instrumentation for Visual Studio that collects enriched event streams with code snapshots in the form of SSTs and share a dataset of enriched event streams captured in an ongoing field study from 81 users and representing 15K hours of active development. We complement this with a dataset of 69M lines of released source code extracted from 360 GitHub repositories. To demonstrate the usefulness of our platform, we use it to conduct studies on the in-IDE development process that are both concerned with source-code evolution and the analysis of developer interactions. In addition, we build recommendation systems for software engineering and analyze and improve current evaluation techniques.