Abstract

Researchers within a community gather periodically at scientific conferences. As a result, the scientific output, reflected in the papers published at conferences, carries valuable knowledge about the underlying community, such as collaboration groups and the history of the evolution of topics. Researchers can use this knowledge to identify i) candidates for collaboration on future projects, ii) active topics of research, and iii) relevant papers in their field of research. However, to obtain this knowledge, researchers would have to collect and analyze data such as titles, authors, and keywords that might be spread across thousands of papers. The size and the number of relations in such data sets can make the analysis hard using tabular representations such as a spreadsheet. Instead, visualizations provide users with a graphical representation of the attributes and relations of data on which they can reflect. Through visualizations, researchers can obtain an overview of a scientific community, analyze patterns of evolution, and identify entities of interest. We propose ExtendedEggShell (EES), a unified framework for extracting, modeling, and visualizing scientific communities. EES enables users to visualize the collaboration network of a community based on node-link diagrams, and to interact with the graphs by posing queries inspired by meaningful keywords in word clouds. We evaluate the performance of EES by analyzing the complete set of 366 papers published in the software visualization community (VISSOFT). We demonstrate the tool via selected usage examples, in which we analyze 1084 papers from the Object-Oriented Programming, Systems, Languages, and Applications community (OOPSLA). We found that visualizing scientific communities as bigraphs using node-link diagrams helps users better understand collaboration within these communities.
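The collaboration network described above can be modeled as a bipartite graph (bigraph) over authors and papers, where an authorship relation is an edge. The following minimal sketch illustrates that underlying data structure; the class and method names are hypothetical and do not reflect EES's actual implementation.

```java
import java.util.*;

// Illustrative sketch of an author-paper bigraph: authors and papers form
// the two node sets, authorship relations form the edges. Two authors are
// collaborators if they are connected through at least one shared paper.
class CollaborationGraph {
    private final Map<String, Set<String>> papersByAuthor = new HashMap<>();

    // Record one edge of the bigraph: author wrote paper.
    void addAuthorship(String author, String paper) {
        papersByAuthor.computeIfAbsent(author, a -> new HashSet<>()).add(paper);
    }

    // Two authors collaborate if their paper sets intersect.
    boolean collaborate(String a, String b) {
        Set<String> shared = new HashSet<>(papersByAuthor.getOrDefault(a, Set.of()));
        shared.retainAll(papersByAuthor.getOrDefault(b, Set.of()));
        return !shared.isEmpty();
    }
}
```

A node-link visualization then draws one node per author and paper, and one link per authorship edge.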

Abstract

With the ever-growing number and complexity of server applications, efficient monitoring is key to detecting performance issues in production and preventing them during development. In this work we show how such a monitoring solution can be built for JavaEE to meet specific requirements. We describe a software project for the Informatik Service Center ISC-EJPD1, which is interested in integrating performance monitoring into an existing monitoring framework. Our software provides a simple-to-use and highly customisable solution for basic performance monitoring in JavaEE applications.
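At its core, method-level performance monitoring measures the wall-clock time of each monitored invocation and reports it to a logging or monitoring backend. The sketch below shows only that timing core with plain standard-library calls; a real JavaEE solution would typically wire this logic into interceptors, and the names here are illustrative, not the project's actual API.

```java
import java.util.function.Supplier;

// Minimal sketch of the timing core of a performance monitor: wrap a call,
// measure its duration with System.nanoTime(), and append a log record.
class Monitor {
    static <T> T timed(String name, Supplier<T> call, StringBuilder log) {
        long start = System.nanoTime();
        try {
            return call.get();  // run the monitored operation
        } finally {
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            log.append(name).append(" took ").append(elapsedMs).append(" ms\n");
        }
    }
}
```

The try/finally ensures the duration is recorded even when the monitored call throws, which matters when monitoring is used to diagnose failing requests.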

Abstract

Null pointer exceptions are common bugs in Java projects. Previous research has shown that dereferencing the results of method calls is the main source of these bugs, as developers do not anticipate that some methods return null. To make matters worse, we find that whether a method returns null or not (nullness) is rarely documented. We argue that method nullness is a vital piece of information that can help developers avoid this category of bugs. This is especially important for external APIs, where developers may not even have access to the code. In this paper, we study the method nullness of Apache Lucene, the de facto standard library for text processing in Java. In particular, we investigate how often the result of each Lucene method is checked against null in Lucene clients. We call this measure method nullability, and it can serve as a proxy for method nullness. Analyzing Lucene internal and external usage, we find that most methods are never checked for null. External clients check more methods than Lucene checks internally. Manually inspecting our dataset reveals that some null checks are unnecessary. We present an IDE plugin that complements existing documentation, makes up for missing documentation regarding method nullness, and generates nullness annotations so that static analysis can pinpoint potentially missing or unnecessary null checks.
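The client-side pattern that this study counts, a null check on the result of an API call, looks like the following sketch. The `lookup` method stands in for any API method whose nullness is undocumented; both names are hypothetical, not Lucene code.

```java
// Illustrative sketch of method nullability: the client guards the result
// of an API call against null because the API's nullness is undocumented.
class NullnessExample {
    // Hypothetical API method: returns null for unknown keys,
    // but nothing in its signature or documentation says so.
    static String lookup(String key) {
        return "id".equals(key) ? "42" : null;
    }

    // Client-side null check; without it, dereferencing the result of
    // lookup("missing") would throw a NullPointerException.
    static int lengthOf(String key) {
        String value = lookup(key);
        return value == null ? 0 : value.length();
    }
}
```

A nullness annotation such as `@Nullable` on `lookup` would let static analysis warn about a missing check at the call site, or flag the check as unnecessary when the method never returns null.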

Abstract

Understanding API usage is important for upstream and downstream developers. However, compiling a dataset of API clients is often a tedious task, especially since one needs many clients to draw a representative picture of the API usage. In this paper, we present KOWALSKI, a tool that takes the name of an API, then finds and downloads client binaries by exploiting the Maven dependency management system. As a case study, we collect clients of Apache Lucene, the de facto standard for full-text search, analyze the binaries, and create a typed call graph that allows developers to identify hotspots in the API. A video demonstrating how KOWALSKI is used for this experiment can be found at https://youtu.be/zdx28GnoSRQ.
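The typed call graph mentioned above can be thought of as an aggregation over client call sites: each call from a client into the API is attributed to a fully qualified API method, and hotspots are the methods with the highest counts. The sketch below shows that aggregation step only; the class is illustrative and not KOWALSKI's implementation.

```java
import java.util.*;

// Illustrative sketch of call-graph aggregation: count how often clients
// invoke each API method, so that hotspots surface as the highest counts.
class CallGraph {
    private final Map<String, Integer> callCounts = new HashMap<>();

    // Record one client call site that targets the given API method.
    void recordCall(String apiMethod) {
        callCounts.merge(apiMethod, 1, Integer::sum);
    }

    // The most frequently called API method, or null if nothing was recorded.
    String hotspot() {
        return callCounts.entrySet().stream()
            .max(Map.Entry.comparingByValue())
            .map(Map.Entry::getKey)
            .orElse(null);
    }
}
```

In the Lucene case study, such counts would be extracted from the downloaded client binaries rather than recorded at runtime.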

Abstract

Background: Bug prediction helps developers steer maintenance activities towards the buggy parts of a software system. A bug predictor has many design aspects, each with several options, e.g. the software metrics, the machine learning model, and the response variable. Aims: These design decisions should be made judiciously, because an improper choice in any of them might lead to wrong, misleading, or even useless results. We argue that bug prediction configurations are intertwined and thus need to be evaluated in their entirety, in contrast to the common practice in the field where each aspect is investigated in isolation. Method: We use a cost-aware evaluation scheme to evaluate 60 different bug prediction configuration combinations on five open source Java projects. Results: We find that the best choices for building a cost-effective bug predictor are change metrics mixed with source code metrics as independent variables, Random Forest as the machine learning model, and the number of bugs as the response variable. Combining these configuration options results in the most efficient bug predictor across all subject systems. Conclusions: We demonstrate strong evidence for the interplay among bug prediction configurations and provide concrete guidelines for researchers and practitioners on how to build and evaluate efficient bug predictors.
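A common way to make such an evaluation cost-aware is to rank files by predicted defect density and measure how many actual bugs are found within a fixed inspection budget of lines of code. The sketch below illustrates that idea under the assumption that the input is already sorted by the predictor's ranking; it is a simplified illustration, not the paper's actual evaluation scheme.

```java
// Illustrative sketch of a cost-aware evaluation step: inspect files in the
// order ranked by the bug predictor, stop once the line-of-code budget is
// exhausted, and count how many actual bugs were covered by then.
class CostAwareEval {
    // files: each entry is {linesOfCode, actualBugs}, pre-sorted by the
    // predictor's ranking (highest predicted defect density first).
    static int bugsFoundWithinBudget(int[][] files, int locBudget) {
        int inspected = 0, found = 0;
        for (int[] f : files) {
            if (inspected + f[0] > locBudget) break;  // budget exhausted
            inspected += f[0];
            found += f[1];
        }
        return found;
    }
}
```

A predictor is then judged by how many bugs its ranking uncovers per unit of inspection effort, rather than by classification accuracy alone.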