Abstract

The last few years have seen a rapid increase in the number of Free/Libre Open Source Software (FLOSS) projects. Some of these projects, such as Linux and the Apache web server, have become phenomenally successful. However, for every successful FLOSS project there are dozens of FLOSS projects which never succeed. These projects fail to attract developers and/or consumers and, as a result, never get off the ground. The aim of this research is to better understand why some FLOSS projects flourish while others wither and die. This article presents a simple agent-based model that is calibrated on key patterns of data from SourceForge, the largest online site hosting open source projects. The calibrated model provides insight into the conditions necessary for FLOSS success and might be used for scenario analysis of future developments of FLOSS.

Background

There have been a limited number of attempts to simulate various parts of the open source development process (Dalle & David, 2004). For example, Dalle and David (2004) use agent-based modeling to create SimCode, a simulator that attempts to model where developers will focus their contributions within a single project. However, predicting the success or failure of a single FLOSS project may require considering the other existing FLOSS projects that are vying for the same limited pool of developers and users. This is especially true when multiple FLOSS projects compete for a limited market share (e.g., two driver projects for the same piece of hardware, or rival desktop environments such as GNOME and KDE). Wagstrom, Herbsleb, and Carley (2005) created OSSim, an agent-based model containing users, developers, and projects that is driven by social networks. While this model allows for multiple competing projects, the published experiments include a maximum of only four projects (Wagstrom et al., 2005). Katsamakas and Georgantzas (2007) are exploring preliminary work on modeling competition among projects using a system dynamics framework. Using a population of projects makes it possible to consider factors that act between projects, e.g., the relative popularity of a project with respect to other projects as a factor that attracts developers and users to that project. Our model therefore pioneers new territory by using agent-based modeling to simulate a large landscape of FLOSS projects.
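As an illustration of how relative popularity across a population of projects could attract developers, the following is a minimal Python sketch. It is not the model presented in this article. It assumes that popularity is measured by download counts and that each developer joins one project with probability proportional to that project's share of all downloads. The function names (relative_popularity, choose_project, simulate) and the example project names are hypothetical.

```python
import random


def relative_popularity(downloads):
    """Return each project's share of total downloads (shares sum to 1)."""
    total = sum(downloads.values())
    if total == 0:
        # Assumption: with no downloads anywhere, all projects are equally popular.
        return {name: 1.0 / len(downloads) for name in downloads}
    return {name: count / total for name, count in downloads.items()}


def choose_project(downloads, rng):
    """Pick one project with probability proportional to its relative popularity."""
    shares = relative_popularity(downloads)
    names = list(shares)
    weights = [shares[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def simulate(downloads, n_developers, seed=0):
    """Let n_developers each join one project; return developers per project."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    joined = {name: 0 for name in downloads}
    for _ in range(n_developers):
        joined[choose_project(downloads, rng)] += 1
    return joined


if __name__ == "__main__":
    # Hypothetical download counts for three competing projects.
    landscape = {"project_a": 600, "project_b": 300, "project_c": 100}
    print(relative_popularity(landscape))
    print(simulate(landscape, n_developers=1000))
```

Because the attraction weight depends on every project's downloads, changing one project's count changes the odds for all of the others, which is the kind of between-project factor that a single-project simulation cannot represent.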

Gao, Madey, and Freeh (2005) approach modeling and simulating the FLOSS community via social network theory, focusing on the relationships between FLOSS developers. While they also use empirical data from the online FLOSS repository SourceForge to calibrate their model, they are mostly interested in replicating the network structure, and they use network metrics (e.g., network diameter and degree) for validation. Our model attempts to replicate other emergent properties of FLOSS development without including the complexities of social networking. However, both teams consider some similar indicators, such as the number of developers working on a project, when evaluating the performance of the models.

In addition, there have been attempts to identify factors that influence FLOSS. These have ranged from pure speculation (Raymond’s (2000) gift-giving culture postulates) to surveys of developers (Rossi, 2004) to case studies using data mined from SourceForge (Michlmayr, 2005). Wang (2007) demonstrates that specific factors can be used to predict the success of FLOSS projects via K-Means clustering. However, this form of machine learning offers no insight into the underlying process that causes projects to succeed. Therefore, the research presented here simulates the FLOSS development process using agent-based modeling instead of machine learning.

To encourage more simulation of the FLOSS development process, Antoniades, Samoladas, Stamelos, Angelis, and Bleris (2005) created a general framework for FLOSS models. The model presented here follows some of the recommendations and best practices suggested in this framework. In addition, Antoniades et al. (2005) developed an initial dynamical simulation model of FLOSS. Although the model presented here is agent-based, the two models share many techniques, including calibration, validation, and the handling of the stochastic nature of the modeling process. One difference is the empirical data used for validation: Antoniades et al.’s (2005) model uses mostly code-level metrics from specific projects, while the model presented here uses higher-level, project-level statistics gathered across many projects.

Identifying And Selecting Influential Factors

Factors that are most likely to influence the success or failure of FLOSS projects must first be identified and then incorporated into the model. Many papers have been published on this topic, but most of the literature simply speculates on which factors might affect success and offers reasons why. Note that measuring the success of a FLOSS project is still an open problem: some metrics have been proposed and used but, unlike for commercial software, no standards have been established. Some possible success indicators are: