About the Editor

Roberto has over 25 years of experience in the IT field, and has spent the last 12 years working at the intersection of open source software and business development. Roberto has taken an active interest in different open source projects and organizations; he has served on advisory boards, and has helped large IT vendors, open source vendors and customers design and deploy their open source strategies. After serving as Senior Director of Business Development at SourceForge for over 4 years, in 2016 he started a new company called Business Follows, whose mission is to help developers, companies and organizations make Open Source development a key part of their business strategies. He is the editor of the commercial open source blog.

Tweets

Is the sun dawning on a new day of brotherhood, as Roberto thinks? Should we believe that this time it is different, that no harsh words were spoken? That critics are wrong to suspect that something is brewing? I believe that the initiatives described by Roberto are just a new front in an ongoing market (and mindshare) battle that Microsoft is fighting to guarantee its position in the IT landscape of the future.

If there is one thing that should be visible to every analyst in the IT market, it is that monopolies do not disappear overnight. As I have already written in the past, the belief that every year will be the “Linux year” remains wishful thinking; and I still believe that even with the many new low-cost devices designed to run Linux, the Linux desktop market share in my simulations does not exceed 5% by the end of 2009 (of course, I hope to be wrong, and that in a bold sweep some new company proves capable of selling 20M PCs in one year). On the other hand, open source is clearly capable of both entering new markets and serving as the underlying basis for more traditional products, like Apple OSX or the iPhone. I believe that the new activities from Microsoft are the first mature attacks against the OSS ecosystem, designed to de-emphasize both the ethical aspects behind OSS and the differences in licensing that provide the real differentiators from the technical point of view.

Let me share with you some initial musings. Microsoft is a development tool company, and primarily sells to other developers. This may sound strange: after all, Microsoft sells operating systems and office suites that are not developer oriented. The reality is that Microsoft has mostly created platforms for others to build upon, and by providing nice, centrally-managed software libraries for every conceivable task it simplified the work for those building on Windows, Office, SQL Server and now SharePoint (among many other things). This simplification allowed ISVs to write software that runs reasonably well on a large number of machines, without having to juggle updates from many different vendors of a separate DB, a separate web server, a separate presentation layer and so on. I believe that it is this ease of integration of components (because they were mostly from a single vendor, with rather similar and laissez-faire licensing conditions), together with the fact that most spending could be reused for different applications by buying licenses centrally from Microsoft once, that made the platform so attractive. In fact, I suspect that part of the lackluster performance of Vista was caused by the fact that, similarly to Windows ME, Vista had very little of value to offer to developers when confronted with the additional hardware requirements and the additional licensing cost.

For Microsoft (and its partners) everything is a PC. Remember when Microsoft designed its first game console? It was a PC, with just some changes in the BIOS and startup circuitry. Media centers? PCs. Servers? PCs. Mobile devices? PCs with a small screen, and a small “start” menu. The only “outsider” is the Zune, which is clearly designed as a clone of a product designed by others, and as such is somehow neglected even by Microsoft itself.

And now, what happened? Many different things. First of all, the web (and virtualization) finally managed to deliver on the promises made years ago; even with some immaturities, a modern web engine can deliver end-user applications with security, speed and central management, providing significant cost reductions and far less hassle for both users and administrators. This combination allows for near-unlimited scaling (horizontally and vertically), and when used with open source software it requires no licensing steps that might increase time to market, which is fast becoming the deciding element for IT deployments. Call it Prism, Air, Silverlight or JavaFX: there are enough existing and new platforms to give software vendors real alternatives. Developers now have enough options to be free from the endless Microsoft supply of libraries, and can shop around to their own liking.

On the other hand, low-cost devices, handheld systems designed for the web and embedded systems on one side, and very large-scale systems on the other, are so different from a PC that trying to shoehorn a PC model there simply fails, and in this way Microsoft has left open several breaches that were ineffectively guarded (like stopping a flood with barbed wire). Now, mobile internet devices like the iPhone/iPod touch, Nokia’s own N770/N800/N810 tablets (and the other WebKit-based N-series phones) and the up-and-coming Intel MID are all examples of a new kind of platform that Microsoft is not prepared to fight for.

So, after trying to ignore OSS, badmouth it, or scare companies into cross-platform agreements, Microsoft is now taking a more mature approach, one that uses its innate developer-oriented strength to woo developers to develop and deploy on Windows and with Windows-oriented tools, by dangling in front of software vendors the promise of a much larger market and the support of an extraordinary marketing force. By doing this, of course, it creates an incentive to leverage Microsoft technologies whenever possible, to “adapt” licenses (avoiding copyleft-based ones, which prevent deep linking with proprietary software) and thus facilitates a progressive embrace of additional Microsoft (or partner) technologies that can be centrally controlled. I suspect that there will also be a licensing change in future versions of the Enterprise/Grid editions of Windows, to counteract the economic and licensing advantage of OSS-based virtualization; this may however be difficult to manage well, as it may significantly lower extractable prices for large-scale installations. An effort to reengineer its software offering in a modular way may help the company move into smaller-scale computing, as well as large-scale systems, while maintaining the comfortable development and deployment environment that has made Microsoft such a large-scale success.

What will happen? If Microsoft is consistent in its “good spirit”, it may be able to significantly reduce the platform threat and create strong bonds with at least half of the commercial OSS vendors by 2010. On the other hand, this can increase the penetration and perception of OSS in general, and if a suitable service provider appears on the market it can capitalize on that “visibility asset” and weaken Microsoft’s position from the inside.

If Microsoft (and at this point I mainly think of Steve “chairs” Ballmer) shows its “bad face”, it may polarize the market further, creating a cadre of “white knights” that accept no compromise and gain visibility and interest from the part of the OSS community that believes in ethics and openness, thus reducing the value of accepting the Microsoft compromise.

If there is one message I took away from visiting CeBIT, it is that open source is everywhere and nowhere. Everywhere, because an underlying OSS component (be it Linux, Asterisk, Eclipse…) can be found inside most products on show; nowhere, because this was written nowhere (with some notable exceptions). That a product contains some open source parts is now so common that it is no longer a differentiator; and this brings me to the second thing I observed: the Linux part of CeBIT was sad, and gave little value to the companies (and OSS communities) exhibiting there. For example, the OpenBravo stand was nice and filled with knowledgeable people, but would probably have gained much more attention in the ERP pavilion; the same applies to Zimbra and the other (few) companies that were playing the “free software” card ahead of what their product actually does.
I believe that this self-segregation is counterproductive, as the main objective of a company looking for a solution to an IT problem is (not surprisingly) to find a solution, and only later to prioritize requirements and features (including ethical and economic ones) to decide whether the adoption process can continue. In fact, I had the opportunity to see two companies presenting more or less the same service (based on OSS), one in the IT infrastructure pavilion and one in the Linux stand, and the difference in terms of people stopping by was quite noticeable, with the Linux one getting two to three times fewer visitors than the other. It may make sense to have a separate “community” part of CeBIT for those projects that still have no significant commercial backing, or that prefer to show themselves in a “pure” way (in this sense, I appreciated the enthusiasm of the people at KDE, Scribus, Gnome, and Amarok), but not for companies: OSS is a differentiator in the long term, but cannot be the only thing you promote at your stand.

In the valley, an open-source strategy will not get you particular attention or funding any longer. In fact, if going open source is all you have to differentiate yourself, I’m pretty sure you won’t get any funding at all, at least not from a first-tier VC. I find it weird to hear that OpenBravo would not be in the ERP pavilion. What were they thinking?

I totally agree with you: if going open source is all you have to differentiate yourself, it is not a big deal. I understand second and third round investments are more likely to happen in the near future (as seen also with SAP Ventures), but I believe that there are plenty of blue ocean opportunities out there. Stay tuned: next week I will post about one of them… 😉

It is widely known that despite many significant advantages, “explicit” use of OSS is still not widespread. One of the many approaches designed to help overcome the adoption gap is the creation of “OSS competence centers”, which provide support and knowledge to facilitate open source software adoption.

Creating a competence center may take years, especially when it is necessary to create everything from scratch. But as I wrote in a recent presentation, it may be more efficient to “piggy-back” on top of existing IT incubators or IT districts, leverage what has already been produced in other projects, and especially offer mediation as a service, because it is clear from the many surveys that companies need significant hand-holding when performing their first open source migrations. We will test this approach (after several trials) at the FutureMatch event co-located with CeBIT.

The EU has for a long time supported research on open source software, first with the creation of the European Working Group on Libre Software, then by sponsoring studies and research through various EU branches, like IDABC (the Interoperable Delivery of European eGovernment Services to public Administrations, Businesses and Citizens). Among the most interesting activities:

The IDABC OSS observatory: a long-term activity that provides news and information on OSS with a focus on Public Administrations. It offers news, a software repository, a taxonomy of software applications, a list of OSS competence centers, and several resources and papers related to legal and adoption processes for Public Administrations.
The IST research area of the Commission has a long history of research in OSS, including past projects like SPIRIT (open source healthcare) or the FLOSS study (one of the first longitudinal studies of OSS participation and development). More recently, projects like COSPA researched the real costs of migrating public administrations to OSS, and provided the data for later research like the EU study “Economic impact of open source software on innovation and the competitiveness of the Information and Communication Technologies (ICT) sector in the EU”, which had a significant impact. Other significant projects were CALIBRE (open source in industrial applications) and EDOS (Environment for the development and Distribution of Open Source software).
Several new projects focusing on OSS software quality were funded, like SQO-OSS, FLOSSMETRICS and QUALOSS, collectively grouped under a coordinated initiative called FLOSSQUALITY. While in the beginning the Commission was more interested in “stimulating” OSS production in under-represented areas (especially those most relevant for the EU at large, like embedded systems, security, and development tools like TOPCASED), most research is now devoted to other areas like economic impact and business models, along with the many projects that use OSS licenses to disseminate their results to a wider population.
This is just a small outline of the most recent activities, and I will provide a small summary of the results of individual projects in future posts.

Carlo,
Thanks for a helpful overview. It makes a nice entry point to the European FLOSS activities.
I am however still missing a more critical review of what has been done. It seems that nobody has written such a review in Europe, simply because all the experts are already involved and cannot remain unbiased. Outside of Europe there are not many people who can properly understand or evaluate what has been done, since the matter has been researched more deeply and thoroughly in the EU than anywhere else.

It is true that it is difficult to be impartial about projects you worked on… but I can talk about those I was not involved in, and maybe Roberto can write about the others? I will prepare a post with links to most of the projects I know of, and we can start from there.

From the CeBIT website: “The EU project Open TTT is supporting enterprises in finding, applying and developing the right Open Source Software to fulfill their specific needs. By collaborating with Open TTT the IRC Future Match 2008 has expanded its breadth to include the Open Source Software sector and thus offers a mediation platform for innovative Open Source Software offers and requests.”

In the past edition, FutureMatch organized more than 1200 one-to-one meetings between companies, and it is my hope that a significant number of those in the next edition will be for open source services, between OSS providers and end users. I would like to invite any interested company willing to be there to register at the FutureMatch site; please choose “Open TTT” as “Assisting Organisation” during your registration to receive free entrance tickets for CeBIT 2008.

The OpenTTT project is evaluating a novel approach to helping the OSS adoption process, by “industrializing” the matching between the demand for software with the necessary functionalities and the offer (the whole set of suitable OSS packages). The mediation process is designed to find the selection of tools and projects that best matches the expressed needs; we then try to create one-to-one (or many-to-one, when more than one company is interested in paying for modifications or updates) business exchanges between the potential customers and the OSS-based companies that provide support or services on the selected packages. This approach has been tested in several workshops held in France, Germany, Italy and Bulgaria, and will be refined with the results of the FutureMatch event; we plan to leverage our experience to create a standardized approach to OSS mediation, eventually creating a “blueprint” for competence centers based on the OpenTTT model. Maybe this could be the basis for improving existing marketplaces?
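The core of the matching idea is simple enough to sketch in a few lines. The following is purely illustrative (the company names, package names and feature sets are all made up, not OpenTTT data): each request lists needed functionalities, each candidate package lists what it provides, and packages are ranked by how much of the request they cover.

```python
# Illustrative sketch of demand/offer matching; all data below is invented.
requests = {"company_a": {"crm", "email", "calendar"},
            "company_b": {"erp", "inventory"}}
packages = {"SugarCRM-like": {"crm", "email"},
            "ERP-suite":     {"erp", "inventory", "crm"},
            "Groupware-x":   {"email", "calendar"}}

def best_matches(needed, catalog):
    """Rank packages by the fraction of requested features they cover."""
    scored = [(len(needed & feats) / len(needed), name)
              for name, feats in catalog.items()]
    return sorted(scored, reverse=True)

for company, needed in requests.items():
    score, name = best_matches(needed, packages)[0]
    print(f"{company}: best match {name} ({score:.0%} coverage)")
```

A real mediation process would of course weight features, prices and support availability rather than counting raw feature overlap; this sketch only captures the “industrialized” ranking step.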

I enjoyed joining the Italian OpenTTT workshop held in Rome on the 14th of January, as I found Carlo’s speech on open source solutions for horizontal and vertical needs appropriate and interesting.

Talking about the audience, I was disappointed by the small number of attendees joining the conference, and I spoke about that with Martina Desole (of the APRE agency).

Martina, who opened the conference by talking about APRE’s role and presenting OpenTTT’s participation in CeBIT FutureMatch, was on the contrary happy, because with 17 attendees they reached the established target (namely at least 4 members for every vertical “club”: Energy & Environment, Industry Production, Transport and Public Administration).

I don’t know how these targets are defined, but I believe that four participants for each vertical segment are not enough to draw conclusions that go beyond mere assumptions. OpenTTT definitely needs a broader audience to verify and test its OSS mediation approach; let’s see if CeBIT FutureMatch can help in this respect.

The reason for Martina to be happy can probably be traced to the fact that, for traditional matching processes, 17 attendees can be considered good participation 🙂
The project has reached a total of around 100 companies that were audited, submitted requests, and for which a match was found. In this sense, the number is sufficient to obtain some results, like the fact that there is limited difference in the horizontal requests (across company size and across different countries), and that we were able to match 95% of the requests directly with a single project.
The biggest problem found is that in Italy (less so in Germany and France) the number of interested OSS companies was quite low, and most were not interested in participating in the matching process. I suspect that here we have a confirmation of Roberto’s (and my) hypothesis that Italy has a strongly underdeveloped commercialization channel, and that for this reason the market itself is still immature. I estimate that we are 2-3 years behind France in this respect, and probably 5 years away from a “well formed” market for OSS companies.

Carlo, I believe that the underdeveloped commercialization channel is only a partial answer, since it covers only IT firms. In my opinion vertical needs are potentially interesting for a broader audience, making them the ideal match for projects like OpenTTT.

In my understanding, projects like OpenTTT would need an appropriate budget to promote their events; otherwise their technical findings could end up being a tool for a (few) geeks.

At the end of each year since 2000 we are bombarded with opposing views about the next coming of Linux on desktops, or the growth or decline of open source software on servers, or whether Apache is growing or IIS is regaining share. It reminds me of heated debates about football, or politics, or many other clearly undecidable questions; the debate has an entertaining value in itself, so despite the lack of any practical value it remains a common sport. As I would never leave such an entertaining opportunity unfulfilled, I will try to present a few opinions of my own.

First of all, I strongly believe that the overall idea of a “tipping point” happening in the short term (0-2 years) and producing a sudden switch from Windows users to Linux on the desktop has no factual basis. All the research on ICT and innovation diffusion shows that when the incumbent enjoys strong network effects (like Microsoft, with the combination of economic incentives to its channel and the inertia of its user base) and is willing to adapt its pricing strategy to counter external threats, it can significantly delay the adoption of even technically perfect alternatives. This, combined with the fact that at the moment a channel for Linux desktops does not exist (apart from some internal successes like IBM, or some external sales by Novell), means that my models predict less than 5% adoption within 2 years for enterprise desktops, if everything stays the same.
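The post does not describe the model itself, so here is only a minimal, hedged sketch of the kind of diffusion reasoning involved: a Bass-style adoption curve in which an incumbent’s network effect suppresses a fraction of would-be switchers. Every coefficient below is a made-up illustration, not the author’s actual model.

```python
# Bass-style diffusion sketch with an incumbent "drag"; all parameters are
# hypothetical illustrations, chosen only to show the shape of the argument.
def simulate_adoption(p=0.002, q=0.25, network_drag=0.5,
                      years=2, steps_per_year=12):
    """p: innovation coefficient, q: imitation coefficient,
    network_drag: fraction of would-be adopters retained by the incumbent."""
    adopted = 0.0
    for _ in range(years * steps_per_year):
        new = (p + q * adopted) * (1.0 - adopted) * (1.0 - network_drag)
        adopted += new / steps_per_year
    return adopted

share = simulate_adoption()
print(f"Simulated desktop share after 2 years: {share:.1%}")
```

Even with generous imitation coefficients, a strong incumbent drag keeps the two-year share well under the 5% ceiling the post mentions, which is the qualitative point of such models.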

And what can change? The first important idea is that there are two ways of doing business: the “red ocean” (fighting for the same market and undercutting competition) and the “blue ocean” (searching for new markets and ideas). My belief is that abrupt changes are much more difficult in red ocean environments, as everyone tries to outsmart the others, and those capable of surviving longer (for example, because they have more cash) are increasingly favored by this competitive model. But “order of magnitude” changes are possible with the blue ocean strategy, because the space for exploring new things is much larger. Andy Grove of Intel once said:

When a change in how some element of one’s business is conducted becomes an order of magnitude larger than what that business is accustomed to, then all bets are off. There’s wind and then there’s a typhoon, there are waves and then there’s a tsunami.

Can we find examples of this “order of magnitude” change? Some examples are Amazon EC2 (the cost of one hour of managed and scalable CPU is one order of magnitude lower than the alternatives), the Asus Eee PC (nearly one order of magnitude lower cost compared to other ultraportables), and the XO notebook (one order of magnitude reduction in cost, and one order of magnitude or more in planned audience); all were surprisingly successful (even the XO, well before shipping, forced companies like Intel, AMD and Microsoft to react and compromise in order to be able to participate in the same market).

Still with me? The missing piece is that we should strive to facilitate the choice of open source at the change points; for example, it is easier to suggest an alternative when the current situation is already undergoing change (like suggesting a migration to Linux when people have to change their PCs). We should make sure that we propose something that costs one order of magnitude less than the alternatives, that can provide sustainable business models, and that satisfies the needs of users. We have to create a software/hardware/services assembly (as the XO was created from scratch) to replace and enhance what desktop PCs are doing now. Technically speaking, we have to create a hardware assembly that costs one order of magnitude less, software that costs one order of magnitude less to maintain, and services that cost one order of magnitude less to deliver.

How can we do it? The hardware part is easy: design for the purpose. Take the lead from what the XO has done, and create a similar platform for the desktop. Flash disk is still too costly, so design a single-platter disk, with controller and metal case soldered on the motherboard; think about different chip designs (maybe leveraging Niagara T2) by reducing the number of cores and adding on-chip graphics and memory architectures (when source code is available, more sophisticated manual prefetching architectures are possible). Software needs are in a sense easier: we still need to facilitate management (Sun’s APOC or Gonicus’ GOsa are good examples) and integrate into the system an easy way of receiving external help. Think out of the box: maybe LLVM is a better compiler than GCC for some aspects of the machine (think about what Apple has done with it)? Leverage external network services (like Walmart’s gPC and gOS). This means creating external backups and storage for moving users; allowing one PC to be “cloned” to another when a replacement is needed; and easily synchronizing files and data with external services using tools like Conduit. Allow third parties to target this as a platform, as Google is doing with Android; partner with local companies to create a channel that will sell services on top of it. As the cost of materials goes down by roughly 10% for every order of magnitude of produced parts, an ambitious company could create a $99 PC with reasonable capabilities, packaged by local companies for local needs; the potential market can be estimated at 25% of the current installed PC base (both new users and users adopting it as a second or replacement platform), or roughly 200 million PCs.
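As a back-of-the-envelope check of the numbers in this paragraph (the starting unit cost and the installed-base figure below are my own assumptions, not from the post):

```python
import math

def unit_cost(base_cost, base_volume, volume, reduction_per_decade=0.10):
    """Cost of materials drops ~10% for every order of magnitude of parts produced."""
    decades = math.log10(volume / base_volume)
    return base_cost * (1.0 - reduction_per_decade) ** decades

# Hypothetical starting point: $150 of materials at 10k units produced.
print(f"Unit cost at 10M units: ${unit_cost(150.0, 10_000, 10_000_000):.0f}")

# 25% of an assumed ~800M installed PC base recovers the post's figure.
print(f"Potential market: {0.25 * 800_000_000 / 1e6:.0f}M PCs")
```

Three orders of magnitude of volume only shave about a quarter off the materials cost under this rule, so the $99 target depends at least as much on design-for-purpose as on volume.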

The assumption that everything is going to be as today is just our inability to plan for a different future.

It is always interesting to read discussions about what economic impact OSS has, or will have, ranging from the wild enthusiasm of Matt Asay to the more moderate view of Savio Rodrigues. As readers of this blog know, I enjoy writing about this kind of thing, and would like to provide a few comments on what is measured, and on what it would really be important to measure.

Savio cites IDC, and calculates that OSS software will reach sales of $5.8B in 2011, or 1.8% of the market. On the other hand, Gartner says that in 2008 25% of the software market will be OSS-based, either through internal development or through external OSS providers.

Who is right? The answer lies, of course, in testing the hypothesis hidden under every measurement effort: that what is measured is a reasonable proxy for the true variable we want to know. IDC (I have to guess from the Forrester press release and Savio’s post) measures OSS software sales, without hardware and services. So, of course, Savio takes Red Hat, MySQL and a bunch of others, and struggles to reach even $1B. Is this realistic? Maybe not; while it is true that selling software is one of the possible business models, it is difficult to ignore the economic impact not only of service-based models, but of all those users that use and maintain OSS internally, thus getting an economic benefit without appearing on anyone’s radar. It reminds me of those reports from just a few years ago which found that Linux was used in less than a few percent of all servers sold; the problem was that this sampling ignored servers sold without operating systems, or those with a non-commercial version of Linux installed.

Savio’s comments (and Roberto’s too) are however spot-on about the main problem of the OSS market: the lack of a commercial channel with credible strength. I greatly enjoyed an evening organized by our local industry association, debating the pros and cons of OSS for companies. I heard several nice (and some bad) stories about the use of OSS, and found that ALL the companies that are OSS adopters found out about the software and the solutions by themselves. While proprietary solutions (like Microsoft’s) are promoted by tons of companies (I can find more or less 50 within 5 kilometers of my office), there are fewer than 200 OSS companies in all of Italy. If we want to grow the 1.8% market Savio reports, we should create a channel of companies (maybe through service franchising). At the moment, companies interested in OSS (and there is a very high percentage of those) are forced to go it alone, and in many cases spend too much money before giving up. The same companies would be quite happy to pay for someone to help them, but up to now you must be a customer of Atos Origin, Engineering, IBM Global Services or Accenture to enjoy such help. Horizontal companies like OpenLogic or SpikeSource are starting to address this problem, but we need smaller entities at the local level that can provide the necessary hand-holding and help avoid the pitfalls.

It has been a long, long, long road, but after the evaluation by the Commission, I am happy to announce that we have finally published our guide for small and medium enterprises, designed to help the adoption of open source and free software.

We have striven to be pragmatic (no vendor-paid research, for example) and practical; two more editions will follow, every 6 months, to allow for updates and new material.

The guide (developed in the context of the FLOSSMETRICS and OpenTTT projects) presents a set of guidelines and suggestions for the adoption of open source software within SMEs, using a ladder model that guides companies from the initial selection and adoption of FLOSS within the IT infrastructure up to the creation of suitable business models based on open source software.

The guide is split into an 80-page introduction to FLOSS and a catalog of 165 applications, selected to fulfill the requests gathered in the interviews and audits of the OpenTTT project. The application areas are infrastructural software (ranging from network and system management to security), ERP and CRM applications, groupware, document management, content management systems (CMS), VoIP, graphics/CAD/GIS systems, desktop applications, engineering and manufacturing, vertical business applications and eLearning.

The guide is available at the guide web page; the two PDFs are 2 MB (“FLOSS Guide“) and 20 MB (“FLOSS Catalog“), so take that into account if you are connected via narrowband or cell phone.

I welcome any suggestions, additions or criticism; I hope that this can become the beginning of a collaborative community centered on helping the use and adoption of OSS in companies. I thank Roberto for his many suggestions and for the use of his blog as a medium for future updates and interactions with any welcome contributor. To facilitate conversation on the topics touched by the guide, and in particular those related to open source business models, we are preparing a mailing list that will be announced soon.

Carlo and Roberto,
this is invaluable, exceptional work!
Thank you very much for it; I am looking forward to having some spare time these nights to read both docs, and maybe give you some feedback based on my experience.

Despite the fact that many believe FLOSS is of interest mainly to developers, I strongly believe that we are just beginning to see a rush of different projects that extend the collaborative development approach to non-software areas.

During the research activity in the OpenTTT project, we tried to find non-software projects that are developed or extended in a collaborative way, similar to the “bazaar” or moderated bazaar typical of most FLOSS projects; having restricted this to 65 examples, we found many interesting facts:

many large-scale software projects are really mixed-media projects, as exemplified by the map created by Matthias Mueller-Prove, which shows that the number of people participating in “ancillary” areas like documentation, promotion and so on is as large as that devoted to development. KDE and GNOME have similar proportions of non-code participation.

whenever the software allows for mixed participation, such participation happens. It is relatively easy to see that simple wiki-based tools seem capable of attracting a large participation base, while cooperative schemes for music or artwork are rarer. In fact, most non-textual forms are more oriented towards “remixing”, that is, leveraging a digital artifact for integration into some other work rather than modifying and improving it directly. I suspect that as more complete and complex “packaged” file formats (like those used by proprietary video editing suites, for example) come into use by open source tools, we will begin to see a more interesting approach not only to remixing but to “reinvention” as well. A wonderful example is Nine Inch Nails’ open source remix project.

the sheer scope of the phenomenon is amazing: collaboratively created prayer books? (see the Open Source Judaism project, or the Open Source Haggadah). The Multimachine tool is also amazing: an accurate all-purpose machine tool that can be used as a metal or wood lathe, end mill, horizontal mill, drill press, wood or metal saw or sander, surface grinder and sheet metal “spinner”. It can be built by a semi-skilled mechanic using just common hand tools; for machine construction, electricity can be replaced with “elbow grease”, and all the necessary material can come from discarded vehicle parts.

I believe that, just as FLOSS demonstrated that software can be created with good quality and innovation in collaborative modes, the same will prove true in many other areas as well.

A recurring debate among FLOSS supporters and detractors concerns the estimation of the real number of active FLOSS projects. While it is easy to look at the main repository site (sourceforge.net), which boasts more than 100,000 projects, it is equally easy to look in more depth and realize that a significant number of those projects are really abandoned or have no significant development. How many active and stable projects are really out there?

For the purpose of obtaining some unbiased estimates in the context of the FLOSSMETRICS project, we performed a first search among the main repository sites and FLOSS announce portals; we also set a strict activity requirement, namely an activity index from 80 to 100% and at least one file release in the last 6 months. Of the overall 155959 projects, only 10656 (6.8%) are “active” (a rather restrictive definition; a more relaxed release period of one year gives an active percentage of 9.2%, or 14455 projects).
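
The strict activity filter described above can be sketched as a short script. This is a hypothetical illustration only: the field names, the sample records and the reference date are assumptions, not the actual FLOSSMETRICS tooling or data.

```python
from datetime import date, timedelta

# Hypothetical project records: activity index (0-100) and date of last file release
projects = [
    {"name": "alpha", "activity": 95, "last_release": date(2007, 11, 20)},
    {"name": "beta",  "activity": 50, "last_release": date(2007, 12, 1)},
    {"name": "gamma", "activity": 85, "last_release": date(2006, 3, 15)},
]

def is_active(p, today=date(2008, 1, 1), months=6):
    """Strict criterion: activity index 80-100 AND a file release in the last `months` months."""
    cutoff = today - timedelta(days=months * 30)
    return 80 <= p["activity"] <= 100 and p["last_release"] >= cutoff

active = [p["name"] for p in projects if is_active(p)]
print(active)  # only "alpha" meets both conditions
```

Relaxing the `months` parameter from 6 to 12 is exactly the change that, in the text, moves the active share from 6.8% to 9.2%.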

However, while SourceForge can rightly be considered the largest single repository, it is not the only potential source of projects; there are many other vertical repositories, among them BerliOS, Savannah and Gna!, some derived from the original version of the SourceForge code and many more based on a rewritten version called GForge. These add a total of 23948 projects, among which (using a sample of 100 projects from each site) we have found a similar proportion of active projects (between 8% and 10%).
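
The per-forge sampling step can be illustrated as follows. The data here is synthetic (a toy forge with a built-in 9% active rate); in the actual study each sampled project was inspected individually, so only the method, not the data, matches the text.

```python
import random

random.seed(42)

# Toy forge: 23948 projects, of which roughly 9% carry a synthetic "active" flag
forge = [{"id": i, "active": random.random() < 0.09} for i in range(23948)]

def estimate_active_fraction(projects, sample_size=100):
    """Estimate the share of active projects from a random sample of the forge."""
    sample = random.sample(projects, sample_size)
    return sum(p["active"] for p in sample) / sample_size

print(estimate_active_fraction(forge))  # close to the 8-10% range reported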
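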

The next step is estimating how many projects of the overall FLOSS landscape are hosted on those sites. For this estimate we took the entire FreshMeat announce database, as processed by the FLOSSmole project, and found that the projects with a homepage on one of the repository sites are 23% of the total. This count is however biased by the fact that the probability of a project being announced on FreshMeat is not equal for all projects; English-based projects oriented towards a large audience have a much higher probability of being listed. To take this into account, we performed a search for non-English forges, and for software oriented towards a very specific area, using data from past IST projects like Spirit and AMOS.

We found that non-English projects are significantly underrepresented in FreshMeat, but as the overall “business-readiness” of those projects is unclear (for example, there may be no translations available, or they may be specific to a single country’s legal environment), we have ignored them. Vertical projects are also underrepresented, especially in scientific and technical areas, where the probability of being included is around 10 times lower compared to other kinds of software. By using the results from Spirit, a sampling of project announcements in scientific mailing lists, and some repositories for the largest or more visible projects (like the CRAN archive, which hosts 1195 libraries and packages for the R statistics language), we have reached a lower bound estimate of around 12000 “vertical” and industry-specific projects. So, we have an overall lower bound estimate of around 195000 projects, of which we can estimate that 7% are active, leading to around 13000 active projects.

Of those, we can estimate (using data from Slashdot, FreshMeat and the largest GForge sites) that 36% fall in the “stable” or “mature” stage, leading to a total of around 5000 projects that can be considered suitable for an SME, that is, with an active community, stable and with recent releases. It should be considered that this number is a lower bound, obtained with rather strict assumptions; just enlarging the file release period from 6 months to one year nearly doubles the number of suitable projects. Also, this estimate does not try to assess the number of projects not listed in the announcement sites (even vertical application portals); this is deliberate, as it would be difficult to estimate the reliability of such a measure, and because the “findability” of a project and its probability of having sustained community participation are lower if it is difficult to find information on the project in the first place; this means that such “out of bounds” projects would probably not be good opportunities for SME adoption in any case. By using a slightly more relaxed definition of “stability”, with an activity rating between 60% and 100% and at least one release in the last year, we obtain around 18000 stable and mature projects from which to choose; not a bad result, after all.
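
The chain of estimates above reduces to a few lines of arithmetic; the figures are the ones quoted in the text, and the rounding to ~13000 and ~5000 follows the article.

```python
# Figures quoted in the article
overall_lower_bound = 195_000   # forge-hosted plus vertical/industry-specific projects
active_fraction = 0.07          # strict activity criterion (6-month release window)
stable_fraction = 0.36          # share in the "stable" or "mature" stage

active = overall_lower_bound * active_fraction   # ~13650, rounded to ~13000 in the text
suitable = active * stable_fraction              # ~4914, i.e. around 5000 projects
print(round(active), round(suitable))            # prints: 13650 4914
```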

The activity criterion used underestimates the number of projects that provide useful software. A project may not have had a recent release because it is complete and has no known bugs, or no bugs significant enough to fix. Of course, it would be difficult to take this into account without a lot more work since it would be necessary to examine the status of each project.

As mentioned in the text, this is meant to provide a lower bound to the number of available, active and stable projects; as such, we have chosen a very strict definition of activity, and we used the projects’ own declared “stability” status, even though this lowers the number of suitable projects even more (there are many “beta” projects that are really stable). We have already found projects that are stable but not included in the count; an example is GNU make (which is stable, but having had no new release in one year would not make it to the list).
It must be considered, however, that even projects that are more or less finished (no more bugs) may need a small recompile or modification to adapt to changing platforms and environments; in this sense, stable projects with no release in one year should be considered an exception and not the rule. Using a simple sampling approach, we estimate that these are less than 2% of our original count, and so they would not raise the package count in a significant way. Our main objective was to demonstrate that the lower bound of the number of both stable and maintained packages is significant, and I believe that result was reached.
Many thanks for your comment (and for reading the article thoroughly :-))

For some of the forge sites that allow for data extraction, such a list can be obtained through the FLOSSMOLE data source. For those sites that have no search functionality, or that provide only part of their database in searchable form, statistical methods based on a sampling approach were used, and in that case no list (just the numbers) can be obtained. It is important to understand that what we were looking for was a lower bound on the number of active and stable projects, not a “final” list.

Hi,
I am currently doing research on open source firms. For a statistical model I need the number of projects registered on SourceForge year by year; is there any way to extract this information from SourceForge?