Nice numbers. I have used those articles as reference points while speaking about the potential market size for our memory leak detection tool. But something about these numbers has bothered me for years: there is no trustworthy, public analysis behind them. They seem to be conjured up from thin air. So I finally decided to do something about it and try to figure it out for good.

It proved to be a challenging task. After all, with more than seven billion people on our planet I couldn’t call everyone and ask them. Well, maybe I could, but if every call took 20 seconds on average I would need at least 4,439 years to complete the survey, and that is without sleeping, eating or resting. So I had to find other ways to estimate.

After playing around with different sources of information, I decided to dig into four of them for a closer look:

How many programmers could there be in total?

The world population is currently above seven billion. Out of those seven billion we can leave out sub-Saharan Africa (900M) and rural Asia (about 50% of its 2.2B population) as negligible. This leaves us with approximately 5 billion people living in regions where the overall economic and cultural background can be considered suitable for a software industry to emerge.
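The subtraction above can be sketched in a few lines of Python. This is only a check of the arithmetic; the constant and function names are illustrative, and the world population is rounded down to exactly seven billion as an assumption.

```python
# All figures are in millions of people and come from the paragraph above.
WORLD_POPULATION_M = 7_000     # "above seven billion", rounded down (assumption)
SUB_SAHARAN_AFRICA_M = 900     # excluded entirely
ASIA_POPULATION_M = 2_200      # the 2.2B figure quoted in the text
RURAL_ASIA_SHARE = 0.5         # "about 50%" of that figure is excluded


def addressable_population_m() -> float:
    """Population (in millions) left after the two exclusions."""
    excluded = SUB_SAHARAN_AFRICA_M + ASIA_POPULATION_M * RURAL_ASIA_SHARE
    return WORLD_POPULATION_M - excluded


print(f"{addressable_population_m():,.0f} million people")
```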

Now, out of those 5,000,000,000 how many could be actually developing software? A good answer at StackExchange gives us some pointers as to where we can find information on the percentage of software developers in different countries. Using the US, Japan, Canada, the EU27 and the UK as a baseline we can estimate that 0.86% of the population is employed as a software developer or programmer:

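| Region | Population | Software developers | Share of population |
|---|---|---|---|
| Canada | 33,476,688 | 387,000 | 1.16% |
| EU27 | 502,486,499 | 5,900,000 | 1.17% |
| Japan | 127,799,000 | 1,016,929 | 0.80% |
| UK | 63,162,000 | 333,000 | 0.53% |
| US | 313,931,000 | 1,336,300 | 0.43% |
| Weighted average | | | 0.86% |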

0.86% of five billion is 43,000,000. Let’s remember this number, as it will be used as a baseline in the following calculations.
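For readers who want to verify the weighted average and the baseline, here is a minimal Python sketch. The populations and developer counts are the ones listed above; the variable and function names are illustrative. "Weighted average" is taken to mean total developers divided by total population, which is an assumption about how the 0.86% was derived, although it does reproduce that figure.

```python
# (population, software developers) per region, as listed above.
REGIONS = {
    "Canada": (33_476_688, 387_000),
    "EU27": (502_486_499, 5_900_000),
    "Japan": (127_799_000, 1_016_929),
    "UK": (63_162_000, 333_000),
    "US": (313_931_000, 1_336_300),
}
ADDRESSABLE_POPULATION = 5_000_000_000


def weighted_developer_share(regions):
    """Population-weighted share: total developers / total population."""
    total_population = sum(pop for pop, _ in regions.values())
    total_developers = sum(dev for _, dev in regions.values())
    return total_developers / total_population


share = weighted_developer_share(REGIONS)
# The article rounds the share to two decimals of a percent (0.86%)
# before scaling it up, so the sketch does the same.
rounded_share = round(share, 4)
baseline = round(rounded_share * ADDRESSABLE_POPULATION)

print(f"weighted share: {share:.2%}")
print(f"baseline developers: {baseline:,}")
```

Scaling the unrounded share instead would give a slightly higher baseline, so the 43,000,000 figure depends on rounding to 0.86% first.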

Popularity contests

In the popularity contest we will use two channels for the source of data – the TIOBE index and the Langpop one. Other sources such as Dataist figures were hard to interpret, so we’ll stick just to those two.

For the background – the TIOBE ratings are calculated by counting hits of the most popular search engines. The search query that is used is

+"<language> programming", e.g. +"Java programming" in our case.

Langpop uses more input sources than just search engine queries. With equal weights, it tracks open job positions, book titles, search engine results, the number of open source projects and other data to calculate its popularity score.

Simplifying TIOBE and Langpop results, we can conclude that according to TIOBE 17% and according to Langpop ~15% of the programmers in the world are using Java. Averaging those numbers we can say that around 16% out of the 43,000,000 developers in the world use Java. This translates to 6,880,000 Java developers out there.
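The step from popularity shares to a headcount can be sketched as follows. The function name and signature are illustrative; the percentages and the 43,000,000 baseline are the ones used in the text.

```python
TOTAL_DEVELOPERS = 43_000_000  # baseline derived earlier in the article


def java_developers(shares_percent, total=TOTAL_DEVELOPERS):
    """Average several percentage estimates and apply them to the baseline."""
    average_share = sum(shares_percent) / len(shares_percent)
    return round(total * average_share / 100)


# TIOBE (17%) and Langpop (~15%)
popularity_estimate = java_developers([17, 15])
print(f"{popularity_estimate:,}")
```

The same helper applies to any other pair of percentage estimates measured against the same baseline, for example the 18% and 16% job portal figures discussed in the next section.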

Job portals

Job portals, especially when considering both available positions and uploaded resumes, are definitely a good source of information. The larger ones also provide nice reports on the labor market, which we will dig into next. Note that we used Indeed.com and Monster.com – if you can point us towards more and/or better sources of information, we would be glad to correct our calculations.

Using this analysis from Monster.com and the aggregated statistics from Indeed.com, we can say that ~18% of Monster.com applicants can program in Java and ~16% of the open engineering / programming positions scanned by Indeed.com are looking for Java talent. Averaging those numbers we arrive at 17%, which out of 43,000,000 programmers in total translates to 7,310,000 Java guys and girls in the world.

Software downloads

Every Java developer uses something to build the application. Well, we expect them to use at least a JVM and a compiler. If you happen to know anyone who can get away without those two, please let us know. We would hire him or her immediately.

But most of us tend to use more than just a compiler and a virtual machine. We use IDEs, application servers, build tools, etc. So we figured that we would look into the publicly available download numbers of these tools and try to estimate the number of developers from the download numbers.

When calculating the total number of developers from estimated number of users, we take into account the market share of the corresponding software. To estimate the market share we use Zeroturnaround’s statistics gathered in the spring of 2012.

Eclipse downloads. Eclipse Juno was released on June 27 and was downloaded 1,200,000 times during its first 20 days. Looking into the historical data published by eclipse.org, we can predict that Juno will be downloaded approximately 8,000,000 times in total. The last four major Eclipse releases all followed a yearly release calendar, with every release taking place in June:

Juno – 8,000,000 downloads (projected for the full year, assuming the trend continues; 1,200,000 downloads in the first 20 days so far)

Indigo – 6,000,000 downloads

Helios – 4,100,000 downloads

Galileo – 2,200,000 downloads

Averaging the Juno estimate and the Indigo result, we can say that Eclipse is downloaded approximately 7,000,000 times a year.

Using Zeroturnaround’s statistics, we expect 68% of Java developers to use Eclipse as a (primary) IDE.

If we now make the bold claim that each Java developer on Eclipse downloads the IDE exactly once a year, expect the number of downloads per year to be 7,000,000 and consider that 32% of Java developers do not use Eclipse at all, we come to the conclusion that there should be roughly 10,300,000 Java developers in total (7,000,000 / 0.68).

Apache Tomcat downloads. Vadim Gritsenko has put together some nice statistics on top of Apache logs. From there we can see that during the last year Tomcat has been downloaded approximately 550,000 times/month. This gives us a yearly total of 6,600,000 Tomcat downloads.

Applying statistics from the same report we used for Eclipse’s market share, we can estimate that 59% of Java developers use Tomcat as one of their development platforms.

If we now again make the bold claim that each Java developer on Tomcat downloads every major release exactly once and consider that 41% of Java developers do not use Tomcat, we reach the conclusion that there should be about 11,186,000 Java developers out there (6,600,000 / 0.59).

Averaging the numbers from Eclipse and Tomcat downloads, we end up with 10,743,000 Java developers.
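The download-based reasoning boils down to dividing yearly downloads by the tool's market share. Below is a minimal Python sketch of that calculation. The function name is illustrative, and `downloads_per_developer` is a hypothetical knob added here to make the "exactly once" assumption explicit; it is not part of the original analysis.

```python
def developers_from_downloads(yearly_downloads, market_share,
                              downloads_per_developer=1.0):
    """Scale a tool's yearly downloads up to the whole Java population.

    downloads_per_developer is a hypothetical parameter: 1.0 reproduces
    the article's claim that every user downloads the tool exactly once.
    """
    if not 0 < market_share <= 1:
        raise ValueError("market_share must be in (0, 1]")
    if downloads_per_developer <= 0:
        raise ValueError("downloads_per_developer must be positive")
    tool_users = yearly_downloads / downloads_per_developer
    return tool_users / market_share


# Eclipse: average of the Juno projection and the Indigo result, 68% share.
eclipse_yearly = (8_000_000 + 6_000_000) / 2
eclipse_devs = developers_from_downloads(eclipse_yearly, 0.68)

# Tomcat: 550,000 downloads per month, 59% share.
tomcat_yearly = 550_000 * 12
tomcat_devs = developers_from_downloads(tomcat_yearly, 0.59)

# The article rounds Eclipse to the nearest 100,000 and Tomcat to the
# nearest 1,000 before averaging, so the sketch does the same.
eclipse_rounded = round(eclipse_devs, -5)
tomcat_rounded = round(tomcat_devs, -3)
downloads_estimate = (eclipse_rounded + tomcat_rounded) / 2

print(f"Eclipse-based: {eclipse_rounded:,.0f}")
print(f"Tomcat-based:  {tomcat_rounded:,.0f}")
print(f"Average:       {downloads_estimate:,.0f}")
```

Setting `downloads_per_developer` above 1 shrinks the estimate proportionally, which shows how sensitive the result is to the one-download-per-developer claim.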

Conclusions

We used three different sources for estimation – popularity contests, job market analysis and download numbers of popular Java development infrastructure products. The numbers varied quite a bit – from 6,880,000 to 10,743,000. Aggressively averaging the three numbers, we can conclude that there are 8,311,000 Java developers out there. Not quite as many as Oracle or Wikipedia think, but still enough to build a business that provides development tools for the Java community.
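The final averaging step, sketched in Python with the three estimates from the sections above. The dictionary keys and variable names are illustrative, and a plain unweighted mean is assumed since the text does not mention any weighting.

```python
# The three estimates reached in the sections above.
ESTIMATES = {
    "popularity contests": 6_880_000,
    "job portals": 7_310_000,
    "software downloads": 10_743_000,
}

values = list(ESTIMATES.values())
final_estimate = sum(values) / len(values)   # simple unweighted mean
spread = max(values) - min(values)           # gap between extreme estimates

print(f"final estimate: {final_estimate:,.0f}")
print(f"spread between lowest and highest: {spread:,}")
```

The spread is worth printing alongside the mean, because it shows how far the download-based figure sits from the other two.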

It’s amazing to see that despite its age, C is still one of the most used programming languages in the world. I would give anything to code in C once again (and yes – I’m prepared to use “malloc” and pointers once again!)

Going back to the point of your article, how many of these Java developers are full time?

Why is half of Africa and Asia negligible? You are excluding half of the world population. If indeed the other half gives us 8.3M programmers, well, the excluded half can give another million Java guys. In this way your estimation will be close to Oracle’s or Wikipedia’s.

When I used to program in Java, I would use the milestone releases of Eclipse, thus, I might download it 7-8 times a year. I would also download different bundles, one for Java and JBoss, another for reporting, and so on. I might have been an extreme example, but I think an average of 2-3 downloads per year would be more accurate.

Exactly. Eclipse stats are probably 5 or 6 times too large. I don’t use Eclipse and I do exactly the same thing: multiple downloads of the same package for one reason or another. Let me also state the obvious: trusting self-reported download numbers from a competitive environment where network effects can make or break a product is not a good idea, and it is clearly the weakest link in this analysis. Let’s not ask products how popular they are; let’s find some other, more trustworthy source for that information.

I would ignore Tomcat download stats. They wander too far into servers and operations. I can easily download Tomcat 25+ times a year to run it on dev machines, different linux boxes, compare version differences, find and test for some weird bug with different versions etc. And I also write installation instructions which cause some administrator to download and install Tomcat in sys test, acceptance test, production test and production.

Good point. On the other hand, I bet there are whole development teams that skip a Tomcat version. And in countries with slow international connections people download the package once and share it with each other.

So we figured we can make an assumption that it all balances out to 1 download per developer per major version of Tomcat. I agree that it’s a bold claim 🙂