The real problem with Java in Linux distros

Java is not a first-class citizen in Linux distributions. We generally have decent coverage for Java libraries, but lots of Java software is not packaged at all, or packaged in alternate repositories. Some believe this is because Linux distribution developers dislike Java and prefer other languages, like C or Python. The reality is slightly different.

Java is fine

There is nothing sufficiently wrong with Java that would cause it to uniformly be a second-class citizen on every distro. It is a widely used language, especially in the corporate world. It has a vibrant open source community. On servers, it has produced very interesting stable (Tomcat) and cutting-edge (Hadoop, Cassandra…) projects. So what grudge do the distributions hold against Java?

Distributing distributions

The problem is that Java open source upstream projects do not really release code. Their main artifact is a complete binary distribution, a bundle including their compiled code and a set of third-party libraries they rely on. If you take the Java project point of view, it makes sense: you pick versions of libraries that work for you, test that precise combination, and release the same bundle for all platforms. It makes it easy to use everywhere, especially on operating systems that don’t enjoy the greatness of a unified package management system.

That doesn’t play well with how Linux distributions package software. We want to avoid code duplication (so that a security update in a library package benefits all software that uses it), so we package libraries separately. We keep those up to date, to benefit from bugfixes and new features. We consider libraries to be part of the platform provided by the Linux distribution.

Java upstream projects consider libraries to be part of the software bundle they release. So they keep the libraries at a precise version they tested, and only update them when they really need to. Essentially, they maintain their own platform of libraries. They do, at their scale, the same work the Linux distributions do. And that’s where the real problem lies.

Solutions?

Force software to use your libraries

For simple Java software, stripping the upstream distribution and forcing it to use your platform libraries can work. But that creates friction with upstream projects (since you introduce an untested difference). And that doesn’t work with more complex software: swapping libraries below it will just make it fail.

Package all versions of libraries

The next obvious solution is to make separate packages for every version of a library that the software uses. The problem is that there is no real convergence on “commonly-used” versions of libraries. There is no ABI protection, nor general guidelines on versioning. You end up having to package each and every minor version of a library that the software happens to want. That doesn’t scale well: it creates an explosion in the number of packages, code duplication, security update nightmares, etc. Furthermore, sometimes the Java project patches the libraries they ship with to include a specific feature they need, so it doesn’t even match a real library version anymore.

Note: The distribution that is the closest to implementing this approach is Gentoo, through the SLOT system that lets you have several versions of the same package installed at the same time.

Bundle software with their libraries

At that point, you accept code duplication, so just shipping the precise libraries together with the software doesn’t sound like that bad of an idea. Unfortunately it’s not that simple. Linux distributions must build everything from source code. In most cases, the upstream Java project doesn’t ship the source code used in the libraries it bundles. And what about the source code of the build dependencies of your libraries? In some corner cases, the library project is even abandoned, and its source code lost…

What can we do to fix it?

So you could say that the biggest issue the Linux distributions have with Java is not really about the language itself. It’s about an ecosystem that glorifies binary bundles and not source code. And there is no easy solution around it. That’s why you can often hear Java packagers in Linux distributions explain how much they hate Java. That’s why there is only a minimal number of Java projects packaged in distributions. Shall we abandon all hope?

The utopian solution is to aim for a reference platform, reasonably up-to-date libraries that are known to work well together, and encourage all Java upstream developers to use that. That was one of JPackage’s goals, but it requires a lot more momentum to succeed. It’s very difficult, especially since Java developers often use Windows or OS X.

Another plan is to build a parallel distribution mechanism for Java libraries inside your distro. A Java library wouldn’t be shipped as a package anymore. But I think unified package systems are the glory of Linux distributions, so I don’t really like that option.

Other issues, for reference

There are a few other issues I didn’t mention in this article, to concentrate on the “distributing distributions” aspect. The tarball distributions don’t play nice with the FHS, forcing you to play with symlinks to try to keep both worlds happy (and generally making both unhappy). Maven encourages projects to pick precise versions of libraries and stick to them, often resulting in multiple different versions of the same library being used in a given project. Java code tends to build-depend on hundreds of obscure libraries, transforming seemingly-simple packaging work into a man-year exponential effort. Finally, the same dependency inflation issue makes it a non-trivial engagement to contractually support all the dependencies (and build dependencies) of a given software (like Canonical does for software in the Ubuntu main repository).

Maven could actually be used to reverse the situation: we already have a Maven repository in Debian/Ubuntu containing well packaged libraries with correct POM descriptors (that is not always the case in the Maven Central repository, and Java developers often have to work around issues in that repo).
If we publish this repository on the Internet (could Canonical / Launchpad provide such a facility?) then we can offer a reliable repository for developers to use.

You would still have the “multiple versions” issue: our Maven repository would just provide our versions, not the ones the project might need. I agree though that Maven could be used to advocate and make available to developers the “reference platform” I mention in the article.

If they’re not releasing source code, how can it be considered “open source”?
Such projects definitely can’t be considered “free software”, since access to the code is a precondition.
If the source code were available, any project could be packaged by the distro for inclusion, without issue, no?

Those projects publish the source code, so they are open source. It’s just not their primary deliverable. As mentioned in the article, the problem is that they produce a full platform, and that specific platform doesn’t integrate well with the one the Linux distributions provide.

maven is the key
1) it handles dependencies
2) it builds binary as well as source artifacts! all the public repos of all the projects I use (apache, jboss, spring, etc.) – they all have both binary and source artifact
3) a semi-automated tool can be created to try to change the version of dependencies for your distro:
– run “mvn test” on project as it is
– try to (automatically) replace version in maven – say you want to distribute slf4j 1.6.1 instead of 1.5.8 as project specifies – fine, just replace it!
– run “mvn test” again – it will rebuild and run the tests again
– did it cause regressions?
– report a bug to upstream bug tracker if so – its type and url are supposed to be in pom.xml itself, so you can fully automate that!
– no regressions? congratulations! you’ve just got rid of a dependency on a version you don’t want to support
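
The loop described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names are made up for this example, the run_tests callback stands in for whatever wraps “mvn test”, and only versions written literally inside a dependency element are handled (property placeholders and parent POMs are ignored).

```python
import xml.etree.ElementTree as ET

# Namespace used by Maven POM files.
POM_NS = "http://maven.apache.org/POM/4.0.0"


def override_dependency_version(pom_xml, group_id, artifact_id, new_version):
    """Return (new_pom_xml, changed) with the matching dependency's version replaced.

    Hypothetical helper: it only touches <version> elements written
    literally inside a matching <dependency>.
    """
    ET.register_namespace("", POM_NS)  # serialize with a default namespace, no prefixes
    root = ET.fromstring(pom_xml)
    ns = {"m": POM_NS}
    changed = False
    for dep in root.iterfind(".//m:dependency", ns):
        same_group = dep.findtext("m:groupId", namespaces=ns) == group_id
        same_artifact = dep.findtext("m:artifactId", namespaces=ns) == artifact_id
        version = dep.find("m:version", ns)
        if same_group and same_artifact and version is not None:
            if version.text != new_version:
                version.text = new_version
                changed = True
    return ET.tostring(root, encoding="unicode"), changed


def try_version(pom_xml, group_id, artifact_id, new_version, run_tests):
    """Run the tests, swap the version, run the tests again, and report the outcome.

    run_tests(pom_xml) -> bool is supplied by the caller, for example a
    wrapper that writes the POM to a checkout and runs "mvn test" there.
    """
    if not run_tests(pom_xml):
        return "baseline-broken"  # upstream tests already fail, nothing to compare against
    candidate, changed = override_dependency_version(
        pom_xml, group_id, artifact_id, new_version
    )
    if not changed:
        return "not-declared"  # dependency absent, or already at the wanted version
    return "ok" if run_tests(candidate) else "regression"
```

A “regression” result is the point where the bug report to upstream would be filed; an “ok” result only means the project’s own tests still pass with the replaced version.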

Ok, the version elimination will not be 100% automatic. Unit tests don’t catch everything. But it should be enough to manually mark a few “problematic” libraries (and their particular versions) – those will have to have multiple versions packaged in the distro. The majority of others can be eliminated automatically.

One percent of the work is to ask project maintainers to use maven. My guess is well over 50% of them already do so (it may be 90%). The last manual thing is to “google” bug tracker of a project if it uses maven but has no bug tracker defined in pom file. When you find it you manually report “include bug tracker definition to maven, please”. The rest could go automatically.

BTW: I’m not saying maven is perfect. I believe it can help make java packaging for any linux distro reasonably easy. Plus it already has very high penetration in java ecosystem… and maven3 is coming

However most distributions prefer (some have it as a requirement) to package only Java packages that are able to build from source with unbundled dependencies. As was mentioned in the article this is not always easy, because application/library will expect different version of a dependency. And there is no notion of “binary” compatibility versioning as is in C world (you know X.Y.Z is compatible with X.Y.Z-1, but doesn’t have to be with X.Y-1.Z’). There are a lot of niche Java libraries and maven makes them trivially easy to use. I’d say sometimes it’s way too easy.

I dare you to scan some major Java application for recursive dependencies and search “alpha” or “beta” in versions. You’d be surprised how many alpha or beta quality code gets into stable releases of big Java projects. And that’s exactly because maven makes it easy. Ah, this topic would easily suffice for a whole book. I sometimes think most Java developers are pigs, but I guess that’s the ecosystem.
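
The scan suggested above can be approximated with a few lines. This is a sketch only: it assumes the resolved dependency list has already been exported as “groupId:artifactId:version” strings, and the coordinates in the usage note are invented for illustration.

```python
import re

# Matches "alpha" or "beta" anywhere in a version string, case-insensitively.
PRERELEASE = re.compile(r"alpha|beta", re.IGNORECASE)


def prerelease_dependencies(coordinates):
    """Return the coordinates whose version (the last ':' field) looks like an alpha or beta."""
    flagged = []
    for coord in coordinates:
        version = coord.split(":")[-1]
        if PRERELEASE.search(version):
            flagged.append(coord)
    return flagged
```

For example, given the hypothetical list ["org.example:stable-lib:1.4.2", "org.example:shiny-lib:2.0-beta-3"], only the second entry is returned. Only the version field is inspected, so an artifact that merely has “alpha” in its name is not flagged.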

One more example came to my mind: Whole set of plexus-X libraries…have you looked at the source code repository? It’s a mess. Who is supposed to find where exactly some library is stored? There are sometimes multiple copies, deleted/removed/moved libraries that are still being used in their original version even though they have been “banned” by their original developers. There used to be “plexus-maven-plugin”, now it’s called “plexus-component-annotations” and is part of a bigger project (plexus-containers). plexus-maven-plugin code is deleted (you can of course access it from the archive but..) I could go on and on…

Sadly, the Java modularization project (Jigsaw) was just booted out of JDK7 this week and won’t appear until JDK8, if ever. The project intended to use the system’s native packages to deploy jars. I have no idea how that would work on Windows or OSX, but here are some .deb of the core libraries:

Is this “the real problem with Java in Linux distros” or “the real problem with Linux package management”? Consider that all these challenges (multiplied by the number of distros one wishes to “package” for) are what keep outside developers from making packages available to your distro of choice. One place this is brutally evident is in the Linux gaming world where installing games (open and otherwise) has been (for the last 15 years) “awkward” and sadly, still is!

The fact that the majority of the software on any Linux distro is neatly packaged in this manageable form and only the tiny part comprised of Java programs isn’t indicates, rather clearly, that the problem rests on the Java side.

“Doesn’t play well with others” is what comes to my mind.

Carlie Coats

September 26, 2010 at 16:26

No.
I don’t do Java development. Instead, I do
(mixed Fortran/C) environmental modeling.
Over the course of the last twenty years,
I’ve had in excess of a million LOC in production use, so I know very well what I’m talking about:

The shared-libraries-only ideology prevalent at the GNU and Linux-distributor level makes it EXTREMELY difficult to write and distribute software intended for users who use a wide variety of distributions, and who don’t have root on the (university or government) servers they use for the modeling.

I sense that Java app-developers have similar problems.

If any of the usual ideologues have written as much production software as I have, then we need to talk. Otherwise, they need to stifle their ideology and listen to me.

Lots of issues here:
1. The developers probably aren’t using linux and its particular issues just aren’t their concern, and even if they were they still have to deal with other systems too.
2. gnu/linux distros are too stuck (particularly debian) on this idea of packaging everything as a separate dependency. Shared libraries are more about saving memory than disk – and there are so many of them today even that argument is pretty weak, and worrying about a few duplicated java libraries seems a little silly. And java libraries don’t share like that anyway.
3. ant. How any sane human could choose to use such a braindead horrid build tool is seriously beyond me. Using such a crap tool makes it very difficult to customise the build – required for most distributions – if it even works in the first place. It almost makes libtool look sane. i’m not sure maven is much better – like most similar things its ok when it works but a nightmare when it doesn’t.
4. Having a platform won’t work and will just be a waste of time and energy. It’ll move too slow/be too narrow for some, too fast/too bloated for others. And it’s hard getting good stuff together – the apache guys have a bunch of java but a lot of it just isn’t ‘best of breed’.
5. Java has some of its own stuff (e.g. maven at developer end) or webstart at run-time end which works beyond linux – which is a big issue in javaland. Pity it’s a bit flaky.
6. Dictating tools won’t work either. e.g. I’m not interested in using maven ever and if it wasn’t for netbeans using it automatically I definitely wouldn’t be touching ant.

gnu/Linux distros should just suck it in a bit and realise that disk is cheap and shipping only 1 instance of the library doesn’t make a big difference to execution. There just isn’t any practical alternative to shipping at least most of the libraries for each application unless distros want to do their own qa testing and porting (they don’t seem to mind for some other software mind you …). And unless they start shipping stuff they will never learn how to deal with it or be able to offer real solutions.

Maybe if they had a well defined system to use then they could ask up-stream to add support for it – but you can’t really expect upstream developers to support something undefined or anything that might interfere with java’s cross platform nature which is an issue for most java projects.

The other alternative since java is cross platform and it’s something it probably needs anyway, is a completely separate red-carpet like thing (oh sorry, ‘app store’) with a simple client for installation which is the only thing distributions have to manage in terms of their normal distribution system. Perhaps even using jnlp for the underlying mechanism which has some versioning stuff and is at least quite simple (so it should be possible to fix any issues with it). Distros could then provide repositories of standard jnlp files for developer use which would entice them into the ecosystem (and even just advertise java as a viable platform). Problem is say redhat might not be that keen on exporting software to windows or mac (ubuntu might not mind), or even keeping it around forever. Hmm, maybe something for the gnu project although they might be a bit dark on java atm.

I think this post is quite an accurate description of the problem, which is the ecosystem. The language is not at fault, but the JVM approach is facilitating this :) In other words, it’s a price for the portability of jar files, because it’s so easy to bundle them and they work everywhere, unlike native (C/C++) libraries.

Interestingly, script languages (perl, python, ruby) are similar, because they distribute sources that are directly ‘executable’, so bundling libs is also easy for them. But it seems to me they handle this better and libraries are shared. Maybe because they got their own packaging system (gem, distutils) early, unlike java?

Just to clarify the SLOTs in Gentoo, we don’t slot all (or nearly all) versions of libraries, but only versions that break compatibility (usually major ones but as you say there are no guidelines), which we can only detect by compiling/testing all depending packages when bumping given library.

As for the solution, from a source-based distro point of view we like it when the build system supports easy replacement of bundled libraries with system installed ones and doesn’t try to download them from the web. So having a unified way to do this would help make the packaging faster. Maven is possibly going in the right direction (haven’t looked at it much yet). What could then possibly help, if not the reference repo idea, is a way to make it download latest versions instead of fixed ones, and make upstreams try to do it regularly and test for regressions themselves :)

The underlying problem with Linux package management and Java is that it is far easier just to manually download and install the JDK, app server and application binaries that you need. I have deployed Java applications in a number of organizations on Red Hat, SUSE, Debian/Ubuntu and Windows. In many cases I have started off using the packaged Java and app servers for the different distributions, but for one reason or another it has proven easier to manually install the required binaries in /opt.

Primarily this has occurred as the various package management systems have applied a level of complexity or overhead that was hard to justify. For example in the case of Red Hat/JPackage the Tomcat/JBoss packages are out of date, and the files/libraries installed do not match that of the standard install. Whilst the situation isn’t as bad on SUSE and Ubuntu, there’s something comforting about knowing your mission critical Java applications are safely tucked away in /opt and that an apt or yast upgrade will not change a critical library. The majority of the organizations I have worked with have a number of legacy Java applications, and having package managers decide it is time to change Java library files is often a recipe for disaster.

One thing the Linux distributions could do is make the experience of manually installing JDKs, app servers and applications far easier. This could be achieved by having meta packages within the package managers, for example tomcat6-meta.deb. These meta packages would contain distribution specific files such as init scripts, and a script that would walk through setting these scripts up to work with your manually installed Java binaries. For example:
Installing Tomcat 6 meta…
Where is your JDK installed (/opt/java)?
Where are your Tomcat 6 binary files installed (/opt/tomcat6)?
Where are your Java web applications installed (/var/java)?
Do you have any specific Java runtime options?
Configuring the Tomcat 6 meta package…
Configuration complete.
You can start Tomcat 6 by running: sudo service tomcat6 start
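
To make the idea concrete, the configuration step of such a meta package might boil down to turning the answers into a settings file that the distribution’s init script reads. The following is a rough sketch, not an existing tool: the function name and the WEBAPPS_DIR variable are invented for this example, while JAVA_HOME, CATALINA_HOME and JAVA_OPTS are the environment variables Tomcat’s own startup scripts already understand. The default values are the ones suggested in the prompts above.

```python
import shlex


def render_defaults(jdk_home="/opt/java", tomcat_home="/opt/tomcat6",
                    webapps_dir="/var/java", java_opts=""):
    """Render the answers as shell-style NAME=value lines for an init script to source."""
    settings = [
        ("JAVA_HOME", jdk_home),         # where the manually installed JDK lives
        ("CATALINA_HOME", tomcat_home),  # where the upstream Tomcat tarball was unpacked
        ("WEBAPPS_DIR", webapps_dir),    # hypothetical name for the web application directory
        ("JAVA_OPTS", java_opts),        # extra JVM options, possibly empty
    ]
    # shlex.quote keeps values with spaces or shell metacharacters safe to source.
    return "".join(f"{name}={shlex.quote(value)}\n" for name, value in settings)
```

The package’s post-install script would ask the questions, write this text to a distribution-specific location, and leave the binaries in /opt untouched.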

> The majority of the organizations I have worked with have a number of legacy Java applications, and having package managers decide it is time to change Java library files is often a recipe for disaster.

This has been my experience with most current Java apps as well, not just legacy ones.

Java will remain a second choice for me. The newest use of the ‘non-removable’ cookie is a security risk and an invasion of privacy… when I use the net I control what’s removed from my system, not the programming language…

Maven is hardly the answer. Compiles today, breaks tomorrow (but your code is the same). Fedora even has to use a modified maven to prevent it from doing network access at build time, which is a really big no-no from a security standpoint.

Java developers have never done the hard yards to do proper dependency analysis, and this is encouraged by the “JDK” system. Furthermore the java namespacing system has made the problem far worse. Whenever a project changes website, the whole compilation chain breaks down.

Java developers need to:
* Tag source releases against binary releases (notably jogl/gluegen never did this — try to find two jogl/gluegen source code repo revisions that work together — impossible).
* Name projects properly, and keep them available in source form (It is apparently OK to ship binaries when the original project disappears).
* Keep build system scripts working. Most java developers have no clue about their build systems, what they do, or how they work.

Really, java development is only “easy” because it is easy to do things completely wrong. This is a pain for end users, a pain for distributors trying to make everything neat, and very easy for developers. (heck, many java developers don’t have a freaking clue that they are distributing zip files with fancy names and specific file layouts).

Frankly, java, as a system provides nothing helpful to distributions. The code that others contribute is useful, however, and many packagers work to help get this code in place. Look up how many people ask how to set the classpath, and are told to install eclipse.

But a half decently maintained c/c++ system is far better than a half decently maintained java system. The lines of source required are about the same (if you use the right libs), and the portability is the same.

From a pure end-user perspective: Who cares? All I know is that there are fewer Java apps showing up in my package manager than there statistically should be. As a developer I see the dependency problem, but I don’t agree that Linux distributions should break up JARs just for purification purposes. Let upstream do upstream tasks, don’t duplicate work. Someone please just write a jar2deb tool and set up a repository.

Really? And how do you propose to solve this scenario:
You have 1000 Java applications/libraries in your repositories created by jar2deb. 500 of these bundle, let’s say…plexus-utils version 1.2.3, which has a serious security flaw. You have a fix for this bug, but how do you apply it? The jars are binary, the fix is in the form of a patch to the source code. I see 2 solutions:
* update 500 libraries/applications
* rip out bundled plexus-utils and update just one system library

Which one would you choose? (or do you know any other way?)
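
Whichever option is chosen, the first step is finding out which packages carry the flawed jar. A minimal sketch of that search is shown here, assuming the bundles are available as unpacked trees or as zip-format archives (war, ear, zip) under one directory; the function name is made up, and jars nested inside archives that are themselves inside other archives are not inspected.

```python
import zipfile
from pathlib import Path


def find_bundled_copies(root, jar_name):
    """Return the paths under root that are, or directly contain, a file named jar_name."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        if path.name == jar_name:
            # A loose copy, e.g. in the lib/ directory of an unpacked tarball.
            hits.append(path)
        elif zipfile.is_zipfile(path):
            # A zip-format archive: check whether any entry is the jar we look for.
            with zipfile.ZipFile(path) as archive:
                names = archive.namelist()
            if any(name.rsplit("/", 1)[-1] == jar_name for name in names):
                hits.append(path)
    return hits
```

Called as find_bundled_copies(some_directory, "plexus-utils-1.2.3.jar"), it produces the list of bundles that would each need an update under the first option.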

helsinkiharbour

July 22, 2012 at 14:38

That’s such an old and brain-dead argument from the beginning of time… :/

There is a name for such conservative thinking, “meteor shower insurance syndrome” -> constant, real costs for a hypothetical scenario in future.

Such “feared” security flaws seldom happen, and if they do, let’s for God’s sake update the applications (regular distro updates create more work all the time anyway). Application- and end-user-centric thinking would be a much more pragmatic and sane approach than this annoying distro-focused thinking that only causes trouble and work all the time.

One has to remember just a few “basics”:
1/ In the Java world, a program/application must be guaranteed to run as well and the same way on every platform => one JVM per platform, the application packaged as a unique binary for all platforms
2/ Users don’t care about the underlying packaging system (even more, the details of the software behind the curtain are NOT their job). They want the program to just run as expected! => they like the simplicity of manually installing a JVM if the platform doesn’t already provide it, and then directly downloading the application as a binary: ready to run!!! Building? what the fuck … Remember the majority of the users DON’T UNDERSTAND even the word ..

So, the most reasonable thing for Linux distros to do is to facilitate the installation of the most well-known Java programs, as David Harrison said earlier. This is the only way to attract more Java users to the Linux platform.

We have to accept the idea that Java was, until today at least, not made with a platform packaging requirement. The application is a whole, already packaged. The developers are in charge of guaranteeing that it will work as expected everywhere, and this is not in a third-party packager’s hands.

For the rest of the geeks here with their personal tastes and ideas about Java as a language or as a platform, please remember: in the end, the fact that your precious program runs is because users love it and choose to run it, not because it is written in C or Java. So, if you are happy working with mem allocs, #pragma, pointers … well go on! The important thing is the user base. If Linux people think they can live without Java and the ideas behind it … well it might end up being a system for geeks, or “1% of the OS market share”. Is this really desirable?

There are idiots who don’t understand the benefits of a distribution-managed package repository. They should keep downloading Java zips (jars).

The rest of us, however, do care about maintenance and the professional quality control most of distribution packages receive.

As you mentioned, the Java ecosystem is incompetent when it comes to packaging. Compare the situation to Perl, Ruby, Python, and many other language communities that got the packaging [mostly] right. Maven and some other projects have taken tiny steps in the right direction, so the situation will resolve in time, hopefully before the Java platform is made obsolete by the likes of C# and Objective-C.

I also believe that a large and well-endorsed repository of well-maintained Java libraries plus the ability to install multiple versions of a library (hopefully only from the said repository), could be the approach to fix up the Java library ecosystem. Think of a Github for Java with a facility for managing the built Jars, and a “gem”-like client that manages your ~/.m2, gets the classpath, and execs. A huge plus if that can be done using dpkg, to manage both system-level (/usr/share/java/*.jar) and user (e.g., ~/.m2). If it cannot be done using dpkg, maybe the second best option is to build a small but reliable (security-hardened) “gem”-like client to resolve dependencies, and keep an index of what is installed. Such a tool can then be called from dpkg.
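
The “gets the classpath” part of such a client is the easy piece, because the local Maven repository (by default ~/.m2/repository) has a predictable layout: the groupId split on dots, then the artifactId, then the version, then a file named artifactId-version.jar. Here is a minimal sketch under that assumption; the function names are invented, and it neither downloads anything nor resolves transitive dependencies.

```python
import os
from pathlib import Path


def jar_path(repo, group_id, artifact_id, version):
    """Location of a jar inside a Maven-style repository directory."""
    return Path(repo).joinpath(
        *group_id.split("."), artifact_id, version, f"{artifact_id}-{version}.jar"
    )


def build_classpath(repo, coordinates):
    """Turn "groupId:artifactId:version" strings into a classpath string.

    Raises FileNotFoundError if a jar is not present in the repository,
    which is where a real client would fetch or install it.
    """
    jars = []
    for coord in coordinates:
        group_id, artifact_id, version = coord.split(":")
        jar = jar_path(repo, group_id, artifact_id, version)
        if not jar.is_file():
            raise FileNotFoundError(f"{coord} is not installed at {jar}")
        jars.append(str(jar))
    return os.pathsep.join(jars)
```

The resulting string can be handed to “java -cp”. The same function would work against a system-level repository laid out the same way, which is one way the dpkg-managed and per-user cases could share a client.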

>Users don’t care about the underlying packaging >system (even more, the details of the software >behind the curtail is NOT their job). They want >the program to just run as expected

This is why you need packaging. If Java developer A builds binary A, and it is supposed to work with developer B’s B, and both of these require the C jar library – this can only be guaranteed to not break if they use the same C. Otherwise you are doomed.

Funny that you should focus on package versioning as a big issue. In theory, the Java language is well positioned to not have library version problems. I didn’t get this until I actually started writing Java code, but the most powerful feature introduced by Java is the way it does run-time linking. You mean, I don’t have to write .h files? Wow. Unlike C++ where code actually compiles in detailed object layouts defined in .hpp’s, Java code accesses library objects in a consistent way that does not require any knowledge of the object’s internal layout at compile time (or at least, that’s how I understand it). That *should* make it much easier to design libraries that maintain backward compatibility than with a language like C++. So why don’t the various projects do that? Could it be that they’re just not aware of the problems that could be solved that way?

Lots of C/C++ open source projects seem to take the attitude that ‘you can build it from source, so who cares about binary compatibility’. True enough, but that’s also a big part of why having multitudes of Linux distros causes problems. Most Linux distros come into being for stylistic reasons. By that, I mean preferences for what combination of packages and themes makes for a nice desktop system. There shouldn’t really be any reason why an app developer couldn’t release a binary app to run on all distros, but there is. Because so many commonly-used libraries break compatibility (or at least are not committed to *not* break compatibility) with each release, you really need to build binaries for specific distros (and the specific levels of each library that they bundle).

Windows gets binary compatibility, and largely achieves it. Not because of any inherent technical superiority – they just understand that binary compatibility is more important than fixing some ugly API that’s gnawing at some developer’s sense of aesthetics. Hell, there’s plenty of ugliness in POSIX libraries, but everyone seems to get that it’s more important to maintain a consistent standard than to ‘fix’ strcpy, etc. Memory leaks and all.

I think Google appreciates this with Android, and for the most part old Android app binaries can run on an upgraded Android OS with no problems. Big contributor to Android’s success. So it’s not a Java problem per se. It’s more of an open source attitude problem. Choice is always good, except when it isn’t, except that it always is… etc.

You have it a little mixed up. Most C/C++ projects/libraries guarantee binary compatibility on micro revisions (3rd number in version). Not 100%, but Java libraries don’t have any notion of binary compatibility, and just because it won’t scream at you doesn’t mean it will work. Imagine a java packager who is taking care of, let’s say…velocity (picked at random). Current version 1.2.0, but there is version 1.2.1 upstream. If he was a C/C++ maintainer he could update without hesitation because, like I said, binary compatibility usually holds within micro revisions. Not so for java. Therefore what the java packager has to do is look at the changes introduced in 1.2.1 (in source code) to verify it won’t break the few tens of packages depending on velocity that are in repositories.
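
The micro-revision convention described here is simple enough to state as code. This sketch assumes plain three-part numeric versions and an invented function name; it encodes the convention only, and says nothing about whether a given library actually honours it, which is exactly the problem for Java packages.

```python
def parse_version(text):
    """Split "X.Y.Z" into a tuple of three integers."""
    major, minor, micro = (int(part) for part in text.split("."))
    return major, minor, micro


def is_micro_update(current, candidate):
    """True when only the third number increases, i.e. X.Y stays the same."""
    cur = parse_version(current)
    new = parse_version(candidate)
    return cur[:2] == new[:2] and new[2] > cur[2]
```

For the velocity example, is_micro_update("1.2.0", "1.2.1") is True, so a maintainer relying on the convention would upgrade without further review, whereas a Java packager still has to read the diff.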

I think you nailed it: Many give a sh** on binary compatibility. With modern IDEs it is so easy to do major refactoring. In classic VB it also bails at you when you are about to break binary compatibility. I miss that in Java, although I think I am quite aware of the situations that could break binary compatibility and avoid to break it.

As a Linux user, I see very little problem with the available program universe in the distro I use. (Debian).

The Java developers here all seem to think that a successful Linux system should be trashed for their convenience. Well, that’s not going to happen. If there are problems with Java, then fix Java. If that is too much work, then bye bye Java.

I use Linux because it is much more secure and controllable than Windows. No viruses, etc. Java is an insecurity vector for me. Saying I have to allow unknown libraries because some developer wants it is asking for a disaster. I will say NO. In Linux, I can do that. I do do it. If you want to be able to reach me as a customer, you will need to do better than claiming that you are perfect, and I am the problem.

Linux is now around 10% of the market in combined desktop, laptop, electronics and phones. Tomorrow it will be 20%. Linux is already around 40% of total server share. The single biggest player. If you can’t deal with the system, then you won’t get that market. Others will. Java is there in servers. It can be done in other systems. To get there, you will need to use the Java we have. Make it easy, or make it work with backwards compatibility. Everyone else does it. Why can’t you?

There is a reason Apple iPhone doesn’t do Java. Give Google a reason to have another language for Dalvik, and you lose the Android market too. The choice is yours. But, complaining that my platform needs to change for your convenience is a good way to lose that platform. It’s a rapidly growing platform.

There have been a few comments similar to this one which demonstrate the level of misunderstanding of Java by the typical Linux user (i.e. non-Java developer). The Java software stack isn’t just an interpreter, but a complete virtual machine that has been designed to allow third-party (untrusted) code to run within a sandbox.

The majority of Java developers will agree that it is important that Linux distributions provide first-class support for the Java Virtual Machine. Where it gets difficult is when the ideology of the Linux distributions gets applied to the applications that run within the JVM. For the most part these Java applications are designed to operate as isolated, standalone entities that have more in common with a VMware/Xen/KVM virtual machine than a ‘typical’ C or C++ application.

Attempting to break down these applications into self-contained libraries that are centrally managed by a package manager provides minimal, if any, benefit. Assuming you are using a current JVM and have adequate security policies in place (http://www.javaworld.com/javaworld/jw-08-1997/jw-08-hood.html), it doesn’t really matter what Java libraries your application uses.

If the Linux distributions wanted to help make Java a first class citizen they should:
a. Ensure the JVMs they ship are quickly patched if security vulnerabilities emerge.
b. Publish guidelines for Java developers on where they should install their application ‘bundles’.
e.g. /opt/applications, /usr/lib/java/applications, /var/java/www, etc.
c. Produce some tools that allow Java developers to quickly package these bundles, and the distribution-specific scripts for launching them, as deb or rpm files. These tools could be command-line based (e.g. rpm-javabuild), or plugins for Maven and Eclipse.

You can’t expect Java to change for Linux or vice versa. Instead a middle ground needs to be sought where the best properties of both environments can be leveraged.

Buchan Milne

September 27, 2010 at 14:25

Are you implying that Java and Java-based software never have vulnerabilities?

If so, why should distros ship JVM updates (I note that a number do already)? If not, why do you say users don’t understand the Java sandbox?

If Java is only meant for dedicated, isolated applications, why do Java developers write desktop applications in Java?

Most distros have /usr/share/java for jar files.

How can you automate packaging of java applications, if Java has no built-in support for indicating/discovering run-time requirements (where native libraries, perl, php, mono etc. do)?

How can distributors use/provide Maven plugins, when it is impossible to prevent Maven from always trying to update itself, even if it is up-to-date enough for the software?

Java needs to change for Java; acknowledging the issues the Linux community has with Java will only improve it for everyone on all platforms. Many Linux distributions, and some Linux-related projects (such as jpackage.org), already do a lot to try and improve the deployment of Java applications/packages, but the platform itself has limitations which prevent improving it, and the Java developer mindset that goes with it doesn’t help.

I don’t think anyone should blame Java devs for not packaging right. In my experience using Java on a few GNU/Linux distros, many provide outdated Java and libraries. For example, Ubuntu stayed on SWT 3.3 for years and only moved to SWT 3.5 *recently*. Ubuntu has an outdated Eclipse that sometimes failed to run (on my system). NetBeans didn’t get packaged. Even if it’s packaged now, it is still not the newest version.

Gentoo puts jars in /usr/lib, Debian/Ubuntu put jars in /usr/share/java. AFAIK, we have no tool to locate all the jars. So, the safest bet is to put all Java applications and libraries in /opt.

I know that because of Sun’s licensing, the Sun JDK hadn’t been packaged. But even *now*, we still don’t have the Sun JDK. I did try OpenJDK from the Ubuntu repo, but our smart card implementation doesn’t run well on it. We had to switch back to the Sun JDK.

IMHO, to make Java a first-class citizen, distros must put all standard libraries in a consistent place and keep track of recent development of prominent Java libraries and IDEs such as Eclipse/SWT and NetBeans. And please, put the Sun JDK in the main branch.

Why do you think all non-Java applications are kept up-to-date, but Java ones are not? Do you really think it is that packagers are just so lazy, and Java is easier to package than (say) perl or mono/.Net? Have you not considered that maybe Java distribution practices are insufficiently standard?

Why should the distribution provide the means to locate JARs? Shouldn’t the distribution just need to specify at compile-time of the JVM where the default JAR search path is (equivalent to how say, perl, or python, do)? And provide tools to detect where new JARs should be installed to? How about which JARs are required by other JARs?
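On the question of which JARs are required by other JARs: the JAR format does have one narrow built-in hook, the optional Class-Path attribute in META-INF/MANIFEST.MF, which lists relative URLs of other JARs. It carries no version or package-name information, so it falls well short of what a packager needs. A minimal sketch of reading it follows; the class name ManifestDepsSketch and the sample manifest text are illustrative assumptions, not part of any distribution tooling.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

/** Sketch: list the JARs a manifest declares in its Class-Path attribute. */
final class ManifestDepsSketch {

    /** Returns the whitespace-separated Class-Path entries, or an empty list if none are declared. */
    static List<String> classPathEntries(Manifest manifest) {
        String value = manifest.getMainAttributes().getValue(Attributes.Name.CLASS_PATH);
        if (value == null || value.trim().isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(value.trim().split("\\s+"));
    }

    /** Parses manifest text held in memory (hypothetical helper so the sketch needs no real JAR). */
    static Manifest parse(String text) {
        try {
            return new Manifest(new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up manifest; a real one would be read with JarFile.getManifest().
        Manifest m = parse("Manifest-Version: 1.0\nClass-Path: lib/a.jar lib/b.jar\n");
        System.out.println(classPathEntries(m));
    }
}
```

Because the entries are bare relative paths, a packaging tool would still have to map each one to a distribution package by some other means.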

Most distros provide “Sun” JVM/JDK in non-free at present.

Mandriva 2010.1 has tomcat 6.0.26, netbeans 6.8 and eclipse 3.4.2 (vs. 6.0.29, 6.9.1 and 3.6.2 that are now current), but a lot of other software is not packaged, due to maven brain-deadness and other similar problems with the Java platform, which just make it too much effort for packagers (such as me), who would rather spend time on software that isn’t a constant struggle to update.

If the Java application is packed into a deb, as Freemind is for example, I don’t have a problem. It is sufficient that applications keep their libraries approximately up-to-date, which they usually have to do during the development process anyway.

You’re just trolling. You should contribute something meaningful to the discussion or keep your comments to yourself. Just because there are bad Java developers, just as there are bad developers in any language, doesn’t mean you can generalize the skills of all Java developers.

Go tell the developers of Glassfish, Hadoop, Cassandra and a thousand other amazing Java projects how much they suck at writing code and let them know you think that they are jokes. You’re likely to get sat down pretty hard, considering your amazingly eloquent response here. But please…enlighten us with your amazing coding credentials and the authority you have to speak on such matters.

BTW, many of us can code just as well in VIM/Emacs as we can in a Java IDE. That doesn’t mean we’ll choose to gimp ourselves and run away from code completion, large-scale refactoring and many more incredibly useful features. You’re not uber just because you don’t use an IDE…but obviously no excellent developer should be dependent on an IDE.

a) Using Maven or binary distributions is a security nightmare. I think the Java community really needs the lesson of a trojan in some jetty / tomcat / eclipse jar to learn that code should have a cryptographically secure trust path from every individual code commit to the resulting binary – as is the case with e.g. the Linux kernel.

b) Starting some java xyz.war in a screen session as root is not the same as correctly deploying a server application. What about security? Debian packages will generate appropriate system users and groups. Who tells you that (security) updates are available? The package manager! What about deploying at scale? Do you want to maintain your own repository of .war files? A Linux repository will give you that for free. What about process restart after boot or clean shutdown? Who rotates your log files? Who versions your configuration files? (Some admins put /etc under version control) – In short: Deploying Java applications is a system administrator’s nightmare. Maybe one part of the problem is that too many Java application developers don’t have any clue about system administration, because that’s the job of those hairy monkeys in the basement?

Buchan Milne :
Are you implying that Java and Java-based software never have vulnerabilities?

I guess you could have that impression if you didn’t read my comment and didn’t understand the Java stack. Like any virtualisation platform it is critical that the ‘hypervisor’, i.e. the JDK, be kept current, and that the applications running on top of it take appropriate security precautions.

Unfortunately what I see too often (as someone who goes into organisations using Java) is out-of-date JDKs and Java applications that are being run as root with the security manager disabled because ‘it was easy’. The above-average Linux administrator will scoff at this practice, but the reality is the majority of IT staff have very little knowledge of Linux or Java.

The only way this can improve is if clear documentation and relevant tools are provided to Java developers on how best to package their applications for the various Linux distributions. At the same time these distributions need to respect how and why Java applications are put together the way they are. The fact of the matter is that Java applications are platform agnostic. More often than not the same binaries are written, tested and deployed on Windows, OSX and other *nix derivatives, so asking that these be broken down and reformed in order to comply with a specific Linux distribution’s idea of package management just isn’t practical.

Yes, there are official distributions of software still using Java 1.4!

I like very much that Java applications usually bundle everything they need. Never forget DLL hell on Windows, which is a never-ending story.

Of course under Linux the dependency handling works better, although there are sometimes conflicts there too.

I don’t find it so bad that the developers have to decide which version of a component goes into the distribution. Even if an update seems to be just a security fix for a component, it could break some other functionality of the application using the component. So an update of a 3rd party component should be tested.

That said, I try to reduce the 3rd party components I use to the minimum required, to keep things simple and to avoid increasing dependencies.

I think you really hit the nail on the head, in regards to the fact that Java is basically working as expected, which is in conflict with how Linux typically works. I worked through a few Java books and the Java philosophy is always part of it. Most of my dislike comes from having worked with it, both in production and learning, but I would be lying if I said I couldn’t appreciate it ;)

Sorry, but I think the whole article and discussion in reality completely misses the point: Java is very good for Linux.

Companies currently running Windows only have very few serious options when they want to do their software development in a way that frees them from the OS dependency.

Java is a very good helper to get companies going Linux. So I would be cautious about criticizing Java too much. Of course things could be improved. Personally I love just copying over a single file or maybe a folder and that’s it. When I show this to Windows admins they are so happy that they don’t need to install anything (I have seen so many Windows boxes go on tilt after a bunch of software installs). For my Windows applications I also had to take care to use recent security-fixed libraries, but seriously: there are only some neuralgic points that are security-relevant, namely where there are interfaces to the outer world. Who cares if you don’t use the latest security-fixed persistence layer in a client app, for example, that nobody connects to and that only connects to the server over SSL?

Really, be happy that there is Java that makes applications platform agnostic.

I think we can all agree there’s an impedance mismatch here. Linux package maintainers and the Java ecosystem don’t go hand in hand. That’s a very unfortunate situation! Let’s fix that.

Java is not evil and the ecosystem is nowhere near bad. The Java world had problems with dependencies. Every ecosystem has had those. The solution in the Java realm is Ivy / Maven. These continue to improve with new versions. Have a look at the upcoming Maven 3, please. Does it help? Can we suggest anything to its developers to solve our problem?

Jigsaw is another really key component to solve the problem. Oh, it’s delayed? Great! Perfect! We have time to read the specification, think about it, try it out, and give its developers feedback. I ask all of you, package maintainers, have a look at it! Tell (politely!) the Jigsaw developers what you want / need for easy packaging. It’s one of the primary goals for Jigsaw!

There are three things to look at – Jigsaw, Maven and OSGi (why has nobody mentioned it before?). I would have a look at them myself, but I can’t say what package maintainers need. I’ve never made a single package myself. I’ve been using Linux for nine years now. Shame on me. Maybe. But I’d like to help with this one! I can help you understand the Java ecosystem. Is there any package maintainer with a positive attitude? Let’s have a look at those together, talk about them… do the team work :-)

BTW: Create even the “ugly” packages. Just download Eclipse and put it in the repositories as one monolithic package. That way you can attract more Java developers. Show them the packaging is actually useful. The packaging can be improved later. Java developers are told today to download Eclipse and put it in /opt. I do it the same way. How can you expect such a developer to do anything towards native packaging? He was just told that Java packages don’t work with native packages! It’s a chicken and egg problem.

— the story of my current project

For the project I’m working on I’ve tried Python, C, Erlang,… none fit the needs. Not even close! Ok, Erlang was quite good, but no match for Java. So it’s in Java. What I’m trying to say is that my choice of tools is driven by the needs. For the same reasons I use alpha, beta, SNAPSHOT, and even custom versions of libraries. What Linux kernel do I use? 2.6.35 from unreleased Maverick. Yep, not even the kernel is from a stable distribution. Why? We tried 2.6.18 from CentOS. It melted down in smoke testing. Two weeks of googling, reading Linux kernel source code, lwn.net and others… and the result was simply to upgrade because others had the problem before. And guess what – they solved it already! The same story with all the unstable dependencies.

We have a vision. We’re building something valuable for our customers. We need Java for the job! We need linux for the job!

If you package some of the dependencies, you will help us! I can simply download my favourite web server, unzip it and run bin/start.sh. That’s it! It works. But in production I have to create a user for it. Hook it into the system as a service. Upgrade it…. Everybody needs what package maintainers do! We’re not enemies :-)

There is a combination of great insight with this post as well as ignorance.

Insight on the Linux side, ignorance on the Java side.

It is unfortunate because Linux and java represent the two biggest forces in open source.

Both sides have the same problem, Configuration Management, and have come up with different solutions.

By Configuration Management I mean that every piece of software out there was written assuming some version of an OS and some version(s) of any library it uses. Whether you are writing Java or C++, building or running a piece of software requires gathering that configuration.

In Linux, this is done by a combination of the Linux distros and their various package schemes. In practice, this has meant projects have to relentlessly move forward as libraries update, or fall by the wayside. Major API changes to a library require a name change to the library so that dependent projects aren’t forced to upgrade; you end up needing two versions of the library installed.

While it may seem like Java is solving that problem differently, the differences are less than you might assume. Bundling jars is no different in principle than statically linking a library. If you look inside several open source C++ projects you may be surprised to find that they do just that. Even in the .deb packages.

In Java though, you have some unique issues. Every jar in Java is essentially a dynamic library along with the header files to use it. So managing jars can become a big issue with any Java project.

The solution is Maven and its associated ecosystem, which is approaching critical mass in many Java open source projects. The closest thing to compare Maven to is really the Debian package manager blended with make. That is, a Maven pom file describes not just the dependencies of a project, but also the instructions for building it; and it serves as the file that the developers of the project use themselves for building it. Dependencies don’t have to be pinned to a single version: a range such as [1.2,1.3) accepts any version from 1.2 up to, but not including, 1.3, while [1.2,) accepts 1.2 or anything newer.
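To make the range notation concrete: in Maven’s syntax a square bracket includes the bound and a parenthesis excludes it, so [1.2,1.3) reads as 1.2 <= x < 1.3. The sketch below mimics that rule for plain dotted numeric versions only. It is not Maven’s own resolver (which also handles qualifiers such as SNAPSHOT and multi-range sets), and the class and method names are made up for illustration.

```java
/** Sketch of Maven-style range matching for purely numeric dotted versions. */
final class VersionRangeSketch {

    /** Compares versions such as "1.2" and "1.2.10" segment by segment; missing segments count as 0. */
    static int compare(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    /** Returns true if version lies inside a single range like "[1.2,1.3)" or "[1.2,)". */
    static boolean contains(String range, String version) {
        boolean lowerInclusive = range.charAt(0) == '[';
        boolean upperInclusive = range.charAt(range.length() - 1) == ']';
        // Keep trailing empty strings so an open upper bound ("[1.2,)") still yields two parts.
        String[] bounds = range.substring(1, range.length() - 1).split(",", -1);
        String lower = bounds[0].trim();
        String upper = bounds[1].trim();
        if (!lower.isEmpty()) {
            int c = compare(version, lower);
            if (c < 0 || (c == 0 && !lowerInclusive)) {
                return false;
            }
        }
        if (!upper.isEmpty()) {
            int c = compare(version, upper);
            if (c > 0 || (c == 0 && !upperInclusive)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(contains("[1.2,1.3)", "1.2.5")); // true
        System.out.println(contains("[1.2,1.3)", "1.3"));   // false: upper bound is exclusive
        System.out.println(contains("[1.2,)", "2.0"));      // true: no upper bound
    }
}
```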

A fully fleshed out pom file can also include the source code repository URL, etc. Most open source maven repositories publish both the sources and the “binaries”.

So I think that most of your objections to Java in Linux distros could actually be solved if someone seriously looked at both Maven and its associated ecosystem. Every Linux distro could simply include a starter Maven repository for libraries it needs, and it could be standard to include a pom file with every Maven executable that used mvn exec:exec to find the right Java libraries either locally or on the Internet, much like apt works now.

Email me if you’re interested in how I see that working, but it’s really just a matter of coming up with a standardized configuration; I don’t think you would need any code.

Note that Maven is already used to some extent inside Debian & Ubuntu. Most Java library packages ship with POM files and we have tools in place to use them. It’s not a completely parallel distribution system though, so it doesn’t really address the underlying issue: that some Java applications, to build using maven, require three different versions of the same library. The general lack of convergence is what makes it all difficult, not the technology or the tools. That’s why I mentioned that the issue is not in Java or Maven, but more in the way those tools are used in the ecosystem (or the behavior that they encourage).

Requiring 3 different versions of the same library is actually considered a bug, generally. Unless you’re using OSGi or something, you can’t control which one the JVM will find first. If you are using OSGi, then you know what you’re doing, but you’re probably some kind of server process and have a good reason.

Ok, you’re familiar enough with mvn, so imagine this.

1. A “system” maven repository where java stuff gets stored. Maven repositories are organized by version.
2. A simple script to replace “java” that uses the maven exec plugin to run a java command from a pom file. Any needed java libraries are loaded and installed in the user’s local repository, or the system repository if run by root.
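The lookup behind these two steps can be sketched roughly as follows. It relies on the standard Maven repository layout, where an artifact lives at groupId (dots turned into directories)/artifactId/version/artifactId-version.jar. The class name, the lookup order (user repository first, then system repository) and the example directories are assumptions for illustration; this is not what the exec plugin does internally.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

/** Sketch: find a jar by Maven coordinates in a user repository, then a system one. */
final class RepoLookupSketch {

    /** Maps groupId:artifactId:version to its relative path in the standard repository layout. */
    static Path relativeJarPath(String groupId, String artifactId, String version) {
        return Paths.get(groupId.replace('.', '/'), artifactId, version,
                artifactId + "-" + version + ".jar");
    }

    /** Returns the first existing copy of the jar, preferring the user's repository. */
    static Optional<Path> resolve(Path userRepo, Path systemRepo,
                                  String groupId, String artifactId, String version) {
        Path relative = relativeJarPath(groupId, artifactId, version);
        for (Path repo : new Path[] {userRepo, systemRepo}) {
            Path candidate = repo.resolve(relative);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Both repository locations are hypothetical examples.
        Path userRepo = Paths.get(System.getProperty("user.home"), ".m2", "repository");
        Path systemRepo = Paths.get("/usr/share/maven-repo");
        System.out.println(resolve(userRepo, systemRepo,
                "org.codehaus.plexus", "plexus-utils", "1.5.15"));
    }
}
```

A launcher script would call something like resolve for each declared dependency and join the results into the classpath, falling back to a download only when both repositories miss.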

Yes, that’s one possible solution: maintain a parallel distribution system alongside the classic APT repositories, to better support a Java maven-driven ecosystem.

About “3 different versions”: usually they are not all required at runtime. I remember trying a full maven build (starting with an empty .m2) of Glassfish or Geronimo and inspecting the contents of the resulting .m2 directory, realizing that it had to download 4 or 5 different versions of plexus-utils in order to complete. It’s probably that project being sloppy in convergence, but that’s a pattern that maven encourages rather than tries to prevent.

I have only one problem with this approach. Yes, we could make it work, but if we want to make sure that applications are actually open source, and that the current version can still be built if they suddenly turn proprietary, we really have to build from source. We cannot just let go and use pre-compiled binary jars when we have no idea what’s inside. We would put our users at risk, because after some time they would probably have several bug-ridden, insecure libraries on their systems without their knowledge.

You already have the same possibility using binary debs. I build most packages from source, but in my experience I am in the minority. Sure debs have signatures but few people check them. Jars support signatures also – the use of jar signing was built into Java long before Debian maintainers thought of using crypto hashes to protect debs.

What is most evident in many of the posts above is that the biggest hurdles to be overcome are ignorance, suspicion and the resulting unwarranted derision of Java and Java developers by many Linux ideologues.

This is truly ironic considering the role Java played in securing Linux a foothold in many corporations during the late 90s.

I’ve been a Linux user and contributor since 1996 and a Java developer and Java open-source contributor since 1999. Throughout my work as a developer and advocate I have found the two to be entirely complementary.

When it comes to distribution Java is open to any mechanism you may choose to implement. Sadly the mental block seems to be in the arrogant my-way-or-the-highway mentality of the maintainers of some Linux distributions.

As mentioned by a few, Maven is a part of the solution. Another is providing first-class support to Java developers and encouraging intelligent filesystem placement for their artifacts consistent with the Linux Filesystem Standard. There is no harm in having multiple library versions on a machine provided there is a mechanism to restrict the runtime classpath to those versions against which the shipped application/library has been tested. Again the Maven coordinate system provides a good starting point for this.

The duplication mentioned above could be limited to the deployment host if Maven is used to satisfy installation dependencies during application installation. Even then the “duplicate” jars could be symlinks to single copies in somewhere like /usr/share/java or a local Maven repository which replaces /usr/share/java. I have used exactly this approach for a few years and found it to be entirely compatible and consistent with using Apt to manage deb packages.
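A minimal sketch of the symlink idea, assuming a shared jar already exists somewhere like /usr/share/java and the application keeps its own lib directory. The class name, method name and paths are hypothetical, and a real installer would also need to handle permissions and removal.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Sketch: expose a single shared jar inside an application's lib directory via a symlink. */
final class SharedJarLinkSketch {

    /** Creates (or replaces) appLibDir/<jar file name> as a symbolic link to the shared copy. */
    static Path linkSharedJar(Path sharedJar, Path appLibDir) {
        try {
            Files.createDirectories(appLibDir);
            Path link = appLibDir.resolve(sharedJar.getFileName());
            // Drop any bundled duplicate or stale link before pointing at the shared copy.
            Files.deleteIfExists(link);
            return Files.createSymbolicLink(link, sharedJar.toAbsolutePath());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The application’s classpath then only ever names files under its own lib directory, so it sees exactly the versions it was tested against while the bytes are stored once.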

Can we stop comparing packaging sizes and just get on with resolving the problem?

I realise this is an old thread, but having read through all the wildly erroneous statements made, I had to say something in case some poor noob read some of this drivel and mistook it for truth.

java is corporate because of the “education purposes” very similar to $$MS’s corporate representation… but they even went further and bought government officials to push their rubbish and tie everyone to their pocket…