Haskell’s Own DLL Hell

I touched on this issue in a more positive way in this recent entry. But now I’m going to be more negative about it. You see, here’s the thing: Significant Haskell software projects are struggling under the weight of the Haskell equivalent of “DLL hell”.

“In computing, DLL hell is a colloquial term for the complications that arise when working with dynamic link libraries (DLLs) used with Microsoft Windows operating systems.”

The idea is that various applications on the computer share libraries. These libraries have different versions, and different programs often need different versions. The “hell” starts when some programs overwrite the libraries with other, incompatible, versions, or when one program somehow turns out to need both of two incompatible sets of libraries.

To be sure, Haskell is better equipped to handle the problem than historical Windows executables. In old versions of Windows, programs often blindly copied the DLLs they needed into a system-wide location (Windows\System) without regard to any versioning at all. However, in Haskell we are dealing with the problem on a different order of magnitude. Whereas a DLL on Windows is generally a pretty substantial project in its own right, many Haskell packages on Hackage consist of just a few lines of code (put in a positive light, they do one thing and do it well!), and as a result, many other projects depend on dozens of different packages, either directly or indirectly.

This has consequences: my experience is that it’s currently near-impossible to embark on most nontrivial Haskell software projects without spending a substantial percentage of one’s time fighting dependency version issues. This is especially true of “real world” sorts of projects: those that often involve pulling together a lot of different functionality, rather than solving specific, well-defined computational problems.

That’s where we are. The question is what we do about it. I’d like to propose some questions for discussion:

1. Should packages follow the PVP by putting upper bounds on their dependencies?

This one is certainly going to be controversial. For background, the PVP is a set of guidelines for things like specifying the dependencies of a Haskell package and managing your own package’s version number. The short version is that packages should bump at least the minor version number (the x in version 1.x) every time they make a potentially breaking change, such as whenever any entity is removed or renamed or its type is changed. Furthermore, the PVP suggests that dependencies should have upper bounds on their versions. The goal is that if you make a change that might break someone else’s package, you create a new version, and their package will continue to build against the old version.
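A PVP-style dependency specification in a .cabal file might look something like this (the package names and bounds here are hypothetical, shown only to illustrate the convention):

```cabal
library
  build-depends:
    -- Lower bound: the earliest version known to work.
    -- Upper bound: the next major version, which under the
    -- PVP is allowed to contain breaking changes.
    base   >= 4.2 && < 4.3,
    parsec >= 3.0 && < 3.1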

There are two possible effects of following the PVP by adding upper bounds on dependencies:

Someone might try to install some package, and because of an upper bound, Cabal builds it against an older version of some library. This causes the build to succeed, where it otherwise would have failed because you removed or renamed something.

Someone might try to install some package, and because of an upper bound, Cabal fails to find the right combination of dependencies, and refuses to build it at all.

Just to be argumentative, I’ll mention that it’s pretty clear to me from personal experience that #2 happens a lot more often than #1. Following the package versioning policy by specifying upper bounds is far more likely to prevent ‘cabal install’ from succeeding than to allow it to succeed. Upper bounds, on balance, make it harder to get Haskell libraries installed successfully. When (again, from personal experience) attempts to build any nontrivial Haskell application have a less than 30% chance of succeeding anyway, should we be all that worried about the theoretical chance that a build might fail?

That’s just one side of the story, though. It’s true that an error due to failed dependencies makes it clearer what is going on than a random failure involving an unresolved symbol or type mismatch during a compile. So this is an open question in my mind.

Perhaps the less contentious way to ask the question would be this: should Cabal be modified to give a warning instead of an error for upper bounds when they would prevent the package from building at all? (And if so, perhaps there should be a new kind of “strong” upper bound, indicating that someone actually knows the build fails.)

2. How close are we to the goal of getting GHC and Cabal to tell the difference between exported and non-exported dependencies?

It would be one thing if the problem here were actual incompatibilities in code. If I’m using libraries that rely on different versions of the same package, and they both export things that rely on types or instances from that package, then I should expect the build to fail. But a lot of the time (I’d guess a majority of the time!) that’s not the case. One place this comes up a lot is with network’s dependency on parsec: network doesn’t actually export any parsers; its use of parsec is an implementation detail.

Similar issues arise in many other situations. Many library dependencies are a matter of implementation, not public interface. Even where that’s not currently the case, if this is fixed, the community’s best practice will likely shift toward using a lot more newtype wrappers rather than re-exporting other packages’ types, or toward splitting packages when they provide substantial functionality that does not need the re-exports.
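As a sketch of that newtype-wrapper practice (all names here are invented for illustration): hide a dependency’s type behind a newtype, so the dependency never appears in your public API. Here Data.Map, from containers (which ships with GHC), stands in for any third-party dependency such as parsec:

```haskell
-- Sketch (hypothetical names): hide a dependency's type behind a
-- newtype so the dependency never leaks into our public API.
-- In a real library, the module would export Registry abstractly:
--   module Registry (Registry, emptyRegistry, register, lookupName)
import qualified Data.Map as M

-- Callers never see M.Map, so they never need to agree on the
-- version of containers this library was built against.
newtype Registry = Registry (M.Map String Int)

emptyRegistry :: Registry
emptyRegistry = Registry M.empty

register :: String -> Int -> Registry -> Registry
register name val (Registry m) = Registry (M.insert name val m)

lookupName :: String -> Registry -> Maybe Int
lookupName name (Registry m) = M.lookup name m

main :: IO ()
main = print (lookupName "answer" (register "answer" 42 emptyRegistry))
```

Because Registry is exported abstractly, two libraries built against different containers versions could both use this package without the versions ever meeting in a type.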

I mentioned earlier that Haskell deals with this problem on a whole different scale than other languages: part of the reason is this lack of distinction between implementation detail and exposed interfaces.

3. Can we stop cabal-install from breaking the package database?

A very special case of this problem happens in a particularly disturbing way. It goes like this:

Package foo depends on bar-0.5 and baz-1.2.

Package flox depends on bar-0.5 and baz-1.1.

Package bar-0.5 depends on baz, but has no preference between versions 1.1 and 1.2.

The way this works now is as follows: when I cabal install foo, Cabal first builds baz-1.2, then bar-0.5, and then foo. But if I later cabal install flox, Cabal will build baz-1.1 (this is fine, since multiple package versions can coexist), and then it will rebuild bar-0.5, linking against baz-1.1. (These two builds of bar do not coexist: because bar has the same version number, the new build overwrites the old one.) At this point, foo is broken, and running ghc-pkg check will complain that it has a missing dependency.
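The sequence above, as it might look at the shell (foo, flox, bar, and baz are the hypothetical packages from this example, and the ghc-pkg output is approximate):

```text
$ cabal install foo    # builds baz-1.2, then bar-0.5, then foo
$ cabal install flox   # builds baz-1.1, then rebuilds bar-0.5
                       # against baz-1.1, overwriting the old bar-0.5
$ ghc-pkg check        # foo now has a dangling dependency
There are problems in package foo-1.0:
  dependency "bar-0.5-<hash>" doesn't exist
```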

I’m not sure what the right thing to do is here; I suppose that if bar-0.5 re-exports types from baz, then its “version” needs to include the version information from baz somewhere. In any case, this is an extremely confusing, and extremely common, issue to run into, and it results in an inconsistent package database without so much as a warning. Something really needs to be done.

4. What is the best way to deal with this in the interim?

Yackage is a great idea. (Or any other way to maintain a local Hackage; I’m aware of discussion about whether it might be better to just use the new hackage server code eventually, and I don’t think the implementation particularly matters.) Michael Snoyman’s goal was really more about hosting the collections of packages that he maintains… but for a real-world software project that otherwise won’t build, it sounds like a great way to keep track of local modifications to other people’s packages.

Another idea that might work really well would be the ability to ask cabal-install to remember a modification to various packages’ build-depends fields persistently. So instead of having to download, build, and manually install these packages just to change their package.cabal files, cabal-install would continue updating from Hackage, but reapply your existing build-depends requests (“relax fizbuzz’s build-depends to build with foobar-0.5”) automatically. Even better, make it easy to get a list of which of these local changes are still at odds with the public packages, to report to the maintainer.
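Concretely, the local change being remembered would be an edit like this to the dependent package’s .cabal file (fizbuzz and foobar are the hypothetical names from above, and the bounds are invented):

```cabal
-- fizbuzz.cabal, as published on Hackage:
build-depends: foobar >= 0.4 && < 0.5

-- the locally remembered relaxation, letting it build with foobar-0.5:
build-depends: foobar >= 0.4 && < 0.6
```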

All of these are options for mitigating the problem. But first, I think we need to recognize that this is a serious problem. I’m afraid there’s a bit of sampling bias here: I have to believe these problems aren’t getting solved because many established Haskellers tend to work on projects with very narrowly tailored scope… and that, in turn, is because many people who want to work on more general (“real world,” by my definition above) projects have fled to languages that don’t make it a week-long job to get all of a project’s dependencies to build at once.

Mike, thanks for the comments. Reading about bundler, it looks like that’s actually precisely what Cabal and cabal-install do. Every Haskell project has a myproject.cabal file in the root, which specifies (among other things) the packages it depends on and what version numbers it needs of each. Then ‘cabal install’ will resolve them all, fetch the dependencies, and build it all. This already exists, and works… most of the time.

The problem may actually be too *much* of that good stuff. Since all Haskell packages include version bounds on each of their dependencies, it’s very frequently the case that there is no selection of package versions that can satisfy all of those constraints. And that’s what my blog post was about. Looking at http://hackage.haskell.org/package/http-enumerator and the deps list there is a decent way to get a handle on the issue. :)

rvm is definitely an interesting idea as well. We have a tool to maintain sandboxed package databases, called cabal-dev, which some people know about, but it’s not widely used. It sounds like that’s what rvm does. Is rvm widely used in the Ruby world?

I was confused because Ruby’s gem command does dependency management (gem A 0.5 depends on gem B > 0.6), but bundler does dependency management management (gem A 0.5 depends on gem B > 0.6 and B 0.8 is available but the system has 0.7 which is needed by gem C which is a test dependency of gem A, …). If cabal-install combines those then I didn’t realize.

This weekend I attempted to install shsh using cabal and ran into what I can only assume are Debian issues with old packages.

rvm does sandboxed package DBs but also sandboxed Ruby installs; switch to Ruby 1.8.7 for this project, JRuby 1.5.2 for that one, etc. It’s very widely used by Ruby devs, though presumably not by end users who don’t care what version of Ruby they’re using.

cdsmith / Jan 17 2011 8:32 am

Mike, yes, I think it’s fair to say, then, that Cabal does dependency management, by having developers specify version dependencies in their Cabal file. And the combination of cabal-install and ghc-pkg does dependency management management — these maintain the installed versions of various packages, and cabal-install includes a constraint solver designed to figure out the best way to satisfy all of the version dependencies of entire sets of packages.

Debian packaging is a different matter; it’s generally best practice to get only enough from Debian packages to install GHC and cabal-install. After that, additional libraries should be built and installed with ‘cabal install foo’. Of course apt-get is great, but there are thousands upon thousands of Haskell packages, and very few of them are packaged by Debian. I typically let the Debian packages (well, Ubuntu, but same thing) manage my global package database, and let cabal-install manage my user-specific database. So I never run ‘cabal install’ as root, but instead install things under ~/.ghc and ~/.cabal. (This is the default for cabal-install anyway.) This seems to be a nice way to get the two working together; Haskell packages from the Debian repositories may then break local packages when they are updated, but this happens infrequently enough, and it’s easy enough to run ‘cabal install’ again, that it’s not an issue.

Bundler does way more than just install deps. For one thing it records the solved gem versions in a “lock” file. Once I have that I can easily build and deploy the exact same set of ruby gems on another server. Maybe cabal can do that but it’s not painless like bundler, or we wouldn’t be having this conversation.

Also, I think gems let you specify which dependencies are only needed for developing the gem. That does free up the solver.

Furthermore, I can have several different versions of Rails and any other gems all cohabiting the same bundle directory. My current project’s gem file exposes the ones I want and _only_ the ones I want for my current project. I’m not sure but does Cabal do that?

Since I converted my project to use bundler I’ve pretty much forgotten about dependency issues. If I want something new I add to Gemfile, run bundler and at commit time I check in the new version of Gemfile and Gemfile.lock. If I want to roll back I revert Gemfile and Gemfile.lock and it is as if nothing changed. I’m pretty sure cabal can’t do that.

I take no action on the production system to deploy new gems. Bundler on the remote system runs as part of my deployment system and I don’t ever remember having had a problem with it since I set it up. I change stuff, run my deployment script and go to bed.

That said, I think your main gripe is solved by making sure cabal knows that some deps are just for building the module itself and not at all needed for the module’s use.

The Gemfile.lock thing is just very nice icing on the cake, and really, I think, what makes using bundler so easy.

cdsmith / Jan 19 2011 1:37 pm

Darrin, thanks for your comments.

The lock file is definitely not something that Cabal currently does, and it does sound like a useful addition. Cabal writes something like this in the dist directory as the result of “cabal configure”, but it’s not really meant to be shared across machines.

The rest of what you’re saying is actually exactly how Cabal and ghc-pkg either work, or should work modulo the issues in other points in this article. Yes, ghc-pkg does permit multiple versions of the same packages to coexist in the package repository; and Cabal does restrict any given build to just the ones mentioned in the .cabal file it’s building from. That’s the core functionality of Cabal, and I think it does a great job at it. Just a few things to get worked out! But something akin to “lock” files and syncing them across servers might be a welcome addition to Cabal.

When I started using Haskell several years ago, Cabal existed but cabal install did not yet. I think this obscured the problem. Recently, trying to use a pre-release version of Snap, this problem was really magnified for me. Starting with a pre-existing but rather outdated package database made the situation worse. Advice on the mailing list was, dump your installed packages, and that worked, but that’s so un-Haskell it’s not even funny. “Get out of the car and get back in” is exactly the kind of thing Haskell exists to abolish.

I have heard before that the package guidelines say to bump at least the minor version number if there is an API change. The problem is that all it takes is a small human error to wreak havoc on everyone. This is particularly perplexing to me because the API signature in Haskell is certainly data available to us to be processed algorithmically. Perhaps we could find a way to make an API hash, or calculate a partial API hash for the portion of the API a given program uses. It wouldn’t save you from behavioral changes to the API that are not available in the type signature, but it may be better than relying on programmers to be vigilant and honest, which is (again) the opposite of the Haskell way.

Thanks for bringing this up. I think this is a big issue and we do need to find solutions.

This is an excellent idea! I wonder about false positives, though. Is it possible to make a non-breaking API change that would force a version (hash) bump? For example, simply exporting more functions. Adding instances may or may not be a breaking change. Does changing an opaquely exported newtype to a data declaration change the API hash? Strictly speaking :), it should (the isomorphism isn’t exact), but practically it probably doesn’t matter.

Yes, for exactly that reason, a hash is probably insufficient. New APIs can be exported without bumping the minor version… though it does require a change to the third version component. But that doesn’t mean this couldn’t be automated. Since this only needs to be enforced on upload to Hackage, extreme performance isn’t a crucial requirement.

The biggest problems here are likely to be:

(a) Currently, Hackage accepts uploads without even trying to parse or build them first. Enforcing version compatibility would require a successful build (or at least parse) of the code. Given the possibility of Template Haskell, that probably means a full build. And given the number of packages that fail to build on Hackage today, that looks infeasible.

(b) Some packages intentionally don’t follow the PVP. I’m of, at best, mixed feelings on whether we ought to exclude them from Hackage as a result.
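Still, the core idea is simple enough to sketch in a few lines. In this toy version (the hashing scheme and the by-hand API representation are invented; real tooling would extract the API from GHC’s interface files), we hash the sorted list of exported names and type signatures, so that removing, renaming, or retyping an export changes the hash, while merely reordering the export list does not:

```haskell
import Data.Char (ord)
import Data.List (sort)

-- A (name, type signature) pair for each exported entity.
-- In real tooling these would come from the package's interface
-- files; here they are written out by hand.
type Api = [(String, String)]

-- A djb2-style hash over the canonical (sorted) rendering of the
-- API.  Sorting first means the export order doesn't matter, but
-- any change to a name or signature alters the hash.
apiHash :: Api -> Int
apiHash api = foldl step 5381 rendered
  where
    rendered = concat [ n ++ " :: " ++ t ++ "\n" | (n, t) <- sort api ]
    step h c = (h * 33 + ord c) `mod` 2147483647

main :: IO ()
main = do
  let v1 = [("lookupName", "String -> Registry -> Maybe Int")]
      v2 = [("lookupName", "String -> Registry -> Maybe Integer")]
  -- A type change in a signature should yield a different hash.
  print (apiHash v1 /= apiHash v2)
```

This wouldn’t catch behavioral changes, as noted above, but it would catch exactly the mechanical API breakage that the PVP’s version-bump rules are meant to signal.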

One at least partial solution is to leave the packaging to your OS distribution. Hopefully your OS distribution will ship with the most important libraries, compiled against each other to form a consistent whole, reducing the need for ‘cabal install’ to go out and fetch a lot of libraries each time you install an application.

The great thing about distributions is that the software is tested together (including any C libraries that get linked in), and thousands of users will have the same setup, making any interoperability problem likely to be discovered and fixed quickly – and usually before it hits you.

I think anyone who’s done much development in Haskell will agree that leaving libraries to OS packaging is a non-starter. There are, what, about 3000 libraries on Hackage? And only a dozen or two are packaged for Debian, for example. That includes *no* high-level libraries for building web applications beyond the core library for CGI processing. At the same time, I don’t think Debian wants to take on the task of packaging 3000 libraries, some of which are only about 50 lines of code! The packaging work would swamp the effort to just rewrite that code on demand.

It also takes six months to a year to get packages into most OS distributions, and Haskell development is changing much faster than that.

Well, the correctness of the dependencies of any Debian package depends on many other Debian packages. And back, say, 10 years ago, Debian package management could at times be a headache as well; one of my longest-running Debian systems (installed in 2000, last booted circa 2007) had gotten to the point where it was becoming increasingly difficult to make any changes. I don’t recall all the issues, but one I do recall is that it had basically become impossible to upgrade libc without breaking the system, and installing or upgrading almost anything wanted to upgrade libc.

I’m not a dpkg guru, but I think the problem was basically solved by becoming much more rigorous and developing a competent group of people dedicated to editing and maintaining sets of dpkgs. Package management is hard, and while dpkg probably isn’t perfect, it’s been the least trouble for me over the years.

The point put forth here was a real problem for me as well. There are two parts to this: 1) I don’t really claim to understand Cabal. Since I don’t understand it, any error is black magic to me until I learn a cargo-cult solution to the problem. After that, I can work around it, but I still don’t understand anything about the underlying problem. 2) I usually spend time battling Cabal that would have been better spent hacking code.

In Erlang, the problem is simply not solved. The only packaging tool with any real presence, rebar, is a sandboxer. It will pull dependencies into the sandbox (given hg or git URIs and branches to pull) and then build them there. The odd thing is that this is much easier to maintain, mostly because the process is transparent: if you get a compilation error, it is usually local to your project’s sandbox, and you have an obvious fix at hand, local to the project.

Another problem, which I always hit with Cabal packaging, is this: I now have stm-2.1.x and stm-2.2.x installed. One stems from the default Ubuntu install, while the other stems from my own local install of the same package. This would be fine if there were a sandbox specifying which stm variant I wanted for which package, but it is a point of nervousness for me that my package dependencies obviously fail to converge.

My own proposal would be a non-solution: make the Haskell Platform strict, in the sense that Hackage packages are “built against” a specific platform. cabal install would then by default grab the package version built against my platform, ignoring any newer packages altogether. This means that all packages must be explicitly bumped to a newer platform version, but I hardly see that as a problem: it should be a burden on the maintainer, so that we get the number of packages down to a sensibly small group of maintained packages – rather than a quest for quantity.

While I completely agree with (and have had massive headaches caused by the problems in) points 2 and 3, I must disagree with your first point.

A project of mine has dependencies on a diverse range of over 20 packages. I’ve had many problems caused by the maintainers of those packages failing to provide upper bounds and precisely zero because of packages providing those upper bounds.

Most problems I can imagine that would be caused by the presence of upper bounds would be solved by addressing your points 2 and 3.

I didn’t think point 3 happened any more with GHC ≥ 6.12, because of the way it adds a long extra string (like `base-4.2.0.2-99442781c4fd10a8c30c35c9ce5fac5c`) based on the exact versions of dependencies, which means you can have baz-1.1-12345 and baz-1.1-54321 installed separately.

GHC 6.12 *does* add that extra string… but it doesn’t prevent this problem. It does seem to play a role in *detecting* the problem; ghc-pkg reports the package as broken because the dependency with the extra long string doesn’t exist any more.

So it appears that, in fact, this string is essentially a hash of the exposed API of a package. Hopefully that includes the declared version numbers of any exposed dependencies, though I’m unsure about that. If so, this is very good news: all that needs to happen is for Cabal and the package database to be updated to identify packages uniquely by this string rather than just their version numbers, and to keep as many builds of the same version as required.

That’s still not quite enough, because the hash doesn’t allow you to say which parts of the API you depend upon. The hash will change with every release in which the API changes, even if the changes do not affect your program.

cdsmith / Jan 17 2011 6:53 pm

So long as building against a different library changes that hash, it’s good enough for this purpose. We just want to let “x-0.5 built against y-0.1” coexist with “x-0.5 built against y-0.2” in the package database. The hash works for distinguishing the two.

Deciding when your program is compatible with a newly built library is a different question… and IMO, that decision should be made based only on the announced version numbers and the rules of the package versioning policy. This hash should play no role in that decision.

Dave Bayer / Jun 25 2011 6:37 am

In a way this is very amusing: outsiders view us as rabid religious fundamentalists for choosing a purely functional language. Yet none of these issues arise with values in a purely functional language. Our package system is akin to going to church on Sunday, then sinning like crazy the rest of the week. ~/.cabal/config might as well be a global variable in a 1960s BASIC program. See! We’re not dogmatic about this purely functional stuff!

This could all be explained by “conservation of hipness”, which in our case is concentrated in the language itself. For another instance of this principle, look at the inane prevalence of two-column papers on Haskell. They bring back the image of President George Bush amazed in 1988 at a grocery scanner. Have any of these two-column authors ever seen a tablet computer? With the advent of cell phones, many people dropped their land lines. Today with tablets, many people don’t even own a printer.

Some mathematicians won’t use a theorem whose proof they can’t reproduce on demand at a blackboard, others sling the big machines around with reckless abandon. Calling a package one didn’t write is frequently essential, but calling a package of “only a few lines” is of questionable merit. Blues musicians internalize all the riffs they need, they’d look pretty silly stopping the performance to cue up iTunes for a couple of bars. Then there are DJs, who do exactly this. A matter of style, but I don’t aspire to DJ programming, and if I did, I’d want the design of the package system itself to be purely functional.

Well, it’s 2013 now and still… the problem persists. The only hackish workaround seems to be cabal-dev. Actually, I am in favor of dynamic libraries, but only for system-wide installs. For cabal-install local installs, I think all builds of EXECUTABLES should be sandboxed, and the executable should be statically linked against all libraries not explicitly installed as shared libraries by the user/OS.

Of course there are downsides… it costs so much main memory, and updating a dependency (library) requires a complete rebuild. But, like 99% of the time, I frankly don’t care, because it’s the executables I’m interested in keeping up to date, not the libraries.