While this article is a bit harsh, I can see some of its points. I installed Async yesterday, and Cabal tells me my STM isn't up to date - isn't downloading that your job? Oh well, "cabal install stm" - "This will also update XY and most likely break Z". Seriously, why doesn't it a) install everything automatically and b) support multiple versions on my hard drive? Space is cheap, please Cabal, feel free to make a new folder for every new version.

The issue isn't really saving space; cabal is happy to install many versions of the same package. The issue is installing many copies of the same version of the same package, but compiled against different dependencies. This requires more support from GHC and Cabal, something that was worked on during last year's GSoC.

Let's say I have libraries A and B, where library B depends on A. The compilation of B, therefore, depends intrinsically on the specific instance of A we have on hand. That's what we mean by B being compiled against A. Notably, it's not the version of A that matters, it's the specific instance of A: the actual library object file on your hard disk.

As for why there's that intrinsic dependency, that's more complex to answer. A big part of it is that GHC does a massive amount of inlining, and this inlining can happen across library boundaries. Thus, when compiling B, actual code from A can end up in the binary for B. In a lot of cases this isn't actually a problem, because the A-code that gets inlined is effectively source code from A; so as long as we stick with the same version of A, we're fine. However, it can be problematic in certain cases. Consider, for instance, that we have a new library C which gets compiled against our current instance of B. Because there's some A-code in the instance of B, we can end up getting some A-code in C, or more to the point, some dependencies on the datatypes defined by A.
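To make the inlining point concrete, here is a minimal sketch (the module and function are invented for illustration). Marking a definition INLINABLE asks GHC to keep its unfolding in A's interface file, and small definitions often get an unfolding recorded even without the pragma; a downstream library B that calls it may then end up with this body compiled into B's own object code:

    -- A hypothetical module in library A.
    module A (lookupDefault) where

    -- GHC records this function's unfolding in A's interface file, so a caller
    -- in library B can have the body inlined into B's own object code.
    {-# INLINABLE lookupDefault #-}
    lookupDefault :: Eq k => v -> k -> [(k, v)] -> v
    lookupDefault def k kvs = maybe def id (lookup k kvs)

Once B has inlined lookupDefault, B's object code bakes in assumptions about this particular build of A; swap in a different build of A and nothing forces the two to agree.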

This triggers the real problem: GHC does not specify a stable ABI layer. So when we inline some B-code in C, there's nothing to guarantee that this will be "the same" regardless of which A-instance our B-instance was compiled against. Thus, if we replace the current A-instance with a different A-instance (i.e., the same version of A, but compiled against different instances of its dependencies), there's nothing to guarantee that our B-instance will continue to link properly with the new A-instance. They may disagree about the representation of A's datatypes, or they may disagree about how many arguments one of A's functions takes (which can be affected by other optimizations). Devastatingly, if our B-instance breaks, that means the C-instance also breaks, and so on all the way up the dependency tree.

It's the lack of a stable ABI (compounded by issues of cross-library inlining and other optimizations) which causes fragility, because it means we don't get the modularity which is necessary to keep changes from percolating through the whole web of dependencies. These problems are then further compounded by the fact that cabal only allows one instance of any given version of any given library--- you can have multiple libraries, and multiple versions of each of those, but you only get one instance of each version. Thus, you can't choose to leave the old A-instance in place (thereby ensuring the B-instance still works) when you install the new A-instance. This in turn leads to recompiling everything (at best) or the butterfly dependency problem (at worst).

Why doesn't GHC define a stable ABI? Is this an optimisation thing, a bit like those optimised non-standard calling conventions you can get for completely hidden functions in, e.g., C, but here happening for things that are visible across module boundaries?

Clearly the main issue is keeping the field open for optimizations. But I think a big part of it is that there hasn't been enough demand, and that it's not entirely clear what a stable ABI should mean for Haskell (i.e., one that doesn't interfere with current/foreseeable optimizations). Other issues of ABI stability also crop up from time to time[1], so it would be nice to get stability even if just for simplifying the GHC build process and infrastructure support. If someone did the research to figure out what a stable ABI for Haskell should look like, I'm sure the Simons wouldn't have a problem including it in a future GHC. If the research is done well then it may also be helpful to the UHC and jhc folks.

[1] E.g., ghc-pkg uses a hash of the ABI as part of the identifier for a library instance. In addition to the libraries you build against, the ABI hash also depends on the GHC version, the phase of the moon, etc. Unfortunately this means instability in the ABI hash can get in the way of certain kinds of bootstrapping, idempotency tests, etc. Hang around on the ghc mailing list and you'll hear about these sorts of issues from time to time.

Why would I install the same version multiple times? Rebuilding a package with updated dependencies (because they have the same API but are more performant, for example) should leave the infrastructure intact, shouldn't it?

You know who also has this problem? Java. And it's really effing annoying. I've had to handle issues like this multiple times. I sort of wonder where the people who write articles like the top post come from, because my experiences with Maven, Ivy, Sbt, Leiningen, and Rubygems often contain bits of dependency hell like this.

A lot of Java programs just ignore this problem, force compile, and pray it works. It's really fun to watch those expectations get violated in subtle non-exception-throwing ways.

Not always true. You can fix this problem quite well when the two libraries use the third package only for implementation. In fact, GHC has most of the machinery for doing so already, putting Haskell in a much better position than Java for fixing it. You can NOT fix it when the two libraries re-export types or classes from the third library. We should distinguish between the two cases.

Incidentally, it's not good enough to say "oh, Java has that problem too". Java has many problems, of course (including the lack of a well-maintained global repository for libraries with easy installation... we solved that one). But while Java and Haskell have the same library ecosystem issues in theory, what happened in practice is that Java libraries became massive monolithic things to minimize the damage, whereas Haskell still clings to the notion that we can have complex webs of thousands of libraries and dependencies. We eventually will have to either change that, or fix the problems it causes, which happen in Haskell far more often than they do in Java specifically because Java has developed social norms against it. I rather hope the answer is fixing it rather than abandoning the ideal.

This is a good point. It'd be easier if there was a way to check this without examining the code.

We eventually will have to either change that, or fix the problems it causes, which happen in Haskell far more often than they do in Java specifically because Java has developed social norms against it. I rather hope the answer is fixing it rather than abandoning the ideal.

I have hopes that once a relatively good iteratee library emerges, this will calm down somewhat. It seems to me like a lot of Yesod's problems come from it splitting itself into a bunch of disparate implementation sets in order to export things like ResourceT and Conduits, because the community is crying out for them. But this creates network-effect-driven complexities that a monolithic Yesod package would not have.

Part of the reason Java runs into this less is because their solution is to have very closed ecosystems competing for mindshare, and their community is big enough to sustain that approach. Haskell, Clojure, Scala, and other small community players don't have big enough communities to take this approach; it just ends up isolating everyone.

Unfortunately that means that it's much more important to "pick winners" for critical libraries in these worlds. Incidentally, that need being satisfied is probably why we don't see this problem with Go or C# or Python; the extremely strong guidance these languages receive from their leadership sort of keeps them more unified.

Incidentally, it's not good enough to say "oh, Java has that problem too". Java has many problems, of course (including the lack of a well-maintained global repository for libraries with easy installation... we solved that one).

Maven central doesn't count?

Is it really that hard to add packages to a project? I find it rather trivial to add dependencies, but I guess it's just me.

Honestly though, if cabal could act more like ruby's gem where you can install multiple versions of the same library to satisfy dependencies for different libraries, that would be great.

Honestly though, if cabal could act more like ruby's gem where you can install multiple versions of the same library to satisfy dependencies for different libraries, that would be great.

That's not the problem; Cabal can do that... What it currently can't do is install the same version of a library twice but compiled against different dependencies. And since it can't use two different versions of the same library in a single compilation, the dependencies of two already-installed dependencies can conflict!

The .NET runtime allows loading multiple versions of the same (dynamic) library, and allows very fine grained (probably overly fine) versioning policies to be applied as well. But most system loaders can deal with coarse grained versioning of dynamic libraries.

As for static linking... I'm sure there is some history on why GHC doesn't include the package version in the symbol name, possibly due to symbol length limits in binutils' linker? Doesn't seem like it would be that hard to test out...

I don't agree. Having a globally shared package database is what makes the brokenness more likely to occur. But for everything that fails to work with a globally shared package database, there exists a project that has no good reason to fail, but will fail even with private sandboxes.

What's a little more broken is having no distinction between the libraries I depend on, and the libraries I re-export. That makes it impossible to use certain libraries together, even though there's no logical reason for them to conflict. People are working on that.

What's also a little more broken is how installing one thing in the globally shared package database can break other things even when dependencies are all explicitly given. This is also a known problem, and being worked on.

I made this statement in the context of the original post, which presumably says that the Haskell tool chain is worse than other tool chains that the author has used. This is true in the context of sandboxing; lots of other tools do it. It's not true in the diamond dependency context, as most other tools that use version ranges for dependencies don't handle such dependencies either. So adding sandboxing is not going to solve all our build problems, but should hopefully get us to a level where people are not screaming that the tools suck.

This seems to come up every time these kinds of issues are discussed. You say "If you need semantic diffs", but there's no "if". You do need semantic versioning. You can use "stubName" instead of "1.2.3.4", but it fundamentally doesn't change the problem.

Currently, whenever you make any potentially breaking change, you have to explicitly forbid all users from compiling against your library until they review whether or not the change actually broke their use of your library (by bumping the appropriate major version number).

In practice, most potentially breaking changes will not break most user code -- meaning that this wastes a lot of everyone's time, and prevents lots of builds from succeeding due to over-conservative version ranges.
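For concreteness, this is roughly what such ranges look like in a .cabal file (the package names are made up); under the PVP the next major version has to be excluded up front, even though most major bumps won't actually touch the handful of functions a given client uses:

    -- Hypothetical build-depends stanza. garble-2.3 is rejected here until
    -- someone relaxes the upper bound, even if 2.3 changed nothing this
    -- package actually uses.
    build-depends:
      base   >= 4.5 && < 4.7,
      garble >= 2.2 && < 2.3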

With API signature dependencies, a potentially breaking change will not affect users if they do not use the changed functionality. This will mean that everything that can compile, will compile, saving everyone a lot of unnecessary trouble.

When you need "bugFixABC" you can explicitly depend on "bugFixABC", without depending on a version number, which is opaque, does not communicate the intent, and is over-conservative about future changes.

Another cool thing about API signatures is that we may have multiple packages supplying the same API, and they'll all be interchangeable as a dependency. This could be great for more easily swapping in an alternative implementation of a dependency to compare.

Another possibility, which is not as nice, but perhaps more pragmatic, is to be able to specify dependency information retroactively on hackage -- rather than bundle it with a package forever. The current forever-bundling is very problematic:

The Haskell ecosystem suffers greatly from over-conservative version deps (necessarily so, given the version range model)

Mistakes are set in stone, forever confusing cabal (e.g.: an old version with an under-conservative range will wrongly be picked over a new one with a correct range)

In practice, most potentially breaking changes will not break most user code -- meaning that this wastes a lot of everyone's time, and prevents lots of builds from succeeding due to over-conservative version ranges.

Agreed. Several of us have stopped adding upper bounds and only add them after we know something breaks. Not ideal either though, as breakages get caught at compile time instead of at version resolution time.

Another cool thing about API signatures is that we may have multiple packages supplying the same API, and they'll all be interchangeable as a dependency. This could be great for more easily swapping in an alternative implementation of a dependency to compare.

Sounds like structural typing (aka duck typing), so it has the pros/cons of that. One con is that having the same API does not mean being compatible; I very much prefer an explicit (type class based) declarative approach to interfaces.

Another possibility, which is not as nice, but perhaps more pragmatic, is to be able to specify dependency information retroactively on hackage -- rather than bundle it with a package forever.

Duncan Coutts is working on this.

Btw, how do you depend on instances being available under this scheme?

Agreed. Several of us have stopped adding upper bounds and only add them after we know something breaks. Not ideal either though, as breakages get caught at compile time instead of at version resolution time.

This is a recipe for bit rot. Currently the burden is on library maintainers to go and bump upper bounds when new versions of dependencies come out, and sometimes you can't use "new-shiny 1.2" because directory needs 1.1, or whatever.

To me this is better than leaving off the upper bounds completely, which results in programs you wrote last year no longer compiling because garble-2.3 changed its public API. If everyone follows the PVP then at least we know that old programs should continue to build on some version of GHC.

To me this is better than leaving off the upper bounds completely, which results in programs you wrote last year no longer compiling because garble-2.3 changed its public API. If everyone follows the PVP then at least we know that old programs should continue to build on some version of GHC.

I'm not very happy with the solution, but it's better than core contributors spending two weeks of their spare time twice a year bumping version numbers. Think of what fantastic libraries Bryan could be releasing if he didn't waste his time on this.

Since when does bumping version numbers take two weeks? Maybe if you have to resolve API differences (like directory-1.2), but in that case you're screwed anyways.

This should be solved with tooling. We have Michael's "packdeps" package; it's a short step from there to a package that will bump the upper bounds on all of the dependencies for you and try to build with --upgrade-dependencies.

This is the time Bryan reported. There's a lot of coordination that needs to happen. Some packages will have to have their bounds fixed and then they will have to be released before the next level of libraries can be fixed and released and so on. With every GHC release there's a flurry of pull requests and a general state of brokenness for weeks or sometimes months, due to upper bounds on base and other libraries shipped with GHC.

If it takes Bryan an average of 20 minutes per library he's maintaining (https://jenkins.serpentine.com/), that's already ~9 hours' worth of work, and that doesn't take all the context switching into account (e.g. email the maintainer of a package you depend on, wait for them to make a release, etc.).

A major drawback of the no-upper-bound approach is that if you discover breakage and publish a newer, more conservative version, cabal may prefer the older more lenient package to go with a newer dependency.

The type-class approach to package selection is likely to be clumsy (using indexed types/etc is a poor alternative to the module system). It also requires Haskell code to explicitly make the choice and potentially adds indirection overhead (dictionary passing).

Dependence on instance exports, and the exports themselves, may be listed explicitly in the API import/export signatures, which don't themselves have to be Haskell code.

A major drawback of the no-upper-bound approach is that if you discover breakage and publish a newer, more conservative version, cabal may prefer the older more lenient package to go with a newer dependency.

Cabal always prefers the latest patch level version so if you publish A.B.C.D+1 that adds an upper bound, that should be preferred.

The type-class approach to package selection is likely to be clumsy (using indexed types/etc is a poor alternative to the module system). It also requires Haskell code to explicitly make the choice and potentially adds indirection overhead (dictionary passing).

I will have to see if I find it clumsy or not. It certainly is safer, and doesn't introduce implicit dependencies (on module signatures), a problem with duck typing. It also adds a degree of polymorphism not available using import-time "polymorphism". Like so:
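(A rough sketch of the kind of class-based interface meant here; MapLike and the instances are invented for illustration:)

    module MapClassSketch where

    import Prelude hiding (lookup)
    import qualified Data.List as List
    import qualified Data.Map as M

    -- One class capturing the map API; each implementation gets an instance,
    -- and a call site picks an implementation just by choosing a type.
    class MapLike m where
      empty  :: m k v
      insert :: Ord k => k -> v -> m k v -> m k v
      lookup :: Ord k => k -> m k v -> Maybe v

    instance MapLike M.Map where
      empty  = M.empty
      insert = M.insert
      lookup = M.lookup

    -- A deliberately naive alternative implementation.
    newtype AList k v = AList [(k, v)]

    instance MapLike AList where
      empty                  = AList []
      insert k v (AList kvs) = AList ((k, v) : kvs)
      lookup k (AList kvs)   = List.lookup k kvs

    -- Code written against the class works with either implementation,
    -- so swapping one for the other is a type annotation, not an edit.
    total :: MapLike m => m String Int -> Int
    total m = maybe 0 id (lookup "x" m) + maybe 0 id (lookup "y" m)

That's the extra degree of polymorphism: with plain imports you commit to one implementation per module, whereas the class lets the same code be reused (or benchmarked) against several.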

As far as I am concerned, duck typing is only a bad thing if it implies dynamic runtime errors. If you get "duck typing" with static checks, then it is fine.

I don't see how a type-class indirection for method resolution is any safer than a module import indirection for name resolution. Both cases are explicit and the interface and semantics are just as well defined.

I agree about "class Map". It has advantages, and it has disadvantages. In some cases, you don't need the runtime polymorphism, but it would still be nice to plug in different implementations and compare performance or what not.

As far as I am concerned, duck typing is only a bad thing if it implies dynamic runtime errors. If you get "duck typing" with static checks, then it is fine.

I don't think so, as it introduces implicit dependencies, which will break in the presence of separate compilation. Nowhere in the code (except perhaps in comments) do you state that a given module has a function with a specific name and type because that name and type are required so this module can participate in duck typing. With duck typing you don't separate interface from implementation. For example, in

    f aRecord = (field1 aRecord) + (field2 aRecord)

what are the requirements on aRecord? Is aRecord also required to have a field field3 that this implementation of f doesn't happen to need, but another implementation might want?
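For contrast, a nominal interface writes the requirement down as a type (a rough sketch; the Fields record is invented for illustration):

    -- The requirement "has field1 and field2" is now stated by the type,
    -- not implied by whichever accessors f happens to call. field3 can exist
    -- whether or not this particular implementation of f needs it.
    data Fields = Fields { field1 :: Int, field2 :: Int, field3 :: Int }

    f :: Fields -> Int
    f aRecord = (field1 aRecord) + (field2 aRecord)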

I think lower limits are fine, because it's at least possible to skim the major version numbers and pick the first one that works. Upper limits aren't. To take a silly example, if I only use Data.Text.pack and Data.Text.unpack, why should I depend on any specific version of text? Those are unlikely to go away any time soon.
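A toy version of that (the module is hypothetical, but pack and unpack are real):

    import Data.Text (pack, unpack)

    -- This program's entire use of the text package: any version of text
    -- that still exports pack and unpack will do.
    roundTrip :: String -> String
    roundTrip = unpack . pack

Any particular upper bound on text here is a guess about which future major version finally drops those two exports.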

Do recall that there's a reason we started pressuring folks to add upper limits. Back in the day they weren't there, and nobody cared until the world exploded because some of the major packages changed dramatically. Then, suddenly, people cared a lot.

Now the pendulum has swung the other way. I agree that aggressive upper bounds are not the best idea. However, removing upper bounds isn't the solution. There's a reason we started pressuring folks to add upper bounds, and that reason has not changed (nor will it ever).

There was a great deal of discussion on this matter recently in the haskell-cafe and libraries mailinglists, which I'd rather not rehash here. However, the gist of the problem is that there are fundamentally three zones of compatibility: guaranteed to work; guaranteed not to; and the grey in between. This means that there are (just as fundamentally) three thresholds. We have the white-to-grey (aka known-to-work vs untested) threshold, which is knowable by simply testing against what's out there today. And there's the white-to-black (aka known-to-not-work) threshold, which is also knowable by simply testing against what's out there today. The grey-to-black threshold is the point where due to changes in the future some package suddenly stops working; unfortunately, the exact location of this threshold is unknowable, and no software can ever fix this (in the absence of time travel). The major point of contention, then, is trying to decide what a "single upper bound" means in this context; which comes down to trying to come up with heuristics for determining where the grey-to-black threshold is.

Personally, my stance on the matter is that a "single upper bound" is necessarily meaningless in virtue of the things we cannot know. This doesn't mean getting rid of upper bounds, it means accepting the fact that there are multiple notions of upper bounds. Moreover it means accepting that different users will have different desires, and so no single set of heuristics can work for everyone. Thus, we should provide both the white-to-grey and the white-to-black thresholds, and then let users (and their tools) tweak their heuristics as necessary.

Having said that, we spent 2 days (!!!) preparing a Delphi 7 dev env for our new developer. :)

Yeah, and God help the poor developer who can't call on experts in such cases.

It's hardly unique to Haskell. It took me years to figure out roughly how to build a source-distributed Java app. Don't get me started on feeding the right xml to Maven, or the right m4 to autotools. It's the sort of mucky details they don't teach in college.

But it really is a shame that cabal is in a similar state of painfulness.

I think cabal is a lot better than the alternatives for other languages (mainly in virtue of being neither side-effecting nor context/platform-dependent). The problem is that we rely on it a hell of a lot more. Libraries in Haskell are ridiculously small things, many of them no more than a half-dozen modules. When you only have to link together 3~5 libraries you're less likely to run into problems than when you're linking together 20~50 libraries.

Just thought it's interesting/important for us to notice criticism coming from the outside, even if griping about Cabal has been done to death from the inside and folks are already doing what they can to improve the situation…

It is absolutely the case that the Haskell world needs to see this frustration. There are newbies out there excited by the possibilities of Haskell that get turned off by the difficulty in installing the kinds of packages people actually want to install. It is great to post it.

People who are upset at how "mean" the post is are missing the point. Haskell as a community is suffering from this problem, and it is stifling its growth.

Haskell -- as it is perceived by the outside -- is what matters here. For a newbie, cabal can be about as fun as having an arterial gash in your neck. If your experience with cabal is crippling then your likelihood of reporting the wonders of Haskell to the world is greatly reduced.

Once bitten, twice shy.

This is about traction. Negative news on this front will persist into the future even if the problem is solved. The longer the problems with cabal continue, the longer and more profoundly the bad news will persist in the minds of ever-increasing numbers of candidates. Candidates who might otherwise have used Haskell in their environment, from individuals, corporations, governments, etc.

Is Haskell at a publicity tipping point? I don't know, but if it is then perhaps it's time to stop the world for a moment and redress the situation.

Not a very constructive article. Didn't know what they were doing, didn't ask for help well, apparently kept doing frustrated and destructive things because it was more satisfying than stepping back and trying to learn what was going on.

Still, I did exactly the same thing as he did a few weeks ago, only I didn't blog about it. I think he's helped more than I have by at least letting people know that another person ran into that wall that's been there a while.

Sure. I was mostly reading it in a sniff-for-global-trends way (or hmm, was I really?). What was interesting about it was the point that, as practicing Haskellers, we can often forget how confusing/rage-inducing things can be that would otherwise be petty frustrations. Sort of a sense of needing constant newbie WTF'ery to keep you honest. However that feeling is expressed is not so important, at least if you're reading from a "let's put a probe out and feel the temperature" perspective.

For the record, I have not in a year and a half experienced "totally broken" from cabal. At worst, "slightly annoying". Then again, I install from apt whenever I can, because that's the right thing to do :)

I wonder where the repeated suggestion to install the latest/greatest cabal-install came from. My guess would be that the author might not have put their ~/.cabal/bin (or ~/Library/Haskell/bin, etc.) in their PATH? If so, maybe the "hey, there's a new cabal-install out!" message could say something? Dunno if that diagnosis makes any sense though.

I wonder where the repeated suggestion to install the latest/greatest cabal-install came from. My guess would be that the author might not have put their ~/.cabal/bin (or ~/Library/Haskell/bin, etc.) in their PATH?

Hmmm... while this is a bit risky, I wonder if cabal should say, "Hey, if I'm going to suggest that I need to update myself, I should look at where I would put that binary, and if there's one already there, exec that instead." It feels wrong but I'm not sure my feeling is correct.

A lot of swearing at some free software and its community... While I felt annoyed by the same things he mentioned, I just did not feel compelled to bash the Haskell community, but instead found heaps of resources on how people are trying to fix it!

His main point seems to be that the Haskell community is ignorant of it -- which is untrue. It just takes a long time to fix, as it needs a lot of effort to really fix these issues properly.

Development focus is still mostly on the language. I suppose that once the endlessly growing pile of monad tutorials stops and we start to see more practical courses, devs will start to care about deployment.

Of course, that time may never come. Haskell might continue to be a research language forever. With all these JS spin-offs, that wouldn't even be all that bad.

Fine, we should document the rough edges a bit more. But before throwing rocks at a whole community like that, saying that we are not aware of how broken things are, etc., it'd be a smart move to actually check that this is indeed true. It isn't. We do a lot of work to address this situation, or at least to make it less problematic: cabal sandboxes in the next release, scoutess, all the cabal-dev-like tools, etc. We try to improve the situation.

I think the Haskell community needs to accept some of the rocks thrown its way with good grace. Saying, "Don't be mean though!" is sort of absurd. This post was obviously written in frustration, and quite frankly everyone agrees that cabal, as it stands, has serious issues that cabal-dev only addresses if your dep tree is quite small.

And Yesod? Yesod is probably one of the worst offenders here. I recently went through exactly the same dance with my Yesod installation only it was worse because I had installed the 64-bit platform on OSX and it is a known-but-not-easily-discoverable bug that the yesod executable just crashes on OSX with 64-bit haskell.

So yes, the post is vituperation, but cabal has earned it. It does no good to people trying to join the language now to say, "Oh yes, cabal is the proverbial princess with the pea in her mattress, but the future will be better."

I agree. Still, words said in anger are just words said in anger, and anger you earned is anger you earned.

But I am sure you can envision how this played out. #haskell on FreeNode varies from an incredibly friendly and helpful place to a maddeningly irritating and smug place, depending on your question and the time of day.

Certainly I've felt like I regretted trying to install Yesod more than once.

By having hackage2 and scoutess, we will be able to catch a lot of version bounds issues, API breakages and whatnot before packages are even released. And even when they have been released, Hackage2 will probably let you edit the version bounds directly online. That's only a part of what's going to help.

I don't question that this is your (and others') experience, but I do wonder why that is. I've been doing Haskell full-time for a number of years now and I've never run into these major problems I keep hearing about. (Minor problems, sure; but I've only broken the world once.) Then again, I don't do any web development anymore, so I've never had to deal with yesod nor any of the competitors.

Given all this, it seems to me that the real problem here is yesod and its ilk. I can't believe that I've just been stupidly lucky for my whole career with Haskell; a little lucky, sure, but not that lucky and for that long. I'm all for fixing cabal. But it sounds like someone needs to wrestle with Snoyman to slay this particular beast.