Perl’s *DBM_File packages: what the heck are you people thinking?

Among many other things, my day-job involves working on a web-based system, written in Perl, for administrating our MasterKey metasearching system. We call the admin system MKAdmin. It was developed on Debian GNU/Linux, since that’s our usual production platform, but I also work on it from my Mac laptop.

Excuse me if I rant for a moment.

MKAdmin, until recently, used the NDBM_File module to implement a simple persistent session store — just an on-disk hashtable that maps cookie values to session structures. NDBM_File is part of the core Perl distribution.

But when one of our partners wanted to run it on Fedora Linux, it became apparent that in their infinite wisdom the Fedora packagers have removed the NDBM_File module from their Perl package, so MKAdmin wouldn’t run. Understand: it isn’t that they didn’t provide an RPM package for the NDBM_File module, it’s that they removed that module from the Perl package, even though it’s part of the Perl core distribution.

So I changed MKAdmin to use the functionally equivalent GDBM_File module, which is also part of the core Perl distribution but hasn’t been sabotaged by the Fedora people. And so all was well.

Except that now all the MKAdmin applications I have running on my Mac laptop no longer work.

It turns out that in their infinite wisdom the MacPorts packagers have removed the GDBM_File module from their Perl package, so MKAdmin won’t run. Understand: it isn’t that they didn’t provide a port package for the GDBM_File module, it’s that they removed that module from the Perl package, even though it’s part of the Perl core distribution.

Holy crap, people! Can you all please just stop pooping all over the Perl distribution? Would you please refrain from doing additional work in order to remove functionality? Would you please just LEAVE WELL ENOUGH ALONE?!

So now I am going to modify MKAdmin again, so that instead it uses the (also functionally equivalent) SDBM_File module — which, I’ve checked, does exist on Debian, Red Hat and MacPorts.
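
For anyone who hasn’t used the tied-hash interface, the switch amounts to something like the following. This is only a minimal sketch, not MKAdmin’s actual code: the file location, the cookie key and the flattened session string are all made up for illustration, and it assumes sessions are serialized to plain strings before being stored.

```perl
use strict;
use warnings;
use Fcntl;                      # exports O_RDWR and O_CREAT
use SDBM_File;                  # bundled with the core Perl distribution
use File::Temp qw(tempdir);

# Hypothetical location for the session store; a real application
# would use a fixed path rather than a temporary directory.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/sessions";

# Tie a hash to the on-disk store, creating it if necessary.
tie(my %sessions, 'SDBM_File', $file, O_RDWR|O_CREAT, 0666)
    or die "Cannot open session store $file: $!";

# Map a (made-up) cookie value to a (made-up) serialized session.
$sessions{'cookie-abc123'} = 'user=mike;role=admin';
untie %sessions;

# Reopen the store, as a later request would, and look the session up.
tie(%sessions, 'SDBM_File', $file, O_RDWR, 0666)
    or die "Cannot reopen session store $file: $!";
my $session = $sessions{'cookie-abc123'};
print "session: $session\n";
untie %sessions;
```

One thing to watch for: SDBM limits the combined size of a key and its value to roughly 1 KB, so large session structures may not fit.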

You can’t (easily) install either GDBM or NDBM from CPAN, precisely because they are both part of the core Perl distribution. They have no distributions of their own. No doubt it would be possible to download the whole Perl distribution and extract the relevant bits, but I can easily imagine that getting messy very quickly, and I don’t want to go there.

… but I do wonder whether AnyDBM_File might have been a better solution.

Well, the next time I have to move MKAdmin to a different persistent session-store back-end, I’ll try that. *sigh*
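
For the record, the AnyDBM_File approach might look something like this. It is a sketch under stated assumptions rather than tested MKAdmin code: the preference order, the file location and the stored values are all hypothetical. The on-disk formats still differ between back-ends, so a store written with one of them can’t be read back with another.

```perl
use strict;
use warnings;
use Fcntl;                      # exports O_RDWR and O_CREAT
use File::Temp qw(tempdir);

# AnyDBM_File loads the first module in its @ISA list that is installed.
# Setting the list in a BEGIN block, before the module is loaded,
# overrides the default order; this particular order is an assumption.
BEGIN { @AnyDBM_File::ISA = qw(SDBM_File NDBM_File DB_File) }
use AnyDBM_File;

# After loading, @ISA holds only the back-end that was actually chosen.
my $backend = $AnyDBM_File::ISA[0];
print "Using back-end: $backend\n";

# Hypothetical store location, as in any other tie() call.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/sessions";

# The tie call names AnyDBM_File, not the concrete back-end.
tie(my %sessions, 'AnyDBM_File', $file, O_RDWR|O_CREAT, 0640)
    or die "Cannot open session store $file: $!";
$sessions{'cookie-xyz789'} = 'user=guest';
my $value = $sessions{'cookie-xyz789'};
untie %sessions;
```

The application code then never mentions a specific *DBM_File module, so a packager removing one of them only matters if it was the last one left on the list.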

I think that the problem you’ve had is part of a larger one within free software. Many distributions make changes from the upstream projects, unnecessarily.

It /shouldn’t matter/ in most cases what distribution you use, because the same programs are being installed on all of them. I have often searched the Web to see whether anyone has had the same problem with a particular program, and ended up in forums specific to one distribution (e.g. Ubuntu), where people suggest solutions (installing distribution-specific packages) that are irrelevant to anyone using a different distribution. In these cases there will always be a distribution-nonspecific solution, but because the world of free software is split up like this it can be hard to find anyone discussing it.

One reason I have installed Slackware is that it tends to provide unmodified upstream packages, and is unobtrusive as a package management system.

I’ve used that one many times. Especially when I don’t want to have to explain the actual reasoning behind how it ended up that way. Especially when the new manager isn’t going to understand that reasoning, and is going to want to go back and change it all to be “proper” even though that’s going to break existing stuff built on top of it and will cause the project to run overtime with no practical benefit to anyone.

So why on earth are there three core packages that do exactly the same thing?

Since Aric Caley asked the same thing, I guess an actual answer might be in order. First there was DBM, the original hash-table-on-disk library provided by Unix. That was superseded by NDBM (New DBM), which was better in various ways, including a different on-disk representation (and IIRC a wider and less modal API, introducing for the first time the concept of a descriptor for an open NDBM file, so that you could have two or more of them open at once). Then came GDBM (GNU DBM) and SDBM (no idea), which also offered additional functionality and/or performance and had different on-disk formats.

If you are writing a Perl program that needs access to a *DBM file generated by another package, then of course you need to use the correct *DBM_File module (I can’t read your NDBM files with my GDBM-based Perl program), which is why Perl comes with modules for all the different *DBM packages. Except of course when it’s sabotaged by a packager.

It was removed because it really isn’t necessary; there is DB_File
which does the same, and on Linux, NDBM is just GDBM anyway, which is
part of perl. This reduced the confusion and magic of the NDBM module
since ndbm varies from platform to platform, making portable files
impossible.

I can understand what the thinking was here; it was wrong, but it wasn’t incomprehensible. They just didn’t think through the consequences.

What it doesn’t explain is why Apple removed GDBM from their Perl. Of course we will never get an official answer to that, but we know the answer already: if you remove NDBM to drive toward maximizing portability, you remove GDBM to sabotage portability. It is clearly in Apple’s interest to make porting away from Apple harder. That it made porting to Apple harder in this case is something Apple is evidently willing to live with.

GDBM probably went away for the same reason I don’t use it anymore (and haven’t for several years). It’s buggy, undocumented, and unmaintained. The last release of the underlying library was eight years ago, for goodness’ sake.

Really, if you are going to use a local key/value store just use BerkeleyDB.

I hate to reopen an old thread, but I stumbled across this in a search for the answer to a related question, and since it’s still out there, I just want to clarify:

Perl does not provide three modules with equivalent functionality. Like the more modern DBD/DBI system for SQL access, the *DB_File modules are all interfaces to APIs provided by external systems. NDBM, GDBM, BDB, etc., are all libraries or services commonly provided by the systems Perl commonly runs on. But not every service is provided by every OS. Until the introduction of SDBM, Perl provided no native on-disk hash store of its own, since Perl’s built-in hash functions provided the services other languages relied on *DB_File implementations for.

The functions are part of the core distribution (and have been since the very beginning, before CPAN) to allow Perl to maintain files written by other processes. There are so many of them because back in the day, each mainframe OS provided its own DBM implementation but Perl was intended to be as portable as possible, so if you were moving to a new system, all you needed to do was update the “use” line (or use AnyDBM_File).

The expectation that NDBM might run on some Linux, or GDBM might be installed on Solaris, etc., is relatively new, and although some OSes do supply multiple *DBM versions, others still only ship with whatever their traditional database is. The missing Perl module is normally simply a reflection of the missing underlying functionality. If Red Hat has removed the Perl NDBM_File module, it’s almost certainly because they’ve dropped NDBM itself from the system. Even if the Perl module were installed, it wouldn’t do anything.
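
A quick way to see which of these interfaces a given Perl installation actually has is to try loading each one. The following is a minimal sketch; the candidate list is simply the set of *DBM_File modules mentioned in this thread plus ODBM_File, and what it reports will of course vary from system to system.

```perl
use strict;
use warnings;

# Candidate back-ends to probe; the list itself is an assumption.
my @candidates = qw(NDBM_File DB_File GDBM_File SDBM_File ODBM_File);

my %available;
for my $module (@candidates) {
    # require fails if the module was not built or cannot be loaded,
    # for example because a packager left it out.
    $available{$module} = eval "require $module; 1" ? 1 : 0;
}

for my $module (@candidates) {
    printf "%-10s %s\n", $module,
        $available{$module} ? 'available' : 'missing';
}
```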

Likewise, Apple historically doesn’t ship any GNU software for licensing reasons, which means no GnuDB (GDBM) interface. GDBM_File is probably dropped for a similar reason. Even if they built their default Perl with GDBM_File, it would fail, because GDBM itself isn’t provided by Apple.

As someone else suggested, the problem is that whoever wrote the original routine tied it to an OS-specific feature, and AnyDBM_File is your friend.