My Linux Wishlist

Googling for "Linux Wishlist" brings up a bunch of old mailing list posts where people wish their soundcards could be detected and various Microsoft Windows applications would be ported over. My list is, well, different. Read on and add your own...

Top Ten Things I'd Like To See (but don't have the time to do myself). This isn't a wishlist of wanting certain applications ported from other OSes (well, except for #8), but more high-level ideas that would make the world a better place (and bring world peace, free coffee and a plethora of other goodness).

1. Diff Packages: Instead of downloading a completely new version of a package when an update is available, I'd like to have a tool that is able to tell the server what version I have installed now, and download a premade package that provides me with only the necessary updates. This package would contain patches for binaries, any new files, instructions to remove old files, changes in permissions, etc. A tool to generate diff packages given complete old packages would be a must. It may be as simple as a well-defined naming convention to figure out what diff packages are available (allowing you to still use http/ftp).
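
A rough sketch of how such a naming convention might work, in Python. The `.delta` suffix and file name scheme here are invented for illustration; the point is that the client can decide what to fetch with nothing but a directory listing from plain http/ftp:

```python
# Hypothetical naming convention: the server exposes premade deltas named
# "<pkg>_<old>_to_<new>.delta" alongside full packages. No special protocol
# is needed - the client just checks which names exist.

def delta_filename(pkg: str, installed: str, latest: str) -> str:
    """Build the well-known file name for a binary delta package."""
    return f"{pkg}_{installed}_to_{latest}.delta"

def pick_download(pkg: str, installed: str, latest: str,
                  server_files: set) -> str:
    """Prefer the premade delta if the server has one; otherwise fall
    back to downloading the complete new package."""
    delta = delta_filename(pkg, installed, latest)
    if delta in server_files:
        return delta
    return f"{pkg}-{latest}.rpm"  # full download as a fallback
```

A tool generating deltas from pairs of complete old packages would just emit files matching this pattern.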

2. Encrypted/Signed SLP: Some sort of mechanism that allows me to have services announced on my machine, but only to those that share a common secret with me. This would allow me to let my Jabber server announce itself on the corporate LAN, but only to those that belonged to one of the shared-secret groups that I recognize.
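
The shared-secret part could be as simple as an HMAC tag on each announcement. This is just an illustration of the idea, not the actual SLP authentication extension:

```python
import hmac
import hashlib

# Sketch of shared-secret service announcements: the announcer appends an
# HMAC tag computed with the group secret, and only listeners holding the
# same secret will accept the announcement. Adding a cipher on top would
# give the "encrypted" half of the wish.

def sign_announcement(announcement: bytes, secret: bytes) -> bytes:
    """Tag an announcement so only holders of `secret` accept it."""
    return hmac.new(secret, announcement, hashlib.sha256).digest()

def verify_announcement(announcement: bytes, tag: bytes,
                        secret: bytes) -> bool:
    """Constant-time check that the tag matches the shared secret."""
    return hmac.compare_digest(sign_announcement(announcement, secret), tag)
```

Each shared-secret group the listener recognizes would just be one more key to try against incoming tags.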

3. Improved Disk I/O in UML: Jeff and crew have done an amazing job with UML. It's great for colo'd machines, servers, firewalls, just about everything - except heavy disk I/O. Perfecting disk I/O speed is (to me - and others I've chatted with) the holy grail for UML; everything else (ebtables, evms, etc.) seems to be coming into play to make UML very hot. (Apologies if this is already done in recent SKAS or 2.6.)

4. Mail Flow Editor: This idea is a tad more vague. GStreamer has a tool that allows you to graphically plug pieces together. I'd like to see something similar for mail. Something where I can drag a box that represents an MTA onto a workspace, link it to a virus scanner, SSL certificates, spam assassin, a mailing list manager, auto-responder, bug tracker, deny/permit relaying based on source IP, virtual domains, etc. The editor would work on existing configuration files (possibly using procmail to drive more complex tasks).

5. Unified Authentication Mechanism: Some API that supports RADIUS, SmartCards, OTPW, PKI, TLS/SSL certs, USB key chains, PAM, SASL, LDAP, GSSAPI, SKEY, etc. There's a lot of disparate technology out there, but no good integration. Try making an OTPW system that uses an LDAP back end to authenticate people to Apache. This idea strikes me as something that's easy to do wrong, and hard to do well.

6. Named, Persistent File System Transactions: I'd like to be able to do a bulk of disk i/o in a transaction, then either commit it, or roll it back. The transaction should be able to persist across reboots, and it should be possible to boot the kernel into the transaction. For example, I could start a "UpgradeToFedoraCore2" transaction, install all the new packages, and reboot my system so it boots into that transaction. If it works, I can commit the transaction. If it doesn't work, I reboot back into the main file system and rollback the "UpgradeToFedoraCore2" transaction. This would let me play with new software without harming my existing system (at the expense of speed and disk space).
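
The commit/rollback semantics can be modeled in a few lines of user-space Python. A real implementation would of course live in the filesystem and survive reboots; this toy overlay only illustrates what commit() and rollback() would mean:

```python
# Toy user-space model of a named file system transaction: writes and
# deletions accumulate in an overlay on top of the base tree until
# commit() folds them in or rollback() throws them away.

class FsTransaction:
    def __init__(self, name: str, base: dict):
        self.name = name            # e.g. "UpgradeToFedoraCore2"
        self.base = base            # path -> contents
        self.pending = {}           # writes made inside the transaction
        self.deleted = set()        # paths removed inside the transaction

    def write(self, path, data):
        self.deleted.discard(path)
        self.pending[path] = data

    def remove(self, path):
        self.pending.pop(path, None)
        self.deleted.add(path)

    def read(self, path):
        if path in self.deleted:
            raise FileNotFoundError(path)
        if path in self.pending:
            return self.pending[path]
        return self.base[path]

    def commit(self):
        """Fold the transaction into the base tree."""
        for path in self.deleted:
            self.base.pop(path, None)
        self.base.update(self.pending)
        self.pending.clear()
        self.deleted.clear()

    def rollback(self):
        """Discard everything done inside the transaction."""
        self.pending.clear()
        self.deleted.clear()
```

Booting "into" the transaction just means reading through the overlay; the base tree is untouched until commit.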

7. Persistent Malloc: A few years ago, I wrote something that could allocate memory from an mmap'd file. You could tag the allocated memory with a unique key and use it in your application. If your app crashed, the OS automatically flushed the memory to the file - key and all - except in one of the BSDs (I forget which now). When you restarted the application, it would notice you had "left over memory" and you could rendezvous with it, using the unique key, to recover whatever you were working on. It worked amazingly well; nothing hit the disk unless you flushed it or crashed. I never did anything with this, but I always envisioned it as a way for applications under development to ensure users wouldn't lose data through a crash. I still envision this....
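
The core trick is small enough to sketch with Python's mmap module. The file layout and 8-byte key scheme here are invented for illustration, not a reconstruction of the original code:

```python
import mmap
import os

# Toy model of "persistent malloc": a fixed-size region backed by an
# mmap'd file, tagged with a caller-chosen 8-byte key. After a crash (or
# any restart), reopening the file with the same key rendezvous with the
# old memory.

REGION_SIZE = 4096

def open_region(path: str, key: bytes) -> mmap.mmap:
    """Create or rendezvous with a keyed, file-backed memory region."""
    assert len(key) == 8
    fresh = not os.path.exists(path)
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    os.ftruncate(fd, REGION_SIZE)
    mem = mmap.mmap(fd, REGION_SIZE)
    os.close(fd)                      # the mapping outlives the descriptor
    if fresh:
        mem[:8] = key                 # tag fresh memory with the key
    elif bytes(mem[:8]) != key:
        raise ValueError("region exists but the key does not match")
    return mem                        # bytes 8.. are the caller's to use
```

Nothing hits the disk until the kernel writes back the dirty pages (or you msync/flush explicitly), which matches the behavior described above.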

8. Personal Firewalls: Pretty simple. Use libipq to make a decent personal firewall tool, allowing you to control what applications are allowed to use what network resources.

9. Media Database: A player- and desktop-independent API/daemon for cataloging music and videos. This includes streams, local music, remote music, playlists, metadata, etc. This mechanism should scale from 3 songs to 300,000 songs.
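
A minimal sketch of what the catalog half could look like, using SQLite. The schema is invented; a real daemon would sit behind a socket or D-BUS API rather than handing out database connections:

```python
import sqlite3

# One table of items (local files, remote URLs, streams) plus free-form
# key/value metadata, so the same store handles ID3 tags, playlists, and
# whatever else without schema changes.

def open_catalog(path=":memory:"):
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS items (
            id INTEGER PRIMARY KEY, uri TEXT UNIQUE, kind TEXT);
        CREATE TABLE IF NOT EXISTS meta (
            item_id INTEGER, key TEXT, value TEXT);
        CREATE INDEX IF NOT EXISTS meta_kv ON meta (key, value);
    """)
    return db

def add_item(db, uri, kind, **meta):
    cur = db.execute("INSERT INTO items (uri, kind) VALUES (?, ?)",
                     (uri, kind))
    db.executemany("INSERT INTO meta VALUES (?, ?, ?)",
                   [(cur.lastrowid, k, v) for k, v in meta.items()])

def find(db, key, value):
    rows = db.execute("""SELECT uri FROM items
                         JOIN meta ON meta.item_id = items.id
                         WHERE meta.key = ? AND meta.value = ?""",
                      (key, value))
    return [r[0] for r in rows]
```

The key/value index is what lets this scale to hundreds of thousands of songs without per-player rescans.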

10. Audio Redirector: I'd love a way of taking what I'm listening to and having it sent to another machine, to a friend over Jabber, to an icecast server, etc., transcoding/downsampling the stream in real time if needed.

Number 11 anyone? Try to avoid asking for ports of applications. Be creative!

Hell, I think the only thing I've seen that real non-technical users, and even quite a few of us technical users, could possibly want improved is application installation. As things stand now, it requires users to find the exact package for their distro/version, find dependencies, etc.

Solutions like Debian's don't really work, because no matter how large that archive gets, it'll never have everything (especially not the proprietary apps that people want), not to mention that as that archive grows, its overall quality decreases. (More maintainers don't really help manage all that - they just add more communication and cooperation failure points.)

What we need is simple. Any software, be it something on the web, on a CD (or on many CDs, like most decent modern games), local file system, or whatever, should be as easy to install as "Click." (And probably a few additional steps for security - ask for passphrase, verify trusted package authors, etc.)

Most people I've had try a Linux desktop have gotten along just fine with it, including several other "popular" warts like mounting/unmounting media, but application installation kills them every time. Especially if they want to play games, since most games worth playing are proprietary, don't come as RPMs/debs, and are nothing but utter hell to get installed - literally impossible if you don't know what you're doing.

I think I've ranted both in Advogato and other venues on how and why current package managers suck, and even (gasp) offered suggestions on fixing them, so I suppose I'll stop now before this post turns into several dozen pages of stuff I've already written. ~,^

Even if you are sure that the service announcement is authentic, it does not help against spoofed IP addresses and names. In other words, you still can't be sure that you are connected to the right server. You can only do this at the application protocol level. BTW, SLP allows you to sign announcements (i.e. authentication), but no one uses it... the general rule is: if somebody has physical access to your LAN, SLP authentication does not help anymore.

Encryption does not make much sense. Almost everything you can do with SLP can also be done using port scans, only slower and with higher network load.

The distribution of keys is a real problem. You need to install them on every single computer. For authentication this means that SLP would have no advantage over a centralized server (e.g. LDAP). For encryption in a large company, it means that it's relatively easy to break in: you only need access to a computer/notebook for a short time and you can access everything. It's not feasible to have more than a few different keys.

In other words: a lot of effort, very small improvement. You'd better spend your time securing application protocols or using IPsec for everything.

I've played with Synaptic (which is a GUI front end for APT), and I've been very impressed. I think single-click application installation is already on the horizon and just needs a few more iterations to be perfect. APT supports local repositories (on disk, CD, NFS, etc.).

However, I completely agree with you that many commercial games and applications still don't seem to "get it" as far as packaging goes.

They're all the same. Names are not case-sensitive. Yes, I know ASCII D
and ASCII d are different, but so what? It would be interesting to know
what a change to ext3 to fold case would break. Not a lot, I suspect,
but I don't know.

My wishlist item would be the adoption of a convention (presumably
incorporated into applications over time) that allows you to figure out
easily the CLI equivalents of GUI actions. Sometimes I avoid using GUI
system management tools, especially when logged in as root, for things
like network setup because I don't understand what they will be doing
"behind the scenes" and am afraid they will screw up my system. A
solution, for me, would be to have a widget's context (right-click) menu
optionally show me exactly what will happen if I decide to click on it.
This would be presented -- probably in a subwindow you can copy from --
as a CLI equivalent that, if typed, would perform the exact same action.

Such a feature would also allow you to build scripts more easily by
copying from that context menu. I think it might also be helpful to
people trying to learn CLI commands.

I don't think that's really realistic. Many administrative tasks aren't at all simple command lines. Think of an Apache manager - would you want a tooltip (or separate window) that shows pages of commands and Vi keystrokes to perform the configuration? In many cases, there is no direct map between anything you could reasonably "demonstrate" on the CLI and the actual task being done. Furthermore, most tools don't just run CLI commands - they do things which are a bit more robust and fault tolerant than exec'ing commands and attempting to parse output and return codes. (Yuck.) Making the developers maintain a set of feasible CLI commands to go with this would just introduce more work on the part of the developers, and open all new possibilities for bugs. ("Hey, I cut-n-paste the command your app spit out and it borked my system!!" "Sorry, we don't actually *use* those commands, so typos pop in rather frequently.")

Honestly, if you're using a good GUI tool, it's a hell of a lot less likely to screw up than a human typing in CLI commands. Trust your tools to do your job for you. If you can't do that, why even bother having those tools installed at all?

"However, I completely agree with you that many commercial games and applications still don't seem to "get it" as far as packaging goes."

No, it isn't the game vendors that have the problem. It's the useless packaging systems we have. Large, complex games simply can't be installed as RPMs or DPKGs. Neither of those formats is intended for multi-gigabyte packages that span multiple CDs. Sure, they can run install scripts, but those scripts don't register files in the database or provide clean error recovery.

Furthermore, decent UI on those formats isn't possible. When I have a multi-CD package, I don't just want some dinky "Installing foogame-cryptic-package-name-0.235x-gamma-ray-mog..." dialog with a progress bar. When a user is waiting for something, they want some minor entertainment during the process. (Never mind them opening up a browser or doing something actually useful. ~,^) It's a lot like how during the installation process of a distro like Fedora you have those slideshows. Game developers want, and in fact use, those as well. They like to use them for marketing things like strategy guides and other titles by the publisher.

Things like EULAs (sorry, whether you hate them or not, no commercial publisher will allow their product to be installed without making the user agree), CD-Keys, installation options, and so on also need to be handled. <side-rant> CD-Keys are very important for games; despite how much most people hate DRM, a simple sample of almost 20 people I know shows that about 60% of them run pirated games simply because they don't feel like paying and somehow think they have a God-given right to play the games anyhow. (Don't get me started on music. I think I'm honestly the only person I know who pays for music. "Ya, I love that band, but they don't deserve any pay!") Those kinds of assholes make DRM mandatory. </side-rant>

Simple things like even *running* an install engine are a pain on Linux. Take executable bits, for example. Download a perfectly functional installer from the web and it won't run. Why? It's not executable. Assuming the end-user knows how to fix that, it's still a pain to have to do it. And my attempts at getting friends running on Linux have shown that it is too much of a pain to be worth it. They'd rather pay $150 for a copy of WinXP and deal with activation and all that pain and just have simple Click installs. (Honestly. 6 friends I've attempted to get running on Linux have all ended up doing this. The *only* problem they ran into on Linux was the application installation.) And truly, executable bits don't serve much of a purpose anymore. No, they don't do anything for security. There was discussion about this issue on the Autopackage mailing list for the curious. All executable bits do now is require extra steps by the user to run legitimate applications.

Finally, in regards to Synaptic, no, it isn't at all the answer to anything. The GUI is absolutely disgusting, because all it is is a GUI on apt. Sure, if you're a Linux geek who's used to the command line, knows how to use apt, and is knowledgeable as to the major workings of a packaging system, Synaptic is a lot easier. Take someone who doesn't know much beyond "I want to install this application I found" and it's pitifully useless, not to mention confusing. The package list is still filled with way too much useless technical information, like the package name (especially for non-English speakers; translated package descriptions are the only thing that's actually useful to them), package version (who cares? they want it installed, they don't care which sub-point-micro release it is), etc. The organization of the packages is usually next to useless, being grouped into cryptic package set names (again, non-translated) and each package being in only one branch (shouldn't Evolution be in both Desktop (GNOME) and E-Mail Applications?).

Linux is absolutely nowhere near being usable in terms of application installation or management. Not close at all. The current tools available are all geared towards system administrators who need to keep existing software up-to-date or to micro-manage package sets. (Does a user really care about the several dozen independent GNOME packages? Do they need to see any of that in the Install/Uninstall Application window? All the user cares about are the visible applications, and downloading "Latest Patches".)

The package systems need to be improved, a lot, in order to handle large installation procedures (large dependent file sets, multiple CDs with switching, etc.). Autopackage handles this to an extent, although its UI querying capabilities are rather limited. The package formats also need a *lot* better meta-data about the packages. I.e., is a package an application? Or just support (library/data/etc.)? What is the clear simple title of the package (with translations)?

Distributions need to stop being so "I want things done my own special way, screw you guys" and work on a tad bit of compatibility. In many cases, this isn't even the distributions' fault, but the fault of irresponsible or incompetent upstream software authors who develop libraries with unstable APIs/ABIs, fail to use proper library versioning, *also* fail to version header include paths, data files, etc., make libraries or applications that need to be recompiled to change feature sets and available APIs/ABIs, etc. You simply can't make a usable package when the upstream source is utter crap.

Of course, the packagers *do* need to work harder/better. In many cases, perfectly good upstream packages turn into horrible messes of an RPM/DPKG/whatever. Things that should be built/installed in one clean way end up being done 20 different ways, and that's just talking about packages for different versions of the same distribution. Mandrake and Fedora are very compatible in many ways, yet they have diverged in their exact package sets. This shouldn't be a problem, yet it ends up being one anyhow, for *no* good reason other than lack of cooperation and effort on the part of the packagers. There is no reason for the same library to end up with different package names, different versions, etc. between two distributions, yet it happened.

To assist in cross-distribution packaging (be it using the same package format or wholly different formats), packagers could use help. One thing I've had in my mind for a while now is a central "Packaging Heaven" site which lists registered upstream packages and packaging information. I.e., this site might say that libfoo v2.3 should be packaged as libfoo23 with version number 2.3.0-<pkg_rev>, be installed in /usr, and have all optional features enabled. Any packager who follows those guidelines will produce a binary-compatible package with sane dependency information (for dependent packages). In cases where the upstream source is just screwed (see several paragraphs above), it could be listed as "packaging unfriendly" with a list of problems and the recommended work-arounds to ensure cross-distribution compatibility where possible. One could then even come up with a tool that uses the database to verify that a distribution's packaging conforms with the guidelines for registered packages, so ISVs can make informed decisions about which packages they can depend on, which libraries they have to include with their application because distributions (or upstream) don't support them in a compatible fashion, etc. Sure, there may be distributions/packagers who decide to intentionally go against the recommendations, but screw them. If they go out of their way to make incompatible systems, it's only their systems that will be broken. Just tell users to avoid the pain that distribution or package set will bring them.
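
The conformance-checking tool could start as something this small. The registry entry and field names here are entirely made up to show the shape of the idea:

```python
# Toy sketch of the "Packaging Heaven" idea: a registry maps an upstream
# package to the agreed binary package name, version, and install prefix,
# and a distribution's actual package can be checked against it.

REGISTRY = {
    "libfoo": {"pkg_name": "libfoo23", "version": "2.3.0", "prefix": "/usr"},
}

def conforms(upstream: str, pkg: dict) -> list:
    """Return a list of guideline violations (empty means conformant)."""
    rules = REGISTRY.get(upstream)
    if rules is None:
        return [f"{upstream} not registered"]
    problems = []
    for field, expected in rules.items():
        if pkg.get(field) != expected:
            problems.append(
                f"{field}: expected {expected!r}, got {pkg.get(field)!r}")
    return problems
```

An ISV could run every dependency through such a check and ship only the libraries that fail it.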

Given the current state of projects out there trying to correct any of the above, it doesn't look like it's going to be any time soon that a normal user can just stick a CD in their computer or click a website link and install a complete third-party application painlessly. Autopackage is by far the best effort out there, but it has quite a long way to go. Until then, anyone hoping to get friends and family on Linux had better be ready to spend a *lot* of time helping them install any applications that didn't come on the OS CDs.

I'd really rather my soundcard be detected, my USB printer be detected, my new firewire card be detected and any other hardware I put in my computer be detected than anything you listed. Not to mention quite a few applications I want ported. These things would make _my_ world a much better place.

I'd like some system into which I could dump tons of bits (MP3s, Oggs, photos) and metadata (ID3, Vorbis comments, EXIF data, thumbnails) about them, which would let me burn some of it off to a CD once in a while (when the new part reaches about 700 megs), would allow me to search it (using the metadata) and would cache it on my hard drive.

That way it would be backed up, searchable, and as fast as I want (I could use a big hard drive to cache everything if I wanted).

Thanks for your reply. You brought up some good points I'd like
to clarify.

In many cases, there is no direct map between anything you could
reasonably "demonstrate" on the CLI and the actual task being done.
Furthermore, most tools don't just run CLI commands - they do things
which are a bit more robust and fault tolerant than exec'ing commands
and attempting to parse output and return codes.

I don't mean it should spew out vi keystrokes; that would be useless.
Ideally it would be the equivalent command you'd issue to the tool
itself, running in CLI mode instead of GUI mode (assuming the tool has a
CLI scripting mode).

Making the developers maintain a set of feasible CLI commands to go
with this would just introduce more work on the part of the developers,
and open all new possibilities for bugs. ("Hey, I cut-n-paste the
command your app spit out and it borked my system!!" "Sorry, we don't
actually *use* those commands, so typos pop in rather frequently.")

It's something I think I would find useful, and of course each developer
has to decide whether they agree and if it makes sense for their
project. If it is designed in up front, so that the CLI command is the
direct equivalent for the GUI action, it could assist debugging and bug
reporting by providing an easy way to duplicate the GUI's actions in a
repeatable manner.

Honestly, if you're using a good GUI tool, it's a hell of a lot less
likely to screw up than a human typing in CLI commands. Trust your
tools to do your job for you. If you can't do that, why even bother
having those tools installed at all?

Indeed; some of them I don't trust, having had problems with them in the
past, and I don't have these installed. I was hoping this idea could (in
principle) start to provide a way to trust them. :)

What it sounds like you want is that most functions that a particular GUI could perform would be made available in a library. The GUI would use the library to do that particular task, and making a command line tool that performed the same task (also using that library) would be trivial.

Doing so is just good design. Separating your business logic from your display logic is a fair bit of work, but the benefits that can be reaped are huge.
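
A minimal sketch of that separation, with the tool name "nettool" and all function names hypothetical: the real work lives in one shared function, and both the CLI entry point and the GUI callback are thin wrappers around it, so the CLI line the GUI displays really is equivalent to what the button does.

```python
def set_hostname(config: dict, name: str) -> dict:
    """Shared business logic: validate and apply the change."""
    if not name or " " in name:
        raise ValueError("invalid hostname")
    updated = dict(config)
    updated["hostname"] = name
    return updated

def cli_main(argv: list, config: dict) -> dict:
    """CLI mode, e.g. `nettool set-hostname myhost`."""
    assert argv[0] == "set-hostname"
    return set_hostname(config, argv[1])

def gui_on_ok_clicked(entry_text: str, config: dict):
    """GUI callback: performs the action and can show the CLI equivalent
    in a copyable subwindow, as suggested above."""
    equivalent = "nettool set-hostname " + entry_text
    return set_hostname(config, entry_text), equivalent
```

Because both paths call the same function, the displayed command can't drift out of sync with what the GUI actually does.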

On your comment about trusting GUIs to Do The Right Thing, wish list item #5 could be used in that sort of situation.

Hi ishamael. Hardware detection and configuration is something that's already in the works. From the sounds of things, the D-BUS stuff going on will bring hardware setup to userland. Tools like kudzu are already making good inroads; the problems that we've all faced in the past are slowly going away.

As far as applications being ported: as I said at the beginning of my post, I completely understand that certain applications are on people's wishlists, but I wanted a wishlist that was less in the style of "a copy of FooApp for Linux".

Sorta. I know they're SLOWLY going away, but that's half the problem. SLOWLY. I still had a hell of a headache getting my iPod (which required a firewire card) to work in Linux, and it's still pretty damn finicky. It's just all rather bothersome that I still have to triple-check new hardware I buy against compatibility lists on the internet. AFTER this trend disappears, I'd be all for any of the things on that list.

dunne asks for case insensitive file names. This should be done at the application level, not by the kernel or filesystem code. The kernel and filesystem should just treat file names as a string of arbitrary bytes excluding the values 0x00 and 0x2f.

Making the kernel or filesystem case-insensitive is nearly impossible to do correctly, because there is no universal way to tell whether any two characters differ only in case. A naive view is that this would only apply to the latin alphabetic characters in ASCII, but for people who use languages with more or different letters, this "solution" is worse than useless since it would result in some filenames being case-insensitive and others not.

I used to be in favor of case-insensitive file names (and variable names, etc.) until the i18n issues were pointed out to me.

It either needs to be done in the low-level system, or not at all. If you have applications do it, they'll suffer from the *exact* same problem, *plus* they'll have even more inconsistency issues. ("Hey, ViM let me make two files with the same name but different capitalization, but now Emacs won't let me access one because it just sees two files with the exact same (case-insensitive) file name!")

I'm not personally too convinced that it's impossible for the kernel to handle case-insensitizing file names. Difficult, sure, and perhaps just waaay too much crap (Unicode mappings and such) would be required to be shoved into the kernel, but it's still possible. (I am assuming that only one encoding is allowed. UTF-8, perhaps. Letting users mix and match encodings for different individual file names is just asking for pain.)

Why do the case-insensitivity mappings in kernel space? Why not hand it off to some user-space daemon? I'm not sure how this compares with the GNU Hurd's concept of translators. Performance might be a problem, though.

I wrote an LD_PRELOAD hack a while ago that lower-cased all fopen(), etc. requests. I can't remember all the details of the situation, but my conclusion in the end was that applications *and* the kernel *and* filesystems would all need to be reworked to satisfy case-insensitive filenames. It'd be a huge undertaking - and the i18n issues make it even huger.

consequently, even under NT it's possible to create two files whose names differ only by case, and thereby totally confuse the win32 subsystem (and all lovey gooey programs thereby).

yes, it's absolutely necessary to have the kernel do the lower casing.

about the only sensible compromise way to do it is to have a loopback kernel driver that munges onto a real filesystem.

realistically, however, you really need to have two names associated with the file. one of them is unique (e.g. a complete lower-case remap of the name given when the file was created) and the second is the one that was originally given.

even then, you _still_ have problems like you do with Subversion, where if you rename a file from Xscale.c to xscale.c your NT system can't cope. why? because someone forgot that the file will be renamed but when it comes to deleting the "old" one you find that it's still there. uhm... or something like that.

so, no, i don't think case insensitive filenames are a good idea.

if you _really_ want case insensitive filenames, then run samba and smbfs on the same box.

that way, you will use the samba case remapping functionality (which has taken years to get right), and you won't poo up your box.

I (and a number of others) argued that names in XML should be case sensitive, and won out in the end.

The main reason was that the upper case variant of a given letter varies from country to country in some cases.

For example, rôle (contains o with ^ accent) turns to RÔLE in Quebec, and to ROLE in France.

In addition, lower to upper to lower is not a reliable identity transform. Turkish has two lower-case letters that map to I in upper case (i with dot and i without dot).

The upshot of all this is that case-insensitivity only works within a single locale, or where every piece of data is marked with both language and locale (e.g. "French" isn't sufficient).

One common approach is to say that accent marks are also ignored, so that e-acute (é) and e (e) are considered the same.

However, you then can't have two files in the same directory that differ only by accents, and in some languages that can be very surprising. It would be like saying "you disregard vowels in file names", so that hablo (I speak) and habló would be indistinguishable in Spanish, as would speak and spoke in English if you disregarded vowels (of course, ancient Hebrew was written without vowels, which causes many disputes in biblical scholarship!).

There are also other issues with case mapping, including spelling changes in German (ß/SS) and ligatures (æ/AE, œ/OE, ĳ/IJ and others) becoming split into separate letters in upper case in some languages or places in the world.
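
The German example is easy to demonstrate: round-tripping through case is not an identity transform, which is exactly why a case-folding filesystem can't simply call upper()/lower(). Python's default Unicode case mappings show it directly:

```python
# ß has no single-character upper-case form; it maps to the two-letter SS,
# and coming back down yields ss, not ß.

assert "straße".upper() == "STRASSE"          # ß uppercases to SS...
assert "STRASSE".lower() == "strasse"         # ...but SS lowercases to ss
assert "straße".upper().lower() != "straße"   # so lower(upper(x)) != x
```

Any kernel-level folding scheme has to pick a winner for collisions like "straße" vs. "strasse", and the "right" answer differs by locale.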

If you agree never to use a networked file system, never to exchange files with people in other countries, and never to use a language other than American English (other dialects of English have greater use of accents and ligatures), then you can reduce surprises.

Although people in a given locale might be comfortable with case insensitive file names, they rarely know the rules for other locales. This means you could write a tar archive or a .deb or rpm package that contains files that my file system considers identical. Have fun supporting me, if I don't speak your language.

To avoid this, the solution is that everyone's kernel must contain all of the case mapping rules for all locales in the world.

Unfortunately, this is not a static list - e.g. the rules in Spain changed recently for collation of ch, and the use of accents in Modern Greek changed in the 1970s. Rules on accents in French are also undergoing change in various parts of the world. Obviously, historical documents continue to follow the older rules.

So, doing something that makes Americans happy most of the time is fairly easy; making the other 96% or so of the world happy with case insensitivity is much much harder.

For XML, we had the situation that a valid document in one country might not even be well-formed in another country, so we chose case sensitive names.

It's a small burden to pay for being able to operate internationally.

Liam
