GNU-Info Bad, Man(1) Good

I remember the first time I encountered GNU Info. A man page
said nothing useful, and referred me to GNU Info.

The first problem was that the key bindings were stupid
and counterintuitive. (Still are. Every time I run it I
have to figure them out all over again.) A bigger problem
was that the files were badly organized, so that the
information I needed was dispersed to lots of different
places. Worst, having scoured the whole subtree, what I
needed to know usually was not there.

I could see what had happened. All this handy organizational
apparatus had to be used, so everything is broken into
little sub-pages. Since it's essentially hierarchical, every
detail gets separated from other related details and stuck
under a nice sub-sub-heading. The really useful information
doesn't really fit under any nice hierarchical category.
Since there's no good place to put it, it doesn't get written
down at all.

Contrast this with the experience of man pages. On my system,
man automatically pipes to less, so I can scroll around freely
to see the whole thing. There are headers, but no intrusive
hierarchy. For anything useful the author thinks of, there is
someplace good to put it, generally next to something related.
In better-written pages, there's a flow, with later paragraphs
building on earlier foundations. At the end, there's a
"see also" that tells me what other man pages might be
interesting. Usually there are examples.

The only useful info pages I've seen were converted from
man pages.

Back in "the day", something as big as the make manual was
considered too big to put in a man page, but that seems not
to be a problem any more; certainly the bash man page is huge
yet useful and entirely usable.

Can man(1) be improved upon without damaging what's good about it?
Does it really lack anything important?

your bash experiment, yeupou, makes me think of the ksh man page i checked once. i had installed openbsd and the csh was a bit hard on me so i installed ksh and checked its man page. i started reading but it looked very long to me so i started scrolling down and the more i did scroll, the more pages came to screen and it was frightening at first :)
i have the habit to take notes when i read some man pages but i saw that ksh man page was so big i'd rather print it. in fact, printed man pages look pretty good.
i have a plan9 cd + manual (sent by the plan9 team to me, i was to write an article about it but i left the journal i was working for before i even started that article) and their manual is just man pages printed, and the whole thing is just great and very nice to read.
makes me think of the documentation for unix. i think i read that from "25 years of unix" (buy it if you don't own it, it's only good stuff) that they had first written the man pages and documentation before starting to write code. laying doc made sure everything was logical and complete, then it was just programming each tool up to its specs and what it should do. like the design of an api before you start chewing code around, what i consider a very good move to do first when it comes to writing code other people will (hopefully) reuse :D

(Some fundamentalist bloke said, "man is underfeatured". Right, how
many bells and whistles do we need to just read a technical document? Should
we throw in Microsoft's Clippit too?)

(And, "Still baffles me how people learn vi as if it's the cool
thing to do, [...] and yet are scared off by simpler tools such as
lynx or info." Because, many of vi's keystrokes
can be directly used in more and man. Not info.)

Can man be improved? Some kind of internal/external hyperlinking
facility would be nice, but this requires some twiddling in both
groff and more. And last I checked, groff's HTML
output support is still experimental.

I already ranted about info but I'll link to it here for completeness. Thanks ncm for expressing your opinion (which is pretty similar to mine) on man(1) in a less ranty and more coherent way than I did, and putting it into an article (which I should have done).

trs80: the standalone info reader does support searching within an info document. It uses the same keybinding as Emacs does. Just type ctrl+s and start typing. If the word isn't found on the current page, it continues to the next. Press ctrl+s again to search to the next match.

personally, I really like info. I like that emacs has a nice reader for it, I like writing docs in texi (comparing with docbook and troff, texi is aesthetically much more pleasant to write) and I like the browser paradigm for documentation: setting bookmarks and having a "go back" key really helps.

I think there's a philosophical difference, anyways. texi is designed to "embed reference docs in the context of a book", or book-like document. it's a medium which is more GNU, less unix: the document is supposed to invite readers in, even novices, to learn about your program. texi docs should contain tutorial and getting started sections, as well as reference.

man pages are terse, usually just refreshers in "how to invoke so-and-so properly, and some common mistakes". you still need to know you want to run so-and-so to do a task, and roughly how it works. if you look at a lot of the older man pages, they really are just 1 printed page.

You can get the "one page, readable with less" format from texi files as well. I use it to produce files like README and INSTALL from texi source.

In general, I believe the info format is well past its prime. Low-end desktop computers are more than fast enough to convert texinfo on the fly, and can do a much better job when the markup hasn't been stripped. Of course, this would require that a) someone provide a texi-based infrastructure, and b) someone write the texinfo-based browsers.

I do like the texinfo format very much, as it is simple and readable, especially compared to all the SGML-based formats out there.

The standalone info browser used to be rather broken, because most of the people who cared about info files were using the superior built-in Emacs info browser. I hear that it has improved a lot recently, though.

For SPASS, we use texi2pod and pod2man to create man pages out of texinfo sources. Using texinfo, it's also quite trivial to generate hyperlinked docs from the same source for online help, and PDF docs for printing. There is also a texi2man project somewhere, but it's hopelessly outdated and unmaintained.

For Kaffe OpenVM, though, we[1] are moving over to using DocBook for everything. From that, it should also be easy to generate man pages, hypertext docs, or whatever else people come up with as the documentation standard of the future.

The main problem I have with man is not man pages, it's the arcane language needed to write man pages. I'll take texinfo any day over roff.

Info is good for very big manuals, or tutorials, or for meta-organization (like having some big tract about sockets that points at the actual documentation for the various related calls). It isn't for small software or documenting individual system calls.

What really should happen is that the info system should additionally be able to produce man pages as well as html, TeX, and .info files as it currently does.

Or, perhaps docbook can be extended so that it can make man pages as well. Man pages are extremely useful as a well indexed quick reference. Other documentation systems should work with man.

DocBook has had elements for marking up man pages for a long time (I don't know if they were there in the beginning). Just use <refentry> as your toplevel element. The xmlto tool included in many distros can perform the conversion from DocBook/XML to man format: run xmlto man followed by the name of your XML file.

I do dislike info. I like xemacs but I don't like info. Perhaps JWZ should create a forked version of info. :P

Man pages are easy to use because you can just press "Page Up" or "Page Down" depending on which way you want to go. Also you can search them with the '/' character. That's not intuitive, but I guess a bunch of programs use it.

When you type `man ls` it says right at the top in bold letters what `ls` is and what it does. When you type `info ls` it says right at the top "File: coreutils.info, Node: ls invocation, Next: dir invocation, Up: Directory listing." The info page is confusing and cramped. It doesn't use bold characters to separate the heading from the rest of the text.

Even large complex topics can be handled with man pages. For example, the Perl man pages are brilliant. I refer to them all the time.

Python has great online books and resources, but I do a lot of programming on the train to work and cannot connect to the net. Even when the html is available, I sometimes prefer man pages.

One way I think man pages could be improved would be some way to make changes and additions to the documentation. Maybe you could press "CTRL-E" to edit, and after you were finished it would be saved somewhere in your $HOME directory.

Also it would be nice if there were some way to automatically upload your changes to the net. The man page would have an embedded address in it, and when you typed CTRL-U, you would type in a few notes about your changes and upload them to the address. The person on the receiving end would have some kind of screening process and then the changes could go live.

In Debian, when you look for something that should have a man page but doesn't, it points you to "man undocumented". Wouldn't it be awesome if people could throw up some documentation right away without any fuss? Even if they just filled out the name and one-line description fields it would help. I use "man -k" all the time when I need programs. In one situation, I logged into a system and couldn't find an editor until I used `man -k` to locate `ae`.
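
To show why even a bare name and one-line description is worth filling in, here is a rough sketch of the kind of keyword lookup `man -k` performs. It is not how man itself is implemented; the `apropos` function and the hand-written `WHATIS` table are stand-ins I made up for illustration.

```python
def apropos(keyword, whatis):
    """Return the sorted page names whose name or one-line
    description contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return sorted(
        name
        for name, description in whatis.items()
        if needle in name.lower() or needle in description.lower()
    )


# Hypothetical stand-in for the index of NAME lines on a system.
WHATIS = {
    "ae(1)": "tiny full-screen editor",
    "ed(1)": "line-oriented text editor",
    "ls(1)": "list directory contents",
    "pr(1)": "convert text files for printing",
}

if __name__ == "__main__":
    for name in apropos("editor", WHATIS):
        print(name, "-", WHATIS[name])
```

A page with no NAME line at all simply never shows up in such a search, which is the whole problem with "man undocumented".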

Another man page improvement would be to add more examples of the obvious stuff. The Debian "tar" manual is terrific because it explains everything I want right away. The "tar" manual on other OSs is crap because it doesn't have the examples...

I think what error27 is talking about is a man->wiki
gateway. I think such a thing could be fabulous, and ought to be
remarkably easy to cobble together. There are already lots of
man->www gateways that would be easy to adapt. The trick would
be getting the edits merged upstream, and merging local edits with
new versions released.
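
As a rough sketch of the "getting the edits merged upstream" half,
a gateway could turn each local edit into a unified diff that a
maintainer can review and apply. This is only an illustration using
Python's standard difflib; the page name frob.1, its contents, and
the make_patch helper are all made up for the example.

```python
import difflib


def make_patch(original, edited, name="frob.1"):
    """Return a unified diff that turns `original` into `edited`.

    An empty string means the two texts are identical.
    """
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        edited.splitlines(keepends=True),
        fromfile=f"a/{name}",
        tofile=f"b/{name}",
    )
    return "".join(diff)


# Hypothetical man page source as shipped upstream.
ORIGINAL = """\
.SH NAME
frob \\- frobnicate files
.SH SEE ALSO
twiddle(1)
"""

# The same page after a reader added an EXAMPLES section locally.
EDITED = """\
.SH NAME
frob \\- frobnicate files
.SH EXAMPLES
frob \\-v notes.txt
.SH SEE ALSO
twiddle(1)
"""

if __name__ == "__main__":
    print(make_patch(ORIGINAL, EDITED), end="")
```

Merging local edits with a newly released version is the harder
half, and this sketch does not attempt it.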

Having thought about the whole issue further, I think that (like
the products of the Sirius Cybernetics Corporation) the fundamental
problems of info are masked by its superficial problems. Its UI
stinks, but that's fixable. The fundamental problem may be that
while man pages have a recommended set of headings appearing in
fixed order, writers of info pages are on their own. It appears
that they are, as a rule, spectacularly bad at choosing section
headings, and at breaking material along boundaries meaningful to
readers.

Besides familiarity, there is much to be said for a standard set of
headers that has nothing to do with the program or function being
documented. It allows the material in the paragraphs to flow
naturally. There's nothing inherent in make or sed that would
lead the author of an info page to add a "see also", or even an
"examples" section. The only reason they would appear is if they
were in the template. Expecting the author to come up with the
headings leads them to design bad documents.
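
To make the "template" idea concrete, here is a small sketch that
checks a rendered page against a list of expected headings. The
list in RECOMMENDED is my own assumption, picked for illustration
rather than taken from any official standard, and the sample page
is invented.

```python
import re

# Assumed set of conventional headings; adjust to taste.
RECOMMENDED = ["NAME", "SYNOPSIS", "DESCRIPTION", "EXAMPLES", "SEE ALSO"]


def section_headings(page_text):
    """Return the headings of a rendered man page, in order.

    A heading is taken to be a line that starts in the first column
    and contains only capital letters and spaces.
    """
    return [
        line.strip()
        for line in page_text.splitlines()
        if re.fullmatch(r"[A-Z][A-Z ]*", line.rstrip())
    ]


def missing_headings(page_text, recommended=RECOMMENDED):
    """List the recommended headings that the page does not have."""
    present = set(section_headings(page_text))
    return [h for h in recommended if h not in present]


# Invented page for a hypothetical program called frob.
SAMPLE_PAGE = """\
NAME
       frob - frobnicate files

SYNOPSIS
       frob [-v] FILE...

DESCRIPTION
       Frobnicates each FILE in turn.

SEE ALSO
       twiddle(1)
"""

if __name__ == "__main__":
    print("found:  ", section_headings(SAMPLE_PAGE))
    print("missing:", missing_headings(SAMPLE_PAGE))
```

A check like this only works because the headings are independent
of the program being documented; there is nothing comparable to
test an info tree against.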

No fooling around with a UI can fix this fundamental problem.
Even with an info->wiki gateway, the texts would not naturally
evolve away from their bad initial organization. The hierarchy
would persist.

Good man pages have the examples that you're looking for 80% of the time in the first screenful. Monkey-see-monkey-do works. Copy-and-paste, slurp-and-burp, whatever you want to call it. The synopsis section often works this way for arguments to functions.

Please, folks that document etc files and such, gimme examples!

Why do so many man pages list the options in alphabetical order? How many readers open the ls man page thinking quick! I need to know what the -l flag does! More likely, they know they want a long or detailed listing.

Taking an arbitrary topic, I just looked at the man page and the info manual for pr. They both exhaustively enumerate the arguments. (OK, so the synopsis is not too bad, and the one sentence description "Paginate or columnate FILE(s) for printing." is worth its screenspace, and the info page starts with a paragraph or so that's perhaps useful. But then on to the march of arguments they both go.)

Hooray for the author of the pump man page; the one configuration option I wanted to know about (how to keep it from clobbering /etc/resolv.conf and hosing my local DNS cache) is documented with an example!

Successive elaboration, not exhaustive enumeration, please! Start with the most straightforward thing that works. Then add a wrinkle, etc. A tutorial, it's sometimes called.

After you've explained what I need 80% of the time, go ahead, document the remaining 20% in exhaustive enumeration style. (If I really want to know how to use all the arcane features, I'll probably have to resort to the O'Reilly book anyway.)

Yes, sometimes I have too much time on my hands and I appreciate a good "theory of operation" tome. But not often.

The Linux How-To series was a godsend for a while. Step 1: get the source from foo.edu; step 2: unpack... make install... put your email address there. Start it up. You win! But I see more and more documents called "How To" that don't tell me how to do it at all, but insist that I learn all about how it's architected for maximum extensibility and flexibility. Argh!

Both info and man are pretty much flawed; they simply show their age. Sure, man is still good enough for providing basically a 'foobar --help' with some additional documentation, but that's it. I find longer manpages quite hard to handle and unpleasant to use due to their lack of any TOC, hyperlinks and other things that make life easier. Let alone that most even lack a simple example section, but that's really more a content issue than an issue with man itself. After all, manpages are not much more than plain text files with some bold/underline markup.

Info on the other side, while it provides basic hyperlinks and TOCs and is much more pleasant for longer documentation (Emacs, Gnus, etc.), is not so useful for the short references that man is good at. Most info pages simply overstructure the commands just a bit too much, so one has to click through a bunch of links before reaching the desired info. The major flaw of info, however, is the dependency on .info files: .texi provides so much more structure and makes wonderful printed documents, but all that gets stripped away once they get converted to .info. For example, there is no way to get as nice a printout from an info page as one could get from the original .texi page. So if one wants a nice printout, one often ends up downloading the source (distros seldom ship the .texi files with the binary packages) and trying to generate a .ps from the .texi sources, which often is far from trouble free (A4 vs letter, double-page vs single-page), so even that way the results are less good than they could be in theory. The .info limitation of course also makes it impossible to include graphics and such; after all, a documentation system should be flexible enough for all apps, not just for command-line tools. So getting rid of .info and working directly with the .texi files could already lead to far better results.

One last word on .html documentation as it is seen here and there (Gnome, KDE). While it provides nice-looking rendering, it becomes pretty much useless from a user's point of view due to the complete lack of structure across multiple pages, which basically results in the lack of a full-text search across the whole documentation or a single section, as provided by info.

So what would a good documentation system need? Full-text search, good command-line and GUI clients to access the docs, a way to structure and mark up the documentation (after all, when I want to look for a command-line option, it would be good to have a way to jump directly to it), a good way to print the docs and, last but not least, a few standards or guidelines for documentation writers. Both man and info provide a few of these features, but lack some of the other, no less important ones.

A little digging revealed to me that Info was first written in 1987. I was betting that it was designed before HTML was in widespread use, and I seem to be correct. I prefer to read texinfo manuals in their HTML or printed forms over using the clumsy info tool. There's no advantage to the info reader - HTML has hyperlinks, which info only implements in an awkward way. I'd prefer lynx to info any day.

Texinfo is pretty good when you evaluate it as a documentation markup format, which is what it is. It's designed to be used for full length reference books, not "pages". It's easily convertible to everything from text to DVI, and nothing forces you to use the info program rather than a browser or PDF viewer. Most texinfo documents (all the GNU ones at least) are available in HTML and PDF/PS.

The existence of a useful reference manual in texinfo format is no excuse for a shabby manpage. Info documents should exist to supplement terse man(1) pages. A lot of GNU tools neglect their manpages, and I think more projects should follow the lead of GCC which has a complete 9000 line manual page indexing command line options in addition to reference documentation detailing usage and development.

info is older than texinfo, I believe it was used for some of the ancient DEC operating systems.

html 1.0 was strongly inspired by texinfo, one could view it as an sgml based version of texinfo, without the tags specific to programming documentation.

I actually find info superior to plain html for documentation, because info has "full document search" and "one key reading" (like man pages), plus internal hyperlinks like html. However, a specialized html browser or an ordinary browser plus a bit of scripting on both server and client should be able to offer the same.

Is info really that bad? I don't find it hard to use. I use it both on the command-line and from Emacs. Its user interface isn't great, but cursors to navigate, s to search whole manual, / to search individual page, u to go up section, etc. These all make sense to me. It's slightly easier to use in Emacs.

Anyhow, I typically use a combination of man, info, Emacs info mode, perldoc, Emacs perldoc mode and HTML pages, depending on whether I just want to refer quickly to the docs or not. For quick reference, command-line man, info and perldoc are great. For longer reading I much prefer the HTML docs.

I think man and info are complementary. man pages are like "cheat sheets" in my opinion. So the GNU project's anti-man page stance strikes me as illogical and just ideological. Fortunately most GNU packages include man pages anyhow, by using help2man, which generates a nicely formatted summary of the command-line options.

Finally, as usually happens when this topic comes up, you don't have to use info. There are lots of alternative interfaces to info documentation.

richdawe: that's a source for the problem, the fact that the anti-man page stance of gnu is not a technical decision but rather an ideological/political one. it's like some recent moves from the gnu gcc project where they do/try to do stuff for political reasons rather than technical ones (and you don't need examples for that, do you ?). doing things from a technical point of view makes me feel good. but too much gnu-fsf land prioritizes ideology/politics and restrict their moves to stuff that gets into agreement with their stance. well, one of the many reasons i keep to bsd rather than linux. just want to restrict myself to technical stuff, not having politricks or alike getting into the equation.

gilbou: source for the problem, the fact that the anti-man page stance of gnu is not a technical decision but rather an ideological/political one

I don't think there is any justification for this - but OTOH there are many technical reasons why info is a better conceived system for online documentation (today it's hard to imagine a serious manual without hypertext). Developments of the last 15 years (e.g. HTML) may mean that info isn't going to become the leading viewer but as far as I am concerned, man is obsolete. How that translates into any kind of "ideological/political" position baffles me.

Many people seem to take any opportunity to label GNU capricious. They're the software geniuses you love to hate! Perhaps it's the tall poppy syndrome. Personally I admire GNU both for awesome technical achievements and hard work, as well as their "ideological/political" stance. I imagine they have long given up hope for credit where it's due.

I use man with less, in case that's new to you. When I
want to go to the section SHELL BUILTIN COMMANDS in
bash(1), I type /^, then copy and paste the heading to get

/^SHELL BUILTIN COMMANDS

and then press [Enter]. To look up information on the set
command, I just enter

/^ *set \[

So there you are: the technical part.
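
For anyone who wants to see what those two patterns actually match,
here is a small sketch that applies them line by line, the way a
pager search does. The find_line helper is mine, and SAMPLE is an
abbreviated, made-up excerpt in the layout of bash(1), not the real
page.

```python
import re


def find_line(page_text, pattern, start=0):
    """Return the index of the first line at or after `start` that
    matches `pattern`, or None if no line matches.

    The regex is searched within each line separately, so ^ anchors
    to the beginning of a line, as it does when typed after / in
    less.
    """
    lines = page_text.splitlines()
    regex = re.compile(pattern)
    for index in range(start, len(lines)):
        if regex.search(lines[index]):
            return index
    return None


# Abbreviated, invented excerpt; line indices start at 0.
SAMPLE = """\
SHELL GRAMMAR
       Simple Commands
SHELL BUILTIN COMMANDS
       read [-r] [name ...]
       set [-abefh] [-o option-name] [arg ...]
       unset [-fv] [name ...]
"""

if __name__ == "__main__":
    print(find_line(SAMPLE, r"^SHELL BUILTIN COMMANDS"))
    print(find_line(SAMPLE, r"^ *set \["))
```

The leading "^ *" is what keeps the second pattern from stopping
at unset, or at every sentence that merely mentions set.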

I'm not trying to chase info off the face of the earth. I think
everyone should be free to read documentation in whatever way he likes.
On the contrary, it's people like your good self who'd like to
force/twist/persuade everyone to switch to info, and in the process
conjure up imaginary man-iacs who are bent on killing
info. The FSF's the one sniping people, not the other way round.

...man and info have already been replaced. Almost any good, newer library will have documentation in javadoc, doxygen, headerdoc or similar. These systems are completely superior.

Their most important feature is how easy it is to create documentation and keep it up-to-date. It's right with the actual source, so you've already got the file open whenever you're making changes. You can make it whine when things are unimplemented. Because they can look at the program context, the source is much less redundant. It's also very readable. This is a killer feature. For lots and lots of projects, I think it's the difference between not having any documentation at all and having fairly good documentation. Sure, ideally they'd also have tutorials and a user manual, but just API docs isn't horrible, especially with the quality these tools generate (below).

Another important feature is that they generate good-looking, easy-to-browse documentation. You can view it through your web browser of choice, so there are many viewers available, and you're already familiar with their use. You can view the documentation on the project's website without having it installed. It's hypertext with lots of hyperlinks, so it's extremely easy to go to related sections.

Doxygen generates great diagrams. Inheritance diagrams, collaboration diagrams, dependency diagrams, and call tree diagrams. These are really helpful for understanding the system very quickly.

Furthermore, if you're completely set on having man pages, it can generate them. doxygen has a man output mode which is fairly good.

I just wish that all the legacy code I still use had doxygen-generated API references. For one thing, it'd be nice to see everything in a consistent format. For another, my documentation could easily include the underlying code's with hyperlinks and call tree diagrams. For example, my code uses the system C library, boost, OpenSSL, and libpq. None of these have doxygen documentation available. It also uses ICU and libstdc++; they do.

Flame wars pro- or anti-FSF, vi, emacs,
troff, texinfo, or what have
you, are off topic. Yes, the attitude of some in the FSF toward
man is unfortunate. People who live in emacs
do like it. Yes, troff is archaic markup, and
texinfo is nicer in its way, and doxygen is
leading the way to a glorious future. This is all secondary.

The point I meant to hint at in the original article was that,
normally, newer solutions to old problems are manifestly better
than the old ones. The implementer has full benefit of hindsight, a
modern execution environment, better tools, new ideas, and motivation.
When the new supplants the old, we generally don't notice for long.
Everybody switches to the newer, better thing, and moves on. When
the new fails to supplant the old, that's worth studying.

Now, some people do like info better, but many don't.
That's something odd. With all the advantages, how did info
fail to win over so many? It's not as if man got new features
that enabled it to compete. (True, less makes it nicer to
read man pages than did more, but that's pretty
thin icing.) Info's UI sucks, but there are lots of
alternatives, none of which (except emacs, in some
quarters) has really caught on.

My claim is that there's something, or things, fundamentally good
about the man format, and the contents of actual
man pages, that has failed to be translated forward
into its proposed replacements. Disparaging man as a
"cheatsheet" just prevents you from discovering those merits and
adopting or bettering them. (I see the same effect when Lispers
disparage C, and a similar effect wherever nationalism
is practiced.) "Bigotry prevents insight" is not an earthshaking
observation, but it bears repeating now and again.

I have speculated earlier on what it is about the man
pages that made them work better for me. It occurs to me,
though, that my experience may differ from most readers'.
My early exposure to man pages was of the Bell Labs pages,
often as filtered through Berkeley. While riddled with factual
errors, and incomplete, they set a standard of literacy and
thoughtfulness not matched in the info files I have
encountered. Very few of the man pages on typical Linux
installations are traceable to Bell Labs, or even to Berkeley.

Still, I don't think lamenting the current condition of programmer
(or FSF) literacy gets to the heart of the matter. The Bell Labs
programmers were also users, like us. Like us, they were also
writing for nonprogrammers -- in their case, the word processing
secretaries in the Bell legal department, at first.
When they found a man page unclear, they could read the source,
and then check in (with sccs!) improvements on the spot. Improving
a random man or info page you encounter these days takes a lot more
work. We don't generally have CVS write access for every
page we use. A web-based wiki that automatically generates
sorta-patches sent to maintainers ought to help encourage
participation.

I've already alluded to the difference in organization of
info pages, compared to man pages. In
info pages, cleavage into sections is left entirely
to the author, and in practice, section headings tend to match
program features. While this makes the page feel more natural to
write, it bewilders those new to a program because it presumes
more knowledge than the reader has, yet. Man's more-rigid
header organization ensures that top-level topics are familiar to
the reader, both because they are the same in different man
pages, and because they refer to the users' experience, rather than
to internal details of program operation.

Man's more rigid organization also, paradoxically,
allows details about a program to be placed more freely.
This freedom allows the author to put together in one place
details that, if sorted out by category, might be widely
scattered. It allows details to be presented in a narrative
order more natural to a new user, perhaps building from simple
to more elaborate examples, each bringing in a new feature.

Besides encouraging better organization, the man format
provokes information that might not otherwise be expressed. Its
generic top-level headers create a "commons" for details that
relate to more than one part of a program, that might find no
place in a hierarchy based directly on program features. At a
more basic level, the man format demands thought on what to
place under sections "SEE ALSO", "HISTORY", "BUGS", and "FILES",
that it might not naturally occur to a program author to document.

None of these merits has anything to do with hypertext capability,
or with choice of markup format. There's no inherent reason that
info files could not be composed and presented as well
as man pages are today. It would take a lot of rewriting,
true, and the info program itself would probably best be
scrapped. The key, though, is not in the code, but in the structure
imposed on authors. Given good text to work from, presentation is a
matter of taste.

As for most long-standing problems in Free Software, to improve
matters we need clear standards, and apparatus to make participation
easy. Both are readily achieved.

1. back in the day, it was easy for anyone to edit them, so unclear things were quickly clarified

2. they follow a really good style guide

#2 should be easy to duplicate. Write style guides for various types of documentation and publicize them. For API reference documentation, Sun already created this guide for writing Javadoc comments. It actually is pretty helpful, but it doesn't address a "History", "Bugs", or "Files" section. You could write a new guide that includes whatever you found helpful about man pages. And add specific notes for languages other than Java and doc tools other than Javadoc.

I prefer man. I find the policy of FSF and the GNU Project of deprecating man pages inexcusable. Info was a neat idea, but has been surpassed by HTML and DocBook in its niche, and never should have attempted replacing man.

I have not seen info as having any good qualities. It has been very damaging, though. The non-intuitive reader and lack of man pages because the documentation is in info, of course, but then there's the other formats generated from info suffering from the same structure etc.

If a man page is not enough to cover everything, write a whole book elsewhere and have minimal usage (invocation, maybe common commands or options etc) in the man page.

Maybe a monolith that requires that much usage information is not exactly a good design to begin with.
