Cross-package documentation, part 1

To the casual observer, Haddock appears to work quite well across packages. For example, I haddocked the Haddock code, and all the links to the types from the GHC API point to the haddock-docs in my GHC 6.10.3 installation. (Yes, you need the right versions of the dependent packages chosen, installed and haddocked, and this doesn’t always work out optimally online. But that’s not the problem that I’m confronting.)

Anyway, we get links to modules in other packages by reading their generated .haddock files. They’re binary, and Haddock doesn’t have any obvious way to make them human-readable, but they correspond to Haddock.InterfaceFile.InterfaceFile, which is, roughly, a list of Haddock.Types.InstalledInterface. That list seems to be the interesting bit.
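As a rough illustration of why reading those files is enough for linking, here is a minimal sketch using simplified stand-in types. The record fields ifaceModule and ifaceExports, the buildLinkTable helper, and the string-based Name and Module are all assumptions made for this example; they are not Haddock’s actual definitions.

```haskell
import qualified Data.Map as Map

-- Hypothetical stand-ins; GHC's real Name and Module are richer types.
type Name   = String
type Module = String

-- Simplified model of one module's entry in a .haddock file.
data InstalledInterface = InstalledInterface
  { ifaceModule  :: Module   -- assumed field: the module described
  , ifaceExports :: [Name]   -- assumed field: the names it exports
  } deriving (Eq, Show)

-- Simplified model: an interface file is roughly a list of entries.
newtype InterfaceFile = InterfaceFile [InstalledInterface]
  deriving (Eq, Show)

-- Build a table saying which module's page each name can link to.
-- When a name appears more than once, the entry seen first wins.
buildLinkTable :: [InterfaceFile] -> Map.Map Name Module
buildLinkTable files = Map.fromListWith (\_new old -> old)
  [ (n, ifaceModule iface)
  | InterfaceFile ifaces <- files
  , iface <- ifaces
  , n <- ifaceExports iface
  ]
```

With a table like this, rendering a reference to a name from another package only needs a lookup; nothing about the name’s type or documentation is required.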

A Module (not to be confused with GhcModule, which is defined by Haddock) is a low-information type defined in GHC that contains the package name and version and the module name.

HaddockModInfo is just the module’s header description, plus the portability:, stability: and maintainer: fields. (Oddly enough, HaddockModInfo is defined in GHC, in ghc:HsSyn to be precise; it must be part of the parse result.)

The rest contain a lot of “Name”s, which is a GHC thing that refers unambiguously to the place an identifier originates. Sufficient for making a link, but not sufficient by itself for copying the named identifier’s docs or type. So passing over them for now… there is one interesting thing left.

A DocName contains a Name and also (if any) the module we’d like to link to in which that name is documented.

instDocMap :: Map Name (HsDoc DocName)

This gives more info on any number of Name identifiers. (HsDoc provides formatting, DocName provides its references.) There is no type information here at all, as far as I can tell, which will clearly need to be remedied somehow (in Interface, roughly a superset of InstalledInterface, types appear in ifaceDeclMap, though I’m not sure if that’s where they’re retrieved from for HTML-doc-printing).

But there’s another big question I need to answer: *which* names are documented in any given module’s instDocMap? The type provides no clue, nor does its current (lack of a) doc string, nor ifaceRnDocMap. I could look everywhere in the code that it’s generated, or I could ask David Waern… who will need to tell me if I said anything confused here anyway :-)
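To make the DocName and instDocMap description concrete, here is a small sketch with simplified stand-in types. The constructors Documented and Undocumented, the Doc type, the linkTarget and linksFor helpers, and the URL layout are assumptions for illustration only; the real HsDoc has many more constructors, and real Names are not plain strings.

```haskell
import qualified Data.Map as Map
import Data.Maybe (maybeToList)

-- Hypothetical stand-ins for GHC's Name and Module.
newtype Name = Name String deriving (Eq, Ord, Show)
data Module = Module { modPackage :: String, modName :: String }
  deriving (Eq, Show)

-- A name plus, if known, the module whose docs we want to link to.
data DocName = Documented Name Module | Undocumented Name
  deriving (Eq, Show)

-- Simplified doc markup: text, references to identifiers, and sequencing.
data Doc = DocText String | DocRef DocName | DocAppend Doc Doc
  deriving (Eq, Show)

type DocMap = Map.Map Name Doc

-- Where a reference should link, if anywhere (the URL layout is invented).
linkTarget :: DocName -> Maybe String
linkTarget (Documented (Name n) m) =
  Just (modPackage m ++ "/" ++ modName m ++ ".html#" ++ n)
linkTarget (Undocumented _) = Nothing

-- All link targets mentioned in the documentation of one name.
linksFor :: Name -> DocMap -> [String]
linksFor n docs = maybe [] go (Map.lookup n docs)
  where
    go (DocText _)     = []
    go (DocRef d)      = maybeToList (linkTarget d)
    go (DocAppend a b) = go a ++ go b
```

Note that nothing in this model carries a type signature, which mirrors the gap described above: the map is enough to render and link a doc comment, but not to reproduce a declaration.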

7 Responses to “Cross-package documentation, part 1”

During HTML generation, the declarations are taken from ifaceRnExportItems, which represents the export items. See the ExportItem data type: it contains LHsDecl. Creating the export items is one of the most important jobs of Haddock.Interface.Create.

You were wondering which names are documented in a given module’s instDocMap. Those are the names of all declarations in that module that have documentation. It is the same as ifaceRnDocMap in Interface, which is generated from ifaceDeclMap by taking all declared names that have documentation and renaming the documentation. ifaceDeclMap is just a map of all declarations in the module.
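The derivation described here (keep the declared names that have documentation, then rename the documentation) can be sketched roughly as follows. The Decl record, the renameDoc placeholder and docMapFromDecls are hypothetical names for this example, not functions from Haddock, and docs are modelled as plain strings.

```haskell
import qualified Data.Map as Map

-- Hypothetical stand-in for GHC's Name.
newtype Name = Name String deriving (Eq, Ord, Show)

-- Assumed model: a declaration's source text and its optional doc comment.
data Decl = Decl
  { declSource :: String
  , declDoc    :: Maybe String
  } deriving (Eq, Show)

type DeclMap = Map.Map Name Decl    -- every declaration in the module
type DocMap  = Map.Map Name String  -- only the documented ones

-- Placeholder for the renaming pass; a real one would resolve the
-- identifiers mentioned inside the doc comment. Here it changes nothing.
renameDoc :: String -> String
renameDoc = id

-- Drop undocumented declarations and rename the docs that remain.
docMapFromDecls :: DeclMap -> DocMap
docMapFromDecls = Map.mapMaybe (fmap renameDoc . declDoc)
```

In this model the doc map’s keys are always a subset of the declaration map’s keys, which matches the answer to the question about which names appear.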

this blog is becoming a nice resource for wannabe haddock hackers :)
so i’ll try asking here:
does the .haddock interface file contain enough information to generate e.g. the documentation for that module? i’m guessing the answer is no.
that’s a bit unfortunate though, since a machine-readable format to ship documentation that can then be converted into a desired format would be quite useful, especially when one uses binary tarballs for libraries (like when installing ghc).

Andrea: nope. I believe the missing bits of information are the doc-strings, which I plan to include in the .haddock files; and the type-signatures, which my current plan is to retrieve from .hi files. Luckily .hi files are present in binary distributions, so your idea might work. Do you have a particular use-case? Like, letting the end-user turn their docs into whatever strange format they like, as long as Haddock supports it? (or even machine-processable. hmm. Haddock and GHC and .hi/.haddock file versions are all very tightly coupled at the moment, by the way.)

The motivating goal for me was keeping a central hoogle index of all the packages installed, and finding that while packages shipped with ghc have the .haddock interface installed, you can’t currently extract the .txt file to feed hoogle from it.

Andrea: that might be possible to do in the future, since it should be possible to fully re-create the interface given the .hi files and the .haddock file, provided that we put some missing bits into the .haddock file. Note that nothing is missing for cross-package documentation, but if you want to re-generate documentation or create a hoogle file, we need to put some more bits into the .haddock file. This is not in scope of Isaac’s project, though.