There may be an argument to store them in a GitHub repository as you can publish the static HTML using pages. Although then an entirely separate set of arguments arise as to how you ensure they're up to date etc...
– Boris the Spider, May 13 at 7:51


If files are generated, then by definition they aren't source.
– chrylis, May 13 at 9:29


You publish what you want published. Especially on GitHub. If you want everyone to see a generated PDF or image, you should include it instead of expecting everyone to install LaTeX and compile it themselves. For example, this repository wouldn't be very good if it didn't include the produced images, only the project files...
– Džuris, May 13 at 9:57

As a consumer of third party libraries, out of the 10 times that I see a library with no online documentation (whether in a subfolder of the repository, or linked from the readme), I will click away and skip those libraries, all 10 times. I'm not going to mess around with Doxygen for half an hour just to see if a library meets my needs.
– Alexander, May 14 at 0:50

8 Answers

Absent any specific need, any file that can be built, recreated, constructed, or generated from build tools using other files checked into version control should not be checked in. When the file is needed, it can be (re)built from the other sources (and normally would be as some aspect of the build process).
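For instance (the directory names here are hypothetical, not from the question), the usual way to enforce this in git is to ignore the generated output while keeping the true sources tracked:

```shell
# Hypothetical layout: Doxygen (or any generator) writes its output to
# docs/generated/. Ignoring that directory keeps derived files out of
# version control; the Doxyfile and the source headers stay tracked.
cat >> .gitignore <<'EOF'
# Build artifacts -- regenerate with the project's build script
docs/generated/
build/
EOF
```

Anyone who needs the docs then regenerates them as part of the normal build.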

But this may depend on the versions of the build tools, or even on their availability (e.g., generating some files may require an old version of a build tool). How do you handle that? Can you address it in your answer?
– Peter Mortensen, May 13 at 10:20


@PeterMortensen If you need an artifact built with a special version of the build tools, you build it with the version of the build tools that you need. Such a need is either a) discovered by yourself, in which case you're on your own; b) documented in the README ("You'll need to have 2 specific versions of doxygen installed..."); or c) dealt with by the build scripts (they check the available build tool versions and act appropriately). In any case, source control is for sources, not for build artifacts.
– Joker_vD, May 13 at 10:32


I think this answer is only viable iff a continuous deployment server builds and publishes the documentation in an easily accessible way. Otherwise, there's a great value in "caching" the docs in the repo, to improve accessibility. No user should have to muck with your build scripts just to see your software's documentation.
– Alexander, May 14 at 0:53


@Alexander Would you also put the built binary into the repo? The documentation is built. You take the built documentation and make it accessible somewhere.
– 1201ProgramAlarm, May 14 at 1:06


@1201ProgramAlarm "Would you also put the built binary into the repo?" Nope, because a built binary has low up-front value to people browsing around GitHub, as compared to the documentation. "You take the built documentation and make it accessible somewhere." As long as that's publicly hosted and visibly linked, then yes, that's great. It's probably the best case.
– Alexander, May 14 at 1:31

My rule is that when I clone a repository and press a “build” button, then, after a while, everything is built. To achieve this for your generated documentation, you have two choices: either someone is responsible for creating these docs and putting them into git, or you document exactly what software I need on my development machine, and you make sure that pressing the “build” button builds all the documentation on my machine.

In the case of generated documentation, where any single change that I make to a header file should change the documentation, doing this on each developer’s machine is better, because I want correct documentation all the time, not only when someone has updated it. There are other situations where generating something might be time consuming, complicated, require software for which you have only one license, etc. In that case, giving one person the responsibility to put things into git is better.

@Curt J. Sampson: Having all the software requirements documented is a lot better than what I have seen in many places.

Don't document what software someone needs to do the build (or at least, don't just document it): make the build script tell the user what he's missing, or even install it itself if that's reasonable. In most of my repos, any half-way competent developer can just run ./Test and either get a build or get good information about what he needs to do to get a build.
– Curt J. Sampson, May 13 at 15:14


I don't really agree that putting generated documentation into git can be good in the case you specify. That's why we have artifact repositories and archives.
– Sulthan, May 13 at 17:13

That is your rule and it is a good rule and I like it. But others can make their own rules.
– emory, May 13 at 20:41

I think you mean "run a build command," as there would be no build button on your machine. ...Unless you're expecting the entire build to be integrated with an IDE, which is wholly unreasonable.
– jpmc26, May 14 at 15:14

@jpmc26 I find it totally reasonable to have the entire build integrated in an IDE. The build button on my machine is Command-B.
– gnasher729, May 15 at 21:35

One advantage of having them in some repository (either the same one or a different, preferably automatically generated, one) is that you can then see all the changes to the documentation. Sometimes those diffs are easier to read than the diffs to the source code (specifically if you only care about specification changes, not implementation ones).

But in most cases having them in source control is not needed, as the other answers explained.

That would pretty much require a pre-commit hook in each and every repo that is used to create commits. Because if the documentation generation process is not fully automated, you will get commits that have the documentation out-of-sync with the code. And those broken commits will hurt understandability more than uncommitted documentation.
– cmaster, May 14 at 11:10


This doesn't have to be at the commit stage. It could easily be a downstream/CI/Jenkins job to publish them every time they are deemed worthy of storage. This may well be each commit, but the decision should be decoupled in the absence of a good reason. Or at least that's the way I see it.
– ANone, May 14 at 13:57

Ignore them. You'll want the repo's users to be able to rebuild them anyway, and ignoring them removes the complexity of making sure the docs are always in sync. There's no reason not to have the built artifacts bundled up in one place if you want everything in one place without having to build anything; however, source repos are not a good place for this, as complexity there hurts more than in most places.

It depends on your deployment process. But committing generated files into a repository is an exception and should be avoided, if possible. If you can answer both of the following questions with Yes, checking in your docs might be a valid option:

Are the docs a requirement for production?

Does your deployment system lack the necessary tools to build the docs?

If these conditions are true, you are probably deploying with a legacy system or a system with special security constraints. As an alternative, you could commit the generated files into a release branch and keep the master branch clean.
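One way to do this with git (branch names and paths are illustrative):

```shell
#!/bin/sh
# Sketch: master ignores docs/generated/, but a release branch carries
# the built docs. -f overrides the .gitignore entry on that branch only.
git checkout -b release/1.0 master
make docs                        # assumed to write docs/generated/
git add -f docs/generated/
git commit -m "Add generated docs for release 1.0"
git checkout master              # master stays free of generated files
```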

Committing generated files into a release branch doesn't work in every situation, but there are a number, especially with things like static web sites built from markdown, where this is an excellent solution. I do it often enough that I built a special tool to easily generate such commits as part of the build process.
– Curt J. Sampson, May 13 at 15:16

Keep the generated docs in the git repo if any of the following holds:

They need to be part of the repository itself, like the README.md. It can be tricky to handle such situations in an automated way.

You don't have an automated way to build and update them (such as a CI system), and they are intended to be seen by a general audience; then it's preferable to keep them in the git repo.

They take a very long time to build; then keeping them is justifiable.

They are intended for a general audience (like the user manual), take considerable time to build, and your previously published docs become inaccessible (offline) while building; then keeping them in the git repo is justifiable.

They are intended for a general audience and must show a history of their changes and evolution; it can be easier to keep previous doc versions committed and to build and commit each new one linked to the previous. Justifiable.

The whole team has accepted a specific reason to commit them; then keeping them in the git repo is justifiable. (We don't know your context; you and your team do.)

In any other scenario, they can safely be ignored.

However, if it is justifiable to keep them in the git repo, that could be a sign of a bigger issue your team is facing (not having a CI system or similar, terrible build performance, downtime while building, etc.).

As a principle of version control, only "primary objects" should be stored in a repository, not "derived objects".

There are exceptions to the rule: namely, when there are consumers of the repository who require the derived objects and cannot reasonably be expected to have the tools required to generate them. Other considerations weigh in, such as whether the amount of material is unwieldy. (Would it be better for the project to just get all the users to have the tools?)

An extreme example of this is a project that implements a rare programming language whose compiler is written in that language itself (well-known examples include OCaml and Haskell). If only the compiler source code is in the repository, nobody can build it; they don't have a compiled version of the compiler that they can run, so they cannot compile the compiler's source code. Moreover, the latest features of the language are immediately used in the compiler source itself, so that something close to the latest version of the compiler is always required to build it: a month-old compiler executable obtained separately will not compile the current code, because the code uses language features that didn't exist a month ago. In this situation, the compiled version of the compiler almost certainly has to be checked into the repository and kept up to date.
