I know a lot of software shops keep binaries under source control. However, our shop has come to store entire frameworks in the repository: the DirectX runtime, CUDA, NVIDIA OptiX, whatever.

It is said to make setting up a dev machine easier (supposedly: get latest and start coding). However, it considerably bloats the repository and burdens it with irrelevant history.

I've never seen such a usage pattern. Do you consider it a good practice?

[EDIT:] I have no problem with source-controlling isolated 3rd-party binaries. The question refers to entire framework runtimes, typically consisting of 10+ binaries. As an extreme example, take the Windows SDK (which we don't keep in the repository, thank god, but I see no difference in principle).



You could make a script which downloads the latest or the relevant framework runtimes, and put that script under version control.
– Basile Starynkevitch, Jan 3 '12 at 6:57


Why do you think that it burdens [the repository] with irrelevant history? More specifically, why do you think that history is irrelevant? If your code references those frameworks and libraries, it is very helpful to know which versions were being used at a particular revision in the repository.
– James McNellis, Jan 3 '12 at 7:03

I agree about the bloat (especially if those runtimes are shared between multiple projects), but I don't understand the part about irrelevant history. A change, for example, in which version of a runtime you are using is quite relevant...
– 6502, Jan 3 '12 at 7:03

Wouldn't it be easier to create images of the developer machine as updates come out?
– Ramhound, Jan 4 '12 at 15:31

6 Answers

Binaries are generally not well suited for a version control system because:

they don't benefit from version control features (merge, diff)

they increase the size of the repo...

... which matters because you can't easily remove a version from a VCS (VCSs are made to keep history), as opposed to artifact repositories like Nexus, which are simple shared directories (easy to clean up: cd + rm!)

they should be referenced in a VCS as text: a declaration of path+version. You can keep relevant history that way, recording any change of binary version as a change in that text file (like a pom.xml if you are using Nexus, for instance)
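As a minimal sketch of that last point, a checked-in text manifest could pin each runtime to a version, while the binaries themselves live in the artifact repository. The file format, names, and URL layout below are assumptions for illustration, not any standard:

```python
# A hypothetical deps.txt, checked into the VCS, might contain:
#   directx-runtime=9.29.1962
#   cuda-toolkit=4.1
#   optix=2.5
# The binaries themselves stay in Nexus (or a share), keyed by name+version.

def parse_manifest(text):
    """Parse 'name=version' lines into a dict; skip blanks and comments."""
    deps = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, version = line.partition("=")
        deps[name.strip()] = version.strip()
    return deps

def artifact_url(repo_base, name, version):
    """Build the download URL for one dependency (layout is hypothetical)."""
    return f"{repo_base}/{name}/{version}/{name}-{version}.zip"
```

Any upgrade of a runtime then shows up in history as a one-line diff in the manifest, rather than a multi-megabyte binary change.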

I'm okay with version controlling binary assets. I'm against version controlling generated files.

Also, environment setup is something different from development. We mostly develop in Python, and it has a tool called virtualenv which lets you create a lightweight isolated environment (including libraries) for a project. When we check out our sources, there is a setup script in there which builds this virtualenv. A manifest specifies which versions of libraries are needed, and other such things. None of this is version controlled. Only the setup script is.
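A setup script of that kind could look roughly like this sketch, which uses the stdlib venv module rather than the virtualenv tool itself; the file names and the env directory are assumptions:

```python
# Checkout-time setup: build an isolated environment and install pinned
# libraries from a manifest. Only this script (and the manifest) would be
# version controlled -- the libraries themselves are not.
import subprocess
import sys
from pathlib import Path

def bootstrap(env_dir="env", manifest="requirements.txt",
              run=subprocess.check_call):
    """Create the project's virtualenv and install its pinned libraries."""
    env = Path(env_dir)
    if not env.exists():
        # build the environment with the stdlib venv module
        run([sys.executable, "-m", "venv", str(env)])
    # pip lives in Scripts/ on Windows, bin/ elsewhere
    pip = env / ("Scripts" if sys.platform == "win32" else "bin") / "pip"
    run([str(pip), "install", "-r", manifest])
```

The `run` parameter is injected only so the script is easy to dry-run or test; by default it executes the commands for real.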

Throwing the whole framework under your main project will clutter your history and seriously mess things up. It's not part of your project and should be treated differently.

It is generally a good idea to do configuration management with framework versions. If your code needs a specific DirectX version, that version should be easily available, and if you check out an older version of your software, it should be easy to determine which external dependencies it has.

What I don't think is a good idea here is to use your typical version control system for storing those binaries. In our company, we store every version of frameworks, libraries, and external tools in a subfolder structure on a network drive. Where we think it makes sense, we have readme files to document which tool version belongs to which software version, or, if possible, scripts for installing or using a specific version. Only those readme files and scripts go into version control.
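A version-controlled script along those lines might do little more than resolve pinned tool versions to folders on the share. The share layout and the pin format here are assumptions, not the answerer's actual setup:

```python
# Hypothetical layout of the tools share:
#   <root>/<tool>/<version>/...
# Only this script and the readme files live in version control; the
# binaries stay on the network drive.
from pathlib import PurePosixPath

def resolve_tools(root, pins):
    """Map each pinned tool (name -> version) to its folder on the share."""
    return {tool: str(PurePosixPath(root) / tool / version)
            for tool, version in pins.items()}
```

Checking out an old revision of the software then also checks out the matching pins, so the right tool folders can be located again years later.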

We also keep older versions of tools and libraries as long as we think there is a chance we will have to rebuild an older version of our software that depends on them. That way, we can delete some of the very old libs & tools from our network drive once they are deprecated (of course, keeping archives on external media, just in case).

I think that the binaries should be stored somewhere. I would suggest storing them outside of the repository, especially if they are big and lead to long checkout times. I'm not going to say it's poor practice, but it's also not one I've seen.

This may actually be a good idea if your organization has lots of projects that target different runtime versions. It ensures that you have the right runtime binaries when working on each project.

I personally consider this a very bad practice. I prefer setting up a wiki with installation instructions and uploading necessary binaries to it. These files are needed only by new devs, there's no need to bloat repositories of everyone else.

There is a good reason for doing this, namely that you have all you need in a single location, without any external dependencies.

This is much more important than you may think. It essentially ensures that you do not rely on an artifact on a vendor server which may go away after a few years, since you have everything in-house.

Regarding repository bloat. This is only a problem if your VCS keeps a full local copy (git does this, cvs doesn't) since cloning and/or updating will be slow. In return you will have copies on each development machine, which may save your company if your central backup scheme for some reason fails some day.

It is a matter of priority or policy. As long as the decision is deliberate, I would be fine with this.

If you store your third-party libraries in an artifact repository like your own Nexus or NuGet server, you won't have to fear a "vendor" server going away. So by all means, store them locally. Just don't use a VCS; VCSs are not meant to keep that kind of file.
– VonC, Jan 4 '12 at 7:33

@VonC It depends on your infrastructure and working methodology. For "just store a single binary once", a VCS can be just as fine as a full artifact repository. It also keeps infrastructure simple. I am not advocating it; the OP asked whether anybody had seen such a usage pattern.
– user1249, Jan 4 '12 at 7:39

OK, I have seen such a usage pattern, and I had to administer such repos (mainly centralized VCSs like SVN or ClearCase; I've never seen that usage for a DVCS like Git). It is generally a mess. Regarding vendor libraries, you rarely just store a single binary once: you have to deal with patches, lots of them. Plus, storage isn't the only goal (if it were, a VCS might arguably be a potential solution); dependency management is. And that is what Nexus or NuGet offer, in addition to storage that is easy to administer and clean up.
– VonC, Jan 4 '12 at 7:45