For a distributed team that uses Git and GitHub for version control, should images also be stored in the Git repository?

For the most part, the images won't be changed. The folder containing them will only grow as images are added. My concern is that it may become very large over time, either because individual images are large or simply because there are a lot of them.

Is this considered a best practice? What alternatives are there for sharing binary files needed in projects, in a way that a distributed team can easily access?

When you say "images", are we talking about 26 MB DSLR raw files, 1 MB 3D game textures, or <100 KB PNG icons? (I was going to answer "it depends" but I'll refrain.)
– Brook, Jun 2 '11 at 1:50

@Brook: I sort of assumed we were talking icons or small graphic elements for websites. Game textures, graphic design raw files or precise graphics for documentation editing might be a different story, you're right.
– haylem, Jun 2 '11 at 1:55

It should really be for small/medium-sized web-friendly images. A concern is that some dev-signers will start sticking every large original image in there, when I think those should probably go somewhere else.
– spong, Jun 2 '11 at 2:28

9 Answers

Are your images original work, or can they be recovered (guaranteed?) from elsewhere? Are they needed to ship a software unit built from source?
If they are original, they need backing up. Put them in your revision control: if they never change, the space penalty is the same as a backup, and they are where you need them.

Can they be edited to change the appearance of the software, accidentally or intentionally?
Yes? Then they MUST be revision controlled somehow. Why use another way when you already have a perfect solution? Why introduce "copy and rename" version control from the dark ages?

I have seen an entire project's original artwork go "poof" when the graphic designer's MacBook hard drive died, all because someone, in their infinite wisdom, decided that "binaries don't belong in rev control", and graphic designers (at least this one) don't tend to be good with backups.

Same applies to any and all binary files that fit the above criteria.

The only reason not to is disk space, and at $100/terabyte I am afraid that excuse is wearing a bit thin.

BTW: The Internet is NOT a reliable source. If you downloaded an image from "bobsfreestuff.com", it probably will not be there next week.
– mattnz, Jun 2 '11 at 1:59

+1 - and should be +more. The point of version control is to allow you to recover / roll back to stuff, whatever the stuff might be, AT SOME PAST TIME. The only way to be 100% sure that you can get back what was supposed to be there at that point in time is to put EVERYTHING under version control. That's source, images, resources, helpful/supporting PDFs. Heck, I even put zipped CD images in. I have even been known to put a virtual machine (including the VMDK) into source control. Seems extreme? Saved my bacon 2 years later.
– quickly_now, Jun 2 '11 at 4:06

100% agree. If images are part of the software, they need to be revision controlled.
– Dean Harding, Jun 2 '11 at 8:44

The only reason I would disagree would be if it made your repo cumbersome to clone, to the point where developers had to actually think "do I really want to take the time to clone this, or can I just do X in this other branch". If this occurs, make sure things get reorganized very quickly.
– Brook, Jun 2 '11 at 13:29

+1 for the point about needing it for deploy. If I clone your repo, because I'm a new team member or something, then it should work out of the box. This does include having a makefile equivalent clever enough to fetch any required third-party libraries.
– Spencer Rathbun, Mar 6 '12 at 20:14

Why the hell not? :)

Storing binaries is considered bad practice, yes, but I never worried too much about images.

Worst case, if you have tons, store them somewhere else or use externals or an extension for binary support. And if the images won't be changed that often, then where's the problem? You won't get a big fat delta. And if they get removed over time, it's only your server that suffers a bit from storing the history, but clients won't see a thing.

In my opinion, you shouldn't worry about it, provided you don't store gigabytes of them.

What you could do though, is only store "source" images: SVGs, LaTeX macros, etc... and have the final images generated by your build system.
That's probably even better, if you can. If not, then don't bother.
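
A minimal sketch of what such a build step might look like, in Python. The names `stale_sources` and `build_images` and the `render` callback are made up for illustration; `render` stands in for whatever converter the build actually uses (an SVG rasterizer, a LaTeX run, and so on). It only regenerates outputs that are missing or older than their source, in the spirit of a make rule.

```python
from pathlib import Path
from typing import Callable, List


def target_for(src: Path, src_dir: Path, out_dir: Path, out_suffix: str) -> Path:
    """Map a source file to its generated counterpart, keeping the subfolder layout."""
    return out_dir / src.relative_to(src_dir).with_suffix(out_suffix)


def stale_sources(src_dir: Path, out_dir: Path, out_suffix: str = ".png") -> List[Path]:
    """Return SVG sources whose generated image is missing or older than the source."""
    stale = []
    for src in sorted(src_dir.rglob("*.svg")):
        target = target_for(src, src_dir, out_dir, out_suffix)
        if not target.exists() or target.stat().st_mtime < src.stat().st_mtime:
            stale.append(src)
    return stale


def build_images(
    src_dir: Path,
    out_dir: Path,
    render: Callable[[Path, Path], None],
    out_suffix: str = ".png",
) -> List[Path]:
    """Regenerate only the stale images; return the list of files written.

    `render(source, target)` is a hypothetical hook: plug in the real converter.
    """
    built = []
    for src in stale_sources(src_dir, out_dir, out_suffix):
        target = target_for(src, src_dir, out_dir, out_suffix)
        target.parent.mkdir(parents=True, exist_ok=True)
        render(src, target)
        built.append(target)
    return built
```

With something like this, only the source folder needs to live in the repository, and the output folder can be ignored by the VCS.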

(All that being said, Git shines for text files, but is not the best VCS for pictures. Give us more context and metrics if you can)

+1 for storing the source, but if they can do development testing without a full build then that might mess it up. That also means you would need to build all the images before starting work in the morning.
– TheLQ, Jun 2 '11 at 1:56

@TheLQ: I guess, but then maybe you should have cascading builds, where your downstream (test) builds can only rely on upstream builds (the actual build). And then export these to a public folder for re-use by testers locally. That implies a bit of infrastructure, obviously, but that would be my way of doing things in a relatively sizable team.
– haylem, Jun 2 '11 at 2:06

The whole "don't store binaries in source control" is set forth for a specific reason: If you have source code that compiles, don't store the actual compilation, but just the source code. Images and visual assets do not have a "source," so they should be tracked in version control.

I believe the recommended way with Git is to use a submodule (introduced in Git 1.5.3), which is basically a separate repository associated with the main one. You store your images (and other binary assets) in the submodule. It can then be checked out along with the main repository or left out, depending on what is required.

"Git's submodule support allows a
repository to contain, as a
subdirectory, a checkout of an
external project. Submodules maintain
their own identity; the submodule
support just stores the submodule
repository location and commit ID, so
other developers who clone the
containing project ("superproject")
can easily clone all the submodules at
the same revision. Partial checkouts
of the superproject are possible: you
can tell Git to clone none, some or
all of the submodules."
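
As a rough sketch of the workflow the quote describes, the Git commands involved can be assembled like this. The Python helper names are hypothetical, and the URLs and paths in the example are placeholders. The Git subcommands themselves (`git submodule add`, `git clone --recurse-submodules`, `git submodule update --init --recursive`) exist in current Git, though very old releases may lack some of these options. By default nothing is executed; the commands are only printed.

```python
import shlex
import subprocess
from typing import List, Optional


def add_submodule_cmd(url: str, path: str) -> List[str]:
    """Command to attach an assets repository to the superproject at `path`."""
    return ["git", "submodule", "add", url, path]


def clone_with_submodules_cmd(url: str, dest: str) -> List[str]:
    """Command to clone the superproject together with all of its submodules."""
    return ["git", "clone", "--recurse-submodules", url, dest]


def init_submodules_cmd() -> List[str]:
    """Command to fetch submodules later, after a plain clone that skipped them."""
    return ["git", "submodule", "update", "--init", "--recursive"]


def run(cmd: List[str], cwd: Optional[str] = None, dry_run: bool = True) -> str:
    """Return the command as a shell-style string; execute it only if dry_run is False."""
    if not dry_run:
        subprocess.run(cmd, cwd=cwd, check=True)
    return shlex.join(cmd)


if __name__ == "__main__":
    # Placeholder URLs: substitute the real repository locations.
    print(run(add_submodule_cmd("https://example.com/team/assets.git", "assets")))
    print(run(clone_with_submodules_cmd("https://example.com/team/app.git", "app")))
    print(run(init_submodules_cmd()))
```

Someone who does not need the images can do a plain clone and skip the last step, which is the "clone none, some or all" choice mentioned in the quote.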

Also, size shouldn't be a significant issue if the images don't change often. You can also run commands to prune/reduce size, such as:

Let's say you release software version 1.0. For version 2.0 you decide to redo all the pictures with shadows. So you do this, and release 2.0. Then some customer who is using 1.0 and cannot upgrade to 2.0 decides they want the program in another language. They give you $1G to do it, so you say sure. But in a different culture, some of your pictures do not make sense, so you have to change them...

If you keep your images in source control, this is easy: starting from 1.0, you make changes to the images (among other things), build, and release. If you did not have them in source control, you would have a much harder time, since you would first have to find the old images, then change them, and then build.

If it is part of the project, it has to be in the VCS. How best to achieve this may depend on the VCS, or on how you organize the project. Maybe a repo for the designers, and only the results in the coders' repo, or only the 'image sources' (I once had a project with only an .svg file, and the images were generated via make and the Inkscape CLI).

But if a VCS cannot handle that, or becomes unusable, I would say that it is not the right tool for your job.

So far, I have had no problems with putting 'usual' amounts of graphics (mockups, concepts, and page graphics) for web projects in Git.

Git is very good with text files, but by its very nature isn't too hot with binaries. You will have issues with the size of the data transferred when you clone or push, your .git directories will grow, and you could get in a right mess with merging (i.e., how do you merge two images?).

One answer is to use submodules, which weakens the link between your project and the images. You won't have to manage the images as if they were part of your source, yet they stay under version control, and you don't have to worry about branching them. This assumes the subproject is just a 'flat' repository of data that doesn't go through the same churn as the rest of the code during normal development.

The other answer is to put them in a different project, never branch it, and ensure that everyone who commits to that project pushes upstream immediately, so that two people never change the same version of a file. You'll find this the most difficult aspect, as Git isn't designed for such a non-distributed workflow. You'll have to use old-fashioned communication methods to enforce this rule.

A third answer is to put them in a different SCM entirely that is better geared to working with images.

Adding to @haylem's answer, note that size plays a large factor in this. Depending on the VCS, it might not work well with tons of images. When clones or large pushes start taking all night, it's really too late, as all the images are already in your repository.

Plan for large pictures and future growth. You don't want to get two years into this project and have an "oh crap, maybe the repo is a little too big" moment.
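
One way to keep an eye on growth is a small audit script run from time to time (or from a hook). This is only a sketch: the function name `audit_images`, the suffix list, and the 1 MB default threshold are my own assumptions, not anything Git provides, so adjust them to the project.

```python
from pathlib import Path
from typing import Dict, List, Tuple

# Assumed set of extensions to treat as images; extend as needed.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".svg", ".psd"}


def audit_images(
    root: Path, max_bytes: int = 1_000_000
) -> Tuple[Dict[str, int], List[Tuple[Path, int]]]:
    """Walk the working tree and report image sizes.

    Returns (total bytes per lowercase suffix, list of (path, size) for files
    larger than max_bytes). The .git directory itself is skipped.
    """
    totals: Dict[str, int] = {}
    oversized: List[Tuple[Path, int]] = []
    for path in sorted(root.rglob("*")):
        if ".git" in path.relative_to(root).parts:
            continue
        suffix = path.suffix.lower()
        if not path.is_file() or suffix not in IMAGE_SUFFIXES:
            continue
        size = path.stat().st_size
        totals[suffix] = totals.get(suffix, 0) + size
        if size > max_bytes:
            oversized.append((path, size))
    return totals, oversized


if __name__ == "__main__":
    totals, oversized = audit_images(Path("."))
    for suffix, total in sorted(totals.items()):
        print(f"{suffix}: {total} bytes")
    for path, size in oversized:
        print(f"large file: {path} ({size} bytes)")
```

Note that this measures the working tree only, not the history stored under .git, so it is an early-warning signal rather than a measure of clone size.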

Your answer is somewhat irrelevant, as the question is specific to git. Do you happen to know if size plays a large (or any) factor for git repositories?
– Yannis♦, Jun 2 '11 at 2:02

@Yannis Must have missed that first sentence... AFAIK, Git is better with larger repositories, but the size issue is still relevant, as gargantuan clones or pushes are an issue.
– TheLQ, Jun 2 '11 at 2:42

With Git it is trivially easy to rearrange repositories, create partial clones, etc., if this happens to become a problem. Don't confuse the historical molasses of revision control tools from decades ago with those of today.
– mattnz, Jun 2 '11 at 3:43

I definitely agree that storing them is technically and economically feasible. The question I would ask is: "Are these images part of the shipping product, or part of the content of a shipping product?" Not that you can't store content in Git (or any other VCS), but that is a separate problem for a separate VCS.