A simple way to detect unused files in a project using git

After finding a few images that were checked into our project’s repository but not referenced anywhere in the project, I wanted to write a script to quickly check whether there were any other unused assets.

This was a one-off script, so it probably won’t suit everyone’s needs, but here’s how we approached the problem:

First, we needed a list of the files that git was tracking in our image directory. You could use ls for that, but I wanted to be sure that we wouldn’t list any files that git was ignoring, so we started with git ls-files. Called as git ls-files ./img, its output will look something like this:

img/foo.png
img/bar.png

(For the sake of the example, we’ll assume that foo.png is referenced in the project and bar.png isn’t.)
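To see how git ls-files differs from ls, here is a minimal sketch in a throwaway repository. The scratch.tmp file and the *.tmp ignore rule are assumptions invented for the demo; they are not part of our project.

```shell
# Throwaway repository; every path here is made up for the demo.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
mkdir img
touch img/foo.png img/bar.png img/scratch.tmp
echo '*.tmp' > .gitignore
git add .

# ls reports every file on disk, including the ignored one.
ls_output=$(ls img)

# git ls-files reports only the paths git is tracking.
tracked=$(git ls-files ./img)

printf 'ls:\n%s\n\ngit ls-files:\n%s\n' "$ls_output" "$tracked"
```

Here scratch.tmp shows up under ls but not under git ls-files, which is the behavior we were after.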

The next thing we want to do is see whether those filenames are referenced anywhere in the code. At this point, I wasn’t sure if they would be referenced by a relative or an absolute path, so I decided to search for just the filename, e.g. foo.png. I like to check my work with an intermediate command, so the next command we tried out was

for FILE in $(git ls-files ./img); do
  # basename prints its result itself, so no echo is needed
  basename "$FILE"
done

(basename gives you everything after the last slash of its input — in this case, just the raw filename.) And when we ran that command, we saw the expected output:

foo.png
bar.png
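One caveat with looping over $(git ls-files ...) is that the shell splits the list on whitespace, so a filename containing a space would be handled as two separate words. A sketch of a variant that reads one path per line instead is below. The file "my logo.png" is a made-up example, and this version still assumes that filenames contain no newlines or other characters that git would quote in its output.

```shell
# Throwaway repository with a hypothetical filename that contains a space.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
mkdir img
touch img/foo.png "img/my logo.png"
git add .

# Read one path per line rather than word-splitting the whole list.
names=$(git ls-files ./img | while IFS= read -r FILE; do
  basename "$FILE"
done)

printf '%s\n' "$names"
```

For a one-off script on a directory with tidy filenames, the simpler for loop is probably fine.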

Now that we know we are correctly extracting the desired part of the path, we can check whether that filename is referenced anywhere in the code. git grep works much like regular grep, but it searches only tracked files in the working tree (when you call it without a commit-ish), so we don’t have to worry about excluding the .git directory or .gitignored files.
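As a rough illustration of the tracked-files-only behavior, the sketch below builds a throwaway repository in which an ignored file mentions bar.png. The notes.tmp file and the *.tmp ignore rule are assumptions for the demo. It also passes -F so that the dot in the filename is matched literally rather than as a regular-expression wildcard, and -q so that the result is reported only through the exit status.

```shell
# Throwaway repository; notes.tmp is a hypothetical ignored file.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
mkdir src
echo '<img src="../img/foo.png"/>' > src/index.html
echo 'todo: delete bar.png' > notes.tmp
echo '*.tmp' > .gitignore
git add .

# -F: fixed-string match; -q: no output; -e: marks the pattern.
foo_status=0; git grep -q -F -e 'foo.png' || foo_status=$?
bar_status=0; git grep -q -F -e 'bar.png' || bar_status=$?

# Plain grep pointed at the ignored file does find the string.
grep_status=0; grep -q -F -e 'bar.png' notes.tmp || grep_status=$?

echo "git grep foo.png: $foo_status"
echo "git grep bar.png: $bar_status"
echo "grep bar.png in notes.tmp: $grep_status"
```

git grep reports a match for foo.png and none for bar.png, even though the ignored notes.tmp contains that name.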

If we call git grep foo.png manually, we will see some output like

src/index.html: <img src="../img/foo.png"/>

and git grep bar.png will have no output. What we care about, though, is not the output but the exit status: git grep exits with a non-zero status when it finds no matches. So let’s run our command again and verify that we will only remove the expected files: