Description

How Subversion saves disk space

To keep the repository small,
Subversion uses deltification (or
“deltified storage”) within the repository
itself. Deltification involves encoding the representation
of a chunk of data as a collection of differences against
some other chunk of data. If the two pieces of data are
very similar, this deltification results in storage savings
for the deltified chunk—rather than taking up space
equal to the size of the original data, it takes up only
enough space to say, “I look just like this other
piece of data over here, except for the following couple of
changes”. As a result, most of the bulky repository
data (namely, the contents of versioned files) is stored
at a much smaller size than its original
“fulltext” representation. For repositories
created with Subversion 1.4 or later, the space savings are
even better, because the fulltext representations of file
contents are themselves compressed.
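
The idea of storing one chunk as differences against another can be
sketched in a few lines of Python. This is only an illustration of
the general technique, not Subversion's actual delta format or
algorithm: the function names make_delta, apply_delta, and
delta_payload_size are hypothetical, and the matching is done with
the standard library's difflib purely for convenience.

```python
from difflib import SequenceMatcher


def make_delta(base: bytes, target: bytes) -> list:
    """Encode target as copy/insert instructions against base (illustrative only)."""
    ops = []
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # "I look just like this region of the other chunk."
            ops.append(("copy", i1, i2 - i1))
        elif tag in ("replace", "insert"):
            # New bytes that the base cannot supply.
            ops.append(("insert", target[j1:j2]))
        # "delete": bytes of base that are simply not copied.
    return ops


def apply_delta(base: bytes, ops: list) -> bytes:
    """Rebuild the full text from the base chunk plus the delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, start, length = op
            out += base[start:start + length]
        else:
            out += op[1]
    return bytes(out)


def delta_payload_size(ops: list) -> int:
    """Count only the literal bytes carried by the delta (ignores instruction overhead)."""
    return sum(len(op[1]) for op in ops if op[0] == "insert")


# Two very similar "revisions" of a small file (made-up sample data).
base = b"".join(b"line %d of the file\n" % i for i in range(60))
target = base.replace(b"line 30 of", b"LINE 30 in", 1)

delta = make_delta(base, target)
print(len(target), delta_payload_size(delta))
```

When the two chunks are nearly identical, the delta carries only a
handful of literal bytes plus a few copy instructions, which is
where the storage savings come from.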

Note

Because all of the data that is subject to
deltification in a BDB-backed repository is stored in a
single Berkeley DB database file, reducing the size of the
stored values will not immediately reduce the size of the
database file itself. Berkeley DB will, however, keep
internal records of unused areas of the database file, and
consume those areas first before growing the size of the
database file. So while deltification doesn't produce
immediate space savings, it can drastically slow future
growth of the database.
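
The behavior described in this note can be pictured with a toy
model. The sketch below is not Berkeley DB code and ignores page
sizes and fragmentation. The class ToyDatabaseFile and its methods
are hypothetical, and it assumes only what the note states: the
file never shrinks, and unused areas are consumed before the file
grows.

```python
class ToyDatabaseFile:
    """Toy model of a database file that reuses freed space before growing."""

    def __init__(self):
        self.file_size = 0  # bytes occupied on disk
        self.free = 0       # unused bytes inside the file

    def store(self, nbytes: int) -> None:
        """Store a new value, consuming free areas first."""
        reused = min(self.free, nbytes)
        self.free -= reused
        self.file_size += nbytes - reused

    def shrink_value(self, old_size: int, new_size: int) -> None:
        """Replace a stored value with a smaller (deltified) one."""
        # The file itself does not shrink; the difference becomes free space.
        self.free += old_size - new_size


db = ToyDatabaseFile()
db.store(1000)             # a fulltext value
db.shrink_value(1000, 100) # deltified: no immediate change in file size
size_after_deltify = db.file_size
db.store(500)              # fits entirely in the freed area
size_after_reuse = db.file_size
db.store(600)              # uses the remaining free area, then grows the file
print(size_after_deltify, size_after_reuse, db.file_size)
```

In this model, deltifying a value leaves the file size unchanged,
and later writes fill the freed area before the file has to grow.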