If the file was no more than 12 blocks long, then the block numbers of all
its data are stored in the inode: you can read them directly out of the
stat output for the inode. Moreover, debugfs has a command (`dump') which
performs this task automatically. To take the example we had before, repeated
here:

With either debugfs or fsgrab, there will be some garbage at the end
of /mnt/recovered.000, but that's fairly unimportant. If you want to
get rid of it, the simplest method is to take the Size field from the
inode, and plug it into the bs option in a dd command line:

# dd count=1 if=/mnt/recovered.000 of=/mnt/resized.000 bs=6065

Of course, it is possible that one or more of the blocks that made up your file
has been overwritten. If so, then you're out of luck: that block is gone
forever. (But just imagine if you'd unmounted sooner!)
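If you want to convince yourself that dd really does stop after exactly bs
bytes when count=1, here is a throwaway sketch (the /tmp paths are my own
invention, not part of the recovery session):

```shell
# Make a scratch file of 8192 bytes, then pretend the inode's Size was 6065.
dd if=/dev/zero of=/tmp/recover-demo.raw bs=1024 count=8 2>/dev/null
dd count=1 if=/tmp/recover-demo.raw of=/tmp/recover-demo.trimmed bs=6065 2>/dev/null
# The output file is exactly 6065 bytes: one read of bs bytes, then dd stops.
wc -c < /tmp/recover-demo.trimmed
```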

The problems appear when the file has more than 12 data blocks. It pays
here to know a little about how UNIX file systems are structured. The file's
data is stored in units called `blocks'. These blocks are numbered
sequentially. A file also has an `inode', which is the place where information
such as owner, permissions, and file type is kept. Like blocks, inodes are
numbered sequentially, although they belong to a separate sequence. A
directory entry consists of the name of the file and an inode number.
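You can watch this name-to-inode mapping with everyday tools; the following
sketch (scratch directory invented for illustration, and using GNU stat) shows
that two directory entries can name one and the same inode:

```shell
rm -rf /tmp/inode-demo && mkdir /tmp/inode-demo && cd /tmp/inode-demo
echo hello > original
ln original alias            # a second directory entry for the same inode
ls -i original alias         # both names list the same inode number
stat -c '%i %h' original     # inode number, and a link count of 2
```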

But with this state of affairs, it is still impossible for the kernel to find
the data corresponding to a directory entry. So the inode also stores the
location of the file's data blocks, as follows:

The block numbers of the first 12 data blocks are stored directly in the
inode; these are sometimes referred to as the direct blocks.

The inode contains the block number of an indirect block. An
indirect block contains the block numbers of 256 additional data blocks.

The inode contains the block number of a doubly indirect block. A
doubly indirect block contains the block numbers of 256 additional indirect
blocks.

The inode contains the block number of a triply indirect block. A
triply indirect block contains the block numbers of 256 additional doubly
indirect blocks.
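These four levels put a hard ceiling on how big a file can be. With 1024-byte
blocks, each indirect block holds 256 block numbers (blocksize/4, as discussed
below), so a quick back-of-the-envelope computation:

```shell
# Addressable data blocks at each level, with 256 block numbers per
# indirect block (1024-byte blocks, 4-byte block numbers):
blocks=$((12 + 256 + 256*256 + 256*256*256))
echo "$blocks"                  # 16843020 data blocks in all
echo "$((blocks / 1024)) MiB"   # about 16 GiB: the scheme's ceiling
```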

Read that again: I know it's complex, but it's also important.

Now, the kernel implementation for all versions up to and including 2.0.36
unfortunately zeroes all indirect blocks (and doubly indirect blocks, and so
on) when deleting a file. So if your file was longer than 12 blocks, you
have no guarantee of being able to find even the numbers of all the blocks
you need, let alone their contents.

The only method I have been able to find thus far is to assume that the file
was not fragmented: if it was, then you're in trouble. Assuming that the file
was not fragmented, there are several layouts of data blocks, according to how
many data blocks the file used:

0 to 12

The block numbers are stored in the inode, as described above.

13 to 268

After the direct blocks, count one for the indirect block, and
then there are 256 data blocks.

269 to 65804

As before, there are 12 direct blocks, a (useless) indirect
block, and 256 data blocks. These are followed by one (useless) doubly
indirect block, and 256 repetitions of one (useless) indirect block and 256
data blocks.

65805 or more

The layout of the first 65804 blocks is as above. Then
follow one (useless) triply indirect block and 256 repetitions of a `doubly
indirect sequence'. Each doubly indirect sequence consists of a (useless)
doubly indirect block, followed by 256 repetitions of one (useless) indirect
block and 256 data blocks.
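The range boundaries above follow directly from the scheme: 12 direct blocks,
then 256 data blocks per indirect block, and so on. Checking the arithmetic:

```shell
echo $((12 + 256))             # 268: biggest file using only one indirect block
echo $((12 + 256 + 256*256))   # 65804: biggest file stopping at the doubly indirect level
```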

Of course, even if these assumed data block numbers are correct, there is no
guarantee that the data in them is intact. In addition, the longer the file
was, the less chance there is that it was written to the file system without
appreciable fragmentation (except in special circumstances).

You should note that I assume throughout that your blocksize is 1024 bytes, as
this is the standard value. If your blocks are bigger, some of the numbers
above will change. Specifically: since each block number is 4 bytes long,
blocksize/4 is the number of block numbers that can be stored in each indirect
block. So every time the number 256 appears in the discussion above, replace
it with blocksize/4. The `number of blocks required' boundaries will also have
to be changed.
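For concreteness, here is the substitution worked out for the common ext2
block sizes:

```shell
# Block numbers are 4 bytes each, so blocksize/4 of them fit in one block.
for bs in 1024 2048 4096; do
    echo "blocksize $bs: $((bs / 4)) block numbers per indirect block"
done
```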

There seems to be a reasonable chance that this file is not fragmented:
certainly, the first 12 blocks listed in the inode (which are all data blocks)
are contiguous. So, we can start by retrieving those blocks:

# fsgrab -c 12 -s 8314 /dev/hda5 > /mnt/recovered.001

Now, the next block listed in the inode, 8326, is an indirect block, which we
can ignore. But we trust that it will be followed by 256 data blocks (numbers
8327 through 8582).

# fsgrab -c 256 -s 8327 /dev/hda5 >> /mnt/recovered.001

The final block listed in the inode is 8583. Note that we're still looking
good in terms of the file being contiguous: the last data block we wrote out
was number 8582, which is 8327 + 255. This block 8583 is a doubly indirect
block, which we can ignore. It is followed by up to 256 repetitions of an
indirect block (which is ignored) followed by 256 data blocks. So doing the
arithmetic quickly, we issue the following commands. Notice that we skip the
doubly indirect block 8583, and the indirect block 8584 immediately (we hope)
following it, and start at block 8585 for data:

# fsgrab -c 256 -s 8585 /dev/hda5 >> /mnt/recovered.001
# fsgrab -c 256 -s 8842 /dev/hda5 >> /mnt/recovered.001
# fsgrab -c 256 -s 9099 /dev/hda5 >> /mnt/recovered.001
# fsgrab -c 256 -s 9356 /dev/hda5 >> /mnt/recovered.001
# fsgrab -c 256 -s 9613 /dev/hda5 >> /mnt/recovered.001
# fsgrab -c 256 -s 9870 /dev/hda5 >> /mnt/recovered.001
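Since each run is 256 data blocks followed by one indirect block, successive
starting blocks are 257 apart. Assuming (as we are throughout) a perfectly
contiguous file, a small loop generates the starting block of each run:

```shell
start=8585
for run in 1 2 3 4 5 6; do
    echo "fsgrab -c 256 -s $start"   # one run of 256 data blocks
    start=$((start + 257))           # skip the indirect block after each run
done
```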

Adding up, we see that so far we've written 12 + (7 * 256) blocks, which is
1804. The `stat' results for the inode gave us a `blockcount' of 3616;
unfortunately these blocks are 512 bytes long (as a hangover from UNIX), so we
really want 3616/2 = 1808 blocks of 1024 bytes. That means we need only four
more blocks. The last data block written was number 10125. As we've been
doing so far, we skip an indirect block (number 10126); we can then write those
last four blocks:

# fsgrab -c 4 -s 10127 /dev/hda5 >> /mnt/recovered.001
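The arithmetic of that last step can be checked mechanically, using the
numbers from the example above:

```shell
written=$((12 + 7 * 256))       # data blocks recovered so far
echo "$written"                 # 1804
wanted=$((3616 / 2))            # stat's blockcount is in 512-byte units
echo "$wanted"                  # 1808
echo "$((wanted - written))"    # 4 blocks still to fetch
echo "$((10125 + 2))"           # 10127: skip indirect block 10126, resume here
```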