Using Go to Recover JPEGs from a Forensic Image

CS50 is an introductory computer science course offered by Harvard to students at the college and online via edX and the Harvard Extension School. It teaches largely through a series of problem sets, most of which are solved in C. One of the more interesting is Problem Set 4: Forensics.

The Problem

The original problem was solved in C as part of the course; here it will be solved using Go. Before looking at any code, it is important to give a little context (all code is available on GitHub for those who are interested).

Key characteristics of the problem have been outlined below:

What we have:

A forensic image of a camera's SD card.

What to attempt:

Recover 16 deleted images from the forensic image, naming them 001.jpg through 016.jpg.

Points of note:

The card was wiped with zeros before the images were taken and deleted. (This leaves the slack space in the forensic image full of zeros, which simplifies recovery.)

The card stores data in contiguous 512-byte blocks. If an image is 500 bytes it will still occupy one whole block (with 12 bytes of slack space left over).
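The block arithmetic can be sketched in a few lines of Go. The function names blocksNeeded and slackBytes are made up for this illustration; only the 512-byte block size comes from the problem description.

```go
package main

import "fmt"

// blockSize is the card's block size as stated in the problem description.
const blockSize = 512

// blocksNeeded (hypothetical name) returns how many whole blocks
// a file of the given size occupies, rounding up.
func blocksNeeded(size int) int {
	return (size + blockSize - 1) / blockSize
}

// slackBytes (hypothetical name) returns the unused bytes left
// in the file's final block.
func slackBytes(size int) int {
	return blocksNeeded(size)*blockSize - size
}

func main() {
	fmt.Println(blocksNeeded(500), slackBytes(500)) // prints "1 12"
	fmt.Println(blocksNeeded(513), slackBytes(513)) // prints "2 511"
}
```

A file that is one byte over a block boundary wastes almost a full block, which is why recovered images tend to end in a run of zeros.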

Trailing zeros in the recovered image are acceptable.

It is OK to hardcode the path of the file.

A Little Bit of Theory – Finding JPEGs

How can a JPEG file be identified in a forensic image?

As mentioned above, images are stored in blocks of 512 bytes, so whatever its actual size, an image always takes up a whole number of 512-byte blocks on the card. An image can therefore only start on a 512-byte boundary, so instead of reading the file byte by byte we can read it in 512-byte blocks and check the start of each block for the start of a file.
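One way to read block by block in Go is io.ReadFull, which fills a fixed-size buffer or reports that the input ran out. The sketch below is a rough illustration rather than the code from the original solution; readBlocks and countBlocks are hypothetical names, and a trailing partial block is assumed to be safe to ignore.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

const blockSize = 512

// readBlocks (hypothetical name) calls visit once for every complete
// 512-byte block in r. The buffer is reused, so visit must copy the
// block if it wants to keep it.
func readBlocks(r io.Reader, visit func(index int, block []byte)) error {
	buf := make([]byte, blockSize)
	for index := 0; ; index++ {
		_, err := io.ReadFull(r, buf)
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			return nil // clean end of input, or a short final block (ignored)
		}
		if err != nil {
			return err
		}
		visit(index, buf)
	}
}

// countBlocks (hypothetical name) counts the complete blocks in data.
func countBlocks(data []byte) (int, error) {
	count := 0
	err := readBlocks(bytes.NewReader(data), func(int, []byte) { count++ })
	return count, err
}

func main() {
	// Three zero-filled blocks plus ten stray bytes stand in for a card image.
	n, err := countBlocks(make([]byte, 3*blockSize+10))
	fmt.Println(n, err) // prints "3 <nil>"
}
```

For a real card image the reader would come from os.Open on the (hardcoded) path instead of an in-memory byte slice.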

Now how is the file identified in the 512-byte block?

Fortunately, while JPEG files are complex, their headers are simple. If the first 4 bytes of a 512-byte block match either sequence below, then that block is the start of a JPEG:

0xff 0xd8 0xff 0xe0

Or:

0xff 0xd8 0xff 0xe1

Of course, if the card had not been zero-wiped, this part would become a lot harder, as leftover data in slack space could also match the pattern, leading to false positives.
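Putting the theory together, a recovery pass might look roughly like the sketch below. It is not the original solution: isJPEGStart and splitImages are hypothetical names, the whole card is held in memory for brevity, and each image is assumed to run from its signature block up to the block before the next signature (or the end of the card), which is why trailing zeros end up in the output.

```go
package main

import "fmt"

const blockSize = 512

// isJPEGStart (hypothetical name) reports whether block begins with
// one of the two four-byte signatures listed above.
func isJPEGStart(block []byte) bool {
	if len(block) < 4 {
		return false
	}
	return block[0] == 0xff && block[1] == 0xd8 && block[2] == 0xff &&
		(block[3] == 0xe0 || block[3] == 0xe1)
}

// splitImages (hypothetical name) walks card one block at a time and
// returns the bytes of each image found, in order of appearance.
func splitImages(card []byte) [][]byte {
	var images [][]byte
	start := -1 // offset of the image being collected, -1 if none yet
	for off := 0; off+blockSize <= len(card); off += blockSize {
		if isJPEGStart(card[off : off+blockSize]) {
			if start >= 0 {
				images = append(images, card[start:off])
			}
			start = off
		}
	}
	if start >= 0 {
		end := len(card) - len(card)%blockSize // drop any partial last block
		images = append(images, card[start:end])
	}
	return images
}

func main() {
	// Synthetic card: one empty block, a two-block image, a one-block image.
	card := make([]byte, 4*blockSize)
	copy(card[1*blockSize:], []byte{0xff, 0xd8, 0xff, 0xe0})
	copy(card[3*blockSize:], []byte{0xff, 0xd8, 0xff, 0xe1})

	for i, img := range splitImages(card) {
		fmt.Printf("%03d.jpg: %d bytes\n", i+1, len(img))
	}
	// prints:
	// 001.jpg: 1024 bytes
	// 002.jpg: 512 bytes
}
```

The %03d verb produces the zero-padded 001.jpg style names the problem asks for; writing each slice to disk would be one os.WriteFile call per image.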

Coding a Solution in Go

There were a few interesting differences in Go compared to other languages I had worked in. Quite a few show up in even a short snippet, such as an if statement that checks a block for a JPEG header.
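As a rough stand-in, here is such an if statement written the way someone coming from C might first type it, with parentheses around the whole condition. The function name looksLikeJPEG is invented for this sketch and the length check is an added safety assumption.

```go
package main

import "fmt"

// looksLikeJPEG is a hypothetical helper, not taken from the original code.
func looksLikeJPEG(block []byte) bool {
	// C-style parentheses around the whole condition compile but are optional.
	// Each line break comes after an operator, and the opening brace sits
	// on the same line as the end of the condition.
	if (len(block) >= 4 &&
		block[0] == 0xff && block[1] == 0xd8 && block[2] == 0xff &&
		(block[3] == 0xe0 || block[3] == 0xe1)) {
		return true
	}
	return false
}

func main() {
	fmt.Println(looksLikeJPEG([]byte{0xff, 0xd8, 0xff, 0xe0})) // prints "true"
	fmt.Println(looksLikeJPEG([]byte{0x00, 0x00, 0x00, 0x00})) // prints "false"
}
```

Note that there are no semicolons anywhere in the source, even though the grammar is defined in terms of them.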

If the opening curly brace were moved from the end of the if line onto a line of its own, the last token on the if line would be a closing parenthesis. Go's lexer automatically inserts a semicolon after a line that ends in a closing parenthesis, so the statement would be cut off before its body and the code would be syntactically incorrect.

There are a few exceptions to all of this. The most notable one I hit in my brief experiment with the language is the for loop, which still separates its clauses with semicolons and works much as you would expect in any C-style language. Except that in Go, for is the only loop keyword and every part of its header is optional. So not quite what you would expect!
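The three common shapes of the for loop can be sketched as follows. The function names are invented for the illustration, and firstMultiple assumes its first argument is positive.

```go
package main

import "fmt"

// sumTo adds 1..n with the familiar three-part, semicolon-separated loop.
func sumTo(n int) int {
	total := 0
	for i := 1; i <= n; i++ {
		total += i
	}
	return total
}

// halvings keeps only the condition, which is how Go spells a while loop.
func halvings(n int) int {
	steps := 0
	for n > 1 {
		n /= 2
		steps++
	}
	return steps
}

// firstMultiple omits every part; the loop runs until the return.
// It assumes of > 0, otherwise it would never terminate.
func firstMultiple(of, atLeast int) int {
	candidate := of
	for {
		if candidate >= atLeast {
			return candidate
		}
		candidate += of
	}
}

func main() {
	fmt.Println(sumTo(4))                 // prints "10"
	fmt.Println(halvings(512))            // prints "9"
	fmt.Println(firstMultiple(512, 1300)) // prints "1536"
}
```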

Variables in Go

Go is statically typed: every variable has a fixed type at compile time. That type can either be written out or inferred by the compiler. There are multiple ways to define a variable:

The var keyword

var i, j, k int

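Above I have declared three variables, i, j and k, of type int. But I needn't have specified a type. When initial values are supplied, the type can be left out:

var i, j, k = 1, 2, 3

Here the compiler infers the type of each variable from its initializer at compile time. A declaration with neither a type nor an initializer (var i, j, k on its own) does not compile.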
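A small sketch makes the two forms of var concrete. The function names declared and inferred are invented here, and the %T verb of fmt is used to print each variable's type.

```go
package main

import "fmt"

// declared uses an explicit type; each variable starts at int's zero value.
func declared() (int, int, int) {
	var i, j, k int
	return i, j, k
}

// inferred gives no type, so each variable takes the type of its initializer.
func inferred() string {
	var x, y, z = 1, 2.5, "a"
	return fmt.Sprintf("%T %T %T", x, y, z)
}

func main() {
	fmt.Println(declared()) // prints "0 0 0"
	fmt.Println(inferred()) // prints "int float64 string"
}
```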

Within Functions

Within a function, the declaration and assignment of a type-inferred variable can be shortened into one simple operator:

y := 42

The := operator declares the variable named on its left and assigns it the value on its right. In this case, 42 is assigned to the newly declared variable y, whose type is inferred as int.
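In context, short declarations might look like this. blockOffset is a hypothetical helper used only to show := inside a function body.

```go
package main

import "fmt"

// blockOffset is a hypothetical helper used only to show := in context.
func blockOffset(index int) int {
	size := 512            // declares size; int is inferred from the literal
	offset := index * size // declares offset; int because both operands are int
	// size = "big" would not compile: the inferred type is fixed.
	return offset
}

func main() {
	y := 42
	fmt.Printf("%d %T\n", y, y) // prints "42 int"
	fmt.Println(blockOffset(3)) // prints "1536"
}
```

Outside a function body := is not available, so package-level variables have to use var.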

Deferring a Function and Parentheses

There are two more unusual bits of Go I found through my exploration that deserve a brief mention.

Deferring a Function

In Go, you can defer a function call so that it executes immediately before the surrounding function returns. A good example is releasing resources such as an open file.
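A minimal sketch of the pattern, assuming a hypothetical readHeader function that returns the first four bytes of a file. The writeTemp helper exists only so the example has a file to open; neither name comes from the original code.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// readHeader (hypothetical name) returns the first four bytes of the file.
func readHeader(path string) ([]byte, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	// Close runs just before readHeader returns, on every return path below.
	defer f.Close()

	header := make([]byte, 4)
	if _, err := io.ReadFull(f, header); err != nil {
		return nil, err
	}
	return header, nil
}

// writeTemp (hypothetical name) writes data to a new temporary file
// and returns its path. It exists only to make the example runnable.
func writeTemp(data []byte) (string, error) {
	f, err := os.CreateTemp("", "card-*.raw")
	if err != nil {
		return "", err
	}
	defer f.Close()
	if _, err := f.Write(data); err != nil {
		return "", err
	}
	return f.Name(), nil
}

func main() {
	path, err := writeTemp([]byte{0xff, 0xd8, 0xff, 0xe0, 0x00})
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	defer os.Remove(path) // clean up the temporary file when main returns

	header, err := readHeader(path)
	fmt.Printf("% x %v\n", header, err) // prints "ff d8 ff e0 <nil>"
}
```

Placing the defer directly after the successful open keeps acquisition and release next to each other, however many early returns follow.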

Parentheses in Go

Go does not require parentheses around an if condition. The if statement for finding a JPEG can therefore be updated so that it only uses parentheses where they are needed: at the end of the condition, around the or (||).
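A sketch of what the trimmed-down check could look like follows. isJPEGHeader is a hypothetical name and the length check is an added assumption. The remaining parentheses matter because && binds more tightly than ||; without them the last alternative would match any block whose fourth byte is 0xe1.

```go
package main

import "fmt"

// isJPEGHeader is a hypothetical helper, not taken from the original code.
func isJPEGHeader(block []byte) bool {
	// No parentheses around the condition as a whole; the only pair left
	// groups the two allowed values of the fourth byte.
	if len(block) >= 4 &&
		block[0] == 0xff && block[1] == 0xd8 && block[2] == 0xff &&
		(block[3] == 0xe0 || block[3] == 0xe1) {
		return true
	}
	return false
}

func main() {
	fmt.Println(isJPEGHeader([]byte{0xff, 0xd8, 0xff, 0xe1})) // prints "true"
	fmt.Println(isJPEGHeader([]byte{0x00, 0x00, 0x00, 0xe1})) // prints "false"
}
```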

Conclusions

Go reminds me a little of Ruby in the way it strives to keep code as clean as possible by removing redundant characters, which I quite like. A few things did throw me, such as the type being on the right-hand side of a variable name. But readability is clearly one of Go's main aims; there is even a tool to format your code: gofmt.