Visualising colour themes in popular movies using Clojure

Was it just me, or did the Harry Potter films get darker in tone as the kids got older? Let's find out. By taking each frame of each movie and compressing it down to 5x5 pixels and then averaging all of the frames (trimming the start and end to avoid credits) you can produce a single block colour:

Harry Potter and the Philosopher's Stone (2001)

Harry Potter and the Chamber of Secrets (2002)

Harry Potter and the Prisoner of Azkaban (2004)

Harry Potter and the Goblet of Fire (2005)

Harry Potter and the Order of the Phoenix (2007)

Harry Potter and the Half-Blood Prince (2009)

Harry Potter and the Deathly Hallows Part 1 (2010)

Harry Potter and the Deathly Hallows Part 2 (2011)

Below is the series of images as an animated gif, with the films position in the series in the bottom right hand corner:

If you generate a composited image from shrunken, averaged frames from the movie, and analyse it to calculate the average colour use for each film, you can show the overall darkening in tone as below:

So, in summary, the films did get noticeably darker - there are certainly more browns and greens in the early movies, whereas the last two are over 50% ‘dark’. A more impressive series of images highlighting the transition is included later.

It turns out I’m not the first person to ask this question and, after a couple of weeks of work, I stumbled across a number of resources, including this Reddit thread basically achieving the same result. I’ll walk you through my approach.

Given a local copy of each movie converted to an MP4 format, we can extract each frame and visualise the changing colour palette each director chose to use as the films progress. The choice of colour palette tends to have a large impact on the 'mood' of the film as perceived by the audience. As such, it is often a good proxy for the tone of the content.

Capturing a single frame

When planning how to achieve these results I anticipated a fair amount of matrix manipulation, or conversion of sequences between formats, and so I chose to use Clojure. It helps that it's my language of choice anyway, and is used in my day job at uSwitch. The Java interop is an added bonus, particularly as there are very useful libraries out there, such as JavaCV, which provides a suite of tools for image capture and manipulation.

Below is an example which captures a single frame of video and returns it as a BufferedImage:

As Java is typically stateful, we have to start and stop the FFmpegFrameGrabber to avoid memory overhead and to ensure resources are appropriately released. To abstract these steps away, we can write a macro to make working with image streams a little easier:

Turning a frame into a pixel

My original plan was to iterate through each pixel in each frame, summing the colour components and then dividing by the size of the frame to calculate the average colour. It’s relatively easy to get a byte array of pixel data from a BufferedImage:

(byte-array (.. buffered-image getRaster getDataBuffer getData))

Each byte represents the value of a single colour channel of the image. Most images have 4 channels; red, blue, green and alpha. To determine whether or not an image includes an alpha channel, we can simply ask it:

(not (nil? (.getAlphaRaster buffered-image)))

If there is an alpha channel, each pixel will be represented by 4 bytes, allowing us to partition the byte array by pixel and to convert each pixel to a vector of integer colour values:

Being able to view the pixel values as ints in the range 0-255 made development easier but was clearly not the optimal way to calculate the average colour value. This process takes several seconds to complete. Given we’d have to calculate this for every frame (24/second), assuming we could average 1 frame/second, we’d be able to process about 60 seconds of material every 24 hours.

I was about to investigate a more optimal algorithm when it occurred to me that what I was actually doing was resizing the image down to 1 pixel using averaging. After some exploration of the tooling already available, it turned out this is efficiently implemented already on the Graphics class, which exposes an editable component of the BufferedImage. The pixels are still averaged but in a much more efficient manner:

There’s quite a lot going on here, but the key part is the scale-image function which, given a graphics component of a BufferedImage, will draw a different image to x and y coordinates, scaled to a height and width as follows:

Be resizing each frame to 1 pixel and tiling them, we can produce an image like below, which shows Skyfall (2012):

Scaling Mechanism

Java provides a number of mechanisms for scaling an image based on your priorities - most of the time, the key decision is performance vs ‘quality’ for some definition of quality, which is usually centred on how much of the image is discarded when scaling down or how interpolation works when scaling up.

Fast image scaling is usually achieved using a nearest neighbour algorithm; additional blank pixels are added between the existing pixels and the colour of the pixel nearest each blank pixel is copied over:

Below is an example of an image scaled up using the nearest neighbour algorithm

Original Image

Scaled x 2

Scaled x 4

This leads to relatively decent results although jagged artifacts are obvious, particularly on curved edges. An image with a wider range of colour would also end up looking ‘blockier’.

What’s more interesting for our purposes is seeing what happens when the image is scaled down:

Original Image

Scaled x 0.5

Scaled x 0.25

There are a number of alternative image-scaling techniques which interpolate the pixel data in a variety of ways. The most common (and default options in the Java graphics class) are bicubic and bilinear. Below is a comparison of the results of scaling the original image down through each different method:

Original Image

Scaled x 0.25 using nearest neighbour

Scaled x 0.25 using bilinear interpolation

Scaled x 0.25 using bicubic interpolation

This excellent article provides detail of the different options available for scaling. The method of scaling can be trivially achieved using the .setRenderingHints method on the Graphics class, meaning we could write something as follows to scale images up or down in Clojure using bilinear interpolation:

Increasing the size of each pixel

My next thought was to see if I could generate more impressive images by resizing each frame to a small 5x5 size, retaining a small amount of the detail of the initial frame. After experimenting with the different scaling hints possible, and after doing some research, I came across a number of users pointing me to the Thumbnailinator library, which abstracts the action of choosing an optimal algorithm away from the user based on the size of the desired final image.

Integration was relatively easy, and our create-tiled-image function doesn’t actually need to change that much - we just need to create an instance of the ThumbnailMaker which we'll skip over for now:

This image, composed of frames from 2001: A Space Oddysey (1968), highlights Kubrick’s use of long static shots, indicated by the large stretches of regular colour.

On the other hand, The Matrix (1999) is a great study in colour delineation, with scenes in the Matrix tinged with green (or white in the case of scenes set in The Construct), whereas scenes in the ‘real world’ are coloured blue.

Here are some other favourites, the interpretation of which I will leave as an exercise for the reader:

Next Steps

Colour in films is rarely accidental. Filmmaking is an art form and, in most filmmaking, barring indie movements such as Dogme 95, nothing in a shot is there by accident. Further, a director will choose a colour palette which they feel best conveys the most appropriate mood for a scene. Sometimes colours can be used to provide a clear delineation between ideas - a technique dating back to the ‘black hat’ and ‘white hat’ Westerns where hat colour differentiated the good guys from the bad guys.

Modern filmmaking has the advantage of being able to change the colours in a scene in post-production, allowing even more subtle manipulations of the audience’s perceptions.

As such, mining colour information for sentiment, tone or even genre is an interesting possibility. I hope to post a follow up exploring some or all of these areas.