IPython inline images and videos

Large arrays of data, especially spatiotemporal data, are hard to make sense of without visualization. While the imshow function of matplotlib provides a backend-agnostic way of visualizing array data and allows for all sorts of annotations, such as axis labels, I found myself unhappy with the result more often than not.

There are two things that are difficult to get right when displaying image arrays in an IPython notebook: First, the figure size determines the maximum size of the image and cannot be adjusted dynamically by dragging its edges in the browser. Second, the architecture of matplotlib requires the image to undergo a set of transformations, and without obsessing about things like the figure resolution and interpolation methods, it’s very difficult to just plainly display an image in the browser without some form of resampling.

Images

The following convenience function takes a two- or three-dimensional array and displays it in the notebook – directly as an image embedded in HTML, with automatic dynamic range adjustment if desired, but no other transformations. The image can be rescaled dynamically in the browser by dragging its edges. Here’s the source code:

The data must be either two-dimensional, in which case it is interpreted as a grayscale image; or three-dimensional, with one dimension having either 3 or 4 elements, corresponding to RGB images without or with an alpha channel, respectively. The code tries to guess whether the color channels are in the first or the last dimension (“planar” or “interleaved” format) and shuffles the dimensions accordingly.
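To make the two ideas above concrete (guessing the channel layout and embedding the result in HTML), here is a minimal sketch. It is not the nbimage source: the names to_interleaved, png_bytes and image_html are hypothetical, the input is assumed to be 8-bit already, and the PNG is written by hand with zlib only to avoid a dependency on an imaging library. Resizing in the browser is not covered.

```python
import base64
import struct
import zlib

import numpy as np


def to_interleaved(data):
    """Return a (height, width) or (height, width, channels) view of `data`."""
    data = np.asarray(data)
    if data.ndim == 2:
        return data  # grayscale
    if data.ndim != 3:
        raise ValueError("expected a 2-D or 3-D array")
    if data.shape[2] in (3, 4):
        return data  # already interleaved: (height, width, channels)
    if data.shape[0] in (3, 4):
        return np.moveaxis(data, 0, -1)  # planar: (channels, height, width)
    raise ValueError("no dimension of length 3 or 4 found")


def png_bytes(pixels):
    """Encode a uint8 array (gray, RGB, or RGBA) as an unfiltered PNG."""
    pixels = np.ascontiguousarray(pixels, dtype=np.uint8)
    height, width = pixels.shape[:2]
    channels = 1 if pixels.ndim == 2 else pixels.shape[2]
    color_type = {1: 0, 3: 2, 4: 6}[channels]  # PNG color type codes
    rows = pixels.reshape(height, width * channels)
    # Each scanline is prefixed with filter type 0 (no filtering).
    raw = np.hstack([np.zeros((height, 1), np.uint8), rows]).tobytes()

    def chunk(tag, payload):
        crc = zlib.crc32(tag + payload) & 0xFFFFFFFF
        return struct.pack(">I", len(payload)) + tag + payload + struct.pack(">I", crc)

    header = struct.pack(">IIBBBBB", width, height, 8, color_type, 0, 0, 0)
    return (b"\x89PNG\r\n\x1a\n" + chunk(b"IHDR", header)
            + chunk(b"IDAT", zlib.compress(raw)) + chunk(b"IEND", b""))


def image_html(data):
    """Build an <img> tag holding `data` as a base64-encoded PNG."""
    encoded = base64.b64encode(png_bytes(to_interleaved(data))).decode("ascii")
    return '<img src="data:image/png;base64,{}" />'.format(encoded)
```

In a notebook, the returned string can be shown with display(HTML(image_html(arr))), using HTML and display from IPython.display. Note that when both the first and the last dimension have 3 or 4 elements, this sketch treats the array as interleaved; that tie-breaking rule is my assumption.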

Dynamic range adjustment

The dynamic range adjustment works as follows: First, if the array is already in 8-bit unsigned integer format, it is assumed that no adjustment is necessary. For any other format, the default is to scale the data linearly so that it fills the 8-bit dynamic range of the image. The vmin and vmax parameters can be used to override the data values that correspond to the minimum and maximum pixel values. Any data values beyond those limits are clipped to the minimum or maximum. The vsym parameter alters the default behavior in a way that I often find useful for displaying data like filter coefficients: It ensures that a data value of 0 is mapped to mid-gray. All of this functionality is delegated to a separate function called rerange:
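The behavior described above can be sketched roughly as follows. This is a reconstruction from the description, not the original listing: the signature and defaults are assumptions, and so is the way vsym is implemented here (by making the limits symmetric around zero).

```python
import numpy as np


def rerange(data, vmin=None, vmax=None, vsym=False):
    """Map `data` linearly onto 0..255 and return it as uint8."""
    data = np.asarray(data)
    if data.dtype == np.uint8:
        return data  # already 8-bit: assume no adjustment is needed
    data = data.astype(np.float64)
    if vmin is None:
        vmin = data.min()
    if vmax is None:
        vmax = data.max()
    if vsym:
        # Assumed behavior: symmetric limits, so that 0 lands on mid-gray.
        bound = max(abs(vmin), abs(vmax))
        vmin, vmax = -bound, bound
    if vmax == vmin:
        return np.zeros(data.shape, dtype=np.uint8)  # constant input
    scaled = (data - vmin) / (vmax - vmin)  # 0..1 inside the limits
    # Values outside [vmin, vmax] are clipped before conversion.
    return np.round(np.clip(scaled, 0.0, 1.0) * 255).astype(np.uint8)
```

For example, rerange([-1.0, 0.0, 2.0], vsym=True) uses the limits -2 and 2, so the value 0 ends up at pixel value 128.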

Videos

Yes, that’s right! HTML5 nicely allows us to convert data arrays into inline browser videos. For this to work, we need a copy of ffmpeg that is accessible on the command line.

I wrote another function called nbvideo, which works analogously to the nbimage function above. The array must now have 3 or 4 dimensions, for grayscale or color images, respectively, and the first dimension is assumed to be time. The color dimension can be either the second or the last. There are some additional parameters: fps gives the number of frames per second, and loop is a boolean that, if set, adds an HTML attribute telling the browser to loop the playback by default (this can also be changed manually in the browser). Additionally, you can encode a frame counter into the video.
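The general approach can be sketched like this: flatten the frames to raw bytes, pipe them through ffmpeg, and wrap the encoded file in a video tag. This is not the nbvideo source; the function names, the default fps, and the default codec arguments are all assumptions, the input is assumed to be 8-bit already, and the frame counter is left out. With libx264 and yuv420p, width and height need to be even.

```python
import base64
import os
import shutil
import subprocess
import tempfile

import numpy as np


def frames_to_raw(data):
    """Return (raw bytes, width, height, ffmpeg pixel format) for a video array."""
    data = np.asarray(data, dtype=np.uint8)
    if data.ndim == 3:                       # (time, height, width): grayscale
        pix_fmt = "gray"
    elif data.ndim == 4:
        if data.shape[-1] not in (3, 4):     # (time, channels, height, width)
            data = np.moveaxis(data, 1, -1)  # -> (time, height, width, channels)
        if data.shape[-1] not in (3, 4):
            raise ValueError("no color dimension of length 3 or 4 found")
        pix_fmt = "rgb24" if data.shape[-1] == 3 else "rgba"
    else:
        raise ValueError("expected a 3-D or 4-D array")
    height, width = data.shape[1:3]
    return np.ascontiguousarray(data).tobytes(), width, height, pix_fmt


def encode(data, fps=25, codec_args=("-c:v", "libx264", "-pix_fmt", "yuv420p"),
           ext="mp4"):
    """Pipe the frames through ffmpeg and return the encoded file contents."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on the PATH")
    raw, width, height, pix_fmt = frames_to_raw(data)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "out." + ext)
        # Read raw frames from stdin, write the encoded video to `path`.
        cmd = ["ffmpeg", "-y", "-f", "rawvideo", "-pix_fmt", pix_fmt,
               "-s", "{}x{}".format(width, height), "-r", str(fps), "-i", "-",
               *codec_args, path]
        subprocess.run(cmd, input=raw, check=True, capture_output=True)
        with open(path, "rb") as handle:
            return handle.read()


def video_html(sources, loop=False):
    """Build a <video> tag from a list of (MIME type, encoded bytes) pairs."""
    tags = "".join(
        '<source src="data:{};base64,{}" type="{}">'.format(
            mime, base64.b64encode(blob).decode("ascii"), mime)
        for mime, blob in sources)
    return "<video controls{}>{}</video>".format(" loop" if loop else "", tags)
```

A call such as video_html([("video/mp4", encode(frames))], loop=True) then yields a string that can be displayed with IPython.display.HTML.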

The parameters theora, h264, and vp8 can each be set either to an integer, indicating that the video will be compressed using the respective codec with that quality setting, or to None, indicating that the codec should be turned off. Note that not all browsers support all codecs (have a look at this table). The default settings are a safe choice that should work in most browsers. If you know the preferred codec of your browser, you should turn the other codecs off to save space (the default setting produces both a Theora and an H.264 version of the data).
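One way to picture these parameters is as a table that maps each enabled codec to a container, a MIME type, and a set of ffmpeg encoder arguments. The sketch below is hypothetical throughout: the function name, the numeric defaults, and the choice of quality flag per encoder (-q:v for libtheora, -crf for libx264 and libvpx) are my assumptions, not taken from nbvideo. Only the fact that Theora and H.264 are on by default comes from the description above.

```python
# Hypothetical mapping from the three codec parameters to ffmpeg settings.
CODECS = {
    "theora": {"ext": "ogv", "mime": "video/ogg",
               "args": lambda q: ["-c:v", "libtheora", "-q:v", str(q)]},
    "h264": {"ext": "mp4", "mime": "video/mp4",
             "args": lambda q: ["-c:v", "libx264", "-crf", str(q),
                                "-pix_fmt", "yuv420p"]},
    "vp8": {"ext": "webm", "mime": "video/webm",
            "args": lambda q: ["-c:v", "libvpx", "-crf", str(q), "-b:v", "1M"]},
}


def enabled_outputs(theora=7, h264=23, vp8=None):
    """Return (extension, MIME type, ffmpeg args) for each codec that is not None."""
    settings = {"theora": theora, "h264": h264, "vp8": vp8}
    outputs = []
    for name in ("theora", "h264", "vp8"):
        quality = settings[name]
        if quality is None:
            continue  # codec turned off
        codec = CODECS[name]
        outputs.append((codec["ext"], codec["mime"], codec["args"](quality)))
    return outputs
```

Each returned argument list could be appended to an ffmpeg command line, and each MIME type used in the type attribute of a source element, so the browser picks the first version it can play. Calling enabled_outputs(theora=None) would leave only the H.264 version.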

nbvideo is quite lengthy, so I won’t include its source code here. Here’s a Python file with all of the functions.