How to load Tiff stacks fast, really fast!

Loading Tiff files has been slow for many years in Matlab. With the recent introduction of the TIFF library, things have improved a lot. But still, when it comes to loading large datasets stored in Tiff files, Matlab functions are not as good as they could be. Today I am going to introduce a few lines of code that will make all of this history for good.

Quite some time ago, I introduced “inlining” as a way to boost the efficiency of your code. The basic principle is to look for Matlab functionalities that are not built-in functions. If it happens that these are slowing down your calculations, you can open the underlying M-file and fish out the few lines of code that are relevant to you. Today’s post is basically a comprehensive example of this technique so, even if you don’t care about Tiff files, it should be interesting as a guideline for optimization.

Okay, now that the introduction is done, let’s dig in.

Let’s suppose you have a Tiff file named ImageStack.tif with a series of images stored in it. In good old Matlab code, you would use this code to load it into a 3D matrix:
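A minimal sketch of such a loader, assuming the stack holds 16-bit grayscale frames (the 'uint16' class and the variable names are assumptions, adjust them to your data):

```matlab
FileTif = 'ImageStack.tif';          % multi-image Tiff file, assumed to exist
InfoImage = imfinfo(FileTif);        % one struct element per image in the file
mImage = InfoImage(1).Width;         % number of columns
nImage = InfoImage(1).Height;        % number of rows
NumberImages = length(InfoImage);

% Preallocate the 3D matrix (rows x columns x frames); uint16 is an assumption.
FinalImage = zeros(nImage, mImage, NumberImages, 'uint16');

for i = 1:NumberImages
    % Plain imread call, one image of the stack at a time
    FinalImage(:,:,i) = imread(FileTif, 'Index', i);
end
```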

imfinfo is used to get the size of the movie stack to preallocate the big matrix. Nothing is especially fancy in this code. This would be the way most people load a tif stack if there were no performance issues.

With this particular code and a decent dataset of 1575 images (256 by 256 pixels) in a single Tiff file, it takes approximately 200 seconds to run on my computer.

To give you an idea of how awful this is, ImageJ, a very widely used program in microscopy, takes approximately 3 seconds to load the same stack.

To help solve this issue, Mathworks modified imread so that you can feed it some additional info and avoid some of its internal overhead, as mentioned in the help:

Note: When reading images from a multi-image TIFF file, passing the output of imfinfo as the value of the ‘Info’ argument helps imread locate the images in the file more quickly.
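In practice this only changes the imread call inside the loop. A minimal sketch, again assuming 16-bit grayscale frames and illustrative variable names:

```matlab
FileTif = 'ImageStack.tif';          % multi-image Tiff file, assumed to exist
InfoImage = imfinfo(FileTif);        % parsed once, then reused for every frame
mImage = InfoImage(1).Width;
nImage = InfoImage(1).Height;
NumberImages = length(InfoImage);
FinalImage = zeros(nImage, mImage, NumberImages, 'uint16');   % uint16 is an assumption

for i = 1:NumberImages
    % Passing the imfinfo output spares imread from re-scanning the file
    FinalImage(:,:,i) = imread(FileTif, 'Index', i, 'Info', InfoImage);
end
```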

Again, a big improvement: the very same file is now loaded in 19 seconds. Still, this is rather slow compared to ImageJ, so I decided to really push on this front as I am loading Tiff files in Matlab many times, every single day, some having many more images than 1575 (up to 10000 and more).

It turned out, as you dig into Mathworks’ implementation of the TIFF library, that they did a very poor job at limiting the overhead when dealing with Tiff stacks. This is especially annoying as I believe that the main advantage of this move was to get faster at stacks.

Indeed, when you run the profiler on this code, you should get this:

As you can see, the number one process is Tiff.getTag. getTag is used to get some properties of the image. So they actually repeated the mistake they made with imread, as this function is being called 28350 times to read my stack!
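For reference, a loader that goes through the Tiff class might look like the sketch below. The variable names and the 'uint16' class are assumptions, and the profile on / profile viewer pair is only there to collect a profiler report for the loop:

```matlab
FileTif = 'ImageStack.tif';          % multi-image Tiff file, assumed to exist
InfoImage = imfinfo(FileTif);
mImage = InfoImage(1).Width;
nImage = InfoImage(1).Height;
NumberImages = length(InfoImage);
FinalImage = zeros(nImage, mImage, NumberImages, 'uint16');   % uint16 is an assumption

profile on                           % start collecting profiling data
TifLink = Tiff(FileTif, 'r');        % open the file read-only through the Tiff class
for i = 1:NumberImages
    TifLink.setDirectory(i);         % image directories are 1-based in the Tiff class
    FinalImage(:,:,i) = TifLink.read();
end
TifLink.close();
profile viewer                       % stop the profiler and open the report
```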

What we want to do now is to use the profiler to select the pieces of code that are relevant and get rid of the rest. So within the profiler I clicked on Tiff.read and realized that Tiff.read makes a call to Tiff.readAllStrips, which in turn makes many calls to Tiff.readEncodedStrip. There, deeply buried within a loop that goes over all the pixels of the data, was the real call to tifflib, the original compiled library.

Oh surprise, most of the time is not spent reading the data! Most of the time is spent checking that the image has a certain color profile (look at the call to Photometrics).

This is a golden example of why inlining can be extremely efficient.

So I went through all these functions, copied and pasted some code and tried to make a new loader that makes smarter use of tifflib for stacks. This is the new code:

What this code does is bypass the M-file wrapper written by Mathworks (the one that is very bad at stacks) around their built-in MEX file. So I now make direct calls to tifflib.
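A sketch of what such a direct-call loader can look like. Be careful: tifflib is a private, undocumented MEX file, so the command strings and the 1-based indices used below are assumptions modelled on what the Tiff.m wrapper of that era appears to do, and they may differ in your release. The code also assumes a stripped (not tiled) 16-bit grayscale file and a copy of tifflib on your path:

```matlab
FileTif = 'ImageStack.tif';          % multi-image Tiff file, assumed to exist
InfoImage = imfinfo(FileTif);
mImage = InfoImage(1).Width;         % number of columns
nImage = InfoImage(1).Height;        % number of rows
NumberImages = length(InfoImage);
FinalImage = zeros(nImage, mImage, NumberImages, 'uint16');   % uint16 is an assumption

% ASSUMPTION: undocumented tifflib commands, check them against your own Tiff.m
FileID = tifflib('open', FileTif, 'r');
rps = tifflib('getField', FileID, Tiff.TagID.RowsPerStrip);   % rows stored per strip
rps = min(rps, nImage);

for i = 1:NumberImages
    tifflib('setDirectory', FileID, i);        % indexing base may vary by release
    % Go through each strip of the current image
    for r = 1:rps:nImage
        row_inds = r:min(nImage, r + rps - 1);
        stripNum = tifflib('computeStrip', FileID, r);
        FinalImage(row_inds,:,i) = tifflib('readEncodedStrip', FileID, stripNum);
    end
end
tifflib('close', FileID);
```

The tag values are read once, outside the loops, instead of being checked again for every strip.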

The problem is that Matlab does not place tifflib within your search path, so you MUST copy the compiled library from your own distribution of Matlab into your function folder. On my Mac, this file is at:

/Applications/MATLAB_R2011b.app/toolbox/matlab/imagesci/private and is called tifflib.mexmaci64. I copied this file into the folder where my M-file code is located.

This also means that, in my case, this function will work only on 64-bit Macs until I copy the MEX files for the other platforms.

Keep in mind that I also removed lots of overhead that checks the particular tiff type (in this example it is a chunky file), so you might want to create several loaders depending on the file type (instead of checking the file type at every pixel like Mathworks did). The current code works for my particular application.

With this in mind, using TIC/TOC routines, this code now takes 1.5 seconds. Yes, I am not joking, Matlab is now FASTER than ImageJ.

I hope Mathworks is reading this for their next release… They might consider changing their wrapper as I am not the only one around who uses Tiff stacks…
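If you want to time your own stack, here is a minimal sketch with tic/toc. It wraps the imread loader that passes 'Info' (swap in whichever loop you want to measure); the variable names and the 'uint16' class are assumptions:

```matlab
FileTif = 'ImageStack.tif';          % multi-image Tiff file, assumed to exist
InfoImage = imfinfo(FileTif);
NumberImages = length(InfoImage);
FinalImage = zeros(InfoImage(1).Height, InfoImage(1).Width, ...
                   NumberImages, 'uint16');    % uint16 is an assumption

tic;                                 % start the stopwatch
for i = 1:NumberImages
    FinalImage(:,:,i) = imread(FileTif, 'Index', i, 'Info', InfoImage);
end
ElapsedSeconds = toc;                % seconds elapsed since the matching tic

fprintf('Loaded %d images in %.2f s\n', NumberImages, ElapsedSeconds);
```

Run it a couple of times before comparing numbers, since the first pass may also include disk caching effects.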

NOTA: Mathworks released (a few months after I posted about it) a bug fix for the Tiff class to deal with this issue. The new class is far better and gives very decent loading times. I recommend you download it and replace your local copy of Tiff.m with the bug fix. Direct calls to these new libraries still provide a little boost, but not as drastic.

Oops, sorry. I misunderstood your code. It’s possible they changed imread recently to use the Tiff library in a better way. I am using 2011 here. Maybe I should upgrade to try it out.
In my hands, imread was adding a lot of overhead with Tiff stacks. I am curious to see if the last approach works out nicely for you.

TIFF files are not supposed to be that big according to the standard of the file format.
Usually these huge TIFFs are created using ImageJ. To make this work, ImageJ actually leaves the standard and stops recording frames into directories. I think you would greatly benefit from migrating to HDF5 files. These are much faster than TIFF at such large sizes and they are designed for this exact purpose.
Check my post on this particular issue: http://www.matlabtips.com/how-to-store-large-datasets/
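To give an idea of what this looks like, here is a minimal sketch using Matlab's high-level HDF5 functions. The file name, the '/stack' dataset name, the chunk size and the stand-in data are all assumptions chosen for illustration:

```matlab
FileH5 = [tempname '.h5'];           % hypothetical output file in the temp folder
DataSet = '/stack';                  % hypothetical dataset name
Stack = zeros(64, 64, 20, 'uint16'); % stand-in for a real image stack

% One chunk per frame, so that single frames can be read back cheaply
h5create(FileH5, DataSet, size(Stack), 'Datatype', 'uint16', ...
         'ChunkSize', [64 64 1]);
h5write(FileH5, DataSet, Stack);

% Read back only frames 11 to 20 (start index, then count, per dimension)
Frames = h5read(FileH5, DataSet, [1 1 11], [64 64 10]);
```

Note that h5create fails if the dataset already exists in the file, which is why the sketch writes to a fresh temporary file.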

Hey, I tried your code using Matlab 2012b. Unfortunately, it did not work for me. I copied the tifflib to my folder and it is running, but it takes pretty much the same time as with imread: around 45 seconds for 20000 frames of a 170000-frame file with 128×80 pixels. With ImageJ the whole 170000-frame file is loaded in less than a second. The file is 16-bit, could this be a problem?

Do you have any other suggestions?

Also, take care: you mixed up ImageWidth and ImageHeight… You will notice if you use a non-square image.

There is indeed a typo in the code. ‘mImage’ in loop needs to be replaced with ‘nImage’ (see below). For tiff stacks where Width >> Height, the code is much slower than imread (using the most recent patch to Tiff).

Thanks.
I corrected the typo.
As I said in the NOTA, Mathworks fixed their Tiff class for good (after nearly half a decade of extreme slowness). So the trick is no longer worth the effort. Basically the new version loads the entire image as one single ‘Strip’, if possible, instead of doing it nearly at the pixel level.
You can still get a boost in speed by taking the new libraries and using them to get rid of one for loop (the one that goes across the strips, i.e. for r=1:rps:nImage). In my hands, it gives something like a 10-20 % boost. I am not sure it is worth it because, to do so, you possibly have to break compatibility with future releases of Matlab.

This is probably why the proposed code is not good at elongated images, as it is still processing each image as multiple strips. For elongated images, that’s a lot of strips…
You can try the trick I proposed in my last comment in that particular case.

I tried using your code, it doesn’t seem to help me. It is considerably slower than just using imread+file info. Maybe it is because my tiff files are a bit different, they are 1312×1082 pixel images (up to 750 of them). Maybe imread works better for those larger images, I don’t know what the rtifc.mex is doing…

If I understand correctly, you have 750 files? That’s a very different case from what I propose here, as I am working with a single tiff file with many images in it.
If you do have a single file, then you should try using the mex file approach.

I was a bit unclear I guess. I have one multi-page file with 750 images in it. I tried the mex-file approach (if that is the last code you present above), but I still get a better performance with imread.
If you’re interested I can send you the profiler reports.

The images are LZW compressed, if that makes a difference.
Also, the machine I am running the code on is just a standard laptop, not a lot of processing power…

Hi Jerome! This code is fantastic. Thank you so much for posting it. I wonder if a similar solution is possible for writing a very large multi-page tiff. So given a 3D matrix Img(x,y,time) – do you know the syntax to write a multi-page tiff ( with “time” number of pages and each page of dimensions X and Y) using the tifflib mex?

I have used similar techniques on functions like interp1, which contains all sorts of error trap code and coping mechanisms to handle vectors of different orientations. I can’t remember the figures but got a massive speed increase by essentially writing my own in the end.

Thanks for your message. I found this fix about 2 years ago on an older version of Matlab. Mathworks has fixed their Tiff library implementation in the meantime. They also changed their underlying code, so my code probably does not work well with the new library. Besides, the current Tiff class (in 2013b) gives good loading results.