The home of Caliph & Emir and LIRE.

Category Archives: Dev

The current LireDemo 0.9.4 beta release features a new indexing routine, which is much faster than the old one. It’s based on the producer-consumer principle and makes, hopefully, optimal use of I/O and up to 8 cores of a system. Moreover, the new PHOG feature implementation is included and you can give it a try. Furthermore, JCD, FCTH and CEDD got a more compact representation of their descriptors and now use much less storage space. Several smaller changes include parameter tuning on several descriptors. All the changes have been documented in the CHANGES.txt file in the SVN.
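For readers unfamiliar with the producer-consumer principle, here is a minimal sketch in plain Java. It is not the indexing routine shipped with LireDemo; the class name, the queue size and the counter that stands in for feature extraction are all assumptions made for illustration. One producer pushes work items (think: image paths) into a bounded queue, while several consumer threads take them out and process them in parallel.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of producer-consumer, not LIRE code.
class ProducerConsumerSketch {
    // Unique sentinel object telling a consumer that no more work will arrive.
    private static final String POISON = new String("<end>");

    /** Pushes all items through a bounded queue and returns how many were processed. */
    static int process(List<String> items, int consumers) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(100);
        AtomicInteger processed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(consumers);
        for (int i = 0; i < consumers; i++) {
            pool.execute(() -> {
                try {
                    while (true) {
                        String item = queue.take();      // blocks until work is available
                        if (item == POISON) break;       // identity check on the sentinel
                        processed.incrementAndGet();     // placeholder for the real work
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        try {
            for (String item : items) queue.put(item);              // producer side
            for (int i = 0; i < consumers; i++) queue.put(POISON);  // one sentinel per consumer
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while processing", e);
        }
        return processed.get();
    }
}
```

The bounded queue keeps the producer from reading far ahead of the consumers, which is what balances I/O against CPU work in this pattern.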

In the current SVN version three global features have been revisited in terms of serialization. This was necessary as the index of the web demo with 300k images already exceeded 1.5 GB.

Feature          Prior          Now
EdgeHistogram    320 bytes      40 bytes
JCD              1344 bytes     84 bytes
ColorLayout      14336 bytes    504 bytes

This significant reduction in space leads to (i) smaller indexes, (ii) reduced I/O time, and (iii) therefore, to faster search.

How was this done? Basically it’s clever organization of bytes. In the case of JCD the histogram has 168 entries, each of which fits into 4 bits, so basically half a byte. Therefore, you can stuff two of these values into one byte, but you have to take care of the fact that Java performs bit-wise operations on ints and that bytes are signed. So the trick is to create an integer in [0, 2^8-1] and then subtract 128 to get it into byte range. The inverse is done for reading. The rest is common bit shifting.

The code can be seen either in the JCD.java file in the SVN, or in the snippet at pastebin.com for your convenience.
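As a rough illustration of the packing trick described above, here is a minimal sketch. It is not the code from JCD.java; the class and method names are made up, and it assumes the input values are already integers in [0, 15].

```java
// Hypothetical illustration of packing two 4-bit values into one signed byte.
class NibblePacking {
    /** Packs pairs of values in [0, 15] into one byte each. */
    static byte[] pack(int[] values) {
        if (values.length % 2 != 0) {
            throw new IllegalArgumentException("even number of values expected");
        }
        byte[] result = new byte[values.length / 2];
        for (int i = 0; i < result.length; i++) {
            int hi = values[2 * i] & 0x0F;        // first value goes to the upper 4 bits
            int lo = values[2 * i + 1] & 0x0F;    // second value goes to the lower 4 bits
            int combined = (hi << 4) | lo;        // integer in [0, 255]
            result[i] = (byte) (combined - 128);  // shift into the signed byte range
        }
        return result;
    }

    /** Inverse of pack: restores the original values. */
    static int[] unpack(byte[] packed) {
        int[] values = new int[packed.length * 2];
        for (int i = 0; i < packed.length; i++) {
            int combined = packed[i] + 128;       // back to [0, 255]
            values[2 * i] = combined >> 4;
            values[2 * i + 1] = combined & 0x0F;
        }
        return values;
    }
}
```

With 168 entries this yields 84 bytes, which matches the JCD size given in the table.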

With the implementation of the PHOG descriptor I ran into the situation that no well-performing Canny edge detector in pure Java was available. “Pure” in my case means that it just takes a Java BufferedImage instance and computes the edges. Therefore, I had to implement my own.

As a result there is now a “simple implementation” available as part of LIRE. It takes a BufferedImage and returns another BufferedImage, which contains all the edges as black pixels, while the non-edges are white. Thresholds can be changed, and the blurring filter used for preprocessing can be changed in code. Usage is dead simple: pass in a BufferedImage and get the edge image back.
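To make the input/output contract concrete, here is a small self-contained sketch. Note that this is not the Canny implementation in LIRE and does not use its API: it is a much simpler Sobel gradient threshold, and the class name, method name and threshold parameter are assumptions for illustration only. It merely follows the same convention of black edge pixels on a white background.

```java
import java.awt.image.BufferedImage;

// Hypothetical illustration: Sobel gradient threshold, NOT LIRE's Canny detector.
class SimpleEdgeSketch {
    /** Returns an image with edge pixels black and all other pixels white. */
    static BufferedImage edges(BufferedImage in, int threshold) {
        int w = in.getWidth(), h = in.getHeight();
        int[][] gray = new int[w][h];
        for (int x = 0; x < w; x++) {
            for (int y = 0; y < h; y++) {
                int rgb = in.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF, g = (rgb >> 8) & 0xFF, b = rgb & 0xFF;
                gray[x][y] = (r + g + b) / 3;   // plain average as gray value
            }
        }
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        for (int x = 0; x < w; x++) {
            for (int y = 0; y < h; y++) {
                int color = 0xFFFFFF;           // white by default, also for the border
                if (x > 0 && y > 0 && x < w - 1 && y < h - 1) {
                    // Horizontal and vertical Sobel responses.
                    int gx = -gray[x - 1][y - 1] - 2 * gray[x - 1][y] - gray[x - 1][y + 1]
                            + gray[x + 1][y - 1] + 2 * gray[x + 1][y] + gray[x + 1][y + 1];
                    int gy = -gray[x - 1][y - 1] - 2 * gray[x][y - 1] - gray[x + 1][y - 1]
                            + gray[x - 1][y + 1] + 2 * gray[x][y + 1] + gray[x + 1][y + 1];
                    if (Math.abs(gx) + Math.abs(gy) > threshold) {
                        color = 0x000000;       // strong gradient: mark as edge
                    }
                }
                out.setRGB(x, y, color);
            }
        }
        return out;
    }
}
```

A real Canny detector adds Gaussian blurring, non-maximum suppression and hysteresis with two thresholds on top of the gradient step shown here.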

LIRE is not a sleeping beauty, so there’s something going on in the SVN. I recently checked in updates on Lucene (now 4.2) and Commons Math (now 3.1.1). I also removed some deprecated usages still left over from Lucene 3.x.

The most notable addition, however, is the Extractor / Indexor class pair. They are command line applications that allow you to extract global image features from images, put them into an intermediate data file and then, with the help of Indexor, write them to an index. All images are referenced relative to the intermediate data file, so this approach can be used to preprocess a whole lot of images from different computers on a network file system. Extractor also uses a file list of images as input (one image per line) and can therefore easily be run in parallel. Just split your global file list into n smaller, non-overlapping ones and run n Extractor instances. As the extraction part is the slow one, this should allow for a significant speed-up if used in parallel.
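Splitting the global file list can be done with any tool. As one possibility, here is a minimal Java sketch; the class name and the output file naming scheme (list-part-0.txt, list-part-1.txt, ...) are assumptions for illustration and not part of LIRE. Lines are distributed round-robin, so the parts are non-overlapping and of nearly equal size.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of LIRE.
class FileListSplitter {
    /** Distributes the lines round-robin over n non-overlapping lists. */
    static List<List<String>> split(List<String> lines, int n) {
        if (n < 1) throw new IllegalArgumentException("n must be at least 1");
        List<List<String>> parts = new ArrayList<>();
        for (int i = 0; i < n; i++) parts.add(new ArrayList<String>());
        for (int i = 0; i < lines.size(); i++) {
            parts.get(i % n).add(lines.get(i));
        }
        return parts;
    }

    /** Reads a file list and writes n part files next to it. */
    static void splitFile(Path list, int n) throws IOException {
        List<String> lines = Files.readAllLines(list, StandardCharsets.UTF_8);
        List<List<String>> parts = split(lines, n);
        for (int i = 0; i < n; i++) {
            // Assumed naming scheme for the part files.
            Path out = list.resolveSibling("list-part-" + i + ".txt");
            Files.write(out, parts.get(i), StandardCharsets.UTF_8);
        }
    }
}
```

Each part file can then be handed to its own Extractor instance as the input list.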

Extractor is run with

$> Extractor -i <infile> -o <outfile> -c <configfile>

<infile> gives the images, one per line. Use “dir /s /b *.jpg > list.txt” to create a compatible list on Windows.

<outfile> gives the location and name of the intermediate data file. Note: It has to be in a parent folder of all images!

<configfile> gives the list of features as a Java Properties file. The supported features are listed below the post. The properties file looks like:
feature.1=net.semanticmetadata.lire.imageanalysis.CEDD
feature.2=net.semanticmetadata.lire.imageanalysis.FCTH
…
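To illustrate the format, here is a minimal sketch of how such a properties file could be read with java.util.Properties. How Extractor actually parses the file is not shown in this post, so the class name and the assumption that the keys are numbered consecutively starting at feature.1 are mine.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical reader for the config format, not the code used by Extractor.
class FeatureConfigSketch {
    /** Collects the class names stored under feature.1, feature.2, ... in order. */
    static List<String> readFeatureClassNames(Reader config) {
        Properties props = new Properties();
        try {
            props.load(config);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> classNames = new ArrayList<>();
        // Assumption: numbering is consecutive; reading stops at the first missing key.
        for (int i = 1; props.getProperty("feature." + i) != null; i++) {
            classNames.add(props.getProperty("feature." + i).trim());
        }
        return classNames;
    }
}
```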

Indexor is run with

Indexor -i <input-file> -l <index-directory>

<input-file> is the output file of Extractor, the intermediate data file.

<index-directory> is the directory of the index to which the images will be added (appended, not overwritten).

I just uploaded Lire 0.9.3 to the all new Google Code page. This is the first version with full support for Lucene 4.0. Run time and memory performance are comparable to the version using Lucene 3.6. I’ve made several improvements in terms of speed and memory consumption along the way, mostly within the CEDD feature. Also I’ve added two new features:

JointHistogram – a 64 bit RGB color histogram joined with pixel rank in the 8-neighborhood, normalized with max-norm, quantized to [0,127], and JSD for a distance function

Both features are fast in extraction (the second one naturally being faster as it does not investigate the neighborhood) and yield nice, visually very similar results in search. See also the image below showing 4 queries, each with the new features. The first one of a pair is always based on JointHistogram, the second is based on OpponentHistogram.

I also changed the Histogram interface to double[] as the double type is so much faster than float in the 64 bit Oracle Java 7 VM. A major bug fix was in the JSD dissimilarity function. As a result, many histograms now use JSD instead of L1, depending on which one performed better on the SIMPLIcity data set (see TestWang.java in the sources).

The final addition is the Lire-SimpleApplication, which provides two classes for indexing and search with CEDD, ready to compile with all libraries and an Ant build file. This may, hopefully, help those who still seek Java enlightenment.

Finally this just leaves to say to all of you: Merry Christmas and a Happy New Year!

In the course of finishing the book, I reviewed several aspects of the LIRE code and came across some bugs, including one with the Jensen-Shannon divergence. This dissimilarity measure had never been used actively in any features as it didn’t work out in retrieval evaluation the way it was meant to. After two hours of staring at the code the realization finally came. In Java the conditional operator “x ? y : z” has lower precedence than almost any other operator, including ‘+’. Hence,

System.out.print(true ? 1: 0 + 1) prints '1',

while

System.out.print((true ? 1: 0) + 1) prints '2'
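To show where the parentheses matter, here is a minimal sketch of a Jensen-Shannon divergence with each conditional term wrapped in its own parentheses. This is a sketch of the standard definition (natural logarithm, factor 1/2), not the code from LIRE; the class and method names are assumptions. Without the parentheses, the expression would parse as "cond ? a : (0 + rest)" and silently drop the second term whenever the first condition is true.

```java
// Hypothetical illustration of JSD with correct operator grouping, not LIRE code.
class JsdSketch {
    /** Jensen-Shannon divergence of two histograms of equal length. */
    static double jsd(double[] p, double[] q) {
        if (p.length != q.length) {
            throw new IllegalArgumentException("histograms must have equal length");
        }
        double sum = 0d;
        for (int i = 0; i < p.length; i++) {
            double m = (p[i] + q[i]) / 2d;   // mixture of both histograms at bin i
            // Each conditional expression is parenthesized on its own, so both
            // terms are always added; empty bins contribute zero.
            sum += (p[i] > 0 ? p[i] * Math.log(p[i] / m) : 0d)
                 + (q[i] > 0 ? q[i] * Math.log(q[i] / m) : 0d);
        }
        return sum / 2d;
    }
}
```

For two identical histograms the result is 0, and for two normalized histograms without any overlap it is ln 2.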

With this problem identified I was finally able to fix the implementation of the Jensen-Shannon divergence and came to new retrieval evaluation results on the SIMPLIcity data set:

Feature                  MAP      P@10     Error rate
Color Histogram – JSD    0.450    0.704    0.191
Joint Histogram – JSD    0.453    0.691    0.196
Color Correlogram        0.475    0.725    0.171
Color Layout             0.439    0.610    0.309
Edge Histogram           0.333    0.500    0.401
CEDD                     0.506    0.710    0.178
JCD                      0.510    0.719    0.177
FCTH                     0.499    0.703    0.209

Note that the color histogram in the first row now performs similarly to the “good” descriptors in terms of precision at ten and error rate. Also note that a new feature crept in: Joint Histogram. This is a histogram combining pixel rank and RGB-64 color.

All the new stuff can be found in SVN and in the nightly builds (starting tomorrow).

I just submitted my code to the SVN and created a download for Lire 0.9.3_alpha. This version features support for Lucene 4.0, which changed quite a bit in its API. I did not have the time to test the Lucene 3.6 version against the new one, so I actually don’t know which one is faster. I hope the new one, but I fear the old one.

This is a pre-release for Lire for Lucene 4.0

Global features (like CEDD, FCTH, ColorLayout, AutoColorCorrelogram and the like) have been tested and are considered working. Filters, like the ReRankFilter and the LSAFilter, also work. The image shows a search for 10 images with ColorLayout and the results of re-ranking the result list with (i) CEDD and (ii) LSA. Visual words (local features), metric indexes and hashing have not been touched yet, besides making them compile, so I strongly recommend not to use them. However, due to a new weighting approach I assume that the visual word implementation based on Lucene 4.0 will, as soon as it is done, be much better in terms of retrieval performance.

I just uploaded version 0.9.2 of Lire and LireDemo to Google Code. Yes, Google Code! I also migrated the SVN trunk to Google Code (more or less in an undercover action some months ago) and will move on with development there. The main reasons were that ads were getting more and more aggressive over at sf.net and that the interface of a Google Code project is so much cleaner and easier to handle from a project manager's point of view.

Lire 0.9.2 fixes two bugs in KMeans and GenericImageSearcher. Both were critical. The KMeans fix now allows the use of the bag of visual words approach. The GenericImageSearcher fix makes search much faster.

Due to numerous requests I prepared a package showing off a simple indexer and a simple search. Inside there are two classes: Indexer and Searcher. Each of them does what its name suggests.

Indexer takes the first command line argument, interprets it as a directory, gets all images from this directory, and indexes and stores them in a newly created directory called “index”. Searcher searches in exactly this image index for the query image specified with the first argument.

The sample application employs CEDD and provides an ANT build file. IDEs like NetBeans, Eclipse or IntelliJ IDEA should have no problems importing the sources and using the build.xml file for compiling and running. Arguments can be changed in the build.xml file.

Recently I posted binaries and packaged libraries for face detection based on OpenCV and OpenIMAJ here and here. Basically both employ similar algorithms to detect faces in photos. As this is based on supervised classification, not only the algorithm but also the employed training set has a strong influence on the actual precision (and recall) of results. So out of interest I took a look at how well the results of both libraries are correlated:

The table above shows the Pearson correlation of the face detection algorithm with the default models of OpenIMAJ (with a minimum face size of 20 and 40 pixels) and OpenCV. As can be seen, the results correlate, but are not the same. The conclusion is: make sure that you check which one to use for your application and possibly train one yourself (as actually recommended by the documentation of both libraries).

This experiment has been done on just 171 images, but experiments with larger data sets have shown similar results.
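For reference, the Pearson correlation between two result series can be computed as in the following minimal sketch. It assumes, purely for illustration, that each detector's output is reduced to one number per image (for example the number of detected faces); the class and method names are made up and this is not the code used for the experiment.

```java
// Hypothetical illustration of the Pearson correlation coefficient.
class PearsonSketch {
    /** Pearson correlation of two series of equal, non-zero length. */
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length == 0) {
            throw new IllegalArgumentException("series must be non-empty and of equal length");
        }
        int n = x.length;
        double meanX = 0d, meanY = 0d;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        double cov = 0d, varX = 0d, varY = 0d;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX, dy = y[i] - meanY;
            cov += dx * dy;    // co-deviation of both series
            varX += dx * dx;
            varY += dy * dy;
        }
        // Note: the result is NaN if one of the series is constant.
        return cov / Math.sqrt(varX * varY);
    }
}
```

A value of 1 means both detectors rise and fall together across the images, 0 means no linear relationship, and -1 means they behave in opposite ways.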