Plant identification is an important task in botany and related areas, such as agriculture, forestry, and nature conservation. It is also of interest to the general public. While botanists usually have no problem identifying a species, non-specialists would often welcome a computer-aided system for species recognition. Creating such a system is a challenge that we have addressed using visual pattern recognition methods.

Generally, trees and shrubs can be recognized by their local features extracted from leaves, flowers, fruits and bark, and by global characteristics, such as height, crown shape, and branch structure. Our system uses only the leaves, since leaves alone provide enough discriminative information, are available from spring to autumn, and, last but not least, the user can collect them and classify the trees retrospectively.

The system consists of two parts: the database, which is dependent on the geographic location, and the universal search engine. We created our own dataset named Middle European Woody Plants (MEW 2012, see Figure 1), which is available at http://zoi.utia.cas.cz/node/662. It comprises native and frequently cultivated trees and shrubs of the Central Europe region. It contains 151 botanical species, at least 50 samples per species and a total of 10,000 samples (leaves) scanned in high resolution.

Recognition is based solely on leaf contour, leaf size and classification of the leaf as either simple or compound. We avoided using leaf texture and colour because they vary both between individuals of a given species and with the season (phenological phase). Furthermore, describing these features requires working with very fine venation detail.

From a mathematical point of view, the crucial question was how to efficiently encode and characterize the leaf contours. The best results were obtained with Fourier descriptors (Figure 2). We achieved an 89% success rate in an experiment in which the dataset was randomly divided into two halves: one half was used as a training set and the other half was tested against it. We also compared the performance of the automatic method with that of humans. We asked 12 computer science students to classify the leaves visually. The students were able to see the query leaf and simultaneously browse the database to compare the query with the training leaves. Unlike the algorithm, which works only with contours, the students worked with full colour images. Each participant classified 30 leaves. The mean success rate was 63%, significantly lower than the success rate of the algorithm.
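To make the idea of contour encoding concrete, the following is a minimal sketch of one common way to compute Fourier descriptors and to evaluate them with a random half split. It is not the implementation used in our system. The function names, the number of retained coefficients, the normalisation, and the nearest-neighbour classifier are all assumptions chosen for illustration. Note that the normalisation shown here removes scale, so leaf size would have to be supplied as a separate feature.

```python
import numpy as np

def fourier_descriptors(contour, n_coeffs=16):
    """Illustrative Fourier descriptors of a closed contour.

    contour: (N, 2) array of ordered boundary points, N > 2 * n_coeffs.
    The normalisation below is one common choice, assumed for this sketch.
    """
    pts = np.asarray(contour, dtype=float)
    z = pts[:, 0] + 1j * pts[:, 1]        # boundary as a complex signal
    mags = np.abs(np.fft.fft(z))          # magnitudes: rotation and start-point invariant
    # Skip coefficient 0 (it only encodes position) and keep the lowest
    # n_coeffs positive and negative frequencies.
    low = np.concatenate([mags[1:n_coeffs + 1], mags[-n_coeffs:]])
    scale = max(mags[1], mags[-1])        # first harmonic: removes overall size
    return low / scale

def classify(query, train_descs, train_labels):
    """Nearest-neighbour rule (an assumed classifier, for illustration only)."""
    dists = np.linalg.norm(train_descs - query, axis=1)
    return train_labels[int(np.argmin(dists))]

def half_split_accuracy(descs, labels, seed=0):
    """Randomly split the data into two halves: train on one, test on the other."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(descs))
    half = len(descs) // 2
    train, test = idx[:half], idx[half:]
    hits = sum(
        classify(descs[i], descs[train], labels[train]) == labels[i]
        for i in test
    )
    return hits / len(test)
```

Given an array `descs` with one descriptor vector per leaf and a NumPy array `labels` of species names, `half_split_accuracy(descs, labels)` returns the fraction of test leaves assigned to the correct species. Extracting the ordered contour from a scanned image is a separate step that is not shown here.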

Our recognition system is publicly available as a web-based application, which allows the user to upload the query leaf and receive an answer more reliably and more quickly than an untrained individual could provide it. We encourage readers to try the system at http://leaves.utia.cas.cz/index?lang=en

The system is a good tool for non-specialists; it is not aimed at professional botanists. It can be easily adapted to other areas and continents just by replacing or extending the MEW database with a database of local plants; the search engine does not require any modification. Some publicly available databases which can be used for this purpose include:

ImageCLEF (Cross Language Evaluation Forum): version 2011 includes 6,436 pictures of 71 species from the French Mediterranean area.

The main weakness in the current version of the system is that it requires high-quality leaf scanning. We plan to develop an advanced version which should also work with cell-phone photographs, and we are also considering an embedded application for smartphones.