ADSC software engineers Thi Ngoc Tho Nguyen and Hong Wei Ng took second place in the 2013 MLSP Bird Classification Challenge. They developed an algorithm that accurately predicts which of a set of bird species are present in audio clips.

Bird behavior and population trends are important to study because birds respond quickly to environmental changes, can tell researchers about other organisms, and are easy to detect. However, traditional methods of collecting data on birds often require manual effort. Acoustic monitoring offers many advantages for studying bird populations, such as increased temporal and spatial resolution and extent, applicability in remote sites, reduced observer bias and potentially lower costs.

Because the recordings were made outside of a lab environment, they included sounds that the researchers could not control, such as wind, rain, insects or airplanes flying overhead. The task was harder still because a clip often contained more than one bird species, and sometimes the birds sang at the same time.

“Our technique had to be able to distinguish these sounds from those of the birds that we were interested in, in order to make accurate predictions,” Ng said.

The 79 participating teams were given 322 audio clips for training and another 323 for testing. Each recording was 10 seconds long, and all of the data were collected in the H. J. Andrews Experimental Forest (HJA) in Oregon over a two-year period. Each training clip was labeled with the set of bird species present, or given no labels if none of the species were present.

The challenge was formulated as a multi-instance, multi-label machine learning problem: an audio clip can contain anywhere from zero to many bird vocalizations, and the classifier has to predict, for each test clip, the probability that each of 19 bird species is present.
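To make the multi-label target concrete, here is a minimal Python sketch that turns per-clip label sets into rows of 0/1 indicators, one column per species. Only the count of 19 species comes from the challenge; the function name, the species indices and the example labels are hypothetical and chosen purely for illustration.

```python
NUM_SPECIES = 19  # number of species in the challenge


def to_indicator_rows(label_sets, num_species=NUM_SPECIES):
    """Convert each clip's set of species indices into a 0/1 row.

    A clip with an empty label set becomes an all-zero row, which is how
    a clip with none of the target species would be represented.
    """
    rows = []
    for labels in label_sets:
        row = [0] * num_species
        for species in labels:
            if not 0 <= species < num_species:
                raise ValueError(f"species index out of range: {species}")
            row[species] = 1
        rows.append(row)
    return rows


# Hypothetical labels for three training clips (not real challenge data).
example_labels = [{1, 7}, set(), {18}]
indicator = to_indicator_rows(example_labels)
```

A classifier for this setting outputs one probability per column for each test clip, rather than a single class.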

To complete the challenge, Ng and Nguyen’s algorithm first detected segments of interest in each spectrogram computed from the given audio clips. Next, audio and image features were computed for each of the segments and summarized using a “bag-of-words” model. These features were then augmented with statistics relating bird species to the environment, such as the probability that each of the 19 species would appear at the recording location of a given clip. Finally, Extremely Randomized Trees classifiers were trained using the binary relevance method, which fits one binary classifier per species, to make the required predictions.
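One way to picture the location statistics mentioned above is as an empirical estimate of how often each species was labeled at each recording site in the training set. The Python sketch below is an assumption-laden illustration, not the team's actual feature code: the function name, the site names, the example labels and the additive smoothing term are all hypothetical.

```python
from collections import defaultdict


def location_species_frequency(locations, label_sets, num_species=19,
                               smoothing=1.0):
    """Estimate P(species present | location) from training clips.

    `smoothing` is an assumed additive (Laplace) term that keeps the
    estimate away from exactly 0 or 1 at sites with few clips.
    """
    clip_counts = defaultdict(int)
    species_counts = defaultdict(lambda: [0] * num_species)
    for location, labels in zip(locations, label_sets):
        clip_counts[location] += 1
        for species in labels:
            species_counts[location][species] += 1

    frequency = {}
    for location, n_clips in clip_counts.items():
        frequency[location] = [
            (count + smoothing) / (n_clips + 2 * smoothing)
            for count in species_counts[location]
        ]
    return frequency


# Hypothetical training data: two clips at site_a, one at site_b.
locations = ["site_a", "site_a", "site_b"]
labels = [{0}, {0, 3}, set()]
freq = location_species_frequency(locations, labels)
```

Each clip could then be given the 19 values for its own site as extra features alongside its audio and image features.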

The key strengths of Ng and Nguyen’s algorithm were its ability to segment very faint signals in the spectrogram and its ability to convert the multi-instance, multi-label problem into a single-instance, multi-label problem. In addition, the algorithm was able to account for audio clips that did not contain any bird calls, but only background noise and sound from other sources such as insects and rain.
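A bag-of-words summary is one common way to perform this kind of conversion: a variable number of segment feature vectors per clip is replaced by a single fixed-length histogram. The Python sketch below assumes a codebook of representative feature vectors is already available (in practice it would typically be learned by clustering, for example with k-means). The function names, the tiny two-dimensional codebook and the example values are hypothetical, not taken from the team's implementation.

```python
def nearest_codeword(vector, codebook):
    """Return the index of the codeword closest to `vector` (squared L2)."""
    best_index, best_distance = 0, float("inf")
    for index, codeword in enumerate(codebook):
        distance = sum((a - b) ** 2 for a, b in zip(vector, codeword))
        if distance < best_distance:
            best_index, best_distance = index, distance
    return best_index


def bag_of_words(segment_features, codebook):
    """Summarize a clip's segments as a normalized codeword histogram.

    A clip with no detected segments (for example, background noise only)
    maps to an all-zero vector of the same length.
    """
    counts = [0] * len(codebook)
    for vector in segment_features:
        counts[nearest_codeword(vector, codebook)] += 1
    total = len(segment_features)
    if total == 0:
        return [0.0] * len(codebook)
    return [count / total for count in counts]


# Hypothetical codebook and per-segment features for two clips.
codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
clip_with_birds = [[0.1, -0.1], [0.9, 1.2], [1.1, 0.8]]
clip_noise_only = []

hist_birds = bag_of_words(clip_with_birds, codebook)
hist_noise = bag_of_words(clip_noise_only, codebook)
```

Because every clip ends up with exactly one vector of the same length, a standard multi-label classifier can be trained on the result regardless of how many segments each clip contained.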

The team, named Herbal Candy, was awarded $600 for its runner-up placement. Ng and Nguyen, who took part in the competition as a side project, are considering further developing some of the techniques they learned during the competition for more general audio surveillance and event classification purposes.

“ADSC aspires not just to do theoretical research, but also to do research that has real world impact,” ADSC Director Doug Jones said. “These contests are an important step in that direction. They’re narrower than real world, but they’re based on real data with real application. The techniques our researchers used were related to their ADSC machine learning and audio research, but this was just an extra challenge for them.”