So, if you have read my previous articles, you may have noticed that I had everything ready to train the bird voice recognition model except for the actual pipeline, the model architecture, and the training itself. After researching what was available on the subject, these approaches helped me speed up my training time 10-20x:

Loading your files into RAM will probably give you an additional speed bump, but the images on disk are compressed, and a simple calculation (800k images * 200x64 pixels * 1 channel * 4 bytes per float32 pixel) shows that the full grayscale dataset would take around 40 GB of RAM (I had 16 when building the model);

Using cyclical learning rates may help you with faster / more stable convergence;

Practice also tells me that using batch normalization in the CNN architecture helps, though I did not test it here. When applying heavier CNN designs (U-Net), I noticed a 2-3x speed-up in training time;
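The cyclical learning rate idea above can be sketched as a plain triangular schedule (following Leslie Smith's formulation). The `base_lr`, `max_lr`, and `step_size` values here are illustrative, not the ones I used:

```python
import math

def triangular_clr(iteration, base_lr=1e-4, max_lr=1e-3, step_size=2000):
    """Triangular cyclical learning rate: the LR climbs linearly from
    base_lr to max_lr over step_size iterations, then back down,
    and the cycle repeats."""
    cycle = math.floor(1 + iteration / (2 * step_size))
    x = abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)

# In Keras this can be wrapped per-epoch with
# keras.callbacks.LearningRateScheduler(lambda epoch: triangular_clr(epoch)).
```

At iteration 0 the schedule sits at `base_lr`, peaks at `max_lr` after `step_size` iterations, and returns to `base_lr` after a full cycle.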

1. Useful links on the subject

When you have this many files, there are a few options for organizing them:

Have a folder for each category (which did not suit me because I wanted to test a species vs. genera model) and use flow_from_directory from Keras;

Write image augmentation / generators from scratch and / or use some Kaggle kernel boilerplates;
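If you do go with the folder-per-category layout, recovering labels from it takes only a few lines of standard-library code. This sketch (the function name is mine) assumes the same directory structure that flow_from_directory expects:

```python
from pathlib import Path

def labels_from_folders(root):
    """Map each file path under `root` to its class name, taking the
    parent folder of every file as the category label -- the same
    layout Keras' flow_from_directory reads."""
    root = Path(root)
    return {p: p.parent.name for p in root.glob("*/*") if p.is_file()}
```

The resulting dict can then feed any label scheme you like (species, genus, or both) instead of locking you into the directory names.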

2. What did not work

Having settled on a model that trained at about 500 seconds per epoch, the biggest disappointment was that I could not train it (or I am just bad at tuning hyper-parameters) to recognize a bird species by its song. My final dataset contained 213 bird genera and 1,623 bird species. Closely related bird species are probably just too similar to distinguish.

As for the bird genus model, I am quite sure that decent accuracy can be achieved (it's a work in progress now).

I also did a lot of fiddling with loading files into memory and with sparse arrays, but none of it turned out to be necessary in the end.

3. Snippets, code, useful examples for models in Keras

Here is the list of files (if you want to use my code, please also download and unpack the tools archive into the folder where you will keep the notebook):

Custom generator

This is a simple generator built on top of the Keras generators that allows feeding arbitrary labels to Keras. Note that code_dict is a dictionary that simply maps file ids to their respective label-encoded classes.
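A minimal from-scratch version of such a generator might look like the sketch below. `load_fn` is a hypothetical loader that turns a file id into a 200x64 spectrogram array, and `batch_size` is illustrative:

```python
import numpy as np

def batch_generator(file_ids, code_dict, load_fn, batch_size=32, shuffle=True):
    """Yield (X, y) batches forever -- the contract Keras' fit_generator
    expects. `code_dict` maps file ids to integer class labels;
    `load_fn` maps a file id to a 2-D spectrogram array."""
    ids = np.array(file_ids)
    while True:
        if shuffle:
            np.random.shuffle(ids)  # reshuffle at the start of each epoch
        for start in range(0, len(ids), batch_size):
            chunk = ids[start:start + batch_size]
            X = np.stack([load_fn(i) for i in chunk]).astype("float32")
            y = np.array([code_dict[i] for i in chunk])
            yield X[..., np.newaxis], y  # add channel axis for a grayscale CNN
```

Because the labels come from `code_dict` rather than the folder structure, the same files can back a species model or a genus model just by swapping the dictionary.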

Nice callbacks

Do you want to just leave your model training and forget about it? This example may be useful for getting started with Keras callbacks. It essentially saves the best iterations of your model, reduces the learning rate on plateaus, and logs the training process.
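A sketch of such a callback list, assuming TF2-style Keras; the filenames and the `patience`/`factor` values are illustrative, not tuned:

```python
from tensorflow.keras.callbacks import (CSVLogger, ModelCheckpoint,
                                        ReduceLROnPlateau)

callbacks = [
    # keep only the weights of the best epoch seen so far
    ModelCheckpoint("best_model.h5", monitor="val_loss", save_best_only=True),
    # halve the learning rate when val_loss stalls for 3 epochs
    ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=3, min_lr=1e-6),
    # append per-epoch metrics to a CSV so progress survives a crash
    CSVLogger("training_log.csv", append=True),
]
# pass `callbacks=callbacks` to model.fit(...)
```

With this in place you can leave a long run unattended: the checkpoint protects the best model, and the CSV log lets you inspect the curve afterwards.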