Using Emacs with flycheck and company is a great experience, but what bothers me is that I need to carefully set many variables so that the configuration used by these helper tools matches my actual build configuration: include directories, extra compiler and linker flags, and so on. I need to apply these settings per project, and keep updating them whenever I change my build config in any way. This is particularly inconvenient for projects that have a configurable build process.

I use CMake for most of my projects, and I’ve recently found a package that can utilize it to automatically configure many other Emacs packages. The package is called cmake-ide, and it is available on MELPA.

There is literally zero configuration required. It automatically discovers whether the file you are editing belongs to a CMake project, runs CMake to prepare an out-of-tree build, and inspects the compile_commands.json generated by CMake to figure out the precise build config for each file. It then uses this information to set up irony, flycheck, rtags, company-clang, and probably some other packages too. Whenever the build config might have changed, cmake-ide automatically updates everything.
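To give an idea of what kind of information sits in compile_commands.json, here is a minimal Python sketch that looks up the compiler arguments recorded for one source file and pulls out the include directories. This is only an illustration and not how cmake-ide is implemented (cmake-ide is an Emacs package); the function names `load_compile_flags` and `include_dirs` are made up for this example. It assumes the build directory was configured with `CMAKE_EXPORT_COMPILE_COMMANDS=ON`, so that CMake actually writes the file.

```python
import json
import shlex
from pathlib import Path


def load_compile_flags(build_dir, source_file):
    """Return the compiler arguments recorded for source_file, or None.

    Hypothetical helper: reads build_dir/compile_commands.json, which is a
    JSON array of entries with "directory", "file" and either "command"
    (a single shell string) or "arguments" (a list of strings).
    """
    db_path = Path(build_dir) / "compile_commands.json"
    with open(db_path, encoding="utf-8") as fh:
        entries = json.load(fh)

    target = Path(source_file).resolve()
    for entry in entries:
        # "file" may be given relative to the entry's "directory".
        entry_file = (Path(entry["directory"]) / entry["file"]).resolve()
        if entry_file != target:
            continue
        if "arguments" in entry:
            return list(entry["arguments"])
        return shlex.split(entry["command"])
    return None


def include_dirs(args):
    """Extract include directories from both "-Idir" and "-I dir" forms."""
    dirs = []
    it = iter(args)
    for arg in it:
        if arg == "-I":
            following = next(it, None)
            if following is not None:
                dirs.append(following)
        elif arg.startswith("-I"):
            dirs.append(arg[2:])
    return dirs
```

A tool such as flycheck or company-clang needs exactly this kind of per-file list (include paths, defines, language standard flags) to parse a file the same way the real build does.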

So eventually I got to analyse how the training mini-batch size affects a network that uses Batch Normalisation. There are several factors in play here:

Larger batch size is good for normalisation: the more samples we normalise over, the closer the estimated mean and variance are to the true statistics. In effect, a large mini-batch size should cause the estimates to vary less from one mini-batch to the next.

Smaller batch size results in more frequent, if noisier, stochastic gradient descent steps per epoch, which may increase learning speed and final success rate.

It is computationally cheaper to process large batches, because of the parallel nature of modern hardware (especially GPUs).
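The first factor can be illustrated with a small simulation. The sketch below is not taken from my network code; it draws synthetic unit-variance Gaussian "activations" and measures how much the mini-batch mean fluctuates between batches for a given batch size. The function name `batch_mean_spread` and all parameters are made up for this example. Since the standard error of a mean falls as 1/sqrt(n), the spread should shrink roughly by half every time the batch size is quadrupled.

```python
import random
import statistics


def batch_mean_spread(batch_size, num_batches=2000, seed=0):
    """Standard deviation of mini-batch means over many synthetic batches.

    Assumption for illustration: each activation is an independent sample
    from a Gaussian with mean 0 and standard deviation 1.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(num_batches):
        batch = [rng.gauss(0.0, 1.0) for _ in range(batch_size)]
        means.append(sum(batch) / batch_size)
    return statistics.pstdev(means)


if __name__ == "__main__":
    # Print the spread of the batch-mean estimate for a few batch sizes.
    for size in (4, 16, 64, 256):
        print(size, round(batch_mean_spread(size), 3))
```

Real activations are neither Gaussian nor independent, so this only shows the general trend: small batches give Batch Normalisation noticeably noisier statistics to normalise with.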

Presumably there is an optimal mini-batch size for Batch Normalisation. In order to find it, I tested the same network again using various mini-batch sizes, observed its performance, averaged the results from multiple runs, and plotted them.

Once I fixed all the bugs in my Batch Normalisation implementation and fine-tuned all parameters, I started getting reasonable results. In particular, it turned out that I needed to significantly (more than 10 times) increase the weight decay constant. I also had to modify the learning rate schedule so that it decays much faster. This makes sense, because Batch Normalisation is supposed to speed up learning. Eventually, the network:

I was interested in the advantage of using BN. To investigate it, I created another network, an identical clone of the one described above, except that it performs no Batch Normalisation at all. Comparing the results of these two networks should show the gain introduced by using BN.

As a final assignment on the Neural Networks course I took part in (University of Wrocław, Institute of Computer Science, winter 2015/2016), I am tasked with designing, implementing and training a neural net that would classify CIFAR-10 images with some reasonable success rate. I am also encouraged to experiment with the network by implementing some of the recent inventions that may, in one way or another, improve my network’s performance. I will be sharing my results and observations here, in this post and in a few more that will follow within the next two weeks.

The source code I am using for my experiments is available on GitHub. The sources come with a number of utilities that simplify running them on our lab’s computers, which may come in handy if you are a fellow student peeking at my progress. If you are not, you should ignore all files except the ones within the ./project directory.