
David Stutz


Convolutional Batch Normalization for OctNets

During my master thesis I partly worked on OctNets, octree-based convolutional neural networks for efficient learning in 3D. Among other things, I implemented convolutional batch normalization for OctNets. This article briefly discusses the implementation, which will be available on GitHub.

OctNets, illustrated in Figure 1 and proposed by Riegler et al. in [], are octree-based convolutional neural networks intended for efficient learning on 3D data. A C++/CUDA implementation with Torch interface can be found on GitHub at griegler/octnet.

Figure 1: Illustration of the OctNet structure. See [] for details.

Batch normalization was introduced by Ioffe and Szegedy in [] to address the problem of internal covariate shift. In particular, they discovered that the distribution of network activations may change during training; to prevent internal covariate shift and improve training, batch normalization normalizes the activations over each training batch.

Efficiently implementing batch normalization on octrees is non-trivial because of the variable tree structure; the details are discussed below. The implementation can be found on GitHub:

Batch Normalization in OctNets

Figure 2: A network trained without batch normalization, illustrating internal covariate shift (left), and a network trained with batch normalization (right).

Internal covariate shift occurs when the distributions of network activations — for example, of a specific convolutional or fully-connected layer — change significantly during training. This is illustrated in Figure 2 (taken from []). To prevent covariate shift, the activations can be normalized. Batch normalization can be summarized as follows:
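Restating the standard formulation from []: given the activations $x^{(1)}, \ldots, x^{(N)}$ of the current batch, batch normalization computes

```latex
\mu = \frac{1}{N} \sum_{n=1}^{N} x^{(n)}, \qquad
\sigma^2 = \frac{1}{N} \sum_{n=1}^{N} \left(x^{(n)} - \mu\right)^2,
```

```latex
\hat{x}^{(n)} = \frac{x^{(n)} - \mu}{\sqrt{\sigma^2 + \epsilon}}, \qquad
y^{(n)} = \gamma\,\hat{x}^{(n)} + \beta,
```

where $\epsilon$ is a small constant for numerical stability and $\gamma$, $\beta$ are learned scale and shift parameters.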

Note that this is also referred to as convolutional batch normalization, as all activations within a layer are normalized equally — the parameters $\gamma$ and $\beta$ are also layer-wise. This is important, as regular batch normalization — where each activation is normalized independently — cannot directly be implemented for OctNets. To see this, consider the following examples:

Example. Consider a batch of two octrees with maximum depth 3 (representing volumes of size $8^3$). We can obtain the value $O[i,j,k]$ of voxel $(i,j,k)$ within the volume by traversing the octrees. For regular batch normalization, we would have parameters $\gamma_{i,j,k}$ and $\beta_{i,j,k}$ and would need to compute $\mu_{i,j,k} = \frac{1}{2}\left(x_{i,j,k}^{(1)} + x_{i,j,k}^{(2)}\right)$. However, as the tree structures of the two octrees need not be identical, we cannot really take advantage of the octree data structure. For example, voxel $(i,j,k)$ might be part of a larger leaf cell in octree $1$ — its value, and thus the computation, would be shared with $O[i + 1,j,k]$ (and possibly $6$ other voxels). However, in octree $2$, $O[i,j,k]$ and $O[i + 1,j,k]$ might lie in different cells. Thus the computation needs to be done for each voxel, $8^3$ in total; this makes the octree structure essentially useless, as not only more computation is required, but also more memory.

Convolutional batch normalization, however, avoids this problem by computing the normalization over the full volumes — the full $8^3$ voxels, in our example. This is illustrated in the following simplified 2D example:

Example. Consider two quadtrees (the 2D equivalent of octrees) of maximum depth $3$, as illustrated in Figure 3; each quadtree represents $8^2$ pixels. Then, computing $\mu$ involves summing both over batch samples and over pixels. Summing over pixels, per batch sample, can be done very efficiently. For example, considering $0 \leq i,j \leq 2$, $O[i,j]$ is constant — simplifying the computation of $\mu$ as follows:
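In general (the notation here is mine, not from []): writing $\mathcal{L}^{(n)}$ for the set of leaf cells of the $n$-th quadtree, $|c|$ for the number of pixels covered by cell $c$, and $O^{(n)}_c$ for the constant value of that cell, the mean over a batch of $N$ quadtrees with $D^2$ pixels each reduces to a sum over leaf cells:

```latex
\mu = \frac{1}{N D^2} \sum_{n=1}^{N} \sum_{i,j} O^{(n)}[i,j]
    = \frac{1}{N D^2} \sum_{n=1}^{N} \sum_{c \in \mathcal{L}^{(n)}} |c|\, O^{(n)}_c.
```

The variance $\sigma^2$ can be computed analogously, weighting each squared deviation by $|c|$.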

Thus, convolutional batch normalization can be done efficiently on quadtrees as well as octrees.

Implementation

The above example also illustrates how the presented implementation looks in practice. In more detail, the function octree_bn_stat_cpu, which computes the statistics $\mu$ and $\sigma^2$, is given in the listing below. Normalization, as well as the backward passes, can be implemented in a similar manner.

The remaining code can be found in core/src/bn.cpp; the above example, however, illustrates how batch normalization can easily be implemented on octrees.

Usage and Examples

The repository also provides a Torch/Lua example for using OctNets with batch normalization. The example is located in example/02_classification_3d/ and implements a simple classification network on a 3D toy dataset which can be generated on the fly. The model is defined as follows: