I'm using the pretrained model and replaced only the final fully connected layer so I could train it on our own data; all other weights are frozen during training.
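For context, in Caffe this kind of freezing is usually expressed by zeroing the per-layer learning-rate multipliers in the prototxt. A hypothetical fragment (layer names and sizes made up, not from my actual network):

```
# Hypothetical prototxt fragment: freezing a conv layer by setting its
# learning-rate multipliers to 0; only the replacement fc layer keeps lr_mult > 0.
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param { lr_mult: 0 }   # weights frozen
  param { lr_mult: 0 }   # bias frozen
  convolution_param { num_output: 32 kernel_size: 3 }
}
```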

I tried to pinpoint where it goes wrong, and the Eltwise layer 'inception_resnet_v2_a1_residual_eltwise' seems to be the first one where the Caffe output and the NCS output differ significantly (around 180% relative difference, compared to roughly 0.5% for the preceding layers). This eventually leads to consistent NaN outputs at the end of the network.
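To make the percentages above concrete, one way to quantify per-layer divergence could look like this (a minimal NumPy sketch with made-up arrays standing in for `net.blobs[...].data` from Caffe and the per-layer dump from the NCS; the function name and normalization are my own, not from any official tool):

```python
import numpy as np

# Stand-ins for the activations of one layer from Caffe and from the NCS.
caffe_out = np.array([1.0, -2.0, 0.5, 3.0])
ncs_out   = np.array([1.1, -1.8, 0.4, 3.2])

def relative_error(reference, candidate):
    """Mean absolute difference, normalized by the mean magnitude
    of the reference output, expressed as a percentage."""
    return 100.0 * np.mean(np.abs(reference - candidate)) / np.mean(np.abs(reference))

# With this metric, the healthy layers would report values around 0.5,
# and the first broken Eltwise layer would jump to around 180.
print(round(relative_error(caffe_out, ncs_out), 1))
```

Running such a check layer by layer is how the jump from ~0.5% to ~180% at the Eltwise layer showed up.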