Category: deep learning

I encountered spatial transformer networks while reading “STN-OCR: A Single Neural Network for Text Detection and Text Recognition”, which adopts them.

This video explains very clearly how a spatial transformer works. I didn’t fully understand the interpolation equations, but the other parts were clear. At the end, the video briefly compares the spatial transformer with deformable convolutional networks, which is interesting.

I made two single-class attempts at training darkflow YOLO v2 from scratch: one produced reliable bounding boxes, and the other failed to produce even one. The successful case was a single-class ‘car’ detector; the failed one was a ‘face’ detector. The training results do not make sense, and this post documents the erratic behavior.