How to train YOLOv2 on a custom dataset

YOLOv2 is an open-source, state-of-the-art real-time object detector written in C on the darknet deep learning framework (https://pjreddie.com/darknet/yolo/). A simple guide to reproducing the results in the YOLOv2 paper is provided on the author’s blog.

For training on a custom dataset, some elegant instructions are given in the Windows port of YOLO at https://github.com/AlexeyAB/darknet .

Here I will show a hands-on approach to training a YOLOv2 detector. (If you cannot see the images clearly, please zoom in your browser.)

Training

Change filters=125 in the last convolutional layer to filters=65, which is (5+8)*5. Here the first 5 corresponds to (x, y, w, h, objectness_score), and 8 is the number of classes (in my case I have 8 classes). The last 5 is the number of bounding-box predictions for each cell.

Change classes=20 to classes=8.
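The filter arithmetic above can be checked with a few lines of Python. This is only a sketch of the formula described in the text; the function name `region_filters` and its parameter names are my own and are not part of darknet.

```python
def region_filters(num_classes: int, num_boxes: int = 5, coords: int = 4) -> int:
    """Filters for the last convolutional layer, following the formula in the text.

    Hypothetical helper (not a darknet function). Each predicted box carries
    `coords` box values (x, y, w, h), one objectness score, and one score per class.
    """
    return (coords + 1 + num_classes) * num_boxes


print(region_filters(20))  # default 20-class cfg: (5 + 20) * 5 = 125
print(region_filters(8))   # my 8-class dataset:   (5 + 8) * 5 = 65
```

If you have a different number of classes, plug it in and use the result as the filters value in the last convolutional layer.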

This is what our cfg file looks like now.

I have a 4 GB GTX 1050 GPU in my laptop, so I set batch=64 and subdivisions=8. That way my GPU processes 64/8 = 8 images in one pass. If you have, say, 8 GB of GPU memory, you can set batch=64 and subdivisions=4, so that 64/4 = 16 images are processed per pass. This takes advantage of all of your GPU memory and speeds up training.
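The relationship between batch, subdivisions, and the number of images per pass can be sketched as below. The helper `images_per_pass` is a hypothetical name of mine, and the divisibility check is an assumption of this sketch to keep the arithmetic clean, not something the text states about darknet.

```python
def images_per_pass(batch: int, subdivisions: int) -> int:
    """Number of images the GPU handles in one pass (hypothetical helper)."""
    if subdivisions <= 0:
        raise ValueError("subdivisions must be positive")
    # Assumption of this sketch: batch splits evenly into subdivisions.
    if batch % subdivisions != 0:
        raise ValueError("batch should be divisible by subdivisions")
    return batch // subdivisions


print(images_per_pass(64, 8))  # 4 GB GPU setting from the text: 8 images
print(images_per_pass(64, 4))  # 8 GB GPU setting from the text: 16 images
```

A smaller subdivisions value means more images per pass, so it needs more GPU memory; if training runs out of memory, increasing subdivisions is the first thing to try.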

Creating *.data and *.names files

Copy the obj.names and obj.data files (which we created in the Data Preparation step with YOLO_MARK) to C:\darknet\build\darknet\x64\data
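If you prefer to script the copy step rather than doing it by hand, a minimal sketch with the Python standard library could look like this. The function `copy_dataset_files` is a hypothetical helper, and the source folder in the commented usage line is a placeholder that you must replace with wherever your YOLO_MARK files actually are.

```python
import shutil
from pathlib import Path


def copy_dataset_files(src_dir, dst_dir, names=("obj.names", "obj.data")):
    """Copy the dataset description files into the darknet data folder.

    Hypothetical helper; returns the list of destination paths.
    """
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)  # create the target folder if missing
    copied = []
    for name in names:
        src = Path(src_dir) / name
        if not src.is_file():
            raise FileNotFoundError(f"missing file: {src}")
        copied.append(Path(shutil.copy2(src, dst / name)))
    return copied


# Example call (the source folder below is a placeholder, not a real path):
# copy_dataset_files(r"<your YOLO_MARK data folder>",
#                    r"C:\darknet\build\darknet\x64\data")
```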