Training YOLOv2 using Custom Images

Prerequisites

Note that most of this tutorial assumes you are using a Debian-based Linux distribution such
as Ubuntu or Linux Mint. For RPM-based distros, the commands will be very similar; it’s often
just a matter of looking up the package names and using yum or dnf instead of apt-get.

Unfortunately, I don’t do any development on Windows or Mac, so I am unfamiliar with how
the process would work on those platforms. Feel free to leave your advice for those users.

CUDA (Optional)

This step is only applicable if you have an Nvidia GPU. If not, you can skip to the Darknet installation.

For those that don’t know, here is Nvidia’s description of CUDA:

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing
on graphical processing units (GPUs). With CUDA, developers are able to dramatically speed up computing
applications by harnessing the power of GPUs.

In my experience, they aren’t overselling it. Installing CUDA will greatly decrease your training
times. Fortunately, unlike their Linux drivers, Nvidia makes CUDA very easy to install.
Head over to the CUDA downloads page and follow the instructions. For the path of least resistance, choose
the precompiled binaries (.deb, .rpm, etc.).

cuDNN (Optional)

Like CUDA, this library will greatly improve the speed of your training. The installation process is only
slightly harder than the CUDA one. Head over to the cuDNN download page
and sign up to be a member. Once you finish their survey and agree to the terms, you will be presented with
a large list of cuDNN versions to choose from. The versions closer to the top are the most up-to-date,
but you want to choose the version that corresponds to your CUDA version. To check your installed CUDA version,
run nvcc --version from the command line.

On Ubuntu, I suggest installing the Runtime and Developer libraries. If you’re on another platform, Nvidia
provides a very detailed installation guide which should be the second item in the list that appeared after
choosing your cuDNN version.

Darknet

Alright, time for the fun part! If you’ve successfully completed the steps so far, this step will be a piece
of cake.

Open the terminal and navigate to the folder where you would like to install darknet.

Clone the repo: git clone https://github.com/pjreddie/darknet

Enter the repo folder: cd darknet

Edit the Makefile

On line 3: change OPENCV=0 to OPENCV=1

If you installed CUDA and cuDNN:

On line 1: change GPU=0 to GPU=1

On line 2: change CUDNN=0 to CUDNN=1

Build the source: make
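
Putting the steps above together, the whole installation can be sketched as one short shell session (the sed edits assume the Makefile still has its default GPU=0, CUDNN=0, and OPENCV=0 flags at the top):

```shell
# Clone and enter the Darknet repository
git clone https://github.com/pjreddie/darknet
cd darknet

# Enable OpenCV, and (if you installed them) CUDA and cuDNN
sed -i 's/^OPENCV=0/OPENCV=1/' Makefile
sed -i 's/^GPU=0/GPU=1/' Makefile
sed -i 's/^CUDNN=0/CUDNN=1/' Makefile

# Build
make
```

If you skipped CUDA and cuDNN, simply leave out the GPU and CUDNN sed lines.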

Congratulations! You’re now ready to start collecting your data.

Data Collection

Since we are training a computer vision model, a lot of data is required. Generally, you need at
least 300 images per class, but shoot for something in the 500 to 1000 range if you’re expecting
a refined model.
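
The labeling tool used below is AlexeyAB’s Yolo_mark. If you don’t have it yet, a minimal sketch of fetching and building it (this assumes cmake and OpenCV are already installed; check the Yolo_mark README if the build differs on your system):

```shell
# Fetch and build Yolo_mark, AlexeyAB's labeling tool
git clone https://github.com/AlexeyAB/Yolo_mark
cd Yolo_mark
cmake .
make
```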

Before running Yolo_mark (the labeling tool used in this tutorial), you’re going to want to change some configuration settings. From its README:

Delete all files from directory x64/Release/data/img

Put your .jpg-images to this directory x64/Release/data/img

Change number of classes (objects for detection) in file x64/Release/data/obj.data

Put names of objects, one for each line in file x64/Release/data/obj.names
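
As a concrete illustration, obj.data for a hypothetical two-class model might look like this (the class count and paths below are examples; backup is where Darknet will save weights during training):

```
classes = 2
train = data/train.txt
names = data/obj.names
backup = backup/
```

obj.names would then contain one class name per line, e.g. a made-up exit_sign on the first line and door on the second.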

Now that you’re ready to go, run ./linux_mark.sh to start Yolo_mark. The UI is relatively intuitive and
it should be easy to start labeling your classes.

Assemble Data and Configuration Files

In my experience, Darknet is very particular about where you put your files, so follow closely.

Move Data and Config Files

In x64/Release/data/img, you should have all of your images and an accompanying .txt file. Copy all
those files to /path/to/darknet/data/img.

In x64/Release/data, you should have train.txt, obj.names, and obj.data. Copy those files to
/path/to/darknet/data.

In x64/Release, you should have yolo-obj.cfg. Copy that file to /path/to/darknet.
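
The three moves above can be sketched as shell commands (replace /path/to/darknet with your actual Darknet checkout; the x64/Release paths are relative to your Yolo_mark folder):

```shell
# Images plus their accompanying .txt label files
cp x64/Release/data/img/* /path/to/darknet/data/img/

# Training list and object metadata
cp x64/Release/data/train.txt x64/Release/data/obj.names x64/Release/data/obj.data /path/to/darknet/data/

# Network configuration
cp x64/Release/yolo-obj.cfg /path/to/darknet/
```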

Modify Config Files

At this point, you can choose which version of YOLO you want to train. There are two options: Regular and Tiny.
According to the Darknet site, the regular network runs at ~40 FPS on a Titan X whereas the tiny network runs at
~200 FPS. The tiny network simply sacrifices accuracy for speed. Unless you plan on using your model on a mobile or
embedded device, the regular network should be fine.

Regular YOLO

Set the number of classes you’re training on line 230 of yolo-obj.cfg

Set the filters value to (classes + 5)*5 on line 224 of yolo-obj.cfg

Tiny YOLO

Copy tiny-yolo-voc.cfg from /path/to/darknet/cfg and rename it to tiny-yolo-obj.cfg

Set the number of classes you’re training on line 120 of tiny-yolo-obj.cfg

Set the filters value to (classes + 5)*5 on line 114 of tiny-yolo-obj.cfg
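
To make the filters arithmetic concrete, here is the computation in shell for a hypothetical two-class model (the final *5 comes from YOLOv2’s five anchor boxes):

```shell
classes=2                          # hypothetical class count
filters=$(( (classes + 5) * 5 ))   # (2 + 5) * 5
echo "$filters"                    # prints 35
```

So for two classes you would write filters=35 in the .cfg file.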

Start Training!
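
With everything in place, training is started with Darknet’s detector train command. A sketch, run from your Darknet folder: darknet19_448.conv.23 is the pretrained convolutional weights file commonly used to initialize YOLOv2 training (downloadable from the Darknet site); adjust the filenames to match your setup.

```shell
# Train on your custom data, starting from pretrained convolutional weights
./darknet detector train data/obj.data yolo-obj.cfg darknet19_448.conv.23
```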

When to stop training

I want to credit this article for this information.
I simply changed some wording for easier understanding.

AlexeyAB has a very informative description
explaining when you should stop training the model. The key signal is the average loss (error) value
reported after every training iteration; it should be as low as possible.

YOLOv2 is configured so that the model’s weights are saved into the backup folder at iterations 100, 200, 300, 400, 500,
and from then on at every multiple of 1000. If training is ever interrupted, you can continue training
from the last saved .weights file.
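
A sketch of resuming from a checkpoint (the _1000 suffix is just an example; substitute the latest file in your backup folder):

```shell
# Resume training from the most recent saved weights
./darknet detector train data/obj.data yolo-obj.cfg backup/yolo-obj_1000.weights
```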

I trained my model on emergency exit signs and here is a screenshot of it running on a Samsung Galaxy S7!

Conclusion

In reality, training YOLOv2 on custom data is quite easy, but the information describing the process is hard to come
by. I hope this tutorial was helpful and perhaps saved you some time while figuring out how to train your own model.