Stephen Z.'s Senior Project Blog!

2019 | Silicon Valley

Week Nine: Creating a Dataset

Apr 16, 2019

As I mentioned last week, I shifted my main focus to creating a model that distinguishes between different cuts of meat. I received three cameras from my internal advisor to take pictures with; however, I ran into issues with all of them. One had poor image quality, another took too long between shots, and I couldn't figure out how to take pictures with the last one at all. I ended up returning all three to my advisor.

Luckily, I had another camera (an AKASO EK7000), although it doesn't have the best image quality. I took around 2,500 pictures (~1,500 of sliced beef shank and ~1,000 of shredded beef shank, the only two cuts I had on hand) at different angles and under varying lighting conditions and backgrounds. After importing the files onto my system, I saw that the image names needed to be changed. Since all the images would eventually end up in the same directory, I wanted them named so that I could tell which cut of meat any given image showed. Fortunately, I remembered performing a similar operation during my internship, and I consulted a list of sites to fully remember how to code it. Below is a picture of my code:
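My actual script is only shown as a screenshot above, but the renaming step can be sketched in Python roughly as follows. The function name, directory path, and prefix (e.g. "sliced_beef_shank") are placeholders I chose for illustration, not necessarily what I used:

```python
import os

def rename_images(directory, prefix):
    """Rename every file in `directory` to `<prefix>_<index><ext>` so the
    cut of meat is identifiable from the filename alone."""
    # Snapshot and sort the listing first so renaming doesn't disturb iteration.
    for i, name in enumerate(sorted(os.listdir(directory)), start=1):
        ext = os.path.splitext(name)[1]  # keep the original extension
        os.rename(os.path.join(directory, name),
                  os.path.join(directory, f"{prefix}_{i}{ext}"))

# Example: rename_images("dataset/sliced", "sliced_beef_shank")
# turns a.jpg, b.jpg, ... into sliced_beef_shank_1.jpg, sliced_beef_shank_2.jpg, ...
```

With one prefix per cut, the two sets of images can safely be merged into a single directory afterwards.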

As a result, I was successfully able to complete my goal:

And then I realized I had made another mistake: the pictures were all taken in 4K, while the camera I would actually need to use was supposed to shoot in 2K. I looked up what resolution "2K" is supposed to be, but no site gave me a straight answer, since hardly anyone actually uses "2K." So I decided to keep things simple and resize all the pictures to 1920 x 1080. Since I didn't have any experience doing this, I had to search the internet a bit longer for a solution. I eventually stumbled across a site that gave me the code I wanted, and I modified it a bit to suit my purposes. Below is an image of my code:
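For reference, a batch resize like the one in the screenshot is typically done with the Pillow library. This is a minimal sketch, not my exact code; it assumes the images are JPEG/PNG files and overwrites them in place:

```python
import os
from PIL import Image

TARGET_SIZE = (1920, 1080)  # the 1080p resolution chosen above

def resize_all(directory):
    """Resize every image in `directory` to TARGET_SIZE, in place."""
    for name in os.listdir(directory):
        if not name.lower().endswith((".jpg", ".jpeg", ".png")):
            continue  # skip non-image files
        path = os.path.join(directory, name)
        with Image.open(path) as img:
            img.resize(TARGET_SIZE).save(path)
```

Note that a plain `resize` ignores aspect ratio; since 4K (3840 x 2160) and 1080p are both 16:9, nothing gets distorted here.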

I also forgot how to display the elapsed time, so I took a look at this site and added that to my code for feedback during execution.
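The timing itself is a small pattern with the standard `time` module, roughly like this sketch (the "work" line stands in for the resizing loop):

```python
import time

start = time.time()                    # record the start time
# ... the long-running work goes here (e.g. resizing all ~2,500 images) ...
elapsed = time.time() - start          # seconds since `start`
print(f"Elapsed: {elapsed:.1f} seconds")
```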

With all of that out of the way, I installed labelImg, a tool for labeling objects inside images. Below is a picture of labelImg as I was labeling my dataset:

Essentially, labelImg provides an interactive GUI that speeds up the labeling process. Normally, you would need to specify the bounding-box pixel coordinates and the label in code, but with labelImg you can simply draw the boxes. For each image, labelImg then writes an XML file containing that information. EdjeElectronics on GitHub provides a script that converts these XML files to CSV files, which another of his scripts uses to generate TFRecords, the input format for training.
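The XML files labelImg produces follow the Pascal VOC format, so the conversion step amounts to parsing out the filename, image size, class name, and box coordinates. A minimal sketch with the standard library (the function name and the sample values are mine, not from EdjeElectronics's script):

```python
import xml.etree.ElementTree as ET

def voc_to_rows(xml_path):
    """Parse one labelImg (Pascal VOC) XML file into CSV-style rows:
    (filename, width, height, class, xmin, ymin, xmax, ymax)."""
    root = ET.parse(xml_path).getroot()
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    # One <object> element per drawn bounding box.
    for obj in root.findall("object"):
        bb = obj.find("bndbox")
        rows.append((filename, width, height, obj.findtext("name"),
                     int(bb.findtext("xmin")), int(bb.findtext("ymin")),
                     int(bb.findtext("xmax")), int(bb.findtext("ymax"))))
    return rows
```

Collecting these rows across every XML file yields the CSV that the TFRecord generator consumes.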

Of course, while drawing boxes is much faster, labeling 2,500 images still takes a long time, so I am still in the midst of labeling. Once I finish, though, I will proceed to actual training and evaluation.