What You Get is What You See – node.js + yolo

Teaching your computer how to see just got easier with node-yolo. Created as a collaboration between the moovel lab and Alex (@OrKoN of moovel engineering), node-yolo builds upon Joseph Redmon’s Darknet neural network framework and wraps the You Only Look Once (YOLO) real-time object detection library into a convenient, web-ready node.js module. The best thing about it: it’s open source!

To put it simply, node-yolo brings new possibilities for YOLO-ing your way into computer vision. On top of the features already available in YOLO, node-yolo makes it possible to stream and analyze video over the web in real time and makes it easier to work with the detection results. At the Lab, we developed node-yolo to help us with an upcoming internet of things project (stay tuned!), and by sharing it as open source we hope it can help you too.
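To give a feel for what “working with the detection results” might look like, here is a minimal sketch of packaging per-frame detections as JSON for streaming to a browser. The shape of a detection object (`name`, `confidence`, `box`) and the helper itself are illustrative assumptions, not node-yolo’s documented API; check the project’s README for the actual output format.

```javascript
// Hypothetical detection shape (NOT node-yolo's documented API):
// { name: string, confidence: number in [0, 1], box: { x, y, w, h } }.
// Builds a JSON payload for one video frame, dropping low-confidence hits.
function toFramePayload(frameId, detections, minConfidence = 0.5) {
  return JSON.stringify({
    frame: frameId,
    objects: detections
      .filter((d) => d.confidence >= minConfidence)
      .map((d) => ({ label: d.name, confidence: d.confidence, box: d.box })),
  });
}

// Example: two detections, one below the confidence cutoff.
const payload = toFramePayload(42, [
  { name: "dog", confidence: 0.93, box: { x: 120, y: 80, w: 200, h: 150 } },
  { name: "cat", confidence: 0.31, box: { x: 10, y: 10, w: 40, h: 40 } },
]);
console.log(payload);
```

A payload like this could be pushed over a WebSocket alongside the video stream, letting the browser draw labels without re-running detection client-side.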

The YOLO library is “a state-of-the-art, real-time object detection system” that has become a household name in the world of image recognition and computer vision. What makes YOLO so new and exciting is its novel “jointly trained object detection and classification” method. YOLO can predict the category and position of roughly 9000 kinds of objects, even for classes that have no labelled detection data. And all of this works in real time!

If that sounds like a bunch of mumbo jumbo, just know that YOLO can detect a heck of a lot of objects in images, classify those objects and/or suggest what they might be, and all at 30 frames per second or more (with a CUDA GPU).

We ❤️ YOLO, but we noticed a few crucial features that we, and potentially other tinkerers, would be interested to toy with. Along with the ability to stream and analyze video over the web, we also needed a way to return the bounding boxes of classification results for the detected features in the images we were analyzing. Together with Alex, we “MacGyvered” our way into the depths of the YOLO library, remixing, restructuring, and adding to the codebase. If Alex’s karate skills are anything like his programming skills, he’s got the “five point palm exploding heart technique” equivalent of problem solving with code. After a few road bumps and seemingly insurmountable challenges, we were able to package all of these wish-list items into what is, in our opinion, a pretty friendly node.js wrapper for YOLO.
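On the bounding-box point: Darknet typically represents a box by its normalized center and size, while for drawing you usually want absolute pixel corners. The sketch below shows that standard conversion; the field names (`x`, `y`, `w`, `h`) follow Darknet’s convention but are an assumption here, so verify them against what node-yolo actually returns.

```javascript
// Convert a normalized center-format box (x, y = center, w, h = size,
// all relative to the image dimensions) into absolute pixel corners.
// Field names assume Darknet's convention; verify against node-yolo's output.
function toPixelBox(box, imageWidth, imageHeight) {
  const w = box.w * imageWidth;
  const h = box.h * imageHeight;
  return {
    left: Math.round(box.x * imageWidth - w / 2),
    top: Math.round(box.y * imageHeight - h / 2),
    right: Math.round(box.x * imageWidth + w / 2),
    bottom: Math.round(box.y * imageHeight + h / 2),
  };
}

// A detection centered in a 640x480 frame, covering half of each dimension:
const px = toPixelBox({ x: 0.5, y: 0.5, w: 0.5, h: 0.5 }, 640, 480);
// -> { left: 160, top: 120, right: 480, bottom: 360 }
```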

If you’d like to know more about how YOLO works, check out the paper Joseph Redmon and Ali Farhadi published, “YOLO9000: Better, Faster, Stronger.” The authors explain the novelty and strengths of the YOLO method for object recognition as well as its limitations. We give a “Too Long Didn’t Read” (TLDR) executive summary of their abstract here:

The authors improved their YOLO method and introduced YOLO9000 - a “real-time object detection system that can detect over 9000 object categories”

YOLO uses a “multi-scale training method...and can run at varying sizes, offering an easy tradeoff between speed and accuracy”

YOLO9000 is trained on both the COCO detection dataset and the ImageNet classification dataset, which allows YOLO9000 to predict detections for object classes that have no labelled detection data.

Despite having detection training data for only a fraction of its classes, YOLO9000 can produce “detections for more than 9000 different object categories” and still run in real time.

With the growing community around computer vision, and around the YOLO library in particular, we hope that our node-yolo package will help expand the realm of possibilities for this technology. Furthermore, we hope to engage people (e.g. artists and designers) who might not otherwise be able to interact with the latest and greatest in machine learning and computer vision. We hope you can use node-yolo for your next computer vision project!