Visual codes such as barcodes and quick response (QR) codes are the most prevalent linking elements between physical objects and digital information. They appear in numerous consumer applications such as shopping, electronic payments, ticketing, and marketing campaigns, but also in logistics and enterprise asset tracking, where they give employees access to detailed service records. Reading visual codes is also often the first step for pairing and interacting with physical appliances in various research projects in the area of pervasive computing. Yet while these codes are found almost everywhere, reading them usually requires expensive dedicated scanner devices, which hinders the original goal of easy access to information about every physical object.
Technological advancements in wearable computing and mobile computer vision may radically expand the adoption of visual codes, because smartphones, smartwatches, smartglasses, and other wearables enable instant barcode scanning on the go. Their wide accessibility, computing performance, intuitive user interfaces, and relatively low price make personal wearable computers strong competitors to traditional scanners, while they also enable new use cases for visual codes. As these devices are primarily designed for other purposes, however, they also have a few shortcomings.
In this dissertation, we describe methods to overcome these shortcomings and to add advanced features that can make wearable barcode scanning an attractive alternative to traditional barcode scanning even outside the consumer domain. We present fast and robust solutions to the following problems on computationally restricted unmodified wearable devices:
(i) Fast and robust localization of visual codes:
Current smartphone-based barcode scanning solutions require the user to hold and align the tagged object close to the camera. This is especially problematic with smartglasses and leads to lower user acceptance. On the other hand, today's wearable cameras have a high enough resolution to scan visual codes that are farther away, provided the codes are first segmented in a preprocessing step.
We propose a fast algorithm for joint 1D and 2D visual code localization in large digital images. The proposed method outperforms other solutions in terms of accuracy, is invariant to scale, orientation, and code symbology, and is more robust to blur than previous approaches. We further optimize for speed by exploiting the parallel processing capabilities of mobile graphics hardware for image processing. Fast segmentation allows for scanning multiple codes at the same time and thus helps in application scenarios where interaction with multiple objects is necessary.
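To illustrate the underlying idea (this is a simplified sketch, not the dissertation's actual algorithm; the function name and thresholds are illustrative assumptions), a cheap per-block edge-density statistic already separates code-like regions from smooth background, since both 1D and 2D codes are packed with strong black/white edges:

```python
import numpy as np

def locate_code_regions(gray, block=16, thresh=0.15):
    """Flag image blocks whose edge density suggests a visual code.

    Barcodes and QR codes contain many strong black/white transitions,
    so the mean gradient magnitude per block is a cheap first-pass cue
    for segmenting candidate regions before running a full decoder.
    """
    g = gray.astype(float) / 255.0
    gx = np.abs(np.diff(g, axis=1, prepend=g[:, :1]))  # horizontal gradients
    gy = np.abs(np.diff(g, axis=0, prepend=g[:1, :]))  # vertical gradients
    mag = gx + gy
    hb, wb = mag.shape[0] // block, mag.shape[1] // block
    tiles = mag[:hb * block, :wb * block].reshape(hb, block, wb, block)
    return tiles.mean(axis=(1, 3)) > thresh            # boolean block map
```

Connected True blocks in the returned map can then be grouped into candidate bounding boxes and handed to the decoder, which is what allows several codes in one frame to be processed at once.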
(ii) Fast and robust compensation of motion blur:
In contrast to laser scanners, wearable cameras often suffer from motion blur when the camera or the code undergoes even slight motion during scanning, which can render the codes unreadable. We build upon existing work in photograph deblurring and develop a fast algorithm for scanning motion-blurred QR codes on mobile devices.
We exploit the fact that QR codes do not need to be visually pleasing for decoding and propose a fast restoration-recognition loop that leverages the special structure of QR codes. In our optimization scheme, we interweave blind blur estimation from the edges of the code with image restoration regularized by typical properties of QR code images. Our proposed restoration algorithm is on par with the state of the art in quality while being about an order of magnitude faster. We also propose combining blur estimation from image edges with blur estimation from built-in inertial sensors to make the restoration even faster. Fast blur compensation means that no precise code alignment is required; the user can simply swipe the camera in front of the code.
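The shape of such a restoration-recognition loop can be sketched as follows. This is a minimal illustration only: it substitutes non-blind Wiener deconvolution over a set of hypothesized horizontal motion kernels for the dissertation's regularized blind estimation, and all names and parameters are assumptions made for the example:

```python
import numpy as np

def wiener_deblur(blurred, kernel, nsr=1e-3):
    """Non-blind Wiener deconvolution in the frequency domain."""
    H = np.fft.fft2(kernel, s=blurred.shape)
    G = np.fft.fft2(blurred)
    F = np.conj(H) * G / (np.abs(H) ** 2 + nsr)
    return np.real(np.fft.ifft2(F))

def motion_kernel(length, shape):
    """Horizontal box kernel modelling linear camera motion."""
    k = np.zeros(shape)
    k[0, :length] = 1.0 / length
    return k

def restoration_recognition_loop(blurred, try_decode, max_len=15):
    """Hypothesize a blur, restore, and immediately attempt decoding.

    The decoder's verdict, not visual quality, is the stopping
    criterion: as soon as the binarized restoration decodes, we stop.
    """
    for length in range(1, max_len + 1):
        restored = wiener_deblur(blurred, motion_kernel(length, blurred.shape))
        result = try_decode(restored > restored.mean())
        if result is not None:
            return result
    return None
```

In the dissertation's setting, try_decode would be a standard QR decoder, and readings from the built-in inertial sensors would narrow the set of blur hypotheses that the loop has to test.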
(iii) Fast and robust recognition of in-air hand gestures:
Wearable devices have limited input capabilities; interaction is usually limited to a few buttons or slim touchpads. We add natural hand gesture input to our wearable computers and, indirectly, also to other appliances in our smart environment: appliances that can be automatically recognized through tiny visual codes and that can 'outsource' their own user interface to our wearable devices.
We present a machine learning technique to recognize hand gestures with only a single monocular camera, as found on off-the-shelf mobile devices. The algorithm robustly recognizes a wide range of in-air gestures and runs in real time on unmodified wearable devices. We further show that, with little modification, our method can not only classify the gesture but also regress the distance of the hand from the camera. 3D in-air gesture control allows hands-free scanning with smartglasses, which brings many advantages in enterprise scenarios. Furthermore, through user interface outsourcing, it also enables expressive vision-based gesture control even for appliances that do not possess a camera of their own.
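As a toy illustration of frame-wise gesture classification (the descriptor, class names, and classifier below are assumptions made for the example; the dissertation's actual learned model is more capable), even a scale-invariant radial histogram of the segmented hand silhouette, paired with a nearest-centroid classifier, can separate coarse hand shapes:

```python
import numpy as np

def shape_features(mask):
    """Scale-invariant radial-distance histogram of a binary hand mask."""
    ys, xs = np.nonzero(mask)
    r = np.hypot(ys - ys.mean(), xs - xs.mean())
    r = r / (r.mean() + 1e-9)          # normalize out hand size / distance
    hist, _ = np.histogram(r, bins=8, range=(0, 3), density=True)
    return hist

class NearestCentroidGestures:
    """Minimal classifier: one feature centroid per gesture class."""
    def fit(self, masks, labels):
        feats = np.array([shape_features(m) for m in masks])
        labels = np.array(labels)
        self.classes_ = np.unique(labels)
        self.centroids_ = np.array(
            [feats[labels == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, mask):
        d = np.linalg.norm(self.centroids_ - shape_features(mask), axis=1)
        return self.classes_[int(np.argmin(d))]
```

Because the descriptor is normalized by the mean radius, it responds to hand shape rather than apparent size; recovering size separately is, loosely, what allows the distance of the hand from the camera to be regressed as well.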
Along with the dissertation, we created several showcase scenarios and demonstrators of our contributions. All proposed algorithms have been designed and implemented to be compatible with various platforms and device families (e.g., PCs, tablets, smartphones, smartwatches, smartglasses), with their resource constraints in mind. The solutions presented in this dissertation push forward the state of the art in terms of accuracy, speed, and robustness, and thus help make wearable barcode scanning a promising alternative to traditional barcode scanning.