In this study, the authors propose a new technique to achieve one-shot scanning using a static, single-colour pattern projector; such a method is ideal for the acquisition of moving objects. Since projector–camera systems generally have uncertainties in retrieving correspondences between the captured image and the projected pattern, many solutions have been proposed. In particular, for one-shot scanning, in which only a single image is required for shape reconstruction, the positional information of a pixel of the projected pattern must be encoded by spatial and/or colour information. Although colour information is frequently used for encoding, it is severely affected by the texture and material of the object and leads to unstable reconstruction. In this study, the authors solve this problem by using a geometrically unique pattern with only black and white colours, further taking into account the shape distortion caused by the surface orientation of the object. The authors' technique successfully achieves high-precision one-shot scanning with an actual system.

The authors present a novel approach to improving three-dimensional (3D) structure estimation from an image stream in urban scenes. The authors consider a particular setup, in which the camera is installed on a moving vehicle. Applying a traditional structure from motion (SfM) technique in this case yields poor estimates of the 3D structure for several reasons, such as texture-less images, small baseline variations and dominant forward camera motion. The authors' idea is to introduce the monocular depth cues that exist in a single image, and to add temporal constraints on the estimated 3D structure. The scene is modelled as a set of small planar patches obtained by over-segmentation, and the goal is to estimate the 3D positioning of these planes. The authors propose a fusion scheme that employs a Markov random field model to integrate spatial and temporal depth features. Spatial depth is obtained by learning a set of global and local image features. Temporal depth is obtained via a sparse optical-flow-based SfM approach, which decreases the estimation ambiguity by imposing constraints on camera motion. Finally, the authors apply a fusion scheme to create a unique 3D structure estimate.
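The fusion step can be illustrated with a toy sketch. Everything below (the quadratic energy, the weights, the 4×4 patch grid) is an assumption for illustration, not the authors' formulation: two per-patch depth cues are fused under an MRF-style energy with per-cue data terms plus a pairwise smoothness term, minimised by iterated conditional modes, where each quadratic update has a closed-form weighted-mean solution.

```python
import numpy as np

def fuse_depths(spatial, temporal, w_s=1.0, w_t=1.0, w_smooth=0.5, iters=20):
    """Toy MRF-style fusion of two per-patch depth cues by ICM.

    Energy per patch: w_s*(d - spatial)^2 + w_t*(d - temporal)^2
                      + w_smooth * sum over 4-neighbours of (d - d_n)^2.
    Because the energy is quadratic, each ICM update is a weighted mean.
    """
    d = (w_s * spatial + w_t * temporal) / (w_s + w_t)  # init: data-only optimum
    H, W = d.shape
    for _ in range(iters):
        for i in range(H):
            for j in range(W):
                nbrs = [d[y, x] for y, x in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                        if 0 <= y < H and 0 <= x < W]
                num = w_s * spatial[i, j] + w_t * temporal[i, j] + w_smooth * sum(nbrs)
                den = w_s + w_t + w_smooth * len(nbrs)
                d[i, j] = num / den
    return d

# toy example: one outlier in the spatial cue, smooth temporal cue
spatial = np.full((4, 4), 2.0)
spatial[1, 1] = 5.0
temporal = np.full((4, 4), 2.0)
fused = fuse_depths(spatial, temporal)
```

The smoothness term pulls the outlier patch towards its neighbours while the data terms keep consistent patches unchanged.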

A stroke segmentation method named quick penalty-based dynamic programming is proposed for splitting a sketchy stroke into several regular primitive shapes, such as line segments and elliptical arcs. The authors extend the dynamic programming framework with a customisable penalty function, which measures the correctness of splitting a stroke at a particular point. With the help of the penalty function, the proposed dynamic programming framework can complete the stroke segmentation process without any prior knowledge of the number and/or type of segments contained in the sketchy stroke. Its response time is sufficiently short for online applications, even for long strokes. Experiments show that the proposed method is robust to strokes of arbitrary shape and size.
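The penalty-based DP idea can be sketched as a segmented least-squares recursion. The code below is an illustrative reconstruction, not the authors' implementation: `fit_cost` is a hypothetical stand-in for the paper's customisable penalty and fitting terms, and only straight-line segments are fitted (the paper also handles elliptical arcs).

```python
import numpy as np

def fit_cost(pts, i, j):
    """Sum of squared perpendicular distances of pts[i..j] to the
    chord from pts[i] to pts[j] (a stand-in fitting error)."""
    p, q = pts[i], pts[j]
    v = q - p
    n = np.linalg.norm(v)
    if n == 0:
        return 0.0
    rel = pts[i:j + 1] - p
    d = np.abs(v[0] * rel[:, 1] - v[1] * rel[:, 0]) / n
    return float(np.sum(d ** 2))

def segment_stroke(pts, penalty=1.0):
    """DP over split points: cost[j] = min_i cost[i] + fit(i, j) + penalty.

    The per-segment penalty decides how costly a new split point is, so
    the number of segments need not be known in advance.
    """
    n = len(pts)
    cost = [0.0] + [np.inf] * (n - 1)
    back = [0] * n
    for j in range(1, n):
        for i in range(j):
            c = cost[i] + fit_cost(pts, i, j) + penalty
            if c < cost[j]:
                cost[j], back[j] = c, i
    splits, j = [n - 1], n - 1        # backtrack the chosen split points
    while j > 0:
        j = back[j]
        splits.append(j)
    return splits[::-1]

# toy stroke: an L-shaped polyline, expected to split at the corner
stroke = np.array([[0, 0], [1, 0], [2, 0], [3, 0],
                   [3, 1], [3, 2], [3, 3]], float)
splits = segment_stroke(stroke, penalty=0.5)
```

On this toy stroke the recursion returns the endpoints plus the corner, i.e. `[0, 3, 6]`; a larger penalty yields fewer, coarser segments.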

When objects undergo large pose changes, illumination variation or partial occlusion, most existing visual tracking algorithms tend to drift away from targets and may even fail to track them. To address this issue, in this study, the authors propose an online algorithm that combines multiple instance learning (MIL) and local sparse representation for tracking an object in a video sequence. The key idea is to model the appearance of an object by local sparse codes that can serve as training data for the MIL framework. First, local image patches of a target object are represented as sparse codes with an overcomplete dictionary, where the adaptive representation helps to overcome partial occlusion in object tracking. Then MIL trains a classifier on the sparse codes to discriminate the target from the background. Finally, results from the trained classifier are input into a particle filter framework to sequentially estimate the target state over time. In addition, to decrease the visual drift caused by accumulated errors when updating the dictionary and classifier, a two-step object tracking method combining a static MIL classifier with a dynamic MIL classifier is proposed. Experiments on publicly available benchmark video sequences show that the proposed tracker is more robust and effective than competing methods.
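The local sparse-coding step can be sketched with a greedy pursuit. The following is a minimal illustration, not the authors' solver: it computes a sparse code for a patch vector over a small invented overcomplete dictionary via orthogonal matching pursuit.

```python
import numpy as np

def omp(D, x, k=2):
    """Greedy orthogonal matching pursuit: approximate x as a sparse
    combination of at most k dictionary atoms (columns of D)."""
    residual = x.astype(float).copy()
    support = []
    coeffs = np.zeros(D.shape[1])
    for _ in range(k):
        # pick the atom most correlated with the current residual
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx not in support:
            support.append(idx)
        # re-fit coefficients on the selected support by least squares
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        coeffs[:] = 0.0
        coeffs[support] = sol
        residual = x - D @ coeffs
    return coeffs

# toy overcomplete dictionary: 4-dim patch vectors, 6 unit-norm atoms
rng = np.random.default_rng(0)
D = rng.normal(size=(4, 6))
D /= np.linalg.norm(D, axis=0)
x = 2.0 * D[:, 1] + 0.5 * D[:, 4]   # a patch built from two atoms
code = omp(D, x, k=2)
```

In a tracker along these lines, such codes for the patches of each candidate window would form the instances fed to the MIL classifier.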

This study presents a real-time refinement procedure for depth data acquired by RGB-D cameras. Data from RGB-D cameras suffer from undesired artefacts such as edge inaccuracies or holes owing to occlusions or low object remission. In this work, the authors take recent depth enhancement filters intended for time-of-flight cameras and extend them to structured-light-based depth cameras, such as the Kinect camera. Thus, given a depth map and its corresponding two-dimensional image, the authors correct the depth measurements by separately treating its undesired regions. To that end, the authors propose specific confidence maps to tackle areas in the scene that require special treatment. Furthermore, for filtering artefacts, the authors introduce the use of RGB images as guidance images, as an alternative to real-time state-of-the-art fusion filters that use greyscale guidance images. The experimental results show that the proposed fusion filter provides dense depth maps with corrected erroneous or invalid depth measurements and adjusted depth edges. In addition, the authors propose a mathematical formulation that enables the filter to be used in real-time applications.
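The RGB-guided correction can be illustrated with a minimal cross-bilateral filter. Everything below is an assumption for the sketch, not the authors' filter: the window size, the sigmas, and a binary validity mask standing in for the paper's confidence maps.

```python
import numpy as np

def joint_bilateral_depth(depth, guide, valid, radius=2, sigma_s=2.0, sigma_r=10.0):
    """Cross-bilateral filtering of a depth map using an RGB guidance image.

    Invalid depth pixels (valid == False) contribute zero weight, so holes
    are filled from spatially close pixels with similar guidance colour.
    """
    H, W = depth.shape
    out = depth.astype(float).copy()
    for i in range(H):
        for j in range(W):
            wsum, dsum = 0.0, 0.0
            for di in range(-radius, radius + 1):
                for dj in range(-radius, radius + 1):
                    y, x = i + di, j + dj
                    if not (0 <= y < H and 0 <= x < W) or not valid[y, x]:
                        continue
                    ws = np.exp(-(di * di + dj * dj) / (2 * sigma_s ** 2))
                    dc = guide[i, j] - guide[y, x]          # colour difference
                    wr = np.exp(-float(dc @ dc) / (2 * sigma_r ** 2))
                    w = ws * wr
                    wsum += w
                    dsum += w * depth[y, x]
            if wsum > 0:
                out[i, j] = dsum / wsum
    return out

# toy scene: two flat regions with distinct colours; one depth hole
guide = np.zeros((6, 6, 3))
guide[:, 3:] = 255.0                                   # left black, right white
depth = np.where(np.arange(6)[None, :] < 3, 1.0, 2.0) * np.ones((6, 6))
valid = np.ones((6, 6), bool)
valid[2, 2] = False                                    # hole in the left region
depth[2, 2] = 0.0                                      # invalid measurement
filled = joint_bilateral_depth(depth, guide, valid)
```

Because the range weight is driven by the guidance colour, the hole is filled only from its own (same-coloured) region, and the depth edge between the two regions is not blurred.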

Non-negative matrix factorisation (NMF) has been widely used in pattern recognition problems. For classification tasks, however, most existing variants of NMF incorporate neither the discriminative information nor the local geometry of the data into the factorisation. In practice, recognition accuracy is also degraded by changing environmental factors. To overcome these drawbacks, the authors regularise NMF by intra-class and inter-class fuzzy K nearest neighbour graphs, leading to NMF-FK-NN in this study. By introducing two novel fuzzy K nearest neighbour graphs, NMF-FK-NN can contract the intra-class neighbourhoods and expand the inter-class neighbourhoods in the decomposition. This method not only exploits the discriminative information and makes effective use of the geometric structure of the data, but also reduces the influence of external factors, improving recognition performance. In the factorisation, the authors minimise the approximation error whilst contracting intra-class fuzzy neighbourhoods and expanding inter-class fuzzy neighbourhoods. The authors develop simple multiplicative updates for NMF-FK-NN and present monotonic convergence results. Experiments on text clustering with the CLUTO toolkit and on face recognition with the ORL and YALE datasets show the effectiveness of the proposed method.
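For reference, the base factorisation that NMF-FK-NN extends uses the standard Lee–Seung multiplicative updates. The sketch below shows only plain NMF; the fuzzy graph regularisers of the paper would add extra neighbourhood terms to the numerators and denominators of these updates.

```python
import numpy as np

def nmf(X, r, iters=200, seed=0):
    """Plain NMF, X ≈ W @ H, by Lee-Seung multiplicative updates.

    The updates preserve non-negativity and monotonically decrease the
    Frobenius reconstruction error.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, r)) + 0.1       # strictly positive initialisation
    H = rng.random((r, n)) + 0.1
    eps = 1e-12                        # guard against division by zero
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)   # update H
        W *= (X @ H.T) / (W @ H @ H.T + eps)   # update W
    return W, H

# toy non-negative data matrix
X = np.abs(np.random.default_rng(1).random((8, 10)))
W, H = nmf(X, r=3)
err = np.linalg.norm(X - W @ H)
```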

Although there is no shortage of clustering algorithms, existing algorithms are often afflicted by problems of one kind or another. Dominant sets clustering is a graph-theoretic approach to clustering that exhibits significant potential in various applications. However, the authors' work indicates that this approach suffers from two major problems, namely an over-segmentation tendency and sensitivity to distance measures. To overcome these two problems, the authors present a density-based enhancement of dominant sets clustering, in which a cluster merging step fuses adjacent clusters from the original dominant sets result that are sufficiently close. Experiments on various datasets validate the effectiveness of the proposed method.
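A dominant set can be extracted by replicator dynamics on the affinity matrix, a standard formulation in the dominant sets literature that the sketch below illustrates; the toy affinities are invented, and the paper's density-based merging step is not shown.

```python
import numpy as np

def dominant_set(A, iters=500, eps=1e-8):
    """Extract one dominant set from affinity matrix A (zero diagonal)
    by replicator dynamics: x <- x * (A x) / (x^T A x).

    The support of the converged weight vector is one cluster.
    """
    n = A.shape[0]
    x = np.full(n, 1.0 / n)            # start from the barycentre
    for _ in range(iters):
        Ax = A @ x
        x = x * Ax / (x @ Ax)
    return x > eps

# two tight groups of nodes with weak cross-group affinity
A = np.array([
    [0.0, 0.9, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.9, 0.1, 0.1],
    [0.9, 0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.0, 0.8],
    [0.1, 0.1, 0.1, 0.8, 0.0],
])
members = dominant_set(A)
```

The dynamics concentrate all weight on the most cohesive group (the first three nodes); peeling it off and re-running yields the next cluster, which is where over-segmentation, and hence the paper's merging step, can arise.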

This study presents an approach for estimating the fisheye camera parameters using three vanishing points corresponding to three sets of mutually orthogonal parallel lines in a single image. The authors first derive three constraint equations relating the elements of the rotation matrix to the coordinates of the vanishing points. From these constraints, the rotation matrix is calculated under the assumption that the image centre is known. Experimental results on synthetic images and real fisheye images validate the method. In contrast to existing methods, the authors' method needs less image information and does not require knowledge of the three-dimensional reference point coordinates.
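The core relation between orthogonal vanishing points and the rotation matrix can be sketched under a pinhole model (a simplification: the paper addresses fisheye images, whose projection differs). Each vanishing point back-projects to a ray parallel to one column of the rotation matrix. All numbers below are invented for the synthetic check.

```python
import numpy as np

def rotation_from_vps(vps, K):
    """Recover R from the vanishing points of three mutually orthogonal
    directions: each back-projected ray K^{-1} [u, v, 1]^T is parallel
    to one column of R (pinhole sketch; signs assume the directions
    point in front of the camera)."""
    Kinv = np.linalg.inv(K)
    cols = []
    for u, v in vps:
        r = Kinv @ np.array([u, v, 1.0])
        r /= np.linalg.norm(r)
        if r[2] < 0:                       # resolve the +/- ambiguity
            r = -r
        cols.append(r)
    R = np.stack(cols, axis=1)
    U, _, Vt = np.linalg.svd(R)            # snap to the nearest orthogonal matrix
    return U @ Vt

def rot_x(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

# synthetic check: project three orthogonal directions through a known R
K = np.array([[800.0, 0, 320], [0, 800, 240], [0, 0, 1]])
R_true = rot_y(np.deg2rad(-25)) @ rot_x(np.deg2rad(35))
vps = []
for axis in np.eye(3):
    p = K @ R_true @ axis                  # homogeneous vanishing point
    vps.append(p[:2] / p[2])
R_est = rotation_from_vps(vps, K)
```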

The inspection of textile defects is challenging because of the large number of defect categories, which are characterised by imprecision and uncertainty. In this study, a novel interval type-2 fuzzy system is proposed for resolving the defect recognition problem in the textile industry. The proposed system combines interval type-2 fuzzy reasoning with a swarm optimisation algorithm in order to enhance defect classification capability. Interval type-2 fuzzy logic is powerful in handling high levels of uncertainty in the human decision-making process, including uncertainties in the measurements of textile features and in the data used to calibrate the inspection parameters. A swarm intelligence algorithm is used to optimise the parameters of the membership functions to increase the accuracy of the fuzzy controller. In addition, the problem of learning fuzzy linguistic rules is tackled by utilising an ant colony meta-heuristic to reduce the complexity of the inspection system. Excellent recognition results with this system on real textile samples are demonstrated.
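An interval type-2 membership function can be sketched with a Gaussian of uncertain mean; the bounds below follow a common textbook construction and are not the specific membership functions the swarm optimiser tunes in the paper.

```python
import numpy as np

def it2_gaussian(x, m1, m2, sigma):
    """Interval type-2 Gaussian membership with uncertain mean in [m1, m2]:
    returns the (lower, upper) membership bounds for input x."""
    def g(x, m):
        return np.exp(-0.5 * ((x - m) / sigma) ** 2)
    # upper bound: 1 inside the mean interval, nearest-mean Gaussian outside
    if m1 <= x <= m2:
        upper = 1.0
    elif x < m1:
        upper = g(x, m1)
    else:
        upper = g(x, m2)
    # lower bound: Gaussian with respect to the farther mean
    lower = min(g(x, m1), g(x, m2))
    return lower, upper

# e.g. a "medium contrast" feature whose prototype value is uncertain
lo, hi = it2_gaussian(1.0, m1=0.8, m2=1.2, sigma=0.5)
```

The width of the interval `[lo, hi]` is the extra degree of freedom type-2 systems use to absorb measurement and calibration uncertainty, and it is exactly these shape parameters that a swarm optimiser could tune.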

In this study, the authors propose a robust and highly accurate pose estimation algorithm to solve the perspective-n-point problem in real time. This algorithm does away with the distinction between coplanar and non-coplanar point configurations, and provides a unified formulation for both. Based on the inverse projection ray, an efficient collinearity model in object space is proposed as the cost function. The principal depth and the relative depths of the reference points are introduced to remove the residual error of the cost function and to improve the robustness and accuracy of the pose estimation. The authors solve for the pose information and the depths of the points iteratively by minimising the cost function, and then reconstruct the point coordinates in the camera coordinate system. Subsequently, the optimal absolute orientation solution gives the relative pose between the estimated three-dimensional (3D) point set and the 3D model point set. These two steps are repeated until the result converges. Experimental results on simulated and real data demonstrate the superior performance of the proposed algorithm: its accuracy is higher than that of state-of-the-art algorithms, and among the tested algorithms it has the best noise resistance and the least deviation under the influence of outliers.
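The absolute-orientation step on its own has a well-known closed-form SVD (Procrustes/Kabsch) solution; the sketch below shows only that step, not the full iterative depth-estimation loop, and the test data are invented.

```python
import numpy as np

def absolute_orientation(P, Q):
    """Optimal rigid motion aligning point set P to Q (rows are 3D points),
    i.e. Q ≈ (R @ P.T).T + t, via the SVD/Procrustes solution."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)              # cross-covariance of centred sets
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T)) # guard against a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cq - R @ cp
    return R, t

# synthetic check with a known rigid motion
rng = np.random.default_rng(3)
P = rng.normal(size=(6, 3))
theta = np.deg2rad(20)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0],
                   [np.sin(theta),  np.cos(theta), 0],
                   [0, 0, 1]])
t_true = np.array([0.5, -1.0, 2.0])
Q = (R_true @ P.T).T + t_true
R_est, t_est = absolute_orientation(P, Q)
```

In a pipeline along the lines described, `P` would be the model points and `Q` the points reconstructed in the camera frame from the estimated depths; the recovered `(R, t)` is then the camera pose estimate for that iteration.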

Vision-based traffic surveillance plays an important role in traffic management. However, outdoor illumination, cast shadows and vehicle variations often create problems for video analysis and processing. Thus, the authors propose a real-time cost-effective traffic monitoring system that can reliably perform traffic flow estimation and vehicle classification at the same time. First, the foreground is extracted using a pixel-wise weighting list that models the dynamic background, and shadows are discriminated utilising colour and edge invariants. Second, the foreground on a specified check-line is collected over time to form a spatial–temporal profile image. Third, the traffic flow is estimated by counting the number of connected components in the profile image. Finally, the vehicle type is classified according to the size of the foreground mask region. In addition, several traffic measures, including traffic velocity, flow, occupancy and density, are estimated based on the analysis of the segmentation. The availability and reliability of these traffic measures provide critical information for public transportation monitoring and intelligent traffic control. Since the proposed method only processes a small area close to the check-line to collect the spatial–temporal profile, the complete system is much more efficient than existing visual traffic flow estimation methods.
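The counting step can be sketched as connected-component labelling on the binary profile image; the simple flood fill and the toy profile below (rows = time, columns = position along the check-line) are invented for illustration.

```python
import numpy as np

def count_components(profile):
    """Count 4-connected foreground blobs in a binary spatial-temporal
    profile image; each blob corresponds to one vehicle crossing the line."""
    H, W = profile.shape
    seen = np.zeros((H, W), dtype=bool)
    count = 0
    for i in range(H):
        for j in range(W):
            if profile[i, j] and not seen[i, j]:
                count += 1
                stack = [(i, j)]          # flood-fill this blob
                seen[i, j] = True
                while stack:
                    y, x = stack.pop()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < H and 0 <= nx < W and profile[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            stack.append((ny, nx))
    return count

# toy profile: two vehicles pass the check-line at different positions/times
profile = np.array([
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
], dtype=bool)
vehicles = count_components(profile)
```

The blob height (rows of time it spans) is also what makes occupancy and velocity estimates possible from the same profile.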

Background suppression is vitally important for small target detection, as it enhances targets and improves the signal-to-noise ratio of small target images. Accordingly, this study proposes a background suppression approach based on a fast local reverse entropy operator, designed according to the fact that the appearance of a small target causes a large change in the value of the local reverse entropy within a local region. The operator is adopted to suppress the complex backgrounds of small target images in order to enhance small targets, thereby yielding a high probability of detection and a low probability of false alarm. Both quantitative and qualitative analyses confirm the validity and efficiency of the proposed approach.
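As a stand-in illustration (the paper's reverse entropy operator is not specified here), the sketch below computes plain local Shannon entropy in a sliding window, which likewise responds strongly when a small target disrupts a smooth background.

```python
import numpy as np

def local_entropy_map(img, radius=1, bins=8):
    """Local Shannon entropy in a sliding window (an illustrative stand-in
    for the paper's reverse entropy operator). A small bright target in a
    smooth background raises the local entropy, highlighting its region."""
    H, W = img.shape
    out = np.zeros((H, W))
    q = np.clip((img * bins).astype(int), 0, bins - 1)   # quantise intensities
    for i in range(H):
        for j in range(W):
            win = q[max(0, i - radius):i + radius + 1,
                    max(0, j - radius):j + radius + 1]
            p = np.bincount(win.ravel(), minlength=bins) / win.size
            p = p[p > 0]
            out[i, j] = -np.sum(p * np.log2(p))
    return out

# smooth background with one small bright target
img = np.full((7, 7), 0.1)
img[3, 3] = 0.9                      # the "small target"
ent = local_entropy_map(img)
```

Uniform background windows score zero, so thresholding such a map suppresses the background while keeping the target region.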