The classic approach to geolocating aerial photographs is to bind their local coordinates to a geographic coordinate system using GPS and IMU data. Geolocation in a jammed navigation field is also of practical interest. In this paper we consider an approach to visual localization relative to a vector road map without GPS. We propose a geolocalization algorithm that detects line segments in the image and searches for the geometric transformation that best maps the detected segments onto the line segments of the road map. We assume that IMU and altimeter data are still available, which allows us to work with orthorectified images. The problem is thus reduced to a search for a transformation that consists of an arbitrary shift and bounded rotation and scaling relative to the vector map. These parameters are estimated with RANSAC by matching straight line segments from the image to vector map segments. We also investigate how the stability of the proposed algorithm is influenced by the segment coordinates (two spatial and one angular).
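
The RANSAC step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that points are stored as complex numbers, that a single segment-to-segment correspondence fixes the whole similarity transform, and that an image segment counts as an inlier when both transformed endpoints fall near the endpoints of a map segment. The function names, the bounds and the tolerance are hypothetical.

```python
import cmath
import math
import random

def transform_from_pair(img_seg, map_seg):
    """Similarity z -> a*z + b that maps the endpoints of img_seg onto map_seg."""
    p1, p2 = img_seg
    q1, q2 = map_seg
    a = (q2 - q1) / (p2 - p1)   # |a| is the scale, arg(a) is the rotation
    b = q1 - a * p1             # shift
    return a, b

def count_inliers(a, b, img_segs, map_segs, tol):
    """Count image segments whose transformed endpoints land on a map segment."""
    n = 0
    for p1, p2 in img_segs:
        t1, t2 = a * p1 + b, a * p2 + b
        if any(abs(t1 - q1) < tol and abs(t2 - q2) < tol for q1, q2 in map_segs):
            n += 1
    return n

def ransac_similarity(img_segs, map_segs, scale_range=(0.8, 1.25),
                      max_rot=math.radians(20), tol=1.0, iters=500, seed=0):
    """Return (inlier count, a, b) for the best hypothesis found."""
    rng = random.Random(seed)
    best = (0, None, None)
    for _ in range(iters):
        a, b = transform_from_pair(rng.choice(img_segs), rng.choice(map_segs))
        # Reject hypotheses outside the bounded scale and rotation ranges.
        if not scale_range[0] <= abs(a) <= scale_range[1]:
            continue
        if abs(cmath.phase(a)) > max_rot:
            continue
        n = count_inliers(a, b, img_segs, map_segs, tol)
        if n > best[0]:
            best = (n, a, b)
    return best
```

Real road segments are often only partially visible, so a practical inlier test would probably compare segment direction and distance to the supporting line rather than endpoints; the endpoint test is used here only to keep the sketch short.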

In this paper, we introduce a slant detection method based on the Fast Hough Transform and demonstrate its application in an industrial system for Russian passport recognition. About 1.5% of documents of this kind are printed in a slanted or italic font. This reduces the recognition rate, because optical recognition systems are normally designed to process upright fonts. Our method uses the Fast Hough Transform to analyse the vertical strokes of characters, which are extracted using the x-derivative of a text line image. To improve the quality of the detector we also introduce field grouping rules. The resulting algorithm achieves high detection quality. Almost all errors of the considered approach occur on passports with nonstandard fonts, while the slant detector itself works correctly.
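
The idea of finding the slant from the x-derivative of a text line can be sketched as follows. This is an illustration under stated assumptions, not the published method: a brute-force search over a few candidate shears stands in for the Fast Hough Transform, the image is a list of rows of gray values, and the function name and the scoring rule (sum of squared projection values) are hypothetical.

```python
def estimate_slant(image, candidates):
    """Return the shear (columns per row) that best aligns the character strokes.

    image: list of rows of gray values; candidates: shear values to try.
    """
    height, width = len(image), len(image[0])
    # The horizontal derivative emphasises the left and right edges of strokes.
    deriv = [[abs(row[x + 1] - row[x]) for x in range(width - 1)] for row in image]

    def score(shear):
        acc = [0.0] * (width - 1)
        for y in range(height):
            offset = round(shear * y)
            for x in range(width - 1):
                src = x + offset
                if 0 <= src < width - 1:
                    acc[x] += deriv[y][src]
        # Sharp peaks in the sheared projection mean the strokes are aligned.
        return sum(v * v for v in acc)

    return max(candidates, key=score)
```

The sign of the returned shear depends on the direction of the row axis, so whether a positive value means a right-leaning font has to be fixed by convention.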

An iterative algorithm is proposed for blind multi-image deblurring of binary images. Binarity is the only prior restriction imposed on the image. The image formation model assumes convolution with an arbitrary kernel and the addition of a constant value. The penalty functional is composed using the binarity constraint for regularization. The algorithm estimates the original image and the distortion parameters by alternately reducing two parts of this functional. Experimental results for natural (non-synthetic) data are presented.

Errors in a tomographic image may lead to misdiagnosis when computed tomography (CT) is used in medicine, or to a wrong decision about the parameters of technological processes when CT is used in industrial applications. These errors have two main causes. First, errors occur at the measurement step, e.g. through incorrect calibration and estimation of the geometric parameters of the set-up. The second cause is the nature of the tomographic reconstruction step. At this stage, a mathematical model for calculating the projection data is created. The optimization and regularization methods applied, along with their numerical implementations, have their own specific errors. Many research teams are currently trying to analyze these errors and establish the relations between error sources. In this paper, we do not analyze the nature of the final error, but present a new approach for calculating its distribution in the reconstructed volume. We hope that visualizing the error distribution will allow experts to clarify the medical report or expert summary they give after analyzing CT results. To illustrate the efficiency of the proposed approach we present results for both simulated and real data.

This paper explores a layer-by-layer training method for neural networks that use approximate calculations and/or low-precision data types. The proposed method improves recognition accuracy using standard training algorithms and tools. At the same time, it speeds up neural network calculations through fast approximate calculations and compact data types. We consider 8-bit fixed-point arithmetic as an example of such an approximation for image recognition problems. Finally, we show a significant accuracy increase for the considered approximation along with a processing speedup.
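
To make the 8-bit fixed-point approximation concrete, here is a small sketch of the arithmetic only; the layer-by-layer training procedure is not shown. The number of fractional bits, the saturation rule and the function names are assumptions chosen for illustration.

```python
def quantize(value, frac_bits=6):
    """Round to signed 8-bit fixed point with `frac_bits` fractional bits, saturating."""
    q = round(value * (1 << frac_bits))
    return max(-128, min(127, q))

def fixed_dot(weights_q, inputs_q, frac_bits=6):
    """Integer dot product with a wide accumulator, rescaled to a float at the end."""
    acc = sum(w * x for w, x in zip(weights_q, inputs_q))
    return acc / float(1 << (2 * frac_bits))
```

For example, with 6 fractional bits the value 0.5 is stored as 32, and anything above roughly 1.98 saturates to 127. This clipping and rounding is the kind of error that training with the approximation in place is meant to compensate.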

In this paper, we propose an expansion of convolutional neural network (CNN) input features based on the Hough Transform. We perform morphological contrasting of the source image followed by the Hough Transform, and then use the result as input for some of the convolutional filters. Thus, the computational complexity of the CNN and its number of units are not affected. Morphological contrasting and the Hough Transform are the only additional computational expenses of the introduced input feature expansion. The proposed approach is demonstrated on a CNN with a very simple structure. We considered two image recognition problems: object classification on CIFAR-10 and printed character recognition on a private dataset of symbols taken from Russian passports. Our approach gives a noticeable accuracy improvement with little computational effort, which can be extremely important in industrial recognition systems or in difficult problems that use CNNs, such as pressure ridge analysis and classification.

In this work we describe an approach to real-time image search in large databases that is robust to a variety of query distortions such as lighting changes, projective distortions and digital noise. The approach is based on the extraction of keypoints and their descriptors, random hierarchical clustering trees for preliminary search, and RANSAC for search refinement and result scoring. The algorithm is implemented in the Snapscreen system, which determines a TV channel and a TV show from a picture taken with a mobile device. The implementation is enhanced by a preceding localization of the screen region. Results on real-world data are presented for different modifications of the system.

In this paper we propose a novel method for localization based on matching two stereo images. It is based on minimizing the sum of squared distances between each 3D point and its four corresponding 3D rays. The method shows good results for practical localization purposes. Moreover, it is robust to the presence of feature point correspondences with zero disparity, which are usually a problem for classical methods. The algorithm is tested against the classical method. Its complexity is linear in the number of feature point correspondences.
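
The cost being minimized can be written down in a few lines. The sketch below only evaluates the sum of squared point-to-ray distances for given points and rays; it treats each ray as an infinite line and does not show how the camera pose or the 3D points are optimized. The data layout and function names are assumptions.

```python
def squared_distance_to_ray(point, origin, direction):
    """Squared distance from a 3D point to the line through `origin` along `direction`."""
    v = [p - o for p, o in zip(point, origin)]
    dd = sum(d * d for d in direction)
    proj = sum(a * d for a, d in zip(v, direction))
    return sum(a * a for a in v) - proj * proj / dd

def localization_cost(points, rays_per_point):
    """Sum over all points of the squared distances to their corresponding rays.

    rays_per_point[i] is a list of (origin, direction) pairs for points[i].
    """
    return sum(squared_distance_to_ray(p, o, d)
               for p, rays in zip(points, rays_per_point)
               for o, d in rays)
```

Each correspondence adds a fixed number of terms to the sum, which is consistent with the linear complexity stated in the abstract.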

Obtaining high-quality images from computed tomography (CT) is important for correct image interpretation. In this paper, we propose novel procedures for the quantitative description of the degree of artifact expressiveness in CT images, and show that metrics of this type make it possible to assess the dynamics of image degradation. We perform different image reconstruction tests to analyse our approach, and the results confirm the usefulness of the proposed method. We conclude that the proposed estimates make it possible to move from image quality assessment based on visual scoring to a quantitative approach, and consequently to support a CT setup that provides high-quality reconstructed images through appropriate changes of the reconstruction parameters or algorithms.

In this paper we study the multiple reflection effect in a fold of material with regard to the color constancy problem. Namely, we consider light source chromaticity estimation from the perceived material color. We measured the relative spectra of reflected light source emission at different positions within folds. The experiment was performed on 105 fabric samples. Using these data we discuss the applicability of different spectral models for describing the observed chromaticity deviation in different areas of a fold. The experimental data have been released in open access.

This paper presents a method for automatic compensation of radial distortion in video from an unknown camera. The proposed algorithm estimates the distortion parameters by analyzing a sequence of video frames. It does not require any calibration objects, but assumes that the original scene contained straight lines. The method looks for the radial distortion correction that makes lines look straighter overall. To estimate the overall curvature of the lines we propose to use the fast Hough transform, without actually detecting the lines in the image. The proposed algorithm has been tested on real data.
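
The principle of choosing the correction that makes lines straightest can be sketched with explicit point sets. This is a simplified stand-in, not the published algorithm: it assumes a one-parameter division model with the distortion centre at the origin, it measures straightness on already extracted curves instead of using the fast Hough transform, and all names are hypothetical.

```python
import math

def undistort(points, k):
    """One-parameter division model: p_u = p_d / (1 + k * |p_d|^2)."""
    return [(x / (1 + k * (x * x + y * y)), y / (1 + k * (x * x + y * y)))
            for x, y in points]

def distort(points, k):
    """Exact inverse of `undistort` (k != 0); used only to build synthetic data."""
    out = []
    for x, y in points:
        r = math.hypot(x, y)
        if r == 0:
            out.append((x, y))
            continue
        rd = (1 - math.sqrt(1 - 4 * k * r * r)) / (2 * k * r)
        out.append((x * rd / r, y * rd / r))
    return out

def line_residual(points):
    """Smallest eigenvalue of the 2x2 scatter matrix: zero for collinear points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    return (sxx + syy) / 2 - math.hypot((sxx - syy) / 2, sxy)

def estimate_k(curves, candidates):
    """Pick the k whose correction brings all curves closest to straight lines."""
    return min(candidates,
               key=lambda k: sum(line_residual(undistort(c, k)) for c in curves))
```

Lines through the distortion centre stay straight under any radial distortion, so only off-centre lines carry information about k.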

Neural network calculations for image recognition problems can be very time consuming. In this paper we propose three methods for increasing neural network performance on SIMD architectures. SIMD extensions, which are available on a number of modern CPUs, are one way to speed up neural network processing. In our experiments, we use ARM NEON as an example of a SIMD architecture. The first method uses the half-precision float data type for matrix computations. The second method uses a fixed-point data type for the same purpose. The third method is a vectorized implementation of activation functions. For each method we set up a series of experiments with convolutional and fully connected networks designed for image recognition tasks.

This work considers UAV (unmanned aerial vehicle) tracking on the basis of onboard observations of natural landmarks, including azimuth and elevation angles. It is assumed that the UAV's cameras can capture the angular position of reference points and measure the angles of the line of sight. Such measurements contain the real position of the UAV in implicit form, so a nonlinear filter such as the extended Kalman filter (EKF) must be used to apply these measurements to UAV control. It was recently shown that a modified pseudomeasurement method can be used to control a UAV on the basis of observations of reference points assigned along the UAV path in advance. However, using such a set of points requires a cumbersome recognition procedure and a large volume of on-board memory. Natural landmarks that serve as such reference points and can be determined on-line can significantly reduce the on-board memory requirements and the computational difficulties. The principal difference of this work is the use of the 3D coordinates of the reference points, which allows the position of the UAV to be determined more precisely and thus the UAV to be guided along the path with higher accuracy, which is extremely important for the successful performance of autonomous missions. The article proposes the new RANSAC for ISOMETRY algorithm and the use of recently developed estimation and control algorithms for tracking a given reference path under external perturbations and noisy angular measurements.

Demosaicing is the process of reconstructing a full-color image from a Bayer mosaic, which is used in digital cameras for image formation. This problem is usually treated as an interpolation problem. In this paper, we propose to treat demosaicing as the problem of solving an underdetermined system of algebraic equations using regularization methods. We consider regularization with the standard l1/2-, l1- and l2-norms and their effect on image reconstruction quality. The experimental results show that the proposed technique can both be used in existing methods and serve as the basis for new ones.
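
A one-dimensional toy problem shows what solving an underdetermined system with regularization looks like. The sketch assumes a single colour channel along one image row in which only some pixels are measured, and an l2 penalty on neighbouring differences; the choice of regularized quantity, the Gauss-Seidel solver and all names are assumptions, and the l1/2 and l1 variants are not shown.

```python
def reconstruct(samples, observed, lam=0.01, sweeps=200):
    """Minimise sum over observed i of (x_i - b_i)^2 + lam * sum_i (x_{i+1} - x_i)^2.

    samples: measured values b_i (ignored where observed[i] is False).
    The minimum is found by Gauss-Seidel sweeps over the coordinates.
    """
    n = len(samples)
    x = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            neighbours = [x[j] for j in (i - 1, i + 1) if 0 <= j < n]
            data_w = 1.0 if observed[i] else 0.0
            # Setting the partial derivative to zero gives a weighted average.
            x[i] = ((data_w * samples[i] + lam * sum(neighbours))
                    / (data_w + lam * len(neighbours)))
    return x
```

Without the regularization term the unobserved pixels would be completely free, which is what makes the system underdetermined; with it, they settle between their measured neighbours.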

In this paper, we analyze the properties of dyadic patterns. These patterns were proposed to approximate line segments in the fast Hough transform (FHT). Initially, these patterns had only a recursive computational scheme. We provide a simple closed-form expression for calculating the point coordinates and their deviation from the corresponding ideal lines.
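
For orientation, the recursive scheme mentioned above can be sketched as follows. This follows one common formulation of dyadic patterns for mostly horizontal lines, in which a pattern of width n and total shift t is built from two copies of the half-width pattern; it is an assumption that this matches the paper's exact definition, and the paper's closed-form expression is not reproduced here.

```python
def dyadic_pattern(n, t):
    """Vertical offsets y(x), x = 0..n-1, of the dyadic pattern with total shift t.

    n must be a power of two and 0 <= t < n.
    """
    if n == 1:
        return [0]
    half = dyadic_pattern(n // 2, t // 2)
    shift = t - t // 2          # equals ceil(t / 2)
    return half + [y + shift for y in half]

def max_deviation(n, t):
    """Largest vertical distance from the ideal line through (0, 0) and (n - 1, t), n >= 2."""
    pattern = dyadic_pattern(n, t)
    return max(abs(y - t * x / (n - 1)) for x, y in enumerate(pattern))
```

For n = 4 and t = 2 the recursion gives the offsets 0, 1, 1, 2, and `max_deviation` measures how far such a pattern strays from the straight segment it approximates.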

Artifacts arising from incorrect reconstruction (known as metal-like artifacts) may obscure or simulate pathology in medical applications, and hide or mimic cracks and cavities in scanned objects in industrial tomography. One of the main causes of such artifacts is photon starvation on the rays that pass through highly absorbing regions. We introduce a way to suppress such artifacts in reconstructions using a soft penalty that mimics linear inequalities on the photon-starved rays. An efficient algorithm that uses this information is provided, and the effect of these inequalities on the reconstruction quality is studied.
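
One possible form of a soft penalty that mimics linear inequalities is a quadratic hinge, sketched below. The abstract does not give the exact functional, so the direction of the inequality (a lower bound on the ray sum), the quadratic shape, the weight and the function name are all assumptions.

```python
def soft_inequality_penalty(x, rays, lower_bounds, weight=1.0):
    """Quadratic hinge penalty for violating a_i . x >= t_i on photon-starved rays.

    x: current reconstruction as a flat list of voxel values.
    rays: one row of system-matrix weights a_i per starved ray.
    lower_bounds: the threshold t_i for each of those rays.
    """
    total = 0.0
    for a, t in zip(rays, lower_bounds):
        projection = sum(ai * xi for ai, xi in zip(a, x))
        violation = max(0.0, t - projection)
        total += weight * violation * violation
    return total
```

The penalty is zero whenever the inequality holds, so rays that already satisfy their bound do not influence the reconstruction.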

The growing adoption of intelligent transportation systems (ITS) and autonomous driving requires robust real-time solutions for various event and object detection problems. Most real-world systems still cannot rely on computer vision algorithms alone and employ a wide range of costly additional hardware such as LIDARs. In this paper we explore the engineering challenges encountered in building a highly robust visual vehicle detection and classification module that works under a broad range of environmental and road conditions. The resulting technology is competitive with traditional non-visual means of traffic monitoring. The main focus of the paper is on software and hardware architecture, algorithm selection and domain-specific heuristics that help the computer vision system avoid implausible answers.

In this paper we consider the task of finding information fields within a document with a flexible form, using the credit card expiration date field as an example. We discuss the main difficulties and suggest possible solutions. In our case the task has to be solved on mobile devices, so the computational complexity has to be as low as possible. We provide the results of an analysis of the suggested algorithm. The error distribution of the recognition system shows that the suggested algorithm solves the task with the required accuracy.

This paper describes a method for real-time object detection based on a hybrid of a Viola-Jones cascade with a convolutional neural network. This scheme allows flexible trade-offs between detection quality and computational performance. We also propose a generalization of this method to multispectral images that effectively and efficiently utilizes information from each spectral channel. The new scheme is experimentally compared to traditional Viola-Jones, showing improved detection quality with adjustable performance.

Periodic patterns are often present in document images as holograms, watermarks or guilloche elements, which are mostly used for fraud protection. Localizing such patterns lets an embedded OCR system vary its settings depending on pattern presence in particular image regions, and improves the precision of pattern removal so as to preserve as much useful data as possible. Many noise detection and removal methods for document images deal with unstructured noise or clutter on documents with a simple background. In this paper we propose a method for periodic pattern localization in document images that uses the discrete Fourier transform and works well on documents with a complex background.
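
The use of the discrete Fourier transform for spotting periodic regions can be illustrated on a single image row. The sketch is an assumption-laden simplification of the idea, not the published method: it works in one dimension, uses a naive DFT, scores each window by the ratio of the largest spectral magnitude to the mean magnitude, and the window size and threshold are arbitrary.

```python
import cmath
import math

def periodicity_score(window):
    """Peak-to-mean ratio of the DFT magnitudes, DC excluded (naive O(N^2) DFT)."""
    n = len(window)
    mags = []
    for k in range(1, n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(window))
        mags.append(abs(coeff))
    mean = sum(mags) / len(mags)
    return max(mags) / mean if mean > 1e-12 else 0.0

def localize_periodic(row, size=32, threshold=8.0):
    """Start indices of non-overlapping windows whose spectrum has a dominant peak."""
    return [s for s in range(0, len(row) - size + 1, size)
            if periodicity_score(row[s:s + size]) > threshold]
```

A pure sinusoid concentrates its energy in one frequency bin and scores high, while an isolated stroke spreads energy evenly over all bins and scores close to one.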

In this paper we consider the problem of multi-agent continuous mapping of a changing, low-dynamic environment. The mapping problem is well studied; however, the use of multiple agents and operation in a non-static environment complicate it and present a number of challenges (e.g. double-counting, robust data association, memory and bandwidth limits). All these problems are interrelated, but they are very rarely considered together, even though each has drawn the attention of researchers. We devise an architecture that solves the considered problems in an internally consistent manner.

In this paper we propose an algorithm for real-time detection of rectangular document borders in mobile applications. The proposed algorithm is based on the combinatorial assembly of possible quadrangle candidates from a set of line segments and on projective document reconstruction using the known focal length. The Fast Hough Transform is used for line detection. A 1D modification of an edge detector is proposed for the algorithm.

The goal of X-ray fluorescence computed tomography (XFCT) is to give a quantitative description of the object under investigation (the sample) in terms of its element composition. However, light and heavy elements inside the object contribute differently to the attenuation of the X-ray probe and of the fluorescence. As a result, elements that fall into a shadow area make no contribution to the registered spectrum. Iterative reconstruction procedures will try to set to zero the variables describing the element content of the corresponding unit volumes, as these variables do not change the system's condition number. Inversion of the XFCT Radon transform gives random values in these areas. To evaluate the confidence of the reconstructed images, we first propose to calculate, in addition to the reconstructed images, a generalized image based on the Jacobian matrix. This image highlights the areas of doubt if any exist. In this work we attempt to demonstrate the advisability of such an approach. For this purpose, we analyze in detail the process of tomographic projection formation.

In this paper we present two modifications of an algorithm for projective recognition of plane boundaries with one concavity. The input images are created with an orthographic pinhole camera with a fixed focal length. Thus the variety of possible projective transformations is limited. The first modification treats the task more generally, while the second uses prior information about the camera model. We test the hypothesis that the second modification has better accuracy. Results of around 20000 numerical experiments that confirm the hypothesis are included.

In this paper, we present a new modification of Viola-Jones complex classifiers. We describe a complex classifier in the form of a decision tree and provide a training method for such classifiers. The performance impact of the tree structure is analyzed. The precision and performance of the presented method are compared with those of the classical cascade. Various tree architectures are studied experimentally. The task of detecting vehicle wheels in images obtained from an automatic vehicle classification system is taken as an example.

In this paper we explore the possibility of improving multilayer perceptron based classifiers by using a composite classifier scheme with a predictor function. Recognition of embossed number characters on plastic cards in images taken with a mobile camera is used as a model problem.
