
Many people are able to recognize the personality traits of the person they are talking to by their facial features. Experts in non-verbal communication can do this even with a photograph. But is it possible to teach artificial intelligence to do the same?

Due to the COVID-19 pandemic, people around the world have faced an unprecedented crisis. The cataclysm has impacted Russia as well. Who will deal better with the hardships: experienced baby boomers, Gen Xers who survived the 1990s, or Gen Yers who have had an easy life?

During lockdowns, why do some people stay home while others violate quarantine rules and go out for picnics in the park? Behavioural economics may provide the answer. Oksana Zinchenko, a Research Fellow at the Institute of Cognitive Neuroscience, explains how game theory can help predict people's behaviour.

Book chapter

People Tracking Algorithm for Human Height Mounted Cameras

We present a new people tracking method for a camera mounted at human height, e.g. one attached near an information or advertising stand. We build on a state-of-the-art particle filter approach and improve it by explicitly modeling object visibility, which enables the method to cope with severe object overlap. We employ our own method based on online-boosting classifiers to resolve occlusions and show that it is well suited to tracking multiple objects. In addition to training an online classifier that is updated every frame, we propose storing the object's appearance and updating it with a certain lag. This helps to correctly handle situations in which one person enters the scene at the same time as another leaves it. We demonstrate the performance of our algorithm and the advantages of our contributions on our own video dataset.
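As an illustration of the underlying machinery, here is a minimal bootstrap particle filter step in Python/NumPy. The Gaussian likelihood, random-walk motion model, and all parameters are toy stand-ins for illustration only, not the paper's visibility-aware formulation:

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, weights, likelihood, motion_std=2.0):
    """One predict-weight-resample step of a bootstrap particle filter.

    particles : (N, 2) array of candidate object positions.
    weights   : (N,) normalized importance weights.
    likelihood: function mapping an (N, 2) array to per-particle scores.
    """
    # Predict: random-walk motion model.
    particles = particles + rng.normal(0.0, motion_std, particles.shape)
    # Update: reweight particles by the observation likelihood.
    weights = weights * likelihood(particles)
    weights = weights / weights.sum()
    # Resample when the effective sample size collapses.
    if 1.0 / np.sum(weights ** 2) < 0.5 * len(weights):
        idx = rng.choice(len(weights), size=len(weights), p=weights)
        particles = particles[idx]
        weights = np.full(len(weights), 1.0 / len(weights))
    return particles, weights

# Toy run: track a stationary target at (10, 10).
target = np.array([10.0, 10.0])
particles = rng.uniform(0.0, 20.0, (500, 2))
weights = np.full(500, 1.0 / 500)
for _ in range(20):
    particles, weights = particle_filter_step(
        particles, weights,
        likelihood=lambda p: np.exp(-np.sum((p - target) ** 2, axis=1) / 8.0))
estimate = (particles * weights[:, None]).sum(axis=0)
```

A real tracker would replace the toy likelihood with classifier scores and, per the paper, an explicit visibility term per object.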

We present a new combined approach for monocular model-based 3D tracking. A preliminary object pose is estimated by using a keypoint-based technique. The pose is then refined by optimizing the contour energy function. The energy determines the degree of correspondence between the contour of the model projection and the image edges. It is calculated based on both the intensity and orientation of the raw image gradient. For optimization, we propose a technique and search-area constraints that allow the method to overcome local optima and to take into account information obtained through keypoint-based pose estimation. Owing to its combined nature, our method eliminates numerous issues of keypoint-based and edge-based approaches. We demonstrate the efficiency of our method by comparing it with state-of-the-art methods on a public benchmark dataset that includes videos with various lighting conditions, movement patterns, and speeds.
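A contour energy of this general kind can be sketched as follows; the exact formulation in the paper may differ, and the point layout, normals, and weighting here are illustrative assumptions:

```python
import numpy as np

def contour_energy(image, contour_pts, contour_normals):
    """Score how well projected model contour points align with image edges.

    image           : (H, W) grayscale array.
    contour_pts     : (K, 2) integer (row, col) locations of the projected contour.
    contour_normals : (K, 2) unit normals of the model contour at those points.
    Rewards strong image gradients whose orientation agrees with the model
    contour normal (an assumed form, not the paper's exact energy).
    """
    gy, gx = np.gradient(image.astype(float))
    rows, cols = contour_pts[:, 0], contour_pts[:, 1]
    g = np.stack([gy[rows, cols], gx[rows, cols]], axis=1)  # raw gradient at points
    magnitude = np.linalg.norm(g, axis=1)
    # Orientation agreement: |cos| between gradient and contour normal.
    agreement = np.abs(np.sum(g * contour_normals, axis=1)) / (magnitude + 1e-8)
    return float(np.mean(magnitude * agreement))

# Toy image with a vertical step edge at column 5.
img = np.zeros((10, 10))
img[:, 5:] = 1.0
pts = np.array([[r, 5] for r in range(2, 8)])
e_match = contour_energy(img, pts, np.tile([0.0, 1.0], (6, 1)))     # normals across the edge
e_mismatch = contour_energy(img, pts, np.tile([1.0, 0.0], (6, 1)))  # normals along the edge
```

Pose refinement would then maximize this score over the pose parameters, subject to the search-area constraints derived from the keypoint-based estimate.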

In our recent papers, we proposed a new family of residual convolutional neural networks trained for semi-dense and sparse depth reconstruction without using the RGB channels. The proposed models can be used with low-resolution depth sensors or with SLAM methods that estimate partial depth with certain distributions. We proposed using a perceptual loss for training depth reconstruction in order to better preserve edge structure and reduce the over-smoothing seen in models trained on MSE loss alone.
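A minimal sketch of a perceptual-style loss, with two fixed edge filters standing in for the learned feature extractor the papers use (the filter choice and the weighting `alpha` are assumptions for illustration):

```python
import numpy as np

# Hand-picked edge filters stand in for pretrained network features.
FILTERS = np.array([
    [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]],   # horizontal edges
    [[-1, -1, -1], [0, 0, 0], [1, 1, 1]],   # vertical edges
], dtype=float)

def features(depth):
    """Valid 3x3 convolutions of a (H, W) depth map with the fixed filters."""
    win = np.lib.stride_tricks.sliding_window_view(depth, (3, 3))
    return np.stack([np.tensordot(win, f, axes=((2, 3), (0, 1))) for f in FILTERS])

def perceptual_loss(pred, target, alpha=1.0):
    """Pixelwise MSE plus an MSE in feature space that penalizes smoothed edges."""
    mse = np.mean((pred - target) ** 2)
    feat = np.mean((features(pred) - features(target)) ** 2)
    return mse + alpha * feat

# Toy depth maps: a sharp step edge vs. a smoothed ramp prediction.
target = np.zeros((8, 8))
target[:, 4:] = 1.0
smooth = np.tile(np.clip((np.arange(8) - 2) / 4.0, 0.0, 1.0), (8, 1))
loss_smooth = perceptual_loss(smooth, target)
pixel_mse = np.mean((smooth - target) ** 2)
```

The feature term makes the smoothed prediction cost strictly more than its pixelwise error alone, which is the effect the loss is meant to achieve.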

This paper is a reproducibility companion guide to training, running, and evaluating the proposed methods; it also provides links to further studies that address reviewers' comments and related depth reconstruction problems.

Modern biometric systems based on face recognition demonstrate high recognition quality, but they are vulnerable to face presentation attacks, such as photo or replay attacks. Existing face anti-spoofing methods are mostly based on texture analysis and, due to a lack of training data, either use hand-crafted features or fine-tuned pretrained deep models. In this paper, we present a novel CNN-based approach to face anti-spoofing based on joint analysis of the presence of a spoofing medium and of eye blinking. To train our classifiers, we propose a synthetic data generation procedure that allows us to train powerful deep models from scratch. Experimental analysis on challenging datasets (CASIA-FASD, NUAA Imposter) shows that our method achieves state-of-the-art results.
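A crude sketch of what generating a synthetic spoof sample can look like; the resampling and bezel effects below are illustrative assumptions, not the paper's actual pipeline:

```python
import numpy as np

def make_replay_spoof(face, border=4, downscale=2):
    """Turn a genuine face crop into a crude synthetic replay-attack sample.

    Downsample then upsample to mimic screen resampling artefacts, and darken
    a frame around the crop to mimic the spoofing medium's bezel. Both effects
    are stand-ins for a real synthetic-data pipeline.
    """
    low = face[::downscale, ::downscale]
    resampled = np.repeat(np.repeat(low, downscale, axis=0), downscale, axis=1)
    spoof = resampled.copy()
    spoof[:border] = 0.0
    spoof[-border:] = 0.0
    spoof[:, :border] = 0.0
    spoof[:, -border:] = 0.0
    return spoof

rng = np.random.default_rng(3)
face = rng.uniform(0.2, 1.0, (16, 16))   # toy grayscale face crop
spoof = make_replay_spoof(face)
```

Pairs of genuine crops and such synthesized spoofs could then serve as positive and negative examples for training a medium-presence classifier from scratch.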

This book constitutes the refereed proceedings of the 11th International Conference on Intelligent Data Processing, IDP 2016, held in Barcelona, Spain, in October 2016.

The 11 revised full papers were carefully reviewed and selected from 52 submissions. The papers of this volume are organized in topical sections on machine learning theory with applications; intelligent data processing in life and social sciences; morphological and technological approaches to image analysis.

ICCV is the premier international computer vision event comprising the main conference and several co-located workshops and tutorials. With its high quality and low cost, it provides an exceptional value for students, academics and industry researchers.

The authors consider the problem of human pose estimation using probabilistic convolutional neural networks. They explore ways to improve human pose estimation accuracy on the standard MPII Human Pose and Leeds Sports Pose (LSP) benchmarks using frameworks for probabilistic deep learning. Such frameworks transform a deterministic neural network into a probabilistic one and allow sampling of independent and equiprobable hypotheses (different outputs) for a given input. Overlapping body parts and body joints hidden under clothes or other obstacles make the problem of human pose estimation ambiguous. In this context, to obtain accurate estimates of joint positions, they use the uncertainty in the network's predictions, represented by the variance of the hypotheses provided by a probabilistic convolutional neural network, while confidence is characterised by their mean. Their work builds on current CNN cascades for pose estimation. They propose and evaluate three probabilistic convolutional neural networks built on top of deterministic ones using two probabilistic deep learning frameworks, DISCO Nets and Bayesian SegNet. The authors evaluate their models on standard pose estimation benchmarks and show that the proposed probabilistic models outperform the base deterministic ones.
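The sampling recipe, mean as the estimate and variance as uncertainty, can be sketched like this; the stochastic predictor below is a toy stand-in for a real probabilistic network such as one produced by the DISCO Nets or Bayesian SegNet frameworks:

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_hypotheses(stochastic_predict, image, n_samples=50):
    """Draw independent pose hypotheses and summarise them.

    The mean over hypotheses gives the joint estimate; the variance gives
    its uncertainty. Each call to `stochastic_predict` returns a (J, 2)
    array of joint coordinates.
    """
    hyps = np.stack([stochastic_predict(image) for _ in range(n_samples)])
    return hyps.mean(axis=0), hyps.var(axis=0)

# Toy predictor: joint 0 is reliably located, joint 1 is "occluded" (noisy).
true_pose = np.array([[10.0, 20.0], [30.0, 40.0]])
def predictor(_image):
    noise = rng.normal(0.0, [[0.1], [5.0]], size=(2, 2))
    return true_pose + noise

mean, var = sample_hypotheses(predictor, image=None, n_samples=200)
```

A high per-joint variance flags exactly the ambiguous joints (occluded or overlapping) that the paper targets, so downstream stages can treat those estimates with caution.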

The Shape Boltzmann Machine (SBM) and its multilabel version, MSBM, were recently introduced as deep generative models that capture the variations of an object's shape. While more flexible, MSBM requires datasets with labeled object parts for training. In this paper, we present an algorithm for training MSBM using binary masks of objects and seeds that approximately correspond to the locations of object parts. The latter can be obtained from part-based detectors in an unsupervised manner. We derive a latent variable model and an EM-like training procedure for adjusting the weights of MSBM within a deep learning framework. We show that the model trained by our method outperforms SBM in tasks related to binary shapes and is very close to the original MSBM in terms of the quality of multilabel shapes.

This paper proposes a deep learning architecture based on Residual Networks that dynamically adjusts the number of executed layers for different regions of the image. The architecture is end-to-end trainable, deterministic and problem-agnostic. It is therefore applicable without any modifications to a wide range of computer vision problems such as image classification, object detection and image segmentation. We present experimental results showing that this model improves the computational efficiency of Residual Networks on the challenging ImageNet classification and COCO object detection datasets. Additionally, we evaluate the computation time maps on the CAT2000 visual saliency dataset and find that they correlate surprisingly well with human eye fixation positions.
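The idea of spending fewer layers on easy regions can be sketched with given (not learned) per-position halting scores; the threshold and score values below are illustrative assumptions:

```python
import numpy as np

def spatially_adaptive_depth(halting_probs, threshold=1.0 - 1e-2):
    """Count how many residual layers each spatial position executes.

    A position stops executing once its accumulated halting score crosses
    the threshold; positions that never cross it run all layers. In the
    paper the scores are produced by the network itself; here they are given.

    halting_probs : (L, H, W) per-layer, per-position halting scores in [0, 1].
    Returns an (H, W) integer map of layers executed per position.
    """
    cumulative = np.cumsum(halting_probs, axis=0)
    halted = cumulative >= threshold
    # argmax over a boolean axis returns the index of the first True.
    return np.where(halted.any(axis=0),
                    halted.argmax(axis=0) + 1,
                    halting_probs.shape[0])

# Toy scores: background halts quickly, a central "salient" patch halts slowly.
halting = np.full((8, 4, 4), 0.6)
halting[:, 1:3, 1:3] = 0.1
depth_map = spatially_adaptive_depth(halting)
```

The resulting per-position depth map is exactly the kind of computation time map the paper compares against human eye fixations.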

Human gait, or walking manner, is a biometric feature that allows identification of a person when other biometric features, such as the face or iris, are not visible. In this study, the authors present a new pose-based convolutional neural network model for gait recognition. Unlike many methods that consider the full-height silhouette of a moving person, they consider the motion of points in the areas around human joints. To extract motion information, they estimate the optical flow between consecutive frames. They propose a deep convolutional model that computes pose-based gait descriptors. They compare different network architectures and aggregation methods and experimentally assess various body parts to determine which are the most important for gait recognition. In addition, they investigate the generalisation ability of the developed algorithms by transferring them between datasets. The results of these experiments show that their approach outperforms state-of-the-art methods.
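Extracting motion information around joints can be sketched as cropping optical-flow patches at estimated joint coordinates; the patch size and the toy flow field below are assumptions for illustration:

```python
import numpy as np

def joint_flow_patches(flow, joints, radius=2):
    """Crop optical-flow patches around estimated joint positions.

    flow   : (H, W, 2) dense flow field between consecutive frames.
    joints : (J, 2) integer (row, col) joint coordinates.
    Returns a (J, 2r+1, 2r+1, 2) stack of local motion patches, a simplified
    stand-in for the inputs to the paper's pose-based gait descriptors.
    """
    padded = np.pad(flow, ((radius, radius), (radius, radius), (0, 0)))
    patches = [padded[r:r + 2 * radius + 1, c:c + 2 * radius + 1]
               for r, c in joints]
    return np.stack(patches)

# Toy flow: the frame is still except for a moving region around the "knee".
flow = np.zeros((32, 32, 2))
flow[18:24, 10:16] = [1.0, 0.5]
patches = joint_flow_patches(flow, joints=np.array([[20, 12], [5, 5]]))
```

A convolutional descriptor network would then consume such per-joint patches, so that only motion near the skeleton, rather than the full silhouette, drives recognition.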