Speakers

Recent Results on Image Editing and Learning Filters

Ming-Hsuan Yang

UC Merced

Nov. 30 16:40~17:25

Abstract

In the first part of this talk, I will present recent results on semantic-aware image editing. Skies are common backgrounds in photos but are often uninteresting because of the time of day at which the photo was taken. Professional photographers correct this with sophisticated tools and painstaking effort, which is beyond the reach of ordinary users. In this work, we propose an automatic background replacement algorithm that can generate realistic, artifact-free images with diverse styles of skies. The key idea of our algorithm is to use visual semantics to guide the entire process, including sky segmentation, search, and replacement. First, we train a deep convolutional neural network for semantic scene parsing, which is used as a visual prior to segment sky regions in a coarse-to-fine manner. Second, to find suitable skies for replacement, we propose a data-driven sky search scheme based on the semantic layout of the input image. Finally, to recompose the stylized sky naturally with the original foreground, we develop an appearance transfer method that matches statistics locally and semantically. We show that the proposed algorithm can automatically generate a set of visually pleasing results. In addition, we demonstrate the effectiveness of the proposed algorithm with extensive user studies.
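As a rough illustration of what "matching statistics locally and semantically" could look like, the sketch below shifts and scales intensities so that each semantic region takes on the mean and standard deviation of the same region in a reference image. The abstract does not say which statistics are matched or how, so the choice of mean and standard deviation, the single-channel pixel lists, and the names `match_mean_std` and `transfer_by_label` are all assumptions made for illustration, not the method presented in the talk.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def match_mean_std(values: Sequence[float], reference: Sequence[float]) -> List[float]:
    """Shift and scale `values` so their mean/std match those of `reference`.

    Assumption: mean and standard deviation are the statistics being matched.
    """
    mu_v, sd_v = mean(values), pstdev(values)
    mu_r, sd_r = mean(reference), pstdev(reference)
    # A flat region has zero spread; only shift its mean in that case.
    scale = sd_r / sd_v if sd_v > 0 else 1.0
    return [(v - mu_v) * scale + mu_r for v in values]


def transfer_by_label(
    pixels: Sequence[float],
    labels: Sequence[str],
    ref_pixels: Sequence[float],
    ref_labels: Sequence[str],
) -> List[float]:
    """Apply match_mean_std separately inside each semantic label (hypothetical helper)."""
    out = list(pixels)
    for label in set(labels):
        idx = [i for i, l in enumerate(labels) if l == label]
        ref = [r for r, l in zip(ref_pixels, ref_labels) if l == label]
        if not ref:
            continue  # no matching region in the reference: leave pixels unchanged
        matched = match_mean_std([pixels[i] for i in idx], ref)
        for i, m in zip(idx, matched):
            out[i] = m
    return out
```

Matching per label rather than globally keeps, for example, ground pixels from being recolored by sky statistics, which is the intuition behind doing the transfer "semantically".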

In the second part, I will present recent results on learning image filters for low-level vision. We formulate numerous low-level vision problems (e.g., edge-preserving filtering and denoising) as recursive image filtering via a hybrid neural network. The network contains several spatially variant recurrent neural networks (RNNs), which act as a group of distinct recursive filters for each pixel, and a deep convolutional neural network (CNN) that learns the weights of the RNNs. The deep CNN learns how to regulate recurrent propagation for various tasks and effectively guides that propagation over the entire image. The proposed model needs neither a large number of convolutional channels nor large kernels to learn features for low-level vision filters. It is much smaller and faster than a deep CNN-based image filter. Experimental results show that many low-level vision tasks can be effectively learned and carried out in real time by the proposed algorithm.
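To make the idea of a spatially variant recursive filter concrete, here is a minimal one-dimensional sketch. It uses a first-order recurrence, h[k] = (1 - p[k]) * x[k] + p[k] * h[k-1], in which each sample has its own propagation weight p[k]. The abstract does not give the exact recurrence, so this form is an assumption. In the proposed model a deep CNN predicts the weights; the hand-written `edge_aware_weights` rule below is only a hypothetical stand-in for that network.

```python
from typing import List, Sequence


def recursive_filter_1d(x: Sequence[float], p: Sequence[float]) -> List[float]:
    """Left-to-right first-order recursive filter with per-sample weights.

    h[k] = (1 - p[k]) * x[k] + p[k] * h[k - 1], with h[0] = x[0].
    p[k] near 1 propagates the previous output (smoothing);
    p[k] = 0 cuts propagation and passes the input through.
    """
    if len(x) != len(p):
        raise ValueError("x and p must have the same length")
    if not x:
        return []
    h = [x[0]]
    for k in range(1, len(x)):
        h.append((1.0 - p[k]) * x[k] + p[k] * h[k - 1])
    return h


def edge_aware_weights(
    x: Sequence[float], strength: float = 0.9, threshold: float = 0.5
) -> List[float]:
    """Hypothetical stand-in for the weight-predicting CNN.

    Uses `strength` where neighboring samples are similar and 0 at large
    jumps, so smoothing does not leak across edges.
    """
    p = [0.0]
    for k in range(1, len(x)):
        p.append(strength if abs(x[k] - x[k - 1]) < threshold else 0.0)
    return p


# Noisy step signal: the filter smooths each plateau but keeps the jump.
signal = [0.0, 0.1, -0.1, 0.05, 1.0, 1.1, 0.9, 1.05]
weights = edge_aware_weights(signal)
smoothed = recursive_filter_1d(signal, weights)
```

A two-dimensional version would typically run such scans in several directions (for example left-to-right, right-to-left, top-to-bottom, and bottom-to-top) and combine the results; that extension is omitted here.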

Biography

Ming-Hsuan Yang is a professor in Electrical Engineering and Computer Science at the University of California, Merced. He received the PhD degree in Computer Science from the University of Illinois at Urbana-Champaign in 2000. He serves as an area chair for several conferences including IEEE Conference on Computer Vision and Pattern Recognition, IEEE International Conference on Computer Vision, European Conference on Computer Vision, Asian Conference on Computer Vision, AAAI National Conference on Artificial Intelligence, and IEEE International Conference on Automatic Face and Gesture Recognition. He serves as a program co-chair for IEEE International Conference on Computer Vision in 2019 as well as Asian Conference on Computer Vision in 2014, and general co-chair for Asian Conference on Computer Vision in 2016. He serves as an associate editor of the IEEE Transactions on Pattern Analysis and Machine Intelligence (2007 to 2011), International Journal of Computer Vision, Computer Vision and Image Understanding, Image and Vision Computing, and Journal of Artificial Intelligence Research. Yang received the Google Faculty Award in 2009, the Distinguished Early Career Research Award from the UC Merced Senate in 2011, the Faculty Early Career Development (CAREER) Award from the National Science Foundation in 2012, and the Distinguished Research Award from the UC Merced Senate in 2015.