This paper proposes a novel reversible data hiding scheme for encrypted images that uses the homomorphic and probabilistic properties of the Paillier cryptosystem. In the proposed method, groups of adjacent pixels are randomly selected and reversibly embedded into the rest of the image to make room for data embedding. Each group contains a reference pixel and a few host pixels. Least significant bits...
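As a rough illustration of the two Paillier properties the abstract names, the following minimal sketch shows additive homomorphism (multiplying ciphertexts adds plaintexts) and probabilistic encryption (the same plaintext yields different ciphertexts under different randomness). It is not the paper's embedding scheme. The tiny primes, the explicit randomness argument `r`, and the function names are assumptions chosen for readability; real deployments use large random primes and a secure random source.

```python
import math

def keygen(p, q):
    """Build a toy Paillier key pair from two small primes (illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                                          # common simple choice of generator
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)     # inverse of L(g^lam mod n^2) mod n
    return (n, g), (lam, mu)

def encrypt(pub, m, r):
    """Encrypt m with caller-supplied randomness r, which must be coprime to n."""
    n, g = pub
    assert math.gcd(r, n) == 1
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(pub, priv, c):
    """Recover the plaintext from ciphertext c."""
    n, _ = pub
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n

pub, priv = keygen(293, 433)          # hypothetical toy primes
n = pub[0]
c1 = encrypt(pub, 120, 17)            # e.g. one pixel value
c2 = encrypt(pub, 77, 23)             # e.g. another pixel value
c_sum = (c1 * c2) % (n * n)           # homomorphic addition in the encrypted domain
print(decrypt(pub, priv, c_sum))      # sum of the two plaintexts
```

Because the product of two ciphertexts decrypts to the sum of the plaintexts, pixel values can be modified in the encrypted domain without the decryption key, which is the kind of operation a data hider relies on.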

Unlike conventional haze removal methods based on a single image, near-infrared imaging can provide two types of multimodal images: one is the near-infrared image and the other is the visible color image. These two images have different characteristics regarding color and visibility. The captured near-infrared image is haze-free, but it is grayscale, whereas the visible color image has col...

Background extraction is generally the first step in many computer vision and augmented reality applications. Most existing methods, which assume the existence of a clean background during the reconstruction period, are not suitable for video sequences such as highway traffic surveillance videos, whose complex foreground movements may not meet the assumption of a clean background. Therefore, we pr...

Stereoscopic subtitle insertion is a fundamental and essential element in the stereoscopic film and TV industry. However, little work has been dedicated to optimal region selection for stereoscopic subtitle insertion, and no public database has been reported for evaluating its performance. In this paper, we build the first large-scale video database (TJU3D) for stereoscopic video su...

The popularity of mobile applications has greatly enriched and facilitated our lives. However, the rapid increase in digital images and the narrow bandwidth of wireless networks call for an appropriate approach to reduce the amount of data transmitted over the wireless network (i.e., low bit-rate transmission) while ensuring high recognition accuracy at the cloud. We propose a simple...

This paper presents a new technique for hyperspectral image (HSI) classification by using superpixel guided deep-sparse-representation learning. The proposed technique constructs a hierarchical architecture by exploiting the sparse coding to learn the HSI representation. Specifically, a multiple-layer architecture using different superpixel maps is designed, where each superpixel map is generated ...

Depth estimation from single monocular images is a key component in scene understanding. Most existing algorithms formulate depth estimation as a regression problem due to the continuous property of depths. However, the depth value of input data can hardly be regressed exactly to the ground-truth value. In this paper, we propose to formulate depth estimation as a pixelwise classification task. Spe...
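To make the idea of treating depth as a classification target concrete, here is a minimal sketch of mapping a continuous depth value to a discrete class label and back. The abstract is cut off before it describes the paper's discretization, so the uniform bins in log-depth, the range limits, and the function names below are purely assumptions for illustration.

```python
import math

def depth_to_class(depth, d_min, d_max, num_bins):
    """Map a continuous depth to a bin index (assumed: uniform bins in log-depth)."""
    depth = min(max(depth, d_min), d_max)  # clamp into the supported range
    span = math.log(d_max) - math.log(d_min)
    frac = (math.log(depth) - math.log(d_min)) / span
    return min(int(frac * num_bins), num_bins - 1)

def class_to_depth(index, d_min, d_max, num_bins):
    """Return the log-space center of a bin as its representative depth."""
    step = (math.log(d_max) - math.log(d_min)) / num_bins
    return math.exp(math.log(d_min) + (index + 0.5) * step)

# Hypothetical range of 1 m to 100 m split into 10 classes.
labels = [depth_to_class(d, 1.0, 100.0, 10) for d in (1.0, 3.0, 30.0, 100.0)]
print(labels)
```

A classifier trained on such labels predicts a bin per pixel; a continuous depth estimate can then be read back from the bin center.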

Two approaches are proposed for cross-pose face recognition: one is built on handcrafted features extracted from the 3D reconstruction of facial components, and the other is built on features learned by a deep convolutional neural network (CNN). As both approaches rely on facial landmarks for alignment across large poses, we propose the Fast Hierarchical Model (FHM) for locating cross-pos...

In the latest video coding standard, High Efficiency Video Coding (HEVC), a quadtree-based coding unit (CU) partitioning scheme is adopted to better adapt to the characteristics of the video content. However, this flexible scheme significantly increases the coding complexity, because a large number of possible CU partitioning modes must be traversed. In this paper, we propose a two-stage fast inter...
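To give a sense of how many partitionings a quadtree allows, the sketch below counts the distinct ways a square block can be recursively split into four equal sub-blocks down to a minimum size. The 64x64 block size and 8x8 minimum CU size are assumed typical settings, not values taken from the abstract, and the count covers only the quadtree shapes, not prediction modes.

```python
def count_cu_partitions(block_size=64, min_cu_size=8):
    """Count distinct quadtree partitions of a square block.

    A block is either left unsplit (1 way) or split into four sub-blocks,
    each of which is partitioned independently.
    """
    if block_size <= min_cu_size:
        return 1  # smallest CU cannot be split further
    sub = count_cu_partitions(block_size // 2, min_cu_size)
    return 1 + sub ** 4

print(count_cu_partitions(64, 8))  # 83522
```

An exhaustive encoder search effectively evaluates every node of this tree, which is why fast CU decision methods try to prune splits early.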

A continuous evaluation of the end user’s quality-of-experience (QoE) is essential for efficient video streaming. This is crucial for networks with constrained resources that offer time-varying channel quality to their users. In hypertext transfer protocol-based video streaming, the QoE is measured by quantifying the perceptual impact of distortions caused by rate adaptation or interruptions in play...

The affine-invariant extension of the scale-invariant feature transform (ASIFT) algorithm requires a large amount of computation and memory access and is consequently hard to run in real time. To increase the operation speed of the ASIFT algorithm, this paper proposes a new hardware architecture for it. In order to reduce the memory access time, the affine transform is modified...

Lines are significant features that carry high-level information in an image. The Line Segment Detector (LSD) algorithm, which has a low error rate, is widely used to extract lines from images effectively and accurately. However, on a PC the algorithm is too costly in both time and resources for real-time video processing. This paper provides a fast and resource-efficient hardware implementat...

In this paper, a stereoview-to-multiview conversion system, which includes stereo matching and depth image-based rendering (DIBR) hardware designs, is proposed. To achieve an efficient architecture, the proposed stereo matching algorithm simply generates the raw matching costs and aggregates the costs based on 1D iterative aggregation schemes. For the DIBR architecture, an inpainting-based method is us...

Real-time visual analysis tasks, like tracking and recognition, require swift execution of computationally intensive algorithms. Visual sensor networks could be enabled to perform such tasks by allowing the camera nodes to offload their computational load to nearby processing nodes. In this paper, we address the problem of minimizing the completion time of multiple camera sensors that share the tr...

Recently, some picture-embedding schemes have been proposed to improve the aesthetic appearance of 2D barcodes. However, these aesthetic 2D barcodes are not robust to the distortions incurred by the print/display-and-capture channel under limited rendering space and resolution. In this paper, a picture-embedding 2D barcode named the Robust and Aesthetic (RA) Code is proposed to counter the channel...

Energy-efficient streaming of real-time scalable video over wireless fading channels with imperfect channel state information is considered in this paper. A traffic rate adaptation (TRA) scheme is presented to minimize the transmission energy consumption while satisfying the quality of experience (QoE) requirement by dynamically selecting the delivered encoding layers of streaming media depending ...

Aims & Scope

IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) covers the circuits and systems aspects of all video technologies. General, theoretical, and application-oriented papers with a circuits and systems perspective are encouraged for publication in TCSVT on or related to image/video acquisition, representation, presentation and display; processing, filtering and transforms; analysis and synthesis; learning and understanding; compression, transmission, communication and networking; storage, retrieval, indexing and search; and/or hardware and software design and implementation.