In this letter, the performance of non-orthogonal multiple access (NOMA) is investigated in a cellular downlink scenario with randomly deployed users. The developed analytical results show that NOMA can achieve superior performance in terms of ergodic sum rates; however, the outage performance of NOMA depends critically on the choices of the users' targeted data rates and allocated power. In parti...

This letter addresses multisensor data fusion under Gaussian noise. Under the Gauss-Markov model assumptions, data fusion based on maximum likelihood estimation (MLE) is the minimum variance unbiased estimator. Nonetheless, we propose a linear fusion algorithm based on the random matrix theory, which yields a biased estimator. The proposed estimator has a lower mean squared error (MSE) than th...

We propose a new universal objective image quality index, which is easy to calculate and applicable to various image processing applications. Instead of using traditional error summation methods, the proposed index is designed by modeling any image distortion as a combination of three factors: loss of correlation, luminance distortion, and contrast distortion. Although the new index is mathematica...
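
The three-factor decomposition described above can be illustrated with a short sketch. It assumes the commonly cited product form of the index (correlation × luminance × contrast) and evaluates it once over the whole image rather than over sliding windows; the function name `quality_index` and the global evaluation are illustrative assumptions, not details taken from the abstract.

```python
import numpy as np

def quality_index(x, y):
    """Return (correlation, luminance, contrast, Q) for two images.

    Assumed form: Q = correlation * luminance * contrast, computed
    globally. Both inputs must be non-constant with means that are not
    both zero, otherwise a denominator is zero.
    """
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    mx, my = x.mean(), y.mean()
    sx, sy = x.std(ddof=1), y.std(ddof=1)
    sxy = np.sum((x - mx) * (y - my)) / (x.size - 1)

    correlation = sxy / (sx * sy)                  # loss of correlation
    luminance = 2 * mx * my / (mx**2 + my**2)      # luminance distortion
    contrast = 2 * sx * sy / (sx**2 + sy**2)       # contrast distortion
    return correlation, luminance, contrast, correlation * luminance * contrast

if __name__ == "__main__":
    reference = np.array([[1.0, 2.0], [3.0, 4.0]])
    # Doubling the pixel values keeps the correlation at 1 but changes
    # both mean luminance and contrast.
    print(quality_index(reference, 2 * reference))
```

Under this form, an undistorted copy scores 1 on every factor, and each factor drops below 1 only for its own kind of distortion.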

Synthetic aperture radar (SAR) images are often contaminated by a multiplicative noise known as speckle. Speckle makes the processing and interpretation of SAR images difficult. We propose a deep-learning-based approach, called image despeckling convolutional neural network (ID-CNN), for automatically removing speckle from the input noisy images. In particular, ID-CNN uses a set of convolutional l...

Low power wide area networks (LPWAN) are emerging as a new paradigm, especially in the field of Internet of Things (IoT) connectivity. LoRa is one such LPWAN technology, and it is gaining considerable commercial traction. The modulation underlying LoRa is patented and has never been described theoretically. The aim of this letter is to give the first rigorous mathematical signal processing description of ...

Denoising is a fundamental task in image processing with wide applications for enhancing image quality. BM3D is considered an effective baseline for image denoising. Although learning-based methods have been dominant in this area recently, traditional methods remain valuable for inspiring new ideas when combined with learning-based approaches. In this letter, we propose a new convolutional...

Road detection is a key component of autonomous driving; however, most fully supervised learning road detection methods suffer from either insufficient training data or high costs of manual annotation. To overcome these problems, we propose a semisupervised learning (SSL) road detection method based on generative adversarial networks (GANs) and a weakly supervised learning (WSL) method based on co...

This letter proposes a reconstruction-based single image super resolution method by using joint regularization, where a group-residual-based regularization (GRR) and a ridge-regression-based regularization (3R) are combined. In GRR, nonlocal similar patches are grouped together, and the group weights are calculated so as to adaptively constrain the residual values in the gradient domain. In 3R, we...

This letter presents a regression-based speech enhancement framework using deep neural networks (DNNs) with a multiple-layer deep architecture. In the DNN learning process, a large training set ensures a powerful modeling capability to estimate the complicated nonlinear mapping from observed noisy speech to desired clean signals. Acoustic context was found to improve the continuity of speech to be...

Recently, by virtue of the Generative Adversarial Network (GAN), single image super-resolution (SISR) has achieved great breakthroughs in enhancing the perceptual image quality. However, since the network is trained by minimizing the perceptual loss, the GAN-based SISR method (SRGAN) [1] results in images with very low objective quality, i.e., peak signal-to-noise ratio (PSNR). In this letter, we ai...

An approximate joint singular value decomposition algorithm is proposed for a set of K(K ≥ 2) complex matrices. It can be seen as an orthogonal non-Hermitian approximate joint diagonalization algorithm. We exploit a Givens-like rotation method based on a special parameterization of the updating matrices and a reasonable approximation. The main points consist of the presentation of the new p...

In non-orthogonal multiple access (NOMA) downlink, multiple data flows are superimposed in the power domain and user decoding is based on successive interference cancellation. NOMA's performance highly depends on the power split among the data flows and the associated power allocation (PA) problem. In this letter, we study NOMA from a fairness standpoint and we investigate PA techniques that ensur...
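
To make the power-domain superposition and successive interference cancellation (SIC) concrete, the following minimal sketch computes achievable rates for a two-user downlink. It assumes the textbook two-user model with unit noise power; the function name `noma_two_user_rates`, the parameter names, and the restriction to two users are illustrative assumptions and are not taken from the letter.

```python
import numpy as np

def noma_two_user_rates(gain_weak, gain_strong, alpha_weak, snr):
    """Achievable rates (bit/s/Hz) in an assumed two-user NOMA downlink.

    gain_weak, gain_strong: channel power gains |h|^2 of the two users.
    alpha_weak: fraction of transmit power given to the weak user's flow.
    snr: transmit SNR (linear scale, unit noise power assumed).
    """
    alpha_strong = 1.0 - alpha_weak

    # The weak user decodes its own flow, treating the other flow as noise.
    rate_weak = np.log2(
        1 + alpha_weak * snr * gain_weak / (alpha_strong * snr * gain_weak + 1)
    )
    # The strong user must first decode the weak user's flow (SIC step).
    rate_weak_at_strong = np.log2(
        1 + alpha_weak * snr * gain_strong / (alpha_strong * snr * gain_strong + 1)
    )
    # After cancelling that flow, the strong user sees no interference.
    rate_strong = np.log2(1 + alpha_strong * snr * gain_strong)
    return rate_weak, rate_strong, rate_weak_at_strong

if __name__ == "__main__":
    for alpha in (0.6, 0.8, 0.95):
        r_w, r_s, _ = noma_two_user_rates(0.1, 1.0, alpha, snr=100.0)
        print(f"alpha_weak={alpha:.2f}  weak={r_w:.3f}  strong={r_s:.3f}")
```

Sweeping `alpha_weak` as in the usage loop shows the trade-off a power allocation scheme has to resolve: raising the weak user's share improves its rate at the direct expense of the strong user's.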

Face detection and alignment in unconstrained environments are challenging due to various poses, illuminations, and occlusions. Recent studies show that deep learning approaches can achieve impressive performance on these two tasks. In this letter, we propose a deep cascaded multitask framework that exploits the inherent correlation between detection and alignment to boost their performance. In ...

We propose a novel pilot structure for covariance matrix estimation in massive multiple-input multiple-output systems in which each user transmits two pilot sequences, with the second pilot sequence multiplied by a random phase shift. The covariance matrix of a particular user is obtained by computing the sample cross-correlation of the channel estimates obtained from the two pilot sequences. This...

The ability of deep convolutional neural networks (CNNs) to learn discriminative spectro-temporal patterns makes them well suited to environmental sound classification. However, the relative scarcity of labeled data has impeded the exploitation of this family of high-capacity models. This study has two primary contributions: first, we propose a deep CNN architecture for environmental sound classif...

An important aim of research on the blind image quality assessment (IQA) problem is to devise perceptual models that can predict the quality of distorted images with as little prior knowledge of the images or their distortions as possible. Current state-of-the-art “general purpose” no reference (NR) IQA algorithms require knowledge about anticipated distortions in the form of trainin...

We present a two-fold fusion framework that constructs comprehensive cost volumes for stereo matching. To this end, we develop fusion schemes at two key steps, i.e., a) the raw cost computation and b) the cost aggregation. Specifically, we commence by fusing both structure- and data-oriented features as raw costs. We then incorporate the guided filtered costs into the cross-based cost aggregation ...

This letter introduces a novel grayscale-inversion and rotation invariant descriptor, called sorted local gradient pattern, for texture classification. First, we propose two complementary local gradient patterns (LGP), the center-to-ring LGP (LGP_CR) and the ring-to-ring LGP (LGP_RR), to encode rich gradient information present in a local neighborhood. Then, we propose to enhance LGP by encoding p...

The typical approach for solving the problem of single-image super-resolution (SR) is to learn a nonlinear mapping between the low-resolution (LR) and high-resolution (HR) representations of images in a training set. Training-based approaches can be tuned to give high accuracy on a given class of images, but they call for retraining if the HR → LR generative model deviates or if the test im...

As a popular signal modeling technique, sparse representation (SR) has achieved great success in image fusion over the last few years with a number of effective algorithms being proposed. However, due to the patch-based manner applied in sparse coding, most existing SR-based fusion methods suffer from two drawbacks, namely, limited ability in detail preservation and high sensitivity to misregistra...

In the era of big data, adaptive censoring (AC) provides a natural option for trimming data by keeping only the statistically informative data. However, the data chosen by AC may arrive in clusters, which does not relieve the computational resource requirement as expected. In this letter, we exploit queuing theory to model a single sink node with abundant sensor nodes. By adding a buffer to censored...

Compressive sensing (CS) is proposed for signal sampling below the Nyquist rate based on the assumption that the signal is sparse in some transformed domain. Most sensing matrices (e.g., Gaussian random matrix) in CS, however, usually suffer from unfriendly hardware implementation, high computation cost, and huge memory storage. In this letter, we propose a deterministic sensing matrix for collect...

The large scale of recently demanded biometric systems has put pressure on creating more efficient, accurate, and private biometric solutions. Iris biometrics is one of the most distinctive and widely used biometric characteristics. High-performing iris representations suffer from the curse of rotation inconsistency. This is usually solved by assuming a range of rotational errors and perfo...

Underwater vision suffers from severe effects due to selective attenuation and scattering when light propagates through water. Such degradation not only affects the quality of underwater images, but also limits the capability of vision tasks. Different from existing methods that either ignore the wavelength dependence on the attenuation or assume a specific spectral profile, we tackle color distortion pro...

Human activity recognition in videos with convolutional neural network (CNN) features has received increasing attention in multimedia understanding. Taking videos as a sequence of frames, a new record was recently set on several benchmark datasets by feeding frame-level CNN sequence features to a long short-term memory (LSTM) model for video activity recognition. This recurrent model-based visual re...

Empirical mode decomposition (EMD) has recently been pioneered by Huang et al. for adaptively representing nonstationary signals as sums of zero-mean amplitude modulation frequency modulation components. In order to better understand the way EMD behaves in stochastic situations involving broadband noise, we report here on numerical experiments based on fractional Gaussian noise. In such a case, it...

The challenge of scheduling user transmissions on the downlink of a long term evolution (LTE) cellular communication system is addressed. A maximum rate algorithm which does not consider fairness among users was proposed in prior work. Here, a multiuser scheduler with proportional fairness (PF) is proposed. Numerical results show that the proposed PF scheduler provides a superior fairness performance with a...

Generative adversarial networks have been shown to effectively generate artificial samples indiscernible from their real counterparts with a unified framework of two subnetworks competing against each other. In this letter, we first propose an automatic steganographic distortion learning framework using a generative adversarial network, which is composed of a steganographic generative subnetwork and a ste...

Median filtering detection has recently drawn much attention in image editing and image anti-forensic techniques. Current image median filtering forensics algorithms mainly extract features manually. To deal with the challenge of detecting median filtering from small-size and compressed image blocks, by taking into account the properties of median filtering, we propose a median filtering detect...

We demonstrate a novel method for the automatic modulation classification based on a deep learning autoencoder network, trained by a nonnegativity constraint algorithm. The learning algorithm aims to constrain the negative weights, learns features that amount to a part-based representation of data, and disentangles a more meaningful hidden structure. The performance of this algorithm is tested on ...

Accurate age estimation from a facial image is quite challenging, since physical age and apparent age can be quite different, and this difference is dependent on gender, ethnicity, and many other factors. Multitask deep learning is one of the approaches for improving age estimation by employing auxiliary tasks, such as gender recognition, that are related to the primary task. However, in traditional mu...

Recent steganalytic schemes reveal embedding traces in a promising way by using convolutional neural networks (CNNs). However, further improvements, such as exploring complementary data processing operations and using wider structures, were not extensively studied so far. In this letter, we design a new CNN in these aspects in order to better capture embedding artifacts. Specifically, on the one h...

Video content providers put stringent requirements on the quality assessment methods realized on their services. They need to be accurate, real-time, adaptable to new content, and scalable as the video set grows. In this letter, we introduce a novel automated and computationally efficient video assessment method. It enables accurate real-time (online) analysis of delivered quality in an adaptable ...

The impressive gains in performance obtained using deep neural networks (DNNs) for automatic speech recognition (ASR) have motivated the application of DNNs to other speech technologies such as speaker recognition (SR) and language recognition (LR). Prior work has shown performance gains for separate SR and LR tasks using DNNs for direct classification or for feature extraction. In this work we pr...

The performance of the recursive least-squares (RLS) algorithm is governed by the forgetting factor. This parameter leads to a compromise between (1) the tracking capabilities and (2) the misadjustment and stability. In this letter, a variable forgetting factor RLS (VFF-RLS) algorithm is proposed for system identification. In general, the output of the unknown system is corrupted by a noise-like s...
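
The role of the forgetting factor can be seen in a minimal sketch of the classic RLS recursion for system identification. This is the standard fixed-forgetting-factor algorithm, not the variable rule proposed in the letter; the function name `rls_identify`, the initialization constant `delta`, and the default value of `lam` are illustrative assumptions.

```python
import numpy as np

def rls_identify(x, d, order, lam=0.99, delta=100.0):
    """Estimate an FIR system of length `order` with fixed-lambda RLS.

    x: input signal, d: observed (possibly noisy) system output,
    lam: forgetting factor in (0, 1], delta: scale of the initial
    inverse correlation matrix P = delta * I.
    """
    x = np.asarray(x, dtype=float)
    d = np.asarray(d, dtype=float)
    w = np.zeros(order)
    P = delta * np.eye(order)
    for n in range(order - 1, len(x)):
        u = x[n - order + 1 : n + 1][::-1]   # [x[n], x[n-1], ..., x[n-order+1]]
        e = d[n] - w @ u                     # a priori error
        Pu = P @ u
        k = Pu / (lam + u @ Pu)              # gain vector
        w = w + k * e                        # coefficient update
        P = (P - np.outer(k, Pu)) / lam      # inverse correlation update
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    h = np.array([0.5, -0.3, 0.2])           # hypothetical unknown system
    x = rng.standard_normal(2000)
    d = np.convolve(x, h)[: len(x)]
    print(rls_identify(x, d, order=3))
```

A value of `lam` close to 1 averages over a long window, which gives low misadjustment but slow tracking; a smaller value tracks changes faster at the cost of noisier estimates. This is the compromise that a variable forgetting factor is meant to relax.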

Dynamic hand gesture recognition is a crucial but challenging task in the pattern recognition and computer vision communities. In this paper, we propose a novel feature vector that is suitable for representing dynamic hand gestures and present a satisfactory solution for recognizing dynamic hand gestures with a Leap Motion controller (LMC) only. These have not been reported in other papers. The ...

Gaussian mixture models (GMMs) have proven extremely successful for text-independent speaker recognition. The standard training method for GMM models is to use MAP adaptation of the means of the mixture components based on speech from a target speaker. Recent methods in compensation for speaker and channel variability have proposed the idea of stacking the means of the GMM model to form a GMM mean...

This letter investigates methods to detect graph topological changes without making any assumption on the nature of the change itself. To accomplish this, we merge recently developed tools in graph signal processing with matched subspace detection theory and propose two blind topology change detectors. The first detector exploits the prior information that the observed signal is sparse w.r.t. the ...

The study of complex systems greatly benefits from graph models and their analysis. In particular, the eigendecomposition of the graph Laplacian reveals properties of global organization that emerge from local interactions; e.g., the Fiedler vector, the eigenvector associated with the smallest nonzero eigenvalue, plays a key role in graph clustering. Graph signal processing focuses on the analysis of signals that are attributed t...
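
As a small illustration of the Laplacian eigendecomposition mentioned above, the sketch below computes the Fiedler vector of a toy graph and uses its sign pattern to split the nodes into two clusters. It assumes a connected, undirected graph and the combinatorial Laplacian L = D - A; the function name `fiedler_vector` and the example graph are illustrative assumptions.

```python
import numpy as np

def fiedler_vector(adjacency):
    """Return (eigenvalue, eigenvector) for the second-smallest Laplacian eigenvalue.

    Assumes a symmetric adjacency matrix of a connected graph, so that
    the second-smallest eigenvalue is the smallest nonzero one.
    """
    A = np.asarray(adjacency, dtype=float)
    L = np.diag(A.sum(axis=1)) - A            # combinatorial Laplacian D - A
    eigvals, eigvecs = np.linalg.eigh(L)      # eigenvalues in ascending order
    return eigvals[1], eigvecs[:, 1]

if __name__ == "__main__":
    # Two triangles (nodes 0-2 and 3-5) joined by the single edge 2-3.
    edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
    A = np.zeros((6, 6))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    lam2, v = fiedler_vector(A)
    print(lam2, np.sign(v))                   # the signs separate the two triangles
```

The overall sign of an eigenvector is arbitrary, so only the grouping of nodes by sign is meaningful, not which group is positive.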

Sparsity exploiting image reconstruction (SER) methods have been extensively used with total variation (TV) regularization for tomographic reconstructions. Local TV methods fail to preserve texture details and often create additional artifacts due to over-smoothing. Nonlocal TV (NLTV) methods have been proposed as a solution to this but they either lack continuous updates due to computational cons...

This letter introduces the LOOP binary descriptor (local optimal-oriented pattern) that encodes rotation invariance into the main formulation itself. This makes any post processing stage for rotation invariance redundant and improves on both accuracy and time complexity. We consider fine-grained lepidoptera (moth/butterfly) species recognition as the representative problem since it involves repeti...

Voice activity detection (VAD) classifies incoming signal segments into speech or background noise; its performance is crucial in various speech-related applications. Although speech-signal context is a relevant VAD asset, its usefulness varies in unpredictable noise environments. Therefore, its usage should be adaptively adjustable to the noise type. This paper improves the use of context informa...

Skin detection from images, typically used as a preprocessing step, has a wide range of applications such as dermatology diagnostics and human-computer interaction design. It is a challenging problem due to many factors such as variation in pigment melanin, uneven illumination, and differences in ethnicity geographics. Besides, age and gender introduce additional difficulties to the detecti...

In this letter, we propose novel (semi)blind hard decision fusion rules that use the mean of the secondary user characteristics instead of their actual values. We show that these rules with slight (or no) additional system knowledge achieve better receiver operating characteristics than existing (semi)blind alternatives. These rules also have a low-complexity analytical solution under Neyman-Pears...

Numerous methods that automatically identify subjects depicted in sketches as described by eyewitnesses have been implemented, but their performance often degrades when using real-world forensic sketches and extended galleries that mimic law enforcement mug-shot galleries. Moreover, little work has been done to apply deep learning for face photo-sketch recognition despite its success in numerous a...

The phonocardiogram (PCG) signal indicates closing instants of atrio-ventricular and semilunar valves, and this information can also be extracted from two major profiles of a seismocardiographic (SCG) cycle. This letter presents a method to extract fundamental heart sounds (HSs) from a SCG signal. The proposed method employs discrete wavelet transform for signal decomposition, and subsequently, ce...

Matrix factorization is among the most popular approaches for matrix completion, with recent advances including gradient-based and deep-learning-based methods. Even though many applications involve matrices with discrete values, most of the existing matrix factorization models focus on the continuous domain. Discretization is applied as an additional step, often using a heuristic mapping that resu...

A sound field recording method based on spherical or circular harmonic analysis for arbitrary array geometry and directivity of microphones is proposed. In current methods based on harmonic analysis, a sound field is decomposed into harmonic functions with a center given in advance, which is called a global origin, and their coefficients are obtained up to a certain truncation order using micropho...