
Sparse Spikes Deconvolution over the Space of Measures

This numerical tour explores the resolution of the sparse spikes deconvolution problem over the infinite dimensional Banach
space of Radon measures. I would like to thank Jalal Fadili for his comments and corrections.

Installing toolboxes and setting up the path.

You need to unzip these toolboxes in your working directory, so that you have toolbox_signal and toolbox_general in your directory.

For Scilab users: you must replace the Matlab comment '%' by its Scilab counterpart '//'.

Recommendation: You should create a text file named for instance numericaltour.sce (in Scilab) or numericaltour.m (in Matlab) to write all the Scilab/Matlab commands you want to execute. Then, simply run exec('numericaltour.sce'); (in Scilab) or numericaltour; (in Matlab) to run the commands.

We follow here a recent trend initiated by several papers (see for instance [deCastroGamboa12, CandesFernandezGranda13, BrediesPikkarainen13]) that perform the recovery of sparse measures (i.e. sums of Diracs) using a convex sparse regularization over the space
of Radon measures (which is in some sense the dual of the Banach space of continuous functions).

We consider, for \(\la>0\), the following regularization \[ \umin{\mu \in \Mm(\mathbb{T})} \frac{1}{2} \norm{y-\Phi \mu}^2
+ \la \norm{\mu}_{\text{TV}}, \quad (\Pp_\la(y)) \] where \(\norm{\mu}_{\text{TV}}\) is the so-called total variation
of the measure \(\mu\), which should not be confused with the total variation seminorm of a function. For an arbitrary measure,
it reads \[ \norm{\mu}_{\text{TV}} = \sup\enscond{ \int_{\mathbb{T}} f d\mu }{ f \in C^0(\mathbb{T}), \normi{f} \leq 1 }.
\] When there is no noise, it makes sense to consider the limit \(\la \rightarrow 0^+\) of \(\Pp_\la(y)\), which reads \[
\umin{\mu} \enscond{ \norm{\mu}_{\text{TV}} }{ y=\Phi \mu }. \]

We focus our attention on discrete measures, which are finite sums of Diracs, and which we write, for \(x \in \mathbb{T}^n\)
and \(a \in \RR^n\) \[ \mu_{x,a} = \sum_{i=1}^n a_i \de_{x_i}. \] In this case, one has \[ \norm{\mu_{x,a}}_{\text{TV}} =
\norm{a}_1 = \sum_{i=1}^n \abs{a_i}, \] which shows that \(\norm{\cdot}_{\text{TV}}\) is a natural generalization of the
\(\ell^1\) norm to measures.
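In the setting of this tour, \(\Phi\) computes the \(2f_c+1\) low-frequency Fourier coefficients of \(\mu\). Under that assumption, applying \(\Phi\) to a discrete measure \(\mu_{x,a}\) and evaluating its TV norm is a finite computation; here is a sketch in Python/NumPy (rather than the tour's Matlab; the function name `Phi` is ours):

```python
import numpy as np

def Phi(x, a, fc):
    """Low-pass Fourier measurements of mu_{x,a}:
    (Phi mu)_k = sum_i a_i exp(-2i pi k x_i), for k = -fc..fc."""
    k = np.arange(-fc, fc + 1)
    return np.exp(-2j * np.pi * np.outer(k, x)) @ a

# mu = 0.8 * delta_{0.2} - 0.5 * delta_{0.6}
x = np.array([0.2, 0.6])
a = np.array([0.8, -0.5])
y = Phi(x, a, fc=10)        # 2*fc+1 = 21 Fourier coefficients
tv = np.sum(np.abs(a))      # ||mu_{x,a}||_TV = ||a||_1
```

The \(k=0\) coefficient is simply the total mass \(\sum_i a_i\), which gives a quick sanity check of the implementation.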

The goal of this numerical tour is to detail how to numerically solve this problem using an approach proposed by [CandesFernandezGranda13] and also to get some intuition about the primal-dual relationships that underly several theoretical analyses of the performance
of the recovery.

We denote \(\eta_\la = \Phi^* p_\la \in L^2(\mathbb{T})\), where \(p_\la\) is the solution of the dual problem \(\Dd_\la(y)\); it is a trigonometric polynomial.

The Lagrangian dual problem associated to \(\Pp_0(y)\) reads \[ \umax{p \in \CC^N} \enscond{ \dotp{y}{p} }{ \normi{\Phi^*p}
\leq 1 }. \] It does not have a unique solution in general. In [DuvalPeyre13], it is proved that \( p_\la \rightarrow p_0\) when \(\la \rightarrow 0\), where \(p_0\) is the solution of \(\Dd_0(y)\)
having minimal \(\ell^2\) norm. We denote by \(\eta_0=\Phi^* p_0\) the minimal norm dual certificate.

Strong duality holds between the primal problem \(\Pp_\lambda(y)\) and its dual \(\Dd_\lambda(y)\), and thus the values of
these problems are equal. For any solution \(\mu_\la\) of \(\Pp_\la(y)\), one has \[ \eta_\la = \frac{1}{\la} \Phi^*( y-\Phi
\mu_\la ). \]
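This relation can be checked numerically: given the measurements \(y\) and a candidate solution \(\mu_{x,a}\), one rescales the residual by \(1/\la\) and evaluates the resulting trigonometric polynomial. A sketch in Python/NumPy, assuming the low-pass Fourier \(\Phi\) of this tour (the function names are ours):

```python
import numpy as np

def Phi(x, a, fc):
    """Low-pass Fourier measurements of mu_{x,a}."""
    k = np.arange(-fc, fc + 1)
    return np.exp(-2j * np.pi * np.outer(k, x)) @ a

def eta_from_primal(y, x, a, lam, fc, t):
    """eta_lambda(t) = (1/lambda) * (Phi^*(y - Phi mu_lambda))(t),
    where Phi^* maps coefficients p to the polynomial sum_k p_k e^{2i pi k t}."""
    p = (y - Phi(x, a, fc)) / lam            # dual variable p_lambda
    k = np.arange(-fc, fc + 1)
    return np.exp(2j * np.pi * np.outer(t, k)) @ p
```

As a consistency check, feeding back a measure with zero residual (noiseless \(y = \Phi\mu\) and \(\mu_\la = \mu\)) yields \(\eta_\la \equiv 0\).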

Furthermore, the support of any solution \(\mu^\star = \mu_{x^\star,a^\star}\) of \( \Pp_\la(y) \) satisfies \[ \forall i,
\quad \abs{\eta_\la(x_i^\star)}=1. \] This property is at the heart of the numerical scheme proposed by [CandesFernandezGranda13] to solve the primal problem.

In [DuvalPeyre13], it is shown that this certificate \(\eta_0\) is important, since it somehow governs the stability of the recovered spikes'
locations (in particular how close they are to \(x_0\)) when the noise \(w\) is small.

Douglas-Rachford Algorithm

Note that this section is independent from the others, and in particular it has its own notations. The Douglas-Rachford
(DR) algorithm is an iterative scheme to minimize functionals of the form \[ \umin{x} F(x) + G(x) \] over a Hilbert space (assumed
for simplicity to be finite dimensional) endowed with a norm \(\norm{\cdot}\), where \(F\) and \(G\) are proper closed convex
functions with intersecting domains. We assume that the set of minimizers is non-empty. We also assume that one is able to
compute the proximal mappings \( \text{prox}_{\gamma F} \) and \( \text{prox}_{\gamma G} \) which are defined as \[ \text{prox}_{\gamma
F}(x) = \uargmin{y} \frac{1}{2}\norm{x-y}^2 + \ga F(y) \] (the same definition applies also for \(G\)).

The important point is that \(F\) and \(G\) do not need to be smooth. One only needs them to be "simple" in the sense that
one can compute in closed form their respective proximal mappings.
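A classical example of such a "simple" function is the \(\ell^1\) norm, whose proximal mapping is the entrywise soft-thresholding (a generic illustration in Python/NumPy, not part of the tour's toolbox):

```python
import numpy as np

def prox_l1(x, gamma):
    """prox_{gamma ||.||_1}(x): soft-thresholding, applied entrywise.
    Shrinks each entry toward 0 by gamma, and sets it to 0 if |x_i| <= gamma."""
    return np.sign(x) * np.maximum(np.abs(x) - gamma, 0.0)

out = prox_l1(np.array([3.0, -0.5, 1.0]), 1.0)   # -> [2., 0., 0.]
```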

This algorithm was introduced in [LionsMercier79] as a generalization of an algorithm introduced by Douglas and Rachford in the case of quadratic minimization (which corresponds
to the solution of a linear system).

We have used the following shorthand notation: \[ \text{rprox}_{\gamma F}(x) = 2\text{prox}_{\gamma F}(x)-x \]
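With this notation, the standard DR iteration can be sketched generically, given only the two proximal maps. The following Python illustration uses our own function names and the usual relaxed fixed-point form of the iteration:

```python
import numpy as np

def douglas_rachford(prox_F, prox_G, x0, gamma=1.0, mu=1.0, n_iter=200):
    """Minimize F + G via Douglas-Rachford. Iterates
       tx <- (1 - mu/2) tx + (mu/2) rprox_F(rprox_G(tx)),
    and returns x = prox_G(tx)."""
    rprox = lambda prox, x: 2 * prox(x, gamma) - x   # reflected prox
    tx = x0
    for _ in range(n_iter):
        tx = (1 - mu / 2) * tx + (mu / 2) * rprox(prox_F, rprox(prox_G, tx))
    return prox_G(tx, gamma)

# Toy check: minimize ||x-b||^2/2 (= F) subject to x >= 0 (G = indicator);
# the solution is max(b, 0).
b = np.array([1.0, -2.0])
prox_F = lambda x, gamma: (x + gamma * b) / (1 + gamma)
prox_G = lambda x, gamma: np.maximum(x, 0.0)   # projection on the constraint
x_star = douglas_rachford(prox_F, prox_G, np.zeros(2))
```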

It is of course possible to inter-change the roles of \(F\) and \(G\), which defines another iterative scheme.

One can show that for any value of \(\gamma>0\), any \( 0 < \mu < 2 \), and any initialization \(\tilde x_0\), \(x_k \rightarrow x^\star\)
which is a minimizer of \(F+G\).

Solving the Dual Problem with Douglas-Rachford

It is in general impossible to solve numerically \(\Pp_\la(y)\) because it is an infinite dimensional problem. In contrast,
\(\Dd_\la(y)\) is a finite dimensional problem, so that there is some hope to be able to solve it with an algorithm.

As detailed in [CandesFernandezGranda13], the constraint \(\normi{\Phi^* p} \leq 1\) can be re-cast as imposing that the trigonometric polynomials \(1-\Phi^* p\)
and \(1+\Phi^* p\) are sums of squares (SOS) of polynomials. A classical result (see for instance [Dumitrescu07]) ensures that the convex set of sum-of-squares polynomials can be described as the intersection between the cone of
Hermitian positive semi-definite matrices \(\Ss^+\) of size \( (N+1)\times(N+1) \) and an affine constraint.

The proximal operator of \(G = \iota_{\Ss^+}\) is the orthogonal projection on \(\Ss^+\). It is computed, for \(X=U\diag(\si_i)_i
U^*\) an eigendecomposition of the (Hermitian) matrix \(X\), as \[ \text{prox}_{\ga G}(X) = \text{Proj}_{\Ss^+}(X) = U\diag(\max(\si_i,0))_i U^*. \] For convenience,
this is implemented in the function perform_sdp_projection.

ProxG = @(X,gamma)perform_sdp_projection(X);
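perform_sdp_projection belongs to the tour's toolbox; an equivalent NumPy sketch of this projection (via an eigendecomposition, symmetrizing the input first for numerical safety; the function name is ours) could be:

```python
import numpy as np

def sdp_projection(X):
    """Orthogonal projection on the cone of Hermitian PSD matrices:
    clamp the negative eigenvalues of the Hermitian part to zero."""
    X = (X + X.conj().T) / 2                 # enforce Hermitian symmetry
    lam, U = np.linalg.eigh(X)               # eigendecomposition X = U diag(lam) U^*
    return (U * np.maximum(lam, 0.0)) @ U.conj().T

ProxG = lambda X, gamma: sdp_projection(X)   # prox of the indicator of S^+
P = sdp_projection(np.diag([1.0, -2.0]))     # negative eigenvalue is clamped
```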

The proximal operator of \(F\) is the concatenation of the proximal operator of \(f\) and the proximal operator of \(\iota_{\Cc}\)
(which in turn is the orthogonal projector on \(\Cc\)). Note that \(\Cc\) is an affine set for which the projection is easy
to compute as \[ \text{Proj}_{\Cc}(Q)_{i,j} = Q_{i,j} - \rho_{j-i} + \de_{i-j} \qwhereq \rho_{c} = \frac{1}{N-\abs{c}} \sum_{i}
Q_{i,i+c} \] where \(\de_{c}\) is the Kronecker delta. For convenience, this is implemented in the function perform_sos_projection.
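perform_sos_projection is likewise provided by the toolbox. The formula above amounts to shifting each diagonal \(c = j-i\) so that its mean equals the Kronecker delta \(\de_c\); a NumPy sketch of that recentering step (the function name is ours, and \(N\) denotes the matrix size, matching the normalization \(1/(N-\abs{c})\)):

```python
import numpy as np

def sos_projection(Q):
    """Project on the affine set C: for every diagonal c = j - i,
    the mean of the entries Q_{i,i+c} is set to delta_c (1 if c = 0, else 0)."""
    N = Q.shape[0]
    P = Q.astype(complex)                             # copy, allow complex Q
    for c in range(-(N - 1), N):
        i = np.arange(max(0, -c), min(N, N - c))      # rows with 0 <= i+c < N
        rho = Q[i, i + c].mean()                      # current diagonal mean
        P[i, i + c] += (1.0 if c == 0 else 0.0) - rho
    return P
```

Each diagonal of the output then has the prescribed mean, which is easy to verify on a small test matrix.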

Solving the Primal Problem with Root Finding

Following [CandesFernandezGranda13], one can hope to solve the primal problem by using the fact that the support of any solution is included in the saturation
set of \(\eta_\la\) \[ S = \enscond{t}{ \abs{\eta_\la(t)}=1 }. \] A key remark is that this set can be computed by finding
the roots of a trigonometric polynomial on the unit circle.
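Concretely, if \(\eta_\la(t) = \sum_{\abs{k} \leq f_c} c_k e^{2i\pi k t}\), then \(1-\abs{\eta_\la(t)}^2\) is a trigonometric polynomial whose Laurent coefficients are obtained from the autocorrelation of \((c_k)_k\); multiplying by \(z^{2f_c}\) turns it into an algebraic polynomial of degree \(4f_c\), whose roots of unit modulus give \(S\). A Python/NumPy sketch (function name ours):

```python
import numpy as np

def saturation_points(c, tol=1e-6):
    """Roots on the unit circle of 1 - |eta(t)|^2, where
    eta(t) = sum_k c[k] e^{2i pi k t}, with c indexed by k = -fc..fc."""
    # Laurent coefficients of 1 - |eta|^2: minus the autocorrelation of c ...
    d = -np.convolve(c, np.conj(c)[::-1])
    d[(len(d) - 1) // 2] += 1.0           # ... plus the constant term "1"
    # roots of the algebraic polynomial z^{2 fc} * (1 - |eta(z)|^2)
    r = np.roots(d[::-1])                 # np.roots wants highest degree first
    r = r[np.abs(np.abs(r) - 1) < tol]    # keep only roots on the unit circle
    t = np.mod(np.angle(r) / (2 * np.pi), 1.0)
    return np.unique(np.round(t, 6) % 1.0)
```

For instance, for \(\eta(t)=\cos(2\pi t)\) (coefficients \(c_{-1}=c_1=1/2\)), one has \(1-\abs{\eta}^2=\sin^2(2\pi t)\), which vanishes exactly at \(t \in \{0, 1/2\}\).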

We assume that \(\Phi_{x^\star}\) is injective, where \(x^\star\) is a vector containing the points in \(S\). This holds as soon
as \(\eta_\la\) is not a constant polynomial.

Display the magnitude of \(1-\abs{\eta_\la(t)}^2\), which is a trigonometric polynomial that vanishes at the locations \(S\)
of the Diracs of the primal solution.

Exercise 1: (check the solution) You can see that the dual certificate \(\abs{\eta_\la}\) saturates to \(+1\) or \(-1\) in regions that are far away from the
initial support \(x_0\). This means either that the noise is too large for the method to robustly estimate this
support, or that \(\la\) was not chosen large enough. Explore different values of the noise level \(\norm{w}\) and of \(\la\) to determine
empirically the signal/noise/\(\la\) regime where the support is successfully estimated.

exo1;

Certificates and Pre-Certificates

The minimal norm certificate \(\eta_0\) is important because it describes the evolution of the solution \(\mu_\la\) of \(\Pp_\la(y)\) as
a function of \(\la\) when \(\la\) is small enough. In particular, as shown in [DuvalPeyre13], if the saturation set \[ S_0 = \enscond{t \in \mathbb{T}}{ \abs{\eta_0(t)}=1 } \] is equal to \(x_0\) and if \(\eta_0''(t)
\neq 0\) for \(t \in S_0\), then for small noise and small \(\la\), \(\mu_\la\) has the same number of Diracs as \(\mu_0\),
and both the Diracs' locations and amplitudes are close to those of \(\mu_0\).

A major difficulty is that in general \(\eta_0\) is the solution of a convex program, and is hence difficult to compute and
to analyze.

Exercise 2: (check the solution) Study the evolution of the pre-certificate as the separation between the spikes diminishes. When is it the case that \(\eta_0=\eta_V\)?
When is it the case that \(\mu_0\) is the solution of \(\Pp_0(\Phi \mu_0)\)?