Proximal Minimization

The methods in this package provide solvers for constrained optimization problems. All of them use proximal operators to deal with non-smooth constraint functions.

The algorithms:

Proximal Gradient Method (PGM): forward-backward split with a single smooth function with a Lipschitz-continuous gradient and a single (non-smooth) constraint function. Nesterov acceleration is available.

Block/Alternating Proximal Gradient Method (bPGM): Extension of PGM to objective functions that are convex in several arguments; optional Nesterov acceleration.

Alternating Direction Method of Multipliers (ADMM): Douglas-Rachford split for two potentially non-smooth functions. We use its linearized form to solve for additional linear mappings in the constraint functions.

Block-Simultaneous Direction Method of Multipliers (bSDMM): Extension of SDMM to work with objective functions that are convex in several arguments. It is a proximal version of block coordinate descent methods.

In addition, bSDMM is used as the backend of a solver for Non-negative Matrix Factorization (NMF). As our algorithm allows an arbitrary number of constraints on each of the matrix factors, we prefer the term Constrained Matrix Factorization.

Details can be found in the paper "Block-Simultaneous Direction Method of Multipliers - A proximal primal-dual splitting algorithm for nonconvex problems with multiple constraints" by Fred Moolekamp and Peter Melchior.

Many proximal operators can be constructed analytically; see e.g. Parikh & Boyd (2014). We provide a number of common ones in proxmin.operators. An important class of constraints are indicator functions of convex sets, for which the proximal operator, given some point X, returns the closest point to X in the Euclidean norm that is in the set. That is what prox_circle above does.
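To make the projection idea concrete, here is a minimal sketch in plain NumPy. The function names and the (X, step) signature are assumptions chosen for illustration; they are not the operators shipped in proxmin.operators. For an indicator function the step size has no effect, since the result is always the nearest point of the set.

```python
import numpy as np

def prox_nonneg_sketch(X, step=None):
    """Projection onto the non-negative orthant (hypothetical helper)."""
    # The nearest non-negative point simply clips negative entries to zero.
    return np.maximum(X, 0.0)

def prox_disk_sketch(X, step=None, center=(0.0, 0.0), radius=1.0):
    """Projection onto a closed disk, a convex set (hypothetical helper)."""
    center = np.asarray(center, dtype=float)
    d = np.asarray(X, dtype=float) - center
    dist = np.linalg.norm(d)
    if dist <= radius:
        # Points inside the set are returned unchanged.
        return center + d
    # Points outside are pulled radially back onto the boundary.
    return center + radius * d / dist

inside = prox_disk_sketch(np.array([0.1, 0.2]))
outside = prox_disk_sketch(np.array([3.0, 4.0]))
```

Both helpers are idempotent: applying them to a point that already lies in the set returns that point.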

If the objective function is smooth and there is only one constraint, one can simply perform a sequence of forward-backward steps: step in gradient direction, followed by a projection onto the constraint.
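The forward-backward iteration can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the package's implementation: the objective (a 2D parabola centered on a hypothetical TARGET), the unit-disk constraint, and all function names are made up for the example, and no acceleration is used.

```python
import numpy as np

TARGET = np.array([2.0, 1.0])  # hypothetical minimum of the unconstrained parabola

def grad_f(x):
    # Gradient of f(x) = ||x - TARGET||^2; its Lipschitz constant is 2.
    return 2.0 * (x - TARGET)

def project_unit_disk(x):
    # Proximal operator of the indicator of the unit disk: a projection.
    n = np.linalg.norm(x)
    return x if n <= 1.0 else x / n

def proximal_gradient_sketch(x0, grad, prox, step, max_iter=500, tol=1e-10):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        # Forward step along the negative gradient, backward step via the prox.
        x_new = prox(x - step * grad(x))
        if np.linalg.norm(x_new - x) <= tol:
            return x_new
        x = x_new
    return x

# A step of 0.25 stays below 1/L = 0.5 for this objective.
x_pgm = proximal_gradient_sketch(np.zeros(2), grad_f, project_unit_disk, step=0.25)
```

Because the projection is applied in every iteration, each iterate is feasible, which is the property that distinguishes this scheme from the splitting methods discussed next.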

If the objective function is not smooth, one can use ADMM. It also handles two functions (the objective and one constraint), but it treats them separately. Unlike PGM, the constraint is only met at the end of the optimization and only within some error tolerance.
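A minimal sketch of the scaled form of ADMM, assuming a toy problem that is not taken from the package: minimize the non-smooth objective ||x - b||_1 subject to x >= 0. The function names, the penalty parameter rho, and the stopping rule are illustrative assumptions. The objective acts on x, the constraint acts on a copy z, and the two are only tied together through the dual variable u.

```python
import numpy as np

def prox_l1_shifted(v, lam, b):
    # Proximal operator of lam * ||x - b||_1: soft thresholding around b.
    d = v - b
    return b + np.sign(d) * np.maximum(np.abs(d) - lam, 0.0)

def prox_nonneg(v):
    # Proximal operator of the non-negativity indicator: a projection.
    return np.maximum(v, 0.0)

def admm_sketch(b, rho=1.0, max_iter=1000, tol=1e-8):
    x = np.zeros_like(b)
    z = np.zeros_like(b)
    u = np.zeros_like(b)  # scaled dual variable
    for _ in range(max_iter):
        x = prox_l1_shifted(z - u, 1.0 / rho, b)  # update for the objective
        z_old = z
        z = prox_nonneg(x + u)                    # update for the constraint
        u = u + x - z                             # dual update
        primal_res = np.linalg.norm(x - z)
        dual_res = rho * np.linalg.norm(z - z_old)
        if primal_res <= tol and dual_res <= tol:
            break
    return x, z

x_admm, z_admm = admm_sketch(np.array([1.0, -2.0, 3.0]))
```

Only z is guaranteed to be feasible during the iteration; x agrees with z once the primal residual has dropped below the tolerance.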

A fully working example that demonstrates the principle of operation is examples/parabola.py, which finds the minimum of a 2D parabola under hard boundary constraints (on a shifted circle or the intersection of lines).

Constrained matrix factorization (CMF)

We have developed this package with a few application cases in mind. One is matrix factorization under constraints on the matrix factors, i.e. describing a target matrix Y as a product of A S. If those constraints are only non-negativity, the method is known as NMF.

We have extended the capabilities substantially by allowing for an arbitrary number of constraints to be enforced. As above, the constraints and the objective function will be accessed through their proximal operators only.

For a solver, you can simply do this:

    from proxmin import nmf

    # PGM-like approach for each factor
    prox_A = ...  # a single constraint on A, solved by projection
    prox_S = ...  # a single constraint on S, solved by projection
    A0, S0 = ...  # initialization
    A, S = nmf(Y, A0, S0, prox_A=prox_A, prox_S=prox_S)

    # for multiple constraints, solved by ADMM-style split
    proxs_g = [[...],  # list of proxs for A
               [...]]  # list of proxs for S
    A, S = nmf(Y, A0, S0, proxs_g=proxs_g)

    # or a combination
    A, S = nmf(Y, A0, S0, prox_A=prox_A, prox_S=prox_S, proxs_g=proxs_g)
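To illustrate what a PGM-like update for each factor can look like, here is a self-contained NumPy sketch of alternating proximal gradient steps on 0.5 * ||Y - A S||_F^2 with non-negativity as the only constraint. It does not call proxmin and is not a description of how nmf is implemented; the function names, the (X, step) prox signature, the step-size rule, and the random test data are all assumptions made for the example.

```python
import numpy as np

def prox_nonneg_sketch(X, step):
    # Projection onto non-negative matrices; the step size is irrelevant here.
    return np.maximum(X, 0.0)

def alternating_pgm_sketch(Y, A0, S0, prox_A, prox_S, max_iter=200):
    A, S = A0.copy(), S0.copy()
    for _ in range(max_iter):
        # Update A with S fixed: the gradient is (A S - Y) S^T and its
        # Lipschitz constant is the spectral norm of S S^T.
        step_A = 1.0 / max(np.linalg.norm(S @ S.T, 2), 1e-12)
        A = prox_A(A - step_A * (A @ S - Y) @ S.T, step_A)
        # Update S with A fixed: the gradient is A^T (A S - Y) and its
        # Lipschitz constant is the spectral norm of A^T A.
        step_S = 1.0 / max(np.linalg.norm(A.T @ A, 2), 1e-12)
        S = prox_S(S - step_S * A.T @ (A @ S - Y), step_S)
    return A, S

# Hypothetical test data: an exactly factorizable non-negative matrix.
rng = np.random.default_rng(0)
Y = rng.random((6, 2)) @ rng.random((2, 8))
A0, S0 = rng.random((6, 2)), rng.random((2, 8))

A, S = alternating_pgm_sketch(Y, A0, S0, prox_nonneg_sketch, prox_nonneg_sketch)
err_init = np.linalg.norm(Y - A0 @ S0)
err_final = np.linalg.norm(Y - A @ S)
```

Swapping prox_nonneg_sketch for another projection changes the constraint on that factor without touching the update loop, which is the main appeal of accessing constraints through their proximal operators only.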

A complete and practical example is given in these notebooks of the hyperspectral unmixing study from our paper.