My research

I'm interested in various kinds of applied mathematics. Below I
describe some of the topics I have worked on.

Deep neural networks

Discontinuous weight selection and emergence of subprograms

There is a general DeVore-Howard-Micchelli theorem establishing upper
bounds on approximation rates of any parameterized approximation model
(for example, a neural network) under the assumption of continuous
parameter selection. However, if the assumption of continuity is
dropped, deep neural nets with a rather standard (but deep!)
architecture and the standard ReLU activation function can surpass
these bounds. This can be achieved by speeding up the computation with
something like subprograms; see this paper. The
construction is inspired by one used by Shannon in
his work on Boolean circuits.

Fast approximation of smooth functions
with deep ReLU networks

Smooth functions can typically be approximated more efficiently than
non-smooth functions. Usually, this is achieved by
choosing an approximating model of appropriate smoothness (e.g., using cubic
splines rather than linear splines if the approximated function is
smooth). However, deep neural networks can provide efficient (in a
sense, optimal) approximation rates for smooth functions even if their
activation function is only piecewise-linear (standard ReLU), see
the paper
(preprint).
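
To give a feeling for how a piecewise-linear activation can still yield fast rates, here is a small sketch of a standard trick in this area: approximating \(x^2\) on \([0,1]\) by composing a "tent" map built from three ReLU units. This is only an illustration under my own naming (`tent`, `approx_square` and `max_error` are made up for this example), not the full construction from the paper.

```python
def relu(x):
    return max(0.0, x)


def tent(x):
    # Tent map on [0, 1]: 2x for x <= 1/2 and 2(1 - x) for x >= 1/2,
    # written as a combination of three ReLU units.
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5) + 2.0 * relu(x - 1.0)


def approx_square(x, depth):
    """Approximate x**2 on [0, 1] with `depth` composed tent layers.

    The result equals the piecewise-linear interpolant of x**2 on the
    uniform grid with 2**depth intervals.
    """
    result = x
    g = x
    for s in range(1, depth + 1):
        g = tent(g)              # s-fold composition of the tent map
        result -= g / 4.0 ** s   # subtract the s-th sawtooth correction
    return result


def max_error(depth, grid=10001):
    # Largest deviation from x**2 over a fine uniform grid on [0, 1].
    points = [i / (grid - 1) for i in range(grid)]
    return max(abs(approx_square(x, depth) - x * x) for x in points)


if __name__ == "__main__":
    for depth in range(1, 7):
        print(depth, max_error(depth))
```

Each extra layer divides the error by 4 (the error is at most \(4^{-(m+1)}\) for depth \(m\)), so the accuracy improves exponentially in depth even though every unit is only piecewise linear.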

Voxel features for 3D shape recognition

I have written a small library for
computation of various geometric features of voxelized 3D shapes.
These features can be used in automated classification of 3D shapes,
e.g. by training an XGBoost classifier; see the paper.

Space tether systems

Space tether
systems are an interesting class of systems, potentially useful for
various purposes such as space debris removal, satellite collocation,
etc. In
this joint work with our Astrium colleagues we studied a
"hub-and-spoke" pyramidal formation rotating about a central
satellite and holding another satellite beneath it. Unfortunately,
this configuration requires relatively high fuel consumption.

So, in this paper we
proposed another, freely moving (no fuel!) formation serving
the same purpose. Instead of a circle, deputy satellites now move
along Lissajous curves. We find relations between the system's
parameters ensuring that the satellites and tethers never collide and
the main satellite remains immobile, and show how all these relations
can be satisfied.

Interestingly, the model seems to be especially stable if there are
at least 5 deputy satellites. Also interestingly, the tethers can get
entangled during operation; we have been able to separate the cases
with and without entanglement only partially (based on the winding
number invariant).

Surrogate Based Optimization (SBO)

In this
post I tried to explain in simple terms the idea of SBO and its most
natural version based on Expected Improvement (EI).
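
For concreteness, here is a minimal sketch of the EI criterion in its usual closed form. It assumes a minimization problem and a surrogate whose prediction at a candidate point is Gaussian with mean `mu` and standard deviation `sigma`; the function name and arguments are chosen for this example only.

```python
import math


def expected_improvement(mu, sigma, f_best):
    """Expected Improvement for minimization.

    mu, sigma: mean and standard deviation of the (assumed Gaussian)
               surrogate prediction at the candidate point.
    f_best:    best (smallest) objective value observed so far.
    """
    if sigma <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(f_best - mu, 0.0)
    z = (f_best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (f_best - mu) * cdf + sigma * pdf


if __name__ == "__main__":
    # A point predicted to be as good as the current best, with unit
    # uncertainty, has EI = 1 / sqrt(2 * pi).
    print(expected_improvement(mu=0.0, sigma=1.0, f_best=0.0))
```

An EI-based SBO loop repeatedly evaluates the objective at the candidate with the largest EI and refits the surrogate. EI rewards both a low predicted mean and a high predicted uncertainty.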

My research in this area concerned the following question: can
EI-based SBO fail, in the sense of never getting near the true global
optimum? The expected answer is "yes", but the proof is not obvious
because the behavior of SBO trajectories is not well understood on a
rigorous level. Nevertheless, in this paper I give a
rigorous example of failure in a sort of "analytic black hole"
scenario.

Interpolation

Explicit error formulas seem to be rare in approximation theory.
One well-known example is the beautiful integral error formula for
ordinary polynomial
interpolation with \(N\) knots \(x_1,\ldots,x_N\):
\[f(x)-\widehat{f}(x)=\frac{\prod_{n=1}^N (x-x_n)}{N!} \int_{s_n\ge 0,
\sum_{n=0}^N s_n=1} \frac{d^N f}{dx^N}\Big(\sum_{n=0}^N s_n x_n \Big)
d\mathbf{s},\] where \(x\equiv x_0\). This formula immediately
implies, for example, that the interpolants converge to the true
function if it is analytic in a sufficiently large domain. In this paper I show that this
formula can be generalized to interpolation by exponential or Gaussian
functions using the Harish-Chandra-Itzykson-Zuber integral; in
particular, for Gaussian basis functions \(e^{-(x-x_n)^2/2}\) \[f(x)-
\widehat{f}(x) = \frac{ \prod_{n=1}^N (x-x_n) }{ N! Z} \int_{S^{2N+1}}
\int_{\mathbb{U}(N)} e^{\mathrm{tr}(X U^{\dagger}
P_\mathbf{v}^{\dagger} \widetilde{X} P_\mathbf{v} U)}
e^{-\frac{x^2}{2}}\Big[\prod_{n=1}^N \big(\frac{d}{dq}-x_n\big)\Big]
e^{\frac{q^2}{2}}f(q)\Big|_{q = \mathbf{v}^{\dagger}
\widetilde{X}\mathbf{v}} d\mathbf{v}dU\] Though this expression looks
complicated, it can be used to prove convergence of interpolants
almost as easily as in the polynomial case. This result does not seem
to generalize to more general radial basis functions; e.g. the proof
breaks down even for basis functions of the form \(\sum_{k} c_k
e^{-(x-x_n)^2/a_k}\). The HCIZ integral is well known in random
matrix theory, representation theory and quantum field theory; it is
interesting that it also has applications in interpolation theory.
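
The polynomial error formula above is easy to check numerically. The sketch below does this for \(f=\exp\) (every derivative of which is again \(\exp\)) with three knots. It assumes that \(d\mathbf{s}\) is the uniform measure on the simplex normalized to total mass 1, which is what the \(1/N!\) prefactor suggests, and it estimates the integral by plain Monte Carlo. All function names are made up for this example.

```python
import math
import random


def lagrange_interpolant(knots, values, x):
    # Evaluate the interpolating polynomial at x in Lagrange form.
    total = 0.0
    for i, xi in enumerate(knots):
        weight = 1.0
        for j, xj in enumerate(knots):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += values[i] * weight
    return total


def simplex_average(func, points, samples, rng):
    # Monte Carlo average of func(sum_n s_n * points[n]) with s uniform
    # on the simplex (normalized exponential variables are Dirichlet(1,...,1)).
    acc = 0.0
    for _ in range(samples):
        e = [rng.expovariate(1.0) for _ in points]
        norm = sum(e)
        acc += func(sum(w * p for w, p in zip(e, points)) / norm)
    return acc / samples


def check_error_formula(x=0.3, knots=(0.0, 0.5, 1.0), samples=100000, seed=0):
    n = len(knots)
    f = math.exp  # the N-th derivative of exp is exp itself
    lhs = f(x) - lagrange_interpolant(knots, [f(k) for k in knots], x)
    prod = 1.0
    for k in knots:
        prod *= x - k
    rng = random.Random(seed)
    # The point x plays the role of x_0 in the formula.
    avg = simplex_average(f, (x,) + tuple(knots), samples, rng)
    rhs = prod / math.factorial(n) * avg
    return lhs, rhs


if __name__ == "__main__":
    print(check_error_formula())
```

The two returned numbers are the actual interpolation error and its integral representation; they should agree up to the Monte Carlo noise.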

Quantum Spin Systems

My research in this area mostly concerned rigorous analysis of ground
states with the help of cluster expansions.

In this paper I
developed a quadratic form-based perturbation theory and used it to
prove that small perturbations of the AKLT model
remain gapped (which was widely believed, but hard to prove).

In this
paper (preprint) I prove uniqueness of
the ground state of a weakly interacting system in a strong sense
involving "most general quantum boundary conditions", and discuss how
one can interpret these conditions.

In this
paper (preprint) I show that the
so-called "commensurate-incommensurate transition" in the AKLT model
can be explained by a peculiar Poisson-type random walk with a single reversal.

My industrial experience

At Datadvance, I was one of the developers of the Macros/pSeven
Core library and other custom software for optimization and
predictive modeling. This software is used at a number of major
engineering companies, e.g. at
Airbus. Our toolbox of regression methods is described in this paper.