Title: Implicit stochastic approximation

Abstract: The need to carry out parameter estimation from massive data has
reinvigorated interest in iterative estimation methods in statistics and
machine learning. Classic work includes deterministic gradient-based methods,
such as quasi-Newton methods, and stochastic gradient descent with its variants,
including adaptive learning rates, acceleration, and averaging. Current work
increasingly relies on methods that employ proximal operators, leading to
updates defined through implicit equations, which need to be solved at each
iteration. Such methods are especially attractive in modern problems with
massive data because they are numerically stable and converge with minimal
assumptions, among other reasons. However, while the majority of existing
methods can be subsumed into the gradient-free stochastic approximation
framework developed by Robbins and Monro (1951), there is no such framework for
methods with implicit updates. Here, we conceptualize a gradient-free implicit
stochastic approximation procedure, and develop asymptotic and non-asymptotic
theory for it. This new framework provides a theoretical foundation for
gradient-based procedures that rely on implicit updates, and opens the door to
iterative estimation methods that require neither a gradient nor a fully known
likelihood.
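
To make the distinction between explicit and implicit updates concrete, the following is a minimal sketch in Python, not the general procedure developed in the paper. It assumes a linear model with squared-error loss, for which the implicit equation can be solved in closed form; the names `explicit_step`, `implicit_step`, `run`, and `error`, the step-size schedule `gamma0 / n`, and the simulated Gaussian data are all illustrative assumptions.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def explicit_step(theta, x, y, gamma):
    """Robbins-Monro-style update: the residual uses the OLD iterate."""
    resid = y - dot(x, theta)
    return [t + gamma * resid * xi for t, xi in zip(theta, x)]


def implicit_step(theta, x, y, gamma):
    """Implicit update: theta_new = theta + gamma * (y - x.theta_new) * x.

    The new iterate appears on both sides. For squared-error loss
    (an assumption of this sketch) the equation has a closed-form
    solution: the explicit step shrunk by 1 / (1 + gamma * ||x||^2).
    """
    resid = y - dot(x, theta)
    scale = gamma / (1.0 + gamma * dot(x, x))
    return [t + scale * resid * xi for t, xi in zip(theta, x)]


def run(step, theta_true, n_iter=2000, gamma0=50.0, noise_sd=0.1, seed=0):
    """Stream simulated (x, y) pairs through `step` with rate gamma0 / n."""
    rng = random.Random(seed)
    theta = [0.0] * len(theta_true)
    for n in range(1, n_iter + 1):
        x = [rng.gauss(0.0, 1.0) for _ in theta_true]
        y = dot(x, theta_true) + rng.gauss(0.0, noise_sd)
        theta = step(theta, x, y, gamma0 / n)
    return theta


def error(theta, theta_true):
    """Euclidean distance between an estimate and the true parameter."""
    return math.dist(theta, theta_true)
```

As a single-step illustration, starting from theta = (0, 0) with x = (2, 0), y = 2, and gamma = 1, the explicit update moves to (4, 0), overshooting the value (1, 0) that fits the observation exactly, whereas the implicit update moves to (0.8, 0). Calling `run(implicit_step, [1.0, -2.0])` and `run(explicit_step, [1.0, -2.0])` with the same deliberately large `gamma0` gives a rough sense of the numerical stability mentioned above.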