Bookmark

Computer Science > Machine Learning

Title:On the Optimization Landscape of Tensor Decompositions

Abstract: Non-convex optimization with local search heuristics has been widely used in
machine learning, achieving many state-of-art results. It becomes increasingly
important to understand why they can work for these NP-hard problems on typical
data. The landscape of many objective functions in learning has been
conjectured to have the geometric property that "all local optima are
(approximately) global optima", and thus they can be solved efficiently by
local search algorithms. However, establishing such property can be very
difficult.
In this paper, we analyze the optimization landscape of the random
over-complete tensor decomposition problem, which has many applications in
unsupervised learning, especially in learning latent variable models. In
practice, it can be efficiently solved by gradient ascent on a non-convex
objective. We show that for any small constant $\epsilon > 0$, among the set of
points with function values $(1+\epsilon)$-factor larger than the expectation
of the function, all the local maxima are approximate global maxima.
Previously, the best-known result only characterizes the geometry in small
neighborhoods around the true components. Our result implies that even with an
initialization that is barely better than the random guess, the gradient ascent
algorithm is guaranteed to solve this problem.
Our main technique uses Kac-Rice formula and random matrix theory. To our
best knowledge, this is the first time when Kac-Rice formula is successfully
applied to counting the number of local minima of a highly-structured random
polynomial with dependent coefficients.

Comments:

Best paper in the NIPS 2016 Workshop on Nonconvex Optimization for Machine Learning: Theory and Practice. In submission