Abstract: Fitting a rank-r matrix to noisy data is in general NP-hard. A popular approach is by convex relaxations via nuclear/trace norm minimization. This approach is shown to provide strong (often order-wise unimprovable) statistical guarantees in terms of error bounds and sample complexity. Computationally, while nuclear norm minimization can be solved in polynomial time in principle by semidefinite programming, its time complexity is often too high for large matrices. In this talk, we consider an alternative approach via projected gradient descent over the space of n-by-r matrices, which scales well to large instances. Moreover, we develop a unified framework characterizing the convergence of projected gradient descent for a broad range of non-convex low-rank estimation problems. Our results apply to the problems of matrix sensing, matrix completion, robust PCA, sparse PCA, densest subgraph detection and others, for which we match the best known statistical guarantees provided by convex relaxation methods.