In this paper, the authors propose mapping data to a low-dimensional Euclidean space in which the ordinary inner product closely approximates the value of a stationary (shift-invariant) kernel, i.e., the inner product in a potentially infinite-dimensional RKHS. The approach is based on Bochner’s theorem, which characterizes continuous shift-invariant positive-definite kernels as Fourier transforms of non-negative measures.
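
To make the idea concrete, here is a minimal sketch of a random feature map of this kind. It assumes a Gaussian (RBF) kernel, whose spectral density is itself Gaussian, and uses the common cosine-with-random-phase construction; the function names, the lengthscale, and the number of features are illustrative choices and are not taken from the paper, whose exact construction may differ.

```python
import numpy as np

def rff_features(X, n_features, lengthscale, rng):
    """Random Fourier features for the Gaussian kernel (illustrative sketch)."""
    d = X.shape[1]
    # Frequencies drawn from the kernel's spectral density, N(0, I / lengthscale^2).
    W = rng.normal(scale=1.0 / lengthscale, size=(d, n_features))
    # Random phases, uniform on [0, 2*pi).
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def rbf_kernel(X, lengthscale):
    """Exact Gaussian kernel matrix exp(-||x - y||^2 / (2 * lengthscale^2))."""
    sq_norms = np.sum(X**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * lengthscale**2))

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))            # toy data, assumed for illustration
Z = rff_features(X, n_features=5000, lengthscale=1.0, rng=rng)

K_exact = rbf_kernel(X, lengthscale=1.0)
K_approx = Z @ Z.T                      # plain inner products in feature space
max_abs_err = np.max(np.abs(K_exact - K_approx))
```

The approximation error shrinks as the number of random features grows, at the usual Monte Carlo rate, so the feature dimension trades accuracy against cost.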

The paper is about large-scale Gaussian process classification. Unlike much of the related work, the authors use Expectation Propagation (rather than Variational Inference) for approximate inference. They derive an approximate marginal likelihood expression that factorizes over the data instances, which allows for distributed inference and training. Training is additionally sped up by using mini-batches of data.
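
The reason such a factorization enables mini-batch training can be illustrated with a small generic sketch. It assumes only that the objective contains a sum of per-instance terms; the array `terms` below is a hypothetical stand-in for those contributions, and the code does not implement the paper's EP updates. Rescaling the sum over a random mini-batch by N / |B| gives an unbiased estimate of the full sum, and the same holds for its gradient.

```python
import numpy as np

def minibatch_estimate(per_instance_terms, batch_size, rng):
    """Unbiased estimate of sum_i terms[i] from a uniformly sampled mini-batch."""
    n = per_instance_terms.shape[0]
    # Sample a mini-batch of indices without replacement.
    idx = rng.choice(n, size=batch_size, replace=False)
    # Rescale by n / batch_size so the expectation equals the full sum.
    return (n / batch_size) * per_instance_terms[idx].sum()

rng = np.random.default_rng(0)
terms = rng.normal(size=1000)   # hypothetical per-instance contributions
full_sum = terms.sum()

# Averaging many mini-batch estimates recovers the full sum.
estimates = np.array([minibatch_estimate(terms, 100, rng) for _ in range(2000)])
mean_estimate = estimates.mean()
```

Because each term depends on a single data instance, the same sum can also be split across workers and the partial sums added, which is what makes distributed computation straightforward.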