We propose a parameter server system for distributed ML, which follows a
Stale Synchronous Parallel (SSP) model of computation that maximizes the time
computational workers spend doing useful work on ML algorithms, while still
providing correctness guarantees. The parameter server provides an easy-to-use
shared interface for read/write access to an ML model’s values (parameters
and variables), and the SSP model allows distributed workers to read older,
stale versions of these values from a local cache, instead of waiting to fetch
them from central storage. This significantly increases the proportion of
time workers spend computing, as opposed to waiting. Furthermore, the SSP
model ensures ML algorithm correctness by limiting the maximum age of the
stale values. We provide a proof of correctness under SSP, as well as
empirical results demonstrating that the SSP model achieves faster algorithm
convergence on several ML problems than fully synchronous and asynchronous
schemes.
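
To make the bounded-staleness idea concrete, the following is a minimal sketch of the clock rule behind SSP, not the system's actual implementation. The class name `SSPClock`, its methods, and the convention that a cached copy is tagged with the clock through which it includes updates are all assumptions introduced here for illustration. A worker may start its next iteration only while it is at most `staleness` clocks ahead of the slowest worker, and it may read a cached value only if that value is at most `staleness` clocks old.

```python
class SSPClock:
    """Hypothetical helper: per-worker clocks with a staleness bound."""

    def __init__(self, num_workers, staleness):
        self.clocks = [0] * num_workers  # iterations completed by each worker
        self.staleness = staleness       # maximum allowed gap between workers

    def tick(self, worker):
        # Called when a worker finishes an iteration and commits its updates.
        self.clocks[worker] += 1

    def may_proceed(self, worker):
        # A worker may start its next iteration only if it is no more than
        # `staleness` clocks ahead of the slowest worker. A real system would
        # block here; this sketch only reports the decision.
        return self.clocks[worker] - min(self.clocks) <= self.staleness

    def cache_is_fresh(self, worker, cached_clock):
        # Assumed convention: `cached_clock` is the clock through which the
        # cached copy includes updates. It may be read without contacting
        # the server if it is at most `staleness` clocks old.
        return cached_clock >= self.clocks[worker] - self.staleness


# Two workers, staleness bound of 1: worker 0 runs ahead, then must wait.
ssp = SSPClock(num_workers=2, staleness=1)
ssp.tick(0)
ssp.tick(0)
worker0_blocked = not ssp.may_proceed(0)  # worker 0 is 2 clocks ahead
ssp.tick(1)
worker0_resumes = ssp.may_proceed(0)      # gap is back within the bound
```

Under these assumptions, setting `staleness` to 0 reduces the rule to lockstep (fully synchronous) execution, while a very large value approaches asynchronous behavior; the bound is what lets workers keep computing on cached values without letting those values grow arbitrarily old.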