to us in this work, we define limited quantum
speedup as a speedup obtained when compared
specifically with classical algorithms that “
correspond” to the quantum algorithm in the sense
that they implement the same algorithmic approach, but on classical hardware. A natural example is quantum annealing (8, 9) or adiabatic
quantum optimization (10) implemented on a
candidate physical quantum information processor versus corresponding classical algorithms
such as simulated annealing (SA) (11) (which
performs annealing on a classical Monte Carlo
simulation of the Ising spin glass and makes no
use of quantum effects) or simulated quantum
annealing (SQA) (12, 13) (a classical algorithm
mimicking quantum annealing in a path-integral
quantum Monte Carlo simulation). In this comparison, a limited quantum speedup would be a
demonstration that quantum effects improve the
annealing algorithm.

To illustrate the subtleties in detecting quantum speedup, even after a classical reference
algorithm is chosen, we compared the performance of an experimental 503-qubit D-Wave Two
(DW2, D-Wave Systems, Incorporated, Burnaby,
Canada) device to classical algorithms and analyzed the evidence for quantum speedup on the
benchmark problem of random spin glass instances. Specifically, we considered the problem
of finding the ground state of an Ising spin glass
model described by a “problem Hamiltonian”

HIsing ¼ − ∑
i∈V

hisz i − ∑
ði; j Þ∈E

Jijs z i sz j ð2Þ

with N binary variables sz i ¼ T1. The local fields{hi} and couplings {Jij} are fixed and define aproblem instance of the Ising model. The spinsoccupy the vertices V of a graph G ¼ fV;Eg withedge set E. We will consider the distributions ofthe time to solution over many random spinglass problems instances with zero local fields onthe Chimera graph realized by the DW2 device[see supplementary materials (14) for a definitionof that graph and why we choose hi = 0]. Thisproblem is NP-hard (15), and all known classicalalgorithms scale super-polynomially not only forthe hardest but also for typical instances. Althoughquantum mechanics is not expected to reduce thesuper-polynomial scaling to polynomial, a quan-tum algorithm might still scale better with prob-lem size N than any classical algorithm.

The D-Wave devices (16–19) are designed to be
physical realizations of quantum annealing using
superconducting flux qubits and programmable
fields {hi} and couplings {Jij}. Quantum annealing is implemented by initializing the system in
the ground state of a transverse magnetic field
HX ¼ −∑i∈Vsx i , where the s’s denote the Pauli
matrices; HX is turned off adiabatically, while
HIsing is turned on simultaneously. A measurement of each qubit in the computational basis
then determines the ground state of HIsing with a
probability affected by many factors, including
thermal excitations and implementation errors
(20). Tests on D-Wave One (DW1) (21) and DW2
devices (22) have shown that for small problem
sizes the device correlates well with the predictions of a quantum master equation. For larger
problem sizes, a 108-qubit DW1 device correlated
well with SQA (20), which indicates that, despite
decoherence and coupling to a thermal bath, the
behavior of the device is consistent with it actually performing quantum annealing (23, 24).
However, for these benchmarks both SQA and
the DW1 device are also well described by a semiclassical mean-field version of SQA (25), which
raises the question of whether quantum effects
play an important role. The approach adopted
here, of seeking evidence of a (limited) quantum
speedup, directly addresses the crucial question of
whether large-scale quantum effects create a
potential for the devices to outperform classical
algorithms. To test this possibility, we compared
the performance of a DW2 device to two “
corresponding” classical algorithms: SA and SQA.

Because quantum speedup concerns the as-ymptotic scaling of S(N), we considered thesubtleties of estimating it from small N andinefficiencies at small N that can fake or mask aspeedup. In the context of annealing, the optimalchoice of the annealing time ta turns out to becrucial for estimating asymptotic scaling. Toillustrate this, we first considered the time tosolution using SA and SQA run at different fixedta values, independent of N. Figure 1A shows thescaling of the median total annealing time [over1000 different random instances on the D-WaveChimera graph; see supplementary materials(14)] for SQA to find a solution at least once withprobability P = 0.99. Corresponding times for SAare shown in fig. S10. We observed that at con-stant ta, as long as ta is long enough to find theground state almost every time, the scaling of thetotal effort is at first relatively flat. The total effortthen rose more rapidly, once one reached Nvalues for which the chosen ta is too short, andthe success probabilities were thus low, requiringmany repetitions. Extrapolations to N → needto consider the lower envelope of all curves,which corresponds to choosing an optimal an-nealing time topt a ðN Þ for each N.

Figure 1B demonstrates that, when using fixed
ta values, no conclusion can be drawn from annealing (simulated or in a device) about the asymptotic scaling. The initial slow increase at
constant ta is misleading, and instead the optimal annealing time topt a needs to be used for
each N. To illustrate this, we show (Fig. 1B) the
real “speedup” ratio of the scaling of SA and
SQA (actually a slowdown) and a fake speedup resulting from a constant and excessively
long ta for SQA. Because SA outperforms SQA
on our benchmark set, it is our algorithm of
choice in the comparisons with the DW2 reported below.

A related issue is the scaling of hardware
resources (computational gates and memory)
with problem size, which must be identical for
the devices we compare. A device whose hardware resources scale as N can achieve an intrinsic parallel speedup compared with a fixed

SCIENCE
sciencemag.org 25 JULY 2014 • VOL 345 ISSUE 6195 421
AB

Fig. 2. Scaling of time to solution for r = 1 (A) and r = 7 (B). Shown is the scaling of the pure annealing time to find the ground state at least once with a
probability P = 0.99 for various quantiles of hardness for simulated annealing (SA, dashed) and the DW2 (solid). The solid lines terminate for the highest
quantiles because the DW2 did not solve the hardest instances for large problem sizes within the maximum number of repetitions (at least 32,000) of the
annealing we performed.