May 27, 3:00, WeH 4601
A Distributed RL Scheme for Packet Routing
Justin Boyan
(joint work with Michael Littman)
We describe an adaptive algorithm for packet routing in
which a reinforcement learning module is embedded into each
node of a switching network. In each time step, a node examines
the packet at the head of its queue and sends it to the neighbor
estimated to give the shortest routing time to that packet's
destination. Only local information and local communication
are used at each node to keep up-to-date estimates of
shortest routing times. In simple experiments involving a
36-node, irregularly connected network, this learning
approach proves superior to a nonadaptive algorithm based on
precomputed shortest paths.
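
The per-node behavior described above can be sketched roughly as
follows. This is a minimal illustration, not the authors'
implementation: the abstract does not state the update rule, so the
temporal-difference style update, the learning rate, and every name
here (Node, choose_neighbor, update, queue_delay, link_delay) are
assumptions made for the example.

```python
from collections import defaultdict


class Node:
    """One routing node that keeps only local estimates (hypothetical sketch)."""

    def __init__(self, name, neighbors, learning_rate=0.5):
        self.name = name
        self.neighbors = list(neighbors)
        self.learning_rate = learning_rate  # assumed step size, not from the source
        # estimate[dest][neighbor] = estimated routing time to dest via neighbor.
        # Unknown destinations start with an estimate of 0.0 for every neighbor.
        self.estimate = defaultdict(lambda: {n: 0.0 for n in self.neighbors})

    def best_estimate(self, dest):
        """Smallest estimated remaining routing time from this node to dest."""
        if dest == self.name:
            return 0.0
        return min(self.estimate[dest].values())

    def choose_neighbor(self, dest):
        """Pick the neighbor with the shortest estimated routing time to dest."""
        table = self.estimate[dest]
        return min(self.neighbors, key=lambda n: table[n])

    def update(self, dest, neighbor, queue_delay, link_delay, neighbor_estimate):
        """Move the local estimate toward what was just observed.

        neighbor_estimate is the neighbor's own best estimate for dest,
        reported back over the local link (assumed form of the
        'local communication' mentioned in the abstract).
        """
        target = queue_delay + link_delay + neighbor_estimate
        old = self.estimate[dest][neighbor]
        new = old + self.learning_rate * (target - old)
        self.estimate[dest][neighbor] = new
        return new
```

In this sketch a node would call choose_neighbor for the packet at
the head of its queue, forward the packet, and then call update with
the estimate the chosen neighbor reports back. No global view of the
network is needed at any step.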