The important thing is that we need to use tf.assign() to push a Variable back to the parameter server. In this example, the operation ‘tf.add’ happened to run on task 0 of the worker job. But when we deploy a more complicated application across many tasks, things get weird: a pipeline operation sometimes even runs on the ‘ps’ role! The official solution to this problem is ‘tf.train.replica_device_setter()’, which automatically places Variables on the parameter servers and Operations (as many replicas) on the workers. What does ‘tf.train.replica_device_setter()’ actually do? Let’s look at the backbone code of its implementation:
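Before reading the real source, it helps to see the placement rule it implements. The toy sketch below is an illustration written from scratch (the function name ‘place’ and the device strings are my own assumptions, not TF code): variable-like ops get pinned to the parameter-server job, everything else stays on the worker.

```python
# Toy sketch of the placement rule applied by the device function that
# tf.train.replica_device_setter() returns (illustrative only, not TF code).

# Op types that should live on the parameter server.
PS_OPS = {"Variable", "VariableV2", "VarHandleOp"}

def place(op_type,
          ps_device="/job:ps/task:0",
          worker_device="/job:worker/task:0"):
    """Return the device a graph op of the given type should land on."""
    return ps_device if op_type in PS_OPS else worker_device

print(place("VariableV2"))  # /job:ps/task:0
print(place("Add"))         # /job:worker/task:0
```

In real code you never call such a function yourself; you pass the device function to ‘with tf.device(...)’ and TensorFlow consults it for every op added to the graph.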

Python

def replica_device_setter(ps_tasks=0, ps_device="/job:ps",
                          worker_device="/job:worker", merge_devices=True,
                          cluster=None, ps_ops=None, ps_strategy=None):
  ...
  if ps_ops is None:
    # TODO(sherrym): Variables in the LOCAL_VARIABLES collection should not be
    # placed in the parameter server.
    ps_ops = ["Variable", "VariableV2", "VarHandleOp"]
  if not merge_devices:
    logging.warning(
        "DEPRECATION: It is recommended to set merge_devices=true in "
        "replica_device_setter")
  if ps_strategy is None:
    ps_strategy = _RoundRobinStrategy(ps_tasks)
  if not six.callable(ps_strategy):
    raise TypeError("ps_strategy must be callable")
  chooser = _ReplicaDeviceChooser(ps_tasks, ps_device, worker_device,
                                  merge_devices, ps_ops, ps_strategy)
  return chooser.device_function

All the Variable-creating op types are counted as ‘ps_ops’ and distributed across the parameter servers round-robin by ‘_RoundRobinStrategy’, while every other Operation is replicated onto the workers, which is why the chooser is called ‘_ReplicaDeviceChooser’.
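The round-robin idea behind ‘_RoundRobinStrategy’ is simple enough to re-create in a few lines. This is a from-scratch sketch for illustration (the class name and details are assumptions, not the TF source): a callable that hands out ps task indices 0, 1, ..., n-1 and then wraps around, so Variables are spread evenly across parameter servers.

```python
# Minimal re-creation of the round-robin placement idea (illustrative only).

class RoundRobinStrategy:
    """Callable assigning each variable op to ps tasks 0, 1, ..., n-1, 0, ..."""

    def __init__(self, num_tasks):
        self._num_tasks = num_tasks
        self._next = 0

    def __call__(self, op_name):
        # The op itself is ignored; placement depends only on call order.
        task = self._next
        self._next = (self._next + 1) % self._num_tasks
        return task

strategy = RoundRobinStrategy(num_tasks=3)
assignments = [strategy("var%d" % i) for i in range(5)]
print(assignments)  # [0, 1, 2, 0, 1]
```

Because the strategy only needs to be callable, you could pass your own (e.g. one that balances by variable size) as the ‘ps_strategy’ argument instead.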