“Eager Mode” in Tensorflow

Although TensorFlow was the most popular deep learning framework in 2016, PyTorch, a smaller, newer framework developed by FAIR (Facebook AI Research), became a dark horse this year. PyTorch supports dynamic graph computation, which means you can freely add or remove layers in your model at runtime. This lets developers and researchers build new models more rapidly.
To counter PyTorch, the TensorFlow team added a new mechanism named “Eager Mode”, which also supports dynamic graph computation. An example of “Eager Mode” looks like:

import tensorflow as tf
import tensorflow.contrib.eager as tfe

tfe.enable_eager_execution()  # Enable Eager Execution Mode

x = [[2.]]
m = tf.matmul(x, x)
print(m)

As shown above, unlike a traditional TensorFlow application, which uses “Session.run()” to execute the whole graph, developers can inspect the values and gradients of variables in any layer at any step.
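For reference, the matmul in the example computes [[2.]] × [[2.]] = [[4.]], and with eager execution enabled, print(m) shows that value immediately instead of a symbolic tensor. The arithmetic can be checked in plain Python, with no TensorFlow required (matmul_2d below is a naive stand-in, not a TensorFlow function):

```python
def matmul_2d(a, b):
    """Naive dense matrix multiply for lists of lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    assert len(a[0]) == inner  # inner dimensions must agree
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

print(matmul_2d([[2.]], [[2.]]))  # [[4.0]]
```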

How does TensorFlow do it? Actually, the trick behind the API is not difficult. Take the most common operation, “matmul”, as an example:

# file: tensorflow/python/ops/math_ops.py
def matmul(a,
           b,
           transpose_a=False,
           transpose_b=False,
           adjoint_a=False,
           adjoint_b=False,
           a_is_sparse=False,
           b_is_sparse=False,
           name=None):
  ......
  if use_sparse_matmul:
    return sparse_matmul(
        a,
        b,
        transpose_a=transpose_a,
        transpose_b=transpose_b,
        a_is_sparse=a_is_sparse,
        b_is_sparse=b_is_sparse,
        name=name)
  else:
    return gen_math_ops._mat_mul(
        a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)

Let’s look into “gen_math_ops._mat_mul()”:

# file: bazel-genfiles/tensorflow/python/ops/gen_math_ops.py
def _mat_mul(a, b, transpose_a=False, transpose_b=False, name=None):
  ......
  if _ctx.in_graph_mode():
    _, _, _op = _op_def_lib._apply_op_helper(
        "MatMul", a=a, b=b, transpose_a=transpose_a, transpose_b=transpose_b,
        name=name)
    _result = _op.outputs[:]
    _inputs_flat = _op.inputs
    _attrs = ("transpose_a", _op.get_attr("transpose_a"), "transpose_b",
              _op.get_attr("transpose_b"), "T", _op.get_attr("T"))
  else:
    _attr_T, _inputs_T = _execute.args_to_matching_eager([a, b], _ctx)
    (a, b) = _inputs_T
    _inputs_flat = [a, b]
    _attrs = ("transpose_a", transpose_a, "transpose_b", transpose_b, "T",
              _attr_T)
    _result = _execute.execute(b"MatMul", 1, inputs=_inputs_flat,
                               attrs=_attrs, ctx=_ctx, name=name)
  _execute.record_gradient(
      "MatMul", _inputs_flat, _attrs, _result, name)
  _result, = _result
  return _result

As we can see, in graph mode it goes through “_apply_op_helper()” to build the graph (without running it), while in eager mode it executes the operation directly.