Activation function to use for the recurrent step.
Default: hard sigmoid (hard_sigmoid).
If you pass None, no activation is applied
(i.e. "linear" activation: a(x) = x).

use_bias

Boolean, whether the layer uses a bias vector.

kernel_initializer

Initializer for the kernel weights matrix,
used for the linear transformation of the inputs.

recurrent_initializer

Initializer for the recurrent_kernel
weights matrix,
used for the linear transformation of the recurrent state.

bias_initializer

Initializer for the bias vector.

unit_forget_bias

Boolean.
If True, add 1 to the bias of the forget gate at initialization.
Setting it to True will also force bias_initializer="zeros".
This is recommended in Jozefowicz et al.

kernel_regularizer

Regularizer function applied to
the kernel weights matrix.

recurrent_regularizer

Regularizer function applied to
the recurrent_kernel weights matrix.

bias_regularizer

Regularizer function applied to the bias vector.

kernel_constraint

Constraint function applied to
the kernel weights matrix.

recurrent_constraint

Constraint function applied to
the recurrent_kernel weights matrix.

bias_constraint

Constraint function applied to the bias vector.

dropout

Float between 0 and 1.
Fraction of the units to drop for
the linear transformation of the inputs.

recurrent_dropout

Float between 0 and 1.
Fraction of the units to drop for
the linear transformation of the recurrent state.

implementation

Implementation mode, either 1 or 2.
Mode 1 will structure its operations as a larger number of
smaller dot products and additions, whereas mode 2 will
batch them into fewer, larger operations. These modes will
have different performance profiles on different hardware and
for different applications.
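
For illustration, a minimal construction sketch using the arguments above, assuming the
TF 2.x tf.keras.layers.LSTMCell API; the argument values are illustrative, and the
availability of some arguments (notably implementation) and the defaults vary by version:

    import tensorflow as tf

    # Hypothetical configuration; hard_sigmoid is the default recurrent activation
    # documented above, and implementation=2 batches the gate computations into
    # fewer, larger matrix operations.
    cell = tf.keras.layers.LSTMCell(
        units=64,
        recurrent_activation="hard_sigmoid",
        use_bias=True,
        kernel_initializer="glorot_uniform",
        recurrent_initializer="orthogonal",
        bias_initializer="zeros",
        unit_forget_bias=True,              # adds 1 to the forget-gate bias at init
        kernel_regularizer=tf.keras.regularizers.l2(1e-4),
        dropout=0.2,                        # input dropout fraction
        implementation=2,                   # may not exist in newer Keras versions
    )

    # Wrapping the cell in an RNN layer runs it across the time dimension.
    layer = tf.keras.layers.RNN(cell)
    outputs = layer(tf.random.normal([8, 10, 16]), training=True)  # shape (8, 64)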

Call arguments:

inputs: A 2D tensor.

states: List of state tensors corresponding to the previous timestep.

training: Python boolean indicating whether the layer should behave in
training mode or in inference mode. Only relevant when dropout or
recurrent_dropout is used.
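
A minimal sketch of calling a cell directly with these arguments, assuming
tf.keras.layers.LSTMCell, whose state list holds the previous hidden and cell states;
shapes and values are illustrative only:

    import tensorflow as tf

    cell = tf.keras.layers.LSTMCell(units=32, dropout=0.1)

    batch = 4
    inputs = tf.random.normal([batch, 8])                    # 2D input: (batch, features)
    states = [tf.zeros([batch, 32]), tf.zeros([batch, 32])]  # previous [hidden, cell] states

    # training=True enables the dropout/recurrent_dropout masks; pass False at inference.
    output, new_states = cell(inputs, states, training=True)
    print(output.shape)      # (4, 32)
    print(len(new_states))   # 2: new hidden state and new cell state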

reset_dropout_mask

Resets the cached dropout mask, if any. It is important for the RNN layer to
invoke this in its call() method so that the cached mask is cleared before
calling cell.call(). The mask should be cached across timesteps within the same
batch, but shouldn't be cached between batches. Otherwise it will introduce
unreasonable bias against certain indices of data within the batch.

reset_recurrent_dropout_mask

Resets the cached recurrent dropout mask, if any. It is important for the RNN
layer to invoke this in its call() method so that the cached mask is cleared
before calling cell.call(). The mask should be cached across timesteps within
the same batch, but shouldn't be cached between batches. Otherwise it will
introduce unreasonable bias against certain indices of data within the batch.
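
A sketch of where a custom, manually unrolled RNN-style layer would clear both cached
masks, assuming the tf.keras cell methods described above; the SimpleUnrolledRNN class
below is a hypothetical illustration, not the built-in tf.keras.layers.RNN implementation:

    import tensorflow as tf

    class SimpleUnrolledRNN(tf.keras.layers.Layer):
        def __init__(self, cell, **kwargs):
            super().__init__(**kwargs)
            self.cell = cell

        def call(self, inputs, initial_state, training=None):
            # Clear the masks once per batch, so a fresh dropout pattern is drawn
            # and then reused (cached) across every timestep of this batch.
            self.cell.reset_dropout_mask()
            self.cell.reset_recurrent_dropout_mask()

            states = initial_state
            outputs = []
            for t in range(inputs.shape[1]):   # unroll over the time dimension
                output, states = self.cell(inputs[:, t], states, training=training)
                outputs.append(output)
            return tf.stack(outputs, axis=1), states

    cell = tf.keras.layers.LSTMCell(16, dropout=0.2, recurrent_dropout=0.2)
    layer = SimpleUnrolledRNN(cell)
    x = tf.random.normal([2, 5, 8])
    init = [tf.zeros([2, 16]), tf.zeros([2, 16])]
    y, final_states = layer(x, init, training=True)  # y has shape (2, 5, 16)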