Bidirectional is a "wrapper" layer: it wraps any uni-directional RNN layer to make it bidirectional.
Note that several modes are supported; these specify how the activations from the forward and
backward RNN networks should be combined. See the Bidirectional.Mode javadoc for more details.
Parameters are not shared here: there are 2 separate copies of the wrapped RNN layer, each with separate parameters.
Usage: .layer(new Bidirectional(new LSTM.Builder()....build()))
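
For example, a minimal configuration sketch (the layer sizes and the CONCAT mode choice are illustrative, not prescriptive):

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.LSTM;
import org.deeplearning4j.nn.conf.layers.recurrent.Bidirectional;

public class BidirectionalSketch {
    static MultiLayerConfiguration conf() {
        return new NeuralNetConfiguration.Builder()
                .list()
                .layer(new Bidirectional(Bidirectional.Mode.CONCAT,
                        new LSTM.Builder().nIn(100).nOut(128).build()))
                // With CONCAT mode, a following layer would see nIn = 2 * 128 = 256,
                // since forward and backward activations are concatenated
                .build();
    }
}
```

With ADD, MUL, or AVERAGE modes, the combined output size stays equal to the wrapped layer's nOut; only CONCAT doubles it.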

Method Detail

rnnTimeStep

Do one or more time steps using the previous time step state stored in stateMap.
This can be used to efficiently do the forward pass one (or n) steps at a time, instead of
always doing the forward pass from t=0.
If stateMap is empty, default initialization (usually zeros) is used.
Implementations also update stateMap at the end of this method.
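
A minimal streaming-inference sketch at the network level, assuming a trained MultiLayerNetwork containing recurrent layers (variable names are illustrative):

```java
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.api.ndarray.INDArray;

public class RnnTimeStepSketch {
    // chunkA and chunkB are consecutive pieces of one long sequence,
    // each with shape [miniBatchSize, nIn, timeSteps]
    static void stream(MultiLayerNetwork net, INDArray chunkA, INDArray chunkB) {
        INDArray outA = net.rnnTimeStep(chunkA); // first call: default (zero) state
        INDArray outB = net.rnnTimeStep(chunkB); // continues from the state stored by the previous call
        net.rnnClearPreviousState();             // reset stateMap before an unrelated sequence
    }
}
```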

rnnActivateUsingStoredState

Similar to rnnTimeStep, this method computes activations using the state stored in the
stateMap as the initialization. Unlike rnnTimeStep, however, this method does not alter
the stateMap; consequently, multiple calls to this method (with identical input) will:
(a) result in the same output
(b) leave the state maps (both stateMap and tBpttStateMap) in an identical state
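
A sketch of this repeatability property, assuming the RecurrentLayer signature rnnActivateUsingStoredState(input, training, storeLastForTBPTT, workspaceMgr); the layer and input variables are placeholders:

```java
import org.deeplearning4j.nn.api.layers.RecurrentLayer;
import org.deeplearning4j.nn.workspace.LayerWorkspaceMgr;
import org.nd4j.linalg.api.ndarray.INDArray;

public class StoredStateSketch {
    static void repeatable(RecurrentLayer layer, INDArray input) {
        // Both calls initialize from the same stored state, and neither call
        // modifies stateMap - so the outputs are identical
        INDArray a1 = layer.rnnActivateUsingStoredState(input, false, false, LayerWorkspaceMgr.noWorkspaces());
        INDArray a2 = layer.rnnActivateUsingStoredState(input, false, false, LayerWorkspaceMgr.noWorkspaces());
        // a1.equals(a2) == true; stateMap and tBpttStateMap are unchanged
    }
}
```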

rnnGetTBPTTState

Get the RNN truncated backpropagation through time (TBPTT) state for the recurrent layer.
The TBPTT state is used to store intermediate activations/state between parameter updates when doing
TBPTT learning.

rnnSetTBPTTState

Set the RNN truncated backpropagation through time (TBPTT) state for the recurrent layer.
The TBPTT state is used to store intermediate activations/state between parameter updates when doing
TBPTT learning.
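
As a sketch, the getter/setter pair can be used to copy TBPTT state between two compatible layer instances (the Map<String,INDArray> signatures are assumed from the RecurrentLayer interface; variable names are illustrative):

```java
import java.util.Map;

import org.deeplearning4j.nn.api.layers.RecurrentLayer;
import org.nd4j.linalg.api.ndarray.INDArray;

public class TbpttStateSketch {
    static void copyState(RecurrentLayer source, RecurrentLayer target) {
        // Transfer intermediate TBPTT state, e.g. between two identically
        // configured layer instances
        Map<String, INDArray> state = source.rnnGetTBPTTState();
        target.rnnSetTBPTTState(state);
    }
}
```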

tbpttBackpropGradient

Truncated BPTT equivalent of Layer.backpropGradient().
The primary difference is in the forward pass: for truncated BPTT, the forward pass is done
using the stored state as the initialization, rather than from a zero initialization as in
standard BPTT.

Returns a Pair where the Gradient is the gradient for this layer, and the INDArray is the epsilon (activation gradient)
needed by the next layer, but before the element-wise multiplication by sigmaPrime(z). For a standard feed-forward layer
L, return.getSecond() == dL/dIn = (w^(L)*(delta^(L))^T)^T. Note that the returned array should be placed in the
ArrayType.ACTIVATION_GRAD workspace via the workspace manager.
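
A sketch of consuming the returned Pair; the (epsilon, tbpttBackLength, workspaceMgr) parameter list is assumed here, and the Pair import path varies between ND4J versions:

```java
import org.deeplearning4j.nn.api.layers.RecurrentLayer;
import org.deeplearning4j.nn.gradient.Gradient;
import org.deeplearning4j.nn.workspace.LayerWorkspaceMgr;
import org.nd4j.common.primitives.Pair; // org.nd4j.linalg.primitives.Pair in older releases
import org.nd4j.linalg.api.ndarray.INDArray;

public class TbpttGradientSketch {
    static void backprop(RecurrentLayer layer, INDArray epsilon, int tbpttLength,
                         LayerWorkspaceMgr mgr) {
        Pair<Gradient, INDArray> p = layer.tbpttBackpropGradient(epsilon, tbpttLength, mgr);
        Gradient g = p.getFirst();       // parameter gradients for this layer
        INDArray epsOut = p.getSecond(); // activation gradient passed to the layer below
    }
}
```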

setInputMiniBatchSize

Set the current/last input mini-batch size.
Used for score and gradient calculations. The mini-batch size may differ from
getInput().size(0) due to reshaping operations - for example, when using RNNs with
DenseLayer and OutputLayer, where 3d time-series activations are flattened to 2d
(see the sketch below). Called automatically during the forward pass.
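
To make the reshaping point concrete, here is a hypothetical illustration of the shape change between a recurrent layer and a DenseLayer (all numbers are arbitrary):

```java
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

public class MiniBatchSizeSketch {
    public static void main(String[] args) {
        INDArray rnnOut = Nd4j.zeros(32, 128, 100);          // [miniBatch, nOut, timeSteps]
        INDArray dense2d = rnnOut.permute(0, 2, 1).dup('c')  // [miniBatch, timeSteps, nOut], contiguous
                                 .reshape(32 * 100, 128);    // [miniBatch*timeSteps, nOut]
        // dense2d.size(0) == 3200, but the true mini-batch size is still 32
        System.out.println(dense2d.size(0));
    }
}
```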

allowInputModification

A performance optimization: mark whether the layer is allowed to modify its input array in-place. In many cases
this is entirely safe; in others, the input array is shared by multiple layers, and hence it is not safe to
modify it in-place.
This is typically used by ops such as dropout.

feedForwardMaskArray

Feed forward the input mask array, setting it in the layer as appropriate. This allows different layers to
handle masks differently - for example, bidirectional RNNs and normal RNNs operate differently with masks: the
former set activations to 0 outside of the data-present region (and keep the mask active for subsequent layers,
such as dense layers), whereas normal RNNs don't zero out the activations/errors, instead relying on backpropagated
error arrays to handle the variable-length case.
This is also used, for example, by networks that contain global pooling layers, arbitrary preprocessors, etc.
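
In practice, masks are normally supplied alongside the data and propagated through the layers automatically; a minimal network-level sketch using MultiLayerNetwork's mask methods (shape conventions noted in comments):

```java
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.api.ndarray.INDArray;

public class MaskingSketch {
    static INDArray maskedOutput(MultiLayerNetwork net, INDArray features,
                                 INDArray featuresMask, INDArray labelsMask) {
        // Mask shape: [miniBatch, timeSteps]; 1 = data present, 0 = padding
        net.setLayerMaskArrays(featuresMask, labelsMask);
        INDArray out = net.output(features);
        net.clearLayerMaskArrays();
        return out;
    }
}
```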