An implementation of stochastic gradient descent with proximal parameter adjustment
(for regularised parameters).
Data is dealt with sequentially using a one-pass implementation of the
online proximal algorithm described in chapters 9 and 10 of:
The Geometry of Constrained Structured Prediction: Applications to Inference and
Learning of Natural Language Syntax, PhD thesis, Andre T. Martins.
The implementation does the following:
- When an X,Y pair is received:
  - Update the currently held batch
  - If the batch is full:
    - While there is still a significant change in U and W:
      - Calculate the gradient of W holding U fixed
      - Proximal update of W
      - Calculate the gradient of U holding W fixed
      - Proximal update of U
      - Calculate the gradient of the bias holding U and W fixed
    - Flush the batch
  - Return the current U and W (the same as last time if the batch isn't full yet); see the sketch below.
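
Below is a minimal sketch of this batching and alternating proximal loop. It assumes plain double[][] matrices, an L1 regulariser (so each proximal update is a soft-thresholding step) and abstract gradient methods standing in for the learner's actual loss; the class and method names, batch size, step size and tolerance are illustrative and not part of the real implementation, and initialisation of U and W in the first round is omitted.

    import java.util.ArrayList;
    import java.util.List;

    // Sketch only: control flow of the batched, alternating proximal updates.
    public abstract class BilinearProximalSketch {
        protected double[][] u, w;           // "user" and "word" parameter matrices
        protected double bias;
        private final List<double[][]> batchX = new ArrayList<>();
        private final List<double[][]> batchY = new ArrayList<>();
        private final int batchSize = 10;    // illustrative batch size
        private final double eta = 0.01;     // gradient step size (assumed)
        private final double lambda = 0.001; // L1 regularisation weight (assumed)
        private final double tol = 1e-4;     // stop once U and W barely change

        // Gradients of the batch loss; the real learner's loss is not reproduced
        // here, so these are left abstract. They may read the u, w and bias fields.
        protected abstract double[][] gradientW(List<double[][]> xs, List<double[][]> ys);
        protected abstract double[][] gradientU(List<double[][]> xs, List<double[][]> ys);
        protected abstract double gradientBias(List<double[][]> xs, List<double[][]> ys);

        // Receive one X,Y pair; parameters only change once the batch fills.
        public void process(double[][] x, double[][] y) {
            batchX.add(x);
            batchY.add(y);
            if (batchX.size() < batchSize) return; // current U and W kept unchanged

            double change = Double.MAX_VALUE;
            while (change > tol) {
                double[][] oldW = w, oldU = u;
                w = proximal(w, gradientW(batchX, batchY)); // gradient of W, U held fixed
                u = proximal(u, gradientU(batchX, batchY)); // gradient of U, W held fixed
                bias -= eta * gradientBias(batchX, batchY); // plain step, U and W held fixed
                change = maxAbsDiff(oldW, w) + maxAbsDiff(oldU, u);
            }
            batchX.clear(); // flush the batch
            batchY.clear();
        }

        // Proximal update for an L1 regulariser: gradient step, then soft-thresholding.
        private double[][] proximal(double[][] param, double[][] grad) {
            double[][] out = new double[param.length][param[0].length];
            for (int i = 0; i < param.length; i++)
                for (int j = 0; j < param[i].length; j++) {
                    double v = param[i][j] - eta * grad[i][j];
                    out[i][j] = Math.signum(v) * Math.max(0.0, Math.abs(v) - eta * lambda);
                }
            return out;
        }

        private static double maxAbsDiff(double[][] a, double[][] b) {
            double d = 0;
            for (int i = 0; i < a.length; i++)
                for (int j = 0; j < a[i].length; j++)
                    d = Math.max(d, Math.abs(a[i][j] - b[i][j]));
            return d;
        }
    }

The bias is given a plain gradient step rather than a proximal one because the steps above only regularise U and W; a different regulariser would change the proximal operator accordingly.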

getBias

addU

Expand the U parameters matrix by adding a set of rows.
If U is currently unset, this function does nothing (assuming U will be initialised in the first round).
The new U parameters are initialised using BilinearLearnerParameters.EXPANDEDUINITSTRAT.

Parameters:

newUsers - the number of new users to add
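
A minimal sketch of this kind of row expansion is shown below, using a plain double[][] matrix and a hypothetical initStrategy callback standing in for the EXPANDEDUINITSTRAT strategy; the class and method names are illustrative only, and addW (below) follows the same pattern for W with EXPANDEDWINITSTRAT.

    import java.util.function.BiFunction;

    // Sketch only: append freshly-initialised rows to a parameter matrix.
    public final class ExpandRowsSketch {
        public static double[][] addRows(double[][] param, int newRows,
                BiFunction<Integer, Integer, double[][]> initStrategy) {
            if (param == null) return null; // unset: do nothing, initialised in the first round
            int cols = param[0].length;
            double[][] fresh = initStrategy.apply(newRows, cols);   // e.g. zeros or small random values
            double[][] out = new double[param.length + newRows][];
            System.arraycopy(param, 0, out, 0, param.length);        // keep existing rows
            System.arraycopy(fresh, 0, out, param.length, newRows);  // append new rows
            return out;
        }
    }

For example, u = ExpandRowsSketch.addRows(u, newUsers, (r, c) -> new double[r][c]) appends newUsers zero-initialised rows (the lambda being a placeholder for the configured strategy).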

addW

Expand the W parameters matrix by adding a set of rows.
If W is currently unset, this function does nothing (assuming W will be initialised in the first round).
The new W parameters are initialised using BilinearLearnerParameters.EXPANDEDWINITSTRAT.
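
The row-expansion sketch shown under addU applies here unchanged, with the W matrix and BilinearLearnerParameters.EXPANDEDWINITSTRAT substituted in, e.g. w = ExpandRowsSketch.addRows(w, newWords, initStrategy), where newWords and initStrategy are hypothetical names for the number of new rows and the configured initialisation strategy.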