This PR overhauls workspaces in DL4J, to improve performance and memory use.
It also adds a lot of workspace validation, to detect invalid workspace use (bugs, bad layer implementations etc).

Goals here:

1. Eliminate the performance bottleneck of migrateInputs() etc: all arrays are allocated in the final workspace they should be in

2. Reduce memory requirements

3. Improve maintainability and considerably simplify workspace usage:
a. Workspace-related code had become unwieldy in the MultiLayerNetwork and ComputationGraph FF/BP methods, and has been simplified (and made more maintainable)
b. A lot of additional workspace-related validation has been added - this should make it considerably easier to detect workspace-related bugs in the future (due to bugs in layers, etc)

In terms of design:

1. Workspace-related code should be handled at the level of array types (input, activations, etc), not workspace names ("loop_ff" etc). Layers etc don't need to know/care about the workspace configuration - simply that input arrays should use the INPUT enumeration, activation arrays should use the ACTIVATION enumeration, etc.

2. I've also slightly simplified the layer API by cleaning up a number of superfluous methods (unnecessary activate/output methods)

3. We're looking at a substantial rewrite of the MLN and CG feedforward and backprop methods:
a. They are built with workspaces in mind from scratch.
b. Reduced code redundancy (we had essentially the same FF loops in a number of places, with minor variants)
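
To make the "array types, not workspace names" point above concrete, here is a minimal, self-contained sketch. It is not the code from this PR: `SketchArrayType`, `SketchWorkspaceMgr`, `SketchLayer` and the workspace name "ws_input" are hypothetical stand-ins, and the "workspace" is represented by a plain string rather than a real ND4J workspace.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical stand-in for the array-type enumeration described above.
enum SketchArrayType { INPUT, ACTIVATION }

// Hypothetical manager: the network configures which workspace each array type uses.
final class SketchWorkspaceMgr {
    private final Map<SketchArrayType, String> workspaceNames = new EnumMap<>(SketchArrayType.class);

    SketchWorkspaceMgr with(SketchArrayType type, String workspaceName) {
        workspaceNames.put(type, workspaceName);
        return this;
    }

    // Looks up the workspace for an array type; an unconfigured type fails fast
    // (a very simple form of the validation mentioned in the goals).
    String workspaceFor(SketchArrayType type) {
        String name = workspaceNames.get(type);
        if (name == null) {
            throw new IllegalStateException("No workspace configured for array type " + type);
        }
        return name;
    }
}

// The layer only refers to array types; it never sees a name like "loop_ff".
final class SketchLayer {
    String activate(SketchWorkspaceMgr mgr) {
        return mgr.workspaceFor(SketchArrayType.ACTIVATION);
    }
}
```

With this shape, the network-level code decides once that, say, ACTIVATION arrays live in "loop_ff", and the same layer code works unchanged under a different feed-forward configuration.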

LGTM overall (of course, what else!) The only real criticism I have is that changing the API so that "almost everything" takes a workspace manager (WSM) seems a little excessive at first glance. I'm sure you had your reasons, maybe it doesn't really work any other way. My hunch says it should've been possible to set a WSM config at layer and network level.

For instance, in the Keras preprocessor changes you had to change the signature of preprocess, but the WSM isn't used at all in the body (which my IDE complains about).

Another idea worth exploring is to keep the old methods and wrap them like this:

> The only real criticism I have is that changing the API so that "almost everything" takes a workspace manager (WSM) seems a little excessive at first glance.
>
> I'm sure you had your reasons, maybe it doesn't really work any other way. My hunch says it should've been possible to set a WSM config at layer and network level.

It isn't the greatest for (developer) usability, sure - though it doesn't impact end users unless they write custom layers etc, as it's all internal.

I initially considered a "Layer.setWorkspaceManager" type design, but eventually rejected this as likely to introduce bugs. We'd be continually swapping out workspace managers in the layers (note different feed-forward methods have very different workspace configs), and if we forget to do it at some point, we can end up with crashes and really hard to track down bugs and performance issues.
tl;dr I went with explicit and stateless as it's the least likely to introduce bugs and hidden performance/memory issues (unnecessary/unintended detaches), at the expense of those API changes.
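
A rough sketch of the failure mode I mean (hypothetical classes, not the actual Layer API; the workspace manager is reduced to a string so the example stays self-contained):

```java
// Stateful design: the layer remembers whichever manager was set last.
final class StatefulSketchLayer {
    private String workspaceConfig = "none";

    void setWorkspaceManager(String workspaceConfig) {
        this.workspaceConfig = workspaceConfig;
    }

    String activate() {
        return "activations -> " + workspaceConfig;
    }
}

// Stateless design: the caller has to pass the manager on every call.
final class StatelessSketchLayer {
    String activate(String workspaceConfig) {
        return "activations -> " + workspaceConfig;
    }
}

final class WorkspaceStateDemo {
    // Training ran first; the inference path then forgets to swap the manager.
    static String statefulInferenceAfterTraining() {
        StatefulSketchLayer layer = new StatefulSketchLayer();
        layer.setWorkspaceManager("training config");
        layer.activate();          // training pass
        return layer.activate();   // "inference" pass silently reuses the training config
    }

    // With the explicit parameter there is no stale state to forget about.
    static String statelessInference() {
        return new StatelessSketchLayer().activate("inference config");
    }
}
```

In the stateful version nothing fails at the call site; the wrong config is just used quietly, which is exactly the kind of bug I'd rather make impossible.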

> Another idea worth exploring is to keep the old methods and wrap them like this:

Good question, but I also considered and rejected it. I did it in one place (time series utils) but I'm not adding this to the layers, again to avoid hidden performance/memory issues. Also we have too many Layer methods already, and I'm very hesitant to add even more :)
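
For reference, the general shape of that kind of wrapping would be something like the sketch below. This is only an illustration under assumptions: `WrappedSketchLayer`, `DEFAULT_CONFIG` and the string-based "workspace config" are hypothetical, and this is not the snippet from the review comment.

```java
final class WrappedSketchLayer {
    // Hypothetical fallback used when the caller does not pass a config.
    static final String DEFAULT_CONFIG = "default (no workspaces)";

    // New-style method: the workspace config is explicit.
    String activate(String input, String workspaceConfig) {
        return input + " activated using " + workspaceConfig;
    }

    // Old-style signature kept as a thin wrapper around the new method.
    String activate(String input) {
        return activate(input, DEFAULT_CONFIG);
    }
}
```

The convenience overload is the problem: any internal caller that picks it by accident gets the fallback config without any error, and each Layer method would need a second variant.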

> For instance, in the Keras preprocessor changes you had to change the signature of preprocess, but the WSM isn't used at all in the body (which my IDE complains about).

That was actually a bug (well, an incomplete implementation) on my part for those Keras preprocessors. I've fixed it now. :)
