The Code

The main goal of this project was to build a more flexible and extensible managed version of Mike O'Neill's excellent C++ project. I've included and used the splendid WPF TaskDialog Wrapper from Sean A. Hanley, the Extended WPF Toolkit, and the open-source SharpDevelop SharpZipLib module for unzipping the CIFAR-10 dataset. Visual Studio 2012/2013 and Windows 7 are the minimum requirements. I made maximum use of the parallel functionality offered in C# 4.0: the user can at all times choose how many logical cores are used in the parallel-optimized code parts, simply by moving the slider next to the View combobox.
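
The core-count slider simply feeds the standard ParallelOptions mechanism. A minimal sketch of the idea (the variable names here are assumptions, not the actual field names in the source):

using System.Threading.Tasks;

// selectedLogicalCores is bound to the slider next to the View combobox.
var parallelOptions = new ParallelOptions
{
    MaxDegreeOfParallelism = selectedLogicalCores
};

// A typical parallel-optimized part: split the per-neuron work across cores.
Parallel.For(0, neuronCount, parallelOptions, i =>
{
    // ... forward/backward computation for neuron i ...
});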

Using the Code

Here is example code that constructs a LeNet-5 network (see the InitializeDefaultNeuralNetwork() function in MainViewWindows.xaml.cs):
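
The original listing did not survive here, so below is a hypothetical sketch of the classic LeNet-5 topology written in the project's AddLayer() style. The constructor, the enum members other than LayerTypes.Local and ActivationFunctions.Logistic, and the argument order are all assumptions on my part; the authoritative version is the InitializeDefaultNeuralNetwork() function itself.

// Hypothetical sketch; see InitializeDefaultNeuralNetwork() for the real calls.
// Assumed argument order: maps, map width, map height, receptive field width, height.
NeuralNetwork network = new NeuralNetwork();                   // constructor arguments omitted
network.AddLayer(LayerTypes.Input, 1, 32, 32);                 // 1 input map of 32x32 pixels
network.AddLayer(LayerTypes.Convolutional, ActivationFunctions.Tanh, 6, 28, 28, 5, 5);   // C1
network.AddLayer(LayerTypes.Subsampling, ActivationFunctions.Tanh, 6, 14, 14, 2, 2);     // S2
network.AddLayer(LayerTypes.Convolutional, ActivationFunctions.Tanh, 16, 10, 10, 5, 5);  // C3 (partial mapping in the original)
network.AddLayer(LayerTypes.Subsampling, ActivationFunctions.Tanh, 16, 5, 5, 2, 2);      // S4
network.AddLayer(LayerTypes.Convolutional, ActivationFunctions.Tanh, 120, 1, 1, 5, 5);   // C5
network.AddLayer(LayerTypes.FullyConnected, ActivationFunctions.Tanh, 84);               // F6
network.AddLayer(LayerTypes.FullyConnected, ActivationFunctions.Tanh, 10);               // output layer, one neuron per digit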

Design View

In Design View you see how your network is defined and get a good picture of the current distribution of weight values in all the layers concerned.

Training View

In Training View you obviously train the network. The 'Play' button opens the 'Select Training Parameters' dialog where you define the training parameters. The 'Training Scheme Editor' button lets you create training schemes to experiment with. At any time the training can be paused or aborted, and while the training is paused you can save its weights. The 'Star' button will forget (randomly reset) all the learned weight values in each layer.

Testing View

In Testing View you get a better picture of the testing (or training) samples that are not recognized correctly.

Calculate View

In Calculate View you test a single testing or training sample with the desired properties and get a graphical view of all the output values in every layer.

Final Words

I would love to see GPU integration for offloading the highly parallel task of training the neural network. I made an attempt to use a simple MVVM structure in this WPF application. In the Model folder you will find the NeuralNetwork and DataProvider classes, which provide all the neural network code and deal with loading and providing the necessary training and testing samples. A NeuralNetworkDataSet class is also used to load and save neural network definitions. The View folder contains four different PageViews and a global PageView which acts as the container for all the different views (Design, Training, Testing and Calculate). I hope there's someone out there who can actually use the code and improve on it: extend it with an unsupervised learning stage (encoder/decoder construction), implement better loss functions, more training strategies (conjugate gradient, L-BFGS, ...), more datasets, better activation functions, ...

History

1.0.3.7: (07-08-14) (updated download of Setup & Sources on 07-21-14)

BugFix: slow speed resolved in Testing View

Added SGDLevenbergMarquardtModA training strategy. This can be used with a softmax output layer.

Possibility to save the weights while training: just click on Pause and then Save/Save As...

Each neuron in the local layer has several separate bias connections.
For example, in layer #7 neuron #0 has biases with weight indexes #26, #52, etc. (map #0 is connected to the previous maps #1, #2, etc.). Generally, if the map size is 1 x 1, the number of neuron biases is equal to the number of previous maps it is connected to. Is it by design?

There's a mistake in the network definition above, more precisely in the definition of layer #8. You can't have a layer with a receptive field of 5x5 in that position because the size of a map in layer #7 is exactly 1x1. Try changing the receptive field size in layer #8 to 1x1 instead of 5x5.

The net result is right, but we have multiple assignments of the same bias.
For example, for a 1 x 1 map, we will assign bias #0 for every previous map connected to neuron (map) #0. Not a big deal for 64 neurons, but I have seen an article with many thousands of neurons in a layer.
It seems that because each and every neuron has a bias, and you placed the biases at the beginning of the weight array, it might be simpler to just assign the biases to connections outside of the previousMap loop, as sketched below.
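
A hypothetical sketch of that suggestion (the helper names are placeholders, not the actual methods in the source): the bias connection is added exactly once per neuron, before the loop over the connected previous maps, so every bias weight ends up with a single connection.

// Hypothetical sketch; AddBiasConnection/AddWeightConnections stand in
// for whatever the real connection-building code does.
for (int map = 0; map < MapCount; map++)
{
    AddBiasConnection(map, map);                  // one bias connection per neuron (map)

    foreach (int prevMap in ConnectedPreviousMaps(map))
        AddWeightConnections(map, prevMap);       // receptive-field weights only, no bias here
}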

I understand your reasoning, but I don't see a proper way to implement it like you describe without altering all the fprop, bprop & bbprop steps. I'm currently not using this codebase anymore myself; I now have a much faster C++ implementation I'm still tinkering with. Thanks anyway for debugging the code!

The fix that you suggested does indeed link the biases to the right weights. But it introduces a new connection[][i] in Connections for each connected previous map.
For example, layer #7 (Local) consists of 64 maps of size 1 x 1, connected to the previous layer's 64 maps of 5 x 5. The first map is not connected to the first previous map, but is connected to previous maps #2 and #3. The function AddBias(Connections[position], position) is called on position #0 for each connected map. On each call it resizes the Connections[][] array and adds the new connections to the end of the array.
As a result, for the biases we still have many connections to the same weight (the bias).
Does it compromise the forward and backprop calculations? It seems that in the forward calculation it adds the neuron's bias multiple times.

I did not look at training time because, first, C# is not so quick compared to C++, and, second, IMHO there are a lot of possible corrections to speed up the existing C# program. What I am doing is learning from the great ANN knowledge you have embedded in the program, and I appreciate it very much.
Years ago I compared C# and C++ versions of the same small and simple ANN program and got about a 70% gain for C++. I am not sure that comparison was correct; it was before MS Concurrency.

Not sure if .NET Native will really improve this kind of thing. Being managed requires runtime bounds checks, which, given all the array indexing, is probably the culprit here. Some of those can be optimized away by a compiler, but most remain. You can work around that in C# using "unsafe" constructs.
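
As a minimal sketch of that workaround (not code from the project; compiling it needs the /unsafe switch), a weighted sum can be computed through pointers inside a fixed block, which pins the arrays and skips the per-index bounds checks:

// Pointer access inside the fixed block is not bounds-checked,
// unlike the equivalent weights[i] / inputs[i] indexing.
static unsafe double WeightedSum(double[] weights, double[] inputs)
{
    double sum = 0D;
    fixed (double* w = weights, x = inputs)
    {
        for (int i = 0; i < weights.Length; i++)
            sum += w[i] * x[i];
    }
    return sum;
}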

The exception is thrown when the layer previous to the last layer is being instantiated.
This is not exactly a bug; it is a violation of an implicit constraint.
Obviously, a receptive field should fit into a map of its previous layer. But here the previous layer's maps are 1 x 1 neurons, and the receptive field is 5 x 5 neurons. So, because the mask has the dimensions of the previous layer's map, and the maskMatrix consists of the previous layer's mapCount masks, we go out of the maskMatrix boundaries when we instantiate connections to the last of the previous layer's maps.

Correcting it to network.AddLayer(LayerTypes.Local, ActivationFunctions.Logistic, 384, 1, 1, 1, 1, 1, 1, 0, 0, 50) solves the problem, but with it this layer becomes just a fully connected layer.
If we want to connect each map to the 25 (5 x 5) previous maps, we have to use a mapping.
A similar configuration appears in the other (commented-out) network in InitializeDefaultNeuralNetwork().

I think it would not hurt to add some validation (an exception) for this constraint to AddLayer(), as sketched below. If C# had something like C++'s static_assert, a compile-time check (meta function) would be an excellent solution.
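
Since the map sizes are runtime values, C# has no static_assert equivalent that could catch this at compile time, so a runtime guard is the practical option. A hypothetical sketch (the parameter and property names are my assumptions about the codebase):

// Hypothetical guard at the top of AddLayer(): the receptive field
// must fit inside a map of the previous layer.
if (receptiveFieldWidth > previousLayer.MapWidth ||
    receptiveFieldHeight > previousLayer.MapHeight)
{
    throw new ArgumentException(string.Format(
        "A {0}x{1} receptive field does not fit in the previous layer's {2}x{3} maps.",
        receptiveFieldWidth, receptiveFieldHeight,
        previousLayer.MapWidth, previousLayer.MapHeight));
}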

I've added another DataProviderSet, modified mainly from the CIFAR-10 one, to do regression, specifically to locate 8 key points. I changed the output neuron count to 16. However, I am not quite clear what the code below is for.

for (int i = 0; i < ClassCount; i++)
    D2ErrX[i] = 1D; // TrainToValue; is this also true with a cross-entropy loss and a softmax output layer?

Well, in my case all the neuron outputs represent the relative locations of the points. Would you please clarify it for me?

This value must be the second derivative of the cost function.
For MSE (0.5 * sumof((actual - target)^2)) this derivative is 1; for cross entropy I'm not sure. Don't use TrainToValue; that is plain wrong. It only matters when you're using Levenberg-Marquardt based learning strategies.
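
To spell the MSE case out: for a single output y with target t, E = 0.5 * (y - t)^2, so dE/dy = y - t and d2E/dy2 = 1, which is exactly the constant assigned to D2ErrX[i] in the snippet above.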

My question is: when one of the ten neurons of the last layer is positive, then it is the recognized character. But if we want to know the probability, or a normalized value between 0 and 100, of this recognition, how can we calculate it?
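
One common way (my suggestion, not something the article itself prescribes) is to push the ten raw output activations through a softmax, which produces values that sum to 1 and can be read as probabilities; multiply the winner by 100 for a 0-100 score:

using System;
using System.Linq;

// Converts raw output activations into probabilities that sum to 1.
static double[] Softmax(double[] outputs)
{
    double max = outputs.Max();                   // subtract the max for numeric stability
    double[] exps = outputs.Select(o => Math.Exp(o - max)).ToArray();
    double sum = exps.Sum();
    return exps.Select(e => e / sum).ToArray();
}

// Usage: confidence of the winning class as a percentage.
// double[] p = Softmax(lastLayerOutputs);
// double percent = 100.0 * p.Max();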

The second layer of the LeCun network is 6 maps of 14x14 neurons each. The receptive field for each neuron is 2x2. That gives 5880 connections for a non-overlapping field, which is the same number as in LeCun's article. Your program sets 11262 connections. Neuron #0 has 5 connections (bias + 2x2), which is right, but neuron #1 has 7 connections, and so on.
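
For reference, the 5880 figure works out as 6 maps x 14 x 14 = 1176 neurons, each with 2 x 2 = 4 weight connections plus one bias: 1176 x 5 = 5880.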