Introduction

The BackPropagation network gets its name from the way it learns: training starts at the Learn function in the output nodes and proceeds backwards through the network, updating the weights on the links as it goes. This example is based on chapter 5 of Joey Rogers' book Object-Oriented Neural Networks in C++ and has been expanded with a facility to generate test data and then run that data through the trained network.

The BackPropagation Network

There are five new classes to introduce with the BackPropagation program, most of which inherit directly from classes seen previously. Structurally, the network is slightly more complicated than those in previous examples because we are now starting to look at networks that contain layers of nodes. These layers include what are called "hidden" nodes: nodes that are part of the network but are not directly accessed by code from the running program, lying instead between the input and the output nodes.

The Back Propagation Network Class

The BackPropagationNetwork class is used for building and controlling the network and inherits from the BasicNetwork class. It expands on the basic network class in that it must be capable of providing layers for the network and must know where those layers are. I say this because the nodes are still stored in a single array list; the layers are a conceptual construct that exists only where the code says it does. There is no attempt to build a code hierarchy that models the diagram of the BackPropagation network like the one above.

It should also be noted that although this example only has one output node, the code contains the facility to deal with more than one output node. This is noticeable in the code that gets and sets the output errors and values.

As you can see from the code, whenever the output values for the network are accessed, the code calls GetNodeAt, which takes the ID passed to it (zero for the first node) and adds it to the nFirstOutputNode value, which is the network class's way of keeping track of the position in the ArrayList at which the first output node starts.
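The indexing scheme can be sketched as follows. This is a minimal Python illustration with assumed names mirroring the article's C# description (the flat node list and the first-output-node offset); it is not the library's actual implementation.

```python
class LayeredNodeList:
    """Sketch: all nodes live in one flat list; layers are found by offset."""

    def __init__(self, num_inputs, num_middle, num_outputs):
        total = num_inputs + num_middle + num_outputs
        self.nodes = ["node%d" % i for i in range(total)]
        # Index in the flat list where the first output node starts,
        # the role played by nFirstOutputNode in the article's code.
        self.first_output_node = num_inputs + num_middle

    def get_output_node(self, node_id):
        # ID 0 is the first output node, ID 1 the second, and so on.
        return self.nodes[self.first_output_node + node_id]

net = LayeredNodeList(2, 2, 1)
print(net.get_output_node(0))  # the fifth node overall: "node4"
```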

The CreateNetwork function starts by calculating the numbers of nodes and links that are required for the creation of the network. There are a couple of ways to add layers to the BackPropagationNetwork class. The first, and the one used by this code, is to build a three-layer network using the provided constructor, which takes the number of nodes for each layer. The second is to use the constructor that just takes the number of layers; you can then add each layer with:

AddLayer = numberOfNodes;

passing the number of nodes that are to be in that specific layer.

The CreateNetwork function then creates the nodes according to where they fall in the layer structure: anything in the first layer is an input node, anything in the last layer is an output node, and anything in between is a middle node. Once the nodes are created, the code creates the correct number of links to join them together, then cycles through the nodes creating the links between the separate layers.
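The arithmetic CreateNetwork performs before allocating anything can be sketched like this (an illustrative Python fragment, not the library's code): every node in one layer is linked to every node in the next.

```python
def count_nodes_and_links(layer_sizes):
    """Count nodes and links for a fully connected layered network."""
    num_nodes = sum(layer_sizes)
    # Each node in a layer links to every node in the following layer.
    num_links = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    return num_nodes, num_links

# A 2-2-1 XOR network: 5 nodes and 2*2 + 2*1 = 6 links.
print(count_nodes_and_links([2, 2, 1]))  # (5, 6)
```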

The Back Propagation Output Node Class

The BackPropagationOutputNode class inherits from the Adaline node class so that it can use the Run function provided by that class. The biggest change in the BackPropagationOutputNode class is the change to the Transfer function. This is:

which computes the error for the output node as the current node value, multiplied by one minus the current node value, multiplied by the current error value for the node minus the current node value. This function is called from the BackPropagationOutputNode Learn function, which sets the node error value with the returned result.
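In symbols, the computation described above is the standard sigmoid-output error term y * (1 - y) * (t - y). A small Python sketch (the library itself is C#; names here are illustrative):

```python
def output_node_error(value, target):
    """Error term for a sigmoid output node.

    `value` is the node's current output y and `target` is the desired
    output t held in the node's error value; the value * (1 - value)
    factor is the derivative of the sigmoid transfer function.
    """
    return value * (1.0 - value) * (target - value)

print(output_node_error(0.5, 1.0))  # 0.5 * 0.5 * 0.5 = 0.125
```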

The Learn function starts off by getting the error value for the current node and then calculates the new weight value for each link as the learning rate multiplied by the node error value multiplied by the link's input value.
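That weight update can be sketched as follows (illustrative Python; the function name, signature, and the example learning rate are assumptions, not the library's API):

```python
def learn_output_weights(weights, inputs, node_error, learning_rate):
    """Each link weight grows by learning rate * node error * link input."""
    return [w + learning_rate * node_error * x
            for w, x in zip(weights, inputs)]

new_weights = learn_output_weights([0.2, 0.4], [1.0, 0.0], 0.1, 0.5)
# The first weight (input 1.0) moves; the second (input 0.0) does not.
print(new_weights)
```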

The Back Propagation Middle Node Class

The BackPropagationMiddleNode class inherits from the BackPropagationOutputNode class and overrides the ComputeError function.

The difference between the two versions of the ComputeError function is that the BackPropagationMiddleNode calculates the error for the middle nodes differently: it first totals the weighted errors of the nodes it feeds, then returns the current node value, multiplied by one minus the current node value, multiplied by that total weighted error value.
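A sketch of that middle-node computation (illustrative Python, not the library's code): sum each outgoing link's weight times the error of the node it feeds, then scale by the sigmoid derivative.

```python
def middle_node_error(value, out_link_weights, out_node_errors):
    """Error term for a middle node: y * (1 - y) * sum(w_i * e_i)."""
    total = sum(w * e for w, e in zip(out_link_weights, out_node_errors))
    return value * (1.0 - value) * total

# One outgoing link of weight 0.8 to a node with error 0.125:
# 0.5 * (1 - 0.5) * (0.8 * 0.125) = 0.25 * 0.1 = 0.025
print(middle_node_error(0.5, [0.8], [0.125]))
```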

The Back Propagation Link Class

The BackPropagationLink class is an extension of the BasicLink class and is provided to enable the use of the delta and momentum values that are used by the Back Propagation Network. Its main difference from the BasicLink class comes with the UpdateWeight function.

This gets the momentum from the node, which is set when the Back Propagation network is created. It then calculates the new weight by adding to the current weight the new value passed in, plus a proportion of the previous weight change. In this example the momentum is set to 0.9, so that proportion will be 0.9 times whatever value is stored in the delta; as you can see, the new value is then stored in the delta on the next line of code, hence giving a percentage of the previous update value on each update.
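The momentum update described above can be sketched like this (illustrative Python; the real C# link mutates its own fields rather than returning a tuple):

```python
def update_weight(weight, stored_delta, new_value, momentum=0.9):
    """Add the new change plus momentum times the previous change,
    then remember the new change as the delta for next time."""
    weight = weight + new_value + momentum * stored_delta
    return weight, new_value

# 1.0 + 0.2 + 0.9 * 0.1 = 1.29, and the stored delta becomes 0.2.
w, d = update_weight(1.0, 0.1, 0.2)
print(w, d)
```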

Training

Training for the Back Propagation Network is slightly more complicated than it was for the Adaline Network seen earlier. This is because the Back Propagation Network does the weight adjustments to the whole network in one go: it starts at the output node and propagates the training backwards through the middle nodes.

The training loop only has to get four good results, as we are trying to solve the XOR problem here. That means that, of the four input pairs 0 & 0, 0 & 1, 1 & 0, and 1 & 1, only the pairs that contain a single 1 should give an output value of 1.
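Written out, the four training patterns are the XOR truth table (a Python sketch; the library itself stores these as pattern objects):

```python
# XOR truth table: two inputs and the single desired output.
xor_patterns = [
    ((0, 0), 0),
    ((0, 1), 1),
    ((1, 0), 1),
    ((1, 1), 0),
]

# Only the pairs containing exactly one 1 have a desired output of 1.
for (a, b), target in xor_patterns:
    assert target == (1 if a + b == 1 else 0)
```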

The Back Propagation Network figures out the answer to this problem by running through each epoch of the four patterns, or value pairs, indicated above. It then calls the network Run function for each pair. The essential part of the Back Propagation Network Run function is:

The Run function starts at the first middle node and then calls the Adaline node Run function on every node up to Nodes.Count, which includes calling Run on the output node.
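A rough Python sketch of what Run does for each node in that sweep, assuming (as the back propagation nodes do via their overridden Transfer function) a sigmoid transfer; names are illustrative:

```python
import math

def sigmoid(x):
    """Sigmoid transfer used by the back propagation nodes."""
    return 1.0 / (1.0 + math.exp(-x))

def run_node(input_values, weights, bias=0.0):
    """One node's Run: weighted sum of its inputs through the transfer.

    The real code walks the node list from the first middle node to
    the last output node, running each node in turn.
    """
    total = sum(x * w for x, w in zip(input_values, weights)) + bias
    return sigmoid(total)

print(run_node([0.0, 0.0], [0.5, 0.5]))  # sigmoid(0) = 0.5
```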

The output error for the output nodes in the network is then set to the desired output stored in the pattern value, before the Learn function is called on every run through the loop.

The Learn function cycles backwards through the nodes, from the output nodes down to the first middle node, calling Learn on each node. Learn behaves differently for the Back Propagation output node and the Back Propagation middle node, as they have different ways of computing their error values, as shown in the class descriptions above.
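The backward sweep can be sketched with stub nodes that simply record the order in which Learn is called (illustrative Python, not the library's code):

```python
class NodeStub:
    """Stand-in for a network node; records when its Learn is called."""
    def __init__(self, name, log):
        self.name, self.log = name, log

    def learn(self):
        self.log.append(self.name)

def learn_pass(nodes, first_middle_node):
    """Call Learn from the last output node down to the first middle node."""
    for i in range(len(nodes) - 1, first_middle_node - 1, -1):
        nodes[i].learn()

log = []
nodes = [NodeStub(n, log) for n in ["in0", "in1", "mid0", "mid1", "out0"]]
learn_pass(nodes, first_middle_node=2)
print(log)  # ['out0', 'mid1', 'mid0'] - output first, then middle nodes
```

Note that the input nodes are never visited: learning stops at the first middle node, exactly as the article describes.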

Saving And Loading

The Back Propagation network uses the same XML loading and saving techniques used throughout the library. Here is an example of a saved XML file.
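The saved file has roughly the following shape. This is an illustrative sketch only: the element names and layout here are assumptions based on the description below, not the library's exact schema.

```xml
<BackPropagationNetwork>
  <Layers>
    <Layer>2</Layer><!-- input nodes -->
    <Layer>2</Layer><!-- middle nodes -->
    <Layer>1</Layer><!-- output nodes -->
  </Layers>
  <FirstMiddleNode>2</FirstMiddleNode>
  <FirstOutputNode>4</FirstOutputNode>
  <Momentum>0.9</Momentum>
  <!-- nodes follow, each storing the learning rate at value position
       one and the momentum at value position two; links store their
       delta at link value position one -->
</BackPropagationNetwork>
```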

As you can see above, the layers section of the XML file stores each layer with the number of nodes that are to be in that layer of the network. The Back Propagation network also stores the array positions of the first middle node and the first output node, as well as the momentum for the network, which is also stored individually by each node at position two of the node value array. The learning rate is stored at position one in the node value array. The Back Propagation links are also slightly different in that they now store the delta value at position one of the link value array.

Testing

The testing portions of the code are located under the Run menu of the Neural Net Tester program. The test for this program is the "Load And Run Back Propagation 1" menu option. This will load a file that resembles the one above. I say resembles, as the link values won't be exactly the same on any two runs.

The menu option will load and run the backpropagtionworkingfile.wrk file and generate the log file Neural Network Tester Load And Run BackPropagation One Network.xml, which can be viewed using the LogViewer that is part of the Neural Net Tester program.

At the end, the display will show a list of all the input data and the conclusion the back propagation network reached about each item. Next to this will be the answer that was generated with the test data in the pattern. So far in my testing, the function has performed with one hundred percent accuracy.

The quick guide is:

Menu :- Generate/Generate BackPropagation Working File :- Generates the file that is used for the BackPropagation Load And Run menu option.

Menu :- Run/Load And Run BackPropagation 1 :- Loads the saved BackPropagation network from the disk and then runs it against the working file.

Menu :- Train/Train BackPropagation 1 :- Trains the network from scratch using the hard coded XOR data and then saves it to disk.

Menu :- Options/BackPropagation 1 Options :- Brings up a dialog that allows you to set certain parameters for the running of the network.

Options

The above is the options dialog for the Back Propagation One network and contains the five options you can set. The first is the Number of Tests: the number of items that are generated into, and read from, the testing file for the network, which in the case of this network is BackPropagationOneWorkingFile.wrk. The second is the tolerance level that is acceptable to the program. This should always be a value that lets the code distinguish the two answers; if it were set to 0.6, the acceptable ranges for 0 and 1 would overlap, making any answers returned from the network meaningless. The third and fourth are the momentum and the learning rate, which are both used in the calculations that determine the weight values for each link to the nodes. The final option is a simple check box to specify whether you want to use the built-in bias, which always has a value of one in the calculations.

The above shows two runs through the training loop for the Back Propagation Network. Unlike the Adaline network, the Back Propagation network calls Learn automatically on each pass through the loop, so there is no output to say that Learn has been called. The first line of the output shows the value that the network arrived at and the pattern output value that we want the network to arrive at. The second line shows the absolute value of the network's output value minus the pattern's output value. This absolute value is then tested against the tolerance level, which in this case has been set to 0.4: if the absolute value is less than the tolerance value, the test is deemed successful. The final line shows the error values returned by the network and the pattern that was run through the network to begin with.
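The success test just described is a simple comparison, sketched here in Python (the function name is illustrative; 0.4 is the tolerance used in the example run, not a fixed default):

```python
def within_tolerance(network_output, desired_output, tolerance=0.4):
    """A result counts as successful when the absolute difference
    between network output and desired output is below the tolerance."""
    return abs(network_output - desired_output) < tolerance

print(within_tolerance(0.72, 1.0))  # True: |0.72 - 1.0| = 0.28 < 0.4
print(within_tolerance(0.45, 0.0))  # False: 0.45 is not below 0.4
```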

Once the network has successfully trained against the test data, in this case the four acceptable values for the XOR test, the network is saved and reloaded into a new Back Propagation network before the values are again entered into the network to see how it performs. The final four lines above show the output of a run for the XOR test, each indicating which pattern was entered for the test, the expected output value, and the final value that the network arrived at. As you can see, all the values fall within the tolerance levels: all the values where a 0 is expected are less than 0.4, and all the values where a 1 is expected are greater than 0.6.

As with the Adaline network samples, the Back Propagation network generates a file of data to run against the loaded network when load and run is selected. This file is filled with pairs of values that are either 0 or 1, plus a third value, which is the output the network should arrive at.

Once the pattern array has been loaded, the data is run and the results are printed to the screen. The results show the original values entered and the required output, followed by the value that the network arrived at and finally indicates if the value is within the tolerance level of the correct answer.

Fun And Games

As with the other network examples, the main aim is to produce a network that can be repeatedly trained and saved to disk for demonstration purposes. For this reason, the main parameter played with during testing was the tolerance parameter. For the purposes of the example it is set to a value of 0.4, which gives it quite a large margin, something you may not want were you using the network in a production environment. There is the option to change the tolerance parameter within the options dialog, but it should be remembered that the smaller the tolerance gets, the longer the network training will take. Still, this shouldn't be too much of a problem, as once trained, the network can simply be reloaded fully trained.


License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
