
BP Example: XOR Net

Firstly, please read the back-propagation essay. Also, if you want to make use of the programming classes, you must have a good working knowledge of C++. For the non-programmers out there, the programming section is only a small part at the end of the essay - you won't miss anything important! For the programmers - all major programming discussion takes place in another essay.


To demonstrate back-propagation we are going to look at a three-layer, five-neuron (2 input, 2 hidden, 1 output) network, shown in the accompanying diagram. Before we start looking at the calculations, let us get some terminology straight.

The inputs and outputs of the neurons are described as follows. Each layer has a number, starting at 1 for the input layer. The inputs to a layer are indicated by xl-1(n), where l is the layer receiving them (so the values come from layer l-1) and n is the neuron they come from; n = 0 denotes the bias, which is always 1. So, for example, the inputs fed into the network are x1(1) and x1(2) (marked red), and the outputs passed from the hidden layer to the output layer are x2(1) and x2(2) (marked green).

Weights are defined by wl(f,n), where l is the layer, f is the number of the neuron in the previous layer that the connection comes from, and n is the number of the neuron itself. Note that when f = 0, the weight is the bias for the neuron. For example, the weight from the output of the second neuron in the input layer to the input of the first neuron in the hidden layer is w2(2,1) (marked blue).

Weights and Calculations

Firstly, the network is initialized and given random weights. Let's assign these initial weights; they can be anything between -1 and 1.


Hidden Neuron 1: w2(0,1) = 0.341232 w2(1,1) = 0.129952 w2(2,1) =-0.923123
Hidden Neuron 2: w2(0,2) =-0.115223 w2(1,2) = 0.570345 w2(2,2) =-0.328932
Output Neuron: w3(0,1) =-0.993423 w3(1,1) = 0.164732 w3(2,1) = 0.752621
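
These starting values are simply what a random initialization might produce. As a rough sketch (this is not taken from the CBPNet class), generating such a weight in C++ is just a matter of scaling the standard random number generator to the range -1 to 1:

#include <cstdlib>   // for rand() and RAND_MAX
#include <ctime>     // for time(), used to seed the generator

// Returns a random weight in the range [-1, 1].
double randomWeight() {
   return 2.0 * rand() / RAND_MAX - 1.0;
}

// Example: seed once with srand((unsigned)time(0)), then call randomWeight()
// for every weight and bias in the network.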

Since back-propagation training requires thousands of steps, we are obviously not going to go through them all; I will merely look at the first iteration. So, let us look at what would happen during training on the input (0,0). Firstly, the weighted sum has to be calculated, then run through the sigmoid function to limit it.

x1(0) = 1 (bias)
x1(1) = 0 
x1(2) = 0

Neuron 1: (1 * 0.341232) + (0 * 0.129952) + (0 * -0.923123) =  0.341232
Neuron 2: (1 *-0.115223) + (0 * 0.570345) + (0 * -0.328932) = -0.115223
So, we now have the net (weighted sum) values of the two hidden neurons. Now, to run them through the sigmoid function:
x2(1) = 1/(1+e^(-0.341232)) = 0.584490
x2(2) = 1/(1+e^( 0.115223)) = 0.471226
We now have the outputs for the hidden layer. So, let us do the same for the output layer. Using x2(1) and x2(2) as the inputs for the output layer, we can make the following calculations:
x2(0) = 1 (bias)
x2(1) = 0.584490
x2(2) = 0.471226

Net: (1 *-0.993423) + (0.584490 * 0.164732) + (0.471226 * 0.752621) = -0.542484

Therefore, x3(1) = 1/(1+e^(0.542484)) = 0.367610
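
If you want to verify these numbers yourself, the whole forward pass is only a few lines of C++. The following is purely an illustrative sketch (it is not the CBPNet class discussed later) that hard-codes the initial weights listed above:

#include <iostream>
#include <cmath>
using namespace std;

// Standard sigmoid activation, as used in the calculations above.
double sigmoid(double net) {
   return 1.0 / (1.0 + exp(-net));
}

int main() {
   // Initial weights; index 0 holds the bias weight for each neuron.
   double w2[2][3] = { {  0.341232, 0.129952, -0.923123 },    // hidden neuron 1
                       { -0.115223, 0.570345, -0.328932 } };  // hidden neuron 2
   double w3[3]    = {  -0.993423, 0.164732,  0.752621 };     // output neuron

   double x1[3] = { 1.0, 0.0, 0.0 };   // bias, input 1, input 2
   double x2[3] = { 1.0, 0.0, 0.0 };   // x2[0] is the bias fed to the output layer

   // Hidden layer: weighted sum, then sigmoid.
   for (int n = 0; n < 2; n++)
      x2[n+1] = sigmoid(w2[n][0]*x1[0] + w2[n][1]*x1[1] + w2[n][2]*x1[2]);

   // Output layer.
   double x3 = sigmoid(w3[0]*x2[0] + w3[1]*x2[1] + w3[2]*x2[2]);

   cout << "x2(1) = " << x2[1] << "  x2(2) = " << x2[2]
        << "  x3(1) = " << x3 << endl;
   // Prints roughly 0.584490, 0.471226 and 0.367610.
   return 0;
}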
This value (0.367610) is what the network would output. That is only half of the training process, though; we now have to adjust all the weights to bring the result closer to the one we want (0 in this case). So, let's calculate our deltas using the formulas discussed in the BP essay. We will first calculate the delta for the output layer:
d3(1) = x3(1)(1 - x3(1))(d - x3(1))
      = 0.367610 * (1 - 0.367610)(0 - 0.367610)
      =-0.085459
Now that we have that, we can use it to propagate the error backwards:
d2(1) = x2(1)(1 - x2(1))w3(1,1)d3(1)
      = 0.584490 * (1 - 0.584490)*(0.164732)*(-0.085459) = -0.0034190
d2(2) = 0.471226 * (1 - 0.471226)*(0.752621)*(-0.085459) = -0.0160263
That's all the deltas calculated for both layers. Now to actually alter the weights - remember that the learning coefficient h is defined by the user, and I have picked 0.5 to work with. For some of the weights the change will be 0, because the delta is multiplied by the corresponding input, which in our case is 0. Therefore, I am only going to show the calculations for the ones that change:
dw2(0,1) = h*x1(0)*d2(1) = 0.5 * 1 * -0.0034190 = -0.0017095
dw2(1,1) = 0
dw2(2,1) = 0


dw2(0,2) = 0.5 * 1 * -0.0160263 = -0.0080132
dw2(1,2) = 0
dw2(2,2) = 0

dw3(0,1) = 0.5 * 1 * -0.085459 = -0.042730
dw3(1,1) = 0.5 * 0.584490 * -0.085459 = -0.024975
dw3(2,1) = 0.5 * 0.471226 * -0.085459 = -0.020135
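
This backward pass is just as easy to check in code. The sketch below (again purely illustrative, not the CBPNet class) hard-codes the forward-pass results from above and prints the deltas and all nine weight changes:

#include <iostream>
using namespace std;

int main() {
   const double h = 0.5;   // learning coefficient
   const double d = 0.0;   // desired output for the training pair (0,0)

   // Values carried over from the forward pass above.
   double x1[3] = { 1.0, 0.0, 0.0 };             // bias and the two inputs
   double x2[3] = { 1.0, 0.584490, 0.471226 };   // bias and the hidden outputs
   double x3    = 0.367610;                      // network output
   double w3[3] = { -0.993423, 0.164732, 0.752621 };

   // Output-layer delta: d3(1) = x3(1)(1 - x3(1))(d - x3(1)).
   double d3 = x3 * (1.0 - x3) * (d - x3);

   // Hidden-layer deltas: d2(n) = x2(n)(1 - x2(n)) * w3(n,1) * d3(1).
   double d2[2];
   d2[0] = x2[1] * (1.0 - x2[1]) * w3[1] * d3;
   d2[1] = x2[2] * (1.0 - x2[2]) * w3[2] * d3;

   cout << "d3(1) = " << d3 << "  d2(1) = " << d2[0]
        << "  d2(2) = " << d2[1] << endl;

   // Weight changes: dw = h * input * delta.
   for (int f = 0; f < 3; f++) {
      cout << "dw2(" << f << ",1) = " << h * x1[f] * d2[0]
           << "  dw2(" << f << ",2) = " << h * x1[f] * d2[1]
           << "  dw3(" << f << ",1) = " << h * x2[f] * d3 << endl;
   }
   return 0;
}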
So, these are the weight changes. You would add them to their respective weights, then run the entire process again on the next set of training data. Slowly, as the training data is fed in and the network is retrained a few thousand times, the network could balance out to values such as these:


Hidden Neuron 1: w2(0,1) =-6.062263 w2(1,1) =-6.072185 w2(2,1) = 2.454509
Hidden Neuron 2: w2(0,2) =-4.893081 w2(1,2) =-4.894898 w2(2,2) = 7.293063
Output Neuron: w3(0,1) =-9.792470 w3(1,1) = 9.484580 w3(2,1) =-4.473972

With these weights, you would get the following results for XOR:

0 XOR 0 = 0.017622
0 XOR 1 = 0.981504
1 XOR 0 = 0.981491
1 XOR 1 = 0.022782
With a small amount of rounding, this is the correct truth table. Now, for a brief look at the C++ class.

C++ Class Code

The C++ class for this is very simple. There are only two functions you really care about, Train() and Run(). Train() takes three floating-point values - the two inputs and the expected output - and returns the output of the net. Run() only takes the two inputs and returns the output. Therefore, to apply the network to the above example, your main() should look like:
#include <iostream>
#include "bpnet.h"   // header declaring the CBPNet class (name assumed; see the class code essay)
using namespace std;

// Number of training iterations; the exact value isn't critical, a few thousand will do.
#define BPM_ITER 5000

int main() {
   CBPNet bp;

   // Train on all four XOR patterns each iteration.
   for (int i=0;i<BPM_ITER;i++) {
      bp.Train(0,0,0);
      bp.Train(0,1,1);
      bp.Train(1,0,1);
      bp.Train(1,1,0);
   }

   cout << "0,0 = " << bp.Run(0,0) << endl;
   cout << "0,1 = " << bp.Run(0,1) << endl;
   cout << "1,0 = " << bp.Run(1,0) << endl;
   cout << "1,1 = " << bp.Run(1,1) << endl;

   return 0;
}
BPM_ITER is defined as the number of iterations the network is to run for. Here is some sample output from the program:
C:\Program Files\DevStudio\MyProjects\BPNet\Release>bpnet.exe
0,0 = 0.0494681
0,1 = 0.955633
1,0 = 0.942529
1,1 = 0.0433488
To look into the class code, please see the CBPNet essay. You can download the code from here.

Submitted: 03/04/2001

Article content copyright © James Matthews, 2001.