Neural Networks – A perceptron in Matlab

Neural networks can be used to determine relationships and patterns between inputs and outputs. A simple single-layer feed-forward neural network that has the ability to learn and differentiate data sets is known as a perceptron.

Single layer feed forward perceptron

By iteratively “learning” the weights, it is possible for the perceptron to find a solution to linearly separable data (data that can be separated by a hyperplane). In this example, we will run a simple perceptron to determine the solution to a 2-input OR.

X1 or X2 can be defined as follows:

X1   X2   Out
0    0    0
1    0    1
0    1    1
1    1    1
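As a quick check, the table can be reproduced with MATLAB's built-in logical OR operator. This is only a small sketch; the variable names X, or_out and truth_table are made up for illustration and are not used elsewhere in this post.

```matlab
% Build the 2-input OR truth table with the element-wise | operator.
X = [0 0; 1 0; 0 1; 1 1];            % same row order as the table above
or_out = double(X(:,1) | X(:,2));    % logical OR of the two columns, cast to double
truth_table = [X or_out];            % columns: X1, X2, Out
disp(truth_table)
```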

If you want to verify this yourself, run the following code in Matlab. The code can be modified further to fit your own needs. We first initialize our variables of interest, including the input, desired output, bias, learning coefficient and weights.

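input = [0 0; 0 1; 1 0; 1 1];   % one training pattern per row: [X1 X2]
numIn = 4;                      % number of training patterns
desired_out = [0;1;1;1];        % OR target for each row of input
bias = -1;                      % constant bias input
coeff = 0.7;                    % learning rate
rng('shuffle');                 % seed from the clock; replaces the legacy rand('state',sum(100*clock))
weights = -1*2.*rand(3,1);      % random initial weights, uniform on (-2,0)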

The input and desired_out are self-explanatory. The bias is initialized to a constant, which can be set to any non-zero number between -1 and 1. The coeff represents the learning rate, which specifies how large an adjustment is made to the network weights at each update. As the coefficient approaches 0, the weight adjustments become smaller and more conservative; values closer to 1 produce larger adjustments. Finally, the weights are randomly assigned.
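To get a feel for the learning rate, the sketch below applies a single delta-rule update to one training pattern using two different coefficients and prints how far the weights move. The starting weights w and the pattern x are arbitrary values chosen for illustration, not values from the run described in this post.

```matlab
% Compare the size of one weight update under two learning rates.
x = [-1 1 0];              % [bias input, X1, X2] for the pattern X1=1, X2=0
w = [0.5; -0.3; 0.2];      % hypothetical starting weights
target = 1;                % desired OR output for this pattern
coeffs = [0.1 0.7];        % learning rates to compare
steps = zeros(size(coeffs));

for k = 1:numel(coeffs)
    out = 1 / (1 + exp(-x * w));            % sigmoid of the weighted sum
    delta = target - out;                   % error on this pattern
    w_new = w + coeffs(k) * delta * x';     % one delta-rule update
    steps(k) = norm(w_new - w);             % how far the weights moved
    fprintf('coeff = %.1f, step size = %.4f\n', coeffs(k), steps(k));
end
```

Because the update is proportional to the coefficient, the step with 0.7 is seven times the step with 0.1.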

A perceptron is defined by the equation:

out = w1*x1 + w2*x2 + ... + wn*xn + b

Therefore, in our example with two inputs, we have w1*x1+w2*x2+b = out.
We will assume that weights(1,1) is for the bias and weights(2:3,1) are for X1 and X2, respectively, so that b = bias*weights(1,1).
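As a worked example of this weighted sum, the sketch below evaluates it for all four input rows at once. The weights w are hand-picked for illustration (they are not learned), and comparing the sum against 0 is an assumption made here just to show that such a line separates the OR patterns.

```matlab
% Evaluate w1*x1 + w2*x2 + b for every row of the OR inputs.
input = [0 0; 0 1; 1 0; 1 1];
bias = -1;
w = [0.5; 1; 1];                    % hypothetical weights: [bias weight; w1; w2]
X = [bias * ones(4,1), input];      % prepend the constant bias input to each row
y = X * w;                          % weighted sums: -0.5, 0.5, 0.5, 1.5
above_line = double(y > 0);         % 1 where the pattern lies on the positive side
disp([input y above_line])
```

With these weights only the (0,0) row gives a negative sum, which matches the OR table.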

One more variable we will set is iterations, which specifies how many passes to make over the training data while modifying the weights.

iterations = 10;

Now the feed forward perceptron code.

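for i = 1:iterations
     out = zeros(numIn,1);   % outputs recorded during this pass
     for j = 1:numIn
          % weighted sum of the bias input and the two inputs
          y = bias*weights(1,1)+...
               input(j,1)*weights(2,1)+input(j,2)*weights(3,1);
          out(j) = 1/(1+exp(-y));           % sigmoid activation
          delta = desired_out(j)-out(j);    % error for this pattern
          % delta rule: adjust each weight in proportion to its input and the error
          weights(1,1) = weights(1,1)+coeff*bias*delta;
          weights(2,1) = weights(2,1)+coeff*input(j,1)*delta;
          weights(3,1) = weights(3,1)+coeff*input(j,2)*delta;
     end
end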

A little explanation of the code: first, the weighted sum y is computed as described above and then passed through a sigmoid function, which squashes the value into the range (0, 1) to give out. The weights are then modified iteratively based on the delta rule.
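The squashing behaviour of the sigmoid is easy to see by evaluating it on a few sample values. This is a small sketch; the anonymous function name sigmoid and the sample points are chosen here for illustration.

```matlab
% Evaluate the sigmoid on a few weighted sums to see the squashing effect.
sigmoid = @(y) 1 ./ (1 + exp(-y));   % same formula as in the training loop
y = [-10 -1 0 1 10];                 % sample weighted sums
s = sigmoid(y);                      % every value lies strictly between 0 and 1
disp([y; s])
```

Large negative sums map close to 0, large positive sums map close to 1, and a sum of 0 maps to exactly 0.5.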

When running the perceptron over 10 iterations, the outputs begin to converge, but are still not precisely as expected:

out =
  0.3756
  0.8596
  0.9244
  0.9952
weights =
  0.6166
  3.2359
  2.7409

As the iterations approach 1000, the output converges towards the desired output.

out =
  0.0043
  0.9984
  0.9987
  1.0000
weights =
  5.4423
  12.1084
  11.8823
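To compare the two iteration counts in a repeatable way, the sketch below runs the same update rule for 10 and for 1000 iterations. It starts from all-zero weights instead of random ones, which is an assumption made here so that every run gives the same numbers; the values will therefore differ from the outputs printed above.

```matlab
% Train the OR perceptron for 10 and 1000 iterations from a fixed start.
input = [0 0; 0 1; 1 0; 1 1];
desired_out = [0; 1; 1; 1];
bias = -1;
coeff = 0.7;
X = [bias * ones(4,1), input];       % bias input followed by X1 and X2
counts = [10 1000];
max_err = zeros(size(counts));

for k = 1:numel(counts)
    w = zeros(3,1);                  % fixed starting weights (assumption)
    for i = 1:counts(k)
        out = zeros(4,1);
        for j = 1:4
            out(j) = 1 / (1 + exp(-X(j,:) * w));                 % sigmoid output
            w = w + coeff * (desired_out(j) - out(j)) * X(j,:)'; % delta rule
        end
    end
    max_err(k) = max(abs(desired_out - out));
    fprintf('%4d iterations: max error = %.4f\n', counts(k), max_err(k));
end
predicted = round(out);              % threshold the last outputs at 0.5
```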

As the OR logic condition is linearly separable, a solution will be reached after a finite number of loops. Convergence time can also change based on the initial weights, the learning rate, the transfer function (sigmoid, linear, etc.) and the learning rule (in this case the delta rule is used, but other algorithms such as Levenberg-Marquardt also exist). If you are interested, try running the same code for other logical conditions like ‘AND’ or ‘NAND’ to see what you get.
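One way to try AND and NAND is to swap in a different target column and rerun the same update rule. The sketch below does this for both; the names targets, names and results, the all-zero starting weights and the choice of 1000 iterations are assumptions made for illustration.

```matlab
% Train the same single-layer perceptron on AND and NAND targets.
input = [0 0; 0 1; 1 0; 1 1];
targets = [0 0 0 1; 1 1 1 0]';       % column 1: AND, column 2: NAND
names = {'AND', 'NAND'};
bias = -1;
coeff = 0.7;
iterations = 1000;
X = [bias * ones(4,1), input];
results = zeros(4, 2);

for t = 1:2
    w = zeros(3,1);                  % fixed starting weights (assumption)
    for i = 1:iterations
        for j = 1:4
            out_j = 1 / (1 + exp(-X(j,:) * w));
            w = w + coeff * (targets(j,t) - out_j) * X(j,:)';
        end
    end
    results(:,t) = 1 ./ (1 + exp(-X * w));   % outputs with the final weights
    fprintf('%s: %s\n', names{t}, mat2str(results(:,t)', 3));
end
```

Both conditions are linearly separable, so rounding the results should give back the target columns.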

While single-layer perceptrons like this can solve simple linearly separable data, they are not suitable for non-separable data, such as the XOR. In order to learn such a data set, you will need to use a network with at least one hidden layer, such as a multi-layer perceptron.
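The XOR limitation can be seen by running the same training loop on XOR targets. This is a sketch under the same assumptions as before (all-zero starting weights, 1000 iterations); the names xor_out, final_out and num_wrong are made up for illustration. Because no single line separates the XOR patterns, at least one pattern always ends up on the wrong side of 0.5.

```matlab
% Train the single-layer perceptron on XOR and count misclassified patterns.
input = [0 0; 0 1; 1 0; 1 1];
xor_out = [0; 1; 1; 0];
bias = -1;
coeff = 0.7;
iterations = 1000;
X = [bias * ones(4,1), input];
w = zeros(3,1);                      % fixed starting weights (assumption)

for i = 1:iterations
    for j = 1:4
        out_j = 1 / (1 + exp(-X(j,:) * w));
        w = w + coeff * (xor_out(j) - out_j) * X(j,:)';
    end
end

final_out = 1 ./ (1 + exp(-X * w));              % outputs with the final weights
num_wrong = sum(round(final_out) ~= xor_out);    % patterns on the wrong side of 0.5
fprintf('XOR outputs: %s, misclassified: %d of 4\n', mat2str(final_out', 3), num_wrong);
```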
