Perceptron: Learning Algorithm

This notes page mainly provides one worked-through example of the perceptron learning algorithm. You will get more practice with this algorithm, and the opportunity to ask questions, in your tutorials.

Let’s demonstrate how to update the weights of the ferromagnet-detecting perceptron from the previous page.

Step One

Initialize all the weights randomly.

Recall that our perceptron has 5 inputs, one for each reference object. Using the threshold trick (see below), that means we need to initialize 6 weights.

\[ W = [0.25, 0.2, -0.8, 0.4, -0.5, 0.75] \]

The threshold trick turns the threshold into an additional weight with a fixed input of -1.

\[ \sum_{i=0}^n w_i x_i \geq \theta \] \[ \sum_{i=0}^n w_i x_i - \theta \geq 0 \] \[ w_0 x_0 + w_1 x_1 + ... + w_n x_n - \theta \geq 0 \] \[ w_0 x_0 + w_1 x_1 + ... + w_n x_n + \theta(-1) \geq 0 \]
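The trick above can be sketched in code: instead of comparing the weighted sum against the threshold θ, we append θ as one extra weight paired with a fixed input of -1 and compare against 0. This is a minimal sketch (the function names are ours); both formulations always agree.

```python
def fires_with_threshold(weights, inputs, theta):
    # Original formulation: fire when the weighted sum reaches the threshold.
    return sum(w * x for w, x in zip(weights, inputs)) >= theta

def fires_with_bias_weight(weights, inputs, theta):
    # Threshold trick: theta becomes one more weight with a fixed input of -1,
    # so the comparison is simply against 0.
    return sum(w * x for w, x in zip(weights + [theta], inputs + [-1])) >= 0

# The two formulations agree on any example:
w = [0.25, 0.2, -0.8, 0.4, -0.5]
x = [2.0, 2.3, 1.9, 2.0, 1.9]
theta = 0.75
print(fires_with_threshold(w, x, theta) == fires_with_bias_weight(w, x, theta))  # True
```

The payoff is that the threshold is now learned by the same update rule as every other weight, rather than needing a special case.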

Step Two

For each training example, apply the learning rule:

\[ w_i \leftarrow w_i + \Delta w_i \] \[ \Delta w_i = \eta (t-o) x_i \]

Let’s assume our perceptron is a slow learner and set the learning rate

\[ \eta=0.1 \]
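As code, the rule is one line per weight (a sketch; the function name is ours). Notice that when the prediction is already correct (t = o), every Δwᵢ is zero and the weights are left alone:

```python
def update_weights(weights, inputs, target, output, eta=0.1):
    # w_i <- w_i + eta * (t - o) * x_i, applied to every weight.
    return [w + eta * (target - output) * x for w, x in zip(weights, inputs)]

# A correct prediction (t == o) changes nothing:
print(update_weights([0.5], [2.0], 1, 1))  # [0.5]
```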

Let’s calculate the update for the following input, which is labelled as ferromagnetic (t = 1)!

\[ X = [2.0, 2.3, 1.9, 2.0, 1.9] \]

Substep 1: Calculate the current perceptron output o

\[ \sum w_i x_i = (0.25 \cdot 2) + (0.2 \cdot 2.3) + (-0.8 \cdot 1.9) + (0.4 \cdot 2) + (-0.5 \cdot 1.9) + (0.75 \cdot -1) \] \[ \sum w_i x_i = -1.46 \]

The perceptron’s activation is less than 0, so it does not fire: o = 0.
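Substep 1 can be checked with a few lines of code (assuming the weights and inputs above, with the fixed threshold input -1 appended to X):

```python
W = [0.25, 0.2, -0.8, 0.4, -0.5, 0.75]
X = [2.0, 2.3, 1.9, 2.0, 1.9, -1]  # -1 is the fixed threshold input

# Weighted sum of inputs, then the step activation.
activation = sum(w * x for w, x in zip(W, X))
o = 1 if activation >= 0 else 0
print(round(activation, 2), o)  # -1.46 0
```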

Substep 2: For each weight calculate the update (Delta W)

X     W      Delta W
2.0   0.25   0.1 * (1 - 0) * 2.0 = 0.20
2.3   0.2
1.9   -0.8
2.0   0.4
1.9   -0.5
-1.0  0.75

We’ve done the first weight for you. Try the rest yourself!

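If you would like to verify your Δw values programmatically, here is a sketch that applies the learning rule to every input at once (η = 0.1, t = 1, o = 0, with the fixed threshold input included):

```python
eta, t, o = 0.1, 1, 0
X = [2.0, 2.3, 1.9, 2.0, 1.9, -1]  # includes the fixed threshold input

# Delta w_i = eta * (t - o) * x_i for every input.
deltas = [eta * (t - o) * x for x in X]
print([round(d, 2) for d in deltas])  # [0.2, 0.23, 0.19, 0.2, 0.19, -0.1]
```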

Substep 3: Now calculate the new weights

X     W      Delta W   W'
2.0   0.25   0.20      0.25 + 0.20 = 0.45
2.3   0.2
1.9   -0.8
2.0   0.4
1.9   -0.5
-1.0  0.75
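Once you have all six Δw values, the new weights W' = W + ΔW can be checked the same way (the Δw list below is computed from the learning rule in Substep 2):

```python
W = [0.25, 0.2, -0.8, 0.4, -0.5, 0.75]
deltas = [0.2, 0.23, 0.19, 0.2, 0.19, -0.1]  # Delta W from Substep 2

# W' = W + Delta W, element by element.
W_new = [w + d for w, d in zip(W, deltas)]
print([round(w, 2) for w in W_new])  # [0.45, 0.43, -0.61, 0.6, -0.31, 0.65]
```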

Step Three

Now we have to check how well the perceptron has learned.

For each training example, calculate the current output and count the errors.

If you think there are too many errors, repeat Step Two.

Normally, before training begins, we set a threshold for how many errors are acceptable, and a maximum number of update loops after which the algorithm should just stop. Why do you think we do that?
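Putting Steps Two and Three together, the whole procedure with both stopping criteria can be sketched like this (a minimal sketch; all names are ours, and the single training example is the one worked above):

```python
def predict(weights, inputs):
    # Step activation: fire (output 1) when the weighted sum reaches 0.
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= 0 else 0

def train(weights, examples, eta=0.1, max_errors=0, max_epochs=100):
    # examples: (inputs, target) pairs; each input vector ends with the fixed -1.
    for _ in range(max_epochs):  # cap on the total number of update loops
        # Step Two: apply the learning rule to every training example.
        for x, t in examples:
            o = predict(weights, x)
            weights = [w + eta * (t - o) * xi for w, xi in zip(weights, x)]
        # Step Three: count the errors; stop once few enough remain.
        errors = sum(1 for x, t in examples if predict(weights, x) != t)
        if errors <= max_errors:
            break
    return weights

W = [0.25, 0.2, -0.8, 0.4, -0.5, 0.75]
examples = [([2.0, 2.3, 1.9, 2.0, 1.9, -1], 1)]  # the worked example above
W_trained = train(W, examples)
print(predict(W_trained, examples[0][0]))  # 1
```

With this single example, one pass of Step Two is enough: the updated weights push the activation above 0, the error count drops to zero, and the loop stops.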