back propagation algorithm example, Summaries of Pattern Classification and Recognition


Typology: Summaries

2019/2020

Uploaded on 08/08/2021

ahmed-mohammed-16

3 Back Propagation (BP) Algorithm

One of the most popular NN algorithms is the back propagation algorithm. Rojas [2005] claimed that the BP algorithm can be broken down into four main steps. After the weights of the network are chosen randomly, the back propagation algorithm is used to compute the necessary corrections. The algorithm can be decomposed into the following four steps:

i) Feed-forward computation
ii) Back propagation to the output layer
iii) Back propagation to the hidden layer
iv) Weight updates

The algorithm is stopped when the value of the error function has become sufficiently small. This is a very rough and basic outline of the BP algorithm. Other researchers have proposed variations, but Rojas's definition is quite accurate and easy to follow. The last step, weight updates, happens throughout the algorithm. The BP algorithm will be explained using the exercise example from figure 4.

3.1 Worked example

The NN in figure 4 has two nodes (N0,0 and N0,1) in the input layer, two nodes in the hidden layer (N1,0 and N1,1) and one node in the output layer (N2,0). Input layer nodes are connected to hidden layer nodes with weights W0,0 to W0,3. Hidden layer nodes are connected to the output layer node with weights W1,0 and W1,1. The weight values were chosen randomly and will change during the BP iterations. A table of input node values and desired outputs, together with the learning rate and momentum, is also given in figure 4, along with the sigmoid function formula f(x) = 1.0 / (1.0 + exp(-x)). Only the calculations for example set 1 are shown for this simple network (input values of 1 and 1 with an output value of 1). In NN training all example sets are calculated, but the logic behind the calculation is the same.

3.1.1 Feed-forward computation

Feed-forward computation, or the forward pass, is a two-step process.
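As a rough illustration of those two steps, here is a minimal Python sketch for the 2-2-1 network of the example. The function names (`sigmoid`, `hidden_layer`, `output_node`) are hypothetical. The input-to-hidden weights 0.4, 0.1, -0.1 and -0.1 are assumed to be the starting values, since they appear as the old weights in the update equations of this example. The starting hidden-to-output weights are not given in this excerpt, so the output step is only defined here, not evaluated.

```python
import math

def sigmoid(x):
    """Activation from the example: f(x) = 1.0 / (1.0 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def hidden_layer(inputs, weights):
    """Step 1: weighted sum of the inputs for each hidden node, then sigmoid.

    weights[j] holds the weights feeding hidden node j, one per input.
    """
    return [sigmoid(sum(w * x for w, x in zip(node_w, inputs)))
            for node_w in weights]

def output_node(hidden, weights):
    """Step 2: weighted sum of the hidden values, then sigmoid."""
    return sigmoid(sum(w * h for w, h in zip(weights, hidden)))

# Example set 1: both inputs are 1.
inputs = [1.0, 1.0]
# Assumed starting weights: W0,0 and W0,1 feed N1,0; W0,2 and W0,3 feed N1,1.
w_input_hidden = [[0.4, 0.1], [-0.1, -0.1]]

n1_0, n1_1 = hidden_layer(inputs, w_input_hidden)
print(round(n1_0, 6), round(n1_1, 6))  # 0.622459 0.450166
```

Passing the two hidden values and the pair of hidden-to-output weights to `output_node` would complete the forward pass.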
The first part is getting the values of the hidden layer nodes, and the second part is using those values to compute the value or values of the output layer. The input values of nodes N0,0 and N0,1 are pushed up the network towards the nodes in the hidden layer (N1,0 and N1,1). They are multiplied by the weights of the connecting nodes, and the values of the hidden layer nodes are [...]

Pattern data for AND

n0,0   n0,1   Output n2,0
1      1      1
1      0      0
0      1      0
0      0      0

β = Learning rate = 0.45
α = Momentum term = 0.9
f(x) = 1.0 / (1.0 + exp(-x))

Figure 4: Example

[...] for N1,1 node will be found.

N1,0Error = N2,0Error * W1,0new = 0.133225 * 0.097317 = 0.012965
N1,1Error = N2,0Error * W1,1new = 0.133225 * (-0.373012) = -0.049695

Once the error for the hidden layer nodes is known, the weights between the input and hidden layers can be updated. The rate of change first needs to be calculated for every weight:

ΔW0,0 = β * N1,0Error * n0,0 = 0.45 * 0.012965 * 1 = 0.005834
ΔW0,1 = β * N1,0Error * n0,1 = 0.45 * 0.012965 * 1 = 0.005834
ΔW0,2 = β * N1,1Error * n0,0 = 0.45 * (-0.049695) * 1 = -0.022363
ΔW0,3 = β * N1,1Error * n0,1 = 0.45 * (-0.049695) * 1 = -0.022363

Then we calculate the new weights between the input and hidden layers. The term α * Δ(t - 1) is the momentum applied to the previous weight change, which is 0 in this first iteration:

W0,0new = W0,0old + ΔW0,0 + (α * Δ(t - 1)) = 0.4 + 0.005834 + 0.9 * 0 = 0.405834
W0,1new = W0,1old + ΔW0,1 + (α * Δ(t - 1)) = 0.1 + 0.005834 + 0.9 * 0 = 0.105834
W0,2new = W0,2old + ΔW0,2 + (α * Δ(t - 1)) = -0.1 + (-0.022363) + 0.9 * 0 = -0.122363
W0,3new = W0,3old + ΔW0,3 + (α * Δ(t - 1)) = -0.1 + (-0.022363) + 0.9 * 0 = -0.122363

3.1.4 Weight updates

It is important not to update any weights until all errors have been calculated. This is easy to forget, and if new weights were used while calculating errors, the results would not be valid. The quick second pass that follows uses the new weights to see whether the error has decreased.
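As a check on the input-to-hidden update above, before moving on to the second pass, here is a minimal Python sketch. It follows the worked example in computing the hidden-node errors from the hidden-to-output weights labelled "new". The helper name `new_weight` and the dictionary layout are illustrative assumptions, not part of the original exercise. The results should match the hand calculation up to small rounding differences in the last digits.

```python
BETA = 0.45   # learning rate from figure 4
ALPHA = 0.9   # momentum term from figure 4

def new_weight(old, error, node_input, prev_delta=0.0):
    """Return (updated weight, delta) for one connection.

    delta = BETA * error * input. The momentum term ALPHA * prev_delta is
    zero in the first iteration because there is no earlier delta.
    """
    delta = BETA * error * node_input
    return old + delta + ALPHA * prev_delta, delta

# Values taken from the worked example (example set 1).
n2_0_error = 0.133225
w1_0_new, w1_1_new = 0.097317, -0.373012
inputs = [1.0, 1.0]

# Hidden-node errors, computed the way the example does it.
n1_0_error = n2_0_error * w1_0_new
n1_1_error = n2_0_error * w1_1_new

# All errors are known at this point; only now are the weights updated.
old = {"W0,0": 0.4, "W0,1": 0.1, "W0,2": -0.1, "W0,3": -0.1}
errors = {"W0,0": n1_0_error, "W0,1": n1_0_error,
          "W0,2": n1_1_error, "W0,3": n1_1_error}
sources = {"W0,0": inputs[0], "W0,1": inputs[1],
           "W0,2": inputs[0], "W0,3": inputs[1]}

updated = {name: new_weight(old[name], errors[name], sources[name])[0]
           for name in old}
for name, value in updated.items():
    print(name, round(value, 6))
```

In later iterations the delta returned by `new_weight` would be stored and passed back in as `prev_delta`, which is where the momentum term starts to matter.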
N1,0 = f(x1) = f(w0,0 * n0,0 + w0,1 * n0,1) = f(0.406 + 0.106) = f(0.512) = 0.625275
N1,1 = f(x2) = f(w0,2 * n0,0 + w0,3 * n0,1) = f(-0.122 - 0.122) = f(-0.244) = 0.439301
N2,0 = f(x3) = f(w1,0 * n1,0 + w1,1 * n1,1) = f(0.097 * 0.625275 + (-0.373) * 0.439301) = f(-0.103208) = 0.474221

Having calculated N2,0, the forward pass is complete. The next step is to calculate the error of the N2,0 node. From the table in figure 4, the output should be 1. The predicted value (N2,0) in this second pass is 0.474221. The error is calculated as follows:

N2,0Error = n2,0 * (1 - n2,0) * (N2,0Desired - n2,0) = 0.474221 * (1 - 0.474221) * (1 - 0.474221) = 0.131095

So after the initial iteration the calculated error was 0.133225, and the new calculated error is 0.131095. The algorithm has improved, not by much, but this should give a good idea of how the BP algorithm works. Although this was a very simple example, it should help in understanding the basic operation of the BP algorithm.
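The second pass can also be reproduced in a few lines of Python. This is a minimal sketch that assumes the updated weights rounded to three decimals (0.406, 0.106, -0.122, -0.122, 0.097, -0.373); a different rounding shifts the last digits of the result slightly. The variable and function names are hypothetical.

```python
import math

def sigmoid(x):
    """Activation from the example: f(x) = 1.0 / (1.0 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def output_error(output, desired):
    """Output-node error term used in the example: o * (1 - o) * (d - o)."""
    return output * (1.0 - output) * (desired - output)

# Example set 1 from the AND pattern table: inputs 1 and 1, desired output 1.
n0_0, n0_1, desired = 1.0, 1.0, 1.0

# Updated weights, rounded to three decimals (an assumption of this sketch).
w0_0, w0_1, w0_2, w0_3 = 0.406, 0.106, -0.122, -0.122
w1_0, w1_1 = 0.097, -0.373

# Forward pass: hidden layer first, then the output node.
n1_0 = sigmoid(w0_0 * n0_0 + w0_1 * n0_1)
n1_1 = sigmoid(w0_2 * n0_0 + w0_3 * n0_1)
n2_0 = sigmoid(w1_0 * n1_0 + w1_1 * n1_1)

first_error = 0.133225  # error reported for the first pass
second_error = output_error(n2_0, desired)

print(f"N2,0 = {n2_0:.6f}, error = {second_error:.6f}")
print("error decreased:", second_error < first_error)
```

Repeating the forward pass, error calculation and weight update over all four patterns, until the error is sufficiently small, gives the training loop outlined in the four steps at the start of the section.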