
Tuesday, 19 December 2017

How it Works: Back Propagation in Neural Network with Mathematical Example

Backpropagation is the algorithm used to compute the error of a neural network and to adjust its weights so that the error shrinks and the output approaches the target. Let us work through the following example to see how it is done.


GIVEN VALUES:

At input layer:
Inputs: i1=0.05, i2=0.10
Weights: w1=0.15, w2=0.20, w3=0.25, w4=0.30

At hidden layer:
Bias: b1=0.35

w5= 0.40, w6= 0.45, w7= 0.50, w8= 0.55

At output layer:
Bias: b2=0.60

Target output: o1= 0.01, o2= 0.99

To find:
  The updated weights that reduce the error and move the outputs toward the targets, using the backpropagation algorithm.
Solution:
To find the hidden layer output:
net h1 = {(w1*i1) + (w2*i2)} + (b1*1)
           = {(0.15*0.05) + (0.20*0.10)} + (0.35*1)
net h1 = 0.3775
net h2 = {(w3*i1) + (w4*i2)} + (b1*1)
           = {(0.25*0.05) + (0.30*0.10)} + (0.35*1)
net h2 = 0.3925

Here the sigmoid function is used as the activation function in the hidden layer; it converts the linear net input into a nonlinear output.

out h1 = 1/(1+e^(-net h1))
           = 1/(1+e^(-0.3775))
out h1 = 0.593269992
out h2 = 1/(1+e^(-net h2))
           = 1/(1+e^(-0.3925))
out h2 = 0.5968843782
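The sigmoid computation above can be sketched in Python (a minimal illustration; the variable names are mine, not part of the original example):

```python
import math

def sigmoid(x):
    # Logistic activation: squashes any real net input into (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

# Hidden-layer activations from the net inputs computed above
out_h1 = sigmoid(0.3775)   # ≈ 0.593269992
out_h2 = sigmoid(0.3925)   # ≈ 0.596884378
```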

To find the output at output layer:

net o1 = {(w5*out h1)+(w6*out h2)}+(b2*1)
   = {(0.40*0.593269992) + (0.45*0.5968843782)} + (0.60*1)
net o1 = 1.105905967
net o2 = {(w7*out h1)+(w8*out h2)}+(b2*1)
           = {(0.50*0.593269992) + (0.55*0.5968843782)} + (0.60*1)
net o2 = 1.224921404

Again, the sigmoid function is used at the output layer:
out o1 = 1/(1+e^(-net o1))
           = 1/(1+e^(-1.105905967))
out o1 = 0.7513650695

out o2 = 1/(1+e^(-net o2))
           = 1/(1+e^(-1.224921404))
out o2 = 0.7729284653
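The whole forward pass so far can be reproduced with a few lines of Python (a sketch for checking the arithmetic; names such as net_h1 are my own):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Given values from the example
i1, i2 = 0.05, 0.10
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55
b1, b2 = 0.35, 0.60

# Hidden layer
net_h1 = w1 * i1 + w2 * i2 + b1          # 0.3775
net_h2 = w3 * i1 + w4 * i2 + b1          # 0.3925
out_h1, out_h2 = sigmoid(net_h1), sigmoid(net_h2)

# Output layer
net_o1 = w5 * out_h1 + w6 * out_h2 + b2  # ≈ 1.105905967
net_o2 = w7 * out_h1 + w8 * out_h2 + b2  # ≈ 1.224921404
out_o1, out_o2 = sigmoid(net_o1), sigmoid(net_o2)
# out_o1 ≈ 0.751365070, out_o2 ≈ 0.772928465
```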

To find the total error at the output layer:

E total = Σ 1/2(target - out)^2
E o1 = 1/2(target o1 - out o1)^2
     = 1/2(0.01 - 0.7513650695)^2
E o1 = 0.274811083
E o2 = 1/2(target o2 - out o2)^2
     = 1/2(0.99 - 0.7729284653)^2
E o2 = 0.0235600256

E total = E o1 + E o2
        = 0.274811083 + 0.0235600256
E total = 0.298371109
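As a quick check, the squared-error totals can be computed directly (a sketch; the variable names are my own):

```python
target_o1, target_o2 = 0.01, 0.99
out_o1, out_o2 = 0.7513650695, 0.7729284653  # forward-pass outputs from above

E_o1 = 0.5 * (target_o1 - out_o1) ** 2   # ≈ 0.274811083
E_o2 = 0.5 * (target_o2 - out_o2) ** 2   # ≈ 0.023560026
E_total = E_o1 + E_o2                    # ≈ 0.298371109
```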

To find the error at the hidden layer, each output error is propagated back in proportion to the connecting weights:
E out h1 = (E o1*w5) + (E o2*w6)
         = (0.274811083*0.40) + (0.0235600256*0.45)
E out h1 = 0.1205264447
E out h2 = (E o1*w7) + (E o2*w8)
         = (0.274811083*0.50) + (0.0235600256*0.55)
E out h2 = 0.1503635556

According to the chain rule, we calculate the updated weights w5, w6, w7 and w8 (η = 0.5 is the learning rate).

To calculate w5+:
∂E total/∂w5 = (∂E total/∂out o1)*(∂out o1/∂net o1)*(∂net o1/∂w5)

E total = 1/2(target o1 - out o1)^2 + 1/2(target o2 - out o2)^2
∂E total/∂out o1 = 2*1/2(target o1 - out o1)^(2-1)*(-1) + 0
                 = -(target o1 - out o1)
                 = -0.01 + 0.7513650695
∂E total/∂out o1 = 0.7413650695

∂out o1/∂net o1 = out o1(1 - out o1)
                = 0.7513650695(1 - 0.7513650695)
∂out o1/∂net o1 = 0.186815602

net o1 = {(w5*out h1) + (w6*out h2)} + (b2*1)
∂net o1/∂w5 = 1*out h1 + 0 + 0
            = out h1
∂net o1/∂w5 = 0.593269992

∂E total/∂w5 = 0.7413650695*0.186815602*0.593269992
∂E total/∂w5 = 0.0821670407
w5+ = w5 - {η*(∂E total/∂w5)}
    = 0.40 - (0.5*0.0821670407)
w5+ = 0.35891648, the updated weight for w5.
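The w5 chain-rule calculation above translates directly into code (a sketch; eta stands for the learning rate η = 0.5 used in the example):

```python
out_h1 = 0.593269992
out_o1 = 0.7513650695
target_o1 = 0.01
w5, eta = 0.40, 0.5  # eta is the learning rate

dE_dout_o1 = out_o1 - target_o1            # ≈ 0.74136507
dout_o1_dnet_o1 = out_o1 * (1 - out_o1)    # sigmoid derivative ≈ 0.18681560
dnet_o1_dw5 = out_h1                       # ≈ 0.59326999

dE_dw5 = dE_dout_o1 * dout_o1_dnet_o1 * dnet_o1_dw5  # ≈ 0.08216704
w5_new = w5 - eta * dE_dw5                 # ≈ 0.35891648
```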

To calculate w6+:
w6+ = w6 - {η*(∂E total/∂w6)}
∂E total/∂w6 = (∂E total/∂out o1)*(∂out o1/∂net o1)*(∂net o1/∂w6)
∂E total/∂out o1 = 0.7413650695 (as above)
∂out o1/∂net o1 = 0.186815602 (as above)

net o1 = {(w5*out h1) + (w6*out h2)} + (b2*1)
∂net o1/∂w6 = 0 + 1*out h2 + 0
            = out h2
∂net o1/∂w6 = 0.5968843782

Substituting the above values,
∂E total/∂w6 = 0.7413650695*0.186815602*0.5968843782
∂E total/∂w6 = 0.082667628
w6+ = 0.45 - (0.5*0.082667628)
w6+ = 0.408666186 ; [updated weight for w6]

To calculate the updated weight for w7:
w7+ = w7 - {η*(∂E total/∂w7)}
∂E total/∂w7 = (∂E total/∂out o2)*(∂out o2/∂net o2)*(∂net o2/∂w7)

E total = 1/2(target o1 - out o1)^2 + 1/2(target o2 - out o2)^2
∂E total/∂out o2 = 0 + 2*1/2(target o2 - out o2)^(2-1)*(-1)
                 = -(target o2 - out o2)
                 = -0.99 + 0.7729284653
∂E total/∂out o2 = -0.2170715347

∂out o2/∂net o2 = out o2(1 - out o2)
                = 0.7729284653(1 - 0.7729284653)
∂out o2/∂net o2 = 0.1755100528

net o2 = {(w7*out h1) + (w8*out h2)} + (b2*1)
∂net o2/∂w7 = out h1
∂net o2/∂w7 = 0.593269992

∂E total/∂w7 = -0.2170715347*0.1755100528*0.593269992
∂E total/∂w7 = -0.0226025377
w7+ = 0.50 - (0.5*(-0.0226025377))
w7+ = 0.511301270 ; [updated weight for w7]

To calculate the updated weight for w8:
w8+ = w8 - {η*(∂E total/∂w8)}
∂E total/∂w8 = (∂E total/∂out o2)*(∂out o2/∂net o2)*(∂net o2/∂w8)
∂E total/∂out o2 = -0.2170715347
∂out o2/∂net o2 = 0.1755100528

net o2 = {(w7*out h1) + (w8*out h2)} + (b2*1)
∂net o2/∂w8 = out h2
            = 0.5968843782

∂E total/∂w8 = -0.2170715347*0.1755100528*0.5968843782
             = -0.0227402422
w8+ = w8 - {η*(∂E total/∂w8)}
    = 0.55 - (0.5*(-0.0227402422))
w8+ = 0.5613701211 ; [updated weight for w8]
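Since w5 through w8 all follow the same pattern, the four output-layer updates can be computed together (a sketch; the nested-list layout is my own arrangement, with row j holding the weights feeding output j):

```python
out_h = [0.593269992, 0.5968843782]   # hidden activations
out_o = [0.7513650695, 0.7729284653]  # output activations
target = [0.01, 0.99]
W_out = [[0.40, 0.45],   # w5, w6 feed o1
         [0.50, 0.55]]   # w7, w8 feed o2
eta = 0.5

# delta_j = (out_oj - target_j) * out_oj * (1 - out_oj)
delta = [(o - t) * o * (1 - o) for o, t in zip(out_o, target)]

# w_new = w - eta * delta_j * out_hk for each connection
W_new = [[w - eta * d * h for w, h in zip(row, out_h)]
         for row, d in zip(W_out, delta)]
# W_new ≈ [[0.35891648, 0.40866619], [0.51130127, 0.56137012]]
```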

To calculate the updated weight for w1:
w1+ = w1 - {η*(∂E total/∂w1)}
∂E total/∂w1 = (∂E total/∂out h1)*(∂out h1/∂net h1)*(∂net h1/∂w1)
∂E total/∂out h1 = (∂E o1/∂out h1) + (∂E o2/∂out h1)

∂E o1/∂out h1 = (∂E o1/∂net o1)*(∂net o1/∂out h1)
∂E o1/∂net o1 = (∂E o1/∂out o1)*(∂out o1/∂net o1)
              = 0.7413650695*0.186815602
∂E o1/∂net o1 = 0.138498562
net o1 = {(w5*out h1) + (w6*out h2)} + (b2*1)
∂net o1/∂out h1 = w5 + 0 + 0
∂net o1/∂out h1 = w5 ; [w5 = 0.40]
∂E o1/∂out h1 = 0.138498562*0.40
              = 0.055399425

∂E o2/∂out h1 = (∂E o2/∂net o2)*(∂net o2/∂out h1)
∂E o2/∂net o2 = (∂E o2/∂out o2)*(∂out o2/∂net o2)
              = -0.2170715347*0.1755100528
∂E o2/∂net o2 = -0.038098236
net o2 = {(w7*out h1) + (w8*out h2)} + (b2*1)
∂net o2/∂out h1 = (w7 + 0) + 0 ; [w7 = 0.50]
∂E o2/∂out h1 = (-0.038098236*0.50)
              = -0.019049118

∂E total/∂out h1 = 0.055399425 - 0.019049118
                 = 0.036350307
∂out h1/∂net h1 = out h1(1 - out h1)
                = 0.593269992(1 - 0.593269992)
∂out h1/∂net h1 = 0.241300709
net h1 = {(w1*i1) + (w2*i2)} + (b1*1)
∂net h1/∂w1 = (i1 + 0) + 0 ; [i1 = 0.05]
∂net h1/∂w1 = 0.05

∂E total/∂w1 = (∂E total/∂out h1)*(∂out h1/∂net h1)*(∂net h1/∂w1)
             = 0.036350307*0.241300709*0.05
∂E total/∂w1 = 0.0004385677
w1+ = w1 - {η*(∂E total/∂w1)}
    = 0.15 - (0.5*0.0004385677)
w1+ = 0.1497807162 ; [updated weight for w1]
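The w1 update above can be checked in the same way (a sketch reusing intermediate values derived in the example; note it uses the original w5 and w7, as the example does):

```python
# Quantities computed earlier in the example
out_h1 = 0.593269992
i1, w1, eta = 0.05, 0.15, 0.5
w5, w7 = 0.40, 0.50                # original (pre-update) output weights

dE_o1_dnet_o1 = 0.138498562        # from the w5 update
dE_o2_dnet_o2 = -0.038098236       # from the w7 update

# Error signal reaching hidden neuron h1 through both outputs
dE_dout_h1 = dE_o1_dnet_o1 * w5 + dE_o2_dnet_o2 * w7  # ≈ 0.036350307
dout_h1_dnet_h1 = out_h1 * (1 - out_h1)               # ≈ 0.241300709
dnet_h1_dw1 = i1

dE_dw1 = dE_dout_h1 * dout_h1_dnet_h1 * dnet_h1_dw1   # ≈ 0.000438568
w1_new = w1 - eta * dE_dw1                            # ≈ 0.149780716
```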
Similarly,

w2+ = 0.19956143
w3+ = 0.24975114
w4+ = 0.29950229

Again calculate the hidden-layer output using the updated weights:
h1+ = {(w1+*i1) + (w2+*i2)} + (b1*1)
    = {(0.1497807162*0.05) + (0.19956143*0.10)} + (0.35*1)
    = 0.0274451758 + 0.35
h1+ = 0.3774451758
h2+ = {(w3+*i1) + (w4+*i2)} + (b1*1)
    = {(0.24975114*0.05) + (0.29950229*0.10)} + (0.35*1)
h2+ = 0.392437786

Again the sigmoid function is used in hidden layer,
                     out h1+ = 1/(1+e^(-h1+))
                                  = 1/(1+e^(-(0.3774451758)))
                   out h1+ = 0.59325676
                     out h2+ = 1/(1+e^(-h2+))
                                  = 1/(1+e^(-(0.392437786)))
                   out h2+ = 0.5968694086

Again calculate the output at output layer,
                         o1+ = {(w5+*out h1+) + (w6+*out h2+)}+(b2*1)
                                 = {(0.35891648*0.59325676) + (0.408666186*0.5968694086)}+ (0.60*1)
                        o1+ = 1.0568499728
                          o2+ = {(w7+*out h1+) + (w8+*out h2+)} + (b2*1)
                             = {(0.51130127*0.59325676) + (0.5613701211*0.5968694086)}+ (0.6*1)
                          o2+ = 1.238297587


Again the sigmoid function is used at the output layer,

out o1+ = 1/(1+e^(-o1+))
        = 1/(1+e^(-1.0568499728))
out o1+ = 0.742088111
out o2+ = 1/(1+e^(-o2+))
        = 1/(1+e^(-1.238297587))
out o2+ = 0.7752675456
                                                                                 
To find the total error at the output layer,

E+ total = Σ 1/2(target - out)^2
E+ o1 = 1/2(target o1 - out o1+)^2
      = 1/2(0.01 - 0.742088111)^2
E+ o1 = 0.2679765011
E+ o2 = 1/2(target o2 - out o2+)^2
      = 1/2(0.99 - 0.7752675456)^2
E+ o2 = 0.0230550133
E+ total = E+ o1 + E+ o2
E+ total = 0.2910315144

Before backpropagation, the total error was 0.298371109. After one complete round of backpropagation, the error is reduced to 0.2910315144. Repeating this process 10,000 times drives the error down to about 0.0000351085. At that point, feeding the inputs 0.05 and 0.10 forward, the two output neurons produce 0.015912196 (vs. the 0.01 target) and 0.984065734 (vs. the 0.99 target).
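The whole procedure can be sketched as a training loop (my own minimal implementation of the steps above; following the example, the biases b1 and b2 are left fixed and only the eight weights are updated):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_step(w, i1, i2, b1, b2, t1, t2, eta=0.5):
    """One forward pass plus one gradient-descent update of all eight weights."""
    w1, w2, w3, w4, w5, w6, w7, w8 = w
    out_h1 = sigmoid(w1*i1 + w2*i2 + b1)
    out_h2 = sigmoid(w3*i1 + w4*i2 + b1)
    out_o1 = sigmoid(w5*out_h1 + w6*out_h2 + b2)
    out_o2 = sigmoid(w7*out_h1 + w8*out_h2 + b2)
    E = 0.5*(t1 - out_o1)**2 + 0.5*(t2 - out_o2)**2  # error before the update

    d1 = (out_o1 - t1) * out_o1 * (1 - out_o1)       # output deltas
    d2 = (out_o2 - t2) * out_o2 * (1 - out_o2)
    # Hidden deltas use the old output weights, as in the example
    dh1 = (d1*w5 + d2*w7) * out_h1 * (1 - out_h1)
    dh2 = (d1*w6 + d2*w8) * out_h2 * (1 - out_h2)

    w_new = (w1 - eta*dh1*i1, w2 - eta*dh1*i2,
             w3 - eta*dh2*i1, w4 - eta*dh2*i2,
             w5 - eta*d1*out_h1, w6 - eta*d1*out_h2,
             w7 - eta*d2*out_h1, w8 - eta*d2*out_h2)
    return w_new, E

w = (0.15, 0.20, 0.25, 0.30, 0.40, 0.45, 0.50, 0.55)
for step in range(10000):
    w, E = train_step(w, 0.05, 0.10, 0.35, 0.60, 0.01, 0.99)
# After 10,000 iterations the error has fallen to roughly 3.5e-5
```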


If you have any doubts, reply in the comments.

