
Tuesday, 19 December 2017

How it Works: Back Propagation in Neural Network with Mathematical Example

Backpropagation is the algorithm used to compute the error of a neural network and to adjust its weights so that the error shrinks and the output approaches the target. Let us work through the following example to see how it is done.


GIVEN VALUES:

At input layer:
Inputs: i1=0.05, i2=0.10
Weights: w1=0.15, w2=0.20, w3=0.25, w4=0.30

At hidden layer:
Bias: b1=0.35

w5= 0.40, w6= 0.45, w7= 0.50, w8= 0.55

At output layer:
Bias: b2=0.60

Target output: o1= 0.01, o2= 0.99

To find:
  The updated weights that reduce the error and move the outputs toward the targets, using the backpropagation algorithm.
Solution:
To find the hidden layer output:
net h1 = {(w1*i1) + (w2*i2)} + (b1*1)
           = {(0.15*0.05) + (0.20*0.10)} + (0.35*1)
net h1 = 0.3775
net h2 = {(w3*i1) + (w4*i2)} + (b1*1)
           = {(0.25*0.05) + (0.30*0.10)} + (0.35*1)
net h2 = 0.3925

Here the sigmoid function is used as the activation function in the hidden layer; it converts the linear net input into a nonlinear output.

out h1 = 1/(1+e^(-net h1))
           = 1/(1+e^(-0.3775))
out h1 = 0.593269992
out h2 = 1/(1+e^(-net h2))
           = 1/(1+e^(-0.3925))
out h2 = 0.5968843782
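The sigmoid computation above can be sketched in Python (a minimal illustration; the variable names are mine, not part of the original example):

```python
import math

def sigmoid(x):
    # Logistic activation: squashes any real net input into (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

# Hidden-layer activations from the net inputs computed above
out_h1 = sigmoid(0.3775)   # ≈ 0.593269992
out_h2 = sigmoid(0.3925)   # ≈ 0.596884378
```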

To find the output at output layer:

net o1 = {(w5*out h1)+(w6*out h2)}+(b2*1)
   = {(0.40*0.593269992) + (0.45*0.5968843782)} + (0.60*1)
net o1 = 1.105905967
net o2 = {(w7*out h1)+(w8*out h2)}+(b2*1)
           = {(0.50*0.593269992) + (0.55*0.5968843782)} + (0.60*1)
net o2 = 1.224921404

Again, the sigmoid function is used at the output layer:
out o1 = 1/(1+e^(-net o1))
           = 1/(1+e^(-1.105905967))
out o1 = 0.7513650695

out o2 = 1/(1+e^(-net o2))
           = 1/(1+e^(-1.224921404))
out o2 = 0.7729284653
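The whole forward pass so far can be reproduced with a few lines of Python (a sketch for checking the arithmetic; names such as net_h1 are my own):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Given values from the example
i1, i2 = 0.05, 0.10
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55
b1, b2 = 0.35, 0.60

# Hidden layer
net_h1 = w1 * i1 + w2 * i2 + b1          # 0.3775
net_h2 = w3 * i1 + w4 * i2 + b1          # 0.3925
out_h1, out_h2 = sigmoid(net_h1), sigmoid(net_h2)

# Output layer
net_o1 = w5 * out_h1 + w6 * out_h2 + b2  # ≈ 1.105905967
net_o2 = w7 * out_h1 + w8 * out_h2 + b2  # ≈ 1.224921404
out_o1, out_o2 = sigmoid(net_o1), sigmoid(net_o2)
# out_o1 ≈ 0.751365070, out_o2 ≈ 0.772928465
```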

To find the total error at the output layer:

E total = Σ 1/2(target - out)^2
E o1 = 1/2(target o1 - out o1)^2
     = 1/2(0.01 - 0.7513650695)^2
E o1 = 0.274811083
E o2 = 1/2(target o2 - out o2)^2
     = 1/2(0.99 - 0.7729284653)^2
E o2 = 0.0235600256

E total = E o1 + E o2
        = 0.274811083 + 0.0235600256
E total = 0.298371109
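As a quick check, the squared-error totals can be computed directly (a sketch; the variable names are my own):

```python
target_o1, target_o2 = 0.01, 0.99
out_o1, out_o2 = 0.7513650695, 0.7729284653  # forward-pass outputs from above

E_o1 = 0.5 * (target_o1 - out_o1) ** 2   # ≈ 0.274811083
E_o2 = 0.5 * (target_o2 - out_o2) ** 2   # ≈ 0.023560026
E_total = E_o1 + E_o2                    # ≈ 0.298371109
```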

To find the error at the hidden layer, each output error is propagated back in proportion to the connecting weights:
E out h1 = (E o1*w5) + (E o2*w6)
         = (0.274811083*0.40) + (0.0235600256*0.45)
E out h1 = 0.1205264447
E out h2 = (E o1*w7) + (E o2*w8)
         = (0.274811083*0.50) + (0.0235600256*0.55)
E out h2 = 0.1503635556

According to the chain rule, we calculate the updated weights w5, w6, w7 and w8 (η = 0.5 is the learning rate).

To calculate w5+:
∂E total/∂w5 = (∂E total/∂out o1)*(∂out o1/∂net o1)*(∂net o1/∂w5)

E total = 1/2(target o1 - out o1)^2 + 1/2(target o2 - out o2)^2
∂E total/∂out o1 = 2*1/2(target o1 - out o1)^(2-1)*(-1) + 0
                 = -(target o1 - out o1)
                 = -0.01 + 0.7513650695
∂E total/∂out o1 = 0.7413650695

∂out o1/∂net o1 = out o1(1 - out o1)
                = 0.7513650695(1 - 0.7513650695)
∂out o1/∂net o1 = 0.186815602

net o1 = {(w5*out h1) + (w6*out h2)} + (b2*1)
∂net o1/∂w5 = 1*out h1 + 0 + 0
            = out h1
∂net o1/∂w5 = 0.593269992

∂E total/∂w5 = 0.7413650695*0.186815602*0.593269992
∂E total/∂w5 = 0.0821670407
w5+ = w5 - {η*(∂E total/∂w5)}
    = 0.40 - (0.5*0.0821670407)
w5+ = 0.35891648, the updated weight for w5.
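The w5 chain-rule calculation above translates directly into code (a sketch; eta stands for the learning rate η = 0.5 used in the example):

```python
out_h1 = 0.593269992
out_o1 = 0.7513650695
target_o1 = 0.01
w5, eta = 0.40, 0.5  # eta is the learning rate

dE_dout_o1 = out_o1 - target_o1            # ≈ 0.74136507
dout_o1_dnet_o1 = out_o1 * (1 - out_o1)    # sigmoid derivative ≈ 0.18681560
dnet_o1_dw5 = out_h1                       # ≈ 0.59326999

dE_dw5 = dE_dout_o1 * dout_o1_dnet_o1 * dnet_o1_dw5  # ≈ 0.08216704
w5_new = w5 - eta * dE_dw5                 # ≈ 0.35891648
```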

To calculate w6+:
w6+ = w6 - {η*(∂E total/∂w6)}
∂E total/∂w6 = (∂E total/∂out o1)*(∂out o1/∂net o1)*(∂net o1/∂w6)
∂E total/∂out o1 = 0.7413650695 (as above)
∂out o1/∂net o1 = 0.186815602 (as above)

net o1 = {(w5*out h1) + (w6*out h2)} + (b2*1)
∂net o1/∂w6 = 0 + 1*out h2 + 0
            = out h2
∂net o1/∂w6 = 0.5968843782

Substituting the above values,
∂E total/∂w6 = 0.7413650695*0.186815602*0.5968843782
∂E total/∂w6 = 0.082667628
w6+ = 0.45 - (0.5*0.082667628)
w6+ = 0.408666186 ; [updated weight for w6]

To calculate the updated weight for w7:
w7+ = w7 - {η*(∂E total/∂w7)}
∂E total/∂w7 = (∂E total/∂out o2)*(∂out o2/∂net o2)*(∂net o2/∂w7)

E total = 1/2(target o1 - out o1)^2 + 1/2(target o2 - out o2)^2
∂E total/∂out o2 = 0 + 2*1/2(target o2 - out o2)^(2-1)*(-1)
                 = -(target o2 - out o2)
                 = -0.99 + 0.7729284653
∂E total/∂out o2 = -0.2170715347

∂out o2/∂net o2 = out o2(1 - out o2)
                = 0.7729284653(1 - 0.7729284653)
∂out o2/∂net o2 = 0.1755100528

net o2 = {(w7*out h1) + (w8*out h2)} + (b2*1)
∂net o2/∂w7 = out h1
∂net o2/∂w7 = 0.593269992

∂E total/∂w7 = -0.2170715347*0.1755100528*0.593269992
∂E total/∂w7 = -0.0226025377
w7+ = 0.50 - (0.5*(-0.0226025377))
w7+ = 0.511301270 ; [updated weight for w7]

To calculate the updated weight for w8:
w8+ = w8 - {η*(∂E total/∂w8)}
∂E total/∂w8 = (∂E total/∂out o2)*(∂out o2/∂net o2)*(∂net o2/∂w8)
∂E total/∂out o2 = -0.2170715347
∂out o2/∂net o2 = 0.1755100528

net o2 = {(w7*out h1) + (w8*out h2)} + (b2*1)
∂net o2/∂w8 = out h2
            = 0.5968843782

∂E total/∂w8 = -0.2170715347*0.1755100528*0.5968843782
             = -0.0227402422
w8+ = w8 - {η*(∂E total/∂w8)}
    = 0.55 - (0.5*(-0.0227402422))
w8+ = 0.5613701211 ; [updated weight for w8]
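Since w5 through w8 all follow the same pattern, the four output-layer updates can be computed together (a sketch; the nested-list layout is my own arrangement, with row j holding the weights feeding output j):

```python
out_h = [0.593269992, 0.5968843782]   # hidden activations
out_o = [0.7513650695, 0.7729284653]  # output activations
target = [0.01, 0.99]
W_out = [[0.40, 0.45],   # w5, w6 feed o1
         [0.50, 0.55]]   # w7, w8 feed o2
eta = 0.5

# delta_j = (out_oj - target_j) * out_oj * (1 - out_oj)
delta = [(o - t) * o * (1 - o) for o, t in zip(out_o, target)]

# w_new = w - eta * delta_j * out_hk for each connection
W_new = [[w - eta * d * h for w, h in zip(row, out_h)]
         for row, d in zip(W_out, delta)]
# W_new ≈ [[0.35891648, 0.40866619], [0.51130127, 0.56137012]]
```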

To calculate the updated weight for w1:
w1+ = w1 - {η*(∂E total/∂w1)}
∂E total/∂w1 = (∂E total/∂out h1)*(∂out h1/∂net h1)*(∂net h1/∂w1)
∂E total/∂out h1 = (∂E o1/∂out h1) + (∂E o2/∂out h1)

∂E o1/∂out h1 = (∂E o1/∂net o1)*(∂net o1/∂out h1)
∂E o1/∂net o1 = (∂E o1/∂out o1)*(∂out o1/∂net o1)
              = 0.7413650695*0.186815602
∂E o1/∂net o1 = 0.138498562
net o1 = {(w5*out h1) + (w6*out h2)} + (b2*1)
∂net o1/∂out h1 = w5 + 0 + 0
∂net o1/∂out h1 = w5 ; [w5 = 0.40]
∂E o1/∂out h1 = 0.138498562*0.40
              = 0.055399425

∂E o2/∂out h1 = (∂E o2/∂net o2)*(∂net o2/∂out h1)
∂E o2/∂net o2 = (∂E o2/∂out o2)*(∂out o2/∂net o2)
              = -0.2170715347*0.1755100528
∂E o2/∂net o2 = -0.038098236
net o2 = {(w7*out h1) + (w8*out h2)} + (b2*1)
∂net o2/∂out h1 = (w7 + 0) + 0 ; [w7 = 0.50]
∂E o2/∂out h1 = (-0.038098236*0.50)
              = -0.019049118

∂E total/∂out h1 = 0.055399425 - 0.019049118
                 = 0.036350307
∂out h1/∂net h1 = out h1(1 - out h1)
                = 0.593269992(1 - 0.593269992)
∂out h1/∂net h1 = 0.241300709
net h1 = {(w1*i1) + (w2*i2)} + (b1*1)
∂net h1/∂w1 = (i1 + 0) + 0 ; [i1 = 0.05]
∂net h1/∂w1 = 0.05

∂E total/∂w1 = (∂E total/∂out h1)*(∂out h1/∂net h1)*(∂net h1/∂w1)
             = 0.036350307*0.241300709*0.05
∂E total/∂w1 = 0.0004385677
w1+ = w1 - {η*(∂E total/∂w1)}
    = 0.15 - (0.5*0.0004385677)
w1+ = 0.1497807162 ; [updated weight for w1]
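The w1 update above can be checked in the same way (a sketch reusing intermediate values derived in the example; note it uses the original w5 and w7, as the example does):

```python
# Quantities computed earlier in the example
out_h1 = 0.593269992
i1, w1, eta = 0.05, 0.15, 0.5
w5, w7 = 0.40, 0.50                # original (pre-update) output weights

dE_o1_dnet_o1 = 0.138498562        # from the w5 update
dE_o2_dnet_o2 = -0.038098236       # from the w7 update

# Error signal reaching hidden neuron h1 through both outputs
dE_dout_h1 = dE_o1_dnet_o1 * w5 + dE_o2_dnet_o2 * w7  # ≈ 0.036350307
dout_h1_dnet_h1 = out_h1 * (1 - out_h1)               # ≈ 0.241300709
dnet_h1_dw1 = i1

dE_dw1 = dE_dout_h1 * dout_h1_dnet_h1 * dnet_h1_dw1   # ≈ 0.000438568
w1_new = w1 - eta * dE_dw1                            # ≈ 0.149780716
```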
Similarly,

w2+ = 0.19956143
w3+ = 0.24975114
w4+ = 0.29950229

Again calculate the hidden-layer output using the updated weights:
h1+ = {(w1+*i1) + (w2+*i2)} + (b1*1)
    = {(0.1497807162*0.05) + (0.19956143*0.10)} + (0.35*1)
    = 0.0274451758 + 0.35
h1+ = 0.3774451758
h2+ = {(w3+*i1) + (w4+*i2)} + (b1*1)
    = {(0.24975114*0.05) + (0.29950229*0.10)} + (0.35*1)
h2+ = 0.392437786

Again the sigmoid function is used in hidden layer,
                     out h1+ = 1/(1+e^(-h1+))
                                  = 1/(1+e^(-(0.3774451758)))
                   out h1+ = 0.59325676
                     out h2+ = 1/(1+e^(-h2+))
                                  = 1/(1+e^(-(0.392437786)))
                   out h2+ = 0.5968694086

Again calculate the output at output layer,
                         o1+ = {(w5+*out h1+) + (w6+*out h2+)}+(b2*1)
                                 = {(0.35891648*0.59325676) + (0.408666186*0.5968694086)}+ (0.60*1)
                        o1+ = 1.0568499728
                          o2+ = {(w7+*out h1+) + (w8+*out h2+)} + (b2*1)
                             = {(0.51130127*0.59325676) + (0.5613701211*0.5968694086)}+ (0.6*1)
                          o2+ = 1.238297587


Again the sigmoid function is used at the output layer,

out o1+ = 1/(1+e^(-o1+))
        = 1/(1+e^(-1.0568499728))
out o1+ = 0.742088111
out o2+ = 1/(1+e^(-o2+))
        = 1/(1+e^(-1.238297587))
out o2+ = 0.7752675456
                                                                                 
To find the total error at the output layer,

E+ total = Σ 1/2(target - out)^2
E+ o1 = 1/2(target o1 - out o1+)^2
      = 1/2(0.01 - 0.742088111)^2
E+ o1 = 0.2679765011
E+ o2 = 1/2(target o2 - out o2+)^2
      = 1/2(0.99 - 0.7752675456)^2
E+ o2 = 0.0230550133
E+ total = E+ o1 + E+ o2
E+ total = 0.2910315144

Before backpropagation, the total error was 0.298371109. After one complete round of backpropagation, the error is reduced to 0.2910315144. Repeating this process 10,000 times drives the error down to about 0.0000351085. At that point, feeding the inputs 0.05 and 0.10 forward, the two output neurons produce 0.015912196 (vs. the 0.01 target) and 0.984065734 (vs. the 0.99 target).
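The whole procedure can be sketched as a training loop (my own minimal implementation of the steps above; following the example, the biases b1 and b2 are left fixed and only the eight weights are updated):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_step(w, i1, i2, b1, b2, t1, t2, eta=0.5):
    """One forward pass plus one gradient-descent update of all eight weights."""
    w1, w2, w3, w4, w5, w6, w7, w8 = w
    out_h1 = sigmoid(w1*i1 + w2*i2 + b1)
    out_h2 = sigmoid(w3*i1 + w4*i2 + b1)
    out_o1 = sigmoid(w5*out_h1 + w6*out_h2 + b2)
    out_o2 = sigmoid(w7*out_h1 + w8*out_h2 + b2)
    E = 0.5*(t1 - out_o1)**2 + 0.5*(t2 - out_o2)**2  # error before the update

    d1 = (out_o1 - t1) * out_o1 * (1 - out_o1)       # output deltas
    d2 = (out_o2 - t2) * out_o2 * (1 - out_o2)
    # Hidden deltas use the old output weights, as in the example
    dh1 = (d1*w5 + d2*w7) * out_h1 * (1 - out_h1)
    dh2 = (d1*w6 + d2*w8) * out_h2 * (1 - out_h2)

    w_new = (w1 - eta*dh1*i1, w2 - eta*dh1*i2,
             w3 - eta*dh2*i1, w4 - eta*dh2*i2,
             w5 - eta*d1*out_h1, w6 - eta*d1*out_h2,
             w7 - eta*d2*out_h1, w8 - eta*d2*out_h2)
    return w_new, E

w = (0.15, 0.20, 0.25, 0.30, 0.40, 0.45, 0.50, 0.55)
for step in range(10000):
    w, E = train_step(w, 0.05, 0.10, 0.35, 0.60, 0.01, 0.99)
# After 10,000 iterations the error has fallen to roughly 3.5e-5
```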


If you have any doubts, reply in the comments.

