CHALMERS, GÖTEBORGS UNIVERSITET

SOLUTIONS to RE-EXAM for ARTIFICIAL NEURAL NETWORKS

COURSE CODES: FFR135, FIM720 GU, PhD

Time: January 2, 28, at 8:30–12:30
Place: SB Multisal
Teachers: Bernhard Mehlig, 73-42 988 (mobile); Johan Fries, 7-37 272 (mobile), visits once at 9
Allowed material: Mathematics Handbook for Science and Engineering
Not allowed: any other written material, calculator

Maximum score on this exam: 12 points. Maximum score for homework problems: 12 points. To pass the course it is necessary to score at least 5 points on this written exam. CTH: 14 passed; 17.5 grade 4; 22 grade 5. GU: 14 grade G; 20 grade VG.

1. Recognition of one pattern.

(a) Define
Q^{(µ,ν)} = Σ_{i=1}^{42} ζ_i^{(µ)} ζ_i^{(ν)}.  (1)

Bit i contributes +1 to Q^{(µ,ν)} if ζ_i^{(µ)} = ζ_i^{(ν)}, and −1 if ζ_i^{(µ)} ≠ ζ_i^{(ν)}. Since the number of bits is 42, we have Q^{(µ,ν)} = 42 − 2H^{(µ,ν)}, where H^{(µ,ν)} is the number of bits that are different in pattern µ and pattern ν (the Hamming distance). We find:

H^{(1,1)} = 0    Q^{(1,1)} = 42
H^{(1,2)} = 10   Q^{(1,2)} = 22
H^{(1,3)} = 2    Q^{(1,3)} = 38
H^{(1,4)} = 42   Q^{(1,4)} = −42
H^{(1,5)} = 21   Q^{(1,5)} = 0
H^{(2,1)} = H^{(1,2)} = 10   Q^{(2,1)} = 22
H^{(2,2)} = 0                Q^{(2,2)} = 42
H^{(2,3)} = 10               Q^{(2,3)} = 22
H^{(2,4)} = 32               Q^{(2,4)} = −22
H^{(2,5)} = 20               Q^{(2,5)} = 2

(b) We have that
b_i^{(ν)} = Σ_j w_{ij} ζ_j^{(ν)} = (1/42) Σ_j (ζ_i^{(1)} ζ_j^{(1)} + ζ_i^{(2)} ζ_j^{(2)}) ζ_j^{(ν)} = (1/42) ζ_i^{(1)} Q^{(1,ν)} + (1/42) ζ_i^{(2)} Q^{(2,ν)}.  (2)

From (a), we have that:
b_i^{(1)} = (Q^{(1,1)}/42) ζ_i^{(1)} + (Q^{(2,1)}/42) ζ_i^{(2)} = ζ_i^{(1)} + (22/42) ζ_i^{(2)},
b_i^{(2)} = (Q^{(1,2)}/42) ζ_i^{(1)} + (Q^{(2,2)}/42) ζ_i^{(2)} = (22/42) ζ_i^{(1)} + ζ_i^{(2)},
b_i^{(3)} = (Q^{(1,3)}/42) ζ_i^{(1)} + (Q^{(2,3)}/42) ζ_i^{(2)} = (38/42) ζ_i^{(1)} + (22/42) ζ_i^{(2)},
b_i^{(4)} = (Q^{(1,4)}/42) ζ_i^{(1)} + (Q^{(2,4)}/42) ζ_i^{(2)} = −ζ_i^{(1)} − (22/42) ζ_i^{(2)},
b_i^{(5)} = (Q^{(1,5)}/42) ζ_i^{(1)} + (Q^{(2,5)}/42) ζ_i^{(2)} = (2/42) ζ_i^{(2)}.  (3)

(c) A pattern ζ^{(ν)} is stable if sgn(b_i^{(ν)}) = ζ_i^{(ν)} for all i. From (b), we find that (in b_i^{(3)} and b_i^{(4)} the ζ_i^{(1)}-term has the larger coefficient and determines the sign):
sgn(b_i^{(1)}) = ζ_i^{(1)},
sgn(b_i^{(2)}) = ζ_i^{(2)},
sgn(b_i^{(3)}) = ζ_i^{(1)} ≠ ζ_i^{(3)},
sgn(b_i^{(4)}) = −ζ_i^{(1)} = ζ_i^{(4)},
sgn(b_i^{(5)}) = ζ_i^{(2)} ≠ ζ_i^{(5)}.  (4)

Thus patterns ζ^{(1)}, ζ^{(2)} and ζ^{(4)} are stable.
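The relation Q^{(µ,ν)} = 42 − 2H^{(µ,ν)} and the stability test of part (c) can be checked numerically. The exam's actual patterns ζ^{(1)}, ζ^{(2)} are not reproduced here, so this sketch uses random ±1 patterns of the same length:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 42

# Two random +/-1 patterns playing the role of zeta^(1) and zeta^(2).
zeta1 = rng.choice([-1, 1], size=N)
zeta2 = rng.choice([-1, 1], size=N)

# Overlap Q and Hamming distance H, as defined in part (a).
Q = int(zeta1 @ zeta2)
H = int(np.sum(zeta1 != zeta2))
assert Q == N - 2 * H  # the relation Q^(mu,nu) = 42 - 2 H^(mu,nu)

# Hebbian weights storing the two patterns, as in eq. (2).
w = (np.outer(zeta1, zeta1) + np.outer(zeta2, zeta2)) / N

# One-step stability test of part (c): sgn(b) must reproduce the pattern.
for zeta in (zeta1, zeta2):
    b = w @ zeta
    assert np.array_equal(np.sign(b), zeta)
```

Since |Q| < 42 for two non-identical, non-opposite patterns, the crosstalk term (22/42 in the solution above) has magnitude below 1 and cannot flip any bit, which is why both stored patterns pass the test.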
2. Linearly inseparable problem.

(a) In the figure below, ξ^{(A)} and ξ^{(B)} are to have output 1, and ξ^{(C)} and ξ^{(D)} are to have output 0. There is no straight line that can separate patterns ξ^{(A)} and ξ^{(B)} from patterns ξ^{(C)} and ξ^{(D)}.

(b) The triangle corners are:
ξ^{(1)} = [4, 0]^T,  ξ^{(2)} = [−4, −1]^T,  ξ^{(3)} = [0, 3]^T.  (5)

Let v_1 = 0 at ξ^{(1)} and ξ^{(2)}. This implies
0 = w_{11} ξ_1^{(1)} + w_{12} ξ_2^{(1)} − θ_1 = 4w_{11} − θ_1
0 = w_{11} ξ_1^{(2)} + w_{12} ξ_2^{(2)} − θ_1 = −4w_{11} − w_{12} − θ_1
⟹ θ_1 = 4w_{11} and w_{12} = −4w_{11} − θ_1 = −8w_{11}.  (6)

We choose w_{11} = 1, w_{12} = −8 and θ_1 = 4.
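A quick check, under the corner coordinates read off above, that the chosen weights put both defining corners exactly on hidden neuron 1's decision line:

```python
# Decision line of hidden neuron 1: w11*x + w12*y - theta1 = 0.
w11, w12, theta1 = 1.0, -8.0, 4.0
corners = [(4.0, 0.0), (-4.0, -1.0)]  # xi^(1) and xi^(2) from eq. (5)

for (x, y) in corners:
    b = w11 * x + w12 * y - theta1
    assert b == 0.0  # both corners lie on the boundary, as required by v_1 = 0
```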
Let v_2 = 0 at ξ^{(2)} and ξ^{(3)}. This implies
0 = w_{21} ξ_1^{(2)} + w_{22} ξ_2^{(2)} − θ_2 = −4w_{21} − w_{22} − θ_2
0 = w_{21} ξ_1^{(3)} + w_{22} ξ_2^{(3)} − θ_2 = 3w_{22} − θ_2
⟹ −4w_{21} − w_{22} = 3w_{22} ⟹ w_{22} = −w_{21} and θ_2 = 3w_{22}.  (7)

We choose w_{21} = 1, w_{22} = −1 and θ_2 = −3.

Let v_3 = 0 at ξ^{(3)} and ξ^{(1)}. This implies
0 = w_{31} ξ_1^{(3)} + w_{32} ξ_2^{(3)} − θ_3 = 3w_{32} − θ_3
0 = w_{31} ξ_1^{(1)} + w_{32} ξ_2^{(1)} − θ_3 = 4w_{31} − θ_3
⟹ 3w_{32} = 4w_{31} and θ_3 = 3w_{32}.  (8)

We choose w_{31} = 3, w_{32} = 4 and θ_3 = 12.

In summary:
w = [[1, −8], [1, −1], [3, 4]] and θ = [4, −3, 12]^T.  (9)

The origin maps to
v = H(w·0 − θ) = H([−4, 3, −12]^T) = [0, 1, 0]^T,  (10)

where H denotes the Heaviside step function. We know that the origin maps to v = [0, 1, 0]^T and that the hidden neurons change values at the dashed lines:
Thus we can conclude that the regions in input space map to these regions in the hidden space:

We want v = [0, 1, 0]^T to map to output 1 and all other possible values of v to map to output 0. The hidden space can be illustrated as this:

W must be normal to the plane passing through the crosses in the picture
above. Also, W points to v = [0, 1, 0]^T from v = [1, 0, 1]^T. We may choose
W = [0, 1, 0]^T − [1, 0, 1]^T = [−1, 1, −1]^T.  (11)

We know that the point v = [1/2, 1, 0]^T lies on the decision boundary we are looking for. So
0 = W^T [1/2, 1, 0]^T − Θ ⟹ Θ = −1/2 + 1 = 1/2.  (12)

3. Backpropagation.

(a) Let N_m denote the number of weights w_{ij}^{(m)}. Let n_m denote the number of hidden units v_i^{(m,µ)} for m = 1, …, L−1, let n_0 denote the number of input units and let n_L denote the number of output units. We find that the number of weights is
Σ_{m=1}^{L} N_m = Σ_{m=1}^{L} n_{m−1} n_m,  (13)
and that the number of thresholds is
Σ_{m=1}^{L} n_m.  (14)
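The counting formulas (13) and (14) are easy to sanity-check in code. The layer sizes below are a hypothetical example architecture, not one from the exam:

```python
def count_parameters(layer_sizes):
    """Number of weights and thresholds in a fully connected multilayer perceptron.

    layer_sizes = [n0, n1, ..., nL]: input size, hidden sizes, output size.
    Implements eqs. (13) and (14): weights = sum of n_{m-1}*n_m,
    thresholds = sum of n_m over m = 1, ..., L.
    """
    weights = sum(a * b for a, b in zip(layer_sizes[:-1], layer_sizes[1:]))
    thresholds = sum(layer_sizes[1:])
    return weights, thresholds

# Hypothetical network: 3 inputs, hidden layers of 5 and 4 units, 2 outputs.
w_count, t_count = count_parameters([3, 5, 4, 2])
assert (w_count, t_count) == (3*5 + 5*4 + 4*2, 5 + 4 + 2)  # (43, 11)
```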
(b) We have
v_i^{(m,µ)} = g(b_i^{(m,µ)}) = g(−θ_i^{(m)} + Σ_j w_{ij}^{(m)} v_j^{(m−1,µ)}).  (15)

Using that p < m (so that neither w_{ij}^{(m)} nor θ_i^{(m)} depends on w_{qr}^{(p)}), we find:
∂v_i^{(m,µ)}/∂w_{qr}^{(p)} = g′(b_i^{(m,µ)}) Σ_j w_{ij}^{(m)} ∂v_j^{(m−1,µ)}/∂w_{qr}^{(p)}.  (16)

(c) From (b), we have:
∂v_i^{(m,µ)}/∂w_{qr}^{(m)} = g′(b_i^{(m,µ)}) Σ_j (∂w_{ij}^{(m)}/∂w_{qr}^{(m)}) v_j^{(m−1,µ)}.  (17)
But since p = m, the derivative now acts on the weight itself, ∂w_{ij}^{(m)}/∂w_{qr}^{(m)} = δ_{iq} δ_{jr}, and we find:
∂v_i^{(m,µ)}/∂w_{qr}^{(m)} = g′(b_i^{(m,µ)}) δ_{iq} v_r^{(m−1,µ)}.  (18)

(d) We have w_{qr}^{(L−2)} → w_{qr}^{(L−2)} + δw_{qr}^{(L−2)}, where
δw_{qr}^{(L−2)} = −η ∂H/∂w_{qr}^{(L−2)}.  (19)

From 3(b) and 3(c), we have:
∂v_i^{(m,µ)}/∂w_{qr}^{(p)} = g′(b_i^{(m,µ)}) Σ_j w_{ij}^{(m)} ∂v_j^{(m−1,µ)}/∂w_{qr}^{(p)}  if p < m,
∂v_i^{(m,µ)}/∂w_{qr}^{(p)} = g′(b_i^{(m,µ)}) δ_{iq} v_r^{(m−1,µ)}  if p = m.  (20)

We differentiate the energy function H = (1/2) Σ_µ Σ_i (O_i^{(µ)} − ζ_i^{(µ)})²:
∂H/∂w_{qr}^{(L−2)} = Σ_µ Σ_i (O_i^{(µ)} − ζ_i^{(µ)}) ∂O_i^{(µ)}/∂w_{qr}^{(L−2)}.  (21)
Define v_i^{(L,µ)} = O_i^{(µ)}. We have:

∂O_i^{(µ)}/∂w_{qr}^{(L−2)} = ∂v_i^{(L,µ)}/∂w_{qr}^{(L−2)}

{Insert from eq. (20). Use that L−2 < L.}
= g′(b_i^{(L,µ)}) Σ_j w_{ij}^{(L)} ∂v_j^{(L−1,µ)}/∂w_{qr}^{(L−2)}

{Insert from eq. (20). Use that L−2 < L−1.}
= g′(b_i^{(L,µ)}) Σ_j w_{ij}^{(L)} g′(b_j^{(L−1,µ)}) Σ_k w_{jk}^{(L−1)} ∂v_k^{(L−2,µ)}/∂w_{qr}^{(L−2)}

{Insert from eq. (20). Use that L−2 = L−2.}
= g′(b_i^{(L,µ)}) Σ_j w_{ij}^{(L)} g′(b_j^{(L−1,µ)}) Σ_k w_{jk}^{(L−1)} g′(b_k^{(L−2,µ)}) δ_{kq} v_r^{(L−3,µ)}
= g′(b_i^{(L,µ)}) Σ_j w_{ij}^{(L)} g′(b_j^{(L−1,µ)}) w_{jq}^{(L−1)} g′(b_q^{(L−2,µ)}) v_r^{(L−3,µ)}.  (22)

The update rule is eq. (19) with the derivative of the energy function given by eqs. (21) and (22).

4. True/False questions. Indicate whether the following statements are true or false. 13–14 correct answers give 2 points, 11–12 correct answers give 1.5 points, 9–10 correct answers give 1 point, 8 correct answers give 0.5 points and 0–7 correct answers give zero points. (2 p.)

1. You need access to the state of all neurons in a multilayer perceptron when updating all weights through backpropagation. TRUE (the update of a weight in a given layer depends on the values of the neurons in the layer before).
2. Consider the Hopfield network. If a pattern is stable it must be an eigenvector of the weight matrix. FALSE (due to the step function).
3. If you store two orthogonal patterns in a Hopfield network, they will always turn out unstable. FALSE (the crosstalk term is zero).
4. Kohonen's algorithm learns convex distributions better than concave ones. TRUE (concave corners can cause problems).
5. The number of N-dimensional Boolean functions is 2^N. FALSE (it is 2^(2^N)).
6. The weight matrices in a perceptron are symmetric. FALSE (they may not even be square matrices).
7. Using g(b) = b as activation function and putting all thresholds to zero in a multilayer perceptron allows you to solve some linearly inseparable problems. FALSE (you have effectively one weight matrix that is the product of all your original ones).
8. You need at least four radial basis functions for the XOR problem to be linearly separable in the space of the radial basis functions. FALSE (two are enough).
9. Consider p > 2 patterns uniformly distributed on a circle. None of the eigenvalues of the covariance matrix of the patterns is zero. TRUE (a zero eigenvalue indicates patterns on a line).
10. Even if the weight vector in Oja's rule equals its stable steady state at one iteration, it may change in the following iterations. TRUE (it is only a statistically steady state).
11. If your Kohonen network is supposed to learn the distribution P(ξ), it is important to generate the patterns ξ^{(µ)} before you start training the network. FALSE (training your network does not affect which pattern you draw from your distribution).
12. All one-dimensional Boolean problems are linearly separable. TRUE (two different points can always be separated by a line).
13. In Kohonen's algorithm, the neurons have fixed positions in the output space. TRUE (it is the weights, in the input space, that are updated).
14. Some elements of the covariance matrix are variances. TRUE (the diagonal elements).
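Statement 7 can be illustrated numerically: with linear activation and zero thresholds, a layer-by-layer forward pass equals a single effective weight matrix acting on the input, so the network computes only a linear map. The matrix sizes below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)

# Three layers with linear activation g(b) = b and zero thresholds.
W1 = rng.normal(size=(5, 3))
W2 = rng.normal(size=(4, 5))
W3 = rng.normal(size=(2, 4))

x = rng.normal(size=3)

# Layer-by-layer forward pass...
out = W3 @ (W2 @ (W1 @ x))
# ...collapses to one effective weight matrix, the product of all layers.
W_eff = W3 @ W2 @ W1
assert np.allclose(out, W_eff @ x)
```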
5. Oja's rule.

(a) Oja's rule reads
δw = ηζ(ξ − ζw),  (23)
with
ζ = ξ^T w = w^T ξ.  (24)
Averaging over patterns:
⟨δw⟩/η = ⟨ξξ^T w − w^T ξξ^T w w⟩ = ⟨ξξ^T⟩w − w^T⟨ξξ^T⟩w w.  (25)
Insert ⟨ξξ^T⟩ = C:
⟨δw⟩/η = Cw − w^T C w w.  (26)
We see that ⟨δw⟩ = 0 implies that w is an eigenvector of C with eigenvalue λ = w^T C w. Note that then (using w^T w = |w|²):
λ = w^T C w = w^T λw = λ w^T w ⟹ w^T w = 1.  (27)

(b) Are the patterns centered?
(1/5) Σ_{µ=1}^{5} ξ_1^{(µ)} = (1/5)(−6 − 2 + 2 + 1 + 5) = 0,  (28)
(1/5) Σ_{µ=1}^{5} ξ_2^{(µ)} = (1/5)(−5 − 4 + 2 + 3 + 4) = 0.  (29)
So ⟨ξ⟩ = 0, and the patterns are centered. This means that the covariance matrix is:
C = (1/5) Σ_{µ=1}^{5} ξ^{(µ)} ξ^{(µ)T} = ⟨ξξ^T⟩.  (30)
We have
ξ^{(1)} ξ^{(1)T} = [[36, 30], [30, 25]]
ξ^{(2)} ξ^{(2)T} = [[4, 8], [8, 16]]
ξ^{(3)} ξ^{(3)T} = [[4, 4], [4, 4]]
ξ^{(4)} ξ^{(4)T} = [[1, 3], [3, 9]]
ξ^{(5)} ξ^{(5)T} = [[25, 20], [20, 16]].  (31)
We compute the elements of C:
5C_{11} = 36 + 4 + 4 + 1 + 25 = 70
5C_{12} = 5C_{21} = 30 + 8 + 4 + 3 + 20 = 65
5C_{22} = 25 + 16 + 4 + 9 + 16 = 70.  (32)

We find that
C = [[14, 13], [13, 14]].  (33)

Maximal eigenvalue:
0 = det([[14 − λ, 13], [13, 14 − λ]]) = (14 − λ)² − 13² = λ² − 28λ + 14² − 13² = λ² − 28λ + 27
⟹ λ = 14 ± √(14² − 27) = 14 ± √169 = 14 ± 13 ⟹ λ_max = 27.  (34)

Eigenvector u:
[[14, 13], [13, 14]] [u_1, u_2]^T = 27 [u_1, u_2]^T ⟹ u_1 = u_2.  (35)

So an eigenvector corresponding to the largest eigenvalue of C is given by
u = t [1, 1]^T  (36)
for an arbitrary t ≠ 0. This is the principal component.
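The computation in (b) can be checked with numpy. The pattern values below are the ones implied by the outer products (31) and the sums (28)–(29); their overall sign is fixed only up to a global flip, which does not affect C:

```python
import numpy as np

# The five two-dimensional patterns (signs fixed up to a global flip).
xi = np.array([[-6, -5], [-2, -4], [2, 2], [1, 3], [5, 4]], dtype=float)

assert np.allclose(xi.mean(axis=0), 0)        # patterns are centred, eqs. (28)-(29)

C = (xi.T @ xi) / len(xi)                     # covariance matrix, eq. (30)
assert np.allclose(C, [[14, 13], [13, 14]])   # eq. (33)

eigvals, eigvecs = np.linalg.eigh(C)
assert np.isclose(eigvals.max(), 27)          # maximal eigenvalue, eq. (34)

u = eigvecs[:, np.argmax(eigvals)]            # principal component
assert np.isclose(abs(u[0]), abs(u[1]))       # proportional to [1, 1]^T, eq. (36)
```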
6. General Boolean problems. There was a typo in Eqn. (8) of the exam. The correct equation is:

v_j^{(µ)} = 1 if −θ + Σ_i w_{ji} ξ_i^{(µ)} > 0,
v_j^{(µ)} = 0 if −θ + Σ_i w_{ji} ξ_i^{(µ)} ≤ 0.

(a) The solution uses w_{ji} = ξ_i^{(j)}. This means that the j-th row of the weight matrix w is the vector w^{(j)} = ξ^{(j)}. From the figure above, we see that:

w^{(1)T} ξ^{(1)} = 1 + 1 + 1 = 3,
w^{(1)T} ξ^{(µ)} = 1 + 1 − 1 = 1 for µ = 8, 4 and 3,
w^{(1)T} ξ^{(µ)} = 1 − 1 − 1 = −1 for µ = 2, 5 and 7,
w^{(1)T} ξ^{(6)} = −1 − 1 − 1 = −3.

Using that θ = 2, we note that:
w^{(i)T} ξ^{(µ)} − θ is > 0 if i = µ and < 0 if i ≠ µ.  (37)

So we have:
v_i^{(µ)} = 1 if i = µ, and v_i^{(µ)} = 0 if i ≠ µ.  (38)

We can understand this as follows: corner µ of the cube of possible inputs is separated from the other corners in that it assigns 1 to the µ-th hidden neuron and 0 to the others.
From Figure 4 in the exam, we see that there are exactly 4 of the 8 possible inputs ξ^{(µ)} that are to be mapped to O^{(µ)} = 1. These are ξ^{(µ)} for µ = 2, 4, 5 and 7. These inputs will assign, respectively:
v^{(2)} = [0,1,0,0,0,0,0,0]^T, v^{(4)} = [0,0,0,1,0,0,0,0]^T, v^{(5)} = [0,0,0,0,1,0,0,0]^T, and v^{(7)} = [0,0,0,0,0,0,1,0]^T.  (39)

The weights W are now to detect these and only these patterns, so that
O^{(µ)} = W^T v^{(µ)} = 1 for µ ∈ {2, 4, 5, 7} and 0 for µ ∈ {1, 3, 6, 8}.  (40)

This is achieved by letting:
W = [0, 1, 0, 1, 1, 0, 1, 0]^T.  (41)
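The construction of 6(a) can be verified independently of how the corners are labelled: each hidden neuron fires only for its own corner, and W then picks out any desired subset. The corner ordering and the target set below are assumptions (the exam's Figure 4 fixes a specific labelling that is not reproduced here):

```python
import numpy as np
from itertools import product

# The 8 corners of the cube of +/-1 inputs, in an assumed (arbitrary) order.
corners = np.array(list(product([-1, 1], repeat=3)), dtype=float)

w = corners.copy()   # w_ji = xi_i^(j): row j of w is pattern j
theta = 2.0

# Hidden layer: v_j = 1 if -theta + sum_i w_ji xi_i > 0, else 0 (corrected eq. (8)).
def hidden(xi):
    return (w @ xi - theta > 0).astype(int)

# Each corner activates exactly its own hidden neuron, eq. (38):
# the self-overlap is 3, all other overlaps are at most 1 < theta.
for mu, xi in enumerate(corners):
    v = hidden(xi)
    assert v[mu] == 1 and v.sum() == 1

# Output layer: W picks out an arbitrary target subset of corners, as in eq. (41).
target = {1, 3, 4, 6}            # hypothetical set of corners mapped to output 1
W = np.array([1.0 if mu in target else 0.0 for mu in range(8)])
for mu, xi in enumerate(corners):
    O = int(W @ hidden(xi))
    assert O == (1 if mu in target else 0)
```

Because the hidden layer realises a one-hot code of the input corner, any of the 2^8 possible target subsets (and hence any 3-dimensional Boolean function) can be implemented by choosing W accordingly.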
(b) The solution in 6(a) implies separating each corner ξ^{(µ)} of the cube of input patterns by letting
v_i^{(µ)} = 1 if i = µ, and v_i^{(µ)} = 0 if i ≠ µ.  (42)
Thus the solution requires 2³ = 8 hidden neurons. The analogous solution in 2D is to separate each corner of a square, and it requires 2^N = 2² = 4 hidden neurons. The decision boundaries of the hidden neurons are shown here: