Multiple line-template matching with the EM algorithm

Pattern Recognition Letters 18 (1997) 1283-1292

Simon Moss, Edwin R. Hancock *
Department of Computer Science, University of York, York YO1 5DD, UK

Abstract

This paper shows how multiple shape hypotheses can be used to recognise complex line patterns using the expectation-maximisation algorithm. The idea underpinning this work is to construct a mixture distribution for an observed configuration of line segments over a space of hypothesised shape models. According to the EM framework each model is represented by a set of maximum likelihood registration parameters together with a set of matching probabilities. These two pieces of information are iteratively updated so as to maximise the expected data likelihood over the space of model-data associations. This architecture can be viewed as providing simultaneous shape registration and hypothesis verification. We illustrate the effectiveness of the recognition strategy by studying the registration of noisy radar data against a database of alternative cartographic maps for different locations. © 1997 Elsevier Science B.V.

Keywords: Mixture of Gaussians; Line templates; Aerial image registration; EM algorithm

1. Introduction

Shape recognition is invariably posed as a hypothesis and test procedure. In the hypothesis generation step particular model poses are projected onto the raw image data. In the verification step a measure of goodness of fit is used to assess how well model and data register against one another. However, when multiple shape hypotheses are to hand the recognition process soon becomes unmanageable. In most practical situations complexity is curbed by exploiting domain-specific cues to restrict the dimensionality of the search space (i.e. the number of model poses that must be projected onto the data) or limit the number of active hypotheses (i.e. to reject certain models as being unlikely to deliver good descriptions of the image data).

* Corresponding author. E-mail: erh@minster.york.ac.uk.
0167-8655/97/$17.00 © 1997 Elsevier Science B.V. All rights reserved. PII S0167-8655(97)00102-5

Although this model-driven hypothesis and test approach can be rendered highly efficient, it clearly violates the principle of least commitment. Ambiguities are resolved early and partial matches are not fully evaluated. Moreover, there is no possibility for model overlap or mixing. Indeed, the absence of these possibilities is at odds with much of the psychology of model recall (Feldman and Ballard, 1982; Kawabata and Mori, 1992). In particular, it rules out important phenomena such as perceptual alternation (Bialek and Deweese, 1995; Riani and Simonotto, 1994).

Our aim in this paper is to demonstrate how statistical mixtures of shapes can be used for recognition when multiple model hypotheses are to hand. Specifically, we show how the expectation-maximisation (Dempster et al., 1977) algorithm can be used to iteratively estimate shape mixing proportions and shape registration parameters. According to our viewpoint, the expectation step of the algorithm can be viewed as combining evidence for the different
competing model hypotheses. The maximisation step is concerned with finding the maximum-likelihood parameters for each of the models. Because each model has an associated probability, there is scope for accommodating both ambiguity of appearance (Callari and Ferrie, 1996; Qin and Luh, 1994; Kawabata and Mori, 1992) and hypothesis alternation (Bialek and Deweese, 1995; Riani and Simonotto, 1994). Moreover, different models can co-operate (Minka and Picard, 1996) to provide a partial explanation for the data.

The outline of this paper is as follows. In Section 2 we furnish details of our line-based representation of model shapes. Section 3 briefly reviews the details of the EM algorithm that are pertinent to our study. Section 4 provides experimental evaluation of our method on registering radar images onto a database of cartographic models. Finally, Section 5 offers our conclusions and outlines our future plans.

2. Representation

The tokens used in the matching process are line segments which are characterised by their mid-point coordinates $(x_i, y_i)$ in the image plane and their line orientation $\theta_i$ in the image coordinate system. Each line in the image is represented by a vector $w_i = (x_i, y_i, \theta_i)^T$, where $i$ is the segment index. The complete set of data lines is the set $w = \{w_i, \forall i \in D\}$, where $D$ is the segment index set.

The database of models consists of a set of line-templates. In our application, each set of lines constitutes a separate cartographic model. Suppose that the cartographic model indexed $m$ consists of the set of lines $z^m = \{z_j^m, \forall j \in M_m\}$. Here $M_m$ is the index set for the model lines and the $z_j^m$ represent the corresponding measurement vectors. The aim of our matching algorithm is to iteratively recover a parameter vector $\phi_m^{(n)}$ which describes the Euclidean transformation that brings the data lines into registration with the model indexed $m$. Each parameter vector $\phi_m^{(n)} = (\phi_{m,1}^{(n)}, \ldots, \phi_{m,4}^{(n)})^T$ has four components; the meanings of these components are as follows: $\phi_{m,1}^{(n)}$ represents the x-translation, $\phi_{m,2}^{(n)}$ represents the y-translation, $\phi_{m,3}^{(n)}$ is the rotation and $\phi_{m,4}^{(n)}$ is the relative scale. The state of registration of the entire model base is denoted by the matrix of column vectors $\Phi^{(n)} = (\phi_1^{(n)}, \ldots, \phi_K^{(n)})$, where $K$ is the number of models.

Individual model lines are transformed into the coordinate system of the image by applying the appropriate set of Euclidean parameters. The transformed position and orientation for the $j$th line in the $m$th model is given by

$$F(z_j^m, \phi_m^{(n)}) = U(\phi_m^{(n)})\, z_j^m + V \phi_m^{(n)}. \qquad (1)$$

The matrix $U(\phi_m^{(n)})$ models the scaling and rotation of coordinates

$$U(\phi_m^{(n)}) = \begin{pmatrix} \phi_{m,4}^{(n)} \cos\phi_{m,3}^{(n)} & -\phi_{m,4}^{(n)} \sin\phi_{m,3}^{(n)} & 0 \\ \phi_{m,4}^{(n)} \sin\phi_{m,3}^{(n)} & \phi_{m,4}^{(n)} \cos\phi_{m,3}^{(n)} & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad (2)$$

while the matrix $V$ selects the translation components for the parameter vector $\phi_m^{(n)}$

$$V = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}. \qquad (3)$$

Before we proceed to detail the EM algorithm, it is worth noting that we have chosen the Euclidean transformation because it is known to accurately model the imaging process for our radar data. However, it must be stressed that the methodology presented in this paper is applicable to a large variety of coordinate transformations.

3. Shape mixtures

In this section we detail our representation of the matching process and describe how the underlying set of transformation parameters can be recovered using the EM algorithm of Dempster et al. (1977). Our algorithm recovers multiple model descriptions in a two-stage process. The expectation step involves estimating a mixture distribution using current parameter values. The maximisation step involves computing new parameter values that optimise the expected value of the weighted data likelihood. This two-stage process is iterated to convergence. In our application the different models are sets of line segments residing within a database of possible image interpretations.
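As a concrete illustration of the model-to-image transformation of Eqs. (1)-(3) in Section 2, the following is a minimal sketch rather than the authors' implementation. The function name `transform_line` and the tuple layouts are assumptions made for illustration; the component order (x-translation, y-translation, rotation, scale) follows the text.

```python
import math


def transform_line(z, phi):
    """Map a model line z = (x, y, theta) into image coordinates.

    phi = (tx, ty, rot, scale) uses the component order given in the text.
    The result is F(z, phi) = U(phi) z + V phi, written out component-wise.
    """
    x, y, theta = z
    tx, ty, rot, scale = phi
    c = scale * math.cos(rot)
    s = scale * math.sin(rot)
    # U(phi) z: scaled rotation of the mid-point; the orientation passes through.
    ux = c * x - s * y
    uy = s * x + c * y
    # V phi: adds the two translations to the mid-point and the rotation
    # angle to the line orientation (third row of V picks out phi_3).
    return (ux + tx, uy + ty, theta + rot)


if __name__ == "__main__":
    # A horizontal line at (1, 0), rotated by 90 degrees, doubled in scale
    # and shifted by (2, 3).
    print(transform_line((1.0, 0.0, 0.0), (2.0, 3.0, math.pi / 2, 2.0)))
```

Note that the orientation is not wrapped to a canonical interval here; a practical implementation would probably need to treat line orientation modulo pi.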

3.1. Mixture model

Basic to our philosophy of exploiting the EM algorithm is the idea that every line segment in the radar data can in principle associate to each of the lines in each of the stored models with some a posteriori probability. This modelling ingredient is naturally incorporated into the fitting process by developing a mixture model over the space of potential matching assignments. Specifically, we aim to construct a mixture model for the conditional data likelihood $p(w \mid \Phi)$. We commence our development by assuming that the measurements of the individual data-lines are conditionally independent given the current matrix of transformation parameters:

$$p(w \mid \Phi) = \prod_{i \in D} p(w_i \mid \Phi). \qquad (4)$$

Our next step is to develop a mixture model for the individual measurement densities, i.e. for $p(w_i \mid \Phi)$. Accordingly, we apply the Bayes rule over the space of potential model-data associations between the available line sets:

$$p(w_i \mid \Phi) = \sum_{m=1}^{K} \sum_{j \in M_m} p(w_i, z_j^m \mid \Phi). \qquad (5)$$

Applying the chain rule, we develop the conditional density under the summation as follows:

$$p(w_i \mid \Phi) = \sum_{m=1}^{K} \sum_{j \in M_m} p(w_i \mid z_j^m, \Phi)\, P(z_j^m \mid \Phi). \qquad (6)$$

Written in this way, the mixture density has two distinct model ingredients. The first of these is the set of individual component conditional measurement densities $p(w_i \mid z_j^m, \Phi)$. This density represents the likelihood that the data-line measurement $w_i$ originated from the line indexed $j$ drawn from the model indexed $m$. Since the model only transforms under the prevailing column vector $\phi_m$ of the matrix $\Phi$, we remove the conditional redundancy and write

$$p(w_i \mid z_j^m, \Phi) = p(w_i \mid z_j^m, \phi_m). \qquad (7)$$

The second ingredient appearing in the mixture distribution is the set of model mixing proportions. We use the shorthand notation $\alpha_{j,m} = P(z_j^m \mid \Phi)$ to represent the mixing proportion for the line $j$ from the model $m$. With these ingredients, we can turn our attention to the log-likelihood function for the parameter matrix, i.e.

$$L(\Phi) = \sum_{i \in D} \ln p(w_i \mid \Phi).$$

Substituting for the mixture distribution given in Eq. (6),

$$L(\Phi) = \sum_{i \in D} \ln \sum_{m=1}^{K} \sum_{j \in M_m} p(w_i \mid z_j^m, \phi_m)\, P(z_j^m \mid \Phi). \qquad (8)$$

In the next section we will describe how the EM algorithm can be applied to this likelihood function to estimate the parameters of the model-transformation matrix $\Phi$.

3.2. Expectation

The expectation step of the EM algorithm is aimed at estimating the data log-likelihood function when the data under consideration is incomplete. In our line-matching example this incompleteness is a consequence of the fact that we do not know how to associate tokens in the image and their counterparts in the set of stored models. It was Dempster et al. (1977) who observed that maximising the weighted log-likelihood was equivalent to maximising the conditional expectation of the log-likelihood for a new parameter set given an old parameter set. For our matching problem, maximisation of the expectation of the conditional likelihood is equivalent to maximising the weighted log-likelihood function

$$Q(\Phi^{(n+1)} \mid \Phi^{(n)}) = \sum_{m=1}^{K} \sum_{i \in D} \sum_{j \in M_m} P(z_j^m \mid w_i, \phi_m^{(n)}) \left\{ \ln p(w_i \mid z_j^m, \phi_m^{(n+1)}) + \ln P(z_j^m \mid \Phi^{(n+1)}) \right\}. \qquad (9)$$

Viewed in this way, the EM algorithm potentially involves two separate maximisation steps for each of the terms under the curly-braces. However, the second term is couched purely in terms of the model representation (i.e. the line mixing proportions) and is hence not of direct relevance to the data likelihood. In other words, we confine our attention to the quantity

$$\hat{Q}(\Phi^{(n+1)} \mid \Phi^{(n)}) = \sum_{m=1}^{K} \sum_{i \in D} \sum_{j \in M_m} P(z_j^m \mid w_i, \phi_m^{(n)}) \ln p(w_i \mid z_j^m, \phi_m^{(n+1)}). \qquad (10)$$

The a posteriori probabilities $P(z_j^m \mid w_i, \phi_m^{(n)})$ play the role of matching weights in the expected likelihood. We interpret these weights as representing the probability of match between the data line indexed $i$ and the model-line indexed $j$ from the model indexed $m$. In other words, they represent model-datum affinities. Using the Bayes rule, we re-write the a posteriori matching probabilities in terms of the conditional measurement densities

$$P(z_j^m \mid w_i, \phi_m^{(n)}) = \frac{\beta_m^{(n)} \alpha_{j,m}^{(n)}\, p(w_i \mid z_j^m, \phi_m^{(n)})}{\sum_{m'=1}^{K} \sum_{j' \in M_{m'}} \beta_{m'}^{(n)} \alpha_{j',m'}^{(n)}\, p(w_i \mid z_{j'}^{m'}, \phi_{m'}^{(n)})}. \qquad (11)$$

The line mixing proportions for each model in turn are computed by averaging the a posteriori probabilities over the set of data lines, i.e.

$$\alpha_{j,m}^{(n+1)} = \frac{1}{|D|} \sum_{i \in D} P(z_j^m \mid w_i, \phi_m^{(n)}).$$

The a posteriori model probabilities are found by summing the relevant set of line mixing proportions, i.e. $\beta_m^{(n+1)} = \sum_{j \in M_m} \alpha_{j,m}^{(n+1)}$. In this way the a posteriori model probabilities sum to unity over the complete set of shape models.

The probability assignment scheme allows for both model overlap and the assessment of ambiguous hypotheses. These probabilities provide a natural mechanism for assessing the significance of the individual lines and models in explaining the current data likelihood. For instance, if $\alpha_{j,m}^{(n)}$ approaches zero, then this indicates that there is no line in the data that matches the line $j$ in the model $m$. In other words, the mixture model provides a natural way not only of accommodating missing segments but also for handling model overlap. It is important to stress that the mixing proportions are iteration dependent, being conditioned upon the current parameter values.

To proceed we require a model for the conditional measurement densities, i.e. $p(w_i \mid z_j^m, \phi_m^{(n)})$. Here we assume that the required model can be specified in terms of a multivariate Gaussian distribution. The random variables appearing in these distributions are the error residuals for the position and orientation predictions of the $j$th model line delivered by the current estimated transformation parameters. Accordingly we write

$$p(w_i \mid z_j^m, \phi_m^{(n)}) = \frac{1}{(2\pi)^{3/2} \sqrt{|\Sigma_m|}} \exp\left[ -\frac{1}{2}\, \epsilon_{i,j,m}(\phi_m^{(n)})^T\, \Sigma_m^{-1}\, \epsilon_{i,j,m}(\phi_m^{(n)}) \right]. \qquad (12)$$

In the above expression $\Sigma_m$ is the variance-covariance matrix for the model indexed $m$. The quantity $\epsilon_{i,j,m}(\phi_m^{(n)}) = w_i - F(z_j^m, \phi_m^{(n)})$ is the vector of error residuals between the data-line indexed $i$ and the transformation of the line indexed $j$ in the model indexed $m$ under the current set of Euclidean parameters, i.e. $\phi_m^{(n)}$. The variance-covariance matrix for the model indexed $m$ is equal to the expected value of the outer-product of error residuals, i.e. $\Sigma_m = E[\epsilon_{i,j,m}(\phi_m^{(n)})\, \epsilon_{i,j,m}(\phi_m^{(n)})^T]$. Initially, we commence with a diagonal estimate of the covariance matrix with x and y components set equal to one another.

3.3. Maximisation

With the ingredients outlined in the previous subsection, the expectation step of the EM algorithm simply reduces to computing the weighted squared error criterion

$$Q'(\Phi^{(n+1)} \mid \Phi^{(n)}) = -\frac{1}{2} \sum_{m=1}^{K} \sum_{i \in D} \sum_{j \in M_m} P(z_j^m \mid w_i, \phi_m^{(n)})\, \epsilon_{i,j,m}(\phi_m^{(n+1)})^T\, \Sigma_m^{-1}\, \epsilon_{i,j,m}(\phi_m^{(n+1)}). \qquad (13)$$

In other words, the a posteriori probabilities $\beta_m^{(n)}$ and $\alpha_{j,m}^{(n)}$ effectively regulate the contributions to the likelihood function. Models or lines for which there is little evidence of match contribute insignificantly, while those which are in good registration dominate.

The maximisation step aims to locate the updated matrix of parameter vectors $\Phi^{(n+1)}$ that optimises the quantity $Q'(\Phi^{(n+1)} \mid \Phi^{(n)})$, i.e.

$$\Phi^{(n+1)} = \arg\max_{\Phi} Q'(\Phi \mid \Phi^{(n)}). \qquad (14)$$
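The expectation-step quantities of Eqs. (11) and (12), the mixing-proportion updates and the weighted squared-error criterion of Eq. (13) can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' code: the covariance is taken to be diagonal and shared by all models (the text only says the initial estimate is diagonal), orientation residuals are not wrapped, and all function and variable names are hypothetical.

```python
import math


def transform_line(z, phi):
    """F(z, phi) = U(phi) z + V phi for z = (x, y, theta), phi = (tx, ty, rot, scale)."""
    x, y, theta = z
    tx, ty, rot, scale = phi
    c, s = scale * math.cos(rot), scale * math.sin(rot)
    return (c * x - s * y + tx, s * x + c * y + ty, theta + rot)


def mahalanobis(w, z, phi, var):
    """Quadratic form e^T Sigma^-1 e, assuming a diagonal Sigma = diag(var)."""
    pred = transform_line(z, phi)
    return sum((a - b) ** 2 / v for a, b, v in zip(w, pred, var))


def density(w, z, phi, var):
    """Gaussian measurement density of Eq. (12) under the diagonal assumption."""
    norm = (2.0 * math.pi) ** 1.5 * math.sqrt(var[0] * var[1] * var[2])
    return math.exp(-0.5 * mahalanobis(w, z, phi, var)) / norm


def e_step(data, models, phis, alpha, beta, var):
    """Posterior matching weights post[m][i][j], following Eq. (11)."""
    post = [[[beta[m] * alpha[m][j] * density(w, z, phis[m], var)
              for j, z in enumerate(lines)] for w in data]
            for m, lines in enumerate(models)]
    for i in range(len(data)):
        total = sum(sum(post[m][i]) for m in range(len(models)))
        if total > 0.0:  # guard: every density may underflow to zero
            for m in range(len(models)):
                post[m][i] = [p / total for p in post[m][i]]
    return post


def update_mixing(post, n_data):
    """alpha[m][j] averages the posteriors over data lines; beta[m] sums alpha over j."""
    alpha = [[sum(rows[i][j] for i in range(n_data)) / n_data
              for j in range(len(rows[0]))] for rows in post]
    beta = [sum(a) for a in alpha]
    return alpha, beta


def weighted_error(data, models, phis, post, var):
    """Weighted squared-error criterion Q' of Eq. (13); it is never positive."""
    return -0.5 * sum(post[m][i][j] * mahalanobis(w, z, phis[m], var)
                      for m, lines in enumerate(models)
                      for i, w in enumerate(data)
                      for j, z in enumerate(lines))


if __name__ == "__main__":
    models = [[(0.0, 0.0, 0.0), (10.0, 0.0, 1.0)],
              [(50.0, 50.0, 0.5), (-40.0, 20.0, 2.0)]]
    data = [(0.1, -0.1, 0.0), (10.0, 0.2, 1.0)]  # noisy copy of model 0
    phis = [(0.0, 0.0, 0.0, 1.0)] * 2            # identity pose for both models
    post = e_step(data, models, phis, [[0.25, 0.25]] * 2, [0.5, 0.5], (1.0, 1.0, 0.1))
    alpha, beta = update_mixing(post, len(data))
    print(beta, weighted_error(data, models, phis, post, (1.0, 1.0, 0.1)))
```

In a full iteration the weights returned by `e_step` would be held fixed while `weighted_error` is maximised over the pose parameters, which is the role played by the optimiser in the maximisation step.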

We solve the implied weighted least-squares minimisation problem using the Levenberg-Marquardt technique. This non-linear optimisation technique offers a compromise between the steepest gradient and inverse Hessian methods. The former is used when far from the optimum while the latter is used close to it. The main advantages are speed of convergence and a reduced susceptibility to local optima. Full details of the optimisation step can be found in the recent account of Moss and Hancock (1996).

4. Experiments

The overall goal of the study reported in this paper is to demonstrate that the EM algorithm can be used to simultaneously recognise and register models from a database. The application vehicle is provided by the problem of recognising and registering hedge patterns from digital maps when millimetre radar images are to hand. The images being used in our study are of rural areas in which the principal man-made linear features available for matching are hedge rows. Fig. 1 shows a typical radar image with the associated area of map. The radar data is delivered as a series of non-overlapping sweeps interspersed with substantial dead-regions. Within each sweep there is both a significant oriented background texture and a systematic variation in background intensity. The linear hedge structures may extend across several adjacent sweeps and are in consequence likely to be broken or fragmented. In other words, the map registration process must be capable of accommodating the matching of single model lines to multiple data-line fragments. It must also accommodate the possibility of missing matches for model lines which fall into the dead-regions. Fig. 1(a) shows an example image. (Cf. Fig. 2.)

Fig. 1. Images to be matched. (a) MDBS image (note both how the sweeps fragment the image features and the highly textured nature of the background). (b) OS map of the same area.

Fig. 2. Table of model matching probabilities.

Fig. 3 shows the database of hedge patterns which have been extracted from a series of digital maps. The ground-truth map data is the set of lines shown in green. In Fig. 4(b) we illustrate the simultaneous maximum likelihood registrations for the different hedge patterns. Fig. 4(a) is the initial line configuration. The raw line features extracted from the radar image are shown in black. The model lines are colour coded in the same manner as Fig. 3. Fig. 4(b) shows the final registration obtained after 6 iterations of the EM algorithm. The green lines correctly register against the data lines. The other map configurations adopt meaningless poses with few correspondences with the raw data. For instance, the red model contracts in scale to fit a few localised lines. The gating probabilities for different models are given in the table shown in Fig. 2. The probability for the correctly matching green model is 0.462. The red model has the lowest matching probability of 0.07.

Fig. 3. Database of colour-coded cartographic models.

Fig. 4. Initial model configurations (left) and final matched configurations (right). The models are colour coded and the black lines are the data segments extracted from the millimetre radar image.

4.1. Sensitivity study

We have performed a sensitivity analysis with the aim of showing the effect of noise and clutter on the model-data matching probabilities, and to show what effect the number of models stored has on these probabilities. In order to do this, we have constructed a number of synthetic problems composed of randomly placed and orientated line segments. Each model set contains only one model which can be accurately registered to the data set.

Our first experiment consists of varying the amount of noise on the data segments and the addition of clutter line segments. Fig. 5 shows the 2D surface of matching probabilities as a function of these two quantities. In this case, noise takes the form of random displacement of the data line segments, measured in terms of the standard deviation of the Gaussian noise. It can be seen that the addition of clutter causes the matching probability to steadily decrease. This can be attributed to the clutter segments forming spurious local minima in the energy function, corresponding to false matches. Clearly the larger the number of clutter segments, the larger the likelihood of false correspondences and hence the lower the probability of a model-data assignment. The relative insensitivity of the method to translational noise can be attributed to the inclusion of line orientation in the cost function and its weighting through the use of the covariance matrix $\Sigma_m$.

Fig. 6 shows how the model matching probability varies as a function of the number of models stored by the system. The green line corresponds to the probability of the true model matching the data, whilst the lower line shows the highest matching probability from any of the remaining models. It can be seen that the true probability falls rapidly as the number of models increases. However, even though the matching probability decreases rapidly, it remains higher than any of the other model matching probabilities.
In this case, taking the model with the maximum a posteriori matching probability always results in the correct model classification. However, it should be noted that this does not guarantee that the final transformation parameters are accurate.

Fig. 5. Effect of noise and clutter on the gating probabilities. The red surface shows the result obtained when there are two models in the database. The green surface is the result obtained when there are seven models.

Fig. 6. The gating probabilities as a function of number of models in the database. The green curve is the correctly matching model; the blue curve is the second-highest ranking model.

Finally, we provide some experiments on the effect of model overlap. We have generated a scene which consists of two noise-corrupted models from our database. We have subsequently systematically deleted a fraction of the lines from one of the models. Fig. 7 shows the gating probabilities for the two models as a function of the fraction of deleted line segments. As the fraction increases so the gating probability of the intact model increases while the gating probability of the corrupted model decreases. Moreover the gating probabilities directly reflect the mixing fractions of the two sets of lines. This study of the effect of line-set decimation on the model gating probabilities illustrates the feasibility of recognition under occlusion.

Fig. 7. The effect of model overlap. The curves show the gating probabilities for two overlapped models as the line set of the blue model is decimated.
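The synthetic problems used in the sensitivity study (randomly placed and oriented segments, Gaussian displacement noise, added clutter, and deletion of a fraction of one model's lines) could be generated along the following lines. This is only a sketch: the paper does not give the sampling distributions or scene extent, so the uniform placement, the `extent` parameter, the choice to leave orientations unperturbed and all function names are assumptions made for illustration.

```python
import math
import random


def random_lines(n, extent, rng):
    """n line tokens (x, y, theta) placed uniformly in a square of side `extent`."""
    return [(rng.uniform(0.0, extent), rng.uniform(0.0, extent),
             rng.uniform(0.0, math.pi)) for _ in range(n)]


def add_noise(lines, sigma, rng):
    """Displace each mid-point by zero-mean Gaussian noise of std. deviation sigma."""
    return [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma), th)
            for x, y, th in lines]


def decimate(lines, fraction, rng):
    """Delete a randomly chosen fraction of the lines, keeping the original order."""
    n_keep = len(lines) - int(round(fraction * len(lines)))
    keep = sorted(rng.sample(range(len(lines)), n_keep))
    return [lines[k] for k in keep]


def make_scene(model, sigma, n_clutter, extent, rng):
    """Noisy copy of one model followed by n_clutter unrelated clutter segments."""
    return add_noise(model, sigma, rng) + random_lines(n_clutter, extent, rng)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the experiment can be repeated
    model = random_lines(20, 100.0, rng)
    scene = make_scene(model, 2.0, 5, 100.0, rng)
    occluded = decimate(model, 0.25, rng)
    print(len(scene), len(occluded))
```

Sweeping `sigma` and `n_clutter` over a grid gives the two axes of a noise-and-clutter study, while sweeping `fraction` for one of two overlapped models gives the decimation axis of the overlap experiment.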

5. Conclusions

Our contribution can be regarded as illustrating the effectiveness of the EM algorithm in recalling models stored in a database and registering them against noisy image data. In other words, the algorithm has significant potential for shape recall from image databases.

There are clearly ways in which the current work can be extended. In the first instance, it would be interesting to know the shape memory capacity of our algorithm. Moreover, there is still considerable scope for assessing the capacity of the technique to handle scenes which contain multiple instances of overlapped models. Studies aimed at answering these questions are in progress and will be reported in due course.

Discussion

Mardia: The classical EM algorithm has a lot of problems. Especially, there could be very large bias. A simple example is that of normal mixtures. In contrast, it is interesting to see that your sensitivity exercise doesn't show any bias or at least it looks like matching well. So there is some contradiction.

Hancock: I don't think we have a problem with model selection here, which I guess is a source of bias in most classical algorithms. In fact we know that there is only one model present, so we don't have to worry about effectively overfitting the data. If, on the other hand, you are trying to apply the EM algorithm to fit Gaussian mixtures to a distribution, then you do not know the model-order a priori.

Mardia: Also, there could be instability or local maxima. It is well known for maximum likelihood estimators that there could be multi-modal solutions.

Hancock: We certainly found that, for instance, we were getting better results with Levenberg-Marquardt than with simple gradient descent methods. So our argument would be that Levenberg-Marquardt is supposed to have better global convergence properties than steepest descent, because it is effectively using the curvature of the minimization surface as well as the slope.
But the optimization process is always the weakness here.

Mardia: The final question is just a simple inquiry. You took a nice example of registration where there was mostly affine variation. So registration consists of taking one shape into another.

Hancock: The first example is based purely on Euclidean registration. The second one was based on perspective registration.

Mardia: So what happens when the deformation is non-linear? There are many practical examples of problems which I am sure you are aware of.

Hancock: That is something we are thinking about at the moment: how to put deformation models into this. We are interested in trying spline deformations or maybe to represent the deformation process using DCT.

Roli: I am a bit confused about the formulation of your problem in the sense that in aerial navigation you are interested in finding the map of the terrain which you are flying over. In your method, at the end you find a mixture of different maps.

Hancock: There are two ways of looking at this. We started off by trying to find the map that gives the highest probability explanation of the data. The alternative is to ask what happens when what you are looking at is actually a mixture of different models. And that is the way in which we are looking at it now. So the practical problem is to use this to model the data where we expect a particular model to provide a total explanation. We have been looking at it more recently as a process which you can use to describe composite or occluding shapes where there is some genuine mixture.

Roli: In the original work by Jordan and Jacobs, this kind of modular neural network was proposed to give a competitive learning algorithm. In your application what is the behavior, is it competitive or cooperative learning?

Hancock: I should first stress that it is a very simplified version of Jordan and Jacobs' algorithm. For instance they have effectively parameters on the gating process, which we don't have. So it is actually half way between EM and Jordan and Jacobs.
So I think you should not think of this as a full blown Jordan and Jacobs implementation. But your point is an interesting one, because this is something we have been trying to study about the algorithm at the moment: is it competitive or is it cooperative?

Roli: In your work, the final stage, is it a stochastic selector or a combiner? Maybe a trade-off would be the best.

Hancock: Well, I think we can use it in both modalities. Your question is very interesting, is it competitive or cooperative, we are trying to evaluate that at the moment.

References

Bialek, W., Deweese, M., 1995. Random switching and optimal processing in the perception of ambiguous figures - A neural network model. Phys. Rev. Lett. 74, 3077-3080.
Callari, F.G., Ferrie, F.P., 1996. Active recognition: using uncertainty to reduce ambiguity. In: Proc. IEEE CVPR Conf., pp. 701-707.
Dempster, A.P., Laird, N.M., Rubin, D.B., 1977. Maximum likelihood from incomplete data via the EM algorithm. J. Royal Statist. Soc. Ser. B 39, 1-38.
Feldman, J.A., Ballard, D.H., 1982. Connectionist models and their properties. Cognitive Science 6, 205-254.
Kawabata, N., Mori, T., 1992. Disambiguating ambiguous figures by a model of selective attention. Biological Cybernetics 67, 417-425.
Minka, T., Picard, R., 1996. Interactive learning with a society of models. In: Proc. IEEE Computer Society Computer Vision and Pattern Recognition Conf., pp. 447-452.
Moss, S., Hancock, E.R., 1996. Cartographic matching onto millimetre radar images. In: Proc. IEEE Computer Society Workshop on Applications of Computer Vision.
Qin, C., Luh, J.Y.S., 1994. Ambiguity reduction in relaxation labelling. Pattern Recognition 27, 165-180.
Riani, M., Simonotto, E., 1994. Stochastic resonance in the perceptual interpretation of ambiguous figures - A neural network approach. Phys. Rev. Lett. 72, 3120-3123.