Multi-Modal Inverse Scattering for Detection and Classification Of General Concealed Targets:
1 Multi-Modal Inverse Scattering for Detection and Classification Of General Concealed Targets: Landmines, Targets Under Trees, Underground Facilities Research Team: Lawrence Carin, Leslie Collins, Qing Liu Duke University Waymond Scott and James McClellan Georgia Institute of Technology George Papanicolaou Stanford University Sponsors: DARPA (Dr. Doug Cochran), ARO (Dr. Russell Harmon)
2 Research Overview
Sensing of landmines, targets under trees, underground structures, and targets in a water channel are very distinct missions, although they fall under the general problem of sensing concealed targets in the presence of a complex, stochastic environment.
Rather than focusing on one of these areas, we exploit their inter-relationships to investigate the general concealed-target problem.
Particular examples will be investigated by connecting members of the MURI to appropriate members of the user community (e.g., landmines: Army Countermine Office, Ft. Belvoir, VA).
Undertake a multi-modal (multi-sensor) approach to effect inversion, with insights from the evolving inversion used to refine/optimize the inter- and intra-sensor parameters.
Will exploit the fact that future (and current) DoD systems will rely increasingly on autonomous vehicles (e.g., multiple robots, UAVs, and UUVs that can be positioned to optimize the sensor platform as the inversion is undertaken).
3 Multi-Modal Inversion of General Targets Embedded In Arbitrary Stochastic Layered Media
[Diagram: general adaptive algorithms and phenomenological insights feed application-specific questions/issues for landmines, underground structures, and concealed ground structures]
4 [Flow diagram: multiple sensors, with multi-sensor physics-based constrained RDOF and phenomenology from forward models, feed direct inversion (statistical inversion, reverse-time migration, general nonlinear inversion) and adaptive processing (adaptive coarse-to-fine pruning, adaptive HMM, multi-sensor mutual information, multi-sensor adaptive Bayesian); intra- and inter-sensor parameters are optimized until inversion confidence is high and the inversion is complete]
5 Key to Program Success: Close Coupling to User Community
MURI team members have a track record of working together and with the responsible parties within the DoD (and contractors).
To assure that the research addresses relevant issues, and is transitioned to the appropriate organizations, it is essential that the MURI team members work closely with the user community.
Close relationships exist with the DoD landmine community (NVESD), and similar relationships are planned for the TUT and underground-structures problems.
Research relevance is assured through the goal of processing data generated by the appropriate DoD community.
6 Duke Team Presentations
8:50-9:20 Lawrence Carin, Optimal Experiments for Adaptive Sensing: Regression, Detection and Classification
9:20-9:50 Leslie Collins, Bayesian Adaptive Multi-Modality Processing for the Discrimination of Landmines
9:50-10:20 Qing Liu, Fast Forward Solvers for Electromagnetic and Elastic Scattering
10:20-10:45 Break
10:45-11:15 George Papanicolaou, Imaging in Clutter
11:15-12:15 Waymond Scott and James McClellan, Detection of Obscured Objects: Signal Processing & Modeling
7 Optimal Experiments for Adaptive Sensing: Regression, Detection and Classification Lawrence Carin, Xuejun Liao, Shihao Yu and Yan Zhang Department of Electrical & Computer Engineering Duke University Durham, NC
8 Outline Optimal adaptive regression with application to EMI sensing of buried targets Optimal adaptive detection and classification with application to multi-aspect sensing (SAR, UUV, UAV) Optimal adaptive training for buried targets with application to UXO Future directions
9 Magnetization Tensor for EMI Sensing
Magnetic field of a rotated EMI dipole:
$$H_z(\Theta; \mathbf{r}_s, \omega_s) = \frac{(\mathbf{r}-\mathbf{r}_s)^T\, U^T M(\omega_s)\, U\, (\mathbf{r}-\mathbf{r}_s)}{\left[(\mathbf{r}-\mathbf{r}_s)^T(\mathbf{r}-\mathbf{r}_s)\right]^{3}}$$
$$M(\omega) = \mathrm{diag}\!\left[\, m_{p0} + \sum_k \frac{\omega\, m_{pk}}{\omega_{pk} + j\omega},\;\; m_{p0} + \sum_k \frac{\omega\, m_{pk}}{\omega_{pk} + j\omega},\;\; m_{z0} + \sum_k \frac{\omega\, m_{zk}}{\omega_{zk} + j\omega} \,\right]$$
The target EMI dipole parameters to be inverted are
$$\Theta = [\, m_{p0}, m_{pk}, \omega_{pk}, m_{z0}, m_{zk}, \omega_{zk}, x, y, z, \theta, \phi \,]^T$$
The sensor parameters that may be varied are
$$\mathbf{p} = [\, x_s, y_s, z_s, \omega_s \,]^T$$
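The dipole model on this slide can be sketched in a few lines of numpy. This is a minimal illustration, not the authors' code: the function names and the toy parameter values are assumptions, and the range dependence follows the dipole form on this slide.

```python
import numpy as np

def magnetization_tensor(omega, m_p0, m_pk, w_pk, m_z0, m_zk, w_zk):
    """Frequency-dependent diagonal magnetization tensor M(omega);
    m_pk, w_pk, m_zk, w_zk are arrays over the pole index k."""
    m_p = m_p0 + np.sum(omega * m_pk / (w_pk + 1j * omega))
    m_z = m_z0 + np.sum(omega * m_zk / (w_zk + 1j * omega))
    return np.diag([m_p, m_p, m_z])

def dipole_field_z(r, r_s, U, M):
    """Dipole response H_z for a target at r observed from sensor
    position r_s, with target orientation matrix U."""
    d = r - r_s
    return (d @ U.T @ M @ U @ d) / (d @ d) ** 3

# toy target: one pole pair per axis, rotated by theta about the y-axis
theta = 0.3
U = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
M = magnetization_tensor(1e3, 1.0, np.array([0.5]), np.array([2e3]),
                         2.0, np.array([0.8]), np.array([1e3]))
Hz = dipole_field_z(np.array([0.0, 0.0, -0.5]),
                    np.array([0.2, 0.1, 0.1]), U, M)
```

Note how the azimuthal symmetry of the target appears directly in the tensor: the first two diagonal entries of M are identical.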
10 [Figure: mobile induction sensor at position r_s above a target at r]
11 [Figure: target dipole orientation (θ, φ) in the x-y-z sensor coordinate system]
12 Measurement Model
The set of N measurement parameters and observations is represented as $\{(\mathbf{p}_n, O_n)\}_{n=1}^N$. Each observation is modeled as the magnetic field plus additive WGN:
$$O = H_z(\Theta; \mathbf{r}_s, \omega_s) + G$$
Given the N measurements $\{(\mathbf{p}_n, O_n)\}_{n=1}^N$, the target model parameters are estimated as
$$\hat{\Theta} = \arg\min_{\Theta} g(\Theta) = \arg\min_{\Theta} \sum_{n=1}^{N} \left| H_z(\Theta; \mathbf{p}_n) - O_n(\mathbf{p}_n) \right|^2$$
$$\text{subject to: } m_{p0},\, m_{pk},\, \omega_{pk},\, m_{z0},\, m_{zk},\, \omega_{zk} \geq 0$$
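The least-squares fit on this slide can be illustrated with a toy scalar forward model standing in for H_z. Everything here is a hypothetical stand-in: a single-pole response replaces the full dipole model, and a crude grid search replaces the constrained nonlinear optimizer.

```python
import numpy as np

# hypothetical scalar forward model standing in for H_z: a single-pole
# frequency response with theta = (amplitude, pole frequency)
def forward(p, theta):
    amp, w0 = theta
    return amp * p / (w0 + 1j * p)          # p plays the role of omega_s

rng = np.random.default_rng(0)
theta_true = (2.0, 500.0)
p_n = np.array([50.0, 100.0, 400.0, 900.0])     # measurement frequencies
noise = 0.01 * (rng.standard_normal(4) + 1j * rng.standard_normal(4))
o_n = forward(p_n, theta_true) + noise          # observations O_n

def g(theta):
    """Least-squares misfit g(theta) = sum_n |f(p_n; theta) - O_n|^2."""
    r = forward(p_n, theta) - o_n
    return float(np.sum(np.abs(r) ** 2))

# crude grid search over the nonnegative parameters, standing in for
# the constrained nonlinear optimizer
amps = np.linspace(0.5, 4.0, 36)
w0s = np.linspace(100.0, 1000.0, 46)
theta_hat = min(((a, w) for a in amps for w in w0s), key=g)
```

With low noise the grid minimizer lands on the grid point nearest the true parameters, which is the behavior the adaptive-sampling results later in the talk improve upon.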
13 Statistical Characterization
The error between the observation and the model is modeled as WGN, where we assume the model is correct:
$$e(\mathbf{p}; \Theta) = O - f(\mathbf{p}; \Theta), \qquad e = e_R + j e_I$$
The associated density function is
$$p(e; \Theta) = \sqrt{\frac{\beta}{2\pi}}\, \exp\!\left(-\frac{\beta e_R^2}{2}\right) \sqrt{\frac{\beta}{2\pi}}\, \exp\!\left(-\frac{\beta e_I^2}{2}\right)$$
The associated Fisher information matrix (iid measurements) is
$$J = \beta \sum_{n=1}^{N} \mathrm{Re}\!\left\{ \left[\nabla_{\Theta} f(\mathbf{p}_n; \Theta)\right] \left[\nabla_{\Theta} f(\mathbf{p}_n; \Theta)\right]^H \right\}$$
The assumption of additive WGN is analogous to linearizing the model about $\hat{\Theta}$ with arbitrary noise:
$$f(\mathbf{p}; \Theta) \approx f(\mathbf{p}; \hat{\Theta}) + \left[\nabla_{\Theta} f(\mathbf{p}; \hat{\Theta})\right]^T (\Theta - \hat{\Theta})$$
14 Information Gain of New Measurement
A measure of the information in N iid measurements:
$$q(\{\mathbf{p}_1, \ldots, \mathbf{p}_N\}) = \left| J(\mathbf{p}_1, \ldots, \mathbf{p}_N) \right|, \qquad J(\mathbf{p}_1, \ldots, \mathbf{p}_N) = \sum_{n=1}^{N} J(\mathbf{p}_n)$$
$$J(\mathbf{p}_n) = \beta\, \mathrm{Re}\!\left\{ \left[\nabla_{\Theta} f(\mathbf{p}_n; \Theta)\right] \left[\nabla_{\Theta} f(\mathbf{p}_n; \Theta)\right]^H \right\}$$
Information gain from adding measurement N+1:
$$\delta q(\mathbf{p}) = \ln q(\{\mathbf{p}_1, \ldots, \mathbf{p}_N, \mathbf{p}_{N+1}\}) - \ln q(\{\mathbf{p}_1, \ldots, \mathbf{p}_N\}) = \ln \left| I + \beta\, F^T B_N^{-1} F \right|$$
$$F = [F_R, F_I] = \left[ \mathrm{Re}\{\nabla_{\Theta} f(\mathbf{p}_{N+1}; \Theta)\},\; \mathrm{Im}\{\nabla_{\Theta} f(\mathbf{p}_{N+1}; \Theta)\} \right], \qquad B_N = \sum_{n=1}^{N} J(\mathbf{p}_n)$$
Sensor parameters for measurement N+1:
$$\mathbf{p}_{N+1} = \arg\max_{\mathbf{p}} \delta q(\mathbf{p}) = \arg\max_{\mathbf{p}} \ln \left| I + \beta\, F^T B_N^{-1} F \right|$$
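The selection rule on this slide can be sketched directly, assuming the model gradients at the past measurements are available. The gradients below are random placeholders for ∇_Θ f; in practice they come from the forward model.

```python
import numpy as np

rng = np.random.default_rng(1)
beta = 100.0                       # inverse noise variance
dim = 4                            # dimension of Theta

# placeholder complex gradients grad_Theta f(p_n; Theta) at 6 past
# measurements (stand-ins for the forward-model gradients)
grads = rng.standard_normal((6, dim)) + 1j * rng.standard_normal((6, dim))
B_N = beta * sum(np.real(np.outer(g, g.conj())) for g in grads)

def info_gain(grad_new):
    """delta q(p) = ln det(I + beta F^T B_N^{-1} F), F = [Re grad, Im grad]."""
    F = np.column_stack([grad_new.real, grad_new.imag])
    G = np.eye(2) + beta * F.T @ np.linalg.solve(B_N, F)
    return float(np.log(np.linalg.det(G)))

# greedy choice among candidate next-measurement settings
candidates = rng.standard_normal((20, dim)) + 1j * rng.standard_normal((20, dim))
gains = [info_gain(c) for c in candidates]
best = int(np.argmax(gains))
```

Because the argument of the determinant is the identity plus a positive-semidefinite matrix, the information gain is never negative: a new measurement can only add information.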
15 Accounting for Measurement Cost
The previous theory chose new sensor parameters based on information gain alone; it is desirable to account as well for measurement cost. A simple augmentation of the cost function penalizes the change in sensor parameters:
$$\delta q_{\psi}(\mathbf{p}) = \ln \left| I + \beta\, F^T B_N^{-1} F \right| - \mu\, c(\mathbf{p}, \mathbf{p}_N), \qquad c(\mathbf{p}, \mathbf{p}_N) = (\mathbf{p} - \mathbf{p}_N)^T W^T W\, (\mathbf{p} - \mathbf{p}_N)$$
16 [Figure: sensor positions (digits) and source dipoles (stars) in the x-y plane, with sensing angles]
17 [Figure: sensor frequency (Hz) used in the fixed grid vs. iteration number]
18 [Figure: sensor positions and source dipoles in the x-y plane]
19 [Figure: optimal sensor frequency (Hz) from the search strategy vs. iteration number]
20 [Figure: mean squared error (MSE) of the model-parameter estimate vs. number of data points, for optimal and fixed sampling points]
21 No Constraints On Sensor Motion [Figure: sensor positions and source dipoles in the x-y plane]
22 [Figure: optimal sensor frequency (Hz) as a function of iteration number]
23 Constrained Sensor Motion [Figure: sensor positions and source dipoles in the x-y plane]
24 [Figure: optimal sensor frequency (Hz) as a function of iteration number]
25 Information-Gain Surface, Measurement One
26 Information-Gain Surface, Measurement Two
27 Information-Gain Surface, Measurement Three
28 Information-Gain Surface, Measurement Four
29 Information-Gain Surface, Measurement Five
30 [Table: true dipole parameters (m_p0, m_p1, ω_p1, m_z0, m_z1, ω_z1, x, y, z, θ, φ) for the true dipoles, compared with dipoles fitted from all selected data and from the initial single datum]
31 Outline Optimal adaptive regression with application to EMI sensing of buried targets Optimal adaptive detection and classification with application to multi-aspect sensing (SAR, UUV, UAV) Optimal adaptive training for buried targets with application to UXO Future directions
32 Basic Construct
[Figure: sensors S1-S4 viewing a target from different aspects]
Scattering data can be segmented into angular bins, each characterized by particular physics. Each such angular range is termed a state (S1, S2, ..., SN).
33 Hidden Markov Models
[Figure: three-state HMM with initial-state probabilities π_1, π_2, π_3, states s_1, s_2, s_3, and transition probabilities a_ij]
34 [Figure: five example targets at aspect angle φ, including a target with ribs and a target with internal oscillators]
35 Theory of Optimal Experiments
In previous HMM classification studies the path of the sensor was fixed, with uniform spatial sampling. For a UUV/UAV, one may be able to adaptively control the sensor path.
Question: can we optimize the UUV/UAV path to most efficiently classify a submerged target (mine)?
Employ the theory of optimal experiments. Assume that we wish to distinguish among K targets, and we perform t measurements, or observations, O_t = {O_1, ..., O_t}.
ML classification: choose target T_i if p(O_t | T_i) > p(O_t | T_k) for all k ≠ i.
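The ML rule on this slide can be sketched with a scaled forward algorithm for discrete-observation HMMs. The two-state models and emission tables below are toy placeholders, not the target models from this study.

```python
import numpy as np

def hmm_loglik(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence under an HMM,
    computed with the scaled forward algorithm (pi: initial-state
    probabilities, A: transitions, B: per-state emission table)."""
    alpha = pi * B[:, obs[0]]
    ll = np.log(alpha.sum())
    alpha = alpha / alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        ll += np.log(alpha.sum())
        alpha = alpha / alpha.sum()
    return ll

pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])
# hypothetical emission tables for two targets (rows: states,
# columns: discrete scattering symbols)
B1 = np.array([[0.9, 0.1], [0.8, 0.2]])
B2 = np.array([[0.1, 0.9], [0.2, 0.8]])

obs = [0, 0, 1, 0, 0, 0]           # sequence typical of target 1
ll1 = hmm_loglik(obs, pi, A, B1)
ll2 = hmm_loglik(obs, pi, A, B2)
choice = 1 if ll1 > ll2 else 2     # ML classification rule
```

The running rescaling of alpha keeps the recursion numerically stable for long sequences while accumulating the exact log-likelihood.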
36 Theory of Optimal Experiments - 2
Assume that we have performed t measurements, yielding O_t = {O_1, ..., O_t} at the relative angles θ_2, ..., θ_t.
Note: since we don't know the target orientation, we can only control the relative sensor angles with respect to the target.
Assume we have the target-dependent statistical models p(O_1, ..., O_t | θ_2, ..., θ_t, T_k), for example from an HMM.
At what relative angle θ_{t+1} should we perform measurement t+1 to optimize target classification?
37 Theory of Optimal Experiments - 3
Consider the posterior probabilities p(T_k | O_1, ..., O_t, θ_2, ..., θ_t) for all targets.
Idea: choose measurement t+1 to minimize the entropy characteristic of these density functions:
$$H_t(\theta_2, \ldots, \theta_t) = -\sum_{k=1}^{K} p(T_k \mid O_1, \ldots, O_t, \theta_2, \ldots, \theta_t) \log p(T_k \mid O_1, \ldots, O_t, \theta_2, \ldots, \theta_t)$$
Issue: when considering different candidate angles θ_{t+1}, we do not actually have access to O_{t+1}.
Solution: we compute the expected entropy, integrating over our density function for O_{t+1}.
Key: with an HMM we need not assume independent measurements, and the HMM computation is efficient.
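The expected-entropy criterion can be sketched for a discrete observation alphabet. The posterior and the predictive tables below are invented for illustration; the angles are hypothetical candidates.

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

post = np.array([0.5, 0.3, 0.2])    # current posterior over K=3 targets

# hypothetical predictive tables p(O_{t+1} | T_k, theta) over a binary
# observation alphabet, one table per candidate relative angle (deg)
pred = {
    15: np.array([[0.9, 0.1], [0.5, 0.5], [0.5, 0.5]]),   # discriminative
    90: np.array([[0.6, 0.4], [0.6, 0.4], [0.6, 0.4]]),   # uninformative
}

def expected_entropy(P):
    """E_O[H(posterior after observing O)], O marginalized over targets."""
    p_obs = post @ P                         # p(O) = sum_k post_k p(O | T_k)
    h = 0.0
    for o, po in enumerate(p_obs):
        new_post = post * P[:, o] / po       # Bayes update for outcome o
        h += po * entropy(new_post)
    return h

best_angle = min(pred, key=lambda a: expected_entropy(pred[a]))
```

The uninformative angle leaves the expected entropy exactly at the current posterior entropy, while the discriminative angle strictly lowers it, so the criterion prefers the discriminative look.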
38 Theory of Optimal Experiments - 4
We can perform this efficiently utilizing our previous work with HMMs.
[Figure: responses for Targets 1-5, frequency (kHz) vs. aspect angle (deg)]
39 Confusion Matrix (5 points)
[Tables: confusion matrices over targets T1-T5]
Uniform sampling at 5°: average correct classification 77.72%.
Uniform sampling at 45°: average 86.89%.
40 Optimal sampling: average 92.89%.
41 Objective Function [Figure: change in entropy (δ) vs. sample rate, with θ_5 marked]
42 [Figure: classification probability for Targets 1-5 vs. number of observations, for one set of sample positions and δ]
43 [Figure: classification probability for Targets 1-5 vs. number of observations, for a second set of sample positions and δ]
44 [Figure: entropy vs. number of observations for adaptive, 5-degree, and 48-degree sampling]
45 Optimal Multi-Aspect Target Detection
The previous results considered optimal sensing for the classification problem, where we assumed knowledge of HMMs for each of the targets.
Often we have HMMs for known targets but may come across objects we haven't seen previously.
Can we extend the theory of optimal experiments to the case in which we have HMMs for only a subset of the objects we may encounter? This is relevant for target detection.
46 Optimal Multi-Aspect Target Detection - 2
Assume that we have performed t measurements, yielding O_t = {O_1, ..., O_t} at the relative angles θ_2, ..., θ_t.
The log-likelihood of these measurements, for the known target T, is log p(O_1, O_2, ..., O_t | θ_1, θ_2, ..., θ_t, T).
We perform the next measurement at the change of angle θ_{t+1} that maximizes the expected increase in log-likelihood:
$$\theta_{t+1} = \arg\max_{\theta_{t+1}} E_{O_{t+1} \mid O_1, \ldots, O_t, \theta_1, \ldots, \theta_t, \theta_{t+1}} \left[ \log p(O_1, \ldots, O_t, O_{t+1} \mid \theta_1, \ldots, \theta_t, \theta_{t+1}, T) \right]$$
This corresponds to minimizing the entropy (uncertainty) of whether the target under interrogation corresponds to target T.
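Under the model for target T, the expectation of the next log-likelihood increment over O_{t+1} is the negative entropy of the predictive distribution, so peaked predictions are preferred. A toy sketch of that comparison follows; the candidate angles and predictive tables are hypothetical.

```python
import numpy as np

angles = [5, 22, 48]                        # candidate relative angles (deg)
# hypothetical predictive distributions p(O_{t+1} | O_1..O_t, theta, T)
# over a 4-symbol observation alphabet, one row per candidate angle
p_next = np.array([[0.70, 0.10, 0.10, 0.10],
                   [0.40, 0.30, 0.20, 0.10],
                   [0.25, 0.25, 0.25, 0.25]])

def expected_loglik_gain(p):
    """E_O[log p(O | history, theta, T)] under the model itself,
    which equals the negative entropy of the predictive distribution."""
    return float(np.sum(p * np.log(p)))

gains = [expected_loglik_gain(p) for p in p_next]
best = angles[int(np.argmax(gains))]
```

The most peaked prediction wins, mirroring the slide's point that the rule drives the sensor toward looks that most confirm (or deny) the hypothesis "this is target T."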
47 NSWC Data: 6 Targets [Figure: log-magnitude responses vs. frequency (kHz) for targets including a bullet, plastic cone, 55-gal. drum, Rock 1, and Rock 2]
48 [Figure: ROC curves (P_d vs. P_f) between Target 5 and the NSWC targets (Dobeck), for adaptive, 5-degree, and 22-degree sampling]
49 [Figure: log-likelihood under uniform 22° sampling vs. optimal sampling]
50 Outline Optimal adaptive regression with application to EMI sensing of buried targets Optimal adaptive detection and classification with application to multi-aspect sensing (SAR, UUV, UAV) Optimal adaptive training for buried targets with application to UXO Future directions
51 Background
UXO versus clutter: a classification problem, with labeled and unlabeled data.
Data are easy to observe, but excavation is very costly.
Cost is reduced through smart learning, with no prior training data required.
52 Step 1: Collect Data
53 Step 2: Choose Representative Signatures (Basis Functions), No Digging
54 Step 3: Dig First Hole to Learn Label
55 Step 4: Refine Classifier and then Dig Hole Two
56 Step 5: Refine Classifier and then Dig Hole Three, etc.
57 Step 6: Obtain Set of Labeled Signatures and Complete Classifier Design after Information Gain Accrued By New Labels is Minimal
58 Classifier design: A Sketch
Building the classifier has three parts:
1. Select basis functions sequentially to build the structure of the classifier, using all available (unlabeled) data. The structure is designed to capture the maximum information from the data set.
2. Sequentially select the data most informative to the classifier structure determined in Part 1, discover their labels, and refine the information in the structure.
3. Estimate the weights of the classifier with the labeled data selected in Part 2.
59 Classifier design: Kernel Machine
A kernel machine is a linear classifier in a high-dimensional space; it can produce complex decision boundaries while keeping generalization ability high.
$$f_n(\mathbf{x}) = \sum_{i=1}^{n} w_{n,i}\, K(\mathbf{c}_i, \mathbf{x}) + w_{n,0} = \mathbf{w}_n^T \Phi_n(\mathbf{x})$$
Basis functions: $\Phi_n(\cdot) = [1, K(\mathbf{c}_1, \cdot), K(\mathbf{c}_2, \cdot), \ldots, K(\mathbf{c}_n, \cdot)]^T$
Weights: $\mathbf{w}_n = [w_{n,0}, w_{n,1}, w_{n,2}, \ldots, w_{n,n}]^T$
Kernel: $K(\mathbf{c}_i, \mathbf{x})$ measures the similarity between $\mathbf{c}_i$ and $\mathbf{x}$ in the high-dimensional space.
Relevant vectors: $\mathbf{c}_i,\; i = 1, \ldots, n$
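A minimal sketch of this kernel machine with an RBF kernel; the centers, weights, and kernel width here are illustrative choices, not values from the study.

```python
import numpy as np

def rbf(a, b, gamma=1.0):
    """RBF kernel K(a, b) = exp(-gamma * ||a - b||^2)."""
    return np.exp(-gamma * np.sum((a - b) ** 2))

def phi(x, centers, kernel=rbf):
    """Basis vector Phi_n(x) = [1, K(c_1, x), ..., K(c_n, x)]^T."""
    return np.array([1.0] + [kernel(c, x) for c in centers])

def f(x, centers, w):
    """Kernel-machine decision function f_n(x) = w_n^T Phi_n(x)."""
    return float(w @ phi(x, centers))

centers = [np.array([0.0, 0.0]), np.array([1.0, 1.0])]   # relevant vectors
w = np.array([-0.5, 1.0, 1.0])                           # [w_0, w_1, w_2]
score = f(np.array([0.0, 0.0]), centers, w)
```

The sign of the score would give the UXO-versus-clutter decision; the leading 1 in the basis vector carries the bias weight w_0.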
60 Classifier design: Selection of Basis Functions (1)
With the whole unlabeled data set X = {x_1, x_2, ..., x_N}, basis functions are selected sequentially by a Fisher Information Criterion (FIC) to capture the data characteristics.
Assume a linear kernel machine has n basis functions; then the Fisher information matrix for the weights is
$$M_n = \sum_{i=1}^{N} \Phi_n(\mathbf{x}_i)\, \Phi_n^T(\mathbf{x}_i)$$
M_n indicates the information from the data set X for the current structure of the classifier. Its inverse is the Cramer-Rao Bound (CRB).
61 Classifier design: Selection of Basis Functions (2)
When a new kernel Φ_{n+1}(·) = K(c_{n+1}, ·), c_{n+1} ∈ X, is added to the basis functions, the new Fisher information matrix is
$$M_{n+1} = \sum_{i=1}^{N} \Phi_{n+1}(\mathbf{x}_i)\, \Phi_{n+1}^T(\mathbf{x}_i)$$
The information gain from adding the new kernel is defined as follows; it can also be viewed as the drop in the CRB:
$$I_{\mathrm{kernel}} = \ln \mathrm{Det}(M_{n+1}) - \ln \mathrm{Det}(M_n)$$
62 Classifier design: Selection of Basis Functions (3)
For the new kernel K(c_{n+1}, ·), writing φ_{n+1}(x) = K(c_{n+1}, x), the information gain is equal to
$$\ln I_{\mathrm{kernel}}(\mathbf{c}_{n+1}) = \ln\!\left( \sum_{i=1}^{N} \varphi_{n+1}^2(\mathbf{x}_i) - \left[\sum_{i=1}^{N} \varphi_{n+1}(\mathbf{x}_i)\, \Phi_n(\mathbf{x}_i)\right]^T M_n^{-1} \left[\sum_{i=1}^{N} \varphi_{n+1}(\mathbf{x}_i)\, \Phi_n(\mathbf{x}_i)\right] \right)$$
The optimal new kernel is then selected by a greedy search to maximize the information gain:
$$\mathbf{c}_{n+1} = \arg\max_{\mathbf{c} \in X \setminus \{\mathbf{c}_1, \ldots, \mathbf{c}_n\}} \ln I_{\mathrm{kernel}}(\mathbf{c})$$
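The greedy search can be implemented with a Schur-complement update: adding a kernel column changes the log-determinant of the Fisher matrix by the log of the column's squared norm orthogonal to the current basis. A sketch on synthetic data follows; the RBF kernel, its width, and the data sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((40, 2))          # unlabeled feature vectors

def kcol(X, c, gamma=0.5):
    """Candidate kernel column [K(c, x_1), ..., K(c, x_N)]^T (RBF)."""
    return np.exp(-gamma * np.sum((X - c) ** 2, axis=1))

def select_bases(X, n_bases):
    """Greedy FIC selection: repeatedly add the center whose kernel
    column maximizes ln(phi^T phi - b^T M_n^{-1} b), b = A^T phi."""
    A = np.ones((len(X), 1))              # design matrix, rows Phi_n(x_i)^T
    chosen = []
    for _ in range(n_bases):
        M_inv = np.linalg.inv(A.T @ A)    # inverse of current Fisher matrix
        best_gain, best_j = -np.inf, -1
        for j in range(len(X)):
            if j in chosen:
                continue
            col = kcol(X, X[j])
            b = A.T @ col
            gain = np.log(col @ col - b @ M_inv @ b)   # Schur complement
            if gain > best_gain:
                best_gain, best_j = gain, j
        chosen.append(best_j)
        A = np.column_stack([A, kcol(X, X[best_j])])
    return chosen, A

chosen, A = select_bases(X, 3)
```

Each iteration costs one matrix inverse plus a pass over the candidates, rather than recomputing a determinant per candidate.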
63 Classifier design: Selection of Labeled Data (1)
After a set of most-informative basis functions Φ_{n0}(·) has been selected, labeled data are required to estimate the weights and thereby determine the classifier.
A second FIC is utilized to assign a priority to each unlabeled datum, and those most informative for the estimation are labeled first by digging them out.
The classifier is therefore determined with substantially less excavation effort, which is the most costly part of the UXO problem.
64 Classifier design: Selection of Labeled Data (2)
Labeled-data selection is also a sequential process. Assume we have a subset of labeled data X_L ⊂ X, X_L = {x_1, x_2, ..., x_L}; the Fisher information matrix with the data set X_L is
$$M_n(X_L) = \sum_{l=1}^{L} \Phi_n(\mathbf{x}_l)\, \Phi_n^T(\mathbf{x}_l)$$
When a new datum x_{L+1} is added, the new Fisher information matrix with the data set X_{L+1} is
$$M_n(X_{L+1}) = M_n(X_L) + \Phi_n(\mathbf{x}_{L+1})\, \Phi_n^T(\mathbf{x}_{L+1})$$
65 Classifier design: Selection of Labeled Data (3)
The information gain from adding a new datum is defined by
$$I_{\mathrm{data}} = \ln \mathrm{Det}(M_{n_0}(X_{L+1})) - \ln \mathrm{Det}(M_{n_0}(X_L))$$
For the datum x_{L+1}, the information gain is
$$\ln I_{\mathrm{data}}(\mathbf{x}_{L+1}) = \ln\!\left( 1 + \Phi_n^T(\mathbf{x}_{L+1})\, M_n^{-1}(X_L)\, \Phi_n(\mathbf{x}_{L+1}) \right)$$
The next most-informative datum is then selected by a greedy search, and its label is recovered for the estimation of the weights:
$$\mathbf{x}_{L+1} = \arg\max_{\mathbf{x} \in X \setminus X_L} \ln I_{\mathrm{data}}(\mathbf{x})$$
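A sketch of this sequential labeling rule on synthetic basis vectors. The small ridge term is an added assumption of this sketch, to keep M invertible before enough data have been labeled.

```python
import numpy as np

rng = np.random.default_rng(4)
Phi = rng.standard_normal((100, 5))        # rows: basis vectors Phi_n(x_i)^T

def select_to_label(Phi, n_label, ridge=1e-6):
    """Pick data maximizing ln(1 + Phi^T M^{-1} Phi): the largest drop
    in the Cramer-Rao bound on the weights per excavated item."""
    labeled = []
    M = ridge * np.eye(Phi.shape[1])       # ridge keeps M invertible early
    for _ in range(n_label):
        M_inv = np.linalg.inv(M)
        gains = np.array([np.log1p(v @ M_inv @ v) for v in Phi])
        gains[labeled] = -np.inf           # never relabel the same datum
        j = int(np.argmax(gains))
        labeled.append(j)
        M = M + np.outer(Phi[j], Phi[j])   # rank-one Fisher update
    return labeled

picked = select_to_label(Phi, 8)
```

The rank-one structure of the update is what makes the per-datum gain a scalar, so the greedy search over all unlabeled data is cheap.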
66 Classifier design: Weights Estimation
With n_0 basis functions Φ_{n0}(·) and L_0 labeled data {X_{L0}, Y_{L0}} selected by FIC, the weights w_{n0} of the classifier are obtained via a Best Linear Unbiased Estimate (BLUE):
$$M_n(X_L) = \sum_{l=1}^{L} \Phi_n(\mathbf{x}_l)\, \Phi_n^T(\mathbf{x}_l)$$
$$\mathbf{w}_n = M_n^{-1}(X_L) \sum_{l=1}^{L} \Phi_n(\mathbf{x}_l)\, y_l$$
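On noise-free synthetic labels the BLUE reduces to solving M w = Σ_l Φ(x_l) y_l and recovers the generating weights exactly; the sizes and true weights below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(5)
Phi_L = rng.standard_normal((30, 4))       # basis vectors of the labeled data
w_true = np.array([1.0, -2.0, 0.5, 3.0])   # hypothetical generating weights
y = Phi_L @ w_true                         # noise-free labels for the check

# w = M^{-1} sum_l Phi(x_l) y_l, with M = sum_l Phi(x_l) Phi(x_l)^T
M = Phi_L.T @ Phi_L
w_hat = np.linalg.solve(M, Phi_L.T @ y)
```

Using `solve` on the normal equations avoids forming the explicit inverse of M, which is both cheaper and better conditioned.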
67 Overall Procedure
Collect magnetometer and induction data at a given site (no training data).
Select a set of example signatures for the basis functions (no digging required).
Sequentially dig up UXO/clutter, one at a time, to learn the labels of particular signatures.
Using the signatures and labels, refine the classifier; using the new classifier, choose the next example to dig and learn its label, then refine again.
Continue digging and learning labels until the information gain to the classifier is minimal; the classifier design is then complete.
Using the final classifier, discriminate UXO from clutter among the remaining un-excavated objects.
68 Results
The proposed method is tested with data from the JPG V demonstration site, which was deployed to assess different technologies for UXO detection and discrimination under realistic scenarios. Both GEM-3 data and magnetometer data are considered.
Features are extracted via model fitting; magnetometer feature extraction is similar to that for the induction sensor, but much simpler. A typical set of features is utilized for the classification.
The data are from two adjoining areas and include a total of 40 UXO: 60 mm and 81 mm mortars; 57 mm, 76 mm, 105 mm, 152 mm, and 155 mm projectiles; and 2.75-inch rockets. After the model-fitting-based prescreening operation, 260 clutter items remain.
69 Results
There are 16 UXO and 112 clutter items in area 3, which was originally designated as the training area. We observe that most UXO types present in one area also appear in the other, which suggests a deliberate effort to keep the training data matched to the test data; we feel this may not be feasible in practice.
The ROC curve of the proposed active-training method is presented in comparison with results of a standard Kernel Matching Pursuit (KMP) classifier over 100 random training selections. KMP is the classifier most closely related to the method proposed here, and it has achieved superior results compared with several state-of-the-art methods.
70 [Figure: ROC comparison with 128 labeled data]
71 [Figure: ROC comparison with 90 labeled data]
72 [Figure: ROC comparison with 60 labeled data]
73 [Figure: ROC comparison with 40 labeled data]
74 Future Directions
Thus far we have considered optimization of the parameters (e.g., position, frequency) of single sensors: EMI, acoustic, and magnetometer.
We must now place optimal sensing in the context of multiple sensors operating collectively (e.g., a team of robots and/or UAVs).
More informationBayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2014
Bayesian Networks: Construction, Inference, Learning and Causal Interpretation Volker Tresp Summer 2014 1 Introduction So far we were mostly concerned with supervised learning: we predicted one or several
More informationBayesian Networks Inference with Probabilistic Graphical Models
4190.408 2016-Spring Bayesian Networks Inference with Probabilistic Graphical Models Byoung-Tak Zhang intelligence Lab Seoul National University 4190.408 Artificial (2016-Spring) 1 Machine Learning? Learning
More informationGeneralization Error on Pruning Decision Trees
Generalization Error on Pruning Decision Trees Ryan R. Rosario Computer Science 269 Fall 2010 A decision tree is a predictive model that can be used for either classification or regression [3]. Decision
More information(2) I. INTRODUCTION (3)
2686 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 58, NO. 5, MAY 2010 Active Learning and Basis Selection for Kernel-Based Linear Models: A Bayesian Perspective John Paisley, Student Member, IEEE, Xuejun
More informationHidden Markov Models (HMM) and Support Vector Machine (SVM)
Hidden Markov Models (HMM) and Support Vector Machine (SVM) Professor Joongheon Kim School of Computer Science and Engineering, Chung-Ang University, Seoul, Republic of Korea 1 Hidden Markov Models (HMM)
More informationLinear Models for Classification
Linear Models for Classification Oliver Schulte - CMPT 726 Bishop PRML Ch. 4 Classification: Hand-written Digit Recognition CHINE INTELLIGENCE, VOL. 24, NO. 24, APRIL 2002 x i = t i = (0, 0, 0, 1, 0, 0,
More informationFinal Exam, Spring 2006
070 Final Exam, Spring 2006. Write your name and your email address below. Name: Andrew account: 2. There should be 22 numbered pages in this exam (including this cover sheet). 3. You may use any and all
More informationAugmented Statistical Models for Speech Recognition
Augmented Statistical Models for Speech Recognition Mark Gales & Martin Layton 31 August 2005 Trajectory Models For Speech Processing Workshop Overview Dependency Modelling in Speech Recognition: latent
More informationIntroduction to Logistic Regression
Introduction to Logistic Regression Guy Lebanon Binary Classification Binary classification is the most basic task in machine learning, and yet the most frequent. Binary classifiers often serve as the
More informationFeature selection. c Victor Kitov August Summer school on Machine Learning in High Energy Physics in partnership with
Feature selection c Victor Kitov v.v.kitov@yandex.ru Summer school on Machine Learning in High Energy Physics in partnership with August 2015 1/38 Feature selection Feature selection is a process of selecting
More informationDiscriminative Learning in Speech Recognition
Discriminative Learning in Speech Recognition Yueng-Tien, Lo g96470198@csie.ntnu.edu.tw Speech Lab, CSIE Reference Xiaodong He and Li Deng. "Discriminative Learning in Speech Recognition, Technical Report
More informationComplexity of stochastic branch and bound methods for belief tree search in Bayesian reinforcement learning
Complexity of stochastic branch and bound methods for belief tree search in Bayesian reinforcement learning Christos Dimitrakakis Informatics Institute, University of Amsterdam, Amsterdam, The Netherlands
More informationSLAM Techniques and Algorithms. Jack Collier. Canada. Recherche et développement pour la défense Canada. Defence Research and Development Canada
SLAM Techniques and Algorithms Jack Collier Defence Research and Development Canada Recherche et développement pour la défense Canada Canada Goals What will we learn Gain an appreciation for what SLAM
More informationProbabilistic Time Series Classification
Probabilistic Time Series Classification Y. Cem Sübakan Boğaziçi University 25.06.2013 Y. Cem Sübakan (Boğaziçi University) M.Sc. Thesis Defense 25.06.2013 1 / 54 Problem Statement The goal is to assign
More informationCSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18
CSE 417T: Introduction to Machine Learning Final Review Henry Chai 12/4/18 Overfitting Overfitting is fitting the training data more than is warranted Fitting noise rather than signal 2 Estimating! "#$
More informationPassive Sonar Detection Performance Prediction of a Moving Source in an Uncertain Environment
Acoustical Society of America Meeting Fall 2005 Passive Sonar Detection Performance Prediction of a Moving Source in an Uncertain Environment Vivek Varadarajan and Jeffrey Krolik Duke University Department
More informationSparse Sensing for Statistical Inference
Sparse Sensing for Statistical Inference Model-driven and data-driven paradigms Geert Leus, Sundeep Chepuri, and Georg Kail ITA 2016, 04 Feb 2016 1/17 Power networks, grid analytics Health informatics
More informationLearning Multiple Tasks with a Sparse Matrix-Normal Penalty
Learning Multiple Tasks with a Sparse Matrix-Normal Penalty Yi Zhang and Jeff Schneider NIPS 2010 Presented by Esther Salazar Duke University March 25, 2011 E. Salazar (Reading group) March 25, 2011 1
More informationOverview of Statistical Tools. Statistical Inference. Bayesian Framework. Modeling. Very simple case. Things are usually more complicated
Fall 3 Computer Vision Overview of Statistical Tools Statistical Inference Haibin Ling Observation inference Decision Prior knowledge http://www.dabi.temple.edu/~hbling/teaching/3f_5543/index.html Bayesian
More informationDimension Reduction (PCA, ICA, CCA, FLD,
Dimension Reduction (PCA, ICA, CCA, FLD, Topic Models) Yi Zhang 10-701, Machine Learning, Spring 2011 April 6 th, 2011 Parts of the PCA slides are from previous 10-701 lectures 1 Outline Dimension reduction
More informationCS6375: Machine Learning Gautam Kunapuli. Decision Trees
Gautam Kunapuli Example: Restaurant Recommendation Example: Develop a model to recommend restaurants to users depending on their past dining experiences. Here, the features are cost (x ) and the user s
More informationVariations. ECE 6540, Lecture 10 Maximum Likelihood Estimation
Variations ECE 6540, Lecture 10 Last Time BLUE (Best Linear Unbiased Estimator) Formulation Advantages Disadvantages 2 The BLUE A simplification Assume the estimator is a linear system For a single parameter
More informationLECTURE 18: NONLINEAR MODELS
LECTURE 18: NONLINEAR MODELS The basic point is that smooth nonlinear models look like linear models locally. Models linear in parameters are no problem even if they are nonlinear in variables. For example:
More informationSupport Vector Machine. Industrial AI Lab. Prof. Seungchul Lee
Support Vector Machine Industrial AI Lab. Prof. Seungchul Lee Classification (Linear) Autonomously figure out which category (or class) an unknown item should be categorized into Number of categories /
More informationDoug Cochran. 3 October 2011
) ) School Electrical, Computer, & Energy Arizona State University (Joint work with Stephen Howard and Bill Moran) 3 October 2011 Tenets ) The purpose sensor networks is to sense; i.e., to enable detection,
More informationIntroduction to Maximum Likelihood Estimation
Introduction to Maximum Likelihood Estimation Eric Zivot July 26, 2012 The Likelihood Function Let 1 be an iid sample with pdf ( ; ) where is a ( 1) vector of parameters that characterize ( ; ) Example:
More informationParametric Models. Dr. Shuang LIANG. School of Software Engineering TongJi University Fall, 2012
Parametric Models Dr. Shuang LIANG School of Software Engineering TongJi University Fall, 2012 Today s Topics Maximum Likelihood Estimation Bayesian Density Estimation Today s Topics Maximum Likelihood
More informationECE521 Lecture7. Logistic Regression
ECE521 Lecture7 Logistic Regression Outline Review of decision theory Logistic regression A single neuron Multi-class classification 2 Outline Decision theory is conceptually easy and computationally hard
More informationCOMP 551 Applied Machine Learning Lecture 21: Bayesian optimisation
COMP 55 Applied Machine Learning Lecture 2: Bayesian optimisation Associate Instructor: (herke.vanhoof@mcgill.ca) Class web page: www.cs.mcgill.ca/~jpineau/comp55 Unless otherwise noted, all material posted
More informationECS289: Scalable Machine Learning
ECS289: Scalable Machine Learning Cho-Jui Hsieh UC Davis Oct 18, 2016 Outline One versus all/one versus one Ranking loss for multiclass/multilabel classification Scaling to millions of labels Multiclass
More informationRobotics 2 AdaBoost for People and Place Detection
Robotics 2 AdaBoost for People and Place Detection Giorgio Grisetti, Cyrill Stachniss, Kai Arras, Wolfram Burgard v.1.0, Kai Arras, Oct 09, including material by Luciano Spinello and Oscar Martinez Mozos
More informationEmpirical Risk Minimization
Empirical Risk Minimization Fabrice Rossi SAMM Université Paris 1 Panthéon Sorbonne 2018 Outline Introduction PAC learning ERM in practice 2 General setting Data X the input space and Y the output space
More informationp(d θ ) l(θ ) 1.2 x x x
p(d θ ).2 x 0-7 0.8 x 0-7 0.4 x 0-7 l(θ ) -20-40 -60-80 -00 2 3 4 5 6 7 θ ˆ 2 3 4 5 6 7 θ ˆ 2 3 4 5 6 7 θ θ x FIGURE 3.. The top graph shows several training points in one dimension, known or assumed to
More informationProbabilistic Graphical Models: MRFs and CRFs. CSE628: Natural Language Processing Guest Lecturer: Veselin Stoyanov
Probabilistic Graphical Models: MRFs and CRFs CSE628: Natural Language Processing Guest Lecturer: Veselin Stoyanov Why PGMs? PGMs can model joint probabilities of many events. many techniques commonly
More informationA Bayesian Perspective on Residential Demand Response Using Smart Meter Data
A Bayesian Perspective on Residential Demand Response Using Smart Meter Data Datong-Paul Zhou, Maximilian Balandat, and Claire Tomlin University of California, Berkeley [datong.zhou, balandat, tomlin]@eecs.berkeley.edu
More informationLast updated: Oct 22, 2012 LINEAR CLASSIFIERS. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition
Last updated: Oct 22, 2012 LINEAR CLASSIFIERS Problems 2 Please do Problem 8.3 in the textbook. We will discuss this in class. Classification: Problem Statement 3 In regression, we are modeling the relationship
More informationRobust Monte Carlo Methods for Sequential Planning and Decision Making
Robust Monte Carlo Methods for Sequential Planning and Decision Making Sue Zheng, Jason Pacheco, & John Fisher Sensing, Learning, & Inference Group Computer Science & Artificial Intelligence Laboratory
More information1 [15 points] Frequent Itemsets Generation With Map-Reduce
Data Mining Learning from Large Data Sets Final Exam Date: 15 August 2013 Time limit: 120 minutes Number of pages: 11 Maximum score: 100 points You can use the back of the pages if you run out of space.
More informationCOMS 4721: Machine Learning for Data Science Lecture 20, 4/11/2017
COMS 4721: Machine Learning for Data Science Lecture 20, 4/11/2017 Prof. John Paisley Department of Electrical Engineering & Data Science Institute Columbia University SEQUENTIAL DATA So far, when thinking
More informationEstimation Tasks. Short Course on Image Quality. Matthew A. Kupinski. Introduction
Estimation Tasks Short Course on Image Quality Matthew A. Kupinski Introduction Section 13.3 in B&M Keep in mind the similarities between estimation and classification Image-quality is a statistical concept
More informationModeling Multiple-mode Systems with Predictive State Representations
Modeling Multiple-mode Systems with Predictive State Representations Britton Wolfe Computer Science Indiana University-Purdue University Fort Wayne wolfeb@ipfw.edu Michael R. James AI and Robotics Group
More informationCondition Monitoring of Single Point Cutting Tool through Vibration Signals using Decision Tree Algorithm
Columbia International Publishing Journal of Vibration Analysis, Measurement, and Control doi:10.7726/jvamc.2015.1003 Research Paper Condition Monitoring of Single Point Cutting Tool through Vibration
More informationData Structures for Efficient Inference and Optimization
Data Structures for Efficient Inference and Optimization in Expressive Continuous Domains Scott Sanner Ehsan Abbasnejad Zahra Zamani Karina Valdivia Delgado Leliane Nunes de Barros Cheng Fang Discrete
More informationSeq2Seq Losses (CTC)
Seq2Seq Losses (CTC) Jerry Ding & Ryan Brigden 11-785 Recitation 6 February 23, 2018 Outline Tasks suited for recurrent networks Losses when the output is a sequence Kinds of errors Losses to use CTC Loss
More informationOutline Lecture 2 2(32)
Outline Lecture (3), Lecture Linear Regression and Classification it is our firm belief that an understanding of linear models is essential for understanding nonlinear ones Thomas Schön Division of Automatic
More informationIntelligent Systems (AI-2)
Intelligent Systems (AI-2) Computer Science cpsc422, Lecture 19 Oct, 24, 2016 Slide Sources Raymond J. Mooney University of Texas at Austin D. Koller, Stanford CS - Probabilistic Graphical Models D. Page,
More informationIndependent Component Analysis and Unsupervised Learning
Independent Component Analysis and Unsupervised Learning Jen-Tzung Chien National Cheng Kung University TABLE OF CONTENTS 1. Independent Component Analysis 2. Case Study I: Speech Recognition Independent
More informationMachine Learning Basics: Maximum Likelihood Estimation
Machine Learning Basics: Maximum Likelihood Estimation Sargur N. srihari@cedar.buffalo.edu This is part of lecture slides on Deep Learning: http://www.cedar.buffalo.edu/~srihari/cse676 1 Topics 1. Learning
More information