Attribution 2.0 Korea

Users are free to copy, distribute, transmit, display, perform, and broadcast this work, to make derivative works, and to use this work for commercial purposes, subject to the following condition:

Attribution. You must attribute the work to the original author.

For any reuse or distribution, you must make clear to others the license terms applied to this work. Any of these conditions can be waived with separate permission from the copyright holder. Users' rights under copyright law are not affected by the above. This is an easy-to-understand summary of the Legal Code.

Disclaimer

Ph.D. Dissertation

Persistent Surveillance using Multiple Robots: Coordination and Path Planning
(다중 로봇을 이용한 지속적 정찰 및 감시 기법 연구: 조정 및 경로 계획)

August 2014

Graduate School of Seoul National University
Department of Mechanical & Aerospace Engineering

Woojin Kim


Persistent Surveillance using Multiple Robots: Coordination and Path Planning

A Dissertation by Woojin Kim

Presented to the Faculty of the Graduate School of Seoul National University in Partial Fulfillment of the Requirements for the Degree of DOCTOR OF PHILOSOPHY

Department of Mechanical & Aerospace Engineering
Seoul National University

Supervisor: Professor H. Jin Kim

August 2014

To my MOTHER, FATHER, and SISTER with love

Abstract

Persistent Surveillance using Multiple Robots: Coordination and Path Planning

Kim, Woojin
Department of Mechanical & Aerospace Engineering
The Graduate School
Seoul National University

A multi-agent system is a distributed system composed of multiple interacting intelligent agents within an environment. Multi-agent systems can be used to solve problems that are difficult or impossible for an individual agent or a single system to solve. The heterogeneity of a multi-agent system may arise from physical differences between agents or from behavioral differences when robots serve diverse roles in a cooperating team. In this dissertation, coordination and path planning methods for heterogeneous robots are proposed for persistent surveillance in the following three parts: (i) persistent robotic environmental sensing, (ii) modeling environmental information, and (iii) coordination of multi-robot systems for boundary tracking.

A key problem of robotic environmental sensing and monitoring is that of active sensing. This problem can be solved by generating the most informative observation paths for the robots, to limit the uncertainty in modeling and predicting an environmental phenomenon. In this work, the intermittent Kalman filter is applied to active sensing. From the Kalman process, the uncertainties of the sensed information for each cell of the discrete workspace are obtained, and the target distribution of robot visits which guarantees the predefined performance limit is calculated. The robot is then guided to visit each cell of the workspace according to the target distribution by a modified A* algorithm which we call distribution-motivated A*. The proposed algorithm generates

paths achieving the target distribution.

Using the sensed information, an ensemble implementation of the support vector machine (SVM) is proposed to model the environmental phenomenon in a distributed manner. The ensemble combination is a modeling or prediction technique that builds sub-predictors for each robot with its own dataset, and combines the sub-predictors with proper weights for the final prediction. This work shows that a well-organized collection of sub-predictors yields a more accurate result when compared with conventional SVM predictors. Moreover, this technique offers a flexible solution for the problems arising from large data and communication limitations.

This work also considers a boundary tracking problem using mobile agents, in which a controller is designed for the mobile agents to move along the boundary of the environmental phenomenon. Using the hyper-dimensional decision function obtained from the SVM described above, a hyper-potential field is constructed to generate a velocity vector field which is globally attractive to the desired closed path with circulation at the desired speed. The collective configuration of the multiple robots is coordinated into an evenly-spaced formation that encloses the boundary by minimizing the level of synchrony of the agents.

In order to validate the proposed methods, simulations and experiments with multiple flying and ground robots are carried out. The simulation results verify guaranteed informativeness during active sensing, accuracy and robustness of the ensemble SVM, and good performance of the collective boundary tracking. Furthermore, the experimental results demonstrate the applicability of the proposed methods to practical heterogeneous robotic systems for persistent surveillance.

Table of Contents

Abstract
Table of Contents
List of Figures

Chapter 1 Introduction
  1.1 Motivation
  1.2 Literature Review
    1.2.1 Persistent Active Sensing
    1.2.2 Modeling the Environmental Information
    1.2.3 Boundary Tracking with Multi-Robots
  1.3 Objectives and Contributions
    1.3.1 Active Sensing
    1.3.2 Modeling the Environmental Information
    1.3.3 Boundary Tracking with Multi-Robots
  1.4 Thesis Organization

Chapter 2 Persistent Active Sensing of Environmental Phenomena
  2.1 Research Objectives
  2.2 Problem Description
    2.2.1 Uncertainty Representation
  2.3 Proposed Active Sensing Algorithm
    2.3.1 Stability of Kalman Filter
    2.3.2 Target Distribution and Required Agent Number
    2.3.3 Persistent and Informative Path Generation
  2.4 Extension to Multi-agent Systems
  2.5 Simulation Results

Chapter 3 Modeling Environmental Information
  3.1 Problem Description
  3.2 Support Vector Machine (SVM)
    3.2.1 Linear Support Vector Machines
    3.2.2 Nonlinear Support Vector Machines
    3.2.3 One-Class Support Vector Machine
  3.3 Subpredictors and Ensemble Combination
    3.3.1 Training Subpredictors: Local Ensemble
    3.3.2 Aggregation Ensemble
  3.4 Weighted Ensemble
    3.4.1 Known Variances
    3.4.2 Unknown Variances
  3.5 Performance Analysis
    3.5.1 If the Variance of the Noise is Known
    3.5.2 If the Variance of the Noise is Unknown
  3.6 Algorithm Validation

Chapter 4 Environmental Boundary Tracking
  4.1 Problem Description
  4.2 SVM-based Lyapunov Vector Field
    4.2.1 SVM-based Potential
    4.2.2 Lyapunov-based Vector Field
    4.2.3 Vector Field Construction
    4.2.4 Analysis on ϕ = 0
  4.3 Coordination of Multiple Mobile Agents
  4.4 Simulation Results

Chapter 5 Experimental Validations and Results
  5.1 Experimental Setup
  5.2 Experimental Results
    5.2.1 Active Sensing with Flying Robots
    5.2.2 Boundary Tracking using Ground Robots
    5.2.3 Integrated System

Chapter 6 Conclusions

List of Figures

1.1 Examples of physical events (clockwise): the oil slick, the red tides and the green tides
1.2 A representative field fire scenario of persistent surveillance using multiple robots
2.1 Notation description of the dynamic equation and observation equation
2.2 Variance updating with constant Σ_η
2.3 Variance updating with time-varying Σ_η
2.4 Categorized decision methods of Σ_η: (a) no decision, (b) signal-based decision, and (c) model-based decision
2.5 The required visiting rate λ versus the upper bound \bar{V}
2.6 The graph representation for the distribution-motivated A* algorithm in 2-dimension
2.7 Feasibility test of the distribution-motivated A* algorithm: (a) the workspace is a 5 × 5 grid space and (b) the feasibility and expansion numbers are computed along S_7
2.8 The one-dimensional simulation for the distribution-motivated A* algorithm, compared with the Gibbs sampler algorithm
2.9 The two-dimensional simulation for the distribution-motivated A* algorithm, compared with the MILP approach: results of visiting distribution
2.10 The two-dimensional simulation for the distribution-motivated A* algorithm, compared with the MILP approach: decrease of ergodic metric
2.11 The two-dimensional simulation for the distribution-motivated A* algorithm, compared with the MILP approach: computation time
2.12 Potential generating tree for the inter-collision cost
2.13 Simulation settings for the active sensing in the 2-dimensional workspace
2.14 Simulation results for static events when P_max = 2
2.15 Simulation results for static events when P_max = 1
2.16 Simulation settings for the active sensing in the 2-dimensional workspace: dynamic event case
2.17 Simulation results for dynamic events when P_max = 2
3.1 A conceptual flowchart of the ensemble classifier framework
3.2 Linear separating hyperplanes for the separable case
3.3 Linear separating hyperplanes for the non-separable case
3.4 Experimental measurement data in I
3.5 Experimental measurements of a sound source with respect to the distance between the source and the sensor
3.6 Error region associated with the additive noise
3.7 Error analysis
3.8 Simulation scenario for validation of the ensemble SVM
3.9 Data gathered from each agent
3.10 Collected data from each agent
3.11 Learning results when there is no noise in the data using (a) conventional SVM and (b) ensemble combination method
3.12 Results when there is no noise in the dataset
3.13 Accuracy test for the noisy data whose variances are 0.1 for agents 1 and 3 and 0.2 for agents 2 and 4, using (a) conventional SVM and (b) ensemble combination method
3.14 Accuracy test for the noisy data whose variances are 0.2 for agents 1 and 3 and 0.4 for agents 2 and 4, using (a) conventional SVM and (b) ensemble combination method
3.15 Results when there is noise in the dataset
3.16 Comparison of the misclassification error with varying noise level
4.1 Results of the one-class SVM: (a) the boundary specification with training data points, and (b) the potential-like function
4.2 Lyapunov vector fields satisfying a globally attractive limit cyclic path: (a) ρ = 2, and (b) ρ =
4.3 Simulation results of the boundary tracking for a single mobile agent at (a) t = 16, (b) t = 32, (c) t = 48 and (d) t = 64; the boundary curve is updated during the simulation
4.4 Modality of a Gaussian mixture with two components: (left) unimodal and (right) bimodal
4.5 The virtual phase diagram of mobile agents: φ_k is the virtual phase of agent k, controlled by v_k
4.6 Results of the boundary tracking with 10 mobile robots: (a) trajectories of agents and the final configuration and (b) the relative virtual phases along the time steps
4.7 Results of the boundary tracking with 10 mobile robots: (a) the history of the Lyapunov function and (b) the history of the phase potential U(φ)
5.1 Experimental setup in the indoor environment
5.2 The results of the active sensing with two flying robots
5.3 The results of the boundary tracking with two ground robots: the configuration
5.4 The results of the boundary tracking with two ground robots: the phase difference
5.5 The results of the boundary tracking with two ground robots: the phase potential
5.6 The results of the integrated system: active sensing with two flying robots
5.7 The results of the integrated system: boundary tracking with two ground robots

1 Introduction

1.1 Motivation

Environmental information, as we use the term in this dissertation, indicates a set of observation data obtained from physical events. In the surrounding environment, there exist many types of physical events of interest, for example, wild fires, radioactively polluted areas, oil slicks, or red/green tides. Since those environmental phenomena are distributed extensively over vast areas, multi-robot systems are very useful for sensing, monitoring, managing, and acting on them. Recently, various studies using heterogeneous multi-agent systems have been reported, such as an agricultural application [1], a military application [2], autonomous rescue [3], and wild fire detection and fighting [4]. In agricultural applications, multi-aircraft systems are used to detect irregularities and apply spray to aquatic weeds, while ground robots are deployed for operations that produce soil impact. In the military application, known as the MAST project in the US, various kinds of robots are developed for indoor target discovery and tracking. In research on autonomous rescue, aircraft can be

Figure 1.1: Examples of physical events (clockwise): the oil slick, the red tides and the green tides.

used for the fast search and ground robots for delivering rescue supplies, e.g., oxygen masks, first aid kits, etc. Finally, in fire-fighting research, UAVs are used for fire detection and navigation, and ground robots are used for fire-fighting. The common issues in the above studies are how to assign roles to the different kinds of robots appropriately and how to coordinate the robots so that each performs its assigned role efficiently. According to those studies, flying robots are obviously suitable for monitoring, searching, or detection, and ground robots are suitable for detailed actions such as rescuing and fire-fighting. From this point of view, we organize a multi-robot system consisting of flying robots and ground robots for sensing, modeling, and tracking of the environmental phenomenon. The flying robots are assigned to persistent active sensing to detect the physical events and cover the vast area. The collected information is modeled in the form of a boundary function, and the ground robots are assigned to boundary tracking. A field fire scenario is introduced as a representative example in Fig. 1.2. If there is a

field fire in the workspace, the flying robots start to search for the fire and collect data on the location and intensity of the fire. Since the fire can spread, the flying robots aim to visit the intensive fire area more frequently, while visiting the non-fire area less frequently as a lookout for another pop-up fire. We call this topic for the flying robots active sensing. From the measurement data gathered by the flying robots, the field fire can be modeled. The model classifies whether fire exists or not over the workspace, so that it indicates the spatial distribution of the field fire(s). Furthermore, the boundary of the field fire can be obtained from the classification model. We call this process environmental modeling. Finally, using the boundary model obtained from the environmental modeling, ground robots track the boundary for detailed actions such as fire fighting or blocking the spread of the field fire. For efficiency, the ground robots aim to achieve an evenly-spaced configuration. We call this boundary tracking. Since each topic of persistent surveillance is important and contributes to its own area, this dissertation is organized in three parts: active sensing, environmental modeling, and boundary tracking. The detailed descriptions of active sensing, environmental modeling, and boundary tracking are introduced and discussed in Chap. 2, Chap. 3 and Chap. 4, respectively.

Figure 1.2: A representative field fire scenario of persistent surveillance using multiple robots.

1.2 Literature Review

This section offers a survey of scholarly articles, books, and other sources relevant to this research. Since this dissertation consists of three main topics, we categorize the related literature accordingly: (i) active sensing, (ii) modeling the environmental information, and (iii) boundary tracking with multiple robots.

1.2.1 Persistent Active Sensing

Active sensing and monitoring using multiple robots is a promising research area that has recently received considerable attention in the robotic sensing and artificial intelligence literature. Many approaches to active sensing have been studied. [5] uses a multiple travelling salesman problem (TSP) approach to solve the surveillance problem. [6] introduces an approximate policy for the persistent surveillance problem using a parallel and distributed implementation of dynamic programming (DP); in that problem, communication constraints and probabilistic sensor failure are considered for the embedded health monitoring system. [7] applied integer programming for optimization and graph partitioning for dividing agents into multiple teams consisting of unmanned ground vehicles (UGVs) and micro aerial vehicles (MAVs). Since the persistent active sensing problem can be interpreted as a sequential task assignment problem, many studies apply the market-based auction algorithm to the persistent surveillance problem [8, 9, 10]. In those works, the authors considered an approach for monitoring robot performance in a patrolling task and employed an auction algorithm to dynamically reassign tasks from those team members that perform poorly; the graph is divided into subsets of vertices and the vertices are assigned to each robot. As for the cooperation of heterogeneous teams, decentralized methods for allocating heterogeneous tasks to agents with different capabilities are proposed in [11], which uses the consensus-based bundle algorithm (CBBA) for the allocation problem. In addition, [12] sets as its objectives finding the minimum number of robots and a

time-invariant memoryless control law that guarantees the largest number of visits for each state. In [13], the authors propose a reactive policy for persistent surveillance which aims at an equal visiting frequency using multiple unmanned aerial vehicles (UAVs). Graph partition approaches to the patrolling problem are also proposed in [14].

1.2.2 Modeling the Environmental Information

The second topic is modeling the environmental information and finding the boundary in the multi-platform system. The problem is to construct a model of the environmental phenomenon which classifies the event area using data obtained from active sensing on the distributed platforms. In machine learning, classification is the problem of identifying to which of a set of categories a new observation belongs, on the basis of a training set of data containing observations whose category membership is known. Many supervised learning techniques have been applied to various scenarios involving classification problems. Kernel-based learning [15, 16] is suggested for simplified localization, object tracking and environmental monitoring. Also, maximum-likelihood parametric approaches [17], Bayesian networks [18], hidden Markov models [19], statistical regression methods [20] and support vector machines (SVM) [21, 22] are employed for source localization, activity recognition, human behavior detection, parameter regression, self-localization and environmental sound recognition, respectively. In particular, SVM is one of the most popular classification algorithms since it has the advantages of wide applicability, data sparsity and global optimality. In recent studies, due to the dramatic advances in multi-platform systems and embedded computing, distributed SVM training has been investigated. A parallel design of the centralized support vector machine is one approach [23, 24]. When the training data set is very large, partial SVMs are obtained using small training subsets and combined at a fusion center. This approach can handle enormous sizes of data, but can be applied only if a central processor is available to combine the partial support vectors, and convergence to the centralized SVM is not always guaranteed for arbitrary partitioning of the data set

[25]. On the other hand, there are fully distributed approaches that solve the entire SVM using distributed optimization methods. Because SVM is a quadratic optimization problem, existing convex optimization techniques can be used. In [26], a distributed SVM has been presented which adopts the alternating direction method of multipliers (ADMM) [27]. This approach is based on message exchanges among neighbors and provably converges to the centralized SVM. However, since the gradient-based iteration must maintain the connection between nodes until convergence, the intercommunication cost is high. Furthermore, in the nonlinear case, the exchanged message length can become extremely long. These issues render it unsuitable for wireless sensor network applications. Another class of distributed SVM, which is not based on the gradient method, relies on gossip-based incremental support vectors obtained from local training data sets [28, 29, 30]. These gossip-based distributed SVM approaches guarantee convergence when the labeled classes are linearly separable. When they are not linearly separable, these approaches can approximate, although not ensure, convergence to the centralized SVM solution.

1.2.3 Boundary Tracking with Multi-Robots

The final topic is to develop motion plans for the multi-agent systems to surveil or patrol efficiently along the environmental boundary given in mathematical form (such as the result of SVM). Using real-time feedback and on-line coordination of robots, coordinated moves of the platforms are handled. Moreover, the robots move along the temporal boundary (computed by SVM) in an evenly-spaced configuration. Our approach is inspired by recent studies in unmanned vehicle guidance [31] and cooperative control [32, 33]. The algorithm we propose is based on a Lyapunov vector field for convergence of boundary tracking and cooperative phase control for stabilization of collective motion. Many boundary tracking algorithms have been studied, such as local rules based on the snake algorithms in computer vision [34], adaptive and cooperative feedback control laws [35, 36] and Page's cumulative sum algorithm (CUSUM) [37]. In [35, 36], each robot

provides a single measurement at a time and gradients are computed collectively while the nodes move along the boundary as one united body. In [34, 37], robots are distributed over the boundary. In [34], a level set approach is employed for tracking the boundary; it needs a large number of agents for precise tracking because the agents do not have circulatory motion along boundaries. In [37], a control law is also developed for tracking along the boundary with multiple vehicles, but the boundary needs to be estimated as ellipses.

1.3 Objectives and Contributions

The main objective of this dissertation is sensing, modeling and tracking of an environmental phenomenon using multi-agent systems. We divide the main objective into three detailed subjects: active sensing, modeling, and boundary tracking.

1.3.1 Active Sensing

Objectives

The objective of active sensing is to generate the most informative observation paths, which persistently limit the uncertainties of the environmental phenomenon. The main problems are defining a performance index denoting the uncertainties of the environmental information, finding the required visiting rate and required number of agents, and developing a persistent path planner which satisfies the target distribution.

Contributions

In this work, the intermittent Kalman filter is applied to the environmental spatiotemporal process in the discrete space and time domain. Using the variance process from the Kalman filter, we can derive the upper-bound sequence, also known as the modified algebraic Riccati equation, and deduce the target visiting rate and the corresponding required number of agents. In order to follow the target visiting rate efficiently and accurately, the

distribution-motivated A* algorithm is proposed. The contributions of this work are the following. First, while many active sensing methods are developed with diverse optimization formulations, e.g., minimizing the uncertainty or variance, this work is the first whose problem is formulated not as an optimization but as probabilistic path planning which limits and maintains the uncertainty or variance, with an analytically derived target distribution. In particular, the existence and the convergence of the sequence of upper bounds on the uncertainties is proved. Second, the distribution-motivated A* algorithm is proposed to generate paths achieving the target distribution. In order to follow the target distribution persistently, the goal of the general A* algorithm is replaced by the target distribution and an ergodicity metric is introduced. The ergodicity metric denotes the accumulated visiting distribution, and the cost of the A* is redefined as the distance between the ergodicity metric and the target distribution. The good performance of the distribution-motivated A* algorithm is validated in simulation and compared with the Gibbs sampler algorithm. Furthermore, in order to extend the distribution-motivated A* algorithm to multi-agent systems, an inter-collision potential is adopted. Numerical simulations and experiments are performed to validate the proposed methods.

1.3.2 Modeling the Environmental Information

Objectives

The objective of the environmental modeling is to construct a model of the environmental information which classifies the event area or sensor-detection area using data from the multiple platforms. In particular, a distributed SVM learning method is employed for solving the classification problem, and the main issues are how to enhance the performance in terms of accuracy and how to analyze the robustness to measurement noise.

Contributions

In this work, an ensemble combination method is proposed for the support vector machine (SVM) and employed for environmental modeling over multi-platform systems. The ensemble method uses multiple sub-predictors to obtain better predictive performance than could be obtained from any single overall predictor. The contributions of this work are the following. First, as far as we know, although many ensemble methods with diverse models have been studied and they offer flexible representations [38, 39], this work is the first report that includes the accuracy and robustness analysis of the ensemble technique applied to SVM. We show that a well-organized collection of sub-predictors with SVMs yields a more accurate and robust result when compared with conventional SVM models. This method can be applied to a wide range of learning tasks that use a large amount of data, such as system identification, parameter estimation, detection or monitoring. Second, the proposed ensemble SVM is employed for multi-platform systems. Each individual platform trains a local SVM with its own local data, and those trained predictors are merged as an ensemble combination. Most commonly used platforms are built with commercial components which have limited resources in terms of measurement accuracy, communication and computation. The ensemble SVM makes it possible to rule out measurement noise and obtain more accurate results with properly determined weights. If the number of platforms becomes large, multi-platform systems can experience problems such as communication delay, data drop-out, and excessive power consumption for signal transmission. Using the proposed approach, each sub-predictor needs to collect data only from its own sensor, rather than from all the platforms. This mitigates the issues related to flexibility, computation and communication cost.

1.3.3 Boundary Tracking with Multi-Robots

Objectives

The objective of the boundary tracking with multi-robots is to develop collective motion plans for multiple robots in order to track the boundaries of the environmental phenomenon obtained from the SVM algorithm. The main problems are developing the tracking algorithm for the SVM-based boundary curves and stabilizing the evenly-spaced configuration formed by the robots.

Contributions

In this work, the SVM algorithm is adopted to represent the boundary and, using the results from the SVM, we develop a Lyapunov-based vector field for boundary tracking. We also propose a virtual phase control method for multi-robot coordination. The contributions of this work are the following. First, the SVM algorithm is adopted to represent boundaries in a mathematical form regardless of their shape. However, the boundary curve obtained from SVM is complex and explicit analysis is very difficult. Instead of explicit analysis, we generate a velocity vector field which gives asymptotic convergence to the boundary with circulation at the desired speed, by using the hyper-dimensional decision function obtained from SVM. Since the decision function of SVM is in the form of a Gaussian mixture, we prove that there exist no local minima outside of the boundary. Second, the desired speed is adjusted to coordinate the multiple robots collectively so that their inter-vehicular spacing is maximized for efficient surveillance or patrol along the boundary and fast reaction when the boundary changes. We perform both a simulation and an experiment to validate the proposed algorithm.

1.4 Thesis Organization

The dissertation is organized as follows. In Chapter 2, the active sensing strategy, which consists of the analysis of the upper bound of the uncertainty, the ergodic A* algorithm, and the simulation results, is presented. In Chapter 3, the environmental modeling strategy using the ensemble SVM is introduced and analyzed in terms of accuracy and robustness. In Chapter 4, the boundary tracking using multi-robot systems is presented; the SVM-based Lyapunov vector field and virtual phase control are introduced. In Chapter 5, the proposed methods are validated with indoor experiments. Furthermore, the integrated system is also examined for persistent surveillance of the environmental phenomenon, which is our final goal. Chapter 6 summarizes the issues and concluding remarks of this dissertation.

2 Persistent Active Sensing of Environmental Phenomena

2.1 Research Objectives

In this chapter, we consider a persistent and informative searching method, whose goal is to generate a persistent coordination rule that performs efficient robotic environmental sensing and monitoring. A key problem of robotic environmental sensing and monitoring is active sensing. This problem can be solved by generating the most informative observation paths of the multiple robots, to limit and maintain the uncertainty in modeling and predicting an environmental phenomenon. We define a performance index reflecting the quality of the information (or measurements) and develop a searching method guaranteeing the predefined performance, i.e., maintaining the quality of information. Basically, a sufficient number of agents should be available for better searching capability of the system. In addition, the path planning algorithm is an important factor for the individual capabilities. In this context, an efficient path generating algorithm is developed and

also the decision criteria for the number of agents are investigated in the following parts of this chapter.

2.2 Problem Description

Consider a region of interest I, where I is a discrete spatial domain (a grid space with N cells) which consists of the position vectors q_i, for i = 1, ..., N, and a discrete time domain t = 1, 2, 3, .... We can define the dynamic equation and observation equation for each cell i as

Y(q_i, t) = H_t Y(q_i, t-1) + η_i,   (2.1)
Z(q_i, t) = K_i Y(q_i, t) + ε_i,   (2.2)

where the scalar value Y(q, t) is the information of the environmental phenomenon, the scalar value Z(q, t) is its measurement at position q at time t, and η_i and ε_i are random variables whose means are zero and whose variances are Σ_{η,i} and Σ_{ε,i}, respectively. Moreover, we define visiting and observing cell i at time t as a binary random variable K_i(t). Denoting the visiting ratio at cell i during the active sensing by p_{K_i}, the probability mass function of K_i = 1 is p_{K_i}(1) = λ_i, where K_i(t) is independent of K_i(s) if t ≠ s. H_t is a parameter which represents the time-varying dynamics of the environmental phenomenon. If we know how the environmental phenomenon evolves, H_t can be determined. However, in general cases we set H_t = 1, i.e., we assume that the environmental phenomenon is static. For simplicity of notation, we use Y_{i,t}, Z_{i,t} and K_{i,t} instead of Y(q_i, t), Z(q_i, t) and K_i(t), respectively, in the remaining parts of this chapter.

Figure 2.1: Notation description of the dynamic equation and observation equation.

From equations (2.1) and (2.2), we can apply the Kalman filter to the process as follows.

Measurement updates:

\hat{Y}_{i,t|t} = \hat{Y}_{i,t|t-1} + G_i [Z_{i,t} - K_{i,t} \hat{Y}_{i,t|t-1}],   (2.3)
P_{i,t|t} = P_{i,t|t-1} - G_i K_{i,t} P_{i,t|t-1},   (2.4)

where the Kalman gain is G_i = P_{i,t|t-1} [Σ_{ε,i} + P_{i,t|t-1}]^{-1}.

Time updates:

\hat{Y}_{i,t|t-1} = \hat{Y}_{i,t-1|t-1},   (2.5)
P_{i,t|t-1} = P_{i,t-1|t-1} + Σ_{η,i}.   (2.6)

Since the variances P_{i,t|t} and P_{i,t|t-1} denote the uncertainty or quality of the values of Y_{i,t}, we can use the values of the P_i's as the performance index of the informative path planning. Equations (2.4) and (2.6) show the updating rule of the uncertainty P_i. At every time step, the uncertainty increases by Σ_{η,i} according to (2.6), and there is a drastic decrease whenever an observation (or a visit) occurs, as in (2.4) and as shown in Fig. 2.2.
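To make the recursion concrete, the following is a minimal Python sketch of the per-cell variance update, assuming the scalar per-cell model of (2.1)-(2.2); the function name and interface are ours, not the dissertation's.

    def variance_update(P, visited, sigma_eta, sigma_eps):
        # One step of the per-cell uncertainty recursion, Eqs. (2.4) and (2.6).
        # P         : posterior variance P_{i,t-1|t-1} of the cell
        # visited   : whether the cell is observed at time t (K_{i,t} = 1)
        # sigma_eta : process-noise variance Sigma_{eta,i}
        # sigma_eps : measurement-noise variance Sigma_{eps,i}
        P_pred = P + sigma_eta                    # time update, Eq. (2.6)
        if visited:
            G = P_pred / (sigma_eps + P_pred)     # Kalman gain G_i
            P_pred = P_pred - G * P_pred          # measurement update, Eq. (2.4)
        return P_pred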

Figure 2.2: Variance updating with constant Σ_η.

Figure 2.3: Variance updating with time-varying Σ_η.

Thus, by controlling the value of Σ_{η,i} with respect to the measurement Z_i, we can obtain the behavior in which the robots visit the highly uncertain region more frequently: the highly uncertain region can be addressed by defining the variance parameter Σ_{η,i}(Z_i) appropriately and generating paths which reduce the overall uncertainties. Fig. 2.3 shows the fluctuation of the uncertainty P obtained by controlling Σ_η. This growing-uncertainty problem is significantly related to sweeping control or sweep coverage [14], in which robots with finite-length sensor footprints move over an environment so that every point in the environment is visited at least once by a robot. The robots are controlled so as to maximize a metric on the quality of the state estimate. Although such recent works are well-motivated by real-world

estimation applications, they suffer from the fact that planning optimal trajectories under these models requires the solution of a task assignment problem or travelling salesman problem using intractable dynamic programming or integer programming, even for a static environment. In this work, we propose a probabilistic approach for persistent and informative path generation that guarantees the expectation of the uncertainty for each cell to remain under the predefined upper bound on the performance.

2.2.1 Uncertainty Representation

As mentioned before, it is important to define the variance parameter Σ_η(Z) appropriately. If Σ_η is a constant, the covariance varies in a manner irrelevant to the events or event changes. The uncertainty of each cell grows with an identical speed, and the expected robot behavior is the normal sweeping control, as shown in Fig. 2.2. However, from equation (2.6), we can control the increasing speed of the variance by changing Σ_η, as shown in Fig. 2.3. Decision methods for the value of Σ_η can be categorized as follows: (i) no decision, (ii) signal-based decision, and (iii) model-based decision. No decision means that Σ_η is a fixed value which is predefined before the active sensing task. A signal-based decision simply obtains Σ_η according to the observation Z_t; it is intuitive and makes Σ_η easy to compute. Model-based decision methods can compute Σ_η exactly based on the measurement history; however, the history data should be rich enough to calculate the exact Σ_η, which is not suitable for our case where the observation is intermittent. Fig. 2.4 shows the block diagrams of the categorized decision methods. In this work, we adopt the signal-based decision rule, setting Σ_η to be proportional to Z_t and \dot{Z}_t, so that the agent visits the region where the event occurs or changes more often than the other regions.
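As an illustration, a signal-based rule might look like the following Python sketch; the gains and the saturation level are hypothetical choices of ours, not values from the dissertation (the saturation anticipates the feasibility issue discussed in Sec. 2.3.3).

    def sigma_eta_signal_based(z, z_prev, dt, k1=1.0, k2=1.0, cap=5.0):
        # Signal-based decision of Sigma_eta: proportional to the measurement
        # and its rate of change, saturated so that the resulting target
        # distribution does not become too sharp.
        z_dot = (z - z_prev) / dt          # finite-difference estimate of Z_t's rate
        return min(k1 * abs(z) + k2 * abs(z_dot), cap)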

Figure 2.4: Categorized decision methods of Σ_η: (a) no decision, (b) signal-based decision, and (c) model-based decision.

2.3 Proposed Active Sensing Algorithm

In the previous section, we defined the uncertainties and their representation. In this section, we discuss an approach to the persistent and informative path generation algorithm for which we can guarantee stability and performance. Our new strategy contains the following four steps. First, we define the performance index of our visiting method. Second, we obtain the region of the parameter in which the Kalman filter is stable. Third, we calculate the stationary target distribution \bar{π} of visits which guarantees the predefined bound. Finally, we compute the coordination rule for the agents to visit each site according to \bar{π}.

2.3.1 Stability of Kalman Filter

Since the purpose of the active sensing is to minimize the uncertainty of information, we set the variance P of the observations as the performance index of the informative path generation. Moreover, the average of the variance E[P] is our final performance index because we handle persistent behaviors. From equations (2.4) and (2.6), we have the following variance equation:

P_{i,t+1} = P_{i,t} + Σ_{η,i} - K_{i,t} P_{i,t}^2 (P_{i,t} + Σ_{ε,i})^{-1},   (2.7)

where K_{i,t} is a binary stochastic variable. If K_{i,t} = 1, there is an observation, and if K_{i,t} = 0, the observation is absent, as mentioned before. Therefore, how to choose the time series {K_{i,t}}, for each i = 1, ..., N, is equivalent to the path planning rule.

Consider for each i the sequence {K_{i,t}} with probability distribution p_{K_i}(1) = λ_i, and derive the average of the variance E[P_{i,t}] as follows:

E[P_{i,t+1}] = E[P_{i,t} + Σ_{η,i} - K_i P_{i,t}^2 (P_{i,t} + Σ_{ε,i})^{-1}]
            = E[P_{i,t} + Σ_{η,i} - λ_i P_{i,t}^2 (P_{i,t} + Σ_{ε,i})^{-1}]
            = E[g_{λ_i}(P_{i,t})],   (2.8)

where g_λ(X) = X + Σ_η - λ X^2 (X + Σ_ε)^{-1} is the modified algebraic Riccati equation with intermittent observations. Consider the process

X_{t+1} = g_λ(X_t) = X_t + Σ_η - λ X_t^2 / (X_t + Σ_ε).   (2.9)

If the propagation of the modified algebraic Riccati equation is bounded as t → ∞, i.e., lim_{t→∞} X_t = lim_{t→∞} g_λ^t(X_0) < ∞, we can say the Kalman process is stable.

For a positive value of X_t and nonnegative Σ_ε, there always exists 0 < m ≤ 1 such that

m ≤ X_t / (X_t + Σ_ε) ≤ 1.   (2.10)

Then we have the following:

X_{t+1} = X_t + Σ_η - λ X_t^2 / (X_t + Σ_ε) ≤ X_t + Σ_η - mλ X_t = (1 - mλ) X_t + Σ_η = \bar{λ} X_t + Σ_η,   (2.11)

where \bar{λ} = 1 - mλ ≤ 1. Since lim_{t→∞} X_t = lim_{t→∞} (\bar{λ}^t X_0 + Σ_{τ=1}^{t} \bar{λ}^τ Σ_η) < ∞, the condition \bar{λ} < 1 guarantees the convergence of the sequence X_t. In conclusion, if λ > 0, the intermittent process (2.9) is stable.
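As a quick numerical check (our own sketch, not from the dissertation, with arbitrary parameter values), one can iterate the MARE map and watch the sequence settle:

    def g(V, lam, sigma_eta, sigma_eps):
        # Modified algebraic Riccati map g_lambda of Eqs. (2.8)-(2.9).
        return V + sigma_eta - lam * V**2 / (V + sigma_eps)

    V = 0.0
    for _ in range(200):            # iterate V_{t+1} = g_lambda(V_t)
        V = g(V, lam=0.5, sigma_eta=1.0, sigma_eps=0.5)
    print(V)  # approaches the fixed point of Eq. (2.16), since lambda > 0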

2.3.2 Target Distribution and Required Agent Number

Now we want to obtain the target distribution which can guarantee the stability and performance. The target distribution denotes the desired visiting rate for each cell by the agents. Before calculating the target distribution, we derive the upper bound of the performance index E[P_{i,t}]. Revisit the equation

E[P_{i,t+1}] = E[P_{i,t} + Σ_{η,i} - λ_i P_{i,t}^2 (P_{i,t} + Σ_{ε,i})^{-1}] = E[g_{λ_i}(P_{i,t})],   (2.12)

where i = 1, ..., N. Since the second derivative of g_λ(X) is always negative on the domain X ∈ [0, ∞), g_λ(X) is concave, and by Jensen's inequality we have

E[P_{i,t+1}] = E[g_{λ_i}(P_{i,t})] ≤ g_{λ_i}(E[P_{i,t}]).   (2.13)

Assume that there exists an upper bound E[P_{i,t}] ≤ V_{i,t}; then g_{λ_i}(E[P_{i,t}]) ≤ g_{λ_i}(V_{i,t}). Moreover, by letting g_{λ_i}(V_{i,t}) = V_{i,t+1}, we have

E[P_{i,t+1}] ≤ V_{i,t+1}.   (2.14)

From the above procedure, we have the sequence {V_{i,t}} as the MARE process, i.e., V_{i,t+1} = g_{λ_i}(V_{i,t}), and the sequence is the upper bound of the performance index E[P_{i,t}] for t = 0, 1, 2, .... Since we deal with the persistent behavior, consider the upper bound over the long term of E[P] as follows:

\bar{V}_i = lim_{t→∞} V_{i,t}.   (2.15)

Since the sequence {V_{i,t}} converges if λ_i > 0, as proved in the previous subsection, the

limit value of the upper bound \bar{V}_i can be obtained by solving

\bar{V}_i = g_{λ_i}(\bar{V}_i).   (2.16)

Then we have the solution

\bar{V}_i = [Σ_{η,i} + (Σ_{η,i}^2 + 4 λ_i Σ_{η,i} Σ_{ε,i})^{1/2}] / (2 λ_i).   (2.17)

By setting \bar{V}_i to P_max, the limiting uncertainty, we can obtain the required visiting rate for each state as

λ_i = Σ_{η,i} (P_max + Σ_{ε,i}) / P_max^2,   (2.18)

and letting λ_i be the required probability of observation at cell i, we have Λ = [λ_1, ..., λ_N]^T. Fig. 2.5 shows the required visiting rate λ versus the upper bound \bar{V} according to equation (2.18). Obviously, if the upper bound \bar{V} is small, then the target uncertainty is low, so the required visiting rate is large. On the other hand, if the upper bound \bar{V} is large, then the target uncertainty is high, so the required visiting rate is small. The former is the tight condition and the latter is the naive condition. Furthermore, Σ_η is also an important parameter determining the relationship between λ and \bar{V}: a large Σ_η requires a higher visiting rate for the same upper bound \bar{V}.

Figure 2.5: The required visiting rate λ versus the upper bound \bar{V}, plotted for Σ_η = 1 and Σ_η = 3.

By normalizing Λ, we have the target distribution \bar{π} of visits as

\bar{π} = Λ / M,   (2.19)

where M = Σ_{i=1}^{N} λ_i is the sum of the required visiting rates. When M ≤ 1, each element of \bar{π} is higher than the corresponding λ_i, so we need only one agent. However, if M > 1, the elements of \bar{π} are lower than the λ_i's, so we need more than one agent. From this idea, we can choose ⌈M⌉, the value of M rounded up to an integer, as the minimum required number of agents for keeping the uncertainty metric under P_max. Obviously, ⌈M⌉ increases when we set a tighter P_max, when the area of the environmental phenomenon is large, or when the environmental phenomenon changes fast so that Σ_η increases.
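Putting (2.17)-(2.19) together, a small Python sketch (ours; the numeric values are hypothetical) computes the per-cell rates, the target distribution, and the minimum agent count:

    import math
    import numpy as np

    def required_rates(P_max, sigma_eta, sigma_eps):
        # Per-cell visiting rates lambda_i from Eq. (2.18).
        return sigma_eta * (P_max + sigma_eps) / P_max**2

    sigma_eta = np.array([0.1, 0.3, 0.1, 0.05])   # hypothetical per-cell Sigma_eta
    lam = required_rates(P_max=2.0, sigma_eta=sigma_eta, sigma_eps=0.5)
    M = lam.sum()
    pi_target = lam / M                           # target distribution, Eq. (2.19)
    n_agents = math.ceil(M)                       # minimum required number of agents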

2.3.3 Persistent and Informative Path Generation

Distribution-Motivated A* Algorithm

According to the target distribution \bar{π}, we develop the trajectory rule for the agents who perform the active sensing. Most path generation algorithms developed for discrete spaces or graph-based environments, such as A* [40], Dijkstra [41], and RRT [42], deal with the problem in which there are an initial vertex and a goal vertex and one finds the best path from the initial vertex to the goal, accounting for obstacles, threats, and so on. However, those algorithms are not suited to our interest, persistent behavior, so we turn our attention to an ergodicity metric. The visiting history of the agents can be converted to the accumulated visiting distribution (ergodicity metric) π(q, t) at time t, where the current position of the agent is q, and the goal is achieving the target distribution \bar{π}. Now, consider the distribution-motivated version of A* with the ergodicity metric. The idea is simple: the cost and the cost-to-go (heuristic) are expressed by the ergodicity metric. The target distribution is \bar{π} and the initial distribution is π(0). The ergodic metric at position q_i at time t is π(q_i, t), and the graph of the environment is constructed permitting revisiting, i.e., the edges are deployed from t to t+1, as shown in Fig. 2.6. From this setting, we can run the distribution-motivated A* algorithm as below. At time step t and position q, we can calculate:

Cost: g(q, t) = Σ_{τ=1}^{t-1} ||π(τ) - π(τ-1)|| + ||π(q, t) - π(t-1)||

Cost-to-go: h(q, t) = ||\bar{π} - π(q, t)||

Cost of the path through q: f(q, t) = g(q, t) + h(q, t),

where π(q, t) = [π_1(t), π_2(t), ..., π_N(t)]^T, and π(q, t) can be updated as

π_i(t) = (1/t) {(t-1) π_i(t-1) + δ(q - q_i)},   (2.20)

where δ(·) is a Dirac delta function.
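The search itself can be sketched in Python as follows; this is our own simplified illustration, in which the state is the pair of cell and accumulated distribution, revisits are allowed, and the relaxed final condition ||\bar{π} - π(x)|| ≤ ε introduced below is used as the stopping rule. A practical implementation would bound or prune the frontier.

    import heapq
    import itertools
    import numpy as np

    def distribution_motivated_astar(adj, pi_target, start, eps=0.05, max_steps=500):
        # adj       : adjacency list of the grid graph
        # pi_target : target visiting distribution, Eq. (2.19)
        N = len(pi_target)
        pi0 = np.zeros(N)
        pi0[start] = 1.0
        tie = itertools.count()  # tie-breaker so the heap never compares arrays
        frontier = [(np.linalg.norm(pi_target - pi0), 0.0, next(tie), start, pi0, 1, [start])]
        while frontier:
            _f, g, _, q, pi, t, path = heapq.heappop(frontier)
            if np.linalg.norm(pi_target - pi) <= eps:
                return path                        # relaxed final condition met
            if t >= max_steps:
                continue
            for nq in adj[q]:                      # edges go from t to t+1: revisits allowed
                visit = np.zeros(N)
                visit[nq] = 1.0
                pi_n = (t * pi + visit) / (t + 1)  # ergodic metric update, Eq. (2.20)
                d = np.linalg.norm(pi_n - pi)      # step cost d(q, nq)
                h = np.linalg.norm(pi_target - pi_n)
                heapq.heappush(frontier,
                               (g + d + h, g + d, next(tie), nq, pi_n, t + 1, path + [nq]))
        return None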

Figure 2.6: The graph representation for the distribution-motivated A* algorithm in 2-dimension.

Generally, the A* algorithm is complete if the heuristic h(x) for node x is positive semi-definite. Furthermore, if the heuristic is monotonic (consistent), the A* algorithm is admissible, i.e., A* finds the optimal path. Our cost-to-go (heuristic) h(q, t) is positive semi-definite (h(x) = ||\bar{π} - π(x)|| ≥ 0), so the search is complete, and it is also monotonic, as the following shows: for adjacent cells x and y,

h(x) = ||\bar{π} - π(x)|| ≤ ||π(x) - π(y)|| + ||\bar{π} - π(y)|| = d(x, y) + h(y),   (2.21)

where d(x, y) is the cost from x to y. If there is a solution path Υ = (v_1, v_2, ..., v_n) satisfying the final condition h(v_n) = 0, the following demonstrates the admissibility of

the A* algorithm:

h(v_1) ≤ d(v_1, v_2) + h(v_2) ≤ ... ≤ L(Υ) + h(v_n) ≤ L(Υ),   (2.22)

where L(Υ) is the cost of path Υ. As a practical computational matter, in our distribution-motivated A* algorithm we set the final condition as ||\bar{π} - π(x)|| ≤ ε, and the relaxed admissibility can be explained as

h(v_1) ≤ d(v_1, v_2) + h(v_2) ≤ ... ≤ L(Υ) + h(v_n) ≤ L(Υ) + ε,   (2.23)

and A* finds the path which arrives at the nearest point satisfying the relaxed final condition ||\bar{π} - π(x)|| ≤ ε.

Since the distribution-motivated A* algorithm finds the optimal path whenever a solution exists, it is important to figure out whether a solution exists or not. Judging the existence of a solution in the distribution-motivated path planning problem is closely related to the existence of a Hamiltonian path in general graphs, which so far remains unsolved for graphs in general form. In this work, we derive a sufficient condition for the non-existence of a solution. Consider a binary variable x_i(k), for i = 1, ..., N, such that

x_i(k) = 1 if the agent visits cell i at step k, and x_i(k) = 0 otherwise.   (2.24)

Since the agent always moves from the current cell to one of its adjacent cells, we have the following constraint:

x_i(k) ≤ Σ_{j ∈ N_i} x_j(k+1),   i = 1, ..., N.   (2.25)

From equation (2.25), the target distribution at cell i satisfies

\bar{π}_i = (1/L) Σ_{k=1}^{L} x_i(k) ≤ (1/L) Σ_{j ∈ N_i} Σ_{k=2}^{L} x_j(k) = Σ_{j ∈ N_i} [\bar{π}_j - (1/L) x_j(1)] ≤ Σ_{j ∈ N_i} \bar{π}_j,   (2.26)

where L is the length of the path and N_i is the set of cells adjacent to cell i. From equation (2.26), we have the following sufficient condition for the non-existence of a solution:

\bar{π}_i > Σ_{j ∈ N_i} \bar{π}_j for some i.   (2.27)

This condition indicates that there is no solution path if the target distribution of a cell is larger than the sum of the target distributions of its adjacent cells. In order to validate the sufficient condition (2.27), we perform a simple test on the 5 × 5 grid space shown in Fig. 2.7(a). We examine the feasibility according to the adjacency ratio S_i, defined as follows:

S_i = \bar{π}_i / Σ_{j ∈ N_i} \bar{π}_j.   (2.28)

Fig. 2.7(b) shows the results of the feasibility test with various values of S_7 in Fig. 2.7(a). If S_7 > 1, i.e., the sufficient condition (2.27) is satisfied, A* cannot find a solution path. In the region of S_7 < 1, however, A* finds the solution path, and the number of node expansions during the test is also computed. When S_7 = 1, A* path planning is still infeasible, which indicates that condition (2.27) is sufficient but not necessary. In order to avoid the infeasible situation, we should decide Σ_{η,i} in a saturated form with respect to the measurement Z_{i,t}, so that the target distribution does not become too sharp.
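The non-existence check (2.27) is cheap to evaluate; a short Python sketch of it (ours) over an adjacency list:

    def violates_sufficient_condition(pi_target, adj):
        # True if some cell's target probability exceeds the sum over its
        # neighbours, Eq. (2.27): no solution path can exist in that case.
        return any(pi_target[i] > sum(pi_target[j] for j in adj[i])
                   for i in range(len(pi_target)))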

Figure 2.7: Feasibility test of the distribution-motivated A* algorithm: (a) the workspace is a 5 × 5 grid space and (b) the feasibility and expansion numbers are computed along S_7.

Algorithm Validation

In order to validate and analyze the performance of the distribution-motivated A* algorithm, a one-dimensional simulation is performed. Consider a one-dimensional graph with a ring topology consisting of 20 vertices. The target distribution \bar{π} is arbitrarily chosen in the form of a Gaussian distribution function, and the performance in terms of accuracy and convergence speed is compared with that of the Gibbs sampler. The Gibbs sampler constructs a Markov chain which has a limiting stationary distribution when the Markov chain is irreducible and aperiodic. However, since the Gibbs sampler generates randomly chosen paths according to the Markov chain, its behavior is not ideal and its convergence is slow. Fig. 2.8 shows the simulation results on the one-dimensional ring topology with 20 vertices. Fig. 2.8(a) shows the target distribution \bar{π} together with the ergodicity metric obtained from the distribution-motivated A* algorithm and the Gibbs sampler. The A* algorithm follows the target distribution almost perfectly, while the Gibbs sampler shows error in the distribution. Fig. 2.8(b) shows the convergence results of the distribution-motivated A* algorithm and the Gibbs sampler. The error of the ergodic metric of the distribution-motivated A* algorithm decays faster than that of the Gibbs sampler, and the steady-state error is also smaller. Figs. 2.8(c) and (d) show the uncertainties when the distribution-motivated A* algorithm and the Gibbs sampler are used. Both algorithms satisfy the limiting condition on the uncertainties; however, the

Gibbs sampler shows undesirable behavior during the first few hundred steps, which corresponds to the burn-in phase of the Gibbs sampler. An optimization-based approach, specifically mixed integer linear programming (MILP), is also applied to generate the two-dimensional path achieving the target visiting distribution, and the performance of A* in terms of accuracy, convergence speed and computation time is compared with that of the MILP approach. For the test, the MILP is solved by the function intlinprog in MATLAB, which uses a branch-and-bound algorithm. Figs. 2.9, 2.10 and 2.11 show the comparisons between the distribution-motivated A* and the MILP approach in terms of accuracy, convergence of the ergodic metric and computation time, respectively. As shown in Fig. 2.9, the accuracy of the distribution-motivated A* is similar to that of the MILP approach. Also, the ergodic metric of MILP decreases slightly faster until 4 steps; however, after 4 steps, the decreasing speeds of both A* and MILP become similar, as shown in Fig. 2.10. Fig. 2.11 shows the comparison in terms of computation time between A* and MILP. Since the computation time of both algorithms depends on the expansion of the searching spaces, we perform multiple simulations with various initial conditions and compute their average. Since the difference between A* and MILP is large, the results are represented in log scale. As shown in the figure, A* takes a shorter computation time than MILP, although the computation time of both algorithms increases exponentially with the size of the workspace. Furthermore, the difference between A* and MILP becomes larger as the size of the workspace increases. From these results, the proposed A* algorithm is as accurate as MILP while its computation time is shorter.

Figure 2.8: The one-dimensional simulation for the distribution-motivated A* algorithm, compared with the Gibbs sampler algorithm.

Figure 2.9: The two-dimensional simulation for the distribution-motivated A* algorithm, compared with the MILP approach: results of visiting distribution.

Figure 2.10: The two-dimensional simulation for the distribution-motivated A* algorithm, compared with the MILP approach: decrease of ergodic metric.

Figure 2.11: The two-dimensional simulation for the distribution-motivated A* algorithm, compared with the MILP approach: computation time.

2.4 Extension to Multi-agent Systems

In order to extend the distribution-motivated A* algorithm to the multi-agent case, we simply include an inter-collision term in the cost function. We define the inter-collision term using a potential generating tree, shown in Fig. 2.12. Each point of the generated path within horizon H is treated as a basis through the RBF function

f_j(q) = exp(-||q - q_j||^2 / (2σ^2)),   (2.29)

where q_j is the location of the j-th path point, for j = 1, ..., H. The bias term b denotes the threshold on the sum of the RBFs, and by controlling the bias term we can handle the range of the effect of the RBFs which generate the repulsive potential. Moreover, in order to regularize and strictly specify the repulsive region, the following sigmoid function is used:

g(x) = 1 / (1 + exp(-x/T)).   (2.30)

From the above setting, the final form of the repulsive potential C(q) is

C(q) = g(Σ_{j=1}^{H} w_j f_j(q) - b),   (2.31)

where w_j is the weight and b is the bias term. We have 0 < C(q) < 1, where C → 1 if q is close to other trajectories and C → 0 otherwise. The repulsive potential C(q) is now implemented in the modified A* algorithm, specifically in the cost-to-go term h(q, t), as follows:

h'(q, t) = h(q, t) / (1 - C(q)).

If q is close to the other trajectories, h'(q, t) increases drastically, and if q is far from the trajectories, h'(q, t) ≈ h(q, t).
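The following Python sketch (ours) evaluates the repulsive potential of Eqs. (2.29)-(2.31); the parameter values are hypothetical, not taken from the dissertation.

    import numpy as np

    def repulsive_potential(q, path_points, w=1.0, b=0.5, sigma=1.0, T=0.05):
        # q           : query position, shape (2,)
        # path_points : path points q_j of the other agents within horizon H, shape (H, 2)
        rbf = np.exp(-np.sum((path_points - q) ** 2, axis=1)
                     / (2.0 * sigma ** 2))       # RBF basis values, Eq. (2.29)
        x = w * np.sum(rbf) - b                  # weighted RBF sum minus bias, Eq. (2.31)
        return 1.0 / (1.0 + np.exp(-x / T))      # sigmoid regularization, Eq. (2.30)

    def inflated_cost_to_go(h, C):
        # h'(q, t) = h(q, t) / (1 - C(q)): blows up near the other trajectories.
        return h / max(1.0 - C, 1e-9)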

Figure 2.12: Potential generating tree for the inter-collision cost.

2.5 Simulation Results

In order to validate the proposed active sensing method with multiple robots, we consider the 10 × 10 grid workspace with a static Gaussian event shown in Fig. 2.13. The agents under consideration move straight in the cardinal directions, i.e., on the 4-connected grid. The proposed method can be extended to the 9-connected grid, where the agents additionally consider straight moves in the diagonal directions; a diagonal movement can be treated as a two-step move on the 4-connected grid. Fig. 2.14 shows the simulation results when we set the limiting uncertainty level to 2. Fig. 2.14(a) shows that the ergodic metric follows the target distribution well, and it converges fast, as shown in Fig. 2.14(c). Fig. 2.14(d) shows that the uncertainties of all cells are well bounded by the limiting constraint P_max = 2. An interesting aspect is how the number of agents is chosen. As shown in Fig. 2.14(b), the simulation starts with one agent. However, after some iterations, the agent detects events, and the uncertainties in the event region are expected to grow fast. So, as time goes on, the sum of the required visiting rates, Σ_i λ_i, grows over 1, and at that moment the required number of agents changes to 2. We can find similar behaviors in the next simulation, where the limiting uncertainty level

is P_max = 1, as shown in Fig. 2.15. Since the limiting constraint is tighter than in the previous simulation, the active sensing starts with two agents. After some detection steps, events are found and the required number of agents increases, as shown in Fig. 2.15(b). The other characteristics, such as accuracy, convergence speed, and performance, are similar to the previous simulation. Furthermore, we consider the dynamic event case shown in Fig. 2.16. The situation in Fig. 2.16 starts the same as the static event case; however, in the middle of the simulation, a new event is added unexpectedly in some other region. Fig. 2.17 shows the results of the active sensing simulation when a new event pops up at t = 1, with a limiting uncertainty level of 2. As shown in Figs. 2.17(a), (c) and (d), the visiting distribution is accurate, the ergodic metric converges fast, and the performance is well bounded by the limiting constraint. In Fig. 2.17(b), we can observe that the number of agents grows again after t = 1. Since a new event is added at t = 1, the sum of the required visiting rates, Σ_i λ_i, starts growing again; after some steps it exceeds 2, so that the required number of agents is chosen as 3.

Figure 2.13: Simulation settings for the active sensing in the 2-dimensional workspace.

Figure 2.14: Simulation results for static events when P_max = 2: (a) visiting distribution, (b) number of agents, (c) ergodic metric, and (d) performance.

Figure 2.15: Simulation results for static events when P_max = 1: (a) visiting distribution, (b) number of agents, (c) ergodic metric, and (d) performance.

Figure 2.16: Simulation settings for the active sensing in the 2-dimensional workspace: dynamic event case.

Figure 2.17: Simulation results for dynamic events when P_max = 2: (a) visiting distribution, (b) number of agents, (c) ergodic metric, and (d) performance.

3 Modeling Environmental Information

3.1 Problem Description

The main objective of the work in this chapter is modeling the environmental information using the data gathered from the networked robot platforms, which collect the information with the path generation algorithm described in the previous chapter. In this chapter, we focus on SVM learning with streaming data from the distributed robots. The distributed SVM learning solves a problem which classifies the sensor-detection area as the event region. This kind of knowledge discovery on distributed streaming data is a research topic of growing interest [43]. The fundamental problem we need to solve is the following: how can we enhance the performance of modeling the environmental information, in terms of accuracy and robustness to measurement noise, within the distributed platforms using streaming data? From this point of view, we propose an ensemble combination method for support vector machines (SVMs). The ensemble combination is a prediction technique that divides the

training data set into subsets, builds subpredictors for each subset, and combines the subpredictors with proper weights for the final estimation. In the networked robot setting, each robot collects data with its sensors, builds subpredictors with its local data, and communicates with its neighbours to combine the subpredictors for the final result. Fig. 3.1 shows a conceptual flowchart of the ensemble classifier with local data streaming. For each agent, we first build local SVM classifiers with its own local dataset. Then we combine these base classifiers to form an aggregated ensemble through model averaging (or weighted averaging), which means that the aggregate ensemble is a mixture of the agent-wise ensembles. Our analysis demonstrates that the proposed method outperforms the conventional SVM in terms of accuracy. This chapter is organized as follows: Section 3.2 briefly introduces the conventional support vector machine (SVM) in the centralized setting and Section 3.3 describes the ensemble combination method in the distributed setting. Furthermore, in Section 3.4, the weighted ensemble of subpredictors is introduced, and the performance of the weighted ensemble method is studied in Section 3.5. Finally, in Section 3.6, the proposed algorithm is validated.
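As a schematic of this flow, the following Python sketch (ours, with scikit-learn's SVC standing in for the dissertation's SVM training, and uniform weights standing in for the weighted combination of Section 3.4) trains one sub-predictor per agent and averages their decision values:

    import numpy as np
    from sklearn.svm import SVC

    def train_subpredictors(local_datasets):
        # One SVM per agent, trained only on that agent's local data (X, y).
        return [SVC(kernel="rbf", gamma=0.5).fit(X, y) for X, y in local_datasets]

    def ensemble_predict(models, X, weights=None):
        # Weighted average of the sub-predictors' decision values, then the sign.
        if weights is None:
            weights = np.ones(len(models)) / len(models)  # uniform stand-in weights
        f = sum(w * m.decision_function(X) for w, m in zip(weights, models))
        return np.sign(f)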

Figure 3.1: A conceptual flowchart of the ensemble classifier framework.

3.2 Support Vector Machine (SVM)

Consider a classification problem using a training set $\{(\mathbf{x}_i, y_i)\}_{i=1}^N$, where $N$ denotes the total number of training data, $\mathbf{x}_i \in \mathbb{R}^{N_c}$ the input pattern whose dimension is $N_c$, and $y_i$ the corresponding binary output in $\{-1, 1\}$.

3.2.1 Linear Support Vector Machines

Separable Case

We start with the simplest case, a linear machine trained on a separable dataset. Suppose we have some hyperplane which separates the positive from the negative data. The points $\mathbf{x}$ which lie on the hyperplane satisfy $\mathbf{w}\cdot\mathbf{x} + b = 0$, where $\mathbf{w}$ is normal to the hyperplane and $|b|/\|\mathbf{w}\|$ is the perpendicular distance from the hyperplane to the origin. Defining the margin of a separating hyperplane as the sum of the shortest distances from the separating hyperplane to the closest positive and negative data, support vector learning simply looks for the separating hyperplane with the largest margin. Suppose that all training data satisfy the following constraint:

$$y_i(\mathbf{w}\cdot\mathbf{x}_i + b) \geq 1, \quad \forall i. \quad (3.1)$$

Now consider the points for which the equality in equation (3.1) holds. These points lie either on $\mathbf{w}\cdot\mathbf{x} + b = 1$ or $\mathbf{w}\cdot\mathbf{x} + b = -1$ with normal $\mathbf{w}$, and hence the margin is simply $2/\|\mathbf{w}\|$, as shown in Fig. 3.2. Thus we can find the pair of hyperplanes which gives the maximum margin by minimizing $\|\mathbf{w}\|^2$ subject to (3.1). Now, we introduce positive Lagrange multipliers $\alpha_i \geq 0$, one for each of the inequality constraints (3.1). This gives

$$L_a = \frac{1}{2}\|\mathbf{w}\|^2 - \sum_{i=1}^N \alpha_i y_i(\mathbf{w}\cdot\mathbf{x}_i + b) + \sum_{i=1}^N \alpha_i. \quad (3.2)$$

Figure 3.2: Linear separating hyperplanes for the separable case.

According to the Karush-Kuhn-Tucker conditions of (3.2), we have the following:

$$\frac{\partial L_a}{\partial \mathbf{w}} = 0 \;\Rightarrow\; \mathbf{w} = \sum_{i=1}^N \alpha_i y_i \mathbf{x}_i, \quad (3.3)$$

$$\frac{\partial L_a}{\partial b} = 0 \;\Rightarrow\; \sum_{i=1}^N \alpha_i y_i = 0. \quad (3.4)$$

Substituting the above conditions into (3.2), the dual form of the Lagrangian can be obtained as follows:

$$L_D = \sum_{i=1}^N \alpha_i - \frac{1}{2}\sum_{i,j} \alpha_i\alpha_j y_i y_j\,\mathbf{x}_i\cdot\mathbf{x}_j. \quad (3.5)$$

Support vector training therefore amounts to maximizing $L_D$ with respect to the $\alpha_i$, subject to constraint (3.4) and positivity of the $\alpha_i$, with solution (3.3), using quadratic programming. In the solution, the points for which $\alpha_i > 0$ are called support vectors, and they lie either on $\mathbf{w}\cdot\mathbf{x} + b = 1$ or $\mathbf{w}\cdot\mathbf{x} + b = -1$. All other training points have $\alpha_i = 0$ and lie outside these boundaries. The support vectors are the critical elements of the training set: if all non-support vectors were removed and training was repeated, the same separating hyperplane would be found.

Non-Separable Case

When the above algorithm is applied to a non-separable dataset, it will find no feasible solution. In order to extend the separable-case algorithm to handle non-separable datasets, we relax the constraints (3.1) by introducing positive slack variables $\xi_i$, $\forall i$, in the constraints, which become:

$$\mathbf{w}\cdot\mathbf{x}_i + b \geq 1 - \xi_i \quad \text{for } y_i = 1, \quad (3.6)$$

$$\mathbf{w}\cdot\mathbf{x}_i + b \leq -1 + \xi_i \quad \text{for } y_i = -1, \quad (3.7)$$

$$\xi_i \geq 0, \quad \forall i. \quad (3.8)$$

Now, the objective function to be minimized is changed from $\|\mathbf{w}\|^2/2$ to $\|\mathbf{w}\|^2/2 + C\sum_{i=1}^N \xi_i$, where $C$ is a tuning parameter; a large $C$ corresponds to assigning a higher penalty to errors. Then the dual problem becomes:

Maximize:
$$L_D = \sum_{i=1}^N \alpha_i - \frac{1}{2}\sum_{i,j}\alpha_i\alpha_j y_i y_j\,\mathbf{x}_i\cdot\mathbf{x}_j, \quad (3.9)$$

Subject to:
$$0 \leq \alpha_i \leq C, \quad (3.10)$$

$$\sum_{i=1}^N \alpha_i y_i = 0. \quad (3.11)$$

The only difference from the separable case is that the $\alpha_i$ now have an upper bound of $C$.

Output of the SVM

After training the support vector machine (SVM), we simply determine on which side of the decision boundary a given $\mathbf{x}$ lies and assign the class label by taking the following:

$$(\text{class}) = \mathrm{sgn}(\mathbf{w}\cdot\mathbf{x} + b). \quad (3.12)$$
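As a small illustration of the soft-margin formulation (3.9)-(3.12), the following Python sketch fits a linear SVM to synthetic two-class data and reads off the margin, the support vectors, and the sign-based label of (3.12). The use of scikit-learn and the data-generation choices are assumptions for illustration only, not this work's implementation.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, (50, 2)),   # negative class cluster
               rng.normal(+2.0, 1.0, (50, 2))])  # positive class cluster
y = np.hstack([-np.ones(50), np.ones(50)])

clf = SVC(kernel='linear', C=10.0).fit(X, y)     # large C: higher penalty on the slack
w, b = clf.coef_.ravel(), clf.intercept_[0]

print('margin 2/||w||     :', 2.0 / np.linalg.norm(w))
print('# support vectors  :', len(clf.support_vectors_))   # points with alpha_i > 0
print('label of the origin:', np.sign(w @ np.array([0.0, 0.0]) + b))  # eq. (3.12)
```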

Figure 3.3: Linear separating hyperplanes for the non-separable case.

3.2.2 Nonlinear Support Vector Machines

In order to generalize the linear support vector machine to the case where the decision function is not a linear function of the data, one can use a nonlinear mapping that maps the input data into a high-dimensional feature space, the so-called kernel trick. Consider the following nonlinear mapping:

$$\phi : \mathbb{R}^{N_c} \to \mathcal{H}, \quad (3.13)$$

where $\mathcal{H}$ is a kernel space. As shown in (3.9)-(3.11), the only way in which the data appear in the training problem is in the form of dot products $\mathbf{x}_i\cdot\mathbf{x}_j$. If there is a kernel function $\kappa$ such that $\kappa(\mathbf{x}_i, \mathbf{x}_j) = \phi(\mathbf{x}_i)\cdot\phi(\mathbf{x}_j)$, then only $\kappa(\cdot)$ is needed in the training problem, and we do not even have to know $\phi$ explicitly. As a popular example, we have the Gaussian kernel

$$\kappa(\mathbf{x}_i, \mathbf{x}_j) = \exp\big(-\|\mathbf{x}_i - \mathbf{x}_j\|^2/(2\sigma^2)\big). \quad (3.14)$$

For this Gaussian kernel, $\mathcal{H}$ is infinite-dimensional and it is hard to obtain $\phi$ explicitly. However, replacing the dot product $\mathbf{x}_i\cdot\mathbf{x}_j$ by the kernel function $\kappa(\mathbf{x}_i, \mathbf{x}_j)$, the algorithm produces a support vector machine. All the considerations above hold because the SVM is still performing linear separation, but in a different space, namely $\mathcal{H}$.

Now the SVM problem in the kernel space is the following:

Maximize:
$$L_D = \sum_{i=1}^N \alpha_i - \frac{1}{2}\sum_{i,j}\alpha_i\alpha_j y_i y_j\,\kappa(\mathbf{x}_i, \mathbf{x}_j), \quad (3.15)$$

Subject to:
$$0 \leq \alpha_i \leq C, \quad (3.16)$$

$$\sum_{i=1}^N \alpha_i y_i = 0. \quad (3.17)$$

The above problem can be solved by quadratic programming, and the output of the nonlinear SVM (a decision function) is

$$\varphi(\mathbf{x}) = \sum_{i=1}^N \alpha_i y_i\,\kappa(\mathbf{x}, \mathbf{x}_i) + b. \quad (3.18)$$

We can use the SVM in the test phase by computing the sign of $\varphi(\mathbf{x})$.

3.2.3 One-Class Support Vector Machine

Suppose we have a training set $\{\mathbf{x}_i\}_{i=1}^N$ and we are interested in estimating a set in which a new $\mathbf{x}$ lies with an a priori specified probability. This is fundamentally different from two-class classification problems, since we assume that data are available only from one class [44]. However, if we adopt the SVM concept described in the previous sections, it is much easier to represent the one-class problem. Similar to the two-class SVM, the one-class SVM tries to find the maximum-margin hyperplane which separates the data from the origin, with the following optimization problem:

Minimize:
$$L_P = \frac{1}{2}\|\mathbf{w}\|^2 + \frac{1}{\nu m}\sum_{i=1}^m \xi_i - b, \quad (3.19)$$

Subject to:
$$\mathbf{w}^T\phi(\mathbf{x}_i) \geq b - \xi_i, \quad \xi_i \geq 0, \quad (3.20)$$

where $\nu \in (0, 1]$ represents an upper bound on the fraction of data that may be outliers. This is analogous to the two-class SVM formulation. Now, introducing Lagrange multipliers $\alpha_i$, we arrive at the following quadratic program, which is the dual of the primal problem:

Minimize:
$$L_D = \frac{1}{2}\sum_{i,j}\alpha_i\alpha_j\,\kappa(\mathbf{x}_i, \mathbf{x}_j), \quad (3.21)$$

Subject to:
$$0 \leq \alpha_i \leq \frac{1}{\nu m}, \quad (3.22)$$

$$\sum_{i=1}^m \alpha_i = 1. \quad (3.23)$$

The above problem can be solved by quadratic programming, and the output of the one-class SVM (a decision function) is

$$\varphi(\mathbf{x}) = \sum_{i=1}^m \alpha_i\,\kappa(\mathbf{x}, \mathbf{x}_i) - b. \quad (3.24)$$
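A minimal sketch of this one-class formulation is shown below, again using scikit-learn as an illustrative stand-in for the LibSVM solver used in this work; the data and the values of $\nu$ and the kernel width are assumptions. The role of $\nu$ as an upper bound on the outlier fraction can be read off directly.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(1)
X = rng.normal(0.0, 1.0, (200, 2))        # samples available from one class only

ocs = OneClassSVM(kernel='rbf', gamma=0.5, nu=0.1).fit(X)   # nu bounds the outlier fraction
scores = ocs.decision_function(X)         # positive inside the estimated support, cf. (3.24)

print('fraction flagged as outliers:', float(np.mean(scores < 0)))  # roughly <= nu
print('number of support vectors  :', len(ocs.support_vectors_))
```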

3.3 Subpredictors and Ensemble Combination

In this section, the SVM training is described in a distributed fashion. We consider a situation where centralized fusion is not allowed, since we want to comply with the important properties of WSNs such as low communication complexity, scalability, flexibility, and redundancy. Our goals for distributed SVM training are as follows:

- The exchanged packets should be short enough to reduce the communication costs and battery usage. In particular, the packet length should be smaller than that of the conventional SVM, which communicates the entire dataset as in the centralized setting.
- The computation time should be small, because the local computing device of each agent has limited capability.
- The common global (aggregated) estimate should outperform, or at least match, the result of the conventional SVM in terms of accuracy and robustness.

3.3.1 Training Subpredictors: Local Ensemble

Consider a region of interest $I$, where $I$ is a discrete spatial domain, and a discrete time domain $t = 1, 2, 3, \ldots$. Assume that there are $K$ agents which collect data as input-output pairs $\{\mathbf{x}, y\}$ for SVM training; for example, $\mathbf{x} = q$ in some input space $X \subseteq I$, where $q \in I$ is the position of a measurement, and $y \in \{-1, 1\}$ (the label) is the measurement corresponding to $q$, or some function of the measurement, belonging to the binary output space $Y$. In a centralized setting, the robot network has its own fusion center which gathers information from all the nodes and performs massive computation to obtain the global SVM solution. This may incur a heavy communication load, which can cause packet loss, communication delay and large energy consumption, deteriorating the performance of the SVM.

However, in this study, each agent trains the model of the environmental information with the SVM algorithm separately, and combines its trained predictor with the predictors trained by its neighbours in an aggregation step. Suppose the $i$-th agent has a data stream $D_i$ from its own sensor. From $D_i$, the agent can train its local predictor $\varphi_i$ using the SVM algorithm described in the previous section. In the remainder of this chapter, the local predictor of the $i$-th agent trained with the data stream $D_i$ is denoted as $\varphi_i(\mathbf{x}, D_i)$.

3.3.2 Aggregation Ensemble

After building the $K$ sub-predictors $\varphi_1(\cdot), \ldots, \varphi_K(\cdot)$ with the SVM, we can merge the predictions of the agents by aggregation through a model averaging mechanism. Similar to the local ensemble of Section 3.3.1, we have the aggregated predictor

$$\varphi_A(\mathbf{x}) = \frac{1}{K}\sum_{i=1}^K \varphi_i(\mathbf{x}, D_i). \quad (3.25)$$

Assume that we have a conventional predictor $\varphi_C(\mathbf{x}, D)$ using the whole dataset $D = \cup_{i=1}^K D_i$. Our mission now is to use $D$ to get a better predictor than the single-learning-set predictor $\varphi_C(\mathbf{x}, D)$. If $y$ is numerically valued, then the aggregated predictor $\varphi_A(\mathbf{x})$ can be defined as the average of $\varphi_C(\mathbf{x}, D_i)$ over $i = 1, \ldots, K$, i.e.,

$$\varphi_A(\mathbf{x}) = E_D\{\varphi_C(\mathbf{x}, D)\}, \quad (3.26)$$

where $E_D$ denotes the expectation over $D$. The following procedure proves that the aggregated predictor has a smaller average prediction error than the conventional SVM. The error of the conventional predictor for each new input $\mathbf{x}$ is

$$\epsilon_C(\mathbf{x}) = E_D\big[(y - \varphi_C(\mathbf{x}, D))^2\big] = y^2 - 2y\,\varphi_A(\mathbf{x}) + E_D\big[\varphi_C^2(\mathbf{x}, D)\big]. \quad (3.27)$$

Using the inequality $(E\{Z\})^2 \leq E\{Z^2\}$ gives

$$\epsilon_C(\mathbf{x}) \geq y^2 - 2y\,\varphi_A(\mathbf{x}) + \big(E_D[\varphi_C(\mathbf{x}, D)]\big)^2 = E_D\big[(y - \varphi_A(\mathbf{x}))^2\big] = \epsilon_A(\mathbf{x}). \quad (3.28)$$

Thus, the error $\epsilon_A$ of the proposed predictor $\varphi_A$ is lower than the error $\epsilon_C$ of $\varphi_C$ for each new input $\mathbf{x}$.

3.4 Weighted Ensemble

This section describes how the aggregated predictor $\varphi_A(\cdot)$ can be formed as a weighted combination of sub-predictors, in order to reflect the tendency of the prediction performance of the sub-predictors over the workspace $I$. Let $D$ be the overall learning data, as mentioned in the previous section, and let $D_i$ be the learning dataset of the SVM-based sub-predictor $\varphi_i(\mathbf{x}, D_i)$, such that $D = \cup_{i=1}^K D_i$. Then the proposed aggregated predictor $\varphi_A(\cdot)$ is formed by the combination of the sub-predictors $\varphi_i(\cdot)$ as

$$\varphi_A(\mathbf{x}, D) = \sum_{i=1}^K \gamma_i\,\varphi_i(\mathbf{x}, D_i), \quad (3.29)$$

where the $\gamma_i$ are the weights. Here, we describe how to choose the proper weights $\gamma_i$ in two cases, depending on whether or not we know the variances of the measurements.

3.4.1 Known Variances

There are cases in which we know the variance (or covariance) of the measurements, such as our case, where each observation has its variance given by the Kalman process mentioned in Chapter 2. Such situations commonly arise when there is preprocessing before the SVM, such as an optimal filter or a Gaussian process. The

variance of an observation can be directly interpreted as its uncertainty, or its inverse can be interpreted as the reliability of the measurement. Therefore, letting $\sigma_{\eta_i}^2$ be the average of the variances of the data belonging to $D_i$, we have the following simple aggregation weight:

$$\gamma_i = \frac{1}{\sigma_{\eta_i}^2} \Big/ \sum_{j=1}^K \frac{1}{\sigma_{\eta_j}^2}, \quad (3.30)$$

which is a normalized mixture weight. According to this, a sub-predictor with a smaller variance is regarded as more reliable and is weighted more heavily. If the variances $\sigma_{\eta_i}^2$ all have an identical value, then the weights are the same, so that $\varphi_A(\cdot)$ is simply the arithmetic average of the sub-predictors.

3.4.2 Unknown Variances

In general, the measurements taken near each sensor (e.g. an acoustic sensor) deployed on an agent reveal a distinct dependency on the distance, as shown in Figs. 3.4 and 3.5. Thus, if the output $y$ corresponding to a query point $\mathbf{x}$ is far from the $i$-th agent, then the prediction accuracy of $\varphi_i$ will decrease. In order to account for this characteristic, we choose position-dependent weights $\gamma_i$ so that the sub-predictor of a local ensemble located nearer to the query position $\mathbf{x}$ is weighted more heavily. In particular, we use the following normalized mixture weights:

$$\gamma_i(\mathbf{x}) = \frac{\psi(\mathbf{x}, \mu_i, l_i)}{\sum_j \psi(\mathbf{x}, \mu_j, l_j)}, \quad (3.31)$$

where $\psi(q, \bar{q}, l) = e^{-\|q - \bar{q}\|^2/(\beta l)}$, $\beta$ is a positive constant, and $\mu_i$ is the mean of the dataset $D_i$:

$$\mu_i = \frac{1}{|D_i|}\sum_{n \in D_i} \mathbf{x}_n.$$

In (3.31), $l_i$ is the average distance between each data point and its nearest neighbour in the subset $D_i$, given by

$$l_i = \frac{1}{|D_i|}\sum_{n \in D_i}\; \min_{m \in D_i \setminus n} \|\mathbf{x}_m - \mathbf{x}_n\|, \quad (3.32)$$

where $|D_i|$ denotes the number of data belonging to the subset $D_i$, and $D_i \setminus n$ means that the $n$-th datum of $D_i$ is excluded. The values $\mu_i$ and $l_i$ are determined by the observed data distribution, and $l_i$ represents how dense each training dataset is. The term $\beta$ is a design parameter that determines how the sub-predictors are mixed. If $\beta$ is extremely large, then all the $\gamma_i$ defined in (3.31) will be almost the same, so that $\varphi_A(\cdot)$ in (3.29) will simply be the arithmetic average of the sub-predictors. On the other hand, if $\beta$ is small, then the shape of the function $\psi(\cdot,\cdot,\cdot)$ becomes narrow, so that near a given agent the predictor $\varphi_A(\cdot)$ is mainly determined by the sub-predictor trained on that agent's own dataset.
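The statistics $\mu_i$ and $l_i$ of (3.31)-(3.32) and the resulting mixture weights can be computed directly, as in the following Python sketch with numpy. The subsets and the value of $\beta$ are assumptions chosen only for illustration.

```python
import numpy as np

def subset_stats(D):
    # mu_i and l_i of (3.31)-(3.32): mean and average nearest-neighbour distance
    mu = D.mean(axis=0)
    dist = np.linalg.norm(D[:, None, :] - D[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)        # exclude each point from its own neighbourhood
    return mu, float(dist.min(axis=1).mean())

def mixture_weights(x, stats, beta=2.0):
    # gamma_i(x) of (3.31): sub-predictors trained near x are weighted more heavily
    psi = np.array([np.exp(-np.linalg.norm(x - mu)**2 / (beta * l)) for mu, l in stats])
    return psi / psi.sum()

rng = np.random.default_rng(2)
subsets = [rng.normal(c, 1.0, (40, 2)) for c in ([-5, -5], [-5, 5], [5, -5], [5, 5])]
stats = [subset_stats(D) for D in subsets]
print(mixture_weights(np.array([-4.0, -4.0]), stats))  # largest weight on the first subset
```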

Figure 3.4: Experimental measurement data in $I$; panels (a)-(i) are plotted over the EAST-NORTH plane.

Figure 3.5: Experimental measurements of a sound source with respect to the distance between the source and the sensor.

3.5 Performance Analysis

3.5.1 If the Variance of the Noise is Known

If the data stream has been pre-processed by an appropriate method (e.g. a Kalman filter) and we can obtain the variance (i.e. the uncertainty), as in our problem setting, it is more convenient to analyze the performance of the aggregation ensemble in the presence of noise. Assume the following additive error model for a predictor $y = \varphi(\mathbf{x})$ when the kernel used for the SVM training is linear:

$$\varphi(\mathbf{x}) = \hat{\varphi}(\mathbf{x}) + V(\mathbf{x}), \quad (3.33)$$

where $V(\mathbf{x})$ are random variables representing the output error. As shown in Fig. 3.6, $x^*$ is the true boundary of $\hat{\varphi}(\mathbf{x})$ and $x^+$ is the boundary of $\varphi(\mathbf{x})$, which is trained with noisy data. The misclassification region caused by the noise is then the dark region in Fig. 3.6, and we can define the expected error of $\varphi(\mathbf{x})$ as the area of this region. By linearizing the true predictor $\hat{\varphi}(\mathbf{x})$ at the point $x^*$, we have the following area of the triangle:

$$(\text{error}) = \frac{1}{2}\,E\big[|x^* - x^+|\,|V(x^*)|\big] = \frac{E[V^2]}{2s} = \frac{\sigma_V^2}{2s}, \quad (3.34)$$

where $s = \hat{\varphi}'(x^*)$ is the gradient of $\hat{\varphi}$ at $x^*$, which is independent of the noise model, and $\sigma_V^2$ is the variance of the additive noise. Thus, given the dataset $D = \cup_{i=1}^K D_i$, the conventional SVM is

$$\varphi_C(\mathbf{x}, D) = \hat{\varphi}_C(\mathbf{x}, D) + V_C, \quad (3.35)$$

and, assuming each dataset $D_i$ is of the same size, the standard deviation of $V_C$, i.e. $\sigma_{V_C}$, is

$$\sigma_{V_C} = \frac{1}{K}\sum_{i=1}^K \sigma_{V_i}, \quad (3.36)$$

where $\sigma_{V_i}$ is the standard deviation of $V_i$, and $V_i$ is the additive noise of the sub-predictor $\varphi_i(\mathbf{x}, D_i)$ defined as

$$\varphi_i(\mathbf{x}, D_i) = \hat{\varphi}_i(\mathbf{x}, D_i) + V_i. \quad (3.37)$$

From (3.36), it is easy to show that

$$\sigma_{V_C}^2 \geq \frac{1}{K^2}\sum_{i=1}^K \sigma_{V_i}^2. \quad (3.38)$$

For the aggregated SVM, we have the following:

$$\varphi_A(\mathbf{x}) = \frac{\sum_{i=1}^K \gamma_i\varphi_i(\mathbf{x})}{\sum_{i=1}^K \gamma_i} = \frac{\sum_{i=1}^K \gamma_i\hat{\varphi}_i(\mathbf{x}) + \sum_{i=1}^K \gamma_i V_i}{\sum_{i=1}^K \gamma_i} = \hat{\varphi}_A(\mathbf{x}) + \frac{\sum_{i=1}^K \gamma_i V_i}{\sum_{i=1}^K \gamma_i}. \quad (3.39)$$

From equation (3.39), the noise term of the aggregation ensemble is

$$V_A = \frac{\sum_{i=1}^K \gamma_i V_i}{\sum_{i=1}^K \gamma_i}, \quad (3.40)$$

and the variance of $V_A$ can be obtained as

$$\sigma_{V_A}^2 = \frac{\sum_{i=1}^K \gamma_i^2\,\sigma_{V_i}^2}{\big(\sum_{i=1}^K \gamma_i\big)^2}. \quad (3.41)$$

By setting $\gamma_i = 1/\sigma_{V_i}^2$ as in (3.30), we have

$$\sigma_{V_A}^2 = \frac{1}{\sum_{i=1}^K \frac{1}{\sigma_{V_i}^2}}. \quad (3.42)$$

From the inequality between the harmonic and arithmetic means, we know that

$$\frac{K}{\sum_{i=1}^K \frac{1}{\sigma_{V_i}^2}} \leq \frac{1}{K}\sum_{i=1}^K \sigma_{V_i}^2, \quad (3.43)$$

and furthermore, by substituting (3.38) and (3.42), we have

$$\sigma_{V_A}^2 = \frac{1}{\sum_{i=1}^K \frac{1}{\sigma_{V_i}^2}} \leq \frac{1}{K^2}\sum_{i=1}^K \sigma_{V_i}^2 \leq \sigma_{V_C}^2. \quad (3.44)$$

Since the expected error of the SVM is proportional to the variance, as noted in (3.34), we can conclude from (3.44) that the aggregation ensemble is more accurate than the conventional SVM.

3.5.2 If the Variance of the Noise is Unknown

It is difficult to perform a general performance analysis against sensor noise if nothing is known about the variance of the noise. Here we try to observe the effect of aggregation on the noise sensitivity in a very special setting, by assuming the following additive error model for a predictor $y = \varphi(\mathbf{x})$:

$$\varphi(\mathbf{x}) = \hat{\varphi}(\mathbf{x}) + V(\mathbf{x}), \quad (3.45)$$

Figure 3.6: Error region associated with the additive noise.

where $V(\mathbf{x})$ are random variables representing the output error. As the distance between a measurement and the sensor deployed on the agent increases, the predicted results become more sensitive to the noise. Fig. 3.7 shows the prediction error as a function of the distance between the target position and the agent position, $\|\mathbf{x} - \mu\|$. According to Fig. 3.7, the error pattern can be approximated as an exponential function of $\|\mathbf{x} - \mu\|$, so we consider an output error model of the following exponential form:

$$V(\mathbf{x}) := k_e(\mathbf{x})\,e^{\|\mathbf{x} - \mu\|^2/(\beta l)}, \quad (3.46)$$

where $k_e(\mathbf{x})$ is a positive function, $\beta$ is a positive constant, and $(\mu, l)$ are the mean and the average nearest-neighbour distance of the data points in the workspace. Using (3.31), (3.45), and (3.46), we can compare the output errors of the conventional predictor and the proposed predictor analytically. For each predictor, we obtain the following

expressions:

$$\varphi_C(\mathbf{x}) = \hat{\varphi}_C(\mathbf{x}) + k_e(\mathbf{x})\,e^{\|\mathbf{x} - \mu_C\|^2/(\beta l_C)}, \quad (3.47)$$

$$\varphi_A(\mathbf{x}) = \hat{\varphi}_A(\mathbf{x}) + \sum_{i=1}^K \gamma_i k_e(\mathbf{x})\,e^{\|\mathbf{x} - \mu_i\|^2/(\beta l_i)} = \hat{\varphi}_A(\mathbf{x}) + \frac{K k_e(\mathbf{x})}{\sum_{j=1}^K e^{-\|\mathbf{x} - \mu_j\|^2/(\beta l_j)}}, \quad (3.48)$$

where $(\mu_C, l_C)$ are the mean and the average nearest-neighbour distance of the whole workspace dataset, and $(\mu_i, l_i)$ are those of $D_i$. Considering a special case in which all data are distributed uniformly, we can set

$$l_C = l_j = l, \quad j = 1, 2, \ldots, K. \quad (3.49)$$

Taking $(X, Y)$ to be random variables corresponding to $(\mathbf{x}, y)$ and letting $E_{X,Y}[\|X - \mu_C\|^2] = r^2$, we have the following form:

$$E_{X,Y}[\|X - \mu_i\|^2] = \frac{r^2}{M_i^2}, \quad (3.50)$$

where $M_i \geq 1$ denotes the ratio of the size of $D$ over the size of $D_i$; so if $K = 1$, then $M_i = 1$. We can compare the expectations of $V_C$ and $V_A$, the conventional and aggregated prediction errors, by taking the logarithms of (3.47) and (3.48):

$$E_{X,Y}[\ln(V_C)] = E_{X,Y}[\ln(k_e(X))] + \frac{r^2}{\beta l}, \quad (3.51)$$

$$E_{X,Y}[\ln(V_A)] = E_{X,Y}[\ln(k_e(X))] + E_{X,Y}\!\left[\ln\!\left(\frac{K}{\sum_{j=1}^K e^{-\|X - \mu_j\|^2/(\beta l)}}\right)\right]. \quad (3.52)$$

Applying the inequality of arithmetic and geometric means gives

$$\frac{1}{K}\sum_{j=1}^K e^{-\|X - \mu_j\|^2/(\beta l)} \geq \left(e^{-\sum_{j=1}^K \|X - \mu_j\|^2/(\beta l)}\right)^{1/K}. \quad (3.53)$$

Equation (3.52) and the inversion of the inequality (3.53) give

$$E_{X,Y}[\ln(V_A)] \leq E_{X,Y}[\ln(k_e(X))] + \frac{1}{K}\sum_{j=1}^K E_{X,Y}\!\left[\frac{\|X - \mu_j\|^2}{\beta l}\right] = E_{X,Y}[\ln(k_e(X))] + \frac{1}{K}\sum_{j=1}^K \frac{r^2}{\beta l M_j^2} \leq E_{X,Y}[\ln(k_e(X))] + \frac{r^2}{\beta l M_{\min}^2}, \quad (3.54)$$

where $M_{\min} = \min_i M_i$. Since $M_{\min} \geq 1$, from (3.51) and (3.54) we conclude the following:

$$E_{X,Y}[V_A] \leq E_{X,Y}[V_C]. \quad (3.55)$$

This shows that the proposed method is more accurate than the conventional predictor in the presence of noise.
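To make the variance comparison of Section 3.5.1 concrete, the small numeric check below evaluates the three terms of the inequality chain (3.38), (3.42)-(3.44) for a set of assumed sub-predictor noise levels; the standard deviations in the sketch are hypothetical.

```python
import numpy as np

sig = np.array([0.1, 0.2, 0.2, 0.4])      # assumed sub-predictor noise std-devs sigma_Vi
K = len(sig)

var_conv = sig.mean()**2                  # sigma_VC^2 with sigma_VC = (1/K) sum_i sigma_Vi, (3.36)
middle   = (sig**2).sum() / K**2          # (1/K^2) sum_i sigma_Vi^2, middle term of (3.44)
var_aggr = 1.0 / (1.0 / sig**2).sum()     # inverse-variance aggregation, eq. (3.42)

print(var_aggr, middle, var_conv)         # 0.0064, 0.015625, 0.050625
print(var_aggr <= middle <= var_conv)     # True: the ordering claimed in (3.44)
```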

Figure 3.7: Error analysis.

3.6 Algorithm Validation

In this section, we validate the proposed algorithm by simulating a simple example. Consider four agents which collect binary data along paths in their own regions, as shown in Fig. 3.8. With the event, i.e. the region of interest, located at the center of the workspace, the agents try to identify the separating boundary using the proposed ensemble SVM. Fig. 3.9 shows the data collected by each agent, and we now compare the results of the proposed ensemble method with those of the conventional SVM. For convenience, we set the number of data collected by each agent to be the same. The red + markers indicate positive data and the black x markers indicate negative data; the red line is the true (reference) boundary of this scenario.

We start with the conventional SVM. Since the conventional SVM $\varphi_C(\mathbf{x}, D)$ is a single-learning-set predictor, the raw data streams $D_i$ from all the agents must be collected as $D$ and sent to each agent, which causes a significant communication load. Moreover, the number of training data (actually used by the SVM algorithm) is so large that the local computation device is overburdened. Fig. 3.10 shows the overall collected data, i.e. the training data for the conventional SVM. Meanwhile, the ensemble method only shares the predictors $\varphi_i(\mathbf{x}, D_i)$, i.e. the support vectors and corresponding coefficients, whose number is much smaller than the size of the whole dataset, i.e. $|D_i| \gg |SV_i|$, where $SV_i$ is the set of support vectors of $\varphi_i$. Furthermore, since the sub-predictors only use their local data rather than the whole dataset for SVM training, the local computation is simple, i.e. $|D| \gg |D_i|$.

Fig. 3.11 shows the learning results of the conventional SVM and the proposed ensemble SVM when there is no noise, i.e. no outliers. The red + markers indicate positive data and the black x markers indicate negative data; the red line is the true (reference) boundary of this scenario and the blue line is the trained result. As shown in (3.28), the (unweighted) ensemble method outperforms the conventional SVM in

terms of accuracy. For a detailed quantitative analysis, we repeat the same scenario with random datasets, as shown in Fig. 3.12. In the table, we perform six trials, where each agent collects 25 data in Trials 1, 2, 3, and 125 data in Trials 4, 5, 6. We use the SMO algorithm for training the SVM with Matlab R2013a. The error in the table is the misclassification rate, and the exchanged data is the number of data communicated between agents. As shown in Fig. 3.12, the misclassification rate of the conventional SVM is about 3 times that of the proposed ensemble SVM. Furthermore, the computation time of the conventional SVM is about 10 times longer, and the amount of exchanged data is also larger than for the ensemble SVM. The computation time difference can be explained by a computational complexity analysis. According to [46], the SMO algorithm for SVM learning has a computational complexity between $O(n)$ and $O(n^{2.3})$. In this simulation, the conventional SVM uses $4n$ training data, whereas the local SVMs of the ensemble approach use $n$ training data each. Therefore, the complexity of the conventional SVM is $4 \sim 24$ times larger than the complexity of the ensemble SVM, consistent with the results in Fig. 3.12. These results demonstrate that the proposed ensemble SVM outperforms the conventional SVM in terms of accuracy, computation time and communication load.

In practical applications, however, measurements are accompanied by distortion and error, which may cause incorrect learning results. To test the effect of measurement noise, Gaussian noise with various variances is intentionally added to the raw measurements. Fig. 3.13 shows the comparison between the conventional method and the proposed method when such noise is added. Here, we use a variance of 0.1 for agents 1 and 3, and 0.2 for agents 2 and 4. As shown in the figure, there are some outliers caused by the noise, and the conventional SVM learns the errors introduced by the noise. However, as shown in Fig. 3.13, the ensemble combination method gives more accurate results in the presence of the noise. Fig. 3.14 shows the comparison between the conventional method and the proposed method when the variance of the noise is double that of the previous case, i.e. 0.2 for agents 1 and 3 and 0.4 for agents 2 and 4. Even at this larger noise level, the aggregated SVM gives relatively accurate results. For a de-

tailed quantitative analysis, we repeat the same scenario with random datasets, as shown in Fig. 3.15. In this table, we perform eight trials, where each agent collects 25 data in Trials 1, 2, 3, 4, and 125 data in Trials 5, 6, 7, 8. We use the SMO algorithm for training the SVM with Matlab R2013a, as before. Here, the noise is the average variance over the whole dataset. As shown in Fig. 3.15, the misclassification rate of the conventional SVM is more than 2 times that of the proposed ensemble SVM. Even with noisy training data, the ensemble shows better performance than the conventional SVM, and these results demonstrate that the proposed ensemble SVM outperforms the conventional SVM in terms of accuracy. In Fig. 3.16, the misclassification rates are shown as the variance of the noise varies from 0.1 to 0.35. As the variance of the noise becomes larger, the performance difference between the conventional SVM and the ensemble method increases, which supports the claim that the proposed method is more accurate than the conventional SVM in the presence of noise.

Figure 3.8: Simulation scenario for validation of the ensemble SVM.

Figure 3.9: Data gathered from each agent.

Figure 3.10: Collected data from each agent.

Figure 3.11: Learning results when there is no noise in the data, using (a) the conventional SVM and (b) the ensemble combination method.

Figure 3.12: Results when there is no noise in the dataset. The table reports, for Trials 1-6, the noise variance, error (%), computation time (sec), and the amount of exchanged data for the conventional and ensemble SVMs.

Figure 3.13: Accuracy test for noisy data whose variances are 0.1 for agents 1 and 3 and 0.2 for agents 2 and 4, using (a) the conventional SVM and (b) the ensemble combination method.

Figure 3.14: Accuracy test for noisy data whose variances are 0.2 for agents 1 and 3 and 0.4 for agents 2 and 4, using (a) the conventional SVM and (b) the ensemble combination method.

Figure 3.15: Results when there is noise in the dataset. The table reports, for Trials 1-8, the noise variance, error (%), computation time (sec), and the amount of exchanged data for the conventional and ensemble SVMs.

Figure 3.16: Comparison of the misclassification error with varying noise level.

4 Environmental Boundary Tracking

4.1 Problem Description

This chapter describes an algorithmic framework of coordination for collective behaviors, especially boundary tracking, using multi-robot (or multi-agent) systems. Consider $K$ mobile robots deployed over a workspace $I \subset \mathbb{R}^2$ and a physical event area $Y$ whose boundary is a single closed curve. Let $S_i(t) \subseteq I$ be the set of points within the sensing range of robot $i$ at time $t$, for $i = 1, \ldots, K$. From the obtained binary measurements, robot $i$ discriminates whether each element (point) of $S_i(t)$ lies in $Y$ or not. The detected points $\mathbf{x} \in S_i(t) \subseteq I$ are then stacked in a training dataset $D_i(t)$ for $i = 1, \ldots, K$. From the training datasets $D_i$, the one-class SVM classification is performed to specify the boundary of the physical event area. A detailed description of the ensemble SVM training process with $D_i(t) \subseteq I$ for $i = 1, \ldots, K$ was introduced in the previous chapter. The dynamic model of the $K$ agents that we study is subject to planar steering and

acceleration control as follows:

$$\dot{x}_{i1} = v_i\cos\theta_i, \quad \dot{x}_{i2} = v_i\sin\theta_i, \quad \dot{\theta}_i = u_i, \quad \dot{v}_i = a_i, \quad i = 1, \ldots, K, \quad (4.1)$$

where $\mathbf{x}_i = [x_{i1}, x_{i2}]^T$ and $\theta_i$ are the position and heading angle of each node, $v_i$ is the magnitude of the robot speed, and $u_i$ and $a_i$ are the planar steering and acceleration controls, respectively. The goal of this work is to develop motion plans for the mobile robot networks to efficiently obtain an expression of the environmental boundary in a mathematical form, such as the result of the SVM training. Using real-time feedback and on-line coordination of the agents, the SVM training and the coordinated motion of the platforms are handled simultaneously. Moreover, the agents move along the temporary boundary (obtained from the SVM training) in an evenly-spaced configuration. The algorithm we propose is based on a Lyapunov vector field for convergence of the boundary tracking and on cooperative phase control for stabilization of the collective motion.

The rest of this chapter is organized as follows: in Section 4.2, the SVM-based Lyapunov vector field approach is introduced. Using Lyapunov stability theory, the convergence law is derived from the SVM decision function. Coordination of multiple robots for the collective behavior is described in Section 4.3, and the simulation results are presented in Section 4.4.

4.2 SVM-based Lyapunov Vector Field

In this section, a description of the Lyapunov vector field approach using the result of one-class SVM training and its stability analysis are presented. With a single mobile sensor

example, the whole procedure of boundary tracking is illustrated in the last subsection.

4.2.1 SVM-based Potential

Solving the SVM problem, the decision function $\varphi(\mathbf{x})$, which is positive when $\mathbf{x}$ belongs to the event area and negative when $\mathbf{x}$ lies outside the event area, is given by

$$\varphi(\mathbf{x}) = \sum_{i=1}^{n_{SV}} \alpha_i\,\kappa(\mathbf{x}, \mathbf{x}_i) - b, \quad (4.2)$$

where the $\mathbf{x}_i$ and $\alpha_i$ for $i = 1, \ldots, n_{SV}$ are the support vectors and corresponding Lagrangian coefficients, $b > 0$ is the bias term, and $n_{SV}$ is the number of support vectors. From the result of the one-class SVM described in (4.2), we can decide that a point $\mathbf{x} \in I$ lies in $Y$ if the following criterion is satisfied:

$$\varphi(\mathbf{x}) > 0. \quad (4.3)$$

And obviously the following is the boundary equation:

$$\varphi(\mathbf{x}) = \sum_{i=1}^{n_{SV}} \alpha_i\,\kappa(\mathbf{x}, \mathbf{x}_i) - b = 0. \quad (4.4)$$

Fig. 4.1 shows an example of the one-class SVM training. In Fig. 4.1(a), the red + markers are the training data points and the contour plot indicates the boundary $\varphi(\mathbf{x}) = 0$. Interestingly, $\varphi(\mathbf{x})$ is a potential-like function whose magnitude decreases as $\mathbf{x}$ approaches the boundary, reaching zero when $\mathbf{x}$ is on the boundary. Using these characteristics of the function $\varphi(\mathbf{x})$, we construct the boundary-tracking vector field described in Section 4.2.2.
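The decision function (4.2) is just a weighted Gaussian mixture minus a bias, so it is cheap to evaluate on a grid, and the zero level set of the result is the boundary (4.4). The following Python sketch does this for hand-picked support vectors, coefficients, bias, and kernel width, all of which are assumptions for illustration rather than the output of an actual training run.

```python
import numpy as np

def phi_grid(sv, alpha, b, sigma, lim=10.0, n=200):
    # evaluate phi(x) = sum_i alpha_i kappa(x, x_i) - b of (4.2) on an n-by-n grid
    xs = np.linspace(-lim, lim, n)
    gx, gy = np.meshgrid(xs, xs)
    pts = np.stack([gx.ravel(), gy.ravel()], axis=1)
    d2 = ((pts[:, None, :] - sv[None, :, :])**2).sum(axis=-1)
    phi = np.exp(-d2 / (2.0 * sigma**2)) @ alpha - b
    return xs, phi.reshape(n, n)          # the contour phi = 0 is the boundary (4.4)

sv = np.array([[0.0, 3.0], [3.0, 0.0], [0.0, -3.0], [-3.0, 0.0]])  # assumed support vectors
alpha = np.full(4, 0.25)                                           # sum to 1, cf. (3.23)
xs, phi = phi_grid(sv, alpha, b=0.1, sigma=2.0)
print('interior positive:', phi[100, 100] > 0)   # grid centre lies inside the event area
```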

Figure 4.1: Results of the one-class SVM: (a) the boundary specification with training data points, and (b) the potential-like function.

4.2.2 Lyapunov-based Vector Field

For the guidance of the mobile robots, we want to construct a vector field $\dot{\mathbf{x}}_d$ which provides the desired velocity vector at the position of the robot, $\mathbf{x} = [x_1, x_2]^T$. Desired asymptotic convergence of $\mathbf{x}(t)$ to a boundary is produced by an attracting force. In this case, there exist infinitely many equilibrium points on the boundary, and $\mathbf{x}$ converges asymptotically to a fixed position when it arrives at the boundary. For tracking the boundary, however, the robots need to keep moving along the boundary to maintain their searching capability. In order to allow asymptotic motion of the robots with nonzero velocity, we consider a limit-cyclic motion which contains no equilibrium points. Lyapunov stability theory provides a good inspiration for constructing such a vector field, as introduced in [31]. Let $V = \varphi^2(\mathbf{x}) = \big(\sum_i \alpha_i\kappa(\mathbf{x}, \mathbf{x}_i) - b\big)^2$, consider $Q(\mathbf{x})$ such that

$$\nabla V^T Q(\mathbf{x}) = 0, \quad (4.5)$$

and consider the following integral curves of the vector field:

$$\dot{\mathbf{x}} = -\frac{\beta}{2}\nabla V + Q(\mathbf{x}, t), \quad (4.6)$$

where $\beta$ is a positive value. Under the assumption that $V$ is not an explicit function of time, for any $\mathbf{x} \in I$ the time derivative of $V(\mathbf{x})$ is given by

$$\dot{V}(\mathbf{x}) = \underbrace{\frac{\partial V}{\partial t}}_{=0} + \nabla V^T\dot{\mathbf{x}} = -\frac{\beta}{2}\|\nabla V\|^2 = -2\beta\varphi^2\|\nabla\varphi\|^2, \quad (4.7)$$

and we can conclude that $\dot{V}(\mathbf{x}) \leq 0$, because $\beta$ is positive. Therefore, the proposed vector field (4.6) provides a trajectory that converges to the boundary, i.e. $\varphi(\mathbf{x}) = 0$, in the region satisfying $\nabla\varphi \neq 0$.

4.2.3 Vector Field Construction

Here, the vector field (4.6) is analyzed and constructed with a desired agent speed $v_d(t)$. The first term of the vector field (4.6) is a direct attraction opposite to the gradient of $V(\mathbf{x})$, i.e. it decreases $V(\mathbf{x})$, and the second term points in a tangential direction which is always normal to the gradient of $V(\mathbf{x})$. By employing the cooperative coordination presented in the next section, it is possible to choose $\beta$ and $Q(\mathbf{x}, t)$ so that the magnitude of the vector field (4.6) matches the desired mobile agent speed $v_d(t)$:

$$\|\dot{\mathbf{x}}\| = \Big\|-\frac{\beta}{2}\nabla V + Q(\mathbf{x}, t)\Big\| = v_d(t). \quad (4.8)$$

The Lyapunov vector field (4.6) is obtained with the definitions

$$Q(\mathbf{x}, t) = \rho\beta\,(\hat{z} \times \nabla\varphi), \quad (4.9)$$

$$\beta = \frac{v_d}{\|\nabla\varphi\|\sqrt{\varphi^2 + \rho^2}}, \quad (4.10)$$

where $\hat{z}$ is a unit vector orthogonal to the $x_1$-$x_2$ plane, and the sign of $\rho$ determines the direction of circulation along the boundary. The magnitude of $\rho$ also controls the relative strength of the circulation and attraction forces, as shown in Fig. 4.2. Substituting (4.2), (4.10) and (4.9) into the vector field (4.6), we obtain the desired vector field as

$$\dot{\mathbf{x}} = \underbrace{-\left(\frac{\varphi\,v_d}{\sqrt{\varphi^2 + \rho^2}}\right)\frac{\nabla\varphi}{\|\nabla\varphi\|}}_{\text{attraction}} + \underbrace{\left(\frac{\rho\,v_d}{\sqrt{\varphi^2 + \rho^2}}\right)\frac{\hat{z} \times \nabla\varphi}{\|\nabla\varphi\|}}_{\text{circulation}}. \quad (4.11)$$

The first term of equation (4.11) is along the direction opposite to the gradient of the Lyapunov function $V$, weighted by $\varphi\,v_d(t)/\sqrt{\varphi^2 + \rho^2}$, and the second term is along the orthogonal direction of the gradient, i.e. a tangential direction of the boundary, weighted by $\rho\,v_d(t)/\sqrt{\varphi^2 + \rho^2}$.
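Since $\kappa$ is Gaussian, $\nabla\varphi$ is available in closed form, so the field (4.11) can be evaluated pointwise. The Python sketch below does this; the support vectors, coefficients, and the values of $v_d$ and $\rho$ are assumptions, and a small constant guards the $\nabla\varphi = 0$ case discussed in Section 4.2.4.

```python
import numpy as np

def phi_and_grad(x, sv, alpha, b, sigma):
    # phi(x) = sum_i alpha_i kappa(x, x_i) - b of (4.2) and its analytic gradient
    d = x - sv                                            # rows: x - x_i
    k = np.exp(-np.sum(d**2, axis=1) / (2.0 * sigma**2))
    return alpha @ k - b, -(alpha * k) @ d / sigma**2

def vector_field(x, sv, alpha, b, sigma, v_d=1.0, rho=2.0):
    # desired velocity of (4.11): attraction down the gradient plus circulation
    phi, g = phi_and_grad(x, sv, alpha, b, sigma)
    gn = np.linalg.norm(g) + 1e-9                         # guard against grad phi = 0
    scale = v_d / np.hypot(phi, rho)                      # v_d / sqrt(phi^2 + rho^2)
    attraction  = -scale * phi * g / gn
    circulation =  scale * rho * np.array([-g[1], g[0]]) / gn   # z-hat cross grad phi
    return attraction + circulation

sv = np.array([[0.0, 3.0], [3.0, 0.0], [0.0, -3.0], [-3.0, 0.0]])  # assumed SVM output
alpha = np.full(4, 0.25)
xdot = vector_field(np.array([6.0, 0.0]), sv, alpha, b=0.1, sigma=2.0)
print(xdot, np.linalg.norm(xdot))   # the speed equals v_d by construction, cf. (4.8)
```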

Fig. 4.2 shows the vector field for the example in Fig. 4.1 with a constant desired velocity $v_d(t) = 2$, which yields a globally attractive limit cyclic path. Fig. 4.2(a) uses the parameter $\rho = 2$ and Fig. 4.2(b) uses $\rho = 4$. As expected, the vector field with the smaller $\rho$ converges to the boundary faster. Fig. 4.3 shows the boundary tracking results for a single mobile agent. The mobile agent moves toward the boundary obtained from the one-class SVM training. As presented in Fig. 4.3(d), the mobile agent tracks the boundary curve. Here, the black contour line denotes the boundary curve, the red + markers are data points, the red circles are support vectors, and the blue circles indicate the trajectory of the single mobile agent.

Figure 4.2: Lyapunov vector fields satisfying a globally attractive limit cyclic path: (a) $\rho = 2$, and (b) $\rho = 4$.

Figure 4.3: Simulation results of the boundary tracking for a single mobile agent at (a) t = 16 sec, (b) t = 32 sec, (c) t = 48 sec and (d) t = 64 sec. The boundary curve is updated during the simulation.

4.2.4 Analysis on $\nabla\varphi = 0$

Since the output of the one-class SVM is in the form of a Gaussian mixture, there may exist points satisfying $\nabla\varphi = 0$, and it is very difficult to obtain such points explicitly, although the analysis of those points is important. In particular, among the equilibrium points satisfying $\nabla\varphi = 0$, the local maxima (outside the boundary) or the local minima (inside the boundary) cause trapped behaviors, so that the agents cannot reach the boundary. Fortunately, it is possible to show that there are no local maxima in the region outside the boundary. Consider a Gaussian mixture with two components in one dimension:

$$F_1(x) = \alpha_1\kappa(x, x_1) + \alpha_2\kappa(x, x_2), \quad (4.12)$$

where $\alpha_1 \geq \alpha_2$. A Gaussian mixture with two components has two cases of modality, unimodal or bimodal, as shown in Fig. 4.4.

Figure 4.4: Modality of a Gaussian mixture with two components: (left) unimodal and (right) bimodal.

The unimodal case takes the form of merged components and its modality is unity, while the bimodal case maintains the mode of each component. According to the conjecture

studied in [45], there exists no third mode in a mixture of two Gaussian functions. This also holds for $n$ components in $d$ dimensions, when the variance terms of the Gaussians are isotropic and identical, as follows:

$$F_d(\mathbf{x}) = \sum_{i=1}^n \alpha_i\,\kappa(\mathbf{x}, \mathbf{x}_i). \quad (4.13)$$

It is obvious that a merged mode has a local maximum whose value is larger than the values of the local maxima of the original components. Since $\alpha_1 \geq \alpha_2$ in the example, we can say that $F_1(x^*) \geq F_1(x_2)$ for any local maximum $x^*$. Now, revisit our own problem, the one-class SVM. Since we deal with the problem in two-dimensional space, the output of the one-class SVM is

$$\varphi(\mathbf{x}) = \sum_{i=1}^{n_{SV}} \alpha_i\,\kappa(\mathbf{x}, \mathbf{x}_i) - b = F_2(\mathbf{x}) - b, \quad (4.14)$$

where the $\mathbf{x}_i$ are the support vectors with corresponding Lagrange multipliers $\alpha_1 \geq \cdots \geq \alpha_{n_{SV}} > 0$, $b > 0$, and $n_{SV}$ is their number. Since the support vectors always lie in the interior of the boundary, we have

$$F_2(\mathbf{x}_{n_{SV}}) > b. \quad (4.15)$$

For any local maximum $\mathbf{x}^*$, we know that $F_2(\mathbf{x}^*) \geq F_2(\mathbf{x}_{n_{SV}}) > b$, and the following holds:

$$\varphi(\mathbf{x}^*) = F_2(\mathbf{x}^*) - b > 0. \quad (4.16)$$

The condition (4.16) means that every local maximum lies in the interior of the boundary, and we can conclude that there is no local maximum outside the boundary if Carreira-Perpinan's conjecture is true.
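The claim that every mode of $F_2$ lies inside the boundary can be spot-checked numerically by running gradient ascent from each support vector and comparing the attained value of $F_2$ with $b$. The Python sketch below does so for an assumed mixture; the step size and iteration count are arbitrary illustrative choices.

```python
import numpy as np

def ascend(x, sv, alpha, sigma, steps=500, lr=0.5):
    # gradient ascent on F_2(x) = sum_i alpha_i kappa(x, x_i) to find a nearby mode
    for _ in range(steps):
        d = x - sv
        k = np.exp(-np.sum(d**2, axis=1) / (2.0 * sigma**2))
        x = x - lr * ((alpha * k) @ d) / sigma**2   # step along +grad F_2
    return x

sv = np.array([[0.0, 3.0], [3.0, 0.0], [0.0, -3.0], [-3.0, 0.0]])  # assumed support vectors
alpha, sigma, b = np.full(4, 0.25), 2.0, 0.1

for s in sv:
    m = ascend(s.astype(float), sv, alpha, sigma)
    F2 = alpha @ np.exp(-np.sum((m - sv)**2, axis=1) / (2.0 * sigma**2))
    print(m.round(3), F2 > b)   # every attained mode satisfies F_2 > b, i.e. (4.16)
```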

4.3 Coordination of Multiple Mobile Agents

In this section, we describe the cooperative boundary tracking method. For efficient boundary tracking, multiple agents need to be controlled into an evenly-spaced configuration. We define virtual phases of the mobile agents with respect to the integrated trajectories along the boundary. Assuming that the boundary $B$ is a single closed curve, we can consider the period of circulation along the boundary with the following integral trajectory equation:

$$\int_0^T \dot{\mathbf{x}}_d\,dt = 0, \quad T > 0, \quad \mathbf{x}_d(0) \in B, \quad (4.17)$$

where the initial condition $\mathbf{x}_d(0)$ can be any point on the boundary curve $B$. Since $\dot{\mathbf{x}}_d$ is the limit cyclic vector field, there exist an infinite number of solutions, and we choose the minimum $T$ from the solution set. In a similar manner, we also consider the time of arrival at the standard point for mobile agent $k$. The standard point $\mathbf{x}_s$ can be any point in $B$. The time of arrival can be obtained from the following equation:

$$\int_0^{T_k} \dot{\mathbf{x}}_d\,dt = \mathbf{x}_s - \mathbf{x}_k, \quad T_k \geq 0, \quad \mathbf{x}_d(0) = \mathbf{x}_k. \quad (4.18)$$

Now we define the virtual phase $\phi_k$ of mobile agent $k$ as

$$\phi_k = \frac{2\pi T_k}{T}. \quad (4.19)$$

In this work, to simplify the calculation of $T_k$, we handle the case of $\phi_k \in [0, 2\pi)$, i.e., $T_k < T$, which means that we consider the cooperative motion control only for the mobile sensors near the boundary. In practice, it is difficult to obtain a standard point lying exactly on the boundary $B$. This makes it hard to calculate the absolute value of the virtual phase $\phi_k$. However, we still use the concept of the virtual phase, since we use the relative virtual

Figure 4.5: The virtual phase diagram of mobile agents: $\phi_k$ is the virtual phase of agent $k$, controlled by $v_k$.

phase rather than the absolute value (the detailed explanation follows). Fig. 4.5 shows a virtual phase diagram of the mobile agents, where the phase changes are presented as a moving vector on a unit circle with speed $v_k(t)$. Now the configuration problem of the mobile agents is equivalent to controlling the magnitude of the phase order parameter $p_\phi$ [?], defined as

$$p_\phi = \frac{1}{K}\sum_{k=1}^K [\cos\phi_k, \sin\phi_k]^T. \quad (4.20)$$

The magnitude of $p_\phi$, which satisfies $\|p_\phi\| \leq 1$, is proportional to the level of overlap of the phases. For fully overlapped phases, $\|p_\phi\| = 1$, and the condition $\|p_\phi\| = 0$ indicates a balanced configuration of phases. In order to space the mobile agents evenly, we drive $\|p_\phi\|$ to zero by controlling the speed of each agent, $v_k(t) = v_0 + \Delta v_k$, where $v_0$ is the nominal speed and $\Delta v_k$ is its balancing offset. Consider the following phase potential definition:

$$U(\phi) = \frac{K}{2}\|p_\phi\|^2, \quad (4.21)$$

where $\phi = [\phi_1, \ldots, \phi_K]$. To minimize the phase potential for the balanced phase configuration, we set the desired balancing offset of agent $k$ as

$$\Delta v_{d,k} = -\eta\frac{\partial U}{\partial\phi_k} = \frac{\eta}{K}\sum_{j=1}^K \sin\phi_{kj}, \quad (4.22)$$

where $\eta$ is the convergence rate of the collective configuration and $\phi_{kj} = \phi_k - \phi_j$. Since the calculation of $\Delta v_{d,k}$ in (4.22) only needs phase differences, the standard point $\mathbf{x}_s$ used in (4.18) and (4.19) is no longer required. Therefore, the proposed virtual phase is applicable in practice. The rest of the process is the same as in Section 4.2.3, with $v_d(t) = v_0 + \Delta v_{d,k}$.

4.4 Simulation Results

In order to validate the proposed boundary tracking method with multiple robots, we perform simulations. With an arbitrarily chosen physical event, the boundary is derived by the one-class SVM. Using the output of the one-class SVM, we set the Lyapunov function and construct the vector field based on it. Fig. 4.6 shows the result of the boundary tracking with 10 mobile agents. Fig. 4.6(a) shows the initial positions (red triangles), the trajectories (blue solid curves) and the final configuration (black circles) of the agents. The red + markers are measurements, the red circles are support vectors, and the contour plot is $\varphi(\mathbf{x}) = 0$, the boundary curve. As shown in Fig. 4.6(a), the final positions of the agents are well balanced in an evenly-spaced configuration. Fig. 4.6(b) shows the differences of the virtual phases: as time goes on, the phase differences between agents become well balanced. The histories of the Lyapunov function $V(\mathbf{x}) = \frac{1}{2}\varphi^2(\mathbf{x})$ and the phase potential $U(\phi)$ are shown in Fig. 4.7. The Lyapunov function quickly decreases to zero as the agents arrive at the boundary, and then the phase potential starts to decrease.
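As a concrete reference for the coordination rule used in these simulations, the following Python sketch computes $p_\phi$, $U(\phi)$ and the speed offsets of (4.20)-(4.22) once the virtual phases are available. The phase values and the gain $\eta$ are assumptions for illustration.

```python
import numpy as np

def phase_order(phases):
    # p_phi of (4.20): the mean of the unit vectors at the virtual phases
    return np.array([np.cos(phases).mean(), np.sin(phases).mean()])

def speed_offsets(phases, eta=0.5):
    # Delta v_{d,k} = (eta/K) sum_j sin(phi_k - phi_j), from (4.22)
    diff = phases[:, None] - phases[None, :]
    return eta * np.sin(diff).sum(axis=1) / len(phases)

phases = np.array([0.1, 0.3, 2.0, 4.5])            # assumed virtual phases (rad)
p = phase_order(phases)
print('U(phi) =', 0.5 * len(phases) * p @ p)       # phase potential (4.21)
print('offsets:', speed_offsets(phases))           # drives U(phi) toward zero over time
```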

Figure 4.6: Results of the boundary tracking with 10 mobile robots: (a) trajectories of the agents and the final configuration, and (b) the relative virtual phases over the time steps.

Figure 4.7: Results of the boundary tracking with 10 mobile robots: (a) the history of the Lyapunov function and (b) the history of the phase potential $U(\phi)$.

5 Experimental Validations and Results

5.1 Experimental Setup

In order to validate the proposed methods and identify practical issues, we perform experiments in an indoor environment. In the indoor setting, it is convenient to observe the characteristics associated with the parameters of the proposed methods and analyze their relationships, because the environment is static and the effect of noise is less significant. Furthermore, the global positioning of the agents (robots) is accurate, so the experiments can be more reliable than in an outdoor situation. The whole experimental setup is shown in Fig. 5.1. We consider the workspace whose size is m². The Vicon multi-camera system is used for the global positioning of each agent. The Vicon system uses infrared cameras to detect pre-deployed markers, so that users can obtain information about the global position and attitude of each agent. In this work, we use both flying robots and ground robots; in particular, the flying robots are used for the active sensing and the ground robots are used for the boundary tracking. We employ the Ar-drone 2.0 quadrotors developed by Parrot as the

flying robots. The Ar-drone 2.0 has two cameras, one directed forward and the other downward. We use the Ardrone autonomy driver, developed based on ROS by the Autonomy Lab of Simon Fraser University; it is very useful for controlling and acquiring data from the Ar-drone through wifi communication. For the ground robot, we use the Stella B2, manufactured by NTREX, which is a two-wheeled mobile robot. In order to send commands from the user computer, we implement an Arduino with a wifi add-on. To apply the proposed algorithms and methods to the multi-robot systems, we also develop interfaces between the user computer and the multiple robots. The interfaces are developed in C++ in a Linux environment, and the interface and algorithm of each agent are processed thread by thread (the computation is fully decoupled, so that we can validate the algorithms for distributed platforms). The flight interface is developed for the flying robots and contains an image processor, the path planner for active sensing, and the ensemble SVM. In order to solve the SVM, we employ LibSVM, an open-source library for solving various SVM problems. The drive interface is developed for the ground robots and contains the path planner. Furthermore, both the flight interface and the drive interface contain the Vicon data streaming and wifi communication modules. In the scenario we consider, the flying robots move around the workspace for the active sensing of the red region, which we set as the event of interest. They use the downward-facing camera to find the boundary with the ensemble SVM method. The specified boundary is transmitted to the ground robot system, so that the robots construct vector fields to track the boundary and stabilize their configuration simultaneously.

Figure 5.1: Experimental setup in the indoor environment.


More information

Decentralized Control of Stochastic Systems

Decentralized Control of Stochastic Systems Decentralized Control of Stochastic Systems Sanjay Lall Stanford University CDC-ECC Workshop, December 11, 2005 2 S. Lall, Stanford 2005.12.11.02 Decentralized Control G 1 G 2 G 3 G 4 G 5 y 1 u 1 y 2 u

More information

CS 484 Data Mining. Classification 7. Some slides are from Professor Padhraic Smyth at UC Irvine

CS 484 Data Mining. Classification 7. Some slides are from Professor Padhraic Smyth at UC Irvine CS 484 Data Mining Classification 7 Some slides are from Professor Padhraic Smyth at UC Irvine Bayesian Belief networks Conditional independence assumption of Naïve Bayes classifier is too strong. Allows

More information

Cover Page. The handle holds various files of this Leiden University dissertation

Cover Page. The handle  holds various files of this Leiden University dissertation Cover Page The handle http://hdl.handle.net/1887/39637 holds various files of this Leiden University dissertation Author: Smit, Laurens Title: Steady-state analysis of large scale systems : the successive

More information

Algorithm-Independent Learning Issues

Algorithm-Independent Learning Issues Algorithm-Independent Learning Issues Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Spring 2007 c 2007, Selim Aksoy Introduction We have seen many learning

More information

PATTERN CLASSIFICATION

PATTERN CLASSIFICATION PATTERN CLASSIFICATION Second Edition Richard O. Duda Peter E. Hart David G. Stork A Wiley-lnterscience Publication JOHN WILEY & SONS, INC. New York Chichester Weinheim Brisbane Singapore Toronto CONTENTS

More information

Lecture 4: Perceptrons and Multilayer Perceptrons

Lecture 4: Perceptrons and Multilayer Perceptrons Lecture 4: Perceptrons and Multilayer Perceptrons Cognitive Systems II - Machine Learning SS 2005 Part I: Basic Approaches of Concept Learning Perceptrons, Artificial Neuronal Networks Lecture 4: Perceptrons

More information

Support Vector Machine. Industrial AI Lab. Prof. Seungchul Lee

Support Vector Machine. Industrial AI Lab. Prof. Seungchul Lee Support Vector Machine Industrial AI Lab. Prof. Seungchul Lee Classification (Linear) Autonomously figure out which category (or class) an unknown item should be categorized into Number of categories /

More information

Overfitting, Bias / Variance Analysis

Overfitting, Bias / Variance Analysis Overfitting, Bias / Variance Analysis Professor Ameet Talwalkar Professor Ameet Talwalkar CS260 Machine Learning Algorithms February 8, 207 / 40 Outline Administration 2 Review of last lecture 3 Basic

More information

A Decentralized Approach to Multi-agent Planning in the Presence of Constraints and Uncertainty

A Decentralized Approach to Multi-agent Planning in the Presence of Constraints and Uncertainty 2011 IEEE International Conference on Robotics and Automation Shanghai International Conference Center May 9-13, 2011, Shanghai, China A Decentralized Approach to Multi-agent Planning in the Presence of

More information

Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function.

Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function. Bayesian learning: Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function. Let y be the true label and y be the predicted

More information

Riccati difference equations to non linear extended Kalman filter constraints

Riccati difference equations to non linear extended Kalman filter constraints International Journal of Scientific & Engineering Research Volume 3, Issue 12, December-2012 1 Riccati difference equations to non linear extended Kalman filter constraints Abstract Elizabeth.S 1 & Jothilakshmi.R

More information

Bayesian Methods for Machine Learning

Bayesian Methods for Machine Learning Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),

More information

CSCI-567: Machine Learning (Spring 2019)

CSCI-567: Machine Learning (Spring 2019) CSCI-567: Machine Learning (Spring 2019) Prof. Victor Adamchik U of Southern California Mar. 19, 2019 March 19, 2019 1 / 43 Administration March 19, 2019 2 / 43 Administration TA3 is due this week March

More information

Computer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo

Computer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo Group Prof. Daniel Cremers 10a. Markov Chain Monte Carlo Markov Chain Monte Carlo In high-dimensional spaces, rejection sampling and importance sampling are very inefficient An alternative is Markov Chain

More information

Instance-based Learning CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2016

Instance-based Learning CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2016 Instance-based Learning CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2016 Outline Non-parametric approach Unsupervised: Non-parametric density estimation Parzen Windows Kn-Nearest

More information

ON SEPARATION PRINCIPLE FOR THE DISTRIBUTED ESTIMATION AND CONTROL OF FORMATION FLYING SPACECRAFT

ON SEPARATION PRINCIPLE FOR THE DISTRIBUTED ESTIMATION AND CONTROL OF FORMATION FLYING SPACECRAFT ON SEPARATION PRINCIPLE FOR THE DISTRIBUTED ESTIMATION AND CONTROL OF FORMATION FLYING SPACECRAFT Amir Rahmani (), Olivia Ching (2), and Luis A Rodriguez (3) ()(2)(3) University of Miami, Coral Gables,

More information

Need for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels

Need for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels Need for Deep Networks Perceptron Can only model linear functions Kernel Machines Non-linearity provided by kernels Need to design appropriate kernels (possibly selecting from a set, i.e. kernel learning)

More information

CS599 Lecture 1 Introduction To RL

CS599 Lecture 1 Introduction To RL CS599 Lecture 1 Introduction To RL Reinforcement Learning Introduction Learning from rewards Policies Value Functions Rewards Models of the Environment Exploitation vs. Exploration Dynamic Programming

More information

Neural Network Training

Neural Network Training Neural Network Training Sargur Srihari Topics in Network Training 0. Neural network parameters Probabilistic problem formulation Specifying the activation and error functions for Regression Binary classification

More information

Unsupervised Learning with Permuted Data

Unsupervised Learning with Permuted Data Unsupervised Learning with Permuted Data Sergey Kirshner skirshne@ics.uci.edu Sridevi Parise sparise@ics.uci.edu Padhraic Smyth smyth@ics.uci.edu School of Information and Computer Science, University

More information

RECENTLY, wireless sensor networks have been the object

RECENTLY, wireless sensor networks have been the object IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 4, APRIL 2007 1511 Distributed Sequential Bayesian Estimation of a Diffusive Source in Wireless Sensor Networks Tong Zhao, Student Member, IEEE, and

More information

Linear Dynamical Systems

Linear Dynamical Systems Linear Dynamical Systems Sargur N. srihari@cedar.buffalo.edu Machine Learning Course: http://www.cedar.buffalo.edu/~srihari/cse574/index.html Two Models Described by Same Graph Latent variables Observations

More information

저작자표시 2.0 대한민국 이용자는아래의조건을따르는경우에한하여자유롭게 이저작물을복제, 배포, 전송, 전시, 공연및방송할수있습니다. 이차적저작물을작성할수있습니다. 이저작물을영리목적으로이용할수있습니다. 저작자표시. 귀하는원저작자를표시하여야합니다.

저작자표시 2.0 대한민국 이용자는아래의조건을따르는경우에한하여자유롭게 이저작물을복제, 배포, 전송, 전시, 공연및방송할수있습니다. 이차적저작물을작성할수있습니다. 이저작물을영리목적으로이용할수있습니다. 저작자표시. 귀하는원저작자를표시하여야합니다. 저작자표시 2.0 대한민국 이용자는아래의조건을따르는경우에한하여자유롭게 이저작물을복제, 배포, 전송, 전시, 공연및방송할수있습니다. 이차적저작물을작성할수있습니다. 이저작물을영리목적으로이용할수있습니다. 다음과같은조건을따라야합니다 : 저작자표시. 귀하는원저작자를표시하여야합니다. 귀하는, 이저작물의재이용이나배포의경우, 이저작물에적용된이용허락조건을명확하게나타내어야합니다.

More information

Introduction to Machine Learning Midterm Exam

Introduction to Machine Learning Midterm Exam 10-701 Introduction to Machine Learning Midterm Exam Instructors: Eric Xing, Ziv Bar-Joseph 17 November, 2015 There are 11 questions, for a total of 100 points. This exam is open book, open notes, but

More information

Classification CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2012

Classification CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2012 Classification CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2012 Topics Discriminant functions Logistic regression Perceptron Generative models Generative vs. discriminative

More information

CS 231A Section 1: Linear Algebra & Probability Review

CS 231A Section 1: Linear Algebra & Probability Review CS 231A Section 1: Linear Algebra & Probability Review 1 Topics Support Vector Machines Boosting Viola-Jones face detector Linear Algebra Review Notation Operations & Properties Matrix Calculus Probability

More information

Autonomous Navigation for Flying Robots

Autonomous Navigation for Flying Robots Computer Vision Group Prof. Daniel Cremers Autonomous Navigation for Flying Robots Lecture 6.2: Kalman Filter Jürgen Sturm Technische Universität München Motivation Bayes filter is a useful tool for state

More information

저작권법에따른이용자의권리는위의내용에의하여영향을받지않습니다.

저작권법에따른이용자의권리는위의내용에의하여영향을받지않습니다. 저작자표시 - 비영리 - 변경금지 2.0 대한민국 이용자는아래의조건을따르는경우에한하여자유롭게 이저작물을복제, 배포, 전송, 전시, 공연및방송할수있습니다. 다음과같은조건을따라야합니다 : 저작자표시. 귀하는원저작자를표시하여야합니다. 비영리. 귀하는이저작물을영리목적으로이용할수없습니다. 변경금지. 귀하는이저작물을개작, 변형또는가공할수없습니다. 귀하는, 이저작물의재이용이나배포의경우,

More information

Learning Methods for Linear Detectors

Learning Methods for Linear Detectors Intelligent Systems: Reasoning and Recognition James L. Crowley ENSIMAG 2 / MoSIG M1 Second Semester 2011/2012 Lesson 20 27 April 2012 Contents Learning Methods for Linear Detectors Learning Linear Detectors...2

More information

Distributed Estimation and Detection for Smart Grid

Distributed Estimation and Detection for Smart Grid Distributed Estimation and Detection for Smart Grid Texas A&M University Joint Wor with: S. Kar (CMU), R. Tandon (Princeton), H. V. Poor (Princeton), and J. M. F. Moura (CMU) 1 Distributed Estimation/Detection

More information

Consensus Algorithms are Input-to-State Stable

Consensus Algorithms are Input-to-State Stable 05 American Control Conference June 8-10, 05. Portland, OR, USA WeC16.3 Consensus Algorithms are Input-to-State Stable Derek B. Kingston Wei Ren Randal W. Beard Department of Electrical and Computer Engineering

More information

Mathematical Optimization Models and Applications

Mathematical Optimization Models and Applications Mathematical Optimization Models and Applications Yinyu Ye Department of Management Science and Engineering Stanford University Stanford, CA 94305, U.S.A. http://www.stanford.edu/ yyye Chapters 1, 2.1-2,

More information

AN INFORMATION THEORY APPROACH TO WIRELESS SENSOR NETWORK DESIGN

AN INFORMATION THEORY APPROACH TO WIRELESS SENSOR NETWORK DESIGN AN INFORMATION THEORY APPROACH TO WIRELESS SENSOR NETWORK DESIGN A Thesis Presented to The Academic Faculty by Bryan Larish In Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy

More information

Lecture 21: Spectral Learning for Graphical Models

Lecture 21: Spectral Learning for Graphical Models 10-708: Probabilistic Graphical Models 10-708, Spring 2016 Lecture 21: Spectral Learning for Graphical Models Lecturer: Eric P. Xing Scribes: Maruan Al-Shedivat, Wei-Cheng Chang, Frederick Liu 1 Motivation

More information

VCMC: Variational Consensus Monte Carlo

VCMC: Variational Consensus Monte Carlo VCMC: Variational Consensus Monte Carlo Maxim Rabinovich, Elaine Angelino, Michael I. Jordan Berkeley Vision and Learning Center September 22, 2015 probabilistic models! sky fog bridge water grass object

More information

Text Mining. Dr. Yanjun Li. Associate Professor. Department of Computer and Information Sciences Fordham University

Text Mining. Dr. Yanjun Li. Associate Professor. Department of Computer and Information Sciences Fordham University Text Mining Dr. Yanjun Li Associate Professor Department of Computer and Information Sciences Fordham University Outline Introduction: Data Mining Part One: Text Mining Part Two: Preprocessing Text Data

More information

Machine Learning Lecture 7

Machine Learning Lecture 7 Course Outline Machine Learning Lecture 7 Fundamentals (2 weeks) Bayes Decision Theory Probability Density Estimation Statistical Learning Theory 23.05.2016 Discriminative Approaches (5 weeks) Linear Discriminant

More information

Mathematical Formulation of Our Example

Mathematical Formulation of Our Example Mathematical Formulation of Our Example We define two binary random variables: open and, where is light on or light off. Our question is: What is? Computer Vision 1 Combining Evidence Suppose our robot

More information

STATE GENERALIZATION WITH SUPPORT VECTOR MACHINES IN REINFORCEMENT LEARNING. Ryo Goto, Toshihiro Matsui and Hiroshi Matsuo

STATE GENERALIZATION WITH SUPPORT VECTOR MACHINES IN REINFORCEMENT LEARNING. Ryo Goto, Toshihiro Matsui and Hiroshi Matsuo STATE GENERALIZATION WITH SUPPORT VECTOR MACHINES IN REINFORCEMENT LEARNING Ryo Goto, Toshihiro Matsui and Hiroshi Matsuo Department of Electrical and Computer Engineering, Nagoya Institute of Technology

More information

Research on Consistency Problem of Network Multi-agent Car System with State Predictor

Research on Consistency Problem of Network Multi-agent Car System with State Predictor International Core Journal of Engineering Vol. No.9 06 ISSN: 44-895 Research on Consistency Problem of Network Multi-agent Car System with State Predictor Yue Duan a, Linli Zhou b and Yue Wu c Institute

More information

Chapter 6 Classification and Prediction (2)

Chapter 6 Classification and Prediction (2) Chapter 6 Classification and Prediction (2) Outline Classification and Prediction Decision Tree Naïve Bayes Classifier Support Vector Machines (SVM) K-nearest Neighbors Accuracy and Error Measures Feature

More information

Midterm Review CS 6375: Machine Learning. Vibhav Gogate The University of Texas at Dallas

Midterm Review CS 6375: Machine Learning. Vibhav Gogate The University of Texas at Dallas Midterm Review CS 6375: Machine Learning Vibhav Gogate The University of Texas at Dallas Machine Learning Supervised Learning Unsupervised Learning Reinforcement Learning Parametric Y Continuous Non-parametric

More information

Discrete-time Consensus Filters on Directed Switching Graphs

Discrete-time Consensus Filters on Directed Switching Graphs 214 11th IEEE International Conference on Control & Automation (ICCA) June 18-2, 214. Taichung, Taiwan Discrete-time Consensus Filters on Directed Switching Graphs Shuai Li and Yi Guo Abstract We consider

More information

5682 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 55, NO. 12, DECEMBER /$ IEEE

5682 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 55, NO. 12, DECEMBER /$ IEEE 5682 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 55, NO. 12, DECEMBER 2009 Hyperplane-Based Vector Quantization for Distributed Estimation in Wireless Sensor Networks Jun Fang, Member, IEEE, and Hongbin

More information

Midterm Review CS 7301: Advanced Machine Learning. Vibhav Gogate The University of Texas at Dallas

Midterm Review CS 7301: Advanced Machine Learning. Vibhav Gogate The University of Texas at Dallas Midterm Review CS 7301: Advanced Machine Learning Vibhav Gogate The University of Texas at Dallas Supervised Learning Issues in supervised learning What makes learning hard Point Estimation: MLE vs Bayesian

More information

Computer Vision Group Prof. Daniel Cremers. 14. Sampling Methods

Computer Vision Group Prof. Daniel Cremers. 14. Sampling Methods Prof. Daniel Cremers 14. Sampling Methods Sampling Methods Sampling Methods are widely used in Computer Science as an approximation of a deterministic algorithm to represent uncertainty without a parametric

More information

Mobile Robot Localization

Mobile Robot Localization Mobile Robot Localization 1 The Problem of Robot Localization Given a map of the environment, how can a robot determine its pose (planar coordinates + orientation)? Two sources of uncertainty: - observations

More information

Introduction to Restricted Boltzmann Machines

Introduction to Restricted Boltzmann Machines Introduction to Restricted Boltzmann Machines Ilija Bogunovic and Edo Collins EPFL {ilija.bogunovic,edo.collins}@epfl.ch October 13, 2014 Introduction Ingredients: 1. Probabilistic graphical models (undirected,

More information

Machine Learning 2017

Machine Learning 2017 Machine Learning 2017 Volker Roth Department of Mathematics & Computer Science University of Basel 21st March 2017 Volker Roth (University of Basel) Machine Learning 2017 21st March 2017 1 / 41 Section

More information

Mark Gales October y (x) x 1. x 2 y (x) Inputs. Outputs. x d. y (x) Second Output layer layer. layer.

Mark Gales October y (x) x 1. x 2 y (x) Inputs. Outputs. x d. y (x) Second Output layer layer. layer. University of Cambridge Engineering Part IIB & EIST Part II Paper I0: Advanced Pattern Processing Handouts 4 & 5: Multi-Layer Perceptron: Introduction and Training x y (x) Inputs x 2 y (x) 2 Outputs x

More information

Machine Learning for Signal Processing Bayes Classification and Regression

Machine Learning for Signal Processing Bayes Classification and Regression Machine Learning for Signal Processing Bayes Classification and Regression Instructor: Bhiksha Raj 11755/18797 1 Recap: KNN A very effective and simple way of performing classification Simple model: For

More information

Efficient Sensor Network Planning Method. Using Approximate Potential Game

Efficient Sensor Network Planning Method. Using Approximate Potential Game Efficient Sensor Network Planning Method 1 Using Approximate Potential Game Su-Jin Lee, Young-Jin Park, and Han-Lim Choi, Member, IEEE arxiv:1707.00796v1 [cs.gt] 4 Jul 2017 Abstract This paper addresses

More information

Learning Gaussian Process Models from Uncertain Data

Learning Gaussian Process Models from Uncertain Data Learning Gaussian Process Models from Uncertain Data Patrick Dallaire, Camille Besse, and Brahim Chaib-draa DAMAS Laboratory, Computer Science & Software Engineering Department, Laval University, Canada

More information