Optimal path planning using Cross-Entropy method
F. Celeste, F. Dambreville
CEP/Dept. of Geomatics Imagery Perception, 94114 Arcueil, France
{francis.celeste, frederic.dambreville}@etca.fr

J-P. Le Cadre
IRISA/CNRS, Campus de Beaulieu, 35042 Rennes, France
lecadre@irisa.fr

Abstract - This paper addresses the problem of optimizing the navigation of an intelligent mobile in a real-world environment described by a map. The map is composed of features representing natural landmarks in the environment. The vehicle is equipped with a sensor which allows it to obtain range and bearing measurements from observed landmarks during execution. These measurements are correlated with the map to estimate the vehicle's position. The optimal trajectory must be designed so as to control a measure of the performance of the filtering algorithm used for the localization task. As the mobile state and the measurements are random, a well-suited measure is a functional of the approximate Posterior Cramér-Rao Bound. A natural way to plan an optimal path is to use this measure of performance within a (constrained) Markov Decision Process framework. However, because of the characteristics of this functional, Dynamic Programming is generally inapplicable. To cope with that, we investigate a learning approach based on the Cross-Entropy method.

Keywords: Markov Decision Process, planning, estimation, Posterior Cramér-Rao Bound, Cross-Entropy method.

1 Introduction

In this paper we are concerned with the task of finding a plan for a vehicle moving around in its environment, that is to say, reaching a given goal position from an initial position. In many applications it is crucial to be able to estimate accurately the state of the mobile during the execution of the plan, so it seems necessary to couple the planning and execution stages. One way to achieve that aim is to propose trajectories along which the performance of the localization algorithm can be guaranteed. This problem has been well studied, and in most approaches the environment is
discretized and described as a graph whose nodes correspond to particular areas and whose edges are actions to move from one place to another. Some of these previous contributions address the problem within a Markov Decision Process (MDP) framework. Such approaches also use a graph representation of the mobile's state space and provide a theoretical framework to solve the problem in some cases. In the present paper, we also use the constrained MDP framework, as in [1]. Our optimality criterion is based on the Posterior Cramér-Rao bound. However, the nature of the objective function for path planning makes it impossible to perform a complete optimization of the MDP with classical means. Indeed, we will show that the reward in one stage of our MDP depends on the whole history of the trajectory. To solve the problem, the Cross-Entropy method, originally used for rare-event simulation, seemed a valuable tool.

The paper is organized in five parts. In the second section we introduce the problem in detail. Section three deals with the Posterior Cramér-Rao bound and its properties; we also derive its formulation for our problem and the criterion for the optimization. In section four, we give a short introduction to the Cross-Entropy method and show how to apply it to our needs. Finally, in section five the results of a first example are discussed.

2 Problem statement

Let X_k, A_k and Z_k respectively denote the state of the vehicle, the action and the vector of observations at time k. We consider that the state vector X_k is the position of the vehicle along the x-axis and y-axis, so that X_k = (x_k, y_k). The action vector A_k = (a_k^x, a_k^y) is restricted to the set {-d, 0, +d} x {-d, 0, +d}, with d ∈ R+ a given constant. So the state process {X_k} is governed by a dynamic model and an initial uncertainty, given by:

X_0 ~ π_0 = N(X_0, P_0)
X_{k+1} = f(X_k, A_k) + w_k,  w_k ~ N(0, Q_k)

where {w_k} is a white noise sequence and the symbol ~ means "distributed according to". The known map M is composed of N_f features (m_i)_{1≤i≤N_f} with respective positions (x_i, y_i) ∈ R². At time k, due to the sensor
capabilities, only a few landmarks can be observed. So the observation process depends on the number of visible features. Moreover, we do not consider data association and non-detection problems for the moment; that is to say, each observation is made from one landmark represented in the map. If we denote I_v(k) the indexes

¹ The symbol N denotes the normal (Gaussian) distribution.
of the landmarks visible at time k, the measurement vector received at time k is Z_k = {z_k(j)}_{j ∈ I_v(k)}, with, for j ∈ I_v(k):

z_k^r(j) = sqrt((x_k - x_j)² + (y_k - y_j)²) + γ_k^r(j)
z_k^β(j) = arctan((y_j - y_k)/(x_j - x_k)) - θ_k + γ_k^β(j)

where θ_k is the global mobile orientation, and {γ_k^r(j)} and {γ_k^β(j)} are white Gaussian noises, mutually independent. Nevertheless, a more complex modeling must be considered if we want to take into account correlated errors in the map² or in the observations³.

2.1 A Markov Decision Process model with constraints

As in [1], we first formulate the problem within the Markov Decision Process framework. Generally, a Markov Decision Process is defined by its state and action sets, the state transition probabilities and reward functions. In our case, the state and action spaces are discrete and finite. Indeed, the map is discretized into N_s = N_x × N_y locations, and one action is defined as a move between two neighboring points (figure 1), where N_x and N_y are the numbers of grid points along each axis. At each point, the mobile can choose one action among N_a. So we can model the process with:

- S = {1, ..., N_s} the state space,
- A = {1, ..., N_a} the action space,
- T_ss'(a) = Pr(s_{k+1} = s' | s_k = s, a_k = a) the transition function,
- R_ss'(a) the reward or cost function associated with the transition from state s to state s' for an action a.

Finding the optimal path between two states s_i and s_f is equivalent to determining the best sequence of actions or decisions V_a = (a_0, ..., a_K) allowing to connect them while optimizing a (global) criterion. In our application, we consider only 8 possible actions from one state (figure 1). Each of them can be selected, except at points located on the border of the map and in places where obstacles can be found. Moreover, we take into account operational constraints on the mobile dynamics between two consecutive epochs, as in [1]. It is possible to do that by defining an authorized transition matrix δ(a_k, a_{k+1}) which indicates which actions can be chosen at time k+1 according to the
choice at time k. For example, if only [-π/2, π/2] heading controls are allowed, such a matrix can be expressed as below:

² e.g. Markov Random Fields
³ e.g. colored noise or biases

Figure 1: MDP grid with states (red crosses). Trajectories with (solid line) and without (dashed line) [-π/2, π/2] constrained heading changes.

        1 1 1 0 0 0 1 1
        1 1 1 1 0 0 0 1
        1 1 1 1 1 0 0 0
δ =     0 1 1 1 1 1 0 0     (rows: a_k, columns: a_{k+1})
        0 0 1 1 1 1 1 0
        0 0 0 1 1 1 1 1
        1 0 0 0 1 1 1 1
        1 1 0 0 0 1 1 1

The actions are numbered clockwise from 1 ("go up and right") to 8 ("go up"). The above δ matrix indicates that if action 1 is chosen at time k, only actions 1, 2, 3, 7 and 8 can be selected at time k+1. Moreover, the mobile orientation θ_k is restricted to values directly linked with the orientations of the elementary displacements.

In the Markov Decision Process context, when the reward is completely known, an optimal policy can be computed using Dynamic Programming or similar algorithms such as Value Iteration and Policy Iteration [8]. They are based on Bellman's principle of optimality. However, the applicability of this principle implies some basic assumptions on the cost functional. This is not the case in our context, where the criterion of optimality depends on the Posterior Cramér-Rao Bound (PCRB) matrix introduced in the next section. Indeed, as shown in [2], a cost functional must satisfy the Matrix Dynamic Programming Property to guarantee the principle of optimality. For the determinant functional used in our work, this property is not verified [2].

3 Posterior Cramér-Rao Bound

3.1 Definition

In this section, we briefly recall the properties of the Cramér-Rao bound for estimation problems. Let X̂(Z)
be an unbiased estimator of an unknown random vector X ∈ R^d, based on random observations Z. The PCRB is given by the inverse of the Fisher Information Matrix (FIM) F [9] and measures the minimum mean square error of X̂(Z):

E{(X̂(Z) - X)(X̂(Z) - X)^T} ⪰ F^{-1}(X, Z),    (1)

where A ⪰ B means that (A - B) is positive semidefinite, and

F ≜ E[ -Δ_X^X log p_{x,z}(X, Z) ],

where Δ_X^X is the second-order partial derivatives operator and p_{x,z} is the joint probability density of (X, Z). For the filtering case, it can be shown [4] that:

E{(X̂_k - X_k)(X̂_k - X_k)^T | V_a} ⪰ J_k^{-1}(V_a)

with X_k the state at time k, X̂_k = X̂_k(Z(1:k), V_a) the estimate based on Z(1:k), all the measurements observed until time k, and V_a = {a_1, ..., a_k} the sequence of chosen actions until time k.

3.2 Tichavsky PCRB recursion

A remarkable contribution of Tichavsky et al. [4] was to introduce a Riccati-like recursion for computing J_k:

J_{k+1} = D_k^22 - D_k^21 (J_k + D_k^11)^{-1} D_k^12,    (2)

where

D_k^11 = E{ -Δ_{X_k}^{X_k} log p(X_{k+1} | X_k) },    (3)
D_k^12 = E{ -Δ_{X_k}^{X_{k+1}} log p(X_{k+1} | X_k) },    (4)
D_k^21 = [D_k^12]^T,    (5)
D_k^22 = E{ -Δ_{X_{k+1}}^{X_{k+1}} log p(X_{k+1} | X_k) } + E{ -Δ_{X_{k+1}}^{X_{k+1}} log p(Z_{k+1} | X_{k+1}) }.    (6-7)

The relation is valid under the assumption that the mentioned second-order derivatives, expectations and matrix inverses exist. The initial information matrix J_0 can be calculated from π_0. The measurements only contribute through the D_k^22 term. The dynamics being linear, the following equations are easily derived:

J_0 = P_0^{-1},    (8)
D_k^11 = Q_k^{-1},  D_k^12 = -Q_k^{-1},  D_k^21 = [-Q_k^{-1}]^T,  D_k^22 = Q_k^{-1} + J_{k+1}(Z).

The observation contribution J_{k+1}(Z) is given by the following expression:

J_{k+1}(Z) = E_X { Σ_{j ∈ I_v(X_{k+1})} H(X_{k+1}, j)^T R_{k+1}^{-1} H(X_{k+1}, j) },

where

H(X, j) = [ ∂z^r(j)/∂x   ∂z^r(j)/∂y
            ∂z^β(j)/∂x   ∂z^β(j)/∂y ].    (9)

Actually, this is a lower bound which may nevertheless be reasonably accurate. R_{k+1} is the covariance matrix of the combined visible observations; we assume that this matrix only depends on the current position. Obviously, there is no explicit expression for J_{k+1}(Z) because the observation model is nonlinear. Consequently, it is necessary to resort to Monte Carlo simulation to approximate it:

- draw N_c samples (X_0^i)_{1≤i≤N_c} according to π_0,
- for l = 0, ..., k: draw (X_{l+1}^i)_{1≤i≤N_c} from (X_l^i)_{1≤i≤N_c} according to the dynamic model of section 2,
- for each X_{k+1}^i compute I_v(X_{k+1}^i), then H(X_{k+1}^i, j) for j ∈ I_v(X_{k+1}^i), and R_{k+1}.

For our observation model, the matrix H(X, j) is as follows:

H(X, j) = [ (x - x_j)/d_j      (y - y_j)/d_j
           -(y - y_j)/d_j²     (x - x_j)/d_j² ],    (10)

where d_j = sqrt((x - x_j)² + (y - y_j)²). The approximation can then be obtained as:

J_{k+1}(Z) ≈ (1/N_c) Σ_{i=1}^{N_c} Σ_{j ∈ I_v(X_{k+1}^i)} H(X_{k+1}^i, j)^T R_{k+1}^{-1} H(X_{k+1}^i, j).

3.3 A criterion for path planning

We are interested in finding one or more paths connecting two points which minimize a functional φ of the PCRB along the trajectory [2]. We consider two functionals which depend on the determinants of the history of approximated PCRBs {J_0, ..., J_K}. The determinant is linked to the volume of the error ellipse of the position estimate:

- minimizing a weighted mean volume along the trajectory: φ_1(J_{0:K}) = Σ_{k=0}^K w_k det(J_k^{-1});
- minimizing the final error of estimation: φ_2(J_{0:K}) = det(J_K^{-1}).

In figure 2, 90% error ellipses are computed and drawn along two trajectories of the same length. The size of the ellipses decreases rapidly for the trajectory which goes towards the area with landmarks. Since a classical optimization method is irrelevant here, we propose to investigate a learning approach based on the innovative Cross-Entropy (CE) method developed by Rubinstein et al. [5].
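To make the above concrete, here is a minimal numerical sketch of the approximated PCRB recursion with its Monte Carlo observation term, assuming the linear dynamics X_{k+1} = X_k + A_k + w_k and the range-bearing Jacobian H given above. All names (pcrb_recursion, visible, pi0_samples) and the toy visibility test are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def jacobian_H(pos, lm):
    """Range-bearing Jacobian H(X, j) for one landmark; pos and lm are 2-vectors."""
    dx, dy = pos[0] - lm[0], pos[1] - lm[1]
    d2 = dx * dx + dy * dy
    d = np.sqrt(d2)
    return np.array([[dx / d, dy / d],
                     [-dy / d2, dx / d2]])

def pcrb_recursion(J0, Q, actions, landmarks, R, visible, pi0_samples):
    """Tichavsky-like recursion for linear dynamics:
    J_{k+1} = Q^-1 + J_Z - Q^-1 (J_k + Q^-1)^-1 Q^-1,
    with the observation term J_Z approximated by Monte Carlo over a state cloud."""
    Qinv, Rinv = np.linalg.inv(Q), np.linalg.inv(R)
    J = J0.copy()
    X = np.asarray(pi0_samples, dtype=float)   # N_c samples drawn from pi_0
    history = [J0.copy()]
    for a in actions:
        # prediction: propagate the Monte Carlo cloud through X_{k+1} = X_k + A_k + w_k
        X = X + a + np.random.multivariate_normal(np.zeros(2), Q, len(X))
        # observation term: average of H^T R^-1 H over the visible landmarks
        JZ = np.zeros((2, 2))
        for x in X:
            for lm in landmarks:
                if visible(x, lm):
                    H = jacobian_H(x, lm)
                    JZ += H.T @ Rinv @ H
        JZ /= len(X)
        J = Qinv + JZ - Qinv @ np.linalg.inv(J + Qinv) @ Qinv
        history.append(J.copy())
    return history
```

A quick sanity check: when no landmark is ever visible, the recursion collapses to J_{k+1} = (J_k^{-1} + Q)^{-1}, i.e. the error bound simply grows by Q at each step.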
Figure 2: Example of 90% error ellipses for two trajectories of the same length (features are in black (o)).

4 Cross-Entropy algorithm

In this section we briefly introduce the Cross-Entropy algorithm. The reader interested in the CE method and its convergence properties should refer to [5, 6].

4.1 A short introduction

The Cross-Entropy method was first used to estimate the probability of rare events, i.e. events with very small probability. It was then adapted for optimization, the idea being that sampling around the optimum of a function is a rare event.

Simulation of rare events. Let X be a random variable on a space 𝒳, p_x its probability density function (pdf) and φ a function on 𝒳. Suppose we are concerned with estimating l(γ), the probability of the event F_γ = {x ∈ 𝒳 | φ(x) ≥ γ}, with γ ∈ R+. F_γ is a rare event if l(γ) is very small. An unbiased estimate can be obtained via crude Monte Carlo simulation: given a random sample (x_1, ..., x_N) drawn from p_x, this estimate is:

l̂(γ) = (1/N) Σ_{i=1}^N I[φ(x_i) ≥ γ].

For rare events the variance of l̂(γ) is very high, and it is necessary to increase N to improve the estimate. The estimator's properties can be improved with a variance minimization technique such as importance sampling, where the random sample is drawn from a more appropriate probability density function q. It can easily be shown that the optimal pdf q* is given by q*(x) = I[φ(x) ≥ γ] p_x(x) / l(γ). Nevertheless, q* depends on l(γ), which needs to be estimated. To get round this difficulty, it is convenient to choose q within a family of pdfs {π(·, λ), λ ∈ Λ}. The idea is to find the optimal parameter λ* such that D(q*, π(·, λ)) is minimized, where D is the Kullback-Leibler pseudo-distance:

D(p, q) = E_p[ln(p/q)] = ∫_𝒳 p(x) ln p(x) dx - ∫_𝒳 p(x) ln q(x) dx.

Minimizing D(q*, π(·, λ)) is equivalent to maximizing ∫_𝒳 q*(x) ln π(x, λ) dx, which implies:

λ* = argmax_{λ ∈ Λ} E_p( I[φ(x) ≥ γ] ln π(x, λ) ).    (11)

The computation of the expectation in (11) must also be done using importance sampling. So we need a change of measure q, drawing one sample (x_1, ..., x_N) from q
and estimating λ* as follows:

λ̂* = argmax_{λ ∈ Λ} (1/N) Σ_{i=1}^N I[φ(x_i) ≥ γ] (p_x(x_i)/q(x_i)) ln π(x_i, λ).    (12)

The family {π(·, λ), λ ∈ Λ} must be chosen so that equation (12) is easy to solve; Natural Exponential Families⁵ (NEF) [5] are especially well adapted. However, q is not known in equation (12). The CE algorithm tries to overcome this difficulty by adaptively constructing a sequence of parameters (γ_t)_{t≥1} and (λ_t)_{t≥1} such that:

- F_{γ_1} is not a rare event,
- F_{γ_{t+1}} is not a rare event for π(·, λ_t),
- lim_{t→∞} γ_t = γ.

More precisely, given ρ ∈ ]0, 1[:

1. choose λ_0 such that π(·, λ_0) = p_x;
2. draw (x_1, ..., x_N) from π(·, λ_{t-1});
3. sort (φ(x_1), ..., φ(x_N)) in increasing order and evaluate the (1 - ρ) quantile γ_t;
4. compute λ_t as

λ_t = argmax_{λ ∈ Λ} Σ_{i=1}^N I[φ(x_i) ≥ γ_t] (p_x(x_i)/π(x_i, λ_{t-1})) ln π(x_i, λ);

5. if γ_t < γ, set t = t + 1 and go to step 2; else estimate the probability of F_γ with:

l̂(γ) = (1/N) Σ_{i=1}^N I[φ(x_i) ≥ γ] p_x(x_i)/π(x_i, λ_t).

This is the main CE algorithm, but other versions can be found in [5].

4.2 Application to optimization

The CE method was then adapted to solve optimization problems. Consider the optimization problem:

φ(x*) = γ* = max_{x ∈ 𝒳} φ(x).    (13)

The principle of CE for optimization is to translate problem (13) into an associated stochastic problem and then to solve it adaptively, as the simulation of a rare event. If γ* is the optimum of φ, F_{γ*} is generally a rare event. The main idea is to define a family {π(·, λ), λ ∈ Λ}

⁵ pdfs of the form f_λ(x) = c(λ) e^{λᵀ t(x)} h(x)
and to iterate the CE algorithm long enough, so that γ_t → γ*, in order to draw samples around the optimum. Moreover, one has to notice that unlike local random search algorithms such as simulated annealing, which rely on a local neighborhood hypothesis, the CE method tries to solve the optimization problem globally. To apply the CE approach, we need to:

- define a selection rate ρ,
- choose a well-suited family of pdfs {π(·, λ), λ ∈ Λ},
- define a criterion to stop the iterations.

Once these elements are given, the algorithm for the optimization proceeds as follows:

1. Initialize λ_t = λ_0.
2. Generate a sample of size N, (x_i^t)_{1≤i≤N}, from π(·, λ_t); compute (φ(x_i^t))_{1≤i≤N} and order them from smallest to largest. Estimate γ_t as the (1 - ρ) sample percentile.
3. Update λ_t with:

λ_{t+1} = argmax_λ (1/N) Σ_{i=1}^N I[φ(x_i^t) ≥ γ_t] ln π(x_i^t, λ).    (14)

4. Repeat from step 2 until convergence.
5. Assuming convergence is reached at t = t*, an optimal value for φ can be obtained by drawing from π(·, λ_{t*}).

4.3 Application to the path planning task

In this part we deal with the application of the CE method to our task. First of all, it is necessary to define the random mechanism {π(·, λ), λ ∈ Λ} used to generate samples of trajectories. More precisely, we want to generate samples which:

- start from s_i and end at s_f,
- respect the authorized transition matrix δ(a_k, a_{k+1}),
- have a whole length less than T_max.

One way to achieve that is to use a probability matrix P_sa = (p_sa), with s ∈ {1, ..., N_s} and a ∈ {1, ..., N_a}, defining the probability of choosing action a in state s. P_sa is an N_s × N_a matrix (in our case N_a = 8):

        p_11   ...  p_17   p_18
P_sa =  ...    ...  ...    ...        (15)
        p_Ns1  ...  p_Ns7  p_Ns8

where the s-th row P_s(·) is a one-dimensional discrete probability such that P_s(i) = p_si, i = 1, ..., 8, with Σ_i p_si = 1. So, to solve our problem, we are concerned with the optimization of the N_s × N_a parameters (p_sa) using the CE algorithm.

Trajectory generation. At each iteration of the CE algorithm, we want to generate from P_sa N trajectories of length T ≤ T_max, starting from s_i, ending at s_f, and drawn in accordance
with the authorized transition matrix δ. Let us first introduce:

Ã(s̃, ã) = {a | s_k = s̃, δ(a_{k-1} = ã, a_k = a) = 1}

and P̃_s(·) such that, for all a ∈ Ã(s̃, ã), p̃_sa = p_sa / Σ_{a' ∈ Ã(s̃, ã)} p_sa', and p̃_sa = 0 otherwise. Ã(s̃, ã) is the set of admissible actions at time k in state s_k = s̃, knowing that ã was chosen at time k-1, and P̃_s(·) is the normalized restriction of P_s(·) to Ã(s̃, ã). We proceed as follows for one iteration of the CE:

Table 1: Path generation principle

j = 0
while (j < N)
  k = 0, set s_0 = s_i
  generate one action a_0^j according to P_{s_0} and apply it
  set k = k + 1 and T = 1
  until s_k = s_f do
    - compute Ã(s_k, a_{k-1})
    - if Ã(s_k, a_{k-1}) ≠ ∅, generate one action a_k ∈ Ã(s_k, a_{k-1}) according to P̃_{s_k} and apply it;
      else stop and set j = j - 1
    - set k = k + 1 and T = T + 1
    - if T > T_max, stop and set j = j - 1
  return x(j) = (s_i, a_0^j, s_1^j, a_1^j, ..., s_{k-1}^j, a_{k-1}^j, s_f)
  j = j + 1

Updating the P_sa matrix. At each iteration t of the Cross-Entropy algorithm, given the N trajectories (x(j))_{1≤j≤N} (starting from s_i, ending at s_f, and respecting the matrix δ) generated via the algorithm described in table 1, we can evaluate each one by calculating the PCRB sequence and applying our functional φ_ℓ (ℓ ∈ {1, 2}). Then the parameter γ_t can be evaluated, and P_sa can be updated by solving (14). Let x(j) = (s_i, a_0^j, s_1^j, a_1^j, ..., s_{k-1}^j, a_{k-1}^j, s_f) be one trajectory; we have:

ln π(x(j), P_sa) = Σ_{s=1}^{N_s} Σ_{a=1}^{8} I[{x(j) ∈ χ_sa}] ln p_sa    (16)
where I is the indicator function and {x(j) ∈ χ_sa} means that the trajectory contains a visit to state s in which action a is taken. Since for each s the row P_s(·) is a discrete probability, (14) must be solved under the condition that the rows of P_sa sum up to 1. This leads to the following problem, with Lagrange multipliers (μ_s)_{1≤s≤N_s}:

max_{P_sa} min_{μ_1,...,μ_Ns} (1/N) Σ_{j=1}^N I[φ_ℓ(x(j)) ≤ γ_t] ln π(x(j), P_sa) + Σ_{s=1}^{N_s} μ_s ( Σ_{a=1}^{8} p_sa - 1 )    (17)

with ln π(x(j), P_sa) given in (16). After differentiation with respect to each parameter p_sa (s ∈ {1, ..., N_s}, a ∈ {1, ..., 8}), we can derive the following equation:

(1/N) Σ_{j=1}^N I[φ_ℓ(x(j)) ≤ γ_t] I[{x(j) ∈ χ_sa}] = -μ_s p_sa.

After summing over a = 1, ..., 8 and applying the condition Σ_{a=1}^{8} p_sa = 1, we obtain μ_s and the final updating formula:

μ_s = -(1/N) Σ_{j=1}^N I[φ_ℓ(x(j)) ≤ γ_t] I[{x(j) ∈ χ_s}]    (18)

p_sa = ( Σ_{j=1}^N I[φ_ℓ(x(j)) ≤ γ_t] I[{x(j) ∈ χ_sa}] ) / ( Σ_{j=1}^N I[φ_ℓ(x(j)) ≤ γ_t] I[{x(j) ∈ χ_s}] )    (19)

where {x(j) ∈ χ_s} means that the trajectory contains a visit to state s. For the first iteration, P_s(·) is a uniform probability density function for every s.

5 Simulation results

The algorithm has not been widely tested, and only one simple scenario is introduced in this paper. In this example, the map is defined on a square domain with 6 point features m_0, ..., m_5 (figure 3). The state space is discretized into N_s = 25 × 25 states. The initial and terminal states are at the positions marked in figure 3.

Table 2: The features of the map (x, y coordinates of m_0, ..., m_5).

Figure 3: The maximum likelihood optimal solution after the first iteration of the CE. Starting and terminal states (*) and map features (o).

The mobile can only apply [-π/2, π/2] heading controls at each time k; thus the δ matrix is exactly the one defined in section 2.1. Only trajectories with length T ≤ 30 are admissible. For the observation model, a landmark m_j is visible at time k provided that:

r_min ≤ z_k^r(j) ≤ r_max  and  |z_k^β(j)| ≤ θ_max

for given thresholds r_min, r_max and θ_max. The noise variances σ_a² on the range and σ_β² on the bearing are the same for all features and are
time independent: σ_a = 5.10⁻³ and σ_β = 0.5 deg. The computation of the PCRB matrices was performed with N_c Monte Carlo samples to estimate the observation term J_k(Z). For the optimization step, the Cross-Entropy algorithm was run with N = 5000 admissible trajectories per iteration and a selection rate ρ = 0.01; that is to say, the 50 best samples are used for updating the p_sa probabilities. Figure 4 shows the optimal trajectory found by the algorithm at the last iteration for the performance function φ_1. For the mobile dynamic model, the elementary displacement d is equal to the grid step, and the noise process covariances are defined by:

P_0 = σ² I,  Q_k = σ² I,

where I is the 2 × 2 identity matrix and σ a given constant.

Figure 4: The most likely optimal trajectory found by the algorithm for φ_1.
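The trajectory generation and update steps above can be sketched end-to-end on a toy problem. This is a simplified illustration rather than the paper's implementation: a 5 × 5 grid with only 4 moves instead of 8, no δ constraint, and the path length standing in for the PCRB functional φ; all names (draw_path, ce_iteration, MOVES) are invented for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

N = 5                                        # toy 5 x 5 grid; state s = x * N + y
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]   # 4 actions for brevity (the paper uses 8)

def admissible(s):
    """Indexes of the moves that keep the mobile on the map (the admissible set)."""
    x, y = divmod(s, N)
    return [i for i, (dx, dy) in enumerate(MOVES) if 0 <= x + dx < N and 0 <= y + dy < N]

def draw_path(P, start, goal, t_max=30):
    """Sample one trajectory from the per-state action probabilities P (N*N x 4),
    using the normalized restriction of P_s to the admissible actions."""
    s, path = start, [start]
    for _ in range(t_max):
        if s == goal:
            return path
        acts = admissible(s)
        w = P[s, acts]
        a = rng.choice(acts, p=w / w.sum())
        dx, dy = MOVES[a]
        s = (s // N + dx) * N + (s % N + dy)
        path.append(s)
    return path if s == goal else None       # too long: the sample is rejected

def ce_iteration(P, scored_paths, gamma):
    """p_sa update: (# elite paths taking a in s) / (# elite paths visiting s)."""
    num, den = np.zeros_like(P), np.zeros(P.shape[0])
    for path, score in scored_paths:
        if score > gamma:                    # keep only the elite (low-cost) trajectories
            continue
        for s, s_next in zip(path, path[1:]):
            a = next(i for i, (dx, dy) in enumerate(MOVES)
                     if (s // N + dx, s % N + dy) == (s_next // N, s_next % N))
            num[s, a] += 1
            den[s] += 1
    mask = den > 0
    P[mask] = num[mask] / den[mask, None]    # rows of unvisited states keep their old law
    return P

# toy driver: iterate CE with the path length as the cost to be minimized
P = np.full((N * N, 4), 0.25)
for _ in range(15):
    samples = [p for p in (draw_path(P, 0, N * N - 1) for _ in range(200)) if p]
    if not samples:
        continue
    scored = [(p, float(len(p))) for p in samples]
    gamma = np.quantile([c for _, c in scored], 0.1)   # selection rate rho = 0.1
    P = ce_iteration(P, scored, gamma)
```

Over the iterations the rows of P concentrate: most visited states end up with a near-Dirac distribution, which mirrors the convergence behavior discussed in the analysis below.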
5.1 Analysis

As expected (figure 4), the mobile is guided toward the area with landmarks in order to improve its localization performance. Moreover, it operates so as to keep the landmarks visible as long as the maneuver constraints (δ) and the time constraint (T_max) allow it. We can also notice that the algorithm converges rapidly to a solution. To illustrate this, figure 5 presents the evolution of the parameter γ_t and of the minimum value of φ_1 at each iteration of the CE algorithm.

Figure 5: Evolution of γ_t (solid line) and of the minimum value of the functional (dashed line).

Figure 6: States (squares) whose pdf differs from a Dirac law after convergence.

When we look precisely, after convergence, at the densities P_s(·) for all states s on the optimal trajectory, we notice that some of them are not Dirac probability laws. The most likely optimal trajectory is composed of 30 states, and only 23 of them have their associated P_s equivalent to a Dirac law. Table 3 shows the probability density functions for the others (see figure 6).

Table 3: Probability density functions of the non-Dirac states.

For these states, the algorithm converges toward a multi-modal density: two actions can be chosen, but with different probabilities. We can notice that this behavior is concentrated on states where maneuvers can be performed to increase the time during which the landmarks are observed.

6 Conclusions and perspectives

In this paper, we presented a framework to solve a path planning task for a mobile vehicle. The problem was discretized, and a Markov Decision Process with constraints on the mobile maneuvers was used. Our main goal was to find the optimal trajectory according to a measure of the capability of estimating accurately the state of the mobile during execution. Functionals of the Posterior Cramér-Rao Bound were used as the criterion of performance. The main contribution of the paper is the use of the Cross-Entropy algorithm to solve the optimization step, as Dynamic Programming could not be
applied. This approach was tested on a first simple example and seems to be relevant. Future work will first concentrate on the complete implementation of the algorithm and its application to more examples. More analysis of the resulting probability distributions has to be made. We will also investigate a continuous approach and try to approximate the computation of the observation contribution to the PCRB, which is time consuming. The tuning of the Cross-Entropy method for our specific task has not been studied; some experiments have to be carried out based on the devices given in [5]. Finally, we want to consider more complex maps, such as those used in Geographical Information Systems, and to take into account measurement models with data association and non-detection problems.

References

[1] S. Paris, J-P. Le Cadre, "Planning for Terrain-Aided Navigation", Fusion 2002, Annapolis (USA), pp. 1007-1014, 7-11 Jul. 2002.

[2] J-P. Le Cadre and O. Tremois, "The Matrix Dynamic Programming Property and its Implications", SIAM Journal on Matrix Analysis, 18(2): pp. 818-826, April 1997.
[3] R. Enns and D. Morrell, "Terrain-Aided Navigation Using the Viterbi Algorithm", Journal of Guidance, Control, and Dynamics, vol. 18, no. 6, pp. 1444-1449, November-December 1995.

[4] P. Tichavsky, C.H. Muravchik, A. Nehorai, "Posterior Cramér-Rao Bounds for Discrete-Time Nonlinear Filtering", IEEE Transactions on Signal Processing, vol. 46, no. 5, pp. 1386-1396, May 1998.

[5] R. Rubinstein, D.P. Kroese, "The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation, and Machine Learning", Information Science & Statistics, Springer, 2004.

[6] P.-T. de Boer, D.P. Kroese, S. Mannor and R. Rubinstein, "A Tutorial on the Cross-Entropy Method", ptdeboer/ce/.

[7] S. Mannor, R. Rubinstein, Y. Gat, "The Cross Entropy Method for Fast Policy Search", Proc. of the 20th International Conference on Machine Learning, 2003.

[8] R.S. Sutton and A.G. Barto, "Reinforcement Learning: An Introduction", A Bradford Book, 2000.

[9] H.L. Van Trees, "Detection, Estimation and Modulation Theory", New York: Wiley, 1968.
More informationEstimating Gaussian Mixture Densities with EM A Tutorial
Estimating Gaussian Mixture Densities with EM A Tutorial Carlo Tomasi Due University Expectation Maximization (EM) [4, 3, 6] is a numerical algorithm for the maximization of functions of several variables
More informationSensor Fusion: Particle Filter
Sensor Fusion: Particle Filter By: Gordana Stojceska stojcesk@in.tum.de Outline Motivation Applications Fundamentals Tracking People Advantages and disadvantages Summary June 05 JASS '05, St.Petersburg,
More informationCMU Lecture 11: Markov Decision Processes II. Teacher: Gianni A. Di Caro
CMU 15-781 Lecture 11: Markov Decision Processes II Teacher: Gianni A. Di Caro RECAP: DEFINING MDPS Markov decision processes: o Set of states S o Start state s 0 o Set of actions A o Transitions P(s s,a)
More informationA Tour of Reinforcement Learning The View from Continuous Control. Benjamin Recht University of California, Berkeley
A Tour of Reinforcement Learning The View from Continuous Control Benjamin Recht University of California, Berkeley trustable, scalable, predictable Control Theory! Reinforcement Learning is the study
More informationReinforcement Learning and NLP
1 Reinforcement Learning and NLP Kapil Thadani kapil@cs.columbia.edu RESEARCH Outline 2 Model-free RL Markov decision processes (MDPs) Derivative-free optimization Policy gradients Variance reduction Value
More informationGaussian Mixtures Proposal Density in Particle Filter for Track-Before-Detect
12th International Conference on Information Fusion Seattle, WA, USA, July 6-9, 29 Gaussian Mixtures Proposal Density in Particle Filter for Trac-Before-Detect Ondřej Straa, Miroslav Šimandl and Jindřich
More information2D Image Processing. Bayes filter implementation: Kalman filter
2D Image Processing Bayes filter implementation: Kalman filter Prof. Didier Stricker Dr. Gabriele Bleser Kaiserlautern University http://ags.cs.uni-kl.de/ DFKI Deutsches Forschungszentrum für Künstliche
More information1 EM algorithm: updating the mixing proportions {π k } ik are the posterior probabilities at the qth iteration of EM.
Université du Sud Toulon - Var Master Informatique Probabilistic Learning and Data Analysis TD: Model-based clustering by Faicel CHAMROUKHI Solution The aim of this practical wor is to show how the Classification
More informationNON-LINEAR NOISE ADAPTIVE KALMAN FILTERING VIA VARIATIONAL BAYES
2013 IEEE INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING NON-LINEAR NOISE ADAPTIVE KALMAN FILTERING VIA VARIATIONAL BAYES Simo Särä Aalto University, 02150 Espoo, Finland Jouni Hartiainen
More informationIntroduction to Reinforcement Learning. CMPT 882 Mar. 18
Introduction to Reinforcement Learning CMPT 882 Mar. 18 Outline for the week Basic ideas in RL Value functions and value iteration Policy evaluation and policy improvement Model-free RL Monte-Carlo and
More informationRao-Blackwellized Particle Filter for Multiple Target Tracking
Rao-Blackwellized Particle Filter for Multiple Target Tracking Simo Särkkä, Aki Vehtari, Jouko Lampinen Helsinki University of Technology, Finland Abstract In this article we propose a new Rao-Blackwellized
More informationRAO-BLACKWELLIZED PARTICLE FILTER FOR MARKOV MODULATED NONLINEARDYNAMIC SYSTEMS
RAO-BLACKWELLIZED PARTICLE FILTER FOR MARKOV MODULATED NONLINEARDYNAMIC SYSTEMS Saiat Saha and Gustaf Hendeby Linöping University Post Print N.B.: When citing this wor, cite the original article. 2014
More informationReinforcement Learning II
Reinforcement Learning II Andrea Bonarini Artificial Intelligence and Robotics Lab Department of Electronics and Information Politecnico di Milano E-mail: bonarini@elet.polimi.it URL:http://www.dei.polimi.it/people/bonarini
More informationRECURSIVE OUTLIER-ROBUST FILTERING AND SMOOTHING FOR NONLINEAR SYSTEMS USING THE MULTIVARIATE STUDENT-T DISTRIBUTION
1 IEEE INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING, SEPT. 3 6, 1, SANTANDER, SPAIN RECURSIVE OUTLIER-ROBUST FILTERING AND SMOOTHING FOR NONLINEAR SYSTEMS USING THE MULTIVARIATE STUDENT-T
More informationDistributed Optimization. Song Chong EE, KAIST
Distributed Optimization Song Chong EE, KAIST songchong@kaist.edu Dynamic Programming for Path Planning A path-planning problem consists of a weighted directed graph with a set of n nodes N, directed links
More informationData Detection for Controlled ISI. h(nt) = 1 for n=0,1 and zero otherwise.
Data Detection for Controlled ISI *Symbol by symbol suboptimum detection For the duobinary signal pulse h(nt) = 1 for n=0,1 and zero otherwise. The samples at the output of the receiving filter(demodulator)
More informationArtificial Intelligence
Artificial Intelligence Dynamic Programming Marc Toussaint University of Stuttgart Winter 2018/19 Motivation: So far we focussed on tree search-like solvers for decision problems. There is a second important
More informationRecall that in finite dynamic optimization we are trying to optimize problems of the form. T β t u(c t ) max. c t. t=0 T. c t = W max. subject to.
19 Policy Function Iteration Lab Objective: Learn how iterative methods can be used to solve dynamic optimization problems. Implement value iteration and policy iteration in a pseudo-infinite setting where
More informationMARKOV DECISION PROCESSES (MDP) AND REINFORCEMENT LEARNING (RL) Versione originale delle slide fornita dal Prof. Francesco Lo Presti
1 MARKOV DECISION PROCESSES (MDP) AND REINFORCEMENT LEARNING (RL) Versione originale delle slide fornita dal Prof. Francesco Lo Presti Historical background 2 Original motivation: animal learning Early
More informationMachine Learning, Midterm Exam
10-601 Machine Learning, Midterm Exam Instructors: Tom Mitchell, Ziv Bar-Joseph Wednesday 12 th December, 2012 There are 9 questions, for a total of 100 points. This exam has 20 pages, make sure you have
More informationContinuous Learning Method for a Continuous Dynamical Control in a Partially Observable Universe
Continuous Learning Method for a Continuous Dynamical Control in a Partially Observable Universe Frédéric Dambreville Délégation Générale pour l Armement, DGA/CTA/DT/GIP 16 Bis, Avenue Prieur de la Côte
More informationComputing Maximum Entropy Densities: A Hybrid Approach
Computing Maximum Entropy Densities: A Hybrid Approach Badong Chen Department of Precision Instruments and Mechanology Tsinghua University Beijing, 84, P. R. China Jinchun Hu Department of Precision Instruments
More informationLecture 3: Markov Decision Processes
Lecture 3: Markov Decision Processes Joseph Modayil 1 Markov Processes 2 Markov Reward Processes 3 Markov Decision Processes 4 Extensions to MDPs Markov Processes Introduction Introduction to MDPs Markov
More informationComputer Vision Group Prof. Daniel Cremers. 14. Sampling Methods
Prof. Daniel Cremers 14. Sampling Methods Sampling Methods Sampling Methods are widely used in Computer Science as an approximation of a deterministic algorithm to represent uncertainty without a parametric
More informationQ-Learning in Continuous State Action Spaces
Q-Learning in Continuous State Action Spaces Alex Irpan alexirpan@berkeley.edu December 5, 2015 Contents 1 Introduction 1 2 Background 1 3 Q-Learning 2 4 Q-Learning In Continuous Spaces 4 5 Experimental
More informationExpectation Propagation Algorithm
Expectation Propagation Algorithm 1 Shuang Wang School of Electrical and Computer Engineering University of Oklahoma, Tulsa, OK, 74135 Email: {shuangwang}@ou.edu This note contains three parts. First,
More information2D Image Processing (Extended) Kalman and particle filter
2D Image Processing (Extended) Kalman and particle filter Prof. Didier Stricker Dr. Gabriele Bleser Kaiserlautern University http://ags.cs.uni-kl.de/ DFKI Deutsches Forschungszentrum für Künstliche Intelligenz
More informationMobile Robot Localization
Mobile Robot Localization 1 The Problem of Robot Localization Given a map of the environment, how can a robot determine its pose (planar coordinates + orientation)? Two sources of uncertainty: - observations
More informationKernel adaptive Sequential Monte Carlo
Kernel adaptive Sequential Monte Carlo Ingmar Schuster (Paris Dauphine) Heiko Strathmann (University College London) Brooks Paige (Oxford) Dino Sejdinovic (Oxford) December 7, 2015 1 / 36 Section 1 Outline
More informationTSRT14: Sensor Fusion Lecture 9
TSRT14: Sensor Fusion Lecture 9 Simultaneous localization and mapping (SLAM) Gustaf Hendeby gustaf.hendeby@liu.se TSRT14 Lecture 9 Gustaf Hendeby Spring 2018 1 / 28 Le 9: simultaneous localization and
More informationThe Expectation-Maximization Algorithm
The Expectation-Maximization Algorithm Francisco S. Melo In these notes, we provide a brief overview of the formal aspects concerning -means, EM and their relation. We closely follow the presentation in
More informationL09. PARTICLE FILTERING. NA568 Mobile Robotics: Methods & Algorithms
L09. PARTICLE FILTERING NA568 Mobile Robotics: Methods & Algorithms Particle Filters Different approach to state estimation Instead of parametric description of state (and uncertainty), use a set of state
More informationUTILIZING PRIOR KNOWLEDGE IN ROBUST OPTIMAL EXPERIMENT DESIGN. EE & CS, The University of Newcastle, Australia EE, Technion, Israel.
UTILIZING PRIOR KNOWLEDGE IN ROBUST OPTIMAL EXPERIMENT DESIGN Graham C. Goodwin James S. Welsh Arie Feuer Milan Depich EE & CS, The University of Newcastle, Australia 38. EE, Technion, Israel. Abstract:
More informationHierarchical Particle Filter for Bearings-Only Tracking
Hierarchical Particle Filter for Bearings-Only Tracing 1 T. Bréhard and J.-P. Le Cadre IRISA/CNRS Campus de Beaulieu 3542 Rennes Cedex, France e-mail: thomas.brehard@gmail.com, lecadre@irisa.fr Tel : 33-2.99.84.71.69
More informationBAYESIAN MULTI-TARGET TRACKING WITH SUPERPOSITIONAL MEASUREMENTS USING LABELED RANDOM FINITE SETS. Francesco Papi and Du Yong Kim
3rd European Signal Processing Conference EUSIPCO BAYESIAN MULTI-TARGET TRACKING WITH SUPERPOSITIONAL MEASUREMENTS USING LABELED RANDOM FINITE SETS Francesco Papi and Du Yong Kim Department of Electrical
More informationA Tutorial on the Cross-Entropy Method
A Tutorial on the Cross-Entropy Method Pieter-Tjerk de Boer Electrical Engineering, Mathematics and Computer Science department University of Twente ptdeboer@cs.utwente.nl Dirk P. Kroese Department of
More informationSignal Processing - Lecture 7
1 Introduction Signal Processing - Lecture 7 Fitting a function to a set of data gathered in time sequence can be viewed as signal processing or learning, and is an important topic in information theory.
More informationMulti-Target Tracking Using A Randomized Hypothesis Generation Technique
Multi-Target Tracing Using A Randomized Hypothesis Generation Technique W. Faber and S. Charavorty Department of Aerospace Engineering Texas A&M University arxiv:1603.04096v1 [math.st] 13 Mar 2016 College
More informationThe Cross Entropy Method for the N-Persons Iterated Prisoner s Dilemma
The Cross Entropy Method for the N-Persons Iterated Prisoner s Dilemma Tzai-Der Wang Artificial Intelligence Economic Research Centre, National Chengchi University, Taipei, Taiwan. email: dougwang@nccu.edu.tw
More informationt x 1 e t dt, and simplify the answer when possible (for example, when r is a positive even number). In particular, confirm that EX 4 = 3.
Mathematical Statistics: Homewor problems General guideline. While woring outside the classroom, use any help you want, including people, computer algebra systems, Internet, and solution manuals, but mae
More informationMini-Course 07 Kalman Particle Filters. Henrique Massard da Fonseca Cesar Cunha Pacheco Wellington Bettencurte Julio Dutra
Mini-Course 07 Kalman Particle Filters Henrique Massard da Fonseca Cesar Cunha Pacheco Wellington Bettencurte Julio Dutra Agenda State Estimation Problems & Kalman Filter Henrique Massard Steady State
More informationParameterized Joint Densities with Gaussian Mixture Marginals and their Potential Use in Nonlinear Robust Estimation
Proceedings of the 2006 IEEE International Conference on Control Applications Munich, Germany, October 4-6, 2006 WeA0. Parameterized Joint Densities with Gaussian Mixture Marginals and their Potential
More informationReinforcement Learning. Spring 2018 Defining MDPs, Planning
Reinforcement Learning Spring 2018 Defining MDPs, Planning understandability 0 Slide 10 time You are here Markov Process Where you will go depends only on where you are Markov Process: Information state
More informationThree examples of a Practical Exact Markov Chain Sampling
Three examples of a Practical Exact Markov Chain Sampling Zdravko Botev November 2007 Abstract We present three examples of exact sampling from complex multidimensional densities using Markov Chain theory
More informationClustering with k-means and Gaussian mixture distributions
Clustering with k-means and Gaussian mixture distributions Machine Learning and Object Recognition 2017-2018 Jakob Verbeek Clustering Finding a group structure in the data Data in one cluster similar to
More informationClustering with k-means and Gaussian mixture distributions
Clustering with k-means and Gaussian mixture distributions Machine Learning and Category Representation 2014-2015 Jakob Verbeek, ovember 21, 2014 Course website: http://lear.inrialpes.fr/~verbeek/mlcr.14.15
More informationPILCO: A Model-Based and Data-Efficient Approach to Policy Search
PILCO: A Model-Based and Data-Efficient Approach to Policy Search (M.P. Deisenroth and C.E. Rasmussen) CSC2541 November 4, 2016 PILCO Graphical Model PILCO Probabilistic Inference for Learning COntrol
More information10-701/15-781, Machine Learning: Homework 4
10-701/15-781, Machine Learning: Homewor 4 Aarti Singh Carnegie Mellon University ˆ The assignment is due at 10:30 am beginning of class on Mon, Nov 15, 2010. ˆ Separate you answers into five parts, one
More informationReinforcement Learning
Reinforcement Learning Model-Based Reinforcement Learning Model-based, PAC-MDP, sample complexity, exploration/exploitation, RMAX, E3, Bayes-optimal, Bayesian RL, model learning Vien Ngo MLR, University
More informationThe Regularized EM Algorithm
The Regularized EM Algorithm Haifeng Li Department of Computer Science University of California Riverside, CA 92521 hli@cs.ucr.edu Keshu Zhang Human Interaction Research Lab Motorola, Inc. Tempe, AZ 85282
More informationRecursive Noise Adaptive Kalman Filtering by Variational Bayesian Approximations
PREPRINT 1 Recursive Noise Adaptive Kalman Filtering by Variational Bayesian Approximations Simo Särä, Member, IEEE and Aapo Nummenmaa Abstract This article considers the application of variational Bayesian
More informationGaussian Mixture Model
Case Study : Document Retrieval MAP EM, Latent Dirichlet Allocation, Gibbs Sampling Machine Learning/Statistics for Big Data CSE599C/STAT59, University of Washington Emily Fox 0 Emily Fox February 5 th,
More informationCS181 Midterm 2 Practice Solutions
CS181 Midterm 2 Practice Solutions 1. Convergence of -Means Consider Lloyd s algorithm for finding a -Means clustering of N data, i.e., minimizing the distortion measure objective function J({r n } N n=1,
More informationIntroduction to Reinforcement Learning Part 1: Markov Decision Processes
Introduction to Reinforcement Learning Part 1: Markov Decision Processes Rowan McAllister Reinforcement Learning Reading Group 8 April 2015 Note I ve created these slides whilst following Algorithms for
More informationORBIT DETERMINATION AND DATA FUSION IN GEO
ORBIT DETERMINATION AND DATA FUSION IN GEO Joshua T. Horwood, Aubrey B. Poore Numerica Corporation, 4850 Hahns Pea Drive, Suite 200, Loveland CO, 80538 Kyle T. Alfriend Department of Aerospace Engineering,
More informationin a Rao-Blackwellised Unscented Kalman Filter
A Rao-Blacwellised Unscented Kalman Filter Mar Briers QinetiQ Ltd. Malvern Technology Centre Malvern, UK. m.briers@signal.qinetiq.com Simon R. Masell QinetiQ Ltd. Malvern Technology Centre Malvern, UK.
More informationInternet Monetization
Internet Monetization March May, 2013 Discrete time Finite A decision process (MDP) is reward process with decisions. It models an environment in which all states are and time is divided into stages. Definition
More informationEstimation and Maintenance of Measurement Rates for Multiple Extended Target Tracking
Estimation and Maintenance of Measurement Rates for Multiple Extended Target Tracing Karl Granström Division of Automatic Control Department of Electrical Engineering Linöping University, SE-58 83, Linöping,
More information11. Learning graphical models
Learning graphical models 11-1 11. Learning graphical models Maximum likelihood Parameter learning Structural learning Learning partially observed graphical models Learning graphical models 11-2 statistical
More information6 Reinforcement Learning
6 Reinforcement Learning As discussed above, a basic form of supervised learning is function approximation, relating input vectors to output vectors, or, more generally, finding density functions p(y,
More informationAn Adaptive Clustering Method for Model-free Reinforcement Learning
An Adaptive Clustering Method for Model-free Reinforcement Learning Andreas Matt and Georg Regensburger Institute of Mathematics University of Innsbruck, Austria {andreas.matt, georg.regensburger}@uibk.ac.at
More informationA Comparison of Particle Filters for Personal Positioning
VI Hotine-Marussi Symposium of Theoretical and Computational Geodesy May 9-June 6. A Comparison of Particle Filters for Personal Positioning D. Petrovich and R. Piché Institute of Mathematics Tampere University
More informationLecture 4: Types of errors. Bayesian regression models. Logistic regression
Lecture 4: Types of errors. Bayesian regression models. Logistic regression A Bayesian interpretation of regularization Bayesian vs maximum likelihood fitting more generally COMP-652 and ECSE-68, Lecture
More informationA FEASIBILITY STUDY OF PARTICLE FILTERS FOR MOBILE STATION RECEIVERS. Michael Lunglmayr, Martin Krueger, Mario Huemer
A FEASIBILITY STUDY OF PARTICLE FILTERS FOR MOBILE STATION RECEIVERS Michael Lunglmayr, Martin Krueger, Mario Huemer Michael Lunglmayr and Martin Krueger are with Infineon Technologies AG, Munich email:
More information4: Parameter Estimation in Fully Observed BNs
10-708: Probabilistic Graphical Models 10-708, Spring 2015 4: Parameter Estimation in Fully Observed BNs Lecturer: Eric P. Xing Scribes: Purvasha Charavarti, Natalie Klein, Dipan Pal 1 Learning Graphical
More informationComputer Intensive Methods in Mathematical Statistics
Computer Intensive Methods in Mathematical Statistics Department of mathematics johawes@kth.se Lecture 16 Advanced topics in computational statistics 18 May 2017 Computer Intensive Methods (1) Plan of
More informationA NOVEL OPTIMAL PROBABILITY DENSITY FUNCTION TRACKING FILTER DESIGN 1
A NOVEL OPTIMAL PROBABILITY DENSITY FUNCTION TRACKING FILTER DESIGN 1 Jinglin Zhou Hong Wang, Donghua Zhou Department of Automation, Tsinghua University, Beijing 100084, P. R. China Control Systems Centre,
More information