Deep Nonlinear Non-Gaussian Filtering for Dynamical Systems
Arash Mehrjou
Department of Empirical Inference
Max Planck Institute for Intelligent Systems

Bernhard Schölkopf
Department of Empirical Inference
Max Planck Institute for Intelligent Systems

(Accepted to the Workshop on Infer2Control at the 32nd Conference on Neural Information Processing Systems (NIPS 2018). Do not distribute.)

Abstract

Filtering is a general name for inferring the states of a dynamical system given observations. The most common filtering approach is Gaussian Filtering (GF), where the distribution of the inferred states is a Gaussian whose mean is an affine function of the observations. There are two restrictions in this model: Gaussianity and affinity. We propose a model that relaxes both these assumptions based on recent advances in implicit generative models. Empirical results show that the proposed method gives a significant advantage over GF and over nonlinear methods based on fixed nonlinear kernels.

1 Introduction

Inference in dynamical systems is a long-standing problem in many control systems. We as intelligent agents are constantly inferring the states of nature and of the systems around us. Our observations are often so noisy and unreliable that they require us to first infer the underlying states and then make our decisions based on the estimated states. We can use two sources of information to infer the states causing the current observation: (1) the history of our estimates of the previous states; (2) the current observation. Fusing these two sources of information to obtain an accurate estimate of the current state of a dynamical system is generally called filtering.

Dynamical systems. We assume time-invariant closed-loop dynamical systems described as

    x_t = f(x_{t-1}, n_t)
    y_t = h(x_t, m_t)     (1)

where the subscript t corresponds to the current value and the subscript t-1 to the value one step before the current time in the discrete setting. This formulation is generic enough for the purposes of this paper; the path from the initial formulation to this simplified version is given in Appendix A. Obviously, this notation is correct only if the system satisfies the Markov property. In the above system, n_t and m_t come from simple noise models. Notice that the simplicity of these noise models is not restrictive, because they can be transformed into any complex distribution through the nonlinear functions f and h. In a physical system, the first line of (1) describes p(x_t | x_{t-1}), the evolution of the states of the system, and the second line describes p(y_t | x_t), the probabilistic model of the observations (sensors).
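To make the setup concrete, here is a minimal Python sketch of rolling out a system of the form (1); the particular f, h, noise scales, and horizon below are illustrative stand-ins of our own choosing, not the paper's benchmark system.

```python
# Sketch: rolling out a dynamical system of the form (1).
# The chosen f, h, and noise scales are hypothetical examples.
import numpy as np

def simulate(f, h, x0, T, rng=None):
    """Roll out x_t = f(x_{t-1}, n_t), y_t = h(x_t, m_t) for T steps."""
    rng = np.random.default_rng(0) if rng is None else rng
    x, xs, ys = x0, [], []
    for _ in range(T):
        n, m = rng.normal(), rng.normal()  # simple (Gaussian) noise models
        x = f(x, n)                        # state evolution, p(x_t | x_{t-1})
        y = h(x, m)                        # observation model, p(y_t | x_t)
        xs.append(x)
        ys.append(y)
    return np.array(xs), np.array(ys)

# Example with a contrived nonlinear pair (f, h):
xs, ys = simulate(f=lambda x, n: np.tanh(x) + 0.1 * n,
                  h=lambda x, m: x ** 2 + 0.3 * m,
                  x0=0.0, T=1000)
```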
Filtering. The goal of filtering is to estimate the current state of the system. Assume the subscript [:t] refers to all time instances up to and including t. At the current moment, denoted by subscript t, we have seen the history of all observations y_{:t} = (y_{:t-1}, y_t). Thus, the inference over the states of a dynamical system can be written as a two-phase process: a prediction phase that models our belief about the next state given only the previous observations (dx is dropped for simplicity throughout the paper),

    p(x_t | y_{:t-1}) = \int_{x_{t-1}} p(x_t | x_{t-1}) p(x_{t-1} | y_{:t-1})     (2)

and an update phase that modulates our belief about the current state through Bayes's formula:

    p(x_t | y_{:t}) = p(y_t | x_t) p(x_t | y_{:t-1}) / \int_{x_t} p(y_t | x_t) p(x_t | y_{:t-1})     (3)

Kalman derived the closed-form solution for a linear process \dot{X} = AX + Bv with Gaussian noise v ~ N(0, I) [1]. The problem is, however, very difficult for a nonlinear process f and sensor model h except in very restricted cases [2, 3]. The actual goal of filtering is often not computing the posterior distribution of states. Instead, the goal is computing some expectation E[g(x_t)] of a function g of the current state with respect to p(x_t | y_{:t}) or p(x_t, y_t | y_{:t-1}). The former results in an intractable integral whose computation scales exponentially with the state dimension dim(x) [4]. However, computing the latter scales linearly with dim(x). Even though the integral with respect to the probability measure p(x_t, y_t | y_{:t-1}) is computationally feasible, approximating the probability distribution itself is difficult for high-dimensional states and observations. This problem has been approached by various methods, including the Unscented Kalman Filter (UKF) [5], the Extended Kalman Filter (EKF) [6], and the Particle Filter (PF) [7], where the first two are parametric and the last is non-parametric. Most of the parametric methods adopt a variational approach and approximate p(x_t, y_t | y_{:t-1}) by a q(x_t, y_t | y_{:t-1}) that belongs to a parametric hypothesis space. The assumed form for q must be such that it eases the conditioning on y_t, which is readily possible for a Gaussian q. Nonetheless, a Gaussian distribution is not a realistic assumption for p except in very limited applications. In this paper, we propose an easily trainable and highly expressive variational distribution and an efficient method to learn its parameters.

2 Gaussian Filtering

In common filtering applications, what we usually care about is an expectation of the following form:

    E[g(x_t, y_t)] = \int_{x_t, y_t} g(x_t, y_t) p(x_t, y_t | y_{:t-1}) = \int_{x_t, m_t} g(x_t, h(x_t, m_t)) p(m_t) p(x_t | y_{:t-1})     (4)

where the right-hand integral is derived by plugging the observation model of (1) into the expectation. This integral is computable by Monte Carlo methods when the distribution p(x_t | y_{:t-1}) can be sampled efficiently and the noise has a simple model p(m_t). As a special case, integrals with respect to p(x_t | y_{:t-1}) can be computed efficiently as well. However, this requires p(x_t | y_t) to be easily computable from p(x_t, y_t), which is not the case for most distributions except very simple ones such as Gaussians. To ease the presentation, let us focus only on the prediction step (2) to compute p(x_t | y_{:t-1}). The history of observations y_{:t-1} is implicit in the model. Thus, we drop the indices and represent x_t by x and y_t by y. For example, p(x_t | y_{:t}) = p(x_t | y_t, y_{:t-1}) is simply represented by p(x | y). As mentioned in the previous section, filtering tries to find a good approximation to p(x, y) and ultimately to p(x | y). This process is carried out by first approximating p(x, y) by q(x, y) and then computing q(x | y) from q(x, y). The distribution q(x, y) is often chosen from a hypothesis space with limited capacity. For a Gaussian hypothesis set, we have

    q(x, y) = N( [\mu_x; \mu_y], [\Sigma_{xx}, \Sigma_{xy}; \Sigma_{yx}, \Sigma_{yy}] )
    q(x | y) = N( \mu_x + \Sigma_{xy} \Sigma_{yy}^{-1} (y - \mu_y),  \Sigma_{xx} - \Sigma_{xy} \Sigma_{yy}^{-1} \Sigma_{xy}^T )     (5)

which is in general called the Gaussian Filter (GF). There are two obvious limitations in this framework: first, the posterior distribution (5) is Gaussian; second, the mean of the posterior distribution of states in (5) is an affine function of the observations. In the next section, we relax both these assumptions.
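As a concrete reference point, the GF conditioning step (5) is only a few lines of linear algebra. The following is a minimal sketch under our own naming conventions; the paper states the formula but gives no implementation.

```python
# Sketch: Gaussian conditioning (5) -- from the moments of a joint Gaussian
# q(x, y) to the Gaussian conditional q(x | y).
import numpy as np

def gaussian_condition(mu_x, mu_y, S_xx, S_xy, S_yy, y):
    """Mean and covariance of q(x | y) for a joint Gaussian q(x, y)."""
    K = S_xy @ np.linalg.inv(S_yy)    # gain matrix Sigma_xy Sigma_yy^{-1}
    mean = mu_x + K @ (y - mu_y)      # affine in the observation y
    cov = S_xx - K @ S_xy.T           # does not depend on the value of y
    return mean, cov

# Toy usage with 1-D state and observation:
mean, cov = gaussian_condition(np.zeros(1), np.zeros(1),
                               np.array([[1.0]]), np.array([[0.8]]),
                               np.array([[2.0]]), y=np.array([1.5]))
```

Both limitations discussed above are visible here: the returned distribution is Gaussian, and its mean is affine in y.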
[Figure 1 appears here: (a) State-observation evolution; (b) the architecture for sampling from the posterior, x = \psi(\phi(y_{t-T:t}), z) with z ~ N(0, I).]

Figure 1: (a) Solid lines show how observations are generated by the evolution of states in the Markovian setting; dashed lines show the non-Markovian setting. (b) The observations of the T previous timesteps are fed to the network and transformed into nonlinear features. The features are concatenated with samples from an external source of noise (z) and passed through a nonlinear function whose output is supposed to match samples from the posterior distribution of states given the observations of the last T timesteps.

3 Nonlinear Non-Gaussian Filtering

We take a nonlinear approach and directly approximate the conditional distribution p(x | y) of (3) by a Multilayer Perceptron (MLP) as a universal function approximator [8]. In this formulation, q(x | y) = D(x | \phi(y)), where \phi is a nonlinear function of y. Moreover, D can be any complex distribution over x belonging to the n-dimensional state space. We do not compute D directly. Rather, we generate samples x_i such that x_i ~ D. In analogy with kernel machines, we call \phi : R^m -> R^M a feature extractor that transforms the measurements by a nonlinear function from the m-dimensional sensor space to the M-dimensional feature space. Let us assume \phi is parameterized by an MLP as \phi(y; \theta_\phi). This is, after all, a deterministic mapping and lacks the required stochasticity. Therefore, we provide the stochastic fuel to q(x | y) by passing samples z ~ N(0, I), alongside the extracted features \phi(y; \theta_\phi), through a secondary parameterized function \psi(z, \phi(y; \theta_\phi); \theta_\psi). Back-propagation is then used to perturb the parameters \theta_\phi and \theta_\psi so as to make the output of \psi close to samples from p(x | y). The overall architecture, partly inspired by [9], is shown in Fig. 1(b). The dashed arrows in Fig. 1(a) suggest the possibility of weakening the Markovian assumption of (1) so that distant states in the past can influence the current observation. Despite the difficulty of filtering for non-Markovian systems in other methods [4], the proposed method can take care of it simply by feeding more observations from the past into the network, as depicted in Fig. 1(b); see the sketch below.
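The following PyTorch sketch shows one possible realization of this architecture; the layer sizes and the noise dimension are our own illustrative choices, not the paper's exact configuration.

```python
# Sketch: the sampling architecture of Fig. 1(b) -- features phi(y) are
# concatenated with external noise z ~ N(0, I) and mapped by psi to a
# state sample from the implicit posterior q(x | y).
import torch
import torch.nn as nn

class PosteriorSampler(nn.Module):
    def __init__(self, obs_dim, feat_dim, noise_dim, state_dim, hidden=128):
        super().__init__()
        self.phi = nn.Sequential(   # feature extractor phi(y; theta_phi)
            nn.Linear(obs_dim, hidden), nn.Tanh(), nn.Linear(hidden, feat_dim))
        self.psi = nn.Sequential(   # stochastic map psi(z, phi(y); theta_psi)
            nn.Linear(feat_dim + noise_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, state_dim))
        self.noise_dim = noise_dim

    def forward(self, y, n_samples=1):
        feats = self.phi(y)                                # (B, feat_dim)
        feats = feats.repeat_interleave(n_samples, dim=0)  # K samples per y
        z = torch.randn(feats.shape[0], self.noise_dim)    # external noise
        return self.psi(torch.cat([feats, z], dim=-1))     # x ~ q(x | y)

# Feeding a window y_{t-T:t}, as in the non-Markovian variant, only means
# obs_dim = (T + 1) * sensor_dim.
sampler = PosteriorSampler(obs_dim=1, feat_dim=10, noise_dim=4, state_dim=1)
x_samples = sampler(torch.zeros(8, 1), n_samples=5)        # shape (40, 1)
```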
Learning the state posterior. The proposed method is expected to accurately capture the posterior distribution p(x | y) by q(x | y), where q(x | y) is much more flexible than a Gaussian. We define the loss l : X x X -> R^+ \cup {0} as the simple Euclidean distance l(x, x') = ||x - x'|| for x ~ p(x | y) and x' ~ q(x | y). Since an MLP can theoretically capture arbitrarily complex functions [8], we go one step further and make q(x(t) | y(t)) a function of not only y(t) but also a few previous observations of the system, which turns the implicit distribution into q(x_t | y_{t-T:t}), where T is the approximate time interval in the past over which the observations are informative about the current hidden state of the dynamical system. Inspired by [9], given any non-negative symmetric loss function l(x, x') for (x, x') in X x X, we define the diversity coefficient as

    \Delta_l(p, q) = E_{x' ~ q(x | y_{t-T:t})} [ E_{x ~ p(x | y_{t-T:t})} [ l(x, x') ] ]     (7)

On the other hand, due to the uncertainty in the posterior, we know that q(x_t | y_{t-T:t}) should not collapse to an extremely low-entropy distribution. To encourage the implicitly estimated posterior to have higher entropy, we add a diversity-encouraging term to the loss function in which similarity among samples from the same distribution acts as a repulsive force. Hence, the overall loss function becomes

    L_l(q) = \Delta_l(p, q) - \lambda \Delta_l(q, q)     (8)

where \lambda is a hyper-parameter that roughly controls the empirical entropy of the samples generated from the implicit variational posterior q(x | y).
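Under the Euclidean l, a Monte Carlo version of (8) (made precise in the Optimization paragraph below) can be sketched as follows; the tensor shapes and names are our own convention.

```python
# Sketch: empirical loss (8). The first term pulls generated samples toward
# the true states; the second (weighted by lambda) pushes samples from q
# apart, keeping the empirical entropy of q up.
import torch

def disco_loss(x_true, x_gen, lam=1.0):
    """x_true: (B, D) true states; x_gen: (B, K, D) samples from q(x | y)."""
    # \hat{\Delta}_l(p, q): mean distance from each true state to its K samples
    fit = (x_gen - x_true.unsqueeze(1)).norm(dim=-1).mean()
    # \hat{\Delta}_l(q, q): mean pairwise distance among the K samples
    pair = (x_gen.unsqueeze(1) - x_gen.unsqueeze(2)).norm(dim=-1)  # (B, K, K)
    K = x_gen.shape[1]
    diversity = pair.sum() / (pair.shape[0] * K * (K - 1))  # diagonal is zero
    return fit - lam * diversity
```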
[Figure 2 appears here: estimated posterior p(x | y) for (a) the Gaussian Filter, (b) the proposed method, (c) the Nonlinear Gaussian Filter (degree 3), and (d) the Nonlinear Gaussian Filter (degree 7).]

Figure 2: Estimated posterior by different methods. The shaded area shows the standard deviation around the mean. Notice that the observation is on the vertical axis, so the uncertainty is the width of the shaded area along the horizontal axis. As can be seen, GF (a) is quite inaccurate, since its posterior is Gaussian and its mean is only an affine function of the observation. NGF (c), with a polynomial nonlinearity of degree 3, fits the posterior better than GF, but it needs a good choice of nonlinearity to give an acceptable result; otherwise, the estimated posterior can be too simple or too complex (d). Moreover, this method becomes very uncertain in part of the observation range, which means it can give many different values of x for observations in that region. The proposed method shows good performance, which is due to its two trainable nonlinearities: roughly speaking, \phi approximates the mean and \psi approximates the variance of p(x | y) corresponding to each observation.

Optimization. In practice, the loss function (8) is approximated empirically by the sums

    \hat{\Delta}_l(p, q_{\theta_\phi, \theta_\psi}) = \frac{1}{N} \sum_{n=1}^{N} \frac{1}{K} \sum_{k=1}^{K} l( x_n, \psi(\phi(y_n; \theta_\phi), z_k; \theta_\psi) )

    \hat{\Delta}_l(q_{\theta_\phi, \theta_\psi}, q_{\theta_\phi, \theta_\psi}) = \frac{1}{N} \sum_{n=1}^{N} \frac{1}{K(K-1)} \sum_{k=1}^{K} \sum_{k' \neq k} l( \psi(\phi(y_n; \theta_\phi), z_k; \theta_\psi), \psi(\phi(y_n; \theta_\phi), z_{k'}; \theta_\psi) )

and the loss function \hat{L}(\theta_\phi, \theta_\psi) = \hat{\Delta}_l(p, q_{\theta_\phi, \theta_\psi}) - \lambda \hat{\Delta}_l(q_{\theta_\phi, \theta_\psi}, q_{\theta_\phi, \theta_\psi}) is optimized with respect to {\theta_\phi, \theta_\psi} by gradient descent, as in the sketch below. Notice that, for each training example (x_n, y_n) from the training set, we need to sample K values of the external noise z. A greater K results in faster convergence.
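A minimal training-step sketch, assuming the PosteriorSampler and disco_loss sketches above and a training set of (observation, state) pairs; the optimizer settings are placeholders, not the paper's exact hyper-parameters.

```python
# Sketch: fitting the implicit posterior by gradient descent on (8).
import torch

def train(sampler, ys, xs, K=8, lam=1.0, lr=1e-3, steps=1000):
    """ys: (B, obs_dim) observations; xs: (B, state_dim) matching states."""
    opt = torch.optim.Adam(sampler.parameters(), lr=lr)
    for _ in range(steps):
        x_gen = sampler(ys, n_samples=K)          # (B*K, state_dim)
        x_gen = x_gen.view(ys.shape[0], K, -1)    # regroup to (B, K, state_dim)
        loss = disco_loss(xs, x_gen, lam=lam)     # empirical loss (8)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return sampler
```

The regrouping works because repeat_interleave in the sampler places the K samples for each observation in consecutive rows.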
4 Experiments

We compare the performance of the proposed method with the Gaussian Filter (GF) and with the Nonlinear Gaussian Filter (NGF) [4], where a fixed nonlinear feature extractor is used to transform sensor measurements to the feature space. The system is described by f(x_{t-1}, n_t) = x_{t-1} + n_t, h(x_t, m_t) = x_t + m_t + 5 H(x_t), and p(x_{t-1}) = N(x_{t-1}; 0, 5), where H(.) is the Heaviside step function. See Fig. 2 and its caption for a description, and Appendix B for more details.

5 Conclusion

We proposed a method that learns to filter the states of dynamical systems given previous values of sensor measurements and states. It benefits from the flexibility of MLPs to deal with major limitations of other methods, such as linearity, Gaussianity, and the Markovian assumption, in a simple unified way. Notice that the method generates samples from the posterior, which is enough for computing the integral (3). The method, however, cannot be used in applications where the evaluation of p(x | y) itself is required, or where samples of (observation, state) pairs are not available for the learning phase.

References

[1] Rudolph Emil Kalman. A new approach to linear filtering and prediction problems. Journal of Basic Engineering, 82(1):35-45, 1960.
[2] V. E. Beneš. Exact finite-dimensional filters for certain diffusions with nonlinear drift. Stochastics: An International Journal of Probability and Stochastic Processes, 5(1-2):65-92, 1981.
[3] Frederick Daum. Exact finite-dimensional nonlinear filters. IEEE Transactions on Automatic Control, 31(7):616-622, 1986.
[4] Manuel Wüthrich, Sebastian Trimpe, Cristina Garcia Cifuentes, Daniel Kappler, and Stefan Schaal. A new perspective and extension of the Gaussian filter. The International Journal of Robotics Research, 35(14):1731-1749, 2016.
[5] Simon J. Julier and Jeffrey K. Uhlmann. New extension of the Kalman filter to nonlinear systems. In Signal Processing, Sensor Fusion, and Target Recognition VI, volume 3068, pages 182-193. International Society for Optics and Photonics, 1997.
[6] Harold Wayne Sorenson. Kalman Filtering: Theory and Application. IEEE Press, 1985.
[7] Neil J. Gordon, David J. Salmond, and Adrian F. M. Smith. Novel approach to nonlinear/non-Gaussian Bayesian state estimation. In IEE Proceedings F (Radar and Signal Processing), volume 140, pages 107-113. IET, 1993.
[8] Kurt Hornik. Approximation capabilities of multilayer feedforward networks. Neural Networks, 4(2):251-257, 1991.
[9] Diane Bouchacourt, Pawan K. Mudigonda, and Sebastian Nowozin. DISCO Nets: Dissimilarity Coefficients Networks. In Advances in Neural Information Processing Systems, pages 352-360, 2016.
[10] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
Appendices

A Description of dynamical systems

We assume the generic formulation of dynamical systems as follows:

    \dot{x}(t) = f(x(t), u(t), t)
    y(t) = h(x(t), u(t), t)     (9)

As a simplifying assumption, we ignore the explicit dependence on time for now. Moreover, we assume the system is closed loop, i.e., u(t) is designed by state feedback to be a function of the states, u(x(t)). By considering the system as time-invariant and discrete (which is the case in practice, where reading the sensors and issuing the control signals are performed by digital systems), we denote the current moment by t and the timestep before it by t-1. Therefore, the description of the dynamics f and the observation model h simplifies to (1).

B Details on the experiment

The experiment was performed on the following nonlinear stochastic dynamical system proposed in [4]:

    x_t = x_{t-1} + n_t
    y_t = x_t + m_t + 5 H(x_t)     (10)

where the subscripts have the meaning described earlier for (1). The state and observation noises are both Gaussian, with variances 0.1 and 0.3 respectively. The value of \lambda in the loss function (8) is set to 1; however, we obtained comparable results for a range of \lambda values around this choice. The training set is generated by running the dynamical system forward from a random starting state x_0 ~ N(0, 1). The generated training set is then used to minimize the loss function (8) with the Adam optimizer [10], with a learning-rate decay of 0.95 applied at regular intervals and mini-batch training. Training of the networks is continued until the values of the parameters converge. Notice that the shaded area in Fig. 2 shows the standard deviation around the mean. In GF and NGF, the standard deviation has a closed-form formula. In the proposed method, since the posterior distribution is implicit and we only have access to samples generated from the approximated posterior, the variance is computed empirically from the samples generated from q(x | y) and plotted to show that the diversity of the generated samples matches the diversity of the actual posterior distribution p(x | y).

Network architecture. We used almost the same network architecture for both the \phi and \psi functions: two hidden layers, each with 128 neurons and tanh nonlinearity, followed by a linear output layer. The only difference between the networks realizing \phi and \psi is that the former has an output layer whose width equals the dimension M of the feature space to which the sensor measurements are transformed, while \psi obviously has the same output dimension as the dimension of the state. We did experiments with several other dynamical systems with different state/sensor dimensions and consistently observed improvement over GF and NGF.
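For reference, a minimal sketch of generating training data for the benchmark system (10); the noise variances and the initial-state distribution follow this appendix, while the trajectory length and seeding are our own illustrative choices.

```python
# Sketch: sampling a trajectory from the benchmark system (10).
import numpy as np

def generate_benchmark(T, rng=None):
    rng = np.random.default_rng(0) if rng is None else rng
    x = rng.normal(0.0, 1.0)                   # x_0 ~ N(0, 1)
    xs, ys = [], []
    for _ in range(T):
        x = x + rng.normal(0.0, np.sqrt(0.1))  # x_t = x_{t-1} + n_t, Var = 0.1
        y = x + rng.normal(0.0, np.sqrt(0.3)) + 5.0 * (x > 0)  # 5 H(x_t) term
        xs.append(x)
        ys.append(y)
    return np.array(xs), np.array(ys)

xs, ys = generate_benchmark(T=10_000)  # T is an illustrative choice
```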