Missing Data and Dynamical Systems
1 University of Illinois at Urbana-Champaign. CS598PS Machine Learning for Signal Processing. Missing Data and Dynamical Systems. 12 October 2017
2 Today's lecture: dealing with missing data; tracking and linear dynamical systems.
3 Missing data. You don't always have all your data! E.g. cell phone drops, gaps in pictures, ... How do we deal with that?
4 A simple case?
5 A real-world case!
6 General classification of cases. Missing completely at random (MCAR): the missingness is really random. Missing at random (MAR): the missingness depends on some variable, but not on the missing data values. Not missing at random (NMAR): none of the above.
7 Starting with 1D. Assume a small gap in 1D data. Can we find the missing values?
8 Key observations: temporal structure. The signal is somewhat periodic, so we could predict the future from the past. How do we formalize this?
9 A predictive model. Predict the current sample from the past, using a weighted average of preceding samples. This is the autoregressive (AR) model: $x_t = \sum_{i=1}^{p} a_i x_{t-i} + e_t$. We can learn the coefficients $a_i$ using various methods.
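The slide leaves the estimation method open; below is a minimal sketch of one common option, ordinary least squares on lagged samples (the helper name `fit_ar` and the order `p = 20` are illustrative choices, not from the lecture):

```python
import numpy as np

def fit_ar(x, p):
    """Fit AR(p) coefficients a by least squares, so that
    x[t] ~ a[0]*x[t-1] + ... + a[p-1]*x[t-p]."""
    T = len(x)
    # One row per t in [p, T); the columns hold x[t-1], ..., x[t-p]
    L = np.column_stack([x[p - 1 - i : T - 1 - i] for i in range(p)])
    a, *_ = np.linalg.lstsq(L, x[p:], rcond=None)
    return a

# Toy usage: learn the structure of a noisy sinusoid
t = np.arange(500)
x = np.sin(0.07 * t) + 0.01 * np.random.randn(500)
a = fit_ar(x, p=20)
next_sample = x[-1:-21:-1] @ a  # one-step prediction from the last 20 samples
```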
10 Rewriting the model, in linear algebra notation: $e = A\,x$, where $A$ is a banded matrix whose rows each compute one residual $e_t = x_t - \sum_i a_i x_{t-i}$:

$$A = \begin{bmatrix} -a_p & \cdots & -a_1 & 1 & 0 & 0 & \cdots \\ 0 & -a_p & \cdots & -a_1 & 1 & 0 & \cdots \\ \vdots & & \ddots & & & \ddots & \end{bmatrix}$$
11 Filling gaps. Model the input sequence as $e = A\,x = A\,(U x_u + K x_k) = A_u x_u + A_k x_k$, where $x_u$ and $x_k$ are the unknown and known samples, and $U$ and $K$ are repositioning matrices. E.g. for $x = [x_1\; x_2\; x_3\; x_4]^\top$, $x_2, x_3$ could be unknown and $x_1, x_4$ known.
12 Isolating the unknowns. The final form isolates the unknowns as a variable: $A_u x_u + A_k x_k = e$. These are values that should conform to the model (assuming they aren't missing at random!). We thus want to find the $x_u$ that minimizes $e$, i.e. fill in the blanks with the least unpredictable samples.
13 Solving for the gaps. Set the derivative to zero:

$$\frac{\partial\, e^\top e}{\partial x_u} = 2 \left( \frac{\partial e}{\partial x_u} \right)^{\!\top} e = 2 A_u^\top \left( A_u x_u + A_k x_k \right) = 0$$

and solve for the unknown samples:

$$\hat{x}_u = -\left( A_u^\top A_u \right)^{-1} A_u^\top A_k x_k$$
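Putting slides 10-13 together, here is a sketch of the resulting gap filler, assuming the residual convention $e_t = x_t - \sum_i a_i x_{t-i}$ and reusing the illustrative `fit_ar` from above:

```python
import numpy as np

def fill_gap(x, missing, a):
    """Fill the masked samples so the AR residuals are as small as possible.
    x: signal (any values at missing positions), missing: boolean mask,
    a: AR coefficients, e.g. from the fit_ar sketch above."""
    T, p = len(x), len(a)
    # Banded analysis matrix: row for time t computes e_t = x_t - sum_i a_i x_{t-i}
    A = np.zeros((T - p, T))
    for r in range(T - p):
        t = r + p
        A[r, t] = 1.0
        A[r, t - np.arange(1, p + 1)] = -a
    Au, Ak = A[:, missing], A[:, ~missing]
    # Minimize ||Au x_u + Ak x_k||^2  ->  x_u = -(Au'Au)^{-1} Au' Ak x_k
    xu, *_ = np.linalg.lstsq(Au, -Ak @ x[~missing], rcond=None)
    filled = x.copy()
    filled[missing] = xu
    return filled
```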
14 Results. Works pretty well! (Plot: input data; blue/cyan is the real signal, red the restoration.)
15 But not for large windows! (Plot: same legend.)
16 Looking for a new idea. We should move away from the sample level and look at a coarser scale, e.g. a time-frequency view.
17 A simpler idea. Find the sections most similar to the gap edges and replace the missing section with their neighbors? E.g. in $x_1\, x_2\, x_3\; ??\; ??\; x_6\, x_7\, x_8$ we would fill the gap from a segment whose surroundings match $x_1 x_2 x_3$ and $x_6 x_7 x_8$. How do we find the closest match?
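A rough 1D sketch of this matching step (the context length `ctx` and the brute-force scan are illustrative simplifications, not the lecture's exact procedure, and this shows a single pass rather than the repeated refinement on the next slides):

```python
import numpy as np

def best_match_fill(x, gap_start, gap_len, ctx=32):
    """Find the window elsewhere in x whose left/right neighborhoods best
    match the gap's edges, and copy its contents into the gap."""
    left = x[gap_start - ctx : gap_start]
    right = x[gap_start + gap_len : gap_start + gap_len + ctx]
    best_s, best_cost = None, np.inf
    for s in range(ctx, len(x) - gap_len - ctx):
        if abs(s - gap_start) < gap_len + ctx:  # candidate overlaps the gap
            continue
        cost = (np.sum((x[s - ctx : s] - left) ** 2) +
                np.sum((x[s + gap_len : s + gap_len + ctx] - right) ** 2))
        if cost < best_cost:
            best_s, best_cost = s, cost
    filled = x.copy()
    filled[gap_start : gap_start + gap_len] = x[best_s : best_s + gap_len]
    return filled
```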
18 In action: first iteration.
19 In action: second iteration.
20 In action: third iteration.
21 In action: fourth iteration.
22 In action: final iteration. (Figure: input vs. output.)
23 Some examples. (Audio demos: input/output pairs for speech, music, and sound effects.)
24 What about images? The basic idea is the same: find a neighbor and blend it in. It's a bit more complicated, though: we need to blend carefully, we can't just add (Poisson blending), and it's also more computationally intensive (2D search!).
25 A real-world example: inpainting, recomposing, warping the truth! PatchMatch: A Randomized Correspondence Algorithm for Structural Image Editing. Connelly Barnes, Eli Shechtman, Adam Finkelstein, Dan B Goldman.
26 The statistical viewpoint. Nearest-neighbor search was local, so we ignored the global data structure. Missing data should conform to a statistical model of the input, i.e. they shouldn't be outliers.
27 SVD-based imputation, a simple global approach. Replace the missing data with some values, e.g. $x_u = E\{x\}$ or $x_u \sim N(\mu, \sigma^2)$ or ... Perform an SVD approximation of the data, $X \approx U S V^\top$. Replace the missing data with the SVD approximation, $x_u = (U S V^\top)_u$. Repeat!
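A minimal sketch of that loop, assuming a data matrix `X` with a boolean `mask` of observed entries; the truncation `rank` and iteration count are illustrative hyperparameters:

```python
import numpy as np

def svd_impute(X, mask, rank=5, iters=50):
    """Iteratively replace missing entries with a low-rank SVD approximation."""
    observed_sum = np.where(mask, X, 0.0).sum(axis=0)
    col_mean = observed_sum / np.maximum(mask.sum(axis=0), 1)
    Xf = np.where(mask, X, col_mean)             # x_u = E{x} initialization
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(Xf, full_matrices=False)
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        Xf = np.where(mask, X, approx)           # overwrite only the missing entries
    return Xf
```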
28 Why? The known data will dominate the statistics (assuming there are enough of them!). Each approximation will try to conform to the global statistics of the input, i.e. the made-up values will not be random. With each new iteration, that conformance to the global statistics will increase.
29 Variations. We don't have to use an SVD; e.g. our data might be non-negative, so use NMF, or probabilistic models, or whatever makes sense. Any global statistical model would work; just make sure that it fits the data well.
30 Example. Learn from the input and fill in the missing values. (Figure: input vs. SVD and NMF reconstructions.)
31 Learning from outside: bandwidth expansion. Learn the model from other recordings. (Figure: input vs. SVD and NMF reconstructions.)
32 Tracking. Tracking things that evolve in time: missiles, faces, finances, etc. A time-series model again.
33 A simple problem. I'm looking at a star which is about there. So does my drunken friend. How do we consolidate our observations?
34 An uncertain estimate. My observation is a Gaussian, $N(\mu_1, \sigma_1^2)$, which helps me account for some uncertainty: the mean is what I think is right, and the variance is an indication of how sure I am.
35 More uncertainty. My drunken friend's estimate is unreliable, therefore his variance is higher: $N(\mu_1, \sigma_1^2)$ vs. $N(\mu_2, \sigma_2^2)$ with $\sigma_2 > \sigma_1$.
36 Consensus estimate. The consensus estimate is proportional to a Gaussian: $N(\mu_1, \sigma_1^2)\, N(\mu_2, \sigma_2^2) \propto N(m, s)$, where

$$K = \frac{\sigma_1^2}{\sigma_1^2 + \sigma_2^2}, \qquad m = \mu_1 + K (\mu_2 - \mu_1), \qquad s = (1 - K)\, \sigma_1^2$$
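In code, the consensus of two Gaussian estimates is only a few lines (the numbers in the usage lines are made up for illustration):

```python
def fuse(mu1, var1, mu2, var2):
    """Consensus of two Gaussian estimates (the slide's K, m, s)."""
    K = var1 / (var1 + var2)
    m = mu1 + K * (mu2 - mu1)  # pulled toward the more certain estimate
    s = (1 - K) * var1
    return m, s

# My confident estimate vs. my drunken friend's shaky one:
m, s = fuse(10.0, 1.0, 14.0, 9.0)  # -> m = 10.4, s = 0.9
```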
37 Serializing this process. Assume a noisy time series $z_t = z_{t-1} + e_t$, $e_t \sim N(0, \sigma^2)$. Can we use that consensus idea for removing noise in successive values? What is the model here?
38 Example case. Track the position of a moving ball.
39 Some simple processing: background removal. Subtract the constant part, then find the center of the remaining pixels.
40 The problem. The position estimate is very noisy. Let's see how we can clean it up.
41 Making a model. Define a transition process (looks familiar?):

$$\begin{bmatrix} x_t \\ y_t \end{bmatrix} = A \begin{bmatrix} x_{t-1} \\ y_{t-1} \end{bmatrix} + e_t, \qquad A = I, \qquad e_t \sim N(0, Q)$$

In short, the next input will be a random value away from the current input: $z_t = A\, z_{t-1} + e_t$.
42 Starting with it. The initial estimate will be the first input, $\hat{z}_1 = z_1$, with a given certainty $P = \operatorname{diag}(\sigma^2, \sigma^2)$, and two noise models: the process noise $Q = \operatorname{diag}(q_x, q_y)$ (how much the data changes) and the measurement noise $R = \operatorname{diag}(r_x, r_y)$ (how reliable my measurements are).
43 Setting up sequential estimates. Predict the future value: $\bar{z}_t = \hat{z}_{t-1}$ (we don't expect change other than noise) and $\bar{P}_t = P_{t-1} + Q$ (we accumulate the uncertainty). Then correct that estimate given the new input: $K = \bar{P}_t (\bar{P}_t + R)^{-1}$ (the uncertainty of a consensus estimate), $\hat{z}_t = \bar{z}_t + K (z_t - \bar{z}_t)$ (the estimate itself), and $P_t = (I - K)\, \bar{P}_t$ (the new uncertainty).
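A scalar sketch of this predict/correct loop under the random-walk model above ($A = I$, the position measured directly):

```python
import numpy as np

def random_walk_filter(z, q, r):
    """Sequentially denoise 1D measurements z with process noise q
    and measurement noise r (the A = H = I special case)."""
    zhat, P = z[0], r                 # start at the first input
    out = [zhat]
    for zt in z[1:]:
        P = P + q                     # predict: accumulate uncertainty
        K = P / (P + r)               # correct: consensus gain
        zhat = zhat + K * (zt - zhat)
        P = (1 - K) * P
        out.append(zhat)
    return np.array(out)
```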
44 Single step. (Diagram: the previous estimate $\hat{z}_{t-1}$ is propagated with process noise $Q$ and combined with the new measurement $z_t$ and its noise $R$ to give $\hat{z}_t$.)
45 Example output. The estimate is closer to the ground truth.
46 Example output. Tracking is smoother, but it lags a bit (why?). (Plot: noisy input vs. filtered output.)
47 Complicating the process: adding velocity information. Internal state representation: $z_t = \left[ x_t\;\; y_t\;\; \tfrac{dx}{dt}\;\; \tfrac{dy}{dt} \right]^\top$. The transition forces constant velocity:

$$z_t = A\, z_{t-1} + e_t, \qquad A = \begin{bmatrix} 1 & 0 & dt & 0 \\ 0 & 1 & 0 & dt \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

48 Measurements. What we measure is position only: $w_t = H z_t + v_t$, with $H = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}$. The internal state is not the same as the measurement, and the measurement has its own noise.
49 The Kalman filter. Predict the future value: $\bar{z}_t = A\, \hat{z}_{t-1}$ (change the state according to the model) and $\bar{P}_t = A P_{t-1} A^\top + Q$ (accumulate the uncertainty). Then correct that estimate given the new input: $K = \bar{P}_t H^\top (H \bar{P}_t H^\top + R)^{-1}$ (the uncertainty of a consensus estimate), $\hat{z}_t = \bar{z}_t + K (w_t - H \bar{z}_t)$ (the estimate itself), and $P_t = (I - K H)\, \bar{P}_t$ (the new uncertainty).
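A sketch of the full filter for the constant-velocity state of slides 47-48; the noise levels `q` and `r` are assumed placeholders you would tune:

```python
import numpy as np

def kalman_cv(measurements, dt=1.0, q=1e-3, r=1.0):
    """Track 2D positions with a constant-velocity state [x, y, dx, dy],
    measuring position only. measurements: array of shape (T, 2)."""
    A = np.eye(4); A[0, 2] = A[1, 3] = dt          # x += dx*dt, y += dy*dt
    H = np.zeros((2, 4)); H[0, 0] = H[1, 1] = 1.0  # observe position only
    Q, R = q * np.eye(4), r * np.eye(2)
    zhat = np.zeros(4); zhat[:2] = measurements[0]
    P = np.eye(4)
    track = []
    for w in measurements:
        zbar = A @ zhat                            # predict the state
        Pbar = A @ P @ A.T + Q                     # accumulate uncertainty
        K = Pbar @ H.T @ np.linalg.inv(H @ Pbar @ H.T + R)
        zhat = zbar + K @ (w - H @ zbar)           # correct with the new input
        P = (np.eye(4) - K @ H) @ Pbar
        track.append(zhat[:2].copy())
    return np.array(track)
```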
50 More elaborate extensions. Add acceleration, etc.: a more complex internal state gives more accurate models for various inputs. Use this to track moving processes: a missile over an ocean, a finger over a touchscreen, cars on the highway.
51 And yet more extensions: non-linear processes, $z_t = f(z_{t-1}) + e_t$, $w_t = h(z_t) + v_t$. The extended Kalman filter; the unscented Kalman filter (for very nonlinear processes); particle filters, a sampling-based method that bypasses the Gaussian assumptions.
52 An example.
53 In the real world.
54 Using dynamical systems to classify. We can also classify sequences using dynamical models; so far we only did prediction, tracking, and smoothing. Dynamical models can be fit on training sequences and evaluated on new sequences that we want to classify; goodness of fit (for now) can be used for classification.
55 An example. 3D accelerometer data from a cell phone; the classes are user activities (sitting, walking, going upstairs, etc.), and each activity has a unique temporal pattern. (Plots: x/y/z-axis sensor readings (g) over time (sec) for walking and walking_downstairs.)
56 The VAR model (Vector AutoRegression). We can make a linear model that predicts future samples from the past:

$$\begin{bmatrix} x_{t+1} \\ y_{t+1} \\ z_{t+1} \end{bmatrix} = A \begin{bmatrix} x_t \\ y_t \\ z_t \end{bmatrix} + e_t$$

The matrix $A$ should be unique to each class, capturing that class's temporal structure, and we can easily estimate it with e.g. least squares.
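A least-squares sketch of fitting and scoring such a model (the per-class model dictionary in the comments is hypothetical):

```python
import numpy as np

def fit_var(X):
    """Fit x_{t+1} = A x_t + e_t by least squares; X has shape (T, d),
    e.g. the (T, 3) accelerometer stream. Returns A of shape (d, d)."""
    A_T, *_ = np.linalg.lstsq(X[:-1], X[1:], rcond=None)  # past @ A.T ~ future
    return A_T.T

def prediction_mse(X, A):
    """Mean squared one-step prediction error of model A on sequence X."""
    return np.mean((X[1:] - X[:-1] @ A.T) ** 2)

# Hypothetical classification: pick the class whose model predicts best
# models = {c: fit_var(Xc) for c, Xc in train_sequences.items()}
# label = min(models, key=lambda c: prediction_mse(X_new, models[c]))
```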
57 Predicting new inputs. Get each model's prediction error on the new samples; the model with the least error belongs to the same class as the input. We can also reformulate this probabilistically to get likelihoods instead. (Plots: prediction errors and MSEs for a class-A sample on models A and B, and a class-B sample on models A and B.)
58 One big happy family. LDS: $x_{t+1} = A x_t + N(0, Q)$, $y_t = C x_t + N(0, R)$. HMM: $x_{t+1} = \mathrm{WTA}(A x_t + N(0, Q))$, $y_t = C x_t + N(0, R)$. PCA: $A = 0$, $x = N(0, Q)$, $y = C x + N(0, R)$. ICA: $A = 0$, $x = g(N(0, Q))$, $y = C x + N(0, R)$. GMM / k-means / VQ: $A = 0$, $x = \mathrm{WTA}(N(\mu, Q))$, $y = C x + N(0, R)$.
59 So it's all really the same model! Like I said at the beginning of this class, DSP and ML are pretty much the same equation over and over.
60 Recap. Missing data techniques: AR models using temporal dynamics; NN, SVD, and NMF using context information. Tracking and prediction: the Kalman filter et al., and the broader linear dynamical system family.
61 Next week: source separation. Array methods using machine learning; monophonic signal separation methods.
62 Reading: missing data in audio; inpainting (and more) in images; the Kalman filter; LDS.