Distributed MAP probability estimation of dynamic systems with wireless sensor networks

1 Distributed MAP probability estimation of dynamic systems with wireless sensor networks
Felicia Jakubiec, Alejandro Ribeiro
Dept. of Electrical and Systems Engineering, University of Pennsylvania
Penn Seminar on Communications and Networking, Nov. 15, 2011

2 Estimation problem in sensor networks
- A sensor network consisting of several sensors
- Signal values change with time
- Every node collects a noisy observation of the true values
- Sensors within the network communicate with each other
[Figure: network of sensors x_1, ..., x_6 observing signals s_1 and s_2]

3 Examples of sensor networks
- A WSN project to monitor the health of a city
- Uses WiFi radios
- Included sensors for particle counting, temperature, pressure, wind speed, rainfall, monitoring street activity, etc.

4 Local and Global estimation
Local estimation
- Sensors estimate based on their own observations only
- Estimation accuracy can improve by using other sensors' observations
Global estimation
- Centralized solution using all information available
- A designated node or fusion center receives information from all nodes; fragile: what if the fusion center fails?
- Or, all nodes receive information from all other nodes; large communication (information exchange) cost

5 Distributed estimation
- Exchange information with neighboring nodes only
- Increases estimation accuracy with reasonable communication cost
[Figure: network of sensors x_1, ..., x_6 observing signals s_1 and s_2]
- Sensor network represented by a connected graph $G = (V, E)$
- Sensor $i$ can communicate with sensor $j$ only if $j \in n_i$
- Estimates are computed locally with information only from neighbors

6 Signal Model
Continuous-time model
- Signal $s_a(\tau)$ satisfies a differential equation ($u_a(\tau)$ = driving noise)
  $\dot s_a(\tau) = f(s_a(\tau), u_a(\tau))$
- Determines the transition probability $P(s_a(\tau + h) \mid s_a(\tau))$
- Observation model specified by the conditional pdf $P(x_{ak}(\tau) \mid s_a(\tau))$
Equivalent discrete-time model
- Sample the continuous-time signal with period $T_s$
- Discrete-time observations $x_k^n = x_{ak}(n T_s)$ and signals $s^n = s_a(n T_s)$
- Discrete-time transition $P(s^n \mid s^{n-1})$ and conditional $P(x_k^n \mid s^n)$

7 Linear autoregressive Gaussian model
LTI system with Gaussian noise, for all sensors $k$:
  $\dot s_a(t) = A_a s_a(t) + u_a(t)$,   $x_{ak}(t) = H_{ak} s_a(t) + n_{ak}(t)$
  $s^{n+1} = A s^n + u^n$,   $x_k^n = H_k s^n + n_k^n$
- Driving noise: $u_a(t) \sim N(0, Q_a)$ and $u^n \sim N(0, Q)$
- Observation noise: $n_{ak}(t) \sim N(0, R_{ak})$ and $n_k^n \sim N(0, R_k)$
Equivalent system parameters
- Signal coefficient $A = \exp(A_a T_s)$
- Observation coefficient $H_k = H_{ak}$
- Signal noise covariance matrix $Q = E[u^n u^{nT}] = (Q_a/2) A_a^{-1} (\exp(2 A_a T_s) - I)$
- Observation noise covariance matrix $R_k = E[n_k^n n_k^{nT}] = R_{ak}/T_s$
(A short numerical sketch of this continuous-to-discrete conversion follows.)
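The conversion above can be evaluated directly in code. The following is a minimal NumPy/SciPy sketch, not the authors' implementation; the function name `discretize` and the example numbers are my own choices, and the code simply applies the three formulas on this slide.

```python
import numpy as np
from scipy.linalg import expm, inv

def discretize(A_a, Q_a, R_ak, Ts):
    """Map continuous-time parameters (A_a, Q_a, R_ak) to the equivalent
    discrete-time parameters (A, Q, R_k) for sampling period Ts."""
    n = A_a.shape[0]
    A = expm(A_a * Ts)                                              # A = exp(A_a Ts)
    Q = 0.5 * Q_a @ inv(A_a) @ (expm(2.0 * A_a * Ts) - np.eye(n))   # Q = (Q_a/2) A_a^{-1} (exp(2 A_a Ts) - I)
    R_k = R_ak / Ts                                                 # R_k = R_ak / Ts
    return A, Q, R_k

# Scalar example with arbitrary values (not the talk's simulation numbers)
A, Q, R = discretize(np.array([[-0.5]]), np.array([[0.25]]), np.array([[1.0]]), Ts=0.1)
```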

8 Centralized MAP estimation
Can do MMSE or MAP estimation here; we use MAP
Maximum a posteriori (MAP) estimator
  $\hat s_{\mathrm{MAP}}(t) = \arg\max_s P(s \mid x(t)) = \arg\max_s P(x(t) \mid s)\, P(s)$
From conditional independence of observations across sensors and time
  $P(x(t) \mid s) = P(x_1^1 \mid s^1)\, P(x_1^2 \mid s^2) \cdots P(x_K^t \mid s^t) = \prod_{n=1}^t \prod_{k=1}^K P(x_k^n \mid s^n)$
Prior pdf, using the Markov property of the signal
  $P(s) = P(s^t \mid s^{t-1}) \cdots P(s^1 \mid s^0)\, P(s^0) = P(s^0) \prod_{n=1}^t P(s^n \mid s^{n-1})$

9 Centralized MAP estimation (continued)
MAP estimate becomes
  $\hat s_{\mathrm{MAP}}(t) = \arg\max_s P(s^0) \prod_{n=1}^t \left( \prod_{k=1}^K P(x_k^n \mid s^n) \right) P(s^n \mid s^{n-1})$
Complexity grows with $t$, so introduce a time window of length $T$
  $\hat s_{\mathrm{MAP}}(t) = \arg\max_s \prod_{n=t-T+1}^t \left( \prod_{k=1}^K P(x_k^n \mid s^n) \right) P(s^n \mid s^{n-1})$
Taking the logarithm
  $\hat s_{\mathrm{MAP}}(t) = \arg\max_s f_0(s, t) = \arg\max_s \sum_{n=t-T+1}^t \left( \sum_{k=1}^K \ln P(x_k^n \mid s^n) + \ln P(s^n \mid s^{n-1}) \right)$
But estimation of the global signal $s^n$ cannot be distributed
(A small code sketch of the windowed objective follows.)
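To make the windowed objective $f_0(s, t)$ concrete, here is a plain-Python sketch of evaluating it; the callables `log_lik` and `log_trans` are placeholders for the model's pdfs, and the indexing convention (an extra leading signal entry to anchor the first transition) is my own choice rather than anything from the talk.

```python
def f0(s_window, x_window, log_lik, log_trans):
    """Evaluate the windowed log-MAP objective f_0(s, t).
    s_window: [s^{t-T}, s^{t-T+1}, ..., s^t]
    x_window: x_window[n] is the list of K observations x_k^n, indexed like
              s_window (entry 0 is unused)
    log_lik(x, s)        -> ln P(x | s)
    log_trans(s, s_prev) -> ln P(s | s_prev)"""
    total = 0.0
    for n in range(1, len(s_window)):
        total += sum(log_lik(x_k, s_window[n]) for x_k in x_window[n])  # sum_k ln P(x_k^n | s^n)
        total += log_trans(s_window[n], s_window[n - 1])                # ln P(s^n | s^{n-1})
    return total
```

A solver would then maximize `f0` over `s_window`; the two callables encode whatever observation and transition models the application specifies.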

10 Distributed implementation
Introduce local variables $\hat s(t) = \{\hat s_k(t)\}_{k=1,\ldots,K}$, one copy per sensor
  $\hat s(t) = \arg\max_s \sum_{n=t-T+1}^t \sum_{k=1}^K \left( \ln P(x_k^n \mid s_k^n) + \ln P(s_k^n \mid s_k^{n-1}) \right)$
  s.t. $s_k^n = s_l^n$ for all $l \in n_k$, for all $n = t-T+1, \ldots, t$
- The constraints can be rewritten as $C s = 0$, where $C$ is the (directed) edge-incidence matrix (a construction sketch follows below)
Dual gradient descent
- When the log-likelihood function is concave, strong duality holds: the primal optimal value $P$ equals the dual optimal value $D$
- The dual problem is separable across sensors
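As a concrete illustration of the constraint matrix, the sketch below builds a directed edge-incidence matrix and checks that $Cs = 0$ holds exactly on consensus vectors. This is an assumed construction for scalar signals at a single time step; in the talk, $s$ stacks per-node, per-time copies of a vector signal, so the actual $C$ is block-structured.

```python
import numpy as np

def edge_incidence(edges, K):
    """Directed edge-incidence matrix: row e has +1 in column k and -1 in
    column l for the directed edge (k, l), so (C s)_e = s_k - s_l."""
    C = np.zeros((len(edges), K))
    for e, (k, l) in enumerate(edges):
        C[e, k] = 1.0
        C[e, l] = -1.0
    return C

# 4-node ring example: C s = 0 exactly when all entries of s agree
C = edge_incidence([(0, 1), (1, 2), (2, 3), (3, 0)], K=4)
print(np.allclose(C @ np.full(4, 2.5), 0.0))  # True: consensus vectors lie in the null space
```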

11 Distributed implementation (continued)
Introduce the Lagrangian with Lagrange multipliers $\lambda_{kl}^n$,
  $L(s, \lambda, t) = \sum_{n=t-T+1}^t \sum_{k=1}^K \left[ \ln P(x_k^n \mid s_k^n) + \frac{1}{K} \ln P(s_k^n \mid s_k^{n-1}) + \sum_{l \in n_k} \lambda_{kl}^{nT} (s_k^n - s_l^n) \right]$
Dual function
  $g(\lambda, t) = \max_s L(s, \lambda, t)$; the dual problem minimizes $g(\lambda, t)$ over $\lambda$
The Lagrangian and the dual function change over time
Rearrange and separate into local Lagrangians $L_k(s_k, \lambda, t)$ with $L(s, \lambda, t) = \sum_{k=1}^K L_k(s_k, \lambda, t)$, where
  $L_k(s_k, \lambda, t) = \sum_{n=t-T+1}^t \left[ \ln P(x_k^n \mid s_k^n) + \frac{1}{K} \ln P(s_k^n \mid s_k^{n-1}) + s_k^{nT} \sum_{l \in n_k} (\lambda_{kl}^n - \lambda_{lk}^n) \right]$
The dual problem can now be solved using dual gradient descent

12 On-line algorithm for distributed MAP estimation
Initialize the multipliers $\lambda_{kl}^0$ as 1
Primal update, at each time $t$
  $s_k(t) = \arg\max_{s_k} \sum_{n=t-T+1}^t \left[ \ln P(x_k^n \mid s_k^n) + \frac{1}{K} \ln P(s_k^n \mid s_k^{n-1}) + s_k^{nT} \sum_{l \in n_k} \left( \lambda_{kl}^n(t) - \lambda_{lk}^n(t) \right) \right]$
Dual update, at each time $t$
  $\lambda_{kl}^n(t+1) = \lambda_{kl}^n(t) - \epsilon \left( s_k^n(t) - s_l^n(t) \right)$
This is gradient descent on the dual function, since $\left[ \nabla g(\lambda(t), t) \right]_{kl}^n = s_k^n(t) - s_l^n(t)$
(A schematic code sketch of one iteration follows.)
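The structure of one iteration can be summarized in a few lines. The sketch below is my own schematic, not the authors' code: the data layout (dictionaries keyed by directed edges) and the `local_argmax` callable that abstracts the per-node primal maximization are assumptions made purely for illustration.

```python
import numpy as np

def dmap_step(s, lam, neighbors, local_argmax, eps):
    """One D-MAP iteration at time t.
    s[k]        : node k's current window of estimates, shape (T, dim)
    lam[(k, l)] : multipliers lambda_{kl}^n for directed edge (k, l), shape (T, dim)
    neighbors[k]: list of neighbors of node k
    local_argmax(k, price): solves node k's primal update given the price term
                  price[n] = sum_{l in n_k} (lambda_{kl}^n - lambda_{lk}^n)
    eps         : dual step size (epsilon < 1/M in the analysis)."""
    # Primal update: every node maximizes its local Lagrangian
    for k in range(len(s)):
        price = sum(lam[(k, l)] - lam[(l, k)] for l in neighbors[k])
        s[k] = local_argmax(k, price)
    # Dual update: a gradient-descent step on every edge multiplier
    for (k, l) in lam:
        lam[(k, l)] = lam[(k, l)] - eps * (s[k] - s[l])
    return s, lam
```

Each node only needs the multipliers shared with its neighbors and the neighbors' current estimates, which is what makes the iteration distributed.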

13 D-MAP for a linear model
MAP estimator in the case of the linear model
  $\hat s_{\mathrm{MAP}}(t) = \arg\max_s \sum_{n=t-T+1}^t \left( - \sum_{k=1}^K (x_k^n - H_k s^n)^T R_k^{-1} (x_k^n - H_k s^n) - (s^n - A s^{n-1})^T Q^{-1} (s^n - A s^{n-1}) \right)$
Primal update becomes
  $s_k(t) = \arg\max_{s_k} \sum_{n=t-T+1}^t \left( - (x_k^n - H_k s_k^n)^T R_k^{-1} (x_k^n - H_k s_k^n) - \frac{1}{K} (s_k^n - A s_k^{n-1})^T Q^{-1} (s_k^n - A s_k^{n-1}) + s_k^{nT} \sum_{l \in n_k} \left( \lambda_{kl}^n(t) - \lambda_{lk}^n(t) \right) \right)$
Dual update as before
The primal update can be solved in closed form (see the sketch below)
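Since the local objective is quadratic in $s_k$, the maximizer solves a linear system. The sketch below is my own working-out of the stationarity condition from the reconstructed objective, specialized to a window of length $T = 1$ so a single linear solve suffices (a longer window couples the $s_k^n$ across $n$ and needs a block solve); treat it as an illustration rather than the authors' update.

```python
import numpy as np

def primal_update_linear(x_k, H_k, R_k, A, Q, s_prev, price, K):
    """Closed-form primal update for the linear Gaussian model, window T = 1.
    price = sum_{l in n_k} (lambda_kl(t) - lambda_lk(t)).
    Stationarity of the local quadratic Lagrangian in s_k gives
      (H^T R^{-1} H + Q^{-1}/K) s = H^T R^{-1} x + (Q^{-1}/K) A s_prev + price/2."""
    Rinv = np.linalg.inv(R_k)
    Qinv = np.linalg.inv(Q)
    lhs = H_k.T @ Rinv @ H_k + Qinv / K
    rhs = H_k.T @ Rinv @ x_k + (Qinv / K) @ (A @ s_prev) + price / 2.0
    return np.linalg.solve(lhs, rhs)
```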

14 Convergence to optimality
Problem: the dual problem changes with time
- Primal optima $s^*(t)$ and dual optima $\lambda^*(t)$ change with time
- Dual iterates $\lambda(t)$ approach the optimum $\lambda^*(t)$, but the optimum drifts away to $\lambda^*(t+1)$
Characterize the difference between the current estimate and the centralized MAP, $\| s_k^n(t) - \hat s_{\mathrm{MAP}}^n(t) \|$
- These quantities are random, so we can only characterize them in a probabilistic sense
- First look at the quantity $\| \lambda(t) - \Lambda^*(t) \|$, the distance to the set of optimal multipliers
- The desired relation for $\| s_k^n(t) - \hat s_{\mathrm{MAP}}^n(t) \|$ follows as a corollary

15 Assumptions
(A1) Strong convexity of the dual function in the direction of the gradient
  $g(\mu, t) \ge g(\lambda, t) + \nabla g(\lambda, t)^T (\mu - \lambda) + m \| \mu - \lambda \|_2^2$,
  where $\mu = \lambda + c\, \nabla g(\lambda, t)$ for some constant $c$
(A2) Lipschitz continuity of the gradient of the dual function
  $\| \nabla g(\mu, t) - \nabla g(\lambda, t) \| \le M \| \mu - \lambda \|$
These are (almost) the customary assumptions for gradient descent algorithms; assumption (A1) is a little weaker than standard strong convexity

16 Assumptions (continued)
(A3) The expected distance between the derivatives of the primal objectives is small,
  $E\left[ \| \nabla_s f_{0,t}(\hat s_{\mathrm{MAP}}(t)) - \nabla_s f_{0,t+1}(\hat s_{\mathrm{MAP}}(t+1)) \| \right] \le \delta(T_s)$
- We need to bound the change of the primal functions from $t$ to $t+1$
- This bounds how much the optimum drifts away in each time step
- The difference depends on the sampling time: a smaller time increment means a smaller change
- The assumption can be fulfilled for some log-likelihood functions; for the linear case, $\delta(T_s) = c\, T_s$ for some constant $c$

17 Convergence of dual variables
Theorem. Let $\lambda(t)$ denote the vector of current dual iterates and $\Lambda^*(t)$ the set of optimal multipliers. When the step size satisfies $\epsilon < 1/M$, then
  $\lim_{t \to \infty} E\left[ \| \lambda(t) - \Lambda^*(t) \| \right] \le \gamma\, \frac{1 - \epsilon m}{\epsilon m}\, \delta(T_s)$,
where $\gamma := \mu_{\max}(C^\dagger) > 0$ is the largest eigenvalue of the pseudoinverse of $C$.
- The expected distance to the optimal multipliers is small
- $\| \lambda(t) - \Lambda^*(t) \|$ behaves similarly to a supermartingale
- The term on the right-hand side depends on the sampling time and on network characteristics
- $\gamma$ is the mixing constant of the graph $G$ (a small computation sketch follows)
- $\epsilon m$ reflects the condition number of the optimization problem
- $\delta(T_s)$ bounds the change of the likelihood derivatives between time steps
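The network constant $\gamma$ can be computed directly from the edge-incidence matrix. The snippet below is an illustrative sketch only: the graph is arbitrary, and I read $\mu_{\max}(C^\dagger)$ as the largest singular value of the pseudoinverse, which is one reasonable interpretation rather than a definition from the talk.

```python
import numpy as np

# Edge-incidence matrix of a small illustrative graph (4-node ring)
C = np.array([[ 1., -1.,  0.,  0.],
              [ 0.,  1., -1.,  0.],
              [ 0.,  0.,  1., -1.],
              [-1.,  0.,  0.,  1.]])
C_pinv = np.linalg.pinv(C)                               # pseudoinverse of C
gamma = np.linalg.svd(C_pinv, compute_uv=False).max()    # largest singular value of C^+
print(gamma)  # grows as the graph becomes less well connected (slower mixing)
```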

18 Convergence of dual variables (continued)
Theorem. With the same definitions and assumptions as before, for almost all realizations of the signal process $s(t)$,
  $\liminf_{t \to \infty} \| \lambda(t) - \Lambda^*(t) \| \le \gamma\, \frac{1 - \epsilon m}{\epsilon m}\, \delta(T_s)$  a.s.
- Almost surely, the distance to the optimal multipliers becomes small
- For any realization, this happens infinitely often

19 Convergence of primal variables (assumptions)
Strong concavity of the (maximized) primal function
  $f_0(s, t) \le f_0(r, t) + \nabla f_0(r, t)^T (s - r) - \frac{l}{2} \| s - r \|^2$
Lipschitz continuity of the dual function
  $| g(\mu, t) - g(\lambda, t) | \le L \| \mu - \lambda \|$
Bounded Lagrange multipliers
  $\| \lambda \| \le \lambda_{\max}$

20 Convergence of primal variables
Corollary. Let $s(t)$ denote the current primal iterate obtained at time $t$ and let $\hat s_{\mathrm{MAP}}(t)$ be the optimal MAP estimate. With the same definitions and assumptions as before,
  $\lim_{t \to \infty} \| s(t) - \hat s_{\mathrm{MAP}}(t) \|^2 \le \Gamma_1 \left( \frac{1 - \epsilon m}{\epsilon m} \right)^2 \delta(T_s)^2 + \Gamma_2\, \frac{1 - \epsilon m}{\epsilon m}\, \delta(T_s)$  a.s.,
where $\Gamma_1 = \gamma L / l$ and $\Gamma_2 = 2 \gamma M \lambda_{\max} / l$.
- The result depends on the distance to the optimal multipliers
- The difference becomes small at a worst-case rate of $\max(\delta(T_s), \delta(T_s)^2)$

21 Simulation setup
- Sensor network with 1 signal and K = 10 sensors
- Edge set E randomly drawn with probability 0.5 per edge
- Sampling period $T_s$ = 0.1 seconds
- Window size = 2 seconds
- Simulation runs for 100 seconds
Linear system with parameters
  $\dot s_a(t) = 0.01\, s_a(t) + u_a(t)$,   $x_{ak}(t) = H_{ak} s_a(t) + n_{ak}(t)$
- $H_{ak}$ uniformly drawn between 0.5 and 1.5
- Entries of $u_a(t) \sim N(0, 0.25)$
- Entries of $n_{ak}(t) \sim N(0, 1)$
(A data-generation sketch for this setup follows.)
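As a rough sketch of how such data could be generated, the code below simulates the equivalent discrete-time scalar model using the slide-7 conversion formulas. It is not the authors' simulation code: the random seed, variable names, and the choice to simulate directly in discrete time are my own assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
K, Ts, steps = 10, 0.1, 1000                       # 10 sensors, 0.1 s sampling, 100 s run
A_a, Q_a, R_a = 0.01, 0.25, 1.0                    # continuous-time parameters from the slide
A = np.exp(A_a * Ts)                               # discrete signal coefficient
Q = (Q_a / 2) / A_a * (np.exp(2 * A_a * Ts) - 1)   # discrete driving-noise variance
R = R_a / Ts                                       # discrete observation-noise variance
H = rng.uniform(0.5, 1.5, size=K)                  # observation coefficients H_ak

adj = rng.random((K, K)) < 0.5                     # each edge drawn with probability 0.5
adj = np.triu(adj, 1); adj = adj | adj.T           # symmetric adjacency, no self-loops (D-MAP exchanges run on these edges)

s = np.zeros(steps); x = np.zeros((steps, K))
for n in range(1, steps):
    s[n] = A * s[n - 1] + rng.normal(0, np.sqrt(Q))       # signal evolution
    x[n] = H * s[n] + rng.normal(0, np.sqrt(R), size=K)   # noisy observations at each sensor
```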

22 Simulation results
Figure: D-MAP, centralized MAP, and Kalman filter mean squared error (MSE) versus estimation time $n$, for all $n \in [t-T+1, t] = [981, 1000]$.

23 Simulation results (continued)
Figure: D-MAP, centralized MAP, and local MAP mean squared error versus estimation time $t$, for the time-$t$ estimate computed at time $t$, i.e., $s_k^t(t)$, for $t \in [600, 1000]$.

24 Conclusion
- Introduced a dynamic distributed estimation problem based on a known signal and observation model
- The algorithm should approach the global estimate while using information from neighbors only
- Implemented distributed MAP (D-MAP) with dual gradient descent
- Studied convergence of the estimator for time $n$ as $t \to \infty$
- The algorithm converges to the centralized MAP estimate very quickly
- D-MAP presents a significant improvement over local MAP
