Distributed MAP probability estimation of dynamic systems with wireless sensor networks

1 Distributed MAP probability estimation of dynamic systems with wireless sensor networks
Felicia Jakubiec, Alejandro Ribeiro
Dept. of Electrical and Systems Engineering, University of Pennsylvania
Penn Seminar on Communications and Networking, Nov. 15, 2011

2 Estimation problem in sensor networks
- A sensor network consisting of several sensors
- Signal values change with time
- Every node collects a noisy observation of the true values
- Sensors within the network communicate with each other
[Figure: network of sensors x_1, ..., x_6 observing signals s_1 and s_2]

3 Examples of sensor networks
- A WSN project to monitor the health of a city
- Uses WiFi radios
- Included sensors for particle counting, temperature, pressure, wind speed, rainfall, monitoring street activity, etc.

4 Local and Global estimation
Local estimation
- Sensors estimate based on their own observations only
- Estimation accuracy can improve by using other sensors' observations
Global estimation
- Centralized solution using all information available
- A designated node or fusion center receives information from all nodes; fragile: what if the fusion center fails?
- Or, all nodes receive information from all other nodes; large communication (information exchange) cost

5 Distributed estimation
- Exchange information with neighboring nodes only
- Increases estimation accuracy with reasonable communication cost
[Figure: network of sensors x_1, ..., x_6 observing signals s_1 and s_2]
- Sensor network represented by a connected graph $G = (V, E)$
- Sensor $i$ can communicate with sensor $j$ only if $j \in n_i$
- Estimates are computed locally with information only from neighbors

6 Signal Model
Continuous-time model
- Signal $s_a(\tau)$ satisfies a differential equation ($u_a(\tau)$ = driving noise)
  $\dot s_a(\tau) = f(s_a(\tau), u_a(\tau))$
- Determines the transition probability $P(s_a(\tau + h) \mid s_a(\tau))$
- Observation model specified by the conditional pdf $P(x_{ak}(\tau) \mid s_a(\tau))$
Equivalent discrete-time model
- Sample the continuous-time signal with period $T_s$
- Discrete-time observations $x_k^n = x_{ak}(n T_s)$ and signals $s^n = s_a(n T_s)$
- Discrete-time transition $P(s^n \mid s^{n-1})$ and conditional $P(x_k^n \mid s^n)$

7 Linear autoregressive Gaussian model
LTI system with Gaussian noise, for all sensors $k$:
  $\dot s_a(t) = A_a s_a(t) + u_a(t)$,   $x_{ak}(t) = H_{ak} s_a(t) + n_{ak}(t)$
  $s^{n+1} = A s^n + u^n$,   $x_k^n = H_k s^n + n_k^n$
- Driving noise: $u_a(t) \sim N(0, Q_a)$ and $u^n \sim N(0, Q)$
- Observation noise: $n_{ak}(t) \sim N(0, R_{ak})$ and $n_k^n \sim N(0, R_k)$
Equivalent system parameters
- Signal coefficient $A = \exp(A_a T_s)$
- Observation coefficient $H_k = H_{ak}$
- Signal noise covariance matrix $Q = E[u^n u^{nT}] = (Q_a/2) A_a^{-1} (\exp(2 A_a T_s) - I)$
- Observation noise covariance matrix $R_k = E[n_k^n n_k^{nT}] = R_{ak}/T_s$
(A short numerical sketch of this continuous-to-discrete conversion follows.)
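The conversion above can be evaluated directly in code. The following is a minimal NumPy/SciPy sketch, not the authors' implementation; the function name `discretize` and the example numbers are my own choices, and the code simply applies the three formulas on this slide.

```python
import numpy as np
from scipy.linalg import expm, inv

def discretize(A_a, Q_a, R_ak, Ts):
    """Map continuous-time parameters (A_a, Q_a, R_ak) to the equivalent
    discrete-time parameters (A, Q, R_k) for sampling period Ts."""
    n = A_a.shape[0]
    A = expm(A_a * Ts)                                              # A = exp(A_a Ts)
    Q = 0.5 * Q_a @ inv(A_a) @ (expm(2.0 * A_a * Ts) - np.eye(n))   # Q = (Q_a/2) A_a^{-1} (exp(2 A_a Ts) - I)
    R_k = R_ak / Ts                                                 # R_k = R_ak / Ts
    return A, Q, R_k

# Scalar example with arbitrary values (not the talk's simulation numbers)
A, Q, R = discretize(np.array([[-0.5]]), np.array([[0.25]]), np.array([[1.0]]), Ts=0.1)
```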

8 Centralized MAP estimation
Can do MMSE or MAP estimation here; we use MAP
Maximum a posteriori (MAP) estimator
  $\hat s_{\mathrm{MAP}}(t) = \arg\max_s P(s \mid x(t)) = \arg\max_s P(x(t) \mid s)\, P(s)$
From conditional independence of observations across sensors and time
  $P(x(t) \mid s) = P(x_1^1 \mid s^1)\, P(x_1^2 \mid s^2) \cdots P(x_K^t \mid s^t) = \prod_{n=1}^t \prod_{k=1}^K P(x_k^n \mid s^n)$
Prior pdf, using the Markov property of the signal
  $P(s) = P(s^t \mid s^{t-1}) \cdots P(s^1 \mid s^0)\, P(s^0) = P(s^0) \prod_{n=1}^t P(s^n \mid s^{n-1})$

9 Centralized MAP estimation (continued)
MAP estimate becomes
  $\hat s_{\mathrm{MAP}}(t) = \arg\max_s P(s^0) \prod_{n=1}^t \left( \prod_{k=1}^K P(x_k^n \mid s^n) \right) P(s^n \mid s^{n-1})$
Complexity grows with $t$, so introduce a time window of length $T$
  $\hat s_{\mathrm{MAP}}(t) = \arg\max_s \prod_{n=t-T+1}^t \left( \prod_{k=1}^K P(x_k^n \mid s^n) \right) P(s^n \mid s^{n-1})$
Taking the logarithm
  $\hat s_{\mathrm{MAP}}(t) = \arg\max_s f_0(s, t) = \arg\max_s \sum_{n=t-T+1}^t \left( \sum_{k=1}^K \ln P(x_k^n \mid s^n) + \ln P(s^n \mid s^{n-1}) \right)$
But estimation of the global signal $s^n$ cannot be distributed
(A small code sketch of the windowed objective follows.)
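To make the windowed objective $f_0(s, t)$ concrete, here is a plain-Python sketch of evaluating it; the callables `log_lik` and `log_trans` are placeholders for the model's pdfs, and the indexing convention (an extra leading signal entry to anchor the first transition) is my own choice rather than anything from the talk.

```python
def f0(s_window, x_window, log_lik, log_trans):
    """Evaluate the windowed log-MAP objective f_0(s, t).
    s_window: [s^{t-T}, s^{t-T+1}, ..., s^t]
    x_window: x_window[n] is the list of K observations x_k^n, indexed like
              s_window (entry 0 is unused)
    log_lik(x, s)        -> ln P(x | s)
    log_trans(s, s_prev) -> ln P(s | s_prev)"""
    total = 0.0
    for n in range(1, len(s_window)):
        total += sum(log_lik(x_k, s_window[n]) for x_k in x_window[n])  # sum_k ln P(x_k^n | s^n)
        total += log_trans(s_window[n], s_window[n - 1])                # ln P(s^n | s^{n-1})
    return total
```

A solver would then maximize `f0` over `s_window`; the two callables encode whatever observation and transition models the application specifies.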

10 Distributed implementation
Introduce local variables $\hat s(t) = \{\hat s_k(t)\}_{k=1,\ldots,K}$, one copy per sensor
  $\hat s(t) = \arg\max_s \sum_{n=t-T+1}^t \sum_{k=1}^K \left( \ln P(x_k^n \mid s_k^n) + \ln P(s_k^n \mid s_k^{n-1}) \right)$
  s.t. $s_k^n = s_l^n$ for all $l \in n_k$, for all $n = t-T+1, \ldots, t$
- The constraints can be rewritten as $C s = 0$, where $C$ is the (directed) edge-incidence matrix (a construction sketch follows below)
Dual gradient descent
- When the log-likelihood function is concave, strong duality holds: the primal optimal value $P$ equals the dual optimal value $D$
- The dual problem is separable across sensors
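As a concrete illustration of the constraint matrix, the sketch below builds a directed edge-incidence matrix and checks that $Cs = 0$ holds exactly on consensus vectors. This is an assumed construction for scalar signals at a single time step; in the talk, $s$ stacks per-node, per-time copies of a vector signal, so the actual $C$ is block-structured.

```python
import numpy as np

def edge_incidence(edges, K):
    """Directed edge-incidence matrix: row e has +1 in column k and -1 in
    column l for the directed edge (k, l), so (C s)_e = s_k - s_l."""
    C = np.zeros((len(edges), K))
    for e, (k, l) in enumerate(edges):
        C[e, k] = 1.0
        C[e, l] = -1.0
    return C

# 4-node ring example: C s = 0 exactly when all entries of s agree
C = edge_incidence([(0, 1), (1, 2), (2, 3), (3, 0)], K=4)
print(np.allclose(C @ np.full(4, 2.5), 0.0))  # True: consensus vectors lie in the null space
```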

11 Distributed implementation (continued)
Introduce the Lagrangian with Lagrange multipliers $\lambda_{kl}^n$,
  $L(s, \lambda, t) = \sum_{n=t-T+1}^t \sum_{k=1}^K \left[ \ln P(x_k^n \mid s_k^n) + \frac{1}{K} \ln P(s_k^n \mid s_k^{n-1}) + \sum_{l \in n_k} \lambda_{kl}^{nT} (s_k^n - s_l^n) \right]$
Dual function
  $g(\lambda, t) = \max_s L(s, \lambda, t)$; the dual problem minimizes $g(\lambda, t)$ over $\lambda$
The Lagrangian and the dual function change over time
Rearrange and separate into local Lagrangians $L_k(s_k, \lambda, t)$ with $L(s, \lambda, t) = \sum_{k=1}^K L_k(s_k, \lambda, t)$, where
  $L_k(s_k, \lambda, t) = \sum_{n=t-T+1}^t \left[ \ln P(x_k^n \mid s_k^n) + \frac{1}{K} \ln P(s_k^n \mid s_k^{n-1}) + s_k^{nT} \sum_{l \in n_k} (\lambda_{kl}^n - \lambda_{lk}^n) \right]$
The dual problem can now be solved using dual gradient descent

12 On-line algorithm for distributed MAP estimation
Initialize the multipliers $\lambda_{kl}^0$ as 1
Primal update, at each time $t$
  $s_k(t) = \arg\max_{s_k} \sum_{n=t-T+1}^t \left[ \ln P(x_k^n \mid s_k^n) + \frac{1}{K} \ln P(s_k^n \mid s_k^{n-1}) + s_k^{nT} \sum_{l \in n_k} \left( \lambda_{kl}^n(t) - \lambda_{lk}^n(t) \right) \right]$
Dual update, at each time $t$
  $\lambda_{kl}^n(t+1) = \lambda_{kl}^n(t) - \epsilon \left( s_k^n(t) - s_l^n(t) \right)$
This is gradient descent on the dual function, since $\left[ \nabla g(\lambda(t), t) \right]_{kl}^n = s_k^n(t) - s_l^n(t)$
(A schematic code sketch of one iteration follows.)
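The structure of one iteration can be summarized in a few lines. The sketch below is my own schematic, not the authors' code: the data layout (dictionaries keyed by directed edges) and the `local_argmax` callable that abstracts the per-node primal maximization are assumptions made purely for illustration.

```python
import numpy as np

def dmap_step(s, lam, neighbors, local_argmax, eps):
    """One D-MAP iteration at time t.
    s[k]        : node k's current window of estimates, shape (T, dim)
    lam[(k, l)] : multipliers lambda_{kl}^n for directed edge (k, l), shape (T, dim)
    neighbors[k]: list of neighbors of node k
    local_argmax(k, price): solves node k's primal update given the price term
                  price[n] = sum_{l in n_k} (lambda_{kl}^n - lambda_{lk}^n)
    eps         : dual step size (epsilon < 1/M in the analysis)."""
    # Primal update: every node maximizes its local Lagrangian
    for k in range(len(s)):
        price = sum(lam[(k, l)] - lam[(l, k)] for l in neighbors[k])
        s[k] = local_argmax(k, price)
    # Dual update: a gradient-descent step on every edge multiplier
    for (k, l) in lam:
        lam[(k, l)] = lam[(k, l)] - eps * (s[k] - s[l])
    return s, lam
```

Each node only needs the multipliers shared with its neighbors and the neighbors' current estimates, which is what makes the iteration distributed.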

13 D-MAP for a linear model
MAP estimator in the case of the linear model
  $\hat s_{\mathrm{MAP}}(t) = \arg\max_s \sum_{n=t-T+1}^t \left( - \sum_{k=1}^K (x_k^n - H_k s^n)^T R_k^{-1} (x_k^n - H_k s^n) - (s^n - A s^{n-1})^T Q^{-1} (s^n - A s^{n-1}) \right)$
Primal update becomes
  $s_k(t) = \arg\max_{s_k} \sum_{n=t-T+1}^t \left( - (x_k^n - H_k s_k^n)^T R_k^{-1} (x_k^n - H_k s_k^n) - \frac{1}{K} (s_k^n - A s_k^{n-1})^T Q^{-1} (s_k^n - A s_k^{n-1}) + s_k^{nT} \sum_{l \in n_k} \left( \lambda_{kl}^n(t) - \lambda_{lk}^n(t) \right) \right)$
Dual update as before
The primal update can be solved in closed form (see the sketch below)
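Since the local objective is quadratic in $s_k$, the maximizer solves a linear system. The sketch below is my own working-out of the stationarity condition from the reconstructed objective, specialized to a window of length $T = 1$ so a single linear solve suffices (a longer window couples the $s_k^n$ across $n$ and needs a block solve); treat it as an illustration rather than the authors' update.

```python
import numpy as np

def primal_update_linear(x_k, H_k, R_k, A, Q, s_prev, price, K):
    """Closed-form primal update for the linear Gaussian model, window T = 1.
    price = sum_{l in n_k} (lambda_kl(t) - lambda_lk(t)).
    Stationarity of the local quadratic Lagrangian in s_k gives
      (H^T R^{-1} H + Q^{-1}/K) s = H^T R^{-1} x + (Q^{-1}/K) A s_prev + price/2."""
    Rinv = np.linalg.inv(R_k)
    Qinv = np.linalg.inv(Q)
    lhs = H_k.T @ Rinv @ H_k + Qinv / K
    rhs = H_k.T @ Rinv @ x_k + (Qinv / K) @ (A @ s_prev) + price / 2.0
    return np.linalg.solve(lhs, rhs)
```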

14 Convergence to optimality
Problem: the dual problem changes with time
- Primal optima $s^*(t)$ and dual optima $\lambda^*(t)$ change with time
- Dual iterates $\lambda(t)$ approach the optimum $\lambda^*(t)$, but the optimum drifts away to $\lambda^*(t+1)$
Characterize the difference between the current estimate and the centralized MAP, $\| s_k^n(t) - \hat s_{\mathrm{MAP}}^n(t) \|$
- These quantities are random, so we can only characterize them in a probabilistic sense
- First look at the quantity $\| \lambda(t) - \Lambda^*(t) \|$, the distance to the set of optimal multipliers
- The desired relation for $\| s_k^n(t) - \hat s_{\mathrm{MAP}}^n(t) \|$ follows as a corollary

15 Assumptions
(A1) Strong convexity of the dual function in the direction of the gradient
  $g(\mu, t) \ge g(\lambda, t) + \nabla g(\lambda, t)^T (\mu - \lambda) + m \| \mu - \lambda \|_2^2$,
  where $\mu = \lambda + c\, \nabla g(\lambda, t)$ for some constant $c$
(A2) Lipschitz continuity of the gradient of the dual function
  $\| \nabla g(\mu, t) - \nabla g(\lambda, t) \| \le M \| \mu - \lambda \|$
These are (almost) the customary assumptions for gradient descent algorithms; assumption (A1) is a little weaker than standard strong convexity

16 Assumptions (continued)
(A3) The expected distance between the derivatives of the primal objectives is small,
  $E\left[ \| \nabla_s f_{0,t}(\hat s_{\mathrm{MAP}}(t)) - \nabla_s f_{0,t+1}(\hat s_{\mathrm{MAP}}(t+1)) \| \right] \le \delta(T_s)$
- We need to bound the change of the primal functions from $t$ to $t+1$
- This bounds how much the optimum drifts away in each time step
- The difference depends on the sampling time: a smaller time increment means a smaller change
- The assumption can be fulfilled for some log-likelihood functions; for the linear case, $\delta(T_s) = c\, T_s$ for some constant $c$

17 Convergence of dual variables
Theorem. Let $\lambda(t)$ denote the vector of current dual iterates and $\Lambda^*(t)$ the set of optimal multipliers. When the step size satisfies $\epsilon < 1/M$, then
  $\lim_{t \to \infty} E\left[ \| \lambda(t) - \Lambda^*(t) \| \right] \le \gamma\, \frac{1 - \epsilon m}{\epsilon m}\, \delta(T_s)$,
where $\gamma := \mu_{\max}(C^\dagger) > 0$ is the largest eigenvalue of the pseudoinverse of $C$.
- The expected distance to the optimal multipliers is small
- $\| \lambda(t) - \Lambda^*(t) \|$ behaves similarly to a supermartingale
- The term on the right-hand side depends on the sampling time and on network characteristics
- $\gamma$ is the mixing constant of the graph $G$ (a small computation sketch follows)
- $\epsilon m$ reflects the condition number of the optimization problem
- $\delta(T_s)$ bounds the change of the likelihood derivatives between time steps
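The network constant $\gamma$ can be computed directly from the edge-incidence matrix. The snippet below is an illustrative sketch only: the graph is arbitrary, and I read $\mu_{\max}(C^\dagger)$ as the largest singular value of the pseudoinverse, which is one reasonable interpretation rather than a definition from the talk.

```python
import numpy as np

# Edge-incidence matrix of a small illustrative graph (4-node ring)
C = np.array([[ 1., -1.,  0.,  0.],
              [ 0.,  1., -1.,  0.],
              [ 0.,  0.,  1., -1.],
              [-1.,  0.,  0.,  1.]])
C_pinv = np.linalg.pinv(C)                               # pseudoinverse of C
gamma = np.linalg.svd(C_pinv, compute_uv=False).max()    # largest singular value of C^+
print(gamma)  # grows as the graph becomes less well connected (slower mixing)
```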

18 Convergence of dual variables (continued)
Theorem. With the same definitions and assumptions as before, for almost all realizations of the signal process $s(t)$,
  $\liminf_{t \to \infty} \| \lambda(t) - \Lambda^*(t) \| \le \gamma\, \frac{1 - \epsilon m}{\epsilon m}\, \delta(T_s)$  a.s.
- Almost surely, the distance to the optimal multipliers becomes small
- For any realization, this happens infinitely often

19 Convergence of primal variables (assumptions)
Strong concavity of the (maximized) primal function
  $f_0(s, t) \le f_0(r, t) + \nabla f_0(r, t)^T (s - r) - \frac{l}{2} \| s - r \|^2$
Lipschitz continuity of the dual function
  $| g(\mu, t) - g(\lambda, t) | \le L \| \mu - \lambda \|$
Bounded Lagrange multipliers
  $\| \lambda \| \le \lambda_{\max}$

20 Convergence of primal variables
Corollary. Let $s(t)$ denote the current primal iterate obtained at time $t$ and let $\hat s_{\mathrm{MAP}}(t)$ be the optimal MAP estimate. With the same definitions and assumptions as before,
  $\lim_{t \to \infty} \| s(t) - \hat s_{\mathrm{MAP}}(t) \|^2 \le \Gamma_1 \left( \frac{1 - \epsilon m}{\epsilon m} \right)^2 \delta(T_s)^2 + \Gamma_2\, \frac{1 - \epsilon m}{\epsilon m}\, \delta(T_s)$  a.s.,
where $\Gamma_1 = \gamma L / l$ and $\Gamma_2 = 2 \gamma M \lambda_{\max} / l$.
- The result depends on the distance to the optimal multipliers
- The difference becomes small at a worst-case rate of $\max(\delta(T_s), \delta(T_s)^2)$

21 Simulation setup
- Sensor network with 1 signal and K = 10 sensors
- Edge set E randomly drawn with probability 0.5 per edge
- Sampling period $T_s$ = 0.1 seconds
- Window size = 2 seconds
- Simulation runs for 100 seconds
Linear system with parameters
  $\dot s_a(t) = 0.01\, s_a(t) + u_a(t)$,   $x_{ak}(t) = H_{ak} s_a(t) + n_{ak}(t)$
- $H_{ak}$ uniformly drawn between 0.5 and 1.5
- Entries of $u_a(t) \sim N(0, 0.25)$
- Entries of $n_{ak}(t) \sim N(0, 1)$
(A data-generation sketch for this setup follows.)
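As a rough sketch of how such data could be generated, the code below simulates the equivalent discrete-time scalar model using the slide-7 conversion formulas. It is not the authors' simulation code: the random seed, variable names, and the choice to simulate directly in discrete time are my own assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
K, Ts, steps = 10, 0.1, 1000                       # 10 sensors, 0.1 s sampling, 100 s run
A_a, Q_a, R_a = 0.01, 0.25, 1.0                    # continuous-time parameters from the slide
A = np.exp(A_a * Ts)                               # discrete signal coefficient
Q = (Q_a / 2) / A_a * (np.exp(2 * A_a * Ts) - 1)   # discrete driving-noise variance
R = R_a / Ts                                       # discrete observation-noise variance
H = rng.uniform(0.5, 1.5, size=K)                  # observation coefficients H_ak

adj = rng.random((K, K)) < 0.5                     # each edge drawn with probability 0.5
adj = np.triu(adj, 1); adj = adj | adj.T           # symmetric adjacency, no self-loops (D-MAP exchanges run on these edges)

s = np.zeros(steps); x = np.zeros((steps, K))
for n in range(1, steps):
    s[n] = A * s[n - 1] + rng.normal(0, np.sqrt(Q))       # signal evolution
    x[n] = H * s[n] + rng.normal(0, np.sqrt(R), size=K)   # noisy observations at each sensor
```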

22 Simulation results
Figure: D-MAP, centralized MAP, and Kalman filter mean squared error (MSE) versus estimation time $n$, for all $n \in [t-T+1, t] = [981, 1000]$.

23 Simulation results (continued)
Figure: D-MAP, centralized MAP, and local MAP mean squared error versus estimation time $t$, for the time-$t$ estimate computed at time $t$, i.e., $s_k^t(t)$, for $t \in [600, 1000]$.

24 Conclusion
- Introduced a dynamic distributed estimation problem based on a known signal and observation model
- The algorithm should approach the global estimate while using information from neighbors only
- Implemented distributed MAP (D-MAP) with dual gradient descent
- Studied convergence of the estimator for time $n$ as $t \to \infty$
- The algorithm converges to the centralized MAP estimate very quickly
- D-MAP presents a significant improvement over local MAP
