Sufficient Statistics in Decentralized Decision-Making Problems
Ashutosh Nayyar, University of Southern California, Feb 8, 2015
Decentralized Systems
- Transportation networks
- Communication networks
- Networked control systems
- Sensor networks
- Energy systems
- Social networks
- Markets
- Organizations
- Supply chain systems
Decentralized Decision Problems
I. Static/one-stage: one-shot decisions.
II. Dynamic/multi-stage: dynamic system; decisions made over time.
A. Cooperative/team problem: decision makers (DMs) have a common goal.
B. Non-cooperative/game formulation: DMs have different goals.
In this talk, we focus on cooperative, multi-stage decentralized decision problems: the decentralized stochastic control problem.
Stochastic Control for Decentralized Systems
Key features of a decentralized stochastic control problem:
- Uncertainty: in the system evolution and in the DMs' information.
- Information asymmetry: different decision makers (DMs) have different information.
- Signaling: decisions of one DM affect the information of other DMs.
- Information growth: DMs accumulate information over time.
Key question (sufficient statistics): Can the ever-growing information history available to the DMs be aggregated without compromising performance? In other words, are there sufficient statistics for the controllers?
Overview
1. Centralized results: POMDP, LQG
2. Partial history sharing model of decentralized control
3. Person-by-person method
4. Common information method
5. Common information method for LQG
Centralized Stochastic Control: Model
Partially Observable Markov Decision Problems (POMDPs)
System dynamics: X_{t+1} = f_t(X_t, U_t, W_t), t = 0, 1, ..., T.   (1)
Observation model: Y_t = h_t(X_t, V_t), t = 0, 1, ..., T.   (2)
X = state, U = action/decision, W, V = noise.
X_0, {W_t, V_t, t >= 0} are mutually independent random variables with known distributions.
Cost: sum_{t=0}^{T} l_t(X_t, U_t).
One decision maker (DM) with perfect recall. DM's information at t: I_t := {Y_{0:t}, U_{0:t-1}}.
Centralized Stochastic Control: Strategy Optimization
Control action/decision: U_t = g_t(I_t) = g_t(Y_{0:t}, U_{0:t-1}), t = 0, 1, ..., T.
g_t is the decision rule/control law at t; g := (g_0, g_1, ..., g_T) is the decision/control strategy.
Expected cost of strategy g: J_Cent(g) := E^g [ sum_{t=0}^{T} l_t(X_t, U_t) ].
Optimization: find a g that minimizes J_Cent(g).
Centralized Stochastic Control: Sufficient Statistics
DM's posterior belief on the state at time t: Π_t(x_t) = P^g(x_t | Y_{0:t}, U_{0:t-1}), x_t in X.
Strategy independence of the belief: given the DM's information, the belief at time t does not depend on the strategy!
Sufficient statistic: the belief is a sufficient statistic; optimal decision rules have the form U_t = g_t(Π_t).
LQG result: when the dynamics and observations are linear, the cost quadratic, and the noises Gaussian, optimal decision rules have the form U_t = g_t(Z_t), where Z_t = E^g(X_t | Y_{0:t}, U_{0:t-1}).
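The belief update behind this sufficient statistic is the standard Bayes filter. A minimal sketch for a finite-state POMDP (the transition and observation matrices below are hypothetical, not from the talk): note that the update uses only the realized action and observation, never the decision rule g, which is exactly the strategy-independence property.

```python
import numpy as np

def belief_update(pi, u, y, P, O):
    """One step of the POMDP belief filter: Pi_{t+1} from Pi_t.
    pi : current belief over states, shape (n,)
    u  : index of the action taken
    y  : index of the observation received
    P  : P[u][x, x'] = transition probability x -> x' under action u
    O  : O[x', y]    = probability of observing y in state x'
    """
    pred = pi @ P[u]           # predict through the dynamics
    post = pred * O[:, y]      # reweight by the observation likelihood
    return post / post.sum()   # normalize (Bayes rule)

# Tiny 2-state example with a single action (hypothetical numbers)
P = [np.array([[0.9, 0.1], [0.2, 0.8]])]
O = np.array([[0.8, 0.2], [0.3, 0.7]])
pi1 = belief_update(np.array([0.5, 0.5]), 0, 0, P, O)
```

The strategy g only determines which (u, y) pairs get realized; given them, every strategy produces the same Π_t.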
Decentralized Stochastic Control: Model
Partial history sharing model with N DMs: DM 1, DM 2, ..., DM N.
System dynamics: X_{t+1} = f_t(X_t, U^1_t, U^2_t, ..., U^N_t, W_t), t = 0, 1, ..., T.
U^i_t is DM i's action at t, i = 1, 2, ..., N.
Cost: sum_{t=0}^{T} l_t(X_t, U^1_t, ..., U^N_t).
Information Structure: Data Available at DM i
- Local observation: Y^i_t = h^i_t(X_t, V^i_t).
- Local memory: a subset of past local observations and actions, M^i_t ⊆ {Y^i_{0:t-1}, U^i_{0:t-1}}.
- Shared memory: a subset of past observations and actions of all controllers, C_t ⊆ {Y^1_{0:t-1}, ..., Y^N_{0:t-1}, U^1_{0:t-1}, ..., U^N_{0:t-1}}.
Model: Optimization of Strategies
DM i's control action at time t is a function of its local observation Y^i_t, local memory M^i_t, and shared memory C_t:
U^i_t = g^i_t(Y^i_t, M^i_t, C_t)
g^i_t is DM i's decision rule/control law at time t; g^i := (g^i_0, g^i_1, ..., g^i_T) is DM i's decision/control strategy; g := (g^1, ..., g^N) is the strategy profile of the system.
Expected cost of strategy profile g: J^g_T := E^g [ sum_{t=0}^{T} l_t(X_t, U^1_t, U^2_t, ..., U^N_t) ].
Model: Memory Update Assumptions
Assumption 1: The shared memory is non-decreasing (analogous to the perfect recall assumption in the centralized case):
C_{t+1} = {C_t, Z^1_t, Z^2_t, ..., Z^N_t}
where Z^i_t is DM i's contribution to the shared memory at time t.
Assumption 2: The increment in shared memory from t to t+1 and the local memory at t+1 are fixed functions of the current local memories, observations and control actions, i.e., F^i_t and G^i_t are fixed:
Z^i_t = F^i_t(M^i_t, Y^i_t, U^i_t)
M^i_{t+1} = G^i_t(M^i_t, Y^i_t, U^i_t)
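A concrete instance of Assumption 2 can be sketched in a few lines. Here, as a hypothetical choice, each DM shares its current observation (F returns Y^i_t) and keeps only its last action in local memory (G returns U^i_t); the names F, G and the toy data are illustrative, not part of the model.

```python
def F(m, y, u):
    """Z^i_t: the increment DM i adds to the shared memory (here: its observation)."""
    return y

def G(m, y, u):
    """M^i_{t+1}: DM i's next local memory (here: its last action)."""
    return u

# one time step of the memory updates for two DMs
C = []                              # shared memory C_t
M = {1: None, 2: None}              # local memories M^i_t
Y = {1: "y1_t", 2: "y2_t"}          # current observations
U = {1: "u1_t", 2: "u2_t"}          # current actions
C = C + [F(M[i], Y[i], U[i]) for i in (1, 2)]   # C_{t+1} = {C_t, Z^1_t, Z^2_t}
M = {i: G(M[i], Y[i], U[i]) for i in (1, 2)}    # M^i_{t+1} = G^i_t(M^i_t, Y^i_t, U^i_t)
```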
Example: Delayed Sharing Information Structure
Communication links between the controllers have delay d: Z^i_t = {Y^i_{t-d}, U^i_{t-d}}.
System: X_{t+1} = f_t(X_t, U^1_t, U^2_t, W^0_t). DM 1 holds (M^1_t, C_t) and acts on Y^1_t; DM 2 holds (M^2_t, C_t) and acts on Y^2_t; each contributes Z^1_t, Z^2_t to the shared memory.
C_t = {Y^1_{0:t-d}, U^1_{0:t-d}, Y^2_{0:t-d}, U^2_{0:t-d}}
M^i_t = {Y^i_{t-d+1:t-1}, U^i_{t-d+1:t-1}}
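The split of DM i's history into shared and private parts under a sharing delay is mechanical. A sketch (list indices play the role of time; the function name is ours):

```python
def delayed_sharing_split(Yi, Ui, t, d):
    """Split DM i's history at time t under sharing delay d.
    Yi[k], Ui[k] are DM i's observation/action at time k (0 <= k <= t).
    Returns the part already in C_t (everything up to t-d) and the part
    still private to DM i (observations through t, actions through t-1)."""
    s = max(t - d + 1, 0)                # first time index not yet shared
    shared = (Yi[:s], Ui[:s])            # Y^i_{0:t-d}, U^i_{0:t-d}
    private = (Yi[s:t + 1], Ui[s:t])     # Y^i_{t-d+1:t}, U^i_{t-d+1:t-1}
    return shared, private

shared, private = delayed_sharing_split(list("abcdef"), list("ABCDEF"), t=5, d=2)
```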
Model: Memory Update
[Figure: time ordering of observations, actions and memory updates.] Within each step, DM i observes Y^i_t, acts U^i_t, contributes Z^i_t to the shared memory, and updates its local memory to M^i_{t+1}; the shared memory becomes C_{t+1} = {C_t, Z^1_t, Z^2_t}.
Special Instances of the Model
Decentralized control:
- Delayed sharing information structure (Witsenhausen 1971)
- Delayed state sharing information structure (Aicardi et al. 1987)
- Periodic sharing information structure (Ooi et al. 1997)
- Control sharing information structure (Mahajan 2013)
- Broadcast information structure (Wu-Lall 2010)
Communications:
- Real-time encoding (Witsenhausen 1978, Tatikonda-Mitter 2004)
- Real-time encoding-decoding with noiseless feedback (Walrand-Varaiya 1983)
- Paging and registration in cellular systems (Hajek et al. 2008)
Sensor networks:
- Sequential problems in decentralized detection (Teneketzis-Varaiya 1984, Tsitsiklis 1986, 1993, Teneketzis-Ho 1987, Veeravalli et al. 1993)
- Communication and estimation in remote sensing (Imer-Basar 2010, Lipsa-Martins 2011, Nayyar et al. 2013)
Sufficient Statistics for the Decentralized Problem
First approach: person-by-person optimization, a commonly used method for decentralized problems.
Main idea:
1. Fix the strategies of all DMs except DM 1 to arbitrary choices.
2. Focus on the resulting centralized stochastic control problem of DM 1.
Lemma 1. Suppose that DM 1's centralized problem has a sufficient statistic S^1_t = function(Y^1_t, M^1_t, C_t); that is, there is an optimal strategy of DM 1 of the form U^1_t = g^1_t(S^1_t), t = 0, 1, ..., T, and we get the same statistic irrespective of the choice of the other DMs' strategies. Then S^1_t is a sufficient statistic for DM 1 in the original decentralized problem.
Sufficient Statistics for the Decentralized Problem: Person-by-Person Optimization
A useful method to remove redundant information in many cases:
- Real-time communication problems (Mahajan and Teneketzis 2010; Kaspi and Merhav 2010)
- Decentralized detection (Tenney and Sandell 1981; Veeravalli, Basar and Poor 1993)
- Decentralized LQG control (Lessard and Nayyar 2013; Wu and Lall 2010)
Repeated application can lead to person-by-person optimal solutions (Nash equilibria) in some cases (e.g., static LQG teams).
Limitation: it does not always yield a useful statistic; sometimes the only S_t satisfying the lemma is the entire information itself. Example: decentralized control with communication delay between the controllers.
In Search of a New Approach...
What are the main roadblocks in decentralized problems?
If all DMs have identical information (centralized problem):
- DMs can form identical posterior beliefs on the state.
- Given the strategies (g^1, ..., g^N), each DM can exactly predict the other DMs' actions at the current time.
DMs with different information (our model: partial history sharing):
- DMs have non-identical beliefs on the state.
- Even with fixed strategies (g^1, ..., g^N), a DM cannot exactly predict the other DMs' actions.
In Search of a New Approach: Key Ideas for Our Solution Methodology
1. Common information
- Sharing data among DMs creates common information {C_t}.
- Beliefs based on common information are consistent among DMs.
2. Partial decision rules
- Given fixed strategies (g^1, g^2), DM 1 cannot know U^2_t.
- But for a given realization c_t of the common information, DM 1 knows exactly the mapping from (Y^2_t, M^2_t) to U^2_t:
U^2_t = g^2_t(Y^2_t, M^2_t, c_t),   g^2_t(·, ·, c_t) = γ^2_t(·, ·)
γ^2_t is the partial decision rule for the given realization c_t of the common information.
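In code terms, a partial decision rule is just a partial application of g^2_t at the realized common information; the toy rule below is hypothetical.

```python
from functools import partial

def g2_t(y, m, c):
    """A toy decision rule for DM 2 (hypothetical): act iff y + m exceeds c."""
    return int(y + m > c)

c_t = 3                            # realized common information
gamma2_t = partial(g2_t, c=c_t)    # gamma^2_t(., .) = g^2_t(., ., c_t)

# DM 1 knows c_t, so it knows this entire mapping,
# even without knowing the realization of (Y^2_t, M^2_t)
u2_if = {(y, m): gamma2_t(y, m) for y in (0, 1, 2) for m in (0, 1, 2)}
```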
Common Information Methodology
Step 1: Introduce a problem with a fictitious "coordinator".
- The coordinator's beliefs are based on the common information.
- The coordinator selects partial decision rules (prescriptions).
Step 2: Establish equivalence between the original problem and the problem with the coordinator.
Step 3: Establish sufficient statistics for the problem with the coordinator.
Step 4: Use the equivalence (Step 2) to find sufficient statistics for the original problem.
Common Information Methodology, Step 1: The Fictitious Coordinator
System: X_{t+1} = f_t(X_t, U^1_t, U^2_t, W^0_t). DM 1 (with local memory M^1_t) and DM 2 (with M^2_t) act on Y^1_t, Y^2_t and contribute Z^1_t, Z^2_t; the coordinator holds C_t, with C_{t+1} = {C_t, Z^1_t, Z^2_t}. The fictitious coordinator has perfect recall.
The coordinator selects "prescriptions" Γ^1_t, Γ^2_t:
Γ^i_t = ψ^i_t(C_t), i = 1, 2
ψ_t = (ψ^1_t, ψ^2_t) is the coordination strategy at t.
Common Information Methodology, Step 1: The Fictitious Coordinator
Prescriptions Γ^1_t, Γ^2_t instruct DM 1 and DM 2 how to use their private information: if the private information at controller i is (y^i_t, m^i_t), it takes the action Γ^i_t(y^i_t, m^i_t). That is,
U^i_t = Γ^i_t(Y^i_t, M^i_t).
Common Information Methodology, Step 1: The Coordinator's Problem P_CD
Choose ψ := (ψ_1, ψ_2, ..., ψ_T), ψ_t = (ψ^1_t, ψ^2_t), to minimize
J^ψ_{CD,T} := E^ψ { sum_{t=0}^{T} l_t(X_t, U^1_t, U^2_t) }
where Γ^i_t = ψ^i_t(C_t) and U^i_t = Γ^i_t(Y^i_t, M^i_t).
Common Information Methodology, Step 2: Equivalence of Problems P_D and P_CD
Problem P_D: each DM i acts as U^i_t = g^i_t(Y^i_t, M^i_t, C_t), with strategy profile g = (g^1, g^2) and cost J^g_T = E^g [ sum_{t=0}^{T} l_t(X_t, U^1_t, U^2_t) ].
Problem P_CD: the coordinator with strategy ψ = (ψ^1, ψ^2) selects prescriptions, each DM i acts as U^i_t = Γ^i_t(Y^i_t, M^i_t), and the cost is J^ψ_{CD,T} = E^ψ [ sum_{t=0}^{T} l_t(X_t, U^1_t, U^2_t) ].
Common Information Methodology, Step 2
Lemma 2. Consider any choice of the coordinator's policy ψ = (ψ_1, ψ_2, ..., ψ_T), ψ_t = (ψ^1_t, ψ^2_t). Define g = (g^1, g^2) by
g^i_t(·, ·, C_t) = ψ^i_t(C_t), i = 1, 2, t = 0, 1, ..., T.
Then J^g_T = J^ψ_{CD,T}.
Common Information Methodology, Step 2
Lemma 3. Consider any choice of control strategy g = (g^1, g^2). Define ψ = (ψ^1, ψ^2) by
ψ^i_t(C_t) = g^i_t(·, ·, C_t), i = 1, 2, t = 0, 1, ..., T.
Then J^ψ_{CD,T} = J^g_T.
Common Information Methodology, Step 3: The Coordinator's Problem P_CD
Think of the original system and the DMs together as the coordinator's "environment":
- The coordinator's observations at t are Z^1_t, Z^2_t.
- The coordinator's decisions/actions at t are Γ^1_t, Γ^2_t.
What is the "state" that describes the coordinator's environment? From the coordinator's perspective, the state is
S_t = (X_t, M^1_t, M^2_t, Y^1_t, Y^2_t).
Common Information Methodology, Step 3
There exist functions f̂_t, ĥ^1_t, ĥ^2_t such that
State dynamics: S_{t+1} = f̂_t(S_t, Γ^1_t, Γ^2_t, noise variables)
Observation equations: Z^i_t = ĥ^i_t(S_t, Γ^i_t), i = 1, 2.
The state satisfies a controlled Markov property:
P(S_{t+1}, Z^1_t, Z^2_t | S_{0:t}, Z^1_{0:t-1}, Z^2_{0:t-1}, Γ^1_{0:t}, Γ^2_{0:t}) = P(S_{t+1}, Z^1_t, Z^2_t | S_t, Γ^1_t, Γ^2_t).
There exist functions l̂_t such that for all t = 0, 1, ..., T,
l_t(X_t, U^1_t, U^2_t) = l̂_t(S_t, Γ^1_t, Γ^2_t).
Common Information Methodology, Step 3
The coordinator's problem P_CD can be rewritten as
min_ψ J^ψ_{CD,T} := E^ψ { sum_{t=0}^{T} l̂_t(S_t, Γ^1_t, Γ^2_t) }
subject to
S_{t+1} = f̂_t(S_t, Γ^1_t, Γ^2_t, noise variables)
Z^i_t = ĥ^i_t(S_t, Γ^i_t), i = 1, 2
Γ^i_t = ψ^i_t(C_t, Γ^1_{0:t-1}, Γ^2_{0:t-1}), i = 1, 2.
This is a POMDP (centralized stochastic control)!
Common Information Methodology, Step 3
Centralized problem: sufficient statistic Π_t = P(X_t | Y_{0:t}, U_{0:t-1}); optimal control strategy of the form U_t = g_t(Π_t); sequential decomposition: a dynamic program in terms of Π_t.
Coordinator's problem: sufficient statistic Π^CD_t = P(S_t | C_t, Γ^1_{0:t-1}, Γ^2_{0:t-1}); optimal strategy of the form Γ^1_t = ψ^1_t(Π^CD_t), Γ^2_t = ψ^2_t(Π^CD_t); sequential decomposition: a dynamic program in terms of Π^CD_t.
Common Information Methodology, Step 3: A Dynamic Program for the Coordinator
V_T(π) = inf_{γ^1_T, γ^2_T} E{ l_T(X_T, γ^1_T(Y^1_T, M^1_T), γ^2_T(Y^2_T, M^2_T)) | Π^CD_T = π }
V_t(π) = inf_{γ^1_t, γ^2_t} E{ l_t(X_t, γ^1_t(Y^1_t, M^1_t), γ^2_t(Y^2_t, M^2_t)) + V_{t+1}(η̂_t(π, γ^1_t, γ^2_t, Z^1_t, Z^2_t)) | Π^CD_t = π }
for t = 0, 1, ..., T − 1, where η̂_t is the coordinator's belief update function.
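To make the coordinator's choice of prescriptions concrete, here is a minimal one-stage team of our own (a hypothetical toy, not the model above): X is a fair coin, each DM sees an independent noisy copy of X, and the cost counts how many DMs mismatch X. The coordinator enumerates all prescription pairs (maps from private observation to action) and picks the pair with least expected cost.

```python
import itertools

# X ~ uniform{0,1}; DM i observes Y^i, which equals X with probability p,
# independently across DMs. Cost = number of DMs whose action differs from X.
p = 0.8

def expected_cost(gamma1, gamma2):
    """Expected team cost when DM i uses prescription gamma_i: y -> action."""
    total = 0.0
    for x in (0, 1):
        for y1 in (0, 1):
            for y2 in (0, 1):
                prob = 0.5 * (p if y1 == x else 1 - p) * (p if y2 == x else 1 - p)
                total += prob * ((gamma1[y1] != x) + (gamma2[y2] != x))
    return total

# The coordinator enumerates all prescription pairs; a prescription is
# represented as the tuple (gamma(0), gamma(1)).
prescriptions = list(itertools.product((0, 1), repeat=2))
best = min(itertools.product(prescriptions, prescriptions),
           key=lambda gs: expected_cost(*gs))
# Here each DM simply reporting its observation, gamma(y) = y, is optimal.
```

In the dynamic model, the coordinator would make such a choice at every time t as a function of its belief Π^CD_t; this static example shows only the "select a pair of partial decision rules" step.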
Common Information Methodology, Step 4: From the Coordinator to the Original Problem
The optimal policy for the coordinator's problem P_CD is of the form
Γ^1_t = ψ^1_t(Π^CD_t), Γ^2_t = ψ^2_t(Π^CD_t), where Π^CD_t = P(X_t, M^1_t, M^2_t, Y^1_t, Y^2_t | C_t, Γ^1_{0:t-1}, Γ^2_{0:t-1}).
Theorem (Nayyar, Mahajan, Teneketzis). For the decentralized stochastic control problem with partial history sharing information structure, there exist optimal policies of the DMs of the form
U^1_t = g^1_t(Y^1_t, M^1_t, Π_t)
U^2_t = g^2_t(Y^2_t, M^2_t, Π_t)
Π_t = P(X_t, M^1_t, M^2_t, Y^1_t, Y^2_t | C_t)
for t = 0, 1, ..., T.
Solution to Witsenhausen's Conjecture (1971)
n-step delayed sharing information structure (Witsenhausen 1971):
C_t = {Y^1_{0:t-n}, Y^2_{0:t-n}, U^1_{0:t-n}, U^2_{0:t-n}}
M^i_t = {Y^i_{t-n+1:t-1}, U^i_{t-n+1:t-1}}, i = 1, 2
P^i_t := {Y^i_t, M^i_t} = {Y^i_{t-n+1:t}, U^i_{t-n+1:t-1}}, i = 1, 2
Theorem 4 (Nayyar, Mahajan, Teneketzis, 2011). For the n-step delayed sharing problem there exist optimal policies of the DMs of the form
U^1_t = g^1_t(P^1_t, P(X_t, P^1_t, P^2_t | C_t))
U^2_t = g^2_t(P^2_t, P(X_t, P^1_t, P^2_t | C_t)).
Solution to Witsenhausen's Conjecture (1971): Comparison with the Conjecture
Witsenhausen (1971) conjectured: U^i_t = g^i_t(P^i_t, P(X_{t-n+1} | C_t)), i = 1, 2.
This is only true for one-step delay, n = 1 (Varaiya-Walrand, 1978).
Nayyar, Mahajan, Teneketzis (2011): U^i_t = g^i_t(P^i_t, P(X_t, P^1_t, P^2_t | C_t)), i = 1, 2.
The two results are equivalent only when n = 1.
Summary So Far
Main idea:
- DMs can use "common information" to coordinate how they use their "private information".
- Beliefs based on common information are consistent among all DMs.
- Instead of selecting decisions, the (fictitious) coordinator selects partial decision rules.
Sufficient statistics:
- Common-information-based beliefs on the system state and local information serve as key components of the sufficient statistics.
- The sufficient statistics found by the common information approach cannot be identified by the person-by-person approach.
Common Information Methodology: Special Cases
When C_t = {Y^1_{0:t-1}, Y^2_{0:t-1}, U^1_{0:t-1}, U^2_{0:t-1}} and M^1_t = M^2_t = ∅, the coordinator's methodology is the same as classical dynamic programming.
When C_t = ∅, M^1_t ⊆ {Y^1_{0:t-1}, U^1_{0:t-1}}, and M^2_t ⊆ {Y^2_{0:t-1}, U^2_{0:t-1}}, the coordinator's methodology is the same as the "designer's approach" (Witsenhausen 1973, Mahajan-Teneketzis 2009): sequential optimization of the decision rules as an open-loop control problem.
Decentralized LQG Problems: Linear Partial History Sharing Model
System: X_{t+1} = f_t(X_t, U^1_t, U^2_t, W^0_t); DM i holds (M^i_t, C_t) and acts on Y^i_t.
The dynamics and observation equations are linear, the cost is quadratic, and the noise is Gaussian. Z^i_t and M^i_{t+1} are linear functions of M^i_t, Y^i_t, U^i_t.
In a general decentralized control problem:
- Linear control strategies may not be optimal (example: Witsenhausen, 1968).
- Even if we restrict to linear strategies, we may not have finite-dimensional sufficient statistics (example: Whittle and Rudge, 1974).
39 / 46 Decentralized LQG Problems contd.
Linear partial history sharing model
[Block diagram as on the previous slide]
Restrict to linear strategies: the control action is a superposition of 3 components:
U_t^i = G_t^i Y_t^i + H_t^i M_t^i + K_t^i C_t
39 / 46 Decentralized LQG Problems contd.
Linear partial history sharing model
[Block diagram as on the previous slide]
Restrict to linear strategies: the control action is a superposition of 3 components:
U_t^i = G_t^i Y_t^i + H_t^i M_t^i + K_t^i C_t
Focus on the case when M_t^i is finite dimensional, perhaps after employing person-by-person optimality arguments.
39 / 46 Decentralized LQG Problems contd.
Linear partial history sharing model
[Block diagram as on the previous slide]
Restrict to linear strategies: the control action is a superposition of 3 components:
U_t^i = G_t^i Y_t^i + H_t^i M_t^i + K_t^i C_t
Focus on the case when M_t^i is finite dimensional, perhaps after employing person-by-person optimality arguments.
C_t grows over time: C_t ⊆ C_{t+1}. A finite-dimensional sufficient statistic must compress the common information.
40 / 46 Modified Common Information Approach
Control action is a superposition: U_t^i = G_t^i Y_t^i + H_t^i M_t^i + K_t^i C_t.
40 / 46 Modified Common Information Approach
Control action is a superposition: U_t^i = G_t^i Y_t^i + H_t^i M_t^i + K_t^i C_t.
Fix the part of the control strategy that uses local information, i.e., G_t^i and H_t^i are fixed.
Optimize the part of the control action that depends on the common information: Γ_t^i = K_t^i C_t.
40 / 46 Modified Common Information Approach
Control action is a superposition: U_t^i = G_t^i Y_t^i + H_t^i M_t^i + K_t^i C_t.
Fix the part of the control strategy that uses local information, i.e., G_t^i and H_t^i are fixed.
Optimize the part of the control action that depends on the common information: Γ_t^i = K_t^i C_t.
We introduce a coordinator that selects the optimal Γ_t^i based on the common information C_t.
41 / 46 Modified Common Information Approach contd.
[Block diagram: a coordinator observes C_t and sends Γ_t^1, Γ_t^2 to DM 1 and DM 2; the system evolves as X_{t+1} = f_t(X_t, U_t^1, U_t^2, W_t^0), and the DMs retain memories M_t^1, M_t^2 and share Z_t^1, Z_t^2]
The coordinator's problem is a centralized LQG problem with S_t = (X_t, M_t^1, M_t^2, Y_t^1, Y_t^2) as the new state.
41 / 46 Modified Common Information Approach contd.
[Block diagram: a coordinator observes C_t and sends Γ_t^1, Γ_t^2 to DM 1 and DM 2; the system evolves as X_{t+1} = f_t(X_t, U_t^1, U_t^2, W_t^0), and the DMs retain memories M_t^1, M_t^2 and share Z_t^1, Z_t^2]
The coordinator's problem is a centralized LQG problem with S_t = (X_t, M_t^1, M_t^2, Y_t^1, Y_t^2) as the new state.
Centralized sufficient statistic: E[S_t | C_t, Γ_{1:t-1}^{1,2}] is the coordinator's sufficient statistic.
42 / 46 Decentralized LQG Problem
Linear partial history sharing model
Theorem. There exists an optimal linear strategy of the form U_t^i = G_t^i Y_t^i + H_t^i M_t^i + K_t^i Ŝ_t, where Ŝ_t is the common-information-based estimate of (X_t, M_t^1, M_t^2, Y_t^1, Y_t^2). For given G^i, H^i matrices, Ŝ_t has a linear recursive update equation similar to a Kalman estimator.
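The kind of "linear recursive update similar to a Kalman estimator" referred to in the theorem can be sketched generically. The scalar parameters A, C, Q, R below are placeholders for illustration; the coordinator's actual update matrices depend on the fixed G^i, H^i and the model.

```python
# Generic Kalman-style recursive estimator step: the estimate is
# propagated through fixed linear dynamics and corrected by the newly
# arrived observation. All parameters here are illustrative scalars.
def kalman_step(xhat, P, z, A=0.9, C=1.0, Q=0.1, R=0.2):
    # Predict through the fixed linear dynamics.
    x_pred = A * xhat
    P_pred = A * P * A + Q
    # Correct with the new observation z (newly shared common information).
    K = P_pred * C / (C * P_pred * C + R)
    xhat_new = x_pred + K * (z - C * x_pred)
    P_new = (1.0 - K * C) * P_pred
    return xhat_new, P_new

# Running the recursion on a few shared observations:
xhat, P = 0.0, 1.0
for z in [0.5, 0.4, 0.45]:
    xhat, P = kalman_step(xhat, P, z)
```

The key structural point matches the theorem: the estimate is a fixed-dimension statistic updated linearly at each step, so the growing common information never has to be stored.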
43 / 46 Application to Graph Problems
A graph describes interconnections as well as communication links among subsystems.
Figure: Directed acyclic graph (DAG): an edge i → j means that subsystem i affects subsystem j through its dynamics and that controller i shares its information with controller j.
Each node represents a linear subsystem with a local controller. LQG problem setup.
44 / 46 Application to Graph Problems contd.
Iterated Application of Common Information Approach
Step 1: First find sufficient statistics for the leaf nodes (nodes 4 and 5) using a person-by-person approach.
44 / 46 Application to Graph Problems contd.
Iterated Application of Common Information Approach
Step 1: First find sufficient statistics for the leaf nodes (nodes 4 and 5) using a person-by-person approach.
Step 2: Use the common information of nodes 3 and 5 to refine the statistics.
44 / 46 Application to Graph Problems contd.
Iterated Application of Common Information Approach
Step 1: First find sufficient statistics for the leaf nodes (nodes 4 and 5) using a person-by-person approach.
Step 2: Use the common information of nodes 3 and 5 to refine the statistics.
Step 3: Use the common information of nodes 2, 3, 4 and 5 to further refine the statistics.
44 / 46 Application to Graph Problems contd.
Iterated Application of Common Information Approach
Step 1: First find sufficient statistics for the leaf nodes (nodes 4 and 5) using a person-by-person approach.
Step 2: Use the common information of nodes 3 and 5 to refine the statistics.
Step 3: Use the common information of nodes 2, 3, 4 and 5 to further refine the statistics.
Continue applying the common information approach on bigger and bigger subgraphs.
45 / 46 Application to Graph Problems contd.
Iterated Application of Common Information Approach
Theorem. There is an optimal linear strategy of the form u_t^i = Σ_{j ∈ ancestors(i)} K_t^{ij} z_t^j, where z_t^j is node j's estimate of its ancestors' and descendants' states.
For example, for node 3: u_t^3 = K_t^{31} z_t^1 + K_t^{32} z_t^2 + K_t^{33} z_t^3, where
z_t^1 is node 1's estimate of the states of nodes 1, 3, 5
z_t^2 is node 2's estimate of the states of nodes 2, 3, 4, 5
z_t^3 is node 3's estimate of the states of nodes 1, 2, 3, 4, 5
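The structure of this control law can be sketched in code. The 5-node DAG below (edges 1→3, 2→3, 2→4, 3→5) is an assumed reconstruction of the talk's figure, and the gains K and estimates z are illustrative placeholders, not computed from any model.

```python
# Control law u_t^i = sum over j in ancestors(i) of K_t^{ij} z_t^j,
# where "ancestors" here includes node i itself (matching the K^{33} z^3
# term in the node-3 example). The DAG is an assumed reconstruction of
# the figure; gains and estimates are placeholders.
EDGES = {1: [3], 2: [3, 4], 3: [5], 4: [], 5: []}

def ancestors(i):
    """All nodes with a directed path to i, plus i itself."""
    anc = {i}
    changed = True
    while changed:
        changed = False
        for u, children in EDGES.items():
            if u not in anc and any(v in anc for v in children):
                anc.add(u)
                changed = True
    return anc

def control(i, K, z):
    """u^i as a gain-weighted sum of ancestor estimates."""
    return sum(K[i, j] * z[j] for j in sorted(ancestors(i)))

# Node 3 has ancestors {1, 2, 3}: u^3 = K[3,1] z[1] + K[3,2] z[2] + K[3,3] z[3]
K = {(3, 1): 1.0, (3, 2): 0.5, (3, 3): 2.0}
z = {1: 1.0, 2: 2.0, 3: 0.5}
u3 = control(3, K, z)
```

With these placeholder numbers, each node's controller only ever needs the estimates held along its ancestor chain, which is the decentralization the theorem asserts.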
46 / 46 Concluding Remarks
The common information approach provides a systematic method for finding sufficient statistics in decentralized stochastic control problems.
It finds sufficient statistics that cannot be computed using person-by-person methods alone.
In delayed sharing information structures, the approach yields results for general values of the delay.
Sufficient statistics for decentralized LQG problems.
46 / 46 Concluding Remarks
The common information approach provides a systematic method for finding sufficient statistics in decentralized stochastic control problems.
It finds sufficient statistics that cannot be computed using person-by-person methods alone.
In delayed sharing information structures, the approach yields results for general values of the delay.
Sufficient statistics for decentralized LQG problems.
Optimization of strategies: in POMDP-like formulations, a dynamic-programming-like decomposition for the coordinator; in LQG formulations, an open-loop control problem for finding the best gain matrices for local information.
46 / 46 Concluding Remarks
The common information approach provides a systematic method for finding sufficient statistics in decentralized stochastic control problems.
It finds sufficient statistics that cannot be computed using person-by-person methods alone.
In delayed sharing information structures, the approach yields results for general values of the delay.
Sufficient statistics for decentralized LQG problems.
Optimization of strategies: in POMDP-like formulations, a dynamic-programming-like decomposition for the coordinator; in LQG formulations, an open-loop control problem for finding the best gain matrices for local information.
Common information and dynamic games: in general, a coordinator cannot be used in a non-cooperative setting. The common information approach allows us to recast a game of asymmetric information as a game of symmetric information. Under some conditions, it provides an extension of Markov perfect equilibrium to asymmetric information games.
Coordinating multiple optimization-based controllers: new opportunities and challenges James B. Rawlings and Brett T. Stewart Department of Chemical and Biological Engineering University of Wisconsin Madison
More informationProbabilistic Graphical Networks: Definitions and Basic Results
This document gives a cursory overview of Probabilistic Graphical Networks. The material has been gleaned from different sources. I make no claim to original authorship of this material. Bayesian Graphical
More informationarxiv: v1 [cs.sy] 24 May 2013
Convexity of Decentralized Controller Synthesis Laurent Lessard Sanjay Lall arxiv:35.5859v [cs.sy] 4 May 3 Abstract In decentralized control problems, a standard approach is to specify the set of allowable
More informationStochastic Models, Estimation and Control Peter S. Maybeck Volumes 1, 2 & 3 Tables of Contents
Navtech Part #s Volume 1 #1277 Volume 2 #1278 Volume 3 #1279 3 Volume Set #1280 Stochastic Models, Estimation and Control Peter S. Maybeck Volumes 1, 2 & 3 Tables of Contents Volume 1 Preface Contents
More informationCapacity of the Trapdoor Channel with Feedback
Capacity of the Trapdoor Channel with Feedback Haim Permuter, Paul Cuff, Benjamin Van Roy and Tsachy Weissman Abstract We establish that the feedback capacity of the trapdoor channel is the logarithm of
More informationEncoder Decoder Design for Event-Triggered Feedback Control over Bandlimited Channels
Encoder Decoder Design for Event-Triggered Feedback Control over Bandlimited Channels LEI BAO, MIKAEL SKOGLUND AND KARL HENRIK JOHANSSON IR-EE- 26: Stockholm 26 Signal Processing School of Electrical Engineering
More informationPolitical Economy of Institutions and Development: Problem Set 1. Due Date: Thursday, February 23, in class.
Political Economy of Institutions and Development: 14.773 Problem Set 1 Due Date: Thursday, February 23, in class. Answer Questions 1-3. handed in. The other two questions are for practice and are not
More informationChris Bishop s PRML Ch. 8: Graphical Models
Chris Bishop s PRML Ch. 8: Graphical Models January 24, 2008 Introduction Visualize the structure of a probabilistic model Design and motivate new models Insights into the model s properties, in particular
More information