Slopey quantizers are locally optimal for Witsenhausen's counterexample

Size: px

Start display at page:

Download "Slopey quantizers are locally optimal for Witsenhausen's counterexample"

Hollie Moore
5 years ago
Views:

1 Slopey quantizers are locally optimal for Witsenhausen's counterexample The MIT Faculty has mae this article openly available. Please share how this access benefits you. Your story matters. Citation As Publishe Publisher Ajorlou, Amir, an Jababaie, Ali. Slopey Quantizers Are Locally Optimal for Witsenhausen s Counterexample. 06 IEEE 55th Conference on Decision an Control (CDC), December -4 06, Las Vegas, Nevaa, USA, Institute of Electrical an Electronics Engineers (IEEE), December Institute of Electrical an Electronics Engineers (IEEE) Institute of Electrical an Electronics Engineers (IEEE) Version Author's final manuscript Accesse We Nov 4 7::40 EST 08 Citable Link Terms of Use Creative Commons Attribution-Noncommercial-Share Alike Detaile Terms

2 Slopey Quantizers Are Locally Optimal for Witsenhausen s Counterexample Amir Ajorlou an Ali Jababaie Abstract We stuy the perfect Bayesian equilibria of a leaer-follower game of incomplete information. The follower makes a noisy observation of the leaer s action (who moves first) an chooses an action minimizing her expecte eviation from the leaer s action. Knowing this, leaer who observes the realization of the state, chooses an action that minimizes her istance to the state of the worl an the ex-ante expecte eviation from the follower s action. We show the existence of what we call near piecewise-linear equilibria when there is strong complementarity between the leaer an the follower an the precision of the prior is poor. As a major consequence of this result, we prove local optimality of a class of slopey quantization strategies which ha been suspecte of being the optimal solution in the past, base on numerical evience for Witsenhausen s counterexample. Inex Terms Decentralize control, optimal stochastic control, incomplete information games, perfect Bayesian equilibrium. I. INTRODUCTION In his seminal work [], Witsenhausen constructe a simple two-stage linear-quaratic-gaussian (LQG) ecentralize control problem where the optimal controller happens to be nonlinear. This example showe for the first time that linear quaratic Gaussian team problems can have nonlinear solutions. By resorting to Witsenhausen s counterexample, [] prouce an example showing that the stanar ecentralize static output optimal control problem of linear eterministic systems coul also amit optimal nonlinear solutions. For nearly half a century, this counterexample has been a subject of intense research across multiple communities ( [3] [8]). The enogenous information structure of Witsenhausen s counterexample, where the signal observe in the secon stage is a noisy version of the control action in the first stage, gives rise to a nonclassical information structure. While the problem looks eceptively simple an quaratic on first look, it is actually a very complicate, nonconvex, functional optimization problem. This counterexample has she light on intricacies of optimal ecisions in stochastic team optimization problems with similar information structure. Naturally, this problem has given rise to a large boy of literature. For example, [9] provies a variant of Witsenhausen s counterexample with iscrete primitive ranom variables an finite support, where no optimal solution exists. Another interesting variant, with the same information structure but ifferent cost function, is the Gaussian test channel ( [4], [0]) where the linear strategies can be shown to be optimal. Also, [] shows that if the objective function is Institute for Data, Systems, an Society, Massachusetts Institute of Technology (MIT), Cambrige, MA 039, USA. {ajorlou,jababai}@mit.eu. This work was supporte by ARO MURI W9NF change to a worst case inuce norm, the linear controllers ominate nonlinear policies. While [] proves the existence of an optimal solution using tools from real an functional analysis, other works such as ( [6], [8]) suggest lifting the problem to an equivalent optimization problem over the space of probability measures an then employing tools from the optimal transport theory []. Although the optimal strategy an optimal cost for Witsenhausen s counterexample are still unknown, it can be shown that carefully esigne nonlinear strategies can largely outperform the linear strategies (see, e.g., the multi-point quantization strategies propose by [5]). This result, in particular, implies the fragility of the comparative statics an policies solely erive base on the linear strategies in problems with similar setting. A relevant line of research is to provie error bouns on the proximity to optimality for approximate solutions. [3] [5] use information theoretic techniques an vector versions of the original problem to provie such bouns. There are also several works aiming to approximate the optimal solution. [6] [9] employ ifferent heuristic approaches, all confirming an almost piecewiselinear form for the optimal controller. However, a complete optimality proof for such strategies has been elusive. In this paper, we view Witsenhausen s problem as a leaer-follower coorination game in which the action of the leaer is corrupte by an aitive noise, before reaching the follower. The leaer aims to coorinate with the follower while staying close to the observe state, recognizing that her action is not observe perfectly. As a result, she nees to signal the follower in a manner that can be ecoe efficiently. More than a mere acaemic counterexample, the above setup coul moel a scenario where coorination happens across generations an the insights of the leaer who is from a ifferent generation is corrupte/lost by the time the message reaches the future generations. If the leaer can internalize the fact that her actions will not be observe perfectly, how shoul she act to make sure coorination occur? When the leaer cares far more about coorination with the follower than staying on the message, the near piecewise-linear equilibrium strategy of the leaer coarsens the observation in well-space intervals, rather than merely broacasting a linearly scale version of the observe state as the linear strategy woul suggest. In this line, we prove the existence of a superior equilibrium strategy which coarsens the message via a slopey quantizer, making both the leaer an follower better off compare to the linear equilibrium strategies. To this en, we analyze the perfect Bayesian equilibria of this game an show that strong complimentarily between

3 the leaer an the follower combine with a prior with poor enough precision can give rise to nonlinear equilibria, an in particular, equilibria in form of slopey quantizers. To the best of our knowlege, this work is the first proviing an analytical rationale for the optimality of slopey quantization strategies for Witsenhausen s counterexample. The main iea behin the proof is to carefully construct a class of what we call near piecewise-linear strategies for the leaer that stays invariant uner the best response operator. By a near piecewise-linear strategy, we mean a piecewise strategy where the changes in the erivative in each segment are very small, making the strategy in each segment almost linear. Each strategy has a fixe number of segments, with leaer s action changing very slowly within each segment. As a result, leaer s actions stay very close to fixe points of the strategy in each segment. These fixe points are the values of the state which the leaer oes not istort. Therefore, well-space fixe points (combine with some appropriate bouns on the relative prior of the state of the worl in ifferent segments) reveal the leaer s actions to the follower with high probability, making the signal easily ecoable. As a consequence, we can characterize the best response of the follower to leaer s strategy. Using this characterization, we show that the best response of the leaer to follower s strategy also varies very little, essentially remaining near piecewise-linear as well. A key challenge in eriving the invariance property for this set of strategies for the leaer is to boun an control the isplacement in the fixe points an enpoints of the segments of leaer s strategy uner the best response. A key observation here is that the fixe points of the leaer s best responses are local minimizers of the expecte eviation of the leaer s action from the follower. This insight allows us to show that the fixe points of the leaer s best response lie in a tight neighborhoo of the fixe points of the follower s strategy. We then show that the fixe points of the follower s strategy in turn lie in a vicinity of a convex combination of the leaer s fixe points an the expecte value of the state of the worl within each segment. Combining the two, we can erive an approximate ynamics for the isplacement in the fixe points an enpoints of the segments in leaer s strategy uner the best response. Using this approximate ynamics, we then characterize an invariant set of fixe points an interval enpoints for leaer s strategy, which we can then use in orer to prove the existence of a near piecewise-linear equilibrium strategy for the leaer. II. MODEL The game consists of a leaer L an a follower F. Before the agents act, the state of the worlθ is rawn from a normal istribution with zero mean an variance σ. The leaer can observe the realization of θ an acts first. The payoff of the leaer is given as u L = r L (θ a L ) ( r L )(a F a L ), () where a F is the action of the follower an 0 < r L <. The follower makes a private, noisy observation of the leaer s action, s = a L + δ where δ N(0,). The payoff of the follower is u F = (a L a F ). () We consier the perfect Bayesian equilibria of the game an show that they reuce to the Bayes Nash equilibria ue to the Gaussian noise in the observation. Denote with a L (θ) an a F (s) the equilibrium strategies, an with ν ( s) the follower s belief about leaer s action given s. Due to the normal noise in the observation, ν ( s) is fully etermine by a L (θ) an the prior as there are no off-equilibrium-path information sets. Equilibrium strategies shoul thus satisfy a F (s) =E ν [a L s] = a L ν (a L s)a L, a L(θ) =argmax r L (θ a L ) a L ( r L ) (a F(s) a L ) φ(s a L )s, (3) where φ( ) enotes the PDF of the stanar normal istribution. We can easily characterize the linear equilibria of the game, following []. Theorem : Linear Bayes strategies of the leaer an follower are of the form a F (s) = µs an a L (θ) = λθ, where µ = t +t an λ = t σ, an t is a real root of the equation r L t (σ t) = r L (+t ). (4) Our main objective in this paper is to show the existence of an equilibrium with a near piecewise-linear strategy for the leaer in the regime where there is strong complementarity between the leaer an the follower (when r L is small) an the prior s precision is poor (or large σ). To this en, an motivate mainly by [], we focus on regime σ r Lσ, an aim to prove the existence of such an equilibrium for sufficiently large values of σ (an hence small r L ). III. NONLINEAR EQUILIBRIA Our approach for proving the existence of an equilibrium with a near piecewise-linear strategy for the leaer is to ientify a set of such strategies for the leaer that is invariant uner the best response operator. We characterize such a set in the next section. A. An Invariant Set of Near Piecewise-Linear Strategies for the Leaer Given m N, consier a partition of the normal istribution N(0,σ ) into m + segments m k= m B0 k, with Bk 0 = [b0 k,b0 k+ ) for k N m, B0 0 = (b 0,b 0 ), an B k 0 = (b 0 k,b0 k ], with b0 k = b0 k an b0 m+ = b0 m = +. Denote with c 0 k the expecte value of θ N(0,σ ) in segment Bk 0, that is, c0 k = E N(0,σ )[θ θ Bk 0 ]. Clearly, c 0 0 = 0 an c0 k = c0 k for k N m. We are in particular intereste in a partition where the interval enpoints b 0 k are the mipoints of [c0 k,c0 k ], i.e., Proofs are not inclue in this manuscript ue to space limitations. See [0] for the proofs.

4 b 0 k = c0 k +c0 k for k N m. We can show that such a partition exists an is unique. Next, we construct a set of (m+)- segmente increasing o functions, enote by A m L (r L,σ) satisfying the following properties. Property : For every a L (θ) A m L (r L,σ), there exist m + segments B k = [b k,b k+ ), for k N m, B 0 = ( b,b ), an B k = (b k,b k ], with b m+ = b m = + such that: a L (θ) is increasing an o (i.e., a L ( θ) = a L (θ)), an is smooth over each interval. a L (θ) has a unique fixe point in each segment. That is, for each interval B k, ( m k m), there exists a unique c k B k such that a L (c k ) = c k, with c 0 = 0. We also impose a constraint on the slope of a L (θ) in each interval, keeping the slope very close tor L, as well as a linear boun on a L (θ) in the tail. More precisely, we impose the following property: Property : For every m < k < m an θ B k, r θ a L(θ) r, where r = r L ( 0.5r L σ ) an r = r L (+.5r L σ ). For the tail interval B m, r θ a L(θ) r for b m < θ < c m + σ. For θ > c m + σ we have a L (θ) c m +5r L (θ c m +σ). The largerσ, the closer r an r to r L. For instance, choosing σ 6 ensures r <.0r L an r > 0.998r L. Finally, we impose the constraint that interval enpoints b k remain close to mipoints of [c k,c k ] an that fixe points c k remain close to c 0 k s. Property 3: For every k N m, Moreover, b k c k +c k 0.r L σ. (5) c k c 0 k k(m k)ζ r L, (6) where ζ = 0.44r L σ +(m+3.4)r L σ. Our proof strategy is then to show that for any m N, there exists σ m > 0 such that, for every σ σ m in regime σ r Lσ, the set of strategies A m L (r L,σ) characterize by Property -3 is invariant uner the best response operator. B. Best Response Analysis The first step in verifying the invariance of A m L (r L,σ) is to characterize the best response of the follower a F (s) to the leaer s strategy a L (θ) A m L (r L,σ). We can then use these properties to fin the upate best response of the leaer to a F (s), enote by ã(θ) an enforce its inclusion in A m L (r L,σ). Choosing σ sufficiently large, we can ensure several useful properties fora m L (r L,σ) that can facilitate this process. We state these properties in the lemma below. Lemma : There exists σ m max(6,7m ) such that for any σ σ m in regime σ r Lσ we have the following: The lengths of the half intervals in any strategya L (θ) A m L (r L,σ) are upper-boune by σ. That is, c k b k σ an c k b k σ for k N m. We only state the properties (an in many cases the analysis) only for θ 0. The counterpart for θ 0 is immeiate since the function is o. There exists a C > 3lnσ such that the lengths of the half intervals in any strategy a L (θ) A m L (r L,σ) are lower-boune by C. That is, c k b k C an c k b k C for k N m. Moreover, c m σ an b m 0.5σ. The lengths of the first half (secon half) of the intervals in any strategy a L (θ) A m L (r L,σ) are in increasing orer. That is, c k b k < c k+ b k+ an c k b k < c k b k+, for k N m. Moreover, b b > b ensuring an increasing orer on the lengths of the intervals as well. Let e k, 0 k m, be the expecte value of θ in segment B k, i.e., e k = E N(0,σ )[θ θ B k ]. Then, the istances from e k to the enpoints b k an b k+ also satisfies the lower an upper bouns C an σ. Moreover, e m 0.5σ. By choosing σ σ m, we can exploit the properties state for A m L (r L,σ) in the above lemma. In what follows, we use these properties to prove the invariance of the set of strategies A m L (r L,σ) uner the best response. The follower s best response to the strategy of the leaer a L (θ) is the expecte action of the leaer given the observations = a L +δ an can be written as a F (s) = E δ [a L s] = a L(θ)φ(s a L (θ))φ( θ σ )θ φ(s a L(θ))φ( θ σ )θ. (7) Using this, we can easily show that a F (s) is analytic an increasing, with s a F(s) = Var[a L s] (see [] for a proof). In orer to characterize a F (s), we start by estimating the expecte action of the leaer an its variance conitione on the interval to which θ belongs. Actions of the leaer in interval B k (k ±m) are well-concentrate aroun c k. In fact a L (θ) c k rσ for θ B k, from which the lemma below immeiately follows. Lemma : For 0 k < m, E[a L (θ) s,θ B k ] c k rσ an Var[a L (θ) s,θ B k ] r σ. The analysis is a bit involve in the tail, since for θ > c m the leaer s actions are not in a boune vicinity of c m anymore. However, we can erive several useful properties for the tail as well. Lemma 3: Consier a tail observation by the leaer (i.e., θ B m ). Then, for s c m +σ, an E[a L (θ) s,θ B m ] c m rσ, (8) E[a L (θ) s,θ B m ] c m 5r L σ(s c m ), (9) σ for s > c m +σ. Also, E[a L (θ) s,θ B m ] c m rσ. As for the variance, 3, for s < c m Var[a L (θ) s,θ B m ] 0.64 r σ, for c m s c m +σ 8.8rL σ (s c m ), for s > c m +σ. (0) Let the signal observe by the follower be between c k an c k+, i.e., s = c k +δ with0 δ c k+ c k. Then, we claim

5 that the follower s posterior on θ given s has a negligible probability out of the neighboring intervals B k B k+. We first erive the following property for the relative prior of θ in B k s by relating it to e k s an using the bouns on the istance from e k to the enpoints of the intervals, as well as the increasing orer of the lengths of the intervals given by Lemma. Lemma 4: For any m k,k m, we have Prob[θ B k ] Prob[θ B k ] e (ck c k ) 64. () This lemma implies that the posterior on θ is more affecte by the likelihoos rather than relative priors. Using this, we can boun the posterior of θ outsie the neighboring intervals to s (i.e., out of B k B k+ ). Lemma 5: Let the observe signal by the follower be s = c k +δ, where 0 δ c k+ c k. Then, for any r, Similarly, Prob[θ B k r s] Prob[θ B k s] e 45(c k c k r ) 5. () Prob[θ B k+r+ s] e 45(c k+r+ c k+ ) 5. (3) Prob[θ B k+ s] Using this lemma an the fact that the fixe points c k are well-space, we can show that the effect of the intervals other than B k an B k+ on a F (s) are negligible. In orer to characterize the follower s best response a F (s), we then nee to focus only on the segments ajacent to the observe signal, an in particular figure out the weight of each of these two neighboring intervals in the follower s posterior on θ. We o this in the following lemma. Lemma 6: Define m k+ = c k +c k+ + ( ) Prob[θ Bk ] ln, (4) k+ Prob[θ B k+ ] where k+ = c k+ c k. Also, write the signal observe by the follower as s = m k+ +δ. Then, for 0 k < m, e k+(δ+ rσ) r σ Prob[θ B k s] Prob[θ B k+ s] e k+(δ rσ)+ r σ. For the case involving the tail segment B m, e m(δ+ rσ) r σ Prob[θ B m s].68e m(δ rσ)+ r σ. ũ L (θ,a L ) = r L (θ a L ) Prob[θ B m s] (6) ( r L ) It is worth mentioning that m k+ efine in the above lemma is quite close to the mipoint of c k an c k+. In fact, it follows from Lemma 4 that m k+ c k+c k+ < k+ 64. We can now characterize the best response of the follower a F (s) to the leaer s strategy a L (θ) A m L (r L,σ) up to the first orer. Lemma 7: Let s = m k+ +δ, with c k s c k+. Then a F (s) c k + a F (s) c k + k+ +.68e k+(δ rσ)+ r σ k+ +e k+(δ+ rσ) r σ.0 rσ (5) +.0 rσ. (7) Also, 0 s a F(s).7e k+( δ rσ) ( k+ + rσ) +.0 r σ. (8) Corollary : A useful consequence of Lemma 7 is that a F (s) c k+.7e k+(δ rσ) k+.0 rσ a F (s) c k +.7e k+(δ+ rσ) k+ +.0 rσ, (9) where s = m k+ +δ, with c k s c k+. Note that the exponential terms in the above bouns vanish quite fast for large C an δ. for small δ, another useful upper boun on the erivative of a F (s) is s a F(s) 4 ( k+ + rσ) +.0 r σ. (0) Corollary : Let s = m k+ + δ, with c k s c k+. Then, c k.5 rσ a F (s) c k +.5 rσ for δ < 0.65 c k+.5 rσ a F (s) c k+ +.5 rσ for δ > () Roughly speaking, the above corollary says that, if the observe signal by the follower is far enough from the mipoint of c k an c k+, then the optimal action of the follower is well-concentrate aroun c k or c k+ (whichever that is closer), an changes very slowly accoring to Lemma 7. However, a F (s) may have very high variations for s close to m k+ as can be seen from Lemma 7. The following lemma characterizes a F (s) when follower makes a tail observation. Lemma 8: Let s = c m +δ, where δ > 0. Then, i) for δ σ, c m.0 rσ a F (s) c m + rσ, an 0 s a F(s) 0.65 r σ. ii) for δ > σ, c m.0 rσ a F (s) c m +5r L σ(δ σ ), an 0 s a F(s) 9r L σ δ. Lemma 7 an 8 provie the first orer characteristics of the best response of the follower to a leaer s strategy a L (θ) A m L (r L,σ). We are now reay to analyze the leaer s best response ã L (θ) to a F (s) an see if it stays in A m L (r L,σ). We have ã L (θ) = argmax al ũ L (θ,a L ), where (a F (s) a L ) φ(s a L )s. () Lemma 9: Consier θ [c k,c k+ ], 0 k < m. Then, there exists a unique b k+ [c k,c k+ ] such that ã L (θ) c k < 5 rσ for θ < b k+, ã L (θ) c k+ < 5 rσ for θ b k+. (3) The points b k+ etermine the segments of the best response strategy ã L (θ). We can boun the erivative of ã L (θ) over these segments by incorporating Lemma 7 an Corollary an into the above boun.

6 Lemma 0: Consier θ [c k,c k+ ], 0 k < m, with θ b k+. Then, r L θãl(θ) r L +( r L )(+0.3 r σ ) r L θãl(θ) r L +( r L )(.4 r σ ). (4) Using this lemma an the values = r L ( 0.5r r L σ ) an r = r L ( +.5rL σ ), we can easily verify that r θãl(θ) r. This means that Property is preserve by the best response for θ [ c m,c m ]. We stuy the tail case later in Lemma. Next, we characterize the fixe points of the best response strategy ã L (θ). Lemma : Define J L (a L ) = (a F (s) a L ) φ(s a L )s. (5) Then, JL (a L ) is strongly convex over [c k 5 rσ,c k +5 rσ], with J a L (a L ) (.4 r σ ). Let c k be the unique L solution of c k = argmin J L (a L ). (6) a L [c k 5 rσ,c k +5 rσ] Then, ã L ( c k ) = c k. The above lemma implies that Property is also preserve uner the best response. Next lemma escribes the tail properties of ã L (θ). Lemma : If b m < θ < c m +σ, then r θãl(θ) r. For θ > c m +σ, we have ã L (θ) c m +5r L (θ +σ c m ). (7) Now, in orer to verify that the upate strategy ã L (θ) satisfies Property 3, we nee to boun the isplacements in the fixe points c k an enpoints b k. Lemma 3: For the enpoints of the intervals corresponing to ã L (θ), we have b k+ c k + c k+ 0.r L σ. (8) Bouning the isplacement in c k has multiple steps: it involves relating the fixe point of the leaer s best response ã L (θ) in interval Bk to the fixe point of a F (s) in B k (i.e., s k ), followe by estimating s k in terms of c k an e k (recall that e k = E N(0,σ )[θ θ B k ], i.e., the expecte value of θ over B k ). Finally we boun the isplacement in e k with the isplacement of the interval enpoints using several properties of truncate normal istribution. Lemma 4: Let s k be the fixe point of a F (s) in the interval [c k 5 rσ,c k +5 rσ], i.e., a F (s k ) = s k. Then, c k s k < 0.44r L σ. (9) Lemma 5: s k can be locate base on c k an e k as s k ( r L )c k r L e k < (m+3.3)r Lσ. (30) Using Lemma 4 an 5, we can reach at c k ( r L )c k r L e k < 0.44r Lσ +(m+3.3)r Lσ. (3) We can now use (3) an Lemma 3 to verify that Property 3 is also preserve by the best response, completing the proof of the invariance of A m L (r L,σ) for σ σ m given by Lemma. Theorem : Consier the regime σ r Lσ an a given m N. Then, there exists σ m > 0 such that for any σ σ m, the set of (m+)-segmente strategies A m L (r L,σ) for the leaer, characterize by Property -3, is invariant uner the best response. Moreover, the game escribe in Section II has an equilibrium for which a L (θ,r L,σ) A m L (r L,σ). Given m N, the maximum eviation of fixe points c k from their counterparts c 0 k for strategies in Am L (r L,σ) is upper-boune by m (0.44r L σ +(m+3.4)r L σ), accoring to (6). This upper boun oes not grow unbounely with σ in the regime σ r Lσ. Furthermore, if σ + along a path where r L σ 0, this upper boun approaches zero, hence asymptotically ientifying the equilibrium strategy. Proposition : Let a 0 L (θ,r L,σ) be the (m+)-segmente piecewise-linear strategy with B 0 k s as segments an c0 k s as fixe points an fixe common slope r L, that is, a 0 L (θ,r L,σ) = c 0 k +r L(θ c 0 k ) for θ B0 k. Then, lim E N(0,σ )[ a L(θ,r L,σ) a 0 L(θ,r L,σ) ] 0. (3) r L σ 0 σ + IV. NUMERICAL EXAMPLES In [], Witsenhausen proves the existence of a nonlinear controller which outperforms the optimal linear strategy for a simple two-stage LQG ecentralize control problem. The equivalent regime uner our setup is k σ =, where k = rl r L. Witsenhausen shows that for large enough values of σ the optimal controller in this regime is nonlinear. It has been a long-staning conjecture that the optimal controller is a near piecewise-linear controller ( [4]). Theorem proves the existence of an equilibrium for the game of Section II with a near piecewise-linear strategy for the leaer for sufficiently large σ in the regime σ r Lσ ; This implies the existence of a slopey quantize local optimum 3 for Witsenhausen s counterexample in this regime (it is easy to verify that k σ = falls in this regime, since k σ = yiels σ < r Lσ = r L < for σ > ). The aim of this section is to illustrate Theorem for the special cases of m =, by specifying explicit values for σ m which guarantee the existence of 3-segmente an 5- segmente near piecewise-linear equilibria, respectively. Proposition (3-Segmente Equilibria): Suppose that σ r Lσ an σ 6. Then, the game escribe in Section II has an equilibrium with a L (θ,r L,σ) A L (r L,σ); a L (θ,r L,σ) is a 3-segmente near piecewise-linear strategy possessing Property -3. The strip containing this nonlinear strategy for the leaer for the caseσ = 6 anr L σ = r L is epicte in Figure (the plot only shows the region θ 0; a L ( θ) = a L (θ)). 3 This is inee stronger than a local optimum since, at an equilibrium, fixing one player s strategy the eviation in the other player s strategy oes not have to be local.

7 al(θ) θ Fig.. The strip containing the nonlinear equilibrium strategy for the leaer for the case σ = 6 an r L σ = r L. al(θ) θ Fig.. The strip containing the nonlinear equilibrium strategy for the leaer for the case σ = 60 an r L σ = r L. Proposition 3 (5-Segmente Equilibria): Suppose that σ r Lσ an σ Then, the game escribe in Section II has an equilibrium with a L (θ,r L,σ) A L (r L,σ); a L (θ,r L,σ) is a 5-segmente near piecewise-linear strategy possessing Property -3. The strip containing this nonlinear strategy for the leaer for the case σ = 60 an r L σ = r L is epicte in Figure. V. CONCLUSIONS We stuie Witsenhausen s counterexample in a leaerfollower game setup where the follower makes noisy observations from the leaer s action an aims to choose her action as close as possible to that of the leaer. Leaer who moves first an can see the realization of the state of the worl chooses her action to minimize her ex-ante istance from the follower s action as well as the state of the worl. We showe the existence of nonlinear perfect Bayesian equilibria in the regime where there is strong complementarity between the leaer an the follower when the prior has very poor precision. Our results provie the first analytical proof for the local optimality of near piecewise-linear controllers for the well-known Witsenhausen s counterexample, where the optimal controller is conjecture to be a slopey quantizer. REFERENCES [] H. S. Witsenhausen, A counterexample in stochastic optimum control, SIAM J. Control, vol. 6, pp. 3 47, 968. [] T. Başar, On the optimality of nonlinear esigns in the control of linear ecentralize systems, IEEE Transactions on Automatic Control, vol., p. 797, 976. [3] Y. Ho an T. Chang, Another look at the nonclassical information structure problem, IEEE Transactions on Automatic Control, vol. 5, pp , 980. [4] R. Bansal an T. Başar, Stochastic teams with nonclassical information revisite: When is an affine law optimal?, IEEE Transactions on Automatic Control, vol. 3, pp , 987. [5] S. Mitter an A. Sahai, Information an control: Witsenhausen revisite, Lecture Notes in Control an Information Sciences, vol. 4, pp. 8 93, 999. [6] Y. Wu an S. Verú, Witsenhausen s counterexample: A view from optimal transport theory, in Proceeings of the 50th IEEE Conf. Decision Control (CDC), pp , 0. [7] P. Grover, A. B. Wagner, an A. Sahai, Information embeing an the triple role of control, IEEE Transactions On Information Theory, vol. 6, pp , 05. [8] A. Gupta, S. Yüksel, T. Başar, an C. Langbort, On the existence of optimal policies for a class of static an sequential ynamic teams, SIAM J. Control Optim., vol. 53, pp. 68 7, 05. [9] S. Yüksel an T. Başar, Stochastic networke control systems: Stabilization an optimization uner information constraints, Birkhäuser, 03. [0] T. Başar, Variations on the theme of the Witsenhausen counterexample, in Proceeings of the 47th IEEE Conf. Decision Control (CDC), pp , 008. [] M. Rotkowitz, Linear controllers are uniformly optimal for the Witsenhausen counterexample, in Proceeings of the 45th IEEE Conf. Decision Control (CDC), pp , 006. [] C. Villani, Optimal Transport: Ol an New. Springer-Verlag, 009. [3] P. Grover an A. Sahai, A vector version of Witsenhausen s counterexample: Towars convergence of control, communication an computation, in Proceeings of the 47th IEEE Conf. Decision Control (CDC), pp , 008. [4] P. Grover an A. Sahai, Vector Witsenhausen counterexample as assiste interference suppression, Int. J. Syst., Control Commun (IJSCC) Spec. iss. Inf Process. Decision Making in Distribute Control Syst., vol., pp , 00. [5] P. Grover, S. Y. Park, an A. Sahai, Approximately optimal solutions to the finite-imensional Witsenhausen counterexample, IEEE Transactions on Automatic Control, vol. 58, pp , 03. [6] N. Li, J. R. Maren, an J. S. Shamma, Learning approaches to the Witsenhausen counterexample from a view of potential games, in Proceeings of the 48th IEEE Conf. Decision Control (CDC), 009. [7] M. Baglietto, T. Parisini, an R. Zoppoli, Numerical solutions to the Witsenhausen counterexample by approximating networks, IEEE Transactions on Automatic Control, vol. 46, pp , 00. [8] J. T. Lee, E. Lau,, an Y.-C. L.Ho, The Witsenhausen counterexample: A hierarchical search approach for nonconvex optimization problems, IEEE Transactions on Automatic Control, vol. 46, pp , 00. [9] M. Mehmetoglu, E. Akyol, an K. Rose, A eterministic annealing approach to Witsenhausen s counterexample, in ISIT, 04. [0] A. Ajorlou an A. Jababaie, Slopey quantizers are locally optimal for witsenhausen s counterexample, available at: org/abs/ , It is to be note that smaller values for σ m may work here (an similarly in Proposition ). These values are obtaine using the bouns erive for generic m. Of course, given a specific value of m, these bouns can be tightene much further which may result in smaller values for σ m.

arxiv: v4 [math.oc] 2 Jun 2017

arxiv: v4 [math.oc] 2 Jun 2017 Local Optimality of Almost Piecewise-Linear Quantizers for Witsenhausen s Problem Amir Ajorlou Ali Jababaie arxiv:1604.03806v4 [math.oc] Jun 017 Abstract We pose Witsenhausen s problem as a leaer-follower