Realization Plans for Extensive Form Games without Perfect Recall


Richard E. Stearns
Department of Computer Science, University at Albany - SUNY, Albany, NY 12222
April 13, 2015

Abstract

Given a game in extensive form and a player $p$ in the game, we want to find a small set of parameters describing a set $M$ of mixed strategies with the property that every mixed strategy for $p$ has an equivalent mixed strategy in $M$. In the case that the player has perfect recall, behavioral strategies describe such a set [2]. For computational purposes, it is more useful to work with corresponding path probabilities because the relationships among these probabilities are linear [3, 4, 5, 7]. A tree-like description of these linear relationships is often called a realization plan. Here we generalize the idea of a realization plan so that, in some cases, a player without perfect recall may also limit consideration to mixed strategies described by a small set of linearly related probabilities. We then describe techniques whereby such descriptions might be found and classes of games where these techniques might be effective. In the worst case, the generalized plans are too large to be useful. However, individual games with good enough recall will have small generalized realization plans. The point is that, whenever results are obtained using path probabilities, the results may immediately extend to certain more general situations merely by replacing traditional realization plans with the more general plans. To demonstrate this point, we define a class of "near perfect recall" games where the number of parameters is linear in the size of the game tree.

1 Introduction

One familiar way to describe an n-person game is with a game tree. A game so described is said to be an extensive form game. Each interior game tree node is associated with a player and is designated as belonging to one of the player's information sets. Each information set has a designated set of actions which are used to label the edges leaving nodes associated with the information set. Formally:

Definition 1.1 An extensive form game is given by
1. A finite rooted tree $T$.
2. A finite set $P$ of players.
3. A partition of the interior nodes of $T$ into player sets $N_p$ for $p$ in $P$.
4. A partition of each player set $N_p$ into information sets. The set of information sets for player $p$ is denoted by $U_p$. For $n$ in $N_p$, $u(n)$ denotes the set in $U_p$ which contains $n$.
5. For each information set $u$, a non-empty finite set $A(u)$ of actions. If $u_1$ and $u_2$ are distinct information sets, then $A(u_1) \cap A(u_2) = \emptyset$.
6. For each interior node $n$ of $T$, a one-to-one correspondence between the children of $n$ and the actions in $A(u(n))$.
7. The terminal nodes of $T$ are called outcomes.
8. No path from root to outcome contains more than one node from the same information set.

This paper is concerned only with strategy sets of one player. For this reason, we have no need to model a chance player moving with specified probabilities. For the same reason, we have no need to specify payoffs for the outcome nodes.

A player can be thought of as having a set of agents, one agent for each of the player's information sets. Whenever play reaches a node in an information set, the agent associated with the information set must take one of the actions associated with the information set without knowing which node in the information set has been reached. Play starts at the root and, when an outcome is reached, the game is over. Before a play of the game, each player instructs each of his agents which action the agent should play if some node in the agent's information set is reached.
In this way, the actions taken by the agents become correlated.

Definition 1.2 A pure strategy for a player $p$ is a mapping $\pi$ which assigns an action in $A(u)$ to each information set $u$ in $U_p$. A probability distribution on pure strategies is called a mixed strategy.

A pure strategy for one player will allow certain game tree edges to be reached (if the other players move accordingly) and will prevent certain edges from being reached (for any moves the other players might make). A mixed strategy generates, for each edge, a reachability probability: the probability that the edge can be reached. It is these reachability probabilities, together with the actions of the other players, which determine outcome probabilities.

The number of pure strategies is exponential in the number of information sets, and so an exponential number of probabilities is needed to specify an arbitrary mixed strategy. Therefore, if a subset of mixed strategies can generate the same set of reachability probabilities as an arbitrary mixture, then the player can restrict herself to strategies from that subset. Such a subset may be describable by a small number of probabilities, thus making certain computational problems easier.

A behavioral strategy [2] is an assignment of probabilities to each action of each information set. This is a small set of probabilities, no larger than the size of the game tree. These probabilities can be used to assign each agent an action to take if and when the agent's information set is reached. In the case that a player has perfect recall (see footnote 1), Kuhn has shown that behavioral strategies generate the same set of reachability probabilities as arbitrary mixed strategies.

For computational purposes, it is useful to analyze perfect recall games using probabilities that a given information set will be reached (if the other players move accordingly) and that a given action for that set is then chosen (see [3, 4, 5, 6, 7]). These probabilities, often called path probabilities, are proportional to behavioral probabilities in the obvious way. The advantage of path probabilities is that they are linearly related. One application of this approach is that two-person zero-sum games with perfect recall can be solved by a small-size linear program [3]. The technique is a powerful one and a variety of other applications appear in the literature.

The goal of this paper is to do a similar thing for games without perfect recall. That is, given a game in extensive form and a player in the game, we want to find a subset of the mixed strategies for the player such that
1. the subset is described by a small set of probability-valued parameters,
2. every mixed strategy for the player has an equivalent strategy in the subset,
3. the mixed strategies are easy to implement from the plan description.
The hope is that, whenever results are obtained using path probabilities, the results may immediately be extended to more general situations simply by replacing traditional realization plans with the more general plans.

We generalize the techniques for perfect recall to apply to players without perfect recall. These generalized techniques do not always produce small parameter sets, but in some cases they do. In particular, some extensive form games are close enough to perfect recall that they too will have strategy sets with small descriptions. We do not plan to define "small" formally.

In Section 2, we introduce the concept of an AC-tree. It is used to describe the choices available to a single player. The tree has two kinds of nodes: choice nodes labeled with the names of information sets, and action nodes labeled with actions. It enables us to formally distinguish between the actions available to the player when a certain choice is to be made and the many possible choices that can follow a given action. One important AC-tree is the skeleton, which describes the possible choices in the order they occur in the game.

In Section 3, we introduce the concept of a realization plan. The plan serves two purposes. One is to set up linear equations describing a set of strategies. The other is to implement a mixed strategy once values for the parameters have been chosen. This generalizes the realization plans from the literature. Our generalization is based on the idea that mixed strategies can be constructed using trees which display decisions in a significantly different order than they occur in the game tree. Much of this section is devoted to proving that the proposed realization plans do describe sets of mixed strategies.

Section 4 shows that every mixed strategy has an equivalent strategy in the set described by a realization plan. It is further shown that this relationship is described by linear equations.
Footnote 1: Although the perfect recall concept is often treated as a property of games, the concept applies to individual players as well.

Next we study how realization plans can be obtained from the skeleton through a sequence of transformations. In Section 5, we begin this study with a simple size-preserving transformation we call a merge operation. This section explains how our model relates to concepts in the literature, specifically to realization plans for perfect recall games and to game equivalence as in Dalkey [1]. In Section 6, we define an insert operation which resolves problems that occur when two nodes in an information set cannot always be distinguished by a player's earlier actions. Repeated applications of this operation always produce a realization plan. The idea behind the operation is to pick the action for the corresponding agent before picking actions for agents that move earlier when the game is played. In Section 7, we show how our tree-based realization can be condensed into a directed acyclic graph (DAG), thereby reducing the number of parameters. Finally, in Section 8, we sketch how our methods can be used to define classes of games with polynomial-sized realizations. In some sense, the games in these classes can be considered to have near perfect recall.

2 AC-Trees

In this section, we introduce the concept of an AC-tree. These trees will serve several purposes, including strategy description and strategy implementation. In contrast with the game tree, an AC-tree only displays information pertaining to a single player.

We use common terminology for rooted trees. One node is designated as the root. From the root, one can go down the tree from parent to child; each node (except the root) has one parent and zero or more children. The least upper bound (lub) of two nodes $n_1$ and $n_2$ is thus the node where the tree forks, one branch leading to $n_1$ and the other to $n_2$. If we say "path to node $n$", we mean "path from the root to node $n$". If we say "sub-tree with root $n$", we mean $n$ and all its descendants.
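As a small illustration of the lub terminology, the following sketch computes the lub of two nodes from parent pointers. The function name `lub` and the dictionary-based tree encoding are assumptions made for this example only; they do not come from the paper.

```python
def lub(parent, n1, n2):
    """Return the deepest node lying on both root paths (the least upper bound).

    `parent` maps each node to its parent; the root maps to None.
    """
    # Collect every node on the path from n1up to the root.
    ancestors = set()
    node = n1
    while node is not None:
        ancestors.add(node)
        node = parent[node]
    # Climb from n2 until that path is hit; the root always is.
    node = n2
    while node not in ancestors:
        node = parent[node]
    return node


# Hypothetical tree: root r, choice node u with children a and b,
# and one further node below each of a and b.
example_parent = {"r": None, "u": "r", "a": "u", "b": "u", "v1": "a", "v2": "b"}
```

In this encoding, the lub of `v1` and `v2` is the node `u` where their root paths fork. When one node is an ancestor of the other, the function returns that ancestor.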
Definition 2.1 Given a game in extensive form and a player $p$ in the game, an AC-tree (action-choice tree) $T$ for player $p$ is a rooted tree such that:
1. the nodes of $T$ are partitioned into action nodes and choice nodes;
2. the root of $T$ is an action node;
3. the children of action nodes are choice nodes and the children of choice nodes are action nodes;
4. each choice node $n$ is labeled with an information set $u(n)$ in $U_p$;
5. the children of a choice node $n$ are labeled in a one-to-one manner with the actions in $A(u(n))$;
6. no path from the root contains two nodes with the same label.

An AC-tree may have several choice nodes labeled with the same information set. The choice nodes have one child per action, just as in the game tree. On the other hand, action nodes can have any number of children.

To make definitions and theorems more readable, we henceforth omit the phrase "given a game in extensive form and a player $p$ in the game" from definition and theorem statements.
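The conditions of Definition 2.1 can be checked mechanically. The sketch below is one possible encoding, not the paper's: the class `Node`, the function `is_ac_tree`, and the dictionary `actions` (mapping each information set to its action set $A(u)$) are hypothetical names, and the root is assumed to carry no label.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Node:
    kind: str                     # "action" or "choice"
    label: Optional[str] = None   # action name, information set name, or None for the root
    children: List["Node"] = field(default_factory=list)


def is_ac_tree(root: Node, actions: Dict[str, Set[str]]) -> bool:
    """Check conditions 2-6 of Definition 2.1 (condition 1 is built into `kind`)."""
    if root.kind != "action":                      # condition 2
        return False

    def visit(node: Node, seen: frozenset) -> bool:
        if node.label is not None:
            if node.label in seen:                 # condition 6: no repeated label on a path
                return False
            seen = seen | {node.label}
        if node.kind == "action":                  # condition 3: children are choice nodes
            return all(c.kind == "choice" and c.label in actions and visit(c, seen)
                       for c in node.children)
        if node.kind != "choice" or any(c.kind != "action" for c in node.children):
            return False
        labels = [c.label for c in node.children]  # condition 5: one child per action
        if len(labels) != len(actions[node.label]) or set(labels) != actions[node.label]:
            return False
        return all(visit(c, seen) for c in node.children)

    return visit(root, frozenset())


# Hypothetical player with information sets u = {a, b} and v = {c, d}.
example_actions = {"u": {"a", "b"}, "v": {"c", "d"}}
example_tree = Node("action", None, [
    Node("choice", "u", [
        Node("action", "a", [Node("choice", "v", [Node("action", "c"), Node("action", "d")])]),
        Node("action", "b"),
    ]),
])
```

The check relies on the disjointness of action sets from Definition 1.1(5), so a single set of labels seen so far covers both choice and action labels.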

There is a particular AC-tree which describes the choice structure a player has in the game tree. In essence, it is the game tree with the moves of the other players removed:

Definition 2.2 The skeleton $S$ for $p$ is the AC-tree for $p$ such that
1. the choice nodes are in one-to-one correspondence with the game tree nodes belonging to $p$;
2. each choice node is labeled with the information set of the corresponding game tree node;
3. the parent of a choice node $n$ is the action node (if any) corresponding to the game tree action taken at the previous game tree node for player $p$. If there is no such node, the parent is the root.

Definition 2.3 Given an AC-tree $T$ and a pure strategy $\pi$ for player $p$, the sub-tree of $T$ induced by $\pi$ is the sub-tree $T_\pi$ defined inductively by
1. an action node $n$ of $T$ is in $T_\pi$ if and only if (a) $n$ is the root of $T$ or (b) the parent $m$ of $n$ is in $T_\pi$ and $n$ is labeled with $\pi(u(m))$;
2. a choice node $n$ of $T$ is in $T_\pi$ if and only if the parent of $n$ is in $T_\pi$.

For the skeleton $S$, the nodes of the sub-tree $S_\pi$ induced by a pure strategy $\pi$ correspond to the nodes in the game tree that are reachable if $\pi$ is used by $p$ and the other players move accordingly. The particular nodes reached during the play will depend on the particular choices of the other players.

Definition 2.4 Given a mixed strategy $\mu$ and an AC-tree $T$ for player $p$, let $f_\mu$ map each action node $n$ of $T$ to the probability that a pure strategy drawn from $\mu$ includes $n$ in its induced sub-tree. We say that $f_\mu$ is induced on $T$ by $\mu$.

Definition 2.5 Given an AC-tree $T$ for player $p$, a mapping $f$ of the action nodes into non-negative real numbers is called a path probability map for $T$ if and only if
1. for the root node $r$, $f(r) = 1$;
2. for all choice nodes $m$, $\sum_{c \in C} f(c) = f(n)$, where $C$ is the set of children of $m$ and $n$ is the parent of $m$.

Theorem 2.6 The function $f_\mu$ from Definition 2.4 is a path probability map.

Proof: The root is always part of an induced sub-tree and thus Condition 1 of Definition 2.5 is satisfied.
If $\pi$ is a pure strategy in $\mu$ and $n$ is a choice node in $T_\pi$, exactly one child of $n$ belongs to the sub-tree, namely the child labeled $\pi(u(n))$. Therefore, the events of reaching a given child of $n$ are mutually exclusive, and exactly one of them occurs whenever the parent of $n$ is reached, which implies Condition 2.

Definition 2.7 We say that two mixed strategies for player $p$ are equivalent if they induce the same path probability map on the skeleton for $p$.

The strategies are called equivalent because, for any strategy choices of the other players, the probability of reaching any game tree node is the same. This notion of equivalence goes back to [2]. We seek strategy sets which are "sufficient":

Definition 2.8 A set of mixed strategies $M$ for player $p$ is called sufficient if, for all mixed strategies $\mu$, there is a mixed strategy $\mu'$ in $M$ equivalent to $\mu$.

In the next section, we work in reverse and use path probability functions on certain AC-trees to define and generate mixed strategies. It will be sufficient to find mixtures of reduced strategies defined as follows:

Definition 2.9 Given an AC-tree $T$ for a player $p$, let $U$ be a subset of the information sets for $p$ and $\pi$ be a function with domain $U$ which maps each $u \in U$ into an action for $u$. Then $\pi$ is called a reduced strategy for $p$ if there exists a pure strategy $\pi'$ such that $\pi'(u) = \pi(u)$ for $u \in U$ and the action nodes of the tree induced on the skeleton by $\pi'$ are all labeled with actions from $\pi$. We call $\pi$ a reduced form of $\pi'$.

Note that we call a strategy reduced even if it can be reduced further.

3 Realization Plans

In this section, we consider defining a set of reduced mixed strategies using an AC-tree and probability maps for that tree. We show that an AC-tree is suitable for this purpose if it is a plan basis as defined in Definition 3.3 below. The definition ensures that the tree actually does define (reduced) pure strategies and (as shown in Section 4) that every mixed strategy has an equivalent mixed strategy defined by a probability map on the basis. First we need two preliminary definitions:

Definition 3.1 Given an AC-tree $T$ for player $p$ and a node $n$ in $T$, the notation $\alpha_T(n)$ denotes the set of action symbols on the path to and including $n$.

Definition 3.2 Given an AC-tree $T$ for player $p$, two choice nodes $n_1$ and $n_2$ of $T$ with the same label are called distinguishable if $\alpha_T(n_1)$ contains an action $a_1$ and $\alpha_T(n_2)$ contains an action $a_2$ such that $a_1$ and $a_2$ are actions for the same information set and $a_1 \neq a_2$.

Now for the central definition:

Definition 3.3 An AC-tree $B$ for player $p$ is called a plan basis if and only if the following two conditions hold:
1. if two distinct choice nodes $n_1$ and $n_2$ of $B$ satisfy $u(n_1) = u(n_2)$, then $n_1$ and $n_2$ are distinguishable;
2. if the sub-tree $S_\pi$ induced on the skeleton $S$ by a pure strategy $\pi$ includes action node $m$, then the sub-tree $B_\pi$ of $B$ induced by $\pi$ contains an action node $n$ such that $\alpha_S(m) \subseteq \alpha_B(n)$.

A pair $(B, f)$ where $B$ is a plan basis and $f$ is a path probability function for $B$ is called a realization plan.

Theorem 3.8 below describes how a realization plan can be used to generate a mixed strategy. We need some alternatives to Condition 1:

Theorem 3.4 For an AC-tree $T$ for player $p$, the following three conditions are equivalent:
1. Condition 1 of Definition 3.3.

2. For all distinct choice nodes $n_1$ and $n_2$ of $T$ such that $u(n_1) = u(n_2)$, the lub of $n_1$ and $n_2$ is a choice node.
3. If choice nodes $n_1$ and $n_2$ are distinct children of the same action node, then the choice nodes in the sub-tree with root $n_1$ have different labels than the choice nodes in the sub-tree with root $n_2$.

Proof: $\neg 1 \Rightarrow \neg 2$: If $T$ does have nodes $n_1$ and $n_2$ such that $u(n_1) = u(n_2)$ and $n_1$ and $n_2$ are not distinguishable, then the lub of $n_1$ and $n_2$ cannot be a choice node $m$, because then $n_1$ and $n_2$ would be descended from distinct children of $m$ and thus be distinguishable.

$\neg 2 \Rightarrow \neg 1$: Suppose choice nodes $n_1$ and $n_2$ with the same label have lub $m$ where $m$ is an action node. Let $n_1'$ and $n_2'$ be two choice nodes with the same label such that $n_1'$ is on the path from $m$ to $n_1$, $n_2'$ is on the path from $m$ to $n_2$, and the sum of the lengths of the paths $m$ to $n_1'$ and $m$ to $n_2'$ is minimal. Then, by minimality, the paths from $m$ to $n_1'$ and from $m$ to $n_2'$ have no choice node labels in common, and $n_1'$ and $n_2'$ are therefore not distinguishable.

$2 \Leftrightarrow 3$: The lub of a node in the sub-tree with root $n_1$ and a node in the sub-tree with root $n_2$ is the parent of $n_1$ and $n_2$, an action node. Therefore Condition 2 is violated if and only if Condition 3 is violated.

Now we introduce a concept used to connect a plan basis with pure strategies:

Definition 3.5 Given an AC-tree $T$ for player $p$, a sub-tree $T'$ of $T$ is called a decision sub-tree if it satisfies the following three conditions:
1. the root of $T$ is in $T'$;
2. if $n$ is an action node of $T'$, then all children of $n$ in $T$ are in $T'$;
3. if $n$ is a choice node of $T'$, then exactly one child of $n$ in $T$ is in $T'$.

Now we can give yet another condition equivalent to Condition 1:

Lemma 3.6 Let $T$ be an AC-tree for player $p$. Then the following condition is equivalent to Condition 1 of Definition 3.3: For all decision sub-trees $T'$ of $T$ and information sets $u$ for player $p$, at most one choice node of $T'$ is labeled with $u$.

Proof: We show equivalence with Condition 2 of Theorem 3.4.
First we show that Condition 2 implies the condition of the lemma. If there are two choice nodes $n_1$ and $n_2$ in $T'$ labeled $u$, Condition 2 (and the fact that $T'$ is a sub-tree of $T$) says that the lub of $n_1$ and $n_2$ is a choice node $n$. But Condition 3 of Definition 3.5 says only one child of $n$ belongs to $T'$, a contradiction since $n_1$ and $n_2$ descend from distinct children of $n$. Now for the implication in the other direction. If there are two choice nodes $n_1$ and $n_2$ in $T$ labeled $u$ such that the lub of $n_1$ and $n_2$ is an action node, then a $T'$ containing $n_1$ and $n_2$ is easily constructed following the rules of Definition 3.5 by always picking children on the paths to $n_1$ and $n_2$.

Now we bring in Condition 2 of Definition 3.3 to show that, in the case of a plan basis, a decision sub-tree describes a reduced pure strategy.

Lemma 3.7 Let $B$ be a plan basis for player $p$ and let $B'$ be a decision sub-tree of $B$. Then the labels on the action nodes of $B'$ describe a reduced strategy for $p$.

Proof: By Lemma 3.6, there is at most one action node in $B'$ for a given information set. Thus the labels on the action nodes of $B'$ describe a mapping $\pi$ from a subset of information sets to actions. Let $\pi'$ be any extension of $\pi$ to the remaining information sets and let $S'$ be the sub-tree induced by $\pi'$ on the skeleton $S$. Let $m$ be an action node of $S'$. By Condition 2 of Definition 3.3, there is a node $n$ of $B'$ such that $\alpha_S(m) \subseteq \alpha_B(n)$. Therefore the action labeling $m$ also labels an action node of $B'$ and is thus an action specified by $\pi$. Definition 2.9 is thus satisfied.

Now for the conclusion:

Theorem 3.8 Given a realization plan $(B, f)$ for player $p$, a decision sub-tree and corresponding reduced pure strategy can be selected at random as follows:
1. select the root of $B$;
2. if action node $n$ is selected, select all children of $n$;
3. if choice node $m$ is selected, select one child $n$ of $m$ using the probabilities $f(n)/f(n')$ where $n'$ is the parent of $m$.
Furthermore, the mixed strategy so described induces the path probability map $f$ on $B$.

Proof: Let $m$ be a choice node with parent $n'$ and let $C$ be the set of children of $m$. Definition 2.5(2) ensures that $\sum_{n \in C} f(n)/f(n')$ is equal to one, so the selection method is indeed a probabilistic procedure. Lemma 3.7 ensures that the sub-tree so selected does correspond to a reduced pure strategy. Let $n_0, \ldots, n_k$ be the action node sequence in the path in $B$ from the root $n_0$ to some action node $n_k$. The probability that $n_k$ is reached is $\prod_{i=1}^{k} f(n_i)/f(n_{i-1}) = f(n_k)/f(n_0) = f(n_k)$, so the final statement of the theorem is true.

Definition 3.9 Given a realization plan $(B, f)$ for player $p$, we call the mixed strategy obtained by the method of Theorem 3.8 the strategy realized by $(B, f)$.

Theorem 3.8 can now be restated:

Proposition 3.10 If $B$ is a plan basis and $\mu$ a mixed strategy for player $p$, then the path probability function $f_\mu$ induced on $B$ by $\mu$ is the same as the path probability function induced on $B$ by the strategy realized by realization plan $(B, f_\mu)$.

4 The Skeleton and Plan Basis Connection

Let $B$, $\mu$, and $f_\mu$ be as in Proposition 3.10. We want to show that $\mu$ and the mixed strategy $\mu'$ realized by $(B, f_\mu)$ are equivalent.
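For reference, the random selection of Theorem 3.8 that produces a realized strategy can be sketched as follows. This is a minimal illustration under assumptions of our own: the tree encoding, the function name `realize`, and the two-information-set example (information set `v` with actions `c`, `d` placed above information set `u` with actions `a`, `b`) are hypothetical, and the example tree is assumed rather than proved to be a plan basis for some game.

```python
import random

# Hypothetical AC-tree: node id -> (kind, label, child ids); "r" is the root.
example_basis = {
    "r": ("action", None, ["V"]),
    "V": ("choice", "v", ["c", "d"]),
    "c": ("action", "c", ["Uc"]),
    "d": ("action", "d", ["Ud"]),
    "Uc": ("choice", "u", ["ca", "cb"]),
    "ca": ("action", "a", []),
    "cb": ("action", "b", []),
    "Ud": ("choice", "u", ["da", "db"]),
    "da": ("action", "a", []),
    "db": ("action", "b", []),
}
# A path probability map on the action nodes (Definition 2.5).
example_f = {"r": 1.0, "c": 0.5, "d": 0.5, "ca": 0.5, "cb": 0.0, "da": 0.0, "db": 0.5}


def realize(basis, f, rng=random):
    """Select a decision sub-tree at random and return (selected nodes, reduced strategy)."""
    selected, reduced = ["r"], {}      # step 1: select the root
    frontier = ["r"]
    while frontier:
        kind, label, kids = basis[frontier.pop()]
        if kind == "choice":
            # Step 3: the weights f(k) are proportional to f(k) / f(parent of this
            # choice node), so the common divisor can be left out.
            pick = rng.choices(kids, weights=[f[k] for k in kids])[0]
            reduced[label] = basis[pick][1]
            kids = [pick]
        # Step 2: for an action node, all children are selected.
        selected.extend(kids)
        frontier.extend(kids)
    return selected, reduced
```

With `example_f`, a call such as `realize(example_basis, example_f)` returns a reduced strategy that plays `c` together with `a` or `d` together with `b`, each with probability one half. No behavioral strategy can produce this correlation between the two information sets.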
We do that in this section by showing that the path probabilities on the skeleton induced by $\mu$ and $\mu'$ are determined by $f_\mu$ and are hence the same. Furthermore, the probabilities on the skeleton are determined by linear equations.

Definition 4.1 Let $S$ be the skeleton and $B$ a plan basis for player $p$. For all action nodes $m$ of $S$, define $A(m)$ to be the set of action nodes $n$ of $B$ such that:
1. the label on $n$ labels some action node on the path to $m$ in $S$,
2. $\alpha_B(n) \supseteq \alpha_S(m)$.

Lemma 4.2 Let $S$ be the skeleton and $B$ a plan basis for player $p$. Let $\pi$ be a pure strategy for $p$ and $B_\pi$ be the sub-tree of $B$ induced by $\pi$.
1. If $\pi$ induces a path to node $m$ in $S$, there is exactly one node $n$ in $A(m)$ that is also in $B_\pi$.
2. Otherwise, there is no such $n$ in $A(m)$.

Proof: For any information set $u$, $B_\pi$ can have only one choice node labeled $u$, for two such nodes would be indistinguishable, contrary to the assumption that $B$ is a plan basis. Therefore, no two action nodes of $B_\pi$ can have the same label, since action sets are disjoint by Definition 1.1(5). Now suppose that $A(m)$ has two nodes $n_1$ and $n_2$ in $B_\pi$. The path to $n_1$ must have a node $n_3$ with the same label as $n_2$, since $\alpha_B(n_1) \supseteq \alpha_S(m)$ and $n_2$ is labeled with some action in $\alpha_S(m)$. Since $n_2$ and $n_3$ cannot be distinct, they must be the same and $n_2$ must be on the path to $n_1$. But then $\alpha_B(n_2)$ does not contain the label on $n_1$, a label in $\alpha_S(m)$, in violation of Condition 2 of Definition 4.1. This contradiction implies there can be at most one node of $B_\pi$ in $A(m)$. There must be at least one such $n$ by Condition 2 of Definition 3.3. To prove part 2, suppose there is an $n$ in $A(m)$ that is also in $B_\pi$ for some $m$ in $S$, but that $\pi$ does not induce a path to $m$ in $S$. This is impossible because $\alpha_B(n) \supseteq \alpha_S(m)$.

Theorem 4.3 Let $S$ be the skeleton and $B$ a plan basis for player $p$. For mixed strategy $\mu$, let $f_\mu$ be the path probability function induced on $B$ by $\mu$ and let $g_\mu$ be the path probability function induced on $S$. Then for all action nodes $m$ of $S$, $g_\mu(m) = \sum_{n \in A(m)} f_\mu(n)$.

Proof: Let $\pi$ be a pure strategy for $p$. Let $S_\pi$ be the sub-tree of $S$ induced by $\pi$ and $B_\pi$ be the sub-tree of $B$ induced by $\pi$. If $S_\pi$ does not contain action node $m$, then $B_\pi$ does not contain a node $n$ from $A(m)$ because of Lemma 4.2(2). If $S_\pi$ does contain $m$, Lemma 4.2(1) says that exactly one node from $A(m)$ appears in $B_\pi$. Thus the probability that $m$ is reached under $\mu$ is the same as the probability that some element of $A(m)$ is reached, which (because reaching individual elements of $A(m)$ are mutually exclusive events) is equal to $\sum_{n \in A(m)} f_\mu(n)$.
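The linear relation of Theorem 4.3 can be checked numerically on a small example. Everything below is a hypothetical illustration, not taken from the paper: in the assumed game, the player either moves at information set `u` (actions `a`, `b`) and then, after `b`, at `v` (actions `c`, `d`), or moves at `v` directly without knowing which case applies. The tree `B_tree` picks the action at `v` first; we believe it satisfies Definition 3.3 for this skeleton (checked by hand only). The root is skipped in the check, since both maps equal 1 there.

```python
from fractions import Fraction

# Node id -> (kind, label, parent id); all ids are made up for this example.
S_tree = {  # skeleton: v is reached after action b at u, or directly from the root
    "r": ("action", None, None),
    "U": ("choice", "u", "r"),
    "a": ("action", "a", "U"),
    "b": ("action", "b", "U"),
    "V1": ("choice", "v", "b"),
    "bc": ("action", "c", "V1"),
    "bd": ("action", "d", "V1"),
    "V2": ("choice", "v", "r"),
    "c": ("action", "c", "V2"),
    "d": ("action", "d", "V2"),
}
B_tree = {  # assumed plan basis: the action at v is chosen before the action at u
    "r": ("action", None, None),
    "V": ("choice", "v", "r"),
    "c": ("action", "c", "V"),
    "d": ("action", "d", "V"),
    "Uc": ("choice", "u", "c"),
    "ca": ("action", "a", "Uc"),
    "cb": ("action", "b", "Uc"),
    "Ud": ("choice", "u", "d"),
    "da": ("action", "a", "Ud"),
    "db": ("action", "b", "Ud"),
}


def alpha(tree, n):
    """Action labels on the path to and including n (Definition 3.1)."""
    labels = set()
    while n is not None:
        kind, label, parent = tree[n]
        if kind == "action" and label is not None:
            labels.add(label)
        n = parent
    return labels


def in_induced(tree, pi, n):
    """Is n in the sub-tree induced by pure strategy pi (Definition 2.3)?"""
    kind, label, parent = tree[n]
    if parent is None:
        return True
    if not in_induced(tree, pi, parent):
        return False
    return kind == "choice" or label == pi[tree[parent][1]]


def induced_map(tree, mu):
    """Path probability map induced on tree by mixed strategy mu (Definition 2.4)."""
    return {n: sum((p for pi, p in mu if in_induced(tree, pi, n)), Fraction(0))
            for n, (kind, _, _) in tree.items() if kind == "action"}


def A(S, B, m):
    """The set A(m) of Definition 4.1, as a sorted list of node ids of B."""
    target = alpha(S, m)
    return sorted(n for n, (kind, label, _) in B.items()
                  if kind == "action" and label in target and alpha(B, n) >= target)


# A mixed strategy: list of (pure strategy, probability).
mu = [({"u": "a", "v": "c"}, Fraction(1, 2)),
      ({"u": "b", "v": "d"}, Fraction(1, 3)),
      ({"u": "b", "v": "c"}, Fraction(1, 6))]

f_mu, g_mu = induced_map(B_tree, mu), induced_map(S_tree, mu)
holds = all(g_mu[m] == sum((f_mu[n] for n in A(S_tree, B_tree, m)), Fraction(0))
            for m in g_mu if m != "r")
print(holds)
```

Exact rational arithmetic (`Fraction`) is used so that the equality test is not affected by rounding. In this example, the skeleton node `a` has two counterparts in the basis (`ca` and `da`), so $g_\mu$ at that node is a sum of two basis probabilities.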
Corollary 4.4 Given a plan basis $B$ for player $p$, two mixed strategies for $p$ are equivalent if they induce the same path probability functions on $B$.

Proof: If they induce the same $f$ on the plan basis, the theorem says they induce the same $g$ on the skeleton.

Corollary 4.5 Given a mixed strategy $\mu$ and a plan basis $B$ for $p$, let $f_\mu$ be the path probability function induced by $\mu$ on $B$. Then the mixed strategy described in Theorem 3.8 is equivalent to $\mu$.

Proof: Theorem 3.8 says $f_\mu$ is also induced by the described mixed strategy, which must therefore be equivalent to $\mu$ by Corollary 4.4.

5 The Merge Operation

In this section, we begin to study how a plan basis can be obtained from a skeleton using a sequence of operations which change one AC-tree into another. A skeleton already satisfies Condition 2 of Definition 3.3, and the results of our operations will continue to satisfy this condition. Through a sequence of operations, we seek to derive an AC-tree from the skeleton which also satisfies Condition 1.

In this section, we study a simple operation we call a merge operation. Each merge operation reduces the size of a tree. Therefore, if a plan basis can be found using only merge operations, its description will be smaller than the description of the game.

Definition 5.1 Given an AC-tree $T$, an action node $n$ of $T$, an information set $u$, and a sub-set $C$ of children of $n$ labeled with information set $u$, a merge operation is as follows:
1. Create a new choice node $m$ labeled $u$ having $n$ as a parent and create a child $m_a$ of $m$ for each action $a$ in $A(u)$.
2. For each node $c$ in $C$ and action $a \in A(u)$, let $c_a$ be the child of $c$ labeled $a$. Make the children of $c_a$ have $m_a$ as their parent.
3. For each node $c$ in $C$, remove $c$ and the $c_a$ from the tree.
We say the merge operator has merged the nodes in $C$ into $m$.

Theorem 5.2 The merge operation preserves Condition 2 of Definition 3.3.

Proof: Let $T_1$ be an AC-tree for $p$ satisfying Condition 2 and $T_2$ an AC-tree obtained from $T_1$ by a merge operation. Let $\pi$ be any pure strategy and let $m$ be an action node in $S_\pi$. Condition 2 says there is an $n$ in $T_1$ such that $\alpha_S(m) \subseteq \alpha_{T_1}(n)$. If $n$ was retained in $T_2$ by the merge operation, $\alpha_{T_1}(n) = \alpha_{T_2}(n)$, so Condition 2 for $T_2$ is also satisfied by $n$. If $n$ was the child of some choice node deleted in Step 3, $\alpha_{T_1}(n) = \alpha_{T_2}(m_a)$, where $a$ is the label on $n$ and $m_a$ is as in Step 1. Thus Condition 2 is satisfied by $m_a$.

Definition 5.3 Two choice nodes $n_1$ and $n_2$ in AC-tree $T$ are called congruent if the path to $n_1$ and the path to $n_2$ are labeled with the same sequence.

Proposition 5.4 No merges are available for AC-tree $T$ if and only if $T$ has no congruent pairs.

Proof: Suppose a merge is available. Using the notation of Definition 5.1, the nodes in set $C$ have the same parent, hence the same path to the root, and hence they are congruent. Now suppose two choice nodes $n_1$ and $n_2$ are congruent. Let $m$ be the lub of $n_1$ and $n_2$. Node $m$ cannot be a choice node because the paths must include children of $m$ with the same label and $m$ has only one such child by Definition 2.1(5). Thus $m$ is an action node, and the children of $m$ on the paths to $n_1$ and $n_2$ are choice nodes with the same label and can be merged.

Theorem 5.5 Let $S$ be the skeleton for $p$.
Suppose that, for any two choice nodes $n_1$ and $n_2$ of $S$ having the same label, either $n_1$ is congruent to $n_2$ or $n_1$ and $n_2$ are distinguishable. Then performing all available merge operations on $S$ transforms $S$ into a plan basis.

Proof: We need to show that the merge operation preserves the following property: $T$ does not contain two nodes with the same label which are neither congruent nor distinguishable. Then, when all congruences have been removed with merge operations, what remains satisfies both conditions of Definition 3.3. This property follows easily from the fact that merge operations do not change path sequences.

Corollary 5.6 If skeleton $S$ satisfies the condition of Theorem 5.5, then player $p$ has a sufficient set of mixed strategies described by a plan basis smaller than the game tree.

Proof: The plan basis obtained in Theorem 5.5 is smaller than $S$. By Theorem 5.5, the result is a tree satisfying Definition 3.3. Theorem 3.8 implies the result.

A player $p$ is said to have perfect recall [2] if all nodes of the skeleton belonging to the same information set are congruent (i.e., $p$ always remembers his previous actions). Therefore we know:

Corollary 5.7 A player with perfect recall always has a plan basis with one node per information set.

Dalkey [1] has defined an inflation and a complete inflation. Without repeating these definitions, we state:

Corollary 5.8 If, for an extensive form game, the complete inflation for player $p$ gives $p$ perfect recall, then a plan basis for $p$ can be obtained from the skeleton for $p$ through a series of merges.

6 The Insert Operation

If two choice nodes with the same label $u$ are not distinguishable, we know from Section 3 that their lub is an action node $n$. The insert operation defined below addresses non-distinguishability problems by inserting a choice node labeled $u$ at $n$ and removing the conflicting choice nodes. In effect, the conflicting nodes are consolidated into the new node, and the decision as to what action to take is made earlier. By repeated application of this operation, all instances of non-distinguishability can be eliminated and Condition 1 of Definition 3.3 satisfied.

Definition 6.1 Given an AC-tree $T$, an action node $n$ of $T$, an information set $u$ which does not label a node above $n$, and a sub-set $C$ of the children of $n$, an insert operation is as follows:
1. Create a new choice node $m$ labeled $u$ having $n$ as its parent, and create a child $m_a$ of $m$ for each action $a$ in $A(u)$.
2. For each node $c$ in $C$ and each $m_a$:
(a) Make a copy $T_a$ of the sub-tree with root $c$.
(b) Make $m_a$ the parent of the root of $T_a$.
(c) For each choice node $h$ labeled $u$ in $T_a$, let $h_a$ be the child of $h$ labeled $a$. Remove $h$ and all its descendants except for the children of $h_a$ and their descendants.
(d) Make the parent of $h$ be the parent of the children of $h_a$.
3. For each $c \in C$, remove the original sub-tree with root $c$.

Note that, in Step 2d, the children of $h_a$ are choice nodes and their new parent, the parent of $h$, is an action node. Thus the result is another AC-tree.

Proposition 6.2 The merge operation is a special case of the insert operation.
Proof: When $c$ is labeled with $u$, the $h$ in Step 2c becomes $c$, $h_a$ becomes $c_a$, and the only portion of $T_a$ remaining is the sub-trees descending from the children of $c_a$. The effect is therefore the same as Step 2 of a merge operation.

Since our overall plan is to transform the skeleton with operations that preserve Condition 2, we need to verify this for the insert operation:

Theorem 6.3 The insert operation preserves Condition 2 of Definition 3.3.

Proof: Suppose that AC-tree $T$ satisfies Condition 2 and is transformed into $T'$ as the result of an insert operation. Let $\pi$ be a pure strategy for $p$, let $m_1$ be an action node in $S_\pi$, and let $n_1$ be the action node in $T_\pi$ such that $\alpha_S(m_1) \subseteq \alpha_T(n_1)$. We want to find an action node $n_2$ in $T'_\pi$ such that $\alpha_S(m_1) \subseteq \alpha_{T'}(n_2)$. Let $m$, $u$, $n$, and $m_a$ be as in Step 1 of Definition 6.1.

First, suppose that $n_1$ is not below $n$. Then $n_1$ is also in $T'$ and we can let $n_2 = n_1$.

Now suppose that $n_1$ is below $n$. Let choice node $c$ be the child of $n$ such that $n_1$ is in the sub-tree with root $c$. Since $n$ and $c$ are on the path in $T$ to $n_1$, they are all in $T_\pi$. The new node $m$ created in Step 1, being a child of $n$, is in $T'_\pi$, and thus so is $m_{\pi(u)}$. The copy of $c$ with parent $m_{\pi(u)}$ is also in $T'_\pi$, as are the copies of the nodes from $c$ to $n_1$. We have two cases to consider. In the case that action node $n_1$ is not labeled with $\pi(u)$, a copy $n_1'$ of $n_1$ is also in $T'_\pi$. Since action node $m_{\pi(u)}$ has been inserted in the path to $n_1'$, $\alpha_{T'}(n_1') = \alpha_T(n_1) \cup \{\pi(u)\}$, so we can let $n_2 = n_1'$. In the case that $n_1$ is labeled with $\pi(u)$, $n_1$ is the child of a choice node $h$ labeled $u$ as described in Step 2c. Let $n_2$ be the action node which is the parent of $h$. Since $\pi(u)$ labels $m_{\pi(u)}$ in the path to $n_2$, $\alpha_{T'}(n_2) = \alpha_T(n_1)$ and the result holds.

The unfortunate feature of the insert operation is that the size of the AC-tree usually grows, because Step 2 can cause multiple copies of a sub-tree to be included in the transformed tree. Although one insert can do no more than multiply the size of an AC-tree by the size of an action set, repeated application can increase the tree size exponentially.

7 Realization DAGs

The insert operation involves making copies of sub-trees. The probabilities in a path probability function can differ from copy to copy. In this section, we show that certain sub-trees can be combined and share a single set of probabilities.
Combining sub-trees makes a plan basis into a directed acyclic graph or DAG. For this plan to work, shared sub-trees need to be related to the skeleton in a similar way. To describe this relationship, we need the following definition:

Definition 7.1 Let $S$ be the skeleton for player $p$ and $B$ a plan basis for $p$. An action node $m$ of $S$ is called compatible with action node $n$ of $B$ if $\alpha_B(n) \supseteq \alpha_S(m)$.

Now we can say when two sub-trees are similar enough to be shared.

Definition 7.2 Let $S$ be the skeleton for player $p$ and $B$ a plan basis for $p$. Let $T_1$ and $T_2$ be sub-trees of $B$. A one-to-one correspondence between nodes of $T_1$ and $T_2$ is called a strong isomorphism if corresponding nodes have the same label, have corresponding parents and children, and corresponding action nodes are compatible with the same action nodes of $S$.

Our construction will combine nodes equivalent under the following relation:

Definition 7.3 Let $S$ be the skeleton for player $p$, let $B$ be a plan basis for $p$, and let $n_1$ and $n_2$ be nodes of $B$. We write $n_1 \sim n_2$ if and only if $n_1$ and $n_2$ are roots of strongly isomorphic sub-trees.

Obviously, $\sim$ is an equivalence relation. The next lemma ensures that, if two nodes are to be combined because of this equivalence relation, their children can be combined also.

Lemma 7.4 Let $n_1$ and $n_2$ be choice nodes in a plan basis $B$ for player $p$. Suppose that $n_1 \sim n_2$ and that the child of $n_1$ labeled $a$ has a child $m_1$. Then there is exactly one node $m_2$ such that $m_1 \sim m_2$ and $m_2$ is a child of the child of $n_2$ labeled $a$.

Proof: Since $n_1$ and $n_2$ are roots of strongly isomorphic sub-trees $T_1$ and $T_2$, there is an $m_2$ in $T_2$ corresponding to $m_1$ and satisfying $m_1 \sim m_2$. In $T_2$, $m_2$ is a child of the child of $n_2$ labeled $a$. There cannot be two such $m_2$ because they would not be distinguishable, a violation of Condition 1 of Definition 3.3.

We now describe a DAG obtained from a plan basis by combining equivalent nodes. The combined nodes are, in effect, nodes of a shared sub-tree.

Definition 7.5 Given a plan basis $B$ for player $p$, let $\mathcal{E}$ be the set of equivalence classes on the choice nodes of $B$ under the equivalence relation of Definition 7.3. Let $D$ be the DAG constructed as follows:
1. Create a choice node for each class $E$ in $\mathcal{E}$ and label the created node with the information set $u$ which labels the nodes in $E$.
2. Create a root and, for each created choice node labeled $u$, create a child action node for each action in $A(u)$.
3. Make the child labeled $a$ of choice node $E_1$ a parent of choice node $E_2$ if and only if $E_1$ contains a node $n_1$ and $E_2$ contains a node $n_2$ such that, in $B$, the child labeled $a$ of $n_1$ is the parent of $n_2$.
4. Make the root of $D$ a parent of choice node $E$ if the root of $B$ is the parent of a node $n$ in $E$.
We call $D$ a DAG basis.

The DAG works as a realization plan because the paths in $B$ and the paths in $D$ have the following strong connection:

Lemma 7.6 Let $B$ and $D$ be as in Definition 7.5. There exists a path $r\,n_1\,a_1 \cdots n_k$ to choice node $n_k$ in $B$ if and only if $r\,E_1\,a_1 \cdots E_k$ is a path to $E_k$ in $D$, where $n_i \in E_i$ for $1 \le i \le k$, $r$ represents a root, and $a_i$ ($1 \le i < k$) represents the child of $n_i$ in $B$ or of $E_i$ in $D$ labeled $a_i$. Furthermore, there is exactly one such path in $B$ for any given path in $D$. The same is true for paths ending in action nodes.

Proof: Given the path in $D$, $E_1$ by construction has a node $n_1$ which is a child of the root.
There cannot be more than one such $n_1$ because a second such node would not be distinguishable from the first, as required by the definition of $B$. Given node $n_i$, node $n_{i+1}$ is the unique node provided by Lemma 7.4. Given the path in $B$, there are corresponding $E_i$ by construction (Step 3). For paths ending in action nodes, the paths to the parents satisfy the lemma, and the step to the final action node is unique because a choice node has exactly one child for each action symbol.

Among other things, this lemma confirms that $D$ is indeed a DAG:

Corollary 7.7 Let $B$ and $D$ be as in Definition 7.5. Then for all choice nodes $E$ of $D$, the path to $E$ does not contain another node with the same label as $E$. In particular, $D$ is a DAG.

Proof: The first time a path in $D$ reaches a second node in $D$ with the same label, the same would be true for the corresponding path in $B$. But these two nodes in $B$ would be indistinguishable, contrary to the definition of $B$. The second node cannot be a reoccurrence of the first, for then $B$ would not be a tree. Thus $D$ is acyclic.

The next lemma shows that the ideas from Definition 3.5 and Lemma 3.7 apply to $D$.
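Looking back at Definition 7.5, the four construction steps can be sketched in code. The sketch below is illustrative only and rests on several assumptions that are not part of the paper: the equivalence classes of Definition 7.3 are taken as given (computing them needs the skeleton), action nodes are represented as pairs (choice node, action) so that Step 2 is implicit, and the names `build_dag_basis`, `label`, `parent`, and `cls` are hypothetical.

```python
ROOT = "root"  # stands for the root action node of both B and D


def build_dag_basis(label, parent, cls):
    """Quotient a plan basis B by given equivalence classes (Definition 7.5).

    label[n]  -- information set labelling choice node n of B
    parent[n] -- parent action node of n: ROOT or a pair (choice node, action)
    cls[n]    -- identifier of the class containing n (assumed precomputed)
    Returns (dag_label, dag_parents) describing the choice nodes of D.
    """
    dag_label, dag_parents = {}, {}
    for n, E in cls.items():
        dag_label[E] = label[n]  # Step 1: one choice node per class
        p = parent[n]
        # Steps 3 and 4: lift the edge into n to an edge into its class.
        lifted = ROOT if p == ROOT else (cls[p[0]], p[1])
        dag_parents.setdefault(E, set()).add(lifted)
    return dag_label, dag_parents


# Hypothetical plan basis: U at the root, W1 below (U, L), W2 below (U, R).
label = {"U": "u", "W1": "w", "W2": "w"}
parent = {"U": ROOT, "W1": ("U", "L"), "W2": ("U", "R")}
cls = {"U": "E_U", "W1": "E_W", "W2": "E_W"}  # W1, W2 assumed equivalent

dag_label, dag_parents = build_dag_basis(label, parent, cls)
```

In this toy example the two copies W1 and W2 collapse into a single choice node with two parents, which is the sharing that turns the tree into a DAG.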

Lemma 7.8 Let $B$ and $D$ be as in Definition 7.5 and let $D'$ be a sub-DAG of $D$ such that:
1. the root of $D$ is in $D'$,
2. if $n$ is an action node of $D'$, then all children of $n$ in $D$ are in $D'$,
3. if $n$ is a choice node of $D'$, then exactly one child of $n$ in $D$ is in $D'$.
Then $D'$ is a tree and the labels on the action nodes of $D'$ describe a reduced strategy for $p$. We call $D'$ a decision sub-tree.

Proof: Each node in $D'$ is reached by a path in $D'$, and that path has a unique corresponding path in $B$ by Lemma 7.6. The corresponding paths establish a mapping $h$ of nodes in $D'$ to nodes in $B$ such that, if $n_1 n_2$ is an edge in $D'$, then $h(n_1) h(n_2)$ is an edge in $B$. Thus $h$ maps $D'$ to a sub-tree $B'$ of $B$, and the set of action node labels in $D'$ is the same as the set of action node labels in $B'$. We want to show that $B'$ satisfies the three conditions of Definition 3.5 and that therefore $B'$ is a decision sub-tree of $B$.

First observe that the root of $D$ maps to the root of $B$, so that Part 1 of the lemma implies Condition 1 of Definition 3.5. Now let $m$ be an action node in $B'$, let $n$ be a node in $D'$ such that $h(n) = m$, and let choice node $m'$ be any child of $m$ in $B$. The path to $m$ and then to $m'$ in $B$ corresponds to a path to $n$ and then to some $n'$ in $D$ such that $m' \in n'$. But $n'$ must be in $D'$ by Part 2, so $m'$ is in $B'$ and Condition 2 is satisfied. Thirdly, suppose $m$ in $D'$ is a choice node and that $n = h(m)$. By Part 3, there is exactly one child $m'$ of $m$ in $D'$. This child corresponds to the one child $n'$ of $n$ with the same action symbol, and so there is one node in $B'$ which is a child of $n$. Thus Condition 3 is satisfied. Because $B'$ is a decision sub-tree, Lemma 3.7 says $B'$ describes a reduced strategy and therefore so does $D'$.

Now for the analogy to Definition 2.3:

Theorem 7.9 Let $D$ be as in Definition 7.5. Given a pure strategy $\pi$ for player $p$, let $D_\pi$ be the sub-graph of $D$ defined inductively by
1. an action node $n$ of $D$ is in $D_\pi$ if and only if (a) $n$ is the root of $D$ or (b) the parent $m$ of $n$ is in $D_\pi$ and $n$ is labeled with $\pi(u(m))$;
2. a choice node $n$ of $D$ is in $D_\pi$ if and only if some parent of $n$ is in $D_\pi$.
Then $D_\pi$ is a tree. We call $D_\pi$ the sub-tree induced by $\pi$.

Proof: Sub-graph $D_\pi$ satisfies the conditions of Lemma 7.8.

Definition 2.4 now extends directly:

Definition 7.10 Let $D$ be as in Definition 7.5. Given a mixed strategy $\mu$, let $f_\mu$ map each action node $n$ of $D$ to the probability that a pure strategy drawn from $\mu$ includes $n$ in its induced sub-tree. We say that $f_\mu$ is induced on $D$ by $\mu$.
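As an illustration of Theorem 7.9 and Definition 7.10, the following sketch computes the sub-tree induced by a pure strategy and the map induced by a mixed strategy on a small DAG. It is a minimal sketch under stated assumptions: the DAG encoding (choice nodes with lists of parent action nodes, action nodes as pairs), the example DAG, and the function names `induced_subtree` and `induced_map` are all hypothetical and not taken from the paper.

```python
ROOT = "root"  # the root action node of D


def induced_subtree(label, parents, pi):
    """Return (action_nodes, choice_nodes) of the sub-tree induced by pi.

    label[E]   -- information set labelling choice node E
    parents[E] -- parent action nodes of E: ROOT or pairs (choice node, action)
    pi[u]      -- action the pure strategy picks at information set u
    """
    action_nodes, choice_nodes = {ROOT}, set()
    changed = True
    while changed:  # iterate the inductive definition to a fixed point
        changed = False
        for E, ps in parents.items():
            if E not in choice_nodes and any(p in action_nodes for p in ps):
                choice_nodes.add(E)                   # rule 2
                action_nodes.add((E, pi[label[E]]))   # rule 1(b)
                changed = True
    return action_nodes, choice_nodes


def induced_map(label, parents, mu):
    """f_mu of Definition 7.10; mu is a list of (probability, pure strategy)."""
    f = {}
    for prob, pi in mu:
        for n in induced_subtree(label, parents, pi)[0]:
            f[n] = f.get(n, 0.0) + prob
    return f


# Hypothetical DAG: choice node U below the root; W shared below (U,L), (U,R).
label = {"U": "u", "W": "w"}
parents = {"U": [ROOT], "W": [("U", "L"), ("U", "R")]}
mu = [(0.5, {"u": "L", "w": "a"}),
      (0.25, {"u": "R", "w": "a"}),
      (0.25, {"u": "R", "w": "b"})]

f = induced_map(label, parents, mu)
# f[("W", "a")] is 0.75: two of the three pure strategies choose a at w.
```

Because W is shared, the value at (W, a) aggregates strategies that reach W through different parents, which is why a single set of probabilities suffices for the shared sub-tree.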

Next we have the analogy of Definition 2.5. We now use the more descriptive term action probability map rather than path probability map because action nodes may now be reached by several paths. Action nodes in $D$ have only one parent by construction, but there can be many paths to the parent.

Definition 7.11 Let $D$ be as in Definition 7.5. A mapping $f$ of the action nodes of $D$ into the nonnegative real numbers is called an action probability map for $D$ if and only if
1. for the root node $r$, $f(r) = 1$;
2. for all choice nodes $m$, $\sum_{c \in C} f(c) = \sum_{d \in D} f(d)$, where $C$ is the set of children of $m$ and, in this sum, $D$ denotes the set of parents of $m$.

As in Theorem 2.6, mixed strategies impose probability maps.

Theorem 7.12 The function $f_\mu$ from Definition 7.10 is an action probability map.

Proof: The root is always part of an induced sub-tree, and thus Condition 1 of Definition 7.11 is satisfied. If $\pi$ is a pure strategy in $\mu$ and $n$ is a choice node in the sub-tree induced by $\pi$, exactly one child of $n$ belongs to the sub-tree, namely the child labeled $\pi(u(n))$. Therefore, for each $\pi$, the number of edges of the induced sub-tree entering a choice node is equal to the number leaving. This implies Condition 2.

Proposition 7.13 Let $B$ and $D$ be as in Definition 7.5 and let $\mu$ be a mixed strategy. Let $b_\mu$ be the path probability function $\mu$ induces on $B$ and let $d_\mu$ be the action probability map $\mu$ induces on $D$. For all $E \in \mathcal{E}$ and $a \in A(u(E))$, let $E_a$ be the child of $E$ labeled $a$. For all $n \in E$, let $n_a$ be the child of $n$ labeled $a$. Then for all $E \in \mathcal{E}$ and $a \in A(u(E))$, $d_\mu(E_a) = \sum_{n \in E} b_\mu(n_a)$.

Theorem 7.14 Let $D$ be as in Definition 7.5. Given an action probability map $f$ for $D$, a decision sub-tree and corresponding reduced pure strategy can be selected at random as follows:
1. select the root of $D$;
2. if action node $n$ is selected, select all children of $n$;
3. if choice node $m$ is selected, select one child $n$ of $m$ using the probabilities $f(n) / \sum_{d \in D} f(d)$, where $D$ is the set of parents of $m$.
Furthermore, the mixed strategy so described induces the action probability map $f$ on $D$.

Proof: Let $m$ be a choice node and $C$ be the set of children of $m$. Definition 7.11(2) ensures that $\sum_{n \in C} f(n) / \sum_{d \in D} f(d)$ is equal to one, so the selection method is indeed a probabilistic procedure. Lemma 7.8 ensures that the sub-tree so selected does correspond to a reduced pure strategy. To prove the final claim, observe that the root is selected with probability one since it is always selected. Working inductively down the tree, we know that any pure strategy can only reach a choice node $m$ one way (Theorem 7.9), so the probability of reaching $m$ is the sum of the probabilities of reaching its parents, that is, $\sum_{d \in D} f(d)$ where $D$ is the set of parents of $m$. Thus the probability of selecting a child action node $n$ of $m$ is $\sum_{d \in D} f(d) \cdot f(n) / \sum_{d \in D} f(d) = f(n)$.

The next goal is to show that, given a mixed strategy $\mu$ and the resulting action probability map $f_\mu$, we can compute, in a linear fashion, the probabilities $\mu$ imposes on the skeleton.
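Before turning to that goal, the selection procedure of Theorem 7.14 can be sketched as follows. This is an illustration under assumptions, not the paper's implementation: the DAG encoding, the example values of `f`, and the name `sample_decision_subtree` are hypothetical. The sketch weights the children of a choice node by $f$ directly; by Definition 7.11(2) the children's values sum to the parents' total, so normalizing these weights gives the probabilities stated in Step 3.

```python
import random

ROOT = "root"  # the root action node of D


def sample_decision_subtree(label, actions, parents, f, rng):
    """Sample a decision sub-tree; returns {choice node: chosen action}.

    label[E]   -- information set labelling choice node E
    actions[u] -- list of actions available at information set u
    parents[E] -- parent action nodes of E: ROOT or pairs (choice node, action)
    f          -- action probability map on action nodes (Definition 7.11)
    rng        -- a random.Random instance
    """
    children = {}  # action node -> child choice nodes
    for E, ps in parents.items():
        for p in ps:
            children.setdefault(p, []).append(E)
    chosen = {}
    frontier = [ROOT]  # Step 1: select the root
    while frontier:
        n = frontier.pop()
        for E in children.get(n, []):  # Step 2: all children of n
            if E in chosen:
                continue
            acts = actions[label[E]]
            weights = [f[(E, a)] for a in acts]
            a = rng.choices(acts, weights=weights)[0]  # Step 3
            chosen[E] = a
            frontier.append((E, a))
    return chosen


# Hypothetical DAG: choice node U below the root; W shared below (U,L), (U,R).
label = {"U": "u", "W": "w"}
actions = {"u": ["L", "R"], "w": ["a", "b"]}
parents = {"U": [ROOT], "W": [("U", "L"), ("U", "R")]}
f = {ROOT: 1.0, ("U", "L"): 0.5, ("U", "R"): 0.5,
     ("W", "a"): 0.75, ("W", "b"): 0.25}

sample = sample_decision_subtree(label, actions, parents, f, random.Random(0))

# A degenerate map puts all weight on one action per choice node.
f_pure = {ROOT: 1.0, ("U", "L"): 1.0, ("U", "R"): 0.0,
          ("W", "a"): 0.0, ("W", "b"): 1.0}
pure = sample_decision_subtree(label, actions, parents, f_pure, random.Random(0))
# pure == {"U": "L", "W": "b"}
```

Each sampled dictionary lists one action per reached choice node, which is the reduced pure strategy described by the selected decision sub-tree.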

Definition 7.15 Let $B$ and $D$ be as in Definition 7.5. For all action nodes $m$ in the skeleton $S$, define $A_D(m)$ to be the set of action nodes $E_a$ in $D$ such that $n \in A(m)$ for some $n \in E_a$.

Theorem 7.16 Let $m$ be an action node in $S$ and let $n_1$ and $n_2$ be action nodes of $B$. If $n_1 \in A(m)$ and $n_1 \sim n_2$, then $n_2 \in A(m)$.

Proof: $n_1 \in A(m)$ implies $\alpha_B(n_1) \supseteq \alpha_S(m)$ by definition of $A$. This in turn implies $\alpha_B(n_2) \supseteq \alpha_S(m)$ by definition of $\sim$. Because the label on $n_1$ is on the path to $m$, so is the label on $n_2$, since $n_1$ and $n_2$ have the same label.

Theorem 7.17 Let $S$ be the skeleton, $B$ a plan basis, and $D$ the DAG basis for $p$. For mixed strategy $\mu$, let $f_\mu$ be the action probability map induced on $D$ by $\mu$ and $g_\mu$ be the path probability function induced on $S$. Then for all action nodes $m$ of $S$, $g_\mu(m) = \sum_{n \in A_D(m)} f_\mu(n)$.

Proof: Because of Theorem 7.16, $A(m)$ is a union of equivalence classes, namely those that make up $A_D(m)$, and the result follows from Theorem 4.3.

Corollary 7.18 Given a DAG basis $D$ for player $p$, two mixed strategies for $p$ are equivalent if they induce the same action probability map on $D$.

Proof: If they induce the same $f$ on the DAG basis, Theorem 7.17 says they induce the same $g$ on the skeleton.

Corollary 7.19 Given a mixed strategy $\mu$ and a DAG basis $D$ for $p$, let $f_\mu$ be the action probability map induced by $\mu$ on $D$. Then the mixed strategy described in Theorem 7.14 is equivalent to $\mu$.

Proof: Theorem 7.14 says $f_\mu$ is also induced by the described mixed strategy, which must therefore be equivalent to $\mu$ by Corollary 7.18.

8 Complexity Classes

In this section, we start to address the following question: For what classes of extensive form games can we guarantee that the number of nodes in $B$ or $D$ is polynomial in the size of $S$? For such classes, the number of parameters required to describe a sufficient set of mixed strategies may be considered small. We address this topic only briefly because this document must be submitted this week.
Definition 8.1 Information set $u$ is a problem at action node $n$ of $S$ if $n$ has two children $m_1$ and $m_2$ and some node labeled $u$ below $m_1$ cannot be distinguished from some node labeled $u$ below $m_2$. That is, Condition 2 of Theorem 3.4 is violated.

Definition 8.2 We say a player has $k$-bounded conflicts for integer $k$ if the sizes of the player's action sets are bounded by $k$ and the number of problems along any branch of the skeleton $S$ for $p$ is bounded by $k$.

Proposition 8.3 The number of parameters needed by player $p$ to describe a sufficient set of strategies is polynomial in the game size if, for fixed $k$, $p$ has $k$-bounded conflicts.

Proof: Using the insert operation, a realization plan can be constructed where the number of nodes is at most $k^k$ times the skeleton size. (Remember, $k^k$ for fixed $k$ is a constant.)

Definition 8.4 A problem for information set $u$ is said to be localized at node $n$ of the skeleton if it is a problem at some node above $n$ and some node below $n$ is labeled $u$. We say a player has $k$-bounded local conflicts for integer $k$ if the sizes of the player's action sets are bounded by $k$ and, for each node $n$ of the skeleton for $p$, the number of localized problems at $n$ is bounded by $k$.

Proposition 8.5 The number of parameters needed by player $p$ to describe a sufficient set of strategies is polynomial in the game size if, for fixed $k$, $p$ has $k$-bounded local conflicts.

Proof: This time, a realization DAG can be constructed where the number of nodes is at most $k^k$ times the skeleton size.

References

[1] N. Dalkey, Equivalence of Information Patterns and Essentially Indeterministic Games, Contributions to the Theory of Games II, H. W. Kuhn and A. W. Tucker, eds., Princeton University Press, 1953, pp. 217-243.
[2] H. W. Kuhn, Extensive Games and Partial Information, Contributions to the Theory of Games II, H. W. Kuhn and A. W. Tucker, eds., Princeton University Press, 1953, pp. 193-216.
[3] Daphne Koller and Nimrod Megiddo, The Complexity of Two-Person Zero-Sum Games in Extensive Form, Games and Economic Behavior 4:4, October 1992, pp. 528-552.
[4] Daphne Koller and Nimrod Megiddo, Finding Mixed Strategies with Small Supports in Extensive Form Games, International Journal of Game Theory 25:1, March 1996, pp. 73-92.
[5] Daphne Koller, Nimrod Megiddo, and Bernhard von Stengel, Efficient Computation of Equilibria for Extensive Two-Person Games, Games and Economic Behavior 14:2, June 1996, pp. 247-259.
[6] Bernhard von Stengel, Efficient Computation of Behavior Strategies, Games and Economic Behavior 14:2, June 1996, pp. 220-246.
[7] Bernhard von Stengel, Equilibrium Computation for Two-Player Games in Strategic and Extensive Form, Algorithmic Game Theory, Nisan, Roughgarden, Tardos, and Vazirani, eds., Cambridge University Press, 2007, pp. 53-78.