Generalizing Plans to New Environments in Relational MDPs


In International Joint Conference on Artificial Intelligence (IJCAI-03), Acapulco, Mexico, August 2003.

Carlos Guestrin  Daphne Koller  Chris Gearhart  Neal Kanodia
Computer Science Department, Stanford University
{guestrin, koller, cg33,

Abstract

A longstanding goal in planning research is the ability to generalize plans developed for some set of environments to a new but similar environment, with minimal or no replanning. Such generalization can both reduce planning time and allow us to tackle larger domains than the ones tractable for direct planning. In this paper, we present an approach to the generalization problem based on a new framework of relational Markov Decision Processes (RMDPs). An RMDP can model a set of similar environments by representing objects as instances of different classes. In order to generalize plans to multiple environments, we define an approximate value function specified in terms of classes of objects and, in a multiagent setting, by classes of agents. This class-based approximate value function is optimized relative to a sampled subset of environments, and computed using an efficient linear programming method. We prove that a polynomial number of sampled environments suffices to achieve performance close to the performance achievable when optimizing over the entire space. Our experimental results show that our method generalizes plans successfully to new, significantly larger, environments, with minimal loss of performance relative to environment-specific planning. We demonstrate our approach on a real strategic computer war game.

1 Introduction

Most planning methods optimize the plan of an agent in a fixed environment. However, in many real-world settings, an agent will face multiple environments over its lifetime, and its experience with one environment should help it to perform well in another, even with minimal or no replanning.
Consider, for example, an agent designed to play a strategic computer war game, such as the Freecraft game shown in Fig. 1 (an open source version of the popular Warcraft game). In this game, the agent is faced with many scenarios. In each scenario, it must control a set of agents (or units) with different skills in order to defeat an opponent. Most scenarios share the same basic elements: resources, such as gold and wood; units, such as peasants, who collect resources and build structures, and footmen, who fight with enemy units; and structures, such as barracks, which are used to train footmen. Each scenario is composed of these same basic building blocks, but they differ in terms of the map layout, types of units available, amounts of resources, etc. We would like the agent to learn from its experience with playing some scenarios, enabling it to tackle new scenarios without significant amounts of replanning. In particular, we would like the agent to generalize from simple scenarios, allowing it to deal with other scenarios that are too complex for any effective planner.

Figure 1: Freecraft strategic domain with 9 peasants, a barrack, a castle, a forest, a gold mine, 3 footmen, and an enemy, executing the generalized policy computed by our algorithm.

The idea of generalization has been a longstanding goal in Markov Decision Process (MDP) and reinforcement learning research [15; 16], and even earlier in traditional planning [5]. This problem is a challenging one, because it is often unclear how to translate the solution obtained for one domain to another. MDP solutions assign values and/or actions to states. Two different MDPs (e.g., two Freecraft scenarios) are typically quite different, in that they have a different set (and even number) of states and actions. In cases such as this, the mapping of one solution to another is not well-defined. Our approach is based on the insight that many domains can be described in terms of objects and the relations between them.
A particular domain will involve multiple objects from several classes. Different tasks in the same domain will typically involve different sets of objects, related to each other in different ways. For example, in Freecraft, different tasks might involve different numbers of peasants, footmen, enemies, etc. We therefore define a notion of a relational MDP (RMDP), based on the probabilistic relational model (PRM) framework [10]. An RMDP for a particular domain provides a general schema for an entire suite of environments, or worlds, in that domain. It specifies a set of classes, and how the dynamics and rewards of an object in a given class depend on the state of that object and of related objects.

We use the class structure of the RMDP to define a value function that can be generalized from one domain to another. We begin with the assumption that the value function can be well-approximated as a sum of value subfunctions for the different objects in the domain. Thus, the value of a global Freecraft state is approximated as a sum of terms corresponding to the state of individual peasants, footmen, gold, etc. We then assume that individual objects in the same class have a very similar value function. Thus, we define the notion of a class-based value function, where each class is associated with a class subfunction. All objects in the same class have the value subfunction of their class, and the overall value function for a particular environment is the sum of value subfunctions for the individual objects in the domain. A set of value subfunctions for the different classes immediately determines a value function for any new environment in the domain, and can be used for acting. Thus, we can compute a set of class subfunctions based on a subset of environments, and apply them to another one without replanning.

We provide an optimality criterion for evaluating a class-based value function for a distribution over environments, and show how it can, in principle, be optimized using a linear program. We can also learn a value function by optimizing it relative to a sample of environments encountered by the agent. We prove that a polynomial number of sampled environments suffices to construct a class-based value function which is close to the one obtainable for the entire distribution over environments. Finally, we show how we can improve the quality of our approximation by automatically discovering subclasses of objects that have similar value functions.

We present experiments for a computer systems administration task and two Freecraft tasks. Our results show that we can successfully generalize class-based value functions. Importantly, our approach also obtains effective policies for problems significantly larger than our planning algorithm could handle otherwise.

2 Relational Markov Decision Processes

A relational MDP defines the system dynamics and rewards at the level of a template for a task domain. Given a particular environment within that domain, it defines a specific MDP instantiated for that environment. As in the PRM framework of [10], the domain is defined via a schema, which specifies a set of object classes C = {C_1, ..., C_c}. Each class C is also associated with a set of state variables S[C] = {C.S_1, ..., C.S_k}, which describe the state of an object in that class. Each state variable C.S has a domain of possible values Dom[C.S]. We define S_C to be the set of possible states for an object in C, i.e., the possible assignments to the state variables of C.
For example, our Freecraft domain might have classes such as Peasant, Footman, Gold; the class Peasant may have a state variable Task whose domain is Dom[Peasant.Task] = {Waiting, Mining, Harvesting, Building}, and a state variable Health whose domain has three values. In this case, S_Peasant would have 4 × 3 = 12 values, one for each combination of values for Task and Health.

The schema also specifies a set of links L[C] = {L_1, ..., L_l} for each class, representing links between objects in the domain. Each link C.L has a range ρ[C.L] = C′. For example, Peasant objects might be linked to Barrack objects (ρ[Peasant.BuildTarget] = Barrack), and to the global Gold and Wood resource objects. In a more complex situation, a link may relate C to many instances of a class C′, which we denote by ρ[C.L] = {C′}; for example, ρ[Enemy.My_Footmen] = {Footman} indicates that an instance of the enemy class may be related to many footman instances.

A particular instance of the schema is defined via a world ω, specifying the set of objects of each class; we use O[ω][C] to denote the objects in class C, and O[ω] to denote the total set of objects in ω. The world ω also specifies the links between objects, which we take to be fixed throughout time. Thus, for each link C.L, and for each o ∈ O[ω][C], ω specifies a set of objects o′ ∈ ρ[C.L], denoted o.L. For example, in a world containing two peasants, we would have O[ω][Peasant] = {Peasant1, Peasant2}; if Peasant1 is building a barracks, we would have that Peasant1.BuildTarget = Barrack1.

The dynamics and rewards of an RMDP are also defined at the schema level. For each class, the schema specifies an action C.A, which can take on one of several values Dom[C.A]. For example, Dom[Peasant.A] = {Wait, Mine, Harvest, Build}.
Each class C is also associated with a transition model P_C, which specifies the probability distribution over the next state of an object o in class C, given the current state of o, the action taken on o, and the states and actions of all of the objects linked to o:

P_C(S_C′ | S_C, C.A, S_{C.L_1}, C.L_1.A, ..., S_{C.L_l}, C.L_l.A).    (1)

For example, the status of a barrack, Barrack.Status, depends on its status in the previous time step, on the task performed by any peasant that could build it (Barrack.BuiltBy.Task), on the amount of wood and gold, etc. The transition model is conditioned on the state of C.L_i, which is, in general, an entire set of objects (e.g., the set of peasants linked to a barrack). Thus we must now provide a compact specification of the transition model that can depend on the state of an unbounded number of variables. We can deal with this issue using the idea of aggregation [10]. In Freecraft, our model uses the count aggregator, where the probability that Barrack.Status transitions from Unbuilt to Built depends on #[Barrack.BuiltBy.Task = Build], the number of peasants in Barrack.BuiltBy whose Task is Build.

Finally, we also define rewards at the class level. We assume for simplicity that rewards are associated only with the states of individual objects; adding more global dependencies is possible, but complicates planning significantly. We define a reward function R_C(S_C, C.A), which represents the contribution to the reward of any object in C. For example, we may have a reward function associated with the Enemy class, which specifies a reward of 10 if the state of an enemy object is Dead: R_Enemy(Enemy.State = Dead) = 10. We assume that the reward for each object is bounded by R_max.

Given a world, the RMDP uniquely defines a ground factored MDP Π_ω, whose transition model is specified (as usual) as a dynamic Bayesian network (DBN) [3]. The random variables in this factored MDP are the state variables of the individual objects o.S, for each o ∈ O[ω][C] and for each S ∈ S[C].
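To make the schema and the count aggregator concrete, the following is a minimal sketch of how they might be represented in code. All class names, domains, and probabilities here are illustrative inventions, not the paper's implementation; in particular, `p_barrack_built` and its `base` success probability are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class ClassSchema:
    """Schema-level description of one object class: state variables
    with finite domains, plus links to other classes."""
    name: str
    state_vars: dict                               # variable name -> list of values
    links: dict = field(default_factory=dict)      # link name -> target class name

# Hypothetical Freecraft-style classes.
peasant = ClassSchema(
    "Peasant",
    state_vars={"Task": ["Waiting", "Mining", "Harvesting", "Building"],
                "Health": ["Healthy", "Wounded", "Dead"]},
    links={"BuildTarget": "Barrack"},
)
barrack = ClassSchema(
    "Barrack",
    state_vars={"Status": ["Unbuilt", "Built"]},
    links={"BuiltBy": "Peasant"},                  # set-valued in a ground world
)

def p_barrack_built(status, builder_tasks, base=0.2):
    """Count-aggregator transition template for Barrack.Status: the
    probability of becoming Built depends only on the aggregate
    #[Barrack.BuiltBy.Task = Building], not on which peasant builds."""
    if status == "Built":
        return 1.0                                 # stays built
    n = sum(1 for t in builder_tasks if t == "Building")
    return 1.0 - (1.0 - base) ** n                 # each builder succeeds independently

# The same template is shared by every barrack object; only the set of
# linked peasants (and hence the aggregated count) differs per object.
print(len(peasant.state_vars["Task"]) * len(peasant.state_vars["Health"]))  # 12, as in S_Peasant
print(p_barrack_built("Unbuilt", ["Building", "Building", "Mining"]))
```

The key design point mirrors the text: the transition template is defined once per class and conditions on an aggregate statistic of the linked objects, so it remains well-defined no matter how many peasants a given barrack is linked to.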
Thus, the state s of the system at a given point in time is a vector defining the states of the individual objects in the world. For any subset of variables X in the model, we define s[X] to be the part of the instantiation s that corresponds to the variables X.

The ground DBN for the transition dynamics specifies the dependence of the variables at time t + 1 on the variables at time t. The parents of a variable o.S are the state variables of the objects o′ that are linked to o. In our example with the two peasants, we might have the random variables Peasant1.Task, Peasant2.Task, Barrack1.Status, etc. The parents of the time t + 1 variable Barrack1.Status are the time t variables Barrack1.Status, Peasant1.Task, Peasant2.Task, Gold1.Amount and Wood1.Amount. The transition model is the same for all instances in the same class, as in (1). Thus, all of the o.Status variables for barrack objects o share the same conditional probability distribution. Note, however, that each specific barrack depends on the particular peasants linked to it. Thus, the actual parents in the DBN of the status variables for two different barrack objects can be different.

Figure 2: Freecraft tactical domain: (a) Schema; (b) Resulting factored MDP for a world with 2 footmen and 2 enemies.

The reward function is simply the sum of the reward functions for the individual objects:

R(s, a) = Σ_{C ∈ C} Σ_{o ∈ O[ω][C]} R_C(s[S_o], a[o.A]).

Thus, for the reward function for the Enemy class described above, our overall reward function in a given state will be 10 times the number of dead enemies in that state.

It remains to specify the actions in the ground MDP. The RMDP specifies a set of possible actions for every object in the world. In a setting where only a single action can be taken at any time step, the agent must choose both an object to act on, and which action to perform on that object. Here, the set of actions in the ground MDP is simply the union ∪_{o ∈ ω} Dom[o.A]. In a setting where multiple actions can be performed in parallel (say, in a multiagent setting), it might be possible to perform an action on every object in the domain at every step. Here, the set of actions in the ground MDP is a vector specifying an action for every object: ×_{o ∈ ω} Dom[o.A]. Intermediate cases, allowing degrees of parallelism, are also possible. For simplicity of presentation, we focus on the multiagent case, such as Freecraft, where an action is an assignment to the action of every unit.

Example 2.1 (Freecraft tactical domain) Consider a simplified version of Freecraft, whose schema is illustrated in Fig. 2(a), where only two classes of units participate in the game: C = {Footman, Enemy}.
Both the footman and the enemy classes have only one state variable each, Health, with domain Dom[Health] = {Healthy, Wounded, Dead}. The footman class contains one single-valued link: ρ[Footman.My_Enemy] = Enemy. Thus the transition model for a footman's health will depend on the health of its enemy: P_Footman(S_Footman′ | S_Footman, S_{Footman.My_Enemy}); i.e., if a footman's enemy is not dead, then the probability that the footman will become wounded, or die, is significantly higher. A footman can choose to attack any enemy. Thus each footman is associated with an action Footman.A which selects the enemy it is attacking.¹ As a consequence, an enemy could end up being linked to a set of footmen, ρ[Enemy.My_Footmen] = {Footman}. In this case, the transition model of the health of an enemy may depend on the number of footmen who are not dead and whose action choice is to attack this enemy: P_Enemy(S_Enemy′ | S_Enemy, #[S_{Enemy.My_Footmen}, Enemy.My_Footmen.A]). Finally, we must define the template for the reward function. Here there is only a reward when an enemy is dead: R_Enemy(S_Enemy).

We now have a template to describe any instance of the tactical Freecraft domain. In a particular world, we must define the instances of each class and the links between these instances. For example, a world with 2 footmen and 2 enemies will have 4 objects: {Footman1, Footman2, Enemy1, Enemy2}. Each footman will be linked to an enemy: Footman1.My_Enemy = Enemy1 and Footman2.My_Enemy = Enemy2. Each enemy will be linked to both footmen: Enemy1.My_Footmen = Enemy2.My_Footmen = {Footman1, Footman2}. The template, along with the number of objects and the links in this specific ("2 vs 2") world, yield a well-defined factored MDP, Π_2vs2, as shown in Fig. 2(b).

¹ A model where an action can change the link structure in the world requires a small extension of our basic representation. We omit details due to lack of space.

3 Approximately Solving Relational MDPs

There are many approaches to solving MDPs [15]. An effective one is based on linear programming (LP): Let S(Π) denote the states in an MDP Π and A(Π) the actions. If S(Π) = {s_1, ..., s_N}, our LP variables are V_1, ..., V_N, where V_i represents V(s_i), the value of state s_i. The LP formulation is:

Minimize: Σ_i α(s_i) V_i ;
Subject to: V_i ≥ R(s_i, a) + γ Σ_k P(s_k | s_i, a) V_k,  ∀ s_i ∈ S(Π), a ∈ A(Π).

The state relevance weights α(s_1), ..., α(s_N) in the objective function are any set of positive weights, α(s_i) > 0. In our setting, the state space is exponentially large, with one state for each joint assignment to the random variables o.S of every object (e.g., exponential in the number of units in the Freecraft scenario). In a multiagent problem, the number of actions is also exponential in the number of agents. Thus this LP has both an exponential number of variables and an exponential number of constraints. Therefore, solving this linear program exactly is infeasible.

We address this issue using the assumption that the value function can be well-approximated as a sum of local value subfunctions associated with the individual objects in the model. (This approximation is a special case of the factored linear value function approach used in [6].) Thus we associate a value subfunction V_o with every object in ω. Most simply, this local value function can depend only on the state of the individual object, S_o. In our example, the local value subfunction V_Enemy1 for enemy object Enemy1 might associate a numeric value with each assignment to the variable Enemy1.Health. A richer approximation might associate a value function with pairs, or even small subsets, of closely related objects. Thus, the local value subfunction V_Footman1 for Footman1 might be defined over the joint assignments of Footman1.Health and Enemy1.Health, where Footman1.My_Enemy = Enemy1.

We will represent the complete value function for a world as the sum of the local value subfunctions for each individual object in this world. In our example world ω = 2vs2 with 2 footmen and 2 enemies, the global value function will be:

V_2vs2(F1.Health, E1.Health, F2.Health, E2.Health) = V_Footman1(F1.Health, E1.Health) + V_Enemy1(E1.Health) + V_Footman2(F2.Health, E2.Health) + V_Enemy2(E2.Health).

Let T_o be the scope of the value subfunction of object o, i.e., the state variables that V_o depends on. Given the local subfunctions, we approximate the global value function as:

V_ω(s) = Σ_{o ∈ O[ω]} V_o(s[T_o]).    (2)

As for any linear approximation to the value function, the LP approach can be adapted to use this value function representation [14]. Our LP variables are now the local components of the individual local value functions:

{V_o(t_o) : o ∈ O[ω], t_o ∈ Dom[T_o]}.    (3)

In our example, there will be one LP variable for each joint assignment of F1.Health and E1.Health to represent the components of V_Footman1. Similar LP variables will be included for the components of V_Footman2, V_Enemy1, and V_Enemy2. As before, we have a constraint for each global state s and each global action a:

Σ_o V_o(s[T_o]) ≥ Σ_o R_o(s[S_o], a[o.A]) + γ Σ_{s′} P_ω(s′ | s, a) Σ_o V_o(s′[T_o]);  ∀ s, a.    (4)

This transformation has the effect of reducing the number of free variables in the LP to n (the number of objects) times the number of parameters required to describe an object's local value function. However, we still have a constraint for each global state and action, an exponentially large number. Guestrin, Koller and Parr [6] (GKP hereafter) show that, in certain cases, this exponentially large LP can be solved efficiently and exactly.
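The exact LP above always has the optimal value function V* as its solution: V* satisfies every constraint V_i ≥ R(s_i, a) + γ Σ_k P(s_k | s_i, a) V_k, with equality at the greedy action, so no feasible point can be smaller under positive weights α. The following dependency-free sketch illustrates this on a tiny two-state MDP with invented rewards and transition probabilities, computing V* by value iteration (rather than an LP solver, which the paper uses) and checking LP feasibility and tightness.

```python
import itertools

# Tiny hypothetical MDP: 2 states, 2 actions (all numbers invented).
# P[s][a] = list of (next_state, prob); R[s][a] = reward.
P = {0: {0: [(0, 0.9), (1, 0.1)], 1: [(1, 1.0)]},
     1: {0: [(1, 1.0)],           1: [(0, 0.5), (1, 0.5)]}}
R = {0: {0: 0.0, 1: 1.0},
     1: {0: 0.0, 1: 2.0}}
gamma = 0.9

def q(V, s, a):
    """One-step lookahead value R(s,a) + gamma * sum_k P(s_k|s,a) V_k."""
    return R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a])

# Value iteration to (near) convergence.
V = {0: 0.0, 1: 0.0}
for _ in range(2000):
    V = {s: max(q(V, s, a) for a in (0, 1)) for s in (0, 1)}

# V* is feasible for the LP constraints V(s) >= Q(V, s, a) for all s, a,
# and tight at the greedy action -- hence it is the LP optimum for any
# positive state relevance weights alpha.
for s, a in itertools.product((0, 1), (0, 1)):
    assert V[s] >= q(V, s, a) - 1e-6
for s in (0, 1):
    assert abs(V[s] - max(q(V, s, a) for a in (0, 1))) < 1e-6
print(V)
```

The approximate LP of (2)-(4) keeps exactly these constraints but restricts V to a sum of local subfunctions, which is what makes the number of free variables manageable.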
In particular, this compact solution applies when the MDP is factored (i.e., represented as a DBN), and the approximate value function is decomposed as a weighted linear combination of local basis functions, as above. Under these assumptions, GKP present a decomposition of the LP which grows exponentially only in the induced tree width of a graph determined by the complexity of the process dynamics and the locality of the basis functions. This approach applies very easily here. The structure of the DBN representing the process dynamics is highly factored, defined via local interactions between objects. Similarly, the value functions are local, involving only single objects or groups of closely related objects. Often, the induced width of the resulting graph in such problems is quite small, allowing the techniques of GKP to be applied efficiently.

4 Generalizing Value Functions

Although this approach provides us with a principled way of decomposing a high-dimensional value function in certain types of domains, it does not help us address the generalization problem: a local value function for objects in a world ω does not help us provide a value function for objects in other worlds, especially worlds with different sets of objects.

To obtain generalization, we build on the intuition that different objects in the same class behave similarly: they share the transition model and reward function. Although they differ in their interactions with other objects, their local contribution to the value function is often similar. For example, it may be reasonable to assume that different footmen have a similar long-term chance of killing enemies. Thus, we restrict our class of value functions by requiring that all of the objects in a given class share the same local value subfunction. Formally, we define a class-based local value subfunction V_C for each class. We assume that the parameterization of this value function is well-defined for every object o in C.
This assumption holds trivially if the scope of V_C is simply S_C: we simply have a parameter for each assignment in Dom[S_C]. When the local value function can also depend on the states of neighboring objects, we must define the parameterization accordingly; for example, we might have a parameter for each possible joint state of a linked footman-enemy pair. Specifically, rather than defining separate subfunctions V_Footman1 and V_Footman2, we define a class-based subfunction V_Footman. Now the contribution of Footman1 to the global value function will be V_Footman(F1.Health, E1.Health). Similarly, Footman2 will contribute V_Footman(F2.Health, E2.Health).

A class-based value function defines a specific value function for each world ω, as the sum of the class-based local value functions for the objects in ω:

V_ω(s) = Σ_{C ∈ C} Σ_{o ∈ O[ω][C]} V_C(s[T_o]).    (5)

This value function depends both on the set of objects in the world and (when local value functions can involve related objects) on the links between them. Importantly, although objects in the same class contribute the same function to the summation of (5), the argument of the function for an object is the state of that specific object (and perhaps its neighbors). In any given state, the contributions of different objects of the same class can differ. Thus, every footman has the same local value subfunction parameters, but a dead footman will have a lower contribution than one which is alive.

5 Finding Generalized MDP Solutions

With a class-level value function, we can easily generalize from one or more worlds to another one. To do so, we assume that a single set of local class-based value functions V_C is a good approximation across a wide range of worlds ω. Assuming we have such a set of value functions, we can act in any new world ω without replanning, as described in Step 3 of Fig. 3. We simply define a world-specific value function as in (5), and use it to act. We must now optimize V_C in a way that maximizes the value over an entire set of worlds.
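Evaluating Eq. (5) in a new world only requires the per-class tables and each object's scope, which is what makes acting without replanning possible. A minimal sketch follows; the table entries are invented for illustration, not learned parameters.

```python
# Sketch of the class-based value function of Eq. (5): one shared table
# per class, applied to each object's local scope T_o. Values invented.

# V_Footman has scope (own Health, linked enemy's Health);
# V_Enemy has scope (own Health).
V_footman = {(f, e): 10.0 * (f != "Dead") + 5.0 * (e != "Healthy")
             for f in ("Healthy", "Wounded", "Dead")
             for e in ("Healthy", "Wounded", "Dead")}
V_enemy = {"Healthy": 0.0, "Wounded": 4.0, "Dead": 10.0}

# A world lists its objects, their class, and the scope T_o of each;
# the links (which enemy a footman's subfunction reads) come from omega.
world_2vs2 = [
    ("Footman1", "Footman", ("Footman1", "Enemy1")),
    ("Footman2", "Footman", ("Footman2", "Enemy2")),
    ("Enemy1", "Enemy", ("Enemy1",)),
    ("Enemy2", "Enemy", ("Enemy2",)),
]

def value(world, state):
    """V_omega(s) = sum over classes C, objects o in C, of V_C(s[T_o])."""
    total = 0.0
    for _, cls, scope in world:
        key = tuple(state[x] for x in scope)
        total += V_footman[key] if cls == "Footman" else V_enemy[key[0]]
    return total

s = {"Footman1": "Healthy", "Footman2": "Dead",
     "Enemy1": "Wounded", "Enemy2": "Healthy"}
print(value(world_2vs2, s))   # 15.0 + 0.0 + 4.0 + 0.0 = 19.0
```

Note how the sketch reflects the point made above: both footmen share the same table, yet the dead Footman2 contributes 0 while Footman1 contributes 15, because each object's contribution is evaluated at its own local state.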
To formalize this intuition, we assume that there is a probability distribution P(ω) over the worlds that the agent encounters. We want to find a single set of class-based local value functions {V_C}_{C ∈ C} that is a good fit for this distribution over worlds. We view this task as one of optimizing for a single meta-level MDP Π, where nature first chooses a world ω, and the rest of the dynamics are then determined by the MDP Π_ω. Precisely, the state space of Π is {s_0} ∪ ⋃_ω {(ω, s) : s ∈ S(Π_ω)}. The transition model is the obvious one: from the initial state s_0, nature chooses a world ω according to P(ω), and an initial state in ω according to the initial starting distribution P⁰_ω(s) over the states in ω. The remaining evolution is then done according to ω's dynamics. In our example, nature will choose the number of footmen and enemies, and define the links between them, which then yields a well-defined MDP, e.g., Π_2vs2.

5.1 LP Formulation

The meta-MDP Π allows us to formalize the task of finding a generalized solution to an entire class of MDPs. Specifically, we wish to optimize the class-level parameters for V_C, not for a single ground MDP Π_ω, but for the entire Π. We can address this problem using a similar LP solution to the one we used for a single world in Sec. 3. The variables are simply the parameters of the local class-level value subfunctions {V_C(t_C) : C ∈ C, t_C ∈ Dom[T_C]}. For the constraints, recall that our object-based LP formulation in (4) had a constraint for each state s and each action vector a = {a_o}_{o ∈ O[ω]}. In the generalized solution, the state space is the union of the state spaces of all possible worlds. Our constraint set for Π will, therefore, be a union of constraint sets, one for each world ω, each with its own actions:

V_ω(s) ≥ Σ_o R_o(s[S_o], a_o) + γ Σ_{s′} P_ω(s′ | s, a) V_ω(s′);  ∀ω, ∀s ∈ S(Π_ω), ∀a ∈ A(Π_ω);    (6)

where the value function for a world, V_ω(s), is defined at the class level as in Eq. (5). In principle, we should have an additional constraint for the state s_0. However, with a natural choice of state relevance weights α, this constraint is eliminated and the objective function becomes:

Minimize: (1 + γ) Σ_ω Σ_{s ∈ S_ω} P(ω) P⁰_ω(s) V_ω(s);    (7)

provided that P⁰_ω(s) > 0 for all s. In some models, the potential number of objects may be infinite, which could make the objective function unbounded.
To prevent this problem, we assume that P(ω) goes to zero sufficiently fast as the number of objects tends to infinity. To understand this assumption, consider the following generative process for selecting worlds: first, the number of objects n is chosen according to P(n); then, the classes and links of each object are chosen according to P(ω | n). Using this decomposition, we have that P(ω) = P(n) P(ω | n). The intuitive assumption described above can be formalized as: ∀n, P(n) ≤ κ e^{−λn}, for some κ, λ > 0. Thus, the distribution P(n) over the number of objects can be chosen arbitrarily, as long as it is bounded by some exponentially decaying function.

5.2 Sampling worlds

The main problem with this formulation is that the size of the LP (the size of the objective and the number of constraints) grows with the number of worlds, which, in most situations, grows exponentially with the number of possible objects, or may even be infinite. A practical approach to address this problem is to sample some reasonable number of worlds from the distribution P(ω), and then to solve the LP for these worlds only. The resulting class-based value function can then be used for worlds that were not sampled.

We will start by sampling a set D of worlds according to P(ω). We can now define our LP in terms of the worlds in D, rather than all possible worlds. For each world ω in D, our LP will contain a set of constraints of the form presented in Eq. (4). Note that in all worlds these constraints share the variables V_C, which represent our class-based value function. The complete LP is given by:

Variables: {V_C(t_C) : C ∈ C, t_C ∈ Dom[T_C]}.
Minimize: (1 + γ) Σ_{ω ∈ D} Σ_{C ∈ C} Σ_{o ∈ O[ω][C]} Σ_{t_o ∈ Dom[T_o]} P⁰_ω(t_o) V_C(t_o).
Subject to: Σ_{C ∈ C} Σ_{o ∈ O[ω][C]} V_C(s[T_o]) ≥ Σ_{C ∈ C} Σ_{o ∈ O[ω][C]} R_C(s[S_o], a[o.A]) + γ Σ_{s′} P_ω(s′ | s, a) Σ_{C ∈ C} Σ_{o ∈ O[ω][C]} V_C(s′[T_o]);
∀ω ∈ D, ∀s ∈ S(Π_ω), ∀a ∈ A(Π_ω);    (8)

where P⁰_ω(t_o) is the marginalization of P⁰_ω to the variables in T_o. For each world, the constraints have the same form as the ones in Sec. 3.
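The generative process P(ω) = P(n) P(ω | n) can be sketched directly. Below, a geometric distribution plays the role of P(n), since it satisfies the required bound P(n) ≤ κ e^{−λn}, and P(ω | n) is an invented toy distribution over class assignments and links in the style of the tactical domain; none of this structure is prescribed by the paper.

```python
import random

# Sketch of the generative world distribution P(omega) = P(n) P(omega | n).

def sample_num_objects(rng, q=0.5):
    """Geometric P(n) = q * (1 - q)^(n - 1), n = 1, 2, ...; this decays
    exponentially, satisfying P(n) <= kappa * e^(-lambda * n)."""
    n = 1
    while rng.random() >= q:
        n += 1
    return n

def sample_world(rng):
    """P(omega | n): assign each object a class; link each footman to a
    uniformly chosen enemy (if any exist). Purely illustrative."""
    n = sample_num_objects(rng)
    classes = [rng.choice(["Footman", "Enemy"]) for _ in range(n)]
    enemies = [i for i, c in enumerate(classes) if c == "Enemy"]
    links = {i: rng.choice(enemies)
             for i, c in enumerate(classes) if c == "Footman" and enemies}
    return {"classes": classes, "links": links}

rng = random.Random(0)
D = [sample_world(rng) for _ in range(30)]        # the sampled training set D
sizes = [len(w["classes"]) for w in D]
print(max(sizes), sum(sizes) / len(sizes))       # small worlds dominate
```

Each sampled world would then contribute one block of constraints of the form (4) to the LP in (8), all sharing the class-level variables V_C.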
Thus, once we have sampled worlds, we can apply the same LP decomposition techniques of GKP to each world to solve this LP efficiently. Our generalization algorithm is summarized in Step 2 of Fig. 3.

The solution obtained by the LP with sampled worlds will, in general, not be equal to the one obtained if all worlds are considered simultaneously. However, we can show that the quality of the two approximations is close, if a sufficient number of worlds is sampled. Specifically, with a polynomial number of sampled worlds, we can guarantee that, with high probability, the quality of the value function approximation obtained when sampling worlds is close to the one obtained when considering all possible worlds.

Theorem 5.1 Consider the following class-based value functions (each with k parameters): V̂, obtained from the LP over all possible worlds, by minimizing Eq. (7) subject to the constraints in Eq. (6); Ṽ, obtained from the LP with the sampled worlds in (8); and V*, the optimal value function of the meta-MDP Π. For a number of sampled worlds polynomial in 1/ε, ln(1/δ), 1/(1−γ), k, λ, and 1/κ, the error is bounded by:

‖V* − Ṽ‖_{1,P_Ω} ≤ ‖V* − V̂‖_{1,P_Ω} + ε R_max ;

with probability at least 1 − δ, for any δ > 0 and ε > 0; where ‖V‖_{1,P_Ω} = Σ_{ω, s ∈ S_ω} P(ω) P⁰_ω(s) |V_ω(s)|, and R_max is the maximum per-object reward.

Proof: See Appendix A.

The proof uses some of the techniques developed by de Farias and Van Roy [2] for analyzing constraint sampling in general MDPs. However, there are two important differences. First, our analysis includes the error introduced when sampling the objective, which in our case is a sum only over a subset of the worlds, rather than over all of them as in the LP for the full meta-MDP. This issue was not previously addressed. Second, the algorithm of de Farias and Van Roy relies on the assumption that constraints are sampled according to some ideal distribution (the stationary distribution of the optimal policy). Unfortunately, sampling from this distribution is as difficult as computing a near-optimal policy. In our analysis, after each world is sampled, our algorithm exploits the factored structure in the model to represent the constraints exactly, avoiding the dependency on the ideal distribution.

6 Learning Classes of Objects

The definition of a class-based value function assumes that all objects in a class have the same local value function. In many cases, even objects in the same class might play different roles in the model, and therefore have a different impact on the overall value. For example, if only one peasant has the capability to build barracks, his status may have a greater impact. Distinctions of this type are not usually known in advance, but are learned by an agent as it gains experience with a domain and detects regularities. We propose a procedure that takes exactly this approach: assume that we have been presented with a set D of worlds ω. For each world ω, an approximate value function V_ω = Σ_{o ∈ O[ω]} V_o was computed as described in Sec. 3. In addition, each object is associated with a set of features F_ω[o]. For example, the features may include local information, such as whether the object is a peasant linked to a barrack or not, as well as global information, such as whether this world contains archers in addition to footmen. We can define our training data D as {⟨F_ω[o], V_o⟩ : o ∈ O[ω], ω ∈ D}.

We now have a well-defined learning problem: given this training data, we would like to partition the objects into classes, such that objects of the same class have similar value functions. There are many approaches for tackling such a task. We choose to use decision tree regression, so as to construct a tree that predicts the local value function parameters given the features.
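A minimal sketch of this regression step follows: a single CART-style split rather than a full tree, choosing the binary feature whose two leaves best fit their training values in squared error. The data, feature names, and value parameters are invented for illustration.

```python
# One-level sketch of subclass discovery: pick the single binary feature
# that minimizes the squared error of the two leaf means. The paper grows
# a full regression tree; leaves then define the subclasses.

def best_stump(data):
    """data: list of (features dict, value parameter).
    Returns (split feature, {feature value: leaf mean})."""
    best = None
    for f in data[0][0]:
        left = [v for x, v in data if not x[f]]
        right = [v for x, v in data if x[f]]
        if not left or not right:
            continue                       # useless split: one empty leaf
        m_l, m_r = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((v - m_l) ** 2 for v in left)
               + sum((v - m_r) ** 2 for v in right))
        if best is None or sse < best[0]:
            best = (sse, f, {False: m_l, True: m_r})
    _, f, leaves = best
    return f, leaves

# Hypothetical training data: per-object value parameters with features
# like those in the text (local link structure, global world properties).
data = [({"linked_to_barrack": True, "world_has_archers": False}, 9.8),
        ({"linked_to_barrack": True, "world_has_archers": True}, 10.1),
        ({"linked_to_barrack": False, "world_has_archers": False}, 2.1),
        ({"linked_to_barrack": False, "world_has_archers": True}, 1.9)]

feature, leaves = best_stump(data)
print(feature, leaves)   # splits on linked_to_barrack; leaves become subclasses
```

As in the full procedure, the leaf means here only identify which objects belong together; the actual subclass value parameters would then be re-optimized by the class-based LP.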
Thus, each split in the tree corresponds to a feature in F_ω[o]; each branch down the tree defines a subset of local value functions in D whose feature values are as defined by the path; the leaf at the end of the path is the average value function for this set. As the regression tree learning algorithm tries to construct a tree which is predictive about the local value function, it will aim to construct a tree where the mean at each leaf is very close to the training data assigned to that leaf. Thus, the leaves tend to correspond to objects whose local value functions are similar. We can thus take the leaves in the tree to define our subclasses, where each subclass is characterized by the combination of feature values specified by the path to the corresponding leaf. This algorithm is summarized in Step 1 of Fig. 3. Note that the mean subfunction at a leaf is not used as the value subfunction for the corresponding class; rather, the parameters of the value subfunction are optimized using the class-based LP in Step 2 of the algorithm.

1. Learning Subclasses:
   Input: A set of training worlds D. A set of features F_ω[o].
   Algorithm:
   (a) For each ω ∈ D, compute an object-based value function, as described in Sec. 3.
   (b) Apply regression tree learning on {⟨F_ω[o], V_o⟩ : o ∈ O[ω], ω ∈ D}.
   (c) Define a subclass for each leaf, characterized by the feature vector associated with its path.
2. Computing Class-Based Value Function:
   Input: A set of (sub)class definitions C. A template for {V_C : C ∈ C}. A set of training worlds D.
   Algorithm:
   (a) Compute the parameters for {V_C : C ∈ C} that optimize the LP in (8) relative to the worlds in D.
3. Acting in a New World:
   Input: A set of local value functions {V_C : C ∈ C}. A set of (sub)class definitions C. A world ω.
   Algorithm: Repeat
   (a) Obtain the current state s.
   (b) Determine the appropriate class C for each o ∈ O[ω] according to its features.
   (c) Define V_ω according to (5).
   (d) Use the coordination graph algorithm of GKP to compute an action a that maximizes R(s, a) + γ Σ_{s′} P(s′ | s, a) V_ω(s′).
   (e) Take action a in the world.

Figure 3: The overall generalization algorithm.

7 Experimental results

We evaluated our generalization algorithm on two domains: computer network administration and Freecraft.

7.1 Computer network administration

For this problem, we implemented our algorithm in Matlab, using CPLEX as the LP solver. Rather than using the full LP decomposition of GKP [6], we used the constraint generation extension proposed in [13], as the memory requirements were lower for this second approach. We experimented with the multiagent computer network examples in [6], using various network topologies and pair basis functions that involve states of neighboring machines (see [6]). In one of these problems, if we have n computers, then the underlying MDP has 9^n states and 2^n actions. However, the LP decomposition algorithm uses structure in the underlying factored model to solve such problems very efficiently [6].

We first tested the extent to which value functions are shared across objects. In Fig. 4(a), we plot the value each object gave to the assignment Status = working, for instances of the three legs topology. These values cluster into three classes. We used CART® to learn decision trees for our class partition. In this case, the learning algorithm partitioned the computers into three subclasses, illustrated in Fig. 4(b): server, intermediate, and leaf. In Fig. 4(a), we see that the server (third column) has the highest value, because a broken server can cause a chain reaction affecting the whole network, while the leaf value (first column) is lowest, as a leaf cannot affect any other computer.

We then evaluated the generalization quality of our class-based value function by comparing its performance to that of planning specifically for a new environment. For each topology, we computed the class-based value function with 5 sampled networks of up to 20 computers. We then sampled a

7 L L L Nuber of objects Value function paraeter value Interediate eaf eaf Interediate Server Interediate eaf Estiated policy value per agent Class-based value function 'Optial' approxiate value function Utopic expected axiu value Ring Star Three legs Max-nor error of value function No class learning Learnt classes Ring Star Three legs a) b) c) d) e) Figure 4: Network adinistrator results: a) Training data for learning classes; b) Classes learned for three legs ; c) Generalization quality evaluated by 0 Monte Carlo runs of 100 steps); d) Advantage of learning subclasses. Tactical Freecraft: e) 3 footen against 3 eneies. new network and coputed for it a value function that used the sae factorization, but with no class restrictions. This value function has ore paraeters different paraeters for each object, rather than for entire classes, which are optiized for this particular network. This process was repeated for 8 sets of networks. The results, shown in Fig. 4c), indicate that the value of the policy fro the class-based value function is very close to the value of replanning, suggesting that we can generalize well to new probles. We also coputed a utopic upper bound on the expected value of the optial policy by reoving the negative) effect of the neighbors on the status of the achines. Although this bound is loose, our approxiate policies still achieve a value close to it. Next, we wanted to deterine if our procedure for learning classes yields better approxiations than the ones obtained fro the default classes. Fig. 4d) copares the axnor error between our class-based value function and the one obtained by replanning. The graph suggests that, by learning classes using our decision trees regression tree procedure, we obtain a uch better approxiation of the value function we would have, had we replanned. 7. Freecraft In order to evaluate our algorith in the Freecraft gae, we ipleented the ethods in C++ and used CPLEX as the LP solver. 
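Throughout these experiments, policy quality is estimated by simulated runs (e.g., the 20 Monte Carlo runs of 100 steps behind Fig. 4(c)). A minimal sketch of such a rollout estimator, on a hypothetical two-state machine-repair chain rather than the actual domains:

```python
import random

# Sketch of Monte Carlo policy evaluation: estimate the discounted
# value of a policy by averaging seeded rollouts.  The two-state toy
# MDP here is a hypothetical stand-in, not the network domain.

def rollout(step, policy, s0, gamma=0.95, horizon=100, rng=None):
    """One simulated trajectory; returns its discounted return."""
    rng = rng or random.Random(0)
    s, total, discount = s0, 0.0, 1.0
    for _ in range(horizon):
        a = policy(s)
        s, r = step(s, a, rng)
        total += discount * r
        discount *= gamma
    return total

def mc_value(step, policy, s0, runs=20, **kw):
    """Average discounted return over independent seeded rollouts."""
    return sum(rollout(step, policy, s0, rng=random.Random(i), **kw)
               for i in range(runs)) / runs

# Toy domain: state 1 = machine working (reward 1), 0 = broken.
def step(s, a, rng):
    if a == "reboot":
        return 1, 0.0                        # reboot restores, no reward
    nxt = s if rng.random() < 0.9 else 0     # working machines may fail
    return nxt, float(nxt)

policy = lambda s: "reboot" if s == 0 else "wait"
v = mc_value(step, policy, s0=1)
```

Seeding each run makes the estimate reproducible, which is convenient when comparing a class-based policy against a replanned one on the same trajectories.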
We created two tasks that evaluate two aspects of the game: long-term strategic decision making and local tactical battle maneuvers. Our Freecraft interface, and scenarios for these and other more complex tasks, are publicly available at the website given below. For each task we designed an RMDP model to represent the system, by consulting a domain expert. After planning, our policies were evaluated on the actual game. To better visualize our results, we direct the reader to view videos of our policies at: guestrin/research/generalization/. This website also contains details on our RMDP model. It is important to note that our policies were constructed relative to a very approximate model of the game, but evaluated against the real game.

In the tactical model, the goal is to take out an opposing enemy force with an equivalent number of units. At each time step, each footman decides which enemy to attack. The enemies are controlled using Freecraft's hand-built strategy. We modelled footmen and enemies as each having 5 health points, which can decrease as units are attacked. We used a simple aggregator to represent the effect of multiple attackers. To encourage coordination, each footman is linked to a buddy in a ring structure. The local value functions include terms over triples of linked variables. We solved this model for a world with 3 footmen and 3 enemies, shown in Fig. 4(e). The resulting policy (which is fairly complex) demonstrates successful coordination between our footmen: initially all three footmen focus on one enemy. When the enemy becomes injured, one footman switches its target. Finally, when the enemy is very weak, only one footman continues to attack it, while the others tackle a different enemy. Using this policy, our footmen defeat the enemies in Freecraft. The factors generated in our planning algorithm grow exponentially in the number of units, so planning in larger models is infeasible.
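Step 3(d) of Fig. 3 selects the joint action by maximizing a sum of local terms over the coordination graph. A minimal sketch of the underlying variable-elimination idea, on a hypothetical three-agent chain (the payoffs are illustrative, not the Freecraft model):

```python
from itertools import product

# Sketch of joint-action selection on a coordination graph: the global
# action value decomposes as a chain q12(a1,a2) + q23(a2,a3), so
# variable elimination maximizes it without enumerating joint actions.
# The payoff tables below are hypothetical.

ACTIONS = [0, 1]                       # e.g. which enemy to attack
q12 = {(a1, a2): float(a1 == a2) for a1, a2 in product(ACTIONS, repeat=2)}
q23 = {(a2, a3): 2.0 * (a2 == a3) for a2, a3 in product(ACTIONS, repeat=2)}

def eliminate_chain(q12, q23, actions):
    """Max-out a3, then a2, then a1; backtrack to recover the argmax."""
    # e3[a2] = max over a3 of q23, with the maximizing a3 recorded
    e3 = {a2: max((q23[a2, a3], a3) for a3 in actions) for a2 in actions}
    # e2[a1] = max over a2 of q12 + e3
    e2 = {a1: max((q12[a1, a2] + e3[a2][0], a2) for a2 in actions)
          for a1 in actions}
    best_val, a1 = max((e2[a1][0], a1) for a1 in actions)
    a2 = e2[a1][1]
    a3 = e3[a2][1]
    return best_val, (a1, a2, a3)

val, joint = eliminate_chain(q12, q23, ACTIONS)
# the coordinated optimum has all agents agreeing on a target
```

The cost of elimination is exponential only in the induced width of the graph, not in the number of agents, which is what makes action selection fast even when planning itself is infeasible.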
Fortunately, when executing a policy, we instantiate the current state at every time step, and action selection is significantly faster [6]. Thus, even though we cannot execute Step 2 in Fig. 3 of our algorithm for larger scenarios, we can generalize our class-based value function to a world with 4 footmen and 4 enemies, without replanning, using only Step 3 of our approach. The policy continues to demonstrate successful coordination between footmen, and we again beat Freecraft's policy. However, as the number of units increases, the position of enemies becomes increasingly important. Currently, our model does not consider this feature, and in a world with 5 footmen and 5 enemies, our policy loses to Freecraft in a close battle.

In the strategic model, the goal is to kill a strong enemy. The player starts with a few peasants, who can collect gold or wood, or attempt to build a barrack, which requires both gold and wood. All resources are consumed after each Build action. With a barrack and gold, the player can train a footman. The footmen can choose to attack the enemy. When attacked, the enemy loses health points, but fights back and may kill the footmen. We solved a model with 2 peasants, 1 barrack, 2 footmen, and an enemy. Every peasant was related to a central peasant and every footman had a buddy. The scope of our local value functions included triples between related objects. The resulting policy is quite interesting: the peasants gather gold and wood to build a barrack, then gold to build a footman. Rather than attacking the enemy at once, this footman waits until a second footman is built. Then, they attack the enemy together. The stronger enemy is able to kill both footmen, but it becomes quite weak. When the next footman is trained, rather than waiting for a second one, it attacks the now weak enemy, and is able to kill him. Again, planning in large scenarios is infeasible, but action selection can be performed efficiently.
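Step 3(c) instantiates the class-based value function in a new world by summing, over objects, the subfunction of each object's class, as in (5). A minimal sketch with hypothetical classes and local states:

```python
# Sketch of the class-based value function: in a new world, each
# object contributes the subfunction of its (learned) class, so
# V_omega(s) = sum over objects o of V_{C(o)}(s[o]).  The class
# assignments and subfunctions below are hypothetical.

# Per-class local value subfunctions over an object's local state.
V_class = {
    "peasant": lambda local: 1.0 if local["task"] == "gold" else 0.5,
    "footman": lambda local: 2.0 * local["health"],
}

def classify(obj):
    """Map an object to its (sub)class from its features (Step 3(b))."""
    return obj["class"]

def V_omega(world, state):
    """Value of a world state: sum of class subfunctions over objects."""
    return sum(V_class[classify(o)](state[o["name"]]) for o in world)

world = [{"name": "p1", "class": "peasant"},
         {"name": "p2", "class": "peasant"},
         {"name": "f1", "class": "footman"}]
state = {"p1": {"task": "gold"}, "p2": {"task": "wood"},
         "f1": {"health": 3}}
value = V_omega(world, state)   # 1.0 + 0.5 + 6.0 = 7.5
```

Because the subfunctions are tied to classes rather than to individual objects, the same table scores worlds with any number of peasants and footmen without replanning.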
Thus, we can use our generalized value function to tackle a world with 9 peasants and 3 footmen, without replanning. The 9 peasants coordinate to gather resources. Interestingly, rather than attacking with 2 footmen, the policy now waits for 3 to be trained before attacking. The 3 footmen kill the enemy, and only one of them dies. Thus, we have successfully generalized from a problem with about 10^6 joint state-action pairs to one that is many orders of magnitude larger.

8 Discussion and Conclusions

In this paper, we have tackled a longstanding goal in planning research: the ability to generalize plans to new environments. Such generalization has two complementary uses: first, we can tackle new environments with minimal or no replanning; second, it allows us to generalize plans from smaller tractable environments to significantly larger ones, which could not be solved directly with our planning algorithm. Our experimental results support the fact that our class-based value function generalizes well to new problems, and that the class and subclass structure discovered by our learning procedure improves the quality of the approximation. Furthermore, we successfully demonstrated our methods on a real strategic computer game, which contains many characteristics present in real-world dynamic resource allocation problems.

Several other papers consider the generalization problem. Some approaches can represent value functions in general terms, but usually require them to be hand-constructed for the particular task. Others [12; 8; 4] have focused on reusing solutions from isomorphic regions of state space. By comparison, our method exploits similarities between objects evolving in parallel. It would be very interesting to combine these two types of decomposition. The work of Boutilier et al. [1] on symbolic value iteration computes first-order value functions, which generalize over objects. However, it focuses on computing exact value functions, which are unlikely to generalize to a different world. Furthermore, it relies on the use of theorem proving tools, which adds to the complexity of the approach. Methods in deterministic planning have focused on generalizing from compactly described policies learned from many domains to incrementally build a first-order policy [9; 11]. Closest in spirit to our approach is the recent work of Yoon et al.
[17], which extends these approaches to stochastic domains. We perform a similar procedure to discover classes by finding structure in the value function. However, our approach finds regularities in compactly represented value functions rather than policies. Thus, we can tackle tasks such as multiagent planning, where the action space is exponentially large and compact policies often do not exist.

The key assumption in our method is interchangeability between objects of the same class. Our mechanism for learning subclasses allows us to deal with cases where objects in the domain can vary, but our generalizations will not be successful in very heterogeneous environments, where most objects have very different influences on the overall dynamics or rewards. Additionally, the efficiency of our LP solution algorithm depends on the connectivity of the underlying problem. In a domain with strong and constant interactions between many objects (e.g., Robocup), or when the reward function depends arbitrarily on the state of many objects (e.g., Blocksworld), the solution algorithm will probably not be efficient. In some cases, such as the Freecraft tactical domain, we can use generalization to scale up to larger problems. In others, we could combine our LP decomposition technique with constraint sampling [2] to address this high connectivity issue. In general, however, extending these techniques to highly connected problems is still an open problem. Finally, although we have successfully applied our class-based value functions to new environments without replanning, there are domains where such direct application would not be sufficient to obtain a good solution. In such domains, our generalized value functions can provide a good initial policy, which could be refined using a variety of local search methods.

We have assumed that relations do not change over time. In many domains (e.g., Blocksworld or Robocup), this assumption is false. In recent work, Guestrin et al.
[7] show that context-specific independence can allow for dynamically changing coordination structures in multiagent environments. Similar ideas may allow us to tackle dynamically changing relational structures.

In summary, we believe that the class-based value function methods presented here will significantly further the applicability of MDP models to large-scale real-world tasks.

Acknowledgements

We are very grateful to Ron Parr for many useful discussions. This work was supported by the DoD MURI program, administered by the Office of Naval Research under Grant N, and by Air Force contract F under DARPA's TASK program.

References

[1] C. Boutilier, R. Reiter, and B. Price. Symbolic dynamic programming for first-order MDPs. In IJCAI-01, 2001.
[2] D. P. de Farias and B. Van Roy. On constraint sampling for the linear programming approach to approximate dynamic programming. Submitted to Math. of Operations Research, 2001.
[3] T. Dean and K. Kanazawa. Probabilistic temporal reasoning. In AAAI-88, 1988.
[4] T. G. Dietterich. Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research, 13:227–303, 2000.
[5] R. E. Fikes, P. E. Hart, and N. J. Nilsson. Learning and executing generalized robot plans. Artificial Intelligence, 3(4):251–288, 1972.
[6] C. E. Guestrin, D. Koller, and R. Parr. Multiagent planning with factored MDPs. In NIPS-14, 2001.
[7] C. E. Guestrin, S. Venkataraman, and D. Koller. Context specific multiagent coordination and planning with factored MDPs. In AAAI-02, 2002.
[8] M. Hauskrecht, N. Meuleau, L. Kaelbling, T. Dean, and C. Boutilier. Hierarchical solution of Markov decision processes using macro-actions. In UAI, 1998.
[9] R. Khardon. Learning action strategies for planning domains. Artificial Intelligence, 113:125–148, 1999.
[10] D. Koller and A. Pfeffer. Probabilistic frame-based systems. In AAAI, 1998.
[11] M. Martin and H. Geffner. Learning generalized policies in planning using concept languages. In KR, 2000.
[12] R. Parr.
Flexible decomposition algorithms for weakly coupled Markov decision problems. In UAI-98, 1998.
[13] D. Schuurmans and R. Patrascu. Direct value-approximation for factored MDPs. In NIPS-14, 2001.
[14] P. Schweitzer and A. Seidmann. Generalized polynomial approximations in Markovian decision processes. Journal of Mathematical Analysis and Applications, 110:568–582, 1985.
[15] R. Sutton and A. Barto. Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA, 1998.
[16] S. Thrun and J. O'Sullivan. Discovering structure in multiple learning tasks: The TC algorithm. In ICML-96, 1996.
[17] S. W. Yoon, A. Fern, and B. Givan. Inductive policy selection for first-order MDPs. In UAI-02, 2002.

A Proof of Theorem 5.1

Notation: k, number of parameters; V̂, value function obtained from the LP with infinitely many worlds in constraints and objective; V̄, value function obtained from the LP with sampled worlds in constraints, but infinitely many in the objective; Ṽ, value function obtained from the LP with sampled worlds in constraints and objective; V_max, maximum value function over all possible worlds times the probability of that world; π*, optimal policy; µ*, stationary distribution of the optimal policy; µ*_ω, stationary distribution of the optimal policy for world ω; D, sampled worlds; [ω], number of objects in ω.

Assumption A.1 ∀ω, P([ω]) ≤ κ e^{−λ[ω]}, for some κ > 0.

Theorem A.2 Let V̂ be the value function obtained from the linear program with all of the constraints and the correct objective function; let Ṽ be the value function from the linear program with the sampled objective and constraints; and let V* be the optimal value function. If the number of sampled worlds is at least:

m ≥ 4/((1−γ)ε) { ln(8/δ) + k [ ln(2/((1−γ)ε)) + ln ln(2/((1−γ)ε)) ] } + (16e/ε) κ ln(16eκ/(δε)) ;

then:

‖Ṽ − V*‖_{1,P_Ω} ≤ ‖V̂ − V*‖_{1,P_Ω} + ε · 3κ R_max e/(1−γ) ;

with probability at least 1 − δ, for any δ > 0 and ε > 0.

Proof: We will start by proving an auxiliary lemma which considers only sampled constraints, but not the sampled objective:

Lemma A.3 If the number of sampled worlds is at least:

m ≥ 4/((1−γ)ε) { ln(8/δ) + k [ ln(2/((1−γ)ε)) + ln ln(2/((1−γ)ε)) ] } ;

then:

‖V̄ − V*‖_{1,P_Ω} ≤ ‖V̂ − V*‖_{1,P_Ω} + ε · κ R_max e/(1−γ) ;

with probability at least 1 − δ.

Proof: There are two main differences between our proof and the proof of de Farias and Van Roy's Theorem 5.1 [2] for standard MDPs: The first is that, in our relational models, the stationary distribution decomposes as the mixture of the stationary distributions of each world. The second is that we only sample part of the state; in particular, we sample the world, but represent the constraints for each world in closed form. For our generalization problem, the stationary distribution decomposes as: µ*(ω, s) = P(ω) µ*_ω(s).
We must now bound the probability that V̄ violates any constraints with respect to the constraints defined by the optimal policy:

Σ_{ω,s} µ*(ω, s) 1( V̄(s) < T^{π*} V̄(s) ) = Σ_ω P(ω) Σ_s µ*_ω(s) 1( V̄(s) < T^{π*} V̄(s) )
  ≤ Σ_ω P(ω) 1( ∃ s ∈ S[Π_ω], a ∈ A[Π_ω] : V̄(s) < T_a V̄(s) ).

If a world ω has been sampled, i.e., ω ∈ D, then the indicator over ∃ s ∈ S[Π_ω], a ∈ A[Π_ω] : V̄(s) < T_a V̄(s) is guaranteed to be zero. Thus, the last term is less than or equal to ψ( ω : V̄(s) < T_a V̄(s) ) (in de Farias and Van Roy's notation), which in turn is bounded in their Theorem 4.1.


More information

On Constant Power Water-filling

On Constant Power Water-filling On Constant Power Water-filling Wei Yu and John M. Cioffi Electrical Engineering Departent Stanford University, Stanford, CA94305, U.S.A. eails: {weiyu,cioffi}@stanford.edu Abstract This paper derives

More information

Use of PSO in Parameter Estimation of Robot Dynamics; Part One: No Need for Parameterization

Use of PSO in Parameter Estimation of Robot Dynamics; Part One: No Need for Parameterization Use of PSO in Paraeter Estiation of Robot Dynaics; Part One: No Need for Paraeterization Hossein Jahandideh, Mehrzad Navar Abstract Offline procedures for estiating paraeters of robot dynaics are practically

More information

On the Inapproximability of Vertex Cover on k-partite k-uniform Hypergraphs

On the Inapproximability of Vertex Cover on k-partite k-uniform Hypergraphs On the Inapproxiability of Vertex Cover on k-partite k-unifor Hypergraphs Venkatesan Guruswai and Rishi Saket Coputer Science Departent Carnegie Mellon University Pittsburgh, PA 1513. Abstract. Coputing

More information

Bootstrapping Dependent Data

Bootstrapping Dependent Data Bootstrapping Dependent Data One of the key issues confronting bootstrap resapling approxiations is how to deal with dependent data. Consider a sequence fx t g n t= of dependent rando variables. Clearly

More information

Solutions of some selected problems of Homework 4

Solutions of some selected problems of Homework 4 Solutions of soe selected probles of Hoework 4 Sangchul Lee May 7, 2018 Proble 1 Let there be light A professor has two light bulbs in his garage. When both are burned out, they are replaced, and the next

More information

Pattern Recognition and Machine Learning. Artificial Neural networks

Pattern Recognition and Machine Learning. Artificial Neural networks Pattern Recognition and Machine Learning Jaes L. Crowley ENSIMAG 3 - MMIS Fall Seester 2016 Lessons 7 14 Dec 2016 Outline Artificial Neural networks Notation...2 1. Introduction...3... 3 The Artificial

More information

Lecture 21. Interior Point Methods Setup and Algorithm

Lecture 21. Interior Point Methods Setup and Algorithm Lecture 21 Interior Point Methods In 1984, Kararkar introduced a new weakly polynoial tie algorith for solving LPs [Kar84a], [Kar84b]. His algorith was theoretically faster than the ellipsoid ethod and

More information

3.8 Three Types of Convergence

3.8 Three Types of Convergence 3.8 Three Types of Convergence 3.8 Three Types of Convergence 93 Suppose that we are given a sequence functions {f k } k N on a set X and another function f on X. What does it ean for f k to converge to

More information

Intelligent Systems: Reasoning and Recognition. Artificial Neural Networks

Intelligent Systems: Reasoning and Recognition. Artificial Neural Networks Intelligent Systes: Reasoning and Recognition Jaes L. Crowley MOSIG M1 Winter Seester 2018 Lesson 7 1 March 2018 Outline Artificial Neural Networks Notation...2 Introduction...3 Key Equations... 3 Artificial

More information

Tracking using CONDENSATION: Conditional Density Propagation

Tracking using CONDENSATION: Conditional Density Propagation Tracking using CONDENSATION: Conditional Density Propagation Goal Model-based visual tracking in dense clutter at near video frae rates M. Isard and A. Blake, CONDENSATION Conditional density propagation

More information

List Scheduling and LPT Oliver Braun (09/05/2017)

List Scheduling and LPT Oliver Braun (09/05/2017) List Scheduling and LPT Oliver Braun (09/05/207) We investigate the classical scheduling proble P ax where a set of n independent jobs has to be processed on 2 parallel and identical processors (achines)

More information

IN modern society that various systems have become more

IN modern society that various systems have become more Developent of Reliability Function in -Coponent Standby Redundant Syste with Priority Based on Maxiu Entropy Principle Ryosuke Hirata, Ikuo Arizono, Ryosuke Toohiro, Satoshi Oigawa, and Yasuhiko Takeoto

More information

Ch 12: Variations on Backpropagation

Ch 12: Variations on Backpropagation Ch 2: Variations on Backpropagation The basic backpropagation algorith is too slow for ost practical applications. It ay take days or weeks of coputer tie. We deonstrate why the backpropagation algorith

More information

Multi-Scale/Multi-Resolution: Wavelet Transform

Multi-Scale/Multi-Resolution: Wavelet Transform Multi-Scale/Multi-Resolution: Wavelet Transfor Proble with Fourier Fourier analysis -- breaks down a signal into constituent sinusoids of different frequencies. A serious drawback in transforing to the

More information

Department of Electronic and Optical Engineering, Ordnance Engineering College, Shijiazhuang, , China

Department of Electronic and Optical Engineering, Ordnance Engineering College, Shijiazhuang, , China 6th International Conference on Machinery, Materials, Environent, Biotechnology and Coputer (MMEBC 06) Solving Multi-Sensor Multi-Target Assignent Proble Based on Copositive Cobat Efficiency and QPSO Algorith

More information

Experimental Design For Model Discrimination And Precise Parameter Estimation In WDS Analysis

Experimental Design For Model Discrimination And Precise Parameter Estimation In WDS Analysis City University of New York (CUNY) CUNY Acadeic Works International Conference on Hydroinforatics 8-1-2014 Experiental Design For Model Discriination And Precise Paraeter Estiation In WDS Analysis Giovanna

More information

Support Vector Machines. Goals for the lecture

Support Vector Machines. Goals for the lecture Support Vector Machines Mark Craven and David Page Coputer Sciences 760 Spring 2018 www.biostat.wisc.edu/~craven/cs760/ Soe of the slides in these lectures have been adapted/borrowed fro aterials developed

More information

Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence

Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence Best Ar Identification: A Unified Approach to Fixed Budget and Fixed Confidence Victor Gabillon Mohaad Ghavazadeh Alessandro Lazaric INRIA Lille - Nord Europe, Tea SequeL {victor.gabillon,ohaad.ghavazadeh,alessandro.lazaric}@inria.fr

More information

A Smoothed Boosting Algorithm Using Probabilistic Output Codes

A Smoothed Boosting Algorithm Using Probabilistic Output Codes A Soothed Boosting Algorith Using Probabilistic Output Codes Rong Jin rongjin@cse.su.edu Dept. of Coputer Science and Engineering, Michigan State University, MI 48824, USA Jian Zhang jian.zhang@cs.cu.edu

More information

Fairness via priority scheduling

Fairness via priority scheduling Fairness via priority scheduling Veeraruna Kavitha, N Heachandra and Debayan Das IEOR, IIT Bobay, Mubai, 400076, India vavitha,nh,debayan}@iitbacin Abstract In the context of ulti-agent resource allocation

More information

Qualitative Modelling of Time Series Using Self-Organizing Maps: Application to Animal Science

Qualitative Modelling of Time Series Using Self-Organizing Maps: Application to Animal Science Proceedings of the 6th WSEAS International Conference on Applied Coputer Science, Tenerife, Canary Islands, Spain, Deceber 16-18, 2006 183 Qualitative Modelling of Tie Series Using Self-Organizing Maps:

More information

Learnability and Stability in the General Learning Setting

Learnability and Stability in the General Learning Setting Learnability and Stability in the General Learning Setting Shai Shalev-Shwartz TTI-Chicago shai@tti-c.org Ohad Shair The Hebrew University ohadsh@cs.huji.ac.il Nathan Srebro TTI-Chicago nati@uchicago.edu

More information

Stochastic Subgradient Methods

Stochastic Subgradient Methods Stochastic Subgradient Methods Lingjie Weng Yutian Chen Bren School of Inforation and Coputer Science University of California, Irvine {wengl, yutianc}@ics.uci.edu Abstract Stochastic subgradient ethods

More information

ASSUME a source over an alphabet size m, from which a sequence of n independent samples are drawn. The classical

ASSUME a source over an alphabet size m, from which a sequence of n independent samples are drawn. The classical IEEE TRANSACTIONS ON INFORMATION THEORY Large Alphabet Source Coding using Independent Coponent Analysis Aichai Painsky, Meber, IEEE, Saharon Rosset and Meir Feder, Fellow, IEEE arxiv:67.7v [cs.it] Jul

More information

Curious Bounds for Floor Function Sums

Curious Bounds for Floor Function Sums 1 47 6 11 Journal of Integer Sequences, Vol. 1 (018), Article 18.1.8 Curious Bounds for Floor Function Sus Thotsaporn Thanatipanonda and Elaine Wong 1 Science Division Mahidol University International

More information

Equilibria on the Day-Ahead Electricity Market

Equilibria on the Day-Ahead Electricity Market Equilibria on the Day-Ahead Electricity Market Margarida Carvalho INESC Porto, Portugal Faculdade de Ciências, Universidade do Porto, Portugal argarida.carvalho@dcc.fc.up.pt João Pedro Pedroso INESC Porto,

More information

Linear Program Approximations for Factored Continuous-State Markov Decision Processes

Linear Program Approximations for Factored Continuous-State Markov Decision Processes Linear Progra Approiations for Factored Continuous-State Markov ecision Processes Milos Hauskrecht and Branislav Kveton epartent of Coputer Science and Intelligent Systes Progra University of Pittsburgh

More information

Chaotic Coupled Map Lattices

Chaotic Coupled Map Lattices Chaotic Coupled Map Lattices Author: Dustin Keys Advisors: Dr. Robert Indik, Dr. Kevin Lin 1 Introduction When a syste of chaotic aps is coupled in a way that allows the to share inforation about each

More information

MULTIAGENT Resource Allocation (MARA) is the

MULTIAGENT Resource Allocation (MARA) is the EDIC RESEARCH PROPOSAL 1 Designing Negotiation Protocols for Utility Maxiization in Multiagent Resource Allocation Tri Kurniawan Wijaya LSIR, I&C, EPFL Abstract Resource allocation is one of the ain concerns

More information

Decision-Theoretic Approach to Maximizing Observation of Multiple Targets in Multi-Camera Surveillance

Decision-Theoretic Approach to Maximizing Observation of Multiple Targets in Multi-Camera Surveillance Decision-Theoretic Approach to Maxiizing Observation of Multiple Targets in Multi-Caera Surveillance Prabhu Natarajan, Trong Nghia Hoang, Kian Hsiang Low, and Mohan Kankanhalli Departent of Coputer Science,

More information

Lower Bounds for Quantized Matrix Completion

Lower Bounds for Quantized Matrix Completion Lower Bounds for Quantized Matrix Copletion Mary Wootters and Yaniv Plan Departent of Matheatics University of Michigan Ann Arbor, MI Eail: wootters, yplan}@uich.edu Mark A. Davenport School of Elec. &

More information

Lost-Sales Problems with Stochastic Lead Times: Convexity Results for Base-Stock Policies

Lost-Sales Problems with Stochastic Lead Times: Convexity Results for Base-Stock Policies OPERATIONS RESEARCH Vol. 52, No. 5, Septeber October 2004, pp. 795 803 issn 0030-364X eissn 1526-5463 04 5205 0795 infors doi 10.1287/opre.1040.0130 2004 INFORMS TECHNICAL NOTE Lost-Sales Probles with

More information

Polygonal Designs: Existence and Construction

Polygonal Designs: Existence and Construction Polygonal Designs: Existence and Construction John Hegean Departent of Matheatics, Stanford University, Stanford, CA 9405 Jeff Langford Departent of Matheatics, Drake University, Des Moines, IA 5011 G

More information

Approximation in Stochastic Scheduling: The Power of LP-Based Priority Policies

Approximation in Stochastic Scheduling: The Power of LP-Based Priority Policies Approxiation in Stochastic Scheduling: The Power of -Based Priority Policies Rolf Möhring, Andreas Schulz, Marc Uetz Setting (A P p stoch, r E( w and (B P p stoch E( w We will assue that the processing

More information

International Scientific and Technological Conference EXTREME ROBOTICS October 8-9, 2015, Saint-Petersburg, Russia

International Scientific and Technological Conference EXTREME ROBOTICS October 8-9, 2015, Saint-Petersburg, Russia International Scientific and Technological Conference EXTREME ROBOTICS October 8-9, 215, Saint-Petersburg, Russia LEARNING MOBILE ROBOT BASED ON ADAPTIVE CONTROLLED MARKOV CHAINS V.Ya. Vilisov University

More information

lecture 36: Linear Multistep Mehods: Zero Stability

lecture 36: Linear Multistep Mehods: Zero Stability 95 lecture 36: Linear Multistep Mehods: Zero Stability 5.6 Linear ultistep ethods: zero stability Does consistency iply convergence for linear ultistep ethods? This is always the case for one-step ethods,

More information

Model-Free Reinforcement Learning as Mixture Learning

Model-Free Reinforcement Learning as Mixture Learning Model-Free Reinforceent Learning as Mixture Learning Nikos Vlassis vlassis@dpe.tuc.gr Technical University of Crete, Dept. of Production Engineering and Manageent, 73100 Chania, Greece Marc Toussaint toussai@cs.tu-berlin.de

More information

Introduction to Discrete Optimization

Introduction to Discrete Optimization Prof. Friedrich Eisenbrand Martin Nieeier Due Date: March 9 9 Discussions: March 9 Introduction to Discrete Optiization Spring 9 s Exercise Consider a school district with I neighborhoods J schools and

More information

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Course Notes for EE227C (Spring 2018): Convex Optiization and Approxiation Instructor: Moritz Hardt Eail: hardt+ee227c@berkeley.edu Graduate Instructor: Max Sichowitz Eail: sichow+ee227c@berkeley.edu October

More information

Estimating Parameters for a Gaussian pdf

Estimating Parameters for a Gaussian pdf Pattern Recognition and achine Learning Jaes L. Crowley ENSIAG 3 IS First Seester 00/0 Lesson 5 7 Noveber 00 Contents Estiating Paraeters for a Gaussian pdf Notation... The Pattern Recognition Proble...3

More information

C na (1) a=l. c = CO + Clm + CZ TWO-STAGE SAMPLE DESIGN WITH SMALL CLUSTERS. 1. Introduction

C na (1) a=l. c = CO + Clm + CZ TWO-STAGE SAMPLE DESIGN WITH SMALL CLUSTERS. 1. Introduction TWO-STGE SMPLE DESIGN WITH SMLL CLUSTERS Robert G. Clark and David G. Steel School of Matheatics and pplied Statistics, University of Wollongong, NSW 5 ustralia. (robert.clark@abs.gov.au) Key Words: saple

More information

arxiv: v1 [cs.ds] 17 Mar 2016

arxiv: v1 [cs.ds] 17 Mar 2016 Tight Bounds for Single-Pass Streaing Coplexity of the Set Cover Proble Sepehr Assadi Sanjeev Khanna Yang Li Abstract arxiv:1603.05715v1 [cs.ds] 17 Mar 2016 We resolve the space coplexity of single-pass

More information

Tight Complexity Bounds for Optimizing Composite Objectives

Tight Complexity Bounds for Optimizing Composite Objectives Tight Coplexity Bounds for Optiizing Coposite Objectives Blake Woodworth Toyota Technological Institute at Chicago Chicago, IL, 60637 blake@ttic.edu Nathan Srebro Toyota Technological Institute at Chicago

More information

Pattern Recognition and Machine Learning. Learning and Evaluation for Pattern Recognition

Pattern Recognition and Machine Learning. Learning and Evaluation for Pattern Recognition Pattern Recognition and Machine Learning Jaes L. Crowley ENSIMAG 3 - MMIS Fall Seester 2017 Lesson 1 4 October 2017 Outline Learning and Evaluation for Pattern Recognition Notation...2 1. The Pattern Recognition

More information

An Algorithm for Posynomial Geometric Programming, Based on Generalized Linear Programming

An Algorithm for Posynomial Geometric Programming, Based on Generalized Linear Programming An Algorith for Posynoial Geoetric Prograing, Based on Generalized Linear Prograing Jayant Rajgopal Departent of Industrial Engineering University of Pittsburgh, Pittsburgh, PA 526 Dennis L. Bricer Departent

More information

Support Vector Machines MIT Course Notes Cynthia Rudin

Support Vector Machines MIT Course Notes Cynthia Rudin Support Vector Machines MIT 5.097 Course Notes Cynthia Rudin Credit: Ng, Hastie, Tibshirani, Friedan Thanks: Şeyda Ertekin Let s start with soe intuition about argins. The argin of an exaple x i = distance

More information

REDUCTION OF FINITE ELEMENT MODELS BY PARAMETER IDENTIFICATION

REDUCTION OF FINITE ELEMENT MODELS BY PARAMETER IDENTIFICATION ISSN 139 14X INFORMATION TECHNOLOGY AND CONTROL, 008, Vol.37, No.3 REDUCTION OF FINITE ELEMENT MODELS BY PARAMETER IDENTIFICATION Riantas Barauskas, Vidantas Riavičius Departent of Syste Analysis, Kaunas

More information

Ştefan ŞTEFĂNESCU * is the minimum global value for the function h (x)

Ştefan ŞTEFĂNESCU * is the minimum global value for the function h (x) 7Applying Nelder Mead s Optiization Algorith APPLYING NELDER MEAD S OPTIMIZATION ALGORITHM FOR MULTIPLE GLOBAL MINIMA Abstract Ştefan ŞTEFĂNESCU * The iterative deterinistic optiization ethod could not

More information

Grafting: Fast, Incremental Feature Selection by Gradient Descent in Function Space

Grafting: Fast, Incremental Feature Selection by Gradient Descent in Function Space Journal of Machine Learning Research 3 (2003) 1333-1356 Subitted 5/02; Published 3/03 Grafting: Fast, Increental Feature Selection by Gradient Descent in Function Space Sion Perkins Space and Reote Sensing

More information

Iterative Decoding of LDPC Codes over the q-ary Partial Erasure Channel

Iterative Decoding of LDPC Codes over the q-ary Partial Erasure Channel 1 Iterative Decoding of LDPC Codes over the q-ary Partial Erasure Channel Rai Cohen, Graduate Student eber, IEEE, and Yuval Cassuto, Senior eber, IEEE arxiv:1510.05311v2 [cs.it] 24 ay 2016 Abstract In

More information

Sharp Time Data Tradeoffs for Linear Inverse Problems

Sharp Time Data Tradeoffs for Linear Inverse Problems Sharp Tie Data Tradeoffs for Linear Inverse Probles Saet Oyak Benjain Recht Mahdi Soltanolkotabi January 016 Abstract In this paper we characterize sharp tie-data tradeoffs for optiization probles used

More information

Analyzing Simulation Results

Analyzing Simulation Results Analyzing Siulation Results Dr. John Mellor-Cruey Departent of Coputer Science Rice University johnc@cs.rice.edu COMP 528 Lecture 20 31 March 2005 Topics for Today Model verification Model validation Transient

More information