OPERATIONS RESEARCH
doi 10.1287/opre.1070.0427ec
pp. ec1–ec5
e-companion ONLY AVAILABLE IN ELECTRONIC FORM
© 2007 INFORMS

Electronic Companion

"A Learning Approach for Interactive Marketing to a Customer Segment" by Dimitris Bertsimas and Adam J. Mersereau, Operations Research, doi 10.1287/opre.1070.0427.

This document accompanies "A Learning Approach for Interactive Marketing to a Customer Segment" by Bertsimas and Mersereau. We provide proofs of some results from that paper and some additional computational results. References point to that document, and we use notation specified there.

A. Proofs of Results

Proposition 1 can be extended to the case in which the decisions x are not fixed but are decided in stages.

Corollary 1. For $m \ge 0$, and where $x_i$ may depend on $s_t + \sum_{j=t}^{i-1} y_j$ and $f_t + \sum_{j=t}^{i-1} (x_j - y_j)$ for $i > t$,
\[
E_{y_t}\!\left[ E_{y_{t+1}}\!\left[ \cdots E_{y_{t+m}}\!\left[ J_t\!\left(s_t + \sum_{i=t}^{t+m} y_i,\; f_t + \sum_{i=t}^{t+m} (x_i - y_i)\right) \,\middle|\, x_{t+m},\, s_t + \sum_{i=t}^{t+m-1} y_i,\, f_t + \sum_{i=t}^{t+m-1} (x_i - y_i) \right] \cdots \,\middle|\, x_{t+1},\, s_t + y_t,\, f_t + x_t - y_t \right] \,\middle|\, x_t, s_t, f_t \right] \ge J_t(s_t, f_t).
\]

Proof. We can apply Proposition 1 to the inner expectation to show that the left-hand side of the inequality is greater than or equal to
\[
E_{y_t}\!\left[ E_{y_{t+1}}\!\left[ \cdots E_{y_{t+m-1}}\!\left[ J_t\!\left(s_t + \sum_{i=t}^{t+m-1} y_i,\; f_t + \sum_{i=t}^{t+m-1} (x_i - y_i)\right) \,\middle|\, x_{t+m-1},\, s_t + \sum_{i=t}^{t+m-2} y_i,\, f_t + \sum_{i=t}^{t+m-2} (x_i - y_i) \right] \cdots \,\middle|\, x_{t+1},\, s_t + y_t,\, f_t + x_t - y_t \right] \,\middle|\, x_t, s_t, f_t \right].
\]
The same argument applied $m-1$ more times yields the desired result.

Proof of Proposition 2. Observe that a feasible policy for the $(N_A + N_B)$-stage-size problem is to set the decision $x_t$ at stage $t$ equal to the optimal decision for the $N_A$-stage-size problem plus the optimal decision for the $N_B$-stage-size problem. Let $\tilde J_t(s_t, f_t, N_A + N_B, T)$ denote the cost-to-go function corresponding to this policy beginning in stage $t$ and state $(s_t, f_t)$. Observe that $J_{T-1}(s_{T-1}, f_{T-1}, N, T)$ is linear in $N$, so
\[
\tilde J_{T-1}(s_{T-1}, f_{T-1}, N_A + N_B, T) = J_{T-1}(s_{T-1}, f_{T-1}, N_A, T) + J_{T-1}(s_{T-1}, f_{T-1}, N_B, T).
\]
Now consider stage $t < T-1$. Assume for any $s_{t+1}$ and $f_{t+1}$ that
\[
\tilde J_{t+1}(s_{t+1}, f_{t+1}, N_A + N_B, T) \ge J_{t+1}(s_{t+1}, f_{t+1}, N_A, T) + J_{t+1}(s_{t+1}, f_{t+1}, N_B, T),
\]
and let
\[
x_A(s_t) = \arg\max_{x:\, \sum x = N_A} \left\{ \frac{s_t}{s_t + f_t}\, x + E_y\!\left[ J_{t+1}(s_t + y, f_t + x - y, N_A, T) \mid x, s_t, f_t \right] \right\},
\]
\[
x_B(s_t) = \arg\max_{x:\, \sum x = N_B} \left\{ \frac{s_t}{s_t + f_t}\, x + E_y\!\left[ J_{t+1}(s_t + y, f_t + x - y, N_B, T) \mid x, s_t, f_t \right] \right\}.
\]

Then
\[
\begin{aligned}
\tilde J_t(s_t, f_t, N_A + N_B, T)
&= \frac{s_t}{s_t + f_t}(x_A + x_B) + E_y\!\left[ \tilde J_{t+1}(s_t + y, f_t + x_A + x_B - y, N_A + N_B, T) \mid x_A + x_B, s_t, f_t \right] \\
&\ge \frac{s_t}{s_t + f_t}(x_A + x_B) + E_y\!\left[ J_{t+1}(s_t + y, f_t + x_A + x_B - y, N_A, T) \mid x_A + x_B, s_t, f_t \right] \\
&\quad + E_y\!\left[ J_{t+1}(s_t + y, f_t + x_A + x_B - y, N_B, T) \mid x_A + x_B, s_t, f_t \right] \\
&= \frac{s_t}{s_t + f_t}(x_A + x_B) + E_{y_A}\!\left[ E_{y_B}\!\left[ J_{t+1}(s_t + y_A + y_B, f_t + x_A + x_B - y_A - y_B, N_A, T) \mid x_B, s_t + y_A, f_t + x_A - y_A \right] \mid x_A, s_t, f_t \right] \\
&\quad + E_{y_B}\!\left[ E_{y_A}\!\left[ J_{t+1}(s_t + y_A + y_B, f_t + x_A + x_B - y_A - y_B, N_B, T) \mid x_A, s_t + y_B, f_t + x_B - y_B \right] \mid x_B, s_t, f_t \right] \\
&\ge \frac{s_t}{s_t + f_t}(x_A + x_B) + E_{y_A}\!\left[ J_{t+1}(s_t + y_A, f_t + x_A - y_A, N_A, T) \mid x_A, s_t, f_t \right] + E_{y_B}\!\left[ J_{t+1}(s_t + y_B, f_t + x_B - y_B, N_B, T) \mid x_B, s_t, f_t \right] \\
&= J_t(s_t, f_t, N_A, T) + J_t(s_t, f_t, N_B, T),
\end{aligned}
\]
where the equality in the third-to-last step follows from Lemma 1 and the inequality in the second-to-last step follows from Proposition 1. The desired result follows from induction and the fact that $J_0(s_0, f_0, N_A + N_B, T) \ge \tilde J_0(s_0, f_0, N_A + N_B, T)$.

Proof of Proposition 3. The optimal $(T_A + T_B)$-horizon policy yields at least as much expected reward as the policy that uses the optimal $T_A$-horizon policy for times $0, 1, \ldots, T_A - 1$, then uses the optimal $T_B$-horizon policy for times $T_A, T_A + 1, \ldots, T_A + T_B - 1$. If we denote the decisions and outcomes in periods $0, \ldots, T_A - 1$ under this policy as the vectors $x_A$ and $y_A$, respectively, then this statement is equivalent to
\[
J_0(s_0, f_0, N, T_A + T_B) \ge J_0(s_0, f_0, N, T_A) + E_{x_A, y_A}\!\left[ J_0(s_0 + y_A, f_0 + x_A - y_A, N, T_B) \right],
\]
where the expectation is over the decisions and outcomes of the $T_A$-stage problem and is shorthand for the expression on the left-hand side of the inequality in Corollary 1 with $t = 0$ and $m = T_A - 1$. By that corollary, then, we have
\[
E_{x_A, y_A}\!\left[ J_0(s_0 + y_A, f_0 + x_A - y_A, N, T_B) \right] \ge J_0(s_0, f_0, N, T_B).
\]
This gives the desired result, $J_0(s_0, f_0, N, T_A + T_B) \ge J_0(s_0, f_0, N, T_A) + J_0(s_0, f_0, N, T_B)$.

Proof of Proposition 4. The problem with stage size $mN$ and horizon $T/m$ is equivalent to a problem with horizon $T$ in which stage sizes are $0$ when the stage $t$ is not divisible by $m$ and $mN$ when $t$ is divisible by $m$. Denote the optimal value function of this modified problem by $\tilde J_t(s_t, f_t, mN, T)$. Adopt the convention $J_T(s_T, f_T, N, T) = \tilde J_T(s_T, f_T, mN, T) = 0$ for all $s_T$, $f_T$. We proceed by induction. Assume for some $t$ divisible by $m$ that $J_{t+m}(s_{t+m}, f_{t+m}, N, T) \ge \tilde J_{t+m}(s_{t+m}, f_{t+m}, mN, T)$ for all $s_{t+m}$, $f_{t+m}$. Fix $s_t$, $f_t$ and let
\[
x^* = \arg\max_{x:\, \sum x = mN} \left\{ \frac{s_t}{s_t + f_t}\, x + E_y\!\left[ \tilde J_{t+m}(s_t + y, f_t + x - y, mN, T) \mid x, s_t, f_t \right] \right\},
\]
and arbitrarily assign $x_t, x_{t+1}, \ldots, x_{t+m-1}$ such that $x^* = \sum_{\tau=t}^{t+m-1} x_\tau$ and $\sum x_t = \sum x_{t+1} = \cdots = \sum x_{t+m-1} = N$. This represents a feasible (non-Markov) policy for stages $t, t+1, \ldots, t+m-1$ of the $T$-stage, $N$-stage-size problem.

By the definition of $x^*$, we can write
\[
\begin{aligned}
\tilde J_t(s_t, f_t, mN, T)
&= \max_{x:\, \sum x = mN} \left\{ \frac{s_t}{s_t + f_t}\, x + E_y\!\left[ \tilde J_{t+m}(s_t + y, f_t + x - y, mN, T) \mid x, s_t, f_t \right] \right\} \\
&= \frac{s_t}{s_t + f_t}\, x^* + E_y\!\left[ \tilde J_{t+m}(s_t + y, f_t + x^* - y, mN, T) \mid x^*, s_t, f_t \right] \\
&\le \frac{s_t}{s_t + f_t}\, x^* + E_y\!\left[ J_{t+m}(s_t + y, f_t + x^* - y, N, T) \mid x^*, s_t, f_t \right],
\end{aligned}
\]
where the inequality follows from the induction assumption. We introduce the notation $x_{[a,b]} = \sum_{\tau=a}^{b} x_\tau$. Using this notation, we can write $x^* = x_{[t,\,t+m-1]} = x_{[t,\,t+m-2]} + x_{t+m-1}$. Substituting yields
\[
\begin{aligned}
\tilde J_t(s_t, f_t, mN, T)
&\le \frac{s_t}{s_t + f_t}\, x_{[t,\,t+m-1]} + E_y\!\left[ J_{t+m}(s_t + y, f_t + x_{[t,\,t+m-1]} - y, N, T) \mid x_{[t,\,t+m-1]}, s_t, f_t \right] &\text{(EC1)} \\
&= \frac{s_t}{s_t + f_t}\, x_{[t,\,t+m-2]} + E_y\!\Bigg[ \frac{s_t + y}{s_t + f_t + x_{[t,\,t+m-2]}}\, x_{t+m-1} \\
&\qquad + E_{y'}\!\left[ J_{t+m}(s_t + y + y', f_t + x_{[t,\,t+m-2]} + x_{t+m-1} - y - y', N, T) \mid x_{t+m-1}, s_t + y, f_t + x_{[t,\,t+m-2]} - y \right] \,\Bigg|\, x_{[t,\,t+m-2]}, s_t, f_t \Bigg] \\
&\le \frac{s_t}{s_t + f_t}\, x_{[t,\,t+m-2]} + E_y\!\Bigg[ \max_{x:\, \sum x = N} \bigg\{ \frac{s_t + y}{s_t + f_t + x_{[t,\,t+m-2]}}\, x \\
&\qquad + E_{y'}\!\left[ J_{t+m}(s_t + y + y', f_t + x_{[t,\,t+m-2]} + x - y - y', N, T) \mid x, s_t + y, f_t + x_{[t,\,t+m-2]} - y \right] \bigg\} \,\Bigg|\, x_{[t,\,t+m-2]}, s_t, f_t \Bigg] \\
&= \frac{s_t}{s_t + f_t}\, x_{[t,\,t+m-2]} + E_y\!\left[ J_{t+m-1}(s_t + y, f_t + x_{[t,\,t+m-2]} - y, N, T) \mid x_{[t,\,t+m-2]}, s_t, f_t \right], &\text{(EC2)}
\end{aligned}
\]
where the first equality follows from Lemma 1 and the fact that
\[
E_y\!\left[ \frac{s + y}{s + f + x} \,\middle|\, x, s, f \right] = \frac{s + E_y[y \mid x, s, f]}{s + f + x} = \frac{s + x\, s/(s + f)}{s + f + x} = \frac{s}{s + f}.
\]
We can repeat the arguments (EC1)–(EC2) $m - 2$ more times to get
\[
\begin{aligned}
\tilde J_t(s_t, f_t, mN, T)
&\le \frac{s_t}{s_t + f_t}\, x_t + E_{y_t}\!\left[ J_{t+1}(s_t + y_t, f_t + x_t - y_t, N, T) \mid x_t, s_t, f_t \right] \\
&\le \max_{x:\, \sum x = N} \left\{ \frac{s_t}{s_t + f_t}\, x + E_y\!\left[ J_{t+1}(s_t + y, f_t + x - y, N, T) \mid x, s_t, f_t \right] \right\} = J_t(s_t, f_t, N, T).
\end{aligned}
\]
The desired result follows by induction.
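The identity $E_y[(s+y)/(s+f+x) \mid x, s, f] = s/(s+f)$ used in the proof above is the familiar consistency (martingale) property of the Beta–Bernoulli posterior mean. As a quick numerical illustration, the following minimal sketch simulates the response model, assuming $y \mid p \sim \mathrm{Binomial}(x, p)$ with $p \sim \mathrm{Beta}(s, f)$; the particular values of $s$, $f$, and $x$ below are arbitrary.

```python
import numpy as np

# Minimal Monte Carlo check of E_y[(s + y)/(s + f + x) | x, s, f] = s/(s + f),
# assuming y | p ~ Binomial(x, p) with p ~ Beta(s, f); the values below are arbitrary.
rng = np.random.default_rng(0)

s, f, x = 2.0, 8.0, 50       # prior success/failure counts and number of messages sent
n_samples = 200_000

p = rng.beta(s, f, size=n_samples)        # success probabilities drawn from the prior
y = rng.binomial(x, p)                    # simulated successes among the x messages
updated_mean = (s + y) / (s + f + x)      # posterior mean after observing y

print(updated_mean.mean())   # Monte Carlo estimate, approximately 0.2
print(s / (s + f))           # prior mean, exactly 0.2
```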

B. Some Properties of $\bar J_0^{\lambda}(s_0, f_0)$

The propositions proven in this section support the assertions made in §4.1 that the function $\bar J_0^{\lambda}(s_0, f_0)$ is convex as a function of $\lambda$ and an upper bound on the true value function $J_0(s_0, f_0)$.

Proposition 6. $\bar J_0^{\lambda}(s_0, f_0) \ge J_0(s_0, f_0)$ for all $\lambda$, $s_0$, and $f_0$.

Proof. First, we use induction to show $\bar J_t^{\lambda}(s_t, f_t) \ge J_t(s_t, f_t)$ for all $t$, $\lambda$, $s_t$, and $f_t$. Fix $\lambda$, and consider stage $T-1$. Let $m^* = \arg\max_m s_{T-1}^m/(s_{T-1}^m + f_{T-1}^m)$; then $J_{T-1}(s_{T-1}, f_{T-1}) = N\, s_{T-1}^{m^*}/(s_{T-1}^{m^*} + f_{T-1}^{m^*})$. If $s_{T-1}^{m^*}/(s_{T-1}^{m^*} + f_{T-1}^{m^*}) \ge \lambda_{T-1}$, then
\[
\bar J_{T-1}^{\lambda}(s_{T-1}, f_{T-1}) = N \lambda_{T-1} + \sum_{m=1}^{M} \hat J_{T-1}^{\lambda, m}(s_{T-1}^m, f_{T-1}^m) \ge N \lambda_{T-1} + \hat J_{T-1}^{\lambda, m^*}(s_{T-1}^{m^*}, f_{T-1}^{m^*}) = J_{T-1}(s_{T-1}, f_{T-1}).
\]
If $s_{T-1}^{m^*}/(s_{T-1}^{m^*} + f_{T-1}^{m^*}) < \lambda_{T-1}$, then
\[
J_{T-1}(s_{T-1}, f_{T-1}) = N\, \frac{s_{T-1}^{m^*}}{s_{T-1}^{m^*} + f_{T-1}^{m^*}} \le N \lambda_{T-1} \le \bar J_{T-1}^{\lambda}(s_{T-1}, f_{T-1}).
\]
Now for some $t$ assume $\bar J_{t+1}^{\lambda}(s_{t+1}, f_{t+1}) \ge J_{t+1}(s_{t+1}, f_{t+1})$ for all $\lambda$, $s_{t+1}$, and $f_{t+1}$. Let $x_t$ be feasible and achieve the maximum in the optimization of Equation (5). Then $x_t$ is feasible in the optimization problem of Equation (7),
\[
E_{y_t}\!\left[ J_{t+1}(s_t + y_t, f_t + x_t - y_t) \mid x_t, s_t, f_t \right] \le E_{y_t}\!\left[ \bar J_{t+1}^{\lambda}(s_t + y_t, f_t + x_t - y_t) \mid x_t, s_t, f_t \right],
\]
and $\lambda_t \big( N - \sum_{m=1}^{M} x_t^m \big) = 0$; thus, by comparison of Equations (5) and (7), we have $\bar J_t^{\lambda}(s_t, f_t) \ge J_t(s_t, f_t)$. By induction we thus have $\bar J_1^{\lambda}(s_1, f_1) \ge J_1(s_1, f_1)$ for all $\lambda$, $s_1$, and $f_1$; then a comparison of Equations (5) and (11) gives us $\bar J_0^{\lambda}(s_0, f_0) \ge J_0(s_0, f_0)$.

Proposition 7. $\bar J_0^{\lambda}(s_0, f_0)$ is convex as a function of $\lambda$ for all $s_0$ and $f_0$.

Proof. First, we use induction to show that $\hat J_t^{\lambda, m}(s_t^m, f_t^m)$ is convex in $\lambda$ for all $t$, $s_t^m$, and $f_t^m$. For all $s_{T-1}^m$ and $f_{T-1}^m$, $\hat J_{T-1}^{\lambda, m}(s_{T-1}^m, f_{T-1}^m)$ is the maximum of linear functions of $\lambda$ and is thus convex in $\lambda$. Now for arbitrary $t < T-1$, assume $\hat J_{t+1}^{\lambda, m}(s_{t+1}^m, f_{t+1}^m)$ is convex in $\lambda$ for all $s_{t+1}^m$ and $f_{t+1}^m$. Then fix $x_t^m$ and see that $E_y[\hat J_{t+1}^{\lambda, m}(s_t^m + y, f_t^m + x_t^m - y) \mid x_t^m, s_t^m, f_t^m]$ is a positively weighted sum of convex functions of $\lambda$ and is thus convex. $\hat J_t^{\lambda, m}(s_t^m, f_t^m)$ is then a maximum over a finite set of convex functions and is thus convex as a function of $\lambda$ for all $s_t^m$ and $f_t^m$. $\bar J_t^{\lambda}(s_t, f_t)$ is the sum of convex functions and is thus convex for all $s_t$ and $f_t$. Thus $E[\bar J_1^{\lambda}(s_1, f_1)]$ is convex as a function of $\lambda$ for all $s_1$ and $f_1$. Finally, $\bar J_0^{\lambda}(s_0, f_0)$ is a maximum of convex functions and is thus convex as a function of $\lambda$ for all $s_0$ and $f_0$.

C. Alternate Algorithms for Selecting $\lambda$

Here we present support for our choice of method for selecting the parameter $\lambda$. The results in this section make use of the subproblem approximations of §4.2 with $H = 2$ and $B = N/10$, and compare the following methods for selecting $\lambda$:

ADP: This is the method used to generate the results in §6. $\lambda$ is assumed constant and is chosen using binary search to identify the $\lambda$ for which the constraint $\sum_{m=1}^{M} x_0^m = N$ is satisfied in the relaxed problem. After 7 iterations of binary search, the constrained problem (11) is used to determine a feasible solution.

ADP_min: This approach assumes $\lambda$ constant and attempts to select a $\lambda$ that minimizes the value $\bar J_0^{\lambda}(s_0, f_0)$. The numerical minimization relies on the convexity of the relaxed-problem value function (see Proposition 7). Specifically, we begin with a known interval for $\lambda$ (initially, $[0, 1]$) and subdivide the interval into 4 evenly spaced subintervals. By evaluating and comparing the relaxed value function at each of the subinterval boundaries, we can narrow the interval to at most one half the original interval. This procedure is iterated 7 times. (A schematic sketch of the ADP and ADP_min searches appears after Table EC.1.)

ADP_$\lambda$: This implements a version of the algorithm with $\lambda$ retaining a component for each future time stage. We include variables $\lambda_1$ through $\lambda_H$, using $\lambda_H$ for estimating the relaxed value functions beyond the lookahead horizon $H$. The component parameters are chosen in a minimization procedure that performs local search on a discretized grid of $\lambda$ values. The discretization we use is 0.05, which we note is coarser than the precision of ADP_min.

Table EC.1 gives results for a few selected problems for each of the methods described. We note that the results do not give evidence that our assumption of constant $\lambda$ is a poor one, nor does it seem that significant gains can be achieved by using a minimization procedure to choose $\lambda$.

TABLE EC.1. Simulation results comparing ADP, ADP_min, and ADP_$\lambda$.
Numbers represent average numbers of successes over 2,000 simulated problems.

(s, f) | T | N | k | Ideal | Greedy | Intval. | ADP | ADP_min | ADP_λ
2, 8 | 10 | 50 | U0 | 478 | 1856 | 19012 | 18978 | 1890 | 18907
2, 8 | 10 | 100 | U0 | 41268 | 764 | 8886 | 8860 | 8754 | 8710
2, 50 | 6 | 0 | U0 60 | 9889 | 8740 | 8875 | 8861 | 8824 | 8851
4, 100 | 5 | 1000 | U100 0 | 258 | 0507 | 070 | 0699 | 064 | 0688
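To make the single-$\lambda$ selection routines above concrete, the sketch below illustrates the two generic searches they rely on: the ADP-style binary search for a $\lambda$ at which the relaxed solution allocates exactly $N$ messages, and the ADP_min-style interval narrowing that exploits the convexity of $\bar J_0^{\lambda}$ (Proposition 7). This is a schematic sketch only: `relaxed_allocation` and `relaxed_value` are hypothetical stand-ins for the relaxed-problem solver (replaced here by toy functions with the right qualitative shape), not the implementation used for the results above.

```python
# Schematic sketch of the two single-lambda selection routines described in Section C.
# `relaxed_allocation` and `relaxed_value` are hypothetical stand-ins for the
# relaxed-problem solver; they are replaced here by toy functions with the right
# qualitative shape (allocation nonincreasing in lambda, value convex in lambda).

N = 50  # messages available per stage (arbitrary illustrative value)

def relaxed_allocation(lam):
    # Toy placeholder: total messages the relaxed solution sends at "price" lam.
    return 100.0 * (1.0 - lam)

def relaxed_value(lam):
    # Toy placeholder standing in for the convex map lambda -> J-bar_0^lambda(s_0, f_0).
    return (lam - 0.4) ** 2 + 1.0

def adp_binary_search(n_iter=7, lo=0.0, hi=1.0):
    """ADP-style search: find lambda at which the relaxed allocation equals N."""
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if relaxed_allocation(mid) > N:
            lo = mid   # relaxed solution sends too many messages: raise lambda
        else:
            hi = mid   # relaxed solution sends too few messages: lower lambda
    return 0.5 * (lo + hi)

def adp_min_search(n_iter=7, lo=0.0, hi=1.0):
    """ADP_min-style search: narrow an interval containing the minimizer of a
    convex function of lambda.  Each iteration evaluates the value at the five
    boundaries of four equal subintervals; convexity implies a minimizer lies
    within one subinterval of the best point, so the interval shrinks to at
    most half its previous length."""
    for _ in range(n_iter):
        pts = [lo + k * (hi - lo) / 4.0 for k in range(5)]
        best = min(range(5), key=lambda k: relaxed_value(pts[k]))
        lo, hi = pts[max(best - 1, 0)], pts[min(best + 1, 4)]
    return 0.5 * (lo + hi)

print(adp_binary_search())   # ~0.50 for the toy allocation above
print(adp_min_search())      # ~0.40 for the toy value above
```

Both loops use 7 iterations, matching the iteration counts reported for ADP and ADP_min above.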

D. Computational Results for Multiple Segments with Migrating Customers

Through simulated experiments, we evaluate the effectiveness of the method described in §4.3 for accounting for the migration of customers among segments. For purposes of comparison, we have implemented the following heuristics for the problem described in §4.3:

Greedy: Sends to all customers in state $i$ the available message offering the greatest expected reward in the current stage. Thus, this method accounts for neither the customer migration dynamics nor the effects of information accumulation.

Dynamic: This heuristic fixes all reward probabilities at their expected values, then solves a simple dynamic program for each customer. In the case with known purchase probabilities, solving for each customer independently produces an optimal policy for the overall problem. This method accounts for customer migration dynamics but ignores information effects.

Info: This ignores customer dynamics entirely and makes decisions using the dynamic programming-based adaptive sampling heuristic of §4.

Decomp: This is the decomposition-based approximation described in §4.3.

We test the algorithm on a few randomly generated examples. True reward probabilities and prior distributions are generated as in §6. Transition probabilities for each segment and message are chosen by selecting an $S$-vector of uniform random deviates and normalizing so that $\sum_{j=1}^{S} P_{ij} = 1$ (a short sketch of this generation step appears at the end of this section). Average results over 2,000 randomly generated problems for a few cases are presented in Table EC.2.

TABLE EC.2. Simulation results for some randomly generated multi-segment problems. Computation times represent average CPU time per stage on an Intel Xeon 2.4 GHz processor. The Greedy and Dynamic algorithms took negligible time per stage (<0.005 seconds).

S | T | M | N_i^0 | (s, f) | k | Greedy | Dynamic | Info | Decomp | CPU time: Info | CPU time: Decomp
{ 4 25 2 8 4 25 8 4 8 2 8 { 1 4
1 9 U0 2 8 U0 9695 9957 982 100.84 0.1 0.2
2 8 U010 2 8 U010 22521 22702 22807 229.98 0.8 0.9
2 8 U010 2 8 U010 2 8 U0 2 8 U00 2 8 U040 18089 18272 18242 184.0 0.4 0.44
5 5 U010 40 17577 1764 17626 176.67 0.09 0.10

We observe that the Decomp heuristic outperforms all the other methods for each set of problems tried. Moreover, the improvement afforded by the Decomp heuristic is statistically significant in each case. Most notably, the decomposition approach performs as well as the best of the Dynamic and Info techniques in all of the examples, suggesting it is adequately accounting for both information value and customer dynamics. We also note that all three of the Dynamic, Info, and Decomp methods are preferable to the Greedy heuristic in all the examples. Computation times are reasonable and comparable to those observed for the single-segment problems of §6, although we point out that we have chosen instances with fewer customers per stage than in §6.
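The transition-probability generation referenced above is straightforward to reproduce. The following minimal sketch is illustrative only; the array layout (one $S \times S$ matrix per message type) and the use of numpy are assumptions made for the example, not a description of the original experimental code.

```python
import numpy as np

def random_transition_matrices(S, n_messages, rng=None):
    """For each message, build an S x S row-stochastic matrix by drawing an
    S-vector of uniform random deviates for every segment i and normalizing
    it so that sum_j P_ij = 1."""
    rng = np.random.default_rng() if rng is None else rng
    P = rng.uniform(size=(n_messages, S, S))    # uniform deviates, one row per (message, segment i)
    return P / P.sum(axis=-1, keepdims=True)    # normalize each row into a probability vector

# Example: 4 segments and 2 message types, with a reproducible draw.
P = random_transition_matrices(S=4, n_messages=2, rng=np.random.default_rng(1))
assert np.allclose(P.sum(axis=-1), 1.0)         # every row is a valid distribution
```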