Assortment Optimization under a Mixture of Mallows Model
Antoine Désir*, Vineet Goyal*, Srikanth Jagabathula†, and Danny Segev‡

*Department of Industrial Engineering and Operations Research, Columbia University
†Leonard N. Stern School of Business, New York University
‡Department of Statistics, University of Haifa

Abstract

Assortment optimization is an important problem that arises in many applications, including retailing and online advertising. The goal in this problem is to determine a revenue/profit-maximizing subset of products to offer from a large universe of products when customers exhibit random substitution behavior. We suppose that demand is described by a mixture of Mallows models. This is a smoothed generalization of the class of sparse rank-based choice models, designed to overcome some of its key limitations. The Mallows distribution has an exponential support size and does not admit a closed-form expression for choice probabilities. We present an efficient procedure to compute the choice probabilities for any assortment under the mixture of Mallows model. Based on this procedure and the structural properties of the distribution, we give a polynomial time approximation scheme (PTAS) for the constrained assortment optimization problem under a reasonable assumption. In other words, for any constant ε > 0, we obtain a (1 − ε)-approximation in running time that is polynomial in the number of products n and the number of segments K, but depends exponentially on 1/ε. Our PTAS can handle a large class of constraints on the assortments, including cardinality constraints, capacity constraints, and matroid constraints. To the best of our knowledge, this is the first efficient algorithm with provably near-optimal performance guarantees for the assortment optimization problem under the Mallows or the mixture of Mallows model in such generality.
Furthermore, we give a compact mixed integer program (MIP) and present a variable bound strengthening technique that leads to a practical approach for the constrained assortment optimization problem under a general mixture of Mallows model.

1 Introduction

What subset (or assortment) of products to offer is a key decision problem that commonly arises in several application contexts. A concrete setting is that of a retailer who carries a large universe of products U but can offer only a subset of the products in each store, online or offline. The objective of the retailer
is typically to choose the offer set that maximizes the expected revenue/profit¹ earned from each arriving customer under stochastic demand. One of the key ingredients of the assortment optimization problem is a model that captures the stochastic demand. The demand model must account for product substitution behavior, whereby a customer substitutes to an available product (say, a dark blue shirt) when her most preferred product (say, a black one) is not offered. The substitution behavior makes the demand for each offered product a function of the entire offer set, increasing the complexity of the demand model. Nevertheless, existing work has shown that demand models that incorporate substitution effects provide significantly more accurate predictions than those that do not. The common approach to capturing substitution is through a choice model that specifies the demand as the probability P(a | S) of a random customer choosing product a from offer set S. Several parametric choice models from the Random Utility Maximization (RUM) class have been extensively studied in the literature in diverse areas including marketing, transportation, economics, and operations management. Among these models, the multinomial logit (MNL) model has by far been the most popular in practice (Talluri and Van Ryzin 2004), primarily because of the tractability gained from its simplifying assumptions. However, this simplicity also limits² the ability of the model to faithfully capture the complex substitution patterns present in various applications. To address these limitations, more complex choice models have been developed to capture a richer class of substitution behaviors. Popular examples include the nested logit (Williams 1977), the latent-class MNL, and the mixed logit models (McFadden et al. 2000); see Train (2003) for a detailed overview of these models.
The increased model complexity makes these models more flexible, but both the estimation and the assortment problems become computationally difficult to solve. Further, from among these many options, choosing the right model is a difficult task in assortment planning, especially as customer preferences are latent and unobservable and we only have access to censored data. In fact, Farias et al. (2013a) show that because parametric models do not grow in complexity with the data, they under-perform nonparametric models in accurately predicting expected revenue/profit (Farias et al. 2013b). To address the limitations of parametric models, the nonparametric rank-based choice model has emerged as an alternative. In its most general form, it models customer preferences as a distribution over all possible

¹ As elaborated below, conversion-rate maximization can be obtained as a special case of revenue/profit maximization by setting the revenue/profit of all the products to be equal.
² For instance, a commonly recognized limitation is the so-called Independence of Irrelevant Alternatives property, which specifies that when a product is not offered, the probability that customers substitute to another product is proportional to its market share.
rankings/preference lists of the products (Farias et al. 2013b, van Ryzin and Vulcano 2014, Honhon et al. 2012). Each preference list specifies a rank ordering of the products, with lower-ranked products being more preferred. In each choice instance, a customer samples a preference list according to the underlying distribution and chooses the first available product (possibly the no-purchase option) from her list. The rank-based model is very general and subsumes the general RUM model class mentioned above. The model accommodates distributions with exponentially large support sizes and can therefore capture complex substitution patterns. However, available data are usually not sufficient to identify such a complex model. Therefore, sparsity is used as a model selection criterion to pick a model from the set of models consistent with the data. Specifically, it is assumed that the distribution has a support size K, for some K that is polynomial in the number of products. Sparsity results in data-driven model selection (Farias et al. 2013b), obviating the need to impose arbitrary parametric structures. Despite their generality, however, sparse rank-based models cannot account for noise or any deviations from the K ranked lists in the support. This limits their modeling flexibility, resulting in unrealistic predictions and an inability to model individual-level observations. Specifically, because K ≪ n! (where n is the number of products), the model specifies that there is a zero chance that a customer uses a ranking that is even slightly different from any of the K rankings in the support, and a zero chance of observing certain choices. However, choices may be observed in real (holdout) data that are not consistent with any of the K rankings, making the model predictions unrealistic. In addition, a natural way to interpret sparse choice models is to assume that the population consists of K types of customers, with each type described by one of the ranked lists.
When this interpretation is applied to individual-level observations, it implies that all the choice observations of each individual must be consistent with at least one of the K rankings, which again may not be the case in real data. In order to address these issues, we generalize the sparse rank-based models by smoothing them with the Mallows kernel. Specifically, we suppose that the choice model is a mixture of K Mallows models and focus on assortment optimization under this generalized class. The Mallows distribution was introduced in the mid-1950s (Mallows 1957) and is the most popular member of the so-called distance-based ranking models, which are characterized by a modal ranking ω and a concentration parameter θ. The probability that a ranking σ is sampled falls exponentially as e^{−θ·d(σ, ω)}, where d(·, ·) is the distance between σ and ω. Different distance functions result in different models. The Mallows model uses the Kendall-Tau distance, which measures the number of pairwise disagreements between the two rankings. Intuitively, the Mallows
model assumes that consumer preferences are concentrated around a central permutation, with the likelihood of large deviations being low. The mixture of Mallows model with K segments is specified by modal rankings ω_1, …, ω_K, concentration parameters θ_1, …, θ_K, and probabilities p_1, …, p_K, where for any k = 1, …, K, p_k specifies the probability that a random customer belongs to Mallows segment k with modal ranking ω_k and concentration parameter θ_k. This mixture model is a more natural model, allowing for deviations from the modal rankings and assigning a non-zero probability to every choice. Further, it is a parsimonious way to extend the support of the distribution to an exponential size, and as θ_k → ∞ for all k, the distribution concentrates around each of the K modes, yielding the sparse rank-based model. In this paper, we assume that the parameters of the mixture of Mallows model are given and focus on the constrained assortment optimization problem. We refer the interested reader to the large body of existing work on estimating such models from data (Lu and Boutilier 2011, Awasthi et al. 2014, Dwork et al. 2001, Lebanon and Mao 2008).

1.1 Our contributions

Our main contribution in this paper is a near-optimal algorithm for a large class of constrained assortment optimization problems under the mixture of Mallows model, under a reasonable assumption. We address the two key computational challenges that arise in solving our problem: (a) efficiently computing the choice probabilities, and hence the expected revenue/profit, for a given offer set S, and (b) finding a near-optimal assortment. We also present a compact mixed integer program (MIP) together with a variable bound strengthening technique that leads to a practical approach for the constrained assortment optimization problem under a general mixture of Mallows model. We describe the main contributions below.

Exact computation of choice probabilities.
We present an efficient procedure to compute the choice probabilities P(a | S) exactly under a general mixture of Mallows model. Because the mixture of Mallows distribution has an exponential support size, computing the choice probabilities for a fixed offer set S requires marginalizing the distribution by summing it over an exponential number of rankings, and is therefore a non-trivial computational task. In fact, computing the probability of a general partial order under the Mallows distribution is known to be a #P-hard problem (Lu and Boutilier 2011, Brightwell and Winkler 1991). The only other known class of partial orders whose probabilities can be computed efficiently is the class of partitioned preferences (Lebanon and Mao 2008); while this class includes top-k/bottom-k ranked lists, it does not include other popular partial orders such as pairwise comparisons.
We exploit the structural symmetries of the Mallows distribution to derive an efficiently computable closed-form expression for the choice probabilities of a given offer set under the mixture of Mallows distribution. In particular, we first consider a single Mallows distribution and show that its choice probabilities can be expressed as a discrete convolution. Using the fast Fourier transform, the choice probabilities can be computed in O(n² log n) time, where n is the number of products. Therefore, we obtain a procedure with running time O(K·n² log n) to compute the choice probabilities for a fixed offer set under a mixture of K Mallows distributions.

Near-optimal algorithm for the constrained assortment optimization problem. We present a polynomial time approximation scheme (PTAS) for a large class of constrained assortment optimization problems under the mixture of Mallows model, including cardinality constraints, knapsack constraints, and matroid constraints. Our PTAS holds under the assumption that the no-purchase option is ranked last in the modal rankings of all Mallows segments in the mixture; such an assumption is necessary because Aouad et al. (2015) show that assortment optimization under a sparse rank-based model with K > n is hard to approximate within a factor better than O(1/n^{1−ε}) for any ε > 0.³ Under the above assumption and for any ε > 0, our algorithm computes an assortment with expected revenue at least (1 − ε) times the optimal, in running time that is polynomial in n and K but depends exponentially on 1/ε. The PTAS is based on establishing a surprising sparsity property of near-optimal assortments, namely, that there exists a near-optimal assortment of size O(1/ε). Therefore, enumerating over all such subsets gives a (1 − ε)-approximation to the constrained assortment optimization problem.
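The convolution-by-FFT step behind the O(n² log n) bound can be illustrated generically. The sketch below is not the paper's implementation: `convolve_all` is a hypothetical helper, and the short sequences stand in for whatever discrete functions are being convolved.

```python
import numpy as np

def convolve_all(seqs):
    """Convolve a list of finite nonnegative sequences via FFT.

    Multiplying the FFTs of zero-padded sequences and inverting gives
    their (multi-way) linear convolution in O(L log L) time.
    """
    out_len = sum(len(s) - 1 for s in seqs) + 1
    n_fft = 1 << (out_len - 1).bit_length()  # next power of two >= out_len
    acc = np.ones(n_fft, dtype=complex)
    for s in seqs:
        acc *= np.fft.fft(s, n_fft)
    res = np.fft.ifft(acc).real[:out_len]
    return np.where(res < 0, 0.0, res)  # clip tiny negative round-off

# sanity check against direct convolution
a, b, c = [1.0, 2.0], [0.5, 0.5], [1.0, 0.0, 3.0]
direct = np.convolve(np.convolve(a, b), c)
assert np.allclose(convolve_all([a, b, c]), direct, atol=1e-9)
```

The same idea, applied to the sequences arising from the closed-form expression of Section 3, yields the stated running time.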
To the best of our knowledge, this is the first provably near-optimal algorithm for assortment optimization under the Mallows or the mixture of Mallows model in such generality.

A mixed integer programming formulation. We present a compact mixed integer linear program (MIP) with O(K·n³) variables, of which O(n) are binary, and O(K·n³) constraints for the constrained assortment optimization problem under a general mixture of Mallows model with K segments. The compact formulation is based on an alternative efficient procedure for computing the choice probabilities of a fixed offer set exactly. In particular, we exploit the repeated insertion method (RIM) introduced by Doignon et al. (2004) for sampling rankings from the Mallows distribution, and show that the choice probabilities of the Mallows model can be expressed as the unique solution of a system of linear equations that can be solved in O(n³) time. This gives us an alternative procedure to efficiently compute the choice probabilities for a fixed offer set exactly.

³ The reduction is from the independent set problem to an assortment optimization problem under a distribution over only n rankings.
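To make the repeated insertion method concrete, here is a minimal sampler for a single Mallows segment. It is a sketch of the standard description of RIM, not code from the paper; `sample_mallows` and its insertion weights are our own naming, under the assumption that inserting the i-th item of the modal ranking at position j creates exactly i − j new pairwise disagreements.

```python
import math
import random

def sample_mallows(center, theta, rng=random):
    """Sample a ranking from a Mallows model via repeated insertion (RIM).

    `center` lists the items of the modal ranking, most preferred first.
    The i-th item is inserted at position j (1 = top) of the partial
    ranking with probability proportional to exp(-theta * (i - j)).
    """
    ranking = []
    for i, item in enumerate(center, start=1):
        # insertion weights for positions j = 1, ..., i
        weights = [math.exp(-theta * (i - j)) for j in range(1, i + 1)]
        u = rng.random() * sum(weights)
        j = 0
        while j < i - 1 and u > weights[j]:
            u -= weights[j]
            j += 1
        ranking.insert(j, item)
    return ranking
```

As θ → ∞ the sampled ranking concentrates on `center`, while θ = 0 yields a uniformly random ranking, matching the behavior of the mixture model described in the introduction.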
While this is less efficient than using the fast Fourier transform (O(n³) versus O(n² log n)), it allows us to formulate a compact MIP for the constrained assortment optimization problem under a general mixture of Mallows model. Our MIP formulation holds for a general mixture of Mallows model and does not require any assumption on the rank of the no-purchase option in the modal rankings. We conduct numerical experiments to test the computational performance of the MIP. In particular, we implement a variable bound strengthening and observe that the MIP is efficient for reasonably sized assortment optimization problems. Therefore, the MIP provides a practical approach for assortment optimization under a general mixture of Mallows model.

1.2 Literature review

A large number of parametric models over rankings have been extensively studied in statistics, transportation, marketing, economics, and operations management (see Marden (1995) for a detailed survey of most of these models). Our work particularly has connections to work in machine learning and operations management. The existing work in machine learning has focused on designing computationally efficient algorithms for estimating the model parameters from commonly available observations (complete rankings, top-k/bottom-k lists, pairwise comparisons, etc.). The developed techniques mainly consist of efficient algorithms for computing the likelihood of the observed data (Lebanon and Mao 2008, Guiver and Snelson 2009) and techniques for sampling from the distributions conditioned on observed data (Lu and Boutilier 2011, Meila et al. 2012). The Plackett-Luce (PL) model, the Mallows model, and their variants have been by far the most studied models in this literature. On the other hand, the work in operations management has mainly focused on designing set optimization algorithms to find the best subset efficiently.
The multinomial logit (MNL) model has been the most commonly studied model in this literature. The MNL model was made popular by the work of McFadden (1978) and has been shown by Yellott (1977) to be equivalent to the PL model, introduced independently by Luce (1959) and Plackett (1975). Given the model parameters, the assortment optimization problem has been shown to be efficiently solvable for the MNL model by Talluri and Van Ryzin (2004), for variants of the nested logit (NL) model by Davis et al. (2014) and Gallego and Topaloglu (2014), and for the Markov chain model by Blanchet et al. (2013). The problem is known to be hard for most other choice models (Bront et al. 2009), so Jagabathula (2014) studies the performance of the local search algorithm for some of the assortment problems that are known to be hard. As mentioned, the literature in operations management has restricted itself to only those models for which
choice probabilities are known to be efficiently computable. In the context of this literature, our key contribution is to extend a popular model from the machine learning literature to choice contexts and the assortment problem.

Outline. The rest of the paper is organized as follows. In Section 2, we formally introduce the model and the notation. In Section 3, we present the closed-form expression for the choice probabilities. In Section 4, we present the PTAS for the assortment optimization problem under the mixture of Mallows model. In Section 5, we introduce an alternative procedure to compute the choice probabilities, on which we build to present the MIP formulation in Section 6. Finally, the computational study is presented in Section 7.

2 Model and problem statement

Notation. We consider a universe U of n products. In order to distinguish products from their corresponding ranks, we let U = {a_1, …, a_n} denote the universe of products, under an arbitrary indexing. Preferences over this universe are captured by an anti-reflexive, anti-symmetric, and transitive relation ≻, which induces a total ordering (or ranking) over all products; specifically, a ≻ b means that a is preferred to b. We represent preferences through rankings or permutations. A complete ranking (or simply a ranking) is a bijection σ : U → [n] that maps each product a ∈ U to its rank σ(a) ∈ [n], where [j] denotes the set {1, 2, …, j} for any integer j. Lower ranks indicate higher preference, so that σ(a) < σ(b) if and only if a ≻_σ b, where ≻_σ denotes the preference relation induced by the ranking σ. For simplicity of notation, we also let σ_i denote the product ranked at position i. Thus, σ_1 ≻ σ_2 ≻ ⋯ ≻ σ_n is the list of the products written in increasing order of their ranks. Finally, for any two integers i ≤ j, let [i, j] denote the set {i, i + 1, …, j}.

Mallows model. The Mallows model is a member of the distance-based ranking family of models (see Murphy and Martin (2003)).
This model is described by a modal ranking ω, which denotes the central or modal permutation, and a concentration parameter θ ∈ R_+, such that the probability of each permutation σ is given by

π(σ) = e^{−θ·d(σ, ω)} / ψ(θ),

where ψ(θ) = Σ_σ exp(−θ·d(σ, ω)) is the normalization constant, and d(·, ·) is the Kendall-Tau metric of
distance between permutations, defined as

d(σ, ω) = Σ_{i<j} 1[(σ(a_i) − σ(a_j)) · (ω(a_i) − ω(a_j)) < 0].

In other words, d(σ, ω) counts the number of pairwise disagreements between the permutations σ and ω. It can be verified that d(·, ·) is a distance function that is right-invariant under the composition of the symmetric group, i.e., d(σ_1 ∘ σ, σ_2 ∘ σ) = d(σ_1, σ_2) for every σ, σ_1, σ_2, where the composition is defined as (σ_1 ∘ σ)(a) = σ_1(σ(a)). This symmetry can be exploited to show that the normalization constant ψ(θ) has a closed-form expression (Marden (1995)) given by

ψ(θ) = Π_{i=1}^{n+1} (1 − e^{−iθ}) / (1 − e^{−θ}).

Note that ψ(θ) depends only on the concentration parameter θ and does not depend on the modal ranking. Intuitively, the Mallows model defines a set of consumers whose preferences are similar, in the sense of being centered around a common permutation, with the probability of deviations therefrom decreasing exponentially. The similarity of consumer preferences is captured by the Kendall-Tau distance metric.

Mixture of Mallows model. The mixture of K Mallows models is given by K segments where, for each segment k = 1, …, K, we are given its probability p_k and a Mallows distribution with modal ranking ω_k and concentration parameter θ_k. Therefore, the probability of any permutation σ in the mixture model is given by

π(σ) = Σ_{k=1}^{K} p_k · e^{−θ_k·d(σ, ω_k)} / ψ(θ_k).

2.1 Problem statement

Choice probabilities computation. We first focus on efficiently computing the probability that a product a is chosen from an offer set S ⊆ U under a given mixture of Mallows model. When offered the subset S, the customer is assumed to sample a preference list according to the mixture of Mallows model and then choose the most preferred product from S according to the sampled list. Therefore, the probability of choosing product a from the offer set S is given by

P(a | S) = Σ_σ π(σ) · 1[σ, a, S],   (1)
9 where 1l[, a, S] indicates whether (a) < (a 0 ) for all a 0 2 S, a 0 6= a. Note that the above sum runs over n! preference lists, meaning that it is a priori unclear if P(a S) can be computed efficiently. Assortment optimization. Once we are able to compute the choice probabilities, we consider the assortment optimization problem. In the assortment optimization problem, each product a has an endogenously fixed price r a. Moreover, there is an additional product a q that represents the outside option (no-purchase), with price r q =0that is always included in the assortment. Let S 2 U be denote a set of feasible assortments. We assume that S satisfied the following assumption: Assumption 1. Let S be the set of feasible assortments. We assume that S satisfies the following properties. (Membership) For any S U, it is easy to test whether S 2 S or not. (Closure) For any S 2 S and T S implies that T 2 S. This is a fairly general assumption satisfied for a large class of constraints including cardinality constraints, multi-dimensional knapsack constraints and matroid constraints. The goal in the assortment optimization problem is to determine a feasible subset of products that maximizes the expected revenue (2): max S2S R(S) = max S2S a2s P(a S [{r q }) r a. (2) We would like to note that even the unconstrained version where S contains all possible subsets of U is hard to approximate within a factor better than O(1/n 1 ) under a general distribution over permutation model (Aouad et al. (2015)). 3 Choice probabilities: closed-form expression In this section, we show that the choice probabilities can be computed efficiently under the Mallows model. Note that this directly give a efficient procedure to compute the choice probabilities under a mixture of Mallows model. Without loss of generality, we assume from this point on that the products are indexed such that the central permutation! ranks product a i at position i, for all i 2 [n]. 
The next theorem shows that, when the offer set is contiguous, the choice probabilities take a rather simple form. Using these expressions as building blocks, we then derive a closed-form expression for general offer sets.
Theorem 3.1 (Contiguous offer set). Suppose S = a_{[i,j]} = {a_i, …, a_j} for some 1 ≤ i ≤ j ≤ n. Then, the probability of choosing product a_k ∈ S under the Mallows model with modal ranking ω and concentration parameter θ is given by

P(a_k | S) = e^{−θ(k−i)} / (1 + e^{−θ} + ⋯ + e^{−θ(j−i)}).

The choice probability under a general offer set has a more involved structure, for which additional notation is needed. For a pair of integers 1 ≤ m ≤ q ≤ n, define

ψ(q, θ) = Π_{s=1}^{q} Σ_{ℓ=0}^{s−1} e^{−ℓθ}   and   ψ(q, m, θ) = ψ(m, θ) · ψ(q − m, θ).

In addition, for a collection of M discrete functions h_m : Z → R, m = 1, …, M, such that h_m(r) = 0 for any r < 0, their discrete convolution is defined as

(h_1 ⋆ ⋯ ⋆ h_M)(r) = Σ_{r_1,…,r_M : Σ_m r_m = r} h_1(r_1) ⋯ h_M(r_M).

Theorem 3.2 (General offer set). Suppose S = a_{[i_1,j_1]} ∪ ⋯ ∪ a_{[i_M,j_M]}, where i_m ≤ j_m for 1 ≤ m ≤ M and j_m < i_{m+1} for 1 ≤ m ≤ M − 1. Let G_m = a_{[j_m+1, i_{m+1}−1]} for 1 ≤ m ≤ M − 1, G = G_1 ∪ ⋯ ∪ G_M, and C = a_{[i_1,j_M]}. Then, the probability of choosing a_k ∈ a_{[i_ℓ,j_ℓ]} can be written as

P(a_k | S) = e^{−θ(k−i_1)} · ψ(|C|, θ) · Π_{m=1}^{M−1} ψ(|G_m|, θ) · (f_0 ⋆ f̃_1 ⋆ ⋯ ⋆ f̃_ℓ ⋆ f_{ℓ+1} ⋆ ⋯ ⋆ f_M)(|G|),

where:
- f_m(r) = e^{−θr(j_m − i_1 + 1 + r/2)} · ψ^{−1}(|G_m|, r, θ), if 0 ≤ r ≤ |G_m|, for 1 ≤ m ≤ M;
- f̃_m(r) = e^{θr} · f_m(r), for 1 ≤ m ≤ M;
- f_0(r) = ψ(|C|, |G| − r, θ) · e^{−θ(|G|−r)²/2} / (1 + e^{−θ} + ⋯ + e^{−θ(|S|−1+r)}), for 0 ≤ r ≤ |G|;
- f_m(r) = 0, for 0 ≤ m ≤ M and any r outside the ranges described above.

Proof. At a high level, deriving the expression for a general offer set involves breaking down the probabilistic event of choosing a_k ∈ S into simpler events for which we can use the expression given in Theorem 3.1, and then combining these expressions using the symmetries of the Mallows distribution.
For a given vector R = (r_0, …, r_M) ∈ Z_+^{M+1} such that r_0 + ⋯ + r_M = |G|, let h(R) be the set of permutations which satisfy the following two conditions: (i) among all the products of S, a_k is the most preferred, and (ii) for all m ∈ [M], there are exactly r_m products from G_m which are preferred to a_k; we denote this subset of products by Ḡ_m for all m ∈ [M]. This implies that there are r_0 products from G which are less preferred than a_k. With this notation,

P(a_k | S) = Σ_{R : r_0 + ⋯ + r_M = |G|} Σ_{σ ∈ h(R)} π(σ).

Recall that for all σ, we have

π(σ) = e^{−θ · Σ_{i,j} δ(σ, i, j)} / ψ(θ),

where δ(σ, i, j) = 1[(σ(a_i) − σ(a_j)) · (ω(a_i) − ω(a_j)) < 0]. For all σ, we can break down the sum in the exponential as follows:

Σ_{i,j} δ(σ, i, j) = C_1(σ) + C_2(σ) + C_3(σ),

where C_1(σ) contains pairs of products (i, j) such that a_i ∈ Ḡ_m for some m ∈ [M] and a_j ∈ S; C_2(σ) contains pairs of products (i, j) such that a_i ∈ Ḡ_m for some m ∈ [M] and a_j ∈ G_{m′} ∖ Ḡ_{m′} for some m′ ≠ m; and C_3(σ) contains the remaining pairs of products. For a fixed R, we show that C_1(σ) and C_2(σ) are constant for all σ ∈ h(R).

Part 1. C_1(σ) counts the number of disagreements (i.e., the number of pairs of products that are oppositely ranked in σ and ω) between some product in S and some product in Ḡ_m for some m ∈ [M]. For all m ∈ [M], a product a_i ∈ Ḡ_m induces a disagreement with every product a_j ∈ S such that j < i. Therefore, the sum of all these disagreements is equal to

C_1(σ) = Σ_{m=1}^{M} Σ_{a_j ∈ S} Σ_{a_i ∈ Ḡ_m} δ(σ, i, j) = Σ_{m=1}^{M} r_m · (j_m − i_1 + 1).
Part 2. C_2(σ) counts the number of disagreements between some product in a set Ḡ_m and some product in a set G_{m′} ∖ Ḡ_{m′} for m′ ≠ m. The sum of all these disagreements is equal to

C_2(σ) = Σ_{m ≠ m′} Σ_{a_i ∈ Ḡ_m} Σ_{a_j ∈ G_{m′} ∖ Ḡ_{m′}} δ(σ, i, j) = Σ_{m=2}^{M} r_m · Σ_{j=1}^{m−1} (|G_j| − r_j).

Consequently, for all σ ∈ h(R), we can write d(σ, ω) = C_1(R) + C_2(R) + C_3(σ), and therefore

P(a_k | S) = Σ_{R : r_0 + ⋯ + r_M = |G|} [e^{−θ·(C_1(R) + C_2(R))} / ψ(θ)] · Σ_{σ ∈ h(R)} e^{−θ·C_3(σ)}.

Computing the inner sum requires a similar but more involved partitioning of the permutations, as well as using Theorem 3.1. The details are presented in Appendix A. In particular, we can show that for a fixed R, the inner sum Σ_{σ ∈ h(R)} e^{−θ·C_3(σ)} is equal to

ψ(|G| − r_0, θ) · ψ(|S| + r_0, θ) · [e^{−θ(k − 1 − Σ_{m=1}^{ℓ−1} r_m)} / (1 + e^{−θ} + ⋯ + e^{−θ(|S| + r_0 − 1)})] · Π_{m=1}^{M} ψ(|G_m|, θ) / (ψ(r_m, θ) · ψ(|G_m| − r_m, θ)).

Putting all the pieces together yields the desired result.

Since this represents P(a | S) as a discrete convolution, we can efficiently compute this probability using the fast Fourier transform in O(n² log n) time (see, for instance, Cooley and Tukey (1965)), which is a dramatic improvement over the exponential sum (1) that defines the choice probabilities. Note that for a mixture of K Mallows models, this implies that we can compute the choice probability P(a | S) in O(K·n² log n) time.

4 A PTAS for the assortment optimization problem

In this section, we present a polynomial time approximation scheme (PTAS) for the assortment optimization problem under the mixture of Mallows model described by (2). In other words, for any accuracy level ε > 0, we compute an assortment with expected revenue at least (1 − ε) times the optimal. For every fixed ε, the
running time is polynomial in n and K. For ease of exposition, we first focus on a single Mallows model and thus drop the index corresponding to the Mallows segment. At the end of the section, we explain how the results extend to a mixture of Mallows model. Before describing the algorithm, we introduce a number of structural properties of the Mallows distribution.

4.1 Probabilistic claims

We first show that for any pair of products (a_i, a_j) such that i < j (i.e., a_i is preferred to a_j in ω), and for a permutation σ drawn from a Mallows distribution, we have P(a_i ≻_σ a_j) ≥ 1/2. Note that when θ = 0, since the distribution is uniform, we have P(a_i ≻_σ a_j) = 1/2. Moreover, when θ → ∞, P(a_i ≻_σ a_j) = 1. Our result extends these extreme cases to all values of θ.

Claim 4.1. For any pair of products (a_i, a_j) such that i < j, if σ is drawn from a Mallows distribution, we have P(a_i ≻_σ a_j) ≥ 1/2.

Proof. Let A = {σ : a_i ≻_σ a_j} and B = {σ : a_j ≻_σ a_i}. We consider the bijection f : A → B which switches a_i and a_j. More precisely, for all σ ∈ A,

f(σ)(a_k) = σ(a_i) if k = j;  σ(a_j) if k = i;  σ(a_k) otherwise.

We show that for all σ ∈ A, d(σ, ω) ≤ d(f(σ), ω), which in turn implies the desired result. Note that for any σ ∈ A, we have

d(f(σ), ω) − d(σ, ω) = 1 + Σ_{k : a_i ≻_σ a_k ≻_σ a_j} [δ(f(σ), i, k) + δ(f(σ), j, k) − δ(σ, i, k) − δ(σ, j, k)],

where δ(σ, i, j) = 1[(σ(a_i) − σ(a_j)) · (ω(a_i) − ω(a_j)) < 0]. Since a_i ≻_ω a_j, we have three cases to consider.

Case 1: a_i ≻_ω a_k ≻_ω a_j. In this case, δ(f(σ), i, k) + δ(f(σ), j, k) = 2 and δ(σ, i, k) + δ(σ, j, k) = 0.
Case 2: a_k ≻_ω a_i. In this case, δ(f(σ), i, k) + δ(f(σ), j, k) = 1 and δ(σ, i, k) + δ(σ, j, k) = 1.
Case 3: a_j ≻_ω a_k. In this case, δ(f(σ), i, k) + δ(f(σ), j, k) = 1 and δ(σ, i, k) + δ(σ, j, k) = 1.
In each case, δ(f(σ), i, k) + δ(f(σ), j, k) − δ(σ, i, k) − δ(σ, j, k) ≥ 0, which concludes the proof.

We also extend this result to a tuple of products (a_{i_1}, …, a_{i_m}). More precisely, if i_1 < ⋯ < i_m (i.e., a_{i_1} is the most preferred product among (a_{i_1}, …, a_{i_m}) in ω), then if σ is drawn from a Mallows distribution, we have P(a_{i_1} ≻_σ a_{i_j}, ∀j ≥ 2) ≥ 1/m. Again, note that when θ = 0, we have P(a_{i_1} ≻_σ a_{i_j}, ∀j ≥ 2) = 1/m. Moreover, when θ → ∞, P(a_{i_1} ≻_σ a_{i_j}, ∀j ≥ 2) = 1.

Claim 4.2. For any tuple of products (a_{i_1}, …, a_{i_m}) such that i_1 < ⋯ < i_m, if σ is drawn from a Mallows distribution, we have P(a_{i_1} ≻_σ a_{i_j}, ∀j ≥ 2) ≥ 1/m.

Proof. Let A_k = {σ : a_{i_k} ≻_σ a_{i_ℓ}, ∀ℓ ≠ k} be the set of permutations in which a_{i_k} appears first among a_{i_1}, …, a_{i_m}. For a fixed pair (k, m) such that k < m, consider the bijection f : A_m → A_k which switches a_{i_m} and a_{i_k}. More precisely, for all σ ∈ A_m,

f(σ)(a_ℓ) = σ(a_{i_k}) if ℓ = i_m;  σ(a_{i_m}) if ℓ = i_k;  σ(a_ℓ) otherwise.

The proof of Claim 4.1 shows that for all σ ∈ A_m, d(σ, ω) ≤ d(f(σ), ω). This in turn implies that for all k < m, P(a_{i_k} ≻_σ a_{i_ℓ}, ∀ℓ ≠ k) ≥ P(a_{i_m} ≻_σ a_{i_ℓ}, ∀ℓ ≠ m), which concludes the proof.

4.2 A PTAS for the assortment optimization problem

We now present a polynomial time approximation scheme (PTAS) for the assortment optimization problem under the Mallows distribution, under an additional assumption. Our algorithm is based on establishing a surprising sparsity property, proving the existence of small near-optimal assortments by crucially utilizing certain symmetries in the distribution over permutations.
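Claim 4.1 can be sanity-checked by exact enumeration for small n. The snippet below is an illustrative check, not part of the paper; `pref_prob` is a hypothetical helper that takes the identity as the modal ranking.

```python
import itertools
import math

def pref_prob(n, theta, i, j):
    """Exact P(a_i preferred to a_j) under a Mallows model with the
    identity as modal ranking, by enumerating all n! rankings (tiny n)."""
    num = den = 0.0
    for sigma in itertools.permutations(range(n)):
        d = sum(1 for u in range(n) for v in range(u + 1, n)
                if sigma[u] > sigma[v])  # Kendall-Tau distance to identity
        w = math.exp(-theta * d)
        den += w
        if sigma[i] < sigma[j]:  # lower rank means more preferred
            num += w
    return num / den

# Claim 4.1: for i < j, P(a_i preferred to a_j) >= 1/2 for every theta >= 0,
# with equality at theta = 0 (uniform distribution)
for theta in (0.0, 0.3, 2.0):
    assert pref_prob(5, theta, i=1, j=3) >= 0.5 - 1e-12
assert abs(pref_prob(5, 0.0, i=1, j=3) - 0.5) < 1e-12
```

The same enumeration, restricted to the event that one product beats all others in a tuple, can be used to spot-check Claim 4.2.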
Description of the algorithm. Let S* be the optimal assortment. Let M = 1/ε, where without loss of generality we assume that M takes an integer value. We enumerate all feasible subsets of U of size at most M and return the best candidate assortment. Algorithm 1 describes the procedure.

Algorithm 1 Enumeration of small assortments
1: Let Ŝ = arg max {R(S) : S ∈ 𝒮, |S| ≤ M}.
2: Return Ŝ.

Assumption 2. The outside option a_q is ranked last in the central permutation ω (i.e., it occupies the last position of ω).

Theorem 4.1. Under Assumption 2, Algorithm 1 is a PTAS for the assortment optimization problem (2) under the Mallows distribution.

Proof. We first argue the correctness of the algorithm, i.e., that the assortment returned is a (1 − ε)-optimal solution. Again, let S* be the optimal assortment. Note that if |S*| < M, then S* is one of the candidate assortments we examine, and therefore the algorithm returns the optimal solution. We therefore assume that |S*| ≥ M. In this case, let S′ consist of the M highest-revenue products of S*. Note that S′ is among the candidate assortments constructed by the algorithm. Moreover, it is a feasible assortment by Assumption 1. We show that S′ is (1 − ε)-optimal using a sample-path analysis. In particular, let σ be a fixed preference list, and let R(σ, S) be the revenue obtained from σ when assortment S is offered, i.e.,

R(σ, S) = Σ_{a ∈ S} 1[σ, a, S] · r_a.

We consider two cases.

Case 1. We first assume that a_q ≻_σ a_i for all a_i ∈ S′. In this case, R(σ, S′) = 0. On the other hand, offering a single product is always a feasible solution. Therefore, by Claim 4.1,

R(S*) ≥ R({a_{i*}}) = P(a_{i*} ≻_σ a_q) · r_{i*} ≥ r_{i*}/2,

where a_{i*} is the highest-price product. Moreover, note that all products in S* ∖ S′ have revenue at most r_{i*}. Therefore, any product in S* ∖ S′ has revenue at most 2·R(S*). This implies that R(σ, S*) ≤ 2·R(S*).

Case 2. In this case, we assume that in the permutation σ, there exists a product a_i ∈ S′ such that a_i ≻_σ a_q and R(σ, S′) = r_i. We show that R(σ, S′) ≥ R(σ, S*).
Indeed, suppose that there exists a product $a_j \in S^* \setminus S'$ such that $a_j \succ_\sigma a_i$. Since $S'$ contains the $M$ highest-revenue products of $S^*$, it must be that $r_j \leq r_i$. Therefore, $R(\sigma, S^*) \leq R(\sigma, S')$.

We now combine the two cases. For Case 1 to happen, note that $a_q$ has to be preferred to all products from $S'$. By Claim 4.2, this event occurs with probability at most $1/|S'| = 1/M = \epsilon$. Consequently,
$$R(S^*) - R(S') = P(\text{Case 1})\, E[R(\sigma,S^*) - R(\sigma,S') \mid \text{Case 1}] + P(\text{Case 2})\, \underbrace{E[R(\sigma,S^*) - R(\sigma,S') \mid \text{Case 2}]}_{\leq\, 0}$$
$$\leq P(\text{Case 1})\, E[R(\sigma,S^*) \mid \text{Case 1}] \leq \epsilon \cdot 2R(S^*),$$
so that $R(S') \geq (1 - 2\epsilon)\, R(S^*)$, and the claimed guarantee follows after rescaling $\epsilon$.

From a running time perspective, the number of candidate assortments is $O(n^{1/\epsilon})$. By Theorem 3.2, we can compute $R(S)$ for any assortment $S$ in $O(n^3 \log n)$ time. Therefore, the overall running time of the algorithm is $O(n^{1/\epsilon} \cdot n^3 \log n)$.

Extension to a mixture of Mallows model. The PTAS extends to a mixture of Mallows model as long as Assumption 2 holds for each segment. Indeed, since the probabilistic claims hold for each segment, we can adapt the proof of Theorem 4.1 to the case of a mixture of Mallows model. Moreover, we emphasize that the running time scales linearly in $K$, the number of segments of the mixture. More precisely, Algorithm 1 returns a $(1-\epsilon)$-optimal assortment in $O(K \cdot n^{1/\epsilon} \cdot n^3 \log n)$ time.

The PTAS presented in this section provides an approximation algorithm with provable guarantees for a special case of a mixture of Mallows model. However, it requires Assumption 2 to hold. We next present an alternative way of solving the assortment optimization problem using a MIP formulation, which does not require any assumption. To this end, we first present an alternative method for computing the choice probabilities by means of dynamic programming.

5 Choice probabilities: a dynamic programming approach

In what follows, we present an alternative algorithm for computing the choice probabilities under a Mallows model. Again, note that this directly implies an efficient algorithm for computing the choice probabilities under a mixture of Mallows model.
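Before developing the dynamic program, the enumeration of Algorithm 1 can be sketched end to end on a toy instance. In this sketch $R(S)$ is computed by brute-force summation over all permutations, which is feasible only for very small $n$ (the paper instead uses the $O(n^3 \log n)$ procedure of Theorem 3.2); the instance, the value of $\epsilon$, and the function names are illustrative assumptions only. Products are labeled $0,\ldots,n-1$, and the outside option, with revenue 0, is ranked last in $\omega$ as in Assumption 2.

```python
import math
from itertools import combinations, permutations

def mallows_pmf(n, theta):
    """Exact Mallows distribution over permutations of range(n); the central
    ranking is the identity, so the distance is the number of inversions."""
    def inversions(s):
        return sum(1 for i in range(len(s)) for j in range(i + 1, len(s))
                   if s[i] > s[j])
    w = {s: math.exp(-theta * inversions(s)) for s in permutations(range(n))}
    z = sum(w.values())
    return {s: x / z for s, x in w.items()}

def expected_revenue(assortment, revenues, pmf, outside):
    """R(S): a customer draws a ranking and buys the most preferred offered
    product; the outside option yields revenue 0."""
    offered = set(assortment) | {outside}
    total = 0.0
    for ranking, p in pmf.items():
        first = next(v for v in ranking if v in offered)
        if first != outside:
            total += p * revenues[first]
    return total

def algorithm_1(revenues, pmf, outside, eps):
    """Enumerate every assortment of size at most M = ceil(1/eps); keep the best."""
    products = [i for i in range(len(revenues)) if i != outside]
    m = math.ceil(1.0 / eps)
    best, best_rev = set(), 0.0
    for size in range(1, min(m, len(products)) + 1):
        for cand in combinations(products, size):
            rev = expected_revenue(cand, revenues, pmf, outside)
            if rev > best_rev:
                best, best_rev = set(cand), rev
    return best, best_rev

# Toy instance: 4 products plus the outside option (index 4, ranked last in omega).
revenues = [0.8, 0.5, 0.9, 0.3, 0.0]
pmf = mallows_pmf(5, theta=0.5)
assortment, revenue = algorithm_1(revenues, pmf, outside=4, eps=0.5)  # M = 2
```

With $\epsilon = 0.5$, only assortments of size at most $M = 2$ are examined, yet by the sample-path argument above the returned revenue is within a $(1 - 2\epsilon)$ factor of the optimum.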
Our approach is based on an efficient procedure for sampling a random permutation according to a Mallows model with modal ranking $\omega$ and concentration parameter $\theta$. The random permutation is constructed sequentially, as explained in Algorithm 2.

Algorithm 2 Repeated insertion procedure
1: Let $\sigma = (a_1)$.
2: For $i = 2, \ldots, n$, insert $a_i$ into $\sigma$ at position $s \in \{1, \ldots, i\}$ with probability
$$\pi_{i,s} = \frac{e^{-\theta(i-s)}}{1 + e^{-\theta} + \cdots + e^{-\theta(i-1)}}.$$
3: Return $\sigma$.

Lemma 5.1 (Theorem 3 in Lu and Boutilier (2011)). The repeated insertion procedure generates a random sample from a Mallows distribution with modal ranking $\omega$ and concentration parameter $\theta$.

Based on the correctness of this procedure, we describe a dynamic program to compute the choice probabilities of a general offer set $S$. The key idea is to decompose these probabilities according to the position at which a product is chosen. In particular, for $i \leq m$ and $s \in [m]$, let $\pi(i,s,m)$ denote the probability that product $a_i$ is chosen (i.e., appears first among products in $S$) at position $s$ after the $m$-th step of Algorithm 2. In other words, $\pi(i,s,m)$ corresponds to a choice probability when restricting $U$ to the first $m$ products $a_1, \ldots, a_m$. With this notation, we have for all $i \in [n]$,
$$P(a_i \mid S) = \sum_{s=1}^{n} \pi(i,s,n).$$
We compute $\pi(i,s,m)$ iteratively for $m = 1, \ldots, n$. In particular, in order to compute $\pi(i,s,m+1)$, we use the correctness of the sampling procedure. Specifically, starting from a permutation that includes the products $a_1, \ldots, a_m$, the product $a_{m+1}$ is inserted at position $j$ with probability $\pi_{m+1,j}$, and we have two cases to consider.

Case 1: $a_{m+1} \notin S$. In this case, $\pi(m+1, s, m+1) = 0$ for all $s = 1, \ldots, m+1$. Consider a product $a_i$ for $i \leq m$. In order for $a_i$ to be chosen at position $s$ after $a_{m+1}$ is inserted, one of the following events has to occur: (i) $a_i$ was already chosen at position $s$ before $a_{m+1}$ is inserted, and $a_{m+1}$ is inserted at a position $\ell > s$; or (ii) $a_i$ was chosen at position $s-1$, and $a_{m+1}$ is inserted at a position $\ell \leq s-1$.
Consequently, we have for all $i \leq m$,
$$\pi(i,s,m+1) = \sum_{\ell=s+1}^{m+1} \pi_{m+1,\ell}\, \pi(i,s,m) + \sum_{\ell=1}^{s-1} \pi_{m+1,\ell}\, \pi(i,s-1,m) = (1 - \Pi_{m+1,s})\, \pi(i,s,m) + \Pi_{m+1,s-1}\, \pi(i,s-1,m),$$
where $\Pi_{m,s} = \sum_{\ell=1}^{s} \pi_{m,\ell}$ for all $m, s$.

Case 2: $a_{m+1} \in S$. Consider a product $a_i$ with $i \leq m$. This product is chosen at position $s$ only if it was already chosen at position $s$ and $a_{m+1}$ is inserted at a position $\ell > s$. Therefore, for all $i \leq m$,
$$\pi(i,s,m+1) = (1 - \Pi_{m+1,s})\, \pi(i,s,m).$$
For product $a_{m+1}$, it is chosen at position $s$ only if the product previously chosen among $a_1, \ldots, a_m$ appears at a position $\ell \geq s$ and $a_{m+1}$ is inserted at position $s$, implying that
$$\pi(m+1,s,m+1) = \pi_{m+1,s} \sum_{i \leq m} \sum_{\ell=s}^{n} \pi(i,\ell,m).$$
Algorithm 3 summarizes this procedure.

Algorithm 3 Computing choice probabilities
1: Let $S$ be a general offer set. Without loss of generality, we assume that $a_1 \in S$.
2: Let $\pi(1,1,1) = 1$.
3: For $m = 1, \ldots, n-1$:
   (a) For all $i \leq m$ and $s = 1, \ldots, m+1$, let
   $$\pi(i,s,m+1) = (1 - \Pi_{m+1,s})\, \pi(i,s,m) + 1\!\mathrm{l}[a_{m+1} \notin S]\, \Pi_{m+1,s-1}\, \pi(i,s-1,m).$$
   (b) For $s = 1, \ldots, m+1$, let
   $$\pi(m+1,s,m+1) = 1\!\mathrm{l}[a_{m+1} \in S]\, \pi_{m+1,s} \sum_{i \leq m} \sum_{\ell=s}^{n} \pi(i,\ell,m).$$
4: For all $i \in [n]$, return $P(a_i \mid S) = \sum_{s=1}^{n} \pi(i,s,n)$.

Theorem 5.2. For any offer set $S$, Algorithm 3 returns the choice probabilities under a Mallows distribution with modal ranking $\omega$ and concentration parameter $\theta$.

This dynamic programming approach provides an $O(n^3)$ time algorithm for computing $P(a \mid S)$ for all products $a \in S$ simultaneously. Moreover, as explained in the next section, these ideas lead to an algorithm for solving the assortment optimization problem.
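The recursion behind Algorithm 3 is short to implement, and on small instances it can be validated against direct summation over all $n!$ permutations. The sketch below (function names are illustrative) uses the identity as the modal ranking and, as in Algorithm 3, assumes $a_1 \in S$; the tail sums are recomputed naively, so this transcription runs in $O(n^4)$ rather than the $O(n^3)$ achievable with prefix sums.

```python
import math
from itertools import permutations

def insertion_probs(theta, n):
    """pi[i][s] = probability that step i of Algorithm 2 inserts a_i at position s."""
    pi = {}
    for i in range(1, n + 1):
        denom = sum(math.exp(-theta * l) for l in range(i))
        pi[i] = {s: math.exp(-theta * (i - s)) / denom for s in range(1, i + 1)}
    return pi

def choice_probs(S, theta, n):
    """Algorithm 3: P(a_i | S) for every i in S; assumes 1 in S (products 1-indexed)."""
    assert 1 in S
    pi = insertion_probs(theta, n)
    # cum[i][s] = Pi_{i,s} = sum_{l <= s} pi[i][l]; equals 1 once s >= i
    cum = {}
    for i in range(1, n + 1):
        cum[i] = {0: 0.0}
        for s in range(1, n + 1):
            cum[i][s] = cum[i][s - 1] + pi[i].get(s, 0.0)
    p = {i: {s: 0.0 for s in range(1, n + 1)} for i in range(1, n + 1)}
    p[1][1] = 1.0  # pi(1, 1, 1) = 1
    for m in range(1, n):
        new = {i: {s: 0.0 for s in range(1, n + 1)} for i in range(1, n + 1)}
        for i in range(1, m + 1):
            for s in range(1, m + 2):
                val = (1.0 - cum[m + 1][s]) * p[i][s]       # a_{m+1} inserted after s
                if (m + 1) not in S and s >= 2:             # a_i shifted from s-1 to s
                    val += cum[m + 1][s - 1] * p[i][s - 1]
                new[i][s] = val
        if (m + 1) in S:  # a_{m+1} chosen at s iff inserted there, ahead of the old choice
            for s in range(1, m + 2):
                tail = sum(p[i][l] for i in range(1, m + 1) for l in range(s, n + 1))
                new[m + 1][s] = pi[m + 1][s] * tail
        p = new
    return {i: sum(p[i].values()) for i in sorted(S)}

def brute_force(S, theta, n):
    """Direct summation over all n! permutations, for validation."""
    def inversions(seq):
        return sum(1 for a in range(n) for b in range(a + 1, n) if seq[a] > seq[b])
    w = {s: math.exp(-theta * inversions(s)) for s in permutations(range(1, n + 1))}
    z = sum(w.values())
    out = {i: 0.0 for i in S}
    for seq, wt in w.items():
        out[next(v for v in seq if v in S)] += wt / z
    return out

probs = choice_probs({1, 3}, theta=0.4, n=3)
```

For example, on $S = \{a_1, a_3\}$ with $n = 3$, summing over all six permutations gives $P(a_3 \mid S) = (2e^{-2\theta} + e^{-3\theta})/((1+e^{-\theta})(1+e^{-\theta}+e^{-2\theta}))$, and the recursion reproduces this value.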
6 Assortment optimization: an integer programming formulation

Building on Algorithm 3 and introducing a binary variable for each product, we can reformulate the assortment optimization problem (2) under a mixture of Mallows model as a mixed integer program (MIP). Although the MIP formulation does not enjoy the theoretical guarantees of the PTAS (i.e., an upper bound on the running time), it does not require Assumption 2 to hold. Again, we start with a single Mallows model. In particular, we give a MIP with only $O(n^3)$ variables and constraints, of which only $n$ are 0-1 variables. We assume for simplicity that the first product of $S$ (say $a_1$) is known. Since this product is generally not known a priori, in order to obtain an optimal solution to problem (2), we need to guess the first offered product and solve the integer program below for each of the $O(n)$ guesses. We note that the MIP formulation is presented for the unconstrained assortment optimization problem but is quite powerful and can handle a large class of constraints on the assortment (such as cardinality and capacity constraints).

Theorem 6.1. Conditional on $a_1 \in S$, the following mixed integer program (MIP) computes an optimal solution to the unconstrained assortment optimization problem under a Mallows model:

max $\sum_{i,s} r_i\, \pi(i,s,n)$
s.t. $\pi(1,1,1) = 1$, $\pi(1,s,1) = 0$, $\forall s = 2,\ldots,n$
     $\pi(i,s,m+1) = (1 - \Pi_{m+1,s})\, \pi(i,s,m) + y_{i,s,m+1}$, $\forall i, s, m$
     $\pi(m+1,s,m+1) = z_{s,m+1}$, $\forall s, m$
     $y_{i,s,m+1} \leq \Pi_{m+1,s-1}\, \pi(i,s-1,m)$, $\forall i, s, m$
     $0 \leq y_{i,s,m+1} \leq \Pi_{m+1,s-1}\, (1 - x_{m+1})$, $\forall i, s, m$
     $z_{s,m+1} \leq \pi_{m+1,s} \sum_{\ell=s}^{n} \sum_{i=1}^{m} \pi(i,\ell,m)$, $\forall s, m$
     $0 \leq z_{s,m+1} \leq \pi_{m+1,s}\, x_{m+1}$, $\forall s, m$
     $x_1 = 1$, $x_q = 1$, $x_m \in \{0,1\}$

Proof. Let $x = (x_1,\ldots,x_n)$ be a feasible binary vector for the MIP and let $S = \{a_i : x_i = 1\}$. Note that there is a one-to-one correspondence between feasible vectors $x$ for the MIP and feasible assortments $S$ such
that $a_1 \in S$ and $a_q \in S$. Consequently, we can rewrite the MIP as

max$_{S \subseteq U,\, a_q \in S}$ max $\sum_{i,s} r_i\, \pi(i,s,n)$
s.t. $\pi(1,1,1) = 1$
     $\pi(i,s,m+1) = (1 - \Pi_{m+1,s})\, \pi(i,s,m) + y_{i,s,m+1}$, $\forall i, s, m$
     $\pi(m+1,s,m+1) = z_{s,m+1}$, $\forall s, m$
     $0 \leq y_{i,s,m+1} \leq 1\!\mathrm{l}[a_{m+1} \notin S]\, \Pi_{m+1,s-1}\, \pi(i,s-1,m)$, $\forall i, s, m$
     $0 \leq z_{s,m+1} \leq 1\!\mathrm{l}[a_{m+1} \in S]\, \pi_{m+1,s} \sum_{\ell=s}^{n} \sum_{i=1}^{m} \pi(i,\ell,m)$, $\forall s, m$

Note that it is always optimal to set $y_{i,s,m}$ and $z_{s,m}$ at their upper bounds because all the coefficients in the objective function are non-negative. The correctness of Algorithm 3 then shows that the MIP is an equivalent formulation of the unconstrained assortment optimization problem under a Mallows model.

We now present the MIP formulation for the unconstrained assortment optimization problem under a mixture of Mallows model. Again, we emphasize that the binary variables allow capturing a wide variety of constraints.

Theorem 6.2. Conditional on $a_1 \in S$, the following mixed integer program (MIP) computes an optimal solution to the unconstrained assortment optimization problem under a mixture of Mallows model:

max $\sum_{i,s,k} r_i\, p_k\, \pi^k(i,s,n)$
s.t. $\pi^k(1,1,1) = 1$, $\pi^k(1,s,1) = 0$, $\forall s = 2,\ldots,n$, $\forall k$
     $\pi^k(i,s,m+1) = (1 - \Pi^k_{m+1,s})\, \pi^k(i,s,m) + y^k_{i,s,m+1}$, $\forall i, s, k, m$
     $\pi^k(m+1,s,m+1) = z^k_{s,m+1}$, $\forall s, k, m$
     $y^k_{i,s,m+1} \leq \Pi^k_{m+1,s-1}\, \pi^k(i,s-1,m)$, $\forall i, s, k, m$
     $0 \leq y^k_{i,s,m+1} \leq \Pi^k_{m+1,s-1}\, (1 - x_{\omega_k(a_{m+1})})$, $\forall i, s, k, m$
     $z^k_{s,m+1} \leq \pi^k_{m+1,s} \sum_{\ell=s}^{n} \sum_{i=1}^{m} \pi^k(i,\ell,m)$, $\forall s, k, m$
     $0 \leq z^k_{s,m+1} \leq \pi^k_{m+1,s}\, x_{\omega_k(a_{m+1})}$, $\forall s, k, m$
     $x_1 = 1$, $x_q = 1$, $x_m \in \{0,1\}$

7 Numerical experiments

In this section, we examine how the MIP performs in terms of running time. We consider the following simulation setup for a single Mallows model. Product prices are sampled independently and uniformly at random from the interval $[0,1]$. The modal ranking is fixed to the identity ranking with the outside option ranked at the top. The outside option being ranked at the top is characteristic of applications in which the retailer captures a small fraction of the market and the outside option represents the (much larger) rest of the market. Indeed, most of the customers visiting a website or a store leave without making a purchase. Because the outside option is always offered, we need to solve only a single instance of the MIP (described in Theorem 6.1). Note that in the more general setting, the number of MIPs that must be solved equals the minimum of the rank of the outside option and the rank of the highest-revenue item. Because the MIPs are independent of each other, they can be solved in parallel. We solved the MIPs using the Gurobi Optimizer on a computer with a 2.4GHz Intel Core i5 processor, 8GB of RAM, and the Mac OS El Capitan operating system. In order to improve the running time of the MIP, we strengthened some of the constraints, as described next.

Strengthening of the MIP formulation. We use structural properties of the optimal solution to tighten some of the upper bounds involving the binary variables in the MIP formulation. In particular, for all $i$, $s$, and $m$, we replace the constraint
$$y_{i,s,m+1} \leq \Pi_{m+1,s-1}\, (1 - x_{m+1})$$
by the constraint
$$y_{i,s,m+1} \leq \Pi_{m+1,s-1}\, u_{i,s,m}\, (1 - x_{m+1}),$$
where $u_{i,s,m}$ is the probability that product $a_i$ is selected at position $s-1$ after the $m$-th step of Algorithm 2 when the offer set is $S = \{a_i, a_{i^*}, a_q\}$, i.e., when apart from $a_i$ only the highest-priced product is offered.
Since we know that the highest-priced product is always offered in the optimal assortment, this is a valid upper bound on $\pi(i,s-1,m)$, and therefore a valid strengthening of the constraint. Similarly, for all $s$ and $m$, we replace the constraint
$$z_{s,m+1} \leq \pi_{m+1,s}\, x_{m+1}$$
by the constraint
$$z_{s,m+1} \leq \pi_{m+1,s}\, v_{s,m}\, x_{m+1},$$
where $v_{s,m}$ is the probability that the selected product appears at some position $\ell = s,\ldots,n$ after the $m$-th step of Algorithm 2, when the offer set is $S = \{a_q\}$ if $a_{m+1} \succ_\omega a_{i^*}$, and $S = \{a_q, a_{i^*}\}$ otherwise. Again using the fact that the highest-priced product is always offered in the optimal assortment, we can show that this is a valid upper bound.

Results and discussion. Table 1 shows the running time of the strengthened MIP formulation for different values of $e^{-\theta}$ and $n$. For each pair of parameters, we generated 50 different instances.

Table 1: Running time (average and maximum, in seconds) of the strengthened MIP for various values of $e^{-\theta}$ and $n$.

We note that the strengthening improves the running time considerably: under the initial formulation, the MIP did not terminate after several hours for $n = 25$, whereas it terminated within a few minutes with the additional strengthening. The MIP obtains the optimal solution in a reasonable amount of time for the considered parameter values. Outside of this range, i.e., when $e^{-\theta}$ is too small or when $n$ is too large, numerical instabilities may arise. The strengthening we propose is one way to improve the running time of the MIP, but other numerical optimization techniques may improve it further. Finally, we emphasize that the MIP formulation is valuable because of its flexibility in handling versatile business constraints (such as cardinality or capacity constraints) that naturally
arise in practice.

References

Aouad, A., V. Farias, R. Levi, D. Segev. The approximability of assortment optimization under ranking preferences. Available at SSRN.
Awasthi, P., A. Blum, O. Sheffet, A. Vijayaraghavan. Learning mixtures of ranking models. Advances in Neural Information Processing Systems.
Blanchet, J., G. Gallego, V. Goyal. A Markov chain approximation to choice modeling. Proceedings of the ACM Conference on Electronic Commerce (EC).
Brightwell, G., P. Winkler. Counting linear extensions is #P-complete. STOC '91: Proceedings of the Twenty-Third Annual ACM Symposium on Theory of Computing.
Bront, J. J. M., I. Méndez-Díaz, G. Vulcano. A column generation algorithm for choice-based network revenue management. Operations Research 57(3).
Cooley, J., J. Tukey. An algorithm for the machine calculation of complex Fourier series. Mathematics of Computation 19(90).
Davis, J. M., G. Gallego, H. Topaloglu. Assortment optimization under variants of the nested logit model. Operations Research 62(2).
Doignon, J., A. Pekeč, M. Regenwetter. The repeated insertion model for rankings: Missing link between two subset choice models. Psychometrika 69(1).
Dwork, C., R. Kumar, M. Naor, D. Sivakumar. Rank aggregation methods for the web. Proceedings of the 10th International Conference on World Wide Web.
Farias, V., S. Jagabathula, D. Shah. 2013a. A new approach to modeling choice with limited data. Management Science 59(2).
Farias, V., S. Jagabathula, D. Shah. 2013b. A nonparametric approach to modeling choice with limited data. Management Science 59(2).
Gallego, G., H. Topaloglu. Constrained assortment optimization for the nested logit model. Management Science 60(10).
Guiver, J., E. Snelson. Bayesian inference for Plackett-Luce ranking models. Proceedings of the 26th Annual International Conference on Machine Learning.
Honhon, D., S. Jonnalagedda, X. Pan. Optimal algorithms for assortment selection under ranking-based consumer choice models. Manufacturing & Service Operations Management 14(2).
Jagabathula, S. Assortment optimization under general choice. Available at SSRN.
Lebanon, G., Y. Mao. Non-parametric modeling of partially ranked data. Journal of Machine Learning Research.
Lu, T., C. Boutilier. 2011. Learning Mallows models with pairwise preferences. Proceedings of the 28th International Conference on Machine Learning.
Luce, R. D. Individual Choice Behavior: A Theoretical Analysis. Wiley.
Mallows, C. Non-null ranking models. Biometrika 44(1).
Marden, J. Analyzing and Modeling Rank Data. Chapman and Hall.
McFadden, D. Modeling the choice of residential location. Transportation Research Record (672).
McFadden, D., K. Train. Mixed MNL models for discrete response. Journal of Applied Econometrics 15(5).
Meila, M., K. Phadnis, A. Patterson, J. Bilmes. Consensus ranking under the exponential model. arXiv preprint.
Murphy, T., D. Martin. Mixtures of distance-based models for ranking data. Computational Statistics & Data Analysis 41(3).
Plackett, R. L. The analysis of permutations. Applied Statistics 24(2).
Talluri, K., G. van Ryzin. Revenue management under a general discrete choice model of consumer behavior. Management Science 50(1).
Train, K. Discrete Choice Methods with Simulation. Cambridge University Press, New York, NY.
van Ryzin, G., G. Vulcano. A market discovery algorithm to estimate a general class of nonparametric choice models. Management Science 61(2).
Williams, H. C. W. L. On the formation of travel demand models and economic evaluation measures of user benefit. Environment and Planning A 3(9).
Yellott, J. I. The relationship between Luce's choice axiom, Thurstone's theory of comparative judgment, and the double exponential distribution. Journal of Mathematical Psychology 15(2).

A Proof of Theorem 3.2 (continued)

In this section, we prove that for a fixed $R$, $\sum_{\sigma \in h(R)} e^{-\theta\, c_3(\sigma)}$ is equal to
$$\frac{\psi(|G_{m_0}|, \theta)}{\psi(|S| + m_0, \theta)} \cdot \frac{e^{-\theta\,(k - 1 - \sum_{m=1}^{\ell-1} r_m)}}{1 + \cdots + e^{-\theta(|S| + m_0 - 1)}} \prod_{m=1}^{M} \frac{\psi(|G_m|, \theta)}{\psi(r_m, \theta)\, \psi(|G_m| - r_m, \theta)}.$$
More informationCurve Fitting Re-visited, Bishop1.2.5
Curve Fitting Re-visited, Bishop1.2.5 Maximum Likelihood Bishop 1.2.5 Model Likelihood differentiation p(t x, w, β) = Maximum Likelihood N N ( t n y(x n, w), β 1). (1.61) n=1 As we did in the case of the
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear
More informationDo Markov-Switching Models Capture Nonlinearities in the Data? Tests using Nonparametric Methods
Do Markov-Switching Models Capture Nonlinearities in the Data? Tests using Nonparametric Methods Robert V. Breunig Centre for Economic Policy Research, Research School of Social Sciences and School of
More informationarxiv: v2 [math.oc] 22 Dec 2017
Managing Appointment Booking under Customer Choices Nan Liu 1, Peter M. van de Ven 2, and Bo Zhang 3 arxiv:1609.05064v2 [math.oc] 22 Dec 2017 1 Operations Management Department, Carroll School of Management,
More informationApproximation Algorithms for Re-optimization
Approximation Algorithms for Re-optimization DRAFT PLEASE DO NOT CITE Dean Alderucci Table of Contents 1.Introduction... 2 2.Overview of the Current State of Re-Optimization Research... 3 2.1.General Results
More informationGaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008
Gaussian processes Chuong B Do (updated by Honglak Lee) November 22, 2008 Many of the classical machine learning algorithms that we talked about during the first half of this course fit the following pattern:
More informationBayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2016
Bayesian Networks: Construction, Inference, Learning and Causal Interpretation Volker Tresp Summer 2016 1 Introduction So far we were mostly concerned with supervised learning: we predicted one or several
More informationA column generation algorithm for choice-based network revenue management
A column generation algorithm for choice-based network revenue management Juan José Miranda Bront Isabel Méndez-Díaz Gustavo Vulcano April 30, 2007 Abstract In the last few years, there has been a trend
More informationarxiv: v2 [cs.dm] 30 Aug 2018
Assortment Optimization under the Sequential Multinomial Logit Model Alvaro Flores Gerardo Berbeglia Pascal Van Hentenryck arxiv:1707.02572v2 [cs.dm] 30 Aug 2018 Friday 31 st August, 2018 Abstract We study
More informationLearning MN Parameters with Alternative Objective Functions. Sargur Srihari
Learning MN Parameters with Alternative Objective Functions Sargur srihari@cedar.buffalo.edu 1 Topics Max Likelihood & Contrastive Objectives Contrastive Objective Learning Methods Pseudo-likelihood Gradient
More informationLearning Combinatorial Functions from Pairwise Comparisons
Learning Combinatorial Functions from Pairwise Comparisons Maria-Florina Balcan Ellen Vitercik Colin White Abstract A large body of work in machine learning has focused on the problem of learning a close
More informationChapter 3: Discrete Optimization Integer Programming
Chapter 3: Discrete Optimization Integer Programming Edoardo Amaldi DEIB Politecnico di Milano edoardo.amaldi@polimi.it Website: http://home.deib.polimi.it/amaldi/opt-16-17.shtml Academic year 2016-17
More informationLecture 1: March 7, 2018
Reinforcement Learning Spring Semester, 2017/8 Lecture 1: March 7, 2018 Lecturer: Yishay Mansour Scribe: ym DISCLAIMER: Based on Learning and Planning in Dynamical Systems by Shie Mannor c, all rights
More informationBayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2014
Bayesian Networks: Construction, Inference, Learning and Causal Interpretation Volker Tresp Summer 2014 1 Introduction So far we were mostly concerned with supervised learning: we predicted one or several
More informationStochastic Submodular Cover with Limited Adaptivity
Stochastic Submodular Cover with Limited Adaptivity Arpit Agarwal Sepehr Assadi Sanjeev Khanna Abstract In the submodular cover problem, we are given a non-negative monotone submodular function f over
More information1 [15 points] Frequent Itemsets Generation With Map-Reduce
Data Mining Learning from Large Data Sets Final Exam Date: 15 August 2013 Time limit: 120 minutes Number of pages: 11 Maximum score: 100 points You can use the back of the pages if you run out of space.
More informationCS264: Beyond Worst-Case Analysis Lecture #15: Smoothed Complexity and Pseudopolynomial-Time Algorithms
CS264: Beyond Worst-Case Analysis Lecture #15: Smoothed Complexity and Pseudopolynomial-Time Algorithms Tim Roughgarden November 5, 2014 1 Preamble Previous lectures on smoothed analysis sought a better
More informationA minimalist s exposition of EM
A minimalist s exposition of EM Karl Stratos 1 What EM optimizes Let O, H be a random variables representing the space of samples. Let be the parameter of a generative model with an associated probability
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate
More informationRandomized algorithms for lexicographic inference
Randomized algorithms for lexicographic inference R. Kohli 1 K. Boughanmi 1 V. Kohli 2 1 Columbia Business School 2 Northwestern University 1 What is a lexicographic rule? 2 Example of screening menu (lexicographic)
More informationOutline. The binary choice model. The multinomial choice model. Extensions of the basic choice model
Outline The binary choice model Illustration Specification of the binary choice model Interpreting the results of binary choice models ME output The multinomial choice model Illustration Specification
More informationEstimating Primary Demand in Bike-sharing Systems
Estimating Primary Demand in Bike-sharing Systems Chong Yang Goh Operations Research Center, Massachusetts Institute of Technology cygoh@mit.edu Chiwei Yan Operations Research Center, Massachusetts Institute
More informationSeparation Techniques for Constrained Nonlinear 0 1 Programming
Separation Techniques for Constrained Nonlinear 0 1 Programming Christoph Buchheim Computer Science Department, University of Cologne and DEIS, University of Bologna MIP 2008, Columbia University, New
More informationCS168: The Modern Algorithmic Toolbox Lecture #6: Regularization
CS168: The Modern Algorithmic Toolbox Lecture #6: Regularization Tim Roughgarden & Gregory Valiant April 18, 2018 1 The Context and Intuition behind Regularization Given a dataset, and some class of models
More informationDistributed Optimization. Song Chong EE, KAIST
Distributed Optimization Song Chong EE, KAIST songchong@kaist.edu Dynamic Programming for Path Planning A path-planning problem consists of a weighted directed graph with a set of n nodes N, directed links
More informationManaging Appointment Scheduling under Patient Choices
Submitted to manuscript (Please, provide the manuscript number!) Authors are encouraged to submit new papers to INFORMS journals by means of a style file template, which includes the journal title. However,
More informationOn the Problem of Error Propagation in Classifier Chains for Multi-Label Classification
On the Problem of Error Propagation in Classifier Chains for Multi-Label Classification Robin Senge, Juan José del Coz and Eyke Hüllermeier Draft version of a paper to appear in: L. Schmidt-Thieme and
More informationKernel Methods and Support Vector Machines
Kernel Methods and Support Vector Machines Oliver Schulte - CMPT 726 Bishop PRML Ch. 6 Support Vector Machines Defining Characteristics Like logistic regression, good for continuous input features, discrete
More informationChapter 3: Discrete Optimization Integer Programming
Chapter 3: Discrete Optimization Integer Programming Edoardo Amaldi DEIB Politecnico di Milano edoardo.amaldi@polimi.it Sito web: http://home.deib.polimi.it/amaldi/ott-13-14.shtml A.A. 2013-14 Edoardo
More informationPreference Elicitation for Sequential Decision Problems
Preference Elicitation for Sequential Decision Problems Kevin Regan University of Toronto Introduction 2 Motivation Focus: Computational approaches to sequential decision making under uncertainty These
More informationScheduling Markovian PERT networks to maximize the net present value: new results
Scheduling Markovian PERT networks to maximize the net present value: new results Hermans B, Leus R. KBI_1709 Scheduling Markovian PERT networks to maximize the net present value: New results Ben Hermans,a
More informationAn Improved Approximation Algorithm for Virtual Private Network Design
An Improved Approximation Algorithm for Virtual Private Network Design Friedrich Eisenbrand Fabrizio Grandoni Abstract Virtual private network design deals with the reservation of capacities in a network,
More informationModeling High-Dimensional Discrete Data with Multi-Layer Neural Networks
Modeling High-Dimensional Discrete Data with Multi-Layer Neural Networks Yoshua Bengio Dept. IRO Université de Montréal Montreal, Qc, Canada, H3C 3J7 bengioy@iro.umontreal.ca Samy Bengio IDIAP CP 592,
More informationA Near-Optimal Exploration-Exploitation Approach for Assortment Selection
A Near-Optimal Exploration-Exploitation Approach for Assortment Selection SHIPRA AGRAWAL, VASHIS AVADHANULA, VINEE GOYAL, and ASSAF ZEEVI, Columbia University We consider an online assortment optimization
More informationNear-Potential Games: Geometry and Dynamics
Near-Potential Games: Geometry and Dynamics Ozan Candogan, Asuman Ozdaglar and Pablo A. Parrilo January 29, 2012 Abstract Potential games are a special class of games for which many adaptive user dynamics
More informationRank aggregation in cyclic sequences
Rank aggregation in cyclic sequences Javier Alcaraz 1, Eva M. García-Nové 1, Mercedes Landete 1, Juan F. Monge 1, Justo Puerto 1 Department of Statistics, Mathematics and Computer Science, Center of Operations
More informationThe Maximum Flow Problem with Disjunctive Constraints
The Maximum Flow Problem with Disjunctive Constraints Ulrich Pferschy Joachim Schauer Abstract We study the maximum flow problem subject to binary disjunctive constraints in a directed graph: A negative
More informationOnline Supplement to:
Online Supplement to: Adaptive Appointment Systems with Patient Preferences Wen-Ya Wang Diwakar Gupta wenya@ie.umn.edu guptad@me.umn.edu Industrial and Systems Engineering Program, University of Minnesota
More informationRandomized Algorithms
Randomized Algorithms Saniv Kumar, Google Research, NY EECS-6898, Columbia University - Fall, 010 Saniv Kumar 9/13/010 EECS6898 Large Scale Machine Learning 1 Curse of Dimensionality Gaussian Mixture Models
More informationA Database Framework for Probabilistic Preferences
A Database Framework for Probabilistic Preferences Batya Kenig 1, Benny Kimelfeld 1, Haoyue Ping 2, and Julia Stoyanovich 2 1 Technion, Israel 2 Drexel University, USA Abstract. We propose a framework
More informationMonotone Submodular Maximization over a Matroid
Monotone Submodular Maximization over a Matroid Yuval Filmus January 31, 2013 Abstract In this talk, we survey some recent results on monotone submodular maximization over a matroid. The survey does not
More information