Assortment Optimization under a Mixture of Mallows Model
Antoine Désir*, Vineet Goyal*, Srikanth Jagabathula†, and Danny Segev‡

*Department of Industrial Engineering and Operations Research, Columbia University
†Leonard N. Stern School of Business, New York University
‡Department of Statistics, University of Haifa

Abstract

Assortment optimization is an important problem that arises in many applications, including retailing and online advertising. The goal in this problem is to determine a revenue/profit-maximizing subset of products to offer from a large universe of products when customers exhibit random substitution behavior. We suppose that demand is described by a mixture of Mallows models. This is a smoothed generalization of the class of sparse rank-based choice models, designed to overcome some of its key limitations. The Mallows distribution has an exponential support size and does not admit a closed-form expression for choice probabilities. We present an efficient procedure to compute the choice probabilities for any assortment under the mixture of Mallows model. Based on this procedure and the structural properties of the distribution, we give a polynomial time approximation scheme (PTAS) for the constrained assortment optimization problem under a reasonable assumption. In other words, for any constant ε > 0, we obtain a (1 − ε)-approximation in running time that is polynomial in the number of products n and the number of segments K, but depends exponentially on 1/ε. Our PTAS can handle a large class of constraints on the assortments, including cardinality constraints, capacity constraints, and matroid constraints. To the best of our knowledge, this is the first efficient algorithm with provably near-optimal performance guarantees for the assortment optimization problem under the Mallows or the mixture of Mallows model in such generality.
Furthermore, we give a compact mixed integer program (MIP) and present a variable bound strengthening technique that leads to a practical approach for the constrained assortment optimization problem under a general mixture of Mallows model.

1 Introduction

What subset (or assortment) of products to offer is a key decision problem that commonly arises in several application contexts. A concrete setting is that of a retailer who carries a large universe of products U but can offer only a subset of the products in each store, online or offline. The objective of the retailer
is typically to choose the offer set that maximizes the expected revenue/profit¹ earned from each arriving customer under stochastic demand. One of the key ingredients of the assortment optimization problem is a model that captures the stochastic demand. The demand model must account for product substitution behavior, whereby a customer substitutes to an available product (say, a dark blue shirt) when her most preferred product (say, a black one) is not offered. The substitution behavior makes the demand for each offered product a function of the entire offer set, increasing the complexity of the demand model. Nevertheless, existing work has shown that demand models that incorporate substitution effects provide significantly more accurate predictions than those that do not. The common approach to capturing substitution is through a choice model that specifies the demand as the probability P(a | S) of a random customer choosing product a from offer set S. Several parametric choice models from the Random Utility Maximization (RUM) class have been extensively studied in the literature in diverse areas including marketing, transportation, economics, and operations management. Among these models, the multinomial logit (MNL) model has by far been the most popular in practice (Talluri and Van Ryzin 2004), primarily because of the tractability gained from its simplifying assumptions. However, this simplicity also limits² the ability of the model to faithfully capture the complex substitution patterns present in various applications. To address these limitations, more complex choice models have been developed to capture a richer class of substitution behaviors. Popular examples include the nested logit (Williams 1977), the latent-class MNL, and the mixed logit models (McFadden et al. 2000); see Train (2003) for a detailed overview of these models.
The increased model complexity makes these models more flexible, but both the estimation and the assortment problems become computationally difficult to solve. Further, from among these many options, choosing the right model is a difficult task in assortment planning, especially as customer preferences are latent and unobservable and we only have access to censored data. In fact, Farias et al. (2013a) show that because parametric models do not grow in complexity with the data, they under-perform nonparametric models in accurately predicting expected revenue/profit (Farias et al. 2013b). To address the limitations of parametric models, the nonparametric rank-based choice model has emerged as an alternative. In its most general form, it models customer preferences as a distribution over all possible

¹ As elaborated below, conversion-rate maximization can be obtained as a special case of revenue/profit maximization by setting the revenue/profit of all the products to be equal.
² For instance, a commonly recognized limitation is the so-called Independence of Irrelevant Alternatives property, which specifies that when a product is not offered, the probability that customers substitute to another product is proportional to its market share.
rankings/preference lists of the products (Farias et al. 2013b, van Ryzin and Vulcano 2014, Honhon et al. 2012). Each preference list specifies a rank ordering of the products, with lower-ranked products being more preferred. In each choice instance, a customer samples a preference list according to the underlying distribution and chooses the first available product (possibly the no-purchase option) from her list. The rank-based model is very general and subsumes the general RUM model class mentioned above. The model accommodates distributions with exponentially large support sizes and can therefore capture complex substitution patterns. However, available data are usually not sufficient to identify such a complex model. Therefore, sparsity is used as a model selection criterion to pick a model from the set of models consistent with the data. Specifically, it is assumed that the distribution has a support size K, for some K that is polynomial in the number of products. Sparsity results in data-driven model selection (Farias et al. 2013b), obviating the need to impose arbitrary parametric structures. Despite their generality, however, sparse rank-based models cannot account for noise or any deviations from the K ranked lists in the support. This limits their modeling flexibility, resulting in unrealistic predictions and an inability to model individual-level observations. Specifically, because K ≪ n! (where n is the number of products), the model specifies that there is a zero chance that a customer uses a ranking that is even slightly different from any of the K rankings in the support, and a zero chance of observing certain choices. However, choices may be observed in real (holdout) data that are not consistent with any of the K rankings, making the model predictions unrealistic. In addition, a natural way to interpret sparse choice models is to assume that the population consists of K types of customers, with each type described by one of the ranked lists.
When this interpretation is applied to individual-level observations, it implies that all the choice observations of each individual must be consistent with at least one of the K rankings, which again may not be the case in real data. In order to address these issues, we generalize the sparse rank-based models by smoothing them with the Mallows kernel. Specifically, we suppose that the choice model is a mixture of K Mallows models and focus on assortment optimization under this generalized class. The Mallows distribution was introduced in the mid-1950s (Mallows 1957) and is the most popular member of the so-called distance-based ranking models, which are characterized by a modal ranking ω and a concentration parameter θ. The probability that a ranking σ is sampled falls exponentially as e^{−θ·d(σ, ω)}, where d(·, ·) is the distance between σ and ω. Different distance functions result in different models. The Mallows model uses the Kendall-Tau distance, which measures the number of pairwise disagreements between the two rankings. Intuitively, the Mallows
model assumes that consumer preferences are concentrated around a central permutation, with the likelihood of large deviations being low. The mixture of Mallows model with K segments is specified by modal rankings ω_1, …, ω_K, concentration parameters θ_1, …, θ_K, and probabilities p_1, …, p_K, where for any k = 1, …, K, p_k specifies the probability that a random customer belongs to Mallows segment k with modal ranking ω_k and concentration parameter θ_k. This mixture model is a more natural model, allowing for deviations from the modal rankings and assigning a non-zero probability to every choice. Further, it is a parsimonious way to extend the support of the distribution to an exponential size, and as θ_k → ∞ for all k, the distribution concentrates around each of the K modes, yielding the sparse rank-based model. In this paper, we assume that the parameters of the mixture of Mallows model are given and focus on the constrained assortment optimization problem. We refer the interested reader to the large body of existing work on estimating such models from data (Lu and Boutilier 2011, Awasthi et al. 2014, Dwork et al. 2001, Lebanon and Mao 2008).

1.1 Our contributions

Our main contribution in this paper is a near-optimal algorithm for a large class of constrained assortment optimization problems under the mixture of Mallows model, under a reasonable assumption. We address the two key computational challenges that arise in solving our problem: (a) efficiently computing the choice probabilities, and hence the expected revenue/profit, for a given offer set S, and (b) finding a near-optimal assortment. We also present a compact mixed integer program (MIP) together with a variable bound strengthening technique that leads to a practical approach for the constrained assortment optimization problem under a general mixture of Mallows model. We describe the main contributions below.

Exact computation of choice probabilities.
We present an efficient procedure to compute the choice probabilities P(a | S) exactly under a general mixture of Mallows model. Because the mixture of Mallows distribution has an exponential support size, computing the choice probabilities for a fixed offer set S requires marginalizing the distribution by summing it over an exponential number of rankings, and is therefore a non-trivial computational task. In fact, computing the probability of a general partial order under the Mallows distribution is known to be a #P-hard problem (Lu and Boutilier 2011, Brightwell and Winkler 1991). The only other known class of partial orders whose probabilities can be computed efficiently is the class of partitioned preferences (Lebanon and Mao 2008); while this class includes top-k/bottom-k ranked lists, it does not include other popular partial orders such as pairwise comparisons.
We exploit the structural symmetries of the Mallows distribution to derive an efficiently computable closed-form expression for the choice probabilities of a given offer set under the mixture of Mallows distribution. In particular, we first consider a single Mallows distribution and show that its choice probabilities can be expressed as a discrete convolution. Using the fast Fourier transform, the choice probabilities can be computed in O(n² log n) time, where n is the number of products. Therefore, we obtain a procedure with running time O(K·n² log n) to compute the choice probabilities for a fixed offer set under a mixture of K Mallows distributions.

Near-optimal algorithm for the constrained assortment optimization problem. We present a polynomial time approximation scheme (PTAS) for a large class of constrained assortment optimization problems under the mixture of Mallows model, including cardinality constraints, knapsack constraints, and matroid constraints. Our PTAS holds under the assumption that the no-purchase option is ranked last in the modal rankings of all Mallows segments in the mixture; such an assumption is necessary because Aouad et al. (2015) show that assortment optimization under a sparse rank-based model with K > n is hard to approximate within a factor better than O(1/n^{1−ε}) for any ε > 0.³ Under the above assumption and for any ε > 0, our algorithm computes an assortment with expected revenue at least (1 − ε) times the optimal, in running time that is polynomial in n and K but depends exponentially on 1/ε. The PTAS is based on establishing a surprising sparsity property of near-optimal assortments, namely, that there exists a near-optimal assortment of size O(1/ε). Therefore, enumerating over all such subsets gives a (1 − ε)-approximation to the constrained assortment optimization problem.
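The convolution-by-FFT step behind the O(n² log n) bound can be illustrated generically. The sketch below is not the paper's implementation: `convolve_all` is a hypothetical helper, and the short sequences stand in for whatever discrete functions are being convolved.

```python
import numpy as np

def convolve_all(seqs):
    """Convolve a list of finite nonnegative sequences via FFT.

    Multiplying the FFTs of zero-padded sequences and inverting gives
    their (multi-way) linear convolution in O(L log L) time.
    """
    out_len = sum(len(s) - 1 for s in seqs) + 1
    n_fft = 1 << (out_len - 1).bit_length()  # next power of two >= out_len
    acc = np.ones(n_fft, dtype=complex)
    for s in seqs:
        acc *= np.fft.fft(s, n_fft)
    res = np.fft.ifft(acc).real[:out_len]
    return np.where(res < 0, 0.0, res)  # clip tiny negative round-off

# sanity check against direct convolution
a, b, c = [1.0, 2.0], [0.5, 0.5], [1.0, 0.0, 3.0]
direct = np.convolve(np.convolve(a, b), c)
assert np.allclose(convolve_all([a, b, c]), direct, atol=1e-9)
```

The same idea, applied to the sequences arising from the closed-form expression of Section 3, yields the stated running time.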
To the best of our knowledge, this is the first provably near-optimal algorithm for assortment optimization under the Mallows or the mixture of Mallows model in such generality.

A mixed integer programming formulation. We present a compact mixed integer linear program (MIP) with O(K·n³) variables, of which O(n) are binary, and O(K·n³) constraints for the constrained assortment optimization problem under a general mixture of Mallows model with K segments. The compact formulation is based on an alternative efficient procedure for computing the choice probabilities of a fixed offer set exactly. In particular, we exploit the repeated insertion method (RIM) introduced by Doignon et al. (2004) for sampling rankings from the Mallows distribution, and show that the choice probabilities of the Mallows model can be expressed as the unique solution of a system of linear equations that can be solved in O(n³) time. This gives us an alternative procedure to efficiently compute the choice probabilities for a fixed offer set exactly.

³ The reduction is from the independent set problem to an assortment optimization problem under a distribution over only n rankings.
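To make the repeated insertion method concrete, here is a minimal sampler for a single Mallows segment. It is a sketch of the standard description of RIM, not code from the paper; `sample_mallows` and its insertion weights are our own naming, under the assumption that inserting the i-th item of the modal ranking at position j creates exactly i − j new pairwise disagreements.

```python
import math
import random

def sample_mallows(center, theta, rng=random):
    """Sample a ranking from a Mallows model via repeated insertion (RIM).

    `center` lists the items of the modal ranking, most preferred first.
    The i-th item is inserted at position j (1 = top) of the partial
    ranking with probability proportional to exp(-theta * (i - j)).
    """
    ranking = []
    for i, item in enumerate(center, start=1):
        # insertion weights for positions j = 1, ..., i
        weights = [math.exp(-theta * (i - j)) for j in range(1, i + 1)]
        u = rng.random() * sum(weights)
        j = 0
        while j < i - 1 and u > weights[j]:
            u -= weights[j]
            j += 1
        ranking.insert(j, item)
    return ranking
```

As θ → ∞ the sampled ranking concentrates on `center`, while θ = 0 yields a uniformly random ranking, matching the behavior of the mixture model described in the introduction.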
While this is less efficient than using the fast Fourier transform (O(n³) versus O(n² log n)), it allows us to formulate a compact MIP for the constrained assortment optimization problem under a general mixture of Mallows model. Our MIP formulation holds for a general mixture of Mallows model and does not require any assumption on the rank of the no-purchase option in the modal rankings. We conduct numerical experiments to test the computational performance of the MIP. In particular, we implement a variable bound strengthening and observe that the MIP is efficient for reasonably sized assortment optimization problems. Therefore, the MIP provides a practical approach for assortment optimization under a general mixture of Mallows model.

1.2 Literature review

A large number of parametric models over rankings have been extensively studied in statistics, transportation, marketing, economics, and operations management (see Marden (1995) for a detailed survey of most of these models). Our work particularly has connections to work in machine learning and operations management. The existing work in machine learning has focused on designing computationally efficient algorithms for estimating the model parameters from commonly available observations (complete rankings, top-k/bottom-k lists, pairwise comparisons, etc.). The developed techniques mainly consist of efficient algorithms for computing the likelihood of the observed data (Lebanon and Mao 2008, Guiver and Snelson 2009) and techniques for sampling from the distributions conditioned on observed data (Lu and Boutilier 2011, Meila et al. 2012). The Plackett-Luce (PL) model, the Mallows model, and their variants have been by far the most studied models in this literature. On the other hand, the work in operations management has mainly focused on designing set optimization algorithms to find the best subset efficiently.
The multinomial logit (MNL) model has been the most commonly studied model in this literature. The MNL model was made popular by the work of McFadden (1978) and has been shown by Yellott (1977) to be equivalent to the PL model, introduced independently by Luce (1959) and Plackett (1975). Given the model parameters, the assortment optimization problem has been shown to be efficiently solvable for the MNL model by Talluri and Van Ryzin (2004), for variants of the nested logit (NL) model by Davis et al. (2014) and Gallego and Topaloglu (2014), and for the Markov chain model by Blanchet et al. (2013). The problem is known to be hard for most other choice models (Bront et al. 2009), so Jagabathula (2014) studies the performance of the local search algorithm for some of the assortment problems that are known to be hard. As mentioned, the literature in operations management has restricted itself to only those models for which
choice probabilities are known to be efficiently computable. In the context of this literature, our key contribution is to extend a popular model from the machine learning literature to choice contexts and the assortment problem.

Outline. The rest of the paper is organized as follows. In Section 2, we formally introduce the model and the notation. In Section 3, we present the closed-form expression for the choice probabilities. In Section 4, we present the PTAS for the assortment optimization problem under the mixture of Mallows model. In Section 5, we introduce an alternative procedure to compute the choice probabilities, on which we build to present the MIP formulation in Section 6. Finally, the computational study is presented in Section 7.

2 Model and problem statement

Notation. We consider a universe U of n products. In order to distinguish products from their corresponding ranks, we let U = {a_1, …, a_n} denote the universe of products, under an arbitrary indexing. Preferences over this universe are captured by an anti-reflexive, anti-symmetric, and transitive relation ≻, which induces a total ordering (or ranking) over all products; specifically, a ≻ b means that a is preferred to b. We represent preferences through rankings or permutations. A complete ranking (or simply a ranking) is a bijection σ : U → [n] that maps each product a ∈ U to its rank σ(a) ∈ [n], where [j] denotes the set {1, 2, …, j} for any integer j. Lower ranks indicate higher preference, so that σ(a) < σ(b) if and only if a ≻_σ b, where ≻_σ denotes the preference relation induced by the ranking σ. For simplicity of notation, we also let σ_i denote the product ranked at position i. Thus, σ_1 ≻ σ_2 ≻ ⋯ ≻ σ_n is the list of the products written in increasing order of their ranks. Finally, for any two integers i ≤ j, let [i, j] denote the set {i, i + 1, …, j}.

Mallows model. The Mallows model is a member of the distance-based ranking family of models (see Murphy and Martin (2003)).
This model is described by a modal ranking ω, which denotes the central or modal permutation, and a concentration parameter θ ∈ R_+, such that the probability of each permutation σ is given by

π(σ) = e^{−θ·d(σ, ω)} / ψ(θ),

where ψ(θ) = Σ_σ exp(−θ·d(σ, ω)) is the normalization constant, and d(·, ·) is the Kendall-Tau metric of
distance between permutations, defined as

d(σ, ω) = Σ_{i<j} 1[(σ(a_i) − σ(a_j)) · (ω(a_i) − ω(a_j)) < 0].

In other words, d(σ, ω) counts the number of pairwise disagreements between the permutations σ and ω. It can be verified that d(·, ·) is a distance function that is right-invariant under the composition of the symmetric group, i.e., d(σ_1 ∘ σ, σ_2 ∘ σ) = d(σ_1, σ_2) for every σ, σ_1, σ_2, where the composition is defined as (σ_1 ∘ σ)(a) = σ_1(σ(a)). This symmetry can be exploited to show that the normalization constant ψ(θ) has a closed-form expression (Marden (1995)) given by

ψ(θ) = Π_{i=1}^{n+1} (1 − e^{−iθ}) / (1 − e^{−θ}).

Note that ψ(θ) depends only on the concentration parameter θ and does not depend on the modal ranking. Intuitively, the Mallows model defines a set of consumers whose preferences are similar, in the sense of being centered around a common permutation, with the probability of deviations therefrom decreasing exponentially. The similarity of consumer preferences is captured by the Kendall-Tau distance metric.

Mixture of Mallows model. The mixture of K Mallows models is given by K segments where, for each segment k = 1, …, K, we are given its probability p_k and a Mallows distribution with modal ranking ω_k and concentration parameter θ_k. Therefore, the probability of any permutation σ in the mixture model is given by

π(σ) = Σ_{k=1}^{K} p_k · e^{−θ_k·d(σ, ω_k)} / ψ(θ_k).

2.1 Problem statement

Choice probabilities computation. We first focus on efficiently computing the probability that a product a is chosen from an offer set S ⊆ U under a given mixture of Mallows model. When offered the subset S, the customer is assumed to sample a preference list according to the mixture of Mallows model and then choose the most preferred product from S according to the sampled list. Therefore, the probability of choosing product a from the offer set S is given by

P(a | S) = Σ_σ π(σ) · 1[σ, a, S],   (1)
9 where 1l[, a, S] indicates whether (a) < (a 0 ) for all a 0 2 S, a 0 6= a. Note that the above sum runs over n! preference lists, meaning that it is a priori unclear if P(a S) can be computed efficiently. Assortment optimization. Once we are able to compute the choice probabilities, we consider the assortment optimization problem. In the assortment optimization problem, each product a has an endogenously fixed price r a. Moreover, there is an additional product a q that represents the outside option (no-purchase), with price r q =0that is always included in the assortment. Let S 2 U be denote a set of feasible assortments. We assume that S satisfied the following assumption: Assumption 1. Let S be the set of feasible assortments. We assume that S satisfies the following properties. (Membership) For any S U, it is easy to test whether S 2 S or not. (Closure) For any S 2 S and T S implies that T 2 S. This is a fairly general assumption satisfied for a large class of constraints including cardinality constraints, multi-dimensional knapsack constraints and matroid constraints. The goal in the assortment optimization problem is to determine a feasible subset of products that maximizes the expected revenue (2): max S2S R(S) = max S2S a2s P(a S [{r q }) r a. (2) We would like to note that even the unconstrained version where S contains all possible subsets of U is hard to approximate within a factor better than O(1/n 1 ) under a general distribution over permutation model (Aouad et al. (2015)). 3 Choice probabilities: closed-form expression In this section, we show that the choice probabilities can be computed efficiently under the Mallows model. Note that this directly give a efficient procedure to compute the choice probabilities under a mixture of Mallows model. Without loss of generality, we assume from this point on that the products are indexed such that the central permutation! ranks product a i at position i, for all i 2 [n]. 
The next theorem shows that, when the offer set is contiguous, the choice probabilities take a rather simple form. Using these expressions as building blocks, we then derive a closed-form expression for general offer sets.
Theorem 3.1 (Contiguous offer set). Suppose S = a_{[i,j]} = {a_i, …, a_j} for some 1 ≤ i ≤ j ≤ n. Then, the probability of choosing product a_k ∈ S under the Mallows model with modal ranking ω and concentration parameter θ is given by

P(a_k | S) = e^{−θ(k−i)} / (1 + e^{−θ} + ⋯ + e^{−θ(j−i)}).

The choice probability under a general offer set has a more involved structure, for which additional notation is needed. For a pair of integers 1 ≤ m ≤ q ≤ n, define

ψ(q, θ) = Π_{s=1}^{q} Σ_{ℓ=0}^{s−1} e^{−ℓθ}   and   ψ(q, m, θ) = ψ(m, θ) · ψ(q − m, θ).

In addition, for a collection of M discrete functions h_m : Z → R, m = 1, …, M, such that h_m(r) = 0 for any r < 0, their discrete convolution is defined as

(h_1 ⋆ ⋯ ⋆ h_M)(r) = Σ_{r_1,…,r_M : Σ_m r_m = r} h_1(r_1) ⋯ h_M(r_M).

Theorem 3.2 (General offer set). Suppose S = a_{[i_1,j_1]} ∪ ⋯ ∪ a_{[i_M,j_M]}, where i_m ≤ j_m for 1 ≤ m ≤ M and j_m < i_{m+1} for 1 ≤ m ≤ M − 1. Let G_m = a_{[j_m+1, i_{m+1}−1]} for 1 ≤ m ≤ M − 1, G = G_1 ∪ ⋯ ∪ G_M, and C = a_{[i_1,j_M]}. Then, the probability of choosing a_k ∈ a_{[i_ℓ,j_ℓ]} can be written as

P(a_k | S) = e^{−θ(k−i_1)} · ψ(|C|, θ) · Π_{m=1}^{M−1} ψ(|G_m|, θ) · (f_0 ⋆ f̃_1 ⋆ ⋯ ⋆ f̃_ℓ ⋆ f_{ℓ+1} ⋆ ⋯ ⋆ f_M)(|G|),

where:
- f_m(r) = e^{−θr(j_m − i_1 + 1 + r/2)} · ψ^{−1}(|G_m|, r, θ), if 0 ≤ r ≤ |G_m|, for 1 ≤ m ≤ M;
- f̃_m(r) = e^{θr} · f_m(r), for 1 ≤ m ≤ M;
- f_0(r) = ψ(|C|, |G| − r, θ) · e^{−θ(|G|−r)²/2} / (1 + e^{−θ} + ⋯ + e^{−θ(|S|−1+r)}), for 0 ≤ r ≤ |G|;
- f_m(r) = 0, for 0 ≤ m ≤ M and any r outside the ranges described above.

Proof. At a high level, deriving the expression for a general offer set involves breaking down the probabilistic event of choosing a_k ∈ S into simpler events for which we can use the expression given in Theorem 3.1, and then combining these expressions using the symmetries of the Mallows distribution.
For a given vector R = (r_0, …, r_M) ∈ Z_+^{M+1} such that r_0 + ⋯ + r_M = |G|, let h(R) be the set of permutations which satisfy the following two conditions: (i) among all the products of S, a_k is the most preferred, and (ii) for all m ∈ [M], there are exactly r_m products from G_m which are preferred to a_k; we denote this subset of products by Ḡ_m for all m ∈ [M]. This implies that there are r_0 products from G which are less preferred than a_k. With this notation,

P(a_k | S) = Σ_{R : r_0 + ⋯ + r_M = |G|} Σ_{σ ∈ h(R)} π(σ).

Recall that for all σ, we have

π(σ) = e^{−θ · Σ_{i,j} δ(σ, i, j)} / ψ(θ),

where δ(σ, i, j) = 1[(σ(a_i) − σ(a_j)) · (ω(a_i) − ω(a_j)) < 0]. For all σ, we can break down the sum in the exponential as follows:

Σ_{i,j} δ(σ, i, j) = C_1(σ) + C_2(σ) + C_3(σ),

where C_1(σ) contains pairs of products (i, j) such that a_i ∈ Ḡ_m for some m ∈ [M] and a_j ∈ S; C_2(σ) contains pairs of products (i, j) such that a_i ∈ Ḡ_m for some m ∈ [M] and a_j ∈ G_{m′} ∖ Ḡ_{m′} for some m′ ≠ m; and C_3(σ) contains the remaining pairs of products. For a fixed R, we show that C_1(σ) and C_2(σ) are constant for all σ ∈ h(R).

Part 1. C_1(σ) counts the number of disagreements (i.e., the number of pairs of products that are oppositely ranked in σ and ω) between some product in S and some product in Ḡ_m for some m ∈ [M]. For all m ∈ [M], a product a_i ∈ Ḡ_m induces a disagreement with every product a_j ∈ S such that j < i. Therefore, the sum of all these disagreements is equal to

C_1(σ) = Σ_{m=1}^{M} Σ_{a_j ∈ S} Σ_{a_i ∈ Ḡ_m} δ(σ, i, j) = Σ_{m=1}^{M} r_m · (j_m − i_1 + 1).
Part 2. C_2(σ) counts the number of disagreements between some product in a set Ḡ_m and some product in a set G_{m′} ∖ Ḡ_{m′} for m′ ≠ m. The sum of all these disagreements is equal to

C_2(σ) = Σ_{m ≠ m′} Σ_{a_i ∈ Ḡ_m} Σ_{a_j ∈ G_{m′} ∖ Ḡ_{m′}} δ(σ, i, j) = Σ_{m=2}^{M} r_m · Σ_{j=1}^{m−1} (|G_j| − r_j).

Consequently, for all σ ∈ h(R), we can write d(σ, ω) = C_1(R) + C_2(R) + C_3(σ), and therefore

P(a_k | S) = Σ_{R : r_0 + ⋯ + r_M = |G|} [e^{−θ·(C_1(R) + C_2(R))} / ψ(θ)] · Σ_{σ ∈ h(R)} e^{−θ·C_3(σ)}.

Computing the inner sum requires a similar but more involved partitioning of the permutations, as well as using Theorem 3.1. The details are presented in Appendix A. In particular, we can show that for a fixed R, the inner sum Σ_{σ ∈ h(R)} e^{−θ·C_3(σ)} is equal to

ψ(|G| − r_0, θ) · ψ(|S| + r_0, θ) · [e^{−θ(k − 1 − Σ_{m=1}^{ℓ−1} r_m)} / (1 + e^{−θ} + ⋯ + e^{−θ(|S| + r_0 − 1)})] · Π_{m=1}^{M} ψ(|G_m|, θ) / (ψ(r_m, θ) · ψ(|G_m| − r_m, θ)).

Putting all the pieces together yields the desired result.

Since this represents P(a | S) as a discrete convolution, we can efficiently compute this probability using the fast Fourier transform in O(n² log n) time (see, for instance, Cooley and Tukey (1965)), which is a dramatic improvement over the exponential sum (1) that defines the choice probabilities. Note that for a mixture of K Mallows models, this implies that we can compute the choice probability P(a | S) in O(K·n² log n) time.

4 A PTAS for the assortment optimization problem

In this section, we present a polynomial time approximation scheme (PTAS) for the assortment optimization problem under the mixture of Mallows model described by (2). In other words, for any accuracy level ε > 0, we compute an assortment with expected revenue at least (1 − ε) times the optimal. For every fixed ε, the
running time is polynomial in n and K. For ease of exposition, we first focus on a single Mallows model and thus drop the index corresponding to the Mallows segment. At the end of the section, we explain how the results extend to a mixture of Mallows model. Before describing the algorithm, we introduce a number of structural properties of the Mallows distribution.

4.1 Probabilistic claims

We first show that for any pair of products (a_i, a_j) such that i < j (i.e., a_i is preferred to a_j in ω), and for a permutation σ drawn from a Mallows distribution, we have P(a_i ≻_σ a_j) ≥ 1/2. Note that when θ = 0, since the distribution is uniform, we have P(a_i ≻_σ a_j) = 1/2. Moreover, when θ → ∞, P(a_i ≻_σ a_j) = 1. Our result extends these extreme cases to all values of θ.

Claim 4.1. For any pair of products (a_i, a_j) such that i < j, if σ is drawn from a Mallows distribution, we have P(a_i ≻_σ a_j) ≥ 1/2.

Proof. Let A = {σ : a_i ≻_σ a_j} and B = {σ : a_j ≻_σ a_i}. We consider the bijection f : A → B which switches a_i and a_j. More precisely, for all σ ∈ A,

f(σ)(a_k) = σ(a_i) if k = j;  σ(a_j) if k = i;  σ(a_k) otherwise.

We show that for all σ ∈ A, d(σ, ω) ≤ d(f(σ), ω), which in turn implies the desired result. Note that for any σ ∈ A, we have

d(f(σ), ω) − d(σ, ω) = 1 + Σ_{k : a_i ≻_σ a_k ≻_σ a_j} [δ(f(σ), i, k) + δ(f(σ), j, k) − δ(σ, i, k) − δ(σ, j, k)],

where δ(σ, i, j) = 1[(σ(a_i) − σ(a_j)) · (ω(a_i) − ω(a_j)) < 0]. Since a_i ≻_ω a_j, we have three cases to consider.

Case 1: a_i ≻_ω a_k ≻_ω a_j. In this case, δ(f(σ), i, k) + δ(f(σ), j, k) = 2 and δ(σ, i, k) + δ(σ, j, k) = 0.
Case 2: a_k ≻_ω a_i. In this case, δ(f(σ), i, k) + δ(f(σ), j, k) = 1 and δ(σ, i, k) + δ(σ, j, k) = 1.
Case 3: a_j ≻_ω a_k. In this case, δ(f(σ), i, k) + δ(f(σ), j, k) = 1 and δ(σ, i, k) + δ(σ, j, k) = 1.
In each case, δ(f(σ), i, k) + δ(f(σ), j, k) − δ(σ, i, k) − δ(σ, j, k) ≥ 0, which concludes the proof.

We also extend this result to a tuple of products (a_{i_1}, …, a_{i_m}). More precisely, if i_1 < ⋯ < i_m (i.e., a_{i_1} is the most preferred product among (a_{i_1}, …, a_{i_m}) in ω), then if σ is drawn from a Mallows distribution, we have P(a_{i_1} ≻_σ a_{i_j}, ∀j ≥ 2) ≥ 1/m. Again, note that when θ = 0, we have P(a_{i_1} ≻_σ a_{i_j}, ∀j ≥ 2) = 1/m. Moreover, when θ → ∞, P(a_{i_1} ≻_σ a_{i_j}, ∀j ≥ 2) = 1.

Claim 4.2. For any tuple of products (a_{i_1}, …, a_{i_m}) such that i_1 < ⋯ < i_m, if σ is drawn from a Mallows distribution, we have P(a_{i_1} ≻_σ a_{i_j}, ∀j ≥ 2) ≥ 1/m.

Proof. Let A_k = {σ : a_{i_k} ≻_σ a_{i_ℓ}, ∀ℓ ≠ k} be the set of permutations in which a_{i_k} appears first among a_{i_1}, …, a_{i_m}. For a fixed pair (k, m) such that k < m, consider the bijection f : A_m → A_k which switches a_{i_m} and a_{i_k}. More precisely, for all σ ∈ A_m,

f(σ)(a_ℓ) = σ(a_{i_k}) if ℓ = i_m;  σ(a_{i_m}) if ℓ = i_k;  σ(a_ℓ) otherwise.

The proof of Claim 4.1 shows that for all σ ∈ A_m, d(σ, ω) ≤ d(f(σ), ω). This in turn implies that for all k < m, P(a_{i_k} ≻_σ a_{i_ℓ}, ∀ℓ ≠ k) ≥ P(a_{i_m} ≻_σ a_{i_ℓ}, ∀ℓ ≠ m), which concludes the proof.

4.2 A PTAS for the assortment optimization problem

We now present a polynomial time approximation scheme (PTAS) for the assortment optimization problem under the Mallows distribution, under an additional assumption. Our algorithm is based on establishing a surprising sparsity property, proving the existence of small near-optimal assortments by crucially utilizing certain symmetries in the distribution over permutations.
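Claim 4.1 can be sanity-checked by exact enumeration for small n. The snippet below is an illustrative check, not part of the paper; `pref_prob` is a hypothetical helper that takes the identity as the modal ranking.

```python
import itertools
import math

def pref_prob(n, theta, i, j):
    """Exact P(a_i preferred to a_j) under a Mallows model with the
    identity as modal ranking, by enumerating all n! rankings (tiny n)."""
    num = den = 0.0
    for sigma in itertools.permutations(range(n)):
        d = sum(1 for u in range(n) for v in range(u + 1, n)
                if sigma[u] > sigma[v])  # Kendall-Tau distance to identity
        w = math.exp(-theta * d)
        den += w
        if sigma[i] < sigma[j]:  # lower rank means more preferred
            num += w
    return num / den

# Claim 4.1: for i < j, P(a_i preferred to a_j) >= 1/2 for every theta >= 0,
# with equality at theta = 0 (uniform distribution)
for theta in (0.0, 0.3, 2.0):
    assert pref_prob(5, theta, i=1, j=3) >= 0.5 - 1e-12
assert abs(pref_prob(5, 0.0, i=1, j=3) - 0.5) < 1e-12
```

The same enumeration, restricted to the event that one product beats all others in a tuple, can be used to spot-check Claim 4.2.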
Description of the algorithm. Let S* be the optimal assortment. Let M = 1/ε, where without loss of generality we assume that M takes an integer value. We enumerate all feasible subsets of U of size at most M and return the best candidate assortment. Algorithm 1 describes the procedure.

Algorithm 1 Enumeration of small assortments
1: Let Ŝ = arg max {R(S) : S ∈ 𝒮, |S| ≤ M}.
2: Return Ŝ.

Assumption 2. The outside option a_q is ranked last in the central permutation ω (i.e., it occupies the last position of ω).

Theorem 4.1. Under Assumption 2, Algorithm 1 is a PTAS for the assortment optimization problem (2) under the Mallows distribution.

Proof. We first argue the correctness of the algorithm, i.e., that the assortment returned is a (1 − ε)-optimal solution. Again, let S* be the optimal assortment. Note that if |S*| < M, then S* is one of the candidate assortments we examine, and therefore the algorithm returns the optimal solution. We therefore assume that |S*| ≥ M. In this case, let S′ consist of the M highest-revenue products of S*. Note that S′ is among the candidate assortments constructed by the algorithm. Moreover, it is a feasible assortment by Assumption 1. We show that S′ is (1 − ε)-optimal using a sample-path analysis. In particular, let σ be a fixed preference list, and let R(σ, S) be the revenue obtained from σ when assortment S is offered, i.e.,

R(σ, S) = Σ_{a ∈ S} 1[σ, a, S] · r_a.

We consider two cases.

Case 1. We first assume that a_q ≻_σ a_i for all a_i ∈ S′. In this case, R(σ, S′) = 0. On the other hand, offering a single product is always a feasible solution. Therefore, by Claim 4.1,

R(S*) ≥ R({a_{i*}}) = P(a_{i*} ≻_σ a_q) · r_{i*} ≥ r_{i*}/2,

where a_{i*} is the highest-price product. Moreover, note that all products in S* ∖ S′ have revenue at most r_{i*}. Therefore, any product in S* ∖ S′ has revenue at most 2·R(S*). This implies that R(σ, S*) ≤ 2·R(S*).

Case 2. In this case, we assume that in the permutation σ, there exists a product a_i ∈ S′ such that a_i ≻_σ a_q and R(σ, S′) = r_i. We show that R(σ, S′) ≥ R(σ, S*).
Indeed, suppose that there exists a product $a_j \in S^* \setminus S'$ such that $a_j \succ_\sigma a_i$. Since $S'$ contains the $M$ highest-revenue products of $S^*$, it must be that $r_j \leq r_i$. Therefore, $R(\sigma, S^*) \leq R(\sigma, S')$.

We now combine the two cases. For Case 1 to happen, note that $a_q$ has to be preferred to all products from $S'$. By Claim 4.2, this event occurs with probability at most $1/|S'| = 1/M = \epsilon$. Consequently,
$$R(S^*) - R(S') = P(\text{Case 1})\, E[R(\sigma,S^*) - R(\sigma,S') \mid \text{Case 1}] + P(\text{Case 2})\, \underbrace{E[R(\sigma,S^*) - R(\sigma,S') \mid \text{Case 2}]}_{\leq\, 0}$$
$$\leq P(\text{Case 1})\, E[R(\sigma,S^*) \mid \text{Case 1}] \leq \epsilon \cdot 2R(S^*),$$
so that $R(S') \geq (1 - 2\epsilon)\, R(S^*)$, and the claimed guarantee follows after rescaling $\epsilon$.

From a running time perspective, the number of candidate assortments is $O(n^{1/\epsilon})$. By Theorem 3.2, we can compute $R(S)$ for any assortment $S$ in $O(n^3 \log n)$ time. Therefore, the overall running time of the algorithm is $O(n^{1/\epsilon} \cdot n^3 \log n)$.

Extension to a mixture of Mallows model. The PTAS extends to a mixture of Mallows model as long as Assumption 2 holds for each segment. Indeed, since the probabilistic claims hold for each segment, we can adapt the proof of Theorem 4.1 to the case of a mixture of Mallows model. Moreover, we emphasize that the running time scales linearly in $K$, the number of segments of the mixture. More precisely, Algorithm 1 returns a $(1-\epsilon)$-optimal assortment in $O(K \cdot n^{1/\epsilon} \cdot n^3 \log n)$ time.

The PTAS presented in this section provides an approximation algorithm with provable guarantees for a special case of a mixture of Mallows model. However, it requires Assumption 2 to hold. We next present an alternative way of solving the assortment optimization problem using a MIP formulation, which does not require any assumption. To this end, we first present an alternative method for computing the choice probabilities by means of dynamic programming.

5 Choice probabilities: a dynamic programming approach

In what follows, we present an alternative algorithm for computing the choice probabilities under a Mallows model. Again, note that this directly implies an efficient algorithm for computing the choice probabilities under a mixture of Mallows model.
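Before developing the dynamic program, the enumeration of Algorithm 1 can be sketched end to end on a toy instance. In this sketch $R(S)$ is computed by brute-force summation over all permutations, which is feasible only for very small $n$ (the paper instead uses the $O(n^3 \log n)$ procedure of Theorem 3.2); the instance, the value of $\epsilon$, and the function names are illustrative assumptions only. Products are labeled $0,\ldots,n-1$, and the outside option, with revenue 0, is ranked last in $\omega$ as in Assumption 2.

```python
import math
from itertools import combinations, permutations

def mallows_pmf(n, theta):
    """Exact Mallows distribution over permutations of range(n); the central
    ranking is the identity, so the distance is the number of inversions."""
    def inversions(s):
        return sum(1 for i in range(len(s)) for j in range(i + 1, len(s))
                   if s[i] > s[j])
    w = {s: math.exp(-theta * inversions(s)) for s in permutations(range(n))}
    z = sum(w.values())
    return {s: x / z for s, x in w.items()}

def expected_revenue(assortment, revenues, pmf, outside):
    """R(S): a customer draws a ranking and buys the most preferred offered
    product; the outside option yields revenue 0."""
    offered = set(assortment) | {outside}
    total = 0.0
    for ranking, p in pmf.items():
        first = next(v for v in ranking if v in offered)
        if first != outside:
            total += p * revenues[first]
    return total

def algorithm_1(revenues, pmf, outside, eps):
    """Enumerate every assortment of size at most M = ceil(1/eps); keep the best."""
    products = [i for i in range(len(revenues)) if i != outside]
    m = math.ceil(1.0 / eps)
    best, best_rev = set(), 0.0
    for size in range(1, min(m, len(products)) + 1):
        for cand in combinations(products, size):
            rev = expected_revenue(cand, revenues, pmf, outside)
            if rev > best_rev:
                best, best_rev = set(cand), rev
    return best, best_rev

# Toy instance: 4 products plus the outside option (index 4, ranked last in omega).
revenues = [0.8, 0.5, 0.9, 0.3, 0.0]
pmf = mallows_pmf(5, theta=0.5)
assortment, revenue = algorithm_1(revenues, pmf, outside=4, eps=0.5)  # M = 2
```

With $\epsilon = 0.5$, only assortments of size at most $M = 2$ are examined, yet by the sample-path argument above the returned revenue is within a $(1 - 2\epsilon)$ factor of the optimum.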
Our approach is based on an efficient procedure for sampling a random permutation according to a Mallows model with modal ranking $\omega$ and concentration parameter $\theta$. The random permutation is constructed sequentially, as explained in Algorithm 2.

Algorithm 2 Repeated insertion procedure
1: Let $\sigma = (a_1)$.
2: For $i = 2, \ldots, n$, insert $a_i$ into $\sigma$ at position $s \in \{1, \ldots, i\}$ with probability
$$\pi_{i,s} = \frac{e^{-\theta(i-s)}}{1 + e^{-\theta} + \cdots + e^{-\theta(i-1)}}.$$
3: Return $\sigma$.

Lemma 5.1 (Theorem 3 in Lu and Boutilier (2011)). The repeated insertion procedure generates a random sample from a Mallows distribution with modal ranking $\omega$ and concentration parameter $\theta$.

Based on the correctness of this procedure, we describe a dynamic program to compute the choice probabilities of a general offer set $S$. The key idea is to decompose these probabilities according to the position at which a product is chosen. In particular, for $i \leq m$ and $s \in [m]$, let $\pi(i,s,m)$ denote the probability that product $a_i$ is chosen (i.e., appears first among products in $S$) at position $s$ after the $m$-th step of Algorithm 2. In other words, $\pi(i,s,m)$ corresponds to a choice probability when restricting $U$ to the first $m$ products $a_1, \ldots, a_m$. With this notation, we have for all $i \in [n]$,
$$P(a_i \mid S) = \sum_{s=1}^{n} \pi(i,s,n).$$
We compute $\pi(i,s,m)$ iteratively for $m = 1, \ldots, n$. In particular, in order to compute $\pi(i,s,m+1)$, we use the correctness of the sampling procedure. Specifically, starting from a permutation that includes the products $a_1, \ldots, a_m$, the product $a_{m+1}$ is inserted at position $j$ with probability $\pi_{m+1,j}$, and we have two cases to consider.

Case 1: $a_{m+1} \notin S$. In this case, $\pi(m+1, s, m+1) = 0$ for all $s = 1, \ldots, m+1$. Consider a product $a_i$ for $i \leq m$. In order for $a_i$ to be chosen at position $s$ after $a_{m+1}$ is inserted, one of the following events has to occur: (i) $a_i$ was already chosen at position $s$ before $a_{m+1}$ is inserted, and $a_{m+1}$ is inserted at a position $\ell > s$; or (ii) $a_i$ was chosen at position $s-1$, and $a_{m+1}$ is inserted at a position $\ell \leq s-1$.
Consequently, we have for all $i \leq m$,
$$\pi(i,s,m+1) = \sum_{\ell=s+1}^{m+1} \pi_{m+1,\ell}\, \pi(i,s,m) + \sum_{\ell=1}^{s-1} \pi_{m+1,\ell}\, \pi(i,s-1,m) = (1 - \Pi_{m+1,s})\, \pi(i,s,m) + \Pi_{m+1,s-1}\, \pi(i,s-1,m),$$
where $\Pi_{m,s} = \sum_{\ell=1}^{s} \pi_{m,\ell}$ for all $m, s$.

Case 2: $a_{m+1} \in S$. Consider a product $a_i$ with $i \leq m$. This product is chosen at position $s$ only if it was already chosen at position $s$ and $a_{m+1}$ is inserted at a position $\ell > s$. Therefore, for all $i \leq m$,
$$\pi(i,s,m+1) = (1 - \Pi_{m+1,s})\, \pi(i,s,m).$$
For product $a_{m+1}$, it is chosen at position $s$ only if the product previously chosen among $a_1, \ldots, a_m$ appears at a position $\ell \geq s$ and $a_{m+1}$ is inserted at position $s$, implying that
$$\pi(m+1,s,m+1) = \pi_{m+1,s} \sum_{i \leq m} \sum_{\ell=s}^{n} \pi(i,\ell,m).$$
Algorithm 3 summarizes this procedure.

Algorithm 3 Computing choice probabilities
1: Let $S$ be a general offer set. Without loss of generality, we assume that $a_1 \in S$.
2: Let $\pi(1,1,1) = 1$.
3: For $m = 1, \ldots, n-1$:
   (a) For all $i \leq m$ and $s = 1, \ldots, m+1$, let
   $$\pi(i,s,m+1) = (1 - \Pi_{m+1,s})\, \pi(i,s,m) + 1\!\mathrm{l}[a_{m+1} \notin S]\, \Pi_{m+1,s-1}\, \pi(i,s-1,m).$$
   (b) For $s = 1, \ldots, m+1$, let
   $$\pi(m+1,s,m+1) = 1\!\mathrm{l}[a_{m+1} \in S]\, \pi_{m+1,s} \sum_{i \leq m} \sum_{\ell=s}^{n} \pi(i,\ell,m).$$
4: For all $i \in [n]$, return $P(a_i \mid S) = \sum_{s=1}^{n} \pi(i,s,n)$.

Theorem 5.2. For any offer set $S$, Algorithm 3 returns the choice probabilities under a Mallows distribution with modal ranking $\omega$ and concentration parameter $\theta$.

This dynamic programming approach provides an $O(n^3)$ time algorithm for computing $P(a \mid S)$ for all products $a \in S$ simultaneously. Moreover, as explained in the next section, these ideas lead to an algorithm for solving the assortment optimization problem.
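The recursion behind Algorithm 3 is short to implement, and on small instances it can be validated against direct summation over all $n!$ permutations. The sketch below (function names are illustrative) uses the identity as the modal ranking and, as in Algorithm 3, assumes $a_1 \in S$; the tail sums are recomputed naively, so this transcription runs in $O(n^4)$ rather than the $O(n^3)$ achievable with prefix sums.

```python
import math
from itertools import permutations

def insertion_probs(theta, n):
    """pi[i][s] = probability that step i of Algorithm 2 inserts a_i at position s."""
    pi = {}
    for i in range(1, n + 1):
        denom = sum(math.exp(-theta * l) for l in range(i))
        pi[i] = {s: math.exp(-theta * (i - s)) / denom for s in range(1, i + 1)}
    return pi

def choice_probs(S, theta, n):
    """Algorithm 3: P(a_i | S) for every i in S; assumes 1 in S (products 1-indexed)."""
    assert 1 in S
    pi = insertion_probs(theta, n)
    # cum[i][s] = Pi_{i,s} = sum_{l <= s} pi[i][l]; equals 1 once s >= i
    cum = {}
    for i in range(1, n + 1):
        cum[i] = {0: 0.0}
        for s in range(1, n + 1):
            cum[i][s] = cum[i][s - 1] + pi[i].get(s, 0.0)
    p = {i: {s: 0.0 for s in range(1, n + 1)} for i in range(1, n + 1)}
    p[1][1] = 1.0  # pi(1, 1, 1) = 1
    for m in range(1, n):
        new = {i: {s: 0.0 for s in range(1, n + 1)} for i in range(1, n + 1)}
        for i in range(1, m + 1):
            for s in range(1, m + 2):
                val = (1.0 - cum[m + 1][s]) * p[i][s]       # a_{m+1} inserted after s
                if (m + 1) not in S and s >= 2:             # a_i shifted from s-1 to s
                    val += cum[m + 1][s - 1] * p[i][s - 1]
                new[i][s] = val
        if (m + 1) in S:  # a_{m+1} chosen at s iff inserted there, ahead of the old choice
            for s in range(1, m + 2):
                tail = sum(p[i][l] for i in range(1, m + 1) for l in range(s, n + 1))
                new[m + 1][s] = pi[m + 1][s] * tail
        p = new
    return {i: sum(p[i].values()) for i in sorted(S)}

def brute_force(S, theta, n):
    """Direct summation over all n! permutations, for validation."""
    def inversions(seq):
        return sum(1 for a in range(n) for b in range(a + 1, n) if seq[a] > seq[b])
    w = {s: math.exp(-theta * inversions(s)) for s in permutations(range(1, n + 1))}
    z = sum(w.values())
    out = {i: 0.0 for i in S}
    for seq, wt in w.items():
        out[next(v for v in seq if v in S)] += wt / z
    return out

probs = choice_probs({1, 3}, theta=0.4, n=3)
```

For example, on $S = \{a_1, a_3\}$ with $n = 3$, summing over all six permutations gives $P(a_3 \mid S) = (2e^{-2\theta} + e^{-3\theta})/((1+e^{-\theta})(1+e^{-\theta}+e^{-2\theta}))$, and the recursion reproduces this value.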
6 Assortment optimization: an integer programming formulation

Building on Algorithm 3 and introducing a binary variable for each product, we can reformulate the assortment optimization problem (2) under a mixture of Mallows model as a mixed integer program (MIP). Although the MIP formulation does not enjoy the theoretical guarantees of the PTAS (i.e., an upper bound on the running time), it does not require Assumption 2 to hold. Again, we start with a single Mallows model. In particular, we give a MIP with only $O(n^3)$ variables and constraints, of which only $n$ are 0-1 variables. We assume for simplicity that the first product of $S$ (say $a_1$) is known. Since this product is generally not known a priori, in order to obtain an optimal solution to problem (2), we need to guess the first offered product and solve the integer program below for each of the $O(n)$ guesses. We note that the MIP formulation is presented for the unconstrained assortment optimization problem but is quite powerful and can handle a large class of constraints on the assortment (such as cardinality and capacity constraints).

Theorem 6.1. Conditional on $a_1 \in S$, the following mixed integer program (MIP) computes an optimal solution to the unconstrained assortment optimization problem under a Mallows model:

max $\sum_{i,s} r_i\, \pi(i,s,n)$
s.t. $\pi(1,1,1) = 1$, $\pi(1,s,1) = 0$, $\forall s = 2,\ldots,n$
     $\pi(i,s,m+1) = (1 - \Pi_{m+1,s})\, \pi(i,s,m) + y_{i,s,m+1}$, $\forall i, s, m$
     $\pi(m+1,s,m+1) = z_{s,m+1}$, $\forall s, m$
     $y_{i,s,m+1} \leq \Pi_{m+1,s-1}\, \pi(i,s-1,m)$, $\forall i, s, m$
     $0 \leq y_{i,s,m+1} \leq \Pi_{m+1,s-1}\, (1 - x_{m+1})$, $\forall i, s, m$
     $z_{s,m+1} \leq \pi_{m+1,s} \sum_{\ell=s}^{n} \sum_{i=1}^{m} \pi(i,\ell,m)$, $\forall s, m$
     $0 \leq z_{s,m+1} \leq \pi_{m+1,s}\, x_{m+1}$, $\forall s, m$
     $x_1 = 1$, $x_q = 1$, $x_m \in \{0,1\}$

Proof. Let $x = (x_1,\ldots,x_n)$ be a feasible binary vector for the MIP and let $S = \{a_i : x_i = 1\}$. Note that there is a one-to-one correspondence between feasible vectors $x$ for the MIP and feasible assortments $S$ such
that $a_1 \in S$ and $a_q \in S$. Consequently, we can rewrite the MIP as

max$_{S \subseteq U,\, a_q \in S}$ max $\sum_{i,s} r_i\, \pi(i,s,n)$
s.t. $\pi(1,1,1) = 1$
     $\pi(i,s,m+1) = (1 - \Pi_{m+1,s})\, \pi(i,s,m) + y_{i,s,m+1}$, $\forall i, s, m$
     $\pi(m+1,s,m+1) = z_{s,m+1}$, $\forall s, m$
     $0 \leq y_{i,s,m+1} \leq 1\!\mathrm{l}[a_{m+1} \notin S]\, \Pi_{m+1,s-1}\, \pi(i,s-1,m)$, $\forall i, s, m$
     $0 \leq z_{s,m+1} \leq 1\!\mathrm{l}[a_{m+1} \in S]\, \pi_{m+1,s} \sum_{\ell=s}^{n} \sum_{i=1}^{m} \pi(i,\ell,m)$, $\forall s, m$

Note that it is always optimal to set $y_{i,s,m}$ and $z_{s,m}$ at their upper bounds because all the coefficients in the objective function are non-negative. The correctness of Algorithm 3 then shows that the MIP is an equivalent formulation of the unconstrained assortment optimization problem under a Mallows model.

We now present the MIP formulation for the unconstrained assortment optimization problem under a mixture of Mallows model. Again, we emphasize that the binary variables allow capturing a wide variety of constraints.

Theorem 6.2. Conditional on $a_1 \in S$, the following mixed integer program (MIP) computes an optimal solution to the unconstrained assortment optimization problem under a mixture of Mallows model:

max $\sum_{i,s,k} r_i\, p_k\, \pi^k(i,s,n)$
s.t. $\pi^k(1,1,1) = 1$, $\pi^k(1,s,1) = 0$, $\forall s = 2,\ldots,n$, $\forall k$
     $\pi^k(i,s,m+1) = (1 - \Pi^k_{m+1,s})\, \pi^k(i,s,m) + y^k_{i,s,m+1}$, $\forall i, s, k, m$
     $\pi^k(m+1,s,m+1) = z^k_{s,m+1}$, $\forall s, k, m$
     $y^k_{i,s,m+1} \leq \Pi^k_{m+1,s-1}\, \pi^k(i,s-1,m)$, $\forall i, s, k, m$
     $0 \leq y^k_{i,s,m+1} \leq \Pi^k_{m+1,s-1}\, (1 - x_{\omega_k(a_{m+1})})$, $\forall i, s, k, m$
     $z^k_{s,m+1} \leq \pi^k_{m+1,s} \sum_{\ell=s}^{n} \sum_{i=1}^{m} \pi^k(i,\ell,m)$, $\forall s, k, m$
     $0 \leq z^k_{s,m+1} \leq \pi^k_{m+1,s}\, x_{\omega_k(a_{m+1})}$, $\forall s, k, m$
     $x_1 = 1$, $x_q = 1$, $x_m \in \{0,1\}$

7 Numerical experiments

In this section, we examine how the MIP performs in terms of running time. We consider the following simulation setup for a single Mallows model. Product prices are sampled independently and uniformly at random from the interval $[0,1]$. The modal ranking is fixed to the identity ranking with the outside option ranked at the top. The outside option being ranked at the top is characteristic of applications in which the retailer captures a small fraction of the market and the outside option represents the (much larger) rest of the market. Indeed, most of the customers visiting a website or a store leave without making a purchase. Because the outside option is always offered, we need to solve only a single instance of the MIP (described in Theorem 6.1). Note that in the more general setting, the number of MIPs that must be solved equals the minimum of the rank of the outside option and the rank of the highest-revenue item. Because the MIPs are independent of each other, they can be solved in parallel. We solved the MIPs using the Gurobi Optimizer on a computer with a 2.4GHz Intel Core i5 processor, 8GB of RAM, and the Mac OS El Capitan operating system. In order to improve the running time of the MIP, we strengthened some of the constraints, as described next.

Strengthening of the MIP formulation. We use structural properties of the optimal solution to tighten some of the upper bounds involving the binary variables in the MIP formulation. In particular, for all $i$, $s$, and $m$, we replace the constraint
$$y_{i,s,m+1} \leq \Pi_{m+1,s-1}\, (1 - x_{m+1})$$
by the constraint
$$y_{i,s,m+1} \leq \Pi_{m+1,s-1}\, u_{i,s,m}\, (1 - x_{m+1}),$$
where $u_{i,s,m}$ is the probability that product $a_i$ is selected at position $s-1$ after the $m$-th step of Algorithm 2 when the offer set is $S = \{a_i, a_{i^*}, a_q\}$, i.e., when apart from $a_i$ only the highest-priced product is offered.
Since we know that the highest-priced product is always offered in the optimal assortment, this is a valid upper bound on $\pi(i,s-1,m)$, and therefore a valid strengthening of the constraint. Similarly, for all $s$ and $m$, we replace the constraint
$$z_{s,m+1} \leq \pi_{m+1,s}\, x_{m+1}$$
by the constraint
$$z_{s,m+1} \leq \pi_{m+1,s}\, v_{s,m}\, x_{m+1},$$
where $v_{s,m}$ is the probability that the selected product appears at some position $\ell = s,\ldots,n$ after the $m$-th step of Algorithm 2, when the offer set is $S = \{a_q\}$ if $a_{m+1} \succ_\omega a_{i^*}$, and $S = \{a_q, a_{i^*}\}$ otherwise. Again using the fact that the highest-priced product is always offered in the optimal assortment, we can show that this is a valid upper bound.

Results and discussion. Table 1 shows the running time of the strengthened MIP formulation for different values of $e^{-\theta}$ and $n$. For each pair of parameters, we generated 50 different instances.

Table 1: Running time (average and maximum, in seconds) of the strengthened MIP for various values of $e^{-\theta}$ and $n$.

We note that the strengthening improves the running time considerably: under the initial formulation, the MIP did not terminate after several hours for $n = 25$, whereas it terminated within a few minutes with the additional strengthening. The MIP obtains the optimal solution in a reasonable amount of time for the considered parameter values. Outside of this range, i.e., when $e^{-\theta}$ is too small or when $n$ is too large, numerical instabilities may arise. The strengthening we propose is one way to improve the running time of the MIP, but other numerical optimization techniques may improve it further. Finally, we emphasize that the MIP formulation is valuable because of its flexibility in handling versatile business constraints (such as cardinality or capacity constraints) that naturally
arise in practice.

References

Aouad, A., V. Farias, R. Levi, D. Segev. The approximability of assortment optimization under ranking preferences. Available at SSRN.
Awasthi, P., A. Blum, O. Sheffet, A. Vijayaraghavan. Learning mixtures of ranking models. Advances in Neural Information Processing Systems.
Blanchet, J., G. Gallego, V. Goyal. A Markov chain approximation to choice modeling. Proceedings of the ACM Conference on Electronic Commerce (EC).
Brightwell, G., P. Winkler. Counting linear extensions is #P-complete. STOC '91: Proceedings of the Twenty-Third Annual ACM Symposium on Theory of Computing.
Bront, J. J. M., I. Méndez-Díaz, G. Vulcano. A column generation algorithm for choice-based network revenue management. Operations Research 57(3).
Cooley, J., J. Tukey. An algorithm for the machine calculation of complex Fourier series. Mathematics of Computation 19(90).
Davis, J. M., G. Gallego, H. Topaloglu. Assortment optimization under variants of the nested logit model. Operations Research 62(2).
Doignon, J., A. Pekeč, M. Regenwetter. The repeated insertion model for rankings: Missing link between two subset choice models. Psychometrika 69(1).
Dwork, C., R. Kumar, M. Naor, D. Sivakumar. Rank aggregation methods for the web. Proceedings of the 10th International Conference on World Wide Web.
Farias, V., S. Jagabathula, D. Shah. 2013a. A new approach to modeling choice with limited data. Management Science 59(2).
Farias, V., S. Jagabathula, D. Shah. 2013b. A nonparametric approach to modeling choice with limited data. Management Science 59(2).
Gallego, G., H. Topaloglu. Constrained assortment optimization for the nested logit model. Management Science 60(10).
Guiver, J., E. Snelson. Bayesian inference for Plackett-Luce ranking models. Proceedings of the 26th Annual International Conference on Machine Learning.
Honhon, D., S. Jonnalagedda, X. Pan. Optimal algorithms for assortment selection under ranking-based consumer choice models. Manufacturing & Service Operations Management 14(2).
Jagabathula, S. Assortment optimization under general choice. Available at SSRN.
Lebanon, G., Y. Mao. Non-parametric modeling of partially ranked data. Journal of Machine Learning Research.
Lu, T., C. Boutilier. 2011. Learning Mallows models with pairwise preferences. Proceedings of the 28th International Conference on Machine Learning.
Luce, R. D. Individual Choice Behavior: A Theoretical Analysis. Wiley.
Mallows, C. Non-null ranking models. Biometrika 44(1).
Marden, J. Analyzing and Modeling Rank Data. Chapman and Hall.
McFadden, D. Modeling the choice of residential location. Transportation Research Record (672).
McFadden, D., K. Train. Mixed MNL models for discrete response. Journal of Applied Econometrics 15(5).
Meila, M., K. Phadnis, A. Patterson, J. Bilmes. Consensus ranking under the exponential model. arXiv preprint.
Murphy, T., D. Martin. Mixtures of distance-based models for ranking data. Computational Statistics & Data Analysis 41(3).
Plackett, R. L. The analysis of permutations. Applied Statistics 24(2).
Talluri, K., G. van Ryzin. Revenue management under a general discrete choice model of consumer behavior. Management Science 50(1).
Train, K. Discrete Choice Methods with Simulation. Cambridge University Press, New York, NY.
van Ryzin, G., G. Vulcano. A market discovery algorithm to estimate a general class of nonparametric choice models. Management Science 61(2).
Williams, H. C. W. L. On the formation of travel demand models and economic evaluation measures of user benefit. Environment and Planning A 3(9).
Yellott, J. I. The relationship between Luce's choice axiom, Thurstone's theory of comparative judgment, and the double exponential distribution. Journal of Mathematical Psychology 15(2).

A Proof of Theorem 3.2 (continued)

In this section, we prove that for a fixed $R$, $\sum_{\sigma \in h(R)} e^{-\theta\, c_3(\sigma)}$ is equal to
$$\frac{\psi(|G_{m_0}|, \theta)}{\psi(|S| + m_0, \theta)} \cdot \frac{e^{-\theta\,(k - 1 - \sum_{m=1}^{\ell-1} r_m)}}{1 + \cdots + e^{-\theta(|S| + m_0 - 1)}} \prod_{m=1}^{M} \frac{\psi(|G_m|, \theta)}{\psi(r_m, \theta)\, \psi(|G_m| - r_m, \theta)}.$$
More informationCurve Fitting Re-visited, Bishop1.2.5
Curve Fitting Re-visited, Bishop1.2.5 Maximum Likelihood Bishop 1.2.5 Model Likelihood differentiation p(t x, w, β) = Maximum Likelihood N N ( t n y(x n, w), β 1). (1.61) n=1 As we did in the case of the
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear
More informationDo Markov-Switching Models Capture Nonlinearities in the Data? Tests using Nonparametric Methods
Do Markov-Switching Models Capture Nonlinearities in the Data? Tests using Nonparametric Methods Robert V. Breunig Centre for Economic Policy Research, Research School of Social Sciences and School of
More informationarxiv: v2 [math.oc] 22 Dec 2017
Managing Appointment Booking under Customer Choices Nan Liu 1, Peter M. van de Ven 2, and Bo Zhang 3 arxiv:1609.05064v2 [math.oc] 22 Dec 2017 1 Operations Management Department, Carroll School of Management,
More informationApproximation Algorithms for Re-optimization
Approximation Algorithms for Re-optimization DRAFT PLEASE DO NOT CITE Dean Alderucci Table of Contents 1.Introduction... 2 2.Overview of the Current State of Re-Optimization Research... 3 2.1.General Results
More informationGaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008
Gaussian processes Chuong B Do (updated by Honglak Lee) November 22, 2008 Many of the classical machine learning algorithms that we talked about during the first half of this course fit the following pattern:
More informationBayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2016
Bayesian Networks: Construction, Inference, Learning and Causal Interpretation Volker Tresp Summer 2016 1 Introduction So far we were mostly concerned with supervised learning: we predicted one or several
More informationA column generation algorithm for choice-based network revenue management
A column generation algorithm for choice-based network revenue management Juan José Miranda Bront Isabel Méndez-Díaz Gustavo Vulcano April 30, 2007 Abstract In the last few years, there has been a trend
More informationarxiv: v2 [cs.dm] 30 Aug 2018
Assortment Optimization under the Sequential Multinomial Logit Model Alvaro Flores Gerardo Berbeglia Pascal Van Hentenryck arxiv:1707.02572v2 [cs.dm] 30 Aug 2018 Friday 31 st August, 2018 Abstract We study
More informationLearning MN Parameters with Alternative Objective Functions. Sargur Srihari
Learning MN Parameters with Alternative Objective Functions Sargur srihari@cedar.buffalo.edu 1 Topics Max Likelihood & Contrastive Objectives Contrastive Objective Learning Methods Pseudo-likelihood Gradient
More informationLearning Combinatorial Functions from Pairwise Comparisons
Learning Combinatorial Functions from Pairwise Comparisons Maria-Florina Balcan Ellen Vitercik Colin White Abstract A large body of work in machine learning has focused on the problem of learning a close
More informationChapter 3: Discrete Optimization Integer Programming
Chapter 3: Discrete Optimization Integer Programming Edoardo Amaldi DEIB Politecnico di Milano edoardo.amaldi@polimi.it Website: http://home.deib.polimi.it/amaldi/opt-16-17.shtml Academic year 2016-17
More informationLecture 1: March 7, 2018
Reinforcement Learning Spring Semester, 2017/8 Lecture 1: March 7, 2018 Lecturer: Yishay Mansour Scribe: ym DISCLAIMER: Based on Learning and Planning in Dynamical Systems by Shie Mannor c, all rights
More informationBayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2014
Bayesian Networks: Construction, Inference, Learning and Causal Interpretation Volker Tresp Summer 2014 1 Introduction So far we were mostly concerned with supervised learning: we predicted one or several
More informationStochastic Submodular Cover with Limited Adaptivity
Stochastic Submodular Cover with Limited Adaptivity Arpit Agarwal Sepehr Assadi Sanjeev Khanna Abstract In the submodular cover problem, we are given a non-negative monotone submodular function f over
More information1 [15 points] Frequent Itemsets Generation With Map-Reduce
Data Mining Learning from Large Data Sets Final Exam Date: 15 August 2013 Time limit: 120 minutes Number of pages: 11 Maximum score: 100 points You can use the back of the pages if you run out of space.
More informationCS264: Beyond Worst-Case Analysis Lecture #15: Smoothed Complexity and Pseudopolynomial-Time Algorithms
CS264: Beyond Worst-Case Analysis Lecture #15: Smoothed Complexity and Pseudopolynomial-Time Algorithms Tim Roughgarden November 5, 2014 1 Preamble Previous lectures on smoothed analysis sought a better
More informationA minimalist s exposition of EM
A minimalist s exposition of EM Karl Stratos 1 What EM optimizes Let O, H be a random variables representing the space of samples. Let be the parameter of a generative model with an associated probability
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate
More informationRandomized algorithms for lexicographic inference
Randomized algorithms for lexicographic inference R. Kohli 1 K. Boughanmi 1 V. Kohli 2 1 Columbia Business School 2 Northwestern University 1 What is a lexicographic rule? 2 Example of screening menu (lexicographic)
More informationOutline. The binary choice model. The multinomial choice model. Extensions of the basic choice model
Outline The binary choice model Illustration Specification of the binary choice model Interpreting the results of binary choice models ME output The multinomial choice model Illustration Specification
More informationEstimating Primary Demand in Bike-sharing Systems
Estimating Primary Demand in Bike-sharing Systems Chong Yang Goh Operations Research Center, Massachusetts Institute of Technology cygoh@mit.edu Chiwei Yan Operations Research Center, Massachusetts Institute
More informationSeparation Techniques for Constrained Nonlinear 0 1 Programming
Separation Techniques for Constrained Nonlinear 0 1 Programming Christoph Buchheim Computer Science Department, University of Cologne and DEIS, University of Bologna MIP 2008, Columbia University, New
More informationCS168: The Modern Algorithmic Toolbox Lecture #6: Regularization
CS168: The Modern Algorithmic Toolbox Lecture #6: Regularization Tim Roughgarden & Gregory Valiant April 18, 2018 1 The Context and Intuition behind Regularization Given a dataset, and some class of models
More informationDistributed Optimization. Song Chong EE, KAIST
Distributed Optimization Song Chong EE, KAIST songchong@kaist.edu Dynamic Programming for Path Planning A path-planning problem consists of a weighted directed graph with a set of n nodes N, directed links
More informationManaging Appointment Scheduling under Patient Choices
Submitted to manuscript (Please, provide the manuscript number!) Authors are encouraged to submit new papers to INFORMS journals by means of a style file template, which includes the journal title. However,
More informationOn the Problem of Error Propagation in Classifier Chains for Multi-Label Classification
On the Problem of Error Propagation in Classifier Chains for Multi-Label Classification Robin Senge, Juan José del Coz and Eyke Hüllermeier Draft version of a paper to appear in: L. Schmidt-Thieme and
More informationKernel Methods and Support Vector Machines
Kernel Methods and Support Vector Machines Oliver Schulte - CMPT 726 Bishop PRML Ch. 6 Support Vector Machines Defining Characteristics Like logistic regression, good for continuous input features, discrete
More informationChapter 3: Discrete Optimization Integer Programming
Chapter 3: Discrete Optimization Integer Programming Edoardo Amaldi DEIB Politecnico di Milano edoardo.amaldi@polimi.it Sito web: http://home.deib.polimi.it/amaldi/ott-13-14.shtml A.A. 2013-14 Edoardo
More informationPreference Elicitation for Sequential Decision Problems
Preference Elicitation for Sequential Decision Problems Kevin Regan University of Toronto Introduction 2 Motivation Focus: Computational approaches to sequential decision making under uncertainty These
More informationScheduling Markovian PERT networks to maximize the net present value: new results
Scheduling Markovian PERT networks to maximize the net present value: new results Hermans B, Leus R. KBI_1709 Scheduling Markovian PERT networks to maximize the net present value: New results Ben Hermans,a
More informationAn Improved Approximation Algorithm for Virtual Private Network Design
An Improved Approximation Algorithm for Virtual Private Network Design Friedrich Eisenbrand Fabrizio Grandoni Abstract Virtual private network design deals with the reservation of capacities in a network,
More informationModeling High-Dimensional Discrete Data with Multi-Layer Neural Networks
Modeling High-Dimensional Discrete Data with Multi-Layer Neural Networks Yoshua Bengio Dept. IRO Université de Montréal Montreal, Qc, Canada, H3C 3J7 bengioy@iro.umontreal.ca Samy Bengio IDIAP CP 592,
More informationA Near-Optimal Exploration-Exploitation Approach for Assortment Selection
A Near-Optimal Exploration-Exploitation Approach for Assortment Selection SHIPRA AGRAWAL, VASHIS AVADHANULA, VINEE GOYAL, and ASSAF ZEEVI, Columbia University We consider an online assortment optimization
More informationNear-Potential Games: Geometry and Dynamics
Near-Potential Games: Geometry and Dynamics Ozan Candogan, Asuman Ozdaglar and Pablo A. Parrilo January 29, 2012 Abstract Potential games are a special class of games for which many adaptive user dynamics
More informationRank aggregation in cyclic sequences
Rank aggregation in cyclic sequences Javier Alcaraz 1, Eva M. García-Nové 1, Mercedes Landete 1, Juan F. Monge 1, Justo Puerto 1 Department of Statistics, Mathematics and Computer Science, Center of Operations
More informationThe Maximum Flow Problem with Disjunctive Constraints
The Maximum Flow Problem with Disjunctive Constraints Ulrich Pferschy Joachim Schauer Abstract We study the maximum flow problem subject to binary disjunctive constraints in a directed graph: A negative
More informationOnline Supplement to:
Online Supplement to: Adaptive Appointment Systems with Patient Preferences Wen-Ya Wang Diwakar Gupta wenya@ie.umn.edu guptad@me.umn.edu Industrial and Systems Engineering Program, University of Minnesota
More informationRandomized Algorithms
Randomized Algorithms Saniv Kumar, Google Research, NY EECS-6898, Columbia University - Fall, 010 Saniv Kumar 9/13/010 EECS6898 Large Scale Machine Learning 1 Curse of Dimensionality Gaussian Mixture Models
More informationA Database Framework for Probabilistic Preferences
A Database Framework for Probabilistic Preferences Batya Kenig 1, Benny Kimelfeld 1, Haoyue Ping 2, and Julia Stoyanovich 2 1 Technion, Israel 2 Drexel University, USA Abstract. We propose a framework
More informationMonotone Submodular Maximization over a Matroid
Monotone Submodular Maximization over a Matroid Yuval Filmus January 31, 2013 Abstract In this talk, we survey some recent results on monotone submodular maximization over a matroid. The survey does not
More information