
Technical Note: An Expectation-Maximization Algorithm to Estimate the Parameters of the Markov Chain Choice Model

A. Serdar Şimşek [1], Huseyin Topaloglu [2]

August 1, 2017

Abstract

We develop an expectation-maximization algorithm to estimate the parameters of the Markov chain choice model. In this choice model, a customer arrives into the system to purchase a certain product. If this product is available for purchase, then the customer purchases it. Otherwise, the customer transitions between the products according to a transition probability matrix until she reaches an available one and purchases this product. The parameters of the Markov chain choice model are the probability that the customer arrives into the system to purchase each one of the products and the entries of the transition probability matrix. In our expectation-maximization algorithm, we treat the path that a customer follows in the Markov chain as the missing piece of the data. Conditional on the final purchase decision of a customer, we show how to compute the probability that the customer arrives into the system to purchase a certain product and the expected number of times that the customer transitions from a certain product to another one. These results allow us to execute the expectation step of our algorithm. Also, we show how to solve the optimization problem that appears in the maximization step of our algorithm. Our computational experiments show that the Markov chain choice model, coupled with our expectation-maximization algorithm, can yield better predictions of customer choice behavior when compared with other commonly used alternatives.

[1] Naveen Jindal School of Management, The University of Texas at Dallas, Richardson, TX 75080, USA, serdar.simsek@utdallas.edu
[2] School of Operations Research and Information Engineering, Cornell Tech, New York, NY 10011, USA, topaloglu@orie.cornell.edu

1 Introduction

Incorporating customer choice behavior into revenue management models has been receiving considerable attention. Traditional revenue management models capture the customer demand for a product through an exogenous random variable whose distribution does not depend on what other products are available. In reality, however, many customers choose and substitute among the available products in a particular product category, either because the customer arrives with no specific product in mind and makes a choice among the offered products or because the customer arrives with a specific product in mind and this product is not available. Field experiments, customer surveys and controlled studies in Zinn and Liu (2001), Chong et al. (2001), Campo et al. (2003), Sloot et al. (2005) and Guadagni and Little (2008) indicate that customers indeed make a choice among the offered products after comparing them with respect to various features and substitute for another product when the product that they originally have in mind is not available. When customers choose and substitute, the demand for a particular product depends on what other products are available. Discrete choice models become useful to model the customer demand when the customers choose and substitute among the available products. Although discrete choice models can provide a more realistic model of the customer demand when compared with using an exogenous random variable, estimating the parameters of a discrete choice model can be challenging.

In this paper, we consider the problem of estimating the parameters of the Markov chain choice model. In this choice model, with a certain probability, a customer arriving into the system is interested in purchasing a certain product. If this product is available for purchase, then the customer purchases it and leaves the system. Otherwise, the customer transitions into another product with a certain transition probability and checks the availability of this next product. In this way, the customer transitions between the products until she reaches an available one. Therefore, the parameters of the Markov chain choice model are the probability that a customer arrives into the system to purchase each one of the products and the probability that a customer transitions from the current product to the next one when the current product is not available.

We develop an expectation-maximization algorithm to estimate the parameters of the Markov chain choice model from the past purchase history of the customers. The expectation-maximization algorithm dates back to Dempster et al. (1977) and it is useful for solving parameter estimation problems when the data available for estimation has a missing piece. In this algorithm, we start from some initial parameter estimates and iterate between the expectation and maximization steps. We focus on the so-called complete log-likelihood function, which is constructed under the assumption that we have access to the missing piece of the data. In the expectation step, we compute the expectation of the complete log-likelihood function, when the distribution of the missing piece of the data is driven by the current parameter estimates. In the maximization step, we maximize the expectation of the complete log-likelihood function to obtain new parameter estimates and repeat the process starting from the new parameter estimates. In parameter estimation problems, our goal is to find a maximizer of the so-called incomplete log-likelihood function, which is constructed under the assumption that we do not have access to the missing piece of the data. Dempster et al. (1977) show that the successive parameter estimates from the expectation-maximization algorithm monotonically improve the value of the incomplete log-likelihood function. Wu (1983) and Nettleton (1999) give regularity conditions to ensure convergence to a local maximum of the incomplete log-likelihood function.

Main Contributions. The data for estimating the parameters of the Markov chain choice model are the set of products made available to each customer and the product purchased by each customer. In our expectation-maximization algorithm, we treat the path that a customer follows in the Markov chain choice model as the missing piece of the data. In the expectation step, we need to compute two quantities, when the distribution of the missing piece of the data is driven by the current parameter estimates. The first quantity is the probability that the customer arrives into the system to purchase a certain product, conditional on the final purchase decision of the customer in the data. The second quantity is the expected number of times that a customer transitions from a certain product to another one, also conditional on the final purchase decision of the customer in the data. We show how to compute these quantities by solving linear systems of equations. Also, we show that the optimization problem in the maximization step has a closed-form solution.

We give a convergence result for our expectation-maximization algorithm. In particular, we show that the value of the incomplete log-likelihood function at the parameter estimates generated by our algorithm monotonically increases and converges to the value at a local maximizer, under the assumption that the parameters of the Markov chain choice model that we are trying to estimate are bounded away from zero. This assumption is arguably mild since we can put a small but strictly positive lower bound on the parameters with a negligible effect on the choice probabilities. In our computational experiments, we fit a Markov chain choice model to different data sets by using our expectation-maximization algorithm and compare the fitted Markov chain choice model with a benchmark that captures the choice process of the customers by using the multinomial logit model. The out-of-sample log-likelihoods of the Markov chain choice model can improve those of the benchmark by as much as 2%.

Related Literature. The Markov chain choice model was proposed by Blanchet et al. (2013). The authors study assortment problems, where there is a fixed revenue associated with each product and the goal is to find a set of products to offer to maximize the expected revenue from a customer. They give a polynomial-time algorithm to solve the assortment problem under the Markov chain choice model exactly. Feldman and Topaloglu (2014) focus on a deterministic approximation for network revenue management problems, where the decision variables correspond to the durations of time during which different subsets of products are offered to the customers. Thus, the number of decision variables increases exponentially with the number of products. The authors show that if the customers choose under the Markov chain choice model, then the number of decision variables in the deterministic approximation increases linearly with the number of products. Desir et al. (2015) study assortment problems under the Markov chain choice model with a constraint on the total space consumption of the offered products. They give a constant-factor approximation algorithm and show that it is NP-hard to approximate the problem better than a fixed constant factor.

The expectation-maximization algorithm is used to estimate the parameters of various choice models. Vulcano et al. (2012) focus on estimating the parameters of the multinomial logit model when the demand is censored so that the customers who do not make a purchase are not recorded in the data. Following their work, we can also deal with demand censorship, as discussed in our conclusions section. Farias et al. (2013), van Ryzin and Vulcano (2015), van Ryzin and Vulcano (2016), Jagabathula and Vulcano (2016) and Jagabathula and Rusmevichientong (2016) consider the ranking-based choice model, where each customer has a ranked list of products in mind and she purchases the most preferred available product. The authors focus on estimating the parameters and coming up with ranked lists supported by the data. Chong et al. (2001), Kök and Fisher (2007), Misra (2008), Vulcano et al. (2010) and Dai et al. (2014) use real data to quantify the revenue improvements when one accounts for the customer choice process in assortment decisions.

Outline. In Section 2, we describe the Markov chain choice model. In Section 3, we provide the incomplete and complete likelihood functions. In Section 4, we give our expectation-maximization algorithm, show how to execute the expectation and maximization steps and discuss convergence. In Section 5, we give computational experiments. In Section 6, we conclude.

2 Markov Chain Choice Model

In the Markov chain choice model, we have n products indexed by N = {1, ..., n}. A customer arriving into the system is interested in purchasing product i with probability λ_i. If this product is available for purchase, then the customer purchases it and leaves the system. If this product is not available for purchase, then the customer transitions from product i to product j with probability ρ_{i,j}. The customer visits different products in this fashion until she visits a product that is available for purchase and purchases it. Naturally, we assume that Σ_{i ∈ N} λ_i = 1 and Σ_{j ∈ N} ρ_{i,j} = 1 for all i ∈ N. We note that a customer may have the option of leaving the system without a purchase in certain settings. To capture such a setting, we can assume that the option of leaving the system without a purchase corresponds to one of the products in N. This product is always available for purchase and if a customer visits this product, then she leaves the system without a purchase.
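The choice process above is straightforward to simulate, which also gives a way to sanity-check the formulas that follow. The sketch below is ours and not from the paper; the function name, the 0-indexed encoding and the toy parameters are illustrative.

```python
# Minimal simulation sketch of the Markov chain choice model: lam[i] is the
# arrival probability for product i and rho[i, j] the transition probability.
import numpy as np

def simulate_purchase(lam, rho, S, rng):
    """Sample the product purchased by one customer offered the set S."""
    i = rng.choice(len(lam), p=lam)   # product the customer arrives to buy
    while i not in S:                 # unavailable: transition and try again
        i = rng.choice(len(lam), p=rho[i])
    return i                          # first available product is purchased

rng = np.random.default_rng(0)
lam = np.array([0.5, 0.3, 0.2])
rho = np.array([[0.0, 0.6, 0.4],
                [0.5, 0.0, 0.5],
                [0.7, 0.3, 0.0]])
samples = [simulate_purchase(lam, rho, {0, 2}, rng) for _ in range(10000)]
print(np.bincount(samples, minlength=3) / 10000)  # empirical purchase frequencies
```

With these toy parameters, a customer interested in product 1 moves to product 0 or product 2 with equal probability, so the empirical frequencies should be close to (0.65, 0, 0.35), matching the probabilities computed from the linear system in (1) below.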

Blanchet et al. (2013) and Feldman and Topaloglu (2014) show that we can solve a linear system of equations to compute the probability that a customer purchases a certain product when we offer the subset S ⊆ N of products. In particular, we let Θ_i(S) be the expected number of times that a customer visits product i during the course of her choice process given that we offer the subset S of products. We can compute Θ(S) = (Θ_1(S), ..., Θ_n(S)) by solving

    Θ_i(S) = λ_i + Σ_{j ∈ N \ S} ρ_{j,i} Θ_j(S)    for all i ∈ N.    (1)

We interpret (1) as follows. On the left side, Θ_i(S) is the expected number of times that a customer visits product i. The expected number of times that a customer visits product i when she arrives into the system is λ_i, yielding the term λ_i on the right side. The expected number of times that a customer visits some product j ∈ N \ S is Θ_j(S). In each one of these visits, she transitions from product j to product i with probability ρ_{j,i}, yielding the term Σ_{j ∈ N \ S} ρ_{j,i} Θ_j(S) on the right side. If product i is available for purchase so that i ∈ S, then a customer can visit this product at most once, since she purchases this product whenever she visits it. So, the expected number of times that a customer visits product i ∈ S is the same as the probability that a customer visits this product, which is, in turn, the same as the probability that a customer purchases this product. Thus, if Θ(S) = (Θ_1(S), ..., Θ_n(S)) is the solution to (1), then the probability that a customer purchases product i ∈ S is Θ_i(S).

Using the vector λ = {λ_i : i ∈ N \ S} and the matrix ρ = {ρ_{i,j} : i, j ∈ N \ S}, by (1), the vector Θ(S) = {Θ_i(S) : i ∈ N \ S} satisfies Θ(S) = λ + ρᵀ Θ(S). Thus, we have (I − ρᵀ) Θ(S) = λ, where I is the identity matrix with the appropriate dimension. By Corollary C.4 in Puterman (1994), if we have Σ_{j ∈ N \ S} ρ_{i,j} < 1 for all i ∈ N \ S, then (I − ρ)⁻¹ exists and has non-negative entries. So, since ((I − ρ)ᵀ)⁻¹ = ((I − ρ)⁻¹)ᵀ, we obtain Θ(S) = ((I − ρ)⁻¹)ᵀ λ. Once we compute Θ(S) = {Θ_i(S) : i ∈ N \ S} by using the last equality, (1) implies that we can compute {Θ_i(S) : i ∈ S} as Θ_i(S) = λ_i + Σ_{j ∈ N \ S} ρ_{j,i} Θ_j(S) for all i ∈ S. This discussion implies that if we have Σ_{j ∈ N \ S} ρ_{i,j} < 1 for all i ∈ N \ S, then there exists a unique value of Θ(S) that satisfies (1) and this value of Θ(S) has non-negative entries.
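A minimal sketch of this computation, assuming the substochasticity condition above holds: restrict ρ to the unavailable products, solve (I − ρᵀ)Θ = λ, and then recover Θ_i(S) for the offered products from (1). The function name and encoding are ours.

```python
# Sketch: purchase probabilities under the Markov chain choice model via the
# linear system (1); assumes sum_j rho[i, j] < 1 for every i in N \ S.
import numpy as np

def purchase_probs(lam, rho, S):
    n = len(lam)
    out = sorted(set(range(n)) - set(S))          # N \ S: unavailable products
    theta = np.zeros(n)
    A = np.eye(len(out)) - rho[np.ix_(out, out)].T
    theta[out] = np.linalg.solve(A, lam[out])     # (I - rho^T) Theta = lambda
    for i in S:                                   # recover offered products via (1)
        theta[i] = lam[i] + sum(rho[j, i] * theta[j] for j in out)
    return theta                                  # theta[i] = purchase prob for i in S

lam = np.array([0.5, 0.3, 0.2])
rho = np.array([[0.0, 0.6, 0.4],
                [0.5, 0.0, 0.5],
                [0.7, 0.3, 0.0]])
print(purchase_probs(lam, rho, {0, 2}))           # [0.65, 0.3, 0.35]; sums to 1 on S
```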

3 Incomplete and Complete Likelihood Functions

Our goal is to estimate the parameters λ = (λ_1, ..., λ_n) and ρ = {ρ_{i,j} : i, j ∈ N} of the Markov chain choice model by using the data on the subsets of products offered to the customers and the purchase decisions of the customers. We use the maximum likelihood method to estimate the parameters (λ, ρ). To capture the product that a customer purchases, we define the random variable Z_i(S) ∈ {0, 1} such that Z_i(S) = 1 if and only if a customer purchases product i when we offer the subset S of products. Therefore, using e_i ∈ R^n to denote the unit vector with a one in the i-th component, we have Z(S) = (Z_1(S), ..., Z_n(S)) = e_i with probability Θ_i(S).

In the data that we have available to estimate the parameters of the Markov chain choice model, there are τ customers indexed by T = {1, ..., τ}. We use Ŝ^t ⊆ N to denote the subset of products offered to customer t. To capture the product purchased by this customer, we define Ẑ^t_i ∈ {0, 1} such that Ẑ^t_i = 1 if and only if customer t purchased product i, which implies that Ẑ^t = (Ẑ^t_1, ..., Ẑ^t_n) is a sample of the random variable Z(Ŝ^t) = (Z_1(Ŝ^t), ..., Z_n(Ŝ^t)). Thus, the data that is available to estimate the parameters of the Markov chain choice model is {(Ŝ^t, Ẑ^t) : t ∈ T}. The probability that customer t purchases product i is Θ_i(Ŝ^t | λ, ρ), where we explicitly show that the solution Θ(S | λ, ρ) = (Θ_1(S | λ, ρ), ..., Θ_n(S | λ, ρ)) to (1) depends on the parameters (λ, ρ) of the Markov chain choice model. In this case, the likelihood of the purchase decision of customer t is given by ∏_{i ∈ N} Θ_i(Ŝ^t | λ, ρ)^{Ẑ^t_i}, where we follow the convention that 0⁰ = 1. The log-likelihood of this purchase decision is Σ_{i ∈ N} Ẑ^t_i log Θ_i(Ŝ^t | λ, ρ). Assuming that the purchase decisions of the different customers are independent of each other, the log-likelihood of the data {(Ŝ^t, Ẑ^t) : t ∈ T} is

    L_I(λ, ρ) = Σ_{t ∈ T} Σ_{i ∈ N} Ẑ^t_i log Θ_i(Ŝ^t | λ, ρ).    (2)

To estimate the parameters of the Markov chain choice model, we can maximize L_I(λ, ρ) subject to the constraint that Σ_{i ∈ N} λ_i = 1, Σ_{j ∈ N} ρ_{i,j} = 1 for all i ∈ N, λ ∈ R^n_+ and ρ ∈ R^{n×n}_+. The difficulty with this approach is that there is no closed-form expression for Θ_i(Ŝ^t | λ, ρ) in the definition of L_I(λ, ρ), which is the main motivation for our expectation-maximization algorithm.
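Given the solver for (1), evaluating L_I(λ, ρ) is direct. The sketch below reuses the purchase_probs helper from the previous sketch and assumes each observation is stored as a pair (offer set, purchased product), which is our encoding of {(Ŝ^t, Ẑ^t) : t ∈ T}.

```python
# Sketch: incomplete log-likelihood (2); reuses purchase_probs from above.
import numpy as np

def incomplete_log_likelihood(lam, rho, data):
    """data: list of (S, k) pairs, S the offered set, k the purchased product."""
    ll = 0.0
    for S, k in data:
        theta = purchase_probs(lam, rho, S)   # solve system (1) for this offer set
        ll += np.log(theta[k])
    return ll
```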

In our expectation-maximization algorithm, we use a likelihood function constructed under the assumption that we have access to additional data for each customer. In particular, we define the random variable F_i ∈ {0, 1} such that F_i = 1 if and only if a customer arriving into the system is interested in purchasing product i. Thus, we have F = (F_1, ..., F_n) = e_i with probability λ_i. Also, we use the random variable X_{i,j}(S) to denote the number of times that a customer transitions from product i to product j during the course of her choice process when we offer the subset S of products. We do not give the probability law for the random variable X_{i,j}(S) explicitly, but we compute certain expectations involving this random variable in the next section.

For each customer t, we assume that we have access to additional data so that we know the product that this customer was interested in purchasing when she arrived into the system, as well as the number of times that she transitioned from each product i to each product j during the course of her choice process. (This assumption is temporary to facilitate our analysis and our expectation-maximization algorithm will not require having access to the additional data.) In particular, we define F̂^t_i ∈ {0, 1} such that F̂^t_i = 1 if and only if customer t was interested in purchasing product i when she arrived into the system. We use X̂^t_{i,j} to denote the number of times that customer t transitioned from product i to product j during the course of her choice process. So, F̂^t = (F̂^t_1, ..., F̂^t_n) and X̂^t = {X̂^t_{i,j} : i, j ∈ N} are respectively samples of the random variables F = (F_1, ..., F_n) and X(Ŝ^t) = {X_{i,j}(Ŝ^t) : i, j ∈ N}.

We construct a likelihood function under the assumption that the data that is available to estimate the parameters of the Markov chain choice model is {(Ŝ^t, Ẑ^t, F̂^t, X̂^t) : t ∈ T}. The probability that customer t is interested in purchasing product i when she arrives into the system is λ_i. Also, given that customer t is interested in purchasing product i upon arrival, the probability that she visits products i, i_1, i_2, ..., i_{k−1}, i_k, j to purchase product j is given by ρ_{i,i_1} ρ_{i_1,i_2} ⋯ ρ_{i_{k−1},i_k} ρ_{i_k,j}. In this case, the likelihood of the purchase decision of customer t is ∏_{i ∈ N} λ_i^{F̂^t_i} ∏_{i,j ∈ N} ρ_{i,j}^{X̂^t_{i,j}}. The log-likelihood of this purchase decision is Σ_{i ∈ N} F̂^t_i log λ_i + Σ_{i,j ∈ N} X̂^t_{i,j} log ρ_{i,j}. Thus, the log-likelihood of the data {(Ŝ^t, Ẑ^t, F̂^t, X̂^t) : t ∈ T} is

    L_C(λ, ρ) = Σ_{t ∈ T} Σ_{i ∈ N} F̂^t_i log λ_i + Σ_{t ∈ T} Σ_{i,j ∈ N} X̂^t_{i,j} log ρ_{i,j}.    (3)

Note that once we know {(F̂^t, X̂^t) : t ∈ T}, {(Ŝ^t, Ẑ^t) : t ∈ T} does not play a role in (3). To estimate the parameters of the Markov chain choice model, knowing {(F̂^t, X̂^t) : t ∈ T} is equivalent to knowing the path that a customer follows in the Markov chain, since the second term on the right side of (3) does not depend on the order in which the transitions take place.

The likelihood function L_C(λ, ρ) in (3) is constructed under the assumption that we have access to additional data for each customer. This likelihood function is known as the complete likelihood function and the subscript C in L_C(λ, ρ) stands for complete. In contrast, the likelihood function L_I(λ, ρ) in (2) is constructed under the assumption that we do not have access to the additional data for each customer. This likelihood function is known as the incomplete likelihood function and the subscript I in L_I(λ, ρ) stands for incomplete. Noting that log x is concave in x, the likelihood function L_C(λ, ρ) is concave in (λ, ρ) and it has a closed-form expression. However, this likelihood function is not immediately useful when estimating the parameters of the Markov chain choice model, since we do not have access to the data {(F̂^t, X̂^t) : t ∈ T} in practice. In the next section, we give an expectation-maximization algorithm that uses the likelihood function L_C(λ, ρ) to estimate the parameters of the Markov chain choice model, while making sure that we do not need to have access to the data {(F̂^t, X̂^t) : t ∈ T}.
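For completeness, a sketch of (3) under the temporary assumption that the complete data is observed; here F[t] is the one-hot arrival indicator F̂^t and X[t] the transition-count matrix X̂^t, both our encodings, and the 0 log 0 = 0 convention is handled explicitly.

```python
# Sketch: complete log-likelihood (3) from (hypothetical) complete data.
import numpy as np

def complete_log_likelihood(lam, rho, F, X):
    F_tot = np.asarray(F, dtype=float).sum(axis=0)   # shape (n,): arrival counts
    X_tot = np.asarray(X, dtype=float).sum(axis=0)   # shape (n, n): transition counts
    with np.errstate(divide="ignore", invalid="ignore"):
        term1 = np.where(F_tot > 0, F_tot * np.log(lam), 0.0)   # 0 log 0 = 0
        term2 = np.where(X_tot > 0, X_tot * np.log(rho), 0.0)
    return term1.sum() + term2.sum()
```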

4 Expectation-Maximization Algorithm

In this section, we describe our expectation-maximization algorithm. We show how to execute the expectation and maximization steps of this algorithm in detail. Lastly, we discuss the convergence properties of the iterates of our expectation-maximization algorithm.

4.1 Overview of the Algorithm

Our expectation-maximization algorithm estimates the parameters of the Markov chain choice model by using the likelihood function L_C(λ, ρ) in (3). Although this algorithm works with the likelihood function L_C(λ, ρ) in (3), it requires having access to the data {(Ŝ^t, Ẑ^t) : t ∈ T}, but not to the data {(F̂^t, X̂^t) : t ∈ T}. In our expectation-maximization algorithm, we start with some estimate of the parameters (λ^1, ρ^1) at the first iteration. At iteration l, we estimate F̂^t_i as the expectation of the random variable F_i conditional on the fact that customer t chooses according to the Markov chain choice model with parameters (λ^l, ρ^l) and her purchase decision is given by Ẑ^t = (Ẑ^t_1, ..., Ẑ^t_n). Similarly, we estimate X̂^t_{i,j} as the expectation of the random variable X_{i,j}(Ŝ^t) conditional on the fact that customer t chooses according to the Markov chain choice model with parameters (λ^l, ρ^l) and her purchase decision is given by Ẑ^t = (Ẑ^t_1, ..., Ẑ^t_n). Computing these conditional expectations to estimate F̂^t_i and X̂^t_{i,j} for all i, j ∈ N, t ∈ T is known as the expectation step. Next, we plug these estimates into (3) to construct the likelihood function L_C(λ, ρ) and maximize this likelihood function subject to the constraint that Σ_{i ∈ N} λ_i = 1, Σ_{j ∈ N} ρ_{i,j} = 1 for all i ∈ N, λ ∈ R^n_+ and ρ ∈ R^{n×n}_+. The optimal solution to this problem yields parameters (λ^{l+1}, ρ^{l+1}) that we use at iteration l + 1. Maximizing the likelihood function L_C(λ, ρ) in this fashion is known as the maximization step. Using the parameters (λ^{l+1}, ρ^{l+1}), we can go back to the expectation step to estimate F̂^t_i and X̂^t_{i,j} for all i, j ∈ N, t ∈ T. The expectation-maximization algorithm iteratively carries out the expectation and maximization steps to generate a sequence of parameters {(λ^l, ρ^l) : l = 1, 2, ...}. We state the expectation-maximization algorithm below and give a code sketch at the end of this subsection.

Step 1. Choose the initial estimates (λ^1, ρ^1) of the parameters of the Markov chain choice model arbitrarily, as long as they satisfy Σ_{i ∈ N} λ^1_i = 1, Σ_{j ∈ N} ρ^1_{i,j} = 1 for all i ∈ N, λ^1 ∈ R^n_+ and ρ^1 ∈ R^{n×n}_+. Initialize the iteration counter by setting l = 1.

Step 2. (Expectation) Assuming that the customers choose according to the Markov chain choice model with parameters (λ^l, ρ^l), set F̂^t_i = E{F_i | Z(Ŝ^t) = Ẑ^t} and X̂^t_{i,j} = E{X_{i,j}(Ŝ^t) | Z(Ŝ^t) = Ẑ^t} for all i, j ∈ N, t ∈ T.

Step 3. (Maximization) Let (λ^{l+1}, ρ^{l+1}) be the maximizer of L^l_C(λ, ρ) = Σ_{t ∈ T} Σ_{i ∈ N} F̂^t_i log λ_i + Σ_{t ∈ T} Σ_{i,j ∈ N} X̂^t_{i,j} log ρ_{i,j} subject to the constraint that Σ_{i ∈ N} λ_i = 1, Σ_{j ∈ N} ρ_{i,j} = 1 for all i ∈ N, λ ∈ R^n_+ and ρ ∈ R^{n×n}_+. Increase l by one and go to Step 2.

In the expectation step, we compute non-trivial conditional expectations. In Section 4.2, we show that we can compute these conditional expectations by solving linear systems of equations. In the maximization step, we maximize the function L^l_C(λ, ρ) subject to linear constraints. In Section 4.3, we show that this optimization problem has a closed-form solution. To estimate the parameters of the Markov chain choice model, we need to maximize the likelihood function L_I(λ, ρ). In Section 4.4, we consider our expectation-maximization algorithm under the assumption that the parameters that we are trying to estimate are bounded away from zero. We show that the value of the likelihood function {L_I(λ^l, ρ^l) : l = 1, 2, ...} at the successive parameter estimates {(λ^l, ρ^l) : l = 1, 2, ...} generated by our algorithm monotonically increases and converges to the value of the likelihood function L_I(λ, ρ) at a local maximizer. In that section, we precisely define what we mean by a local maximizer. We also show that we can still solve the optimization problem in the maximization step in polynomial time when we have a strictly positive lower bound on the parameters.
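Putting the three steps together gives the loop sketched below; e_step and m_step stand for the computations detailed in Sections 4.2 and 4.3 (sketched there as well), the initialization is the uniform one used in Section 5, and the stopping rule mirrors the 0.01% relative-improvement criterion mentioned there. All names are ours.

```python
# Sketch: the overall EM iteration of Steps 1-3.
import numpy as np

def em_markov_chain(data, n, max_iters=100, rel_tol=1e-4):
    lam = np.full(n, 1.0 / n)                      # Step 1: uniform initial estimates
    rho = np.full((n, n), 1.0 / n)
    prev = -np.inf
    for _ in range(max_iters):
        F_hat, X_hat = e_step(lam, rho, data)      # Step 2: conditional expectations
        lam, rho = m_step(F_hat, X_hat)            # Step 3: closed-form maximizer
        ll = incomplete_log_likelihood(lam, rho, data)
        if np.isfinite(prev) and ll - prev < rel_tol * abs(prev):
            break                                  # incomplete LL barely improved
        prev = ll
    return lam, rho
```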

4.2 Expectation Step

In the expectation step of the expectation-maximization algorithm, customer t is offered the subset Ŝ^t of products. She chooses among these products according to the Markov chain choice model with parameters (λ^l, ρ^l). We know that the purchase decision of this customer is given by the vector Ẑ^t, which is to say that if we know that customer t purchased product k, then Ẑ^t = e_k. We need to compute the conditional expectations E{F_i | Z(Ŝ^t) = Ẑ^t} and E{X_{i,j}(Ŝ^t) | Z(Ŝ^t) = Ẑ^t}. In this section, we show that we can compute these conditional expectations by solving linear systems of equations.

For notational brevity, we omit the superscript t indexing the customer and the superscript l indexing the iteration counter. In particular, we consider the case where a customer is offered the subset S of products. She chooses among these products according to the Markov chain choice model with parameters (λ, ρ). We know that the customer purchased some product k. In other words, noting that the purchase decision of a customer is captured by the random variable Z(S) = (Z_1(S), ..., Z_n(S)), we know that Z(S) = e_k. We want to compute the conditional expectations E{F_i | Z(S) = e_k} and E{X_{i,j}(S) | Z(S) = e_k}.

Computation of E{F_i | Z(S) = e_k}. The expectation E{F_i | Z(S) = e_k} is conditional on having Z(S) = e_k. Thus, we know that a customer purchased product k out of the subset S of products, which implies that we must have k ∈ S. Therefore, we assume that k ∈ S in our discussion. Using the Bayes rule, we have

    E{F_i | Z(S) = e_k} = P{F_i = 1 | Z(S) = e_k} = P{Z_k(S) = 1 | F_i = 1} P{F_i = 1} / P{Z_k(S) = 1}.    (4)

On the right side of (4), P{Z_k(S) = 1 | F_i = 1} is the probability that a customer purchases product k out of the subset S of products given that she is interested in purchasing product i when she arrives into the system. This probability is simple to compute when i ∈ S. In particular, if product i is offered and a customer is interested in purchasing product i when she arrives into the system, then this customer definitely purchases product i. Thus, letting 1(·) be the indicator function, we have P{Z_k(S) = 1 | F_i = 1} = 1(i = k) for all i ∈ S. We focus on computing P{Z_k(S) = 1 | F_i = 1} for all i ∈ N \ S. Letting Ψ_k(i, S) = P{Z_k(S) = 1 | F_i = 1} for all i ∈ N \ S for notational brevity, we can compute {Ψ_k(i, S) : i ∈ N \ S} by solving the linear system of equations

    Ψ_k(i, S) = ρ_{i,k} + Σ_{j ∈ N \ S} ρ_{i,j} Ψ_k(j, S)    for all i ∈ N \ S.    (5)

We interpret (5) as follows. On the left side, Ψ_k(i, S) is the probability that a customer purchases product k out of the subset S of products given that she is interested in purchasing product i upon arrival. For this customer to purchase product k, she may transition from product i to product k, yielding ρ_{i,k} on the right side. Alternatively, the customer may transition from product i to some product j ∈ N \ S, at which point, she is identical to a customer interested in purchasing product j upon arrival and this customer purchases product k with probability Ψ_k(j, S). This reasoning yields Σ_{j ∈ N \ S} ρ_{i,j} Ψ_k(j, S) on the right side. By the same discussion at the end of Section 2, if Σ_{j ∈ N \ S} ρ_{i,j} < 1 for all i ∈ N \ S, then there exists a unique solution to the system of equations in (5). Thus, if {Ψ_k(i, S) : i ∈ N \ S} solve (5), then we have Ψ_k(i, S) = P{Z_k(S) = 1 | F_i = 1} for all i ∈ N \ S. For notational uniformity, noting the discussion right before (5), we let Ψ_k(i, S) = 1(i = k) for all i ∈ S. In this case, we have Ψ_k(i, S) = P{Z_k(S) = 1 | F_i = 1} for all i ∈ N.
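A sketch of solving (5), in the same spirit as the purchase_probs sketch in Section 2: restrict ρ to the unavailable products and solve a small dense system. Setting Ψ_k(i, S) = 1(i = k) for i ∈ S follows the notational convention introduced above; the function name is ours.

```python
# Sketch: conditional purchase probabilities Psi_k(., S) via system (5).
import numpy as np

def psi(rho, S, k):
    n = rho.shape[0]
    out = sorted(set(range(n)) - set(S))      # N \ S
    A = np.eye(len(out)) - rho[np.ix_(out, out)]
    b = rho[out, k]                           # one-step transitions into product k
    psi_full = np.zeros(n)
    psi_full[out] = np.linalg.solve(A, b)     # Psi_k(i, S) for unavailable i
    psi_full[k] = 1.0                         # Psi_k(i, S) = 1(i = k) for i in S
    return psi_full
```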

The other probabilities on the right side of (4) are simple to compute. Noting that P{F_i = 1} is the probability that a customer arriving into the system is interested in purchasing product i, we have P{F_i = 1} = λ_i. Similarly, since P{Z_k(S) = 1} is the probability that a customer purchases product k out of the subset S of products, we have P{Z_k(S) = 1} = Θ_k(S), where (Θ_1(S), ..., Θ_n(S)) solve the system of equations in (1). Putting the discussion so far together, we compute {Ψ_k(i, S) : i ∈ N \ S} by solving the system of equations in (5). Also, letting Ψ_k(i, S) = 1(i = k) for all i ∈ S, by (4), for all i ∈ N, we have

    E{F_i | Z(S) = e_k} = P{Z_k(S) = 1 | F_i = 1} P{F_i = 1} / P{Z_k(S) = 1} = Ψ_k(i, S) λ_i / Θ_k(S).    (6)

When we offer the subset S of products, Ψ_k(i, S) is the purchase probability of product k conditional on the fact that the customer is interested in purchasing product i when she arrives into the system, which can be computed by solving (5), whereas Θ_k(S) is the unconditional purchase probability of product k, which can be computed by solving (1). The systems of equations in (1) and (5) are similar to each other. Blanchet et al. (2013) and Feldman and Topaloglu (2014) use (1) to compute unconditional purchase probabilities, but it is interesting that a slight variation of (1) allows computing conditional purchase probabilities.

Computation of E{X_{i,j}(S) | Z(S) = e_k}. Similar to our earlier argument, since the expectation E{X_{i,j}(S) | Z(S) = e_k} is conditional on having Z(S) = e_k, we know that a customer purchased product k out of the subset S of products, which implies that k ∈ S. Therefore, we assume that k ∈ S in our discussion. The random variable X_{i,j}(S) captures the number of times that a customer transitions from product i to product j during the course of her choice process when we offer the subset S of products. If we have i ∈ S, then a customer cannot transition from product i to another product, since the customer purchases product i whenever she visits it. So, X_{i,j}(S) = 0 with probability one for all i ∈ S, which implies that E{X_{i,j}(S) | Z(S) = e_k} = 0. We focus on computing the expectation E{X_{i,j}(S) | Z(S) = e_k} for all i ∈ N \ S.

We define the random variable Y^m_i(S) ∈ {0, 1} such that Y^m_i(S) = 1 if and only if the m-th product that a customer visits during the course of her choice process is product i when we offer the subset S of products. Since X_{i,j}(S) is the number of times a customer transitions from product i to product j when we offer the subset S of products, we have X_{i,j}(S) = Σ_{m=1}^∞ 1(Y^m_i(S) = 1, Y^{m+1}_j(S) = 1). So, we have

    E{X_{i,j}(S) | Z(S) = e_k}
        = Σ_{m=1}^∞ P{Y^m_i(S) = 1, Y^{m+1}_j(S) = 1 | Z(S) = e_k}
        = Σ_{m=1}^∞ P{Y^{m+1}_j(S) = 1 | Y^m_i(S) = 1, Z(S) = e_k} P{Y^m_i(S) = 1 | Z(S) = e_k},    (7)

where the second equality is by the Bayes rule. We focus on each one of the probabilities P{Y^{m+1}_j(S) = 1 | Y^m_i(S) = 1, Z(S) = e_k} and P{Y^m_i(S) = 1 | Z(S) = e_k} on the right side of (7) separately.

Considering the probability P{Y^m_i(S) = 1 | Z(S) = e_k}, from the perspective of the final purchase decision, a customer that visits product i as the m-th product is indistinguishable from a customer that visits product i as the first product. Thus, we have P{Z(S) = e_k | Y^m_i(S) = 1} = P{Z(S) = e_k | F_i = 1}. In this case, by using the Bayes rule once more, we have

    P{Y^m_i(S) = 1 | Z(S) = e_k}
        = P{Z_k(S) = 1 | Y^m_i(S) = 1} P{Y^m_i(S) = 1} / P{Z_k(S) = 1}
        = P{Z_k(S) = 1 | F_i = 1} P{Y^m_i(S) = 1} / P{Z_k(S) = 1}
        = Ψ_k(i, S) P{Y^m_i(S) = 1} / Θ_k(S),    (8)

where we compute {Ψ_k(i, S) : i ∈ N \ S} by solving (5). On the other hand, considering the probability P{Y^{m+1}_j(S) = 1 | Y^m_i(S) = 1, Z(S) = e_k}, by the Bayes rule, we also have

    P{Y^{m+1}_j(S) = 1 | Z(S) = e_k, Y^m_i(S) = 1}
        = P{Z_k(S) = 1 | Y^{m+1}_j(S) = 1, Y^m_i(S) = 1} P{Y^{m+1}_j(S) = 1 | Y^m_i(S) = 1} / P{Z_k(S) = 1 | Y^m_i(S) = 1}
        = P{Z_k(S) = 1 | Y^{m+1}_j(S) = 1} P{Y^{m+1}_j(S) = 1 | Y^m_i(S) = 1} / P{Z_k(S) = 1 | Y^m_i(S) = 1}
        = P{Z_k(S) = 1 | F_j = 1} ρ_{i,j} / P{Z_k(S) = 1 | F_i = 1}
        = Ψ_k(j, S) ρ_{i,j} / Ψ_k(i, S).    (9)

In the chain of equalities above, the second equality uses the fact that if we know the (m + 1)-st product that a customer visits, then the distribution of the product that she purchases does not depend on the m-th product that this customer visits. The third equality uses the fact that a customer that visits product j as the (m + 1)-st product is indistinguishable from a customer that visits product j as the first product from the perspective of the final purchase decision. The fourth equality is by the fact that given that a customer visits product i as the m-th product, the probability that she visits product j next is given by the transition probability ρ_{i,j}.

To compute the conditional expectation E{X_{i,j}(S) | Z(S) = e_k}, we use (8) and (9) in (7) to get

    E{X_{i,j}(S) | Z(S) = e_k}
        = Σ_{m=1}^∞ P{Y^{m+1}_j(S) = 1 | Y^m_i(S) = 1, Z(S) = e_k} P{Y^m_i(S) = 1 | Z(S) = e_k}
        = Σ_{m=1}^∞ [Ψ_k(j, S) ρ_{i,j} / Ψ_k(i, S)] [Ψ_k(i, S) P{Y^m_i(S) = 1} / Θ_k(S)]
        = [Ψ_k(j, S) ρ_{i,j} / Θ_k(S)] Σ_{m=1}^∞ P{Y^m_i(S) = 1}
        = Ψ_k(j, S) ρ_{i,j} Θ_i(S) / Θ_k(S).    (10)

In the last equality above, we use the fact that Σ_{m=1}^∞ P{Y^m_i(S) = 1} corresponds to the expected number of times that a customer visits product i given that we offer the subset S of products, in which case, by the discussion in Section 2, this quantity is given by Θ_i(S).

The discussion in this section shows how we can compute the conditional expectations E{F_i | Z(S) = e_k} and E{X_{i,j}(S) | Z(S) = e_k}. The main bulk of the work involves solving the systems of equations in (1) and (5) to obtain (Θ_1(S), ..., Θ_n(S)) and {Ψ_k(i, S) : i ∈ N \ S}. Using (6) and (10), we can give explicit expressions to execute the expectation step of our expectation-maximization algorithm. We replace (λ, ρ) with (λ^l, ρ^l) and S with Ŝ^t in (1) and solve this system of equations. We use (Θ^l_1(Ŝ^t), ..., Θ^l_n(Ŝ^t)) to denote the solution. Similarly, we replace (λ, ρ) with (λ^l, ρ^l) and S with Ŝ^t in (5) and solve this system of equations. We use {Ψ^l_k(i, Ŝ^t) : i ∈ N \ Ŝ^t} to denote the solution. Also, we let Ψ^l_k(i, Ŝ^t) = 1(i = k) for all i ∈ Ŝ^t, where k is the product purchased by customer t so that Ẑ^t = e_k. In this case, by (6), for all i ∈ N and t ∈ T, we have F̂^t_i = E{F_i | Z(Ŝ^t) = e_k} = Ψ^l_k(i, Ŝ^t) λ^l_i / Θ^l_k(Ŝ^t). Also, by (10), for all i ∈ N \ Ŝ^t, j ∈ N and t ∈ T, we have X̂^t_{i,j} = E{X_{i,j}(Ŝ^t) | Z(Ŝ^t) = e_k} = Ψ^l_k(j, Ŝ^t) ρ^l_{i,j} Θ^l_i(Ŝ^t) / Θ^l_k(Ŝ^t). Finally, we set X̂^t_{i,j} = 0 for all i ∈ Ŝ^t, j ∈ N and t ∈ T.
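The explicit expressions above translate into a short expectation step. The sketch below reuses purchase_probs and psi from the earlier sketches, encodes customer t's data as (Ŝ^t, k) with k the purchased product, and accumulates the sums over t that the maximization step needs.

```python
# Sketch: expectation step via (6) and (10), summed over customers.
import numpy as np

def e_step(lam, rho, data):
    n = len(lam)
    F_hat = np.zeros(n)                        # sum over t of F_hat^t
    X_hat = np.zeros((n, n))                   # sum over t of X_hat^t
    for S, k in data:
        theta = purchase_probs(lam, rho, S)    # system (1)
        psi_k = psi(rho, S, k)                 # system (5), extended to all of N
        F_hat += psi_k * lam / theta[k]                       # equation (6)
        for i in set(range(n)) - set(S):                      # equation (10)
            X_hat[i] += psi_k * rho[i] * theta[i] / theta[k]
    return F_hat, X_hat
```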

4.3 Maximization Step

In the maximization step of the expectation-maximization algorithm, we need to maximize the function L^l_C(λ, ρ) = Σ_{t ∈ T} Σ_{i ∈ N} F̂^t_i log λ_i + Σ_{t ∈ T} Σ_{i,j ∈ N} X̂^t_{i,j} log ρ_{i,j} subject to the constraint that Σ_{i ∈ N} λ_i = 1, Σ_{j ∈ N} ρ_{i,j} = 1 for all i ∈ N, λ ∈ R^n_+ and ρ ∈ R^{n×n}_+. This optimization problem decomposes into 1 + n problems given by

    max { Σ_{t ∈ T} Σ_{i ∈ N} F̂^t_i log λ_i : Σ_{i ∈ N} λ_i = 1, λ_i ≥ 0 for all i ∈ N }    (11)

and

    max { Σ_{t ∈ T} Σ_{j ∈ N} X̂^t_{i,j} log ρ_{i,j} : Σ_{j ∈ N} ρ_{i,j} = 1, ρ_{i,j} ≥ 0 for all j ∈ N }    for all i ∈ N.    (12)

Problem (11) corresponds to the problem of computing the maximum likelihood estimators of the parameters (λ_1, ..., λ_n) of the multinomial distribution, where λ_i is the probability of observing outcome i in each trial, we have a total of Σ_{t ∈ T} Σ_{i ∈ N} F̂^t_i trials and we observe outcome i in Σ_{t ∈ T} F̂^t_i trials. In this case, the maximum likelihood estimator of λ_i is known to be Σ_{t ∈ T} F̂^t_i / Σ_{t ∈ T} Σ_{j ∈ N} F̂^t_j; see Section 2.2 in Bishop (2006). Therefore, the optimal solution to problem (11) is obtained by setting λ_i = Σ_{t ∈ T} F̂^t_i / Σ_{t ∈ T} Σ_{j ∈ N} F̂^t_j for all i ∈ N. In the next section, we focus on our expectation-maximization algorithm under the assumption that the parameters of the Markov chain choice model are known to be bounded away from zero by some ε > 0. In this case, the maximization step requires solving the first problem above with a lower bound of ε on the decision variables (λ_1, ..., λ_n). In Online Appendix A, we discuss how to solve this optimization problem. Repeating this discussion with ε = 0 also shows that we can obtain the optimal solution to problem (11) by setting λ_i = Σ_{t ∈ T} F̂^t_i / Σ_{t ∈ T} Σ_{j ∈ N} F̂^t_j for all i ∈ N.

Each one of the n problems in (12) has the same structure as problem (11). Following the same argument used to find the optimal value of λ_i, the optimal solution to each one of the n problems in (12) is obtained by setting ρ_{i,j} = Σ_{t ∈ T} X̂^t_{i,j} / Σ_{t ∈ T} Σ_{k ∈ N} X̂^t_{i,k} for all j ∈ N. Thus, to execute the maximization step of our expectation-maximization algorithm, we simply set λ^{l+1}_i = Σ_{t ∈ T} F̂^t_i / Σ_{t ∈ T} Σ_{j ∈ N} F̂^t_j and ρ^{l+1}_{i,j} = Σ_{t ∈ T} X̂^t_{i,j} / Σ_{t ∈ T} Σ_{k ∈ N} X̂^t_{i,k} for all i, j ∈ N.
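In code, the maximization step is just a normalization of the expected counts accumulated in the expectation step. The sketch below implements the closed-form solutions of (11) and (12) and omits the ε lower bound of Section 4.4; the zero-row guard is our own safeguard for products that are always offered and hence never generate transitions.

```python
# Sketch: closed-form maximization step from the solutions of (11) and (12).
import numpy as np

def m_step(F_hat, X_hat):
    lam = F_hat / F_hat.sum()                          # maximizer of (11)
    row_sums = X_hat.sum(axis=1, keepdims=True)
    rho = np.divide(X_hat, row_sums,                   # maximizer of (12), row by row
                    out=np.zeros_like(X_hat), where=row_sums > 0)
    return lam, rho
```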

4.4 Convergence of the Algorithm

We can give a convergence result for our expectation-maximization algorithm under the assumption that the parameters of the Markov chain choice model that we are trying to estimate are known to be bounded away from zero by some ε > 0. Under this assumption, we execute the maximization step of the expectation-maximization algorithm slightly differently. In particular, we let (λ^{l+1}, ρ^{l+1}) be the maximizer of L^l_C(λ, ρ) = Σ_{t ∈ T} Σ_{i ∈ N} F̂^t_i log λ_i + Σ_{t ∈ T} Σ_{i,j ∈ N} X̂^t_{i,j} log ρ_{i,j} subject to the constraint that Σ_{i ∈ N} λ_i = 1, Σ_{j ∈ N} ρ_{i,j} = 1 for all i ∈ N, λ_i ≥ ε for all i ∈ N and ρ_{i,j} ≥ ε for all i, j ∈ N. In other words, we impose a lower bound of ε on the decision variables. In Online Appendix A, we show that we can still solve the last optimization problem in polynomial time. The assumption that the parameters of the Markov chain choice model are known to be bounded away from zero by some ε > 0 allows us to satisfy certain regularity conditions when we study the convergence of our expectation-maximization algorithm. This assumption is arguably mild, since we can put a small lower bound of ε > 0 on the parameters with negligible effect on the choice probabilities.

To give a convergence result for our expectation-maximization algorithm, we let

    Ω = {(λ, ρ) ∈ R^n_+ × R^{n×n}_+ : Σ_{i ∈ N} λ_i = 1, Σ_{j ∈ N} ρ_{i,j} = 1 for all i ∈ N, λ_i ≥ ε for all i ∈ N, ρ_{i,j} ≥ ε for all i, j ∈ N},

capturing the set of possible parameter values when we have a lower bound of ε on the parameters. Also, we define the set of parameters

    Φ = { (λ⁰, ρ⁰) ∈ Ω : d/dγ L_I((1 − γ)(λ⁰, ρ⁰) + γ(λ, ρ)) |_{γ=0} ≤ 0 for all (λ, ρ) ∈ Ω }.

Roughly speaking, having (λ⁰, ρ⁰) ∈ Φ implies that if we start from the point (λ⁰, ρ⁰) and move towards any point (λ, ρ) ∈ Ω for an infinitesimal step size, then the value of the likelihood function L_I(λ, ρ) does not improve. In the next theorem, we give a convergence result for our expectation-maximization algorithm when we know that the parameters that we are trying to estimate are bounded away from zero by some ε > 0. The proof is in Online Appendix B.

Theorem 1 Assume that the sequence {(λ^l, ρ^l) : l = 1, 2, ...} is generated by our expectation-maximization algorithm when we impose a lower bound of ε > 0 on the parameters of the Markov chain choice model. Then, we have L_I(λ^{l+1}, ρ^{l+1}) ≥ L_I(λ^l, ρ^l) for all l = 1, 2, .... Furthermore, all limit points of the sequence {(λ^l, ρ^l) : l = 1, 2, ...} are in Φ and the sequence {L_I(λ^l, ρ^l) : l = 1, 2, ...} converges to L_I(λ̂, ρ̂) for some (λ̂, ρ̂) ∈ Φ.

Note that L_I(λ, ρ) is the function that we need to maximize to estimate the parameters. By the theorem above, the sequence of parameters generated by our algorithm monotonically improves L_I(λ, ρ) and we have convergence to some form of local maximum of L_I(λ, ρ). Since L_I(λ, ρ) is not necessarily concave, we are not guaranteed to reach the global maximum. Nettleton (1999) gives regularity conditions to ensure convergence of the expectation-maximization algorithm. The proof of Theorem 1 follows by verifying these regularity conditions. In Online Appendix C, we give an example to show that the regularity conditions in Nettleton (1999) may not hold without a lower bound of ε > 0 on the parameters. Wu (1983) gives other regularity conditions but he assumes that the parameters generated by the algorithm are in the interior of the set of possible parameter values, which is difficult to satisfy for the Markov chain choice model. Also, Theorem 1 does not rule out the possibility of multiple limit points for the sequence {(λ^l, ρ^l) : l = 1, 2, ...}, but all limit points are in Φ. Lastly, we are not able to give a convergence result without a lower bound of ε > 0 on the parameters. In Online Appendix D, however, we show that as long as the initial parameter estimates are strictly positive, even if we do not impose a lower bound on the parameters, our algorithm always generates a sequence of parameters {(λ^l, ρ^l) : l = 1, 2, ...} such that there exist unique solutions to the systems of equations in (1) and (5) when we solve these systems of equations after replacing (λ, ρ) with (λ^l, ρ^l) and S with Ŝ^t for any subset in the data {(Ŝ^t, Ẑ^t) : t ∈ T}. So, we do not encounter parameters that render the systems of equations in (1) or (5) unsolvable.
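The monotonicity claim in Theorem 1 is easy to observe numerically. The sketch below, built on the helpers from the earlier sketches (simulate_purchase, e_step, m_step and incomplete_log_likelihood), fits a small synthetic data set and checks that the incomplete log-likelihood never decreases; the instance is ours and purely illustrative.

```python
# Sketch: numerically checking that EM iterates never decrease L_I.
import numpy as np

rng = np.random.default_rng(1)
n = 4
lam0 = rng.dirichlet(np.ones(n))                 # ground-truth parameters
rho0 = rng.dirichlet(np.ones(n), size=n)
data = []
for _ in range(500):                             # product 0 plays the always-available role
    S = {0} | {i for i in range(1, n) if rng.random() < 0.5}
    data.append((S, simulate_purchase(lam0, rho0, S, rng)))

lam, rho = np.full(n, 1 / n), np.full((n, n), 1 / n)
lls = []
for _ in range(20):
    F_hat, X_hat = e_step(lam, rho, data)
    lam, rho = m_step(F_hat, X_hat)
    lls.append(incomplete_log_likelihood(lam, rho, data))
assert all(b >= a - 1e-6 for a, b in zip(lls, lls[1:]))  # monotone increase
```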

5 Computational Experiments

We test the performance of our expectation-maximization algorithm on randomly generated data, as well as on a data set coming from a hotel revenue management application.

5.1 Benchmark Strategies

In our first benchmark, referred to as EM, we estimate the parameters of the Markov chain choice model by using our expectation-maximization algorithm. In our second benchmark, referred to as DM, we continue using the Markov chain choice model to capture the customer choices, but we estimate the parameters by directly maximizing the likelihood function L_I(λ, ρ) in (2) through continuous optimization software. In our third benchmark, referred to as ML, we use the multinomial logit model to capture the customer choices and estimate its parameters by using maximum likelihood.

We briefly describe the multinomial logit model. In the multinomial logit model, the mean utility of product i is η_i. If we offer the subset S of products, then a customer purchases product i ∈ S with probability e^{η_i} / Σ_{j ∈ S} e^{η_j}. As mentioned in Section 2, we represent the no-purchase option as a product always available for purchase. We denote this product by φ ∈ N. In other words, a customer purchasing product φ corresponds to a customer leaving the system without a purchase. If we add the same constant to the mean utilities of all products, then the choice probability e^{η_i} / Σ_{j ∈ S} e^{η_j} of each product does not change. So, we normalize the mean utility of the no-purchase option to zero. In this case, the parameters of the multinomial logit model are η = {η_i : i ∈ N \ {φ}}. Assume that we offer the subset Ŝ^t of products to customer t and the purchase decision of this customer is given by Ẑ^t = (Ẑ^t_1, ..., Ẑ^t_n), where Ẑ^t_i = 1 if and only if the customer purchases product i. The likelihood of the purchase decision of customer t is ∏_{i ∈ N} (e^{η_i} / Σ_{j ∈ Ŝ^t} e^{η_j})^{Ẑ^t_i}. Noting that Σ_{i ∈ N} Ẑ^t_i = 1, the log-likelihood of this purchase decision is Σ_{i ∈ N} Ẑ^t_i η_i − Σ_{i ∈ N} Ẑ^t_i log(Σ_{j ∈ Ŝ^t} e^{η_j}) = Σ_{i ∈ N} Ẑ^t_i η_i − log(Σ_{i ∈ Ŝ^t} e^{η_i}); see Section II in McFadden (1974). Thus, the log-likelihood of the data {(Ŝ^t, Ẑ^t) : t ∈ T} is given by

    L(η) = Σ_{t ∈ T} Σ_{i ∈ N} Ẑ^t_i η_i − Σ_{t ∈ T} log( Σ_{j ∈ Ŝ^t} e^{η_j} ).    (13)

In ML, we estimate the parameters of the multinomial logit model by maximizing the log-likelihood function in (13) through the Matlab routine fmincon. Boyd and Vandenberghe (2005) show that log(Σ_{i=1}^n e^{x_i}) is convex in (x_1, ..., x_n) ∈ R^n. Thus, the log-likelihood function L(η) in (13) is concave in η. Letting w_i = e^{η_i}, we can also express the choice probability of product i out of the subset S as w_i / Σ_{j ∈ S} w_j, but the log-likelihood function L(w) = Σ_{t ∈ T} Σ_{i ∈ N} Ẑ^t_i log w_i − Σ_{t ∈ T} log(Σ_{i ∈ Ŝ^t} w_i) is not concave in w = {w_i : i ∈ N \ {φ}}.

In DM, we directly maximize the log-likelihood function L_I(λ, ρ) in (2) also by using the Matlab routine fmincon. Since the function L_I(λ, ρ) is not necessarily concave in (λ, ρ), the parameters estimated by DM may depend on the initial solution, but exploratory trials indicated that the performance of DM is rather insensitive to the initial solution. We use the initial solution λ_i = 1/n for all i ∈ N, ρ_{i,j} = 1/n for all i ∈ N \ {φ}, j ∈ N, and ρ_{φ,φ} = 1. In EM, we use this initial solution as well. In our expectation-maximization algorithm, we do not impose a strictly positive lower bound on the parameters and stop when the incomplete log-likelihood increases by less than 0.01% in two successive iterations. We give the pseudo-code for our algorithm in Online Appendix E. We also used the so-called independent demand model as a benchmark. This model performed consistently worse than EM, DM and ML and we will comment on its performance only briefly.
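A sketch of (13) and its maximization follows; since our sketches are in Python, we substitute scipy.optimize.minimize for the paper's Matlab fmincon, which is our choice and not the paper's. Product index 0 stands in for the no-purchase option φ, with η_φ normalized to zero.

```python
# Sketch: multinomial logit log-likelihood (13), maximized over eta.
import numpy as np
from scipy.optimize import minimize

def neg_mnl_log_likelihood(eta_free, data, n):
    eta = np.concatenate(([0.0], eta_free))   # eta for product 0 (phi) fixed at 0
    ll = 0.0
    for S, k in data:                         # S: offered set, k: purchased product
        ll += eta[k] - np.log(np.exp(eta[sorted(S)]).sum())
    return -ll                                # minimize the negative log-likelihood

# Usage with the (S, k) data encoding of the earlier sketches:
# res = minimize(neg_mnl_log_likelihood, x0=np.zeros(n - 1), args=(data, n))
# eta_hat = np.concatenate(([0.0], res.x))
```

Because L(η) is concave, the default quasi-Newton method in minimize converges to the global maximizer of (13).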

5.2 Known Ground Choice Model

We provide computational experiments on randomly generated data where we have access to the exact ground choice model that governs the customer choice process.

Experimental Setup. We assume that the ground choice model that governs the customer choice process is the ranking-based choice model. In this choice model, each arriving customer has a ranked list of products in mind and she purchases the most preferred available product in her ranked list. The ranking-based choice model is used in Mahajan and van Ryzin (2001), van Ryzin and Vulcano (2008), Smith et al. (2009), Honhon et al. (2010), Honhon et al. (2012), Farias et al. (2013), van Ryzin and Vulcano (2015), Jagabathula and Vulcano (2016), Jagabathula and Rusmevichientong (2016) and van Ryzin and Vulcano (2016). We have m possible ranked lists indexed by M = {1, ..., m}. We denote the possible ranked lists that an arriving customer can have in mind by using {(σ^g_1, ..., σ^g_n) : g ∈ M}, where σ^g_i is the preference order of product i in the ranked list σ^g = (σ^g_1, ..., σ^g_n). For example, if n = 3 and σ^g_1 = 2, σ^g_2 = 3, σ^g_3 = 1, then product 3 is the most preferred product, product 1 is the second most preferred product and product 2 is the third most preferred product. The probability that an arriving customer has the ranked list σ^g in mind is β^g. If we offer the subset S of products, then an arriving customer purchases product i with probability Σ_{g ∈ M} β^g 1(i = arg min_{j ∈ S} σ^g_j), which is the probability that an arriving customer has product i as her most preferred available product.

In our computational experiments, to come up with the possible ranked lists that a customer can have in mind, we generate m random permutations of the products. To come up with the probability β^g that an arriving customer has the ranked list σ^g in mind, following van Ryzin and Vulcano (2016), we generate γ^g from the uniform distribution over [0, 1] and set β^g = γ^g / Σ_{h ∈ M} γ^h. We note that (β^1, ..., β^m) generated in this fashion is not uniformly distributed over the (m − 1)-dimensional simplex. Once we generate the ground choice model that governs the customer choice process, we generate the past purchase histories {(Ŝ^t, Ẑ^t) : t ∈ T} of the customers from this ground choice model. To come up with the subset Ŝ^t of products offered to customer t, we assume that the no-purchase option is always available in the offered subset of products. Each of the other products is included in the subset Ŝ^t with probability 1/2. Once we come up with the subset of products offered to customer t, we generate the choice of a random customer out of this subset according to the ground choice model and set Ẑ^t_i = 1 if and only if customer t purchases product i. Using this approach, we generate the purchase history of 50,000 customers to use as the training data and a separate purchase history of 10,000 customers to use as the hold-out data. We use different portions of the generated training data with 2,500, 5,000, 10,000 and 50,000 customers to fit our choice models. In our test problems, the number of products is n = 11 or n = 21. One of these products corresponds to the no-purchase option. The number of possible ranked lists is m = 10 + n, m = 20 + n or m = 40 + n. For each product i ∈ N, there is one ranked list where the most preferred product in the ranked list is product i. In this way, we fix the most preferred product in n of the ranked lists. The preference orders of the other products in these n ranked lists are randomly generated. The remaining ranked lists other than these n ranked lists are fully random permutations of the products.
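A sketch of this setup: ranked lists stored as permutations listing products from most to least preferred (our encoding, equivalent to the preference orders σ^g), probabilities β^g drawn as normalized uniforms, and purchases sampled as the most preferred available product. The first n lists are adjusted so that list i has product i on top, as described above; all names are ours.

```python
# Sketch: generating the ranking-based ground choice model and its data.
import numpy as np

def make_ground_model(n, m, rng):
    lists = [rng.permutation(n) for _ in range(m)]   # most-to-least preferred
    for i in range(n):                               # force product i on top of list i
        g = lists[i]
        pos = int(np.where(g == i)[0][0])
        g[0], g[pos] = g[pos], g[0]
    gamma = rng.random(m)                            # uniform [0, 1] draws
    beta = gamma / gamma.sum()                       # list probabilities beta^g
    return lists, beta

def sample_purchase(lists, beta, S, rng):
    g = lists[rng.choice(len(beta), p=beta)]         # customer's ranked list
    return next(i for i in g if i in S)              # most preferred available product

rng = np.random.default_rng(2)
lists, beta = make_ground_model(n=11, m=21, rng=rng)
S = {0} | {i for i in range(1, 11) if rng.random() < 0.5}   # 0: no-purchase option
print(sample_purchase(lists, beta, S, rng))
```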

    Out-of-sample log-likelihood
    (n, m)      Trn. data    EM         DM         ML        EM-DM    EM-ML
    (11, 21)    2,500        -16,391    -16,407    -16,…     …        1.72%
    (11, 21)    5,000        -16,387    -16,390    -16,…     …        1.67%
    (11, 21)    10,000       -16,366    -16,349    -16,…     …        1.78%
    (11, 21)    50,000       -16,361    -16,327    -16,…     …        1.81%
    (11, 31)    2,500        -16,199    -16,222    -16,…     …        0.61%
    (11, 31)    5,000        -16,154    -16,161    -16,…     …        0.81%
    (11, 31)    10,000       -16,141    -16,138    -16,…     …        0.87%
    (11, 31)    50,000       -16,132    -16,120    -16,…     …        0.93%
    (11, 51)    2,500        -17,029    -17,059    -17,…     …        0.51%
    (11, 51)    5,000        -17,017    -17,032    -17,…     …        0.55%
    (11, 51)    10,000       -16,992    -16,993    -17,…     …        0.67%
    (11, 51)    50,000       -16,986    -16,966    -17,…     …        0.69%
    (21, 31)    2,500        -22,839    -23,053    -23,…     …        1.30%
    (21, 31)    5,000        -22,737    -22,740    -23,…     …        1.64%
    (21, 31)    10,000       -22,673    -22,648    -23,…     …        1.85%
    (21, 31)    50,000       -22,627    -22,583    -23,…     …        2.00%
    (21, 41)    2,500        -22,581    -22,783    -22,…     …        0.99%
    (21, 41)    5,000        -22,483    -22,580    -22,…     …        1.31%
    (21, 41)    10,000       -22,422    -22,408    -22,…     …        1.54%
    (21, 41)    50,000       -22,378    -22,331    -22,…     …        1.69%
    (21, 61)    2,500        -23,406    -23,562    -23,…     …        0.14%
    (21, 61)    5,000        -23,301    -23,321    -23,…     …        0.53%
    (21, 61)    10,000       -23,267    -23,271    -23,…     …        0.59%
    (21, 61)    50,000       -23,226    -23,183    -23,…     …        0.74%

    Table 1: Out-of-sample log-likelihoods obtained by EM, DM and ML.

Results. In Table 1, we show the out-of-sample log-likelihoods obtained by EM, DM and ML. The first column in this table shows the parameter combination (n, m) in the ground choice model. The second column shows the number of customers in the training data that we use to fit our choice models. The third column shows the out-of-sample log-likelihood obtained by EM, which is the value of the log-likelihood function in (2) after replacing (λ, ρ) by the parameters estimated by EM and using the hold-out data as the data {(Ŝ^t, Ẑ^t) : t ∈ T} in this log-likelihood. The fourth column shows the out-of-sample log-likelihood obtained by DM, which is the value of the same log-likelihood function as in the third column, but the parameters (λ, ρ) correspond to those estimated by DM. The fifth column shows the out-of-sample log-likelihood obtained by ML, which is the value of the log-likelihood function in (13) after replacing η by the parameters estimated by ML and using the hold-out data as the data {(Ŝ^t, Ẑ^t) : t ∈ T} in this log-likelihood. The last two columns compare the log-likelihood obtained by EM with those obtained by DM and ML by giving the percent gaps between the corresponding pairs of log-likelihoods.

The results indicate that EM provides better out-of-sample log-likelihoods than ML in our test problems. The out-of-sample log-likelihoods obtained by EM and DM are quite close. When we have 2,500 customers in the training data and 11 products, the average computation times for EM, DM and ML are respectively 12.93, … and 0.54 seconds on a 2.2 GHz Intel Core i7 CPU with 16 GB of RAM. With 50,000 customers and 21 products, the average computation times for EM, DM and ML are respectively 3,456.19, 218,… and … seconds. EM terminates in 31 to 52 iterations. The computation times for EM and ML are reasonable since we do not solve the estimation problem in real time, but DM is computationally demanding. The computation time for EM is mostly spent on solving the systems of equations in (1) and (5) for each subset in the training data. Thus, EM is drastically faster when the customers in the training data are offered a few different subsets, which is likely to happen in practice. For example, when we have 50,000 customers in the training data, if these customers are offered one of 10 different subsets, then EM takes about 20 seconds. In Online Appendix F, we give the detailed computation times.

EM improves the log-likelihoods obtained by ML, but ML may provide advantages in certain cases. EM estimates O(n²) parameters given by {λ_i : i ∈ N} and {ρ_{i,j} : i, j ∈ N}, whereas ML estimates O(n) parameters given by {η_i : i ∈ N \ {φ}}. The Markov chain choice model is more flexible due to its larger number of parameters. However, since the Markov chain choice model has a large number of parameters, EM may over-fit this choice model to the training data, especially when we have too few customers in the training data and too many products so that we need to estimate too many parameters from too little data. In this case, the out-of-sample performance of EM may be inferior. For example, if we have 1,000 customers in the training data and 21 products, so that EM estimates about 400 parameters from 1,000 data points, then the average percent gap between the out-of-sample log-likelihoods obtained by EM and ML is 0.45%, favoring ML, where the average is computed over the test problems with m ∈ {10 + n, 20 + n, 40 + n}. Clearly, it is difficult to estimate 400 parameters from 1,000 data points! If we have 1,000 customers in the training data and 11 products, so that EM estimates about 100 parameters instead of 400, then the same average percent gap is 0.57%, favoring EM back again. Thus, we should be cautious about using the Markov chain choice model when we have too little data and too many products.

To form a baseline, we also check the out-of-sample log-likelihoods when we fit a ranking-based choice model, which is the ground choice model that actually drives the choice process of the customers in the training and hold-out data. The papers by van Ryzin and Vulcano (2015) and van Ryzin and Vulcano (2016) give algorithms for estimating the parameters of the ranking-based choice model. We fit two versions of the ground choice model. In the first version, we estimate both the ranked lists {σ^g : g ∈ M} in the ground choice model and the corresponding probabilities (β^1, ..., β^m). In the second version, we assume that we know the ranked lists {σ^g : g ∈ M} and we estimate only the probabilities (β^1, ..., β^m). We refer to the first and second versions of the


Goodness of fit and Wilks theorem

Goodness of fit and Wilks theorem DRAFT 0.0 Glen Cowan 3 June, 2013 Goodness of ft and Wlks theorem Suppose we model data y wth a lkelhood L(µ) that depends on a set of N parameters µ = (µ 1,...,µ N ). Defne the statstc t µ ln L(µ) L(ˆµ),

More information

U.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017

U.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017 U.C. Berkeley CS94: Beyond Worst-Case Analyss Handout 4s Luca Trevsan September 5, 07 Summary of Lecture 4 In whch we ntroduce semdefnte programmng and apply t to Max Cut. Semdefnte Programmng Recall that

More information

Singular Value Decomposition: Theory and Applications

Singular Value Decomposition: Theory and Applications Sngular Value Decomposton: Theory and Applcatons Danel Khashab Sprng 2015 Last Update: March 2, 2015 1 Introducton A = UDV where columns of U and V are orthonormal and matrx D s dagonal wth postve real

More information

princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 7: LP Duality Lecturer: Matt Weinberg

princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 7: LP Duality Lecturer: Matt Weinberg prnceton unv. F 17 cos 521: Advanced Algorthm Desgn Lecture 7: LP Dualty Lecturer: Matt Wenberg Scrbe: LP Dualty s an extremely useful tool for analyzng structural propertes of lnear programs. Whle there

More information

Supplement: Proofs and Technical Details for The Solution Path of the Generalized Lasso

Supplement: Proofs and Technical Details for The Solution Path of the Generalized Lasso Supplement: Proofs and Techncal Detals for The Soluton Path of the Generalzed Lasso Ryan J. Tbshran Jonathan Taylor In ths document we gve supplementary detals to the paper The Soluton Path of the Generalzed

More information

Winter 2008 CS567 Stochastic Linear/Integer Programming Guest Lecturer: Xu, Huan

Winter 2008 CS567 Stochastic Linear/Integer Programming Guest Lecturer: Xu, Huan Wnter 2008 CS567 Stochastc Lnear/Integer Programmng Guest Lecturer: Xu, Huan Class 2: More Modelng Examples 1 Capacty Expanson Capacty expanson models optmal choces of the tmng and levels of nvestments

More information

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also

More information

Module 9. Lecture 6. Duality in Assignment Problems

Module 9. Lecture 6. Duality in Assignment Problems Module 9 1 Lecture 6 Dualty n Assgnment Problems In ths lecture we attempt to answer few other mportant questons posed n earler lecture for (AP) and see how some of them can be explaned through the concept

More information

APPENDIX A Some Linear Algebra

APPENDIX A Some Linear Algebra APPENDIX A Some Lnear Algebra The collecton of m, n matrces A.1 Matrces a 1,1,..., a 1,n A = a m,1,..., a m,n wth real elements a,j s denoted by R m,n. If n = 1 then A s called a column vector. Smlarly,

More information

More metrics on cartesian products

More metrics on cartesian products More metrcs on cartesan products If (X, d ) are metrc spaces for 1 n, then n Secton II4 of the lecture notes we defned three metrcs on X whose underlyng topologes are the product topology The purpose of

More information

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4) I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes

More information

Convergence of random processes

Convergence of random processes DS-GA 12 Lecture notes 6 Fall 216 Convergence of random processes 1 Introducton In these notes we study convergence of dscrete random processes. Ths allows to characterze phenomena such as the law of large

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 12 10/21/2013. Martingale Concentration Inequalities and Applications

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 12 10/21/2013. Martingale Concentration Inequalities and Applications MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.65/15.070J Fall 013 Lecture 1 10/1/013 Martngale Concentraton Inequaltes and Applcatons Content. 1. Exponental concentraton for martngales wth bounded ncrements.

More information

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U) Econ 413 Exam 13 H ANSWERS Settet er nndelt 9 deloppgaver, A,B,C, som alle anbefales å telle lkt for å gøre det ltt lettere å stå. Svar er gtt . Unfortunately, there s a prntng error n the hnt of

More information

Maximum Likelihood Estimation of Binary Dependent Variables Models: Probit and Logit. 1. General Formulation of Binary Dependent Variables Models

Maximum Likelihood Estimation of Binary Dependent Variables Models: Probit and Logit. 1. General Formulation of Binary Dependent Variables Models ECO 452 -- OE 4: Probt and Logt Models ECO 452 -- OE 4 Maxmum Lkelhood Estmaton of Bnary Dependent Varables Models: Probt and Logt hs note demonstrates how to formulate bnary dependent varables models

More information

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore Sesson Outlne Introducton to classfcaton problems and dscrete choce models. Introducton to Logstcs Regresson. Logstc functon and Logt functon. Maxmum Lkelhood Estmator (MLE) for estmaton of LR parameters.

More information

Markov Chain Monte Carlo (MCMC), Gibbs Sampling, Metropolis Algorithms, and Simulated Annealing Bioinformatics Course Supplement

Markov Chain Monte Carlo (MCMC), Gibbs Sampling, Metropolis Algorithms, and Simulated Annealing Bioinformatics Course Supplement Markov Chan Monte Carlo MCMC, Gbbs Samplng, Metropols Algorthms, and Smulated Annealng 2001 Bonformatcs Course Supplement SNU Bontellgence Lab http://bsnuackr/ Outlne! Markov Chan Monte Carlo MCMC! Metropols-Hastngs

More information

Errors for Linear Systems

Errors for Linear Systems Errors for Lnear Systems When we solve a lnear system Ax b we often do not know A and b exactly, but have only approxmatons  and ˆb avalable. Then the best thng we can do s to solve ˆx ˆb exactly whch

More information

Lecture Space-Bounded Derandomization

Lecture Space-Bounded Derandomization Notes on Complexty Theory Last updated: October, 2008 Jonathan Katz Lecture Space-Bounded Derandomzaton 1 Space-Bounded Derandomzaton We now dscuss derandomzaton of space-bounded algorthms. Here non-trval

More information

COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS

COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS Avalable onlne at http://sck.org J. Math. Comput. Sc. 3 (3), No., 6-3 ISSN: 97-537 COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS

More information

Foundations of Arithmetic

Foundations of Arithmetic Foundatons of Arthmetc Notaton We shall denote the sum and product of numbers n the usual notaton as a 2 + a 2 + a 3 + + a = a, a 1 a 2 a 3 a = a The notaton a b means a dvdes b,.e. ac = b where c s an

More information

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE Analytcal soluton s usually not possble when exctaton vares arbtrarly wth tme or f the system s nonlnear. Such problems can be solved by numercal tmesteppng

More information

The Second Anti-Mathima on Game Theory

The Second Anti-Mathima on Game Theory The Second Ant-Mathma on Game Theory Ath. Kehagas December 1 2006 1 Introducton In ths note we wll examne the noton of game equlbrum for three types of games 1. 2-player 2-acton zero-sum games 2. 2-player

More information

Stanford University CS359G: Graph Partitioning and Expanders Handout 4 Luca Trevisan January 13, 2011

Stanford University CS359G: Graph Partitioning and Expanders Handout 4 Luca Trevisan January 13, 2011 Stanford Unversty CS359G: Graph Parttonng and Expanders Handout 4 Luca Trevsan January 3, 0 Lecture 4 In whch we prove the dffcult drecton of Cheeger s nequalty. As n the past lectures, consder an undrected

More information

U.C. Berkeley CS294: Spectral Methods and Expanders Handout 8 Luca Trevisan February 17, 2016

U.C. Berkeley CS294: Spectral Methods and Expanders Handout 8 Luca Trevisan February 17, 2016 U.C. Berkeley CS94: Spectral Methods and Expanders Handout 8 Luca Trevsan February 7, 06 Lecture 8: Spectral Algorthms Wrap-up In whch we talk about even more generalzatons of Cheeger s nequaltes, and

More information

A Robust Method for Calculating the Correlation Coefficient

A Robust Method for Calculating the Correlation Coefficient A Robust Method for Calculatng the Correlaton Coeffcent E.B. Nven and C. V. Deutsch Relatonshps between prmary and secondary data are frequently quantfed usng the correlaton coeffcent; however, the tradtonal

More information

Applied Stochastic Processes

Applied Stochastic Processes STAT455/855 Fall 23 Appled Stochastc Processes Fnal Exam, Bref Solutons 1. (15 marks) (a) (7 marks) The dstrbuton of Y s gven by ( ) ( ) y 2 1 5 P (Y y) for y 2, 3,... The above follows because each of

More information

Appendix for Causal Interaction in Factorial Experiments: Application to Conjoint Analysis

Appendix for Causal Interaction in Factorial Experiments: Application to Conjoint Analysis A Appendx for Causal Interacton n Factoral Experments: Applcaton to Conjont Analyss Mathematcal Appendx: Proofs of Theorems A. Lemmas Below, we descrbe all the lemmas, whch are used to prove the man theorems

More information

Lecture 14: Bandits with Budget Constraints

Lecture 14: Bandits with Budget Constraints IEOR 8100-001: Learnng and Optmzaton for Sequental Decson Makng 03/07/16 Lecture 14: andts wth udget Constrants Instructor: Shpra Agrawal Scrbed by: Zhpeng Lu 1 Problem defnton In the regular Mult-armed

More information

Yong Joon Ryang. 1. Introduction Consider the multicommodity transportation problem with convex quadratic cost function. 1 2 (x x0 ) T Q(x x 0 )

Yong Joon Ryang. 1. Introduction Consider the multicommodity transportation problem with convex quadratic cost function. 1 2 (x x0 ) T Q(x x 0 ) Kangweon-Kyungk Math. Jour. 4 1996), No. 1, pp. 7 16 AN ITERATIVE ROW-ACTION METHOD FOR MULTICOMMODITY TRANSPORTATION PROBLEMS Yong Joon Ryang Abstract. The optmzaton problems wth quadratc constrants often

More information

Time-Varying Systems and Computations Lecture 6

Time-Varying Systems and Computations Lecture 6 Tme-Varyng Systems and Computatons Lecture 6 Klaus Depold 14. Januar 2014 The Kalman Flter The Kalman estmaton flter attempts to estmate the actual state of an unknown dscrete dynamcal system, gven nosy

More information

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix Lectures - Week 4 Matrx norms, Condtonng, Vector Spaces, Lnear Independence, Spannng sets and Bass, Null space and Range of a Matrx Matrx Norms Now we turn to assocatng a number to each matrx. We could

More information

Structure and Drive Paul A. Jensen Copyright July 20, 2003

Structure and Drive Paul A. Jensen Copyright July 20, 2003 Structure and Drve Paul A. Jensen Copyrght July 20, 2003 A system s made up of several operatons wth flow passng between them. The structure of the system descrbes the flow paths from nputs to outputs.

More information

On an Extension of Stochastic Approximation EM Algorithm for Incomplete Data Problems. Vahid Tadayon 1

On an Extension of Stochastic Approximation EM Algorithm for Incomplete Data Problems. Vahid Tadayon 1 On an Extenson of Stochastc Approxmaton EM Algorthm for Incomplete Data Problems Vahd Tadayon Abstract: The Stochastc Approxmaton EM (SAEM algorthm, a varant stochastc approxmaton of EM, s a versatle tool

More information

CS 2750 Machine Learning. Lecture 5. Density estimation. CS 2750 Machine Learning. Announcements

CS 2750 Machine Learning. Lecture 5. Density estimation. CS 2750 Machine Learning. Announcements CS 750 Machne Learnng Lecture 5 Densty estmaton Mlos Hauskrecht mlos@cs.ptt.edu 539 Sennott Square CS 750 Machne Learnng Announcements Homework Due on Wednesday before the class Reports: hand n before

More information

COS 521: Advanced Algorithms Game Theory and Linear Programming

COS 521: Advanced Algorithms Game Theory and Linear Programming COS 521: Advanced Algorthms Game Theory and Lnear Programmng Moses Charkar February 27, 2013 In these notes, we ntroduce some basc concepts n game theory and lnear programmng (LP). We show a connecton

More information

Analysis of Discrete Time Queues (Section 4.6)

Analysis of Discrete Time Queues (Section 4.6) Analyss of Dscrete Tme Queues (Secton 4.6) Copyrght 2002, Sanjay K. Bose Tme axs dvded nto slots slot slot boundares Arrvals can only occur at slot boundares Servce to a job can only start at a slot boundary

More information

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton

More information

Course 395: Machine Learning - Lectures

Course 395: Machine Learning - Lectures Course 395: Machne Learnng - Lectures Lecture 1-2: Concept Learnng (M. Pantc Lecture 3-4: Decson Trees & CC Intro (M. Pantc Lecture 5-6: Artfcal Neural Networks (S.Zaferou Lecture 7-8: Instance ased Learnng

More information

Generalized Linear Methods

Generalized Linear Methods Generalzed Lnear Methods 1 Introducton In the Ensemble Methods the general dea s that usng a combnaton of several weak learner one could make a better learner. More formally, assume that we have a set

More information

Kernel Methods and SVMs Extension

Kernel Methods and SVMs Extension Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general

More information

Problem Set 9 Solutions

Problem Set 9 Solutions Desgn and Analyss of Algorthms May 4, 2015 Massachusetts Insttute of Technology 6.046J/18.410J Profs. Erk Demane, Srn Devadas, and Nancy Lynch Problem Set 9 Solutons Problem Set 9 Solutons Ths problem

More information

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography CSc 6974 and ECSE 6966 Math. Tech. for Vson, Graphcs and Robotcs Lecture 21, Aprl 17, 2006 Estmatng A Plane Homography Overvew We contnue wth a dscusson of the major ssues, usng estmaton of plane projectve

More information

Composite Hypotheses testing

Composite Hypotheses testing Composte ypotheses testng In many hypothess testng problems there are many possble dstrbutons that can occur under each of the hypotheses. The output of the source s a set of parameters (ponts n a parameter

More information

Pricing Problems under the Nested Logit Model with a Quality Consistency Constraint

Pricing Problems under the Nested Logit Model with a Quality Consistency Constraint Prcng Problems under the Nested Logt Model wth a Qualty Consstency Constrant James M. Davs, Huseyn Topaloglu, Davd P. Wllamson 1 Aprl 28, 2015 Abstract We consder prcng problems when customers choose among

More information

Linear Approximation with Regularization and Moving Least Squares

Linear Approximation with Regularization and Moving Least Squares Lnear Approxmaton wth Regularzaton and Movng Least Squares Igor Grešovn May 007 Revson 4.6 (Revson : March 004). 5 4 3 0.5 3 3.5 4 Contents: Lnear Fttng...4. Weghted Least Squares n Functon Approxmaton...

More information

Economics 101. Lecture 4 - Equilibrium and Efficiency

Economics 101. Lecture 4 - Equilibrium and Efficiency Economcs 0 Lecture 4 - Equlbrum and Effcency Intro As dscussed n the prevous lecture, we wll now move from an envronment where we looed at consumers mang decsons n solaton to analyzng economes full of

More information

Computing MLE Bias Empirically

Computing MLE Bias Empirically Computng MLE Bas Emprcally Kar Wa Lm Australan atonal Unversty January 3, 27 Abstract Ths note studes the bas arses from the MLE estmate of the rate parameter and the mean parameter of an exponental dstrbuton.

More information

Some modelling aspects for the Matlab implementation of MMA

Some modelling aspects for the Matlab implementation of MMA Some modellng aspects for the Matlab mplementaton of MMA Krster Svanberg krlle@math.kth.se Optmzaton and Systems Theory Department of Mathematcs KTH, SE 10044 Stockholm September 2004 1. Consdered optmzaton

More information

ELASTIC WAVE PROPAGATION IN A CONTINUOUS MEDIUM

ELASTIC WAVE PROPAGATION IN A CONTINUOUS MEDIUM ELASTIC WAVE PROPAGATION IN A CONTINUOUS MEDIUM An elastc wave s a deformaton of the body that travels throughout the body n all drectons. We can examne the deformaton over a perod of tme by fxng our look

More information

Chapter - 2. Distribution System Power Flow Analysis

Chapter - 2. Distribution System Power Flow Analysis Chapter - 2 Dstrbuton System Power Flow Analyss CHAPTER - 2 Radal Dstrbuton System Load Flow 2.1 Introducton Load flow s an mportant tool [66] for analyzng electrcal power system network performance. Load

More information

Lecture 21: Numerical methods for pricing American type derivatives

Lecture 21: Numerical methods for pricing American type derivatives Lecture 21: Numercal methods for prcng Amercan type dervatves Xaoguang Wang STAT 598W Aprl 10th, 2014 (STAT 598W) Lecture 21 1 / 26 Outlne 1 Fnte Dfference Method Explct Method Penalty Method (STAT 598W)

More information

Determinants Containing Powers of Generalized Fibonacci Numbers

Determinants Containing Powers of Generalized Fibonacci Numbers 1 2 3 47 6 23 11 Journal of Integer Sequences, Vol 19 (2016), Artcle 1671 Determnants Contanng Powers of Generalzed Fbonacc Numbers Aram Tangboonduangjt and Thotsaporn Thanatpanonda Mahdol Unversty Internatonal

More information

CS286r Assign One. Answer Key

CS286r Assign One. Answer Key CS286r Assgn One Answer Key 1 Game theory 1.1 1.1.1 Let off-equlbrum strateges also be that people contnue to play n Nash equlbrum. Devatng from any Nash equlbrum s a weakly domnated strategy. That s,

More information

x = , so that calculated

x = , so that calculated Stat 4, secton Sngle Factor ANOVA notes by Tm Plachowsk n chapter 8 we conducted hypothess tests n whch we compared a sngle sample s mean or proporton to some hypotheszed value Chapter 9 expanded ths to

More information

Introduction to Vapor/Liquid Equilibrium, part 2. Raoult s Law:

Introduction to Vapor/Liquid Equilibrium, part 2. Raoult s Law: CE304, Sprng 2004 Lecture 4 Introducton to Vapor/Lqud Equlbrum, part 2 Raoult s Law: The smplest model that allows us do VLE calculatons s obtaned when we assume that the vapor phase s an deal gas, and

More information

Linear Regression Analysis: Terminology and Notation

Linear Regression Analysis: Terminology and Notation ECON 35* -- Secton : Basc Concepts of Regresson Analyss (Page ) Lnear Regresson Analyss: Termnology and Notaton Consder the generc verson of the smple (two-varable) lnear regresson model. It s represented

More information

2.3 Nilpotent endomorphisms

2.3 Nilpotent endomorphisms s a block dagonal matrx, wth A Mat dm U (C) In fact, we can assume that B = B 1 B k, wth B an ordered bass of U, and that A = [f U ] B, where f U : U U s the restrcton of f to U 40 23 Nlpotent endomorphsms

More information

Lecture 3: Probability Distributions

Lecture 3: Probability Distributions Lecture 3: Probablty Dstrbutons Random Varables Let us begn by defnng a sample space as a set of outcomes from an experment. We denote ths by S. A random varable s a functon whch maps outcomes nto the

More information

Lecture 7: Boltzmann distribution & Thermodynamics of mixing

Lecture 7: Boltzmann distribution & Thermodynamics of mixing Prof. Tbbtt Lecture 7 etworks & Gels Lecture 7: Boltzmann dstrbuton & Thermodynamcs of mxng 1 Suggested readng Prof. Mark W. Tbbtt ETH Zürch 13 März 018 Molecular Drvng Forces Dll and Bromberg: Chapters

More information

C/CS/Phy191 Problem Set 3 Solutions Out: Oct 1, 2008., where ( 00. ), so the overall state of the system is ) ( ( ( ( 00 ± 11 ), Φ ± = 1

C/CS/Phy191 Problem Set 3 Solutions Out: Oct 1, 2008., where ( 00. ), so the overall state of the system is ) ( ( ( ( 00 ± 11 ), Φ ± = 1 C/CS/Phy9 Problem Set 3 Solutons Out: Oct, 8 Suppose you have two qubts n some arbtrary entangled state ψ You apply the teleportaton protocol to each of the qubts separately What s the resultng state obtaned

More information

Suggested solutions for the exam in SF2863 Systems Engineering. June 12,

Suggested solutions for the exam in SF2863 Systems Engineering. June 12, Suggested solutons for the exam n SF2863 Systems Engneerng. June 12, 2012 14.00 19.00 Examner: Per Enqvst, phone: 790 62 98 1. We can thnk of the farm as a Jackson network. The strawberry feld s modelled

More information

j) = 1 (note sigma notation) ii. Continuous random variable (e.g. Normal distribution) 1. density function: f ( x) 0 and f ( x) dx = 1

j) = 1 (note sigma notation) ii. Continuous random variable (e.g. Normal distribution) 1. density function: f ( x) 0 and f ( x) dx = 1 Random varables Measure of central tendences and varablty (means and varances) Jont densty functons and ndependence Measures of assocaton (covarance and correlaton) Interestng result Condtonal dstrbutons

More information

Computing Correlated Equilibria in Multi-Player Games

Computing Correlated Equilibria in Multi-Player Games Computng Correlated Equlbra n Mult-Player Games Chrstos H. Papadmtrou Presented by Zhanxang Huang December 7th, 2005 1 The Author Dr. Chrstos H. Papadmtrou CS professor at UC Berkley (taught at Harvard,

More information

Technical Note: A Simple Greedy Algorithm for Assortment Optimization in the Two-Level Nested Logit Model

Technical Note: A Simple Greedy Algorithm for Assortment Optimization in the Two-Level Nested Logit Model Techncal Note: A Smple Greedy Algorthm for Assortment Optmzaton n the Two-Level Nested Logt Model Guang L and Paat Rusmevchentong {guangl, rusmevc}@usc.edu September 12, 2012 Abstract We consder the assortment

More information

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:

More information

Estimation: Part 2. Chapter GREG estimation

Estimation: Part 2. Chapter GREG estimation Chapter 9 Estmaton: Part 2 9. GREG estmaton In Chapter 8, we have seen that the regresson estmator s an effcent estmator when there s a lnear relatonshp between y and x. In ths chapter, we generalzed the

More information

Department of Statistics University of Toronto STA305H1S / 1004 HS Design and Analysis of Experiments Term Test - Winter Solution

Department of Statistics University of Toronto STA305H1S / 1004 HS Design and Analysis of Experiments Term Test - Winter Solution Department of Statstcs Unversty of Toronto STA35HS / HS Desgn and Analyss of Experments Term Test - Wnter - Soluton February, Last Name: Frst Name: Student Number: Instructons: Tme: hours. Ads: a non-programmable

More information

A New Refinement of Jacobi Method for Solution of Linear System Equations AX=b

A New Refinement of Jacobi Method for Solution of Linear System Equations AX=b Int J Contemp Math Scences, Vol 3, 28, no 17, 819-827 A New Refnement of Jacob Method for Soluton of Lnear System Equatons AX=b F Naem Dafchah Department of Mathematcs, Faculty of Scences Unversty of Gulan,

More information

Case A. P k = Ni ( 2L i k 1 ) + (# big cells) 10d 2 P k.

Case A. P k = Ni ( 2L i k 1 ) + (# big cells) 10d 2 P k. THE CELLULAR METHOD In ths lecture, we ntroduce the cellular method as an approach to ncdence geometry theorems lke the Szemeréd-Trotter theorem. The method was ntroduced n the paper Combnatoral complexty

More information

Markov Chain Monte Carlo Lecture 6

Markov Chain Monte Carlo Lecture 6 where (x 1,..., x N ) X N, N s called the populaton sze, f(x) f (x) for at least one {1, 2,..., N}, and those dfferent from f(x) are called the tral dstrbutons n terms of mportance samplng. Dfferent ways

More information

1 GSW Iterative Techniques for y = Ax

1 GSW Iterative Techniques for y = Ax 1 for y = A I m gong to cheat here. here are a lot of teratve technques that can be used to solve the general case of a set of smultaneous equatons (wrtten n the matr form as y = A), but ths chapter sn

More information

Vapnik-Chervonenkis theory

Vapnik-Chervonenkis theory Vapnk-Chervonenks theory Rs Kondor June 13, 2008 For the purposes of ths lecture, we restrct ourselves to the bnary supervsed batch learnng settng. We assume that we have an nput space X, and an unknown

More information