Optimal Control: Applications in Risk Management and Online Markets


1 Optimal Control: Applications in Risk Management and Online Markets Albert Cohen Actuarial Sciences Program Department of Mathematics Department of Statistics and Probability Michigan State University East Lansing, MI Joint work with Milan Bradonjić (Bell Labs) and Matt Causley (Kettering U.) Mathematics Department Colloquium Albert Cohen (MSU) Albion Math Department Colloquium Albion College April / 48

2 Outline
1 Stochastic Optimal Control for Online Seller under Reputational Mechanisms
   Modeling Agent Behavior
   Stochastic Model and Solution
   Hamilton-Jacobi-Bellman formulation
   Market share based pricing mechanism
   Reduced model
   Numerical results for the reduced model
   Numerical results for the full model
2 Conclusions

3 Problem
Working as an individual agent selling an asset or collection of assets within a network like Amazon or eBay, we can:
- Work a little harder.
- Manage our reputation or brand within a strategy that takes advantage of our extra capacity to work.
- Model how our online reputation affects our bottom line, and act in a way that is optimal (maximizing expected profit, for example).
Question: Can we leverage our skills in continuous-time modeling of stochastic optimal control to design such an optimal strategy?
As a first approach, we don't consider shipping or other factors available to a buyer, only ranking [1].
[1] Bradonjic, M., Causley, M., and Cohen, A. Stochastic Optimal Control for Online Seller under Reputational Mechanisms. Risks. Dec 4;3(4).

4 Online Marketplace
In this work, we consider sellers in an online market like Amazon without a bidding mechanism. The setting may include concurrent online auctions on a site like eBay, for a product or service with only a buy-it-now option, under the assumption that buyers can compare reputation among online platforms.
In this online environment, the buyer and seller have a certain amount of anonymity, which has implications for the fairness of the transaction. One of the only countermeasures to ensure fairness is the use of feedback forums at online sites such as Amazon, eBay, and Yahoo!.
The feedback provided by customers provides a ranking system for the sellers, which in turn rewards fair sellers and penalizes unfair or unreliable sellers. Since it is in the seller's best interest to maximize profit, it follows that a seller will seek to use reputation as a means to optimize decision making.

5 Online Marketplace
We propose a simple behavioral model to study the relationship between a seller's wealth and reputation in the online environment with only the buy-it-now option. The wealth and growth of the seller are described by continuous stochastic processes.
Note that our model is a continuous approximation of a discrete sales/marketing/processing process. The discrete model of the process may be less tractable but is still very interesting and useful.

6 Online Marketplace
Online sellers can be broadly categorized by two types of behavior: those focused on advertising and customer service, which includes following up with past customers; and those who instead focus only on processing existing orders.
While the latter behavior will likely lower the seller's reputation on average, the immediate increase in wealth may offset the long-term damage from a lower reputation.
With this in mind, it is compelling to ask whether there exists an optimal long-term strategy, in which a seller attempts to maximize her profit by achieving a fixed reputation.

7 Online Marketplace
This work incorporates the idea that humans do not multitask well, but rather switch between tasks [2, 3]. In other words, switching costs do not factor in for an individual agent, but only for large online retailers who have marketing department reorganization costs.
Additionally, the model in this work allows all sellers to have a little extra capacity/money for either shipping or advertising (e.g. writing a letter or similar individual campaign), which is rather small and not unlimited.
Levitin, Daniel J. The Organized Mind: Thinking Straight in the Age of Information Overload. Penguin.

8 Agent Description
Consider a seller in an online marketplace as a triple (W, R, µ) representing her wealth, reputation, and excess rate, respectively. Here, the excess rate is extra effort that can be allocated either to expedite payment or to increase her reputation. In this setting, the reputation R is a positive number reflecting customer satisfaction.
As an initial approach, we consider only the effects of reputation on the optimal way to grow her wealth, and leave the more general model with shipping costs for future work.

9 Agent Description
Reputation $R$ is observed in $O := (0, \infty)$ and the control $\mu$ is constrained to $U := [-\epsilon, \epsilon]$. Since this process will be stochastic, we introduce a probability space $(\Omega, \mathcal{F}, P)$ and corresponding reference probability systems $\nu = ((\Omega, \{\mathcal{F}_s\}, P), B)$, where $\mathcal{F}_s \subset \mathcal{F}$ and $B$ is an $\mathcal{F}_s$-adapted Brownian motion.
The evolution of wealth and reputation is hence modeled by a two-state Markov process,
$$dW_t^{(\mu)} = (1 - \mu_t)\, h(R_t^{(\mu)})\, dt, \tag{1}$$
$$dR_t^{(\mu)} = (\mu_t - \kappa R_t^{(\mu)})\, dt + \sigma\, dB_t, \tag{2}$$
where $\kappa$ is a proportionality constant which accounts for mean reversion.
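The dynamics (1)-(2) can be explored with a simple Euler-Maruyama discretization. The following is a minimal sketch rather than anything taken from the talk: the function names, the constant control, the step size, and the illustrative growth rate `h_linear` are all assumptions, and discounting is omitted.

```python
import math
import random

def simulate_seller(r0, mu, h, kappa=1.0, sigma=0.5, dt=1e-3, t_max=5.0, seed=0):
    """Euler-Maruyama path of (W, R) under a constant control mu (hypothetical helper).

    dW = (1 - mu) * h(R) dt,   dR = (mu - kappa * R) dt + sigma dB.
    The path stops when R leaves (0, infinity), i.e. at the first exit time.
    """
    rng = random.Random(seed)
    w, r, t = 0.0, r0, 0.0
    n_steps = int(round(t_max / dt))
    for _ in range(n_steps):
        w += (1.0 - mu) * h(r) * dt                      # wealth increment, eq. (1)
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        r += (mu - kappa * r) * dt + noise               # reputation increment, eq. (2)
        t += dt
        if r <= 0.0:                                     # reputation exits at zero
            r = 0.0
            break
    return w, r, t

def h_linear(r, a=1.0, c=0.15):
    # Illustrative bounded, increasing growth rate (an assumption for this sketch).
    return a + c * r / (1.0 + r)

wealth, reputation, elapsed = simulate_seller(r0=1.0, mu=0.02, h=h_linear)
print(wealth, reputation, elapsed)
```

A positive `mu` diverts effort to reputation (wealth grows at rate `(1 - mu) * h`), while a negative `mu` speeds up processing at the expense of reputation.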

10 Agent Description
Also, $\tau_O$ is the first exit time from $O$, or $\infty$ if $R_s \in O$ for all $s \ge 0$. We set $\infty$ as an absorbing state for reputation, and enforce that the $\mu$ are chosen from the admissible class $A_\nu$,
$$A_\nu := B_1 \cap B_2 \cap B_3,$$
$$B_1 := \{\mu \in \mathcal{F}_s \mid \mu \text{ is } \mathcal{F}_s \text{ p.m.}\},$$
$$B_2 := \{\mu \in \mathcal{F}_s \mid \mu_\cdot \in U \text{ on } [0, \infty)\},$$
$$B_3 := \left\{\mu \in \mathcal{F}_s \;\Big|\; E\left[\int_0^{\tau_O} e^{-\rho s}(1 - \mu_s)\, h(R_s^{(\mu)})\, ds\right] < \infty\right\}. \tag{3}$$
Recall that p.m. denotes progressively measurable.

11 Mink-Seifert Model
The major difficulty is in defining the growth rate $h(R)$. However, an explicit mechanism has been proposed by Mink and Seifert [4]. There the authors not only propose, but empirically justify, a growth rate of
$$h(R) = A + C\left(1 - \frac{1}{\ln(e + R)}\right), \tag{4}$$
where $A$ relates to the inherent value of the object for sale and $C$ is a parameter to be fitted. This is accomplished by obtaining data using an auction robot and then computing a single regression, which gives C = […].
[4] Mink, Moritz, and Stefan Seifert. Reputation on eBay and its impact on sales prices. Group Decision and Negotiation (GDN) 2006, International Conference, Karlsruhe, Germany.
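As a quick sanity check on (4), the sketch below evaluates the growth rate and compares it with the equivalent form $A + C\,x/(1+x)$, $x = \ln(1 + R/e)$, which follows from $\ln(e + R) = 1 + \ln(1 + R/e)$. The function names are hypothetical; the defaults A = 1 and C = 0.15 are the values used in the numerical slides, not the fitted regression value.

```python
import math

def h_mink_seifert(r, a=1.0, c=0.15):
    """Growth rate h(R) = A + C * (1 - 1 / ln(e + R)) from equation (4)."""
    return a + c * (1.0 - 1.0 / math.log(math.e + r))

def h_equivalent(r, a=1.0, c=0.15):
    """Same function written as A + C * x / (1 + x), with x = ln(1 + R / e)."""
    x = math.log1p(r / math.e)
    return a + c * x / (1.0 + x)

for r in (0.0, 1.0, 10.0, 1000.0):
    print(r, h_mink_seifert(r), h_equivalent(r))
```

The printed values start at A for R = 0, increase with R, and stay below A + C, which is the bound used later for the value function.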

12 Mink-Seifert Model
Among the many economic papers on the effect of reputation on bidding and final sales price, the Mink-Seifert model is the first we found that gives an explicit relationship between reputation and price. To the best of our knowledge, it is among the first to do so.
The Mink-Seifert model also suggests a multiple regression formula where other factors, such as shipping costs and whether a buy-it-now price is offered, are considered as well, in which case C = […].

13 Hamilton-Jacobi-Bellman formulation
With the stochastic dynamics for reputation above, a growth rate model for reputational effect on sales, and an infinite horizon, we expect that switching depends only on the current reputational state. We now seek a function $V \in C^2[0, \infty) \cap C_p[0, \infty)$ which is a candidate solution of the optimal control problem
$$V(R) = \sup_{\nu} \sup_{\mu \in A_\nu} E\left[\int_0^{\tau_0} e^{-\rho s}(1 - \mu_s)\, h(R_s^{(\mu)})\, ds \;\Big|\; R_0 = R\right],$$
$$h(R) = A + C\left(1 - \frac{1}{\ln(e + R)}\right), \qquad \tau_0 := \inf\{t \ge 0 \mid R_t = 0\}. \tag{5}$$

14 Hamilton-Jacobi-Bellman formulation
One approach to finding $V$ would be to solve (5) directly. For example, by the definition of $V$, it follows that
$$V(0) = 0,$$
$$V(\infty) = \sup_{\nu} \sup_{\mu \in A_\nu} E\left[\int_0^{\infty} e^{-\rho s}(1 - \mu_s)\, h(R_s^{(\mu)})\, ds \;\Big|\; R_0 = \infty\right] = E\left[\int_0^{\infty} e^{-\rho s}(1 + \epsilon)\, h(\infty)\, ds\right] = \frac{1 + \epsilon}{\rho}\, h(\infty). \tag{6}$$

15 Hamilton-Jacobi-Bellman formulation
However, we shall instead apply the principle of optimality, and in doing so arrive at the following nonlinear Hamilton-Jacobi-Bellman (HJB) boundary value problem (suppressing the explicit dependence of $R$ on $\mu$):
$$0 = \max_{-\epsilon \le \mu \le \epsilon}\left[(\mu - \kappa R)\frac{\partial V}{\partial R} + \frac{\sigma^2}{2}\frac{\partial^2 V}{\partial R^2} + (1 - \mu)\, h(R) - \rho V\right],$$
$$V(0) = 0, \qquad V(\infty) = \frac{1 + \epsilon}{\rho}\, h(\infty),$$
$$h(R) = A + C\left(1 - \frac{1}{\ln(e + R)}\right) = A + C\,\frac{\ln\left(1 + \frac{R}{e}\right)}{1 + \ln\left(1 + \frac{R}{e}\right)}. \tag{7}$$

16 Hamilton-Jacobi-Bellman formulation
The previous problem simplifies to
$$\rho V = h(R) - \kappa R\,\frac{\partial V}{\partial R} + \frac{\sigma^2}{2}\frac{\partial^2 V}{\partial R^2} + \epsilon\left|\frac{\partial V}{\partial R} - h(R)\right|,$$
$$V(0) = 0, \qquad V(\infty) = \frac{1 + \epsilon}{\rho}(A + C),$$
$$h(R) = A + C\,\frac{\ln\left(1 + \frac{R}{e}\right)}{1 + \ln\left(1 + \frac{R}{e}\right)}. \tag{8}$$
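The step from (7) to (8) uses only the fact that the bracket in (7) is linear in $\mu$, so the maximum over $[-\epsilon, \epsilon]$ is attained at an endpoint. Spelled out, writing $V_R$ and $V_{RR}$ for the first and second derivatives:

```latex
\begin{aligned}
0 &= \max_{-\epsilon \le \mu \le \epsilon}
     \Big[ (\mu - \kappa R) V_R + \tfrac{\sigma^2}{2} V_{RR} + (1 - \mu) h(R) - \rho V \Big] \\
  &= -\kappa R\, V_R + \tfrac{\sigma^2}{2} V_{RR} + h(R) - \rho V
     + \max_{-\epsilon \le \mu \le \epsilon} \mu \big( V_R - h(R) \big) \\
  &= -\kappa R\, V_R + \tfrac{\sigma^2}{2} V_{RR} + h(R) - \rho V
     + \epsilon \, \big| V_R - h(R) \big| ,
\end{aligned}
\qquad
\mu^*(R) = \epsilon \, \operatorname{sign}\big( V_R - h(R) \big).
```

The maximizer is therefore of bang-bang type: $\mu^* = +\epsilon$ where $V_R > h(R)$ and $\mu^* = -\epsilon$ where $V_R < h(R)$.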

17 Hamilton-Jacobi-Bellman formulation
We therefore instead solve the HJB problem, which is justified by the following standard theorem in Fleming-Soner (2006) [5]:
Theorem. Let $V \in C^2(O) \cap C_b(\bar{O})$ be a solution to the boundary value problem (8). Then $V$ is an optimal solution to the optimal control problem (5).
Remark. We note here that it is sufficient to require polynomial growth on $V$, not bounded growth. However, as $h$ is bounded above by $A + C$ on $O$, we can restrict our attention to bounded $V$. Furthermore, an upper bound (and in fact the limit as $R \to \infty$) for $V$ is the value obtained by a seller earning the maximum value $A + C$ on her items on the time interval $[0, \infty)$ while always processing orders.
[5] Fleming, Wendell H., and Halil Mete Soner. Controlled Markov Processes and Viscosity Solutions. Vol. 25. Springer Science and Business Media, 2006.

18 Hamilton-Jacobi-Bellman formulation
Proof. By Corollary IV.5.1 in Fleming-Soner (2006), since (i) the drift and volatility are (uniformly) Lipschitz in $R$ and $\mu$, (ii) $\rho > 0$, and (iii) we impose an additional boundary condition on the solution as $R \to \infty$, as our infinite horizon is now $[0, \tau_0)$ and not $[0, \tau_O)$, it follows directly that our bounded classical solution $V$ coincides with the value function of (5), and that the control
$$\mu^*(R) \in \operatorname*{argmax}_{\mu \in [-\epsilon, \epsilon]}\left[(\mu - \kappa R)\frac{\partial V}{\partial R} + \frac{\sigma^2}{2}\frac{\partial^2 V}{\partial R^2} + (1 - \mu)\, h(R) - \rho V\right] \tag{9}$$
leads to an optimal, stationary Markov control $\mu^*(R_s^{(\mu^*)})$.

19 Conversion from Reputation to Market Share
We observe that under its current definition, the reputation $R \in O$. In this section, we employ the mapping
$$Y := f(R) = \frac{R}{1 + R}, \tag{10}$$
which results in a normalized market share reputation $Y \in Q := (0, 1)$.
This is consistent with Mink and Seifert's assumption that the value function $h(R)$ be concave and bounded. We therefore also simplify (4) accordingly, replacing logarithms with a rational function
$$h(R) = A + C\,\frac{R}{1 + R}. \tag{11}$$

20 Conversion from Reputation to Market Share
Consequently, we have $dY_t = f'(R)\, dR_t + \tfrac{1}{2} f''(R)\, dR_t\, dR_t$, and so
$$dY_t = (1 - Y)^2\left[\left(\mu - \kappa\,\frac{Y}{1 - Y}\right) dt + \sigma\, dB_t\right] + \tfrac{1}{2}\left(-2(1 - Y)^3\right)\sigma^2\, dt$$
$$\phantom{dY_t} = \left[\mu(1 - Y)^2 - \kappa Y(1 - Y) - \sigma^2(1 - Y)^3\right] dt + \sigma(1 - Y)^2\, dB_t,$$
$$\tilde{h}(Y) = A + CY,$$
$$\tilde{V}(y) = \sup_{\nu} \sup_{\mu \in A_\nu} E\left[\int_0^{\tau_0} e^{-\rho s}(1 - \mu_s)\, \tilde{h}(Y_s)\, ds \;\Big|\; Y_0 = y\right]. \tag{12}$$
Note that a function that is bounded for market share $Y \in (0, 1)$ is also correspondingly bounded for all reputation $R \in (0, \infty)$. We therefore study the market share model below, confident that our results will hold for the full reputation model by virtue of the inverse map of (10).
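The Itô computation above rests on the two identities $f'(R) = (1 - Y)^2$ and $f''(R) = -2(1 - Y)^3$ for $f(R) = R/(1+R)$. A small finite-difference check, with hypothetical helper names and an arbitrary step size, is sketched below.

```python
def f(r):
    """Market share map Y = R / (1 + R) from equation (10)."""
    return r / (1.0 + r)

def check_derivatives(r, step=1e-4):
    """Return the errors of the closed forms for f' and f'' against central differences."""
    y = f(r)
    d1 = (f(r + step) - f(r - step)) / (2.0 * step)
    d2 = (f(r + step) - 2.0 * f(r) + f(r - step)) / step**2
    err_first = d1 - (1.0 - y) ** 2            # f'(R)  = (1 - Y)^2
    err_second = d2 - (-2.0 * (1.0 - y) ** 3)  # f''(R) = -2 (1 - Y)^3
    return err_first, err_second

err1, err2 = check_derivatives(2.0)
print(err1, err2)
```

Both errors should be at the level of the finite-difference truncation and rounding error.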

21 Conversion from Reputation to Market Share
After market share rescaling, the HJB equation takes a form which is in fact degenerate at both endpoints $y = 0, 1$:
$$-\frac{\sigma^2}{2}(1 - y)^4\, V''(y) + \rho V = h(y) - \left[\kappa y(1 - y) + \sigma^2(1 - y)^3\right] V'(y) + \epsilon\left|(1 - y)^2\, V'(y) - h(y)\right|,$$
$$V(0) = 0, \qquad V(1) = \frac{1 + \epsilon}{\rho}\, h(1). \tag{13}$$

22 Conversion from Reputation to Market Share
In theory, a closed form solution for this boundary value problem can be formally constructed piecewise, to the left and right of the special point $y^*$, for which
$$h(y^*) = (1 - y^*)^2\, V'(y^*).$$
We shall refer to $y^*$ as the switching point below, since it is the reputational value at which the seller switches from advertising to processing.
In practice, we shall resort to numerical computation to approximate this solution, and in doing so estimate the value $y^*$.

23 Conversion from Reputation to Market Share
Theorem. If $V \in C^2(Q)$ is a monotonically increasing solution to the HJB problem (13), then $V = \tilde{V}$.
Proof. By Theorem 5.1 in Fleming-Soner (2006), a solution $V$ to (13) on the open interval $Q$ that resides in $C^2(Q) \cap C_p(\bar{Q})$ is in fact equal to the optimal value function $\tilde{V}$. Enforcing the boundary condition on $V(1)$ enforces not only polynomial growth on our monotonically increasing $V$, but in fact bounded growth. Note that $Q = (0, 1)$ is open, but $y = 1$ is not reached in finite time from anywhere in $Q$, and the diffusion $Y$ is in fact absorbed for all time in state $Y = 1$. Hence the exact value of $V(1)$ is imposed.

24 Reduced Model
With the previous mapping, note that the market (reputation) share $Y$ is absorbed at $Y = 1$, which corresponds to $R \to \infty$. Moreover, as in the Stochastic Nerlove evolution, the probability that $Y_t < 0$ for some positive $t$ is greater than 0, and the drift term $p_2((1 - Y); \mu)$ is a second order polynomial in $Y$. Based on these observations, we propose a reduced model for Stochastic Reputation Share, expressed by the following stochastic differential equation and associated stochastic optimal control problem:
$$\frac{dY_t}{1 - Y_t} = \mu\, dt + \sigma\, dB_t,$$
$$\check{V}(y) = \sup_{\nu} \sup_{\mu \in A_\nu} E\left[\int_0^{\tau_0} e^{-\rho s}(1 - \mu_s)(A + C Y_s)\, ds \;\Big|\; Y_0 = y\right]. \tag{14}$$

25 Reduced Model
This leads to a corresponding nonlinear HJB boundary value problem:
$$-\frac{\sigma^2}{2}(1 - y)^2\, V''(y) + \rho V = h(y) + \epsilon\left|(1 - y)\, V'(y) - h(y)\right|,$$
$$V(0) = 0, \qquad V(1) = \frac{1 + \epsilon}{\rho}\, h(1). \tag{15}$$
One of the most attractive features of the reduced model (15) is that it has a closed form, piecewise defined analytic solution.

26 Reduced Model
Theorem. There exists a solution $V^* \in C^2(Q) \cap C_b(\bar{Q})$ to the reduced model (15). Furthermore, $V^* = \check{V}$, and is given by the piecewise solution
$$V^*(y) = \begin{cases} V_l(y), & 0 \le y \le y^*, \\ V_r(y), & y^* \le y \le 1, \end{cases} \tag{16}$$
where
$$V_l = c_1(1 - y)^{\gamma_-^l} + c_2(1 - y)^{\gamma_+^l} + \alpha_l + \beta_l\, y, \tag{17}$$
$$V_r = c_3(1 - y)^{\gamma_+^r} + \alpha_r + \beta_r\, y, \tag{18}$$
with constants defined below.

27 Reduced Model
Proof. The construction of this piecewise solution is found in the Appendix. The solution constructed is monotonically increasing in $y$. Enforcing the boundary condition on $V(1)$ uniformly bounds $V$ on $Q$. The proof of the equality $V = \check{V}$ then follows from Theorem 5.1 in Fleming-Soner (2006), as in our previous theorem.
Remark: Since the reduced model has a known closed form solution, we can measure the error made in constructing numerical solutions, which in turn acts as a benchmark for the code we use to study the full model. Closed form solutions are uncommon for optimal control problems.

28 Piecewise analytic solution of reduced model
We now construct the exact solution to the reduced HJB boundary value problem. The principal difficulty that must be overcome is the appearance of an absolute value, which contains the unknown variable $V$. We therefore consider a piecewise defined solution $V(y)$, which remains $C^2$ across the switching point $y^*$, defined by the vanishing of the expression in the absolute value:
$$(1 - y^*)\, V'(y^*) = h(y^*). \tag{19}$$
We now separately consider the regions $y < y^*$ and $y > y^*$ in the reduced model (15), and define the corresponding linear differential operators $L_\pm$ as
$$L_\pm[V] = \left(-\frac{\sigma^2}{2}(1 - y)^2\frac{d^2}{dy^2} \pm \epsilon(1 - y)\frac{d}{dy} + \rho\right) V. \tag{20}$$

29 Piecewise analytic solution of reduced model
The boundary value problem can then be decomposed into two smaller problems for $V_l$ and $V_r$, where the boundary conditions provide one condition for each of $V_l$ and $V_r$. The remaining two conditions are provided by enforcing $C^2$ smoothness of the solution at the switching point (19). Hence we now formulate two (well-posed) boundary value problems. Once solved, enforcing continuity uniquely defines the value of the switching point $y^*$:
$$V_l(y^*) = V_r(y^*). \tag{21}$$

30 Piecewise analytic solution of reduced model
$$L_-[V_l] = (1 - \epsilon)\, h(y), \qquad 0 < y < y^*, \tag{22}$$
$$V_l(0) = 0, \qquad (1 - y^*)\, V_l'(y^*) = h(y^*), \tag{23}$$
$$L_+[V_r] = (1 + \epsilon)\, h(y), \qquad y^* < y < 1, \tag{24}$$
$$V_r(1) = \frac{1 + \epsilon}{\rho}\, h(1), \qquad (1 - y^*)\, V_r'(y^*) = h(y^*). \tag{25}$$

31 Constructing the reduced model solution
We shall construct the solutions $V_l$ and $V_r$ separately first, although the approach for both solutions will be the same. Following standard methods, we first decompose the full solution into a homogeneous and a particular solution, where the homogeneous part satisfies $L_\pm[u] = 0$. Since the operators $L_\pm$ are equi-dimensional, we seek solutions of the form
$$u = c\,(1 - y)^{\gamma}, \tag{26}$$
and the application of the differential operator yields
$$L_\pm[u] = c\,(1 - y)^{\gamma}\left(-\frac{\sigma^2}{2}(\gamma^2 - \gamma) \mp \epsilon\gamma + \rho\right) = 0. \tag{27}$$

32 Constructing the reduced model solution
We set the parenthetical term of this expression to zero in order to solve for the admissible exponents $\gamma$:
$$\gamma_\pm^l = \frac{1}{2} + \frac{\epsilon}{\sigma^2} \pm \sqrt{\left(\frac{1}{2} + \frac{\epsilon}{\sigma^2}\right)^2 + \frac{2\rho}{\sigma^2}} \qquad (\text{from } L_-), \tag{28}$$
$$\gamma_\pm^r = \frac{1}{2} - \frac{\epsilon}{\sigma^2} \pm \sqrt{\left(\frac{1}{2} - \frac{\epsilon}{\sigma^2}\right)^2 + \frac{2\rho}{\sigma^2}} \qquad (\text{from } L_+). \tag{29}$$
Because $\gamma_-^r < 0$ for $\rho > 0$, the homogeneous solution $(1 - y)^{\gamma_-^r}$ will be unbounded as $y \to 1$, and so we exclude it from further consideration below.
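The exponents in (28)-(29) can be checked by substituting them back into the parenthetical term of (27). The sketch below does this numerically; the function names and the `branch` flag are illustrative choices, and the parameter values are ones quoted in the numerical slides.

```python
import math

def exponents(rho, sigma, eps, branch):
    """Return (gamma_minus, gamma_plus) for branch 'l' (from L_-) or 'r' (from L_+)."""
    shift = eps / sigma**2 if branch == "l" else -eps / sigma**2
    b = 0.5 + shift
    root = math.sqrt(b * b + 2.0 * rho / sigma**2)
    return b - root, b + root

def characteristic(gamma, rho, sigma, eps, branch):
    """Parenthetical term of (27): -(sigma^2/2)(g^2 - g) -/+ eps*g + rho."""
    sign = 1.0 if branch == "l" else -1.0   # L_- gives +eps*g, L_+ gives -eps*g
    return -0.5 * sigma**2 * (gamma * gamma - gamma) + sign * eps * gamma + rho

for branch in ("l", "r"):
    g_minus, g_plus = exponents(rho=0.5, sigma=0.5, eps=0.02, branch=branch)
    print(branch, g_minus, g_plus,
          characteristic(g_minus, 0.5, 0.5, 0.02, branch),
          characteristic(g_plus, 0.5, 0.5, 0.02, branch))
```

For any $\rho > 0$ the square root exceeds the magnitude of the leading term, so each branch has one negative and one positive exponent, which is why only $\gamma_+^r$ is kept on the right-hand region.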

33 Constructing the reduced model solution
Furthermore, since $h(y) = A + Cy$ is a linear function, the particular solution will also be linear. We therefore have a general solution of the form
$$V_l = c_1(1 - y)^{\gamma_-^l} + c_2(1 - y)^{\gamma_+^l} + \alpha_l + \beta_l\, y, \tag{30}$$
$$V_r = c_3(1 - y)^{\gamma_+^r} + \alpha_r + \beta_r\, y. \tag{31}$$
The particular solution is fixed by enforcing that the differential equations (22) and (24) are satisfied, which yields
$$\alpha_l = \left(\frac{1 - \epsilon}{\rho}\right)(A + C) - \left(\frac{1 - \epsilon}{\rho + \epsilon}\right) C, \qquad \beta_l = \left(\frac{1 - \epsilon}{\rho + \epsilon}\right) C, \tag{32}$$
$$\alpha_r = \left(\frac{1 + \epsilon}{\rho}\right)(A + C) - \left(\frac{1 + \epsilon}{\rho - \epsilon}\right) C, \qquad \beta_r = \left(\frac{1 + \epsilon}{\rho - \epsilon}\right) C. \tag{33}$$
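Since a linear function has zero second derivative, the particular solutions (32)-(33) can be verified by checking that $\pm\epsilon(1 - y)\beta + \rho(\alpha + \beta y)$ equals $(1 \pm \epsilon)\,h(y)$ on the corresponding branch. A minimal sketch with hypothetical function names, assuming $\rho > \epsilon$ so that $\beta_r$ is well defined:

```python
def particular_coeffs(a, c, rho, eps, branch):
    """(alpha, beta) from (32) for branch 'l' or (33) for branch 'r'."""
    k = 1.0 - eps if branch == "l" else 1.0 + eps
    denom = rho + eps if branch == "l" else rho - eps
    beta = k * c / denom
    alpha = k * (a + c) / rho - beta
    return alpha, beta

def residual(y, a, c, rho, eps, branch):
    """L_-/+ applied to alpha + beta*y, minus (1 -/+ eps) * h(y)."""
    alpha, beta = particular_coeffs(a, c, rho, eps, branch)
    sign = -1.0 if branch == "l" else 1.0          # L_- carries -eps, L_+ carries +eps
    k = 1.0 - eps if branch == "l" else 1.0 + eps
    lhs = sign * eps * (1.0 - y) * beta + rho * (alpha + beta * y)
    return lhs - k * (a + c * y)

for branch in ("l", "r"):
    print(branch, [residual(y, 1.0, 0.15, 0.5, 0.02, branch) for y in (0.0, 0.5, 1.0)])
```

On the right branch, $\alpha_r + \beta_r = (1 + \epsilon)(A + C)/\rho$, which is exactly the boundary value $V(1)$ in (15).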

34 Constructing the reduced model solution
The homogeneous coefficients $c_1$ and $c_2$ are then found by applying the boundary conditions (23), and the resulting linear system
$$\begin{pmatrix} 1 & 1 \\ \gamma_-^l(1 - y^*)^{\gamma_-^l} & \gamma_+^l(1 - y^*)^{\gamma_+^l} \end{pmatrix}\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} -\alpha_l \\ \beta_l(1 - y^*) - h(y^*) \end{pmatrix}$$
is solved by
$$c_1 = -\frac{\alpha_l\,\gamma_+^l(1 - y^*)^{\gamma_+^l} + \beta_l(1 - y^*) - h(y^*)}{\gamma_+^l(1 - y^*)^{\gamma_+^l} - \gamma_-^l(1 - y^*)^{\gamma_-^l}}, \tag{34}$$
$$c_2 = \frac{\beta_l(1 - y^*) - h(y^*) + \alpha_l\,\gamma_-^l(1 - y^*)^{\gamma_-^l}}{\gamma_+^l(1 - y^*)^{\gamma_+^l} - \gamma_-^l(1 - y^*)^{\gamma_-^l}}. \tag{35}$$

35 Constructing the reduced model solution
The boundary condition (25) at $y = 1$ is automatically satisfied by the particular solution, and so the final coefficient $c_3$ is determined by the condition at $y = y^*$,
$$c_3 = \frac{\beta_r(1 - y^*) - h(y^*)}{\gamma_+^r(1 - y^*)^{\gamma_+^r}}. \tag{36}$$
Finally, having determined the general solutions $V_l$ and $V_r$, the solution $V$ is obtained by enforcing continuity. This also fixes the value $y^*$, which must now satisfy the nonlinear transcendental equation
$$c_1(1 - y^*)^{\gamma_-^l} + c_2(1 - y^*)^{\gamma_+^l} + \alpha_l + \beta_l\, y^* = c_3(1 - y^*)^{\gamma_+^r} + \alpha_r + \beta_r\, y^*. \tag{37}$$
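One way to solve the transcendental equation (37) is to treat the difference of its two sides as a scalar function of a trial switching point and look for a sign change. The sketch below is an illustration under stated assumptions, not the authors' code: the scan-then-bisect strategy, the function names, and the choice ρ = σ = 0.5 (one of the parameter pairs listed in the numerical slides) are all assumptions.

```python
import math

# A, C, eps are the values quoted in the numerical slides; rho, sigma are one listed pair.
A, C, EPS, RHO, SIGMA = 1.0, 0.15, 0.02, 0.5, 0.5

def h(y):
    return A + C * y

def roots(sign):
    """Exponents for L_- (sign=+1) or L_+ (sign=-1), equations (28)-(29)."""
    b = 0.5 + sign * EPS / SIGMA**2
    d = math.sqrt(b * b + 2.0 * RHO / SIGMA**2)
    return b - d, b + d

GL_M, GL_P = roots(+1.0)
_, GR_P = roots(-1.0)
BETA_L = (1.0 - EPS) * C / (RHO + EPS)
ALPHA_L = (1.0 - EPS) * (A + C) / RHO - BETA_L
BETA_R = (1.0 + EPS) * C / (RHO - EPS)
ALPHA_R = (1.0 + EPS) * (A + C) / RHO - BETA_R

def coefficients(ys):
    """c1, c2, c3 from (34)-(36) for a trial switching point ys in (0, 1)."""
    pm, pp = (1.0 - ys) ** GL_M, (1.0 - ys) ** GL_P
    det = GL_P * pp - GL_M * pm
    rhs = BETA_L * (1.0 - ys) - h(ys)
    c1 = -(ALPHA_L * GL_P * pp + rhs) / det
    c2 = (rhs + ALPHA_L * GL_M * pm) / det
    c3 = (BETA_R * (1.0 - ys) - h(ys)) / (GR_P * (1.0 - ys) ** GR_P)
    return c1, c2, c3

def v_left(y, c1, c2):
    return c1 * (1.0 - y) ** GL_M + c2 * (1.0 - y) ** GL_P + ALPHA_L + BETA_L * y

def v_right(y, c3):
    return c3 * (1.0 - y) ** GR_P + ALPHA_R + BETA_R * y

def gap(ys):
    """Left side minus right side of the continuity equation (37)."""
    c1, c2, c3 = coefficients(ys)
    return v_left(ys, c1, c2) - v_right(ys, c3)

def find_switching_point(n_scan=2000, n_bisect=100):
    """Scan (0, 1) for a sign change of gap, then bisect; None if no change is found."""
    grid = [k / n_scan for k in range(1, n_scan)]
    for lo, hi in zip(grid, grid[1:]):
        if gap(lo) * gap(hi) < 0.0:
            for _ in range(n_bisect):
                mid = 0.5 * (lo + hi)
                if gap(lo) * gap(mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            return 0.5 * (lo + hi)
    return None

y_star = find_switching_point()
print(y_star)
```

For any trial point, `coefficients` already enforces $V_l(0) = 0$, $V_r(1) = (1+\epsilon)h(1)/\rho$, and the two derivative conditions at the trial point, so continuity is the only remaining equation. The scan returns the first sign change it finds and makes no claim about uniqueness.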

36 Constructing the numerical solution
The boundary value problem (15) is discretized using a standard finite difference scheme. Let $y_k = k/N_y$ and $V_k = V(y_k)$, for $k = 0, 1, \ldots, N_y$. Then we solve a linear system of $N_y + 1$ equations of the form $Mv = f$, where
$$Mv \approx \rho V - \frac{\sigma^2}{2}(1 - y)^2\, V'', \qquad f \approx h + \epsilon\left|(1 - y)\, V' - h\right|.$$

37 Constructing the numerical solution
Since $f$ depends on $V$, the solution is implicit, and therefore must be obtained by using a fixed point iteration
$$M v^{(k+1)} = f^{(k)}.$$
In our numerical experiments, we let $N_y = 1000$, and set the maximum iteration count at $K = 20$. Convergence is observed in all tested cases.
In Figure 1, several numerical solutions $V(y)$ are shown, for fixed $A = 1$, $C = 0.15$, $\epsilon = 0.02$, and various values of $\rho = 0.1, 0.2, 0.5, 2.0$ and $\sigma = 0.2, 0.5, 1.0, 5.0$.
The figures show that both $\rho$ and $\sigma$ have a dramatic effect not only on the shape of the solution, but also, as is demonstrated in Figure 2, on the switching point $y^*$.
These numerical results suggest that both the discount rate $\rho$ and the volatility $\sigma$ of a seller's reputation can dramatically affect her behavior, particularly the critical reputation at which she switches from the advertising to the processing mode.
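The fixed point iteration described above can be sketched in a few lines. The version below is an assumption-laden illustration rather than the code used for the figures: it uses central differences, Dirichlet rows for the two boundary values, a hand-written tridiagonal (Thomas) solve, and a coarser default grid (`n=200`) than the $N_y = 1000$ quoted in the talk.

```python
def solve_reduced_hjb(a=1.0, c=0.15, eps=0.02, rho=0.5, sigma=0.5, n=200, iters=20):
    """Fixed point iteration M v_new = f(v_old) for the reduced HJB problem (15)."""
    dy = 1.0 / n
    y = [k * dy for k in range(n + 1)]
    h = [a + c * yk for yk in y]
    v_at_one = (1.0 + eps) * h[-1] / rho          # boundary value V(1)
    v = [0.0] * (n + 1)                           # initial guess

    # Tridiagonal rows of M: lower[k] v[k-1] + diag[k] v[k] + upper[k] v[k+1].
    lower, diag, upper = [0.0] * (n + 1), [1.0] * (n + 1), [0.0] * (n + 1)
    for k in range(1, n):
        w = 0.5 * sigma**2 * (1.0 - y[k]) ** 2 / dy**2
        lower[k], diag[k], upper[k] = -w, rho + 2.0 * w, -w

    for _ in range(iters):
        # Right-hand side from the previous iterate, central difference for V'.
        f = [0.0] * (n + 1)
        for k in range(1, n):
            dv = (v[k + 1] - v[k - 1]) / (2.0 * dy)
            f[k] = h[k] + eps * abs((1.0 - y[k]) * dv - h[k])
        f[0], f[n] = 0.0, v_at_one                # Dirichlet rows: V(0) and V(1)

        # Thomas algorithm: forward elimination, then back substitution.
        cp, dp = [0.0] * (n + 1), [0.0] * (n + 1)
        cp[0], dp[0] = upper[0] / diag[0], f[0] / diag[0]
        for k in range(1, n + 1):
            m = diag[k] - lower[k] * cp[k - 1]
            cp[k] = upper[k] / m
            dp[k] = (f[k] - lower[k] * dp[k - 1]) / m
        v = [0.0] * (n + 1)
        v[n] = dp[n]
        for k in range(n - 1, -1, -1):
            v[k] = dp[k] - cp[k] * v[k + 1]
    return y, v

grid, values = solve_reduced_hjb()
print(values[0], values[len(values) // 2], values[-1])
```

Because the interior rows are strictly diagonally dominant for $\rho > 0$, the tridiagonal solve needs no pivoting. The switching point can then be estimated as the grid location where $(1 - y)V'(y) - h(y)$ changes sign.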

38 Figure 1: Value Function (Reduced Model)
Panels: (a) ρ = 0.1, (b) ρ = 0.2, (c) ρ = 0.5, (d) ρ = 2.0.
Figure: Several numerical solutions V(y) of the reduced boundary value problem (15), with A = 1, C = 0.15, and ε = 0.02.

39 Figure 2: Switching Point (Reduced Model)
Panels: (a) ρ = 0.1, (b) ρ = 0.2, (c) ρ = 0.5, (d) ρ = 2.0.
Figure: A plot of the quantity (1 − y)V′(y), where V(y) is a numerical solution of the reduced boundary value problem (15). The value y* is defined as the intersection of these curves with h(y) (the solid line).

40 Figure 3: Value Function (Full Model)
Panels: (a) ρ = 0.1, (b) ρ = 0.2, (c) ρ = 0.5, (d) ρ = 2.0.
Figure: Several numerical solutions V(y) of the full boundary value problem (13), with A = 1, C = 0.15, ε = 0.02, and κ = 0.

41 Figure 4: Switching Point (Full Model)
Panels: (a) ρ = 0.1, (b) ρ = 0.2, (c) ρ = 0.5, (d) ρ = 2.0.
Figure: A plot of the quantity (1 − y)²V′, where V is a numerical solution of the full boundary value problem (13). The value y* is defined as the intersection of these curves with h(y) (black solid line).

42 Constructing the numerical solution
The same numerical discretization is employed to solve the full Nerlove-Arrow model (13), where we solve a linear system of the form $Mv = f$, with
$$Mv \approx \rho V + \left[\kappa y(1 - y) + \sigma^2(1 - y)^3\right] V'(y) - \frac{\sigma^2}{2}(1 - y)^4\, V''(y),$$
$$f \approx h + \epsilon\left|(1 - y)^2\, V' - h\right|. \tag{38}$$

43 Figure 5: Value Function (Full Model)
Panels: (a) ρ = 0.1, (b) ρ = 0.2, (c) ρ = 0.5, (d) ρ = 2.0.
Figure: The same as Figure 3, but with κ = 1.

44 Figure 6: Switching Point (Full Model)
Panels: (a) ρ = 0.1, (b) ρ = 0.2, (c) ρ = 0.5, (d) ρ = 2.0.
Figure: The same as Figure 4, but with κ = 1.

45 Conclusions
In this work, we have proposed a model which describes the behavior of sellers participating in online markets. In doing so, we were able to design an optimal selling strategy, wherein the seller switches their behavior when an optimal market share reputation is reached, which depends on statistical modeling parameters.
These modeling parameters were introduced through the wealth-reputation mechanisms by Nerlove and Arrow [6], which have been generalized to stochastic settings by Raman [7], among others.
[6] Nerlove, Marc, and Kenneth J. Arrow. Optimal advertising policy under dynamic conditions. Economica (1962).
[7] Raman, Kalyan. Boundary value problems in stochastic optimal control of advertising. Automatica 42.8 (2006).

46 Conclusions
We have additionally modified the empirical model h(R) found in Mink and Seifert, relating the price per unit to the reputation. Rather than viewing reputation R as an unbounded quantity, we instead introduced a market scaled reputation Y = R/(1 + R), where Y ∈ [0, 1), which reduces the price per unit to a simple linear model h(Y) = A + CY.
We then optimized the stochastic model over the control of the excess rate µ using the Hamilton-Jacobi-Bellman equation, yielding a deterministic equation relating the value per unit to the seller's reputation.
The resulting boundary value problem is then solved numerically, and we qualitatively see that the value of the seller's goods increases monotonically with reputation, but that a unique optimal reputation y = y* determines when the seller should switch from advertising to processing.

47 Conclusions
The numerical scheme was validated with a reduced model, which has a closed form piecewise analytic solution and permits direct determination of y*.
The numerical results qualitatively validate our modeling assumptions, and provide a framework for studying seller behavior based on seller reputation. Although the techniques used are standard, we believe that the optimal strategy presented both analytically and numerically has implications for the way reputational information can be used to predict the behavior of an individual seller in an online market.
This work can naturally be generalized in a variety of ways. For instance, the model could be developed for a finite time horizon T, since a more realistic assumption is that a seller's strategy depends on both time and the current reputation state. Unlike our current infinite horizon model, the resulting HJB equation will be a time-dependent partial differential equation.

48 Acknowledgements
- Milan Bradonjić (Bell Labs)
- Matt Causley (Kettering U.)
- John Chadam (Pitt)
- Nick Costanzino (AIG)
- Sean Hilden (Bank of America)
- Goran Peskir (U. of Manchester)
- Steven Shreve (CMU)
