Probability Models for Customer-Base Analysis
|
|
- Caroline Fowler
- 5 years ago
- Views:
Transcription
1 Probability Models for Customer-Base Analysis Peter S Fader University of Pennsylvania wwwpetefadercom Bruce G S Hardie London Business School wwwbrucehardiecom Workshop on Customer-Base Analysis Johann Wolfgang Goethe-Universität, Frankfurt March 8 9, Peter S Fader and Bruce G S Hardie 1 Customer-Base Analysis Faced with a customer transaction database, we may wish to determine which customers are most likely to be active in the future, the level of transactions we could expect in future periods from those on the customer list, both individually and collectively, and individual customer lifetime value (CLV) Forward-looking/predictive versus descriptive 2
2 Comparison of Modelling Approaches Traditional approach future = f (past) Past Future latent characteristics (θ) Probability modelling approach ˆθ = f (past) future = f(ˆθ) 3 Classifying Analysis Settings Consider the following two statements regarding the size of a company s customer base: Based on numbers presented in a January 26 press release that reported Vodafone Group Plc s third quarter key performance indicators, we see that Vodafone UK has 63 million pay monthly customers In his Q3 25 Financial Results Conference Call, the CFO of Amazon made the comment that [a]ctive customer accounts, representing customers who ordered in the past year, surpassed 52 million, up 19% 4
3 Classifying Analysis Settings It is important to distinguish between contractual and noncontractual settings: In a contractual setting, we observe the time at customers become inactive In a noncontractual setting, the time at which a customer becomes inactive is unobserved The challenge of noncontractual markets: How do we differentiate between those customers who have ended their relationship with the firm versus those who are simply in the midst of a long hiatus between transactions? 5 Classifying Analysis Settings Consider the following four specific business settings: Airport lounges Electrical utilities Academic conferences Mail-order clothing companies 6
4 Classifying Customer Bases Continuous Opportunities for Transactions Discrete Grocery purchases Doctor visits Hotel stays Event attendance Prescription refills Charity fund drives Credit card Student mealplan Mobile phone usage Magazine subs Insurance policy Health club m ship Noncontractual Contractual Type of Relationship With Customers 7 Agenda The right way to think about computing CLV Models for noncontractual settings The Pareto/NBD model The BG/NBD model The BG/BB model Models for contractual settings Beyond the basic models 8
5 The Right Way to Think About Computing Customer Lifetime Value 9 Calculating CLV Customer lifetime value is the present value of the future cash flows associated with the customer A forward-looking concept Not to be confused with (historic) customer profitability 1
6 Calculating CLV Standard classroom formula: T CLV = m r t (1 + d) t t= where m = net cash flow per period (if active) r = retention rate d = discount rate T = horizon for calculation 11 Calculating CLV A more correct starting point: E(CLV ) = E[v(t)]S(t)d(t)dt where E[v(t)] = expected value (or net cashflow) of the customer at time t (if active) S(t) = the probability that the customer has remained active to at least time t d(t) = discount factor that reflects the present value of money received at time t 12
7 Calculating CLV Definitional; of little use by itself We must operationalize E[v(t)], S(t), and d(t) in a specific business setting then solve the integral Important distinctions: E(CLV ) of an as-yet-be-acquired customer E(CLV ) of a just-acquired customer E(CLV ) of an existing customer (expected residual CLV) 13 Models for Noncontractual Settings 14
8 Classifying Customer Bases Continuous Opportunities for Transactions Discrete Grocery purchases Doctor visits Hotel stays Event attendance Prescription refills Charity fund drives Credit card Student mealplan Mobile phone usage Magazine subs Insurance policy Health club m ship Noncontractual Contractual Type of Relationship With Customers 15 Setting New customers at CDNOW, 1/97 3/97 Systematic sample (1/1) drawn from panel of 23,57 new customers 39-week calibration period 39-week forecasting (holdout) period Initial focus on transactions 16
9 ID = 1 ID = 2 Purchase Histories ID = 1178 ID = 1179 ID = 2356 ID = 2357 Week Week Raw Data A B C ID x T
10 Cumulative Repeat Transactions 3 # transactions Week 19 Modelling Objective Given this customer database, we wish to determine the level of transactions that should be expected in next period (eg, 39 weeks) by those on the customer list, both individually and collectively 2
11 Modelling the Transaction Stream A customer purchases randomly with an average transaction rate λ Transaction rates vary across customers 21 Modelling the Transaction Stream Let the random variable X(t) denote the number of transactions in a period of length t time units At the individual-level, X(t) is assumed to be distributed Poisson with mean λt: P(X(t) = x λ) = (λt)x e λt x! Transaction rates (λ) are distributed across the population according to a gamma distribution: g(λ r,α) = αr λ r 1 e αλ Γ (r ) 22
12 Modelling the Transaction Stream The distribution of transactions for a randomly-chosen individual is given by: P(X(t) = x r,α) = = P(X(t) = x λ) g(λ) dλ Γ (r + x) Γ (r )x! ( ) α r ( ) t x, α + t α + t which is the negative binomial distribution (NBD) 23 Frequency of Repeat Transactions 15 Actual NBD 1 Frequency # Transactions 24
13 Tracking Cumulative Repeat Transactions 6 # transactions 4 2 Actual NBD Week 25 Tracking Weekly Repeat Transactions 15 Actual NBD # transactions Week 26
14 Conditional Expectations We are interested in computing E(Y(t) data), the expected number of transactions in an adjacent period (T, T + t], conditional on the observed purchase history For the NBD, a straight-forward application of Bayes theorem gives us E [ Y(t) r, α, x, T ] ( ) r + x = t α + T 27 Conditional Expectations 1 9 Actual NBD 8 Expected # Transactions in Weeks # Transactions in Weeks
15 Conditional Expectations Cust A Cust B Week Week 7 Week 39 According to the NBD model: Cust A: E [ Y(39) x = 4,T = 32 ] = 388 Cust B: E [ Y(39) x = 4,T = 32 ] =? 29 Tracking Cumulative Repeat Transactions 6 # transactions 4 2 Actual NBD Week 3
16 Towards a More Realistic Model Total # transactions Cust B Cust C Cust A Time 31 Modelling the Transaction Stream Transaction Process: While active, a customer purchases randomly around his mean transaction rate Transaction rates vary across customers Dropout Process: Each customer has an unobserved lifetime Dropout rates vary across customers 32
17 The Pareto/NBD Model (Schmittlein, Morrison and Colombo 1987) Transaction Process: While active, # transactions made by a customer follows a Poisson process with transaction rate λ Heterogeneity in transaction rates across customers is distributed gamma(r,α) Dropout Process: Each customer has an unobserved lifetime of length τ, which is distributed exponential with dropout rate μ Heterogeneity in dropout rates across customers is distributed gamma(s,β) 33 Deriving the Model Likelihood Function Let us assume we know when each of a customer s x transactions occurred during the period (,T] (denoted by t 1,t 2,,t x ) There are two possible ways this pattern of transactions could arise: i The customer is still alive at the end of the observation period (ie, τ>t) ii The customer became inactive at some time τ in the interval (t x,t] 34
18 Deriving the Model Likelihood Function Conditional on λ, L(λ t 1,,t x,t,τ >T) = λe λt 1 λe λ(t 2 t 1) λe λ(t x t x 1 ) e λ(t t x) = λ x e λt L(λ t 1,,t x,t,inactive at τ (t x,t]) = λe λt 1 λe λ(t 2 t 1) λe λ(t x t x 1 ) e λ(τ t x) = λ x e λτ (Note: we do not need t 1,,t x ; x and t x are sufficient) 35 Deriving the Model Likelihood Function Removing the conditioning on τ, L(λ,μ x,t x,t) = L(λ x,t,τ > T)P(τ > T μ) T + L(λ x,t,inactive at τ (t x, T ])f (τ μ) dτ t x = λx μ λ + μ e (λ+μ)t x + λx+1 λ + μ e (λ+μ)t 36
19 Key Observation Given the model assumptions, we do not require information on when each of the x transactions occurred The only customer-level information required by this model is recency and frequency The notation used to represent this information is (x,t x,t), where x is the number of transactions observed in the time interval (,T] and t x ( <t x T ) is the time of the last transaction 37 ID = 1 ID = 2 Purchase Histories ID = 1178 ID = 1179 ID = 2356 ID = 2357 Week Week 39 38
20 Raw Data A B C D ID x t_x T The Gaussian Hypergeometric Function 2F 1 (a, b; c; z) = Γ (c) Γ (a)γ (b) j= Γ (a + j)γ (b + j) Γ (c + j) Easy to compute, albeit tedious, in Excel as using the recursion u j u j 1 = where u = 1 2F 1 (a, b; c; z) = j= u j (a + j 1)(b + j 1) z, j = 1, 2, 3, (c + j 1)j z j j! 4
21 The Gaussian Hypergeometric Function Euler s integral representation of the Gaussian hypergeometric function is 2F 1 (a, b; c; z) = 1 1 t b 1 (1 t) c b 1 (1 zt) a dt B(b,c b) where and c>b> and B(, ) is the beta function 41 Pareto/NBD Likelihood Function Removing the conditioning on λ and μ: L(r,α,s,β x,t x,t) = Γ (r + x)αr β s {( ) ( α β ) s 2 F 1 r + s + x,s + 1; r + s + x + 1; α+t x Γ (r ) r + s + x (α + t x ) r +s+x ( ) ( α β r + x 2 F 1 r + s + x,s; r + s + x + 1; α+t + r + s + x (α + T) r +s+x ) }, if α β L(r,α,s,β x,t x,t) = Γ (r + x)αr β s {( Γ (r ) s r + s + x ) 2 F 1 ( r + s + x,r + x; r + s + x + 1; β α β+t x ) (β + t x ) r +s+x ( ) ( β α r + x 2 F 1 r + s + x,r + x + 1; r + s + x + 1; β+t + r + s + x (β + T) r +s+x ) }, if α β 42
22 E [ X(t) ] Key Results The expected number of transactions in the time interval (,t] P(alive x,t x,t) The probability that an individual with observed behavior (x, t x,t) is still active at time T E(Y(t) x,t x,t) The expected number of transactions in the future period (T, T + t] for an individual with observed behavior (x, t x,t) 43 Expected Number of Transactions Given that the number of transactions follows a Poisson process while the customer is alive, i if τ>t,the expected number of transactions is simply λt ii if τ t, the expected number of transactions in the interval (,τ] is λτ 44
23 Expected Number of Transactions Removing the conditioning on τ: E[X(t) λ, μ] = λtp(τ > t μ) + = λ μ λ μ e μt t λτf (τ μ) dτ Taking the expectation over the distributions of λ and μ: E[X(t) r, α, s, β] = = rβ α(s 1) E[X(t) λ, μ]g(λ r, α)g(μ s, β) dλdμ ( ) s 1 β 1 β + t 45 P(alive x, t x,t) The probability that a customer with purchase history (x, t x,t) is alive at time T is the probability that the (unobserved) time at which he becomes inactive (τ) occurs after T, P(τ > T) By Bayes theorem: P(τ > T λ, μ, x, t x,t)= = But λ and μ are unobserved L(λ x,t,τ > T)P(τ > T μ) L(λ, μ x,t x,t) λ x e (λ+μ)t L(λ, μ x,t x,t) 46
24 P(alive x, t x,t) We take the expectation of P(τ > T λ, μ, x, t x,t) over the distribution of λ and μ, updated to take account of the information (x, t x,t): P(alive r, α, s, β, x, t x,t) { = P(τ > T λ, μ, x, t x,t) } g(λ, μ r, α, s, β, x, t x,t) dλdμ 47 P(alive x, t x,t) By Bayes theorem, the joint posterior distribution of λ and μ is g(λ, μ r, α, s, β, x, t x,t) Therefore, = L(λ, μ x,t x, T )g(λ r, α)g(μ s,β) L(r,α,s,β x,t x,t) P(alive r, α, s, β, x, t x,t) = Γ (r + x)α r β s / L(r,α,s,β x,t Γ (r )(α + T) r +x (β + T) s x,t) 48
25 Conditional Expectations Let Y(t) = the number of purchases made in the period (T, T + t] E[Y(t) λ, μ, alive at T] = λtp(τ > T + t μ, τ > T) + T +t T = λ μ λ μ e μt λτf (τ μ, τ > T) dτ E[Y(t) λ, μ, x, t x,t]= E[Y(t) λ, μ, alive at T] P(τ > T λ, μ, x, t x,t) 49 Conditional Expectations Taking the expectation over the joint posterior distribution of λ and μ yields: E[Y(t) r, α, s, β, x, t x,t] { = Γ (r + x)α r β s / L(r,α,s,β x,t Γ (r )(α + T) r +x (β + T) s x,t) (r + x)(β + T) (α + T )(s 1) 1 ( ) s 1 β + T β + T + t } 5
26 Frequency of Repeat Transactions 15 Actual Pareto/NBD 1 Frequency # Transactions 51 Tracking Cumulative Repeat Transactions 5 # transactions Actual 4 Pareto/NBD Week 52
27 Tracking Weekly Repeat Transactions 15 Actual Pareto/NBD # transactions Week 53 Conditional Expectations 7 Actual Pareto/NBD 6 Expected # Transactions in Weeks # Transactions in Weeks
28 Proportions of Active Customers P(active) Empirical Pareto/NBD # Transactions in Weeks E(CLV ) = Computing CLV E[v(t)]S(t)d(t)dt If we assume that an individual s spend per transaction is constant, v(t) = net cashflow / transaction t(t) (where t(t) is the transaction rate at t) and E(CLV ) = E(net cashflow / transaction) E[t(t)]S(t)d(t)dt }{{} discounted expected transactions 56
29 Computing CLV Standing at time T, we wish to compute the present value of the expected future transaction stream for a customer with purchase history (x, t x,t) We call this quantity DERT, discounted expected residual transactions 57 Standing at time T, DERT = Computing DERT T E[v(t)]S(t t > T )d(t T )dt For Poisson purchasing and exponential lifetimes with continuous compounding at rate of interest δ, DERT (δ λ, μ, alive at T) = = λ μ + δ λe μt e δt dt 58
30 Computing DERT DERT (δ r, α, s, β, x, t x,t) { = DERT (δ λ, μ, alive at T) P(alive at T λ, μ, x, t x,t) } g(λ, μ r, α, s, β, x, t x,t) dλdμ = αr β s δ s 1 Γ (r + x + 1)Ψ ( s,s; δ(β + T) ) Γ (r )(α + T) r +x+1 L(r,α,s,β x,t x,t) where Ψ( ) is the confluent hypergeometric function of the second kind 59 Continuous Compounding An annual discount rate of (1 d)% is equivalent to a continuously compounded rate of δ = ln(1 + d) If the data are recorded in time units such that there are k periods per year (k = 52 if the data are recorded in weekly units of time) then the relevant continuously compounded rate is δ = ln(1 + d)/k 6
31 DERT by Recency and Frequency DET (discounted expected trans) Frequency (x) Recency (t x ) 8 61 Iso-Value Representation of DERT Frequency (x) Recency (t ) x 62
32 The Increasing Frequency Paradox Cust A Cust B Week Week 78 DERT Cust A 46 Cust B Modeling the Spend Process The dollar value of a customer s given transaction varies randomly around his average transaction value Average transaction values vary across customers but do not vary over time for any given individual The distribution of average transaction values across customers is independent of the transaction process 64
33 Modeling the Spend Process For a customer with x transactions, let z 1,z 2,,z x denote the dollar value of each transaction The customer s average observed transaction value x m x = z i /x is an imperfect estimate of his (unobserved) mean transaction value E(M) Our goal is to make inferences about E(M) given m x, which we denote as E(M m x,x) i=1 65 Summary of Average Transaction Value 946 individuals (from the 1/1th sample of the cohort) make at least one repeat purchase in weeks 1 39 $ Minimum th percentile 1575 Median th percentile 418 Maximum Mean 358 Std deviation 328 Mode
34 Modeling the Spend Process The dollar value of a customer s given transaction is distributed gamma with shape parameter p and scale parameter ν Heterogeneity in ν across customers follows a gamma distribution with shape parameter q and scale parameter γ 67 Modeling the Spend Process Marginal distribution for m x : f(m x p, q, γ, x) = Γ (px + q) γ q mx px 1 x px Γ (px)γ (q) (γ + m x x) px+q Expected average transaction value for a customer with an average spend of m x across x transactions: E(M p, q, γ, m x,x)= ( ) ( ) q 1 γp px + q 1 q 1 + px px + q 1 m x 68
35 Distribution of Average Transaction Value 4 Actual (nonparametric density) Model (weighted by actual x) 3 f(m x ) 2 1 $ $5 $1 $15 $2 $25 $3 Average transaction value (m x ) 69 E(Monetary Value) as a Function of M and F $55 $5 $45 $4 E(M m x,x) $35 $3 $25 $2 $15 m x = $5 m x = $35 m x = $ # Transactions in Weeks
36 Conditional Expectations of Total Value We are interested in predicting an individual s expected total spend in the future period (T, T + t] conditional on his RFM = conditional expectation of transactions conditional expectation of revenue/transaction = E(Y(t) r, α, s, β, x, t x,t) E(M p, q, γ, m x,x) 71 Conditional Expectations of Total Value 25 Actual Expected 2 Expected Total Spend in Weeks # Transactions in Weeks
37 Computing Expected Residual CLV We are interested in computing the present value of an individual s expected residual margin stream conditional on his observed behavior (RFM) E(RCLV ) = margin revenue/transaction DERT = margin E(M p, q, γ, m x,x) DERT (δ r, α, s, β, x, t x,t) 73 Estimates of Expected Residual CLV M = $2 M = $5 $6 $6 $5 $5 $4 $4 CLV $3 CLV $3 $2 $2 $1 $1 $ $ Frequency (x) Recency (t x ) Frequency (x) Recency (t x ) (Margin = 3%, 15% discount rate) 74
38 2 Estimates of Expected Residual CLV M = $2 M = $ Frequency (x) Frequency (x) Recency (t ) x Recency (t ) x 5 1 (Margin = 3%, 15% discount rate) 75 Distribution of Repeat Customers 2 Number of Customers Recency (t x ) Frequency (x) (12,54 customers make no repeat purchases) 76
39 Total CLV by RFM Group Recency Frequency M= $53,29 M=1 1 $7,654 $9,893 $1,794 2 $2,79 $15,26 $17,41 3 $259 $12,477 $52,912 M=2 1 $5,863 $7,629 $2,341 2 $3,551 $26,527 $25,754 3 $45 $37,212 $23,4 M=3 1 $11,257 $19,738 $3,738 2 $7,289 $45,911 $47,96 3 $1,38 $62,698 $414, Substantive Observations There is a highly nonlinear relationship between recency/frequency and future transactions the increasing frequency paradox The underlying process for monetary value appears to be stationary and independent of recency and frequency A thorough analysis of the customer base requires careful consideration of the zero class Iso-value curves can be used to identify customers with different purchase histories but similar CLVs 78
40 Summary We are able to generate estimates of DERT as a function of recency and frequency in a noncontractual setting: DERT = f(r,f) Adding a sub-model for spend per transaction enables us to generate estimates of expected (residual) CLV as a function of RFM in a noncontractual setting: E(CLV ) = f(r,f,m) = DERT g(f,m) 79 An Alternative to the Pareto/NBD Model Estimation of model parameters can be a barrier to Pareto/NBD model implementation Recall the dropout process story: Each customer has an unobserved lifetime Dropout rates vary across customers Let us consider an alternative story: After any transaction, a customer tosses a coin heads become inactive tails remain active P(heads) varies across customers 8
41 Purchase Process: The BG/NBD Model (Fader, Hardie and Lee 25c) While active, # transactions made by a customer follows a Poisson process with transaction rate λ Heterogeneity in transaction rates across customers is distributed gamma(r,α) Dropout Process: After any transaction, a customer becomes inactive with probability p Heterogeneity in dropout probabilities across customers is distributed beta(a, b) 81 Deriving the Model Likelihood Function Let us assume we know when each of a customer s x transactions occurred during the period (,T] (denoted by t 1,t 2,,t x ) There are two possible ways this pattern of transactions could arise: i The customer is still alive at the end of the observation period (ie, τ>t) ii The customer became inactive immediately after the xth transaction (ie, τ = t x ) 82
42 Deriving the Model Likelihood Function Conditional on λ, L(λ t 1,,t x,t,τ >T) = λe λt 1 λe λ(t 2 t 1) λe λ(t x t x 1 ) e λ(t t x) = λ x e λt L(λ t 1,,t x,t,inactive at τ = t x ) = λe λt 1 λe λ(t 2 t 1) λe λ(t x t x 1 ) = λ x e λt x 83 Deriving the Model Likelihood Function Removing the conditioning on τ, L(λ,p x,t x,t) = L(λ x,t,τ > T)P(τ > T p) + L(λ x,t,inactive at τ = t x )P(τ = t x p) = (1 p) x λ x e λt + δ x> p(1 p) x 1 λ x e λt x where δ x> =1ifx>, otherwise 84
43 Deriving the Model Likelihood Function Removing the conditioning on λ and p, L(r,α, a, b x,t x,t) = = 1 1 L(λ, p x,t x, T )f (λ r, α)f (p a, b)dλdp (1 p) x λ x e λt αr λ r 1 e λα p a 1 (1 p) b 1 dλdp Γ (r ) B(a, b) { p(1 p) x 1 λ x e λt x + δ x> 1 αr λ r 1 e λα Γ (r ) p a 1 (1 p) b 1 B(a, b) } dλdp 85 BG/NBD Likelihood Function We can express the model likelihood function as: L(r,α,a,b x,t x,t)= A 1 A 2 (A 3 + δ x> A 4 ) where Γ (r + x)αr A 1 = Γ (r ) Γ (a + b)γ (b + x) A 2 = Γ (b)γ (a + b + x) ( 1 ) r +x A 3 = α + T ( a )( 1 ) r +x A 4 = b + x 1 α + t x 86
44 BGNBD Estimation A B C D E F G H I r 243 alpha 4414 a 793 b 2426 LL =GAMMALN(B$1+B8)- GAMMALN(B$1)+B$1*LN(B$2) =IF(B8>,LN(B$3)-LN(B$4+B8-1)- (B$1+B8)*LN(B$2+C8),) =-(B$1+B8)*LN(B$2+D8) ID x t_x T ln() ln(a_1) ln(a_2) ln(a_3) ln(a_4) =SUM(E8:E2364) =F8+G8+LN(EXP(H8)+(B8>)*EXP(I8)) =GAMMALN(B$3+B$4)+GAMMALN(B$4+B8) GAMMALN(B$4)-GAMMALN(B$3+B$4+B8) Frequency of Repeat Transactions 15 Actual BG/NBD Pareto/NBD 1 Frequency # Transactions 88
45 Tracking Cumulative Repeat Transactions Cum Rpt Transactions Actual BG/NBD Pareto/NBD Week 89 Tracking Weekly Repeat Transactions 15 Weekly Rpt Transactions 1 5 Actual BG/NBD Pareto/NBD Week 9
46 Conditional Expectations 7 6 Actual BG/NBD Pareto/NBD Expected # Transactions in Weeks # Transactions in Weeks Further Reading Schmittlein, David C, Donald G Morrison, and Richard Colombo (1987), Counting Your Customers: Who They Are and What Will They Do Next? Management Science, 33 (January), 1 24 Fader, Peter S and Bruce G S Hardie (25), A Note on Deriving the Pareto/NBD Model and Related Expressions < Fader, Peter S, Bruce G S Hardie, and Ka Lok Lee (25a), A Note on Implementing the Pareto/NBD Model in MATLAB < 92
47 Further Reading Fader, Peter S, Bruce G S Hardie, and Ka Lok Lee (25b), RFM and CLV: Using Iso-value Curves for Customer Base Analysis, Journal of Marketing Research, 42 (November) Fader, Peter S, Bruce G S Hardie, and Ka Lok Lee (25c), "Counting Your Customers" the Easy Way: An Alternative to the Pareto/NBD Model, Marketing Science, 24 (Spring) Fader, Peter S, Bruce G S Hardie, and Ka Lok Lee (25d), Implementing the BG/NBD Model for Customer Base Analysis in Excel < Fader, Peter S and Bruce G S Hardie (24), Illustrating the Performance of the NBD as a Benchmark Model for Customer-Base Analysis < 93 Modelling the Transaction Stream How valid is the assumption of Poisson purchasing? can transactions occur at any point in time? 94
48 Classifying Customer Bases Continuous Grocery purchases Doctor visits Hotel stays Credit card Student mealplan Mobile phone usage Opportunities for Transactions Discrete Event attendance Prescription refills Charity fund drives Magazine subs Insurance policy Health club m ship Noncontractual Contractual Type of Relationship With Customers 95 Discrete-Time Transaction Opportunities necessarily discrete attendance at sports events attendance at annual arts festival generally discrete charity donations blood donations discretized by recording process cruise ship vacations 96
49 Discrete-Time Transaction Data A transaction opportunity is a well-defined point in time at which a transaction either occurs or does not occur, or a well-defined time interval during which a (single) transaction either occurs or does not occur a customer s transaction history can be expressed as a binary string: y t = 1 if a transaction occurred at/during the tth transaction opportunity, otherwise 97 Repeat Purchasing for Luxury Cruises (Berger, Weinberg, and Hanna 23) Y Y Y Y Y N N Y N N Y Y N N Y N N Y Y Y N N Y N N Y Y N N Y N # Customers
50 Model Development A customer s relationship with a firm has two phases: he is alive (A) for some period of time, then becomes permanently inactive ( dies, D) While alive, the customer buys at any given transaction opportunity (ie, period t) with probability p: P(Y t = 1 p, alive at t) = p A living customer becomes inactive at the beginning of a transaction opportunity (ie, period t) with probability θ P(alive at t θ) = P(AAA }{{} θ) = (1 θ) t t 99 Model Development What is P(Y 1 = 1,Y 2 =,Y 3 = 1,Y 4 =,Y 5 = p, θ)? Three scenarios give rise to Y 4 =,Y 5 = : Alive? t = 1 t = 2 t = 3 t = 4 t = 5 i) A A A D D ii) A A A A D iii) A A A A A The customer must have been alive for t = 1,2,3 1
51 Model Development We compute the probability of the purchase string conditional on each scenario and multiply it by the probability of that scenario: f(11 p, θ) = p(1 p)p (1 θ) 3 θ }{{} P(AAADD) + p(1 p)p(1 p) (1 θ) 4 θ }{{} P(AAAAD) + p(1 p)p(1 p)(1 p) (1 θ) }{{}}{{ 5 } P(Y 1 =1,Y 2 =,Y 3 =1) P(AAAAA) 11 Model Development Bernoulli purchasing while alive the order of a given number of transactions (prior to the last observed transaction) doesn t matter For example, f(11 p, θ) = f(11 p, θ) Recency (time of last transaction, t x ) and frequency (number of transactions, x = n t=1 y t ) are sufficient summary statistics = we do not need the complete binary string representation of a customer s transaction history 12
52 Repeat Purchasing for Luxury Cruises # Customers 18 x t x n # Customers Model Development For a customer with purchase history (x, t x,n), L(p, θ x,t x,n)= p x (1 p) n x (1 θ) n + n t x 1 i= p x (1 p) t x x+i θ(1 θ) t x+i We assume that heterogeneity in p and θ across customers is captured by beta distributions: g(p α, β) = pα 1 (1 p) β 1 B(α, β) g(θ γ,δ) = θγ 1 (1 θ) δ 1 B(γ,δ) 14
53 Model Development Removing the conditioning on p and θ, L(α,β, γ, δ x,t x,n) = = 1 1 L(p, θ x,t x, n)g(p α, β)g(θ γ, δ) dp dθ B(α + x,β + n x) B(α, β) + n t x 1 i= B(γ,δ + n) B(γ,δ) B(α + x,β + t x x + i) B(α, β) B(γ + 1,δ+ t x + i) B(γ,δ) which is (relatively) easy to code-up in Excel 15 BGBB Estimation A B C D E F G H I J K L M alpha 66 B(alpha,beta) 4751 beta 519 gamma B(gamma,delta) 4E-26 delta LL i x t_x n # cust L( X=x,t_x,n) n-t_x
54 BGBB Estimation A B C D E F G H I J K L M alpha 66 B(alpha,beta) 4751 beta 519 =EXP(GAMMALN(B1)+GAMMALN(B2)-GAMMALN(B1+B2)) gamma B(gamma,delta) 4E-26 =EXP(GAMMALN($B$1+A9)+GAMMALN($B$2+C9-A9)- delta GAMMALN($B$1+$B$2+C9))/$E$1*EXP(GAMMALN($B$3)+ LL =SUM(E9:E19) GAMMALN($B$4+C9)-GAMMALN($B$3+$B$4+C9))/$E$3 i x t_x n # cust L( X=x,t_x,n) n-t_x =IF($H9<J$8,,EXP(GAMMALN($B$1+$A9)+GAMMALN($B$2+$B9-$A9+J$8) GAMMALN($B$1+$B$2+$B9+J$8))/$E$1*EXP(GAMMALN($B$3+1) GAMMALN($B$4+$B9+J$8)-GAMMALN($B$3+$B$4+$B9+J$8+1))/$E$3) =C15-B =D19*LN(F19)) =SUM(I19:M19) Model Fit 5 Actual BG/BB 4 3 Frequency # Repeat Trip Years (ˆα = 66, ˆβ = 519, ˆγ = 17376, ˆδ = ,LL= 7137) 18
55 Key Results P(alive in period n + 1 x,t x,n) The probability that an individual with observed behavior (x, t x,n) will be active in the next period E(X n,x,t x,n) The expected number of transactions across the next n transaction opportunities for an individual with observed behavior (x, t x,n) DERT (d x,t x,n) The discounted expected residual transactions for an individual with observed behavior (x, t x,n) 19 P(alive in period n + 1 x, t x,n) According to Bayes theorem, P(alive in n data) = P(data alive in n)p(alive in n) P(data) Recalling the individual-level likelihood function, L(p, θ x,t x,n)= p x (1 p) n x (1 θ) n it follows that + n t x 1 i= p x (1 p) t x x+i θ(1 θ) t x+i, P(alive in period n p, θ, x, t x,n) = p x (1 p) n x (1 θ) n/ L(p, θ x,t x,n) 11
56 P(alive in period n + 1 x, t x,n) For a customer with purchase history (x, t x,n), P(alive in period n + 1 α, β, γ, δ, x, t x,n) 1 1 { = (1 θ)p(alive in period n p, θ, x, t x,n) } g(p,θ α, β, γ, δ, x, t x,n) dp dθ B(α + x,β + n x)b(γ,δ + n + 1) / = L(α, β, γ, δ x,t x,n) B(α, β)b(γ, δ) 111 Conditional Expectations Let X denote the number of purchases over the next n periods (ie, in the interval (n, n + n ]) Assuming the customer is alive in period n, E(X n,p,θ,alive in period n) = = n t=n+1 p(1 θ) θ P(Y t = 1 p, alive at t)p(alive at t t>n,θ) (1 + d) t n p(1 θ)n +1 θ 112
57 Conditional Expectations For a customer with purchase history (x, t x,n), E(X n,α,β,γ,δ,x,t x,n) 1 1 { = E(X n,p,θ,alive in period n) = P(alive in period n p, θ, x, t x,n) } g(p,θ α, β, γ, δ, x, t x,n) dp dθ B(α + x + 1,β+ n x) B(α, β)b(γ, δ) B(γ 1,δ+ n + 1) B(γ 1,δ+ n + n + 1) L(α, β, γ, δ x,t x,n) 113 P(alive) A B C D E F G H I J K L M N O alpha 66 B(alpha,beta) 4751 beta 519 gamma B(gamma,delta) 4E-26 delta =EXP(GAMMALN($B$1+A9)+GAMMALN($B$2+C9-A9)- GAMMALN($B$1+$B$2+C9))*EXP(GAMMALN($B$3)+ GAMMALN($B$4+C9+1)-GAMMALN($B$3+$B$4+C9+1))/(E$1*E$3)/F9 LL i x t_x n # cust L( X=x,t_x,n) P(alive in 1998) n-t_x
58 Conditional Expectations A B C D E F G H I J K L M N O alpha 66 B(alpha,beta) 4751 beta 519 gamma B(gamma,delta) 43E-26 delta n* 4 LL =(EXP(GAMMALN($B$3-1)+GAMMALN($B$4+C1+1)-GAMMALN($B$3+$B$4+C1))- EXP(GAMMALN($B$3-1)+GAMMALN($B$4+C1+$B$5+1)- GAMMALN($B$3+$B$4+C1+$B$5)))*EXP(GAMMALN($B$1+A1+1)+ GAMMALN($B$2+C1-A1)-GAMMALN($B$1+$B$2+C1+1))/(E$1*E$3)/F1 i x t_x n # cust L( X=x,t_x,n) E(X* n*) n-t_x P(alive in 1998) as a Function of Recency and Frequency Year of Last Cruise # Cruise-years
59 Posterior Mean of p as a Function of Recency and Frequency Year of Last Cruise # Cruise-years Expected # Transactions in as a Function of Recency and Frequency Year of Last Cruise # Cruise-years
60 Expected # Transactions in as a Function of Recency and Frequency 2 E(# Trans in ) Frequency (x) 1993 Recency (t x ) 119 Computing DERT For a customer with purchase history (x, t x,n), DET (d p, θ, alive at n) P(Y t = 1 p, alive at t)p(alive at t t>n,θ) = (1 + d) t n = t=n+1 p(1 θ) d + θ However, p and θ are unobserved 12
61 Computing DERT For a just-acquired customer (x = t x = n = ), DET (d α, β, γ, δ) = = 1 1 ( α α + β p(1 θ) g(p α, β)g(θ γ, δ) dp dθ d + θ )( ) ( δ 2F 1 1,δ+ 1; γ + δ + 1; 1 ) 1+d γ + δ 1 + d 121 Computing DERT For a customer with purchase history (x, t x,n),we multiply DET (d p, θ, alive at n) by the probability that he is alive at transaction opportunity n and integrate over the posterior distribution of p and θ, giving us DET (d α, β, γ, δ, x, t x,n) = B(α + x + 1,β+ n x)b(γ,δ + n + 1) B(α, β)b(γ, δ)(1 + d) ( 2 1 ) F 1 1,δ+ n + 1; γ + δ + n + 1; 1+d L(α, β, γ, δ x,t x,n) 122
62 DET A B C D E F G H I J K L M N O alpha 66 B(alpha,beta) 4751 beta 519 gamma B(gamma,delta) 4E-26 delta =EXP(GAMMALN($B$1+A11+1)+GAMMALN($B$2+C11-A11)- GAMMALN($B$1+$B$2+C11+1))*EXP(GAMMALN($B$3)+ GAMMALN($B$4+C11+1)-GAMMALN($B$3+$B$4+C11+1)) *$K$24/(E$1*E$3*(1+$B$6))/F11 d 1 annual discount rate LL i x t_x n # cust L( X=x,t_x,n) DET n-t_x =M24*($K$25+L25-1)*($K$26+L25-1) =SUM(M24:M174) *$K$28/(($K$27+L25-1)*L25) j u_j 2F a b c z E E DERT as a Function of R&F(d = 1) 25 2 DET(d=1) Frequency (x) Recency (t x )
63 DERT as a Function of R&F(d = 1) Year of Last Cruise # Cruise-years See < for a copy of the paper that develops the BG/BB model See < for a note on how to implement the BG/BB model in Excel, along with a copy of the associated spreadsheet 126
64 Models for Contractual Settings 127 Classifying Customer Bases Continuous Opportunities for Transactions Discrete Grocery purchases Doctor visits Hotel stays Event attendance Prescription refills Charity fund drives Credit card Student mealplan Mobile phone usage Magazine subs Insurance policy Health club m ship Noncontractual Contractual Type of Relationship With Customers 128
65 SUNIL GUPTA, DONALD R LEHMANN, and JENNIFER AMES STUART* It is increasingly apparent that the financial value of a firm depends on off-balance-sheet intangible assets In this article, the authors focus on the most critical aspect of a firm: its customers Specifically, they demonstrate how valuing customers makes it feasible to value firms, including high-growth firms with negative earnings The authors define the value of a customer as the expected sum of discounted future earnings They demonstrate their valuation method by using publicly available data for five firms They find that a 1% improvement in retention, margin, or acquisition cost improves firm value by 5%, 1%, and 1%, respectively They also find that a 1% improvement in retention has almost five times greater impact on firm value than a 1% change in discount rate or cost of capital The results show that the linking of marketing concepts to shareholder value is both possible and insightful Valuing Customers 129 Vodafone Germany Quarterly Annualized Churn Rate (%) Q2 2/3 Q3 2/3 Q4 2/3 Q1 3/4 Q2 3/4 Q3 3/4 Q4 3/4 Q1 4/5 Source: Vodafone Germany Vodafone Analyst & Investor Day presentation ( ) 13
66 Valuing the Customer Base Assuming a 1% discount rate, a MBA student s back-ofthe-envelope calculation would value Vodafone Germany s existing customer base as E(Residual Value) = # customers average margin per user (1 18) t (1 + 1) t t=1 = 293 # customers AMPU 131 Vodafone Germany Quarterly Annualized Churn Rate (%) Q2 2/3 Q3 2/3 Q4 2/3 Q1 3/4 Q2 3/4 Q3 3/4 Q4 3/4 Q1 4/5 Source: Vodafone Germany Vodafone Analyst & Investor Day presentation ( ) 132
67 A Look Behind the Aggregate Retention Rates Number of active customers each year by year-ofacquisition cohort: , 8, 233 5, 677 4, 242 3, , 9, 5 6, 55 4, , 5 11, 83 7, , 12, 33 2, 5 13, 23, , 677 4, , A Look Behind the Aggregate Retention Rates Annual retention rates by cohort:
68 A Real-World Consideration 1 Retention Rate Year In practice, we (almost) always observe increasing retention rates 135 A Real-World Example Renewal rates at regional magazines vary; generally 3% of subscribers renew at the end of their original subscription, but that figure jumps to 5% for second-time renewals and all the way to 75% for longtime readers Fielding, Michael (25), Get Circulation Going: DM Redesign Increases Renewal Rates for Magazines, Marketing News, September 1,
69 Calculating CLV Modified classroom formula: E(CLV) = { t m t= i= }/( 1 ) t r i 1 + d where the retention rate for period i (r i ) is defined as the proportion of customers active at the end of period i 1 who are still active at the end of period i 137 Implementation Questions How do we project the r i beyond the set of observed retention rates? Retention Rate ?? What is the expected (remaining) CLV of a customer who has been with us for n periods? Year 138
70 Why Do Retention Rates Increase Over Time? Individual-level time dynamics (eg, increasing loyalty as the customer gains more experience with the firm) vs A sorting effect in a heterogeneous population 139 The Role of Heterogeneity Suppose we track a cohort of 15, customers, comprising two underlying segments: Segment 1 comprises 5, customers, each with a time-invariant annual retention probability of 9 Segment 2 comprises 1, customers, each with a time-invariant annual retention probability of 5 14
71 The Role of Heterogeneity # Active Customers r t Year Seg 1 Seg 2 Total Seg 1 Seg 2 Total 5, 1, 15, 1 4, 5 5, 9, , 5 2, 5 6, , 645 1, 25 4, , , , , The Role of Heterogeneity 1 8 Segment 1 Retention Rate 6 4 Segment Year 142
72 Projecting Retention Rates Given Story 1 8 Segment 1 Retention Rate 6 4 Segment Year 143 Vodafone Italia Churn Clusters 1 Cluster P(churn) %CB Low risk 6 7 Medium risk 35 2 High risk P(churn) Source: Vodafone Achievement and Challenges in Italy presentation ( ) 144
73 What is the expected residual CLV of a customer with a tenure of n periods? 145 Fleshing Out the Story Assume we re dealing with an insurance company The annual gross contribution (after accounting for retention marketing activities) equals $1 Contracts renewed on 1 January The finance department has suggested that a 1% discount rate for marketing activities 146
74 E[RCLV (d = 1% survived two years)] If this person belongs to segment 1: E[RCLV (d = 1%)] = 1 t=1 = $495 9 t (1 + 1) t 1 If this person belongs to segment 2: E[RCLV (d = 1%)] = 1 t=1 = $917 5 t (1 + 1) t But to which segment does this person belong? 148
75 The Role of Heterogeneity # Active Customers r t Year Seg 1 Seg 2 Total Seg 1 Seg 2 Total 5, 1, 15, 1 4, 5 5, 9, , 5 2, 5 6, , 645 1, 25 4, , , , , E[RCLV (d = 1% survived two years)] According to Bayes theorem, the probability that this person belongs to segment 1 is P(survived two years segment 1) P(segment 1) P(survived two years) = =
76 E[RCLV (d = 1% survived two years)] It follows that the CLV for an individual with a tenure of two years is E[RCLV (d = 1% survived two years)] = CLV (d = 1% seg 1)P(seg 1 survived two yrs) + CLV (d = 1% seg 2)P(seg 2 survived two yrs) = (1 618) = Implications To the extent that the observed retention dynamics may be largely driven by heterogeneity, they are not indicative of true individual-level behavior any estimates of CLV for existing customers must be based off a proper story of individual-level buyer behavior that explicitly accounts for heterogeneity Our two-segment model is but one such story 152
77 How much is our existing customer base worth? 153 Valuing an Existing Customer Base As we move from a single cohort of customers (defined by year-of-acquisition), to a customer base composed of a number of different cohorts, we must condition any calculations on time-of-entry into the firm s customer base 154
78 Valuing an Existing Customer Base Number of active customers by cohort: , 8, 233 5, 677 4, 242 3, , 9, 5 6, 55 4, , 5 11, 83 7, , 12, 33 2, 5 13, 23, , 677 4, , 455 Agg 3 4 retention rate = 27,955/ 4,875 = Valuing an Existing Customer Base Standing at December 31, 24, how do we compute the total CLV for the existing customer base? Two approaches: Naïve (aggregate) retention rate (684) Segment-based (ie, model-based) 156
79 Approach 1 Using the aggregate 3 4 retention rate: Total CLV = 48, t=1 = $8, 761, 684 t (1 + 1) t Approach 2 Recognizing the underlying segments: Cohort # Active in 24 P(seg 1) 2 3, , , , , Total CLV = $14,2, 158
80 Valuing an Existing Customer Base Cohort Total CLV Loss Naïve $8,761, 38% Segment (model) $14,2, 159 Exploring the Magnitude of the Loss Systematically vary heterogeneity in retention rates First need to specify a flexible model of contract duration 16
81 A Discrete-Time Model for Contract Duration i An individual remains a customer of the firm with constant retention probability 1 θ the duration of the customer s relationship with the firm is characterized by the (shifted) geometric distribution: S(t θ) = (1 θ) t, t = 1, 2, 3, ii Heterogeneity in θ is captured by a beta distribution with pdf f(θ α, β) = θα 1 (1 θ) β 1 B(α, β) 161 A Discrete-Time Model for Contract Duration The probability that a customer cancels their contract in period t P(T = t α, β) = = 1 P(T = t θ)f(θ α, β) dθ B(α + 1,β+ t 1) B(α, β) The aggregate survivor function is S(t α, β) = = 1 S(t θ)f(θ α, β) dθ B(α, β + t) B(α, β), t = 1, 2,, t = 1, 2, 162
82 A Discrete-Time Model for Contract Duration The (aggregate) retention rate is given by r t = S(t) S(t 1) = β + t 1 α + β + t 1 This is an increasing function of time, even though the underlying (unobserved) retention rates are constant at the individual-level 163 Recall: E(CLV ) = Computing CLV E[v(t)]S(t)d(t)dt In a contractual setting, assuming an individual s mean value per unit of time is constant ( v), E(CLV ) = v S(t)d(t)dt }{{} discounted expected lifetime 164
83 Computing DERL Standing at the end of period n, just prior to the point in time at which the customer makes her contract renewal decision, DERL(d θ, n 1 renewals) = But θ is unobserved = t=n S(t t>n 1; θ) (1 + d) t n (1 θ)(1 + d) d + θ 165 Computing DERL By Bayes theorem, the posterior distribution of θ is S(n 1 θ)f(θ α, β) f(θ α, β, n 1 renewals) = S(n α, β) = θα 1 (1 θ) β+n 2 B(α, β + n 1) It follows that DERL(d α, β, n 1 renewals) ( ) β + n 1 ( = 2F 1 1,β+ n; α + β + n; 1 ) 1+d α + β + n 1 166
84 Impact of Heterogeneity on Error Assume the following arrival of new customers: Assume a 1% discount rate For given values of α and β, determine the loss associated with computing the value of the existing customer base using the naïve approach (a constant aggregate retention rate) compared with the correct model-based approach 167 Two Scenarios 5 g(θ) Case 1 Case Case α β E(θ) θ 168
85 Number of Active Customers: Case n CLV 1, 8, 6, 48 5, 37 4, , 12, 9, 72 7, , 5 14, 11, , 15, , , 23, 35, 98 48, 27 59, 392 Aggregate 3 4 retention rate = 38,892/48,27 = Number of Active Customers: Case n CLV 1, 8, 7, 6 7, 383 7, , 4 11, , , 23, 37, 1 51, , 39 Aggregate 3 4 retention rate = 46,89/51,783 = 9 17
86 Impact of Heterogeneity on Error: Case 1 Naïve valuation = 59,392 = 182,292 t=1 81 t (1 + 1) t 1 Correct valuation = 4, , , , ,5 331 = 27,438 Naïve underestimates correct by 12% 171 Impact of Heterogeneity on Error: Case 2 Naïve valuation = 67,39 = 341,42 t=1 9 t (1 + 1) t 1 Correct valuation = 7, , , , ,5 768 = 617,536 Naïve underestimates correct by 45% 172
87 Interpreting the Beta Distribution Parameters mean μ = α α + β and polarization index φ = 1 α + β + 1 E(θ) = 4 θ E(θ) = 4 θ 1 g(θ) θ θ 173 Shape of the Beta Distribution 1 75 J μ 5 IM U 25 rj φ 174
88 2 Error as a Function of μ and φ We determine the percentage underestimation for the values of α and β associated with each point on the (μ, φ) unit square: % underestimation 6 4 μ μ φ φ 175 Classifying Customer Bases Continuous Opportunities for Transactions Discrete Grocery purchases Doctor visits Hotel stays Event attendance Prescription refills Charity fund drives Credit card Student mealplan Mobile phone usage Magazine subs Insurance policy Health club m ship Noncontractual Contractual Type of Relationship With Customers 176
89 Contract Duration in Continuous-Time i The duration of an individual customer s relationship with the firm is characterized by the exponential distribution with pdf and survivor function, f(t λ) = λe λt S(t λ) = e λt ii Heterogeneity in λ follows a gamma distribution with pdf g(λ r,α) = αr λ r 1 e αλ Γ (r ) 177 Contract Duration in Continuous-Time This gives us the exponential-gamma model with pdf and survivor function f(t r,α) = S(t r,α) = = r α = f(t λ)g(λ r, α) dλ ( α ) r +1 α + t S(t λ)g(λ r, α) dλ ( α α + t ) r 178
90 The Hazard Function The hazard function, h(t), is defined by h(t) = lim Δt P(t < T t + Δt T >t) Δt = f(t) 1 F(t) and represents the instantaneous rate of failure at time t conditional upon survival to t The probability of failing in the next small interval of time, given survival to time t, is P(t < T t + Δt T >t) h(t) Δt 179 The Hazard Function For the exponential distribution, For the EG model, h(t λ) = λ h(t r,α) = r α + t In applying the EG model, we are assuming that the increasing retention rates observed in the aggregate data are simply due to heterogeneity and not because of underlying time dynamics at the level of the individual customer 18
91 Standing at time s, Computing DERL DERL = s S(t t > s)d(t s)dt For exponential lifetimes with continuous compounding at rate of interest δ, DERL(δ λ, tenure of at least s) = But λ is unobserved = 1 λ + δ λe λt e δt dt 181 Computing DERL By Bayes theorem, the posterior distribution of λ for an individual with tenure of at least s, S(s λ)g(λ r,α) g(λ r,α,tenure of at least s) = S(s r,α) = (α + s)r λ r 1 e λ(α+s) Γ (r ) 182
92 Computing DERL It follows that DERL(δ r,α,tenure of at least s) { = DERL(δ λ, tenure of at least s) } g(λ r,α,tenure of at least s) dλ = (α + s) r δ r 1 Ψ(r, r ; (α + s)δ) where Ψ( ) is the confluent hypergeometric function of the second kind 183 Beyond the Basic Models 184
93 Customer-Base Analysis We have proposed a set of models that enable us to answer questions such as which customers are most likely to be active in the future, the level of transactions we could expect in future periods from those on the customer list, both individually and collectively, and individual customer lifetime value (CLV) when faced with a customer transaction database 185 Philosophy of Model Building Keep it as simple as possible Minimize cost of implementation use of readily available software (eg, Excel) use of data summaries Purposively ignore the effects of covariates (customer descriptors and marketing activities) so as to highlight the important underlying components of buyer behavior 186
94 Implementation Issues Handling multiple cohorts treatment of acquisition consideration of cross-cohort dynamics Implication of data recording processes 187 Implications of Data Recording Processes ID=1 ID=2 ID=3 ID = n The model likelihood function must match the data structure: Interval-censored individual-level data Period-by-period (cross-sectional) histograms 188
95 Model Extensions Duration dependence individual customer lifetimes interpurchase times Nonstationarity Covariates 189 Individual-Level Duration Dependence The exponential distribution is often characterized as being memoryless This means the probability that the event of interest occurs in the interval (t, t + Δt] given that it has not occurred by t is independent of t: P(t < T t + Δt) T>t)= 1 e λδt This is equivalent to a constant hazard function 19
Implementing the Pareto/NBD Model Given Interval-Censored Data
Implementing the Pareto/NBD Model Given Interval-Censored Data Peter S. Fader www.petefader.com Bruce G. S. Hardie www.brucehardie.com November 2005 Revised August 2010 1 Introduction The Pareto/NBD model
More informationProbability Models for Customer-Base Analysis
Probability Models for Customer-Base Analysis Peter S Fader niversity of Pennsylvania Bruce G S Hardie London Business School 27th Annual Advanced Research Techniques Forum June 26 29, 2016 2016 Peter
More informationIncorporating Time-Invariant Covariates into the Pareto/NBD and BG/NBD Models
Incorporating Time-Invariant Covariates into the Pareto/NBD and BG/NBD Models Peter S Fader wwwpetefadercom Bruce G S Hardie wwwbrucehardiecom August 2007 1 Introduction This note documents how to incorporate
More informationModeling Discrete-Time Transactions Using the BG/BB Model
University of Pennsylvania ScholarlyCommons Wharton Research Scholars Wharton School May 2008 Modeling Discrete-Time Transactions Using the BG/BB Model Harvey Yang Zhang University of Pennsylvania Follow
More informationNew Perspectives on Customer Death Using a Generalization of the Pareto/NBD Model
University of Pennsylvania ScholarlyCommons Finance Papers Wharton Faculty Research 9-211 New Perspectives on Customer Death Using a Generalization of the Pareto/NBD Model Kinshuk Jerath Peter S Fader
More informationAn Introduction to Probability Models for Marketing Research
An Introduction to Probability Models for Marketing Research Peter S Fader University of Pennsylvania Bruce G S Hardie London Business School 25th Annual Advanced Research Techniques Forum June 22 25,
More informationPeter Fader Professor of Marketing, The Wharton School Co-Director, Wharton Customer Analytics Initiative
DATA-DRIVEN DONOR MANAGEMENT Peter Fader Professor of Marketing, The Wharton School Co-Director, Wharton Customer Analytics Initiative David Schweidel Assistant Professor of Marketing, University of Wisconsin-
More informationAn Introduction to Probability Models for Marketing Research
An Introduction to Probability Models for Marketing Research Peter S Fader University of Pennsylvania Bruce G S Hardie London Business School 26th Annual Advanced Research Techniques Forum June 14 17,
More informationAn Introduction to Probability Models for Marketing Research
An Introduction to Probability Models for Marketing Research Peter S Fader University of Pennsylvania Bruce G S Hardie London Business School 24th Annual Advanced Research Techniques Forum June 9 12, 213
More informationApplied Probability Models in Marketing Research: Introduction
Applied Probability Models in Marketing Research: Introduction (Supplementary Materials for the A/R/T Forum Tutorial) Bruce G. S. Hardie London Business School bhardie@london.edu www.brucehardie.com Peter
More informationA Count Data Frontier Model
A Count Data Frontier Model This is an incomplete draft. Cite only as a working paper. Richard A. Hofler (rhofler@bus.ucf.edu) David Scrogin Both of the Department of Economics University of Central Florida
More informationAn Introduction to Applied Probability Models
An Introduction to Applied Probability Models Peter S Fader University of Pennsylvania wwwpetefadercom Bruce G S Hardie London Business School wwwbrucehardiecom Workshop on Customer-Base Analysis Johann
More informationModeling Recurrent Events in Panel Data Using Mixed Poisson Models
Modeling Recurrent Events in Panel Data Using Mixed Poisson Models V. Savani and A. Zhigljavsky Abstract This paper reviews the applicability of the mixed Poisson process as a model for recurrent events
More informationPractice Exam 1. (A) (B) (C) (D) (E) You are given the following data on loss sizes:
Practice Exam 1 1. Losses for an insurance coverage have the following cumulative distribution function: F(0) = 0 F(1,000) = 0.2 F(5,000) = 0.4 F(10,000) = 0.9 F(100,000) = 1 with linear interpolation
More informationImplementing the S BB -G/B Model in MATLAB
Implementing the S BB -G/B Model in MATLAB Peter S. Fader www.petefader.com Bruce G. S. Hardie www.brucehardie.com January 2011 This note documents a somewhat informal MATLAB-based implementation of the
More informationChapter 8 - Forecasting
Chapter 8 - Forecasting Operations Management by R. Dan Reid & Nada R. Sanders 4th Edition Wiley 2010 Wiley 2010 1 Learning Objectives Identify Principles of Forecasting Explain the steps in the forecasting
More informationMortality Surface by Means of Continuous Time Cohort Models
Mortality Surface by Means of Continuous Time Cohort Models Petar Jevtić, Elisa Luciano and Elena Vigna Longevity Eight 2012, Waterloo, Canada, 7-8 September 2012 Outline 1 Introduction Model construction
More informationDuration Analysis. Joan Llull
Duration Analysis Joan Llull Panel Data and Duration Models Barcelona GSE joan.llull [at] movebarcelona [dot] eu Introduction Duration Analysis 2 Duration analysis Duration data: how long has an individual
More informationProbabilities & Statistics Revision
Probabilities & Statistics Revision Christopher Ting Christopher Ting http://www.mysmu.edu/faculty/christophert/ : christopherting@smu.edu.sg : 6828 0364 : LKCSB 5036 January 6, 2017 Christopher Ting QF
More informationCensoring mechanisms
Censoring mechanisms Patrick Breheny September 3 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/23 Fixed vs. random censoring In the previous lecture, we derived the contribution to the likelihood
More informationLecture 2: Priors and Conjugacy
Lecture 2: Priors and Conjugacy Melih Kandemir melih.kandemir@iwr.uni-heidelberg.de May 6, 2014 Some nice courses Fred A. Hamprecht (Heidelberg U.) https://www.youtube.com/watch?v=j66rrnzzkow Michael I.
More informationApplied Probability Models in Marketing Research: Extensions
Applied Probability Models in Marketing Research: Extensions Bruce G S Hardie London Business School bhardie@londonedu Peter S Fader University of Pennsylvania faderp@whartonupennedu 12th Annual Advanced
More informationTime-varying failure rate for system reliability analysis in large-scale railway risk assessment simulation
Time-varying failure rate for system reliability analysis in large-scale railway risk assessment simulation H. Zhang, E. Cutright & T. Giras Center of Rail Safety-Critical Excellence, University of Virginia,
More informationASM Study Manual for Exam P, First Edition By Dr. Krzysztof M. Ostaszewski, FSA, CFA, MAAA Errata
ASM Study Manual for Exam P, First Edition By Dr. Krzysztof M. Ostaszewski, FSA, CFA, MAAA (krzysio@krzysio.net) Errata Effective July 5, 3, only the latest edition of this manual will have its errata
More informationBayesian non-parametric model to longitudinally predict churn
Bayesian non-parametric model to longitudinally predict churn Bruno Scarpa Università di Padova Conference of European Statistics Stakeholders Methodologists, Producers and Users of European Statistics
More informationDynamic Models Part 1
Dynamic Models Part 1 Christopher Taber University of Wisconsin December 5, 2016 Survival analysis This is especially useful for variables of interest measured in lengths of time: Length of life after
More informationProblem #1 #2 #3 #4 #5 #6 Total Points /6 /8 /14 /10 /8 /10 /56
STAT 391 - Spring Quarter 2017 - Midterm 1 - April 27, 2017 Name: Student ID Number: Problem #1 #2 #3 #4 #5 #6 Total Points /6 /8 /14 /10 /8 /10 /56 Directions. Read directions carefully and show all your
More informationarxiv: v1 [q-fin.gn] 12 Apr 2017
A generalized Bayesian framework for the analysis of subscription based businesses Rahul Madhavan arxiv:1704.05729v1 [q-fin.gn] 12 Apr 2017 Ankit Baraskar Atidiv Pune, IN-411014 rahul.madhavan@atidiv.com
More information3 Continuous Random Variables
Jinguo Lian Math437 Notes January 15, 016 3 Continuous Random Variables Remember that discrete random variables can take only a countable number of possible values. On the other hand, a continuous random
More informationParameter Estimation
Parameter Estimation Chapters 13-15 Stat 477 - Loss Models Chapters 13-15 (Stat 477) Parameter Estimation Brian Hartman - BYU 1 / 23 Methods for parameter estimation Methods for parameter estimation Methods
More informationConstructing Learning Models from Data: The Dynamic Catalog Mailing Problem
Constructing Learning Models from Data: The Dynamic Catalog Mailing Problem Peng Sun May 6, 2003 Problem and Motivation Big industry In 2000 Catalog companies in the USA sent out 7 billion catalogs, generated
More informationMultistate Modeling and Applications
Multistate Modeling and Applications Yang Yang Department of Statistics University of Michigan, Ann Arbor IBM Research Graduate Student Workshop: Statistics for a Smarter Planet Yang Yang (UM, Ann Arbor)
More informationChap 4. Software Reliability
Chap 4. Software Reliability 4.2 Reliability Growth 1. Introduction 2. Reliability Growth Models 3. The Basic Execution Model 4. Calendar Time Computation 5. Reliability Demonstration Testing 1. Introduction
More informationINSTITUTE AND FACULTY OF ACTUARIES. Curriculum 2019 SPECIMEN SOLUTIONS
INSTITUTE AND FACULTY OF ACTUARIES Curriculum 09 SPECIMEN SOLUTIONS Subject CSA Risk Modelling and Survival Analysis Institute and Faculty of Actuaries Sample path A continuous time, discrete state process
More informationThis exam contains 13 pages (including this cover page) and 10 questions. A Formulae sheet is provided with the exam.
Probability and Statistics FS 2017 Session Exam 22.08.2017 Time Limit: 180 Minutes Name: Student ID: This exam contains 13 pages (including this cover page) and 10 questions. A Formulae sheet is provided
More informationChapter 5. Chapter 5 sections
1 / 43 sections Discrete univariate distributions: 5.2 Bernoulli and Binomial distributions Just skim 5.3 Hypergeometric distributions 5.4 Poisson distributions Just skim 5.5 Negative Binomial distributions
More informationEE/CpE 345. Modeling and Simulation. Fall Class 5 September 30, 2002
EE/CpE 345 Modeling and Simulation Class 5 September 30, 2002 Statistical Models in Simulation Real World phenomena of interest Sample phenomena select distribution Probabilistic, not deterministic Model
More informationInformation Content Change under SFAS No. 131 s Interim Segment Reporting Requirements
Vol 2, No. 3, Fall 2010 Page 61~75 Information Content Change under SFAS No. 131 s Interim Segment Reporting Requirements Cho, Joong-Seok a a. School of Business Administration, Hanyang University, Seoul,
More informationf X (x) = λe λx, , x 0, k 0, λ > 0 Γ (k) f X (u)f X (z u)du
11 COLLECTED PROBLEMS Do the following problems for coursework 1. Problems 11.4 and 11.5 constitute one exercise leading you through the basic ruin arguments. 2. Problems 11.1 through to 11.13 but excluding
More information1 Probability and Random Variables
1 Probability and Random Variables The models that you have seen thus far are deterministic models. For any time t, there is a unique solution X(t). On the other hand, stochastic models will result in
More informationJoint Probability Distributions and Random Samples (Devore Chapter Five)
Joint Probability Distributions and Random Samples (Devore Chapter Five) 1016-345-01: Probability and Statistics for Engineers Spring 2013 Contents 1 Joint Probability Distributions 2 1.1 Two Discrete
More informationRMSC 2001 Introduction to Risk Management
RMSC 2001 Introduction to Risk Management Tutorial 4 (2011/12) 1 February 20, 2012 Outline: 1. Failure Time 2. Loss Frequency 3. Loss Severity 4. Aggregate Claim ====================================================
More informationA hidden semi-markov model for the occurrences of water pipe bursts
A hidden semi-markov model for the occurrences of water pipe bursts T. Economou 1, T.C. Bailey 1 and Z. Kapelan 1 1 School of Engineering, Computer Science and Mathematics, University of Exeter, Harrison
More informationCSCI-567: Machine Learning (Spring 2019)
CSCI-567: Machine Learning (Spring 2019) Prof. Victor Adamchik U of Southern California Mar. 19, 2019 March 19, 2019 1 / 43 Administration March 19, 2019 2 / 43 Administration TA3 is due this week March
More informationBeyond the Target Customer: Social Effects of CRM Campaigns
Beyond the Target Customer: Social Effects of CRM Campaigns Eva Ascarza, Peter Ebbes, Oded Netzer, Matthew Danielson Link to article: http://journals.ama.org/doi/abs/10.1509/jmr.15.0442 WEB APPENDICES
More informationEXPERIENCE-BASED LONGEVITY ASSESSMENT
EXPERIENCE-BASED LONGEVITY ASSESSMENT ERMANNO PITACCO Università di Trieste ermanno.pitacco@econ.units.it p. 1/55 Agenda Introduction Stochastic modeling: the process risk Uncertainty in future mortality
More informationProbabilistic modeling. The slides are closely adapted from Subhransu Maji s slides
Probabilistic modeling The slides are closely adapted from Subhransu Maji s slides Overview So far the models and algorithms you have learned about are relatively disconnected Probabilistic modeling framework
More informationErrata for the ASM Study Manual for Exam P, Fourth Edition By Dr. Krzysztof M. Ostaszewski, FSA, CFA, MAAA
Errata for the ASM Study Manual for Exam P, Fourth Edition By Dr. Krzysztof M. Ostaszewski, FSA, CFA, MAAA (krzysio@krzysio.net) Effective July 5, 3, only the latest edition of this manual will have its
More informationExercises. (a) Prove that m(t) =
Exercises 1. Lack of memory. Verify that the exponential distribution has the lack of memory property, that is, if T is exponentially distributed with parameter λ > then so is T t given that T > t for
More informationABC methods for phase-type distributions with applications in insurance risk problems
ABC methods for phase-type with applications problems Concepcion Ausin, Department of Statistics, Universidad Carlos III de Madrid Joint work with: Pedro Galeano, Universidad Carlos III de Madrid Simon
More informationIntroduction to Probability Theory for Graduate Economics Fall 2008
Introduction to Probability Theory for Graduate Economics Fall 008 Yiğit Sağlam October 10, 008 CHAPTER - RANDOM VARIABLES AND EXPECTATION 1 1 Random Variables A random variable (RV) is a real-valued function
More informationSTAT 418: Probability and Stochastic Processes
STAT 418: Probability and Stochastic Processes Spring 2016; Homework Assignments Latest updated on April 29, 2016 HW1 (Due on Jan. 21) Chapter 1 Problems 1, 8, 9, 10, 11, 18, 19, 26, 28, 30 Theoretical
More informationEstimation of reliability parameters from Experimental data (Parte 2) Prof. Enrico Zio
Estimation of reliability parameters from Experimental data (Parte 2) This lecture Life test (t 1,t 2,...,t n ) Estimate θ of f T t θ For example: λ of f T (t)= λe - λt Classical approach (frequentist
More informationBayesian Methods for Machine Learning
Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),
More informationThings to remember when learning probability distributions:
SPECIAL DISTRIBUTIONS Some distributions are special because they are useful They include: Poisson, exponential, Normal (Gaussian), Gamma, geometric, negative binomial, Binomial and hypergeometric distributions
More informationContinuous Distributions
A normal distribution and other density functions involving exponential forms play the most important role in probability and statistics. They are related in a certain way, as summarized in a diagram later
More informationCAS Exam MAS-1. Howard Mahler. Stochastic Models
CAS Exam MAS-1 Howard Mahler Stochastic Models 2019 Stochastic Models, HCM 12/31/18, Page 1 The slides are in the same order as the sections of my study guide. Section # Section Name 1 Introduction 2 Exponential
More informationSYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions
SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu
More informationSPRING 2007 EXAM C SOLUTIONS
SPRING 007 EXAM C SOLUTIONS Question #1 The data are already shifted (have had the policy limit and the deductible of 50 applied). The two 350 payments are censored. Thus the likelihood function is L =
More informationMulti-period credit default prediction with time-varying covariates. Walter Orth University of Cologne, Department of Statistics and Econometrics
with time-varying covariates Walter Orth University of Cologne, Department of Statistics and Econometrics 2 20 Overview Introduction Approaches in the literature The proposed models Empirical analysis
More informationBinomial random variable
Binomial random variable Toss a coin with prob p of Heads n times X: # Heads in n tosses X is a Binomial random variable with parameter n,p. X is Bin(n, p) An X that counts the number of successes in many
More informationproblem. max Both k (0) and h (0) are given at time 0. (a) Write down the Hamilton-Jacobi-Bellman (HJB) Equation in the dynamic programming
1. Endogenous Growth with Human Capital Consider the following endogenous growth model with both physical capital (k (t)) and human capital (h (t)) in continuous time. The representative household solves
More informationUnobserved Heterogeneity
Unobserved Heterogeneity Germán Rodríguez grodri@princeton.edu Spring, 21. Revised Spring 25 This unit considers survival models with a random effect representing unobserved heterogeneity of frailty, a
More informationChapter 2 - Survival Models
2-1 Chapter 2 - Survival Models Section 2.2 - Future Lifetime Random Variable and the Survival Function Let T x = ( Future lifelength beyond age x of an individual who has survived to age x [measured in
More informationECE 302, Final 3:20-5:20pm Mon. May 1, WTHR 160 or WTHR 172.
ECE 302, Final 3:20-5:20pm Mon. May 1, WTHR 160 or WTHR 172. 1. Enter your name, student ID number, e-mail address, and signature in the space provided on this page, NOW! 2. This is a closed book exam.
More informationBayesian Inference. Chapter 2: Conjugate models
Bayesian Inference Chapter 2: Conjugate models Conchi Ausín and Mike Wiper Department of Statistics Universidad Carlos III de Madrid Master in Business Administration and Quantitative Methods Master in
More information1 Random Variable: Topics
Note: Handouts DO NOT replace the book. In most cases, they only provide a guideline on topics and an intuitive feel. 1 Random Variable: Topics Chap 2, 2.1-2.4 and Chap 3, 3.1-3.3 What is a random variable?
More informationNovember 2000 Course 1. Society of Actuaries/Casualty Actuarial Society
November 2000 Course 1 Society of Actuaries/Casualty Actuarial Society 1. A recent study indicates that the annual cost of maintaining and repairing a car in a town in Ontario averages 200 with a variance
More informationRMSC 2001 Introduction to Risk Management
RMSC 2001 Introduction to Risk Management Tutorial 4 (2011/12) 1 February 20, 2012 Outline: 1. Failure Time 2. Loss Frequency 3. Loss Severity 4. Aggregate Claim ====================================================
More informationStatistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation
Statistics - Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence
More information1 Poisson processes, and Compound (batch) Poisson processes
Copyright c 2007 by Karl Sigman 1 Poisson processes, and Compound (batch) Poisson processes 1.1 Point Processes Definition 1.1 A simple point process ψ = {t n : n 1} is a sequence of strictly increasing
More information1 Introduction. P (n = 1 red ball drawn) =
Introduction Exercises and outline solutions. Y has a pack of 4 cards (Ace and Queen of clubs, Ace and Queen of Hearts) from which he deals a random of selection 2 to player X. What is the probability
More informationContagious default: application of methods of Statistical Mechanics in Finance
Contagious default: application of methods of Statistical Mechanics in Finance Wolfgang J. Runggaldier University of Padova, Italy www.math.unipd.it/runggaldier based on joint work with : Paolo Dai Pra,
More informationPhysics 403. Segev BenZvi. Parameter Estimation, Correlations, and Error Bars. Department of Physics and Astronomy University of Rochester
Physics 403 Parameter Estimation, Correlations, and Error Bars Segev BenZvi Department of Physics and Astronomy University of Rochester Table of Contents 1 Review of Last Class Best Estimates and Reliability
More informationSTAT 414: Introduction to Probability Theory
STAT 414: Introduction to Probability Theory Spring 2016; Homework Assignments Latest updated on April 29, 2016 HW1 (Due on Jan. 21) Chapter 1 Problems 1, 8, 9, 10, 11, 18, 19, 26, 28, 30 Theoretical Exercises
More informationLecture Prepared By: Mohammad Kamrul Arefin Lecturer, School of Business, North South University
Lecture 15 20 Prepared By: Mohammad Kamrul Arefin Lecturer, School of Business, North South University Modeling for Time Series Forecasting Forecasting is a necessary input to planning, whether in business,
More informationDepartment of Statistical Science FIRST YEAR EXAM - SPRING 2017
Department of Statistical Science Duke University FIRST YEAR EXAM - SPRING 017 Monday May 8th 017, 9:00 AM 1:00 PM NOTES: PLEASE READ CAREFULLY BEFORE BEGINNING EXAM! 1. Do not write solutions on the exam;
More informationOutline. Binomial, Multinomial, Normal, Beta, Dirichlet. Posterior mean, MAP, credible interval, posterior distribution
Outline A short review on Bayesian analysis. Binomial, Multinomial, Normal, Beta, Dirichlet Posterior mean, MAP, credible interval, posterior distribution Gibbs sampling Revisit the Gaussian mixture model
More informationMAS223 Statistical Inference and Modelling Exercises
MAS223 Statistical Inference and Modelling Exercises The exercises are grouped into sections, corresponding to chapters of the lecture notes Within each section exercises are divided into warm-up questions,
More informationThis does not cover everything on the final. Look at the posted practice problems for other topics.
Class 7: Review Problems for Final Exam 8.5 Spring 7 This does not cover everything on the final. Look at the posted practice problems for other topics. To save time in class: set up, but do not carry
More information(Ch 3.4.1, 3.4.2, 4.1, 4.2, 4.3)
3 Probability Distributions (Ch 3.4.1, 3.4.2, 4.1, 4.2, 4.3) Probability Distribution Functions Probability distribution function (pdf): Function for mapping random variables to real numbers. Discrete
More informationBayesian Inference. Chapter 4: Regression and Hierarchical Models
Bayesian Inference Chapter 4: Regression and Hierarchical Models Conchi Ausín and Mike Wiper Department of Statistics Universidad Carlos III de Madrid Master in Business Administration and Quantitative
More informationChapter 1 Statistical Reasoning Why statistics? Section 1.1 Basics of Probability Theory
Chapter 1 Statistical Reasoning Why statistics? Uncertainty of nature (weather, earth movement, etc. ) Uncertainty in observation/sampling/measurement Variability of human operation/error imperfection
More informationModule 22: Bayesian Methods Lecture 9 A: Default prior selection
Module 22: Bayesian Methods Lecture 9 A: Default prior selection Peter Hoff Departments of Statistics and Biostatistics University of Washington Outline Jeffreys prior Unit information priors Empirical
More informationProbability Distribution
Economic Risk and Decision Analysis for Oil and Gas Industry CE81.98 School of Engineering and Technology Asian Institute of Technology January Semester Presented by Dr. Thitisak Boonpramote Department
More informationClosed book and notes. 120 minutes. Cover page, five pages of exam. No calculators.
IE 230 Seat # Closed book and notes. 120 minutes. Cover page, five pages of exam. No calculators. Score Final Exam, Spring 2005 (May 2) Schmeiser Closed book and notes. 120 minutes. Consider an experiment
More informationPATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS Parametric Distributions Basic building blocks: Need to determine given Representation: or? Recall Curve Fitting Binary Variables
More informationPractice Examination # 3
Practice Examination # 3 Sta 23: Probability December 13, 212 This is a closed-book exam so do not refer to your notes, the text, or any other books (please put them on the floor). You may use a single
More informationSampling Distributions
In statistics, a random sample is a collection of independent and identically distributed (iid) random variables, and a sampling distribution is the distribution of a function of random sample. For example,
More informationEco517 Fall 2004 C. Sims MIDTERM EXAM
Eco517 Fall 2004 C. Sims MIDTERM EXAM Answer all four questions. Each is worth 23 points. Do not devote disproportionate time to any one question unless you have answered all the others. (1) We are considering
More informationChapter 5. Bayesian Statistics
Chapter 5. Bayesian Statistics Principles of Bayesian Statistics Anything unknown is given a probability distribution, representing degrees of belief [subjective probability]. Degrees of belief [subjective
More informationNaïve Bayes classification
Naïve Bayes classification 1 Probability theory Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. Examples: A person s height, the outcome of a coin toss
More informationSTAT2201. Analysis of Engineering & Scientific Data. Unit 3
STAT2201 Analysis of Engineering & Scientific Data Unit 3 Slava Vaisman The University of Queensland School of Mathematics and Physics What we learned in Unit 2 (1) We defined a sample space of a random
More informationMathematics Ph.D. Qualifying Examination Stat Probability, January 2018
Mathematics Ph.D. Qualifying Examination Stat 52800 Probability, January 2018 NOTE: Answers all questions completely. Justify every step. Time allowed: 3 hours. 1. Let X 1,..., X n be a random sample from
More informationProbability Models. 4. What is the definition of the expectation of a discrete random variable?
1 Probability Models The list of questions below is provided in order to help you to prepare for the test and exam. It reflects only the theoretical part of the course. You should expect the questions
More informationContinuous-Valued Probability Review
CS 6323 Continuous-Valued Probability Review Prof. Gregory Provan Department of Computer Science University College Cork 2 Overview Review of discrete distributions Continuous distributions 3 Discrete
More information(Ch 3.4.1, 3.4.2, 4.1, 4.2, 4.3)
3 Probability Distributions (Ch 3.4.1, 3.4.2, 4.1, 4.2, 4.3) Probability Distribution Functions Probability distribution function (pdf): Function for mapping random variables to real numbers. Discrete
More informationBayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang
Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features Yangxin Huang Department of Epidemiology and Biostatistics, COPH, USF, Tampa, FL yhuang@health.usf.edu January
More informationASM Study Manual for Exam P, Second Edition By Dr. Krzysztof M. Ostaszewski, FSA, CFA, MAAA Errata
ASM Study Manual for Exam P, Second Edition By Dr. Krzysztof M. Ostaszewski, FSA, CFA, MAAA (krzysio@krzysio.net) Errata Effective July 5, 3, only the latest edition of this manual will have its errata
More informationCommon ontinuous random variables
Common ontinuous random variables CE 311S Earlier, we saw a number of distribution families Binomial Negative binomial Hypergeometric Poisson These were useful because they represented common situations:
More information