Estimation and Model Selection in Mixed Effects Models Part I. Adeline Samson 1
1 Estimation and Model Selection in Mixed Effects Models, Part I
Adeline Samson (University Paris Descartes)
Summer school, Lipari, Italy. These slides are based on Marc Lavielle's slides.
2 Outline
1. Introduction
2. Some pharmacokinetics-pharmacodynamics examples
3. Estimation in classical regression models: linear regression model, non linear regression model, examples
4. The mixed effects model: linear mixed effects model, non linear mixed effects model, examples
3 Some examples of data: growth of children (figure: height (cm) vs age (years))
4 Some examples of data: pharmacokinetics of Indomethacin (figure: indomethacin concentration (mcg/ml) vs time since drug administration (hr))
5 Some examples of data: pharmacokinetics of Theophylline (figure: theophylline concentration in serum (mg/l) vs time since drug administration (hr))
6 The classical regression model
For one subject: y_j = f(x_j, β) + ε_j, 1 ≤ j ≤ n
- y_j ∈ R is the jth observation of the subject; n is the number of observations.
- The regression variables, or design variables, (x_j) are known.
- The vector of parameters β is unknown.
- The measurement error variables (ε_j) are unknown.
10 The classical regression model
y_j = f(x_j, β) + ε_j, 1 ≤ j ≤ n
The (ε_j) are modeled as a sequence of random variables. The goal of the modeler is to develop two kinds of models simultaneously:
(1) the structural model f
(2) the statistical model
11 The classical regression model
y_j = f(x_j, β) + ε_j, 1 ≤ j ≤ n
(1) The structural model f: we are not interested in a purely descriptive model that nicely fits the data, but rather in a mechanistic model that has some biological meaning and is a function of physiological parameters. Examples: compartmental PK models, viral dynamics models, ...
12 The classical regression model
y_j = f(x_j, β) + ε_j, 1 ≤ j ≤ n
(2) The statistical model aims to explain the variability observed in the data:
- the residual error model: distribution of (ε_j)
- the model of covariates
13 The classical regression model
y_j = f(x_j, β) + ε_j, 1 ≤ j ≤ n
Some statistical issues:
- Estimation: estimate the parameters of the model.
- Model selection and model assessment: select and assess the best structural model f; select and assess the best statistical model.
- Optimization of the design: find the optimal design (x_j).
14 Outline
1. Introduction
2. Some pharmacokinetics-pharmacodynamics examples
3. Estimation in classical regression models: linear regression model, non linear regression model, examples
4. The mixed effects model: linear mixed effects model, non linear mixed effects model, examples
15 Pharmacokinetics and Pharmacodynamics (PK/PD)
dose → (PK) → concentration → (PD) → response
Pharmacokinetics (PK): what the body does to the drug.
Pharmacodynamics (PD): what the drug does to the body.
16 One compartment PK model intravenous administration
17 intravenous administration and first-order elimination
dose D at t=0; drug amount Q(t); elimination (rate k_e)
dQ(t)/dt = -k_e Q(t), Q(0) = D  ⇒  Q(t) = D e^{-k_e t}
C(t) = Q(t)/V = (D/V) e^{-k_e t}
C(t): concentration of the drug, V: volume of the compartment.
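The closed-form solution above can be checked numerically. The sketch below, with illustrative dose/volume/rate values not taken from the slides, integrates dQ/dt = -k_e Q by forward Euler and compares to (D/V) e^{-k_e t}:

```python
import numpy as np

def concentration_iv_bolus(t, dose, V, ke):
    """C(t) = (dose / V) * exp(-ke * t): solution of dQ/dt = -ke*Q, Q(0) = dose."""
    return (dose / V) * np.exp(-ke * np.asarray(t, dtype=float))

# Crude forward-Euler integration of the ODE agrees with the analytic solution.
dose, V, ke, dt, T = 100.0, 10.0, 0.3, 1e-4, 5.0
Q = dose
for _ in range(int(round(T / dt))):
    Q += dt * (-ke * Q)
assert abs(Q / V - concentration_iv_bolus(T, dose, V, ke)) < 1e-3
```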
18 intravenous administration and nonlinear elimination
dose D at t=0; drug amount Q(t); nonlinear elimination
dQ(t)/dt = - V_m Q(t) / (V K_m + Q(t))
C(t) = Q(t)/V
(V_m, K_m): Michaelis-Menten elimination parameters, V: volume of the compartment.
19 oral administration, first-order absorption and elimination
dose D at time t=0; absorption (rate k_a); drug amount Q(t); elimination (rate k_e)
dQ_a(t)/dt = -k_a Q_a(t), Q_a(0) = D
dQ(t)/dt = k_a Q_a(t) - k_e Q(t), Q(0) = 0
Q_a(t): amount at absorption site.
C(t) = Q(t)/V = D k_a (e^{-k_e t} - e^{-k_a t}) / (V (k_a - k_e))
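The two-ODE absorption/elimination system and its closed-form concentration can also be cross-checked numerically. A minimal sketch, with illustrative parameter values (the dose, V, k_a, k_e below are not from the slides):

```python
import numpy as np

def concentration_oral(t, dose, V, ka, ke):
    """C(t) = dose*ka/(V*(ka-ke)) * (exp(-ke*t) - exp(-ka*t)), valid for ka != ke."""
    t = np.asarray(t, dtype=float)
    return dose * ka / (V * (ka - ke)) * (np.exp(-ke * t) - np.exp(-ka * t))

# Forward Euler on dQa/dt = -ka*Qa, dQ/dt = ka*Qa - ke*Q reproduces C(T) = Q(T)/V.
dose, V, ka, ke, dt, T = 320.0, 30.0, 1.5, 0.08, 1e-4, 2.0
Qa, Q = dose, 0.0
for _ in range(int(round(T / dt))):
    Qa, Q = Qa + dt * (-ka * Qa), Q + dt * (ka * Qa - ke * Q)
assert abs(Q / V - concentration_oral(T, dose, V, ka, ke)) < 1e-2
```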
20 Two compartments PK model intravenous administration
21 Two compartments PK model
dQ_a(t)/dt = -k_a Q_a(t)
dQ_c(t)/dt = k_a Q_a(t) - k_e Q_c(t) - k_12 Q_c(t) + k_21 Q_p(t)
dQ_p(t)/dt = k_12 Q_c(t) - k_21 Q_p(t)
Q_a(t): amount at absorption site, Q_a(0) = D.
Q_c(t): amount in the central compartment, Q_c(0) = 0.
Q_p(t): amount in the peripheral compartment, Q_p(0) = 0.
22 Outline
1. Introduction
2. Some pharmacokinetics-pharmacodynamics examples
3. Estimation in classical regression models: linear regression model, non linear regression model, examples
4. The mixed effects model: linear mixed effects model, non linear mixed effects model, examples
23 The regression model
y_j = f(x_j, β) + ε_j, 1 ≤ j ≤ n
- n is the number of observations.
- The regression variables, or design variables, (x_j) are known.
- The vector of parameters β is unknown.
- Linear model: f is a linear function of the parameters β.
- Non linear model: f is a non linear function of the parameters β.
25 The regression model: the statistical model
y_j = f(x_j, β) + ε_j
The simplest statistical model assumes that the (ε_j) are independent and identically distributed (i.i.d.) Gaussian random variables: ε_j i.i.d. ~ N(0, σ²).
Problem: estimate the parameters of the model θ = (β, σ²).
26 Maximum Likelihood Estimation
Maximum likelihood estimation (MLE) is a popular statistical method for fitting a statistical model to data and providing estimates of the model's parameters. For a fixed set of data and underlying probability model, maximum likelihood picks the values of the model parameters that make the data "more likely" than any other parameter values would.
27 Maximum Likelihood Estimation
Consider a family of continuous probability distributions parameterized by an unknown parameter θ, with known probability density function p_θ. Draw a vector y = (y_1, y_2, ..., y_n) from this distribution, and then use p_θ to compute the probability density associated with the observed data: p_θ(y) = p_θ(y_1, y_2, ..., y_n).
As a function of θ with y_1, y_2, ..., y_n fixed, this is the likelihood function: L(θ; y) = p_θ(y).
28 Maximum Likelihood Estimation
Let θ* be the true value of θ. The method of maximum likelihood estimates θ* by finding the value of θ that maximizes L(θ; y). This is the maximum likelihood estimator (MLE) of θ:
θ̂ = arg max_θ L(θ; y)
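As a concrete illustration of the arg-max definition, the sketch below (a toy example not from the slides) uses the i.i.d. Gaussian model, whose MLE is available in closed form (sample mean and non-corrected sample variance), and verifies that the log-likelihood is indeed largest at that point:

```python
import numpy as np

def gaussian_loglik(theta, y):
    """log L(theta; y) for i.i.d. N(mu, sigma^2) observations, theta = (mu, sigma^2)."""
    mu, s2 = theta
    n = len(y)
    return -0.5 * n * np.log(2 * np.pi * s2) - np.sum((y - mu) ** 2) / (2 * s2)

rng = np.random.default_rng(0)
y = rng.normal(5.0, 2.0, size=500)
mle = (y.mean(), y.var())   # closed-form MLE: sample mean, sample variance (ddof=0)
# Any other parameter value gives a smaller (or equal) likelihood:
assert gaussian_loglik(mle, y) >= gaussian_loglik((mle[0] + 0.1, mle[1]), y)
assert gaussian_loglik(mle, y) >= gaussian_loglik((mle[0], mle[1] * 1.1), y)
```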
29 Maximum Likelihood Estimation: some properties of the MLE
Under certain (fairly weak) regularity conditions, the MLE is "asymptotically optimal":
- The MLE is asymptotically unbiased: E(θ̂) → θ* as n → ∞.
- The MLE is a consistent estimate of θ* (LLN): θ̂ → θ* as n → ∞.
- The MLE is asymptotically normal (CLT): √n (θ̂ - θ*) → N(0, I(θ*)^{-1}) as n → ∞, where I(θ*) = -E[∂²_θ log L(θ*; y)]/n is the Fisher information matrix.
- The MLE is asymptotically efficient (Cramér-Rao): no asymptotically unbiased estimator has lower asymptotic mean squared error than the MLE.
33 The regression model: maximum likelihood estimation
y_j = f(x_j, β) + ε_j, 1 ≤ j ≤ n, with ε_j i.i.d. ~ N(0, σ²)  ⇒  y_j ~ N(f(x_j, β), σ²)
L(θ; y) = ∏_{j=1}^n p_θ(y_j) = ∏_{j=1}^n (2πσ²)^{-1/2} exp( -(y_j - f(x_j, β))² / (2σ²) ) = (2πσ²)^{-n/2} exp( -(1/(2σ²)) Σ_{j=1}^n (y_j - f(x_j, β))² )
34 The regression model: maximum likelihood estimation
β̂ = arg max_β L(β; y) = arg max_β (2πσ²)^{-n/2} exp( -(1/(2σ²)) Σ_{j=1}^n (y_j - f(x_j, β))² ) = arg min_β Σ_{j=1}^n (y_j - f(x_j, β))²
(Maximum likelihood estimate of β = least-squares estimate of β)
35 The linear regression model
y_1 = x_11 β_1 + x_12 β_2 + ... + x_1p β_p + ε_1
y_2 = x_21 β_1 + x_22 β_2 + ... + x_2p β_p + ε_2
...
y_n = x_n1 β_1 + x_n2 β_2 + ... + x_np β_p + ε_n
In matrix form: Y = Xβ + ε
36 The linear regression model: maximum likelihood estimation
y = Xβ + ε, ε_j i.i.d. ~ N(0, σ²), θ = (β, σ²)  ⇒  y ~ N(Xβ, σ² I_n)
L(θ; y) = (2πσ²)^{-n/2} exp( -‖y - Xβ‖² / (2σ²) )
β̂ = arg max_β L(β; y) = arg min_β ‖y - Xβ‖²
37 The linear regression model: maximum likelihood estimation
y = Xβ + ε, β̂ = arg min_β ‖y - Xβ‖²
β̂ = (X'X)^{-1} X'y = (X'X)^{-1} X'(Xβ + ε) = β + (X'X)^{-1} X'ε  ⇒  E(β̂) = β
Var(β̂) = σ² (X'X)^{-1}
log L(θ; y) = -(n/2) log(2πσ²) - ‖y - Xβ‖² / (2σ²)
I(β) = -(1/n) E[∂²_β log L(β, σ²; y)] = (1/(nσ²)) X'X
39 The linear regression model: maximum likelihood estimation
y = Xβ + ε. Let V = Var(β̂) = σ² (X'X)^{-1} be the variance-covariance matrix of β̂.
The diagonal elements of V are the variances of the components of β̂: V_kk is the variance of β̂_k, and √V_kk is the standard error (s.e.) of β̂_k.
90% confidence interval for β_k: [β̂_k - 1.645 √V_kk ; β̂_k + 1.645 √V_kk]
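The closed-form estimator, its standard errors, and the 90% interval can be sketched in a few lines. The simulated data below are illustrative (the true values 97.97 and 3.59 simply echo the growth-of-children example later in the deck), and σ² is estimated by its MLE ‖y - Xβ̂‖²/n:

```python
import numpy as np

def ols_with_ci(X, y, z=1.645):
    """beta_hat = (X'X)^{-1} X'y, Var(beta_hat) = sigma^2 (X'X)^{-1},
    and a 90% confidence interval beta_hat +/- 1.645 * s.e."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    s2 = resid @ resid / n                  # ML estimate of sigma^2
    se = np.sqrt(s2 * np.diag(XtX_inv))     # standard errors of beta_hat
    return beta, beta - z * se, beta + z * se

rng = np.random.default_rng(1)
n = 200
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n)])
y = X @ np.array([97.97, 3.59]) + rng.normal(0, 1.0, n)
beta, lo, hi = ols_with_ci(X, y)
assert np.all(lo < beta) and np.all(beta < hi)
```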
40 The non linear regression model
y_1 = f(x_1, β) + ε_1
y_2 = f(x_2, β) + ε_2
...
y_n = f(x_n, β) + ε_n
with x_j = (x_j1, ..., x_jp) and β = (β_1, ..., β_p). In vector form: Y = f(X, β) + ε
41 The non linear regression model: maximum likelihood estimation
y = f(X, β) + ε, ε_j i.i.d. ~ N(0, σ²)  ⇒  y ~ N(f(X, β), σ² I_n)
L(θ; y) = (2πσ²)^{-n/2} exp( -‖y - f(X, β)‖² / (2σ²) )
β̂ = arg max_β L(β; y) = arg min_β ‖y - f(X, β)‖²
No explicit expression for β̂: use an optimization algorithm (e.g. Newton-Raphson).
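To make the "no explicit expression, use an optimization algorithm" point concrete, here is a minimal Gauss-Newton iteration (one standard choice among the Newton-type algorithms the slide alludes to) for the one-exponential model f(t, β) = β_1 e^{-β_2 t}; the data and starting point are illustrative and noise-free so the fit can be checked exactly:

```python
import numpy as np

def gauss_newton(t, y, beta0, n_iter=50):
    """Least-squares fit of f(t, beta) = beta1 * exp(-beta2 * t)
    by Gauss-Newton iterations: solve (J'J) step = J'r at each step."""
    b1, b2 = beta0
    for _ in range(n_iter):
        e = np.exp(-b2 * t)
        r = y - b1 * e                          # residuals
        J = np.column_stack([e, -b1 * t * e])   # Jacobian of f wrt (b1, b2)
        step = np.linalg.solve(J.T @ J, J.T @ r)
        b1, b2 = b1 + step[0], b2 + step[1]
    return b1, b2

t = np.linspace(0.25, 8.0, 40)
y = 2.8 * np.exp(-1.1 * t)                      # noise-free data for the check
b1, b2 = gauss_newton(t, y, beta0=(2.5, 1.0))
assert abs(b1 - 2.8) < 1e-6 and abs(b2 - 1.1) < 1e-6
```

In practice a damped or trust-region variant is used, since plain Gauss-Newton can diverge from a poor starting point.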
42 Model selection
1. Compute the likelihood of the different models. Let θ̂_M be the maximum likelihood estimate of θ for model M: θ̂_M = arg max_θ L_M(θ; y), and let L_M = L_M(θ̂_M; y) be the likelihood of model M. Selecting the most likely model by comparing likelihoods favors models of high dimension (with many parameters)!
2. Penalize the models of high dimension. Select the model M̂ that minimizes the penalized criterion -2 log L_M + pen(M):
- Bayesian Information Criterion (BIC): pen(M) = log(n) dim(M).
- Akaike Information Criterion (AIC): pen(M) = 2 dim(M).
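The penalized criteria are a one-liner. The toy comparison below (numbers invented for illustration) shows the intended behavior: a larger model with a barely better fitted likelihood still loses once the penalty is added:

```python
import numpy as np

def aic_bic(loglik, dim_M, n):
    """Penalized criteria from the slide: -2 log L_M + pen(M)."""
    return -2 * loglik + 2 * dim_M, -2 * loglik + np.log(n) * dim_M

# A 2-parameter and a 5-parameter model with almost the same maximized
# log-likelihood; the penalty favors the smaller model under both criteria.
n = 100
aic_small, bic_small = aic_bic(-120.0, 2, n)
aic_big, bic_big = aic_bic(-119.5, 5, n)
assert aic_small < aic_big and bic_small < bic_big
```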
43 Linear regression model: growth of children
(Figure: height vs age with fitted line.)
y_j = β_1 + β_2 t_j + ε_j, with estimates β̂_1 = 97.97, β̂_2 = 3.59.
44 Non linear regression model: pharmacokinetics of Indomethacin
(Figure: concentration vs time for subjects 1 and 2.)
y_j = β_1 e^{-β_2 t_j} + β_3 e^{-β_4 t_j} + ε_j, fitted separately for each subject (table of per-subject estimates β̂_1, ..., β̂_4).
45 Outline
1. Introduction
2. Some pharmacokinetics-pharmacodynamics examples
3. Estimation in classical regression models: linear regression model, non linear regression model, examples
4. The mixed effects model: linear mixed effects model, non linear mixed effects model, examples
46 Some examples of data: pharmacokinetics of Theophylline
(Figure: individual concentration curves vs time.)
Each individual curve is described by the same parametric model, with its own individual parameters.
47 Analysis of data from several subjects
Classical regression model for one subject: y_j = f(x_j, ψ) + ε_j, 1 ≤ j ≤ n
Classical regression model for several subjects: y_ij = f(x_ij, ψ) + ε_ij, 1 ≤ i ≤ N, 1 ≤ j ≤ n_i
- y_ij ∈ R is the jth observation of subject i; N is the number of subjects; n_i is the number of observations of subject i.
- x_ij are the regression variables, or design variables.
53 The mixed effects model [Pinheiro and Bates, 2002]
y_ij = f(x_ij, ψ_i) + ε_ij, 1 ≤ i ≤ N, 1 ≤ j ≤ n_i
- y_ij ∈ R is the jth observation of subject i; N is the number of subjects; n_i is the number of observations of subject i.
- The regression variables, or design variables, (x_ij) are known.
- The individual parameters (ψ_i) are unknown.
54 The mixed effects model
y_ij = f(x_ij, ψ_i) + ε_ij, 1 ≤ i ≤ N, 1 ≤ j ≤ n_i
The (ψ_i) and the (ε_ij) are modeled as sequences of random variables. The goal of the modeler is to develop two kinds of models simultaneously:
(1) the structural model f
(2) the statistical model
55 The mixed effects model
y_ij = f(x_ij, ψ_i) + ε_ij, 1 ≤ i ≤ N, 1 ≤ j ≤ n_i
(2) The statistical model aims to explain the variability observed in the data:
- the residual error model: distribution of (ε_ij)
- the model of the individual parameters: distribution of (ψ_i), written ψ_i = h(C_i, β, η_i), where C_i is a vector of covariates, β is a vector of fixed effects, and η_i is a vector of random effects.
56 The individual parameters: examples of ψ_i = h(C_i, β, η_i)
- additive random effects (C_i = 1): ψ_i = β + η_i
- effect of a covariate (β = [β_1, β_2], C_i = [1, sex_i]): ψ_i = β_1 + β_2 sex_i + η_i
- multiplicative random effects: ψ_i = β e^{η_i}
- "partial" vector of random effects: ψ_i = (ψ_i1, ψ_i2) = (β_1 + η_i1, β_2)
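These parameter models are easy to simulate from. A short sketch (all numeric values are illustrative, not from the slides) draws random effects and builds ψ_i under three of the models above; note the multiplicative form keeps ψ_i positive, which is why it is common for PK parameters:

```python
import numpy as np

rng = np.random.default_rng(2)
beta1, beta2, omega = 1.5, 0.4, 0.3       # illustrative fixed effects and sd
N = 5
sex = rng.integers(0, 2, N)               # binary covariate sex_i
eta = rng.normal(0.0, omega, N)           # random effects eta_i ~ N(0, omega^2)

psi_additive = beta1 + eta                       # psi_i = beta + eta_i
psi_with_covariate = beta1 + beta2 * sex + eta   # psi_i = beta1 + beta2*sex_i + eta_i
psi_multiplicative = beta1 * np.exp(eta)         # psi_i = beta * exp(eta_i) > 0
assert np.all(psi_multiplicative > 0)
```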
61 The mixed effects model
y_ij = f(x_ij, ψ_i) + ε_ij, 1 ≤ i ≤ N, 1 ≤ j ≤ n_i, with ψ_i = h(C_i, β, η_i)
- C_i is a known vector of covariates
- β is an unknown p-vector of fixed effects
- η_i is an unknown q-vector of random effects
- ε_ij ~ N(0, σ²), η_i ~ N(0, Ω)
- Ω is the q × q variance-covariance matrix of the random effects.
(Hyper or population) parameters: θ = (β, Ω, σ²)
62 The mixed effects model: example
Classical regression model: y_j = β_1 + β_2 t_j + ε_j
Mixed effects model: y_ij = ψ_i1 + ψ_i2 t_ij + ε_ij = (β_1 + η_i1) + (β_2 + η_i2) t_ij + ε_ij
where η_i = (η_i1, η_i2) ~ N(0, Ω), i.e.
(η_i1, η_i2)' ~ N( (0, 0)' , [[ω_1², ω_12], [ω_12, ω_2²]] )
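The random intercept-and-slope example above can be simulated directly; the sketch below uses illustrative values for β, Ω, and σ (not estimates from the slides) and draws one trajectory per subject:

```python
import numpy as np

rng = np.random.default_rng(3)
beta = np.array([97.97, 3.59])               # fixed effects (intercept, slope)
Omega = np.array([[4.0, 0.5], [0.5, 0.25]])  # illustrative covariance of eta_i
sigma = 1.0
N, n_i = 8, 6
t = np.linspace(0.0, 10.0, n_i)

y = np.empty((N, n_i))
for i in range(N):
    eta_i = rng.multivariate_normal(np.zeros(2), Omega)
    psi_i = beta + eta_i                     # individual intercept and slope
    y[i] = psi_i[0] + psi_i[1] * t + rng.normal(0.0, sigma, n_i)
```

Each row of `y` is one subject's data: the same linear structural model, but with subject-specific parameters drawn around the population values.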
63 The mixed effects model: some statistical issues
- Estimation: estimate the population parameters θ of the model, estimate the individual parameters, compute confidence intervals.
- Model selection and model assessment: determine whether a parameter varies in the population, select the best combination of covariates, compare several treatments.
- Optimization of the design: determine the design (the measurement times) that yields the most accurate estimation of the model.
64 The mixed effects model: estimation of the population parameters
The maximum likelihood estimator of θ = (β, Ω, σ²) maximizes
L(θ; y) = ∏_{i=1}^N L_i(θ; y_i), with L_i(θ; y_i) = ∫ p(y_i, η_i; θ) dη_i = ∫ p(y_i | η_i; θ) p(η_i; θ) dη_i
We know that y_i | η_i ~ N( f(x_i, h(C_i, β, η_i)), σ² I_{n_i} ) and η_i ~ N(0, Ω).
66 The mixed effects model: estimation of the population parameters
Thus
L_i(θ; y_i) = ∫ (2πσ²)^{-n_i/2} exp( -‖y_i - f(x_i, h(C_i, β, η_i))‖² / (2σ²) ) (2π)^{-q/2} |Ω|^{-1/2} exp( -η_i' Ω^{-1} η_i / 2 ) dη_i
Example: ψ_i = β + η_i gives
L_i(θ; y_i) = C σ^{-n_i} |Ω|^{-1/2} ∫ exp( -‖y_i - f(x_i, ψ_i)‖² / (2σ²) - (ψ_i - β)' Ω^{-1} (ψ_i - β) / 2 ) dψ_i
68 The mixed effects model: estimation of the individual parameters
Assume that θ = (β, Ω, σ²) is known (or previously estimated). ψ̂_i maximizes the conditional distribution p(ψ_i | y_i; θ):
p(ψ_i | y_i; θ) = p(ψ_i, y_i; θ) / p(y_i; θ) = p(y_i | ψ_i; θ) p(ψ_i; θ) / p(y_i; θ) ∝ p(y_i | ψ_i; θ) p(ψ_i; θ)
Example: ψ_i = β + η_i gives
p(ψ_i | y_i; θ) = C exp( -‖y_i - f(x_i, ψ_i)‖² / (2σ²) - (ψ_i - β)' Ω^{-1} (ψ_i - β) / 2 )
so ψ̂_i minimizes a penalized least-squares criterion:
ψ̂_i = arg min_ψ ( ‖y_i - f(x_i, ψ)‖² / σ² + (ψ - β)' Ω^{-1} (ψ - β) )
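When f is linear in ψ_i, this penalized criterion has a closed-form minimizer, which makes the idea easy to check. A sketch under that linear assumption (data and parameter values below are illustrative; for nonlinear f one would call a numerical optimizer instead):

```python
import numpy as np

def map_individual(y_i, X_i, beta, Omega, sigma2):
    """Closed-form minimizer of ||y_i - X_i psi||^2/sigma2 + (psi-beta)' Omega^{-1} (psi-beta):
    psi_hat = (X'X/sigma2 + Omega^{-1})^{-1} (X'y/sigma2 + Omega^{-1} beta)."""
    Oinv = np.linalg.inv(Omega)
    A = X_i.T @ X_i / sigma2 + Oinv
    b = X_i.T @ y_i / sigma2 + Oinv @ beta
    return np.linalg.solve(A, b)

def criterion(psi, y_i, X_i, beta, Omega, sigma2):
    r = y_i - X_i @ psi
    d = psi - beta
    return r @ r / sigma2 + d @ np.linalg.inv(Omega) @ d

rng = np.random.default_rng(4)
X_i = np.column_stack([np.ones(5), np.arange(5.0)])
y_i = X_i @ np.array([2.0, 1.0]) + rng.normal(0, 0.2, 5)
beta, Omega, s2 = np.array([2.0, 1.0]), np.eye(2), 0.04
psi_hat = map_individual(y_i, X_i, beta, Omega, s2)
# Perturbing psi_hat can only increase the criterion:
for d in (np.array([0.01, 0.0]), np.array([0.0, -0.01])):
    assert criterion(psi_hat, y_i, X_i, beta, Omega, s2) <= criterion(psi_hat + d, y_i, X_i, beta, Omega, s2)
```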
70 The linear mixed effects model: maximum likelihood estimate
y_i = X_i ψ_i + ε_i, ε_i ~ N(0, σ² I_{n_i}), ψ_i ~ N(β, Ω)
We have y_i = X_i β + X_i η_i + ε_i, thus by linearity y_i ~ N( X_i β, X_i Ω X_i' + σ² I_{n_i} ).
Set V_i = X_i Ω X_i' / σ² + I_{n_i}; the likelihood is then explicit:
L(θ; y) = ∏_{i=1}^N |2π σ² V_i|^{-1/2} exp( -(y_i - X_i β)' V_i^{-1} (y_i - X_i β) / (2σ²) )
Computation of the MLE θ̂ via an optimization routine (Newton-Raphson iterations or EM algorithm).
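The marginal likelihood above can be written either with the raw covariance X_i Ω X_i' + σ² I or with the V_i factorization; the sketch below (toy numbers, one subject) computes both and checks they agree, which is a useful sanity test when implementing this model by hand:

```python
import numpy as np

def lmm_loglik_direct(y_i, X_i, beta, Omega, sigma2):
    """log of the N(X_i beta, X_i Omega X_i' + sigma2 I) density at y_i."""
    n_i = len(y_i)
    S = X_i @ Omega @ X_i.T + sigma2 * np.eye(n_i)
    r = y_i - X_i @ beta
    return -0.5 * (n_i * np.log(2 * np.pi) + np.linalg.slogdet(S)[1]
                   + r @ np.linalg.solve(S, r))

def lmm_loglik_Vi(y_i, X_i, beta, Omega, sigma2):
    """Same quantity via the slide's factorization V_i = X_i Omega X_i'/sigma2 + I."""
    n_i = len(y_i)
    V_i = X_i @ Omega @ X_i.T / sigma2 + np.eye(n_i)
    r = y_i - X_i @ beta
    return -0.5 * (n_i * np.log(2 * np.pi * sigma2) + np.linalg.slogdet(V_i)[1]
                   + r @ np.linalg.solve(V_i, r) / sigma2)

X_i = np.column_stack([np.ones(4), np.arange(4.0)])
y_i = np.array([1.0, 2.2, 2.9, 4.1])
beta, Omega, s2 = np.array([1.0, 1.0]), np.array([[0.5, 0.1], [0.1, 0.2]]), 0.25
assert abs(lmm_loglik_direct(y_i, X_i, beta, Omega, s2)
           - lmm_loglik_Vi(y_i, X_i, beta, Omega, s2)) < 1e-10
```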
76 The linear mixed effects model: profiled likelihood
Optimization is much simpler using the concentrated or profiled likelihood, i.e. the likelihood as a function of Ω alone. From y_i ~ N(X_i β, σ² V_i) one can deduce
β̂(Ω) = ( Σ_{i=1}^N X_i' V_i^{-1} X_i )^{-1} Σ_{i=1}^N X_i' V_i^{-1} y_i
σ̂²(Ω) = Σ_{i=1}^N (y_i - X_i β̂(Ω))' V_i^{-1} (y_i - X_i β̂(Ω)) / Σ_{i=1}^N n_i
78 The linear mixed effects model: profiled likelihood
Using these expressions, derive the profiled log-likelihood as a function of Ω: L(Ω; y) = L(β̂(Ω), Ω, σ̂²(Ω); y).
The estimator of Ω is obtained by maximizing L(Ω; y), then the estimators of β and σ² follow by plug-in:
Ω̂ = arg max_Ω L(Ω; y), β̂ = β̂(Ω̂), σ̂² = σ̂²(Ω̂)
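The two plug-in formulas are a short function. In this sketch I parameterize by the relative covariance G = Ω/σ² (an assumption I make so that V_i = X_i G X_i' + I does not itself depend on σ²); the simulated data are illustrative, with true β = (2, 1) and σ = 1:

```python
import numpy as np

def profiled_beta_sigma2(ys, Xs, G):
    """beta_hat and sigma2_hat from the slide, with G = Omega/sigma2 so that
    V_i = X_i G X_i' + I.  GLS for beta, then a V_i-weighted residual sum for sigma2."""
    A, b = 0.0, 0.0
    for y_i, X_i in zip(ys, Xs):
        V_i = X_i @ G @ X_i.T + np.eye(len(y_i))
        W = np.linalg.solve(V_i, X_i)        # V_i^{-1} X_i
        A = A + X_i.T @ W                    # sum of X_i' V_i^{-1} X_i
        b = b + W.T @ y_i                    # sum of X_i' V_i^{-1} y_i
    beta = np.linalg.solve(A, b)
    num, n_tot = 0.0, 0
    for y_i, X_i in zip(ys, Xs):
        V_i = X_i @ G @ X_i.T + np.eye(len(y_i))
        r = y_i - X_i @ beta
        num += r @ np.linalg.solve(V_i, r)
        n_tot += len(y_i)
    return beta, num / n_tot

rng = np.random.default_rng(5)
t = np.linspace(0.0, 10.0, 6)
Xs = [np.column_stack([np.ones(6), t]) for _ in range(30)]
G = np.diag([1.0, 0.25])
ys = [X @ (np.array([2.0, 1.0]) + rng.multivariate_normal(np.zeros(2), G))
      + rng.normal(0.0, 1.0, 6) for X in Xs]
beta_hat, s2_hat = profiled_beta_sigma2(ys, Xs, G)
assert s2_hat > 0
```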
80 The linear mixed effects model: restricted likelihood estimation
The MLEs Ω̂ and σ̂² underestimate the parameters Ω and σ². [Patterson and Thompson, 1971] propose the restricted maximum likelihood (REML) estimates, obtained by maximizing
L_R(Ω, σ²; y) = ∫ L(β, Ω, σ²; y) dβ
This is equivalent, in a Bayesian framework, to assuming a uniform prior distribution for the fixed effects β.
81 The linear mixed effects model: estimation of the individual parameters
L(θ; y) = ∏_{i=1}^N ∫ p(y_i, η_i; θ) dη_i = ∏_{i=1}^N ∫ p(y_i | η_i; θ) p(η_i; θ) dη_i
= ∏_{i=1}^N ∫ (2πσ²)^{-n_i/2} exp( -(y_i - X_i β - X_i η_i)'(y_i - X_i β - X_i η_i) / (2σ²) ) (2π)^{-q/2} |Ω|^{-1/2} exp( -η_i' Ω^{-1} η_i / 2 ) dη_i
82 The linear mixed effects model: estimation of the individual parameters
Introduce Δ such that Δ'Δ = σ² Ω^{-1}, and the augmented data
ỹ_i = [y_i; 0], X̃_i = [X_i; 0], Z̃_i = [X_i; Δ]
Then the exponent above equals -(ỹ_i - X̃_i β - Z̃_i η_i)'(ỹ_i - X̃_i β - Z̃_i η_i) / (2σ²), so
L(θ; y) = ∏_{i=1}^N C ∫ exp( -(ỹ_i - X̃_i β - Z̃_i η_i)'(ỹ_i - X̃_i β - Z̃_i η_i) / (2σ²) ) dη_i
and by linearity of the model η̂_i = (Z̃_i' Z̃_i)^{-1} Z̃_i' (ỹ_i - X̃_i β̂).
84 The non-linear mixed effects model
y_i = f(X_i, ψ_i) + ε_i = f(X_i, β + η_i) + ε_i
L(θ; y_i) = ∫ (2πσ²)^{-n_i/2} (2π)^{-q/2} |Ω|^{-1/2} exp( -(y_i - f(X_i, β + η_i))'(y_i - f(X_i, β + η_i)) / (2σ²) - η_i' Ω^{-1} η_i / 2 ) dη_i
The likelihood has no explicit form because of the non linearity of the regression function f with respect to η_i. Existing methods are based on approximations or numerical computation of the likelihood.
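One brute-force way to see that the integral is still computable numerically is plain Monte Carlo over the random effect: average p(y_i | η; θ) over draws η ~ N(0, Ω). The sketch below does this for a scalar random effect entering an exponential model nonlinearly (all values illustrative; real software uses Laplace or quadrature approximations instead, as the next slides discuss):

```python
import numpy as np

def mc_log_likelihood(y_i, t_i, beta, omega2, sigma2, n_draws=5000, seed=0):
    """Monte Carlo approximation of log L_i = log E_eta[p(y_i | eta; theta)]
    for a scalar eta ~ N(0, omega2) in the model f(t, beta + eta) = exp(-(beta+eta)*t)."""
    rng = np.random.default_rng(seed)
    eta = rng.normal(0.0, np.sqrt(omega2), n_draws)
    preds = np.exp(-np.outer(beta + eta, t_i))        # n_draws x n_i predictions
    r2 = ((y_i - preds) ** 2).sum(axis=1)
    log_p = -0.5 * (len(y_i) * np.log(2 * np.pi * sigma2) + r2 / sigma2)
    m = log_p.max()
    return m + np.log(np.exp(log_p - m).mean())       # stable log-mean-exp

t_i = np.array([0.5, 1.0, 2.0, 4.0])
y_i = np.exp(-0.4 * t_i)                              # noise-free toy data
ll = mc_log_likelihood(y_i, t_i, beta=0.4, omega2=0.05, sigma2=0.01)
assert np.isfinite(ll)
```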
85 The non-linear mixed effects model: linearization methods
Principle: linearize f to reduce the problem to a linear mixed effects model.
- First order methods (FO) [Beal and Sheiner, 1982]: linearization of f around β (NONMEM software).
- First order conditional methods (FOCE) [Lindstrom and Bates, 1990]: linearization of f around ψ̂_i (NONMEM, SAS proc NLMIXED, Splus/R function nlme).
88 The non-linear mixed effects model: first order method
Linearization of f around β:
f(x_ij, ψ_i) = f(x_ij, β + η_i) = f(x_ij, β) + (∂f/∂ψ)(x_ij, β)' η_i + o(‖η_i‖)
An (approximate) model is deduced: y_ij = f(x_ij, β) + (∂f/∂ψ)(x_ij, β)' η_i + ε_ij
A linear mixed effects model is obtained by plugging in a previously estimated value of β:
y_ij = f(x_ij, β̂) + (∂f/∂ψ)(x_ij, β̂)' η_i + ε_ij
91 The non-linear mixed effects model: first order method, iterative algorithm
1. Penalized nonlinear least squares (PNLS) step: with current estimates Ω̂ and σ̂², the conditional modes of β and η_i are obtained by minimizing
Σ_{i=1}^N { (y_i - f(X_i, β + η_i))'(y_i - f(X_i, β + η_i)) + σ̂² η_i' Ω̂^{-1} η_i }
2. Linear mixed effects (LME) step: first order Taylor expansion of f around β̂:
y_i ≈ f(X_i, β̂ + η̂_i) + (∂f/∂ψ_i)(X_i, β̂) η_i + ε_i
then MLE estimates of Ω and σ².
92 The non-linear mixed effects model: first order conditional estimate method, iterative algorithm
1. Penalized nonlinear least squares (PNLS) step: with current estimates Ω̂ and σ̂², the conditional modes of β and η_i are obtained by minimizing
Σ_{i=1}^N { (y_i - f(X_i, β + η_i))'(y_i - f(X_i, β + η_i)) + σ̂² η_i' Ω̂^{-1} η_i }
2. Linear mixed effects (LME) step: first order Taylor expansion of f around ψ̂_i = β̂ + η̂_i:
y_i ≈ f(X_i, ψ̂_i) + (∂f/∂ψ_i)(X_i, ψ̂_i)' (ψ_i - ψ̂_i) + ε_i
then MLE estimates of Ω and σ².
93 The non-linear mixed effects model: first order conditional estimate method, drawbacks
- Theoretical drawbacks: the statistical properties of the algorithm are not well established.
- Practical drawbacks: very sensitive to the initial guess, does not always converge, poor estimation of some parameters.
94 The non-linear mixed effects model: other classical methods
Methods based on numerical approximations of the likelihood:
- Laplace method [Wolfinger, 1993]
- Gaussian quadrature method [Davidian and Gallant, 1993] (SAS proc NLMIXED)
Properties: theoretically, true maximum likelihood estimation is performed; in practice, limited to a few random effects.
95 Model selection
Several steps:
- Choice of the regression function f
- Choice of the covariate model C_i
- Choice of the random effects model Ω: diagonal matrix, block-diagonal matrix, plain matrix, etc.
99 Model selection
1. Compute the likelihood of the different models. Let θ̂_M be the maximum likelihood estimate of θ for model M: θ̂_M = arg max_θ L_M(θ; y), and let L_M = L_M(θ̂_M; y) be the likelihood of model M. Selecting the most likely model by comparing likelihoods favors models of high dimension (with many parameters)!
2. Penalize the models of high dimension. Select the model M̂ that minimizes the penalized criterion -2 log L_M + pen(M):
- Bayesian Information Criterion (BIC): pen(M) = log(n) dim(M).
- Akaike Information Criterion (AIC): pen(M) = 2 dim(M).
100 Model checking. Computation of: population predictions $f(x_{ij}, \hat\beta)$; individual predictions $f(x_{ij}, \hat\psi_i)$; population residuals $y_{ij} - f(x_{ij}, \hat\beta)$; individual residuals $y_{ij} - f(x_{ij}, \hat\psi_i)$. Plots: population/individual predictions vs observations; population/individual residuals vs population/individual predictions; normality of the residuals.
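These four quantities are direct to compute once a model is fitted. A minimal sketch for one subject; the regression function, data, and parameter values are all made up for illustration.

```python
import numpy as np

t = np.array([0.5, 1.0, 2.0, 4.0])           # design points x_ij
y = np.array([1.9, 1.4, 0.8, 0.3])           # observations y_ij

def f(t, b):
    """Hypothetical regression function f(x, psi)."""
    return 2.5 * np.exp(-b * t)

beta_hat = 0.6                                # estimated population parameter
psi_i_hat = 0.75                              # estimated individual parameter

pop_pred = f(t, beta_hat)                     # population predictions f(x_ij, beta_hat)
ind_pred = f(t, psi_i_hat)                    # individual predictions f(x_ij, psi_i_hat)
pop_resid = y - pop_pred                      # population residuals
ind_resid = y - ind_pred                      # individual residuals
print(ind_resid)
```

Plotting `pop_pred`/`ind_pred` against `y`, and the residuals against the predictions, gives exactly the diagnostic panels listed on the slide.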
101 Pharmacokinetics of Indomethacin. [Figure: indomethacin concentration (mcg/ml) vs time since drug administration (hr)]
$$y_{ij} = \psi_{i1} e^{-\psi_{i2} t_{ij}} + \psi_{i3} e^{-\psi_{i4} t_{ij}} + \varepsilon_{ij} = (\beta_1 + \eta_{i1}) e^{-(\beta_2 + \eta_{i2}) t_{ij}} + (\beta_3 + \eta_{i3}) e^{-(\beta_4 + \eta_{i4}) t_{ij}} + \varepsilon_{ij}$$
⇒ Choice of the covariance matrix $\Omega$
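A sketch of this biexponential model in code. The population values are the diagonal-$\Omega$ estimates reported on the next slide; the random-effect values for the illustrative subject are made up.

```python
import numpy as np

def biexp(t, psi):
    """Biexponential model: psi1*exp(-psi2*t) + psi3*exp(-psi4*t)."""
    a1, r1, a2, r2 = psi
    return a1 * np.exp(-r1 * t) + a2 * np.exp(-r2 * t)

beta = np.array([2.83, 2.10, 0.41, 0.24])   # population estimates beta_1..beta_4
eta_i = np.array([0.2, -0.1, 0.0, 0.0])     # one subject's random effects (made up)

t = np.linspace(0.0, 8.0, 9)
pop_curve = biexp(t, beta)                   # population prediction
ind_curve = biexp(t, beta + eta_i)           # individual prediction
print(pop_curve[0])                          # beta_1 + beta_3 at t = 0
```

The fast term ($\beta_2 \approx 2.1$) dominates the early decline and the slow term ($\beta_4 \approx 0.24$) the tail, which is the usual two-phase shape of the indomethacin data.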
102 Pharmacokinetics of Indomethacin. Diagonal covariance matrix $\Omega$. [Figure: fixed-effects and subject-specific fits, with the model's AIC; indomethacin concentration (mcg/ml) vs time since drug administration (hr)]
$\hat\beta_1 = 2.83$, $\hat\beta_2 = 2.10$, $\hat\beta_3 = 0.41$, $\hat\beta_4 = 0.24$; $\hat\omega_{\beta_1} = 0.56$, $\hat\omega_{\beta_2} = 0.34$, $\hat\omega_{\beta_3} = 0.10$, $\hat\omega_{\beta_4} = 10^{-6}$
103 Pharmacokinetics of Indomethacin. Diagonal covariance matrix $\Omega$. [Figure: population and individual predictions vs observations; population and individual residuals vs the corresponding predictions]
104 Pharmacokinetics of Indomethacin. Plain covariance matrix $\Omega$. [Figure: fixed-effects and subject-specific fits, with the model's AIC; indomethacin concentration (mcg/ml) vs time since drug administration (hr)]
$\hat\beta_1 = 2.84$, $\hat\beta_2 = 2.46$, $\hat\beta_3 = 0.63$, $\hat\beta_4 = 0.28$; $\hat\omega_{\beta_1} = 0.74$, $\hat\omega_{\beta_2} = 0.61$, $\hat\omega_{\beta_3} = 0.40$, $\hat\omega_{\beta_4} = 0.17$
105 Pharmacokinetics of Indomethacin. Plain covariance matrix $\Omega$. [Figure: population and individual predictions vs observations; population and individual residuals vs the corresponding predictions]
106 Pharmacokinetics of Theophylline. [Figure: theophylline concentration in serum (mg/L) vs time since drug administration (hr)]
$$y_{ij} = \frac{Dose \cdot k_{a_i}}{V_i (k_{a_i} - k_{e_i})} \left(e^{-k_{e_i} t_{ij}} - e^{-k_{a_i} t_{ij}}\right) + \varepsilon_{ij}$$
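A sketch of this one-compartment, first-order-absorption model. The parameter values $k_e = 0.08$, $k_a = 1.53$, $V = 0.48$ are the estimates reported two slides later; the dose is a made-up value.

```python
import numpy as np

def one_cpt(t, dose, ka, ke, V):
    """One-compartment model with first-order absorption:
    Dose*ka / (V*(ka-ke)) * (exp(-ke*t) - exp(-ka*t))."""
    return dose * ka / (V * (ka - ke)) * (np.exp(-ke * t) - np.exp(-ka * t))

t = np.linspace(0.0, 24.0, 97)
conc = one_cpt(t, dose=4.0, ka=1.53, ke=0.08, V=0.48)

# The curve starts at 0, rises during absorption, then declines;
# the time of the peak is log(ka/ke) / (ka - ke).
t_max = np.log(1.53 / 0.08) / (1.53 - 0.08)
print(t_max)
```

The rise-then-fall shape produced by the difference of the two exponentials is what makes this the standard model for the oral-dose theophylline data.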
107 Non linear mixed model. Pharmacokinetics of Theophylline. [Figure: fixed-effects and subject-specific fits; theophylline concentration in serum (mg/L) vs time since drug administration (hr)]
$\hat k_e = 0.08$, $\hat k_a = 1.53$, $\hat V = 0.48$; $\hat\omega_{k_e} = 0.02$, $\hat\omega_V = 0.08$
108 Non linear mixed model. Pharmacokinetics of Theophylline. [Figure: population and individual predictions vs observations; population and individual residuals vs the corresponding predictions]
109 Pharmacokinetics of Theophylline. Log parametrisation: with $lk_{e_i} = \log k_{e_i}$, $lk_{a_i} = \log k_{a_i}$ and $lV_i = \log V_i$,
$$y_{ij} = \frac{Dose \cdot k_{a_i}}{V_i (k_{a_i} - k_{e_i})} \left(e^{-k_{e_i} t_{ij}} - e^{-k_{a_i} t_{ij}}\right) + \varepsilon_{ij} = \frac{Dose \cdot e^{lk_{a_i}}}{e^{lV_i} \left(e^{lk_{a_i}} - e^{lk_{e_i}}\right)} \left(e^{-e^{lk_{e_i}} t_{ij}} - e^{-e^{lk_{a_i}} t_{ij}}\right) + \varepsilon_{ij}$$
⇒ Final model: random effects on $lk_e$, $lk_a$ and $V$; covariate effect (weight) on $lk_a$
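The point of the log parametrisation is that the PK parameters stay positive for any real (e.g. Gaussian) value of the individual parameters, while the predicted curve is unchanged. A sketch checking that the two parametrisations coincide; the dose and parameter values are illustrative.

```python
import numpy as np

def one_cpt(t, dose, ka, ke, V):
    """Natural parametrisation."""
    return dose * ka / (V * (ka - ke)) * (np.exp(-ke * t) - np.exp(-ka * t))

def one_cpt_log(t, dose, lka, lke, lV):
    """Log parametrisation: same model with ka = exp(lka), ke = exp(lke), V = exp(lV)."""
    return dose * np.exp(lka) / (np.exp(lV) * (np.exp(lka) - np.exp(lke))) * (
        np.exp(-np.exp(lke) * t) - np.exp(-np.exp(lka) * t))

t = np.linspace(0.0, 24.0, 25)
a = one_cpt(t, 4.0, 1.53, 0.08, 0.48)
b = one_cpt_log(t, 4.0, np.log(1.53), np.log(0.08), np.log(0.48))
print(np.max(np.abs(a - b)))  # numerically zero
```

A Gaussian random effect on $lk_a$ corresponds to a log-normal $k_a$, which is the usual way of keeping rate constants and volumes positive in population PK models.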
110 Non linear mixed model. Pharmacokinetics of Theophylline. [Figure: fixed-effects and subject-specific fits of the final model; theophylline concentration in serum (mg/L) vs time since drug administration (hr)]
111 Non linear mixed model. Pharmacokinetics of Theophylline. [Figure: population and individual predictions vs observations; population and individual residuals vs the corresponding predictions]