NONMEM7_Technical_Guide.doc. Technical Guide on the Expectation-Maximization Population Analysis Methods in the NONMEM 7 Program



Basic Theory

Individual parameters phi ($\phi_i$) to a PK/PD model are assumed to have a random distribution in a population of subjects, typically a normal distribution with mean MU ($\mu_i$) and variance OMEGA ($\Omega$). The mean may in turn be modeled as a function of a set of unknown but to be estimated fixed effects parameters THETA ($\theta$), and a set of covariates, or information about individual $i$, $x_i$. The deviation of the individual parameter $\phi_i$ from its mean is designated $\eta_i$, so that the following relation holds:

\[ \phi_i = \mu_i(\theta_\mu, x_i) + \eta_i \tag{1.1} \]

where $\theta_\mu$ are those thetas that are related to etas through a mu function of the above format. Thus, the distribution of $\phi_i$ can be described as

\[ h(\phi_i \mid \mu_i, \Omega) = \frac{1}{\det(\Omega)^{1/2}} \exp\left(-\tfrac{1}{2}(\phi_i-\mu_i)'\Omega^{-1}(\phi_i-\mu_i)\right) \tag{1.2} \]

The population parameter density $h(\phi_i \mid \mu_i, \Omega)$ is the probability that the particular $\phi_i$ would occur for an individual, given mean population parameters $\mu_i$ and its inter-individual covariance $\Omega$. The distribution of $\eta_i$ is therefore centered about zero (0), and can be described as

\[ h(\eta_i \mid 0, \Omega) = \frac{1}{\det(\Omega)^{1/2}} \exp\left(-\tfrac{1}{2}\eta_i'\Omega^{-1}\eta_i\right) \tag{1.3} \]

Not all fixed effects theta are involved in an eta ($\eta$) relationship as shown above. Those theta that are not exclusively expressed in the PK/PD model or error model via mu() are considered not mu modeled. We shall designate these thetas as $\theta_\sigma$. The entire vector of thetas is then

\[ \theta = \{\theta_\mu, \theta_\sigma\} \tag{1.4} \]

The parameters as designated in NONMEM are THETA for $\theta$, ETA for $\eta$, and OMEGA for $\Omega$. There are also a set of parameters designated as SIGMA in NONMEM, which are never mu modeled, and because in our discussion they will be treated in exactly the same

way as non mu modeled theta, we shall include them in $\theta_\sigma$ to reduce the complexity of the nomenclature. There may also be some etas that are not related to a MU model, in which case

\[ \phi_i = \eta_i \tag{1.5} \]

Thus the phi vector includes all etas, whether or not they are involved in a mu function.

For observed data that are modeled as normally distributed, a predictive function may be evaluated using the individual PK/PD parameters phi, and/or may be modeled directly from fixed effects parameters that are not phi/mu modeled, $f_i(\phi_i, \theta_\sigma)$. In addition, a residual variance matrix $V_i$ describes the uncertainty of the observed values, and may be directly a function of the predicted value $f_i$, sigma parameters and other non-mu modeled thetas, and rarely, individual parameters $\phi_i$: $V_i(f_i, \phi_i, \theta_\sigma)$. The normal data density can be expressed as

\[ l_i(y_i \mid \phi_i, \theta_\sigma) = \frac{1}{\det(V_i)^{1/2}} \exp\left(-\tfrac{1}{2}(y_i-f_i)'V_i^{-1}(y_i-f_i)\right) \tag{1.6} \]

where $l_i(y_i \mid \phi_i, \theta_\sigma)$ is the individual data density, the probability of data $y_i$ occurring for individual $i$, given individual PK/PD parameters $\phi_i$ and fixed effect parameters $\theta_\sigma$ that are not mu modeled.

The joint density of data $y_i$ and $\phi_i$ for an individual is then

\[ p(y_i, \phi_i \mid \mu_i(\theta_\mu), \theta_\sigma, \Omega) = l_i(y_i \mid \phi_i, \theta_\sigma)\, h(\phi_i \mid \mu_i(\theta_\mu), \Omega) \tag{1.7} \]

The product $l_i(y_i \mid \phi_i, \theta_\sigma)\, h(\phi_i \mid \mu_i, \Omega)$ is the joint likelihood density of $\phi_i$ and $y_i$ for a given individual. It is integrated over all possible values of $\phi_i$ for each individual, so that the best population parameters $\mu$ and $\Omega$ are determined by taking into account the joint probability to an individual's data over the entire parameter space of $\phi$, rather than at just one particular location, such as at the individual's best fit. We are therefore interested in evaluating the marginal density of $y_i$ for any given $\mu$ and $\Omega$ (or $\theta_\mu$, $\theta_\sigma$, $\Omega$):

\[ p(y_i \mid \mu_i, \theta_\sigma, \Omega) = \int p(y_i, \phi_i \mid \mu_i, \theta_\sigma, \Omega)\, d\phi_i = \int l_i(y_i \mid \phi_i, \theta_\sigma)\, h(\phi_i \mid \mu_i, \Omega)\, d\phi_i \tag{1.8} \]

for each subject. The total marginal density for all subjects is then

\[ p(y \mid \mu, \theta_\sigma, \Omega) = \prod_i \int p(y_i, \phi_i \mid \mu_i, \theta_\sigma, \Omega)\, d\phi_i \tag{1.9} \]

It is convenient at this stage to use the negative logarithm of the density, and refer to this as the objective function, for each individual:

\[ L_i = -\log\left(\int_{-\infty}^{+\infty} p(y_i, \phi_i \mid \mu_i, \theta_\sigma, \Omega)\, d\phi_i\right) \tag{1.10} \]

and for the total data set:

\[ L = -\log(p(y \mid \mu, \theta_\sigma, \Omega)) = \sum_i L_i \tag{1.11} \]

Thus, the negative logarithm of the parameter density is

\[ -\log(h(\phi_i \mid \mu_i, \Omega)) = \tfrac{1}{2}\log(\det(\Omega)) + \tfrac{1}{2}(\phi_i-\mu_i)'\Omega^{-1}(\phi_i-\mu_i) \tag{1.12} \]

And the negative logarithm of the data density is:

\[ -\log(l_i(y_i \mid \phi_i, \theta_\sigma)) = \tfrac{1}{2}(y_i-f_i)'V_i^{-1}(y_i-f_i) + \tfrac{1}{2}\log(\det(V_i)) \tag{1.13} \]

To fit a model with mean population parameters $\mu$ and population variance $\Omega$ to data $y$, the marginal density (1.9) is to be maximized with respect to $\mu$ and $\Omega$. In practice, as an equivalent process, the negative logarithm of the marginal density (1.11) is minimized. This is the goal of the first order (FO), first order conditional estimation (FOCE/FOCEI), Laplace, iterative two stage (ITS), and expectation maximization (EM) methods.

FOCE and Laplace Methods

Generally the integral of the joint density (1.10) is very difficult to evaluate deterministically, but it may be approximated for the classical methods FOCE and Laplace using a method described by Beal (part VII of NONMEM manuals [1]). The derivation is given in [2], while we will merely report the results. Classical NONMEM (FO, FOCE, and Laplace) does not require the use of mu modeling, so for this section we will use the parameterization of $\theta$, $\eta$ for all parameters, rather than distinguishing between mu modeled

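and non-mu modeled. For example, the individual's joint density may be alternately expressed as

\[ p(y_i, \phi_i \mid \mu_i, \theta_\sigma, \Omega) = l_i(y_i \mid \phi_i, \theta_\sigma)\, h(\phi_i \mid \mu_i, \Omega) = l_i(y_i \mid \eta_i, \theta)\, h(\eta_i \mid 0, \Omega) = p(y_i, \eta_i \mid \theta, \Omega) \tag{1.14} \]

and integration over all values of $\eta$ is equivalent to integrating over all values of $\phi$:

\[ L_i = -\log\left(\int_{-\infty}^{+\infty} p(y_i, \eta_i \mid \theta, \Omega)\, d\eta_i\right) \tag{1.15} \]

In order to integrate the individual's joint density over all $\eta$ using the approximation suggested by Beal, we wish first to determine the set of $\eta$ at the maximum of this joint density, or equivalently, at the minimum of the negative logarithm of the joint density:

\[ -\log(l_i(y_i \mid \eta_i, \theta)\, h(\eta_i \mid 0, \Omega)) = -\log(l_i(y_i \mid \eta_i, \theta)) + \tfrac{1}{2}\log(\det(\Omega)) + \tfrac{1}{2}\eta_i'\Omega^{-1}\eta_i \tag{1.16} \]

We minimize with respect to eta by evaluating

\[ -\frac{\partial \log(l_i(y_i \mid \eta_i, \theta)\, h(\eta_i \mid 0, \Omega))}{\partial \eta_i} = -\frac{\partial \log(l_i(y_i \mid \eta_i, \theta))}{\partial \eta_i} + \Omega^{-1}\eta_i = 0 \tag{1.17} \]

using typical search strategies. The $\eta$ at which equation (1.17) is satisfied is called the mode of the joint density for subject $i$, and shall be designated $\hat\eta_i$ (the hat over the parameter shall refer to a mode or point estimate, whereas the line over a parameter refers to a mean). Finding the $\hat\eta_i$ parameters that provide the maximum of the individual's joint density (the minimum of its negative logarithm) is called mode a posteriori (MAP) estimation. This is then used to evaluate an approximation of the negative logarithm of the integral of the individual's joint density as follows:

\[ L_i = -\log\left(\int_{-\infty}^{+\infty} l_i(y_i \mid \eta_i, \theta)\, h(\eta_i \mid 0, \Omega)\, d\eta_i\right) \approx -\log(l_i(y_i \mid \hat\eta_i, \theta)) + \tfrac{1}{2}\log(\det(\Omega)) + \tfrac{1}{2}\hat\eta_i'\Omega^{-1}\hat\eta_i + \tfrac{1}{2}\log(\det(\Omega^{-1} + S_i(y_i \mid \hat\eta_i, \theta))) = L_{Ni} \tag{1.18} \]

where $S_i(y_i \mid \hat\eta_i, \theta)$ is the Hessian or information matrix of the data density $l_i(y_i \mid \hat\eta_i, \theta)$. The total approximate objective function is in turn, for $m$ subjects:

\[ L_N = \sum_{i=1}^{m} L_{Ni} \tag{1.19} \]

(the subscript N refers to classical NONMEM) where the first three terms of (1.18) are simply the negative logarithm of the joint density evaluated at the mode $\hat\eta_i$, and the last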

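term is ½ of the negative logarithm of the determinant of the variance of $\eta$ under the joint density, $l_i(y_i \mid \eta_i, \theta)\, h(\eta_i \mid 0, \Omega)$. One can therefore think of the joint density evaluated at the mode (that is, the first three terms of equation (1.18)) as the height of the joint density:

\[ -\log(H_i) = -\log(l_i(y_i \mid \hat\eta_i, \theta)) + \tfrac{1}{2}\log(\det(\Omega)) + \tfrac{1}{2}\hat\eta_i'\Omega^{-1}\hat\eta_i \tag{1.20} \]

where $H_i$ is the height of individual $i$'s joint density. Similarly, one may think of the square root of the determinant of the variance under the joint density (whose negative logarithm is the last term of equation (1.18)) as its "width":

\[ -\log(W_i) = \tfrac{1}{2}\log(\det(\Omega^{-1} + S_i(y_i \mid \hat\eta_i, \theta))) \tag{1.21} \]

Thus equation (1.18) represents the negative logarithm of the height multiplied by the width, resulting in the area of the joint density. The $S_i(y_i \mid \hat\eta_i, \theta)$ may be evaluated several ways. One method is to evaluate the second derivative, usually by finite difference methods:

\[ S_i(y_i \mid \eta_i, \theta) = \left\{ -\frac{\partial^2 \log(l_i(y_i \mid \eta_i, \theta))}{\partial \eta_j\, \partial \eta_k} \right\}, \quad j = 1 \text{ to } n,\ k = 1 \text{ to } n \tag{1.22} \]

where {} means the matrix containing these elements. This evaluation is used in the Laplace method in NONMEM. This evaluation guarantees positive definiteness (assuming no numerical difficulties arise) when evaluated at the mode (see appendix B). Another method is by the cross product of the first derivatives of the individual data point densities:

\[ S_i(y_i \mid \eta_i, \theta) = \left\{ \sum_{l=1}^{m_i} \frac{\partial \log(l_{il}(y_{il} \mid \eta_i, \theta))}{\partial \eta_j}\, \frac{\partial \log(l_{il}(y_{il} \mid \eta_i, \theta))}{\partial \eta_k} \right\} \tag{1.23} \]

where $m_i$ is the number of data points for patient $i$. Based on its structure, positive definiteness is guaranteed even when evaluated not at the mode (see appendix B). A third method for evaluating $S_i(y_i \mid \eta_i, \theta)$ is by the expected value of the second derivative:

\[ S_i(y_i \mid \eta_i, \theta) = \left\{ -E\left(\frac{\partial^2 \log(l_i(y_i \mid \eta_i, \theta))}{\partial \eta_j\, \partial \eta_k}\right) \right\} = \left\{ \frac{\partial f_i(t, \eta_i, \theta)'}{\partial \eta_j} V_i^{-1} \frac{\partial f_i(t, \eta_i, \theta)}{\partial \eta_k} + \tfrac{1}{2}\,\mathrm{tr}\left(V_i^{-1}\frac{\partial V_i}{\partial \eta_j} V_i^{-1} \frac{\partial V_i}{\partial \eta_k}\right) \right\} \tag{1.24} \]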

which is also positive definite even when not evaluated at the mode. Equation (1.24) is used as the non-Laplace (CONDITIONAL) method in NONMEM. The Hessian (information) matrix of the joint density, $\Omega^{-1} + S_i$, is in turn

\[ \left\{ -E\left(\frac{\partial^2 \log(l_i(y_i \mid \eta_i, \theta)\, h(\eta_i \mid 0, \Omega))}{\partial \eta_j\, \partial \eta_k}\right) \right\} = \left\{ -E\left(\frac{\partial^2 \log(l_i(y_i \mid \eta_i, \theta))}{\partial \eta_j\, \partial \eta_k}\right) \right\} + \left\{ -E\left(\frac{\partial^2 \log(h(\eta_i \mid 0, \Omega))}{\partial \eta_j\, \partial \eta_k}\right) \right\} = S_i(y_i \mid \eta_i, \theta) + \Omega^{-1} \tag{1.25} \]

and hence its inverse is the variance matrix of $\eta$ under the joint density, as mentioned earlier. Because the sum of two positive definite matrices is itself positive definite, the variance of the joint density as evaluated above is positive definite. For joint densities that are exactly multivariate normally distributed with respect to $\eta$, equation (1.18) evaluates the joint area exactly.

We shall also refer to $S_i(y_i \mid \hat\eta_i, \theta)$ as $\hat S_i$. The $\hat S_i$ must be evaluated at the individual's mode of his joint density, at $\hat\eta_i$, and not at the mean population position of $\eta = 0$, so the INTERACTION option in NONMEM must be used.

Keep in mind that while $\hat\eta_i$ represents the $i$th individual's best fit parameters for its data, based on its joint density, it is only needed here to evaluate the area under the joint density using the above approximation. In other words, we do not need an individual's best fit parameter set theoretically, but we need it practically, in order to evaluate the height of the density and thus approximate the joint density area. There are alternative methods of finding the area without needing to know the individual's best fit parameters, which we will explore later.

Following the evaluation of each individual's objective function in the manner described above, these are summed to form the total approximate objective function $L_N$. NONMEM optimizes $L_N$ with respect to THETAS, OMEGAS, and SIGMAS using a variable metric method, in which $L_N$ is evaluated at a series of values of $\theta$ and $\Omega$, to provide a directional

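search to find the set of $\theta$ and $\Omega$ that optimizes $L_N$. The description of the variable metric method is beyond the scope of this document, but a good reference is [3].

Expectation Maximization (EM) Principles

Expectation-maximization methods separate the process of expectation (integration) from that of maximization. To find improved estimates for the mu modeled $\theta_\mu$, it is convenient to first minimize the negative logarithm of $p(y \mid \mu, \theta_\sigma, \Omega)$ with respect to $\mu$, which is equivalent to maximizing $p(y \mid \mu, \theta_\sigma, \Omega)$. We can do this as follows:

\[ \frac{\partial L_i}{\partial \mu_i} = -\frac{\partial \log\left(\int p(y_i, \phi_i \mid \mu_i, \theta_\sigma, \Omega)\, d\phi_i\right)}{\partial \mu_i} \tag{1.26} \]

\[ = -\frac{\partial\left(\int p(y_i, \phi_i \mid \mu_i, \theta_\sigma, \Omega)\, d\phi_i\right)/\partial \mu_i}{\int p(y_i, \phi_i \mid \mu_i, \theta_\sigma, \Omega)\, d\phi_i} \tag{1.27} \]

\[ = -\frac{\int \left[\partial p(y_i, \phi_i \mid \mu_i, \theta_\sigma, \Omega)/\partial \mu_i\right] d\phi_i}{\int p(y_i, \phi_i \mid \mu_i, \theta_\sigma, \Omega)\, d\phi_i} \tag{1.28} \]

\[ = -\frac{\int \left[\partial \log(p(y_i, \phi_i \mid \mu_i, \theta_\sigma, \Omega))/\partial \mu_i\right] p(y_i, \phi_i \mid \mu_i, \theta_\sigma, \Omega)\, d\phi_i}{\int p(y_i, \phi_i \mid \mu_i, \theta_\sigma, \Omega)\, d\phi_i} \tag{1.29} \]

\[ = -\int \frac{\partial \log(p(y_i, \phi_i \mid \mu_i, \theta_\sigma, \Omega))}{\partial \mu_i}\, z_i(\phi_i \mid y_i, \mu_i, \theta_\sigma, \Omega)\, d\phi_i \tag{1.30} \]

\[ = -E_{\phi,i}\left(\frac{\partial \log(p(y_i, \phi_i \mid \mu_i, \theta_\sigma, \Omega))}{\partial \mu_i}\right) = g_{\mu i} \tag{1.31} \]

where $g_{\mu i}$ is the gradient with respect to $\mu_i$, $E_{\phi,i}(\,)$ represents the expected value when integrating over all $\phi_i$, and

\[ z_i(\phi_i \mid y_i, \mu_i, \theta_\sigma, \Omega) = \frac{p(y_i, \phi_i \mid \mu_i, \theta_\sigma, \Omega)}{\int p(y_i, \phi_i \mid \mu_i, \theta_\sigma, \Omega)\, d\phi_i} \tag{1.32} \]

is called the conditional density of $\phi$ for individual $i$. The conditional density integrated over all possible $\phi$ evaluates to 1:

\[ \int z_i(\phi_i \mid y_i, \mu_i, \theta_\sigma, \Omega)\, d\phi_i = \frac{\int p(y_i, \phi_i \mid \mu_i, \theta_\sigma, \Omega)\, d\phi_i}{\int p(y_i, \phi_i \mid \mu_i, \theta_\sigma, \Omega)\, d\phi_i} = 1 \tag{1.33} \]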

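The relationship

\[ -\frac{\partial \log(p(y_i \mid \mu_i, \theta_\sigma, \Omega))}{\partial \mu_i} = -E_{\phi,i}\left(\frac{\partial \log(p(y_i, \phi_i \mid \mu_i, \theta_\sigma, \Omega))}{\partial \mu_i}\right) \tag{1.34} \]

holds for any joint density $p(y_i, \phi_i \mid \mu_i, \Omega)$. Now, to evaluate specifically for a parameter density $h$ that is multivariate normal:

\[ -\int \frac{\partial \log(p(y_i, \phi_i \mid \mu_i, \theta_\sigma, \Omega))}{\partial \mu_i}\, z_i(\phi_i \mid y_i, \mu_i, \theta_\sigma, \Omega)\, d\phi_i \tag{1.35} \]

\[ = -\int \frac{\partial \log(l_i(y_i \mid \phi_i, \theta_\sigma)\, h(\phi_i \mid \mu_i, \Omega))}{\partial \mu_i}\, z_i(\phi_i \mid y_i, \mu_i, \theta_\sigma, \Omega)\, d\phi_i \tag{1.36} \]

\[ = -\int \Omega^{-1}(\phi_i - \mu_i)\, z_i(\phi_i \mid y_i, \mu_i, \theta_\sigma, \Omega)\, d\phi_i \tag{1.37} \]

\[ = -\Omega^{-1}\int (\phi_i - \mu_i)\, z_i(\phi_i \mid y_i, \mu_i, \theta_\sigma, \Omega)\, d\phi_i = g_{\mu i} \tag{1.38} \]

We can perform the above algebraic manipulation because $\mu$ (and therefore $\theta_\mu$) appears only in the parameter density $h$, but does not appear in the data density $l$. We define

\[ \bar\phi_i = \int \phi_i\, z_i(\phi_i \mid y_i, \mu_i, \theta_\sigma, \Omega)\, d\phi_i \tag{1.39} \]

as the conditional mean $\phi$ vector for individual $i$, so that

\[ \frac{\partial L_i}{\partial \mu_i} = -\Omega^{-1}\int (\phi_i - \mu_i)\, z_i\, d\phi_i = -\Omega^{-1}\left[\int \phi_i\, z_i\, d\phi_i - \mu_i \int z_i\, d\phi_i\right] = -\Omega^{-1}(\bar\phi_i - \mu_i) = g_{\mu i} \tag{1.40} \]

There are several ways of determining $\bar\phi_i$, which are described later and are called the expectation (integrating or averaging) step.

Maximization

To determine the mu modeled theta that reduces the objective function, we must solve:

\[ \frac{\partial L}{\partial \theta_\mu} = \sum_i \frac{\partial L_i}{\partial \theta_\mu} = -\sum_i \left(\frac{\partial \mu_i}{\partial \theta_\mu}\right)' \Omega^{-1}\left(\bar\phi_i - \mu_i(\theta_\mu)\right) = g_{\theta_\mu} \tag{1.41} \]

so that

\[ g_{\theta_\mu} = 0 \tag{1.42} \]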
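The identity in (1.40), that the gradient of $L_i$ with respect to $\mu_i$ equals $-\Omega^{-1}(\bar\phi_i - \mu_i)$, can be checked numerically in one dimension. The following is a minimal sketch, not NONMEM code: the predictive function `exp(phi)`, the data value, the variances, and the integration grid are all hypothetical choices made only for illustration, and the integral is taken by simple trapezoidal quadrature rather than by any method NONMEM uses.

```python
import math

def log_joint(phi, y, mu, omega2, sigma2):
    """log of l(y|phi) * h(phi|mu, omega2); factors of 2*pi omitted as in the text."""
    f = math.exp(phi)  # hypothetical predictive function, chosen for illustration
    log_l = -0.5 * (y - f) ** 2 / sigma2 - 0.5 * math.log(sigma2)
    log_h = -0.5 * (phi - mu) ** 2 / omega2 - 0.5 * math.log(omega2)
    return log_l + log_h

def expectation_step(y, mu, omega2, sigma2, lo=-8.0, hi=8.0, n=4001):
    """Return (L_i, conditional mean of phi) by trapezoidal quadrature on a fixed grid."""
    step = (hi - lo) / (n - 1)
    area = 0.0
    first_moment = 0.0
    for k in range(n):
        phi = lo + k * step
        w = step * (0.5 if k in (0, n - 1) else 1.0)
        p = math.exp(log_joint(phi, y, mu, omega2, sigma2))
        area += w * p
        first_moment += w * phi * p
    return -math.log(area), first_moment / area

# Hypothetical single-subject values
y, mu, omega2, sigma2 = 2.0, 0.3, 0.36, 0.25

L_i, phi_bar = expectation_step(y, mu, omega2, sigma2)
g_mu = -(phi_bar - mu) / omega2          # gradient from the conditional mean, (1.40)

d = 1e-4                                 # central finite difference of L_i in mu
L_plus, _ = expectation_step(y, mu + d, omega2, sigma2)
L_minus, _ = expectation_step(y, mu - d, omega2, sigma2)
g_fd = (L_plus - L_minus) / (2 * d)

print(g_mu, g_fd)
```

The two printed values should agree closely, which illustrates why only the conditional mean, and not a fresh integration of the full objective function, is needed to form the gradient for mu modeled parameters.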

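To evaluate (1.42) fully, an optimization algorithm is necessary which varies $\theta_\mu$ and evaluates $L$ at each $\theta_\mu$. Keep in mind that in addition to $\mu$ varying with $\theta_\mu$, $\bar\phi$ also varies with $\theta_\mu$ through the conditional density $z$, so this minimization process can be computationally expensive. Alternatively, we can perform a limited maximization step in which $\bar\phi$ is kept constant, while only $\mu$ is varied with changes in $\theta_\mu$. This separation of the expectation step from the maximization step is characteristic of the EM algorithm. Evaluating (1.42) by this limited optimization is equivalent to minimizing the following surrogate objective function (keeping $\bar\phi$ constant):

\[
\begin{aligned}
L_c(\hat\theta_\mu) &= -\sum_i E_{\phi,i}\left(\log(h(\phi_i \mid \mu_i(\hat\theta_\mu), \Omega))\right) \\
&= \tfrac{1}{2}\sum_i E_{\phi,i}\left((\phi_i - \mu_i(\hat\theta_\mu))'\Omega^{-1}(\phi_i - \mu_i(\hat\theta_\mu))\right) + \tfrac{1}{2}\sum_i E_{\phi,i}\left(\log(\det(\Omega))\right) \\
&= \tfrac{1}{2}\sum_i (\bar\phi_i - \mu_i(\hat\theta_\mu))'\Omega^{-1}(\bar\phi_i - \mu_i(\hat\theta_\mu)) + \tfrac{1}{2}\sum_i \left[\log(\det(\Omega)) + E_{\phi,i}(\phi_i'\Omega^{-1}\phi_i) - \bar\phi_i'\Omega^{-1}\bar\phi_i\right]
\end{aligned}
\tag{1.43}
\]

where the terms in square brackets do not depend on $\hat\theta_\mu$. The $L_c$ is called the (negative) complete data log likelihood, and it can be shown (see [4]) that any $\hat\theta_\mu$ that reduces $L_c$ will reduce $L$ by an at least equivalent amount, or:

\[ L(\theta_\mu) - L(\hat\theta_\mu) \geq L_c(\theta_\mu) - L_c(\hat\theta_\mu) \tag{1.44} \]

where $\hat\theta_\mu$ is an improved value over the present value $\theta_\mu$. That is, any improvement value $\hat\theta_\mu$ that reduces $L_c$ (where $\bar\phi$ was kept constant) will also reduce $L$ (in which $\bar\phi$ varies with $\theta_\mu$), by at least the same amount as it reduced $L_c$. The easiest way to minimize $L_c$ is to perform a least squares analysis, by producing the following positive definite Hessian matrix:

\[ H_{\theta_\mu} = E\left(\frac{\partial^2 L_c}{\partial \theta_\mu\, \partial \theta_\mu'}\right) = \sum_i \left(\frac{\partial \mu_i}{\partial \theta_\mu}\right)' \Omega^{-1} \left(\frac{\partial \mu_i}{\partial \theta_\mu}\right) \tag{1.45} \]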

and performing the following update with a variable step size $\alpha$:

\[ \hat\theta_\mu = \theta_\mu - \alpha H_{\theta_\mu}^{-1} g_{\theta_\mu} \tag{1.46} \]

This is the maximization step of the EM algorithm. If all of the mu's have linear relationships with respect to theta, then the step size that minimizes $L_c$ with respect to the mu's is $\alpha = 1$. However, if the mu's are not linearly related to thetas, then $\alpha$ must be adjusted to minimize $L_c$ with respect to mu. This can be done by selecting a value $\alpha$, evaluating $\hat L_c$ using the proposed $\hat\theta_\mu$, and if $\hat L_c$ is not smaller than the present $L_c$ evaluated at the present $\theta_\mu$, trying another value of $\alpha$, and so on. In NONMEM, $\alpha = 1$ is first selected and tested, and if necessary $\alpha$ is reduced by geometrical decrements of the square root of 2 until an $\hat L_c$ is found that is less than $L_c$. More elaborate search algorithms (such as conjugate gradient or variable metric methods) for thetas not linearly modeled with respect to mu could be used for the expectation-maximization methods, but no real time savings occurs in doing so for population analysis problems.

In the next iteration, the updated $\theta_\mu$ are used to evaluate a new set of conditional means $\bar\phi_i$ in the expectation step, followed by a limited maximization step to update $\theta_\mu$ again. By repeatedly performing the expectation step (1.39), and evaluating the maximization step as expressed in equations (1.40) through (1.46), the gradient $g_{\theta_\mu}$ becomes smaller, and estimates $\hat\theta_\mu$ that maximize the marginal density (satisfy equation (1.42)) are eventually obtained [4].

Again, because $\theta_\mu$ appeared only in the parameter density $h$ as the mean $\mu$ to this multivariate normal density, and does not appear in the data density $l$, and the parameters to be estimated $\theta_\mu$ appear in the objective function only through $\mu$, this allowed us to obtain a gradient evaluation with a simple construction as given in (1.41).

Those $\theta_\sigma$ that are not expressed in the model through $\mu$ may appear anywhere in the joint density. No shortcut evaluation can be made by maximizing just the parameter density portion. Thus, to optimize the population objective function in these $\theta_\sigma$ as well, we need to differentiate the

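entire joint density. Through a similar process as we showed in differentiating with respect to $\mu$,

\[ \frac{\partial L_i}{\partial \theta_\sigma} \tag{1.47} \]

\[ = -\int \frac{\partial \log(p(y_i, \phi_i \mid \mu_i, \theta_\sigma, \Omega))}{\partial \theta_\sigma}\, z_i(\phi_i \mid y_i, \mu_i, \theta_\sigma, \Omega)\, d\phi_i \tag{1.48} \]

\[ = -E_{\phi,i}\left(\frac{\partial \log(p(y_i, \phi_i \mid \mu_i, \theta_\sigma, \Omega))}{\partial \theta_\sigma}\right) = g_{\theta_\sigma i} \tag{1.49} \]

\[ \frac{\partial L}{\partial \theta_\sigma} = \sum_i g_{\theta_\sigma i} = g_{\theta_\sigma} \tag{1.50} \]

A Hessian matrix may be constructed as follows:

\[ H_{\theta_\sigma} = \sum_i g_{\theta_\sigma i}\, g_{\theta_\sigma i}' \tag{1.51} \]

\[ \hat\theta_\sigma = \theta_\sigma - H_{\theta_\sigma}^{-1} g_{\theta_\sigma} \tag{1.52} \]

To minimize the objective function with respect to the inter-subject variance parameters, we recognize that $\Omega$ is symmetrical, and we must vary only the lower triangular portion of the matrix. Defining $A$ as the lower triangular matrix of $\Omega$, and minimizing with respect to $A$, we have

\[ -\frac{\partial \log(p(y_i \mid \mu_i, \Omega))}{\partial A} = -E_{\phi,i}\left(\frac{\partial \log(p(y_i, \phi_i \mid \mu_i, \Omega))}{\partial A}\right) \tag{1.53} \]

\[ = -\int \frac{\partial \log(l_i(y_i \mid \phi_i)\, h(\phi_i \mid \mu_i, \Omega))}{\partial A}\, z_i(\phi_i \mid y_i, \mu_i, \Omega)\, d\phi_i \tag{1.54} \]

\[ = \int \left[\mathrm{Lower}(R_i) - \tfrac{1}{2}\mathrm{diag}(R_i)\right] z_i(\phi_i \mid y_i, \mu_i, \Omega)\, d\phi_i = g_{A i} \tag{1.55} \]

where

\[ R_i = \Omega^{-1} - \Omega^{-1}(\phi_i - \mu_i)(\phi_i - \mu_i)'\Omega^{-1} \tag{1.56} \]

and $g_{A}$ is the gradient with respect to $A$. The derivation from equation (1.54) to (1.55) requires evaluating partial derivatives of matrix components, the tools for which are derived in appendix A.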

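We define

\[ \bar\Omega_i = \int (\phi_i - \mu_i)(\phi_i - \mu_i)'\, z_i(\phi_i \mid y_i, \mu_i, \Omega)\, d\phi_i = E_{\phi,i}\left((\phi_i - \mu_i)(\phi_i - \mu_i)'\right) \tag{1.57} \]

as the contribution to the evaluated population variance from each individual. Then,

\[ E_{\phi,i}(R_i) = \Omega^{-1} - \Omega^{-1}\bar\Omega_i\Omega^{-1} \tag{1.58} \]

and

\[ \sum_i \left[\mathrm{Lower}\left(E_{\phi,i}(R_i)\right) - \tfrac{1}{2}\mathrm{diag}\left(E_{\phi,i}(R_i)\right)\right] = g_A = 0 \tag{1.59} \]

is equivalent to solving for

\[ \sum_i E_{\phi,i}(R_i) = \sum_i \left(\Omega^{-1} - \Omega^{-1}\bar\Omega_i\Omega^{-1}\right) = 0 \tag{1.60} \]

which suggests the following update for $\Omega$, where $m$ is the number of subjects:

\[ \hat\Omega = \frac{1}{m}\sum_i \bar\Omega_i \tag{1.61} \]

Note that for any given $\Omega$,

\[ m\left[\mathrm{Lower}\left(\Omega^{-1} - \Omega^{-1}\hat\Omega\Omega^{-1}\right) - \tfrac{1}{2}\mathrm{diag}\left(\Omega^{-1} - \Omega^{-1}\hat\Omega\Omega^{-1}\right)\right] = g_A \tag{1.62} \]

Thus, by repeatedly evaluating the expectation step (1.57) and using the result to evaluate the next estimate of the inter-subject variance (maximization step (1.61)), when the output $\hat\Omega$ equals the input $\Omega$, the gradient $g_A$ is equal to 0. Note that equation (1.57) may be rearranged as follows, which will be useful later:

\[ \bar\Omega_i = (\bar\phi_i - \mu_i)(\bar\phi_i - \mu_i)' + \int (\phi_i - \bar\phi_i)(\phi_i - \bar\phi_i)'\, z_i(\phi_i \mid y_i, \mu_i, \Omega)\, d\phi_i \tag{1.63} \]

Defining

\[ \bar B_i = \int (\phi_i - \bar\phi_i)(\phi_i - \bar\phi_i)'\, z_i(\phi_i \mid y_i, \mu_i, \Omega)\, d\phi_i \tag{1.64} \]

as the conditional variance of $\phi$ for individual $i$, then

\[ \bar\Omega_i = (\bar\phi_i - \mu_i)(\bar\phi_i - \mu_i)' + \bar B_i \tag{1.65} \]

so that

\[ \hat\Omega = \frac{1}{m}\sum_i \left[(\bar\phi_i - \mu_i)(\bar\phi_i - \mu_i)' + \bar B_i\right] \tag{1.66} \]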

Thus, the updated inter-subject variance is evaluated as the sum of the sample variance of the conditional means and the average conditional variance. To summarize, the EM algorithm consists of an expectation step evaluating conditional means $\bar\phi_i$ and conditional variances $\bar B_i$, keeping $\mu$ and $\Omega$ constant, followed by a limited maximization step to obtain updated $\mu$ and $\Omega$, keeping $\bar\phi_i$ and $\bar B_i$ constant.

Evaluating the Expectation step

Importance Sampling

One can evaluate the area under the joint density and the other integrals by Monte Carlo techniques. The advantage of these methods is that the actual mathematical expression of the integral is not necessary for its computation, and the precision to which the integral is evaluated depends on the number of random samples generated to evaluate the integral. One Monte Carlo method is to use a sampling function that approximates the joint density, from which one obtains sample values of $\eta$ or $\phi$. One possible sampling function is the multivariate normal density that has mean at $\hat\phi_i$ and variance of $(\Omega^{-1} + \hat S_i)^{-1}$ as described in the previous section. To get these values, therefore, one first maximizes the joint density with respect to $\phi$ (or $\eta$) as one would for a MAP estimation. The negative logarithm of the area of this sampling function is exactly $L_{Ni}$ of equation (1.18). Thus, the purpose of the randomization method is to modify $L_{Ni}$ to the extent that the joint density deviates from this sampling density. In practice, one may start with a sampling function that is somewhat larger, by multiplying the variance by a value $\gamma > 1$: $\gamma(\Omega^{-1} + \hat S_i)^{-1}$. The negative logarithm of the area of this sampling function is then

\[ E_i = -\log(l_i(y_i \mid \hat\phi_i, \theta_\sigma)) + \tfrac{1}{2}\log(\det(\Omega)) + \tfrac{1}{2}(\hat\phi_i - \mu_i)'\Omega^{-1}(\hat\phi_i - \mu_i) + \tfrac{1}{2}\log(\det(\Omega^{-1} + \hat S_i)) - \tfrac{n}{2}\log(\gamma) \tag{1.67} \]

where $n$ is the number of $\phi$ parameters to be integrated, since $\phi$ is integrated to form $L$.

For the $k$th random sample selected from this sampling density, the parameter vector $\phi_{i(k)}$ is used to evaluate the logarithm of the joint density at that position:

\[ \log(\pi(\phi_{i(k)})) = \log(l_i(y_i \mid \phi_{i(k)}, \theta_\sigma)\, h(\phi_{i(k)} \mid \mu_i, \Omega)) \tag{1.68} \]

To evaluate the normalized log of the joint density, we subtract

\[ \log(\pi(\hat\phi_i)) = \log(l_i(y_i \mid \hat\phi_i, \theta_\sigma)\, h(\hat\phi_i \mid \mu_i, \Omega)) \tag{1.69} \]

to obtain

\[ \log(\pi(\phi_{i(k)})) - \log(\pi(\hat\phi_i)) \tag{1.70} \]

so that this normalized log joint density is 0 at the mode $\hat\phi_i$. We also evaluate the logarithm of the normalized sampling function (which is also equal to 0 at $\hat\phi_i$),

\[ \log(e(\phi_{i(k)})) = -\tfrac{1}{2}(\phi_{i(k)} - \hat\phi_i)'(\Omega^{-1} + \hat S_i)(\phi_{i(k)} - \hat\phi_i)/\gamma \tag{1.71} \]

The logarithm of the ratio between joint density and sampling density is then:

\[ q_{i(k)} = \log(\pi(\phi_{i(k)})) - \log(\pi(\hat\phi_i)) - \log(e(\phi_{i(k)})) \tag{1.72} \]

which evaluates to

\[ q_{i(k)} = \log(l_i(y_i \mid \phi_{i(k)}, \theta_\sigma)) - \log(l_i(y_i \mid \hat\phi_i, \theta_\sigma)) + \tfrac{1}{2}(\hat\phi_i - \mu_i)'\Omega^{-1}(\hat\phi_i - \mu_i) - \tfrac{1}{2}(\phi_{i(k)} - \mu_i)'\Omega^{-1}(\phi_{i(k)} - \mu_i) + \tfrac{1}{2}(\phi_{i(k)} - \hat\phi_i)'(\Omega^{-1} + \hat S_i)(\phi_{i(k)} - \hat\phi_i)/\gamma \tag{1.73} \]

Its exponential is the probability of accepting this position by the joint density, relative to the sampling density, which we may consider as a weight:

\[ u_{i(k)} = \exp(q_{i(k)}) \tag{1.74} \]

Thus the following fraction,

\[ \psi_i = \frac{1}{r}\sum_{k=1}^{r} u_{i(k)} \tag{1.75} \]

represents the ratio of the area of the joint density to the area of the sampling density. The $r$ = ISAMPLE is the number of random samples selected for each individual. This fraction is now used to adjust the area of the sampling density $E_i$, which is known, to obtain the true area of the joint density, which is unknown:

\[ L_i = E_i - \log(\psi_i) = -\log(l_i(y_i \mid \hat\phi_i, \theta_\sigma)) + \tfrac{1}{2}\log(\det(\Omega)) + \tfrac{1}{2}(\hat\phi_i - \mu_i)'\Omega^{-1}(\hat\phi_i - \mu_i) + \tfrac{1}{2}\log(\det(\Omega^{-1} + \hat S_i)) - \log(\psi_i\gamma^{n/2}) \tag{1.76} \]

so that $-\log(\psi_i\gamma^{n/2})$ is the correction factor that the randomization method adds to our original area equation to improve its accuracy. In NONMEM, $\gamma$ is continually adjusted so that $\psi_i$ approximates IACCEPT, up to the limit of the boundaries of $\gamma$ being between ISCALE_MIN and ISCALE_MAX (available in NONMEM 7.2).

The above derivation of sample weights and likelihood evaluation for importance sampling, resulting in equations (1.73) and (1.76), was developed to demonstrate that they are based on general principles of obtaining integrals by Monte Carlo methods. These relationships can be simplified by moving all of the elements from $E_i$ to $q_{i(k)}$, given that the components in $E_i$ are constant for all random samples $k$, so that the use of $\exp(q_{i(k)})$ as a weight factor will not be affected. Furthermore, we may generalize for all sampling densities $\phi_{i(k)} \sim N(\mu_s, \gamma\Omega_s)$, by substituting a general sampling density mean $\mu_s$ in place of $\hat\phi_i$, and a general sampling density variance $\gamma\Omega_s$ in place of $\gamma(\Omega^{-1} + \hat S_i)^{-1}$, so that we obtain:

\[ q_{i(k)} = \log(l_i(y_i \mid \phi_{i(k)}, \theta_\sigma)) - \tfrac{1}{2}(\phi_{i(k)} - \mu_i)'\Omega^{-1}(\phi_{i(k)} - \mu_i) - \tfrac{1}{2}\log(\det(\Omega)) + \tfrac{1}{2}(\phi_{i(k)} - \mu_s)'(\gamma\Omega_s)^{-1}(\phi_{i(k)} - \mu_s) + \tfrac{1}{2}\log(\det(\gamma\Omega_s)) \tag{1.77} \]

\[ u_{i(k)} = \exp(q_{i(k)}) \tag{1.78} \]

\[ \psi_i = \frac{1}{r}\sum_{k=1}^{r} u_{i(k)} \tag{1.79} \]

\[ L_i = -\log(\psi_i) \tag{1.80} \]

The new and improved objective function is then

\[ L = \sum_i L_i \tag{1.81} \]
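The generalized weights (1.77) through (1.80) can be illustrated with a one-dimensional sketch. This is not NONMEM's implementation: it assumes a hypothetical linear model $f(\phi) = \phi$ with scalar variances so that the exact answer is known in closed form, it fixes $\gamma = 2$ instead of adapting it toward IACCEPT, and all numerical values and function names are made up for illustration. Factors of $2\pi$ are omitted, following the convention of the densities above.

```python
import math
import random

def importance_sample(y, mu, omega2, sigma2, mu_s, omega_s2,
                      gamma=2.0, r=20000, seed=1):
    """Estimate L_i and the conditional mean for one subject (scalar phi)."""
    rng = random.Random(seed)
    var_s = gamma * omega_s2                  # inflated sampling variance
    sd_s = math.sqrt(var_s)
    total_u = 0.0
    total_u_phi = 0.0
    for _ in range(r):
        phi = rng.gauss(mu_s, sd_s)
        # log data density for the hypothetical model f(phi) = phi
        log_l = -0.5 * (y - phi) ** 2 / sigma2 - 0.5 * math.log(sigma2)
        q = (log_l
             - 0.5 * (phi - mu) ** 2 / omega2 - 0.5 * math.log(omega2)
             + 0.5 * (phi - mu_s) ** 2 / var_s + 0.5 * math.log(var_s))  # (1.77)
        u = math.exp(q)                                                   # (1.78)
        total_u += u
        total_u_phi += u * phi
    psi = total_u / r                                                     # (1.79)
    L_i = -math.log(psi)                                                  # (1.80)
    phi_bar = total_u_phi / total_u   # weighted mean, a Monte Carlo estimate of (1.39)
    return L_i, phi_bar

# Hypothetical single-subject values
y, mu, omega2, sigma2 = 3.0, 1.0, 4.0, 1.0

# For this linear model the mode and (1/omega2 + S)^-1 are available analytically
B_exact = 1.0 / (1.0 / omega2 + 1.0 / sigma2)
phi_exact = B_exact * (mu / omega2 + y / sigma2)

# Closed-form objective for the linear model (same 2*pi convention)
s2 = omega2 + sigma2
L_exact = 0.5 * (y - mu) ** 2 / s2 + 0.5 * math.log(s2)

L_i, phi_bar = importance_sample(y, mu, omega2, sigma2, phi_exact, B_exact)
print(L_i, L_exact)
print(phi_bar, phi_exact)
```

Because the sampling density here is wider than the joint density, the weights stay bounded and the Monte Carlo estimate of $L_i$ settles close to the closed-form value; with a sampling density narrower than the joint density the weights can become highly variable.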

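With this technique, we can also evaluate an improved mean and improved variance-covariance matrix. Letting

\[ z_{i(k)} = \frac{u_{i(k)}}{\sum_{k=1}^{r} u_{i(k)}} \tag{1.82} \]

so that

\[ \sum_{k=1}^{r} z_{i(k)} = 1 \tag{1.83} \]

then

\[ \bar\phi_i = \sum_{k=1}^{r} z_{i(k)}\,\phi_{i(k)} \tag{1.84} \]

\[ \bar B_i = \sum_{k=1}^{r} z_{i(k)}\,(\phi_{i(k)} - \bar\phi_i)(\phi_{i(k)} - \bar\phi_i)' \tag{1.85} \]

(note also that these are means and variance about the means, as indicated by the line above the parameter). The update equations yield the following:

\[ \frac{\partial L}{\partial \theta_\mu} = -\sum_i \left(\frac{\partial \mu_i}{\partial \theta_\mu}\right)'\Omega^{-1}(\bar\phi_i - \mu_i) = g_{\theta_\mu} = 0 \tag{1.86} \]

\[ H_{\theta_\mu} = \sum_i \left(\frac{\partial \mu_i}{\partial \theta_\mu}\right)'\Omega^{-1}\left(\frac{\partial \mu_i}{\partial \theta_\mu}\right) \tag{1.87} \]

The easiest way to maximize is to perform the following updates:

\[ \hat\theta_\mu = \theta_\mu - \alpha H_{\theta_\mu}^{-1} g_{\theta_\mu} \tag{1.88} \]

\[ \hat\mu_i = \mu_i(\hat\theta_\mu) \]

And, according to equations (1.66), (1.64) and (1.57), with $m$ the number of subjects,

\[
\begin{aligned}
\hat\Omega &= \frac{1}{m}\sum_i \left[(\bar\phi_i - \hat\mu_i)(\bar\phi_i - \hat\mu_i)' + \bar B_i\right] \\
&= \frac{1}{m}\sum_i \left[(\bar\phi_i - \hat\mu_i)(\bar\phi_i - \hat\mu_i)' + \sum_{k=1}^{r} z_{i(k)}(\phi_{i(k)} - \bar\phi_i)(\phi_{i(k)} - \bar\phi_i)'\right] \\
&= \frac{1}{m}\sum_i \sum_{k=1}^{r} z_{i(k)}(\phi_{i(k)} - \hat\mu_i)(\phi_{i(k)} - \hat\mu_i)'
\end{aligned}
\tag{1.89}
\]

This is equivalent to performing summary statistics on all of the random samples among all of the individuals. Note that the normalized weights $z_{i(k)}$ defined in equation (1.82) are obtained from sampled evaluations under the joint density $l_i(y_i \mid \phi_i, \theta_\sigma)\, h(\phi_i \mid \mu_i, \Omega)$, and are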

therefore empirical evaluations of the conditional density of equation (1.32). As the number of samples approaches infinity ($r \to \infty$), equations (1.88) and (1.89) approach the exact evaluation of updates that are required, as expressed in equations (1.39) and (1.66).

For subsequent iterations, the Monte Carlo evaluated conditional mean and variances of the previous iteration for that subject may be used as the parameters to the sampling density. This multivariate density has mean at $\bar\phi_{ip}$ and variance of $\bar B_{ip}$, so we sample from $\phi_{i(k)} \sim N(\bar\phi_{ip}, \gamma\bar B_{ip})$, where subscript $p$ refers to the previous iteration, so the pertinent weighting function is:

\[ q_{i(k)} = \log(l_i(y_i \mid \phi_{i(k)}, \theta_\sigma)) - \tfrac{1}{2}(\phi_{i(k)} - \mu_i)'\Omega^{-1}(\phi_{i(k)} - \mu_i) - \tfrac{1}{2}\log(\det(\Omega)) + \tfrac{1}{2}(\phi_{i(k)} - \bar\phi_{ip})'(\gamma\bar B_{ip})^{-1}(\phi_{i(k)} - \bar\phi_{ip}) + \tfrac{1}{2}\log(\det(\gamma\bar B_{ip})) \tag{1.90} \]

followed by

\[ u_{i(k)} = \exp(q_{i(k)}) \tag{1.91} \]

\[ \psi_i = \frac{1}{r}\sum_{k=1}^{r} u_{i(k)} \tag{1.92} \]

\[ L_i = -\log(\psi_i) \tag{1.93} \]

and the additional computations are carried out as before. Whether the parameters to the proposal density are obtained from a MAP estimation, or from conditional means and variances determined from a previous iteration, depends on whether METHOD=IMP or METHOD=IMPMAP is used, and on the settings of MAPITER and MAPINTER (available in NONMEM 7.2).

Those $\theta_\sigma$ that are not expressed in the model through $\mu$ may appear anywhere in the likelihood. To optimize the population objective function in these $\theta_\sigma$ as well, we need to perform a finite difference on the entire likelihood for each non-mu modeled $\theta_\sigma$ of $L$:

\[ \frac{\partial L_i}{\partial \theta_\sigma} \approx \frac{L_i(\theta_\sigma + \Delta\theta_\sigma) - L_i(\theta_\sigma)}{\Delta\theta_\sigma} \tag{1.94} \]

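\[ = \int \frac{-\log(p(y_i, \phi_i \mid \mu_i, \theta_\sigma + \Delta\theta_\sigma, \Omega)) + \log(p(y_i, \phi_i \mid \mu_i, \theta_\sigma, \Omega))}{\Delta\theta_\sigma}\, z_i(\phi_i \mid y_i, \mu_i, \theta_\sigma, \Omega)\, d\phi_i \tag{1.95} \]

\[ \approx \sum_{k=1}^{r} z_{i(k)}\, \frac{-\log(p(y_i, \phi_{i(k)} \mid \mu_i, \theta_\sigma + \Delta\theta_\sigma, \Omega)) + \log(p(y_i, \phi_{i(k)} \mid \mu_i, \theta_\sigma, \Omega))}{\Delta\theta_\sigma} \tag{1.96} \]

\[ \approx -E_{\phi,i}\left(\frac{\partial \log(p(y_i, \phi_i \mid \mu_i, \theta_\sigma, \Omega))}{\partial \theta_\sigma}\right) = g_{\theta_\sigma i} \tag{1.97} \]

where

\[ \frac{\partial L}{\partial \theta_\sigma} = \sum_i g_{\theta_\sigma i} = g_{\theta_\sigma} = 0 \tag{1.98} \]

and $g_{\theta_\sigma}$ is the vector of gradients over all non-mu modeled $\theta_\sigma$. A Hessian matrix may be constructed as follows:

\[ H_{\theta_\sigma} = \sum_i g_{\theta_\sigma i}\, g_{\theta_\sigma i}' \tag{1.99} \]

\[ \hat\theta_\sigma = \theta_\sigma - \alpha H_{\theta_\sigma}^{-1} g_{\theta_\sigma} \tag{1.100} \]

We now consider the computational expense for importance sampling required to update mu modeled theta parameters versus non-mu modeled parameters. For complex PK/PD problems that use numerical integration ($DES), the greatest computational expense is in evaluating the predictive function $f_i(t, \phi_i, \theta_\sigma)$. The evaluation of the individual objective function $-\log(p(y_i, \phi_i \mid \mu_i, \theta_\sigma, \Omega)) = -\log(l_i(y_i \mid \phi_i, \theta_\sigma)\, h(\phi_i \mid \mu_i, \Omega))$, in particular the data likelihood portion $l_i(y_i \mid \phi_i, \theta_\sigma)$, requires evaluation of $f_i(t, \phi_i, \theta_\sigma)$ for every observed value of subject $i$. In importance sampling, the individual likelihood is evaluated $r$ times in the evaluation of the conditional means and variances, per subject per iteration, regardless of how many mu modeled parameters are to be evaluated, according to equation (1.84). Once the conditional means and variances are determined, the individual objective function is no longer needed to evaluate the update for these thetas, according to equations (1.86) and (1.88).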
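The weighted forward difference of (1.96) can be sketched for a single non-mu modeled parameter. The example below is an illustration under stated assumptions rather than NONMEM's procedure: it uses a hypothetical scalar model $f(\phi) = \phi$ in which the residual standard deviation `sigma` plays the role of $\theta_\sigma$, it draws samples directly from the known conditional density so that all normalized weights are equal, and every number and function name is invented for the example.

```python
import math
import random

def neg_log_joint(y, phi, mu, omega2, sigma):
    """-log(l*h) for the hypothetical model f(phi) = phi (2*pi factors omitted)."""
    return (0.5 * (y - phi) ** 2 / sigma ** 2 + math.log(sigma)
            + 0.5 * (phi - mu) ** 2 / omega2 + 0.5 * math.log(omega2))

def fd_gradient(y, mu, omega2, sigma, samples, weights, delta=1e-5):
    """Weighted forward difference over the samples, in the spirit of (1.96).

    Each sample costs one extra evaluation of the joint density at
    sigma + delta, which is where the n * r evaluation count comes from.
    """
    g = 0.0
    for phi, z in zip(samples, weights):
        g += z * (neg_log_joint(y, phi, mu, omega2, sigma + delta)
                  - neg_log_joint(y, phi, mu, omega2, sigma)) / delta
    return g

rng = random.Random(2)
y, mu, omega2, sigma = 3.0, 1.0, 4.0, 1.0          # hypothetical values

# Exact conditional density of phi for this linear model
B = 1.0 / (1.0 / omega2 + 1.0 / sigma ** 2)
phi_bar = B * (mu / omega2 + y / sigma ** 2)

r = 20000
samples = [rng.gauss(phi_bar, math.sqrt(B)) for _ in range(r)]
weights = [1.0 / r] * r                            # equal z_(k) for exact draws

g_sigma = fd_gradient(y, mu, omega2, sigma, samples, weights)

# Closed-form derivative of L_i = (y-mu)^2 / (2 s2) + log(s2) / 2 with s2 = omega2 + sigma^2
s2 = omega2 + sigma ** 2
g_exact = sigma * (1.0 / s2 - (y - mu) ** 2 / s2 ** 2)
print(g_sigma, g_exact)
```

In a real problem the weights would come from importance sampling, and each extra joint density evaluation would involve the full model, which is why non-mu modeled parameters are the more expensive ones to update.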

For non-mu modeled parameters, however, equations (1.96), (1.97), (1.98), (1.99), and (1.100) suggest that $n \times r$ individual objective function calls are required, where $n$ here is the number of non-mu modeled parameters, one for each $\log(p(y_i, \phi_i \mid \mu_i, \theta_\sigma + \Delta\theta_\sigma, \Omega))$ evaluation.

There is a sub-class of non-mu modeled parameters for which some computational efficiency can be gained, and these are the SIGMA parameters, or "Sigma-like" theta parameters. Such parameters are not used in evaluating the predictive function $f_i(t, \phi_i, \theta_\sigma)$, the most computationally expensive component, but only in evaluating the residual variance $V_i(f_i, \theta_\sigma)$, so NONMEM uses the same $f_i(t, \phi_i, \theta_\sigma)$ in evaluating $\log(l_i(y_i \mid \phi_i, \theta_\sigma))$ as well as $\log(l_i(y_i \mid \phi_i, \theta_\sigma + \Delta\theta_\sigma))$ during the finite difference step, and will not re-evaluate $f_i$:

\[ -\log(l_i(y_i \mid \phi_i, \theta_\sigma)) = \tfrac{1}{2}[y_i - f_i(\phi_i)]'\, V_i^{-1}(f_i, \theta_\sigma)\, [y_i - f_i(\phi_i)] + \tfrac{1}{2}\log(\det(V_i(f_i, \theta_\sigma))) \tag{1.101} \]

\[ -\log(l_i(y_i \mid \phi_i, \theta_\sigma + \Delta\theta_\sigma)) = \tfrac{1}{2}[y_i - f_i(\phi_i)]'\, V_i^{-1}(f_i, \theta_\sigma + \Delta\theta_\sigma)\, [y_i - f_i(\phi_i)] + \tfrac{1}{2}\log(\det(V_i(f_i, \theta_\sigma + \Delta\theta_\sigma))) \tag{1.102} \]

Note that for these parameters only the $V_i(f_i, \theta_\sigma)$ has to be re-evaluated (as the Y value in the NONMEM control stream file), which is usually a simple algebraic relation. SIGMA parameters are automatically recognized by NONMEM as those for which it can make this short-cut. THETA parameters that are used only in evaluating the residual variance (in the evaluation of Y in the control stream file) but not, directly or indirectly, in evaluating the predictive function (in the evaluation of F in the control stream file), may be given an "S" designation in the GRD setting of $EST, and only then will NONMEM utilize the short cut for evaluating their partial derivatives.

Sigma parameters (but not Sigma-like THETA parameters) can additionally be updated efficiently by evaluating their partial derivative gradient contributions analytically, as given in Appendix F (available in NONMEM 7.2). However, if the user specifies the Sigma's

GRD value with an N, then their partial derivatives are evaluated numerically by the finite difference method.

In general therefore, it is best to mu model THETA parameters whenever possible, to take advantage of the efficiency available for EM methods, and to specify when thetas may be considered Sigma-like, or to take advantage of modeling residual variances via SIGMA parameters, as much as possible.

Direct Sampling (available in NONMEM 7.2)

Direct sampling is much less efficient than importance sampling, and can require tens of thousands of random samples or more per subject to properly obtain conditional means and variances. Direct sampling does not use an importance region sampling density, but creates samples $\phi_{i(k)}$ directly from the normal distribution population parameter density: $\phi_{i(k)} \sim N(\mu_i, \Omega)$ (see [5]). The following weight is then associated with the sample, based on the appropriate substitutions into equation (1.77):

\[ u_{i(k)} = l_i(y_i \mid \phi_{i(k)}, \theta_\sigma) \tag{1.103} \]

Conditional means and variances are obtained as shown earlier:

\[ z_{i(k)} = \frac{u_{i(k)}}{\sum_{k=1}^{r} u_{i(k)}} \tag{1.104} \]

\[ \bar\phi_i = \sum_{k=1}^{r} z_{i(k)}\,\phi_{i(k)} \tag{1.105} \]

\[ \bar B_i = \sum_{k=1}^{r} z_{i(k)}\,(\phi_{i(k)} - \bar\phi_i)(\phi_{i(k)} - \bar\phi_i)' \tag{1.106} \]

As with importance sampling, an average of weights is obtained,

\[ \psi_i = \frac{1}{r}\sum_{k=1}^{r} u_{i(k)} \tag{1.107} \]

from which the integrated objective function is obtained

\[ L_i = -\log(\psi_i) \tag{1.108} \]

Iterative Two Stage

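Iterative two stage approximates the expectation step by using the conditional modes and approximate conditional variances that are evaluated during the MAP estimation that is also used in the FOCE or LAPLACE methods, as described earlier. We can consider an approximate update for mu modeled thetas that is applied in iterative two stage, by evaluating:

\[ \frac{\partial L}{\partial \theta_\mu} \approx \frac{\partial L_A}{\partial \theta_\mu} = -\sum_i \left(\frac{\partial \mu_i}{\partial \theta_\mu}\right)'\Omega^{-1}(\hat\phi_i - \mu_i) = -\sum_i \left(\frac{\partial \mu_i}{\partial \theta_\mu}\right)'\Omega^{-1}\hat\eta_i = g_{\theta_\mu} = 0 \tag{1.109} \]

where subscript A refers to approximate. This is an approximation to the extent that the mode $\hat\phi_i = \mu_i(\theta_\mu) + \hat\eta_i$ approximates the mean $\bar\phi_i$. Then as before,

\[ \frac{\partial L}{\partial \theta_\mu} \approx \frac{\partial L_A}{\partial \theta_\mu} = -\sum_i \left(\frac{\partial \mu_i}{\partial \theta_\mu}\right)'\Omega^{-1}\hat\eta_i = \sum_i g_{\theta_\mu i} = g_{\theta_\mu} = 0 \tag{1.110} \]

We then perform a Gauss-Newton update:

\[ H_{\theta_\mu} = \sum_i g_{\theta_\mu i}\, g_{\theta_\mu i}' \tag{1.111} \]

\[ \hat\theta_\mu = \theta_\mu - \alpha H_{\theta_\mu}^{-1} g_{\theta_\mu} \tag{1.112} \]

Then, updating the mus:

\[ \hat\mu_i = \mu_i(\hat\theta_\mu) \]

Similarly, to update Omega, iterative two stage approximates update (1.66), with $m$ the number of subjects:

\[ \hat\Omega = \frac{1}{m}\sum_i \left[(\bar\phi_i - \hat\mu_i)(\bar\phi_i - \hat\mu_i)' + \bar B_i\right] \tag{1.113} \]

\[ \approx \frac{1}{m}\sum_i \left[(\hat\phi_i - \hat\mu_i)(\hat\phi_i - \hat\mu_i)' + \hat B_i\right] \tag{1.114} \]

where

\[ \hat B_i = (\Omega^{-1} + \hat S_i)^{-1} \tag{1.115} \]

is the approximate conditional variance evaluated during the FOCE or LAPLACE integration step.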
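The iterative two stage cycle above can be sketched for a toy problem. The following is an illustration only, under several assumptions that are not from the guide: each subject has a single observation with the hypothetical model $f(\phi) = \phi$, there is a single theta with $\mu_i = \theta$, the residual variance is known, and the theta update is written as the average of the modes (the form the step takes for a linear mu model with $\alpha = 1$). In this linear case the mode coincides with the conditional mean, so the iteration should converge to the exact maximum likelihood values, which gives a simple check.

```python
import random

def its_fit(ys, sigma2, mu=0.0, omega2=1.0, iterations=200):
    """Iterative two stage for y_i = phi_i + eps, phi_i ~ N(mu, omega2), eps ~ N(0, sigma2)."""
    m = len(ys)
    for _ in range(iterations):
        # MAP step: conditional mode and approximate conditional variance (1.115);
        # for f(phi) = phi the information of the data density is S = 1 / sigma2
        B_hat = 1.0 / (1.0 / omega2 + 1.0 / sigma2)
        modes = [B_hat * (mu / omega2 + y / sigma2) for y in ys]
        # Update of mu: average of the individual modes (mu_i = theta for every subject)
        mu = sum(modes) / m
        # Update of omega2 following (1.114)
        omega2 = sum((p - mu) ** 2 + B_hat for p in modes) / m
    return mu, omega2

# Simulated data with hypothetical true values mu = 2, omega2 = 4, sigma2 = 1
rng = random.Random(0)
sigma2 = 1.0
ys = [rng.gauss(2.0, 2.0) + rng.gauss(0.0, 1.0) for _ in range(200)]

mu_hat, omega2_hat = its_fit(ys, sigma2)

# For this linear model the maximum likelihood solution is known in closed form
y_mean = sum(ys) / len(ys)
y_var = sum((y - y_mean) ** 2 for y in ys) / len(ys)
print(mu_hat, y_mean)
print(omega2_hat, y_var - sigma2)
```

For a nonlinear model the modes would have to be found by a search, the modes and conditional means would no longer coincide, and the same iteration would then only approximate the exact EM updates.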

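The approximate optimization of the iterative two stage method is related to an approximate optimization of FOCE's or Laplace's $L_N$. To consider optimizing $L_N$ for mu modeled thetas, we can conveniently rephrase equation (1.18) as

\[ L_{Ni} = -\log(l_i(y_i \mid \hat\phi_i, \theta_\sigma)) + \tfrac{1}{2}\log(\det(\Omega)) + \tfrac{1}{2}(\hat\phi_i - \mu_i)'\Omega^{-1}(\hat\phi_i - \mu_i) + \tfrac{1}{2}\log(\det(\Omega^{-1} + S_i(y_i \mid \hat\eta_i, \theta))) \tag{1.116} \]

Since $\hat\phi_i$ is at the mode of the posterior density, then

\[ -\left.\frac{\partial \log(l_i(y_i \mid \phi_i, \theta_\sigma))}{\partial \phi_i}\right|_{\hat\phi_i} + \Omega^{-1}(\hat\phi_i - \mu_i) = 0 \tag{1.117} \]

It follows that:

\[ \frac{\partial L_N}{\partial \mu} = \sum_i \frac{\partial L_{Ni}}{\partial \mu_i} = \sum_i \left\{ \left(\frac{\partial \hat\phi_i}{\partial \mu_i}\right)'\left( -\frac{\partial \log(l_i(y_i \mid \hat\phi_i, \theta_\sigma))}{\partial \hat\phi_i} + \Omega^{-1}(\hat\phi_i - \mu_i) \right) - \Omega^{-1}(\hat\phi_i - \mu_i) + \tfrac{1}{2}\left(\frac{\partial \hat\eta_i}{\partial \mu_i}\right)'\frac{\partial \log(\det(\Omega^{-1} + S_i(y_i \mid \hat\eta_i, \theta)))}{\partial \hat\eta_i} \right\} \tag{1.118} \]

\[ = \sum_i \left\{ -\Omega^{-1}\hat\eta_i + \tfrac{1}{2}\left(\frac{\partial \hat\eta_i}{\partial \mu_i}\right)'\frac{\partial \log(\det(\Omega^{-1} + S_i(y_i \mid \hat\eta_i, \theta)))}{\partial \hat\eta_i} \right\} + 0 \tag{1.119} \]

where the term in parentheses cancels because of equation (1.117). Comparing equation (1.119) with that of (1.110) shows that iterative two stage only approximates the optimization of FOCE's $L_N$, because it does not include the contribution of the change in the information matrix of the joint density with respect to theta. Similarly, we consider differentiating FOCE's objective function with respect to OMEGA:

\[
\begin{aligned}
\frac{\partial L_N}{\partial \Omega^{-1}} = \sum_i \frac{\partial L_{Ni}}{\partial \Omega^{-1}} = \sum_i \Bigg\{ & \left( -\frac{\partial \log(l_i(y_i \mid \hat\phi_i, \theta_\sigma))}{\partial \hat\phi_i} + \Omega^{-1}(\hat\phi_i - \mu_i) \right)\frac{\partial \hat\phi_i}{\partial \Omega^{-1}} + \tfrac{1}{2}\frac{\partial \log(\det(\Omega))}{\partial \Omega^{-1}} \\
& + \tfrac{1}{2}\frac{\partial\, (\hat\phi_i - \mu_i)'\Omega^{-1}(\hat\phi_i - \mu_i)}{\partial \Omega^{-1}} + \tfrac{1}{2}\frac{\partial \log(\det(\Omega^{-1} + S_i(y_i \mid \hat\eta_i, \theta)))}{\partial \Omega^{-1}} \\
& + \tfrac{1}{2}\frac{\partial \log(\det(\Omega^{-1} + S_i(y_i \mid \hat\eta_i, \theta)))}{\partial \hat\eta_i}\frac{\partial \hat\eta_i}{\partial \Omega^{-1}} \Bigg\}
\end{aligned}
\tag{1.120}
\]

\[ = \tfrac{1}{2}\sum_i \left\{ -\Omega + \hat\eta_i\hat\eta_i' + (\Omega^{-1} + \hat S_i)^{-1} + \frac{\partial \log(\det(\Omega^{-1} + S_i))}{\partial \hat\eta_i}\frac{\partial \hat\eta_i}{\partial \Omega^{-1}} \right\} + 0 \tag{1.121} \]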

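Here we differentiate the objective function with respect to $\Omega^{-1}$ rather than with respect to $\Omega$ for convenience. When the gradient with respect to $\Omega^{-1}$ equals 0, then the gradient with respect to $\Omega$ also equals 0. The details of the linear algebra manipulations leading to the last part of (1.121) are given in appendix A. Then we can express the exact minimization of $L_N$ with respect to Omega as:

\[ 2\frac{\partial L_N}{\partial \Omega^{-1}} = \sum_i \left\{ -\Omega + \hat\eta_i\hat\eta_i' + \hat B_i + \frac{\partial \log(\det(\Omega^{-1} + S_i))}{\partial \hat\eta_i}\frac{\partial \hat\eta_i}{\partial \Omega^{-1}} \right\} = 0 \tag{1.122} \]

Note that $\hat B_i$ represents a linearized approximation to the true conditional variance $\bar B_i$. We may consider an approximate partial gradient to $L_N$ with respect to Omega as:

\[ 2\frac{\partial L_N}{\partial \Omega^{-1}} \approx 2\frac{\partial L_{NA}}{\partial \Omega^{-1}} = \sum_i \left\{ -\Omega + \hat\eta_i\hat\eta_i' + \hat B_i \right\} = 0 \tag{1.123} \]

Solving for the next estimate of Omega from equation (1.123), with $m$ the number of subjects:

\[ \hat\Omega = \frac{1}{m}\sum_i \left(\hat\eta_i\hat\eta_i' + \hat B_i\right) \tag{1.124} \]

which is similar to the update of Omega in the iterative two stage algorithm (1.114).

To summarize, in iterative two stage, $\mu$ is updated using the average of the modes of the individual joint densities, which serves only as an approximation to the more precise update using the average of the means of the $\phi_i$ under the joint density, as dictated by the exact equation (1.39). If there is a skewness to each individual's joint density, such that the means tend to differ systematically from the modes, then the iterative two stage update may yield biased results.

For non-mu modeled theta, iterative two stage in NONMEM falls back on a forward difference evaluation of the full likelihood:

\[ \frac{\partial L_{Ni}}{\partial \theta_\sigma} \approx \frac{L_{Ni}(\theta_\sigma + \Delta\theta_\sigma) - L_{Ni}(\theta_\sigma)}{\Delta\theta_\sigma} = g_{\theta_\sigma i} \tag{1.125} \]

followed by a single step Gauss-Newton update:

\[ H_{\theta_\sigma} = \sum_i g_{\theta_\sigma i}\, g_{\theta_\sigma i}' \tag{1.126} \]

\[ \hat\theta_\sigma = \theta_\sigma - H_{\theta_\sigma}^{-1} g_{\theta_\sigma} \tag{1.127} \]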

To summarize, the NONMEM FOCE method optimizes $L_N$, which is an approximation to the true objective function $L$, and iterative two stage further approximates the optimization of $L_N$.

All iterative update methods that rely on updating the population parameters using the average of the individual estimates guarantee centeredness of the population parameters about the individual parameters by definition. However, because the FOCE NONMEM method uses a search algorithm on an approximate objective function, it does not guarantee centeredness. One can impose the CENTERING option on the estimation process in NONMEM, which then optimizes a modified objective function of equation (1.18):

\[ L_{Ni} = -\log(l_i(y_i \mid (\hat\eta_i - \bar{\hat\eta}), \theta)) + \tfrac{1}{2}\log(\det(\Omega)) + \tfrac{1}{2}(\hat\eta_i - \bar{\hat\eta})'\Omega^{-1}(\hat\eta_i - \bar{\hat\eta}) + \tfrac{1}{2}\log(\det(\Omega^{-1} + S_i(y_i \mid \hat\eta_i, \theta))) \tag{1.128} \]

where

\[ \bar{\hat\eta} = \frac{1}{m}\sum_i \hat\eta_i \tag{1.129} \]

to ensure statistical centering, although not exact centering.

The MCMC method of Expectation in SAEM

In Markov Chain Monte Carlo sampling, used in the SAEM and BAYES methods, samples are generated from a larger variety of proposal densities than in importance sampling. As implemented in NONMEM, for a given set of population parameters mu and omega, proposed parameters phi for each individual are generated by a three mode process. The following is based on references [6] and [7].

During mode 1, a vector of model parameters, denoted here $\phi_i^{*}$, is generated from the following proposal density or kernel:

\[ \log(k_1(\phi_i^{*})) = \log(N(\phi_i^{*} \mid \mu_i, \Omega)) = \log(h(\phi_i^{*} \mid \mu_i, \Omega)) = -\tfrac{1}{2}(\phi_i^{*} - \mu_i)'\Omega^{-1}(\phi_i^{*} - \mu_i) - \tfrac{1}{2}\log|\Omega| \tag{1.130} \]

For the acceptance test, we need to evaluate the above density along with the following backward density, at the present $\phi$ for subject $i$:

$$\log(k_1(\phi)) = \log(N(\phi \mid \mu, \Omega)) = \log(h(\phi \mid \mu, \Omega)) = -\tfrac{1}{2}(\phi - \mu)'\,\Omega^{-1}(\phi - \mu) - \tfrac{1}{2}\log|\Omega| \qquad (1.131)$$

Also, the joint density is evaluated at the present $\phi$:

$$\log(\pi(\phi)) = \log(p(y_i, \phi \mid \theta, \mu, \Omega)) = \log(l(y_i \mid \phi, \theta)) + \log(h(\phi \mid \mu, \Omega)) \qquad (1.132)$$

and at the proposed $\phi^{*}$:

$$\log(\pi(\phi^{*})) = \log(p(y_i, \phi^{*} \mid \theta, \mu, \Omega)) = \log(l(y_i \mid \phi^{*}, \theta)) + \log(h(\phi^{*} \mid \mu, \Omega)) \qquad (1.133)$$

Then the test statistic is created:

$$t_1 = \log(\pi(\phi^{*})) - \log(\pi(\phi)) + \log(k_1(\phi)) - \log(k_1(\phi^{*})) = \log(l(y_i \mid \phi^{*}, \theta)) - \log(l(y_i \mid \phi, \theta)) \qquad (1.134)$$

A uniform random deviate $u$ is then generated, log transformed, and if

$$\log(u) < t_1 \qquad (1.135)$$

then the proposed sample set $\phi^{*}$ of parameters is accepted and becomes the new $\phi$ for subject $i$.

Following mode 1 sampling, proposal kernel mode 1A sampling and testing is performed, in which a sample from one of the other subjects is randomly selected. It is assumed that the set of parameters among subjects is normally distributed with mean and variance of the present $\mu$ and $\Omega$. Thus, the statistic $t_1$ of equation (1.134) is used as the acceptance test. This method has limited use: it assists certain subjects in finding good parameter values by borrowing from their neighbors, in case the neighbors had obtained good values. This mode should generally not be used, and can be inaccurate if not all subjects share the same $\mu$ and $\Omega$, such as in covariate modeling. Alternatively, use mode 1A sampling at the beginning of an SAEM analysis for a few burn-in iterations, then continue with a complete SAEM analysis with mode 1A sampling turned off, with more burn-in and accumulated sampling iterations.
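The mode 1 acceptance test can be illustrated for a single subject with a scalar $\phi$. The sketch below assumes a user-supplied log data density `log_lik` and uses hypothetical names throughout; it shows only the test of (1.134)-(1.135), where the proposal equals the population density so that only the likelihood ratio remains.

```python
import math
import random
from typing import Callable, List


def mode1_chain(
    log_lik: Callable[[float], float],  # log l(y_i | phi), supplied by the caller
    mu: float,
    omega: float,
    n_steps: int,
    seed: int = 0,
) -> List[float]:
    """Independence sampler: propose phi* ~ N(mu, omega), test on the likelihood ratio."""
    rng = random.Random(seed)
    sd = math.sqrt(omega)
    phi = mu  # start at the population mean (an arbitrary choice)
    ll_phi = log_lik(phi)
    chain: List[float] = []
    for _ in range(n_steps):
        phi_star = rng.gauss(mu, sd)
        ll_star = log_lik(phi_star)
        # The kernel is h itself, so h cancels and t1 is the log likelihood ratio.
        t1 = ll_star - ll_phi
        u = 1.0 - rng.random()  # uniform on (0, 1], keeps log(u) finite
        if math.log(u) < t1:
            phi, ll_phi = phi_star, ll_star
        chain.append(phi)
    return chain


if __name__ == "__main__":
    # Toy data density (an assumption): y = 1 observed with unit residual variance
    draws = mode1_chain(lambda p: -0.5 * (1.0 - p) ** 2, mu=0.0, omega=1.0, n_steps=5000)
    print(sum(draws) / len(draws))
```

For this toy case the conditional density of $\phi$ is normal with mean 0.5 and variance 0.5, so the chain average should settle near 0.5.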

Following mode 1A sampling, proposal kernel mode 2 sampling and testing is performed, using the proposal density:

$$\log(k_2(\phi^{*} \mid \phi)) = \log(N(\phi^{*} \mid \phi, Z)) = -\tfrac{1}{2}(\phi^{*} - \phi)'\,Z^{-1}(\phi^{*} - \phi) - \tfrac{1}{2}\log|Z| \qquad (1.136)$$

where $\phi$ is the present set of parameters for individual $i$ (it could have been the one accepted from the just completed mode 1 sampling), and where

$$Z = \kappa\,\Omega$$

which includes a scaling factor $\kappa$. This scaling factor is adjusted for each subject such that samples are accepted at a fractional rate $\rho_M$ = IACCEPT. The scaling factor $\kappa$ is similar to the scaling factor $\gamma$ in importance sampling, and is also subject to the boundary values of ISCALE_MIN and ISCALE_MAX (available in NONMEM 7.). The backward density is

$$\log(k_2(\phi \mid \phi^{*})) = \log(N(\phi \mid \phi^{*}, Z)) = -\tfrac{1}{2}(\phi - \phi^{*})'\,Z^{-1}(\phi - \phi^{*}) - \tfrac{1}{2}\log|Z| = \log(k_2(\phi^{*} \mid \phi)) \qquad (1.137)$$

so the test statistic is calculated as:

$$t_2 = \log(\pi(\phi^{*})) - \log(\pi(\phi)) + \log(k_2(\phi \mid \phi^{*})) - \log(k_2(\phi^{*} \mid \phi)) = \log(\pi(\phi^{*})) - \log(\pi(\phi)) \qquad (1.138)$$

A uniform random deviate $u$ is then generated, log transformed, and if

$$\log(u) < t_2 \qquad (1.139)$$

then the proposed sample set $\phi^{*}$ of parameters is accepted and becomes the new $\phi$ for subject $i$.

For proposal kernel mode 3, each parameter of the vector $\phi$ is sequentially sampled using the univariate density:

$$\log(k_3(\phi_l^{*} \mid \phi_l)) = \log(N(\phi_l^{*} \mid \phi_l, z_{ll})) = -\tfrac{1}{2}\left[(\phi_l^{*} - \phi_l)\,z_{ll}^{-1}\,(\phi_l^{*} - \phi_l) + \log z_{ll}\right] \qquad (1.140)$$

where subscript $l$ refers to the $l$th parameter, and $z_{ll}$ is the $l$th diagonal element of $Z$. The backward density is

$$k_3(\phi_l \mid \phi_l^{*}) = k_3(\phi_l^{*} \mid \phi_l) \qquad (1.141)$$

so the test statistic is:

$$t_3 = \log(\pi(\phi_{(l)}^{*})) - \log(\pi(\phi)) \qquad (1.142)$$

where $\phi_{(l)}^{*}$ equals $\phi$ but with the $l$th element replaced with $\phi_l^{*}$:

$$\phi_{(l)}^{*} = \{\phi_{-l},\ \phi_l^{*}\}$$

Once a parameter is tested, the result contributes to the new $\phi$ for the next parameter in the vector to be sampled.

During the MCMC sampling process, IACCEPT sets $\rho_M$, and ISAMPLE_M1 determines the number of mode 1 samplings, followed by ISAMPLE_M1A samplings, followed by ISAMPLE_M2 mode 2 samplings, followed by ISAMPLE_M3 mode 3 samplings. The final parameter set $\phi$ after the cycle of ISAMPLE_M1 + ISAMPLE_M2 + p·ISAMPLE_M3 samplings (where p = number of elements in vector $\phi$) serves as the result of one chain for subject $i$. During each iteration, r = ISAMPLE separate chains of vectors $\phi$ may be collected. Then, as described with importance sampling, the following may be calculated:

$$\bar{\phi} = \frac{1}{r}\sum_{k=1}^{r}\phi^{(k)} \qquad (1.143)$$

$$B = \frac{1}{r}\sum_{k=1}^{r}(\phi^{(k)} - \bar{\phi})(\phi^{(k)} - \bar{\phi})' \qquad (1.144)$$

Note that the acceptance/rejection process assures that the collection of $\phi^{(k)}$ reflects the distribution of the desired conditional density, so weights $z$ are not needed. During the stochastic mode, the updates to the population parameters (both mu and non-mu modeled) are then performed as described in importance sampling.

During the accumulation mode, update results from the previous k-1 iterations are averaged into the updates of the present kth iteration. For mu modeled theta, and Omegas, the conditional means and variances are accumulatively updated and saved as follows:

$$\bar{\phi}_{S_k} = \frac{k-1}{k}\,\bar{\phi}_{S_{k-1}} + \frac{1}{k}\,\bar{\phi}_k$$

$$S_{S_k} = \frac{k-1}{k}\left(B_{S_{k-1}} + \bar{\phi}_{S_{k-1}}\bar{\phi}_{S_{k-1}}'\right) + \frac{1}{k}\left(B_k + \bar{\phi}_k\bar{\phi}_k'\right)$$

$$B_{S_k} = S_{S_k} - \bar{\phi}_{S_k}\bar{\phi}_{S_k}'$$

followed by update of the main population parameters in the usual manner:

$$g = -\frac{\partial L}{\partial \mu} = \sum_{i=1}^{m}\Omega^{-1}(\bar{\phi}_{i,S_k} - \mu) = 0 \qquad (1.145)$$

$$H = m\,\Omega^{-1} \qquad (1.146)$$

$$\hat{\mu} = \mu + H^{-1} g \qquad (1.147)$$

$$\hat{\Omega} = \frac{1}{m}\sum_{i=1}^{m}\left[(\bar{\phi}_{i,S_k} - \hat{\mu})(\bar{\phi}_{i,S_k} - \hat{\mu})' + B_{i,S_k}\right] \qquad (1.148)$$

For non-mu modeled theta, the thetas $\hat{\theta}_k$ of the present kth iteration are updated using equations (.47)-(.5) using the present iteration's sampling process, followed by:

$$\hat{\theta}_{S_k} = \left((k-1)\,\hat{\theta}_{S_{k-1}} + \hat{\theta}_k\right)/k \qquad (1.149)$$

First derivative gradients of non-mu modeled theta are also accumulated (for use in first order approximations of standard errors, see Appendix C):

$$g_{S_k} = \frac{k-1}{k}\,g_{S_{k-1}} + \frac{1}{k}\,g_k \qquad (1.150)$$

In general, the order of accuracy for the various methods is Monte Carlo EM (IMP, SAEM, DIRECT) > Laplace > FOCEI > ITS.

Three Stage EM Analysis

There are times when one desires to use information from a previous analysis and incorporate it into the present analysis. This would be in the form of prior information for thetas and/or omegas. The principle on which this is based is as follows. Let $\theta_\theta$ be the priors to the thetas (theta priors, which could be estimates of theta from a previous analysis). Let the matrix $\Omega_\theta^{-1}$ be the information matrix (which could be the theta portion of the inverse of the standard error variance matrix of a previous analysis) of the theta priors. Then $\Omega_\theta$ may be called variance to theta priors, or theta variance priors. Let $\Omega_\Omega$ be the prior to the omegas of the population inter-subject variance-covariance matrix, of dimension

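p (Omega priors, could be Omega estimates of a previous analysis), and let $\rho$ be the degrees of freedom of $\Omega_\Omega$ (degrees of freedom priors, could be the number of subjects in the previous analysis). The contribution to the objective function that incorporates this prior information is

$$L_P = -\log\big(N(\theta \mid \theta_\theta, \Omega_\theta)\big) - \log\big(W^{-1}(\Omega \mid \rho\,\Omega_\Omega,\ \rho + p + 1)\big) \qquad (1.151)$$

and is then added to equation (.):

$$L = -\sum_{i=1}^{m}\log\left(\int p(y_i, \phi \mid \theta, \Omega)\,d\phi\right) + L_P \qquad (1.152)$$

where

$$\log(N(\theta \mid \theta_\theta, \Omega_\theta)) = -\tfrac{1}{2}(\theta - \theta_\theta)'\,\Omega_\theta^{-1}(\theta - \theta_\theta) - \tfrac{1}{2}\log(\det(\Omega_\theta)) \qquad (1.153)$$

$$\log(W^{-1}(\Omega \mid \rho\,\Omega_\Omega, d_W)) = -\tfrac{1}{2}\Big(\rho\,\mathrm{tr}(\Omega_\Omega\Omega^{-1}) + (d_W - n - 1)\ln(|\Omega|) - d_W\big(\ln(|\Omega_\Omega|) + n\ln(\rho)\big)\Big) \qquad (1.154)$$

(not including constants) where $n$ is the dimension of $\Omega$. The degrees of freedom $d_W$ will be described later. It follows that the partial derivatives to $L$ contributed by this prior information are:

$$\frac{\partial L_P}{\partial \theta} = \Omega_\theta^{-1}(\theta - \theta_\theta) \qquad (1.155)$$

$$\frac{\partial L_P}{\partial \omega_{ij}} = \rho\,c(i,j)\,I_i'\,\Omega^{-1}(z_W\Omega - \Omega_\Omega)\,\Omega^{-1}I_j \qquad (1.156)$$

where $I_i$ is the $i$th column of the identity matrix, so that $I_i'(\cdot)I_j$ selects element $(i,j)$,

$$c(i,j) = 1/2 \ \text{for}\ i = j, \qquad c(i,j) = 1 \ \text{for}\ i \ne j \qquad (1.157)$$

and

$$z_W = (d_W - n - 1)/\rho \qquad (1.158)$$

which suggest the following updates. To determine the mu modeled theta that minimize the objective function, we must solve, adding the contribution from the prior:

$$g = -\frac{\partial L}{\partial \theta} = \sum_{i=1}^{m}\Omega^{-1}(\bar{\phi}_i - \mu) - \Omega_\theta^{-1}(\theta - \theta_\theta) \qquad (1.159)$$

and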

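$$H = m\,\Omega^{-1} + \Omega_\theta^{-1} \qquad (1.160)$$

and performing the following update as before:

$$\hat{\theta} = \theta + \alpha H^{-1} g \qquad (1.161)$$

For non-mu modeled thetas,

$$g_i = -\frac{\partial L_i}{\partial \theta} = E_{\phi}\left[\frac{\partial \log(p(y_i, \phi \mid \theta, \Omega))}{\partial \theta}\right] \qquad (1.162)$$

$$g = \sum_{i=1}^{m} g_i - \Omega_\theta^{-1}(\theta - \theta_\theta) = 0 \qquad (1.163)$$

$$H = \sum_{i=1}^{m} g_i g_i' + \Omega_\theta^{-1} \qquad (1.164)$$

$$\hat{\theta} = \theta + H^{-1} g \qquad (1.165)$$

The inter-subject variances are updated as

$$\hat{\Omega} = \frac{\sum_{i=1}^{m}\left[(\bar{\phi}_i - \mu)(\bar{\phi}_i - \mu)' + B_i\right] + \rho\,\Omega_\Omega}{m + d_W - n - 1} \qquad (1.166)$$

These were derived from setting partial derivatives of the objective function to 0 and using the appropriate inverse densities and particular degrees of freedom in the objective function. For maximization methods (all methods except BAYES), the degrees of freedom to the inverse Wishart is selected as

$$d_W = \rho + n + 1 \qquad (1.167)$$

so that the maximization of these densities leads to a centering about the prior inter-subject variances, weighted according to the number of subjects from that previous analysis, and with a denominator term of $m + \rho$, yielding an intuitive update. That is, the density whose mode is at $\Omega_\Omega$ is $W^{-1}(\Omega \mid \rho\,\Omega_\Omega,\ \rho + n + 1)$. We shall call this the modal or maximization version of adding the prior information. The density whose mean is at $\Omega_\Omega$ is $W^{-1}(\Omega \mid \rho\,\Omega_\Omega,\ \rho)$. A BAYES analysis is concerned with obtaining average population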

parameters rather than best fit or modal population parameters, so it utilizes the degrees of freedom

$$d_W = \rho \qquad (1.168)$$

which we shall call the mean version of adding the prior information. The above equations are also suggested by the sample distribution equations listed on page 34 of [8].

Population Mixture Modeling

Sometimes the data may be derived from two or more sub-populations, as evidenced by a distribution of a parameter among the subjects that appears to be bi-modal, or skewed. For example, suppose the data is first fit with a simple one compartment model, with volume Vc and rate constant of elimination k0. A histogram analysis of the individual k0's suggests a bimodal or skewed distribution. However, none of the known binary covariates (gender, for example) explains this bimodality. Under these circumstances, one can specify the probability of an individual belonging to a sub-group, without insisting on the certainty of belonging to that particular sub-group.

Consider that we have N sub-populations. Then for each subject $i$, and for each sub-population $j$, we have the probability

$$p(y_i \mid \theta, \Omega) = \int_{-\infty}^{+\infty}\sum_{j=1}^{N} a_j\,p_j(y_i, \phi \mid \theta, \Omega)\,d\phi \qquad (1.169)$$

where

$$p_j(y_i, \phi \mid \theta, \Omega) \qquad (1.170)$$

is the density for sub-population model $j$, for subject $i$, and $a_j$ is the probability of belonging to sub-population $j$. Then define

$$L_{ij} = -\log\left(\int_{-\infty}^{+\infty} p_j(y_i, \phi \mid \theta, \Omega)\,d\phi\right) \qquad (1.171)$$

so the negative log-likelihood of an individual is:

$$L_i = -\log\left(\int_{-\infty}^{+\infty}\sum_{j=1}^{N} a_j\,p_j(y_i, \phi \mid \theta, \Omega)\,d\phi\right) \qquad (1.172)$$

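$$L_i = -\log\left(\sum_{j=1}^{N} a_j\int_{-\infty}^{+\infty} p_j(y_i, \phi \mid \theta, \Omega)\,d\phi\right) \qquad (1.173)$$

$$L_i = -\log\left(\sum_{j=1}^{N} a_j\exp(-L_{ij})\right) \qquad (1.174)$$

Consider that equations for updating the non-proportion (that is, non-$a$) population parameters $q = \{\theta, \Omega\}$ are derived from obtaining the partial derivatives of the objective function $L$:

$$\frac{\partial L_i}{\partial q} = \frac{\sum_{j=1}^{N} a_j\exp(-L_{ij})\,\dfrac{\partial L_{ij}}{\partial q}}{\sum_{k=1}^{N} a_k\exp(-L_{ik})} \qquad (1.175)$$

or

$$\frac{\partial L_i}{\partial q} = \frac{\sum_{j=1}^{N} a_j r_{ij}\,\dfrac{\partial L_{ij}}{\partial q}}{\sum_{k=1}^{N} a_k r_{ik}} = \sum_{j=1}^{N} a_{ij}\,\frac{\partial L_{ij}}{\partial q} \qquad (1.176)$$

where

$$r_{ij} = \exp(-L_{ij}) \qquad (1.177)$$

and

$$a_{ij} = \frac{a_j r_{ij}}{\sum_{k=1}^{N} a_k r_{ik}} \qquad (1.178)$$

is the probability or weight for individual $i$, sub-population model $j$. As an example,

$$-\frac{\partial L_{ij}}{\partial \mu} = \Omega^{-1}\int(\phi - \mu)\,z_{ij}(\phi \mid y_i, \theta, \Omega)\,d\phi = g_{ij} \qquad (1.179)$$

where

$$z_{ij}(\phi \mid y_i, \theta, \Omega) \qquad (1.180)$$

is the conditional density for subject $i$ modeled under sub-model $j$. The appropriate conditional mean for subject $i$ would then be

$$\bar{\phi}_i = \sum_{j=1}^{N} a_{ij}\,\bar{\phi}_{ij} \qquad (1.181)$$

where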

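$$\bar{\phi}_{ij} = \int \phi\,z_{ij}(\phi \mid y_i, \theta, \Omega)\,d\phi \qquad (1.182)$$

which are then used in the usual way to update the thetas. Similarly:

$$B_i = \sum_{j=1}^{N} a_{ij}\left(B_{ij} + \bar{\phi}_{ij}\bar{\phi}_{ij}'\right) - \bar{\phi}_i\bar{\phi}_i' \qquad (1.183)$$

where

$$B_{ij} = \int(\phi - \bar{\phi}_{ij})(\phi - \bar{\phi}_{ij})'\,z_{ij}(\phi \mid y_i, \theta, \Omega)\,d\phi \qquad (1.184)$$

is the conditional variance for individual $i$, sub-model $j$, whereupon the update is the usual:

$$\hat{\Omega} = \frac{1}{m}\sum_{i=1}^{m}\left[(\bar{\phi}_i - \mu)(\bar{\phi}_i - \mu)' + B_i\right] \qquad (1.185)$$

The weighted averages of the other expectation results are also computed, using the same weightings. The $L_{ij}$, and therefore $a_{ij}$, is readily obtained during the expectation step as the objective function to subject $i$, under sub-population model $j$. In practice therefore, the expectation step is done N times for each individual, collecting the resulting conditional means, variances, and objective function values to each sub-model, and then performing the weighted average, as shown above.

A method in keeping with minimizing the total objective function would be to construct partial derivatives and partial second derivatives, where for each subject $i$:

$$\frac{\partial L_i}{\partial a_j} = -\frac{\exp(-L_{ij}) - \exp(-L_{iN})}{\sum_{k=1}^{N} a_k\exp(-L_{ik})} = -\frac{r_{ij} - r_{iN}}{\sum_{k=1}^{N} a_k r_{ik}} \qquad (1.186)$$

$$\frac{\partial^2 L_i}{\partial a_j\,\partial a_l} = \frac{\big(\exp(-L_{ij}) - \exp(-L_{iN})\big)\big(\exp(-L_{il}) - \exp(-L_{iN})\big)}{\left(\sum_{k=1}^{N} a_k\exp(-L_{ik})\right)\left(\sum_{k=1}^{N} a_k\exp(-L_{ik})\right)} = \frac{\partial L_i}{\partial a_j}\,\frac{\partial L_i}{\partial a_l} \qquad (1.187)$$

since

$$a_N = 1 - \sum_{j=1}^{N-1} a_j \qquad (1.188)$$

Then, perform the usual Gauss-Newton update, where $\theta_a$ are all thetas that model the sub-population proportions in the $MIX module:

$$g_{a,i} = -\sum_{j} \frac{\partial L_i}{\partial a_j}\,\frac{\partial a_j}{\partial \theta_a} \qquad (1.189)$$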

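$$H = \sum_{i=1}^{m} g_{a,i}\,g_{a,i}' = \sum_{i=1}^{m}\left(\sum_{j}\frac{\partial L_i}{\partial a_j}\frac{\partial a_j}{\partial \theta_a}\right)\left(\sum_{j}\frac{\partial L_i}{\partial a_j}\frac{\partial a_j}{\partial \theta_a}\right)' \qquad (1.190)$$

$$\theta_{a,\mathrm{new}} = \theta_{a,\mathrm{old}} + H^{-1}\sum_{i=1}^{m} g_{a,i} \qquad (1.191)$$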
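The per-subject mixture weights of (1.178), the subject objective of (1.174), and the weighted conditional moments of (1.181) and (1.183) can be sketched for a scalar parameter as below. The function names are hypothetical, all $a_j$ are assumed strictly positive, and the log-sum-exp shift is an implementation choice for numerical stability rather than something the text prescribes.

```python
import math
from typing import List, Sequence, Tuple


def mixture_weights(a: Sequence[float], nll: Sequence[float]) -> Tuple[List[float], float]:
    """Return (weights a_ij over sub-models j, L_i) for one subject.

    a[j]   : mixing proportion of sub-population j (assumed > 0)
    nll[j] : L_ij, the subject's objective under sub-model j
    """
    # log(a_j exp(-L_ij)), shifted by the maximum before exponentiating
    logs = [math.log(aj) - lj for aj, lj in zip(a, nll)]
    top = max(logs)
    terms = [math.exp(v - top) for v in logs]
    total = sum(terms)
    weights = [t / total for t in terms]
    subject_nll = -(top + math.log(total))  # L_i = -log(sum_j a_j exp(-L_ij))
    return weights, subject_nll


def combine_moments(
    weights: Sequence[float], means: Sequence[float], variances: Sequence[float]
) -> Tuple[float, float]:
    """Weighted conditional mean and variance across sub-models (scalar case)."""
    mean = sum(w * m for w, m in zip(weights, means))
    second = sum(w * (b + m * m) for w, m, b in zip(weights, means, variances))
    return mean, second - mean * mean


if __name__ == "__main__":
    w, l_i = mixture_weights([0.3, 0.7], [12.4, 10.9])
    print(w, l_i)
    print(combine_moments(w, [0.8, 1.6], [0.04, 0.09]))
```

In an analysis of this kind, `mixture_weights` would be evaluated once per subject after the N expectation steps, and the combined moments would then feed the usual theta and Omega updates.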

MCMC Bayesian Analysis for Evaluating a Distribution of Population Parameters

The Markov chain Monte Carlo (MCMC) Bayesian analysis can be used to obtain many thousands of samples of population parameters and variance parameters, distributed according to their ability to fit the data. This information is similar to what is obtained by bootstrap methods, and MCMC Bayesian analysis can be used in their place. The Bayesian analysis may be performed with or without including prior information, but it is recommended that there at least be prior information for OMEGA.

There are two main types of Bayesian analysis available in NONMEM. The most efficient is the Gibbs sampling method, which is used to create samples of thetas that are linearly modeled with respect to their mu's, and of the inter-subject variances. This is performed in the manner of page 34 of [8]. Updating linearly modeled thetas (designated $\theta_L$) is done as follows. Use the EM update method to obtain estimates $\hat{\theta}_L$:

$$H_L = m\,\Omega_L^{-1} + \Omega_{\theta_L}^{-1} \qquad (1.192)$$

followed by

$$\hat{\theta}_L = \theta_L + \alpha H_L^{-1} g_L \qquad (1.193)$$

Next, sample from the following conditional density:

$$[\theta_L \mid \cdot\,] \sim N(\hat{\theta}_L,\ H_L^{-1}) \qquad (1.194)$$

For the Omegas:

$$\hat{\Omega} = \frac{\sum_{i=1}^{m}\left[(\phi_i - \mu)(\phi_i - \mu)' + B_i\right] + \rho\,\Omega_\Omega}{m + \rho} \qquad (1.195)$$

followed by sampling from an inverse Wishart density:

$$[\Omega \mid \cdot\,] \sim W^{-1}\big((m + \rho)\,\hat{\Omega},\ m + \rho\big) \qquad (1.196)$$

A matrix with an inverse Wishart distribution of $m + \rho$ degrees of freedom could be constructed as follows. Create $k$ vectors of normally distributed random samples:

$$x_k \sim N(0, 1) \qquad (1.197)$$

Then construct

$$S = \frac{1}{m + \rho}\sum_{k=1}^{m + \rho} x_k x_k' \qquad (1.198)$$

$$\Omega = L_{\hat{\Omega}}\,S^{-1}L_{\hat{\Omega}}' \qquad (1.199)$$

where $L_{\hat{\Omega}}$ is the Cholesky factor of $\hat{\Omega}$. More efficient methods of creating an inverse Wishart matrix sample are available.

Because these sample densities are also the conditional densities for the respective parameters, the samples are always accepted, and no acceptance/rejection analysis needs to be performed.

Sigma parameters (but not Sigma-like THETA parameters) that are isolated residual variance coefficients are updated as follows:

$$\hat{\sigma}^2 = \frac{1}{N}\sum \frac{(y - f)^2}{(\partial y/\partial \varepsilon)^2} \qquad (1.200)$$

followed by sampling from an inverse chi-square:

$$\sigma^2 \sim \chi^{-2}(N, \hat{\sigma}^2)$$

where $N$ is the total number of data points involved in evaluation of that particular sigma.

Metropolis-Hastings sampling must be performed on all other types of theta parameters, as follows. For the first mode, for thetas not linearly mu modeled and Cholesky decomposed sigma elements, designated collectively as $\theta$, proposed sample parameters $\theta^{*}$ for the (k+1)th iteration are created using

$$\log(N(\theta^{*} \mid \theta_0, Z)) = -\tfrac{1}{2}(\theta^{*} - \theta_0)'\,Z^{-1}(\theta^{*} - \theta_0) - \tfrac{1}{2}\log|Z| \qquad (1.201)$$

$\theta_0$ and $Z$ vary according to how many samples have so far been created. During the first several hundred iterations of burn-in, $\theta_0$ are the initial thetas at the start of the analysis, and $Z$ is a diagonal matrix with diagonal elements equal to $(0.5\,\theta_0)^2$. During the subsequent iterations of burn-in, $\theta_0$ and $Z$ are the sample means and variances of $\theta$ collected during the previous several hundred iterations. During the stationary phase, $\theta_0$ and $Z$ are the sample means and variances of all $\theta$ collected so far since the beginning of the stationary phase.
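The construction of (1.197)-(1.199) can be written out directly for a 2x2 Omega. The sketch below is the simple, inefficient method described in the text, restricted to two dimensions so that the Cholesky factor and inverse have closed forms; the function names and the `dof` argument (standing in for $m + \rho$) are assumptions for illustration.

```python
import math
import random
from typing import List

Matrix = List[List[float]]


def chol2(a: Matrix) -> Matrix:
    """Lower Cholesky factor of a symmetric positive definite 2x2 matrix."""
    l11 = math.sqrt(a[0][0])
    l21 = a[1][0] / l11
    l22 = math.sqrt(a[1][1] - l21 * l21)
    return [[l11, 0.0], [l21, l22]]


def inv2(a: Matrix) -> Matrix:
    """Inverse of a 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


def matmul2(a: Matrix, b: Matrix) -> Matrix:
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def sample_omega(omega_hat: Matrix, dof: int, rng: random.Random) -> Matrix:
    """Draw Omega = L S^-1 L', with S = (1/dof) * sum_k x_k x_k' and x_k ~ N(0, 1).

    dof plays the role of m + rho and must be at least 2 so that S is invertible.
    """
    s = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(dof):
        x = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
        for i in range(2):
            for j in range(2):
                s[i][j] += x[i] * x[j] / dof
    low = chol2(omega_hat)
    low_t = [[low[0][0], low[1][0]], [low[0][1], low[1][1]]]  # transpose of L
    return matmul2(matmul2(low, inv2(s)), low_t)


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_omega([[1.0, 0.3], [0.3, 2.0]], dof=40, rng=rng))
```

Under this construction the expected value of $S$ is the identity, so the inverse of the sampled matrix averages to $\hat{\Omega}^{-1}$, which gives a convenient check on the code.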

To reflect the probability of choosing these values, the following log density values are therefore calculated, based on the respective proposal densities, for mode 1:

$$\log(k_1(\theta^{*})) = \log(N(\theta^{*} \mid \theta_0, Z)) \qquad (1.202)$$

The log likelihood of the kth set of population parameters with respect to the data, and with respect to positions of the kth set of individual parameters $\phi_k$, is evaluated also:

$$\log(\pi(\theta_k)) = \log(p(y, \phi_k \mid \theta_k, \Omega_k)) \qquad (1.203)$$

The log likelihood of the proposed sample set of population parameters with respect to the data, and with respect to positions of the present kth set of individual parameters $\phi_k$, is evaluated also:

$$\log(\pi(\theta^{*})) = \log(p(y, \phi_k \mid \theta^{*}, \Omega_k)) \qquad (1.204)$$

The following test statistic is created:

$$t_1 = -\log(k_1(\theta^{*} \mid \theta_0)) + \log(k_1(\theta_k \mid \theta_0)) - \log(\pi(\theta_k)) + \log(\pi(\theta^{*})) \qquad (1.205)$$

A uniform random deviate $u$ is then generated, log transformed, and if

$$\log(u) < t_1 \qquad (1.206)$$

then the proposed sample set of population parameters is accepted. If rejected, then the kth sample set is used as the (k+1)th sample set. This is done PSAMPLE_M1 (an option in NONMEM) times.

Next, during the second kernel density mode, the population parameters of the present position k may be used to create a sample for the next iteration:

$$\log(N(\theta^{*} \mid \theta_k, wZ)) = -\tfrac{1}{2}(\theta^{*} - \theta_k)'\,(wZ)^{-1}(\theta^{*} - \theta_k) - \tfrac{1}{2}\log|wZ| \qquad (1.207)$$

where $\theta_k$ is the accepted theta of the kth iteration, and $w$ is a scaling parameter, which is adjusted throughout the analysis so that a fraction PACCEPT (option) of random sample sets are accepted. The PACCEPT parameter is set by the user. To reflect the probability of choosing these values, the following log density values are therefore calculated, based on the respective proposal densities:

$$\log(k_2(\theta^{*} \mid \theta_k)) = \log(N(\theta^{*} \mid \theta_k, wZ)) \qquad (1.208)$$

as well as their backward density of mode 2:

$$\log(k_2(\theta_k \mid \theta^{*})) = \log(N(\theta_k \mid \theta^{*}, wZ)) = \log(k_2(\theta^{*} \mid \theta_k)) \qquad (1.209)$$

The test statistic is created:

$$t_2 = -\log(k_2(\theta^{*} \mid \theta_k)) + \log(k_2(\theta_k \mid \theta^{*})) - \log(\pi(\theta_k)) + \log(\pi(\theta^{*})) \qquad (1.210)$$

A uniform random deviate $u$ is then generated, log transformed, and if

$$\log(u) < t_2 \qquad (1.211)$$

then the sample is accepted. This is done PSAMPLE_M2 times.

As a third kernel sampling mode, samples on each parameter separately and sequentially may be made using the univariate distribution

$$\log(N(\theta_l^{*} \mid \theta_{kl}, z_{ll})) = -\tfrac{1}{2}(\theta_l^{*} - \theta_{kl})\,z_{ll}\,(\theta_l^{*} - \theta_{kl}) + \tfrac{1}{2}\log z_{ll} \qquad (1.212)$$

where $z_{ll}$ is the $l$th diagonal element of $Z^{-1}$, for parameter $l$. The other parameters are not moved when in this mode. To reflect the probability of choosing these values, the following log density values are therefore calculated, based on the respective proposal densities:

$$\log(k_{3l}(\theta_l^{*} \mid \theta_{kl})) = \log(N(\theta_l^{*} \mid \theta_{kl}, w z_{ll})) \qquad (1.213)$$

and backward density of mode 3:

$$\log(k_{3l}(\theta_{kl} \mid \theta_l^{*})) = \log(N(\theta_{kl} \mid \theta_l^{*}, w z_{ll})) \qquad (1.214)$$

The test statistic is created for each parameter $l$:

$$t_3 = -\log(k_{3l}(\theta_l^{*} \mid \theta_{kl})) + \log(k_{3l}(\theta_{kl} \mid \theta_l^{*})) - \log(\pi(\theta_k)) + \log(\pi(\theta_{kl}^{*})) \qquad (1.215)$$

A uniform random deviate $u$ is then generated, log transformed, and if

$$\log(u) < t_3 \qquad (1.216)$$

then the proposed sample set of population parameters is accepted as the (k+1)th sample set. If rejected, then the kth sample set is used as the (k+1)th sample set. The third mode is done for each parameter PSAMPLE_M3 times, for n*PSAMPLE_M3 times in a given iteration, where n is the number of population parameters in the vector $\theta$.
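The third kernel mode amounts to a component-wise random-walk Metropolis sweep. The following sketch assumes a caller-supplied log density `log_post` (standing in for $\log \pi$) and fixed per-parameter proposal standard deviations `step_sd`; the adaptive scaling by $w$ toward PACCEPT is deliberately left out, and all names are hypothetical.

```python
import math
import random
from typing import Callable, List, Sequence


def mode3_sweeps(
    log_post: Callable[[Sequence[float]], float],  # log pi(theta), supplied by the caller
    theta0: Sequence[float],
    step_sd: Sequence[float],  # fixed proposal sd per parameter (no adaptation here)
    n_sweeps: int,
    seed: int = 0,
) -> List[List[float]]:
    """Sample each parameter in turn while holding the others fixed."""
    rng = random.Random(seed)
    theta = list(theta0)
    lp = log_post(theta)
    samples: List[List[float]] = []
    for _ in range(n_sweeps):
        for l in range(len(theta)):
            proposal = list(theta)
            proposal[l] = rng.gauss(theta[l], step_sd[l])  # only element l moves
            lp_star = log_post(proposal)
            # Symmetric kernel: forward and backward densities cancel in t3.
            t3 = lp_star - lp
            u = 1.0 - rng.random()  # uniform on (0, 1], keeps log(u) finite
            if math.log(u) < t3:
                theta, lp = proposal, lp_star
        samples.append(list(theta))
    return samples


if __name__ == "__main__":
    # Toy target (an assumption): independent normals with means 1 and -2
    target = lambda th: -0.5 * (th[0] - 1.0) ** 2 - 0.5 * (th[1] + 2.0) ** 2 / 0.25
    draws = mode3_sweeps(target, [0.0, 0.0], [2.0, 1.0], n_sweeps=5000)
    print(sum(d[0] for d in draws) / len(draws), sum(d[1] for d in draws) / len(draws))
```

An accepted element immediately becomes part of the current position, so the next parameter in the sweep is tested against the updated vector.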

If the user has selected to perform Metropolis-Hastings samplings for Omega elements, then each time that samples of population mean parameters and covariates are created, samples of population variances are also created using the inverse Wishart distribution. For mode 1, using the starting position values (k=0) (OSAMPLE_M1 times):

$$\log\big(W^{-1}(\Omega^{*} \mid (\rho + m)\,\Omega_0,\ (\rho + m))\big) = -\tfrac{1}{2}\Big((\rho + m)\,\mathrm{tr}(\Omega_0\Omega^{*-1}) + (\rho + m - n - 1)\ln(|\Omega^{*}|) - (\rho + m)\big(\ln(|\Omega_0|) + n\ln(\rho + m)\big)\Big) \qquad (1.217)$$

To reflect the probability of choosing these values, the following log density values are therefore calculated, based on the respective proposal densities:

$$\log(k_1(\Omega^{*} \mid \Omega_0)) = \log\big(W^{-1}(\Omega^{*} \mid (\rho + m)\,\Omega_0,\ (\rho + m))\big) \qquad (1.218)$$

The log likelihood of the kth set of population parameters with respect to the data, and with respect to positions of the kth set of individual parameters $\phi_k$, is evaluated also:

$$\log(\pi(\Omega_k)) = \log(p(y, \phi_k \mid \theta_k, \Omega_k)) \qquad (1.219)$$

The log likelihood of the proposed sample set of population variances with respect to the data, and with respect to positions of the present kth set of individual parameters $\phi_k$, is evaluated also:

$$\log(\pi(\Omega^{*})) = \log(p(y, \phi_k \mid \theta_k, \Omega^{*})) \qquad (1.220)$$

During mode 1, the following test statistic is created:

$$t_1 = -\log(k_1(\Omega^{*} \mid \Omega_0)) + \log(k_1(\Omega_k \mid \Omega_0)) - \log(\pi(\Omega_k)) + \log(\pi(\Omega^{*})) \qquad (1.221)$$

A uniform random deviate $u$ is then generated, log transformed, and if

$$\log(u) < t_1 \qquad (1.222)$$

then the proposed sample set of variances is accepted as the (k+1)th sample set. If rejected, then the kth sample set is used as the (k+1)th sample set.

For mode 2, the present position k is used (OSAMPLE_M2 times):
