Appendix B - The Delta method...

© Cooch & White (2012), 05.10.2012

Suppose you have done a study, over 4 years, which yields 3 estimates of survival (say, \(\hat{\varphi}_1\), \(\hat{\varphi}_2\), and \(\hat{\varphi}_3\)). But suppose what you are really interested in is the estimate of the product of the three survival values (i.e., the probability of surviving from the beginning of the study to the end of the study). While it is easy enough to derive an estimate of this product (as \(\hat{\varphi}_1 \hat{\varphi}_2 \hat{\varphi}_3\)), how do you derive an estimate of the variance of the product? In other words, how do you derive an estimate of the variance of a transformation of one or more random variables (in this case, we transform the three random variables \(\hat{\varphi}_i\) by considering their product)? One commonly used approach, which is easily implemented, not computer-intensive, and can be robustly applied in many (but not all) situations, is the so-called Delta method (also known as the method of propagation of errors). In this appendix, we briefly introduce the underlying background theory, and the implementation of the Delta method, to fairly typical scenarios.

B.1. Mean and variance of random variables

Our primary interest here is developing a method that will allow us to estimate the mean and variance for functions of random variables. Let's start by considering the formal approach for deriving these values explicitly, based on the method of moments. For continuous random variables, consider a continuous function \(f(x)\) on the interval \([-\infty, +\infty]\). The first three moments of \(f(x)\) can be written as

\[
M_0 = \int_{-\infty}^{+\infty} f(x)\,dx, \qquad
M_1 = \int_{-\infty}^{+\infty} x\,f(x)\,dx, \qquad
M_2 = \int_{-\infty}^{+\infty} x^2 f(x)\,dx
\]

In the particular case that the function is a probability density (as for a continuous random variable), then \(M_0 = 1\) (i.e., the area under the PDF must equal 1). For example, consider the uniform distribution on the finite interval \([a, b]\). A uniform distribution (sometimes also known as a rectangular distribution) is a distribution that has constant probability
over the interval. The probability density function (pdf) for a continuous uniform distribution on the finite interval \([a, b]\) is

\[
P(x) = \begin{cases} 0 & \text{for } x < a \\ 1/(b-a) & \text{for } a < x < b \\ 0 & \text{for } x > b \end{cases}
\]

Integrating the pdf, for \(p(x) = 1/(b-a)\),

\[
M_0 = \int_a^b p(x)\,dx = \int_a^b \frac{1}{b-a}\,dx = 1 \tag{B.1}
\]
\[
M_1 = \int_a^b x\,p(x)\,dx = \int_a^b \frac{x}{b-a}\,dx = \frac{a+b}{2} \tag{B.2}
\]
\[
M_2 = \int_a^b x^2 p(x)\,dx = \int_a^b \frac{x^2}{b-a}\,dx = \frac{1}{3}\left(a^2 + ab + b^2\right) \tag{B.3}
\]

We see clearly that \(M_1\) is the mean of the distribution. What about the variance? Where does the second moment \(M_2\) come in? Recall that the variance is defined as the average value of the fundamental quantity [distance from mean]\(^2\). The distance is squared so that the values to either side of the mean don't cancel out. The standard deviation is simply the square root of the variance. Given some discrete random variable \(x_i\), with probability \(p_i\), and mean \(\mu\), we define the variance as

\[
\mathrm{Var} = \sum_i (x_i - \mu)^2 p_i
\]

Note we don't have to divide by the number of values of \(x\) because the sum of the discrete probability distribution is 1 (i.e., \(\sum p_i = 1\)). For a continuous probability distribution, with mean \(\mu\), we define the variance as

\[
\mathrm{Var} = \int (x - \mu)^2 p(x)\,dx
\]

Given our moment equations, we can then write

\[
\begin{aligned}
\mathrm{Var} &= \int (x - \mu)^2 p(x)\,dx \\
&= \int \left(x^2 - 2\mu x + \mu^2\right) p(x)\,dx \\
&= \int x^2 p(x)\,dx - \int 2\mu x\,p(x)\,dx + \int \mu^2 p(x)\,dx \\
&= \int x^2 p(x)\,dx - 2\mu \int x\,p(x)\,dx + \mu^2 \int p(x)\,dx
\end{aligned}
\]

Now, if we look closely at the last line, we see that in fact the terms represent the different moments
of the distribution. Thus we can write

\[
\begin{aligned}
\mathrm{Var} &= \int (x - \mu)^2 p(x)\,dx \\
&= \int x^2 p(x)\,dx - 2\mu \int x\,p(x)\,dx + \mu^2 \int p(x)\,dx \\
&= M_2 - 2\mu (M_1) + \mu^2 (M_0)
\end{aligned}
\]

Since \(M_1 = \mu\), and \(M_0 = 1\), then

\[
\begin{aligned}
\mathrm{Var} &= M_2 - 2\mu (M_1) + \mu^2 (M_0) \\
&= M_2 - 2\mu(\mu) + \mu^2 (1) \\
&= M_2 - 2\mu^2 + \mu^2 \\
&= M_2 - \mu^2 \\
&= M_2 - (M_1)^2
\end{aligned}
\]

In other words, the variance for the pdf is simply the second moment (\(M_2\)) minus the square of the first moment (\((M_1)^2\)). Thus, for a continuous uniform random variable \(x\) on the interval \([a, b]\),

\[
\mathrm{Var} = M_2 - (M_1)^2 = \frac{(a-b)^2}{12}
\]

B.2. Transformations of random variables and the Delta method

OK - that's fine. If the pdf is specified, we can use the method of moments to formally derive the mean and variance of the distribution. But what about functions of random variables having poorly specified or unspecified distributions? Or situations where the pdf is not easily defined? In such cases, we may need other approaches. We'll introduce one such approach (the Delta method) here, by considering the case of a simple linear transformation of a random normal distribution. Let

\[
X_1, X_2, \ldots \sim N(10, \sigma^2 = 2)
\]

In other words, random deviates drawn from a normal distribution with a mean of 10, and a variance of 2. Consider some transformations of these random values. You might recall from some earlier statistics or probability class that linearly transformed normal random variables are themselves normally distributed. Consider, for example, \(X_i \sim N(10, 2)\), which we then linearly transform to \(Y_i\), such that \(Y_i = 4X_i + 3\). Now, recall that for real scalar constants \(a\) and \(b\) we can show that

i. \(E(a) = a\), \(E(aX + b) = aE(X) + b\)
ii. \(\mathrm{var}(a) = 0\), \(\mathrm{var}(aX + b) = a^2\,\mathrm{var}(X)\)

Thus, given \(X_i \sim N(10, 2)\) and the linear transformation \(Y_i = 4X_i + 3\), we can write

\[
Y \sim N\left(4(10) + 3 = 43,\; (4^2)\,2\right) = N(43, 32)
\]

Now, an important point to note is that some transformations of the normal distribution are close to normal (i.e., are linear) and some are not. Since linear transformations of random normal values are normal, it seems reasonable to conclude that approximately linear transformations (over some range) of random normal data should also be approximately normal.

OK, to continue. Let \(X \sim N(\mu, \sigma^2)\), and let \(Y = g(X)\), where \(g\) is some transformation of \(X\) (in the previous example, \(g(X) = 4X + 3\)). It is hopefully relatively intuitive that the closer \(g(X)\) is to linear over the likely range of \(X\) (i.e., within 3 or so standard deviations of \(\mu\)), the closer \(Y = g(X)\) will be to normally distributed. From calculus, we recall that if you look at any differentiable function over a narrow enough region, the function appears approximately linear. The approximating line is the tangent line to the curve, and its slope is the derivative of the function.

Since most of the mass (i.e., most of the random values of \(X\)) is concentrated around \(\mu\), let's figure out the tangent line at \(\mu\), using two different methods. First, we know that the tangent line passes through \((\mu, g(\mu))\), and that its slope is \(g'(\mu)\) (we use the \(g'\) notation to indicate the first derivative of the function \(g\)). Thus, the equation of the tangent line is \(Y = g'(\mu)X + b\) for some \(b\). Replacing \((X, Y)\) with the known point \((\mu, g(\mu))\), we find \(g(\mu) = g'(\mu)\mu + b\) and so \(b = g(\mu) - g'(\mu)\mu\). Thus, the equation of the tangent line is \(Y = g'(\mu)X + g(\mu) - g'(\mu)\mu\).

Now for the big step - we can derive an approximation to the same tangent line by using a Taylor series expansion of \(g(X)\) (to first order) around \(X = \mu\):

\[
\begin{aligned}
Y = g(X) &\approx g(\mu) + g'(\mu)(X - \mu) + \epsilon \\
&= g'(\mu)X + g(\mu) - g'(\mu)\mu + \epsilon
\end{aligned}
\]

OK, at this point you might be asking yourself "so what?". Well, suppose that \(X \sim N(\mu, \sigma^2)\) and \(Y = g(X)\), where \(g'(\mu) \neq 0\).
Then, whenever the tangent line (derived earlier) is approximately correct over the likely range of \(X\) (i.e., if the transformed function is approximately linear over the likely range of \(X\)), the transformation \(Y = g(X)\) will have an approximate normal distribution. That approximate normal distribution may be found using the usual rules for linear transformations of normals. Thus, to first order,

\[
E(Y) \approx g'(\mu)\mu + g(\mu) - g'(\mu)\mu = g(\mu)
\]
\[
\begin{aligned}
\mathrm{var}(Y) \approx \mathrm{var}(g(X)) &= E\left[(g(X) - g(\mu))^2\right] \\
&\approx E\left[\left(g'(\mu)(X - \mu)\right)^2\right] \\
&= (g'(\mu))^2\, E\left[(X - \mu)^2\right] \\
&= (g'(\mu))^2\, \mathrm{var}(X)
\end{aligned}
\]

In other words, we take the derivative of the transformed function with respect to the parameter, square it, and multiply it by the estimated variance of the untransformed parameter. These first-order approximations to the variance of a transformed parameter are usually referred to as the Delta method.
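The first-order rule above can be sketched in a few lines of Python. This is a minimal illustration, not part of MARK: the function name `delta_variance` and the use of a central-difference estimate for \(g'(\mu)\) are assumptions made here for convenience (an analytic derivative would serve equally well). It is applied to the linear transformation \(Y = 4X + 3\) with \(X \sim N(10, 2)\), where the first-order result is exact.

```python
def delta_variance(g, mu, var_x, h=1e-5):
    """First-order Delta-method variance of g(X), where X has mean mu and variance var_x.

    The slope g'(mu) is approximated by a central difference (an assumption of
    this sketch; substitute an analytic derivative if one is available).
    """
    slope = (g(mu + h) - g(mu - h)) / (2.0 * h)
    return slope ** 2 * var_x


def linear(x):
    # The linear transformation Y = 4X + 3 used in the text.
    return 4.0 * x + 3.0


# X ~ N(10, 2): mean 10, variance 2.
approx_mean = linear(10.0)                        # first-order mean is g(mu)
approx_var = delta_variance(linear, 10.0, 2.0)    # (g'(mu))^2 * var(X)
print(approx_mean, round(approx_var, 6))
```

For a linear \(g\) the result agrees with rule (ii) for linear transformations; for a nonlinear \(g\) the same call returns only the first-order approximation.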

Taylor series expansions?

begin sidebar

A very important, and frequently used, tool. If you have no familiarity at all with series expansions, here is a (very) short introduction. Briefly, the Taylor series is a power series expansion of an infinitely differentiable real (or complex) function defined on an open interval around some specified point. For example, a one-dimensional Taylor series expansion of a real function \(f(x)\) about a point \(x = a\), over the interval \((a - r, a + r)\), is given as

\[
f(x) \approx f(a) + \frac{f'(a)(x - a)}{1!} + \frac{f''(a)(x - a)^2}{2!} + \ldots
\]

where \(f'(a)\) is the first derivative of \(f\) evaluated at \(a\), \(f''(a)\) is the second derivative of \(f\) evaluated at \(a\), and so on.

For example, suppose the function is \(f(x) = e^x\). The convenient fact about this function is that all its derivatives are equal to \(e^x\) as well (i.e., \(f(x) = e^x\), \(f'(x) = e^x\), \(f''(x) = e^x\), ...). In particular, \(f^{(n)}(x) = e^x\) so that \(f^{(n)}(0) = 1\). This means that the coefficients of the Taylor series are given by

\[
a_n = \frac{f^{(n)}(0)}{n!} = \frac{1}{n!}
\]

and so the Taylor series is given by

\[
1 + x + \frac{x^2}{2} + \frac{x^3}{6} + \frac{x^4}{24} + \ldots + \frac{x^n}{n!} + \ldots = \sum_{n=0}^{\infty} \frac{x^n}{n!}
\]

The primary utility of such a power series in simple application is that differentiation and integration of power series can be performed term by term and is hence particularly (or, at least, relatively) easy. In addition, the (truncated) series can be used to compute function values approximately.

Now, let's look at an example of the fit of a Taylor series to a familiar function, given a certain number of terms in the series. For our example, we'll expand the function \(f(x) = e^x\), at \(x = a = 0\), on the interval \((a - 2, a + 2)\), for \(n = 0\), \(n = 1\), \(n = 2\), ... (where \(n\) is the order of the highest term retained in the series). For \(n = 0\), the Taylor expansion is a scalar constant (1):

\[
f(x) \approx 1
\]

which is obviously a poor approximation to the function \(f(x) = e^x\) at any point other than \(x = 0\). This is shown clearly in the following figure - the black line in the figure is the function \(f(x) = e^x\), evaluated over the interval \((-2, 2)\), and the red line is the Taylor series approximation for \(n = 0\).
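The use of a truncated series to compute function values approximately can be sketched numerically. The snippet below is an illustration only; the helper names `taylor_exp` and `max_abs_error`, and the choice of a 401-point evaluation grid, are assumptions made for this sketch. It evaluates the Taylor polynomial of \(e^x\) about 0 up to order \(n\) and reports the largest absolute error over \((-2, 2)\).

```python
import math


def taylor_exp(x, n):
    """Taylor polynomial of e^x about 0, keeping terms up to x^n / n!."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))


def max_abs_error(n, lo=-2.0, hi=2.0, points=401):
    """Largest |e^x - T_n(x)| on an evenly spaced grid over [lo, hi]."""
    step = (hi - lo) / (points - 1)
    grid = [lo + i * step for i in range(points)]
    return max(abs(math.exp(x) - taylor_exp(x, n)) for x in grid)


# Error of the truncated series for orders n = 0, 1, ..., 4.
errors = [max_abs_error(n) for n in range(5)]
for n, err in enumerate(errors):
    print(n, round(err, 4))
```

The largest error on this interval occurs at the right-hand end (\(x = 2\)), which is the point farthest from the expansion point in the direction where \(e^x\) grows fastest.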

What happens when we add higher order terms? Here is the plot of the Taylor series for \(n = 1\). Hmmm... a bit better. What about \(n = 2\)? We see that when we add more terms (i.e., use a higher-order series), the fit gets progressively better. Often, for "nice", smooth functions (i.e., those nearly linear at the point of interest), we don't need many terms at all. For this example, \(n = 4\) yields a near-perfect fit (over the interval \((-2, 2)\)).

Another example - suppose the function of interest is \(f(x) = x^{1/3}\) (i.e., \(f(x) = \sqrt[3]{x}\)). Suppose we are interested in \(f(x) = x^{1/3}\) where \(x = 27\) (i.e., \(f(27) = \sqrt[3]{27}\)). Now, it is straightforward to show that \(f(27) = \sqrt[3]{27} = 3\). But suppose we want to know \(f(25) = \sqrt[3]{25}\), using a Taylor series approximation? We recall that to first order,

\[
f(x) \approx f(a) + f'(a)(x - a)
\]

where in this case, \(a = 27\) (the point about which we expand) and \(x = 25\) (the point at which we evaluate). The derivative of \(f\) with respect to \(x\) for this function \(f(a) = a^{1/3}\) is

\[
f'(a) = \frac{a^{-2/3}}{3} = \frac{1}{3\sqrt[3]{a^2}}
\]

Thus, using the first-order Taylor series, we write

\[
f(25) \approx f(27) + f'(27)(25 - 27) = 3 + (0.037037)(-2) = 3 - 0.0740741 = 2.926
\]

which is very close to the true value of \(f(25) = \sqrt[3]{25} = 2.924\). In other words, the first-order Taylor approximation works pretty well for this function.

end sidebar

B.3. Transformations of one variable

OK, enough background for now. Let's see some applications. Let's check the Delta method out in a case where we know the answer. Assume we have an estimate of density \(\hat{D}\) and its conditional sampling variance, \(\widehat{\mathrm{var}}(\hat{D})\). We want to multiply this by some constant \(c\) to make it comparable with other values from the literature. Thus, we want \(\hat{D}_s = g(\hat{D}) = c\hat{D}\) and \(\widehat{\mathrm{var}}(\hat{D}_s)\). The Delta method gives

\[
\widehat{\mathrm{var}}(\hat{D}_s) \approx (g'(D))^2\,\hat{\sigma}_D^2 = \left(\frac{\partial \hat{D}_s}{\partial \hat{D}}\right)^2 \widehat{\mathrm{var}}(\hat{D}) = c^2\,\widehat{\mathrm{var}}(\hat{D})
\]

which we know to be true for the variance of a random variable multiplied by a real constant.

Another example of the same thing - consider a known number \(N\) of harvested fish and an average weight (\(\hat{\mu}_w\)) and its variance. If you want an estimate of total biomass (\(B\)), then \(\hat{B} = N\hat{\mu}_w\) and the variance of \(\hat{B}\) is \(N^2\,\widehat{\mathrm{var}}(\hat{\mu}_w)\).

Still another example - you have some parameter \(\theta\), which you transform by dividing it by some constant \(c\). Thus, by the Delta method,

\[
\widehat{\mathrm{var}}\left(\frac{\hat{\theta}}{c}\right) = \left(\frac{1}{c}\right)^2 \widehat{\mathrm{var}}(\hat{\theta})
\]

B.3.1. A potential complication - violation of assumptions

A final - and important - example for transformations of single variables. The importance lies in the demonstration that the Delta method does not always work - remember, it assumes that the transformation is approximately linear over the expected range of the parameter. Suppose one has an MLE for the mean and estimated variance for some parameter \(\theta\) which is a bounded random uniform on the interval \([0, 2]\). Suppose you want to transform this parameter such that

\[
\psi = e^{\theta}
\]

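(Recall that this is a convenient transformation since the derivative of \(e^x\) is \(e^x\), making the calculations very simple.) Now, based on the Delta method, the variance for \(\psi\) would be estimated as

\[
\widehat{\mathrm{var}}(\hat{\psi}) \approx \left(\frac{\partial \hat{\psi}}{\partial \hat{\theta}}\right)^2 \widehat{\mathrm{var}}(\hat{\theta}) = \left(e^{\hat{\theta}}\right)^2 \widehat{\mathrm{var}}(\hat{\theta})
\]

Now, suppose that \(\hat{\theta} = 1.0\), and \(\widehat{\mathrm{var}}(\hat{\theta}) = 0.\overline{3}\). Then, from the Delta method,

\[
\widehat{\mathrm{var}}(\hat{\psi}) \approx \left(e^{\hat{\theta}}\right)^2 \widehat{\mathrm{var}}(\hat{\theta}) = (7.38906)(0.\overline{3}) = 2.46302
\]

OK, so what's the problem? Well, let's derive the variance of \(\psi\) using the method of moments. To do this, we need to integrate the pdf (uniform, in this case) over some range. Since the variance of a uniform distribution is \((b - a)^2/12\), and if \(b\) and \(a\) are symmetric around the mean (1.0), then we can show by algebra that given a variance of \(0.\overline{3}\), then \(a = 0\) and \(b = 2\).

Given a uniform distribution, the pdf is \(p(\theta) = 1/(b - a)\). Thus, by the method of moments,

\[
M_1 = \int_a^b g(x)\,p(x)\,dx = \int_a^b \frac{g(x)}{b - a}\,dx = \frac{e^b - e^a}{b - a}
\]
\[
M_2 = \int_a^b \frac{g(x)^2}{b - a}\,dx = \frac{1}{2}\,\frac{e^{2b} - e^{2a}}{b - a}
\]

Thus, by moments, \(\mathrm{var}(\psi)\) is

\[
\mathrm{var}(\psi) = M_2 - (M_1)^2 = \frac{1}{2}\,\frac{e^{2b} - e^{2a}}{b - a} - \frac{\left(e^b - e^a\right)^2}{(b - a)^2}
\]

If \(a = 0\) and \(b = 2\), then

\[
\mathrm{var}(\psi) = M_2 - (M_1)^2 = 3.19453
\]

which is not particularly close to the value estimated by the Delta method (2.46302).

Why the discrepancy? As discussed earlier, the Delta method rests on the assumption that the first-order Taylor expansion around the parameter value is effectively linear over the range of values likely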
to be encountered. Since in this example we are using a uniform pdf, all values between \(a\) and \(b\) are equally likely. Thus, we might anticipate that as the interval between \(a\) and \(b\) gets smaller, the approximation to the variance (which will clearly decrease) will get better and better (since the smaller the interval, the more likely it is that the function is approximately linear over that range). For example, if \(a = 0.5\) and \(b = 1.5\) (same mean of 1.0), then the true variance of \(\theta\) will be \(0.08\overline{3}\). Thus, by the Delta method, the estimated variance of \(\psi\) will be 0.61575, while by the method of moments (which is exact), the variance will be 0.65792. Clearly, the proportional difference between the two values has declined markedly. But we achieved this improvement by artificially reducing the true variance of the untransformed variable \(\theta\). Obviously, we can't do this in general practice.

So, what are the practical options? Well, one possible solution is to use a higher-order Taylor series approximation - by including higher-order terms, we can achieve a better fit to the function (see the preceding sidebar). If we used a second-order TSE,

\[
E(g(x)) \approx g(\mu) + \frac{1}{2}\,g''(\mu)\,\sigma^2 \tag{B.4}
\]
\[
\mathrm{Var}(g(x)) \approx g'(\mu)^2 \sigma^2 + \frac{1}{4}\left(g''(\mu)\right)^2 \left(\mathrm{Var}(x^2) - 4\mu^2\sigma^2\right) \tag{B.5}
\]

we should do a bit better. For the variance estimate, we need to know \(\mathrm{var}(x^2)\), which for a continuous uniform distribution by the method of moments is

\[
\mathrm{var}(x^2) = \frac{1}{5}\,\frac{b^5 - a^5}{b - a} - \frac{1}{9}\,\frac{\left(b^3 - a^3\right)^2}{(b - a)^2}
\]

Thus, from the second-order approximation, and again assuming \(a = 0\) and \(b = 2\), \(\widehat{\mathrm{var}}(\hat{\psi})\) is

\[
\widehat{\mathrm{var}}(\hat{\psi}) \approx \left(e^{\hat{\theta}}\right)^2 \widehat{\mathrm{var}}(\hat{\theta}) + \frac{1}{4}\left(e^{\hat{\theta}}\right)^2 \left(\widehat{\mathrm{var}}(\hat{\theta}^2) - 4\mu^2\,\widehat{\mathrm{var}}(\hat{\theta})\right) = 3.756316
\]

which is closer (proportionately) to the true variance (3.19453) than was the estimate using only the first-order TSE (2.46302). The reason that even a second-order approximation isn't much closer is that the transformation is very non-linear over the range of the data (uniform \([0, 2]\) in this case), such that the second-order approximation doesn't fit particularly well over this range.
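The first-order comparison for \(\psi = e^{\theta}\) with \(\theta\) uniform on \([a, b]\) can be reproduced with a short script. This is a sketch under the stated assumptions only; the function names are invented here, and the second-order approximation is deliberately left out. It evaluates the Delta-method variance \((e^{\mu})^2 (b - a)^2/12\) and the exact variance \(M_2 - (M_1)^2\) for the two intervals discussed above.

```python
import math


def uniform_variance(a, b):
    """Variance of a continuous uniform distribution on [a, b]."""
    return (b - a) ** 2 / 12.0


def delta_var_exp(a, b):
    """First-order Delta-method variance of exp(theta), theta ~ Uniform[a, b]."""
    mu = (a + b) / 2.0
    # The derivative of exp(theta) is exp(theta), evaluated here at the mean.
    return math.exp(mu) ** 2 * uniform_variance(a, b)


def exact_var_exp(a, b):
    """Exact variance of exp(theta) by the method of moments: M2 - M1^2."""
    m1 = (math.exp(b) - math.exp(a)) / (b - a)
    m2 = (math.exp(2.0 * b) - math.exp(2.0 * a)) / (2.0 * (b - a))
    return m2 - m1 ** 2


for a, b in [(0.0, 2.0), (0.5, 1.5)]:
    print(a, b, round(delta_var_exp(a, b), 5), round(exact_var_exp(a, b), 5))
```

Narrowing the interval shrinks both variances, and the ratio of the Delta-method value to the exact value moves closer to 1, which is the pattern described in the text.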
So, we see that the classical Delta method, which is based on a first-order Taylor series expansion of the transformed function, may not do particularly well if the function is highly non-linear over the range of values being examined. Of course, it would be fair to note that the preceding example made the assumption that the distribution was random uniform over the interval. For most of our work with MARK, the interval is likely to have a symmetric mass around the estimate, typically \(\hat{\beta}\). As such, most of the data, and thus the transformed data, will actually fall closer to the parameter value in question (the mean in this example) than we've demonstrated here. So much so, that the discrepancy between the first-order Delta approximation to the variance and the true value of the variance will likely be significantly smaller than shown here, even for a strongly non-linear transformation. We leave it to you as an exercise to prove this for yourself. But, this point notwithstanding, it is important to be aware of the assumptions underlying the Delta method - if your transformation is non-linear, and there is considerable variation in your data, the first-order approximation may not be particularly good.

Fortunately, use of second-order Taylor series approximations is not heroically difficult - the challenge is usually coming up with \(\mathrm{var}(x^2)\). If the pdf for the untransformed data is specified (which
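is essentially equivalent to assuming an informative prior), then you can derive \(\mathrm{var}(x^2)\) fairly easily using the method of moments.

B.4. Transformations of two or more variables

Clearly, we are often interested in transformations involving more than one variable. Fortunately, there are also multivariate generalizations of the Delta method. Suppose you've estimated \(p\) different random variables \(X_1, X_2, \ldots, X_p\). In matrix notation, these variables would constitute a \((p \times 1)\) random vector

\[
\mathbf{X} = \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_p \end{pmatrix}
\]

which has mean vector

\[
\boldsymbol{\mu} = \begin{pmatrix} EX_1 \\ EX_2 \\ \vdots \\ EX_p \end{pmatrix} = \begin{pmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_p \end{pmatrix}
\]

and the \((p \times p)\) variance-covariance matrix is

\[
\boldsymbol{\Sigma} = \begin{pmatrix}
\mathrm{var}(X_1) & \mathrm{cov}(X_1, X_2) & \ldots & \mathrm{cov}(X_1, X_p) \\
\mathrm{cov}(X_2, X_1) & \mathrm{var}(X_2) & \ldots & \mathrm{cov}(X_2, X_p) \\
\vdots & \vdots & \ddots & \vdots \\
\mathrm{cov}(X_p, X_1) & \mathrm{cov}(X_p, X_2) & \ldots & \mathrm{var}(X_p)
\end{pmatrix}
\]

Note that if the variables are independent, then the off-diagonal elements (i.e., the covariance terms) are all zero. Then, for a \((k \times p)\) matrix of constants \(\mathbf{A} = a_{ij}\), the expectation of a random vector \(\mathbf{Y} = \mathbf{A}\mathbf{X}\) is given as

\[
\begin{pmatrix} EY_1 \\ EY_2 \\ \vdots \\ EY_k \end{pmatrix} = \mathbf{A}\boldsymbol{\mu}
\]

with variance-covariance matrix

\[
\mathrm{cov}(\mathbf{Y}) = \mathbf{A}\boldsymbol{\Sigma}\mathbf{A}^{T}
\]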

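Now, using the same logic we first considered for developing the Delta method for a single variable, for each \(x_i\) near \(\mu_i\), we can write

\[
\mathbf{y} = \begin{pmatrix} g_1(\mathbf{x}) \\ g_2(\mathbf{x}) \\ \vdots \\ g_p(\mathbf{x}) \end{pmatrix} \approx \begin{pmatrix} g_1(\boldsymbol{\mu}) \\ g_2(\boldsymbol{\mu}) \\ \vdots \\ g_p(\boldsymbol{\mu}) \end{pmatrix} + \mathbf{D}(\mathbf{x} - \boldsymbol{\mu})
\]

where \(\mathbf{D}\) is the matrix of partial derivatives of \(g_i\) with respect to \(x_j\), evaluated at \(\mathbf{x} = \boldsymbol{\mu}\). As with the single-variable Delta method, if the variances of the \(X_i\) are small (so that with high probability \(\mathbf{X}\) is near \(\boldsymbol{\mu}\), such that the linear approximation is usually valid), then to first order we can write

\[
\begin{pmatrix} EY_1 \\ EY_2 \\ \vdots \\ EY_p \end{pmatrix} \approx \begin{pmatrix} g_1(\boldsymbol{\mu}) \\ g_2(\boldsymbol{\mu}) \\ \vdots \\ g_p(\boldsymbol{\mu}) \end{pmatrix}
\qquad
\mathrm{var}(\mathbf{Y}) \approx \mathbf{D}\boldsymbol{\Sigma}\mathbf{D}^{T}
\]

In other words, to approximate the variance of some multi-variable function \(Y\), we (i) take the vector of partial derivatives of the function with respect to each parameter in turn, \(\mathbf{D}\), (ii) right-multiply this vector by the variance-covariance matrix, \(\boldsymbol{\Sigma}\), and (iii) right-multiply the resulting product by the transpose of the original vector of partial derivatives, \(\mathbf{D}^{T}\).

Example (1) - variance of product of survival probabilities

Let's consider the application of the Delta method in estimating sampling variances of a fairly common function - the product of several parameter estimates. Now, from the preceding, we see that

\[
\widehat{\mathrm{var}}(Y) \approx \mathbf{D}\boldsymbol{\Sigma}\mathbf{D}^{T} = \left(\frac{\partial \hat{Y}}{\partial \hat{\theta}}\right) \hat{\boldsymbol{\Sigma}} \left(\frac{\partial \hat{Y}}{\partial \hat{\theta}}\right)^{T}
\]

where \(Y\) is some linear or nonlinear function of the parameter estimates \(\hat{\theta}_1, \hat{\theta}_2, \ldots\).

(Footnote: There are alternative formulations of this expression which may be more convenient to implement in some instances. When the variables \(\theta_1, \theta_2, \ldots, \theta_k\) (in the function \(Y\)) are independent, then

\[
\mathrm{var}(Y) \approx \mathrm{var}\left(f(\theta_1, \theta_2, \ldots, \theta_k)\right) = \sum_{i=1}^{k} \left(\frac{\partial f}{\partial \theta_i}\right)^2 \mathrm{var}(\theta_i)
\]

where \(\partial f/\partial \theta_i\) is the partial derivative of \(Y\) with respect to \(\theta_i\). When the variables \(\theta_1, \theta_2, \ldots, \theta_k\) (in the function \(Y\)) are not independent, then the covariance structure among the variables must be accounted for:

\[
\mathrm{var}(Y) \approx \mathrm{var}\left(f(\theta_1, \theta_2, \ldots, \theta_k)\right) = \sum_{i=1}^{k} \left(\frac{\partial f}{\partial \theta_i}\right)^2 \mathrm{var}(\theta_i) + 2\sum_{i=1}^{k-1}\sum_{j=i+1}^{k} \left(\frac{\partial f}{\partial \theta_i}\right)\left(\frac{\partial f}{\partial \theta_j}\right) \mathrm{cov}(\theta_i, \theta_j)
\]

End of footnote.)

The first term on the RHS of the variance expression is a row vector containing partial derivatives of \(Y\) with respect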
to each of these parameters (\(\hat{\theta}_1, \hat{\theta}_2, \ldots\)). The right-most term of the RHS of the variance expression is simply the transpose of this row vector (i.e., a column vector). The middle term is simply the estimated variance-covariance matrix for the parameters.

OK, let's try an example - let's use estimates from the male European dipper data set (yes, again). We'll fit model \(\{\varphi_t\; p_{\cdot}\}\) to these data. Suppose we are interested in the probability of surviving from the start of the first interval to the end of the third interval. Well, the point estimate of this probability is easy enough - it's simply \((\hat{\varphi}_1 \times \hat{\varphi}_2 \times \hat{\varphi}_3) = (0.6109350 \times 0.458263 \times 0.4960239) = 0.138871\). So, the probability of a male Dipper surviving over the first three intervals is approximately 14% (again, assuming that our time-dependent survival model is a valid model).

To derive the estimate of the variance of the product, we will also need the variance-covariance matrix for the survival estimates. You can generate the matrix easily in MARK by selecting Output | Specific Model Output | Variance-Covariance Matrices | Real Estimates.

Here is the variance-covariance matrix for the male Dipper data, generated from model \(\{\varphi_t\; p_{\cdot}\}\):

In the Notepad output from MARK, the variance-covariance values are below the diagonal, whereas the standardized correlation values are above the diagonal. The variances are given along the diagonal. However, it is very important to note that the V-C matrix that MARK outputs to the Notepad is rounded to 5 significant digits. For the actual calculations, we need to use the full-precision values. To get those, you need to either (i) output the V-C matrix into a dBase file (which you could then open with dBase, or Excel), or (ii) copy the V-C matrix into the Windows clipboard, and then paste it into some other application. Failure to use the full-precision V-C matrix will often (almost always, in fact) lead to rounding errors. The full-precision V-C matrix for the survival values is shown next.

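\[
\widehat{\mathrm{cov}}(Y) = \begin{pmatrix}
\widehat{\mathrm{var}}(\hat{\varphi}_1) & \widehat{\mathrm{cov}}(\hat{\varphi}_1, \hat{\varphi}_2) & \widehat{\mathrm{cov}}(\hat{\varphi}_1, \hat{\varphi}_3) \\
\widehat{\mathrm{cov}}(\hat{\varphi}_2, \hat{\varphi}_1) & \widehat{\mathrm{var}}(\hat{\varphi}_2) & \widehat{\mathrm{cov}}(\hat{\varphi}_2, \hat{\varphi}_3) \\
\widehat{\mathrm{cov}}(\hat{\varphi}_3, \hat{\varphi}_1) & \widehat{\mathrm{cov}}(\hat{\varphi}_3, \hat{\varphi}_2) & \widehat{\mathrm{var}}(\hat{\varphi}_3)
\end{pmatrix}
= \begin{pmatrix}
0.0224330125 & -0.0003945405 & 0.0000654469 \\
-0.0003945405 & 0.0099722201 & -0.0002361998 \\
0.0000654469 & -0.0002361998 & 0.0072418858
\end{pmatrix}
\]

Now what? First, we need to identify the transformation we are applying to our estimates (\(\hat{\varphi}_1\), \(\hat{\varphi}_2\), and \(\hat{\varphi}_3\)). In this case, the transformation (which we'll call \(Y\)) is simple - it is the product of the three estimated survival rates. Conveniently, this makes differentiating the transformation straightforward. So, here is the variance estimator, in full:

\[
\widehat{\mathrm{var}}(Y) \approx
\begin{bmatrix} \dfrac{\partial \hat{Y}}{\partial \hat{\varphi}_1} & \dfrac{\partial \hat{Y}}{\partial \hat{\varphi}_2} & \dfrac{\partial \hat{Y}}{\partial \hat{\varphi}_3} \end{bmatrix}
\cdot \hat{\boldsymbol{\Sigma}} \cdot
\begin{bmatrix} \dfrac{\partial \hat{Y}}{\partial \hat{\varphi}_1} \\[2ex] \dfrac{\partial \hat{Y}}{\partial \hat{\varphi}_2} \\[2ex] \dfrac{\partial \hat{Y}}{\partial \hat{\varphi}_3} \end{bmatrix}
\]

Each of the partial derivatives is easy enough for this example. Since \(\hat{Y} = \hat{\varphi}_1 \hat{\varphi}_2 \hat{\varphi}_3\), then \(\partial \hat{Y}/\partial \hat{\varphi}_1 = \hat{\varphi}_2 \hat{\varphi}_3\). And so on. So,

\[
\widehat{\mathrm{var}}(Y) \approx
\begin{bmatrix} (\hat{\varphi}_2 \hat{\varphi}_3) & (\hat{\varphi}_1 \hat{\varphi}_3) & (\hat{\varphi}_1 \hat{\varphi}_2) \end{bmatrix}
\cdot \hat{\boldsymbol{\Sigma}} \cdot
\begin{bmatrix} (\hat{\varphi}_2 \hat{\varphi}_3) \\ (\hat{\varphi}_1 \hat{\varphi}_3) \\ (\hat{\varphi}_1 \hat{\varphi}_2) \end{bmatrix}
\]

OK, what about the variance-covariance matrix? Well, from the preceding we see that

\[
\widehat{\mathrm{cov}}(Y) = \begin{pmatrix}
\widehat{\mathrm{var}}(\hat{\varphi}_1) & \widehat{\mathrm{cov}}(\hat{\varphi}_1, \hat{\varphi}_2) & \widehat{\mathrm{cov}}(\hat{\varphi}_1, \hat{\varphi}_3) \\
\widehat{\mathrm{cov}}(\hat{\varphi}_2, \hat{\varphi}_1) & \widehat{\mathrm{var}}(\hat{\varphi}_2) & \widehat{\mathrm{cov}}(\hat{\varphi}_2, \hat{\varphi}_3) \\
\widehat{\mathrm{cov}}(\hat{\varphi}_3, \hat{\varphi}_1) & \widehat{\mathrm{cov}}(\hat{\varphi}_3, \hat{\varphi}_2) & \widehat{\mathrm{var}}(\hat{\varphi}_3)
\end{pmatrix}
\]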
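As a numerical sketch of the matrix product \(\mathbf{D}\boldsymbol{\Sigma}\mathbf{D}^{T}\) for this example, the calculation can be carried out with plain Python lists. The estimates and V-C values are typed in by hand from the text. Two things here are assumptions of this sketch rather than MARK output: the helper name `quadratic_form`, and the signs of the covariance terms, which are taken so that the result agrees with the sampling variance of approximately 0.0025565 reported for this example.

```python
# Survival estimates for the first three intervals (from the text).
phi = [0.6109350, 0.458263, 0.4960239]

# Full-precision variance-covariance matrix. The signs of the off-diagonal
# terms are an assumption of this sketch (see the lead-in above).
sigma = [
    [0.0224330125, -0.0003945405, 0.0000654469],
    [-0.0003945405, 0.0099722201, -0.0002361998],
    [0.0000654469, -0.0002361998, 0.0072418858],
]


def quadratic_form(d, s):
    """Return d * s * d^T for a row vector d and a square matrix s."""
    n = len(d)
    return sum(d[i] * s[i][j] * d[j] for i in range(n) for j in range(n))


# Partial derivatives of Y = phi1 * phi2 * phi3 with respect to each phi_i.
grad = [phi[1] * phi[2], phi[0] * phi[2], phi[0] * phi[1]]

product = phi[0] * phi[1] * phi[2]
var_product = quadratic_form(grad, sigma)
print(round(product, 6), round(var_product, 7))
```

Writing the quadratic form as a double sum keeps the sketch free of any matrix library; with more parameters, a linear algebra package would be the more natural choice.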

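Substituting the full-precision estimates, this matrix is

\[
\hat{\boldsymbol{\Sigma}} = \begin{pmatrix}
0.0224330125 & -0.0003945405 & 0.0000654469 \\
-0.0003945405 & 0.0099722201 & -0.0002361998 \\
0.0000654469 & -0.0002361998 & 0.0072418858
\end{pmatrix}
\]

Thus,

\[
\begin{aligned}
\widehat{\mathrm{var}}(Y) &\approx
\begin{bmatrix} (\hat{\varphi}_2 \hat{\varphi}_3) & (\hat{\varphi}_1 \hat{\varphi}_3) & (\hat{\varphi}_1 \hat{\varphi}_2) \end{bmatrix}
\cdot \hat{\boldsymbol{\Sigma}} \cdot
\begin{bmatrix} (\hat{\varphi}_2 \hat{\varphi}_3) \\ (\hat{\varphi}_1 \hat{\varphi}_3) \\ (\hat{\varphi}_1 \hat{\varphi}_2) \end{bmatrix} \\
&= \begin{bmatrix} (\hat{\varphi}_2 \hat{\varphi}_3) & (\hat{\varphi}_1 \hat{\varphi}_3) & (\hat{\varphi}_1 \hat{\varphi}_2) \end{bmatrix}
\begin{pmatrix}
\widehat{\mathrm{var}}(\hat{\varphi}_1) & \widehat{\mathrm{cov}}(\hat{\varphi}_1, \hat{\varphi}_2) & \widehat{\mathrm{cov}}(\hat{\varphi}_1, \hat{\varphi}_3) \\
\widehat{\mathrm{cov}}(\hat{\varphi}_2, \hat{\varphi}_1) & \widehat{\mathrm{var}}(\hat{\varphi}_2) & \widehat{\mathrm{cov}}(\hat{\varphi}_2, \hat{\varphi}_3) \\
\widehat{\mathrm{cov}}(\hat{\varphi}_3, \hat{\varphi}_1) & \widehat{\mathrm{cov}}(\hat{\varphi}_3, \hat{\varphi}_2) & \widehat{\mathrm{var}}(\hat{\varphi}_3)
\end{pmatrix}
\begin{bmatrix} (\hat{\varphi}_2 \hat{\varphi}_3) \\ (\hat{\varphi}_1 \hat{\varphi}_3) \\ (\hat{\varphi}_1 \hat{\varphi}_2) \end{bmatrix}
\end{aligned}
\]

Clearly, this expression is getting more and more "impressive" as we progress. Here is the resulting expression (written in piecewise fashion to make it easier to see the basic pattern):

\[
\begin{aligned}
\widehat{\mathrm{var}}(Y) \approx\;& \hat{\varphi}_2^2 \hat{\varphi}_3^2\,(\widehat{\mathrm{var}}_1) \\
&+ 2\hat{\varphi}_2 \hat{\varphi}_3^2 \hat{\varphi}_1\,(\widehat{\mathrm{cov}}_{1,2}) \\
&+ 2\hat{\varphi}_2^2 \hat{\varphi}_3 \hat{\varphi}_1\,(\widehat{\mathrm{cov}}_{1,3}) \\
&+ \hat{\varphi}_1^2 \hat{\varphi}_3^2\,(\widehat{\mathrm{var}}_2) \\
&+ 2\hat{\varphi}_1^2 \hat{\varphi}_3 \hat{\varphi}_2\,(\widehat{\mathrm{cov}}_{2,3}) \\
&+ \hat{\varphi}_1^2 \hat{\varphi}_2^2\,(\widehat{\mathrm{var}}_3)
\end{aligned}
\]

Whew - a lot of work (and if you think this equation looks "impressive", try it using a second-order Taylor series approximation!). But, under some assumptions, the Delta method does rather well in allowing you to derive an estimate of the sampling variance for functions of random variables (or, as we've described, functions of estimated parameters). So, after substituting in our estimates for \(\hat{\varphi}_i\) and the variances and covariances, our estimate for the sampling variance of the product \(\hat{Y} = (\hat{\varphi}_1 \hat{\varphi}_2 \hat{\varphi}_3)\) is (approximately) 0.0025565.

Example (2) - variance of estimate of reporting rate

In some cases animals are tagged or banded to estimate a "reporting rate" - the proportion of banded animals reported, given that they were killed and retrieved by a hunter or angler (see chapter 9 for more details). Thus, \(N_c\) animals are tagged with normal (control) tags and, of these, \(R_c\) are recovered the first year following release. The recovery rate of control animals is merely \(R_c/N_c\) and we denote this as \(f_c\).

Another group of animals, of size \(N_r\), are tagged with reward tags; these tags indicate that some amount of money (say, $50) will be given to people reporting these special tags. It is assumed that $50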
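is sufficient to ensure that all such tags will be reported, thus these serve as a basis for comparison and the estimation of a reporting rate. The recovery probability for the reward-tagged animals is merely \(R_r/N_r\), where \(R_r\) is the number of recoveries of reward-tagged animals the first year following release. We denote this recovery probability as \(f_r\). The estimator of the reporting rate is a ratio of the recovery rates and we denote this as \(\lambda\). Thus,

\[
\hat{\lambda} = \frac{\hat{f}_c}{\hat{f}_r}
\]

Now, note that both recovery probabilities are binomials. Thus,

\[
\widehat{\mathrm{var}}(\hat{f}_c) = \frac{\hat{f}_c\,(1 - \hat{f}_c)}{N_c}
\qquad
\widehat{\mathrm{var}}(\hat{f}_r) = \frac{\hat{f}_r\,(1 - \hat{f}_r)}{N_r}
\]

In this case, the samples are independent, thus \(\mathrm{cov}(f_c, f_r) = 0\) and the sampling variance-covariance matrix is diagonal:

\[
\begin{pmatrix} \widehat{\mathrm{var}}(\hat{f}_c) & 0 \\ 0 & \widehat{\mathrm{var}}(\hat{f}_r) \end{pmatrix}
\]

Next, we need the derivatives of \(\lambda\) with respect to \(f_c\) and \(f_r\):

\[
\frac{\partial \hat{\lambda}}{\partial \hat{f}_c} = \frac{1}{\hat{f}_r}
\qquad
\frac{\partial \hat{\lambda}}{\partial \hat{f}_r} = -\frac{\hat{f}_c}{\hat{f}_r^2}
\]

Thus,

\[
\widehat{\mathrm{var}}(\hat{\lambda}) \approx
\begin{pmatrix} \dfrac{1}{\hat{f}_r} & -\dfrac{\hat{f}_c}{\hat{f}_r^2} \end{pmatrix}
\begin{pmatrix} \widehat{\mathrm{var}}(\hat{f}_c) & 0 \\ 0 & \widehat{\mathrm{var}}(\hat{f}_r) \end{pmatrix}
\begin{pmatrix} \dfrac{1}{\hat{f}_r} \\[2ex] -\dfrac{\hat{f}_c}{\hat{f}_r^2} \end{pmatrix}
\]

Example (3) - variance of back-transformed estimates - simple

The basic idea behind this worked example was introduced back in Chapter 6 - in that chapter, we demonstrated how we can back-transform from the estimate of \(\beta\) on the logit scale to an estimate of some parameter \(\theta\) (e.g., \(\varphi\) or \(p\)) on the probability scale (which is bounded \([0, 1]\)). But we are clearly also interested in an estimate of the variance (precision) of our estimate, on both scales. Your first thought might be to simply back-transform from the link function (in our example, the logit link) to the probability scale, just as we did above. But, as discussed in chapter 6, this does not work.

For example, consider the male Dipper data. Using the logit link, we fit model \(\{\varphi_{\cdot}\; p_{\cdot}\}\) to the data - no time-dependence for either parameter. Let's consider only the estimate for \(\varphi\). The estimate for \(\beta\) for \(\varphi\) is 0.2648275. Thus, our estimate of \(\varphi\) on the probability scale is

\[
\hat{\varphi} = \frac{e^{0.2648275}}{1 + e^{0.2648275}} = \frac{1.303206}{2.303206} = 0.5658226
\]

which is exactly what MARK reports (to within rounding error).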

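But what about the variance? Well, if we look at the \(\beta\) estimates, MARK reports that the standard error for the estimate of \(\beta\) corresponding to survival is 0.1446688. If we simply back-transform this from the logit scale to the probability scale, we get

\[
\widehat{SE} = \frac{e^{0.1446688}}{1 + e^{0.1446688}} = \frac{1.155657}{2.155657} = 0.5361043
\]

However, MARK reports the estimated standard error for \(\varphi\) as 0.0355404, which isn't even remotely close to our back-transformed value of 0.5361043.

What has happened? Well, hopefully you now realize that you are transforming the estimate from one scale (logit) to another (probability). And, since you are working with a transformation, you need to use the Delta method to estimate the variance of the back-transformed parameter. Since

\[
\hat{\varphi} = \frac{e^{\hat{\beta}}}{1 + e^{\hat{\beta}}}
\]

then

\[
\begin{aligned}
\widehat{\mathrm{var}}(\hat{\varphi}) &\approx \left(\frac{\partial \hat{\varphi}}{\partial \hat{\beta}}\right)^2 \widehat{\mathrm{var}}(\hat{\beta}) \\
&= \left(\frac{e^{\hat{\beta}}}{1 + e^{\hat{\beta}}} - \frac{\left(e^{\hat{\beta}}\right)^2}{\left(1 + e^{\hat{\beta}}\right)^2}\right)^2 \widehat{\mathrm{var}}(\hat{\beta}) \\
&= \left(\frac{e^{\hat{\beta}}}{\left(1 + e^{\hat{\beta}}\right)^2}\right)^2 \widehat{\mathrm{var}}(\hat{\beta})
\end{aligned}
\]

It is worth noting that if

\[
\hat{\varphi} = \frac{e^{\hat{\beta}}}{1 + e^{\hat{\beta}}}
\]

then it can be easily shown that

\[
\hat{\varphi}(1 - \hat{\varphi}) = \frac{e^{\hat{\beta}}}{\left(1 + e^{\hat{\beta}}\right)^2}
\]

which is the derivative of \(\varphi\) with respect to \(\beta\). So, we could rewrite our expression for the variance of \(\hat{\varphi}\) conveniently as

\[
\widehat{\mathrm{var}}(\hat{\varphi}) \approx \left(\frac{e^{\hat{\beta}}}{\left(1 + e^{\hat{\beta}}\right)^2}\right)^2 \widehat{\mathrm{var}}(\hat{\beta}) = \left(\hat{\varphi}(1 - \hat{\varphi})\right)^2 \widehat{\mathrm{var}}(\hat{\beta})
\]

From MARK, the estimate of the SE for \(\hat{\beta}\) was 0.1446688. Thus, the estimate of \(\mathrm{var}(\beta)\) is \(0.1446688^2 = 0.02092906\). Given the estimate of \(\hat{\beta}\) of 0.2648275, we substitute into the preceding expression, which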
yields

\[
\widehat{\mathrm{var}}(\hat{\varphi}) \approx \left(\frac{e^{\hat{\beta}}}{\left(1 + e^{\hat{\beta}}\right)^2}\right)^2 \widehat{\mathrm{var}}(\hat{\beta}) = 0.0603525 \times 0.02092906 = 0.001263
\]

So, the estimated SE for \(\hat{\varphi}\) is \(\sqrt{0.001263} = 0.0355404\), which is what is reported by MARK (again, within rounding error).

SE and 95% CI

begin sidebar

The standard approach to calculating 95% confidence limits for some parameter \(\theta\) is \(\hat{\theta} \pm (1.96 \times SE)\). Is this how MARK calculates the 95% CI on the real probability scale? Well, take the example we just considered - the estimated SE for \(\hat{\varphi} = 0.5658226\) was \(\sqrt{0.001263} = 0.0355404\). So, you might assume that the 95% CI on the real probability scale would be \(0.5658226 \pm (2 \times 0.0355404)\), or \([0.4947418, 0.6369034]\).

However, this is not what is reported by MARK, which gives \([0.4953193, 0.6337593]\) - quite close, but not exactly the same. Why the difference? The difference is because MARK first calculates the 95% CI on the logit scale, before back-transforming to the real probability scale. So, for our estimate of \(\varphi\), the 95% CI on the logit scale for \(\hat{\beta} = 0.2648275\) is \([-0.0187234, 0.5483785]\), which, when back-transformed to the real probability scale, is \([0.4953193, 0.6337593]\), which is what is reported by MARK.

In this case, the very small difference between the two CIs is because the parameter estimate was quite close to 0.5. In such cases, not only will the two 95% CIs be nearly the same (for estimates of 0.5, they will be identical), but they will also be symmetrical. However, because the logit transform is not linear, the reconstituted 95% CI will not be symmetrical around the parameter estimate, especially for parameters estimated near the \([0, 1]\) boundaries. For example, consider the estimate for \(\hat{p} = 0.9231757\). On the logit scale, the 95% CI for the \(\beta\) corresponding to \(p\) (SE = 0.5120845) is \([1.4826128, 3.4899840]\). The back-transformed CI is \([0.8149669, 0.9704014]\). This CI is clearly not symmetric around \(\hat{p} = 0.9231757\). Essentially, the degree of asymmetry is a function of how close the estimated parameter is to either the 0 or 1 boundary.
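The calculations for \(\hat{\varphi}\) above can be sketched as follows. This is an illustration of the arithmetic only, not of MARK's internals; the helper name `expit` and the variable names are choices made for this sketch. It back-transforms the estimate, applies the Delta method for the SE, and builds the 95% CI on the logit scale before back-transforming the limits.

```python
import math


def expit(x):
    """Inverse logit: maps a value on the logit scale to the (0, 1) scale."""
    return 1.0 / (1.0 + math.exp(-x))


beta_hat = 0.2648275   # estimate of beta for survival, logit scale (from the text)
se_beta = 0.1446688    # its standard error (from the text)

# Point estimate on the probability scale.
phi_hat = expit(beta_hat)

# Delta method: d(phi)/d(beta) = phi * (1 - phi), so SE(phi) ~ phi*(1 - phi)*SE(beta).
se_phi = phi_hat * (1.0 - phi_hat) * se_beta

# 95% CI computed on the logit scale, then back-transformed limit by limit.
ci_logit_scale = (beta_hat - 1.96 * se_beta, beta_hat + 1.96 * se_beta)
ci_back = (expit(ci_logit_scale[0]), expit(ci_logit_scale[1]))

print(round(phi_hat, 7), round(se_phi, 7), [round(v, 7) for v in ci_back])
```

Because the limits are back-transformed through the inverse logit, the resulting interval always lies inside \([0, 1]\), whatever the values of the estimate and its SE.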
Further, the estimated variance for \(\hat{p}\),

\[
\widehat{\mathrm{var}}(\hat{p}) \approx \left(\hat{p}(1 - \hat{p})\right)^2 \widehat{\mathrm{var}}(\hat{\beta}) = \left(0.9231757\,(1 - 0.9231757)\right)^2 \times 0.262231 = 0.001319
\]

yields an estimated SE of 0.036318 on the normal probability scale (which is what is reported by MARK).

Estimating the 95% CI on the normal probability scale simply as \(0.9231757 \pm (2 \times 0.036318)\) yields \([0.85054, 0.99581]\), which is clearly quite a bit different, and more symmetrical, than what is reported by MARK (from above, \([0.8149669, 0.9704014]\)). MARK uses the back-transformed CI to ensure that the reported CI is bounded \([0, 1]\). As the estimated parameter approaches either the 0 or 1 boundary, the degree of asymmetry in the back-transformed 95% CI that MARK reports will increase.

end sidebar

Got it? Well, as a final test, consider the following, more difficult, example of back-transforming the CI from a model fit using individual covariates.

Example (3) - variance of back-transformed estimates - somewhat harder

In Chapter 6 we considered the analysis of variation in the apparent survival of the European Dipper, as a function of whether or not there was a flood in the sampling area. Here, we will consider just the male Dipper data (the encounter data are contained in ed_males.inp). Recall that for these data, there are 7 sampling occasions (6 intervals), and that a flood occurred during the second and third intervals. For present purposes, we'll assume that encounter probability was constant over time, and that survival is a linear function of "flood" or "non-flood". Using a logit link function, where "flood" years were coded using a 1, and "non-flood" years were coded using a 0, the linear model for survival on the logit scale is

\[
\mathrm{logit}(\hat{\varphi}) = 0.4267863 - 0.5066372(\text{flood})
\]

So, in a flood year,

\[
\begin{aligned}
\mathrm{logit}(\hat{\varphi}_{flood}) &= 0.4267863 - 0.5066372(\text{flood}) \\
&= 0.4267863 - 0.5066372(1) \\
&= -0.0798509
\end{aligned}
\]

Back-transforming onto the real probability scale,

\[
\hat{\varphi}_{flood} = \frac{e^{-0.0798509}}{1 + e^{-0.0798509}} = 0.48005
\]

which is precisely what is reported by MARK.

Now, what about the estimated variance for \(\hat{\varphi}_{flood}\)? First, what is our transformation function (\(Y\))? Simple - it is the back-transform of the linear equation on the logit scale. Given that

\[
\mathrm{logit}(\hat{\varphi}) = \beta_0 + \beta_1(\text{flood}) = 0.4267863 - 0.5066372(\text{flood})
\]

then the back-transform function \(Y\) is

\[
Y = \frac{e^{0.4267863 - 0.5066372(\text{flood})}}{1 + e^{0.4267863 - 0.5066372(\text{flood})}}
\]

Second, since our transformation clearly involves multiple parameters (\(\beta_0\), \(\beta_1\)), the estimate of the variance is given to first order by

\[
\widehat{\mathrm{var}}(Y) \approx \mathbf{D}\boldsymbol{\Sigma}\mathbf{D}^{T} = \left(\frac{\partial \hat{Y}}{\partial \hat{\theta}}\right) \hat{\boldsymbol{\Sigma}} \left(\frac{\partial \hat{Y}}{\partial \hat{\theta}}\right)^{T}
\]

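Given our linear (transformation) equation, the vector of partial derivatives is (we've transposed it to make it easily fit on the page):

\[
\begin{pmatrix} \dfrac{\partial \hat{Y}}{\partial \hat{\beta}_0} & \dfrac{\partial \hat{Y}}{\partial \hat{\beta}_1} \end{pmatrix}^{T}
= \begin{pmatrix}
\dfrac{e^{\beta_0 + \beta_1(\text{flood})}}{1 + e^{\beta_0 + \beta_1(\text{flood})}} - \dfrac{\left(e^{\beta_0 + \beta_1(\text{flood})}\right)^2}{\left(1 + e^{\beta_0 + \beta_1(\text{flood})}\right)^2} \\[3ex]
\dfrac{\text{flood} \times e^{\beta_0 + \beta_1(\text{flood})}}{1 + e^{\beta_0 + \beta_1(\text{flood})}} - \dfrac{\text{flood} \times \left(e^{\beta_0 + \beta_1(\text{flood})}\right)^2}{\left(1 + e^{\beta_0 + \beta_1(\text{flood})}\right)^2}
\end{pmatrix}
\]

While this is fairly ugly looking, the structure is quite straightforward - the only difference between the 2 elements of the vector is that the numerators of both terms (on either side of the minus sign) are multiplied by 1 and flood, respectively. Where do these scalar multipliers come from? They are simply the partial derivatives of the linear model (we'll call it \(Y\)) on the logit scale

\[
Y = \mathrm{logit}(\hat{\varphi}) = \beta_0 + \beta_1(\text{flood})
\]

with respect to each of the parameters (\(\beta_i\)) in turn. In other words, \(\partial Y/\partial \beta_0 = 1\), and \(\partial Y/\partial \beta_1 = \text{flood}\).

Substituting in our estimates for \(\hat{\beta}_0 = 0.4267863\) and \(\hat{\beta}_1 = -0.5066372\), and setting flood = 1 (to indicate a flood year) yields

\[
\begin{pmatrix} \dfrac{\partial \hat{Y}}{\partial \hat{\beta}_0} & \dfrac{\partial \hat{Y}}{\partial \hat{\beta}_1} \end{pmatrix} = \begin{bmatrix} 0.249602 & 0.249602 \end{bmatrix}
\]

From the MARK output (after exporting to a dBase file - and not to the Notepad - in order to get full precision), the full V-C matrix for the parameters \(\beta_0\) and \(\beta_1\) is

\[
\begin{pmatrix} 0.0321405326 & -0.0321581167 \\ -0.0321581167 & 0.0975720877 \end{pmatrix}
\]

So,

\[
\widehat{\mathrm{var}}(Y) \approx \begin{bmatrix} 0.249602 & 0.249602 \end{bmatrix}
\begin{pmatrix} 0.0321405326 & -0.0321581167 \\ -0.0321581167 & 0.0975720877 \end{pmatrix}
\begin{bmatrix} 0.249602 \\ 0.249602 \end{bmatrix} = 0.0040742678
\]

So, the estimated SE for the reconstituted value of survival for an individual during a flood year is \(\sqrt{0.0040742678} = 0.0638300\), which is what is reported by MARK (to within rounding error).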
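The flood-year calculation can be sketched in Python as follows. The estimates and V-C matrix are typed in from the text, with the negative signs on \(\hat{\beta}_1\) and on the covariance term as implied by the worked values. The variable names are choices made for this sketch, and the gradient uses the identity \(\partial \varphi/\partial \beta_i = \varphi(1 - \varphi) \times \partial(\text{linear predictor})/\partial \beta_i\).

```python
import math

beta = [0.4267863, -0.5066372]   # intercept and flood effect on the logit scale
flood = 1.0                      # 1 = flood year, 0 = non-flood year

# Variance-covariance matrix for (beta_0, beta_1), full precision from the text.
vc = [
    [0.0321405326, -0.0321581167],
    [-0.0321581167, 0.0975720877],
]

# Linear predictor and back-transformed survival.
eta = beta[0] + beta[1] * flood
phi = 1.0 / (1.0 + math.exp(-eta))

# Gradient of phi with respect to (beta_0, beta_1): phi*(1 - phi) times (1, flood).
grad = [phi * (1.0 - phi) * 1.0, phi * (1.0 - phi) * flood]

# Delta method: D * Sigma * D^T, written out as a double sum.
var_phi = sum(grad[i] * vc[i][j] * grad[j] for i in range(2) for j in range(2))
se_phi = math.sqrt(var_phi)

print(round(phi, 5), round(var_phi, 10), round(se_phi, 7))
```

Setting `flood = 0.0` zeroes the second element of the gradient, so only \(\widehat{\mathrm{var}}(\hat{\beta}_0)\) contributes to the variance for a non-flood year.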

Example (4) - variance of back-transformed estimates - a bit harder still

Recall that in Chapter 11, we considered analysis of the effect of various functions of mass (specifically, mass, and mass$^2$) on the survival of a hypothetical species of bird (the simulated data are in file indcov1.inp). The linear function relating survival to mass and mass2, on the logit scale, is

$$\text{logit}(\varphi) = 0.256732 + 1.1750358(\text{mass}_s) - 1.0554864(\text{mass}^2_s)$$

Note that for the two mass terms, there is a small subscript s - reflecting the fact that these are standardized masses. Recall that we standardized the covariates by subtracting the mean of the covariate, and dividing by the standard deviation (the use of standardized or non-standardized covariates is discussed at length in Chapter 11). Thus, for each individual in the sample, the estimated survival probability (on the logit scale) for that individual, given its mass, is given by

$$\text{logit}(\varphi) = 0.256732 + 1.1750358\left(\frac{m-\bar{m}}{SD_m}\right) - 1.0554864\left(\frac{m2-\overline{m2}}{SD_{m2}}\right)$$

In this expression, m refers to mass and m2 refers to mass2. The output from MARK (preceding page) actually gives you the mean and standard deviations for both covariates: for mass, mean = 109.97, and SD = 24.79, while for mass2, the mean = 12707.46, and the SD = 5532.03. The value column shows the standardized values for mass and mass2 (0.803 and 0.752) for the first individual in the data file.

Let's look at an example. Suppose the mass of the bird was 110 units. Thus mass = 110, mass2 = 110$^2$ = 12100. Thus,

$$\text{logit}(\varphi) = 0.2567 + 1.17504\left(\frac{110-109.97}{24.79}\right) - 1.0555\left(\frac{12100-12707.46}{5532.03}\right) = 0.374$$

So, if logit($\varphi$) = 0.374, then the reconstituted estimate of $\varphi$, transformed back from the logit scale, is

$$\frac{e^{0.374}}{1+e^{0.374}} = 0.592$$

Thus, for an individual weighing 110 units, the expected annual survival probability is approximately 0.5925 (which is what MARK reports if you use the User specify covariate option).

OK, but what about the variance (and corresponding SE) for this estimate? First, what is our transformation function (Y)? Easy - it is the back-transform of the linear equation on the logit scale. Given that

$$\text{logit}(\hat\varphi) = \beta_0 + \beta_1(\text{mass}_s) + \beta_2(\text{mass}^2_s) = 0.2567 + 1.17505(\text{mass}_s) - 1.0555(\text{mass}^2_s)$$

then the back-transform function Y is

$$Y = \frac{e^{0.2567+1.17505(\text{mass}_s)-1.0555(\text{mass}^2_s)}}{1+e^{0.2567+1.17505(\text{mass}_s)-1.0555(\text{mass}^2_s)}}$$
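As a hedged illustration of the standardization and back-transformation steps, the sketch below recomputes the point estimate for a bird of a given mass. It is written in Python (not something MARK provides), the function and constant names are hypothetical, and the numbers are the rounded values quoted in the text, so results agree with MARK only to within rounding.

```python
import math

# Values quoted in the text; names are illustrative assumptions.
BETA = (0.256732, 1.1750358, -1.0554864)   # intercept, mass_s, mass2_s
MASS_MEAN, MASS_SD = 109.97, 24.79          # mean and SD of mass
MASS2_MEAN, MASS2_SD = 12707.46, 5532.03    # mean and SD of mass squared


def standardize(mass):
    """Return standardized mass and standardized mass-squared."""
    mass_s = (mass - MASS_MEAN) / MASS_SD
    mass2_s = (mass ** 2 - MASS2_MEAN) / MASS2_SD
    return mass_s, mass2_s


def survival(mass):
    """Return (logit-scale value, back-transformed survival) for a given mass."""
    mass_s, mass2_s = standardize(mass)
    eta = BETA[0] + BETA[1] * mass_s + BETA[2] * mass2_s
    # 1/(1+e^-eta) is algebraically the same as e^eta/(1+e^eta)
    phi = 1.0 / (1.0 + math.exp(-eta))
    return eta, phi


eta_110, phi_110 = survival(110)
print(round(eta_110, 3), round(phi_110, 4))
```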

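As in the preceding example, since our transformation clearly involves multiple parameters ($\beta_0$, $\beta_1$, $\beta_2$), the estimate of the variance is given by

$$\widehat{\text{var}}(Y) \approx \mathbf{D}\boldsymbol{\Sigma}\mathbf{D}^{T} = \left(\frac{\partial(\hat{Y})}{\partial(\hat\theta)}\right)\boldsymbol{\Sigma}\left(\frac{\partial(\hat{Y})}{\partial(\hat\theta)}\right)^{T}$$

Given our linear (transformation) equation (from above), then the vector of partial derivatives is (we've substituted m for mass and m2 for mass2, and transposed it to make it easily fit on the page):

$$
\begin{pmatrix} \dfrac{\partial(\hat{Y})}{\partial\hat\beta_0} & \dfrac{\partial(\hat{Y})}{\partial\hat\beta_1} & \dfrac{\partial(\hat{Y})}{\partial\hat\beta_2} \end{pmatrix}^{T}
=
\begin{pmatrix}
\dfrac{e^{\beta_0+\beta_1(m)+\beta_2(m2)}}{1+e^{\beta_0+\beta_1(m)+\beta_2(m2)}} - \dfrac{\left(e^{\beta_0+\beta_1(m)+\beta_2(m2)}\right)^2}{\left(1+e^{\beta_0+\beta_1(m)+\beta_2(m2)}\right)^2} \\[3ex]
\dfrac{m\times e^{\beta_0+\beta_1(m)+\beta_2(m2)}}{1+e^{\beta_0+\beta_1(m)+\beta_2(m2)}} - \dfrac{m\times\left(e^{\beta_0+\beta_1(m)+\beta_2(m2)}\right)^2}{\left(1+e^{\beta_0+\beta_1(m)+\beta_2(m2)}\right)^2} \\[3ex]
\dfrac{m2\times e^{\beta_0+\beta_1(m)+\beta_2(m2)}}{1+e^{\beta_0+\beta_1(m)+\beta_2(m2)}} - \dfrac{m2\times\left(e^{\beta_0+\beta_1(m)+\beta_2(m2)}\right)^2}{\left(1+e^{\beta_0+\beta_1(m)+\beta_2(m2)}\right)^2}
\end{pmatrix}
$$

Again, while this is fairly ugly looking (even more so than the previous example), the structure is again quite straightforward - the only difference between the 3 elements of the vector is that the numerators of both terms (on either side of the minus sign) are multiplied by 1, m, and m2, respectively, which are simply the partial derivatives of the linear model (we'll call it Y) on the logit scale

$$Y = \text{logit}(\hat\varphi) = \beta_0 + \beta_1(m_s) + \beta_2(m^2_s)$$

with respect to each of the parameters ($\beta_i$) in turn. In other words, $\partial Y/\partial\beta_0 = 1$, $\partial Y/\partial\beta_1 = m$, and $\partial Y/\partial\beta_2 = m2$.

So, now that we have our vector of partial derivatives of the transformation function with respect to each of the parameters, we can simplify things considerably by substituting in the standardized values for m and m2, and the estimated parameter values ($\hat\beta_0$, $\hat\beta_1$, and $\hat\beta_2$). For a mass of 110 g, the standardized values for mass and mass2 are

$$\text{mass}_s = \left(\frac{110-109.97}{24.79}\right) = 0.0012102 \qquad \text{mass2}_s = \left(\frac{12100-12707.46}{5532.03}\right) = -0.109808$$

The estimates for $\beta_i$ we read directly from MARK: $\hat\beta_0 = 0.2567333$, $\hat\beta_1 = 1.1750545$, $\hat\beta_2 = -1.0554864$.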
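One way to convince yourself that the vector of partial derivatives is right is to compare it with a numerical derivative. The sketch below (Python, with hypothetical function names; the central finite-difference check is an addition for illustration and is not something MARK does) evaluates the analytic expression for a 110 g bird and compares it with a finite-difference approximation.

```python
import math

# Estimates read from MARK, as quoted in the text.
BETA_HAT = [0.2567333, 1.1750545, -1.0554864]

# Standardized covariates for a 110 g bird (means and SDs from the text).
M_S = (110 - 109.97) / 24.79
M2_S = (110 ** 2 - 12707.46) / 5532.03

# Partial derivatives of the linear predictor w.r.t. beta0, beta1, beta2.
X = [1.0, M_S, M2_S]


def back_transform(beta):
    """Y = e^eta / (1 + e^eta), with eta the linear predictor."""
    eta = sum(b * x for b, x in zip(beta, X))
    return math.exp(eta) / (1.0 + math.exp(eta))


def analytic_gradient(beta):
    """Each element is x_k * [e^eta/(1+e^eta) - (e^eta)^2/(1+e^eta)^2]."""
    eta = sum(b * x for b, x in zip(beta, X))
    e = math.exp(eta)
    bracket = e / (1.0 + e) - e ** 2 / (1.0 + e) ** 2
    return [x * bracket for x in X]


def numeric_gradient(beta, h=1e-6):
    """Central finite-difference approximation to the same gradient."""
    grads = []
    for k in range(len(beta)):
        up, down = list(beta), list(beta)
        up[k] += h
        down[k] -= h
        grads.append((back_transform(up) - back_transform(down)) / (2.0 * h))
    return grads


print([round(g, 5) for g in analytic_gradient(BETA_HAT)])
print([round(g, 5) for g in numeric_gradient(BETA_HAT)])
```

If the two printed vectors agree, the analytic derivatives are consistent with the back-transform function.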

Substituting in these estimates for $\beta_i$ and the standardized m and m2 values (from the previous page) into our vector of partial derivatives (above) yields

$$
\begin{pmatrix} \dfrac{\partial(\hat{Y})}{\partial\hat\beta_0} & \dfrac{\partial(\hat{Y})}{\partial\hat\beta_1} & \dfrac{\partial(\hat{Y})}{\partial\hat\beta_2} \end{pmatrix}^{T}
= \begin{pmatrix} 0.24145 \\ 0.00029 \\ -0.02651 \end{pmatrix}
$$

From the MARK output (after exporting to a dBase file - and not to the Notepad - in order to get full precision), the full V-C matrix for the $\beta$ parameters is

$$
\begin{pmatrix}
0.0009006921 & -0.0004109710 & 0.0003662359 \\
-0.0004109710 & 0.0373887267 & -0.0364250288 \\
0.0003662359 & -0.0364250288 & 0.0362776933
\end{pmatrix}
$$

So,

$$
\widehat{\text{var}}(Y) \approx
\begin{bmatrix} 0.24145 & 0.00029 & -0.02651 \end{bmatrix}
\begin{pmatrix}
0.0009006921 & -0.0004109710 & 0.0003662359 \\
-0.0004109710 & 0.0373887267 & -0.0364250288 \\
0.0003662359 & -0.0364250288 & 0.0362776933
\end{pmatrix}
\begin{bmatrix} 0.24145 \\ 0.00029 \\ -0.02651 \end{bmatrix}
= 0.00007387
$$

So, the estimated SE for the reconstituted value of survival for an individual weighing 110 g is $\sqrt{0.00007387} = 0.00860$, which is what is reported by MARK (again, to within rounding error).

It is important to remember that the estimated variance will vary depending on the mass you use - the estimate of the variance for a 110 g individual (0.00007387) will differ from the estimated variance for (say) a 120 g individual. For a 120 g individual, the standardized values of mass and mass$^2$ are 0.4045568999 and 0.3059519429, respectively. Based on these values, then

$$
\begin{pmatrix} \dfrac{\partial(\hat{Y})}{\partial\hat\beta_0} & \dfrac{\partial(\hat{Y})}{\partial\hat\beta_1} & \dfrac{\partial(\hat{Y})}{\partial\hat\beta_2} \end{pmatrix}^{T}
= \begin{pmatrix} 0.23982 \\ 0.09702 \\ 0.07337 \end{pmatrix}
$$

(the second element is the first multiplied by the standardized mass, $0.23982 \times 0.4045569 = 0.09702$). Given the variance-covariance matrix for this model (shown above), then

$$\widehat{\text{var}}(Y) \approx \mathbf{D}\boldsymbol{\Sigma}\mathbf{D}^{T} = 0.000074214$$

Thus, the estimated SE for the reconstituted value of survival for an individual weighing 120 g is $\sqrt{0.000074214} = 0.008615$, which is what is reported by MARK (again, within rounding error).

Note that this value for the SE for the 120 g individual (0.008615) differs from the SE estimated for the 110 g individual (0.008600), albeit not by much (the small difference here is because this is a very large simulated data set based on a deterministic model - see Chapter 11 for details).
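The 110 g and 120 g calculations differ only in the covariate values, so they are easy to wrap in a single function and repeat over a range of masses. The following Python sketch is one hedged way to do this (the function names and the choice of masses from 90 to 150 are illustrative assumptions; the estimates, means, SDs and V-C matrix are the values quoted in the text, so results match MARK only to within rounding).

```python
import math

BETA_HAT = [0.2567333, 1.1750545, -1.0554864]
VCV = [[0.0009006921, -0.0004109710, 0.0003662359],
       [-0.0004109710, 0.0373887267, -0.0364250288],
       [0.0003662359, -0.0364250288, 0.0362776933]]
MASS_MEAN, MASS_SD = 109.97, 24.79
MASS2_MEAN, MASS2_SD = 12707.46, 5532.03


def survival_se(mass):
    """Return (survival, Delta-method variance, SE) for a bird of given mass."""
    x = [1.0,
         (mass - MASS_MEAN) / MASS_SD,
         (mass ** 2 - MASS2_MEAN) / MASS2_SD]
    eta = sum(b * xi for b, xi in zip(BETA_HAT, x))
    phi = math.exp(eta) / (1.0 + math.exp(eta))
    # Gradient D: each covariate multiplier times phi*(1-phi)
    d = [xi * phi * (1.0 - phi) for xi in x]
    # Quadratic form D * Sigma * D^T
    var = sum(d[i] * VCV[i][j] * d[j] for i in range(3) for j in range(3))
    return phi, var, math.sqrt(var)


# The two masses worked through in the text.
for mass in (110, 120):
    phi, var, se = survival_se(mass)
    print(mass, round(phi, 4), round(var, 8), round(se, 5))

# Reconstituted survival and SE over a range of masses (for plotting a curve).
curve = [(mass,) + survival_se(mass)[::2] for mass in range(90, 151, 10)]
for mass, phi, se in curve:
    print(mass, round(phi, 4), round(se, 5))
```

Each row of `curve` holds the mass, the reconstituted survival, and its SE, which is the information needed to draw the estimate with an approximate confidence band.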
Since each weight would have its own estimated survival, and associated estimated variance and SE, to generate a curve showing the reconstituted values and their SE, you'd need to iteratively calculate $\mathbf{D}\boldsymbol{\Sigma}\mathbf{D}^{T}$ over a range of weights. We'll leave it to you to figure out how to handle the programming if you want to do this on your own. For the less ambitious, MARK now has the capacity to do much of this for you - you can output the 95% CI data over a range of individual covariate values to a spreadsheet (see section 11.5 in Chapter 11).

B.5. Delta method and model averaging

In the preceding examples, we focused on the application of the Delta method to transformations of parameter estimates from a single model. However, as introduced in Chapter 4 - and emphasized throughout the remainder of this book - we are often interested in accounting for model selection uncertainty by using model-averaged values. There is no major complication for application of the Delta method to model-averaged parameter values - you simply need to make sure you use model-averaged values for each element of the calculations.

We'll demonstrate this using analysis of the male dipper data (ed_male.inp). Suppose that we fit 2 candidate models to these data: $\{\varphi_{\cdot}\; p_t\}$ and $\{\varphi_{flood}\; p_t\}$. In other words, a model where survival is constant over time, and a model where survival is constrained to be a function of a binary flood variable (see section 6.4 of Chapter 6).

Here are the results of fitting these 2 models to the data:

[figure: MARK results for the 2 candidate models]

As expected (based on the analysis of these data presented in Chapter 6), we see that there is some evidence of model selection uncertainty - the model where survival is constant over time has roughly 2-3 times the weight of the model where survival is constrained to be a function of the binary flood variable.

The model averaged values for each interval are shown below:

interval    1       2       3       4       5       6
estimate    0.5673  0.5332  0.5332  0.5673  0.5673  0.5673
SE          0.0441  0.0581  0.0581  0.0441  0.0441  0.0441

Now, suppose we want to derive the best estimate of the probability of survival over (say) the first 3 intervals.
Clearly, all we need to do is take the product of the 3 model-averaged values corresponding to the first 3 intervals:

$$(0.5673 \times 0.5332 \times 0.5332) = 0.1613$$

In other words, our best estimate of the probability that a male dipper would survive from the start of the time series to the end of the third interval is 0.1613.

What about the standard error of this product? Here, we use the Delta method. Recall that

$$\widehat{\text{var}}(Y) \approx \mathbf{D}\boldsymbol{\Sigma}\mathbf{D}^{T}$$

which we write out more fully as

$$\widehat{\text{var}}(Y) \approx \mathbf{D}\boldsymbol{\Sigma}\mathbf{D}^{T} = \left(\frac{\partial(\hat{Y})}{\partial(\hat\theta)}\right)\boldsymbol{\Sigma}\left(\frac{\partial(\hat{Y})}{\partial(\hat\theta)}\right)^{T}$$

where Y is some linear or nonlinear function of the parameter estimates $\hat\theta_1, \hat\theta_2, \ldots$. For this example, Y is the product of the survival estimates.

So, the first thing we need to do is to generate the estimated variance-covariance matrix for the model averaged survival estimates. This is easy enough to do - in the Model Averaging Parameter Selection window, you simply need to Export Variance-Covariance Matrix to a dBase file - you do this by checking the appropriate check box (lower-left), as shown at the top of the next page.

The rounded values which would be output to the Notepad (or whatever editor you've specified) are shown at the top of the next page. Recall that the variance-covariance matrix of estimates is given on the diagonal and below (whereas the correlation matrix of the estimates is shown above the diagonal). (Note: remember that for the actual calculations you need the full precision variance-covariance matrix from the exported dBase file.)

All that remains is to substitute our model-averaged estimates for (i) $\hat\varphi$ and (ii) the variance-covariance matrix, into $\widehat{\text{var}}(Y) \approx \mathbf{D}\boldsymbol{\Sigma}\mathbf{D}^{T}$. Thus,

$$\widehat{\text{var}}(Y) \approx \mathbf{D}\boldsymbol{\Sigma}\mathbf{D}^{T} = \left(\frac{\partial(\hat{Y})}{\partial(\hat\theta)}\right)\boldsymbol{\Sigma}\left(\frac{\partial(\hat{Y})}{\partial(\hat\theta)}\right)^{T}$$

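For the product $\hat{Y} = \hat\varphi_1\hat\varphi_2\hat\varphi_3$, the partial derivative with respect to each estimate is the product of the other two, so that

$$
\begin{aligned}
\widehat{\text{var}}(Y) &\approx
\begin{bmatrix} (\hat\varphi_2\hat\varphi_3) & (\hat\varphi_1\hat\varphi_3) & (\hat\varphi_1\hat\varphi_2) \end{bmatrix}
\boldsymbol{\Sigma}
\begin{bmatrix} (\hat\varphi_2\hat\varphi_3) \\ (\hat\varphi_1\hat\varphi_3) \\ (\hat\varphi_1\hat\varphi_2) \end{bmatrix} \\[2ex]
&=
\begin{bmatrix} (\hat\varphi_2\hat\varphi_3) & (\hat\varphi_1\hat\varphi_3) & (\hat\varphi_1\hat\varphi_2) \end{bmatrix}
\begin{pmatrix}
\widehat{\text{var}}(\hat\varphi_1) & \widehat{\text{cov}}(\hat\varphi_1,\hat\varphi_2) & \widehat{\text{cov}}(\hat\varphi_1,\hat\varphi_3) \\
\widehat{\text{cov}}(\hat\varphi_2,\hat\varphi_1) & \widehat{\text{var}}(\hat\varphi_2) & \widehat{\text{cov}}(\hat\varphi_2,\hat\varphi_3) \\
\widehat{\text{cov}}(\hat\varphi_3,\hat\varphi_1) & \widehat{\text{cov}}(\hat\varphi_3,\hat\varphi_2) & \widehat{\text{var}}(\hat\varphi_3)
\end{pmatrix}
\begin{bmatrix} (\hat\varphi_2\hat\varphi_3) \\ (\hat\varphi_1\hat\varphi_3) \\ (\hat\varphi_1\hat\varphi_2) \end{bmatrix} \\[2ex]
&=
\begin{bmatrix} 0.284303069 & 0.3024783390 & 0.3024783390 \end{bmatrix}
\begin{pmatrix}
0.0019410083 & 0.0001259569 & 0.0001259569 \\
0.0001259569 & 0.0033727452 & 0.0033727423 \\
0.0001259569 & 0.0033727423 & 0.0033727452
\end{pmatrix}
\begin{bmatrix} 0.284303069 \\ 0.3024783390 \\ 0.3024783390 \end{bmatrix} \\[2ex]
&= 0.001435
\end{aligned}
$$

The corresponding estimated SE for the product is $\sqrt{0.001435} \approx 0.0379$.

B.6. Summary

In this appendix, we've briefly introduced a convenient, fairly straightforward method for deriving an estimate of the sampling variance for transformations of one or more variables. Such transformations are quite commonly encountered when using MARK, and having a method to derive estimates of the sampling variances is convenient. The most straightforward method, based on a first-order Taylor series expansion, is known generally as the Delta method. However, as we saw, the first-order Taylor series approximation may not always be appropriate, especially if the transformation is highly non-linear, and if there is significant variation in the data. In such cases, you may have to resort to higher-order approximations, or numerically intensive bootstrapping approaches.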
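As a compact recap of the $\mathbf{D}\boldsymbol{\Sigma}\mathbf{D}^{T}$ recipe, the sketch below reworks the model-averaged product from section B.5 in Python. The function names are hypothetical, and the survival estimates are the rounded model-averaged values from the table (0.5673, 0.5332, 0.5332) rather than the full-precision values MARK uses internally, so the result agrees with the worked example only to within rounding.

```python
import math

# Rounded model-averaged survival estimates for the first 3 intervals.
phi_hat = [0.5673, 0.5332, 0.5332]

# Model-averaged variance-covariance matrix quoted in the text.
sigma = [[0.0019410083, 0.0001259569, 0.0001259569],
         [0.0001259569, 0.0033727452, 0.0033727423],
         [0.0001259569, 0.0033727423, 0.0033727452]]


def product_gradient(values):
    """Partial derivative of the product w.r.t. each value:
    the product of all the other values."""
    grads = []
    for k in range(len(values)):
        others = values[:k] + values[k + 1:]
        grads.append(math.prod(others))
    return grads


def delta_var(grad, vcv):
    """First-order Delta-method variance D * Sigma * D^T."""
    n = len(grad)
    return sum(grad[i] * vcv[i][j] * grad[j]
               for i in range(n) for j in range(n))


estimate = math.prod(phi_hat)          # survival over the first 3 intervals
d = product_gradient(phi_hat)
var = delta_var(d, sigma)
se = math.sqrt(var)
print(round(estimate, 4), [round(g, 4) for g in d], round(var, 6), round(se, 4))
```

The same two helper functions apply to a product over any number of intervals, provided the matching rows and columns of the model-averaged variance-covariance matrix are supplied.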