Statistical models with uncertain error parameters


Eur. Phys. J. C (2019) 79:133
Regular Article - Experimental Physics

Glen Cowan (e-mail: g.cowan@rhul.ac.uk)
Physics Department, Royal Holloway, University of London, Egham TW20 0EX, UK

Received: 3 October 2018 / Accepted: 3 February 2019 / Published online: February 2019
© The Author(s) 2019

Abstract

In a statistical analysis in Particle Physics, nuisance parameters can be introduced to take into account various types of systematic uncertainties. The best estimate of such a parameter is often modeled as a Gaussian distributed variable with a given standard deviation (the corresponding "systematic error"). Although the assigned systematic errors are usually treated as constants, in general they are themselves uncertain. A type of model is presented where the uncertainty in the assigned systematic errors is taken into account. Estimates of the systematic variances are modeled as gamma distributed random variables. The resulting confidence intervals show interesting and useful properties. For example, when averaging measurements to estimate their mean, the size of the confidence interval increases for decreasing goodness-of-fit, and averages have reduced sensitivity to outliers. The basic properties of the model are presented and several examples relevant for Particle Physics are explored.

1 Introduction

Data analysis in Particle Physics is based on observation of a set of numbers that can be represented by a (vector) random variable, here denoted as $y$. The probability of $y$ (or probability density for continuous variables) can in general be written $P(y|\mu,\theta)$, where $\mu$ represents parameters of interest and $\theta$ are nuisance parameters needed for the correctness of the model but not of interest to the analyst. The goal of the analysis is to carry out inference related to the parameters of interest.

A procedure for doing this in the framework of frequentist statistics using the profile likelihood function is described in Sect. 2. This involves using control measurements with given standard deviations to provide information on the nuisance parameters. Here we will take the term "systematic error" to mean the standard deviation of a control measurement itself. The word "error" is used in the sense defined here and not to mean, e.g., the unknown difference between an inferred and true value. The systematic errors defined in this way should also not be confused with the corresponding systematic uncertainty in the estimate of the parameter of interest.

Often the values assigned to the systematic errors are themselves uncertain. This can be incorporated into the model by treating their values as adjustable parameters and their estimates as random variables. A model is proposed in which the estimates of systematic variances are treated as following a gamma distribution, whose mean and width are set by the analyst to reflect the desired nominal value and its relative uncertainty.

The confidence intervals that result from this type of model are found to have interesting and useful properties. For example, when averaging measurements to estimate their mean, the size of the confidence interval increases with decreasing goodness-of-fit, and averages have reduced sensitivity to outliers. The basic properties of the model are presented and several types of examples relevant for Particle Physics are explored.

The approach followed here is that of frequentist statistics, as this is widely used in Particle Physics. Models with elements similar to the one proposed have been discussed in the statistics literature, e.g., Refs. [1,2]. Analogous Bayesian procedures have been investigated in Particle Physics [3–5] and found to produce results with qualitatively similar properties.

After reviewing parameter inference using the profile likelihood with known systematic errors in Sect. 2, the model with adjustable error parameters is presented in Sect. 3 and its use in determining confidence intervals is discussed in Sect. 4. In this paper two areas where such a model can be applied are explored: a single Gaussian distributed measurement in Sect. 5 and the method of least squares in Sect. 6. The issue of correlated systematic uncertainties is discussed in Sect. 7 and conclusions are given in Sect. 8.

2 Parameter inference using the profile likelihood and the case of known systematic errors

Inference about a model's parameters is based on the likelihood function $L(\mu,\theta) = P(y|\mu,\theta)$. More specifically one can construct a frequentist test of values of the parameters of interest $\mu$ by using the profile likelihood ratio (see, e.g., Ref. [6]),

$$\lambda(\mu) = \frac{L(\mu, \hat{\hat{\theta}})}{L(\hat{\mu}, \hat{\theta})}. \tag{1}$$

Here in the denominator, $\hat{\mu}$ and $\hat{\theta}$ represent the maximum-likelihood (ML) estimators of $\mu$ and $\theta$, and $\hat{\hat{\theta}}$ are the profiled values of $\theta$, i.e., the values of $\theta$ that maximize the likelihood for a given value of $\mu$.

Often the nuisance parameters are introduced to account for a systematic uncertainty in the model. Their presence parameterizes the systematic uncertainty such that for some point in the enlarged parameter space the model should be closer to the truth. Because of correlations between the estimators of the parameters, however, the nuisance parameters result in a decrease in sensitivity to the parameters of interest. To counteract this unwanted effect, one often includes into the set of observed quantities additional measurements that provide information on the nuisance parameters.

A simple and often used form of such control measurements involves treating the best available estimates of the nuisance parameters $\theta = (\theta_1,\ldots,\theta_N)$ as independent Gaussian distributed values $u = (u_1,\ldots,u_N)$ with standard deviations $\sigma_u = (\sigma_{u_1},\ldots,\sigma_{u_N})$. In this way the full likelihood becomes

$$L(\mu,\theta) = P(y,u|\mu,\theta) = P(y|\mu,\theta)\,P(u|\theta) = P(y|\mu,\theta)\prod_{i=1}^{N}\frac{1}{\sqrt{2\pi}\,\sigma_{u_i}}\,e^{-(u_i-\theta_i)^2/2\sigma_{u_i}^2}, \tag{2}$$

or equivalently the log-likelihood is

$$\ln L(\mu,\theta) = \ln P(y|\mu,\theta) - \frac{1}{2}\sum_{i=1}^{N}\frac{(u_i-\theta_i)^2}{\sigma_{u_i}^2} + C, \tag{3}$$

where $C$ represents terms that do not depend on the adjustable parameters of the problem and therefore can be dropped; in the following such constant terms will usually not be written explicitly. The log-likelihood in Eq. (3) represents one of the most widely used methods for taking account of systematic uncertainties in Particle Physics analyses.
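As a minimal sketch of the Gaussian-constrained log-likelihood just described, the following Python fragment evaluates the quadratic constraint terms and profiles over a single nuisance parameter. The toy model (one measurement y with mean μ + θ and one control measurement u) and all function names are assumptions made for illustration; they do not come from the paper.

```python
def gaussian_constraint_lnl(u, theta, sigma_u):
    """Quadratic constraint terms: -1/2 * sum_i (u_i - theta_i)^2 / sigma_{u_i}^2."""
    return -0.5 * sum(((ui - ti) / si) ** 2 for ui, ti, si in zip(u, theta, sigma_u))


def toy_lnl(mu, theta, y, sigma_y, u, sigma_u):
    """Hypothetical toy: y ~ Gauss(mu + theta, sigma_y), u ~ Gauss(theta, sigma_u).

    Constant terms are dropped, as in the text.
    """
    main = -0.5 * ((y - mu - theta) / sigma_y) ** 2  # ln P(y | mu, theta)
    return main + gaussian_constraint_lnl([u], [theta], [sigma_u])


def profiled_theta(mu, y, sigma_y, u, sigma_u):
    """Value of theta that maximizes toy_lnl for fixed mu (closed form for this toy)."""
    w_y, w_u = 1.0 / sigma_y**2, 1.0 / sigma_u**2
    return (w_y * (y - mu) + w_u * u) / (w_y + w_u)


# Profiling over theta leaves a single quadratic term in which the two
# variances are added: -1/2 * (y - mu - u)^2 / (sigma_y^2 + sigma_u^2).
theta_hat = profiled_theta(mu=1.0, y=1.3, sigma_y=0.5, u=0.0, sigma_u=0.2)
lnl_profile = toy_lnl(1.0, theta_hat, 1.3, 0.5, 0.0, 0.2)
```

For this toy the profiled log-likelihood depends on the data only through y − μ − u, with the statistical and systematic variances added in quadrature.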
In this method, nuisance parameters are first introduced into the model to parameterize the systematic uncertainty, and then these parameters are constrained by means of control measurements.

The quadratic constraint terms in Eq. (3) correspond to the case where the estimate $u_i$ of the parameter $\theta_i$ is modeled as a Gaussian distributed variable of known standard deviation $\sigma_{u_i}$. In some problems one may have parameters $\eta_i$ that are intrinsically positive with estimates $t_i$ modeled as following a log-normal distribution. The Gaussian model covers this case as well by defining $\theta_i = \ln\eta_i$ and $u_i = \ln t_i$, so that $u_i$ is the corresponding Gaussian distributed estimator for $\theta_i$.

Often the estimates $u_i$ are the outcome of real control measurements, and so the standard deviations $\sigma_{u_i}$ are related to the corresponding sample size. The control measurement itself could, however, involve a number of uncertainties or arbitrary model choices, and as a result the values of the $\sigma_{u_i}$ may themselves be uncertain.

Gaussian modelling of the $u_i$ can be used even if the measurement exists only in an idealized sense. For example, the parameter $\theta_i$ could represent a not-yet computed coefficient in a perturbation series, and $u_i$ is one's best guess of its value (e.g., zero). In this case one may try to estimate an appropriate $\sigma_{u_i}$ by means of some recipe, e.g., by varying some aspects of the approximation technique used to arrive at $u_i$. For example, in the case of a prediction based on perturbation theory one may try varying the renormalization scale in some reasonable range. In such a case the estimate of $\sigma_{u_i}$ results from fairly arbitrary choices, and values that may differ by 50% or even a factor of two might not be unreasonable.

3 Gamma model for estimated variances

One can extend the model expressed by Eq. (2) to account for the uncertainty in the systematic errors by treating the $\sigma_{u_i}$ as adjustable parameters. The best estimates $s_i$ for the $\sigma_{u_i}$ are regarded as measurements to be included in the likelihood model. The width of the distribution of the $s_i$ is set by the analyst to reflect the appropriate uncertainty in the $\sigma_{u_i}$.
The characterization of the "error on the error" is described in Sect. 3.1. In Sect. 3.2 the full mathematical model is defined and the corresponding likelihood profiled over the $\sigma_{u_i}$ is derived. This is shown in Sect. 3.3 to be equivalent to a model in which the estimates $u_i$ follow a Student's t distribution.

3.1 The relative error on the error

In the model proposed here it is convenient to regard the variances $\sigma_{u_i}^2$ as the parameters, and to take values $v_i = s_i^2$ as their estimates. There is a special case in which the estimated variances $v_i$ will follow a chi-squared distribution, namely, when $v_i$ is the sample variance of $n$ independent observations of $u_i$, i.e.,

$$v_i = \frac{1}{n-1}\sum_{j=1}^{n}(u_{i,j} - \bar{u}_i)^2, \tag{4}$$

where $u_{i,j}$ is the $j$th observation of $u_i$ and $\bar{u}_i = \frac{1}{n}\sum_{j=1}^{n} u_{i,j}$. If the $u_{i,j}$ are Gaussian distributed with standard deviations $\sigma_{u_i}$, then one finds (see, e.g., Ref. [7]) that the statistic $(n-1)v_i/\sigma_{u_i}^2$ follows a chi-squared distribution for $n-1$ degrees of freedom. Furthermore, the chi-squared distribution for $n$ degrees of freedom is a special case of the gamma distribution,

$$f(v;\alpha,\beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\,v^{\alpha-1}e^{-\beta v}, \quad v \ge 0, \tag{5}$$

for parameter values $\alpha = n/2$ and $\beta = 1/2$. The mean and variance are related to the parameters $\alpha$ and $\beta$ by $E[v] = \alpha/\beta$ and $V[v] = \alpha/\beta^2$. Therefore if $(n-1)v_i/\sigma_{u_i}^2$ follows a chi-squared distribution with $n-1$ degrees of freedom, then $v_i$ follows a gamma distribution with

$$\alpha_i = \frac{n-1}{2}, \tag{6}$$

$$\beta_i = \frac{n-1}{2\sigma_{u_i}^2}. \tag{7}$$

Fig. 1 Plots of (a) the gamma distribution $f(v|\alpha,\beta)$ of the estimated variance $v$ and (b) the Nakagami distribution $g(s|\alpha,\beta)$ for the estimated standard deviation $s = \sqrt{v}$, for several values of the parameter $r$ (see text)

In general the analyst will not base the estimate $v_i$ on $n$ observations of $u_i$ but rather on different types of information, such as related control measurements or approximate theoretical predictions. The analyst must then set the width of the distribution of $v_i$ to reflect the appropriate level of uncertainty in the estimate of $\sigma_{u_i}^2$. For $v_i = s_i^2$, using error propagation gives to first order

$$\frac{\sigma_{v_i}}{E[v_i]} \approx 2\,\frac{\sigma_{s_i}}{E[s_i]}. \tag{8}$$

To characterize the width of the gamma distribution we define

$$r_i \equiv \frac{1}{2}\frac{\sigma_{v_i}}{E[v_i]} = \frac{1}{2}\frac{\sigma_{v_i}}{\sigma_{u_i}^2}. \tag{9}$$

From Eq. (8) one sees that to first approximation $r_i \approx \sigma_{s_i}/E[s_i]$ and thus we can think of these factors as representing the relative uncertainty in the estimate of the systematic error. The parameters $r_i$ will be referred to as the "error on the error". A more accurate relation between $r_i$ as defined here and the quantity $\sigma_{s_i}/E[s_i]$ is given in Appendix A.

Using the expectation value of the gamma distribution $E[v_i] = \alpha_i/\beta_i$ and its variance $V[v_i] = \alpha_i/\beta_i^2$, we can relate the values $r_i$ supplied by the analyst and the $\sigma_{u_i}$ to $\alpha_i$ and $\beta_i$ by

$$\alpha_i = \frac{1}{4r_i^2}, \tag{10}$$

$$\beta_i = \frac{1}{4r_i^2\sigma_{u_i}^2}. \tag{11}$$
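The mapping from a nominal systematic error and its relative "error on the error" r to the gamma parameters can be checked with a few lines of Python. This is only an illustrative sketch; the function name `gamma_params` and the numerical inputs are hypothetical.

```python
import math


def gamma_params(sigma_u, r):
    """Gamma parameters for the variance estimate v, given sigma_u and r.

    alpha = 1 / (4 r^2), beta = 1 / (4 r^2 sigma_u^2), so that
    E[v] = alpha / beta = sigma_u^2 and sqrt(V[v]) / E[v] = 2 r.
    """
    alpha = 1.0 / (4.0 * r**2)
    beta = alpha / sigma_u**2
    return alpha, beta


# Hypothetical inputs: nominal systematic error 1.5 with a 25% error on the error.
alpha, beta = gamma_params(sigma_u=1.5, r=0.25)
mean_v = alpha / beta            # expectation value of v
sd_v = math.sqrt(alpha) / beta   # standard deviation of v
r_back = 0.5 * sd_v / mean_v     # recovers r from its definition
```

Here `mean_v` reproduces the nominal variance and `r_back` reproduces the input r, which is the consistency that the definitions above are designed to give.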
Figure 1a shows the gamma distribution for a fixed value of $\sigma_u$ and several values of $r$, and Fig. 1b shows the corresponding distribution of $s = \sqrt{v}$. More details on the distribution of $s$ and its properties are given in Appendix A.

The assumption of a gamma distribution is not unique but represents nevertheless a reasonable and flexible expression of uncertainty in the $\sigma_{u_i}$. Moreover it will be shown that by using the gamma distribution one finds a very simple procedure for incorporating uncertain systematic errors into the model.

Using Eq. (6) to connect the relative uncertainty $r$ to the effective number of measurements $n$ gives $n = 1 + 1/(2r^2)$. A relevant special case is $n = 2$, sometimes called the problem of "two-point systematics", where one has two estimates $u_{i,1}$ and $u_{i,2}$ of a parameter $\theta_i$. This gives

$$\hat{\theta}_i = \bar{u}_i = \frac{1}{2}(u_{i,1} + u_{i,2}), \tag{12}$$

$$s_i = \frac{|u_{i,1} - u_{i,2}|}{\sqrt{2}}, \tag{13}$$

$$r_i = \frac{1}{\sqrt{2}}. \tag{14}$$

It will be assumed in this paper that the analyst is able to assign meaningful values for the error-on-the-error parameters $r_i$. The procedure for doing this will involve elicitation of expert knowledge from those who assigned the systematic errors and will in general vary depending on the experiment. One may want to regard a subset of the measurements as having a certain common $r$ which could be fitted from the data, but we do not investigate this possibility further here.

The proposed model thus makes two important assumptions. First, the control measurements are taken to be independent and Gaussian distributed. As mentioned in Sect. 2, the Gaussian $u_i$ can be extended to an alternative distribution if it can be related to a Gaussian by a transformation. Second, the estimates of the variances of the $u_i$ are gamma distributed. Both assumptions are reasonable but neither is a perfect description in practice, and thus the resulting inference could be subject to corresponding systematic uncertainties. Nevertheless the proposed model will in general be an improvement over the widely used Gaussian assumption for $u_i$ with fixed variances. In addition, the choice of the gamma distribution leads to important simplifications in mathematical expressions needed for inference, as shown in Sect. 3.2 below.

3.2 Likelihood for the gamma model

By treating the estimated variances $v = (v_1,\ldots,v_N)$ as independent gamma distributed random variables, the full likelihood function becomes

$$L(\mu,\theta,\sigma_u^2) = P(y|\mu,\theta)\prod_{i=1}^{N}\frac{1}{\sqrt{2\pi\sigma_{u_i}^2}}\,e^{-(u_i-\theta_i)^2/2\sigma_{u_i}^2}\,\frac{\beta_i^{\alpha_i}}{\Gamma(\alpha_i)}\,v_i^{\alpha_i-1}e^{-\beta_i v_i}. \tag{15}$$

By using Eqs. (10) and (11) to relate the parameters $\alpha_i$ and $\beta_i$ to $\sigma_{u_i}$ and $r_i$ one finds, up to additive terms that are independent of the parameters, the log-likelihood

$$\ln L(\mu,\theta,\sigma_u^2) = \ln P(y|\mu,\theta) - \frac{1}{2}\sum_{i=1}^{N}\left[\frac{(u_i-\theta_i)^2}{\sigma_{u_i}^2} + \left(1 + \frac{1}{2r_i^2}\right)\ln\sigma_{u_i}^2 + \frac{v_i}{2r_i^2\sigma_{u_i}^2}\right]. \tag{16}$$

By setting the derivatives of $\ln L$ with respect to the $\sigma_{u_i}^2$ to zero for fixed $\theta$ and $\mu$ one finds the profiled values

$$\hat{\hat{\sigma}}_{u_i}^2 = \frac{v_i + 2r_i^2(u_i-\theta_i)^2}{1 + 2r_i^2}. \tag{17}$$

Using these for the $\sigma_{u_i}^2$ gives the profile likelihood with respect to the systematic variances, but which still depends on $\theta$ as well as the parameters of interest $\mu$. After some manipulation it can be written up to constant terms as

$$\ln L'(\mu,\theta) = \ln L(\mu,\theta,\hat{\hat{\sigma}}_u^2) = \ln P(y|\mu,\theta) - \frac{1}{2}\sum_{i=1}^{N}\left(1 + \frac{1}{2r_i^2}\right)\ln\left[1 + 2r_i^2\frac{(u_i-\theta_i)^2}{v_i}\right]. \tag{18}$$
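The profiling step over a systematic variance can be verified numerically for a single control measurement. The sketch below writes out one term of the log-likelihood as a function of the variance parameter, the profiled variance, and the resulting logarithmic term in −2 ln L′; the function names and input numbers are hypothetical, chosen only for illustration.

```python
import math


def lnl_term(sigma2, u, theta, v, r):
    """One control-measurement term of ln L as a function of sigma_u^2 (constants dropped)."""
    return -0.5 * ((u - theta) ** 2 / sigma2
                   + (1.0 + 1.0 / (2.0 * r**2)) * math.log(sigma2)
                   + v / (2.0 * r**2 * sigma2))


def profiled_sigma2(u, theta, v, r):
    """Variance that maximizes lnl_term for fixed theta."""
    return (v + 2.0 * r**2 * (u - theta) ** 2) / (1.0 + 2.0 * r**2)


def profile_penalty(u, theta, v, r):
    """Contribution of one control measurement to -2 ln L' after profiling."""
    return (1.0 + 1.0 / (2.0 * r**2)) * math.log1p(2.0 * r**2 * (u - theta) ** 2 / v)


# Small r: the logarithmic term approaches the usual quadratic constraint.
nearly_quadratic = profile_penalty(u=1.0, theta=0.0, v=1.0, r=1e-3)

# Large r: a five-sigma pull costs far less than the quadratic value of 25.
softened = profile_penalty(u=5.0, theta=0.0, v=1.0, r=0.5)
```

The second evaluation illustrates why the model is less sensitive to outliers: the penalty grows only logarithmically with the squared pull once r is not small.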
Some intermediate steps in the derivation of Eq. (18) are given in Appendix B. In the limit where all of the $r_i$ are small, the estimates $v_i$ are very close to their expectation values $\sigma_{u_i}^2$. Making this replacement and expanding the logarithmic terms to first order one recovers the quadratic terms as in Eq. (3).

3.3 Derivation of profile likelihood from Student's t distribution

An equivalent derivation of the profile likelihood (18) can be obtained by first defining

$$z_i \equiv \frac{u_i - \theta_i}{\sqrt{v_i}}. \tag{19}$$

As $u_i$ follows a Gaussian with mean $\theta_i$ and standard deviation $\sigma_{u_i}$, and $v_i$ follows a gamma distribution with mean $\sigma_{u_i}^2$ and standard deviation $\sigma_{v_i} = 2r_i\sigma_{u_i}^2$, one can show (see, e.g., Ref. [7]) that $z_i$ follows a Student's t distribution,

$$f(z_i|\nu_i) = \frac{\Gamma\left(\frac{\nu_i+1}{2}\right)}{\sqrt{\nu_i\pi}\,\Gamma(\nu_i/2)}\left(1 + \frac{z_i^2}{\nu_i}\right)^{-\frac{\nu_i+1}{2}}, \tag{20}$$

with a number of degrees of freedom

$$\nu_i = \frac{1}{2r_i^2}. \tag{21}$$

By constructing the likelihood $L(\mu,\theta)$ as the product of $P(y|\mu,\theta)$ and Student's t distributions,

$$L(\mu,\theta) = P(y|\mu,\theta)\prod_{i=1}^{N}\frac{\Gamma\left(\frac{\nu_i+1}{2}\right)}{\sqrt{\nu_i\pi}\,\Gamma(\nu_i/2)}\left(1 + \frac{z_i^2}{\nu_i}\right)^{-\frac{\nu_i+1}{2}}, \tag{22}$$

one obtains the same log-likelihood as given by $\ln L'$ from Eq. (18). That is, the same model results if one replaces the estimates $v_i$ by constants $\sigma_{u_i}^2$, but still takes the $z_i$ to follow a Student's t distribution, with $u_i = \theta_i + \sigma_{u_i} z_i$. Thus in the following we can drop the prime in the profile log-likelihood (18) and regard this equivalently as the log-likelihood resulting from a model where the control measurements are distributed according to a Student's t. In the limit where $r_i \to 0$ and thus the number of degrees of freedom $\nu_i \to \infty$, the Student's t distribution becomes a Gaussian (see, e.g., Ref. [7]), and the corresponding term in the log-likelihood becomes quadratic in $u_i - \theta_i$, as in Eq. (3).

4 Estimators and confidence regions from profile likelihood

The ML estimators are found by maximizing the full $\ln L(\mu,\theta,\sigma_u^2)$ with respect to all of the parameters, which is equivalent to maximizing the profile likelihood with respect to $\mu$ and $\theta$. In this way the statistical uncertainties due to both the estimated biases $u_i$ as well as their estimated variances $v_i$ are incorporated into the variances of the estimators for the parameters of interest $\hat{\mu}$.

Consider for example the case of a single continuous parameter of interest $\mu$. Having found the estimator $\hat{\mu}$, one could quantify its statistical precision by using the standard deviation $\sigma_{\hat{\mu}}$. The covariance matrix for all of the estimated parameters can to first approximation be found from the inverse of the matrix of second derivatives of $\ln L$ (see, e.g., Refs. [8,9]). From this we extract the variance of the estimator of the parameter of interest, i.e., $V[\hat{\mu}] = \sigma_{\hat{\mu}}^2$.

The presence of the nuisance parameters in the model will in general inflate $\sigma_{\hat{\mu}}$, which reflects the corresponding systematic uncertainties. But $\sigma_{\hat{\mu}}$ is by construction a property of the model and not of a particular data set. One may want, however, to report a measure of uncertainty along with the estimate $\hat{\mu}$ that reflects the extent to which the data values are consistent with the hypothesized model, and therefore $\sigma_{\hat{\mu}}$ is not suitable for this purpose. We will show below, however, that a confidence region can be constructed that has this desired property.

In general to find a confidence region (or for a single parameter a confidence interval) one tests all values of $\mu$ with a test of size $\alpha$ for some fixed probability $\alpha$. Those values of $\mu$ that are not rejected by the test constitute a confidence region with confidence level $1 - \alpha$.

To determine the critical region of the test of a given $\mu$ one can use a test statistic based on the profile likelihood ratio

$$t_\mu = -2\ln\lambda(\mu) = -2\ln\frac{L(\mu,\hat{\hat{\theta}})}{L(\hat{\mu},\hat{\theta})}. \tag{23}$$

The critical region of a test of $\mu$ corresponds to the region of data space having probability content $\alpha$ with maximal $t_\mu$. Equivalently, provided $t_\mu$ can be treated as continuous, the p-value of a hypothesized point in parameter space $\mu$ is

$$p_\mu = \int_{t_{\mu,\mathrm{obs}}}^{\infty} f(t_\mu|\mu,\theta,\sigma_u^2)\,dt_\mu = 1 - F(t_{\mu,\mathrm{obs}}), \tag{24}$$

where $t_{\mu,\mathrm{obs}}$ is the observed value of $t_\mu$ and $F$ is the cumulative distribution of $t_\mu$. That is, we define the region of data space even less compatible with the hypothesis than what was observed to correspond to $t_\mu > t_{\mu,\mathrm{obs}}$.

The boundary of the confidence region corresponds to the values of $\mu$ where $p_\mu = \alpha$. Solving Eq. (24) for the test statistic gives

$$t_\mu = F^{-1}(1 - p_\mu), \tag{25}$$

where here $t_\mu$ refers to the value observed, and $F^{-1}$ is the quantile of $t_\mu$. The statistic $t_\mu$ is also defined in terms of the likelihood through Eqs. (1) and (23), and by using $p_\mu = \alpha$ one finds that the boundary of the confidence region is given by

$$\ln L(\mu,\hat{\hat{\theta}}) = \ln L(\hat{\mu},\hat{\theta}) - \frac{1}{2}F^{-1}(1-\alpha). \tag{26}$$

To find the p-values and thus determine the boundary of the confidence region one needs the distribution $f(t_\mu|\mu,\theta,\sigma_u^2)$. According to Wilks' theorem [10], for $M$ parameters of interest $\mu = (\mu_1,\ldots,\mu_M)$ the statistic $t_\mu$ should follow a chi-squared distribution for $M$ degrees of freedom in the asymptotic limit, which here corresponds to the case where the distributions of all ML estimators are Gaussian. To the extent that this approximation holds we may identify the quantile $F^{-1}$ in Eq. (26) with $F^{-1}_{\chi^2_M}$, the chi-squared quantile for $M$ degrees of freedom.

If it is further assumed that the log-likelihood can be well approximated by a quadratic function about its maximum, then one finds asymptotically (see, e.g., Ref. [?]) that

$$\ln L(\mu,\hat{\hat{\theta}}) = \ln L(\hat{\mu},\hat{\theta}) - \frac{1}{2}(\mu-\hat{\mu})^T V^{-1}(\mu-\hat{\mu}), \tag{27}$$

where $V_{ij} = \mathrm{cov}[\hat{\mu}_i,\hat{\mu}_j]$ is the covariance matrix for the parameters of interest. This equation says that the confidence region is a hyper-ellipsoid of fixed size centred about $\hat{\mu}$. For example, for a single parameter one finds that the endpoints

$$\mu_\pm = \hat{\mu} \pm \sigma_{\hat{\mu}}\left[F^{-1}_{\chi^2_1}(1-\alpha)\right]^{1/2} \tag{28}$$

give the central confidence interval with confidence level $1 - \alpha$.
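For a single parameter of interest the asymptotic recipe above needs only the chi-squared distribution for one degree of freedom, which can be expressed through the standard normal distribution. The following sketch uses only the Python standard library; the function names are hypothetical, and the approach (squaring a standard normal quantile) is a standard identity rather than something taken from the paper.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def chi2_1dof_quantile(prob):
    """Quantile of the chi-squared distribution for one degree of freedom.

    If Z is standard normal then Z^2 is chi-squared with one degree of
    freedom, so the quantile is the square of a two-sided normal quantile.
    """
    return _STD_NORMAL.inv_cdf(0.5 * (1.0 + prob)) ** 2


def asymptotic_p_value(t_obs):
    """p-value of an observed t_mu, assuming the asymptotic chi-squared form."""
    return 2.0 * (1.0 - _STD_NORMAL.cdf(math.sqrt(t_obs)))


def central_interval(mu_hat, sigma_mu_hat, cl):
    """Endpoints mu_hat -/+ sigma * sqrt(quantile) under the quadratic approximation."""
    half_width = sigma_mu_hat * math.sqrt(chi2_1dof_quantile(cl))
    return mu_hat - half_width, mu_hat + half_width


# Hypothetical estimate 10 with standard deviation 2, at 68.27% confidence level.
lo, hi = central_interval(mu_hat=10.0, sigma_mu_hat=2.0, cl=0.6827)
```

With these inputs the interval is very close to the estimate plus or minus one standard deviation, as expected when the quadratic approximation holds.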
For a probability content corresponding to plus or minus one standard deviation about the centre of a Gaussian, i.e., $1 - \alpha = 68.3\%$, one has $F^{-1}_{\chi^2_1}(1-\alpha) = 1$, which gives the well-known result that the interval of plus or minus one standard deviation about the estimate is asymptotically a 68.3% CL central confidence interval.

The relations (27) and (28) depend, however, on a quadratic approximation of the log-likelihood. In the model where the $\sigma_{u_i}^2$ are treated as adjustable, the profile log-likelihood is given by Eq. (18), which contains terms that are logarithmic in $(u_i - \theta_i)^2$, and not just the quadratic terms that appear in Eq. (3). As a result the relation (27) is only a good approximation in the limit of small $r_i$, which is not always valid in the present problem.

We can nevertheless use Eq. (26) assuming a chi-squared distribution for $t_\mu$ as a first approximation for confidence regions. We will see in the examples below that these have interesting properties that already capture the most important features of the model. If higher accuracy is required then Monte Carlo methods can be used to determine the distribution of $t_\mu$. Alternatively we can modify the statistic so that its distribution is closer to the asymptotic form; this is explored further in Sect. 5.1.

5 Single-measurement model

To investigate the asymptotic properties of the profile likelihood ratio it is useful to examine a simple model with a single measured value $y$ following a Gaussian with mean $\mu$ and standard deviation $\sigma$. The parameter of interest is $\mu$ and we treat the variance $\sigma^2$ as a nuisance parameter, which is constrained by an independent gamma-distributed estimate $v$. Thus the likelihood is given by

$$L(\mu,\sigma^2) = f(y,v|\mu,\sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-(y-\mu)^2/2\sigma^2}\,\frac{\beta^{\alpha}}{\Gamma(\alpha)}\,v^{\alpha-1}e^{-\beta v}. \tag{29}$$

As before we set the parameters $\alpha$ and $\beta$ of the gamma distribution so that $E[v] = \sigma^2$ and so that from Eq. (9) the standard deviation of $v$ is $\sigma_v = 2r\sigma^2$, where $r$ characterizes the relative error on the error. This gives

$$\alpha = \frac{1}{4r^2}, \tag{30}$$

$$\beta = \frac{1}{4r^2\sigma^2}. \tag{31}$$

The goal is to construct a confidence interval for $\mu$ by using the profile likelihood ratio

$$\lambda(\mu) = \frac{L(\mu,\hat{\hat{\sigma}}^2(\mu))}{L(\hat{\mu},\hat{\sigma}^2)}. \tag{32}$$

The log-likelihood is

$$\ln L(\mu,\sigma^2) = -\frac{1}{2}\frac{(y-\mu)^2}{\sigma^2} - \frac{1}{2}\left(1 + \frac{1}{2r^2}\right)\ln\sigma^2 - \frac{v}{4r^2\sigma^2} + C, \tag{33}$$

where $C$ represents constants that do not depend on $\mu$ or $\sigma^2$. From this we find the required estimators

$$\hat{\mu} = y, \tag{34}$$

$$\hat{\sigma}^2 = \frac{v}{1 + 2r^2}, \tag{35}$$

$$\hat{\hat{\sigma}}^2(\mu) = \frac{v + 2r^2(y-\mu)^2}{1 + 2r^2}. \tag{36}$$

With these ingredients we find the following simple expression for the statistic $t_\mu = -2\ln\lambda(\mu)$,

$$t_\mu = \left(1 + \frac{1}{2r^2}\right)\ln\left[1 + 2r^2\frac{(y-\mu)^2}{v}\right]. \tag{37}$$

According to Wilks' theorem [10], the distribution $f(t_\mu|\mu)$ should, in the large-sample limit, be chi-squared for one degree of freedom.
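How well the chi-squared form holds for the single-measurement statistic can be explored with a small Monte Carlo study. The sketch below generates y from a Gaussian and v from a gamma distribution with mean σ² and relative width 2r, then compares the fraction of toys with the statistic above 1 to the asymptotic value of about 0.317. The function names, the seed, the number of toys and the values of r are arbitrary choices for illustration and are not taken from the paper.

```python
import math
import random


def t_mu(y, v, mu, r):
    """Profile likelihood-ratio statistic for the single-measurement model."""
    return (1.0 + 1.0 / (2.0 * r**2)) * math.log1p(2.0 * r**2 * (y - mu) ** 2 / v)


def simulate_t(r, n_toys=20000, mu=0.0, sigma=1.0, seed=1):
    """Generate toy values of t_mu evaluated at the true mu."""
    rng = random.Random(seed)
    alpha = 1.0 / (4.0 * r**2)        # gamma shape parameter
    scale = 4.0 * r**2 * sigma**2     # 1/beta; random.gammavariate takes a scale
    values = []
    for _ in range(n_toys):
        y = rng.gauss(mu, sigma)
        v = rng.gammavariate(alpha, scale)
        values.append(t_mu(y, v, mu, r))
    return values


def tail_fraction(values, threshold):
    """Fraction of toys with a statistic above the threshold."""
    return sum(t > threshold for t in values) / len(values)


frac_small_r = tail_fraction(simulate_t(r=0.01), 1.0)  # close to the asymptotic value
frac_large_r = tail_fraction(simulate_t(r=0.6), 1.0)   # noticeably larger
```

A tail fraction above the asymptotic value means that an interval built from the chi-squared quantile would cover the true value less often than its nominal confidence level.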
The large-sample limit corresponds to the situation where estimators for the parameters become Gaussian, which in this case means $r \ll 1$.

The behaviour of the distribution of $t_\mu$ for nonzero $r$ is illustrated in Fig. 2, which shows the distributions from data generated according to a Gaussian with fixed mean $\mu$ and standard deviation $\sigma$, for values of $r = 0.01$, 0.2, 0.4 and 0.6. The case of $r = 0.01$ approximates the situation where the relative uncertainty on $\sigma$ is negligibly small. One can see that greater values of $r$ lead to an increasing departure of the distribution from the asymptotic form.

Fig. 2 Distributions of the test variable $t_\mu$ for a single Gaussian distributed measurement with relative error-on-error $r$, shown in panels (a) to (d) for increasing $r$ together with the chi-squared pdf

Depending on the size of the test being carried out or equivalently the confidence level of the interval, one may find that the asymptotic approximation is inadequate. In such a case one may wish to use Monte Carlo simulation to determine the distribution of the test statistic. Alternatively one can modify the statistic so that its distribution is better approximated by the asymptotic form, as described in the following section.

5.1 Bartlett correction for profile likelihood-ratio statistic

The likelihood-ratio statistic can be modified so as to follow more closely a chi-squared distribution using a type of correction due to Bartlett [?]. This method has received some limited notice in Particle Physics [?–15] but has not been widely used in that field.

The basic idea is to determine the mean value $E[t_\mu]$ of the original statistic. In the asymptotic limit, this should be equal to the number of degrees of freedom $n_d$ of the chi-squared distribution, which in this example is $n_d = 1$. One then defines a modified statistic

$$t'_\mu = \frac{n_d}{E[t_\mu]}\,t_\mu, \tag{38}$$

so that by construction $E[t'_\mu] = n_d$. It was shown by Lawley [16] that the modified statistic approaches the reference chi-squared distribution with a difference that decreases as an inverse power of $n$, where here the effective sample size $n$ is related to the parameter $r$ by $n = 1 + 1/(2r^2)$ (cf. Eqs. (6) and (10)).

One could in principle find the expectation value $E[t_\mu]$ by the Monte Carlo method. But for the method to be convenient to use one would like to determine the Bartlett correction without resorting to simulation. By expanding the expectation value

$$E[t_\mu] = \int\!\!\int t_\mu(y,v)\,f(y,v|\mu,\sigma^2)\,dy\,dv \tag{39}$$

as a Taylor series in $r$ one finds

$$E[t_\mu] = 1 + 3r^2 + c\,r^4, \tag{40}$$

where the coefficient of the $r^4$ term is found numerically to be $c \approx 2$, accurate at the percent level. Dividing $t_\mu/n_d$ (here with $n_d = 1$) by $E[t_\mu]$ to obtain the Bartlett-corrected statistic therefore gives

$$t'_\mu = \frac{1 + 2r^2}{2r^2(1 + 3r^2 + 2r^4)}\,\ln\left[1 + 2r^2\frac{(y-\mu)^2}{v}\right]. \tag{41}$$

In more complex problems one may not have a simple expression for the expectation value needed in the Bartlett correction and calculation by Monte Carlo may be necessary.

Monte Carlo distributions of $t'_\mu$ are shown in Fig. 3 together with the asymptotic chi-squared pdf. As can be seen by comparing the uncorrected distributions from Fig. 2 to those in Fig. 3, the Bartlett correction is clearly very effective, as is needed when the parameter $r$ is large.

5.2 Confidence intervals for the single-measurement model

In the simple model explored in this section one can use the measured values of $y$ and $v$ to construct a confidence interval for the parameter of interest $\mu$. The probability that the interval includes the true value of $\mu$ (the coverage probability) can then be studied as a function of the relative error on the error $r$. What emerges is that the interval based on the chi-squared distribution of $t_\mu$ has a coverage probability substantially less than the nominal confidence level, but that this can be greatly improved by use of the Bartlett-corrected interval.

To derive exact confidence intervals for $\mu$ we can use the fact that

$$z = \frac{y - \mu}{\sqrt{v}} \tag{42}$$

follows a Student's t distribution for $\nu = 1/(2r^2)$ degrees of freedom (see, e.g., Ref. [7]). From the distribution of $z$ one can find the corresponding pdf of

$$t_\mu = (1 + \nu)\ln\left[1 + \frac{z^2}{\nu}\right], \tag{43}$$

but in fact this is not directly needed. Rather we can use the pdf of $z$ to find confidence intervals for $\mu$ from the fact that a critical region defined by $t_\mu > t_c$ is equivalent to the corresponding region of $z$ given by $z < -z_c$ and $z > z_c$, where the boundaries of the critical regions in the two variables are related by Eq. (43). Equivalently one can say that the p-value of a hypothesized value of $\mu$ is the probability, assuming $\mu$, to find $z$ further from zero than what was observed, i.e.,

$$p_\mu = 2\int_{|z_{\mathrm{obs}}|}^{\infty} f(z)\,dz = 2\left(1 - F\left(\frac{|y-\mu|}{\sqrt{v}};\nu\right)\right), \tag{44}$$

where $F(z;\nu)$ is the cumulative Student's t distribution for $\nu = 1/(2r^2)$ degrees of freedom.

Fig. 3 Distributions of the Bartlett-corrected test variable $t'_\mu$ for a single Gaussian distributed measurement with relative error-on-error $r$, shown in panels (a) to (d) for increasing $r$ together with the chi-squared pdf

The boundaries of the confidence interval at confidence level $\mathrm{CL} = 1 - \alpha$ (here $\alpha$ refers to the size of the statistical test, not the parameter $\alpha$ in the gamma distribution) are found by setting $p_\mu = \alpha$ and solving for $\mu$, which gives the upper and lower limits

$$\mu_\pm = y \pm \sqrt{v}\,z_{\alpha/2}. \tag{45}$$

Here $z_{\alpha/2}$ is the $\alpha/2$ upper quantile of the Student's t distribution, i.e., the value of $z_{\mathrm{obs}}$ needed in Eq. (44) to have $p_\mu = \alpha$.

If one were to assume that the statistic $t_\mu$ follows the asymptotic chi-squared distribution, then $z_{\alpha/2}$ is replaced by

$$z_a = \frac{1}{\sqrt{2}\,r}\left[\exp\left(\frac{2r^2 Q_\alpha}{1 + 2r^2}\right) - 1\right]^{1/2}. \tag{46}$$

Here $Q_\alpha = F^{-1}_{\chi^2_1}(1-\alpha)$ is obtained from the quantile of the chi-squared distribution for one degree of freedom. And if the Bartlett-corrected statistic $t'_\mu$ is used to construct the interval, then the $Q_\alpha$ in Eq. (46) is replaced by $Q_\alpha E[t_\mu]$, where $E[t_\mu] = 1 + 3r^2 + 2r^4$ is the expectation value of $t_\mu$ from Eq. (40).

The half-widths of the interval measured in units of the estimated standard deviation $\sqrt{v}$, i.e., $z_{\alpha/2}$ or $z_a$, are shown in Fig. 4a as a function of the $r$ parameter. The probability $P_c$ for the confidence interval to cover the true value of $\mu$ is by construction equal to $1 - \alpha$ for the exact confidence interval.
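The interval half-widths in units of the estimated standard deviation can be compared numerically in the one case where the Student's t quantile has a simple closed form, namely two degrees of freedom, which corresponds to r = 1/2. The sketch below is illustrative only: the function names are hypothetical, the closed-form quantile for two degrees of freedom is a standard result that is not taken from the paper, and the Bartlett factor uses the expansion 1 + 3r² + 2r⁴ quoted in the text.

```python
import math
from statistics import NormalDist


def chi2_1dof_quantile(prob):
    """Chi-squared quantile for one degree of freedom via the normal quantile."""
    return NormalDist().inv_cdf(0.5 * (1.0 + prob)) ** 2


def half_width_asymptotic(r, cl=0.683, bartlett=False):
    """Half-width z_a obtained by assuming a chi-squared distribution for the statistic.

    With bartlett=True the chi-squared quantile is multiplied by
    E[t_mu] = 1 + 3 r^2 + 2 r^4 before being converted to a half-width.
    """
    q = chi2_1dof_quantile(cl)
    if bartlett:
        q *= 1.0 + 3.0 * r**2 + 2.0 * r**4
    return math.sqrt(math.expm1(2.0 * r**2 * q / (1.0 + 2.0 * r**2))) / (math.sqrt(2.0) * r)


def half_width_exact_nu2(cl=0.683):
    """Exact Student's t half-width for nu = 2, where F(z) = 1/2 + z / (2 sqrt(2 + z^2))."""
    s = cl  # s = 2p - 1 for the upper quantile p = (1 + cl) / 2
    return s * math.sqrt(2.0 / (1.0 - s**2))


z_asym = half_width_asymptotic(0.5)                    # uncorrected, clearly too narrow
z_bart = half_width_asymptotic(0.5, bartlett=True)     # close to the exact value
z_exact = half_width_exact_nu2()
```

In this example the uncorrected asymptotic half-width is below 0.9, while the Bartlett-corrected and exact values both come out near 1.32.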
For the interval based on the asymptotic distribution of the test statistic this is

$$P_c = \int_{-z_a}^{z_a} f(z;\nu)\,dz = 2F(z_a;\nu) - 1, \tag{47}$$

where $f(z;\nu)$ and $F(z;\nu)$ are the pdf and cumulative distribution of the Student's t distribution for $\nu = 1/(2r^2)$ degrees of freedom, as in Eq. (44), and $z_a$ is given by Eq. (46), with $Q_\alpha$ replaced by $Q_\alpha E[t_\mu]$ for the Bartlett-corrected case.

Fig. 4 Plots of (a) the interval half-width in units of the estimated standard deviation $\sqrt{v}$ and (b) the coverage probability of the 68.3% CL confidence intervals, for the exact, asymptotic and Bartlett-corrected intervals as a function of $r$

The interval half-widths and coverage probabilities based on $t_\mu$ and $t'_\mu$ are shown in Fig. 4. As can be seen, the interval based on the Bartlett-corrected statistic is very close to the exact one, and its coverage is close to the nominal $1 - \alpha$ for relevant values of $r$.

As seen from the distributions in Figs. 2 and 3 for the single-measurement model, the agreement with the asymptotic form worsens for increasing values of the test statistic. For $Z = \sqrt{t_\mu}$ of 4 (a four standard-deviation significance; see, e.g., Ref. [6]), the Bartlett-corrected statistic is close to the asymptotic form for $r = 0.2$, with a small but visible departure for $r = 0.4$. In contrast, for a 68.3% confidence level (corresponding to $t_\mu = 1$), one sees from Fig. 4a that the Bartlett-corrected interval is in satisfactory agreement with the exact interval out to values of $r$ of order one. For a more complicated analysis with multiple measurements having different $r_i$ parameters one would need to check the validity of asymptotic distributions with Monte Carlo.

6 Least-squares fitting and averaging measurements

An important application of the model described in Sect. 3 is the least-squares fit of a curve, or as a special case of this, the average of a set of measurements. Suppose the data consist of $N$ independent Gaussian distributed values $y_i$, with mean and variance

$$E[y_i] = \varphi(x_i;\mu) + \theta_i, \tag{48}$$

$$V[y_i] = \sigma_{y_i}^2. \tag{49}$$

Here the nuisance parameters $\theta_i$ represent a potential bias or offset. The function $\varphi(x;\mu)$ plus the bias $\theta_i$ gives the mean of $y_i$ as a function of a control variable $x$, and it depends on a set of $M$ parameters of interest $\mu = (\mu_1,\ldots,\mu_M)$.
That is, the probability $P(y|\mu,\theta)$ in Eq. (2) becomes

$$P(y|\mu,\theta) = \prod_{i=1}^{N}\frac{1}{\sqrt{2\pi\sigma_{y_i}^2}}\,e^{-(y_i-\varphi(x_i;\mu)-\theta_i)^2/2\sigma_{y_i}^2}. \tag{50}$$

As before suppose the nuisance parameters $\theta_i$ are constrained by $N$ corresponding independent Gaussian measurements $u_i$, with mean and variance

$$E[u_i] = \theta_i, \tag{51}$$

$$V[u_i] = \sigma_{u_i}^2. \tag{52}$$

Often the best estimates of a potential bias $\theta_i$ will be $u_i = 0$ for the actual measurement, but formally the $u_i$ are treated as random variables that would fluctuate upon repetition of the experiment. Therefore the full log-likelihood or equivalently $-2\ln L(\mu,\theta)$ is up to an additive constant given by

$$-2\ln L(\mu,\theta) = \sum_{i=1}^{N}\left[\frac{(y_i-\varphi(x_i;\mu)-\theta_i)^2}{\sigma_{y_i}^2} + \frac{(u_i-\theta_i)^2}{\sigma_{u_i}^2}\right]. \tag{53}$$

That is, if we consider the $\sigma_{u_i}$ as known, then maximum-likelihood estimators are obtained by the minimum of the sum of squares (53), which is the usual formulation of the method of least squares.

The next step will be to treat the $\sigma_{u_i}$ as adjustable parameters, but before doing this it is interesting to note that by profiling over the nuisance parameters $\theta$, one finds the profile likelihood

$$-2\ln L'(\mu) = \sum_{i=1}^{N}\frac{(y_i-\varphi(x_i;\mu)-u_i)^2}{\sigma_{y_i}^2+\sigma_{u_i}^2} \equiv \chi^2(\mu). \tag{54}$$

That is, the same result is obtained by using the usual method of least squares with statistical and systematic uncertainties added in quadrature. This procedure gives the best linear unbiased estimator (BLUE), which is widely used in Particle Physics, particularly for the problem of averaging a set of measurements as described in Refs. [17–20].

Returning to the full dependence on $\mu$ and $\theta$ and following the model of Sect. 3 we now regard the systematic variances $\sigma_{u_i}^2$ as free parameters for which we have independent gamma distributed estimates $v_i$, with parameters $\alpha_i$ and $\beta_i$ set by $\sigma_{u_i}$ and $r_i$ according to Eqs. (10) and (11). The log-likelihood profiled over the $\sigma_{u_i}^2$ is (cf. Eq. (18)),

$$-2\ln L'(\mu,\theta) = \sum_{i=1}^{N}\left[\frac{(y_i-\varphi(x_i;\mu)-\theta_i)^2}{\sigma_{y_i}^2} + \left(1+\frac{1}{2r_i^2}\right)\ln\left(1+2r_i^2\frac{(u_i-\theta_i)^2}{v_i}\right)\right]. \tag{55}$$

To find the required estimators we need to solve the system of equations

$$\frac{\partial\ln L'}{\partial\mu_i} = 0, \quad i = 1,\ldots,M, \tag{56}$$

$$\frac{\partial\ln L'}{\partial\theta_i} = 0, \quad i = 1,\ldots,N. \tag{57}$$

Equation (57) results in

$$\theta_i^3 + \left[-2u_i - y_i + \varphi_i\right]\theta_i^2 + \left[\frac{v_i + (1+2r_i^2)\sigma_{y_i}^2}{2r_i^2} + 2u_i(y_i-\varphi_i) + u_i^2\right]\theta_i + \left[(\varphi_i-y_i)\left(\frac{v_i}{2r_i^2}+u_i^2\right) - \frac{(1+2r_i^2)\sigma_{y_i}^2 u_i}{2r_i^2}\right] = 0, \quad i = 1,\ldots,N, \tag{58}$$

where here $\varphi_i = \varphi(x_i;\mu)$. Simultaneously solving all $M+N$ equations for $\mu$ and the $\theta_i$ gives their ML estimators. Solving for the $\theta_i$ for fixed $\mu$, i.e., fixed $\varphi_i$, gives the profiled values $\hat{\hat{\theta}}_i$. Equations (58) are cubic in $\theta_i$ and so can be solved in closed form giving either one or three real roots. In the case of three roots, the one is chosen that maximizes $\ln L'$.

Using the profile log-likelihood from Eq. (55) one can use, for example, the test statistic $t_\mu$ defined in Eq. (23) to find confidence regions for $\mu$ following the general procedure outlined in Sect. 4. Examples of this are shown later in Sect. 6.

6.1 Goodness of fit

In the usual method of least squares, the minimized sum of squares $\chi^2_{\min} = \chi^2(\hat{\mu})$ based on Eq. (54) is often used to quantify the goodness-of-fit. Because it is constructed as a sum of squares of Gaussian distributed quantities, one can show (see, e.g., Ref. [?]) that its sampling distribution is chi-squared for $N - M$ degrees of freedom, and the p-value of the hypothesis that the true model lies somewhere in the parameter space of $\mu$ is thus

$$p = \int_{\chi^2_{\min}}^{\infty} f_{\chi^2_{N-M}}(x)\,dx. \tag{59}$$

When using the gamma error model presented above, the quantity $-2\ln L'(\mu,\theta)$ is no longer a simple sum of squares.
Nevertheless one can construct the statistic that will play the same role as the minimized $\chi^2(\boldsymbol{\mu})$ by considering the model in which the means $\varphi(x_i, \boldsymbol{\mu})$, which depend on the $M$ parameters of interest $\boldsymbol{\mu}$, are replaced by a vector of $N$ independent mean values, one for each of the measurements: $\boldsymbol{\varphi} = (\varphi_1,\ldots,\varphi_N)$. By requiring that the $\varphi_i$ are given by $\varphi(x_i, \boldsymbol{\mu})$ one imposes $N - M$ constraints and restricts the more general hypothesis to an $M$-dimensional subspace. One can then construct the likelihood ratio statistic

$$q = -2\ln\frac{L'(\hat{\boldsymbol{\mu}}, \hat{\boldsymbol{\theta}})}{L'(\hat{\boldsymbol{\varphi}}, \hat{\boldsymbol{\theta}})}, \tag{60}$$

where the numerator contains the $M$ fitted parameters of interest $\hat{\boldsymbol{\mu}}$, and in the denominator one fits all $N$ of the $\varphi_i$. When fitting separate values of $\varphi_i$ and $\theta_i$ for each measurement (the "saturated model"), one can see from inspection that the maximized value of $\ln L'(\boldsymbol{\varphi}, \boldsymbol{\theta})$ is zero, and therefore the statistic $q$ becomes

$$q = \min_{\boldsymbol{\mu},\boldsymbol{\theta}} \sum_{i=1}^{N}\left[\frac{(y_i - \varphi(x_i;\boldsymbol{\mu}) - \theta_i)^2}{\sigma_{y_i}^2} + \left(1 + \frac{1}{2r_i^2}\right)\ln\left(1 + 2r_i^2\,\frac{(u_i - \theta_i)^2}{v_i}\right)\right]. \tag{61}$$

According to Wilks' theorem [], in the limit where the estimators $\hat{\boldsymbol{\mu}}$ and $\hat{\boldsymbol{\theta}}$ are Gaussian distributed, $q$ will follow a chi-squared pdf for $N - M$ degrees of freedom. The statistic $q$ thus plays the same role as the minimized sum of squares $\chi^2_{\min}$ in the usual method of least squares. In the case of Eq. (61), however, the chi-squared approximation is not exact. One can see this from the fact that the $v_i$ are gamma rather than Gaussian distributed; the Gaussian approximation holds only in the limit where the $r_i$ are sufficiently small. If all $r_i \to 0$, i.e., there is no uncertainty in the reported systematic errors, then the statistic $q$ reduces to the minimized sum of squares from the method of least squares or BLUE, namely,

$$q = \sum_{i=1}^{N}\frac{(y_i - \varphi(x_i;\hat{\boldsymbol{\mu}}) - u_i)^2}{\sigma_{y_i}^2 + \sigma_{u_i}^2}. \tag{62}$$

One can check in an example how closely the sampling distribution of $q$ follows a chi-squared distribution by generating measured values $y_i$, $u_i$, and $s_i$ according to the model described in

the preceding sections, using the following parameter values: $\varphi_i = \mu = 10$, $\sigma_{y_i} = 1$, $\sigma_{u_i} = 1$ for all $i = 1,\ldots,N$. That is, the measurements are assumed to have the same mean and the goal is to fit this parameter. The resulting distributions of $q$ are shown in Fig. 5a, b for $N = 2$ and $N = 5$ using $r_i = 0.2$ for all measurements. Overlaid on the histograms is the chi-squared pdf for $N - 1$ degrees of freedom. Although the agreement is reasonably good, there is still a noticeable departure from the asymptotic distribution in the tails. The same set of curves is shown in Fig. 5c, d for $r_i = 0.4$, for which one sees an even greater discrepancy between the true (i.e., simulated) and asymptotic distributions.

Fig. 5 Distributions of the test variable $q$ for averages of $N = 2$ and 5 values using $r = 0.2$ and $r = 0.4$. Panels: (a) $N = 2$, $r = 0.2$; (b) $N = 5$, $r = 0.2$; (c) $N = 2$, $r = 0.4$; (d) $N = 5$, $r = 0.4$. Each panel shows $f(q)$ versus $q$ with the $\chi^2$ pdf overlaid.

One might need a p-value with an accuracy such that assumption of the asymptotic distribution of $q$ is not adequate. In such a case one can use Monte Carlo to determine the correct sampling distribution of $q$. Alternatively, following the procedure of Sect. 5, one can define a Bartlett-corrected statistic $q'$ as

$$q' = \frac{N - M}{E[q]}\,q, \tag{63}$$

so that by construction $E[q'] = N - M$ (in the example above for a single fitted parameter, $M = 1$). Distributions of $q'$ corresponding to Fig. 5 are shown in Fig. 6, where the mean value $E[q]$ was itself found from Monte Carlo simulation. While the distributions of $q'$ agree better with the asymptotic chi-squared pdf than those of $q$, visible discrepancies remain. And since here simulation was required to determine the Bartlett correction, one could use it as well to find the p-value directly. The Bartlett correction is nevertheless useful in such a situation because the number of simulated values of $q$ required to estimate $E[q]$ accurately may be much less than what one needs to find the upper tail area for a very high observed value of the test statistic.

6.2 Averaging measurements

An important special case of a least-squares fit is the average of $N$ independent measurements, $y_1,\ldots,y_N$, of the same quantity, i.e., the fit function $\varphi(x;\mu) = \mu$ is in effect a horizontal line and the control variable $x$ does not enter. The expectation values of the measurements are thus

$$E[y_i] = \mu + \theta_i, \quad i = 1,\ldots,N, \tag{64}$$

where the parameter of interest $\mu$ represents the desired mean value and as before $\theta_i$ are the bias parameters. As there is one

parameter of interest, the statistic $q$ follows asymptotically a chi-squared distribution for $N - 1$ degrees of freedom, although as we have seen above this approximation breaks down as the $r_i$ increase.

Fig. 6 Distributions of the Bartlett-corrected test variable $q'$ for averages of $N = 2$ and 5 values using $r = 0.2$ and $r = 0.4$. Panels: (a) $N = 2$, $r = 0.2$; (b) $N = 5$, $r = 0.2$; (c) $N = 2$, $r = 0.4$; (d) $N = 5$, $r = 0.4$. Each panel shows $f(q')$ versus $q'$ with the $\chi^2$ pdf overlaid.

As an example, consider the average of two independent measurements, nominally reported as $y_i \pm \sigma_{y_i} \pm s_i$ for $i = 1, 2$, in which the $\sigma_{y_i}$ represent the statistical uncertainties and $s_i$ are the estimated systematic errors. Suppose here these are $\sigma_{y_i} = 1$ and $s_i = 1$ for both measurements, and that the analyst reports values $r_i$ representing the relative accuracy of the estimates of the systematic errors, which in this example we will take to be equal to a common value $r$. Furthermore suppose that the observed values of $y_1$ and $y_2$ are $10 + \delta$ and $10 - \delta$, respectively, and we will allow $\delta$ to vary. For the values of $\sigma_{y_i}$ and $s_i$ chosen in this example, the value of $\delta$ corresponds to the significance of the discrepancy between $y_1$ and $y_2$ in standard deviations under assumption of $r = 0$.

Using the input values described above, the mean $\mu$, bias parameters $\theta_i$, and systematic errors $\sigma_{u_i}$ are adjusted to maximize the log-likelihood from Eq. (6). Figure 7 shows the half-width of the 68.3% confidence interval for $\mu$ as a function of the parameter $r$ for different levels of $\delta$. This interval corresponds to the standard deviation $\sigma_{\hat{\mu}}$ when the $r_i$ are all small, where the problem is the same as in least squares or BLUE. In Fig. 7a, the interval is based on Eq. (6), i.e., it is determined by the point where the profile log-likelihood drops by a fixed amount from its maximum (in Particle Physics often referred to as the "MINOS" interval []). In Fig. 7b, the interval is found by solving for the value of $\mu$ where its p-value is $p_\mu = \alpha$, and here $\alpha = 1 - 0.683 = 0.317$. The p-value depends, however, on the assumed values of the nuisance parameters.
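
The likelihood-based half-width described above can be explored numerically with the following sketch. It is an illustration under stated assumptions, not the code used for Fig. 7: the names `term`, `profile_term` and `half_width` are hypothetical; the inputs are taken as $y = 10 \pm \delta$, $u_i = 0$, $v_i = s_i^2 = 1$, $\sigma_{y_i} = 1$; the interval is assumed to end where $-2\ln L'$ has risen by one unit above its minimum; and the profile in $\mu$ is assumed to have a single minimum, which may fail for large $\delta$ combined with large $r$. The bias parameters are profiled by a simple scan plus golden-section refinement rather than by the closed-form cubic.

```python
import math


def term(theta, resid, u, v, sigma_y, r):
    """One measurement's contribution to -2 ln L' (resid = y - mu)."""
    two_r2 = 2.0 * r * r
    log_part = (1.0 + 1.0 / two_r2) * math.log1p(two_r2 * (u - theta) ** 2 / v)
    return ((resid - theta) / sigma_y) ** 2 + log_part


def profile_term(resid, u, v, sigma_y, r, n_scan=60):
    """Minimum of term() over theta; the minimum lies between u and resid."""
    lo, hi = min(u, resid), max(u, resid)
    if hi - lo < 1e-12:
        return term(lo, resid, u, v, sigma_y, r)
    f = lambda t: term(t, resid, u, v, sigma_y, r)
    step = (hi - lo) / n_scan
    best = min(range(n_scan + 1), key=lambda k: f(lo + k * step))
    a = max(lo, lo + (best - 1) * step)
    b = min(hi, lo + (best + 1) * step)
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(40):  # golden-section refinement around the best scan point
        c, d = b - g * (b - a), a + g * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return f(0.5 * (a + b))


def half_width(delta, r, rise=1.0):
    """Half-width of the interval where -2 ln L'(mu) stays within `rise` of its minimum."""
    ys = [10.0 + delta, 10.0 - delta]
    q = lambda mu: sum(profile_term(y - mu, 0.0, 1.0, 1.0, r) for y in ys)
    span = max(ys) - min(ys)
    mu_hat = min((min(ys) + k * span / 200 for k in range(201)), key=q)
    q_min = q(mu_hat)

    def crossing(sign):
        a, b = mu_hat, mu_hat + sign * 50.0  # bracket: inside and far outside
        for _ in range(50):  # bisection on the crossing point
            m = 0.5 * (a + b)
            if q(m) - q_min < rise:
                a = m
            else:
                b = m
        return 0.5 * (a + b)

    return 0.5 * (crossing(+1) - crossing(-1))
```

Calling `half_width(delta, r)` for a few values of `delta` and `r` gives a rough numerical counterpart to the qualitative behaviour discussed in the text; for very small `r` the result should approach the least-squares value of 1 for any `delta`.
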
Here we use the values of $\theta_i$ and $\sigma_{u_i}$ profiled at the value of $\mu$ tested. This technique is often called profile construction in Particle Physics [], where it is widely used, and elsewhere called hybrid resampling [3,]. The resulting confidence interval will have the correct coverage probability of $1 - \alpha$ if the nuisance parameters are equal to their profiled values; elsewhere the interval could under- or over-cover. Although the intervals from profile construction differ somewhat from those found directly from the log-likelihood, they have the same qualitative behaviour.

From Fig. 7 one can extract several interesting features. First, if $r$ is small, that is, the systematic errors $\sigma_{u_i}$ are very close to their estimated values $s_i$, then the interval's half-length is very close to the standard deviation of the estimator,

$\sigma_{\hat{\mu}} = 1$, regardless of the level of discrepancy between the two measured values. Further, the effect of larger values of $r$ is seen to depend very much on the level of discrepancy between the measured values. If $y_1$ and $y_2$ are very close (e.g., $\delta = 0$ or 1), then the length of the confidence interval can even be reduced relative to the case of $r = 0$. If the measurements are in agreement at a level that is better than expected, given the reported statistical and systematic uncertainties, then one finds that the likelihood is maximized for values of the systematic errors $\sigma_{u_i}$ that are smaller than the initially estimated $s_i$. As a consequence, the confidence interval for $\mu$ shrinks. Finally, one can see that if the data are increasingly inconsistent, e.g., in Fig. 7 for the larger values of $\delta$, then the effect of allowing higher $r$ is to increase the length of the interval. This is also a natural consequence of the assumed model, whereby an observed level of heterogeneity greater than what was initially estimated results in maximizing the likelihood for larger values of $\sigma_{u_i}$ and consequently an increased confidence interval size.

Fig. 7 Plots of the half-length of the 1-σ (68.3%) central confidence interval for the parameter $\mu$ as a function of the relative uncertainty on the systematic errors $r$ for different levels of discrepancy $\delta$ between two averaged measurements. Intervals are derived (a) from the log-likelihood and (b) using profile construction (see text). Both panels show curves for $\delta = 0, 1, 2, 3, 4, 5$ with inputs $y_1 = 10 - \delta \pm 1 \pm 1$ and $y_2 = 10 + \delta \pm 1 \pm 1$; the horizontal axis is $r$.

The coverage properties of the intervals for the average-of-two-measurements example are investigated by generating data values $y_i$ for $i = 1, 2$ according to a Gaussian with a common mean $\mu$ (here 10) and standard deviations both $\sigma_{y_i} = 1$, and the $u_i$ are generated according to a Gaussian distribution with mean $\theta_i = 0$ and standard deviation $\sigma_{u_i} = 1$.
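
The toy data for such a coverage study can be generated along the following lines. This is a sketch only: the function name `draw_measurement` is hypothetical, and the gamma parameters are assumed to be $\alpha = 1/(4r^2)$ and $\beta = 1/(4r^2\sigma_u^2)$, a choice that gives $E[v] = \sigma_u^2$ and a relative uncertainty of about $r$ on $s = \sqrt{v}$. Note that Python's `random.gammavariate` takes a shape and a scale, so the scale passed is $1/\beta$.

```python
import random


def draw_measurement(mu, sigma_y, sigma_u, r, rng, theta=0.0):
    """Draw one toy triple (y, u, v) for a measurement with true bias theta."""
    alpha = 1.0 / (4.0 * r * r)             # assumed gamma shape parameter
    scale = 4.0 * r * r * sigma_u ** 2      # assumed 1/beta (gamma scale)
    y = rng.gauss(mu + theta, sigma_y)      # primary measurement
    u = rng.gauss(theta, sigma_u)           # control measurement of the bias
    v = rng.gammavariate(alpha, scale)      # estimate of the systematic variance
    return y, u, v


rng = random.Random(12345)
# One pseudo-experiment for the two-measurement average with r = 0.2.
toy = [draw_measurement(10.0, 1.0, 1.0, 0.2, rng) for _ in range(2)]
```

Repeating the last line many times, building an interval for each pseudo-experiment, and counting how often the interval contains the true mean gives an estimate of the coverage probability.
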
The values $v_i$ are gamma distributed with parameters $\alpha_i$ and $\beta_i$ given by Eqs. () and () so as to correspond to $\sigma_{u_i} = 1$, for different values of the parameters $r_i$, taken here to be the same for both measurements. Figure 8 shows the coverage probability for the interval with nominal confidence level 68.3% based on the log-likelihood (the MINOS interval) and also using profile construction (hybrid resampling), as a function of the $r$ parameter. As seen in the figure, the coverage probability approximates the nominal value reasonably well out to $r = 0.5$, where one finds $P_{\rm cov} = 0.63$ and … for MINOS and profile construction, respectively; at $r = 1$, the corresponding values are 0.56 and 0.67 (the Monte Carlo statistical errors for all values are around 0.005). Thus reasonable agreement is found with both methods, but one should be aware that the coverage probability may depart from the nominal value for large values of $r$.

Fig. 8 The coverage probability of the intervals based on the likelihood (MINOS method) and on profile construction (hybrid resampling) as a function of the parameter $r$ (see text). The plot shows $P_{\rm cov}$ versus $r$ for the MINOS interval and profile construction, together with the nominal CL.

6.3 Sensitivity to outliers

One of the important properties of the error model used in this paper is that curves fitted to data become less sensitive to points that depart significantly from the fitted curve (outliers) as the $r$ parameters of the measurements are increased. This is a well-known feature of models based on the Student's t distribution (see, e.g., Ref. []). The reduced sensitivity to outliers is illustrated in Fig. 9 for the case of averaging five measurements of the same quantity (i.e., the fit of a horizontal line). All measured values are assigned $\sigma_{y_i}$ and $s_i$ equal to 1.0, and in Fig. 9a, c they are all fairly close to the central value of 10. In Fig. 9b, d the middle

point is at 20. In the top two plots, the $r$ parameters for all measurements are taken to be $r_i = 0.01$, which is very close to what would be obtained with an ordinary least-squares fit. In (a) the average is 10; in (b) the outlier causes the fitted mean to move to 12.00. In both cases the half-width of the confidence interval is 0.63.

In the lower two plots, (c) and (d), all of the points are assigned $r_i = 0.2$, i.e., a 20% relative uncertainty on the systematic error. In the case with no outlier, (c), the estimated mean stays at 10.00, and the half-width of the confidence interval only increases a small amount to 0.65. With the outlier in (d), the fitted mean is 10.75 with an interval half-width of 0.78. That is, the amount by which the outlier pulls the estimated mean away from the value preferred by the other points (10.00) is substantially less than with $r_i = 0.01$ (fitted mean 12.00). Furthermore, the lower compatibility between the measurements results in a confidence interval that is larger than without the outlier (half-width 0.78 rather than 0.65). When the $r_i$ are small, however, the interval size is independent of the goodness of fit. Both the increase in the size of the confidence interval and the decrease in sensitivity to the outlier represent important improvements in the inference.

Fig. 9 Result of averaging five quantities: (a) no outlier, $r_i = 0.01$; (b) with outlier, $r_i = 0.01$; (c) no outlier, $r_i = 0.2$; (d) with outlier, $r_i = 0.2$. Also indicated on the plots are the values of the Bartlett-corrected goodness-of-fit statistic $q'$ and the corresponding p-value. Panel results: (a) $\hat{\mu} = 10.00 \pm 0.63$, $q' = 5.0$, $p = 0.29$; (b) $\hat{\mu} = 12.00 \pm 0.63$; (c) $\hat{\mu} = 10.00 \pm 0.65$, $q' = 4.9$, $p = 0.30$; (d) $\hat{\mu} = 10.75 \pm 0.78$.

It is important to note that the above-mentioned properties pertain to the case where each measurement has its own bias parameter $\theta_i$ with its own $r_i$.
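
The least-squares numbers quoted for the top two panels can be checked by hand. The following is a short worked calculation, assuming the $r_i \to 0$ limit in which the fit reduces to an equally weighted average of $N = 5$ points with $\sigma_{y} = s = 1$, and a middle point moved from 10 to 20 (the labels $y_{\rm outlier}$ and $y_{\rm nominal}$ are introduced here only for illustration):

```latex
\sigma_{\hat{\mu}} = \sqrt{\frac{\sigma_y^2 + s^2}{N}} = \sqrt{\frac{1 + 1}{5}} \approx 0.63,
\qquad
\Delta\hat{\mu} = \frac{y_{\rm outlier} - y_{\rm nominal}}{N} = \frac{20 - 10}{5} = 2 .
```

The half-width does not depend on the measured values, which is why it is the same with and without the outlier, and the shift of 2 corresponds to the fitted mean moving from 10.00 to 12.00.
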
It might appear that one would obtain a result roughly equivalent to that of the proposed model by using the ordinary least-squares approach, i.e., the log-likelihood of Eq. (53), and simply making the replacement $\sigma_{u_i} \rightarrow \sigma_{u_i}(1 + r_i)$. In the example shown above with all $r_i = 0.2$, however, the result is $\hat{\mu} = 10.00 \pm 0.70$ without the outlier (middle data point at 10) and $\hat{\mu} = 12.00 \pm 0.70$ if the middle point is moved to 20. So by inflating the systematic errors but still using least squares, one increases the size of the confidence interval by an amount that does not depend on the goodness of fit, and the sensitivity to outliers is not improved.

7 Treatment of correlated uncertainties

The phrase "correlated systematic uncertainties" is often taken to mean the situation where a nuisance parameter affects multiple measurements in a coherent way. Suppose, for example, that the expectation values $E[y_i]$ of measured quantities $y_i$ with $i = 1,\ldots,L$ are functions $\varphi_i(\boldsymbol{\mu}, \boldsymbol{\theta})$ of parameters of interest $\boldsymbol{\mu} = (\mu_1,\ldots,\mu_M)$ and nuisance parameters $\boldsymbol{\theta} = (\theta_1,\ldots,\theta_N)$. Suppose further that the nuisance parameters are defined such that for $\boldsymbol{\theta} = 0$ they

are unbiased measurements of the nominal model $\varphi_i(\boldsymbol{\mu})$. Expanding $\varphi_i$ to first order in $\boldsymbol{\theta}$ therefore gives

$$E[y_i] = \varphi_i(\boldsymbol{\mu}, \boldsymbol{\theta}) \approx \varphi_i(\boldsymbol{\mu}) + \sum_{j=1}^{N} R_{ij}\theta_j, \tag{65}$$

where the factors $R_{ij} = \partial\varphi_i/\partial\theta_j|_{\boldsymbol{\theta}=0}$ determine how much $\theta_j$ biases the measurement $y_i$. Suppose that the $R_{ij}$ are known, either from symmetry (e.g., a particular $\theta_j$ could be known to contribute equally to all of the $y_i$) or they are determined using a Monte Carlo simulation. As before, suppose one has a set of independent Gaussian-distributed control measurements $u_j$ used to constrain the nuisance parameters, with mean values $\theta_j$ and standard deviations $\sigma_{u_j}$. One can define the total bias of measurement $y_i$ as

$$b_i = \sum_{j=1}^{N} R_{ij}\theta_j, \tag{66}$$

and an estimator for $b_i$ is

$$\hat{b}_i = \sum_{j=1}^{N} R_{ij}u_j. \tag{67}$$

These estimators of the biases are correlated. As the control measurements are assumed independent, and therefore ${\rm cov}[u_k, u_l] = V[u_k]\delta_{kl}$, the covariance of the bias estimators is

$$U_{ij} = {\rm cov}[\hat{b}_i, \hat{b}_j] = \sum_{k=1}^{N} R_{ik}R_{jk}V[u_k]. \tag{68}$$

It is in the sense described here that the proposed model is capable of treating correlated systematic uncertainties. That is, although the control measurements $u_i$ are independent, they result in a nondiagonal covariance for the estimated biases of the measurements. The matrix $U_{ij}$ is shown here only to illustrate how correlated bias estimates can be related to independent control measurements, and it is not explicitly needed in the type of analysis described here.

The full likelihood can be constructed from the measurements $y_i$ together with their expectation values given by Eq. (65), where the $R_{ij}$ are assumed known. That is, in the log-likelihood of Eqs. (53) or (55) the terms $y_i - \varphi(x_i;\boldsymbol{\mu}) - \theta_i$ are replaced by $y_i - \varphi_i(\boldsymbol{\mu}) - \sum_{j=1}^{N} R_{ij}\theta_j$. If the variances $\sigma_{u_i}^2$ of the control measurements $u_i$ are themselves uncertain then they are treated as adjustable parameters with independent gamma-distributed estimates.
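
Equations (67) and (68) above are simple enough to illustrate with a toy calculation. In the sketch below the function names and the numerical values of `R`, `u` and `var_u` are hypothetical, chosen only to show how independent control measurements lead to a nondiagonal covariance for the bias estimates.

```python
def bias_estimates(R, u):
    """b_hat[i] = sum_j R[i][j] * u[j], following Eq. (67)."""
    return [sum(r_ij * u_j for r_ij, u_j in zip(row, u)) for row in R]


def bias_covariance(R, var_u):
    """U[i][j] = sum_k R[i][k] * R[j][k] * var_u[k], following Eq. (68)."""
    n = len(R)
    return [[sum(R[i][k] * R[j][k] * var_u[k] for k in range(len(var_u)))
             for j in range(n)] for i in range(n)]


# Three measurements, two nuisance parameters (illustrative values only):
# the first parameter shifts all measurements equally, the second shifts
# the first two in opposite directions and leaves the third untouched.
R = [[1.0, 0.5],
     [1.0, -0.5],
     [1.0, 0.0]]
u = [0.1, -0.2]          # control measurements
var_u = [0.04, 1.0]      # their variances V[u_k]

b_hat = bias_estimates(R, u)
U = bias_covariance(R, var_u)
```

Here the off-diagonal element `U[0][1]` is negative because the second nuisance parameter pushes the first two measurements in opposite directions, while `U[0][2]` is positive and comes only from the common first parameter.
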
8 Discussion and conclusions

The statistical model proposed here can be applied in a wide variety of analyses where the standard deviations of Gaussian measurements are deemed to have a given relative uncertainty, reflected by the parameters $r_i$ defined in Eq. (9). The quadratic constraint terms connecting control measurements to their corresponding nuisance parameters that appear in the log-likelihood are replaced by logarithmic terms [cf. Eqs. (3) and (8)]. The resulting model is equivalent to taking a Student's t distribution for the control measurements, with the number of degrees of freedom given by $\nu = 1/(2r^2)$.

It is not uncommon for systematic errors, especially those related to theoretical uncertainties, to be uncertain themselves to several tens of percent. The model presented here allows such uncertainties to be taken into account, and it has been shown that this has interesting and useful consequences for the resulting inference. Confidence intervals are found to increase in size if the goodness of fit is poor and can decrease slightly if the data are more internally consistent than expected, given the level of statistical fluctuation assumed in the model. Averages and fitted curves become less sensitive to outliers.

If the relative uncertainty on the systematic errors is large enough ($r$ greater than around 0.2 in the examples studied), then the sampling distribution of likelihood-ratio test statistics starts to depart from the asymptotic chi-squared form. Thus one cannot in general apply asymptotic results for p-values and confidence intervals without taking some care to ensure their validity. In some cases Bartlett-corrected statistics can be used; alternatively one may need to determine the relevant distributions by Monte Carlo simulation.

In reporting results that use the procedure presented here it is important to communicate all of the $r_i$ parameters. To allow for combinations with other measurements one should ideally report the full likelihood, including the $r_i$ values, to permit a consistent treatment of uncertainties common to several of the measurements.
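
The equivalence with Student's t noted above can be made explicit in one line. This is a short derivation under the assumption that the profiled constraint term has the logarithmic form of Eq. (55); the symbol $t_i = (u_i - \theta_i)/\sqrt{v_i}$ is introduced here only for illustration.

```latex
-\frac{1}{2}\left(1 + \frac{1}{2r_i^2}\right)
  \ln\left[1 + 2r_i^2\,\frac{(u_i - \theta_i)^2}{v_i}\right]
= -\frac{\nu_i + 1}{2}\,\ln\left(1 + \frac{t_i^2}{\nu_i}\right),
\qquad \nu_i = \frac{1}{2r_i^2},
```

which is, up to an additive constant, the logarithm of the Student's t density $f(t;\nu) \propto (1 + t^2/\nu)^{-(\nu+1)/2}$. For example, $r_i = 0.2$ corresponds to $\nu_i = 12.5$, while $r_i \to 0$ gives $\nu_i \to \infty$ and recovers the Gaussian (quadratic) constraint.
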
The point of view taken here has been that the analyst must determine reasonable values for the relative uncertainties in the systematic errors. One should not, for example, decide to use the proposed model only if the goodness of fit is found to be poor. Rather, the $r$ parameters should reflect the accuracy with which the systematic variances have been estimated, and the resulting inference about the parameters of interest then incorporates this knowledge in a manner that is valid for any data outcome. An alternative mentioned here as a possibility would be to fit a common relative uncertainty to all systematic errors (a global $r$), e.g., when averaging a set of numbers for which no $r$ values have been reported. This is analogous to the scale-factor procedure used by the Particle Data Group [9] or the method of DerSimonian and Laird [7] widely used in meta-


More information

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands Content. Inference on Regresson Parameters a. Fndng Mean, s.d and covarance amongst estmates.. Confdence Intervals and Workng Hotellng Bands 3. Cochran s Theorem 4. General Lnear Testng 5. Measures of

More information

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems Numercal Analyss by Dr. Anta Pal Assstant Professor Department of Mathematcs Natonal Insttute of Technology Durgapur Durgapur-713209 emal: anta.bue@gmal.com 1 . Chapter 5 Soluton of System of Lnear Equatons

More information

First Year Examination Department of Statistics, University of Florida

First Year Examination Department of Statistics, University of Florida Frst Year Examnaton Department of Statstcs, Unversty of Florda May 7, 010, 8:00 am - 1:00 noon Instructons: 1. You have four hours to answer questons n ths examnaton.. You must show your work to receve

More information

Lecture Notes on Linear Regression

Lecture Notes on Linear Regression Lecture Notes on Lnear Regresson Feng L fl@sdueducn Shandong Unversty, Chna Lnear Regresson Problem In regresson problem, we am at predct a contnuous target value gven an nput feature vector We assume

More information

ANOVA. The Observations y ij

ANOVA. The Observations y ij ANOVA Stands for ANalyss Of VArance But t s a test of dfferences n means The dea: The Observatons y j Treatment group = 1 = 2 = k y 11 y 21 y k,1 y 12 y 22 y k,2 y 1, n1 y 2, n2 y k, nk means: m 1 m 2

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 30 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 2 Remedes for multcollnearty Varous technques have

More information

PHYS 450 Spring semester Lecture 02: Dealing with Experimental Uncertainties. Ron Reifenberger Birck Nanotechnology Center Purdue University

PHYS 450 Spring semester Lecture 02: Dealing with Experimental Uncertainties. Ron Reifenberger Birck Nanotechnology Center Purdue University PHYS 45 Sprng semester 7 Lecture : Dealng wth Expermental Uncertantes Ron Refenberger Brck anotechnology Center Purdue Unversty Lecture Introductory Comments Expermental errors (really expermental uncertantes)

More information

STAT 511 FINAL EXAM NAME Spring 2001

STAT 511 FINAL EXAM NAME Spring 2001 STAT 5 FINAL EXAM NAME Sprng Instructons: Ths s a closed book exam. No notes or books are allowed. ou may use a calculator but you are not allowed to store notes or formulas n the calculator. Please wrte

More information

Global Sensitivity. Tuesday 20 th February, 2018

Global Sensitivity. Tuesday 20 th February, 2018 Global Senstvty Tuesday 2 th February, 28 ) Local Senstvty Most senstvty analyses [] are based on local estmates of senstvty, typcally by expandng the response n a Taylor seres about some specfc values

More information

Chapter 5 Multilevel Models

Chapter 5 Multilevel Models Chapter 5 Multlevel Models 5.1 Cross-sectonal multlevel models 5.1.1 Two-level models 5.1.2 Multple level models 5.1.3 Multple level modelng n other felds 5.2 Longtudnal multlevel models 5.2.1 Two-level

More information

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton

More information

Lecture 6 More on Complete Randomized Block Design (RBD)

Lecture 6 More on Complete Randomized Block Design (RBD) Lecture 6 More on Complete Randomzed Block Desgn (RBD) Multple test Multple test The multple comparsons or multple testng problem occurs when one consders a set of statstcal nferences smultaneously. For

More information

SIO 224. m(r) =(ρ(r),k s (r),µ(r))

SIO 224. m(r) =(ρ(r),k s (r),µ(r)) SIO 224 1. A bref look at resoluton analyss Here s some background for the Masters and Gubbns resoluton paper. Global Earth models are usually found teratvely by assumng a startng model and fndng small

More information

Econ Statistical Properties of the OLS estimator. Sanjaya DeSilva

Econ Statistical Properties of the OLS estimator. Sanjaya DeSilva Econ 39 - Statstcal Propertes of the OLS estmator Sanjaya DeSlva September, 008 1 Overvew Recall that the true regresson model s Y = β 0 + β 1 X + u (1) Applyng the OLS method to a sample of data, we estmate

More information

NUMERICAL DIFFERENTIATION

NUMERICAL DIFFERENTIATION NUMERICAL DIFFERENTIATION 1 Introducton Dfferentaton s a method to compute the rate at whch a dependent output y changes wth respect to the change n the ndependent nput x. Ths rate of change s called the

More information

Uncertainty in measurements of power and energy on power networks

Uncertainty in measurements of power and energy on power networks Uncertanty n measurements of power and energy on power networks E. Manov, N. Kolev Department of Measurement and Instrumentaton, Techncal Unversty Sofa, bul. Klment Ohrdsk No8, bl., 000 Sofa, Bulgara Tel./fax:

More information

Notes on Frequency Estimation in Data Streams

Notes on Frequency Estimation in Data Streams Notes on Frequency Estmaton n Data Streams In (one of) the data streamng model(s), the data s a sequence of arrvals a 1, a 2,..., a m of the form a j = (, v) where s the dentty of the tem and belongs to

More information

Thermodynamics and statistical mechanics in materials modelling II

Thermodynamics and statistical mechanics in materials modelling II Course MP3 Lecture 8/11/006 (JAE) Course MP3 Lecture 8/11/006 Thermodynamcs and statstcal mechancs n materals modellng II A bref résumé of the physcal concepts used n materals modellng Dr James Ellott.1

More information

Lecture 3 Stat102, Spring 2007

Lecture 3 Stat102, Spring 2007 Lecture 3 Stat0, Sprng 007 Chapter 3. 3.: Introducton to regresson analyss Lnear regresson as a descrptve technque The least-squares equatons Chapter 3.3 Samplng dstrbuton of b 0, b. Contnued n net lecture

More information

Chapter 9: Statistical Inference and the Relationship between Two Variables

Chapter 9: Statistical Inference and the Relationship between Two Variables Chapter 9: Statstcal Inference and the Relatonshp between Two Varables Key Words The Regresson Model The Sample Regresson Equaton The Pearson Correlaton Coeffcent Learnng Outcomes After studyng ths chapter,

More information

The Geometry of Logit and Probit

The Geometry of Logit and Probit The Geometry of Logt and Probt Ths short note s meant as a supplement to Chapters and 3 of Spatal Models of Parlamentary Votng and the notaton and reference to fgures n the text below s to those two chapters.

More information

CIS526: Machine Learning Lecture 3 (Sept 16, 2003) Linear Regression. Preparation help: Xiaoying Huang. x 1 θ 1 output... θ M x M

CIS526: Machine Learning Lecture 3 (Sept 16, 2003) Linear Regression. Preparation help: Xiaoying Huang. x 1 θ 1 output... θ M x M CIS56: achne Learnng Lecture 3 (Sept 6, 003) Preparaton help: Xaoyng Huang Lnear Regresson Lnear regresson can be represented by a functonal form: f(; θ) = θ 0 0 +θ + + θ = θ = 0 ote: 0 s a dummy attrbute

More information

Interval Estimation in the Classical Normal Linear Regression Model. 1. Introduction

Interval Estimation in the Classical Normal Linear Regression Model. 1. Introduction ECONOMICS 35* -- NOTE 7 ECON 35* -- NOTE 7 Interval Estmaton n the Classcal Normal Lnear Regresson Model Ths note outlnes the basc elements of nterval estmaton n the Classcal Normal Lnear Regresson Model

More information

Chapter 6. Supplemental Text Material

Chapter 6. Supplemental Text Material Chapter 6. Supplemental Text Materal S6-. actor Effect Estmates are Least Squares Estmates We have gven heurstc or ntutve explanatons of how the estmates of the factor effects are obtaned n the textboo.

More information

Kernel Methods and SVMs Extension

Kernel Methods and SVMs Extension Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general

More information

Chapter 3 Describing Data Using Numerical Measures

Chapter 3 Describing Data Using Numerical Measures Chapter 3 Student Lecture Notes 3-1 Chapter 3 Descrbng Data Usng Numercal Measures Fall 2006 Fundamentals of Busness Statstcs 1 Chapter Goals To establsh the usefulness of summary measures of data. The

More information

Which estimator of the dispersion parameter for the Gamma family generalized linear models is to be chosen?

Which estimator of the dispersion parameter for the Gamma family generalized linear models is to be chosen? STATISTICS Dalarna Unversty D-level Master s Thess 007 Whch estmator of the dsperson parameter for the Gamma famly generalzed lnear models s to be chosen? Submtted by: Juan Du Regstraton Number: 8096-T084

More information

9. Binary Dependent Variables

9. Binary Dependent Variables 9. Bnar Dependent Varables 9. Homogeneous models Log, prob models Inference Tax preparers 9.2 Random effects models 9.3 Fxed effects models 9.4 Margnal models and GEE Appendx 9A - Lkelhood calculatons

More information

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010 Parametrc fractonal mputaton for mssng data analyss Jae Kwang Km Survey Workng Group Semnar March 29, 2010 1 Outlne Introducton Proposed method Fractonal mputaton Approxmaton Varance estmaton Multple mputaton

More information

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography CSc 6974 and ECSE 6966 Math. Tech. for Vson, Graphcs and Robotcs Lecture 21, Aprl 17, 2006 Estmatng A Plane Homography Overvew We contnue wth a dscusson of the major ssues, usng estmaton of plane projectve

More information

Testing for seasonal unit roots in heterogeneous panels

Testing for seasonal unit roots in heterogeneous panels Testng for seasonal unt roots n heterogeneous panels Jesus Otero * Facultad de Economía Unversdad del Rosaro, Colomba Jeremy Smth Department of Economcs Unversty of arwck Monca Gulett Aston Busness School

More information

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore Sesson Outlne Introducton to classfcaton problems and dscrete choce models. Introducton to Logstcs Regresson. Logstc functon and Logt functon. Maxmum Lkelhood Estmator (MLE) for estmaton of LR parameters.

More information

UNIVERSITY OF TORONTO Faculty of Arts and Science. December 2005 Examinations STA437H1F/STA1005HF. Duration - 3 hours

UNIVERSITY OF TORONTO Faculty of Arts and Science. December 2005 Examinations STA437H1F/STA1005HF. Duration - 3 hours UNIVERSITY OF TORONTO Faculty of Arts and Scence December 005 Examnatons STA47HF/STA005HF Duraton - hours AIDS ALLOWED: (to be suppled by the student) Non-programmable calculator One handwrtten 8.5'' x

More information

This column is a continuation of our previous column

This column is a continuation of our previous column Comparson of Goodness of Ft Statstcs for Lnear Regresson, Part II The authors contnue ther dscusson of the correlaton coeffcent n developng a calbraton for quanttatve analyss. Jerome Workman Jr. and Howard

More information

ISQS 6348 Final Open notes, no books. Points out of 100 in parentheses. Y 1 ε 2

ISQS 6348 Final Open notes, no books. Points out of 100 in parentheses. Y 1 ε 2 ISQS 6348 Fnal Open notes, no books. Ponts out of 100 n parentheses. 1. The followng path dagram s gven: ε 1 Y 1 ε F Y 1.A. (10) Wrte down the usual model and assumptons that are mpled by ths dagram. Soluton:

More information

Polynomial Regression Models

Polynomial Regression Models LINEAR REGRESSION ANALYSIS MODULE XII Lecture - 6 Polynomal Regresson Models Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur Test of sgnfcance To test the sgnfcance

More information

4.3 Poisson Regression

4.3 Poisson Regression of teratvely reweghted least squares regressons (the IRLS algorthm). We do wthout gvng further detals, but nstead focus on the practcal applcaton. > glm(survval~log(weght)+age, famly="bnomal", data=baby)

More information

ANSWERS CHAPTER 9. TIO 9.2: If the values are the same, the difference is 0, therefore the null hypothesis cannot be rejected.

ANSWERS CHAPTER 9. TIO 9.2: If the values are the same, the difference is 0, therefore the null hypothesis cannot be rejected. ANSWERS CHAPTER 9 THINK IT OVER thnk t over TIO 9.: χ 2 k = ( f e ) = 0 e Breakng the equaton down: the test statstc for the ch-squared dstrbuton s equal to the sum over all categores of the expected frequency

More information

P R. Lecture 4. Theory and Applications of Pattern Recognition. Dept. of Electrical and Computer Engineering /

P R. Lecture 4. Theory and Applications of Pattern Recognition. Dept. of Electrical and Computer Engineering / Theory and Applcatons of Pattern Recognton 003, Rob Polkar, Rowan Unversty, Glassboro, NJ Lecture 4 Bayes Classfcaton Rule Dept. of Electrcal and Computer Engneerng 0909.40.0 / 0909.504.04 Theory & Applcatons

More information

Workshop: Approximating energies and wave functions Quantum aspects of physical chemistry

Workshop: Approximating energies and wave functions Quantum aspects of physical chemistry Workshop: Approxmatng energes and wave functons Quantum aspects of physcal chemstry http://quantum.bu.edu/pltl/6/6.pdf Last updated Thursday, November 7, 25 7:9:5-5: Copyrght 25 Dan Dll (dan@bu.edu) Department

More information

Structure and Drive Paul A. Jensen Copyright July 20, 2003

Structure and Drive Paul A. Jensen Copyright July 20, 2003 Structure and Drve Paul A. Jensen Copyrght July 20, 2003 A system s made up of several operatons wth flow passng between them. The structure of the system descrbes the flow paths from nputs to outputs.

More information

Bayesian predictive Configural Frequency Analysis

Bayesian predictive Configural Frequency Analysis Psychologcal Test and Assessment Modelng, Volume 54, 2012 (3), 285-292 Bayesan predctve Confgural Frequency Analyss Eduardo Gutérrez-Peña 1 Abstract Confgural Frequency Analyss s a method for cell-wse

More information

Week3, Chapter 4. Position and Displacement. Motion in Two Dimensions. Instantaneous Velocity. Average Velocity

Week3, Chapter 4. Position and Displacement. Motion in Two Dimensions. Instantaneous Velocity. Average Velocity Week3, Chapter 4 Moton n Two Dmensons Lecture Quz A partcle confned to moton along the x axs moves wth constant acceleraton from x =.0 m to x = 8.0 m durng a 1-s tme nterval. The velocty of the partcle

More information

STAT 405 BIOSTATISTICS (Fall 2016) Handout 15 Introduction to Logistic Regression

STAT 405 BIOSTATISTICS (Fall 2016) Handout 15 Introduction to Logistic Regression STAT 45 BIOSTATISTICS (Fall 26) Handout 5 Introducton to Logstc Regresson Ths handout covers materal found n Secton 3.7 of your text. You may also want to revew regresson technques n Chapter. In ths handout,

More information

A nonparametric two-sample wald test of equality of variances

A nonparametric two-sample wald test of equality of variances Unversty of Wollongong Research Onlne Centre for Statstcal & Survey Methodology Workng Paper Seres Faculty of Engneerng and Informaton Scences 0 A nonparametrc two-sample wald test of equalty of varances

More information

j) = 1 (note sigma notation) ii. Continuous random variable (e.g. Normal distribution) 1. density function: f ( x) 0 and f ( x) dx = 1

j) = 1 (note sigma notation) ii. Continuous random variable (e.g. Normal distribution) 1. density function: f ( x) 0 and f ( x) dx = 1 Random varables Measure of central tendences and varablty (means and varances) Jont densty functons and ndependence Measures of assocaton (covarance and correlaton) Interestng result Condtonal dstrbutons

More information

Fall 2012 Analysis of Experimental Measurements B. Eisenstein/rev. S. Errede

Fall 2012 Analysis of Experimental Measurements B. Eisenstein/rev. S. Errede Fall 0 Analyss of Expermental easurements B. Esensten/rev. S. Errede We now reformulate the lnear Least Squares ethod n more general terms, sutable for (eventually extendng to the non-lnear case, and also

More information

Chapter 2 - The Simple Linear Regression Model S =0. e i is a random error. S β2 β. This is a minimization problem. Solution is a calculus exercise.

Chapter 2 - The Simple Linear Regression Model S =0. e i is a random error. S β2 β. This is a minimization problem. Solution is a calculus exercise. Chapter - The Smple Lnear Regresson Model The lnear regresson equaton s: where y + = β + β e for =,..., y and are observable varables e s a random error How can an estmaton rule be constructed for the

More information

Maximum Likelihood Estimation of Binary Dependent Variables Models: Probit and Logit. 1. General Formulation of Binary Dependent Variables Models

Maximum Likelihood Estimation of Binary Dependent Variables Models: Probit and Logit. 1. General Formulation of Binary Dependent Variables Models ECO 452 -- OE 4: Probt and Logt Models ECO 452 -- OE 4 Maxmum Lkelhood Estmaton of Bnary Dependent Varables Models: Probt and Logt hs note demonstrates how to formulate bnary dependent varables models

More information

Fall 2012 Analysis of Experimental Measurements B. Eisenstein/rev. S. Errede. . For P such independent random variables (aka degrees of freedom): 1 =

Fall 2012 Analysis of Experimental Measurements B. Eisenstein/rev. S. Errede. . For P such independent random variables (aka degrees of freedom): 1 = Fall Analss of Epermental Measurements B. Esensten/rev. S. Errede More on : The dstrbuton s the.d.f. for a (normalzed sum of squares of ndependent random varables, each one of whch s dstrbuted as N (,.

More information

Notes prepared by Prof Mrs) M.J. Gholba Class M.Sc Part(I) Information Technology

Notes prepared by Prof Mrs) M.J. Gholba Class M.Sc Part(I) Information Technology Inverse transformatons Generaton of random observatons from gven dstrbutons Assume that random numbers,,, are readly avalable, where each tself s a random varable whch s unformly dstrbuted over the range(,).

More information

Linear Regression Analysis: Terminology and Notation

Linear Regression Analysis: Terminology and Notation ECON 35* -- Secton : Basc Concepts of Regresson Analyss (Page ) Lnear Regresson Analyss: Termnology and Notaton Consder the generc verson of the smple (two-varable) lnear regresson model. It s represented

More information

Introduction to Generalized Linear Models

Introduction to Generalized Linear Models INTRODUCTION TO STATISTICAL MODELLING TRINITY 00 Introducton to Generalzed Lnear Models I. Motvaton In ths lecture we extend the deas of lnear regresson to the more general dea of a generalzed lnear model

More information

CHAPTER 8. Exercise Solutions

CHAPTER 8. Exercise Solutions CHAPTER 8 Exercse Solutons 77 Chapter 8, Exercse Solutons, Prncples of Econometrcs, 3e 78 EXERCISE 8. When = N N N ( x x) ( x x) ( x x) = = = N = = = N N N ( x ) ( ) ( ) ( x x ) x x x x x = = = = Chapter

More information