Confidence intervals for weighted polynomial calibrations

Confdence ntervals for weghted olynomal calbratons Sergey Maltsev, Amersand Ltd., Moscow, Russa; ur Kalambet, Amersand Internatonal, Inc., Beachwood, OH e-mal: kalambet@amersand-ntl.com htt://www.chromandsec.com Summary In many chromatograhc software roducts, weghted olynomal regresson s used for the calbraton curves. Weghtng s a useful mean to account for the measurement error that may deend on the detector resonse value. Confdence ntervals show how accurate a measurement s wth the hel of a calbraton curve. We have etended the confdence nterval theory to the frequent case of weghts, eressed as a functon of value, n artcular / and /. Ths etenson allows accurate calculaton of concentratons wth confdence ntervals for calbraton curves, constructed usng ont weghtng. Eamles are shown that demonstrate alcatons of weghted calbraton curves wth confdence ntervals n chromatograhy. Introducton General theory of lnear regresson analyss s usually used for calculatng confdence ntervals. However, n analytcal chemstry, we may face a ractcal roblem where the regresson analyss cannot be aled drectly and needs some adataton. A very common stuaton n analyss s that the error s not the same for dfferent sgnal levels. The error can be roortonal to the sgnal tself, that s R ~ R. For radoactvty measurements we would have ~ R R. Here, R s the detector resonse and R s a related measurement error. Correct handlng of such errors and calculatng true errors n the resultng concentraton requres secal consderaton. Another eamle of weghted regresson s curve fttng by olynomal of second or thrd ower. Conventonal lnear regresson analyss gves tools for calculatng confdence ntervals for ths case. Eact formulas are tycally not resent n secalzed lterature for analytcal chemstry. The most mortant ractcal cases related to the confdence ntervals are the followng:. We buld the regresson of value y whch s measured wth error at the recsely known set (calbraton. Then, we make a set of measurements of the detector resonse at known. What s the varaton of value and how does t dffer from the eected value ˆ, calculated from calbraton at?. We buld the regresson of value y whch s measured wth error over the recsely known set (calbraton. Then, we measure the detector resonse at some unknown. A value ˆ from calbraton curve at s used as the estmate of varaton of ˆ value and how t dffers from "true"?. What s a Ths artcle s based on the classcal work on lnear regresson analyss []. Stll aled asects of analytcal chemstry are consdered.

I. Regresson Wthout Weghtng Regresson wthout weghtng assumes that the error of measurement deends on nether the detector resonse nor the concentraton. Regresson can be lnear through orgn or not through orgn. Also, t can be a olynomal of second or thrd order. Ths case s consdered n detals n the lterature. A regresson s defned by the eresson: Xβ ε (I., (Seber 3. where X s a regresson matr. For eamle, quadratc regresson not gong through orgn t s X......... n n The lower nde refers to the number of calbraton ont. y. y... y } s vector of detector resonse values { n {,,..., n {,, } ε } - a related vector of errors of measurements. β - a regresson coeffcents whch must be calculated. It s assumed that errors of measurements follow the rule: cov[, ] j j D[ ] I n That s, errors are not correlated and have the same dserson. The soluton of the dserson s βˆ ( X X X (I., (Seber 3.5 Confdence nterval at ( level for resonse value at a gven We are buldng the regresson of resonse values known set y measured wth errors over the recsely (calbraton. After buldng the calbraton curve, we make another measurement at some known. Then, the confdence nterval of the sngle measurement s gven by the eresson: (/ / ˆ S ( u (I.3, (Seber 5. tn where ( Xβˆ ( Xβˆ (Seber 3.3 S n u (X X (Seber 5.7,,..., } f olynomal curve does not go through {

orgn {,..., } f olynomal curve goes through orgn ˆ хˆ β (Seber 5. n number of calbraton onts ower of the olynomal Student's coeffcent at confdence robablty ( wth m degrees of freedom. t m Confdence nterval for redcton n solvng the nverse roblem (dscrmnaton We buld the regresson of resonse values y measured wth errors at the recsely known set (calbraton. After buldng a calbraton curve, we make another measurement. Usng the calbraton curve, we fnd ˆ - an estmate of the true value : хˆ ˆ β We have to follow the reasonng of (Seber 7.. Let х х true value of х n our measurement. We defne Ŷ so that ˆ хˆ β Then, from (I.3 we have: ~ N(, ( and T ˆ ~ t n S u ˆ u Thus a set of х whch satsfy the condton (/ ( ˆ ( t S ( u (I. n form a confdence regon at ( level for х Here we use: ˆ хβˆ (I.5 u (X X (I. In the artcular case of lnear regresson, formula (I. s equvalent to (Seber 7.. For lnear regresson gong through orgn, formula (I. gves result equvalent to (Seber 7..

II. Weghted Regressons Assumtons, whch were used n the revous chater for constructng the regresson are cov[, ] D[ ] j j I n Xβ ε (II. Stll, followng to (Seber 3., we can buld a generalzed method of least squares. Let D[ ] V, where V s a known ostvely-defned matr of sze ( n n. In ths case, non-sngular matr K ests so that V K K We defne Z K, B K X η K ε and buld another regresson Z Bβ η (II. In ths regresson E[ η] и D[ η] In Ths means that model (II. s equvalent to model (I., where all are not correlated and have the same dserson. Least-squres estmate βˆ for vector β s calculated by mnmzng of value eresson βˆ (B B BZ ( X V X X V (II.3 Just n the same way the eected value ˆ at a gven known х s gven by ˆ хˆ β and true unknown х for measured s estmated by хˆ : хˆ ˆ β η η and s gven by Methods of evaluatng confdence ntervals are not drectly alcable, although (II. and (I. seems to be equvalent. We can eect that the statstcal behavor of vector βˆ from (II.3 s analogous to the conventonal regresson model because (II.3 s the usual estmate made by the least squares method. In the eerment, we are measurng or settng values х and. In general, we cannot match them related values b and Z n nverted regresson. Fortunately, n some ractcally sgnfcant cases such a match s ossble. We need to make addtonal assumtons.

Let us assume that errors are not correlated as revously, that s cov[, j ] when j. Now the dsersons are not the same. Let us assume that we are makng multle measurements of the detector resonse at a gven strctly known ~ х and by averagng resonses we obtan a true resonse ~. Also, assume that the dserson of the error deends on ~ х and ~ only, that s ~ w( ~, ~ (II. Also, we have to assume that errors are the same for calbraton and for analyte and they follow (II.. In ractce a artcular form of (II. s known aromately and s defned by the tye of hyscal eerment. The most mortant models are lsted below: w (, (II.5. conventonal regresson, no weghtng (II.5. error of measurement s roortonal to w(,. For eamle, ths s a case of radoactvty detector. (II.5.3 Constant relatve error? That s w(, ~ and const w(, w(, We defne matr V as: (II.5. (II.5.5 error of measurement s roortonal to. Analogous to (II.5., f we relace wth х. error of measurement s roortonal to. Analogous to (II.5.3, f we relace wth х. V dag { } dag{ } V ~ (, ( ~, ~ w w where s a number of the calbraton ont. (II. When we are buldng the calbraton, a recse value ~ for calbraton ont s unknown. Therefore, we have to use aromaton by relacng ~. Ths means that a true error matr V ~ s relaced by an aromate V. An nverse matr for V s: ~ V V dag{ w( ~, ~ }

Confdence nterval at ( level for the resonse value at a gven. Usng (II.3 we can estmate the true detector resonse at a gven known. A natural estmate s: ˆ хˆ β We defne w w( ~, ˆ w(, w so ~ w(, ˆ w(, s an estmate of dserson of measured resonse, accordng to (II.. Now we defne w k, Z, b, k k so that Zˆ ˆ β k b ˆ ˆ β k s an estmate for Z n nverted regresson (II.. Ths means that the method for calculatng confdence ntervals from chater for (I s alcable for Z and Ẑ. Then, a confdence nterval for resonse value of sngle measurement s gven by: (/ / (II.7 Z ˆ Z tn S ( u Where ˆ ( Z Bβ (Z Bβˆ S n u b (BB b Now, we need to convert from ( b, Z back to (,. We get: ˆ (/ / tn S ( U w (II. where ( Xβˆ V ( Xβˆ S n (X V X U

Confdence nterval for redcton n solvng an nverse roblem (dscrmnaton. We are makng an estmate ˆ of a true value usng the regresson (II.3 at a measured detector resonse : хˆ ˆ β We need to reeat the reasonng, analogous to the one we made for confdence nterval for resonse value. In the same way we get: w w( ˆ, w(, w~ w k, k k Z хˆ k Z βˆ bˆ βˆ We defne b. An estmate of confdence ntervals from chater (I s alcable for b and k ˆb. Namely, a set of all b whch follow the condton (/ ( Z Zˆ ( t S ( u (II.9 b n form a confdence regon for b at ( level. Here we defne: u b Zˆ b bβˆ b (B B Now we need to convert back from ( b, Z to (,. Assumng b b b we get a confdence k regon for at ( level. Ths s a set of all х, whch follow the condton: ( ˆ ( t S ( w U (/ n (II. where ( Xβˆ V ( Xβˆ S n ( X V X U ˆ βˆ

Heght Heght III. Eamles The eamles below reresent tycal regressons of dfferent olynomal owers and dfferent weghtng models. For demonstraton uroses, the smulated data are generated wth sgnfcant error. Confdence ntervals for resonse values at.95 confdence robablty are drawn. 5 9 7 3 Concentraton 3 5 7 9 Fgure. Lnear olynomal not gong through the orgn. The dserson of error s the same for all measurements, no weghtng. 79 5 3 Concentraton 3 5 7 9 Fgure. Lnear olynomal not gong through the orgn. The dserson of error s roortonal to Heght. Weghtng /Heght.

Heght Heght 5 9 7 3 Concentraton 3 5 7 9 Fgure 3. Lnear olynomal not gong through the orgn. The dserson of error s roortonal to Heght (constant relatve error. Weghtng /Heght. 5 9 7 3 Concentraton 3 5 7 9 Fgure. Lnear olynomal not gong through the orgn, the same data as on Fg.3. The dserson of error s roortonal to Heght (constant relatve error. Regresson s constructed wthout weghtng (just as f absolute error s constant. Ths regresson gves an aromaton formula that s not qute correct and has ncorrect confdence ntervals.

Heght 7 9 5 3 Concentraton 3 5 7 9 Fgure 5. Quadratc olynomal through the orgn. The dserson of error s roortonal to Heght. Weghtng /Heght.

Heght Heght A correct test lan for calbraton s esecally mortant when buldng regresson wth a nonlnear olynomal. Let us consder a regresson of the smulated data whch should be aromated by quadratc olynomal. If we select 3 concentratons and make measurements twce at each concentraton, the calbraton curve could look qute nce. Stll, when we calculate and draw related confdence ntervals we would notce that adequate results are ossble near the ntal concentratons only. Therefore, ths test lan s ncorrect. 79 3 5 Concentraton 3 5 7 9 Fgure. Quadratc olynomal not gong through orgn. The dserson of error s the same for all measurements, no weghtng. The smulated detector resonse s measured three tmes at three concentratons. Ths test lan roduces a oor redcton. A redcton becomes much better f we select unformly dstrbuted concentratons for the calbraton. Ths calbraton curve would suly good redctons over an entre range of calbraton. 7 9 5 3 Concentraton 3 5 7 9 Fgure 7. Quadratc olynomal not gong through orgn. The errors are the same for all measurements, no weghtng. The concentratons for calbraton are unformly dstrbuted. Ths test lan roduces an adequate redcton.

IV.Concluson We have strct mathematcal eressons for confdence ntervals for detector resonse and for the redcton n an nverse roblem n the case of generalzed regresson wth weghtng. An eressons (II. and (II. are generalzatons of (I.3 and (I. resectvely. They are alcable ether for lnear regressons or for regressons wth olynomal of other owers (quadratc, cubc etc.. References. G. Seber Lnear Regresson Analyss