
A kernel density estimate for interval censored data

Thierry Duchesne* and James E. Stafford†

Abstract

In this paper we propose a kernel density estimate for interval-censored data. It retains the simplicity and intuitive appeal of the usual kernel density estimate and is easy to compute. The estimate results from an algorithm where conditional expectations of a kernel are computed at each iteration. These conditional expectations are computed with respect to the density estimate from the previous iteration, allowing the estimator to extract more information from the data at each step. The estimator is applied to HIV data where interval censoring is common. In terms of the cumulative distribution function the algorithm is shown to coincide with those of Efron (1967), Turnbull (1976), and Li et al. (1997) as the window size of the kernel shrinks to zero. Viewing the iterative scheme as a generalized EM algorithm permits a natural interpretation of the estimator as being close to the ideal kernel density estimate where the data are not censored in any way. Simulation results support the conjecture that kernel smoothing at every iteration does not affect convergence. In addition, comparison to the standard kernel density estimate, based on smoothing Turnbull's estimator, reflects favourably on the estimator for all criteria considered. Use of the estimator for scatterplot smoothing is considered in a final example.

Keywords: Cross-validation, EM algorithm, HIV, importance sampling, interval censoring, kernel smoothing, Kullback-Leibler, mean squared error, Monte Carlo integration, nonparametric maximum likelihood, scatterplot smoothing, self-consistency

1 Introduction

We propose a kernel density estimate to be used in the presence of interval censored data, i.e. data that are observed to lie within an interval but whose exact value is unknown. The

* Department of Statistics, University of Toronto
† Department of Public Health Sciences, University of Toronto

estimate results from a recursive scheme that generalizes the algorithms of Efron (1967), Turnbull (1976) and Li et al. (1997) by kernel smoothing the data at each iteration:

  f̂_j(x) = (1/nh) Σ_{i=1}^n E_{j−1}[ K((x − X_i)/h) | X_i ∈ I_i ].  (1)

Here expectation is with respect to the previous iterate conditional on the observed interval. Convergence of the algorithm implies that f̂_j approaches some density for which the application of (1) has no effect. Efron (1967) called such a fixed point a self-consistent estimator.

The estimator retains the simplicity and intuitive appeal of a kernel density estimate. In fact, this simplicity avoids some of the awkward aspects associated with kernel smoothing Turnbull's estimator, F̂_t, of the cumulative distribution function (cdf),

  f̂_t(x) = ∫ (1/h) K((x − u)/h) dF̂_t(u),

which is a standard technique. Turnbull's F̂_t is a non-parametric maximum likelihood estimator (NPMLE) that is not uniquely defined over the whole real line but only up to an equivalence class of distributions that may differ over gaps called "innermost" intervals. Associated with these gaps are probability masses whose distribution over the gap is left unspecified, and this proves to be troublesome when computing f̂_t. Pan (2000) suggests arbitrarily assuming that jumps occur at the right-hand points of the gaps, which may be appropriate if the censoring proportion and the length of the censoring intervals are small. However, if most observations are interval censored with interval lengths that can be large, as is often the case with HIV/AIDS data, then assuming that the jumps occur at the right-hand point of the interval may cause considerable bias in the estimator. This complication never arises when computing (1) because we smooth the data directly at every iteration rather than smoothing a NPMLE once. Moreover, this smoothing process distributes probability mass over each observed interval using a conditional density determined by the previous iterate. This process is data driven rather than arbitrary.

Figure 1 depicts use of the estimator as applied to a group of heavily treated hemophiliacs (De Gruttola and Lagakos, 1989) whose time of infection with the HIV virus was interval censored. The upper plot gives the original data ordered by the left end point. Time is measured in six month intervals and right censored observations are denoted by dotted lines. The lower plot gives f̂_t and our estimator f̂_4 based on four iterations of the algorithm. The choice of j = 4 is based on both simulations and visual inspection of the estimator for several values of j > 4. The latter can be made common practice as successive iterates are based on an importance sampling scheme where the time to compute an iterate does not increase with the number of iterations. The estimator f̂_t was computed assuming jumps occur at the

center of innermost intervals rather than the right-hand point, which causes the estimate to be shifted to the right. Window sizes were chosen using a method of cross-validation discussed in §5. What is evident from the plot is that f̂_4 does a better job of smoothing what appears to be a sampling anomaly on the left side of the plot without eroding the peak on the right. It eliminates sampling artifacts in the estimate without degrading the estimate itself, and to some extent overcomes the fact that smoothing the NPMLE does not recover the information lost by the non-parametric estimation (Pan, 2000). By smoothing at every iteration it does a better job of borrowing information from neighbouring data points in the smoothing process. This is borne out in simulations of mean squared error. This example is used throughout the paper to illustrate other aspects of the estimator, and a separate example concerning HIV infection and infant mortality is given in §7.

Innermost intervals, whose concept is not entirely straightforward, never explicitly enter into the calculation, resulting in the advantage that our estimator fills in the gaps of Turnbull's F̂_t. This idea of filling in the gaps is not new, as Li et al. (1997) embed Turnbull's NPMLE in an EM algorithm designed specifically for this purpose. They obtain an estimator that will converge to the NPMLE where the NPMLE is uniquely defined, and to some cdf that depends on the starting point of the algorithm where the NPMLE has gaps. In §3 we show that as the window size, h, shrinks to zero our algorithm coincides with that of Li et al. (1997), and hence with the algorithms of Efron (1967) and Turnbull (1976) as well.

The remainder of this paper is organized as follows. In §2 the estimator is proposed as a natural extension of the usual kernel density estimate in the complete data case (no censoring). It is formally defined through a generalized EM algorithm where the "M" step is characterised by optimizing an "MSE" criterion. This criterion is quite natural as it involves the complete data kernel density estimate, f̂_c, allowing the estimator to be interpreted as minimizing the distance between itself and the ideal estimate f̂_c. Numerical implementation of the method is discussed in §4 and the choice of the smoothing parameter is considered in §5. The question of convergence of the algorithm is considered in §3. Although the developments are not rigorous, the conjecture is that the use of kernel smoothing at every iteration does not perturb algorithms that are known to converge to such an extent that they no longer converge. The argument is supported by simulation results in §6. Finally, in §7 the method is used to provide kernel weights for scatterplot smoothing. Throughout the paper analogies with the complete data case make developments transparent.
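The standard technique referred to above, kernel smoothing a NPMLE such as Turnbull's F̂_t, amounts to convolving its probability masses with a scaled kernel. A minimal sketch (the Gaussian kernel, jump locations, masses and bandwidth are all illustrative assumptions, not values from the paper):

```python
import numpy as np

def smooth_cdf_estimate(support, mass, h, grid):
    """Kernel smoothing of a discrete cdf estimate such as Turnbull's NPMLE:
    f_t(x) = sum_k p_k (1/h) K((x - u_k)/h); a Gaussian kernel is assumed."""
    u = np.asarray(support)[None, :]      # jump points u_k of the NPMLE
    p = np.asarray(mass)[None, :]         # probability masses p_k at those points
    x = np.asarray(grid)[:, None]
    K = np.exp(-0.5 * ((x - u) / h) ** 2) / np.sqrt(2 * np.pi)
    return (p * K / h).sum(axis=1)        # density values on the grid

# illustrative masses placed, arbitrarily, at interval right end points
grid = np.linspace(-5.0, 10.0, 600)
dens = smooth_cdf_estimate([1.0, 2.5, 4.0], [0.2, 0.5, 0.3], h=0.8, grid=grid)
print(round(float(np.trapz(dens, grid)), 3))  # 1.0
```

Placing the masses at right end points mirrors the arbitrary choice discussed above; the proposed estimator avoids that choice altogether.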

2 Definition of the estimator

In the presence of complete data X_1, ..., X_n the standard kernel density estimate,

  f̂_c(x) = (1/nh) Σ_{i=1}^n K((x − X_i)/h),

may be written as an expectation with respect to the empirical distribution, F_n, of the sample:

  f̂_c(x) = (1/h) E_{F_n}[ K((x − X)/h) ].

When the data are interval censored, so that X_i ∈ I_i for all i and only I_i = (L_i, R_i) is observed, it seems natural to express the kernel density estimate in terms of iterated expectation:

  f̂(x) = (1/h) E_{F_n}{ E[ K((x − X)/h) | X ∈ I ] } = (1/nh) Σ_{i=1}^n E[ K((x − X_i)/h) | X_i ∈ I_i ].

Here conditional expectation is computed with respect to the distribution for the true value of X_i over the interval I_i. Goutis (1997) uses such a strategy for the nonparametric estimation of a mixing density. This conditional distribution is itself unknown and must be estimated. A natural choice is data driven and involves using the kernel density estimate itself to approximate each conditional distribution. This results in an iterative algorithm with the following smooth estimate of the density at the jth step:

  f̂_j(x) = (1/nh) Σ_{i=1}^n E_{j−1}[ K((x − X_i)/h) | X_i ∈ I_i ],

where

  E_k[ g(X) | X ∈ I_i ] = ∫_{L_i}^{R_i} g(t) f̂_i^k(t) dt  if L_i ≠ R_i,  and  g(X_i)  if L_i = R_i = X_i.

The conditional density f̂_i^k(·) over the interval I_i is defined as f̂_i^k(t) = 1_i(t) f̂^k(t) / c_i^k, where 1_i(·) is the indicator function for the interval I_i and c_i^k is its unconditional expectation under f̂^k. At the (k+1)st iterate it is the conditional density f̂_i^k(x) that is used to smoothly distribute a probability mass of 1/n over the interval I_i. Note how this differs from, for example, the product limit estimator, which distributes the mass associated with a right censored observation X_i to only those uncensored observations that exceed X_i and not to the entire interval [X_i, ∞).
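The jth step above can be sketched numerically. The following is a small grid-based version in which each conditional expectation is computed by quadrature against the previous iterate rather than by the Monte Carlo scheme of §4; the intervals, grid, bandwidth and uniform starting density are illustrative assumptions, and exact observations (L_i = R_i) are omitted:

```python
import numpy as np

def gauss_K(z):
    return np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)

def iterate_density(intervals, f_prev, grid, h):
    """One step f_j(x) = (1/nh) sum_i E_{j-1}[K((x - X_i)/h) | X_i in I_i],
    with the conditional expectations evaluated by quadrature on `grid`
    against the previous iterate f_prev (its values on `grid`)."""
    dx = grid[1] - grid[0]
    out = np.zeros_like(grid)
    for (L, R) in intervals:
        ind = ((grid >= L) & (grid <= R)).astype(float)  # indicator of I_i
        c = (f_prev * ind).sum() * dx                    # mass c_i of f_prev on I_i
        cond = f_prev * ind / c                          # conditional density on I_i
        # E_{j-1}[K((x - X_i)/h) | X_i in I_i] for every grid point x at once
        out += (gauss_K((grid[:, None] - grid[None, :]) / h)
                * cond[None, :]).sum(axis=1) * dx
    return out / (len(intervals) * h)

intervals = [(0.5, 2.0), (1.0, 3.5), (0.0, 4.0), (2.5, 5.0)]
grid = np.linspace(-4.0, 9.0, 400)
f = np.ones_like(grid) / (grid[-1] - grid[0])            # uniform starting density
for _ in range(4):
    f = iterate_density(intervals, f, grid, h=0.6)
print(round(float(np.trapz(f, grid)), 2))  # 1.0
```

The Monte Carlo scheme of §4 replaces the quadrature sum with a sample average over draws from each conditional density.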

Given that the estimator weights a data point by computing the average height of the kernel over the observed interval, consider Figure 2, which depicts how the weight depends on the length and proximity of the interval to the location of the kernel. In the figure the weights for two intervals, centered at 0 but with different lengths, are shown for different positions of the kernel. When the kernel is also centered at 0, the method rewards precision by giving the shorter interval a greater weight. However, when the location of the kernel is shifted so it overlaps predominantly with the longer interval, it assigns a larger weight to this interval even though it is less precise. This is due to the longer interval being more "local" than the shorter interval to the point of estimation, or the center of the kernel. The longer interval is local because there is non-zero probability that the true observation is in a region close to "-2" while this is not the case for the shorter interval.

While the above derivation has intuitive appeal, the estimator may be formally defined as minimizing an integrated squared distance between some arbitrary function and the ideal estimator f̂_c. We first present the following result.

Theorem 2.1 Let F be the set of absolutely continuous density functions in L_2. Suppose that X is distributed with density f ∈ F. Let

  C(f) = ∫_{−∞}^{∞} { f̂_c(x) − f(x) }² dx,

where f̂_c(x) = (nh)^{−1} Σ_{i=1}^n K((x − X_i)/h), and assume that h^{−1} K((· − u)/h) ∈ F for any fixed h > 0 and u ∈ ℝ. Then f̂ = E_f[ (nh)^{−1} Σ_{i=1}^n K((x − X_i)/h) | X_i ∈ I_i ∀i ] solves

  f̂ = argmin_{f ∈ F} E_f[ C(f) | X_i ∈ I_i, i = 1, ..., n ].

Proof: Let

  f̂ = argmin_{f ∈ F} E_f[ C(f) | X_i ∈ I_i, i = 1, ..., n ]
     = argmin_{f ∈ F} E_f[ ∫_{−∞}^{∞} { f̂_c(x) − f(x) }² dx | X_i ∈ I_i, i = 1, ..., n ].

Under the assumptions about f and K, this expectation is finite and hence

  E_f[ ∫_{−∞}^{∞} { f̂_c(x) − f(x) }² dx | X_i ∈ I_i, i = 1, ..., n ] = ∫_{−∞}^{∞} E_f[ { f̂_c(x) − f(x) }² | X_i ∈ I_i, i = 1, ..., n ] dx.

By minimizing the positive integrand for every fixed x, we minimize the integral. Thus for a fixed value of x the definition of conditional expectation implies that

  E_f[ { (1/nh) Σ_{i=1}^n K((x − X_i)/h) − f(x) }² | X_i ∈ I_i, i = 1, ..., n ]

is minimized with respect to f(x) at

  f̂(x) = E_f[ (1/nh) Σ_{i=1}^n K((x − X_i)/h) | X_i ∈ I_i, i = 1, ..., n ] = (1/nh) Σ_{i=1}^n E_f[ K((x − X_i)/h) | X_i ∈ I_i ]. ∎
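The pointwise minimization in the proof rests on the standard fact that a conditional mean squared error is minimized by the conditional expectation; for a square-integrable Z and conditioning event A,

```latex
E\left[(Z-c)^2 \mid A\right]
  = E\left[\left(Z - E[Z\mid A]\right)^2 \mid A\right]
    + \left(E[Z\mid A] - c\right)^2,
```

so the minimizer is c = E[Z | A]. Taking Z = (nh)^{−1} Σ_i K((x − X_i)/h) and A = {X_i ∈ I_i, i = 1, ..., n} for each fixed x gives the stated f̂(x).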

Note the criterion is quite restrictive. It explicitly involves f̂_c and hence the form of the optimal estimator is not surprising. Nevertheless, the result is useful due to the interpretation it lends the estimator. In terms of squared distance the estimator gets as close as possible to the ideal kernel density estimate. In addition, since we do not know the true density f, we replace it with any current guess for f, say f̂_{j−1}. Hence the estimator may be regarded as resulting from a generalized EM algorithm with:

E-step: for all i define f̂_{j−1,i}(x) and compute w_i(x) = E_{j−1}[ K((x − X_i)/h) | X_i ∈ I_i ].

M-step: Compute f̂_j(x) = (1/nh) Σ_{i=1}^n w_i(x).

Computational issues concerning the E-step are considered in §4. Figure 3 gives the result of the first four iterations of the algorithm for the hemophiliac data. In this data set, patients who were infected at the time of entry were assigned a left-hand end point at the start of the study, which resulted in a number of lengthy intervals commencing from the beginning of the study. The common practice in the HIV literature of assuming a uniform distribution over each interval is clearly inappropriate from an inspection of the data. For the estimates in Figure 3 we used a uniform distribution as our starting point (right censored observations were given a weight of 0). As one expects, differences between the first two iterates are quite large as the initial assumption of a uniform distribution is adjusted by the density estimate itself, which places more weight on the later period of the study. Convergence is achieved after four iterations.

3 Properties of the estimator

When the complete data kernel density estimate, f̂_c, is used to estimate the cdf as

  F̂_c(x) = ∫_{−∞}^x f̂_c(u) du,

the estimate F̂_c reduces to the NPMLE as h ↓ 0. Here the NPMLE is the empirical distribution function F_n. An analogous development holds for the estimator f̂_j as well. In this section we show the algorithm (1) reduces to that of Efron (1967), Turnbull (1976) and Li et al. (1997) as h ↓ 0. Since each of these converges under broad conditions, we conjecture that the use of kernel smoothing at each iteration does not perturb the algorithm to such an extent as to affect convergence. The simulation results of §6 support this.

Efron (1967) proposed an iterative scheme for approximating the survivor function at a point x, S(x) = P[X > x]:

  S̃_j(x) = (1/n){ N(x) + Σ_{L_i < x, δ_i = 0} S̃_{j−1}(x) / S̃_{j−1}(L_i) },

where N(x) = #{X_i > x}, δ_i = 1 if X_i is observed exactly and δ_i = 0 if X_i is right-censored (R_i = ∞). Efron shows S̃_j converges to a fixed point that coincides with the Kaplan-Meier product limit estimator, that is, the NPMLE. Turnbull (1976) generalized this algorithm to obtain a NPMLE of the distribution function under general censoring and truncation schemes. Li et al. (1997) proposed an estimator that is the fixed point of an EM algorithm. Their estimator coincides with Turnbull's estimator where Turnbull's estimator is uniquely defined, and converges to a value that depends on the starting point of the iterative scheme where Turnbull's estimator is not uniquely defined. The iterative scheme proposed by Li et al. (1997) involves computing the conditional expectation of F_n at each step:

  F̄_j(x) = E_{j−1}[ F_n(x) | X_i ∈ I_i ∀i ].

The following theorem shows that Li et al.'s estimator can be obtained as a limit of our estimator when we let the window width of the kernel shrink to zero at every step.

Theorem 3.1 Let F̂_j(x) be the estimate of the cdf corresponding to the density estimate (1). Assuming both algorithms have the same initial value, then

  lim_{h↓0} F̂_j(x) = F̄_j(x) ∀x, j = 1, 2, ....

Proof: F̄_j may be rewritten as

  F̄_j(x) = E_{j−1}[ F_n(x) | X_i ∈ I_i ∀i ]
         = (1/n) Σ_{i=1}^n E_{j−1}[ I[X_i ≤ x] | X_i ∈ I_i ]
         = (1/n) Σ_{i=1}^n { [F̄_{j−1}(x) − F̄_{j−1}(L_i)] / [F̄_{j−1}(R_i) − F̄_{j−1}(L_i)] · 1_i(x) + I[x ≥ R_i] }, j = 1, 2, ....

Note Li et al. (1997) use the third expression for computation. Defining K_1(u) = ∫_{−∞}^u K(y) dy and using Tonelli's theorem to interchange expectation and integration we may similarly write

  F̂_j(x) = ∫_{−∞}^x f̂_j(u) du
         = (1/nh) Σ_{i=1}^n ∫_{−∞}^x E_{j−1}[ K((u − X_i)/h) | X_i ∈ I_i ] du
         = (1/n) Σ_{i=1}^n E_{j−1}[ K_1((x − X_i)/h) | X_i ∈ I_i ].

Since K_1((x − X_i)/h) ≤ 1 for all h, we can bring the limit inside the expectation. The result obtains since lim_{h↓0} K_1((u − v)/h) = I[v ≤ u] for all u, v ∈ ℝ. ∎

In the case of right-censored data the algorithm (1) reduces to that of Efron (1967) as an immediate consequence of Theorem 3.1.

Corollary 3.1 If R_i = ∞ for all interval censored data points, then

  lim_{h↓0} F̂_j(x) = 1 − S̃_j(x) ∀x, j = 1, 2, ....

Proof: Under right censoring, 1 − S̃_j(x) = F̄_j(x) and hence the result. ∎

The above developments naturally lead to the consideration of convergence of the algorithm to a fixed point. Series expansions (Silverman, 1986), similar to those for the complete data kernel density estimate, f̂_c, show that F̂_j is equivalent to F̄_j to second order. The following theorem does not prove convergence, but it does show that convergence of the algorithm is linked to the convergence of F̄_j and may be inherited from F̄_j. In other words, the convergence of F̄_j is a necessary condition for the convergence of F̂_j. Li et al. (1997) show that F̄_j converges when F̄_0 is a strictly increasing distribution function.

Theorem 3.2 Assume ∫ K(u) du = 1, ∫ u K(u) du = 0 and ∫ u² K(u) du = σ²_K < ∞. Then, assuming both algorithms have the same initial value, we have

  F̂_j(x) = F̄_j(x) + O(h²) ∀x, j = 1, 2, ....

The proof of this theorem can be found in the Appendix. The assumptions of the theorem are typical of most popular kernel functions, including the Gaussian kernel. The effect of the O(h²) term depends on both the properties of the kernel as well as the size of h. In the simulations of §6 the O(h²) term does not disturb convergence.

4 Implementation through importance sampling

Computing an iterate in the recursive scheme (1) requires the computation of a conditional expectation for each interval censored observation. For an interval I this conditional expectation has the form

  μ_I = E_j[ K((x − X)/h) | X ∈ I ] = ∫_L^R K((x − X)/h) f̂_{j,I}(X) dX,

which, except in special cases, will not be computable in closed form. Rather than numerically approximating the integral involved in the expectation, we estimate it by a sample

mean in a Monte Carlo scheme that is fast and easy to implement. Thus the iterative algorithm involves a sampling process which iterates until we are confident that we are sampling from the fixed point of (1). Two sampling schemes are considered, where the second is an approximation of the first.

The first method involves sampling exactly from f̂_{j,I} using an acceptance/rejection method where candidate values Y are generated from the distribution with density f̂_j and accepted if Y ∈ I:

1. generate Y ~ f̂_j;
2. if Y ∈ I set X ← Y, otherwise go to 1.

The first step is accomplished by the following recursive scheme:

1. sample with replacement from {I_1, ..., I_n} to get I*;
2. sample X from f̂_{j−1,I*};
3. sample Y from (1/h) K((· − X)/h),

where the recursion occurs at step 2. Once a sample X_1, ..., X_B is obtained, μ̂_I is computed as

  μ̂_I = (1/B) Σ_{k=1}^B K((x − X_k)/h),

and since the sampling is exact,

  μ̂_I → E_j[ K((x − X)/h) | X ∈ I ] as B → ∞.

Thus we can limit the effect of Monte Carlo error by choosing B to be as large as we want. The difficulty with this exact sampling method is that it punishes precision in the data. When an interval is narrow the acceptance/rejection step will largely reject proposals. Thus obtaining a large sample may take a long time, and given the scheme is recursive the impact can be substantial. To offset this we use an importance sampling scheme where the time to compute an iterate does not increase with the number of iterations as in the exact sampling scheme above. Based on

  E_j[ K((x − X)/h) | X ∈ I ] = E_g[ K((x − X)/h) w(X) ],

where g is some distribution over the interval I that is easy to sample from and w(X) = f̂_{j,I}(X)/g(X) is the importance sampling weight, μ̂_I becomes

  μ̂_I = Σ_{k=1}^B K((x − X_k)/h) w*_k,

where w*_k = w(X_k) / Σ_{l=1}^B w(X_l), and the above acceptance/rejection scheme is replaced by

1. Generate X_k ~ g, k = 1, ..., B.
2. Compute μ̂_I.

The only additional complication is to compute the sampling weights w*_k which, upon inspection, simplify in a convenient way. Ultimately they involve the height of the unconditional kernel density estimate f̂_j, thus avoiding computation of the constants c_{j,I}:

  w*_k = [ f̂_{j,I}(X_k)/g(X_k) ] / Σ_{l=1}^B [ f̂_{j,I}(X_l)/g(X_l) ]
       = [ f̂_j(X_k)/(c_{j,I} g(X_k)) ] / Σ_{l=1}^B [ f̂_j(X_l)/(c_{j,I} g(X_l)) ]
       = [ f̂_j(X_k)/g(X_k) ] / Σ_{l=1}^B [ f̂_j(X_l)/g(X_l) ].

Finally, using the values of the kernel density estimate f̂_j at a sufficiently fine grid, f̂_{j−1}(X_k) can be accurately computed by interpolation. Using the hemophiliac data, Figure 4 gives the result of a simulation study for values of B = 10 and 100 respectively. For each plot the kernel density estimate, based on 4 iterations of the algorithm, was computed 100 times for a fixed value of B. The plot depicts the resulting pointwise mean and 99% percentile interval. The method works quite well for samples of size 100.

5 Choice of the smoothing parameter

A central component of kernel density estimation is the choice of the smoothing parameter. We propose an automatic method for this purpose based on likelihood cross-validation that is analogous to the complete data case (Silverman, 1986). In the presence of complete data X_1, ..., X_n likelihood cross-validation aims to maximize

  CV(h) = Π_{i=1}^n f̂_h^{(−i)}(X_i)

with respect to the smoothing parameter h. The superscript indicates X_i is left out when the estimate f̂_h^{(−i)}(X_i) is computed, and the method works because E[log CV(h)] involves the Kullback-Leibler distance between f and f̂_h:

  E[log CV(h)]/n ≈ −∫ f(t) log{ f(t)/f̂_h(t) } dt + ∫ f(t) log{ f(t) } dt.

In the case of interval censored data it is natural to mimic the above strategy through analogy. In the above, f̂_h^{(−i)}(X_i) is obtained by eliminating a point of support, X_i, from the NPMLE, namely F_n. By eliminating X_i the contribution to CV at that point of support uses only the remaining data. In our case, the support of the NPMLE, F̂_t, is the innermost intervals defined as J_r = (p_r, q_r), r = 1, ..., m, where p_r ∈ {l_i, i = 1, ..., n}, q_r ∈ {r_i, i = 1, ..., n} and J_r ∩ I_i equals J_r or ∅ for all r, i (see Turnbull, 1976, or Li et al., 1997, for a more detailed discussion of innermost sets). For simplicity of exposition, and without loss of

generality, assume all data are interval censored. In this case, the cross-validated likelihood is defined as

  Π_{r=1}^m ∫_{J_r} f̂_h^{(−r)}(t) dt,

where ∫_{J_r} f̂_h^{(−r)}(t) dt is obtained by dropping the innermost interval J_r when estimating the density. Dropping an innermost interval is accomplished by removing all intervals in the original sample that contribute to its presence but not to the presence of any other innermost interval. This conveniently addresses the question of tied observations, which are common for interval censored data. For example, the hemophiliac data contains only 40 distinct intervals in a sample of size 105. In addition, it also addresses the issue of how to handle two observed intervals that are not tied but have a high degree of overlap. If they both overlap completely with the eliminated innermost interval then they are both eliminated when estimating the contribution to the cross-validation process for that interval.

While the scheme is admittedly ad hoc, it worked well in a limited simulation study using 40 samples. A description of how data was generated is given in §6. Table 1 compares average values of our cross-validated likelihood with the Kullback-Leibler distance for both f̂_j and f̂_t. In both cases our cross-validated likelihood is quite accurate when compared to the Kullback-Leibler distance. It obtains its maximum at, or near, the value of the window size that minimizes the Kullback-Leibler distance. In addition, the ideal window size is smaller for the proposed estimator, f̂_j, indicating it uses more information in the data. Note the method contradicts our aim of simplicity as knowledge of the innermost intervals is required for computation. Ultimately, a method that is independent of innermost intervals, like the k-fold cross-validation considered in Pan (2000), may be preferred.

6 A simulation study

For the estimator f̂_j two patterns of behaviour are evident in the following simulation study: convergence and improvement over the standard kernel density estimate, f̂_t. Five criteria are considered, of which three are useful for comparing f̂_j and f̂_t. The remaining two assess the dependence of the estimator on the initial value, f_0, of the algorithm. All criteria assess convergence. Define the squared distance, Δ, between two functions, u and v, as

  Δ(u, v) = Σ_{x ∈ X} { u(x) − v(x) }²,

where X is a fixed grid of equally spaced points spanning the range of the data. This distance is central to all convergence criteria with the exception of one involving the Kullback-Leibler distance.
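The squared distance Δ and the contraction condition of the next section can be checked directly on such a grid; in this sketch the iterates are stand-in normal densities (not estimates from the paper) whose centres move together:

```python
import numpy as np

def delta(u, v):
    """Delta(u, v) = sum over a fixed grid of {u(x) - v(x)}^2,
    with u and v given as arrays of values on the grid."""
    return float(((u - v) ** 2).sum())

grid = np.linspace(-3.0, 3.0, 101)
pdf = lambda m: np.exp(-0.5 * (grid - m) ** 2) / np.sqrt(2 * np.pi)

# stand-in iterates from two different starting values; their centres draw
# together, so the condition Delta(f_j, g_j) < Delta(f_{j-1}, g_{j-1}) holds
f_prev, g_prev = pdf(-1.0), pdf(1.0)
f_curr, g_curr = pdf(-0.4), pdf(0.4)
print(delta(f_curr, g_curr) < delta(f_prev, g_prev))  # True
```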

If the algorithm converges to a fixed point, f̂, that is independent of the initial value, then it is said to be a contraction mapping, Ψ, if for some suitably defined space of densities F, Ψ is such that

  Ψ: F → F,  f̂_j = Ψ(f̂_{j−1}),  f̂ = Ψ(f̂),  Δ(f̂_j, ĝ_j) < Δ(f̂_{j−1}, ĝ_{j−1}), j > 1,

where f̂_j and ĝ_j are the density estimates at the jth step for two arbitrary but different starting points f_0, g_0 ∈ F. The first two columns of Table 2 assess the behaviour of the estimator as a contraction mapping. The "squared distance" column gives the average value of Δ(f̂_j, ĝ_j), j = 1, ..., 10, based on 100 samples,

  Δ̄(f̂_j, ĝ_j) = (1/100) Σ_i Δ_i(f̂_j, ĝ_j),

where Δ_i(·, ·) denotes the value of Δ(·, ·) for the ith sample. The "contraction" column gives the proportion of samples that satisfy the condition

  Δ_i(f̂_j, ĝ_j) < Δ_i(f̂_{j−1}, ĝ_{j−1}), j > 1.

For each of the 100 random samples we generated 20 failure times from a Weibull distribution with shape parameter 0.75 and scale parameter 3. Independently, we then generated "visit times" using a homogeneous Poisson process. Each failure time was interval-censored by the visit times that bracketed it. For each sample, we computed the iterative scheme (1) using a Gaussian kernel with a fixed window size h. We used a sample size of B = 100 for the importance sampling scheme described in §4. The initial values of the density for the iterative scheme are various scale and location shifted beta distributions. Here f_0 and g_0 are based on beta(5, 2) and beta(2, 5) distributions respectively.

The criteria MSE_r, r = 1, 2, use Δ(u, v) to assess the expected value of the squared distance under the Weibull(0.75, 3) distribution. When r = 1, the function u is set to be the true density, f, and v = f̂_j. Thus MSE_1 estimates the actual mean squared error of the estimator. For MSE_2 the true density is replaced by the ideal estimator, f̂_c, and the criterion assesses the closeness of f̂_j to the ideal estimator as discussed in §2. The final column assesses an estimate of the Kullback-Leibler distance

  Δ_KL(f, f̂_j) = E[ log{ f(X)/f̂_j(X) } ].

For each of the 100 samples an estimate, Δ̂_i(f, f̂_j),

is itself based on a sample X ~ Weibull(0.75, 3). As in §4, the values of the density estimate f̂_j(X) are found by interpolation using the values of f̂_j computed at the grid X. The entry given in the table is

  Δ̄_KL(f, f̂_j) = (1/100) Σ_i Δ̂_i(f, f̂_j).

Note the MSE and Kullback-Leibler criteria are evaluated for Turnbull's estimator as well, where f̂_j is simply replaced by f̂_t. All columns in the table give standard errors in brackets.

The results reported in Table 2 show the algorithm tends to reach convergence after 3-6 iterations. The contraction criterion improves until the 6th iteration, after which its behaviour is consistent with what would be expected if the Monte Carlo scheme of §4 involved sampling from the fixed point (once convergence is reached we expect the condition Δ_i(f̂_j, ĝ_j) < Δ_i(f̂_{j−1}, ĝ_{j−1}) to hold about 50% of the time). The squared distance criterion shows the distance between estimators with different initial values gets very small indeed by the 6th iteration. The MSE and Kullback-Leibler criteria reach their minimum after only 3 iterations of the algorithm, after which they remain fairly constant. The large improvement between the first and second iterates, and the smaller improvement between the second and third iterates, show how the estimator continues to extract more information out of the data after the first iteration. It is these improvements that result in the estimator having better properties than f̂_t for these criteria. Finally, as an example, Figure 5 shows four successive iterates for the hemophiliac data based on various initial values of the algorithm.

7 Use as a scatterplot smoother

Kernel weights are useful in regression as well as density estimation. In the regression context we consider the use of kernel weights where a covariate is interval censored. The techniques described here are understood to be applicable to multiple regression problems where an additive or generalized additive model is used. This and the context where the response in a regression is also interval censored are deferred. The purpose of the following example is to exhibit the flexibility of the methods of the paper rather than to perform the ideal data analysis.

The data used is a subset of a larger dataset concerning HIV infection and infant mortality (Hughes and Richardson, 2001). Here we only consider infants with no interval censoring in the response (i.e. infants that died) and that were infected with HIV. Consider the model E[Y] = g(x), where the covariate x may be interval censored. In terms of a scatterplot, interval censoring in a covariate means only the y coordinate is known. Any smoothing process that uses kernel weights whose size is determined by some nearest neighbourhood technique needs

modification, as such neighbourhoods are determined by the covariate. Suppose, for example, that a running mean smoother is used where

  ŷ = ĝ(x) = Σ_{y_j ∈ N_x} v_j y_j,

where N_x is the nearest neighbourhood for x. Typically the weights v_j are given as v_j = K((x − X_j)/h); however the X_j are not observed. In keeping with the spirit of this paper, v_j is replaced by

  μ_{I_j} = E_{f̂_x}[ K((x − X)/h) | X ∈ I_j ],

where expectation is computed with respect to the fixed point f̂_x of (1) restricted to the interval I_j. Note the estimate f̂_x is the density estimate for the covariate X. As in §4, expectation is approximated by an importance sampling algorithm and so the recipe for computing ĝ is:

1. Generate a sample X_1, ..., X_B from the chosen importance sampling distribution for the interval I_j.
2. Using f̂_x compute the sampling weights w*_k as in §4.
3. Approximate μ_{I_j} by μ̂_{I_j} = Σ_{k=1}^B K((x − X_k)/h) w*_k.
4. Compute ĝ(x) = Σ_{y_j ∈ N_x} μ̂_{I_j} y_j.

Figure 6 gives four plots for the infant data. The first of these is a scatterplot of the times of death for the sixty infants used in the fitting process. The covariate is the time of infection with the HIV virus. It is interval censored and hence a scatterplot "point" is actually a line obtained by joining the right and left endpoints of each interval. The second plot gives the fitted density estimate for the covariate based on four iterations of our algorithm. The cross validation technique of §5 was used to pick the "ideal" window size of 375. The data were originally collected in a study of the effect of breast feeding on infection. However, for the infants used here the primary source of infection seems to be in utero. The time point 0 indicates the time of birth of the child. Note intervals for the covariate extend below 0, indicating infection may have taken place before the birth of the child. The remaining two plots use this density estimate to kernel smooth the scatterplot using a running mean, although any linear smoother could be used. Several window sizes were arbitrarily chosen.

The plots raise many interesting issues worthy of further study. For example, the smooths are dominated in an unreasonable fashion by three points on the right, suggesting that either the techniques be made robust or the choice of smoothing parameter be made adaptive, or both! These and other issues, like how to handle an interval censored response as well, will be considered in future work.
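The four-step recipe above can be sketched as follows; the uniform proposal g, the stand-in covariate density f̂_x, and the normalization of the weights μ̂ over the neighbourhood are assumptions of the sketch rather than prescriptions of the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def gauss_K(z):
    return np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)

def interval_weight(x, interval, f_hat, h, B=200):
    """mu_I = E[K((x - X)/h) | X in I] by importance sampling with g uniform
    on I; f_hat stands in for the fitted covariate density f_x."""
    L, R = interval
    X = rng.uniform(L, R, size=B)   # step 1: sample from g
    w = f_hat(X)                    # step 2: weights; the constant uniform g cancels
    w = w / w.sum()                 # normalized w*_k
    return float((gauss_K((x - X) / h) * w).sum())  # step 3: mu-hat

def running_mean(x, intervals, y, f_hat, h):
    """Step 4: g-hat(x) as a kernel-weighted mean of the responses y_j;
    the weights are normalized here so the smoother averages y."""
    mu = np.array([interval_weight(x, I, f_hat, h) for I in intervals])
    return float((mu * y).sum() / mu.sum())

# illustrative data: interval-censored covariate, exact responses
intervals = [(0.0, 1.0), (0.5, 2.0), (1.5, 3.0), (2.5, 4.0)]
y = np.array([1.0, 1.5, 2.5, 3.0])
f_hat = lambda t: np.exp(-0.5 * ((t - 2.0) / 1.5) ** 2)  # stand-in for f_x
print(running_mean(2.0, intervals, y, f_hat, h=0.8))
```

Because the weights are positive, the fitted value is a convex combination of the responses in the neighbourhood, which is what makes the sensitivity to outlying points noted below possible.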

Acknowledgements

The authors are grateful to Jeff Rosenthal, David Andrews, Rob Tibshirani and Jerry Lawless for inspiring discussions. We particularly appreciate many useful conversations with Paul Corey. They also wish to thank the Natural Sciences and Engineering Research Council of Canada for supporting this research through individual operating grants.

Appendix

Proof of Theorem 3.2: The proof is given for the case where all data are interval censored. Recall

  F̄_j(x) = ∫_{−∞}^x f̄_j(t) dt = (1/n) Σ_{i=1}^n { [F̄_{j−1}(x) − F̄_{j−1}(L_i)] / [F̄_{j−1}(R_i) − F̄_{j−1}(L_i)] · 1_i(x) + I[x ≥ R_i] },

so that

  f̄_j(x) = dF̄_j(x)/dx = (1/n) Σ_{i=1}^n f̄_{j−1}(x) / [F̄_{j−1}(R_i) − F̄_{j−1}(L_i)] · 1_i(x) = (1/n) Σ_{i=1}^n c_{i,j−1}^{−1} f̄_{j−1}(x) 1_i(x).

Now standard calculations like those of Silverman (1986) allow expansion about h = 0 of each conditional expectation, and thus expansion of f̂_j for all j. Writing g(t) = f̄_{j−1}(t) 1_i(t),

  (1/h) ∫_{L_i}^{R_i} K((x − t)/h) f̄_{j−1,i}(t) dt
    = c_{i,j−1}^{−1} (1/h) ∫ K((x − t)/h) f̄_{j−1}(t) 1_i(t) dt
    = c_{i,j−1}^{−1} (1/h) ∫ K((x − t)/h) g(t) dt
    = c_{i,j−1}^{−1} ∫ K(u) g(x − hu) du
    = c_{i,j−1}^{−1} ∫ K(u) { g(x) − hu ġ(x) + (h²/2) u² g̈(x) + ··· } du
    = c_{i,j−1}^{−1} { g(x) ∫ K(u) du − h ġ(x) ∫ u K(u) du + (h²/2) g̈(x) ∫ u² K(u) du + ··· }

    = c_{i,j−1}^{−1} { 1_i(x) f̄_{j−1}(x) + (h²/2) g̈(x) σ²_K + ··· }.

Averaging over i gives f̂_j(x) = f̄_j(x) + O(h²), and therefore

  F̂_j(x) = F̄_j(x) + O(h²), j = 1, 2, .... ∎

References

[1] Efron, B. (1967). "The two sample problem with censored data". Fourth Berkeley Symposium on Mathematical Statistics, University of California Press.

[2] Goutis, C. (1997). "Nonparametric estimation of a mixing density via the kernel method". Journal of the American Statistical Association, 92.

[3] De Gruttola, V. and Lagakos, S. W. (1989). "Analysis of doubly-censored survival data, with applications to AIDS". Biometrics, 45.

[4] Hughes, J. P. and Richardson, R. (2001). "Analysis of a randomized trial to prevent vertical transmission of HIV-1". Journal of the American Statistical Association, in press.

[5] Li, L., Watkins, T. and Yu, Q. (1997). "An EM algorithm for smoothing the self-consistent estimator of survival functions with interval-censored data". Scandinavian Journal of Statistics, 24.

[6] Pan, W. (2000). "Smooth estimation of the survival function for interval censored data". Statistics in Medicine, 19.

[7] Silverman, B. (1986). Density Estimation for Statistics and Data Analysis. London: Chapman-Hall.

[8] Turnbull, B. W. (1976). "The empirical distribution function with arbitrarily grouped, censored and truncated data". Journal of the Royal Statistical Society, Series B, 38.

[9] Wu, C. F. (1983). "On the convergence of the EM algorithm". Annals of Statistics, 11.

17 observation count time of infection density estimate time of infection Figure : Te rst plot gives te original data wit a line joining te left and rigt endpoints of eac interval Te second plot gives two kernel density estimates wit te solid line being tat of te metod proposed, ^f4, after 4 iterations and te dotted line a kernel smooted version of Turnbull's estimator, ^f t 7

18 weigt x weigt x Figure 2: Te plot depicts ow te kernel density estimate works Te diamond saped plotting caracter gives te weigt for te longer interval 8

19 density estimate j= j=2 j=3 j= time of infection Figure 3: Te plot gives te kernel density estimate ^f j for eac of te rst four iterations of te algoritm Convergence appears to ave been reaced by te tird iteration 9

20 density estimate time of infection density estimate time of infection Figure 4: Plots of te pointwise mean and 99% percentile intervals for B wit values of 0 and 00 respectively 20

21 density estimates time of infection density estimates time of infection density estimates time of infection density estimates time of infection Figure 5: Plots of ^f j j = ::: 4 for dierent initial values of te algoritm 2

22 Time of deat Time of infection f Time of infection Time of deat = =5 =2 =25 =3 = Time of infection Time of deat = =5 =2 =25 =3 = Time of infection Figure 6: Clockwise from te upper left are plots of te data, a density estimate for te interval censored covariate and two plots of te tted curve for various window sizes Te rst of tese includes te scatterplot data reported as te midpoint of te interval and te second gives te interval itself 22

23 Table : Cross validation (CV) and Kullbeck-Leibler (KL) distances CV ( ^f 4 ) KL ( ^f 4 ) CV ( ^f t ) KL ( ^f t ) ;

24 Table 2: Te beavior of various criteria j Squared distance Contraction MSE MSE 2 Kullbeck-Leibler 2e-0 (37e-02) (00032) (00054) 079 (00059) 2 2e-02 (94e-03) 000(N/A) 0055 (00037) 0030 ( ) 03 (00049) 3 6e-03 (20e-03) 000(N/A) (000322) 0022 ( ) 025 (00047) 4 24e-04 (44e-04) 000(N/A) (000325) 0022 ( ) 026 (00050) 5 55e-05 (80e-05) 09(00286) (000328) 0023 ( ) 025 (00050) 6 27e-05 (26e-05) 069(00462) (000329) 0024 ( ) 026 (00050) 7 28e-05 (24e-05) 047(00499) 0049 (000328) 0023 (000088) 024 (00052) 8 26e-05 (27e-05) 053(00499) 0049 (000328) 0023 ( ) 025 (00049) 9 28e-05 (30e-05) 046(00498) (00033) 0023 ( ) 025 (00049) 0 25e-05 (25e-05) 053(00499) 0049 (000329) 0023 ( ) 026 (00052) ^f t (000579) (000302) 049 (00067) 24

The Priestley-Chao Estimator

The Priestley-Chao Estimator Te Priestley-Cao Estimator In tis section we will consider te Pristley-Cao estimator of te unknown regression function. It is assumed tat we ave a sample of observations (Y i, x i ), i = 1,..., n wic are

More information

lecture 26: Richardson extrapolation

lecture 26: Richardson extrapolation 43 lecture 26: Ricardson extrapolation 35 Ricardson extrapolation, Romberg integration Trougout numerical analysis, one encounters procedures tat apply some simple approximation (eg, linear interpolation)

More information

Consider a function f we ll specify which assumptions we need to make about it in a minute. Let us reformulate the integral. 1 f(x) dx.

Consider a function f we ll specify which assumptions we need to make about it in a minute. Let us reformulate the integral. 1 f(x) dx. Capter 2 Integrals as sums and derivatives as differences We now switc to te simplest metods for integrating or differentiating a function from its function samples. A careful study of Taylor expansions

More information

Polynomial Interpolation

Polynomial Interpolation Capter 4 Polynomial Interpolation In tis capter, we consider te important problem of approximatinga function fx, wose values at a set of distinct points x, x, x,, x n are known, by a polynomial P x suc

More information

HOMEWORK HELP 2 FOR MATH 151

HOMEWORK HELP 2 FOR MATH 151 HOMEWORK HELP 2 FOR MATH 151 Here we go; te second round of omework elp. If tere are oters you would like to see, let me know! 2.4, 43 and 44 At wat points are te functions f(x) and g(x) = xf(x)continuous,

More information

Polynomial Interpolation

Polynomial Interpolation Capter 4 Polynomial Interpolation In tis capter, we consider te important problem of approximating a function f(x, wose values at a set of distinct points x, x, x 2,,x n are known, by a polynomial P (x

More information

Numerical Differentiation

Numerical Differentiation Numerical Differentiation Finite Difference Formulas for te first derivative (Using Taylor Expansion tecnique) (section 8.3.) Suppose tat f() = g() is a function of te variable, and tat as 0 te function

More information

Order of Accuracy. ũ h u Ch p, (1)

Order of Accuracy. ũ h u Ch p, (1) Order of Accuracy 1 Terminology We consider a numerical approximation of an exact value u. Te approximation depends on a small parameter, wic can be for instance te grid size or time step in a numerical

More information

Lecture 15. Interpolation II. 2 Piecewise polynomial interpolation Hermite splines

Lecture 15. Interpolation II. 2 Piecewise polynomial interpolation Hermite splines Lecture 5 Interpolation II Introduction In te previous lecture we focused primarily on polynomial interpolation of a set of n points. A difficulty we observed is tat wen n is large, our polynomial as to

More information

A = h w (1) Error Analysis Physics 141

A = h w (1) Error Analysis Physics 141 Introduction In all brances of pysical science and engineering one deals constantly wit numbers wic results more or less directly from experimental observations. Experimental observations always ave inaccuracies.

More information

Bootstrap prediction intervals for Markov processes

Bootstrap prediction intervals for Markov processes arxiv: arxiv:0000.0000 Bootstrap prediction intervals for Markov processes Li Pan and Dimitris N. Politis Li Pan Department of Matematics University of California San Diego La Jolla, CA 92093-0112, USA

More information

Kernel Density Estimation

Kernel Density Estimation Kernel Density Estimation Univariate Density Estimation Suppose tat we ave a random sample of data X 1,..., X n from an unknown continuous distribution wit probability density function (pdf) f(x) and cumulative

More information

Chapter 1. Density Estimation

Chapter 1. Density Estimation Capter 1 Density Estimation Let X 1, X,..., X n be observations from a density f X x. Te aim is to use only tis data to obtain an estimate ˆf X x of f X x. Properties of f f X x x, Parametric metods f

More information

Fast Exact Univariate Kernel Density Estimation

Fast Exact Univariate Kernel Density Estimation Fast Exact Univariate Kernel Density Estimation David P. Hofmeyr Department of Statistics and Actuarial Science, Stellenbosc University arxiv:1806.00690v2 [stat.co] 12 Jul 2018 July 13, 2018 Abstract Tis

More information

SECTION 3.2: DERIVATIVE FUNCTIONS and DIFFERENTIABILITY

SECTION 3.2: DERIVATIVE FUNCTIONS and DIFFERENTIABILITY (Section 3.2: Derivative Functions and Differentiability) 3.2.1 SECTION 3.2: DERIVATIVE FUNCTIONS and DIFFERENTIABILITY LEARNING OBJECTIVES Know, understand, and apply te Limit Definition of te Derivative

More information

Te comparison of dierent models M i is based on teir relative probabilities, wic can be expressed, again using Bayes' teorem, in terms of prior probab

Te comparison of dierent models M i is based on teir relative probabilities, wic can be expressed, again using Bayes' teorem, in terms of prior probab To appear in: Advances in Neural Information Processing Systems 9, eds. M. C. Mozer, M. I. Jordan and T. Petsce. MIT Press, 997 Bayesian Model Comparison by Monte Carlo Caining David Barber D.Barber@aston.ac.uk

More information

Regularized Regression

Regularized Regression Regularized Regression David M. Blei Columbia University December 5, 205 Modern regression problems are ig dimensional, wic means tat te number of covariates p is large. In practice statisticians regularize

More information

7 Semiparametric Methods and Partially Linear Regression

7 Semiparametric Methods and Partially Linear Regression 7 Semiparametric Metods and Partially Linear Regression 7. Overview A model is called semiparametric if it is described by and were is nite-dimensional (e.g. parametric) and is in nite-dimensional (nonparametric).

More information

Parameter Fitted Scheme for Singularly Perturbed Delay Differential Equations

Parameter Fitted Scheme for Singularly Perturbed Delay Differential Equations International Journal of Applied Science and Engineering 2013. 11, 4: 361-373 Parameter Fitted Sceme for Singularly Perturbed Delay Differential Equations Awoke Andargiea* and Y. N. Reddyb a b Department

More information

Boosting Kernel Density Estimates: a Bias Reduction. Technique?

Boosting Kernel Density Estimates: a Bias Reduction. Technique? Boosting Kernel Density Estimates: a Bias Reduction Tecnique? Marco Di Marzio Dipartimento di Metodi Quantitativi e Teoria Economica, Università di Cieti-Pescara, Viale Pindaro 42, 65127 Pescara, Italy

More information

Chapter 4: Numerical Methods for Common Mathematical Problems

Chapter 4: Numerical Methods for Common Mathematical Problems 1 Capter 4: Numerical Metods for Common Matematical Problems Interpolation Problem: Suppose we ave data defined at a discrete set of points (x i, y i ), i = 0, 1,..., N. Often it is useful to ave a smoot

More information

Mathematics 5 Worksheet 11 Geometry, Tangency, and the Derivative

Mathematics 5 Worksheet 11 Geometry, Tangency, and the Derivative Matematics 5 Workseet 11 Geometry, Tangency, and te Derivative Problem 1. Find te equation of a line wit slope m tat intersects te point (3, 9). Solution. Te equation for a line passing troug a point (x

More information

Math 102 TEST CHAPTERS 3 & 4 Solutions & Comments Fall 2006

Math 102 TEST CHAPTERS 3 & 4 Solutions & Comments Fall 2006 Mat 102 TEST CHAPTERS 3 & 4 Solutions & Comments Fall 2006 f(x+) f(x) 10 1. For f(x) = x 2 + 2x 5, find ))))))))) and simplify completely. NOTE: **f(x+) is NOT f(x)+! f(x+) f(x) (x+) 2 + 2(x+) 5 ( x 2

More information

Chapter 5 FINITE DIFFERENCE METHOD (FDM)

Chapter 5 FINITE DIFFERENCE METHOD (FDM) MEE7 Computer Modeling Tecniques in Engineering Capter 5 FINITE DIFFERENCE METHOD (FDM) 5. Introduction to FDM Te finite difference tecniques are based upon approximations wic permit replacing differential

More information

THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA EXAMINATION MODULE 5

THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA EXAMINATION MODULE 5 THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA EXAMINATION NEW MODULAR SCHEME introduced from te examinations in 009 MODULE 5 SOLUTIONS FOR SPECIMEN PAPER B THE QUESTIONS ARE CONTAINED IN A SEPARATE FILE

More information

Combining functions: algebraic methods

Combining functions: algebraic methods Combining functions: algebraic metods Functions can be added, subtracted, multiplied, divided, and raised to a power, just like numbers or algebra expressions. If f(x) = x 2 and g(x) = x + 2, clearly f(x)

More information

Volume 29, Issue 3. Existence of competitive equilibrium in economies with multi-member households

Volume 29, Issue 3. Existence of competitive equilibrium in economies with multi-member households Volume 29, Issue 3 Existence of competitive equilibrium in economies wit multi-member ouseolds Noriisa Sato Graduate Scool of Economics, Waseda University Abstract Tis paper focuses on te existence of

More information

MAT 145. Type of Calculator Used TI-89 Titanium 100 points Score 100 possible points

MAT 145. Type of Calculator Used TI-89 Titanium 100 points Score 100 possible points MAT 15 Test #2 Name Solution Guide Type of Calculator Used TI-89 Titanium 100 points Score 100 possible points Use te grap of a function sown ere as you respond to questions 1 to 8. 1. lim f (x) 0 2. lim

More information

IEOR 165 Lecture 10 Distribution Estimation

IEOR 165 Lecture 10 Distribution Estimation IEOR 165 Lecture 10 Distribution Estimation 1 Motivating Problem Consider a situation were we ave iid data x i from some unknown distribution. One problem of interest is estimating te distribution tat

More information

. If lim. x 2 x 1. f(x+h) f(x)

. If lim. x 2 x 1. f(x+h) f(x) Review of Differential Calculus Wen te value of one variable y is uniquely determined by te value of anoter variable x, ten te relationsip between x and y is described by a function f tat assigns a value

More information

Yishay Mansour. AT&T Labs and Tel-Aviv University. design special-purpose planning algorithms that exploit. this structure.

Yishay Mansour. AT&T Labs and Tel-Aviv University. design special-purpose planning algorithms that exploit. this structure. A Sparse Sampling Algoritm for Near-Optimal Planning in Large Markov Decision Processes Micael Kearns AT&T Labs mkearns@researc.att.com Yisay Mansour AT&T Labs and Tel-Aviv University mansour@researc.att.com

More information

A Jump-Preserving Curve Fitting Procedure Based On Local Piecewise-Linear Kernel Estimation

A Jump-Preserving Curve Fitting Procedure Based On Local Piecewise-Linear Kernel Estimation A Jump-Preserving Curve Fitting Procedure Based On Local Piecewise-Linear Kernel Estimation Peiua Qiu Scool of Statistics University of Minnesota 313 Ford Hall 224 Curc St SE Minneapolis, MN 55455 Abstract

More information

The total error in numerical differentiation

The total error in numerical differentiation AMS 147 Computational Metods and Applications Lecture 08 Copyrigt by Hongyun Wang, UCSC Recap: Loss of accuracy due to numerical cancellation A B 3, 3 ~10 16 In calculating te difference between A and

More information

arxiv: v1 [math.pr] 28 Dec 2018

arxiv: v1 [math.pr] 28 Dec 2018 Approximating Sepp s constants for te Slepian process Jack Noonan a, Anatoly Zigljavsky a, a Scool of Matematics, Cardiff University, Cardiff, CF4 4AG, UK arxiv:8.0v [mat.pr] 8 Dec 08 Abstract Slepian

More information

The Complexity of Computing the MCD-Estimator

The Complexity of Computing the MCD-Estimator Te Complexity of Computing te MCD-Estimator Torsten Bernolt Lerstul Informatik 2 Universität Dortmund, Germany torstenbernolt@uni-dortmundde Paul Fiscer IMM, Danisc Tecnical University Kongens Lyngby,

More information

New Distribution Theory for the Estimation of Structural Break Point in Mean

New Distribution Theory for the Estimation of Structural Break Point in Mean New Distribution Teory for te Estimation of Structural Break Point in Mean Liang Jiang Singapore Management University Xiaou Wang Te Cinese University of Hong Kong Jun Yu Singapore Management University

More information

= 0 and states ''hence there is a stationary point'' All aspects of the proof dx must be correct (c)

= 0 and states ''hence there is a stationary point'' All aspects of the proof dx must be correct (c) Paper 1: Pure Matematics 1 Mark Sceme 1(a) (i) (ii) d d y 3 1x 4x x M1 A1 d y dx 1.1b 1.1b 36x 48x A1ft 1.1b Substitutes x = into teir dx (3) 3 1 4 Sows d y 0 and states ''ence tere is a stationary point''

More information

CS522 - Partial Di erential Equations

CS522 - Partial Di erential Equations CS5 - Partial Di erential Equations Tibor Jánosi April 5, 5 Numerical Di erentiation In principle, di erentiation is a simple operation. Indeed, given a function speci ed as a closed-form formula, its

More information

Copyright c 2008 Kevin Long

Copyright c 2008 Kevin Long Lecture 4 Numerical solution of initial value problems Te metods you ve learned so far ave obtained closed-form solutions to initial value problems. A closedform solution is an explicit algebriac formula

More information

1. Questions (a) through (e) refer to the graph of the function f given below. (A) 0 (B) 1 (C) 2 (D) 4 (E) does not exist

1. Questions (a) through (e) refer to the graph of the function f given below. (A) 0 (B) 1 (C) 2 (D) 4 (E) does not exist Mat 1120 Calculus Test 2. October 18, 2001 Your name Te multiple coice problems count 4 points eac. In te multiple coice section, circle te correct coice (or coices). You must sow your work on te oter

More information

1 The concept of limits (p.217 p.229, p.242 p.249, p.255 p.256) 1.1 Limits Consider the function determined by the formula 3. x since at this point

1 The concept of limits (p.217 p.229, p.242 p.249, p.255 p.256) 1.1 Limits Consider the function determined by the formula 3. x since at this point MA00 Capter 6 Calculus and Basic Linear Algebra I Limits, Continuity and Differentiability Te concept of its (p.7 p.9, p.4 p.49, p.55 p.56). Limits Consider te function determined by te formula f Note

More information

MA455 Manifolds Solutions 1 May 2008

MA455 Manifolds Solutions 1 May 2008 MA455 Manifolds Solutions 1 May 2008 1. (i) Given real numbers a < b, find a diffeomorpism (a, b) R. Solution: For example first map (a, b) to (0, π/2) and ten map (0, π/2) diffeomorpically to R using

More information

Investigating Euler s Method and Differential Equations to Approximate π. Lindsay Crowl August 2, 2001

Investigating Euler s Method and Differential Equations to Approximate π. Lindsay Crowl August 2, 2001 Investigating Euler s Metod and Differential Equations to Approximate π Lindsa Crowl August 2, 2001 Tis researc paper focuses on finding a more efficient and accurate wa to approximate π. Suppose tat x

More information

Kernel Density Based Linear Regression Estimate

Kernel Density Based Linear Regression Estimate Kernel Density Based Linear Regression Estimate Weixin Yao and Zibiao Zao Abstract For linear regression models wit non-normally distributed errors, te least squares estimate (LSE will lose some efficiency

More information

ERROR BOUNDS FOR THE METHODS OF GLIMM, GODUNOV AND LEVEQUE BRADLEY J. LUCIER*

ERROR BOUNDS FOR THE METHODS OF GLIMM, GODUNOV AND LEVEQUE BRADLEY J. LUCIER* EO BOUNDS FO THE METHODS OF GLIMM, GODUNOV AND LEVEQUE BADLEY J. LUCIE* Abstract. Te expected error in L ) attimet for Glimm s sceme wen applied to a scalar conservation law is bounded by + 2 ) ) /2 T

More information

Handling Missing Data on Asymmetric Distribution

Handling Missing Data on Asymmetric Distribution International Matematical Forum, Vol. 8, 03, no. 4, 53-65 Handling Missing Data on Asymmetric Distribution Amad M. H. Al-Kazale Department of Matematics, Faculty of Science Al-albayt University, Al-Mafraq-Jordan

More information

Math 1241 Calculus Test 1

Math 1241 Calculus Test 1 February 4, 2004 Name Te first nine problems count 6 points eac and te final seven count as marked. Tere are 120 points available on tis test. Multiple coice section. Circle te correct coice(s). You do

More information

1. Consider the trigonometric function f(t) whose graph is shown below. Write down a possible formula for f(t).

1. Consider the trigonometric function f(t) whose graph is shown below. Write down a possible formula for f(t). . Consider te trigonometric function f(t) wose grap is sown below. Write down a possible formula for f(t). Tis function appears to be an odd, periodic function tat as been sifted upwards, so we will use

More information

EFFICIENT REPLICATION VARIANCE ESTIMATION FOR TWO-PHASE SAMPLING

EFFICIENT REPLICATION VARIANCE ESTIMATION FOR TWO-PHASE SAMPLING Statistica Sinica 13(2003), 641-653 EFFICIENT REPLICATION VARIANCE ESTIMATION FOR TWO-PHASE SAMPLING J. K. Kim and R. R. Sitter Hankuk University of Foreign Studies and Simon Fraser University Abstract:

More information

Bootstrap confidence intervals in nonparametric regression without an additive model

Bootstrap confidence intervals in nonparametric regression without an additive model Bootstrap confidence intervals in nonparametric regression witout an additive model Dimitris N. Politis Abstract Te problem of confidence interval construction in nonparametric regression via te bootstrap

More information

[db]

[db] Blind Source Separation based on Second-Order Statistics wit Asymptotically Optimal Weigting Arie Yeredor Department of EE-Systems, el-aviv University P.O.Box 3900, el-aviv 69978, Israel Abstract Blind

More information

LIMITATIONS OF EULER S METHOD FOR NUMERICAL INTEGRATION

LIMITATIONS OF EULER S METHOD FOR NUMERICAL INTEGRATION LIMITATIONS OF EULER S METHOD FOR NUMERICAL INTEGRATION LAURA EVANS.. Introduction Not all differential equations can be explicitly solved for y. Tis can be problematic if we need to know te value of y

More information

Symmetry Labeling of Molecular Energies

Symmetry Labeling of Molecular Energies Capter 7. Symmetry Labeling of Molecular Energies Notes: Most of te material presented in tis capter is taken from Bunker and Jensen 1998, Cap. 6, and Bunker and Jensen 2005, Cap. 7. 7.1 Hamiltonian Symmetry

More information

64 IX. The Exceptional Lie Algebras

64 IX. The Exceptional Lie Algebras 64 IX. Te Exceptional Lie Algebras IX. Te Exceptional Lie Algebras We ave displayed te four series of classical Lie algebras and teir Dynkin diagrams. How many more simple Lie algebras are tere? Surprisingly,

More information

Differential Calculus (The basics) Prepared by Mr. C. Hull

Differential Calculus (The basics) Prepared by Mr. C. Hull Differential Calculus Te basics) A : Limits In tis work on limits, we will deal only wit functions i.e. tose relationsips in wic an input variable ) defines a unique output variable y). Wen we work wit

More information

Sin, Cos and All That

Sin, Cos and All That Sin, Cos and All Tat James K. Peterson Department of Biological Sciences and Department of Matematical Sciences Clemson University Marc 9, 2017 Outline Sin, Cos and all tat! A New Power Rule Derivatives

More information

CHAPTER (A) When x = 2, y = 6, so f( 2) = 6. (B) When y = 4, x can equal 6, 2, or 4.

CHAPTER (A) When x = 2, y = 6, so f( 2) = 6. (B) When y = 4, x can equal 6, 2, or 4. SECTION 3-1 101 CHAPTER 3 Section 3-1 1. No. A correspondence between two sets is a function only if eactly one element of te second set corresponds to eac element of te first set. 3. Te domain of a function

More information

Modelling evolution in structured populations involving multiplayer interactions

Modelling evolution in structured populations involving multiplayer interactions Modelling evolution in structured populations involving multiplayer interactions Mark Broom City University London Complex Systems: Modelling, Emergence and Control City University London London June 8-9

More information

2.1 THE DEFINITION OF DERIVATIVE

2.1 THE DEFINITION OF DERIVATIVE 2.1 Te Derivative Contemporary Calculus 2.1 THE DEFINITION OF DERIVATIVE 1 Te grapical idea of a slope of a tangent line is very useful, but for some uses we need a more algebraic definition of te derivative

More information

3.4 Worksheet: Proof of the Chain Rule NAME

3.4 Worksheet: Proof of the Chain Rule NAME Mat 1170 3.4 Workseet: Proof of te Cain Rule NAME Te Cain Rule So far we are able to differentiate all types of functions. For example: polynomials, rational, root, and trigonometric functions. We are

More information

Time (hours) Morphine sulfate (mg)

Time (hours) Morphine sulfate (mg) Mat Xa Fall 2002 Review Notes Limits and Definition of Derivative Important Information: 1 According to te most recent information from te Registrar, te Xa final exam will be eld from 9:15 am to 12:15

More information

Lecture XVII. Abstract We introduce the concept of directional derivative of a scalar function and discuss its relation with the gradient operator.

Lecture XVII. Abstract We introduce the concept of directional derivative of a scalar function and discuss its relation with the gradient operator. Lecture XVII Abstract We introduce te concept of directional derivative of a scalar function and discuss its relation wit te gradient operator. Directional derivative and gradient Te directional derivative

More information

LIMITS AND DERIVATIVES CONDITIONS FOR THE EXISTENCE OF A LIMIT

LIMITS AND DERIVATIVES CONDITIONS FOR THE EXISTENCE OF A LIMIT LIMITS AND DERIVATIVES Te limit of a function is defined as te value of y tat te curve approaces, as x approaces a particular value. Te limit of f (x) as x approaces a is written as f (x) approaces, as

More information

INFINITE ORDER CROSS-VALIDATED LOCAL POLYNOMIAL REGRESSION. 1. Introduction

INFINITE ORDER CROSS-VALIDATED LOCAL POLYNOMIAL REGRESSION. 1. Introduction INFINITE ORDER CROSS-VALIDATED LOCAL POLYNOMIAL REGRESSION PETER G. HALL AND JEFFREY S. RACINE Abstract. Many practical problems require nonparametric estimates of regression functions, and local polynomial

More information

LECTURE 14 NUMERICAL INTEGRATION. Find

LECTURE 14 NUMERICAL INTEGRATION. Find LECTURE 14 NUMERCAL NTEGRATON Find b a fxdx or b a vx ux fx ydy dx Often integration is required. However te form of fx may be suc tat analytical integration would be very difficult or impossible. Use

More information

Lecture 21. Numerical differentiation. f ( x+h) f ( x) h h

Lecture 21. Numerical differentiation. f ( x+h) f ( x) h h Lecture Numerical differentiation Introduction We can analytically calculate te derivative of any elementary function, so tere migt seem to be no motivation for calculating derivatives numerically. However

More information

The derivative function

The derivative function Roberto s Notes on Differential Calculus Capter : Definition of derivative Section Te derivative function Wat you need to know already: f is at a point on its grap and ow to compute it. Wat te derivative

More information

ch (for some fixed positive number c) reaching c

ch (for some fixed positive number c) reaching c GSTF Journal of Matematics Statistics and Operations Researc (JMSOR) Vol. No. September 05 DOI 0.60/s4086-05-000-z Nonlinear Piecewise-defined Difference Equations wit Reciprocal and Cubic Terms Ramadan

More information

Basic Nonparametric Estimation Spring 2002

Basic Nonparametric Estimation Spring 2002 Basic Nonparametric Estimation Spring 2002 Te following topics are covered today: Basic Nonparametric Regression. Tere are four books tat you can find reference: Silverman986, Wand and Jones995, Hardle990,

More information

1 Calculus. 1.1 Gradients and the Derivative. Q f(x+h) f(x)

1 Calculus. 1.1 Gradients and the Derivative. Q f(x+h) f(x) Calculus. Gradients and te Derivative Q f(x+) δy P T δx R f(x) 0 x x+ Let P (x, f(x)) and Q(x+, f(x+)) denote two points on te curve of te function y = f(x) and let R denote te point of intersection of

More information

Function Composition and Chain Rules

Function Composition and Chain Rules Function Composition and s James K. Peterson Department of Biological Sciences and Department of Matematical Sciences Clemson University Marc 8, 2017 Outline 1 Function Composition and Continuity 2 Function

More information

A Locally Adaptive Transformation Method of Boundary Correction in Kernel Density Estimation

A Locally Adaptive Transformation Method of Boundary Correction in Kernel Density Estimation A Locally Adaptive Transformation Metod of Boundary Correction in Kernel Density Estimation R.J. Karunamuni a and T. Alberts b a Department of Matematical and Statistical Sciences University of Alberta,

More information

2.8 The Derivative as a Function

2.8 The Derivative as a Function .8 Te Derivative as a Function Typically, we can find te derivative of a function f at many points of its domain: Definition. Suppose tat f is a function wic is differentiable at every point of an open

More information

Quantum Numbers and Rules

Quantum Numbers and Rules OpenStax-CNX module: m42614 1 Quantum Numbers and Rules OpenStax College Tis work is produced by OpenStax-CNX and licensed under te Creative Commons Attribution License 3.0 Abstract Dene quantum number.

More information

Differentiation. Area of study Unit 2 Calculus

Differentiation. Area of study Unit 2 Calculus Differentiation 8VCE VCEco Area of stud Unit Calculus coverage In tis ca 8A 8B 8C 8D 8E 8F capter Introduction to limits Limits of discontinuous, rational and brid functions Differentiation using first

More information

Artificial Neural Network Model Based Estimation of Finite Population Total

Artificial Neural Network Model Based Estimation of Finite Population Total International Journal of Science and Researc (IJSR), India Online ISSN: 2319-7064 Artificial Neural Network Model Based Estimation of Finite Population Total Robert Kasisi 1, Romanus O. Odiambo 2, Antony

More information

New families of estimators and test statistics in log-linear models

New families of estimators and test statistics in log-linear models Journal of Multivariate Analysis 99 008 1590 1609 www.elsevier.com/locate/jmva ew families of estimators and test statistics in log-linear models irian Martín a,, Leandro Pardo b a Department of Statistics

More information

Bob Brown Math 251 Calculus 1 Chapter 3, Section 1 Completed 1 CCBC Dundalk

Bob Brown Math 251 Calculus 1 Chapter 3, Section 1 Completed 1 CCBC Dundalk Bob Brown Mat 251 Calculus 1 Capter 3, Section 1 Completed 1 Te Tangent Line Problem Te idea of a tangent line first arises in geometry in te context of a circle. But before we jump into a discussion of

More information

Click here to see an animation of the derivative

Click here to see an animation of the derivative Differentiation Massoud Malek Derivative Te concept of derivative is at te core of Calculus; It is a very powerful tool for understanding te beavior of matematical functions. It allows us to optimize functions,

More information

One-Sided Position-Dependent Smoothness-Increasing Accuracy-Conserving (SIAC) Filtering Over Uniform and Non-uniform Meshes

One-Sided Position-Dependent Smoothness-Increasing Accuracy-Conserving (SIAC) Filtering Over Uniform and Non-uniform Meshes DOI 10.1007/s10915-014-9946-6 One-Sided Position-Dependent Smootness-Increasing Accuracy-Conserving (SIAC) Filtering Over Uniform and Non-uniform Meses JenniferK.Ryan Xiaozou Li Robert M. Kirby Kees Vuik

More information

WYSE Academic Challenge 2004 Sectional Mathematics Solution Set
