Estimating joinpoints in continuous time scale for multiple change-point models

Computational Statistics & Data Analysis 51 (2007) 2420-2427
www.elsevier.com/locate/csda

Binbing Yu a,*, Michael J. Barrett a, Hyune-Ju Kim b, Eric J. Feuer c

a Information Management Services, Inc., 12501 Prosperity Dr., Suite 200, Silver Spring, MD 20904, USA
b Department of Mathematics, 215 Carnegie Building, Syracuse University, Syracuse, NY, USA
c Statistical Research and Applications Branch, National Cancer Institute, 6116 Executive Boulevard, Suite 504, Bethesda, MD, USA

Received 5 February 2006; received in revised form 2 July 2006; accepted 29 July 2006
Available online September 2006

* Corresponding author. E-mail addresses: yub@imsweb.com, whybb@yahoo.com (B. Yu).
See front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.csda...

Abstract

Joinpoint models have been applied to cancer incidence and mortality data with continuous change points. The current estimation method [Lerman, P.M., 1980. Fitting segmented regression models by grid search. Appl. Statist. 29, 77-84] assumes that the joinpoints occur only at discrete grid points. However, it is more realistic to let the joinpoints take any value within the observed data range. Hudson [1966. Fitting segmented curves whose join points have to be estimated. J. Amer. Statist. Assoc. 61] provides an algorithm to find the weighted least squares estimates of the joinpoint on the continuous scale. Hudson described the estimation procedure in detail for a model with only one joinpoint, but its extension to a multiple joinpoint model is not straightforward. In this article, we describe in detail Hudson's method for the multiple joinpoint model and discuss issues in the implementation. We compare the computational efficiencies of the LGS method and Hudson's method. The comparisons between the proposed estimation method and several alternative approaches, especially the Bayesian joinpoint models, are discussed. Hudson's method is implemented in C++ and applied to the colorectal cancer incidence data for men under age 65 from the SEER nine registries.
© 2006 Elsevier B.V. All rights reserved.

Keywords: Constrained least square; Cancer incidence and mortality; Joinpoint regression; SEER

1. Introduction

It is of great importance to describe the trend of cancer incidence and mortality data. The joinpoint regression model, which is composed of a few continuous linear phases, is often useful to describe changes in trend data. Suppose that for the observations $\{(x_1, y_1), \dots, (x_n, y_n)\}$, $x_1 \le \dots \le x_n$, the responses are $y_i = E(y \mid x_i) + e_i$, $i = 1, \dots, n$, with $E(e_i) = 0$ and $V(e_i) = \sigma_i^2$ for random errors $e_i$. The joinpoint regression models assume that, in each segment, $E(y \mid x)$ follows a linear model

\[ E(y \mid x) = \beta_{k,0} + \beta_{k,1} x, \quad \text{if } \tau_{k-1} < x \le \tau_k, \quad k = 1, \dots, K+1, \qquad (1) \]

where $\tau_0 = -\infty$, $\tau_{K+1} = \infty$ and $E(y \mid x)$ is continuous throughout $[x_1, x_n]$, such that

\[ \beta_{k,0} + \beta_{k,1} \tau_k = \beta_{k+1,0} + \beta_{k+1,1} \tau_k \quad \text{for } k = 1, \dots, K. \qquad (2) \]
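To make the role of constraint (2) concrete, the following minimal sketch evaluates $E(y \mid x)$ under model (1) from the first intercept, the $K+1$ segment slopes and the $K$ joinpoints; every later intercept is derived from the continuity condition. The function name `joinpoint_mean` and its argument layout are illustrative assumptions, not part of the Joinpoint software.

```python
from bisect import bisect_left


def joinpoint_mean(x, beta10, slopes, taus):
    """Evaluate E(y|x) for a K-joinpoint model with K = len(taus).

    beta10 : intercept of the first segment
    slopes : the K + 1 segment slopes beta_{k,1}
    taus   : the K joinpoints in increasing order
    """
    if len(slopes) != len(taus) + 1:
        raise ValueError("need exactly one more slope than joinpoints")
    intercepts = [beta10]
    for k, tau in enumerate(taus):
        # Continuity (2): b_{k,0} + b_{k,1} tau_k = b_{k+1,0} + b_{k+1,1} tau_k
        intercepts.append(intercepts[k] + (slopes[k] - slopes[k + 1]) * tau)
    # Segment k covers tau_{k-1} < x <= tau_k, so x == tau_k uses the left line.
    k = bisect_left(taus, x)
    return intercepts[k] + slopes[k] * x


# Hypothetical example: joinpoints at 2 and 5, slopes 1, -1 and 0.5.
taus = [2.0, 5.0]
slopes = [1.0, -1.0, 0.5]
values = [joinpoint_mean(x, 1.0, slopes, taus) for x in (0.0, 2.0, 3.0, 5.0, 6.0)]
print(values)  # [1.0, 3.0, 2.0, 0.0, 0.5]
```

Because the intercepts are derived rather than supplied, the fitted curve cannot jump at a joinpoint, which is exactly what (2) requires.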

As the response is continuous at the change points, we call model (1) the joinpoint model and the $\tau_k$'s joinpoints (JPs). This model is also called the segmented-line regression model or piecewise linear model (Kim et al., 2004). An alternative parameterization of the JP model (1) is

\[ E(y \mid x) = \beta_0 + \beta_1 x + \sum_{k=1}^{K} \delta_k (x - \tau_k)^{+}, \qquad (3) \]

where $\delta_k = \beta_{k+1,1} - \beta_{k,1}$ and $(x - \tau_k)^{+} = x - \tau_k$ if $x \ge \tau_k$ and 0 otherwise. This parameterization implicitly satisfies the continuity of $E(y \mid x)$ at $\tau_k$.

The current estimation method is the grid search (LGS) method proposed by Lerman (1980), which is implemented in the Joinpoint software developed by the US National Cancer Institute. Although the LGS method can be refined so that the JPs may occur at the midpoint or quarter points between two data points, the computation time for a finer grid increases dramatically. Hence, the LGS method is practical only when the JPs occur at the observed data points. Hudson (1966) described the continuous algorithm in detail for a one-JP model and discussed its extension to a model with more than two JPs, which is not straightforward. Our aims in this paper are to describe the details of the extension to a multiple JP model and to compare the computational efficiencies of these two fitting methods.

Several alternative methods have been proposed to estimate the locations of the change points for a single series in different contexts. For example, Quandt (1958) and Quandt and Ramsey (1978) proposed a procedure for estimating a single change point without a continuity constraint on the response in economics settings; Hinkley (1969, 1971) discussed estimation and inference for the joinpoints in one-joinpoint models; Smith (1975), Carlin et al. (1992), Slate and Turnbull (2000) and Tiwari et al. (2005) use Bayesian approaches to estimate the change points under different scenarios. Most of the available methods estimate a single change/join point. The method proposed in this paper estimates multiple joinpoints on a continuous scale, hence it provides a better fit.

The rest of the paper is organized as follows. The model formulation and notation are described in Section 2, and Hudson's method for a one-JP model is reviewed in Section 3. In Section 4, Hudson's method is extended to a multiple JP model and the issues arising in the implementation are discussed. Then the multiple JP model is applied to colorectal cancer incidence data for men under age 65 from the SEER nine registries. The relative merits of different approaches are discussed in the final section.

2. Model formulation and notation

Let the $k$th segment be denoted by $S_k = \{x_i : \tau_{k-1} < x_i \le \tau_k\} = \{x_{i_{k-1}+1}, \dots, x_{i_k}\}$, with $i_0 = 0$ and $i_{K+1} = n$. For each segment $S_k$, $k = 1, \dots, K+1$, we define

\[ Y_k = \begin{pmatrix} y_{i_{k-1}+1} \\ \vdots \\ y_{i_k} \end{pmatrix}, \quad X_k = \begin{pmatrix} 1 & x_{i_{k-1}+1} \\ \vdots & \vdots \\ 1 & x_{i_k} \end{pmatrix}, \quad \epsilon_k = \begin{pmatrix} e_{i_{k-1}+1} \\ \vdots \\ e_{i_k} \end{pmatrix}, \]

where $E(\epsilon_k) = 0$, $\mathrm{Cov}(\epsilon_k) = \Sigma_k$ and the weight matrix is $W_k = \Sigma_k^{-1}$. Let

\[ Y = \begin{pmatrix} Y_1 \\ \vdots \\ Y_{K+1} \end{pmatrix}, \quad X = \begin{pmatrix} X_1 & & 0 \\ & \ddots & \\ 0 & & X_{K+1} \end{pmatrix}, \quad \beta = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_{K+1} \end{pmatrix}, \quad \beta_k = \begin{pmatrix} \beta_{k,0} \\ \beta_{k,1} \end{pmatrix}, \quad \epsilon = \begin{pmatrix} \epsilon_1 \\ \vdots \\ \epsilon_{K+1} \end{pmatrix}. \]

Notice that $Y = (y_1, \dots, y_n)^{T}$ and $\epsilon = (e_1, \dots, e_n)^{T}$. Then the JP model (1) can be expressed as $Y = X\beta + \epsilon$, with constraints (2), where $E(\epsilon) = 0$ and $\mathrm{Cov}(\epsilon) = \Sigma$. Let $\tau = (\tau_1, \dots, \tau_K)$. To fit this model, we find the estimates $(\hat\tau, \hat\beta)$ of the JPs $\tau$ and the regression coefficients $\beta$ that minimize the weighted sum of squared errors (SSE)

\[ R(\tau, \beta) = (Y - X\beta)^{T} W (Y - X\beta), \]

where $W = \Sigma^{-1}$. When the $e_i$'s are independent, $\Sigma_k$, and hence $\Sigma$, are diagonal matrices. In particular, if $V(e_i) = \sigma^2 / w_i$, then $\Sigma = \sigma^2\,\mathrm{Diag}(w_1^{-1}, \dots, w_n^{-1})$, and $R(\tau, \beta)$ simplifies to

\[ \sum_{k=1}^{K+1} (Y_k - X_k\beta_k)^{T} W_k (Y_k - X_k\beta_k) = \sigma^{-2} \sum_{k=1}^{K+1} \sum_{x_i \in S_k} w_i \left[ y_i - (\beta_{k,0} + \beta_{k,1} x_i) \right]^2 . \]

In general, however, $\Sigma_k$ and $\Sigma$ are non-diagonal matrices. For example, for the AR(1) model, the $ij$th element of $\Sigma$ is $\sigma_{ij} = \mathrm{Cov}(e_i, e_j) = \sigma^2 \phi^{|i-j|} / (1 - \phi^2)$, $0 < \phi < 1$.

3. Review of Hudson's method: one JP $\tau_1$

In this section, we first summarize Hudson's algorithm for the 1-JP model. The procedure to estimate $\hat\tau$ is as follows:

(a) For the partition $[x_1, x_i]$, $[x_{i+1}, x_n]$, $2 \le i \le n-2$, fit the least squares (LS) regression for each segment. Let

\[ Y_1 = \begin{pmatrix} y_1 \\ \vdots \\ y_i \end{pmatrix}, \quad Y_2 = \begin{pmatrix} y_{i+1} \\ \vdots \\ y_n \end{pmatrix}, \quad X_1 = \begin{pmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_i \end{pmatrix}, \quad X_2 = \begin{pmatrix} 1 & x_{i+1} \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix}. \]

The unconstrained weighted LS estimates are

\[ \tilde\beta_k = (\tilde\beta_{k,0}, \tilde\beta_{k,1})^{T} = (X_k^{T} W_k X_k)^{-1} X_k^{T} W_k Y_k, \quad k = 1, 2. \qquad (4) \]

(b) Let $\hat\tau^{(i)}$ be the solution to the equation $\tilde\beta_{1,0} + \tilde\beta_{1,1}\tau = \tilde\beta_{2,0} + \tilde\beta_{2,1}\tau$. If $x_i \le \hat\tau^{(i)} < x_{i+1}$, then $\hat\tau^{(i)}$ is said to be in the right place. That means the two unconstrained regression lines cross between the two observations $x_i$ and $x_{i+1}$.

(b1) If $\hat\tau^{(i)}$ is in the right place, then let $R(i) = \rho_1 + \rho_2$, where $\rho_1$ and $\rho_2$ are the unconstrained SSEs for the two segments.

(b2) Otherwise, $\hat\tau^{(i)} \notin [x_i, x_{i+1})$, i.e., the two unconstrained regression lines cross outside the interval $[x_i, x_{i+1})$. Then we need to adjust the unconstrained LS estimates $\tilde\beta = (\tilde\beta_1^{T}, \tilde\beta_2^{T})^{T}$ to the constrained estimates $\hat\beta = (\hat\beta_1^{T}, \hat\beta_2^{T})^{T}$ and set $\hat\tau^{(i)} = x_i$ (Hudson, 1966, Appendix 2). The linear constraint is $A\hat\beta = 0$, where $A = (1, x_i, -1, -x_i)$ and $\hat\beta = (\hat\beta_{1,0}, \hat\beta_{1,1}, \hat\beta_{2,0}, \hat\beta_{2,1})^{T}$. Let

\[ C = X^{T} W X = \begin{pmatrix} X_1^{T} W_1 X_1 & 0 \\ 0 & X_2^{T} W_2 X_2 \end{pmatrix}. \]

Using the method of Lagrange multipliers, the constrained LS estimate is

\[ \hat\beta = \tilde\beta - C^{-1} A^{T} \left[ A C^{-1} A^{T} \right]^{-1} A \tilde\beta. \qquad (5) \]

For the 1-JP model, $t = A C^{-1} A^{T}$ and $s = A\tilde\beta$ are scalars. Hence, $\hat\beta = \tilde\beta - (s/t)\, C^{-1} A^{T}$, and the adjusted SSE is given by $R(i) = \rho_1 + \rho_2 + s^2 / t$.

(c) Repeat (a)-(b) for all $i$ and choose the $\hat\tau^{(i)}$ that minimizes $R(i)$, that is, $\hat\tau = \arg\min_{\hat\tau^{(i)}} R(i)$.

However, we do not need to make adjustment (b2) for all $i$, since some cases can be ruled out (Hudson, 1966). As the adjustment with the linear constraint in (b2) always increases the SSE, Hudson (1966) proved that (a) we need not try $\hat\tau^{(i)} = x_i$ or $\hat\tau^{(i)} = x_{i+1}$ if $\hat\tau^{(i)}$ is in the right place; (b) even if $\hat\tau^{(i)}$ is not in the right place, no further adjustment is necessary if the unconstrained SSE $\rho_1 + \rho_2$ is already larger than some previously obtained SSE.
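Steps (a)-(c) can be sketched in a few lines of code. The version below is a minimal illustration under stated assumptions: errors are independent, so the weights are a plain list (all equal to 1 by default); only the left adjustment $\hat\tau^{(i)} = x_i$ of step (b2) is tried; and Hudson's shortcut rules for skipping adjustments are omitted. The names `fit_line` and `hudson_one_jp` are hypothetical and are not taken from the authors' C++ implementation.

```python
def fit_line(xs, ys, ws):
    """Weighted LS line for one segment: (b0, b1, SSE, inverse of X'WX)."""
    sw = sum(ws)
    sx = sum(w * x for w, x in zip(ws, xs))
    sxx = sum(w * x * x for w, x in zip(ws, xs))
    sy = sum(w * y for w, y in zip(ws, ys))
    sxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    det = sw * sxx - sx * sx  # positive when the segment has two distinct x values
    b0 = (sxx * sy - sx * sxy) / det
    b1 = (sw * sxy - sx * sy) / det
    sse = sum(w * (y - b0 - b1 * x) ** 2 for w, x, y in zip(ws, xs, ys))
    minv = ((sxx / det, -sx / det), (-sx / det, sw / det))
    return b0, b1, sse, minv


def hudson_one_jp(xs, ys, ws=None):
    """Return (R, tau_hat, (b10, b11, b20, b21)) with the smallest SSE R."""
    n = len(xs)
    ws = [1.0] * n if ws is None else ws
    best = None
    for i in range(2, n - 1):  # first segment holds x_1..x_i, 2 <= i <= n-2
        b10, b11, rho1, m1 = fit_line(xs[:i], ys[:i], ws[:i])
        b20, b21, rho2, m2 = fit_line(xs[i:], ys[i:], ws[i:])
        lo, hi = xs[i - 1], xs[i]
        tau = (b20 - b10) / (b11 - b21) if b11 != b21 else None
        if tau is not None and lo <= tau < hi:
            # (b1): the two lines cross in the right place
            cand = (rho1 + rho2, tau, (b10, b11, b20, b21))
        else:
            # (b2): force continuity at tau = x_i with A = (1, x_i, -1, -x_i)
            u1 = tuple(m1[j][0] + m1[j][1] * lo for j in range(2))  # (X1'W1X1)^-1 a
            u2 = tuple(m2[j][0] + m2[j][1] * lo for j in range(2))  # (X2'W2X2)^-1 a
            t = (u1[0] + lo * u1[1]) + (u2[0] + lo * u2[1])          # A C^-1 A'
            s = (b10 + b11 * lo) - (b20 + b21 * lo)                  # A beta_tilde
            ratio = s / t
            beta = (b10 - ratio * u1[0], b11 - ratio * u1[1],
                    b20 + ratio * u2[0], b21 + ratio * u2[1])
            cand = (rho1 + rho2 + s * s / t, lo, beta)
        if best is None or cand[0] < best[0]:
            best = cand
    return best


# Hypothetical noise-free data whose two lines meet at x = 4.5.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.0, 2.0, 3.0, 4.0, 4.0, 3.0, 2.0, 1.0]
R, tau_hat, beta_hat = hudson_one_jp(xs, ys)
print(R, tau_hat, beta_hat)
```

In this example the joinpoint falls strictly between two observations, which a grid search restricted to the data points could not recover.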

4. Estimation of multiple JP model in continuous scale

For a K-JP model, there are $K+1$ segments, $S_1, \dots, S_{K+1}$, and $K$ JPs. The $k$th JP $\tau_k \in [x_{i_k}, x_{i_k+1})$ divides segments $S_k$ and $S_{k+1}$. Recall from (4) that the unconstrained LS estimates that minimize $R(\tau, \beta)$ are

\[ \tilde\beta = (\tilde\beta_1^{T}, \dots, \tilde\beta_{K+1}^{T})^{T} = (X^{T} W X)^{-1} X^{T} W Y. \qquad (6) \]

When the $e_i$, $i = 1, \dots, n$, are independent, $W$ is block diagonal and

\[ (X^{T} W X)^{-1} = \begin{pmatrix} (X_1^{T} W_1 X_1)^{-1} & & 0 \\ & \ddots & \\ 0 & & (X_{K+1}^{T} W_{K+1} X_{K+1})^{-1} \end{pmatrix}, \quad X^{T} W Y = \begin{pmatrix} X_1^{T} W_1 Y_1 \\ \vdots \\ X_{K+1}^{T} W_{K+1} Y_{K+1} \end{pmatrix}, \]

and $\tilde\beta_k = (X_k^{T} W_k X_k)^{-1} X_k^{T} W_k Y_k$. The $k$th JP $\hat\tau_k$ is obtained by solving the equation $\tilde\beta_{k,0} + \tilde\beta_{k,1}\tau_k = \tilde\beta_{k+1,0} + \tilde\beta_{k+1,1}\tau_k$. Let $T_k$ denote the location of $\hat\tau_k$. If the estimated JP $\hat\tau_k$ is in the right place, i.e., $x_{i_k} < \hat\tau_k < x_{i_k+1}$, then $T_k = 1$; otherwise, $T_k = 2$ and further adjustment is needed.

In the ideal situation, all $\hat\tau_k$'s from the unconstrained LS regression are in the right places, i.e., $\hat\tau_k \in (x_{i_k}, x_{i_k+1})$ for $k = 1, \dots, K$; then no adjustment is needed and $(\hat\tau_1, \dots, \hat\tau_K)$ are the final estimates of the JPs. Otherwise, some $\hat\tau_k$'s need to be adjusted to $x_{i_k}$ and the LS estimate $\tilde\beta$ needs to be adjusted subject to the continuity constraints. Let

\[ Q_{K \times (2K+2)} = \begin{pmatrix} 1 & x_{i_1} & -1 & -x_{i_1} & 0 & \cdots & & 0 \\ & & \ddots & & & \ddots & & \\ 0 & \cdots & & 0 & 1 & x_{i_K} & -1 & -x_{i_K} \end{pmatrix}, \]

where row $k$ has its nonzero entries $(1, x_{i_k}, -1, -x_{i_k})$ in the columns corresponding to $\beta_k$ and $\beta_{k+1}$. The continuity constraint at $\hat\tau_k = x_{i_k}$ implies that $Q(k, \cdot)\beta = 0$, where $Q(k, \cdot)$ is the $k$th row of $Q$. Let $A$ be the constraint matrix such that $A\beta = 0$. For example, if $\tau_k$ and $\tau_l$ are adjusted to $x_{i_k}$ and $x_{i_l}$, respectively, then rows $k$ and $l$ of $Q$ are added to the matrix $A$. Then the estimate of $\beta$ under the constraint $A\beta = 0$ (Plackett, 1960, p. 53) is

\[ \hat\beta = \tilde\beta - (X^{T} W X)^{-1} A^{T} \left[ A (X^{T} W X)^{-1} A^{T} \right]^{-1} A \tilde\beta, \]

and the corresponding SSE is

\[ R(\hat\tau, \hat\beta) = R(\tilde\tau, \tilde\beta) + (A\tilde\beta)^{T} \left[ A (X^{T} W X)^{-1} A^{T} \right]^{-1} A\tilde\beta. \]

In the rest of this section, we discuss several issues arising in the implementation of Hudson's method for multiple JP regression. Given the estimates of the joinpoints $(\hat\tau_1, \dots, \hat\tau_K)$, the covariance matrix of the regression coefficients $\hat\beta$ and the confidence intervals for the JPs can be calculated as described by Lerman (1980).

4.1. Comparison of the computations between the LGS method and Hudson's method

To find the global minimum of the SSE using the LGS method, the number of necessary trials (LS calculations) is $\binom{G}{K}$, where $G$ is the number of grid points. When only the data points are used as the grid, $G = n$, and when the midpoints are inserted as grid points, $G = 2n$. When Hudson's method is used, the locations of the joinpoints need to be taken into consideration: for each partition, all possible adjustments may have to be tried. The number of possible trials for a K-JP model is given by (Hudson, 1966)

\[ \sum_{r=0}^{K} 2^{r} \binom{K}{r} \binom{n-K-2}{r}. \]
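The two trial counts are easy to tabulate. The sketch below assumes that the grid-search count is $\binom{G}{K}$ and that the Hudson count has the form $\sum_{r=0}^{K} 2^{r}\binom{K}{r}\binom{n-K-2}{r}$; it ignores any restrictions on JP locations, and the function names are illustrative only.

```python
from math import comb


def grid_trials(G, K):
    """Number of LS fits for a grid search over G grid points and K joinpoints."""
    return comb(G, K)


def hudson_trials(n, K):
    """Assumed form of the Hudson count; requires n >= K + 2."""
    return sum(2 ** r * comb(K, r) * comb(n - K - 2, r) for r in range(K + 1))


n = 30
for K in range(1, 5):
    # columns: K, Hudson, grid on data points (G = n), grid with midpoints (G = 2n)
    print(K, hudson_trials(n, K), grid_trials(n, K), grid_trials(2 * n, K))
```

Under these assumptions the counts grow quickly with $K$, which is why the cost of each individual trial matters once a permutation test multiplies the number of fits.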

Table 1
Maximum number of trials for a K-JP model when n = 30
K
Hudson's method: ,5 24,60
Grid-search using only data points: ,920
Grid-search by inserting one midpoint: , ,845

Typically, in the analysis of cancer incidence and mortality data, $n \le 30$ and the upper limit for the number of JPs is 4. Table 1 shows the maximum number of trials for Hudson's method, the LGS method using only data points, and the LGS method with one midpoint inserted between two data points. The LGS method using only data points takes the smallest number of trials. The number of trials for Hudson's method is less than that for the LGS method with midpoints inserted. In practice, no adjustment is needed for the LGS method, so each trial of Hudson's method takes longer than a trial of the LGS method.

The Joinpoint software uses a permutation test to select the optimal JP model (Kim et al., 2000). The permutation test procedure sequentially tests the null hypothesis that there are $k_0$ JPs against the alternative of $k_1$ JPs until a conclusion is reached, where $0 \le k_0 < k_1 \le 3$. At each level of testing, the models with $k_0$ and $k_1$ joinpoints are fitted to each of the $N$ permuted data sets, and $N$ is usually large in order to generate the permutation distribution of the test statistic. When fitting a JP model without model selection, the difference between the LGS method on data points and Hudson's method is not noticeable; both finish in a few seconds. When the permutation test is used for model selection, the LGS method using data points as the grid is the fastest, and Hudson's method takes substantially longer, about the time of the LGS method with three grid points inserted between consecutive data points. This is because of the extra comparisons needed to check the joinpoint locations, which are discussed further below. However, the time of the LGS method with nine grid points inserted is daunting. One major advantage of Hudson's method is that the location of a JP is continuous, so it provides a better model fit than the LGS method does.

4.2. Implementation of a multiple JP model

If a JP $\hat\tau_k$ is not in the right location, i.e., $\hat\tau_k \notin [x_{i_k}, x_{i_k+1})$, it could be adjusted to either $x_{i_k}$ or $x_{i_k+1}$. If two adjacent JPs $\hat\tau_k$ and $\hat\tau_{k+1}$ are not in the right locations, adjusting one JP could automatically move the adjacent JP to the right location. Hence, if $L$ JPs are not in the right locations, the maximum number of possible adjustments would be

\[ \binom{L}{1} 2 + \binom{L}{2} 2^2 + \cdots + \binom{L}{L} 2^L = 3^L - 1. \]

To speed up Hudson's method, only the left adjustment is necessary, i.e., $\hat\tau_k$ only needs to be adjusted to $x_{i_k}$. The approximate number of adjustments is then

\[ \binom{L}{1} + \binom{L}{2} + \cdots + \binom{L}{L} = 2^L - 1, \]

which is substantially less than $3^L - 1$. The exceptional cases are $i_{k+1} - i_k = 4$, when $\hat\tau_k$ needs to be adjusted to both $x_{i_k}$ and $x_{i_k+1}$, and $i_k = n - 3$, when $\hat\tau_k$ needs to be adjusted to both $x_{n-3}$ and $x_{n-2}$.

Let $P$ be one of the partitions and $R_{\min}$ be the current minimum SSE. The initial value is $R_{\min} = \infty$. The steps to find the estimates of $(\tau, \beta)$ for a K-JP model are as follows:

(1) For the partition $P$, find the unconstrained LS estimates $\tilde\beta_k$ for each segment $S_k$. Calculate the total unconstrained SSE $R_P^{(0)} = \sum_{k=1}^{K+1} \rho_k$, where $\rho_k$ is the unconstrained SSE for the $k$th segment. If $R_P^{(0)} \ge R_{\min}$, then stop and try another partition; otherwise, go to step 2.

(2) Calculate the JPs $\hat\tau_k$ of the regression lines from $S_k$ and $S_{k+1}$, $k = 1, \dots, K$.
(a) If all $\hat\tau_k$'s are in the right places, then update $R_{\min} = R_P^{(0)}$ and go back to step 1.
(b) If some $\hat\tau_k$'s are not in the right places, we need to adjust those $\hat\tau_k$ to $x_{i_k}$. Let $A_k = 1$ (0) indicate that the JP $\tau_k$ needs adjustment (or not). For example, for a model with three JPs $(\tau_1, \tau_2, \tau_3)$, the possible adjustments $(A_1, A_2, A_3)$ are $(1, 0, 0)$, $(0, 1, 0)$, $(0, 0, 1)$, $(1, 1, 0)$, $(1, 0, 1)$, $(0, 1, 1)$, $(1, 1, 1)$.
(i) For each adjustment, check whether the JPs after adjustment are all in the right places, i.e., $\tau_k \in [x_{i_k}, x_{i_k+1})$. If they are, calculate the adjusted SSE $R_P^{(m)}$, $m = 1, \dots, M$; otherwise, set $R_P^{(m)} = \infty$.
(ii) If $\min(R_P^{(1)}, \dots, R_P^{(M)}) \le R_{\min}$, then update $R_{\min}$; then go to step 1 with the next partition.

(3) Try all possible partitions; the global minimum SSE is then $R_{\min}$, and the corresponding estimates of $(\tau, \beta)$ are the final estimates.
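The adjustment patterns $(A_1, \dots, A_K)$ tried in step 2(b) can be enumerated mechanically. The sketch below is an illustration only: it lists the left-adjustment patterns for the JPs flagged as out of place and evaluates the two-sided count $\sum_{l=1}^{L} \binom{L}{l} 2^{l}$ for comparison. The helper names `left_adjustments` and `count_two_sided` are hypothetical.

```python
from itertools import product
from math import comb


def left_adjustments(out_of_place):
    """Yield every nonzero pattern (A_1, ..., A_K) with A_k in {0, 1}.

    out_of_place[k] is True when the k-th JP is not in the right place;
    JPs that are already in the right place are never adjusted (A_k = 0).
    """
    idx = [k for k, bad in enumerate(out_of_place) if bad]
    for bits in product((0, 1), repeat=len(idx)):
        if any(bits):
            pattern = [0] * len(out_of_place)
            for k, b in zip(idx, bits):
                pattern[k] = b
            yield tuple(pattern)


def count_two_sided(L):
    """Number of patterns when each adjusted JP may move left or right."""
    return sum(comb(L, l) * 2 ** l for l in range(1, L + 1))


patterns = list(left_adjustments([True, True, True]))
print(len(patterns), patterns)   # the 2^3 - 1 = 7 left-adjustment patterns
print(count_two_sided(3))        # 3^3 - 1 = 26 two-sided patterns
```

For three out-of-place JPs this yields the seven patterns listed in step 2(b), against 26 when right adjustments are also allowed.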

4.3. Restrictions on the JP locations

Although the JPs can occur anywhere within the range of the observed data, some restrictions apply. For example, two JPs may not be too close to each other, and a JP may not occur too early or too late. The default options in the current Joinpoint software require that, counting data points that are themselves JPs, the minimum number of data points between two JPs is 4 and the minimum number of data points from a JP to either end of the data is 3. These restrictions are necessary to calculate the standard errors of the regression coefficients. Suppose that the estimated JPs satisfy $\hat\tau_k \in [x_{i_k}, x_{i_k+1})$, $k = 1, \dots, K$. To include at least three data points between a JP and either end, the estimates should satisfy $x_3 \le \hat\tau_1$ and $\hat\tau_K \le x_{n-3}$. To contain at least four data points between $\hat\tau_k$ and $\hat\tau_{k+1}$, we need $i_{k+1} - i_k \ge 4$ if $\hat\tau_k \in (x_{i_k}, x_{i_k+1})$ and $i_{k+1} - i_k \ge 3$ if $\hat\tau_k = x_{i_k}$.

5. Application

In the Annual Report to the Nation on the Status of Cancer, jointly released by the National Cancer Institute (NCI), the American Cancer Society (ACS), the North American Association of Central Cancer Registries (NAACCR), and the Centers for Disease Control and Prevention (CDC), including the National Center for Health Statistics (NCHS), the rates of new cancer cases and deaths were reported for all cancers combined as well as for most of the top 10 cancer sites. Joinpoint regression models were used to analyze the changing trends of cancer incidence and mortality rates over successive segments of time, and to estimate the amount of increase or decrease within each time period. The report includes a special section on colorectal cancer, which has the third highest incidence of any cancer site. Using JP regression with an annual grid, the report shows that overall incidence increased until 1985 and then began decreasing steadily at an average rate of 1.6% per year. In this application, we consider the colorectal cancer
incidence rates from 1976 to 1999 for males under age 65 and compare the JP model estimated using the LGS method with the one estimated using Hudson's method. The data were extracted from the nine cancer registries in the National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER) program, which covers approximately 10% of the US population. The response variable for the JP analysis, $y$, is the natural logarithm of the age-adjusted colorectal cancer incidence rate per 100,000 people. The range of the possible number of JPs is set at the default for the Joinpoint software, with minimum 0 and maximum 3, as most cancer trend data have up to three joinpoints. The trend of the incidence rates is represented by the annual percent change (APC), where for the $k$th segment the APC is $(\exp(\beta_{k,1}) - 1) \times 100\%$. The permutation test procedure (Kim et al., 2000) was used to select the best model among the zero- to three-JP models.

When the default restrictions are used, i.e., the middle segment should have at least four data points, both methods choose a zero-JP model as the best model. The APC is -0.41% with CI (-0.64%, -0.17%), indicating that the colorectal cancer incidence rates decreased slowly but significantly from 1976 to 1999.

To allow more rapid changes, we decrease the number of data points in the middle segment from 4 to 2. The estimates and confidence intervals (CIs) of the JPs and the sum of squared errors (SSE) from both methods are shown in Table 2. From Table 2, we see that the SSEs for the zero- and one-JP models are identical for both methods. The SSEs for the two-JP model from both methods are very close. However, the SSE for the three-JP model from Hudson's method is much smaller than that from the LGS method. As we see from Fig. 1, the three-JP model from the LGS method only allows the JPs at the data points. Also, Hudson's method yields narrower CIs for the JPs, especially for $\tau_2$ under the three-JP model. The improvement in SSE and CI arises because the JPs are not restricted to the observed data points in Hudson's method.

Table 2
SSE and the estimated JPs with 95% CI from both methods

No. of JPs   Grid-search method: JP (95% CI)     Hudson's method: SSE   Hudson's method: JP (95% CI)
1            τ1 = 1985 (1982, 1988)              4743                   τ1 = 1985.0 (1982.6, 1987.8)
2            τ1 = 1986 (1978, 1991)              3634                   τ1 = 1986.0 (1977.1, 1990.8)
             τ2 = 1987 (1982, 1997)                                     τ2 = 1987.1 (1983.0, 1997.9)
3            τ1 = 1983 (1978, 1987)              845                    τ1 = 1983.2 (1977.3, 1984.0)
             τ2 = 1985 (1983, 1996)                                     τ2 = 1985.5 (1984.0, 1986.4)
             τ3 = 1988 (1985, 1997)                                     τ3 = 1987.1 (1986.1, 1995.8)

Fig. 1. Plot of final models selected by grid-search and Hudson's methods.

When the permutation test procedure is used to select the optimal model, the final model using the grid search method is still a zero-JP model, whereas a three-JP model is selected when Hudson's method is used. The plots of the final models from both methods are shown in Fig. 1. From the three-JP model selected by Hudson's method, we see a spike at 1985.5. Starting from 1983.2, the incidence rate increased dramatically until 1985.5, then decreased sharply until 1987.1. This spike in incidence rates might be due to the "presidential effect" (Brown and Potosky, 1990). Brown and Potosky examined the public health impact of mass media coverage of President Reagan's colon cancer episode of 1985. They found a sharp but somewhat transitory increase in public interest following the diagnosis of the President's colon cancer, with a corresponding increase in early detection tests. Their analysis of the incidence data showed an increase in early stage colorectal cancers in the months following the President's diagnosis and a decrease in advanced disease, suggestive of a screening effect. The new trend represented by the three-JP model may shed light on the use and usefulness of colorectal cancer screening. Although we do not know the true underlying model for the colorectal cancer incidence rates, Hudson's method is more sensitive and has more power to discover the new trend, which is missed by the grid-search method. Furthermore, Hudson's method always provides an SSE no larger than that of the grid search, hence more
accurate estimates of the regression coefficients and APCs. The comparisons between the LGS method and Hudson's method regarding their effects on inference for the regression parameters are addressed in detail in a companion paper (Kim et al., 2006).

6. Discussion

In this paper, we discuss the computational details of estimating multiple joinpoints on a continuous time scale, and compare the computational efficiencies of the two fitting methods, Hudson's method and Lerman's grid search method. In summary, Hudson's method takes longer than the basic grid search in which only the data points serve as grid points, but it is more efficient than a grid search with more than four points inserted between consecutive data points. To illustrate other advantages of Hudson's method, we applied both methods to male colorectal cancer incidence rate data and found that Hudson's method provides estimates with smaller biases. Because Hudson's method does not restrict the JPs to the data points, it provides a better fit than the grid search method. This enables us to describe cancer incidence and mortality trends more accurately. The extension of Hudson's method is able to fit multiple JP regression models, and fitting a K-JP model with it is faster than the fine grid search with one midpoint. The extended Hudson's method is currently implemented in the Joinpoint software developed by the National Cancer Institute.

Several other approaches have been proposed for multiple change/join point problems for a single time series (Smith, 1975; Carlin et al., 1992). The Bayesian approach with Markov chain Monte Carlo (MCMC) is becoming more popular as computers become more powerful. Using data on prostate specific antigen serial markers for prostate cancer, Slate and Turnbull (2000) compare two joinpoint models, i.e., the fully Bayesian hierarchical change point model and the latent disease process model. Tiwari et al. (2005) compare different model selection procedures in Bayesian joinpoint models. The Bayesian method with MCMC usually takes more time to fit a multiple joinpoint model, depending on the number of MCMC runs. However, the Bayesian approach is able to produce posterior distributions of the parameters, particularly the posterior distribution of the number and the locations of the joinpoints. As Tiwari et al. (2005) pointed out, the Bayesian method is a useful companion to the frequentist methods, since the posterior distribution of the number and location of the joinpoints gives additional insight for comparing different joinpoint models. Hence, it is of interest to study how these different estimation methods perform under different situations.

References

Brown, M.L., Potosky, A.L., 1990. The presidential effect: the public health response to media coverage about Ronald Reagan's colon cancer episode. The Public Opinion Quarterly 54.
Carlin, B.P., Gelfand, A.E., Smith, A.F.M., 1992. Hierarchical Bayesian analysis of change point problems. Appl. Statist. 41.
Hinkley, D.V., 1969. Inference about the intersection in two-phase regression. Biometrika 56.
Hinkley, D.V., 1971. Inference in two-phase regression. J. Amer. Statist. Assoc. 66.
Hudson, D.J., 1966. Fitting segmented curves whose join points have to be estimated. J. Amer. Statist. Assoc. 61.
Kim, H.-J., Fay, M.P., Feuer, E.J., Midthune, D.N., 2000. Permutation tests for joinpoint regression with applications to cancer rates. Statist. Medicine 19.
Kim, H.-J., Fay, M.P., Yu, B., Barrett, M.J., Feuer, E.J., 2004. Comparability of segmented line regression models. Biometrics 60.
Kim, H.-J., Yu, B., Feuer, E.J., 2006. Inference in segmented line regression: a simulation study. Comput. Statist. Data Anal., under revision.
Lerman, P.M., 1980. Fitting segmented regression models by grid search. Appl. Statist. 29.
Plackett, R.L., 1960. Principles of Regression Analysis. Clarendon Press, Oxford.
Quandt, R.E., 1958. The estimation of the parameters of a linear regression system obeying two separate regimes. J. Amer. Statist. Assoc. 53.
Quandt, R.E., Ramsey, J.N., 1978. Estimating mixtures of normal distributions and switching regressions. J. Amer. Statist. Assoc. 73 (with discussion).
Slate, E.H., Turnbull, B.W., 2000. Statistical models for longitudinal biomarkers of disease onset. Statist. Medicine 19.
Smith, A.F.M., 1975. A Bayesian approach to inference about a change point in a sequence of random variables. Biometrika 62.
Tiwari, R., Cronin, K., Davis, W., Feuer, E.J., Yu, B., Chib, S., 2005. Bayesian model selection for join point regression with application to age-adjusted cancer rates. Appl. Statist. 54.


More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression ST 430/514 Recall: A regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates)

More information

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept,

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, Linear Regression In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, y = Xβ + ɛ, where y t = (y 1,..., y n ) is the column vector of target values,

More information

Properties of the least squares estimates

Properties of the least squares estimates Properties of the least squares estimates 2019-01-18 Warmup Let a and b be scalar constants, and X be a scalar random variable. Fill in the blanks E ax + b) = Var ax + b) = Goal Recall that the least squares

More information

Bayesian Linear Regression

Bayesian Linear Regression Bayesian Linear Regression Sudipto Banerjee 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. September 15, 2010 1 Linear regression models: a Bayesian perspective

More information

Cross-sectional space-time modeling using ARNN(p, n) processes

Cross-sectional space-time modeling using ARNN(p, n) processes Cross-sectional space-time modeling using ARNN(p, n) processes W. Polasek K. Kakamu September, 006 Abstract We suggest a new class of cross-sectional space-time models based on local AR models and nearest

More information

Final Review. Yang Feng. Yang Feng (Columbia University) Final Review 1 / 58

Final Review. Yang Feng.   Yang Feng (Columbia University) Final Review 1 / 58 Final Review Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Final Review 1 / 58 Outline 1 Multiple Linear Regression (Estimation, Inference) 2 Special Topics for Multiple

More information

Adaptive Prediction of Event Times in Clinical Trials

Adaptive Prediction of Event Times in Clinical Trials Adaptive Prediction of Event Times in Clinical Trials Yu Lan Southern Methodist University Advisor: Daniel F. Heitjan May 8, 2017 Yu Lan (SMU) May 8, 2017 1 / 19 Clinical Trial Prediction Event-based trials:

More information

Intrinsic products and factorizations of matrices

Intrinsic products and factorizations of matrices Available online at www.sciencedirect.com Linear Algebra and its Applications 428 (2008) 5 3 www.elsevier.com/locate/laa Intrinsic products and factorizations of matrices Miroslav Fiedler Academy of Sciences

More information

Bayesian sensitivity analysis of a cardiac cell model using a Gaussian process emulator Supporting information

Bayesian sensitivity analysis of a cardiac cell model using a Gaussian process emulator Supporting information Bayesian sensitivity analysis of a cardiac cell model using a Gaussian process emulator Supporting information E T Y Chang 1,2, M Strong 3 R H Clayton 1,2, 1 Insigneo Institute for in-silico Medicine,

More information

Peter Hoff Linear and multilinear models April 3, GLS for multivariate regression 5. 3 Covariance estimation for the GLM 8

Peter Hoff Linear and multilinear models April 3, GLS for multivariate regression 5. 3 Covariance estimation for the GLM 8 Contents 1 Linear model 1 2 GLS for multivariate regression 5 3 Covariance estimation for the GLM 8 4 Testing the GLH 11 A reference for some of this material can be found somewhere. 1 Linear model Recall

More information

Web-based Supplementary Materials for A Robust Method for Estimating. Optimal Treatment Regimes

Web-based Supplementary Materials for A Robust Method for Estimating. Optimal Treatment Regimes Biometrics 000, 000 000 DOI: 000 000 0000 Web-based Supplementary Materials for A Robust Method for Estimating Optimal Treatment Regimes Baqun Zhang, Anastasios A. Tsiatis, Eric B. Laber, and Marie Davidian

More information

Inverse Sampling for McNemar s Test

Inverse Sampling for McNemar s Test International Journal of Statistics and Probability; Vol. 6, No. 1; January 27 ISSN 1927-7032 E-ISSN 1927-7040 Published by Canadian Center of Science and Education Inverse Sampling for McNemar s Test

More information

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 Lecture 2: Linear Models Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

Probabilistic Graphical Models

Probabilistic Graphical Models Probabilistic Graphical Models Brown University CSCI 2950-P, Spring 2013 Prof. Erik Sudderth Lecture 13: Learning in Gaussian Graphical Models, Non-Gaussian Inference, Monte Carlo Methods Some figures

More information

1 EM algorithm: updating the mixing proportions {π k } ik are the posterior probabilities at the qth iteration of EM.

1 EM algorithm: updating the mixing proportions {π k } ik are the posterior probabilities at the qth iteration of EM. Université du Sud Toulon - Var Master Informatique Probabilistic Learning and Data Analysis TD: Model-based clustering by Faicel CHAMROUKHI Solution The aim of this practical wor is to show how the Classification

More information

Principles of Bayesian Inference

Principles of Bayesian Inference Principles of Bayesian Inference Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department of Forestry & Department

More information

Restricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model

Restricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model Restricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model Xiuming Zhang zhangxiuming@u.nus.edu A*STAR-NUS Clinical Imaging Research Center October, 015 Summary This report derives

More information

Hierarchical Modeling and Analysis for Spatial Data

Hierarchical Modeling and Analysis for Spatial Data Hierarchical Modeling and Analysis for Spatial Data Bradley P. Carlin, Sudipto Banerjee, and Alan E. Gelfand brad@biostat.umn.edu, sudiptob@biostat.umn.edu, and alan@stat.duke.edu University of Minnesota

More information

Bayesian Inference. Chapter 4: Regression and Hierarchical Models

Bayesian Inference. Chapter 4: Regression and Hierarchical Models Bayesian Inference Chapter 4: Regression and Hierarchical Models Conchi Ausín and Mike Wiper Department of Statistics Universidad Carlos III de Madrid Advanced Statistics and Data Mining Summer School

More information

Appendix: Modeling Approach

Appendix: Modeling Approach AFFECTIVE PRIMACY IN INTRAORGANIZATIONAL TASK NETWORKS Appendix: Modeling Approach There is now a significant and developing literature on Bayesian methods in social network analysis. See, for instance,

More information

Practical Bayesian Quantile Regression. Keming Yu University of Plymouth, UK

Practical Bayesian Quantile Regression. Keming Yu University of Plymouth, UK Practical Bayesian Quantile Regression Keming Yu University of Plymouth, UK (kyu@plymouth.ac.uk) A brief summary of some recent work of us (Keming Yu, Rana Moyeed and Julian Stander). Summary We develops

More information

Linear models and their mathematical foundations: Simple linear regression

Linear models and their mathematical foundations: Simple linear regression Linear models and their mathematical foundations: Simple linear regression Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/21 Introduction

More information

A Simulation Study of a Bayesian Hierarchical Changepoint Model with Covariates

A Simulation Study of a Bayesian Hierarchical Changepoint Model with Covariates A Simulation Study of a Bayesian Hierarchical Changepoint Model with Covariates Wonsuk Yoo (1), Elizabeth H. Slate (2) (1) Department of Mathematical Sciences New Jersey Institute of Technology Newark,

More information

Biostatistics 533 Classical Theory of Linear Models Spring 2007 Final Exam. Please choose ONE of the following options.

Biostatistics 533 Classical Theory of Linear Models Spring 2007 Final Exam. Please choose ONE of the following options. 1 Biostatistics 533 Classical Theory of Linear Models Spring 2007 Final Exam Name: KEY Problems do not have equal value and some problems will take more time than others. Spend your time wisely. You do

More information

Optimising Group Sequential Designs. Decision Theory, Dynamic Programming. and Optimal Stopping

Optimising Group Sequential Designs. Decision Theory, Dynamic Programming. and Optimal Stopping : Decision Theory, Dynamic Programming and Optimal Stopping Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj InSPiRe Conference on Methodology

More information

Approaches for Multiple Disease Mapping: MCAR and SANOVA

Approaches for Multiple Disease Mapping: MCAR and SANOVA Approaches for Multiple Disease Mapping: MCAR and SANOVA Dipankar Bandyopadhyay Division of Biostatistics, University of Minnesota SPH April 22, 2015 1 Adapted from Sudipto Banerjee s notes SANOVA vs MCAR

More information

CHANGE POINT PROBLEMS IN THE MODEL OF LOGISTIC REGRESSION 1. By G. Gurevich and A. Vexler. Tecnion-Israel Institute of Technology, Haifa, Israel.

CHANGE POINT PROBLEMS IN THE MODEL OF LOGISTIC REGRESSION 1. By G. Gurevich and A. Vexler. Tecnion-Israel Institute of Technology, Haifa, Israel. CHANGE POINT PROBLEMS IN THE MODEL OF LOGISTIC REGRESSION 1 By G. Gurevich and A. Vexler Tecnion-Israel Institute of Technology, Haifa, Israel and The Central Bureau of Statistics, Jerusalem, Israel SUMMARY

More information

Chapter 5 Matrix Approach to Simple Linear Regression

Chapter 5 Matrix Approach to Simple Linear Regression STAT 525 SPRING 2018 Chapter 5 Matrix Approach to Simple Linear Regression Professor Min Zhang Matrix Collection of elements arranged in rows and columns Elements will be numbers or symbols For example:

More information

Supplementary Note on Bayesian analysis

Supplementary Note on Bayesian analysis Supplementary Note on Bayesian analysis Structured variability of muscle activations supports the minimal intervention principle of motor control Francisco J. Valero-Cuevas 1,2,3, Madhusudhan Venkadesan

More information

Default Priors and Effcient Posterior Computation in Bayesian

Default Priors and Effcient Posterior Computation in Bayesian Default Priors and Effcient Posterior Computation in Bayesian Factor Analysis January 16, 2010 Presented by Eric Wang, Duke University Background and Motivation A Brief Review of Parameter Expansion Literature

More information

Web Appendix for Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors by D. B. Woodard, C. Crainiceanu, and D.

Web Appendix for Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors by D. B. Woodard, C. Crainiceanu, and D. Web Appendix for Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors by D. B. Woodard, C. Crainiceanu, and D. Ruppert A. EMPIRICAL ESTIMATE OF THE KERNEL MIXTURE Here we

More information

Longitudinal breast density as a marker of breast cancer risk

Longitudinal breast density as a marker of breast cancer risk Longitudinal breast density as a marker of breast cancer risk C. Armero (1), M. Rué (2), A. Forte (1), C. Forné (2), H. Perpiñán (1), M. Baré (3), and G. Gómez (4) (1) BIOstatnet and Universitat de València,

More information

BIOS 2083 Linear Models c Abdus S. Wahed

BIOS 2083 Linear Models c Abdus S. Wahed Chapter 5 206 Chapter 6 General Linear Model: Statistical Inference 6.1 Introduction So far we have discussed formulation of linear models (Chapter 1), estimability of parameters in a linear model (Chapter

More information

Factor Analytic Models of Clustered Multivariate Data with Informative Censoring (refer to Dunson and Perreault, 2001, Biometrics 57, )

Factor Analytic Models of Clustered Multivariate Data with Informative Censoring (refer to Dunson and Perreault, 2001, Biometrics 57, ) Factor Analytic Models of Clustered Multivariate Data with Informative Censoring (refer to Dunson and Perreault, 2001, Biometrics 57, 302-308) Consider data in which multiple outcomes are collected for

More information

Cluster investigations using Disease mapping methods International workshop on Risk Factors for Childhood Leukemia Berlin May

Cluster investigations using Disease mapping methods International workshop on Risk Factors for Childhood Leukemia Berlin May Cluster investigations using Disease mapping methods International workshop on Risk Factors for Childhood Leukemia Berlin May 5-7 2008 Peter Schlattmann Institut für Biometrie und Klinische Epidemiologie

More information

Bayesian Hierarchical Models

Bayesian Hierarchical Models Bayesian Hierarchical Models Gavin Shaddick, Millie Green, Matthew Thomas University of Bath 6 th - 9 th December 2016 1/ 34 APPLICATIONS OF BAYESIAN HIERARCHICAL MODELS 2/ 34 OUTLINE Spatial epidemiology

More information

Disease mapping with Gaussian processes

Disease mapping with Gaussian processes EUROHEIS2 Kuopio, Finland 17-18 August 2010 Aki Vehtari (former Helsinki University of Technology) Department of Biomedical Engineering and Computational Science (BECS) Acknowledgments Researchers - Jarno

More information

Areal data models. Spatial smoothers. Brook s Lemma and Gibbs distribution. CAR models Gaussian case Non-Gaussian case

Areal data models. Spatial smoothers. Brook s Lemma and Gibbs distribution. CAR models Gaussian case Non-Gaussian case Areal data models Spatial smoothers Brook s Lemma and Gibbs distribution CAR models Gaussian case Non-Gaussian case SAR models Gaussian case Non-Gaussian case CAR vs. SAR STAR models Inference for areal

More information

Marginal Screening and Post-Selection Inference

Marginal Screening and Post-Selection Inference Marginal Screening and Post-Selection Inference Ian McKeague August 13, 2017 Ian McKeague (Columbia University) Marginal Screening August 13, 2017 1 / 29 Outline 1 Background on Marginal Screening 2 2

More information

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression AMS 315/576 Lecture Notes Chapter 11. Simple Linear Regression 11.1 Motivation A restaurant opening on a reservations-only basis would like to use the number of advance reservations x to predict the number

More information

VCMC: Variational Consensus Monte Carlo

VCMC: Variational Consensus Monte Carlo VCMC: Variational Consensus Monte Carlo Maxim Rabinovich, Elaine Angelino, Michael I. Jordan Berkeley Vision and Learning Center September 22, 2015 probabilistic models! sky fog bridge water grass object

More information

7. Estimation and hypothesis testing. Objective. Recommended reading

7. Estimation and hypothesis testing. Objective. Recommended reading 7. Estimation and hypothesis testing Objective In this chapter, we show how the election of estimators can be represented as a decision problem. Secondly, we consider the problem of hypothesis testing

More information

Multiple QTL mapping

Multiple QTL mapping Multiple QTL mapping Karl W Broman Department of Biostatistics Johns Hopkins University www.biostat.jhsph.edu/~kbroman [ Teaching Miscellaneous lectures] 1 Why? Reduce residual variation = increased power

More information

ASA Section on Survey Research Methods

ASA Section on Survey Research Methods REGRESSION-BASED STATISTICAL MATCHING: RECENT DEVELOPMENTS Chris Moriarity, Fritz Scheuren Chris Moriarity, U.S. Government Accountability Office, 411 G Street NW, Washington, DC 20548 KEY WORDS: data

More information

General Linear Model: Statistical Inference

General Linear Model: Statistical Inference Chapter 6 General Linear Model: Statistical Inference 6.1 Introduction So far we have discussed formulation of linear models (Chapter 1), estimability of parameters in a linear model (Chapter 4), least

More information

Bayesian Inference. Chapter 4: Regression and Hierarchical Models

Bayesian Inference. Chapter 4: Regression and Hierarchical Models Bayesian Inference Chapter 4: Regression and Hierarchical Models Conchi Ausín and Mike Wiper Department of Statistics Universidad Carlos III de Madrid Master in Business Administration and Quantitative

More information

Dynamic Scheduling of the Upcoming Exam in Cancer Screening

Dynamic Scheduling of the Upcoming Exam in Cancer Screening Dynamic Scheduling of the Upcoming Exam in Cancer Screening Dongfeng 1 and Karen Kafadar 2 1 Department of Bioinformatics and Biostatistics University of Louisville 2 Department of Statistics University

More information

1 Bayesian Linear Regression (BLR)

1 Bayesian Linear Regression (BLR) Statistical Techniques in Robotics (STR, S15) Lecture#10 (Wednesday, February 11) Lecturer: Byron Boots Gaussian Properties, Bayesian Linear Regression 1 Bayesian Linear Regression (BLR) In linear regression,

More information

Summer School in Statistics for Astronomers V June 1 - June 6, Regression. Mosuk Chow Statistics Department Penn State University.

Summer School in Statistics for Astronomers V June 1 - June 6, Regression. Mosuk Chow Statistics Department Penn State University. Summer School in Statistics for Astronomers V June 1 - June 6, 2009 Regression Mosuk Chow Statistics Department Penn State University. Adapted from notes prepared by RL Karandikar Mean and variance Recall

More information

Markov Chain Monte Carlo in Practice

Markov Chain Monte Carlo in Practice Markov Chain Monte Carlo in Practice Edited by W.R. Gilks Medical Research Council Biostatistics Unit Cambridge UK S. Richardson French National Institute for Health and Medical Research Vilejuif France

More information

MCMC algorithms for fitting Bayesian models

MCMC algorithms for fitting Bayesian models MCMC algorithms for fitting Bayesian models p. 1/1 MCMC algorithms for fitting Bayesian models Sudipto Banerjee sudiptob@biostat.umn.edu University of Minnesota MCMC algorithms for fitting Bayesian models

More information

Computationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models

Computationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models Computationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models Tihomir Asparouhov 1, Bengt Muthen 2 Muthen & Muthen 1 UCLA 2 Abstract Multilevel analysis often leads to modeling

More information

Multivariate Regression Generalized Likelihood Ratio Tests for FMRI Activation

Multivariate Regression Generalized Likelihood Ratio Tests for FMRI Activation Multivariate Regression Generalized Likelihood Ratio Tests for FMRI Activation Daniel B Rowe Division of Biostatistics Medical College of Wisconsin Technical Report 40 November 00 Division of Biostatistics

More information

Constrained data assimilation. W. Carlisle Thacker Atlantic Oceanographic and Meteorological Laboratory Miami, Florida USA

Constrained data assimilation. W. Carlisle Thacker Atlantic Oceanographic and Meteorological Laboratory Miami, Florida USA Constrained data assimilation W. Carlisle Thacker Atlantic Oceanographic and Meteorological Laboratory Miami, Florida 33149 USA Plan Range constraints: : HYCOM layers have minimum thickness. Optimal interpolation:

More information

Robust covariance estimator for small-sample adjustment in the generalized estimating equations: A simulation study

Robust covariance estimator for small-sample adjustment in the generalized estimating equations: A simulation study Science Journal of Applied Mathematics and Statistics 2014; 2(1): 20-25 Published online February 20, 2014 (http://www.sciencepublishinggroup.com/j/sjams) doi: 10.11648/j.sjams.20140201.13 Robust covariance

More information

Group Sequential Designs: Theory, Computation and Optimisation

Group Sequential Designs: Theory, Computation and Optimisation Group Sequential Designs: Theory, Computation and Optimisation Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj 8th International Conference

More information

Bayesian Nonparametric Regression for Diabetes Deaths

Bayesian Nonparametric Regression for Diabetes Deaths Bayesian Nonparametric Regression for Diabetes Deaths Brian M. Hartman PhD Student, 2010 Texas A&M University College Station, TX, USA David B. Dahl Assistant Professor Texas A&M University College Station,

More information

A characterization of consistency of model weights given partial information in normal linear models

A characterization of consistency of model weights given partial information in normal linear models Statistics & Probability Letters ( ) A characterization of consistency of model weights given partial information in normal linear models Hubert Wong a;, Bertrand Clare b;1 a Department of Health Care

More information

Gaussian Process Regression Model in Spatial Logistic Regression

Gaussian Process Regression Model in Spatial Logistic Regression Journal of Physics: Conference Series PAPER OPEN ACCESS Gaussian Process Regression Model in Spatial Logistic Regression To cite this article: A Sofro and A Oktaviarina 018 J. Phys.: Conf. Ser. 947 01005

More information

4 Introduction to modeling longitudinal data

4 Introduction to modeling longitudinal data 4 Introduction to modeling longitudinal data We are now in a position to introduce a basic statistical model for longitudinal data. The models and methods we discuss in subsequent chapters may be viewed

More information

Physician Performance Assessment / Spatial Inference of Pollutant Concentrations

Physician Performance Assessment / Spatial Inference of Pollutant Concentrations Physician Performance Assessment / Spatial Inference of Pollutant Concentrations Dawn Woodard Operations Research & Information Engineering Cornell University Johns Hopkins Dept. of Biostatistics, April

More information

Case Study in the Use of Bayesian Hierarchical Modeling and Simulation for Design and Analysis of a Clinical Trial

Case Study in the Use of Bayesian Hierarchical Modeling and Simulation for Design and Analysis of a Clinical Trial Case Study in the Use of Bayesian Hierarchical Modeling and Simulation for Design and Analysis of a Clinical Trial William R. Gillespie Pharsight Corporation Cary, North Carolina, USA PAGE 2003 Verona,

More information

Efficient MCMC Samplers for Network Tomography

Efficient MCMC Samplers for Network Tomography Efficient MCMC Samplers for Network Tomography Martin Hazelton 1 Institute of Fundamental Sciences Massey University 7 December 2015 1 Email: m.hazelton@massey.ac.nz AUT Mathematical Sciences Symposium

More information

Outline. Binomial, Multinomial, Normal, Beta, Dirichlet. Posterior mean, MAP, credible interval, posterior distribution

Outline. Binomial, Multinomial, Normal, Beta, Dirichlet. Posterior mean, MAP, credible interval, posterior distribution Outline A short review on Bayesian analysis. Binomial, Multinomial, Normal, Beta, Dirichlet Posterior mean, MAP, credible interval, posterior distribution Gibbs sampling Revisit the Gaussian mixture model

More information

Biostatistics 533 Classical Theory of Linear Models Spring 2007 Final Exam. Please choose ONE of the following options.

Biostatistics 533 Classical Theory of Linear Models Spring 2007 Final Exam. Please choose ONE of the following options. 1 Biostatistics 533 Classical Theory of Linear Models Spring 2007 Final Exam Name: Problems do not have equal value and some problems will take more time than others. Spend your time wisely. You do not

More information

Bayesian Methods for Machine Learning

Bayesian Methods for Machine Learning Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),

More information

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion Glenn Heller and Jing Qin Department of Epidemiology and Biostatistics Memorial

More information

Contents. Part I: Fundamentals of Bayesian Inference 1

Contents. Part I: Fundamentals of Bayesian Inference 1 Contents Preface xiii Part I: Fundamentals of Bayesian Inference 1 1 Probability and inference 3 1.1 The three steps of Bayesian data analysis 3 1.2 General notation for statistical inference 4 1.3 Bayesian

More information

Game Physics. Game and Media Technology Master Program - Utrecht University. Dr. Nicolas Pronost

Game Physics. Game and Media Technology Master Program - Utrecht University. Dr. Nicolas Pronost Game and Media Technology Master Program - Utrecht University Dr. Nicolas Pronost Rigid body physics Particle system Most simple instance of a physics system Each object (body) is a particle Each particle

More information

Interim Monitoring of Clinical Trials: Decision Theory, Dynamic Programming. and Optimal Stopping

Interim Monitoring of Clinical Trials: Decision Theory, Dynamic Programming. and Optimal Stopping Interim Monitoring of Clinical Trials: Decision Theory, Dynamic Programming and Optimal Stopping Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj

More information

Alternative implementations of Monte Carlo EM algorithms for likelihood inferences

Alternative implementations of Monte Carlo EM algorithms for likelihood inferences Genet. Sel. Evol. 33 001) 443 45 443 INRA, EDP Sciences, 001 Alternative implementations of Monte Carlo EM algorithms for likelihood inferences Louis Alberto GARCÍA-CORTÉS a, Daniel SORENSEN b, Note a

More information

The STS Surgeon Composite Technical Appendix

The STS Surgeon Composite Technical Appendix The STS Surgeon Composite Technical Appendix Overview Surgeon-specific risk-adjusted operative operative mortality and major complication rates were estimated using a bivariate random-effects logistic

More information

Lecture 24: Weighted and Generalized Least Squares

Lecture 24: Weighted and Generalized Least Squares Lecture 24: Weighted and Generalized Least Squares 1 Weighted Least Squares When we use ordinary least squares to estimate linear regression, we minimize the mean squared error: MSE(b) = 1 n (Y i X i β)

More information

Models for spatial data (cont d) Types of spatial data. Types of spatial data (cont d) Hierarchical models for spatial data

Models for spatial data (cont d) Types of spatial data. Types of spatial data (cont d) Hierarchical models for spatial data Hierarchical models for spatial data Based on the book by Banerjee, Carlin and Gelfand Hierarchical Modeling and Analysis for Spatial Data, 2004. We focus on Chapters 1, 2 and 5. Geo-referenced data arise

More information

Session 3A: Markov chain Monte Carlo (MCMC)

Session 3A: Markov chain Monte Carlo (MCMC) Session 3A: Markov chain Monte Carlo (MCMC) John Geweke Bayesian Econometrics and its Applications August 15, 2012 ohn Geweke Bayesian Econometrics and its Session Applications 3A: Markov () chain Monte

More information

Bayesian (conditionally) conjugate inference for discrete data models. Jon Forster (University of Southampton)

Bayesian (conditionally) conjugate inference for discrete data models. Jon Forster (University of Southampton) Bayesian (conditionally) conjugate inference for discrete data models Jon Forster (University of Southampton) with Mark Grigsby (Procter and Gamble?) Emily Webb (Institute of Cancer Research) Table 1:

More information

Basics of reinforcement learning

Basics of reinforcement learning Basics of reinforcement learning Lucian Buşoniu TMLSS, 20 July 2018 Main idea of reinforcement learning (RL) Learn a sequential decision policy to optimize the cumulative performance of an unknown system

More information