Conditional Inference in Two-Stage Adaptive Experiments via the Bootstrap
Adam Lane, HaiYing Wang, and Nancy Flournoy

Abstract We study two-stage adaptive designs in which data accumulated in the first stage are used to select the design for a second stage. Inference following such experiments is often conducted by ignoring the adaptive procedure and treating the design as fixed. Alternative inference procedures approximate the variance of the parameter estimates by approximating the inverse of the expected Fisher information. Both of these inferential methods often rely on normal distribution assumptions in order to create confidence intervals. In an effort to improve inference, we develop bootstrap methods that condition on a non-ancillary statistic that defines the second-stage design.

1 Introduction

In many experiments, the evaluation of designs with respect to a specific objective requires knowledge of the true model parameters. Examples include optimal designs for the estimation of nonlinear functions of the parameters in linear models; optimal designs for the estimation of linear functions of parameters in nonlinear models; and dose-finding designs where it is desired to treat patients at a pre-specified quantile of the response function. In the absence of perfect knowledge of the model parameters, it is appealing to update initial parameter estimates using data accumulated from all previous stages and to allocate observations in the current stage by an assessment of designs evaluated at these estimates. Such procedures result in designs that are functions of random variables whose distributions depend on the model parameters, i.e., the designs are not ancillary statistics.

A. Lane, Cincinnati Children's Hospital Medical Center, 3333 Burnet Avenue, Cincinnati, OH 45229, USA. e-mail: Adam.Lane@cchmc.org
H. Wang, University of New Hampshire, 33 Academic Way, Durham, NH 03824, USA. e-mail: HaiYing.Wang@unh.edu
N. Flournoy, University of Missouri, 146 Middlebush Hall, Columbia, MO 65211, USA. e-mail: flournoyn@missouri.edu

© Springer International Publishing Switzerland 2016. J. Kunert et al. (eds.), mODa 11 – Advances in Model-Oriented Design and Analysis, Contributions to Statistics, DOI 10.1007/…
In practice it is common to conduct inference ignoring the adaptive nature of the experiment and treating the design as if it were an ancillary statistic. The potential issues resulting from such procedures are well known; see [1]. Briefly, when a design is ancillary, it is non-informative with respect to the model parameters. Analysing adaptive experiments with non-ancillary designs as if the design were ancillary does not account for all the information contained in the design. An alternative often suggested is to approximate the variance of the maximum likelihood estimate (MLE) with an approximation of the inverse of the expected Fisher information (defined below). Regardless of the method, the distribution of the MLE resulting from adaptive experiments is often assumed to be normal when used to create confidence intervals.

We develop three bootstrap procedures that approximate the distribution of the MLE conditional on a non-ancillary statistic that defines the second-stage design. The bootstrap reduces the reliance on the assumption of normality. A bootstrap procedure in the context of adaptive experiments with binary outcomes, primarily in the context of urn models, has been developed previously [7].

2 The Model and an Illustration of the Problem

Consider an experiment conducted to measure a constant $\theta$. Suppose it is possible to obtain unbiased observations $y$ of $\theta$ from two sources; each source $k = r, s$ produces errors from a $N(0, \sigma_k^2)$, where $\sigma_r, \sigma_s$ are known constants. Throughout this paper $Y$ and $y$ denote the random variable and its observed realization, respectively. A function $\Xi(\theta)$ is defined to determine which of the two sources should be used, based on maximizing efficiency, ethics, cost or other considerations. For illustration, we use $\Xi(\theta) = r$ if $\theta < c$ and $s$ otherwise, given some constant $c$. The adaptive procedure is as follows: obtain an initial cohort of $n_1$ independent observations, $y_1 = (y_{11}, \ldots, y_{1n_1})$, from source $r$.
Evaluate $\Xi$ at the MLE of $\theta$ based on the first-stage observations, $\hat\theta_1 = \bar y_1 = \sum_{j=1}^{n_1} y_{1j}/n_1$. The function $\Xi(\bar y_1)$ determines the source from which a second cohort of $n_2$ independent observations, $y_2 = (y_{21}, \ldots, y_{2n_2})$, will be obtained, and it induces a correlation between $\bar Y_1$ and $\bar Y_2$. This model has been used to illustrate conditional inference in experiments with ancillary designs; see [2] and [4]. Here we use the model to illustrate conditional inference with non-ancillary designs in adaptive experiments.

Let $N = n_1 + n_2$ represent the total sample size, and let $S_r = (-\infty, c)$ and $S_s = (c, \infty)$ be the regions of the parameter space that select the second-stage source. The second-stage design is completely determined by the non-ancillary variable $I(\bar Y_1 \in S_k)$, $k = r, s$, where $I(\cdot)$ is the indicator function. Therefore, $\bar Y_2 \mid \bar Y_1 \in S_k$ is independent of $\bar Y_1$, $k = r, s$. The log-likelihood is

$l = -n_1(\bar y_1 - \theta)^2/2\sigma_r^2 - n_2(\bar y_2 - \theta)^2/2\sigma_{\Xi(\bar y_1)}^2 + \text{constant}.$

Defining $w(\bar y_1) = (n_1/\sigma_r^2 + n_2/\sigma_{\Xi(\bar y_1)}^2)^{-1}$, the MLE based on both stages is

$\hat\theta = w(\bar y_1)\,(n_1 \bar y_1/\sigma_r^2 + n_2 \bar y_2/\sigma_{\Xi(\bar y_1)}^2).$
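To make the two-stage procedure concrete, the following sketch (ours, not from the paper) simulates one run of the design and computes the weighted MLE $\hat\theta$, assuming the illustrative values $\theta = 1$, $\sigma_r = 1$, $\sigma_s = 3$, $n_1 = n_2 = 50$ and $c = \theta + 1/\sqrt{n_1}$:

```python
import math
import random

def two_stage_mle(theta, s_r, s_s, n1, n2, c, rng):
    """Simulate one run of the two-stage design; return (ybar1, mle)."""
    # Stage 1: n1 observations from source r.
    y1 = [rng.gauss(theta, s_r) for _ in range(n1)]
    ybar1 = sum(y1) / n1                           # first-stage MLE of theta
    # Stage 2: source chosen by xi(ybar1) = r if ybar1 < c, else s.
    s_xi = s_r if ybar1 < c else s_s
    y2 = [rng.gauss(theta, s_xi) for _ in range(n2)]
    ybar2 = sum(y2) / n2
    # w(ybar1) = (n1/s_r^2 + n2/s_xi^2)^(-1); the MLE is the precision-weighted mean.
    w = 1.0 / (n1 / s_r**2 + n2 / s_xi**2)
    return ybar1, w * (n1 * ybar1 / s_r**2 + n2 * ybar2 / s_xi**2)

rng = random.Random(1)
theta, s_r, s_s, n1, n2 = 1.0, 1.0, 3.0, 50, 50    # illustrative values
c = theta + 1 / math.sqrt(n1)
ybar1, mle = two_stage_mle(theta, s_r, s_s, n1, n2, c, rng)
```

Because $\Xi(\bar y_1)$ depends on the first-stage data, the weight $w(\bar y_1)$ is random, so $\hat\theta$ is not a fixed-design estimator.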
To approximate the distribution of $\hat\theta(Y_1, Y_2) \mid \bar Y_1 \in S_k$, $k = r, s$, three bootstrap methods are developed. The resulting parameter estimates and confidence intervals are compared to analyses where the variance of the MLE is approximated using the inverse of the observed or expected Fisher information: $I(\hat\theta) = -[\partial^2 l/\partial\theta^2]_{\theta=\hat\theta} = w(\bar y_1)^{-1}$ and $F = E[I(\theta)]$, respectively. Confidence intervals for the comparative methods are then constructed using normal quantiles.

If the design were ancillary, then the variance of the MLE would be $w(\bar y_1)$. Using $F^{-1}$ to approximate the variance does not treat the design as ancillary; but $F$ is a function of $\theta$ and must be estimated, and $\hat F = F|_{\theta=\hat\theta}$ is used here. In contrast, since $\bar Y_2 \mid \bar Y_1 \in S_k$ is independent of $\bar Y_1$, $\hat\theta(Y_1, Y_2) \mid \bar Y_1 \in S_k$ has variance

$\mathrm{Var}[\hat\theta(Y_1, Y_2) \mid \bar Y_1 \in S_k] = w_k^2 \left\{ \frac{n_1^2}{\sigma_r^4}\, \mathrm{Var}[\bar Y_1 \mid \bar Y_1 \in S_k] + \frac{n_2}{\sigma_k^2} \right\},$

where $w_k = (n_1/\sigma_r^2 + n_2/\sigma_k^2)^{-1}$. The design of the second stage is completely determined by $I(\bar Y_1 \in S_k)$. Therefore $\{\Xi(\bar Y_1) \mid \bar Y_1 \in S_k\} = k$. The function $w(\bar Y_1)$ depends on $\bar Y_1$ only through $\Xi(\bar Y_1)$; hence $\{w(\bar Y_1) \mid \bar Y_1 \in S_k\} = w_k$, for $k = r, s$. For short, let $p_k = P(\bar Y_1 \in S_k)$, $E_k = E[\hat\theta(Y_1, Y_2) \mid \bar Y_1 \in S_k]$, $V_k = \mathrm{Var}[\hat\theta(Y_1, Y_2) \mid \bar Y_1 \in S_k]$ and $I_k = I(\hat\theta)$, $k = r, s$.

From a series of 10,000 simulations, Table 1 presents, for $k = r, s$: $p_k$, $E_k$, $N \cdot \widehat{\mathrm{se}}^2$ and the tail probabilities $P[\hat\theta < C_l]$ and $P[\hat\theta > C_u]$, where $\widehat{\mathrm{se}}^2$ is either $\mathrm{Var}[\hat\theta(Y_1, Y_2)]$, $V_k$, $I_k^{-1}$ or $\hat F^{-1}$; $C_l = \theta - Z_{1-\alpha/2}\,\widehat{\mathrm{se}}$, $C_u = \theta + Z_{1-\alpha/2}\,\widehat{\mathrm{se}}$, and $Z_{1-\alpha/2}$ is the $1-\alpha/2$ quantile of the standard normal distribution. The values $\theta = 1$, $\sigma_r = 1$, $\sigma_s = 3$, $n_1 = 50$, $n_2 = 50$, $c = \theta + 1/\sqrt{n_1}$ and $\alpha = 0.05$ are used throughout.

Both $I_k^{-1}$ and $\hat F^{-1}$ are significantly greater than $V_k$, $k = r, s$. Both $V_k$, $k = r, s$, are significantly less than $\mathrm{Var}[\hat\theta(Y_1, Y_2)]$. If $\bar Y_1 \in S_r$, then $E_r$ is nearly unbiased, and despite slightly unbalanced tail probabilities, overall coverage would be adequate if $V_r$ were known and could be used. When $\bar Y_1 \in S_s$, the bias is considerable and the coverage is unacceptable. We focus throughout on improvements when $\bar Y_1 \in S_s$.
Table 1 Conditional probability $p_k$, conditional expectation $E_k$, and each variance-approximation method with its estimate $N \cdot \widehat{\mathrm{se}}^2$ and tail probabilities $P[\hat\theta < C_l]$, $P[\hat\theta > C_u]$, by the source of the second-stage observations.

Source $r$: $p_r$, $E_r$; variance approximations $V_r$, $\mathrm{Var}[\hat\theta(Y_1, Y_2)]$, $I_r^{-1}$, $\hat F^{-1}$.
Source $s$: $p_s$, $E_s$; variance approximations $V_s$, $\mathrm{Var}[\hat\theta(Y_1, Y_2)]$, $I_s^{-1}$, $\hat F^{-1}$.
(The numeric entries of the table were lost in transcription.)
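The qualitative pattern just described can be reproduced by direct Monte Carlo under the same assumed illustrative settings; this sketch (ours) estimates $p_s$, $E_s$ and $V_s$ and compares $V_s$ with the "ancillary" variance $w_s$. It reproduces the direction of the effects, not the exact entries of Table 1.

```python
import math
import random

rng = random.Random(7)
theta, s_r, s_s, n1, n2 = 1.0, 1.0, 3.0, 50, 50    # illustrative values
c = theta + 1 / math.sqrt(n1)

draws_s = []
for _ in range(20000):
    ybar1 = rng.gauss(theta, s_r / math.sqrt(n1))   # first-stage sample mean
    s2 = s_r if ybar1 < c else s_s                  # second-stage source xi(ybar1)
    ybar2 = rng.gauss(theta, s2 / math.sqrt(n2))
    w = 1.0 / (n1 / s_r**2 + n2 / s2**2)
    mle = w * (n1 * ybar1 / s_r**2 + n2 * ybar2 / s2**2)
    if ybar1 >= c:                                  # condition on Ybar1 in S_s
        draws_s.append(mle)

p_s = len(draws_s) / 20000                          # estimate of P(Ybar1 in S_s)
E_s = sum(draws_s) / len(draws_s)                   # conditional mean
V_s = sum((d - E_s) ** 2 for d in draws_s) / len(draws_s)  # conditional variance
w_s = 1.0 / (n1 / s_r**2 + n2 / s_s**2)             # variance if design were ancillary
```

Here $E_s$ exceeds $\theta$ (the conditional bias) while $V_s$ falls well below $w_s$, matching the pattern reported for source $s$.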
3 Conditional Bootstrap Inference

In an effort to improve inference following adaptive experiments, three bootstrap methods are developed. We begin with a straightforward and intuitive conditional bootstrap procedure. Unfortunately, as will be discussed, the conditions required for this procedure to give accurate inference are extremely restrictive.

Conditional Bootstrap Method 1 (BM1):
1. Construct a probability distribution $\hat F_i$ by putting mass $1/n_i$ at each point $y_{i1}, \ldots, y_{in_i}$, $i = 1, 2$.
2. With $\hat F_1$ and $\hat F_2$ fixed, draw random samples of size $n_1$ and $n_2$ from $\hat F_1$ and $\hat F_2$, respectively. Denote the vectors of bootstrap samples as $y_i^*$, $i = 1, 2$.
3. Repeat step 2 $B$ times. For the bootstrap samples satisfying $\bar y_1^* \in S_{\Xi(\bar y_1)}$, use $\bar y_1^*$ and $\bar y_2^*$ to find $\hat\theta^*$. Use $(C_l^*, C_u^*) = (Q^*_{\alpha/2}, Q^*_{1-\alpha/2})$ as the $(1-\alpha)$ confidence interval, where $Q^*_q$ is the $q$ quantile of the bootstrapped sample distribution $\hat\theta^*(Y_1, Y_2) \mid \bar Y_1^* \in S_{\Xi(\bar y_1)}$.

Let $P^*$ be the empirical probability measure given $Y_1$ and let $[\,\cdot\,]_b$ represent the $b$th sampled bootstrap. Then $P^*(\bar Y_1^* \in S_{\Xi(\bar y_1)}) \approx \sum_{b=1}^{B} I([\bar y_1^*]_b \in S_{\Xi(\bar y_1)})/B$ is the probability that the mean of a bootstrap sample is in $S_{\Xi(\bar y_1)}$.

Table 2 presents the simulation results $p_s$, $E_s$, $V_s$, $C_l$, $C_u$, $P[\hat\theta < C_l]$ and $P[\hat\theta > C_u]$.

Table 2 Results conditional on the region $\bar Y_1 \in S_s$. Method describes the procedure by which the approximate distribution was obtained; $p_s$ is approximated by $\sum_{b=1}^{B} I([\bar y_1^*]_b \in S_s)/B$ for the bootstrap and by $P(\bar Y_1 \in S_s \mid \theta = \hat\theta)$ for the expected information. $E_s$, $V_s$, the confidence limits and tail probabilities are also provided. All values are averages from 10,000 simulations except the column $p_s$, which uses the median. "True" refers to either the empirical distribution of $\hat\theta(Y_1, Y_2) \mid \bar Y_1 \in S_s$ or $\tilde\theta(Y_1, Y_2) \mid \bar Y_1 \in S_s$ obtained from simulation. The rows for the estimate $\hat\theta$ are True, $\hat F^{-1}$, $I_s^{-1}$, BM1 and BM2; the rows for $\tilde\theta$ are True, $\tilde\sigma^2$ and BM3. (The numeric entries of the table were lost in transcription.)

The first-row results are for the distribution of $\hat\theta(Y_1, Y_2) \mid \bar Y_1 \in S_s$. The second- and third-row results use $I_s^{-1}$ and $\hat F^{-1}$, with tail probabilities found using quantiles of the standard normal distribution. The fourth-row results are from BM1 using $B = 1000$ bootstrap samples. All values are averages except $p_s$, whose values are skewed and for which the median provides a better summary of the data.

Both $P^*(\bar Y_1^* \in S_s)$ and $P(\bar Y_1 \in S_s \mid \theta = \hat\theta)$ are significantly greater than $p_s$. The methods BM1 and $\hat F^{-1}$ produce similar variances, both of which are less than $I_s^{-1}$. All three approximations are significantly greater than $V_s$. However, the mean of the bootstrap distribution has greater bias than $\hat\theta$. Thus BM1 does not improve inference in comparison to using $\hat F^{-1}$.

3.1 Technical Details for the Conditional Bootstrap

In this section we consider the MLE conditional on an indicator function of the first-stage sample mean. This illustrates the implications of conditioning in two-stage adaptive experiments with non-ancillary designs and why the inference produced by BM1 was poor. Let $T_k = (a_k, d_k)$, where $a_k = \theta + a_k'/\sqrt{n_1}$, $d_k = \theta + d_k'/\sqrt{n_1}$, and $a_k'$ and $d_k'$ are constants. This is a slightly more general division of the parameter space than $S_k$, which has only one of $a_k'$ and $d_k'$ finite. Note that we consider the parameter space defined in a local neighborhood of $\theta$. This is done so that $P(\bar Y_1 \in S_k)$, $k = r, s$, has positive probability not equal to 1 for large $n_1$; otherwise the design would be deterministic for large first-stage sample sizes and not adaptive. Consider

$P^*[\sqrt{n_1}(\bar Y_1^* - \bar Y_1) < x \mid \sqrt{n_1}(\bar Y_1^* - \bar Y_1) \in \sqrt{n_1}(a_k - \theta, d_k - \theta)]$
$= \dfrac{P^*[\{\sqrt{n_1}(\bar Y_1^* - \bar Y_1) < x\} \cap \{\sqrt{n_1}(\bar Y_1^* - \bar Y_1) \in \sqrt{n_1}(a_k - \theta, d_k - \theta)\}]}{P^*[\sqrt{n_1}(\bar Y_1^* - \bar Y_1) \in \sqrt{n_1}(a_k - \theta, d_k - \theta)]}$
$= \dfrac{P[\{\sqrt{n_1}(\bar Y_1 - \theta) < x\} \cap \{\sqrt{n_1}(\bar Y_1 - \theta) \in \sqrt{n_1}(a_k - \theta, d_k - \theta)\}] + o(1)}{P[\sqrt{n_1}(\bar Y_1 - \theta) \in \sqrt{n_1}(a_k - \theta, d_k - \theta)] + o(1)}$
$= P[\sqrt{n_1}(\bar Y_1 - \theta) < x \mid \sqrt{n_1}(\bar Y_1 - \theta) \in \sqrt{n_1}(a_k - \theta, d_k - \theta)] + o(1), \qquad (1)$

where $x \in \mathbb{R}$, and the equalities hold almost surely under $P^*$. Note, from the left-hand side of equation (1), that a conditional bootstrap should include bootstrap samples satisfying $\sqrt{n_1}(\bar Y_1^* - \bar Y_1) \in \sqrt{n_1}(a_k - \theta, d_k - \theta)$, i.e., $\bar Y_1^* \in (a_k - \varepsilon_1, d_k - \varepsilon_1)$, where $\varepsilon_1 = \theta - \bar Y_1$.
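A minimal nonparametric sketch of BM1 under the same assumed illustrative settings (ours, not the authors' code): both stages are resampled from their empirical distributions, and only replicates whose first-stage mean falls in the observed region $S_{\Xi(\bar y_1)}$ are retained. Note this uses the unshifted region; equation (1) instead points to the shifted interval $(a_k - \varepsilon_1, d_k - \varepsilon_1)$, which depends on the unknown $\theta$.

```python
import math
import random

rng = random.Random(3)
theta, s_r, s_s, n1, n2 = 1.0, 1.0, 3.0, 50, 50    # illustrative values
c = theta + 1 / math.sqrt(n1)

# One realized experiment.
y1 = [rng.gauss(theta, s_r) for _ in range(n1)]
ybar1 = sum(y1) / n1
s_xi = s_r if ybar1 < c else s_s                   # observed second-stage source
y2 = [rng.gauss(theta, s_xi) for _ in range(n2)]

def wmle(yb1, yb2):
    w = 1.0 / (n1 / s_r**2 + n2 / s_xi**2)
    return w * (n1 * yb1 / s_r**2 + n2 * yb2 / s_xi**2)

# BM1: resample each stage, keep replicates with ybar1* in S_xi(ybar1).
boot = []
for _ in range(2000):                              # B bootstrap replicates
    yb1 = sum(rng.choices(y1, k=n1)) / n1
    yb2 = sum(rng.choices(y2, k=n2)) / n2
    if (yb1 < c) == (ybar1 < c):                   # same region as observed
        boot.append(wmle(yb1, yb2))

boot.sort()
alpha = 0.05
ci = (boot[int(len(boot) * alpha / 2)], boot[int(len(boot) * (1 - alpha / 2))])
```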
BM1 considers bootstrap samples satisfying $\bar Y_1^* \in (a_k, d_k)$, and hence
its poor performance. Unfortunately, $\theta$ is unknown and must be estimated. In (1), if one can replace $\theta$ with an estimator (say $\tilde\theta$) that converges to $\theta$ at a rate faster than $\sqrt{n_1}$, then

$P^*[\sqrt{n_1}(\bar Y_1^* - \bar Y_1) < x \mid \sqrt{n_1}(\bar Y_1^* - \bar Y_1) \in \sqrt{n_1}(a_k - \tilde\theta, d_k - \tilde\theta)] = P[\sqrt{n_1}(\bar Y_1 - \theta) < x \mid \sqrt{n_1}(\bar Y_1 - \theta) \in \sqrt{n_1}(a_k - \theta, d_k - \theta)] + o_P(1). \qquad (2)$

In a single-stage experiment this would not be possible. However, in a two-stage adaptive experiment in which $n_1 = o(n_2)$, such an estimator may exist. Theoretically this is somewhat unsatisfactory, but practically it is interesting, especially since it has been shown that two-stage experiments are optimized when $n_1 = O(\sqrt{n_2})$; see [5] and [6]. It is also common, for practical or logistical reasons, that a small pilot study precedes a much larger follow-up.

Note $\bar Y_1 \mid \bar Y_1 \in T_k \sim TN(\theta, \sigma_k^2/n_1;\, T_k)$, where $TN$ denotes a truncated normal distribution. Suppose an estimator $\tilde\theta$ exists as described and let $\tilde T_k = (a_k - \tilde\varepsilon_1, d_k - \tilde\varepsilon_1)$, where $\tilde\varepsilon_1 = \tilde\theta - \bar Y_1$. Then (2) implies $\{\bar Y_1^* \mid \bar Y_1^* \in \tilde T_k\}$ can be approximated by a $TN(\bar y_1, \sigma_k^2/n_1;\, \tilde T_k)$ and therefore

$E[\bar Y_1^* \mid \bar Y_1^* \in \tilde T_k] \approx \bar Y_1 + b_k(\tilde\theta), \qquad (3)$

$\mathrm{Var}[\bar Y_1^* \mid \bar Y_1^* \in \tilde T_k] \approx \dfrac{\sigma_k^2}{n_1}\left\{1 + \dfrac{\tilde a_k \phi(\tilde a_k) - \tilde d_k \phi(\tilde d_k)}{\tilde U}\right\} - b_k(\tilde\theta)^2,$

where $b_k(\theta) = \sigma_k[\phi(\bar a_k) - \phi(\bar d_k)]/(\sqrt{n_1}\, U)$, $\bar a_k = \sqrt{n_1}(a_k - \theta)/\sigma_k$, $\bar d_k = \sqrt{n_1}(d_k - \theta)/\sigma_k$, $U = \Phi(\bar d_k) - \Phi(\bar a_k)$, and $\phi(\cdot)$ and $\Phi(\cdot)$ represent the probability density function (PDF) and cumulative distribution function (CDF) of the standard normal distribution, respectively; $\tilde a_k$, $\tilde d_k$ and $\tilde U$ denote these quantities evaluated at $\tilde\theta$. Note that a conditional bootstrap leads to an additional bias term in the expectation in equation (3). This must be accounted for in any conditional bootstrap procedure.

4 Adjusted Conditional Bootstrap Methods

The bootstrap methods developed in this section are predicated on the existence of an estimator that converges to $\theta$ at a rate faster than $\sqrt{n_1}$. Provided standard regularity conditions hold and $n_1 = o(n_2)$, such an estimator will always exist in the form of the second-stage MLE conditional on the second-stage design. However, in cases where the bias has an explicit form, it is possible to obtain $\tilde\theta$ by adapting
the bias-reduction method suggested in [8]. Note

$E[\hat\theta(Y_1, Y_2) \mid \bar Y_1 \in S_k] = w_k E[n_1 \bar Y_1/\sigma_r^2 + n_2 \bar Y_2/\sigma_k^2 \mid \bar Y_1 \in S_k]$
$= w_k (n_1/\sigma_r^2 + n_2/\sigma_k^2)\,\theta + w_k n_1 b_k(\theta)/\sigma_r^2$
$= \theta + w_k n_1 b_k(\theta)/\sigma_r^2.$

Let $\bar b_k(\theta) = w_k n_1 b_k(\theta)/\sigma_r^2$. Then a bias-corrected estimate of $\theta$ is $\tilde\theta = \hat\theta - \bar b_k(\tilde\theta)$. The Newton-Raphson method was used to solve this equation. One iteration was sufficient; that is, $\tilde\theta = \hat\theta - \Delta_k(\hat\theta)$, where $\Delta_k(\hat\theta) = \bar b_k(\hat\theta)/[1 + (\partial \bar b_k(\theta)/\partial\theta)|_{\theta=\hat\theta}]$ was used for analytic expressions and numeric calculations.

Now we develop a bootstrap method that adjusts the conditioning region per the discussion in Sect. 3.1. Let $\tilde S_r = (-\infty, c - \tilde\varepsilon_1)$ and $\tilde S_s = (c - \tilde\varepsilon_1, \infty)$, where $\tilde\varepsilon_1 = \tilde\theta - \bar Y_1$.

Adjusted Conditional Bootstrap Method 2 (BM2): Repeat BM1, keeping only bootstrap samples satisfying $\bar y_1^* \in \tilde S_{\Xi(\bar y_1)}$. Let $\hat\theta^{+*} = \hat\theta^* - \bar b_{\Xi(\bar y_1)}(\tilde\theta)$ and use $(C_l^+, C_u^+) = (Q^+_{\alpha/2}, Q^+_{1-\alpha/2})$ as the $(1-\alpha)$ confidence interval, where $Q^+_q$ is the $q$ quantile of the bootstrap sample distribution of $\hat\theta^{+*}(Y_1, Y_2) \mid \bar Y_1^* \in \tilde S_{\Xi(\bar y_1)}$. Note $\hat\theta^{+*}$ is used in place of $\hat\theta^*$ to correct for the additional bias term previously discussed.

A concern in inference when the variance of the MLE depends on the parameter estimates is how sensitive the approximate distribution is to this estimate. Figure 1 (top left) plots the histogram of the simulated distribution of $\{\hat\theta(Y_1, Y_2) \mid \bar Y_1 \in S_s\} - E_s$ (solid line). In the same figure, for simulations that correspond to the 0.05, 0.50 and 0.95 quantiles of $\hat\theta$, histograms of the bootstrap sample distributions of $\{\hat\theta^{+*}(Y_1, Y_2) \mid \bar Y_1^* \in \tilde S_s\}$ (dotted, dashed and dot-dashed lines) are plotted. This figure illustrates how well BM2 works across the domain of $\hat\theta$ at approximating the distribution $\hat\theta(Y_1, Y_2) \mid \bar Y_1 \in S_s$. Compare this to Fig. 1 (top right), which plots the PDF of $N(0, \hat F^{-1})$ for $\hat F$ evaluated at the same quantiles of $\hat\theta$. The bootstrap distribution is less sensitive to the value of $\hat\theta$ than $N(0, \hat F^{-1})$, and it provides a better approximation to the shape of the target distribution.

The fifth row of Table 2 shows that with BM2, $P^*(\bar Y_1^* \in \tilde S_s)$ is nearly equal to $p_s$; $E^*[\hat\theta^{+*} \mid \bar Y_1^* \in \tilde S_s]$ is approximately equal to $E_s$; $\mathrm{Var}^*[\hat\theta^{+*} \mid \bar Y_1^* \in \tilde S_s]$ is only slightly greater than $V_s$; and the confidence interval endpoints correspond to the quantiles of $\hat\theta$. It is clear that BM2 provides a more accurate representation of the conditional distribution of $\hat\theta$ than the alternatives. However, BM2 still fails to provide adequate coverage. This is not a problem with the bootstrap but rather a problem with the distribution of $\hat\theta$ being tightly centered around the wrong value.
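The bias term and one-step Newton-Raphson correction for the region $S_s = (c, \infty)$ can be sketched with the truncated-normal formulas above. This is our sketch, under the same assumed illustrative settings; the function names are ours, the standard normal PDF/CDF come from Python's `statistics.NormalDist`, and we take the first-stage standard deviation to be $\sigma_r$.

```python
import math
from statistics import NormalDist

nd = NormalDist()                                  # standard normal: nd.pdf, nd.cdf
theta_hat_example = 1.3                            # a hypothetical observed MLE
s_r, s_s, n1, n2 = 1.0, 3.0, 50, 50                # illustrative values
c = 1.0 + 1 / math.sqrt(n1)                        # boundary c = theta + 1/sqrt(n1)

def bias_bar_s(th):
    """bbar_s(th) = w_s * n1 * b_s(th) / s_r^2 for the region S_s = (c, inf),
    with b_s(th) from the truncated-normal mean of the first-stage average."""
    a = math.sqrt(n1) * (c - th) / s_r             # standardized left endpoint
    U = 1.0 - nd.cdf(a)                            # P(Ybar1 > c) at th
    b = s_r * nd.pdf(a) / (math.sqrt(n1) * U)      # E[Ybar1 - th | Ybar1 > c]
    w_s = 1.0 / (n1 / s_r**2 + n2 / s_s**2)
    return w_s * n1 * b / s_r**2

def one_step_corrected(th_hat, h=1e-6):
    """One Newton-Raphson step for th_tilde = th_hat - bbar_s(th_tilde)."""
    db = (bias_bar_s(th_hat + h) - bias_bar_s(th_hat - h)) / (2 * h)
    return th_hat - bias_bar_s(th_hat) / (1.0 + db)
```

Since $\bar b_s(\theta) > 0$ on this region, the correction pulls the estimate down, consistent with the positive conditional bias noted above.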
Considering the poor coverage of BM2, we develop a bootstrap procedure for the bias-corrected estimator $\tilde\theta$. Recall $\tilde\theta = \hat\theta - \Delta_k(\hat\theta)$, $k = r, s$, is simply a function of $\hat\theta$. Thus BM2 can be adapted to estimate its distribution. Because the distribution of the bias is skewed, bias-corrected bootstrap confidence intervals were used; see [3].

Bias Adjusted Conditional Bootstrap Method 3 (BM3): Replace $\hat\theta^{+*}$ in BM2 with $\tilde\theta^{+*} = \hat\theta^{+*} - \Delta_{\Xi(\bar y_1)}(\hat\theta^{+*})$. Use $(C_l, C_u) = [\widehat{\mathrm{CDF}}^{*-1}\{\Phi(2v_0 + Z_{\alpha/2})\},\ \widehat{\mathrm{CDF}}^{*-1}\{\Phi(2v_0 + Z_{1-\alpha/2})\}]$, where $\widehat{\mathrm{CDF}}^*(t) = P^*[\tilde\theta^{+*}(Y_1, Y_2) \mid \bar Y_1^* \in \tilde S_{\Xi(\bar y_1)} < t]$ and $v_0 = \Phi^{-1}\{\widehat{\mathrm{CDF}}^*(\tilde\theta)\}$.

Since $\tilde\theta$ is a function of $\hat\theta$, for comparison we approximate the variance of $\tilde\theta$ with $\tilde\sigma^2 = [\{\partial(\hat\theta - \Delta_{\Xi(\bar y_1)}(\hat\theta))/\partial\hat\theta\}^2\, \hat F^{-1}]_{\hat\theta}$.

Figure 1 (bottom left) plots the histogram of the simulated distribution of $\{\tilde\theta(Y_1, Y_2) \mid \bar Y_1 \in S_s\} - E[\tilde\theta(Y_1, Y_2) \mid \bar Y_1 \in S_s]$ (solid line). In the same figure, for simulations that correspond to the 0.05, 0.50 and 0.95 quantiles of $\tilde\theta$, histograms of the bootstrap sample distributions of $\{\tilde\theta^{+*}(Y_1, Y_2) \mid \bar Y_1^* \in \tilde S_s\}$ (dotted, dashed and dot-dashed lines) are plotted. For comparison, Fig. 1 (bottom right) plots the probability density functions of a $N(0, \tilde\sigma^2)$ with $\tilde\sigma^2$ evaluated at the same quantiles of $\tilde\theta$.

Fig. 1 Histograms of bootstrap distributions $\{\hat\theta^{+*}(Y_1, Y_2) \mid \bar Y_1^* \in \tilde S_s\}$ (top left) and $\{\tilde\theta^{+*}(Y_1, Y_2) \mid \bar Y_1^* \in \tilde S_s\}$ (bottom left). Probability density functions of $N(0, \hat F^{-1})$ (top right) and $N(0, \tilde\sigma^2)$ (bottom right). The dotted, dot-dashed and dashed lines correspond to the bootstrap distribution or expected information for the 0.05, 0.50 and 0.95 quantiles of $\hat\theta$ or $\tilde\theta$. In each case the solid line is the histogram of the distribution of $\{\hat\theta(Y_1, Y_2) \mid \bar Y_1 \in S_s\} - E_s$ or $\{\tilde\theta(Y_1, Y_2) \mid \bar Y_1 \in S_s\} - E[\tilde\theta(Y_1, Y_2) \mid \bar Y_1 \in S_s]$.

Once again we see that BM3 well approximates the distribution of $\tilde\theta(Y_1, Y_2) \mid \bar Y_1 \in S_s$ across most of the domain of $\tilde\theta$, perhaps with the exception of the 0.05 quantile.

The sixth row of Table 2 shows the simulation results for the distribution of $\tilde\theta(Y_1, Y_2) \mid \bar Y_1 \in S_s$. The seventh and eighth rows are results for $\tilde\sigma^2$ and BM3, respectively. BM3 provides an unbiased estimate of the mean; slightly overestimates $\mathrm{Var}[\tilde\theta(Y_1, Y_2) \mid \bar Y_1 \in S_s]$; and provides confidence limits which coincide with the correct quantiles of $\tilde\theta$. Coverage is slightly less than the nominal level, but is a significant improvement over inference methods for $\hat\theta$. Note that using $\tilde\sigma^2$ significantly overestimates $\mathrm{Var}[\tilde\theta(Y_1, Y_2) \mid \bar Y_1 \in S_s]$ and slightly skews the confidence limits.

The performance of $\hat\theta^{+*}$ and $\tilde\theta^{+*}$ reflects the bias versus variance tradeoff. Table 3 compares the mean square error (MSE) for the simulated distributions of $\hat\theta$ and $\tilde\theta$ along with the bootstrap MSE of $\hat\theta^{+*}$ and $\tilde\theta^{+*}$ for the case when $\Xi(\bar y_1) = s$. Despite the shortcomings of the procedure for $\hat\theta^{+*}$, it still provides a lower MSE than $\tilde\theta^{+*}$ in this example.

Table 3 Mean square error (MSE) for the true distribution of $\hat\theta$ and $\tilde\theta$ along with the bootstrap MSE of $\hat\theta^{+*}$ or $\tilde\theta^{+*}$ for the case when $\bar Y_1 \in S_s$

Estimate        Method   MSE
$\hat\theta$         True     4.9
$\hat\theta^{+*}$    BM2      4.98
$\tilde\theta$       True     5.5
$\tilde\theta^{+*}$  BM3      5.7

References

1. Baldi Antognini, A., Giovagnoli, A.: On the asymptotic inference for response-adaptive experiments. METRON Int. J. Stat. 49, 9-46 (2006)
2. Dette, H., Bornkamp, B., Bretz, F.: On the efficiency of two-stage response-adaptive designs. Stat. Med. 32 (2013)
3. Efron, B.: Nonparametric standard errors and confidence intervals. Can. J. Stat. 9 (1981)
4. Efron, B., Hinkley, D.V.: Assessing the accuracy of the maximum likelihood estimate: observed versus expected Fisher information. Biometrika 65 (1978)
5. Lane, A., Yao, P., Flournoy, N.: Information in a two-stage adaptive optimal design. J. Stat. Plan. Inference 144 (2014)
6. Pronzato, L., Pázmán, A.: Design of Experiments in Nonlinear Models. Springer, New York (2013)
7. Rosenberger, W.F., Hu, F.: Bootstrap methods for adaptive designs. Stat. Med. 18 (1999)
8. Whitehead, J.: On the bias of maximum likelihood estimation following a sequential test. Biometrika 73 (1986)
More information401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.
401 Review Major topics of the course 1. Univariate analysis 2. Bivariate analysis 3. Simple linear regression 4. Linear algebra 5. Multiple regression analysis Major analysis methods 1. Graphical analysis
More informationEstimation for generalized half logistic distribution based on records
Journal of the Korean Data & Information Science Society 202, 236, 249 257 http://dx.doi.org/0.7465/jkdi.202.23.6.249 한국데이터정보과학회지 Estimation for generalized half logistic distribution based on records
More informationConfidence Intervals in Ridge Regression using Jackknife and Bootstrap Methods
Chapter 4 Confidence Intervals in Ridge Regression using Jackknife and Bootstrap Methods 4.1 Introduction It is now explicable that ridge regression estimator (here we take ordinary ridge estimator (ORE)
More informationRANDOMIZATIONN METHODS THAT
RANDOMIZATIONN METHODS THAT DEPEND ON THE COVARIATES WORK BY ALESSANDRO BALDI ANTOGNINI MAROUSSA ZAGORAIOU ALESSANDRA G GIOVAGNOLI (*) 1 DEPARTMENT OF STATISTICAL SCIENCES UNIVERSITY OF BOLOGNA, ITALY
More informationDiagnostics and Remedial Measures
Diagnostics and Remedial Measures Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Diagnostics and Remedial Measures 1 / 72 Remedial Measures How do we know that the regression
More informationRobert Collins CSE586, PSU Intro to Sampling Methods
Intro to Sampling Methods CSE586 Computer Vision II Penn State Univ Topics to be Covered Monte Carlo Integration Sampling and Expected Values Inverse Transform Sampling (CDF) Ancestral Sampling Rejection
More informationWeb-based Supplementary Material for. Dependence Calibration in Conditional Copulas: A Nonparametric Approach
1 Web-based Supplementary Material for Dependence Calibration in Conditional Copulas: A Nonparametric Approach Elif F. Acar, Radu V. Craiu, and Fang Yao Web Appendix A: Technical Details The score and
More informationLecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2
Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2 Fall, 2013 Page 1 Random Variable and Probability Distribution Discrete random variable Y : Finite possible values {y
More informationA Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,
A Course in Applied Econometrics Lecture 18: Missing Data Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. When Can Missing Data be Ignored? 2. Inverse Probability Weighting 3. Imputation 4. Heckman-Type
More informationECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor Aguirregabiria
ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor guirregabiria SOLUTION TO FINL EXM Monday, pril 14, 2014. From 9:00am-12:00pm (3 hours) INSTRUCTIONS:
More informationUsing R in Undergraduate Probability and Mathematical Statistics Courses. Amy G. Froelich Department of Statistics Iowa State University
Using R in Undergraduate Probability and Mathematical Statistics Courses Amy G. Froelich Department of Statistics Iowa State University Undergraduate Probability and Mathematical Statistics at Iowa State
More informationTesting Independence
Testing Independence Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM 1/50 Testing Independence Previously, we looked at RR = OR = 1
More informationCharacterizing Forecast Uncertainty Prediction Intervals. The estimated AR (and VAR) models generate point forecasts of y t+s, y ˆ
Characterizing Forecast Uncertainty Prediction Intervals The estimated AR (and VAR) models generate point forecasts of y t+s, y ˆ t + s, t. Under our assumptions the point forecasts are asymtotically unbiased
More informationDepartment of Statistical Science FIRST YEAR EXAM - SPRING 2017
Department of Statistical Science Duke University FIRST YEAR EXAM - SPRING 017 Monday May 8th 017, 9:00 AM 1:00 PM NOTES: PLEASE READ CAREFULLY BEFORE BEGINNING EXAM! 1. Do not write solutions on the exam;
More informationNew Bayesian methods for model comparison
Back to the future New Bayesian methods for model comparison Murray Aitkin murray.aitkin@unimelb.edu.au Department of Mathematics and Statistics The University of Melbourne Australia Bayesian Model Comparison
More informationPractice Problems Section Problems
Practice Problems Section 4-4-3 4-4 4-5 4-6 4-7 4-8 4-10 Supplemental Problems 4-1 to 4-9 4-13, 14, 15, 17, 19, 0 4-3, 34, 36, 38 4-47, 49, 5, 54, 55 4-59, 60, 63 4-66, 68, 69, 70, 74 4-79, 81, 84 4-85,
More informationBFF Four: Are we Converging?
BFF Four: Are we Converging? Nancy Reid May 2, 2017 Classical Approaches: A Look Way Back Nature of Probability BFF one to three: a look back Comparisons Are we getting there? BFF Four Harvard, May 2017
More information4 Resampling Methods: The Bootstrap
4 Resampling Methods: The Bootstrap Situation: Let x 1, x 2,..., x n be a SRS of size n taken from a distribution that is unknown. Let θ be a parameter of interest associated with this distribution and
More informationFULL LIKELIHOOD INFERENCES IN THE COX MODEL
October 20, 2007 FULL LIKELIHOOD INFERENCES IN THE COX MODEL BY JIAN-JIAN REN 1 AND MAI ZHOU 2 University of Central Florida and University of Kentucky Abstract We use the empirical likelihood approach
More informationRidge regression. Patrick Breheny. February 8. Penalized regression Ridge regression Bayesian interpretation
Patrick Breheny February 8 Patrick Breheny High-Dimensional Data Analysis (BIOS 7600) 1/27 Introduction Basic idea Standardization Large-scale testing is, of course, a big area and we could keep talking
More informationEstimation MLE-Pandemic data MLE-Financial crisis data Evaluating estimators. Estimation. September 24, STAT 151 Class 6 Slide 1
Estimation September 24, 2018 STAT 151 Class 6 Slide 1 Pandemic data Treatment outcome, X, from n = 100 patients in a pandemic: 1 = recovered and 0 = not recovered 1 1 1 0 0 0 1 1 1 0 0 1 0 1 0 0 1 1 1
More informationCSCI-6971 Lecture Notes: Monte Carlo integration
CSCI-6971 Lecture otes: Monte Carlo integration Kristopher R. Beevers Department of Computer Science Rensselaer Polytechnic Institute beevek@cs.rpi.edu February 21, 2006 1 Overview Consider the following
More informationLecture #11: Classification & Logistic Regression
Lecture #11: Classification & Logistic Regression CS 109A, STAT 121A, AC 209A: Data Science Weiwei Pan, Pavlos Protopapas, Kevin Rader Fall 2016 Harvard University 1 Announcements Midterm: will be graded
More informationEstimation of cumulative distribution function with spline functions
INTERNATIONAL JOURNAL OF ECONOMICS AND STATISTICS Volume 5, 017 Estimation of cumulative distribution function with functions Akhlitdin Nizamitdinov, Aladdin Shamilov Abstract The estimation of the cumulative
More informationBayesian Non-parametric Modeling With Skewed and Heavy-Tailed Data 1
Bayesian Non-parametric Modeling With Skewed and Heavy-tailed Data David Draper (joint work with Milovan Krnjajić, Thanasis Kottas and John Wallerius) Department of Applied Mathematics and Statistics University
More informationInference on distributions and quantiles using a finite-sample Dirichlet process
Dirichlet IDEAL Theory/methods Simulations Inference on distributions and quantiles using a finite-sample Dirichlet process David M. Kaplan University of Missouri Matt Goldman UC San Diego Midwest Econometrics
More informationECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria
ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria SOLUTION TO FINAL EXAM Friday, April 12, 2013. From 9:00-12:00 (3 hours) INSTRUCTIONS:
More information11. Bootstrap Methods
11. Bootstrap Methods c A. Colin Cameron & Pravin K. Trivedi 2006 These transparencies were prepared in 20043. They can be used as an adjunct to Chapter 11 of our subsequent book Microeconometrics: Methods
More informationDouble Bootstrap Confidence Interval Estimates with Censored and Truncated Data
Journal of Modern Applied Statistical Methods Volume 13 Issue 2 Article 22 11-2014 Double Bootstrap Confidence Interval Estimates with Censored and Truncated Data Jayanthi Arasan University Putra Malaysia,
More informationPoint and Interval Estimation for Gaussian Distribution, Based on Progressively Type-II Censored Samples
90 IEEE TRANSACTIONS ON RELIABILITY, VOL. 52, NO. 1, MARCH 2003 Point and Interval Estimation for Gaussian Distribution, Based on Progressively Type-II Censored Samples N. Balakrishnan, N. Kannan, C. T.
More informationStat 5101 Lecture Notes
Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random
More informationConstant Stress Partially Accelerated Life Test Design for Inverted Weibull Distribution with Type-I Censoring
Algorithms Research 013, (): 43-49 DOI: 10.593/j.algorithms.01300.0 Constant Stress Partially Accelerated Life Test Design for Mustafa Kamal *, Shazia Zarrin, Arif-Ul-Islam Department of Statistics & Operations
More informationBias-corrected Estimators of Scalar Skew Normal
Bias-corrected Estimators of Scalar Skew Normal Guoyi Zhang and Rong Liu Abstract One problem of skew normal model is the difficulty in estimating the shape parameter, for which the maximum likelihood
More informationOn the efficiency of two-stage adaptive designs
On the efficiency of two-stage adaptive designs Björn Bornkamp (Novartis Pharma AG) Based on: Dette, H., Bornkamp, B. and Bretz F. (2010): On the efficiency of adaptive designs www.statistik.tu-dortmund.de/sfb823-dp2010.html
More informationBetter Bootstrap Confidence Intervals
by Bradley Efron University of Washington, Department of Statistics April 12, 2012 An example Suppose we wish to make inference on some parameter θ T (F ) (e.g. θ = E F X ), based on data We might suppose
More informationEmpirical Likelihood Inference for Two-Sample Problems
Empirical Likelihood Inference for Two-Sample Problems by Ying Yan A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of Master of Mathematics in Statistics
More informationMath 494: Mathematical Statistics
Math 494: Mathematical Statistics Instructor: Jimin Ding jmding@wustl.edu Department of Mathematics Washington University in St. Louis Class materials are available on course website (www.math.wustl.edu/
More informationSTAT 6350 Analysis of Lifetime Data. Probability Plotting
STAT 6350 Analysis of Lifetime Data Probability Plotting Purpose of Probability Plots Probability plots are an important tool for analyzing data and have been particular popular in the analysis of life
More informationIEOR E4703: Monte-Carlo Simulation
IEOR E4703: Monte-Carlo Simulation Output Analysis for Monte-Carlo Martin Haugh Department of Industrial Engineering and Operations Research Columbia University Email: martin.b.haugh@gmail.com Output Analysis
More informationTargeted Group Sequential Adaptive Designs
Targeted Group Sequential Adaptive Designs Mark van der Laan Department of Biostatistics, University of California, Berkeley School of Public Health Liver Forum, May 10, 2017 Targeted Group Sequential
More informationSome Properties of the Randomized Play the Winner Rule
Journal of Statistical Theory and Applications Volume 11, Number 1, 2012, pp. 1-8 ISSN 1538-7887 Some Properties of the Randomized Play the Winner Rule David Tolusso 1,2, and Xikui Wang 3 1 Institute for
More informationCOMPETING RISKS WEIBULL MODEL: PARAMETER ESTIMATES AND THEIR ACCURACY
Annales Univ Sci Budapest, Sect Comp 45 2016) 45 55 COMPETING RISKS WEIBULL MODEL: PARAMETER ESTIMATES AND THEIR ACCURACY Ágnes M Kovács Budapest, Hungary) Howard M Taylor Newark, DE, USA) Communicated
More information1 (19) Hannele Holttinen, Jussi Ikäheimo
Hannele Holttinen, Jussi Ikäheimo 1 (19) Wind predictions and bids in Denmark Wind power producers need a prediction method to bid their energy to day-ahead market. They pay balancing costs for the energy
More informationOpen Problems in Mixed Models
xxiii Determining how to deal with a not positive definite covariance matrix of random effects, D during maximum likelihood estimation algorithms. Several strategies are discussed in Section 2.15. For
More informationQuiz 1. Name: Instructions: Closed book, notes, and no electronic devices.
Quiz 1. Name: Instructions: Closed book, notes, and no electronic devices. 1.(10) What is usually true about a parameter of a model? A. It is a known number B. It is determined by the data C. It is an
More informationOn Sarhan-Balakrishnan Bivariate Distribution
J. Stat. Appl. Pro. 1, No. 3, 163-17 (212) 163 Journal of Statistics Applications & Probability An International Journal c 212 NSP On Sarhan-Balakrishnan Bivariate Distribution D. Kundu 1, A. Sarhan 2
More informationMA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7
MA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7 1 Random Vectors Let a 0 and y be n 1 vectors, and let A be an n n matrix. Here, a 0 and A are non-random, whereas y is
More informationConditional Inference by Estimation of a Marginal Distribution
Conditional Inference by Estimation of a Marginal Distribution Thomas J. DiCiccio and G. Alastair Young 1 Introduction Conditional inference has been, since the seminal work of Fisher (1934), a fundamental
More informationRecursive Estimation
Recursive Estimation Raffaello D Andrea Spring 08 Problem Set : Bayes Theorem and Bayesian Tracking Last updated: March, 08 Notes: Notation: Unless otherwise noted, x, y, and z denote random variables,
More information