Background. GLM with clustered data. The problem. Solutions. A fixed effects approach


GLM with clustered data: A fixed effects approach

Göran Broström
Department of Statistics, Umeå University, SE Umeå, Sweden

Background

Poisson or binomial data with the following properties: a large data set, partitioned into many relatively small groups, where members within groups have something in common.

The problem

The number of parameters tends to increase with sample size. This fact causes the standard assumptions underlying asymptotic results to be violated.

Solutions

There are (at least) two possible solutions to the problem:
1. a random intercepts model, and
2. a fixed effects model, with asymptotics replaced by simulation.

Packages in R

The package Matrix has lmer, the MASS package has glmmPQL, Jim Lindsey's repeated package has glmm, and Myles and Clayton's GLMMGibbs fits mixed models by Gibbs sampling. Adding to that are glmmML and glmmboot in the package glmmML.

Data structure

There are $n$ clusters of sizes $n_i$, $i = 1,\ldots,n$. For each cluster $i$, $i = 1,\ldots,n$, observe responses $(y_{i1},\ldots,y_{in_i})$ and vectors of explanatory variables $(x_{i1},\ldots,x_{in_i})$, where the $x_{ij}$ are $p$-dimensional vectors with the first element identically equal to unity, corresponding to the mean value of the random intercepts. The random parts $u_i$ of the intercepts are normal with mean zero and variance $\sigma^2$, and it is assumed that $u_1,\ldots,u_n$ are independent.

The conditional distribution

Given the random intercepts $\beta_1 + u_i$, $i = 1,\ldots,n$:

\[ \Pr(Y_{ij} = y_{ij} \mid u_i; x) = P(\beta x_{ij} + u_i, y_{ij}), \quad y_{ij} = 0, 1, \ldots;\ j = 1,\ldots,n_i;\ i = 1,\ldots,n. \]

Bernoulli distribution, logit link:

\[ P(x, y) = \frac{e^{xy}}{1 + e^{x}}, \quad y = 0, 1;\ -\infty < x < \infty. \]

Bernoulli distribution, cloglog link:

\[ P(x, y) = \bigl(1 - \exp(-e^{x})\bigr)^{y} \exp\bigl(-(1 - y) e^{x}\bigr), \quad y = 0, 1;\ -\infty < x < \infty. \]

Poisson distribution with log link:

\[ P(x, y) = \frac{e^{xy}}{y!}\, e^{-e^{x}}, \quad y = 0, 1, 2, \ldots;\ -\infty < x < \infty. \]

Likelihood function

In the fixed effects model (and in the conditional random effects model), the likelihood function is

\[ L\bigl((\beta,\gamma); y, x\bigr) = \prod_{i=1}^{n} \prod_{j=1}^{n_i} P(\beta x_{ij} + \gamma_i, y_{ij}). \]

The log likelihood function is

\[ l\bigl((\beta,\gamma); y, x\bigr) = \sum_{i=1}^{n} \sum_{j=1}^{n_i} \log P(\beta x_{ij} + \gamma_i, y_{ij}). \]
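As a minimal sketch, the three probability functions $P(x, y)$ and the fixed effects log likelihood above can be written directly in base R. The function names (P_logit, P_cloglog, P_poisson, loglik) and the toy data are illustrative assumptions and are not taken from the glmmML package.

```r
# Probability functions P(x, y); x is the linear predictor, y the response.
P_logit   <- function(x, y) exp(x * y) / (1 + exp(x))
P_cloglog <- function(x, y) (1 - exp(-exp(x)))^y * exp(-(1 - y) * exp(x))
P_poisson <- function(x, y) exp(x * y) / factorial(y) * exp(-exp(x))

# Fixed effects log likelihood: sum over all observations of
# log P(beta x_ij + gamma_i, y_ij).
# X is an N x p matrix, cluster an integer vector with values in 1..n.
loglik <- function(beta, gamma, y, X, cluster, P) {
  eta <- drop(X %*% beta) + gamma[cluster]
  sum(log(P(eta, y)))
}

# Toy data (hypothetical): 3 clusters of size 4, one covariate.
set.seed(1)
cluster <- rep(1:3, each = 4)
X <- cbind(rnorm(12))
y <- rpois(12, 2)
gam <- c(0.1, 0.5, 0.9)
ll <- loglik(beta = 0.3, gamma = gam, y, X, cluster, P_poisson)

# The same quantity from R's built-in Poisson density, for comparison.
ll_ref <- sum(dpois(y, exp(0.3 * X[, 1] + gam[cluster]), log = TRUE))
```

The two values ll and ll_ref should agree up to rounding, and P_logit(x, 1) coincides with plogis(x).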

Tests of cluster effect

Testing is performed via a simple bootstrap (glmmboot). Under the null hypothesis of no grouping effect, the grouping factor can be randomly permuted without changing the probability distribution (the conditional approach). The alternative is a parametric bootstrap: simulate observations from the fitted model under the null hypothesis (the unconditional approach).

Computational aspects

A profiling approach reduces an optimization problem in high dimensions to a problem consisting of solving several one-variable equations, followed by optimization in low dimensions.

The score vector

The partial derivatives with respect to $\beta_m$, $m = 1,\ldots,p$, of the log likelihood function are

\[ U_m(\beta,\gamma) = \frac{\partial}{\partial\beta_m}\, l\bigl((\beta,\gamma); y, x\bigr) = \sum_{i=1}^{n} \sum_{j=1}^{n_i} x_{ijm}\, G(\beta x_{ij} + \gamma_i, y_{ij}), \quad m = 1,\ldots,p. \]

Cluster components of the score

The partial derivatives with respect to $\gamma_i$, $i = 1,\ldots,n$, are

\[ U_{p+i}(\beta,\gamma) = \frac{\partial}{\partial\gamma_i}\, l\bigl((\beta,\gamma); y, x\bigr) = \sum_{j=1}^{n_i} G(\beta x_{ij} + \gamma_i, y_{ij}), \quad i = 1,\ldots,n, \]

where

\[ G(x, y) = \frac{\partial}{\partial x} \log P(x, y) = \frac{\frac{\partial}{\partial x} P(x, y)}{P(x, y)}. \]
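The score formulas above can be checked numerically. The sketch below uses the Poisson case, for which differentiating $\log P$ gives $G(x, y) = y - e^{x}$, and compares the analytic score with a central-difference gradient of the log likelihood. The function names, the parameter layout (theta holds $\beta$ first, then $\gamma$) and the toy data are assumptions made for illustration; the covariate matrix here has no intercept column, since the $\gamma_i$ play that role.

```r
P_poisson <- function(x, y) exp(x * y) / factorial(y) * exp(-exp(x))
G_poisson <- function(x, y) y - exp(x)   # d/dx log P(x, y) for the Poisson case

# theta = c(beta_1..beta_p, gamma_1..gamma_n)
loglik <- function(theta, y, X, cluster, P) {
  p <- ncol(X)
  eta <- drop(X %*% theta[1:p]) + theta[-(1:p)][cluster]
  sum(log(P(eta, y)))
}

# Score vector: U_1..U_p (beta part) followed by U_{p+1}..U_{p+n} (gamma part).
score <- function(theta, y, X, cluster, G) {
  p <- ncol(X)
  n <- length(theta) - p
  g <- G(drop(X %*% theta[1:p]) + theta[-(1:p)][cluster], y)
  c(drop(crossprod(X, g)),
    as.vector(tapply(g, factor(cluster, levels = seq_len(n)), sum)))
}

# Toy data (hypothetical): 4 clusters of size 5, two covariates.
set.seed(1)
cluster <- rep(1:4, each = 5)
X <- cbind(rnorm(20), rnorm(20))
y <- rpois(20, 2)
theta <- c(0.2, -0.1, 0.5, 0.3, 0.8, 0.6)

# Central-difference gradient of the log likelihood.
num_grad <- sapply(seq_along(theta), function(k) {
  h <- 1e-6
  e <- replace(numeric(length(theta)), k, h)
  (loglik(theta + e, y, X, cluster, P_poisson) -
     loglik(theta - e, y, X, cluster, P_poisson)) / (2 * h)
})
max_diff <- max(abs(num_grad - score(theta, y, X, cluster, G_poisson)))
```

max_diff should be small, limited only by the finite-difference error.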

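With profiling

Setting $U_{p+i}(\beta,\gamma) = 0$ defines $\gamma$ implicitly as functions of $\beta$, $\gamma_i = \gamma_i(\beta)$, $i = 1,\ldots,n$:

\[ F\bigl(\beta, \gamma_i(\beta)\bigr) = \sum_{j=1}^{n_i} G\bigl(\beta x_{ij} + \gamma_i(\beta), y_{ij}\bigr) = 0, \quad i = 1,\ldots,n. \]

From

\[ \frac{d}{d\beta_m} F\bigl(\beta, \gamma_i(\beta)\bigr) = \frac{\partial\gamma_i}{\partial\beta_m} \frac{\partial F}{\partial\gamma_i} + \frac{\partial F}{\partial\beta_m} = 0 \]

we get

\[ \frac{\partial\gamma_i(\beta)}{\partial\beta_m} = -\frac{\partial F / \partial\beta_m}{\partial F / \partial\gamma_i} = -\frac{\sum_{j=1}^{n_i} x_{ijm}\, H(\beta x_{ij} + \gamma_i, y_{ij})}{\sum_{j=1}^{n_i} H(\beta x_{ij} + \gamma_i, y_{ij})}, \quad i = 1,\ldots,n;\ m = 1,\ldots,p, \]

where $H(x, y) = \frac{\partial}{\partial x} G(x, y)$. This derivative is needed when calculating the score corresponding to the profile likelihood.

Profile loglihood

Replacing $\gamma$ by $\gamma(\beta)$ gives the profile log likelihood $l^{(P)}$:

\[ l^{(P)}(\beta; y, x) = \sum_{i=1}^{n} \sum_{j=1}^{n_i} \log P\bigl(\beta x_{ij} + \gamma_i(\beta), y_{ij}\bigr), \]

as a function of $\beta$ alone.

Profile score

The partial derivatives with respect to $\beta_m$, $m = 1,\ldots,p$, of the profile log likelihood function become

\[ U^{(P)}_m(\beta) = \frac{\partial}{\partial\beta_m}\, l^{(P)}(\beta; y, x) = \sum_{i=1}^{n} \sum_{j=1}^{n_i} \Bigl( x_{ijm} + \frac{\partial\gamma_i(\beta)}{\partial\beta_m} \Bigr) G\bigl(\beta x_{ij} + \gamma_i(\beta), y_{ij}\bigr) \]
\[ = U_m\bigl(\beta, \gamma(\beta)\bigr) + \sum_{i=1}^{n} \frac{\partial\gamma_i(\beta)}{\partial\beta_m} \sum_{j=1}^{n_i} G\bigl(\beta x_{ij} + \gamma_i(\beta), y_{ij}\bigr) = U_m\bigl(\beta, \gamma(\beta)\bigr), \]

because the inner sum over $j$ is zero by the definition of $\gamma_i(\beta)$. Thus we get back the unprofiled partial derivatives.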
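The implicit definition of $\gamma_i(\beta)$ and the formula for its derivative can be illustrated for a single cluster. The sketch below uses the logit case, where $G(x, y) = y - P(x, 1)$ and hence $H(x, y) = -P(x, 1)\{1 - P(x, 1)\}$; it solves the cluster equation with base R's uniroot and compares the analytic derivative with a finite difference. The names gamma_i, Fi and the toy data are illustrative assumptions.

```r
G <- function(x, y) y - plogis(x)                  # logit case
H <- function(x, y) -plogis(x) * (1 - plogis(x))   # dG/dx

# Solve sum_j G(beta x_ij + gamma_i, y_ij) = 0 for gamma_i (one cluster).
gamma_i <- function(beta, yi, Xi) {
  Fi <- function(g) sum(G(drop(Xi %*% beta) + g, yi))
  uniroot(Fi, interval = c(-30, 30), tol = 1e-12)$root
}

# Toy cluster (hypothetical): 8 observations, two covariates.
set.seed(2)
Xi <- cbind(rnorm(8), rnorm(8))
yi <- c(1, 0, 1, 1, 0, 0, 1, 0)
beta <- c(0.5, -0.3)

g0 <- gamma_i(beta, yi, Xi)
eta <- drop(Xi %*% beta) + g0

# Analytic derivative: -sum_j x_ijm H_ij / sum_j H_ij, for m = 1, 2.
analytic <- -colSums(Xi * H(eta, yi)) / sum(H(eta, yi))

# Central-difference derivative of gamma_i(beta) for comparison.
numerical <- sapply(1:2, function(m) {
  h <- 1e-5
  e <- replace(numeric(2), m, h)
  (gamma_i(beta + e, yi, Xi) - gamma_i(beta - e, yi, Xi)) / (2 * h)
})
```

The vectors analytic and numerical should agree to several decimal places.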

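Profile hessian

With $I_{ms}(\beta,\gamma) = -\partial^2 l / \partial\beta_m \partial\beta_s$ denoting the corresponding element from the full likelihood,

\[ I^{(P)}_{ms}(\beta) = -\frac{\partial}{\partial\beta_s} U_m\bigl(\beta, \gamma(\beta)\bigr) = -\sum_{i=1}^{n} \sum_{j=1}^{n_i} x_{ijm} \Bigl( x_{ijs} + \frac{\partial\gamma_i(\beta)}{\partial\beta_s} \Bigr) H\bigl(\beta x_{ij} + \gamma_i(\beta), y_{ij}\bigr) \]
\[ = I_{ms}\bigl(\beta, \gamma(\beta)\bigr) + \sum_{i=1}^{n} \frac{\bigl(\sum_{j=1}^{n_i} x_{ijm} H_{ij}\bigr) \bigl(\sum_{j=1}^{n_i} x_{ijs} H_{ij}\bigr)}{\sum_{j=1}^{n_i} H_{ij}}, \quad m, s = 1,\ldots,p, \]

where $H_{ij} = H\bigl(\beta x_{ij} + \gamma_i(\beta), y_{ij}\bigr)$, $j = 1,\ldots,n_i$; $i = 1,\ldots,n$.

At the maximum

Justifying the use of the profile likelihood:

Theorem 1 (Patefield). The inverse hessians from the full likelihood and from the profile likelihood for $\beta$ are equal when $(\gamma, \beta) = (\hat\gamma, \hat\beta)$.

Preparation for R

\[ l^{(P)}(\beta) = \sum_{i=1}^{n} \sum_{j=1}^{n_i} \log P\bigl(\beta x_{ij} + \gamma_i(\beta), y_{ij}\bigr), \]
\[ U^{(P)}_m(\beta) = \sum_{i=1}^{n} \sum_{j=1}^{n_i} x_{ijm}\, G\bigl(\beta x_{ij} + \gamma_i(\beta), y_{ij}\bigr), \quad m = 1,\ldots,p. \]

For fixed $\beta$, $\gamma_i(\beta)$ is found by solving

\[ \sum_{j=1}^{n_i} G(\beta x_{ij} + \gamma_i, y_{ij}) = 0 \]

with respect to $\gamma_i$, $i = 1,\ldots,n$. The maximization is performed by optim, via the C function vmmin, available as an entry point in the C code of R.

Implementation in R

Implemented in the package glmmML in R. It covers three cases:
1. Binomial with logit link,
2. Binomial with cloglog link,
3. Poisson with log link.

The function is glmmboot. Testing of cluster effect is done by simulation (a simple form of bootstrapping), conditionally or unconditionally.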
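The recipe above (solve for $\gamma_i(\beta)$ at fixed $\beta$, then maximize over $\beta$ with optim) can be sketched in plain R for the logit case with a single covariate. This is an illustration under stated assumptions, not the glmmML implementation: the simulated data, the function names and the wide root-finding interval are all hypothetical choices. The "BFGS" method of optim is the one backed by vmmin. Clusters with all-zero or all-one responses are dropped first, because their $\gamma_i$ would be infinite.

```r
# Simulated data (hypothetical): 40 clusters of size 10, true slope 0.7.
set.seed(3)
n <- 40; ni <- 10
cluster <- rep(seq_len(n), each = ni)
x <- rnorm(n * ni)
u <- rnorm(n, sd = 0.5)
y <- rbinom(n * ni, 1, plogis(0.7 * x + u[cluster]))

# Drop clusters whose responses are all 0 or all 1.
s <- tapply(y, cluster, sum)
keep <- cluster %in% as.integer(names(s)[s > 0 & s < ni])
y <- y[keep]; x <- x[keep]; cluster <- cluster[keep]
idx <- split(seq_along(y), cluster)

# gamma_i(beta): one root-finding problem per cluster.
# The interval is wide so that large trial values of beta remain bracketed.
gamma_of_beta <- function(beta) {
  sapply(idx, function(j) {
    uniroot(function(g) sum(y[j] - plogis(beta * x[j] + g)),
            interval = c(-1000, 1000), tol = 1e-10)$root
  })
}
eta_of_beta <- function(beta) {
  beta * x + gamma_of_beta(beta)[as.character(cluster)]
}

# Negative profile log likelihood and its gradient (the unprofiled score).
neg_lp <- function(beta) {
  eta <- eta_of_beta(beta)
  -sum(plogis((2 * y - 1) * eta, log.p = TRUE))
}
neg_score <- function(beta) {
  eta <- eta_of_beta(beta)
  -sum(x * (y - plogis(eta)))
}

fit <- optim(0, neg_lp, neg_score, method = "BFGS",
             control = list(reltol = 1e-12))

# Reference: the same fixed effects model fitted with one dummy per cluster.
ref <- glm(y ~ x + factor(cluster), family = binomial)
est <- c(profile = fit$par, glm = unname(coef(ref)["x"]))
```

Both entries of est estimate the same slope, so they should agree closely; the profiled problem has one free parameter, whereas glm estimates one coefficient per cluster as well.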

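Binomial with logit link

\[ P(x, y) = \frac{\exp(xy)}{1 + \exp(x)}, \qquad G(x, y) = y - P(x, 1). \]

We get $(\gamma_1,\ldots,\gamma_n)$ by solving the equations

\[ \sum_{j=1}^{n_i} y_{ij} = \sum_{j=1}^{n_i} \frac{\exp(\beta x_{ij} + \gamma_i)}{1 + \exp(\beta x_{ij} + \gamma_i)} \]

for $i = 1,\ldots,n$ (using the C version of uniroot).

Special cases: $\sum_j y_{ij} = 0$ or $n_i$, giving $\gamma_i = -\infty$ or $+\infty$, respectively. The corresponding cluster can be thrown out. (Should be used in glm?)

Binomial with cloglog link

\[ P(x, y) = \bigl(1 - \exp(-\exp(x))\bigr)^{y} \exp\bigl(-(1 - y)\exp(x)\bigr), \qquad G(x, y) = \frac{\exp(x)}{P(x, 1)} \{ y - P(x, 1) \}. \]

We get $(\gamma_1,\ldots,\gamma_n)$ by solving the equations

\[ \sum_{j=1}^{n_i} \frac{\exp(\beta x_{ij} + \gamma_i)}{P(\beta x_{ij} + \gamma_i, 1)} \bigl\{ y_{ij} - P(\beta x_{ij} + \gamma_i, 1) \bigr\} = 0 \]

for $i = 1,\ldots,n$ (using the C version of uniroot).

Special cases: $\sum_j y_{ij} = 0$ or $n_i$, giving $\gamma_i = -\infty$ or $+\infty$, respectively. The corresponding cluster can be thrown out.

Poisson with log link

\[ P(x, y) = \frac{e^{xy}}{y!} \exp(-\exp(x)), \qquad G(x, y) = y - e^{x}. \]

We get $(\gamma_1,\ldots,\gamma_n)$ from

\[ \sum_{j=1}^{n_i} y_{ij} = e^{\gamma_i} \sum_{j=1}^{n_i} \exp(\beta x_{ij}), \quad i = 1,\ldots,n, \]

giving

\[ \gamma_i = \log \Bigl\{ \frac{\sum_j y_{ij}}{\sum_j \exp(\beta x_{ij})} \Bigr\}, \quad i = 1,\ldots,n. \]

Special case: $\sum_j y_{ij} = 0$, giving $\gamma_i = -\infty$.

Simulation

Model:

\[ P(Y_{ij} = 1 \mid \gamma_i) = 1 - P(Y_{ij} = 0 \mid \gamma_i) = \frac{e^{\gamma_i}}{1 + e^{\gamma_i}}, \quad j = 1,\ldots,5;\ i = 1,\ldots,n, \]

where $\gamma_1,\ldots,\gamma_n$ are iid $N(0, \sigma)$.

Hypothesis: $\sigma = 0$.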
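In the Poisson case no root finding is needed, because $\gamma_i(\beta)$ has the closed form given above. The following sketch profiles out the cluster effects with that formula and maximizes over a scalar $\beta$; the simulated data and the use of optimize for the one-dimensional search are assumptions made for illustration, not a description of glmmboot.

```r
# Simulated data (hypothetical): 30 clusters of size 6, true slope 0.4.
set.seed(4)
n <- 30; ni <- 6
cluster <- rep(seq_len(n), each = ni)
x <- rnorm(n * ni)
y <- rpois(n * ni, exp(0.4 * x + rnorm(n, sd = 0.5)[cluster] + 2))

# Closed form: gamma_i = log( sum_j y_ij / sum_j exp(beta x_ij) ).
gamma_poisson <- function(beta) {
  log(tapply(y, cluster, sum) / tapply(exp(beta * x), cluster, sum))
}

# Negative profile log likelihood as a function of beta alone.
neg_lp <- function(beta) {
  eta <- beta * x + gamma_poisson(beta)[as.character(cluster)]
  -sum(dpois(y, exp(eta), log = TRUE))
}

fit <- optimize(neg_lp, interval = c(-5, 5), tol = 1e-8)
beta_hat <- fit$minimum

# Cluster score equations at the solution: sum_j (y_ij - exp(eta_ij)) = 0.
eta_hat <- beta_hat * x + gamma_poisson(beta_hat)[as.character(cluster)]
cluster_score <- tapply(y - exp(eta_hat), cluster, sum)

# Reference fit with one dummy variable per cluster.
ref <- glm(y ~ x + factor(cluster), family = poisson)
beta_glm <- unname(coef(ref)["x"])
```

beta_hat and beta_glm estimate the same parameter and should agree closely, while cluster_score should be zero up to rounding for every cluster.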

Simulation specifications

σ = 0, 0.5.
n = 5, 50, 500.
Four methods: glmmboot (unconditional and conditional), glmmML, glm (naively?).

[Figure: Null model (σ = 0); 5 clusters. One panel per method, including "glmmboot, conditional"; axis label F(p).]

[Figure: Null model (σ = 0); 50 clusters. One panel per method, including "glmmboot, conditional"; axis label F(p).]

[Figure: Null model (σ = 0); 500 clusters. One panel per method, including "glmmboot, conditional"; axis label F(p).]
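One replicate of the simulation design, together with a conditional (permutation) test of the cluster effect, might look like the sketch below. Several choices here are assumptions rather than facts from the slides: σ is read as a standard deviation, the test statistic is a likelihood ratio between the cluster-saturated and the pooled Bernoulli models, and all names are hypothetical. The statistic actually used by glmmboot may differ.

```r
# One data set from the simulation model: 50 clusters of size 5, sigma = 0.5.
set.seed(5)
n <- 50; ni <- 5; sigma <- 0.5          # sigma treated as an sd (assumption)
cluster <- rep(seq_len(n), each = ni)
gamma <- rnorm(n, mean = 0, sd = sigma)
y <- rbinom(n * ni, 1, plogis(gamma[cluster]))

# a * log(a / b), with the convention 0 * log(0) = 0.
xlogx <- function(a, b) ifelse(a == 0, 0, a * log(a / b))

# Likelihood ratio statistic for a cluster effect in the intercept-only model.
lr_stat <- function(y, cluster) {
  s <- tapply(y, cluster, sum)      # successes per cluster
  m <- tapply(y, cluster, length)   # cluster sizes
  full <- sum(xlogx(s, m) + xlogx(m - s, m))
  S <- sum(y); N <- length(y)
  null <- xlogx(S, N) + xlogx(N - S, N)
  2 * (full - null)
}

# Conditional approach: permute the grouping factor and recompute.
obs <- lr_stat(y, cluster)
boot <- 2000
perm <- replicate(boot, lr_stat(y, sample(cluster)))
p_value <- mean(perm >= obs)
```

Repeating this over many simulated data sets and plotting the empirical distribution function of p_value would give curves of the kind summarized in the figures: under σ = 0 the p-values should be roughly uniform.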

[Figure: Clustering (σ = 0.5); 5 clusters. One panel per method, including "glmmboot, conditional"; axis label F(p).]

[Figure: Clustering (σ = 0.5); 50 clusters. One panel per method, including "glmmboot, conditional"; axis label F(p).]

[Figure: Clustering (σ = 0.5); 500 clusters. One panel per method, including "glmmboot, conditional"; axis label F(p).]

Timings, 5 clusters

> system.time(glmmboot(y ~ 1, cluster = cluster, data = timing, conditional = FALSE, boot = 2000))
[1]
> system.time(glmmboot(y ~ 1, cluster = cluster, data = timing, conditional = TRUE, boot = 2000))
[1]
> system.time(glmmML(y ~ 1, cluster = cluster, data = timing))
[1]
> system.time(glm(y ~ factor(cluster), data = timing, family = binomial))
[1]

Timings, 50 clusters

> system.time(glmmboot(y ~ 1, cluster = cluster, data = timing, conditional = FALSE, boot = 2000))
[1]
> system.time(glmmboot(y ~ 1, cluster = cluster, data = timing, conditional = TRUE, boot = 2000))
[1]
> system.time(glmmML(y ~ 1, cluster = cluster, data = timing))
[1]
> system.time(glm(y ~ factor(cluster), data = timing, family = binomial))
[1]

Timings, 500 clusters

> system.time(glmmboot(y ~ 1, cluster = cluster, data = timing, conditional = FALSE, boot = 2000))
[1]
> system.time(glmmboot(y ~ 1, cluster = cluster, data = timing, conditional = TRUE, boot = 2000))
[1]
> system.time(glmmML(y ~ 1, cluster = cluster, data = timing))
[1]
> system.time(glm(y ~ factor(cluster), data = timing, family = binomial))
[1]

glm vs. glmmboot(boot = 0)

Execution times (table columns: No. of clusters, glm, glmmboot).

Conclusion: Profiling is numerically very efficient.

Outline for today. Maximum likelihood estimation. Computation with multivariate normal distributions. Multivariate normal distribution

Outline for today. Maximum likelihood estimation. Computation with multivariate normal distributions. Multivariate normal distribution Outline for today Maximum likelihood estimation Rasmus Waageetersen Deartment of Mathematics Aalborg University Denmark October 30, 2007 the multivariate normal distribution linear and linear mixed models

More information

4. Score normalization technical details We now discuss the technical details of the score normalization method.

4. Score normalization technical details We now discuss the technical details of the score normalization method. SMT SCORING SYSTEM This document describes the scoring system for the Stanford Math Tournament We begin by giving an overview of the changes to scoring and a non-technical descrition of the scoring rules

More information

Estimating function analysis for a class of Tweedie regression models

Estimating function analysis for a class of Tweedie regression models Title Estimating function analysis for a class of Tweedie regression models Author Wagner Hugo Bonat Deartamento de Estatística - DEST, Laboratório de Estatística e Geoinformação - LEG, Universidade Federal

More information

Approximating min-max k-clustering

Approximating min-max k-clustering Aroximating min-max k-clustering Asaf Levin July 24, 2007 Abstract We consider the roblems of set artitioning into k clusters with minimum total cost and minimum of the maximum cost of a cluster. The cost

More information

System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests

System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests 009 American Control Conference Hyatt Regency Riverfront, St. Louis, MO, USA June 0-, 009 FrB4. System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests James C. Sall Abstract

More information

On split sample and randomized confidence intervals for binomial proportions

On split sample and randomized confidence intervals for binomial proportions On slit samle and randomized confidence intervals for binomial roortions Måns Thulin Deartment of Mathematics, Usala University arxiv:1402.6536v1 [stat.me] 26 Feb 2014 Abstract Slit samle methods have

More information

General Linear Model Introduction, Classes of Linear models and Estimation

General Linear Model Introduction, Classes of Linear models and Estimation Stat 740 General Linear Model Introduction, Classes of Linear models and Estimation An aim of scientific enquiry: To describe or to discover relationshis among events (variables) in the controlled (laboratory)

More information

Chapter 3. GMM: Selected Topics

Chapter 3. GMM: Selected Topics Chater 3. GMM: Selected oics Contents Otimal Instruments. he issue of interest..............................2 Otimal Instruments under the i:i:d: assumtion..............2. he basic result............................2.2

More information

Introduction to Probability and Statistics

Introduction to Probability and Statistics Introduction to Probability and Statistics Chater 8 Ammar M. Sarhan, asarhan@mathstat.dal.ca Deartment of Mathematics and Statistics, Dalhousie University Fall Semester 28 Chater 8 Tests of Hyotheses Based

More information

Uniform Law on the Unit Sphere of a Banach Space

Uniform Law on the Unit Sphere of a Banach Space Uniform Law on the Unit Shere of a Banach Sace by Bernard Beauzamy Société de Calcul Mathématique SA Faubourg Saint Honoré 75008 Paris France Setember 008 Abstract We investigate the construction of a

More information

STK4900/ Lecture 7. Program

STK4900/ Lecture 7. Program STK4900/9900 - Lecture 7 Program 1. Logistic regression with one redictor 2. Maximum likelihood estimation 3. Logistic regression with several redictors 4. Deviance and likelihood ratio tests 5. A comment

More information

MODELING THE RELIABILITY OF C4ISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL

MODELING THE RELIABILITY OF C4ISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL Technical Sciences and Alied Mathematics MODELING THE RELIABILITY OF CISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL Cezar VASILESCU Regional Deartment of Defense Resources Management

More information

Estimation of the large covariance matrix with two-step monotone missing data

Estimation of the large covariance matrix with two-step monotone missing data Estimation of the large covariance matrix with two-ste monotone missing data Masashi Hyodo, Nobumichi Shutoh 2, Takashi Seo, and Tatjana Pavlenko 3 Deartment of Mathematical Information Science, Tokyo

More information

1 Extremum Estimators

1 Extremum Estimators FINC 9311-21 Financial Econometrics Handout Jialin Yu 1 Extremum Estimators Let θ 0 be a vector of k 1 unknown arameters. Extremum estimators: estimators obtained by maximizing or minimizing some objective

More information

Statistics II Logistic Regression. So far... Two-way repeated measures ANOVA: an example. RM-ANOVA example: the data after log transform

Statistics II Logistic Regression. So far... Two-way repeated measures ANOVA: an example. RM-ANOVA example: the data after log transform Statistics II Logistic Regression Çağrı Çöltekin Exam date & time: June 21, 10:00 13:00 (The same day/time lanned at the beginning of the semester) University of Groningen, Det of Information Science May

More information

Tests for Two Proportions in a Stratified Design (Cochran/Mantel-Haenszel Test)

Tests for Two Proportions in a Stratified Design (Cochran/Mantel-Haenszel Test) Chater 225 Tests for Two Proortions in a Stratified Design (Cochran/Mantel-Haenszel Test) Introduction In a stratified design, the subects are selected from two or more strata which are formed from imortant

More information

CHAPTER 5 STATISTICAL INFERENCE. 1.0 Hypothesis Testing. 2.0 Decision Errors. 3.0 How a Hypothesis is Tested. 4.0 Test for Goodness of Fit

CHAPTER 5 STATISTICAL INFERENCE. 1.0 Hypothesis Testing. 2.0 Decision Errors. 3.0 How a Hypothesis is Tested. 4.0 Test for Goodness of Fit Chater 5 Statistical Inference 69 CHAPTER 5 STATISTICAL INFERENCE.0 Hyothesis Testing.0 Decision Errors 3.0 How a Hyothesis is Tested 4.0 Test for Goodness of Fit 5.0 Inferences about Two Means It ain't

More information

Notes on Instrumental Variables Methods

Notes on Instrumental Variables Methods Notes on Instrumental Variables Methods Michele Pellizzari IGIER-Bocconi, IZA and frdb 1 The Instrumental Variable Estimator Instrumental variable estimation is the classical solution to the roblem of

More information

A Comparison between Biased and Unbiased Estimators in Ordinary Least Squares Regression

A Comparison between Biased and Unbiased Estimators in Ordinary Least Squares Regression Journal of Modern Alied Statistical Methods Volume Issue Article 7 --03 A Comarison between Biased and Unbiased Estimators in Ordinary Least Squares Regression Ghadban Khalaf King Khalid University, Saudi

More information

Radial Basis Function Networks: Algorithms

Radial Basis Function Networks: Algorithms Radial Basis Function Networks: Algorithms Introduction to Neural Networks : Lecture 13 John A. Bullinaria, 2004 1. The RBF Maing 2. The RBF Network Architecture 3. Comutational Power of RBF Networks 4.

More information

Research Note REGRESSION ANALYSIS IN MARKOV CHAIN * A. Y. ALAMUTI AND M. R. MESHKANI **

Research Note REGRESSION ANALYSIS IN MARKOV CHAIN * A. Y. ALAMUTI AND M. R. MESHKANI ** Iranian Journal of Science & Technology, Transaction A, Vol 3, No A3 Printed in The Islamic Reublic of Iran, 26 Shiraz University Research Note REGRESSION ANALYSIS IN MARKOV HAIN * A Y ALAMUTI AND M R

More information

1 Probability Spaces and Random Variables

1 Probability Spaces and Random Variables 1 Probability Saces and Random Variables 1.1 Probability saces Ω: samle sace consisting of elementary events (or samle oints). F : the set of events P: robability 1.2 Kolmogorov s axioms Definition 1.2.1

More information

Use of Transformations and the Repeated Statement in PROC GLM in SAS Ed Stanek

Use of Transformations and the Repeated Statement in PROC GLM in SAS Ed Stanek Use of Transformations and the Reeated Statement in PROC GLM in SAS Ed Stanek Introduction We describe how the Reeated Statement in PROC GLM in SAS transforms the data to rovide tests of hyotheses of interest.

More information

Combining Logistic Regression with Kriging for Mapping the Risk of Occurrence of Unexploded Ordnance (UXO)

Combining Logistic Regression with Kriging for Mapping the Risk of Occurrence of Unexploded Ordnance (UXO) Combining Logistic Regression with Kriging for Maing the Risk of Occurrence of Unexloded Ordnance (UXO) H. Saito (), P. Goovaerts (), S. A. McKenna (2) Environmental and Water Resources Engineering, Deartment

More information

Hotelling s Two- Sample T 2

Hotelling s Two- Sample T 2 Chater 600 Hotelling s Two- Samle T Introduction This module calculates ower for the Hotelling s two-grou, T-squared (T) test statistic. Hotelling s T is an extension of the univariate two-samle t-test

More information

Improved Bounds on Bell Numbers and on Moments of Sums of Random Variables

Improved Bounds on Bell Numbers and on Moments of Sums of Random Variables Imroved Bounds on Bell Numbers and on Moments of Sums of Random Variables Daniel Berend Tamir Tassa Abstract We rovide bounds for moments of sums of sequences of indeendent random variables. Concentrating

More information

Finite Mixture EFA in Mplus

Finite Mixture EFA in Mplus Finite Mixture EFA in Mlus November 16, 2007 In this document we describe the Mixture EFA model estimated in Mlus. Four tyes of deendent variables are ossible in this model: normally distributed, ordered

More information

HEC Lausanne - Advanced Econometrics

HEC Lausanne - Advanced Econometrics HEC Lausanne - Advanced Econometrics Christohe HURLI Correction Final Exam. January 4. C. Hurlin Exercise : MLE, arametric tests and the trilogy (3 oints) Part I: Maximum Likelihood Estimation (MLE) Question

More information

Morten Frydenberg Section for Biostatistics Version :Friday, 05 September 2014

Morten Frydenberg Section for Biostatistics Version :Friday, 05 September 2014 Morten Frydenberg Section for Biostatistics Version :Friday, 05 Setember 204 All models are aroximations! The best model does not exist! Comlicated models needs a lot of data. lower your ambitions or get

More information

LECTURE 7 NOTES. x n. d x if. E [g(x n )] E [g(x)]

LECTURE 7 NOTES. x n. d x if. E [g(x n )] E [g(x)] LECTURE 7 NOTES 1. Convergence of random variables. Before delving into the large samle roerties of the MLE, we review some concets from large samle theory. 1. Convergence in robability: x n x if, for

More information

The Poisson Regression Model

The Poisson Regression Model The Poisson Regression Model The Poisson regression model aims at modeling a counting variable Y, counting the number of times that a certain event occurs during a given time eriod. We observe a samle

More information

Monte Carlo Studies. Monte Carlo Studies. Sampling Distribution

Monte Carlo Studies. Monte Carlo Studies. Sampling Distribution Monte Carlo Studies Do not let yourself be intimidated by the material in this lecture This lecture involves more theory but is meant to imrove your understanding of: Samling distributions and tests of

More information

Some Measures of Agreement Between Close Partitions

Some Measures of Agreement Between Close Partitions Some Measures of Agreement Between Close Partitions Genane Youness and Gilbert Saorta CEDRIC CNAM, BP - 466, Beirut, Lebanon, genane99@hotmail.com Chaire de Statistique Aliquée- CEDRIC, CNAM, 9 rue Saint

More information

Empirical Bayesian EM-based Motion Segmentation

Empirical Bayesian EM-based Motion Segmentation Emirical Bayesian EM-based Motion Segmentation Nuno Vasconcelos Andrew Liman MIT Media Laboratory 0 Ames St, E5-0M, Cambridge, MA 09 fnuno,lig@media.mit.edu Abstract A recent trend in motion-based segmentation

More information

The Longest Run of Heads

The Longest Run of Heads The Longest Run of Heads Review by Amarioarei Alexandru This aer is a review of older and recent results concerning the distribution of the longest head run in a coin tossing sequence, roblem that arise

More information

Flexible Tweedie regression models for continuous data

Flexible Tweedie regression models for continuous data Flexible Tweedie regression models for continuous data arxiv:1609.03297v1 [stat.me] 12 Se 2016 Wagner H. Bonat and Célestin C. Kokonendji Abstract Tweedie regression models rovide a flexible family of

More information

Estimating Time-Series Models

Estimating Time-Series Models Estimating ime-series Models he Box-Jenkins methodology for tting a model to a scalar time series fx t g consists of ve stes:. Decide on the order of di erencing d that is needed to roduce a stationary

More information

Supplementary Materials for Robust Estimation of the False Discovery Rate

Supplementary Materials for Robust Estimation of the False Discovery Rate Sulementary Materials for Robust Estimation of the False Discovery Rate Stan Pounds and Cheng Cheng This sulemental contains roofs regarding theoretical roerties of the roosed method (Section S1), rovides

More information

MATHEMATICAL MODELLING OF THE WIRELESS COMMUNICATION NETWORK

MATHEMATICAL MODELLING OF THE WIRELESS COMMUNICATION NETWORK Comuter Modelling and ew Technologies, 5, Vol.9, o., 3-39 Transort and Telecommunication Institute, Lomonosov, LV-9, Riga, Latvia MATHEMATICAL MODELLIG OF THE WIRELESS COMMUICATIO ETWORK M. KOPEETSK Deartment

More information

Lecture 23 Maximum Likelihood Estimation and Bayesian Inference

Lecture 23 Maximum Likelihood Estimation and Bayesian Inference Lecture 23 Maximum Likelihood Estimation and Bayesian Inference Thais Paiva STA 111 - Summer 2013 Term II August 7, 2013 1 / 31 Thais Paiva STA 111 - Summer 2013 Term II Lecture 23, 08/07/2013 Lecture

More information

Recent Advances on Computer Experiment

Recent Advances on Computer Experiment All Chinese Look Alike? Why? Recent Advances on Comuter Exeriment Dennis Lin The Pennsylvania State University DKL5@su.edu February, 008 OR Seminar Penn State (US) criteria for eole classification (as

More information

Introduction to Probability for Graphical Models

Introduction to Probability for Graphical Models Introduction to Probability for Grahical Models CSC 4 Kaustav Kundu Thursday January 4, 06 *Most slides based on Kevin Swersky s slides, Inmar Givoni s slides, Danny Tarlow s slides, Jaser Snoek s slides,

More information

CHAPTER-II Control Charts for Fraction Nonconforming using m-of-m Runs Rules

CHAPTER-II Control Charts for Fraction Nonconforming using m-of-m Runs Rules CHAPTER-II Control Charts for Fraction Nonconforming using m-of-m Runs Rules. Introduction: The is widely used in industry to monitor the number of fraction nonconforming units. A nonconforming unit is

More information

MATH 829: Introduction to Data Mining and Analysis Consistency of Linear Regression

MATH 829: Introduction to Data Mining and Analysis Consistency of Linear Regression 1/9 MATH 829: Introduction to Data Mining and Analysis Consistency of Linear Regression Dominique Guillot Deartments of Mathematical Sciences University of Delaware February 15, 2016 Distribution of regression

More information

A Bound on the Error of Cross Validation Using the Approximation and Estimation Rates, with Consequences for the Training-Test Split

A Bound on the Error of Cross Validation Using the Approximation and Estimation Rates, with Consequences for the Training-Test Split A Bound on the Error of Cross Validation Using the Aroximation and Estimation Rates, with Consequences for the Training-Test Slit Michael Kearns AT&T Bell Laboratories Murray Hill, NJ 7974 mkearns@research.att.com

More information

A Model for Randomly Correlated Deposition

A Model for Randomly Correlated Deposition A Model for Randomly Correlated Deosition B. Karadjov and A. Proykova Faculty of Physics, University of Sofia, 5 J. Bourchier Blvd. Sofia-116, Bulgaria ana@hys.uni-sofia.bg Abstract: A simle, discrete,

More information

Statics and dynamics: some elementary concepts

Statics and dynamics: some elementary concepts 1 Statics and dynamics: some elementary concets Dynamics is the study of the movement through time of variables such as heartbeat, temerature, secies oulation, voltage, roduction, emloyment, rices and

More information

arxiv: v2 [stat.me] 3 Nov 2014

arxiv: v2 [stat.me] 3 Nov 2014 onarametric Stein-tye Shrinkage Covariance Matrix Estimators in High-Dimensional Settings Anestis Touloumis Cancer Research UK Cambridge Institute University of Cambridge Cambridge CB2 0RE, U.K. Anestis.Touloumis@cruk.cam.ac.uk

More information

Elements of Asymptotic Theory. James L. Powell Department of Economics University of California, Berkeley

Elements of Asymptotic Theory. James L. Powell Department of Economics University of California, Berkeley Elements of Asymtotic Theory James L. Powell Deartment of Economics University of California, Berkeley Objectives of Asymtotic Theory While exact results are available for, say, the distribution of the

More information

DETC2003/DAC AN EFFICIENT ALGORITHM FOR CONSTRUCTING OPTIMAL DESIGN OF COMPUTER EXPERIMENTS

DETC2003/DAC AN EFFICIENT ALGORITHM FOR CONSTRUCTING OPTIMAL DESIGN OF COMPUTER EXPERIMENTS Proceedings of DETC 03 ASME 003 Design Engineering Technical Conferences and Comuters and Information in Engineering Conference Chicago, Illinois USA, Setember -6, 003 DETC003/DAC-48760 AN EFFICIENT ALGORITHM

More information

Asymptotic F Test in a GMM Framework with Cross Sectional Dependence

Asymptotic F Test in a GMM Framework with Cross Sectional Dependence Asymtotic F Test in a GMM Framework with Cross Sectional Deendence Yixiao Sun Deartment of Economics University of California, San Diego Min Seong Kim y Deartment of Economics Ryerson University First

More information

arxiv: v1 [physics.data-an] 26 Oct 2012

arxiv: v1 [physics.data-an] 26 Oct 2012 Constraints on Yield Parameters in Extended Maximum Likelihood Fits Till Moritz Karbach a, Maximilian Schlu b a TU Dortmund, Germany, moritz.karbach@cern.ch b TU Dortmund, Germany, maximilian.schlu@cern.ch

More information

Shadow Computing: An Energy-Aware Fault Tolerant Computing Model

Shadow Computing: An Energy-Aware Fault Tolerant Computing Model Shadow Comuting: An Energy-Aware Fault Tolerant Comuting Model Bryan Mills, Taieb Znati, Rami Melhem Deartment of Comuter Science University of Pittsburgh (bmills, znati, melhem)@cs.itt.edu Index Terms

More information

A Power and Prediction Analysis for Knockoffs with Lasso Statistics

A Power and Prediction Analysis for Knockoffs with Lasso Statistics A Power and Prediction Analysis for Knockoffs with Lasso Statistics Asaf Weinstein a, Rina Barber b, and mmanuel Candès a a Stanford University b University of Chicago Abstract Knockoffs is a new framework

More information

STA 250: Statistics. Notes 7. Bayesian Approach to Statistics. Book chapters: 7.2

STA 250: Statistics. Notes 7. Bayesian Approach to Statistics. Book chapters: 7.2 STA 25: Statistics Notes 7. Bayesian Aroach to Statistics Book chaters: 7.2 1 From calibrating a rocedure to quantifying uncertainty We saw that the central idea of classical testing is to rovide a rigorous

More information

Estimation and Detection

Estimation and Detection Estimation and Detection Lecture : Detection Theory Unknown Parameters Dr. ir. Richard C. Hendriks //05 Previous Lecture H 0 : T (x) < H : T (x) > Using detection theory, rules can be derived on how to

More information

The non-stochastic multi-armed bandit problem

The non-stochastic multi-armed bandit problem Submitted for journal ublication. The non-stochastic multi-armed bandit roblem Peter Auer Institute for Theoretical Comuter Science Graz University of Technology A-8010 Graz (Austria) auer@igi.tu-graz.ac.at

More information

Quick Detection of Changes in Trac Statistics: Ivy Hsu and Jean Walrand 2. Department of EECS, University of California, Berkeley, CA 94720

Quick Detection of Changes in Trac Statistics: Ivy Hsu and Jean Walrand 2. Department of EECS, University of California, Berkeley, CA 94720 Quic Detection of Changes in Trac Statistics: Alication to Variable Rate Comression Ivy Hsu and Jean Walrand 2 Deartment of EECS, University of California, Bereley, CA 94720 resented at the 32nd Annual

More information

The Knuth-Yao Quadrangle-Inequality Speedup is a Consequence of Total-Monotonicity

The Knuth-Yao Quadrangle-Inequality Speedup is a Consequence of Total-Monotonicity The Knuth-Yao Quadrangle-Ineuality Seedu is a Conseuence of Total-Monotonicity Wolfgang W. Bein Mordecai J. Golin Lawrence L. Larmore Yan Zhang Abstract There exist several general techniues in the literature

More information

Fig. 21: Architecture of PeerSim [44]

Fig. 21: Architecture of PeerSim [44] Sulementary Aendix A: Modeling HPP with PeerSim Fig. : Architecture of PeerSim [] In PeerSim, every comonent can be relaced by another comonent imlementing the same interface, and the general simulation

More information

Downloaded from jhs.mazums.ac.ir at 9: on Monday September 17th 2018 [ DOI: /acadpub.jhs ]

Downloaded from jhs.mazums.ac.ir at 9: on Monday September 17th 2018 [ DOI: /acadpub.jhs ] Iranian journal of health sciences 013; 1(): 56-60 htt://jhs.mazums.ac.ir Original Article Comaring Two Formulas of Samle Size Determination for Prevalence Studies Hamed Tabesh 1 *Azadeh Saki Fatemeh Pourmotahari

More information
