Akaike Information Criterion
|
|
- Allan Gilbert
- 5 years ago
- Views:
Transcription
1 Akaike Information Criterion Shuhua Hu Center for Research in Scientific Computation North Carolina State University Raleigh, NC February 7,
2 background Background Model statistical model: Y j = h(t j ;q)+e j, j = 1,2,...,n h(t j ;q): corresponds to the observed part of the solution of a mathematical model (e.g., described by system of ODE, or PDE) with model parameter q at the measurement time point t j. E j : measurement error at measurement time point t j (random variable). Y j : observation at measurement time point t j ( random variable). Under the assumption of E j being i.i.d N(0,σ 2 ), we have n [ 1 probability distribution model: g(y θ) = j=1 )] 2πσ exp ( (y j h(t j ;q)) 2 2σ 2 g: probability density function of Y = (Y 1,Y 2,...,Y n ) T given parameter θ. θ includes mathematical model parameter q and statistical model parameter σ, i.e., θ = (q,σ) T. y = (y 1,y 2,...,y n ) T is a realization of Y. February 7,
3 background Risk Modeling error (in terms of uncertainty assumption) Specified inappropriate parametric probability distribution for the data at hand. Estimation error ϑ ˆθ 2 = ϑ θ 2 }{{} bias + θ ˆθ 2 }{{} variance ϑ: parameter vector for the full reality model. θ: is the projection of ϑ onto the parameter space Θ k of the approximating model. ˆθ: the maximum likelihood estimate of θ in Θ k. February 7,
4 background Principle of Parsimony (with same data set) Variance For sufficiently large sample size n, we have n θ ˆθ 2 assym. χ 2 k, where E(χ2 k ) = k. Akaike Information Criterion February 7,
5 K-L information Kullback-Leibler Information Information lost when approximating model is used to approximate the full reality. Continuous Case I(f,g( θ)) = = Ω Ω f(x)log ( ) f(x) dx g(x θ) f(x)log(f(x))dx f(x) log(g(x θ))dx } Ω {{} relative K-L information f: full reality or truth in terms of a probability distribution. g: approximating model in terms of a probability distribution. θ: parameter vector in the approximating model g. Remark I(f,g) 0, with I(f,g) = 0 if and only if f = g almost everywhere. I(f,g) I(g,f),whichimpliesK-L information is not the real distance. February 7,
6 AIC Akaike Information Criterion (1973) Motivation The truth f is unknown. The parameter θ in g must be estimated from the empirical data {y j } n j=1. y = (y 1,y 2,...,y n ) T is a realization of Y, which is the random sample from the density function f(y). θ (Y): estimator of θ. It is a random variable. I(f,g( θ (Y))) is a random variable. Remark We need to use expected K-L information E Y [I(f,g( θ (Y)))] to measure the distance between g and f. February 7,
7 AIC Selection Target Minimizing E Y [I(f,g( θ (Y)))] g G E Y [I(f,g( θ (Y)))] = [ ] Ω f(x)log(f(x))dx f(y) f(x)log(g(x θ (y)))dx dy. Ω Ω }{{} E Y E X [log(g(x θ (Y)))] G: collection of admissible models (in terms of probability density functions). θ (y) is MLE estimate based on model g and data {y j } n j=1. Model Selection Criterion Maximizing g G E Y E X [log(g(x θ (Y)))] February 7,
8 AIC Key Result An approximately unbiased estimate of E Y E X [log(g(x θ (Y)))] for large sample and good model is log(l(ˆθ y)) k L(ˆθ y) represents likelihood of ˆθ given data {y j } n j=1, that is, L(ˆθ y) = g(y ˆθ). ˆθ: maximum likelihood estimate of θ given data {y j } n j=1, that is, ˆθ = θ (y). k: number of estimated parameters (including mathematical model parameters and statistical model parameters). Remark Large sample: used to ensure that the estimate ˆθ provides a good approximation to the true value θ 0 (consistency property of maximum likelihood estimator). Good model: the model that is close to f in the sense of having a small K-L value. February 7,
9 AIC Maximum Likelihood Case AIC = 2logL(ˆθ y) ւ bias + 2k ց variance CalculateAICvalueforeachmodelwiththesame data set,andthe best model is the one with minimum AIC value. The value of AIC depends on data {y j } n j=1, which leads to model selection uncertainty. Least-Squares Case Assumption: i.i.d. normally distributed errors ( ) RSS AIC = nlog +2k n RSS denotes residual sum of squares of fitted model. February 7,
10 TIC Takeuchi s Information Criterion (1976) useful in cases where the model is not particular close to truth. Model Selection Criterion Maximizing g G E Y E X [log(g(x θ (Y)))] Key Result An approximately unbiased estimator of E Y E X [log(g(x θ (Y)))] for large sample is log(l(ˆθ y)) tr(j(θ 0 )I(θ 0 ) 1 ) { ( J(θ 0 ) = E f θ log(g(x θ 0)) )( θ log(g(x θ 0)) ) } T ( } I(θ 0 ) = E f { 2 log(g(x θ 0 )) θ i θ j )k k Remark If g f, then I(θ 0 ) = J(θ 0 ). Hence tr(j(θ 0 )I(θ 0 ) 1 ) = k. If g is close to f, then tr(j(θ 0 )I(θ 0 ) 1 ) k. February 7,
11 TIC TIC TIC = 2log(L(ˆθ y))+2tr(ĵ(ˆθ)[î(ˆθ)] 1 ), where Î(ˆθ) and Ĵ(ˆθ) are both k k matrix, and ( ) 2 log(g(y ˆθ)) Î(ˆθ) = estimate of I(θ 0 ) θ i θ j k k Ĵ(ˆθ) = [ θ log(g(y ˆθ)) ][ θ log(g(y ˆθ)) ] T estimate of J(θ 0 ) Remark Attractive in theory. Rarely used in practice because we need a very large sample size to obtain good estimates for both I(θ 0 ) and J(θ 0 ). February 7,
12 AIC c A Small Sample AIC use in the case where the sample size is small relative to the number of parameters rule of thumb: n/k < 40 Univariate Case Assumption: i.i.d normal error distribution with the truth contained in the model set. 2k(k +1) AIC c = AIC + } n k {{ 1} bias-correction Remark Thebias-correctiontermvariesbytypeofmodel(e.g.,normal,exponential,Poisson). Inpractice, AIC c isgenerallysuitableunlesstheunderlyingprobabilitydistribution is extremely nonnormal, especially in terms of being strongly skewed. February 7,
13 AIC c Multivariate Case Assumption: each row of E is i.i.d N(0,Σ) with the truth contained in the model set. Applying to the multivariate case: AIC c = AIC +2 k( k +1+p) n k 1 p Y = TB +E, where Y R n p,t R n k,b R k p. p: total number of components. n: number of independent multivariate observations, each with p nonindependent components. k: total number of unknown parameters and k = kp+p(p+1)/2. Remark Bedrick and Tsai in [1] claimed that this result can be extended to the multivariate nonlinear regression model. February 7,
14 AIC difference AIC Differences, Likelihood of a Model, Akaike Weights AIC differences Information loss when fitted model is used rather than the best approximating model i = AIC i AIC min AIC min : AIC values for the best model in the set. Likelihood of a Model Useful in making inference concerning the relative strength of evidence for each of the models in the set ( L(g i y) exp 1 ) 2 i, where means is proportional to. Akaike Weights Weight of evidence in favor of model i being the best approximating model in the set exp( 1 2 w i = i) R r=1 exp( 1 2 r) February 7,
15 confidence set Confidence Set for K-L Best Model Three Heuristic Approaches (see [4]) Based on the Akaike weights w i To obtain a 95% confidence set on the actual K-L best model, summing the Akaike weights from largest to smallest until that sum is just 0.95, and the corresponding subset of models is the confidence set on the K-L best model. Based on AIC difference i 0 i 2, substantial support, 4 i 7, considerable less support, i > 10, essentially no support. Remark Particularly useful for nested models, may break down when the model set is large. The guideline values may be somewhat larger for nonnested models. February 7,
16 confidence set Motivated by likelihood-based inference The confidence set of models is all models for which the ratio L(g i y) L(g min y) > α, where α might be chosen as 1 8. February 7,
17 multimodel inference Multimodel Inference Unconditional Variance Estimator [ R var(ˆ θ) = w i var(ˆθ i g i )+(ˆθ i ˆ θ) 2 i=1 θ is a parameter in common to all R models. ˆθ i means that the parameter θ is estimated based on model g i, ˆ θ is a model-averaged estimate ˆ θ = R i=1 w iˆθ i. Remark Unconditional meansnotconditionalonanyparticularmodel,butstillconditional on the full set of models considered. If θ is a parameter in common to only a subset of the R models, then w i must be recalculated based on just these models (thus these new weights must satisfy wi = 1). Use unconditional variance unless the selected model is strongly supported (for example, w min > 0.9). February 7, ] 2
18 summary Summary of Akaike Information Criteria Advantages Valid for both nested and nonnested models. Compare models with different error distribution. Avoid multiple testing issues. Selected Model The model with minimum AIC value. Specific to given data set. Pitfall in Using Akaike Information Criteria Can not be used to compare models of different data sets. For example, if nonlinear regression model g 1 is fitted to a data set with n = 140 observations,onecannotvalidlycompareitwithmodelg 2 when7outliershavebeen deleted, leaving only n = 133. February 7,
19 summary Should use the same response variables for all the candidate models. For example, if there was interest in the normal and log-normal model forms, the models would have to be expressed, respectively, as g 1 (x µ,σ) = 1 2πσ exp instead of g 1 (x µ,σ) = 1 2πσ exp [ [ (x µ)2 2σ 2 (x µ)2 2σ 2 ] ], g 2 (x µ,σ) = [ ] 1 x 2πσ exp (log(x) µ)2, 2σ 2, g 2 (log(x) µ,σ) = 1 [ ] exp (log(x) µ)2. 2πσ 2σ 2 Do not mix null hypothesis testing with information criterion. Information criterion is not a test, so avoid use of significant and not significant, or rejected and not rejected in reporting results. Do not use AIC to rank models in the set and then test whether the best model is significantly better than the second-best model. Should retain all the components of each likelihood in comparing different probability distributions. February 7,
20 reference References [1] E.J. Bedrick and C.L. Tsai, Model Selection for Multivariate Regression in Small Samples, Biometrics, 50 (1994), [2] H. Bozdogan, Model Selection and Akaike s Information Criterion (AIC): The General Theory and Its Analytical Extensions, Psychometrika, 52 (1987), [3] H. Bozdogan, Akaike s Information Criterion and Recent Developments in Information Complexity, Journal of Mathematical Psychology, 44 (2000), [4] K.P. Burnham and D.R. Anderson, Model Selection and Inference: A Practical Information-Theoretical Approach, (1998), New York: Springer-Verlag. [5] K.P. Burnham and D.R. Anderson, Multimodel Inference: Understanding AIC and BIC in Model Selection, Sociological methods and research, 33 (2004), [6] C.M. Hurvich and C.L. Tsai, Regression and Time Series Model Selection in Small Samples, Biometrika, 76 (1989). February 7,
Model comparison and selection
BS2 Statistical Inference, Lectures 9 and 10, Hilary Term 2008 March 2, 2008 Hypothesis testing Consider two alternative models M 1 = {f (x; θ), θ Θ 1 } and M 2 = {f (x; θ), θ Θ 2 } for a sample (X = x)
More informationBias Correction of Cross-Validation Criterion Based on Kullback-Leibler Information under a General Condition
Bias Correction of Cross-Validation Criterion Based on Kullback-Leibler Information under a General Condition Hirokazu Yanagihara 1, Tetsuji Tonda 2 and Chieko Matsumoto 3 1 Department of Social Systems
More informationMISCELLANEOUS TOPICS RELATED TO LIKELIHOOD. Copyright c 2012 (Iowa State University) Statistics / 30
MISCELLANEOUS TOPICS RELATED TO LIKELIHOOD Copyright c 2012 (Iowa State University) Statistics 511 1 / 30 INFORMATION CRITERIA Akaike s Information criterion is given by AIC = 2l(ˆθ) + 2k, where l(ˆθ)
More informationAnalysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems
Analysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems Jeremy S. Conner and Dale E. Seborg Department of Chemical Engineering University of California, Santa Barbara, CA
More informationKULLBACK-LEIBLER INFORMATION THEORY A BASIS FOR MODEL SELECTION AND INFERENCE
KULLBACK-LEIBLER INFORMATION THEORY A BASIS FOR MODEL SELECTION AND INFERENCE Kullback-Leibler Information or Distance f( x) gx ( ±) ) I( f, g) = ' f( x)log dx, If, ( g) is the "information" lost when
More information9. Model Selection. statistical models. overview of model selection. information criteria. goodness-of-fit measures
FE661 - Statistical Methods for Financial Engineering 9. Model Selection Jitkomut Songsiri statistical models overview of model selection information criteria goodness-of-fit measures 9-1 Statistical models
More informationA Study on the Bias-Correction Effect of the AIC for Selecting Variables in Normal Multivariate Linear Regression Models under Model Misspecification
TR-No. 13-08, Hiroshima Statistical Research Group, 1 23 A Study on the Bias-Correction Effect of the AIC for Selecting Variables in Normal Multivariate Linear Regression Models under Model Misspecification
More informationEconometrics I, Estimation
Econometrics I, Estimation Department of Economics Stanford University September, 2008 Part I Parameter, Estimator, Estimate A parametric is a feature of the population. An estimator is a function of the
More informationSTAT Financial Time Series
STAT 6104 - Financial Time Series Chapter 4 - Estimation in the time Domain Chun Yip Yau (CUHK) STAT 6104:Financial Time Series 1 / 46 Agenda 1 Introduction 2 Moment Estimates 3 Autoregressive Models (AR
More informationIntroduction to Estimation Methods for Time Series models Lecture 2
Introduction to Estimation Methods for Time Series models Lecture 2 Fulvio Corsi SNS Pisa Fulvio Corsi Introduction to Estimation () Methods for Time Series models Lecture 2 SNS Pisa 1 / 21 Estimators:
More informationSelection Criteria Based on Monte Carlo Simulation and Cross Validation in Mixed Models
Selection Criteria Based on Monte Carlo Simulation and Cross Validation in Mixed Models Junfeng Shang Bowling Green State University, USA Abstract In the mixed modeling framework, Monte Carlo simulation
More informationModel Selection for Semiparametric Bayesian Models with Application to Overdispersion
Proceedings 59th ISI World Statistics Congress, 25-30 August 2013, Hong Kong (Session CPS020) p.3863 Model Selection for Semiparametric Bayesian Models with Application to Overdispersion Jinfang Wang and
More informationAIC under the Framework of Least Squares Estimation
AIC under the Framework of Least Squares Estimation H.T. Banks and Michele L. Joyner Center for Research in Scientific Computation orth Carolina State University Raleigh, C, United States and Dept of Mathematics
More informationInformation Theory and Entropy
3 Information Theory and Entropy Solomon Kullback (1907 1994) was born in Brooklyn, New York, USA, and graduated from the City College of New York in 1927, received an M.A. degree in mathematics in 1929,
More informationThe degree distribution
Universitat Politècnica de Catalunya Version 0.5 Complex and Social Networks (2018-2019) Master in Innovation and Research in Informatics (MIRI) Instructors Argimiro Arratia, argimiro@cs.upc.edu, http://www.cs.upc.edu/~argimiro/
More informationBias-corrected AIC for selecting variables in Poisson regression models
Bias-corrected AIC for selecting variables in Poisson regression models Ken-ichi Kamo (a), Hirokazu Yanagihara (b) and Kenichi Satoh (c) (a) Corresponding author: Department of Liberal Arts and Sciences,
More informationAn Akaike Criterion based on Kullback Symmetric Divergence in the Presence of Incomplete-Data
An Akaike Criterion based on Kullback Symmetric Divergence Bezza Hafidi a and Abdallah Mkhadri a a University Cadi-Ayyad, Faculty of sciences Semlalia, Department of Mathematics, PB.2390 Marrakech, Moroco
More informationData Mining Stat 588
Data Mining Stat 588 Lecture 02: Linear Methods for Regression Department of Statistics & Biostatistics Rutgers University September 13 2011 Regression Problem Quantitative generic output variable Y. Generic
More informationTesting Hypothesis. Maura Mezzetti. Department of Economics and Finance Università Tor Vergata
Maura Department of Economics and Finance Università Tor Vergata Hypothesis Testing Outline It is a mistake to confound strangeness with mystery Sherlock Holmes A Study in Scarlet Outline 1 The Power Function
More informationEconomics 520. Lecture Note 19: Hypothesis Testing via the Neyman-Pearson Lemma CB 8.1,
Economics 520 Lecture Note 9: Hypothesis Testing via the Neyman-Pearson Lemma CB 8., 8.3.-8.3.3 Uniformly Most Powerful Tests and the Neyman-Pearson Lemma Let s return to the hypothesis testing problem
More informationSTAT 100C: Linear models
STAT 100C: Linear models Arash A. Amini June 9, 2018 1 / 21 Model selection Choosing the best model among a collection of models {M 1, M 2..., M N }. What is a good model? 1. fits the data well (model
More informationBIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation
BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation Yujin Chung November 29th, 2016 Fall 2016 Yujin Chung Lec13: MLE Fall 2016 1/24 Previous Parametric tests Mean comparisons (normality assumption)
More informationThe Behaviour of the Akaike Information Criterion when Applied to Non-nested Sequences of Models
The Behaviour of the Akaike Information Criterion when Applied to Non-nested Sequences of Models Centre for Molecular, Environmental, Genetic & Analytic (MEGA) Epidemiology School of Population Health
More informationStatistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach
Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score
More informationReview. December 4 th, Review
December 4 th, 2017 Att. Final exam: Course evaluation Friday, 12/14/2018, 10:30am 12:30pm Gore Hall 115 Overview Week 2 Week 4 Week 7 Week 10 Week 12 Chapter 6: Statistics and Sampling Distributions Chapter
More informationStatistical Inference
Statistical Inference Classical and Bayesian Methods Revision Class for Midterm Exam AMS-UCSC Th Feb 9, 2012 Winter 2012. Session 1 (Revision Class) AMS-132/206 Th Feb 9, 2012 1 / 23 Topics Topics We will
More informationDistributions of Maximum Likelihood Estimators and Model Comparisons
Distributions of Maximum Likelihood Estimators and Model Comparisons Peter Hingley Abstract Experimental data need to be assessed for purposes of model identification, estimation of model parameters and
More informationRegression I: Mean Squared Error and Measuring Quality of Fit
Regression I: Mean Squared Error and Measuring Quality of Fit -Applied Multivariate Analysis- Lecturer: Darren Homrighausen, PhD 1 The Setup Suppose there is a scientific problem we are interested in solving
More informationModel Selection Tutorial 2: Problems With Using AIC to Select a Subset of Exposures in a Regression Model
Model Selection Tutorial 2: Problems With Using AIC to Select a Subset of Exposures in a Regression Model Centre for Molecular, Environmental, Genetic & Analytic (MEGA) Epidemiology School of Population
More informationMA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7
MA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7 1 Random Vectors Let a 0 and y be n 1 vectors, and let A be an n n matrix. Here, a 0 and A are non-random, whereas y is
More informationCentral Limit Theorem ( 5.3)
Central Limit Theorem ( 5.3) Let X 1, X 2,... be a sequence of independent random variables, each having n mean µ and variance σ 2. Then the distribution of the partial sum S n = X i i=1 becomes approximately
More information7. Estimation and hypothesis testing. Objective. Recommended reading
7. Estimation and hypothesis testing Objective In this chapter, we show how the election of estimators can be represented as a decision problem. Secondly, we consider the problem of hypothesis testing
More informationStatistical Practice. Selecting the Best Linear Mixed Model Under REML. Matthew J. GURKA
Matthew J. GURKA Statistical Practice Selecting the Best Linear Mixed Model Under REML Restricted maximum likelihood (REML) estimation of the parameters of the mixed model has become commonplace, even
More informationBootstrap Variants of the Akaike Information Criterion for Mixed Model Selection
Bootstrap Variants of the Akaike Information Criterion for Mixed Model Selection Junfeng Shang, and Joseph E. Cavanaugh 2, Bowling Green State University, USA 2 The University of Iowa, USA Abstract: Two
More informationComposite Hypotheses and Generalized Likelihood Ratio Tests
Composite Hypotheses and Generalized Likelihood Ratio Tests Rebecca Willett, 06 In many real world problems, it is difficult to precisely specify probability distributions. Our models for data may involve
More informationMultimodel Inference: Understanding AIC relative variable importance values. Kenneth P. Burnham Colorado State University Fort Collins, Colorado 80523
November 6, 2015 Multimodel Inference: Understanding AIC relative variable importance values Abstract Kenneth P Burnham Colorado State University Fort Collins, Colorado 80523 The goal of this material
More informationModel Selection and Geometry
Model Selection and Geometry Pascal Massart Université Paris-Sud, Orsay Leipzig, February Purpose of the talk! Concentration of measure plays a fundamental role in the theory of model selection! Model
More informationFall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.
1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n
More informationAkaike criterion: Kullback-Leibler discrepancy
Model choice. Akaike s criterion Akaike criterion: Kullback-Leibler discrepancy Given a family of probability densities {f ( ; ψ), ψ Ψ}, Kullback-Leibler s index of f ( ; ψ) relative to f ( ; θ) is (ψ
More informationMS&E 226: Small Data. Lecture 11: Maximum likelihood (v2) Ramesh Johari
MS&E 226: Small Data Lecture 11: Maximum likelihood (v2) Ramesh Johari ramesh.johari@stanford.edu 1 / 18 The likelihood function 2 / 18 Estimating the parameter This lecture develops the methodology behind
More information1. Introduction Over the last three decades a number of model selection criteria have been proposed, including AIC (Akaike, 1973), AICC (Hurvich & Tsa
On the Use of Marginal Likelihood in Model Selection Peide Shi Department of Probability and Statistics Peking University, Beijing 100871 P. R. China Chih-Ling Tsai Graduate School of Management University
More informationComparison of New Approach Criteria for Estimating the Order of Autoregressive Process
IOSR Journal of Mathematics (IOSRJM) ISSN: 78-578 Volume 1, Issue 3 (July-Aug 1), PP 1- Comparison of New Approach Criteria for Estimating the Order of Autoregressive Process Salie Ayalew 1 M.Chitti Babu,
More informationAn Information Criteria for Order-restricted Inference
An Information Criteria for Order-restricted Inference Nan Lin a, Tianqing Liu 2,b, and Baoxue Zhang,2,b a Department of Mathematics, Washington University in Saint Louis, Saint Louis, MO 633, U.S.A. b
More informationKen-ichi Kamo Department of Liberal Arts and Sciences, Sapporo Medical University, Hokkaido, Japan
A STUDY ON THE BIAS-CORRECTION EFFECT OF THE AIC FOR SELECTING VARIABLES IN NOR- MAL MULTIVARIATE LINEAR REGRESSION MOD- ELS UNDER MODEL MISSPECIFICATION Authors: Hirokazu Yanagihara Department of Mathematics,
More information2.2 Classical Regression in the Time Series Context
48 2 Time Series Regression and Exploratory Data Analysis context, and therefore we include some material on transformations and other techniques useful in exploratory data analysis. 2.2 Classical Regression
More informationModel Fitting, Bootstrap, & Model Selection
Model Fitting, Bootstrap, & Model Selection G. Jogesh Babu Penn State University http://www.stat.psu.edu/ babu http://astrostatistics.psu.edu Model Fitting Non-linear regression Density (shape) estimation
More informationModel comparison: Deviance-based approaches
Model comparison: Deviance-based approaches Patrick Breheny February 19 Patrick Breheny BST 701: Bayesian Modeling in Biostatistics 1/23 Model comparison Thus far, we have looked at residuals in a fairly
More informationEIE6207: Estimation Theory
EIE6207: Estimation Theory Man-Wai MAK Dept. of Electronic and Information Engineering, The Hong Kong Polytechnic University enmwmak@polyu.edu.hk http://www.eie.polyu.edu.hk/ mwmak References: Steven M.
More informationModel Fitting and Model Selection. Model Fitting in Astronomy. Model Selection in Astronomy.
Model Fitting and Model Selection G. Jogesh Babu Penn State University http://www.stat.psu.edu/ babu http://astrostatistics.psu.edu Model Fitting Non-linear regression Density (shape) estimation Parameter
More informationData Mining Chapter 4: Data Analysis and Uncertainty Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University
Data Mining Chapter 4: Data Analysis and Uncertainty Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University Why uncertainty? Why should data mining care about uncertainty? We
More informationChapter 7. Hypothesis Testing
Chapter 7. Hypothesis Testing Joonpyo Kim June 24, 2017 Joonpyo Kim Ch7 June 24, 2017 1 / 63 Basic Concepts of Testing Suppose that our interest centers on a random variable X which has density function
More informationAkaike criterion: Kullback-Leibler discrepancy
Model choice. Akaike s criterion Akaike criterion: Kullback-Leibler discrepancy Given a family of probability densities {f ( ; ), 2 }, Kullback-Leibler s index of f ( ; ) relativetof ( ; ) is Z ( ) =E
More informationPractical Econometrics. for. Finance and Economics. (Econometrics 2)
Practical Econometrics for Finance and Economics (Econometrics 2) Seppo Pynnönen and Bernd Pape Department of Mathematics and Statistics, University of Vaasa 1. Introduction 1.1 Econometrics Econometrics
More informationTesting Restrictions and Comparing Models
Econ. 513, Time Series Econometrics Fall 00 Chris Sims Testing Restrictions and Comparing Models 1. THE PROBLEM We consider here the problem of comparing two parametric models for the data X, defined by
More informationLecture 3. Inference about multivariate normal distribution
Lecture 3. Inference about multivariate normal distribution 3.1 Point and Interval Estimation Let X 1,..., X n be i.i.d. N p (µ, Σ). We are interested in evaluation of the maximum likelihood estimates
More informationA Very Brief Summary of Statistical Inference, and Examples
A Very Brief Summary of Statistical Inference, and Examples Trinity Term 2009 Prof. Gesine Reinert Our standard situation is that we have data x = x 1, x 2,..., x n, which we view as realisations of random
More informationSTATS 200: Introduction to Statistical Inference. Lecture 29: Course review
STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout
More informationResampling techniques for statistical modeling
Resampling techniques for statistical modeling Gianluca Bontempi Département d Informatique Boulevard de Triomphe - CP 212 http://www.ulb.ac.be/di Resampling techniques p.1/33 Beyond the empirical error
More information2018 2019 1 9 sei@mistiu-tokyoacjp http://wwwstattu-tokyoacjp/~sei/lec-jhtml 11 552 3 0 1 2 3 4 5 6 7 13 14 33 4 1 4 4 2 1 1 2 2 1 1 12 13 R?boxplot boxplotstats which does the computation?boxplotstats
More informationGeneralized Linear Models
Generalized Linear Models Lecture 3. Hypothesis testing. Goodness of Fit. Model diagnostics GLM (Spring, 2018) Lecture 3 1 / 34 Models Let M(X r ) be a model with design matrix X r (with r columns) r n
More informationLasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices
Article Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices Fei Jin 1,2 and Lung-fei Lee 3, * 1 School of Economics, Shanghai University of Finance and Economics,
More informationLecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2
Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2 Fall, 2013 Page 1 Random Variable and Probability Distribution Discrete random variable Y : Finite possible values {y
More informationModel selection in penalized Gaussian graphical models
University of Groningen e.c.wit@rug.nl http://www.math.rug.nl/ ernst January 2014 Penalized likelihood generates a PATH of solutions Consider an experiment: Γ genes measured across T time points. Assume
More informationVariable Selection in Multivariate Linear Regression Models with Fewer Observations than the Dimension
Variable Selection in Multivariate Linear Regression Models with Fewer Observations than the Dimension (Last Modified: November 3 2008) Mariko Yamamura 1, Hirokazu Yanagihara 2 and Muni S. Srivastava 3
More informationDYNAMIC ECONOMETRIC MODELS Vol. 9 Nicolaus Copernicus University Toruń Mariola Piłatowska Nicolaus Copernicus University in Toruń
DYNAMIC ECONOMETRIC MODELS Vol. 9 Nicolaus Copernicus University Toruń 2009 Mariola Piłatowska Nicolaus Copernicus University in Toruń Combined Forecasts Using the Akaike Weights A b s t r a c t. The focus
More information1. Hypothesis testing through analysis of deviance. 3. Model & variable selection - stepwise aproaches
Sta 216, Lecture 4 Last Time: Logistic regression example, existence/uniqueness of MLEs Today s Class: 1. Hypothesis testing through analysis of deviance 2. Standard errors & confidence intervals 3. Model
More informationSTAT 135 Lab 5 Bootstrapping and Hypothesis Testing
STAT 135 Lab 5 Bootstrapping and Hypothesis Testing Rebecca Barter March 2, 2015 The Bootstrap Bootstrap Suppose that we are interested in estimating a parameter θ from some population with members x 1,...,
More informationMachine learning - HT Maximum Likelihood
Machine learning - HT 2016 3. Maximum Likelihood Varun Kanade University of Oxford January 27, 2016 Outline Probabilistic Framework Formulate linear regression in the language of probability Introduce
More informationHypothesis testing: theory and methods
Statistical Methods Warsaw School of Economics November 3, 2017 Statistical hypothesis is the name of any conjecture about unknown parameters of a population distribution. The hypothesis should be verifiable
More informationBrief Review on Estimation Theory
Brief Review on Estimation Theory K. Abed-Meraim ENST PARIS, Signal and Image Processing Dept. abed@tsi.enst.fr This presentation is essentially based on the course BASTA by E. Moulines Brief review on
More informationLecture 6: Model Checking and Selection
Lecture 6: Model Checking and Selection Melih Kandemir melih.kandemir@iwr.uni-heidelberg.de May 27, 2014 Model selection We often have multiple modeling choices that are equally sensible: M 1,, M T. Which
More informationBayesian Asymptotics
BS2 Statistical Inference, Lecture 8, Hilary Term 2008 May 7, 2008 The univariate case The multivariate case For large λ we have the approximation I = b a e λg(y) h(y) dy = e λg(y ) h(y ) 2π λg (y ) {
More informationDiscrepancy-Based Model Selection Criteria Using Cross Validation
33 Discrepancy-Based Model Selection Criteria Using Cross Validation Joseph E. Cavanaugh, Simon L. Davies, and Andrew A. Neath Department of Biostatistics, The University of Iowa Pfizer Global Research
More informationTesting Statistical Hypotheses
E.L. Lehmann Joseph P. Romano Testing Statistical Hypotheses Third Edition 4y Springer Preface vii I Small-Sample Theory 1 1 The General Decision Problem 3 1.1 Statistical Inference and Statistical Decisions
More informationEXAMINERS REPORT & SOLUTIONS STATISTICS 1 (MATH 11400) May-June 2009
EAMINERS REPORT & SOLUTIONS STATISTICS (MATH 400) May-June 2009 Examiners Report A. Most plots were well done. Some candidates muddled hinges and quartiles and gave the wrong one. Generally candidates
More informationObtaining simultaneous equation models from a set of variables through genetic algorithms
Obtaining simultaneous equation models from a set of variables through genetic algorithms José J. López Universidad Miguel Hernández (Elche, Spain) Domingo Giménez Universidad de Murcia (Murcia, Spain)
More informationLikelihood based Statistical Inference. Dottorato in Economia e Finanza Dipartimento di Scienze Economiche Univ. di Verona
Likelihood based Statistical Inference Dottorato in Economia e Finanza Dipartimento di Scienze Economiche Univ. di Verona L. Pace, A. Salvan, N. Sartori Udine, April 2008 Likelihood: observed quantities,
More informationf(x θ)dx with respect to θ. Assuming certain smoothness conditions concern differentiating under the integral the integral sign, we first obtain
0.1. INTRODUCTION 1 0.1 Introduction R. A. Fisher, a pioneer in the development of mathematical statistics, introduced a measure of the amount of information contained in an observaton from f(x θ). Fisher
More informationDensity Estimation. Seungjin Choi
Density Estimation Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/
More informationAdvanced Econometrics I
Lecture Notes Autumn 2010 Dr. Getinet Haile, University of Mannheim 1. Introduction Introduction & CLRM, Autumn Term 2010 1 What is econometrics? Econometrics = economic statistics economic theory mathematics
More informationIntroduction to Bayesian Statistics
Bayesian Parameter Estimation Introduction to Bayesian Statistics Harvey Thornburg Center for Computer Research in Music and Acoustics (CCRMA) Department of Music, Stanford University Stanford, California
More informationLikelihood inference in the presence of nuisance parameters
Likelihood inference in the presence of nuisance parameters Nancy Reid, University of Toronto www.utstat.utoronto.ca/reid/research 1. Notation, Fisher information, orthogonal parameters 2. Likelihood inference
More informationTube formula approach to testing multivariate normality and testing uniformity on the sphere
Tube formula approach to testing multivariate normality and testing uniformity on the sphere Akimichi Takemura 1 Satoshi Kuriki 2 1 University of Tokyo 2 Institute of Statistical Mathematics December 11,
More informationGraduate Econometrics I: Unbiased Estimation
Graduate Econometrics I: Unbiased Estimation Yves Dominicy Université libre de Bruxelles Solvay Brussels School of Economics and Management ECARES Yves Dominicy Graduate Econometrics I: Unbiased Estimation
More informationAnswer Key for STAT 200B HW No. 7
Answer Key for STAT 200B HW No. 7 May 5, 2007 Problem 2.2 p. 649 Assuming binomial 2-sample model ˆπ =.75, ˆπ 2 =.6. a ˆτ = ˆπ 2 ˆπ =.5. From Ex. 2.5a on page 644: ˆπ ˆπ + ˆπ 2 ˆπ 2.75.25.6.4 = + =.087;
More information1/24/2008. Review of Statistical Inference. C.1 A Sample of Data. C.2 An Econometric Model. C.4 Estimating the Population Variance and Other Moments
/4/008 Review of Statistical Inference Prepared by Vera Tabakova, East Carolina University C. A Sample of Data C. An Econometric Model C.3 Estimating the Mean of a Population C.4 Estimating the Population
More informationISyE 691 Data mining and analytics
ISyE 691 Data mining and analytics Regression Instructor: Prof. Kaibo Liu Department of Industrial and Systems Engineering UW-Madison Email: kliu8@wisc.edu Office: Room 3017 (Mechanical Engineering Building)
More informationFor iid Y i the stronger conclusion holds; for our heuristics ignore differences between these notions.
Large Sample Theory Study approximate behaviour of ˆθ by studying the function U. Notice U is sum of independent random variables. Theorem: If Y 1, Y 2,... are iid with mean µ then Yi n µ Called law of
More informationFinal Exam. 1. (6 points) True/False. Please read the statements carefully, as no partial credit will be given.
1. (6 points) True/False. Please read the statements carefully, as no partial credit will be given. (a) If X and Y are independent, Corr(X, Y ) = 0. (b) (c) (d) (e) A consistent estimator must be asymptotically
More informationBayesian Estimation of Prediction Error and Variable Selection in Linear Regression
Bayesian Estimation of Prediction Error and Variable Selection in Linear Regression Andrew A. Neath Department of Mathematics and Statistics; Southern Illinois University Edwardsville; Edwardsville, IL,
More informationST5215: Advanced Statistical Theory
Department of Statistics & Applied Probability Wednesday, October 5, 2011 Lecture 13: Basic elements and notions in decision theory Basic elements X : a sample from a population P P Decision: an action
More informationStatement: With my signature I confirm that the solutions are the product of my own work. Name: Signature:.
MATHEMATICAL STATISTICS Homework assignment Instructions Please turn in the homework with this cover page. You do not need to edit the solutions. Just make sure the handwriting is legible. You may discuss
More informationSize Distortion and Modi cation of Classical Vuong Tests
Size Distortion and Modi cation of Classical Vuong Tests Xiaoxia Shi University of Wisconsin at Madison March 2011 X. Shi (UW-Mdsn) H 0 : LR = 0 IUPUI 1 / 30 Vuong Test (Vuong, 1989) Data fx i g n i=1.
More informationAn Extended BIC for Model Selection
An Extended BIC for Model Selection at the JSM meeting 2007 - Salt Lake City Surajit Ray Boston University (Dept of Mathematics and Statistics) Joint work with James Berger, Duke University; Susie Bayarri,
More informationTransformations The bias-variance tradeoff Model selection criteria Remarks. Model selection I. Patrick Breheny. February 17
Model selection I February 17 Remedial measures Suppose one of your diagnostic plots indicates a problem with the model s fit or assumptions; what options are available to you? Generally speaking, you
More informationMAXIMUM LIKELIHOOD, SET ESTIMATION, MODEL CRITICISM
Eco517 Fall 2004 C. Sims MAXIMUM LIKELIHOOD, SET ESTIMATION, MODEL CRITICISM 1. SOMETHING WE SHOULD ALREADY HAVE MENTIONED A t n (µ, Σ) distribution converges, as n, to a N(µ, Σ). Consider the univariate
More informationEstimation of Dynamic Regression Models
University of Pavia 2007 Estimation of Dynamic Regression Models Eduardo Rossi University of Pavia Factorization of the density DGP: D t (x t χ t 1, d t ; Ψ) x t represent all the variables in the economy.
More informationPeter Hoff Linear and multilinear models April 3, GLS for multivariate regression 5. 3 Covariance estimation for the GLM 8
Contents 1 Linear model 1 2 GLS for multivariate regression 5 3 Covariance estimation for the GLM 8 4 Testing the GLH 11 A reference for some of this material can be found somewhere. 1 Linear model Recall
More informationNonconcave Penalized Likelihood with A Diverging Number of Parameters
Nonconcave Penalized Likelihood with A Diverging Number of Parameters Jianqing Fan and Heng Peng Presenter: Jiale Xu March 12, 2010 Jianqing Fan and Heng Peng Presenter: JialeNonconcave Xu () Penalized
More informationF & B Approaches to a simple model
A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring 215 http://www.astro.cornell.edu/~cordes/a6523 Lecture 11 Applications: Model comparison Challenges in large-scale surveys
More information