Regression Models for Directional Data

Andrew Wood
School of Mathematical Sciences, University of Nottingham

ADISTA14 Workshop, Wednesday 21 May 2014

Joint work with Nottingham colleagues Simon Preston and Michail Tsagris. Supported by EPSRC.
Outline of Talk

1. Introduction and background discussion.
2. Regression models with Fisher error distribution.
3. Regression models with Kent error distribution.
4. Some numerical results.
5. Discussion.
Data Structure

Data structure: $\{y_1, x_1\}, \ldots, \{y_n, x_n\}$, where for each $i$:

- $y_i$ is a response vector and $x_i$ is a covariate vector;
- $y_i \in S^2 = \{y \in \mathbb{R}^3 : y^\top y = 1\}$, i.e. $y_i$ is a unit vector in 3D space;
- the covariate vectors are $p$-dimensional, i.e. $x_i \in \mathbb{R}^p$;
- it is assumed throughout the talk that $y_1, \ldots, y_n$ are independent.

The main purpose here is to develop parametric regression models on the sphere in which rotational symmetry of the error distribution is not assumed.
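In code, this data structure is simply an $n \times 3$ response matrix with unit rows paired with an $n \times p$ covariate matrix. A minimal sketch (assuming numpy; the variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 5, 2

# Covariates x_i in R^p, and raw 3D responses normalised onto S^2.
x = rng.normal(size=(n, p))
raw = rng.normal(size=(n, 3))
y = raw / np.linalg.norm(raw, axis=1, keepdims=True)

# Each row y_i is a unit vector: y_i' y_i = 1.
print(np.allclose(np.sum(y * y, axis=1), 1.0))
```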
Multivariate linear model

Consider the standard multivariate linear model $Y = XB + E$, where

- $Y$ ($n \times p$) is the response matrix;
- $X$ ($n \times q$) is the covariate matrix;
- $B$ ($q \times p$) is the parameter matrix;
- $E$ ($n \times p$) is the unobserved error matrix, whose rows are assumed to be IID with common distribution $N_p(0, \Sigma)$.

In this situation, the least squares estimator (and MLE) of $B$,
$$\hat{B} = (X^\top X)^{-1} X^\top Y,$$
does not depend on $\Sigma$. However, we would still not want to assume that $\Sigma$ is a scalar multiple of the identity, which is the analogue of the assumption of rotational symmetry of the error distribution on the sphere.
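The estimator $\hat{B} = (X^\top X)^{-1} X^\top Y$ above is one line of linear algebra; a quick sketch (numpy, with simulated data of my own choosing):

```python
import numpy as np

rng = np.random.default_rng(1)
n, q, p = 100, 3, 2

X = rng.normal(size=(n, q))
B_true = rng.normal(size=(q, p))
E = rng.normal(scale=0.1, size=(n, p))
Y = X @ B_true + E

# Least squares / MLE: B_hat = (X'X)^{-1} X'Y; note Sigma never appears.
B_hat = np.linalg.solve(X.T @ X, X.T @ Y)
print(np.allclose(B_hat, B_true, atol=0.1))
```

Using `linalg.solve` rather than forming the inverse explicitly is the standard numerically stable choice.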
A general covariate vector $x_i$

A second purpose is to develop models which can handle a general covariate $x_i$: no special structure for $x_i$, such as $x_i \in S^2$ or $x_i \in \mathbb{R}$, is assumed.

The spirit of the modelling here is similar to that often used with generalised linear models (GLMs). The use of GLMs is sometimes a bit cavalier in the following respects:

- for simplicity and convenience, we assume covariate information combines in linear fashion (i.e. through the linear predictor) on a suitable scale (i.e. after applying the link function);
- we do not expect our models to extrapolate well to regions in which the data are not well represented.

We shall make similar assumptions here.
Fisher error distribution

We will be focusing on the following two error distributions on the sphere:

- Rotationally symmetric case: the Fisher (or von Mises–Fisher) distribution.
- Rotationally asymmetric case: the Kent distribution.

The Fisher distribution has pdf given by
$$f(y \mid \kappa, \mu) = c_F(\kappa)^{-1} \exp(\kappa\, y^\top \mu),$$
where $\mu \in S^2$ is a unit vector, the mean direction, and $\kappa \geq 0$ is a concentration parameter.

In the case of the Fisher distribution, we can construct a regression model with constant concentration parameter $\kappa$ by specifying the mean direction for each observation to be of the form
$$\mu_i = \mu(x_i, \gamma), \qquad i = 1, \ldots, n,$$
with $\gamma$ a parameter vector and $x_i$ the covariate vector for observation $i$.
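On $S^2$ the Fisher normalising constant has the closed form $c_F(\kappa) = 4\pi \sinh(\kappa)/\kappa$, so the log-density is easy to evaluate directly. A sketch (numpy; the function name is mine, and for very large $\kappa$ one would replace $\log\sinh\kappa$ by a numerically safer form):

```python
import numpy as np

def fisher_logpdf(y, mu, kappa):
    """Log-density of the Fisher (von Mises-Fisher) distribution on S^2:
    log f = kappa * y'mu - log c_F(kappa), c_F(kappa) = 4*pi*sinh(kappa)/kappa."""
    log_c = np.log(4 * np.pi) + np.log(np.sinh(kappa)) - np.log(kappa)
    return kappa * np.dot(y, mu) - log_c

mu = np.array([0.0, 0.0, 1.0])
# The density peaks at the mean direction mu and is lowest at the antipode.
print(fisher_logpdf(mu, mu, 5.0) > fisher_logpdf(-mu, mu, 5.0))
```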
Particular cases of the Fisher regression model

In cases (i)–(iii) below it is assumed that $y_i, x_i \in S^2$.

Case (i) [Chang, AoS, 1986; Rivest, AoS, 1989]. Here $\mu_i = A x_i$, where $A$ is an unknown rotation matrix to be estimated.

Case (ii) [Downs, Biometrika, 2003]. A regression model on the sphere based on Möbius transformations.

Case (iii) [Rosenthal et al., JASA, 2014]. In this case
$$\mu_i = \frac{A x_i}{\|A x_i\|},$$
where $A$ is a non-singular $3 \times 3$ matrix to be estimated.
Particular cases of the Fisher regression model (continued)

Case (iv). In this case $\mu_i = Q R_i \delta$, where $Q$ is a $3 \times 3$ orthogonal matrix and $\delta \in S^2$ is a unit vector. One way to define the $3 \times 3$ rotation matrix $R_i$ is by $R_i = \exp(C_i)$, where
$$C_i = \begin{pmatrix} 0 & c_{1i} & c_{2i} \\ -c_{1i} & 0 & c_{3i} \\ -c_{2i} & -c_{3i} & 0 \end{pmatrix}$$
is skew-symmetric, and $c_{ji} = \gamma_j^\top x_i$, $j = 1, 2, 3$.

A second possibility is to define $R_i = (I_3 + C_i)(I_3 - C_i)^{-1}$, which has the advantage that it is not a many-to-one mapping.
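Both constructions of $R_i$ are easy to compute: for a $3 \times 3$ skew-symmetric $C$ the matrix exponential has the closed Rodrigues form, and the second option is the Cayley transform. A sketch (numpy; function names are mine, and the sign convention in `skew` matches the matrix above):

```python
import numpy as np

def skew(c):
    """Skew-symmetric C_i built from c_i = (c_1i, c_2i, c_3i)."""
    c1, c2, c3 = c
    return np.array([[0.0,  c1,  c2],
                     [-c1, 0.0,  c3],
                     [-c2, -c3, 0.0]])

def rot_exp(C):
    """R = exp(C) via the Rodrigues formula, exact for 3x3 skew-symmetric C."""
    theta = np.sqrt(0.5 * np.sum(C * C))  # = ||(c1, c2, c3)||
    if theta < 1e-12:
        return np.eye(3)
    return (np.eye(3) + (np.sin(theta) / theta) * C
            + ((1.0 - np.cos(theta)) / theta**2) * (C @ C))

def rot_cayley(C):
    """R = (I + C)(I - C)^{-1}, the one-to-one (Cayley) alternative."""
    I = np.eye(3)
    return (I + C) @ np.linalg.inv(I - C)

C = skew([0.1, -0.2, 0.3])
for R in (rot_exp(C), rot_cayley(C)):
    # Both maps land in SO(3): R'R = I and det R = +1.
    print(np.allclose(R.T @ R, np.eye(3)), np.isclose(np.linalg.det(R), 1.0))
```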
The Kent distribution on $S^2$

Kent (1982) proposed a 5-parameter family which contains the Fisher family as well as rotationally asymmetric distributions. It has pdf
$$f(y; \mu, \xi_1, \xi_2, \kappa, \beta) = c_K(\kappa, \beta)^{-1} \exp\left[\kappa\, \mu^\top y + \beta\{(\xi_1^\top y)^2 - (\xi_2^\top y)^2\}\right],$$
where

- $\mu$, $\xi_1$ and $\xi_2$ are mutually orthogonal unit vectors;
- $\kappa \geq 0$ and $\beta \geq 0$ are concentration and shape parameters.

The distribution is unimodal if $\kappa > 2\beta$.
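Up to the normalising constant $c_K(\kappa, \beta)$, the Kent log-density is a single inner-product expression. A sketch (numpy; the function name is mine):

```python
import numpy as np

def kent_logpdf_unnorm(y, mu, xi1, xi2, kappa, beta):
    """Kent log-density up to the normalising constant c_K(kappa, beta):
    kappa * mu'y + beta * ((xi1'y)^2 - (xi2'y)^2)."""
    return (kappa * np.dot(mu, y)
            + beta * (np.dot(xi1, y) ** 2 - np.dot(xi2, y) ** 2))

mu = np.array([0.0, 0.0, 1.0])
xi1 = np.array([1.0, 0.0, 0.0])
xi2 = np.array([0.0, 1.0, 0.0])
# At y = mu the quadratic term vanishes, leaving kappa * 1.
print(kent_logpdf_unnorm(mu, mu, xi1, xi2, 10.0, 2.0))  # -> 10.0
```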
Motivation for Kent distributions

- Datasets in directional statistics and shape analysis are often highly concentrated.
- When highly concentrated datasets are projected onto a suitable tangent space, a multivariate normal approximation in the tangent space is often reasonable.
- When the Kent distribution is highly concentrated and unimodal, the induced distribution in the tangent space at the mode is approximately bivariate normal.
Orthonormal basis determined by a unit vector

Consider a unit vector $\mu = (\sin\theta \cos\phi, \sin\theta \sin\phi, \cos\theta)^\top$. Provided $\sin\theta \neq 0$, the vectors
$$\xi_1 = (-\sin\phi, \cos\phi, 0)^\top \quad \text{and} \quad \xi_2 = (\cos\theta \cos\phi, \cos\theta \sin\phi, -\sin\theta)^\top$$
complete an orthonormal basis.

In Cartesian coordinates, writing $y = (y_1, y_2, y_3)^\top$ and assuming $y_1^2 + y_2^2 \neq 0$, we have
$$\xi_1 = \frac{(-y_2, y_1, 0)^\top}{\sqrt{y_1^2 + y_2^2}} \quad \text{and} \quad \xi_2 = \frac{(y_1 y_3,\; y_2 y_3,\; -(y_1^2 + y_2^2))^\top}{\sqrt{y_1^2 + y_2^2}}.$$
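The Cartesian formulas can be verified numerically: stacking $(y, \xi_1, \xi_2)$ as columns should give an orthogonal matrix. A sketch (numpy; the function name is mine, and the signs follow one consistent convention since the transcription lost them):

```python
import numpy as np

def tangent_basis(y):
    """Orthonormal pair (xi1, xi2) completing the unit vector y, using the
    Cartesian formulas; assumes y1^2 + y2^2 > 0."""
    y1, y2, y3 = y
    r = np.sqrt(y1**2 + y2**2)
    xi1 = np.array([-y2, y1, 0.0]) / r
    xi2 = np.array([y1 * y3, y2 * y3, -(y1**2 + y2**2)]) / r
    return xi1, xi2

y = np.array([1.0, 2.0, 2.0]) / 3.0   # a unit vector
xi1, xi2 = tangent_basis(y)
M = np.column_stack([y, xi1, xi2])
# (y, xi1, xi2) form an orthonormal triple: M'M = I.
print(np.allclose(M.T @ M, np.eye(3)))
```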
Model for axes

Given $\mu_i = \mu(x_i, \gamma)$, use the previous slide to define an orthonormal triple $\mu_i, \xi_{1i}, \xi_{2i}$, where $\xi_{1i} = \xi_{1i}(x_i, \gamma)$ and $\xi_{2i} = \xi_{2i}(x_i, \gamma)$. Now suppose
$$y_i \sim \text{Kent}(\mu_i, \xi_{1i}^*, \xi_{2i}^*, \kappa, \beta), \qquad i = 1, \ldots, n,$$
where
$$\xi_{1i}^* = (\cos\psi_i)\xi_{1i} + (\sin\psi_i)\xi_{2i}, \qquad \xi_{2i}^* = -(\sin\psi_i)\xi_{1i} + (\cos\psi_i)\xi_{2i},$$
and $\psi_i = \psi(x_i, \alpha)$, i.e. $\psi_i$ depends on the covariate vector $x_i$ and a further parameter vector $\alpha$.
Parameters in the Kent regression model

There are three parameter vectors: $\alpha$; $\gamma$; and $(\kappa, \beta)$. In particular:

- The parameter vector $\gamma$, along with the covariate vector $x_i$, determines the mean direction $\mu_i$, and also the orthonormal triple $\mu_i$, $\xi_{1i}$ and $\xi_{2i}$.
- The parameter vector $\alpha$, along with $x_i$, determines the angle $\psi_i$. Note that the angle $\psi_i$ is measured relative to the coordinate system determined by the orthonormal triple $\mu_i$, $\xi_{1i}$ and $\xi_{2i}$, which depends on $i$.
- The parameters $\kappa \geq 0$ and $\beta \geq 0$ are dispersion parameters. If so desired, one could allow $\kappa$ and $\beta$ to depend on $x_i$ and further parameters, but for simplicity we assume here that $\kappa$ and $\beta$ do not depend on $i$.
The log-likelihood under the Kent model

The log-likelihood under the Kent model is given by
$$\ell(\alpha, \beta, \gamma, \kappa) = -n \log c_K(\kappa, \beta) + \kappa \sum_{i=1}^n y_i^\top \mu_i + \beta \sum_{i=1}^n \left\{ (y_i^\top \xi_{1i}^*)^2 - (y_i^\top \xi_{2i}^*)^2 \right\},$$
where, as above, $\mu_i = \mu(x_i, \gamma)$,
$$\xi_{1i}^* = (\cos\psi_i)\xi_{1i} + (\sin\psi_i)\xi_{2i}, \qquad \xi_{2i}^* = -(\sin\psi_i)\xi_{1i} + (\cos\psi_i)\xi_{2i},$$
$\xi_{1i} = \xi_{1i}(x_i, \gamma)$, $\xi_{2i} = \xi_{2i}(x_i, \gamma)$, $\psi_i = \psi_i(x_i, \alpha)$, and $c_K(\kappa, \beta)$ is the Kent normalising constant.

A key point: this construction is generic, in the sense that we can use any suitable rotationally symmetric Fisher regression model for $\mu_i = \mu(x_i, \gamma)$, and any suitable (double) von Mises model for $\psi_i = \psi_i(x_i, \alpha)$.
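The data-dependent part of this log-likelihood (everything except the $-n \log c_K(\kappa, \beta)$ term) vectorises directly. A sketch, assuming numpy, with my own function and argument names:

```python
import numpy as np

def kent_reg_loglik_unnorm(Y, mu, xi1, xi2, psi, kappa, beta):
    """Kent regression log-likelihood without the -n*log c_K(kappa, beta)
    term.  Rows of Y, mu, xi1, xi2 are the per-observation vectors; psi
    holds the rotation angles psi_i = psi(x_i, alpha)."""
    xi1s = np.cos(psi)[:, None] * xi1 + np.sin(psi)[:, None] * xi2
    xi2s = -np.sin(psi)[:, None] * xi1 + np.cos(psi)[:, None] * xi2
    return (kappa * np.sum(Y * mu)
            + beta * np.sum(np.sum(Y * xi1s, axis=1) ** 2
                            - np.sum(Y * xi2s, axis=1) ** 2))

# Toy check: responses exactly at the mean directions, psi_i = 0.
n = 4
mu = np.tile([0.0, 0.0, 1.0], (n, 1))
xi1 = np.tile([1.0, 0.0, 0.0], (n, 1))
xi2 = np.tile([0.0, 1.0, 0.0], (n, 1))
psi = np.zeros(n)
# Quadratic term vanishes, leaving kappa * n.
print(kent_reg_loglik_unnorm(mu, mu, xi1, xi2, psi, 5.0, 1.0))  # -> 20.0
```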
A structured (or switching) optimisation algorithm

When maximising the log-likelihood, we have found it desirable to cycle between three steps:

Step 1: maximise $\ell(\alpha^{(m)}, \beta^{(m)}, \gamma, \kappa^{(m)})$ over $\gamma$ with $\alpha^{(m)}, \beta^{(m)}, \kappa^{(m)}$ held fixed, to obtain $\gamma^{(m+1)}$.

Step 2: maximise $\ell(\alpha, \beta^{(m)}, \gamma^{(m+1)}, \kappa^{(m)})$ over $\alpha$ with $\beta^{(m)}, \gamma^{(m+1)}$ and $\kappa^{(m)}$ held fixed, to obtain $\alpha^{(m+1)}$.

Step 3: maximise $\ell(\alpha^{(m+1)}, \beta, \gamma^{(m+1)}, \kappa)$ over $\beta$ and $\kappa$ with $\alpha^{(m+1)}$ and $\gamma^{(m+1)}$ held fixed, to obtain $\beta^{(m+1)}$ and $\kappa^{(m+1)}$.
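The cycling scheme can be sketched generically as block coordinate ascent. This toy version (assuming scipy is available; all names are mine) uses a black-box optimiser for each block, whereas in practice each step would call the specialised fitter described on the next slide:

```python
import numpy as np
from scipy.optimize import minimize

def switching_fit(loglik, gamma0, alpha0, kb0, n_cycles=20):
    """Sketch of the structured (switching) algorithm: cycle the three
    maximisation steps, each over one parameter block with the others
    held fixed.  loglik(gamma, alpha, kb) is the log-likelihood."""
    gamma, alpha, kb = map(np.asarray, (gamma0, alpha0, kb0))
    for _ in range(n_cycles):
        # Step 1: over gamma; Step 2: over alpha; Step 3: over (kappa, beta).
        gamma = minimize(lambda g: -loglik(g, alpha, kb), gamma).x
        alpha = minimize(lambda a: -loglik(gamma, a, kb), alpha).x
        kb = minimize(lambda t: -loglik(gamma, alpha, t), kb).x
    return gamma, alpha, kb

# Toy check on a separable concave objective with maximum at (1, 2, (3, 4)).
f = lambda g, a, t: -((g[0] - 1)**2 + (a[0] - 2)**2
                      + (t[0] - 3)**2 + (t[1] - 4)**2)
g, a, t = switching_fit(f, [0.0], [0.0], [0.0, 0.0])
print(np.allclose(np.concatenate([g, a, t]), [1.0, 2.0, 3.0, 4.0], atol=1e-3))
```

Because each step cannot decrease the log-likelihood, the cycle produces a monotone sequence of log-likelihood values.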
Comments on the algorithm

The normalising constant $c_K(\kappa, \beta)$ can be calculated exactly (using the holonomic gradient method), or approximately to a high level of accuracy (using e.g. the Kume and Wood (2005) saddlepoint approximation). So Step 3 in the algorithm is relatively straightforward.

Step 1 can be performed by modifying (because of the addition of the quadratic term) whatever algorithm is used to fit the rotationally symmetric Fisher model.

Step 2 is equivalent to fitting a weighted (double) von Mises regression model for the $\psi_i$. The weight for observation $i$ is proportional to $(y_i^\top \xi_{1i})^2 + (y_i^\top \xi_{2i})^2$. Step 2 is simplified by the following result.
A useful lemma

Consider a two-parameter natural exponential family log-likelihood based on an IID sample of size $n$:
$$\ell(\theta_1, \theta_2) = -n \log c(\theta_1, \theta_2) + \theta_1 t_1 + \theta_2 t_2,$$
where $\theta_1$ and $\theta_2$ are natural parameters and $(t_1, t_2)$ is the sufficient statistic. Let $h(\bar{t}_1, \bar{t}_2) = \ell\{\hat{\theta}_1(\bar{t}_1, \bar{t}_2), \hat{\theta}_2(\bar{t}_1, \bar{t}_2)\}$ denote the maximised log-likelihood, viewed as a function of $\bar{t}_1 = t_1/n$ and $\bar{t}_2 = t_2/n$.

Lemma. $h(\bar{t}_1, \bar{t}_2)$ is an increasing function of $\bar{t}_1$ at $(\bar{t}_1, \bar{t}_2)$ if and only if $\hat{\theta}_1 = \hat{\theta}_1(\bar{t}_1, \bar{t}_2)$ is strictly positive at $(\bar{t}_1, \bar{t}_2)$.
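The lemma follows from the envelope theorem; a short derivation consistent with the notation above (writing $t_j = n\bar{t}_j$):

```latex
h(\bar t_1, \bar t_2) = \ell\{\hat\theta_1(\bar t_1,\bar t_2),\,
                              \hat\theta_2(\bar t_1,\bar t_2)\},
\qquad
\ell(\theta_1,\theta_2) = -n\log c(\theta_1,\theta_2)
                          + n\theta_1\bar t_1 + n\theta_2\bar t_2 .
% The MLE satisfies the score equations, so differentiating h and using
% \partial\ell/\partial\theta_j |_{\hat\theta} = 0 leaves only the direct term:
\frac{\partial h}{\partial \bar t_1}
 = \sum_{j=1}^{2}
   \frac{\partial \ell}{\partial \theta_j}\bigg|_{\hat\theta}
   \frac{\partial \hat\theta_j}{\partial \bar t_1}
 + n\,\hat\theta_1
 = n\,\hat\theta_1(\bar t_1, \bar t_2),
% which is positive exactly when \hat\theta_1 > 0, proving the lemma.
```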
Application of the lemma

An application of the lemma to Step 2 of the computational algorithm shows that, because $\beta \geq 0$, Step 2 is equivalent to maximising over $\alpha$ the quadratic term in the log-likelihood, namely
$$\sum_{i=1}^n \left\{ (y_i^\top \xi_{1i}^*)^2 - (y_i^\top \xi_{2i}^*)^2 \right\}.$$
This is equivalent to maximising the log-likelihood of a weighted (double) von Mises regression model.
An alternative approach

Recall the Fisher regression model in Case (iv): $y_i \sim \text{Fisher}(\mu_i, \kappa)$, where $\mu_i = Q R_i \delta$ with $Q$ and $R_i$ ($3 \times 3$) rotation matrices and $\delta \in S^2$. We could modify this to a Kent distribution by writing $y_i \sim \text{Kent}(\mu_i, \xi_{1i}, \xi_{2i}, \kappa, \beta)$, where $\mu_i$, $\xi_{1i}$ and $\xi_{2i}$ are an orthonormal triple determined by
$$Q R_i = (\mu_i, \xi_{1i}, \xi_{2i}),$$
with $R_i = R(x_i, \gamma)$.
Animal (seal) tracking data

Some numerical results from the animal movement tracking data found in Jonsen et al. (2005). Estimated parameters and log-likelihood for the Fisher and Kent regression models, assuming the linear predictor is linear in time.

Fisher: log-likelihood = 377.6, $\hat{\kappa}$ = , $\hat{\gamma}$ = ( , , ).
Kent: log-likelihood = 432.8, $\hat{\kappa}$ = , $\hat{\beta}$ = , $\hat{\gamma}$ = ( , , ).
Animal (seal) tracking data plot
A single simulated example
True $\gamma = (0.02, 0.03, 0.004)$; estimated $\hat{\gamma}$ = ( , , ). True $\kappa = 50$ and the estimated $\kappa$, $\hat{\kappa}$, is , with $n = 50$. The $x_i$ are scalars from a normal distribution with mean 0 and standard deviation 10. The latitude goes from 4 to 66.8 degrees and the longitude from 9 to 329 degrees.
Some simulation results

Some simulation results from the Fisher model under Case (iv).

Table 1: Estimated parameters (averages over 500 simulations); columns give true betas, estimated betas, and estimated kappa, with rows for $n = 50$ and $n = 100$ at two values of kappa each.

Table 2: Square root of the MSEs of the parameters (500 simulations); columns give estimated betas and estimated kappa, with the same rows.
Selected References

Chang, T. (1986). Spherical regression. Annals of Statistics, 14.

Downs, T.D. (2003). Spherical regression. Biometrika, 90.

Jonsen, I.D., Flemming, J.M. and Myers, R.A. (2005). Robust state-space modelling of animal movement data. Ecology, 86.

Kent, J.T. (1982). The Fisher–Bingham distribution on the sphere. Journal of the Royal Statistical Society, Series B, 44.

Kume, A. and Wood, A.T.A. (2005). Saddlepoint approximations for the Bingham and Fisher–Bingham normalising constants. Biometrika, 92.

Rivest, L.-P. (1989). Spherical regression for concentrated Fisher–von Mises distributions. Annals of Statistics, 17.

Rosenthal, M., Wei, W., Klassen, E. and Srivastava, A. (2014). Spherical regression models using projective linear transformations. Journal of the American Statistical Association (to appear).
More informationChapter 5. The multivariate normal distribution. Probability Theory. Linear transformations. The mean vector and the covariance matrix
Probability Theory Linear transformations A transformation is said to be linear if every single function in the transformation is a linear combination. Chapter 5 The multivariate normal distribution When
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 254 Part V
More informationReflections and Rotations in R 3
Reflections and Rotations in R 3 P. J. Ryan May 29, 21 Rotations as Compositions of Reflections Recall that the reflection in the hyperplane H through the origin in R n is given by f(x) = x 2 ξ, x ξ (1)
More informationLecture 5: A step back
Lecture 5: A step back Last time Last time we talked about a practical application of the shrinkage idea, introducing James-Stein estimation and its extension We saw our first connection between shrinkage
More informationAccurate directional inference for vector parameters
Accurate directional inference for vector parameters Nancy Reid February 26, 2016 with Don Fraser, Nicola Sartori, Anthony Davison Nancy Reid Accurate directional inference for vector parameters York University
More informationEstimation of parametric functions in Downton s bivariate exponential distribution
Estimation of parametric functions in Downton s bivariate exponential distribution George Iliopoulos Department of Mathematics University of the Aegean 83200 Karlovasi, Samos, Greece e-mail: geh@aegean.gr
More informationsimple if it completely specifies the density of x
3. Hypothesis Testing Pure significance tests Data x = (x 1,..., x n ) from f(x, θ) Hypothesis H 0 : restricts f(x, θ) Are the data consistent with H 0? H 0 is called the null hypothesis simple if it completely
More informationA Very Brief Summary of Statistical Inference, and Examples
A Very Brief Summary of Statistical Inference, and Examples Trinity Term 2008 Prof. Gesine Reinert 1 Data x = x 1, x 2,..., x n, realisations of random variables X 1, X 2,..., X n with distribution (model)
More informationHT Introduction. P(X i = x i ) = e λ λ x i
MODS STATISTICS Introduction. HT 2012 Simon Myers, Department of Statistics (and The Wellcome Trust Centre for Human Genetics) myers@stats.ox.ac.uk We will be concerned with the mathematical framework
More informationp y (1 p) 1 y, y = 0, 1 p Y (y p) = 0, otherwise.
1. Suppose Y 1, Y 2,..., Y n is an iid sample from a Bernoulli(p) population distribution, where 0 < p < 1 is unknown. The population pmf is p y (1 p) 1 y, y = 0, 1 p Y (y p) = (a) Prove that Y is the
More information[y i α βx i ] 2 (2) Q = i=1
Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation
More informationThreshold estimation in marginal modelling of spatially-dependent non-stationary extremes
Threshold estimation in marginal modelling of spatially-dependent non-stationary extremes Philip Jonathan Shell Technology Centre Thornton, Chester philip.jonathan@shell.com Paul Northrop University College
More informationMultivariate Statistical Analysis
Multivariate Statistical Analysis Fall 2011 C. L. Williams, Ph.D. Lecture 4 for Applied Multivariate Analysis Outline 1 Eigen values and eigen vectors Characteristic equation Some properties of eigendecompositions
More informationLecture 7 Introduction to Statistical Decision Theory
Lecture 7 Introduction to Statistical Decision Theory I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 20, 2016 1 / 55 I-Hsiang Wang IT Lecture 7
More informationLecture 1 October 9, 2013
Probabilistic Graphical Models Fall 2013 Lecture 1 October 9, 2013 Lecturer: Guillaume Obozinski Scribe: Huu Dien Khue Le, Robin Bénesse The web page of the course: http://www.di.ens.fr/~fbach/courses/fall2013/
More informationBayesian Inference by Density Ratio Estimation
Bayesian Inference by Density Ratio Estimation Michael Gutmann https://sites.google.com/site/michaelgutmann Institute for Adaptive and Neural Computation School of Informatics, University of Edinburgh
More informationLecture 4 September 15
IFT 6269: Probabilistic Graphical Models Fall 2017 Lecture 4 September 15 Lecturer: Simon Lacoste-Julien Scribe: Philippe Brouillard & Tristan Deleu 4.1 Maximum Likelihood principle Given a parametric
More informationSTAT 730 Chapter 5: Hypothesis Testing
STAT 730 Chapter 5: Hypothesis Testing Timothy Hanson Department of Statistics, University of South Carolina Stat 730: Multivariate Analysis 1 / 28 Likelihood ratio test def n: Data X depend on θ. The
More informationKey Algebraic Results in Linear Regression
Key Algebraic Results in Linear Regression James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) 1 / 30 Key Algebraic Results in
More informationNaive Bayes and Gaussian Bayes Classifier
Naive Bayes and Gaussian Bayes Classifier Elias Tragas tragas@cs.toronto.edu October 3, 2016 Elias Tragas Naive Bayes and Gaussian Bayes Classifier October 3, 2016 1 / 23 Naive Bayes Bayes Rules: Naive
More informationMax. Likelihood Estimation. Outline. Econometrics II. Ricardo Mora. Notes. Notes
Maximum Likelihood Estimation Econometrics II Department of Economics Universidad Carlos III de Madrid Máster Universitario en Desarrollo y Crecimiento Económico Outline 1 3 4 General Approaches to Parameter
More information3.0.1 Multivariate version and tensor product of experiments
ECE598: Information-theoretic methods in high-dimensional statistics Spring 2016 Lecture 3: Minimax risk of GLM and four extensions Lecturer: Yihong Wu Scribe: Ashok Vardhan, Jan 28, 2016 [Ed. Mar 24]
More informationTAMS39 Lecture 2 Multivariate normal distribution
TAMS39 Lecture 2 Multivariate normal distribution Martin Singull Department of Mathematics Mathematical Statistics Linköping University, Sweden Content Lecture Random vectors Multivariate normal distribution
More informationLikelihood inference in the presence of nuisance parameters
Likelihood inference in the presence of nuisance parameters Nancy Reid, University of Toronto www.utstat.utoronto.ca/reid/research 1. Notation, Fisher information, orthogonal parameters 2. Likelihood inference
More informationThe formal relationship between analytic and bootstrap approaches to parametric inference
The formal relationship between analytic and bootstrap approaches to parametric inference T.J. DiCiccio Cornell University, Ithaca, NY 14853, U.S.A. T.A. Kuffner Washington University in St. Louis, St.
More informationStatistics & Data Sciences: First Year Prelim Exam May 2018
Statistics & Data Sciences: First Year Prelim Exam May 2018 Instructions: 1. Do not turn this page until instructed to do so. 2. Start each new question on a new sheet of paper. 3. This is a closed book
More informationLecture 3 September 1
STAT 383C: Statistical Modeling I Fall 2016 Lecture 3 September 1 Lecturer: Purnamrita Sarkar Scribe: Giorgio Paulon, Carlos Zanini Disclaimer: These scribe notes have been slightly proofread and may have
More informationLocal Influence and Residual Analysis in Heteroscedastic Symmetrical Linear Models
Local Influence and Residual Analysis in Heteroscedastic Symmetrical Linear Models Francisco José A. Cysneiros 1 1 Departamento de Estatística - CCEN, Universidade Federal de Pernambuco, Recife - PE 5079-50
More information