Parameter estimation in linear Gaussian covariance models
1 Parameter estimation in linear Gaussian covariance models
Caroline Uhler (IST Austria)
Joint work with Piotr Zwiernik (UC Berkeley) and Donald Richards (Penn State University)
Big Data Reunion Workshop, Simons Institute, UC Berkeley, December 16, 2014
Caroline Uhler (IST Austria), Linear Gaussian covariance models, Berkeley, December 2014
2 Linear Gaussian covariance model
$\mathbb{S}^p$: (real) symmetric $p \times p$ matrices
$\mathbb{S}^p_{\succ 0}$: cone of (real) symmetric $p \times p$ positive definite matrices
Definition (Linear Gaussian covariance model): A random vector $X \in \mathbb{R}^p$ satisfies the linear Gaussian covariance model $\mathcal{M}_G$ given by $G = (G_0, G_1, \dots, G_r)$, $G_i \in \mathbb{S}^p$, if $X \sim N_p(\mu, \Sigma_\theta)$ and
$$\Sigma_\theta = G_0 + \sum_{i=1}^r \theta_i G_i, \qquad \theta = (\theta_1, \dots, \theta_r) \in \mathbb{R}^r.$$
$\mathcal{M}_G$ is parametrized by a spectrahedron:
$$\Theta_G = \Big\{ \theta = (\theta_1, \dots, \theta_r) \in \mathbb{R}^r \;:\; G_0 + \sum_{i=1}^r \theta_i G_i \in \mathbb{S}^p_{\succ 0} \Big\}$$
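The definition above can be sketched numerically. This is a minimal illustration, not code from the talk: the helper names `sigma_theta`, `in_spectrahedron` and `correlation_basis` are hypothetical, and the basis used is the correlation-matrix model ($G_0 = I_p$, $G_{ij} = E_{ij} + E_{ji}$) that the slides list among the examples. Membership in $\Theta_G$ is tested by checking that the smallest eigenvalue of $\Sigma_\theta$ is positive.

```python
import numpy as np

def sigma_theta(G, theta):
    """Return Sigma_theta = G[0] + sum_i theta[i] * G[i + 1]."""
    Sigma = np.array(G[0], dtype=float)
    for t, Gi in zip(theta, G[1:]):
        Sigma = Sigma + t * Gi
    return Sigma

def in_spectrahedron(G, theta):
    """True if Sigma_theta is positive definite, i.e. theta lies in Theta_G."""
    return bool(np.linalg.eigvalsh(sigma_theta(G, theta)).min() > 0)

def correlation_basis(p):
    """G_0 = I_p and G_ij = E_ij + E_ji for i < j (correlation-matrix model)."""
    G = [np.eye(p)]
    for i in range(p):
        for j in range(i + 1, p):
            E = np.zeros((p, p))
            E[i, j] = E[j, i] = 1.0
            G.append(E)
    return G

G = correlation_basis(3)
# theta is ordered as (theta_12, theta_13, theta_23)
inside = in_spectrahedron(G, [1/2, 1/3, 1/4])     # a valid correlation matrix
outside = in_spectrahedron(G, [0.9, -0.9, 0.9])   # negative determinant, not in Theta_G
```

An eigenvalue check is used here for readability; attempting a Cholesky factorization would be a common alternative.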
3 Examples of linear Gaussian covariance models
Correlation matrices:
$$\Theta_G = \Big\{ \theta \in \mathbb{R}^{\binom{p}{2}} \;:\; I_p + \sum_{1 \le i < j \le p} \theta_{ij} (E_{ij} + E_{ji}) \in \mathbb{S}^p_{\succ 0} \Big\},$$
where $E_{ij}$ is the $p \times p$ matrix whose $(i, j)$ entry is 1 and whose other entries are 0.
Covariance matrices with prescribed zeros (relevance networks): Butte et al. (2000), Chaudhuri, Drton & Richardson (2007)
Stationary stochastic processes from repeated time series data: Anderson (1970, 1973)
Brownian motion tree models:
  Phylogenetic models (Felsenstein (1973, 1981))
  Network tomography models for analyzing the structure of the connections in the Internet (Eriksson et al. (2010), Tsang et al. (2004))
4 Maximum likelihood estimation
$X_1, \dots, X_n$: sample from $N_p(\mu, \Sigma^*)$, where $\Sigma^*$ is the true covariance matrix
$\bar{X} = \frac{1}{n} \sum_{i=1}^n X_i$: sample mean, used to estimate $\mu$
$S_n = \frac{1}{n} \sum_{i=1}^n (X_i - \bar{X})(X_i - \bar{X})^T$: sample covariance matrix
$S_n \in \mathbb{S}^p_{\succ 0}$ with probability 1 if $n > p$
Log-likelihood function $\ell(\,\cdot\,; S_n) : \mathbb{S}^p_{\succ 0} \to \mathbb{R}$ (up to an additive constant):
$$\ell(\Sigma; S_n) = -\frac{n}{2} \log\det \Sigma - \frac{n}{2} \mathrm{tr}(S_n \Sigma^{-1})$$
For linear Gaussian covariance models we constrain $\ell(\,\cdot\,; S_n)$ to $\Theta_G$:
$$\text{MLE:} \quad \hat{\Sigma} := \arg\max_{\theta \in \Theta_G} \ell(\Sigma_\theta; S_n)$$
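As a small illustration of the two quantities on this slide, the sketch below computes the sample covariance matrix with the $1/n$ normalization and evaluates the log-likelihood with additive constants dropped. The function names, the seed and the simulated standard-normal data are assumptions made for the example only.

```python
import numpy as np

def sample_covariance(X):
    """S_n = (1/n) * sum_i (X_i - Xbar)(X_i - Xbar)^T for an (n, p) data array."""
    Xc = X - X.mean(axis=0)
    return Xc.T @ Xc / X.shape[0]

def log_likelihood(Sigma, S, n):
    """l(Sigma; S_n) = -(n/2) log det Sigma - (n/2) tr(S_n Sigma^{-1}), constants dropped."""
    sign, logdet = np.linalg.slogdet(Sigma)
    if sign <= 0:
        # Necessary (not sufficient) condition for positive definiteness.
        raise ValueError("det(Sigma) must be positive")
    # tr(S Sigma^{-1}) = tr(Sigma^{-1} S), computed without forming the inverse.
    return -0.5 * n * (logdet + np.trace(np.linalg.solve(Sigma, S)))

rng = np.random.default_rng(0)      # hypothetical seed
n, p = 50, 3
X = rng.standard_normal((n, p))     # simulated data, for illustration only
S = sample_covariance(X)

ll_at_S = log_likelihood(S, S, n)                 # unconstrained maximum
ll_at_identity = log_likelihood(np.eye(p), S, n)  # any other Sigma scores no higher
```

Without the model constraint the maximizer is $S_n$ itself, which is why `ll_at_S` bounds the value at any other positive definite matrix.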
5 Maximum likelihood estimation
For $K := \Sigma^{-1}$, the function $\ell(K; S_n) = \frac{n}{2} \log\det K - \frac{n}{2} \mathrm{tr}(S_n K)$ is concave in $K \in \mathbb{S}^p_{\succ 0}$ with global maximum $K = S_n^{-1}$.
A concave function restricted to an affine subspace remains concave, so ML estimation in Gaussian graphical models is a convex problem.
$\ell(\Sigma; S_n)$ is not concave for all $\Sigma \in \mathbb{S}^p_{\succ 0}$, but it is strictly concave over the random convex region
$$\Delta_{2S_n} := \{ \Sigma \in \mathbb{S}^p_{\succ 0} \;:\; \Sigma \prec 2 S_n \}$$
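One way to see the strict concavity over the region $\Delta_{2S_n} = \{\Sigma \succ 0 : \Sigma \prec 2S_n\}$ is to compute the second directional derivative of $\ell$ in $\Sigma$. The following is a short worked derivation under the notation of the slides; the symbols $D$, $A$ and $C$ are introduced here for the argument and do not appear in the talk.

```latex
% D \in \mathbb{S}^p is a nonzero direction, A := \Sigma^{-1}.
% Uses d^2/dt^2 \log\det(\Sigma + tD)|_{t=0} = -\mathrm{tr}(ADAD)
% and  d^2/dt^2 (\Sigma + tD)^{-1}|_{t=0} = 2\,ADADA.
\frac{d^2}{dt^2}\,\ell(\Sigma + tD;\, S_n)\Big|_{t=0}
  = \frac{n}{2}\,\mathrm{tr}(A D A D) \;-\; n\,\mathrm{tr}(A D A D A\, S_n)
  = -\frac{n}{2}\,\mathrm{tr}\!\big( A D A D A \,(2 S_n - \Sigma) \big).
% A D A D A = C C^T with C := A D A^{1/2}, which is positive semidefinite
% and nonzero whenever D \neq 0. Hence, if 2 S_n - \Sigma \succ 0,
\mathrm{tr}\!\big( C C^T (2 S_n - \Sigma) \big)
  = \mathrm{tr}\!\big( C^T (2 S_n - \Sigma)\, C \big) > 0,
% so the second derivative is strictly negative in every direction D \neq 0.
```

The sign of the second derivative is controlled entirely by $2S_n - \Sigma$, which explains why the region is defined through the bound $\Sigma \prec 2S_n$.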
6 Numerical optimization of the likelihood
Newton-Raphson method:
  Start at the natural least-squares estimator $\bar{\Sigma}$ given by $\bar{\theta}$, the solution to
  $$\sum_{j=1}^r \bar{\theta}_j \, \mathrm{tr}(G_i G_j) = \mathrm{tr}(S_n G_i) \quad \text{for all } i = 1, \dots, r$$
  Update:
  $$\theta^{(k+1)} = \theta^{(k)} - \big( \nabla_\theta \nabla_\theta^T \ell(\theta^{(k)}; S_n) \big)^{-1} \nabla_\theta \ell(\theta^{(k)}; S_n)$$
Our observation:
  For simulated data the Newton-Raphson algorithm typically converges in 2-3 steps.
  It converges to a point $\hat{\Sigma}$ with larger likelihood than the true (data-generating) covariance matrix $\Sigma^*$, and usually $\hat{\Sigma} \in \Delta_{2S_n}$.
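The scheme above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: all function names are hypothetical, the gradient and Hessian are derived here from $\ell(\Sigma_\theta; S_n)$ using $\partial \Sigma_\theta^{-1} / \partial \theta_j = -\Sigma_\theta^{-1} G_j \Sigma_\theta^{-1}$, and there is no step-size control or positive-definiteness safeguard. The right-hand side of the starting equations is written as $\mathrm{tr}((S_n - G_0) G_i)$, which reduces to the slide's $\mathrm{tr}(S_n G_i)$ whenever $\mathrm{tr}(G_0 G_i) = 0$, as in the correlation model used in the example.

```python
import numpy as np

def correlation_basis(p):
    """G_0 = I_p and G_ij = E_ij + E_ji for i < j."""
    G = [np.eye(p)]
    for i in range(p):
        for j in range(i + 1, p):
            E = np.zeros((p, p))
            E[i, j] = E[j, i] = 1.0
            G.append(E)
    return G

def sigma_theta(G, theta):
    return G[0] + sum(t * Gi for t, Gi in zip(theta, G[1:]))

def log_likelihood(G, theta, S, n):
    Sigma = sigma_theta(G, theta)
    return -0.5 * n * (np.linalg.slogdet(Sigma)[1] + np.trace(np.linalg.solve(Sigma, S)))

def least_squares_start(G, S):
    """Solve sum_j theta_j tr(G_i G_j) = tr((S - G_0) G_i) for i = 1..r."""
    Gs = G[1:]
    A = np.array([[np.trace(Gi @ Gj) for Gj in Gs] for Gi in Gs])
    b = np.array([np.trace((S - G[0]) @ Gi) for Gi in Gs])
    return np.linalg.solve(A, b)

def grad_hess(G, theta, S, n):
    """Gradient and Hessian of the log-likelihood with respect to theta."""
    Gs = G[1:]
    A = np.linalg.inv(sigma_theta(G, theta))   # Sigma^{-1}
    M = A @ S @ A                              # Sigma^{-1} S Sigma^{-1}
    grad = np.array([0.5 * n * np.trace((M - A) @ Gi) for Gi in Gs])
    hess = np.array([[0.5 * n * np.trace(A @ Gi @ A @ Gj) - n * np.trace(M @ Gi @ A @ Gj)
                      for Gj in Gs] for Gi in Gs])
    return grad, hess

def newton_raphson(G, S, n, steps=20, tol=1e-8):
    """Plain Newton-Raphson started at the least-squares estimator."""
    theta = least_squares_start(G, S)
    for _ in range(steps):
        grad, hess = grad_hess(G, theta, S, n)
        if np.linalg.norm(grad) < tol:
            break
        theta = theta - np.linalg.solve(hess, grad)
    return theta

# Simulated example (seed and sample size are arbitrary choices)
rng = np.random.default_rng(0)
Sigma_star = np.array([[1, 1/2, 1/3], [1/2, 1, 1/4], [1/3, 1/4, 1]])
n = 500
X = rng.multivariate_normal(np.zeros(3), Sigma_star, size=n)
Xc = X - X.mean(axis=0)
S = Xc.T @ Xc / n

G = correlation_basis(3)
theta_bar = least_squares_start(G, S)   # starting point
theta_hat = newton_raphson(G, S, n)     # Newton-Raphson limit
```

For the correlation basis the starting equations decouple, so `theta_bar` simply collects the off-diagonal entries of `S`.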
7 Estimating correlation matrices
A venerable problem that becomes difficult even for $3 \times 3$ matrices: Rousseeuw & Molenberghs (1994), Small, Wang & Yang (2000), Stuart & Ord (1991)
Simulations using the Newton-Raphson algorithm
Simulations for 2 examples:
$$\Sigma^* = \begin{pmatrix} 1 & 1/2 & 1/3 \\ 1/2 & 1 & 1/4 \\ 1/3 & 1/4 & 1 \end{pmatrix} \quad \text{and} \quad \Sigma^* = \begin{pmatrix} 1 & 1/2 & 1/3 & 1/4 \\ 1/2 & 1 & 1/5 & 1/6 \\ 1/3 & 1/5 & 1 & 1/7 \\ 1/4 & 1/6 & 1/7 & 1 \end{pmatrix}.$$
The least-squares estimator is given by
$$\bar{\Sigma} = I_p + \sum_{1 \le i < j \le p} (S_n)_{ij} (E_{ij} + E_{ji})$$
Plot the ratio of likelihoods $L(\Sigma^{(t)}) / L(\Sigma^*)$ and compare to $L(S_n) / L(\Sigma^*)$
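The least-squares estimator and the likelihood ratios for the $4 \times 4$ example can be sketched as below. This is a hedged illustration: the helper names, the seed and the sample size are assumptions, and the ratios are reported on the log scale, i.e. $\log\big(L(\Sigma)/L(\Sigma^*)\big)$, rather than as the plotted ratios themselves.

```python
import numpy as np

SIGMA_4 = np.array([
    [1,   1/2, 1/3, 1/4],
    [1/2, 1,   1/5, 1/6],
    [1/3, 1/5, 1,   1/7],
    [1/4, 1/6, 1/7, 1  ],
])

def least_squares_correlation(S):
    """Sigma_bar = I_p + sum_{i<j} (S_n)_ij (E_ij + E_ji): unit diagonal, off-diagonals of S_n."""
    Sigma_bar = S.copy()
    np.fill_diagonal(Sigma_bar, 1.0)
    return Sigma_bar

def log_likelihood(Sigma, S, n):
    """-(n/2) log det Sigma - (n/2) tr(S Sigma^{-1}), constants dropped."""
    return -0.5 * n * (np.linalg.slogdet(Sigma)[1] + np.trace(np.linalg.solve(Sigma, S)))

def log_ratio(Sigma, Sigma_star, S, n):
    """log( L(Sigma) / L(Sigma_star) )."""
    return log_likelihood(Sigma, S, n) - log_likelihood(Sigma_star, S, n)

rng = np.random.default_rng(1)   # hypothetical seed
n = 200
X = rng.multivariate_normal(np.zeros(4), SIGMA_4, size=n)
Xc = X - X.mean(axis=0)
S = Xc.T @ Xc / n

Sigma_bar = least_squares_correlation(S)
log_ratio_S = log_ratio(S, SIGMA_4, S, n)            # unconstrained benchmark, always >= 0
log_ratio_bar = log_ratio(Sigma_bar, SIGMA_4, S, n)  # value at the Newton-Raphson start
```

Since $S_n$ maximizes the unconstrained likelihood, `log_ratio_S` is an upper bound for the log-ratio along any Newton-Raphson path.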
8 Estimating correlation matrices: Simulations
[Figure: Newton-Raphson paths of the likelihood ratio for $3 \times 3$ (top row) and $4 \times 4$ (bottom row) correlation matrices, with sample sizes n = 10, n = 50 and n = 100 (columns). Horizontal axis: S and Newton steps; vertical axis: Ratio.]
9 Geometric picture
$\Sigma^*$ = true covariance matrix, $S_n$ = sample covariance matrix
$\Delta_{2S_n} = \{ \Sigma \in \mathbb{S}^p_{\succ 0} : \Sigma \prec 2 S_n \}$: (random!) convex region
Clearly: $S_n \in \Delta_{2S_n}$
With high probability: $\Sigma^* \in \Delta_{2S_n}$, $\hat{\Sigma} \in \Delta_{2S_n}$, $\bar{\Sigma} \in \Delta_{2S_n}$
10 Wishart distribution
i.i.d. sample $X_1, \dots, X_n$ from $N_p(\mu, \Sigma)$. Then $n \cdot S_n$ has Wishart distribution $W_p(n-1, \Sigma)$.
If $Q \in \mathbb{R}^{p \times p}$ has full rank and $Y \sim W_p(n, \Sigma)$, then $Q Y Q^T \sim W_p(n, Q \Sigma Q^T)$.
So taking $Q = (\Sigma^*)^{-1/2}$, the matrix $W_{n-1} := n \, (\Sigma^*)^{-1/2} S_n (\Sigma^*)^{-1/2}$ has standard Wishart distribution $W_p(n-1, I_p)$.
Hence:
$$P(\Sigma^* \in \Delta_{2S_n}) = P(2 S_n - \Sigma^* \succ 0) = P\big(2 (\Sigma^*)^{-1/2} S_n (\Sigma^*)^{-1/2} - I_p \succ 0\big) = P\big(W_{n-1} - \tfrac{n}{2} I_p \succ 0\big) = P\big(\lambda_{\min}(W_{n-1}) > n/2\big)$$
Note: $P(\Sigma^* \in \Delta_{2S_n})$ does not depend on $\Sigma^*$.
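Because the probability reduces to a statement about a standard Wishart matrix, it can be estimated by direct simulation. The sketch below is a plain Monte Carlo estimate of $P(\lambda_{\min}(W_{n-1}) > n/2)$; the function name, the number of trials and the seed are assumptions, and $W_{n-1}$ is generated as $Z^T Z$ for an $(n-1) \times p$ matrix $Z$ of independent standard normal entries.

```python
import numpy as np

def prob_true_cov_in_region(n, p, trials=2000, seed=0):
    """Monte Carlo estimate of P(lambda_min(W_{n-1}) > n/2), W_{n-1} ~ W_p(n-1, I_p)."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        Z = rng.standard_normal((n - 1, p))
        W = Z.T @ Z                            # standard Wishart with n-1 degrees of freedom
        if np.linalg.eigvalsh(W)[0] > n / 2:   # eigvalsh returns eigenvalues in ascending order
            hits += 1
    return hits / trials

p_small = prob_true_cov_in_region(n=15, p=3)   # n = 5p
p_large = prob_true_cov_in_region(n=60, p=3)   # n = 20p
```

No covariance matrix enters the simulation, which mirrors the note that the probability does not depend on $\Sigma^*$.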
11 Minimum eigenvalue of Wishart matrix
What is known about the distribution of $\lambda_{\min}(W_n)$?
R. Muirhead, Aspects of Multivariate Statistical Theory (1982):
  The distribution of $\lambda_{\min}(W_n)$ is known but expressed in terms of complicated functions that are hard to evaluate.
  Approximating the integral $P(\lambda_{\min}(W_n) > n/2)$ is hard.
  Convergence to the asymptotic distribution (for $n \to \infty$) is very slow.
However, a recent development in random matrix theory:
  The asymptotic distribution of $\lambda_{\min}(W_n)$ as $n, p \to \infty$ and $n/p \to \gamma > 1$ is given by the Tracy-Widom distribution, with convergence rate $O(\min(n, p)^{-2/3})$ (Ma, 2012).
12 Approximating $P(\Sigma^* \in \Delta_{2S_n})$ for small p and n
[Figure: three panels, (a) p = 3, (b) p = 5, (c) p = 10. Each panel shows the simulated probability, the Tracy-Widom (TW) approximation and the 0.95 line. Horizontal axis: sample size; vertical axis: frequency.]
In each plot, $p \in \{3, 5, 10\}$ is fixed and n varies between p and 20p.
For $n > 14p$ it holds with probability 0.95 that $\Sigma^* \in \Delta_{2S_n}$.
The above curves converge to the graph of $f : (1, \infty) \to [0, 1]$ with $f(n/p) = \mathbf{1}(n/p \ldots)$, where ...
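A heuristic for why the limiting curve is a step function in $n/p$ can be sketched from the classical almost-sure limit of the smallest eigenvalue of a standard Wishart matrix. This derivation is an assumption-labeled illustration and is not recovered from the truncated formula on the slide; the symbol $\gamma$ denotes the limiting ratio $n/p$ as on the previous slide.

```latex
% Assumption: W_n \sim W_p(n, I_p) with n, p \to \infty and n/p \to \gamma > 1.
% Classical limit for the smallest eigenvalue (Bai-Yin / Marchenko-Pastur edge):
\frac{\lambda_{\min}(W_n)}{n} \;\longrightarrow\; \Big( 1 - \frac{1}{\sqrt{\gamma}} \Big)^{2}
  \quad \text{almost surely.}
% The event \lambda_{\min}(W_n) > n/2 then holds asymptotically if and only if
\Big( 1 - \frac{1}{\sqrt{\gamma}} \Big)^{2} > \frac{1}{2}
  \;\Longleftrightarrow\;
  \gamma > \Big( 1 - \frac{1}{\sqrt{2}} \Big)^{-2} = 6 + 4\sqrt{2} \approx 11.66 .
```

Under this heuristic the probability tends to 0 below the critical ratio and to 1 above it, which is at least consistent in magnitude with the finite-sample observation that $n > 14p$ suffices for probability 0.95.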
13 Conclusions and Discussion
The likelihood function for linear Gaussian covariance models is, in general, multimodal.
However, multimodality is relevant only if the sample size is not sufficiently large to compensate for the model dimension.
We derived asymptotic conditions which guarantee when $\Sigma^*$, $\hat{\Sigma}$ and $\bar{\Sigma}$ are contained in the convex region $\Delta_{2S_n}$.
Our results provide lower bounds on the probabilities that the maximum likelihood estimation problem for linear Gaussian covariance models is well behaved.
$\Delta_{2S_n}$ is contained in a larger region over which the likelihood function is strictly concave, and this region is contained in an even larger region over which the likelihood function is unimodal.
We are studying these regions and working on extensions to learning the model.
14 Reference
Zwiernik, Uhler & Richards: Maximum likelihood estimation for linear Gaussian covariance models (arXiv:1408.5604v1 [math.ST], 24 Aug 2014)