Some Areas of Recent Research
1 University of Chicago Department Retreat, October 2012
2 Funders & Collaborators. Funders: NSF (STATMOS), US Department of Energy. Faculty: Mihai Anitescu, Liz Moyer. Postdocs: Jie Chen, Bill Leeds, Ying Sun. Grad students: Stefano Castruccio, Michael Horrell, Andy Poppick. Undergrads: Peter Hansen, Grant Wilder.
3 Preconditioning and fitting Gaussian process models. A Gaussian process $Z$ is determined by its mean and covariance functions: $E Z(x) = \mu(x)$ and $\operatorname{cov}\{Z(x), Z(y)\} = K(x, y)$. Assume the mean is 0 and the covariance structure is known up to a parameter $\theta$. Let $K(\theta)$ be the covariance matrix for the observations $Z = (Z(x_1), \ldots, Z(x_n))'$ given $\theta$. Then the loglik is (ignoring an additive constant) $\ell(\theta) = -\tfrac12 \log|K(\theta)| - \tfrac12 Z' K(\theta)^{-1} Z$. Problem: How to compute $\ell(\theta)$? Particularly $\log|K(\theta)|$?
4 Important aside: Even when the loglik can be computed exactly, maximizing it (or sampling from a posterior) may not be easy. Consider 400 evenly spaced observations on $\mathbb{R}$, where $Z$ is fractional Brownian motion with variogram $\tfrac12 E\{Z(x) - Z(y)\}^2$ proportional to $\ell\,|x - y|^{\alpha}$, including a Gamma-function normalizing factor (its argument is illegible here), with $\ell = 10$ and $\alpha = 1.5$. Neither parameter is estimated well, although there is strong evidence that the parameters lie along a curve in $(\ell, \alpha)$ space. The problem is worse if the Gamma-function factor is left out. I am unaware of any transformation independent of observation locations that would give a concave loglikelihood. This kind of function makes some people in the optimization community unhappy. Things only get worse with more complex models.
5 [Figure: log likelihood as a function of $\ell$ and $\alpha$]
6 Computing exact MLE. Exact computation of the likelihood function for $n$ irregularly sited observations generally requires $O(n^3)$ computation and $O(n^2)$ memory to compute the Cholesky decomposition of the covariance matrix. Computation is becoming cheap much faster than memory. Increasing emphasis on matrix-free methods, in which one never has to store an $n \times n$ matrix, even if this requires more computation.
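As a rough illustration of the exact computation described on this slide, the sketch below evaluates the loglik $-\tfrac12\log|K| - \tfrac12 Z'K^{-1}Z$ (additive constant dropped) through a Cholesky factor. The exponential covariance `exp_cov` and both function names are assumptions made for the example, not the model used in the talk.

```python
import numpy as np

def exp_cov(x, theta):
    """Hypothetical covariance K(x, y) = exp(-|x - y| / theta) on 1-D sites."""
    d = np.abs(x[:, None] - x[None, :])
    return np.exp(-d / theta)

def loglik_cholesky(z, K):
    """Return -0.5*log|K| - 0.5*z'K^{-1}z using the Cholesky factor of K."""
    L = np.linalg.cholesky(K)                  # K = L L'; O(n^3) flops, O(n^2) memory
    logdet = 2.0 * np.sum(np.log(np.diag(L)))  # log|K| = 2 * sum(log L_ii)
    w = np.linalg.solve(L, z)                  # w = L^{-1} z, so w'w = z'K^{-1}z
    return -0.5 * logdet - 0.5 * (w @ w)

if __name__ == "__main__":
    x = np.linspace(0.0, 10.0, 50)
    z = np.random.default_rng(0).standard_normal(50)
    print(loglik_cholesky(z, exp_cov(x, 2.0)))
```

Both the factorization and the stored matrix scale badly with $n$, which is the cost the matrix-free methods are meant to avoid.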
7 Iterative solution of linear equations. Computing the quadratic form in the likelihood is best done by solving systems like $Kx = y$, not by finding $K^{-1}$. Iterative methods: for $K$ positive definite, solving $Kx = y$ is equivalent to minimizing $\tfrac12 x'Kx - x'y$, which can be done by, for example, conjugate gradient. The main computation requires multiplying vectors by $K$. This is fast for sparse $K$ and for some structured (e.g., Toeplitz) matrices. But even for dense unstructured matrices, iterative solution is matrix-free and may require many fewer flops than Cholesky decomposition: $O(n^2 \times \#\text{iterations})$ vs. $O(n^3)$. The number of iterations for an accurate solution is related to the condition number (ratio of largest to smallest eigenvalue) $\kappa(K)$ of $K$.
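A minimal sketch of a matrix-free conjugate gradient solve for $Kx = y$ follows. The covariance function, the block size, and the names `make_matvec` and `conjugate_gradient` are assumptions chosen for illustration. The only access to $K$ is through a routine that multiplies a vector by $K$, building a few rows at a time so the full $n \times n$ matrix is never stored.

```python
import numpy as np

def make_matvec(x, theta, block=256):
    """Return v -> K v for the hypothetical K_ij = exp(-|x_i - x_j| / theta).

    Rows of K are formed in blocks and discarded, so memory stays O(n * block).
    """
    def matvec(v):
        out = np.empty(len(x))
        for s in range(0, len(x), block):
            rows = np.exp(-np.abs(x[s:s + block, None] - x[None, :]) / theta)
            out[s:s + block] = rows @ v
        return out
    return matvec

def conjugate_gradient(matvec, y, tol=1e-10, max_iter=1000):
    """Solve K x = y for symmetric positive definite K given only v -> K v."""
    y = np.asarray(y, dtype=float)
    x = np.zeros_like(y)
    r = y.copy()                 # residual y - K x with x = 0
    p = r.copy()                 # first search direction
    rs = r @ r
    if rs == 0.0:
        return x, 0
    for it in range(1, max_iter + 1):
        Kp = matvec(p)           # the one O(n^2) step per iteration
        alpha = rs / (p @ Kp)
        x += alpha * p
        r -= alpha * Kp
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol * np.linalg.norm(y):
            return x, it
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x, max_iter

if __name__ == "__main__":
    sites = np.arange(100) * 0.2
    rhs = np.random.default_rng(1).standard_normal(100)
    sol, iters = conjugate_gradient(make_matvec(sites, 2.0, block=32), rhs)
    print(iters)
```

Each iteration costs one multiplication by $K$, so the total work is roughly $n^2$ times the iteration count, which is why the condition number matters.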
8 When nearby observations are strongly correlated, $\kappa(K)$ can be very large, so need to precondition: find a matrix $P$ such that $P'K(\theta)P$ is well-conditioned for $\theta$ in the vicinity of the MLE and multiplying a vector by $P$ is fast. Let $Y = P'Z$. Then the loglik (with $Z$ as data, but written in terms of $Y$) equals $\ell(\theta) = -\tfrac12 \log|P'K(\theta)P| + \log|P| - \tfrac12 Y'\{P'K(\theta)P\}^{-1}Y$. Okay for $P$ to depend on $\theta$ as long as this formula is used. Can ignore $\log|P|$ if it doesn't depend on $\theta$ (even if $P$ does). What to do about $\log|P'K(\theta)P|$?
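The preconditioned form of the loglik can be checked numerically. The sketch below is only an illustration under stated assumptions: the covariance is a hypothetical exponential model, and $P$ is taken to be the inverse transposed Cholesky factor of $K$ at a nearby parameter value, which is convenient for a demonstration but is not a fast preconditioner and is not the one used in the talk.

```python
import numpy as np

def exp_cov(x, theta):
    """Hypothetical covariance K(x, y) = exp(-|x - y| / theta)."""
    return np.exp(-np.abs(x[:, None] - x[None, :]) / theta)

def loglik(z, K):
    """Direct loglik: -0.5*log|K| - 0.5*z'K^{-1}z."""
    _, logdet = np.linalg.slogdet(K)
    return -0.5 * logdet - 0.5 * z @ np.linalg.solve(K, z)

def loglik_preconditioned(z, K, P):
    """Same loglik written in terms of Y = P'Z and M = P'KP."""
    y = P.T @ z
    M = P.T @ K @ P
    _, logdet_M = np.linalg.slogdet(M)
    _, logdet_P = np.linalg.slogdet(P)   # log of |det P|
    return -0.5 * logdet_M + logdet_P - 0.5 * y @ np.linalg.solve(M, y)

if __name__ == "__main__":
    x = np.arange(80) * 0.125
    K = exp_cov(x, 2.0)
    # Illustrative P from a nearby theta: P'K(1.8)P = I, so P'K(2.0)P is near I.
    P = np.linalg.inv(np.linalg.cholesky(exp_cov(x, 1.8))).T
    z = np.random.default_rng(2).standard_normal(80)
    print(loglik(z, K), loglik_preconditioned(z, K, P))
    print(np.linalg.cond(K), np.linalg.cond(P.T @ K @ P))
```

The two loglik values should agree up to rounding, while the condition number of $P'KP$ should be far smaller than that of $K$.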
9 Solve score equations instead? (Ignore preconditioning for now.) Writing $K_i(\theta)$ for $\partial K(\theta)/\partial\theta_i$, the score equations are (assume the mean is 0) $Z'K(\theta)^{-1}K_i(\theta)K(\theta)^{-1}Z = \operatorname{tr}\{K(\theta)^{-1}K_i(\theta)\}$ for $i = 1, \ldots, p$. The first term requires only one solve. Instead of the log determinant, need, for each component of $\theta$, $\operatorname{tr}\{K(\theta)^{-1}K_i(\theta)\}$, which requires $n$ solves for exact calculation. Approximate by the unbiased estimate (Hutchinson, 1990) $\frac{1}{N}\sum_{j=1}^{N} U_j'K(\theta)^{-1}K_i(\theta)U_j$, where $U_j = (U_{j1}, \ldots, U_{jn})'$ is a random vector with the $U_{jk}$'s iid and $\Pr(U_{jk} = 1) = \Pr(U_{jk} = -1) = \tfrac12$. Yields unbiased estimating equations.
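The randomized trace estimate can be sketched as follows. This is a toy version under assumptions: a hypothetical one-parameter exponential covariance, its derivative with respect to that parameter, and dense direct solves standing in for the iterative solves one would use at scale. The function names are illustrative.

```python
import numpy as np

def exp_cov_and_derivative(x, theta):
    """Hypothetical K = exp(-d/theta) and its derivative dK/dtheta = (d/theta^2) * K."""
    d = np.abs(x[:, None] - x[None, :])
    K = np.exp(-d / theta)
    return K, (d / theta**2) * K

def hutchinson_trace(K, Ki, N, rng):
    """Estimate tr(K^{-1} K_i) by the average of U_j' K^{-1} K_i U_j over N probes."""
    n = K.shape[0]
    U = rng.choice([-1.0, 1.0], size=(n, N))  # columns are iid +/-1 vectors U_j
    X = np.linalg.solve(K, Ki @ U)            # column j is K^{-1} K_i U_j
    return np.mean(np.sum(U * X, axis=0))     # mean over j of U_j' X_j

if __name__ == "__main__":
    x = np.arange(100) * 0.2
    K, Ki = exp_cov_and_derivative(x, 2.0)
    exact = np.trace(np.linalg.solve(K, Ki))  # needs n solves
    approx = hutchinson_trace(K, Ki, N=2000, rng=np.random.default_rng(3))
    print(exact, approx)
```

The exact trace needs one solve per column of $K_i$, whereas the estimate needs only $N$ solves, so it pays off when $N$ can be kept well below $n$.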
10 Can bound the statistical inefficiency of the procedure in terms of $\kappa(K)$. Thus, if one can find a decent preconditioner for $K$, moderate $N$ works well. Don't need $N$ comparable to $n$! Preconditioning helps in two ways: it reduces the number of iterations needed in the iterative solver, and it reduces the need for large $N$. Scope for further improvement by choosing $U_j$'s not independent. Design of experiments! Stein, Chen and Anitescu (under revision).
11 Some other interests. When low rank approximations to covariance matrices don't work. Won't discuss this here, but this work is likely to annoy some who have been advocating this approach for massive spatial datasets. Modeling and computation for massive (as opposed to large) space-time datasets, without assuming covariance (or inverse covariance) matrices are low rank or sparse. Climate model emulation.
12 One-pass methods. Look at data block by block and summarize the information about $K(\theta)$ from each block so that one doesn't have to go back to the raw data again (Anitescu, Horrell). Simple example: Divide data into $B$ blocks. Within each block, approximate the loglik (or score) function. Are the MLE of $\theta$ and the observed information matrix an adequate approximation? If not, store a more complete representation of the loglik function. Adding logliks across blocks reduces storage with little loss of information? Save a few observations (or other summaries) from each block. Add within-block approximate logliks to the loglik of the sparse observations. For truly massive (petascale, exascale) data, will need more than two layers.
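One way to picture the simple example is the sketch below, which is an assumption-laden toy rather than the method of Anitescu and Horrell. Each block of raw data is read once, its loglik is tabulated on a small grid of $\theta$ values (a "more complete representation" than the MLE and information matrix alone), and the tables are added across blocks. Adding them treats the blocks as independent; the layer of saved sparse observations that would recover between-block dependence is not implemented. The covariance model and all names are hypothetical.

```python
import numpy as np

def exp_cov(x, theta):
    """Hypothetical covariance K(x, y) = exp(-|x - y| / theta)."""
    return np.exp(-np.abs(x[:, None] - x[None, :]) / theta)

def block_loglik(x_b, z_b, theta):
    """Exact loglik of one block, dropping the additive constant."""
    K_b = exp_cov(x_b, theta)
    _, logdet = np.linalg.slogdet(K_b)
    return -0.5 * logdet - 0.5 * z_b @ np.linalg.solve(K_b, z_b)

def one_pass_loglik_curve(blocks, thetas):
    """Sum per-block loglik tables over a theta grid, reading each block once."""
    total = np.zeros(len(thetas))
    for x_b, z_b in blocks:   # after this step the raw block is no longer needed
        total += np.array([block_loglik(x_b, z_b, th) for th in thetas])
    return total

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    x = np.arange(60) * 0.2
    z = np.linalg.cholesky(exp_cov(x, 2.0)) @ rng.standard_normal(60)
    thetas = np.array([1.0, 2.0, 4.0])
    blocks = [(x[:30], z[:30]), (x[30:], z[30:])]
    curve = one_pass_loglik_curve(blocks, thetas)
    print(thetas[np.argmax(curve)])
```

Only the table of length `len(thetas)` is kept per pass, so storage does not grow with the number of blocks.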
20 Climate model emulation. Reproducing some of the output of a GCM under some forcing scenario without actually running it (Castruccio, Leeds, Moyer, Wilder). Or, better yet, producing accurate simulations of actual climate under some forcing scenario. GCM runs we have: NCAR Community Climate System Model version 3 (CCSM3), T31 resolution (approx. 3.75° × 3.75° grid cells). Input is CO₂; output is temperature $T(t)$ and precipitation $P(t)$, where $t$ is year. 18 forcing scenarios, 53 realizations, > 10,000 model years.
21 Statistical emulation of mean. Separate time series model for each of 47 regions: $T(t) = \beta_0 + \beta_1 \tfrac12\{\log[\mathrm{CO}_2](t) + \log[\mathrm{CO}_2](t-1)\} + \beta_2 \sum_{i=2}^{\infty} w_{i-2} \log[\mathrm{CO}_2](t-i) + \varepsilon(t)$, where $\varepsilon(t)$ is an autoregressive model of order 1 and $w_i = (1-\rho)\rho^i$. Fit with a small number of scenarios and a few realizations per scenario. Compare to the standard computer model emulation approach, in which one views $(\mathrm{CO}_2(1), \ldots, \mathrm{CO}_2(n))$ as input and $(T(1), \ldots, T(n))$ as output.
22 Total column ozone. OMI (Ozone Monitoring Instrument, successor to TOMS) is aboard the satellite EOS Aura: polar-orbiting; sun-synchronous, so satellite always at local noon. Each orbit about 100 minutes, or 14.1 orbits a day. From raw data (photon counts in multiple frequency bands), levels of many trace constituents of the atmosphere are deduced, including ozone. Over 80,000 observations per orbit, so over $10^6$ a day. Near global coverage (no data during polar nights, some missing data). How might statistical models be used to produce a better Level-3 (gridded) product than what NASA currently does?
23 [Figure: observation locations from 2 orbits; axes are longitude and latitude, with the date line marked]
24 Scope for fruitful interaction between statistics and numerical analysis. Information flow in both directions. Statistical problems produce new challenges in applied/computational math. Statistical/probabilistic thinking can yield new algorithms and theory for numerical analysis.
25 STATMOS: Statistics in the Atmospheric and Oceanic Sciences, an NSF-supported network. For anyone interested in this area, I have money for travel to the rest of the network (NC State, U of Washington, NCAR, etc.). For any graduate student interested in this area, I can also pay your salary while you are visiting another member of the network. If someone has postdoc money, I may be able to split the cost of a postdoc for research related to network goals.
More informationAn Introduction to Gaussian Processes for Spatial Data (Predictions!)
An Introduction to Gaussian Processes for Spatial Data (Predictions!) Matthew Kupilik College of Engineering Seminar Series Nov 216 Matthew Kupilik UAA GP for Spatial Data Nov 216 1 / 35 Why? When evidence
More informationConsistent Downscaling of Seismic Inversions to Cornerpoint Flow Models SPE
Consistent Downscaling of Seismic Inversions to Cornerpoint Flow Models SPE 103268 Subhash Kalla LSU Christopher D. White LSU James S. Gunning CSIRO Michael E. Glinsky BHP-Billiton Contents Method overview
More informationExercise Sheet 1. 1 Probability revision 1: Student-t as an infinite mixture of Gaussians
Exercise Sheet 1 1 Probability revision 1: Student-t as an infinite mixture of Gaussians Show that an infinite mixture of Gaussian distributions, with Gamma distributions as mixing weights in the following
More informationQuiz 1 Solutions. Problem 2. Asymptotics & Recurrences [20 points] (3 parts)
Introduction to Algorithms October 13, 2010 Massachusetts Institute of Technology 6.006 Fall 2010 Professors Konstantinos Daskalakis and Patrick Jaillet Quiz 1 Solutions Quiz 1 Solutions Problem 1. We
More informationNaive Bayes and Gaussian Bayes Classifier
Naive Bayes and Gaussian Bayes Classifier Elias Tragas tragas@cs.toronto.edu October 3, 2016 Elias Tragas Naive Bayes and Gaussian Bayes Classifier October 3, 2016 1 / 23 Naive Bayes Bayes Rules: Naive
More informationComputational methods for mixed models
Computational methods for mixed models Douglas Bates Department of Statistics University of Wisconsin Madison March 27, 2018 Abstract The lme4 package provides R functions to fit and analyze several different
More informationMixture Models & EM. Nicholas Ruozzi University of Texas at Dallas. based on the slides of Vibhav Gogate
Mixture Models & EM icholas Ruozzi University of Texas at Dallas based on the slides of Vibhav Gogate Previously We looed at -means and hierarchical clustering as mechanisms for unsupervised learning -means
More information