ACTIVE SUBSPACES for dimension reduction in parameter studies
PAUL CONSTANTINE, Ben L. Fryrear Assistant Professor, Applied Mathematics & Statistics, Colorado School of Mines
In collaboration with: RACHEL WARD (UT Austin) and ARMIN EFTEKHARI (UT Austin)
This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Applied Mathematics program under Award Number DE-SC
SLIDES: goo.gl/ck6gol
DISCLAIMER: These slides are meant to complement the oral presentation. Use out of context at your own risk.
TAKE-HOMES
- An active subspace is a type of low-dimensional structure in a function of several variables.
- We have tools for identifying and exploiting active subspaces for parameter studies.
- Active subspaces appear in a wide range of physical models.
- Active subspaces are closely related to ridge approximation.
Examples of f(x) across applications:
- Shape optimization in aerospace vehicles (with J. Alonso, T. Lukaczyk)
- Sensitivity analysis in integrated hydrologic models (with R. Maxwell, J. Jefferson, J. Gilbert)
- Uncertainty quantification for hypersonic scramjets (with G. Iaccarino, J. Larsson, M. Emory)
- Sensitivity analysis in HIV modeling (with T. Loudon, S. Pankavich)
- Calibration of an atmospheric reentry vehicle model (with P. Congedo, T. Magin)
- Sensitivity analysis in solar cell models (with B. Zaharatos, M. Campanelli)
- Sensitivity analysis in Ebola transmission models (with P. Diaz, S. Pankavich)
- Lithium ion battery model (with A. Doostan)
- Magnetohydrodynamics generator model (with A. Glaws, T. Wildey, J. Shadid)
- Vehicle design (with C. Othmer, J. Alonso)
Ridge approximation: approximate $f(\mathbf{x}) \approx g(U^\top \mathbf{x})$, where $U^\top : \mathbb{R}^m \to \mathbb{R}^n$ and $g : \mathbb{R}^n \to \mathbb{R}$. Three questions: What is U? What is g? What is the approximation error?

For the approximation error, use the weighted root-mean-squared error,
$$\left\| f(\mathbf{x}) - g(U^\top \mathbf{x}) \right\|_{L^2(\rho)} = \left( \int \left( f(\mathbf{x}) - g(U^\top \mathbf{x}) \right)^2 \rho(\mathbf{x}) \, d\mathbf{x} \right)^{1/2},$$
where $\rho$ is a given weight function.
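As a concrete illustration (not from the slides): when $\rho$ is a density we can sample, the weighted $L^2$ error is just a root-mean-squared error over draws from $\rho$. The sketch below assumes a standard Gaussian $\rho$ and an exact ridge function, so the error for the true subspace is near zero while a wrong subspace gives a visibly larger error.

```python
# A minimal numerical illustration of the weighted L2 error above,
# assuming rho is a standard Gaussian on R^2 (an illustrative choice;
# the slides keep rho general).
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100_000, 2))            # x ~ rho

a = np.array([0.7, 0.3])
f = lambda X: np.exp(X @ a)                      # f(x) = exp(a^T x), a ridge
u = a / np.linalg.norm(a)                        # true ridge direction
g = lambda y: np.exp(np.linalg.norm(a) * y)      # profile with f = g(u^T x)

rmse = lambda u: np.sqrt(np.mean((f(X) - g(X @ u))**2))
print(rmse(u))                     # ~ 0: correct subspace
print(rmse(np.array([1.0, 0.0])))  # > 0: wrong subspace
```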
What is g? Use the conditional average,
$$\mu(\mathbf{y}) = \int f(U\mathbf{y} + V\mathbf{z}) \, \pi(\mathbf{z} \mid \mathbf{y}) \, d\mathbf{z},$$
where $\mathbf{y} = U^\top\mathbf{x}$ are the subspace coordinates, $V$ spans the complement subspace with coordinates $\mathbf{z}$, and $\pi(\mathbf{z} \mid \mathbf{y})$ is the conditional density. Then $\mu(U^\top\mathbf{x})$ is the best approximation for a fixed U (Pinkus, 2015).
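The conditional average is easy to realize numerically when $\rho$ makes the conditional density tractable. Below is a minimal sketch assuming a standard Gaussian $\rho$ on $\mathbb{R}^m$, for which $\mathbf{z}$ given $\mathbf{y}$ is again standard Gaussian on the complement subspace; the function names are illustrative, not from any library.

```python
# Monte Carlo estimate of mu(y) = E[ f(U y + V z) | y ], assuming rho
# is standard Gaussian so that pi(z | y) = N(0, I) on the complement.
import numpy as np

def conditional_average(f, U, y, n_z=2000, seed=0):
    """U is m x n with orthonormal columns; y has length n."""
    m, n = U.shape
    # Orthonormal basis V for the complement of range(U)
    V = np.linalg.svd(np.eye(m) - U @ U.T)[0][:, :m - n]
    rng = np.random.default_rng(seed)
    Z = rng.standard_normal((n_z, m - n))        # z ~ pi(z | y)
    points = y @ U.T + Z @ V.T                   # rows are U y + V z
    return np.mean([f(x) for x in points])

# Usage: f(x) = x1^2 + x2 with U = e1 gives mu(y) = y^2 + E[x2] = y^2.
f = lambda x: x[0]**2 + x[1]
U = np.array([[1.0], [0.0]])
print(conditional_average(f, U, y=np.array([2.0])))  # ~ 4.0
```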
What is U? Define the error function
$$R(U) = \frac{1}{2} \int \left( f(\mathbf{x}) - \mu(U^\top \mathbf{x}) \right)^2 \rho(\mathbf{x}) \, d\mathbf{x}$$
and minimize it,
$$\underset{U}{\text{minimize}}\; R(U) \quad \text{subject to} \quad U \in G(n, m),$$
where $G(n, m)$ is the Grassmann manifold of n-dimensional subspaces of $\mathbb{R}^m$.
Define the active subspace. Consider a function and its gradient vector,
$$f = f(\mathbf{x}), \quad \mathbf{x} \in \mathbb{R}^m, \quad \nabla f(\mathbf{x}) \in \mathbb{R}^m, \quad \rho : \mathbb{R}^m \to \mathbb{R}_+.$$
Take the average outer product of the gradient and its eigendecomposition,
$$C = \int \nabla f(\mathbf{x}) \, \nabla f(\mathbf{x})^\top \rho(\mathbf{x}) \, d\mathbf{x} = W \Lambda W^\top.$$
Partition the eigendecomposition,
$$\Lambda = \begin{bmatrix} \Lambda_1 & \\ & \Lambda_2 \end{bmatrix}, \quad W = \begin{bmatrix} W_1 & W_2 \end{bmatrix}, \quad W_1 \in \mathbb{R}^{m \times n}.$$
Rotate and separate the coordinates,
$$\mathbf{x} = W W^\top \mathbf{x} = W_1 W_1^\top \mathbf{x} + W_2 W_2^\top \mathbf{x} = W_1 \mathbf{y} + W_2 \mathbf{z},$$
where $\mathbf{y} = W_1^\top\mathbf{x}$ are the active variables and $\mathbf{z} = W_2^\top\mathbf{x}$ are the inactive variables.
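A worked special case, added here for intuition (it is not on the slide): if f is exactly a ridge function, the matrix C has rank one and the active subspace is the ridge direction,
$$f(\mathbf{x}) = h(\mathbf{a}^\top \mathbf{x}) \;\implies\; \nabla f(\mathbf{x}) = h'(\mathbf{a}^\top \mathbf{x})\,\mathbf{a} \;\implies\; C = \underbrace{\left( \int h'(\mathbf{a}^\top \mathbf{x})^2 \, \rho(\mathbf{x})\, d\mathbf{x} \right)}_{=\,c} \mathbf{a}\,\mathbf{a}^\top,$$
so $\lambda_1 = c\,\|\mathbf{a}\|^2$ with $\mathbf{w}_1 = \mathbf{a}/\|\mathbf{a}\|$, and $\lambda_2 = \cdots = \lambda_m = 0$: a one-dimensional active subspace with no inactive variation.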
The eigenpairs identify perturbations that change the function more, on average.
LEMMA: $\lambda_i = \int \left( \mathbf{w}_i^\top \nabla f(\mathbf{x}) \right)^2 \rho(\mathbf{x}) \, d\mathbf{x}, \quad i = 1, \dots, m.$
LEMMA: $\int \| \nabla_{\mathbf{y}} f(\mathbf{x}) \|_2^2 \, \rho(\mathbf{x}) \, d\mathbf{x} = \lambda_1 + \cdots + \lambda_n$ and $\int \| \nabla_{\mathbf{z}} f(\mathbf{x}) \|_2^2 \, \rho(\mathbf{x}) \, d\mathbf{x} = \lambda_{n+1} + \cdots + \lambda_m.$
An approximation result:
$$\left\| f(\mathbf{x}) - \mu(W_1^\top \mathbf{x}) \right\|_{L^2(\rho)} \leq C \left( \lambda_{n+1} + \cdots + \lambda_m \right)^{1/2},$$
where $\mu$ is the conditional average, $W_1$ spans the active subspace, $C$ is a Poincaré constant, and $\lambda_{n+1}, \dots, \lambda_m$ are the eigenvalues associated with the inactive subspace.
The active subspace is nearly stationary. Recall
$$R(U) = \frac{1}{2} \int \left( f(\mathbf{x}) - \mu(U^\top \mathbf{x}) \right)^2 \rho(\mathbf{x}) \, d\mathbf{x}.$$
Assume (1) f is Lipschitz continuous with constant $L$ and (2) $\rho$ is a Gaussian density. Then the gradient of R on the Grassmann manifold, evaluated at the active subspace $W_1$, satisfies
$$\| \nabla R(W_1) \|_F \;\leq\; L \, \sqrt{2m} \, (m - n)^2 \left( \lambda_{n+1} + \cdots + \lambda_m \right)^{1/2},$$
where $\| \cdot \|_F$ is the Frobenius norm and $\lambda_{n+1}, \dots, \lambda_m$ are the eigenvalues associated with the inactive subspace. Small inactive eigenvalues mean the active subspace is close to a stationary point of the ridge approximation error.
IDEA: Use the active subspace as the starting point for numerical ridge approximation. Given an initial subspace $U_0$ and samples $\{\mathbf{x}_i, f(\mathbf{x}_i)\}$:
(1) Compute $\mathbf{y}_i = U_0^\top \mathbf{x}_i$.
(2) Fit a polynomial $p_N(\mathbf{y}, \theta)$ with the pairs $\{\mathbf{y}_i, f(\mathbf{x}_i)\}$: $\;\theta^* = \arg\min_\theta \sum_i \left( f(\mathbf{x}_i) - p(\mathbf{y}_i, \theta) \right)^2$.
(3) Minimize the residual over subspaces: $\;U^* = \arg\min_{U \in G(n,m)} \sum_i \left( f(\mathbf{x}_i) - p(U^\top \mathbf{x}_i, \theta^*) \right)^2$.
(4) Set $U_0 = U^*$ and repeat.
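Here is a minimal sketch of this alternating scheme, assuming n = 1 and a univariate polynomial profile; the crude random search over directions stands in for proper Grassmann optimization, and all names are illustrative rather than from any particular library.

```python
# A minimal sketch of the alternating ridge-approximation heuristic
# above, for n = 1 (one active direction) and a polynomial profile.
import numpy as np

def fit_profile(y, f, degree=2):
    """Step (2): least-squares polynomial fit of f against y = u^T x."""
    return np.polyfit(y, f, degree)

def residual(u, X, f, theta):
    """Sum-of-squares ridge residual for a unit vector u."""
    return np.sum((f - np.polyval(theta, X @ u))**2)

def alternating_ridge(X, f, u0, degree=2, iters=10, n_dirs=500, seed=0):
    """Steps (1)-(4): alternate polynomial fits with a crude random
    search over nearby unit vectors (a stand-in for Grassmann descent)."""
    rng = np.random.default_rng(seed)
    u = u0 / np.linalg.norm(u0)
    for _ in range(iters):
        theta = fit_profile(X @ u, f, degree)          # step (2)
        # step (3): sample candidate directions near u, keep the best
        cands = u + 0.3 * rng.standard_normal((n_dirs, X.shape[1]))
        cands /= np.linalg.norm(cands, axis=1, keepdims=True)
        best = min(cands, key=lambda v: residual(v, X, f, theta))
        if residual(best, X, f, theta) < residual(u, X, f, theta):
            u = best                                   # step (4)
    return u, fit_profile(X @ u, f, degree)

# Usage: recover the ridge direction of f(x) = (3*x1 - x2)^2.
X = np.random.default_rng(1).uniform(-1, 1, (200, 2))
f = (3*X[:, 0] - X[:, 1])**2
u, theta = alternating_ridge(X, f, u0=np.array([1.0, 0.0]))
print(u)  # should align (up to sign) with [3, -1]/sqrt(10)
```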
An example where it doesn't work: $f(x_1, x_2) = 5x_1 + \sin(10\pi x_2)$. The gradient is $(5,\; 10\pi\cos(10\pi x_2))^\top$, so
$$C = \begin{bmatrix} 25 & 0 \\ 0 & 50\pi^2 \end{bmatrix},$$
and the high-frequency, low-amplitude sine term dominates the gradient: the active subspace is U = [0; 1] and the inactive subspace is U = [1; 0], even though the best one-dimensional ridge approximation uses the $x_1$ direction.
[Figure: ridge approximation error as a function of the subspace angle from 0 to π, with the active subspace U = [0; 1] and the inactive subspace U = [1; 0] marked.]
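A quick numerical check of this example, assuming x uniform on [-1, 1]² (a choice the slide does not state):

```python
# Estimate C = E[ grad f grad f^T ] by Monte Carlo for
# f = 5*x1 + sin(10*pi*x2) and inspect its eigenpairs.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200_000, 2))

G = np.column_stack([np.full(len(X), 5.0),                # df/dx1
                     10*np.pi*np.cos(10*np.pi*X[:, 1])])  # df/dx2
C = (G.T @ G) / len(X)

evals, evecs = np.linalg.eigh(C)   # ascending eigenvalues
print(np.round(C, 1))              # ~ diag(25, 50*pi^2) ~ diag(25, 493.5)
print(evals)                       # 25 << 493.5
print(evecs[:, -1])                # top eigenvector ~ [0, 1]
```

The eigenvalues say $x_2$ is the "active" direction, but the best one-variable ridge approximation of f uses $x_1$: gradients measure variability, not approximability.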
An example where it works: the DRAG COEFFICIENT as a function of 18 shape parameters, with $\rho$ uniform on a hypercube, computed with the SU2 CFD solver using an adjoint solver for gradients. Recall:
$$C = \int \nabla f(\mathbf{x}) \, \nabla f(\mathbf{x})^\top \rho(\mathbf{x}) \, d\mathbf{x} = W \Lambda W^\top.$$
[Figure: residual as a function of the alternating iteration for different starting subspaces (RANDOM, IDENTITY, ACTIVE SUBSPACE), shown for increasing subspace dimension and increasing total polynomial degree.]
TAKE-HOMES (recap)
- An active subspace is a type of low-dimensional structure in a function of several variables.
- We have tools for identifying and exploiting active subspaces for parameter studies.
- Active subspaces appear in a wide range of physical models.
- Active subspaces are closely related to ridge approximation.
QUESTIONS?
- How do active subspaces relate to [insert method]?
- How do I compute active subspaces?
- What if I don't have gradients?
- What kinds of models does this work on?
Active Subspaces, SIAM (2015). PAUL CONSTANTINE, Ben L. Fryrear Assistant Professor, Colorado School of Mines
BACKUP SLIDES
Related formulations:
- Sufficient dimension reduction in regression: assume $y = f(\mathbf{x}) + \varepsilon$ and $P(y \mid \mathbf{x}) = P(y \mid A^\top \mathbf{x})$; given pairs $(y_i, \mathbf{x}_i)$, find $A$.
- Projection pursuit regression, neural nets, ridge functions: $\underset{A, \theta}{\text{minimize}} \int \left( f(\mathbf{x}) - g(A^\top \mathbf{x}, \theta) \right)^2 d\mathbf{x}$.
- Fisher information theory: $\int \nabla^2 \log L(\mathbf{x}, \theta) \, \rho(\mathbf{x}) \, d\mathbf{x}$.
- Principal components / regression, Karhunen-Loève: $\int \mathbf{x} \mathbf{x}^\top \rho(\mathbf{x}) \, d\mathbf{x}$.
References and related work
Sufficient dimension reduction:
- Cook. Regression Graphics (1998/2009)
- K.C. Li. Sliced inverse regression for dimension reduction (1991)
- K.C. Li. On principal Hessian directions for data visualization and dimension reduction (1992)
In approximation theory:
- Fornasier, Schnass, Vybiral. Learning functions of few arbitrary linear parameters in high dimensions. FoCM (2012)
- Mayer, Ullrich, Vybiral. Entropy and sampling numbers of classes of ridge functions. Constructive Approximation (2014)
In UQ:
- Tipireddy, Ghanem. Basis adaptation in homogeneous chaos spaces. JCP (2014)
- Stoyanov, Webster. A gradient-based sampling approach for dimension reduction of partial differential equations with stochastic coefficients. IJ4UQ (2015)
- Lei, Yang, Lei, Zheng, Lin, Baker. Constructing surrogate models of complex systems with enhanced sparsity: quantifying the influence of conformational uncertainty in biomolecular solvation. SIAM MMS (2015)
In engineering applications:
- Abdel-Khalik, Bang, Wang. Overview of hybrid subspace methods for uncertainty quantification, sensitivity analysis. Annals of Nuclear Energy (2013)
- Berguin, Rancourt, Mavris. A method for high-dimensional design space exploration of expensive functions with access to gradient information. AIAA
- Russi. Uncertainty Quantification with Experimental Data and Complex System Models. Ph.D. thesis (2010)
In statistics/machine learning: projection pursuit regression, neural nets
Discover the active subspace with random sampling. Draw samples $\mathbf{x}_j$, compute $f_j = f(\mathbf{x}_j)$ and $\nabla f_j = \nabla f(\mathbf{x}_j)$, and approximate with Monte Carlo,
$$C \approx \frac{1}{N} \sum_{j=1}^{N} \nabla f_j \nabla f_j^\top = \hat{W} \hat{\Lambda} \hat{W}^\top.$$
This is equivalent to the SVD of the scaled matrix of gradient samples,
$$\frac{1}{\sqrt{N}} \begin{bmatrix} \nabla f_1 & \cdots & \nabla f_N \end{bmatrix} = \hat{W} \sqrt{\hat{\Lambda}} \, \hat{V}^\top.$$
This was called an active subspace method in T. Russi's 2010 Ph.D. thesis, Uncertainty Quantification with Experimental Data and Complex System Models.
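A minimal sketch of this recipe, assuming a user-supplied gradient routine and (for illustration only) a standard Gaussian sampling density; the gap-based choice of n is also an illustrative assumption, not a fixed rule.

```python
# Estimate the active subspace from sampled gradients via the SVD
# identity above.
import numpy as np

def active_subspace(grad_f, m, N=1000, n=None, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((N, m))              # x_j ~ rho
    G = np.array([grad_f(x) for x in X])         # row j is grad f(x_j)^T
    # SVD of (1/sqrt(N)) [grad f_1 ... grad f_N] = W sqrt(Lambda) V^T
    W, s, _ = np.linalg.svd(G.T / np.sqrt(N), full_matrices=False)
    evals = s**2                                 # estimated eigenvalues
    if n is None:                                # pick n at the largest gap
        n = int(np.argmax(evals[:-1] / (evals[1:] + 1e-300))) + 1
    return W[:, :n], evals

# Usage: a 5-variable function with a one-dimensional active subspace.
a = np.array([1.0, 2.0, 0.0, 0.0, -1.0])
grad_f = lambda x: 3.0 * (a @ x)**2 * a          # f(x) = (a^T x)^3
W1, evals = active_subspace(grad_f, m=5)
print(evals)                                     # one dominant eigenvalue
print(W1.ravel())                                # ~ a/||a|| up to sign
```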
How many gradient samples? With $L^2$ a bound on the squared gradient norm, $m$ the dimension, and $\varepsilon$ the relative accuracy,
$$N = \frac{L^2}{\lambda_k \, \varepsilon^2} \log(m) \;\implies\; |\hat{\lambda}_k - \lambda_k| \leq \varepsilon \, \lambda_k$$
with high probability, using Gittens and Tropp (2011). Similarly,
$$N = \frac{L^2}{\lambda_1 \, \varepsilon^2} \log(m) \;\implies\; \mathrm{dist}(W_1, \hat{W}_1) \leq \frac{4 \, \lambda_1 \, \varepsilon}{\lambda_n - \lambda_{n+1}}$$
with high probability, where $\lambda_n - \lambda_{n+1}$ is the spectral gap; see Gittens and Tropp (2011), Golub and Van Loan (1996), Stewart (1973).
Let's be abundantly clear about the problem we are trying to solve. Compare three related problems:
- Low-rank approximation of the collection of gradients: $\frac{1}{\sqrt{N}} \begin{bmatrix} \nabla f_1 & \cdots & \nabla f_N \end{bmatrix} \approx \hat{W}_1 \sqrt{\hat{\Lambda}_1} \, \hat{V}_1^\top$.
- Low-dimensional linear approximation of the gradient: $\nabla f(\mathbf{x}) \approx \hat{W}_1 \, \mathbf{a}(\mathbf{x})$.
- Approximating a function of many variables by a function of a few linear combinations of the variables: $f(\mathbf{x}) \approx g(\hat{W}_1^\top \mathbf{x})$.