Adaptive low-rank approximation in hierarchical tensor format using least-squares method
Workshop on Challenges in HD Analysis and Computation, San Servolo, 4/5/2016

Adaptive low-rank approximation in hierarchical tensor format using least-squares method

Anthony Nouy, Ecole Centrale Nantes, GeM
Joint work with Mathilde Chevreuil, Loic Giraldi, Prashant Rai
Supported by the ANR (CHORUS project)
High-dimensional problems in uncertainty quantification (UQ)

Parameter-dependent models:
$\mathcal{M}(u(X); X) = 0$,
where $X$ are random variables.

Solving forward and inverse UQ problems requires the evaluation of the model for many instances of $X$. In practice, we rely on approximations of the solution map $x \mapsto u(x)$, which are used as surrogate models.

Complexity issues:
- High-dimensional function $u(x_1, \dots, x_d)$.
- For complex numerical models, only a few evaluations of the function are available.
Structured approximation for high-dimensional problems

Specific structures of high-dimensional functions have to be exploited (application dependent):
- Low effective dimensionality: $u(x_1, \dots, x_d) \approx u_1(x_1)$
- Low-order interactions: $u(x_1, \dots, x_d) \approx u_1(x_1) + \dots + u_d(x_d)$
- Sparsity (relative to a basis or frame): $u(x) = \sum_{\alpha \in \mathbb{N}^d} u_\alpha \psi_\alpha(x) \approx \sum_{\alpha \in \Lambda} u_\alpha \psi_\alpha(x)$
- Low-rank structures
Outline

1. Rank-structured approximation
2. Statistical learning methods for tensor approximation
3. Adaptive approximation in tree-based low-rank formats
Rank-structured approximation

A multivariate function $u(x_1, \dots, x_d)$ defined on $\mathcal{X} = \mathcal{X}_1 \times \dots \times \mathcal{X}_d$ is identified with an element of the tensor space
$V_1 \otimes \dots \otimes V_d = \mathrm{span}\{v^1(x_1) \cdots v^d(x_d) \,;\, v^\nu \in V_\nu\}$,
where $V_\nu$ is a space of functions defined on $\mathcal{X}_\nu$.

Approximation in a subset of tensors with bounded rank:
$\mathcal{M}_r = \{v \in V_1^{n_1} \otimes \dots \otimes V_d^{n_d} \,;\, \mathrm{rank}(v) \le r\}$

For order-two tensors, there is a single notion of rank:
$\mathrm{rank}(v) \le r \iff v = \sum_{i=1}^r v_i^1(x_1)\, v_i^2(x_2)$

For higher-order tensors, there are different notions of rank, such as the canonical rank:
$\mathrm{rank}(v) \le r \iff v = \sum_{i=1}^r v_i^1(x_1) \cdots v_i^d(x_d)$,
with storage complexity $O(rdn)$.
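For order-two tensors, the best rank-r approximation in the L² (Frobenius) sense is obtained by truncating the singular value decomposition (Eckart-Young theorem). A minimal numpy sketch (the function name is illustrative, not from the talk):

```python
import numpy as np

def truncated_svd(A, r):
    """Best rank-r approximation of a matrix A (Eckart-Young theorem)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # keep the r leading singular triplets
    return U[:, :r] * s[:r] @ Vt[:r, :]

# a rank-2 matrix is recovered exactly with r = 2
a, b = np.arange(5.0), np.ones(5)
A = np.outer(a, a) + np.outer(b, b)
err = np.linalg.norm(A - truncated_svd(A, 2))
```

For higher-order tensors no such direct characterization of best low-rank approximations exists, which is one motivation for the subspace-based formats.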
Subspace-based low-rank formats

$\alpha$-rank: for $\alpha \subset \{1, \dots, d\}$, write $V = V_\alpha \otimes V_{\alpha^c}$ with $V_\alpha = \bigotimes_{\mu \in \alpha} V_\mu$, and
$\mathrm{rank}_\alpha(v) \le r_\alpha \iff v = \sum_{i=1}^{r_\alpha} v_i^\alpha(x_\alpha)\, v_i^{\alpha^c}(x_{\alpha^c}), \quad v_i^\alpha \in V_\alpha,\ v_i^{\alpha^c} \in V_{\alpha^c}$
$\iff v \in U_\alpha \otimes V_{\alpha^c}$ with $\dim(U_\alpha) = r_\alpha$.

Example: $d = 4$, $\alpha = \{1, 2\}$, $\alpha^c = \{3, 4\}$.

Storage complexity: $O(r_\alpha (n^{\#\alpha} + n^{\#\alpha^c}))$.
Subspace-based low-rank formats

Tree-based Tucker format [Hackbusch-Kuhn 09, Oseledets-Tyrtyshnikov 09]:
$\mathrm{rank}_T(v) = (\mathrm{rank}_\alpha(v) : \alpha \in T)$, with $T$ a dimension tree; e.g. for $d = 4$ the binary tree with root $\{1,2,3,4\}$, interior nodes $\{1,2\}$ and $\{3,4\}$, and leaves $\{1\}, \{2\}, \{3\}, \{4\}$.

Storage complexity: $O(dnR + dR^{s+1})$ with $R = \max_\alpha r_\alpha$ and $s$ the maximal number of children of a node.

Example (additive model): $u(x_1, \dots, x_d) = u_1(x_1) + \dots + u_d(x_d)$ has $\mathrm{rank}_T(u) = (2, 2, 2, \dots, 2)$ for any tree $T$.
Subspace-based low-rank formats

Tensor-train format: a degenerate case where $T$ is a subset of a linearly structured dimension tree,
$T = \{\{1\}, \{1,2\}, \{1,2,3\}, \dots, \{1, \dots, d-1\}\}$.

$\mathrm{rank}_T(v) \le r \iff v(x) = \sum_{i_1=1}^{r_{\{1\}}} \sum_{i_2=1}^{r_{\{1,2\}}} \cdots \sum_{i_{d-1}=1}^{r_{\{1,\dots,d-1\}}} v^1_{1,i_1}(x_1)\, v^2_{i_1,i_2}(x_2) \cdots v^d_{i_{d-1},1}(x_d)$

Storage complexity: $O(dnR^2)$ with $R = \max_\alpha r_\alpha$.
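Once a tensor is stored in TT format, evaluating one entry reduces to a chain of small matrix products, which is where the linear-in-d complexity comes from. A sketch for the discrete case (the helper name and the core layout are illustrative assumptions):

```python
import numpy as np

def tt_eval(cores, idx):
    """Evaluate a tensor given in TT format at a multi-index.
    cores[k] has shape (r_{k-1}, n_k, r_k), with r_0 = r_d = 1."""
    v = np.ones((1, 1))
    for G, i in zip(cores, idx):
        v = v @ G[:, i, :]   # contract one core at a time: O(d R^2) per entry
    return v[0, 0]

# rank-one example: the TT representation of an elementary (separable) tensor
u1, u2, u3 = np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])
cores = [u1.reshape(1, 2, 1), u2.reshape(1, 2, 1), u3.reshape(1, 2, 1)]
val = tt_eval(cores, (0, 1, 1))   # u1[0] * u2[1] * u3[1]
```

The same contraction pattern, with the index selection replaced by evaluation of univariate basis functions, evaluates a TT-format multivariate function at a point.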
Subspace-based low-rank formats

Storage and computational complexity scale as $O(d)$.

Good topological and geometrical properties of the manifolds $\mathcal{M}_r$ of tensors with bounded rank [Holtz-Rohwedder-Schneider 11, Uschmajew-Vandereycken 13, Falco-Hackbusch-N. 15].

Multilinear parametrization:
$\mathcal{M}_r = \{v = F(p_1, \dots, p_L) \,;\, p_k \in P_k,\ 1 \le k \le L\}$,
where $F$ is a multilinear map.
Statistical learning methods for tensor approximation

Approximation of a function $u(X) = u(X_1, \dots, X_d)$ from evaluations $\{y^k = u(x^k)\}_{k=1}^K$ on a training set $\{x^k\}_{k=1}^K$ (i.i.d. samples of $X$).

Approximation in subsets of rank-structured functions by minimization of an empirical risk:
$\mathcal{M}_r = \{v : \mathrm{rank}(v) \le r\}, \qquad \mathcal{R}_K(v) = \frac{1}{K} \sum_{k=1}^K \ell(u(x^k), v(x^k))$,
where $\ell$ is a loss function.

Here, we consider least-squares regression,
$\mathcal{R}_K(v) = \frac{1}{K} \sum_{k=1}^K (u(x^k) - v(x^k))^2 = \hat{\mathbb{E}}_K\big((u(X) - v(X))^2\big)$,
but other loss functions could be used for objectives other than $L^2$-approximation (e.g. classification).
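The empirical risk with square loss is simply a Monte Carlo estimate of the $L^2(P_X)$ error of the surrogate. A minimal sketch with a hypothetical target and a crude surrogate (both illustrative, not from the talk):

```python
import numpy as np

def empirical_risk(u, v, X):
    """R_K(v) = (1/K) sum_k (u(x^k) - v(x^k))^2,
    the Monte Carlo estimate of E[(u(X) - v(X))^2]."""
    return np.mean((u(X) - v(X)) ** 2)

rng = np.random.default_rng(0)
X = rng.uniform(size=(1000, 2))          # i.i.d. samples of X ~ U([0,1]^2)
u = lambda X: X[:, 0] * X[:, 1]          # model to approximate
v = lambda X: 0.25 + 0.0 * X[:, 0]       # constant surrogate (the mean of u)
risk = empirical_risk(u, v, X)           # approx Var(X1*X2) = 7/144
```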
Alternating minimization algorithm

Multilinear parametrization of the tensor manifold:
$\mathcal{M}_r = \{v = F(p_1, \dots, p_L) : p_l \in \mathbb{R}^{m_l},\ 1 \le l \le L\}$,
so that
$\min_{v \in \mathcal{M}_r} \mathcal{R}_K(v) = \min_{p_1, \dots, p_L} \mathcal{R}_K(F(p_1, \dots, p_L))$.

Alternating minimization algorithm: solve successive minimization problems
$\min_{p_l \in \mathbb{R}^{m_l}} \mathcal{R}_K(F(p_1, \dots, p_l, \dots, p_L))$,
where $v(x) = \Psi_l(x)^T p_l$ is linear in $p_l$, so that each step is a standard linear approximation problem
$\min_{p_l \in \mathbb{R}^{m_l}} \frac{1}{K} \sum_{k=1}^K \ell(u(x^k), \Psi_l(x^k)^T p_l)$.
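A minimal alternating least-squares sketch for the canonical format, assuming a monomial feature basis (np.vander) and a fixed number of sweeps; this illustrates the scheme, it is not the talk's implementation:

```python
import numpy as np

def als_fit(X, y, rank, degree=2, n_sweeps=50, seed=0):
    """ALS for a canonical rank-`rank` model
    v(x) = sum_i prod_nu phi(x_nu)^T P[nu][:, i],
    with a monomial basis phi (an illustrative choice)."""
    K, d = X.shape
    Phi = [np.vander(X[:, nu], degree + 1) for nu in range(d)]
    rng = np.random.default_rng(seed)
    P = [rng.standard_normal((degree + 1, rank)) for _ in range(d)]
    for _ in range(n_sweeps):
        for nu in range(d):
            # product of all factors except nu, evaluated on the samples
            W = np.ones((K, rank))
            for mu in range(d):
                if mu != nu:
                    W *= Phi[mu] @ P[mu]
            # linear least-squares problem in the parameters of factor nu
            A = (Phi[nu][:, :, None] * W[:, None, :]).reshape(K, -1)
            sol, *_ = np.linalg.lstsq(A, y, rcond=None)
            P[nu] = sol.reshape(degree + 1, rank)

    def predict(Xq):
        W = np.ones((len(Xq), rank))
        for mu in range(d):
            W *= np.vander(Xq[:, mu], degree + 1) @ P[mu]
        return W.sum(axis=1)
    return predict

# fit a rank-one function from random samples
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(300, 3))
y = X[:, 0] * X[:, 1] * X[:, 2]
v = als_fit(X, y, rank=1)
err = np.sqrt(np.mean((v(X) - y) ** 2))
```

Each inner step is an ordinary least-squares problem in the parameters of one factor, with the remaining factors frozen inside the design matrix.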
Regularization

$\min_{p_1, \dots, p_L} \mathcal{R}_K(F(p_1, \dots, p_L)) + \sum_{l=1}^L \lambda_l \Omega_l(p_l)$,
with regularization functionals $\Omega_l$ promoting sparsity (e.g. $\Omega_l(p_l) = \|p_l\|_1$), smoothness, ...

The alternating minimization algorithm then requires the solution of successive standard regularized linear approximation problems
$\min_{p_l} \frac{1}{K} \sum_{k=1}^K \ell(u(x^k), \Psi_l(x^k)^T p_l) + \lambda_l \Omega_l(p_l). \qquad (*)$

For the square loss and $\Omega_l(p_l) = \|p_l\|_1$, $(*)$ is a LASSO problem. Cross-validation methods are used for the selection of the regularization parameters $\lambda_l$.
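With the square loss and an l1 penalty, each inner subproblem is a LASSO. A self-contained proximal-gradient (ISTA) sketch, with the penalty weight fixed for brevity where the talk would select it by cross-validation (data and names are illustrative):

```python
import numpy as np

def lasso_ista(Psi, y, lam, n_iter=1000):
    """Solve min_p (1/2K) ||y - Psi p||^2 + lam ||p||_1 by proximal
    gradient (ISTA); a minimal stand-in for the inner LASSO subproblem."""
    K, m = Psi.shape
    step = K / np.linalg.norm(Psi, 2) ** 2      # 1/L with L = ||Psi||_2^2 / K
    p = np.zeros(m)
    for _ in range(n_iter):
        grad = Psi.T @ (Psi @ p - y) / K
        z = p - step * grad
        p = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft-threshold
    return p

rng = np.random.default_rng(0)
K, m = 200, 30
Psi = rng.standard_normal((K, m))               # features Psi_l(x^k) as rows
p_true = np.zeros(m)
p_true[[2, 7, 11]] = [1.5, -2.0, 0.7]           # sparse ground truth
y = Psi @ p_true
p_hat = lasso_ista(Psi, y, lam=0.01)
```

The soft-thresholding step is what drives most coefficients exactly to zero, i.e. what selects the active basis functions.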
Illustrations

Approximation in tensor-train (TT) format:
$v(x_1, \dots, x_d) = \sum_{i_1=1}^{r_1} \cdots \sum_{i_{d-1}=1}^{r_{d-1}} v^1_{1,i_1}(x_1) \cdots v^d_{i_{d-1},1}(x_d), \qquad v^k_{i_{k-1}, i_k} \in \mathbb{P}_q$

Polynomial approximations: $v = F(p_1, \dots, p_d)$ with parameter $p_k \in \mathbb{R}^{(q+1) r_k r_{k-1}}$ gathering the coefficients of the functions of $x_k$ on a polynomial basis (orthonormal in $L^2_{P_{X_k}}(\mathcal{X}_k)$).

Sparsity-inducing regularization and cross-validation (leave-one-out) are used for the automatic selection of polynomial basis functions, followed by standard least-squares in the selected basis.
Illustration: Borehole function

The Borehole function models water flow through a borehole:
$u(X) = \dfrac{2\pi T_u (H_u - H_l)}{\ln(r/r_w)\left(1 + \dfrac{2 L T_u}{\ln(r/r_w)\, r_w^2 K_w} + \dfrac{T_u}{T_l}\right)}, \qquad X = (r_w, \log(r), T_u, H_u, T_l, H_l, L, K_w)$

- $r_w$: radius of borehole (m), $N(\mu = 0.10,\ \sigma = \cdot)$
- $r$: radius of influence (m), $LN(\mu = 7.71,\ \sigma = \cdot)$
- $T_u$: transmissivity of upper aquifer (m²/yr), $U(63070,\ \cdot)$
- $H_u$: potentiometric head of upper aquifer (m), $U(990, 1110)$
- $T_l$: transmissivity of lower aquifer (m²/yr), $U(63.1, 116)$
- $H_l$: potentiometric head of lower aquifer (m), $U(700, 820)$
- $L$: length of borehole (m), $U(1120, 1680)$
- $K_w$: hydraulic conductivity of borehole (m/yr), $U(9855, 12045)$

Polynomial approximation with degree $q = 8$. (Some distribution parameters and the test-set size were lost in the transcription.)
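A direct implementation of the Borehole model as a Python function; the formula is the standard benchmark form, and the mid-range input values in the usage line are illustrative:

```python
import numpy as np

def borehole(rw, r, Tu, Hu, Tl, Hl, L, Kw):
    """Water flow through the borehole (m^3/yr)."""
    ln_r = np.log(r / rw)
    num = 2.0 * np.pi * Tu * (Hu - Hl)
    den = ln_r * (1.0 + 2.0 * L * Tu / (ln_r * rw**2 * Kw) + Tu / Tl)
    return num / den

# evaluation at (roughly) mid-range values of the input distributions
val = borehole(rw=0.10, r=2200.0, Tu=89535.0, Hu=1050.0,
               Tl=89.55, Hl=760.0, L=1400.0, Kw=10950.0)
```

Such a cheap closed-form model is what makes the Borehole function a convenient benchmark: large training and test sets can be generated at negligible cost.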
Illustration: Borehole function

Test error for different ranks and for different sizes $K$ of the training set ($K = 100$, $1000$, $10000$); the numerical values of the table were lost in the transcription.

Finding the optimal rank is a combinatorial problem...
Heuristic strategy for rank adaptation (tree-based Tucker format)

Given a dimension tree $T \subset 2^{\{1,\dots,d\}}$, construct a sequence of approximations $u_m$ in tree-based Tucker format with increasing rank:
$u_m \in \{v : \mathrm{rank}_T(v) \le (r^m_\alpha)_{\alpha \in T}\}$.

At iteration $m$,
$r^{m+1}_\alpha = r^m_\alpha + 1$ if $\alpha \in T_m$, and $r^{m+1}_\alpha = r^m_\alpha$ if $\alpha \notin T_m$,
where $T_m \subset T$ is selected in order to obtain (hopefully) the fastest decrease of the error.

A possible strategy consists in computing the singular values $\sigma^\alpha_1 \ge \dots \ge \sigma^\alpha_{r^m_\alpha}$ of the $\alpha$-matricizations $\mathcal{M}_\alpha(u_m)$ of $u_m$ for all $\alpha \in T$:
$\mathcal{M}_\alpha(u_m) = \sum_{i=1}^{r^m_\alpha} \sigma^\alpha_i\, v^\alpha_i \otimes v^{\alpha^c}_i \in V_\alpha \otimes V_{\alpha^c}$,
with $\|u_m\|^2 = \sum_{i=1}^{r^m_\alpha} (\sigma^\alpha_i)^2$ for all $\alpha \in T$.

The smallest singular value $\sigma^\alpha_{r^m_\alpha}$ provides an estimation of an upper bound of the error committed by truncating the $\alpha$-rank of $u_m$ in $V_\alpha \otimes V_{\alpha^c}$.

Letting $0 \le \theta \le 1$, we choose
$T_m = \{\alpha \in T : \sigma^\alpha_{r^m_\alpha} \ge \theta \max_{\beta \in T} \sigma^\beta_{r^m_\beta}\}$.
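A sketch of this selection rule on a full tensor: compute the singular values of each alpha-matricization, then pick the nodes whose smallest singular value is within a factor theta of the largest one. The helper names and the dict-based interface are illustrative:

```python
import numpy as np

def alpha_singular_values(A, alpha):
    """Singular values of the alpha-matricization M_alpha(A) of a full tensor A."""
    comp = [mu for mu in range(A.ndim) if mu not in alpha]
    rows = int(np.prod([A.shape[mu] for mu in alpha]))
    M = np.transpose(A, list(alpha) + comp).reshape(rows, -1)
    return np.linalg.svd(M, compute_uv=False)

def select_nodes(smallest_sv, theta=0.5):
    """T_m = {alpha : sigma^alpha_{r_alpha} >= theta * max_beta sigma^beta_{r_beta}}."""
    s_max = max(smallest_sv.values())
    return {a for a, s in smallest_sv.items() if s >= theta * s_max}

# an elementary (rank-one) tensor has a single nonzero singular value
A = np.einsum('i,j,k->ijk', np.array([1.0, 2.0]),
              np.array([1.0, 1.0]), np.array([3.0, 0.0]))
s = alpha_singular_values(A, (0,))
chosen = select_nodes({(0,): 0.8, (1,): 0.1, (0, 1): 0.5}, theta=0.5)
```

In the learning setting the matricizations of $u_m$ are available from its low-rank representation, so the SVDs act on small matrices rather than on a full tensor.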
Illustration: Borehole function

Training set of size $K = 1000$. Rank and test error at each iteration of the rank-adaptive algorithm (the numerical values of the table were lost in the transcription).

The selected (anisotropic) rank gives a test error one order of magnitude better than the optimal isotropic rank $(r, r, \dots, r)$.
Illustration: Borehole function

Selection of optimal ranks for different training-set sizes $K$ ($K = 100$, $1000$, $10000$), in TT format and in canonical format (the numerical values of the tables were lost in the transcription).
Influence of the tree

Test error for different linear trees $T$, each associated with an ordering $\{\nu_1, \dots, \nu_d\}$ of the dimensions (training set of size $K = 50$). The optimal ranks and test errors for the trees $T_1, \dots, T_4$ were lost in the transcription.

Finding the optimal tree is a combinatorial problem...
Strategy for tree adaptation

Starting from an initial tree, we iterate the following two steps:

1. Run the algorithm with rank adaptation to compute an approximation $v$ associated with the current tree:
$v(x_1, \dots, x_d) = \sum_{i_1=1}^{r_1} \cdots \sum_{i_{d-1}=1}^{r_{d-1}} v^1_{1,i_1}(x_{\nu_1}) \cdots v^d_{i_{d-1},1}(x_{\nu_d})$

2. Run a tree optimization algorithm yielding an equivalent representation of $v$ (at the current precision),
$v(x_1, \dots, x_d) \approx \sum_{i_1=1}^{\tilde r_1} \cdots \sum_{i_{d-1}=1}^{\tilde r_{d-1}} \tilde v^1_{1,i_1}(x_{\tilde\nu_1}) \cdots \tilde v^d_{i_{d-1},1}(x_{\tilde\nu_d})$,
with reduced ranks $\{\tilde r_1, \dots, \tilde r_{d-1}\}$, where $\{\tilde\nu_1, \dots, \tilde\nu_d\}$ is a permutation of $\{\nu_1, \dots, \nu_d\}$.
Strategy for tree adaptation

Illustration with a training set of size $K = 50$. We run the algorithm for different initial trees; the original slide indicates in blue the permuted dimensions in the final tree. The optimal ranks and test errors for the initial and final trees were lost in the transcription.
Concluding remarks and on-going works

- For rank adaptation, possible use of constructive (greedy) algorithms for tree-based Tucker formats.
- Need for robust strategies for tree adaptation.
- Statistical dimension of low-rank subsets?
- Adaptive sampling strategies.
- Goal-oriented construction of low-rank approximations.
References and announcement

A. Nouy. Low-rank methods for high-dimensional approximation and model order reduction. arXiv preprint.

M. Chevreuil, R. Lebrun, A. Nouy, and P. Rai. A least-squares method for sparse low rank approximation of multivariate functions. SIAM/ASA Journal on Uncertainty Quantification, 3(1).

Open post-doc positions at Ecole Centrale Nantes (France): model order reduction for uncertainty quantification, high-dimensional approximation, low-rank tensor approximation, statistical learning.
More informationLecture 1: Introduction to low-rank tensor representation/approximation. Center for Uncertainty Quantification. Alexander Litvinenko
tifica Lecture 1: Introduction to low-rank tensor representation/approximation Alexander Litvinenko http://sri-uq.kaust.edu.sa/ KAUST Figure : KAUST campus, 5 years old, approx. 7000 people (include 1400
More informationQuantifying conformation fluctuation induced uncertainty in bio-molecular systems
Quantifying conformation fluctuation induced uncertainty in bio-molecular systems Guang Lin, Dept. of Mathematics & School of Mechanical Engineering, Purdue University Collaborative work with Huan Lei,
More informationSparse regression. Optimization-Based Data Analysis. Carlos Fernandez-Granda
Sparse regression Optimization-Based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/obda_spring16 Carlos Fernandez-Granda 3/28/2016 Regression Least-squares regression Example: Global warming Logistic
More informationInformation geometry for bivariate distribution control
Information geometry for bivariate distribution control C.T.J.Dodson + Hong Wang Mathematics + Control Systems Centre, University of Manchester Institute of Science and Technology Optimal control of stochastic
More informationConvex relaxation for Combinatorial Penalties
Convex relaxation for Combinatorial Penalties Guillaume Obozinski Equipe Imagine Laboratoire d Informatique Gaspard Monge Ecole des Ponts - ParisTech Joint work with Francis Bach Fête Parisienne in Computation,
More informationLow-rank tensor discretization for high-dimensional problems
Low-rank tensor discretization for high-dimensional problems Katharina Kormann August 6, 2017 1 Introduction Problems of high-dimensionality appear in many areas of science. High-dimensionality is usually
More informationA Block-Jacobi Algorithm for Non-Symmetric Joint Diagonalization of Matrices
A Block-Jacobi Algorithm for Non-Symmetric Joint Diagonalization of Matrices ao Shen and Martin Kleinsteuber Department of Electrical and Computer Engineering Technische Universität München, Germany {hao.shen,kleinsteuber}@tum.de
More informationLeast Squares Regression
E0 70 Machine Learning Lecture 4 Jan 7, 03) Least Squares Regression Lecturer: Shivani Agarwal Disclaimer: These notes are a brief summary of the topics covered in the lecture. They are not a substitute
More informationGENERAL VECTOR SPACES AND SUBSPACES [4.1]
GENERAL VECTOR SPACES AND SUBSPACES [4.1] General vector spaces So far we have seen special spaces of vectors of n dimensions denoted by R n. It is possible to define more general vector spaces A vector
More informationSparse Approximation and Variable Selection
Sparse Approximation and Variable Selection Lorenzo Rosasco 9.520 Class 07 February 26, 2007 About this class Goal To introduce the problem of variable selection, discuss its connection to sparse approximation
More informationIntroduction to Compressed Sensing
Introduction to Compressed Sensing Alejandro Parada, Gonzalo Arce University of Delaware August 25, 2016 Motivation: Classical Sampling 1 Motivation: Classical Sampling Issues Some applications Radar Spectral
More informationScientific Computing: Dense Linear Systems
Scientific Computing: Dense Linear Systems Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 Course MATH-GA.2043 or CSCI-GA.2112, Spring 2012 February 9th, 2012 A. Donev (Courant Institute)
More informationNUMERICAL METHODS WITH TENSOR REPRESENTATIONS OF DATA
NUMERICAL METHODS WITH TENSOR REPRESENTATIONS OF DATA Institute of Numerical Mathematics of Russian Academy of Sciences eugene.tyrtyshnikov@gmail.com 2 June 2012 COLLABORATION MOSCOW: I.Oseledets, D.Savostyanov
More informationLeast Squares Regression
CIS 50: Machine Learning Spring 08: Lecture 4 Least Squares Regression Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the lecture. They may or may not cover all the
More informationLearning Multiple Tasks with a Sparse Matrix-Normal Penalty
Learning Multiple Tasks with a Sparse Matrix-Normal Penalty Yi Zhang and Jeff Schneider NIPS 2010 Presented by Esther Salazar Duke University March 25, 2011 E. Salazar (Reading group) March 25, 2011 1
More informationGreedy algorithms for high-dimensional non-symmetric problems
Greedy algorithms for high-dimensional non-symmetric problems V. Ehrlacher Joint work with E. Cancès et T. Lelièvre Financial support from Michelin is acknowledged. CERMICS, Ecole des Ponts ParisTech &
More informationMathematical Methods for Data Analysis
Mathematical Methods for Data Analysis Massimiliano Pontil Istituto Italiano di Tecnologia and Department of Computer Science University College London Massimiliano Pontil Mathematical Methods for Data
More informationApproximate Kernel Methods
Lecture 3 Approximate Kernel Methods Bharath K. Sriperumbudur Department of Statistics, Pennsylvania State University Machine Learning Summer School Tübingen, 207 Outline Motivating example Ridge regression
More informationAvailable Ph.D position in Big data processing using sparse tensor representations
Available Ph.D position in Big data processing using sparse tensor representations Research area: advanced mathematical methods applied to big data processing. Keywords: compressed sensing, tensor models,
More information9. Numerical linear algebra background
Convex Optimization Boyd & Vandenberghe 9. Numerical linear algebra background matrix structure and algorithm complexity solving linear equations with factored matrices LU, Cholesky, LDL T factorization
More informationLinear Methods for Regression. Lijun Zhang
Linear Methods for Regression Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Linear Regression Models and Least Squares Subset Selection Shrinkage Methods Methods Using Derived
More informationIntroduction to Sparsity. Xudong Cao, Jake Dreamtree & Jerry 04/05/2012
Introduction to Sparsity Xudong Cao, Jake Dreamtree & Jerry 04/05/2012 Outline Understanding Sparsity Total variation Compressed sensing(definition) Exact recovery with sparse prior(l 0 ) l 1 relaxation
More informationFrom Matrix to Tensor. Charles F. Van Loan
From Matrix to Tensor Charles F. Van Loan Department of Computer Science January 28, 2016 From Matrix to Tensor From Tensor To Matrix 1 / 68 What is a Tensor? Instead of just A(i, j) it s A(i, j, k) or
More informationTECHNISCHE UNIVERSITÄT BERLIN
TECHNISCHE UNIVERSITÄT BERLIN Adaptive Stochastic Galerkin FEM with Hierarchical Tensor Representations Martin Eigel Max Pfeffer Reinhold Schneider Preprint 2015/29 Preprint-Reihe des Instituts für Mathematik
More informationLinear vector spaces and subspaces.
Math 2051 W2008 Margo Kondratieva Week 1 Linear vector spaces and subspaces. Section 1.1 The notion of a linear vector space. For the purpose of these notes we regard (m 1)-matrices as m-dimensional vectors,
More information(Part 1) High-dimensional statistics May / 41
Theory for the Lasso Recall the linear model Y i = p j=1 β j X (j) i + ɛ i, i = 1,..., n, or, in matrix notation, Y = Xβ + ɛ, To simplify, we assume that the design X is fixed, and that ɛ is N (0, σ 2
More informationA Do It Yourself Guide to Linear Algebra
A Do It Yourself Guide to Linear Algebra Lecture Notes based on REUs, 2001-2010 Instructor: László Babai Notes compiled by Howard Liu 6-30-2010 1 Vector Spaces 1.1 Basics Definition 1.1.1. A vector space
More informationMean-field dual of cooperative reproduction
The mean-field dual of systems with cooperative reproduction joint with Tibor Mach (Prague) A. Sturm (Göttingen) Friday, July 6th, 2018 Poisson construction of Markov processes Let (X t ) t 0 be a continuous-time
More informationOptimization for Compressed Sensing
Optimization for Compressed Sensing Robert J. Vanderbei 2014 March 21 Dept. of Industrial & Systems Engineering University of Florida http://www.princeton.edu/ rvdb Lasso Regression The problem is to solve
More informationLecture 2: Linear Algebra Review
EE 227A: Convex Optimization and Applications January 19 Lecture 2: Linear Algebra Review Lecturer: Mert Pilanci Reading assignment: Appendix C of BV. Sections 2-6 of the web textbook 1 2.1 Vectors 2.1.1
More informationStochastic methods for solving partial differential equations in high dimension
Stochastic methods for solving partial differential equations in high dimension Marie Billaud-Friess Joint work with : A. Macherey, A. Nouy & C. Prieur marie.billaud-friess@ec-nantes.fr Centrale Nantes,
More informationBayesian Decision and Bayesian Learning
Bayesian Decision and Bayesian Learning Ying Wu Electrical Engineering and Computer Science Northwestern University Evanston, IL 60208 http://www.eecs.northwestern.edu/~yingwu 1 / 30 Bayes Rule p(x ω i
More informationDATA MINING AND MACHINE LEARNING
DATA MINING AND MACHINE LEARNING Lecture 5: Regularization and loss functions Lecturer: Simone Scardapane Academic Year 2016/2017 Table of contents Loss functions Loss functions for regression problems
More information1.1 Basis of Statistical Decision Theory
ECE598: Information-theoretic methods in high-dimensional statistics Spring 2016 Lecture 1: Introduction Lecturer: Yihong Wu Scribe: AmirEmad Ghassami, Jan 21, 2016 [Ed. Jan 31] Outline: Introduction of
More informationEfficient Solvers for Stochastic Finite Element Saddle Point Problems
Efficient Solvers for Stochastic Finite Element Saddle Point Problems Catherine E. Powell c.powell@manchester.ac.uk School of Mathematics University of Manchester, UK Efficient Solvers for Stochastic Finite
More informationISyE 691 Data mining and analytics
ISyE 691 Data mining and analytics Regression Instructor: Prof. Kaibo Liu Department of Industrial and Systems Engineering UW-Madison Email: kliu8@wisc.edu Office: Room 3017 (Mechanical Engineering Building)
More informationA Randomized Approach for Crowdsourcing in the Presence of Multiple Views
A Randomized Approach for Crowdsourcing in the Presence of Multiple Views Presenter: Yao Zhou joint work with: Jingrui He - 1 - Roadmap Motivation Proposed framework: M2VW Experimental results Conclusion
More informationStructured Statistical Learning with Support Vector Machine for Feature Selection and Prediction
Structured Statistical Learning with Support Vector Machine for Feature Selection and Prediction Yoonkyung Lee Department of Statistics The Ohio State University http://www.stat.ohio-state.edu/ yklee Predictive
More informationOverview of sparse system identification
Overview of sparse system identification J.-Ch. Loiseau 1 & Others 2, 3 1 Laboratoire DynFluid, Arts et Métiers ParisTech, France 2 LIMSI, Université d Orsay CNRS, France 3 University of Washington, Seattle,
More informationhttps://goo.gl/kfxweg KYOTO UNIVERSITY Statistical Machine Learning Theory Sparsity Hisashi Kashima kashima@i.kyoto-u.ac.jp DEPARTMENT OF INTELLIGENCE SCIENCE AND TECHNOLOGY 1 KYOTO UNIVERSITY Topics:
More information