Sparse and Low-Rank Tensor Recovery via Cubic-Sketching
1 Sparse and Low-Rank Tensor Recovery via Cubic-Sketching. Guang Cheng, Department of Statistics, Purdue University. Math, Oct. 27, 2017. Joint work with Botao Hao and Anru Zhang.
2 Tensor: Multi-dimensional Array
3 Tensor Data Example
4 Motivation: Compressed Image Transmission
5 Motivation: Interaction Effect Model source: Contraceptive Method Choice dataset from UCI
6 Motivation: Interaction Effect Model
7 Sparse and Low-Rank Tensor Recovery
8 Noisy Cubic Sketching Model. Observe $\{y_i, X_i\}$ from the noisy cubic sketching model,
$$\underbrace{y_i}_{\text{scalar}} = \underbrace{\langle T, X_i \rangle}_{\text{tensor inner product}} + \underbrace{\epsilon_i}_{\text{noise}}, \qquad i = 1, \dots, n.$$
For two tensors $A, B \in \mathbb{R}^{p_1 \times p_2 \times p_3}$, the tensor inner product is defined as
$$\langle A, B \rangle = \sum_{i,j,k} A_{ijk} B_{ijk}.$$
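A minimal NumPy sketch of this data-generating model (all names and constants here are illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, sigma = 10, 200, 0.1

T = rng.standard_normal((p, p, p))                  # unknown tensor parameter (dense placeholder)
xs = rng.standard_normal((n, p))                    # Gaussian sketching vectors x_i
X = [np.einsum('i,j,k->ijk', x, x, x) for x in xs]  # cubic sketches X_i = x_i (x) x_i (x) x_i
y = np.array([np.sum(T * Xi) for Xi in X])          # <T, X_i> = sum_{ijk} T_ijk (X_i)_ijk
y += sigma * rng.standard_normal(n)                 # additive noise eps_i
```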
9 Noisy Cubic Sketching Model. A general model including tensor regression (Zhou, Li, Zhu (2013)) and tensor completion (Yuan and Zhang (2014)). Goal: recover the unknown third-order tensor parameter $T$. High-dimensional problem: $n \ll \dim(T) \asymp p^3$.
10 Key Assumptions on Tensor Parameter. When $T \in \mathbb{R}^{p \times p \times p}$ is a symmetric tensor...
1. CANDECOMP/PARAFAC (CP) low-rank: $T$ is represented as a sum of rank-one tensors, $T = \sum_{k=1}^K \eta_k\, \beta_k \otimes \beta_k \otimes \beta_k$, where $K \ll p$.
2. Sparse components: $\|\beta_k\|_0 \leq s$ for $k \in [K]$.
The cubic sketching tensor $X_i$ in the symmetric case is $X_i = x_i \otimes x_i \otimes x_i$, where $\{x_i\}_{i=1}^n$ are Gaussian random vectors. Note that $\beta_k$ and $\beta_{k'}$ are not orthogonal; this differs from the singular value decomposition in the matrix case.
11 Key Assumptions on Tensor Parameter. When $T \in \mathbb{R}^{p_1 \times p_2 \times p_3}$ is a non-symmetric tensor...
1. CANDECOMP/PARAFAC (CP) low-rank: $T = \sum_{k=1}^K \eta_k\, \beta_{1k} \otimes \beta_{2k} \otimes \beta_{3k}$.
2. Sparse components: $\|\beta_{1k}\|_0 \leq s_1$, $\|\beta_{2k}\|_0 \leq s_2$, $\|\beta_{3k}\|_0 \leq s_3$ for $k \in [K]$.
The cubic sketching tensor $X_i$ in the non-symmetric case is $X_i = u_i \otimes v_i \otimes w_i$, where $\{u_i, v_i, w_i\}_{i=1}^n$ are Gaussian random vectors.
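For concreteness, a small sketch of how one could build a sparse, CP low-rank symmetric tensor of this kind (the helper name and parameter values are mine):

```python
import numpy as np

rng = np.random.default_rng(1)
p, K, s = 30, 3, 5

def sparse_unit_vector(p, s, rng):
    """Unit vector with s nonzero entries (the sparsity assumption on beta_k)."""
    b = np.zeros(p)
    idx = rng.choice(p, size=s, replace=False)
    b[idx] = rng.standard_normal(s)
    return b / np.linalg.norm(b)

betas = [sparse_unit_vector(p, s, rng) for _ in range(K)]
etas = rng.uniform(1.0, 2.0, size=K)

# T = sum_k eta_k * beta_k (x) beta_k (x) beta_k  (rank-K symmetric CP form)
T = sum(eta * np.einsum('i,j,k->ijk', b, b, b) for eta, b in zip(etas, betas))
```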
12 Reduced Symmetric Tensor Recovery Model. For the symmetric tensor recovery model,
$$y_i = \Big\langle \sum_{k=1}^K \eta_k\, \beta_k \otimes \beta_k \otimes \beta_k,\; x_i \otimes x_i \otimes x_i \Big\rangle + \epsilon_i = \underbrace{\sum_{k=1}^K \eta_k (x_i^\top \beta_k)^3}_{\text{non-linear}} + \epsilon_i.$$
This connects with the interaction effect model. New goal: recover $\{\eta_k, \beta_k\}_{k=1}^K$.
13 Reduced Non-symmetric Tensor Recovery Model. For the non-symmetric tensor recovery model,
$$y_i = \Big\langle \sum_{k=1}^K \eta_k\, \beta_{1k} \otimes \beta_{2k} \otimes \beta_{3k},\; u_i \otimes v_i \otimes w_i \Big\rangle + \epsilon_i = \underbrace{\sum_{k=1}^K \eta_k (u_i^\top \beta_{1k})(v_i^\top \beta_{2k})(w_i^\top \beta_{3k})}_{\text{non-linear}} + \epsilon_i.$$
This connects with the compressed image transmission model. New goal: recover $\{\eta_k, \beta_{1k}, \beta_{2k}, \beta_{3k}\}_{k=1}^K$.
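The equalities on the last two slides rest on the rank-one inner-product identity, spelled out here for the symmetric case:
$$\langle \beta \otimes \beta \otimes \beta,\; x \otimes x \otimes x \rangle = \sum_{j,k,l} \beta_j \beta_k \beta_l\, x_j x_k x_l = \Big(\sum_j x_j \beta_j\Big)\Big(\sum_k x_k \beta_k\Big)\Big(\sum_l x_l \beta_l\Big) = (x^\top \beta)^3,$$
and, in the non-symmetric case, $\langle \beta_1 \otimes \beta_2 \otimes \beta_3,\; u \otimes v \otimes w \rangle = (u^\top \beta_1)(v^\top \beta_2)(w^\top \beta_3)$.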
14 Empirical Risk Minimization. Consider empirical risk minimization:
$$\hat{T} = \operatorname*{argmin}_{\{\eta_k, \beta_k\}} \underbrace{\sum_{i=1}^n \Big(y_i - \sum_{k=1}^K \eta_k (x_i^\top \beta_k)^3\Big)^2}_{L_1(\eta_k,\, \beta_k)}$$
$$\hat{T} = \operatorname*{argmin}_{\{\eta_k, \beta_{jk}\}} \underbrace{\sum_{i=1}^n \Big(y_i - \sum_{k=1}^K \eta_k (u_i^\top \beta_{1k})(v_i^\top \beta_{2k})(w_i^\top \beta_{3k})\Big)^2}_{L_2(\eta_k,\, \beta_{jk})}$$
Difficulty: non-convex optimization! The non-convexity comes from the cubic structure or from tri-convexity.
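A minimal NumPy sketch of the symmetric objective $L_1$ and its partial gradient in $\beta_k$ (function and variable names are mine, not from the slides; `x_mat` is the $n \times p$ matrix with rows $x_i$):

```python
import numpy as np

def L1_loss(y, x_mat, etas, betas):
    """L1 = sum_i (y_i - sum_k eta_k (x_i' beta_k)^3)^2."""
    fits = sum(eta * (x_mat @ b) ** 3 for eta, b in zip(etas, betas))
    return float(np.sum((y - fits) ** 2))

def L1_grad_beta_k(y, x_mat, etas, betas, k):
    """Partial gradient in beta_k: -6 eta_k sum_i r_i (x_i' beta_k)^2 x_i."""
    fits = sum(eta * (x_mat @ b) ** 3 for eta, b in zip(etas, betas))
    r = y - fits
    return -6.0 * etas[k] * (x_mat.T @ (r * (x_mat @ betas[k]) ** 2))
```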
15 Our Contributions. 1. An efficient two-stage implementation for the non-convex optimization problem. 2. A non-asymptotic analysis that provides the optimal estimation rate.
16 Two-stage Implementation
17 Main Algorithm
18 Initial Step: Non-symmetric Unbiased Estimator. Construct an unbiased empirical moment-based tensor $\tilde{T}(y_i, X_i) \in \mathbb{R}^{p_1 \times p_2 \times p_3}$ as follows:
$$\tilde{T} := \underbrace{\frac{1}{n} \sum_{i=1}^n y_i\, u_i \otimes v_i \otimes w_i}_{\text{depends only on the observations}}.$$
This is used for the non-symmetric case, where $u_i, v_i, w_i$ are independent standard Gaussian random vectors.
19 Initial Step: Symmetric Unbiased Estimator. Construct an unbiased empirical moment-based tensor $\tilde{T}_s(y_i, X_i) \in \mathbb{R}^{p \times p \times p}$ as follows:
$$\tilde{T}_s := \frac{1}{6}\Big[\underbrace{\frac{1}{n}\sum_{i=1}^n y_i\, x_i \otimes x_i \otimes x_i}_{\text{depends only on the observations}} - U\Big],$$
where the bias term is
$$U = \sum_{j=1}^p \big(m_1 \otimes e_j \otimes e_j + e_j \otimes m_1 \otimes e_j + e_j \otimes e_j \otimes m_1\big), \qquad m_1 = \frac{1}{n}\sum_{i=1}^n y_i x_i.$$
Here $\{e_j\}_{j=1}^p$ are the canonical basis vectors in $\mathbb{R}^p$. The bias term $U$ arises from the correlation among the three identical Gaussian random vectors.
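A direct NumPy transcription of this construction, under my reconstruction of the display above (so treat the exact constants as indicative):

```python
import numpy as np

def moment_estimator_symmetric(y, x_mat):
    """Bias-corrected moment estimator T_s for the symmetric case (a sketch)."""
    n, p = x_mat.shape
    # (1/n) sum_i y_i x_i (x) x_i (x) x_i
    M3 = np.einsum('l,li,lj,lk->ijk', y, x_mat, x_mat, x_mat) / n
    m1 = x_mat.T @ y / n                       # (1/n) sum_i y_i x_i
    I = np.eye(p)
    # U = sum_j (m1 (x) e_j (x) e_j + e_j (x) m1 (x) e_j + e_j (x) e_j (x) m1)
    U = (np.einsum('i,jk->ijk', m1, I)
         + np.einsum('j,ik->ijk', m1, I)
         + np.einsum('k,ij->ijk', m1, I))
    return (M3 - U) / 6.0
```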
20–22 Initial Step: Decompose Unbiased Estimator. Intuition: $\mathbb{E}[\tilde{T}_s] = T$. Observation: $\tilde{T}_s$. Noise $E = \tilde{T}_s - \mathbb{E}(\tilde{T}_s)$: the approximation error. Decompose $\tilde{T}_s$ to obtain $\{\eta_k^{(0)}, \beta_k^{(0)}\}$, or decompose $\tilde{T}$ to obtain $\{\eta_k^{(0)}, \beta_{1k}^{(0)}, \beta_{2k}^{(0)}, \beta_{3k}^{(0)}\}$, through sparse tensor decomposition (see next slide for details). This is far from the optimal estimate, but good enough as a warm start.
23 Sparse Tensor Decomposition for Symmetric Tensor. Generate $L$ starting points $\{\beta_l^{\text{start}}\}_{l=1}^L$. For each starting point, compute a non-sparse factor of the moment-based $\tilde{T}_s$ via the symmetric tensor power update
$$\beta_l^{(t+1)} = \frac{\tilde{T}_s \times_2 \beta_l^{(t)} \times_3 \beta_l^{(t)}}{\big\|\tilde{T}_s \times_2 \beta_l^{(t)} \times_3 \beta_l^{(t)}\big\|_2},$$
where for $\tilde{T}_s \in \mathbb{R}^{p \times p \times p}$ and $x \in \mathbb{R}^p$ we define $\tilde{T}_s \times_2 x \times_3 x := \sum_{j,l} x_j x_l [\tilde{T}_s]_{:,j,l}$. Get a sparse solution $\bar{\beta}_l^{(t+1)}$ via thresholding or truncation. Cluster the $L$ single-component solutions $\{\beta_l^{(T)}, \beta_l^{(T)}, \beta_l^{(T)}\}_{l=1}^L$ into $K$ clusters to obtain a rank-$K$ decomposition $\{\beta_k^{(0)}, \beta_k^{(0)}, \beta_k^{(0)}\}_{k=1}^K$. This differs from the matrix SVD due to non-orthogonality.
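A sketch of the truncated power iteration from one starting point (the truncation level `s` and iteration count are illustrative, and I truncate before normalizing; the slide leaves open whether thresholding or truncation is used):

```python
import numpy as np

def truncate(v, s):
    """Hard truncation: keep the s largest-magnitude entries, zero the rest."""
    out = np.zeros_like(v)
    keep = np.argsort(np.abs(v))[-s:]
    out[keep] = v[keep]
    return out

def symmetric_power_update(Ts, beta0, s, n_iter=50):
    """Truncated symmetric tensor power iteration from one starting point."""
    beta = beta0 / np.linalg.norm(beta0)
    for _ in range(n_iter):
        # Ts x_2 beta x_3 beta = sum_{j,l} beta_j beta_l Ts[:, j, l]
        v = np.einsum('ijl,j,l->i', Ts, beta, beta)
        v = truncate(v, s)                     # sparsify the current iterate
        beta = v / np.linalg.norm(v)
    return beta
```

In practice one would run this from all $L$ starting points and then cluster the results into $K$ groups, as the slide describes.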
24 Sparse Tensor Decomposition for Non-symmetric Tensor. Generate $L$ starting points $\{\beta_{1l}^{\text{start}}, \beta_{2l}^{\text{start}}, \beta_{3l}^{\text{start}}\}_{l=1}^L$. For each starting point, compute a non-sparse factor of the moment-based $\tilde{T}$ via the alternating tensor power update
$$\beta_{1l}^{(t+1)} = \frac{\tilde{T} \times_2 \beta_{2l}^{(t)} \times_3 \beta_{3l}^{(t)}}{\big\|\tilde{T} \times_2 \beta_{2l}^{(t)} \times_3 \beta_{3l}^{(t)}\big\|_2};$$
the updates for $\beta_{2l}^{(t+1)}$ and $\beta_{3l}^{(t+1)}$ are similar. Get a sparse solution $\{\bar{\beta}_{1l}^{(t+1)}, \bar{\beta}_{2l}^{(t+1)}, \bar{\beta}_{3l}^{(t+1)}\}$ via thresholding or truncation. Cluster the $L$ single-component solutions $\{\beta_{1l}^{(T)}, \beta_{2l}^{(T)}, \beta_{3l}^{(T)}\}_{l=1}^L$ into $K$ clusters to obtain a rank-$K$ decomposition $\{\beta_{1k}^{(0)}, \beta_{2k}^{(0)}, \beta_{3k}^{(0)}\}_{k=1}^K$.
25 Gradient Update: Thresholded Gradient Descent. Input the initial estimator $\{\eta_k^{(0)}, \beta_k^{(0)}\}_{k=1}^K$. In each iteration, update $\{\beta_k\}_{k=1}^K$ as
$$\tilde{\beta}_k^{(t+1)} = \beta_k^{(t)} - \frac{\mu_t}{\phi}\, \nabla_{\beta_k} L_1\big(\eta_k^{(0)}, \beta_k^{(t)}\big),$$
where $\phi = \frac{1}{n}\sum_{i=1}^n y_i^2$ and $\mu_t$ is the step size. Sparsify the current update by thresholding: $\beta_k^{(t+1)} = \varphi_\rho\big(\tilde{\beta}_k^{(t+1)}\big)$. Normalize the final update, $\beta_k^{(T)} \leftarrow \beta_k^{(T)} / \|\beta_k^{(T)}\|_2$, and update the weight $\eta_k = \eta_k^{(0)} \|\beta_k^{(T)}\|_2^3$. (An alternating update handles the non-symmetric case; see next slide.)
26 Gradient Update: Thresholded Gradient Descent. Input the initial estimator $\{\eta_k^{(0)}, \beta_{1k}^{(0)}, \beta_{2k}^{(0)}, \beta_{3k}^{(0)}\}_{k=1}^K$. In each iteration, alternately update $\{\beta_{1k}, \beta_{2k}, \beta_{3k}\}_{k=1}^K$ as
$$\tilde{\beta}_{1k}^{(t+1)} = \beta_{1k}^{(t)} - \frac{\mu_t}{\phi}\, \nabla_{\beta_{1k}} L_2\big(\eta_k^{(0)}, \beta_{1k}^{(t)}, \beta_{2k}^{(t)}, \beta_{3k}^{(t)}\big),$$
where $\phi = \frac{1}{n}\sum_{i=1}^n y_i^2$ and $\mu_t$ is the step size; the updates for $\beta_{2k}^{(t+1)}$ and $\beta_{3k}^{(t+1)}$ are similar. Sparsify the current update by thresholding: $\beta_{jk}^{(t+1)} = \varphi_\rho\big(\tilde{\beta}_{jk}^{(t+1)}\big)$ for $j = 1, 2, 3$. Normalize the final update, $\beta_{jk}^{(T)} \leftarrow \beta_{jk}^{(T)} / \|\beta_{jk}^{(T)}\|_2$, and update the weight $\eta_k = \eta_k^{(0)} \|\beta_{1k}^{(T)}\|_2 \|\beta_{2k}^{(T)}\|_2 \|\beta_{3k}^{(T)}\|_2$. This is the alternating update for non-symmetric tensor recovery.
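A self-contained sketch of the symmetric thresholded-gradient loop. The slides do not pin down the sparsifier $\varphi_\rho$ or the step size, so soft-thresholding and a constant step are my assumptions here:

```python
import numpy as np

def soft_threshold(v, rho):
    """One common choice of the sparsifier phi_rho (an assumption, not from the slides)."""
    return np.sign(v) * np.maximum(np.abs(v) - rho, 0.0)

def tgd_symmetric(y, x_mat, etas, betas, rho, mu=1e-3, n_iter=100):
    """Thresholded gradient descent on L1 for the symmetric case (a sketch)."""
    phi = np.mean(y ** 2)                      # phi = (1/n) sum_i y_i^2
    betas = [b.copy() for b in betas]
    for _ in range(n_iter):
        for k in range(len(betas)):
            fits = sum(e * (x_mat @ b) ** 3 for e, b in zip(etas, betas))
            # gradient of L1 in beta_k: -6 eta_k sum_i (y_i - fit_i)(x_i' b_k)^2 x_i
            g = -6.0 * etas[k] * (x_mat.T @ ((y - fits) * (x_mat @ betas[k]) ** 2))
            betas[k] = soft_threshold(betas[k] - (mu / phi) * g, rho)
    # normalize factors and absorb their scale into the weights (cubic in the norm)
    norms = [np.linalg.norm(b) for b in betas]
    return ([e * c ** 3 for e, c in zip(etas, norms)],
            [b / c for b, c in zip(betas, norms)])
```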
27 Non-asymptotic Analysis
28 Non-asymptotic Upper Bound. Theorem. Suppose some regularity conditions on the true tensor parameter hold, and assume $n \geq C_0 s^2 \log p$ for some large constant $C_0$. Denote
$$Z^{(t)} = \sum_{k=1}^K \big\| \eta_k^{1/3} \beta_k^{(t)} - \eta_k^{1/3} \beta_k \big\|_2^2.$$
For any $t = 0, 1, 2, \dots$, the factor-wise estimator satisfies
$$Z^{(t+1)} \leq \underbrace{\kappa\, Z^{(t)}}_{\text{computational error}} + \underbrace{C_1\, \eta_{\min}^{-4/3}\, \sigma^2\, \frac{s \log p}{n}}_{\text{statistical error}}$$
with high probability, where $\kappa$ is the contraction parameter between 0 and 1, $\eta_{\min} = \min_k \{\eta_k\}$, $\sigma$ is the noise level, and $C_0, C_1$ are absolute constants.
29 Remarks. An interesting characterization of the computational error and the statistical error. Geometric convergence to the truth in the noiseless case, and the minimax-optimal statistical rate (shown later). The error bound is dominated by the computational error in the first several iterations and by the statistical error afterward, which gives a useful guideline for choosing a stopping rule.
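Unrolling the recursion from the theorem makes this remark precise (using my reconstruction of the bound above): since $\kappa \in (0,1)$,
$$Z^{(t)} \;\leq\; \kappa^{t} Z^{(0)} \;+\; \frac{C_1\, \eta_{\min}^{-4/3}\, \sigma^2}{1-\kappa} \cdot \frac{s \log p}{n},$$
so the geometric term $\kappa^{t} Z^{(0)}$ dominates early on, and once it falls below the statistical term further iterations give no order-wise improvement; stopping after $T \gtrsim \log\!\big(Z^{(0)} n / (s \log p)\big) / \log(1/\kappa)$ iterations suffices.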
30 Remarks. When $t \geq T$ for some large enough $T$, the final estimator is bounded as
$$\big\|\hat{T}^{(T)} - T\big\|_F^2 \leq C \sigma^2\, \frac{K s \log p}{n}$$
with high probability. This is the minimax-optimal rate!
31 Class of Sparse and Low-rank Tensors. Sparse CP decomposition:
$$T = \sum_{k=1}^K \eta_k\, \beta_k \otimes \beta_k \otimes \beta_k, \qquad \|\beta_k\|_0 \leq s \text{ for } k \in [K].$$
Incoherence condition (nearly orthogonal): the components of the true tensor are incoherent, in that
$$\max_{i \neq j \in [K]} |\langle \beta_i, \beta_j \rangle| \leq \frac{C}{\sqrt{s}}.$$
32 Minimax Lower Bound. Theorem. Consider the class of tensors satisfying the sparse CP decomposition and the incoherence condition. Suppose we sample via cubic measurements with i.i.d. standard normal sketches and i.i.d. $N(0, \sigma^2)$ noise. Then we have the following lower bound on the recovery loss over this class of low-rank tensors:
$$\inf_{\hat{T}} \sup_{T \in \mathcal{F}} \mathbb{E}\big\|\hat{T} - T\big\|_F^2 \geq c\, \sigma^2\, \frac{K s \log(e p / s)}{n}.$$
33 Optimal Estimation Rate. Theorem. Consider the class of tensors $\mathcal{F}_{p,K,s}$ satisfying the sparse CP decomposition and the incoherence condition. Suppose we observe $n$ samples $\{y_i, X_i\}_{i=1}^n$ from the symmetric cubic sketching model, where $n \geq C s^2 \log p$ for some large constant $C$. Then the estimator $\hat{T}$ achieves
$$\sup_{T \in \mathcal{F}_{p,K,s}} \mathbb{E}\big\|\hat{T} - T\big\|_F^2 \lesssim \underbrace{\sigma^2\, \frac{K s \log(p/s)}{n}}_{\text{statistical error}} + R$$
when $\log p \asymp \log(p/s)$, where $R$ is a remainder term (next slide) and $\sigma$ is the noise level.
34 Remarks. Our analysis is non-asymptotic, and our estimator is rate-optimal. In general there is a trade-off: $R$ is the outcome of the trade-off between statistical error and optimization error. A similar argument holds for the non-symmetric case, though different technical tools are used. To overcome the obstacle posed by high-order Gaussian random variables, we develop a novel high-order concentration inequality using a truncation argument and the $\psi_\alpha$-norm.
35 Application to Interaction Effect Model. Given the response $y \in \mathbb{R}^n$ and covariates $X \in \mathbb{R}^{n \times p}$, the regression model with three-way interactions can be formulated as
$$y_l = \beta_0 + \sum_{i=1}^p \beta_i X_{li} + \sum_{i,j=1}^p \gamma_{ij} X_{li} X_{lj} + \sum_{i,j,k=1}^p \eta_{ijk} X_{li} X_{lj} X_{lk} + \epsilon_l = \big\langle B,\; \tilde{X}_l \otimes \tilde{X}_l \otimes \tilde{X}_l \big\rangle + \epsilon_l,$$
where $\tilde{X}_l = (1, X_l^\top)^\top \in \mathbb{R}^{p+1}$.
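To see the tensor form concretely, a small NumPy check that the augmented covariate $\tilde{X}_l = (1, X_l)$ folds the intercept, main effects, and lower-order interactions into a single $(p+1)$-way inner product (all names and sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 100, 5
X = rng.standard_normal((n, p))
X_aug = np.hstack([np.ones((n, 1)), X])       # X_l~ = (1, X_l) in R^{p+1}

# A generic coefficient tensor B; its 0-indexed slices carry the intercept,
# main effects, and two-way interactions, the rest the three-way terms.
B = rng.standard_normal((p + 1,) * 3)
y_fit = np.einsum('ijk,li,lj,lk->l', B, X_aug, X_aug, X_aug)  # <B, X~ (x) X~ (x) X~>
```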
36 Application to Interaction Effect Model. It is reasonable to assume that $B$ possesses low-rank and/or sparsity structures in some biometric studies (Hung et al., 2016):
$$y_l = \Big\langle \sum_{k=1}^K \eta_k\, \beta_k \otimes \beta_k \otimes \beta_k,\; \tilde{X}_l \otimes \tilde{X}_l \otimes \tilde{X}_l \Big\rangle + \epsilon_l.$$
The symmetric tensor recovery model can thus be treated as a high-order interaction effect model.
37 Some Changes to the Algorithm. The unbiased empirical moment-based tensor $A$ for the interaction effect model is constructed as follows. Define three quantities:
$$a = \frac{1}{n}\sum_{l=1}^n y_l \tilde{X}_l, \qquad \tilde{A} = \frac{1}{n}\sum_{l=1}^n y_l\, \tilde{X}_l \otimes \tilde{X}_l \otimes \tilde{X}_l,$$
$$\bar{A} = \frac{1}{6}\Big(\tilde{A} - \sum_{j=1}^p \big(a \otimes e_j \otimes e_j + e_j \otimes a \otimes e_j + e_j \otimes e_j \otimes a\big)\Big).$$
For $i, j, k \neq 0$, set $A_{ijk} = \bar{A}_{ijk}$. For $i \neq 0$,
$$A_{0,0,i} = \frac{1}{3}\tilde{A}_{0,0,i} - \frac{1}{6}\Big(\sum_{k=1}^p \tilde{A}_{k,k,i} - (p+2)\, a_i\Big),$$
and
$$A_{0,0,0} = \frac{1}{2p^2}\Big(\sum_{k=1}^p \tilde{A}_{0,k,k} - (p+2)\, \tilde{A}_{0,0,0}\Big).$$
The additional intercept term changes the model structure dramatically.
38 Numerical Study. Recover a low-rank and sparse symmetric tensor $T \in \mathbb{R}^{p \times p \times p}$ from $y_i = \langle T, X_i \rangle + \epsilon_i$, $i = 1, \dots, n$. Proportion of non-zero elements in each factor: $s = 0.3$. Tensor CP-rank $K = 3$. Number of replications: 200. Stopping rule for the initialization: the change $\|\beta_m^{(l+1)} - \beta_m^{(l)}\|$ falls below a preset tolerance. Stopping rule for the gradient update: $\|B^{(T+1)} - B^{(T)}\|_F$ falls below a preset tolerance. The dimension, sample size, and noise level vary across scenarios.
39 Left panel: relative error of the initialization. Right panel: percentage of successful recoveries with varying sample size. Both are for the noiseless case with $p = 30$.
40 Noisy case. Left panel: absolute error for recovering a rank-three tensor with varying noise level and sample size. Right panel: absolute error for recovering a rank-five tensor with varying noise level and sample size.
41 Thanks! Questions?