Sampling and high-dimensional convex geometry

Sampling and high-dimensional convex geometry. Roman Vershynin. SampTA 2013, Bremen, Germany, June 2013.

Geometry of sampling problems. Signals live in high dimensions; sampling is often random. Geometry in high dimensions, studied with probabilistic methods (1970s-2013+): asymptotic convex geometry = geometric functional analysis.

Geometry of sampling problems. Signal recovery problem. Signal: x ∈ R^n, unknown. Sampling (measurement) map: Φ : R^n → R^m, known. Measurement vector: y = Φ(x) ∈ R^m, known. Goal: recover x from y.

Geometry of sampling problems. Prior information (model): x ∈ K, where K ⊂ R^n is a known signal set. Examples of K: discrete L^p, i.e. l_p^n, for 0 < p < ∞; {band-limited functions}; {s-sparse signals in R^n}; {block-sparse signals}; {low-rank matrices}; etc.

Geometry of sampling problems. Linear sampling: y = Ax, where A is an m × n matrix. Non-linear sampling: later.

Convexity. The signal set K may be non-convex. Then convexify: replace K by conv(K). Question. What do convex sets look like in high dimensions? This is the main question of asymptotic convex geometry = geometric functional analysis.

Asymptotic convex geometry. Main message of asymptotic convex geometry: K ≈ bulk + outliers. Bulk = round ball, which makes up most of the volume of K. Outliers = a few long tentacles, which contain little volume. This is V. Milman's heuristic picture of a convex body in high dimensions.

Concentration of volume. The volume of an isotropic convex set is concentrated in a round ball. Isotropy assumption: X ~ Unif(K) satisfies E X = 0, E XX^T = I_n. Then E ‖X‖_2^2 = n, so the radius of that ball is √n. Theorem (concentration of volume) [Paouris 06]: P{ ‖X‖_2 > t√n } ≤ exp(−ct√n) for t ≥ 1. A simpler proof: [Adamczak-Litvak-Oleszkiewicz-Pajor-Tomczak 13].
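
A minimal numerical sketch of this concentration, using the cube [−√3, √3]^n as a simple isotropic convex body (Python with NumPy assumed; the dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 1000, 2000

# Uniform distribution on the cube [-sqrt(3), sqrt(3)]^n is isotropic:
# each coordinate has mean 0 and variance (2*sqrt(3))^2 / 12 = 1.
X = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(N, n))
norms = np.linalg.norm(X, axis=1)

print("sqrt(n)         =", np.sqrt(n))
print("mean of ||X||_2 =", norms.mean())
print("std  of ||X||_2 =", norms.std())  # thin shell: the spread is O(1), not O(sqrt(n))
```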

Concentration of volume. Theorem (thin shell) [Klartag 08]: ‖X‖_2 = (1 + o(1))√n with high probability. Question. What is the best bound on the width of the shell? Currently best known: n^{1/3} [Guedon-E. Milman 11]. Thin Shell Conjecture: the width is an absolute constant. This would imply the Hyperplane Conjecture [Eldan-Klartag 11].

Round sections. A random subspace E misses the outliers and passes through the bulk of K. Theorem (Dvoretzky's theorem): Consider a random subspace E in R^n of dimension ~ log n. Then K ∩ E ≈ round ball with high probability. Remark. For many convex sets K, the dimension of E can be much larger than log n; see Milman's version of Dvoretzky's theorem.

Sections of arbitrary dimension. Let K ⊂ R^n be any convex set. Theorem (M* estimate) [Milman 81, Pajor-Tomczak 85, Mendelson-Pajor-Tomczak 07]: Consider a random subspace E in R^n of codimension m. Then diam(K ∩ E) ≤ C w(K)/√m with high probability. Here w(K) is the mean width of K.

Mean width. w(K) := E sup_{x ∈ K−K} ⟨g, x⟩, where g ~ N(0, I_n). Note: ‖g‖_2 ≈ √n, so w(K) ≈ √n × (width of K in a random direction).

Example: the l_1 ball. K = conv(±e_i) = {x : ‖x‖_1 ≤ 1} = B_1^n. The standard picture: Vol(K)^{1/n} ≈ Vol(inscribed Euclidean ball)^{1/n} ≈ 1/n. Hence a more accurate picture is a small round bulk with thin tentacles reaching out to the vertices ±e_i.
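
The volume comparison can be checked directly from the exact formulas Vol(B_1^n) = 2^n/n! and Vol(r B_2^n) = r^n π^{n/2}/Γ(n/2 + 1); a small sketch (NumPy and SciPy assumed), showing that both quantities are of order 1/n up to absolute constants:

```python
import numpy as np
from scipy.special import gammaln

# Vol(B_1^n) = 2^n / n!; the inscribed Euclidean ball has radius 1/sqrt(n).
# Both have Vol^(1/n) of order 1/n (up to absolute constants).
for n in [10, 100, 1000]:
    v_l1 = np.exp((n * np.log(2.0) - gammaln(n + 1)) / n)
    v_ball = np.exp((-0.5 * n * np.log(n) + 0.5 * n * np.log(np.pi) - gammaln(n / 2 + 1)) / n)
    print(f"n={n:5d}  Vol(B_1^n)^(1/n)={v_l1:.4f}  Vol(inscribed ball)^(1/n)={v_ball:.4f}  1/n={1/n:.4f}")
```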

Example: the l_1 ball. Mean width: w(K) := E sup_{x ∈ K−K} ⟨g, x⟩ ≈ √n × E[width of K in a random direction]. Here w(K) ≈ √(log n), while w(bulk) ≈ 1. Hence the mean width sees the bulk and ignores the outliers.

Signal recovery (following [Mendelson-Pajor-Tomczak 07]). Back to the signal recovery problem (linear sampling): recover a signal x ∈ K ⊂ R^n from y = Ax ∈ R^m. If m < n, the problem is ill-posed. What do we know? x belongs to both K and the affine subspace E_x := {x' : Ax' = y} = x + ker(A). Both K and E_x are known. If diam(K ∩ E_x) ≤ ε, then x can be recovered with error ε.

Signal recovery diam(k E x ) ε? Assume K is convex and origin-symmetric. Then diam is largest for x = 0. Conclusion. Let E = ker(a). If diam(k E) ε then any x K can be recovered from y = Ax with error ε. The recovery can be done by a convex program Find x K such that Ax = y. Find a signal consistent with the model (K) and with measurements (y).

Signal recovery. Remaining question: when is diam(K ∩ E) ≤ ε, where E = ker(A)? If A is a random matrix, then E is a random subspace. So the M* estimate applies and yields diam(K ∩ E) ≤ C w(K)/√m with high probability. Equate this with ε to obtain the sample size (number of measurements): m ~ ε^{-2} w(K)^2.

Conclusion (Signal recovery). Let K be a convex and origin-symmetric signal set in R^n, and let A be an m × n random matrix. If m ≳ w(K)^2, then one can accurately recover any signal x ∈ K from the m random linear measurements y = Ax ∈ R^m. The recovery is done by the convex program: find x' ∈ K such that Ax' = y, i.e. find x' consistent with the model (K) and the measurements (y). Other natural recovery programs: (1) max ⟨Ax', y⟩ subject to x' ∈ K, i.e. find x' ∈ K most correlated with the measurements; (2) find x' ∈ K closest to A^T y, i.e. the metric projection of A^T y onto K.
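
For illustration, a minimal sketch of the first program, find x' ∈ K with Ax' = y (the cvxpy package is assumed; K is taken to be the l_1 ball of radius ‖x‖_1, a simple convex model for sparse signals, and the dimensions are illustrative):

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(1)
n, m, s = 200, 60, 3

# s-sparse signal and Gaussian linear measurements y = Ax
x = np.zeros(n)
x[rng.choice(n, s, replace=False)] = rng.standard_normal(s)
A = rng.standard_normal((m, n))
y = A @ x

# Feasibility program: find x' in K = {||x'||_1 <= R} with Ax' = y
R = np.linalg.norm(x, 1)                     # assumes the model radius is known
xp = cp.Variable(n)
prob = cp.Problem(cp.Minimize(0), [A @ xp == y, cp.norm1(xp) <= R])
prob.solve()

print("recovery error:", np.linalg.norm(xp.value - x))
```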

Remarks. 1. If the signal set K is not convex, then convexify; w(conv(K)) = w(K). 2. If the signal set K is not origin-symmetric, then symmetrize; w(K − K) ≤ 2 w(K). 3. The mean width can be estimated efficiently by a randomized linear program: w(K) ≈ sup_{x ∈ K−K} ⟨g, x⟩, g ~ N(0, I_n).
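
Remark 3 in code: a minimal Monte-Carlo sketch of this randomized estimate, for K = B_1^n, where the supremum has the closed form 2 max_i |g_i| (NumPy assumed; parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n, trials = 10_000, 200

# For K = B_1^n, sup_{x in K-K} <g, x> = 2 * max_i |g_i|,
# so each Gaussian draw g gives one sample of the randomized estimate.
samples = [2 * np.max(np.abs(rng.standard_normal(n))) for _ in range(trials)]

print("Monte-Carlo w(B_1^n):", np.mean(samples))
print("2*sqrt(2 log n)     :", 2 * np.sqrt(2 * np.log(n)))
```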

Information theory viewpoint. The sample size m ~ w(K)^2 is an effective dimension of K, the amount of information in K. Conclusion: one can effectively recover any signal in K from ~ w(K)^2 random linear measurements. Example 1. K = B_1^n = conv(±e_i); here w(K)^2 ~ log n. So one can effectively recover any signal in B_1^n from ~ log n random linear measurements. Remark. About log n bits are required to specify even a vertex of K, so this number of measurements is optimal.

Information theory viewpoint. Example 2. K = {s-sparse unit vectors in R^n}; here w(K)^2 ~ s log n. Conclusion: one can effectively recover any s-sparse signal from ~ s log n random linear measurements. This is a well-known result in compressed sensing. (Warning: exact recovery is not explained by this geometric reasoning.) Remark. log (n choose s) ≈ s log n bits are required to specify the sparsity pattern.
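
The estimate w(K)^2 ~ s log n can also be checked with the randomized estimate from the previous remark; for K = {s-sparse unit vectors}, the supremum of ⟨g, x⟩ over K is the Euclidean norm of the s largest entries of |g|. A small sketch (NumPy assumed; the comparison only holds up to an absolute constant factor, which the printed numbers reflect):

```python
import numpy as np

rng = np.random.default_rng(3)
n, s, trials = 10_000, 10, 200

def sup_over_K_minus_K(g, s):
    # sup over s-sparse unit vectors x of <g, x> = l2 norm of the s largest |g_i|;
    # K is symmetric, so the sup over K - K is twice that.
    top = np.sort(np.abs(g))[-s:]
    return 2 * np.linalg.norm(top)

w = np.mean([sup_over_K_minus_K(rng.standard_normal(n), s) for _ in range(trials)])
print("Monte-Carlo w(K)^2:", w**2)   # same order of magnitude as s log n (constants differ)
print("s * log n         :", s * np.log(n))
```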

Information theory viewpoint. Remark. The mean width is related to other information-theoretic quantities, in particular to the metric entropy: c max_{t>0} t √(log N(K, t)) ≤ w(K) ≤ C ∫_0^∞ √(log N(K, t)) dt (Sudakov's and Dudley's inequalities).

Non-linear sampling. Claim. One can recover a signal x ∈ K from non-linear measurements y = Φ(x), even without fully knowing the nature of the non-linearity Φ.

Non-linear sampling. Non-linear measurements are given as y = θ(Ax). Here A is a known m × n random matrix, and θ : R → R is a link function (possibly unknown). θ is applied to each coordinate of Ax, i.e. the measurements are y_i = θ(⟨a_i, x⟩), i = 1, ..., m.

Non-linear sampling. Examples. 1. Quantization. Single-bit compressed sensing: θ(t) = sign(t) [Plan-Vershynin 11]. 2. Generalized Linear Models (GLM) in statistics: θ(t) is arbitrary, perhaps unknown; in particular, logistic regression [Plan-Vershynin 12].

Probabilistic Heuristics. How is reconstruction from non-linear measurements possible? Probabilistic heuristic 1 (linear). Let a ~ N(0, I_n). Then E ⟨a, x⟩ ⟨a, y⟩ = ⟨x, y⟩ for all x, y ∈ R^n. Both sides are linear in y, so E ⟨a, x⟩ a = x for all x ∈ R^n. By the Law of Large Numbers, (1/m) Σ_{i=1}^m ⟨a_i, x⟩ a_i ≈ x if m is large.

Probabilistic Heuristics. (1/m) Σ_{i=1}^m ⟨a_i, x⟩ a_i ≈ x if m is large. One can interpret this as: {a_i}_{i=1}^m is a frame in R^n (a Gaussian frame). One can reconstruct x from the linear measurements y = Ax, where y_i = ⟨a_i, x⟩, i = 1, ..., m. Question. How large does m need to be? Without any prior knowledge about x, m ≳ n suffices. If x ∈ K, then m ≳ w(K)^2, the effective dimension of K. Reconstruction: the metric projection of (1/m) Σ_{i=1}^m ⟨a_i, x⟩ a_i = (1/m) A^T y onto K.
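
A minimal numerical sketch of this reconstruction (NumPy assumed; K is taken to be the l_1 ball of radius ‖x‖_1, and the metric projection onto an l_1 ball is computed by the standard sort-based formula):

```python
import numpy as np

def project_l1_ball(v, radius):
    """Euclidean (metric) projection of v onto {x : ||x||_1 <= radius}."""
    if np.sum(np.abs(v)) <= radius:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]          # |v| sorted in decreasing order
    css = np.cumsum(u)
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u * k > css - radius)[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

rng = np.random.default_rng(4)
n, m, s = 2000, 400, 5
x = np.zeros(n)
x[rng.choice(n, s, replace=False)] = 1 / np.sqrt(s)     # s-sparse, unit norm

A = rng.standard_normal((m, n))
y = A @ x
frame_avg = A.T @ y / m          # (1/m) * sum_i <a_i, x> a_i; close to x only when m ~ n
x_hat = project_l1_ball(frame_avg, np.linalg.norm(x, 1))  # metric projection onto K

print("error without projection:", np.linalg.norm(frame_avg - x))
print("error with projection   :", np.linalg.norm(x_hat - x))
```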

Probabilistic Heuristics. Probabilistic heuristic 2 (non-linear). Let a ~ N(0, I_n). Then E θ(⟨a, x⟩) ⟨a, y⟩ = λ ⟨x, y⟩ for all unit x, y ∈ R^n, where λ := E θ(g) g, g ~ N(0, 1) (normalize θ so that λ = 1). This is still linear in y, so E θ(⟨a, x⟩) a = x for all unit x ∈ R^n. By the Law of Large Numbers, (1/m) Σ_{i=1}^m θ(⟨a_i, x⟩) a_i ≈ x if m is large.

Probabilistic Heuristics. (1/m) Σ_{i=1}^m θ(⟨a_i, x⟩) a_i ≈ x if m is large. One can interpret this as: {a_i}_{i=1}^m is a non-linear frame in R^n. One can reconstruct x from the non-linear measurements y = θ(Ax), where y_i = θ(⟨a_i, x⟩), i = 1, ..., m. For the reconstruction, one does not need to know θ (at least in principle)!
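
A numerical sketch for θ = sign, kept without the λ = 1 normalization, so the average converges to λ x with λ = E|g| = √(2/π) (NumPy assumed; dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
n, m = 500, 20_000

x = rng.standard_normal(n)
x /= np.linalg.norm(x)                 # unit signal
A = rng.standard_normal((m, n))
y = np.sign(A @ x)                     # non-linear (1-bit) measurements

v = A.T @ y / m                        # (1/m) * sum_i theta(<a_i, x>) a_i
lam = np.sqrt(2 / np.pi)               # lambda = E[sign(g) g] = E|g| for g ~ N(0, 1)

print("correlation <v/||v||, x>:", v @ x / np.linalg.norm(v))
print("||v|| vs lambda         :", np.linalg.norm(v), lam)
```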

Single-bit measurements. Let us work out the particular example of single-bit measurements, y_i = sign(⟨a_i, x⟩), i = 1, ..., m. Geometric interpretation: y is the vector of orientations of x with respect to m random hyperplanes (with normals a_i). In stochastic geometry: a random hyperplane tessellation.

Single-bit measurements. We know y = sign(Ax), i.e. y_i = sign(⟨a_i, x⟩), so we know the cell of the tessellation that contains x. If diam(cell) ≤ ε for every cell, then we can recover x with error ε. Recovery would be done by the convex program: find x' ∈ K such that sign(Ax') = y. It remains to find m so that all cells have diameter ≤ ε.

Question (Pizza cutting). Given a set K ⊂ R^n, how many random hyperplanes does it take to cut K into pieces of diameter ≤ ε? Non-trivial even for K = S^{n−1}.

Theorem (Pizza cutting) [Plan-V 12]. Consider a set K ⊂ S^{n−1} and m random hyperplanes. Then, with high probability, diam(cell) ≤ [C w(K)/√m]^{1/3} for every cell. This is similar to the M* estimate for diam(K ∩ random subspace), except for the exponent 1/3. Optimal exponent = ? Corollary (Single-bit recovery). One can accurately and effectively recover any signal x ∈ K from m ≳ w(K)^2 random measurements y = sign(Ax) ∈ {−1, 1}^m.

General non-linear measurements. Similar recovery results hold for a general link function θ(·), y = θ(Ax). Recovery of x can be done by the convex program: max ⟨y, Ax'⟩ subject to x' ∈ K, i.e. find x' ∈ K most correlated with the measurements. Remark. The solver does not need to know the nature of the non-linearity θ; it may be unknown or unspecified. [Plan-V 12]
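
A minimal sketch of this program for single-bit measurements (the cvxpy package is assumed; K is taken to be {‖x'‖_2 ≤ 1, ‖x'‖_1 ≤ √s}, a standard convex relaxation of the s-sparse unit vectors, and the dimensions are illustrative). Since 1-bit measurements only determine the direction of x, the output is compared with x after normalization:

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(6)
n, m, s = 500, 2000, 5

x = np.zeros(n)
x[rng.choice(n, s, replace=False)] = rng.standard_normal(s)
x /= np.linalg.norm(x)                       # unit s-sparse signal
A = rng.standard_normal((m, n))
y = np.sign(A @ x)                           # 1-bit measurements; theta is not given to the solver

# max <y, A x'>  subject to  x' in K = {||x'||_2 <= 1, ||x'||_1 <= sqrt(s)}
xp = cp.Variable(n)
prob = cp.Problem(cp.Maximize((A.T @ y) @ xp),
                  [cp.norm(xp, 2) <= 1, cp.norm1(xp) <= np.sqrt(s)])
prob.solve()

x_hat = xp.value / np.linalg.norm(xp.value)
print("correlation <x_hat, x>:", x_hat @ x)
```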

Summary: geometric view of signal recovery. Recover a signal x ∈ K from m random measurements/samples y = Ax (linear) or y = θ(Ax) (non-linear). Convex sets look like this: K ≈ bulk + outliers. The size of the bulk determines the accuracy of recovery. Accurate and efficient recovery is possible if m ≳ w(K)^2 = the effective dimension of K. Here w(K) is the mean width of K, a computable quantity.

Further reading:
An elementary introduction to modern convex geometry [Ball 97]
Lectures in geometric functional analysis [Vershynin 11]
Notes on isotropic convex bodies [Brazitikos-Giannopoulos-Valettas-Vritsiou 13+]
Textbook in geometric functional analysis [Artstein-Giannopoulos-Milman-Vritsiou 13+]
Further reading on non-linear recovery: single-bit matrix completion [Davenport-Plan-van den Berg-Wootters 12]
Thank you!