A Structured Construction of Optimal Measurement Matrix for Noiseless Compressed Sensing via Polarization of Analog Transmission

Linbo Li and Inyup Kang

Manuscript revised April 28, 2014. Linbo Li is with the Mobile Solutions Lab, Samsung Modem Team, 4921 Directors Place, San Diego, CA 92121, USA (phone: 858-320-1311; fax: 858-320-1379; e-mail: linboli@samsung.com). Inyup Kang is with the Mobile Solutions Lab, Samsung Modem Team, 4921 Directors Place, San Diego, CA 92121, USA (e-mail: inyupkang@samsung.com).

Abstract: In this paper, by establishing a connection between noiseless compressed sensing (CS) and polar coding, we propose a method for the structured construction of the optimal measurement matrix for sparse recovery in noiseless CS. We show that, for sufficiently large block size, the number of measurements only needs to be as large as the sparsity of the signal to be recovered in order to guarantee almost error-free recovery. To arrive at this result, we employ a duality between noiseless sparse recovery and analog coding across a sparse noisy channel. Extending Rényi's information dimension to a mutual information dimension, we derive the fundamental limit of asymptotically error-free analog transmission across a sparse noisy channel under certain conditions on the underlying channel and a linear analog encoding constraint, and we equivalently obtain the minimum number of measurements that guarantees exact sparse recovery in noiseless CS. By means of channel polarization of the sparse noisy channel, a structured construction scheme is proposed for the linear measurement matrix which achieves the minimum measurement requirement for noiseless CS.

Index Terms: Analog coding, channel polarization, compressed sensing, almost error-free sparse recovery, minimum measurement requirement, mutual information dimension, Rényi information dimension, sparse noisy channel, structured optimal measurement matrix.

I. INTRODUCTION

The era of big data is enabled by the rapid growth of data acquisition from various sources, such as different kinds of sensor networks and data sampling and collecting devices. A problem of central interest is to reduce the cost of data sampling and to improve the accuracy and efficiency of data recovery. The question of accurate data recovery was answered by the Shannon-Nyquist sampling theorem [1], which states that a signal can be completely recovered from its uniformly sampled values spaced 1/(2W) apart, if the bandwidth of the signal is confined to within W. For many practical situations, however, the Nyquist sampling rate is too high, considering the cost of the sampling devices/sensors or the hardware limitations on the sampling rate. To overcome this problem, compressed sensing (CS) has attracted much research interest with the prospect of drastically reducing the sampling rate while maintaining high signal recovery accuracy [3]-[8]. In CS, a sparse signal can be sampled and reliably recovered with a sampling rate proportional to the underlying content or sparsity of the signal, rather than the signal's bandwidth.

The sparsity of an M-dimensional vector x is defined as the number of its non-zero elements, or in other words the cardinality of the support of x, ||x||_0 = |supp(x)|, where supp(x) = {i : x_i ≠ 0}. The vector x is sparse if ||x||_0 << M. In CS, the sparse signal x is sampled by taking P linear measurements of x. Let Φ be the P × M measurement matrix; the measurement output y ∈ R^P is

y = Φ x.   (1)
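As a concrete illustration of the measurement model in (1), the sketch below generates a K-sparse vector and takes P noiseless linear measurements of it. The variable names (x, Phi, y) and the Gaussian choice of measurement matrix are illustrative assumptions for this example only, not part of the construction proposed later in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

M, P, K = 256, 64, 10          # signal length, number of measurements, sparsity

# K-sparse signal x in R^M: K randomly placed i.i.d. standard Gaussian entries
x = np.zeros(M)
support = rng.choice(M, size=K, replace=False)
x[support] = rng.standard_normal(K)

# a generic P x M measurement matrix (Gaussian here, purely for illustration)
Phi = rng.standard_normal((P, M)) / np.sqrt(P)

# noiseless linear measurements, y = Phi x  (equation (1))
y = Phi @ x

print("sparsity ||x||_0 =", np.count_nonzero(x), " measurements P =", y.shape[0])
```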

As in Section II, we are also interested in the following model of analog encoding and transmission across a noisy channel, which is the dual problem of the CS problem in (1):

z = A s + e.   (2)

In (2), A is an M × N full-rank analog code generator matrix, s ∈ R^N is the information vector, e ∈ R^M is the sparse noise vector, and z ∈ R^M is the channel output.

The main question to be answered by CS is: What is the minimum number of measurements necessary for reliable recovery of the original sparse signal, or, equivalently, what is the minimum number of linear equations needed to solve the vastly underdetermined linear system specified by (1)? It is shown in [9, 10] that the signal x is the unique solution to the following L0-minimization problem

minimize ||x||_0 subject to Φ x = y,   (3)

provided that the sparsity of x satisfies

||x||_0 < spark(Φ) / 2,   (4)

where spark(Φ) is the spark of the measurement matrix, i.e., the minimum number of its columns that are linearly dependent [16, 17]. As a result, for a full-rank Φ, the maximum recoverable signal sparsity is

||x||_0 ≤ P / 2.   (5)

In other words, 2K measurements are necessary to perfectly recover a signal with sparsity K.

To reduce the prohibitive complexity of L0-minimization, which is shown to be NP-hard [11], an L1 convex relaxation is used, which goes by the name of basis pursuit [18]. According to [9, Theorem 1.3], if a certain RIP (restricted isometry property) condition is satisfied, noiseless CS can be solved exactly by the L1 convex optimization (which can be recast as a linear program), and the relaxation is actually equivalent to the original L0-minimization in that they admit the same unique solution. We note that the result in [9] is deterministic and guarantees exact recovery of all sparse signals at finite block length under suitable conditions. In contrast, our paper tackles noiseless CS from a different perspective. We take an information-theoretic approach, and our results are derived in the large block-length regime. Our main interest is to find the information-theoretic limit, or capacity, of sparse signal recovery. In our approach the aim is not to perfectly recover all sparse signals; rather, the aim is to recover almost all sparse signals error-free, with the recovery error probability vanishing as the block length goes to infinity. This is analogous to the standard setting used in Shannon's channel coding theorem.

In [14, 15] it is shown that the equivalence between L0 and L1 holds with overwhelming probability for certain classes of random matrices. In addition, [9] showed that the Gaussian ensemble, a class of random matrices with i.i.d. Gaussian entries, satisfies the RIP condition with overwhelming probability. The matrix multiplication by Φ in (1) can be thought of as a form of encoding or compressing the source vector. Analogous to classical channel coding, a random encoding matrix suffers from high encoding complexity, whereas a structured encoding matrix is of great practical interest since it lends itself to lower implementation complexity.

In this paper, we propose a structured construction of the optimal measurement matrix for noiseless CS, under which asymptotically error-free recovery is attained for large block size. We will show that the number of linear measurements only needs to be as large as the sparsity of the signal itself to guarantee asymptotically error-free recovery. This result reduces the number of measurements in (5) by half, and it is the minimum number of measurements necessary for a given signal sparsity (or the maximum recoverable sparsity for a given number of linear measurements).

Our line of reasoning and the organization of the paper are as follows.

In Section II, a duality is utilized to transform the measurement requirement of noiseless CS into the transmission rate of an analog sparse noisy channel. Then, in Section III, in order to obtain the fundamental limit of analog transmission across a sparse noisy channel, following Rényi's original definition of information dimension for real-valued random variables and vectors [13], we extend Rényi information dimension to the cases of conditional entropy and mutual information, and term these quantities Conditional Information Dimension (CID) and Mutual Information Dimension (MID), respectively. We note that [2] showed that Rényi information dimension is the fundamental limit of almost lossless analog source compression. The authors of [19] further showed that, under a linearity constraint on the analog compressor, the optimal phase-transition threshold of the measurement rate for the reconstruction error probability can be characterized by the Rényi information dimension and the MMSE dimension. In particular, for discrete-continuous mixture inputs, which are most relevant in CS practice, [19] showed that the optimal phase transition of the measurement rate is given by the weight of the continuous part, regardless of the distributions of the discrete and continuous components. In this paper, using the duality of Section II, we focus our attention on the SANC (sparse additive noisy channel), which will be analyzed in our analog polar construction. We will show that the operational meaning of MID is the fundamental limit of asymptotically error-free analog transmission under certain conditions on the channel transition distribution and a linear encoder constraint. The performance limit is expressed as the asymptotic ratio of the dimensionality of the codebook to the codeword length M as M → ∞, where the codebook is an N-dimensional subspace of R^M; this ratio is termed the Achievable Dimension Rate in this paper. From the performance limit of the analog sparse noisy channel, the minimum number of measurements that guarantees asymptotically error-free sparse recovery in noiseless CS can be equivalently obtained via the duality of Section II. In Section IV we discuss the channel polarization effect for the sparse noisy channel of interest; the corresponding analog polar decoding procedure is also described there. In Section V we provide an explicit structured construction of the linear measurement matrix that achieves the performance limit derived in the previous sections, using the machinery of channel polarization. Numerical results are provided in Section VI to demonstrate the performance of the polar analog coding matrix. Section VII contains our concluding remarks.

Existing works in the literature related to our results are discussed here. Inspired by the application of second-order Reed-Muller codes in CS, [20] proposes to use a polar code as a deterministic measurement matrix in CS and compares the numerical performance of polar and Gaussian measurement matrices using the practical BPDN (basis pursuit denoising) algorithm. However, [20] does not discuss the fundamental limit of CS or the optimality of the polar construction in terms of the information dimension limit. In [22], results on universal source polar coding are applied to compressively sensing and recovering a sparse signal valued on a finite field. [21] proposes a CS sensing scheme using deterministic partial Hadamard matrices (selecting a subset of rows of the original Hadamard matrix) and proves that it is lossless and entropy preserving for discretely distributed random vectors with a given distribution. Expander graphs are used in [23] to construct random sensing matrices with RIP that require a reduced degree of randomness. In [24], different deterministic sensing matrices (including ones formed from discrete chirps, Delsarte-Goethals codes, and extended BCH codes) are shown to achieve Uniqueness-guaranteed Statistical RIP (UStRIP). The spatial coupling idea from coding theory is explored in [25] and [26], where a spatially coupled sensing matrix is shown to achieve the information dimension limit; however, the sensing matrix used in [25] and [26] is still randomly generated from an ensemble.

We note that our derivation is information-theoretic in nature; thus all results hold in the asymptotic regime of large block length. For the sake of simplicity, we consider real-valued signal and noise in this paper. It is straightforward to generalize all results to the field of complex numbers.

II. DUALITY BETWEEN NOISELESS COMPRESSED SENSING AND ANALOG TRANSMISSION ACROSS SPARSE NOISY CHANNEL

As in [9], let us consider the following model of analog encoding and transmission across a noisy channel,

z = A s + e,   (6)

where s ∈ R^N is the signal vector, A is an M × N full-rank analog code generator matrix, e ∈ R^M is the noise vector, and z ∈ R^M is the channel output.

We will utilize the duality between noiseless CS (1) and analog coding and transmission (6), so that the minimum measurement requirement for CS is equivalently transformed into a problem of the rate limit of analog transmission subject to a linear encoding constraint. The precise definition of the analog transmission rate is deferred to Definition 5 in Section III. The minimum measurement requirement for CS is obtained as a result of the duality.

Denote by Φ the projection matrix onto the kernel of matrix A (the annihilator of A as considered in [9]); it satisfies Φ A = 0. Therefore

Φ z = Φ (A s + e) = Φ e.   (7)

The ratio of the lengths of the information signal and the transmitted signal in (6) is N/M. For a sparse noise vector e, it can be seen that analog coding subject to the sparse additive noisy channel (SANC) in (6) can be converted to the noiseless CS problem in (7). From (6), it is evident that if e can be exactly recovered by CS, the information signal s can be recovered error-free by subtracting e from the channel output in (6).

Conversely, consider a noiseless CS problem y = Φ x, where x ∈ R^M is a sparse vector and Φ is the measurement matrix. First, we find any vector z that satisfies Φ z = y. Let us note that z is not constrained to be sparse. As a result,

Φ (z − x) = 0.   (8)

From (8), z − x must belong to the subspace spanned by the columns of A, where A is the matrix whose columns span the null space of Φ. Therefore, there exists s such that z − x can be represented as A s. As a result,

z = A s + x.   (9)

Therefore, noiseless CS is transformed into linear analog coding across a sparse noisy channel, with the sparse signal x playing the role of the sparse channel noise.

We would like to point out that the choice of z in (9) does not need to depend on x; actually, for (9) any s ∈ R^N gives a vector satisfying Φ z = y. Therefore, in this paper we assume that when converting CS to analog coding, the solver of the underdetermined system Φ z = y produces a solution whose random realization is independent of x. This can be justified as follows. First, we note that the solution set of Φ z = y is given by {z0 + A v : v ∈ R^N}, where z0 is a particular solution; the solution set is independent of the choice of the particular solution, and x itself is also a particular solution. Then, for a particular solution z0 and a random, equally likely v ∈ R^N which is independent of x, a random member of the solution set can be generated as z = z0 + A v; it can also be written as z = A s + x for a certain s. It is readily seen that s is also equally likely on R^N and independent of x for any given x. After decoding the information signal s, the sparse vector can be perfectly recovered by x = z − A s.

As a result of the duality, maximizing the analog transmission rate (precise definition deferred to Section III) subject to the linear encoding constraint minimizes the measurement requirement of the corresponding noiseless CS problem. In Section III we will calculate the information-theoretic limit of linear analog encoding across the SANC, and we will design the corresponding linear encoding scheme in subsequent sections.
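The following sketch illustrates the CS-to-analog-coding conversion described above, in the notation of this section. The helper names and the use of a least-squares particular solution are illustrative assumptions; any solver of the underdetermined system would do.

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(1)

M, P = 128, 48                       # signal length and number of measurements
Phi = rng.standard_normal((P, M))    # a generic measurement matrix (illustrative)

# A spans the null space of Phi, so Phi @ A = 0 and N = M - P
A = null_space(Phi)
N = A.shape[1]

# sparse vector x and its noiseless measurements y = Phi x
x = np.zeros(M); x[rng.choice(M, 10, replace=False)] = rng.standard_normal(10)
y = Phi @ x

# conversion to analog coding: any particular solution z of Phi z = y can be
# written as z = A s + x, i.e., an analog codeword A s hit by sparse noise x
z0, *_ = np.linalg.lstsq(Phi, y, rcond=None)     # a particular solution
v = rng.standard_normal(N)                       # random component, independent of x
z = z0 + A @ v

# sanity checks: z is still a solution, and (z - x) lies in the column space of A
assert np.allclose(Phi @ z, y)
s = np.linalg.lstsq(A, z - x, rcond=None)[0]
assert np.allclose(A @ s, z - x)

# once s is decoded from z (Sections IV-V), x is recovered as x = z - A s
x_rec = z - A @ s
assert np.allclose(x_rec, x)
```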

Remark 1: It is useful to draw an analogy between analog coding of real-valued signals and linear coding over the binary field {0, 1}. For linear encoding of a binary information vector by a generator matrix, followed by transmission over a binary channel represented by an additive error vector e, the received vector is the codeword plus e. A linear encoding scheme which can correct up to a prescribed number of errors is readily found; the Reed-Solomon (RS) code is such an example. Let H be the parity check matrix, with dimensionality (M − N) × M, which satisfies H G = 0 for the generator matrix G; then H r = H e is the decoding syndrome. For H with full rank, its spark is M − N + 1. Therefore, from the discussion in Section I, e can be perfectly recovered if its support satisfies

||e||_0 < (M − N + 1) / 2,   (10)

and, in turn, the information vector can be reliably decoded. Finally, let us note that it is well known that the RS code does not achieve the channel capacity. As a result, the upper bound (10) on the sparsity of e cannot be optimal. Analogous to the L0 scheme and the upper bound (4) for CS, the RS code over a finite field is a finite block-length scheme and guarantees perfect recovery of all error vectors with sparsity up to the bound in (10). It has been shown that, with a better construction of the binary encoding matrix (the polar code is an example [12]), the decoding error can be made arbitrarily small and the channel capacity can be achieved at asymptotically large codeword length. In subsequent sections, we apply this analogy to the real field to derive the upper limit on the CS-recoverable sparsity at asymptotically large codeword length.

III. INFORMATION-THEORETIC LIMIT OF ANALOG TRANSMISSION ACROSS SPARSE NOISY CHANNEL

A. Definitions and Modeling

From the duality discussed in Section II, we are interested in the SANC (sparse additive noisy channel) as described in (6). We will find the information-theoretic limit of its transmission rate. A structured, rate-limit-achieving linear encoding scheme will also be proposed by utilizing the channel polarization effect. First of all, the following modeling and definitions are introduced to accurately define our framework.

To model the SANC, we introduce the following Gaussian SANC model,

Y = X + α G,   (11)

where G is a Gaussian noise component with zero mean and variance σ², and α has the following distribution:

α = 1 with probability γ,  α = 0 with probability 1 − γ,   (12)

with α independent of the noise G. For the channel in (11), it is easy to verify that the conditional distribution of Y given X = x is

p(y | x) = (1 − γ) δ(y − x) + γ N(y; x, σ²),   (13)

where N(·; μ, σ²) denotes the Gaussian distribution with mean μ and variance σ². The question we need to answer is: What is the maximum rate at which real-valued signals can be reliably transmitted across the channel in (11)?
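A minimal simulation of the Gaussian SANC model in (11)-(12) is sketched below. The parameter names (gamma for the noise sparsity, sigma for the noise standard deviation) follow the notation above, and the function name is an illustrative assumption.

```python
import numpy as np

def sanc(x, gamma=0.2, sigma=1.0, rng=np.random.default_rng()):
    """Pass a real-valued vector x through the sparse additive noisy channel:
    each entry is hit by Gaussian noise with probability gamma (eqs. (11)-(12))."""
    alpha = rng.random(x.shape) < gamma          # Bernoulli(gamma) noise indicators
    noise = rng.normal(0.0, sigma, size=x.shape)
    return x + alpha * noise

x = np.random.default_rng(2).standard_normal(10_000)
y = sanc(x, gamma=0.2)
print("fraction of corrupted entries:", np.mean(y != x))   # close to gamma
```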

In order to answer this question, we need to accurately define the quantities of interest. To this end, some new definitions are introduced first. We extend Rényi's definition of information dimension to the cases of conditional entropy and mutual information, and term these quantities conditional information dimension (CID) and mutual information dimension (MID), respectively. Let us first recall the definition of the original Rényi information dimension.

Definition 1 (Rényi Information Dimension [13]): Let X be a real-valued random variable. For an integer m, define the quantized random variable [X]_m = ⌊mX⌋/m. The Rényi information dimension of X is

d(X) = lim_{m→∞} H([X]_m) / log m,

if the limit exists.

Alternatively, the Rényi information dimension has the following equivalent definition.

Definition 2 (Equivalent Definition of Rényi Information Dimension [2]): Consider the mesh cube of size 1/m on R^k indexed by a k-dimensional integer vector i. Across all possible values of i, the entire space R^k is divided into mesh cubes of size 1/m. For an arbitrary k-dimensional real-valued random vector X with probability measure μ, its Rényi information dimension can be equivalently defined as

d(X) = lim_{m→∞} H(μ_m) / log m,

where μ_m is the discrete probability measure obtained by setting μ_m(i) equal to the μ-measure of the i-th mesh cube.

Now we extend the Rényi information dimension to the cases of conditional entropy and mutual information; we adopt the mesh-cube quantization form in our definitions.

Definition 3 (Conditional Information Dimension (CID)): For random variables X and Y, we define the following conditional information dimension, if the respective limit exists:

d(X | Y) = lim_{m→∞} H([X]_m | [Y]_m) / log m,

where [X]_m and [Y]_m are the quantized random variables of X and Y obtained from the mesh cubes of size 1/m.

Definition 4 (Mutual Information Dimension (MID)): For random variables X and Y, the mutual information dimension of X and Y is defined as the following, if the respective limit exists:

d(X; Y) = lim_{m→∞} I([X]_m; [Y]_m) / log m.

B. Operational Meaning of Mutual Information Dimension

In this section, we establish the following operational meaning of the MID defined in Definition 4: MID is the fundamental limit of asymptotically error-free analog transmission (rigorous definition provided in Definition 5) under certain conditions on the channel transition distribution and under the regularity constraint of a linear analog encoder. The performance limit is expressed as the asymptotic ratio of the dimensionality of the codebook to the codeword length as the codeword length goes to infinity, where the codebook is an N-dimensional subspace of R^M using which asymptotically error-free analog transmission is achieved. This ratio is termed the Achievable Dimension Rate in this paper.
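To make Definition 1 concrete, the following sketch numerically estimates the information dimension of a discrete-continuous mixture (a point mass at 0 with weight 1 − γ and a Gaussian with weight γ, as in the sparse models above) by computing H([X]_m)/log m for increasing m from samples. The estimator and the sample sizes are illustrative assumptions; the empirical-entropy ratio is only a rough finite-sample proxy for the limit, which equals the continuous weight γ [2], [13], and it converges slowly as m grows.

```python
import numpy as np

def empirical_entropy_bits(samples):
    """Plug-in entropy estimate (in bits) from empirical frequencies."""
    _, counts = np.unique(samples, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

rng = np.random.default_rng(3)
gamma, n = 0.3, 200_000

# discrete-continuous mixture: X = 0 w.p. 1-gamma, X ~ N(0,1) w.p. gamma
mask = rng.random(n) < gamma
x = np.where(mask, rng.standard_normal(n), 0.0)

for m in (2**4, 2**8, 2**12):
    quantized = np.floor(m * x)               # [X]_m up to the common 1/m scale factor
    d_hat = empirical_entropy_bits(quantized) / np.log2(m)
    print(f"m = {m:6d}:  H([X]_m)/log m ~= {d_hat:.3f}   (limit d(X) = gamma = {gamma})")
```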

In what follows, we make the statement above precise and prove it for an important class of channel transition distributions, namely those that follow a discrete-continuous mixed distribution and satisfy the Dual-Discrete Condition (see Definition 6). A special case of this class is the channel model in (13), which is of central interest in our modeling of the sparse signal for CS and of the sparse noise vector for analog transmission.

We impose the linear encoding constraint on the transmitter for two reasons. First, with the codebook taking a continuum of entries, it would be impossible to discuss the fundamental limit of real-valued data transmission meaningfully without imposing some regularity constraint on the encoder. For example, it is well known that there is a one-to-one mapping (although a highly discontinuous and irregular one) from R to R^n; therefore, without any constraint on the encoding scheme, a single real number could be used to encode a whole vector of real numbers, which would result in infinite transmission capacity. Second, the linear encoding scheme appears naturally from our discussion in Section II on the duality between noiseless CS and analog data transmission.

Definition 5 (Achievable Dimension Rate of Asymptotically Error-Free Analog Transmission): Let P_{Y|X} be the channel transition distribution of a real-valued analog transmission channel.

Codebook: For codewords of length M, given the linear encoding constraint, the codebook is an N-dimensional (N ≤ M) subspace of R^M, denoted B ⊆ R^M.

Data transmission: For a codeword c ∈ B, the M-dimensional channel output y is generated by repeated and independent applications of P_{Y|X} componentwise to the entries of c.

Decoder: The decoder is a function that generates an estimate ĉ from y.

Decoder error probability: P_e = Pr{ĉ ≠ c}.

Achievable Dimension Rate: A dimension rate R is said to be achievable if there exists a sequence of N-dimensional codes with N/M → R such that the error probability P_e → 0 as the codeword length M → ∞.

Definition 6 (Dual-Discrete Condition): Consider a channel transition distribution with the discrete-continuous mixed form

P_{Y|X} = (1 − γ) D_{Y|X} + γ C_{Y|X},   (14)

where D_{Y|X} is the discrete part (with a countable number of atoms) and C_{Y|X} is the absolutely continuous (AC) part; for the SANC in (13), γ is the noise sparsity. For the discrete part, denote the atoms of D_{Y|X=x} for a given x by the countable set {f_i(x)}. Across all x, the functions f_i define a countable collection of 1-dimensional curves in R²; we require all such curves to be AC. The dual-discrete condition is satisfied if, in addition, for a given y the set of inputs x for which y is an atom of D_{Y|X=x} is also countable, namely the set {x : y = f_i(x) for some i} is countable. We denote its elements by {g_j(y)} and also require all the curves defined by the g_j to be AC.

Remark 2: The dual-discrete condition excludes the case in which, for a given output value y, the set of inputs for which y is an atom of the discrete part is uncountable (for example, when that set is all of R).

Regarding the achievable dimension rate, we have the following theorem.

Theorem 1: Assume the channel transition distribution satisfies the dual-discrete condition in Definition 6, and assume the channel input is AC. Subject to the linear encoding constraint, a dimension rate R is achievable, as M → ∞ with N/M → R, if R < d(X; Y). Conversely, subject to the linear encoding constraint, any sequence of N-dimensional codes with vanishing error probability must have a dimension rate satisfying R ≤ d(X; Y).

Proof: The achievability part is proved in Appendix I. The converse part can be proved by following the same argument as in [2], utilizing Lemmas 9 and 10 therein, and is therefore not repeated.

Remark 3: The value of d(X; Y) for the discrete-continuous channel will be computed in Proposition 1 in Section III-C. Also, it is easy to verify that our model of the sparse noisy channel in (13) satisfies the dual-discrete condition.

From the proof of Theorem 1, for sufficiently large M there exists at least one realization of the codebook such that the dimension rate is achievable with arbitrarily small decoding error probability. The existence of a sequence of good codebooks for analog transmission prompts us to find a structured construction of them for practical purposes. Based on Theorem 1 and the duality between noiseless CS and analog transmission, for sufficiently large M there exists a sequence of measurement matrices such that the corresponding number of measurements is achievable with arbitrarily small recovery error probability.

C. MID Calculation for the Discrete-Continuous Channel Model in (14)

Proposition 1: Assume the channel input X is AC. The MID of the discrete-continuous channel model in (14) with the dual-discrete condition is given by

d(X; Y) = 1 − γ.   (15)

Proof: In Appendix II.

With Theorem 1 and Proposition 1, we have shown that the MID is the fundamental performance limit of asymptotically error-free analog transmission for the channel model in (14), which includes our sparse noisy channel model (13).

Remark 4: We provide an interesting interpretation of (15). It can also be obtained by adding a vanishing Gaussian noise term to both (11) and the noise-free channel, and evaluating their respective rates of growth of the mutual information as the variance of the added noise goes to zero. Specifically, we add a Gaussian noise term V, independent of everything else and with variance σ_v², to both (11) and the noise-free channel:

Y = X + α G + V,   (16)
Y = X + V.   (17)

In the regime of vanishing σ_v², the mutual information of the channel (17) behaves as

I(X; Y) = (1/2) log(1/σ_v²) + O(1).   (18)

For the channel in (16), the mutual information is I(X; Y) = h(Y) − h(Y | X), where h(·) denotes the differential entropy. Since h(Y) is finite, we only need to calculate the second term.

Given the conditional probability distribution of the channel (16),

p(y | x) = (1 − γ) N(y; x, σ_v²) + γ N(y; x, σ² + σ_v²),   (19)

it is readily shown that the mutual information behaves as

I(X; Y) = (1 − γ) (1/2) log(1/σ_v²) + O(1).   (20)

As a result, for the two channels in (16) and (17), the relative ratio of the rates of growth of the mutual information tends to 1 − γ as σ_v² → 0, which is the same as the MID in (15). Therefore, in this asymptotic domain of vanishing noise, the capacity of transmitting discrete bits is related to the capacity of transmitting real-valued signals through its asymptotic rate of growth.

At the end of this section, let us compare our result (15) with the rate achieved by prior work. From the duality in Section II, let us transform the problem of noisy analog transmission with linear encoding into a CS problem. From the prior results on CS (see (5) and [9, 10]), if the support of the error vector satisfies

||e||_0 ≤ (M − N) / 2,   (21)

error-free recovery of e is guaranteed. The rate of the analog coding is N/M and the normalized sparsity of the error vector is γ = ||e||_0 / M. As a result, for the original noisy channel, the supported rate of reliable analog transmission is

N/M = 1 − 2γ.   (22)

It is interesting to compare (22) with the MID result in (15): the rate in (22) is strictly less than (15) for γ > 0. The reduction in achieved rate arises mainly because the rate in (22) is guaranteed for finite block length and for all codewords, whereas the MID in (15) is achieved asymptotically, with vanishing error probability as the block length goes to infinity.

IV. POLARIZATION OF SPARSE NOISY CHANNEL

A. SANC and SAEC

From our discussion in Section III, there exists a sequence of codebooks, realized as random subspaces, that asymptotically achieves the dimension rate 1 − γ for the channel model in (11). A natural question to ask is: What is a structured way of linear encoding, instead of a random realization, that achieves the dimension rate of Theorem 1? In this section we answer this question with the tool of channel polarization.

First, let us note that according to (15), the MID of the channel in (11) does not depend on the distribution of the noise component, namely its variance σ² in the Gaussian case we have assumed. It depends only on the probability of a noisy realization of the channel itself (the parameter γ). We recognize this as an erasing effect: the input real-valued signal is erased by a noisy realization of the channel and cannot possibly be recovered without distortion. Therefore, what really matters is γ, which indicates the probability of a noisy channel realization, and not the distribution of the noise component. Based on this observation, the channel in (11) is equivalent, in terms of MID, to the following sparse analog erasure channel (SAEC). For the SAEC with input X, the output is given by

Y = X with probability 1 − γ,  Y = ? (erasure) with probability γ.   (23)

Here we can regard the parameter γ as the erasure probability (also the noise sparsity), analogous to the erasure probability of the binary erasure channel (BEC).

Remark 5: We will use the SAEC channel model (23) in the derivations of this section, since it lends itself to simpler analysis. However, we note that the same MID results hold for the SANC in (11) as long as the two channels have the same noise sparsity.

B. Channel Polarization and Duality between SAEC and BEC

In this section, we derive a duality between the sparse analog erasure channel (SAEC) and the binary erasure channel (BEC) in terms of channel polarization with respect to the MID and the Shannon capacity, respectively. Reference [12] proved that for the BEC, through recursive application of the following one-step transformation on the channel inputs u1 and u2,

(u1, u2) → (u1 ⊕ u2, u2),   (24)

channel polarization is achieved in the limit of large input size. Moreover, the one-step transformation preserves the combined channel capacity of the two new bit channels. If we denote the erasure probability of the original BEC by ε, the erasure probabilities of the new channel pair after the one-step transformation are given by [12] as 2ε − ε² and ε². Therefore, after the one-step transformation, one of the new bit channels becomes better and the other one becomes worse.

For the SAEC on real-valued signals, we will show that the following one-step transformation (25) on independent inputs x1 and x2 produces the same polarization effect in terms of MID:

z1 = β (x1 + x2),  z2 = β x2.   (25)

Any nonzero scaling β may be chosen. After the transformation in (25), z1 and z2 are sent through two independent realizations of the SAEC and the corresponding outputs are denoted by y1 and y2, as illustrated in Fig. 1.

Fig. 1. One-step transformation.

The above linear transformation creates a pair of new channels for the original inputs x1 and x2. For the SAEC, Table I lists the possible channel realizations with the corresponding probabilities.

TABLE I
CHANNEL REALIZATIONS AND THE CORRESPONDING PROBABILITIES

  SAEC realization                  Probability
  y1 = z1, y2 = z2                  (1 − γ)²
  y1 = z1, y2 erased                (1 − γ) γ
  y1 erased, y2 = z2                γ (1 − γ)
  y1 erased, y2 erased              γ²

In what follows, we evaluate the MIDs of the new channel pair seen through y1 and y2.

Proposition 2: After the single-step transformation (25),

d(x1; y1, y2) = (1 − γ)²,   (26)
d(x2; y1, y2, x1) = 1 − γ².   (27)

Proof: In Appendix III.

From (26) and (27), it is easy to see that d(x1; y1, y2) ≤ 1 − γ ≤ d(x2; y1, y2, x1). Also, we notice that

d(x1; y1, y2) + d(x2; y1, y2, x1) = 2 (1 − γ),   (28)

which is exactly the combined MID of the two original SAECs. Thus, the one-step linear transformation preserves the combined MID while improving the new channel for x2 and degrading the new channel for x1. Table II compares our results for the SAEC on the real field and the results in [12] for the BEC on the binary field.

TABLE II
COMPARISON OF SAEC AND BEC BEFORE AND AFTER THE ONE-STEP TRANSFORMATION

  SAEC                                          BEC
  Noise sparsity: γ                             Erasure probability: ε
  MID: 1 − γ                                    Shannon capacity: 1 − ε
  One-step transformation preserves             One-step transformation preserves
  the combined MID                              the combined Shannon capacity
  After the one-step transformation, the        After the one-step transformation, the
  new channel pair's MIDs are                   new channel pair's Shannon capacities are
  (1 − γ)² and 1 − γ²                           (1 − ε)² and 1 − ε²

From Table II, we can see that the properties of the one-step transformation (25) of the SAEC are equivalent to those of the one-step transformation (24) of the BEC. The noise sparsity of the SAEC serves as the counterpart of the erasure probability of the BEC.
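The conservation of the combined MID can be checked directly from the pair values stated in (26) and (27); the following one-line calculation, in the γ notation of this section, verifies it.

```latex
(1-\gamma)^2 + \bigl(1-\gamma^2\bigr)
  = \bigl(1 - 2\gamma + \gamma^2\bigr) + \bigl(1 - \gamma^2\bigr)
  = 2\,(1-\gamma)
```

This equals twice the MID 1 − γ of a single SAEC use, so the total is preserved while being redistributed unevenly between the two new channels.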

Then, as in [12], we recursively apply the one-step transformation. For the (n + 1)-th step, the construction is illustrated in Fig. 2.

Fig. 2. The (n + 1)-th step of iterative application of the one-step transformation: the inputs of each new pair are combined as in (25) and fed into two independent copies of the channel block W(2^n) formed in the n-th step.

In Fig. 2, W(2^n) denotes the channel block formed in the n-th step. Let us define d_n^(i) as the MID of the i-th effective sub-channel after the n-th step, with d_0^(1) = 1 − γ. Following the induction in [12], the same iterative relationship as for the BEC in [12] holds.

Proposition 3: After the (n + 1)-th step,

d_{n+1}^(2i−1) = (d_n^(i))²,  d_{n+1}^(2i) = 2 d_n^(i) − (d_n^(i))².

Proof: In Appendix IV.

Therefore, following the argument in [12], the channel polarization effect takes place for large n, in the sense that the new effective channels polarize into a class of good channels which are asymptotically (as n → ∞) noise free, and a class of bad channels which are asymptotically completely noisy. The proportion of good channels is equal to 1 − γ.

Fig. 3 and Fig. 4 show the channel polarizing effect for a fixed block length and noise sparsity. Fig. 3 shows the MID of the sub-channels in the natural ordering of the sub-channel index, whereas Fig. 4 shows the MIDs sorted in decreasing order. As n → ∞, all MIDs converge to either 0 (completely noisy) or 1 (noise free), except for a subset of sub-channels with vanishing proportion.
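The per-sub-channel MIDs plotted in Fig. 3 and Fig. 4 can be computed with the recursion of Proposition 3. The sketch below does so for an illustrative block length and noise sparsity; both values, and the particular ordering of the sub-channels, are assumptions chosen only for this example.

```python
import numpy as np

def subchannel_mids(n, gamma):
    """MIDs of the 2**n polarized SAEC sub-channels (one fixed ordering),
    using the recursion d -> d**2 and d -> 2d - d**2 of Proposition 3."""
    d = np.array([1.0 - gamma])
    for _ in range(n):
        worse = d ** 2                 # degraded member of each new pair
        better = 2.0 * d - d ** 2      # upgraded member of each new pair
        d = np.empty(2 * d.size)
        d[0::2] = worse
        d[1::2] = better
    return d

gamma, n = 0.3, 14                     # illustrative sparsity and 2**14 sub-channels
d = subchannel_mids(n, gamma)
good = d > 0.99                        # nearly noise-free sub-channels
print("fraction of good sub-channels:", good.mean(), " (1 - gamma =", 1 - gamma, ")")
print("fraction already polarized to ~0 or ~1:", np.mean((d < 0.01) | (d > 0.99)))
```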

Fig. 3. Mutual information dimension of the sub-channels (natural sub-channel ordering).

Fig. 4. Sorted mutual information dimension of the sub-channels (decreasing order).

In the n-th step of recursive application of the one-step transformation (25), a transformation matrix of size 2^n × 2^n is formed. The overall transformation from the original signal vector to the input vector of the underlying 2^n independent SAECs is

G = B F^{⊗n},   (29)

where F is the 2 × 2 kernel of the one-step transformation (25), F^{⊗n} is its n-fold Kronecker product, and B is the bit-reversal matrix which permutes the columns of F^{⊗n} into bit-reversed order. For example, for small n the matrix G can be written out explicitly from this formula.
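A sketch of the transformation in (29) follows. The kernel choice F = β·[[1, 1], [0, 1]], matching the one-step transformation (25) with the scaling β left free, and the standard bit-reversal permutation are stated here as assumptions about the intended construction rather than as the paper's exact prescription.

```python
import numpy as np

def polar_transform_matrix(n, beta=1.0):
    """Build G = B @ (F kron ... kron F) for block length 2**n,
    with F the 2x2 kernel of the one-step transformation (25)."""
    F = beta * np.array([[1.0, 1.0],
                         [0.0, 1.0]])
    G = np.array([[1.0]])
    for _ in range(n):
        G = np.kron(G, F)                      # n-fold Kronecker product F^{(x)n}
    size = 2 ** n
    # bit-reversal permutation of the column indices
    rev = [int(format(i, f"0{n}b")[::-1], 2) for i in range(size)]
    return G[:, rev]

G = polar_transform_matrix(3)
print(G.shape)                      # (8, 8)
print(np.linalg.matrix_rank(G))     # full rank: the transform is invertible
```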

From the channel polarization of the SAEC, we only need to transmit the real-valued information signals on the good channels, while transmitting independent random signals on the bad channels; the input on the bad channels is revealed to the receiver. The MID in (15) can be achieved by such a polar encoding scheme. The polar decoding algorithm is discussed in the following section.

Remark 6: As discussed in Section IV-A, we use the SAEC in place of the SANC for the MID calculation, since the two are equivalent in that regard and the analysis is simplified. It can be readily verified that for the SANC, for which Table I is replaced by Table III below, Proposition 2 also holds. Proposition 3 then follows from the same induction as in Appendix IV. The derivation is omitted to save space.

TABLE III
CHANNEL REALIZATIONS AND THE CORRESPONDING PROBABILITIES

  SANC realization                      Probability
  y1 = z1, y2 = z2                      (1 − γ)²
  y1 = z1, y2 = z2 + n2                 (1 − γ) γ
  y1 = z1 + n1, y2 = z2                 γ (1 − γ)
  y1 = z1 + n1, y2 = z2 + n2            γ²

C. SANC Polar Decoding

In this section, we discuss the corresponding decoding algorithm for the SANC with the channel-polarizing transformation. Let M = 2^n. Let us denote the M-dimensional input signal by u, the N-dimensional subset of u corresponding to the good channels by u_G, the (M − N)-dimensional subset of u corresponding to the bad channels by u_B, and the polar transformation matrix by G. From (29), the relationship between u and the transformed vector x, which is the input to the underlying physical SANC, is

x = G u = G_G u_G + G_B u_B,   (30)

where G_G is the matrix whose columns are associated with the good channels of u, and G_B is the matrix whose columns correspond to the bad channels of u. The channel model is

y = x + e = G_G u_G + G_B u_B + e,   (31)

where e is the additive sparse noise vector.

The vector u_B is set to a frozen value that is revealed to the receiver:

u_B = ū_B.   (32)

Without a priori knowledge of u_G, the optimal decoding rule is ML, namely finding the input vector that maximizes the conditional distribution of the observation. Next, we discuss an SC (successive cancellation) based SANC polar decoder, analogous to the binary case in [12], with lower complexity than ML. Let S_G denote the index set of the good channels and S_B the index set of the bad channels.

Description of the SC-SANC polar decoder:

  for i = 1, ..., M
      if i ∈ S_B            /* u_i is known to the receiver */
          û_i = u_i
      else
          û_i = the value of u_i maximizing the conditional distribution of (y, û_1, ..., û_{i−1}) given u_i
      end;
  end;

In the SC decoder, g_i denotes the i-th column of G, and the conditional distributions are calculated via the recursive relationships (36) and (37) in Appendix IV. For example, it is readily verified that the first-stage conditional distribution is a mixture of Dirac delta terms and Gaussian densities N(·; μ, σ²) with appropriate means and variances. In the above maximization, it is understood that a first-order delta term, if its condition is satisfied, dominates any finite term, and a higher-order delta term dominates lower-order ones.

Unfortunately, even with the recursive formulas (36) and (37) for calculating the conditional distributions, it is still rather complex to find the maximizing value at each stage, and an efficient algorithm for this task appears to be unknown. In Section VI, in order to show the performance of the proposed channel-polarizing transformation for the SANC, we obtain numerical results for the decoding error rate of the polarized SANC using the L1 decoder of [9].

V. STRUCTURED CONSTRUCTION OF OPTIMAL MEASUREMENT MATRIX FOR NOISELESS CS

Following the discussion in Section IV on the polarization of the sparse noisy channel, in this section we propose a method for the structured construction of the optimal measurement matrix for noiseless CS, together with the procedure for solving the noiseless CS problem using this construction. From Section II on the duality between noiseless CS and analog transmission across a sparse noisy channel, we know that a rate-limit-achieving linear encoding scheme across the sparse noisy channel optimally solves the equivalent noiseless CS problem.

The optimality lies in the construction of the measurement matrix: it achieves the minimum number of measurements for reliable signal recovery.

From Section II, the noiseless CS problem y = Φ x is equivalent to a problem of analog transmission across a sparse noisy channel, z = A s + x, where Φ is the measurement matrix, x is the sparse vector to be recovered in CS (and also the sparse noise vector in the analog transmission), A is the linear encoding matrix, and s is the real-valued information vector. The columns of matrix A span the null space of Φ, namely Φ A = 0.

From our discussion on channel polarization, for input of large block length M, the SAEC (and also the SANC) can be polarized into good channels and bad channels. For large input size we have N/M → 1 − γ good channels. For large input size M, the indices of the good and bad channels in u are known [12]; as a result, the matrices G_G and G_B are known. We use a frozen vector ū_B for transmission on the bad channels and reveal it to the decoder. From [12], it is known that there exists a frozen realization of ū_B that does not degrade the performance of the polar code. Thus, summing up the discussion above, the optimal measurement matrix can be constructed as follows.

Optimal Structured Construction: Φ is the projection matrix onto the kernel of the matrix G_G, where G_G is the submatrix of G formed by picking the columns of G corresponding to the good channels, and G is the channel-polarizing linear transformation matrix in (29).

Given the optimal construction of the measurement matrix, the noiseless CS problem y = Φ x can be solved following the steps below:

1. As in Section II, convert y = Φ x to the equivalent analog decoding problem z = G_G s + x.
2. Add the known term G_B ū_B to z and form z' = z + G_B ū_B.
3. Use the polar decoding algorithm to decode s from z'.
4. Recover x by x = z − G_G ŝ.

Let us calculate the sparsity of x that can be recovered error-free. The number of measurements is P = M − N, where N is the number of good channels, and the sparsity of x is ||x||_0 = γ M. From the fact that the polar construction achieves the MID 1 − γ, we have N/M → 1 − γ. As a result, P → γ M = ||x||_0: the number of measurements only needs to be as large as the sparsity of the signal itself. This is two times as good as the result achieved by (21), which requires P ≥ 2 ||x||_0. Here we would like to point out again that our proposed CS algorithm achieves error-free recovery asymptotically as the block length goes to infinity, whereas (21) achieves error-free recovery at finite block length.
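The construction above can be prototyped as follows. The good-channel selection reuses the MID recursion sketched in Section IV, and the specific threshold, kernel, sub-channel ordering, and the null-space representation of Φ are illustrative assumptions rather than the paper's prescription.

```python
import numpy as np
from scipy.linalg import null_space

def subchannel_mids(n, gamma):
    d = np.array([1.0 - gamma])
    for _ in range(n):
        d = np.ravel(np.column_stack((d**2, 2*d - d**2)))
    return d

def polar_transform_matrix(n, beta=1.0):
    F = beta * np.array([[1.0, 1.0], [0.0, 1.0]])
    G = np.array([[1.0]])
    for _ in range(n):
        G = np.kron(G, F)
    rev = [int(format(i, f"0{n}b")[::-1], 2) for i in range(2**n)]
    return G[:, rev]

def structured_measurement_matrix(n, gamma):
    """Rows of Phi span the orthogonal complement of the good-channel columns G_G,
    so that Phi @ G_G = 0 (Phi plays the role of the projection onto the kernel of G_G)."""
    G = polar_transform_matrix(n)
    order = np.argsort(subchannel_mids(n, gamma))[::-1]   # best sub-channels first
    N = int(round((1 - gamma) * 2**n))                     # dimension rate ~ 1 - gamma
    G_good = G[:, np.sort(order[:N])]
    Phi = null_space(G_good.T).T                           # (M - N) x M, Phi @ G_good = 0
    return Phi, G_good

Phi, G_good = structured_measurement_matrix(n=8, gamma=0.2)
print(Phi.shape)                                  # about (gamma*M, M) measurements
print(np.max(np.abs(Phi @ G_good)))               # ~ 0: Phi annihilates the code subspace
```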

VI. NUMERICAL EXPERIMENTS

In this section, we perform numerical experiments to demonstrate the performance of the polar analog coding matrix. Since an efficient algorithm for analog polar decoding is not known (such as for the analog SC decoder discussed in Section IV-C), we instead use the L1 decoder of [9] for evaluation purposes. The L1 decoder solves the program

minimize ||z − A ŝ||_1 over ŝ,

where A is our polar analog coding matrix. We also compare the performance of the polar coding matrix with that of a Gaussian random coding matrix.

In the numerical results, the codeword length is 256. The analog information vector is generated according to an i.i.d. standard Gaussian distribution. The non-zero entries of the sparse noise vector are also generated as i.i.d. standard Gaussian, with their locations randomly placed among the entries of the codeword. Due to the finite precision of the numerical simulation, a codeword is declared in error if the average L1 error per coordinate exceeds a small fixed tolerance. In Fig. 5, the analog code rate is 0.25 and the codeword error rate is plotted against the noise sparsity. In Fig. 6, the noise sparsity is fixed to 0.2 and the codeword error rate is plotted against the analog code rate. From the figures, the polar coding matrix performs worse than the Gaussian random matrix in the region of low error rate, with both achieving vanishing error rate as the code rate and the noise sparsity decrease. We point out that the performance of analog polar coding depends on the decoding scheme and could be improved by adopting a better decoder than the L1 decoder.

Fig. 5. Codeword error rate vs. noise sparsity (analog code rate = 0.25), polar vs. Gaussian coding matrices.
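A minimal version of the L1 decoding step used in these experiments is sketched below, cast as a linear program. The use of scipy.optimize.linprog, the stand-in Gaussian coding matrix, and the problem sizes are illustrative assumptions, not the setup that produced Fig. 5 and Fig. 6.

```python
import numpy as np
from scipy.optimize import linprog

def l1_decode(A, z):
    """Solve  min_s || z - A s ||_1  as a linear program in (s, t)."""
    M, N = A.shape
    c = np.concatenate([np.zeros(N), np.ones(M)])          # minimize sum of t
    A_ub = np.block([[ A, -np.eye(M)],                      #  A s - t <= z
                     [-A, -np.eye(M)]])                     # -A s - t <= -z
    b_ub = np.concatenate([z, -z])
    bounds = [(None, None)] * (N + M)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:N]

rng = np.random.default_rng(4)
M, N, K = 256, 64, 20                     # rate 0.25, ~8% noise sparsity (illustrative)
A = rng.standard_normal((M, N))           # stand-in analog coding matrix
s = rng.standard_normal(N)
e = np.zeros(M); e[rng.choice(M, K, replace=False)] = rng.standard_normal(K)
z = A @ s + e

s_hat = l1_decode(A, z)
print("average L1 error per coordinate:", np.mean(np.abs(s_hat - s)))
```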

Fig. 6. Codeword error rate vs. analog code rate (noise sparsity = 0.2), polar vs. Gaussian coding matrices.

VII. CONCLUDING REMARKS

Compressed sensing (CS) and polar coding have been two recent important developments in the areas of signal processing and channel coding. In a conventional sense, there is a fundamental difference between the two fields: CS deals with the recovery of real-valued signals, whereas channel coding is concerned with the recovery of discrete bits across noisy channels. However, there also exists a profound analogy between the two fields, in that channel coding aims at minimizing the coding redundancy while ensuring reliable communication across noisy channels, whereas in CS we would like to minimize the number of measurements required to ensure signal recovery. Hence, measurement in CS plays a role analogous to the role coding redundancy plays in coding theory.

In this paper, we established a connection between the two fields. The concept of mutual information dimension proves to be important in the bridging; it has the interpretation of the fundamental limit of the achievable dimension rate for analog transmission under certain conditions. We showed that the sparse additive noisy channel (SANC) and the sparse analog erasure channel (SAEC) lend themselves to the channel polarization effect at large block length, which in turn enables us to propose the structured construction of the optimal measurement matrix for noiseless CS, resulting in the minimum number of linear measurements. From our results, the number of linear measurements only needs to be as large as the sparsity of the signal itself for asymptotically error-free recovery. A duality between noiseless CS and analog coding across a sparse noisy channel is utilized in the derivation; from this duality, it is evident that reducing the redundancy in the linear analog coding directly translates into a reduced number of measurements in noiseless CS.

APPENDIX I
PROOF OF THEOREM 1

For achievability, fix a dimension rate below the MID. We consider a random matrix with i.i.d. standard Gaussian entries, independent of the transmitted signal. Denote the subspace spanned by its rows, a subspace of R^M; since the matrix is full rank with probability 1, the dimensionality of this row space is M − N w.p. 1. Let the codebook be the N-dimensional subspace which is the kernel of this random matrix, and let the transmitted codeword belong to this codebook.

Let us define the following set (standing for the continuous component support) of noise realizations, and then, for an arbitrary small constant, a corresponding typical set. The decoder works as follows: for the channel output vector, the decoder output is declared if the corresponding decoding condition holds; a decoding error is declared otherwise. Therefore the decoding error probability can be written as a sum which, by the union bound, is upper bounded by two terms. For sufficiently large block length, the first term vanishes according to the weak law of large numbers. The second term is equivalent to the probability of the stated error event, where we used the fact that the codebook is a linear space. In what follows, we show that this probability is zero. We have the event in (33). Let us choose the block length large enough. From the definitions above, the support of the continuous component of the noise is bounded accordingly. Let us denote the sub-vector formed by the discrete components and the sub-vector formed by the AC components; the dimensionality of the latter is bounded in the same way. Denote the sub-matrix formed by the columns of the random matrix corresponding to the AC components, and the sub-matrix formed by the remaining columns. The event in (33) can then be further written as (34).

Let us decompose the codeword difference into the components with the same indices as the discrete sub-vector and the components with the same indices as the AC sub-vector; the event is then equivalent to a linear system. Since the matrix is composed of i.i.d. Gaussian entries, it is full rank w.p. 1. For the sub-matrix associated with the AC components, all its columns are linearly independent w.p. 1, because its number of columns is strictly less than its number of rows (it is a tall matrix), and the two sub-matrices are independent. Conditioned on the discrete components, the probability that a solution exists for the overdetermined system is 0 unless both components vanish. Then, after averaging with respect to the discrete components, we have the same conclusion. Since the cardinality of all discrete index patterns is countable, the right-hand side of (34) is 0. Therefore, from (34), we obtain (35). As a result of (35) and the fact that the first term vanishes, the decoding error probability vanishes for sufficiently large block length. Thus, the achievable dimension rate of the codebook can be calculated as in (35), and the dimension rate is achievable for any rate below the MID. The converse of Theorem 1 can be proved by following the same argument as in [2], utilizing Lemma 9 and Lemma 10 therein, and is therefore not repeated here.

APPENDIX II
PROOF OF PROPOSITION 1

The channel input is AC. Consider the joint distribution of the input and output. With probability 1 − γ, the pair resides on the countable collection of AC curves {y = f_i(x)} as in Definition 6. For technical reasons, let us assume that the curves {f_i} form a 1-dimensional manifold in R². Therefore, the cdf corresponding to the first term is AC on the 1-dimensional manifold, and the cdf corresponding to the second term is AC on R², since the channel input is AC. Reference [13] proved that the Rényi information dimension of a random variable with a discrete-continuous mixed distribution equals the weight of the AC part of the distribution, and the results extend to higher-dimensional Euclidean spaces. From [13], the contributions of the first and second terms are (1 − γ)·1 and γ·2, respectively, and the Rényi information dimension of the pair is 1 + γ. From our definition of CID, we have d(Y | X) = γ.

Then we prove that the channel output distribution is AC. The density of the output can be represented as a summation in which the second term is a density; hence the output has a density. As a result, the channel output is AC and d(Y) = 1. The MID is then given by d(X; Y) = d(Y) − d(Y | X) = 1 − γ.

APPENDIX III
PROOF OF PROPOSITION 2

Assuming all inputs and white Gaussian noises are AC, it is easily shown that the channel output pair (y1, y2) is 2-dimensionally AC. As a result, its information dimension is 2, since in this case it coincides with its geometric dimensionality of 2. The joint distribution of the first input and the output pair is a discrete-continuous mixture; similar to the derivation in Appendix II, its Rényi information dimension is calculated accordingly, and so the corresponding CID is obtained. For the conditional distribution given the first input, the joint distribution is

again a discrete-continuous mixture, and its Rényi information dimension is calculated in the same way, so the corresponding CID follows. Now we are ready to calculate the MIDs of the two new channels, where we use the fact that x1 and x2 are independent.

APPENDIX IV
PROOF OF PROPOSITION 3

Proposition 3 can be proved by following the induction in [12] for the BEC; in this appendix we only identify the key steps of the proof. Let us define the following terminology. A pair of new real-valued input-output channels is said to be obtained from two independent copies of the original channel after a single-step transformation if the corresponding transition distributions factor accordingly for all inputs and outputs; in such a case, we use a corresponding notation for the original channel and the two new channels. Obviously, the one-step transformation in Fig. 1 can be written in this form, where the original channel is the sparse noisy channel and the two new channels are those seen by x1 and by x2 given x1. From Proposition 1 and Proposition 2, we know that the MIDs of the new channel pair satisfy the stated one-step relations. Then, for the (n + 1)-th step transformation in Fig. 2, we have

the recursive relationships (36) and (37). Let us note that there is a one-to-one correspondence between the outputs of the channel block before and after the transformation; therefore, we can write the desired relations after the corresponding substitutions, and Proposition 3 follows.

ACKNOWLEDGEMENT

The authors would like to thank the anonymous reviewers for their insights and suggestions.

REFERENCES

[1] C. E. Shannon, "Communication in the presence of noise," Proceedings of the Institute of Radio Engineers, vol. 37, no. 1, pp. 10-21, 1949.
[2] Y. Wu and S. Verdú, "Rényi information dimension: Fundamental limits of almost lossless analog compression," IEEE Transactions on Information Theory, vol. 56, no. 8, pp. 3721-3748, Aug. 2010.
[3] D. L. Donoho, "Compressed sensing," IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289-1306, 2006.
[4] E. J. Candes and T. Tao, "Near-optimal signal recovery from random projections: Universal encoding strategies?" IEEE Transactions on Information Theory, vol. 52, no. 12, pp. 5406-5425, 2006.
[5] E. J. Candes, M. Rudelson, T. Tao, and R. Vershynin, "Error correction via linear programming," in Proc. Annual IEEE Symposium on Foundations of Computer Science, pp. 295-308, 2005.
[6] R. G. Baraniuk, V. Cevher, M. F. Duarte, and C. Hegde, "Model-based compressive sensing," IEEE Transactions on Information Theory, vol. 56, pp. 1982-2001, 2010.
[7] S. Ji, Y. Xue, and L. Carin, "Bayesian compressive sensing," IEEE Transactions on Signal Processing, vol. 56, no. 6, pp. 2346-2356, 2008.
[8] A. Gilbert and P. Indyk, "Sparse recovery using sparse matrices," Proceedings of the IEEE, vol. 98, no. 6, pp. 937-947, 2010.
[9] E. J. Candes and T. Tao, "Decoding by linear programming," IEEE Trans. Inf. Theory, vol. 51, no. 12, pp. 4203-4215, Dec. 2005.
[10] D. Needell and J. A. Tropp, "CoSaMP: Iterative signal recovery from incomplete and inaccurate samples," Appl. Comput. Harmon. Anal., vol. 26, pp. 301-321, 2009.
[11] J. A. Tropp, "Topics in Sparse Approximation," Ph.D. dissertation, Computational and Applied Mathematics, UT-Austin, Austin, TX, 2004.
[12] E. Arikan, "Channel polarization: A method for constructing capacity-achieving codes for symmetric binary-input memoryless channels," IEEE Trans. Inf. Theory, vol. 55, no. 7, pp. 3051-3073, July 2009.
[13] A. Rényi, "On the dimension and entropy of probability distributions," Acta Mathematica Hungarica, vol. 10, no. 1-2, Mar. 1959.
[14] E. J. Candes, J. Romberg, and T. Tao, "Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information," IEEE Trans. Inf. Theory, vol. 52, no. 2, pp. 489-509, Feb. 2006.
[15] E. J. Candes and J. Romberg, "Quantitative robust uncertainty principles and optimally sparse decompositions," Found. Comput. Math., to be published. Available on the arXiv preprint server: math.CA/0411273.
[16] R. Gribonval and M. Nielsen, "Sparse representations in unions of bases," IEEE Trans. Inf. Theory, vol. 49, no. 12, pp. 3320-3325, Dec. 2003.
[17] D. L. Donoho and M. Elad, "Optimally sparse representation in general (nonorthogonal) dictionaries via l1 minimization," Proc. Nat. Acad. Sci. USA, vol. 100, pp. 2197-2202, 2003.
[18] S. S. Chen, D. L. Donoho, and M. A. Saunders, "Atomic decomposition by basis pursuit," SIAM J. Sci. Comput., vol. 20, pp. 33-61, 1998.
[19] Y. Wu and S. Verdú, "Optimal phase transitions in compressed sensing," IEEE Transactions on Information Theory, vol. 58, no. 10, pp. 6241-6263, Oct. 2012.
[20] M. Pilanci, O. Arikan, and E. Arikan, "Polar compressive sampling: A novel technique using polar codes," in Proc. IEEE 18th Signal Processing and Communications Applications Conference (SIU), pp. 73-76, 2010.
[21] S. Haghighatshoar, E. Abbe, and E. Telatar, "Adaptive sensing using deterministic partial Hadamard matrices," in Proc. 2012 IEEE International Symposium on Information Theory (ISIT), pp. 1842-1846, 2012.
[22] E. Abbe, "Universal source polarization and sparse recovery," in Proc. 2010 IEEE Information Theory Workshop (ITW), pp. 1-5, 2010.
[23] S. Jafarpour, W. Xu, B. Hassibi, and R. Calderbank, "Efficient and robust compressed sensing using optimized expander graphs," IEEE Transactions on Information Theory, vol. 55, no. 9, pp. 4299-4308, 2009.
[24] R. Calderbank, S. Howard, and S. Jafarpour, "Construction of a large class of deterministic sensing matrices that satisfy a statistical isometry property," IEEE Journal of Selected Topics in Signal Processing, vol. 4, no. 2, pp. 358-374, 2010.
[25] D. L. Donoho, A. Javanmard, and A. Montanari, "Information-theoretically optimal compressed sensing via spatial coupling and approximate message passing," in Proc. 2012 IEEE International Symposium on Information Theory (ISIT), pp. 1231-1235, 2012.
[26] F. Krzakala, M. Mézard, F. Sausset, Y. Sun, and L. Zdeborová, "Statistical-physics-based reconstruction in compressed sensing," Physical Review X, vol. 2, no. 2, 021005, 2012.