Weighted Superimposed Codes and Constrained Integer Compressed Sensing

Similar documents
Randomized Recovery for Boolean Compressed Sensing

Support recovery in compressed sensing: An estimation theoretic approach

Block designs and statistics

Multi-Dimensional Hegselmann-Krause Dynamics

13.2 Fully Polynomial Randomized Approximation Scheme for Permanent of Random 0-1 Matrices

An RIP-based approach to Σ quantization for compressed sensing

Support Vector Machine Classification of Uncertain and Imbalanced data using Robust Optimization

Compressive Sensing Over Networks

Chapter 6 1-D Continuous Groups

Sharp Time Data Tradeoffs for Linear Inverse Problems

Compressive Distilled Sensing: Sparse Recovery Using Adaptivity in Compressive Measurements

Polygonal Designs: Existence and Construction

Feature Extraction Techniques

Supplementary Material for Fast and Provable Algorithms for Spectrally Sparse Signal Reconstruction via Low-Rank Hankel Matrix Completion

Physics 215 Winter The Density Matrix

Lower Bounds for Quantized Matrix Completion

Asynchronous Gossip Algorithms for Stochastic Optimization

arxiv: v1 [cs.ds] 3 Feb 2014

Bipartite subgraphs and the smallest eigenvalue

A note on the multiplication of sparse matrices

The Hilbert Schmidt version of the commutator theorem for zero trace matrices

Uniform Approximation and Bernstein Polynomials with Coefficients in the Unit Interval

In this chapter, we consider several graph-theoretic and probabilistic models

Weighted- 1 minimization with multiple weighting sets

COS 424: Interacting with Data. Written Exercises

On Constant Power Water-filling

Non-Parametric Non-Line-of-Sight Identification 1

Vulnerability of MRD-Code-Based Universal Secure Error-Correcting Network Codes under Time-Varying Jamming Links

Lecture 20 November 7, 2013

On Conditions for Linearity of Optimal Estimation

Optimal Jamming Over Additive Noise: Vector Source-Channel Case

arxiv: v1 [cs.ds] 17 Mar 2016

Fairness via priority scheduling

ON THE TWO-LEVEL PRECONDITIONING IN LEAST SQUARES METHOD

Fundamental Limits of Database Alignment

Kernel Methods and Support Vector Machines

A Simple Regression Problem

Pattern Recognition and Machine Learning. Learning and Evaluation for Pattern Recognition

On the theoretical analysis of cross validation in compressive sensing

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

Recovering Data from Underdetermined Quadratic Measurements (CS 229a Project: Final Writeup)

Design of Spatially Coupled LDPC Codes over GF(q) for Windowed Decoding

On the Maximum Number of Codewords of X-Codes of Constant Weight Three

12 Towards hydrodynamic equations J Nonlinear Dynamics II: Continuum Systems Lecture 12 Spring 2015

A Simple Homotopy Algorithm for Compressive Sensing

Quantum algorithms (CO 781, Winter 2008) Prof. Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search

Hybrid System Identification: An SDP Approach

Hamming Compressed Sensing

Interactive Markov Models of Evolutionary Algorithms

Semicircle law for generalized Curie-Weiss matrix ensembles at subcritical temperature

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 11 10/15/2008 ABSTRACT INTEGRATION I

A Low-Complexity Congestion Control and Scheduling Algorithm for Multihop Wireless Networks with Order-Optimal Per-Flow Delay

ASSUME a source over an alphabet size m, from which a sequence of n independent samples are drawn. The classical

Fixed-to-Variable Length Distribution Matching

Fast Montgomery-like Square Root Computation over GF(2 m ) for All Trinomials

ESE 523 Information Theory

Iterative Decoding of LDPC Codes over the q-ary Partial Erasure Channel

E0 370 Statistical Learning Theory Lecture 6 (Aug 30, 2011) Margin Analysis

Distributed Subgradient Methods for Multi-agent Optimization

3.8 Three Types of Convergence

A PROBABILISTIC AND RIPLESS THEORY OF COMPRESSED SENSING. Emmanuel J. Candès Yaniv Plan. Technical Report No November 2010

Tail estimates for norms of sums of log-concave random vectors

Least Squares Fitting of Data

Tight Bounds for Maximal Identifiability of Failure Nodes in Boolean Network Tomography

e-companion ONLY AVAILABLE IN ELECTRONIC FORM

A Note on Scheduling Tall/Small Multiprocessor Tasks with Unit Processing Time to Minimize Maximum Tardiness

arxiv: v2 [math.co] 3 Dec 2008

are equal to zero, where, q = p 1. For each gene j, the pairwise null and alternative hypotheses are,

A Simplified Analytical Approach for Efficiency Evaluation of the Weaving Machines with Automatic Filling Repair

IN modern society that various systems have become more

Proc. of the IEEE/OES Seventh Working Conference on Current Measurement Technology UNCERTAINTIES IN SEASONDE CURRENT VELOCITIES

The proofs of Theorem 1-3 are along the lines of Wied and Galeano (2013).

IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 60, NO. 2, FEBRUARY ETSP stands for the Euclidean traveling salesman problem.

1 Bounding the Margin

Pattern Recognition and Machine Learning. Artificial Neural networks

Estimating Parameters for a Gaussian pdf

arxiv: v3 [quant-ph] 18 Oct 2017

Highly Robust Error Correction by Convex Programming

Testing Properties of Collections of Distributions

A Bernstein-Markov Theorem for Normed Spaces

Supplementary to Learning Discriminative Bayesian Networks from High-dimensional Continuous Neuroimaging Data

A Probabilistic and RIPless Theory of Compressed Sensing

Combining Classifiers

GEE ESTIMATORS IN MIXTURE MODEL WITH VARYING CONCENTRATIONS

Pattern Recognition and Machine Learning. Artificial Neural networks

A Better Algorithm For an Ancient Scheduling Problem. David R. Karger Steven J. Phillips Eric Torng. Department of Computer Science

On the Use of A Priori Information for Sparse Signal Approximations

An l 1 Regularized Method for Numerical Differentiation Using Empirical Eigenfunctions

arxiv: v5 [cs.it] 16 Mar 2012

Linear Algebra (I) Yijia Chen. linear transformations and their algebraic properties. 1. A Starting Point. y := 3x.

Reed-Muller codes for random erasures and errors

Lecture 9 November 23, 2015

Error Exponents in Asynchronous Communication

DERIVING PROPER UNIFORM PRIORS FOR REGRESSION COEFFICIENTS

The Transactional Nature of Quantum Information

On Certain C-Test Words for Free Groups

Antenna Saturation Effects on MIMO Capacity

An Algorithm for Quantization of Discrete Probability Distributions

An Adaptive UKF Algorithm for the State and Parameter Estimations of a Mobile Robot

SPECTRUM sensing is a core concept of cognitive radio

Transcription:

Weighted Superiposed Codes and Constrained Integer Copressed Sensing Wei Dai and Olgica Milenovic Dept. of Electrical and Coputer Engineering University of Illinois, Urbana-Chapaign Abstract We introduce a new faily of codes, tered weighted superiposed codes WSCs. This faily generalizes the class of Euclidean superiposed codes ESCs, used in ultiuser identification systes. WSCs allow for discriinating all bounded, integer-valued linear cobinations of real-valued codewords that satisfy prescribed nor and non-negativity constraints. By design, WSCs are inherently noise tolerant. Therefore, these codes can be seen as special instances of robust copressed sensing schees. The ain results of the paper are lower and upper bounds on the largest achievable code rates of several classes of WSCs. These bounds suggest that with the codeword and weighting vector constraints at hand, one can iprove the code rates achievable by standard copressive sensing. I. INTRODUCTION Superiposed codes SCs and designs were introduced by Kautz and Singleton [], for the purpose of studying database retrieval and group testing probles. In their original forulation, superiposed designs were defined as arrays of binary codewords with the property that bitwise OR functions of all sufficiently sall collections of codewords are distinguishable. Superiposed designs can therefore be viewed as binary parity-chec atrices for which syndroes represent bitwise OR, rather than XOR, functions of selected sets of coluns. The notion of binary superiposed codes was further generalized by prescribing a distance constraint on the OR evaluations of subsets of coluns, and by extending the fields in which the codeword sybols lie []. In the latter case, Ericson and Györfi introduced Euclidean superiposed codes ESCs, for which the sybol field is R, for which the OR function is replaced by real addition, and for which all sus of less than K codewords are required to have pairwise Euclidean distance at least d. The best nown upper bound on the size of Euclidean superiposed codes was derived by Füredi and Ruszino [3], who used a cobination of sphere pacing arguents and probabilistic concentration forulas to prove their result. On the other hand, copressed sensing CS is a new sapling ethod usually applied to K-sparse signals, i.e. signals ebedded in an N-diensional space that can be represented by only K N significant coefficients [4] [6]. Alternatively, when the signal is projected onto a properly chosen basis of the transfor space, its accurate representation relies only on a sall nuber of coefficients. Encoding of a K-sparse discrete-tie signal x of diension N is accoplished by coputing a easureent vector y that consists of N linear projections, i.e. y = Φx. Here, Φ represents an N atrix, usually over the field of real nubers. Consequently, the easured vector represents a linear cobination of coluns of the atrix Φ, with weights prescribed by the nonzero entries of the vector x. Although the reconstruction of the signal x R N fro the possibly noisy projections is an ill-posed proble, the prior nowledge of signal sparsity allows for accurate recovery of x. This wor is supported by the NSF Grant CCF 064447, the NSF Career Award, and the DARPA Young Faculty Award of the second author. Parts of the results were presented at the CCIS 008 ant ITW 008 conferences.

The connection between error-correcting coding theory and copressed sensing was investigated by Candès and Tao in [7], and reared upon in [8]. In the forer wor, the authors studied rando codes over the real nubers, the noisy observations of which can be decoded using linear prograing techniques. As with the case of copressed sensing, the perforance guarantees of this coding schee are probabilistic, and the K-sparse signal is assued to lie in R N. We propose to study a new class of codes, tered weighted superiposed codes WSCs, which provide a lin between SCs and CS atrices. As with the case of the forer two entities, WSCs are defined over the field of real nubers. But unlie ESCs, for which the sparse signal x consists of zeros and ones only, and unlie CS, for which x is assued to belong to R N, in WSCs the vector x is drawn fro B N, where B denotes a bounded, syetric set of integers. The otivation for studying WSCs coes fro the fact that in any applications, the alphabet of the sparse signal can be odeled as a finite set of integers. Codewords fro the faily of WSCs can be designed to obey prescribed nor and non-negativity constraints. The restriction of the weighting coefficients to a bounded set of integers ensures reconstruction robustness in the presence of noise - i.e., all weighted sus of at ost K codewords can be chosen at iniu distance d fro each other. This iniu distance property provides deterinistic perforance guarantees, which CS techniques usually lac. Another benefit of the input alphabet restriction is the potential to reduce the decoding coplexity copared to that of CS reconstruction techniques. This research proble was addressed by the authors in [9] [], but is beyond the scope of this paper. The central proble of this paper is to characterize the rate region for which a WSC with certain paraeters exists. The ain results of this wor include generalized sphere pacing upper bounds and rando coding lower bounds on the rates of several WSC failies. The upper and lower bounds differ only by a constant, and therefore iply that the superposition constraints are ensured whenever = O K /. In the language of CS theory, this result suggests that the nuber of required signal easureents is less than the standard O K /K, required for discriinating real-valued linear cobinations of codewords. This reduction in the required nuber of easureents codelength can be seen as a result of restricting the input alphabet of the sparse signal. The paper is organized as follows. Section III introduces the relevant terinology and definitions. Section IV contains the ain results of the paper upper and lower bounds on the size of WSCs. The proofs of the rate bounds are presented in Sections V and VIII. Concluding rears are given in Section IX. II. MOTIVATING APPLICATIONS We describe next two applications - one arising in wireless counication, the other in bioengineering - otivating the study of WSCs. The adder channel and signature codes: One coon application of ESCs is for signaling over ulti-access channels. For a given set of K active users in the channel, the input to the receiver y equals the su of the signals signatures x ij, j =,...,, of the active users, i.e. y = j= x i j. The signatures are only used for identification purposes, and in order to iniize energy consuption, all users are assigned unit energy [], [3]. Now, consider the case that in addition to identifying their presence, active users also have to convey soe liited inforation to the receiver by adapting their transission power. The received signal can in this case be represented by a weighted su of the signatures of active users, i.e., y = j= pij x ij. The codeboo used in this schee represents a special for of WSC, tered Weighted Euclidean Superiposed Codes WESCs; these codes are forally defined in Section III. Copressive sensing icroarrays: A icroarray is a bioengineering device used for easuring the level of certain olecules, such as RNA ribonucleic acid sequences, representing the joint expression profile of thousands of genes.

3 A icroarray consist of thousands of icroscopic spots of DNA sequences, called probes. The copleentary DNA cdna sequences of RNA olecules being easured are labeled with fluorescent tags, and such units are tered targets. If a target sequence has a significant hoology with a probe sequence on the icroarray, the target cdna and probe DNA olecules will bind or hybridize so as to for a stable structure. As a result, upon exposure to laser light of the appropriate wavelength, the icroarray spots with large hybridization activity will be illuinated. The specific illuination pattern and intensities of icroarray spots can be used to infer the concentration of RNA olecules. In traditional icroarray design, each spot of probes is a unique identifier of only one target olecule. In our recent wor [], [3], we proposed the concept of copressive sensing icroarrays CSM, for which each probe has the potential to hybridize with several different targets. It uses the observation that, although the nuber of potential target RNA types is large, not all of the are expected to be present in a significant concentration at all observed ties. Matheatically, a icroarray is represented by a easureent atrix, with an entry in the i th row and the j th colun corresponding to the hybridization probability between the i th probe and the j th target. In this case, all the entries in the easureent atrix are nonnegative real nubers, and all the coluns of the easureent atrix are expected to have l -nors equal to one. In icroarray experients, the input vector x has entries that correspond to integer ultiples of the sallest detectable concentration of target cdna olecules. Since the nuber of different target cdna types in a typical test saple is sall copared to the nuber of all potential types, one can assue that the vector x is sparse. Furtherore, the nuber of RNA olecules in a cell at any point in tie is upper bounded due to energy constraints, and due to intracellular space liitations. Hence, the integer-valued entries of x are assued to have bounded agnitudes and to be relatively sall copared to the nuber of different RNA types. With the above considerations, the easureent atrix of a CSM can be described by nonnegative l -WSCs, forally defined in Section III. III. DEFINITIONS AND TERMINOLOGY Throughout the paper, we use the following notation and definitions. A code C is a finite set of N codewords vectors v i R, i =,,, N. The code C is specified by its codeword atrix codeboo C R N, obtained by arranging the codewords in coluns of the atrix. For two given positive integers, t and K, let be a syetric, bounded set of integers, and let B t = [ t, t] = t, t +,, t, t} Z B K = b B N t : b 0 K } denote the l 0 ball of radius K, with b 0 representing the nuber of nonzero coponents in the vector b i.e., the support size of the vector. We forally define WESCs as follows. Definition : A code C is said to be a WESC with paraeters N,, K, d, η, B t for soe d 0, η, if C R N, v i = η, for all i =,, N, and, 3 if the following iniu distance property holds: for all b, b B K. d E C, K, B t := in b b Cb Cb d

4 Henceforth, we focus our attention on WESCs with η =, and denote the set of paraeters of interest by N,, K, d, B t. The definition above can be extended to hold for other nored spaces. Definition : A code C is said to be an l p -WSC with paraeters N,, K, d, B t if C R N, v i lp =, for all i =,, N, and, 3 if the following iniu distance property holds: for all b, b B K. d p C, K, B t := in Cb Cb lp d b b Note that specializing p = reproduces the definition of a WESC. Motivated by the practical applications described in the previous section, we also define the class of nonnegative l p -WSC. Definition 3: A code C is said to be a nonnegative l p -WSC with paraeters N,, K, d, B t if it is an l p -WSC such that all entries of C are nonnegative. Given the paraeters, K, d and B t, let N, K, d, B t denote the axiu size of a WSC, The asyptotic code exponent is defined as N, K, d, B t := ax N : C N,, K, d, B t φ}. R K, d, B t := li sup, K, d, B t. We are interested in quantifying the asyptotic code exponent of WSCs, and in particular, WESCs and nonnegative WSCs with p =. Results pertaining to these classes of codes are suarized in the next section. IV. ON THE CARDINALITY OF WSC FAMILIES The central proble of this paper is to deterine the existence of a superiposed code with certain paraeters. In [], [3], it was shown that for ESCs, for which the codeword alphabet B t is replaced by the asyetric set 0, }, one has 4K where o d converges to zero as K. + o d R K, d, } K + o d, The ain result of the paper is the upper and lower bounds on the asyptotic code exponents of several WSC failies. For WESCs, introducing weighting coefficients larger than one does not change the asyptotic order of the code exponent. Theore : Let t be a fixed paraeter. For sufficiently large K, the asyptotic code exponent of WESCs can be bounded as 4K + o R K, d, B t K + o t,d where o 0 and o t,d 0 as K. The exact expressions of the o and o t,d ters are given in Equations 9 and 7, respectively. Rear : The derivations leading to the expressions in Theore show that one can also bound the code exponent in a non-asyptotic regie. Unfortunately, those expressions are too coplicated for practical use.

5 Nevertheless, this observation iplies that the results pertaining to WESC are applicable for the sae paraeter regions as those arising in the context of CS theory. Rear : The paraeter t can also be allowed to increase with K. For WESCs, the value of t does not affect the lower bound on the asyptotic code exponent, while the upper bound is valid as long as t = o K. For clarity of exposition, the proof of the lower bound is postponed to Section VI, while the proof of the upper bound, along with the proofs of the upper bounds for other WSC failies, are presented in Section V. We briefly setch the ain steps of the proofs in the discussion that follows. The proof of the upper bound is based on the sphere pacing arguent. The classical sphere pacing arguent is valid for all WSC failies discussed in this paper. The leading ter of the resulting upper bound is /K. This result can be iproved when restricting one s attention to the Euclidean nor. The ey idea is to show that ost points of the for Cb lie in a ball of radius significantly saller than the one derived by the classic sphere pacing arguent. The leading ter of the upper bound can in this case be iproved fro /K to / K. The lower bound in Theore is proved by rando coding arguents. We first randoly generate a faily of WESCs fro the Gaussian enseble, with the code rates satisfying li,n < 4K Then we prove that these randoly generated codeboos satisfy d E C, K d + o. with high probability. This fact iplies that the asyptotic code exponent R K, d, B t = li sup 4K, K, d, B t + o. We also analyze two ore classes of WSCs: the class of general l -WSCs and the faily of nonnegative l -WSCs. The characterization of the asyptotic code rates of these codes is given in Theores and 3, respectively. Theore : For a fixed value of the paraeter t and sufficiently large K, the asyptotic code exponent of l -WSCs is bounded as 4K + o R K, d, B t K + o t,d, where the expressions for o and o t,d are given in Equations 3 and 6, respectively. oof: The lower bound is proved in Section VII, while the upper bound is proved in Section V. Theore 3: For a fixed value of the paraeter t and sufficiently large K, the asyptotic code exponent of nonnegative l -WSCs is bounded as 4K + o t R K, d, B t K + o t,d, 3 where the expressions for o t and o t,d are given by Equations 40 and 6, respectively. oof: The lower and upper bounds are proved in Sections VIII and V, respectively. Rear 3: The upper bounds in Equations and 3 also hold if one allows t to grow with K, so that t = o K. The lower bound in for general l -WSCs does not depend on the value of t. However, the lower bound 3 for nonnegative l -WSCs requires that t = o K /3 see Equation 40 for details. This difference in the convergence

6 regie of the two l -WSCs is a consequence of the use of different proof techniques. For the proof of the rate regie of general l -WSCs, Gaussian codeboos were used. On the other hand, for nonnegative l -WSCs, the analysis is coplicated by the fact that one has to analyze linear cobinations of nonnegative rando variables. To overcoe this difficulty, we used the Central Liit Theore and Berry-Essen type of distribution approxiations [4]. The obtained results depend on the value of t. Rear 4: The upper bound for WESCs is roughly one half of the corresponding bound for l -WSCs. This iproveent in the code exponent of WESCs rests on the fact that the l -nor of a vector can be expressed as an inner product, i.e. v = v v in other words, l is a Hilbert space. Other nored spaces considered in the paper lac this property, and at the present, we are not able to iprove the upper bounds for l p -WSCs with p. V. PROOF OF THE UPPER BOUNDS BY SPHERE PACKING ARGUMENTS It is straightforward to apply the sphere pacing arguent to upper bound the code exponents of WSCs. Regard an l p -WSC with arbitrary p Z +. The superposition Cb satisfies Cb p b 0 j= vij b p ij Kt for all b such that b 0 K, where the b ij s, j b 0 K, denote the nonzero entries of b. Note that the l p distance of any two superpositions is required to be at least d. The size of the l p -WSC codeboo, N, satisfies the sphere pacing bound K = N t A siple algebraic anipulation of the above equation shows so that one has K = N t K log + tk d = K log t N t K K tk + d. 4 d N K K t K, K log t log K N + t K log d + K log K N. The asyptotic code exponent is therefore upper bounded by where if t = o K. K + o t,d, 5 o t,d = log t d + K K 0 6 This sphere pacing bound can be significantly iproved when considering the Euclidean nor. The result is an upper bound with the leading ter / K. The proof is a generalization of the ideas used by Füredi and

K Ruszino in [3]: ost points of the for Cb lie in a ball with radius saller than 3 t +, and therefore the right hand side of the classic sphere pacing bound 5 can be reduced by a factor of two. To proceed, we assign to every b B K the probability B K = N t +. K = For a given codeword atrix C, define a rando variable ξ = Cb. We shall upper bound the probability ξ λµ}, for arbitrary λ, µ R +, via Marov s inequality ξ λµ E [ξ] E [ξ λµ ] λµ. We calculate E [ ξ ] as follows. For a given vector b, let I [, N] be its support set - i.e., the set of indices for which the entries of b are nonzero. Let b I be the vector coposed of the nonzero entries of b. Furtherore, define Then, E [ ξ ] = B K B t, = B t \ 0}. K = I = where i j I, j =,,. Note that b ij v ij I = b I B t, j= = b i j + I = = b I B t, I = b I B t, j= j= b i j } } + b I B t, l j b ij v ij j= b ij b il v i j v il I = b I B t, l j, b ij b il v i j v il. } } It is straightforward to evaluate the two sus in the above expression in closed for: N = N = N = b I B t, t b i j = j= b i B t, b i N t t t + t + ; 3 b I B t, b i 7

8 and N = l j N = t = 0, b il b ij v i l v ij b I B t, i j b il,b ij B t, b il b ij v i l v ij where the last equality follows fro the observation that b il b ij v i l v ij b il B t,,b ij B t, = b il b ij v i l v ij b il >0,b ij B t, + b il b ij v i l v ij b il >0,b ij B t, = 0. Consequently, one has so that b ij v ij I = b I B t, j= N t t + t + =, 6 E [ ξ ] = K N t t+t+ = 6 N. t K = Next, substitute E [ ξ ] into Marov s inequality, with µ = E [ξ ], so that for any λ >, it holds that ξ λµ λ. This result iplies that at least a /λ-fraction of all possible Cb vectors lie within an -diensional ball of radius λµ around the origin. As a result, one obtains a sphere pacing bound of the for λµ + d B K. λ d Note that µ = E [ ξ ] K 3 t +, and that N K K B K t K. K

9 Consequently, one has N K λ K K t K + λ t + d, or, equivalently, K log λ K + λ t + K log d log K N + K log t. Without loss of generality, we choose λ =. The asyptotic code exponent is therefore upper bounded by where K + o t,d, if t = o K. This proves the upper bound of Theore. o t,d = t + log + K 0 7 d K VI. PROOF OF THE LOWER BOUND FOR WESCS Siilarly as for the case of copressive sensing atrix design, we show that standard Gaussian rando atrices, with appropriate scaling, can be used as codeboos of WESCs. Let H R N be a standard Gaussian rando atrix, and let h j denote the j th colun of H. Let v j = h j / h j and C = [v v N ]. Then C is a codeboo with unit l -nor codewords. Now choose a δ > 0 such that d + δ <. Let E = N j= H : } h j δ, + δ be the event that the noralized l -nors of all the coluns of H concentrate around one. Let E = H : C b b d}. 9 B K b b B K In other words, E denotes the event that any two different superpositions of codewords lie at Euclidean distance at least d fro each other. In the following, we show that for any for which o is given by Equation 9, if then This will establish the lower bound of Theore. Note that R < 4K li,n + o, 8 R, 0 li E =.,N E E E = E E E c.

0 According to Theore 4, stated and proved in the next subsection, one has li E =.,N Thus, the desired relation holds if Observe that where li E E c = 0.,N C b b = Hb, b := Λ H b b, and Λ H = / h.... 3 / hn By Theore 9 in Section VI-B, in the asyptotic doain of 0, E H : H + δ b } d + δ } = E H : Hb d = E H : C b b d} 0. This establishes the lower bound of Theore. A. Colun Nors of H In this subsection, we quantify the rate regie in which the Euclidean nors of all coluns of H, when properly noralized, are concentrated around the value one with high probability. Theore 4: Let H R N be a standard Gaussian rando atrix, and h j be the j th colun of H. For a given δ 0,, for all j N. If, N siultaneously, so that then it holds that h j > δ exp 4 δ li,n < δ 4, N } li,n h j > δ = 0. j=

oof: The first part of this theore is proved by invoing large deviations techniques. Note that h j = i= H i,j is χ distributed. We have } H i,j > + δ i= a exp α + δ log E [e α Hi,j ]} = exp α + δ + } log α b = exp } δ log + δ, 4 and } H i,j < δ i= c exp α δ + log E = exp α δ } log + α [e α Hi,j ]} d = exp } log δ δ, 5 where a and c are obtained by applying Chernoff s inequality [5] and hold for arbitrary α > 0, and b and d are obtained by specializing α to δ +δ and δ δ, respectively. By observing that log δ δ > δ log + δ > 0, we arrive at } h j > δ } δ log + δ exp exp 4 δ. The second part of the claied result is proved by applying the union bound, i.e. N } h j > δ j= N exp 4 δ} } δ = exp 4 log +. This copletes the proof of Theore 4. B. The Distance Between Two Different Superpositions This section is devoted to identifying the rate regie in which any pair of different superpositions is at sufficiently large Euclidean distance. The ain result is presented in Theore 6 at the end of this subsection. Since the proof of this theore is rather technical, and since it involves coplicated notation, we first prove a siplified version

of the result, stated in Theore 5. Theore 5: Let H R N be a standard Gaussian rando atrix, and let δ 0, be fixed. For sufficiently large K, if li,n < 4K + o δ where the exact expression for o δ given by Equation 8, then li,n Hb δ for all b B N t such that b 0 K. oof: By the union bound, we have = K = b 0 K K = N We shall upper bound the probability H : b 0 = 4t } Hb δ H : } Hb δ = 0, 6 Hb δ, b 0 =. 7 Hb δ, b 0 = for each =,, K. Fro Chernoff s inequality, for all α > 0, it holds that Hb δ, b 0 = [ exp αδ + log E e αhi, b]}, where H i, is the i th row of the H atrix. Furtherore, [e αhi, b] E [ = E exp α b H i, b/ b }] a [ E exp α H i, b/ b }] b log + α, where a follows fro the fact that b for all b BN t such that b 0 =, and where b holds because H i, b/ b is a standard Gaussian rando variable. Let α = δ δ = δ.

3 Then Hb δ, b 0 = exp log δ } = exp δ log log δ + δ Substituting the above expression into the union bound gives } H : Hb δ K = K = Now, let K be sufficiently large so that b 0 K exp }. log log δ + δ log 4t } log exp + δ / log δ } log 4t. If where then 4K + δ / K log δ 4K = in K li,n log + δ / log δ < 4K + o δ,. o δ = log + δ / K log δ, 8 li,n Hb δ = 0 for all b Bt N such that b 0 K. This copletes the proof of the claied result. Based on Theore 5, the asyptotic region in which E c E 0 is characterized in below. Theore 6: Let H R N be a standard Gaussian rando atrix, and for a given H, let Λ H be as defined in 3. For a given d 0,, choose a δ > 0 such that d + δ <. Define the set E as in 8. For sufficiently large K, if where li,n < 4K + o o = log, 9

4 then for all pairs of b, b B K such that b b. E, since li,n HΛ H b b d, E = 0, 0 oof: The proof is analogous to that of Theore 4 with inor changes. Let b = Λ H b b. On the set h j + δ for all j N, the nonzero entries of + δ b satisfy + δ b i. Replace b in Theore 5 with + δ b. All the arguents in the proof of Theore 5 are still valid, except that the higher order ter is changed to This copletes the proof of the theore. o d+δ log + d + δ / K log d + δ = log. VII. PROOF OF THE LOWER BOUND FOR l -WSCS The proof is siilar to that of the lower bound for WESCs. Let A R N be a standard Gaussian rando atrix, and let H be the atrix with entries π H i,j = A i,j, i, j N. Once ore, let h j be the j th colun of H. Let v j = h j / h j and C = [v v N ]. Then C is a codeboo with unit l -nor codewords. Now choose a δ > 0 such that d + δ <. Let and E = E = We consider the asyptotic regie where N j= H : } h j δ, + δ, B K b b B K H : C b b d}. li,n R, R < 4K + o, and o d is given in Equation 3. Theore 7 in Section VII-A suggests that li E =,,N

5 while Theore 8 in Section VII-B shows that Therefore, This result iplies the lower bound of Theore. li E E c = 0.,N li E li E E E c =.,N,N A. Colun Nors of H The following theore quantifies the rate regie in which the l -nors of all coluns of H, with proper noralization, are concentrated around one with high probability. Theore 7: Let A R N be a standard Gaussian rando atrix. Let H be the atrix with entries π H i,j = A i,j, i, j N. Let h j be the j th colun of H. For a given δ 0,, for soe positive constant c and c ; Let, N siultaneously, with The it holds that oof: h j > δ c e cδ li,n < c δ. N } li,n h j > δ = 0. j= Since A i,j is a standard Gaussian rando variable, A i,j is a Subgaussian distributed rando variable, and E [ A i,j ] = π. According to oposition in Appendix A, A i,j π is a Subgaussian rando variable with zero ean. A direct application of Theore stated in Appendix A gives h j > δ = A i,j > δ π π which proves clai. i= c exp c δ,

6 This part is proved by using the union bound: first, note that N } h j > δ j= exp c δ + log c + = exp c δ log c }. If then one has li,n < c δ, N } li,n h j > δ = 0. j= This copletes the proof of clai. B. The Distance Between Two Different Superpositions Siilarly to the analysis perfored for WESCs, we start with a proof of a siplified version of the result needed in order to siplify tedious notation. We then explain how to establish the proof of Theore 9 by odifying soe of the steps of the siplified theore. Theore 8: Let A R N be a standard Gaussian rando atrix. Let H be the atrix with entries π H i,j = A i,j, i, j N. Let δ 0, be given. For sufficiently large K, if where then for all b B N t such that b 0 K. li,n < 4K + o δ, oof: The proof starts by using the union bound, as } H : Hb δ K = o δ = log π δ, 3 li,n Hb δ = 0 4 b 0 K N 4t To estiate the above upper bound, we have to upper bound the probability Hb δ, b 0 =, Hb δ, b 0 =. 5

7 for each =,, K. Let us derive next an expression for such an upper bound that holds for arbitrary values of. Note that E [e α bjai,j ] j= = a = 0 = e α b b = e α b e α b = π e x e α b x dx 0 e x b e αx dx π b exp x + α b π 0 α b α b π exp x α b x π exp e α b e α b α b π dx dx x dx c, 6 α π where a and b follow fro the change of variables x = x/ b and x = x + α b, respectively. Inequality c holds based on the assuption that b. As a result, b j H i,j i= j δ = b j A i,j i= j= δ π exp α δ + log E [e α bjhj ]} j π exp α δ } + log π π α = exp α δ } π log α π π } = exp log, 7 δ where the last equality is obtained by specializing α = π/ δ. The upper bound in 7 is useful only when it is less than one, or equivalently, π log >. 8 δ For any δ 0,, if 4, inequality 8 holds. Thus, for any 4, N 4t Hb δ, b 0 = log exp + o log 4t δ 0, } 9

8 as, N with where li,n < log + o δ, o δ = log π log δ. Another upper bound is needed for =,, 3. For a fixed taing one of these values, Hb δ = A i,j b j i j < δ π = A i,j b j i j b < δ b. π π It is straightforward to verify that j A i,jb j is Gaussian and that E A i,j b j = b. π Thus j A i,j b j b π i j [, ] is a su of independent zero-ean subgaussian rando variables. Furtherore, b t and therefore, δ b < 0. Hence, we can apply Theore of Appendix A: as a result, there exist positive constants c 3, and c 4, such that Hb δ c 4, δ b c 3, exp c 3, exp c 4, δ. Note that the values of c 3, and c 4, depend on. Consequently, N 4t Hb δ, b 0 = c 3, exp c 4, δ 0 log 4t } 30 as, N with li,n < c 4, δ.

9 Finally, substitute the upper bounds of 9 and 30 into the union bound of Equation 5. If K is large enough so that and if 4K + o δ < c 4, δ for all =,, 3, 4K + o δ in 4 K where o δ is as given in 3, then the desired result 4 holds. log + o δ, Based on Theore 8, we are ready to characterize the asyptotic region in which E c E 0. Theore 9: Define A and H as in Theore 9. For a given H, define the diagonal atrix / h 0... Λ H = 0 / h N For a given d 0,, choose a δ > 0 such that d + δ <. Define the set E as in. For sufficiently large K, if where then li,n for all pairs of b, b B K such that b b. < 4K + o,. o = log π log, 3 li,n HΛ H b b d, E = 0 oof: Let b = Λ H b b. On the set E, since h j + δ, for all j N, all the nonzero entries of + δ b satisfy + δ b i. Replace b in Theore 8 with + δ b. All arguents used in the proof of Theore 8 are still valid, except that now, the higher order ter 3 in the asyptotic expression reads as This copletes the proof. o d+δ = log π log log d + δ log π log. VIII. PROOF OF THE LOWER BOUND FOR NONNEGATIVE l -WSCS The proof follows along the sae lines as the one described for l -WSCs. However, there is a serious technical difficulty associated with the analysis of nonnegative l -WSCs. Let A R N be a standard Gaussian rando

0 atrix. For general l -WSCs, we let and therefore, H i,j = π A i,j, i, j N, N H i,j b j j= is a Gaussian rando variable, whose paraeters are easy to deterine. However, for nonnegative l -WSCs, one has to set H i,j = π A i,j, i, j N. 3 Since the rando variables H i,j s are not Gaussian, but rather one-sided Gaussian, N H i,j b j is not Gaussian distributed, and it is coplicated to exactly characterize its properties. j= Nevertheless, we can still define E and E as in Equations and. The results of Theore 7 are still valid under the non-negativity assuption: the nors of all H coluns concentrate around one in the asyptotic regie described in Theore 7. The ey step in the proof of the lower bound is to identify the asyptotic region in which any two different superpositions are sufficiently separated in ters of the l -distance. We therefore use an approach siilar to the one we invoed twice before: we first prove a siplified version of the clai, and then proceed with proving the needed result by introducing soe auxiliary variables and notation. Theore 0: Let A R N be a standard Gaussian rando atrix. Let H be the atrix with entries π H i,j = A i,j, i, j N. Let δ 0, be given. For a given sufficiently large K, if where o t is given in 39, then for all b B N t and b 0 K. li,n < 4K + o t li,n Hb δ = 0 33 oof: Siilarly as for the corresponding proof for general l -WSCs, we need a tight upper bound on the oent generation function of the rando variable b j A i,j. j= For this purpose, we resort to the use of the Central Liit Theore. We first approxiate the distribution of j= b j A i,j by a Gaussian distribution. Then, we uniforly upper bound the approxiation error according to the Berry-Esseen Theore see [4] and Appendix B for an overview of this theory. Based on this approxiation, we obtain an upper bound on the oent generating function, with leading ter log / see Equation 38 for details.

To siplify the notation, for a b Bt N with b 0 =, let N π Y b, = A j b j, where A j s are standard Gaussian rando variables. Then, Hb δ, b 0 = [ exp αδ + log E e α Yb, ]}, j= where the inequality holds for all α > 0. Now, we fix α and upper bound the oent generating function as follows. Note that E [e α Yb, ] [ = E e α Yb,, Y b, α log ] 34 [ + E e α Yb,, Y b, < α log ]. 35 The first ter 34 is upper bounded by [ E e α log α, Y b, α log ] [ E, Y b, α log ] Y b, α log. 36 In order to upper bound the second ter in Equation 35, we apply Lea fro the Appendix, proved using the Central Liit Theore and the Berry-Esseen result: [ E, Y b, < α log ] = Y b, < α log = b j A j j= < π α log log + t3 E [ A 3] π απ = π log απ + 48t3. 37 Cobining the upper bounds in 36 and 37 shows that [ E e α Yb, ] + log π απ + 48t3 + log 4α + 4t3. 38

Next, set α = /δ. Then where Now we choose a 0 Z + such that for all 0, Hb δ, b 0 = } exp log + o t,, + log + log 4 + 4t 3 o t, =. log log + o t, > 0. It is straightforward to verify that 0 is well defined. Consider the case when 0. It can be verified that is Subgaussian and that π b j H i,j = j= b j A i,j j= E b j H i,j j= for all b Bt N such that b 0 =. By applying the large deviations result for Subgaussian rando variables, as stated in Theore, and the union bound, it can be proved that there exists a c > 0 such that } H : Hb δ exp 0. b 0 = log 4t c The above result holds whenever, N siultaneously, with and Finally, let K be sufficiently large so that li,n < c. } 4K + o log t,k in 0 K + o t,, 4K + o t,k in 0 c.

3 Then as, N with where K = b 0 K N 0 exp = K + 0, = 0+ H : 4t } Hb δ Hb δ, b 0 = log 4t c exp log 4t li,n } log + o t, } < 4K + o t, + log + 4 + 4t 3 o t =. 39 Based on Theore 0, we can characterize the rate region in which any two distinct superpositions are sufficiently separated in the l space. Theore : Define A and H as in Theore 0. For a given H, define the diagonal atrix / h 0... Λ H = 0 / h N Also, for d 0,, choose a δ 0, such that d + δ <. Define the set E as in. ovided that K is sufficiently large, if where then it holds li,n for all pairs of b, b B K such that b b. < 4K + o t. + log + 4 + 648t 3 o t =, 40 li,n HΛ H b b d, E = 0, oof: The proof is very siilar to that of Theore 0. The only difference is the following. Let b = Λ H b b. Since δ h j + δ 3,

4 all the nonzero entries of + δ b on the set E satisfy the following inequality + δ b i 3t. As a result, we have the higher order ter o t as given in Equation 40. IX. CONCLUSIONS We introduced a new faily of codes over the reals, tered weighted superiposed codes. Weighted superiposed codes can be applied to all probles in which one sees to robustly distinguish between bounded integer valued linear cobinations of codewords that obey predefined nor and sign constraints. As such, they can be seen as a special instant of copressed sensing schees in which the sparse sensing vectors contain entries fro a syetric, bounded set of integers. We characterized the achievable rate regions of three classes of weighted superiposed codes, for which the codewords obey l, l, and non-negativity constraints. APPENDIX A. Subgaussian Rando Variables Definition 4 The Subgaussian and Subexponential distributions: A rando variable X is said to be Subgaussian if there exist positive constants c and c such that X > x c e cx x > 0. It is Subexponential if there exist positive constants c and c such that X > x c e cx x > 0. Lea Moent Generating Function: Let X be a zero-ean rando variable. Then, the following two stateents are equivalent. X is Subgaussian. c such that E [ e αx] e cα, α 0. Theore : Let X,, X n be independent Subgaussian rando variables with zero ean. For any given a,, a n R, a X is a Subgaussian rando variable. Furtherore, there exist positive constants c and c such that where a = a. a X > x c e cx / a, x > 0, oof: See [6, Lecture 5, Theore 5 and Corollary 6]. We prove next a result that asserts that translating a Subgaussian rando variable produces another Subgaussian rando variable. oposition : Let X be a Subgaussian rando variable. For any given a R, Y = X + a is a Subgaussian rando variable as well. oof: It can be verified that for any y R, y a y a,

5 and y + a y a. Now for y > a, Y > y = X + a > y + X + a < y X > y a + X < y a. 4 When a > 0, 4 X > y a c e cy a c c ca e cy /. 4 When a 0, 4 X > y + a c e cy+a c c ca e cy /. 43 Cobining Equations 4 and 43, one can show that Y > y c c ca e cy /, y > a. On the other hand, Y y e ca / e cy /, y a. Let c 3 = ax c e ca, e ca / and c 4 = c /. Then This proves the claied result. Y > y c 3 e c4y. B. The Berry-Esseen Theore and Its Consequence The Central Liit Theore CLT states that under certain conditions, an appropriately noralized su of independent rando variables converges wealy to the standard Gaussian distribution. The Berry-Esseen theore quantifies the rate at which this convergence taes place. Theore 3 The Berry-Esseen Theore: Let X, X,..., X be independent rando variables such that E [X i ] = 0, E [ Xi ] = σ i, E [ X i 3 ] = ρ i. Also, let s = σ + + σ, and r = ρ + + ρ. Denote by F the cuulative distribution function of the noralized su X + + X /s, and by N standard Gaussian distribution. Then for all x and, the F x N x 6 r s 3.

6 The Berry-Esseen theore is used in the proof of the lower bound for the achievable rate region of nonnegative l -WSCs. In the proof, one need to identify a tight bound on the probability of a weighted su of nonnegative rando variables. The probability of this su lying in a given interval can be estiated by the Berry-Esseen, as suarized in the following lea. Lea : Assue that b Bt is such that b 0 =, let X, X,, X be independent standard Gaussian rando variables. For a given positive constant c > 0, one has b j X j < c log c log + ρ t3, π [ where ρ := E X 3]. j= oof: This lea is proved by applying the Berry-Essen theore. Note that the b j X j s are independent rando variables. Their su j= b j X j can be approxiated by a Gaussian rando variable with properly chosen ean and variance, according to the Central Liit Theore. In the proof, we first use the Gaussian approxiation to estiate the probability b j X j < c log. j= Then we subsequently eploy the Berry-Essen theore to upper bound the approxiation error. To siplify notation, let Y b, = b j X j, and let N x denote, as before, the standard Gaussian distribution. Then, Y b, < c log Y b, c log, c log j b j j b j j b j j= Y b, j b j c log, c log Y b, c log Y b, c log, b b where in the second inequality we used the fact that b j, so that j= b j. According to Theore 3, for all x R and all, Yb, x N x b j= 6ρ b j 3 j= b j 3/ 6ρt3 3/ = 6ρt3, since j= b j 3 t 3, and j= b j.

7 Thus, N Y b, < c log c log + 6ρt3 N c log + 6ρt3 which copletes the proof of the claied result. c log + ρt3, π REFERENCES [] W. Kautz and R. Singleton, Nonrando binary superiposed codes, IEEE Trans. Infor. Theory, vol. 0, no. 4, pp. 363 377, 964. [] T. Ericson and L. Györfi, Superiposed codes in R n, IEEE Trans. Infor. Theory, vol. 34, no. 4, pp. 877 880, 988. [3] Z. Füredi and M. Ruszinó, An iproved upper bound of the rate of Euclidean superiposed codes, IEEE Trans. Infor. Theory, vol. 45, no., pp. 799 80, 999. [4] D. Donoho, Copressed sensing, IEEE Trans. Infor. Theory, vol. 5, no. 4, pp. 89 306, 006. [5] E. Candès, J. Roberg, and T. Tao, Robust uncertainty principles: exact signal reconstruction fro highly incoplete frequency inforation, IEEE Trans. Infor. Theory, vol. 5, no., pp. 489 509, 006. [6] E. J. Candès, J. K. Roberg, and T. Tao, Stable signal recovery fro incoplete and inaccurate easureents, Co. Pure Appl. Math., vol. 59, no. 8, pp. 07 3, 006. [7] E. J. Candès and T. Tao, Near-optial signal recovery fro rando projections: Universal encoding strategies? IEEE Trans. Infor. Theory, vol. 5, no., pp. 5406 545, 006. [8] G. Corode and S. Muthurishnan, What s hot and what s not: Tracing ost frequent ites dynaically, IEEE Trans. Infor. Theory, vol. 50, no. 0, pp. 3 4, 004. [9] W. Dai and O. Milenovic, Weighted euclidean superiposed codes for integer copressed sensing, in IEEE Inforation Theory Worshop ITW, 008, subitted. [0], Constrained copressed sensing via superiposed coding, in Inforation Theory and Applications Worshop, San Diego, CA, invited tal, Jan. 008. [], Sparse weighted euclidean superiposed coding for integer copressed sensing, in Conference on Infoation Sciences and Systes CISS, 008, subitted. [] M. Sheih, O. Milenovic, and R. Baraniu, Designing copressive sensing DNA icroarrays, oceedings of the IEEE Worshop on Coputational Advances in Multi-Sensor Adaptive ocessing CAMSAP, St. Thoas, U.S. Virgin Islands, Dec. 007. [3] W. Dai, M. Sheih, O. Milenovic, and R. Baraniu, obe designs for copressed sensing icroarrays, in IEEE International Conference on Bioinforatics and Bioedicine, Philadelphia, PA, subitted, 008. [4] W. Feller, An Introduction to obability Theory and Its Applications, Volue, nd ed. Wiley, 97. [5] [6] R. Vershynin, Non-asyptotic Rando Matrix Theory Lecture Notes, 007.