Finding a sparse vector in a subspace: linear sparsity using alternating directions


Qing Qu, Student Member, IEEE, Ju Sun, Student Member, IEEE, and John Wright, Member, IEEE

Abstract: Is it possible to find the sparsest vector (direction) in a generic subspace S ⊆ R^p with dim(S) = n < p? This problem can be considered a homogeneous variant of the sparse recovery problem, and it finds connections to sparse dictionary learning, sparse PCA, and many other problems in signal processing and machine learning. In this paper, we focus on a planted sparse model for the subspace: the target sparse vector is embedded in an otherwise random subspace. Simple convex heuristics for this planted recovery problem provably break down when the fraction of nonzero entries in the target sparse vector substantially exceeds O(1/√n). In contrast, we exhibit a relatively simple nonconvex approach based on alternating directions, which provably succeeds even when the fraction of nonzero entries is Ω(1). To the best of our knowledge, this is the first practical algorithm to achieve linear scaling under the planted sparse model. Empirically, our proposed algorithm also succeeds in more challenging data models, e.g., sparse dictionary learning.

Index Terms: Sparse vector, subspace modeling, sparse recovery, homogeneous recovery, dictionary learning, nonconvex optimization, alternating direction method.

I. INTRODUCTION

Suppose that a linear subspace S embedded in R^p contains a sparse vector x_0 ≠ 0. Given an arbitrary basis of S, can we efficiently recover x_0 (up to scaling)? Equivalently, provided a matrix A ∈ R^{(p−n)×p} with Null(A) = S,¹ can we efficiently find a nonzero sparse vector x such that Ax = 0? In the language of sparse recovery, can we solve

  min_x ||x||_0   s.t.   Ax = 0, x ≠ 0?   (I.1)

In contrast to the standard sparse recovery problem (Ax = b, b ≠ 0), for which convex relaxations perform nearly optimally for broad classes of designs A [2], [3], the computational properties of problem (I.1) are not nearly as well understood. It has been known for several decades that the basic formulation

  min_x ||x||_0   s.t.   x ∈ S \ {0}   (I.2)

is NP-hard for an arbitrary subspace [4], [5]. In this paper, we assume a specific random planted sparse model for the subspace S: a target sparse vector is embedded in an otherwise random subspace. We will show that, under this specific random model, problem (I.2) is tractable via an efficient algorithm based on nonconvex optimization.

(This work was partially supported by grants from ONR and NSF, and by funding from the Moore and Sloan Foundations. Q. Qu, J. Sun, and J. Wright are with the Electrical Engineering Department, Columbia University, New York, NY 10027, USA ({qq2105, js4038, jw2966}@columbia.edu). This paper is an extension of our previous conference version [1].)

¹Null(A) = {x ∈ R^p : Ax = 0} denotes the null space of A.

A. Motivation

The general version of problem (I.2), in which S can be an arbitrary subspace, takes several forms in numerical computation and computer science, and underlies several important problems in modern signal processing and machine learning. Below we provide a sample of these applications.

Sparse Null Space and Matrix Sparsification: The sparse null space problem asks for the sparsest matrix N whose columns span the null space of a given matrix A. The problem arises in the context of solving linear equality problems in constrained optimization [5], null space methods for quadratic programming [6], and solving underdetermined linear equations [7]. The matrix sparsification problem is of a similar flavor: the task is to find the sparsest matrix B that is equivalent to a given full-rank matrix A under elementary column operations. Sparsity helps simplify many fundamental matrix operations (see [8]), and the problem has applications in areas such as machine learning [9] and in discovering cycle bases of graphs [10]. [11] discusses connections between the two problems, and also to other problems in complexity theory.

Sparse (Complete) Dictionary Learning: In dictionary learning, given a data matrix Y, one seeks an approximation Y ≈ AX, such that A is a representation dictionary with certain desired structure and X collects the representation coefficients with maximal sparsity. Such a compact representation naturally allows signal compression, and also facilitates efficient signal acquisition and classification (see the relevant discussion in [12]). When A is required to be complete (i.e., square and invertible), linear algebra gives row(Y) = row(X).² The problem then reduces to finding the sparsest vectors (directions) in the known subspace row(Y), i.e., (I.2). Insights into this problem have led to new theoretical developments on complete dictionary learning [13]-[15].

²Here row(·) denotes the row space.

Sparse Principal Component Analysis (Sparse PCA): In geometric terms, sparse PCA (see, e.g., [16]-[18] for early developments and [19], [20] for discussion of recent results) concerns the stable estimation of a linear subspace spanned by a sparse basis, in the data-poor regime, i.e., when the available data are not numerous enough to allow one to decouple the subspace estimation and sparsification tasks. Formally, given a data matrix

  Z = U_0 X_0 + E,³

where Z ∈ R^{p×n} collects n data points column-wise, U_0 ∈ R^{p×r} is the sparse basis, and E is a noise matrix, one is asked to estimate U_0 up to sign, scale, and permutation. Such a factorization finds applications in gene expression, financial data analysis, and pattern recognition [24].⁴ When the subspace is known (say, by the PCA estimator with enough data samples), the problem again reduces to instances of (I.2), and is already nontrivial. The full geometric sparse PCA can be treated as finding sparse vectors in a subspace that is subject to perturbation.

³Variants of multiple-component formulations often add an additional orthonormality constraint on U_0, but involve a different notion of sparsity; see, e.g., [16], [23].
⁴[24] has also discussed this data-rich sparse PCA setting.

In addition, variants and generalizations of problem (I.2) have also been studied in applications regarding control and optimization [25], nonrigid structure from motion [26], spectral estimation and Prony's problem [27], outlier rejection in PCA [28], blind source separation [29], graphical model learning [30], and sparse coding on manifolds [31]; see also [32] and the references therein.

B. Prior Arts

Despite these potential applications of problem (I.2), it is only very recently that efficient computational surrogates with nontrivial recovery guarantees have been discovered for some cases of practical interest. In the context of sparse dictionary learning, Spielman et al. [13] introduced a convex relaxation which replaces the nonconvex problem (I.2) with a sequence of linear programs:

  (ℓ¹/ℓ^∞ relaxation)   min_x ||x||_1   s.t.   x(i) = 1, x ∈ S.   (I.3)

They proved that, when S is generated as the span of n random sparse vectors, with high probability (w.h.p.) the relaxation recovers these vectors, provided the probability of an entry being nonzero is at most θ ∈ O(1/√n). In the planted sparse model, in which S is formed as the direct sum of a single sparse vector x_0 and a generic subspace, Hand and Demanet proved that (I.3) also correctly recovers x_0, provided the fraction of nonzeros in x_0 scales as θ ∈ O(1/√n) [14].

One might imagine improving these results by tightening the analyses. Unfortunately, the results of [13], [14] are essentially sharp: when θ substantially exceeds Ω(1/√n), in both models the relaxation (I.3) provably breaks down. Moreover, the most natural semidefinite programming (SDP) relaxation of (I.1),

  min_X ||X||_1   s.t.   ⟨A^*A, X⟩ = 0, trace[X] = 1, X ⪰ 0,   (I.4)

also breaks down at exactly the same threshold of θ ∼ O(1/√n).⁵

One might naturally conjecture that this 1/√n threshold is simply an intrinsic price we must pay for having an efficient algorithm, even in these random models. Some evidence towards this conjecture might be borrowed from the superficial similarity of (I.1)-(I.4) and sparse PCA [16]. In sparse PCA, there is a substantial gap between what can be achieved with currently available efficient algorithms and the information-theoretic optimum [19], [33]. Is this also the case for recovering a sparse vector in a subspace? Is θ ∈ O(1/√n) simply the best we can do with efficient, guaranteed algorithms?

Remarkably, this is not the case. Recently, Barak et al. introduced a new rounding technique for sum-of-squares relaxations, and showed that the sparse vector x_0 in the planted sparse model can be recovered when p ≥ Ω(n²) and θ = Ω(1) [34]. It is perhaps surprising that this is possible at all with a polynomial-time algorithm. Unfortunately, the runtime of this approach is a high-degree polynomial in p (see Table I); for machine learning problems, in which p is often either the feature dimension or the sample size, this algorithm is mostly of theoretical interest only. However, it raises an interesting algorithmic question: Is there a practical algorithm that provably recovers a sparse vector with a θ ≫ 1/√n portion of nonzeros from a generic subspace S?

C. Contributions and Recent Developments

In this paper, we address the above problem under the planted sparse model. We allow x_0 to have up to θ_0 p nonzero entries, where θ_0 ∈ (0, 1) is a constant.

⁵This breakdown behavior is again in sharp contrast to the standard sparse approximation problem (with b ≠ 0), in which it is possible to handle very large fractions of nonzeros (say, θ = Ω(1/log n), or even θ = Ω(1)) using a very simple ℓ¹ relaxation [2], [3].

TABLE I
COMPARISON OF EXISTING METHODS FOR RECOVERING A PLANTED SPARSE VECTOR IN A SUBSPACE

  Method                    | Recovery Condition                  | Time Complexity
  ℓ¹/ℓ^∞ relaxation [14]    | θ ∈ O(1/√n)                         | O(n³ p log(1/ε))
  SDP relaxation            | θ ∈ O(1/√n)                         | O(p^{3.5} log(1/ε))
  SOS relaxation [34]       | p ≥ Ω(n²), θ ∈ O(1)                 | ∼O(p⁷ log(1/ε))
  Spectral method [35]      | p ≥ Ω(n² poly log(n)), θ ∈ O(1)     | ∼O(n p² log(1/ε))
  This work                 | p ≥ Ω(n⁴ log n), θ ∈ O(1)           | O(n⁵ p log n + n³ p log(1/ε))

We provide a relatively simple algorithm which, w.h.p., exactly recovers x_0, provided that p ≥ Ω(n⁴ log n). A comparison of our results with existing methods is shown in Table I. After the initial submission of our paper, Hopkins et al. [35] proposed a different simple algorithm based on the spectral method. This algorithm also guarantees recovery of the planted sparse vector up to linear sparsity, whenever p ≥ Ω(n² poly log(n)), and comes with better time complexity.⁸

Our algorithm is based on alternating directions, with two special twists. First, we introduce a special data-driven initialization, which seems to be important for achieving θ = Ω(1). Second, our theoretical results require a second, linear-programming-based rounding phase, which is similar to [13]. Our core algorithm has very simple iterations of linear complexity in the size of the data, and hence should be scalable to moderate-to-large-scale problems.

Besides enjoying the θ ∈ Ω(1) guarantee that is out of the reach of previous practical algorithms, our algorithm performs well in simulations, empirically succeeding with p ≥ Ω(n poly log(n)). It also performs well empirically on more challenging data models, e.g., the complete dictionary learning model, in which the subspace of interest contains not one, but n, random target sparse vectors. This is encouraging, as breaking the O(1/√n) sparsity barrier with a practical algorithm and optimal guarantee is an important problem in theoretical dictionary learning [36]-[40]. In this regard, our recent work [15] presents an efficient algorithm based on Riemannian optimization that guarantees recovery up to linear sparsity. However, that result is based on different ideas: a different nonconvex formulation, optimization algorithm, and analysis methodology.

⁸Despite these improved guarantees in the planted sparse model, our method still produces more appealing results on real imagery data; see Section V-B for examples.

D. Paper Organization, Notations, and Reproducible Research

The rest of the paper is organized as follows. In Section II, we provide a nonconvex formulation and show its capability of recovering the sparse vector. Section III introduces the alternating direction algorithm. In Section IV, we present our main results and sketch the proof ideas. Experimental evaluation of our method is provided in Section V. We conclude the paper by drawing connections to related work and discussing potential improvements in Section VI. Full proofs are all deferred to the appendix sections.

For a matrix X, we use x_i and x^j to denote its i-th column and j-th row, respectively, both in column vector form. We use x(i) to denote the i-th component of a vector x. We use the compact notation [k] = {1, ..., k} for any positive integer k, and use c or C, and their indexed versions, to denote absolute numerical constants. The scope of these constants is always local, namely within a particular lemma, proposition, or proof, so that the apparently same constant may carry different values in different contexts. For probability events, sometimes we will just say the event holds with high probability (w.h.p.) if the probability of failure is dominated by p^{−κ} for some κ > 0.

The code to reproduce all figures and experimental results can be found online at https://github.com/sunju/psv.

II. PROBLEM FORMULATION AND GLOBAL OPTIMALITY

We study the problem of recovering a sparse vector x_0 ≠ 0 (up to scale) which is an element of a known subspace S ⊆ R^p of dimension n, provided an arbitrary orthonormal basis Y ∈ R^{p×n} for S. Our starting point is the nonconvex formulation (I.2). Both the objective and the constraint set are nonconvex, and hence (I.2) is not easy to optimize over directly. We relax (I.2) by replacing the ℓ⁰ norm with the ℓ¹ norm.

For the constraint x ≠ 0, since in most applications we only care about the solution up to scaling, it is natural to force x to live on the unit sphere, giving

  min_x ||x||_1   s.t.   x ∈ S, ||x||_2 = 1.   (II.1)

This formulation is still nonconvex, and for general nonconvex problems it is known to be NP-hard to find even a local minimizer [41]. Nevertheless, the geometry of the sphere is benign enough that, for well-structured inputs, it actually will be possible to give algorithms that find the global optimizer.

The formulation (II.1) can be contrasted with (I.3), in which we effectively optimize the ℓ¹ norm subject to the constraint ||x||_∞ = 1: because the set {x : ||x||_∞ = 1} is polyhedral, the ℓ^∞-constrained problem immediately yields a sequence of linear programs. This is very convenient for computation and analysis. However, it suffers from the aforementioned breakdown behavior around ||x_0||_0 ∼ p/√n. In contrast, though the sphere ||x||_2 = 1 is a more complicated geometric constraint, it will allow a much larger number of nonzeros in x_0. Indeed, consider the global optimizer of a reformulation of (II.1):

  min_{q ∈ R^n} ||Yq||_1   s.t.   ||q||_2 = 1,   (II.2)

where Y is any orthonormal basis for S. A sufficient condition that guarantees exact recovery under the planted sparse model for the subspace is as follows:

Theorem II.1 (ℓ¹/ℓ² recovery, planted sparse model). There exists a constant θ_0 > 0 such that, if the subspace S follows the planted sparse model

  S = span{x_0, g_1, ..., g_{n−1}} ⊂ R^p,

where g_i ∼_iid N(0, (1/p) I_p) and x_0(k) ∼_iid (1/√(θp)) Ber(θ) are all jointly independent and 1/√n < θ < θ_0, then the unique (up to sign) optimizer q⋆ of (II.2), for any orthonormal basis Y of S, produces Yq⋆ = ξ x_0 for some ξ ≠ 0, with probability at least 1 − cp^{−2}, provided p ≥ Cn. Here c and C are positive constants.

Hence, if we could find the global optimizer of (II.2), we would be able to recover x_0 even when its number of nonzero entries is quite large, linear in the dimension p (θ = Ω(1)). On the other hand, it is not obvious that this should be possible: (II.2) is nonconvex. In the next section, we describe a simple heuristic algorithm for approximately solving a relaxed version of the ℓ¹/ℓ² problem (II.2). More surprisingly, we will then prove that, for a class of random problem instances, this algorithm plus an auxiliary rounding technique actually recovers the global optimizer, the target sparse vector x_0. The proof requires a detailed probabilistic analysis, which is sketched in Section IV-B.

Before continuing, it is worth noting that the formulation (II.2) is in no way novel: see, e.g., the work [29] in blind source separation for precedent. However, our algorithms and subsequent analysis are novel.

III. ALGORITHM BASED ON THE ALTERNATING DIRECTION METHOD (ADM)

To develop an algorithm for solving (II.2), it is useful to consider a slight relaxation of (II.2), in which we introduce an auxiliary variable x ≈ Yq:

  min_{q,x} f(q, x) := (1/2) ||Yq − x||_2^2 + λ ||x||_1   s.t.   ||q||_2 = 1.   (III.1)

Here λ > 0 is a penalty parameter. It is not difficult to see that this problem is equivalent to minimizing the Huber M-estimator over Yq (a short verification follows below). This relaxation makes it possible to apply the alternating direction method to the problem. This method starts from some initial point q(0) and alternates between optimizing with respect to (w.r.t.) x and optimizing w.r.t. q:

  x(k+1) = arg min_x (1/2) ||Yq(k) − x||_2^2 + λ||x||_1,   (III.2)
  q(k+1) = arg min_{q ∈ S^{n−1}} (1/2) ||Yq − x(k+1)||_2^2,   (III.3)

where x(k) and q(k) denote the values of x and q in the k-th iteration. Both (III.2) and (III.3) have simple closed-form solutions:

  x(k+1) = S_λ[Yq(k)],   q(k+1) = Y^* x(k+1) / ||Y^* x(k+1)||_2,   (III.4)

where S_λ[x] = sign(x) max{|x| − λ, 0} is the entrywise soft-thresholding operator.
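The equivalence with the Huber M-estimator claimed above can be checked coordinatewise. The following one-line derivation is our own addition for clarity, with y standing for a single entry of Yq:

```latex
% For fixed q, the x-subproblem in (III.1) separates over coordinates:
%   min_x  (1/2)(y - x)^2 + \lambda|x|,   where  y = (Yq)(i).
% Its minimizer is soft-thresholding, and the attained value is the
% Huber function of y; so minimizing (III.1) over x leaves exactly the
% Huber M-estimator evaluated at Yq.
\[
\min_{x\in\mathbb{R}} \tfrac12 (y-x)^2 + \lambda|x|
  = \begin{cases}
      \tfrac12 y^2,                   & |y|\le\lambda,\\
      \lambda|y| - \tfrac12\lambda^2, & |y|>\lambda,
    \end{cases}
\qquad
x^\star = S_\lambda[y] = \operatorname{sign}(y)\max\{|y|-\lambda,\,0\}.
\]
```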
The proposed ADM algorithm is summarized as Algorithm 1; a minimal code sketch of the iteration follows below.

Algorithm 1: Nonconvex ADM algorithm
  Input: a matrix Y ∈ R^{p×n} with Y^*Y = I, an initialization q(0), a threshold parameter λ > 0.
  Output: the recovered sparse vector x̂_0 = Yq(k).
  1: for k = 0, ..., O(n⁴ log n) do
  2:   x(k+1) = S_λ[Yq(k)]
  3:   q(k+1) = Y^* x(k+1) / ||Y^* x(k+1)||_2
  4: end for

The algorithm is simple to state and easy to implement. However, if our goal is to recover the sparsest vector x_0, some additional tricks are needed.
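For concreteness, here is a minimal NumPy sketch of Algorithm 1 and of the row-based initializations suggested below. The function names, the fixed iteration budget standing in for the O(n⁴ log n) bound, and the degenerate-case guard are our own choices; this is an illustrative sketch, not the authors' reference implementation (the released code lives at https://github.com/sunju/psv).

```python
import numpy as np

def soft_threshold(v, lam):
    """Entrywise soft-thresholding operator S_lambda[v]."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def adm_sparse_vector(Y, q0, lam=None, n_iters=10**4):
    """Nonconvex ADM (Algorithm 1): alternate the closed-form x-step (III.2)
    and q-step (III.3), keeping q on the unit sphere.

    Y  : (p, n) matrix with orthonormal columns spanning the subspace S.
    q0 : initialization in R^n (will be normalized).
    """
    p, n = Y.shape
    lam = 1.0 / np.sqrt(p) if lam is None else lam
    q = q0 / np.linalg.norm(q0)
    for _ in range(n_iters):
        x = soft_threshold(Y @ q, lam)   # x-step: S_lambda[Y q]
        w = Y.T @ x                      # q-step numerator: Y* x
        nw = np.linalg.norm(w)
        if nw == 0.0:                    # every entry was thresholded away
            break
        q = w / nw
    return q

def row_initializations(Y):
    """Every normalized row of Y, as suggested in Section III; the i-th row
    of the returned array is the candidate q(0) from the i-th row of Y."""
    norms = np.linalg.norm(Y, axis=1, keepdims=True)
    return Y / np.maximum(norms, 1e-12)
```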

Initialization. Because the problem (II.1) is nonconvex, an arbitrary or random initialization may not produce a global minimizer.⁹ In fact, good initializations are critical for the proposed ADM algorithm to succeed in the linear sparsity regime. For this purpose, we suggest using every normalized row of Y as an initialization for q, and solving a sequence of nonconvex programs (II.2) by the ADM algorithm.

To get an intuition of why our initialization works, recall the planted sparse model S = span{x_0, g_1, ..., g_{n−1}}, and suppose

  Y = [x_0, g_1, ..., g_{n−1}] ∈ R^{p×n}.   (III.5)

If we take a row y^i of Y in which x_0(i) is nonzero, then x_0(i) = Θ(1/√(θp)). Meanwhile, the entries g_1(i), ..., g_{n−1}(i) are all N(0, 1/p), and so their magnitudes have size about 1/√p. Hence, when θ is not too large, x_0(i) will be somewhat bigger than most of the other entries in y^i. Put another way, y^i is biased towards the first standard basis vector e_1. Now, under our probabilistic model assumptions, Y is very well-conditioned: Y^*Y ≈ I.¹⁰ Using the Gram-Schmidt process,¹¹ we can find an orthonormal basis Y̅ for S via

  Y̅ = YR,   (III.6)

where R is upper triangular, and R is itself well-conditioned: R ≈ I. Since the i-th row y^i of Y is biased in the direction of e_1 and R is well-conditioned, the i-th row y̅^i of Y̅ is also biased in the direction of e_1. In other words, with this canonical orthobasis Y̅ for the subspace, the i-th row of Y̅ is biased in the direction of the global optimizer. These heuristic arguments are made rigorous in Appendix B and Appendix D.

What if we are handed some other basis Ŷ = Y̅U, where U is an arbitrary orthogonal matrix? Suppose q⋆ is a global optimizer of (II.2) with input matrix Y̅; then it is easy to check that U^*q⋆ is a global optimizer of (II.2) with input matrix Ŷ. Because ⟨(Y̅U)^* e_i, U^* q⋆⟩ = ⟨Y̅^* e_i, q⋆⟩, our initialization is invariant to any rotation of the orthobasis. Hence, even if we are handed an arbitrary orthobasis for S, the i-th row is still biased in the direction of the corresponding global optimizer.

⁹More precisely, in our models random initialization does work, but only when the subspace dimension n is extremely low compared to the ambient dimension p.
¹⁰This is the common heuristic that tall random matrices are well-conditioned [42].
¹¹That is, via the QR decomposition, with the restriction that the diagonal entries of R are positive.

Rounding by linear programming (LP). Let q̄ denote the output of Algorithm 1. As illustrated in Fig. 1, we will prove that, with our particular initialization and an appropriate choice of λ, the ADM algorithm uniformly moves towards the optimizer over a large portion of the sphere, and its output falls within a certain small radius of the global optimizer q⋆ of (II.2). To exactly recover q⋆, or equivalently to recover the exact sparse vector x_0 = γ Y̅ q⋆ for some γ ≠ 0, we solve the linear program

  min_q ||Y̅q||_1   s.t.   ⟨r, q⟩ = 1   (III.7)

with r = q̄. Since the feasible set {q : ⟨q̄, q⟩ = 1} is essentially the tangent space of the sphere S^{n−1} at q̄, whenever q̄ is close enough to q⋆ one should expect the optimizer of (III.7) to exactly recover q⋆, and hence x_0 up to scale. We will prove that this is indeed true under appropriate conditions. (A sketch of how (III.7) can be solved with an off-the-shelf LP solver follows.)
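Problem (III.7) becomes a standard-form LP after the usual epigraph encoding of the ℓ¹ objective. The sketch below uses scipy.optimize.linprog; the encoding and the function name are our own, and we renormalize the solution back to the sphere as a convenience:

```python
import numpy as np
from scipy.optimize import linprog

def lp_rounding(Y, r):
    """Solve  min_q ||Y q||_1  s.t.  <r, q> = 1  (problem (III.7)) as an LP.

    Variables z = [q; t], with t >= |Y q| entrywise; minimize sum(t).
    """
    p, n = Y.shape
    c = np.concatenate([np.zeros(n), np.ones(p)])           # objective: sum of t
    # Y q - t <= 0  and  -Y q - t <= 0  together encode  t >= |Y q|.
    A_ub = np.block([[Y, -np.eye(p)], [-Y, -np.eye(p)]])
    b_ub = np.zeros(2 * p)
    A_eq = np.concatenate([r, np.zeros(p)]).reshape(1, -1)  # <r, q> = 1
    b_eq = np.array([1.0])
    bounds = [(None, None)] * n + [(0, None)] * p           # q free, t >= 0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    assert res.success
    q = res.x[:n]
    return q / np.linalg.norm(q)                             # back to the sphere
```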
IV. MAIN RESULTS AND SKETCH OF ANALYSIS

A. Main Results

In this section, we describe our main theoretical result, which shows that w.h.p. the algorithm described in the previous section succeeds.

Theorem IV.1. Suppose that S obeys the planted sparse model, and let the columns of Y form an arbitrary orthonormal basis for the subspace S. Let y¹, ..., y^p ∈ R^n denote the transposes of the rows of Y. Apply Algorithm 1 with λ = 1/√p, using the initializations q(0) = y¹/||y¹||_2, ..., y^p/||y^p||_2, to produce outputs q̄_1, ..., q̄_p. Solve the linear program (III.7) with r = q̄_1, ..., q̄_p, to produce q̂_1, ..., q̂_p. Set i⋆ ∈ arg min_i ||Yq̂_i||_1. Then

  Yq̂_{i⋆} = γ x_0   (IV.1)

for some γ ≠ 0, with probability at least 1 − cp^{−2}, provided

  p ≥ Cn⁴ log n   and   1/√n ≤ θ ≤ θ_0.   (IV.2)

Here C, c, and θ_0 are positive constants.

Remark IV.2. The result in Theorem IV.1 is suboptimal in sample complexity, compared both to the global optimality result in Theorem II.1 and to Barak et al.'s result [34] and the subsequent work [35]: for successful recovery we require p ≥ Ω(n⁴ log n), while global optimality and Barak et al. demand p ≥ Ω(n) and p ≥ Ω(n²), respectively. Aside from possible deficiencies in our current analysis compared to Barak et al., we believe this is still the first practical and efficient method which is guaranteed to achieve a θ ∈ Ω(1) rate. The lower bound on θ in Theorem IV.1 is mostly for convenience in the proof; in fact, the LP rounding stage of our algorithm already succeeds w.h.p. when θ ∈ O(1/√n).

B. A Sketch of Analysis

In this section, we briefly sketch the main ideas behind the proof of our main result, Theorem IV.1: the initialization + ADM + LP rounding pipeline recovers x_0 under the stated technical conditions, as illustrated in Fig. 1. The proof requires rather detailed technical analysis of the iteration-by-iteration properties of Algorithm 1, most of which is deferred to the appendices.

As noted in Section III, the ADM algorithm is invariant to change of basis. So, without loss of generality (w.l.o.g.), let us assume Y = [x_0, g_1, ..., g_{n−1}] and let Y̅ be its orthogonalization, i.e.,

  Y̅ = [ x_0/||x_0||_2 ,  P_{x_0^⊥} G (G^* P_{x_0^⊥} G)^{−1/2} ].   (IV.3)¹²

When p is large, Y is nearly orthogonal, and hence Y̅ is very close to Y. Thus, in our proofs, whenever convenient, we make the arguments on Y first and then propagate the quantitative results onto Y̅ by perturbation arguments. With that noted, let y̅¹, ..., y̅^p be the transposes of the rows of Y̅, and note that these are all independent random vectors.

¹²With probability one, the inverse matrix square root in (IV.3) is well defined, so Y̅ is well defined w.h.p. (i.e., except when x_0 = 0). See a more quantitative characterization of Y̅ in Appendix B.

To prove the result of Theorem IV.1, we need the following results. First, given the specified Y̅, we show that our initialization is biased towards the global optimum.

Proposition IV.3 (Good initialization). Suppose θ > 1/√n and p ≥ Cn. It holds with probability at least 1 − cp^{−2} that at least one of the initialization vectors suggested in Section III, say q_i(0) = y̅^i/||y̅^i||_2, obeys

  ⟨ y̅^i/||y̅^i||_2 , e_1 ⟩ ≥ 1/(10√(θn)).   (IV.4)

Here C, c are positive constants.

Proof: See Appendix D. ∎

Second, we define a vector-valued random process Q(q) on q ∈ S^{n−1} via

  Q(q) = (1/p) Σ_{i=1}^p y̅^i S_λ[⟨q, y̅^i⟩],   (IV.5)

so that, based on (III.4), one step of the ADM algorithm takes the form

  q(k+1) = Q(q(k)) / ||Q(q(k))||_2.   (IV.6)

This is a very favorable form for analysis: the term in the numerator, Q(q(k)), is a sum of independent random vectors, with q(k) viewed as fixed. We study the behavior of the iteration (IV.6) through the random process Q(q(k)). We want to show that, w.h.p., the ADM iterate sequence q(k) converges to some small neighborhood of ±e_1, so that the ADM algorithm plus the LP rounding described in Section III successfully retrieves the sparse vector x_0/||x_0||_2 = Y̅ e_1. Thus, we hope that, in general, Q(q) is more concentrated on the first coordinate than q ∈ S^{n−1}. Let us partition the vector q as q = [q_1; q_2], with q_1 ∈ R and q_2 ∈ R^{n−1}, and correspondingly Q(q) = [Q_1(q); Q_2(q)]. The inner product of Q(q)/||Q(q)||_2 and e_1 is strictly larger than the inner product of q and e_1 if and only if

  |Q_1(q)| / |q_1| > ||Q_2(q)||_2 / ||q_2||_2.

In the following proposition, we show that w.h.p. this inequality holds uniformly over a significant portion of the sphere,

  Γ = { q ∈ S^{n−1} : 1/(10√(nθ)) ≤ q_1 ≤ 3√θ },   (IV.7)

so the algorithm moves in the correct direction. Define the gap G(q) between the two quantities |Q_1(q)|/|q_1| and ||Q_2(q)||_2/||q_2||_2 as

  G(q) = |Q_1(q)|/|q_1| − ||Q_2(q)||_2/||q_2||_2,   (IV.8)

and we show that the following result is true:

Proposition IV.4 (Uniform lower bound for the finite-sample gap). There exists a constant θ_0 ∈ (0, 1) such that, when p ≥ Cn⁴ log n, the estimate

  inf_{q ∈ Γ} G(q) ≥ θ/(10⁴ n)

holds with probability at least 1 − cp^{−2}, provided 1/√n ≤ θ ≤ θ_0. Here C, c are positive constants.

Proof: See Appendix E. ∎

Next, we show that whenever q_1 ≥ 3√θ, w.h.p. the iterates stay in a safe region with q_1 ≥ 2√θ, which is enough for the LP rounding (III.7) to succeed.

Fig. 1. An illustration of the proof sketch for our ADM algorithm: a biased initializer q(0) with ⟨q(0), e_1⟩ ≥ 1/(10√(θn)); uniform progress by the ADM algorithm while 1/(10√(θn)) ≤ q_1 ≤ 3√θ; no jumps away from the cap q_1 ≥ 2√θ once q_1 ≥ 3√θ; and success of LP rounding from any stopping point in that cap.

Proposition IV.5 (Safe region for rounding). There exists a constant θ_0 ∈ (0, 1) such that, when p ≥ Cn⁴ log n, it holds with probability at least 1 − cp^{−2} that

  Q_1(q) / ||Q(q)||_2 ≥ 2√θ

for all q ∈ S^{n−1} satisfying q_1 ≥ 3√θ, provided 1/√n ≤ θ ≤ θ_0. Here C, c are positive constants.

Proof: See Appendix F. ∎

In addition, the following result shows that the number of iterations for the ADM algorithm to reach the safe region can be bounded grossly by O(n⁴ log n), w.h.p.

Proposition IV.6 (Iteration complexity of reaching the safe region). There is a constant θ_0 ∈ (0, 1) such that, when p ≥ Cn⁴ log n, it holds with probability at least 1 − cp^{−2} that the ADM algorithm (Algorithm 1), with any initialization q(0) ∈ S^{n−1} satisfying q_1(0) ≥ 1/(10√(θn)), will produce some iterate q̄ with q̄_1 ≥ 3√θ at least once in at most O(n⁴ log n) iterations, provided 1/√n ≤ θ ≤ θ_0. Here C, c are positive constants.

Proof: See Appendix G. ∎

Moreover, we show that the LP rounding (III.7) with input r = q̄ exactly recovers the optimal solution, w.h.p., whenever the ADM algorithm returns a solution q̄ with first coordinate q̄_1 ≥ 2√θ.

Proposition IV.7 (Success of rounding). There is a constant θ_0 ∈ (0, 1) such that, when p ≥ Cn, the following holds with probability at least 1 − cp^{−2}, provided 1/√n ≤ θ ≤ θ_0: suppose the input basis is Y̅ as defined in (IV.3), and the ADM algorithm produces an output q̄ ∈ S^{n−1} with q̄_1 ≥ 2√θ. Then the rounding procedure (III.7) with r = q̄ returns the desired solution ±e_1. Here C, c are positive constants.

Proof: See Appendix H. ∎

Finally, given p ≥ Cn⁴ log n for a sufficiently large constant C, we combine all the results above to complete the proof of Theorem IV.1.

Proof of Theorem IV.1: W.l.o.g., let us again first consider Y as defined in (III.5) and its orthogonalization Y̅ in the natural/canonical form (IV.3). We show that, w.h.p., our algorithmic pipeline described in Section III exactly recovers the optimal solution, up to scale, via the following argument:

1) Good initializers. Proposition IV.3 shows that, w.h.p., at least one of the initialization vectors, say q_i(0) = y̅^i/||y̅^i||_2, obeys ⟨q_i(0), e_1⟩ ≥ 1/(10√(θn)), which implies that q_i(0) is biased towards the global optimal solution.

2) Uniform progress away from the equator. By Proposition IV.4, for any θ ∈ (1/√n, θ_0) with a constant θ_0 ∈ (0, 1),

  G(q) = |Q_1(q)|/|q_1| − ||Q_2(q)||_2/||q_2||_2 ≥ θ/(10⁴ n)   (IV.9)

holds uniformly, w.h.p., for all q ∈ S^{n−1} in the region 1/(10√(θn)) ≤ q_1 ≤ 3√θ. This implies that, with an input q(0) such that q_1(0) ≥ 1/(10√(θn)), the ADM algorithm will eventually obtain a point q(k) with q_1(k) ≥ 3√θ, if sufficiently many iterations are allowed.

3) No jumps away from the caps. Proposition IV.5 shows that, for any θ ∈ (1/√n, θ_0) with a constant θ_0 ∈ (0, 1), w.h.p. Q_1(q)/||Q(q)||_2 ≥ 2√θ holds for all q ∈ S^{n−1} with q_1 ≥ 3√θ. This implies that, once q_1(k) ≥ 3√θ for some iterate k, all future iterates produced by the ADM algorithm stay in a spherical cap region around the optimum, with q_1 ≥ 2√θ.

4) Location of stopping points. As shown in Proposition IV.6, w.h.p., the strictly positive gap G(q) in (IV.9) ensures that one needs to run at most O(n⁴ log n) iterations to first encounter an iterate q(k) such that q_1(k) ≥ 3√θ. Hence, the steps above imply that, w.h.p., Algorithm 1 fed with the proposed initialization scheme produces iterates q̄ ∈ S^{n−1} with first coordinate q̄_1 ≥ 2√θ after O(n⁴ log n) steps.

5) Rounding succeeds when r_1 ≥ 2√θ. Proposition IV.7 proves that, w.h.p., the LP rounding (III.7) with input r = q̄ produces the solution ±x_0, up to scale.

Taken together, these claims imply that, from at least one of the initializers q(0), the ADM algorithm will produce an output q̄ which is accurate enough for LP rounding to exactly return x_0/||x_0||_2. On the other hand, our ℓ¹/ℓ² optimality theorem (Theorem II.1) implies that ±x_0 are the unique vectors with the smallest ℓ¹ norm among all unit vectors in the subspace. Since, w.h.p., x_0/||x_0||_2 is among the unit vectors Yq̂_1, ..., Yq̂_p that our row initializers finally produce, our minimal-ℓ¹-norm selector will successfully locate the vector x_0/||x_0||_2.

For the general case, when the input is an arbitrary orthonormal basis Ŷ = Y̅U for some orthogonal matrix U, the target solution is U^*e_1. The following technical pieces perfectly parallel the argument above for Y̅:

1) The discussion at the end of Appendix D implies that, w.h.p., at least one row of Ŷ provides an initial point q(0) such that ⟨q(0), U^*e_1⟩ ≥ 1/(10√(θn)).

2) The discussion following Proposition IV.4 in Appendix E indicates that, for all q such that 1/(10√(θn)) ≤ ⟨q, U^*e_1⟩ ≤ 3√θ, there is a strictly positive gap, indicating steady progress towards a point q(k) with ⟨q(k), U^*e_1⟩ ≥ 3√θ.

3) The discussion at the end of Appendix F implies that, once q satisfies ⟨q, U^*e_1⟩ ≥ 3√θ, the next iterate will not move far away from the target: ⟨Q(q; Ŷ)/||Q(q; Ŷ)||_2, U^*e_1⟩ ≥ 2√θ.

4) Repeating the argument in Appendix G for the general input Ŷ shows it is enough to run the ADM algorithm O(n⁴ log n) iterations to cross the range 1/(10√(θn)) ≤ ⟨q, U^*e_1⟩ ≤ 3√θ. So the argument above together dictates that, with the proposed initialization, w.h.p., the ADM algorithm produces an output q̄ satisfying ⟨q̄, U^*e_1⟩ ≥ 2√θ if we run at least O(n⁴ log n) iterations.

5) Since the ADM returns q̄ satisfying ⟨q̄, U^*e_1⟩ ≥ 2√θ, the discussion at the end of Appendix H implies that we will obtain q̂ = ±U^*e_1, up to scale, as the optimizer of the rounding program, exactly the target solution.

Hence we complete the proof. ∎

Remark IV.8. Under the planted sparse model, in practice the ADM algorithm with the proposed initialization converges to a global optimizer of (III.1) that correctly recovers x_0. In fact, a simple calculation shows that such a desired point for successful recovery is indeed the only critical point of (III.1) near the pole in Fig. 1. Unfortunately, using the current analytical framework, we did not succeed in proving such convergence in theory. Propositions IV.5 and IV.6 imply, however, that after O(n⁴ log n) iterations the ADM sequence will stay in a small neighborhood of the target. Hence, we propose to stop after O(n⁴ log n) steps and then round the output using the LP, which provably recovers the target, as implied by Propositions IV.5 and IV.7. So the LP rounding procedure is for the purpose of completing the theory, and seems unnecessary in practice. We suspect alternative analytical strategies, such as the geometrical analysis that we will discuss in Section VI, can likely get around this artifact.
V. EXPERIMENTAL RESULTS

In this section, we show the performance of the proposed ADM algorithm on both synthetic and real datasets. On synthetic data, we show the phase transition of our algorithm under both the planted sparse and the dictionary learning models; on real data, we demonstrate how seeking sparse vectors can help discover interesting patterns in face images.

A. Phase Transition on Synthetic Data

For the planted sparse model, for each pair (k, p), we generate the n-dimensional subspace S ⊆ R^p as the direct sum of x_0 and G: x_0 ∈ R^p is a k-sparse vector with uniformly random support and all nonzero entries equal to 1, and G ∈ R^{p×(n−1)} is an iid Gaussian matrix with entries distributed as N(0, 1/p). One basis Y of the subspace S can then be constructed as Y = GS([x_0, G]) U, where GS(·) denotes the Gram-Schmidt orthonormalization operator and U ∈ R^{n×n} is an arbitrary orthogonal matrix (see the code sketch at the end of this subsection). For each p, we set the regularization parameter in (III.1) as λ = 1/√p, use all the normalized rows of Y as initializations of q for the proposed ADM algorithm, and run the alternating steps for 10⁴ iterations. We declare the recovery successful whenever ||x_0/||x_0||_2 − Yq||_2 ≤ 10^{−2} for at least one of the p trials (we set the tolerance relatively large, as we have shown that LP rounding exactly recovers the solution from an approximate input).

To determine the empirical recovery performance of our ADM algorithm, first we fix the relationship between n and p as p = 5n log n and plot the phase transition between k and p. Next, we fix the sparsity level θ = 0.2 (i.e., k = 0.2p) and plot the phase transition between p and n. For each pair (k, p) (or (n, p)), we repeat the simulation 10 times. Fig. 2 shows both phase transition plots.

We also experiment with the complete dictionary learning model, as in [13] (see also [15]). Specifically, the observation is assumed to be Y = A_0 X_0, where A_0 is a square, invertible matrix and X_0 is an n × p sparse matrix. Since A_0 is invertible, the row space of Y is the same as that of X_0. For each pair (k, n), we generate X_0 = [x_1, ..., x_n]^*, where each vector x_i ∈ R^p is k-sparse with every nonzero entry following an iid Gaussian distribution, and construct the observation via Y^* = GS(X_0^*) U. We repeat the same experiment as for the planted sparse model described above; the only difference is that here we declare success as long as one sparse row of X_0 is recovered by one of those p programs. Fig. 3 shows both phase transition plots.

Figs. 2a and 3a suggest that our ADM algorithm does work into the linear sparsity regime for both models, provided p ≥ Ω(n log n). Moreover, for both models, the log n factor seems necessary for working into the linear sparsity regime, as suggested by Figs. 2b and 3b: there are clear nonlinear transition boundaries between the success and failure regions. For both models, the O(n log n) sample requirement is near optimal: for the planted sparse model, p ≥ Ω(n) is obviously necessary; for the complete dictionary learning model, [13] proved that p ≥ Ω(n log n) is required for exact recovery. For the planted sparse model, our result p ≥ Ω(n⁴ log n) is far from this much lower empirical requirement. Fig. 2b further suggests that alternative reformulations and algorithms are needed to solve (II.2), so that the optimal recovery guarantee depicted in Theorem II.1 can be obtained.
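The following sketch illustrates the planted-sparse data generation and one success trial as described above, reusing `adm_sparse_vector` and `row_initializations` from the Section III sketch. Function names and the sign-invariant success check are our own choices:

```python
import numpy as np

def planted_sparse_basis(p, n, k, rng):
    """Generate Y = GS([x0, G]) U for the planted sparse model of Section V-A.

    x0: k-sparse, uniformly random support, nonzero entries equal to 1.
    G : p x (n-1), entries iid N(0, 1/p).  U: a random orthogonal rotation.
    """
    x0 = np.zeros(p)
    x0[rng.choice(p, size=k, replace=False)] = 1.0
    G = rng.standard_normal((p, n - 1)) / np.sqrt(p)
    Q, _ = np.linalg.qr(np.column_stack([x0, G]))   # Gram-Schmidt step GS(.)
    U, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return Q @ U, x0

def trial_succeeds(Y, x0, tol=1e-2):
    """One recovery trial: run ADM from every normalized row of Y, and accept
    if some output matches x0/||x0|| up to sign within tolerance `tol`."""
    target = x0 / np.linalg.norm(x0)
    for q0 in row_initializations(Y):
        q = adm_sparse_vector(Y, q0, n_iters=10**4)
        if min(np.linalg.norm(Y @ q - target),
               np.linalg.norm(Y @ q + target)) <= tol:
            return True
    return False

# Example: rng = np.random.default_rng(0)
# Y, x0 = planted_sparse_basis(p=2000, n=10, k=200, rng=rng)
# print(trial_succeeds(Y, x0))
```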
B. Exploratory Experiments on Faces

It is well known in computer vision that the collection of images of a convex object, subject only to illumination changes, can be well approximated by a low-dimensional subspace in raw-pixel space [43]. We will play with face subspaces here. First, we extract the face images of one person (65 images) under different illumination conditions. Then we apply robust principal component analysis [44] to the data and obtain a low-dimensional subspace of dimension 10, i.e., a basis Y ∈ R^{p×10}, where p is the number of pixels. We apply the ADM + LP algorithm to find the sparsest elements in this subspace, randomly selecting 10% of the rows of Y as initializations for q. We judge sparsity in the ℓ¹/ℓ² sense; that is, the sparsest vector x_0 = Yq should produce the smallest ||Yq||_1/||Yq||_2 among all results. Once some sparse vectors are found, we project the subspace onto the orthogonal complement of the sparse vectors already found¹³ and continue the seeking process in the projected subspace (a sketch of this deflation step appears after this subsection). Fig. 4 (top) shows the first four sparse vectors we get from the data. We can see they correspond well to different extreme illumination conditions.

We also implemented the spectral method with the LP post-processing proposed in [35] for comparison, under the same protocol. The result is presented in Fig. 4 (bottom): the ratios ℓ¹/ℓ² are significantly higher, and the ratios ℓ⁴/ℓ² (this is the metric to be maximized in [35] to promote sparsity) are significantly lower. By these two criteria, the spectral method with LP rounding consistently produces vectors with higher sparsity levels (i.e., denser vectors) under our evaluation protocol. Moreover, the resulting images are harder to interpret physically.

Second, we manually select ten different persons' faces under the normal lighting condition. Again, the dimension of the subspace is 10 and Y ∈ R^{p×10}. We repeat the same experiment as described above. Fig. 5 shows four sparse vectors we get from the data. Interestingly, the sparse vectors roughly correspond to differences of face images concentrated around the facial parts on which different people tend to differ from each other, e.g., eyebrows, forehead, hair, nose, etc. By comparison, the vectors returned by the spectral method [35] are relatively denser, and the sparsity patterns in the images are less structured physically.

In sum, our algorithm seems to find useful sparse vectors for potential applications, such as peculiarity discovery in the first setting and locating differences in the second setting.

¹³The idea is to build a sparse orthonormal basis for the subspace in a greedy manner.
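A minimal sketch of the greedy deflation step mentioned above (our own illustration; we realize the projection with an SVD, which is one reasonable implementation choice, not necessarily the authors'):

```python
import numpy as np

def deflate_basis(Y, v):
    """Remove the direction of a found sparse vector v = Y q from the subspace:
    project the columns of Y onto the orthogonal complement of v and keep an
    orthonormal basis for the (n-1)-dimensional deflated subspace."""
    v = v / np.linalg.norm(v)
    P = np.eye(len(v)) - np.outer(v, v)          # projector onto v's complement
    U, s, _ = np.linalg.svd(P @ Y, full_matrices=False)
    return U[:, :Y.shape[1] - 1]                 # drop the annihilated direction
```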

Fig. 2. Phase transition for the planted sparse model using the ADM algorithm: (a) with a fixed relationship between p and n: p = 5n log n; (b) with a fixed relationship between p and k: k = 0.2p. White indicates success and black indicates failure.

Fig. 3. Phase transition for the dictionary learning model using the ADM algorithm: (a) with a fixed relationship between p and n: p = 5n log n; (b) with a fixed relationship between p and k: k = 0.2p. White indicates success and black indicates failure.

Nevertheless, the main goal of this experiment is to invite readers to think about similar pattern discovery problems that might be cast as the problem of seeking sparse vectors in a subspace. The experiment also demonstrates, in a concrete way, the practicality of our algorithm, both in handling data sets of realistic size and in producing meaningful results even beyond the idealized planted sparse model that we adopted for analysis.

VI. CONNECTIONS AND DISCUSSION

For the planted sparse model, there is a substantial performance gap, in terms of the p-n relationship, between our optimality theorem (Theorem II.1), the empirical simulations, and the guarantees we have obtained via an efficient algorithm (Theorem IV.1). More careful and tighter analysis, based on decoupling [45], chaining [46], [47], and the geometrical analysis described below, can probably help bridge the gap between our theoretical and empirical results. Matching the theoretical limit depicted in Theorem II.1 seems to require novel algorithmic ideas. The random models we assume for the subspace can also be extended to other random models, particularly for dictionary learning, where all the basis vectors are sparse (e.g., the Bernoulli-Gaussian random model).

This work is part of a recent surge of research efforts on deriving provable and practical nonconvex algorithms for central problems in modern signal processing and machine learning. These problems include low-rank matrix recovery/completion [48]-[56], tensor recovery/decomposition [57]-[61], phase retrieval [62]-[65], dictionary learning [12], and so on.¹⁴ Our approach, like the others, is to start with a carefully chosen, problem-specific initialization, and then perform a local analysis of the subsequent iterates to guarantee convergence to a good solution.

¹⁴The webpage http://sunju.org/research/nonconvex/ maintained by the second author contains pointers to the growing list of work in this direction.

Fig. 4. The first four sparse vectors extracted for one person in the Yale B database under different illuminations. Top: by our ADM algorithm; bottom: by the sped-up SOS algorithm proposed in [35].

Fig. 5. The first four sparse vectors extracted for 10 persons in the Yale B database under normal illuminations. Top: by our ADM algorithm; bottom: by the sped-up SOS algorithm proposed in [35].

In comparison, our subsequent work on complete dictionary learning [15] and generalized phase retrieval [65] has taken a geometrical approach: characterizing the function landscape and designing efficient algorithms accordingly. The geometric approach has allowed provable recovery via efficient algorithms with arbitrary initialization. The article [66] summarizes the geometric approach and its applicability to several other problems of interest.

A hybrid of the initialization and geometric approaches discussed above is likely to be a powerful computational framework. To see it in action for the current planted sparse vector problem, in Fig. 6 we provide the asymptotic (i.e., p → ∞) function landscape of the Huber loss on the sphere S², that is, of the relaxed formulation (III.1) we tried to solve. It is clear that, with an initialization biased towards either the north or the south pole, we are situated in a region where the gradients are always nonzero and point in favorable directions, so that many reasonable optimization algorithms can use the gradient information and make steady progress towards the target. This would probably ease algorithm development and analysis, and help yield tight performance guarantees.

We have provided a very efficient algorithm, with a strong guarantee, for finding a sparse vector in a subspace. Our algorithm is practical for handling large datasets: in the experiment on the face dataset, we successfully extracted meaningful features from human face images. However, the potential of seeking sparse/structured elements in a subspace seems largely unexplored, despite the applications we mentioned at the start. We hope this work can inspire more application ideas.

ACKNOWLEDGEMENT

We thank the anonymous reviewers for their valuable comments and constructive criticism, which helped improve the manuscript. JS thanks the Wei Family Private Foundation for their generous support. We thank Cun Mu (IEOR Department, Columbia University) for helpful discussion and input regarding this work. This work was partially supported by grants from ONR and NSF, and by funding from the Moore and Sloan Foundations.

APPENDIX A
TECHNICAL TOOLS AND PRELIMINARIES

In this appendix, we record several lemmas that are useful for our analysis.

Lemma A.1. Let ψ(x) and Ψ(x) denote the probability density function (pdf) and the cumulative distribution function (cdf) of the standard normal distribution:

  ψ(x) = (1/√(2π)) exp(−x²/2),   Ψ(x) = (1/√(2π)) ∫_{−∞}^{x} exp(−t²/2) dt.

Suppose a random variable X ∼ N(0, σ²), with pdf f_σ(x) = (1/σ) ψ(x/σ). Then, for any t_2 > t_1, we have

  ∫_{t_1}^{t_2} f_σ(x) dx = Ψ(t_2/σ) − Ψ(t_1/σ),
  ∫_{t_1}^{t_2} x f_σ(x) dx = σ [ψ(t_1/σ) − ψ(t_2/σ)],
  ∫_{t_1}^{t_2} x² f_σ(x) dx = σ² [Ψ(t_2/σ) − Ψ(t_1/σ)] + σ [t_1 ψ(t_1/σ) − t_2 ψ(t_2/σ)].

Lemma A.2 (Taylor expansion of the standard Gaussian cdf and pdf). Let ψ(x) and Ψ(x) be defined as above. There exists some universal constant C_ψ > 0 such that, for any x, x_0 ∈ R,

  | ψ(x) − [ψ(x_0) + ψ′(x_0)(x − x_0)] | ≤ C_ψ |x − x_0|²,
  | Ψ(x) − [Ψ(x_0) + ψ(x_0)(x − x_0)] | ≤ C_ψ |x − x_0|².

Lemma A.3 (Matrix induced norms). For any matrix A ∈ R^{n_1×n_2}, the induced matrix norm from ℓ^p to ℓ^q is defined as

  ||A||_{ℓ^p→ℓ^q} = sup_{||x||_p = 1} ||Ax||_q.

In particular, writing A = [a_1, ..., a_{n_2}] in terms of its columns and A = [ā¹, ..., ā^{n_1}]^* in terms of its rows, we have

  ||A||_{ℓ¹→ℓ^q} = max_{k ∈ [n_2]} ||a_k||_q   (so, in particular, ||A||_{ℓ¹→ℓ¹} = max_k ||a_k||_1),
  ||A||_{ℓ^∞→ℓ^∞} = max_{k ∈ [n_1]} ||ā^k||_1,
  ||AB||_{ℓ^p→ℓ^r} ≤ ||A||_{ℓ^q→ℓ^r} ||B||_{ℓ^p→ℓ^q}, where B is any matrix of size compatible with A.

Lemma A.4 (Moments of the Gaussian random variable). If X ∼ N(0, σ_X²), then it holds for all integers m ≥ 1 that

  E[|X|^m] = σ_X^m (m − 1)!! √(2/π)   if m is odd,   E[|X|^m] = σ_X^m (m − 1)!!   if m is even.

Fig. 6. Function landscape of f(q) with θ = 0.4, for n = 3. Left: f(q) over the sphere S². Note that near the spherical caps around the north and south poles there are no critical points, and the gradients are always nonzero. Right: the projected function landscape, obtained by projecting the upper hemisphere onto the equatorial plane; mathematically, this is the function g(w) : e_3^⊥ → R obtained via the reparameterization q(w) = [w; √(1 − ||w||²)]. Corresponding to the left panel, there is no undesired critical point around 0 within a large radius.

Lemma A.5 (Moments of the χ random variable). If X ∼ χ(n), i.e., X = ||x||_2 for x ∼ N(0, I_n), then it holds for all integers m ≥ 1 that

  E[X^m] = 2^{m/2} Γ(m/2 + n/2) / Γ(n/2) ≤ (m − 1)!! n^{m/2}.

Lemma A.6 (Moments of the χ² random variable). If X ∼ χ²(n), i.e., X = ||x||_2² for x ∼ N(0, I_n), then it holds for all integers m ≥ 1 that

  E[X^m] = 2^m Γ(m + n/2) / Γ(n/2) = ∏_{k=0}^{m−1} (n + 2k) ≤ m! n^m   (n ≥ 2).

Lemma A.7 (Moment-control Bernstein's inequality for random variables [67]). Let X_1, ..., X_p be iid real-valued random variables. Suppose that there exist some positive numbers R and σ² such that

  E[|X_k|^m] ≤ (m!/2) σ² R^{m−2}   for all integers m ≥ 2.

Let S = (1/p) Σ_{k=1}^p X_k. Then, for all t > 0, it holds that

  P[ |S − E[S]| ≥ t ] ≤ 2 exp( − p t² / (2σ² + 2Rt) ).

Lemma A.8 (Moment-control Bernstein's inequality for random vectors [15]). Let x_1, ..., x_p ∈ R^d be iid random vectors. Suppose there exist some positive numbers R and σ² such that

  E[||x_k||_2^m] ≤ (m!/2) σ² R^{m−2}   for all integers m ≥ 2.

Let s = (1/p) Σ_{k=1}^p x_k. Then, for any t > 0, it holds that

  P[ ||s − E[s]||_2 ≥ t ] ≤ 2(d + 1) exp( − p t² / (2σ² + 2Rt) ).

Lemma A.9 (Gaussian concentration inequality). Let x ∼ N(0, I_n), and let f : R^n → R be an L-Lipschitz function. Then, for all t > 0,

  P[ |f(x) − E[f(x)]| ≥ t ] ≤ 2 exp( − t² / (2L²) ).

Lemma A.10 (Bounding the maximum norm of a Gaussian vector sequence). Let x_1, ..., x_p be a sequence of (not necessarily independent) standard Gaussian vectors in R^n. It holds that

  P[ max_{i ∈ [p]} ||x_i||_2 > √n + 2√(2 log(np)) ] ≤ (np)^{−3}.

Proof: Since the function x ↦ ||x||_2 is 1-Lipschitz, by the Gaussian concentration inequality, for any i ∈ [p] we have

  P[ ||x_i||_2 − E||x_i||_2 > t ] ≤ P[ | ||x_i||_2 − E||x_i||_2 | > t ] ≤ 2 exp(−t²/2)

for all t > 0. Since E||x_i||_2 ≤ √n, a simple union bound yields

  P[ max_{i ∈ [p]} ||x_i||_2 > √n + t ] ≤ 2p exp(−t²/2)

for all t > 0. Taking t = 2√(2 log(np)) gives the claimed result. ∎

Corollary A.11. Let Φ ∈ R^{n×p} with entries iid N(0, 1). It holds with probability at least 1 − (np)^{−3} that

  ||Φx||_2 ≤ ( √n + 2√(2 log(np)) ) ||x||_1   for all x ∈ R^p.

Proof: Write Φ = [φ_1, ..., φ_p]. Without loss of generality, we only need to consider x with ||x||_1 = 1. Then ||Φx||_2 = || Σ_i x(i) φ_i ||_2 ≤ max_{i ∈ [p]} ||φ_i||_2. Invoking Lemma A.10 returns the claimed result. ∎

Lemma A.12 (Covering number of a unit sphere [42]). Let S^{n−1} = {x ∈ R^n : ||x||_2 = 1} be the unit sphere. For any ε ∈ (0, 1), there exists an ε-cover of S^{n−1} w.r.t. the ℓ² norm, denoted N_ε, such that |N_ε| ≤ (1 + 2/ε)^n ≤ (3/ε)^n.

Lemma A.13 (Spectrum of Gaussian matrices [42]). Let Φ ∈ R^{n_1×n_2} (n_1 > n_2) contain iid standard normal entries. Then, for every t ≥ 0, with probability at least 1 − 2 exp(−t²/2), one has

  √n_1 − √n_2 − t ≤ σ_min(Φ) ≤ σ_max(Φ) ≤ √n_1 + √n_2 + t.

Lemma A.14. For any ε ∈ (0, 1), there exists a constant C_ε > 1 such that, provided n_1 > C_ε n_2, the random matrix Φ ∈ R^{n_1×n_2} with entries iid N(0, 1) obeys

  (1 − ε) √(2/π) n_1 ||x||_2 ≤ ||Φx||_1 ≤ (1 + ε) √(2/π) n_1 ||x||_2   for all x ∈ R^{n_2},

with probability at least 1 − exp(−c_ε n_1) for some c_ε > 0. Geometrically, this lemma roughly corresponds to the well-known almost-spherical section theorem [68], [69]; see also [70]. A slight variant of this version has been proved in [13], borrowing ideas from [71].

Proof: By homogeneity, it is enough to show that the bounds hold for every x of unit ℓ² norm. For a fixed x_0 with ||x_0||_2 = 1, Φx_0 ∼ N(0, I_{n_1}), so E||Φx_0||_1 = √(2/π) n_1. The map v ↦ ||v||_1 is √n_1-Lipschitz w.r.t. the ℓ² norm, so by concentration of measure for Gaussian vectors (Lemma A.9) we have

  P[ | ||Φx||_1 − E||Φx||_1 | > t ] ≤ 2 exp( −t² / (2n_1) )

for any t > 0. For a fixed δ ∈ (0, 1), S^{n_2−1} can be covered by a δ-net N_δ with cardinality |N_δ| ≤ (1 + 2/δ)^{n_2}. Now consider the event

  E = { (1 − δ) √(2/π) n_1 ≤ ||Φx||_1 ≤ (1 + δ) √(2/π) n_1 for all x ∈ N_δ }.

A simple application of the union bound (taking t = δ √(2/π) n_1 above) yields

  P[E^c] ≤ 2 exp( − δ² n_1/π + n_2 log(1 + 2/δ) ).

Suppose E holds. Any z ∈ S^{n_2−1} can be written as z = Σ_{k=0}^∞ λ_k x_k, with |λ_k| ≤ δ^k and x_k ∈ N_δ for all k. Hence we have

  ||Φz||_1 ≤ Σ_{k=0}^∞ δ^k ||Φx_k||_1 ≤ (1 + δ)/(1 − δ) · √(2/π) n_1.

Similarly,

  ||Φz||_1 ≥ ||Φx_0||_1 − Σ_{k=1}^∞ δ^k ||Φx_k||_1 ≥ [ (1 − δ) − δ(1 + δ)/(1 − δ) ] √(2/π) n_1 = (1 − 3δ)/(1 − δ) · √(2/π) n_1.

Choosing δ small enough that (1 − 3δ)/(1 − δ) ≥ 1 − ε and (1 + δ)/(1 − δ) ≤ 1 + ε, we can conclude that, conditioned on E,

  (1 − ε) √(2/π) n_1 ≤ ||Φx||_1 ≤ (1 + ε) √(2/π) n_1   for all x ∈ S^{n_2−1}.

Finally, given n_1 > C_ε n_2, to make the probability P[E^c] decay exponentially in n_1, it is enough to set C_ε = 2π δ^{−2} log(1 + 2/δ) for the chosen δ. This completes the proof. ∎

APPENDIX B
THE RANDOM BASIS VS. ITS ORTHONORMALIZED VERSION

In this appendix, we consider the planted sparse model

  Y = [x_0, g_1, ..., g_{n−1}] = [x_0, G] ∈ R^{p×n}

as defined in (III.5), where

  x_0(k) ∼_iid (1/√(θp)) Ber(θ), k ∈ [p];   g_l ∼_iid N(0, (1/p) I_p), l ∈ [n − 1].   (B.1)

Recall that one natural/canonical orthonormal basis for the subspace spanned by the columns of Y is

  Y̅ = [ x_0/||x_0||_2 ,  P_{x_0^⊥} G (G^* P_{x_0^⊥} G)^{−1/2} ],

which is well defined with high probability, since P_{x_0^⊥} G is well-conditioned (as proved in Lemma B.2). We write

  G̅ = P_{x_0^⊥} G (G^* P_{x_0^⊥} G)^{−1/2}   (B.2)

for convenience. When p is large, Y has nearly orthonormal columns, and so we expect Y̅ to closely approximate Y. In this appendix, we make this intuition rigorous: we prove several results that are needed for the proof of Theorem II.1 and for translating results for Y into results for Y̅ in the subsequent appendices.

For any realization of x_0, let I = supp(x_0) = {i : x_0(i) ≠ 0}. By Bernstein's inequality (Lemma A.7) with σ² = θ and R = 1, the event

  E_0 = { θp/2 ≤ |I| ≤ 3θp/2 }   (B.3)

holds with probability at least 1 − 2 exp(−θp/16). Moreover, we show the following:

Lemma B.1. When p ≥ Cn and θ > 1/√n, the bound

  | ||x_0||_2² − 1 | ≤ √( 4n log p / (5θp) )   (B.4)

holds with probability at least 1 − cp^{−2}. Here C, c are positive constants.

Proof: Note that ||x_0||_2² = (1/p) Σ_{k=1}^p p x_0(k)², where p x_0(k)² ∈ {0, 1/θ} and E[||x_0||_2²] = 1. Since E[(p x_0(k)²)^m] = θ^{1−m} ≤ (m!/2)(1/θ)(1/θ)^{m−2} for every integer m ≥ 2, Bernstein's inequality (Lemma A.7) with σ² = 1/θ and R = 1/θ implies

  P[ | ||x_0||_2² − 1 | > t ] ≤ 2 exp( − θ p t² / (4 + 2t) )

for all t > 0. Setting t = √(4n log p/(5θp)) and invoking the assumptions θ > 1/√n and p ≥ Cn, the right-hand side is dominated by cp^{−2}, as desired. ∎

Let M = (G^* P_{x_0^⊥} G)^{−1/2}. Then

  G̅ = GM − (x_0 x_0^* / ||x_0||_2²) GM.

We show that the following holds:

Lemma B.2. Provided p ≥ Cn, it holds with probability at least 1 − 2p^{−5} that

  ||M|| ≤ 2   and   ||M − I|| ≤ 3 ( √(n/p) + √(10 log p / p) ).

Here C is a positive constant.

Proof: First observe that 1/||M|| = σ_min( (G^* P_{x_0^⊥} G)^{1/2} ) = σ_min( P_{x_0^⊥} G ). Now suppose B is an orthonormal basis spanning the subspace x_0^⊥, i.e., B ∈ R^{p×(p−1)} with B^*B = I and range(B) = x_0^⊥. Then it is not hard to see that the nonzero singular values of P_{x_0^⊥} G are the same as those of B^* G ∈ R^{(p−1)×(n−1)}; in particular, σ_min(P_{x_0^⊥} G) = σ_min(B^* G). Since the entries of G are iid N(0, 1/p) and B has orthonormal columns, the entries of B^* G are iid N(0, 1/p). Invoking the spectrum results for Gaussian matrices (Lemma A.13, rescaled by √p) with t = √(10 log p), and absorbing the negligible O(1/√p) corrections into the constants, we obtain

  1 − √(n/p) − √(10 log p/p) ≤ σ_min(B^* G) ≤ σ_max(B^* G) ≤ 1 + √(n/p) + √(10 log p/p)

with probability at least 1 − 2p^{−5}. Thus, writing δ = √(n/p) + √(10 log p/p), when p ≥ C_1 n for a sufficiently large constant C_1 we have δ ≤ 1/2, and therefore

  ||M|| = 1/σ_min(B^* G) ≤ 1/(1 − δ) ≤ 2,
  ||M − I|| = max_k | 1/σ_k(B^* G) − 1 | ≤ δ/(1 − δ) ≤ 2δ ≤ 3 ( √(n/p) + √(10 log p/p) ),

with the same probability. ∎

Lemma B.3. Let Y̅_I be the submatrix of Y̅ whose rows are indexed by the set I. There exists a constant C > 0 such that, when p ≥ Cn and 1/2 > θ > 1/√n, the bounds

  ||Y||_{ℓ¹→ℓ¹} ≤ 3√p,
  ||Y̅_I||_{ℓ¹→ℓ¹} ≤ 7√(θp),
  ||G − G̅||_{ℓ¹→ℓ¹} ≤ 4√n + 7√(log p),
  ||Y_I − Y̅_I||_{ℓ¹→ℓ¹} ≤ 10√(n log p / θ),
  ||Y − Y̅||_{ℓ¹→ℓ¹} ≤ 10√(n log p / θ)

hold simultaneously with probability at least 1 − cp^{−2}, for a positive constant c.

Proof: First of all, since G̅ = GM − (x_0 x_0^*/||x_0||_2²) GM,

  || (x_0 x_0^*/||x_0||_2²) GM ||_{ℓ¹→ℓ¹} ≤ (||x_0||_1/||x_0||_2²) ||M^* G^* x_0||_∞ ≤ (2||x_0||_1/||x_0||_2²) ||G^* x_0||_∞,

where in the last inequality we have applied the fact ||M|| ≤ 2 from Lemma B.2. Now, G^* x_0 is an iid Gaussian vector, with each entry distributed as N(0, ||x_0||_2²/p), where ||x_0||_2² = |I|/(θp). So, by the Gaussian concentration inequality (Lemma A.9) and a union bound,

  ||G^* x_0||_∞ ≤ ||x_0||_2 √(10 log p / p)

with probability at least 1 − c_1 p^{−2}. On the intersection with E_0, for which ||x_0||_1/||x_0||_2 = √|I| ≤ √(3θp/2), this implies

  || (x_0 x_0^*/||x_0||_2²) GM ||_{ℓ¹→ℓ¹} ≤ 8 √(θ log p)

with probability at least 1 − c_2 p^{−2}, provided θ > 1/√n. Moreover, when intersected with E_0, Lemma A.14 implies that, when p ≥ C_1 n,

  ||G||_{ℓ¹→ℓ¹} ≤ √p   and   ||G_I||_{ℓ¹→ℓ¹} ≤ 2θ√p

with probability at least 1 − c_3 p^{−2}, provided θ > 1/√n. Hence, by Lemma B.2 and the estimates above, and collecting constants,

  ||G − G̅||_{ℓ¹→ℓ¹} ≤ ||G(I − M)||_{ℓ¹→ℓ¹} + || (x_0 x_0^*/||x_0||_2²) GM ||_{ℓ¹→ℓ¹}
                   ≤ √p ||G|| ||I − M|| + 8√(θ log p) ≤ 4√n + 7√(log p).

For the first two bounds: on E_0, ||x_0||_1 = |I|/√(θp) ≤ (3/2)√(θp) ≤ (3/2)√p, and max_l ||g_l||_1 ≤ 2√p w.h.p., so

  ||Y||_{ℓ¹→ℓ¹} = max{ ||x_0||_1, max_l ||g_l||_1 } ≤ 3√p,

while

  ||Y̅_I||_{ℓ¹→ℓ¹} ≤ ||x_0||_1/||x_0||_2 + ||G̅_I||_{ℓ¹→ℓ¹}
                  ≤ √(3θp/2) + ||G_I||_{ℓ¹→ℓ¹} + ||(G − G̅)_I||_{ℓ¹→ℓ¹} ≤ 7√(θp),

where the last step uses θ > 1/√n and p ≥ Cn for C sufficiently large. Finally, by Lemma B.1, |1 − 1/||x_0||_2| ≤ 2 | ||x_0||_2² − 1 | ≤ 2√(4n log p/(5θp)) w.h.p., so

  || x_0 − x_0/||x_0||_2 ||_1 = ||x_0||_1 · |1 − 1/||x_0||_2| ≤ 3√(n log p),

and therefore

  ||Y − Y̅||_{ℓ¹→ℓ¹} ≤ || x_0 − x_0/||x_0||_2 ||_1 + ||G − G̅||_{ℓ¹→ℓ¹} ≤ 10√(n log p / θ).

The same chain applied to the rows indexed by I gives ||Y_I − Y̅_I||_{ℓ¹→ℓ¹} ≤ 10√(n log p / θ), which completes the proof. ∎


More information

On Wald-Type Optimal Stopping for Brownian Motion

On Wald-Type Optimal Stopping for Brownian Motion J Al Probab Vol 34, No 1, 1997, (66-73) Prerint Ser No 1, 1994, Math Inst Aarhus On Wald-Tye Otimal Stoing for Brownian Motion S RAVRSN and PSKIR The solution is resented to all otimal stoing roblems of

More information

System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests

System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests 009 American Control Conference Hyatt Regency Riverfront, St. Louis, MO, USA June 0-, 009 FrB4. System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests James C. Sall Abstract

More information

Notes on Instrumental Variables Methods

Notes on Instrumental Variables Methods Notes on Instrumental Variables Methods Michele Pellizzari IGIER-Bocconi, IZA and frdb 1 The Instrumental Variable Estimator Instrumental variable estimation is the classical solution to the roblem of

More information

Various Proofs for the Decrease Monotonicity of the Schatten s Power Norm, Various Families of R n Norms and Some Open Problems

Various Proofs for the Decrease Monotonicity of the Schatten s Power Norm, Various Families of R n Norms and Some Open Problems Int. J. Oen Problems Comt. Math., Vol. 3, No. 2, June 2010 ISSN 1998-6262; Coyright c ICSRS Publication, 2010 www.i-csrs.org Various Proofs for the Decrease Monotonicity of the Schatten s Power Norm, Various

More information

Commutators on l. D. Dosev and W. B. Johnson

Commutators on l. D. Dosev and W. B. Johnson Submitted exclusively to the London Mathematical Society doi:10.1112/0000/000000 Commutators on l D. Dosev and W. B. Johnson Abstract The oerators on l which are commutators are those not of the form λi

More information

Principal Components Analysis and Unsupervised Hebbian Learning

Principal Components Analysis and Unsupervised Hebbian Learning Princial Comonents Analysis and Unsuervised Hebbian Learning Robert Jacobs Deartment of Brain & Cognitive Sciences University of Rochester Rochester, NY 1467, USA August 8, 008 Reference: Much of the material

More information

On Line Parameter Estimation of Electric Systems using the Bacterial Foraging Algorithm

On Line Parameter Estimation of Electric Systems using the Bacterial Foraging Algorithm On Line Parameter Estimation of Electric Systems using the Bacterial Foraging Algorithm Gabriel Noriega, José Restreo, Víctor Guzmán, Maribel Giménez and José Aller Universidad Simón Bolívar Valle de Sartenejas,

More information

On the capacity of the general trapdoor channel with feedback

On the capacity of the general trapdoor channel with feedback On the caacity of the general tradoor channel with feedback Jui Wu and Achilleas Anastasooulos Electrical Engineering and Comuter Science Deartment University of Michigan Ann Arbor, MI, 48109-1 email:

More information

MATH 361: NUMBER THEORY EIGHTH LECTURE

MATH 361: NUMBER THEORY EIGHTH LECTURE MATH 361: NUMBER THEORY EIGHTH LECTURE 1. Quadratic Recirocity: Introduction Quadratic recirocity is the first result of modern number theory. Lagrange conjectured it in the late 1700 s, but it was first

More information

Spectral Clustering based on the graph p-laplacian

Spectral Clustering based on the graph p-laplacian Sectral Clustering based on the grah -Lalacian Thomas Bühler tb@cs.uni-sb.de Matthias Hein hein@cs.uni-sb.de Saarland University Comuter Science Deartment Camus E 663 Saarbrücken Germany Abstract We resent

More information

Lecture 6. 2 Recurrence/transience, harmonic functions and martingales

Lecture 6. 2 Recurrence/transience, harmonic functions and martingales Lecture 6 Classification of states We have shown that all states of an irreducible countable state Markov chain must of the same tye. This gives rise to the following classification. Definition. [Classification

More information

Elements of Asymptotic Theory. James L. Powell Department of Economics University of California, Berkeley

Elements of Asymptotic Theory. James L. Powell Department of Economics University of California, Berkeley Elements of Asymtotic Theory James L. Powell Deartment of Economics University of California, Berkeley Objectives of Asymtotic Theory While exact results are available for, say, the distribution of the

More information

ON OPTIMIZATION OF THE MEASUREMENT MATRIX FOR COMPRESSIVE SENSING

ON OPTIMIZATION OF THE MEASUREMENT MATRIX FOR COMPRESSIVE SENSING 8th Euroean Signal Processing Conference (EUSIPCO-2) Aalborg, Denmark, August 23-27, 2 ON OPTIMIZATION OF THE MEASUREMENT MATRIX FOR COMPRESSIVE SENSING Vahid Abolghasemi, Saideh Ferdowsi, Bahador Makkiabadi,2,

More information

PARTIAL FACE RECOGNITION: A SPARSE REPRESENTATION-BASED APPROACH. Luoluo Liu, Trac D. Tran, and Sang Peter Chin

PARTIAL FACE RECOGNITION: A SPARSE REPRESENTATION-BASED APPROACH. Luoluo Liu, Trac D. Tran, and Sang Peter Chin PARTIAL FACE RECOGNITION: A SPARSE REPRESENTATION-BASED APPROACH Luoluo Liu, Trac D. Tran, and Sang Peter Chin Det. of Electrical and Comuter Engineering, Johns Hokins Univ., Baltimore, MD 21218, USA {lliu69,trac,schin11}@jhu.edu

More information

Uniform Law on the Unit Sphere of a Banach Space

Uniform Law on the Unit Sphere of a Banach Space Uniform Law on the Unit Shere of a Banach Sace by Bernard Beauzamy Société de Calcul Mathématique SA Faubourg Saint Honoré 75008 Paris France Setember 008 Abstract We investigate the construction of a

More information

arxiv: v1 [physics.data-an] 26 Oct 2012

arxiv: v1 [physics.data-an] 26 Oct 2012 Constraints on Yield Parameters in Extended Maximum Likelihood Fits Till Moritz Karbach a, Maximilian Schlu b a TU Dortmund, Germany, moritz.karbach@cern.ch b TU Dortmund, Germany, maximilian.schlu@cern.ch

More information

Sums of independent random variables

Sums of independent random variables 3 Sums of indeendent random variables This lecture collects a number of estimates for sums of indeendent random variables with values in a Banach sace E. We concentrate on sums of the form N γ nx n, where

More information

Notes on duality in second order and -order cone otimization E. D. Andersen Λ, C. Roos y, and T. Terlaky z Aril 6, 000 Abstract Recently, the so-calle

Notes on duality in second order and -order cone otimization E. D. Andersen Λ, C. Roos y, and T. Terlaky z Aril 6, 000 Abstract Recently, the so-calle McMaster University Advanced Otimization Laboratory Title: Notes on duality in second order and -order cone otimization Author: Erling D. Andersen, Cornelis Roos and Tamás Terlaky AdvOl-Reort No. 000/8

More information

Linear diophantine equations for discrete tomography

Linear diophantine equations for discrete tomography Journal of X-Ray Science and Technology 10 001 59 66 59 IOS Press Linear diohantine euations for discrete tomograhy Yangbo Ye a,gewang b and Jiehua Zhu a a Deartment of Mathematics, The University of Iowa,

More information

LECTURE 7 NOTES. x n. d x if. E [g(x n )] E [g(x)]

LECTURE 7 NOTES. x n. d x if. E [g(x n )] E [g(x)] LECTURE 7 NOTES 1. Convergence of random variables. Before delving into the large samle roerties of the MLE, we review some concets from large samle theory. 1. Convergence in robability: x n x if, for

More information

AKRON: An Algorithm for Approximating Sparse Kernel Reconstruction

AKRON: An Algorithm for Approximating Sparse Kernel Reconstruction : An Algorithm for Aroximating Sarse Kernel Reconstruction Gregory Ditzler Det. of Electrical and Comuter Engineering The University of Arizona Tucson, AZ 8572 USA ditzler@email.arizona.edu Nidhal Carla

More information

John Weatherwax. Analysis of Parallel Depth First Search Algorithms

John Weatherwax. Analysis of Parallel Depth First Search Algorithms Sulementary Discussions and Solutions to Selected Problems in: Introduction to Parallel Comuting by Viin Kumar, Ananth Grama, Anshul Guta, & George Karyis John Weatherwax Chater 8 Analysis of Parallel

More information

Chater Matrix Norms and Singular Value Decomosition Introduction In this lecture, we introduce the notion of a norm for matrices The singular value de

Chater Matrix Norms and Singular Value Decomosition Introduction In this lecture, we introduce the notion of a norm for matrices The singular value de Lectures on Dynamic Systems and Control Mohammed Dahleh Munther A Dahleh George Verghese Deartment of Electrical Engineering and Comuter Science Massachuasetts Institute of Technology c Chater Matrix Norms

More information

Sharp gradient estimate and spectral rigidity for p-laplacian

Sharp gradient estimate and spectral rigidity for p-laplacian Shar gradient estimate and sectral rigidity for -Lalacian Chiung-Jue Anna Sung and Jiaing Wang To aear in ath. Research Letters. Abstract We derive a shar gradient estimate for ositive eigenfunctions of

More information

Analysis of some entrance probabilities for killed birth-death processes

Analysis of some entrance probabilities for killed birth-death processes Analysis of some entrance robabilities for killed birth-death rocesses Master s Thesis O.J.G. van der Velde Suervisor: Dr. F.M. Sieksma July 5, 207 Mathematical Institute, Leiden University Contents Introduction

More information

ε i (E j )=δj i = 0, if i j, form a basis for V, called the dual basis to (E i ). Therefore, dim V =dim V.

ε i (E j )=δj i = 0, if i j, form a basis for V, called the dual basis to (E i ). Therefore, dim V =dim V. Covectors Definition. Let V be a finite-dimensional vector sace. A covector on V is real-valued linear functional on V, that is, a linear ma ω : V R. The sace of all covectors on V is itself a real vector

More information

Elementary theory of L p spaces

Elementary theory of L p spaces CHAPTER 3 Elementary theory of L saces 3.1 Convexity. Jensen, Hölder, Minkowski inequality. We begin with two definitions. A set A R d is said to be convex if, for any x 0, x 1 2 A x = x 0 + (x 1 x 0 )

More information

p-adic Measures and Bernoulli Numbers

p-adic Measures and Bernoulli Numbers -Adic Measures and Bernoulli Numbers Adam Bowers Introduction The constants B k in the Taylor series exansion t e t = t k B k k! k=0 are known as the Bernoulli numbers. The first few are,, 6, 0, 30, 0,

More information

Principles of Computed Tomography (CT)

Principles of Computed Tomography (CT) Page 298 Princiles of Comuted Tomograhy (CT) The theoretical foundation of CT dates back to Johann Radon, a mathematician from Vienna who derived a method in 1907 for rojecting a 2-D object along arallel

More information

Best approximation by linear combinations of characteristic functions of half-spaces

Best approximation by linear combinations of characteristic functions of half-spaces Best aroximation by linear combinations of characteristic functions of half-saces Paul C. Kainen Deartment of Mathematics Georgetown University Washington, D.C. 20057-1233, USA Věra Kůrková Institute of

More information

Beyond Worst-Case Reconstruction in Deterministic Compressed Sensing

Beyond Worst-Case Reconstruction in Deterministic Compressed Sensing Beyond Worst-Case Reconstruction in Deterministic Comressed Sensing Sina Jafarour, ember, IEEE, arco F Duarte, ember, IEEE, and Robert Calderbank, Fellow, IEEE Abstract The role of random measurement in

More information

On the Chvatál-Complexity of Knapsack Problems

On the Chvatál-Complexity of Knapsack Problems R u t c o r Research R e o r t On the Chvatál-Comlexity of Knasack Problems Gergely Kovács a Béla Vizvári b RRR 5-08, October 008 RUTCOR Rutgers Center for Oerations Research Rutgers University 640 Bartholomew

More information

Research Article Controllability of Linear Discrete-Time Systems with Both Delayed States and Delayed Inputs

Research Article Controllability of Linear Discrete-Time Systems with Both Delayed States and Delayed Inputs Abstract and Alied Analysis Volume 203 Article ID 97546 5 ages htt://dxdoiorg/055/203/97546 Research Article Controllability of Linear Discrete-Time Systems with Both Delayed States and Delayed Inuts Hong

More information

The analysis and representation of random signals

The analysis and representation of random signals The analysis and reresentation of random signals Bruno TOÉSNI Bruno.Torresani@cmi.univ-mrs.fr B. Torrésani LTP Université de Provence.1/30 Outline 1. andom signals Introduction The Karhunen-Loève Basis

More information

MATH 6210: SOLUTIONS TO PROBLEM SET #3

MATH 6210: SOLUTIONS TO PROBLEM SET #3 MATH 6210: SOLUTIONS TO PROBLEM SET #3 Rudin, Chater 4, Problem #3. The sace L (T) is searable since the trigonometric olynomials with comlex coefficients whose real and imaginary arts are rational form

More information

Unsupervised Hyperspectral Image Analysis Using Independent Component Analysis (ICA)

Unsupervised Hyperspectral Image Analysis Using Independent Component Analysis (ICA) Unsuervised Hyersectral Image Analysis Using Indeendent Comonent Analysis (ICA) Shao-Shan Chiang Chein-I Chang Irving W. Ginsberg Remote Sensing Signal and Image Processing Laboratory Deartment of Comuter

More information

Recursive Estimation of the Preisach Density function for a Smart Actuator

Recursive Estimation of the Preisach Density function for a Smart Actuator Recursive Estimation of the Preisach Density function for a Smart Actuator Ram V. Iyer Deartment of Mathematics and Statistics, Texas Tech University, Lubbock, TX 7949-142. ABSTRACT The Preisach oerator

More information

Hidden Predictors: A Factor Analysis Primer

Hidden Predictors: A Factor Analysis Primer Hidden Predictors: A Factor Analysis Primer Ryan C Sanchez Western Washington University Factor Analysis is a owerful statistical method in the modern research sychologist s toolbag When used roerly, factor

More information

Uniformly best wavenumber approximations by spatial central difference operators: An initial investigation

Uniformly best wavenumber approximations by spatial central difference operators: An initial investigation Uniformly best wavenumber aroximations by satial central difference oerators: An initial investigation Vitor Linders and Jan Nordström Abstract A characterisation theorem for best uniform wavenumber aroximations

More information

Improvement on the Decay of Crossing Numbers

Improvement on the Decay of Crossing Numbers Grahs and Combinatorics 2013) 29:365 371 DOI 10.1007/s00373-012-1137-3 ORIGINAL PAPER Imrovement on the Decay of Crossing Numbers Jakub Černý Jan Kynčl Géza Tóth Received: 24 Aril 2007 / Revised: 1 November

More information

State Estimation with ARMarkov Models

State Estimation with ARMarkov Models Deartment of Mechanical and Aerosace Engineering Technical Reort No. 3046, October 1998. Princeton University, Princeton, NJ. State Estimation with ARMarkov Models Ryoung K. Lim 1 Columbia University,

More information

Uncorrelated Multilinear Discriminant Analysis with Regularization and Aggregation for Tensor Object Recognition

Uncorrelated Multilinear Discriminant Analysis with Regularization and Aggregation for Tensor Object Recognition TNN-2007-P-0332.R1 1 Uncorrelated Multilinear Discriminant Analysis with Regularization and Aggregation for Tensor Object Recognition Haiing Lu, K.N. Plataniotis and A.N. Venetsanooulos The Edward S. Rogers

More information

Proximal methods for the latent group lasso penalty

Proximal methods for the latent group lasso penalty Proximal methods for the latent grou lasso enalty The MIT Faculty has made this article oenly available. Please share how this access benefits you. Your story matters. Citation As Published Publisher Villa,

More information

LEIBNIZ SEMINORMS IN PROBABILITY SPACES

LEIBNIZ SEMINORMS IN PROBABILITY SPACES LEIBNIZ SEMINORMS IN PROBABILITY SPACES ÁDÁM BESENYEI AND ZOLTÁN LÉKA Abstract. In this aer we study the (strong) Leibniz roerty of centered moments of bounded random variables. We shall answer a question

More information

Detection Algorithm of Particle Contamination in Reticle Images with Continuous Wavelet Transform

Detection Algorithm of Particle Contamination in Reticle Images with Continuous Wavelet Transform Detection Algorithm of Particle Contamination in Reticle Images with Continuous Wavelet Transform Chaoquan Chen and Guoing Qiu School of Comuter Science and IT Jubilee Camus, University of Nottingham Nottingham

More information

The Graph Accessibility Problem and the Universality of the Collision CRCW Conflict Resolution Rule

The Graph Accessibility Problem and the Universality of the Collision CRCW Conflict Resolution Rule The Grah Accessibility Problem and the Universality of the Collision CRCW Conflict Resolution Rule STEFAN D. BRUDA Deartment of Comuter Science Bisho s University Lennoxville, Quebec J1M 1Z7 CANADA bruda@cs.ubishos.ca

More information

THE 2D CASE OF THE BOURGAIN-DEMETER-GUTH ARGUMENT

THE 2D CASE OF THE BOURGAIN-DEMETER-GUTH ARGUMENT THE 2D CASE OF THE BOURGAIN-DEMETER-GUTH ARGUMENT ZANE LI Let e(z) := e 2πiz and for g : [0, ] C and J [0, ], define the extension oerator E J g(x) := g(t)e(tx + t 2 x 2 ) dt. J For a ositive weight ν

More information

Lower Confidence Bound for Process-Yield Index S pk with Autocorrelated Process Data

Lower Confidence Bound for Process-Yield Index S pk with Autocorrelated Process Data Quality Technology & Quantitative Management Vol. 1, No.,. 51-65, 15 QTQM IAQM 15 Lower onfidence Bound for Process-Yield Index with Autocorrelated Process Data Fu-Kwun Wang * and Yeneneh Tamirat Deartment

More information

Deriving Indicator Direct and Cross Variograms from a Normal Scores Variogram Model (bigaus-full) David F. Machuca Mory and Clayton V.

Deriving Indicator Direct and Cross Variograms from a Normal Scores Variogram Model (bigaus-full) David F. Machuca Mory and Clayton V. Deriving ndicator Direct and Cross Variograms from a Normal Scores Variogram Model (bigaus-full) David F. Machuca Mory and Clayton V. Deutsch Centre for Comutational Geostatistics Deartment of Civil &

More information

CMSC 425: Lecture 4 Geometry and Geometric Programming

CMSC 425: Lecture 4 Geometry and Geometric Programming CMSC 425: Lecture 4 Geometry and Geometric Programming Geometry for Game Programming and Grahics: For the next few lectures, we will discuss some of the basic elements of geometry. There are many areas

More information

Robust Predictive Control of Input Constraints and Interference Suppression for Semi-Trailer System

Robust Predictive Control of Input Constraints and Interference Suppression for Semi-Trailer System Vol.7, No.7 (4),.37-38 htt://dx.doi.org/.457/ica.4.7.7.3 Robust Predictive Control of Inut Constraints and Interference Suression for Semi-Trailer System Zhao, Yang Electronic and Information Technology

More information

RANDOM WALKS AND PERCOLATION: AN ANALYSIS OF CURRENT RESEARCH ON MODELING NATURAL PROCESSES

RANDOM WALKS AND PERCOLATION: AN ANALYSIS OF CURRENT RESEARCH ON MODELING NATURAL PROCESSES RANDOM WALKS AND PERCOLATION: AN ANALYSIS OF CURRENT RESEARCH ON MODELING NATURAL PROCESSES AARON ZWIEBACH Abstract. In this aer we will analyze research that has been recently done in the field of discrete

More information

GOOD MODELS FOR CUBIC SURFACES. 1. Introduction

GOOD MODELS FOR CUBIC SURFACES. 1. Introduction GOOD MODELS FOR CUBIC SURFACES ANDREAS-STEPHAN ELSENHANS Abstract. This article describes an algorithm for finding a model of a hyersurface with small coefficients. It is shown that the aroach works in

More information

#A64 INTEGERS 18 (2018) APPLYING MODULAR ARITHMETIC TO DIOPHANTINE EQUATIONS

#A64 INTEGERS 18 (2018) APPLYING MODULAR ARITHMETIC TO DIOPHANTINE EQUATIONS #A64 INTEGERS 18 (2018) APPLYING MODULAR ARITHMETIC TO DIOPHANTINE EQUATIONS Ramy F. Taki ElDin Physics and Engineering Mathematics Deartment, Faculty of Engineering, Ain Shams University, Cairo, Egyt

More information

Combining Logistic Regression with Kriging for Mapping the Risk of Occurrence of Unexploded Ordnance (UXO)

Combining Logistic Regression with Kriging for Mapping the Risk of Occurrence of Unexploded Ordnance (UXO) Combining Logistic Regression with Kriging for Maing the Risk of Occurrence of Unexloded Ordnance (UXO) H. Saito (), P. Goovaerts (), S. A. McKenna (2) Environmental and Water Resources Engineering, Deartment

More information

8 STOCHASTIC PROCESSES

8 STOCHASTIC PROCESSES 8 STOCHASTIC PROCESSES The word stochastic is derived from the Greek στoχαστικoς, meaning to aim at a target. Stochastic rocesses involve state which changes in a random way. A Markov rocess is a articular

More information

Uncorrelated Multilinear Discriminant Analysis with Regularization and Aggregation for Tensor Object Recognition

Uncorrelated Multilinear Discriminant Analysis with Regularization and Aggregation for Tensor Object Recognition Uncorrelated Multilinear Discriminant Analysis with Regularization and Aggregation for Tensor Object Recognition Haiing Lu, K.N. Plataniotis and A.N. Venetsanooulos The Edward S. Rogers Sr. Deartment of

More information

Coding Along Hermite Polynomials for Gaussian Noise Channels

Coding Along Hermite Polynomials for Gaussian Noise Channels Coding Along Hermite olynomials for Gaussian Noise Channels Emmanuel A. Abbe IG, EFL Lausanne, 1015 CH Email: emmanuel.abbe@efl.ch Lizhong Zheng LIDS, MIT Cambridge, MA 0139 Email: lizhong@mit.edu Abstract

More information

8.7 Associated and Non-associated Flow Rules

8.7 Associated and Non-associated Flow Rules 8.7 Associated and Non-associated Flow Rules Recall the Levy-Mises flow rule, Eqn. 8.4., d ds (8.7.) The lastic multilier can be determined from the hardening rule. Given the hardening rule one can more

More information

Applications to stochastic PDE

Applications to stochastic PDE 15 Alications to stochastic PE In this final lecture we resent some alications of the theory develoed in this course to stochastic artial differential equations. We concentrate on two secific examles:

More information

AI*IA 2003 Fusion of Multiple Pattern Classifiers PART III

AI*IA 2003 Fusion of Multiple Pattern Classifiers PART III AI*IA 23 Fusion of Multile Pattern Classifiers PART III AI*IA 23 Tutorial on Fusion of Multile Pattern Classifiers by F. Roli 49 Methods for fusing multile classifiers Methods for fusing multile classifiers

More information

HENSEL S LEMMA KEITH CONRAD

HENSEL S LEMMA KEITH CONRAD HENSEL S LEMMA KEITH CONRAD 1. Introduction In the -adic integers, congruences are aroximations: for a and b in Z, a b mod n is the same as a b 1/ n. Turning information modulo one ower of into similar

More information

Research of power plant parameter based on the Principal Component Analysis method

Research of power plant parameter based on the Principal Component Analysis method Research of ower lant arameter based on the Princial Comonent Analysis method Yang Yang *a, Di Zhang b a b School of Engineering, Bohai University, Liaoning Jinzhou, 3; Liaoning Datang international Jinzhou

More information

Research Article An iterative Algorithm for Hemicontractive Mappings in Banach Spaces

Research Article An iterative Algorithm for Hemicontractive Mappings in Banach Spaces Abstract and Alied Analysis Volume 2012, Article ID 264103, 11 ages doi:10.1155/2012/264103 Research Article An iterative Algorithm for Hemicontractive Maings in Banach Saces Youli Yu, 1 Zhitao Wu, 2 and

More information

Solved Problems. (a) (b) (c) Figure P4.1 Simple Classification Problems First we draw a line between each set of dark and light data points.

Solved Problems. (a) (b) (c) Figure P4.1 Simple Classification Problems First we draw a line between each set of dark and light data points. Solved Problems Solved Problems P Solve the three simle classification roblems shown in Figure P by drawing a decision boundary Find weight and bias values that result in single-neuron ercetrons with the

More information

A note on the random greedy triangle-packing algorithm

A note on the random greedy triangle-packing algorithm A note on the random greedy triangle-acking algorithm Tom Bohman Alan Frieze Eyal Lubetzky Abstract The random greedy algorithm for constructing a large artial Steiner-Trile-System is defined as follows.

More information

The rapid growth in the size and scope of datasets in science

The rapid growth in the size and scope of datasets in science Comutational and statistical tradeoffs via convex relaxation Venkat Chandrasekaran a and Michael I. Jordan b, a Deartments of Comuting and Mathematical Sciences and Electrical Engineering, California Institute

More information

Generalized Coiflets: A New Family of Orthonormal Wavelets

Generalized Coiflets: A New Family of Orthonormal Wavelets Generalized Coiflets A New Family of Orthonormal Wavelets Dong Wei, Alan C Bovik, and Brian L Evans Laboratory for Image and Video Engineering Deartment of Electrical and Comuter Engineering The University

More information

Evaluating Circuit Reliability Under Probabilistic Gate-Level Fault Models

Evaluating Circuit Reliability Under Probabilistic Gate-Level Fault Models Evaluating Circuit Reliability Under Probabilistic Gate-Level Fault Models Ketan N. Patel, Igor L. Markov and John P. Hayes University of Michigan, Ann Arbor 48109-2122 {knatel,imarkov,jhayes}@eecs.umich.edu

More information

Elements of Asymptotic Theory. James L. Powell Department of Economics University of California, Berkeley

Elements of Asymptotic Theory. James L. Powell Department of Economics University of California, Berkeley Elements of Asymtotic Theory James L. Powell Deartment of Economics University of California, Berkeley Objectives of Asymtotic Theory While exact results are available for, say, the distribution of the

More information

MATHEMATICAL MODELLING OF THE WIRELESS COMMUNICATION NETWORK

MATHEMATICAL MODELLING OF THE WIRELESS COMMUNICATION NETWORK Comuter Modelling and ew Technologies, 5, Vol.9, o., 3-39 Transort and Telecommunication Institute, Lomonosov, LV-9, Riga, Latvia MATHEMATICAL MODELLIG OF THE WIRELESS COMMUICATIO ETWORK M. KOPEETSK Deartment

More information