
Computational and statistical tradeoffs via convex relaxation

Venkat Chandrasekaran^a and Michael I. Jordan^b

^a Departments of Computing and Mathematical Sciences and Electrical Engineering, California Institute of Technology, Pasadena, CA 91125; and ^b Departments of Statistics and Electrical Engineering and Computer Sciences, University of California, Berkeley, CA 94720

Contributed by Michael I. Jordan, February 5, 2013 (sent for review November 7, 2012)

Modern massive datasets create a fundamental problem at the intersection of the computational and statistical sciences: how to provide guarantees on the quality of statistical inference given bounds on computational resources, such as time or space. Our approach to this problem is to define a notion of algorithmic weakening, in which a hierarchy of algorithms is ordered by both computational efficiency and statistical efficiency, allowing the growing strength of the data at scale to be traded off against the need for sophisticated processing. We illustrate this approach in the setting of denoising problems, using convex relaxation as the core inferential tool. Hierarchies of convex relaxations have been widely used in theoretical computer science to yield tractable approximation algorithms for many computationally intractable tasks. In the current paper, we show how to endow such hierarchies with a statistical characterization and thereby obtain concrete tradeoffs relating algorithmic runtime to amount of data.

convex geometry | convex optimization | high-dimensional statistics

The rapid growth in the size and scope of datasets in science and technology has created a need for novel foundational perspectives on data analysis that blend computer science and statistics.
That classical perspectives from these fields are not adequate to address emerging problems in Big Data is apparent from their sharply divergent nature at an elementary level: In computer science, the growth of the number of data points is a source of complexity that must be tamed via algorithms or hardware, whereas in statistics, the growth of the number of data points is a source of simplicity in that inferences are generally stronger and asymptotic results can be invoked. In classical statistics, where one considers the increase in inferential accuracy as the number of data points grows, there is little or no consideration of computational complexity. Indeed, if one imposes the additional constraint, as is prevalent in real-world applications, that a certain level of inferential accuracy must be achieved within a limited time budget, classical theory provides no guidance as to how to design an inferential strategy. [Note that classical statistics contains a branch known as sequential analysis that does discuss methods that stop collecting data points after a target error level has been reached (e.g., ref. ), but this is different from the computational complexity guarantees (the number of steps that a computational procedure requires) that are our focus.] In classical computer science, practical solutions to large-scale problems are often framed in terms of approximations to idealized problems; however, even when such approximations are sought, they are rarely expressed in terms of the coin of the realm of the theory of inference: the statistical risk function. Thus, there is little or no consideration of the idea that computation can be simplified in large datasets because of the enhanced inferential power in the data. In general, in computer science, datasets are not viewed formally as a resource on a par with time and space (such that the more of the resource, the better).
On intuitive grounds, it is not implausible that strategies can be designed to yield monotonically improving risk as data accumulate, even in the face of a time budget. In particular, if an algorithm simply ignores all future data once a time budget is exhausted, statistical risk will not increase (under various assumptions that may not be desirable in practical applications). Alternatively, one might allow linear growth in the time budget (e.g., in a real-time setting) and attempt to achieve such growth via a subsampling strategy in which some fraction of the data are dropped. Executing such a strategy may be difficult, however, in that the appropriate fraction depends on the risk function, and thus on a mathematical analysis that may be difficult to carry out. Moreover, subsampling is a limited strategy for controlling computational complexity. More generally, one would like to consider some notion of algorithm weakening, where as data accumulate, one can back off to simpler algorithmic strategies that nonetheless achieve a desired risk. The challenge is to do this in a theoretically sound manner.

We base our approach to this problem on the notion of a time-data complexity class. In particular, we define a class TD(t(p), n(p), ε(p)) of parameter estimation problems in which a p-dimensional parameter underlying an unknown population can be estimated with a risk of ε(p), given n(p) independent and identically distributed (i.i.d.) samples, using an inference procedure with runtime t(p). Our definition parallels the definition of the time-space (TISP) complexity class in computational complexity theory for describing algorithmic tradeoffs between time and space resources (). In this formalization, classical results in estimation theory can be viewed as emphasizing the tradeoffs between the second and third parameters (amount of data and risk). Our focus in this paper is to fix ε(p) to some desired level of accuracy and to investigate the tradeoffs between the first two parameters, namely, runtime and dataset size.
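Before any sophisticated processing enters the picture, the basic currency of the tradeoff above is that risk falls as data accumulate. The following Monte Carlo sketch (not from the paper; names and parameter choices are illustrative) checks this for the simplest estimator, the unconstrained sample mean, whose risk scales as σ²p/n:

```python
import numpy as np

def sample_mean_risk(p, n, sigma, trials=200, seed=0):
    """Monte Carlo estimate of E||x* - ybar||^2 for the sample mean of
    n i.i.d. observations y_i = x* + sigma * z_i (here x* = 0, z_i ~ N(0, I_p))."""
    rng = np.random.default_rng(seed)
    errs = []
    for _ in range(trials):
        y = sigma * rng.standard_normal((n, p))  # observations around x* = 0
        ybar = y.mean(axis=0)                    # data aggregation step
        errs.append(float(np.sum(ybar ** 2)))    # squared l2 error
    return float(np.mean(errs))

# Risk decays like sigma^2 * p / n: data act as a resource, with accuracy
# purchased directly by sample size even with no further processing.
risk_small_n = sample_mean_risk(p=100, n=10, sigma=1.0)
risk_large_n = sample_mean_risk(p=100, n=100, sigma=1.0)
```

Here `risk_small_n` is close to 100/10 = 10 and `risk_large_n` close to 100/100 = 1, matching the σ²p/n scaling.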
Although classical statistics gave little consideration to computational complexity, computational issues have come increasingly to the fore in modern high-dimensional statistics (3), where the number of parameters p is relatively large and the number of data points n is relatively small. In this setting, methods based on convex optimization have been emphasized (particularly methods based on ℓ1 penalties). This is due, in part, to the favorable analytical properties of convex functions and convex sets, but also to the fact that such methods tend to have favorable computational scaling. However, the treatment of computation has remained informal, with little attempt to characterize tradeoffs between computation time and estimation quality. In our work, we aim explicitly at such tradeoffs in the setting in which both n and p are large.

Significance

The growth in the size and scope of datasets in science and technology has created a need for foundational perspectives on data analysis that blend computer science and statistics. Specifically, the core challenge with massive datasets is that of guaranteeing improved accuracy of an analysis procedure as data accrue, even in the face of a time budget. We address this problem via a notion of algorithmic weakening, whereby as data scale, the procedure backs off to cheaper algorithms, leveraging the growing inferential strength of the data to ensure that a desired level of accuracy is achieved within the computational budget.

Author contributions: V.C. and M.I.J. designed research, performed research, and wrote the paper. The authors declare no conflict of interest. Freely available online through the PNAS open access option. To whom correspondence should be addressed. E-mail: jordan@eecs.berkeley.edu. This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1302293110/-/DCSupplemental.

PNAS | Published online March 2013 | E1181–E1190

To develop a notion of algorithm weakening that combines computational and statistical considerations, we consider estimation procedures for which we can characterize the computational benefits, as well as the loss in estimation performance, due to the use of weaker algorithms. Reflecting the fact that the space of all algorithms is poorly understood, we retain the focus on convex optimization from high-dimensional statistics, but we consider parameterized hierarchies of optimization procedures in which a form of algorithm weakening is obtained by using successively weaker outer approximations to convex sets. Such convex relaxations have been widely used to give efficient approximation algorithms for intractable problems in computer science (4). As we will discuss, a precise characterization of both the estimation performance and the computational complexity of using a particular relaxation of a convex set can be obtained by appealing to convex geometry and to results on the complexity of solving convex programs. Specifically, the tighter relaxations in these families offer better approximation quality (and better estimation performance in our context) but are computationally more complex. On the other hand, the weaker relaxations are computationally more tractable and can provide the same estimation performance as the tighter ones, but with access to more data. In this manner, convex relaxations provide a principled mechanism for weakening inference algorithms so as to reduce the runtime in processing larger datasets.

To demonstrate explicit tradeoffs in high-dimensional, large-scale inference, we focus, for simplicity and concreteness, on estimation in sequence models (5):

    y = x* + σz,  [1]

where σ > 0, the noise vector z ∈ R^p is standard normal, and the unknown parameter x* belongs to a known subset S ⊆ R^p.
The objective is to estimate x* based on n independent observations {y_i}_{i=1}^n of y. This denoising setup has a long history and has been at the center of some remarkable results in the high-dimensional setting over the past two decades, beginning with the papers of Donoho and Johnstone (6, 7). The estimators discussed next proceed by first computing the sample mean ȳ = (1/n) Σ_{i=1}^n y_i and then using ȳ as input to a suitable convex program. Of course, this is equivalent to a denoising problem in which the noise variance is σ²/n and we are given just one sample. The reason why we consider the more elaborate two-step procedure is to account more accurately both for data aggregation and for subsequent processing in our runtime calculations. Indeed, in a real-world setting, one is typically faced with a massive dataset in unaggregated form, and when both p and n may be large, summarizing the data before any further processing can itself be an expensive computation. As will be seen in concrete calculations of time-data tradeoffs, the number of operations corresponding to data aggregation is sometimes comparable to, or even larger than, the number of operations required for subsequent processing in a massive data setting.

To estimate x*, we consider the following natural shrinkage estimator, given by a projection of the sample mean ȳ onto a convex set C that is an outer approximation to S (i.e., S ⊆ C):

    x̂_n(C) = argmin_{x ∈ R^p} ‖ȳ − x‖²_{ℓ2}  s.t.  x ∈ C.  [2]

We study the estimation performance of a family of shrinkage estimators {x̂_n(C_i)} that use as the convex constraint one of a sequence of convex outer approximations {C_i} with C_1 ⊇ C_2 ⊇ ⋯ ⊇ S. Given the same number of samples, using a weaker relaxation, such as C_1, leads to an estimator with a larger risk than would result from using a tighter relaxation, such as C_2. On the other hand, given access to more data samples, the weaker approximations provide the same estimation guarantees as the tighter ones.
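As a concrete instance of the estimator [2] (an illustration, not an example worked in the paper): when C is the scaled ℓ1 ball, the Euclidean projection of the sample mean can be computed directly by the standard sort-and-threshold algorithm, with no general-purpose convex solver. A minimal numpy sketch, in which the function names are our own:

```python
import numpy as np

def project_l1_ball(v, radius=1.0):
    """Euclidean projection of v onto {x : ||x||_1 <= radius},
    via the standard sort-based soft-thresholding algorithm."""
    if np.abs(v).sum() <= radius:
        return v.copy()  # already feasible
    u = np.sort(np.abs(v))[::-1]            # magnitudes, descending
    css = np.cumsum(u)
    ks = np.arange(1, len(v) + 1)
    rho = np.max(ks[u - (css - radius) / ks > 0])
    theta = (css[rho - 1] - radius) / rho   # soft-threshold level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def shrinkage_estimator(samples, radius=1.0):
    """x_hat_n(C) of [2] for C = {x : ||x||_1 <= radius}:
    aggregate the data into the sample mean, then project onto C."""
    ybar = samples.mean(axis=0)
    return project_l1_ball(ybar, radius)
```

The two-step structure (aggregation, then projection) mirrors the runtime accounting discussed above: the mean costs O(np) operations and the projection O(p log p).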
In settings in which computing a weaker approximation is more tractable than computing a tighter one, a natural computation/sample tradeoff arises. We characterize this tradeoff in a number of stylized examples, motivated by problems such as collaborative filtering, learning an ordering of a collection of random variables, and inference in networks. More broadly, this paper highlights the role of computation in estimation by jointly studying both the computational and statistical aspects of high-dimensional inference. Such an understanding is of particular interest in modern inferential tasks in data-rich settings. Furthermore, an observation from our examples on time-data tradeoffs is that, in many contexts, one does not need too many extra data samples to go from a computationally inefficient estimator based on a tight relaxation to an extremely efficient estimator based on a weaker relaxation. Consequently, in application domains in which obtaining more data is not too expensive, it may be preferable to acquire more data, with the upshot being that the computational infrastructure can be relatively less sophisticated. We should note that we investigate only one algorithm-weakening mechanism, namely, convex relaxation, and one class of statistical estimation problems, namely, denoising in a high-dimensional sequence model. There is reason to believe, however, that the principles described in this paper are relevant more generally. Convex optimization-based procedures are used in a variety of large-scale data analysis tasks (3, 8), and it is likely to be interesting to explore hierarchies of convex relaxations in such tasks. In addition, there are a number of potentially interesting mechanisms beyond convex relaxation for weakening inference procedures, such as dimensionality reduction or other forms of data quantization, and approaches based on clustering or coresets. We discuss these and other research directions in Conclusions.
Related Work

A number of papers have considered computational and sample complexity tradeoffs in the setting of learning binary classifiers. Specifically, several authors have described settings under which speedups in the running time of a classifier learning algorithm are possible given a substantial increase in dataset size (9–12). In contrast, in the denoising setup considered in this paper, several of our examples of time-data tradeoffs demonstrate significant computational speedups with just a constant factor increase in dataset size. Another attempt in the binary classifier learning setting, building on earlier work on classifier learning in data-rich problems (13), has shown that modest improvements in runtime (of constant factors) may be possible with access to more data by using the stochastic gradient descent method (14). Time-data tradeoffs have also been characterized in Boolean network training from time series data (15), but the computational speedups offered there are from exponential-time algorithms to slightly faster but still exponential-time algorithms. Two recent papers (16, 17) have considered time-data tradeoffs in sparse principal component analysis (PCA) (18) and in biclustering, in which one wishes to estimate the support of a sparse eigenvector that contains most of the energy of a matrix. We also study time-data tradeoffs for this problem, but from a denoising perspective rather than one of estimating the support of the leading sparse eigenvector. In our discussion of Example 3 (Time-Data Tradeoffs), we describe the differences between our problem setup and these latter two papers (16, 17). Finally, a recent paper (19) studies time-data tradeoffs in model selection problems by investigating procedures that operate within a computational budget. As a general contrast to all these previous results, a major contribution of the present paper is the demonstration of the efficacy of convex relaxation as a powerful algorithm-weakening mechanism for processing massive datasets in a broad range of settings.
Paper Outline

The main sections of this paper proceed in the following sequence. The next section describes a framework for formally stating results on time-data tradeoffs. We then provide some background on convex optimization and relaxations of convex sets. Following this, we investigate in detail the denoising problem [1] and characterize the risk obtained when one employs a convex programming estimator of the type [2]. Subsequently, we give several examples of time-data tradeoffs in concrete denoising problems. Finally, we conclude with a discussion of directions for further research.

Formally Stating Time-Data Tradeoffs

In this section, we describe a framework in which to state results on computational and statistical tradeoffs in estimation. Our discussion is relevant to general parameter estimation problems and inference procedures; one may keep in mind the denoising problem [1] for concreteness. Consider a sequence of estimation problems indexed by the dimension p of the parameter to be estimated. Fix a risk function ε(p) that specifies the desired error of an estimator. For example, in the denoising problem [1], the error of an estimator of the form [2] may be specified as the worst-case mean squared error taken over all elements of the set S (i.e., sup_{x* ∈ S} E[‖x* − x̂_n(C)‖²_{ℓ2}]). One can informally view an estimation algorithm that achieves a risk of ε(p) by processing n(p) samples with runtime t(p) as a point on a two-dimensional plot as shown in Fig. 1, with one axis representing the runtime and the other representing the sample complexity. To be precise, the axes in the plot index functions (of p) that represent runtime and number of samples, but we do not emphasize such formalities and rather use these plots to provide a useful qualitative comparison of inference algorithms. In Fig. 1, procedure A requires fewer samples than procedure C to achieve the same error, but this reduction in sample complexity comes at the expense of a larger runtime. Procedure B has both a larger sample complexity and a larger runtime than procedure C; thus, it is strictly dominated by procedure C. Given an error function ε(p), there is a lower bound on the number of samples n(p) required to achieve this error using any computational procedure [i.e., no constraints on t(p)]; such information-theoretic or minimax risk lower bounds correspond to vertical lines in the plot in Fig. 1.
Characterizing these fundamental limits on sample complexity has been a traditional focus in the estimation theory literature, with a fairly complete set of results available in many settings. One can imagine asking for similar lower bounds on the computational side, corresponding to horizontal lines in the plot in Fig. 1: Given a desired risk ε(p) and access to an unbounded number of samples, what is a nontrivial lower bound on the runtime t(p) of any inference algorithm that achieves a risk of ε(p)? Such complexity-theoretic lower bounds are significantly harder to obtain, and they remain a central open problem in computational complexity theory.

Fig. 1. Tradeoff between the runtime and sample complexity in a stylized parameter estimation problem. Here, the risk is assumed to be fixed to some desired level, and the points in the plot refer to different algorithms that require a certain runtime and a certain number of samples to achieve the desired risk. The vertical and horizontal lines refer to lower bounds in sample complexity and in runtime, respectively.

This research landscape informs the qualitative nature of the statements on time-data tradeoffs we make in this paper. First, we will not attempt to prove combined lower bounds involving n(p) and t(p) jointly, as is traditionally done in the characterization of tradeoffs between physical quantities; this is because obtaining a lower bound just on t(p) remains a substantial challenge. Hence, our time-data tradeoff results on the use of more efficient algorithms for larger datasets refer to a reduction in the upper bounds on runtimes of estimation procedures with increases in dataset size. Second, in any setting in which there is a computational cost associated with touching each data sample and in which the samples are exchangeable, there is a sample threshold beyond which it is computationally more efficient to throw away excess data samples than to process them in any form. This observation suggests that there is a floor, as in Fig. 1
with procedures E, F, G, and H, beyond which additional data do not lead to a reduction in runtime. Precisely characterizing this sample threshold is generally very hard because it depends on difficult-to-obtain computational lower bounds for estimation tasks, as well as on the particular space of estimation algorithms that one may use. We comment further on this point when we consider concrete examples of time-data tradeoffs.

To state our results concerning time-data tradeoffs formally, we define a resource class constrained by runtime and sample complexity as follows.

Definition 1: Consider a sequence of parameter estimation problems indexed by the dimension p of the space of parameters that index an underlying population. This sequence of estimation problems belongs to a time-data class TD(t(p), n(p), ε(p)) if there exists an inference procedure for the sequence of problems with runtime upper-bounded by t(p), with the number of i.i.d. samples processed bounded by n(p), and which achieves a risk bounded by ε(p).

We note that our definition of a time-data resource class parallels the time-space resource classes considered in complexity theory (). In that literature, TISP(t(p), s(p)) denotes a class of problems of input size p that can be solved by some algorithm using t(p) operations and s(p) units of space. With this formalism, classical minimax bounds can be stated as follows. Given some function n(p) for the number of samples, suppose a parameter estimation problem has a minimax risk of ε_minimax(p) [which depends on the function n(p)]. If an estimator achieving a risk of ε_minimax(p) is computable with runtime t(p), this estimation problem then lies in TD(t(p), n(p), ε_minimax(p)). Thus, the emphasis is fundamentally on the relationship between n(p) and ε_minimax(p), without much focus on the computational procedure that achieves the minimax risk bound.
Our interest in this paper is to fix the risk ε(p) = ε_desired(p) to be equal to some desired level of accuracy and to investigate the tradeoffs between t(p) and n(p) so that a parameter estimation problem lies in TD(t(p), n(p), ε_desired(p)).

Convex Relaxation

In this section, we describe the particular algorithmic toolbox on which we focus, namely, convex programs. Convex optimization methods offer a powerful framework for statistical inference owing to the broad class of estimators that can be effectively modeled as convex programs. Furthermore, the theory of convex analysis is useful both for characterizing the statistical properties of convex programming-based estimators and for developing methods to compute such estimators efficiently. Most importantly from our viewpoint, convex optimization methods provide a principled and general framework for algorithm weakening based on relaxations of convex sets. We briefly discuss in this section the key ideas from this literature that are relevant to this paper. A notion central to the geometric viewpoint adopted here is that of a convex cone, which is a convex set that is closed under nonnegative linear combinations.

Representation of Convex Sets. Convex programs refer to a class of optimization problems in which we seek to minimize a convex

function over a convex constraint set (8). For example, linear programming (LP) and semidefinite programming (SDP) are two prominent subclasses in which linear functions are minimized over constraint sets given by affine spaces intersecting the nonnegative orthant (in LP) and the positive semidefinite cone (in SDP). Roughly speaking, convex programs are tractable to solve computationally if the convex objective function can be computed efficiently and if membership of an arbitrary point in the convex constraint set can be certified efficiently; we will informally refer to this latter operation as computing the convex constraint set. (More precisely, one requires an efficient separation oracle that responds YES if the point is in the convex set and otherwise provides a hyperplane that separates the point from the convex set.) It is then clear that the main computational bottleneck associated with solving convex programs of the form [2] is the efficiency of computing the constraint sets. A central insight from the literature on convex optimization is that the complexity of computing a convex set is closely linked to how efficiently the set can be represented. Specifically, if a convex set can be expressed as the intersection of a small number of basic or elementary convex sets, each of which is tractable to compute, then the original convex set is also tractable to compute, and one can, in turn, optimize over this set efficiently. Examples of basic convex sets include affine spaces or cones, such as the nonnegative orthant and the cone of positive semidefinite matrices. Indeed, a canonical method with which to represent a convex set is to express the set as the intersection of a cone and an affine space. In what follows, we consider such conic representations of convex sets in R^p.

Definition 2: Let C ⊆ R^p be a convex set, and let K ⊆ R^p be a convex cone.
C is then said to be K-representable if C can be expressed as follows for A ∈ R^{m×p}, b ∈ R^m:

    C = {x | x ∈ K, Ax = b}.  [3]

Such a representation of C is called a K-representation. Informally, if K is the nonnegative orthant (or the semidefinite cone), we refer to the resulting representations as LP representations (or SDP representations), following commonly used terminology in the literature. A virtue of conic representations of convex sets based on the orthant or the semidefinite cone is that these representations lead to a numerical recipe for solving convex optimization problems of the form [2] via a natural associated barrier penalty (). The computational complexity of these procedures is polynomial in the dimension of the cone, and we discuss runtimes for specific instances in our discussion of concrete examples of time-data tradeoffs.

Example 1: The (p − 1)-dimensional simplex is an example of an LP-representable set:

    Δ_{p−1} = {x | 1′x = 1, x ≥ 0},  [4]

where 1 ∈ R^p is the all-ones vector. The simplex is the set of probability vectors in R^p. The next example is an SDP-representable set that is commonly encountered both in optimization and in statistics.

Example 2: The elliptope, or the set of correlation matrices, in the space of m × m symmetric matrices is defined as follows:

    E_{m×m} = {X | X ⪰ 0, X_ii = 1 ∀ i}.  [5]

Conic representations are somewhat limited in their modeling capacity, and an important generalization is obtained by considering lifted representations. In particular, the notion of lift-and-project plays a critical role in many examples of efficient representations of convex sets. The lift-and-project concept is simple: We wish to express a convex set C ⊆ R^p as the projection of a convex set C′ ⊆ R^{p′} in some higher dimensional space (i.e., p′ > p). The complexity of solving the associated convex programs is now a function of the lifting dimension p′. Thus, lift-and-project techniques are useful if p′ is not too much larger than p and if C′ has an efficient representation in the higher dimensional space R^{p′}.
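To make the LP-representation idea in Example 1 concrete, the sketch below (an illustration of ours, not a computation from the paper) minimizes a linear function over the simplex by handing its conic description — one equality constraint plus the nonnegative orthant — directly to an off-the-shelf LP solver:

```python
import numpy as np
from scipy.optimize import linprog

# Minimize <c, x> over the probability simplex
# Delta = {x : 1'x = 1, x >= 0}, an LP-representable set [4].
c = np.array([2.0, -1.0, 0.5])
res = linprog(
    c,
    A_eq=np.ones((1, 3)),          # 1'x = 1  (affine part of the representation)
    b_eq=[1.0],
    bounds=[(0, None)] * 3,        # x >= 0   (conic part: the nonnegative orthant)
)
# A linear function over a polytope is minimized at a vertex; for the
# simplex the vertices are the standard basis vectors, so the solver
# returns the basis vector selecting the smallest entry of c.
```

Here `res.x` is (0, 1, 0) and `res.fun` equals −1, the smallest entry of `c`.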
Lift-and-project provides a very powerful representation tool, as seen in the following example.

Example 3: The cross-polytope is the unit ball of the ℓ1-norm:

    B_{ℓ1} = { x ∈ R^p | Σ_i |x_i| ≤ 1 }.

The ℓ1-norm has been the focus of much attention recently in statistical model selection and feature selection due to its sparsity-inducing properties (, ). Although the cross-polytope has only 2p vertices, a direct specification in terms of linear constraints involves 2^p inequalities:

    B_{ℓ1} = { x ∈ R^p | Σ_i z_i x_i ≤ 1 ∀ z ∈ {−1, +1}^p }.

However, we can obtain a tractable representation by lifting to R^{2p} and then projecting onto the first p coordinates:

    B_{ℓ1} = { x ∈ R^p | ∃ z ∈ R^p s.t. −z_i ≤ x_i ≤ z_i ∀ i, Σ_i z_i ≤ 1 }.

Note that in R^{2p}, with the additional variables z, we have only 2p + 1 inequalities. Another example of a polytope that requires many inequalities in a direct description is the permutahedron (23), the convex hull of all the permutations of the vector [1, ..., p] ∈ R^p. In fact, the permutahedron requires exponentially many linear inequalities in a direct description, whereas a lifted representation involves O(p log(p)) additional variables and about O(p log(p)) inequalities in the higher dimensional space (24). We refer the reader to the literature on conic representations for other examples (ref. 25 and references therein), including lifted semidefinite representations.

Hierarchies of Convex Relaxations. In many cases of interest, convex sets may not have tractable representations. Lifted representations in such cases have lifting dimensions that are superpolynomially large in the dimension of the original convex set; thus, the associated numerical techniques lead to intractable computational procedures that have superpolynomial runtime with respect to the dimension of the original set. A prominent example of a convex set that is difficult to compute is the cut polytope:

    CUT_{m×m} = conv{ mm′ | m ∈ {−1, +1}^m }.  [6]

Rank-one signed matrices and their convex combinations are of interest in collaborative filtering and clustering problems (see Time-Data Tradeoffs).
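The lifted description of the cross-polytope in Example 3 can be exercised directly: the sketch below (our own illustration; the function name is hypothetical) optimizes a linear function over B_{ℓ1} using the 2p + 1 lifted inequalities in place of the 2^p direct ones:

```python
import numpy as np
from scipy.optimize import linprog

def minimize_over_l1_ball(c):
    """Minimize <c, x> over B_l1 = {x : sum_i |x_i| <= 1} via the lifted
    LP representation with variables (x, z):
        x_i - z_i <= 0,  -x_i - z_i <= 0  (for all i),  sum_i z_i <= 1,
    i.e., 2p + 1 inequalities rather than the 2^p of a direct description."""
    p = len(c)
    obj = np.concatenate([c, np.zeros(p)])       # z does not enter the objective
    I = np.eye(p)
    A_ub = np.block([[I, -I],                    #  x - z <= 0
                     [-I, -I],                   # -x - z <= 0
                     [np.zeros((1, p)), np.ones((1, p))]])  # sum z <= 1
    b_ub = np.zeros(2 * p + 1)
    b_ub[-1] = 1.0
    bounds = [(None, None)] * p + [(0, None)] * p
    res = linprog(obj, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.x[:p]                             # project onto the first p coordinates

# The minimizer is a vertex of the cross-polytope: the signed unit vector
# at the largest-magnitude entry of c, with sign opposite to that entry.
```

For example, with c = (1, −3, 2) the minimizer is the vertex (0, 1, 0), achieving the value −max_i |c_i| = −3.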
There is no known tractable representation of the cut polytope; lifted linear or semidefinite representations have lifting dimensions that are superpolynomial in size. Such computational issues have led to a large literature on approximating intractable convex sets by tractable ones. For the purposes of this paper, and following the dominant trend in the literature, we focus on outer approximations. For example, the elliptope [5] is an outer relaxation of the cut polytope, and it has been used in approximation algorithms for intractable combinatorial optimization problems, such as finding the maximum-weight cut in a graph (26). More generally, one can imagine a hierarchy of increasingly tighter approximations {C_i} of a convex set C:

    C ⊆ ⋯ ⊆ C_3 ⊆ C_2 ⊆ C_1.

There exist several mechanisms for deriving such hierarchies, and we describe three frameworks here. In the first framework, which was developed by Sherali and Adams (27), the set C is assumed to be polyhedral, and each element of the family {C_i} is also polyhedral. Specifically, each C_i is expressed via a lifted LP representation. Tighter approximations are obtained by resorting to larger sized lifts, such that the lifting dimension increases with the level i in the hierarchy. The second framework is similar in spirit to the first one, but the set C is now a convex basic, closed semialgebraic set and the approximations {C_i} are given by lifted SDP representations. [A basic, closed semialgebraic set is the collection of solutions of a

system of polynomial equations and polynomial inequalities (28).] Again, the lifting dimension increases with the level i in the hierarchy. This method was initially pioneered by Parrilo (29, 30) and by Lasserre (31), and it was studied in greater detail subsequently by Gouveia et al. (32). The first and second frameworks are similar in spirit in that tighter approximations are obtained via lifted representations with successively larger lifting dimensions. The third framework that we mention here is qualitatively different from the first two. Suppose C is a convex set that has a K-representation; by successively weakening the cone K itself, one obtains increasingly weaker approximations to C. Specifically, we consider the setting in which the cone K is a hyperbolicity cone (33). Such cones have rich geometric and algebraic structure, and their boundary is given in terms of the vanishing of hyperbolic polynomials. They include the orthant and the semidefinite cone as special cases. We do not go into further technical details and formal definitions of these cones here; instead, we refer the interested reader to the work of Renegar (33). The main idea is that one can obtain a family of relaxations {K_i} of a hyperbolicity cone K ⊆ R^p, where each K_i is a convex cone (in fact, hyperbolic) and is a subset of R^p:

    K ⊆ ⋯ ⊆ K_3 ⊆ K_2 ⊆ K_1.  [7]

These outer conic approximations are obtained by taking certain derivatives of the hyperbolic polynomial used to define the original cone K (more details are provided in ref. 33). One then constructs a hierarchy of approximations {C_i} to C by replacing the cone K in the representation of C by the family of conic approximations {K_i}. From [3] and [7], it is clear that the approximations {C_i} so defined satisfy C ⊆ ⋯ ⊆ C_3 ⊆ C_2 ⊆ C_1.
The important point in these three frameworks is that the family of approximations {C_i} obtained in each case is ordered both by approximation quality and by computational complexity; that is, the weaker approximations in the hierarchy are also the ones that are more tractable to compute. This observation leads to an algorithm-weakening mechanism that is useful for processing larger datasets more coarsely. As demonstrated concretely in the next section, the estimator [2] based on a weaker approximation to C can provide the same statistical performance as one based on a stronger approximation to C, provided that the former estimator is evaluated with more data. The upshot is that the first estimator is more tractable to compute than the second. Thus, we obtain a technique for reducing the runtime required to process a larger dataset.

Estimation via Convex Optimization

In this section, we investigate the statistical properties of the estimator [2] for the denoising problem [1]. The signal set S in [1] differs based on the application of interest. For example, S may be the set of sparse vectors in a fixed basis, which could correspond to the problem of denoising sparse vectors in wavelet bases (6). The signal set S may be the set of low-rank matrices, which leads to problems of collaborative filtering (34). Finally, S may be a set of permutation matrices corresponding to rankings over a collection of items. Our analysis in this section is general, and it is applicable to these and other settings (concrete examples are provided in Time-Data Tradeoffs). In some denoising problems, one is interested in noise models other than Gaussian. We comment on the performance of the estimator [2] in settings with non-Gaussian noise, although we primarily focus on the Gaussian case for simplicity.

Convex Programming Estimators. To analyze the performance of the estimator [2], we introduce a few concepts from convex analysis (35).
Given a closed convex set C ⊆ R^p and a point a ∈ C, we define the tangent cone at a with respect to C as

T_C(a) = cone{b − a | b ∈ C}.   [8]

Here, cone(·) refers to the conic hull of a set, obtained by taking nonnegative linear combinations of elements of the set. The cone T_C(a) is the set of directions to points in C from the point a. The polar K* ⊆ R^p of a cone K ⊆ R^p is the cone

K* = {h ∈ R^p | ⟨h, d⟩ ≤ 0 for all d ∈ K}.

The normal cone N_C(a) at a with respect to the convex set C is the polar cone of the tangent cone T_C(a):

N_C(a) = T_C(a)*.   [9]

Thus, the normal cone consists of vectors that form an obtuse angle with every vector in the tangent cone T_C(a). Both the tangent and normal cones are convex cones. A key quantity that will appear in our error bounds is the following notion of the complexity or size of a tangent cone.

Definition 3: The Gaussian squared-complexity of a set D ⊆ R^p is defined as

g(D) = E[ sup_{a ∈ D} ⟨a, g⟩² ],

where the expectation is with respect to g ∼ N(0, I_{p×p}). This quantity is closely related to the Gaussian complexity of a set (36, 37), which involves no squaring of the term inside the expectation. The Gaussian squared-complexity shares many properties in common with the Gaussian complexity, and we describe those that are relevant to this paper in the next subsection. Specifically, we discuss methods to estimate this quantity for sets D that have some structure. With these definitions, and letting B^p_{ℓ2} denote the ℓ2 ball in R^p, we have the following result on the error between x̂_n(C) and x*.

Proposition 4. For x* ∈ S ⊆ R^p and with C ⊆ R^p convex such that S ⊆ C, we have the error bound

E‖x* − x̂_n(C)‖²_{ℓ2} ≤ (σ²/n) g( T_C(x*) ∩ B^p_{ℓ2} ).

Proof: We have that ȳ = x* + (σ/√n) z. We begin by establishing a bound that is derived by conditioning on z = z̃. Subsequently, taking expectations concludes the proof. We have from the optimality conditions (35) of the convex program [] that

x* + (σ/√n) z̃ − x̂_n(C)|_{z=z̃} ∈ N_C( x̂_n(C)|_{z=z̃} ).

Here, x̂_n(C)|_{z=z̃} represents the optimal value of [] conditioned on z = z̃.
Because x* ∈ S ⊆ C, we have that x* − x̂_n(C)|_{z=z̃} ∈ T_C( x̂_n(C)|_{z=z̃} ). Because the normal and tangent cones are polar to each other, we have that

⟨ x* + (σ/√n) z̃ − x̂_n(C)|_{z=z̃} , x* − x̂_n(C)|_{z=z̃} ⟩ ≤ 0.

It then follows that

‖x* − x̂_n(C)|_{z=z̃}‖²_{ℓ2} ≤ (σ/√n) ⟨ x̂_n(C)|_{z=z̃} − x* , z̃ ⟩
  = (σ/√n) ‖x̂_n(C)|_{z=z̃} − x*‖_{ℓ2} ⟨ (x̂_n(C)|_{z=z̃} − x*) / ‖x̂_n(C)|_{z=z̃} − x*‖_{ℓ2} , z̃ ⟩
  ≤ (σ/√n) ‖x̂_n(C)|_{z=z̃} − x*‖_{ℓ2} sup_{d ∈ T_C(x*), ‖d‖_{ℓ2} ≤ 1} ⟨ d, z̃ ⟩.

Dividing both sides by ‖x̂_n(C)|_{z=z̃} − x*‖_{ℓ2}, squaring both sides, and finally taking expectations completes the proof.

Note that the basic structure of the error bound provided by the estimator [] in fact holds for an arbitrary distribution on the noise z, with the Gaussian squared-complexity suitably modified. However, we focus for the rest of this paper on the Gaussian case, z ∼ N(0, I_{p×p}). To summarize in words, the mean squared error is bounded by the noise variance times the Gaussian squared-complexity of the normalized tangent cone with respect to C at the true parameter x*. Essentially, it measures the amount of noise restricted to the tangent cone, which is intuitively reasonable because only the noise that moves one away from x* in a feasible direction in C must contribute toward the error. Therefore, if the convex constraint set C is sharp at x* so that the cone T_C(x*) is narrow, the error is then small. At the other extreme, if the constraint set C = R^p, the error is then σ²p/n, as one would expect.

STATISTICS | PNAS PLUS — Chandrasekaran and Jordan — PNAS, published online March 2013.

Although Proposition 4 is useful and indeed will suffice for the purposes of demonstrating time-data tradeoffs in the next section, there are a couple of shortcomings in the result as stated. First, suppose the signal set S is contained in a ball around the origin, with the radius of the ball being small relative to the noise variance σ²/n. In such a setting, the estimator x̂ = 0 leads to a smaller mean squared error than one would obtain from Proposition 4. Second, and somewhat more subtly, suppose that one does not have a perfect bound on the size of the signal set. For concreteness, consider a setting in which S is a set of sparse vectors with bounded ℓ1 norm, in which case a good choice for the constraint set C in the estimator [] is an appropriately scaled ℓ1 ball. However, if we do not know the ℓ1 norm of x* a priori, we may then end up using a constraint set C such that x* does not belong to C (hence, x* is an infeasible solution) or such that x* lies strictly in the interior of C [hence, T_C(x*) is all of R^p]. Both these situations are undesirable because they limit the applicability of Proposition 4 and provide very loose error bounds. The following result addresses these shortcomings by weakening the assumptions of Proposition 4.

Proposition 5. Let x* ∈ S ⊆ R^p, and let C ⊆ R^p be a convex set. Suppose there exists a point x̃ ∈ C such that C − x̃ = Q₁ + Q₂, with Q₁, Q₂ lying in orthogonal subspaces of R^p and Q₁ ⊆ αB^p_{ℓ2} for some α ≥ 0.
We then have that

E‖x* − x̂_n(C)‖²_{ℓ2} ≤ 6[ (σ²/n) g( cone(Q₂) ∩ B^p_{ℓ2} ) + ‖x* − x̃‖²_{ℓ2} + α² ].

Here, cone(Q₂) is the conic hull of Q₂. The proof of this result is presented in SI Appendix. A number of remarks are in order here. With respect to the first shortcoming in Proposition 4 stated above, if C is chosen such that S ⊆ C, one can set x̃ = x*, Q₂ = {0}, and Q₁ = C − x̃ in Proposition 5 and can readily obtain a bound that scales only with the diameter of the convex constraint set C. With regard to the second shortcoming in Proposition 4 described above, if a point x̃ ∈ C near x* has a narrow tangent cone T_C(x̃), one can then provide an error bound with respect to g(T_C(x̃)) with an extra additive term that depends on ‖x* − x̃‖_{ℓ2}; this is done by setting Q₂ = C − x̃ and Q₁ = {0} (thus, α = 0) in Proposition 5. More generally, Proposition 5 incorporates both of these improvements in a single error bound with respect to an arbitrary point x̃ ∈ C; thus, one can further optimize the error bound over the choice of x̃ ∈ C (as well as the choice of the decomposition into Q₁ and Q₂).

Properties and Computation of Gaussian Squared-Complexity. We record some properties of the Gaussian squared-complexity that are subsequently useful when we demonstrate concrete time-data tradeoffs. It is clear that g(·) is monotonic with respect to set nesting [i.e., g(D₁) ≤ g(D₂) for sets D₁ ⊆ D₂]. If D is a subspace, one can then check that g(D ∩ B^p_{ℓ2}) = dim(D). To estimate squared-complexities of families of cones, one can imagine appealing to techniques similar to those used for estimating Gaussian complexities of sets (36, 37). Most prominent among these are arguments based on covering number and metric entropy bounds. However, these arguments are frequently not sharp and introduce extraneous log-factors in the resulting error bounds. In a recent paper by Chandrasekaran et al. (38), sharp upper bounds on the Gaussian complexities of normalized cones have been established for families of cones of interest in a class of linear inverse problems.
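The subspace identity and the duality-based computation of g(·) can both be checked by simulation. A quick sketch (dimensions and trial counts are our illustrative choices): for a k-dimensional coordinate subspace intersected with the unit ball, the supremum of ⟨a, g⟩ is the norm of the Gaussian's projection onto the subspace, so the Monte Carlo estimate should concentrate near k; for the nonnegative orthant, the supremum over the normalized cone equals the distance from g to the polar (nonpositive) orthant, whose expected square is p/2.

```python
import numpy as np

rng = np.random.default_rng(1)
p, k, trials = 40, 7, 4000

# Monte Carlo estimate of g(D) for D = (k-dim coordinate subspace) ∩ (unit l2 ball):
# sup_{a in D} <a, g> = norm of the projection of g onto the subspace, so g(D) ≈ k.
G = rng.normal(size=(trials, p))
sup_vals = np.linalg.norm(G[:, :k], axis=1)      # projection onto first k coordinates
g_subspace = np.mean(sup_vals ** 2)

# For the nonnegative orthant K, sup_{d in K ∩ B} <d, g> = ||max(g, 0)||,
# which equals dist(g, K*) with K* the nonpositive orthant; E[dist^2] = p/2.
sup_orthant = np.linalg.norm(np.maximum(G, 0.0), axis=1)
g_orthant = np.mean(sup_orthant ** 2)
```

With 4,000 trials, both estimates land within a small fraction of their exact values (k = 7 and p/2 = 20, respectively).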
The (square of the) Gaussian complexity can be upper-bounded by the Gaussian squared-complexity g(D) via Jensen's inequality:

[ E sup_{d ∈ D} ⟨d, g⟩ ]² ≤ g(D),

where g is a standard normal vector. In fact, most of the bounds in the paper by Chandrasekaran et al. (38) were obtained by bounding g(D); thus, they are directly relevant to our setting. In the rest of this section, we present the bounds on g(D) from the paper by Chandrasekaran et al. (38) that will be used in this paper, deferring to that paper for proofs in most cases. In some cases, the proofs do require modifications with respect to their counterparts in the paper by Chandrasekaran et al. (38), and for these cases, we give full proofs in SI Appendix. The first result, proved in ref. 38, is a direct consequence of convex duality and provides a fruitful general technique to compute sharp estimates of Gaussian squared-complexities. Let dist(a, D) denote the ℓ2 distance from a point a to the set D.

Lemma. Let K ⊆ R^p be a convex cone, and let K* ⊆ R^p be its polar. We then have for any a ∈ R^p that

sup_{d ∈ K ∩ B^p_{ℓ2}} ⟨d, a⟩ = dist(a, K*).

Therefore, we have the following result as a simple corollary.

Corollary 6. Let K ⊆ R^p be a convex cone, and let K* ⊆ R^p be its polar. For g ∼ N(0, I_{p×p}), we have that

g( K ∩ B^p_{ℓ2} ) = E[ dist(g, K*)² ].

Based on the duality result of the Lemma and Corollary 6, one can compute the following sharp bounds on the Gaussian squared-complexities of tangent cones with respect to the ℓ1 norm and nuclear norm balls. These are especially relevant when one wishes to estimate sparse signals or low-rank matrices; in these settings, the ℓ1 and nuclear norm balls serve as useful constraint sets for denoising because the tangent cones with respect to these sets at sparse vectors and at low-rank matrices are particularly narrow. Both of these results, proved in ref. 38, are used when we describe time-data tradeoffs.

Proposition 7. Let x ∈ R^p be a vector containing s nonzero entries.
Let T be the tangent cone at x with respect to an ℓ1 norm ball scaled so that x lies on the boundary of the ball (i.e., a scaling of the unit ℓ1 norm ball by a factor ‖x‖_{ℓ1}). Then,

g( T ∩ B^p_{ℓ2} ) ≤ 2s log(p/s) + (5/4)s.

Next, we state a result, proved in ref. 38, for low-rank matrices and the nuclear norm ball.

Proposition 8. Let X ∈ R^{m1×m2} be a matrix of rank r. Let T be the tangent cone at X with respect to a nuclear norm ball scaled so that X lies on the boundary of the ball (i.e., a scaling of the unit nuclear norm ball by a factor equal to the nuclear norm of X). Then,

g( T ∩ B^{m1 m2}_{ℓ2} ) ≤ 3r(m1 + m2 − r).

Next, we state and prove a result that allows us to estimate Gaussian squared-complexities of general cones. The bound is based on the volume of the dual of the cone of interest, and the proof involves an appeal to Gaussian isoperimetry (39). A similar result on Gaussian complexities of cones (without the square) was proved by Chandrasekaran et al. (38), but that result does not directly imply our statement, and we therefore give a complete self-contained proof in SI Appendix. The volume of a cone is assumed to be normalized (between 0 and 1); thus, we consider the relative fraction of a unit Euclidean sphere that is covered by a cone.

Proposition 9. Let K ⊆ R^p be a cone such that its polar K* ⊆ R^p has a normalized volume of μ ≥ 4 exp{−p/20}. For μ ≤ 1/(4e), we have that

g( K ∩ B^p_{ℓ2} ) ≤ 20 log( 1/(4μ) ).

If a cone is narrow, its polar will then be wide, leading to a large value of μ, and hence a small quantity on the right-hand side of the bound. This result leads to bounds on Gaussian squared-complexity in settings in which one can easily obtain estimates of volumes. One setting in which such estimates are easily obtained is the case of tangent cones with respect to vertex-transitive polytopes. We recall that a vertex-transitive polytope (3) is one in which there exists a symmetry of the polytope for each pair of vertices mapping the two vertices isomorphically to each other. Roughly speaking, all the vertices in such polytopes are the same. Some examples include the cross-polytope (the ℓ1 norm ball), the simplex (4), the hypercube (the ℓ∞ norm ball), and many polytopes generated by the action of groups (4). We will see many examples of such polytopes in our examples on time-data tradeoffs; thus, we will appeal to the following corollary repeatedly.

Corollary 10. Suppose that P ⊆ R^p is a vertex-transitive polytope with v vertices, and let x be a vertex of this polytope. If 4e ≤ v ≤ (1/4) exp{p/20}, then

g( T_P(x) ∩ B^p_{ℓ2} ) ≤ 20 log(v/4).

Proof: The normal cones at the vertices of P partition R^p. If the polytope is vertex-transitive, the normal cones are then all equivalent to each other (up to orthogonal transformations). Consequently, the (normalized) volume of the normal cone at any vertex is 1/v. Because the normal cone at a vertex is polar to the tangent cone, we have the desired result from Proposition 9.

Time-Data Tradeoffs

Preliminaries. We now turn our attention to giving examples of time-data tradeoffs in denoising problems via convex relaxation.
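Several of the examples below rest on Corollary 10's counting argument, whose key step is that the normal cones at the v vertices of a vertex-transitive polytope partition R^p with equal normalized volume 1/v. A quick empirical check for the cross-polytope (the ℓ1 ball, with v = 2p vertices ±e_i); the experiment and its parameters are our illustrative choices: a standard Gaussian lands in the normal cone of the vertex whose coordinate has the largest magnitude (with matching sign), so each vertex should be selected with frequency about 1/(2p).

```python
import numpy as np

rng = np.random.default_rng(2)
p, trials = 5, 200000
v = 2 * p                       # number of vertices ±e_i of the cross-polytope

G = rng.normal(size=(trials, p))
# Vertex whose normal cone contains g: argmax_i |g_i|, taken with the sign of g_i.
idx = np.argmax(np.abs(G), axis=1)
sign = np.sign(G[np.arange(trials), idx])
vertex = idx * 2 + (sign > 0)   # encode the 2p vertices as 0 .. 2p-1

counts = np.bincount(vertex.astype(int), minlength=v)
freqs = counts / trials         # each should be close to 1/v
```

All empirical frequencies cluster tightly around 1/v, as the partition-and-symmetry argument predicts.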
As described previously, we must set a desired risk to realize a time-data tradeoff; in the examples in the rest of this section, we will fix the desired risk to be equal to 1, independent of the problem dimension. Thus, these denoising problems belong to the time-data complexity classes TD(t(p), n(p), 1) for different runtime constraints t(p) and sample budgets n(p). The following corollary gives the number of samples required to obtain a mean squared error of 1 via convex optimization in our denoising setup. (In each of our time-data tradeoff examples, the signal sets S and the associated convex relaxations are symmetric so that no point in S is distinguished; therefore, without loss of generality, we compute the risk by considering the mean squared error at an arbitrary point in S rather than taking a supremum over x* ∈ S.)

Corollary 11. For x* ∈ S and with S ⊆ C for a closed, convex set C, if

n ≥ σ² g( T_C(x*) ∩ B^p_{ℓ2} ),

then E[ ‖x* − x̂_n(C)‖²_{ℓ2} ] ≤ 1.

Proof: The result follows by a rearrangement of the terms in the bound in Proposition 4.

This corollary states that if we have access to a dataset with n samples, we can then use any convex constraint set C such that the term on the right-hand side in the corollary is smaller than n. Recalling that larger constraint sets C lead to larger tangent cones T_C, we observe that if n is large, one can potentially use very weak (and computationally inexpensive) relaxations and still obtain a risk of 1. This observation, combined with the important point that the hierarchies of convex relaxations described previously are simultaneously ordered both by approximation quality and by computational tractability, allows us to realize a time-data tradeoff by using convex relaxation as an algorithm-weakening mechanism. A simple demonstration is provided in Fig. We further consider settings with σ = 1 and in which our signal sets S ⊆ R^p consist of elements that have Euclidean norm on the order of √p (measured from the centroid of S).
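The corollary above converts complexity bounds into sample budgets: with σ = 1 and target risk 1, any n at least the squared-complexity of the normalized tangent cone suffices. A sketch of this arithmetic (the helper names are ours; the constants come from the ℓ1 and nuclear-norm tangent-cone bounds of ref. 38, as stated in Propositions 7 and 8):

```python
import math

def g_sparse(p, s):
    """Upper bound on the Gaussian squared-complexity of the tangent cone
    to a scaled l1 ball at a vector with s nonzero entries (cf. Proposition 7)."""
    return 2 * s * math.log(p / s) + 1.25 * s

def g_lowrank(m1, m2, r):
    """Upper bound for the tangent cone to a scaled nuclear norm ball
    at a rank-r matrix of size m1 x m2 (cf. Proposition 8)."""
    return 3 * r * (m1 + m2 - r)

def samples_needed(complexity, sigma2=1.0):
    """Corollary-style rule: n >= sigma^2 * complexity gives mean squared error <= 1."""
    return math.ceil(sigma2 * complexity)

p = 10000
n_sparse = samples_needed(g_sparse(p, s=10))          # 10-sparse vector in R^p
n_lowrank = samples_needed(g_lowrank(100, 100, r=1))  # rank-1 100x100 matrix
```

For these dimensions, a few hundred samples suffice under either constraint set, even though the ambient dimension is 10,000.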
In such regimes, the James–Stein shrinkage estimator (4) offers about the same level of performance as the maximum-likelihood estimator, and both of these are outperformed in statistical risk by nonlinear estimators of the form [] based on convex optimization. Finally, we briefly remark on the runtimes of our estimators. The runtime for each of the procedures below is calculated by adding the number of operations required to compute the sample mean ȳ and the number of operations required to solve [] to some accuracy. Hence, if the number of samples used is n and if f_C(p) denotes the number of operations required to project ȳ onto C, the total runtime is then np + f_C(p). Thus, the number of samples enters the runtime calculations as just an additive term. As we process larger datasets, the first term in this calculation becomes larger, but this increase is offset by a more substantial decrease in the second term due to the use of a computationally tractable convex relaxation. We note that such a runtime calculation extends to more general inference problems in which one employs estimators of the form [] but with different loss functions in the objective; specifically, the runtime is calculated as above so long as the loss function depends only on some sufficient statistic computed from the data. If the loss function is instead of the form Σ_{i=1}^n ℓ(x; y_i) and it cannot be summarized via a sufficient statistic of the data {y_i}_{i=1}^n, the number of samples then enters the runtime computation in a multiplicative manner as θ(n) f_C(p) for some function θ(·).

Example 1: Denoising Signed Matrices. We consider the problem of recovering signed matrices corrupted by noise:

S = { a aᵀ | a ∈ {−1, +1}^{√p} }.

We have a ∈ R^{√p}, so that S ⊆ R^p. Inferring such signals is of interest in collaborative filtering, where one wishes to approximate matrices as the sum of a small number of rank-one signed matrices (34). Such matrices may represent, for example, the movie preferences of users as in the Netflix problem.
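For this signal set, one tractable constraint set discussed below is the nuclear norm ball scaled by √p, onto which projection is an SVD followed by a projection of the singular values onto an ℓ1 ball. A self-contained sketch of the whole denoising pipeline (dimensions, noise level, sample size, and the sort-based projection routine are our illustrative choices, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(3)
sp = 20                      # sqrt(p); signals are sp x sp signed matrices
p, sigma, n = sp * sp, 1.0, 500

# Signal x* = a a^T with a in {-1,+1}^sp, so ||x*||_F = sqrt(p).
a = rng.choice([-1.0, 1.0], size=sp)
x_star = np.outer(a, a)

# Sufficient statistic: sample mean of n noisy observations.
y_bar = x_star + (sigma / np.sqrt(n)) * rng.normal(size=(sp, sp))

def project_l1(v, radius):
    """Euclidean projection of a nonnegative, descending vector onto
    {u >= 0, sum(u) <= radius} (standard sort-based projection)."""
    if v.sum() <= radius:
        return v
    css = np.cumsum(v)
    k = np.arange(1, len(v) + 1)
    rho = np.max(np.where(v - (css - radius) / k > 0)[0])
    theta = (css[rho] - radius) / (rho + 1)
    return np.maximum(v - theta, 0.0)

# Denoise: project y_bar onto the nuclear norm ball of radius sqrt(p).
U, s, Vt = np.linalg.svd(y_bar)
x_hat = U @ np.diag(project_l1(s, np.sqrt(p))) @ Vt

err_raw = np.sum((y_bar - x_star) ** 2)   # about sigma^2 * p / n
err_hat = np.sum((x_hat - x_star) ** 2)   # much smaller: noise in the tangent cone
```

Because x* lies in the constraint set and Euclidean projection onto a convex set is nonexpansive toward its elements, the projected estimate is never worse than the raw average ȳ, and in this regime it is substantially better.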
The tightest convex constraint that one could use in this case is C = conv(S), which is the cut polytope [6]. To obtain a risk of 1 with this constraint, one requires n = c₁√p samples, by applying Corollary 11 and Corollary 10 based on the symmetry of the cut polytope. The cut polytope is generally intractable to compute. Hence, the best-known algorithms to project onto C would require a runtime that is superpolynomial in p. Consequently, the total runtime of this algorithm is c₁p^{1.5} + superpoly(p).

Fig. (Left) Signal set S consisting of x*. (Center) Two convex constraint sets C₁ and C₂, where C₁ is the convex hull of S and C₂ is a relaxation that is more efficiently computable than C₁. (Right) The tangent cone T_{C₁}(x*) is contained inside the tangent cone T_{C₂}(x*). Consequently, the Gaussian squared-complexity g(T_{C₁}(x*) ∩ B_{ℓ2}) is smaller than the complexity g(T_{C₂}(x*) ∩ B_{ℓ2}), so that the estimator x̂_n(C₁) requires fewer samples than the estimator x̂_n(C₂) for a risk of at most 1.

A commonly used tractable relaxation of the cut polytope is the elliptope [5]. By computing the Gaussian squared-complexity of the tangent cones at rank-one signed matrices with respect to this set, it is possible to show that n = c₂√p leads to a risk of 1 (with c₂ > c₁). Furthermore, interior point-based convex optimization algorithms for solving [] that exploit the special structure of the elliptope require O(p^{2.5}) operations (8, 4, 43). (The exponent is a result of the manner in which we define our signal set, so that a rank-one signed matrix lives in R^p.) Hence, the total runtime of this procedure is c₂p^{1.5} + O(p^{2.5}). Finally, an even weaker relaxation of the cut polytope than the elliptope is the unit ball of the nuclear norm scaled by a factor of √p; one can verify that the elements of S lie on the boundary of this set and are, in fact, extreme points. Appealing to Proposition 8 (using the fact that the elements of S are rank-one matrices) and Corollary 11, we conclude that n = c₃√p samples provide a mean squared error of 1 (with c₃ > c₂). Projecting onto the scaled nuclear norm ball can be done by computing a singular value decomposition (SVD) and then truncating the sequence of singular values, in descending order, when their cumulative sum exceeds √p (in effect, projecting the vector of singular values onto an ℓ1 ball of size √p). This operation requires O(p^{1.5}) operations; thus, the total runtime is c₃p^{1.5} + O(p^{1.5}). To summarize, the cut-matrix denoising problem lives in the time-data classes TD(superpoly(p), c₁√p, 1), TD(O(p^{2.5}), c₂√p, 1), and TD(O(p^{1.5}), c₃√p, 1), with constants c₁ < c₂ < c₃.

Example 2: Ordering Variables.
In many data analysis tasks, one is given a collection of variables that are suitably ordered so that the population covariance is banded. Under such a constraint, thresholding the entries of the empirical covariance matrix based on their distance from the diagonal has been shown to be a powerful method for estimation in the high-dimensional setting (44). However, if an ordering of the variables is not known a priori, one must jointly learn an ordering for the variables and estimate their underlying covariance. As a stylized version of this variable ordering problem, let M ∈ R^{√p×√p} be a known tridiagonal matrix (with Euclidean norm O(√p)) and consider the following signal set:

S = { Π M Πᵀ | Π is a √p × √p permutation matrix }.

The matrix M here is to be viewed as a covariance matrix. Thus, the corresponding denoising problem [] is that we wish to estimate a covariance matrix in the absence of knowledge of the ordering of the underlying variables. In a real-world scenario, one might wish to consider covariance matrices M that belong to some class of banded matrices and then construct S as done here, but we stick with the case of a fixed M for simplicity. Furthermore, the noise in a practical setting is better modeled as coming from a Wishart distribution; again, we focus on the Gaussian case for simplicity. The tightest convex constraint set that one could use in this case is the convex hull of S, which is generally intractable to compute for arbitrary matrices M. For example, if one were able to compute this set in polynomial time for any tridiagonal matrix M, one would be able to solve the intractable longest path problem (45) (finding the longest path between any two vertices in a graph) in polynomial time. With this convex constraint set, we find using Corollary 11 and Corollary 10 that n = c₁√p log(p) samples would lead to a risk of 1. This follows from the fact that conv(S) is a vertex-transitive polytope with about (√p)! vertices.
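An efficiently computable alternative to this intractable convex hull, used in this example, is the ℓ1 ball scaled by the ℓ1 norm of M, onto which projection costs O(p log p) via sorting. A self-contained sketch of the resulting denoiser (the particular tridiagonal M, dimensions, and sample size are our illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(4)
sp = 10                          # sqrt(p)
p, sigma, n = sp * sp, 1.0, 400

# A fixed tridiagonal "covariance" M and a hidden reordering of the variables.
M = (np.diag(np.full(sp, 2.0))
     + np.diag(np.ones(sp - 1), 1)
     + np.diag(np.ones(sp - 1), -1))
perm = rng.permutation(sp)
x_star = M[np.ix_(perm, perm)]   # Pi M Pi^T

y_bar = x_star + (sigma / np.sqrt(n)) * rng.normal(size=(sp, sp))

def project_l1_ball(v, radius):
    """O(p log p) sort-based Euclidean projection onto {u : ||u||_1 <= radius}."""
    u = np.abs(v).ravel()
    if u.sum() <= radius:
        return v
    w = np.sort(u)[::-1]
    css = np.cumsum(w)
    k = np.arange(1, w.size + 1)
    rho = np.max(np.where(w - (css - radius) / k > 0)[0])
    theta = (css[rho] - radius) / (rho + 1)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

# Denoise by projecting onto the l1 ball scaled by ||M||_1 (x* lies on its boundary).
radius = np.abs(M).sum()
x_hat = project_l1_ball(y_bar, radius)

err_raw = np.sum((y_bar - x_star) ** 2)   # about sigma^2 * p / n
err_hat = np.sum((x_hat - x_star) ** 2)
```

As in the cut-matrix example, x* belongs to the constraint set, so the projection can only reduce the error of the raw average ȳ.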
Thus, the total runtime is c₁p^{1.5} log(p) + superpoly(p). An efficiently computable relaxation of conv(S) is a scaled ℓ1 ball (scaled by the ℓ1 norm of M). Appealing to Proposition 7 on tangent cones with respect to the ℓ1 ball and to Corollary 11, we find that n = c₂√p log(p) samples suffice to provide a risk of 1. In applying Proposition 7, we note that M is assumed to be tridiagonal, and therefore has O(√p) nonzero entries. The runtime of this procedure is c₂p^{1.5} log(p) + O(p log(p)). Thus, the variable ordering denoising problem belongs to TD(superpoly(p), c₁√p log(p), 1) and to TD(O(p^{1.5} log(p)), c₂√p log(p), 1), with constants c₁ < c₂.

Example 3: Sparse PCA and Network Activity Identification. As our third example, we consider sparse PCA (8), in which one wishes to learn from samples a sparse eigenvector that contains most of the energy of a covariance matrix. As a simplified version of this problem, one can imagine a matrix M ∈ R^{√p×√p} with entries equal to √p/k in the top-left k × k block and zeros elsewhere (so that the Euclidean norm of M is √p), and with S defined as:

S = { Π M Πᵀ | Π is a √p × √p permutation matrix }.

In addition to sparse PCA, such signal sets are of interest in identifying activity in noisy networks (46) as well as in related combinatorial optimization problems, such as the planted clique problem (45). In the sparse PCA context, Amini and Wainwright (6) study time-data tradeoffs by investigating the sample complexities of two procedures: a simple one based on thresholding and a more sophisticated one based on semidefinite programming. Kolar et al. (46) investigate the sample complexities of a number of procedures ranging from a combinatorial search method to thresholding and sparse SVD. We note that the time-data tradeoffs studied in these two papers (6, 46) relate to the problem of learning the support of the leading sparse eigenvector; in contrast, in our setup, the objective is simply to denoise an element of S.
Furthermore, although the Gaussian noise setting is of interest in some of these domains, in a more realistic sparse PCA problem (e.g., the one considered in ref. 6), the noise is Wishart rather than Gaussian as considered here. Nevertheless, we stick with our stylized problem setting because it provides some useful insights on time-data tradeoffs. Finally, the size of the block k ∈ {1, ..., √p} depends on the application of interest, and it is typically far from the extremes 1 and √p. We will consider the case k = p^{1/4} for concreteness. [This setting is an interesting threshold case in the planted clique context (47–49), where k = p^{1/4} is the square root of the number of nodes of the graph represented by M (viewed as an adjacency matrix).] As usual, the tightest convex constraint set one can use in this setting is the convex hull of S, which is generally intractable to compute; an efficient characterization of this polytope would lead to an efficient solution of the intractable planted-clique problem (finding a fully connected subgraph inside a larger graph). Using this convex constraint set gives an estimator that requires about n = O(p^{1/4} log(p)) samples to produce a risk-1 estimate. We obtain this threshold by appealing to Corollary 11 and to Corollary 10, as well as to the observation that conv(S) is a vertex-transitive polytope with about (√p)^{p^{1/4}} vertices. Thus, the overall runtime is O(p^{5/4} log(p)) + superpoly(p). A convex relaxation of conv(S) is the nuclear norm ball scaled by a factor of √p, so that the elements of S lie on the boundary. From Proposition 8 (observing that the elements of S are rank-one matrices) and Corollary 11, we have that n = c√p samples give a risk-1 estimate with this procedure. As computed in the example with cut matrices, the overall runtime of this nuclear norm procedure is c p^{1.5} + O(p^{1.5}). In conclusion, the denoising version of sparse PCA lies in TD(superpoly(p), O(p^{1/4} log(p)), 1) and in TD(O(p^{1.5}), O(√p), 1).

Example 4: Estimating Matchings.
As our final example, we consider signals that represent the set of all perfect matchings in the complete graph. A matching is any subset of edges of a graph such that no node of the graph is incident to more than one edge in the subset, and a perfect matching is a subset of edges in which every node is incident to exactly one edge in the subset. Graph matchings arise in a range of inference problems, such as in chemical structure


More information

Elementary theory of L p spaces

Elementary theory of L p spaces CHAPTER 3 Elementary theory of L saces 3.1 Convexity. Jensen, Hölder, Minkowski inequality. We begin with two definitions. A set A R d is said to be convex if, for any x 0, x 1 2 A x = x 0 + (x 1 x 0 )

More information

MODELING THE RELIABILITY OF C4ISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL

MODELING THE RELIABILITY OF C4ISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL Technical Sciences and Alied Mathematics MODELING THE RELIABILITY OF CISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL Cezar VASILESCU Regional Deartment of Defense Resources Management

More information

Linear diophantine equations for discrete tomography

Linear diophantine equations for discrete tomography Journal of X-Ray Science and Technology 10 001 59 66 59 IOS Press Linear diohantine euations for discrete tomograhy Yangbo Ye a,gewang b and Jiehua Zhu a a Deartment of Mathematics, The University of Iowa,

More information

Improved Bounds on Bell Numbers and on Moments of Sums of Random Variables

Improved Bounds on Bell Numbers and on Moments of Sums of Random Variables Imroved Bounds on Bell Numbers and on Moments of Sums of Random Variables Daniel Berend Tamir Tassa Abstract We rovide bounds for moments of sums of sequences of indeendent random variables. Concentrating

More information

GOOD MODELS FOR CUBIC SURFACES. 1. Introduction

GOOD MODELS FOR CUBIC SURFACES. 1. Introduction GOOD MODELS FOR CUBIC SURFACES ANDREAS-STEPHAN ELSENHANS Abstract. This article describes an algorithm for finding a model of a hyersurface with small coefficients. It is shown that the aroach works in

More information

Analysis of some entrance probabilities for killed birth-death processes

Analysis of some entrance probabilities for killed birth-death processes Analysis of some entrance robabilities for killed birth-death rocesses Master s Thesis O.J.G. van der Velde Suervisor: Dr. F.M. Sieksma July 5, 207 Mathematical Institute, Leiden University Contents Introduction

More information

arxiv: v1 [physics.data-an] 26 Oct 2012

arxiv: v1 [physics.data-an] 26 Oct 2012 Constraints on Yield Parameters in Extended Maximum Likelihood Fits Till Moritz Karbach a, Maximilian Schlu b a TU Dortmund, Germany, moritz.karbach@cern.ch b TU Dortmund, Germany, maximilian.schlu@cern.ch

More information

Some results of convex programming complexity

Some results of convex programming complexity 2012c12 $ Ê Æ Æ 116ò 14Ï Dec., 2012 Oerations Research Transactions Vol.16 No.4 Some results of convex rogramming comlexity LOU Ye 1,2 GAO Yuetian 1 Abstract Recently a number of aers were written that

More information

MATH 2710: NOTES FOR ANALYSIS

MATH 2710: NOTES FOR ANALYSIS MATH 270: NOTES FOR ANALYSIS The main ideas we will learn from analysis center around the idea of a limit. Limits occurs in several settings. We will start with finite limits of sequences, then cover infinite

More information

On split sample and randomized confidence intervals for binomial proportions

On split sample and randomized confidence intervals for binomial proportions On slit samle and randomized confidence intervals for binomial roortions Måns Thulin Deartment of Mathematics, Usala University arxiv:1402.6536v1 [stat.me] 26 Feb 2014 Abstract Slit samle methods have

More information

An Analysis of Reliable Classifiers through ROC Isometrics

An Analysis of Reliable Classifiers through ROC Isometrics An Analysis of Reliable Classifiers through ROC Isometrics Stijn Vanderlooy s.vanderlooy@cs.unimaas.nl Ida G. Srinkhuizen-Kuyer kuyer@cs.unimaas.nl Evgueni N. Smirnov smirnov@cs.unimaas.nl MICC-IKAT, Universiteit

More information

A Note on Guaranteed Sparse Recovery via l 1 -Minimization

A Note on Guaranteed Sparse Recovery via l 1 -Minimization A Note on Guaranteed Sarse Recovery via l -Minimization Simon Foucart, Université Pierre et Marie Curie Abstract It is roved that every s-sarse vector x C N can be recovered from the measurement vector

More information

A Qualitative Event-based Approach to Multiple Fault Diagnosis in Continuous Systems using Structural Model Decomposition

A Qualitative Event-based Approach to Multiple Fault Diagnosis in Continuous Systems using Structural Model Decomposition A Qualitative Event-based Aroach to Multile Fault Diagnosis in Continuous Systems using Structural Model Decomosition Matthew J. Daigle a,,, Anibal Bregon b,, Xenofon Koutsoukos c, Gautam Biswas c, Belarmino

More information

System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests

System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests 009 American Control Conference Hyatt Regency Riverfront, St. Louis, MO, USA June 0-, 009 FrB4. System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests James C. Sall Abstract

More information

Notes on Instrumental Variables Methods

Notes on Instrumental Variables Methods Notes on Instrumental Variables Methods Michele Pellizzari IGIER-Bocconi, IZA and frdb 1 The Instrumental Variable Estimator Instrumental variable estimation is the classical solution to the roblem of

More information

A Social Welfare Optimal Sequential Allocation Procedure

A Social Welfare Optimal Sequential Allocation Procedure A Social Welfare Otimal Sequential Allocation Procedure Thomas Kalinowsi Universität Rostoc, Germany Nina Narodytsa and Toby Walsh NICTA and UNSW, Australia May 2, 201 Abstract We consider a simle sequential

More information

Cryptanalysis of Pseudorandom Generators

Cryptanalysis of Pseudorandom Generators CSE 206A: Lattice Algorithms and Alications Fall 2017 Crytanalysis of Pseudorandom Generators Instructor: Daniele Micciancio UCSD CSE As a motivating alication for the study of lattice in crytograhy we

More information

c Copyright by Helen J. Elwood December, 2011

c Copyright by Helen J. Elwood December, 2011 c Coyright by Helen J. Elwood December, 2011 CONSTRUCTING COMPLEX EQUIANGULAR PARSEVAL FRAMES A Dissertation Presented to the Faculty of the Deartment of Mathematics University of Houston In Partial Fulfillment

More information

On Wald-Type Optimal Stopping for Brownian Motion

On Wald-Type Optimal Stopping for Brownian Motion J Al Probab Vol 34, No 1, 1997, (66-73) Prerint Ser No 1, 1994, Math Inst Aarhus On Wald-Tye Otimal Stoing for Brownian Motion S RAVRSN and PSKIR The solution is resented to all otimal stoing roblems of

More information

arxiv: v1 [stat.ml] 10 Mar 2016

arxiv: v1 [stat.ml] 10 Mar 2016 Global and Local Uncertainty Princiles for Signals on Grahs Nathanael Perraudin, Benjamin Ricaud, David I Shuman, and Pierre Vandergheynst March, 206 arxiv:603.03030v [stat.ml] 0 Mar 206 Abstract Uncertainty

More information

SECTION 5: FIBRATIONS AND HOMOTOPY FIBERS

SECTION 5: FIBRATIONS AND HOMOTOPY FIBERS SECTION 5: FIBRATIONS AND HOMOTOPY FIBERS In this section we will introduce two imortant classes of mas of saces, namely the Hurewicz fibrations and the more general Serre fibrations, which are both obtained

More information

Approximate Dynamic Programming for Dynamic Capacity Allocation with Multiple Priority Levels

Approximate Dynamic Programming for Dynamic Capacity Allocation with Multiple Priority Levels Aroximate Dynamic Programming for Dynamic Caacity Allocation with Multile Priority Levels Alexander Erdelyi School of Oerations Research and Information Engineering, Cornell University, Ithaca, NY 14853,

More information

Towards understanding the Lorenz curve using the Uniform distribution. Chris J. Stephens. Newcastle City Council, Newcastle upon Tyne, UK

Towards understanding the Lorenz curve using the Uniform distribution. Chris J. Stephens. Newcastle City Council, Newcastle upon Tyne, UK Towards understanding the Lorenz curve using the Uniform distribution Chris J. Stehens Newcastle City Council, Newcastle uon Tyne, UK (For the Gini-Lorenz Conference, University of Siena, Italy, May 2005)

More information

Elements of Asymptotic Theory. James L. Powell Department of Economics University of California, Berkeley

Elements of Asymptotic Theory. James L. Powell Department of Economics University of California, Berkeley Elements of Asymtotic Theory James L. Powell Deartment of Economics University of California, Berkeley Objectives of Asymtotic Theory While exact results are available for, say, the distribution of the

More information

Statics and dynamics: some elementary concepts

Statics and dynamics: some elementary concepts 1 Statics and dynamics: some elementary concets Dynamics is the study of the movement through time of variables such as heartbeat, temerature, secies oulation, voltage, roduction, emloyment, rices and

More information

Location of solutions for quasi-linear elliptic equations with general gradient dependence

Location of solutions for quasi-linear elliptic equations with general gradient dependence Electronic Journal of Qualitative Theory of Differential Equations 217, No. 87, 1 1; htts://doi.org/1.14232/ejqtde.217.1.87 www.math.u-szeged.hu/ejqtde/ Location of solutions for quasi-linear ellitic equations

More information

Asymptotically Optimal Simulation Allocation under Dependent Sampling

Asymptotically Optimal Simulation Allocation under Dependent Sampling Asymtotically Otimal Simulation Allocation under Deendent Samling Xiaoing Xiong The Robert H. Smith School of Business, University of Maryland, College Park, MD 20742-1815, USA, xiaoingx@yahoo.com Sandee

More information

Uniformly best wavenumber approximations by spatial central difference operators: An initial investigation

Uniformly best wavenumber approximations by spatial central difference operators: An initial investigation Uniformly best wavenumber aroximations by satial central difference oerators: An initial investigation Vitor Linders and Jan Nordström Abstract A characterisation theorem for best uniform wavenumber aroximations

More information

Elements of Asymptotic Theory. James L. Powell Department of Economics University of California, Berkeley

Elements of Asymptotic Theory. James L. Powell Department of Economics University of California, Berkeley Elements of Asymtotic Theory James L. Powell Deartment of Economics University of California, Berkeley Objectives of Asymtotic Theory While exact results are available for, say, the distribution of the

More information

Lilian Markenzon 1, Nair Maria Maia de Abreu 2* and Luciana Lee 3

Lilian Markenzon 1, Nair Maria Maia de Abreu 2* and Luciana Lee 3 Pesquisa Oeracional (2013) 33(1): 123-132 2013 Brazilian Oerations Research Society Printed version ISSN 0101-7438 / Online version ISSN 1678-5142 www.scielo.br/oe SOME RESULTS ABOUT THE CONNECTIVITY OF

More information

arxiv: v2 [stat.me] 3 Nov 2014

arxiv: v2 [stat.me] 3 Nov 2014 onarametric Stein-tye Shrinkage Covariance Matrix Estimators in High-Dimensional Settings Anestis Touloumis Cancer Research UK Cambridge Institute University of Cambridge Cambridge CB2 0RE, U.K. Anestis.Touloumis@cruk.cam.ac.uk

More information

#A37 INTEGERS 15 (2015) NOTE ON A RESULT OF CHUNG ON WEIL TYPE SUMS

#A37 INTEGERS 15 (2015) NOTE ON A RESULT OF CHUNG ON WEIL TYPE SUMS #A37 INTEGERS 15 (2015) NOTE ON A RESULT OF CHUNG ON WEIL TYPE SUMS Norbert Hegyvári ELTE TTK, Eötvös University, Institute of Mathematics, Budaest, Hungary hegyvari@elte.hu François Hennecart Université

More information

Positive Definite Uncertain Homogeneous Matrix Polynomials: Analysis and Application

Positive Definite Uncertain Homogeneous Matrix Polynomials: Analysis and Application BULGARIA ACADEMY OF SCIECES CYBEREICS AD IFORMAIO ECHOLOGIES Volume 9 o 3 Sofia 009 Positive Definite Uncertain Homogeneous Matrix Polynomials: Analysis and Alication Svetoslav Savov Institute of Information

More information

RANDOM WALKS AND PERCOLATION: AN ANALYSIS OF CURRENT RESEARCH ON MODELING NATURAL PROCESSES

RANDOM WALKS AND PERCOLATION: AN ANALYSIS OF CURRENT RESEARCH ON MODELING NATURAL PROCESSES RANDOM WALKS AND PERCOLATION: AN ANALYSIS OF CURRENT RESEARCH ON MODELING NATURAL PROCESSES AARON ZWIEBACH Abstract. In this aer we will analyze research that has been recently done in the field of discrete

More information

arxiv:cond-mat/ v2 25 Sep 2002

arxiv:cond-mat/ v2 25 Sep 2002 Energy fluctuations at the multicritical oint in two-dimensional sin glasses arxiv:cond-mat/0207694 v2 25 Se 2002 1. Introduction Hidetoshi Nishimori, Cyril Falvo and Yukiyasu Ozeki Deartment of Physics,

More information

Use of Transformations and the Repeated Statement in PROC GLM in SAS Ed Stanek

Use of Transformations and the Repeated Statement in PROC GLM in SAS Ed Stanek Use of Transformations and the Reeated Statement in PROC GLM in SAS Ed Stanek Introduction We describe how the Reeated Statement in PROC GLM in SAS transforms the data to rovide tests of hyotheses of interest.

More information

POINTS ON CONICS MODULO p

POINTS ON CONICS MODULO p POINTS ON CONICS MODULO TEAM 2: JONGMIN BAEK, ANAND DEOPURKAR, AND KATHERINE REDFIELD Abstract. We comute the number of integer oints on conics modulo, where is an odd rime. We extend our results to conics

More information

Finding a sparse vector in a subspace: linear sparsity using alternating directions

Finding a sparse vector in a subspace: linear sparsity using alternating directions IEEE TRANSACTION ON INFORMATION THEORY VOL XX NO XX 06 Finding a sarse vector in a subsace: linear sarsity using alternating directions Qing Qu Student Member IEEE Ju Sun Student Member IEEE and John Wright

More information

Chater Matrix Norms and Singular Value Decomosition Introduction In this lecture, we introduce the notion of a norm for matrices The singular value de

Chater Matrix Norms and Singular Value Decomosition Introduction In this lecture, we introduce the notion of a norm for matrices The singular value de Lectures on Dynamic Systems and Control Mohammed Dahleh Munther A Dahleh George Verghese Deartment of Electrical Engineering and Comuter Science Massachuasetts Institute of Technology c Chater Matrix Norms

More information

HENSEL S LEMMA KEITH CONRAD

HENSEL S LEMMA KEITH CONRAD HENSEL S LEMMA KEITH CONRAD 1. Introduction In the -adic integers, congruences are aroximations: for a and b in Z, a b mod n is the same as a b 1/ n. Turning information modulo one ower of into similar

More information

On the asymptotic sizes of subset Anderson-Rubin and Lagrange multiplier tests in linear instrumental variables regression

On the asymptotic sizes of subset Anderson-Rubin and Lagrange multiplier tests in linear instrumental variables regression On the asymtotic sizes of subset Anderson-Rubin and Lagrange multilier tests in linear instrumental variables regression Patrik Guggenberger Frank Kleibergeny Sohocles Mavroeidisz Linchun Chen\ June 22

More information

CHAPTER 2: SMOOTH MAPS. 1. Introduction In this chapter we introduce smooth maps between manifolds, and some important

CHAPTER 2: SMOOTH MAPS. 1. Introduction In this chapter we introduce smooth maps between manifolds, and some important CHAPTER 2: SMOOTH MAPS DAVID GLICKENSTEIN 1. Introduction In this chater we introduce smooth mas between manifolds, and some imortant concets. De nition 1. A function f : M! R k is a smooth function if

More information

Additive results for the generalized Drazin inverse in a Banach algebra

Additive results for the generalized Drazin inverse in a Banach algebra Additive results for the generalized Drazin inverse in a Banach algebra Dragana S. Cvetković-Ilić Dragan S. Djordjević and Yimin Wei* Abstract In this aer we investigate additive roerties of the generalized

More information

Hotelling s Two- Sample T 2

Hotelling s Two- Sample T 2 Chater 600 Hotelling s Two- Samle T Introduction This module calculates ower for the Hotelling s two-grou, T-squared (T) test statistic. Hotelling s T is an extension of the univariate two-samle t-test

More information

On Z p -norms of random vectors

On Z p -norms of random vectors On Z -norms of random vectors Rafa l Lata la Abstract To any n-dimensional random vector X we may associate its L -centroid body Z X and the corresonding norm. We formulate a conjecture concerning the

More information

Two pairs of families of polyhedral norms versus

Two pairs of families of polyhedral norms versus Math. Program., Ser. A DOI 0.007/s007-05-0899-9 FULL LENGTH PAPER Two airs of families of olyhedral norms versus l -norms: roximity and alications in otimization Jun-ya Gotoh Stan Uryasev 2 Received: 27

More information

Sharp gradient estimate and spectral rigidity for p-laplacian

Sharp gradient estimate and spectral rigidity for p-laplacian Shar gradient estimate and sectral rigidity for -Lalacian Chiung-Jue Anna Sung and Jiaing Wang To aear in ath. Research Letters. Abstract We derive a shar gradient estimate for ositive eigenfunctions of

More information

LECTURE 7 NOTES. x n. d x if. E [g(x n )] E [g(x)]

LECTURE 7 NOTES. x n. d x if. E [g(x n )] E [g(x)] LECTURE 7 NOTES 1. Convergence of random variables. Before delving into the large samle roerties of the MLE, we review some concets from large samle theory. 1. Convergence in robability: x n x if, for

More information

ON THE LEAST SIGNIFICANT p ADIC DIGITS OF CERTAIN LUCAS NUMBERS

ON THE LEAST SIGNIFICANT p ADIC DIGITS OF CERTAIN LUCAS NUMBERS #A13 INTEGERS 14 (014) ON THE LEAST SIGNIFICANT ADIC DIGITS OF CERTAIN LUCAS NUMBERS Tamás Lengyel Deartment of Mathematics, Occidental College, Los Angeles, California lengyel@oxy.edu Received: 6/13/13,

More information

Finite Mixture EFA in Mplus

Finite Mixture EFA in Mplus Finite Mixture EFA in Mlus November 16, 2007 In this document we describe the Mixture EFA model estimated in Mlus. Four tyes of deendent variables are ossible in this model: normally distributed, ordered

More information

Convexification of Generalized Network Flow Problem with Application to Power Systems

Convexification of Generalized Network Flow Problem with Application to Power Systems 1 Convexification of Generalized Network Flow Problem with Alication to Power Systems Somayeh Sojoudi and Javad Lavaei + Deartment of Comuting and Mathematical Sciences, California Institute of Technology

More information

Generalized Coiflets: A New Family of Orthonormal Wavelets

Generalized Coiflets: A New Family of Orthonormal Wavelets Generalized Coiflets A New Family of Orthonormal Wavelets Dong Wei, Alan C Bovik, and Brian L Evans Laboratory for Image and Video Engineering Deartment of Electrical and Comuter Engineering The University

More information

ECE 534 Information Theory - Midterm 2

ECE 534 Information Theory - Midterm 2 ECE 534 Information Theory - Midterm Nov.4, 009. 3:30-4:45 in LH03. You will be given the full class time: 75 minutes. Use it wisely! Many of the roblems have short answers; try to find shortcuts. You

More information

p-adic Measures and Bernoulli Numbers

p-adic Measures and Bernoulli Numbers -Adic Measures and Bernoulli Numbers Adam Bowers Introduction The constants B k in the Taylor series exansion t e t = t k B k k! k=0 are known as the Bernoulli numbers. The first few are,, 6, 0, 30, 0,

More information

Introduction to MVC. least common denominator of all non-identical-zero minors of all order of G(s). Example: The minor of order 2: 1 2 ( s 1)

Introduction to MVC. least common denominator of all non-identical-zero minors of all order of G(s). Example: The minor of order 2: 1 2 ( s 1) Introduction to MVC Definition---Proerness and strictly roerness A system G(s) is roer if all its elements { gij ( s)} are roer, and strictly roer if all its elements are strictly roer. Definition---Causal

More information

Multiplicative group law on the folium of Descartes

Multiplicative group law on the folium of Descartes Multilicative grou law on the folium of Descartes Steluţa Pricoie and Constantin Udrişte Abstract. The folium of Descartes is still studied and understood today. Not only did it rovide for the roof of

More information

NONLINEAR OPTIMIZATION WITH CONVEX CONSTRAINTS. The Goldstein-Levitin-Polyak algorithm

NONLINEAR OPTIMIZATION WITH CONVEX CONSTRAINTS. The Goldstein-Levitin-Polyak algorithm - (23) NLP - NONLINEAR OPTIMIZATION WITH CONVEX CONSTRAINTS The Goldstein-Levitin-Polya algorithm We consider an algorithm for solving the otimization roblem under convex constraints. Although the convexity

More information

t 0 Xt sup X t p c p inf t 0

t 0 Xt sup X t p c p inf t 0 SHARP MAXIMAL L -ESTIMATES FOR MARTINGALES RODRIGO BAÑUELOS AND ADAM OSȨKOWSKI ABSTRACT. Let X be a suermartingale starting from 0 which has only nonnegative jums. For each 0 < < we determine the best

More information

On Doob s Maximal Inequality for Brownian Motion

On Doob s Maximal Inequality for Brownian Motion Stochastic Process. Al. Vol. 69, No., 997, (-5) Research Reort No. 337, 995, Det. Theoret. Statist. Aarhus On Doob s Maximal Inequality for Brownian Motion S. E. GRAVERSEN and G. PESKIR If B = (B t ) t

More information

s v 0 q 0 v 1 q 1 v 2 (q 2) v 3 q 3 v 4

s v 0 q 0 v 1 q 1 v 2 (q 2) v 3 q 3 v 4 Discrete Adative Transmission for Fading Channels Lang Lin Λ, Roy D. Yates, Predrag Sasojevic WINLAB, Rutgers University 7 Brett Rd., NJ- fllin, ryates, sasojevg@winlab.rutgers.edu Abstract In this work

More information

Positivity, local smoothing and Harnack inequalities for very fast diffusion equations

Positivity, local smoothing and Harnack inequalities for very fast diffusion equations Positivity, local smoothing and Harnack inequalities for very fast diffusion equations Dedicated to Luis Caffarelli for his ucoming 60 th birthday Matteo Bonforte a, b and Juan Luis Vázquez a, c Abstract

More information

Hidden Predictors: A Factor Analysis Primer

Hidden Predictors: A Factor Analysis Primer Hidden Predictors: A Factor Analysis Primer Ryan C Sanchez Western Washington University Factor Analysis is a owerful statistical method in the modern research sychologist s toolbag When used roerly, factor

More information

A generalization of Amdahl's law and relative conditions of parallelism

A generalization of Amdahl's law and relative conditions of parallelism A generalization of Amdahl's law and relative conditions of arallelism Author: Gianluca Argentini, New Technologies and Models, Riello Grou, Legnago (VR), Italy. E-mail: gianluca.argentini@riellogrou.com

More information

Feedback-error control

Feedback-error control Chater 4 Feedback-error control 4.1 Introduction This chater exlains the feedback-error (FBE) control scheme originally described by Kawato [, 87, 8]. FBE is a widely used neural network based controller

More information

The Algebraic Structure of the p-order Cone

The Algebraic Structure of the p-order Cone The Algebraic Structure of the -Order Cone Baha Alzalg Abstract We study and analyze the algebraic structure of the -order cones We show that, unlike oularly erceived, the -order cone is self-dual for

More information

For q 0; 1; : : : ; `? 1, we have m 0; 1; : : : ; q? 1. The set fh j(x) : j 0; 1; ; : : : ; `? 1g forms a basis for the tness functions dened on the i

For q 0; 1; : : : ; `? 1, we have m 0; 1; : : : ; q? 1. The set fh j(x) : j 0; 1; ; : : : ; `? 1g forms a basis for the tness functions dened on the i Comuting with Haar Functions Sami Khuri Deartment of Mathematics and Comuter Science San Jose State University One Washington Square San Jose, CA 9519-0103, USA khuri@juiter.sjsu.edu Fax: (40)94-500 Keywords:

More information

Quantitative estimates of propagation of chaos for stochastic systems with W 1, kernels

Quantitative estimates of propagation of chaos for stochastic systems with W 1, kernels oname manuscrit o. will be inserted by the editor) Quantitative estimates of roagation of chaos for stochastic systems with W, kernels Pierre-Emmanuel Jabin Zhenfu Wang Received: date / Acceted: date Abstract

More information

Universal Finite Memory Coding of Binary Sequences

Universal Finite Memory Coding of Binary Sequences Deartment of Electrical Engineering Systems Universal Finite Memory Coding of Binary Sequences Thesis submitted towards the degree of Master of Science in Electrical and Electronic Engineering in Tel-Aviv

More information

Multi-Operation Multi-Machine Scheduling

Multi-Operation Multi-Machine Scheduling Multi-Oeration Multi-Machine Scheduling Weizhen Mao he College of William and Mary, Williamsburg VA 3185, USA Abstract. In the multi-oeration scheduling that arises in industrial engineering, each job

More information

Journal of Mathematical Analysis and Applications

Journal of Mathematical Analysis and Applications J. Math. Anal. Al. 44 (3) 3 38 Contents lists available at SciVerse ScienceDirect Journal of Mathematical Analysis and Alications journal homeage: www.elsevier.com/locate/jmaa Maximal surface area of a

More information

Topic 7: Using identity types

Topic 7: Using identity types Toic 7: Using identity tyes June 10, 2014 Now we would like to learn how to use identity tyes and how to do some actual mathematics with them. By now we have essentially introduced all inference rules

More information

On a class of Rellich inequalities

On a class of Rellich inequalities On a class of Rellich inequalities G. Barbatis A. Tertikas Dedicated to Professor E.B. Davies on the occasion of his 60th birthday Abstract We rove Rellich and imroved Rellich inequalities that involve

More information

Developing A Deterioration Probabilistic Model for Rail Wear

Developing A Deterioration Probabilistic Model for Rail Wear International Journal of Traffic and Transortation Engineering 2012, 1(2): 13-18 DOI: 10.5923/j.ijtte.20120102.02 Develoing A Deterioration Probabilistic Model for Rail Wear Jabbar-Ali Zakeri *, Shahrbanoo

More information