CMA-ES with Optimal Covariance Update and Storage Complexity


Oswin Krause (Dept. of Computer Science, University of Copenhagen, Copenhagen, Denmark), Dídac R. Arbonès (Dept. of Computer Science, University of Copenhagen, Copenhagen, Denmark), Christian Igel (Dept. of Computer Science, University of Copenhagen, Copenhagen, Denmark)

29th Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain.

Abstract

The covariance matrix adaptation evolution strategy (CMA-ES) is arguably one of the most powerful real-valued derivative-free optimization algorithms, finding many applications in machine learning. The CMA-ES is a Monte Carlo method, sampling from a sequence of multi-variate Gaussian distributions. Given the function values at the sampled points, updating and storing the covariance matrix dominates the time and space complexity in each iteration of the algorithm. We propose a numerically stable quadratic-time covariance matrix update scheme with minimal memory requirements based on maintaining triangular Cholesky factors. This requires a modification of the cumulative step-size adaptation (CSA) mechanism in the CMA-ES, in which we replace the inverse of the square root of the covariance matrix by the inverse of the triangular Cholesky factor. Because the triangular Cholesky factor changes smoothly with the matrix square root, this modification does not change the behavior of the CMA-ES in terms of required objective function evaluations, as verified empirically. Thus, the described algorithm can and should replace the standard CMA-ES if updating and storing the covariance matrix matters.

1 Introduction

The covariance matrix adaptation evolution strategy, CMA-ES [Hansen and Ostermeier, 2001], is recognized as one of the most competitive derivative-free algorithms for real-valued optimization [Beyer, 2007; Eiben and Smith, 2015]. The algorithm has been successfully applied in many unbiased performance comparisons and numerous real-world applications. In machine learning, it is mainly used for direct policy search in reinforcement learning and hyperparameter tuning in supervised learning (e.g., see Gomez et al. [2008]; Heidrich-Meisner and Igel [2009a,b]; Igel [2010], and references therein).

The CMA-ES is a Monte Carlo method for optimizing functions f : R^d → R. The objective function f does not need to be continuous and can be multi-modal, constrained, and disturbed by noise. In each iteration, the CMA-ES samples from a d-dimensional multivariate normal distribution, the search distribution, and ranks the sampled points according to their objective function values. The mean and the covariance matrix of the search distribution are then adapted based on the ranked points. Given the ranking of the sampled points, the runtime of one CMA-ES iteration is ω(d²), because the square root of the covariance matrix is required, which is typically computed by an eigenvalue decomposition. If the objective function can be evaluated efficiently and/or d is large, the computation of the matrix square root can easily dominate the runtime of the optimization process.

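To make this cost concrete, the following minimal NumPy sketch (an illustration on our part, not code from the paper) computes C^{1/2} and C^{−1/2} from an eigenvalue decomposition; this cubic-time step is what dominates an iteration when d is large.

```python
import numpy as np

def sqrt_and_inv_sqrt(C):
    """Matrix square root and inverse square root of a symmetric positive definite C."""
    eigvals, B = np.linalg.eigh(C)       # C = B diag(eigvals) B^T, an O(d^3) operation
    D = np.sqrt(eigvals)
    return (B * D) @ B.T, (B / D) @ B.T  # C^{1/2} and C^{-1/2}

C = np.array([[4.0, 1.0], [1.0, 3.0]])
C_sqrt, C_inv_sqrt = sqrt_and_inv_sqrt(C)
assert np.allclose(C_sqrt @ C_sqrt, C)
assert np.allclose(C_sqrt @ C_inv_sqrt, np.eye(2))
```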
Various strategies have been proposed to address this problem. The basic approach for reducing the runtime is to perform an update of the matrix only every τ ∈ Ω(d) steps [Hansen and Ostermeier, 1996, 2001], effectively reducing the time complexity to O(d²). However, this forces the algorithm to use outdated matrices during most iterations and can increase the number of function evaluations. Furthermore, it leads to an uneven distribution of computation time over the iterations. Another approach is to restrict the model complexity of the search distribution [Poland and Zell, 2001; Ros and Hansen, 2008; Sun et al., 2013; Akimoto et al., 2014; Loshchilov, 2014, 2015], for example, to consider only diagonal matrices [Ros and Hansen, 2008]. However, this can lead to a drastic increase in the number of function evaluations needed to approximate the optimum if the objective function is not compatible with the restriction, for example, when optimizing highly non-separable problems while only adapting the diagonal of the covariance matrix [Omidvar and Li, 2010]. More recently, methods were proposed that update the Cholesky factor of the covariance matrix instead of the covariance matrix itself [Suttorp et al., 2009; Krause and Igel, 2015]. This works well for some CMA-ES variants (e.g., the (1+1)-CMA-ES and the multi-objective MO-CMA-ES [Suttorp et al., 2009; Krause and Igel, 2015; Bringmann et al., 2013]); however, the original CMA-ES relies on the matrix square root, which cannot be replaced one-to-one by a Cholesky factor.

In the following, we explore the use of the triangular Cholesky factorization instead of the square root in the standard CMA-ES. In contrast to previous attempts in this direction, we present an approach that comes with a theoretical justification for why it does not deteriorate the algorithm's performance. This approach leads to the optimal asymptotic storage and runtime complexity when adaptation of the full covariance matrix is required, as is the case for non-separable ill-conditioned problems. Our CMA-ES variant, referred to as Cholesky-CMA-ES, reduces the runtime complexity of the algorithm with no significant change in the number of objective function evaluations. It also reduces the memory footprint of the algorithm.

Section 2 briefly describes the original CMA-ES algorithm (for details we refer to Hansen [2015]). In Section 3 we propose our new method for approximating the step-size adaptation. We give a theoretical justification for the convergence of the new algorithm. We provide empirical performance results comparing the original CMA-ES with the new Cholesky-CMA-ES on various benchmark functions in Section 4. Finally, we discuss our results and draw our conclusions.

2 Background

Before we briefly describe the CMA-ES to fix our notation, we discuss some basic properties of using a Cholesky decomposition to sample from a multi-variate Gaussian distribution. Sampling from a d-dimensional multi-variate normal distribution N(m, Σ), m ∈ R^d, Σ ∈ R^{d×d}, is usually done using a decomposition of the covariance matrix Σ. This could be the square root of the matrix, Σ = HH with H ∈ R^{d×d}, or a lower triangular Cholesky factorization Σ = AA^T, which is related to the square root by the QR-decomposition H = AE, where E is an orthogonal matrix. We can sample a point x from N(m, Σ) using a sample z ∼ N(0, I) by x = Hz + m = AEz + m = Ay + m, where we set y = Ez. We have y ∼ N(0, I) since E is orthogonal. Thus, as long as we are only interested in the value of x and do not need y, we can sample using the Cholesky factor instead of the matrix square root.
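The following NumPy sketch (an illustration under the assumptions above, not the paper's Shark/C++ implementation) checks empirically that sampling with a lower triangular Cholesky factor A of Σ, i.e., x = m + σAz with z ∼ N(0, I), produces samples with covariance σ²Σ, just as sampling with the matrix square root does.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
M = rng.normal(size=(d, d))
Sigma = M @ M.T + d * np.eye(d)              # an arbitrary symmetric positive definite covariance
m, sigma = rng.normal(size=d), 0.5

A = np.linalg.cholesky(Sigma)                # lower triangular, Sigma = A A^T
x = m + sigma * A @ rng.standard_normal(d)   # one sample from N(m, sigma^2 Sigma)

# Empirical check: the covariance of many such samples approaches sigma^2 * Sigma.
Z = rng.standard_normal((200_000, d))
X = m + sigma * Z @ A.T
assert np.allclose(np.cov(X, rowvar=False), sigma**2 * Sigma, rtol=0.05, atol=0.05)
```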
2.1 CMA-ES

The CMA-ES has been proposed by Hansen and Ostermeier [1996, 2001]; its most recent version is described by Hansen [2015]. In the t-th iteration of the algorithm, the CMA-ES samples λ points from a multivariate normal distribution N(m_t, σ_t² C_t), evaluates the objective function f at these points, and adapts the parameters C_t ∈ R^{d×d}, m_t ∈ R^d, and σ_t ∈ R_+. In the following, we present the update procedure in a slightly simplified form (for didactic reasons, we refer to Hansen [2015] for the details). All parameters (µ, λ, ω_{i=1,...,µ}, c_σ, d_σ, c_c, c_1, c_µ) are set to their default values [Hansen, 2015, Table 1].

For a minimization task, the λ points are ranked by function value such that f(x_{1,t}) ≤ f(x_{2,t}) ≤ ... ≤ f(x_{λ,t}). The distribution mean is set to the weighted average m_{t+1} = Σ_{i=1}^{µ} ω_i x_{i,t}. The weights depend only on the ranking, not on the function values directly. This renders the algorithm invariant under order-preserving transformations of the objective function. Points with smaller ranks (i.e., better objective function values) are given a larger weight ω_i, with Σ_{i=1}^{λ} ω_i = 1. The weights are zero for ranks larger than µ < λ, where typically µ = λ/2. Thus, points with function values worse than the median do not enter the adaptation process of the parameters.

The covariance matrix is updated using two terms, a rank-1 and a rank-µ update. For the rank-1 update, a long term average of the changes of m_t is maintained,¹

    p_{c,t+1} = (1 − c_c) p_{c,t} + √(c_c (2 − c_c) µ_eff) (m_{t+1} − m_t)/σ_t ,    (1)

where µ_eff = 1 / Σ_{i=1}^{µ} ω_i² is the effective sample size given the weights. Note that p_{c,t} is large when the algorithm performs steps in the same direction, while it becomes small when the algorithm performs steps in alternating directions. The rank-µ update estimates the covariance of the weighted steps x_{i,t} − m_t, 1 ≤ i ≤ µ. Combining the rank-1 and rank-µ updates gives the final update rule for C_t, which can be motivated by principles from information geometry [Akimoto et al., 2012]:

    C_{t+1} = (1 − c_1 − c_µ) C_t + c_1 p_{c,t+1} p_{c,t+1}^T + (c_µ/σ_t²) Σ_{i=1}^{µ} ω_i (x_{i,t} − m_t)(x_{i,t} − m_t)^T    (2)

So far, the update is (apart from initialization) invariant under affine linear transformations (i.e., x → Bx + b, B ∈ GL(d, R)).

The update of the global step-size parameter σ_t is based on the cumulative step-size adaptation algorithm (CSA). It measures the correlation of successive steps in a normalized coordinate system. The goal is to adapt σ_t such that the steps of the algorithm become uncorrelated. Under the assumption that uncorrelated steps are standard normally distributed, a carefully designed long term average over the steps should have the same expected length as a χ-distributed random variable, denoted by E{χ}. The long term average has the form

    p_{σ,t+1} = (1 − c_σ) p_{σ,t} + √(c_σ (2 − c_σ) µ_eff) C_t^{−1/2} (m_{t+1} − m_t)/σ_t    (3)

with p_{σ,1} = 0. The normalization by the factor C_t^{−1/2} is the main difference between equations (1) and (3). It is important because it corrects for a change of C_t between iterations. Without this correction, it is difficult to measure correlations accurately in the un-normalized coordinate system. For the update, the length of p_{σ,t+1} is compared to the expected length E{χ}, and σ_t is changed depending on whether the average step taken is longer or shorter than expected:

    σ_{t+1} = σ_t exp( (c_σ/d_σ) ( ‖p_{σ,t+1}‖/E{χ} − 1 ) )    (4)

This update is not proven to preserve invariance under affine linear transformations [Auger, 2015], and it is conjectured that it does not.

3 Cholesky-CMA-ES

In general, computing the matrix square root or the Cholesky factor of a d × d matrix has time complexity ω(d²) (i.e., it scales worse than quadratically). To reduce this complexity, Suttorp et al. [2009] have suggested to replace the process of updating the covariance matrix and decomposing it afterwards by updates operating directly on the decomposition (i.e., the covariance matrix is never computed and stored explicitly; only its factorization is maintained). Krause and Igel [2015] have shown that the update of C_t in equation (2) can be rewritten as a quadratic-time update of its triangular Cholesky factor A_t with C_t = A_t A_t^T. They consider the special case µ = λ = 1. We propose to extend this update to the standard CMA-ES, which leads to a runtime of O(µd²). As typically µ = O(log(d)), this gives a large speed-up compared to the explicit recomputation of the Cholesky factor or the inverse of the covariance matrix.

Unfortunately, the fast Cholesky update cannot be applied directly to the original CMA-ES. To see this, consider the term s_t = C_t^{−1/2} (m_{t+1} − m_t) in equation (3). Rewriting p_{σ,t+1} in terms of s_t in a non-recursive fashion, we obtain

    p_{σ,t+1} = Σ_{k=1}^{t} (1 − c_σ)^{t−k} √(c_σ (2 − c_σ) µ_eff) s_k / σ_k .

¹ Given c_c, the factors in (1) are chosen to compensate for the change in variance when adding distributions. If the ranking of the points were purely random, √µ_eff (m_{t+1} − m_t)/σ_t ∼ N(0, C_t), and if C_t = I and p_{c,t} ∼ N(0, I), then p_{c,t+1} ∼ N(0, I).
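For concreteness, the following NumPy sketch (a simplified illustration, not the paper's implementation) performs one parameter update according to equations (1)–(4). The rows of X are assumed to hold the µ best sampled points, already sorted; E{χ} is replaced by its standard approximation; and the eigendecomposition used to obtain C_t^{−1/2} is the cubic-time step whose removal is the subject of the next section.

```python
import numpy as np

def cma_update(m, sigma, C, p_c, p_sigma, X, weights,
               c_c, c_1, c_mu, c_sigma, d_sigma):
    """One simplified CMA-ES update; X holds the mu best points as rows (best first)."""
    d = m.size
    mu_eff = 1.0 / np.sum(weights ** 2)
    m_new = weights @ X                                    # weighted mean of the mu best points
    y = (m_new - m) / sigma

    # Equations (3) and (4): CSA needs C^{-1/2}, obtained here by a cubic-time eigendecomposition.
    eigvals, B = np.linalg.eigh(C)
    C_inv_sqrt = (B / np.sqrt(eigvals)) @ B.T
    p_sigma = ((1 - c_sigma) * p_sigma
               + np.sqrt(c_sigma * (2 - c_sigma) * mu_eff) * C_inv_sqrt @ y)
    chi = np.sqrt(d) * (1 - 1 / (4 * d) + 1 / (21 * d ** 2))   # approximation of E{chi}
    sigma_new = sigma * np.exp(c_sigma / d_sigma * (np.linalg.norm(p_sigma) / chi - 1))

    # Equations (1) and (2): rank-1 and rank-mu covariance update.
    p_c = (1 - c_c) * p_c + np.sqrt(c_c * (2 - c_c) * mu_eff) * y
    Z = (X - m) / sigma
    C_new = ((1 - c_1 - c_mu) * C
             + c_1 * np.outer(p_c, p_c)
             + c_mu * (Z.T * weights) @ Z)
    return m_new, sigma_new, C_new, p_c, p_sigma
```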

Algorithm 1: The Cholesky-CMA-ES.

    input: λ, µ, m_1, ω_{i=1,...,µ}, c_σ, d_σ, c_c, c_1, and c_µ
    A_1 = I, p_{c,1} = 0, p_{σ,1} = 0
    for t = 1, 2, ... do
        for i = 1, ..., λ do
            x_{i,t} = σ_t A_t y_{i,t} + m_t,  y_{i,t} ∼ N(0, I)
        sort x_{i,t}, i = 1, ..., λ, increasing by f(x_{i,t})
        m_{t+1} = Σ_{i=1}^{µ} ω_i x_{i,t}
        p_{c,t+1} = (1 − c_c) p_{c,t} + √(c_c (2 − c_c) µ_eff) (m_{t+1} − m_t)/σ_t
        // apply formula (2) to A_t
        A_{t+1} ← √(1 − c_1 − c_µ) A_t
        A_{t+1} ← rankOneUpdate(A_{t+1}, c_1, p_{c,t+1})
        for i = 1, ..., µ do
            A_{t+1} ← rankOneUpdate(A_{t+1}, c_µ ω_i, (x_{i,t} − m_t)/σ_t)
        // update σ using ŝ_k as in (5)
        p_{σ,t+1} = (1 − c_σ) p_{σ,t} + √(c_σ (2 − c_σ) µ_eff) A_t^{−1} (m_{t+1} − m_t)/σ_t
        σ_{t+1} = σ_t exp( (c_σ/d_σ) ( ‖p_{σ,t+1}‖/E{χ} − 1 ) )

Algorithm 2: rankOneUpdate(A, β, v)

    input: Cholesky factor A ∈ R^{d×d} of C, β ∈ R, v ∈ R^d
    output: Cholesky factor A' of C + β v v^T
    α ← v, b ← 1
    for j = 1, ..., d do
        A'_{jj} ← √(A_{jj}² + (β/b) α_j²)
        γ ← A_{jj}² b + β α_j²
        for k = j + 1, ..., d do
            α_k ← α_k − (α_j / A_{jj}) A_{kj}
            A'_{kj} ← (A'_{jj}/A_{jj}) A_{kj} + (A'_{jj} β α_j / γ) α_k
        b ← b + β α_j² / A_{jj}²

By the RQ-decomposition, we can find C_t^{1/2} = A_t E_t with E_t being an orthogonal matrix and A_t lower triangular. When replacing s_t by ŝ_t = A_t^{−1} (m_{t+1} − m_t), we obtain

    p_{σ,t+1} = Σ_{k=1}^{t} (1 − c_σ)^{t−k} √(c_σ (2 − c_σ) µ_eff) E_k^T ŝ_k / σ_k .

Thus, replacing C_t^{−1/2} by A_t^{−1} introduces a new random rotation matrix E_t^T, which changes in every iteration. Obtaining E_t from A_t can be achieved by the polar decomposition, which is a cubic-time operation: currently no algorithms are known that can update an existing polar decomposition from an updated Cholesky factor in less than cubic time. Thus, if our goal is to apply the fast Cholesky update, we have to perform the update without this correction factor:

    p_{σ,t+1} ≈ Σ_{k=1}^{t} (1 − c_σ)^{t−k} √(c_σ (2 − c_σ) µ_eff) ŝ_k / σ_k .    (5)
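A direct NumPy transcription of the rank-one Cholesky update in Algorithm 2 (variable names and the consistency check are our own) looks as follows; it updates the lower triangular factor in O(d²) without ever forming C + βvv^T.

```python
import numpy as np

def rank_one_update(A, beta, v):
    """Return the lower triangular Cholesky factor of A A^T + beta * v v^T (beta > 0)."""
    A = A.copy()
    alpha = np.array(v, dtype=float)
    b = 1.0
    d = len(v)
    for j in range(d):
        ajj = A[j, j]
        new_ajj = np.sqrt(ajj**2 + (beta / b) * alpha[j]**2)
        gamma = ajj**2 * b + beta * alpha[j]**2
        for k in range(j + 1, d):
            alpha[k] -= (alpha[j] / ajj) * A[k, j]
            A[k, j] = (new_ajj / ajj) * A[k, j] + (new_ajj * beta * alpha[j] / gamma) * alpha[k]
        A[j, j] = new_ajj
        b += beta * alpha[j]**2 / ajj**2
    return A

# Consistency check against a recomputed factorization.
rng = np.random.default_rng(1)
M = rng.normal(size=(6, 6))
C = M @ M.T + 6 * np.eye(6)
A = np.linalg.cholesky(C)
v = rng.normal(size=6)
assert np.allclose(rank_one_update(A, 0.3, v), np.linalg.cholesky(C + 0.3 * np.outer(v, v)))
```

In Algorithm 1, A_{t+1} is obtained by scaling A_t with √(1 − c_1 − c_µ) and then calling this routine once with β = c_1 and µ more times with β = c_µ ω_i; the product A_t^{−1}(m_{t+1} − m_t) needed for the step-size path can be computed by a quadratic-time triangular solve instead of an explicit inverse.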

This introduces some error, but we will show in the following that we can expect this error to be small and to decrease over time as the algorithm converges to the optimum. For this, we need the following result.

Lemma 1. Consider a sequence of symmetric positive definite matrices (C_t)_{t=0}^{∞} and let C̄_t = C_t/(det C_t)^{1/d}. Assume that C̄_t → C̄ as t → ∞ and that C̄ is symmetric positive definite with det C̄ = 1. Let C̄_t^{1/2} = Ā_t E_t denote the RQ-decomposition of C̄_t^{1/2}, where E_t is orthogonal and Ā_t lower triangular. Then E_{t−1} E_t^T → I as t → ∞.

Proof. Let C̄^{1/2} = Ā Ē be the RQ-decomposition of C̄^{1/2}. As det C̄ ≠ 0, this decomposition is unique. Because the RQ-decomposition is continuous, it maps convergent sequences to convergent sequences. Therefore E_t → Ē as t → ∞, and thus E_{t−1} E_t^T → Ē Ē^T = I.

This result establishes that, when C_t converges to a certain shape (but not necessarily to a certain scaling), A_t and thus E_t will also converge (up to scaling). Thus, as we only need the norm of p_{σ,t+1}, we can rotate the coordinate system; multiplying with E_t we obtain

    E_t p_{σ,t+1} = Σ_{k=1}^{t} (1 − c_σ)^{t−k} √(c_σ (2 − c_σ) µ_eff) E_t E_k^T ŝ_k / σ_k .    (6)

Therefore, if E_t E_k^T → I, the error in the norm will also vanish due to the exponential weighting in the summation. Note that this does not hold for an arbitrary decomposition C_t = B_t B_t^T. If we do not constrain B_t to be triangular and allow any matrix, we no longer have a bijective mapping between C_t and B_t, and the introduction of (d − 1)d/2 additional degrees of freedom (as, e.g., in the update proposed by Suttorp et al. [2009]) allows the creation of non-converging sequences of E_t even for C_t = const.

As the CMA-ES is a randomized algorithm, we cannot assume convergence of C_t. However, in simplified algorithms the expectation of C_t converges [Beyer, 2014]. Still, the reasoning behind Lemma 1 establishes that the error caused by replacing s_t by ŝ_t is small if C_t changes slowly. Equation (6) shows that the error depends only on the rotation of coordinate systems. As the mapping from C_t to the triangular factor A_t is one-to-one and smooth, the coordinate-system changes in every step will be small, and because of the exponentially decaying weighting, only the last few coordinate systems matter at a particular time step t.

The Cholesky-CMA-ES algorithm is given in Algorithm 1. One can derive the algorithm from the standard CMA-ES by decomposing (2) into a number of rank-1 updates, C_{t+1} = (((αC_t + β_1 v_1 v_1^T) + β_2 v_2 v_2^T) + ...), and applying them to the Cholesky factor using Algorithm 2.

Properties of the update rule. The O(µd²) complexity of the update in the Cholesky-CMA-ES is asymptotically optimal.² Apart from the theoretical guarantees, there are several additional advantages compared to approaches using a non-triangular Cholesky factorization (e.g., Suttorp et al. [2009]). First, as only triangular matrices have to be stored, the storage complexity is optimal. Second, the diagonal elements of a triangular Cholesky factor are the square roots of the eigenvalues of the factorized matrix, that is, we get the eigenvalues of the covariance matrix for free. These are important, for example, for monitoring the conditioning of the optimization problem and, in particular, to enforce lower bounds on the variances of σ_t C_t projected on its principal components. Third, a triangular matrix can be inverted in quadratic time. Thus, we can efficiently compute A_t^{−1} from A_t when needed, instead of having two separate quadratic-time updates for A_t and A_t^{−1}, which requires more memory and is prone to numerical instabilities.
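As a small numerical illustration of Lemma 1 (our own sketch, not from the paper): when C_t converges in shape, the orthogonal factor E_t obtained from the triangular factorization of C_t^{1/2} converges as well, so the rotation E_t E_{t−1}^T approaches the identity.

```python
import numpy as np

def orthogonal_factor(C):
    """E with C^{1/2} = A E, where A is the lower triangular Cholesky factor of C."""
    eigvals, B = np.linalg.eigh(C)
    C_sqrt = (B * np.sqrt(eigvals)) @ B.T
    A = np.linalg.cholesky(C)
    return np.linalg.solve(A, C_sqrt)   # E = A^{-1} C^{1/2}; orthogonal since A A^T = C

rng = np.random.default_rng(2)
M = rng.normal(size=(4, 4))
C_limit = M @ M.T + 4 * np.eye(4)       # limit of the sequence C_t
E_prev = None
for t in range(1, 8):
    C_t = C_limit + 0.5**t * np.diag(rng.uniform(size=4))    # C_t -> C_limit
    E_t = orthogonal_factor(C_t)
    if E_prev is not None:
        print(t, np.linalg.norm(E_t @ E_prev.T - np.eye(4)))  # tends to 0
    E_prev = E_t
```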
4 Experiments and Results

Experiments. We compare the Cholesky-CMA-ES with other CMA-ES variants.³ The reference CMA-ES implementation uses a delay strategy in which the matrix square root is computed only every max{1, ⌊1/(10d(c_1 + c_µ))⌋} iterations [Hansen, 2015], which equals one for the dimensions considered in our experiments. We call this variant CMA-ES-Ref.

² Actually, the complexity is related to the complexity of multiplying two d × µ matrices. We assume a naïve implementation of matrix multiplication. With a faster multiplication algorithm, the complexity can be reduced accordingly.

³ We added our algorithm to the open-source machine learning library Shark [Igel et al., 2008] and use LAPACK for high efficiency.

[Figure 1: Number of function evaluations required to reach f(x) < 10^{-14}, plotted over the problem dimensionality d (medians of 100 trials). Panels: (a) Sphere, (b) Cigar, (c) Discus, (d) Ellipsoid, (e) Rosenbrock, (f) DiffPowers; algorithms: Cholesky-CMA-ES, Suttorp-CMA-ES, CMA-ES/d, CMA-ES-Ref. The graphs for CMA-ES-Ref and Cholesky-CMA-ES overlap.]

[Figure 2: Runtime in seconds over the problem dimensionality d for the same algorithms and benchmark functions (medians of 100 trials). Note the logarithmic scaling on both axes.]

Table 1: Benchmark functions used in the experiments (additionally, a rotation matrix B transforms the variables, x → Bx).

    Name              f(x)
    Sphere            ‖x‖²
    Rosenbrock        Σ_{i=0}^{d−2} ( 100 (x_{i+1} − x_i²)² + (1 − x_i)² )
    Discus            x_0² + Σ_{i=1}^{d−1} 10^{−6} x_i²
    Cigar             10^{−6} x_0² + Σ_{i=1}^{d−1} x_i²
    Ellipsoid         Σ_{i=0}^{d−1} 10^{−6 i/(d−1)} x_i²
    Different Powers  Σ_{i=0}^{d−1} |x_i|^{2 + 10 i/(d−1)}

[Figure 3: Function value evolution over time, log f(m_t) versus time in seconds, on the benchmark functions with d = 128. Panels: (a) Sphere, (b) Cigar, (c) Discus, (d) DiffPowers, (e) Ellipsoid, (f) Rosenbrock; algorithms: Cholesky-CMA-ES, Suttorp-CMA-ES, CMA-ES/d, CMA-ES-Ref. Shown are single runs, namely those with runtimes closest to the corresponding median runtimes.]
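For reference, a NumPy sketch of two of the benchmark functions and of a rotated evaluation (the function definitions follow the standard textbook forms matching Table 1; the QR-based sampling of the rotation matrix is our own choice, not taken from the paper):

```python
import numpy as np

def rosenbrock(x):
    return np.sum(100.0 * (x[1:] - x[:-1]**2)**2 + (1.0 - x[:-1])**2)

def ellipsoid(x):
    d = len(x)
    return np.sum(10.0**(-6.0 * np.arange(d) / (d - 1)) * x**2)

def random_rotation(d, rng):
    """Random orthogonal matrix via the QR decomposition of a Gaussian matrix."""
    Q, R = np.linalg.qr(rng.normal(size=(d, d)))
    return Q * np.sign(np.diag(R))

rng = np.random.default_rng(3)
d = 16
B = random_rotation(d, rng)
x0 = rng.uniform(size=d)        # starting points are drawn uniformly from [0, 1]^d
print(rosenbrock(B @ x0), ellipsoid(B @ x0))
```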

As an alternative, we experimented with delaying the update for d steps; we refer to this variant as CMA-ES/d. We also adapted the non-triangular Cholesky factor approach by Suttorp et al. [2009] to the state-of-the-art implementation of the CMA-ES; we refer to the resulting algorithm as Suttorp-CMA-ES.

We considered the standard benchmark functions for derivative-free optimization given in Table 1. Sphere is considered to show that on a spherical function the step-size adaptation does not behave differently; Cigar, Discus, and Ellipsoid model functions with different convex shapes near the optimum; Rosenbrock tests learning a function with bends, which lead to slowly converging covariance matrices in the optimization process; DiffPowers is an example of a function with arbitrarily bad conditioning. To test rotation invariance, we applied a rotation matrix to the variables, x → Bx, B ∈ SO(d, R). This is done for every benchmark function, and a rotation matrix was chosen randomly at the beginning of each trial. All starting points were drawn uniformly from [0, 1]^d, except for Sphere, where we sampled from N(0, I). For each function, we varied d ∈ {4, 8, 16, ..., 256}. Due to the long running times, we only computed CMA-ES-Ref up to d = 128. For the given range of dimensions, for every choice of d, we ran 100 trials from different initial points and monitored the number of iterations and the wall-clock time needed to sample a point with a function value below 10^{-14}. For Rosenbrock we excluded the trials in which the algorithm did not converge to the global optimum. We further evaluated the algorithms on additional benchmark functions inspired by Stich and Müller [2012] and measured the change of rotation introduced by the Cholesky-CMA-ES at each iteration (E_t); see the supplementary material.

Results. Figure 1 shows that CMA-ES-Ref and Cholesky-CMA-ES require the same number of function evaluations to reach a given objective value. The CMA-ES/d requires slightly more evaluations, depending on the benchmark function. When considering the wall-clock runtime, the Cholesky-CMA-ES was significantly faster than the other algorithms. As expected from the theoretical analysis, the higher the dimensionality, the more pronounced the differences; see Figure 2 (note the logarithmic scales). For d = 64 the Cholesky-CMA-ES was already 20 times faster than the CMA-ES-Ref. The drastic differences in runtime become apparent when inspecting single trials. Note that for d = 256 the matrix size exceeded the L2 cache, which affected the performance of the Cholesky-CMA-ES and Suttorp-CMA-ES. Figure 3 plots the trials with runtimes closest to the corresponding median runtimes for d = 128.

5 Conclusion

CMA-ES is a ubiquitous algorithm for derivative-free optimization. It has proven to be a highly efficient direct policy search algorithm and a useful tool for model selection in supervised learning. We proposed the Cholesky-CMA-ES, which can be regarded as an approximation of the original CMA-ES. We gave theoretical arguments for why our approximation, which only affects the global step-size adaptation, does not impair performance. The Cholesky-CMA-ES achieves a better, asymptotically optimal time complexity of O(µd²) for the covariance update and optimal memory complexity. It allows for numerically stable computation of the inverse of the Cholesky factor in quadratic time and provides the eigenvalues of the covariance matrix without additional costs. We empirically compared the Cholesky-CMA-ES to the state-of-the-art CMA-ES with delayed covariance matrix decomposition.
Our experiments demonstrate a significant increase in optimization speed. As expected, the Cholesky-CMA-ES needed the same number of objective function evaluations as the standard CMA-ES, but required much less wall-clock time, and this speed-up increases with the search space dimensionality. Still, our algorithm scales quadratically with the problem dimensionality. If the dimensionality gets so large that maintaining a full covariance matrix becomes computationally infeasible, one has to resort to low-dimensional approximations [e.g., Loshchilov, 2015], which, however, bear the risk of a significant drop in optimization performance. Thus, we advocate our new Cholesky-CMA-ES for scaling up CMA-ES to large optimization problems for which updating and storing the covariance matrix is still possible, for example, for training neural networks in direct policy search.

Acknowledgements. We acknowledge support from the Innovation Fund Denmark through the projects "Personalized breast cancer screening" (OK, CI) and "Cyber Fraud Detection Using Advanced Machine Learning Techniques" (DRA, CI).

References

Y. Akimoto, Y. Nagata, I. Ono, and S. Kobayashi. Theoretical foundation for CMA-ES from information geometry perspective. Algorithmica, 64(4):698-716, 2012.

Y. Akimoto, A. Auger, and N. Hansen. Comparison-based natural gradient optimization in high dimension. In Proceedings of the 16th Annual Genetic and Evolutionary Computation Conference (GECCO). ACM, 2014.

A. Auger. Analysis of Comparison-based Stochastic Continuous Black-Box Optimization Algorithms. Habilitation thesis, Faculté des Sciences d'Orsay, Université Paris-Sud, 2015.

H.-G. Beyer. Evolution strategies. Scholarpedia, 2(8):1965, 2007.

H.-G. Beyer. Convergence analysis of evolutionary algorithms that are based on the paradigm of information geometry. Evolutionary Computation, 22(4), 2014.

K. Bringmann, T. Friedrich, C. Igel, and T. Voß. Speeding up many-objective optimization by Monte Carlo approximations. Artificial Intelligence, 204:22-29, 2013.

A. E. Eiben and J. Smith. From evolutionary computation to the evolution of things. Nature, 521, 2015.

F. Gomez, J. Schmidhuber, and R. Miikkulainen. Accelerated neural evolution through cooperatively coevolved synapses. Journal of Machine Learning Research, 9, 2008.

N. Hansen and A. Ostermeier. Adapting arbitrary normal mutation distributions in evolution strategies: The covariance matrix adaptation. In Proceedings of IEEE International Conference on Evolutionary Computation (CEC 1996). IEEE, 1996.

N. Hansen and A. Ostermeier. Completely derandomized self-adaptation in evolution strategies. Evolutionary Computation, 9(2):159-195, 2001.

N. Hansen. The CMA evolution strategy: A tutorial. Technical report, Inria Saclay Île-de-France, Université Paris-Sud, LRI, 2015.

V. Heidrich-Meisner and C. Igel. Hoeffding and Bernstein races for selecting policies in evolutionary direct policy search. In Proceedings of the 26th International Conference on Machine Learning (ICML 2009), 2009.

V. Heidrich-Meisner and C. Igel. Neuroevolution strategies for episodic reinforcement learning. Journal of Algorithms, 64(4):152-168, 2009.

C. Igel, T. Glasmachers, and V. Heidrich-Meisner. Shark. Journal of Machine Learning Research, 9, 2008.

C. Igel. Evolutionary kernel learning. In Encyclopedia of Machine Learning. Springer-Verlag, 2010.

O. Krause and C. Igel. A more efficient rank-one covariance matrix update for evolution strategies. In Proceedings of the 2015 ACM Conference on Foundations of Genetic Algorithms (FOGA XIII). ACM, 2015.

I. Loshchilov. A computationally efficient limited memory CMA-ES for large scale optimization. In Proceedings of the 16th Annual Genetic and Evolutionary Computation Conference (GECCO). ACM, 2014.

I. Loshchilov. LM-CMA: An alternative to L-BFGS for large scale black-box optimization. Evolutionary Computation, 2015.

M. N. Omidvar and X. Li. A comparative study of CMA-ES on large scale global optimisation. In AI 2010: Advances in Artificial Intelligence, volume 6464 of LNAI. Springer, 2010.

J. Poland and A. Zell. Main vector adaptation: A CMA variant with linear time and space complexity. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2001). Morgan Kaufmann Publishers, 2001.

R. Ros and N. Hansen. A simple modification in CMA-ES achieving linear time and space complexity. In Parallel Problem Solving from Nature (PPSN X). Springer, 2008.

S. U. Stich and C. L. Müller. On spectral invariance of randomized Hessian and covariance matrix adaptation schemes. In Parallel Problem Solving from Nature (PPSN XII). Springer, 2012.

Y. Sun, T. Schaul, F. Gomez, and J. Schmidhuber. A linear time natural evolution strategy for non-separable functions. In 15th Annual Conference on Genetic and Evolutionary Computation Conference Companion. ACM, 2013.

T. Suttorp, N. Hansen, and C. Igel. Efficient covariance matrix update for variable metric evolution strategies. Machine Learning, 75(2):167-197, 2009.
