A Sparse Covariance Function for Exact Gaussian Process Inference in Large Datasets


Arman Melkumyan
Australian Centre for Field Robotics
The University of Sydney
NSW 2006, Australia

Fabio Ramos
Australian Centre for Field Robotics
The University of Sydney
NSW 2006, Australia

Abstract

Despite the success of Gaussian processes (GPs) in modelling spatial stochastic processes, dealing with large datasets is still challenging. The problem arises from the need to invert a potentially large covariance matrix during inference. In this paper we address the complexity problem by constructing a new stationary covariance function (Mercer kernel) that naturally provides a sparse covariance matrix. The sparseness of the matrix is defined by hyper-parameters optimised during learning. The new covariance function enables exact GP inference and performs comparably to the squared exponential, at a lower computational cost. This allows the application of GPs to large-scale problems such as ore grade prediction in mining or 3D surface modelling. Experiments show that, using the proposed covariance function, very sparse covariance matrices are normally obtained, which can be effectively used for faster inference and less memory usage.

1 Introduction

Gaussian processes (GPs) are a useful and powerful tool for regression in supervised machine learning [Rasmussen and Williams, 2006]. The range of applications includes geophysics, mining, hydrology, reservoir engineering and robotics. Despite their increasing popularity, modelling large-scale spatial stochastic processes is still challenging. The difficulty comes from the fact that inference in GPs is usually computationally expensive due to the need to invert a potentially large covariance matrix at inference time, which has O(N^3) cost. For problems with thousands of observations, exact inference in normal GPs is intractable and approximation algorithms are required. Most of the approximation algorithms employ a subset of points to approximate the posterior distribution at a new point given the training data and hyper-parameters. These approximations rely on heuristics to select the subset of points [Lawrence et al., 2003; Seeger et al., 2003], or use pseudo targets obtained during the optimisation of the log marginal likelihood of the model [Snelson and Ghahramani, 2006].

In this work, we address the complexity problem differently. Instead of relying on sparse GP approximations, we propose a new covariance function which provides intrinsically sparse covariance matrices. This allows exact inference in GPs using conventional methods. As the new sparse covariance function can be multiplied by any other valid covariance function and the result is still a sparse covariance matrix, a lot of flexibility is given to practitioners to accurately model their problems while preserving the sparseness properties. We call the GPs constructed using our sparse covariance function Exact Sparse Gaussian Processes (ESGPs). The main idea behind them is the formulation of a valid and smooth covariance function whose output equals zero whenever the distance between input observations is larger than a hyper-parameter. As with other hyper-parameters, this one can be estimated by maximising the marginal likelihood, to better model the properties of the data such as smoothness, characteristic length-scale, and noise. Additionally, the proposed covariance function much resembles the popular squared exponential in terms of smoothness, being four times continuously differentiable. We empirically compare ESGPs with local approximation techniques and demonstrate how other covariance functions can be integrated in the same framework.
Our method results in very sparse covariance matrices (up to 90% of the elements are zeros in in-ground grade estimation problems), which require significantly less memory while providing similar performance.

This paper is organised as follows. In Section 2 we review the basics of GP regression and introduce notation. Section 3 summarises previous work on approximate inference with GPs. Section 4 presents our new intrinsically sparse covariance function and its main properties. We evaluate the framework, providing experimental results on both artificial and real data, in Section 6. Finally, Section 7 concludes the paper and discusses further developments.

2 Gaussian Processes

In this section we briefly review Gaussian processes for regression and introduce notation. We consider the supervised learning problem where, given a training set D = {x_i, y_i}_{i=1}^N consisting of N input points x_i ∈ R^D and corresponding outputs y_i ∈ R, the objective is to compute the predictive distribution of f(x_*) at a new test point x_*.

A Gaussian process model places a multivariate Gaussian distribution over the space of function variables f(x) mapping the input to the output space. The model is specified by defining a mean function m(x) and a covariance function k(x, x'), resulting in the Gaussian process written as f(x) ~ GP(m(x), k(x, x')). Denoting groups of these points as X, f, y = {x_i}, {f_i}, {y_i}_{i=1}^N for the training set and X_*, f_*, y_* = {x_{*i}}, {f_{*i}}, {y_{*i}}_{i=1}^{N_*} for the testing points, the joint Gaussian distribution with m(x) = 0 is

[f; f_*] ~ N( 0, [K(X, X), K(X, X_*); K(X_*, X), K(X_*, X_*)] ),   (1)

where N(μ, Σ) is a multivariate Gaussian distribution with mean μ and covariance Σ, and K is used to denote the covariance matrix computed between all points in the sets. If we assume observations with Gaussian noise ε of variance σ^2, such that y = f(x) + ε, the joint distribution becomes

[y; f_*] ~ N( 0, [K(X, X) + σ^2 I, K(X, X_*); K(X_*, X), K(X_*, X_*)] ).   (2)

A popular choice for the covariance function is the squared exponential, used in this paper for comparisons in the experiment section:

k(x, x') = σ_f^2 exp( −(1/2) (x − x')^T Λ^{-1} (x − x') ),   (3)

with Λ = diag(l)^2, where l is a vector of positive numbers representing the length-scales in each dimension.

2.1 Inference for New Points

By conditioning on the observed training points, the predictive distribution can be obtained as

p(f_* | X_*, X, y) = N(μ_*, Σ_*),   (4)

where

μ_* = K(X_*, X) [K(X, X) + σ^2 I]^{-1} y,
Σ_* = K(X_*, X_*) − K(X_*, X) [K(X, X) + σ^2 I]^{-1} K(X, X_*) + σ^2 I.   (5)

From Equation (5) it can be observed that the predictive mean is a linear combination of N kernel functions, each centred on a training point: μ_* = Σ_{i=1}^N α_i k(x_i, x_*), where α = [K(X, X) + σ^2 I]^{-1} y. A GP is also a best unbiased linear estimator [Cressie, 1993; Kitanidis, 1997] in the mean squared error sense. During inference, most of the computational cost arises from computing the inversion in Equation (5), which is O(N^3) if implemented naively.

2.2 Learning Hyper-Parameters

Commonly, the covariance function k(x, x') is parametrised by a set of hyper-parameters θ, and we write k(x, x'; θ). These parameters allow for more flexibility in modelling the properties of the data. Thus, learning a GP model is equivalent to determining the hyper-parameters of the covariance function from some training dataset. In a Bayesian framework this can be performed by maximising the log of the marginal likelihood w.r.t. θ:

log p(y | X, θ) = −(1/2) y^T K_y^{-1} y − (1/2) log |K_y| − (N/2) log 2π,   (6)

where K_y = K(X, X) + σ^2 I is the covariance matrix for the targets y. The marginal likelihood has three terms; from left to right, the first accounts for the data fit, the second is a complexity penalty term encoding the Occam's razor principle, and the last is a normalisation constant. Eq. (6) is a non-convex function of the hyper-parameters θ and therefore only local maxima can be obtained. In practice this is not a major issue, since good local maxima can be obtained with gradient-based techniques using multiple starting points. However, this requires the computation of partial derivatives, resulting in

∂ log p(y | X, θ) / ∂θ_j = (1/2) y^T K^{-1} (∂K/∂θ_j) K^{-1} y − (1/2) tr( K^{-1} ∂K/∂θ_j ).   (7)

Note that this expression requires the computation of partial derivatives of the covariance function w.r.t. θ.
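As a concrete illustration of Eqs. (4)-(7), the sketch below computes the predictive distribution and the log marginal likelihood for a generic covariance function using a standard Cholesky-based implementation. This is not the authors' code; the function names (gp_predict, log_marginal) and the use of NumPy/SciPy are our own choices, and the cost is the O(N^3) of a naive dense implementation.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def gp_predict(X, y, X_star, kernel, noise_var):
    """Predictive mean and covariance of Eqs. (4)-(5) for a generic kernel(A, B)."""
    K = kernel(X, X) + noise_var * np.eye(len(X))
    L = cho_factor(K, lower=True)
    alpha = cho_solve(L, y)                        # alpha = (K + sigma^2 I)^{-1} y
    K_s = kernel(X_star, X)
    mu = K_s @ alpha                               # predictive mean, Eq. (5)
    v = cho_solve(L, K_s.T)
    cov = kernel(X_star, X_star) - K_s @ v + noise_var * np.eye(len(X_star))
    return mu, cov

def log_marginal(X, y, kernel, noise_var):
    """Log marginal likelihood of Eq. (6), to be maximised w.r.t. the hyper-parameters."""
    K = kernel(X, X) + noise_var * np.eye(len(X))
    L = cho_factor(K, lower=True)
    alpha = cho_solve(L, y)
    log_det = 2.0 * np.sum(np.log(np.diag(L[0])))  # log |K_y| from the Cholesky factor
    return -0.5 * y @ alpha - 0.5 * log_det - 0.5 * len(X) * np.log(2.0 * np.pi)
```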
3 Related Work

Recently, several methods have been proposed to tackle the problem of GP inference in large datasets. However, most of these approaches rely on approximation techniques. A common and simple procedure is to select a subset of data points and perform inference using only these points. This is equivalent to ignoring part of the data, which makes the selection of the subset very important. In [Lawrence et al., 2003], the selection of points is based on the differential entropy. Similarly, [Seeger et al., 2003] suggest the use of another information-theoretic quantity, the information gain.

Another interesting procedure is to select a subset of data points to act as an inducing set and project this set onto all the data points available. This is known as sparse GP approximation [Williams and Seeger, 2001; Smola and Bartlett, 2001; Candela and Rasmussen, 2005], and it usually performs better than simply selecting a subset of data points. However, the definition of the inducing set is difficult and can involve non-convex optimisations [Snelson and Ghahramani, 2006].

Local methods have been applied in geostatistics for a long time [Wackernagel, 2003]. The idea is to perform inference by evaluating the covariance function only at points in the neighbourhood of a query point. This method can be effective, but the definition of the neighbourhood is crucial. Our method is inspired by this approach, but rather than defining the neighbourhood manually, we obtain it automatically during learning. An interesting idea combining local and global methods such as the sparse Gaussian process was proposed in [Snelson and Ghahramani, 2007]. We compare our method to theirs in Section 6.

This work differs from other studies by not addressing the GP inference problem through an approximation technique. Rather, it proposes a new covariance function that naturally generates sparse covariance matrices.

This idea was used in [Wendland, 2005] with piecewise polynomials, but extension to multiple dimensions is difficult due to the need to guarantee positive definiteness. A similar formulation to ours was proposed in [Storkey, 1999]; however, there is no hyper-parameter learning and the main properties are not analysed. To the best of our knowledge, this work is the first to demonstrate with real examples how the complexity problem can be addressed through the construction of a new sparse covariance function allowing for exact GP inference in large datasets.

4 Exact Sparse Gaussian Processes

For very large datasets, the inversion or even storage of a full matrix K(X, X) + σ^2 I can be prohibitive. In geology problems, for example, it is not uncommon to have datasets with 100K points or more. To deal with such large problems while still being able to perform exact inference in the GP model, we develop the covariance function below. First, note that the mean prediction in Eq. (5) can be rewritten as a linear combination of N evaluations of the covariance function, each one centred on a training point: μ_* = Σ_{i=1}^N α_i k(x_*, x_i), where α = [K(X, X) + σ^2 I]^{-1} y. To avoid the inversion of the full matrix, we can instead develop a covariance function whose output vanishes outside some region R, so that k(x_*, x_i) = 0 when x_i is outside a region R. In this way, only a subset of α would need to be computed, which effectively means that only a few columns of [K(X, X) + σ^2 I]^{-1} need to be computed, significantly reducing the computational and storage costs as R diminishes. As we shall see, the region can be specified automatically during learning.

4.1 Intrinsically Sparse Covariance Function

The covariance function we are looking for must vanish outside some finite region R for exact sparse GP regression. It must produce smooth curves, but it should not be infinitely differentiable, so that it remains applicable to problems with some discontinuities. For our derivation the basis function g(x) = cos^2(πx) H(0.5 − |x|) was chosen, which, due to cos^2(πx) = (cos(2πx) + 1)/2, is simply the cosine function shifted up, normalised and set to zero outside the interval x ∈ (−0.5, 0.5). The cosine function was selected as the basis function for the following reasons: (1) it is analytically well tractable; (2) integrals with finite limits containing combinations of its basic form can be calculated in closed form; and (3) the cosine function usually provides good approximations to different functions, being the core element of Fourier analysis. Here and afterwards H(·) represents the Heaviside unit step function.

As it stands, the chosen basis function g(x) is smooth on the whole real axis, vanishes outside the interval x ∈ (−0.5, 0.5), and has discontinuities in the second derivative. To derive a valid covariance function we conduct calculations analogous to those presented in [Rasmussen and Williams, 2006]. Using the transfer function h(x; u) = g(x − u), the following 1D covariance function is obtained:

k(x, x') = σ ∫ h(x/l; u) h(x'/l; u) du.   (8)

Due to the chosen form of the basis functions, the integral in Eq. (8) can be analytically evaluated (see Appendix A for details), resulting in

k_1(x, x'; l, σ_0) = σ_0 [ (2 + cos(2π d/l))/3 · (1 − d/l) + (1/(2π)) sin(2π d/l) ]   if d < l,
k_1(x, x'; l, σ_0) = 0   if d ≥ l,   (9)

where σ_0 > 0 is a constant coefficient, l > 0 is a given length-scale and d is the distance between the points:

d = |x − x'|.   (10)
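As a concrete illustration of Eq. (9), the snippet below evaluates the 1D sparse covariance function together with the squared exponential it is compared against in Figure 1. This is our own sketch (the function names sparse_cov_1d and sq_exp are not from the paper), with σ_0 = 1 by default.

```python
import numpy as np

def sparse_cov_1d(x, x_prime, length_scale, sigma0=1.0):
    """Sparse covariance of Eq. (9): exactly zero whenever |x - x'| >= l."""
    d = np.abs(x - x_prime) / length_scale
    k = sigma0 * ((2.0 + np.cos(2.0 * np.pi * d)) / 3.0 * (1.0 - d)
                  + np.sin(2.0 * np.pi * d) / (2.0 * np.pi))
    return np.where(d < 1.0, k, 0.0)

def sq_exp(x, x_prime, length_scale, sigma_f=1.0):
    """Squared exponential of Eq. (3) in one dimension, for comparison."""
    return sigma_f**2 * np.exp(-0.5 * ((x - x_prime) / length_scale) ** 2)
```

Evaluating sparse_cov_1d on every pair of inputs gives the covariance matrix directly; pairs further apart than l contribute exact zeros, which is what makes the resulting matrix sparse.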
From Eq. (8) it follows that for any points x_i and any real numbers a_i, where i = 1, 2, ..., n, the inequality

Σ_{i,j=1}^n a_i a_j k(x_i, x_j) = σ ∫ ( Σ_{i=1}^n a_i h(x_i/l; u) )^2 du ≥ 0

holds, so that the constructed covariance function is positive semi-definite. Based on Eqs. (9)-(10) we calculate that

∂k_1/∂d |_{d=0} = ∂^3 k_1/∂d^3 |_{d=0} = 0,   while   ∂^5 k_1/∂d^5 |_{d→0+} ≠ 0,   (11)

which shows that the covariance function is continuous and has continuous derivatives up to the fourth order at d = 0, with a discontinuity appearing in the fifth derivative. The function k_1(Δx; l, 1) is compared with the squared exponential in Figure 1. Note that it follows the squared exponential covariance function closely but vanishes when |Δx| ≥ l.

Figure 1: Plot of the covariance function output for different values of Δx (the sparse covariance with l = 1, 2, 3 and the squared exponential with l = 0.75).

4.2 Extending to Multiple Dimensions

This covariance function can be extended to multiple dimensions in the following ways:

1. Using direct products over all axes:

k_1(x, x'; l, σ_0) = σ_0 Π_{i=1}^D k(x_i, x'_i; l_i, 1),   (12)

where D is the dimensionality of the points and l is the vector of the characteristic lengths, l = (l_1, l_2, ..., l_D)^T.

2. Using the Mahalanobis distance:

k_2(r; σ_0, Ω) = σ_0 [ (2 + cos(2πr))/3 · (1 − r) + (1/(2π)) sin(2πr) ]   if r < 1,
k_2(r; σ_0, Ω) = 0   if r ≥ 1,   (13)

where σ_0 > 0, Ω is positive semi-definite and

r = sqrt( (x − x')^T Ω (x − x') ).   (14)

After this point we will frequently use the short notation k_M for the function k_2(r; σ_0, Ω).

4.3 Important Properties of the New Covariance Function

The developed multi-dimensional covariance function, in both forms k_1 and k_2, has the following remarkable properties:

1. It vanishes outside of a finite region R:

R_1 = { r ∈ R^D : k_1(r) ≠ 0 },   (15)
R_2 = { r ∈ R^D : k_2(r) ≠ 0 }.   (16)

The sizes of the regions R_1 and R_2 can be controlled via the characteristic lengths l_i. Moreover, these sizes can be learnt from data by maximising the marginal likelihood, as is common in the GP framework.

2. All the derivatives up to and including the fourth order are continuous, which guarantees mean square differentiability of the sample curves in GPs up to the corresponding order. There are discontinuities in the fifth order gradient.

3. The region R_1 is a D-dimensional rectangle and the region R_2 is a D-dimensional ellipsoid.

4. In the one-dimensional case, k_1 and k_2 become identical.

5. The covariance function is anisotropic, i.e. it has different inner properties in different dimensions.

6. The covariance function leads to sparse covariance matrices and allows GP inference in large datasets without the need for approximations.

5 Partial Derivatives for Learning

Learning the GP requires the computation of the partial derivatives of the covariance function w.r.t. the hyper-parameters (Eq. 7). Based on Eqs. (9) and (12), the following expressions for the partial derivatives of k_1(x, x'; l, σ_0) can be calculated:

∂k_1/∂σ_0 = (1/σ_0) k_1,   (17)

∂k_1(x_i, x'_i; l_i, σ_0) / ∂l_i = (4σ_0/3) [ π (1 − d_i/l_i) cos(π d_i/l_i) + sin(π d_i/l_i) ] sin(π d_i/l_i) · d_i/l_i^2,   (18)

where i = 1, 2, ..., D. For the second case, if Ω is diagonal and positive definite, it can be expressed via the characteristic lengths as follows:

Ω = diag( 1/l_1^2, 1/l_2^2, ..., 1/l_D^2 ).   (19)

From Eqs. (14) and (19) it follows that

r = sqrt( Σ_{k=1}^D ((x_k − x'_k)/l_k)^2 ).   (20)

Based on Eqs. (13) and (19)-(20), the following gradient components of the multi-dimensional covariance function k_M can be obtained:

∂k_M/∂σ_0 = (2 + cos(2πr))/3 · (1 − r) + (1/(2π)) sin(2πr),   if r < 1,   (21)

∂k_M/∂l_j = (4σ_0 / (3 l_j^3)) [ π (1 − r) cos(πr) + sin(πr) ] sin(πr) (1/r) (x_j − x'_j)^2,   if 0 < r < 1,   (22)

grad k_M = 0,   if r ≥ 1.   (23)

In Eq. (22), r is in the denominator, so direct calculations cannot be carried out using Eq. (22) when r = 0. However, using the equality

lim_{r→0} sin(μr)/r = μ,   (24)

one can directly show that

lim_{r→0} ∂k_M(r; σ_0, Ω)/∂l_j = lim_{r→0} (4σ_0 / (3 l_j^3)) π^2 (x_j − x'_j)^2 = 0.   (25)

Based on Eq. (25) it must be taken directly that

∂k_M/∂l_j |_{r=0} = 0,   j = 1, 2, ..., D.   (26)

Eqs. (21)-(23) and (26) fully define the gradient of the new covariance function k_M at every point and can be directly used in the learning procedure.
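A minimal sketch of the Mahalanobis form of Eqs. (13)-(14) with a diagonal Ω as in Eq. (19), together with its length-scale gradient from Eqs. (22)-(23) and (26), is given below. It is our own illustration rather than the paper's Matlab implementation; in particular, the special-case handling at r = 0 implements Eq. (26).

```python
import numpy as np

def sparse_cov(x, x_prime, lengths, sigma0=1.0):
    """k_M of Eqs. (13)-(14) with diagonal Omega = diag(1/l_1^2, ..., 1/l_D^2)."""
    r = np.sqrt(np.sum(((np.asarray(x) - np.asarray(x_prime)) / np.asarray(lengths)) ** 2))
    if r >= 1.0:
        return 0.0
    return sigma0 * ((2.0 + np.cos(2.0 * np.pi * r)) / 3.0 * (1.0 - r)
                     + np.sin(2.0 * np.pi * r) / (2.0 * np.pi))

def sparse_cov_grad_lengths(x, x_prime, lengths, sigma0=1.0):
    """Gradient of k_M w.r.t. the length-scales l_j, Eqs. (22)-(23) and (26)."""
    lengths = np.asarray(lengths, dtype=float)
    diff = np.asarray(x, dtype=float) - np.asarray(x_prime, dtype=float)
    r = np.sqrt(np.sum((diff / lengths) ** 2))
    if r >= 1.0:
        return np.zeros_like(lengths)         # Eq. (23): outside the support everything vanishes
    if r == 0.0:
        return np.zeros_like(lengths)         # Eq. (26): limit of Eq. (22) as r -> 0
    common = (4.0 * sigma0 / 3.0) * (np.pi * (1.0 - r) * np.cos(np.pi * r)
                                     + np.sin(np.pi * r)) * np.sin(np.pi * r) / r
    return common * diff ** 2 / lengths ** 3  # Eq. (22), one component per dimension
```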

6 Experiments

This section provides empirical comparisons between the exact sparse GP, a conventional GP with the squared exponential covariance function, and approximation procedures.

6.1 Artificial Dataset

In this experiment we compare the exact GP with the proposed covariance function against the approach proposed in [Snelson and Ghahramani, 2007]. The data is essentially the same as presented in the experiment section of [Snelson and Ghahramani, 2007]. As can be observed in Figure 2, the sparse covariance function provides a much smoother prediction of the underlying function than the combination of local and global approximations. This example shows qualitatively that in some situations approximation methods can lead to discontinuities. The same does not occur with exact sparse GP inference.

Figure 2: Comparison between the exact sparse GP and the local and global approximation. FITC stands for fully independent training conditional approximation; details can be found in [Snelson and Ghahramani, 2007]. Note that the exact sparse GP provides a much smoother curve.

6.2 Rainfall Dataset

In this experiment we compare the exact sparse GP with the exact GP using the squared exponential covariance function, and with the covariance function obtained by multiplying the two. The dataset used is a popular dataset in geostatistics for comparing inference procedures, known as the Spatial Interpolation Comparison (SIC) dataset [Dubois et al., 2003]. The dataset consists of 467 points measuring rainfall in 2D space. We divide these points into two sets, inference and testing; the inference set contains the points used to perform inference on the testing points. For each case the experiment is repeated 15 times with randomly selected inference and testing sets.

Figure 3 shows the normalised squared error for the different covariance functions, with the standard deviation (one sigma for each part of the bar), as a function of the number of inference points. As the number of inference points increases, so does the size of the covariance matrix. The results demonstrate that very similar errors are obtained for the different covariances. However, the sparse covariance function produces sparse matrices, thus requiring far fewer floating point operations. Figure 4 shows the percentage of zeros in the covariance matrix as a function of the number of inference points. As can be observed, the percentage of zeros grows quickly as more inference points are added and reaches its limit at around 100 inference points. Although the percentage of zeros reaches its limit, the number of zeros in the covariance matrix continues to increase, because the size of the covariance matrix grows with the number of inference points. Also worth noticing is the performance of the multiplication of the two covariance functions: the error is essentially the same as for the sparse covariance function alone, but the percentage of zeros is significantly smaller. This example demonstrates the benefits of the proposed approach in reducing storage and the number of operations for similar accuracy.

Figure 3: Normalised Mean Square Error for the SIC dataset. The error is essentially the same for both covariance functions, with the exact sparse GP performing slightly worse with fewer inference points but similarly with more inference points, at a much lower computational cost.

Figure 4: Percentage of zeros in the covariance matrix as a function of the number of inference points.

6.3 Iron Ore Dataset

In this dataset the goal is to estimate iron ore grade in 3D space over a region of 2.5 cubic kilometres. The dataset is from an open-pit iron mine in Western Australia. About 17K samples were collected, with the iron concentration measured with X-ray systems. We divide the 17K dataset points into inference and testing sets. The inference set is taken arbitrarily from the dataset, and from the remaining points the testing points are arbitrarily chosen. The experiments are repeated 5 times.
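The quantities reported in Figures 3-6 are the normalised (mean) squared error on the test set and the percentage of zeros in the covariance matrix. A minimal way to compute them is sketched below; the normalisation by the target variance is our assumption, as the paper does not spell out its exact definition.

```python
import numpy as np

def normalised_mse(y_true, y_pred):
    """Mean squared error divided by the variance of the test targets (assumed normalisation)."""
    return np.mean((y_true - y_pred) ** 2) / np.var(y_true)

def percentage_of_zeros(K):
    """Percentage of covariance matrix entries that are exactly zero (Figures 4 and 6)."""
    return 100.0 * np.mean(K == 0.0)
```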
Figure 5 shows the normalised mean squared error and the standard deviation (one sigma for each part of the bar) for the squared exponential and sparse covariance functions and their product. The results demonstrate that all three lead to similar errors, with the sparse covariance function performing slightly better as the number of inference points increases. Figure 6 shows that, although they result in similar errors, the sparse covariance function and the product lead to about 48% of zeros in the covariance matrix, which is 120K to 12M cells exactly equal to zero as the number of inference points varies from 500 to 5000. This example demonstrates that the proposed method provides greater savings for bigger datasets.

Figure 5: Normalised Mean Square Error for the iron ore dataset. The performance of the sparse covariance function is equivalent to the squared exponential. Due to the computational cost of the squared exponential, we stop the experiment at 5000 inference points, although the sparse covariance function could accommodate all the 17K points.

Figure 6: Percentage of zeros for the iron ore grade estimation problem.

6.4 Speed Comparison

This experiment demonstrates the computational gains of using the proposed sparse covariance function. Synthetic datasets containing 1000, 2000, 3000 and 4000 points were generated by sampling a polynomial function with white noise.
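To illustrate where the computational gains come from, the sketch below stores the covariance matrix in a sparse format and uses a sparse solver for the exact predictive mean of Eq. (5). It is our own illustration using scipy.sparse (the paper's experiments used Matlab's sparse matrix implementation), and for brevity it still evaluates every pair of points; a practical implementation would only evaluate pairs closer than the support of the covariance function.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def sparse_gp_mean(X, y, X_star, cov_fn, noise_var):
    """Exact predictive mean of Eq. (5) with K stored as a sparse matrix.

    cov_fn(a, b) must return exact zeros beyond its support (e.g. the sparse
    covariance of Eq. (9)); with the squared exponential, K would be dense
    and there would be no gain.
    """
    N = len(X)
    K = sp.lil_matrix((N, N))
    for i in range(N):
        for j in range(N):
            v = cov_fn(X[i], X[j])
            if v != 0.0:
                K[i, j] = v
    K = K.tocsc() + noise_var * sp.eye(N, format="csc")
    alpha = spsolve(K, y)                    # sparse solve of (K + sigma^2 I) alpha = y
    K_star = np.array([[cov_fn(xs, xi) for xi in X] for xs in X_star])
    return K_star @ alpha
```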

We compare the speed of a GP with the sparse covariance function to a GP with the squared exponential covariance function for different length-scales and the corresponding number of non-zero elements in the covariance matrix. The results are presented in Figure 7. The code is implemented in Matlab and uses the sparse matrix implementation package provided. Further gains could be obtained with more efficient sparse matrix packages. As the number of points in the datasets increases, the speed-up becomes more evident. With 4000 points, the sparse covariance function is faster than the squared exponential for up to 70% of non-zero elements in the covariance matrix. After this point, the computational cost of the sparse matrix implementation becomes dominant. As the sparse covariance function in general provides much sparser covariance matrices, speed gains can be quite substantial in addition to storage gains.

7 Conclusions

This paper proposed a new covariance function, constructed upon the cosine function for analytical tractability, that naturally provides sparse covariance matrices. The sparseness is controlled by a hyper-parameter that can be learnt from data. The sparse covariance function enables exact inference in GPs even for large datasets, providing both storage and computational benefits. Although the main focus of this paper was on GPs, it is important to emphasise that the proposed covariance function is also a Mercer kernel and can therefore be applied in kernel machines such as support vector machines, kernel principal component analysis and others [Schölkopf and Smola, 2002]. The use of the sparse covariance function in other kernel methods is the objective of our future work.

Acknowledgements

We thank Edward Snelson for providing the pictures for the comparison with the local and global sparse Gaussian process approximation. This work has been supported by the Rio Tinto Centre for Mine Automation and the ARC Centre of Excellence programme, funded by the Australian Research Council (ARC) and the New South Wales State Government.

Figure 7: Normalised computational time versus the percentage of non-zero elements for the sparse covariance function in datasets of different sizes (1000, 2000, 3000 and 4000 points). The performance of the non-sparse squared exponential covariance function is also included for comparison. Note that the computational gains increase with the size of the datasets.

References

[Candela and Rasmussen, 2005] J. Quiñonero Candela and C. E. Rasmussen. A unifying view of sparse Gaussian process regression. Journal of Machine Learning Research, 6, 2005.

[Cressie, 1993] N. Cressie. Statistics for Spatial Data. Wiley, 1993.

[Dubois et al., 2003] G. Dubois, J. Malczewski, and M. De Cort. Mapping radioactivity in the environment: spatial interpolation comparison. Office for Official Publications of the European Communities, Luxembourg, 2003.

[Kitanidis, 1997] P. K. Kitanidis. Introduction to Geostatistics: Applications in Hydrogeology. Cambridge University Press, 1997.

[Lawrence et al., 2003] N. Lawrence, M. Seeger, and R. Herbrich. Fast sparse Gaussian process methods: the informative vector machine. In S. Becker, S. Thrun, and K. Obermayer, editors, Advances in Neural Information Processing Systems 15. MIT Press, 2003.

[Rasmussen and Williams, 2006] C. E. Rasmussen and C. K. I. Williams. Gaussian Processes for Machine Learning. MIT Press, 2006.

[Schölkopf and Smola, 2002] B. Schölkopf and A. J. Smola. Learning with Kernels. MIT Press, 2002.

[Seeger et al., 2003] M. Seeger, C. K. I. Williams, and N. Lawrence. Fast forward selection to speed up sparse Gaussian process regression. In AISTATS, 2003.

[Smola and Bartlett, 2001] A. Smola and P. Bartlett. Sparse greedy Gaussian process regression. In Advances in Neural Information Processing Systems 13. MIT Press, 2001.

[Snelson and Ghahramani, 2006] E. Snelson and Z. Ghahramani. Sparse Gaussian processes using pseudo-inputs. In Advances in Neural Information Processing Systems 18. MIT Press, 2006.

[Snelson and Ghahramani, 2007] E. Snelson and Z. Ghahramani. Local and global sparse Gaussian process approximations. In AISTATS, 2007.

[Storkey, 1999] A. J. Storkey. Truncated covariance matrices and Toeplitz methods in Gaussian processes. In 9th International Conference on Artificial Neural Networks, 1999.

[Wackernagel, 2003] H. Wackernagel. Multivariate Geostatistics. Springer, 2003.

[Wendland, 2005] H. Wendland. Scattered Data Approximation. Cambridge Monographs on Applied and Computational Mathematics. Cambridge University Press, 2005.

[Williams and Seeger, 2001] C. K. I. Williams and M. Seeger. Using the Nyström method to speed up kernel machines. In Advances in Neural Information Processing Systems 13. MIT Press, 2001.

A Detailed Derivation

The covariance function is constructed by evaluating the integral

k(x, x') = σ ∫ g(x/l − u) g(x'/l − u) du,   (27)

where

g(x) = cos^2(πx) H(0.5 − |x|)   (28)

and H(x) is the Heaviside unit step function. From Eq. (28) it follows that g(x) = 0 if |x| ≥ 0.5, so that from Eq. (27) we have

k(x, x') = 0   if |x − x'| ≥ l.   (29)

If |x − x'| < l then the integrand of Eq. (27) is non-zero only when u ∈ ( max(x, x')/l − 0.5, min(x, x')/l + 0.5 ), and therefore

k(x, x') = σ ∫_{max(x,x')/l − 0.5}^{min(x,x')/l + 0.5} cos^2(πx/l − πu) cos^2(πx'/l − πu) du.   (30)

Using the identities cos^2(x) = (cos(2x) + 1)/2 and 2 cos(x) cos(y) = cos(x − y) + cos(x + y), the indefinite integral of the integrand of Eq. (30) can be calculated analytically:

J(u) = ∫ cos^2(πx/l − πu) cos^2(πx'/l − πu) du
     = (2 + cos(2π(x − x')/l))/8 · u + (1/(4π)) cos(π(x − x')/l) sin(2πu − π(x + x')/l) + (1/(32π)) sin(4πu − 2π(x + x')/l).   (31)
From Eqs. (31) and (30) one has that, if |x − x'| < l, then

k(x, x') = σ [ J( min(x, x')/l + 1/2 ) − J( max(x, x')/l − 1/2 ) ],   (32)

which after algebraic manipulations becomes

k(x, x') = σ_0 [ (2 + cos(2π d/l))/3 · (1 − d/l) + (1/(2π)) sin(2π d/l) ],   (33)

where d = |x − x'| and σ_0 = 3σ/8. Finally, combining Eqs. (29) and (33), we obtain Eq. (9).
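The closed form can be checked symbolically. The short SymPy script below (our own verification, not part of the paper) evaluates the overlap integral of Eq. (30) for l = 1 and σ = 1 and compares it with Eq. (33), assuming without loss of generality x = 0 and x' = d with 0 ≤ d < 1.

```python
import sympy as sp

u = sp.symbols('u', real=True)
d = sp.symbols('d', real=True, nonnegative=True)

# Overlap integral of Eq. (30) with l = 1, sigma = 1, x = 0 and x' = d (0 <= d < 1):
# the supports of the two shifted basis functions overlap on (d - 1/2, 1/2).
integrand = sp.cos(sp.pi * u)**2 * sp.cos(sp.pi * (d - u))**2
k_integral = sp.integrate(integrand, (u, d - sp.Rational(1, 2), sp.Rational(1, 2)))

# Closed form of Eqs. (9)/(33) with sigma_0 = 3*sigma/8.
k_closed = sp.Rational(3, 8) * ((2 + sp.cos(2*sp.pi*d)) / 3 * (1 - d)
                                + sp.sin(2*sp.pi*d) / (2*sp.pi))

print(sp.simplify(k_integral - k_closed))                    # should simplify to 0
for val in (0, sp.Rational(1, 4), sp.Rational(9, 10)):
    print(val, sp.N((k_integral - k_closed).subs(d, val)))   # numerical spot checks, ~0
```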
