Gaussian processes with monotonicity information

Anonymous Author (Unknown Institution), Anonymous Author (Unknown Institution)

Preliminary work. Under review by AISTATS. Do not distribute.

Abstract

A method for using monotonicity information in multivariate Gaussian process regression and classification is proposed. Monotonicity information is introduced with virtual derivative observations, and the resulting posterior is approximated with expectation propagation. The behaviour of the method is illustrated with artificial regression examples, and the method is used in a real world health care classification problem to include monotonicity information with respect to one of the covariates.

1 INTRODUCTION

In modelling problems there is sometimes a priori knowledge available concerning the function to be learned, which can be used to improve the performance of the model. Such information may be inaccurate, and be related to the behaviour of the output variable as a function of the input variables. For instance, instead of having measurements on derivatives, the output function can be known to be monotonic with respect to an input variable. For univariate and multivariate additive functions, monotonicity can be forced by construction, see e.g. Shively et al. (2009). A generic approach for multivariate models was proposed by Sill and Abu-Mostafa (1997), who introduced monotonicity information to multilayer perceptron (MLP) neural networks using hints, that is, virtual observations placed appropriately in the input space. See also Lampinen and Selonen (1997) for a more explicit formulation. However, the use of hints can be problematic with MLPs due to the nonstationarity of their smoothness properties and the difficulties in the integration over the posterior distribution.

In this paper, we propose a method similar to the hint approach for including monotonicity information in a Gaussian process (GP) model using virtual derivative observations with a Gaussian distribution. In Gaussian processes, smoothness can be controlled in a more systematic way than in MLPs through the selection of a covariance function. In this work, integrals are approximated using the fast expectation propagation (EP) algorithm. We first illustrate the behaviour and examine the performance of the approach with artificial univariate regression data sets. We then illustrate the benefits of monotonicity information in a real world multivariate classification problem with monotonicity for one of the covariates. Section 2 briefly presents the Gaussian process with derivative observations, and Section 3 describes the proposed method. In Section 4 experiments are shown, and conclusions are drawn in Section 5.

2 GAUSSIAN PROCESSES AND DERIVATIVE OBSERVATIONS

A Gaussian process (GP) is a flexible nonparametric model in which the prior is set directly over functions of one or more input variables, see e.g. O'Hagan (1978); MacKay (1998); Neal (1999); Rasmussen and Williams (2006). Gaussian process models are attractive for modelling complex phenomena since they allow possible nonlinear effects, and if there are dependencies between covariates, a GP can handle these interactions implicitly. Let $\mathbf{x}$ denote a $D$-dimensional covariate vector, and let the matrix $X$, of size $N \times D$, collect all $N$ training input vectors. We assume a zero mean Gaussian process prior

$$p(\mathbf{f}|X) = \mathcal{N}(\mathbf{f}|\mathbf{0}, K(X,X)), \quad (1)$$

where $\mathbf{f}$ is a vector of $N$ latent values. The covariance matrix $K(X,X)$ between the latent values depends on the covariates, and is determined by the covariance function. Throughout this work, we use the stationary squared exponential covariance function, which produces smooth functions, given by

$$\mathrm{Cov}[f_i, f_j] = K(\mathbf{x}_i, \mathbf{x}_j) = \eta^2 \exp\Big(-\tfrac{1}{2}\sum_{d=1}^{D} \rho_d^{-2} (x_{i,d} - x_{j,d})^2\Big), \quad (2)$$

where $\eta$ and $\rho = \{\rho_1,\ldots,\rho_D\}$ are the hyperparameters of the GP model.

In the regression case, having the vector $\mathbf{y}$ of $N$ noisy outputs, we assume a Gaussian relationship between the latent function values and the noisy observations, $p(\mathbf{y}|\mathbf{f}) = \mathcal{N}(\mathbf{y}|\mathbf{f}, \sigma^2 I)$, where $\sigma^2$ is the noise variance and $I$ is the identity matrix. Given the training data $X$ and $\mathbf{y}$, the conditional predictive distribution for a new covariate vector $\mathbf{x}^*$ is Gaussian with mean and variance

$$\mathrm{E}[f^*|\mathbf{x}^*, \mathbf{y}, X, \theta] = K(\mathbf{x}^*, X)\big(K(X,X) + \sigma^2 I\big)^{-1}\mathbf{y}, \quad (3)$$

$$\mathrm{Var}[f^*|\mathbf{x}^*, \mathbf{y}, X, \theta] = K(\mathbf{x}^*, \mathbf{x}^*) - K(\mathbf{x}^*, X)\big(K(X,X) + \sigma^2 I\big)^{-1} K(X, \mathbf{x}^*), \quad (4)$$

where $\theta = \{\eta, \rho, \sigma\}$. Instead of integrating out the hyperparameters, for simplicity we find a point estimate for the values of the hyperparameters $\theta$ by optimising the marginal likelihood $p(\mathbf{y}|X,\theta) = \int p(\mathbf{y}|\mathbf{f},\theta)\, p(\mathbf{f}|X,\theta)\, d\mathbf{f}$, and in the computations we use the logarithm of the marginal likelihood

$$\log p(\mathbf{y}|X,\theta) = -\tfrac{1}{2}\mathbf{y}^T \big(K(X,X) + \sigma^2 I\big)^{-1}\mathbf{y} - \tfrac{1}{2}\log\big|K(X,X) + \sigma^2 I\big| - \tfrac{N}{2}\log 2\pi.$$
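The regression equations above translate directly into code. The following minimal numpy sketch (our own illustration, not from the paper; the function names and the Cholesky-based solver are our choices) implements the squared exponential covariance (2), the predictive equations (3)-(4), and the log marginal likelihood:

```python
import numpy as np

def sqexp_K(X1, X2, eta, rho):
    # Squared exponential covariance, eq. (2); rho holds per-dimension lengthscales.
    r = (X1[:, None, :] - X2[None, :, :]) / rho
    return eta**2 * np.exp(-0.5 * np.sum(r**2, axis=2))

def gp_regression(X, y, Xs, eta, rho, sigma):
    # Predictive mean and variance, eqs. (3)-(4), plus the log marginal likelihood.
    N = X.shape[0]
    Ky = sqexp_K(X, X, eta, rho) + sigma**2 * np.eye(N)
    L = np.linalg.cholesky(Ky)                        # Ky = L L^T, for stable solves
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = sqexp_K(Xs, X, eta, rho)
    mean = Ks @ alpha                                 # eq. (3)
    v = np.linalg.solve(L, Ks.T)
    var = np.diag(sqexp_K(Xs, Xs, eta, rho)) - np.sum(v**2, axis=0)  # eq. (4)
    log_ml = (-0.5 * y @ alpha - np.sum(np.log(np.diag(L)))
              - 0.5 * N * np.log(2.0 * np.pi))
    return mean, var, log_ml
```

The hyperparameters $\theta = \{\eta, \rho, \sigma\}$ can then be set by maximising `log_ml` with a generic optimiser, as described above.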

The derivative of a Gaussian process remains a Gaussian process because differentiation is a linear operator (Rasmussen, 2003; Solak et al., 2003). This makes it possible to include derivative observations in the GP model, or to compute predictions about derivatives. The mean of the derivative is equal to the derivative of the latent mean,

$$\mathrm{E}\Big[\frac{\partial f_i}{\partial x_{i,d}}\Big] = \frac{\partial}{\partial x_{i,d}}\,\mathrm{E}[f_i].$$

Likewise, the covariance between a partial derivative and a function value satisfies

$$\mathrm{Cov}\Big[\frac{\partial f_i}{\partial x_{i,d}}, f_j\Big] = \frac{\partial}{\partial x_{i,d}}\,\mathrm{Cov}[f_i, f_j],$$

and the covariance between partial derivatives is

$$\mathrm{Cov}\Big[\frac{\partial f_i}{\partial x_{i,g}}, \frac{\partial f_j}{\partial x_{j,h}}\Big] = \frac{\partial^2}{\partial x_{i,g}\,\partial x_{j,h}}\,\mathrm{Cov}[f_i, f_j].$$

For the squared exponential covariance function, the covariances between function values and partial derivatives are given by

$$\mathrm{Cov}\Big[\frac{\partial f_i}{\partial x_{i,g}}, f_j\Big] = -\eta^2\, \rho_g^{-2}\, (x_{i,g}-x_{j,g}) \exp\Big(-\tfrac{1}{2}\sum_{d=1}^{D} \rho_d^{-2}(x_{i,d}-x_{j,d})^2\Big),$$

and between partial derivatives by

$$\mathrm{Cov}\Big[\frac{\partial f_i}{\partial x_{i,g}}, \frac{\partial f_j}{\partial x_{j,h}}\Big] = \eta^2 \exp\Big(-\tfrac{1}{2}\sum_{d=1}^{D} \rho_d^{-2}(x_{i,d}-x_{j,d})^2\Big)\, \rho_g^{-2}\big(\delta_{gh} - \rho_h^{-2}(x_{i,h}-x_{j,h})(x_{i,g}-x_{j,g})\big),$$

where $\delta_{gh} = 1$ if $g = h$, and $0$ otherwise. For instance, having observed the values of $\mathbf{y}$, the mean of the derivative of the latent function $f^*$ with respect to dimension $d$ is

$$\mathrm{E}\Big[\frac{\partial f^*}{\partial x^*_d}\Big] = \frac{\partial K(\mathbf{x}^*, X)}{\partial x^*_d}\big(K(X,X)+\sigma^2 I\big)^{-1}\mathbf{y},$$

and the variance

$$\mathrm{Var}\Big[\frac{\partial f^*}{\partial x^*_d}\Big] = \frac{\partial^2 K(\mathbf{x}^*,\mathbf{x}^*)}{\partial x^*_d\,\partial x^*_d} - \frac{\partial K(\mathbf{x}^*, X)}{\partial x^*_d}\big(K(X,X)+\sigma^2 I\big)^{-1}\frac{\partial K(X,\mathbf{x}^*)}{\partial x^*_d},$$

similar to equations (3) and (4). To use derivative observations in the Gaussian process, the observation vector $\mathbf{y}$ can be extended to include the derivative observations, and the covariance matrix between the observations can be extended to include the covariances between the observations and the partial derivatives, and the covariances between the partial derivatives.
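As a sketch of how the extended covariance matrix can be assembled for the squared exponential case, the following helpers (our own illustration, reusing `sqexp_K` from the earlier sketch; one derivative dimension per virtual point, as in the text) implement the derivative covariances and the block matrix used in the next section:

```python
import numpy as np

def dK_dx1(X1, X2, g, eta, rho):
    # Cov[df_i/dx_{i,g}, f_j] for the squared exponential covariance.
    K = sqexp_K(X1, X2, eta, rho)
    return -(X1[:, None, g] - X2[None, :, g]) / rho[g]**2 * K

def dK_dx1_dx2(X1, X2, g, h, eta, rho):
    # Cov[df_i/dx_{i,g}, df_j/dx_{j,h}] for the squared exponential covariance.
    K = sqexp_K(X1, X2, eta, rho)
    dg = (X1[:, None, g] - X2[None, :, g]) / rho[g]**2
    dh = (X1[:, None, h] - X2[None, :, h]) / rho[h]**2
    return (float(g == h) / rho[g]**2 - dg * dh) * K

def joint_K(X, Xm, dims, eta, rho):
    # Joint covariance of latent values f at X and derivatives f' at Xm,
    # where dims[i] is the input dimension of the i-th derivative point.
    M = len(dims)
    Kff = sqexp_K(X, X, eta, rho)
    Kdf = np.vstack([dK_dx1(Xm[i:i+1], X, dims[i], eta, rho) for i in range(M)])
    Kdd = np.array([[dK_dx1_dx2(Xm[i:i+1], Xm[j:j+1], dims[i], dims[j],
                                eta, rho)[0, 0] for j in range(M)]
                    for i in range(M)])
    return np.block([[Kff, Kdf.T], [Kdf, Kdd]])
```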

3 EXPRESSING MONOTONICITY INFORMATION

In this section we present the method for introducing monotonicity information into a Gaussian process model. Instead of evaluating the derivative everywhere, it is possible, when the function is smooth, to choose a finite number of locations where the derivative is evaluated. The monotonicity condition is the following: at the operating point $\mathbf{x}_i$, the derivative of the target function is non-negative with respect to the input dimension $d_i$. We use the notation $m_i^{d_i}$ for the derivative information where monotonicity is with respect to the dimension $d_i$ at the location $\mathbf{x}_i$. We denote by $\mathbf{m}$ a set of $M$ derivative points inducing the monotonicity at the operating points $X_m$ (a matrix of size $M \times D$). To express this monotonicity, the following probit likelihood

$$p\Big(m_i^{d_i} \,\Big|\, \frac{\partial f_i}{\partial x_{i,d_i}}\Big) = \Phi\Big(\frac{\partial f_i}{\partial x_{i,d_i}} \cdot \frac{1}{\nu}\Big), \qquad \Phi(z) = \int_{-\infty}^{z} \mathcal{N}(t|0,1)\,dt, \quad (5)$$

is assumed for the derivative observation. By using the probit function instead of a step function, the likelihood tolerates small errors. The probit function in (5) approaches the step function when $\nu \rightarrow 0$, and in all experiments in this work we fixed $\nu = 10^{-6}$. However, it is possible to adjust the steepness of the step, and thereby control the strictness of the monotonicity information, with the parameter $\nu$ in the likelihood.

To include the information from this likelihood in the GP model, the expectation propagation algorithm (Minka, 2001) is used to form virtual derivative observations. For now we assume we have a set of locations $X_m$ where the function is known to be monotonic. By assuming a zero mean Gaussian process prior for the latent function values, the joint prior for the latent values and derivatives is given by

$$p(\mathbf{f}, \mathbf{f}' | X, X_m) = \mathcal{N}(\mathbf{f}_{\mathrm{joint}} | \mathbf{0}, K_{\mathrm{joint}}),$$

where

$$\mathbf{f}_{\mathrm{joint}} = \begin{bmatrix} \mathbf{f} \\ \mathbf{f}' \end{bmatrix} \quad \text{and} \quad K_{\mathrm{joint}} = \begin{bmatrix} K_{\mathbf{f},\mathbf{f}} & K_{\mathbf{f},\mathbf{f}'} \\ K_{\mathbf{f}',\mathbf{f}} & K_{\mathbf{f}',\mathbf{f}'} \end{bmatrix}. \quad (6)$$

In (6), $\mathbf{f}'$ is used as shorthand notation for the derivative of the latent function $f$ with respect to some of the input dimensions, and the subscripts of $K$ denote the variables between which the covariance is computed. Using Bayes' rule, the joint posterior is obtained as

$$p(\mathbf{f}, \mathbf{f}' | \mathbf{y}, \mathbf{m}) = \frac{1}{Z}\, p(\mathbf{f}, \mathbf{f}' | X, X_m)\, p(\mathbf{y}|\mathbf{f})\, p(\mathbf{m}|\mathbf{f}'), \quad (7)$$

where

$$p(\mathbf{m}|\mathbf{f}') = \prod_{i=1}^{M} \Phi\Big(\frac{\partial f_i}{\partial x_{i,d_i}} \cdot \frac{1}{\nu}\Big),$$

and the normalisation term is

$$Z = \iint p(\mathbf{f}, \mathbf{f}' | X, X_m)\, p(\mathbf{y}|\mathbf{f})\, p(\mathbf{m}|\mathbf{f}')\, d\mathbf{f}\, d\mathbf{f}'. \quad (8)$$

Since the likelihood for the derivative observations in (8) is not Gaussian, the posterior is analytically intractable. We apply the EP algorithm and compute a Gaussian approximation for the posterior distribution. The local likelihood approximations given by EP are then used in the model as virtual derivative observations, in addition to the observations $\mathbf{y}$. The EP algorithm approximates the posterior distribution in (7) with

$$q(\mathbf{f}, \mathbf{f}' | \mathbf{y}, \mathbf{m}) = \frac{1}{Z_{\mathrm{EP}}}\, p(\mathbf{f}, \mathbf{f}' | X, X_m)\, p(\mathbf{y}|\mathbf{f}) \prod_{i=1}^{M} t_i(\tilde{Z}_i, \tilde{\mu}_i, \tilde{\sigma}_i^2),$$

where $t_i(\tilde{Z}_i, \tilde{\mu}_i, \tilde{\sigma}_i^2) = \tilde{Z}_i\, \mathcal{N}(f'_i | \tilde{\mu}_i, \tilde{\sigma}_i^2)$ are local likelihood approximations with site parameters $\tilde{Z}_i$, $\tilde{\mu}_i$ and $\tilde{\sigma}_i^2$. The posterior is a product of Gaussian distributions, and can be simplified to

$$q(\mathbf{f}, \mathbf{f}' | \mathbf{y}, \mathbf{m}) = \mathcal{N}(\mathbf{f}_{\mathrm{joint}} | \mu, \Sigma). \quad (9)$$

The posterior mean is $\mu = \Sigma\, \tilde{\Sigma}_{\mathrm{joint}}^{-1} \tilde{\mu}_{\mathrm{joint}}$ and the covariance $\Sigma = (K_{\mathrm{joint}}^{-1} + \tilde{\Sigma}_{\mathrm{joint}}^{-1})^{-1}$, where

$$\tilde{\mu}_{\mathrm{joint}} = \begin{bmatrix} \mathbf{y} \\ \tilde{\mu} \end{bmatrix} \quad \text{and} \quad \tilde{\Sigma}_{\mathrm{joint}} = \begin{bmatrix} \sigma^2 I & 0 \\ 0 & \tilde{\Sigma} \end{bmatrix}. \quad (10)$$

In (10), $\tilde{\mu}$ is the vector of site means $\tilde{\mu}_i$, and $\tilde{\Sigma}$ is a diagonal matrix with the site variances $\tilde{\sigma}_i^2$ on the diagonal. The desired posterior marginal moments with the likelihood (5) are updated as

$$\hat{Z}_i = \Phi(z_i), \qquad \hat{\mu}_i = \mu_{-i} + \frac{\sigma_{-i}^2\, \mathcal{N}(z_i|0,1)}{\Phi(z_i)\sqrt{\nu^2 + \sigma_{-i}^2}}, \qquad \hat{\sigma}_i^2 = \sigma_{-i}^2 - \frac{\sigma_{-i}^4\, \mathcal{N}(z_i|0,1)}{\Phi(z_i)(\nu^2 + \sigma_{-i}^2)}\Big(z_i + \frac{\mathcal{N}(z_i|0,1)}{\Phi(z_i)}\Big),$$

where

$$z_i = \frac{\mu_{-i}}{\sqrt{\nu^2 + \sigma_{-i}^2}},$$

and $\mu_{-i}$ and $\sigma_{-i}^2$ are the parameters of the cavity distribution in EP. These equations are almost identical to those of binary classification with a probit likelihood, and the EP algorithm is otherwise similar to that presented, for example, in chapter 3 of Rasmussen and Williams (2006).
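For reference, the tilted-distribution moments above translate directly into code. A small sketch (our own; scipy's `norm` supplies $\Phi$ and $\mathcal{N}(\cdot|0,1)$):

```python
import numpy as np
from scipy.stats import norm

def derivative_site_moments(mu_cav, s2_cav, nu=1e-6):
    # Moments of Phi(f'/nu) * N(f' | mu_cav, s2_cav), i.e. the EP tilted
    # distribution for one virtual derivative observation.
    denom = np.sqrt(nu**2 + s2_cav)
    z = mu_cav / denom
    Z_hat = norm.cdf(z)
    ratio = norm.pdf(z) / Z_hat            # N(z|0,1) / Phi(z)
    mu_hat = mu_cav + s2_cav * ratio / denom
    s2_hat = s2_cav - s2_cav**2 * ratio * (z + ratio) / (nu**2 + s2_cav)
    return Z_hat, mu_hat, s2_hat
```

In a full EP loop these moments determine the site updates exactly as in the probit classification case; note that for very negative $z$ the ratio $\mathcal{N}(z)/\Phi(z)$ should be evaluated with care to avoid numerical underflow.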

The normalisation term is approximated with EP as

$$Z_{\mathrm{EP}} = q(\mathbf{y}, \mathbf{m} | X, X_m, \theta) = \iint p(\mathbf{f},\mathbf{f}'|X,X_m)\, p(\mathbf{y}|\mathbf{f}) \prod_{i=1}^{M} t_i(\tilde{Z}_i,\tilde{\mu}_i,\tilde{\sigma}_i^2)\, d\mathbf{f}\, d\mathbf{f}' = Z_{\mathrm{joint}} \prod_{i=1}^{M} \tilde{Z}_i,$$

where the normalisation term of the product of Gaussians is

$$Z_{\mathrm{joint}} = (2\pi)^{-(N+M)/2}\, \big|K_{\mathrm{joint}} + \tilde{\Sigma}_{\mathrm{joint}}\big|^{-1/2} \exp\Big(-\tfrac{1}{2}\tilde{\mu}_{\mathrm{joint}}^T \big(K_{\mathrm{joint}} + \tilde{\Sigma}_{\mathrm{joint}}\big)^{-1}\tilde{\mu}_{\mathrm{joint}}\Big),$$

and the remaining terms $\tilde{Z}_i$ are the normalisation constants from EP. In the computations we use the logarithm of the normalisation term, and after the convergence of EP, the approximation for the logarithm of the marginal likelihood is computed as

$$\log Z_{\mathrm{EP}} = -\tfrac{1}{2}\log\big|K_{\mathrm{joint}} + \tilde{\Sigma}_{\mathrm{joint}}\big| - \tfrac{1}{2}\tilde{\mu}_{\mathrm{joint}}^T\big(K_{\mathrm{joint}} + \tilde{\Sigma}_{\mathrm{joint}}\big)^{-1}\tilde{\mu}_{\mathrm{joint}} + \sum_{i=1}^{M} \frac{(\mu_{-i} - \tilde{\mu}_i)^2}{2(\sigma_{-i}^2 + \tilde{\sigma}_i^2)} + \sum_{i=1}^{M} \log\Phi\Big(\frac{\mu_{-i}}{\sqrt{\nu^2 + \sigma_{-i}^2}}\Big) + \tfrac{1}{2}\sum_{i=1}^{M} \log(\sigma_{-i}^2 + \tilde{\sigma}_i^2).$$

The values for the hyperparameters are found by optimising the logarithm of this joint marginal likelihood approximation for the observations and the derivative information.

To use the virtual derivative observations in the GP model predictions, the approximate predictive mean and variance for the latent variable can be computed as

$$\mathrm{E}[f^*|\mathbf{x}^*,\mathbf{y},X,\mathbf{m},X_m] = K_{*,\mathbf{f}_{\mathrm{joint}}}\big(K_{\mathrm{joint}} + \tilde{\Sigma}_{\mathrm{joint}}\big)^{-1}\tilde{\mu}_{\mathrm{joint}},$$

$$\mathrm{Var}[f^*|\mathbf{x}^*,\mathbf{y},X,\mathbf{m},X_m] = K_{*,*} - K_{*,\mathbf{f}_{\mathrm{joint}}}\big(K_{\mathrm{joint}} + \tilde{\Sigma}_{\mathrm{joint}}\big)^{-1}K_{\mathbf{f}_{\mathrm{joint}},*},$$

analogously to the standard GP prediction equations (3) and (4).
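A sketch of these prediction equations, given converged site parameters (our own code; `K_joint` as in the earlier sketch, `K_star_joint` the cross-covariance between the test latents and $(\mathbf{f}, \mathbf{f}')$, and `k_star_star` the vector of prior variances at the test points):

```python
import numpy as np

def predict_with_virtual_obs(K_joint, K_star_joint, k_star_star,
                             y, sigma, mu_site, s2_site):
    # Predictive mean and variance of f* given the observations y and the EP
    # site approximations (mu_site, s2_site) of the virtual derivative points.
    mu_joint = np.concatenate([y, mu_site])
    S_joint = np.concatenate([np.full(len(y), sigma**2), s2_site])
    A = K_joint + np.diag(S_joint)
    mean = K_star_joint @ np.linalg.solve(A, mu_joint)
    var = k_star_star - np.einsum('ij,ji->i', K_star_joint,
                                  np.linalg.solve(A, K_star_joint.T))
    return mean, var
```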
In classification examples, we assume the probit likelihood for class observations,

$$p(\mathbf{c}|\mathbf{f}) = \prod_{i=1}^{N} \Phi(f_i c_i),$$

where now $c_i \in \{-1, 1\}$ describes the two output classes. We apply the expectation propagation algorithm to both the class observations and the virtual derivative observations. EP approximates the joint posterior of $\mathbf{f}$ and $\mathbf{f}'$ similarly to the regression case in (9), except that the vector of observations $\mathbf{y}$ and the noise $\sigma^2 I$ in (10) are now replaced with the site approximations $\tilde{\mu}_{\mathrm{class}}$ and $\tilde{\Sigma}_{\mathrm{class}}$, denoting the mean and variance site terms given by EP and associated with the class observations. The parameter $\nu$ in the likelihood for the virtual derivative observations causes the desired posterior marginal moments to be computed slightly differently, depending on whether the moments are computed for class observations or derivative observations. For class observations, the moments are given, for example, in chapter 3 of Rasmussen and Williams (2006), and for virtual derivative observations the moments are computed as in the regression case. The values for the hyperparameters are found by optimising the joint marginal likelihood approximation of the class observations and the virtual derivative observations. The normalisation term is computed as in regression, except that again $\mathbf{y}$ and the noise $\sigma^2 I$ in (10) are replaced with the site terms $\tilde{\mu}_{\mathrm{class}}$ and $\tilde{\Sigma}_{\mathrm{class}}$. Furthermore, in the computation of the normalisation of the joint posterior, the normalisation site terms $\tilde{Z}_{\mathrm{class}}$ of the class observations are also taken into account. In classification, the predictions for the latent values using the class observations and virtual derivative observations are made by using the extended vector of site means and the extended covariance matrix having the site variances on the diagonal.

3.1 PLACING THE VIRTUAL DERIVATIVE POINTS

In low dimensional problems the derivative points can be placed on a grid to assure monotonicity. A drawback is that the number of grid points increases exponentially with the number of input dimensions. In higher dimensional cases the distribution for $\mathbf{x}$ can be assumed to be the empirical distribution of the observations $X$, and the virtual points can be chosen to be at the unique locations of the observed input data points. Alternatively, a random subset of points from the empirical distribution can be chosen. If the distance between derivative points is short enough compared to the lengthscale, then the monotonicity information also affects the function between the virtual points, according to the correlation structure.

Due to the computational scaling $O((N + M)^3)$, it may be necessary to use a smaller number of derivative points. In such a case, a general solution is to use the GP predictions about the values of the derivatives at the observed unique data points. The probability of the derivative being negative is computed, and at the locations where this probability is high, virtual derivative points are placed. After conditioning on the virtual data points, new predictions for the derivative values at the remaining unique observed points can be computed, and virtual derivative points can be added, moved or removed if needed. This iteration can be continued to assure monotonicity in the interesting regions. To place the virtual derivative points between the observed data points, or outside the convex hull of the observed $X$, a more elaborate distribution model for $\mathbf{x}$ is needed. Again, the probability of the derivative being negative can easily be computed, and more virtual derivative points can be placed at the locations where this probability is high.
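The selection step described above is easy to express with the derivative predictions from Section 2: compute $P(\partial f/\partial x_d < 0)$ at each candidate point under the current GP, and add virtual points where that probability is high. A hypothetical helper (our own; the threshold and batch size are illustrative choices, not values from the paper):

```python
import numpy as np
from scipy.stats import norm

def select_virtual_points(X_cand, dmean, dvar, p_threshold=0.05, max_new=10):
    # dmean, dvar: GP predictive mean/variance of the derivative at X_cand.
    p_neg = norm.cdf(-dmean / np.sqrt(dvar))   # P(derivative < 0) at each point
    order = np.argsort(-p_neg)                 # most suspicious locations first
    chosen = order[p_neg[order] > p_threshold][:max_new]
    return X_cand[chosen], p_neg
```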

4 EXPERIMENTAL RESULTS

4.1 DEMONSTRATION

An example of Gaussian process regression with monotonicity information is shown in Figure 1. Subfigure (a) illustrates the GP prediction (mean and 95% interval) without monotonicity information, with hyperparameter values found by optimising the marginal likelihood. Subfigures (b) and (c) show the predictions with monotonicity information, with hyperparameter values that maximise the approximation of the joint marginal likelihood. The short vertical lines in (b) and (c) are the locations of the virtual derivative points. In Subfigure (b), the locations of the virtual points are found by choosing a subset amongst the observed data points, at the locations where the probability of the derivative being negative is large before conditioning on any monotonicity information (the derivative seen in Subfigure (d)). In Subfigure (c) the virtual points are placed on a grid. The predictions in (b) and (c) are similar, and (e) and (f) illustrate the corresponding derivatives of the latent functions. Since the probability of the derivative being negative in (e) and (f) is very low over the observed data range, adding more virtual derivative points is unnecessary.

The effect of the monotonicity information is also illustrated in Figure 2. Subfigures (a)-(c) show the case without monotonicity information: (a) shows the marginal likelihood as a function of the lengthscale and noise variance parameters (the signal magnitude is fixed to one), and (b) and (c) show two different solutions (mean and 95% interval) at the two different modes shown in (a). The mode with the shorter lengthscale and smaller noise variance (function estimate in (b)) has higher density. Subfigures (d)-(f) show the case with monotonicity information. Subfigure (d) shows the approximate marginal likelihood for the observations and virtual derivative observations. Now the mode corresponding to the longer lengthscale and the monotone function shown in (f) has much higher density. Since the virtual observations are not placed densely, there is still another mode at a shorter lengthscale (function estimate in (e)), although with much lower density. This shows the importance of having enough virtual observations; this second mode would eventually vanish if the number of virtual observations were increased.

4.2 ARTIFICIAL EXAMPLES

We test the Gaussian process model with monotonicity information by performing simulation experiments on four artificial data sets. We consider the following functions: (a) $f(x) = 0$ if $x < 0.5$, $f(x) = 1$ if $x \geq 0.5$ (step); (b) $f(x) = x$ (linear); (c) $f(x) = \exp(1.5x)$ (exponential); (d) $f(x) = 1/\{1 + \exp(-8x + 4)\}$ (logistic), and draw observations from the model $y_i = f(x_i) + \epsilon_i$, where the $x_i$ are i.i.d. samples from the uniform distribution $U(x_i|0,1)$ and the $\epsilon_i$ are i.i.d. Gaussian noise terms. We normalise $x$ and $y$ to have mean zero and standard deviation 0.5.
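For concreteness, the test functions and the sampling scheme can be generated as follows (our sketch; the noise standard deviation is an assumed placeholder, since the exact value is not restated here):

```python
import numpy as np

rng = np.random.default_rng(0)

functions = {
    'step':        lambda x: (x >= 0.5).astype(float),
    'linear':      lambda x: x,
    'exponential': lambda x: np.exp(1.5 * x),
    'logistic':    lambda x: 1.0 / (1.0 + np.exp(-8.0 * x + 4.0)),
}

def make_data(f, N, noise_sd=0.1):            # noise_sd: assumed, for illustration
    x = rng.uniform(0.0, 1.0, size=N)         # x_i ~ U(0, 1)
    y = f(x) + rng.normal(0.0, noise_sd, N)   # y_i = f(x_i) + eps_i
    # Normalise to mean zero and standard deviation 0.5, as in the text.
    return (0.5 * (x - x.mean()) / x.std(),
            0.5 * (y - y.mean()) / y.std())

def rmse(f_true, f_pred):
    # Root-mean-square error against the true function values on a grid.
    return np.sqrt(np.mean((f_true - f_pred)**2))
```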
For the Gaussian process with monotonicity information, we introduce virtual observations spaced equally between the observed minimum and maximum values of the $x$ variable. We compare the results of this model to a Gaussian process with no monotonicity information. The performances of the models are evaluated using the root-mean-square error (RMSE). The estimates for RMSE are evaluated against the true function values on equally spaced $x$-values. Table 1 summarises the simulation results. The results are based on repeated simulations with two sample sizes.

For the step function, the GP model with monotonicity information performs worse than the GP without the monotonicity assumption, because the proposed method has a tendency to favour smooth increasing functions. In the case of heavy truncation by the step likelihood, the result may not be well approximated with a Gaussian distribution, and thus the derivative information presented by the virtual observations can be slightly biased away from zero. On the other hand, the GP without the monotonicity assumption estimates the step function with a shorter lengthscale, producing a better fit but with wiggling behaviour.

Figure 1: Example of a Gaussian process solution (mean and 95% interval) without monotonicity information (a), and the corresponding derivative $\partial f/\partial x$ of the latent function (d). Subfigures (b) and (c) illustrate the solutions with monotonicity information, and the corresponding derivatives are shown in (e) and (f). The virtual derivative observations, shown with short vertical lines in (b), are placed at locations where the probability of the derivative being negative is large (seen in Subfigure (d)). In Subfigure (c) the derivative points are placed on a grid.

Figure 2: Contour plot of the log marginal likelihood as a function of the noise variance and the characteristic lengthscale, without monotonicity information (a), and the corresponding solutions (b) and (c) at the modes. Subfigure (d) shows the contour plot of the marginal likelihood with monotonicity information, and Subfigures (e) and (f) illustrate the corresponding solutions at the modes. The locations of the virtual observations are shown with short vertical lines in Subfigures (e) and (f).

Table 1: Root-mean-square errors for the artificial examples (rows: step, linear, exponential and logistic functions; columns: GP and GP with monotonicity information, for both sample sizes).

For the linear and exponential functions the GP with the monotonicity assumption gives better estimates, as the monotonicity information favours smoother solutions and prevents the estimated functions from wiggling. With the larger sample size, the differences between the estimates of the two models were smaller for the linear and exponential functions, as the possibility of overfitting decreases. For the logistic function both models gave similar results.

4.3 MODELLING THE RISK OF INSTITUTIONALISATION

In this section we report the results of assessing the institutionalisation risk of users of communal elderly care services. The risk of institutionalisation was modelled using data produced from health care registers, and the aim was to study whether a patient becomes institutionalised or not during the next three months. The study population consisted of patients over 65 years of age in one city during a period ending in 2004. The following seven variables were used as predictors: age, whether the patient had nursing home periods, whether the patient had an ADL (activities of daily living) evaluation, maximum memory problem score, maximum behavioural symptoms score, maximum number of daily home care visits, and number of days in hospital. Since only a small number of institutionalisation events were available in the whole data set, the training data was balanced such that approximately half of the patients were institutionalised.

Classification was done using a Gaussian process binary classification model with the probit likelihood function and the squared exponential covariance function with an individual lengthscale parameter for each input variable. We modelled the risk of institutionalisation with a GP where no information about monotonicity with respect to any of the covariates was assumed. This model was compared to a GP model where monotonicity information was added such that the institutionalisation risk was assumed to increase as a function of age. The virtual observations were placed at the unique locations of the input training data points.

To test the predictive abilities of these two GP models, ROC curves were computed for the younger and the older (the oldest third) age groups using an independent test data set of observation periods. The predictive performances of the models were similar for the younger age group, but the GP model with monotonicity information gave better predictions for the older age group (Figure 3). As age increases, the data become more scarce and the monotonicity assumption more useful.

We also studied the effect of the monotonicity information in the model by comparing the predicted risks of institutionalisation as a function of age at different daily home care levels. The predictions for a low-risk subgroup are shown in Figure 4. The GP model without monotonicity information gives a slight decrease in the institutionalisation risk for patients over 80 (Subfigure (a)), whereas the GP model with monotonicity information gives smoother results (Subfigure (b)), suggesting more realistic estimates.

Figure 3: ROC curves (true positive rate against false positive rate) for the probability of institutionalisation of the elderly, for the younger and older age groups, with and without monotonicity information.

Figure 4: Simulated estimates for the probabilities of institutionalisation of the elderly as a function of age and daily home care levels (from no daily home care visits up to the maximum number of daily visits). The estimates using a Gaussian process model are shown in Subfigure (a), and the estimates using a Gaussian process with monotonicity information in Subfigure (b).
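A sketch of the ROC evaluation used in this comparison (our own code; it assumes predicted risk probabilities `p_risk` from either GP model and uses scikit-learn's ROC utilities):

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

def roc_by_age_group(age, y_true, p_risk, age_split):
    # Compare ROC behaviour in a younger and an older subgroup, as in Figure 3.
    curves = {}
    for name, mask in (('younger', age < age_split), ('older', age >= age_split)):
        fpr, tpr, _ = roc_curve(y_true[mask], p_risk[mask])
        curves[name] = (fpr, tpr, roc_auc_score(y_true[mask], p_risk[mask]))
    return curves
```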
5 CONCLUSION

We have proposed a method for introducing monotonicity information into a nonparametric Gaussian process model. The monotonicity information is set using virtual derivative observations concerning the behaviour of the target function at the desired locations of the input space. In the method, a Gaussian approximation for the virtual derivative observations is found using the EP algorithm, and the virtual observations are used in the GP model in addition to the real observations.

In the cases where the target function is monotonic, a solution that is less prone to overfitting, and therefore better, can be achieved using monotonicity information. This is emphasised in the cases where only a small number of observations is available. When the target function has flat areas with sharp steps, the virtual derivative observations can lead to worse performance, caused by a bias away from zero due to the Gaussian approximation of the truncated derivative distribution. Therefore virtual derivative observations implying monotonicity are more useful in the cases where the target function is smooth. Further, if the distance between the virtual derivative observations is too large with respect to the estimated characteristic lengthscale, the solution can become non-monotonic. However, by placing and adding the virtual points iteratively, this can be avoided, and a monotonic solution can be guaranteed.

Acknowledgements

References

Lampinen, J. and Selonen, A. (1997). Using background knowledge in multilayer perceptron learning. In Proceedings of the 10th Scandinavian Conference on Image Analysis.

MacKay, D. J. C. (1998). Introduction to Gaussian processes. In Bishop, C. M., editor, Neural Networks and Machine Learning. Springer-Verlag.

Minka, T. (2001). Expectation Propagation for approximate Bayesian inference. In Proceedings of the 17th Annual Conference on Uncertainty in Artificial Intelligence (UAI). Morgan Kaufmann.

Neal, R. M. (1999). Regression and classification using Gaussian process priors (with discussion). In Bernardo, J. M., Berger, J. O., Dawid, A. P., and Smith, A. F. M., editors, Bayesian Statistics 6, pages 475-501. Oxford University Press.

O'Hagan, A. (1978). Curve fitting and optimal design for prediction. Journal of the Royal Statistical Society. Series B (Methodological), 40:1-42.

Rasmussen, C. E. (2003). Gaussian processes to speed up Hybrid Monte Carlo for expensive Bayesian integrals. In Bayesian Statistics 7. Oxford University Press.

Rasmussen, C. E. and Williams, C. K. I. (2006). Gaussian Processes for Machine Learning. The MIT Press.

Shively, T. S., Sager, T. W., and Walker, S. G. (2009). A Bayesian approach to non-parametric monotone function estimation. Journal of the Royal Statistical Society: Series B, 71.

Sill, J. and Abu-Mostafa, Y. (1997). Monotonicity hints. In Advances in Neural Information Processing Systems 9. MIT Press.

Solak, E., Murray-Smith, R., Leithead, W. E., Leith, D. J., and Rasmussen, C. E. (2003). Derivative observations in Gaussian process models of dynamic systems. In Advances in Neural Information Processing Systems 15. MIT Press.
