DO HEBBIAN SYNAPSES ESTIMATE ENTROPY?

Deniz Erdogmus, Jose C. Principe, Kenneth E. Hild II
CNEL, Electrical & Computer Engineering Dept., University of Florida, Gainesville, FL 32611
[deniz,principe,hild]@cnel.ufl.edu

Abstract. Hebbian learning is one of the mainstays of biologically inspired neural processing. Hebb's rule is biologically plausible, and it has been extensively utilized in both computational neuroscience and in unsupervised training of neural systems. In these fields, Hebbian learning became synonymous with correlation learning. But it is known that correlation is a second order statistic of the data, so it is sub-optimal when the goal is to extract as much information as possible from the sensory data stream. In this paper, we demonstrate how information learning can be implemented using Hebb's rule. Thus the paper brings a new understanding of how neural systems could, through Hebb's rule, extract information theoretic quantities rather than merely correlation.

INTRODUCTION

Hebb's rule states: "When an axon of cell A is near enough to excite cell B or repeatedly or consistently takes part in firing it, some growth or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased" [1]. This principle has been translated in the neural networks literature as: in Hebbian learning, the weight connecting a neuron to another is incremented proportionally to the product of the input to the neuron and its output [2]. This formulation can then be shown to maximize the correlation between the input and the output of the neuron whose weights are updated, through the use of Widrow's stochastic gradient on a correlation-based cost function [2,3]. Correlation has thus been declared the basis of learning, and Hebb's law has largely shaped our understanding of the operation of the brain and the process of learning. However, we know that correlation describes simple second order statistics of the data, and as such it is sub-optimal when the goal is to process information. Our brains must exploit much more than simply correlation from the neural activity in order to achieve their level of performance in information processing.

From an abstract perspective, the learning-from-examples scenario starts with a data set, which globally conveys information about a real world event, and the goal is to capture that information in the weights of a learning machine. Information Theory (IT) should be the mathematical infrastructure used to quantify this change in state, because it is the best possible approach to deal with the manipulation of information [4]. Shannon, in a classical 1948 paper, laid down the foundations of IT [5]. IT has had a tremendous impact on the design of efficient and reliable communication systems [6,7] because it is able to answer two key questions: what is the best possible (minimal) code for our data, and what is the maximal amount of information that can be transferred through a particular channel. In spite of its practical origins, IT is a deep mathematical theory concerned with the very essence of the communication process. IT has also impacted statistical mechanics by providing a clearer understanding of the nature of entropy, as was illustrated by Jaynes [8].

Due to its emphasis on principles, information theory has also played a role in explaining biological information processing. Barlow enunciated the principle of redundancy reduction [9], building on earlier work by Attneave on visual perception [10], and utilized it to train, by unsupervised learning, the weights of a linear neural network. He also proposed factorial or minimum entropy coding [11]. The development of sparse codes has recently been advanced by many researchers [12,13]. Atick [14] and Atick and Redlich [15] demonstrated that feature extraction could be accomplished from noisy inputs by maximizing the mutual information between the input and the output, which is a form of redundancy reduction. The recent interest in sparse representations was triggered by Field [16] and by a subsequent paper in Nature, which demonstrated the emergence of simple-cell receptive fields when learning a sparse code for natural images [13]. Probably the most insightful application of information theory to brain theory is Linsker's principle of maximum information preservation, also called the Infomax principle [17]. Linsker enunciated a self-organizing principle for neuronal assemblies based on the simple idea that the role of a neuronal assembly is to transfer to its output as much information as possible about its input. He formulated this principle, in analogy to the channel capacity theorem, as the maximization of the mutual information between the input and the output. This is a profound idea that shines new light on the organizational principles of the brain.

One of the difficulties of all these studies is that there is an abyss between the sophisticated computation required to estimate Shannon's entropy and mutual information and the reality of the local computation of the Hebbian synapse. We will show in this paper, however, that if a kernel-based nonparametric estimator for Shannon's entropy is utilized, then the Hebbian synapse can in fact estimate entropy over time instead of simply estimating correlation. We also demonstrate that the same goal can be reached through the unconventional Renyi's entropy.

The structure of this paper is the following. First we briefly explain, and provide references for, our nonparametric approach to estimate entropy directly from samples, for both Shannon's and Renyi's entropy definitions. Then we derive the stochastic information gradient (SIG) to adapt the parameters of a linear, or nonlinear, system so as to maximize or minimize entropy [18]. While exploring the mathematical properties of the SIG algorithm, we establish the similarity between a special case of the SIG algorithm and Hebbian and anti-Hebbian terms, which raises the question of whether information processing is possible through Hebbian learning. Finally we present a simple example to demonstrate the convergence of the algorithm to the desired solutions.

SIG ALGORITHM FOR ADALINE

In order to arrive at the SIG algorithm, we start with the derivation of the nonparametric entropy estimator that employs Parzen windowing. First consider Shannon's entropy, which is given by

H_S(Y) = E[-\log f_Y(Y)]    (1)

Suppose we utilize the stochastic value of this quantity at time k, such that the expectation is dropped and the argument of this operator is evaluated at the most recent sample:

H_S(y_k) \approx -\log f_Y(y_k)    (2)

Since in practice the analytical pdf of the random variable is not available, we employ the nonparametric Parzen window estimator over the last L samples of the signal [21]:

\hat{H}_S(y_k) = -\log \frac{1}{L} \sum_{i=1}^{L} \kappa_\sigma(y_k - y_{k-i})    (3)

Assuming that the samples of y are generated by an ADALINE structure, i.e. y_k = w^T x_k, where x_k is the input vector and w is the weight vector, the stochastic gradient estimate of Shannon's entropy is found (in terms of a column vector) as

\frac{\partial \hat{H}_S(y_k)}{\partial w} = -\frac{\sum_{i=1}^{L} \kappa_\sigma'(y_k - y_{k-i})\,(x_k - x_{k-i})}{\sum_{i=1}^{L} \kappa_\sigma(y_k - y_{k-i})}    (4)

This expression is called the stochastic information gradient (SIG) [18]. The SIG can also be derived using Renyi's entropy, a parametric family of entropy functions. For a random variable Y it is defined as [19]

H_\alpha(Y) = \frac{1}{1-\alpha} \log E[f_Y^{\alpha-1}(Y)]    (5)

We named the argument of the log in (5) the information potential, not arbitrarily, but because it shares the properties of physical potential fields once the formulation is complete [20]. The expectation in the information potential can be dropped to obtain a stochastic estimate of this quantity. Notice that minimization or maximization of the entropy is equivalent to minimization or maximization of the information potential, depending on the value of α. Using Parzen windowing over the last L samples, at time k we obtain the stochastic quantity

\hat{V}(y_k) = \left( \frac{1}{L} \sum_{i=1}^{L} \kappa_\sigma(y_k - y_{k-i}) \right)^{\alpha-1}    (6)

Then, for an ADALINE, the stochastic gradient of Renyi's entropy becomes

\frac{\partial \hat{H}_\alpha(y_k)}{\partial w} = \frac{1}{1-\alpha} \frac{\partial \hat{V}(y_k)/\partial w}{\hat{V}(y_k)} = -\frac{\sum_{i=1}^{L} \kappa_\sigma'(y_k - y_{k-i})\,(x_k - x_{k-i})}{\sum_{i=1}^{L} \kappa_\sigma(y_k - y_{k-i})}    (7)

which is identical to (4), since the exponent α-1 in (6) cancels the 1/(1-α) factor in (5). In (3), (4) and (7), κ_σ(·) is called the kernel function of Parzen windowing, usually selected to be a symmetric, continuous and differentiable pdf whose width is determined by the parameter σ. In [22], we established the link between the kernel function and convolution smoothing for global optimization, and we also demonstrated the link between the minimum error-entropy (MEE) criterion and the minimum mean-square-error (MMSE) criterion in supervised learning. Note that, when maximizing or minimizing the output entropy of the ADALINE, just like in Oja's rule [23], we need to normalize the weights to keep them from growing without bound or decaying to zero.

Alternative SIG expressions for Shannon's and Renyi's definitions of entropy can also be obtained by utilizing a single sample from the past in the Parzen pdf estimate and using the window of L samples to approximate the expectation operator with a sample mean. As a result, assuming an ADALINE structure, we obtain the following alternative SIG expressions for Shannon's and Renyi's entropies, respectively:

\frac{\partial \hat{H}_S(y_k)}{\partial w} = -\frac{1}{L} \sum_{i=0}^{L-1} \frac{\kappa_\sigma'(y_{k-i} - y_{k-i-1})}{\kappa_\sigma(y_{k-i} - y_{k-i-1})}\,(x_{k-i} - x_{k-i-1})    (8)

\frac{\partial \hat{H}_\alpha(y_k)}{\partial w} = -\frac{\sum_{i=0}^{L-1} \kappa_\sigma^{\alpha-2}(y_{k-i} - y_{k-i-1})\,\kappa_\sigma'(y_{k-i} - y_{k-i-1})\,(x_{k-i} - x_{k-i-1})}{\sum_{i=0}^{L-1} \kappa_\sigma^{\alpha-1}(y_{k-i} - y_{k-i-1})}    (9)

The SIG given in (4) is, in effect, very similar to Widrow's stochastic gradient in LMS; we assume that only the instantaneous increments in the output samples, i.e. y_k - y_{k-i}, are available at the k-th iteration of adaptation. Moreover, it is trivial to see that the expected value of this gradient is the actual gradient of Shannon's entropy for the output pdf estimated with Parzen windowing. For a small kernel size and a large number of samples, this estimate is very close to the true pdf. An interesting special case of (4), (8), and (9) is obtained for L = 1:

\frac{\partial \hat{H}(y_k)}{\partial w} = -\frac{\kappa_\sigma'(y_k - y_{k-1})}{\kappa_\sigma(y_k - y_{k-1})}\,(x_k - x_{k-1})    (10)

The convergence properties of this algorithm for entropy minimization were studied in detail in [24]. Extensions to nonlinear systems that are differentiable with respect to their weights are also possible [18].
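To make the computation concrete, here is a brief Python sketch (not part of the paper; the array layout, function names, and parameter values are illustrative assumptions). It evaluates the SIG of (4) with a Gaussian kernel and checks numerically that, for a window of L = 1, it reduces to the increment-product form derived in the next section.

```python
# Illustrative sketch (not from the paper): SIG of eq. (4) for an ADALINE with a
# Gaussian Parzen kernel, plus a numerical check of the L = 1 special case.
import numpy as np

def gaussian_kernel(xi, sigma):
    return np.exp(-xi**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)

def gaussian_kernel_deriv(xi, sigma):
    # kappa_sigma'(xi) = -(xi / sigma^2) * kappa_sigma(xi); this is eq. (11) below
    return -(xi / sigma**2) * gaussian_kernel(xi, sigma)

def sig_shannon(x_hist, y_hist, sigma):
    """Stochastic information gradient of eq. (4).

    x_hist: (L+1, d) array of inputs, most recent x_k in row 0.
    y_hist: (L+1,) array of outputs, most recent y_k in position 0.
    Returns dH_hat/dw as a length-d vector.
    """
    dy = y_hist[0] - y_hist[1:]              # y_k - y_{k-i}, i = 1..L
    dx = x_hist[0] - x_hist[1:]              # x_k - x_{k-i}
    return -(gaussian_kernel_deriv(dy, sigma) @ dx) / gaussian_kernel(dy, sigma).sum()

# L = 1 check: the SIG should equal (y_k - y_{k-1})(x_k - x_{k-1}) / sigma^2.
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3))              # rows: x_k, x_{k-1}
w = rng.standard_normal(3)
y = x @ w
sigma = 0.5
print(np.allclose(sig_shannon(x, y, sigma),
                  (y[0] - y[1]) * (x[0] - x[1]) / sigma**2))   # True
```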

RELATIONSHIP BETWEEN SIG AND HEBB'S RULE

In general, we prefer differentiable and symmetric kernels in Parzen windowing; differentiability is required to guarantee proper evaluation of the gradient during adaptation, and symmetry is preferred to prevent biasing the mean of the estimated pdf. Suppose Gaussian kernels are employed. In that case, σ naturally becomes the standard deviation of the kernel, and we have

\kappa_\sigma'(\xi) = -\frac{\xi}{\sigma^2} \kappa_\sigma(\xi)    (11)

Substituting (11) in (10), we obtain the explicit expression

\frac{\partial \hat{H}(y_k)}{\partial w} = \frac{1}{\sigma^2} (y_k - y_{k-1}) (x_k - x_{k-1})    (12)

We clearly see from (12) that employing Hebbian and anti-Hebbian terms involving the two most recent values of the input and the output results in a stochastic estimate of the information gradient. When interpreted under the viewpoint of classical Hebbian learning, (12) states that it is possible to maximize the entropy by applying the Hebbian rule to the instantaneous increments of the input and output signals. Thus the neuron implements information learning rather than merely correlation learning.

Now consider the general case, where any differentiable symmetric kernel function may be used in Parzen windowing. Since only the Gaussian pdf satisfies the differential equation in (11), it is the only kernel choice that results in the special case given by (12), i.e. a learning algorithm that is Hebbian on the increments in the classical sense. In general, since we use symmetric and differentiable kernels that are pdfs themselves, we get the following update rule

\frac{\partial \hat{H}(y_k)}{\partial w} = f(y_k - y_{k-1}) (x_k - x_{k-1})    (13)

where f(\xi) = -\kappa_\sigma'(\xi)/\kappa_\sigma(\xi), which satisfies the condition sign f(\xi) = sign \xi. Thus, the update applied to the weight vector is still in the same direction as it would be in classical Hebbian learning; however, it is scaled nonlinearly depending on the value of the increment that occurred in the output of the neuron. This is in fact consistent with Hebb's principle, and it is a good example demonstrating that the product of the output and the input signals is not the only possible weight update that implements Hebb's principle.

CASE STUDIES

In this section, we present two case studies in which an ADALINE is trained to yield maximum entropy at the output subject to a unit-norm constraint on the weight vector. The weights are updated using SIG. In both examples, training is carried out using two different kernel function choices, Gaussian and Cauchy. Explicitly, these kernel functions are given by

\kappa_{G,\sigma}(\xi) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\xi^2/(2\sigma^2)}, \qquad \kappa_{C,\sigma}(\xi) = \frac{1}{\pi\sigma} \frac{1}{1 + \xi^2/\sigma^2}    (14)

The weights are normalized to unit length after each update, and the information about the previous output values is modified using this normalized weight vector. In the first example, samples from a 2-dimensional joint Gaussian distribution are generated and the ADALINE is trained to maximize the entropy at the output. The step sizes are chosen as 10^-5 and 10^-3 for the Gaussian and Cauchy kernels, respectively, and the same kernel size is used for both. Fig. 1a shows the samples and the directions deduced. Fig. 1b shows the convergence of the weight vector angle to the ideal solution, which corresponds to the first principal component in this case, where the data distribution is Gaussian. The choices of kernel size and step size are important issues that affect convergence time and misadjustment.

Figure 1: (a) Data samples and the derived directions. (b) Convergence of the angles.

In the second example, samples of a 2-dimensional random vector are generated such that the x-coordinate is uniform, the y-coordinate is Gaussian, and the sample covariance matrix is identity. In this case, PCA is unable to deduce a maximal variance direction, since the variance along every direction is the same. On the other hand, using the maximum (minimum) entropy approach, we can extract the direction along which the data exhibits the most (least) uncertainty. Fig. 2a shows the estimated entropy versus the direction of the weight vector for both kernels, and Fig. 2b shows the convergence of the weights from 5 different initial conditions.

Figure 2: (a) Entropy vs. direction. (b) Convergence from different initial conditions.
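As a rough illustration of these case studies (this is not the authors' code; the data covariance, step sizes, kernel sizes, and epoch count are made-up values), the following Python sketch trains a 2-input ADALINE with the L = 1 SIG update of (13), renormalizes the weights after every step as described above, and compares the learned direction with the first principal component of Gaussian data, as in the first example.

```python
# Illustrative sketch of the first case study (parameters are assumptions, not
# the paper's): maximize the ADALINE output entropy with the L = 1 SIG update
# of eq. (13) under a unit-norm weight constraint, for the two kernels of (14).
import numpy as np

def f_gauss(dy, sigma):
    # -kappa'/kappa for the Gaussian kernel: dy / sigma^2 (eq. (12))
    return dy / sigma**2

def f_cauchy(dy, sigma):
    # -kappa'/kappa for the Cauchy kernel of eq. (14)
    return (2.0 * dy / sigma**2) / (1.0 + (dy / sigma)**2)

def train_max_entropy(x, f, sigma, eta, epochs):
    w = np.array([1.0, 0.0])
    for _ in range(epochs):
        y_prev, x_prev = x[0] @ w, x[0]
        for x_k in x[1:]:
            y_k = x_k @ w
            w = w + eta * f(y_k - y_prev, sigma) * (x_k - x_prev)  # entropy ascent
            w = w / np.linalg.norm(w)        # unit-norm constraint
            y_prev, x_prev = x_k @ w, x_k    # past output recomputed with new w
    return w

rng = np.random.default_rng(1)
C = np.array([[3.0, 1.0], [1.0, 1.0]])       # anisotropic 2-D Gaussian data
x = rng.multivariate_normal([0.0, 0.0], C, size=1000)

w_g = train_max_entropy(x, f_gauss, sigma=1.0, eta=1e-3, epochs=20)
w_c = train_max_entropy(x, f_cauchy, sigma=1.0, eta=1e-2, epochs=20)

# For Gaussian data the maximum-entropy direction is the first principal component.
_, vecs = np.linalg.eigh(np.cov(x.T))
pc1 = vecs[:, -1]
print(abs(w_g @ pc1), abs(w_c @ pc1))        # both should be close to 1
```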

We remark that, in this example, as the number of samples increases, the estimated distributions will converge, for small kernel sizes, to the actual distributions, which lie between uniform and Gaussian. Hence, the asymptotically ideal solution for the angle will be π/2, since the Gaussian distribution has the largest entropy among distributions of fixed variance [7].

The third case study we present is a Monte Carlo analysis of the misadjustment of the stochastic gradient given in (12). In each run of this analysis, we use samples from a 2-dimensional jointly Gaussian random vector, as was done in the first example. The eigenspread λ_max/λ_min of the covariance matrix is one of the controlled variables. The kernel size is scaled up or down with the standard deviation of the maximum entropy projection (which could be estimated from the samples) as σ = σ_0 √λ_max, where σ_0 is a base value of the kernel size selected for unit-variance samples. This latter variable is also part of the controlled parameters of the Monte Carlo simulations. Specifically, we control the ratio η/σ_0, where η is the learning rate of the steepest ascent algorithm. We perform this series of Monte Carlo simulations varying η/σ_0 and the eigenspread λ_max/λ_min over a grid of values. For each pairwise combination of these parameters, a set of experiments was performed, starting from random initial conditions and using the last samples of each run to compute the MSE between the estimated maximum entropy direction and the actual solution found using the unbiased sample covariance estimate of the whole training set. The MSE values obtained over the runs are then averaged to produce the results summarized in Fig. 3. These results are presented in terms of root-mean-square (RMS) values of the error, in degrees, between the estimated direction and the true direction (notice the log scale of the RMS axes in all three subplots). Both cross-section plots and the two-argument, one-output mesh plot of the RMS error versus learning rate and eigenspread are provided for convenience.

Figure 3: (a) RMS error in degrees versus η/σ_0 for different eigenspread values. (b) RMS error in degrees versus √(λ_max/λ_min) for different learning rates. (c) RMS error in degrees versus η/σ_0 and √(λ_max/λ_min).
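A minimal sketch of this Monte Carlo protocol is given below; it is not the authors' code, and the parameter grid, number of samples, number of runs, and averaging window are placeholder assumptions. It estimates the RMS angular error of the L = 1 update of (12) as a function of η/σ_0 and the eigenspread of 2-D Gaussian data.

```python
# Illustrative Monte Carlo sketch (grid, sample counts and run counts are
# assumptions, not the paper's): RMS angle error of the eq. (12) update versus
# eta/sigma_0 and the eigenspread of 2-D Gaussian data.
import numpy as np

def run_once(rng, eigenspread, eta_over_sigma0, n=1000, sigma0=0.25, tail=300):
    C = np.diag([eigenspread, 1.0])              # lambda_max along the x-axis
    x = rng.multivariate_normal([0.0, 0.0], C, size=n)
    sigma = sigma0 * np.sqrt(eigenspread)        # kernel size scaled with projection std
    eta = eta_over_sigma0 * sigma0
    w = rng.standard_normal(2)
    w /= np.linalg.norm(w)
    y_prev, x_prev = x[0] @ w, x[0]
    errors = []
    for k, x_k in enumerate(x[1:], start=1):
        y_k = x_k @ w
        w = w + eta * (y_k - y_prev) * (x_k - x_prev) / sigma**2   # eq. (12), ascent
        w /= np.linalg.norm(w)
        y_prev, x_prev = x_k @ w, x_k
        if k >= n - tail:                        # keep only the tail of the run
            err = np.degrees(np.arctan2(w[1], w[0]))   # true direction is 0 degrees
            errors.append((err + 90.0) % 180.0 - 90.0) # fold sign/period ambiguity
    return np.sqrt(np.mean(np.square(errors)))

rng = np.random.default_rng(2)
for es in (2.0, 10.0, 100.0):
    for ratio in (0.01, 0.1, 1.0):
        rms = np.mean([run_once(rng, es, ratio) for _ in range(10)])
        print(f"eigenspread={es:6.1f}  eta/sigma0={ratio:4.2f}  RMS error={rms:6.2f} deg")
```

Sweeping such a grid should reproduce the qualitative trend discussed next: larger learning rates increase the error, while larger eigenspreads decrease it.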

From the results shown in Fig. 3, we observe that the misadjustment increases as the learning rate increases and as the eigenspread decreases, as expected. The effect of the learning rate is intuitively obvious from our knowledge of the effect of the step size on the misadjustment of the well-known LMS algorithm. The intuition behind the observed effect of the eigenspread is that, with increasing eigenspread, the entropy of the projection onto the desired direction increases, causing the differences between consecutive samples of the input vector and of the output value to become larger, thus emphasizing the distinction between the maximum-entropy direction and all other directions. Therefore, it becomes easier for the algorithm to determine and converge to the optimal solution sought.

CONCLUSIONS

In this paper we presented a nonparametric entropy estimator whose stochastic gradient leads to a local, sample-wise expression involving the product of a nonlinear function of the differences between consecutive output samples and the difference between the corresponding input samples. The derived stochastic entropy gradient expressions (all three of them) can be utilized in the information theoretic solution of many engineering problems where training of adaptive systems according to information theoretic criteria is required. These stochastic information gradient (SIG) expressions allow the designer to manipulate the information at the output of a learning system on a sample-by-sample basis; therefore, they are extremely useful for fast, on-line information theoretic learning. For the special case of single-sample windows, this stochastic gradient reduces to a combination of Hebbian and anti-Hebbian learning among pairs of consecutive samples, acting on their instantaneous increments in value instead of on their instantaneous values, as in the classical sense. We have investigated the ability of this simplified SIG algorithm to determine maximum-entropy directions in different sets of random variables and demonstrated that it successfully forces the linear adaptive system to converge to the desired solution. A Monte Carlo analysis of the effects of the learning rate and the eigenspread of the data on the misadjustment of the algorithm in determining the maximum-entropy direction was also conducted. This analysis verified the expectation that the misadjustment increases with increasing learning rate and decreasing eigenspread.

There are two main conclusions from this study that we would like to address. First, the relationship between entropy and the Hebbian update principle encourages us to study the realism of this learning model in neuronal interactions in the brain; experiments could be set up to see if, in fact, synaptic strength depends only on the level of excitability of the neuron or if, as the formula predicts, the efficacy of the synapse is also modulated by the intervals between action potentials (verbal communication with several neuroscience experts revealed the possibility of a similar process in neuronal synapses). If our prediction is correct, then we have a very gratifying result that increases even further our admiration for biological processes. In fact, Hebbian learning could adapt the synapses with information gradients, and therefore neural assemblies would manipulate entropy, which from the point of view of information processing provides all that can be known about the data streams. The second conclusion concerns the machine learning community. Even if our predictions for computational neuroscience are incorrect, Renyi's entropy estimator is a viable alternative to help us leave the local minima created by the second order methods so pervasive in both adaptive filter theory and artificial neural network cost functions. SIG clearly shows that Hebbian type rules estimate far more than second order statistics. It is quite extraordinary that interactions between consecutive samples are able to estimate higher order statistics. Upon employing our nonparametric estimator for entropy and seeking an on-line implementation, we began to realize how powerful local computation in space-time can really be. As a result, it is imperative that researchers move to a different paradigm, in which Hebbian learning is no longer a synonym for correlation learning.

Finally, we would like to address the applicability of the presented relationship to the field of information theoretic learning. In general, mutual information, not entropy alone, is employed as the adaptation criterion, since it is scale invariant and exhibits other desirable properties. It is easy to write mutual information in terms of the marginal and joint entropies of the signals involved; therefore a link between mutual information and Hebbian learning might as well be established following this lead. On the other hand, there are applications where a modified entropy criterion might be useful, such as blind deconvolution. By adding the log of the variance of the signal to its entropy, a scale invariant objective function can be constructed, which would exhibit Hebbian update rules for both the entropy and the variance terms in the stochastic updates. In summary, there is still much to be researched about the links between information extraction in biological adaptive systems and their application to learning engineering systems.

Acknowledgments: This work is partially supported by the grants NSF-ECS and ONR-N.

REFERENCES

[1] D.O. Hebb, The Organization of Behavior: A Neuropsychological Theory, Wiley, New York, 1949.
[2] S. Haykin, Neural Networks, Prentice-Hall, New Jersey, 1999.
[3] B. Widrow, S.D. Stearns, Adaptive Signal Processing, Prentice-Hall, New Jersey, 1985.
[4] R. Fano, Transmission of Information, MIT Press, Massachusetts, 1961.
[5] C.E. Shannon, A mathematical theory of communication, Bell Sys. Tech. J., vol. 27, pp. 379-423, 623-656, 1948.
[6] R. Blahut, Principles and Practice of Information Theory, Addison Wesley, Massachusetts, 1987.
[7] T. Cover, J. Thomas, Elements of Information Theory, Wiley, New York, 1991.
[8] E. Jaynes, Information theory and statistical mechanics, Phys. Rev., vol. 106, pp. 620-630, 1957.
[9] H. Barlow, P. Foldiak, Adaptation and Decorrelation in the Cortex, in The Computing Neuron, Durbin, Miall & Mitchison (eds.), Addison Wesley, Massachusetts, 1989.
[10] F. Attneave, Informational Aspects of Visual Perception, Psych. Rev., vol. 61, pp. 183-193, 1954.
[11] H. Barlow, T. Kaushal, G. Mitchison, Finding Minimum Entropy Codes, Neural Computation, vol. 1, no. 3, pp. 412-423, 1989.
[12] R. Zemel, P. Dayan, A. Pouget, Probabilistic Interpretation of Population Codes, Neural Computation, vol. 10, pp. 403-430, 1998.
[13] B. Olshausen, D.J. Field, Emergence of Simple-Cell Receptive Field Properties by Learning a Sparse Code for Natural Images, Nature, vol. 381, pp. 607-609, 1996.
[14] J. Atick, Could Information Theory Provide an Ecological Theory of Sensory Processing?, Network, vol. 3, pp. 213-251, 1992.
[15] J. Atick, A. Redlich, Convergent Algorithms for Sensory Receptive Field Development, Neural Computation, vol. 5, pp. 45-60, 1993.
[16] D.J. Field, What is the Goal of Sensory Coding?, Neural Computation, vol. 6, pp. 559-601, 1994.
[17] R. Linsker, An Application of the Principle of Maximum Information Preservation to Linear Systems, in Advances in Neural Information Processing Systems, D.S. Touretzky (ed.), Morgan Kaufmann, San Francisco, 1988.
[18] D. Erdogmus, J.C. Principe, An On-Line Adaptation Algorithm for Adaptive System Training with Minimum Error Entropy: Stochastic Information Gradient, in Proc. Independent Component Analysis (ICA'01), 2001.
[19] A. Renyi, Probability Theory, American Elsevier Pub. Co., New York, 1970.
[20] J.C. Principe, D. Xu, J. Fisher, Information Theoretic Learning, in Unsupervised Adaptive Filtering, vol. I, S. Haykin (ed.), Wiley, New York, 2000.
[21] E. Parzen, On Estimation of a Probability Density Function and Mode, in Time Series Analysis Papers, Holden-Day, Inc., California, 1967.
[22] D. Erdogmus, J.C. Principe, Generalized Information Potential Criterion for Adaptive System Training, to appear in IEEE Trans. on Neural Networks, 2002.
[23] E. Oja, Subspace Methods of Pattern Recognition, Research Studies Press, England, 1983.
[24] D. Erdogmus, J.C. Principe, Convergence Analysis of the Information Potential Criterion in ADALINE Training, in Proc. Neural Networks for Signal Processing XI, 2001.
