Principal Components Analysis and Unsupervised Hebbian Learning
Robert Jacobs
Department of Brain & Cognitive Sciences
University of Rochester
Rochester, NY 14627, USA

August 8, 2008

Reference: Much of the material in this note is from Haykin, S. (1994). Neural Networks. New York: Macmillan.

Here we consider training a single layer neural network (no hidden units) with an unsupervised Hebbian learning rule. It seems sensible that we might want the activation of an output unit to vary as much as possible when given different inputs. After all, if this activation were constant for all inputs then the activation would not tell us anything about which input is present. In contrast, if the activation takes very different values for different inputs then there is some hope of knowing something about the input based just on the output unit's activation. So we might want to train the output unit to adapt its weights so as to maximize the variance of its activation. However, one way in which the unit might do this is by making its weights arbitrarily large. This would not be a satisfying solution. We would prefer a solution in which the variance of the output activation is maximized subject to the constraint that the weights are bounded. For example, we might want to constrain the length of the weight vector to be one.

Lagrange optimization is a method for maximizing (or minimizing) a function subject to one or more constraints. At this point, we take a brief detour and give an example of Lagrange optimization. From this example, you should get the general idea of how this method works.

Example: Suppose that we want to find the shortest vector y = [y_1 y_2]^T that lies on the line 2y_1 + y_2 - 5 = 0. We write down an objective function L in which the first term states the function that we want to minimize, and the second term states the constraint:

L = (1/2)(y_1^2 + y_2^2) + λ(2y_1 + y_2 - 5)    (1)

where λ is called a Lagrange multiplier (it is always placed in front of the constraint).
The first term gives the length of the vector (actually it is 1/2 times the squared length of the vector); the second term gives the constraint. We now take derivatives, and set these derivatives equal to zero:

∂L/∂y_1 = y_1 + 2λ = 0    (2)
∂L/∂y_2 = y_2 + λ = 0    (3)
∂L/∂λ = 2y_1 + y_2 - 5 = 0.    (4)

Note that we have three equations and three unknown variables. From the first derivative we know that y_1 = -2λ, and from the second derivative we know that y_2 = -λ. If we plug these values into the third derivative, we get -4λ - λ = 5, or λ = -1. From this, we can now solve for y_1 and y_2; in particular, y_1 = 2 and y_2 = 1. So the solution to our problem is y = [2 1]^T.

Now let's turn to the problem that we really care about. We want to maximize the variance of the output unit's activation subject to the constraint that the length of the weight vector is equal to one. Without loss of generality, let's make things simpler by assuming that the input variables each have a mean of zero (E[x_i] = 0). Because the mapping from inputs to the output is linear (y = w^T x = x^T w), this means that the output activation will also have a mean of zero (E[y] = 0). We are now ready to write down the objective function:

L = (1/2) Σ_p y_p^2 + λ(1 - w^T w)    (5)

where y_p is the output activation on data pattern p. The first term, (1/2) Σ_p y_p^2, is proportional to the sample variance of the output unit, and the second term, λ(1 - w^T w), gives the constraint that the weight vector should have a length of one. We now do some algebraic manipulations:

L = (1/2) Σ_p (w^T x_p)(x_p^T w) + λ(1 - w^T w)    (6)
  = (1/2) w^T (Σ_p x_p x_p^T) w + λ(1 - w^T w)    (7)
  = (1/2) w^T Q w + λ(1 - w^T w)    (8)

where Q = Σ_p x_p x_p^T is (proportional to) the sample covariance matrix. Now let's take the derivative of L with respect to w, and set this derivative equal to zero:

∂L/∂w = Qw - 2λw = 0    (9)

which means that Qw = 2λw. In other words, w is an eigenvector of the covariance matrix Q (the corresponding eigenvalue is 2λ). So if we want to maximize the variance of the output activation (subject to the constraint that the weight vector is of length one), then we should set the weight vector to be an eigenvector of the input covariance matrix (in fact it should be the eigenvector with the largest eigenvalue). As discussed below, this has an interesting relationship to principal component analysis and unsupervised Hebbian learning.
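As a sanity check on this derivation, here is a small numerical sketch (using numpy; the data, dimensions, and random seed are made-up choices for illustration, not part of the handout). It compares the output variance achieved by the top eigenvector of the covariance against many random unit-length weight vectors:

```python
import numpy as np

rng = np.random.default_rng(0)

# Zero-mean data with unequal variance along different directions.
X = rng.normal(size=(1000, 3)) * np.array([3.0, 1.0, 0.3])
X -= X.mean(axis=0)

Q = X.T @ X  # Q = sum_p x_p x_p^T, proportional to the sample covariance

# Eigenvector of Q with the largest eigenvalue (eigh sorts ascending).
eigvals, eigvecs = np.linalg.eigh(Q)
w_best = eigvecs[:, -1]

def output_variance(w):
    # Sample variance of the output y = w^T x over the data set.
    y = X @ w
    return y.var()

# No random unit-length weight vector should beat the top eigenvector.
best = output_variance(w_best)
for _ in range(1000):
    w = rng.normal(size=3)
    w /= np.linalg.norm(w)
    assert output_variance(w) <= best + 1e-9
print(f"variance along top eigenvector: {best:.3f}")
```

Because the data are centered, output_variance(w) equals w^T Q w divided by the number of patterns, which is exactly the quantity the Lagrange argument maximizes over unit vectors.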
The idea behind principal component analysis (PCA) is that we want to represent the data with respect to a new basis, where the vectors that form this new basis are the eigenvectors of the data covariance matrix. Recall that eigenvectors with large eigenvalues give directions of large variance (again, we are considering the eigenvectors of a covariance matrix), whereas eigenvectors with small eigenvalues give directions of small variance (so the eigenvector with the largest eigenvalue gives the direction in which the input data show the greatest variance, and the eigenvector with the smallest eigenvalue gives the direction in which the input data show the least variance). Each eigenvalue gives the value of the variance in the direction of its corresponding eigenvector (thus the eigenvalues are all nonnegative). Because the data covariance matrix is symmetric, its eigenvectors are orthogonal (as usual, we assume that the eigenvectors are of length one).

Let x^(t) = [x_1^(t), ..., x_n^(t)]^T be a data item, and e_i = [e_i1, ..., e_in]^T be the i-th eigenvector (assume that the eigenvectors are ordered such that the eigenvector with the largest eigenvalue is e_1, the eigenvector with the next largest eigenvalue is e_2, and so on). The projection of x^(t) onto e_i is denoted y_i^(t), and is given by the inner product of these two vectors:

y_i^(t) = [x^(t)]^T e_i = e_i^T x^(t).    (10)

That is, y_i^(t) is the coordinate of x^(t) along the axis given by e_i, and y^(t) = [y_1^(t), ..., y_n^(t)]^T is the representation of x^(t) with respect to the new basis. Let E = [e_1, ..., e_n] be the matrix whose i-th column is the i-th eigenvector. Then we can move back and forth between the two representations of the same data by the linear equations:

y^(t) = E^T x^(t),    (11)

and

x^(t) = E y^(t).    (12)

Please make sure that you understand these equations [we have used the fact that for an orthogonal matrix E (a matrix whose columns are orthonormal vectors), its inverse E^(-1) is equal to its transpose E^T].

Principal component analysis is useful for at least two reasons. First, as a feature detection mechanism: it is often the case that the eigenvectors give interesting underlying features of the data. Second, PCA is useful as a dimensionality reduction technique. Suppose that we represent the data using only the first m eigenvectors, where m < n (that is, suppose we represent x^(t) with ŷ^(t) = [y_1^(t), ..., y_m^(t)]^T instead of y^(t) = [y_1^(t), ..., y_n^(t)]^T). Out of all linear transformations that produce an m-dimensional representation, this transformation is optimal in the sense that it preserves the most information about the data (yes, I am being vague here). That is, PCA can be used for optimal linear dimensionality reduction.
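The change of basis and the dimensionality-reduction claim above can be illustrated numerically. This is a minimal sketch (numpy; the data and dimensions are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Zero-mean data: n = 3 dimensions, most variance in the first two directions.
X = rng.normal(size=(500, 3)) * np.array([4.0, 2.0, 0.1])
X -= X.mean(axis=0)

C = np.cov(X.T)                    # data covariance matrix (3 x 3)
eigvals, E = np.linalg.eigh(C)     # columns of E are the eigenvectors
order = np.argsort(eigvals)[::-1]  # reorder by decreasing eigenvalue
eigvals, E = eigvals[order], E[:, order]

x = X[0]
y = E.T @ x       # coordinates of x in the eigenvector basis
x_back = E @ y    # perfect reconstruction, since E^(-1) = E^T
assert np.allclose(x, x_back)

# Dimensionality reduction: keep only the first m = 2 coordinates.
m = 2
x_hat = E[:, :m] @ y[:m]
print("reconstruction error with m = 2:", np.linalg.norm(x - x_hat))
```

The reconstruction error equals the size of the discarded coordinate, |y_3|, which tends to be small here because the third direction carries little variance.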
In the remainder of this handout, we prove that the use of a particular Hebbian learning rule results in a network that performs PCA. That is, the weight vector of the i-th output unit goes to the i-th eigenvector e_i of the data covariance matrix, and the output of this unit, y_i^(t), is the coordinate of the input x^(t) along the axis given by e_i. For simplicity, we assume that we are dealing with a two-layer linear network with n input units and only one output unit (the result extends to the case with n output units in which the weight vector of each output unit is a different eigenvector, but we will not prove that here). We will also assume that the input is a random variable with mean equal to zero. The learning rule is:

w_i(t+1) = w_i(t) + ε[y(t) x_i(t) - y^2(t) w_i(t)]    (13)
where w_i(t) is the value of the output unit's i-th weight at time t. Note that the first term inside the brackets is the standard Hebbian learning term, and the second term is a type of weight decay term that prevents the weights from getting too big. In vector notation, this equation may be re-written:

w(t+1) = w(t) + ε[y(t) x(t) - y^2(t) w(t)].    (14)

Recall that we are using a linear network, so that

y(t) = w^T(t) x(t) = x^T(t) w(t).    (15)

Using this fact, we re-write the weight update rule as

w(t+1) = w(t) + ε[x(t) x^T(t) w(t) - w^T(t) x(t) x^T(t) w(t) w(t)].    (16)

Let C = E[x(t) x^T(t)] be the data covariance matrix (we use the fact that the input has a mean equal to zero). The update rule converges when E[w(t+1)] = E[w(t)]. Denote the expected value of w(t) at convergence as w_o. Convergence occurs when the expected value of what is inside the brackets in the update rule is equal to zero:

0 = E[x(t) x^T(t)] w_o - (w_o^T E[x(t) x^T(t)] w_o) w_o    (17)
  = C w_o - (w_o^T C w_o) w_o.    (18)

In other words,

C w_o = λ_o w_o    (19)

where

λ_o = w_o^T C w_o.    (20)

That is, w_o is an eigenvector of the data covariance matrix C with λ_o as its eigenvalue. We know that w_o has length one because

λ_o = w_o^T C w_o    (21)
    = w_o^T λ_o w_o    (22)
    = λ_o ||w_o||^2    (23)

which is only possible if ||w_o|| = 1. We also know that λ_o is the variance in the direction of w_o because

λ_o = w_o^T C w_o    (24)
    = w_o^T E[x(t) x^T(t)] w_o    (25)
    = E[{w_o^T x(t)}{x^T(t) w_o}]    (26)
    = E[y^2(t)]    (27)

which is the variance of the output (assuming that the output has mean zero, which is true given our assumption that the input has mean zero and that the input and output are linearly related).
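The convergence result can be checked with a small simulation of the learning rule in equation (13). This is only an illustrative sketch (numpy): the input distribution, learning rate, and number of steps are arbitrary choices, not anything specified in the handout.

```python
import numpy as np

rng = np.random.default_rng(2)

# Inputs are zero-mean with covariance C = A A^T = diag(4, 0.25),
# so the top eigenvector of C is e_1 = [1, 0] with eigenvalue 4.
A = np.array([[2.0, 0.0],
              [0.0, 0.5]])

w = rng.normal(size=2)              # random initial weight vector
eps = 0.002                         # small constant learning rate
for t in range(50000):
    x = A @ rng.normal(size=2)      # draw a zero-mean input
    y = w @ x                       # linear output: y = w^T x
    w += eps * (y * x - y**2 * w)   # Hebbian term minus decay term, eq. (13)

C = A @ A.T
eigvals, eigvecs = np.linalg.eigh(C)
e1 = eigvecs[:, -1]                 # eigenvector with the largest eigenvalue

print("||w||    ≈", np.linalg.norm(w))   # should hover near 1
print("|w . e1| ≈", abs(w @ e1))         # should hover near 1
```

With a constant learning rate the weight vector fluctuates around the fixed point rather than settling exactly, but it should end up approximately unit length and approximately aligned (up to sign) with e_1.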
We still need to prove that w_o is the particular eigenvector with the largest eigenvalue, i.e. w_o = e_1. This is trickier to prove, so we shall omit it.

This result is due to Oja (1982). The extension to networks with n output units in which the weight vector of each output unit goes to a different eigenvector is due to Sanger (1989). The extension to networks (using Hebbian and anti-Hebbian learning) with n output units in which the n weight vectors span the same space as the first n eigenvectors is due to Földiák (1990).

Földiák, P. (1990). Adaptive network for optimal linear feature extraction. Proceedings of the IEEE/INNS International Joint Conference on Neural Networks.

Oja, E. (1982). A simplified neuron model as a principal component analyzer. Journal of Mathematical Biology, 15.

Sanger, T.D. (1989). Optimal unsupervised learning in a single-layer linear feedforward neural network. Neural Networks, 1.
More informationMATH 3240Q Introduction to Number Theory Homework 7
As long as algebra and geometry have been searated, their rogress have been slow and their uses limited; but when these two sciences have been united, they have lent each mutual forces, and have marched
More informationRecent Developments in Multilayer Perceptron Neural Networks
Recent Develoments in Multilayer Percetron eural etworks Walter H. Delashmit Lockheed Martin Missiles and Fire Control Dallas, Texas 75265 walter.delashmit@lmco.com walter.delashmit@verizon.net Michael
More informationScaling Multiple Point Statistics for Non-Stationary Geostatistical Modeling
Scaling Multile Point Statistics or Non-Stationary Geostatistical Modeling Julián M. Ortiz, Steven Lyster and Clayton V. Deutsch Centre or Comutational Geostatistics Deartment o Civil & Environmental Engineering
More informationECE 534 Information Theory - Midterm 2
ECE 534 Information Theory - Midterm Nov.4, 009. 3:30-4:45 in LH03. You will be given the full class time: 75 minutes. Use it wisely! Many of the roblems have short answers; try to find shortcuts. You
More informationMultilayer Perceptron Neural Network (MLPs) For Analyzing the Properties of Jordan Oil Shale
World Alied Sciences Journal 5 (5): 546-552, 2008 ISSN 1818-4952 IDOSI Publications, 2008 Multilayer Percetron Neural Network (MLPs) For Analyzing the Proerties of Jordan Oil Shale 1 Jamal M. Nazzal, 2
More informationCSE 599d - Quantum Computing When Quantum Computers Fall Apart
CSE 599d - Quantum Comuting When Quantum Comuters Fall Aart Dave Bacon Deartment of Comuter Science & Engineering, University of Washington In this lecture we are going to begin discussing what haens to
More information1. INTRODUCTION. Fn 2 = F j F j+1 (1.1)
CERTAIN CLASSES OF FINITE SUMS THAT INVOLVE GENERALIZED FIBONACCI AND LUCAS NUMBERS The beautiful identity R.S. Melham Deartment of Mathematical Sciences, University of Technology, Sydney PO Box 23, Broadway,
More informationCOMMUNICATION BETWEEN SHAREHOLDERS 1
COMMUNICATION BTWN SHARHOLDRS 1 A B. O A : A D Lemma B.1. U to µ Z r 2 σ2 Z + σ2 X 2r ω 2 an additive constant that does not deend on a or θ, the agents ayoffs can be written as: 2r rθa ω2 + θ µ Y rcov
More informationSome Unitary Space Time Codes From Sphere Packing Theory With Optimal Diversity Product of Code Size
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 5, NO., DECEMBER 4 336 Some Unitary Sace Time Codes From Shere Packing Theory With Otimal Diversity Product of Code Size Haiquan Wang, Genyuan Wang, and Xiang-Gen
More informationElementary Analysis in Q p
Elementary Analysis in Q Hannah Hutter, May Szedlák, Phili Wirth November 17, 2011 This reort follows very closely the book of Svetlana Katok 1. 1 Sequences and Series In this section we will see some
More informationLocation of solutions for quasi-linear elliptic equations with general gradient dependence
Electronic Journal of Qualitative Theory of Differential Equations 217, No. 87, 1 1; htts://doi.org/1.14232/ejqtde.217.1.87 www.math.u-szeged.hu/ejqtde/ Location of solutions for quasi-linear ellitic equations
More informationCollaborative Place Models Supplement 1
Collaborative Place Models Sulement Ber Kaicioglu Foursquare Labs ber.aicioglu@gmail.com Robert E. Schaire Princeton University schaire@cs.rinceton.edu David S. Rosenberg P Mobile Labs david.davidr@gmail.com
More informationDesign of NARMA L-2 Control of Nonlinear Inverted Pendulum
International Research Journal of Alied and Basic Sciences 016 Available online at www.irjabs.com ISSN 51-838X / Vol, 10 (6): 679-684 Science Exlorer Publications Design of NARMA L- Control of Nonlinear
More informationCSC165H, Mathematical expression and reasoning for computer science week 12
CSC165H, Mathematical exression and reasoning for comuter science week 1 nd December 005 Gary Baumgartner and Danny Hea hea@cs.toronto.edu SF4306A 416-978-5899 htt//www.cs.toronto.edu/~hea/165/s005/index.shtml
More informationIterative face image feature extraction with Generalized Hebbian Algorithm and a Sanger-like BCM rule
Iterative face image feature extraction with Generalized Hebbian Algorithm and a Sanger-like BCM rule Clayton Aldern (Clayton_Aldern@brown.edu) Tyler Benster (Tyler_Benster@brown.edu) Carl Olsson (Carl_Olsson@brown.edu)
More informationarxiv: v1 [cs.lg] 31 Jul 2014
Learning Nash Equilibria in Congestion Games Walid Krichene Benjamin Drighès Alexandre M. Bayen arxiv:408.007v [cs.lg] 3 Jul 204 Abstract We study the reeated congestion game, in which multile oulations
More information8 STOCHASTIC PROCESSES
8 STOCHASTIC PROCESSES The word stochastic is derived from the Greek στoχαστικoς, meaning to aim at a target. Stochastic rocesses involve state which changes in a random way. A Markov rocess is a articular
More informationCHAPTER 3: TANGENT SPACE
CHAPTER 3: TANGENT SPACE DAVID GLICKENSTEIN 1. Tangent sace We shall de ne the tangent sace in several ways. We rst try gluing them together. We know vectors in a Euclidean sace require a baseoint x 2
More informationStatistical Models approach for Solar Radiation Prediction
Statistical Models aroach for Solar Radiation Prediction S. Ferrari, M. Lazzaroni, and V. Piuri Universita degli Studi di Milano Milan, Italy Email: {stefano.ferrari, massimo.lazzaroni, vincenzo.iuri}@unimi.it
More information