Feature Extraction Techniques
- Grace Holland
- 5 years ago
Unsupervised Learning II

Feature Extraction
Unsupervised methods can also be used to find features useful for categorization. Some unsupervised methods amount to a form of "smart" feature extraction: transforming the input data into a set of features that still describes the data with sufficient accuracy. In pattern recognition and image processing, feature extraction is a special form of dimensionality reduction.

What is feature reduction?
Given original data X in R^p and a linear transformation G in R^(p x d), the reduced data is Y = G^T X in R^d, with d < p.

Why feature reduction?
- Most machine learning and data mining techniques may not be effective for high-dimensional data.
- The input data to an algorithm may be too large to process while being suspected of redundancy (much data, but not much information).
- Analysis with a large number of variables generally requires a large amount of memory and computing power, or a classification algorithm that overfits the training sample and generalizes poorly to new samples.
- The important dimension may be small. For example, the number of genes responsible for a certain type of disease may be small.
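The linear transformation Y = G^T X can be sketched in a few lines of numpy; the sizes p = 5 and d = 2 and the random G below are hypothetical stand-ins for a learned projection.

```python
import numpy as np

# Hypothetical sizes: p = 5 original features, reduced to d = 2.
p, d = 5, 2
rng = np.random.default_rng(0)

X = rng.normal(size=(p, 100))   # 100 samples, each a column in R^p
G = rng.normal(size=(p, d))     # linear transformation; columns are the new axes

Y = G.T @ X                     # reduced data: one column per sample, in R^d
print(Y.shape)                  # (2, 100)
```

Each sample keeps its identity (same column index) but is now described by d numbers instead of p.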
Why feature reduction? (continued)
- Visualization: projection of high-dimensional data onto 2D or 3D.
- Data compression: efficient storage and retrieval.
- Noise removal: positive effect on query accuracy.

Feature reduction versus feature selection
- Feature reduction: all original features are used; the transformed features are linear combinations of the original features.
- Feature selection: only a subset of the original features is used.
- Continuous versus discrete.

Applications of feature reduction
Face recognition, handwritten digit recognition, text mining, image retrieval, microarray data analysis, protein classification.

Algorithms: Feature Extraction Techniques
- Principal component analysis (PCA)
- Singular value decomposition (SVD)
- Non-negative matrix factorization (NMF)
- Independent component analysis (ICA)
What is Principal Component Analysis?
Principal component analysis (PCA) reduces the dimensionality of a data set by finding a new set of variables, smaller than the original set, that retains most of the sample's information. It is useful for the compression and classification of data. By "information" we mean the variation present in the sample, given by the correlations between the original variables. The new variables, called principal components (PCs), are uncorrelated and are ordered by the fraction of the total information each retains.

Geometric picture of principal components (PCs)
- The 1st PC, z1, is a minimum-distance fit to a line in X space.
- The 2nd PC, z2, is a minimum-distance fit to a line in the plane perpendicular to the 1st PC.
PCs are a series of linear least-squares fits to a sample, each orthogonal to all the previous ones.

Principal Components Analysis (PCA)
Principle
- A linear projection method to reduce the number of parameters.
- Transforms a set of correlated variables into a new set of uncorrelated variables.
- Maps the data into a space of lower dimensionality.
- A form of unsupervised learning.
Properties
- It can be viewed as a rotation of the existing axes to new positions in the space defined by the original variables.
- The new axes are orthogonal and represent the directions of maximum variability.

Background mathematics: linear algebra, calculus, probability and computing.
Example. Consider the matrix

A = [  1   2   1
       6  -1   0
      -1  -2  -1 ]

and the three column matrices

C1 = [1, 6, -13]^T,   C2 = [-1, 2, 1]^T,   C3 = [2, 3, -2]^T.

We have AC1 = 0 = 0·C1, AC2 = -4·C2 and AC3 = 3·C3. In other words, 0, -4 and 3 are eigenvalues of A, and C1, C2 and C3 are eigenvectors (they satisfy AC = λC).

Consider the matrix P whose columns are C1, C2 and C3:

P = [   1  -1   2
        6   2   3
      -13   1  -2 ]

Its determinant is det(P) = 84, so this matrix is invertible, and easy calculations give P^(-1). Next we evaluate the matrix P^(-1)AP and obtain

P^(-1) A P = [ 0   0  0
               0  -4  0
               0   0  3 ]

Using matrix multiplication, we obtain A = P · diag(0, -4, 3) · P^(-1), which implies that A is similar to a diagonal matrix. In particular, we have

A^n = P [ 0      0     0
          0  (-4)^n    0
          0      0   3^n ] P^(-1)    for n = 1, 2, ...

Definition. Let A be a square matrix. A non-zero vector C is called an eigenvector of A if and only if there exists a number (real or complex) λ such that AC = λC. If such a number λ exists, it is called an eigenvalue of A, and C is called an eigenvector associated to the eigenvalue λ.

Remark. The eigenvector C must be non-zero, since A·0 = 0 = λ·0 for any number λ.
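The worked example can be checked mechanically; this minimal numpy sketch verifies the eigenpairs, det(P) = 84, the diagonalization P^(-1)AP = diag(0, -4, 3), and the A^n formula for n = 3.

```python
import numpy as np

A = np.array([[ 1,  2,  1],
              [ 6, -1,  0],
              [-1, -2, -1]], dtype=float)
C1 = np.array([1, 6, -13], dtype=float)
C2 = np.array([-1, 2, 1], dtype=float)
C3 = np.array([2, 3, -2], dtype=float)

# The three eigenpair claims: AC = lambda * C.
assert np.allclose(A @ C1, 0 * C1)
assert np.allclose(A @ C2, -4 * C2)
assert np.allclose(A @ C3, 3 * C3)

P = np.column_stack([C1, C2, C3])
print(round(np.linalg.det(P)))                 # 84, so P is invertible

D = np.linalg.inv(P) @ A @ P                   # P^-1 A P = diag(0, -4, 3)
assert np.allclose(D, np.diag([0.0, -4.0, 3.0]))

# A^n = P diag(0, (-4)^n, 3^n) P^-1, checked for n = 3:
A3 = P @ np.diag([0.0, (-4.0) ** 3, 3.0 ** 3]) @ np.linalg.inv(P)
assert np.allclose(A3, np.linalg.matrix_power(A, 3))
```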
Example. Consider the matrix A above. We have seen that

AC1 = 0·C1,  AC2 = -4·C2,  AC3 = 3·C3,

where C1 = [1, 6, -13]^T, C2 = [-1, 2, 1]^T and C3 = [2, 3, -2]^T. So C1 is an eigenvector of A associated to the eigenvalue 0, C2 is an eigenvector associated to the eigenvalue -4, while C3 is an eigenvector associated to the eigenvalue 3.

Determinants
Determinant of order 2, easy to remember (for order 2 only):

det [ a11  a12
      a21  a22 ] = a11·a22 - a12·a21

For a square matrix A of order n, the number λ is an eigenvalue if and only if there exists a non-zero vector C such that AC = λC. Using the matrix multiplication properties, we obtain (A - λI_n)C = 0. This homogeneous system has only the zero solution if and only if the coefficient matrix is invertible, i.e. det(A - λI_n) ≠ 0. Since the zero vector is always a solution and C is not the zero vector, we must have det(A - λI_n) = 0.

Example. Consider the matrix

A = [ 1  2
      2  0 ]

The equation det(A - λI_n) = 0 translates into (1 - λ)(0 - λ) - 4 = 0, which is equivalent to the quadratic equation λ² - λ - 4 = 0. Solving this equation (use the quadratic formula) leads to

λ1 = (1 + √17)/2   and   λ2 = (1 - √17)/2.

In other words, the matrix A has only two eigenvalues.
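A quick numpy check that the roots of the characteristic polynomial λ² - λ - 4 really are the eigenvalues of the 2x2 matrix in the example:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 0.0]])

# det(A - l*I) = (1 - l)(0 - l) - 4 = l^2 - l - 4
roots = np.roots([1.0, -1.0, -4.0])   # (1 ± sqrt(17)) / 2
eigvals = np.linalg.eigvals(A)

print(sorted(roots))
assert np.allclose(sorted(roots), sorted(eigvals))
assert np.isclose(max(roots), (1 + 17 ** 0.5) / 2)
```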
In general, for a square matrix A of order n, the equation det(A - λI_n) = 0 gives the eigenvalues of A. It is a polynomial equation in λ of degree n, so it cannot have more than n roots or solutions. Hence a square matrix A of order n has at most n eigenvalues.

Example. Consider the diagonal matrix

D = [ a  0  0  0
      0  b  0  0
      0  0  c  0
      0  0  0  d ]

Its characteristic polynomial is det(D - λI_n) = (a - λ)(b - λ)(c - λ)(d - λ), so the eigenvalues of D are a, b, c and d, i.e. the entries on the diagonal.

Computation of eigenvectors
Let A be a square matrix of order n and λ one of its eigenvalues, and let X be an eigenvector of A associated to λ. We must have AX = λX, or (A - λI_n)X = 0. This is a linear system whose coefficient matrix is A - λI_n. Since the zero vector is a solution, the system is consistent.

Remark. Note that if X is a vector which satisfies AX = λX, then the vector Y = cX (for any arbitrary number c) satisfies the same equation, i.e. AY = λY. In other words, if we know that X is an eigenvector, then cX is also an eigenvector associated to the same eigenvalue.

Example. Consider the matrix

A = [  1   2   1
       6  -1   0
      -1  -2  -1 ]

First we look for the eigenvalues of A. These are given by the characteristic equation det(A - λI_3) = 0. If we develop this determinant along the third column, we obtain -λ(λ² + λ - 12) = 0; by algebraic manipulations, we get -λ(λ + 4)(λ - 3) = 0, which implies that the eigenvalues of A are 0, -4 and 3.
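The diagonal-matrix fact is easy to confirm numerically; the entries a, b, c, d below are arbitrary stand-ins for the symbols in the example.

```python
import numpy as np

# Hypothetical diagonal entries standing in for a, b, c, d.
a, b, c, d = 2.0, -1.0, 5.0, 0.5
D = np.diag([a, b, c, d])

# det(D - l*I) = (a-l)(b-l)(c-l)(d-l), so the eigenvalues are the diagonal entries.
eigvals = np.linalg.eigvals(D)
assert np.allclose(sorted(eigvals), sorted([a, b, c, d]))
print(sorted(eigvals))
```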
Computation of eigenvectors: eigenvectors associated with the eigenvalues

1. Case λ = 0. The associated eigenvectors are given by the linear system AX = 0, which may be rewritten as

 x + 2y + z = 0
6x -  y     = 0
-x - 2y - z = 0

The third equation is just the first multiplied by -1. From the second equation we have y = 6x, so the first equation reduces to 13x + z = 0. This system is therefore equivalent to y = 6x, z = -13x, and the unknown vector X is given by

X = [x, y, z]^T = [x, 6x, -13x]^T = x·[1, 6, -13]^T.

Therefore, any eigenvector X of A associated to the eigenvalue 0 is given by X = c·[1, 6, -13]^T, where c is an arbitrary number.

2. Case λ = -4. The associated eigenvectors are given by the linear system AX = -4X, or (A + 4I)X = 0, which may be rewritten as

5x + 2y +  z = 0
6x + 3y      = 0
-x - 2y + 3z = 0

We use elementary operations to solve it. First we consider the augmented matrix [A + 4I | 0]. Then we use elementary row operations to reduce it to upper-triangular form: first interchange the rows so the third equation moves to the top, then use it to eliminate the 5 and the 6 in the first column.
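Solving (A - λI)X = 0 amounts to computing a null space; a minimal numpy sketch does this with the SVD (the right-singular vector for the near-zero singular value spans the null space) and checks the result against the hand-computed eigenvectors up to a scalar multiple c.

```python
import numpy as np

A = np.array([[ 1,  2,  1],
              [ 6, -1,  0],
              [-1, -2, -1]], dtype=float)

def eigenvector_for(A, lam):
    """Return a basis vector of the null space of (A - lam*I) via the SVD."""
    M = A - lam * np.eye(A.shape[0])
    _, _, Vt = np.linalg.svd(M)
    return Vt[-1]          # right-singular vector for the smallest singular value

for lam, reference in [(0.0, [1, 6, -13]), (-4.0, [-1, 2, 1]), (3.0, [2, 3, -2])]:
    v = eigenvector_for(A, lam)
    ref = np.array(reference, dtype=float)
    # v should be a scalar multiple of the hand-computed eigenvector.
    cos = abs(v @ ref) / (np.linalg.norm(v) * np.linalg.norm(ref))
    assert np.isclose(cos, 1.0)
    assert np.allclose(A @ v, lam * v, atol=1e-9)
```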
Continuing the row reduction: we clear the remaining entries from the second and third rows, then subtract the second row from the third to reach upper-triangular form. Next we set z = c. From the second row we get y = 2z = 2c, and the first row implies x = -2y + 3z = -c. Hence

X = [x, y, z]^T = [-c, 2c, c]^T = c·[-1, 2, 1]^T.

Therefore, any eigenvector X of A associated to the eigenvalue -4 is given by X = c·[-1, 2, 1]^T, where c is an arbitrary number.

3. Case λ = 3. Using similar ideas to those described above, one may easily show that any eigenvector X of A associated to the eigenvalue 3 is given by X = c·[2, 3, -2]^T, where c is an arbitrary number.

Summary. Let A be a square matrix and assume λ is an eigenvalue of A. In order to find the associated eigenvectors, we do the following steps:
1. Write down the associated linear system AX = λX, or (A - λI_n)X = 0.
2. Solve the system.
3. Rewrite the unknown vector X as a linear combination of known vectors.
Why eigenvectors and eigenvalues?
An eigenvector of a square matrix is a non-zero vector that, when multiplied by the matrix, yields a vector that differs from the original by at most a multiplicative scalar; that scalar is its eigenvalue. In a shear mapping, for example, the red arrow changes direction but the blue arrow along the shear axis does not: the blue arrow is an eigenvector, and since its length is unchanged its eigenvalue is 1.

PCs and variance; dimensionality reduction
- The first PC retains the greatest amount of variation in the sample; the k-th PC retains the k-th greatest fraction of the variation in the sample.
- The k-th largest eigenvalue of the covariance (correlation) matrix C is the variance in the sample along the k-th PC.
- We can ignore the components of lesser significance. You do lose some information, but if the eigenvalues are small, you don't lose much: with n dimensions in the original data, calculate n eigenvectors and eigenvalues, choose only the first p eigenvectors based on their eigenvalues, and the final data set has only p dimensions.
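Choosing p by eigenvalue size can be sketched as follows; the synthetic 5-dimensional data with decaying per-axis variance is a hypothetical example, and p is picked as the smallest number of components retaining 95% of the total variance.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical data: 500 samples in 5 dimensions with decaying variance per axis.
X = rng.normal(size=(500, 5)) * np.array([10.0, 5.0, 1.0, 0.5, 0.1])

C = np.cov(X, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(C))[::-1]   # descending

# Fraction of total variance retained by the first p components:
retained = np.cumsum(eigvals) / np.sum(eigvals)
p = int(np.searchsorted(retained, 0.95) + 1)
print(p)                                         # here the first 2 axes dominate
```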
PCA example, STEP 1
Subtract the mean from each of the data dimensions. This produces a data set whose mean is zero. Subtracting the mean makes the variance and covariance calculations easier by simplifying their equations; the variance and covariance values themselves are not affected by the mean value. The example works from a small table of DATA (x, y) pairs and the corresponding ZERO MEAN DATA (the same pairs with the mean subtracted).

PCA example, STEP 2
Calculate the covariance matrix. Variance measures how far a set of numbers is spread out; covariance provides a measure of the strength of the correlation between two or more sets of random variates. The covariance of two random variates X and Y, each with sample size n, is defined by the expectation value

cov(X, Y) = E[(X - μX)(Y - μY)],

estimated from a sample as cov(x, y) = Σ_i (x_i - x̄)(y_i - ȳ) / (n - 1).
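Steps 1 and 2 can be sketched directly; the 2-D data below is a hypothetical stand-in for the tutorial's table, and the hand-rolled covariance is checked against `np.cov`.

```python
import numpy as np

# Hypothetical correlated 2-D data standing in for the tutorial's (x, y) table.
rng = np.random.default_rng(2)
x = rng.normal(size=50)
y = 0.8 * x + rng.normal(scale=0.3, size=50)
data = np.column_stack([x, y])

# STEP 1: subtract the mean of each dimension -> zero-mean data.
zero_mean = data - data.mean(axis=0)
assert np.allclose(zero_mean.mean(axis=0), 0.0)

# STEP 2: covariance matrix, cov = sum((x_i - mean)(y_i - mean)) / (n - 1).
n = len(data)
cov = zero_mean.T @ zero_mean / (n - 1)
assert np.allclose(cov, np.cov(data, rowvar=False))
print(cov)
```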
PCA example, STEP 3
Calculate the eigenvectors and eigenvalues of the covariance matrix. When the eigenvectors are plotted as diagonal dotted lines over the data, note that they are perpendicular to each other, and that one of them goes through the middle of the points, like drawing a line of best fit. The second eigenvector gives us the other, less important, pattern in the data: all the points follow the main line, but are off to the side of the main line by some amount.

PCA example, STEP 4
Reduce dimensionality and form a feature vector. The eigenvector with the highest eigenvalue is the principal component of the data set; in our example, the eigenvector with the largest eigenvalue was the one that pointed down the middle of the data. Once eigenvectors are found from the covariance matrix, the next step is to order them by eigenvalue, highest to lowest. This gives you the components in order of significance. Now, if you like, you can decide to ignore the components of lesser significance. You do lose some information, but if the eigenvalues are small, you don't lose much: with n dimensions in your data, calculate n eigenvectors and eigenvalues, choose only the first p eigenvectors, and the final data set has only p dimensions.
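Steps 3 and 4 in numpy; the 2x2 covariance values are hypothetical placeholders for the tutorial's (lost) numbers, and `eigh` is used because a covariance matrix is symmetric.

```python
import numpy as np

# Hypothetical 2-D covariance matrix for illustration.
cov = np.array([[0.62, 0.61],
                [0.61, 0.72]])

# STEP 3: eigenvalues and eigenvectors of the (symmetric) covariance matrix.
eigvals, eigvecs = np.linalg.eigh(cov)      # eigh: ascending order, orthonormal vectors

# STEP 4: order components by eigenvalue, highest first; keep the top p.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

p = 1
feature_vector = eigvecs[:, :p]             # principal component(s) as columns
print(eigvals, feature_vector.shape)
assert eigvals[0] >= eigvals[1]
# The eigenvectors are orthonormal (perpendicular, unit length):
assert np.allclose(eigvecs.T @ eigvecs, np.eye(2))
```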
PCA example, STEP 4 (feature vector)

FeatureVector = (eig1 eig2 eig3 ... eign)

We can either form a feature vector with both of the eigenvectors, or we can choose to leave out the smaller, less significant component and keep only a single column.

PCA example, STEP 5
Deriving the new data:

FinalData = RowFeatureVector x RowZeroMeanData

RowFeatureVector is the matrix with the eigenvectors in its columns transposed, so that the eigenvectors are now in the rows, with the most significant eigenvector at the top. RowZeroMeanData is the mean-adjusted data transposed, i.e. the data items are in each column, with each row holding a separate dimension. In the transpose of FinalData, the dimensions (x, y) run along the columns.
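Step 5 can be sketched end to end; the data here is hypothetical, and keeping all eigenvectors makes the transform a pure rotation, so the zero-mean data can be recovered exactly.

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical 2-D data, samples as rows; then transposed to "dimensions in rows".
data = rng.normal(size=(20, 2)) @ np.array([[1.0, 0.6], [0.0, 0.4]])
row_zero_mean_data = (data - data.mean(axis=0)).T      # shape (2, 20)

cov = np.cov(row_zero_mean_data)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
row_feature_vector = eigvecs[:, order].T               # eigenvectors in rows, biggest first

# FinalData = RowFeatureVector x RowZeroMeanData
final_data = row_feature_vector @ row_zero_mean_data
print(final_data.shape)                                # (2, 20): dimensions along rows

# With all eigenvectors kept, the transform is a rotation and can be inverted:
recovered = row_feature_vector.T @ final_data
assert np.allclose(recovered, row_zero_mean_data)
```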
PCA algorithm
1. Get some data.
2. Subtract the mean.
3. Calculate the covariance matrix.
4. Calculate the eigenvectors and eigenvalues of the covariance matrix.
5. Choose components and form a feature vector.
6. Derive the new data set.

Why PCA? Maximum variance theory
In general, the variance of noise should be low and that of signal high. A common measure is the signal-to-noise ratio (SNR); a high SNR indicates high-precision data.

Projection. The dot product a·b = |a||b|cos(θ) relates two vectors through the angle between them, though it does not give us the angle directly. For a unit vector u, x^(i)T u is the length of the projection of x^(i) onto u (the distance from the projected point to the origin of coordinates). Since the data are zero-mean, the mean of the projections x^(i)T u is zero, and their variance is

var(x^(i)T u) = (1/m) Σ_{i=1..m} (x^(i)T u)² = u^T [ (1/m) Σ_{i=1..m} x^(i) x^(i)T ] u = u^T Cov u,

where Cov = (1/m) Σ_{i=1..m} x^(i) x^(i)T is the covariance matrix (since the mean is zero). Maximizing u^T Cov u subject to the constraint u^T u = 1 (via a Lagrange multiplier λ) gives

Cov u = λu,   and therefore   λ = var(x^(i)T u) = u^T Cov u.

We got it: λ is an eigenvalue of the matrix Cov and u is the corresponding eigenvector. The goal of PCA is to find a u for which the variance of all projection points is maximum, namely the eigenvector with the largest eigenvalue.
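The maximum-variance derivation can be verified numerically on hypothetical zero-mean 2-D data: the variance of the projections onto the top eigenvector equals the largest eigenvalue, and no other unit direction does better.

```python
import numpy as np

rng = np.random.default_rng(4)
# Hypothetical zero-mean data: m samples as columns of a 2 x m array.
m = 1000
X = np.array([[2.0, 0.0], [1.0, 0.5]]) @ rng.normal(size=(2, m))
X = X - X.mean(axis=1, keepdims=True)

Cov = (X @ X.T) / m                       # (1/m) sum of x^(i) x^(i)T
eigvals, eigvecs = np.linalg.eigh(Cov)
u = eigvecs[:, -1]                        # unit eigenvector of the largest eigenvalue

# Variance of the projections x^(i)T u equals u^T Cov u = lambda_max.
proj = u @ X
var_proj = np.mean(proj ** 2)
assert np.isclose(var_proj, eigvals[-1])
assert np.isclose(var_proj, u @ Cov @ u)

# No other unit vector gives larger projected variance (spot check over directions):
for theta in np.linspace(0, np.pi, 50):
    w = np.array([np.cos(theta), np.sin(theta)])
    assert np.mean((w @ X) ** 2) <= var_proj + 1e-12
```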
More informationFundamentals of Image Compression
Fundaentals of Iage Copression Iage Copression reduce the size of iage data file while retaining necessary inforation Original uncopressed Iage Copression (encoding) 01101 Decopression (decoding) Copressed
More informationNORMAL MATRIX POLYNOMIALS WITH NONSINGULAR LEADING COEFFICIENTS
NORMAL MATRIX POLYNOMIALS WITH NONSINGULAR LEADING COEFFICIENTS NIKOLAOS PAPATHANASIOU AND PANAYIOTIS PSARRAKOS Abstract. In this paper, we introduce the notions of weakly noral and noral atrix polynoials,
More informationPCA & ICA. CE-717: Machine Learning Sharif University of Technology Spring Soleymani
PCA & ICA CE-717: Machine Learning Sharif University of Technology Spring 2015 Soleymani Dimensionality Reduction: Feature Selection vs. Feature Extraction Feature selection Select a subset of a given
More informationOn Poset Merging. 1 Introduction. Peter Chen Guoli Ding Steve Seiden. Keywords: Merging, Partial Order, Lower Bounds. AMS Classification: 68W40
On Poset Merging Peter Chen Guoli Ding Steve Seiden Abstract We consider the follow poset erging proble: Let X and Y be two subsets of a partially ordered set S. Given coplete inforation about the ordering
More informationTesting equality of variances for multiple univariate normal populations
University of Wollongong Research Online Centre for Statistical & Survey Methodology Working Paper Series Faculty of Engineering and Inforation Sciences 0 esting equality of variances for ultiple univariate
More informationComputable Shell Decomposition Bounds
Coputable Shell Decoposition Bounds John Langford TTI-Chicago jcl@cs.cu.edu David McAllester TTI-Chicago dac@autoreason.co Editor: Leslie Pack Kaelbling and David Cohn Abstract Haussler, Kearns, Seung
More informationThe Fundamental Basis Theorem of Geometry from an algebraic point of view
Journal of Physics: Conference Series PAPER OPEN ACCESS The Fundaental Basis Theore of Geoetry fro an algebraic point of view To cite this article: U Bekbaev 2017 J Phys: Conf Ser 819 012013 View the article
More informationDetection and Estimation Theory
ESE 54 Detection and Estiation Theory Joseph A. O Sullivan Sauel C. Sachs Professor Electronic Systes and Signals Research Laboratory Electrical and Systes Engineering Washington University 11 Urbauer
More information13 Harmonic oscillator revisited: Dirac s approach and introduction to Second Quantization
3 Haronic oscillator revisited: Dirac s approach and introduction to Second Quantization. Dirac cae up with a ore elegant way to solve the haronic oscillator proble. We will now study this approach. The
More informationSlide10. Haykin Chapter 8: Principal Components Analysis. Motivation. Principal Component Analysis: Variance Probe
Slide10 Motivation Haykin Chapter 8: Principal Coponents Analysis 1.6 1.4 1.2 1 0.8 cloud.dat 0.6 CPSC 636-600 Instructor: Yoonsuck Choe Spring 2015 0.4 0.2 0 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 How can we
More informationPolygonal Designs: Existence and Construction
Polygonal Designs: Existence and Construction John Hegean Departent of Matheatics, Stanford University, Stanford, CA 9405 Jeff Langford Departent of Matheatics, Drake University, Des Moines, IA 5011 G
More informationMeasures of average are called measures of central tendency and include the mean, median, mode, and midrange.
CHAPTER 3 Data Description Objectives Suarize data using easures of central tendency, such as the ean, edian, ode, and idrange. Describe data using the easures of variation, such as the range, variance,
More informationSupplementary Material for Fast and Provable Algorithms for Spectrally Sparse Signal Reconstruction via Low-Rank Hankel Matrix Completion
Suppleentary Material for Fast and Provable Algoriths for Spectrally Sparse Signal Reconstruction via Low-Ran Hanel Matrix Copletion Jian-Feng Cai Tianing Wang Ke Wei March 1, 017 Abstract We establish
More informationFinite fields. and we ve used it in various examples and homework problems. In these notes I will introduce more finite fields
Finite fields I talked in class about the field with two eleents F 2 = {, } and we ve used it in various eaples and hoework probles. In these notes I will introduce ore finite fields F p = {,,...,p } for
More information. The univariate situation. It is well-known for a long tie that denoinators of Pade approxiants can be considered as orthogonal polynoials with respe
PROPERTIES OF MULTIVARIATE HOMOGENEOUS ORTHOGONAL POLYNOMIALS Brahi Benouahane y Annie Cuyt? Keywords Abstract It is well-known that the denoinators of Pade approxiants can be considered as orthogonal
More informationCharacterization of the Line Complexity of Cellular Automata Generated by Polynomial Transition Rules. Bertrand Stone
Characterization of the Line Coplexity of Cellular Autoata Generated by Polynoial Transition Rules Bertrand Stone Abstract Cellular autoata are discrete dynaical systes which consist of changing patterns
More informationGeneral Properties of Radiation Detectors Supplements
Phys. 649: Nuclear Techniques Physics Departent Yarouk University Chapter 4: General Properties of Radiation Detectors Suppleents Dr. Nidal M. Ershaidat Overview Phys. 649: Nuclear Techniques Physics Departent
More informationProblem Set 2. Chapter 1 Numerical:
Chapter 1 Nuerical: roble Set 16. The atoic radius of xenon is 18 p. Is that consistent with its b paraeter of 5.15 1 - L/ol? Hint: what is the volue of a ole of xenon atos and how does that copare to
More informationPrincipal Components Analysis (PCA)
Principal Components Analysis (PCA) Principal Components Analysis (PCA) a technique for finding patterns in data of high dimension Outline:. Eigenvectors and eigenvalues. PCA: a) Getting the data b) Centering
More informationA Bernstein-Markov Theorem for Normed Spaces
A Bernstein-Markov Theore for Nored Spaces Lawrence A. Harris Departent of Matheatics, University of Kentucky Lexington, Kentucky 40506-0027 Abstract Let X and Y be real nored linear spaces and let φ :
More informationForecasting Financial Indices: The Baltic Dry Indices
International Journal of Maritie, Trade & Econoic Issues pp. 109-130 Volue I, Issue (1), 2013 Forecasting Financial Indices: The Baltic Dry Indices Eleftherios I. Thalassinos 1, Mike P. Hanias 2, Panayiotis
More informationlecture 37: Linear Multistep Methods: Absolute Stability, Part I lecture 38: Linear Multistep Methods: Absolute Stability, Part II
lecture 37: Linear Multistep Methods: Absolute Stability, Part I lecture 3: Linear Multistep Methods: Absolute Stability, Part II 5.7 Linear ultistep ethods: absolute stability At this point, it ay well
More informationLecture 20 November 7, 2013
CS 229r: Algoriths for Big Data Fall 2013 Prof. Jelani Nelson Lecture 20 Noveber 7, 2013 Scribe: Yun Willia Yu 1 Introduction Today we re going to go through the analysis of atrix copletion. First though,
More informationMeasuring orbital angular momentum superpositions of light by mode transformation
CHAPTER 7 Measuring orbital angular oentu superpositions of light by ode transforation In chapter 6 we reported on a ethod for easuring orbital angular oentu (OAM) states of light based on the transforation
More informationOcean 420 Physical Processes in the Ocean Project 1: Hydrostatic Balance, Advection and Diffusion Answers
Ocean 40 Physical Processes in the Ocean Project 1: Hydrostatic Balance, Advection and Diffusion Answers 1. Hydrostatic Balance a) Set all of the levels on one of the coluns to the lowest possible density.
More informationAnalyzing Simulation Results
Analyzing Siulation Results Dr. John Mellor-Cruey Departent of Coputer Science Rice University johnc@cs.rice.edu COMP 528 Lecture 20 31 March 2005 Topics for Today Model verification Model validation Transient
More informationIn this chapter, we consider several graph-theoretic and probabilistic models
THREE ONE GRAPH-THEORETIC AND STATISTICAL MODELS 3.1 INTRODUCTION In this chapter, we consider several graph-theoretic and probabilistic odels for a social network, which we do under different assuptions
More informationDistributed Subgradient Methods for Multi-agent Optimization
1 Distributed Subgradient Methods for Multi-agent Optiization Angelia Nedić and Asuan Ozdaglar October 29, 2007 Abstract We study a distributed coputation odel for optiizing a su of convex objective functions
More informationExact tensor completion with sum-of-squares
Proceedings of Machine Learning Research vol 65:1 54, 2017 30th Annual Conference on Learning Theory Exact tensor copletion with su-of-squares Aaron Potechin Institute for Advanced Study, Princeton David
More information
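The PCA procedure described above (center the data, then project it onto orthogonal directions of maximum variance via a linear transformation Y = GᵀX) can be sketched in a few lines of NumPy. This is a minimal illustration, not a production implementation; the function name `pca` and the synthetic 3-D data are chosen here for demonstration only.

```python
import numpy as np

def pca(X, d):
    """Project n samples of dimension p onto the top-d principal components.

    X : (n, p) data matrix; d : target dimensionality (d <= p).
    Returns (Y, G), where the columns of G are the top-d eigenvectors of the
    sample covariance matrix and Y = X_centered @ G is the reduced data.
    """
    Xc = X - X.mean(axis=0)                 # center: PCs are directions of variance
    cov = np.cov(Xc, rowvar=False)          # p x p sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: symmetric matrix, ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:d]   # keep the d directions of largest variance
    G = eigvecs[:, order]                   # p x d transformation matrix
    return Xc @ G, G

# Example: 200 correlated 3-D points reduced to 2-D
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) @ np.array([[3.0, 0, 0], [1, 1, 0], [0.5, 0, 0.2]])
Y, G = pca(X, 2)
```

Because the columns of G are eigenvectors of the covariance matrix, the projected components in Y are uncorrelated and ordered by the fraction of total variance each retains, exactly as the slides describe.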