A powerful method for feature extraction and compression of electronic nose responses


Sensors and Actuators B 105 (2005)

A powerful method for feature extraction and compression of electronic nose responses

A. Leone (a), C. Distante (a,*), N. Ancona (b), K.C. Persaud (c), E. Stella (b), P. Siciliano (a)

(a) Istituto per la Microelettronica e Microsistemi, CNR, Sez. Lecce, via Monteroni, Lecce, Italy
(b) Istituto di Studi sui Sistemi Intelligenti per l'Automazione, CNR, via Amendola 166/5, Bari, Italy
(c) DIAS, UMIST, P.O. Box 88, Sackville Street, Manchester M60 1QD, UK

Received 30 April 2004; received in revised form 16 June 2004; accepted 21 June 2004. Available online 27 July 2004.

* Corresponding author. E-mail addresses: alessandro.leone@imm.cnr.it (A. Leone), cosimo.distante@imm.cnr.it (C. Distante), kcpersaud@umist.ac.uk (K.C. Persaud).

Abstract

This paper focuses on the problem of data representation for feature selection and extraction from 1D electronic nose signals. Whereas PCA-based signal representation is a problem-dependent method, we propose a novel approach based on frame theory, in which an over-complete dictionary of functions is used to find a near-optimal representation of any 1D signal under consideration. Feature selection is accomplished with an iterative method called matching pursuit, which selects from the dictionary the functions that most reduce the reconstruction error. The representation functions found in this way can be used for feature extraction or for signal compression. Classification of the selected features with a neural approach shows the high discriminatory power of the extracted features.

© 2004 Elsevier B.V. All rights reserved.

Keywords: Feature extraction; Electronic nose; Gabor functions; Frame theory; Matching pursuit

1. Introduction

Data representation is one of the main concerns in developing intelligent machines able to solve real-life problems. The choice of the most suitable representation for a particular application can significantly influence the overall performance of the machine, both in computational speed and in reliability. The goal of data representation is to find an appropriate set of coefficients (or features) that numerically describe the data. Such a representation involves some prior knowledge of the problem at hand and consists of linear or non-linear transformations of the original data (Fig. 1). A system that performs the representation task can be independent of the particular data, as in the case of Haar functions, or dependent on the data, as in the case of principal component analysis. In both cases, the extracted features relative to the original signal f characterize the signal itself. Usually these transformations involve the convolution of the original signal with appropriate kernel functions that often constitute a basis of the input space; in such a case, the representation of any signal in the given basis is unique. We are interested in studying systems of functions that are over-complete (not an orthogonal basis), in which the representation of each element of the space is not necessarily unique: the representation can be obtained in several ways starting from linearly dependent functions. As an example, think of a basis as a small dictionary of just a few thousand words. Any concept can be described using terms of this vocabulary, but at the expense of long sentences. On the other hand, with a very large dictionary of, say, 100,000 words, concepts can be described with much shorter sentences, in a more compact way, sometimes with a single word.
Moreover, and more importantly for our context, large dictionaries allow the same concept to be described with different sentences, according to the message we want to transmit to the audience or reader. In the same way, over-complete dictionaries allow a signal to be represented in many different ways and, in principle, among all the possible representations there is one suitable for the particular application at hand.

Fig. 1. Pseudo-code of the matching pursuit algorithm.

Traditional transform-based compression methods often use an orthogonal basis, with the objective of representing as much signal information as possible with a weighted sum of as few basis vectors as possible. The optimal transform depends on the statistics of the stochastic process that produced the signal. Often a Gaussian assumption is made, and the eigenvectors of the autocorrelation matrix of the steady-state responses are computed in order to minimize the average differential entropy [1]. If the process is not Gaussian, principal component analysis need not be the optimal transform, and it is a non-trivial task to find the optimal transform even when the statistics are known. Moreover, the signal is non-stationary and no fixed transform will be optimal in all signal regions. One way to overcome this problem is to use an over-complete set of vectors, a "frame", which spans the finite-dimensional input space. Introduced in the context of non-harmonic Fourier series, frames have recently become an interesting new trend in signal processing [2,3]. The advantage of using a frame rather than an orthogonal transform is the large number of available kernel functions, from which a small set can be selected whose linear combination matches the original signal rather well.

In this paper, we analyze e-nose signal representation in terms of frames in the context of odor detection and recognition, a problem which is receiving particular attention for its practical and commercial exploitation in different fields. In [4], an orthogonal decomposition of the signal is performed using wavelet functions; in that study, several feature extraction methods with complete dictionaries were compared by minimizing a generalization error. For approaching this problem we follow the route of learning from examples, in which the odor we want to detect is described by previously stored patterns. More specifically, signal detection can be seen as a classification problem, especially in the context of ambient air monitoring, where measurements are made in an uncontrolled environment. The general problem of learning from examples, and in particular classification, can be interpreted as the problem of approximating a multivariate function from sparse data [5] (sparsity, in the feature extraction context, refers to selecting a representation of a signal with only a few non-zero coefficients), where the data are in the form of (input, output) pairs. The optimal solution is the one minimizing the generalization error, called the risk [6], or equivalently maximizing the capacity of the machine to correctly classify never-before-seen patterns. The solution found by support vector machines (SVM) for two-class classification problems [7,8] is the hyperplane minimizing an upper bound of the risk, constituted by the sum of the empirical risk (the error on the training data) and a measure of the complexity of the hypothesis space (the set of functions the machine implements), expressed in terms of the VC-dimension. In this paper, we investigate the connections between the form of the representation and the generalization properties of the learning machines.

It is thus natural to ask how to select an appropriate basis for processing a particular class of signals. The decomposition coefficients of a signal in a basis define a representation that highlights particular signal properties.
For example, wavelet coefficients provide explicit information on the location and type of signal singularities. The problem is to find a criterion for selecting a basis that is intrinsically well adapted to represent a class of signals. Mathematical approximation theory suggests choosing a basis that can construct precise signal approximations with a linear combination of a small number of vectors selected inside the basis. These selected vectors can be interpreted as intrinsic signal structures. The signal is approximated, or represented, as a superposition of basic waveforms chosen from a dictionary of such waveforms so as to best match the signal. The goal is to obtain expansions that provide good synthesis for applications such as denoising, compression and feature extraction for classification purposes. The question we are interested in answering is: among all the possible representations of the data in terms of elements of over-complete dictionaries, is there a representation that provides the best generalization capacity of the learning machine? We investigate several representations of e-nose signals using, in combination, the method of frames proposed in [9] and the matching pursuit method introduced in [10]. Over-complete dictionaries are obtained by using translated and scaled versions of Haar and Gabor functions with different numbers of center frequencies.

2. Function representation and reconstruction

One of the main problems of computational intelligence is related to function representation and reconstruction.

Functions must be discretized so as to be implemented on a computer and, once a function is specified on the computer, the input is a representation of the function. Associated with each representation technique we must have a reconstruction method. These two operations enable us to move functions between the mathematical and the representation universes when necessary. As we will see, the reconstruction methods are directly related to the theory of function approximation.

Let us denote by S a space of sequences (c_j) of real or complex numbers (a sequence can be thought of as the set of coefficients obtained by projecting a function onto some space). The space S has a norm that allows us to compute distances between sequences of the space. A representation of a space of functions Λ is an operator F : Λ → S into some space of sequences, so that for a given function f ∈ Λ its representation F(f) is a sequence,

F(f) = (c_j) ∈ S.

F is called the representation operator. In general it is required that F preserves the norm or satisfies some stability condition. The most important examples of representations occur when the space of functions is a subspace of the space L²(R) of square integrable functions (finite energy),

L²(R) = { f : R → R ; ∫_R |f(t)|² dt < ∞ },   (1)

and the representation space S is the space l² of square summable sequences,

l² = { (c_j) ; ||c||² = Σ_j c_j² < ∞ }.   (2)

When the representation operator is invertible, we can reconstruct f from its representation sequence, f = F⁻¹((c_j)). In this case we have an exact representation, also called an ideal representation; a method to compute the inverse operator gives us the reconstruction equation. We should remark that in general invertibility is a very strong requirement for a representation operator; in fact, weaker conditions such as invertibility on the left suffice to obtain exact representations. Several representation/reconstruction methods are available in the literature, among them basis and frame representations.

2.1. Basis representation

A natural technique to obtain a representation of a space of functions consists in constructing a basis of the space. A set B = {e_j ; j ∈ Z} is a basis of a function space Λ if the vectors e_j are linearly independent and for each f ∈ Λ there exists a sequence (c_j) of numbers such that

f = Σ_j c_j e_j,   (3)

that is, lim_{n→∞} || f − Σ_{|j|≤n} c_j e_j || = 0. One can think of B as the basis obtained by extracting the principal components of the raw data, whose eigenvectors are linearly independent. If all the eigenvectors are kept in the reconstruction process, the entire signal is completely recovered. Projecting the function onto those eigenvectors, we obtain a sequence of coefficients (c_j) which constitutes a different representation of the original signal. A representation operator can then be defined by

F(f) = (c_j).   (4)

In order to guarantee uniqueness of the representation, we must require that the space of functions admits an inner product and is complete in the norm induced by the inner product. Such spaces are called Hilbert spaces, denoted by the symbol H.

Definition 1. A collection of functions {ϕ_j ; j ∈ Z} in a separable Hilbert space H is a complete orthonormal set if the following conditions are satisfied:

1. Normalization: ||ϕ_j|| = 1 for each j ∈ Z;
2. Orthogonality: ⟨ϕ_j, ϕ_k⟩ = 0 if j ≠ k;
3. Completeness: for all f ∈ H and any ε > 0 there exists N ≥ 0 such that

|| f − Σ_{j=−N}^{N} ⟨f, ϕ_j⟩ ϕ_j || < ε.   (5)

The third condition says that linear combinations of functions from the set can be used to approximate arbitrary functions of the space. Complete orthonormal sets are also called orthonormal bases of the Hilbert space. As is usually done with principal component analysis, we retain the first eigenvectors, which carry the largest amount of information, and discard the remaining eigenvectors whose corresponding eigenvalues are very small. If all the eigenvectors are chosen, the norm in Eq. (5) is zero; otherwise ε depends on the number of retained eigenvectors (components). The reconstruction of the original signal is given by

f = Σ_j ⟨f, ϕ_j⟩ ϕ_j.   (6)

It is easy to see that the orthogonality condition implies that the elements ϕ_j are linearly independent, so the representation sequence (⟨f, ϕ_j⟩) is uniquely determined by f.
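As a concrete instance of Definition 1 and Eqs. (5)-(6) in the PCA setting just described, the following sketch (our own illustration, not code from the paper; NumPy is assumed) builds an orthonormal set from the eigenvectors of a sample covariance matrix, keeps only the leading components, and shows how the reconstruction error ε shrinks as more eigenvectors are retained.

```python
import numpy as np

def pca_basis(X):
    """Rows of X are signals. Returns the orthonormal eigenvectors of the
    sample covariance, sorted by decreasing eigenvalue (principal components)."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (len(X) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)          # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    return eigvecs[:, order]                        # columns phi_j, orthonormal

def project_reconstruct(f, Phi, k):
    """Keep the first k components: f_hat = sum_j <f, phi_j> phi_j (Eq. (6) truncated)."""
    coeffs = Phi[:, :k].T @ f                       # c_j = <f, phi_j>
    return Phi[:, :k] @ coeffs

# epsilon in Eq. (5) decreases as more eigenvectors are retained
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 16))
Phi = pca_basis(X)
f = X[0] - X.mean(axis=0)
for k in (2, 8, 16):
    print(k, np.linalg.norm(f - project_reconstruct(f, Phi, k)))
```

With k equal to the signal dimension the basis is complete and the error is (numerically) zero, mirroring the completeness condition above.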

2.2. Frame theory

The basic idea of representing a function in a basis consists in decomposing it using a countable set of simpler functions. The existence of a complete orthonormal set, and its construction, is in general a very difficult task; on the other hand, orthonormal representations are too restrictive and rigid. It is therefore important to obtain collections of functions {ϕ_j ; j ∈ Z} which do not necessarily constitute an orthonormal set, or are not linearly independent, but can still be used to define a representation operator. One such collection is constituted by frames.

Definition 2. A collection of functions {ϕ_j ; j ∈ Z} is a frame if there exist constants A and B satisfying 0 < A ≤ B < +∞ such that for all f ∈ H we have

A ||f||² ≤ Σ_j |⟨f, ϕ_j⟩|² ≤ B ||f||².   (7)

The constants A and B are called frame bounds. When A = B we say that the frame is tight, and for all f ∈ H

Σ_j |⟨f, ϕ_j⟩|² = A ||f||².   (8)

Moreover, a tight frame with A = 1 whose elements have unit norm is an orthonormal basis. Let c_j = ⟨f, ϕ_j⟩ for all j ∈ Z; if the system {ϕ_j ; j ∈ Z} is an orthonormal basis, then ⟨ϕ_j, ϕ_k⟩ = δ_jk, where δ_jk is the Kronecker symbol.

Theorem 3. If B = {ϕ_j ; j ∈ Z} is an orthonormal basis, then B is a tight frame with A = 1.

Proof. Since B is an orthonormal basis, ||ϕ_j|| = 1 for all j and

||f||² = ⟨f, f⟩ = ⟨ Σ_j c_j ϕ_j , Σ_k c_k ϕ_k ⟩ = Σ_j Σ_k c_j c_k ⟨ϕ_j, ϕ_k⟩ = Σ_k c_k² = Σ_k |⟨f, ϕ_k⟩|²,

so (8) holds with A = 1.

Expression (6) shows that, in the case of an orthonormal basis, any f can be expressed as a linear combination of the ϕ_j, with coefficients obtained by projecting f onto the same elements of the basis. Let us analyse this property when {ϕ_j ; j ∈ Z} is a tight frame. Using the polarization identity and Eq. (8), for all g ∈ H we have

⟨f, g⟩ = (1/A) Σ_j ⟨f, ϕ_j⟩ ⟨ϕ_j, g⟩.   (9)

It follows that, in the weak sense,

f = (1/A) Σ_j ⟨f, ϕ_j⟩ ϕ_j.   (10)

The representation of f in terms of the elements of a tight frame, given by Eq. (10), is very similar to the one in terms of the elements of an orthonormal basis given by (6). However, it is important to underline that frames, even tight frames, are not orthonormal bases. Expression (10) shows how we can obtain approximations of a function f using frames. In fact, it motivates the definition of a new representation operator F, also called the frame operator, analogous to the one defined for orthonormal bases:

F(f) = (c_j)  with  c_j = ⟨f, ϕ_j⟩.   (11)

We should remark that this operator is in general not invertible. Note that F takes as input a function f ∈ H and returns a sequence c ∈ l², that is, by (2), a set of coefficients (features). The j-th component c_j = (Ff)_j is the projection of the function f onto the j-th element ϕ_j of the frame. Moreover, due to the linearity of the scalar product, F is a linear operator. By the definition of frame (7) and by the definition of the frame operator, we have

A ||f||² ≤ Σ_j ((Ff)_j)² ≤ B ||f||².   (12)

Since Σ_j ((Ff)_j)² = Σ_j c_j² = ||c||², we can conclude that ||Ff||² ≤ B ||f||², which implies that the frame operator F is bounded and hence continuous. For this type of operator we can define the adjoint operator F* : l² → H, which takes as input a sequence c ∈ l² and returns a function of H, defined by

⟨F*c, f⟩ = ⟨c, Ff⟩  for all c ∈ l² and all f ∈ H.

Using the definition of the frame operator (11) we have

⟨F*c, f⟩ = Σ_j c_j (Ff)_j = Σ_j c_j ⟨f, ϕ_j⟩ = ⟨ Σ_j c_j ϕ_j , f ⟩,

so we can conclude that

F*c = Σ_j c_j ϕ_j.

This formula makes evident the role of the adjoint operator F*: it takes as input a sequence c ∈ l² and returns the function of H obtained as the linear combination of the frame elements ϕ_j with coefficients c_j. Finally, note that the operators F and F* have the same norm, and since F is bounded, its adjoint F* is also bounded. Let us now consider c = Ff and analyse the adjoint operator F* in matrix terms.
By definition, F* is such that ⟨F*c, f⟩ = ⟨c, Ff⟩. In finite-dimensional spaces (treated in detail in Section 3) we know that ⟨u, v⟩ = u^T v, so by the definition of the adjoint operator we have (F*c)^T f = c^T (Ff), i.e. c^T (F*)^T f = c^T Ff for all c and f. From this equality it follows that (F*)^T = F, and we can conclude that F* = F^T: the adjoint of the frame operator F is the transpose of the matrix F.

By the definition of the frame operator (11) we have

Σ_j |⟨f, ϕ_j⟩|² = Σ_j ((Ff)_j)² = ||Ff||² = ⟨Ff, Ff⟩ = ⟨c, Ff⟩.

Using the definition of the adjoint operator,

Σ_j |⟨f, ϕ_j⟩|² = ⟨F*c, f⟩ = ⟨F*Ff, f⟩,

and the frame condition (7) becomes

A ⟨f, f⟩ ≤ ⟨F*Ff, f⟩ ≤ B ⟨f, f⟩.

The operator F*F is self-adjoint (a bounded linear operator T mapping a Hilbert space H into itself is said to be self-adjoint if T* = T), because (F*F)* = F*(F*)* = F*F, and then the following holds true:

A·Id ≤ F*F ≤ B·Id,

where Id f = f is the identity operator. Moreover, F*F is a bounded operator and, since A > 0, positive definite; it is therefore invertible, and the inverse operator (F*F)⁻¹ is bounded as follows:

B⁻¹·Id ≤ (F*F)⁻¹ ≤ A⁻¹·Id.

We can apply the inverse operator (F*F)⁻¹ to the original frame {ϕ_j}, obtaining a new family

ϕ̃_j = (F*F)⁻¹ ϕ_j.   (13)

The family {ϕ̃_j} constitutes a frame, called the dual frame, characterized by the following theorem.

Theorem 4. Let {ϕ_j} be a frame with bounds A, B. The dual frame defined by ϕ̃_j = (F*F)⁻¹ ϕ_j satisfies, for all f ∈ H,

(1/B) ||f||² ≤ Σ_j |⟨f, ϕ̃_j⟩|² ≤ (1/A) ||f||²

and

f = Σ_j ⟨f, ϕ_j⟩ ϕ̃_j = Σ_j ⟨f, ϕ̃_j⟩ ϕ_j.

If the frame is tight (i.e. A = B) then ϕ̃_j = A⁻¹ ϕ_j.

Theorem 4 makes evident the fact that, if we want to reconstruct f from the coefficients ⟨f, ϕ_j⟩, we must use the dual frame of {ϕ_j}, that is

f = Σ_j ⟨f, ϕ_j⟩ ϕ̃_j.   (14)

In conclusion, when we have a frame {ϕ_j}, the only thing we need to do in order to apply (14) is to compute the dual frame ϕ̃_j = (F*F)⁻¹ ϕ_j.

3. Finite dimensional frames

In this section we analyze frames in the particular cases, useful for practical applications, of H = R^n and H = C^n. To this aim, consider a family (ϕ_j)_{j=1}^{l} of vectors of R^n. The family is called a dictionary [11] and its elements are called atoms. By the definition of frame, the family of vectors (ϕ_j)_{j=1}^{l} constitutes a frame if there exist two constants A > 0 and B < ∞ such that, for all f ∈ R^n,

A ||f||² ≤ Σ_{j=1}^{l} |⟨f, ϕ_j⟩|² ≤ B ||f||².   (15)

Let F be the l × n matrix having the frame vectors as its rows. The matrix F is the frame operator, F : R^n → R^l. Let c ∈ R^l be the vector obtained when we apply the frame operator F to the vector f ∈ R^n, that is c = Ff. Then

c_j = (Ff)_j = ⟨f, ϕ_j⟩  for j = 1, 2, ..., l.

Using the definition of the frame operator F, (15) becomes

A ||f||² ≤ ||Ff||² ≤ B ||f||²,   (16)

and these inequalities hold for all f ∈ R^n, so F is a bounded operator. Consider the adjoint operator F* of the frame operator. We have already seen that the adjoint operator coincides with the transpose of F, that is F* = F^T. The operator F^T : R^l → R^n is the n × l matrix such that, for all f ∈ R^n and c ∈ R^l, ⟨F^T c, f⟩ = ⟨c, Ff⟩. Let us analyze in detail how F^T works. Substituting into the definition of the adjoint operator, we have

⟨F^T c, f⟩ = Σ_{j=1}^{l} c_j (Ff)_j = Σ_{j=1}^{l} c_j ⟨ϕ_j, f⟩.

This implies that

F^T c = Σ_{j=1}^{l} c_j ϕ_j.

This equality shows that F^T associates to a vector c ∈ R^l the vector of R^n equal to the linear combination of the frame elements ϕ_j with coefficients c_1, c_2, ..., c_l. Rewriting the inequalities in (16) explicitly in terms of scalar products,

A ⟨f, f⟩ ≤ ⟨Ff, Ff⟩ ≤ B ⟨f, f⟩,

and using the definition of the adjoint operator,

A ⟨f, f⟩ ≤ ⟨F^T F f, f⟩ ≤ B ⟨f, f⟩,   (17)

where the matrix F^T F is symmetric. By the definition of frame we know that A > 0, so the matrix F^T F is positive definite, that is, for all f ≠ 0 we have f^T F^T F f > 0. Then the matrix F^T F is invertible and we can compute the vectors of the dual frame (ϕ̃_j)_{j=1}^{l} using formula (13), that is ϕ̃_j = (F^T F)⁻¹ ϕ_j. Moreover, the vector f can be recovered from the coefficients c = Ff, where c_j = ⟨f, ϕ_j⟩, using formula (14), that is

f = Σ_{j=1}^{l} c_j ϕ̃_j.

Using the definition of the dual frame,

f = Σ_{j=1}^{l} c_j (F^T F)⁻¹ ϕ_j,

and noting that here the ϕ_j are column vectors while they are row vectors in the frame operator F, we can write

f = F†c,   (18)

where F† = (F^T F)⁻¹ F^T is the pseudoinverse of F. Formula (18) makes evident the difference between the frame operator F and the operator F†. F, also called the analysis operator, associates a vector of coefficients c (features) to a signal f, i.e. Ff = c, decomposing (projecting) the signal through the atoms of the dictionary; this operation involves an l × n matrix. On the contrary F†, also called the synthesis operator, builds up a signal f as a superposition of the atoms of the dual dictionary weighted by the coefficients c, i.e. f = F†c; this operation involves an n × l matrix. Note that the columns of F† are the elements of the dual frame.

So far we have seen, given a frame (ϕ_j)_{j=1}^{l} of vectors of R^n, how to decompose a signal using the frame elements and how to reconstruct it using the dual frame elements. It remains to understand when a given family of vectors constitutes a frame and how to compute the frame bounds. Using the definition of the scalar product between elements of R^n, ⟨f, v⟩ = f^T v, from (17) it follows that

A f^T f ≤ f^T F^T F f ≤ B f^T f.

For f ≠ 0 we have

A ≤ (f^T F^T F f)/(f^T f) ≤ B.

The central term of this inequality is known as the Rayleigh quotient R(f) of the matrix F^T F. It can be shown [12] that, if λ_min and λ_max are respectively the minimum and maximum eigenvalues of the symmetric matrix F^T F, then λ_min ≤ R(f) ≤ λ_max, i.e. the eigenvalues of F^T F lie in the interval [A, B]. Since any A and B with 0 < A ≤ λ_min and λ_max ≤ B < ∞ can be used as frame bounds, we choose A as large as possible and B as small as possible, which is equivalent to taking the minimum and maximum eigenvalues of F^T F as frame bounds. Note that in the case of a tight frame all the eigenvalues of F^T F coincide. We therefore have a practical recipe for establishing whether a system of vectors (ϕ_j)_{j=1}^{l} is a frame: build the matrix F having the vectors ϕ_j as rows and compute the minimum and maximum eigenvalues of F^T F. If λ_min > 0 then the system (ϕ_j)_{j=1}^{l} is a frame, with frame bounds λ_min and λ_max. Finally, note that the redundancy ratio B/A coincides with the condition number λ_max/λ_min of F^T F.
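The recipe above translates directly into a few lines of linear algebra. The sketch below is an illustration, not code from the paper (names such as frame_bounds are ours); it builds the frame matrix from a list of atoms, checks the frame condition via the eigenvalues of F^T F, and reconstructs a signal through the pseudoinverse as in Eq. (18), assuming NumPy.

```python
import numpy as np

def frame_bounds(atoms):
    """atoms: l row vectors of length n (the dictionary).
    Returns (lambda_min, lambda_max), the tightest frame bounds."""
    F = np.asarray(atoms, dtype=float)          # l x n analysis operator
    eigvals = np.linalg.eigvalsh(F.T @ F)       # eigenvalues of the symmetric matrix F^T F
    return eigvals[0], eigvals[-1]

def analyze_synthesize(atoms, f):
    """Decompose f with the frame (c = F f) and reconstruct it with the
    pseudoinverse F_dagger = (F^T F)^{-1} F^T, i.e. f_hat = F_dagger c (Eq. (18))."""
    F = np.asarray(atoms, dtype=float)
    c = F @ f                                   # analysis: coefficients (features)
    F_dag = np.linalg.pinv(F)                   # synthesis operator (columns = dual frame)
    return c, F_dag @ c

# Example: a redundant frame of R^2 made of three unit vectors
atoms = [[1, 0], [0, 1], [1 / np.sqrt(2), 1 / np.sqrt(2)]]
A, B = frame_bounds(atoms)
print("frame bounds:", A, B)                    # A > 0 => the family is a frame
f = np.array([2.0, -1.0])
c, f_hat = analyze_synthesize(atoms, f)
print("reconstruction error:", np.linalg.norm(f - f_hat))
```

The redundancy ratio B/A printed for this toy dictionary is the condition number of F^T F, as noted above.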
3.1. Frames and economic representations

We have seen that, in the case of frames, the representation of a signal f in terms of the frame elements is not unique. Once we have computed the coefficients c by projecting the signal onto the frame elements, there is a unique way to recover the signal f, namely by using the elements of the dual frame (14). Now the question is: among all possible representations of the signal f in terms of the frame elements, what properties do the coefficients c have? In [3] it is shown (see also [11]) that, among all representations of a signal, the method of frames selects the one whose coefficients have minimum l² norm. This is equivalent to saying that the vector c is the solution of the following problem:

min_c  (1/2) ||c||²   subject to   Fc = f,

where here, for convenience, the frame elements are the columns of the matrix F; this is equivalent to starting from the dual frame. In other words, to represent the signal f using the elements of the dictionary F, the method of frames selects the minimum-length solution. Finally, note that the method of frames is not sparsity preserving. Suppose the signal is just an element of the dictionary, so that c should have only one non-zero component: on the contrary, every atom having a non-zero scalar product with the signal contributes to the solution, so the corresponding component of c is different from zero.
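A quick numerical illustration of this last point (our own sketch, not from the paper; NumPy assumed): take a signal equal to a single atom of a redundant dictionary and compute the minimum-norm coefficients with the pseudoinverse. The solution spreads over every correlated atom instead of concentrating on one.

```python
import numpy as np

# Dictionary with atoms as columns, as in the minimization problem above
F = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
F[:, 2] /= np.linalg.norm(F[:, 2])      # normalize the redundant third atom

f = F[:, 2]                             # the signal IS the third atom
c = np.linalg.pinv(F) @ f               # method-of-frames (minimum l2-norm) coefficients
print(c)                                # all three components are non-zero: not sparse
```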

4. An overview of the matching pursuit (MP) method

Matching pursuit (MP) is a greedy algorithm which, rather than solving the optimal approximation problem, progressively refines the signal approximation with an iterative procedure. In other words, MP is a non-linear algorithm that decomposes any signal into a linear expansion of waveforms that, generally, belong to a dictionary D of functions. At each step, MP selects the atom of the dictionary D which best reduces the residual between the current approximation of the signal and the signal itself. The best-adapted decomposition is selected by the following greedy strategy; with MP we are interested in finding a subset of D that best matches the given signal f.

Let us consider a dictionary D of atoms {ϕ_j}_{j=1}^{L} such that D is a frame with L atoms, and let u denote the residual of the signal f to be analysed. The first step of MP approximates f by projecting it onto a vector ϕ_{j0} ∈ D, so that

f = ⟨f, ϕ_{j0}⟩ ϕ_{j0} + u.   (19)

Since the residual u is orthogonal to ϕ_{j0} (rewriting Eq. (19) as u = f − ⟨f, ϕ_{j0}⟩ ϕ_{j0}, we see that the component of f in the direction of ϕ_{j0} has been removed), we have

||f||² = |⟨f, ϕ_{j0}⟩|² + ||u||².

The objective is to minimize the residual u, hence we choose the atom ϕ_{j0} that maximizes |⟨f, ϕ_j⟩|. Since we are working in finite dimensions, we choose ϕ_{j0} such that

|⟨f, ϕ_{j0}⟩| = max_{ϕ_j ∈ D} |⟨f, ϕ_j⟩|.   (20)

Suppose we have already computed the residual u_k at the k-th step. We choose ϕ_{jk} ∈ D such that

|⟨u_k, ϕ_{jk}⟩| = max_{ϕ_j ∈ D} |⟨u_k, ϕ_j⟩|,

and, projecting u_k onto ϕ_{jk},

u_{k+1} = u_k − ⟨u_k, ϕ_{jk}⟩ ϕ_{jk}.

In other words, at the k-th step the algorithm selects the dictionary atom that best correlates with the residual (minimizing the reconstruction error) and adds the selected atom ϕ_{jk} to the current approximation. The orthogonality of u_{k+1} with ϕ_{jk} implies that

||u_{k+1}||² = ||u_k||² − |⟨u_k, ϕ_{jk}⟩|².

Therefore, at a generic n-th step the original signal f can be expressed as

f = Σ_{k=0}^{n−1} ⟨u_k, ϕ_{jk}⟩ ϕ_{jk} + u_n.   (21)

Similarly, for k between 0 and n−1,

||f||² = Σ_{k=0}^{n−1} |⟨u_k, ϕ_{jk}⟩|² + ||u_n||².   (22)

The residual u_n is the approximation error of f after choosing n vectors of the dictionary, and the energy of this error is given by Eq. (22). For any f ∈ H, the convergence of the error to zero is a consequence of a theorem proved by Jones [13], i.e. lim_{n→∞} ||u_n|| = 0. Hence

f = Σ_{k=0}^{∞} ⟨u_k, ϕ_{jk}⟩ ϕ_{jk}   and   ||f||² = Σ_{k=0}^{∞} |⟨u_k, ϕ_{jk}⟩|².

In finite dimensions the convergence is exponential. From Eq. (21) it follows that MP represents the signal f as a linear combination of the dictionary atoms, with coefficients computed by minimizing the residual at each step. In general the algorithm stops when the residual is smaller than a given threshold or when the number of iterations equals the desired number of atoms. If we stop the algorithm after a few steps (high threshold), MP provides an approximation of the signal with only a few atoms and the reconstruction error can be large (low computational time). When the algorithm selects a larger number of atoms (low threshold), it provides a better approximation of the signal (i.e. a lower reconstruction error), but the computational time will be higher. The approximations derived from MP can be refined by orthogonalizing the directions of projection.
The resulting orthogonal pursuit converges in a finite number of iterations in finite-dimensional spaces, which is not the case for a non-orthogonal MP. The vector selected at each iteration by the MP algorithm is not, in general, orthogonal to the previously selected vectors: at iteration p the selected atom ϕ_{jp} is not necessarily orthogonal to the p atoms previously selected (ϕ_{j0}, ϕ_{j1}, ..., ϕ_{j(p−1)}). In subtracting the projection of the residual onto ϕ_{jp}, the algorithm reintroduces new components in the directions of the previously selected atoms. This can be avoided by orthogonalizing the selected atoms {ϕ_{jk}}_{k=0}^{p} with a Gram-Schmidt procedure. It has been proved in [14] that the residual of an orthogonal pursuit converges strongly to zero and that the number of iterations required for convergence is less than or equal to the dimension of the space H. Thus, in finite-dimensional spaces, orthogonal MP is guaranteed to converge in a finite number of steps, unlike non-orthogonal pursuits.
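As an illustration of the greedy loop described above (a minimal sketch under our own naming, not the authors' implementation; NumPy assumed), the following function runs plain, non-orthogonal matching pursuit over a dictionary whose atoms are the rows of a matrix, stopping on a residual threshold or a maximum number of atoms.

```python
import numpy as np

def matching_pursuit(D, f, n_atoms=10, tol=1e-6):
    """Greedy MP decomposition of f over the dictionary D (unit-norm atoms as rows).
    Returns the selected atom indices and their coefficients <u_k, phi_jk>."""
    residual = f.astype(float).copy()
    indices, coeffs = [], []
    for _ in range(n_atoms):
        correlations = D @ residual                 # <u_k, phi_j> for every atom
        jk = int(np.argmax(np.abs(correlations)))   # best-matching atom (Eq. (20))
        ck = correlations[jk]
        indices.append(jk)
        coeffs.append(ck)
        residual = residual - ck * D[jk]            # u_{k+1} = u_k - <u_k, phi_jk> phi_jk
        if np.linalg.norm(residual) < tol:          # stop when the residual is small enough
            break
    return np.array(indices), np.array(coeffs)

def mp_reconstruct(D, indices, coeffs):
    """Approximation f_hat = sum_k c_k phi_jk (Eq. (21) without the residual)."""
    return coeffs @ D[indices]
```

With unit-norm atoms this reproduces the analysis step used later in Section 6.3; an orthogonal variant would, in addition, re-project the signal onto the span of the atoms selected so far at every step.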

5. 1D Gabor functions

A one-dimensional (1D) Gabor function h_{ω0,σ}(t) is a complex function given by the following expression:

h_{ω0,σ}(t) = g_σ(t) f_{ω0}(t),   (23)

where g_σ(t) is a normalized Gaussian function with standard deviation σ,

g_σ(t) = (1/(√(2π) σ)) e^{−t²/(2σ²)},   (24)

and

f_{ω0}(t) = e^{jω0 t}   (25)

is a complex exponential with angular frequency ω0 (recall that ω0 = 2πf0). The real and imaginary components of a Gabor function are given by (see Fig. 2)

h^r_{ω0,σ}(t) = g_σ(t) cos(ω0 t),   (26)

h^i_{ω0,σ}(t) = g_σ(t) sin(ω0 t).   (27)

From Eq. (23) we see that a Gabor function is the product in time of a Gaussian function and a complex exponential. Then, by the convolution theorem, in the frequency domain we have

H_{σ,ω0}(ω) = (G_σ * F_{ω0})(ω).   (28)

It is simple to see that the Fourier transform of a Gabor function is a Gaussian function shifted in the frequency domain, centered on the frequency ω0 and having maximum amplitude 2π:

H_{σ,ω0}(ω) = 2π e^{−σ²(ω−ω0)²/2}.   (29)

Moreover, H_{σ,ω0}(ω) is a real function and represents a band-pass filter. Note that, since the Gaussian function is not band-limited (it is a strictly positive function), there is no frequency ω_c such that H_{σ,ω0}(ω) = 0 for |ω| > ω_c. We therefore determine the half-peak bandwidth of the filter, i.e. the cut-off frequencies ω_c at which the filter response falls to half of its peak value:

H_{σ,ω0}(ω_c) = (1/2) max_ω H_{σ,ω0}(ω).   (30)

Using Eq. (29), we obtain

ω_c = ω0 ± √(2 ln 2)/σ.   (31)

Therefore, the half-peak bandwidth of a Gabor function with standard deviation σ is

B = 2 √(2 ln 2)/σ,   (32)

showing that the filter bandwidth B is proportional to the inverse of σ.

Fig. 2. Gabor function: (top) real part, (middle) imaginary part and (bottom) Fourier transform.
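To make Eqs. (23)-(27) concrete, here is a small sketch (our own, NumPy assumed) that samples a 1D Gabor atom on n points and returns its real and imaginary parts; the unit-norm normalization is a common choice for MP dictionary atoms and is an assumption of ours, not stated in the paper.

```python
import numpy as np

def gabor_atom(n, f0, sigma, t0=0.0, normalize=True):
    """Sample h_{omega0,sigma}(t - t0) on n points (Eqs. (23)-(27)).
    f0 is the center frequency in cycles/sample (omega0 = 2*pi*f0)."""
    t = np.arange(n) - t0
    gauss = np.exp(-t**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)
    atom = gauss * np.exp(1j * 2.0 * np.pi * f0 * t)   # complex Gabor function
    if normalize:                                      # unit-norm atoms (our assumption)
        atom = atom / np.linalg.norm(atom)
    return atom.real, atom.imag

# Example: one atom of a 256-sample dictionary, centered at t0 = 128
re, im = gabor_atom(256, f0=0.05, sigma=20.0, t0=128.0)
```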

Now, let f0 be the center frequency of a generic Gabor filter in a bank, where each filter is an appropriate Gabor function with a passing bandwidth of B octaves. The value of σ0 in the filter bank is

σ0 = ((2^B + 1) √(2 ln 2)) / ((2^B − 1) 2π f0).   (33)

Note that each filter of the bank is a B-octave filter, where the passing bandwidth of the particular filter depends on its center frequency f0; in particular, the σ0 of an octave Gabor filter is proportional to the inverse of the center frequency f0. Moreover, denoting by f_ci the center frequency of the interval [f_i, f_{i+1}] for i = 0, 1, ..., n−1, the center frequencies of the subintervals satisfy the relation

f_ci = 2^B f_c(i−1)  for i = 1, 2, ..., n−1.   (34)

6. Experimental results

We show results relative to the analysis and synthesis of 1D real signals derived from an e-nose. In the one-dimensional case, each atom of the dictionary has length equal to the length of the 1D signal we want to analyze (we call n the length of the 1D signal). The typical signal f is composed of n = 256 samples (n = 2^8 = 256). Fig. 3 shows a typical 1D signal as the response of an e-nose.

Fig. 3. Typical e-nose response.

6.1. Haar dictionary

The first dictionary we considered is an over-complete Haar dictionary, obtained by scaling and translating the Haar function with shift parameter t = 1, so that for each value of the scaling parameter n vectors are added to the dictionary. The elements of the Haar basis were obtained as impulse responses of the Haar synthesis filters. Fig. 4 shows some atoms of a Haar dictionary used in the experiments (see Table 1), generated using t = 1 and scaling parameters taken as powers of 2, from 2^0 to 2^8. Table 1 reports the results obtained for the generated Haar dictionary, where λ_min and λ_max are the minimum and maximum eigenvalues of the l × n matrix F (the frame dictionary) computed using the SVD, l is the number of atoms (the rows of F, i.e. the number of elements of the dictionary), and e = ||f − f̂|| is the error between the input signal f and the recovered signal f̂.

Fig. 4. Some atoms of the Haar dictionary.

Table 1
Haar dictionary results

t: 1
l: 2049
λ_min: 1.0
λ_max: 16.0
e: 4.3e−12

6.2. Gabor dictionary

We performed the same analysis and synthesis experiments using a time- and frequency-translation invariant Gabor dictionary [10], obtained by scaling, translating and modulating a Gaussian window. Gaussian windows are used because of their optimal time-frequency energy concentration, as established by the Heisenberg uncertainty principle. In the generation of the Gabor dictionary, for each value of the parameters ω0 and σ in (23), h_{ω0,σ} is sampled on n points and all the translated versions are considered as elements (atoms) of the dictionary; in other words, for each pair of parameters, n vectors are added to the dictionary. All the center frequencies f0 = ω0/2π are in the range [0, 1/2], where 1/2 is the Nyquist critical frequency. Note that the atoms of the Gabor dictionary are complex, being modulated versions (e^{jω0t}) of the Gaussian window; therefore, each atom has a real part and an imaginary part. In Table 2 we report the results obtained for different values of the parameters, where f0 indicates the number of center frequencies used for the generation of the Gabor dictionary. The frequencies were obtained by spacing the range [0, 1/2] so that f_ci = 2^B f_c(i−1).
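A possible way to assemble such a dictionary is sketched below (our own illustration with assumed parameter choices, reusing the gabor_atom helper from the previous sketch; the grids actually used by the authors are those summarized in Table 2): loop over center frequencies related by Eq. (34), derive each σ0 from Eq. (33), and add every translate.

```python
import numpy as np

def gabor_dictionary(n=256, num_freqs=4, B=1.0):
    """Over-complete Gabor dictionary: for each center frequency f0, sigma0 follows
    Eq. (33) and all n translates are added (real and imaginary parts as atoms)."""
    atoms = []
    f0 = 0.5 / 2**num_freqs                     # lowest center frequency; assumed choice
    for _ in range(num_freqs):
        sigma0 = (2**B + 1) * np.sqrt(2 * np.log(2)) / ((2**B - 1) * 2 * np.pi * f0)
        for t0 in range(n):                     # every translation is a new atom
            re, im = gabor_atom(n, f0, sigma0, t0=float(t0))
            atoms.append(re)
            atoms.append(im)
        f0 = f0 * 2**B                          # successive center frequencies, Eq. (34)
    return np.vstack(atoms)                     # l x n frame matrix (rows = atoms)

D = gabor_dictionary()
lam = np.linalg.eigvalsh(D.T @ D)               # frame bounds as in Section 3
print("lambda_min, lambda_max:", lam[0], lam[-1])
```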

Table 2
Gabor dictionary results for different values of f0 and B (columns: f0, B, l, λ_min, λ_max, e).

6.3. MP filters for 1D-signal analysis

An electronic nose incorporates an array of chemical sensors whose responses constitute an odor pattern. A single sensor in the array should not be highly specific in its response, but should respond to a broad range of compounds, such that different patterns are expected to be related to different odors. A typical gas sensor response is shown in Fig. 3. Using the MP scheme, a particular response of the electronic nose is represented by a coefficient vector (c_0, c_1, ..., c_{L−1}). The coefficient values are computed by projecting the response 1D-signal onto a set of basis functions {ϕ_0, ..., ϕ_{L−1}}, which need not be orthogonal. Because the basis is not necessarily orthogonal, an iterative projection algorithm is applied to calculate the coefficients; if the basis is orthogonal, the algorithm reduces to the standard projection method. The projection algorithm accounts for non-orthogonality by using the residual 1D-signal. If f is a 1D-signal (or template) like the one in Fig. 3, then u_k is the residual signal at iteration k, with u_0 = f. The coefficient c_k is the projection of the residual signal u_k onto the basis element ϕ_k,

c_k = ⟨u_k, ϕ_k⟩,

where ⟨·,·⟩ is the inner product between two functions. The residual signal is updated after each iteration by

u_{k+1} = u_k − c_k ϕ_k.   (35)

After the n-th iteration, the 1D-signal f is decomposed into a sum of residual signals:

f = Σ_{k=0}^{n−1} (u_k − u_{k+1}) + u_n.   (36)

Rearranging Eq. (35) and substituting into Eq. (36) yields

f = Σ_{k=0}^{n−1} c_k ϕ_k + u_n.   (37)

So the approximation of the original signal after n iterations is

f̂ = Σ_{k=0}^{n−1} c_k ϕ_k.   (38)

The approximation need not be very accurate, since the encoded information is enough for recognition. When the desired approximation error is reached, a subset of atoms of D is defined (called F ⊂ D): the number of rows of F is equal to the number of atoms selected by the MP algorithm (n), whereas the number of columns of F is equal to the number of samples of the signal f (m = 256). Therefore, a generic signal can be expressed by projecting it onto the n × m matrix F as

c = Ff,   (39)

where c is the coefficient vector relative to the signal f (c is n-dimensional).

Fig. 5. Reconstruction of the known curve using different numbers of atoms.

6.4. MP filters for 1D-signal synthesis

The signal f can be recovered from the coefficient vector c = Ff, where c_k = ⟨f, ϕ_k⟩, using the formula

f̂ = F†c,   (40)

where F† is the pseudoinverse of F (here, since the sub-dictionary F has fewer rows than columns, F† = F^T (F F^T)⁻¹). Eq. (40) makes evident the difference between the operator F and the operator F†. The operator F, also called the analysis operator, associates a vector of coefficients c (features) to a signal f, decomposing (projecting) the signal through the atoms of the sub-dictionary F. On the contrary F†, also called the synthesis operator, builds up a reconstructed signal f̂ as a superposition of the atoms of the dual sub-dictionary weighted by the coefficients c. Note that the columns of F† are the elements of the dual of F and, therefore, F† is an m × n matrix.

Fig. 6. Reconstruction error along the iterations (number of selected atoms).

Fig. 5 shows the reconstruction of the curve used in the MP training process for several numbers of extracted atoms. The more atoms are extracted, the better, of course, is the representation, as shown in Fig. 6. Sets of 5, 10, 20 and 35 atoms were found with MP using the curve shown in Fig. 3. With these sets of extracted atoms it is also possible to reconstruct unseen signals, as shown in Fig. 7. This means that the extracted atoms span very well the domain represented by functions of this type, let us say e-nose functions.

Fig. 7. Reconstruction of an unseen signal using different numbers of atoms extracted from the curve shown in Fig. 3.
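The analysis/synthesis pair of Eqs. (39) and (40) amounts to two matrix products once MP has fixed the sub-dictionary. A minimal sketch follows (ours, assuming NumPy and the matching_pursuit function sketched in Section 4).

```python
import numpy as np

def fit_subdictionary(D, template, n_atoms):
    """Run MP on a training template (e.g. the curve of Fig. 3) and keep
    the selected atoms as the rows of the sub-dictionary F (Section 6.3)."""
    indices, _ = matching_pursuit(D, template, n_atoms=n_atoms)
    return D[np.unique(indices)]

def encode(F_sub, f):
    """Analysis, Eq. (39): coefficient vector c = F f."""
    return F_sub @ f

def decode(F_sub, c):
    """Synthesis, Eq. (40): f_hat = F_dagger c, with F_dagger the pseudoinverse of F."""
    return np.linalg.pinv(F_sub) @ c
```

Any new e-nose response would then be encoded with the same fixed F_sub, which is what allows the reconstruction of unseen signals shown in Fig. 7.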

A comparison between the MP coefficients and the standard Haar wavelet transform coefficients was performed; the synthesis from the two sets of coefficients is shown in Fig. 8. In the standard Haar wavelet transform, the fifth decomposition level was chosen, with its eight approximation coefficients; no detail coefficients were used in the synthesis process (i.e. they were set to zero). Note that the MP coefficients are more expressive than the standard wavelet transform coefficients, since the latter cannot capture high-frequency content other than noise. By contrast, the atoms selected by the MP process are strongly scaled and localized in the two important regions of the signal (the starting times of the sampling and recovery processes), which allows a better synthesis of the analysed signal (Fig. 8). Recall that the eight atoms used with MP were obtained from the signal shown in Fig. 3.

Fig. 8. Synthesis comparison between eight MP coefficients and eight standard Haar wavelet transform coefficients.

6.5. MP features classification

The coefficient vector c for a given signal is the feature vector given as input to a recognition system; here the idea is to use such features to classify odors. In our case, an odor is simply expressed by the responses of an array of 19 sensors. Fig. 9 shows the responses of an array of chemical sensors when exposed to a given odor. For each odor response, the analysis is performed using the MP algorithm, varying the number of atoms in the sub-dictionary F. In order to evaluate the best number of coefficients to retain (the number of atoms to select from the over-complete dictionary D), a generalization error is evaluated: the objective is to find, with a leave-one-out procedure, the number of atoms per signal that minimizes the generalization error.

Fig. 9. Responses of an array of 19 gas sensors.

Table 3
Summary of the PNN classification error (leave-one-out procedure) for the methodologies investigated. The number of coefficients refers to the features extracted from each sensor response.

Method | Error (%) | Number of coefficients
Haar wavelet transform | 11 | 8
MP over Haar dictionary | 10 | 5
MP over Gabor dictionary | 9 | 18
MP over Haar + Gabor dictionary | 9 | 18

Fig. 10. Classification error with the Haar dictionary.

Fig. 11. Classification error with different Gabor dictionaries.

Fig. 12. Classification error with Gabor and Haar dictionaries.

The 19 coefficient vectors (one for each sensor response, each with length equal to the number of atoms of F) are given as input to a neural network. Table 3 reports the classification errors for the different techniques proposed in the paper. It is interesting to note that, with almost the same classification error, it is possible to represent the signal with a much smaller amount of information. For example, with the classical Haar wavelet transform, eight coefficients for each sensor response are needed to reach the lowest classification error of 11%, while with the method proposed in this paper (matching pursuit over an over-complete Haar dictionary) only five coefficients are needed to obtain almost the same classification error, thus saving about 37% of the memory allocation and also a certain amount of computational power. Obviously for a single sensor this is irrelevant, but for a large array of sensors it becomes very important. The last two lines of Table 3 report the results for the MP over Gabor dictionary and MP over Gabor + Haar dictionary coefficients extracted from each response curve: the classification error decreases slightly with a larger number of coefficients, together with a better reconstruction error with respect to the MP over Haar dictionary method. This verifies that the extracted features are good features for discriminatory purposes in the classification process, with the aim of having clustered responses associated with the same odors.

A probabilistic neural network (PNN) is used. A PNN is a class of neural networks which combines some of the best attributes of statistical pattern recognition and feedforward neural networks [15]. PNNs feature very fast training times and produce outputs with Bayes posterior probabilities. These useful features come at the expense of larger memory requirements and slower execution speed for the prediction of unknown patterns compared to conventional neural networks. Figs. 10-12 show the classification error for the different generated sub-dictionaries. In particular, in Fig. 11 the Gabor sub-dictionaries are found by also varying the number of center frequencies. Fig. 12 shows classification results over sub-dictionaries generated by the union of the two over-complete Gabor and Haar dictionaries.
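A PNN of this kind is essentially a Parzen-window (Gaussian-kernel) classifier [15]. The sketch below is our own minimal version (NumPy assumed; the smoothing parameter sigma and the way the 19 coefficient vectors are arranged into a single feature vector are assumptions of ours, not specified in the paper) and mirrors the leave-one-out evaluation used for Table 3.

```python
import numpy as np

def pnn_predict(X_train, y_train, x, sigma=0.5):
    """Probabilistic neural network: one Gaussian kernel per training pattern,
    class score = average kernel activation over that class."""
    d2 = np.sum((X_train - x) ** 2, axis=1)
    k = np.exp(-d2 / (2.0 * sigma ** 2))
    classes = np.unique(y_train)
    scores = [k[y_train == c].mean() for c in classes]
    return classes[int(np.argmax(scores))]

def leave_one_out_error(X, y, sigma=0.5):
    """Fraction of patterns misclassified when each one is held out in turn.
    X: array of feature vectors (e.g. concatenated MP coefficients), y: labels."""
    errors = 0
    for i in range(len(X)):
        mask = np.arange(len(X)) != i
        if pnn_predict(X[mask], y[mask], X[i], sigma) != y[i]:
            errors += 1
    return errors / len(X)
```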

7. Conclusions

In this paper a new feature extraction technique has been studied and implemented for an array of conductive polymer gas sensors. It has been shown that the extracted features provide good discrimination capabilities in the recognition phase and are superior to classical orthonormal decomposition methods such as the discrete wavelet transform. The use of a large dictionary allows the selection of a subset of analyzing functions that span the whole space of the problem domain. The basis functions of the feature space have been selected with the well-known matching pursuit algorithm which, rather than solving the optimal approximation problem, progressively refines the signal approximation with an iterative procedure. Convergence of the procedure is assured, and in a few iterations it has been possible to capture a near-optimal signal representation. Several dictionaries have been used and the results of the MP algorithm compared in terms of classification error.

Acknowledgment

The authors would like to thank Osmetech Inc., USA, for providing the data used in the experiments of this work. This work was supported in part by a NATO-CNR grant and by NOSE II (2nd Network on Artificial Olfactory Sensing).

References

[1] S. Mallat, A Wavelet Tour of Signal Processing, second ed., Academic Press, 1999.
[2] I. Daubechies, Ten Lectures on Wavelets, CBMS-NSF Regional Conference Series in Applied Mathematics, SIAM, 1992.
[3] N. Ancona, E. Stella, Image representation with overcomplete dictionaries for object detection, Internal Report 02/2003, ISSIA-CNR, Bari, Italy, July 2003.
[4] C. Distante, M. Leo, P. Siciliano, K.C. Persaud, On the study of feature extraction methods for an electronic nose, Sens. Actuators B 87 (2002).
[5] T. Poggio, F. Girosi, A theory of networks for approximation and learning, A.I. Memo, Massachusetts Institute of Technology, 1989.
[6] V. Vapnik, Statistical Learning Theory, Wiley, 1998.
[7] C. Distante, N. Ancona, P. Siciliano, Support vector machines for olfactory signals recognition, Sens. Actuators B 88 (2003).
[8] C. Burges, A tutorial on support vector machines for pattern recognition, Data Mining and Knowledge Discovery 2 (1998), Kluwer Academic Publishers.
[9] R.J. Duffin, A.C. Schaeffer, A class of non-harmonic Fourier series, Trans. Am. Math. Soc. 72 (1952).
[10] S. Mallat, Z. Zhang, Matching pursuits with time-frequency dictionaries, IEEE Trans. Signal Process. 41 (1993).
[11] S. Chen, D. Donoho, M. Saunders, Atomic decomposition by basis pursuit, Technical Report 479, Department of Statistics, Stanford University.
[12] G. Strang, Linear Algebra and its Applications, Harcourt Brace & Company.
[13] L.K. Jones, On a conjecture of Huber concerning the convergence of projection pursuit regression, Ann. Statist. 15 (1987).
[14] L. Blum, M. Shub, S. Smale, On a theory of computation and complexity over the real numbers: NP-completeness, recursive functions, and universal machines, Bull. Am. Math. Soc. 21 (1989).
[15] R.O. Duda, P.E. Hart, D.G. Stork, Pattern Classification, second ed., Wiley-Interscience, 2000.


EUSIPCO EUSIPCO 013 1569746769 SUBSET PURSUIT FOR ANALYSIS DICTIONARY LEARNING Ye Zhang 1,, Haolong Wang 1, Tenglong Yu 1, Wenwu Wang 1 Department of Electronic and Information Engineering, Nanchang University,

More information

A WAVELET BASED CODING SCHEME VIA ATOMIC APPROXIMATION AND ADAPTIVE SAMPLING OF THE LOWEST FREQUENCY BAND

A WAVELET BASED CODING SCHEME VIA ATOMIC APPROXIMATION AND ADAPTIVE SAMPLING OF THE LOWEST FREQUENCY BAND A WAVELET BASED CODING SCHEME VIA ATOMIC APPROXIMATION AND ADAPTIVE SAMPLING OF THE LOWEST FREQUENCY BAND V. Bruni, D. Vitulano Istituto per le Applicazioni del Calcolo M. Picone, C. N. R. Viale del Policlinico

More information

Multiple Change Point Detection by Sparse Parameter Estimation

Multiple Change Point Detection by Sparse Parameter Estimation Multiple Change Point Detection by Sparse Parameter Estimation Department of Econometrics Fac. of Economics and Management University of Defence Brno, Czech Republic Dept. of Appl. Math. and Comp. Sci.

More information

Sparse Time-Frequency Transforms and Applications.

Sparse Time-Frequency Transforms and Applications. Sparse Time-Frequency Transforms and Applications. Bruno Torrésani http://www.cmi.univ-mrs.fr/~torresan LATP, Université de Provence, Marseille DAFx, Montreal, September 2006 B. Torrésani (LATP Marseille)

More information

Frames. Hongkai Xiong 熊红凯 Department of Electronic Engineering Shanghai Jiao Tong University

Frames. Hongkai Xiong 熊红凯   Department of Electronic Engineering Shanghai Jiao Tong University Frames Hongkai Xiong 熊红凯 http://ivm.sjtu.edu.cn Department of Electronic Engineering Shanghai Jiao Tong University 2/39 Frames 1 2 3 Frames and Riesz Bases Translation-Invariant Dyadic Wavelet Transform

More information

An Introduction to Filterbank Frames

An Introduction to Filterbank Frames An Introduction to Filterbank Frames Brody Dylan Johnson St. Louis University October 19, 2010 Brody Dylan Johnson (St. Louis University) An Introduction to Filterbank Frames October 19, 2010 1 / 34 Overview

More information

Eigenface-based facial recognition

Eigenface-based facial recognition Eigenface-based facial recognition Dimitri PISSARENKO December 1, 2002 1 General This document is based upon Turk and Pentland (1991b), Turk and Pentland (1991a) and Smith (2002). 2 How does it work? The

More information

Multiresolution Analysis

Multiresolution Analysis Multiresolution Analysis DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/obda_fall17/index.html Carlos Fernandez-Granda Frames Short-time Fourier transform

More information

MACHINE LEARNING. Methods for feature extraction and reduction of dimensionality: Probabilistic PCA and kernel PCA

MACHINE LEARNING. Methods for feature extraction and reduction of dimensionality: Probabilistic PCA and kernel PCA 1 MACHINE LEARNING Methods for feature extraction and reduction of dimensionality: Probabilistic PCA and kernel PCA 2 Practicals Next Week Next Week, Practical Session on Computer Takes Place in Room GR

More information

Kernel Methods. Machine Learning A W VO

Kernel Methods. Machine Learning A W VO Kernel Methods Machine Learning A 708.063 07W VO Outline 1. Dual representation 2. The kernel concept 3. Properties of kernels 4. Examples of kernel machines Kernel PCA Support vector regression (Relevance

More information

An Introduction to Wavelets and some Applications

An Introduction to Wavelets and some Applications An Introduction to Wavelets and some Applications Milan, May 2003 Anestis Antoniadis Laboratoire IMAG-LMC University Joseph Fourier Grenoble, France An Introduction to Wavelets and some Applications p.1/54

More information

Deep Learning: Approximation of Functions by Composition

Deep Learning: Approximation of Functions by Composition Deep Learning: Approximation of Functions by Composition Zuowei Shen Department of Mathematics National University of Singapore Outline 1 A brief introduction of approximation theory 2 Deep learning: approximation

More information

2. Review of Linear Algebra

2. Review of Linear Algebra 2. Review of Linear Algebra ECE 83, Spring 217 In this course we will represent signals as vectors and operators (e.g., filters, transforms, etc) as matrices. This lecture reviews basic concepts from linear

More information

Introduction to Biomedical Engineering

Introduction to Biomedical Engineering Introduction to Biomedical Engineering Biosignal processing Kung-Bin Sung 6/11/2007 1 Outline Chapter 10: Biosignal processing Characteristics of biosignals Frequency domain representation and analysis

More information

On the Noise Model of Support Vector Machine Regression. Massimiliano Pontil, Sayan Mukherjee, Federico Girosi

On the Noise Model of Support Vector Machine Regression. Massimiliano Pontil, Sayan Mukherjee, Federico Girosi MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES A.I. Memo No. 1651 October 1998

More information

Introduction to Sparsity. Xudong Cao, Jake Dreamtree & Jerry 04/05/2012

Introduction to Sparsity. Xudong Cao, Jake Dreamtree & Jerry 04/05/2012 Introduction to Sparsity Xudong Cao, Jake Dreamtree & Jerry 04/05/2012 Outline Understanding Sparsity Total variation Compressed sensing(definition) Exact recovery with sparse prior(l 0 ) l 1 relaxation

More information

Chapter 3 Transformations

Chapter 3 Transformations Chapter 3 Transformations An Introduction to Optimization Spring, 2014 Wei-Ta Chu 1 Linear Transformations A function is called a linear transformation if 1. for every and 2. for every If we fix the bases

More information

CS168: The Modern Algorithmic Toolbox Lecture #8: How PCA Works

CS168: The Modern Algorithmic Toolbox Lecture #8: How PCA Works CS68: The Modern Algorithmic Toolbox Lecture #8: How PCA Works Tim Roughgarden & Gregory Valiant April 20, 206 Introduction Last lecture introduced the idea of principal components analysis (PCA). The

More information

An Investigation of 3D Dual-Tree Wavelet Transform for Video Coding

An Investigation of 3D Dual-Tree Wavelet Transform for Video Coding MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com An Investigation of 3D Dual-Tree Wavelet Transform for Video Coding Beibei Wang, Yao Wang, Ivan Selesnick and Anthony Vetro TR2004-132 December

More information

Computational Harmonic Analysis (Wavelet Tutorial) Part II

Computational Harmonic Analysis (Wavelet Tutorial) Part II Computational Harmonic Analysis (Wavelet Tutorial) Part II Understanding Many Particle Systems with Machine Learning Tutorials Matthew Hirn Michigan State University Department of Computational Mathematics,

More information

On Riesz-Fischer sequences and lower frame bounds

On Riesz-Fischer sequences and lower frame bounds On Riesz-Fischer sequences and lower frame bounds P. Casazza, O. Christensen, S. Li, A. Lindner Abstract We investigate the consequences of the lower frame condition and the lower Riesz basis condition

More information

Compressed Sensing: Extending CLEAN and NNLS

Compressed Sensing: Extending CLEAN and NNLS Compressed Sensing: Extending CLEAN and NNLS Ludwig Schwardt SKA South Africa (KAT Project) Calibration & Imaging Workshop Socorro, NM, USA 31 March 2009 Outline 1 Compressed Sensing (CS) Introduction

More information

CS 231A Section 1: Linear Algebra & Probability Review

CS 231A Section 1: Linear Algebra & Probability Review CS 231A Section 1: Linear Algebra & Probability Review 1 Topics Support Vector Machines Boosting Viola-Jones face detector Linear Algebra Review Notation Operations & Properties Matrix Calculus Probability

More information

Multiscale Image Transforms

Multiscale Image Transforms Multiscale Image Transforms Goal: Develop filter-based representations to decompose images into component parts, to extract features/structures of interest, and to attenuate noise. Motivation: extract

More information

Linear Discrimination Functions

Linear Discrimination Functions Laurea Magistrale in Informatica Nicola Fanizzi Dipartimento di Informatica Università degli Studi di Bari November 4, 2009 Outline Linear models Gradient descent Perceptron Minimum square error approach

More information

Introduction to Machine Learning

Introduction to Machine Learning 10-701 Introduction to Machine Learning PCA Slides based on 18-661 Fall 2018 PCA Raw data can be Complex, High-dimensional To understand a phenomenon we measure various related quantities If we knew what

More information

CS 231A Section 1: Linear Algebra & Probability Review. Kevin Tang

CS 231A Section 1: Linear Algebra & Probability Review. Kevin Tang CS 231A Section 1: Linear Algebra & Probability Review Kevin Tang Kevin Tang Section 1-1 9/30/2011 Topics Support Vector Machines Boosting Viola Jones face detector Linear Algebra Review Notation Operations

More information

Regularization via Spectral Filtering

Regularization via Spectral Filtering Regularization via Spectral Filtering Lorenzo Rosasco MIT, 9.520 Class 7 About this class Goal To discuss how a class of regularization methods originally designed for solving ill-posed inverse problems,

More information

The Iteration-Tuned Dictionary for Sparse Representations

The Iteration-Tuned Dictionary for Sparse Representations The Iteration-Tuned Dictionary for Sparse Representations Joaquin Zepeda #1, Christine Guillemot #2, Ewa Kijak 3 # INRIA Centre Rennes - Bretagne Atlantique Campus de Beaulieu, 35042 Rennes Cedex, FRANCE

More information

Operators with Closed Range, Pseudo-Inverses, and Perturbation of Frames for a Subspace

Operators with Closed Range, Pseudo-Inverses, and Perturbation of Frames for a Subspace Canad. Math. Bull. Vol. 42 (1), 1999 pp. 37 45 Operators with Closed Range, Pseudo-Inverses, and Perturbation of Frames for a Subspace Ole Christensen Abstract. Recent work of Ding and Huang shows that

More information

Lecture Notes 1: Vector spaces

Lecture Notes 1: Vector spaces Optimization-based data analysis Fall 2017 Lecture Notes 1: Vector spaces In this chapter we review certain basic concepts of linear algebra, highlighting their application to signal processing. 1 Vector

More information

Focus was on solving matrix inversion problems Now we look at other properties of matrices Useful when A represents a transformations.

Focus was on solving matrix inversion problems Now we look at other properties of matrices Useful when A represents a transformations. Previously Focus was on solving matrix inversion problems Now we look at other properties of matrices Useful when A represents a transformations y = Ax Or A simply represents data Notion of eigenvectors,

More information

CHAPTER VIII HILBERT SPACES

CHAPTER VIII HILBERT SPACES CHAPTER VIII HILBERT SPACES DEFINITION Let X and Y be two complex vector spaces. A map T : X Y is called a conjugate-linear transformation if it is a reallinear transformation from X into Y, and if T (λx)

More information

Recent developments on sparse representation

Recent developments on sparse representation Recent developments on sparse representation Zeng Tieyong Department of Mathematics, Hong Kong Baptist University Email: zeng@hkbu.edu.hk Hong Kong Baptist University Dec. 8, 2008 First Previous Next Last

More information

Vectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1. x 2. x =

Vectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1. x 2. x = Linear Algebra Review Vectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1 x x = 2. x n Vectors of up to three dimensions are easy to diagram.

More information

Wavelet Bi-frames with Uniform Symmetry for Curve Multiresolution Processing

Wavelet Bi-frames with Uniform Symmetry for Curve Multiresolution Processing Wavelet Bi-frames with Uniform Symmetry for Curve Multiresolution Processing Qingtang Jiang Abstract This paper is about the construction of univariate wavelet bi-frames with each framelet being symmetric.

More information

Wavelets For Computer Graphics

Wavelets For Computer Graphics {f g} := f(x) g(x) dx A collection of linearly independent functions Ψ j spanning W j are called wavelets. i J(x) := 6 x +2 x + x + x Ψ j (x) := Ψ j (2 j x i) i =,..., 2 j Res. Avge. Detail Coef 4 [9 7

More information

Review and problem list for Applied Math I

Review and problem list for Applied Math I Review and problem list for Applied Math I (This is a first version of a serious review sheet; it may contain errors and it certainly omits a number of topic which were covered in the course. Let me know

More information

Parametric Techniques

Parametric Techniques Parametric Techniques Jason J. Corso SUNY at Buffalo J. Corso (SUNY at Buffalo) Parametric Techniques 1 / 39 Introduction When covering Bayesian Decision Theory, we assumed the full probabilistic structure

More information

Statistical Pattern Recognition

Statistical Pattern Recognition Statistical Pattern Recognition Feature Extraction Hamid R. Rabiee Jafar Muhammadi, Alireza Ghasemi, Payam Siyari Spring 2014 http://ce.sharif.edu/courses/92-93/2/ce725-2/ Agenda Dimensionality Reduction

More information

CS168: The Modern Algorithmic Toolbox Lecture #7: Understanding Principal Component Analysis (PCA)

CS168: The Modern Algorithmic Toolbox Lecture #7: Understanding Principal Component Analysis (PCA) CS68: The Modern Algorithmic Toolbox Lecture #7: Understanding Principal Component Analysis (PCA) Tim Roughgarden & Gregory Valiant April 0, 05 Introduction. Lecture Goal Principal components analysis

More information

A primer on the theory of frames

A primer on the theory of frames A primer on the theory of frames Jordy van Velthoven Abstract This report aims to give an overview of frame theory in order to gain insight in the use of the frame framework as a unifying layer in the

More information

Spectral Regularization

Spectral Regularization Spectral Regularization Lorenzo Rosasco 9.520 Class 07 February 27, 2008 About this class Goal To discuss how a class of regularization methods originally designed for solving ill-posed inverse problems,

More information

A DECOMPOSITION THEOREM FOR FRAMES AND THE FEICHTINGER CONJECTURE

A DECOMPOSITION THEOREM FOR FRAMES AND THE FEICHTINGER CONJECTURE PROCEEDINGS OF THE AMERICAN MATHEMATICAL SOCIETY Volume 00, Number 0, Pages 000 000 S 0002-9939(XX)0000-0 A DECOMPOSITION THEOREM FOR FRAMES AND THE FEICHTINGER CONJECTURE PETER G. CASAZZA, GITTA KUTYNIOK,

More information

Principal Component Analysis

Principal Component Analysis B: Chapter 1 HTF: Chapter 1.5 Principal Component Analysis Barnabás Póczos University of Alberta Nov, 009 Contents Motivation PCA algorithms Applications Face recognition Facial expression recognition

More information

Bearing fault diagnosis based on EMD-KPCA and ELM

Bearing fault diagnosis based on EMD-KPCA and ELM Bearing fault diagnosis based on EMD-KPCA and ELM Zihan Chen, Hang Yuan 2 School of Reliability and Systems Engineering, Beihang University, Beijing 9, China Science and Technology on Reliability & Environmental

More information

Reproducing Kernel Hilbert Spaces

Reproducing Kernel Hilbert Spaces Reproducing Kernel Hilbert Spaces Lorenzo Rosasco 9.520 Class 03 February 11, 2009 About this class Goal To introduce a particularly useful family of hypothesis spaces called Reproducing Kernel Hilbert

More information

Machine Learning for Signal Processing Sparse and Overcomplete Representations. Bhiksha Raj (slides from Sourish Chaudhuri) Oct 22, 2013

Machine Learning for Signal Processing Sparse and Overcomplete Representations. Bhiksha Raj (slides from Sourish Chaudhuri) Oct 22, 2013 Machine Learning for Signal Processing Sparse and Overcomplete Representations Bhiksha Raj (slides from Sourish Chaudhuri) Oct 22, 2013 1 Key Topics in this Lecture Basics Component-based representations

More information

1. Fourier Transform (Continuous time) A finite energy signal is a signal f(t) for which. f(t) 2 dt < Scalar product: f(t)g(t)dt

1. Fourier Transform (Continuous time) A finite energy signal is a signal f(t) for which. f(t) 2 dt < Scalar product: f(t)g(t)dt 1. Fourier Transform (Continuous time) 1.1. Signals with finite energy A finite energy signal is a signal f(t) for which Scalar product: f(t) 2 dt < f(t), g(t) = 1 2π f(t)g(t)dt The Hilbert space of all

More information

On Optimal Frame Conditioners

On Optimal Frame Conditioners On Optimal Frame Conditioners Chae A. Clark Department of Mathematics University of Maryland, College Park Email: cclark18@math.umd.edu Kasso A. Okoudjou Department of Mathematics University of Maryland,

More information

PCA FACE RECOGNITION

PCA FACE RECOGNITION PCA FACE RECOGNITION The slides are from several sources through James Hays (Brown); Srinivasa Narasimhan (CMU); Silvio Savarese (U. of Michigan); Shree Nayar (Columbia) including their own slides. Goal

More information

Parametric Techniques Lecture 3

Parametric Techniques Lecture 3 Parametric Techniques Lecture 3 Jason Corso SUNY at Buffalo 22 January 2009 J. Corso (SUNY at Buffalo) Parametric Techniques Lecture 3 22 January 2009 1 / 39 Introduction In Lecture 2, we learned how to

More information

c 2011 International Press Vol. 18, No. 1, pp , March DENNIS TREDE

c 2011 International Press Vol. 18, No. 1, pp , March DENNIS TREDE METHODS AND APPLICATIONS OF ANALYSIS. c 2011 International Press Vol. 18, No. 1, pp. 105 110, March 2011 007 EXACT SUPPORT RECOVERY FOR LINEAR INVERSE PROBLEMS WITH SPARSITY CONSTRAINTS DENNIS TREDE Abstract.

More information

EE731 Lecture Notes: Matrix Computations for Signal Processing

EE731 Lecture Notes: Matrix Computations for Signal Processing EE731 Lecture Notes: Matrix Computations for Signal Processing James P. Reilly c Department of Electrical and Computer Engineering McMaster University September 22, 2005 0 Preface This collection of ten

More information

December 20, MAA704, Multivariate analysis. Christopher Engström. Multivariate. analysis. Principal component analysis

December 20, MAA704, Multivariate analysis. Christopher Engström. Multivariate. analysis. Principal component analysis .. December 20, 2013 Todays lecture. (PCA) (PLS-R) (LDA) . (PCA) is a method often used to reduce the dimension of a large dataset to one of a more manageble size. The new dataset can then be used to make

More information

COMS 4721: Machine Learning for Data Science Lecture 19, 4/6/2017

COMS 4721: Machine Learning for Data Science Lecture 19, 4/6/2017 COMS 4721: Machine Learning for Data Science Lecture 19, 4/6/2017 Prof. John Paisley Department of Electrical Engineering & Data Science Institute Columbia University PRINCIPAL COMPONENT ANALYSIS DIMENSIONALITY

More information

Support Vector Machines

Support Vector Machines Support Vector Machines Stephan Dreiseitl University of Applied Sciences Upper Austria at Hagenberg Harvard-MIT Division of Health Sciences and Technology HST.951J: Medical Decision Support Overview Motivation

More information

A BRIEF INTRODUCTION TO HILBERT SPACE FRAME THEORY AND ITS APPLICATIONS AMS SHORT COURSE: JOINT MATHEMATICS MEETINGS SAN ANTONIO, 2015 PETER G. CASAZZA Abstract. This is a short introduction to Hilbert

More information

Vector Spaces. Vector space, ν, over the field of complex numbers, C, is a set of elements a, b,..., satisfying the following axioms.

Vector Spaces. Vector space, ν, over the field of complex numbers, C, is a set of elements a, b,..., satisfying the following axioms. Vector Spaces Vector space, ν, over the field of complex numbers, C, is a set of elements a, b,..., satisfying the following axioms. For each two vectors a, b ν there exists a summation procedure: a +

More information

Learning Methods for Linear Detectors

Learning Methods for Linear Detectors Intelligent Systems: Reasoning and Recognition James L. Crowley ENSIMAG 2 / MoSIG M1 Second Semester 2011/2012 Lesson 20 27 April 2012 Contents Learning Methods for Linear Detectors Learning Linear Detectors...2

More information

Machine Learning (Spring 2012) Principal Component Analysis

Machine Learning (Spring 2012) Principal Component Analysis 1-71 Machine Learning (Spring 1) Principal Component Analysis Yang Xu This note is partly based on Chapter 1.1 in Chris Bishop s book on PRML and the lecture slides on PCA written by Carlos Guestrin in

More information