Functional principal component analysis of financial time series
In: Vichi M., Monari P., Mignani S., Montanari A. (Eds.), New Developments in Classification and Data Analysis, Springer-Verlag, Berlin, 2005.

Salvatore Ingrassia and G. Damiana Costanzo
Dipartimento di Economia e Statistica, Università della Calabria, Arcavacata di Rende (CS), Italy
s.ingrassia@unical.it, dm.costanzo@unical.it

Abstract. We introduce functional principal component techniques for the statistical analysis of a set of financial time series from an explorative point of view. We show that this approach highlights some relevant statistical features of such related datasets. The case study considered here concerns the daily traded volumes of the shares in the MIB30 basket from January 3rd, 2000 to December 30th, 2002. Moreover, since the first functional principal component accounts for 89.4% of the whole variability, this approach suggests the construction of new financial indices based on functional indicators.

1 Introduction

Many recent methodologies for the statistical analysis of data coming from measurements of continuous phenomena are set in a functional domain; such techniques now constitute a branch of statistics named functional data analysis, see Ramsay and Silverman (1997, 2002). Financial markets offer an appealing field of application, since trading is continuous and share prices, as well as other related quantities, are therefore updated with very high frequency. This paper focuses on a functional principal component based approach to the statistical analysis of financial data. In finance, principal component based techniques have occasionally been considered, e.g. for the construction of uncorrelated indices in multi-index models, see Elton and Gruber (1973); moreover, they have been suggested in high frequency trading models by Dunis et al. (1998).
Here we show that the functional version provides a useful tool for the statistical analysis of a set of financial series from an explorative perspective. Furthermore, we point out that this approach suggests the possibility of constructing stock market indices based on functional indicators. The analysis is illustrated by considering the daily traded volumes of the 30 shares listed in the MIB30 basket in the period January 3rd, 2000 to December 30th, 2002. The rest of the paper is organised as follows. In the next section we outline functional data modeling and give some details about functional principal component analysis; in Section 3 we introduce the MIB30 basket dataset and present the main results of our analysis; finally, in Section 4 we discuss further methodological aspects and open a problem concerning the construction of new stock market indices on the basis of the obtained results.

2 Functional PCA

Functional data are essentially curves and trajectories; the basic rationale is that we should think of observed data functions as single entities rather than merely as sequences of individual observations. Even though functional data analysis (FDA) often deals with temporal data, its scope and objectives are quite different from those of time series analysis. While time series analysis focuses mainly on modeling data or on predicting future observations, the techniques developed in FDA are essentially exploratory in nature: the emphasis is on trajectories and shapes; moreover, unequally spaced observations, different numbers of observations per unit, and series with missing values can all be taken into account.

From a practical point of view, functional data are usually observed and recorded discretely. Let $\{\omega_1, \ldots, \omega_n\}$ be a set of $n$ units and let $y_i = (y_i(t_1), \ldots, y_i(t_p))$ be a sample of measurements of a variable $Y$ taken at $p$ times $t_1, \ldots, t_p \in T = [a, b]$ on the $i$-th unit $\omega_i$ $(i = 1, \ldots, n)$. As remarked above, such data $y_i$ $(i = 1, \ldots, n)$ are regarded as functional because they are considered as single entities rather than merely sequences of individual observations, so they are called raw functional data; indeed, the term functional refers to the intrinsic structure of the data rather than to their explicit form. In order to convert raw functional data into a suitable functional form, a smooth function $x_i(t)$ is assumed to lie behind $y_i$; it is referred to as the true functional form. This implies, in principle, that we can evaluate $x_i$ at any point $t \in T$. The set $X_T = \{x_1(t), \ldots, x_n(t)\}_{t \in T}$ is the functional dataset. In functional data analysis the statistical techniques posit a vector space of real-valued functions defined on a closed interval for which the integral of their squares is finite.
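The passage from raw functional data $y_i$ to a smooth function $x_i(t)$ can be illustrated with a minimal sketch. It assumes a cubic smoothing spline from SciPy as the smoother, which is one possible choice and not necessarily the procedure used by the authors; the helper name `smooth_raw_curve`, the smoothing level and the synthetic data are hypothetical.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline


def smooth_raw_curve(t, y, smoothing=None):
    """Fit a cubic smoothing spline x_i(t) to raw observations y_i(t_1), ..., y_i(t_p).

    Missing observations (NaN) are dropped before fitting, so different units
    may contribute different numbers of observation times.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = ~np.isnan(y)
    return UnivariateSpline(t[keep], y[keep], k=3, s=smoothing)


# Synthetic raw functional data (an assumption for illustration):
# n = 3 units observed at p = 50 times on T = [0, 1].
rng = np.random.default_rng(0)
t_obs = np.linspace(0.0, 1.0, 50)
raw = np.array([(i + 1) * np.sin(2 * np.pi * t_obs) for i in range(3)])
raw += rng.normal(scale=0.1, size=raw.shape)
raw[1, 10:15] = np.nan  # a run of missing values in the second unit

curves = [smooth_raw_curve(t_obs, y_i, smoothing=0.5) for y_i in raw]

# Each x_i can now be evaluated at any t in T, not only at the observed t_j.
t_fine = np.linspace(0.0, 1.0, 201)
functional_dataset = np.vstack([x_i(t_fine) for x_i in curves])
```

Each row of `functional_dataset` is one smoothed curve evaluated on a common fine grid, which is a convenient starting point for any discretized functional analysis.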
If attention is confined to functions having finite norms, then the resulting space is a Hilbert space; however, a stronger assumption is often required, so we assume that $H$ is a reproducing kernel Hilbert space (r.k.h.s.), see Wahba (1990), that is, a Hilbert space of real-valued functions on $T$ with the property that, for each $t \in T$, the evaluation functional $L_t$, which associates $f$ with $f(t)$, $L_t\colon f \mapsto f(t)$, is a bounded linear functional. In such spaces the objective of principal component analysis of functional data is the orthogonal decomposition of the variance function

$$v(t, u) := \frac{1}{n-1} \sum_{i=1}^{n} \{x_i(t) - \bar{x}(t)\}\{x_i(u) - \bar{x}(u)\} \qquad (1)$$

(which is the counterpart of the covariance matrix of a multidimensional dataset) in order to isolate the dominant components of functional variation, see also e.g. Pezzulli (1994). In analogy with the multivariate case, the functional PCA problem is characterized by the following decomposition of the variance function:

$$v(t, u) = \sum_{j} \lambda_j\, \xi_j(t)\, \xi_j(u) \qquad (2)$$
where $\lambda_j$ and $\xi_j(t)$ satisfy the eigenequation

$$\langle v(\cdot, u), \xi_j \rangle = \int_T v(t, u)\, \xi_j(t)\, dt = \lambda_j\, \xi_j(u), \qquad (3)$$

and the eigenvalues

$$\lambda_j := \int_T \int_T \xi_j(t)\, v(t, u)\, \xi_j(u)\, dt\, du$$

are positive and non-increasing in $j$, while the eigenfunctions must satisfy the constraints

$$\int_T \xi_j(t)^2\, dt = 1 \quad \text{and} \quad \int_T \xi_j(t)\, \xi_i(t)\, dt = 0 \quad (i < j).$$

The $\xi_j$'s are usually called principal component weight functions. Finally, the principal component scores (on $\xi_j(t)$) of the units in the dataset are the values $w_i^{(j)}$ given by

$$w_i^{(j)} := \langle x_i, \xi_j \rangle = \int_T \xi_j(t)\, x_i(t)\, dt. \qquad (4)$$

The decomposition (2) defined by the eigenequation (3) permits a reduced-rank least squares approximation to the covariance function $v$. Thus, the leading eigenfunctions $\xi_j$ define the principal components of variation among the sample functions $x_i$.

3 An explorative analysis of the MIB30 basket dataset

The data considered here consist of the total value of the traded volumes of the shares composing the MIB30 index in the period January 3rd, 2000 to December 30th, 2002; see also Costanzo (2003) for details. An important characteristic of this basket is that it is open, in that the composition of the index is normally updated twice a year, in March and September (ordinary revisions). However, in response to extraordinary events or for technical reasons, ordinary revisions may be brought forward or postponed with respect to the scheduled date; furthermore, in the interval between two consecutive revisions, shares may be excluded from the basket for particular reasons; see the website for further details. Raw data have been collected in a matrix. There are 21 companies which have remained in the basket for the whole three years: Alleanza, Autostrade, Banca Fideuram, Banca Monte Paschi Siena, Banca Naz. Lavoro, Enel, Eni, Fiat, Finmeccanica, Generali, Mediaset, Mediobanca, Mediolanum, Olivetti, Pirelli Spa, Ras, San Paolo Imi, Seat Pagine Gialle, Telecom Italia, Tim, Unicredito Italiano; the other 9 places in the basket have been shared by a set of other companies which remained in the basket for shorter periods. Such mixed trajectories will be called here homogeneous piecewise components of the functional dataset and will be referred to as $T_1, \ldots, T_9$. An example concerning $T_1, T_2, T_3$ is given in Table 1. Due to the connection among the international financial markets, data concerning the closing days (such as weekends and holidays) are regarded here as missing data. In the literature functional PCA is usually performed on the original data $(x_{ij})$; here we preferred to work on the daily standardized raw functional data

$$z_{ij} := \frac{x_{ij} - \bar{x}_j}{s_j} \qquad (i = 1, \ldots, 30,\ j = 1, \ldots, 758), \qquad (5)$$
where $\bar{x}_j$ and $s_j$ are respectively the daily mean and standard deviation of the e.e.v. of the shares in the basket. We shall show later how such a transformation helps in understanding the PC trajectories. The functional dataset has been obtained from such data according to the procedure illustrated in Ramsay (2001).

Date        T1                 T2                          T3
03/01/2000  AEM                Banca Commerciale Italiana  Banca di Roma
04/04/2000  AEM                Banca Commerciale Italiana  Banca di Roma
18/09/2000  AEM                Banca Commerciale Italiana  Banca di Roma
02/01/2001  AEM                Banca Commerciale Italiana  Banca di Roma
19/03/2001  AEM                Italgas                     Banca di Roma
02/05/2001  AEM                Italgas                     Banca di Roma
24/08/2001  AEM                Italgas                     Banca di Roma
24/09/2001  AEM                Italgas                     Banca di Roma
18/03/2002  Snam Rete Gas      Italgas                     Banca di Roma
01/07/2002  AEM                Italgas                     Capitalia
15/07/2002  AEM                Italgas                     Capitalia
23/09/2002  Banca Antonveneta  Italgas                     Capitalia
04/12/2002  Banca Antonveneta  Italgas                     Capitalia

Table 1. The homogeneous piecewise components T1, T2, T3.

The trajectories of the first two functional principal components are plotted in Figure 1; they show the way in which the set of functional data varies from its mean and, in terms of these modes of variability, quantify the discrepancy from the mean of each individual functional datum. The analysis showed that the first PC alone accounts for 89.4% and the second PC for 6.9% of the whole variability. Interpreting functional principal components is a more complicated task than in the usual multidimensional analysis; here, however, the following interpretation emerges:

i. The first functional PC is always positive; hence shares with large scores on this component during the considered period have a large traded volume compared with the mean value of the basket. It can be interpreted as a long term trend component.
ii. The second functional PC changes sign at t = 431, which corresponds to September 11th, 2001, and the final values are, in absolute value, greater than the initial values: this means that shares having good (bad) performances before September 11th, 2001 went down (rose) after this date. It can be interpreted as a shock component.

[Fig. 1. Plot of the first 2 functional principal components. Panels: PCA function 1 (percentage of variability 89.4); PCA function 2 (percentage of variability 6.9).]

This interpretation is confirmed by the following analysis of the raw data. As concerns the first PC, for each company we considered its minimum standardized value over the three years,

$$z_i^{(\min)} = \min_{j = 1, \ldots, 758} z_{ij} \qquad (i = 1, \ldots, 30).$$

In particular, $z_i^{(\min)}$ is positive when the traded volumes of the $i$-th share are always greater than the mean value of the MIB30 basket during the three years, and negative when they fall below that mean on at least one day. As for the second PC, let $\bar{x}_{Bi}$ be the average of the traded volumes of the $i$-th company over the days $1, \ldots, 431$ (i.e. before September 11th, 2001) and $\bar{x}_{Ai}$ be the corresponding mean value after September 11th, 2001. Let us consider the percentage variation

$$\delta_i := \frac{\bar{x}_{Ai} - \bar{x}_{Bi}}{\bar{x}_{Bi}} \cdot 100\% \qquad (i = 1, \ldots, 30).$$

If $\delta_i$ is positive (negative) then the $i$-th company increased (decreased) its mean e.e.v. after September 11th, 2001. Finally, consider the scores on the first two PCs given in (4), respectively $w_i^{(1)}$ and $w_i^{(2)}$ (see Figure 2). We observe that: i) companies with a large positive (negative) value of $w_i^{(1)}$ present a large (small) value with respect to the mean during the entire considered period, i.e. of $z_i^{(\min)}$, see Table 2; ii) companies with a large positive (negative) value of $w_i^{(2)}$ show a large decrement (increment) after September 11th, 2001 (day 431), i.e. of $\delta_i$, see Table 3. Further details are given in Costanzo and Ingrassia (2004).

4 Further remarks and methodological perspectives

The results illustrate the capability of functional PCs to highlight statistical features of a set of financial time series, as the subsequent analysis on the raw data has also confirmed. As we remarked above, the functional dataset has been constructed here using the standardized data $(z_{ij})$ defined in (5) rather than the original data $(x_{ij})$; Figure 1 shows how this approach clarifies the contribution of the PC trajectories with respect to the mean trajectory. For the sake of completeness, we point out that the first two PCs computed on the non-standardized data $(x_{ij})$ explained respectively 88.9% and 7.1% of the whole variability; the plot of the scores on these two harmonics is practically the same as the one given in Figure 2.
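The computations in (1)-(5) can be sketched numerically by discretizing the curves on a common, equally spaced grid and replacing the integrals with a rectangle-rule sum. The following is a minimal illustration under those assumptions, not the Matlab/S-PLUS procedure of Ramsay (2001); the function name `functional_pca` and the synthetic "level plus shock" data are hypothetical and only mimic the structure discussed above.

```python
import numpy as np


def functional_pca(curves, t, n_components=2):
    """Discretized functional PCA of curves sampled on a common uniform grid.

    curves : array (n, p); row i holds x_i(t_1), ..., x_i(t_p)
    t      : array (p,); equally spaced evaluation points in T
    Returns eigenvalues (n_components,), eigenfunctions (n_components, p)
    and scores (n, n_components).
    """
    curves = np.asarray(curves, dtype=float)
    n, p = curves.shape
    h = t[1] - t[0]                          # rectangle-rule quadrature weight
    centered = curves - curves.mean(axis=0)
    v = centered.T @ centered / (n - 1)      # v(t_j, t_k) as in (1)
    # On the grid, the integral operator in (3) becomes the symmetric matrix h * v.
    eigvals, eigvecs = np.linalg.eigh(h * v)
    order = np.argsort(eigvals)[::-1][:n_components]
    lam = eigvals[order]
    xi = eigvecs[:, order].T / np.sqrt(h)    # so that sum(xi_j ** 2) * h == 1
    scores = curves @ xi.T * h               # w_i^(j) as in (4)
    return lam, xi, scores


# Synthetic example (an assumption): 30 units with a persistent level effect
# and a smaller effect that changes sign halfway through the period.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 100)
n = 30
level = rng.normal(size=(n, 1))
shock = rng.normal(scale=0.3, size=(n, 1))
x = 5.0 + level + shock * np.sign(t - 0.5) + rng.normal(scale=0.05, size=(n, t.size))

# Standardization across units at each time point, as in (5).
z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)

lam, xi, w = functional_pca(z, t)

# Share of total variability: eigenvalue over the integral of v(t, t).
total = z.var(axis=0, ddof=1).sum() * (t[1] - t[0])
explained = lam / total
```

After the standardization in (5) the mean curve is identically zero, so scores computed from the curves themselves, as in (4), coincide with scores computed from centered curves. In this synthetic setting the first component captures the level effect and the second the sign-changing effect, loosely analogous to the trend and shock components described above.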
[Fig. 2. Scores on the two first harmonics. Horizontal axis: scores on harmonic 1; vertical axis: scores on harmonic 2.]

[Table 2. Comparison between $z_i^{(\min)}$ and $w_i^{(1)}$ for some companies. Companies listed: Eni, Telecom, Tim, Enel, Generali, Olivetti, Unicredito, T, Mediaset, Seat Pagine Gialle. The numerical values are not reproduced here.]

In our opinion, the obtained results open some methodological perspectives for the construction of new financial indices having suitable statistical properties. As a matter of fact, the construction of some existing stock market indices has been criticized by several authors, see e.g. Elton and Gruber (1995). For example, it is well known that the famous U.S. Dow Jones index presents some statistical flaws; yet, despite these drawbacks in the methodology used in its computation, it continues to be widely employed.
[Table 3. Comparison between $\delta_i$ and $w_i^{(2)}$ for some companies. Companies listed: Seat Pagine Gialle ($\delta_i$ = 80.20%), Olivetti, Enel, Unicredito, Autostrade, T. The remaining numerical values are not reproduced here.]

In Italy, the MIB30 share basket is summarized by the MIB30 index, which is calculated according to the formula

$$\mathrm{MIB30} = 10{,}000 \left( \sum_{i=1}^{30} \frac{p_{it}}{p_{i0}}\, w_{i0} \right) r_0 \quad \text{with} \quad w_{i0} = \frac{p_{i0}\, q_{i0}}{\sum_{k=1}^{30} p_{k0}\, q_{k0}} \qquad (6)$$

where $p_{it}$ is the current price of the $i$-th share at time $t$; $p_{i0}$ is the base price of the $i$-th share, i.e. the opening price on the day on which the updating of the index takes effect (multiplied, where appropriate, by an adjustment coefficient calculated by the Italian Exchange in the event of actions involving the $i$-th company's capital); $q_{i0}$ is the base number of shares in circulation of the $i$-th stock. The weight $w_{i0}$ of the $i$-th share in (6) is given by the ratio of the company's market capitalisation to the market capitalisation of all the companies in the basket. Finally, $r_0$ is a factor with the base set equal to one, used to maintain the continuity of the index when the basket is updated, and the value 10,000 is the base of the index on December 31st, [...]. However, such indices do not take into account the variability of the share prices (or of the traded volumes, or of other related quantities) during any time interval (e.g. between two consecutive updates of the basket composition). In view of the results presented above, the shares' scores on the first harmonic seem to constitute a good ingredient for a new family of financial indices that try to capture as much as possible of the variability of the prices in the share basket. This provides ideas for further developments of functional principal component techniques in the financial field.

Acknowledgments

The dataset used in this paper was collected by the Italian Stock Exchange. The authors thank Research & Development DBMS (Borsa Italiana).

References

CERIOLI, A., LAURINI, F. and CORBELLINI, A. (2003): Functional cluster analysis of financial time series. In: M. Vichi and P. Monari (Eds.): Book of Short Papers of Cladag 2003.
COSTANZO, G.D. (2003): A graphical analysis of the dynamics of the MIB30 index in the period [...] by a functional data approach. In: Atti Riunione Scientifica SIS 2003, Rocco Curto Editore, Napoli.
COSTANZO, G.D. and INGRASSIA, S. (2004): Analysis of the MIB30 basket in the period 2000-2002 by functional PC's. In: J. Antoch (Ed.): Proceedings of Compstat 2004, Prague, August 23-27, 2004.
DUNIS, C., GAVRIDIS, M., HARRIS, A., LEONG, S. and NACASKUL, P. (1998): An application of genetic algorithms to high frequency trading models: a case study. In: C. Dunis and B. Zhou (Eds.): Nonlinear Modeling of High Frequency Financial Time Series. John Wiley & Sons, New York.
ELTON, E.J. and GRUBER, M.J. (1973): Estimating the dependence structure of share prices. Implications for portfolio, Journal of Finance.
ELTON, E.J. and GRUBER, M.J. (1995): Modern Portfolio Theory and Investment Analysis. John Wiley & Sons, New York.
PEZZULLI, S. (1994): L'analisi delle componenti principali quando i dati sono funzioni. Tesi di Dottorato.
RAMSAY, J.O. (2001): Matlab and S-PLUS functions for Functional Data Analysis. McGill University.
RAMSAY, J.O. and SILVERMAN, B.W. (1997): Functional Data Analysis. Springer-Verlag, New York.
RAMSAY, J.O. and SILVERMAN, B.W. (2002): Applied Functional Data Analysis. Springer-Verlag, New York.
WAHBA, G. (1990): Spline Models for Observational Data. CBMS-NSF Regional Conference Series in Applied Mathematics, SIAM, Philadelphia.
More informationPCR and PLS for Clusterwise Regression on Functional Data
PCR and PLS for Clusterwise Regression on Functional Data Cristian Preda 1 and Gilbert Saporta 2 1 Faculté de Médecine, Université de Lille 2 CERIM - Département de Statistique 1, Place de Verdun, 5945
More informationAn Introduction to Correlation Stress Testing
An Introduction to Correlation Stress Testing Defeng Sun Department of Mathematics and Risk Management Institute National University of Singapore This is based on a joint work with GAO Yan at NUS March
More informationDimension Reduction Techniques. Presented by Jie (Jerry) Yu
Dimension Reduction Techniques Presented by Jie (Jerry) Yu Outline Problem Modeling Review of PCA and MDS Isomap Local Linear Embedding (LLE) Charting Background Advances in data collection and storage
More informationDiagnostics for Linear Models With Functional Responses
Diagnostics for Linear Models With Functional Responses Qing Shen Edmunds.com Inc. 2401 Colorado Ave., Suite 250 Santa Monica, CA 90404 (shenqing26@hotmail.com) Hongquan Xu Department of Statistics University
More informationRegression: Ordinary Least Squares
Regression: Ordinary Least Squares Mark Hendricks Autumn 2017 FINM Intro: Regression Outline Regression OLS Mathematics Linear Projection Hendricks, Autumn 2017 FINM Intro: Regression: Lecture 2/32 Regression
More informationTable of Contents. Multivariate methods. Introduction II. Introduction I
Table of Contents Introduction Antti Penttilä Department of Physics University of Helsinki Exactum summer school, 04 Construction of multinormal distribution Test of multinormality with 3 Interpretation
More information3E4: Modelling Choice
3E4: Modelling Choice Lecture 6 Goal Programming Multiple Objective Optimisation Portfolio Optimisation Announcements Supervision 2 To be held by the end of next week Present your solutions to all Lecture
More informationDIMENSION REDUCTION AND CLUSTER ANALYSIS
DIMENSION REDUCTION AND CLUSTER ANALYSIS EECS 833, 6 March 2006 Geoff Bohling Assistant Scientist Kansas Geological Survey geoff@kgs.ku.edu 864-2093 Overheads and resources available at http://people.ku.edu/~gbohling/eecs833
More informationModeling with Itô Stochastic Differential Equations
Modeling with Itô Stochastic Differential Equations 2.4-2.6 E. Allen presentation by T. Perälä 27.0.2009 Postgraduate seminar on applied mathematics 2009 Outline Hilbert Space of Stochastic Processes (
More informationPREWHITENING-BASED ESTIMATION IN PARTIAL LINEAR REGRESSION MODELS: A COMPARATIVE STUDY
REVSTAT Statistical Journal Volume 7, Number 1, April 2009, 37 54 PREWHITENING-BASED ESTIMATION IN PARTIAL LINEAR REGRESSION MODELS: A COMPARATIVE STUDY Authors: Germán Aneiros-Pérez Departamento de Matemáticas,
More informationCS168: The Modern Algorithmic Toolbox Lecture #8: PCA and the Power Iteration Method
CS168: The Modern Algorithmic Toolbox Lecture #8: PCA and the Power Iteration Method Tim Roughgarden & Gregory Valiant April 15, 015 This lecture began with an extended recap of Lecture 7. Recall that
More informationMACHINE LEARNING. Methods for feature extraction and reduction of dimensionality: Probabilistic PCA and kernel PCA
1 MACHINE LEARNING Methods for feature extraction and reduction of dimensionality: Probabilistic PCA and kernel PCA 2 Practicals Next Week Next Week, Practical Session on Computer Takes Place in Room GR
More informationVector Space Models. wine_spectral.r
Vector Space Models 137 wine_spectral.r Latent Semantic Analysis Problem with words Even a small vocabulary as in wine example is challenging LSA Reduce number of columns of DTM by principal components
More informationFuncICA for time series pattern discovery
FuncICA for time series pattern discovery Nishant Mehta and Alexander Gray Georgia Institute of Technology The problem Given a set of inherently continuous time series (e.g. EEG) Find a set of patterns
More informationA RECURSION FORMULA FOR THE COEFFICIENTS OF ENTIRE FUNCTIONS SATISFYING AN ODE WITH POLYNOMIAL COEFFICIENTS
Georgian Mathematical Journal Volume 11 (2004), Number 3, 409 414 A RECURSION FORMULA FOR THE COEFFICIENTS OF ENTIRE FUNCTIONS SATISFYING AN ODE WITH POLYNOMIAL COEFFICIENTS C. BELINGERI Abstract. A recursion
More informationLecture Notes 1: Vector spaces
Optimization-based data analysis Fall 2017 Lecture Notes 1: Vector spaces In this chapter we review certain basic concepts of linear algebra, highlighting their application to signal processing. 1 Vector
More informationMonitoring Random Start Forward Searches for Multivariate Data
Monitoring Random Start Forward Searches for Multivariate Data Anthony C. Atkinson 1, Marco Riani 2, and Andrea Cerioli 2 1 Department of Statistics, London School of Economics London WC2A 2AE, UK, a.c.atkinson@lse.ac.uk
More informationFINM 331: MULTIVARIATE DATA ANALYSIS FALL 2017 PROBLEM SET 3
FINM 331: MULTIVARIATE DATA ANALYSIS FALL 2017 PROBLEM SET 3 The required files for all problems can be found in: http://www.stat.uchicago.edu/~lekheng/courses/331/hw3/ The file name indicates which problem
More informationA Summary of Economic Methodology
A Summary of Economic Methodology I. The Methodology of Theoretical Economics All economic analysis begins with theory, based in part on intuitive insights that naturally spring from certain stylized facts,
More informationA direct formulation for sparse PCA using semidefinite programming
A direct formulation for sparse PCA using semidefinite programming A. d Aspremont, L. El Ghaoui, M. Jordan, G. Lanckriet ORFE, Princeton University & EECS, U.C. Berkeley Available online at www.princeton.edu/~aspremon
More informationIntroduction. x 1 x 2. x n. y 1
This article, an update to an original article by R. L. Malacarne, performs a canonical correlation analysis on financial data of country - specific Exchange Traded Funds (ETFs) to analyze the relationship
More informationTAMS39 Lecture 10 Principal Component Analysis Factor Analysis
TAMS39 Lecture 10 Principal Component Analysis Factor Analysis Martin Singull Department of Mathematics Mathematical Statistics Linköping University, Sweden Content - Lecture Principal component analysis
More informationCS281 Section 4: Factor Analysis and PCA
CS81 Section 4: Factor Analysis and PCA Scott Linderman At this point we have seen a variety of machine learning models, with a particular emphasis on models for supervised learning. In particular, we
More information2tdt 1 y = t2 + C y = which implies C = 1 and the solution is y = 1
Lectures - Week 11 General First Order ODEs & Numerical Methods for IVPs In general, nonlinear problems are much more difficult to solve than linear ones. Unfortunately many phenomena exhibit nonlinear
More information4 Bias-Variance for Ridge Regression (24 points)
Implement Ridge Regression with λ = 0.00001. Plot the Squared Euclidean test error for the following values of k (the dimensions you reduce to): k = {0, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500,
More informationLearning gradients: prescriptive models
Department of Statistical Science Institute for Genome Sciences & Policy Department of Computer Science Duke University May 11, 2007 Relevant papers Learning Coordinate Covariances via Gradients. Sayan
More informationCS 540: Machine Learning Lecture 1: Introduction
CS 540: Machine Learning Lecture 1: Introduction AD January 2008 AD () January 2008 1 / 41 Acknowledgments Thanks to Nando de Freitas Kevin Murphy AD () January 2008 2 / 41 Administrivia & Announcement
More informationData Mining Techniques
Data Mining Techniques CS 6220 - Section 3 - Fall 2016 Lecture 12 Jan-Willem van de Meent (credit: Yijun Zhao, Percy Liang) DIMENSIONALITY REDUCTION Borrowing from: Percy Liang (Stanford) Linear Dimensionality
More informationGeostatistics for Gaussian processes
Introduction Geostatistical Model Covariance structure Cokriging Conclusion Geostatistics for Gaussian processes Hans Wackernagel Geostatistics group MINES ParisTech http://hans.wackernagel.free.fr Kernels
More informationPrincipal Component Analysis CS498
Principal Component Analysis CS498 Today s lecture Adaptive Feature Extraction Principal Component Analysis How, why, when, which A dual goal Find a good representation The features part Reduce redundancy
More informationChapter 8 - Forecasting
Chapter 8 - Forecasting Operations Management by R. Dan Reid & Nada R. Sanders 4th Edition Wiley 2010 Wiley 2010 1 Learning Objectives Identify Principles of Forecasting Explain the steps in the forecasting
More informationConvergence of Eigenspaces in Kernel Principal Component Analysis
Convergence of Eigenspaces in Kernel Principal Component Analysis Shixin Wang Advanced machine learning April 19, 2016 Shixin Wang Convergence of Eigenspaces April 19, 2016 1 / 18 Outline 1 Motivation
More informationFractal functional regression for classification of gene expression data by wavelets
Fractal functional regression for classification of gene expression data by wavelets Margarita María Rincón 1 and María Dolores Ruiz-Medina 2 1 University of Granada Campus Fuente Nueva 18071 Granada,
More informationECE 661: Homework 10 Fall 2014
ECE 661: Homework 10 Fall 2014 This homework consists of the following two parts: (1) Face recognition with PCA and LDA for dimensionality reduction and the nearest-neighborhood rule for classification;
More informationFunctional principal component analysis of aircraft trajectories
Functional principal component analysis of aircraft trajectories Florence Nicol To cite this version: Florence Nicol. Functional principal component analysis of aircraft trajectories. ISIATM 0, nd International
More informationLinear & nonlinear classifiers
Linear & nonlinear classifiers Machine Learning Hamid Beigy Sharif University of Technology Fall 1396 Hamid Beigy (Sharif University of Technology) Linear & nonlinear classifiers Fall 1396 1 / 44 Table
More informationDimensionality Reduction
Lecture 5 1 Outline 1. Overview a) What is? b) Why? 2. Principal Component Analysis (PCA) a) Objectives b) Explaining variability c) SVD 3. Related approaches a) ICA b) Autoencoders 2 Example 1: Sportsball
More informationForecasting with Expert Opinions
CS 229 Machine Learning Forecasting with Expert Opinions Khalid El-Awady Background In 2003 the Wall Street Journal (WSJ) introduced its Monthly Economic Forecasting Survey. Each month the WSJ polls between
More information2.5 Multivariate Curve Resolution (MCR)
2.5 Multivariate Curve Resolution (MCR) Lecturer: Dr. Lionel Blanchet The Multivariate Curve Resolution (MCR) methods are widely used in the analysis of mixtures in chemistry and biology. The main interest
More informationIntermediate Social Statistics
Intermediate Social Statistics Lecture 5. Factor Analysis Tom A.B. Snijders University of Oxford January, 2008 c Tom A.B. Snijders (University of Oxford) Intermediate Social Statistics January, 2008 1
More informationKernel Principal Component Analysis
Kernel Principal Component Analysis Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr
More informationLinear Algebra Methods for Data Mining
Linear Algebra Methods for Data Mining Saara Hyvönen, Saara.Hyvonen@cs.helsinki.fi Spring 2007 The Singular Value Decomposition (SVD) continued Linear Algebra Methods for Data Mining, Spring 2007, University
More informationLecture Notes in Physics
Lecture Notes in Physics Edited by J. Ehlers, Miinchen, K. Hepp, Zijrich R. Kippenhahn, Miinchen, H. A. Weidenmiiller, Heidelberg and J. Zittartz, Kijln Managing Editor: W. Beiglbijck, Heidelberg 120 Nonlinear
More information