Functional SVD for Big Data
1 Functional SVD for Big Data
Pan Chao
April 23, 2014
2 Outline
1. One-Way Functional SVD: (a) interpretation, (b) robustness, (c) CV/GCV
2. Two-Way Problem
3. Big-data solution (split-and-recombine): (a) data split, (b) recombine
4. Simulation
5. Summary and future work
6. References
3 One-Way Problem
To estimate the functional principal components of the data, we can use the SVD. To incorporate the functional nature of the data, regularization is imposed on the estimates. The first regularized principal component for functional data can be estimated by minimizing

    ρ(Y − uv^T) + λ ||u||^2 ∫ v''(t)^2 dt,

where Y is the data matrix with Y_ij = Y_i(t_j), u is the first left vector, and v is the first right vector with v_j = v(t_j).
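The alternating minimization this criterion suggests can be sketched in a few lines of numpy. This is only an illustrative sketch with a squared-error loss and a generic penalty matrix Omega standing in for the roughness penalty, not the authors' implementation; the function name is hypothetical.

```python
import numpy as np

def one_way_rsvd(Y, Omega, lam, n_iter=50):
    """Alternating minimization for the one-way criterion
    ||Y - u v^T||_F^2 + lam * ||u||^2 * (v^T Omega v)  (least-squares sketch)."""
    m = Y.shape[1]
    # initialize (u, v) from the plain rank-1 SVD
    U0, s, Vt = np.linalg.svd(Y, full_matrices=False)
    u, v = s[0] * U0[:, 0], Vt[0].copy()
    for _ in range(n_iter):
        # u-step: closed form; the penalty contributes lam * (v^T Omega v) * ||u||^2
        u = Y @ v / (v @ v + lam * (v @ Omega @ v))
        # v-step: ridge-type solve given u
        v = np.linalg.solve((u @ u) * (np.eye(m) + lam * Omega), Y.T @ u)
    return u, v
```

Each step solves its penalized least-squares subproblem exactly, so the objective is non-increasing across iterations.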
4 Remarks
1) Smoothness is controlled by the tuning parameter λ.
2) The factor ||u||^2 makes the problem scale-invariant [Huang et al. 2008].
3) ρ(·) is the loss function, which measures the fidelity of the rank-1 approximation; if ρ(·) = ||·||^2, this is least squares.
4) By the theory of smoothing splines [Green & Silverman], the integral has a matrix expression: ∫ v''(t)^2 dt = v^T Ω_v v, where Ω_v = Q R^{-1} Q^T and Q and R are two banded matrices depending on the discretization of the data.
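For equally spaced points the banded matrices Q and R can be written down explicitly, following the Green & Silverman construction. The helper below is a hypothetical sketch for a uniform grid with spacing h, not code from the slides.

```python
import numpy as np

def spline_penalty(m, h=1.0):
    """Build Omega = Q R^{-1} Q^T for a cubic smoothing spline on an
    equally spaced grid of m points with spacing h (Green & Silverman)."""
    # Q: m x (m-2), columns hold the second-difference stencil (1, -2, 1)/h
    Q = np.zeros((m, m - 2))
    for j in range(m - 2):
        Q[j, j], Q[j + 1, j], Q[j + 2, j] = 1 / h, -2 / h, 1 / h
    # R: (m-2) x (m-2), tridiagonal with 2h/3 on the diagonal, h/6 off it
    R = np.zeros((m - 2, m - 2))
    for j in range(m - 2):
        R[j, j] = 2 * h / 3
        if j + 1 < m - 2:
            R[j, j + 1] = R[j + 1, j] = h / 6
    return Q @ np.linalg.solve(R, Q.T)
```

As a sanity check, Ω is symmetric positive semi-definite and annihilates linear functions, since a linear v has zero roughness.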
5 Smoothing Spline View
If Y has only one row, denoted y, then requiring u = 1 reduces the problem to a standard smoothing spline problem: minimize ρ(y − v) + λ v^T Ω_v v.
6 PCA View
If ρ(·) = ||·||_F^2, the minimizer v of the one-way problem without the penalty is

    v̂ = argmax_v (v^T Y^T Y v) / ||v||^2,

which maximizes the variance of the data projected onto v, so v̂ is an estimated PC direction. With the one-way penalty imposed [Huang et al. 2008],

    v̂ = argmax_v (v^T Y^T Y v) / (||v||^2 + λ v^T Ω v)
      = argmax_v (v^T Y^T Y v) / (v^T (I + λΩ) v).
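This penalized Rayleigh quotient is maximized by the top generalized eigenvector of the pair (Y^T Y, I + λΩ), which gives a direct way to compute v̂. A minimal numpy sketch (function name hypothetical):

```python
import numpy as np

def penalized_pc_direction(Y, Omega, lam):
    """Maximize v^T Y^T Y v / v^T (I + lam*Omega) v, i.e. compute the top
    generalized eigenvector of (Y^T Y, I + lam*Omega)."""
    m = Y.shape[1]
    B = np.eye(m) + lam * Omega
    # whiten by B^{-1/2} to reduce to an ordinary symmetric eigenproblem
    w, V = np.linalg.eigh(B)
    B_inv_sqrt = V @ np.diag(1 / np.sqrt(w)) @ V.T
    M = B_inv_sqrt @ Y.T @ Y @ B_inv_sqrt
    _, evecs = np.linalg.eigh(M)
    v = B_inv_sqrt @ evecs[:, -1]        # top eigenvector, mapped back
    return v / np.linalg.norm(v)
```

With λ = 0 this reduces to the ordinary first right singular vector of Y.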
7 Regression View [Zhang et al. 2013]
Let y be the column stack of the data matrix Y, and let U be the block-diagonal matrix with u repeated down the diagonal,

    U = diag(u, u, ..., u) = I_m ⊗ u.

If the loss function is ρ(·) = ||·||^2 and u is given, the estimate of v for the one-way problem without the penalty is given by a linear regression:

    v̂ = (U^T U)^{-1} U^T y.

If the one-way penalty is imposed, we have a ridge-regression-type problem; the estimate of v is

    v̂ = (U^T U + λ ||u||^2 Ω)^{-1} U^T y.
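The block structure of U makes this easy to verify numerically: since U^T U = ||u||^2 I and U^T y = Y^T u, the ridge solve matches the direct closed form (||u||^2 (I + λΩ))^{-1} Y^T u. A sketch, with a simple second-difference matrix standing in for Ω:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, lam = 6, 4, 0.5
Y = rng.standard_normal((n, m))
u = rng.standard_normal(n)
D = np.diff(np.eye(m), 2, axis=0)
Omega = D.T @ D                              # illustrative roughness surrogate

y = Y.T.reshape(-1)                          # column stack of Y
U = np.kron(np.eye(m), u.reshape(-1, 1))     # block-diagonal copies of u

v_ridge = np.linalg.solve(U.T @ U + lam * (u @ u) * Omega, U.T @ y)
v_direct = np.linalg.solve((u @ u) * (np.eye(m) + lam * Omega), Y.T @ u)
assert np.allclose(v_ridge, v_direct)        # the two forms coincide
```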
8 Robust Version [Zhang et al. 2013]
Sometimes there are outliers in the observed data; then we can replace the usual quadratic loss with a more robust loss function ρ(·). For example, the Huber loss:

    ρ(x) = x^2 / 2,          if |x| ≤ θ,
    ρ(x) = θ(|x| − θ/2),     otherwise.
9 The regression is then analogous to a weighted least squares problem:

    v̂ = (U^T W U + λ ||u||^2 Ω)^{-1} U^T W y,

where the diagonal weight matrix W is constructed from the residuals:

    W_ij = ρ'(y_ij − u_i v_j) / (y_ij − u_i v_j).
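For the Huber loss, ρ'(x)/x = min(1, θ/|x|), and one weighted update of v can exploit the block structure of U so that U^T W U is diagonal. The sketch below is illustrative: the function names and the default θ = 1.345 (a common choice, not necessarily the slides') are assumptions.

```python
import numpy as np

def huber_weight(r, theta=1.345):
    """IRLS weights from the Huber loss: w = rho'(r)/r = min(1, theta/|r|)."""
    a = np.abs(r)
    return np.where(a <= theta, 1.0, theta / np.maximum(a, 1e-12))

def irls_v_step(Y, u, Omega, lam, v, theta=1.345):
    """One weighted-least-squares update of v given u, with W built from
    the current residuals y_ij - u_i v_j."""
    R = Y - np.outer(u, v)                 # current residuals
    W = huber_weight(R, theta)             # elementwise weights
    # block structure: U^T W U = diag(sum_i W_ij u_i^2), U^T W y = sum_i W_ij u_i y_ij
    lhs = np.diag((W * u[:, None] ** 2).sum(axis=0)) + lam * (u @ u) * Omega
    rhs = (W * Y * u[:, None]).sum(axis=0)
    return np.linalg.solve(lhs, rhs)
```

With a very large θ all weights equal 1 and the step reduces to the unweighted ridge estimate of the previous slide.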
10 Picking λ
The smoothing parameter λ can be chosen by leave-one-column cross-validation [Huang et al. 2008; Zhang et al. 2013].
Non-robust case: let A_λ = (I + λΩ)^{-1} (with u normalized so that ||u|| = 1). The CV and GCV scores are

    CV(λ)  = (1/m) Σ_{j=1}^m [ {(I − A_λ) Y^T u}_j / (1 − {A_λ}_jj) ]^2,
    GCV(λ) = (1/m) ||(I − A_λ) Y^T u||^2 / (1 − Tr(A_λ)/m)^2.

Remark: these closed forms rely on the linear-smoother shortcut; a CV criterion that literally refits the whole problem with one column deleted admits no such computational simplification.
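The GCV score is direct to compute on a grid of λ values (e.g. the seq(0, 2, by=0.1) grid used later in the slides, skipping λ = 0 where the ratio is 0/0). A sketch, assuming ||u|| = 1; function names are hypothetical:

```python
import numpy as np

def gcv_score(Y, u, Omega, lam):
    """GCV(lam) = (1/m) ||(I - A) Y^T u||^2 / (1 - tr(A)/m)^2,
    with A = (I + lam * Omega)^{-1}; assumes ||u|| = 1."""
    m = Y.shape[1]
    A = np.linalg.inv(np.eye(m) + lam * Omega)
    r = (np.eye(m) - A) @ (Y.T @ u)
    return (r @ r) / m / (1.0 - np.trace(A) / m) ** 2

def pick_lambda(Y, u, Omega, grid):
    """Choose lambda by minimizing GCV over a grid."""
    scores = [gcv_score(Y, u, Omega, lam) for lam in grid]
    return grid[int(np.argmin(scores))]
```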
11 Robust case: let A_λ = U (U^T W U + λ ||u||^2 Ω)^{-1} U^T W. Then

    GCV(λ) = (1/m) ||v̂_λ − ṽ||^2 / (1 − Tr(A_λ)/m)^2,

where v̂_λ = (U^T W U + λ ||u||^2 Ω)^{-1} U^T W y is the penalized estimate and ṽ = (U^T W U)^{-1} U^T W y is the unpenalized one.
12 Two-Way Problem
If both directions of the data are considered as functions, a two-way version is to minimize

    ρ(Y − uv^T) + λ_v ||u||^2 v^T Ω_v v + λ_u ||v||^2 u^T Ω_u u + λ_v λ_u (u^T Ω_u u)(v^T Ω_v v),

where we add a penalty for the second direction and an interaction term is introduced. ρ(·) can again be chosen to be a robust loss.
13 Estimation [Zhang et al. 2013]
The estimates of v and u can be updated iteratively (IRLS) as

    v̂ = (U^T W U + 2 Ω_{v|u})^{-1} U^T W y,   Ω_{v|u} = [u^T (I + λ_u Ω_u) u] (I + λ_v Ω_v) − (u^T u) I,
    û = (V^T W V + 2 Ω_{u|v})^{-1} V^T W y,   Ω_{u|v} = [v^T (I + λ_v Ω_v) v] (I + λ_u Ω_u) − (v^T v) I,

where V is the block-diagonal matrix built from v in the same way U is built from u. The hat matrices are

    H  = U (U^T W U + 2 Ω_{v|u})^{-1} U^T W,
    H* = V (V^T W V + 2 Ω_{u|v})^{-1} V^T W.
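The induced penalty matrix Ω_{v|u} can be checked directly: the quadratic form v^T Ω_{v|u} v reproduces exactly the three penalty terms of the two-way criterion. A numerical sketch (second-difference matrices stand in for the spline penalties; values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 7, 5
lam_u, lam_v = 0.4, 0.7
u, v = rng.standard_normal(n), rng.standard_normal(m)
Du = np.diff(np.eye(n), 2, axis=0); Omega_u = Du.T @ Du
Dv = np.diff(np.eye(m), 2, axis=0); Omega_v = Dv.T @ Dv

# the induced penalty matrix for v given u
Omega_v_u = ((u @ (np.eye(n) + lam_u * Omega_u) @ u) * (np.eye(m) + lam_v * Omega_v)
             - (u @ u) * np.eye(m))

# v^T Omega_{v|u} v equals the penalty part of the two-way criterion
penalty = (lam_v * (u @ u) * (v @ Omega_v @ v)
           + lam_u * (u @ Omega_u @ u) * (v @ v)
           + lam_u * lam_v * (u @ Omega_u @ u) * (v @ Omega_v @ v))
assert np.isclose(v @ Omega_v_u @ v, penalty)
```

The (u^T u) I term cancels the unpenalized ||v||^2 contribution, which is why it appears with a minus sign.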
14 GCV

    GCV(λ_v | λ_u) = (1/n) ||v̂ − ṽ||^2 / (1 − Tr(H)/n)^2,    ṽ = (U^T W U)^{-1} U^T W y,
    GCV(λ_u | λ_v) = (1/m) ||û − ũ||^2 / (1 − Tr(H*)/m)^2,    ũ = (V^T W V)^{-1} V^T W y.

Remark: these are leave-out-one-column/row CV criteria.
15 Big Data and Parallelism for the One-Way Problem
Assumptions: the number of rows is large while the number of columns is moderate; equally spaced common grids are assumed; and the smoothing parameter is forced to be the same for all subsets.
1. Split the data
2. Estimation for one block
3. Recombine
4. Simulation results
16 Split Data
The n × m data matrix Y is split row-wise into K blocks: the first block holds rows 1 through n_1, the second rows n_1 + 1 through n_1 + n_2, and so on, with n_1 + ... + n_K = n.
Remarks:
1. Each subset is a block of the data matrix.
2. The sizes of the subsets are the same except for the last one (the total number of rows may not be divisible by K).
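The row-wise split with equal block sizes except possibly the last can be sketched as follows (function name hypothetical):

```python
import numpy as np

def split_rows(Y, K):
    """Split Y row-wise into K blocks of equal size, the last one smaller
    when the number of rows is not divisible by K."""
    n = Y.shape[0]
    size = -(-n // K)                     # ceiling division
    return [Y[k * size:(k + 1) * size] for k in range(K)]
```

Stacking the blocks back together recovers Y, so no rows are lost in the split.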
17 When the penalty is imposed in one direction (say v), an SVD direction can be estimated for each subset, and the subset estimates can be combined to recover the result that would be obtained from the whole matrix. For the k-th subset,

    v̂_k = (U_k^T W_k U_k + 2 Ω_{v_k|u_k})^{-1} U_k^T W_k y_k,    Ω_{v_k|u_k} = ||u_k||^2 λ_v Ω_v,
    û_k = (V_k^T W_k V_k + 2 Ω_{u_k|v_k})^{-1} V_k^T W_k y_k,    Ω_{u_k|v_k} = (λ_v v_k^T Ω_v v_k) I.
18 Recombining

    v̂^(c) = [ Σ_{k=1}^K (U_k^T W_k U_k + 2 Ω_{v_k|u_k}) ]^{-1} Σ_{k=1}^K [ (U_k^T W_k U_k + 2 Ω_{v_k|u_k}) v̂_k ],
    û^(c) = (û_1, û_2, ..., û_K)  (stacked).
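In the unweighted least-squares case the recombination is exact: the per-block matrices A_k = U_k^T U_k + Ω_{v_k|u_k} sum to the full-data matrix, and the sums of A_k v̂_k are the full-data right-hand side, so the weighted average of block estimates recovers the full-data ridge solution. A numerical sketch (W = I for simplicity, illustrative Ω):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, K, lam = 12, 5, 3, 0.3
Y = rng.standard_normal((n, m))
u = rng.standard_normal(n)
D = np.diff(np.eye(m), 2, axis=0)
Omega = D.T @ D

blocks = np.array_split(np.arange(n), K)
A_sum = np.zeros((m, m))
Av_sum = np.zeros(m)
for idx in blocks:
    uk = u[idx]
    Ak = (uk @ uk) * np.eye(m) + lam * (uk @ uk) * Omega  # U_k^T U_k + penalty
    vk = np.linalg.solve(Ak, Y[idx].T @ uk)               # per-block estimate
    A_sum += Ak
    Av_sum += Ak @ vk
v_combined = np.linalg.solve(A_sum, Av_sum)

# agrees with the full-data estimate, since the A_k add up to the full matrix
v_full = np.linalg.solve((u @ u) * (np.eye(m) + lam * Omega), Y.T @ u)
assert np.allclose(v_combined, v_full)
```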
19 Algorithm

    initialize u = û^(0) using the SVD
    split Y and û^(0)
    iter = 0; localdiff = a large initial value
    while localdiff > tol and iter < maxiter do
        initialize parallelism
        estimate the v̂_k^(i) using the û_k^(i)
        stop parallelism
        recombine the v̂_k^(i) to get v̂^(i+1)
        initialize parallelism
        estimate the û_k^(i+1)
        stop parallelism
        recombine the û_k^(i+1) to get û^(i+1)
        Ŷ = û^(i+1) (v̂^(i+1))^T
        localdiff = distance(Y, Ŷ)
        iter = iter + 1
    end
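Putting the pieces together, the loop above can be sketched serially in numpy (the two "parallel" phases become plain for-loops). This is a hypothetical least-squares version with the penalty on v only, not the slides' parallel implementation:

```python
import numpy as np

def split_recombine_rsvd(Y, Omega, lam, K, tol=1e-8, max_iter=100):
    """Serial sketch of the split-and-recombine algorithm."""
    n, m = Y.shape
    U0, s, Vt = np.linalg.svd(Y, full_matrices=False)
    u = s[0] * U0[:, 0]                          # u-hat^(0) from the plain SVD
    blocks = np.array_split(np.arange(n), K)     # split Y by row index
    localdiff = np.inf
    for _ in range(max_iter):
        # v-phase: per-block ridge solves, then weighted recombination
        A_sum = np.zeros((m, m))
        b_sum = np.zeros(m)
        for idx in blocks:
            uk = u[idx]
            Ak = (uk @ uk) * (np.eye(m) + lam * Omega)
            A_sum += Ak
            b_sum += Y[idx].T @ uk               # equals Ak @ v_k_hat
        v = np.linalg.solve(A_sum, b_sum)
        # u-phase: rows decouple given v, so recombining is just stacking
        u = Y @ v / (v @ v + lam * (v @ Omega @ v))
        diff = np.linalg.norm(Y - np.outer(u, v))
        if abs(localdiff - diff) < tol:
            break
        localdiff = diff
    return u, v
```

With λ = 0 the loop converges to the ordinary rank-1 SVD of Y.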
20 Simulation
Setup: 7 replicates; number of cores 1–10; smoothing parameters seq(0, 2, by=0.1).
[Figure: average elapsed time vs. number of CPUs]
21 [Figure: logarithm of the maximum pointwise discrepancy (fixed data set and fixed smoothing parameter)]
22 Summary
1. One-way functional SVD and its interpretations
2. Estimation and smoothing-parameter selection for the one-way problem
3. The big-data problem (i.e., too many curves) and parallelism
4. Simulation results: efficiency and accuracy
23 Future Work
1. Adding GCV for the one-way problem
2. Parallelizing two-way problems
24 References
- Xueying Chen and Min-ge Xie. A split-and-conquer approach for analysis of extraordinarily large data. Pages 1–35, 2012.
- P. J. Green and B. W. Silverman. Nonparametric Regression and Generalized Linear Models: A Roughness Penalty Approach. Chapman & Hall, 1994.
- Jianhua Z. Huang, Haipeng Shen, and Andreas Buja. Functional principal components analysis via penalized rank one approximation. Electronic Journal of Statistics, 2:678–695, 2008.
- Jianhua Z. Huang, Haipeng Shen, and Andreas Buja. The analysis of two-way functional data using two-way regularized singular value decompositions. Journal of the American Statistical Association, 104(488), December 2009.
- L. Zhang, H. Shen, and J. Z. Huang. Robust regularized singular value decomposition with application to mortality data. The Annals of Applied Statistics, 2013.
Solving A Low-Rank Factorization Model for Matrix Completion by A Nonlinear Successive Over-Relaxation Algorithm Zaiwen Wen, Wotao Yin, Yin Zhang 2010 ONR Compressed Sensing Workshop May, 2010 Matrix Completion
More informationPrincipal components analysis COMS 4771
Principal components analysis COMS 4771 1. Representation learning Useful representations of data Representation learning: Given: raw feature vectors x 1, x 2,..., x n R d. Goal: learn a useful feature
More informationMS&E 226: Small Data
MS&E 226: Small Data Lecture 6: Model complexity scores (v3) Ramesh Johari ramesh.johari@stanford.edu Fall 2015 1 / 34 Estimating prediction error 2 / 34 Estimating prediction error We saw how we can estimate
More informationA Generalized Least Squares Matrix Decomposition
A Generalized Least Squares Matrix Decomposition Genevera I. Allen Department of Pediatrics-Neurology, Baylor College of Medicine, Jan and Dan Duncan Neurological Research Institute, Texas Children s Hospital,
More informationAn Extended Frank-Wolfe Method, with Application to Low-Rank Matrix Completion
An Extended Frank-Wolfe Method, with Application to Low-Rank Matrix Completion Robert M. Freund, MIT joint with Paul Grigas (UC Berkeley) and Rahul Mazumder (MIT) CDC, December 2016 1 Outline of Topics
More informationLecture 13. Principal Component Analysis. Brett Bernstein. April 25, CDS at NYU. Brett Bernstein (CDS at NYU) Lecture 13 April 25, / 26
Principal Component Analysis Brett Bernstein CDS at NYU April 25, 2017 Brett Bernstein (CDS at NYU) Lecture 13 April 25, 2017 1 / 26 Initial Question Intro Question Question Let S R n n be symmetric. 1
More informationSparse orthogonal factor analysis
Sparse orthogonal factor analysis Kohei Adachi and Nickolay T. Trendafilov Abstract A sparse orthogonal factor analysis procedure is proposed for estimating the optimal solution with sparse loadings. In
More informationCOS 424: Interacting with Data
COS 424: Interacting with Data Lecturer: Rob Schapire Lecture #14 Scribe: Zia Khan April 3, 2007 Recall from previous lecture that in regression we are trying to predict a real value given our data. Specically,
More informationHigh-dimensional regression
High-dimensional regression Advanced Methods for Data Analysis 36-402/36-608) Spring 2014 1 Back to linear regression 1.1 Shortcomings Suppose that we are given outcome measurements y 1,... y n R, and
More informationLecture 02 Linear Algebra Basics
Introduction to Computational Data Analysis CX4240, 2019 Spring Lecture 02 Linear Algebra Basics Chao Zhang College of Computing Georgia Tech These slides are based on slides from Le Song and Andres Mendez-Vazquez.
More information