A Hierarchical Perspective on Lee-Carter Models

Size: px
Start display at page:

Download "A Hierarchical Perspective on Lee-Carter Models"

Transcription

1 A Hierarchical Perspective on Lee-Carter Models Paul Eilers Leiden University Medical Centre L-C Workshop, Edinburgh 24

2 The vantage point Previous presentation: Iain Currie led you upward From Glen Gumbel to Ben Spline Full, smooth, interaction model for log-mortality surface Let s get a bit arrogant Look back at Lee-Carter model(s) from this height L-C Workshop, Edinburgh 24 1

3 The full model An array of tensor products of B-splines: B a B t Think of an egg-carton for the tensor products Give them different heights (coefficients) and sum Coefficients in matrix A Fitted log-mortality rate: Z = B a AB t or Z = BA B for short z ij = α kl B ik B jl k l L-C Workshop, Edinburgh 24 2

4 Perspective view of full model Year L-C Workshop, Edinburgh 24 3

5 Image view of full model Females, log 1 mortality Year Year 3.8 L-C Workshop, Edinburgh 24 4

6 Illustrative data Netherlands 1 11 Years Males and females separately One-year intervals Source: L-C Workshop, Edinburgh 24 5

7 Additive components and interaction Array of coefficients A = [α kl ] ANOVA-like decomposition: α kl = α k + α l + γ kl Row means α k (for age) Column means α l (for time) Interaction γ kl L-C Workshop, Edinburgh 24 6

8 Dutch females, log-mortality surface 11 1 Females, log 1 mortality 11 1 Females, residual surface Year Year L-C Workshop, Edinburgh 24 7

9 Dutch females, age and year effects 4 Females, period component log 1 mortality Year Females, age component log 1 mortality L-C Workshop, Edinburgh 24 8

10 Dutch males, log-mortality surface 11 Males, log 1 mortality 11 Males, residual surface Year Year L-C Workshop, Edinburgh 24 9

11 Dutch males, age and year effects 3.8 Males, period component log 1 mortality Year Males, age component log 1 mortality L-C Workshop, Edinburgh 24 1

12 The -Period model Forget the interaction part Just take average per age And average over time (Period) Additive model, sum the two Too simple: same age distribution, scaled in time Interaction surface shows size of error Up to factor 2 (log 1 2 =.3) Also clear structure in interaction L-C Workshop, Edinburgh 24 11

13 Decomposition of the interaction Singular value decomposition of Γ = [γ kl ] is Γ = USV Diagonal S, singular values s 1 to s r on diagonal Rank r: minimum of # of rows and # of columns Both U and V orthonormal: U U = I, V V = I Optimal approximation property Best rank q approximation: ˆγ jk = q h=1 u ihs h v jh Variance decomposition h s 2 h = k l γ 2 kl L-C Workshop, Edinburgh 24 12

14 An alternative decomposition Apply singular value decomposition to Z = BA B Results are different Tensor products blurs orthogonality in SVD of A Number of nonzero singular values not changed Rank of A determines rank of Z L-C Workshop, Edinburgh 24 13

15 Singular values for Dutch data Table of s 2 q/ t s 2 t (% of variance explained) Females A Females Z Males A Males Z First singular vector gives good approximation Approximating Z is the better choice L-C Workshop, Edinburgh 24 14

16 Rank-one approximation Approximate interaction by one singular vector z kl = z k + z l + u k1 s 1 v l1 This already looks like Lee-Carter, but smoothness is assumed additional time series z l included L-C Workshop, Edinburgh 24 15

17 Rank-one approximation for females I 11 1 Females, interaction surface Females, rank one approximation Year L-C Workshop, Edinburgh 24 16

18 Rank-one approximation for females II 11 Females, interaction surface.3 Differences with rank one approximation Year L-C Workshop, Edinburgh 24 17

19 Quality of fit for females Data and rank one fit 195; Females Rank one Full model Data and rank one fit 1964; Females Rank one Full model Data and rank one fit 1979; Females Rank one Full model Data and rank one fit 1999; Females Rank one Full model L-C Workshop, Edinburgh 24 18

20 Rank-one approximation for males I Males, interaction surface.3 Males, rank one approximation Year L-C Workshop, Edinburgh 24 19

21 Rank-one approximation for males II 11 1 Males, interaction surface.3 Differences with rank one approximation Year L-C Workshop, Edinburgh 24 2

22 Quality of fit for males Data and rank one fit 195; Males Rank one Full model Data and rank one fit 1964; Males Rank one Full model Data and rank one fit 1979; Males Rank one Full model Data and rank one fit 1999; Males Rank one Full model L-C Workshop, Edinburgh 24 21

23 Higher rank approximation Use two or more singular vectors Improvement not impressive Similar idea proposed by Hyndman Slides on his website (Monash University) www-personal.buseco.monash.edu.au/ hyndman L-C Workshop, Edinburgh 24 22

24 Direct Poisson regression 2D smoothing followed by rank-one approximation is weird It resembles SVD of raw log-mortality Poisson regression more logical (Brouhns, Denuit & Vermunt) Combined with smoothing (Currie, Durban & Eilers) Better fit? Is the extra time series important (LC+)? (Implemented with alternating Poisson regression) L-C Workshop, Edinburgh 24 23

25 Rank-one approximation, again Data and rank one fit 195; Males Rank one Full model Data and rank one fit 1964; Males Rank one Full model Data and rank one fit 1979; Males Rank one Full model Data and rank one fit 1999; Males Rank one Full model L-C Workshop, Edinburgh 24 24

26 LC by smooth Poisson regression 35 Data and LC smooth fit 195; Males 35 Data and LC smooth fit 1964; Males Data and LC smooth fit 1979; Males 35 Data and LC smooth fit 1999; Males L-C Workshop, Edinburgh 24 25

27 LC+ by smooth Poisson regression 35 Data and LC smooth fit 195; Males 35 Data and LC smooth fit 1964; Males Data and LC smooth fit 1979; Males 35 Data and LC smooth fit 1999; Males L-C Workshop, Edinburgh 24 26

28 Computation LC model (age i, year j): z ij = α i + β i κ j Assume α, β and κ to be smooth Approach 1: model them all with penalized splines Approach 2: direct model (no bases) with penalties General principle: alternating Poisson regressions Efficient computation L-C Workshop, Edinburgh 24 27

29 Fitting with P-splines Assume α and β known B-spline model for κ: κ j = l b jl a l, or κ = Ba Expectations E(y ij ) = µ ij = x ij exp(z ij ) (exposures in X) Linear predictor z ij = α i + β i l b jl a l Poisson log-likelihood l = i j(y ij z ij e z ij ) Subtract roughness penalty λ l( 2 a l ) 2 /2 L-C Workshop, Edinburgh 24 28

30 Computational outline Stack columns of Y: y = vec(y) Stack columns of M = [µ ij ]: µ = vec(m) Weighting matrix W = diag(µ ) Compute large basis B = β B Improve approximation ã: (B WB + λd D)a = B (y µ + WB ã) Term λd D comes from penalty L-C Workshop, Edinburgh 24 29

31 Shortcuts Large matrix B (5 years, 1 ages, 2 B-splines) Full matrix W very large (5 by 5) (No problem in Matlab: sparse matrices) Use a shortcut: µ ij β 2 i b jk bjl = ( µ ij β 2 i ) b jk bjl i j j i Compute effective weights v j = i µ ij β 2 i and V = diag(v) Then B WB = B V B Similar trick for B y L-C Workshop, Edinburgh 24 3

32 Computation of α and β First standardize κ: mean, variance 1 Let α = B f and β = Bg Large basis B = [1 B : κ B] One vector contains α and β: c = [α : β ] Weighted regression as above, with vec(y ) and vec(m ) Similar shortcuts as for κ When η ij = α i + β j κ j + γ j (LC+) ditto for γ and κ L-C Workshop, Edinburgh 24 31

33 No bases Use degenerate B-spline basis: identity matrix But keep the penalties Shortcuts essential without sparse matrices Avoids mysterious B-splines Penalty allows automatic extrapolation of κ (γ) L-C Workshop, Edinburgh 24 32

34 Independence plots Jones and Koch (23) proposal for density estimators Plot mixed derivative of log density It is zero for (local) independence Follows from mini-cross-table Apply it to estimated log mortality: diff(diff(z ) ) For L-C: β i κ j + κ j β i For tensor model more complicated (cohort effects) L-C Workshop, Edinburgh 24 33

35 Independence plot for L-C model 11 Males, log mortality fit by LC 11 Independence plot x Year Year L-C Workshop, Edinburgh 24 34

36 Independence plot for tensor product model 11 Males, log 1 mortality Males, independence plot (λ = 1) 11 x Year Year L-C Workshop, Edinburgh 24 35

37 Discussion Interpretation of LC model as rank-one approximation The missing time-average It looks neat and elegant for interpretation Practical use limited Direct smooth Poisson regression better than SVD L-C Workshop, Edinburgh 24 36

38 Lee-Carter for counts Usually Lee-Carter is used for mortality rates Over a large age range If we limit age (to, say, over 7) It also fits well to counts Interesting alternative interpretation Normal distributions on a transformed time scale L-C Workshop, Edinburgh 24 37

39 Fit of smooth LC+ for counts 8 Females α.3 κ β 1 γ L-C Workshop, Edinburgh 24 38

40 Parameter curves of LC+ for counts Data and LC smooth fit (counts) 195; Females 35 Data and LC smooth fit (counts) 1964; Females Data and LC smooth fit (counts) 1979; Females 35 Data and LC smooth fit (counts) 1999; Females L-C Workshop, Edinburgh 24 39

41 Warping age to get normality Can we make the (scaled) age distributions normal? By only transforming age? Yes, we can: z ij = E(y ij ) = f (g(a i ) φ j ), with f (u) = exp( u 2 /2) Counts y ij at age a i, year t j Smooth nonparametric function g(.) Shift parameters φ Distributions scaled by their maximum Expanding the square and collecting terms gives LC+ L-C Workshop, Edinburgh 24 4

42 How do we do it in a simple way? Model warping function with B-splines g(a) = K k=1 B k (a)α k Use roughness penalty when fitting, minimize S = (y ij z ij ) 2 + λ ( 2 α k ) 2 i j k Linearize with Taylor expansion Nickname: SWaN, Shifted Warped Normal L-C Workshop, Edinburgh 24 41

43 SWaN for Dutch females 35 Distributions, NL, Females.5 Center shift φ 3 [deaths/year] Distributions, NL, Females 5 Transformed age 3 4 [deaths/year] Transformed age Transformed age L-C Workshop, Edinburgh 24 42

44 SWaN for Dutch males 3 Distributions, NL, Males.2 Center shift φ [deaths/year] Distributions, NL, Males 5 Transformed age 25 4 [deaths/year] Transformed age Transformed age L-C Workshop, Edinburgh 24 43

45 Improvements Scaled distributions and least squares not optimal Data are counts, assume Poisson distribution Generalized linear model, expected values µ ij η ij = log µ ij = γ j + f (g(a i ) β j ) Time series γ to catch the trend Taylor linearization, (penalized) scoring algorithm A kind of GLM with curious (i.e normal) link function f (.) Results not much different from least squares approach L-C Workshop, Edinburgh 24 44

46 Further research Remarkable: constant width (variance) of normal Will variable width give much improvement? How do transformations compare (gender, countries)? Penalties allow automatic extrapolation of β and γ. How good? Raw mortalities used, population size not involved! Why does it work so well? L-C Workshop, Edinburgh 24 45

Using P-splines to smooth two-dimensional Poisson data

Using P-splines to smooth two-dimensional Poisson data 1 Using P-splines to smooth two-dimensional Poisson data Maria Durbán 1, Iain Currie 2, Paul Eilers 3 17th IWSM, July 2002. 1 Dept. Statistics and Econometrics, Universidad Carlos III de Madrid, Spain.

More information

GLAM An Introduction to Array Methods in Statistics

GLAM An Introduction to Array Methods in Statistics GLAM An Introduction to Array Methods in Statistics Iain Currie Heriot Watt University GLAM A Generalized Linear Array Model is a low-storage, high-speed, method for multidimensional smoothing, when data

More information

Multidimensional Density Smoothing with P-splines

Multidimensional Density Smoothing with P-splines Multidimensional Density Smoothing with P-splines Paul H.C. Eilers, Brian D. Marx 1 Department of Medical Statistics, Leiden University Medical Center, 300 RC, Leiden, The Netherlands (p.eilers@lumc.nl)

More information

Currie, Iain Heriot-Watt University, Department of Actuarial Mathematics & Statistics Edinburgh EH14 4AS, UK

Currie, Iain Heriot-Watt University, Department of Actuarial Mathematics & Statistics Edinburgh EH14 4AS, UK An Introduction to Generalized Linear Array Models Currie, Iain Heriot-Watt University, Department of Actuarial Mathematics & Statistics Edinburgh EH14 4AS, UK E-mail: I.D.Currie@hw.ac.uk 1 Motivating

More information

P -spline ANOVA-type interaction models for spatio-temporal smoothing

P -spline ANOVA-type interaction models for spatio-temporal smoothing P -spline ANOVA-type interaction models for spatio-temporal smoothing Dae-Jin Lee 1 and María Durbán 1 1 Department of Statistics, Universidad Carlos III de Madrid, SPAIN. e-mail: dae-jin.lee@uc3m.es and

More information

Linear Algebra (Review) Volker Tresp 2018

Linear Algebra (Review) Volker Tresp 2018 Linear Algebra (Review) Volker Tresp 2018 1 Vectors k, M, N are scalars A one-dimensional array c is a column vector. Thus in two dimensions, ( ) c1 c = c 2 c i is the i-th component of c c T = (c 1, c

More information

Recovering Indirect Information in Demographic Applications

Recovering Indirect Information in Demographic Applications Recovering Indirect Information in Demographic Applications Jutta Gampe Abstract In many demographic applications the information of interest can only be estimated indirectly. Modelling events and rates

More information

Linear Algebra (Review) Volker Tresp 2017

Linear Algebra (Review) Volker Tresp 2017 Linear Algebra (Review) Volker Tresp 2017 1 Vectors k is a scalar (a number) c is a column vector. Thus in two dimensions, c = ( c1 c 2 ) (Advanced: More precisely, a vector is defined in a vector space.

More information

Space-time modelling of air pollution with array methods

Space-time modelling of air pollution with array methods Space-time modelling of air pollution with array methods Dae-Jin Lee Royal Statistical Society Conference Edinburgh 2009 D.-J. Lee (Uc3m) GLAM: Array methods in Statistics RSS 09 - Edinburgh # 1 Motivation

More information

The Singular Value Decomposition

The Singular Value Decomposition The Singular Value Decomposition Philippe B. Laval KSU Fall 2015 Philippe B. Laval (KSU) SVD Fall 2015 1 / 13 Review of Key Concepts We review some key definitions and results about matrices that will

More information

Linear Methods for Prediction

Linear Methods for Prediction Chapter 5 Linear Methods for Prediction 5.1 Introduction We now revisit the classification problem and focus on linear methods. Since our prediction Ĝ(x) will always take values in the discrete set G we

More information

Exercise Sheet 1. 1 Probability revision 1: Student-t as an infinite mixture of Gaussians

Exercise Sheet 1. 1 Probability revision 1: Student-t as an infinite mixture of Gaussians Exercise Sheet 1 1 Probability revision 1: Student-t as an infinite mixture of Gaussians Show that an infinite mixture of Gaussian distributions, with Gamma distributions as mixing weights in the following

More information

Final Review. Yang Feng. Yang Feng (Columbia University) Final Review 1 / 58

Final Review. Yang Feng.   Yang Feng (Columbia University) Final Review 1 / 58 Final Review Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Final Review 1 / 58 Outline 1 Multiple Linear Regression (Estimation, Inference) 2 Special Topics for Multiple

More information

Flexible Spatio-temporal smoothing with array methods

Flexible Spatio-temporal smoothing with array methods Int. Statistical Inst.: Proc. 58th World Statistical Congress, 2011, Dublin (Session IPS046) p.849 Flexible Spatio-temporal smoothing with array methods Dae-Jin Lee CSIRO, Mathematics, Informatics and

More information

Linear Algebra Review. Vectors

Linear Algebra Review. Vectors Linear Algebra Review 9/4/7 Linear Algebra Review By Tim K. Marks UCSD Borrows heavily from: Jana Kosecka http://cs.gmu.edu/~kosecka/cs682.html Virginia de Sa (UCSD) Cogsci 8F Linear Algebra review Vectors

More information

Multivariate Statistical Analysis

Multivariate Statistical Analysis Multivariate Statistical Analysis Fall 2011 C. L. Williams, Ph.D. Lecture 4 for Applied Multivariate Analysis Outline 1 Eigen values and eigen vectors Characteristic equation Some properties of eigendecompositions

More information

Econ 2120: Section 2

Econ 2120: Section 2 Econ 2120: Section 2 Part I - Linear Predictor Loose Ends Ashesh Rambachan Fall 2018 Outline Big Picture Matrix Version of the Linear Predictor and Least Squares Fit Linear Predictor Least Squares Omitted

More information

Linear Methods for Regression. Lijun Zhang

Linear Methods for Regression. Lijun Zhang Linear Methods for Regression Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Linear Regression Models and Least Squares Subset Selection Shrinkage Methods Methods Using Derived

More information

Multi-Linear Mappings, SVD, HOSVD, and the Numerical Solution of Ill-Conditioned Tensor Least Squares Problems

Multi-Linear Mappings, SVD, HOSVD, and the Numerical Solution of Ill-Conditioned Tensor Least Squares Problems Multi-Linear Mappings, SVD, HOSVD, and the Numerical Solution of Ill-Conditioned Tensor Least Squares Problems Lars Eldén Department of Mathematics, Linköping University 1 April 2005 ERCIM April 2005 Multi-Linear

More information

Lecture 15 Review of Matrix Theory III. Dr. Radhakant Padhi Asst. Professor Dept. of Aerospace Engineering Indian Institute of Science - Bangalore

Lecture 15 Review of Matrix Theory III. Dr. Radhakant Padhi Asst. Professor Dept. of Aerospace Engineering Indian Institute of Science - Bangalore Lecture 15 Review of Matrix Theory III Dr. Radhakant Padhi Asst. Professor Dept. of Aerospace Engineering Indian Institute of Science - Bangalore Matrix An m n matrix is a rectangular or square array of

More information

Assignment #10: Diagonalization of Symmetric Matrices, Quadratic Forms, Optimization, Singular Value Decomposition. Name:

Assignment #10: Diagonalization of Symmetric Matrices, Quadratic Forms, Optimization, Singular Value Decomposition. Name: Assignment #10: Diagonalization of Symmetric Matrices, Quadratic Forms, Optimization, Singular Value Decomposition Due date: Friday, May 4, 2018 (1:35pm) Name: Section Number Assignment #10: Diagonalization

More information

Background Mathematics (2/2) 1. David Barber

Background Mathematics (2/2) 1. David Barber Background Mathematics (2/2) 1 David Barber University College London Modified by Samson Cheung (sccheung@ieee.org) 1 These slides accompany the book Bayesian Reasoning and Machine Learning. The book and

More information

14 Singular Value Decomposition

14 Singular Value Decomposition 14 Singular Value Decomposition For any high-dimensional data analysis, one s first thought should often be: can I use an SVD? The singular value decomposition is an invaluable analysis tool for dealing

More information

STAT 350: Geometry of Least Squares

STAT 350: Geometry of Least Squares The Geometry of Least Squares Mathematical Basics Inner / dot product: a and b column vectors a b = a T b = a i b i a b a T b = 0 Matrix Product: A is r s B is s t (AB) rt = s A rs B st Partitioned Matrices

More information

Lecture 6: Methods for high-dimensional problems

Lecture 6: Methods for high-dimensional problems Lecture 6: Methods for high-dimensional problems Hector Corrada Bravo and Rafael A. Irizarry March, 2010 In this Section we will discuss methods where data lies on high-dimensional spaces. In particular,

More information

1 Singular Value Decomposition and Principal Component

1 Singular Value Decomposition and Principal Component Singular Value Decomposition and Principal Component Analysis In these lectures we discuss the SVD and the PCA, two of the most widely used tools in machine learning. Principal Component Analysis (PCA)

More information

Mixed models in R using the lme4 package Part 4: Theory of linear mixed models

Mixed models in R using the lme4 package Part 4: Theory of linear mixed models Mixed models in R using the lme4 package Part 4: Theory of linear mixed models Douglas Bates 8 th International Amsterdam Conference on Multilevel Analysis 2011-03-16 Douglas Bates

More information

1 Data Arrays and Decompositions

1 Data Arrays and Decompositions 1 Data Arrays and Decompositions 1.1 Variance Matrices and Eigenstructure Consider a p p positive definite and symmetric matrix V - a model parameter or a sample variance matrix. The eigenstructure is

More information

Functional SVD for Big Data

Functional SVD for Big Data Functional SVD for Big Data Pan Chao April 23, 2014 Pan Chao Functional SVD for Big Data April 23, 2014 1 / 24 Outline 1 One-Way Functional SVD a) Interpretation b) Robustness c) CV/GCV 2 Two-Way Problem

More information

Assignment 11 (C + C ) = (C + C ) = (C + C) i(c C ) ] = i(c C) (AB) = (AB) = B A = BA 0 = [A, B] = [A, B] = (AB BA) = (AB) AB

Assignment 11 (C + C ) = (C + C ) = (C + C) i(c C ) ] = i(c C) (AB) = (AB) = B A = BA 0 = [A, B] = [A, B] = (AB BA) = (AB) AB Arfken 3.4.6 Matrix C is not Hermition. But which is Hermitian. Likewise, Assignment 11 (C + C ) = (C + C ) = (C + C) [ i(c C ) ] = i(c C ) = i(c C) = i ( C C ) Arfken 3.4.9 The matrices A and B are both

More information

Review of the General Linear Model

Review of the General Linear Model Review of the General Linear Model EPSY 905: Multivariate Analysis Online Lecture #2 Learning Objectives Types of distributions: Ø Conditional distributions The General Linear Model Ø Regression Ø Analysis

More information

Linear Algebra Review

Linear Algebra Review Linear Algebra Review Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Linear Algebra Review 1 / 45 Definition of Matrix Rectangular array of elements arranged in rows and

More information

Data Mining Stat 588

Data Mining Stat 588 Data Mining Stat 588 Lecture 9: Basis Expansions Department of Statistics & Biostatistics Rutgers University Nov 01, 2011 Regression and Classification Linear Regression. E(Y X) = f(x) We want to learn

More information

Econ 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines

Econ 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines Econ 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines Maximilian Kasy Department of Economics, Harvard University 1 / 37 Agenda 6 equivalent representations of the

More information

Dimensionality Reduction: PCA. Nicholas Ruozzi University of Texas at Dallas

Dimensionality Reduction: PCA. Nicholas Ruozzi University of Texas at Dallas Dimensionality Reduction: PCA Nicholas Ruozzi University of Texas at Dallas Eigenvalues λ is an eigenvalue of a matrix A R n n if the linear system Ax = λx has at least one non-zero solution If Ax = λx

More information

Spatial Process Estimates as Smoothers: A Review

Spatial Process Estimates as Smoothers: A Review Spatial Process Estimates as Smoothers: A Review Soutir Bandyopadhyay 1 Basic Model The observational model considered here has the form Y i = f(x i ) + ɛ i, for 1 i n. (1.1) where Y i is the observed

More information

Inversion Base Height. Daggot Pressure Gradient Visibility (miles)

Inversion Base Height. Daggot Pressure Gradient Visibility (miles) Stanford University June 2, 1998 Bayesian Backtting: 1 Bayesian Backtting Trevor Hastie Stanford University Rob Tibshirani University of Toronto Email: trevor@stat.stanford.edu Ftp: stat.stanford.edu:

More information

Statistics 203: Introduction to Regression and Analysis of Variance Course review

Statistics 203: Introduction to Regression and Analysis of Variance Course review Statistics 203: Introduction to Regression and Analysis of Variance Course review Jonathan Taylor - p. 1/?? Today Review / overview of what we learned. - p. 2/?? General themes in regression models Specifying

More information

1 Linearity and Linear Systems

1 Linearity and Linear Systems Mathematical Tools for Neuroscience (NEU 34) Princeton University, Spring 26 Jonathan Pillow Lecture 7-8 notes: Linear systems & SVD Linearity and Linear Systems Linear system is a kind of mapping f( x)

More information

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model EPSY 905: Multivariate Analysis Lecture 1 20 January 2016 EPSY 905: Lecture 1 -

More information

Testing Independence

Testing Independence Testing Independence Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM 1/50 Testing Independence Previously, we looked at RR = OR = 1

More information

Assignment 2 (Sol.) Introduction to Machine Learning Prof. B. Ravindran

Assignment 2 (Sol.) Introduction to Machine Learning Prof. B. Ravindran Assignment 2 (Sol.) Introduction to Machine Learning Prof. B. Ravindran 1. Let A m n be a matrix of real numbers. The matrix AA T has an eigenvector x with eigenvalue b. Then the eigenvector y of A T A

More information

Practical Linear Algebra: A Geometry Toolbox

Practical Linear Algebra: A Geometry Toolbox Practical Linear Algebra: A Geometry Toolbox Third edition Chapter 12: Gauss for Linear Systems Gerald Farin & Dianne Hansford CRC Press, Taylor & Francis Group, An A K Peters Book www.farinhansford.com/books/pla

More information

Data Mining Lecture 4: Covariance, EVD, PCA & SVD

Data Mining Lecture 4: Covariance, EVD, PCA & SVD Data Mining Lecture 4: Covariance, EVD, PCA & SVD Jo Houghton ECS Southampton February 25, 2019 1 / 28 Variance and Covariance - Expectation A random variable takes on different values due to chance The

More information

COMS 4721: Machine Learning for Data Science Lecture 18, 4/4/2017

COMS 4721: Machine Learning for Data Science Lecture 18, 4/4/2017 COMS 4721: Machine Learning for Data Science Lecture 18, 4/4/2017 Prof. John Paisley Department of Electrical Engineering & Data Science Institute Columbia University TOPIC MODELING MODELS FOR TEXT DATA

More information

MATRICES. a m,1 a m,n A =

MATRICES. a m,1 a m,n A = MATRICES Matrices are rectangular arrays of real or complex numbers With them, we define arithmetic operations that are generalizations of those for real and complex numbers The general form a matrix of

More information

Now consider the case where E(Y) = µ = Xβ and V (Y) = σ 2 G, where G is diagonal, but unknown.

Now consider the case where E(Y) = µ = Xβ and V (Y) = σ 2 G, where G is diagonal, but unknown. Weighting We have seen that if E(Y) = Xβ and V (Y) = σ 2 G, where G is known, the model can be rewritten as a linear model. This is known as generalized least squares or, if G is diagonal, with trace(g)

More information

A Statistical Look at Spectral Graph Analysis. Deep Mukhopadhyay

A Statistical Look at Spectral Graph Analysis. Deep Mukhopadhyay A Statistical Look at Spectral Graph Analysis Deep Mukhopadhyay Department of Statistics, Temple University Office: Speakman 335 deep@temple.edu http://sites.temple.edu/deepstat/ Graph Signal Processing

More information

Bindel, Fall 2016 Matrix Computations (CS 6210) Notes for

Bindel, Fall 2016 Matrix Computations (CS 6210) Notes for 1 A cautionary tale Notes for 2016-10-05 You have been dropped on a desert island with a laptop with a magic battery of infinite life, a MATLAB license, and a complete lack of knowledge of basic geometry.

More information

COMS 4721: Machine Learning for Data Science Lecture 19, 4/6/2017

COMS 4721: Machine Learning for Data Science Lecture 19, 4/6/2017 COMS 4721: Machine Learning for Data Science Lecture 19, 4/6/2017 Prof. John Paisley Department of Electrical Engineering & Data Science Institute Columbia University PRINCIPAL COMPONENT ANALYSIS DIMENSIONALITY

More information

The Poisson trick for matched two-way tables

The Poisson trick for matched two-way tables The Poisson trick for matched two-way tles a case for putting the fish in the bowl (a case for putting the bird in the cage) Simplice Dossou-Gbété1, Antoine de Falguerolles2,* 1. Université de Pau et des

More information

PROBABILISTIC LATENT SEMANTIC ANALYSIS

PROBABILISTIC LATENT SEMANTIC ANALYSIS PROBABILISTIC LATENT SEMANTIC ANALYSIS Lingjia Deng Revised from slides of Shuguang Wang Outline Review of previous notes PCA/SVD HITS Latent Semantic Analysis Probabilistic Latent Semantic Analysis Applications

More information

(a) If A is a 3 by 4 matrix, what does this tell us about its nullspace? Solution: dim N(A) 1, since rank(a) 3. Ax =

(a) If A is a 3 by 4 matrix, what does this tell us about its nullspace? Solution: dim N(A) 1, since rank(a) 3. Ax = . (5 points) (a) If A is a 3 by 4 matrix, what does this tell us about its nullspace? dim N(A), since rank(a) 3. (b) If we also know that Ax = has no solution, what do we know about the rank of A? C(A)

More information

CS540 Machine learning Lecture 5

CS540 Machine learning Lecture 5 CS540 Machine learning Lecture 5 1 Last time Basis functions for linear regression Normal equations QR SVD - briefly 2 This time Geometry of least squares (again) SVD more slowly LMS Ridge regression 3

More information

Likelihood Ratio Tests. that Certain Variance Components Are Zero. Ciprian M. Crainiceanu. Department of Statistical Science

Likelihood Ratio Tests. that Certain Variance Components Are Zero. Ciprian M. Crainiceanu. Department of Statistical Science 1 Likelihood Ratio Tests that Certain Variance Components Are Zero Ciprian M. Crainiceanu Department of Statistical Science www.people.cornell.edu/pages/cmc59 Work done jointly with David Ruppert, School

More information

Linear Methods for Prediction

Linear Methods for Prediction This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Linear Regression Models P8111

Linear Regression Models P8111 Linear Regression Models P8111 Lecture 25 Jeff Goldsmith April 26, 2016 1 of 37 Today s Lecture Logistic regression / GLMs Model framework Interpretation Estimation 2 of 37 Linear regression Course started

More information

Functional responses, functional covariates and the concurrent model

Functional responses, functional covariates and the concurrent model Functional responses, functional covariates and the concurrent model Page 1 of 14 1. Predicting precipitation profiles from temperature curves Precipitation is much harder to predict than temperature.

More information

Math 671: Tensor Train decomposition methods

Math 671: Tensor Train decomposition methods Math 671: Eduardo Corona 1 1 University of Michigan at Ann Arbor December 8, 2016 Table of Contents 1 Preliminaries and goal 2 Unfolding matrices for tensorized arrays The Tensor Train decomposition 3

More information

Semiparametric Generalized Linear Models

Semiparametric Generalized Linear Models Semiparametric Generalized Linear Models North American Stata Users Group Meeting Chicago, Illinois Paul Rathouz Department of Health Studies University of Chicago prathouz@uchicago.edu Liping Gao MS Student

More information

Bare minimum on matrix algebra. Psychology 588: Covariance structure and factor models

Bare minimum on matrix algebra. Psychology 588: Covariance structure and factor models Bare minimum on matrix algebra Psychology 588: Covariance structure and factor models Matrix multiplication 2 Consider three notations for linear combinations y11 y1 m x11 x 1p b11 b 1m y y x x b b n1

More information

Problem # Max points possible Actual score Total 120

Problem # Max points possible Actual score Total 120 FINAL EXAMINATION - MATH 2121, FALL 2017. Name: ID#: Email: Lecture & Tutorial: Problem # Max points possible Actual score 1 15 2 15 3 10 4 15 5 15 6 15 7 10 8 10 9 15 Total 120 You have 180 minutes to

More information

Linear Algebra and Matrices

Linear Algebra and Matrices Linear Algebra and Matrices 4 Overview In this chapter we studying true matrix operations, not element operations as was done in earlier chapters. Working with MAT- LAB functions should now be fairly routine.

More information

GROUPED DATA E.G. FOR SAMPLE OF RAW DATA (E.G. 4, 12, 7, 5, MEAN G x / n STANDARD DEVIATION MEDIAN AND QUARTILES STANDARD DEVIATION

GROUPED DATA E.G. FOR SAMPLE OF RAW DATA (E.G. 4, 12, 7, 5, MEAN G x / n STANDARD DEVIATION MEDIAN AND QUARTILES STANDARD DEVIATION FOR SAMPLE OF RAW DATA (E.G. 4, 1, 7, 5, 11, 6, 9, 7, 11, 5, 4, 7) BE ABLE TO COMPUTE MEAN G / STANDARD DEVIATION MEDIAN AND QUARTILES Σ ( Σ) / 1 GROUPED DATA E.G. AGE FREQ. 0-9 53 10-19 4...... 80-89

More information

CS 143 Linear Algebra Review

CS 143 Linear Algebra Review CS 143 Linear Algebra Review Stefan Roth September 29, 2003 Introductory Remarks This review does not aim at mathematical rigor very much, but instead at ease of understanding and conciseness. Please see

More information

ST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples

ST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples ST3241 Categorical Data Analysis I Generalized Linear Models Introduction and Some Examples 1 Introduction We have discussed methods for analyzing associations in two-way and three-way tables. Now we will

More information

Lecture 3: Linear Models. Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012

Lecture 3: Linear Models. Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012 Lecture 3: Linear Models Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector of observed

More information

Multiple Regression. Dr. Frank Wood. Frank Wood, Linear Regression Models Lecture 12, Slide 1

Multiple Regression. Dr. Frank Wood. Frank Wood, Linear Regression Models Lecture 12, Slide 1 Multiple Regression Dr. Frank Wood Frank Wood, fwood@stat.columbia.edu Linear Regression Models Lecture 12, Slide 1 Review: Matrix Regression Estimation We can solve this equation (if the inverse of X

More information

Need for Several Predictor Variables

Need for Several Predictor Variables Multiple regression One of the most widely used tools in statistical analysis Matrix expressions for multiple regression are the same as for simple linear regression Need for Several Predictor Variables

More information

Regression Models for Quantitative and Qualitative Predictors: An Overview

Regression Models for Quantitative and Qualitative Predictors: An Overview Regression Models for Quantitative and Qualitative Predictors: An Overview Polynomial regression models Interaction regression models Qualitative predictors Indicator variables Modeling interactions between

More information

Chemometrics. Matti Hotokka Physical chemistry Åbo Akademi University

Chemometrics. Matti Hotokka Physical chemistry Åbo Akademi University Chemometrics Matti Hotokka Physical chemistry Åbo Akademi University Linear regression Experiment Consider spectrophotometry as an example Beer-Lamberts law: A = cå Experiment Make three known references

More information

Conceptual Questions for Review

Conceptual Questions for Review Conceptual Questions for Review Chapter 1 1.1 Which vectors are linear combinations of v = (3, 1) and w = (4, 3)? 1.2 Compare the dot product of v = (3, 1) and w = (4, 3) to the product of their lengths.

More information

LECTURE 16: PCA AND SVD

LECTURE 16: PCA AND SVD Instructor: Sael Lee CS549 Computational Biology LECTURE 16: PCA AND SVD Resource: PCA Slide by Iyad Batal Chapter 12 of PRML Shlens, J. (2003). A tutorial on principal component analysis. CONTENT Principal

More information

Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models Introduction to General and Generalized Linear Models Generalized Linear Models - part II Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs.

More information

Introduction to Smoothing spline ANOVA models (metamodelling)

Introduction to Smoothing spline ANOVA models (metamodelling) Introduction to Smoothing spline ANOVA models (metamodelling) M. Ratto DYNARE Summer School, Paris, June 215. Joint Research Centre www.jrc.ec.europa.eu Serving society Stimulating innovation Supporting

More information

Appendix A: Matrices

Appendix A: Matrices Appendix A: Matrices A matrix is a rectangular array of numbers Such arrays have rows and columns The numbers of rows and columns are referred to as the dimensions of a matrix A matrix with, say, 5 rows

More information

Log-linear Models for Contingency Tables

Log-linear Models for Contingency Tables Log-linear Models for Contingency Tables Statistics 149 Spring 2006 Copyright 2006 by Mark E. Irwin Log-linear Models for Two-way Contingency Tables Example: Business Administration Majors and Gender A

More information

High-dimensional Ordinary Least-squares Projection for Screening Variables

High-dimensional Ordinary Least-squares Projection for Screening Variables 1 / 38 High-dimensional Ordinary Least-squares Projection for Screening Variables Chenlei Leng Joint with Xiangyu Wang (Duke) Conference on Nonparametric Statistics for Big Data and Celebration to Honor

More information

Regression, Ridge Regression, Lasso

Regression, Ridge Regression, Lasso Regression, Ridge Regression, Lasso Fabio G. Cozman - fgcozman@usp.br October 2, 2018 A general definition Regression studies the relationship between a response variable Y and covariates X 1,..., X n.

More information

Preliminary Examination, Numerical Analysis, August 2016

Preliminary Examination, Numerical Analysis, August 2016 Preliminary Examination, Numerical Analysis, August 2016 Instructions: This exam is closed books and notes. The time allowed is three hours and you need to work on any three out of questions 1-4 and any

More information

Computational methods for mixed models

Computational methods for mixed models Computational methods for mixed models Douglas Bates Department of Statistics University of Wisconsin Madison March 27, 2018 Abstract The lme4 package provides R functions to fit and analyze several different

More information

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 1: August 22, 2012

More information

Linear Algebra & Geometry why is linear algebra useful in computer vision?

Linear Algebra & Geometry why is linear algebra useful in computer vision? Linear Algebra & Geometry why is linear algebra useful in computer vision? References: -Any book on linear algebra! -[HZ] chapters 2, 4 Some of the slides in this lecture are courtesy to Prof. Octavia

More information

( t) Cox regression part 2. Outline: Recapitulation. Estimation of cumulative hazards and survival probabilites. Ørnulf Borgan

( t) Cox regression part 2. Outline: Recapitulation. Estimation of cumulative hazards and survival probabilites. Ørnulf Borgan Outline: Cox regression part 2 Ørnulf Borgan Department of Mathematics University of Oslo Recapitulation Estimation of cumulative hazards and survival probabilites Assumptions for Cox regression and check

More information

Image Registration Lecture 2: Vectors and Matrices

Image Registration Lecture 2: Vectors and Matrices Image Registration Lecture 2: Vectors and Matrices Prof. Charlene Tsai Lecture Overview Vectors Matrices Basics Orthogonal matrices Singular Value Decomposition (SVD) 2 1 Preliminary Comments Some of this

More information

Study Notes on Matrices & Determinants for GATE 2017

Study Notes on Matrices & Determinants for GATE 2017 Study Notes on Matrices & Determinants for GATE 2017 Matrices and Determinates are undoubtedly one of the most scoring and high yielding topics in GATE. At least 3-4 questions are always anticipated from

More information

to be more efficient on enormous scale, in a stream, or in distributed settings.

to be more efficient on enormous scale, in a stream, or in distributed settings. 16 Matrix Sketching The singular value decomposition (SVD) can be interpreted as finding the most dominant directions in an (n d) matrix A (or n points in R d ). Typically n > d. It is typically easy to

More information

Linear Algebra Methods for Data Mining

Linear Algebra Methods for Data Mining Linear Algebra Methods for Data Mining Saara Hyvönen, Saara.Hyvonen@cs.helsinki.fi Spring 2007 The Singular Value Decomposition (SVD) continued Linear Algebra Methods for Data Mining, Spring 2007, University

More information

Cohort Effect Structure in the Lee-Carter Residual Term. Naoki Sunamoto, FIAJ. Fukoku Mutual Life Insurance Company

Cohort Effect Structure in the Lee-Carter Residual Term. Naoki Sunamoto, FIAJ. Fukoku Mutual Life Insurance Company Cohort Effect Structure in the Lee-Carter Residual Term Naoki Sunamoto, FIAJ Fukoku Mutual Life Insurance Company 2-2 Uchisaiwaicho 2-chome, Chiyoda-ku, Tokyo, 100-0011, Japan Tel: +81-3-3593-7445, Fax:

More information

DS-GA 1002 Lecture notes 10 November 23, Linear models

DS-GA 1002 Lecture notes 10 November 23, Linear models DS-GA 2 Lecture notes November 23, 2 Linear functions Linear models A linear model encodes the assumption that two quantities are linearly related. Mathematically, this is characterized using linear functions.

More information

Analysing geoadditive regression data: a mixed model approach

Analysing geoadditive regression data: a mixed model approach Analysing geoadditive regression data: a mixed model approach Institut für Statistik, Ludwig-Maximilians-Universität München Joint work with Ludwig Fahrmeir & Stefan Lang 25.11.2005 Spatio-temporal regression

More information

Some Notes on Least Squares, QR-factorization, SVD and Fitting

Some Notes on Least Squares, QR-factorization, SVD and Fitting Department of Engineering Sciences and Mathematics January 3, 013 Ove Edlund C000M - Numerical Analysis Some Notes on Least Squares, QR-factorization, SVD and Fitting Contents 1 Introduction 1 The Least

More information

Linear Algebra V = T = ( 4 3 ).

Linear Algebra V = T = ( 4 3 ). Linear Algebra Vectors A column vector is a list of numbers stored vertically The dimension of a column vector is the number of values in the vector W is a -dimensional column vector and V is a 5-dimensional

More information

Least Squares. Tom Lyche. October 26, Centre of Mathematics for Applications, Department of Informatics, University of Oslo

Least Squares. Tom Lyche. October 26, Centre of Mathematics for Applications, Department of Informatics, University of Oslo Least Squares Tom Lyche Centre of Mathematics for Applications, Department of Informatics, University of Oslo October 26, 2010 Linear system Linear system Ax = b, A C m,n, b C m, x C n. under-determined

More information

Lecture 1 Review: Linear models have the form (in matrix notation) Y = Xβ + ε,

Lecture 1 Review: Linear models have the form (in matrix notation) Y = Xβ + ε, 2. REVIEW OF LINEAR ALGEBRA 1 Lecture 1 Review: Linear models have the form (in matrix notation) Y = Xβ + ε, where Y n 1 response vector and X n p is the model matrix (or design matrix ) with one row for

More information

Default Priors and Effcient Posterior Computation in Bayesian

Default Priors and Effcient Posterior Computation in Bayesian Default Priors and Effcient Posterior Computation in Bayesian Factor Analysis January 16, 2010 Presented by Eric Wang, Duke University Background and Motivation A Brief Review of Parameter Expansion Literature

More information

Learning Gaussian Graphical Models with Unknown Group Sparsity

Learning Gaussian Graphical Models with Unknown Group Sparsity Learning Gaussian Graphical Models with Unknown Group Sparsity Kevin Murphy Ben Marlin Depts. of Statistics & Computer Science Univ. British Columbia Canada Connections Graphical models Density estimation

More information

Matrix Factorizations

Matrix Factorizations 1 Stat 540, Matrix Factorizations Matrix Factorizations LU Factorization Definition... Given a square k k matrix S, the LU factorization (or decomposition) represents S as the product of two triangular

More information

1 Cricket chirps: an example

1 Cricket chirps: an example Notes for 2016-09-26 1 Cricket chirps: an example Did you know that you can estimate the temperature by listening to the rate of chirps? The data set in Table 1 1. represents measurements of the number

More information

Review of Linear Algebra

Review of Linear Algebra Review of Linear Algebra Dr Gerhard Roth COMP 40A Winter 05 Version Linear algebra Is an important area of mathematics It is the basis of computer vision Is very widely taught, and there are many resources

More information