A Constraint Generation Approach to Learning Stable Linear Dynamical Systems


A Constraint Generation Approach to Learning Stable Linear Dynamical Systems. Sajid M. Siddiqi, Byron Boots, Geoffrey J. Gordon. Carnegie Mellon University. NIPS 2007 poster W22.


Application: Dynamic Textures¹ (example videos: steam, river, fountain). Videos of moving scenes that exhibit stationarity properties. Their dynamics can be captured by a low-dimensional model, and learned models can efficiently simulate realistic sequences. Applications: compression, recognition, and synthesis of videos. ¹ S. Soatto, G. Doretto and Y. Wu. Dynamic Textures. Proceedings of the ICCV, 2001.

Linear Dynamical Systems. Input: a data sequence. Assume a latent variable that explains the data, and assume a Markov model on the latent variable.

Linear Dynamical Systems². State and observation models:
x_{t+1} = A x_t + w_t,  w_t ~ N(0, Q)   (dynamics matrix A)
y_t = C x_t + v_t,  v_t ~ N(0, R)   (observation matrix C)
This talk: learning LDS parameters from data while ensuring a stable dynamics matrix A, more efficiently and accurately than previous methods.
² Kalman, R. E. (1960). A new approach to linear filtering and prediction problems. Trans. ASME-JBE.
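As a concrete illustration, here is a minimal numpy sketch of simulating from this generative model; the dimensions, noise scales, and the particular A and C below are illustrative assumptions, not values from the talk.

```python
# Simulate an LDS: x_{t+1} = A x_t + w_t, y_t = C x_t + v_t.
import numpy as np

rng = np.random.default_rng(0)
n, m, T = 2, 3, 100                     # latent dim, obs dim, length

A = np.array([[0.9, 0.1],               # dynamics matrix (stable)
              [-0.1, 0.9]])
C = rng.standard_normal((m, n))         # observation matrix

x = np.zeros(n)
states, observations = [], []
for t in range(T):
    x = A @ x + 0.1 * rng.standard_normal(n)   # x_{t+1} = A x_t + w_t
    y = C @ x + 0.1 * rng.standard_normal(m)   # y_t = C x_t + v_t
    states.append(x)
    observations.append(y)
X, Y = np.array(states), np.array(observations)  # T x n and T x m
```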

Learning Linear Dynamical Systems. Suppose we have an estimated state sequence x_1, …, x_T. Stack it into X_0 = [x_1 … x_{T−1}] and X_1 = [x_2 … x_T], and define the state reconstruction error as the objective:
J(A) = ‖X_1 − A X_0‖²_F
We would like to learn A such that this objective is minimized, i.e. A* = argmin_A ‖X_1 − A X_0‖²_F.
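The unconstrained minimizer has a closed form via the pseudoinverse. A small numpy sketch; the function name and the T x n layout of X are my own conventions:

```python
# Unconstrained least-squares estimate of the dynamics matrix A.
import numpy as np

def least_squares_dynamics(X):
    """argmin_A ||X1 - A X0||_F^2 via the pseudoinverse."""
    X0 = X[:-1].T                        # n x (T-1): x_1 ... x_{T-1}
    X1 = X[1:].T                         # n x (T-1): x_2 ... x_T
    return X1 @ np.linalg.pinv(X0)       # A_hat = X1 X0^+
```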

Subspace Identification³. Subspace ID uses matrix decomposition to estimate the state sequence. Build a Hankel matrix D of stacked observations. In expectation, the Hankel matrix is inherently low-rank! We can therefore use SVD to obtain the low-dimensional state sequence: for D with k observations per column, D = U Σ Vᵀ, giving the state sequence estimate X̂ = Σ Vᵀ (and the observation matrix from U).
³ P. Van Overschee and B. De Moor. Subspace Identification for Linear Systems. Kluwer, 1996.
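A minimal sketch of this decomposition step, assuming a T x m observation matrix Y and k stacked observations per column; the function name and interface are my own:

```python
# Subspace ID via SVD of a Hankel matrix of stacked observations.
import numpy as np

def subspace_id(Y, n, k=1):
    """Y: T x m observations. Returns C_hat (km x n), X_hat (n x T-k+1)."""
    T, m = Y.shape
    # Hankel matrix: column j stacks observations y_j, ..., y_{j+k-1}
    D = np.vstack([Y[i:T - k + 1 + i].T for i in range(k)])
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    C_hat = U[:, :n]                      # observation matrix estimate
    X_hat = np.diag(s[:n]) @ Vt[:n]       # low-dimensional state sequence
    return C_hat, X_hat
```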

Let's train an LDS for the steam texture using this algorithm (latent state x_t ∈ ℝ^40), and simulate a video from it!

Simulating from a learned model [plot of state components x_t over time t]: the model is unstable.

Notation
λ_1, …, λ_n: eigenvalues of A (|λ_1| ≥ … ≥ |λ_n|)
ν_1, …, ν_n: unit-length eigenvectors of A
σ_1, …, σ_n: singular values of A (σ_1 ≥ σ_2 ≥ … ≥ σ_n)
S_λ: the set of matrices with |λ_1| ≤ 1
S_σ: the set of matrices with σ_1 ≤ 1

Stability: a matrix A is stable if |λ_1| ≤ 1, i.e. if all eigenvalues lie on or inside the unit circle. Example: with λ_1 = 0.3 the state components x_t(1), x_t(2) converge; with λ_1 = 1.295 they diverge. We would like to solve
A* = argmin_A ‖X_1 − A X_0‖²_F  s.t.  A ∈ S_λ.
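Checking membership in S_λ and S_σ is a one-liner each. A small numpy sketch; the example matrix is my own, chosen to show that a matrix can be in S_λ but not in S_σ:

```python
# Check membership in S_lambda (spectral radius) and S_sigma (sigma_1).
import numpy as np

def spectral_radius(A):
    return np.max(np.abs(np.linalg.eigvals(A)))   # |lambda_1|

A = np.array([[1.0, 0.5],
              [0.0, 0.9]])
print(spectral_radius(A) <= 1)         # True:  A is in S_lambda
print(np.linalg.norm(A, 2) <= 1)       # False: sigma_1 > 1, not in S_sigma
```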

Stability and Convexity. But S_λ is non-convex! (Two stable matrices A_1 and A_2 can have an unstable convex combination.) Let's look at S_σ instead: S_σ is convex, and S_σ ⊆ S_λ. Previous work⁴ exploits these properties to learn a stable A by solving the semi-definite program
min_A ‖X_1 − A X_0‖²_F  s.t.  σ_1(A) ≤ 1.
⁴ S. L. Lacy and D. S. Bernstein. Subspace identification with guaranteed stability using constrained optimization. In Proc. of the ACC (2002); IEEE Trans. Automatic Control (2003).
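For reference, a hedged sketch of this constrained problem using cvxpy; this is not the authors' implementation. cvxpy's sigma_max atom encodes the σ_1(A) ≤ 1 constraint, which is solved as an SDP:

```python
# Minimize reconstruction error subject to A in S_sigma.
import cvxpy as cp

def stable_lds_sigma(X0, X1):
    n = X0.shape[0]
    A = cp.Variable((n, n))
    problem = cp.Problem(cp.Minimize(cp.sum_squares(X1 - A @ X0)),
                         [cp.sigma_max(A) <= 1])
    problem.solve()                      # any SDP-capable solver, e.g. SCS
    return A.value
```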

Let's train an LDS for steam again, this time constraining A to be in S_σ.

Simulating from a Lacy-Bernstein stable texture model [plot of x_t over t]: the model is over-constrained. Can we do better?

Our Approach. Formulate the S_σ approximation of the problem as a semi-definite program (SDP). Start with a quadratic program (QP) relaxation of this SDP, and incrementally add constraints. Because S_σ is an inner approximation of S_λ, we typically reach stability early, before reaching the feasible set of the SDP. We then interpolate the solution to return the best stable matrix possible.

The Algorithm (illustrated on objective function contours):
A_1: unconstrained QP solution (the least squares estimate)
A_2: QP solution after adding 1 constraint (here it happens to be stable)
A_final: interpolation of the stable solution with the last one
For comparison, a previous method: Lacy-Bernstein (2002).
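A simplified sketch of the loop, under my reading of the algorithm (not the authors' code): each round solves a QP, and if the solution is unstable, the top singular vectors u, v of the current iterate yield the linear cut uᵀ A v ≤ 1. The final interpolation step is only indicated in a comment.

```python
# Constraint generation: solve a QP, cut off unstable iterates.
import cvxpy as cp
import numpy as np

def constraint_generation(X0, X1, max_rounds=50):
    n = X0.shape[0]
    A = cp.Variable((n, n))
    objective = cp.Minimize(cp.sum_squares(X1 - A @ X0))
    constraints = []
    for _ in range(max_rounds):
        cp.Problem(objective, constraints).solve()
        A_hat = A.value
        if np.max(np.abs(np.linalg.eigvals(A_hat))) <= 1:
            break                                  # stable: done
        U, s, Vt = np.linalg.svd(A_hat)
        u, v = U[:, 0], Vt[0]
        constraints.append(u @ A @ v <= 1)         # cut off this iterate
    # Final interpolation step (omitted): search along the segment
    # between the last unstable iterate and A_hat for the best stable A.
    return A_hat
```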

Let's train an LDS for steam using constraint generation, and simulate.

Simulating from a constraint-generation stable texture model [plot of x_t over t]: the model captures more dynamics and is still stable.

[Side-by-side simulated textures: Least Squares vs. Constraint Generation]

Algorithms: Empirical Evaluation
Constraint Generation, CG (our method)
Lacy and Bernstein (2002), LB-1: finds a σ_1 ≤ 1 solution
Lacy and Bernstein (2003), LB-2: solves a similar problem in a transformed space
Data sets: dynamic textures; biosurveillance baseline models (see paper)

[Plot: steam video, reconstruction error as % decrease in objective (lower is better) vs. number of latent dimensions]

[Plot: steam video, running time (s) vs. number of latent dimensions]

Conclusion. A novel constraint generation algorithm for learning stable linear dynamical systems. Relaxing the SDP enables us to optimize over a larger set of matrices while being more efficient. Future work: adding stability constraints to EM, and stable models for more structured dynamic textures.

Thank you!

Subspace ID with Hankel matrices. Stacking multiple observations in D forces latent states to model the future. [Example: annual sunspot data with 12-year cycles; plots of the first 2 columns of U over t, for k = 1 and k = 12.]

Stability and Convexity. The state space of a stable LDS lies inside some ellipse. The set of matrices that map a particular ellipse into itself (and hence are stable) is convex. If we knew in advance which ellipse contains our state space, finding A would be a convex problem. But we don't, and the set of all stable matrices is non-convex.
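To make the known-ellipse case concrete: for a fixed positive definite P, the constraint AᵀPA ⪯ P is equivalent to σ_1(P^(1/2) A P^(−1/2)) ≤ 1, which is convex in A. A sketch with an illustrative P and random data (both are my own assumptions):

```python
# Fit A subject to mapping a *fixed, known* ellipse into itself.
import cvxpy as cp
import numpy as np

n = 2
S = np.diag([1.0, 2.0])                 # P^(1/2) for P = diag(1, 4)
S_inv = np.linalg.inv(S)
rng = np.random.default_rng(0)
X0 = rng.standard_normal((n, 99))
X1 = rng.standard_normal((n, 99))

A = cp.Variable((n, n))
ellipse_invariant = cp.sigma_max(S @ A @ S_inv) <= 1   # convex in A
cp.Problem(cp.Minimize(cp.sum_squares(X1 - A @ X0)),
           [ellipse_invariant]).solve()
print(np.max(np.abs(np.linalg.eigvals(A.value))))      # <= 1, so stable
```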