Conquering the Complexity of Time: Machine Learning for Big Time Series Data

1 Conquering the Complexity of Time: Machine Learning for Big Time Series Data Yan Liu Computer Science Department University of Southern California Mini-Workshop on Theoretical Foundations of Cyber-Physical Systems April 14, 2017 Yan Liu (USC) Mining from Large-scale Time Series Data April 14, / 25

2 A human being is a part of a whole, called by us universe, a part limited in time and space.

3 Research Synopsis: Machine Learning Models. Developing scalable and effective solutions by leveraging recent progress across disciplines.

4 Research Synopsis: Applications. Doctor AI (Healthcare Analytics); Social Network Analysis; Computational sustainability.

5 Google AlphaGo Beats Human

6 After AlphaGo, what's next for AI? Google's DeepMind AI group unveils health care ambitions http://www.theverge.

7 Properties of Health Care Data. How are health care data different from the data in existing applications of deep learning? Privacy, privacy! Heterogeneity. Lots and lots of missing data. Big small data. Worst of all: doctors do not believe anything they cannot understand, no matter how cool and how deep the models are!

8 Recurrent Neural Networks for Time Series Data. Our Contributions [KDD 2015, AMIA 2015, 2016, arXiv 2016, ICLR 2017]: Multi-modal Deep Neural Networks; Handling Missing Data in DNN; Big Small Data Solution via DNN; Interpretation of DNN.

9 Explainable Artificial Intelligence: Mimic Learning Framework
[Figure: input X and target y feed a deep learning model (labels X_nn, y_nn); a mimic model produces the output y_m.]
Main idea: use Gradient Boosting Trees (GBT) to mimic the performance of deep learning models.
Benefits: Complex deep networks perform well: the mimic model keeps that performance. Original decision tree methods overfit easily: the mimic model avoids it. Interpretations are hard to get from the original models: the mimic model provides them.
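The mimic-learning idea above can be sketched in a few lines. This is a minimal illustration under loud assumptions, not the talk's implementation: the "deep" teacher is replaced by a small logistic regression, the mimic model is a hand-rolled gradient boosting of depth-1 trees (stumps), and the data are synthetic. All function names (`train_teacher`, `fit_stump`, `fit_mimic`) are hypothetical. The key step is that the mimic model is fit to the teacher's soft predictions rather than to the hard labels.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_teacher(X, y, epochs=300, lr=0.5):
    """Stand-in teacher (assumption): logistic regression via gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            gw = [g + err * xj for g, xj in zip(gw, xi)]
            gb += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, gw)]
        b -= lr * gb / len(X)
    return lambda x: sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def fit_stump(X, r):
    """Best one-feature threshold split (least squares) on residuals r."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            left = [ri for x, ri in zip(X, r) if x[j] <= t]
            right = [ri for x, ri in zip(X, r) if x[j] > t]
            if not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((ri - lm) ** 2 for ri in left)
                   + sum((ri - rm) ** 2 for ri in right))
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best[1:]

def fit_mimic(X, soft, rounds=30, shrink=0.3):
    """Gradient boosting with stumps, regressing on the teacher's soft labels."""
    base = sum(soft) / len(soft)
    pred, stumps = [base] * len(X), []
    for _ in range(rounds):
        resid = [s - p for s, p in zip(soft, pred)]
        j, t, lm, rm = fit_stump(X, resid)
        stumps.append((j, t, lm, rm))
        pred = [p + shrink * (lm if x[j] <= t else rm) for p, x in zip(pred, X)]

    def predict(x):
        return base + shrink * sum(lm if x[j] <= t else rm
                                   for j, t, lm, rm in stumps)
    return predict, stumps

rng = random.Random(0)
X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(100)]
y = [1 if x[0] + 0.5 * x[1] > 0 else 0 for x in X]

teacher = train_teacher(X, y)
soft = [teacher(x) for x in X]          # soft predictions, playing the role of y_nn
mimic, stumps = fit_mimic(X, soft)      # mimic model, producing y_m

base = sum(soft) / len(soft)
mse_const = sum((s - base) ** 2 for s in soft) / len(soft)
mse_mimic = sum((s - mimic(x)) ** 2 for s, x in zip(soft, X)) / len(soft)
agreement = sum((teacher(x) > 0.5) == (mimic(x) > 0.5) for x in X) / len(X)
# A crude interpretation: how often each feature is chosen for a split.
splits_per_feature = [sum(1 for s in stumps if s[0] == j) for j in range(2)]
print(mse_const, mse_mimic, agreement, splits_per_feature)
```

Counting splits per feature is only one simple way to read a boosted-tree model; the talk does not say which interpretation method it uses.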

10-12 Experiment Result [three figure-only slides; figures not recoverable]

13 Research Synopsis: Applications. Doctor AI (Healthcare Analytics); Social Network Analysis; Computational sustainability.

14 Diffusion Analysis. Adoption of innovation: new treatment, new technology. Marketing: word-of-mouth effect, viral marketing. Public opinion surveillance: detection of rumors. Media analysis: modeling news dynamics.

15 Network Inference. Network inference: inferring the latent diffusion network from observed cascades. Our contribution [ICML 2015, NIPS 2016, WSDM submission].

16 Hawkes-Topic Models: Joint Inference of Diffusion Networks and Topics
1. Generate all the events and the event times via the Multivariate Hawkes Process.
2. For each topic $k$: draw $\beta_k \sim \mathrm{Dir}(\alpha)$.
3. For each event $e$ of node $v$:
   1. If $e$ is a spontaneous event: $\eta_e \sim \mathcal{N}(\alpha_v, \sigma^2 I)$. Otherwise $\eta_e \sim \mathcal{N}(\eta_{\mathrm{parent}[e]}, \sigma^2 I)$.
   2. For each word $n$: $z_{e,n} \sim \mathrm{Discrete}(\pi(\eta_e))$, $w_{e,n} \sim \mathrm{Discrete}(\beta_{z_{e,n}})$.
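The generative steps above can be simulated directly. The sketch below makes several assumptions that the slide does not state: the Hawkes process uses an exponential triggering kernel and is sampled through its branching (parent/child) representation, `A[u][v]` is the expected number of events on node `v` triggered by one event on node `u`, $\pi$ is taken to be a softmax, and every numeric value is made up. Function names are hypothetical.

```python
import math
import random

def poisson(lam, rng):
    """Knuth's multiplication method; adequate for the small rates used here."""
    limit, k, p = math.exp(-lam), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def softmax(v):
    m = max(v)
    ex = [math.exp(x - m) for x in v]
    total = sum(ex)
    return [x / total for x in ex]

def dirichlet(alpha, size, rng):
    """Symmetric Dirichlet sample via normalized Gamma draws."""
    g = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(g)
    return [x / total for x in g]

def simulate_hawkes(mu, A, horizon, decay, rng):
    """Step 1: event times and parents. Parents always precede children in the list."""
    events = []
    for v, rate in enumerate(mu):                 # spontaneous events
        t = rng.expovariate(rate)
        while t < horizon:
            events.append({"node": v, "time": t, "parent": None})
            t += rng.expovariate(rate)
    i = 0
    while i < len(events):                        # triggered events
        src = events[i]
        for v in range(len(mu)):
            for _ in range(poisson(A[src["node"]][v], rng)):
                t = src["time"] + rng.expovariate(decay)
                if t < horizon:
                    events.append({"node": v, "time": t, "parent": i})
        i += 1
    return events

def attach_documents(events, alpha_v, beta, sigma, n_words, rng):
    """Step 3: topic parameters eta_e and words for every event."""
    n_topics, vocab = len(beta), len(beta[0])
    for e in events:
        if e["parent"] is None:
            mean = alpha_v[e["node"]]             # spontaneous: node prior
        else:
            mean = events[e["parent"]]["eta"]     # triggered: inherit from parent
        e["eta"] = [rng.gauss(m, sigma) for m in mean]
        pi = softmax(e["eta"])
        e["z"] = rng.choices(range(n_topics), weights=pi, k=n_words)
        e["words"] = [rng.choices(range(vocab), weights=beta[k])[0]
                      for k in e["z"]]

rng = random.Random(0)
mu = [0.5, 0.3, 0.2]                              # base rates (made up)
A = [[0.2, 0.3, 0.0], [0.0, 0.2, 0.3], [0.1, 0.0, 0.2]]  # row sums < 1
alpha_v = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]
beta = [dirichlet(0.5, 20, rng) for _ in range(3)]       # Step 2
events = simulate_hawkes(mu, A, horizon=20.0, decay=1.0, rng=rng)
attach_documents(events, alpha_v, beta, sigma=0.1, n_words=8, rng=rng)
print(len(events), events[0]["words"])
```

Because a triggered event draws $\eta_e$ around its parent's value, topics drift gradually along each cascade, which is what ties the diffusion network to the text.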

17 Experiment Results: EventRegistry
Network inference accuracy: 10% improvement. [Figure: Hawkes, Hawkes-LDA, Hawkes-CTM, and HTM compared across three components.]
Topic modeling accuracy:
              LDA      CTM      HTM
Component 1   -42945   -42458   -42325
Component 2   -22558   -22181   -22164
Component 3   -17574   -17574   -17571
(He et al., HawkesTopic, ICML 2015)

18 Experiment Results: EventRegistry. Early bird report agency: sunherald.com, miamiherald.com and in.reuters.com. News gathering and re-distribution agency: [not recoverable]. (He et al., HawkesTopic, ICML 2015)

19 Research Synopsis: Applications. Doctor AI (Healthcare Analytics); Social Network Analysis; Computational sustainability.

21 Spatiotemporal Data Analysis [NIPS 2014 Spotlight Presentation, ICML 2015, ICML 2016]. Two key principles in designing spatial-temporal models. Local smoothness: features in the same neighborhood share similar values. Global latent structure: the data lie on a lower-dimensional latent structure. [Figure: singular values; panels labeled Location, Time, Variable.]

23 Tensor Representation for Spatial Temporal Data. Spatial temporal data can naturally be represented as tensors. [Figure: data tensor with axes labeled Spatial Locations and Time.] Simple solutions based on the tensor formulation: local smoothness can be achieved by a Laplacian regularizer; global latent structure can be achieved by a low-rank constraint.
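To see why a Laplacian regularizer encourages local smoothness, recall that for a graph Laplacian $L$ the quadratic form $w^\top L w$ equals the sum of squared differences across edges. The sketch below checks this on a toy example. The chain graph (each location connected to its next neighbor) and the two signals are assumptions made for illustration; the talk does not specify the neighborhood graph.

```python
def chain_laplacian(n):
    """Laplacian L = D - A of a chain graph 0 - 1 - ... - (n-1)."""
    L = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        L[i][i] += 1.0
        L[i + 1][i + 1] += 1.0
        L[i][i + 1] -= 1.0
        L[i + 1][i] -= 1.0
    return L

def quad_form(L, w):
    """w^T L w, the Laplacian penalty for one column of values over locations."""
    n = len(w)
    return sum(w[i] * L[i][j] * w[j] for i in range(n) for j in range(n))

def edge_sum(w):
    """Sum of squared differences between neighboring locations on the chain."""
    return sum((w[i] - w[i + 1]) ** 2 for i in range(len(w) - 1))

L = chain_laplacian(5)
smooth = [1.0, 1.1, 1.2, 1.3, 1.4]     # neighbors have similar values
rough = [1.0, -1.0, 1.0, -1.0, 1.0]    # neighbors disagree strongly

penalty_smooth = quad_form(L, smooth)
penalty_rough = quad_form(L, rough)
print(penalty_smooth, penalty_rough)
```

The smooth signal incurs a far smaller penalty than the oscillating one, so minimizing this term pulls neighboring locations toward similar values.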

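26 Application I: Cokriging
Task description: jointly predicting the values of multiple features at unknown locations by taking advantage of the observations from known locations.
Given the complete data $\mathcal{X} \in \mathbb{R}^{P \times T \times M}$ and partial observations at a subset of locations indexed by $\Omega \subset \{1, \ldots, P\}$, we need to estimate $\mathcal{W} \in \mathbb{R}^{P \times T \times M}$ so that $\mathcal{W}_\Omega = \mathcal{X}_\Omega$.
Our formulation:
$$\widehat{\mathcal{W}} = \arg\min_{\mathcal{W}} \Big\{ \underbrace{\|\mathcal{W}_\Omega - \mathcal{X}_\Omega\|_F^2}_{\text{Loss term}} + \underbrace{\mu \sum_{m=1}^{M} \mathrm{tr}\big(\mathcal{W}_{:,:,m}^\top L \, \mathcal{W}_{:,:,m}\big)}_{\text{Local consistency}} \Big\} \quad \text{subject to} \quad \underbrace{\mathrm{rank}(\mathcal{W}) \le \rho}_{\text{Global consistency}}$$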
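A stripped-down version of this objective can be minimized by plain gradient descent. The sketch below is deliberately simplified and is not the method from the talk: it handles a single slice (one time step, one variable), uses a chain graph for $L$, and drops the rank constraint entirely, so it only shows how the loss term and the Laplacian term interact. The function name `cokrige_slice` and all values are hypothetical.

```python
def cokrige_slice(x, observed, mu=1.0, lr=0.05, steps=5000):
    """Minimize sum_{i in observed} (w_i - x_i)^2 + mu * sum_i (w_i - w_{i+1})^2.

    x        : values per location (only entries at observed indices are read)
    observed : set of indices with observations (the set Omega)
    The rank constraint of the full formulation is NOT enforced here.
    """
    n = len(x)
    w = [0.0] * n
    for _ in range(steps):
        g = [0.0] * n
        for i in observed:                      # gradient of the loss term
            g[i] += 2.0 * (w[i] - x[i])
        for i in range(n - 1):                  # gradient of the Laplacian term
            d = w[i] - w[i + 1]
            g[i] += 2.0 * mu * d
            g[i + 1] -= 2.0 * mu * d
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w

# Five locations on a line; only the two end locations are observed.
x = [0.0, None, None, None, 4.0]
w_hat = cokrige_slice(x, observed={0, 4})
print([round(v, 3) for v in w_hat])
```

In this toy case the unobserved interior values end up evenly spaced between the two end values, and the end values themselves are pulled slightly toward each other because the smoothness term trades off against the fit. The low-rank constraint in the full formulation adds the global structure that this sketch leaves out.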

28 Experiments on Climate Datasets
Cokriging: table comparing Tensor-F, Tensor-O, ADMM, Simple, and MTGP on the USHCN and CCDS datasets [values not recoverable].
Forecasting: table comparing Tensor-F, Tensor-O, Tucker, ADMM, OrthoNL, Trace, MTL l1, and MTL dirt on the USHCN and CCDS datasets [values not recoverable].
[Figure: map of the most predictive regions analyzed by the greedy algorithm using the CCDS dataset.]

29 Experiments on Scalability
Running time (in seconds) for cokriging and forecasting: table comparing Tensor-O, ADMM, and MTGP on the USHCN, CCDS, and FSQ datasets [values not recoverable].
Other Developments: Online prediction (ICML 2015); Memory efficient prediction (ICML 2016); Connections to Gaussian Processes (Upcoming).

30 Thank you! For more information: USC Melady Group
