Temporal Multi-View Inconsistency Detection for Network Traffic Analysis

WWW '15, Florence, Italy
Houping Xiao (1), Jing Gao (1), Deepak Turaga (2), Long Vu (2), and Alain Biem (2)
(1) Department of Computer Science and Engineering, University at Buffalo; (2) IBM T.J. Watson Research Center

Outline
- Motivation
- Challenges
- Proposed Framework: Temporal Multi-View Inconsistency Detection (TMVID)
- Experiments
- Conclusions

Motivation
- Multiple-view information: network traffic data typically involve multiple views
- Example: network traffic data can be collected through different protocols, such as TCP, UDP, and ICMP
- Question: which host has suspicious behavior?
- Our solution: calculate the degree to which a host receives inconsistent information across multiple views; the higher the degree of inconsistency, the more suspicious the host

How to Find Inconsistent Behavior
- Single-view approach: apply many anomaly detection algorithms to each view of the data and then compare the detector scores
- However, the detector scores may be noisy, and this approach fails to consider the intrinsic relationship between different views
- Instead, analyze the behavior of each host across multiple views

Our Solution
- First apply existing anomaly detection algorithms: this converts data from different views into comparable features and discards noisy information
- However, even after anomaly detectors are applied to each view, it is still challenging to compare the detector outputs from different views
- [Figure: raw detector scores from network traffic flow on 4 views; axes: host ID, detector score]

Our Solution
- Project multi-view data into a new space where inconsistent and consistent hosts can be well separated
- Identify detector clusters and compare at the cluster level
- In each source, detectors can be partitioned into clusters so that detectors in the same cluster share similar behavior patterns on hosts across multiple views
- The behavior of the underlying detector clusters should be consistent across multiple views

Temporal Behavior
- Observations: the behavior of hosts evolves over time, so the temporal patterns of host behavior must be taken into consideration when finding inconsistency
- Example: a very high volume of network traffic from a host is normal on weekdays but suspicious on weekends
- Solution: in each view, timestamps are partitioned into clusters; temporal behavior over timestamp clusters should be consistent across multiple views
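
The weekday/weekend idea on this slide can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function name `cluster_level_scores`, the hard one-hot assignment of days to two clusters, and the made-up scores. The slides do not say how TMVID learns the timestamp clusters, so the clusters are simply fixed by hand.

```python
import numpy as np

def cluster_level_scores(scores, assignment):
    """Average per-timestamp scores within each timestamp cluster.

    scores:     (n_timestamps,) detector scores of one host in one view
    assignment: (n_timestamps, n_clusters) one-hot cluster membership
    """
    counts = assignment.sum(axis=0)
    return (assignment.T @ scores) / counts

# Hypothetical week, Monday..Sunday; cluster 0 = weekdays, cluster 1 = weekend.
assignment = np.array([[1, 0]] * 5 + [[0, 1]] * 2, dtype=float)

# Illustrative (made-up) scores for the same host in two views.
view_a = np.array([0.9, 0.8, 0.9, 1.0, 0.9, 0.1, 0.1])
view_b = np.array([0.9, 0.9, 0.8, 0.9, 1.0, 0.9, 0.8])

profile_a = cluster_level_scores(view_a, assignment)
profile_b = cluster_level_scores(view_b, assignment)

# The two views agree on weekdays but disagree on the weekend cluster.
gap = np.abs(profile_a - profile_b)
print(profile_a, profile_b, gap)
```

Comparing at the cluster level rather than per timestamp keeps ordinary day-to-day noise from dominating the comparison, which is the motivation given on the slide.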

Proposed Framework: Temporal Multi-View Inconsistency Detection (TMVID)
- Input: views 1 through M, each scored by detectors 1 through N
- Component 1: Anomaly Detector System (diagram labels: detector, host, observed tensor)
- Component 2: Joint Probabilistic Tensor Factorization (diagram labels: latent tensor, identity matrix, detector cluster assignment matrix, timestamp cluster assignment matrix)
- Component 3: Inconsistency Score Computation (output: inconsistency scores and inconsistent hosts)
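
As a rough sketch of what Component 1 might hand to the factorization step, the snippet below stacks the outputs of several detectors into one observed tensor per view. The two detectors (`zscore_detector`, `range_detector`), the mode order (detector, host, timestamp), and the synthetic input are all assumptions for illustration; the slides do not name the detectors used.

```python
import numpy as np

rng = np.random.default_rng(0)
n_hosts, n_timestamps, n_views = 6, 8, 3

def zscore_detector(x):
    """Stand-in detector: absolute z-score across hosts at each timestamp."""
    mu = x.mean(axis=0, keepdims=True)
    sd = x.std(axis=0, keepdims=True) + 1e-12
    return np.abs(x - mu) / sd

def range_detector(x):
    """Stand-in detector: distance from the per-timestamp median, scaled to [0, 1]."""
    d = np.abs(x - np.median(x, axis=0, keepdims=True))
    return d / (d.max() + 1e-12)

detectors = [zscore_detector, range_detector]

observed = []
for _ in range(n_views):
    # Synthetic per-view traffic feature, shape (hosts, timestamps).
    raw = rng.normal(size=(n_hosts, n_timestamps))
    # One slice per detector -> tensor of shape (detectors, hosts, timestamps).
    observed.append(np.stack([det(raw) for det in detectors], axis=0))

print(len(observed), observed[0].shape)
```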

Joint Probabilistic Tensor Factorization
- Latent tensor: each entry stands for the detector score at the u-th detector cluster and w-th timestamp cluster for the v-th host
- d-th projection matrix: constructs the multi-linear mapping between the observed detector tensors and the latent tensors
- Residue tensor: each entry is assumed to follow a Gaussian distribution
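
The three ingredients above (latent tensor, projection matrices, Gaussian residue) can be sketched as a Tucker-style multi-linear map. This is one reading of the slide, not the authors' exact formulation: the symbols `Z`, `A`, `B`, `I_host`, the hard one-hot cluster assignments, and the shared noise level `sigma` are assumptions introduced here, since the slide's own notation did not survive.

```python
import numpy as np

rng = np.random.default_rng(1)
n_det, n_host, n_time = 5, 4, 6
n_det_clusters, n_time_clusters = 2, 3

def one_hot(labels, n_clusters):
    """Turn integer cluster labels into a 0/1 assignment matrix."""
    return np.eye(n_clusters)[labels]

A = one_hot(np.array([0, 0, 1, 1, 1]), n_det_clusters)      # detectors -> clusters
B = one_hot(np.array([0, 0, 1, 1, 2, 2]), n_time_clusters)  # timestamps -> clusters
I_host = np.eye(n_host)                                     # hosts are not clustered
Z = rng.normal(size=(n_det_clusters, n_host, n_time_clusters))  # latent tensor

def multilinear_map(Z, A, I_host, B):
    """Mode products: X[d, h, t] = sum_{u,v,w} A[d,u] I[h,v] B[t,w] Z[u,v,w]."""
    return np.einsum("du,hv,tw,uvw->dht", A, I_host, B, Z)

def gaussian_log_likelihood(X, X_hat, sigma):
    """Log-likelihood when every residue entry is i.i.d. N(0, sigma^2)."""
    resid = X - X_hat
    n = resid.size
    return -0.5 * n * np.log(2 * np.pi * sigma**2) - 0.5 * np.sum(resid**2) / sigma**2

sigma = 0.1
X_hat = multilinear_map(Z, A, I_host, B)
X = X_hat + sigma * rng.normal(size=X_hat.shape)  # observed = mapped latent + residue
print(X.shape, gaussian_log_likelihood(X, X_hat, sigma))
```

Under the Gaussian assumption, maximizing this log-likelihood is equivalent to minimizing the squared factorization error.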
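
Joint Probabilistic Tensor Factorization
- Parameter set: [equation not preserved in the transcription]
- The log-likelihood of the parameter set given the observed tensors: [equation not preserved in the transcription]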

Joint Probabilistic Tensor Factorization
- Assumption 1: the behavior of anomaly detectors should be similar across different views
- Assumption 2: the behavior of hosts over timestamps should be similar across different views
- Based on these assumptions, we introduce the penalized log-likelihood function: [equation not preserved in the transcription]

Joint Probabilistic Tensor Factorization
- Goal: [objective not preserved in the transcription], annotated with two parts
- Factorization error
- Constraints: projection matrices should be similar across views
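
Because the objective itself is missing from the transcription, the following is only a plausible shape for it: a squared factorization error summed over views, plus a penalty that pulls the per-view projection matrices toward each other. The function name `penalized_objective`, the weight `lam`, the pairwise squared-difference penalty, and the assumption that every view has the same number of detectors and timestamps are all hypothetical choices.

```python
import numpy as np

def penalized_objective(X_views, Z_views, A_views, B_views, lam):
    """Factorization error plus a cross-view similarity penalty (illustrative)."""
    fit = 0.0
    for X, Z, A, B in zip(X_views, Z_views, A_views, B_views):
        # Host mode uses the identity, so it is left out of the contraction.
        X_hat = np.einsum("du,tw,uhw->dht", A, B, Z)
        fit += np.sum((X - X_hat) ** 2)

    penalty = 0.0
    n_views = len(X_views)
    for i in range(n_views):
        for j in range(i + 1, n_views):
            penalty += np.sum((A_views[i] - A_views[j]) ** 2)
            penalty += np.sum((B_views[i] - B_views[j]) ** 2)
    return fit + lam * penalty

rng = np.random.default_rng(2)
A = np.eye(2)[[0, 0, 1]]       # 3 detectors -> 2 detector clusters
B = np.eye(2)[[0, 1, 1, 1]]    # 4 timestamps -> 2 timestamp clusters
Z_views = [rng.normal(size=(2, 5, 2)) for _ in range(2)]
X_views = [np.einsum("du,tw,uhw->dht", A, B, Z) for Z in Z_views]

# Identical projections and an exact fit: both terms vanish.
same = penalized_objective(X_views, Z_views, [A, A], [B, B], lam=1.0)

# Moving one detector to another cluster in view 2 raises both terms.
A_shift = np.eye(2)[[0, 1, 1]]
diff = penalized_objective(X_views, Z_views, [A, A_shift], [B, B], lam=1.0)
print(same, diff)
```

A larger `lam` would force the views to share nearly identical cluster structure, which pushes any disagreement between views into the per-view latent tensors.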

Inconsistency Score Computation
- [Diagram: inconsistency score computation; only the labels C, D, and k survive in the transcription]
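
The scoring formula on this slide is not recoverable, so the sketch below substitutes a simple stand-in: for each host, take the average pairwise Frobenius distance between that host's latent slices in different views. The function name `inconsistency_scores`, the distance choice, and the toy latent tensors are assumptions and should not be read as the TMVID definition.

```python
import numpy as np
from itertools import combinations

def inconsistency_scores(Z_views):
    """Mean pairwise distance between per-view latent slices, one score per host.

    Each element of Z_views has shape (detector_clusters, hosts, timestamp_clusters).
    """
    n_hosts = Z_views[0].shape[1]
    scores = np.zeros(n_hosts)
    pairs = list(combinations(range(len(Z_views)), 2))
    for i, j in pairs:
        diff = Z_views[i] - Z_views[j]
        # Frobenius norm over the detector-cluster and timestamp-cluster modes.
        scores += np.sqrt(np.sum(diff**2, axis=(0, 2)))
    return scores / len(pairs)

# Toy latent tensors: three views that agree on every host except host 2.
base = np.ones((2, 4, 3))
Z_views = [base.copy() for _ in range(3)]
Z_views[1][:, 2, :] += 5.0

scores = inconsistency_scores(Z_views)
ranking = np.argsort(-scores)  # most inconsistent host first
print(scores, ranking[0])
```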

Experiment Set-up
- Datasets: synthetic datasets and two real-world datasets
- The real-world datasets were collected from IBM enterprise networks: Network Traffic Flow Data and Domain Name System Data

Effectiveness Comparison
- Table 1: Statistics of Synthetic Data Sets. Columns: # detectors, # hosts, # timestamps, # views. Rows: three synthetic data sets [values not preserved in the transcription]
- Table 2: F-Measure Comparison. Columns: Vote/mean, Vote/min, Vote/max, Mean, Min, Max, NMF, TMVID. Rows: three synthetic data sets [values not preserved in the transcription]
- Results: a higher F-measure is better. The table shows that the proposed TMVID achieves the highest F-measure.
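
For reference, the F-measure used in the comparison is the harmonic mean of precision and recall. The helper below is a minimal sketch with made-up host IDs; the function name `f_measure` and the set-based interface are choices made here, not taken from the slides.

```python
def f_measure(predicted, actual):
    """F-measure of a predicted set of inconsistent hosts against the true set."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)

# Made-up example: 4 hosts flagged, 3 truly inconsistent, 2 in common.
score = f_measure(predicted=[1, 2, 3, 4], actual=[2, 3, 5])
print(score)
```

In this example precision is 2/4 and recall is 2/3, so the F-measure is 4/7.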

Scalability vs. Number of Hosts
- [Figure: TMVID running time in seconds against the number of hosts, annotated with a Pearson correlation]
- Results: the running time of the proposed algorithm grows almost linearly with the number of hosts

Scalability vs. Number of Views
- [Figure: running time in seconds against the number of views]
- Results: the running time of the proposed algorithm grows linearly with the number of views

Inconsistency Score: Network Traffic Flow
- [Figure: inconsistency score for each host; x-axis: host ID]
- Results: most hosts are considered consistent, while only a small set of hosts receive very high inconsistency scores

Case Study
- [Figure: detector scores for the top-1 inconsistent host and the top-1 consistent host]
- Results: for the inconsistent host, the detector score patterns of the views on both timestamp and detector clusters are well separated in the subspace found by the joint probabilistic tensor factorization, while the behavior of the consistent host is almost the same across views

Case Study
- [Figure: detector score over timestamps for the top 2 inconsistent hosts and the top 2 consistent hosts]
- Results: for inconsistent hosts, the patterns of detector clusters are quite different across multiple views, while for consistent hosts the patterns are similar across views, ignoring noise

Case Study
- [Figure: detector score by detector ID for the top 2 inconsistent hosts and the top 2 consistent hosts]
- Results: for inconsistent hosts, the patterns of timestamp clusters vary considerably across views; in particular, the patterns of views 2 and 4 are clearly different from those of views 1 and 3. For consistent hosts, the patterns are quite similar

Conclusions
- Developed a novel framework (TMVID) to conduct inconsistency detection from multiple views of temporal data
- Proposed joint probabilistic tensor factorization to extract the common behavior hidden in multiple views, and presented how to calculate an inconsistency score for each host
- Demonstrated the efficacy of TMVID in capturing inconsistency in multi-view temporal data on synthetic and real-world network traffic data sets

Thank You! Questions?