DeepLog: Anomaly Detection and Diagnosis from System Logs through Deep Learning. Min Du, Feifei Li, Guineng Zheng, Vivek Srikumar University of Utah
2-6 Background
System event log: available on practically every computer system. Automatic analysis? Goal: automatically detected anomalies.
7-9 Background: log parsing
Log parsing turns the system event log into structured data. Each message is mapped to its message type (log key), the constant part of the print statement that produced it, e.g. printf("Started service %s on port %d", x, y);
Started service A on port 80 -> Started service * on port * (log key ID: 1)
Executor updated: app-1 is now LOADING -> Executor updated: * is now LOADING (log key ID: 2)
10-15 Background: log analysis
Anomaly detection on the parsed log, existing approaches:
Message count vector: Xu SOSP09, Lou ATC10, etc. Problem: offline batched processing.
Build workflow model: Lou KDD10, Beschastnikh ICSE14, Yu ASPLOS16, etc. Problem: only for simple execution path anomalies.
Common problem: only log keys (message types) are considered.
16-28 DeepLog
Each log entry yields a log key and a parameter value vector (the log key part of each message is underlined on the slide):
time | log message | log key | parameter value vector
t1 | Deletion of file1 complete | k1 | [t1 - t0, file1]
t2 | Took 0.61 seconds to deallocate network | k2 | [t2 - t1, 0.61]
t3 | VM Stopped (Lifecycle Event) | k3 | [t3 - t2]
SPELL, a streaming log parser published in ICDM 16, splits each log message into log key and parameters:
Deletion of file1 complete. -> Deletion of * complete. [file1]
Deletion of file2 complete. -> Deletion of * complete. [file2]
DeepLog uses the log keys and parameter value vectors for anomaly detection and diagnosis.
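The mapping from a log message to a log key and a parameter value vector can be sketched as follows. This is only an illustration, assuming the log keys are already known and written as templates with `*` wildcards; it does not reproduce how SPELL discovers keys from the stream. The names `TEMPLATES` and `parse_entry` and the timestamps are hypothetical.

```python
import re

# Hypothetical log key templates; '*' marks a parameter position.
TEMPLATES = {
    "k1": "Deletion of * complete",
    "k2": "Took * seconds to deallocate network",
    "k3": "VM Stopped (Lifecycle Event)",
}

def _compile(template):
    # Escape the literal text and turn each '*' into a capturing group.
    parts = [re.escape(p) for p in template.split("*")]
    return re.compile("^" + r"(\S+)".join(parts) + "$")

PATTERNS = {key: _compile(t) for key, t in TEMPLATES.items()}

def parse_entry(timestamp, prev_timestamp, message):
    """Return (log key, parameter value vector) for one log entry.

    The first element of the vector is the time elapsed since the
    previous entry, followed by the values extracted from the message.
    """
    for key, pattern in PATTERNS.items():
        match = pattern.match(message)
        if match:
            return key, [timestamp - prev_timestamp, *match.groups()]
    return None, [timestamp - prev_timestamp]

# Hypothetical timestamps (seconds) paired with the slide's messages.
entries = [
    (10.0, "Deletion of file1 complete"),
    (10.5, "Took 0.61 seconds to deallocate network"),
    (12.0, "VM Stopped (Lifecycle Event)"),
]
prev = 9.0
parsed = []
for ts, msg in entries:
    parsed.append(parse_entry(ts, prev, msg))
    prev = ts
```

Extracted values stay as strings here; a real pipeline would convert numeric parameters before feeding them to a model.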
29-46 DeepLog Architecture
[Figure: architecture diagram, built up over several slides. Labels: Training Stage, MODELS, Detection Stage.]
47-49 Log Key Anomaly Detection model
Example log key sequence: a rigorous set of logic and control flows; a (more structured) natural language.
Natural language modeling: a multi-class classifier, history sequence => next key to appear.
A log key is detected to be abnormal if it does not follow the prediction.
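Treating next-key prediction as multi-class classification requires (history, next key) training pairs. A minimal sketch of how such pairs could be cut from a log key sequence with a sliding window; the window length `h`, the function name `make_training_pairs`, and the example sequence are assumptions made for illustration.

```python
def make_training_pairs(sequence, h):
    """Slide a window of length h over a log key sequence.

    Each window of h keys is one input; the key that follows it is the
    class label the classifier should predict.
    """
    pairs = []
    for i in range(len(sequence) - h):
        history = tuple(sequence[i:i + h])
        next_key = sequence[i + h]
        pairs.append((history, next_key))
    return pairs

# Hypothetical log key sequence, keys written as integer IDs.
pairs = make_training_pairs([22, 5, 5, 5, 11, 9], h=3)
```

A sequence of length n produces n - h pairs, so sequences no longer than the window contribute no training data.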
50-59 Log Key Anomaly Detection model
Use long short-term memory (LSTM) architecture.
Training: log key sequence: h=
Detection: in the detection stage, DeepLog checks whether the actual next log key is among its top g probable predictions.
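The top-g check in the detection stage can be sketched independently of the model that produces the probabilities. The function name `is_anomalous` and the probability values below are hypothetical; in DeepLog the probabilities would come from the trained LSTM.

```python
def is_anomalous(probabilities, actual_key, g):
    """Flag actual_key unless it is among the g most probable next keys.

    probabilities maps each candidate log key to the probability the
    model assigns to it appearing next.
    """
    ranked = sorted(probabilities, key=probabilities.get, reverse=True)
    return actual_key not in ranked[:g]

# Hypothetical model output for one history window.
predicted = {"k1": 0.05, "k2": 0.60, "k3": 0.30, "k4": 0.05}
normal = is_anomalous(predicted, "k3", g=2)   # k3 is among the top 2
flagged = is_anomalous(predicted, "k1", g=2)  # k1 is not among the top 2
```

A larger g tolerates more alternative continuations and so trades missed anomalies for fewer false alarms.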
60-66 Workflow Construction
Input: log key sequence. Output: [figure]
Method 1: using the Log Key Anomaly Detection model, i.e. the LSTM prediction probabilities.
An example of concurrency detection: [figure, built up over several slides]
67-68 Workflow Construction
Method 2: a density-based clustering approach.
Co-occurrence matrix of log keys $(k_i, k_j)$ within distance $d$:
$f_d(k_i, k_j)$: the frequency of $(k_i, k_j)$ appearing together within distance $d$
$f(k_i)$: the frequency of $k_i$ in the input sequence
$p_d(i, j)$: the probability of $(k_i, k_j)$ appearing together within distance $d$
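The co-occurrence quantities can be computed with two counters. The slide defines the quantities but not the formula for the probability, so the normalisation used here, $p_d(i, j) = f_d(k_i, k_j) / (d \cdot f(k_i))$, is an assumption of this sketch, as is counting only keys that appear after $k_i$. The function name `cooccurrence` is hypothetical.

```python
from collections import Counter

def cooccurrence(sequence, d):
    """Estimate p_d(i, j) for every ordered pair of log keys.

    f[k_i]          : occurrences of k_i in the sequence
    f_d[(k_i, k_j)] : times k_j appears at most d positions after k_i
    """
    f = Counter(sequence)
    f_d = Counter()
    for t, k_i in enumerate(sequence):
        for k_j in sequence[t + 1:t + 1 + d]:
            f_d[(k_i, k_j)] += 1
    # Assumed normalisation: each occurrence of k_i has d following slots.
    return {pair: count / (d * f[pair[0]]) for pair, count in f_d.items()}

# Hypothetical sequence in which key "a" is always followed by "b".
p = cooccurrence(["a", "b", "c", "a", "b", "c"], d=1)
```

Pairs with a probability close to 1 are candidates for belonging to the same task, which is the signal a clustering step would group on.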
69-80 Parameter Value Anomaly Detection model
Example: log messages of a particular log key:
t2: Took 0.61 seconds to deallocate network
t2: Took 1.1 seconds to deallocate network
Parameter value vectors over time: [t2 - t1, 0.61], [t2 - t1, 1.1], ...
This is a multi-variate time series anomaly detection problem.
Leverage an LSTM-based approach; a parameter value vector is given as input at each time step; an anomaly is detected if the mean squared error (MSE) between the prediction and the actual data is too large.
[Figure: value over time, showing history, prediction and actual value, with the check MSE > threshold?]
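The MSE check can be sketched without the LSTM by treating the prediction as given. How the threshold is chosen is an assumption here: the sketch fits a Gaussian to errors observed on normal validation data and takes the upper end of an interval around the mean. The names `mse` and `fit_threshold` and all numbers are hypothetical.

```python
import statistics

def mse(predicted, actual):
    """Mean squared error between two equal-length vectors."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

def fit_threshold(validation_errors, z=2.58):
    """Upper bound of a Gaussian interval over validation errors.

    z = 2.58 roughly corresponds to a two-sided 99% interval; this way
    of picking the threshold is an assumption of the sketch.
    """
    mean = statistics.fmean(validation_errors)
    std = statistics.pstdev(validation_errors)
    return mean + z * std

# Hypothetical errors seen on normal validation data.
threshold = fit_threshold([0.010, 0.012, 0.008, 0.011, 0.009])

# Vectors are [time since previous entry, seconds to deallocate].
flag_normal = mse([0.50, 0.60], [0.52, 0.61]) > threshold
flag_slow = mse([0.50, 0.60], [0.52, 1.10]) > threshold
```

Widening the interval (a larger `z`) raises the threshold and reduces false positives at the cost of missing milder anomalies.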
81-88 LSTM model online update
Q: How to handle false positives?
Log sequence: history -> model -> prediction, compared against the current entry.
Anomaly? Yes -> False positive? Yes -> update the model using this case: history -> current.
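The feedback loop can be illustrated with a much simpler model than an LSTM. The sketch below swaps in a frequency-based next-key predictor (`CountModel`, a hypothetical stand-in) so that the update step is a single counter increment; with an LSTM the same step would be an incremental training pass on the (history, current) example. The `is_false_positive` callback stands in for user feedback and is also an assumption.

```python
from collections import Counter, defaultdict

class CountModel:
    """Frequency-based next-key predictor, a stand-in for the LSTM."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def update(self, history, key):
        # Record that `key` followed `history` (one training example).
        self.counts[tuple(history)][key] += 1

    def top(self, history, g):
        ranked = self.counts[tuple(history)].most_common(g)
        return [key for key, _ in ranked]

def check(model, history, current, g, is_false_positive):
    """Return True if `current` is reported as an anomaly.

    When the feedback callback confirms a false positive, the model
    learns history -> current and the report is suppressed.
    """
    if current in model.top(history, g):
        return False
    if is_false_positive(history, current):
        model.update(history, current)
        return False
    return True

model = CountModel()
model.update(["k1", "k2"], "k3")
# First sighting of k4 after (k1, k2): flagged, then marked as a false positive.
first = check(model, ["k1", "k2"], "k4", g=2, is_false_positive=lambda h, c: True)
# Second sighting: the updated model now predicts k4, no feedback needed.
second = check(model, ["k1", "k2"], "k4", g=2, is_false_positive=lambda h, c: False)
# A key never confirmed as normal is still reported.
third = check(model, ["k1", "k2"], "k9", g=2, is_false_positive=lambda h, c: False)
```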
89 Evaluation: log key anomaly detection
Evaluation results on HDFS log data [1] (over a million log entries with labeled anomalies). [Figure; up is good.]
[1] PCA (SOSP 09), IM (UsenixATC 10), N-gram (baseline language model)
90-94 Evaluation: parameter value anomaly detection
Evaluation results on OpenStack cloud log with different confidence intervals (CIs). MSE: mean square error.
Log generated on CloudLab; VM creation/deletion operations; injected performance anomalies.
[Figure labels: thresholds, ANOMALY, False Positive]
95-96 Evaluation: LSTM model online update
Evaluation on Blue Gene/L log, with and without online model update. [Figure; up is good.]
HPC log with labeled anomalies; Available at
97-99 Evaluation: case study, network security log
Dataset: IEEE VAST Challenge 2011 (Mini Challenge 2, Computer Networking Operations). The dataset contains firewall log, IDS log, etc.
Detection results: [figure]. Annotation: could be fixed with prior knowledge of documented IP.
100-105 Evaluation: workflow construction
Constructed workflow of VM creation (previously generated OpenStack cloud log). [Figure]
How does it help to diagnose anomalies?
Parameter value anomaly: time difference (performance) anomaly.
Identified anomaly: instance took too long to build because of the transition from 52 -> 53.
Injected anomaly: during VM creation, network speed from controller to compute node is throttled.
106 Summary
DeepLog: a realtime system log anomaly detection framework.
LSTM is used to model system execution paths and log parameter values.
Workflow models are built to help anomaly diagnosis.
It supports online model update.
Thank you!
Min Du mind@cs.utah.edu
Feifei Li lifeifei@cs.utah.edu
DeepLog: Anomaly Detection and Diagnosis from System Logs through Deep Learning Min Du, Feifei Li, Guineng Zheng, Vivek Srikumar School of Computing, University of Utah {mind, lifeifei, guineng, svivek}@cs.utah.edu
More informationChapter 2 Single Layer Feedforward Networks
Chapter 2 Single Layer Feedforward Networks By Rosenblatt (1962) Perceptrons For modeling visual perception (retina) A feedforward network of three layers of units: Sensory, Association, and Response Learning
More informationArcGIS Earth for Enterprises DARRON PUSTAM ARCGIS EARTH CHRIS ANDREWS 3D
ArcGIS Earth for Enterprises DARRON PUSTAM ARCGIS EARTH CHRIS ANDREWS 3D ArcGIS Earth is ArcGIS Earth is a lightweight globe desktop application that helps you explore any part of the world and investigate
More informationFrom statistics to data science. BAE 815 (Fall 2017) Dr. Zifei Liu
From statistics to data science BAE 815 (Fall 2017) Dr. Zifei Liu Zifeiliu@ksu.edu Why? How? What? How much? How many? Individual facts (quantities, characters, or symbols) The Data-Information-Knowledge-Wisdom
More informationLeast Mean Squares Regression
Least Mean Squares Regression Machine Learning Spring 2018 The slides are mainly from Vivek Srikumar 1 Lecture Overview Linear classifiers What functions do linear classifiers express? Least Squares Method
More informationRoberto Perdisci^+, Guofei Gu^, Wenke Lee^ presented by Roberto Perdisci. ^Georgia Institute of Technology, Atlanta, GA, USA
U s i n g a n E n s e m b l e o f O n e - C l a s s S V M C l a s s i f i e r s t o H a r d e n P a y l o a d - B a s e d A n o m a l y D e t e c t i o n S y s t e m s Roberto Perdisci^+, Guofei Gu^, Wenke
More informationLarge-Scale Behavioral Targeting
Large-Scale Behavioral Targeting Ye Chen, Dmitry Pavlov, John Canny ebay, Yandex, UC Berkeley (This work was conducted at Yahoo! Labs.) June 30, 2009 Chen et al. (KDD 09) Large-Scale Behavioral Targeting
More informationContinuous Machine Learning
Continuous Machine Learning Kostiantyn Bokhan, PhD Project Lead at Samsung R&D Ukraine Kharkiv, October 2016 Agenda ML dev. workflows ML dev. issues ML dev. solutions Continuous machine learning (CML)
More information@SoyGema GEMA PARREÑO PIQUERAS
@SoyGema GEMA PARREÑO PIQUERAS WHAT IS AN ARTIFICIAL NEURON? WHAT IS AN ARTIFICIAL NEURON? Image Recognition Classification using Softmax Regressions and Convolutional Neural Networks Languaje Understanding
More informationClassification with Perceptrons. Reading:
Classification with Perceptrons Reading: Chapters 1-3 of Michael Nielsen's online book on neural networks covers the basics of perceptrons and multilayer neural networks We will cover material in Chapters
More informationGeodatabase Best Practices. Dave Crawford Erik Hoel
Geodatabase Best Practices Dave Crawford Erik Hoel Geodatabase best practices - outline Geodatabase creation Data ownership Data model Data configuration Geodatabase behaviors Data integrity and validation
More informationAnomaly Detection. Jing Gao. SUNY Buffalo
Anomaly Detection Jing Gao SUNY Buffalo 1 Anomaly Detection Anomalies the set of objects are considerably dissimilar from the remainder of the data occur relatively infrequently when they do occur, their
More informationAnomaly Detection for the CERN Large Hadron Collider injection magnets
Anomaly Detection for the CERN Large Hadron Collider injection magnets Armin Halilovic KU Leuven - Department of Computer Science In cooperation with CERN 2018-07-27 0 Outline 1 Context 2 Data 3 Preprocessing
More informationUnsupervised Anomaly Detection for High Dimensional Data
Unsupervised Anomaly Detection for High Dimensional Data Department of Mathematics, Rowan University. July 19th, 2013 International Workshop in Sequential Methodologies (IWSM-2013) Outline of Talk Motivation
More informationArcGIS Deployment Pattern. Azlina Mahad
ArcGIS Deployment Pattern Azlina Mahad Agenda Deployment Options Cloud Portal ArcGIS Server Data Publication Mobile System Management Desktop Web Device ArcGIS An Integrated Web GIS Platform Portal Providing
More informationPatrol: Revealing Zero-day Attack Paths through Network-wide System Object Dependencies
Patrol: Revealing Zero-day Attack Paths through Network-wide System Object Dependencies Jun Dai, Xiaoyan Sun, and Peng Liu College of Information Sciences and Technology Pennsylvania State University,
More informationNeural Networks Language Models
Neural Networks Language Models Philipp Koehn 10 October 2017 N-Gram Backoff Language Model 1 Previously, we approximated... by applying the chain rule p(w ) = p(w 1, w 2,..., w n ) p(w ) = i p(w i w 1,...,
More informationBased on the original slides of Hung-yi Lee
Based on the original slides of Hung-yi Lee Google Trends Deep learning obtains many exciting results. Can contribute to new Smart Services in the Context of the Internet of Things (IoT). IoT Services
More informationOne Optimized I/O Configuration per HPC Application
One Optimized I/O Configuration per HPC Application Leveraging I/O Configurability of Amazon EC2 Cloud Mingliang Liu, Jidong Zhai, Yan Zhai Tsinghua University Xiaosong Ma North Carolina State University
More informationInternational Journal of Scientific & Engineering Research, Volume 7, Issue 2, February ISSN
International Journal of Scientific & Engineering Research, Volume 7, Issue 2, February-2016 9 Automated Methodology for Context Based Semantic Anomaly Identification in Big Data Hema.R 1, Vidhya.V 2,
More informationEntropy-based data organization tricks for browsing logs and packet captures
Entropy-based data organization tricks for browsing logs and packet captures Department of Computer Science Dartmouth College Outline 1 Log browsing moves Pipes and tables Trees are better than pipes and
More informationUsing a CUDA-Accelerated PGAS Model on a GPU Cluster for Bioinformatics
Using a CUDA-Accelerated PGAS Model on a GPU Cluster for Bioinformatics Jorge González-Domínguez Parallel and Distributed Architectures Group Johannes Gutenberg University of Mainz, Germany j.gonzalez@uni-mainz.de
More informationMIDTERM: CS 6375 INSTRUCTOR: VIBHAV GOGATE October,
MIDTERM: CS 6375 INSTRUCTOR: VIBHAV GOGATE October, 23 2013 The exam is closed book. You are allowed a one-page cheat sheet. Answer the questions in the spaces provided on the question sheets. If you run
More informationExploring Human Mobility with Multi-Source Data at Extremely Large Metropolitan Scales. ACM MobiCom 2014, Maui, HI
Exploring Human Mobility with Multi-Source Data at Extremely Large Metropolitan Scales Desheng Zhang & Tian He University of Minnesota, USA Jun Huang, Ye Li, Fan Zhang, Chengzhong Xu Shenzhen Institute
More informationDecision Support. Dr. Johan Hagelbäck.
Decision Support Dr. Johan Hagelbäck johan.hagelback@lnu.se http://aiguy.org Decision Support One of the earliest AI problems was decision support The first solution to this problem was expert systems
More informationBuilding a Timeline Action Network for Evacuation in Disaster
Building a Timeline Action Network for Evacuation in Disaster The-Minh Nguyen, Takahiro Kawamura, Yasuyuki Tahara, and Akihiko Ohsuga Graduate School of Information Systems, University of Electro-Communications,
More informationArcGIS Pro Q&A Session. NWGIS Conference, October 11, 2017 With John Sharrard, Esri GIS Solutions Engineer
ArcGIS Pro Q&A Session NWGIS Conference, October 11, 2017 With John Sharrard, Esri GIS Solutions Engineer jsharrard@esri.com ArcGIS Desktop The applications ArcGIS Pro ArcMap ArcCatalog ArcScene ArcGlobe
More informationTime Series Anomaly Detection
Time Series Anomaly Detection Detection of Anomalous Drops with Limited Features and Sparse Examples in Noisy Highly Periodic Data Dominique T. Shipmon, Jason M. Gurevitch, Paolo M. Piselli, Steve Edwards
More informationPI SERVER 2012 Do. More. Faster. Now! Copyr i g h t 2012 O S Is o f t, L L C. 1
PI SERVER 2012 Do. More. Faster. Now! Copyr i g h t 2012 O S Is o f t, L L C. 1 AUGUST 7, 2007 APRIL 14, 2010 APRIL 24, 2012 Copyr i g h t 2012 O S Is o f t, L L C. 2 PI Data Archive Security PI Asset
More informationMidterm, Fall 2003
5-78 Midterm, Fall 2003 YOUR ANDREW USERID IN CAPITAL LETTERS: YOUR NAME: There are 9 questions. The ninth may be more time-consuming and is worth only three points, so do not attempt 9 unless you are
More informationTroubleshooting Replication and Geodata Services. Liz Parrish & Ben Lin
Troubleshooting Replication and Geodata Services Liz Parrish & Ben Lin AGENDA: Troubleshooting Replication and Geodata Services Overview Demo Troubleshooting Q & A Overview of Replication Liz Parrish What
More informationEXAD: A System for Explainable Anomaly Detection on Big Data Traces
EXAD: A System for Explainable Anomaly Detection on Big Data Traces Fei Song Inria, France fei.song@inria.fr Arnaud Stiegler arnaud.stiegler@polytechnique.edu Yanlei Diao University of Massachusetts Amherst
More informationPulses Characterization from Raw Data for CDMS
Pulses Characterization from Raw Data for CDMS Physical Sciences Francesco Insulla (06210287), Chris Stanford (05884854) December 14, 2018 Abstract In this project we seek to take a time series of current
More informationA Randomized Approach for Crowdsourcing in the Presence of Multiple Views
A Randomized Approach for Crowdsourcing in the Presence of Multiple Views Presenter: Yao Zhou joint work with: Jingrui He - 1 - Roadmap Motivation Proposed framework: M2VW Experimental results Conclusion
More informationA stochastic model-based approach to online event prediction and response scheduling
A stochastic model-based approach to online event prediction and response scheduling M. Biagi, L. Carnevali, M. Paolieri, F. Patara, E. Vicario Department of Information Engineering, University of Florence,
More informationPresentation in Convex Optimization
Dec 22, 2014 Introduction Sample size selection in optimization methods for machine learning Introduction Sample size selection in optimization methods for machine learning Main results: presents a methodology
More informationRecurrent Neural Networks with Flexible Gates using Kernel Activation Functions
2018 IEEE International Workshop on Machine Learning for Signal Processing (MLSP 18) Recurrent Neural Networks with Flexible Gates using Kernel Activation Functions Authors: S. Scardapane, S. Van Vaerenbergh,
More informationSource localization in an ocean waveguide using supervised machine learning
Source localization in an ocean waveguide using supervised machine learning Haiqiang Niu, Emma Reeves, and Peter Gerstoft Scripps Institution of Oceanography, UC San Diego Part I Localization on Noise09
More informationLecture 8: Introduction to Deep Learning: Part 2 (More on backpropagation, and ConvNets)
COS 402 Machine Learning and Artificial Intelligence Fall 2016 Lecture 8: Introduction to Deep Learning: Part 2 (More on backpropagation, and ConvNets) Sanjeev Arora Elad Hazan Recap: Structure of a deep
More informationAdministering your Enterprise Geodatabase using Python. Jill Penney
Administering your Enterprise Geodatabase using Python Jill Penney Assumptions Basic knowledge of python Basic knowledge enterprise geodatabases and workflows You want code Please turn off or silence cell
More informationAnomaly Detection for SOME/IP using Complex Event Processing
Chair of Network Architectures and Services TUM Department of Informatics Technical University of Munich (TUM) Anomaly Detection for SOME/IP using Complex Event Processing Nadine Herold, Stephan-A. Posselt,
More informationIntroduction to ArcGIS Maps for Office. Greg Ponto Scott Ball
Introduction to ArcGIS Maps for Office Greg Ponto Scott Ball Agenda What is Maps for Office? Platform overview What are Apps for the Office? ArcGIS Maps for Office features - Visualization - Geoenrichment
More informationECS289: Scalable Machine Learning
ECS289: Scalable Machine Learning Cho-Jui Hsieh UC Davis Oct 27, 2015 Outline One versus all/one versus one Ranking loss for multiclass/multilabel classification Scaling to millions of labels Multiclass
More informationSPATIAL INFORMATION GRID AND ITS APPLICATION IN GEOLOGICAL SURVEY
SPATIAL INFORMATION GRID AND ITS APPLICATION IN GEOLOGICAL SURVEY K. T. He a, b, Y. Tang a, W. X. Yu a a School of Electronic Science and Engineering, National University of Defense Technology, Changsha,
More informationMachine Learning to Automatically Detect Human Development from Satellite Imagery
Technical Disclosure Commons Defensive Publications Series April 24, 2017 Machine Learning to Automatically Detect Human Development from Satellite Imagery Matthew Manolides Follow this and additional
More informationEvaluation. Albert Bifet. April 2012
Evaluation Albert Bifet April 2012 COMP423A/COMP523A Data Stream Mining Outline 1. Introduction 2. Stream Algorithmics 3. Concept drift 4. Evaluation 5. Classification 6. Ensemble Methods 7. Regression
More informationCS6375: Machine Learning Gautam Kunapuli. Decision Trees
Gautam Kunapuli Example: Restaurant Recommendation Example: Develop a model to recommend restaurants to users depending on their past dining experiences. Here, the features are cost (x ) and the user s
Believe it Today or Tomorrow? Detecting Untrustworthy Information from Dynamic Multi-Source Data
SDM 15 Vancouver, CAN Believe it Today or Tomorrow? Detecting Untrustworthy Information from Dynamic Multi-Source Data Houping Xiao 1, Yaliang Li 1, Jing Gao 1, Fei Wang 2, Liang Ge 3, Wei Fan 4, Long
CS570 Data Mining. Anomaly Detection. Li Xiong. Slide credits: Tan, Steinbach, Kumar Jiawei Han and Micheline Kamber.
CS570 Data Mining Anomaly Detection Li Xiong Slide credits: Tan, Steinbach, Kumar Jiawei Han and Micheline Kamber April 3, 2011 1 Anomaly Detection Anomaly is a pattern in the data that does not conform
Slide credit from Hung-Yi Lee & Richard Socher
Slide credit from Hung-Yi Lee & Richard Socher 1 Review Recurrent Neural Network 2 Recurrent Neural Network Idea: condition the neural network on all previous words and tie the weights at each time step
Building a Multi-FPGA Virtualized Restricted Boltzmann Machine Architecture Using Embedded MPI
Building a Multi-FPGA Virtualized Restricted Boltzmann Machine Architecture Using Embedded MPI Charles Lo and Paul Chow {locharl1, pc}@eecg.toronto.edu Department of Electrical and Computer Engineering
Spatial Decision Tree: A Novel Approach to Land-Cover Classification
Spatial Decision Tree: A Novel Approach to Land-Cover Classification Zhe Jiang 1, Shashi Shekhar 1, Xun Zhou 1, Joseph Knight 2, Jennifer Corcoran 2 1 Department of Computer Science & Engineering 2 Department
Data Aggregation with InfraWorks and ArcGIS for Visualization, Analysis, and Planning
CI125230 Data Aggregation with InfraWorks and ArcGIS for Visualization, Analysis, and Planning Stephen Brockwell Brockwell IT Consulting Inc. Sean Kinahan Brockwell IT Consulting Inc. Learning Objectives
Distributed Architectures
Distributed Architectures Software Architecture VO/KU (707023/707024) Roman Kern KTI, TU Graz 2015-01-21 Roman Kern (KTI, TU Graz) Distributed Architectures 2015-01-21 1 / 64 Outline 1 Introduction 2 Independent
Seq2Tree: A Tree-Structured Extension of LSTM Network
Seq2Tree: A Tree-Structured Extension of LSTM Network Weicheng Ma Computer Science Department, Boston University 111 Cummington Mall, Boston, MA wcma@bu.edu Kai Cao Cambia Health Solutions kai.cao@cambiahealth.com
Visual meta-learning for planning and control
Visual meta-learning for planning and control Seminar on Current Works in Computer Vision @ Chair of Pattern Recognition and Image Processing. Samuel Roth Winter Semester 2018/19 Albert-Ludwigs-Universität
Understanding Comments Submitted to FCC on Net Neutrality. Kevin (Junhui) Mao, Jing Xia, Dennis (Woncheol) Jeong December 12, 2014
Understanding Comments Submitted to FCC on Net Neutrality Kevin (Junhui) Mao, Jing Xia, Dennis (Woncheol) Jeong December 12, 2014 Abstract We aim to understand and summarize themes in the 1.65 million
Sparse Gaussian Markov Random Field Mixtures for Anomaly Detection
Sparse Gaussian Markov Random Field Mixtures for Anomaly Detection Tsuyoshi Idé ( Ide-san ), Ankush Khandelwal*, Jayant Kalagnanam IBM Research, T. J. Watson Research Center (*Currently with University
Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University
Note to other teachers and users of these slides: We would be delighted if you found our material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit
Online Advertising is Big Business
Online Advertising Online Advertising is Big Business Multiple billion dollar industry $43B in 2013 in USA, 17% increase over 2012 [PWC, Internet Advertising Bureau, April 2013] Higher revenue in USA
Adaptive Learning and Mining for Data Streams and Frequent Patterns
Adaptive Learning and Mining for Data Streams and Frequent Patterns Albert Bifet Laboratory for Relational Algorithmics, Complexity and Learning LARCA Departament de Llenguatges i Sistemes Informàtics
Sum-Product Networks. STAT946 Deep Learning Guest Lecture by Pascal Poupart University of Waterloo October 17, 2017
Sum-Product Networks STAT946 Deep Learning Guest Lecture by Pascal Poupart University of Waterloo October 17, 2017 Introduction Outline What is a Sum-Product Network? Inference Applications In more depth
CSE 150. Assignment 6 Summer Maximum likelihood estimation. Out: Thu Jul 14 Due: Tue Jul 19
CSE 150. Assignment 6 Summer 2016 Out: Thu Jul 14 Due: Tue Jul 19 6.1 Maximum likelihood estimation A (a) Complete data Consider a complete data set of i.i.d. examples $\{a_t, b_t, c_t, d_t\}_{t=1}^{T}$ drawn from
Assignment No A-05 Aim. Pre-requisite. Objective. Problem Statement. Hardware / Software Used
Assignment No A-05 Aim Implement Naive Bayes to predict the work type for a person. Pre-requisite 1. Probability. 2. Scikit-Learn Python Library. 3. Programming language basics. Objective 1. To Learn basic
CHAPTER 22 GEOGRAPHIC INFORMATION SYSTEMS
CHAPTER 22 GEOGRAPHIC INFORMATION SYSTEMS PURPOSE: This chapter establishes the administration and use of to improve the quality and accessibility of Department s spatial information and support graphical
Risk Adjustment Submission Timetable Risk Adjustment Process Overview
Risk Adjustment Submission Timetable Risk Adjustment Process Overview CY Dates of Service Initial Submission Deadline First Payment Date Final Submission Deadline Hospital/Physician MA Organization 08
1 [15 points] Frequent Itemsets Generation With Map-Reduce
Data Mining Learning from Large Data Sets Final Exam Date: 15 August 2013 Time limit: 120 minutes Number of pages: 11 Maximum score: 100 points You can use the back of the pages if you run out of space.
WDCloud: An End to End System for Large-Scale Watershed Delineation on Cloud
WDCloud: An End to End System for Large-Scale Watershed Delineation on Cloud * In Kee Kim, * Jacob Steele, + Anthony Castronova, * Jonathan Goodall, and * Marty Humphrey * University of Virginia + Utah
Convolutional Neural Networks. Srikumar Ramalingam
Convolutional Neural Networks Srikumar Ramalingam Reference Many of the slides are prepared using the following resources: neuralnetworksanddeeplearning.com (mainly Chapter 6) http://cs231n.github.io/convolutional-networks/
Portal for ArcGIS: An Introduction
Portal for ArcGIS: An Introduction Derek Law Esri Product Management Esri UC 2014 Technical Workshop Agenda Web GIS pattern Product overview Installation and deployment Security and groups Configuration
Discrete-event simulations
Discrete-event simulations Lecturer: Dmitri A. Moltchanov E-mail: moltchan@cs.tut.fi http://www.cs.tut.fi/kurssit/elt-53606/ OUTLINE: Why do we need simulations? Step-by-step simulations; Classifications;
Data Mining Classification: Basic Concepts and Techniques. Lecture Notes for Chapter 3. Introduction to Data Mining, 2nd Edition
Data Mining Classification: Basic Concepts and Techniques Lecture Notes for Chapter 3 by Tan, Steinbach, Karpatne, Kumar 1 Classification: Definition Given a collection of records (training set ) Each
Leveraging GIS data and tools for maintaining hydraulic sewer models
Leveraging GIS data and tools for maintaining hydraulic sewer models Ben Gamble & Joseph Koran Metropolitan Sewer District of Greater Cincinnati Carl C. Chan & Michael York CDM Smith Ben Gamble Senior
In silico generation of novel, drug-like chemical matter using the LSTM deep neural network
In silico generation of novel, drug-like chemical matter using the LSTM deep neural network Peter Ertl Novartis Institutes for BioMedical Research, Basel, CH September 2018 Neural networks in cheminformatics
Eigen Co-occurrence Matrix Method for Masquerade Detection
Eigen Co-occurrence Matrix Method for Masquerade Detection Mizuki Oka 1 Yoshihiro Oyama 2,3 Kazuhiko Kato 3,4 1 Master's Program in Science and Engineering, University of Tsukuba 2 Department of Computer
to be more efficient on enormous scale, in a stream, or in distributed settings.
16 Matrix Sketching The singular value decomposition (SVD) can be interpreted as finding the most dominant directions in an (n d) matrix A (or n points in R d ). Typically n > d. It is typically easy to
Data Mining: Concepts and Techniques. (3 rd ed.) Chapter 8. Chapter 8. Classification: Basic Concepts
Data Mining: Concepts and Techniques (3 rd ed.) Chapter 8 1 Chapter 8. Classification: Basic Concepts Classification: Basic Concepts Decision Tree Induction Bayes Classification Methods Rule-Based Classification
CONTEMPORARY ANALYTICAL ECOSYSTEM PATRICK HALL, SAS INSTITUTE
CONTEMPORARY ANALYTICAL ECOSYSTEM PATRICK HALL, SAS INSTITUTE Copyright 2013, SAS Institute Inc. All rights reserved. Agenda (Optional) History Lesson 2015 Buzzwords Machine Learning for X Citizen Data
Modern Information Retrieval
Modern Information Retrieval Chapter 8 Text Classification Introduction A Characterization of Text Classification Unsupervised Algorithms Supervised Algorithms Feature Selection or Dimensionality Reduction
Memory-Augmented Attention Model for Scene Text Recognition
Memory-Augmented Attention Model for Scene Text Recognition Cong Wang 1,2, Fei Yin 1,2, Cheng-Lin Liu 1,2,3 1 National Laboratory of Pattern Recognition Institute of Automation, Chinese Academy of Sciences
Esri UC2013. Technical Workshop.
Esri International User Conference San Diego, California Technical Workshops July 9, 2013 CAD: Introduction to using CAD Data in ArcGIS Jeff Reinhart & Phil Sanchez Agenda Overview of ArcGIS CAD Support
ArcGIS Enterprise: What's New. Philip Heede Shannon Kalisky Melanie Summers Sam Williamson
ArcGIS Enterprise: What's New Philip Heede Shannon Kalisky Melanie Summers Sam Williamson ArcGIS Enterprise is the new name for ArcGIS for Server What is ArcGIS Enterprise ArcGIS Enterprise is powerful
Machine Learning 2010
Machine Learning 2010 Decision Trees Email: mrichter@ucalgary.ca -- 1 - Part 1 General -- 2 - Representation with Decision Trees (1) Examples are attribute-value vectors Representation of concepts by labeled
CSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18
CSE 417T: Introduction to Machine Learning Final Review Henry Chai 12/4/18 Overfitting Overfitting is fitting the training data more than is warranted Fitting noise rather than signal 2 Estimating! "#$
Lecture 5 Neural models for NLP
CS546: Machine Learning in NLP (Spring 2018) http://courses.engr.illinois.edu/cs546/ Lecture 5 Neural models for NLP Julia Hockenmaier juliahmr@illinois.edu 3324 Siebel Center Office hours: Tue/Thu 2pm-3pm
Introduction to Portal for ArcGIS. Hao LEE November 12, 2015
Introduction to Portal for ArcGIS Hao LEE November 12, 2015 Agenda Web GIS pattern Product overview Installation and deployment Security and groups Configuration options Portal for ArcGIS + ArcGIS for
The Noisy Channel Model and Markov Models
1/24 The Noisy Channel Model and Markov Models Mark Johnson September 3, 2014 2/24 The big ideas The story so far: machine learning classifiers learn a function that maps a data item X to a label Y handle
Spatial Analysis with Web GIS. Rachel Weeden
Spatial Analysis with Web GIS Rachel Weeden Agenda Subhead goes here Introducing ArcGIS Online Spatial Analysis Workflows Scenarios Other Options Resources ArcGIS is a Platform Making mapping and analytics
Incremental Learning and Concept Drift: Overview
Incremental Learning and Concept Drift: Overview Incremental learning The algorithm ID5R Taxonomy of incremental learning Concept Drift Teil 5: Incremental Learning and Concept Drift (V. 1.0) 1 c G. Grieser
Convolutional Neural Networks
Convolutional Neural Networks Books» http://www.deeplearningbook.org/ Books http://neuralnetworksanddeeplearning.com/.org/ reviews» http://www.deeplearningbook.org/contents/linear_algebra.html» http://www.deeplearningbook.org/contents/prob.html»
Machine Learning Analyses of Meteor Data
WGN, The Journal of the IMO 45:5 (2017) 1 Machine Learning Analyses of Meteor Data Viswesh Krishna Research Student, Centre for Fundamental Research and Creative Education. Email: krishnaviswesh@cfrce.in
Research Overview. Kristjan Greenewald. February 2, University of Michigan - Ann Arbor
Research Overview Kristjan Greenewald University of Michigan - Ann Arbor February 2, 2016 2/17 Background and Motivation Want efficient statistical modeling of high-dimensional spatio-temporal data with
Decision Trees. CS57300 Data Mining Fall Instructor: Bruno Ribeiro
Decision Trees CS57300 Data Mining Fall 2016 Instructor: Bruno Ribeiro Goal } Classification without Models Well, partially without a model } Today: Decision Trees 2015 Bruno Ribeiro 2 3 Why Trees? } interpretable/intuitive,
Decision Trees. Data Science: Jordan Boyd-Graber University of Maryland MARCH 11, Data Science: Jordan Boyd-Graber UMD Decision Trees 1 / 1
Decision Trees Data Science: Jordan Boyd-Graber University of Maryland MARCH 11, 2018 Data Science: Jordan Boyd-Graber UMD Decision Trees 1 / 1 Roadmap Classification: machines labeling data for us Last
NLP Homework: Dependency Parsing with Feed-Forward Neural Network
NLP Homework: Dependency Parsing with Feed-Forward Neural Network Submission Deadline: Monday Dec. 11th, 5 pm 1 Background on Dependency Parsing Dependency trees are one of the main representations used
Apprentissage, réseaux de neurones et modèles graphiques (RCP209) Neural Networks and Deep Learning
Apprentissage, réseaux de neurones et modèles graphiques (RCP209) Neural Networks and Deep Learning Nicolas Thome Prenom.Nom@cnam.fr http://cedric.cnam.fr/vertigo/cours/ml2/ Département Informatique Conservatoire
Introduction to Portal for ArcGIS
Introduction to Portal for ArcGIS Derek Law Product Management March 10 th, 2015 Esri Developer Summit 2015 Agenda Web GIS pattern Product overview Installation and deployment Security and groups Configuration
Analysis Based on SVM for Untrusted Mobile Crowd Sensing
Analysis Based on SVM for Untrusted Mobile Crowd Sensing * Ms. Yuga. R. Belkhode, Dr. S. W. Mohod *Student, Professor Computer Science and Engineering, Bapurao Deshmukh College of Engineering, India. *Email
Least Squares Classification
Least Squares Classification Stephen Boyd EE103 Stanford University November 4, 2017 Outline Classification Least squares classification Multi-class classifiers Classification 2 Classification data fitting
Geo-enabling a Transactional Real Estate Management System A case study from the Minnesota Dept. of Transportation
Geo-enabling a Transactional Real Estate Management System A case study from the Minnesota Dept. of Transportation Michael Terner Executive Vice President Co-author and Project Manager Andy Buck Overview