Sequential Feature Explanations for Anomaly Detection
1 Sequential Feature Explanations for Anomaly Detection. Md Amran Siddiqui, Alan Fern, Thomas G. Dietterich and Weng-Keen Wong. School of EECS, Oregon State University
2 Anomaly Detection. Anomalies: points generated by a process that is distinct from the process generating normal points. In this talk, Anomaly = Threat.
3 Anomaly Detectors. We focus on density-based anomaly detectors. Statistical outliers: points with low density values. Not all statistical outliers are anomalies of interest (statistics versus semantics).
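To make the density-based idea concrete, here is a minimal sketch of flagging low-density points as statistical outliers. It is illustrative only: the data are made up, a single multivariate Gaussian stands in for the learned density, and the cutoff of 10 points is arbitrary.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Fit a density to (made-up) data and flag the lowest-density points.
rng = np.random.default_rng(0)
data = rng.normal(size=(500, 2))          # mostly "normal" points
mean, cov = data.mean(axis=0), np.cov(data.T)
scores = multivariate_normal.pdf(data, mean=mean, cov=cov)
outliers = np.argsort(scores)[:10]        # 10 statistical outliers
```

Whether any of those 10 points is an anomaly of interest is the semantic question the analyst must answer.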
4-9 Anomaly Detection Pipeline (built up incrementally across slides). Data Points (Threats & Non-Threats) feed an Anomaly Detector, which splits them into Outliers (Threats & False Positives) and Non-Outliers (Non-Threats & Missed Threats). A Human Analyst reviews the Outliers and raises Alarms (Threats & False Positives); Non-Outliers are Discarded. Type 1 Missed Threats = the Anomaly Detector's false negatives; reduce them by improving the anomaly detector. Type 2 Missed Threats = the Analyst's false negatives; these can occur due to information overload and time constraints. How can we reduce type 2 errors?
10 Goal: reduce analyst effort for correctly detecting outliers that are threats.
11 How: provide the analyst with explanations of outlier points.
12 Why did the detector consider an object to be an outlier?
13 The analyst can then focus on information related to the explanation.
14 Sequential Feature Explanation (SFE): an ordering on the features of an outlier, prioritized by importance to the anomaly detector, e.g. (F2, F10, F37, F26, ...).
15 Protocol: incrementally reveal features in SFE order until the analyst makes a determination.
16-23 SFE Example: the analyst's belief about the normality of X is revised downward as features are revealed, eventually crossing a threshold (figure built up across slides).
24 How do we evaluate SFE quality?
25 Minimum Feature Prefix (MFP): the minimum number of features that must be revealed for the analyst to become confident that a threat is truly a threat (MFP = 4 in the example).
26 Optimizing MFP. Ideal objective: compute the SFE with minimum MFP. But we don't know the analyst's belief model or threshold!
27 Assumption 1: the analyst's beliefs are modeled by a learned density f(x).
28 Assumption 2: a distribution Pr(threshold) over possible thresholds.
30 Realizable objective: compute the SFE with minimum expected MFP under Assumptions 1 and 2.
31 This is an NP-hard problem. Not covered today: a branch-and-bound optimization procedure.
32-34 Greedy Optimization: Sequential Marginal. Choose the first feature i that minimizes f(x_i); choose the second feature j that minimizes f(x_i, x_j); and so on.
35-36 Greedy Optimization: Independent Marginal (computationally cheaper). Order features by increasing f(x_i), i.e. by the independent anomalousness of each feature.
37-38 Greedy Optimization: Independent Dropout, inspired by [Robnik et al., 2008] for computing supervised-learning explanations. Order features by decreasing f(x_{-i}), i.e. by how much more normal x looks after removing the feature.
39-41 Greedy Optimization: Sequential Dropout. Select the first feature i that maximizes f(x_{-i}); select the second feature j that maximizes f(x_{-i,-j}); and so on.
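The sequential marginal ordering can be sketched as follows. This is an illustrative implementation, not the authors' code: it assumes the learned density f is a single multivariate Gaussian (the talk actually uses an ensemble of GMMs), whose marginal over any feature subset is again Gaussian; the function name is made up.

```python
import numpy as np
from scipy.stats import multivariate_normal

def sequential_marginal_sfe(x, mean, cov):
    """Greedy SFE: at each step, add the feature whose inclusion gives
    the lowest marginal density f(x_S) of the revealed subset S."""
    chosen, remaining = [], set(range(len(x)))
    while remaining:
        def marginal(i):
            S = chosen + [i]
            return multivariate_normal.pdf(
                x[S], mean=mean[S], cov=cov[np.ix_(S, S)])
        best = min(remaining, key=marginal)
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

For example, a point that is extreme in only one feature should have that feature ranked first.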
42 Evaluating SFEs. Problem: evaluating an SFE requires access to an analyst, but we can't run large-scale experiments with real analysts.
43 Solution: construct a simulated analyst for anomaly detection benchmarks.
44 Evaluating Explanations. Start with anomaly detection benchmarks constructed from UCI supervised learning data sets [Emmott et al., 2013]. Each benchmark has known anomaly and normal classes.
45 Learn a classifier P(normal | x) to predict normal vs. anomalous for any feature subset; this classifier can serve as a simulated analyst.
46 Evaluation metric: the expected MFP of the simulated analyst, using a reasonable distribution over thresholds.
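The evaluation protocol — reveal features in SFE order until the simulated analyst's belief in normality crosses the threshold — can be sketched as below. Here `analyst_prob_normal` is a stand-in for the learned P(normal | revealed features) classifier; all names are hypothetical.

```python
def minimum_feature_prefix(x, sfe, analyst_prob_normal, threshold):
    """Reveal features in SFE order until the simulated analyst's
    belief P(normal | revealed features) drops below the threshold;
    return how many features were needed (the MFP)."""
    revealed = []
    for k, feature in enumerate(sfe, start=1):
        revealed.append(feature)
        if analyst_prob_normal(x, revealed) < threshold:
            return k
    return len(sfe)  # analyst never flagged the point
```

A lower MFP means the explanation surfaces the incriminating features sooner, i.e. less analyst effort.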
47 Results of Explanations for EGMM: average MFP for Random, OptOracle, IndDO, SeqDO, IndMarg, SeqMarg (bar chart). We use an ensemble of GMMs (EGMM) as the learned density f(x).
48 Oracle Experiments. Explanation evaluation depends on two factors: (1) the quality of f(x), i.e. how well f(x) matches the true analyst; and (2) the quality of the explanation computation. To assess (2), we run experiments that replace f(x) with a ground-truth analyst.
49 Results of Explanations for the Oracle Detector: average MFP for Random, OptOracle, IndDO*, SeqDO*, IndMarg*, SeqMarg* (bar chart).
50-51 Results on the KDDCup99 Dataset: average MFP for Random, IndDO, SeqDO, IndMarg, SeqMarg (bar chart).
52-55 Key Observations from the Experiments. All methods significantly beat random. The marginal methods are no worse, and sometimes better, than the dropout methods. Independent marginal is nearly as good as sequential marginal, but sequential is significantly better in the oracle experiments. The weaker signals the dropout methods produce when making early decisions make them less robust compared to the marginal methods.
56-60 Summary. Reducing the effort an analyst needs to detect threats can reduce the analyst's miss rate. We proposed sequential feature explanations to guide analyst investigation, proposed an evaluation framework for explanations, and designed and evaluated four greedy explanation methods. Preferred method: sequential marginal.
61 Future Work. Further evaluations: additional anomaly detectors (e.g. with PCA applied), larger feature spaces. Evaluate non-greedy algorithms (branch-and-bound). Anomaly exoneration. Alternative types of explanations.
62 Questions
63 SFE Calculation. We assume that for every feature subset s there exists a particular threshold τ such that, for any instance x, f(x_s) < τ implies x is an anomaly. To find the optimal SFE, we first define the MFP of an SFE E for an instance x:

MFP(x, E, τ(E)) = min { i : f(x_{E_{1:i}}) < τ_i(E) }

where f(·) is the density function and τ(E) is the set of thresholds, with τ_i(E) a random variable corresponding to the feature subset E_{1:i}.
64 SFE Calculation (continued). Expected MFP: MFP(x, E) = E_{τ(E)}[ MFP(x, E, τ(E)) ]. The objective for the optimal SFE of x is argmin_E MFP(x, E). This objective is hard to optimize, so we introduce two families of greedy methods, Marginal and Dropout, which approximately minimize it when computing an SFE.
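Since τ(E) is random, the expected MFP can be estimated by averaging over sampled threshold vectors. A minimal Monte-Carlo sketch, with hypothetical names and toy inputs; `density_prefix` stands in for f evaluated on a prefix of revealed features:

```python
def expected_mfp(x, sfe, density_prefix, threshold_samples):
    """Monte-Carlo estimate of the expected MFP: average the MFP over
    sampled threshold vectors tau, where tau[i] is the threshold for
    the length-(i+1) prefix of the SFE."""
    total = 0
    for tau in threshold_samples:
        mfp = len(sfe)  # worst case: all features must be revealed
        for i in range(len(sfe)):
            if density_prefix(x, sfe[:i + 1]) < tau[i]:
                mfp = i + 1
                break
        total += mfp
    return total / len(threshold_samples)
```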
65 Explanation Algorithms (f(x) is the learned normal density).
Sequential Marginal: choose the first feature i that minimizes f(x_i); choose the second feature j that minimizes f(x_i, x_j); and so on.
Independent Marginal (computationally cheaper): order features by increasing f(x_i), i.e. by the independent anomalousness of each feature.
Independent Dropout (inspired by [Robnik et al., 2008] from supervised learning): order features by decreasing f(x_{-i}), i.e. by how much more normal x looks after removing the feature.
Sequential Dropout: select the first feature i that maximizes f(x_{-i}); select the second feature j that maximizes f(x_{-i,-j}); and so on.
Machine Learning & Data Mining CMS/CS/CNS/EE 155 Lecture 2: Perceptron & Gradient Descent Announcements Homework 1 is out Due Tuesday Jan 12 th at 2pm Via Moodle Sign up for Moodle & Piazza if you haven
More informationPriors in Dependency network learning
Priors in Dependency network learning Sushmita Roy sroy@biostat.wisc.edu Computa:onal Network Biology Biosta2s2cs & Medical Informa2cs 826 Computer Sciences 838 hbps://compnetbiocourse.discovery.wisc.edu
More informationPerformance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project
Performance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project Devin Cornell & Sushruth Sastry May 2015 1 Abstract In this article, we explore
More informationCS534 Machine Learning - Spring Final Exam
CS534 Machine Learning - Spring 2013 Final Exam Name: You have 110 minutes. There are 6 questions (8 pages including cover page). If you get stuck on one question, move on to others and come back to the
More informationMachine Learning And Applications: Supervised Learning-SVM
Machine Learning And Applications: Supervised Learning-SVM Raphaël Bournhonesque École Normale Supérieure de Lyon, Lyon, France raphael.bournhonesque@ens-lyon.fr 1 Supervised vs unsupervised learning Machine
More informationWarm up: risk prediction with logistic regression
Warm up: risk prediction with logistic regression Boss gives you a bunch of data on loans defaulting or not: {(x i,y i )} n i= x i 2 R d, y i 2 {, } You model the data as: P (Y = y x, w) = + exp( yw T
More informationOutline. What is Machine Learning? Why Machine Learning? 9/29/08. Machine Learning Approaches to Biological Research: Bioimage Informa>cs and Beyond
Outline Machine Learning Approaches to Biological Research: Bioimage Informa>cs and Beyond Robert F. Murphy External Senior Fellow, Freiburg Ins>tute for Advanced Studies Ray and Stephanie Lane Professor
More informationUVA CS 4501: Machine Learning. Lecture 11b: Support Vector Machine (nonlinear) Kernel Trick and in PracCce. Dr. Yanjun Qi. University of Virginia
UVA CS 4501: Machine Learning Lecture 11b: Support Vector Machine (nonlinear) Kernel Trick and in PracCce Dr. Yanjun Qi University of Virginia Department of Computer Science Where are we? è Five major
More informationDeep learning / Ian Goodfellow, Yoshua Bengio and Aaron Courville. - Cambridge, MA ; London, Spis treści
Deep learning / Ian Goodfellow, Yoshua Bengio and Aaron Courville. - Cambridge, MA ; London, 2017 Spis treści Website Acknowledgments Notation xiii xv xix 1 Introduction 1 1.1 Who Should Read This Book?
More informationCSC 411 Lecture 3: Decision Trees
CSC 411 Lecture 3: Decision Trees Roger Grosse, Amir-massoud Farahmand, and Juan Carrasquilla University of Toronto UofT CSC 411: 03-Decision Trees 1 / 33 Today Decision Trees Simple but powerful learning
More informationMachine Learning and Data Mining. Bayes Classifiers. Prof. Alexander Ihler
+ Machine Learning and Data Mining Bayes Classifiers Prof. Alexander Ihler A basic classifier Training data D={x (i),y (i) }, Classifier f(x ; D) Discrete feature vector x f(x ; D) is a con@ngency table
More informationHomework 2 Solutions Kernel SVM and Perceptron
Homework 2 Solutions Kernel SVM and Perceptron CMU 1-71: Machine Learning (Fall 21) https://piazza.com/cmu/fall21/17115781/home OUT: Sept 25, 21 DUE: Oct 8, 11:59 PM Problem 1: SVM decision boundaries
More informationAlgorithms for Classification: The Basic Methods
Algorithms for Classification: The Basic Methods Outline Simplicity first: 1R Naïve Bayes 2 Classification Task: Given a set of pre-classified examples, build a model or classifier to classify new cases.
More informationClass 4: Classification. Quaid Morris February 11 th, 2011 ML4Bio
Class 4: Classification Quaid Morris February 11 th, 211 ML4Bio Overview Basic concepts in classification: overfitting, cross-validation, evaluation. Linear Discriminant Analysis and Quadratic Discriminant
More informationThe Perceptron Algorithm, Margins
The Perceptron Algorithm, Margins MariaFlorina Balcan 08/29/2018 The Perceptron Algorithm Simple learning algorithm for supervised classification analyzed via geometric margins in the 50 s [Rosenblatt
More informationUnsupervised Anomaly Detection for High Dimensional Data
Unsupervised Anomaly Detection for High Dimensional Data Department of Mathematics, Rowan University. July 19th, 2013 International Workshop in Sequential Methodologies (IWSM-2013) Outline of Talk Motivation
More informationCS 380: ARTIFICIAL INTELLIGENCE MACHINE LEARNING. Santiago Ontañón
CS 380: ARTIFICIAL INTELLIGENCE MACHINE LEARNING Santiago Ontañón so367@drexel.edu Summary so far: Rational Agents Problem Solving Systematic Search: Uninformed Informed Local Search Adversarial Search
More informationData Processing Techniques
Universitas Gadjah Mada Department of Civil and Environmental Engineering Master in Engineering in Natural Disaster Management Data Processing Techniques Hypothesis Tes,ng 1 Hypothesis Testing Mathema,cal
More informationEvaluation. Andrea Passerini Machine Learning. Evaluation
Andrea Passerini passerini@disi.unitn.it Machine Learning Basic concepts requires to define performance measures to be optimized Performance of learning algorithms cannot be evaluated on entire domain
More informationLearning from Observations. Chapter 18, Sections 1 3 1
Learning from Observations Chapter 18, Sections 1 3 Chapter 18, Sections 1 3 1 Outline Learning agents Inductive learning Decision tree learning Measuring learning performance Chapter 18, Sections 1 3
More informationStatistical Methods for Data Mining
Statistical Methods for Data Mining Kuangnan Fang Xiamen University Email: xmufkn@xmu.edu.cn Support Vector Machines Here we approach the two-class classification problem in a direct way: We try and find
More informationa. See the textbook for examples of proving logical equivalence using truth tables. b. There is a real number x for which f (x) < 0. (x 1) 2 > 0.
For some problems, several sample proofs are given here. Problem 1. a. See the textbook for examples of proving logical equivalence using truth tables. b. There is a real number x for which f (x) < 0.
More informationCS 380: ARTIFICIAL INTELLIGENCE
CS 380: ARTIFICIAL INTELLIGENCE MACHINE LEARNING 11/11/2013 Santiago Ontañón santi@cs.drexel.edu https://www.cs.drexel.edu/~santi/teaching/2013/cs380/intro.html Summary so far: Rational Agents Problem
More informationNatural Language Processing with Deep Learning. CS224N/Ling284
Natural Language Processing with Deep Learning CS224N/Ling284 Lecture 4: Word Window Classification and Neural Networks Christopher Manning and Richard Socher Overview Today: Classifica(on background Upda(ng
More informationDifferen'al Privacy with Bounded Priors: Reconciling U+lity and Privacy in Genome- Wide Associa+on Studies
Differen'al Privacy with Bounded Priors: Reconciling U+lity and Privacy in Genome- Wide Associa+on Studies Florian Tramèr, Zhicong Huang, Erman Ayday, Jean- Pierre Hubaux ACM CCS 205 Denver, Colorado,
More informationQuestion of the Day. Machine Learning 2D1431. Decision Tree for PlayTennis. Outline. Lecture 4: Decision Tree Learning
Question of the Day Machine Learning 2D1431 How can you make the following equation true by drawing only one straight line? 5 + 5 + 5 = 550 Lecture 4: Decision Tree Learning Outline Decision Tree for PlayTennis
More informationData Mining. Dimensionality reduction. Hamid Beigy. Sharif University of Technology. Fall 1395
Data Mining Dimensionality reduction Hamid Beigy Sharif University of Technology Fall 1395 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1395 1 / 42 Outline 1 Introduction 2 Feature selection
More informationarxiv: v2 [cs.ai] 26 Aug 2016
AD A Meta-Analysis of the Anomaly Detection Problem ANDREW EMMOTT, Oregon State University SHUBHOMOY DAS, Oregon State University THOMAS DIETTERICH, Oregon State University ALAN FERN, Oregon State University
More informationUVA CS / Introduc8on to Machine Learning and Data Mining. Lecture 9: Classifica8on with Support Vector Machine (cont.
UVA CS 4501-001 / 6501 007 Introduc8on to Machine Learning and Data Mining Lecture 9: Classifica8on with Support Vector Machine (cont.) Yanjun Qi / Jane University of Virginia Department of Computer Science
More information