Chapter 8 Mining Additional Perspectives

1 Chapter 8 Mining Additional Perspectives prof.dr.ir. Wil van der Aalst

2 Overview
Chapter 1 Introduction
Part I: Preliminaries
Chapter 2 Process Modeling and Analysis
Chapter 3 Data Mining
Part II: From Event Logs to Process Models
Chapter 4 Getting the Data
Chapter 5 Process Discovery: An Introduction
Chapter 6 Advanced Process Discovery Techniques
Part III: Beyond Process Discovery
Chapter 7 Conformance Checking
Chapter 8 Mining Additional Perspectives
Chapter 9 Operational Support
Part IV: Putting Process Mining to Work
Chapter 10 Tool Support
Chapter 11 Analyzing Lasagna Processes
Chapter 12 Analyzing Spaghetti Processes
Part V: Reflection
Chapter 13 Cartography and Navigation
Chapter 14 Epilogue
PAGE 1

3 Mining additional perspectives (one type of enhancement, cf. repair in context of conformance checking). [Figure: the "world" (business processes, people, machines, components, organizations) is supported/controlled by a software system that records events, e.g., messages, transactions, etc., in event logs. (Process) models specify, configure, implement, and analyze the system. Event logs and models are related through discovery, conformance, and enhancement.] PAGE 2

4 Replay: Connecting events to model elements is essential for process mining. Play-In: from event log to process model. Play-Out: from process model to event log. Replay: event log and process model together yield an extended model showing times, frequencies, etc., as well as diagnostics, predictions, and recommendations. PAGE 3

5 Remember: Replay! [Figure: trace A B C D replayed on a Petri net with transitions A, B, C, D, E and places start, p1, p2, p3, p4, end.] PAGE 4

6 Replay can detect problems. [Figure: trace A C D replayed on the same Petri net. B does not occur, and the replay is annotated with "Problem! token left behind" and "Problem! missing token".] PAGE 5

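7 Replay can extract timing information. [Figure: a trace with timestamped events (A 5, B 8, C 9, D) replayed on the same Petri net; places are annotated with the time tokens spend in them.] PAGE 6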

8 Decision mining: Red cases. [Figure: trace A B C D replayed on the same Petri net.] PAGE 7

9 Decision mining: Blue cases. [Figure: trace A E D replayed on the same Petri net.] If red then B+C; if blue then E. PAGE 8

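10 Starting point: connected event log and model. [Figure: class diagram with three levels. Model level: a process consists of activities (1 to many). Instance level: a process has cases and an activity has activity instances (1 to many); a case consists of activity instances. Event level: an activity instance has events (1 to many), and each event has attributes such as timestamp, resource, costs, transaction, etc.] PAGE 9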

11 Process. (1) Events in the event log have attributes relating to various perspectives. (2) The initial process model is made by hand or discovered from the event log. (3) Conformance checking is used to relate the initial model and event log. (4) The model is extended using the additional information in the event log. (5) The result is an integrated model showing multiple perspectives. [Figure: the overview diagram of world, software system, event logs, and (process) models, with the five steps marked on discovery, conformance, and enhancement.] PAGE 10

12 Attributes in event logs PAGE 11

13 Cases may also have attributes PAGE 12

14 Helicopter view: Dotted charts. Each dot corresponds to an event (example from the figure: activity : decide, type : start, time : :11.18, resource : Sara, cost : -, custid : 9911, name : Smith, type : gold, region : south, amount : ). The color and shape of a dot may depend on attributes of the event. The axes are class and time. Time can be absolute or relative and real or logical. Each line corresponds to a class, e.g., a case, a resource, a customer, or an activity. PAGE 13
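
The difference between absolute and relative time in a dotted chart can be sketched in a few lines. This is only an illustration under assumptions: the event layout (case, activity, timestamp), the sample values, and the function name `dotted_chart_points` are hypothetical and not taken from the slides or from any tool.

```python
def dotted_chart_points(events, relative=False):
    """Return one (class, activity, time) dot per event.

    events: list of (case_id, activity, timestamp) tuples (hypothetical layout).
    The case is used as the class, so each case gets its own line in the chart.
    With relative=True every case is shifted so that its first event is at time 0.
    """
    first_seen = {}
    for case_id, _activity, ts in events:
        first_seen[case_id] = min(ts, first_seen.get(case_id, ts))
    points = []
    for case_id, activity, ts in events:
        offset = first_seen[case_id] if relative else 0
        points.append((case_id, activity, ts - offset))
    return points


# Hypothetical events, for illustration only.
events = [(1, "a", 12), (1, "b", 25), (2, "a", 17), (2, "d", 28)]
absolute_points = dotted_chart_points(events)
relative_points = dotted_chart_points(events, relative=True)
```

With relative time all cases start in the same column, which makes differences in case duration easier to compare than in the absolute view.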

15 Dotted chart for a process of a housing agency using absolute time PAGE 14

16 Zooming in PAGE 15

17 Same log, relative time PAGE 16

18 Organizational mining PAGE 17

19 Resource-activity matrix: mean number of times a resource performs an activity per case. Activity a is executed exactly once for each case (take the sum of the first column). Pete, Mike, and Ellen are the only ones executing this activity. In 30% of the cases, a is executed by Pete, in 50% by Mike, and in 20% by Ellen. Activities e and f are always executed by Sara. Activity e is executed, on average, 2.3 times per case. Etc. PAGE 18
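
A resource-activity matrix of this kind could be computed roughly as follows. This is a minimal sketch, not the method of any particular tool: the log layout (a dictionary from case id to a list of (activity, resource) pairs), the two sample cases, and the function name `resource_activity_matrix` are assumptions made for illustration.

```python
from collections import Counter


def resource_activity_matrix(log):
    """Mean number of times each resource performs each activity per case.

    log: dict mapping case id -> list of (activity, resource) pairs (hypothetical layout).
    Returns a dict keyed by (resource, activity).
    """
    counts = Counter()
    for trace in log.values():
        for activity, resource in trace:
            counts[(resource, activity)] += 1
    n_cases = len(log)
    return {key: count / n_cases for key, count in counts.items()}


# Hypothetical toy log, not the log behind the slide.
log = {
    1: [("a", "Pete"), ("b", "Sue"), ("e", "Sara")],
    2: [("a", "Mike"), ("c", "Mike"), ("e", "Sara"), ("f", "Sara"), ("e", "Sara")],
}
matrix = resource_activity_matrix(log)
# Summing one activity over all resources gives its mean frequency per case.
a_per_case = sum(v for (res, act), v in matrix.items() if act == "a")
```

In this toy log activity a occurs once per case, so its column sums to 1, while e is executed 1.5 times per case on average, always by Sara.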

20 Social network analysis. Nodes are organizational entities (resource, person, role, department, etc.) and arcs are relationships. The thickness of the arc indicates the weight of the relationship; the size of the oval indicates the weight of the entity. [Figure: example network with entities x, y, and z and weights ranging from w=0.08 to w=0.98.] PAGE 19

21 Handover of work matrix Count the number of times work is handed over from one resource to another (on average per case). The causal dependencies in the process model are used to count handovers in the event log. PAGE 20
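
Counting handovers could look roughly like the sketch below. It makes a simplifying assumption that the slide does not state: only directly successive events are considered, and a handover is counted when the pair of activities is a causal dependency in the model. With concurrency (another activity occurring in between) a proper replay on the model would be needed. The log layout, the sample data, and the names `handover_matrix` and `causal_pairs` are hypothetical.

```python
from collections import Counter


def handover_matrix(log, causal_pairs):
    """Average number of handovers per case between pairs of resources.

    log: dict mapping case id -> time-ordered list of (activity, resource) pairs.
    causal_pairs: set of (activity, activity) dependencies taken from the model.
    Simplification: only directly successive events are inspected.
    """
    counts = Counter()
    for trace in log.values():
        for (act1, res1), (act2, res2) in zip(trace, trace[1:]):
            if (act1, act2) in causal_pairs:
                counts[(res1, res2)] += 1
    return {pair: count / len(log) for pair, count in counts.items()}


# Hypothetical toy log and dependencies, for illustration only.
log = {
    1: [("a", "Pete"), ("b", "Sue"), ("e", "Sara")],
    2: [("a", "Pete"), ("c", "Mike"), ("e", "Sara")],
}
causal_pairs = {("a", "b"), ("a", "c"), ("b", "e"), ("c", "e")}
handovers = handover_matrix(log, causal_pairs)
# Keep only arcs at or above a threshold when drawing the social network.
network = {pair: w for pair, w in handovers.items() if w >= 0.1}
```

The threshold filter in the last line corresponds to drawing only the sufficiently frequent handovers as arcs.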

22 Social network based on handover of work (threshold of 0.1). In this figure only the thickness of the arcs is based on frequencies. [Figure: network over the resources Pete, Sue, Sean, Mike, Sara, and Ellen.] PAGE 21

23 Handover of work at role level. In this figure also the size of each node is based on frequencies. [Figure: network over the roles Assistant, Expert, and Manager with weight labels w=5.45, w=1.5, w=0.5, w=1.3, w=3.45, w=2.95, w=0.65, w=1.15, w=3.6, and w=1.15.] PAGE 22

24 Profile PAGE 23

25 Social network based on similarity of profiles. Resources that execute similar collections of activities are related. Sara is the only resource executing e and f. Therefore, she is not connected to other resources. Self-loops are suppressed as they contain no information (self-similarity). [Figure: network over the resources Pete, Sean, Sue, Mike, Ellen, and Sara.] PAGE 24
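
One way to relate resources by their profiles is sketched below. The slide does not say which similarity measure is used, so cosine similarity is an assumption here, and the profile vectors are hypothetical values chosen only so that Sara is the sole resource with non-zero entries for e and f.

```python
import math
from itertools import combinations


def cosine_similarity(p, q):
    """Cosine of the angle between two profile vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(p, q))
    norm = math.sqrt(sum(x * x for x in p)) * math.sqrt(sum(y * y for y in q))
    return dot / norm if norm else 0.0


# Hypothetical profiles: mean executions per case of activities a, b, c, d, e, f.
profiles = {
    "Pete": [0.3, 0.0, 0.2, 0.4, 0.0, 0.0],
    "Mike": [0.5, 0.0, 0.3, 0.3, 0.0, 0.0],
    "Sara": [0.0, 0.0, 0.0, 0.0, 2.3, 1.3],
}

# combinations() never pairs a resource with itself, so self-loops are left out.
similarity = {
    (r1, r2): cosine_similarity(profiles[r1], profiles[r2])
    for r1, r2 in combinations(sorted(profiles), 2)
}
connected = [pair for pair, s in similarity.items() if s > 0]
```

Because Sara shares no activity with the others, her similarity to every other resource is zero and she stays isolated, in line with the slide.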

26 Discovering organizational structures. [Figure: Petri net of the process (a register request, b examine thoroughly, c examine casually, d check ticket, e decide, f reinitiate request, g pay compensation, h reject request; places start, p1 to p5, end) annotated with the discovered roles: Assistant (Mike, Ellen, Pete), Expert (Sue, Sean), and Manager (Sara).] PAGE 25

27 Another example. [Figure: activities a1 to a5 of the process model are linked to organizational entities oe1 to oe8 of the organizational model, which in turn group the resources r1 to r9.] PAGE 26

28 Analyzing resource behavior, e.g., Yerkes-Dodson law of arousal PAGE 27

29 Learning time and probabilities Replay, as before, but now considering timestamps. Let us replay the first three cases in the event log: case 1 starts at time 12 and ends at time 54, case 2 starts at time 17 and ends at time 73, case 3 starts at time 25 and ends at time 98. PAGE 28
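
As a small worked example, the flow time of each of the three cases follows directly from the start and end times given above. The variable names are illustrative only.

```python
# (start, end) of the first three cases as stated on the slide.
case_times = {1: (12, 54), 2: (17, 73), 3: (25, 98)}

# Flow time of a case = end time minus start time.
flow_times = {case: end - start for case, (start, end) in case_times.items()}
mean_flow_time = sum(flow_times.values()) / len(flow_times)
```

The three cases take 42, 56, and 73 time units, so the mean flow time over these cases is 57.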

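30 [Figure: timed replay of the first three cases on the Petri net with places start, p1 to p5, and end. Each activity is annotated with the start (s) and complete (c) time per case; each place is annotated with the time tokens spent in it per case.]
Activity instances (case: start to complete):
a register request: case 1: 12 to 19; case 2: 17 to 23; case 3: 25 to 30
b examine thoroughly: case 1: 25 to 32; case 3: 60 to 65
c examine casually: case 2: 30 to 38; case 3: 32 to 35
d check ticket: case 1: 26 to 33; case 2: 28 to 32; case 3: 35 to 40 and 62 to 67
e decide: case 1: 35 to 40; case 2: 50 to 59; case 3: 45 to 50 and 80 to 87
f reinitiate request: case 3: 50 to 55
g pay compensation: case 2: 70 to 73; case 3: 90 to 98
h reject request: case 1: 50 to 54
Time spent in places (case:duration):
before b or c (p1): 1:6 2:7 3:2 3:5
before d (p2): 1:7 2:5 3:5 3:7
between b or c and e (p3): 1:3 2:12 3:10 3:15
between d and e (p4): 1:2 2:18 3:5 3:13
after e (p5): 1:10 2:11 3:0 3:3
start: 1:12 2:17 3:25
end: 1:54 2:73 3:
PAGE 29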
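
The durations attached to the places can be re-derived from the start (s) and complete (c) times of the activity instances. The sketch below does this for one place at a time. It rests on an assumption: the assignment of the s/c labels to activities is a reading of the flattened figure (it is consistent with the case start and end times stated earlier), and the function name `place_waits` and the log layout are hypothetical.

```python
def place_waits(trace, producers, consumers):
    """Time tokens spend in one place during the replay of one case.

    trace: list of (activity, start, complete), ordered by start time.
    producers: activities that put a token into the place when they complete.
    consumers: activities that take the token out of the place when they start.
    """
    waits, token_time = [], None
    for activity, start, complete in trace:
        if activity in consumers and token_time is not None:
            waits.append(start - token_time)
            token_time = None
        if activity in producers:
            token_time = complete
    return waits


# Start/complete times as read from the replay figure (assumed reading).
log = {
    1: [("a", 12, 19), ("b", 25, 32), ("d", 26, 33), ("e", 35, 40), ("h", 50, 54)],
    2: [("a", 17, 23), ("d", 28, 32), ("c", 30, 38), ("e", 50, 59), ("g", 70, 73)],
    3: [("a", 25, 30), ("c", 32, 35), ("d", 35, 40), ("e", 45, 50), ("f", 50, 55),
        ("b", 60, 65), ("d", 62, 67), ("e", 80, 87), ("g", 90, 98)],
}

# Place between a (or f) and the examination activities b and c.
before_examine = {c: place_waits(t, {"a", "f"}, {"b", "c"}) for c, t in log.items()}
# Place between d and e.
before_decide = {c: place_waits(t, {"d"}, {"e"}) for c, t in log.items()}
# Place after e, before f, g, or h.
after_decide = {c: place_waits(t, {"e"}, {"f", "g", "h"}) for c, t in log.items()}
```

Case 3 visits each of these places twice because the request is reinitiated once, which is why it contributes two durations per place.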

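31 Another view on the timed replay of the first three cases. [Figure: one timeline per case showing its activity instances over time. Case 1: a, b, d, e, h. Case 2: a, d, c, e, g. Case 3: a, b, c, d (twice), e (twice), f, g.] PAGE 30

32 Timed replay projected onto resources. [Figure: one timeline per resource (Pete, Mike, Ellen, Sue, Sean, Sara) showing the activity instances each resource performed over time.] PAGE 31

33 Decision mining. [Figure: Petri net of the process (a register request, b examine thoroughly, c examine casually, d check ticket, e decide, f reinitiate request, g pay compensation, h reject request; places start, c1 to c5, end). Decision point #1 is the choice between b and c; decision point #2 is the choice between f, g, and h.] PAGE 32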

34 Example: XOR-split
type | region | amount | activity
gold | south | | z
silver | north | | z
gold | south | | y
silver | west | | z
silver | east | | z
silver | south | | z
gold | north | | y
silver | west | | z
silver | south | | z
gold | west | | y
silver | south | | z
gold | south | | z
silver | north | | z
gold | south | | y
silver | north | | z
silver | west | | z
gold | east | | y
After x, activity y is chosen if type=gold and amount<500; activity z is chosen if type=silver or amount ≥ 500.
What are the features (predictor variables) influencing the decision? A classification technique like decision tree learning can be used to find such rules: explain the response variable (dependent variable) in terms of predictor variables (independent variables).
PAGE 33
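
The first step of decision tree learning, choosing the attribute that best explains the response variable, can be sketched with information gain. This is an illustration under assumptions, not the learner used for the slide: the amount values are not available in this transcription, so only type and region are used as predictors, and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute_index):
    """Entropy reduction obtained by splitting rows on one attribute.

    rows: tuples whose last element is the response variable (the chosen activity).
    """
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute_index], []).append(row[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[-1] for row in rows]) - remainder


# (type, region, activity) rows from the XOR-split example; amount is unavailable.
rows = [
    ("gold", "south", "z"), ("silver", "north", "z"), ("gold", "south", "y"),
    ("silver", "west", "z"), ("silver", "east", "z"), ("silver", "south", "z"),
    ("gold", "north", "y"), ("silver", "west", "z"), ("silver", "south", "z"),
    ("gold", "west", "y"), ("silver", "south", "z"), ("gold", "south", "z"),
    ("silver", "north", "z"), ("gold", "south", "y"), ("silver", "north", "z"),
    ("silver", "west", "z"), ("gold", "east", "y"),
]
gain_type = information_gain(rows, 0)
gain_region = information_gain(rows, 1)
silver_choices = {activity for ctype, _region, activity in rows if ctype == "silver"}
```

Splitting on type reduces the entropy far more than splitting on region, and every silver case goes to z. The remaining mix of y and z among gold cases is what the amount attribute would have to explain.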

35 Example: OR-split
type | region | amount | activity
gold | south | | y and z
silver | north | | y and z
gold | south | | just y
silver | west | | just z
silver | east | | y and z
silver | south | | y and z
gold | north | | just y
silver | west | | y and z
silver | south | | just z
After x, activity y is executed if type=gold or amount<500; activity z is executed if type=silver or amount ≥ 500. Both conditions may hold for the same case.
PAGE 34

36 Classification in process mining. The application of classification techniques like decision tree learning is not limited to decision mining based on event/case data only. Additional predictor variables may be used: behavioral information (count number of loops), performance information (processing times), and contextual information (weather, queues, etc.). Alternative response variables can be analyzed: uncover reasons for non-conformance (split instances in two groups) or uncover reasons for delays. PAGE 35
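
Deriving such additional predictor and response variables from a log could look like the sketch below. The traces and times are a reading of the timed replay slides for the first three cases; the delay threshold of 60 time units, the choice of activity f as the loop indicator, and the function name `case_features` are assumptions made for illustration.

```python
DELAY_THRESHOLD = 60  # hypothetical cut-off separating delayed from on-time cases


def case_features(trace, start, end):
    """Build one row of a classification table for a single case."""
    flow_time = end - start
    return {
        "loops": trace.count("f"),               # behavioral: number of reinitiations
        "flow_time": flow_time,                  # performance: time from start to end
        "delayed": flow_time > DELAY_THRESHOLD,  # alternative response variable
    }


# (trace, start, end) of the first three cases (assumed reading of the replay slides).
cases = {
    1: (list("abdeh"), 12, 54),
    2: (list("adceg"), 17, 73),
    3: (list("acdefbdeg"), 25, 98),
}
features = {case_id: case_features(*data) for case_id, data in cases.items()}
```

The resulting table can be fed to any classifier, with "delayed" as the response variable and the other columns (plus case attributes such as type or region) as predictors.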

37 Bringing it all together
Step 1: obtain an event log.
Step 2: create or discover a process model.
Step 3: connect events in the log to activities in the model.
Step 4: extend the model: add organizational perspective, add time perspective, add case perspective, add other perspectives.
Step 5: return integrated model.
Roles: A = Assistant (Pete, Mike, Ellen), E = Expert (Sue, Sean), M = Manager (Sara).
[Figure: the process model with places start, c1 to c5, and end, in which each activity is annotated with its role: a register request (A), b examine thoroughly (E), c examine casually (A), d check ticket (A), e decide (M), f reinitiate request (M), g pay compensation (A), h reject request (A).]
PAGE 36


More information

Lecture 7: DecisionTrees

Lecture 7: DecisionTrees Lecture 7: DecisionTrees What are decision trees? Brief interlude on information theory Decision tree construction Overfitting avoidance Regression trees COMP-652, Lecture 7 - September 28, 2009 1 Recall:

More information

Holdout and Cross-Validation Methods Overfitting Avoidance

Holdout and Cross-Validation Methods Overfitting Avoidance Holdout and Cross-Validation Methods Overfitting Avoidance Decision Trees Reduce error pruning Cost-complexity pruning Neural Networks Early stopping Adjusting Regularizers via Cross-Validation Nearest

More information

Co-constructing bushfire:

Co-constructing bushfire: The Age, February 7, 2010 Karen Reid Ruth Beilin Roksana Karim Landscape Sociology Melbourne School of Land & Environment University of Melbourne rbeilin@unimelb.edu.au reidk@unimelb.edu.au Co-constructing

More information

Chap 4. Software Reliability

Chap 4. Software Reliability Chap 4. Software Reliability 4.2 Reliability Growth 1. Introduction 2. Reliability Growth Models 3. The Basic Execution Model 4. Calendar Time Computation 5. Reliability Demonstration Testing 1. Introduction

More information

GIS Visualization Support to the C4.5 Classification Algorithm of KDD

GIS Visualization Support to the C4.5 Classification Algorithm of KDD GIS Visualization Support to the C4.5 Classification Algorithm of KDD Gennady L. Andrienko and Natalia V. Andrienko GMD - German National Research Center for Information Technology Schloss Birlinghoven,

More information

UPDATING THE MINNESOTA NATIONAL WETLAND INVENTORY

UPDATING THE MINNESOTA NATIONAL WETLAND INVENTORY UPDATING THE MINNESOTA NATIONAL WETLAND INVENTORY An Integrated Approach Using Object-Oriented Image Analysis, Human Air-Photo Interpretation and Machine Learning AARON SMITH EQUINOX ANALYTICS INC. FUNDING

More information

Distributed systems Lecture 4: Clock synchronisation; logical clocks. Dr Robert N. M. Watson

Distributed systems Lecture 4: Clock synchronisation; logical clocks. Dr Robert N. M. Watson Distributed systems Lecture 4: Clock synchronisation; logical clocks Dr Robert N. M. Watson 1 Last time Started to look at time in distributed systems Coordinating actions between processes Physical clocks

More information

Decision Model for Potential Asteroid Impacts

Decision Model for Potential Asteroid Impacts Decision Model for Potential Asteroid Impacts Research Paper EB560 Decision Analysis Division of Economics and Business Colorado School of Mines December 9, 2003 Brad R. Blair EXECUTIVE SUMMARY Research

More information

Topic Models and Applications to Short Documents

Topic Models and Applications to Short Documents Topic Models and Applications to Short Documents Dieu-Thu Le Email: dieuthu.le@unitn.it Trento University April 6, 2011 1 / 43 Outline Introduction Latent Dirichlet Allocation Gibbs Sampling Short Text

More information

Final. Introduction to Artificial Intelligence. CS 188 Spring You have approximately 2 hours and 50 minutes.

Final. Introduction to Artificial Intelligence. CS 188 Spring You have approximately 2 hours and 50 minutes. CS 188 Spring 2014 Introduction to Artificial Intelligence Final You have approximately 2 hours and 50 minutes. The exam is closed book, closed notes except your two-page crib sheet. Mark your answers

More information

Administrative notes. Computational Thinking ct.cs.ubc.ca

Administrative notes. Computational Thinking ct.cs.ubc.ca Administrative notes Labs this week: project time. Remember, you need to pass the project in order to pass the course! (See course syllabus.) Clicker grades should be on-line now Administrative notes March

More information

OECD QSAR Toolbox v.3.2. Step-by-step example of how to build and evaluate a category based on mechanism of action with protein and DNA binding

OECD QSAR Toolbox v.3.2. Step-by-step example of how to build and evaluate a category based on mechanism of action with protein and DNA binding OECD QSAR Toolbox v.3.2 Step-by-step example of how to build and evaluate a category based on mechanism of action with protein and DNA binding Outlook Background Objectives Specific Aims The exercise Workflow

More information

Data Mining and Knowledge Discovery: Practice Notes

Data Mining and Knowledge Discovery: Practice Notes Data Mining and Knowledge Discovery: Practice Notes dr. Petra Kralj Novak Petra.Kralj.Novak@ijs.si 7.11.2017 1 Course Prof. Bojan Cestnik Data preparation Prof. Nada Lavrač: Data mining overview Advanced

More information

Lecture 15 April 9, 2007

Lecture 15 April 9, 2007 6.851: Advanced Data Structures Spring 2007 Mihai Pătraşcu Lecture 15 April 9, 2007 Scribe: Ivaylo Riskov 1 Overview In the last lecture we considered the problem of finding the predecessor in its static

More information

CS : Spatial Data Modeling and Analysis. Geovisualization

CS : Spatial Data Modeling and Analysis. Geovisualization CS260-002: Spatial Data Modeling and Analysis Geovisualization Visual Perception Learning Styles & Personality Types: Visual, Auditory, Kinesthetic Cholera cases in the London epidemic of 1854 Cholera

More information

DrFurby Classifier. Process Discovery BPM Where innovation starts

DrFurby Classifier. Process Discovery BPM Where innovation starts Den Dolech 2, 5612 AZ Eindhoven P.O. Box 513, 5600 MB Eindhoven The Netherlands www.tue.nl Author Eric Verbeek and Felix Mannhardt Date June 14, 2016 Version 1.2 DrFurby Classifier Process Discovery Contest

More information

Requirements Validation. Content. What the standards say (*) ?? Validation, Verification, Accreditation!! Correctness and completeness

Requirements Validation. Content. What the standards say (*) ?? Validation, Verification, Accreditation!! Correctness and completeness Requirements Validation Requirements Management Requirements Validation?? Validation, Verification, Accreditation!! Check if evrything is OK With respect to what? Mesurement associated with requirements

More information

SPATIAL DATA MINING. Ms. S. Malathi, Lecturer in Computer Applications, KGiSL - IIM

SPATIAL DATA MINING. Ms. S. Malathi, Lecturer in Computer Applications, KGiSL - IIM SPATIAL DATA MINING Ms. S. Malathi, Lecturer in Computer Applications, KGiSL - IIM INTRODUCTION The main difference between data mining in relational DBS and in spatial DBS is that attributes of the neighbors

More information

Chapter 8 - Forecasting

Chapter 8 - Forecasting Chapter 8 - Forecasting Operations Management by R. Dan Reid & Nada R. Sanders 4th Edition Wiley 2010 Wiley 2010 1 Learning Objectives Identify Principles of Forecasting Explain the steps in the forecasting

More information

Explaining Results of Neural Networks by Contextual Importance and Utility

Explaining Results of Neural Networks by Contextual Importance and Utility Explaining Results of Neural Networks by Contextual Importance and Utility Kary FRÄMLING Dep. SIMADE, Ecole des Mines, 158 cours Fauriel, 42023 Saint-Etienne Cedex 2, FRANCE framling@emse.fr, tel.: +33-77.42.66.09

More information

CS 425 / ECE 428 Distributed Systems Fall Indranil Gupta (Indy) Oct. 5, 2017 Lecture 12: Time and Ordering All slides IG

CS 425 / ECE 428 Distributed Systems Fall Indranil Gupta (Indy) Oct. 5, 2017 Lecture 12: Time and Ordering All slides IG CS 425 / ECE 428 Distributed Systems Fall 2017 Indranil Gupta (Indy) Oct. 5, 2017 Lecture 12: Time and Ordering All slides IG Why Synchronization? You want to catch a bus at 6.05 pm, but your watch is

More information

Stat 502X Exam 2 Spring 2014

Stat 502X Exam 2 Spring 2014 Stat 502X Exam 2 Spring 2014 I have neither given nor received unauthorized assistance on this exam. Name Signed Date Name Printed This exam consists of 12 parts. I'll score it at 10 points per problem/part

More information

OECD QSAR Toolbox v.3.3. Step-by-step example of how to build and evaluate a category based on mechanism of action with protein and DNA binding

OECD QSAR Toolbox v.3.3. Step-by-step example of how to build and evaluate a category based on mechanism of action with protein and DNA binding OECD QSAR Toolbox v.3.3 Step-by-step example of how to build and evaluate a category based on mechanism of action with protein and DNA binding Outlook Background Objectives Specific Aims The exercise Workflow

More information

Map your way to deeper insights

Map your way to deeper insights Map your way to deeper insights Target, forecast and plan by geographic region Highlights Apply your data to pre-installed map templates and customize to meet your needs. Select from included map files

More information

Describing Data Table with Best Decision

Describing Data Table with Best Decision Describing Data Table with Best Decision ANTS TORIM, REIN KUUSIK Department of Informatics Tallinn University of Technology Raja 15, 12618 Tallinn ESTONIA torim@staff.ttu.ee kuusik@cc.ttu.ee http://staff.ttu.ee/~torim

More information

GIS and Business Location Analytics

GIS and Business Location Analytics UNT s course evaluation system (SPOT - Student Perceptions of Teaching) opened on Monday, April 16 and runs through Thursday, May 3. You should have received an email on April 16 providing guidance on

More information

STUDY GUIDE. Exploring Geography. Chapter 1, Section 1. Terms to Know DRAWING FROM EXPERIENCE ORGANIZING YOUR THOUGHTS

STUDY GUIDE. Exploring Geography. Chapter 1, Section 1. Terms to Know DRAWING FROM EXPERIENCE ORGANIZING YOUR THOUGHTS For use with textbook pages 19 22. Exploring Geography Terms to Know location A specific place on the earth (page 20) absolute location The exact spot at which a place is found on the globe (page 20) hemisphere

More information

Toward a Definition of Astrology Michael Munkasey, 1996

Toward a Definition of Astrology Michael Munkasey, 1996 Toward a Definition of Astrology Michael Munkasey, 1996 Opening Statements Astrology is a complex subject which is rapidly gaining in recognition and popularity. This growth begs for a more formal definition

More information

Time and Activity Sequence Prediction of Business Process Instances

Time and Activity Sequence Prediction of Business Process Instances Noname manuscript No. (will be inserted by the editor) Time and Activity Sequence Prediction of Business Process Instances Mirko Polato Alessandro Sperduti Andrea Burattin Massimiliano de Leoni the date

More information

DISTRIBUTED COMPUTER SYSTEMS

DISTRIBUTED COMPUTER SYSTEMS DISTRIBUTED COMPUTER SYSTEMS SYNCHRONIZATION Dr. Jack Lange Computer Science Department University of Pittsburgh Fall 2015 Topics Clock Synchronization Physical Clocks Clock Synchronization Algorithms

More information