Process Mining: Data Science in Action. Alpha Algorithm: Limitations. prof.dr.ir. Wil van der Aalst, www.processmining.org
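Let $L$ be an event log over $T$. $\alpha(L)$ is defined as follows.
1. $T_L = \{\, t \in T \mid \exists_{\sigma \in L}\; t \in \sigma \,\}$,
2. $T_I = \{\, t \in T \mid \exists_{\sigma \in L}\; t = \mathit{first}(\sigma) \,\}$,
3. $T_O = \{\, t \in T \mid \exists_{\sigma \in L}\; t = \mathit{last}(\sigma) \,\}$,
4. $X_L = \{\, (A,B) \mid A \subseteq T_L \wedge A \neq \emptyset \wedge B \subseteq T_L \wedge B \neq \emptyset \wedge \forall_{a \in A} \forall_{b \in B}\; a \rightarrow_L b \wedge \forall_{a_1,a_2 \in A}\; a_1 \#_L a_2 \wedge \forall_{b_1,b_2 \in B}\; b_1 \#_L b_2 \,\}$,
5. $Y_L = \{\, (A,B) \in X_L \mid \forall_{(A',B') \in X_L}\; A \subseteq A' \wedge B \subseteq B' \Rightarrow (A,B) = (A',B') \,\}$,
6. $P_L = \{\, p_{(A,B)} \mid (A,B) \in Y_L \,\} \cup \{ i_L, o_L \}$,
7. $F_L = \{\, (a, p_{(A,B)}) \mid (A,B) \in Y_L \wedge a \in A \,\} \cup \{\, (p_{(A,B)}, b) \mid (A,B) \in Y_L \wedge b \in B \,\} \cup \{\, (i_L, t) \mid t \in T_I \,\} \cup \{\, (t, o_L) \mid t \in T_O \,\}$, and
8. $\alpha(L) = (P_L, T_L, F_L)$.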
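The eight steps can be sketched in a few lines of Python. This is a minimal, brute-force illustration rather than a reference implementation: it assumes a log is a list of non-empty tuples of activity names, it enumerates all subsets of $T_L$ (exponential, so only suitable for toy logs), and the names `alpha`, `directly_follows` and `nonempty_subsets`, as well as the example log, are chosen here for illustration.

```python
from itertools import combinations


def directly_follows(log):
    """All pairs (a, b) such that a is directly followed by b in some trace."""
    return {(t[i], t[i + 1]) for t in log for i in range(len(t) - 1)}


def nonempty_subsets(items):
    """All non-empty subsets of items, as frozensets (exponential in size)."""
    items = sorted(items)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]


def alpha(log):
    """Sketch of alpha(L) = (P_L, T_L, F_L); log is a list of non-empty tuples."""
    df = directly_follows(log)

    def causal(a, b):      # a ->_L b
        return (a, b) in df and (b, a) not in df

    def unrelated(a, b):   # a #_L b
        return (a, b) not in df and (b, a) not in df

    t_l = {a for trace in log for a in trace}   # step 1
    t_i = {trace[0] for trace in log}           # step 2
    t_o = {trace[-1] for trace in log}          # step 3

    # Step 4: members of A (and of B) must be pairwise unrelated, including
    # each activity with itself; every a in A must be causally before every b in B.
    candidates = [s for s in nonempty_subsets(t_l)
                  if all(unrelated(x, y) for x in s for y in s)]
    x_l = [(A, B) for A in candidates for B in candidates
           if all(causal(a, b) for a in A for b in B)]

    # Step 5: keep only the maximal pairs.
    y_l = [(A, B) for (A, B) in x_l
           if not any((A, B) != (A2, B2) and A <= A2 and B <= B2
                      for (A2, B2) in x_l)]

    # Steps 6 and 7: one place per maximal pair plus source and sink, then arcs.
    places = {("p", A, B) for (A, B) in y_l} | {"i_L", "o_L"}
    flow = ({(a, ("p", A, B)) for (A, B) in y_l for a in A}
            | {(("p", A, B), b) for (A, B) in y_l for b in B}
            | {("i_L", t) for t in t_i}
            | {(t, "o_L") for t in t_o})
    return places, t_l, flow                    # step 8


# Illustrative log (an assumption, not taken from the slides).
example_log = [("a", "b", "c", "d"), ("a", "c", "b", "d"), ("a", "e", "d")]
places, transitions, flow = alpha(example_log)
print(len(places), len(transitions), len(flow))  # 6 5 14
```

For this toy log the maximal pairs are ({a},{b,e}), ({a},{c,e}), ({b,e},{d}) and ({c,e},{d}), which gives four internal places plus $i_L$ and $o_L$.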
Alpha algorithm [Figure: Petri net with places start, c1 to c5 and end, and transitions register request, examine thoroughly, examine casually, check ticket, decide, reinitiate request, pay compensation and reject request.]
Limitation of the α algorithm: Implicit places [Figure: Petri net containing the places p1 and p2.] p1 and p2 are implicit places!
Limitation of the α algorithm: Loops of length 1 [Figure: > and # relations of an example log containing a loop of length 1, next to the desired model.]
Limitation of the α algorithm: Loops of length 2 [Figure: > relations of an example log containing a loop of length 2, next to the desired model.]
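Why short loops break the algorithm can be read off the footprint. The sketch below assumes two hypothetical logs (they are not the logs shown on the slides) and a helper `footprint` introduced here for illustration. In a loop of length 1 the repeated activity directly follows itself, so it is not unrelated (#) to itself and can never be part of a set A or B in step 4. In a loop of length 2 the two alternating activities follow each other in both directions and are therefore classified as parallel, so no place is created between them.

```python
def footprint(log):
    """Map each ordered pair of activities to '->', '<-', '||' or '#'."""
    df = {(t[i], t[i + 1]) for t in log for i in range(len(t) - 1)}
    acts = sorted({a for t in log for a in t})
    fp = {}
    for x in acts:
        for y in acts:
            if (x, y) in df and (y, x) in df:
                fp[x, y] = "||"   # both orders observed
            elif (x, y) in df:
                fp[x, y] = "->"   # only x before y observed
            elif (y, x) in df:
                fp[x, y] = "<-"   # only y before x observed
            else:
                fp[x, y] = "#"    # never directly adjacent
    return fp


# Hypothetical log with a self-loop on b (loop of length 1).
loop1 = [("a", "c"), ("a", "b", "c"), ("a", "b", "b", "c")]
# Hypothetical log in which b and c alternate (loop of length 2).
loop2 = [("a", "b", "d"), ("a", "b", "c", "b", "d"),
         ("a", "b", "c", "b", "c", "b", "d")]

print(footprint(loop1)["b", "b"])  # '||' instead of '#': b is excluded from every A and B
print(footprint(loop2)["b", "c"])  # '||' instead of '->': no place between b and c
```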
Limitation of the α algorithm: Non-local dependencies [Figure: model with activities d and e in which two places are marked "??".]
Limitation of the α algorithm: Non-local dependencies [Figure: model with activities d and e containing the places p1 and p2.] p1 and p2 are not discovered!
Two event logs: Same discovered model [Figure: two event logs and the single model discovered from both.]
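The effect can be reproduced with two small logs. The logs below are assumptions made for illustration: in the first, the choice between d and e is fully determined by the earlier choice between a and b (a non-local dependency); in the second, all four combinations occur. Because the α algorithm only looks at the directly-follows relation and at the first and last activities of each trace, it cannot tell the two logs apart.

```python
def directly_follows(log):
    """All pairs (a, b) such that a is directly followed by b in some trace."""
    return {(t[i], t[i + 1]) for t in log for i in range(len(t) - 1)}


def alpha_input(log):
    """Everything the alpha algorithm uses: start set, end set, directly-follows."""
    starts = {t[0] for t in log}
    ends = {t[-1] for t in log}
    return starts, ends, directly_follows(log)


# Hypothetical log with a non-local dependency: a implies d, b implies e.
log_dependent = [("a", "c", "d"), ("b", "c", "e")]
# Hypothetical log without the dependency: every combination occurs.
log_free = [("a", "c", "d"), ("a", "c", "e"), ("b", "c", "d"), ("b", "c", "e")]

print(alpha_input(log_dependent) == alpha_input(log_free))  # True
```

Since the inputs are identical, any algorithm that relies on them alone must return the same model for both logs.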
Difficult constructs for the Alpha algorithm
Question. Consider the event log: [Event log shown on slide.] What model will the Alpha algorithm create? Give a sound WF-net that can produce the observed behavior and nothing more.
Answer (1/2): Model generated by the Alpha algorithm [Figure: discovered model.] The model generated by the Alpha algorithm also allows for a trace starting with … and ending with d!
Answer (2/2): A sound WF-net that can produce the observed behavior and nothing more [Figure: WF-net with two transitions labeled e.] Note the duplicated e transition! The Alpha algorithm will never create a WF-net with two transitions having the same label.
Limitation of the α algorithm: representational bias [Figure: WF-net with places start, p and end.] There is no WF-net with unique visible labels that exhibits this behavior.
Another example [Figure: WF-nets with places start, p1 and end, one of which contains a silent transition τ.] There is no WF-net with unique visible labels that exhibits this behavior.
OR-split/join model [Figure: model with an OR-split and an OR-join between start and end.] Let us take an event log containing all possible full firing sequences and apply the Alpha algorithm. What will happen?
Applying the Alpha algorithm using ProM: ???
Region-based miner (with label splitting)
Limitation of the α algorithm: the resulting model does not need to be a sound WF-net [Figure: discovered model.] The discovered model is not sound (it has a deadlock).
Challenge: Noise and Incompleteness. To discover a suitable process model it is assumed that the event log contains a representative sample of behavior. Two related phenomena:
Noise: the event log contains rare and infrequent behavior not representative for the typical behavior of the process.
Incompleteness: the event log contains too few events to be able to discover some of the underlying control-flow structures.
Flower model [Figure: flower model between start and end.]
What is the best model? Event log: ⟨a,c,d⟩ 99 times, ⟨b,c,e⟩ 85 times. [Figure: candidate models.]
What is the best model? Event log: ⟨a,c,d⟩ 99 times, ⟨a,c,e⟩ 50 times, ⟨b,c,e⟩ 85 times, ⟨b,c,d⟩ 48 times. [Figure: candidate models.]
What is the best model? Event log: ⟨a,c,d⟩ 99 times, ⟨a,c,e⟩ 1 time, ⟨b,c,e⟩ 85 times, ⟨b,c,d⟩ 2 times. [Figure: candidate models.]
The Alpha algorithm cannot deal with noise (rare behavior, i.e., outliers).
Process models are like maps: we may not want to see all paths and only see the highways.
Related to noise: Completeness
[Figure: Petri net with places start, c1 to c5 and end, and transitions register request, examine thoroughly, examine casually, check ticket, decide, reinitiate request, pay compensation and reject request.] Infinitely many possible traces, 7 possible states.
[Figure: k parallel activities between start and end.] Number of states: $2^k + 2$. Number of different traces: $k!$.

k     number of states    number of different traces
1     4                   1
2     6                   2
5     34                  120
10    1026                3628800
20    1048578             2.432902e+18
The Alpha algorithm depends on the directly-follows relation [Figure: k parallel activities between start and end; number of states: $2^k + 2$, number of different traces: $k!$.] Only $k(k-1)$ observations are needed to discover the concurrent part. However, if one of these is missing, the result will be incorrect.
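The numbers for k parallel activities are easy to recompute. The sketch below is illustrative only; the function names `concurrent_block_stats` and `observed_pairs` are introduced here. It evaluates the three quantities and checks, for a small k, that a log containing every interleaving yields exactly $k(k-1)$ directly-follows pairs.

```python
from itertools import permutations
from math import factorial


def concurrent_block_stats(k):
    """Return (number of states, number of traces, directly-follows pairs needed)."""
    states = 2 ** k + 2      # every subset of completed activities, plus start and end
    traces = factorial(k)    # every interleaving of the k activities
    pairs = k * (k - 1)      # every ordered pair of distinct activities
    return states, traces, pairs


def observed_pairs(k):
    """Directly-follows pairs in a log that contains all k! interleavings."""
    return {(t[i], t[i + 1])
            for t in permutations(range(k))
            for i in range(k - 1)}


for k in (1, 2, 5, 10, 20):
    print(k, *concurrent_block_stats(k))

# For k = 4 the complete log yields exactly k(k-1) = 12 pairs.
print(len(observed_pairs(4)))  # 12
```

For k = 10 this gives 3628800 possible traces but only 90 directly-follows pairs, which is why far fewer than $k!$ traces can suffice, and also why a single unobserved pair is enough to change the discovered model.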
Cannot assume completeness
Limitations (1/2)
Implicit places (places that are redundant): harmless and can be solved through preprocessing.
Loops of length 1: can be solved in multiple ways (change of algorithm or pre/postprocessing).
Loops of length 2: idem.
Non-local dependencies: foundational problem, not specific to the Alpha algorithm.
Limitations (2/2)
Representational bias (cannot discover transitions with duplicate or invisible labels): other algorithms may have a different bias.
Discovered model does not need to be sound: some algorithms ensure this.
Noise: foundational problem, not specific to the Alpha algorithm.
Incompleteness: also a foundational problem.
How to measure the quality of a discovered model? There may be conflicting requirements (simplicity versus accuracy). Confusion matrix and F1-score have the problem that we do not have negative examples. These topics will be discussed later. For the moment, we only mention the rediscovery problem as a quality criterion.
Rediscovering process models [Figure: original process model N, simulate, event log, discover, discovered process model N'; N = N'?] The rediscovery problem: Is the discovered model N' "equivalent" to the original model N?
Part I: Preliminaries
  Chapter 1 Introduction
  Chapter 2 Process Modeling and Analysis
  Chapter 3 Data Mining
Part II: From Event Logs to Process Models
  Chapter 4 Getting the Data
  Chapter 5 Process Discovery: An Introduction
  Chapter 6 Advanced Process Discovery Techniques
Part III: Beyond Process Discovery
  Chapter 7 Conformance Checking
  Chapter 8 Mining Additional Perspectives
  Chapter 9 Operational Support
Part IV: Putting Process Mining to Work
  Chapter 10 Tool Support
  Chapter 11 Analyzing Lasagna Processes
  Chapter 12 Analyzing Spaghetti Processes
Part V: Reflection
  Chapter 13 Cartography and Navigation
  Chapter 14 Epilogue