A likelihood-ratio test for identifying probabilistic deterministic real-time automata from positive data


Sicco Verwer (1), Mathijs de Weerdt (2), and Cees Witteveen (2)
(1) Eindhoven University of Technology
(2) Delft University of Technology
s.verwer@tue.nl, {m.m.deweerdt,c.witteveen}@tudelft.nl

Abstract. We adapt an algorithm (RTI) for identifying (learning) a deterministic real-time automaton (DRTA) to the setting of positive timed strings (or time-stamped event sequences). A DRTA can be seen as a deterministic finite state automaton (DFA) with time constraints. Because DRTAs model time using numbers, they can be exponentially more compact than equivalent DFA models that model time using states. We use a new likelihood-ratio statistical test for checking consistency in the RTI algorithm. The result is the RTI+ algorithm, which stands for real-time identification from positive data. RTI+ is an efficient algorithm for identifying DRTAs from positive data. We show using artificial data that RTI+ is capable of identifying sufficiently large DRTAs in order to identify real-world real-time systems.

1 Introduction

In previous work [11], we described the RTI algorithm for identifying (learning) deterministic real-time automata (DRTAs) from labeled data, i.e., from an input sample $S = (S_+, S_-)$. The RTI algorithm is based on the currently best-performing algorithm for the identification of deterministic finite state automata (DFAs), called evidence-driven state-merging (EDSM) [9]. The only difference between DFAs and DRTAs is that DRTAs contain time constraints. In addition to using the standard state-merging techniques, RTI identifies these time constraints by splitting transitions into two, see [11] for details. The RTI algorithm is efficient in both run-time and convergence because it is a special case of an efficient algorithm for identifying one-clock timed automata, see [12].

In practice, however, it can sometimes be difficult to apply RTI. The reason is that data can often only be obtained from actual observations of the process to be modeled.
From such observations we only obtain timed strings that have actually been generated by the system. In other words, we only have access to the positive data $S_+$. In this paper, we adapt the RTI algorithm to this setting. A straightforward way to do this is to make the model probabilistic, and to check for consistency using statistics. This has been done many times, and in different ways, for the problem of identifying (probabilistic) DFAs, see, e.g., [2, 8, 3]. As far as we know, this is the first time such an approach is applied to the problem of identifying DRTAs.

We start this paper by defining DRTAs and probabilistic DRTAs (PDRTAs, Section 2). In addition to a DRTA structure, a PDRTA contains parameters that model the probabilities of events in the DRTA structure. In order to identify a PDRTA, we thus need to solve two different identification problems: the first problem is to identify the correct DRTA structure, and the second is to set the probabilistic parameters of this model correctly. However, because a PDRTA is a deterministic model, we can simply set these parameters to the normalized frequency counts of events in the input sample $S_+$ (footnote 3). This is very easy to compute and it is the unique maximum-likelihood setting of the parameters given the data. We therefore focus on identifying the DRTA structure of a PDRTA.

We introduce a new likelihood-ratio test that can be used to solve this identification problem (Section 3). Intuitively, this test tests the null-hypothesis that the suffixes of strings that can occur after two different states have been reached follow the same PDRTA distribution, i.e., whether these two states can be modeled using a single state in a PDRTA. If this null-hypothesis is rejected with sufficient confidence, then this is considered to be evidence that these two states should not be merged. Equivalently, if these two states result from a split of a transition, then this is evidence that this transition should be split. In this way, the statistical evidence resulting from these tests replaces the evidence value in the original RTI algorithm. The result is the RTI+ algorithm (Section 3.3), which stands for real-time identification from positive data.

The RTI+ algorithm is an efficient algorithm for identifying DRTAs from positive data. The likelihood-ratio test used by RTI+ is designed specifically for the purpose of identifying a PDRTA from positive data.
Although many algorithms like RTI+ exist for the problem of identifying (probabilistic) DFAs, none of these algorithms uses the non-timed version of the likelihood-ratio test of RTI+. Hence, since this test can easily be modified in order to identify (probabilistic) DFAs, it also contributes to the current state-of-the-art in DFA identification.

In order to evaluate the performance of the RTI+ algorithm we show a typical result of RTI+ when run on data generated from a random PDRTA (Section 4). This result shows that our algorithm is capable of identifying sufficiently complex real-time systems in order to be useful in practice. We end this paper with some conclusions and pointers for future work (Section 5).

2 Probabilistic Deterministic Real-Time Automata

The following exposition uses basic notation from language, automata, and complexity theory. For an introduction the reader is referred to [10]. In the following, we first describe non-probabilistic real-time automata, then we show how to add probability distributions to these models.

Footnote 3: In the case of a non-deterministic model, setting the model parameters is a lot harder. In fact, it can be as difficult as identifying the model itself.

Fig. 1: An example of a DRTA. The leftmost state is the start state, indicated by the sourceless arrow. The topmost state is an end state, indicated by the double circle. Every state transition contains both a label a or b and a delay guard $[n, n']$. Missing transitions lead to a rejecting garbage state.

2.1 Real-Time Automata

In a real-time system, each occurrence of a symbol (event) is associated with a time value, i.e., its time of occurrence. We model these time values using the natural numbers $\mathbb{N}$. This is sufficient because in practice we always deal with a finite precision of time, e.g., milliseconds. Timed automata [1] can be used to accept or generate a sequence $\tau = (a_1, t_1)(a_2, t_2)(a_3, t_3) \ldots (a_n, t_n)$ of symbols $a_i \in \Sigma$ paired with time values $t_i \in \mathbb{N}$, called a timed string. Every time value $t_i$ in a timed string represents the time (delay) until the occurrence of symbol $a_i$ since the occurrence of the previous symbol $a_{i-1}$.

In timed automata, timing conditions are added using a finite number of clocks and a clock guard for each transition. In this paper, we use a class of timed automata known as real-time automata (RTAs) [5]. An RTA has only one clock that represents the time delay between two consecutive events. The clock guards for the transitions are then constraints on this time delay. When trying to identify an RTA from data, one can always determine an upper bound on the possible time delays by taking the maximum observed delay in this data. Therefore, we represent a delay guard $[n, n']$ by a closed interval in $\mathbb{N}$.

Definition 1. (RTA) A real-time automaton (RTA) is a 5-tuple $A = \langle Q, \Sigma, \Delta, q_0, F \rangle$, where $Q$ is a finite set of states, $\Sigma$ is a finite set of symbols, $\Delta$ is a finite set of transitions, $q_0$ is the start state, and $F \subseteq Q$ is a set of accepting states. A transition $\delta \in \Delta$ in an RTA is a tuple $\langle q, q', a, [n, n'] \rangle$, where $q, q' \in Q$ are the source and target states, $a \in \Sigma$ is a symbol, and $[n, n']$ is a delay guard.

Due to the complexity of identifying non-deterministic automata (see [4]), we only consider deterministic RTAs (DRTAs).
An RTA $A$ is called deterministic if $A$ does not contain two transitions with the same symbol, the same source state, and overlapping delay guards. Like timed automata, in DRTAs, it is possible to make time transitions in addition to the normal state transitions used in DFAs. In other words, during its execution a DRTA can remain in the same state for a while before it generates the next symbol. The time it spends in every state is represented by the time values of a timed string. In a DRTA, a state transition is possible (can fire) only if its delay guard contains the time spent in the previous state. A transition $\langle q, q', a, [n, n'] \rangle$ of a DRTA is thus interpreted as follows: whenever the automaton is in state $q$, reading a timed symbol $(a, t)$ such that $t \in [n, n']$, then the DRTA will move to the next state $q'$.

Example 1. Figure 1 shows an example DRTA. This DRTA accepts and rejects timed strings not only based on their event symbols, but also based on their time values. For instance, it accepts (, 4)(, 3) (state sequence: left → bottom → top) and (, 6)(, 5)(, 6) (left → top → left → top), and rejects (, 6)(, 2) (left → top → reject) and (, 5)(, 5)(, 6) (left → bottom → top → left).

2.2 Adding Probability Distributions

In order to identify a DRTA from positive data $S_+$, we need to model a probability distribution for timed strings using a DRTA structure. Identifying a DRTA then consists of fitting this distribution and the model structure to the data available in $S_+$. We want to adapt RTI [11] to identify such probabilistic DRTAs (PDRTAs). Since they have the same structure as DRTAs, we only need to decide how to represent the probability of observing a certain timed event $(a, t)$ given the current state $q$ of the PDRTA, i.e., $\Pr(O = (a, t) \mid q)$. In order to determine the probability distribution of this random variable $O$, we require two distributions for every state $q$ of the PDRTA: one for the possible symbols $\Pr(S = a \mid q)$, and one for the possible time values $\Pr(T = t \mid q)$. The probability of the next state $\Pr(X = q' \mid q)$ is determined by these two distributions because the PDRTA model is deterministic.

The distribution over events $\Pr(S = a \mid q)$ that we use is the standard generalization of the Bernoulli distribution, i.e., every symbol $a$ has some probability $\Pr(S = a \mid q)$ given the current state $q$, and it holds that $\sum_{a \in \Sigma} \Pr(S = a \mid q) = 1$ (also known as the multinomial distribution). This is the most straightforward choice for a distribution function and it is used in many probabilistic models, such as Markov chains, hidden Markov models, and probabilistic automata.

A flexible way to model a distribution over time $\Pr(T = t \mid q)$ is by using histograms. A histogram divides the domain of the distribution (in our case time) into a fixed number of bins $H$. Every bin $[v, v'] \in H$ is an interval in $\mathbb{N}$.
The distributions inside the bins are modeled uniformly, i.e., for all $[v, v'] \in H$ and all $t, t' \in [v, v']$, $\Pr(T = t \mid q) = \Pr(T = t' \mid q)$. Naturally, it has to hold that all these probabilities sum to one: $\sum_{t \in \mathbb{N}} \Pr(T = t \mid q) = 1$.

Using histograms to model the time distribution might look simple, but it is very effective. In fact, it is a common way to model time in hidden semi-Markov models, see, e.g., [6]. The price of using a histogram to model time is that we need to specify the number and the sizes (division points) $[v, v']$ of the histogram bins. Choosing these values boils down to making a tradeoff between the model complexity and the amount of data required to identify the model. More bins lead to a more complex model that is capable of modeling the time distribution more accurately, but it requires more data in order to do so. To simplify matters, we assume that these bins are specified beforehand, for example by a domain expert, or by performing data analysis.
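The histogram model above can be illustrated with a minimal sketch. It assumes integer time points, so a bin $[v, v']$ contains $v' - v + 1$ of them and each receives an equal share of the bin's probability. The function name `time_probability` and the bin probabilities are made-up values for illustration, not taken from the paper.

```python
def time_probability(t, bins, bin_probs):
    """Pr(T = t | q) under a histogram that is uniform inside each bin."""
    for (low, high), prob in zip(bins, bin_probs):
        if low <= t <= high:
            # The integer time points of a bin share its mass equally.
            return prob / (high - low + 1)
    return 0.0  # t lies outside every bin


# Hypothetical bins and bin probabilities for a single state q.
bins = [(0, 2), (3, 4), (5, 6), (7, 10)]
bin_probs = [0.3, 0.2, 0.1, 0.4]

# Summing over all time points in the domain should give 1.
total = sum(time_probability(t, bins, bin_probs) for t in range(0, 11))
print(time_probability(1, bins, bin_probs), round(total, 10))
```

Coarser bins need fewer probabilities per state but flatten the distribution inside each bin, which is the tradeoff described above.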

Fig. 2: A probabilistic DRTA. Every state is associated with a probability distribution over events and over time. The distribution over time is modeled using histograms. The bin sizes of the histograms are predetermined but left out for clarity.

In addition to choosing how to model the time and symbol distributions, we need to decide whether to make these two distributions dependent or independent. It is common practice to make these distributions independent, see, e.g., [6]. In this case, the time distribution represents a distribution over the waiting (or sojourn) time of every state. In some cases, however, it makes sense to let the time spent in a state depend on the generated symbol. By modeling this dependence, the model can deal with cases where some events are generated more quickly than others. Unfortunately, this dependence comes with a cost: the size of the model is increased by a polynomial factor (the product of the sizes of the distributions). Due to this blowup, we require a lot more data in order to identify a similar sized PDRTA. This is our main reason for modeling these two distributions independently. This results in the following PDRTA model:

Definition 2. (PDRTA) A probabilistic DRTA (PDRTA) $\mathcal{A}$ is a quadruple $\langle A', H, S, T \rangle$, where $A' = \langle Q, \Sigma, \Delta, q_0 \rangle$ is a DRTA without final states, $H$ is a finite set of bins (time intervals) $[v, v']$, $v, v' \in \mathbb{N}$, known as the histogram, $S$ is a finite set of symbol probability distributions $S_q = \{\Pr(S = a \mid q) \mid a \in \Sigma, q \in Q\}$, and $T$ is a finite set of time-bin probability distributions $T_q = \{\Pr(T \in h \mid q) \mid h \in H, q \in Q\}$.

The DRTA without final states specifies the structure of the PDRTA. The symbol- and time-probabilities $S$ and $T$ specify the probabilistic properties of a PDRTA. The probabilities in these sets are called the parameters of $\mathcal{A}$. However, in every set $S_q$ and $T_q$, the value of one of these parameters follows from the others because their values have to sum to 1. Hence, there are $(|S_q| - 1) + (|T_q| - 1)$ parameters per state $q$ of our PDRTA model.
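To make the cost of dependence concrete, the following sketch compares the parameter count per state for the independent model with the count for a joint distribution over (symbol, bin) pairs. The joint count $|\Sigma| \cdot |H| - 1$ is the standard count for one distribution over all pairs and is stated here as an assumption about what "dependent" means. The function name and the example sizes are hypothetical.

```python
def free_parameters_per_state(n_symbols, n_bins, independent=True):
    """Number of free probabilities per state (one per distribution is implied)."""
    if independent:
        # One symbol distribution plus one time-bin distribution.
        return (n_symbols - 1) + (n_bins - 1)
    # Assumed dependent variant: one joint distribution over (symbol, bin) pairs.
    return n_symbols * n_bins - 1


# Hypothetical sizes: 4 symbols and 10 histogram bins.
print(free_parameters_per_state(4, 10))                      # independent model
print(free_parameters_per_state(4, 10, independent=False))   # joint model
```

With these sizes the independent model needs 12 parameters per state and the joint model 39, which suggests why far more data would be needed to estimate the dependent variant.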
The probability that the next time value equals $t$ given that the current state is $q$ is defined as

$$\Pr(T = t \mid q) = \frac{\Pr(T \in h \mid q)}{v' - v + 1}$$

where $h = [v, v'] \in H$ is such that $t \in [v, v']$. Thus, in every time-bin the probabilities of the individual time points are modeled uniformly. The probability of an observation $(a, t)$ given that the current state is $q$ is defined as

$$\Pr(O = (a, t) \mid q) = \Pr(S = a \mid q) \cdot \Pr(T = t \mid q)$$

Thus, the distributions over events and time are modeled to be independent (footnote 4). The probability of the next state $q'$ given the current state $q$ is defined as

$$\Pr(X = q' \mid q) = \sum_{\langle q, q', a, [v, v'] \rangle \in \Delta} \; \sum_{t \in [v, v']} \Pr(O = (a, t) \mid q)$$

Thus, the model is deterministic. A PDRTA models a distribution over timed strings $\Pr(O = \tau)$, defined using the computation of a PDRTA:

Definition 3. (PDRTA computation) A finite computation of a PDRTA $\mathcal{A} = \langle Q, \Sigma, \Delta, q_0, H, S, T \rangle$ over a timed string $\tau = (a_1, t_1) \ldots (a_n, t_n)$ is a finite sequence

$$q_0 \xrightarrow{(a_1, t_1)} q_1 \cdots q_{n-1} \xrightarrow{(a_n, t_n)} q_n$$

such that for all $1 \le i \le n$, $\langle q_{i-1}, q_i, a_i, [n_i, n'_i] \rangle \in \Delta$, and $t_i \in [n_i, n'_i]$. The probability of $\tau$ given $\mathcal{A}$ is defined as

$$\Pr(O = \tau \mid \mathcal{A}) = \prod_{1 \le i \le n} \Pr(O = (a_i, t_i) \mid q_{i-1}, H, S, T).$$

Example 2. Figure 2 shows a PDRTA $\mathcal{A}$. Let $H = \{[0, 2]; [3, 4]; [5, 6]; [7, 10]\}$ be the histogram. In every bin the distribution over time values is uniform. We can use $\mathcal{A}$ as a predictor of timed events. For example, the probability of (, 3)(, 1)(, 9)(, 5) is the product of the four observation probabilities along its computation in $\mathcal{A}$.

A PDRTA essentially models a certain type of distribution over timed strings. An input sample $S_+$ can be seen as a sample drawn from such a distribution. The problem of identifying a PDRTA then consists of finding the distribution that generated this sample. We now describe how we adapt RTI in order to identify a PDRTA from such a sample.

3 Identifying PDRTAs from positive data

In this section, we adapt the RTI algorithm for the identification of DRTAs from labeled data (see [11]) to the setting of positive data. The result is the RTI+ algorithm, which stands for real-time identification from positive data. Given a set of observed timed strings $S_+$, the goal of RTI+ is to find a PDRTA that describes the real-time process that generated $S_+$. Note that, because RTI+ uses statistics (occurrence counts) to find this PDRTA, $S_+$ is a multi-set, i.e., $S_+$ can contain the same timed string multiple times.

Like RTI (see [11] for details), RTI+ starts with an augmented prefix tree acceptor (APTA). However, since we only have positive data available, the APTA will not contain rejecting states.
Moreover, since the points in time where the observations are stopped are arbitrary, it also does not contain accepting states. Thus, the initial PDRTA simply is the prefix tree of $S_+$, see Figure 3.

Footnote 4: Modeling dependencies between events and time values is possible but this comes with a cost: the number of parameters of the model is increased by a polynomial factor. This blowup also increases the amount of data required for identification.
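The initial prefix tree described above can be sketched as follows. This is an illustrative data structure under stated assumptions, not the authors' implementation: the names `Node` and `build_prefix_tree` are hypothetical, timed strings are given as lists of (symbol, delay) pairs, and delay guards are left out (only the observed delays are stored per symbol).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    count: int = 0                                 # strings passing through this state
    children: dict = field(default_factory=dict)   # symbol -> child Node
    delays: dict = field(default_factory=dict)     # symbol -> observed delays


def build_prefix_tree(sample):
    """Build a prefix tree with occurrence counts from a multi-set of timed strings."""
    root = Node()
    for timed_string in sample:
        node = root
        node.count += 1
        for symbol, delay in timed_string:
            node.delays.setdefault(symbol, []).append(delay)
            node = node.children.setdefault(symbol, Node())
            node.count += 1
    return root


# A small made-up sample; the same timed string may occur more than once.
sample = [[("a", 3), ("b", 1)], [("a", 5)], [("a", 3), ("b", 1)]]
tree = build_prefix_tree(sample)
print(tree.count, tree.children["a"].count, tree.children["a"].children["b"].count)
```

The tree has no accepting or rejecting states; the occurrence counts are the only statistics that later tests would need.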

Fig. 3: A prefix tree. It is identical to an augmented prefix tree acceptor, but without accepting and rejecting states. The bounds of the delay guards are initialized to the minimum and maximum observed time value.

Starting from a prefix tree, our original algorithm tries to merge states and split transitions using a red-blue framework. A merge is the standard state-merging operation used in DFA identification algorithms such as EDSM [9]. A split can be seen as the opposite of a merge. A split of a transition $\delta$ requires a time value $t$ and uses this to divide $\delta$, its delay guard $[n, n']$, and the part of the PDRTA reached afterwards into two parts. The first part is reached by the timed strings that fire $\delta$ with a delay value less than or equal to $t$, creating a new delay guard $[n, t]$. The second part is reached by timed strings for which this value is greater than $t$, creating a delay guard $[t + 1, n']$. The parts of the PDRTA reached after firing $\delta$ are reconstructed as new prefix trees, using the suffixes of the timed strings that reach these parts as an input sample. See [11] for more information on the split operation.

RTI+ uses exactly the same operations and framework as RTI. The only difference is the evidence value we use. Originally, the evidence was based on the number of positive and negative examples that end in the same state. For RTI+, we require an evidence value that uses only positive examples, and that disregards which states these examples end in. We use a likelihood-ratio test for this purpose. We now describe this test and explain how we use it both as an evidence value and as a consistency check.

3.1 A likelihood-ratio test for state-merging

The likelihood-ratio test (see, e.g., [7]) is a common way to test nested hypotheses. A hypothesis $H$ is called nested within another hypothesis $H'$ if the possible distributions under $H$ form a strict subset of the possible distributions under $H'$. Less formally, this means that $H$ can be created by constraining $H'$. Thus, by definition $H'$ has more unconstrained parameters (or degrees of freedom) than $H$.
Given two hypotheses $H$ and $H'$ such that $H$ is nested in $H'$, and a data set $S_+$, the likelihood-ratio test statistic is computed by

$$LR = \frac{\mathrm{likelihood}(S_+, H)}{\mathrm{likelihood}(S_+, H')}$$

where likelihood is a function that returns the maximized likelihood of a data set under a hypothesis, i.e., $\mathrm{likelihood}(S_+, H)$ is the maximum probability (with optimized parameter settings) of observing $S_+$ under the assumption that $H$ was used to generate the data.

Fig. 4: The likelihood-ratio test. We test whether using the left model (two prefix trees) instead of the right model (a single prefix tree) results in a significant increase in the likelihood of the data with respect to the number of additional parameters (used to model the state distributions).

Let $H$ and $H'$ have $n$ and $n'$ parameters respectively. Since $H$ is nested in $H'$, the maximized likelihood of $S_+$ under $H'$ is always at least as large as the maximized likelihood under $H$. Hence, the likelihood-ratio $LR$ is a value between 0 and 1. When the difference between $n$ and $n'$ grows, the likelihood under $H'$ can be optimized more and hence $LR$ will be closer to 0. Thus, we can increase the likelihood of the data $S_+$ by using a different model (hypothesis) $H'$, but at the cost of using $n' - n$ more parameters. The likelihood-ratio test can be used to test whether this increase in likelihood is statistically significant. The test compares the value $-2 \ln(LR)$ to a $\chi^2$ distribution with $n' - n$ degrees of freedom. The result of this comparison is a p-value. A high p-value indicates that $H$ is a better model, since the probability that $n' - n$ extra parameters result in the observed increase in likelihood is high. A low p-value indicates that $H'$ is a better model.

Applying the likelihood-ratio test to state-merging and transition-splitting is remarkably straightforward. Suppose that we want to test whether we should perform a merge of two states. Thus, we have to make a choice between two PDRTAs (models): the PDRTA $A$ resulting from the merge of these states, and the PDRTA $A'$ before merging these states. Clearly, $A$ is nested in $A'$. Thus all we need to do is compute the maximized likelihood of $S_+$ under $A$ and $A'$, and apply the likelihood-ratio test. Since PDRTAs are deterministic, the maximized likelihood can be computed simply by setting all the probabilities in the PDRTAs to their normalized counts of occurrence in $S_+$. We now show how to use this test in order to determine whether to perform a merge using an example.

Example 3.
For simplicity, we disregard the time values of timed strings and the timed properties of PDRTAs. Suppose we want to test whether to merge the two root states of the prefix trees of Figure 4. These two prefix trees are parts of the PDRTA we are currently trying to identify. Hence only some strings from $S_+$ reach the top tree, and some reach the bottom tree.

Let $S$ = {10, 10, 20, 10 } and $S'$ = {20, 20 } be the suffixes of these strings starting from the point where they reach the root state of the top and bottom tree respectively, where $n\,\tau$ means that the (timed) string $\tau$ occurs $n$ times. We first set all the parameters of the top tree in such a way that the likelihood of $S$ is maximized: $p_{a,q_0} = \frac{4}{5}$, $p_{b,q_0} = \frac{1}{5}$, $p_{a,q_1} = \frac{1}{3}$, $p_{b,q_1} = \frac{2}{3}$ (this is easy because the model is deterministic). We do the same for the bottom tree and $S'$: $p_{a,q_0} = \frac{1}{2}$, $p_{b,q_0} = \frac{1}{2}$, $p_{a,q_1} = 1$, and probability 1 for the single symbol leaving $q_2$. We can now compute the probability of $S$ under the top tree:

$$p_1 = \left(\tfrac{4}{5}\right)^{40} \left(\tfrac{1}{5}\right)^{10} \left(\tfrac{1}{3}\right)^{10} \left(\tfrac{2}{3}\right)^{20},$$

and the probability of $S'$ under the bottom tree:

$$p_2 = \left(\tfrac{1}{2}\right)^{20} \left(\tfrac{1}{2}\right)^{20}.$$

Next, we set the parameters of the right tree to maximize the likelihood of $S \cup S'$: $p_{a,q_0} = \frac{2}{3}$, $p_{b,q_0} = \frac{1}{3}$, $p_{a,q_1} = \frac{3}{5}$, $p_{b,q_1} = \frac{2}{5}$, and probability 1 for the single symbol leaving $q_2$, and compute the likelihood of the data under the right (merged) tree:

$$p_3 = \left(\tfrac{2}{3}\right)^{60} \left(\tfrac{1}{3}\right)^{30} \left(\tfrac{3}{5}\right)^{30} \left(\tfrac{2}{5}\right)^{20}.$$

We multiply the top and bottom tree probabilities in order to get the likelihood of the data under the left (un-merged) tree, and use this to compute the likelihood-ratio:

$$LR = \frac{p_3}{p_1 \cdot p_2}.$$

The value that we need to compare to a $\chi^2$ distribution then becomes $\chi^2 = -2 \ln(LR)$. Per state, $|\Sigma| - 1$ parameters are used. In the un-merged model, the number of (untimed) parameters is 5, in the merged model it is 3. A likelihood-ratio test using these values results in a p-value that is a lot less than 0.05, and hence the merge results in a significantly worse model.

Testing whether to perform a split of a transition can be done in a similar way. When we want to decide whether to perform a split, we also have to make a choice between two PDRTAs: the PDRTA before splitting $A$, and the PDRTA after splitting $A'$. $A$ is again nested in $A'$, and hence we can perform the likelihood-ratio test in the same way.

3.2 Dealing with small frequencies

The likelihood-ratio test does not perform well when the tested models contain many unused parameters. The test tests whether an increase in the number of parameters leads to a significantly higher likelihood.
Thus, if there are many unused parameters, this increase will usually not be significant. Hence, there will be a tendency to accept null-hypotheses, i.e., to merge states. This causes problems especially in the leaves of the prefix tree.

We deal with the issue of small frequencies by pooling the bins of the histogram and symbol distributions if the frequency of these bins in both states is less than 10. Pooling is the process of combining the frequencies of two bins into a single bin. In other words, we treat two bins as though they were a single one. For example, suppose we have three bins, and their frequencies are 7, 14, and 5, respectively. Then we treat them as two bins with frequencies 12 and 14. In the likelihood-ratio test, this effectively reduces the number of parameters of the tested models. Theoretically, it can be objected that this changes the model using the data. However, if we do not pool data, we will obtain too many parameters for the states in which some bin occurrences are very unlikely. For instance, suppose we have a state in which 1000 symbols could occur, but only 10 of them actually occur. Then according to theory, we should count this state as having 999 parameters. With pooling, we count only the parameters that remain after the rare symbols are combined.

3.3 The algorithm

We have just described the test we use to determine whether two states are similar. The null-hypothesis of this test is that two states are the same. When we obtain a p-value less than 0.05, we can reject this hypothesis with 95% certainty. When we obtain a p-value greater than 0.05, we cannot reject the possibility that the two states are the same. Instead of testing whether two states are the same, however, we want to test whether to perform a merge or a split, and if so, which one. When we test a merge, a high p-value indicates that the merge is good. When we test a split, a low p-value indicates that the split is good. We implemented this statistical evidence in RTI+ in a very straightforward way:

- If there is a split that results in a p-value less than 0.05, perform the split with the lowest p-value.
- If there is a merge that results in a p-value greater than 0.05, perform the merge with the highest p-value.
- Otherwise, perform a color operation.

Thus, we merge two states unless we are very certain that the two states are different. In addition, we always perform the merge or split that leads to the most certain conclusions. In every iteration, RTI+ selects the most visited transition from a red state to a blue state and determines whether to merge the blue state, split the transition, or color the blue state red. The main reason for trying out only the most visited transition is that it reduces the run-time of the algorithm. Trying every possible merge and split would take much longer. Additionally, the tests performed using the most visited transition will be based on the largest amount of data. Hence, we are more confident that these conclusions are correct. An overview of the RTI+ algorithm is shown in Algorithm 1. We claim that RTI+ is efficient, i.e., that it runs in polynomial time:

Proposition 1.
RTI+ is a polynomial-time algorithm.

Proof. This follows from the fact that ID_1DTA is efficient [12] and the fact that every statistic can be computed (up to sufficient accuracy) in polynomial time for every state. Since, at any time during a run of the algorithm, the number of states does not exceed the size of the input, the proposition follows.

In addition to being time-efficient, we believe that RTI+ is also data-efficient. More specifically, we conjecture that RTI+ returns a PDRTA that is equal to the correct PDRTA $A_t$ in the limit. With equal we mean that these PDRTAs model the exact same probability distributions over timed strings.

Algorithm 1 Real-time identification from positive data: RTI+
Require: A multi-set of timed strings $S_+$ generated by a PDRTA $A_t$
Ensure: The result is a small DRTA $A$, in the limit $A = A_t$
  Construct a timed prefix tree $A$ from $S_+$, color the start state $q_0$ of $A$ red
  while $A$ contains non-red states do
    Color blue all non-red target states of transitions with red source states
    Let $\delta = \langle q_r, q_b, a, g \rangle$ be the most visited transition from a red to a blue state
    Evaluate all possible merges of $q_b$ with red states
    Evaluate all possible splits of $\delta$
    If the lowest p-value of a split is less than 0.05 then perform this split
    Else if the highest merge p-value is greater than 0.05 then perform this merge
    Else color $q_b$ red
  end while

Conjecture 1. The result $A$ of RTI+ converges efficiently in the limit to the correct PDRTA $A_t$ with probability 1.

Completeness of the algorithm follows from the fact that the algorithm is a special case of the ID_1DTA algorithm from [12]. The conjecture therefore holds if all correct merges and splits are performed given an input sample of size polynomial in the size of $A_t$. The main reason for our conjecture follows from the fact that with increasing amounts of data, the p-value resulting from the likelihood-ratio test converges to 0 if the two states are different. Thus in the limit, RTI+ will perform all the necessary splits, and perhaps some more, and it will never perform an incorrect merge. However, when the two states tested in the likelihood-ratio test are the same, there is always a probability of 0.05 that the p-value is less than 0.05. Thus, at times it will not perform a merge when it should. Fortunately, not performing a merge or performing an extra split does not influence the language of the DRTA, or the distribution of the PDRTA. It only adds additional (unnecessary) states to the resulting PDRTA $A$. Thus, in the limit, the algorithm should return a PDRTA $A$ that is language equivalent to the target PDRTA $A_t$. Unfortunately, since we use multiple statistical tests that can become dependent, proving this conjecture is complex and left as future work.
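The untimed likelihood-ratio test of Section 3.1 and the decision rule of Section 3.3 can be sketched together as follows. This is a sketch under several assumptions rather than the RTI+ implementation: trees are represented as dictionaries of per-state symbol counts, states of the two trees are matched by name, the transition counts for Example 3 are those implied by its stated parameters (with "x" as a placeholder for the unknown symbol leaving q2), pooling of small frequencies is omitted, and the chi-square tail is computed with the closed-form series for integer degrees of freedom instead of a statistics library. All function names are hypothetical.

```python
import math
from collections import Counter


def chi2_sf(x, dof):
    """Upper-tail probability of a chi-square variable with integer dof >= 1."""
    if dof < 1:
        raise ValueError("dof must be a positive integer")
    if x <= 0:
        return 1.0
    if dof % 2 == 0:
        # Even dof: exp(-x/2) * sum_{i < dof/2} (x/2)^i / i!
        term = total = 1.0
        for i in range(1, dof // 2):
            term *= (x / 2) / i
            total += term
        return math.exp(-x / 2) * total
    # Odd dof: erfc(sqrt(x/2)) plus a finite correction series.
    chi = math.sqrt(x)
    term, total = chi, 0.0
    for r in range(1, (dof - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    tail = math.erfc(chi / math.sqrt(2))
    return tail + math.sqrt(2 / math.pi) * math.exp(-x / 2) * total


def max_log_likelihood(tree):
    """Maximized log-likelihood: every probability is a normalized count."""
    total = 0.0
    for counts in tree.values():
        n = sum(counts.values())
        total += sum(c * math.log(c / n) for c in counts.values() if c > 0)
    return total


def merge_test(tree_a, tree_b, alphabet_size):
    """Return (-2 ln LR, degrees of freedom, p-value) for merging two trees."""
    merged = {}
    for tree in (tree_a, tree_b):
        for state, counts in tree.items():
            merged.setdefault(state, Counter()).update(counts)
    stat = 2 * (max_log_likelihood(tree_a) + max_log_likelihood(tree_b)
                - max_log_likelihood(merged))
    # |Sigma| - 1 parameters per state that has observed outgoing symbols.
    dof = (len(tree_a) + len(tree_b) - len(merged)) * (alphabet_size - 1)
    return stat, dof, chi2_sf(stat, dof)


def choose_operation(split_p_values, merge_p_values, alpha=0.05):
    """Split if some split is significant, else merge if possible, else color."""
    if split_p_values and min(split_p_values) < alpha:
        return "split", split_p_values.index(min(split_p_values))
    if merge_p_values and max(merge_p_values) > alpha:
        return "merge", merge_p_values.index(max(merge_p_values))
    return "color", None


# Counts implied by the parameters of Example 3 ("x" is a placeholder symbol).
top = {"q0": {"a": 40, "b": 10}, "q1": {"a": 10, "b": 20}}
bottom = {"q0": {"a": 20, "b": 20}, "q1": {"a": 20}, "q2": {"x": 20}}
stat, dof, p_value = merge_test(top, bottom, alphabet_size=2)
print(round(stat, 2), dof, p_value < 0.05)
```

Under these assumed counts the statistic is roughly 38.2 with 2 degrees of freedom, so the p-value is far below 0.05 and the merge would be rejected, in line with the conclusion of Example 3.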
4 Tests on artificial data

In order to evaluate the RTI+ algorithm, we test it on artificially generated data. First we generate a random PDRTA (without final states), and then we generate data using the distributions of this PDRTA. Unfortunately, it is difficult to measure the quality of models that are identified from such data. Commonly used measures include the predictive quality or a model selection criterion. However, such measures are meaningless on their own; they are only useful to compare the performance of different methods against each other. Since we know of no other method for identifying a PDRTA, we cannot make use of these measures.

Therefore, in order to provide some insight into the capabilities of RTI+, we only show a typical result of RTI+ when run on this data. We generate a random PDRTA with 8 states and a size 4 alphabet. Of the transitions of the PDRTA, 4 are split and assigned different target states at random. The number of possible time values for the timed strings is fixed at 100. The number of histogram bins used in the PDRTA is set to 10. Thus, there are individual probabilities for [0, 9], [10, 19], etc. The probabilities of these bins and the symbol bins are generated by first assigning to each bin a value between 0 and 1, drawn from a uniform distribution. These values are then normalized such that both the histogram values and the symbol values sum to 1. We generated 2000 timed strings from this PDRTA, which all have an exponentially distributed length with an average of 10.

Figure 5 shows the resulting original and identified PDRTA (no probability distributions are drawn). From this figure, it is clear that the most common mistake is the incorrect identification (or absence) of a clock guard. These are usually only minor errors, involving only infrequently visited transitions. The resulting PDRTA is thus very similar to the original used to generate the data.

We performed such a test multiple times and using differently sized random PDRTAs. The results of these tests are encouraging for up to 8 states, a size 4 alphabet, and 4 splits. When either of these values is increased, the algorithm needs more than 2000 examples to come up with a similar PDRTA. These results are encouraging because PDRTAs of this size are complex enough to model interesting real-time systems.

5 Future work

In previous work, we described the RTI algorithm for identifying deterministic real-time automata (DRTAs) from labeled data. In this paper, we showed how to adapt it to the setting of positive data. The result is the RTI+ algorithm. RTI+ runs in polynomial time, and we conjecture that it converges efficiently to the correct probabilistic DRTA (PDRTA). In future work, we would like to prove this conjecture.
This should be possible, because none of the statistics we use requires a large amount of data. Moreover, the fact that there exist polynomial characteristic sets for DRTAs (see [12]) should somehow extend to identifying PDRTAs.

RTI+ uses a likelihood-ratio test in order to determine which states to merge and which transitions to split. Although this test is designed for the purpose of identifying a PDRTA from positive data, it can easily be modified in order to identify probabilistic DFAs. It would be interesting to test such an approach.

The achieved performance of RTI+ is shown to be sufficient in order to identify complex real-time systems. We believe this performance to be sufficient to be useful for identifying real-world real-time systems. We invite everyone with timed data to try RTI+ to identify behavioral models and network protocols. The source code of RTI+ is available on-line from the first author's homepage.

[Figure 5. Top panel: originally generated random DRTA. Bottom panel: identified using RTI+ with the likelihood-ratio test. Legend: correct; (partially) incorrect.]

Fig. 5: A randomly generated DRTA (top) and the DRTA identified by our algorithm (bottom). The dashed lines are (partially) incorrectly identified transitions. The solid states are correctly identified, including all outgoing transitions.

References

1. R. Alur and D. L. Dill. A theory of timed automata. Theoretical Computer Science, 126.
2. R. Carrasco and J. Oncina. Learning stochastic regular grammars by means of a state merging method. In ICGI, volume 862 of LNCS. Springer.
3. A. Clark and F. Thollard. PAC-learnability of probabilistic deterministic finite state automata. Journal of Machine Learning Research.
4. C. de la Higuera. A bibliographical study of grammatical inference. Pattern Recognition, 38(9).
5. C. Dima. Real-time automata. Journal of Automata, Languages and Combinatorics, 6(1).
6. Y. Guédon. Estimating hidden semi-Markov chains from discrete sequences. Journal of Computational and Graphical Statistics, 12(3).
7. W. L. Hays. Statistics. Wadsworth Pub Co, fifth edition.
8. C. Kermorvant and P. Dupont. Stochastic grammatical inference with multinomial tests. In ICGI, volume 2484 of LNAI. Springer.
9. K. J. Lang, B. A. Pearlmutter, and R. A. Price. Results of the Abbadingo one DFA learning competition and a new evidence-driven state merging algorithm. In ICGI, volume 1433 of LNCS. Springer.
10. M. Sipser. Introduction to the Theory of Computation. PWS Publishing.
11. S. Verwer, M. de Weerdt, and C. Witteveen. An algorithm for learning real-time automata. In Benelearn.
12. S. Verwer, M. de Weerdt, and C. Witteveen. One-clock deterministic timed automata are efficiently identifiable in the limit. In LATA, volume 5457 of LNCS. Springer, 2009.


More information

Continuous Random Variables Class 5, Jeremy Orloff and Jonathan Bloom

Continuous Random Variables Class 5, Jeremy Orloff and Jonathan Bloom Lerning Gols Continuous Rndom Vriles Clss 5, 8.05 Jeremy Orloff nd Jonthn Bloom. Know the definition of continuous rndom vrile. 2. Know the definition of the proility density function (pdf) nd cumultive

More information

Chapter 1, Part 1. Regular Languages. CSC527, Chapter 1, Part 1 c 2012 Mitsunori Ogihara 1

Chapter 1, Part 1. Regular Languages. CSC527, Chapter 1, Part 1 c 2012 Mitsunori Ogihara 1 Chpter 1, Prt 1 Regulr Lnguges CSC527, Chpter 1, Prt 1 c 2012 Mitsunori Ogihr 1 Finite Automt A finite utomton is system for processing ny finite sequence of symols, where the symols re chosen from finite

More information

How do we solve these things, especially when they get complicated? How do we know when a system has a solution, and when is it unique?

How do we solve these things, especially when they get complicated? How do we know when a system has a solution, and when is it unique? XII. LINEAR ALGEBRA: SOLVING SYSTEMS OF EQUATIONS Tody we re going to tlk out solving systems of liner equtions. These re prolems tht give couple of equtions with couple of unknowns, like: 6= x + x 7=

More information