Dirichlet Compound Multinomials Statistical Models

Size: px
Start display at page:

Download "Dirichlet Compound Multinomials Statistical Models"

Transcription

1 Applied Mahemaics, 2012, 3, hp://dx.doi.org/ /am a288 Published Online December 2012 (hp://.scirp.org/journal/am) Dirichle Compound Mulinomials Saisical Models Paola Cerchiello, Paolo Giudici Deparmen of Economics and Managemen, Universiy of Pavia, Pavia, Ialy Received Augus 31, 2012; revised Sepember 31, 2012; acceped Ocober 6, 2012 ABSTRACT This conribuion deals ih a generaive approach for he analysis of exual daa. Insead of creaing heurisic rules for he represenaion of documens and ord couns, e employ a disribuion able o model ords along exs considering differen opics. In his regard, folloing Mina proposal (2003), e implemen a Dirichle Compound Mulinomial (DCM) disribuion, hen e propose an exension called sbdcm ha aes explicily ino accoun he differen laen opics ha compound he documen. e follo o alernaive approaches: on one hand he opics can be unnon, hus o be esimaed on he basis of he daa, on he oher hand opics are deermined in advance on he basis of a predefined onological schema. The o possible approaches are assessed on he basis of real daa. Keyords: Texual Daa Analysis; Mixure Models; Onology Schema; Repuaional Ris 1. Inroducion ih he rapid groh of on-line informaion, ex caegorizaion has become one of he ey echniques for handling and organizing daa in exual forma. Tex caegorizaion echniques are an essenial par of ex mining and are used o classify ne documens and o find ineresing informaion conained ihin several on-line eb sies. Since building ex classifiers by hand is difficul, ime-consuming and ofen no efficien, i is orhy o learn classifiers from experimenal daa. In his proposal e employ a generaive approach for he analysis of exual daa. In he las o decades many ineresing and poerful conribuions have been proposed. In paricular, hen coping ih he ex classificaion as, a researcher has o face he ell-non problem of polysems (muliple senses for a given ords) and synonyms (same meaning for differen ords). One of he firs effecive model able o solve hose issues is represened by Laen semanic analysis (LSA) [1]. The basic idea is o or a a semanical level by reducing he vecor space hrough Singular Value Decomposiion (SVD), producing no sparse occurrence ables ha help in discovering associaions beeen documens. In order o esablish a solid heoreical saisical frameor in his conex, in [2] a probabilisic version of LSA (plsa) has been proposed, also non as he aspec model, rooed in he family of laen class models and based on a mixure of condiionally independen mulinomial disribuions for he couple ords-documens. The inenion from he inroducion of plsa as o offer a formal saisical frameor, helping he parameer inerpreaion issue as ell. By he ay he goal as achieved only parially, in fac he mulinomial mixures, hich componens can be inerpreed as opics, offer a probabilisic jusificaion a ords bu no a documens level. In fac he laer are represened merely as lis of mixing proporions derived from mixure componens. Moreover, he mulinomial disribuion presens as many values as here are in he raining documens and herefore i learns opic mixure on hose rained documens. The exension o previously unseen documens is no appropriae since here can be ne opics. In order o overcome he asymmery beeen ords and documens and o produce a real generaive model, [3] proposed he LDA (Laen Dirichle Allocaion). The idea of such ne approach emerges from he concep of exchangeabiliy for he ords in a documen ha unfolds in he bag of ords assumpion: he order of ords in a ex is no imporan. In fac he LDA model is able o capure eiher he ords or documens exchangeabiliy unlie LSA and plsa. On he oher hand LDA is a generaive model in any sense since i posis a Dirichle disribuion over documens in he corpus, hile each opic is dran from a Mulinomial disribuion over ords. Hoever noe ha [4] in 2003 have shon ha LDA and plsa are equivalen if he laer is under a uniform Dirichle prior disribuion. Obviously LDA does no solve all he issues. The main resricion embedded in LDA approach and due o he Dirichle disribuion, is he assumpion of independence among opics. The immediae consequence as o acle he issue by inroducing he Correlaed Topic Model (CTM), as proposed in [5]. CTM inroduces correlaions among

2 2090 P. CERCHIELLO, P. GIUDICI opics by replacing he Dirichle random variable ih he logisic normal disribuion. Unlie LDA, CTM presens a clear complicaion in erms of inference and parameer esimaion since he logisic normal disribuion and he Mulinomial are no conjugae. To bypass he problem, he mos recen alernaive is represened by he Independen Facor Topic Models (IFTM) inroduced in [6]. Such proposal maes use of laen variable model approach o deec hidden correlaions among opics. The choice o explore he laen model orld allos o choose among several alernaives ranging from he ype of relaion, linear or no linear, o he ype of prior o be specified for he laen source. For sae of compleeness is imporan o menion anoher ineresing research pah focusing on he bursiness phenomenon, ha is he endency of rare ords, mosly, o appear in burs. The above menioned generaive models are no able o capure such peculiariy, ha insead is very ell modelled by he Dirichle Compund Mulinomial model (DCM). Such disribuion as inroduced by saisicians [7] and has been idely employed by oher secors lie bioinformaics [8] and language engineering [9]. An imporan conribuion in he conex of ex classificaion as brough by [10] and [11] ha profiably used DCM as a bag-of-bags-of-ords generaive process. Similarly o LDA, e have a Dirichle random variable ha generaes a Mulinomial random variable for each documen from hich ords are dran. By he ay, DCM canno be considered a opic model in a ay, since each documen derives specifically by one opic. Tha is he main reason hy [12] proposed a naural exension of he classical opic model LDA by plugging ino i he DCM disribuion and obaining he so called DCMLDA. Folloing his line of hining, e move from DCM approach and e propose an exension of he DCM, called semanicbased Dirichle Compound Mulinomial (sbdcm), ha permis o ae laen opics ino accoun. The paper is organized as follos: in Secion 2 e sho he Dirichle Compound Mulinomial (DCM) model, in Secion e propose an exension of he DCM, called semanicbased Dirichle Compound Mulinomial (sbdcm), in Secion 4 e sho ho o esimae he parameers of he differen models. Then, in Secion 5 e assess he predicive performance of he o disribuions by using seven differen classifiers. Finally e sho he differen classificaion performance according o he noledge on he opics T (non or unnon). 2. Bacground: The Dirichle Compound Mulinomial The DCM disribuion is a hierarchical model: on one hand, he Dirichle random variable is devoed o model he Mulinomial ord parameers ; on he oher hand, he Mulinomial variable models he ord coun vecors x comprising he documen. The disribuion funcion of he DCM mixure model is: d p x p x p. (1) here p x is he Mulinomial disribuion: p x n! \ x 1 x. in hich x is he ords coun vecor, x is he coun for each ord and he probabiliy of emiing a ord i ; herefore a documen is modelled as a single se of ords ( bag-of-ords ). The Dirichle disribuion p is insead parameerized as: ih p \ he Dirichle parameer vecor for ords, as consequence he hole se of ords ( bag-of-bags ) is modelled. Thus a ex (a documen in a se) is modelled as bag-of-bags-of-ords. Developing he previous inegral e obain: p n! x x x \ In Figure 1 e repor he graphical represenaion of he DCM model. From anoher poin of vie, each Mulinomial is lined o specific sub-opics and maes, for a specific documen, he emission of some ords more liely han ohers. Insead he Dirichle represens a general opic ha compounds he se of documens and hus he DCM could be also described as bag-of-scaleddocumens. The added value of he DCM approach consiss in he abiliy o handle he bursiness of a rare ord ihou inroducing heurisics [13]. In fac, if a rare ord appears once along a ex, i is much more liely o appear again. hen e consider he enire se of documens D, here each documen is independen and idenified by is coun vecor, D x,,, 1 x xn, he lielihood of he Figure 1. Hierarchical represenaion of DCM model.. (2) (3) (4)

3 P. CERCHIELLO, P. GIUDICI 2091 hole documens se (D) is N pxd p D N d 1 1 x d1 1 xd 1 here x d is he sum of he couns of each ord in he documen d-h (x d ) and x d he coun of ord -h for he documen d-h. Thus he log-lielihood is: log d11 N log log x d1 1 1 N p D x log log d (5) (6) The parameers can be esimaed by a fixed-poin ieraion scheme, as described in Secion A Semanic-Based DCM As explained in Secion 2, e have a coefficien for each ord compounding he vocabulary of he se of documens hich is called corpus. The DCM model can be seen as a bag-of-scaled-documens here he Dirichle aes ino accoun a general opic and he Mulinomial some specific sub-opics. Our aim in his conribuion is o build a frameor ha allos us o inser specifically he opics (non or unnon) ha compound he documen, ihou losing he bursiness phenomenon and he classificaion performance. Thus e inroduce a mehod o lin he coefficiens o he hypoheic opics, indicaed ih i, by means of a funcion F hich mus be posiive in since he Dirichle coefficiens are posiive. Noe ha usually dimb dima and, herefore, our proposed approach is parsimonious. Subsiuing he ne funcion ino he inegral in Equaion (1), he ne model is: p x p x p F d, e have considered as funcion F() a linear combinaion based on a marix D and he vecor. D conains informaion abou he ay of spliing among opics he observed coun vecors of he ords conained in a diagonal marix A and is a vecor of coefficien (eighs) for he opics. More specifically e assume ha: (7) 1 d11 d1 T A, D, d1 d T 1, D AD T 1 F D T Noe ha: T d d d ; ih T he number of Topics; (8) (9) d is he coefficien for ord -h used o define he degree of belonging o opic -h and by hich a porion of he coun of ord -h is assigned o ha paricular opic -h. By subsiuing his linear combinaion ino Equaion (4), e obain he same disribuion bu ih he above menioned linear combinaion for each : p x T T d x n! d 1 T T 1 x x d d 1 1 (10) This model is a modified version of he DCM, henceforh semanic-based DCM (sbdcm), and he ne loglielihood for he se of documens becomes: log p D N T T log d log x d d1 1 1 N T T * log xd d log d d11 (11) In Figure 2 e repor he graphical represenaion of he ne model here he s are subsiued by a funcion of he s. An imporan aspec of he proposed approach is represened by he number T of opics o be insered ino he sbdcm ha can be: Unnon, hus o be esimaed on he basis of he available daa.

4 2092 P. CERCHIELLO, P. GIUDICI Figure 2. Hierarchical represenaion of sbdcm model. A priori non (i.e. fixed by field expers). The firs case ill be reaed in Secion 5.1, in paricular since i is no alays possible o no in advance he number of laen opics presen in a corpora, i becomes very useful o build a saisical mehodology for discovering efficienly and consisenly a suiable number of opics. In his case he number of opics T can be considered a random variable. Thus, e use a segmenaion procedure o group he ords in order o creae groups of ords sharing common characerisics ha can be considered as laen opics. The analysis is compleed by choosing he bes number of groups and a disance marix is used o se he membership percenage (d ) of each ord o each laen opic. The second case ill be reaed in Secion 5.2 and proposes o exploi he sbdcm model by employing a priori noledge based on onological schemas ha describe he relaions among conceps ih regards o he general opics of he corpora. The onology srucure provides he se of relaions among he conceps o hich can be associaed a cerain number of ey ords, by a field exper(s). Thus, e an o use he classes of a given onology and he associaed ey ords o define in advance he number T of opics. 4. Parameers Esimaion There are several mehods o maximize he log-lielihood and o find he parameers associaed ih a DCM: simplified Neon ieraion, he Expeced Maximizaion mehod and he maximizaion of he simplified lielihood (called leave-one-ou lielihood, LOO). Among hem, he mos general and flexible algorihm is he Expeced Maximizaion (EM). The EM algorihm is an ieraive procedure able o compue he maximum-lielihood esimaes henever daa are incomplee or hey are considered complee bu no observable (laen variables) [14]. In general, considering, p X Z he join probabiliy disribuion (or probabiliy mass funcion) here X is he incomplee (bu observable) daa se and Z he missing (or laen) daa, e can calculae he complee maximum lielihood esimaion of model parameers by means of he EM algorihm. e obain an overall complexiy reducion for ha concerns he calculaion of observed maximum-lielihood (ML) esimaes. The goal is achieved by alernaing he expecaion (E-sep) of he lielihood ih he laen variables as if hey are observed and he maximizaion (M-sep) of he lielihood funcion found in he E-sep. ih he resul of he M-sep e updae he ne parameers ne ha are used again in he cycle, by saring from he E-sep, unil e reach a fixed degree of approximaion of he observed lielihood hich increases sep by sep moving oards he maximum (ha could be local) [15]. The EM can be buil in differen ays. One possibiliy is o see he EM as a loer bound maximizaion here e alernae he E-sep o calculae an approximaion of he loer bound for he log-lielihood and maximize i in he M-sep unil a saionary poin (zero gradien) is reached. If e are able o find a loer bound for he log-lielihood e can maximize i via a fixed-poin ieraion; in fac i is he same principle of considering he EM as a loer bound maximizaion. In our conex, for he DCM, he loer log p D is he folloing quaniy: bound ih xd 1 N xd d (12) his allos us o use a fixed poin ieraion hose seps are: xd 1 N xd d (13) here x d is he sum of he couns of each ord in he documen d-h x d, x d he coun of ord -h for he documen d-h and he Dirichle coefficien for ord a he -h sep. The algorihm is sopped hen a degree of approximaion is reached. The ieraion sars ih equals o he occurrence percenage of he ord -h in he corpus. The esimaed parameers, as said before, have an imporan characerisic: hey follo he bursiness phenomenon of ords. In fac he smaller is, he more bursiness effec is conained ihin a ord, as revealed in Secion 5. In he case of sbdcm by considering he ne loglielihood, Equaion (11), e can use he same loer bound employed before. The only modificaion lays on he subsiuion of he coefficiens ih he linear funcion of, as in Equaion (9), and hereby he ne fixed poin ieraion sep is:

5 P. CERCHIELLO, P. GIUDICI N T T d xd d d d d N T T d xd d d d d (14) As before e sop he ieraion a a fixed degree of approximaion and he coefficiens d are hose described in Secion 3. The ne s mainain he ords bursiness, as e shall sho in Secion 5 and hey are used o classify he documen by employing a Naive Bayes classifier. For our applicaions, in he nex secion e have used for boh models a value of of The ieraion sars ih equals o he percenage of each single cluser obained by he grouping analysis of vocabulary ords. 5. Model Performance In his secion e describe he evaluaion of he differen classifiers by using he parameers esimaed from he DCM disribuion. Thus, our raining daa se is compound of 6436 documens ih a vocabulary (already filered and semmed) of 15,655 ords so e have o esimae 15,655 parameers. The parameer is able o model he bursiness of a ord. In fac, he smaller he parameers are, he more bursy he emission of ords is. This phenomenon is characerisic of rare ords, herefore coefficiens are, on average, smaller for less couned ords. The average value of he overall parameers is , he sandard deviaion is and he maximum and minimum values are respecively and Once he coefficien vecor of s is obained e employ seven differen classifiers, hree of hich are described in [13] (normal (N), complemen (C) and mixed (M). The remaining ones are proposed as he appropriae combinaion of he previous ones, in order o improve heir characerisics. Those ne classifiers are se in funcion of he number of ords ha a es-documen has in common ih he se of documens ha compound a class; in his ay e creae a classifier in funcion of he number of ords in common. Thus e analyze he folloing addiional classifiers: COMPLE- MENT + MIXED + NORMAL (CMN), COMPLE- MENT + NORMAL (CN), COMPLEMENT + MIXED (CM), MIXED + NORMAL (MN). In order o evaluae he classificaion performance e employ hree performance indexes: Ind1: The proporion of rue posiive over he oal number of es-documens: D d 1 TPd 100; D Ind2: The proporion of classes ha e are able o classify: C c1 Ic 100; C here I c is an indicaor ha e se 1 if a leas one documen of he class is classified correcly, oherise e se 0. Ind3: The proporion of rue posiive ihin each class over he number of es documens presen in he class: 1 C C c1 TPc 100; Mc here M c is he number of es-documens in each class, TP c is he number of rue posiive in he class and C he number of opics (46). For he four combined classifiers such indexes have been calculaed by varying he number of ords in common beeen he es documen and he class. In paricular for our es e have used hree differen hresholds for he number of ords (n): 15, 10 and 5. For example, e indicae ih he iniials CM n he classificaion rule ha employs classifier C hen he number of common ords are less or equal o n and classifier M hen he number of ords in common is more han n. Insead, he iniial CMN n.m idenifies he using of classifier C unil n, he classifier N over m and he classifier M beeen n and m. For he daa a hand he number of ords in common beeen he o ses (raining and evaluaion se) varies beeen 1 and 268. The above menioned combinaion is based on he folloing idea: if he number of ords in common beeen he bag of ords and he correc class is lo, hen he mos informaion conen is in he complemen se. Oherise he needed informaion is conained eiher in he normal se or in he complemen one. Taing ino accoun such consideraion e have se up differen combinaion and e concluded ha he useful rade-off among classifiers is equal o 10 (Table 1). As e can see he bes classifiers are he mixed and he CM 10 ones. They are able o classify respecively 1237 and 1238 over 1609 documens ha are disribued no uniformly among classes (46). These classifiers are able o classify a leas a documen per class even if here are classes conaining only 2 documens. Beeen hem he CM 10 classifier has index hree slighly beer han mixed one. The orse classifier, in his case, is he complemen version alone.

6 2094 P. CERCHIELLO, P. GIUDICI Table 1. Classificaion resuls by varying cluser numbers and using marix C. Classifier Measure sbdcm 5 sbdcm 11 sbdcm 17 sbdcm 23 sbdcm 46 DCM LL i 282, , , , , ,385 LL o 205, , , , , ,286 AIC i 56, , , , , ,264 AIC o 410, , , , , ,066 Norm. Ind \\ Ind \\ Ind Comp Ind \\ Ind \\ Ind Mixed Ind \\ Ind \\ Ind e no verify he goodness of he sbdcm models described in Secion 3, o undersand if e can inser laen opics ino DCM by mainaining he bursiness and he same classificaion performance. To inds of marixes have been used in he cluser procedure. One marix conains he correlaions among ords C in he vocabulary and anoher G is consruced by calculaing he Krusal-allis index on he coun marix among ords. The laer index g is defined as follos: g K 12 N 1 niri N N i1 p 3 ci i1 3 c i 2 (15) N N here n i is he number of sample daa, N he oal observaion number of he samples, he number of samples o be compared and r i he mean ran of i-h group. The denominaor of he index g is a correcion facor needed hen ied daa are presen in he daa se, here p is he number of recurring rans and c i is he imes he i-h ran is repeaed. The index g depends on he differences among he averages of he groups r i and he general average. If he samples come from he same populaion or from populaions ih he same cenral endency, he arihmeic averages of he rans of each group ri rij n i should be similar o each oher and o he j general average N 1 2 as ell. The raining daase conains 2051 documens ih a vocabulary of 4096 ords for boh approaches. The evaluaion daase (again he same for boh models) conains 1686 documens hich are disribued over 46 classes. In Tables 1 and 2 e repor he resuls from he o ess obained respecively by marixes C and G and by varying he number of groups in he cluser. In he ables e indicae ih LL in he log-lielihood before he parameers updaing and ih LL ou afer he ieraion procedure hich is sopped hen he error = is reached. The same ih AIC cin and AIC cou ha is he correced Aaie Informaion Crierion (AIC c ) before and afer he uploading. The indexes Ind1, Ind2 and Ind3 have been described in Secion 5.1. As e can see in he o Tables 1 and 2, he percenages of correc classificaion (Ind1) are very close o he original ones ih a parameer for each ord (4096 parameers). Of course hey depend on he ype of classifier employed during he classificaion sep. Considering boh sbdcm and DCM, he differences produced by varying he number of groups are small. Moreover he AIC c is alays beer in he ne approach hen considering each ord as a parameer (DCM model). In paricular for ha concerns he approach based on he correlaion marix C (in Table 2) ih 17 groups and on he Mixed classifier, i can predic correcly he 68.43% of documens. The log-lielihood and he AIC c indexes along groups are quie similar, hoever he bes value is obained ih 5 groups (respecively 205,412 and 410,834). Considering again he approach based on he correlaion marix C, e can conclude ha, in erms of complexiy expressed by he AIC index, he sbdcm approach haever applied classifier is alays beer han he DCM. hen e use marix G (Table 2) he bes classificaion performance is for he complemen classifier based on 23 groups, ih a percenage of 68.72%, a log-lielihood of 204,604, he AIC c of 409,254. The bes log-lielihood and AIC c are for cluser ih 46 groups (respecively 204,362 and 408,816). Even if he sbdcm disribuion based on marix G is no able o improve he classificaion performance of DCM, e can say ha he sbdcm Index1 is alays very close o he bes one. In addiion he ne model is

7 P. CERCHIELLO, P. GIUDICI 2095 Table 2. Classificaion resuls by varying cluser numbers and using marix G. Classifier Measure sbdcm 5 sbdcm 11 sbdcm 17 sbdcm 23 sbdcm 46 DCM LL i 291, , , , , ,385 LL o 205, , , , , ,286 AIC i 582, , , , , ,264 AIC o 411, , , , , ,066 Norm. Ind \\ Ind \\ Ind Comp Ind \\ Ind \\ Ind Mixed Ind \\ Ind \\ Ind alays beer in erms of eiher AIC and log-lielihood indexes. Moreover, if e perform an asympoic chi- 2 squared es es considering he o cases (marixes G and C) o decide heher he difference among loglielihoods (LL), ih respec o DCM, are significan (i.e. he difference is saisically meaningful if he LL1 LL2 is greaer han 6), e can see from Tables 1 and 2 he es ih marix G has he bes performance. Performance of he Semanic-Based Dirichle Compound Mulinomial ih T Knon in Advance A differen approach needs o be assessed hen he number of available opic T is non in advance. In fac a ex corpora could be enriched by several descripions of reaed opics according o he noledge of field expers. In more deails, he analysis could be provided ih a priori noledge based on onological schemas ha describe he relaions among conceps ih regards o he general opics of he corpora. An onology (from hich an onological schema is derived) is a formal represenaion of a se of conceps ihin a domain and he relaionships beeen hose conceps [16]. I provides a shared vocabulary, hich can be used o model a domain, ha is, he ype of objecs and/or conceps ha exis, and heir properies and relaions. In Figure 3 e repor an example of graphical represenaion of an onological schema. For example, if a ex se deals ih repuaional ris managemen for corporae insiuions, an onology can be creaed on he basis of he four caegories of problems (inernal processes, people, sysems and exernal evens) defined by Basel II Accords. Hence e can suppose ha some specific sub-opics and ey ords, such as he possible causes of repueional losses, ill be almos surely reaed along he exs. Figure 3. Example of onological schema. Thereby, he onology srucure provides he se of relaions among he conceps o hich can be associaed, by a field exper(s), a cerain number of ey ords. Thus, e an o use he classes of a given onology and he associaed ey ords o define in advance he number T of opics. The onological schema hich e refer o, for he daa a hand, deals ih he so called repuaional ris [17]. I is no simple o define and consequenly o measure and o monior he repuaion concep since i involves inangible asses such as: honor, public opinion, percepion, reliabiliy, meri. By he ay, i is a maer of fac, ha a bad repuaion can seriously affec and condiion he performance of a company. Moreover companies end o ac once he adverse even has occurred. According o such approach e can say ha here is no a ris managemen aciviy, bu only a crisis managemen. ih regards o repuaional ris, media coverage plays a ey role in deermining a company s repuaion. This ofen occurs hen a company repuaion has been significanly damaged by unfair aacs from special ineres groups or

8 2096 P. CERCHIELLO, P. GIUDICI inaccurae reporing by he media. A deailed and srucured analysis of ha he media are saying is especially imporan because he media shape he percepions and expecaions of all he involved acors. Naural language processing echnologies enable hese services o scan a ide range of oules, including nespapers, magazines, TV, radio, and blogs. In order o enable he applicaion of he classificaion exual model sbdcm e have collaboraed ih he Ialian mare leader company in financial and economic communicaion, Sole24ORE. Sole- 24ORE eam provided us ih a se of 870 aricles abou Alialia, an Ialian fligh company, covering a period of one year (Sep 07-Sep 08). The 80% of he aricles are used o rain he model and he remaining 20% o validae he process. The objecive is o classify he aricles on he basis of he repuaion onology in order o undersand he argumen reaed in he aricles. The onology classes used for he classificaion are: Ideniy: he percepion ha saeholders have of he organizaion, person, produc. I describes ho he organizaion is perceived by he saeholders. Corporae Image: he persona of he organiaion. Usually for companies visibly manifesed by ay of branding and he use of rademars and involves he mission and he vision. I involves brand value. Inegriy: personal inner sense of holeness deriving from honesy and consisen uprighness of characer. Qualiy: he achievemen or excellence of an eniy. Qualiy is someimes cerificaed by a hird par. Reliabiliy: abiliy of a sysem o perform/mainain is funcions in rouine and also in differen hosile or/ and unexpeced circumsances. I involves cusomer saisfacion and cusomer fideliaion. Social Responsibiliy: social responsibiliy is a docrine ha claims ha an organizaion or individual has a responsibiliy o sociey. I involves foundaion campaign and susainabiliy Technical Innovaion: he Inroducion of ne echnical producs or services. Measure of he RD orienaion of an organizaion (only for companies). Value For Money: he exen o hich he uiliy of a produc or service jusifies is price. Those classes define he concep of repuaion of a company. To lin he onology classes o he exual analysis e use a se of ey ords for each class of he repuaion schema. Since he aricles are in Ialian, he ey ords are in Ialian as ell. For example he concep of Reliabiliy involves cusomer saisfacion and cusomer fideliaion and is characerized by he folloing se of ey ords: affidabilià, fiducia, consumaori, risorsa, organizzazione, commerciale, dinamicià, valore, mercao. On he basis of hese ey ords, e perform a grouping analysis considering he 9 classes. From he clusering, e derive he marix D. Empirical resuls sho good performance in erms of correc classificaion rae. e obain ha, given 171 aricles, e correcly classify he 68% of hem. 6. Conclusions This conribuion has shon ho o enrich he DCM model ih a semanic exension. e also have proposed a mehod o inser laen opics ihin he Dirichle Compound Mulinomial (DCM) ihou losing he ords bursiness : e call such a disribuion semanicbased Dirichle Compound Mulinomial (sbdcm). The approaches assessed depend on he noledge abou he opics T. In fac here can be o alernaive conexs: on one hand he opics are unnon in advanced, hus o be esimaed on he basis of daa a hand. On he oher hand a ex corpora could be enriched by several descripions of reaed opics according o he experience of he field exper(s). Specifically, he analysis can be empoered ih a priori noledge based on onological schemas ha describe he relaions among conceps ih regards o he general class argumen of he corpora. In order o inser opics e creae a ne coefficien vecor for each opic and laer on e obain he parameers as a linear combinaion of hem. The mehodology is based on a marix D conaining he degree of membership of each ord o a cluser (i.e. a opic) by using he cluser disance marix. Then e spli he ords coun vecors among laen opics and, by employing a fixed-poin ieraion, e generae he β coefficiens represening he opics eighs. In order o compare he o models DCM and sbdcm e have employed a Naive Bayes Classifier based on he esimaed disribuions as shon in [13]. Several classifiers have been proposed and esed, and among hem he bes performance is obained by means of he mixed formula and CM 10. Moreover, e run several ess o verify if he classificaion performance reached ih an for each ord (DCM) is mainained or improved by he sbdcm. Such an objecive has been accomplished employing o differen approaches. e propose o differen mehods o generae parameers, one based on he correlaion among ords C and he second based on he Krusal-allis index calculaed on he ords coun marix G. The resuls repor ha he es performances in erms of misclassificaion rae are quie close o each oher and o he performance reached by he DCM. Hoever he sbdcm disribuion is able o obain beer resuls in erms of AIC and log-lielihood especially in he case of marix G. Concluding, by using marix D o describe ho ords coun vecors can be spli among opics, he as eighs for he opic and as a linear combinaion e are able o obain an opimal classifi-

9 P. CERCHIELLO, P. GIUDICI 2097 caion performance and o follo he bursiness. 7. Acnoledgemens This or has been suppored by MUSING 2006 conrac number , and FIRB, ). The paper is he resul of he close collaboraion beeen he auhors, hoever he paper has been rien by Paola Cerchiello under he supervision of Paolo Giudici. REFERENCES [1] S. Deereser, S. Dumais, G.. Furnas, T. K. Landauer and R. Harshman, Indexing by Laen Semanic Analysis, Journal of he American Sociey for Informaion Science, Vol. 41, No. 6, 1990, pp doi: /(sici) (199009)41:6<391::aid-as I1>3.0.CO;2-9 [2] T. Hofmann, Probabilisic Laen Semanic Indexing, Proceedings of Special Ineres Group on Informaion Rerieval, Ne Yor, 1999, pp [3] D. M. Blei, A. Y. Ng and M. I. Jordan, Laen Dirichle Allocaion, Journal of Machine Learning Research, Vol. 3, 2003, pp [4] M. Girolami and A. Kaban, On an Equivalence beeen PLSI and LDA, Proceedings of Special Ineres Group on Informaion Rerieval, Ne Yor, 2003, pp [5] D. M. Blei and J. D. Laffery, Correlaed Topic Models, Advances in Neural Informaion Processing Sysems, Vol. 18, 2006, pp [6] D. Puhividhya, H. T. Aias and S. S. Nagarajan, Independen Facor Topic Models, Proceeding of Inernaional Conference on Machine Learning, Ne Yor, 2009, pp [7] J. E. Mosimann, On he Compound Mulinomial Disribuion, he Mulivariae B-Disribuion, and Correlaions among Proporions, Biomeria, Vol. 49, No. 1-2, 1962, pp [8] K. Sjolander, K. Karplus, M. Bron, R. Hughey, A. Krogh, I. S. Mian and D. Haussler, Dirichle Mixures: A Mehod for Improving Deecion of ea bu Significan Proein Sequence Homology, Compuer Applicaions in he Biosciences, Vol. 12, No. 4, 1996, pp [9] D. J. C. Macay and L. Peo, A Hierarchical Dirichle Language Model, Naural Language Engineering, Vol. 1, No. 3, 1994, pp [10] T. Mina, Esimaing a Dirichle disribuion, Unpublished Paper, hp://research.microsof.com/en-us/um/people/mina/pap ers/dirichle/ [11] R. E. Madsen, D. Kaucha and C. Elan, Modeling ord Bursiness Using he Dirichle Disribuion, Proceeding of he 22nd Inernaional Conference on Machine Learning, Ne Yor, 2005, pp [12] G. Doyle and C. Elan, Accouning for Bursiness in Topic Models, Proceeding of Inernaional Conference on Machine Learning, Ne Yor, 2009, pp [13] J. D. M. Rennie, L. Shih, J. Teevan and D. R. Karge, Tacling he Poor Assumpions of Naive Bayes Tex Classifier, Proceeding of he 20h Inernaional Conference on Machine Learning, ashingon DC, 2003, 6 p. [14] A. P. Dempser, M. N. Laird and D. B. Rubin, Maximum Lielihood from Incomplee Daa via he EM Algorihm, Journal of he Royal Saisical Sociey, Series B, Vol. 39, No. 1, 1977, pp [15] D. Böhning, The EM Algorihm ih Gradien Funcion Updae for Discree Mixure ih Kno (Fixed) Number of Componens, Saisics and Compuing, Vol. 13, No. 3, 2003, pp doi: /a: [16] S. Saab and R. Suder, Handboo on Onologies, Inernaional Handboos on Informaion Sysems, 2nd Ediion, Springer, Berlin, [17] P. Cerchiello, Saisical Models o Measure Corporae Repuaion, Journal of Applied Quaniaive Mehods, Vol. 6, No. 4, 2011, pp

0.1 MAXIMUM LIKELIHOOD ESTIMATION EXPLAINED

0.1 MAXIMUM LIKELIHOOD ESTIMATION EXPLAINED 0.1 MAXIMUM LIKELIHOOD ESTIMATIO EXPLAIED Maximum likelihood esimaion is a bes-fi saisical mehod for he esimaion of he values of he parameers of a sysem, based on a se of observaions of a random variable

More information

Vehicle Arrival Models : Headway

Vehicle Arrival Models : Headway Chaper 12 Vehicle Arrival Models : Headway 12.1 Inroducion Modelling arrival of vehicle a secion of road is an imporan sep in raffic flow modelling. I has imporan applicaion in raffic flow simulaion where

More information

Errata (1 st Edition)

Errata (1 st Edition) P Sandborn, os Analysis of Elecronic Sysems, s Ediion, orld Scienific, Singapore, 03 Erraa ( s Ediion) S K 05D Page 8 Equaion (7) should be, E 05D E Nu e S K he L appearing in he equaion in he book does

More information

ACE 562 Fall Lecture 5: The Simple Linear Regression Model: Sampling Properties of the Least Squares Estimators. by Professor Scott H.

ACE 562 Fall Lecture 5: The Simple Linear Regression Model: Sampling Properties of the Least Squares Estimators. by Professor Scott H. ACE 56 Fall 005 Lecure 5: he Simple Linear Regression Model: Sampling Properies of he Leas Squares Esimaors by Professor Sco H. Irwin Required Reading: Griffihs, Hill and Judge. "Inference in he Simple

More information

Physics 235 Chapter 2. Chapter 2 Newtonian Mechanics Single Particle

Physics 235 Chapter 2. Chapter 2 Newtonian Mechanics Single Particle Chaper 2 Newonian Mechanics Single Paricle In his Chaper we will review wha Newon s laws of mechanics ell us abou he moion of a single paricle. Newon s laws are only valid in suiable reference frames,

More information

Georey E. Hinton. University oftoronto. Technical Report CRG-TR February 22, Abstract

Georey E. Hinton. University oftoronto.   Technical Report CRG-TR February 22, Abstract Parameer Esimaion for Linear Dynamical Sysems Zoubin Ghahramani Georey E. Hinon Deparmen of Compuer Science Universiy oftorono 6 King's College Road Torono, Canada M5S A4 Email: zoubin@cs.orono.edu Technical

More information

Modal identification of structures from roving input data by means of maximum likelihood estimation of the state space model

Modal identification of structures from roving input data by means of maximum likelihood estimation of the state space model Modal idenificaion of srucures from roving inpu daa by means of maximum likelihood esimaion of he sae space model J. Cara, J. Juan, E. Alarcón Absrac The usual way o perform a forced vibraion es is o fix

More information

MANY FACET, COMMON LATENT TRAIT POLYTOMOUS IRT MODEL AND EM ALGORITHM. Dimitar Atanasov

MANY FACET, COMMON LATENT TRAIT POLYTOMOUS IRT MODEL AND EM ALGORITHM. Dimitar Atanasov Pliska Sud. Mah. Bulgar. 20 (2011), 5 12 STUDIA MATHEMATICA BULGARICA MANY FACET, COMMON LATENT TRAIT POLYTOMOUS IRT MODEL AND EM ALGORITHM Dimiar Aanasov There are many areas of assessmen where he level

More information

Air Traffic Forecast Empirical Research Based on the MCMC Method

Air Traffic Forecast Empirical Research Based on the MCMC Method Compuer and Informaion Science; Vol. 5, No. 5; 0 ISSN 93-8989 E-ISSN 93-8997 Published by Canadian Cener of Science and Educaion Air Traffic Forecas Empirical Research Based on he MCMC Mehod Jian-bo Wang,

More information

Retrieval Models. Boolean and Vector Space Retrieval Models. Common Preprocessing Steps. Boolean Model. Boolean Retrieval Model

Retrieval Models. Boolean and Vector Space Retrieval Models. Common Preprocessing Steps. Boolean Model. Boolean Retrieval Model 1 Boolean and Vecor Space Rerieval Models Many slides in his secion are adaped from Prof. Joydeep Ghosh (UT ECE) who in urn adaped hem from Prof. Dik Lee (Univ. of Science and Tech, Hong Kong) Rerieval

More information

A Specification Test for Linear Dynamic Stochastic General Equilibrium Models

A Specification Test for Linear Dynamic Stochastic General Equilibrium Models Journal of Saisical and Economeric Mehods, vol.1, no.2, 2012, 65-70 ISSN: 2241-0384 (prin), 2241-0376 (online) Scienpress Ld, 2012 A Specificaion Tes for Linear Dynamic Sochasic General Equilibrium Models

More information

Robust estimation based on the first- and third-moment restrictions of the power transformation model

Robust estimation based on the first- and third-moment restrictions of the power transformation model h Inernaional Congress on Modelling and Simulaion, Adelaide, Ausralia, 6 December 3 www.mssanz.org.au/modsim3 Robus esimaion based on he firs- and hird-momen resricions of he power ransformaion Nawaa,

More information

CHERNOFF DISTANCE AND AFFINITY FOR TRUNCATED DISTRIBUTIONS *

CHERNOFF DISTANCE AND AFFINITY FOR TRUNCATED DISTRIBUTIONS * haper 5 HERNOFF DISTANE AND AFFINITY FOR TRUNATED DISTRIBUTIONS * 5. Inroducion In he case of disribuions ha saisfy he regulariy condiions, he ramer- Rao inequaliy holds and he maximum likelihood esimaor

More information

An introduction to the theory of SDDP algorithm

An introduction to the theory of SDDP algorithm An inroducion o he heory of SDDP algorihm V. Leclère (ENPC) Augus 1, 2014 V. Leclère Inroducion o SDDP Augus 1, 2014 1 / 21 Inroducion Large scale sochasic problem are hard o solve. Two ways of aacking

More information

CHAPTER 10 VALIDATION OF TEST WITH ARTIFICAL NEURAL NETWORK

CHAPTER 10 VALIDATION OF TEST WITH ARTIFICAL NEURAL NETWORK 175 CHAPTER 10 VALIDATION OF TEST WITH ARTIFICAL NEURAL NETWORK 10.1 INTRODUCTION Amongs he research work performed, he bes resuls of experimenal work are validaed wih Arificial Neural Nework. From he

More information

GMM - Generalized Method of Moments

GMM - Generalized Method of Moments GMM - Generalized Mehod of Momens Conens GMM esimaion, shor inroducion 2 GMM inuiion: Maching momens 2 3 General overview of GMM esimaion. 3 3. Weighing marix...........................................

More information

CONFIDENCE INTERVAL FOR THE DIFFERENCE IN BINOMIAL PROPORTIONS FROM STRATIFIED 2X2 SAMPLES

CONFIDENCE INTERVAL FOR THE DIFFERENCE IN BINOMIAL PROPORTIONS FROM STRATIFIED 2X2 SAMPLES Proceedings of he Annual Meeing of he American Saisical Associaion Augus 5-9 00 CONFIDENCE INTERVAL FOR TE DIFFERENCE IN BINOMIAL PROPORTIONS FROM STRATIFIED X SAMPLES Peng-Liang Zhao John. Troxell ui

More information

ASYMPTOTICALLY EXACT CONFIDENCE INTERVALS OF CUSUM AND CUSUMSQ TESTS: A Numerical Derivation Using Simulation Technique

ASYMPTOTICALLY EXACT CONFIDENCE INTERVALS OF CUSUM AND CUSUMSQ TESTS: A Numerical Derivation Using Simulation Technique ASYMPTOTICALLY EXACT CONFIDENCE INTERVALS OF CUSUM AND CUSUMSQ TESTS: A Numerical Derivaion Using Simulaion Technique Hisashi TANIZAKI Faculy of Economics, Kobe Universiy, Nada-ku, Kobe 657, JAPAN ABSTRACT:

More information

Exponential Weighted Moving Average (EWMA) Chart Under The Assumption of Moderateness And Its 3 Control Limits

Exponential Weighted Moving Average (EWMA) Chart Under The Assumption of Moderateness And Its 3 Control Limits DOI: 0.545/mjis.07.5009 Exponenial Weighed Moving Average (EWMA) Char Under The Assumpion of Moderaeness And Is 3 Conrol Limis KALPESH S TAILOR Assisan Professor, Deparmen of Saisics, M. K. Bhavnagar Universiy,

More information

Unit Root Time Series. Univariate random walk

Unit Root Time Series. Univariate random walk Uni Roo ime Series Univariae random walk Consider he regression y y where ~ iid N 0, he leas squares esimae of is: ˆ yy y y yy Now wha if = If y y hen le y 0 =0 so ha y j j If ~ iid N 0, hen y ~ N 0, he

More information

Matlab and Python programming: how to get started

Matlab and Python programming: how to get started Malab and Pyhon programming: how o ge sared Equipping readers he skills o wrie programs o explore complex sysems and discover ineresing paerns from big daa is one of he main goals of his book. In his chaper,

More information

Time series Decomposition method

Time series Decomposition method Time series Decomposiion mehod A ime series is described using a mulifacor model such as = f (rend, cyclical, seasonal, error) = f (T, C, S, e) Long- Iner-mediaed Seasonal Irregular erm erm effec, effec,

More information

Kriging Models Predicting Atrazine Concentrations in Surface Water Draining Agricultural Watersheds

Kriging Models Predicting Atrazine Concentrations in Surface Water Draining Agricultural Watersheds 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Kriging Models Predicing Arazine Concenraions in Surface Waer Draining Agriculural Waersheds Paul L. Mosquin, Jeremy Aldworh, Wenlin Chen Supplemenal Maerial Number

More information

20. Applications of the Genetic-Drift Model

20. Applications of the Genetic-Drift Model 0. Applicaions of he Geneic-Drif Model 1) Deermining he probabiliy of forming any paricular combinaion of genoypes in he nex generaion: Example: If he parenal allele frequencies are p 0 = 0.35 and q 0

More information

ACE 562 Fall Lecture 8: The Simple Linear Regression Model: R 2, Reporting the Results and Prediction. by Professor Scott H.

ACE 562 Fall Lecture 8: The Simple Linear Regression Model: R 2, Reporting the Results and Prediction. by Professor Scott H. ACE 56 Fall 5 Lecure 8: The Simple Linear Regression Model: R, Reporing he Resuls and Predicion by Professor Sco H. Irwin Required Readings: Griffihs, Hill and Judge. "Explaining Variaion in he Dependen

More information

R t. C t P t. + u t. C t = αp t + βr t + v t. + β + w t

R t. C t P t. + u t. C t = αp t + βr t + v t. + β + w t Exercise 7 C P = α + β R P + u C = αp + βr + v (a) (b) C R = α P R + β + w (c) Assumpions abou he disurbances u, v, w : Classical assumions on he disurbance of one of he equaions, eg. on (b): E(v v s P,

More information

Math 10B: Mock Mid II. April 13, 2016

Math 10B: Mock Mid II. April 13, 2016 Name: Soluions Mah 10B: Mock Mid II April 13, 016 1. ( poins) Sae, wih jusificaion, wheher he following saemens are rue or false. (a) If a 3 3 marix A saisfies A 3 A = 0, hen i canno be inverible. True.

More information

T L. t=1. Proof of Lemma 1. Using the marginal cost accounting in Equation(4) and standard arguments. t )+Π RB. t )+K 1(Q RB

T L. t=1. Proof of Lemma 1. Using the marginal cost accounting in Equation(4) and standard arguments. t )+Π RB. t )+K 1(Q RB Elecronic Companion EC.1. Proofs of Technical Lemmas and Theorems LEMMA 1. Le C(RB) be he oal cos incurred by he RB policy. Then we have, T L E[C(RB)] 3 E[Z RB ]. (EC.1) Proof of Lemma 1. Using he marginal

More information

Isolated-word speech recognition using hidden Markov models

Isolated-word speech recognition using hidden Markov models Isolaed-word speech recogniion using hidden Markov models Håkon Sandsmark December 18, 21 1 Inroducion Speech recogniion is a challenging problem on which much work has been done he las decades. Some of

More information

On Measuring Pro-Poor Growth. 1. On Various Ways of Measuring Pro-Poor Growth: A Short Review of the Literature

On Measuring Pro-Poor Growth. 1. On Various Ways of Measuring Pro-Poor Growth: A Short Review of the Literature On Measuring Pro-Poor Growh 1. On Various Ways of Measuring Pro-Poor Growh: A Shor eview of he Lieraure During he pas en years or so here have been various suggesions concerning he way one should check

More information

Ensamble methods: Bagging and Boosting

Ensamble methods: Bagging and Boosting Lecure 21 Ensamble mehods: Bagging and Boosing Milos Hauskrech milos@cs.pi.edu 5329 Senno Square Ensemble mehods Mixure of expers Muliple base models (classifiers, regressors), each covers a differen par

More information

Testing for a Single Factor Model in the Multivariate State Space Framework

Testing for a Single Factor Model in the Multivariate State Space Framework esing for a Single Facor Model in he Mulivariae Sae Space Framework Chen C.-Y. M. Chiba and M. Kobayashi Inernaional Graduae School of Social Sciences Yokohama Naional Universiy Japan Faculy of Economics

More information

10. State Space Methods

10. State Space Methods . Sae Space Mehods. Inroducion Sae space modelling was briefly inroduced in chaper. Here more coverage is provided of sae space mehods before some of heir uses in conrol sysem design are covered in he

More information

ACE 562 Fall Lecture 4: Simple Linear Regression Model: Specification and Estimation. by Professor Scott H. Irwin

ACE 562 Fall Lecture 4: Simple Linear Regression Model: Specification and Estimation. by Professor Scott H. Irwin ACE 56 Fall 005 Lecure 4: Simple Linear Regression Model: Specificaion and Esimaion by Professor Sco H. Irwin Required Reading: Griffihs, Hill and Judge. "Simple Regression: Economic and Saisical Model

More information

State-Space Models. Initialization, Estimation and Smoothing of the Kalman Filter

State-Space Models. Initialization, Estimation and Smoothing of the Kalman Filter Sae-Space Models Iniializaion, Esimaion and Smoohing of he Kalman Filer Iniializaion of he Kalman Filer The Kalman filer shows how o updae pas predicors and he corresponding predicion error variances when

More information

Diebold, Chapter 7. Francis X. Diebold, Elements of Forecasting, 4th Edition (Mason, Ohio: Cengage Learning, 2006). Chapter 7. Characterizing Cycles

Diebold, Chapter 7. Francis X. Diebold, Elements of Forecasting, 4th Edition (Mason, Ohio: Cengage Learning, 2006). Chapter 7. Characterizing Cycles Diebold, Chaper 7 Francis X. Diebold, Elemens of Forecasing, 4h Ediion (Mason, Ohio: Cengage Learning, 006). Chaper 7. Characerizing Cycles Afer compleing his reading you should be able o: Define covariance

More information

Chapter 2. Models, Censoring, and Likelihood for Failure-Time Data

Chapter 2. Models, Censoring, and Likelihood for Failure-Time Data Chaper 2 Models, Censoring, and Likelihood for Failure-Time Daa William Q. Meeker and Luis A. Escobar Iowa Sae Universiy and Louisiana Sae Universiy Copyrigh 1998-2008 W. Q. Meeker and L. A. Escobar. Based

More information

Speaker Adaptation Techniques For Continuous Speech Using Medium and Small Adaptation Data Sets. Constantinos Boulis

Speaker Adaptation Techniques For Continuous Speech Using Medium and Small Adaptation Data Sets. Constantinos Boulis Speaker Adapaion Techniques For Coninuous Speech Using Medium and Small Adapaion Daa Ses Consaninos Boulis Ouline of he Presenaion Inroducion o he speaker adapaion problem Maximum Likelihood Sochasic Transformaions

More information

Article from. Predictive Analytics and Futurism. July 2016 Issue 13

Article from. Predictive Analytics and Futurism. July 2016 Issue 13 Aricle from Predicive Analyics and Fuurism July 6 Issue An Inroducion o Incremenal Learning By Qiang Wu and Dave Snell Machine learning provides useful ools for predicive analyics The ypical machine learning

More information

The Rosenblatt s LMS algorithm for Perceptron (1958) is built around a linear neuron (a neuron with a linear

The Rosenblatt s LMS algorithm for Perceptron (1958) is built around a linear neuron (a neuron with a linear In The name of God Lecure4: Percepron and AALIE r. Majid MjidGhoshunih Inroducion The Rosenbla s LMS algorihm for Percepron 958 is buil around a linear neuron a neuron ih a linear acivaion funcion. Hoever,

More information

STATE-SPACE MODELLING. A mass balance across the tank gives:

STATE-SPACE MODELLING. A mass balance across the tank gives: B. Lennox and N.F. Thornhill, 9, Sae Space Modelling, IChemE Process Managemen and Conrol Subjec Group Newsleer STE-SPACE MODELLING Inroducion: Over he pas decade or so here has been an ever increasing

More information

23.2. Representing Periodic Functions by Fourier Series. Introduction. Prerequisites. Learning Outcomes

23.2. Representing Periodic Functions by Fourier Series. Introduction. Prerequisites. Learning Outcomes Represening Periodic Funcions by Fourier Series 3. Inroducion In his Secion we show how a periodic funcion can be expressed as a series of sines and cosines. We begin by obaining some sandard inegrals

More information

On Multicomponent System Reliability with Microshocks - Microdamages Type of Components Interaction

On Multicomponent System Reliability with Microshocks - Microdamages Type of Components Interaction On Mulicomponen Sysem Reliabiliy wih Microshocks - Microdamages Type of Componens Ineracion Jerzy K. Filus, and Lidia Z. Filus Absrac Consider a wo componen parallel sysem. The defined new sochasic dependences

More information

Shiva Akhtarian MSc Student, Department of Computer Engineering and Information Technology, Payame Noor University, Iran

Shiva Akhtarian MSc Student, Department of Computer Engineering and Information Technology, Payame Noor University, Iran Curren Trends in Technology and Science ISSN : 79-055 8hSASTech 04 Symposium on Advances in Science & Technology-Commission-IV Mashhad, Iran A New for Sofware Reliabiliy Evaluaion Based on NHPP wih Imperfec

More information

Chapter 2. First Order Scalar Equations

Chapter 2. First Order Scalar Equations Chaper. Firs Order Scalar Equaions We sar our sudy of differenial equaions in he same way he pioneers in his field did. We show paricular echniques o solve paricular ypes of firs order differenial equaions.

More information

Bias in Conditional and Unconditional Fixed Effects Logit Estimation: a Correction * Tom Coupé

Bias in Conditional and Unconditional Fixed Effects Logit Estimation: a Correction * Tom Coupé Bias in Condiional and Uncondiional Fixed Effecs Logi Esimaion: a Correcion * Tom Coupé Economics Educaion and Research Consorium, Naional Universiy of Kyiv Mohyla Academy Address: Vul Voloska 10, 04070

More information

The Simple Linear Regression Model: Reporting the Results and Choosing the Functional Form

The Simple Linear Regression Model: Reporting the Results and Choosing the Functional Form Chaper 6 The Simple Linear Regression Model: Reporing he Resuls and Choosing he Funcional Form To complee he analysis of he simple linear regression model, in his chaper we will consider how o measure

More information

3.1 More on model selection

3.1 More on model selection 3. More on Model selecion 3. Comparing models AIC, BIC, Adjused R squared. 3. Over Fiing problem. 3.3 Sample spliing. 3. More on model selecion crieria Ofen afer model fiing you are lef wih a handful of

More information

Application of a Stochastic-Fuzzy Approach to Modeling Optimal Discrete Time Dynamical Systems by Using Large Scale Data Processing

Application of a Stochastic-Fuzzy Approach to Modeling Optimal Discrete Time Dynamical Systems by Using Large Scale Data Processing Applicaion of a Sochasic-Fuzzy Approach o Modeling Opimal Discree Time Dynamical Sysems by Using Large Scale Daa Processing AA WALASZE-BABISZEWSA Deparmen of Compuer Engineering Opole Universiy of Technology

More information

Lecture 2-1 Kinematics in One Dimension Displacement, Velocity and Acceleration Everything in the world is moving. Nothing stays still.

Lecture 2-1 Kinematics in One Dimension Displacement, Velocity and Acceleration Everything in the world is moving. Nothing stays still. Lecure - Kinemaics in One Dimension Displacemen, Velociy and Acceleraion Everyhing in he world is moving. Nohing says sill. Moion occurs a all scales of he universe, saring from he moion of elecrons in

More information

Econ107 Applied Econometrics Topic 7: Multicollinearity (Studenmund, Chapter 8)

Econ107 Applied Econometrics Topic 7: Multicollinearity (Studenmund, Chapter 8) I. Definiions and Problems A. Perfec Mulicollineariy Econ7 Applied Economerics Topic 7: Mulicollineariy (Sudenmund, Chaper 8) Definiion: Perfec mulicollineariy exiss in a following K-variable regression

More information

5 The fitting methods used in the normalization of DSD

5 The fitting methods used in the normalization of DSD The fiing mehods used in he normalizaion of DSD.1 Inroducion Sempere-Torres e al. 1994 presened a general formulaion for he DSD ha was able o reproduce and inerpre all previous sudies of DSD. The mehodology

More information

Stability and Bifurcation in a Neural Network Model with Two Delays

Stability and Bifurcation in a Neural Network Model with Two Delays Inernaional Mahemaical Forum, Vol. 6, 11, no. 35, 175-1731 Sabiliy and Bifurcaion in a Neural Nework Model wih Two Delays GuangPing Hu and XiaoLing Li School of Mahemaics and Physics, Nanjing Universiy

More information

FITTING OF A PARTIALLY REPARAMETERIZED GOMPERTZ MODEL TO BROILER DATA

FITTING OF A PARTIALLY REPARAMETERIZED GOMPERTZ MODEL TO BROILER DATA FITTING OF A PARTIALLY REPARAMETERIZED GOMPERTZ MODEL TO BROILER DATA N. Okendro Singh Associae Professor (Ag. Sa.), College of Agriculure, Cenral Agriculural Universiy, Iroisemba 795 004, Imphal, Manipur

More information

Simulation-Solving Dynamic Models ABE 5646 Week 2, Spring 2010

Simulation-Solving Dynamic Models ABE 5646 Week 2, Spring 2010 Simulaion-Solving Dynamic Models ABE 5646 Week 2, Spring 2010 Week Descripion Reading Maerial 2 Compuer Simulaion of Dynamic Models Finie Difference, coninuous saes, discree ime Simple Mehods Euler Trapezoid

More information

Maintenance Models. Prof. Robert C. Leachman IEOR 130, Methods of Manufacturing Improvement Spring, 2011

Maintenance Models. Prof. Robert C. Leachman IEOR 130, Methods of Manufacturing Improvement Spring, 2011 Mainenance Models Prof Rober C Leachman IEOR 3, Mehods of Manufacuring Improvemen Spring, Inroducion The mainenance of complex equipmen ofen accouns for a large porion of he coss associaed wih ha equipmen

More information

Lecture Notes 2. The Hilbert Space Approach to Time Series

Lecture Notes 2. The Hilbert Space Approach to Time Series Time Series Seven N. Durlauf Universiy of Wisconsin. Basic ideas Lecure Noes. The Hilber Space Approach o Time Series The Hilber space framework provides a very powerful language for discussing he relaionship

More information

Matrix Versions of Some Refinements of the Arithmetic-Geometric Mean Inequality

Matrix Versions of Some Refinements of the Arithmetic-Geometric Mean Inequality Marix Versions of Some Refinemens of he Arihmeic-Geomeric Mean Inequaliy Bao Qi Feng and Andrew Tonge Absrac. We esablish marix versions of refinemens due o Alzer ], Carwrigh and Field 4], and Mercer 5]

More information

GENERALIZATION OF THE FORMULA OF FAA DI BRUNO FOR A COMPOSITE FUNCTION WITH A VECTOR ARGUMENT

GENERALIZATION OF THE FORMULA OF FAA DI BRUNO FOR A COMPOSITE FUNCTION WITH A VECTOR ARGUMENT Inerna J Mah & Mah Sci Vol 4, No 7 000) 48 49 S0670000970 Hindawi Publishing Corp GENERALIZATION OF THE FORMULA OF FAA DI BRUNO FOR A COMPOSITE FUNCTION WITH A VECTOR ARGUMENT RUMEN L MISHKOV Received

More information

DEPARTMENT OF STATISTICS

DEPARTMENT OF STATISTICS A Tes for Mulivariae ARCH Effecs R. Sco Hacker and Abdulnasser Haemi-J 004: DEPARTMENT OF STATISTICS S-0 07 LUND SWEDEN A Tes for Mulivariae ARCH Effecs R. Sco Hacker Jönköping Inernaional Business School

More information

GINI MEAN DIFFERENCE AND EWMA CHARTS. Muhammad Riaz, Department of Statistics, Quaid-e-Azam University Islamabad,

GINI MEAN DIFFERENCE AND EWMA CHARTS. Muhammad Riaz, Department of Statistics, Quaid-e-Azam University Islamabad, GINI MEAN DIFFEENCE AND EWMA CHATS Muhammad iaz, Deparmen of Saisics, Quaid-e-Azam Universiy Islamabad, Pakisan. E-Mail: riaz76qau@yahoo.com Saddam Akbar Abbasi, Deparmen of Saisics, Quaid-e-Azam Universiy

More information

Appendix to Creating Work Breaks From Available Idleness

Appendix to Creating Work Breaks From Available Idleness Appendix o Creaing Work Breaks From Available Idleness Xu Sun and Ward Whi Deparmen of Indusrial Engineering and Operaions Research, Columbia Universiy, New York, NY, 127; {xs2235,ww24}@columbia.edu Sepember

More information

Dynamic Econometric Models: Y t = + 0 X t + 1 X t X t k X t-k + e t. A. Autoregressive Model:

Dynamic Econometric Models: Y t = + 0 X t + 1 X t X t k X t-k + e t. A. Autoregressive Model: Dynamic Economeric Models: A. Auoregressive Model: Y = + 0 X 1 Y -1 + 2 Y -2 + k Y -k + e (Wih lagged dependen variable(s) on he RHS) B. Disribued-lag Model: Y = + 0 X + 1 X -1 + 2 X -2 + + k X -k + e

More information

Ensamble methods: Boosting

Ensamble methods: Boosting Lecure 21 Ensamble mehods: Boosing Milos Hauskrech milos@cs.pi.edu 5329 Senno Square Schedule Final exam: April 18: 1:00-2:15pm, in-class Term projecs April 23 & April 25: a 1:00-2:30pm in CS seminar room

More information

Methodology. -ratios are biased and that the appropriate critical values have to be increased by an amount. that depends on the sample size.

Methodology. -ratios are biased and that the appropriate critical values have to be increased by an amount. that depends on the sample size. Mehodology. Uni Roo Tess A ime series is inegraed when i has a mean revering propery and a finie variance. I is only emporarily ou of equilibrium and is called saionary in I(0). However a ime series ha

More information

1 Review of Zero-Sum Games

1 Review of Zero-Sum Games COS 5: heoreical Machine Learning Lecurer: Rob Schapire Lecure #23 Scribe: Eugene Brevdo April 30, 2008 Review of Zero-Sum Games Las ime we inroduced a mahemaical model for wo player zero-sum games. Any

More information

Final Spring 2007

Final Spring 2007 .615 Final Spring 7 Overview The purpose of he final exam is o calculae he MHD β limi in a high-bea oroidal okamak agains he dangerous n = 1 exernal ballooning-kink mode. Effecively, his corresponds o

More information

Navneet Saini, Mayank Goyal, Vishal Bansal (2013); Term Project AML310; Indian Institute of Technology Delhi

Navneet Saini, Mayank Goyal, Vishal Bansal (2013); Term Project AML310; Indian Institute of Technology Delhi Creep in Viscoelasic Subsances Numerical mehods o calculae he coefficiens of he Prony equaion using creep es daa and Herediary Inegrals Mehod Navnee Saini, Mayank Goyal, Vishal Bansal (23); Term Projec

More information

Chapter 5. Heterocedastic Models. Introduction to time series (2008) 1

Chapter 5. Heterocedastic Models. Introduction to time series (2008) 1 Chaper 5 Heerocedasic Models Inroducion o ime series (2008) 1 Chaper 5. Conens. 5.1. The ARCH model. 5.2. The GARCH model. 5.3. The exponenial GARCH model. 5.4. The CHARMA model. 5.5. Random coefficien

More information

Linear Response Theory: The connection between QFT and experiments

Linear Response Theory: The connection between QFT and experiments Phys540.nb 39 3 Linear Response Theory: The connecion beween QFT and experimens 3.1. Basic conceps and ideas Q: How do we measure he conduciviy of a meal? A: we firs inroduce a weak elecric field E, and

More information

Time series model fitting via Kalman smoothing and EM estimation in TimeModels.jl

Time series model fitting via Kalman smoothing and EM estimation in TimeModels.jl Time series model fiing via Kalman smoohing and EM esimaion in TimeModels.jl Gord Sephen Las updaed: January 206 Conens Inroducion 2. Moivaion and Acknowledgemens....................... 2.2 Noaion......................................

More information

Notes on Kalman Filtering

Notes on Kalman Filtering Noes on Kalman Filering Brian Borchers and Rick Aser November 7, Inroducion Daa Assimilaion is he problem of merging model predicions wih acual measuremens of a sysem o produce an opimal esimae of he curren

More information

ACE 564 Spring Lecture 7. Extensions of The Multiple Regression Model: Dummy Independent Variables. by Professor Scott H.

ACE 564 Spring Lecture 7. Extensions of The Multiple Regression Model: Dummy Independent Variables. by Professor Scott H. ACE 564 Spring 2006 Lecure 7 Exensions of The Muliple Regression Model: Dumm Independen Variables b Professor Sco H. Irwin Readings: Griffihs, Hill and Judge. "Dumm Variables and Varing Coefficien Models

More information

MATHEMATICAL DESCRIPTION OF THEORETICAL METHODS OF RESERVE ECONOMY OF CONSIGNMENT STORES

MATHEMATICAL DESCRIPTION OF THEORETICAL METHODS OF RESERVE ECONOMY OF CONSIGNMENT STORES MAHEMAICAL DESCIPION OF HEOEICAL MEHODS OF ESEVE ECONOMY OF CONSIGNMEN SOES Péer elek, József Cselényi, György Demeer Universiy of Miskolc, Deparmen of Maerials Handling and Logisics Absrac: Opimizaion

More information

EXPLICIT TIME INTEGRATORS FOR NONLINEAR DYNAMICS DERIVED FROM THE MIDPOINT RULE

EXPLICIT TIME INTEGRATORS FOR NONLINEAR DYNAMICS DERIVED FROM THE MIDPOINT RULE Version April 30, 2004.Submied o CTU Repors. EXPLICIT TIME INTEGRATORS FOR NONLINEAR DYNAMICS DERIVED FROM THE MIDPOINT RULE Per Krysl Universiy of California, San Diego La Jolla, California 92093-0085,

More information

Forecasting optimally

Forecasting optimally I) ile: Forecas Evaluaion II) Conens: Evaluaing forecass, properies of opimal forecass, esing properies of opimal forecass, saisical comparison of forecas accuracy III) Documenaion: - Diebold, Francis

More information

Vectorautoregressive Model and Cointegration Analysis. Time Series Analysis Dr. Sevtap Kestel 1

Vectorautoregressive Model and Cointegration Analysis. Time Series Analysis Dr. Sevtap Kestel 1 Vecorauoregressive Model and Coinegraion Analysis Par V Time Series Analysis Dr. Sevap Kesel 1 Vecorauoregression Vecor auoregression (VAR) is an economeric model used o capure he evoluion and he inerdependencies

More information

PENALIZED LEAST SQUARES AND PENALIZED LIKELIHOOD

PENALIZED LEAST SQUARES AND PENALIZED LIKELIHOOD PENALIZED LEAST SQUARES AND PENALIZED LIKELIHOOD HAN XIAO 1. Penalized Leas Squares Lasso solves he following opimizaion problem, ˆβ lasso = arg max β R p+1 1 N y i β 0 N x ij β j β j (1.1) for some 0.

More information

CONFIDENCE LIMITS AND THEIR ROBUSTNESS

CONFIDENCE LIMITS AND THEIR ROBUSTNESS CONFIDENCE LIMITS AND THEIR ROBUSTNESS Rajendran Raja Fermi Naional Acceleraor laboraory Baavia, IL 60510 Absrac Confidence limis are common place in physics analysis. Grea care mus be aken in heir calculaion

More information

Explaining Total Factor Productivity. Ulrich Kohli University of Geneva December 2015

Explaining Total Factor Productivity. Ulrich Kohli University of Geneva December 2015 Explaining Toal Facor Produciviy Ulrich Kohli Universiy of Geneva December 2015 Needed: A Theory of Toal Facor Produciviy Edward C. Presco (1998) 2 1. Inroducion Toal Facor Produciviy (TFP) has become

More information

3.1.3 INTRODUCTION TO DYNAMIC OPTIMIZATION: DISCRETE TIME PROBLEMS. A. The Hamiltonian and First-Order Conditions in a Finite Time Horizon

3.1.3 INTRODUCTION TO DYNAMIC OPTIMIZATION: DISCRETE TIME PROBLEMS. A. The Hamiltonian and First-Order Conditions in a Finite Time Horizon 3..3 INRODUCION O DYNAMIC OPIMIZAION: DISCREE IME PROBLEMS A. he Hamilonian and Firs-Order Condiions in a Finie ime Horizon Define a new funcion, he Hamilonian funcion, H. H he change in he oal value of

More information

Echocardiography Project and Finite Fourier Series

Echocardiography Project and Finite Fourier Series Echocardiography Projec and Finie Fourier Series 1 U M An echocardiagram is a plo of how a porion of he hear moves as he funcion of ime over he one or more hearbea cycles If he hearbea repeas iself every

More information

Comparing Means: t-tests for One Sample & Two Related Samples

Comparing Means: t-tests for One Sample & Two Related Samples Comparing Means: -Tess for One Sample & Two Relaed Samples Using he z-tes: Assumpions -Tess for One Sample & Two Relaed Samples The z-es (of a sample mean agains a populaion mean) is based on he assumpion

More information

Some Basic Information about M-S-D Systems

Some Basic Information about M-S-D Systems Some Basic Informaion abou M-S-D Sysems 1 Inroducion We wan o give some summary of he facs concerning unforced (homogeneous) and forced (non-homogeneous) models for linear oscillaors governed by second-order,

More information

d 1 = c 1 b 2 - b 1 c 2 d 2 = c 1 b 3 - b 1 c 3

d 1 = c 1 b 2 - b 1 c 2 d 2 = c 1 b 3 - b 1 c 3 and d = c b - b c c d = c b - b c c This process is coninued unil he nh row has been compleed. The complee array of coefficiens is riangular. Noe ha in developing he array an enire row may be divided or

More information

Morning Time: 1 hour 30 minutes Additional materials (enclosed):

Morning Time: 1 hour 30 minutes Additional materials (enclosed): ADVANCED GCE 78/0 MATHEMATICS (MEI) Differenial Equaions THURSDAY JANUARY 008 Morning Time: hour 30 minues Addiional maerials (enclosed): None Addiional maerials (required): Answer Bookle (8 pages) Graph

More information

Biol. 356 Lab 8. Mortality, Recruitment, and Migration Rates

Biol. 356 Lab 8. Mortality, Recruitment, and Migration Rates Biol. 356 Lab 8. Moraliy, Recruimen, and Migraion Raes (modified from Cox, 00, General Ecology Lab Manual, McGraw Hill) Las week we esimaed populaion size hrough several mehods. One assumpion of all hese

More information

Christos Papadimitriou & Luca Trevisan November 22, 2016

Christos Papadimitriou & Luca Trevisan November 22, 2016 U.C. Bereley CS170: Algorihms Handou LN-11-22 Chrisos Papadimiriou & Luca Trevisan November 22, 2016 Sreaming algorihms In his lecure and he nex one we sudy memory-efficien algorihms ha process a sream

More information

not to be republished NCERT MATHEMATICAL MODELLING Appendix 2 A.2.1 Introduction A.2.2 Why Mathematical Modelling?

not to be republished NCERT MATHEMATICAL MODELLING Appendix 2 A.2.1 Introduction A.2.2 Why Mathematical Modelling? 256 MATHEMATICS A.2.1 Inroducion In class XI, we have learn abou mahemaical modelling as an aemp o sudy some par (or form) of some real-life problems in mahemaical erms, i.e., he conversion of a physical

More information

Mathematical Theory and Modeling ISSN (Paper) ISSN (Online) Vol 3, No.3, 2013

Mathematical Theory and Modeling ISSN (Paper) ISSN (Online) Vol 3, No.3, 2013 Mahemaical Theory and Modeling ISSN -580 (Paper) ISSN 5-05 (Online) Vol, No., 0 www.iise.org The ffec of Inverse Transformaion on he Uni Mean and Consan Variance Assumpions of a Muliplicaive rror Model

More information

Inequality measures for intersecting Lorenz curves: an alternative weak ordering

Inequality measures for intersecting Lorenz curves: an alternative weak ordering h Inernaional Scienific Conference Financial managemen of Firms and Financial Insiuions Osrava VŠB-TU of Osrava, Faculy of Economics, Deparmen of Finance 7 h 8 h Sepember 25 Absrac Inequaliy measures for

More information

Accurate RMS Calculations for Periodic Signals by. Trapezoidal Rule with the Least Data Amount

Accurate RMS Calculations for Periodic Signals by. Trapezoidal Rule with the Least Data Amount Adv. Sudies Theor. Phys., Vol. 7, 3, no., 3-33 HIKARI Ld, www.m-hikari.com hp://dx.doi.org/.988/asp.3.3999 Accurae RS Calculaions for Periodic Signals by Trapezoidal Rule wih he Leas Daa Amoun Sompop Poomjan,

More information

Generalized Least Squares

Generalized Least Squares Generalized Leas Squares Augus 006 1 Modified Model Original assumpions: 1 Specificaion: y = Xβ + ε (1) Eε =0 3 EX 0 ε =0 4 Eεε 0 = σ I In his secion, we consider relaxing assumpion (4) Insead, assume

More information

in Engineering Prof. Dr. Michael Havbro Faber ETH Zurich, Switzerland Swiss Federal Institute of Technology

in Engineering Prof. Dr. Michael Havbro Faber ETH Zurich, Switzerland Swiss Federal Institute of Technology Risk and Saey in Engineering Pro. Dr. Michael Havbro Faber ETH Zurich, Swizerland Conens o Today's Lecure Inroducion o ime varian reliabiliy analysis The Poisson process The ormal process Assessmen o he

More information

Solutions to Odd Number Exercises in Chapter 6

Solutions to Odd Number Exercises in Chapter 6 1 Soluions o Odd Number Exercises in 6.1 R y eˆ 1.7151 y 6.3 From eˆ ( T K) ˆ R 1 1 SST SST SST (1 R ) 55.36(1.7911) we have, ˆ 6.414 T K ( ) 6.5 y ye ye y e 1 1 Consider he erms e and xe b b x e y e b

More information

Robotics I. April 11, The kinematics of a 3R spatial robot is specified by the Denavit-Hartenberg parameters in Tab. 1.

Robotics I. April 11, The kinematics of a 3R spatial robot is specified by the Denavit-Hartenberg parameters in Tab. 1. Roboics I April 11, 017 Exercise 1 he kinemaics of a 3R spaial robo is specified by he Denavi-Harenberg parameers in ab 1 i α i d i a i θ i 1 π/ L 1 0 1 0 0 L 3 0 0 L 3 3 able 1: able of DH parameers of

More information

Wednesday, November 7 Handout: Heteroskedasticity

Wednesday, November 7 Handout: Heteroskedasticity Amhers College Deparmen of Economics Economics 360 Fall 202 Wednesday, November 7 Handou: Heeroskedasiciy Preview Review o Regression Model o Sandard Ordinary Leas Squares (OLS) Premises o Esimaion Procedures

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION SUPPLEMENTARY INFORMATION DOI: 0.038/NCLIMATE893 Temporal resoluion and DICE * Supplemenal Informaion Alex L. Maren and Sephen C. Newbold Naional Cener for Environmenal Economics, US Environmenal Proecion

More information

Semi-Competing Risks on A Trivariate Weibull Survival Model

Semi-Competing Risks on A Trivariate Weibull Survival Model Semi-Compeing Risks on A Trivariae Weibull Survival Model Jenq-Daw Lee Graduae Insiue of Poliical Economy Naional Cheng Kung Universiy Tainan Taiwan 70101 ROC Cheng K. Lee Loss Forecasing Home Loans &

More information

How to Deal with Structural Breaks in Practical Cointegration Analysis

How to Deal with Structural Breaks in Practical Cointegration Analysis How o Deal wih Srucural Breaks in Pracical Coinegraion Analysis Roselyne Joyeux * School of Economic and Financial Sudies Macquarie Universiy December 00 ABSTRACT In his noe we consider he reamen of srucural

More information