Latent Semantic Indexing Based on Factor Analysis

Noriaki Kawamae
Center for Advanced Research and Technology, The University of Tokyo
4-6-1 Komaba, Meguro-ku, Tokyo, JAPAN
kawamae@mpeg.rcast.u-tokyo.ac.jp

Abstract

The main purpose of this paper is to propose a novel latent semantic indexing (LSI), statistical approach to simultaneously mapping documents and terms into a latent semantic space. This approach can index documents more effectively than the vector space model (VSM). Latent semantic indexing (LSI), which is based on singular value decomposition (SVD), and probabilistic latent semantic indexing (PLSI) have already been proposed to overcome problems in document indexing, but critical problems remain. In contrast to LSI and PLSI, our method uses a more meaningful, robust statistical model based on factor analysis and information theory. As a result, this model can solve the remaining critical problems in LSI and PLSI. Experimental results with a test collection showed that our method is superior to LSI and PLSI from the viewpoints of information retrieval and classification. We also propose a new term weighting method based on entropy.

1. Introduction

With the advent of digital databases and communication networks, it is easy to obtain Web pages and electronic documents. It is thus necessary to develop information retrieval methods to simplify the process of finding relevant information. Document indexing is one important information retrieval method. Traditionally, documents have been indexed and labeled manually by humans. An important example is the idea of notional families in the work of H. P. Luhn [9]. The primary goal is to index documents with the same precision achieved by humans. To develop such document indexing methods, the following problems must be solved:

Ambivalence between terms
Calculation cost
Document indexing and keyword matching methods

Due to these problems, the retrieval performance of indexing systems is poor. Among previous works, latent semantic indexing (LSI), based on singular value decomposition (SVD), and probabilistic latent semantic indexing (PLSI) have been developed to overcome these problems, but unsolved problems remain.
Our primary goal in this paper is to present a novel statistical approach to simultaneously mapping documents and terms into a latent semantic space. This approach can index documents better than by using individual indexing terms because the topics in a document are more closely related to the concepts described therein than to the index terms used in the document's description. Our method uses a more meaningful, robust statistical model, called a code model, that associates documents and terms with a latent semantic factor. This model is based on factor analysis and information theory and enables us to remove this noise and extract a better latent semantic space than other methods. As a result, documents can be indexed nearly as well as by humans. This is mainly because factor analysis is a better statistical model than SVD for capturing hidden factors.

2. Related work on document indexing and term selection

The vector space model (VSM) [11] is an approach to mapping documents into a space associated with the terms in the documents. The weighting of the terms in a document provides the document's coordinates in space, and the similarity between documents is measured in space. Latent semantic analysis (LSA) [2] is an approach to mapping documents into a lower dimensional space than LSI. This is accomplished by projecting term vectors into this space by using a model based on SVD. To date, several theoretical results or explanations have appeared, and these studies have provided a better understanding of LSI. However, many fundamental problems remain unresolved, as follows.

Inadequate use of statistical methods: LSI uses SVD as a statistical model. Therefore, LSI cannot explain or extract the latent semantic factor as a hidden variable, because SVD is nothing more than the decomposition of the observed variable matrix [6].

Optimal decomposition dimension: In general, dimensionality reduction is justified by the statistical significance of the latent semantic vectors as measured by the likelihood of the model based on SVD [3],[5]. The complexity of the model is not considered in terms of the dimension.

Term weighting method: To date, many researchers on LSI have used tf/idf [11] or a similar method.
This type of method, however, is not the best way to evaluate the usefulness of terms, because the weighting of low-frequency terms is underestimated, while that of high-frequency terms is overestimated.

3. Latent semantic extraction and term selection

Our method is a novel statistical approach to simultaneously mapping documents and terms into a latent semantic space, and it improves document retrieval performance. Our method consists of three components: (1) a novel term weighting method, (2) a code model, and (3) a statistical model criterion. The main idea in this approach is that a latent semantic factor is associated with each topic. A particular topic in a document is more related to the concepts described in the document than to the index terms used in the description of the document. Therefore, this proposed indexing method enables us to retrieve documents based on similarities between concepts. As a result, our proposed method evolves from keyword matching to concept matching. This allows us to retrieve documents even if they are not indexed by the terms in a query, because one document shares concepts with another document indexed by the given query. Therefore, the latent semantic space improves document retrieval performance.

3.1 Term-document matrix

Morphological analysis can be used to convert a document into a vector consisting of the terms occurring in it. The vector space model is a method of geometrically visualizing the relationships between documents. In this model, the relationships between terms and documents are represented as a term-document matrix, which contains the values of the index terms t occurring in each document d, properly weighted by other factors [4][2][8]. We denote m as the number of index terms in a collection of documents and n as the total number of documents. Formally, we let A denote a term-document matrix with m rows and n columns and let $w_{ij}$ be the element (i, j) of A. Each $w_{ij}$ is assigned a weight associated with the term-document pair ($t_i$, $d_j$), where $t_i$ (1 ≤ i ≤ m) represents the i-th term and $d_j$ (1 ≤ j ≤ n) represents the j-th document. For example, using a tf/idf representation, we have $w_{ij} = \mathrm{tf}(t_i, d_j)\,\mathrm{idf}(t_i)$. Thus, given the values of the $w_{ij}$, the term-document matrix represents the whole document collection. Therefore, each document can be expressed as a vector consisting of the weights of each term and mapped in a vector space:

$$A = (d_1 \cdots d_n) = \begin{pmatrix} w_{11} & \cdots & w_{1n} \\ \vdots & \ddots & \vdots \\ w_{m1} & \cdots & w_{mn} \end{pmatrix} \tag{1}$$

where row i corresponds to term $t_i$ and column j to document $d_j$.

There are two methods of term weighting [8]: local and global. Among various weighting methods, those designated as L3, G2, and G3 are novel and have not been described previously [Toku 1999].

3.1.1 Local weighting

This approach defines the weights $w_{ij}$ in each document.
Let $P_{ij}$ (1 ≤ i ≤ m, 1 ≤ j ≤ n) denote the occurrence probability of $t_i$ in $d_j$. We ascribe significance to a term's occurrence, on the grounds that it represents a document's topics more than other factors do. Therefore, we base $w_{ij}$ on the term occurrence probability $P_{ij}$ in each document, and we define a local weighting $L_{ij}$ as follows:

$$L_{ij} = P_{ij} \log(1 + P_{ij}) \tag{2}$$

In contrast to other local weighting methods based on term frequency, this method reduces the effect of very high frequency terms by multiplying $P_{ij}$ by a logarithm.

3.1.2 Global weighting

This approach defines the weights $w_{ij}$ over all documents. Let $p_{ij}$ (1 ≤ i ≤ m, 1 ≤ j ≤ n) denote the relative occurrence probability of $t_i$ in $d_j$. We ascribe significance to the probability of a term occurring over all documents on the grounds that a given term provides information for topic prediction and affects global performance. Entropy is a convenient metric for comparing the probability distribution of a term's occurrence. Therefore, we base $w_{ij}$ on the entropy of the relative occurrence probability, and we define a global weighting $G_i$ as follows:

$$G_i = 1 + \frac{1}{\log n}\sum_{j=1}^{n} p_{ij} \log p_{ij} \tag{3}$$

Because this weighting is based on entropy, if a term occurs with equal probability in all documents, its normalized entropy takes the value 1 (the maximum). To ensure that the entropy does not exceed 1, it is normalized through division by log n, where n is the total number of documents.

3.2 Introduction of factor analysis to obtain latent semantic space

3.2.1 Code model

The main theme in obtaining a latent semantic space is to capture the essential, meaningful semantic associations while reducing the amount of redundant, noisy information. Here, we propose a code model to determine a document's coordinates in the latent semantic space. The code model is based on the hypothesis that terms in documents are generated from a latent semantic factor and a given document has a probability of belonging to some category. This model can be used to determine the coordinates of not only documents but also terms. The relationships between terms and latent semantic factors are similar to the relationship between encoded data and an information source in information theory. Because of this similarity, we call our model a code model. We next describe the key idea in this model.
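Before turning to the code model itself, the two weightings of Sections 3.1.1 and 3.1.2 can be made concrete with a small sketch. It is only an illustration under stated assumptions, not the author's implementation: the function names are hypothetical, the local probability is taken to be a term's count divided by the total term count of the document, the relative occurrence probability is taken to be the share of a term's occurrences that falls in each document, and the two weightings are combined by multiplication, which is how the paper later says P(t|d) is obtained.

```python
import math


def local_weight(p):
    """Local weighting in the form of Eq. (2): L = P * log(1 + P)."""
    return p * math.log(1.0 + p)


def global_weight(term_counts):
    """Entropy-based global weighting of one term over n documents.

    Computes G = 1 + sum_j p_j * log(p_j) / log(n), where p_j is the share
    of the term's occurrences found in document j (an assumed reading of
    "relative occurrence probability").
    """
    n = len(term_counts)
    total = sum(term_counts)
    if n < 2 or total == 0:
        return 0.0  # assumption: an unseen term carries no weight
    entropy_sum = sum(
        (c / total) * math.log(c / total) for c in term_counts if c > 0
    )
    return 1.0 + entropy_sum / math.log(n)


def weight_matrix(counts):
    """Build w[i][j] = L_ij * G_i from raw counts[i][j] (term i, document j)."""
    m = len(counts)
    n = len(counts[0])
    doc_totals = [sum(counts[i][j] for i in range(m)) for j in range(n)]
    weights = []
    for i in range(m):
        g = global_weight(counts[i])
        row = []
        for j in range(n):
            p = counts[i][j] / doc_totals[j] if doc_totals[j] else 0.0
            row.append(local_weight(p) * g)
        weights.append(row)
    return weights


# Hypothetical toy collection: 2 terms (rows) by 3 documents (columns).
counts = [
    [3, 0, 0],  # term 0 occurs only in document 0
    [1, 1, 1],  # term 1 is spread evenly over all documents
]
W = weight_matrix(counts)
```

Under these assumptions, a term confined to a single document receives a global weight of 1, while a term spread evenly over all documents receives a global weight of 0 and therefore contributes nothing to any document vector.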
We think that a latent semantic factor is an information source and is coded into a term in a document. Therefore, the relationship between terms and latent semantic factors can be defined as follows.

$P(t_i \mid l_k)$: Probability of latent semantic factor $l_k$ generating term $t_i$, where $\sum_{i=1}^{m} P(t_i \mid l_k) = 1$. A term $t_i$ can be generated from not only the latent semantic factor $l_k$ but also another factor $l_{k'}$.

The relationship between a document and a latent semantic factor can be defined in the following way:

$P(l_k \mid d_j)$: Probability of document $d_j$ belonging to latent semantic factor $l_k$, where $\sum_{k} P(l_k \mid d_j) = 1$. Document $d_j$ can belong to not only latent semantic factor $l_k$ but also another factor $l_{k'}$.

The weights $w_{ij}$, which represent the value of $t_i$ in $d_j$, are used to combine these definitions into a joint probability model, resulting in the expression:

$$w_{ij} = P(t_i \mid d_j) = \sum_{k=1}^{l} P(t_i \mid l_k)\,P(l_k \mid d_j) + P'(t_i)\,P'(d_j)$$

where $P(t_i \mid d_j)$ denotes not the empirical occurrence but the statistical probability distribution of $t_i$ in $d_j$, obtained by multiplying the local and global weightings; $P'(t_i)$ denotes the unique probability of $t_i$; and $P'(d_j)$ denotes the unique probability of $d_j$. The reason for introducing these quantities is that $P(t_i \mid d_j)$ cannot actually be explained in terms of only the joint probability of $P(t_i \mid l_k)$ and $P(l_k \mid d_j)$. The difference between the statistical probability distribution $P(t_i \mid d_j)$ and the model distribution based on only the joint probability of $P(t_i \mid l_k)$ and $P(l_k \mid d_j)$ is considered as noise in information theory. Therefore, we call this model a code model. Notice that multiple documents can belong to some latent semantic factor at the same time. This is because the latent semantic factors are associated with the observed variances as $t_i$, $d_j$. Moreover, the latent semantic factors do not constrain documents to be orthogonal to each other. Despite the similarity, the fundamental difference between the aspect model [5] in PLSI and this code model is the unique probability. This difference affects the latent semantic factors obtained.

3.2.2 Factor analysis

To calculate the occurrence probability of terms in the code model, we introduce factor analysis, which is a statistical method that resolves an observed variance into a corresponding latent factor. Undoubtedly, SVD can also calculate a variable corresponding to the latent semantic factor obtained arithmetically by our method. The variable in SVD is a mixture of observed variables. In contrast, the variable in factor analysis is a latent factor. Therefore, it fits the code model mentioned earlier. In utilizing factor analysis, we define the observed variance as $P(t_i \mid d_j)$ and the latent factor as a concept.

3.3 Statistical model selection based on stochastic complexity (SC)

A term-document matrix is composed of only observed variances.
The object of factor analysis is to estimate the factor loading, meaning P(l | d) in the code model, while keeping the number of unique factors as low as possible. We need to define the optimized number of factors in advance. In this paper, we determine this number by applying stochastic complexity (SC) [Rissanen 1996] one time. This refers to a code length that is shortest in the case of coding with the number of parameters fixed to m. SC is defined as follows to determine the optimized number of latent semantic factors:

k ε ε, (4) ( Ak) [ 2( k + ) k( k ) ] ( m) SC log Pa ˆ ( A) + log * 2 (0)

where $\hat{P}_a(A)$ denotes the maximum likelihood estimator of term-document matrix A. The minimum of SC(A | k) gives the optimized number of latent semantic factors, k. This formulation allows us to solve the remaining problem of determining the optimized number of factors in both factor analysis and (P)LSI. The code model not only determines documents' coordinates in a latent semantic space but also is a proper generative model for the observed data, i.e., the probability of $t_i$ in $d_j$.

4. Document indexing in latent semantic space

4.1 Experimental design

Here, we assess the effectiveness of our method from three viewpoints: (1) term weighting methods, (2) our method vs. (P)LSI, and (3) statistical model selection based on SC. Both (P)LSI and our method use the term-document matrix as a starting point and map documents into a latent semantic space. To compare the effectiveness of these methods, we evaluated the retrieval performance for latent semantic spaces based on each method. Or, in terms of obtaining a latent semantic space, we compared SVD and factor analysis. We evaluated term weighting in terms of the contribution to the latent semantic factors, and statistical model selection in terms of whether it can predict the optimal number of latent semantic factors.

For our experiments, we prepared 210 news articles to form a test collection. These articles covered seven topic categories: economics, entertainment, information technology, politics, society, sports, and world news. Each category included the same number of articles. We used these categories to judge retrieval and classification results in terms of average precision and recall rate.
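The two evaluation measures named above can be illustrated with a short sketch. This is not code from the paper: the function name, the document ids, and the category size of 30 articles are hypothetical, and the sketch uses the standard definitions (precision is the fraction of retrieved documents that belong to the query's category, recall is the fraction of the category that was retrieved).

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for retrieved ids against relevant ids."""
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical category containing 30 articles with ids 0..29.
relevant = range(30)

# A search that returns a single, correct article.
p, r = precision_recall([4], relevant)

# A search that returns 29 correct articles plus one from another category.
p2, r2 = precision_recall(list(range(29)) + [99], relevant)
```

The first search is perfectly precise but recalls only one thirtieth of the category, which is the typical profile of pure keyword matching. The second search trades a small loss in precision for a large gain in recall.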
First, we decomposed the articles into terms by using morphological analysis, which exchanges a sentence for its component terms. We used ChaSen's approach [1] for this step. We used only terms that meet the condition that the part of speech is a noun or an unknown that is not registered in the dictionary of morphological analysis. This is because the terms, as used in queries, are limited to these parts of speech. We also omitted terms that appeared in three or fewer different documents. Generally, terms with a size of 1 (for example, a, etc.) were defined as stop words and omitted, and numbers were omitted as well. Single Japanese characters, however, do have meaning, so we did not treat these as stop words. The documents contain 7863 different terms, so a 7863 × 210 term-document matrix was generated.

4.2 Comparison of (P)LSI vs our method and term weighting methods

Table 1 compares similarity results among the documents based on different term weightings and latent semantic spaces. This comparison was done as follows. First, we calculated the documents' similarity matrix. Second, if the similarity of two documents was above a threshold, we judged these documents to be alike. Even if one of the documents did not include a term used in a query, we included both documents in the search results. Finally, we evaluated the search results in terms of the precision and recall rates, as shown in Table 1. In this experiment, we defined 0.7 as the threshold. The normal vector space model consisted of 7863 term axes. The spaces decomposed by SVD or by factor analysis both consisted of 7 axes. In both cases, the precision and recall rate were highest for a decomposed space with this number of axes.

Table 1 indicates that L3 and L4 were the most effective term weighting methods for local weighting, while G1, G3, and G4 were the most effective for global weighting. As for the decomposed space, both SVD and factor analysis achieved higher values than the normal vector space model. These results show that the similarity between documents measured in a space decomposed by SVD or factor analysis reflects a fundamental relationship between topics in the documents. They do not, however, distinguish our method from (P)LSI. The term weighting methods used in this experiment are defined as follows.

Local weighting

L1: term occurrence probability (=tf)
Weights $w_{ij}$ defined by occurrence in a document.

$$L1_{ij} = P_{ij} = \frac{C(t_{ij})}{\sum_{i=1}^{m} C(t_{ij})}$$

L2: normalized term occurrence probability
Weights $w_{ij}$ defined by a normalization of L1 based on a log function.

$$L2_{ij} = \log\left(1 + \frac{C(t_{ij})}{\sum_{i=1}^{m} C(t_{ij})}\right)$$

L3: normalized entropy of document
Weights $d_j$ defined by a term's occurrence probability distribution in a document.

$$L3_{j} = -\frac{1}{\log m}\sum_{i=1}^{m} p_{ij} \log p_{ij}$$

m: Number of different kinds of terms in document $d_j$.
$p_{ij}$: Frequency probability of term $t_i$ in document $d_j$.

L4: proposed local weighting method

Global weighting

G1: normalized entropy of term occurrence over all documents
Weights $t_i$ defined by each term's occurrence over all documents. This method is similar to our proposed global weighting but differs in terms of $p_{ij}$.

$$G1_{i} = 1 + \frac{1}{\log n}\sum_{j=1}^{n} p_{ij} \log p_{ij}$$

$p_{ij}$: Occurrence probability of term $t_i$ over all documents.

G2: normalized entropy of term occurrence over all documents
Weights $t_i$ defined by the occurrence proportion over all documents.
$$G2_{i} = -p_i \log p_i - (1 - p_i)\log(1 - p_i)$$

$p_i$: Occurrence probability of a document containing $t_i$ over all documents.

G3: proposed global weighting method

G4: document frequency (=idf)
Weights $d_j$ defined by the inverse of the document frequency.

$$H_i = 1 + \log\frac{n}{C_d(t_i)}$$

$C_d(t_i)$: Number of documents containing $t_i$.

G5: normalized entropy of relative term occurrence
Weights $d_j$ defined by the relative probability distribution over all documents.

$$H_i = 1 + \frac{1}{\log n}\sum_{j=1}^{n} p_{ij} \log p_{ij}$$

$p_{ij}$: Relative probability of $t_i$ occurring in $d_j$.

Table 1: Average precision and recall rates for (P)LSI and our method. This table compares not only the term weighting methods but also the VSM, (P)LSI, and our method, for each term weighting method. For each term weighting method, the top row is the VSM, the middle row is (P)LSI, and the bottom row is our method. Data are listed as N/P/R. TW and N/P/R are defined below this table.

TW | Economics | Entertainment | Information technology | Politics | Sports | Society | World news | Average

LG.0/00/3.3.0/00/3.3.0/00/3.3.0/99.7/3.3.0/00/3.3.0/00/3.3./00/3.6.0/00/ /96.6/ /98.9/ /98.7/ /99.7/ /99.8/ /98.3/ /98.7/ /98.7/ /96.3/ /98.9/ /98.6/ /00/ /99.8/ /98./ /98.7/ /98.6/97.8

5 LG2.0/00/3.3.0/00/3.3.0/00/3.3.0/99.9/3.3.0/00/3.3.0/00/3.3./00/3.6.0/00/ /97.0/ /00/ /98.7/ /00/ /00/ /99.4/ /98.5/ /99./ /96.7/ /00/ /98.8/ /00/ /00/ /99.4/ /98.6/ /99./92. LG3.0/00/3.3.0/00/3.3.0/00/3.3.0/99.8/3.3.0/00/3.3.0/00/3.3./00/3.6.0/00/ /96.9/ /99.3/ /98.9/ /99.8/ /99.8/00 30./98.8/ /98.7/ /98.9/ /96.5/ /99.4/ /98.7/ /00/ /99.8/00 30./98.8/ /98.7/ /98.8/98. LG4.0/00/3.3.0/00/3.3.0/00/3.3.0/99.8/3.3.0/00/3.3.0/00/3.3./00/3.6.0/00/ /97.8/ /99.7/ /99.0/ /99.8/00 30./99.7/ /99.5/ /99.2/ /99.2/ /97.5/ /99.5/ /98.8/ /00/ /99.5/ /99.5/ /99./ /99./99.2 L2G.0/00/3.3.0/00/3.3.0/00/3.3.0/99.7/3.3.0/00/3.3.0/00/3.3./00/3.6.0/00/ /96.6/ /99.0/ /98.7/ /99.7/ /99.8/ /98.4/ /98.7/ /98.7/ /96.3/ /99.0/ /98.6/ /00/ /99.8/ /98.2/ /98.7/ /98.6/97.8 L2G2.0/00/3.3.0/00/3.3.0/00/3.3.0/99.9/3.3.0/00/3.3.0/00/3.3./00/3.6.0/00/ /97.0/ /00/ /98.8/ /00/ /00/ /99.4/ /98.6/ /99./ /96.8/ /00/ /98.8/ /00/ /00/ /99.4/ /98.6/ /99./92.2 L2G3.0/00/3.3.0/00/3.3.0/00/3.3.0/99.8/3.3.0/00/3.3.0/00/3.3./00/3.6.0/00/ /97.0/ /99.4/ /98.9/ /99.8/ /99.8/00 30./98.9/ /98.7/ /98.9/ /96.5/ /99.4/ /98.7/ /00/ /99.8/00 30./98.8/ /98.7/ /98.8/98. 
L2G4.0/00/3.3.0/00/3.3.0/00/3.3.0/99.8/3.3.0/00/3.3.0/00/3.3./00/3.6.0/00/ /97.8/ /99.7/ /99.0/ /99.8/00 30./99.7/ /99.5/ /99.2/ /99.2/ /97.6/ /99.5/ /98.8/ /00/ /99.5/ /99.5/ /99.2/ /99./99.2 L3G.0/00/3.3.0/00/3.3.0/00/3.3.0/00/3.3.0/00/3.3.0/00/3.3./00/3.6.0/00/3.4 L3G2.0/00/3.3.0/00/3.3.0/00/3.3.0/00/3.3.0/00/3.3.0/00/3.3./00/3.6.0/00/ /99.9/ /00/ /99.9/ /00/ /00/ /00/00 29./00/ /00/ /99.9/ /99.2/ /99.9/ /00/ /99.0/ /00/00 7.7/00/ /99.7/93.4 L3G3.0/00/3.3.0/00/3.3.0/00/3.3.0/00/3.3.0/00/3.3.0/00/3.3./00/3.6.0/00/3.4 L3G4.0/00/3.3.0/00/3.3.0/00/3.3.0/00/3.3.0/00/3.3.0/00/3.3./00/3.6.0/00/3.4 L4G.0/00/3.3.0/00/3.3.0/00/3.3.0/00/3.3.0/00/3.3.0/00/3.3./00/3.6.0/00/3.4 L4G2.0/00/3.3.0/00/3.3.0/00/3.3.0/00/3.3.0/00/3.3.0/00/3.3./00/3.6.0/00/ /99.9/ /00/ /99.9/ /00/ /00/ /00/00 29./00/ /00/ /99.9/ /99.2/ /99.9/ /00/ /99.0/ /00/00 7.7/00/ /99.7/93.4 L4G3.0/00/3.3.0/00/3.3.0/00/3.3.0/00/3.3.0/00/3.3.0/00/3.3./00/3.6.0/00/3.4 L4G4.0/00/3.3.0/00/3.3.0/00/3.3.0/00/3.3.0/00/3.3.0/00/3.3./00/3.6.0/00/3.4

30.0/00/ /99.9/ /00/ /0.0/ /99.9/ /00/ /00/ /00/00

TW: term weighting, N: the number of documents retrieved, P: precision, R: recall

... documents, we cannot distinguish factor analysis from SVD in terms of precision. When we reduce the number of terms by using the value of global weighting, however, we see the superiority of factor analysis. As for the weighting methods, the results indicate that it is effective to select terms based on the value with L4 as the local weighting method. We should emphasize that the latent semantic space based on the combination of factor analysis and G3*L4 can be used to classify documents with only half the number of terms normally required. The VSM, on the other hand, does not depend on the number of terms or the weighting method. These results show that our proposed weighting method is useful not only for indexing documents but also for selecting the minimum number of terms.

4.4 Statistical model selection

In this experiment, the optimal number of latent semantic factors was seven. This is the same as the number of topics used. This indicates that the obtained latent semantic factors can effectively characterize documents statistically; we can say that these factors represent semantic meanings. SC predicts nine as the optimal number of latent semantic factors, as shown in Table 2. Other objective functions [3], [5], however, predict numbers far greater than or less than nine, which do not appear in Table 2. The value predicted by SC is thus approximately correct.

Table 2. Optimal number of latent semantic factors and SC

FN | 5 | 6 | 7* | 8 | 9
Likelihood
SC

* FN: Number of latent semantic factors

5. Conclusions

We have proposed a novel statistical approach to simultaneously mapping documents and terms into a latent semantic space. Our method consists of three components: (1) a novel term weighting method, (2) a code model, and (3) a statistical model criterion. Retrieval and classification experiments on a test collection indicated that our method is superior to (P)LSI. In other words, the axes in a latent semantic space obtained by our method are closer to the general concepts indexed by humans than any other method.
Finally, we introduced a statistical model selection approach based on stochastic complexity (SC) to solve the remaining problem in (P)LSI: the problem of how to determine the number of latent semantic factors. Our experiments showed that the formulation based on SC solves this problem. We thus conclude that our method is a useful document indexing method that not only solves the critical remaining problems in LSI and PLSI but also improves the retrieval performance.

REFERENCES

[1] ChaSen:
[2] Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., and Harshman, R.: Indexing by latent semantic analysis, Journal of the American Society for Information Science, 1990.
[3] Ding, C. H. Q.: A Dual Probabilistic Model for Latent Semantic Indexing in Information Retrieval and Filtering, Proceedings of the 22nd Annual Conference on Research and Development in Information Retrieval (ACM SIGIR), 1999.
[4] Dumais, S. T.: Improving the retrieval of information from external sources, Behavior Research Methods, Instruments and Computers, 23(2).
[5] Hofmann, T.: Probabilistic latent semantic indexing, Proceedings of the 22nd Annual Conference on Research and Development in Information Retrieval (ACM SIGIR), 1999.
[6] Kawamae, N., Aoki, T., Yasuda, H.: Information Retrieval Based on the Information Theory Model, Technical Report of IEICE, DE2001-57, 2001 (in Japanese).
[7] Kawamae, N., Aoki, T., Yasuda, H.: Document Classification and Retrieval after Removing Word Noise, Technical Report of IEICE, NLC2001-48, 2001 (in Japanese).
[8] Kita, K.: Statistical Language Model, The University of Tokyo Press, 1999.
[9] Luhn, H. P.: The automatic derivation of information retrieval encodement from machine readable text, Information Retrieval and Machine Translation, 3(2), 1961.
[10] Schutze, H. and Pedersen, J.: A vector model for syntagmatic and paradigmatic relatedness, Proceedings of the 9th Annual Conference of the University of Waterloo Center for the New OED and Text Research, 1993.
[11] Salton, G. and McGill, M. J.: Introduction to Modern Information Retrieval, McGraw-Hill, 1983.
[12] Salton, G. and Buckley, C.: Term-weighting approaches in automatic text retrieval, Information Processing and Management, 24(5), 1988.
[13] Saul, L., and Pereira, F.: Aggregate and mixed order Markov models for statistical language processing, Proceedings of 2nd International Conference on Empirical Methods in Natural Language Processing, 1997.
[14] Tohku, T.: Information Retrieval and Language Process, The University of Tokyo Press, 1999.


Block-Based Compact Thermal Modeling of Semiconductor Integrated Circuits Block-Based Compact hermal Modelg of Semcoductor Itegrated Crcuts Master s hess Defese Caddate: Jg Ba Commttee Members: Dr. Mg-Cheg Cheg Dr. Daqg Hou Dr. Robert Schllg July 27, 2009 Outle Itroducto Backgroud

More information

Summary of the lecture in Biostatistics

Summary of the lecture in Biostatistics Summary of the lecture Bostatstcs Probablty Desty Fucto For a cotuos radom varable, a probablty desty fucto s a fucto such that: 0 dx a b) b a dx A probablty desty fucto provdes a smple descrpto of the

More information

Estimation of Stress- Strength Reliability model using finite mixture of exponential distributions

Estimation of Stress- Strength Reliability model using finite mixture of exponential distributions Iteratoal Joural of Computatoal Egeerg Research Vol, 0 Issue, Estmato of Stress- Stregth Relablty model usg fte mxture of expoetal dstrbutos K.Sadhya, T.S.Umamaheswar Departmet of Mathematcs, Lal Bhadur

More information

Unimodality Tests for Global Optimization of Single Variable Functions Using Statistical Methods

Unimodality Tests for Global Optimization of Single Variable Functions Using Statistical Methods Malaysa Umodalty Joural Tests of Mathematcal for Global Optmzato Sceces (): of 05 Sgle - 5 Varable (007) Fuctos Usg Statstcal Methods Umodalty Tests for Global Optmzato of Sgle Varable Fuctos Usg Statstcal

More information

Wendy Korn, Moon Chang (IBM) ACM SIGARCH Computer Architecture News Vol. 35, No. 1, March 2007

Wendy Korn, Moon Chang (IBM) ACM SIGARCH Computer Architecture News Vol. 35, No. 1, March 2007 CPU 2006 Sestvty to Memory Page Szes Wedy Kor, Moo Chag (IBM) ACM SIGARCH Computer Archtecture News Vol. 35, No. 1, March 2007 Memory usage 1. Mmum ad Maxmum memory used 2. Sestvty to page szes 3. 4K,

More information

The Mathematical Appendix

The Mathematical Appendix The Mathematcal Appedx Defto A: If ( Λ, Ω, where ( λ λ λ whch the probablty dstrbutos,,..., Defto A. uppose that ( Λ,,..., s a expermet type, the σ-algebra o λ λ λ are defed s deoted by ( (,,...,, σ Ω.

More information

Linear Regression Linear Regression with Shrinkage. Some slides are due to Tommi Jaakkola, MIT AI Lab

Linear Regression Linear Regression with Shrinkage. Some slides are due to Tommi Jaakkola, MIT AI Lab Lear Regresso Lear Regresso th Shrkage Some sldes are due to Tomm Jaakkola, MIT AI Lab Itroducto The goal of regresso s to make quattatve real valued predctos o the bass of a vector of features or attrbutes.

More information

Lecture 8: Linear Regression

Lecture 8: Linear Regression Lecture 8: Lear egresso May 4, GENOME 56, Sprg Goals Develop basc cocepts of lear regresso from a probablstc framework Estmatg parameters ad hypothess testg wth lear models Lear regresso Su I Lee, CSE

More information

Bayes Interval Estimation for binomial proportion and difference of two binomial proportions with Simulation Study

Bayes Interval Estimation for binomial proportion and difference of two binomial proportions with Simulation Study IJIEST Iteratoal Joural of Iovatve Scece, Egeerg & Techology, Vol. Issue 5, July 04. Bayes Iterval Estmato for bomal proporto ad dfferece of two bomal proportos wth Smulato Study Masoud Gaj, Solmaz hlmad

More information

CHAPTER 4 RADICAL EXPRESSIONS

CHAPTER 4 RADICAL EXPRESSIONS 6 CHAPTER RADICAL EXPRESSIONS. The th Root of a Real Number A real umber a s called the th root of a real umber b f Thus, for example: s a square root of sce. s also a square root of sce ( ). s a cube

More information

Chapter 14 Logistic Regression Models

Chapter 14 Logistic Regression Models Chapter 4 Logstc Regresso Models I the lear regresso model X β + ε, there are two types of varables explaatory varables X, X,, X k ad study varable y These varables ca be measured o a cotuous scale as

More information

PTAS for Bin-Packing

PTAS for Bin-Packing CS 663: Patter Matchg Algorthms Scrbe: Che Jag /9/00. Itroducto PTAS for B-Packg The B-Packg problem s NP-hard. If we use approxmato algorthms, the B-Packg problem could be solved polyomal tme. For example,

More information

Comparison of Dual to Ratio-Cum-Product Estimators of Population Mean

Comparison of Dual to Ratio-Cum-Product Estimators of Population Mean Research Joural of Mathematcal ad Statstcal Sceces ISS 30 6047 Vol. 1(), 5-1, ovember (013) Res. J. Mathematcal ad Statstcal Sc. Comparso of Dual to Rato-Cum-Product Estmators of Populato Mea Abstract

More information

Descriptive Statistics

Descriptive Statistics Page Techcal Math II Descrptve Statstcs Descrptve Statstcs Descrptve statstcs s the body of methods used to represet ad summarze sets of data. A descrpto of how a set of measuremets (for eample, people

More information

CIS 800/002 The Algorithmic Foundations of Data Privacy October 13, Lecture 9. Database Update Algorithms: Multiplicative Weights

CIS 800/002 The Algorithmic Foundations of Data Privacy October 13, Lecture 9. Database Update Algorithms: Multiplicative Weights CIS 800/002 The Algorthmc Foudatos of Data Prvacy October 13, 2011 Lecturer: Aaro Roth Lecture 9 Scrbe: Aaro Roth Database Update Algorthms: Multplcatve Weghts We ll recall aga) some deftos from last tme:

More information

ABOUT ONE APPROACH TO APPROXIMATION OF CONTINUOUS FUNCTION BY THREE-LAYERED NEURAL NETWORK

ABOUT ONE APPROACH TO APPROXIMATION OF CONTINUOUS FUNCTION BY THREE-LAYERED NEURAL NETWORK ABOUT ONE APPROACH TO APPROXIMATION OF CONTINUOUS FUNCTION BY THREE-LAYERED NEURAL NETWORK Ram Rzayev Cyberetc Isttute of the Natoal Scece Academy of Azerbaa Republc ramrza@yahoo.com Aygu Alasgarova Khazar

More information

{ }{ ( )} (, ) = ( ) ( ) ( ) Chapter 14 Exercises in Sampling Theory. Exercise 1 (Simple random sampling): Solution:

{ }{ ( )} (, ) = ( ) ( ) ( ) Chapter 14 Exercises in Sampling Theory. Exercise 1 (Simple random sampling): Solution: Chapter 4 Exercses Samplg Theory Exercse (Smple radom samplg: Let there be two correlated radom varables X ad A sample of sze s draw from a populato by smple radom samplg wthout replacemet The observed

More information

4. Standard Regression Model and Spatial Dependence Tests

4. Standard Regression Model and Spatial Dependence Tests 4. Stadard Regresso Model ad Spatal Depedece Tests Stadard regresso aalss fals the presece of spatal effects. I case of spatal depedeces ad/or spatal heterogeet a stadard regresso model wll be msspecfed.

More information

An Indian Journal FULL PAPER ABSTRACT KEYWORDS. Trade Science Inc. Research on scheme evaluation method of automation mechatronic systems

An Indian Journal FULL PAPER ABSTRACT KEYWORDS. Trade Science Inc. Research on scheme evaluation method of automation mechatronic systems [ype text] [ype text] [ype text] ISSN : 0974-7435 Volume 0 Issue 6 Boechology 204 Ida Joural FULL PPER BIJ, 0(6, 204 [927-9275] Research o scheme evaluato method of automato mechatroc systems BSRC Che

More information

ENGI 3423 Simple Linear Regression Page 12-01

ENGI 3423 Simple Linear Regression Page 12-01 ENGI 343 mple Lear Regresso Page - mple Lear Regresso ometmes a expermet s set up where the expermeter has cotrol over the values of oe or more varables X ad measures the resultg values of aother varable

More information

On the Link Between the Concepts of Kurtosis and Bipolarization. Abstract

On the Link Between the Concepts of Kurtosis and Bipolarization. Abstract O the Lk etwee the Cocepts of Kurtoss ad polarzato Jacques SILE ar-ila Uversty Joseph Deutsch ar-ila Uversty Metal Haoka ar-ila Uversty h.d. studet) Abstract I a paper o the measuremet of the flatess of

More information

Application of Calibration Approach for Regression Coefficient Estimation under Two-stage Sampling Design

Application of Calibration Approach for Regression Coefficient Estimation under Two-stage Sampling Design Authors: Pradp Basak, Kaustav Adtya, Hukum Chadra ad U.C. Sud Applcato of Calbrato Approach for Regresso Coeffcet Estmato uder Two-stage Samplg Desg Pradp Basak, Kaustav Adtya, Hukum Chadra ad U.C. Sud

More information

Dimensionality Reduction and Learning

Dimensionality Reduction and Learning CMSC 35900 (Sprg 009) Large Scale Learg Lecture: 3 Dmesoalty Reducto ad Learg Istructors: Sham Kakade ad Greg Shakharovch L Supervsed Methods ad Dmesoalty Reducto The theme of these two lectures s that

More information

Pseudo-random Functions

Pseudo-random Functions Pseudo-radom Fuctos Debdeep Mukhopadhyay IIT Kharagpur We have see the costructo of PRG (pseudo-radom geerators) beg costructed from ay oe-way fuctos. Now we shall cosder a related cocept: Pseudo-radom

More information

Assignment 5/MATH 247/Winter Due: Friday, February 19 in class (!) (answers will be posted right after class)

Assignment 5/MATH 247/Winter Due: Friday, February 19 in class (!) (answers will be posted right after class) Assgmet 5/MATH 7/Wter 00 Due: Frday, February 9 class (!) (aswers wll be posted rght after class) As usual, there are peces of text, before the questos [], [], themselves. Recall: For the quadratc form

More information

Lecture 9: Tolerant Testing

Lecture 9: Tolerant Testing Lecture 9: Tolerat Testg Dael Kae Scrbe: Sakeerth Rao Aprl 4, 07 Abstract I ths lecture we prove a quas lear lower boud o the umber of samples eeded to do tolerat testg for L dstace. Tolerat Testg We have

More information

Investigating Cellular Automata

Investigating Cellular Automata Researcher: Taylor Dupuy Advsor: Aaro Wootto Semester: Fall 4 Ivestgatg Cellular Automata A Overvew of Cellular Automata: Cellular Automata are smple computer programs that geerate rows of black ad whte

More information

Simple Linear Regression

Simple Linear Regression Statstcal Methods I (EST 75) Page 139 Smple Lear Regresso Smple regresso applcatos are used to ft a model descrbg a lear relatoshp betwee two varables. The aspects of least squares regresso ad correlato

More information

A New Family of Transformations for Lifetime Data

A New Family of Transformations for Lifetime Data Proceedgs of the World Cogress o Egeerg 4 Vol I, WCE 4, July - 4, 4, Lodo, U.K. A New Famly of Trasformatos for Lfetme Data Lakhaa Watthaacheewakul Abstract A famly of trasformatos s the oe of several

More information

best estimate (mean) for X uncertainty or error in the measurement (systematic, random or statistical) best

best estimate (mean) for X uncertainty or error in the measurement (systematic, random or statistical) best Error Aalyss Preamble Wheever a measuremet s made, the result followg from that measuremet s always subject to ucertaty The ucertaty ca be reduced by makg several measuremets of the same quatty or by mprovg

More information

STA 108 Applied Linear Models: Regression Analysis Spring Solution for Homework #1

STA 108 Applied Linear Models: Regression Analysis Spring Solution for Homework #1 STA 08 Appled Lear Models: Regresso Aalyss Sprg 0 Soluto for Homework #. Let Y the dollar cost per year, X the umber of vsts per year. The the mathematcal relato betwee X ad Y s: Y 300 + X. Ths s a fuctoal

More information

A New Measure of Probabilistic Entropy. and its Properties

A New Measure of Probabilistic Entropy. and its Properties Appled Mathematcal Sceces, Vol. 4, 200, o. 28, 387-394 A New Measure of Probablstc Etropy ad ts Propertes Rajeesh Kumar Departmet of Mathematcs Kurukshetra Uversty Kurukshetra, Ida rajeesh_kuk@redffmal.com

More information

Principal Components. Analysis. Basic Intuition. A Method of Self Organized Learning

Principal Components. Analysis. Basic Intuition. A Method of Self Organized Learning Prcpal Compoets Aalss A Method of Self Orgazed Learg Prcpal Compoets Aalss Stadard techque for data reducto statstcal patter matchg ad sgal processg Usupervsed learg: lear from examples wthout a teacher

More information

On generalized fuzzy mean code word lengths. Department of Mathematics, Jaypee University of Engineering and Technology, Guna, Madhya Pradesh, India

On generalized fuzzy mean code word lengths. Department of Mathematics, Jaypee University of Engineering and Technology, Guna, Madhya Pradesh, India merca Joural of ppled Mathematcs 04; (4): 7-34 Publshed ole ugust 30, 04 (http://www.scecepublshggroup.com//aam) do: 0.648/.aam.04004.3 ISSN: 330-0043 (Prt); ISSN: 330-006X (Ole) O geeralzed fuzzy mea

More information

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS Postpoed exam: ECON430 Statstcs Date of exam: Jauary 0, 0 Tme for exam: 09:00 a.m. :00 oo The problem set covers 5 pages Resources allowed: All wrtte ad prted

More information

2C09 Design for seismic and climate changes

2C09 Design for seismic and climate changes 2C09 Desg for sesmc ad clmate chages Lecture 08: Sesmc aalyss of elastc MDOF systems Aurel Strata, Poltehca Uversty of Tmsoara 06/04/2017 Europea Erasmus Mudus Master Course Sustaable Costructos uder atural

More information

Generating Multivariate Nonnormal Distribution Random Numbers Based on Copula Function

Generating Multivariate Nonnormal Distribution Random Numbers Based on Copula Function 7659, Eglad, UK Joural of Iformato ad Computg Scece Vol. 2, No. 3, 2007, pp. 9-96 Geeratg Multvarate Noormal Dstrbuto Radom Numbers Based o Copula Fucto Xaopg Hu +, Jam He ad Hogsheg Ly School of Ecoomcs

More information

2. Independence and Bernoulli Trials

2. Independence and Bernoulli Trials . Ideedece ad Beroull Trals Ideedece: Evets ad B are deedet f B B. - It s easy to show that, B deedet mles, B;, B are all deedet ars. For examle, ad so that B or B B B B B φ,.e., ad B are deedet evets.,

More information

Econometric Methods. Review of Estimation

Econometric Methods. Review of Estimation Ecoometrc Methods Revew of Estmato Estmatg the populato mea Radom samplg Pot ad terval estmators Lear estmators Ubased estmators Lear Ubased Estmators (LUEs) Effcecy (mmum varace) ad Best Lear Ubased Estmators

More information

Point Estimation: definition of estimators

Point Estimation: definition of estimators Pot Estmato: defto of estmators Pot estmator: ay fucto W (X,..., X ) of a data sample. The exercse of pot estmato s to use partcular fuctos of the data order to estmate certa ukow populato parameters.

More information

Chapter 11 The Analysis of Variance

Chapter 11 The Analysis of Variance Chapter The Aalyss of Varace. Oe Factor Aalyss of Varace. Radomzed Bloc Desgs (ot for ths course) NIPRL . Oe Factor Aalyss of Varace.. Oe Factor Layouts (/4) Suppose that a expermeter s terested populatos

More information

Analysis of Variance with Weibull Data

Analysis of Variance with Weibull Data Aalyss of Varace wth Webull Data Lahaa Watthaacheewaul Abstract I statstcal data aalyss by aalyss of varace, the usual basc assumptos are that the model s addtve ad the errors are radomly, depedetly, ad

More information

Third handout: On the Gini Index

Third handout: On the Gini Index Thrd hadout: O the dex Corrado, a tala statstca, proposed (, 9, 96) to measure absolute equalt va the mea dfferece whch s defed as ( / ) where refers to the total umber of dvduals socet. Assume that. The

More information

1 Mixed Quantum State. 2 Density Matrix. CS Density Matrices, von Neumann Entropy 3/7/07 Spring 2007 Lecture 13. ψ = α x x. ρ = p i ψ i ψ i.

1 Mixed Quantum State. 2 Density Matrix. CS Density Matrices, von Neumann Entropy 3/7/07 Spring 2007 Lecture 13. ψ = α x x. ρ = p i ψ i ψ i. CS 94- Desty Matrces, vo Neuma Etropy 3/7/07 Sprg 007 Lecture 3 I ths lecture, we wll dscuss the bascs of quatum formato theory I partcular, we wll dscuss mxed quatum states, desty matrces, vo Neuma etropy

More information

Investigation of Partially Conditional RP Model with Response Error. Ed Stanek

Investigation of Partially Conditional RP Model with Response Error. Ed Stanek Partally Codtoal Radom Permutato Model 7- vestgato of Partally Codtoal RP Model wth Respose Error TRODUCTO Ed Staek We explore the predctor that wll result a smple radom sample wth respose error whe a

More information

1. BLAST (Karlin Altschul) Statistics

1. BLAST (Karlin Altschul) Statistics Parwse seuece algmet global ad local Multple seuece algmet Substtuto matrces Database searchg global local BLAST Seuece statstcs Evolutoary tree recostructo Gee Fdg Prote structure predcto RNA structure

More information

6.867 Machine Learning

6.867 Machine Learning 6.867 Mache Learg Problem set Due Frday, September 9, rectato Please address all questos ad commets about ths problem set to 6.867-staff@a.mt.edu. You do ot eed to use MATLAB for ths problem set though

More information

Logistic regression (continued)

Logistic regression (continued) STAT562 page 138 Logstc regresso (cotued) Suppose we ow cosder more complex models to descrbe the relatoshp betwee a categorcal respose varable (Y) that takes o two (2) possble outcomes ad a set of p explaatory

More information

C-1: Aerodynamics of Airfoils 1 C-2: Aerodynamics of Airfoils 2 C-3: Panel Methods C-4: Thin Airfoil Theory

C-1: Aerodynamics of Airfoils 1 C-2: Aerodynamics of Airfoils 2 C-3: Panel Methods C-4: Thin Airfoil Theory ROAD MAP... AE301 Aerodyamcs I UNIT C: 2-D Arfols C-1: Aerodyamcs of Arfols 1 C-2: Aerodyamcs of Arfols 2 C-3: Pael Methods C-4: Th Arfol Theory AE301 Aerodyamcs I Ut C-3: Lst of Subects Problem Solutos?

More information

Bootstrap Method for Testing of Equality of Several Coefficients of Variation

Bootstrap Method for Testing of Equality of Several Coefficients of Variation Cloud Publcatos Iteratoal Joural of Advaced Mathematcs ad Statstcs Volume, pp. -6, Artcle ID Sc- Research Artcle Ope Access Bootstrap Method for Testg of Equalty of Several Coeffcets of Varato Dr. Navee

More information

n -dimensional vectors follow naturally from the one

n -dimensional vectors follow naturally from the one B. Vectors ad sets B. Vectors Ecoomsts study ecoomc pheomea by buldg hghly stylzed models. Uderstadg ad makg use of almost all such models requres a hgh comfort level wth some key mathematcal sklls. I

More information

Chapter 8: Statistical Analysis of Simulated Data

Chapter 8: Statistical Analysis of Simulated Data Marquette Uversty MSCS600 Chapter 8: Statstcal Aalyss of Smulated Data Dael B. Rowe, Ph.D. Departmet of Mathematcs, Statstcs, ad Computer Scece Copyrght 08 by Marquette Uversty MSCS600 Ageda 8. The Sample

More information

MATH 247/Winter Notes on the adjoint and on normal operators.

MATH 247/Winter Notes on the adjoint and on normal operators. MATH 47/Wter 00 Notes o the adjot ad o ormal operators I these otes, V s a fte dmesoal er product space over, wth gve er * product uv, T, S, T, are lear operators o V U, W are subspaces of V Whe we say

More information

MAX-MIN AND MIN-MAX VALUES OF VARIOUS MEASURES OF FUZZY DIVERGENCE

MAX-MIN AND MIN-MAX VALUES OF VARIOUS MEASURES OF FUZZY DIVERGENCE merca Jr of Mathematcs ad Sceces Vol, No,(Jauary 0) Copyrght Md Reader Publcatos wwwjouralshubcom MX-MIN ND MIN-MX VLUES OF VRIOUS MESURES OF FUZZY DIVERGENCE RKTul Departmet of Mathematcs SSM College

More information

Statistics Descriptive and Inferential Statistics. Instructor: Daisuke Nagakura

Statistics Descriptive and Inferential Statistics. Instructor: Daisuke Nagakura Statstcs Descrptve ad Iferetal Statstcs Istructor: Dasuke Nagakura (agakura@z7.keo.jp) 1 Today s topc Today, I talk about two categores of statstcal aalyses, descrptve statstcs ad feretal statstcs, ad

More information

PROJECTION PROBLEM FOR REGULAR POLYGONS

PROJECTION PROBLEM FOR REGULAR POLYGONS Joural of Mathematcal Sceces: Advaces ad Applcatos Volume, Number, 008, Pages 95-50 PROJECTION PROBLEM FOR REGULAR POLYGONS College of Scece Bejg Forestry Uversty Bejg 0008 P. R. Cha e-mal: sl@bjfu.edu.c

More information

Analysis of Lagrange Interpolation Formula

Analysis of Lagrange Interpolation Formula P IJISET - Iteratoal Joural of Iovatve Scece, Egeerg & Techology, Vol. Issue, December 4. www.jset.com ISS 348 7968 Aalyss of Lagrage Iterpolato Formula Vjay Dahya PDepartmet of MathematcsMaharaja Surajmal

More information

Ordinary Least Squares Regression. Simple Regression. Algebra and Assumptions.

Ordinary Least Squares Regression. Simple Regression. Algebra and Assumptions. Ordary Least Squares egresso. Smple egresso. Algebra ad Assumptos. I ths part of the course we are gog to study a techque for aalysg the lear relatoshp betwee two varables Y ad X. We have pars of observatos

More information

The Effect of Distance between Open-Loop Poles and Closed-Loop Poles on the Numerical Accuracy of Pole Assignment

The Effect of Distance between Open-Loop Poles and Closed-Loop Poles on the Numerical Accuracy of Pole Assignment Proceedgs of the 5th Medterraea Coferece o Cotrol & Automato, July 7-9, 007, Athes - Greece T9-00 The Effect of Dstace betwee Ope-Loop Poles ad Closed-Loop Poles o the Numercal Accuracy of Pole Assgmet

More information

Multiple Linear Regression Analysis

Multiple Linear Regression Analysis LINEA EGESSION ANALYSIS MODULE III Lecture - 4 Multple Lear egresso Aalyss Dr. Shalabh Departmet of Mathematcs ad Statstcs Ida Isttute of Techology Kapur Cofdece terval estmato The cofdece tervals multple

More information

AN UPPER BOUND FOR THE PERMANENT VERSUS DETERMINANT PROBLEM BRUNO GRENET

AN UPPER BOUND FOR THE PERMANENT VERSUS DETERMINANT PROBLEM BRUNO GRENET AN UPPER BOUND FOR THE PERMANENT VERSUS DETERMINANT PROBLEM BRUNO GRENET Abstract. The Permaet versus Determat problem s the followg: Gve a matrx X of determates over a feld of characterstc dfferet from

More information

18.413: Error Correcting Codes Lab March 2, Lecture 8

18.413: Error Correcting Codes Lab March 2, Lecture 8 18.413: Error Correctg Codes Lab March 2, 2004 Lecturer: Dael A. Spelma Lecture 8 8.1 Vector Spaces A set C {0, 1} s a vector space f for x all C ad y C, x + y C, where we take addto to be compoet wse

More information

IJOART. Copyright 2014 SciResPub.

IJOART. Copyright 2014 SciResPub. Iteratoal Joural of Advacemets Research & Techology, Volume 3, Issue 10, October -014 58 Usg webull dstrbuto the forecastg by applyg o real data of the umber of traffc accdets sulama durg the perod (010-013)

More information

Bounds for the Connective Eccentric Index

Bounds for the Connective Eccentric Index It. J. Cotemp. Math. Sceces, Vol. 7, 0, o. 44, 6-66 Bouds for the Coectve Eccetrc Idex Nlaja De Departmet of Basc Scece, Humates ad Socal Scece (Mathematcs Calcutta Isttute of Egeerg ad Maagemet Kolkata,

More information

STA302/1001-Fall 2008 Midterm Test October 21, 2008

STA302/1001-Fall 2008 Midterm Test October 21, 2008 STA3/-Fall 8 Mdterm Test October, 8 Last Name: Frst Name: Studet Number: Erolled (Crcle oe) STA3 STA INSTRUCTIONS Tme allowed: hour 45 mutes Ads allowed: A o-programmable calculator A table of values from

More information

CS286.2 Lecture 4: Dinur s Proof of the PCP Theorem

CS286.2 Lecture 4: Dinur s Proof of the PCP Theorem CS86. Lecture 4: Dur s Proof of the PCP Theorem Scrbe: Thom Bohdaowcz Prevously, we have prove a weak verso of the PCP theorem: NP PCP 1,1/ (r = poly, q = O(1)). Wth ths result we have the desred costat

More information

Lecture Notes Forecasting the process of estimating or predicting unknown situations

Lecture Notes Forecasting the process of estimating or predicting unknown situations Lecture Notes. Ecoomc Forecastg. Forecastg the process of estmatg or predctg ukow stuatos Eample usuall ecoomsts predct future ecoomc varables Forecastg apples to a varet of data () tme seres data predctg

More information

ECE 559: Wireless Communication Project Report Diversity Multiplexing Tradeoff in MIMO Channels with partial CSIT. Hoa Pham

ECE 559: Wireless Communication Project Report Diversity Multiplexing Tradeoff in MIMO Channels with partial CSIT. Hoa Pham ECE 559: Wreless Commucato Project Report Dversty Multplexg Tradeoff MIMO Chaels wth partal CSIT Hoa Pham. Summary I ths project, I have studed the performace ga of MIMO systems. There are two types of

More information

X X X E[ ] E X E X. is the ()m n where the ( i,)th. j element is the mean of the ( i,)th., then

X X X E[ ] E X E X. is the ()m n where the ( i,)th. j element is the mean of the ( i,)th., then Secto 5 Vectors of Radom Varables Whe workg wth several radom varables,,..., to arrage them vector form x, t s ofte coveet We ca the make use of matrx algebra to help us orgaze ad mapulate large umbers

More information

QR Factorization and Singular Value Decomposition COS 323

QR Factorization and Singular Value Decomposition COS 323 QR Factorzato ad Sgular Value Decomposto COS 33 Why Yet Aother Method? How do we solve least-squares wthout currg codto-squarg effect of ormal equatos (A T A A T b) whe A s sgular, fat, or otherwse poorly-specfed?

More information

Measures of Dispersion

Measures of Dispersion Chapter 8 Measures of Dsperso Defto of Measures of Dsperso (page 31) A measure of dsperso s a descrptve summary measure that helps us characterze the data set terms of how vared the observatos are from

More information