Sample Average Approximation with Adaptive Importance Sampling


Andreas Wächter · Jeremy Staum · Alvaro Maggiar · Mingbin Feng

October 9, 2017

Abstract We study sample average approximations under adaptive importance sampling in which the sample densities may depend on previous random samples. Based on a generic uniform law of large numbers, we establish uniform convergence of the sample average approximation to the true function. We obtain convergence of the optimal value and optimal solutions of the sample average approximation. The relevance of this result is demonstrated in the context of the convergence analysis of a randomized optimization algorithm.

Keywords sample average approximation · adaptive importance sampling · likelihood ratio · parametric integration · uniform convergence

1 Introduction

We are interested in minimizing a function $g : X \to \mathbb{R}$ given by

$$g(x) = \int_\Xi F(x, \xi)\, h(x, \xi)\, d\xi \qquad (1)$$

where $F(x, \cdot)$ is measurable for all $x$, and $h(x, \cdot)$ is a probability density function that might depend on $x$. We assume that $X$ is a compact subset of $\mathbb{R}^n$. The integral $g(x)$ can be interpreted as an expectation $\mathbb{E}_x[F(x, \xi)]$ taken under the assumption that $\xi$ is a random vector with density $h(x, \cdot)$. When the integral (1) cannot be computed or is too expensive to evaluate, sample average approximation (SAA) provides a way to obtain an approximation of the minimizer of $g(x)$. In the simplest setting, when the probability distribution does not depend on $x$, that is, $h(x, \xi) = h(\xi)$, this approach consists of minimizing the sample average approximation $\hat g_N(x) = \frac{1}{N}\sum_{i=1}^N F(x, \xi_i)$, where the realizations $\xi_1, \dots, \xi_N$ of the random variable

are drawn from $h(\xi)$. In this case, the set of minimizers of $\hat g_N$ converges to the set of minimizers of $g(x)$ as $N \to \infty$, if $\hat g_N$ converges uniformly to $g$ [19]. To extend this approach, consider the parametric integral

$$g(x) = \int_\Xi G(x, \xi)\, d\xi. \qquad (2)$$

Let $\phi$ be a sampling distribution so that $\phi(\xi) > 0$ for any $\xi$ such that there exists an $x \in X$ with $G(x, \xi) > 0$. Then, when $\{\xi_i\}$ is sampled i.i.d. from $\phi$, the Monte Carlo estimator $\frac{1}{N}\sum_{i=1}^N G(x, \xi_i)/\phi(\xi_i)$ converges a.s. to $g(x)$ for all $x \in X$. In the context of problem (1), define $G(x, \xi) = F(x, \xi)\, h(x, \xi)$. Then the estimator has the form

$$\hat g_N(x) = \frac{1}{N}\sum_{i=1}^N F(x, \xi_i)\, \frac{h(x, \xi_i)}{\phi(\xi_i)}.$$

This approach is known as importance sampling [18]. The sampling density $\phi$ may be different from the target density $h$. Usually, $\phi$ is chosen to reduce the variance of estimating the expectation of $F$.

The key contribution of this paper is that we provide convergence results without assuming that the samples $\xi_i$ are independent and identically distributed. Instead, we study the convergence of the sample average approximation given by

$$\hat g_N(x) = \frac{1}{N}\sum_{i=1}^N \frac{G(x, \xi_i)}{\phi_i(\xi_i)}, \qquad (3)$$

where, for each $i = 1, \dots, N$, $\xi_i$ is sampled from a different importance sampling density $\phi_i$. A sampling density $\phi_i$ might even depend on the previous samples $\xi_1, \dots, \xi_{i-1}$ and is therefore by itself a random variable. This setting is similar to that of adaptive multiple importance sampling [5, 16]. There, however, the estimator uses mixture distributions, a case not considered here.

The pointwise convergence of $\hat g_N(x)$ to $g(x)$ for a single fixed $x$ as the sample size $N$ goes to infinity is by itself of interest and, depending on the choice of $\phi_i$, might be relatively elementary (see Section 4 for two examples). In Section 2, we give conditions under which pointwise convergence leads to uniform convergence of the functions $\hat g_N$ to $g$. This in turn allows us to establish the convergence of the optimal solutions of the sample average approximation

$$\min_{x \in X} \hat g_N(x) \qquad (4)$$

to the optimal solutions of the original optimization problem

$$\min_{x \in X} g(x). \qquad (5)$$

In Section 3 we extend this to the case when $\hat g_N$ depends on additional random nuisance parameters $z_N$ that converge to a random limit point $z^*$.
Section 5 gives simplified conditions for uniform convergence for the case that all probability distributions are normal. Finally, in Section 6 we apply our results to

prove convergence of the parameters in a quadratic regression model that approximates a stochastic function in the context of a randomized optimization algorithm.

In stochastic optimization, importance sampling has been used, for example, in the context of Benders decomposition [7, 10, 12]. Royset and Polak [17] presented a result on uniform convergence of the sample average approximation when $\xi_1, \dots, \xi_N$ are independently sampled from an identical importance sampling distribution. In their work, both the target and the sampling distributions are assumed to be normal. The convergence of the sample average approximation under non-i.i.d. sampling has been addressed, for example, by Dai et al. [6]. They proved results about convergence of solutions to SAA problems when $\xi_1, \dots, \xi_N$ are neither identically distributed nor independent, but did not discuss uniform convergence of $\hat g_N$ to $g$. Dupačová and Wets [9] proved epi-convergence of $\hat g_N$ to $g$, from which convergence of solutions to SAA problems follows. Their analysis assumes that $\{\phi_i\}$ converges in distribution. A similar result was obtained by Korf and Wets [14]. One of their assumptions is that $\{\xi_i\}$ forms an ergodic process, which may not be easy to verify in many applications. Homem-de-Mello [11] established results on uniform convergence of $\hat g_N$ to $g$, and of solutions to SAA problems, under non-i.i.d. sampling. His results were generalized by Xu [21]. While these papers consider non-i.i.d. sampling, our results are more general since they permit distributions that are adaptively chosen based on the previous samples.

2 Uniform Convergence

To recapitulate with more mathematical detail: let $X$ be a compact subset of $\mathbb{R}^n$, $\Xi$ be a subset of $\mathbb{R}^d$, and $G$ be a function from $X \times \mathbb{R}^d$ to $\mathbb{R}$ whose support is contained in $X \times \Xi$. Let $(\Omega, \mathcal{G}, Q)$ be a probability space on which there is an infinite sequence of random vectors $\{\xi_i\}$, each $\xi_i$ being a $\mathcal{G}$-measurable function from $\Omega$ to $\mathbb{R}^d$. Define $\{\mathcal{F}_i\}$ as the natural filtration of this sequence, i.e., $\mathcal{F}_i$ contains the information in $\xi_1, \dots, \xi_i$.
Suppose that under $Q$, for every $i$, the conditional distribution of $\xi_i$ given $\mathcal{F}_{i-1}$ has a density $\phi_i$. Let $\Xi_i$ represent the support of $\phi_i$; this subset of $\mathbb{R}^d$ can be random. Suppose that $G : X \times \Xi \to \mathbb{R}$ is a real-valued function such that, for all $x \in X$, (2) exists and is finite. We are concerned with uniform convergence, as $N \to \infty$, of the sample average approximation $\hat g_N$ defined by (3) to the function $g$ defined by (2). The following assumption ensures that the ratios in (3) are finite.

Assumption 1 With probability one, for every $i$, $\Xi \subseteq \Xi_i$.

Our strategy is to assume that a pointwise strong law of large numbers applies (Assumption 2), and then to specify a Lipschitz-type condition (Assumption 3) that guarantees that the convergence is uniform.

Assumption 2 For all $x \in X$, w.p.1, $\lim_{N\to\infty} |\hat g_N(x) - g(x)| = 0$.

In Section 4 we discuss two pointwise laws of large numbers, including one in which $\{\xi_i\}$ is neither independently nor identically distributed. The following Lipschitz assumption corresponds to Assumption S-LIP in [11].

Assumption 3 There exists a function $\gamma : \mathbb{R}_+ \to \mathbb{R}$ such that $\lim_{\delta \to 0} \gamma(\delta) = 0$ and, for every $i$, there exists a (random) measurable function $\gamma_i : \Xi_i \to \mathbb{R}$, such that

$$\limsup_{N\to\infty} \frac{1}{N}\sum_{i=1}^N \mathbb{E}[\gamma_i(\xi_i)] < \infty, \qquad (6)$$

and, with probability one,

$$\lim_{N\to\infty} \frac{1}{N}\sum_{i=1}^N \big( \gamma_i(\xi_i) - \mathbb{E}[\gamma_i(\xi_i)] \big) = 0, \qquad (7)$$

and, for all $x, x' \in X$ and $i$, with probability one,

$$\left| \frac{G(x, \xi_i)}{\phi_i(\xi_i)} - \frac{G(x', \xi_i)}{\phi_i(\xi_i)} \right| \le \gamma_i(\xi_i)\, \gamma(\|x - x'\|). \qquad (8)$$

Lipschitz-type conditions similar to (8) are common in uniform convergence results (see, for example, [8, 13, 20]). Together with the compactness of the parameter set, this allows for the extension of pointwise results to uniform ones. The Lipschitz constants are allowed to vary from sample to sample to accommodate a greater variety of sampling distributions, so long as they satisfy the regularity conditions given by (6) and (7). For the case of normal distributions, Section 5 presents conditions that are easier to verify than those above.

The next theorem follows from Theorem 3(b) in [1]. It establishes uniform convergence of the estimator $\hat g_N$ to $g$.

Theorem 1 If Assumptions 1, 2, and 3 hold, then, with probability one, $\lim_{N\to\infty} \|\hat g_N - g\|_\infty = 0$.

Next we consider the convergence of the optimal solutions of the sample average approximation (4) to the optimal solutions of the original problem (5). Let $\hat\vartheta_N$ and $\vartheta^*$ denote the optimal objective values of (4) and (5), respectively. Similarly, let $\hat S_N$ and $S^*$ denote the sets of optimal solutions of (4) and (5), respectively. Finally, we define the distance of a point $x \in X$ to a set $B \subseteq X$ as $\mathrm{dist}(x, B) = \inf_{x' \in B} \|x - x'\|$ and the deviation of a set $A \subseteq X$ from the set $B$ as $\mathbb{D}(A, B) = \sup_{x \in A} \mathrm{dist}(x, B)$.

Theorem 2 Suppose that Assumptions 1, 2, and 3 hold, that (i) $G(\cdot, \xi)$ is lower semi-continuous for all $\xi \in \mathbb{R}^d$, and (ii) that there exists an integrable function $Z(\xi)$ such that $|G(x, \xi)| \le Z(\xi)$ for all $x \in X$ and almost all $\xi \in \Xi$.
Further assume that there exists a compact set $C \subseteq X$ such that $S^*$ is nonempty and contained in $C$, and, with probability one, for $N$ large enough, $\hat S_N$ is non-empty and contained in $C$. Then, with probability one, $\lim_{N\to\infty} \hat\vartheta_N = \vartheta^*$ and $\lim_{N\to\infty} \mathbb{D}(\hat S_N, S^*) = 0$.

Having established the uniform convergence in Theorem 1, the proof of Theorem 2 follows closely the proof of Theorem 5.3 in [19]. (See Appendix A.)
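The sup-norm convergence asserted by Theorem 1 can be observed numerically. The sketch below is our own toy example (i.i.d. sampling with $\phi_i = \phi = h$, so all likelihood ratios equal one): for $F(x, \xi) = (x - \xi)^2$ with $\xi \sim N(0, 1)$, the true function is $g(x) = x^2 + 1$, and the maximal error over a grid on the compact set $X = [-2, 2]$ shrinks as $N$ grows.

```python
import numpy as np

rng = np.random.default_rng(42)
grid = np.linspace(-2.0, 2.0, 201)       # discretization of the compact set X

def sup_error(N):
    """max over the grid of |g_hat_N(x) - g(x)| for one sample of size N."""
    xi = rng.standard_normal(N)          # iid draws, phi = h = N(0, 1)
    g_hat = np.array([np.mean((x - xi) ** 2) for x in grid])
    g_true = grid ** 2 + 1.0
    return float(np.max(np.abs(g_hat - g_true)))

errors = {N: sup_error(N) for N in (1_000, 100_000)}
```

With a fixed seed the sup-norm error at $N = 100{,}000$ is roughly an order of magnitude smaller than at $N = 1{,}000$, consistent with the usual Monte Carlo rate.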

3 Results When Some Parameters Converge

In this section we consider the situation in which the vector $x$ in the parametric integral (2) is partitioned into optimization variables $y$ and nuisance parameters $z$, writing $x = (y, z)$. We provide results relevant to sample average approximation and optimization over $y$ alone, where the sample average approximations are constructed using a convergent sequence of random values of the $z$ parameters. For example, $z$ may represent estimators of statistical parameters, decisions that are updated and converge over time, etc. Section 6 describes an example in which $z$ corresponds to the iterates of a randomized optimization algorithm.

To be mathematically precise, let us assume that, in the framework established in Section 2, $X = Y \times Z$, where $Y \subseteq \mathbb{R}^{n_y}$ and $Z \subseteq \mathbb{R}^{n_z}$ for some $n_y$ and $n_z$ that sum to $n$. Further suppose there is a sequence of random vectors $\{Z_N\}_{N=1}^\infty$, each $Z_N$ being a $\mathcal{G}$-measurable function from $\Omega$ to $\mathbb{R}^{n_z}$. This sequence need not be adapted to the filtration $\{\mathcal{F}_i\}$. We analyze problems in which this sequence converges to a limiting random variable $Z^*$.

Assumption 4 There exists a random variable $Z^*$ such that $\lim_{N\to\infty} \|Z_N - Z^*\| = 0$ with probability one.

We study the convergence of the sample average approximations $\hat g_N^Z : \Omega \to L^\infty(Y)$ given by $\hat g_N^Z(y) = \frac{1}{N}\sum_{i=1}^N \frac{G(y, Z_N, \xi_i)}{\phi_i(\xi_i)}$ to the function $g^Z : \Omega \to L^\infty(Y)$ given by $g^Z(y) = g(y, Z^*)$. The following result is a generalization of Theorem 1 in this context. Here, Assumptions 1, 2, and 3 refer to $G : X \times \Xi \to \mathbb{R}$ with $X = Y \times Z$ and $x = (y, z) \in Y \times Z$.

Theorem 3 If Assumptions 1, 2, 3, and 4 hold, then, with probability one, $\lim_{N\to\infty} \|\hat g_N^Z - g^Z\|_\infty = 0$.

Proof We have

$$\|\hat g_N^Z - g^Z\|_\infty = \sup_{y \in Y} \left| \frac{1}{N}\sum_{i=1}^N \frac{G(y, Z_N, \xi_i)}{\phi_i(\xi_i)} - g(y, Z^*) \right|$$

$$\le \sup_{y \in Y} \frac{1}{N}\sum_{i=1}^N \left| \frac{G(y, Z_N, \xi_i) - G(y, Z^*, \xi_i)}{\phi_i(\xi_i)} \right| + \sup_{y \in Y} \left| \frac{1}{N}\sum_{i=1}^N \frac{G(y, Z^*, \xi_i)}{\phi_i(\xi_i)} - g(y, Z^*) \right|$$

$$\overset{(8)}{\le} \left( \frac{1}{N}\sum_{i=1}^N \gamma_i(\xi_i) \right) \gamma(\|Z_N - Z^*\|) + \sup_{y \in Y} \left| \frac{1}{N}\sum_{i=1}^N \frac{G(y, Z^*, \xi_i)}{\phi_i(\xi_i)} - g(y, Z^*) \right|. \qquad (9)$$

By Theorem 1, the second term converges to zero. For the first term, we see that

$$\frac{1}{N}\sum_{i=1}^N \gamma_i(\xi_i) = \frac{1}{N}\sum_{i=1}^N \big( \gamma_i(\xi_i) - \mathbb{E}[\gamma_i(\xi_i)] \big) + \frac{1}{N}\sum_{i=1}^N \mathbb{E}[\gamma_i(\xi_i)]$$

where, by Assumption 3, the first term converges to zero and the second term is bounded. Since $Z_N$ converges to $Z^*$, we have from the continuity of $\gamma$ at $0$ that $\gamma(\|Z_N - Z^*\|) \to 0$. Hence, also the first term in (9) converges to zero.

Finally, in analogy to (5) and (4), we consider the optimization problem $\vartheta^Z := \min_{y \in Y} g^Z(y)$ and its sample average approximation $\hat\vartheta_N^Z := \min_{y \in Y} \hat g_N^Z(y)$. Let $S^Z$ and $\hat S_N^Z$ denote the sets of minimizers of $g^Z$ and $\hat g_N^Z$, respectively. Theorem 4 follows from Theorem 3 in the same way that Theorem 2 follows from Theorem 1.

Theorem 4 Suppose that Assumptions 1, 2, 3, and 4 hold, that (i) $G(\cdot, \xi)$ is lower semi-continuous for all $\xi \in \mathbb{R}^d$, and (ii) that there exists an integrable function $Z(\xi)$ such that $|G(y, z, \xi)| \le Z(\xi)$ for all $(y, z) \in Y \times Z$ and almost all $\xi \in \Xi$. Further assume that there exists a compact set $C \subseteq Y$ such that, with probability one, $S^Z$ is non-empty and contained in $C$ and, for $N$ large enough, $\hat S_N^Z$ is non-empty and contained in $C$. Then, with probability one, $\lim_{N\to\infty} \hat\vartheta_N^Z = \vartheta^Z$ and $\lim_{N\to\infty} \mathbb{D}(\hat S_N^Z, S^Z) = 0$.

4 Pointwise Strong Laws of Large Numbers

In this section, we give two examples of theorems that imply the pointwise convergence required in Assumption 2. The first is the well-known strong law of large numbers for independent and identically distributed random variables. It follows, for example, from Theorem 6.1 in [2], using the fact that $\phi$ is the density for $\xi_i$, and therefore $\mathbb{E}\left[ \frac{G(x,\xi_i)}{\phi(\xi_i)} \right] = g(x)$. We however need the following assumption on the measurability of $G(x, \cdot)$.

Assumption 5 For all $x \in X$, $G(x, \cdot)$ is a measurable function on $\mathbb{R}^d$ and $g(x) < \infty$.

Theorem 5 Suppose Assumptions 1 and 5 hold. If $\{\xi_i\}$ are independent and identically distributed (i.e., $\phi_i = \phi$ for all $i$), then for all $x \in X$, with probability one, $\lim_{N\to\infty} |\hat g_N(x) - g(x)| = 0$.

Next we establish a pointwise strong law of large numbers for the case in which $\{\xi_i\}$ are neither independently nor identically distributed.

Assumption 6 There exist non-negative constants $k$ and $b$ such that, with probability one, for all $i$, $x \in X$, and $\xi \in \Xi_i$, $\left| \frac{G(x,\xi)}{\phi_i(\xi)} \right| \le k \exp(b\|\xi\|)$.
Assumptions on the unconditional moment generating function of $F(x, \xi)$ in (1), for each $x \in X$, are common in this type of analysis [6, 11, 21]. In Assumption 7, we focus instead on the moment generating function $M_i$ of the conditional distribution of $\xi_i$ given $\mathcal{F}_{i-1}$, defined as

$$M_i(s) = \mathbb{E}[\exp(s\|\xi_i\|) \mid \mathcal{F}_{i-1}] = \int_{\Xi_i} \exp(s\|\xi\|)\, \phi_i(\xi)\, d\xi.$$

Note that $M_i$ is a random function.

Assumption 7 There exists $\alpha \ge 1$ such that $\sum_{i=1}^\infty i^{-(1+\alpha)}\, \mathbb{E}[M_i(2\alpha b)] < \infty$, where $b$ is as in Assumption 6.

In Section 5 we show that Assumption 7 is satisfied when the densities $\phi_i$ are normal distributions with bounded means.

Theorem 6 Suppose Assumptions 1, 5, 6, and 7 hold. Then for all $x \in X$, with probability one, $\lim_{N\to\infty} |\hat g_N(x) - g(x)| = 0$.

The proof requires a simple relationship that is easy to show.

Lemma 1 Given $a, c \in \mathbb{R}$ and $r \ge 1$, we have $|a + c|^r \le 2^{r-1}\big( |c|^r + |a|^r \big)$.

Proof (Proof of Theorem 6) For a given fixed $x \in X$ and all $i$, define $U_i = \frac{G(x,\xi_i)}{\phi_i(\xi_i)} - g(x)$ and $V_N = \sum_{i=1}^N U_i$, so that $\hat g_N(x) - g(x) = V_N / N$. The claim of the theorem follows from Chow's strong law of large numbers for martingales (see [3]), which states that $V_N / N \to 0$ with probability one. The remainder of this proof verifies that our setting satisfies the conditions for the theorem in [3]. The conditions are that $\{V_N\}$ be a martingale whose increments satisfy Chung's condition (Equation (3.1) in [4]); that is, that there exists $\alpha \ge 1$ such that $\sum_{i=1}^\infty i^{-(1+\alpha)}\, \mathbb{E}[|U_i|^{2\alpha}] < \infty$.

To see that $\{V_N\}$ is a martingale, recall that $\phi_i$ is the conditional density of $\xi_i$, and therefore $\mathbb{E}[U_i \mid \mathcal{F}_{i-1}] = 0$ for all $i$ with probability one. Letting $a = U_i + g(x) = \frac{G(x,\xi_i)}{\phi_i(\xi_i)}$, $c = -g(x)$, and $r = 2\alpha$ in Lemma 1, we find

$$\mathbb{E}\big[ |U_i|^{2\alpha} \big] \le C \left( 1 + \mathbb{E}\left[ \left( \frac{G(x,\xi_i)}{\phi_i(\xi_i)} \right)^{2\alpha} \right] \right),$$

where $C = 2^{2\alpha-1}\big( 1 + |g(x)|^{2\alpha} \big)$. Assumption 6 then yields

$$\mathbb{E}\left[ \left| \frac{G(x, \xi_i)}{\phi_i(\xi_i)} \right|^{2\alpha} \right] = \mathbb{E}\left[ \mathbb{E}\left[ \left| \frac{G(x, \xi_i)}{\phi_i(\xi_i)} \right|^{2\alpha} \,\middle|\, \mathcal{F}_{i-1} \right] \right] \le \mathbb{E}\big[ k^{2\alpha}\, \mathbb{E}[\exp(2\alpha b \|\xi_i\|) \mid \mathcal{F}_{i-1}] \big] = k^{2\alpha}\, \mathbb{E}[M_i(2\alpha b)].$$

Since $1 + \alpha > 1$, we have $\sum_{i=1}^\infty i^{-(1+\alpha)} < \infty$, and with Assumption 7,

$$\sum_{i=1}^\infty i^{-(1+\alpha)}\, \mathbb{E}[|U_i|^{2\alpha}] \le C \sum_{i=1}^\infty i^{-(1+\alpha)} + k^{2\alpha} \sum_{i=1}^\infty i^{-(1+\alpha)}\, \mathbb{E}[M_i(2\alpha b)] < \infty.$$

Hence, Chung's condition holds.

5 Normal Distributions and Smooth Functions

Assumption 3 is stated in very general terms. Now we present specific conditions that are easier to verify. We consider the case in which all density functions correspond to normal distributions with different means $\mu$ and variances $\sigma^2$, so they are of the form

$$\varphi(\mu, \sigma, \xi) = \frac{1}{(2\pi\sigma^2)^{d/2}}\, \exp\left( -\frac{\|\xi - \mu\|^2}{2\sigma^2} \right). \qquad (10)$$
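For densities of the form (10), the likelihood ratio $h(x, \xi)/\phi_i(\xi)$ grows at most subexponentially in $\|\xi\|$ when $\sigma_i \ge \sigma$ and the means are bounded (this is the content of the lemma established in this section). The following sanity check is our own: the constants $\sigma = 1$, $\sigma_i = 1.5$, the bound $M = 2$ on $|x|$ and $|\mu_i|$, and the univariate setting ($d = 1$) are hypothetical choices, and the envelope constants follow the quadratic-expansion argument of the proof.

```python
import itertools
import math

def log_ratio(x, mu, sigma, sigma_i, xi):
    """log( h(x, xi) / phi_i(xi) ) for univariate normal densities of the form (10)."""
    log_h = -0.5 * ((xi - x) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))
    log_phi = -0.5 * ((xi - mu) / sigma_i) ** 2 - math.log(sigma_i * math.sqrt(2.0 * math.pi))
    return log_h - log_phi

sigma, sigma_i, M = 1.0, 1.5, 2.0   # sigma_i >= sigma; |x|, |mu_i| <= M (hypothetical)
k_bar = M ** 2 / (2.0 * sigma_i ** 2) + math.log(sigma_i / sigma)  # constant of the envelope
b_h = M / sigma ** 2 + M / sigma_i ** 2                            # linear growth rate

centers = [0.1 * k for k in range(-20, 21)]   # x and mu_i ranging over [-M, M]
xis = [0.5 * k for k in range(-100, 101)]     # xi ranging over [-50, 50]
violation = max(
    log_ratio(x, mu, sigma, sigma_i, xi) - (k_bar + b_h * abs(xi))
    for x, mu, xi in itertools.product(centers, centers, xis)
)
# violation <= 0: the log likelihood ratio stays below the linear envelope k_bar + b_h*|xi|
```

The check works because, for $\sigma_i \ge \sigma$, the quadratic term of the log ratio has a non-positive coefficient, so only the linear and constant terms survive in the upper bound.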

Assumption 8 Let $\Xi = \mathbb{R}^d$, and for all $x \in X$ and $\xi \in \mathbb{R}^d$ we have $h(x, \xi) = \varphi(x, \sigma, \xi)$ for some $\sigma > 0$. Furthermore, for all $i$ and $\xi \in \mathbb{R}^d$, we have $\Xi_i = \mathbb{R}^d$ and $\phi_i(\xi) = \varphi(\mu_i, \sigma_i, \xi)$ for some random variables $\mu_i \in \mathbb{R}^d$ and $\sigma_i \ge \sigma$. The sequences $\{\mu_i\}$ and $\{\sigma_i\}$ are uniformly bounded with probability one.

Under this assumption, the moment generating functions

$$M_i(s) = \int \exp(s\|\xi\|)\, \phi_i(\xi)\, d\xi = \int \frac{1}{(2\pi\sigma_i^2)^{d/2}} \exp\left( s\|\xi\| - \frac{\|\xi - \mu_i\|^2}{2\sigma_i^2} \right) d\xi$$

are uniformly bounded for fixed $s$, and Assumption 7 holds (for any values of $\alpha \ge 1$ and $b > 0$). Furthermore, the following lemma establishes that the likelihood ratio has subexponential growth.

Lemma 2 Suppose Assumption 8 holds. Then there exist constants $k_h, b_h \ge 0$ so that $\frac{h(x,\xi)}{\phi_i(\xi)} \le k_h \exp(b_h \|\xi\|)$ for all $i$, $x \in X$, and $\xi \in \Xi$.

Proof Choose any $i$, $x \in X$, and $\xi \in \Xi$. Then

$$\log\left( \frac{h(x, \xi)}{\phi_i(\xi)} \right) \overset{(10)}{=} \frac{d}{2}\log\frac{\sigma_i^2}{\sigma^2} - \frac{\|\xi - x\|^2}{2\sigma^2} + \frac{\|\xi - \mu_i\|^2}{2\sigma_i^2}$$

$$= \frac{d}{2}\log\frac{\sigma_i^2}{\sigma^2} + \left[ \frac{1}{2\sigma_i^2} - \frac{1}{2\sigma^2} \right] \|\xi\|^2 + \left\langle \frac{x}{\sigma^2} - \frac{\mu_i}{\sigma_i^2},\, \xi \right\rangle + \frac{\|\mu_i\|^2}{2\sigma_i^2} - \frac{\|x\|^2}{2\sigma^2}. \qquad (11)$$

By Assumption 8, $\sigma_i \ge \sigma$, and the term in the square brackets is non-positive. Because $X$ is compact and $\mu_i$ is bounded by Assumption 8, there exist positive constants $\bar k$ and $b_h$ so that for all $i$, $x \in X$, and $\xi \in \Xi$, we have $\log\left( \frac{h(x,\xi)}{\phi_i(\xi)} \right) \le \bar k + b_h \|\xi\|$. The claim of Lemma 2 follows with $k_h = \exp(\bar k)$.

We also require some differentiability properties of $F$.

Assumption 9 Suppose that $F$ in (1) is continuously differentiable in $x$ for any $\xi \in \Xi$, and that there exist $k_F, b_F > 0$ so that for any $x \in X$ and $\xi \in \Xi \subseteq \mathbb{R}^d$,

$$|F(x, \xi)| \le k_F \exp(b_F \|\xi\|) \qquad (12a)$$

and

$$\|\nabla_x F(x, \xi)\| \le k_F \exp(b_F \|\xi\|). \qquad (12b)$$

Here, $\nabla_x$ denotes the gradient with respect to $x$.

A consequence of the final proposition is that the claims of Theorems 1, 2, 3, and 4 hold under Assumptions 8 and 9.

Proposition 1 If Assumptions 8 and 9 hold, then Assumptions 1, 2, and 3 hold for $G(x, \xi) = F(x, \xi)\, h(x, \xi)$.

Proof Suppose the assumptions of Proposition 1 hold. Assumption 8 implies Assumption 1, and Assumption 9 implies Assumption 5. We already argued above that Assumption 7 holds because of Assumption 8. Assumption 6 holds, since for any $i$, $x \in X$, and $\xi \in \Xi$,

$$\left| \frac{G(x, \xi)}{\phi_i(\xi)} \right| = |F(x, \xi)|\, \frac{h(x, \xi)}{\phi_i(\xi)} \overset{(12a)}{\le} k_F \exp(b_F\|\xi\|)\, k_h \exp(b_h\|\xi\|),$$

where we used Lemma 2. Therefore, Theorem 6 implies that Assumption 2 holds.

It remains to prove that Assumption 3 is also implied. Note that $\nabla_x h(x, \xi) = \frac{1}{\sigma^2}\, h(x, \xi)\, (\xi - x)$ for all $x, \xi \in \mathbb{R}^d$. Using this and the mean value theorem, we have for all $i$ and $x, x' \in X$ that

$$\frac{G(x, \xi_i)}{\phi_i(\xi_i)} - \frac{G(x', \xi_i)}{\phi_i(\xi_i)} = \frac{1}{\phi_i(\xi_i)}\, \langle \nabla_x G(\tilde x, \xi_i),\, x - x' \rangle = \frac{1}{\phi_i(\xi_i)}\, \langle \nabla_x F(\tilde x, \xi_i)\, h(\tilde x, \xi_i) + F(\tilde x, \xi_i)\, \nabla_x h(\tilde x, \xi_i),\, x - x' \rangle$$

$$= \frac{h(\tilde x, \xi_i)}{\phi_i(\xi_i)}\, \left\langle \nabla_x F(\tilde x, \xi_i) + \tfrac{1}{\sigma^2} F(\tilde x, \xi_i)\, (\xi_i - \tilde x),\, x - x' \right\rangle \qquad (13)$$

for some $\tilde x \in \{\lambda x + (1 - \lambda) x' : \lambda \in (0, 1)\}$. With $M_x = \max\{\|x\| : x \in X\} < \infty$, we find

$$\left\| \nabla_x F(\tilde x, \xi_i) + \tfrac{1}{\sigma^2} F(\tilde x, \xi_i)\, (\xi_i - \tilde x) \right\| \le \frac{\sigma^2 + M_x + 1}{\sigma^2}\, k_F \exp\big( (b_F + 1)\, \|\xi_i\| \big) \qquad (14)$$

where we used Assumption 9 and $\|\xi_i\| \le \exp(\|\xi_i\|)$. Using similar arguments as in (11), we have, with an arbitrary but fixed $\hat x \in X$ and all $i$, that

$$\log\left( \frac{h(\tilde x, \xi_i)}{h(\hat x, \xi_i)} \right) = \frac{1}{2\sigma^2}\left( \|\hat x\|^2 - \|\tilde x\|^2 + 2\, \langle \tilde x - \hat x,\, \xi_i \rangle \right) \le \frac{M_x^2}{2\sigma^2} + \frac{2 M_x}{\sigma^2}\, \|\xi_i\|,$$

so

$$\frac{h(\tilde x, \xi_i)}{h(\hat x, \xi_i)} \le \exp\left( \frac{M_x^2}{2\sigma^2} \right) \exp\left( \frac{2 M_x\, \|\xi_i\|}{\sigma^2} \right).$$

Combining this with (13) and (14), we have

$$\left| \frac{G(x, \xi_i)}{\phi_i(\xi_i)} - \frac{G(x', \xi_i)}{\phi_i(\xi_i)} \right| \le \frac{h(\hat x, \xi_i)}{\phi_i(\xi_i)}\, k_G \exp(b_G \|\xi_i\|)\, \|x - x'\|$$

with $k_G = \frac{(\sigma^2 + M_x + 1)\, k_F}{\sigma^2} \exp\left( \frac{M_x^2}{2\sigma^2} \right)$ and $b_G = b_F + 1 + \frac{2 M_x}{\sigma^2}$. Defining $\gamma_i(\xi_i) = k_G \exp(b_G \|\xi_i\|)\, \frac{h(\hat x, \xi_i)}{\phi_i(\xi_i)}$, it remains to show that (6) and (7) hold.

We are now going to apply Theorem 6 to the function $G_\gamma(\hat x, \xi) = k_G \exp(b_G \|\xi\|)\, h(\hat x, \xi)$ with $X_\gamma = \{\hat x\}$. For this, note that $g_\gamma(\hat x)$, defined as

$$g_\gamma(\hat x) := \int_\Xi G_\gamma(\hat x, \xi)\, d\xi = \int_\Xi \frac{G_\gamma(\hat x, \xi)}{\phi_i(\xi)}\, \phi_i(\xi)\, d\xi = \int_\Xi \gamma_i(\xi)\, \phi_i(\xi)\, d\xi = \mathbb{E}[\gamma_i(\xi_i)],$$

is finite. The last equality follows because $\xi_i$ is sampled from density $\phi_i$. Therefore, (6) holds, and Assumption 5 holds for $G = G_\gamma$. Further consider $\hat g_{\gamma,N}(\hat x) := \frac{1}{N}\sum_{i=1}^N \frac{G_\gamma(\hat x, \xi_i)}{\phi_i(\xi_i)} = \frac{1}{N}\sum_{i=1}^N \gamma_i(\xi_i)$. From the definition

of $G_\gamma$ and Lemma 2, we have for any $\xi \in \Xi$ that $\frac{G_\gamma(\hat x, \xi)}{\phi_i(\xi)} = \frac{h(\hat x, \xi)}{\phi_i(\xi)}\, k_G \exp(b_G\|\xi\|) \le k_h k_G \exp((b_h + b_G)\|\xi\|)$. Therefore, Assumption 6 holds for $G = G_\gamma$, and using Theorem 6 we obtain

$$0 = \lim_{N\to\infty} |\hat g_{\gamma,N}(\hat x) - g_\gamma(\hat x)| = \lim_{N\to\infty} \left| \frac{1}{N}\sum_{i=1}^N \big( \gamma_i(\xi_i) - \mathbb{E}[\gamma_i(\xi_i)] \big) \right|,$$

which is (7).

6 Example: Regression Models for Step Computation in an Optimization Algorithm

As an illustration in which the importance sampling is adaptive and nuisance parameters are present, we consider the randomized optimization algorithm proposed by Maggiar et al. [15], in which a local model of the objective is constructed via an SAA regression problem in every iteration. The algorithm in [15] addresses the minimization of the function $\mathcal{L} : Z \to \mathbb{R}$ given by $\mathcal{L}(z) = \int_\Xi L(\xi)\, h(z, \xi)\, d\xi$, where $Z \subseteq \mathbb{R}^d$ is a compact set, $\Xi = \mathbb{R}^d$, and $h(y, \xi) = \varphi(y, \sigma, \xi)$ is the normal density with mean $y$ and variance $\sigma^2$. The integral is finite because $L : \mathbb{R}^d \to \mathbb{R}$ is assumed to exhibit subexponential growth. $L(\xi)$ is the output of a deterministic computer simulation with input $\xi$ and is the original objective function one would like to minimize. However, since $L$ is subject to numerical noise and therefore discontinuous, the task of minimizing $L$ is ill-defined. To overcome this difficulty, [15] proposes to minimize the convolution $\mathcal{L}$ as a smooth approximation of $L$. The derivative-free trust-region optimization algorithm proposed in [15] utilizes an SAA approximation

$$\mathcal{L}_N(z) = \frac{1}{N}\sum_{i=1}^N L(\xi_i)\, \frac{\varphi(z, \sigma, \xi_i)}{\varphi(T_i, \sigma, \xi_i)}$$

of $\mathcal{L}$. The points $\xi_i$ are sampled randomly according to the normal pdf $\varphi(T_i, \sigma, \cdot)$, where the mean $T_i$ is either an iterate or a trial point encountered by the algorithm up to iteration $i$. Note that the likelihood ratio in the definition of $\mathcal{L}_N(z)$ has the form of that in (3) and therefore falls into our framework.

Given an iterate $z \in Z$, the optimization algorithm generates a trial point as the minimizer of a quadratic model within a ball around $z$. The model has the form $q(\xi; z) = b + \langle g, \xi - z \rangle + \tfrac{1}{2}\langle \xi - z, Q(\xi - z) \rangle$, with coefficients $b \in \mathbb{R}$ and $g \in \mathbb{R}^d$. The matrix $Q \in \mathbb{R}^{d \times d}$ is symmetric, and $q(\xi; z)$ should approximate the simulation output $L(\xi)$ for $\xi$ close to $z$.
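The weighted regression problem that determines the model coefficients is a linear least-squares problem in $(b, g, Q)$ once the likelihood ratios are treated as observation weights. The sketch below is our own one-dimensional toy, not code from [15]: the stand-in simulation output $L$, the past points $T_i$, and all constants are hypothetical, and with $d = 1$ the Hessian coefficient $Q$ is a scalar. It solves the weighted least-squares problem via `np.linalg.lstsq` applied to square-root-weighted data.

```python
import numpy as np

rng = np.random.default_rng(7)
sigma, z = 0.5, 1.0                          # sampling std dev and current iterate
L = lambda xi: np.sin(xi) + xi ** 2          # hypothetical stand-in "simulation output"

def npdf(xi, mu, s):
    """Univariate normal density."""
    return np.exp(-0.5 * ((xi - mu) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

N = 20_000
t = rng.uniform(0.5, 1.5, size=N)            # past iterates / trial points T_i
xi = rng.normal(t, sigma)                    # xi_i ~ phi(T_i, sigma, .)
w = npdf(xi, z, sigma) / npdf(xi, t, sigma)  # likelihood ratios phi(z,.)/phi(T_i,.)

# design matrix for the model q(xi; z) = b + g*(xi - z) + 0.5*Q*(xi - z)^2
d = xi - z
A = np.column_stack([np.ones_like(d), d, 0.5 * d ** 2])
sw = np.sqrt(w)                              # weighted least squares via sqrt-weights
(b, g, Q), *_ = np.linalg.lstsq(A * sw[:, None], L(xi) * sw, rcond=None)
# (b, g, Q) approximate the Gaussian-weighted quadratic fit of L around z
```

As the sample size grows, the fitted coefficients approach the minimizers of the Gaussian-weighted regression objective, i.e., roughly a local quadratic approximation of $L$ near $z$; the quadratic part of this toy $L$ is recovered exactly, while the $\sin$ part contributes its Gaussian-smoothed projection.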
Convergence of the optimization algorithm would follow if the model parameters are computed by a weighted local regression of $L$; that is, if $y^* = (b^*, g^*, Q^*)$ are the minimizers of

$$\min_{y \in Y} \int_\Xi F(y, z, \xi)\, h(z, \xi)\, d\xi, \qquad (15)$$

where $F(y, z, \xi) = \big( b + \langle g, \xi - z \rangle + \tfrac{1}{2}\langle \xi - z, Q(\xi - z) \rangle - L(\xi) \big)^2$. This objective function has the form of (1). (In abuse of notation, we collect the model parameters $b$, $g$, and $Q$ in the vector $y$.) To get an approximate solution of (15) at an iterate $Z_N$ (using an upper-case letter to emphasize its stochastic nature), the optimization algorithm computes the quadratic model from the stochastic average approximation of

(15); that is,

$$\min_{y \in Y}\; \frac{1}{N}\sum_{i=1}^N \frac{\varphi(Z_N, \sigma, \xi_i)}{\varphi(T_i, \sigma, \xi_i)}\, \big( b + \langle g, \xi_i - Z_N \rangle + \tfrac{1}{2}\langle \xi_i - Z_N, Q(\xi_i - Z_N) \rangle - L(\xi_i) \big)^2. \qquad (16)$$

The analysis of the algorithm in [15] requires that the model $q(\xi; Z_N)$ converges to the optimal solution of (15) at any limit point $Z^*$ of the iterates $Z_N$. This can be proved using the results in Section 3. For any $\omega \in \Omega$, let $\{Z_{N_j}(\omega)\}_{j=1}^\infty$ be a subsequence of iterates such that $\{Z_{N_j}(\omega)\}_{j=1}^\infty$ converges to a limit point $Z^*(\omega)$. Such a subsequence exists due to compactness of $Z$; thus Assumption 4 holds. Furthermore, since $F(y, z, \xi)$ is a polynomial in $(y, z)$ and $L$ exhibits subexponential growth, Assumption 9 holds. Also, because all iterates and trial points are contained in $Z$, the sequence $\{T_i\}$, consisting of such points, is uniformly bounded. Finally, the algorithm in [15] ensures that the optimal solutions of (15) and (16) are unique and uniformly bounded, by monitoring the condition number of matrices involved in the computation of the optimal solution of (16). In summary, Assumptions 4 and 9 hold, and Proposition 1 together with Theorem 4 yields $\lim_{N\to\infty} \mathbb{D}(\hat S_N^Z, S^Z) = 0$. So, the approximate model parameters in $\hat S_N^Z$ in iteration $N$ converge to the optimal parameters in $S^Z$.

7 Conclusion

We considered the sample average approximation of stochastic optimization problems whose objective function is expressed as a parametric integral. The key contribution is that we permit non-independent, non-identical, and adaptive sampling, where the importance sampling distribution may depend on previous samples. Under the assumption of pointwise convergence and a stochastic Lipschitz condition, we proved uniform convergence of the sample average approximation of the parametric integral over a compact set, as well as convergence of the optimal values and optimal solution sets of the sample average approximation problems as the number of samples goes to infinity.

Acknowledgments We thank Tito Homem-de-Mello, David Morton, Imry Rosenbaum, and Johannes Royset for discussions.

References

1. Andrews, D.W.: Generic uniform convergence. Econometric Theory 8(2) (1992)
2. Billingsley, P.: Probability and Measure, 3rd edn.
John Wiley & Sons (1995)
3. Chow, Y.S.: On a strong law of large numbers for martingales. The Annals of Mathematical Statistics 38 (1967)

4. Chung, K.L.: The strong law of large numbers. In: Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, 1950. University of California Press, Berkeley and Los Angeles (1951)
5. Cornuet, J.M., Marin, J.M., Mira, A., Robert, C.P.: Adaptive multiple importance sampling. Scandinavian Journal of Statistics 39 (2012)
6. Dai, L., Chen, C.H., Birge, J.R.: Convergence properties of two-stage stochastic programming. Journal of Optimization Theory and Applications 106(3) (2000)
7. Dantzig, G.B., Glynn, P.W.: Parallel processors for planning under uncertainty. Annals of Operations Research 22 (1990)
8. Duffie, D., Singleton, K.J.: Simulated moments estimation of Markov models of asset prices (1990)
9. Dupačová, J., Wets, R.: Asymptotic behavior of statistical estimators and of optimal solutions of stochastic optimization problems. The Annals of Statistics 16 (1988)
10. Glynn, P.W., Infanger, G.: Simulation-based confidence bounds for two-stage stochastic programs. Mathematical Programming 138(1) (2013)
11. Homem-de-Mello, T.: On rates of convergence for stochastic optimization problems under non-independent and identically distributed sampling. SIAM Journal on Optimization 19(2) (2008)
12. Infanger, G.: Monte Carlo (importance) sampling within a Benders decomposition algorithm for stochastic linear programs. Annals of Operations Research 39(1) (1992)
13. Jenish, N., Prucha, I.R.: Central limit theorems and uniform laws of large numbers for arrays of random fields. Journal of Econometrics 150(1) (2009)
14. Korf, L., Wets, R.J.B.: Random LSC functions: An ergodic theorem. Mathematics of Operations Research 26(2) (2001)
15. Maggiar, A., Wächter, A., Dolinskaya, I.S., Staum, J.: A derivative-free trust-region algorithm for the optimization of functions smoothed via Gaussian convolution using multiple importance sampling (2015). Optimization Online preprint, HTML/05/07/507.html
16. Marin, J.M., Pudlo, P., Sedki, M.: Consistency of the adaptive multiple importance sampling (2014). arXiv:1211.2548
17. Royset, J.O., Polak, E.: Implementable algorithm for stochastic optimization using sample average approximations.
Journal of Optimization Theory and Applications 122(1) (2004)
18. Rubinstein, R.Y., Kroese, D.P.: Simulation and the Monte Carlo Method, 3rd edn. John Wiley & Sons (2017)
19. Shapiro, A., Dentcheva, D., Ruszczyński, A.: Lectures on Stochastic Programming: Modeling and Theory. SIAM, Philadelphia (2009)
20. Shapiro, A., Xu, H.: Uniform laws of large numbers for set-valued mappings and subdifferentials of random functions. Journal of Mathematical Analysis and Applications 325 (2007)
21. Xu, H.: Uniform exponential convergence of sample average random functions under general sampling with applications in stochastic programming. Journal of Mathematical Analysis and Applications 368 (2010)

A Proof of Theorem 2

We establish the result in two lemmas.

Lemma 3 Suppose Assumptions 1, 2, and 3 hold. Further assume that $S^*$ is not empty and that, with probability one, $\hat S_N$ is non-empty for all sufficiently large $N$. Then $\lim_{N\to\infty} \hat\vartheta_N = \vartheta^*$ with probability one.

Proof We prove $\lim_{N\to\infty} \hat\vartheta_N = \vartheta^*$ in the event that $\hat S_N$ is non-empty for all sufficiently large $N$ and that $\lim_{N\to\infty} \|\hat g_N - g\|_\infty = 0$. This event has probability one by assumption and by Theorem 1.

Let $x^*$ be an optimal solution of (5). Because $\lim_{N\to\infty} \|\hat g_N - g\|_\infty = 0$, $\lim_{N\to\infty} \hat g_N(x^*) = g(x^*) = \vartheta^*$. Since $\hat\vartheta_N$ is the optimal value of (4), $\hat\vartheta_N \le \hat g_N(x^*)$ for all $N$. As a consequence, $\limsup_{N\to\infty} \hat\vartheta_N \le \vartheta^*$. Define $\hat\vartheta_{\inf} = \liminf_{N\to\infty} \hat\vartheta_N$. There exist a subsequence $\{N_j\}$ of the natural numbers and a sequence $\{x_j\}_{j=1}^\infty$ of points in $X$ such that, for every $j = 1, 2, \dots$, $x_j \in \hat S_{N_j}$, and $\lim_{j\to\infty} \hat g_{N_j}(x_j) = \hat\vartheta_{\inf}$. Because $\lim_{N\to\infty} \|\hat g_N - g\|_\infty = 0$, we also have $\lim_{j\to\infty} g(x_j) = \hat\vartheta_{\inf}$. Since $\vartheta^*$ is the optimal value of (5), $\vartheta^* \le g(x_j)$ for all $j$. Therefore $\vartheta^* \le \hat\vartheta_{\inf}$. Overall, we have obtained $\limsup_{N\to\infty} \hat\vartheta_N \le \vartheta^* \le \liminf_{N\to\infty} \hat\vartheta_N$.

Lemma 4 Suppose the assumptions of Theorem 2 hold. Then, w.p.1, $\lim_{N\to\infty} \mathbb{D}(\hat S_N, S^*) = 0$.

Proof We prove $\lim_{N\to\infty} \mathbb{D}(\hat S_N, S^*) = 0$ in the event that $\lim_{N\to\infty} \|\hat g_N - g\|_\infty = 0$, $\lim_{N\to\infty} \hat\vartheta_N = \vartheta^*$, and $\hat S_N$ is non-empty and contained in $C$ for all sufficiently large $N$. This event has probability one by Theorem 1, by Lemma 3, and by assumption.

Consider any subsequence $\{N_j\}$ of the natural numbers and sequence $\{x_j\}_{j=1}^\infty$ of points in $X$ such that, for every $j = 1, 2, \dots$, $x_j \in \hat S_{N_j}$. Because $C$ is compact, the sequence $\{x_j\}$ has a limit point. Consider any such limit point, and denote it as $\bar x$. Consider any subsequence $\{N_{j_\ell}\}$ of $\{N_j\}$ such that $\lim_{\ell\to\infty} x_{j_\ell} = \bar x$. For any $\ell$,

$$\hat\vartheta_{N_{j_\ell}} - g(\bar x) = \hat g_{N_{j_\ell}}(x_{j_\ell}) - g(\bar x) = \big( \hat g_{N_{j_\ell}}(x_{j_\ell}) - g(x_{j_\ell}) \big) + \big( g(x_{j_\ell}) - g(\bar x) \big).$$

It follows from assumptions (i) and (ii) in Theorem 2 and Theorem 7.47 in [19] that $g$ is lower semi-continuous, which in turn implies that $\liminf_{\ell\to\infty} \big( g(x_{j_\ell}) - g(\bar x) \big) \ge 0$. We also have $\lim_{\ell\to\infty} \big( \hat g_{N_{j_\ell}}(x_{j_\ell}) - g(x_{j_\ell}) \big) = 0$ since $\lim_{N\to\infty} \|\hat g_N - g\|_\infty = 0$. Therefore $\liminf_{\ell\to\infty} \hat\vartheta_{N_{j_\ell}} \ge g(\bar x)$. We also have $\lim_{N\to\infty} \hat\vartheta_N = \vartheta^*$. Thus, $g(\bar x) \le \vartheta^*$, which implies $\bar x \in S^*$. In words: if $\bar x$ is a limit point of a sequence $\{x_j\}$ of points that are optimal solutions of a sequence of sample average approximation problems given by (4), then $\bar x$ is in $S^*$. Therefore,

$$\limsup_{N\to\infty} \mathbb{D}(\hat S_N, S^*) = \limsup_{N\to\infty} \sup_{x \in \hat S_N} \mathrm{dist}(x, S^*) = 0.$$


More information

Feature Selection: Part 1

Feature Selection: Part 1 CSE 546: Machne Learnng Lecture 5 Feature Selecton: Part 1 Instructor: Sham Kakade 1 Regresson n the hgh dmensonal settng How do we learn when the number of features d s greater than the sample sze n?

More information

On an Extension of Stochastic Approximation EM Algorithm for Incomplete Data Problems. Vahid Tadayon 1

On an Extension of Stochastic Approximation EM Algorithm for Incomplete Data Problems. Vahid Tadayon 1 On an Extenson of Stochastc Approxmaton EM Algorthm for Incomplete Data Problems Vahd Tadayon Abstract: The Stochastc Approxmaton EM (SAEM algorthm, a varant stochastc approxmaton of EM, s a versatle tool

More information

Strong Markov property: Same assertion holds for stopping times τ.

Strong Markov property: Same assertion holds for stopping times τ. Brownan moton Let X ={X t : t R + } be a real-valued stochastc process: a famlty of real random varables all defned on the same probablty space. Defne F t = nformaton avalable by observng the process up

More information

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix Lectures - Week 4 Matrx norms, Condtonng, Vector Spaces, Lnear Independence, Spannng sets and Bass, Null space and Range of a Matrx Matrx Norms Now we turn to assocatng a number to each matrx. We could

More information

The Order Relation and Trace Inequalities for. Hermitian Operators

The Order Relation and Trace Inequalities for. Hermitian Operators Internatonal Mathematcal Forum, Vol 3, 08, no, 507-57 HIKARI Ltd, wwwm-hkarcom https://doorg/0988/mf088055 The Order Relaton and Trace Inequaltes for Hermtan Operators Y Huang School of Informaton Scence

More information

Lecture 12: Discrete Laplacian

Lecture 12: Discrete Laplacian Lecture 12: Dscrete Laplacan Scrbe: Tanye Lu Our goal s to come up wth a dscrete verson of Laplacan operator for trangulated surfaces, so that we can use t n practce to solve related problems We are mostly

More information

Lecture 4 Hypothesis Testing

Lecture 4 Hypothesis Testing Lecture 4 Hypothess Testng We may wsh to test pror hypotheses about the coeffcents we estmate. We can use the estmates to test whether the data rejects our hypothess. An example mght be that we wsh to

More information

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009 College of Computer & Informaton Scence Fall 2009 Northeastern Unversty 20 October 2009 CS7880: Algorthmc Power Tools Scrbe: Jan Wen and Laura Poplawsk Lecture Outlne: Prmal-dual schema Network Desgn:

More information

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also

More information

Basic Statistical Analysis and Yield Calculations

Basic Statistical Analysis and Yield Calculations October 17, 007 Basc Statstcal Analyss and Yeld Calculatons Dr. José Ernesto Rayas Sánchez 1 Outlne Sources of desgn-performance uncertanty Desgn and development processes Desgn for manufacturablty A general

More information

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA 4 Analyss of Varance (ANOVA) 5 ANOVA 51 Introducton ANOVA ANOVA s a way to estmate and test the means of multple populatons We wll start wth one-way ANOVA If the populatons ncluded n the study are selected

More information

Stanford University CS359G: Graph Partitioning and Expanders Handout 4 Luca Trevisan January 13, 2011

Stanford University CS359G: Graph Partitioning and Expanders Handout 4 Luca Trevisan January 13, 2011 Stanford Unversty CS359G: Graph Parttonng and Expanders Handout 4 Luca Trevsan January 3, 0 Lecture 4 In whch we prove the dffcult drecton of Cheeger s nequalty. As n the past lectures, consder an undrected

More information

A note on almost sure behavior of randomly weighted sums of φ-mixing random variables with φ-mixing weights

A note on almost sure behavior of randomly weighted sums of φ-mixing random variables with φ-mixing weights ACTA ET COMMENTATIONES UNIVERSITATIS TARTUENSIS DE MATHEMATICA Volume 7, Number 2, December 203 Avalable onlne at http://acutm.math.ut.ee A note on almost sure behavor of randomly weghted sums of φ-mxng

More information

princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 7: LP Duality Lecturer: Matt Weinberg

princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 7: LP Duality Lecturer: Matt Weinberg prnceton unv. F 17 cos 521: Advanced Algorthm Desgn Lecture 7: LP Dualty Lecturer: Matt Wenberg Scrbe: LP Dualty s an extremely useful tool for analyzng structural propertes of lnear programs. Whle there

More information

Finding Dense Subgraphs in G(n, 1/2)

Finding Dense Subgraphs in G(n, 1/2) Fndng Dense Subgraphs n Gn, 1/ Atsh Das Sarma 1, Amt Deshpande, and Rav Kannan 1 Georga Insttute of Technology,atsh@cc.gatech.edu Mcrosoft Research-Bangalore,amtdesh,annan@mcrosoft.com Abstract. Fndng

More information

Matrix Approximation via Sampling, Subspace Embedding. 1 Solving Linear Systems Using SVD

Matrix Approximation via Sampling, Subspace Embedding. 1 Solving Linear Systems Using SVD Matrx Approxmaton va Samplng, Subspace Embeddng Lecturer: Anup Rao Scrbe: Rashth Sharma, Peng Zhang 0/01/016 1 Solvng Lnear Systems Usng SVD Two applcatons of SVD have been covered so far. Today we loo

More information

U.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017

U.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017 U.C. Berkeley CS94: Beyond Worst-Case Analyss Handout 4s Luca Trevsan September 5, 07 Summary of Lecture 4 In whch we ntroduce semdefnte programmng and apply t to Max Cut. Semdefnte Programmng Recall that

More information

Appendix B. Criterion of Riemann-Stieltjes Integrability

Appendix B. Criterion of Riemann-Stieltjes Integrability Appendx B. Crteron of Remann-Steltes Integrablty Ths note s complementary to [R, Ch. 6] and [T, Sec. 3.5]. The man result of ths note s Theorem B.3, whch provdes the necessary and suffcent condtons for

More information

Linear Approximation with Regularization and Moving Least Squares

Linear Approximation with Regularization and Moving Least Squares Lnear Approxmaton wth Regularzaton and Movng Least Squares Igor Grešovn May 007 Revson 4.6 (Revson : March 004). 5 4 3 0.5 3 3.5 4 Contents: Lnear Fttng...4. Weghted Least Squares n Functon Approxmaton...

More information

Markov Chain Monte Carlo (MCMC), Gibbs Sampling, Metropolis Algorithms, and Simulated Annealing Bioinformatics Course Supplement

Markov Chain Monte Carlo (MCMC), Gibbs Sampling, Metropolis Algorithms, and Simulated Annealing Bioinformatics Course Supplement Markov Chan Monte Carlo MCMC, Gbbs Samplng, Metropols Algorthms, and Smulated Annealng 2001 Bonformatcs Course Supplement SNU Bontellgence Lab http://bsnuackr/ Outlne! Markov Chan Monte Carlo MCMC! Metropols-Hastngs

More information

Convergence of random processes

Convergence of random processes DS-GA 12 Lecture notes 6 Fall 216 Convergence of random processes 1 Introducton In these notes we study convergence of dscrete random processes. Ths allows to characterze phenomena such as the law of large

More information

APPROXIMATE PRICES OF BASKET AND ASIAN OPTIONS DUPONT OLIVIER. Premia 14

APPROXIMATE PRICES OF BASKET AND ASIAN OPTIONS DUPONT OLIVIER. Premia 14 APPROXIMAE PRICES OF BASKE AND ASIAN OPIONS DUPON OLIVIER Prema 14 Contents Introducton 1 1. Framewor 1 1.1. Baset optons 1.. Asan optons. Computng the prce 3. Lower bound 3.1. Closed formula for the prce

More information

More metrics on cartesian products

More metrics on cartesian products More metrcs on cartesan products If (X, d ) are metrc spaces for 1 n, then n Secton II4 of the lecture notes we defned three metrcs on X whose underlyng topologes are the product topology The purpose of

More information

Supplement to Clustering with Statistical Error Control

Supplement to Clustering with Statistical Error Control Supplement to Clusterng wth Statstcal Error Control Mchael Vogt Unversty of Bonn Matthas Schmd Unversty of Bonn In ths supplement, we provde the proofs that are omtted n the paper. In partcular, we derve

More information

Solutions HW #2. minimize. Ax = b. Give the dual problem, and make the implicit equality constraints explicit. Solution.

Solutions HW #2. minimize. Ax = b. Give the dual problem, and make the implicit equality constraints explicit. Solution. Solutons HW #2 Dual of general LP. Fnd the dual functon of the LP mnmze subject to c T x Gx h Ax = b. Gve the dual problem, and make the mplct equalty constrants explct. Soluton. 1. The Lagrangan s L(x,

More information

Inner Product. Euclidean Space. Orthonormal Basis. Orthogonal

Inner Product. Euclidean Space. Orthonormal Basis. Orthogonal Inner Product Defnton 1 () A Eucldean space s a fnte-dmensonal vector space over the reals R, wth an nner product,. Defnton 2 (Inner Product) An nner product, on a real vector space X s a symmetrc, blnear,

More information

MATH 829: Introduction to Data Mining and Analysis The EM algorithm (part 2)

MATH 829: Introduction to Data Mining and Analysis The EM algorithm (part 2) 1/16 MATH 829: Introducton to Data Mnng and Analyss The EM algorthm (part 2) Domnque Gullot Departments of Mathematcal Scences Unversty of Delaware Aprl 20, 2016 Recall 2/16 We are gven ndependent observatons

More information

Maximum Likelihood Estimation of Binary Dependent Variables Models: Probit and Logit. 1. General Formulation of Binary Dependent Variables Models

Maximum Likelihood Estimation of Binary Dependent Variables Models: Probit and Logit. 1. General Formulation of Binary Dependent Variables Models ECO 452 -- OE 4: Probt and Logt Models ECO 452 -- OE 4 Maxmum Lkelhood Estmaton of Bnary Dependent Varables Models: Probt and Logt hs note demonstrates how to formulate bnary dependent varables models

More information

A Robust Method for Calculating the Correlation Coefficient

A Robust Method for Calculating the Correlation Coefficient A Robust Method for Calculatng the Correlaton Coeffcent E.B. Nven and C. V. Deutsch Relatonshps between prmary and secondary data are frequently quantfed usng the correlaton coeffcent; however, the tradtonal

More information

Lecture 3: Probability Distributions

Lecture 3: Probability Distributions Lecture 3: Probablty Dstrbutons Random Varables Let us begn by defnng a sample space as a set of outcomes from an experment. We denote ths by S. A random varable s a functon whch maps outcomes nto the

More information

Exercises of Chapter 2

Exercises of Chapter 2 Exercses of Chapter Chuang-Cheh Ln Department of Computer Scence and Informaton Engneerng, Natonal Chung Cheng Unversty, Mng-Hsung, Chay 61, Tawan. Exercse.6. Suppose that we ndependently roll two standard

More information

Problem Set 9 Solutions

Problem Set 9 Solutions Desgn and Analyss of Algorthms May 4, 2015 Massachusetts Insttute of Technology 6.046J/18.410J Profs. Erk Demane, Srn Devadas, and Nancy Lynch Problem Set 9 Solutons Problem Set 9 Solutons Ths problem

More information

Numerical Heat and Mass Transfer

Numerical Heat and Mass Transfer Master degree n Mechancal Engneerng Numercal Heat and Mass Transfer 06-Fnte-Dfference Method (One-dmensonal, steady state heat conducton) Fausto Arpno f.arpno@uncas.t Introducton Why we use models and

More information

LOW BIAS INTEGRATED PATH ESTIMATORS. James M. Calvin

LOW BIAS INTEGRATED PATH ESTIMATORS. James M. Calvin Proceedngs of the 007 Wnter Smulaton Conference S G Henderson, B Bller, M-H Hseh, J Shortle, J D Tew, and R R Barton, eds LOW BIAS INTEGRATED PATH ESTIMATORS James M Calvn Department of Computer Scence

More information

Maximum Likelihood Estimation of Binary Dependent Variables Models: Probit and Logit. 1. General Formulation of Binary Dependent Variables Models

Maximum Likelihood Estimation of Binary Dependent Variables Models: Probit and Logit. 1. General Formulation of Binary Dependent Variables Models ECO 452 -- OE 4: Probt and Logt Models ECO 452 -- OE 4 Mamum Lkelhood Estmaton of Bnary Dependent Varables Models: Probt and Logt hs note demonstrates how to formulate bnary dependent varables models for

More information

Economics 130. Lecture 4 Simple Linear Regression Continued

Economics 130. Lecture 4 Simple Linear Regression Continued Economcs 130 Lecture 4 Contnued Readngs for Week 4 Text, Chapter and 3. We contnue wth addressng our second ssue + add n how we evaluate these relatonshps: Where do we get data to do ths analyss? How do

More information

Exercise Solutions to Real Analysis

Exercise Solutions to Real Analysis xercse Solutons to Real Analyss Note: References refer to H. L. Royden, Real Analyss xersze 1. Gven any set A any ɛ > 0, there s an open set O such that A O m O m A + ɛ. Soluton 1. If m A =, then there

More information

Supplementary material: Margin based PU Learning. Matrix Concentration Inequalities

Supplementary material: Margin based PU Learning. Matrix Concentration Inequalities Supplementary materal: Margn based PU Learnng We gve the complete proofs of Theorem and n Secton We frst ntroduce the well-known concentraton nequalty, so the covarance estmator can be bounded Then we

More information

Lossy Compression. Compromise accuracy of reconstruction for increased compression.

Lossy Compression. Compromise accuracy of reconstruction for increased compression. Lossy Compresson Compromse accuracy of reconstructon for ncreased compresson. The reconstructon s usually vsbly ndstngushable from the orgnal mage. Typcally, one can get up to 0:1 compresson wth almost

More information

Logistic Regression. CAP 5610: Machine Learning Instructor: Guo-Jun QI

Logistic Regression. CAP 5610: Machine Learning Instructor: Guo-Jun QI Logstc Regresson CAP 561: achne Learnng Instructor: Guo-Jun QI Bayes Classfer: A Generatve model odel the posteror dstrbuton P(Y X) Estmate class-condtonal dstrbuton P(X Y) for each Y Estmate pror dstrbuton

More information

Lecture 4: Universal Hash Functions/Streaming Cont d

Lecture 4: Universal Hash Functions/Streaming Cont d CSE 5: Desgn and Analyss of Algorthms I Sprng 06 Lecture 4: Unversal Hash Functons/Streamng Cont d Lecturer: Shayan Oves Gharan Aprl 6th Scrbe: Jacob Schreber Dsclamer: These notes have not been subjected

More information

APPENDIX A Some Linear Algebra

APPENDIX A Some Linear Algebra APPENDIX A Some Lnear Algebra The collecton of m, n matrces A.1 Matrces a 1,1,..., a 1,n A = a m,1,..., a m,n wth real elements a,j s denoted by R m,n. If n = 1 then A s called a column vector. Smlarly,

More information

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Analyss of Varance and Desgn of Exerments-I MODULE III LECTURE - 2 EXPERIMENTAL DESIGN MODELS Dr. Shalabh Deartment of Mathematcs and Statstcs Indan Insttute of Technology Kanur 2 We consder the models

More information

Hidden Markov Models

Hidden Markov Models Hdden Markov Models Namrata Vaswan, Iowa State Unversty Aprl 24, 204 Hdden Markov Model Defntons and Examples Defntons:. A hdden Markov model (HMM) refers to a set of hdden states X 0, X,..., X t,...,

More information

MMA and GCMMA two methods for nonlinear optimization

MMA and GCMMA two methods for nonlinear optimization MMA and GCMMA two methods for nonlnear optmzaton Krster Svanberg Optmzaton and Systems Theory, KTH, Stockholm, Sweden. krlle@math.kth.se Ths note descrbes the algorthms used n the author s 2007 mplementatons

More information

Gaussian Mixture Models

Gaussian Mixture Models Lab Gaussan Mxture Models Lab Objectve: Understand the formulaton of Gaussan Mxture Models (GMMs) and how to estmate GMM parameters. You ve already seen GMMs as the observaton dstrbuton n certan contnuous

More information

NUMERICAL DIFFERENTIATION

NUMERICAL DIFFERENTIATION NUMERICAL DIFFERENTIATION 1 Introducton Dfferentaton s a method to compute the rate at whch a dependent output y changes wth respect to the change n the ndependent nput x. Ths rate of change s called the

More information

Inductance Calculation for Conductors of Arbitrary Shape

Inductance Calculation for Conductors of Arbitrary Shape CRYO/02/028 Aprl 5, 2002 Inductance Calculaton for Conductors of Arbtrary Shape L. Bottura Dstrbuton: Internal Summary In ths note we descrbe a method for the numercal calculaton of nductances among conductors

More information

Hidden Markov Models & The Multivariate Gaussian (10/26/04)

Hidden Markov Models & The Multivariate Gaussian (10/26/04) CS281A/Stat241A: Statstcal Learnng Theory Hdden Markov Models & The Multvarate Gaussan (10/26/04) Lecturer: Mchael I. Jordan Scrbes: Jonathan W. Hu 1 Hdden Markov Models As a bref revew, hdden Markov models

More information

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems Numercal Analyss by Dr. Anta Pal Assstant Professor Department of Mathematcs Natonal Insttute of Technology Durgapur Durgapur-713209 emal: anta.bue@gmal.com 1 . Chapter 5 Soluton of System of Lnear Equatons

More information

STAT 3008 Applied Regression Analysis

STAT 3008 Applied Regression Analysis STAT 3008 Appled Regresson Analyss Tutoral : Smple Lnear Regresson LAI Chun He Department of Statstcs, The Chnese Unversty of Hong Kong 1 Model Assumpton To quantfy the relatonshp between two factors,

More information

Classification as a Regression Problem

Classification as a Regression Problem Target varable y C C, C,, ; Classfcaton as a Regresson Problem { }, 3 L C K To treat classfcaton as a regresson problem we should transform the target y nto numercal values; The choce of numercal class

More information

Supplement: Proofs and Technical Details for The Solution Path of the Generalized Lasso

Supplement: Proofs and Technical Details for The Solution Path of the Generalized Lasso Supplement: Proofs and Techncal Detals for The Soluton Path of the Generalzed Lasso Ryan J. Tbshran Jonathan Taylor In ths document we gve supplementary detals to the paper The Soluton Path of the Generalzed

More information

Goodness of fit and Wilks theorem

Goodness of fit and Wilks theorem DRAFT 0.0 Glen Cowan 3 June, 2013 Goodness of ft and Wlks theorem Suppose we model data y wth a lkelhood L(µ) that depends on a set of N parameters µ = (µ 1,...,µ N ). Defne the statstc t µ ln L(µ) L(ˆµ),

More information

Error Probability for M Signals

Error Probability for M Signals Chapter 3 rror Probablty for M Sgnals In ths chapter we dscuss the error probablty n decdng whch of M sgnals was transmtted over an arbtrary channel. We assume the sgnals are represented by a set of orthonormal

More information

Perfect Competition and the Nash Bargaining Solution

Perfect Competition and the Nash Bargaining Solution Perfect Competton and the Nash Barganng Soluton Renhard John Department of Economcs Unversty of Bonn Adenauerallee 24-42 53113 Bonn, Germany emal: rohn@un-bonn.de May 2005 Abstract For a lnear exchange

More information

An Analysis of a Least Squares Regression Method for American Option Pricing

An Analysis of a Least Squares Regression Method for American Option Pricing An Analyss of a Least Squares Regresson Method for Amercan Opton Prcng Emmanuelle Clément Damen Lamberton Phlp Protter Revsed verson, December 200 Abstract Recently, varous authors proposed Monte-Carlo

More information

princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 3: Large deviations bounds and applications Lecturer: Sanjeev Arora

princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 3: Large deviations bounds and applications Lecturer: Sanjeev Arora prnceton unv. F 13 cos 521: Advanced Algorthm Desgn Lecture 3: Large devatons bounds and applcatons Lecturer: Sanjeev Arora Scrbe: Today s topc s devaton bounds: what s the probablty that a random varable

More information

Linear Regression Analysis: Terminology and Notation

Linear Regression Analysis: Terminology and Notation ECON 35* -- Secton : Basc Concepts of Regresson Analyss (Page ) Lnear Regresson Analyss: Termnology and Notaton Consder the generc verson of the smple (two-varable) lnear regresson model. It s represented

More information

Chapter 2 - The Simple Linear Regression Model S =0. e i is a random error. S β2 β. This is a minimization problem. Solution is a calculus exercise.

Chapter 2 - The Simple Linear Regression Model S =0. e i is a random error. S β2 β. This is a minimization problem. Solution is a calculus exercise. Chapter - The Smple Lnear Regresson Model The lnear regresson equaton s: where y + = β + β e for =,..., y and are observable varables e s a random error How can an estmaton rule be constructed for the

More information

REAL ANALYSIS I HOMEWORK 1

REAL ANALYSIS I HOMEWORK 1 REAL ANALYSIS I HOMEWORK CİHAN BAHRAN The questons are from Tao s text. Exercse 0.0.. If (x α ) α A s a collecton of numbers x α [0, + ] such that x α

More information

Erratum: A Generalized Path Integral Control Approach to Reinforcement Learning

Erratum: A Generalized Path Integral Control Approach to Reinforcement Learning Journal of Machne Learnng Research 00-9 Submtted /0; Publshed 7/ Erratum: A Generalzed Path Integral Control Approach to Renforcement Learnng Evangelos ATheodorou Jonas Buchl Stefan Schaal Department of

More information

Joint Statistical Meetings - Biopharmaceutical Section

Joint Statistical Meetings - Biopharmaceutical Section Iteratve Ch-Square Test for Equvalence of Multple Treatment Groups Te-Hua Ng*, U.S. Food and Drug Admnstraton 1401 Rockvlle Pke, #200S, HFM-217, Rockvlle, MD 20852-1448 Key Words: Equvalence Testng; Actve

More information

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography CSc 6974 and ECSE 6966 Math. Tech. for Vson, Graphcs and Robotcs Lecture 21, Aprl 17, 2006 Estmatng A Plane Homography Overvew We contnue wth a dscusson of the major ssues, usng estmaton of plane projectve

More information

Additional Codes using Finite Difference Method. 1 HJB Equation for Consumption-Saving Problem Without Uncertainty

Additional Codes using Finite Difference Method. 1 HJB Equation for Consumption-Saving Problem Without Uncertainty Addtonal Codes usng Fnte Dfference Method Benamn Moll 1 HJB Equaton for Consumpton-Savng Problem Wthout Uncertanty Before consderng the case wth stochastc ncome n http://www.prnceton.edu/~moll/ HACTproect/HACT_Numercal_Appendx.pdf,

More information

Research Article Green s Theorem for Sign Data

Research Article Green s Theorem for Sign Data Internatonal Scholarly Research Network ISRN Appled Mathematcs Volume 2012, Artcle ID 539359, 10 pages do:10.5402/2012/539359 Research Artcle Green s Theorem for Sgn Data Lous M. Houston The Unversty of

More information

The Minimum Universal Cost Flow in an Infeasible Flow Network

The Minimum Universal Cost Flow in an Infeasible Flow Network Journal of Scences, Islamc Republc of Iran 17(2): 175-180 (2006) Unversty of Tehran, ISSN 1016-1104 http://jscencesutacr The Mnmum Unversal Cost Flow n an Infeasble Flow Network H Saleh Fathabad * M Bagheran

More information

Homework Assignment 3 Due in class, Thursday October 15

Homework Assignment 3 Due in class, Thursday October 15 Homework Assgnment 3 Due n class, Thursday October 15 SDS 383C Statstcal Modelng I 1 Rdge regresson and Lasso 1. Get the Prostrate cancer data from http://statweb.stanford.edu/~tbs/elemstatlearn/ datasets/prostate.data.

More information

Kernel Methods and SVMs Extension

Kernel Methods and SVMs Extension Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general

More information

Stat260: Bayesian Modeling and Inference Lecture Date: February 22, Reference Priors

Stat260: Bayesian Modeling and Inference Lecture Date: February 22, Reference Priors Stat60: Bayesan Modelng and Inference Lecture Date: February, 00 Reference Prors Lecturer: Mchael I. Jordan Scrbe: Steven Troxler and Wayne Lee In ths lecture, we assume that θ R; n hgher-dmensons, reference

More information

Research Article. Almost Sure Convergence of Random Projected Proximal and Subgradient Algorithms for Distributed Nonsmooth Convex Optimization

Research Article. Almost Sure Convergence of Random Projected Proximal and Subgradient Algorithms for Distributed Nonsmooth Convex Optimization To appear n Optmzaton Vol. 00, No. 00, Month 20XX, 1 27 Research Artcle Almost Sure Convergence of Random Projected Proxmal and Subgradent Algorthms for Dstrbuted Nonsmooth Convex Optmzaton Hdea Idua a

More information

x i1 =1 for all i (the constant ).

x i1 =1 for all i (the constant ). Chapter 5 The Multple Regresson Model Consder an economc model where the dependent varable s a functon of K explanatory varables. The economc model has the form: y = f ( x,x,..., ) xk Approxmate ths by

More information

Econ Statistical Properties of the OLS estimator. Sanjaya DeSilva

Econ Statistical Properties of the OLS estimator. Sanjaya DeSilva Econ 39 - Statstcal Propertes of the OLS estmator Sanjaya DeSlva September, 008 1 Overvew Recall that the true regresson model s Y = β 0 + β 1 X + u (1) Applyng the OLS method to a sample of data, we estmate

More information

Week 5: Neural Networks

Week 5: Neural Networks Week 5: Neural Networks Instructor: Sergey Levne Neural Networks Summary In the prevous lecture, we saw how we can construct neural networks by extendng logstc regresson. Neural networks consst of multple

More information

CS 2750 Machine Learning. Lecture 5. Density estimation. CS 2750 Machine Learning. Announcements

CS 2750 Machine Learning. Lecture 5. Density estimation. CS 2750 Machine Learning. Announcements CS 750 Machne Learnng Lecture 5 Densty estmaton Mlos Hauskrecht mlos@cs.ptt.edu 539 Sennott Square CS 750 Machne Learnng Announcements Homework Due on Wednesday before the class Reports: hand n before

More information

Maximum Likelihood Estimation (MLE)

Maximum Likelihood Estimation (MLE) Maxmum Lkelhood Estmaton (MLE) Ken Kreutz-Delgado (Nuno Vasconcelos) ECE 175A Wnter 01 UCSD Statstcal Learnng Goal: Gven a relatonshp between a feature vector x and a vector y, and d data samples (x,y

More information

e i is a random error

e i is a random error Chapter - The Smple Lnear Regresson Model The lnear regresson equaton s: where + β + β e for,..., and are observable varables e s a random error How can an estmaton rule be constructed for the unknown

More information

Why Monte Carlo Integration? Introduction to Monte Carlo Method. Continuous Probability. Continuous Probability

Why Monte Carlo Integration? Introduction to Monte Carlo Method. Continuous Probability. Continuous Probability Introducton to Monte Carlo Method Kad Bouatouch IRISA Emal: kad@rsa.fr Wh Monte Carlo Integraton? To generate realstc lookng mages, we need to solve ntegrals of or hgher dmenson Pel flterng and lens smulaton

More information

Lecture 4: November 17, Part 1 Single Buffer Management

Lecture 4: November 17, Part 1 Single Buffer Management Lecturer: Ad Rosén Algorthms for the anagement of Networs Fall 2003-2004 Lecture 4: November 7, 2003 Scrbe: Guy Grebla Part Sngle Buffer anagement In the prevous lecture we taled about the Combned Input

More information

6) Derivatives, gradients and Hessian matrices

6) Derivatives, gradients and Hessian matrices 30C00300 Mathematcal Methods for Economsts (6 cr) 6) Dervatves, gradents and Hessan matrces Smon & Blume chapters: 14, 15 Sldes by: Tmo Kuosmanen 1 Outlne Defnton of dervatve functon Dervatve notatons

More information

Probability Theory (revisited)

Probability Theory (revisited) Probablty Theory (revsted) Summary Probablty v.s. plausblty Random varables Smulaton of Random Experments Challenge The alarm of a shop rang. Soon afterwards, a man was seen runnng n the street, persecuted

More information

BOUNDEDNESS OF THE RIESZ TRANSFORM WITH MATRIX A 2 WEIGHTS

BOUNDEDNESS OF THE RIESZ TRANSFORM WITH MATRIX A 2 WEIGHTS BOUNDEDNESS OF THE IESZ TANSFOM WITH MATIX A WEIGHTS Introducton Let L = L ( n, be the functon space wth norm (ˆ f L = f(x C dx d < For a d d matrx valued functon W : wth W (x postve sem-defnte for all

More information