arxiv: v2 [stat.me] 15 May 2018


Piecewise-Deterministic Markov Chain Monte Carlo

Paul Vanetti$^1$, Alexandre Bouchard-Côté$^2$, George Deligiannidis$^1$, Arnaud Doucet$^1$

May 16

$^1$Department of Statistics, University of Oxford, UK. $^2$Department of Statistics, University of British Columbia, Canada.

Abstract

A novel class of non-reversible Markov chain Monte Carlo schemes relying on continuous-time piecewise-deterministic Markov Processes has recently emerged. In these algorithms, the state of the Markov process evolves according to a deterministic dynamics which is modified using a Markov transition kernel at random event times. These methods enjoy remarkable features including the ability to update only a subset of the state components while other components implicitly keep evolving and the ability to use an unbiased estimate of the gradient of the log-target while preserving the target as invariant distribution. However, they also suffer from important limitations. The deterministic dynamics used so far do not exploit the structure of the target. Moreover, exact simulation of the event times is feasible for an important yet restricted class of problems and, even when it is, it is application specific. This limits the applicability of these techniques and prevents the development of a generic software implementation of them. We introduce novel MCMC methods addressing these shortcomings. In particular, we introduce novel continuous-time algorithms relying on exact Hamiltonian flows and novel non-reversible discrete-time algorithms which can exploit complex dynamics such as approximate Hamiltonian dynamics arising from symplectic integrators while preserving the attractive features of continuous-time algorithms. We demonstrate the performance of these schemes on a variety of applications.

Keywords: generalized Metropolis-Hastings; Hamiltonian dynamics; intractable likelihood; non-reversible Markov chain Monte Carlo; piecewise-deterministic Markov process; weak convergence.
1 Introduction

Markov chain Monte Carlo (MCMC) methods are the tools of choice to sample non-standard probability distributions. In high-dimensional scenarios, the celebrated Metropolis-Hastings algorithm usually performs poorly and alternative algorithms are required. Two of the most popular alternatives are slice sampling [37] and Hamiltonian Monte Carlo (HMC) methods [18, 38, 3, 4] which have had much empirical success over recent years. More recently, continuous-time non-reversible MCMC algorithms based on Piecewise-Deterministic Markov Processes (PDMP) have also appeared in the literature in applied probability [35, 17, 7], automatic control [34], physics [42, 32, 27, 39], statistics and machine learning [1, 6, 2, 5, 4, 47]. In physics, these schemes have quickly become popular as they provide state-of-the-art performance when applied to the simulation of large scale physical models. They also show promise for statistics applications, in particular for high dimensional sparse graphical models [1] and big data [1, 6, 21, 4].

However, the PDMP-based schemes currently available suffer from shortcomings which limit both their applicability and performance. To ensure invariance with respect to the target distribution, one needs to be able to simulate these continuous-time processes exactly. In practice, this severely restricts the deterministic dynamics one can use: all the existing algorithms use a simple linear dynamics that does not exploit the geometry of the target. Moreover, exact simulation of the event times is problem specific and may be impossible in certain scenarios. This prevents the development of a generic software implementation of these techniques.

In this paper, we address these limitations by developing novel continuous-time and discrete-time Piecewise-Deterministic Markov Chain Monte Carlo (PD-MCMC) techniques which bring together HMC, PDMP and generalized Metropolis-Hastings.

First, we show that it is possible to develop continuous-time PD-MCMC algorithms relying on Hamiltonian dynamics. In this context, exact simulation of the resulting PDMP remains possible for an important class of target distributions. The resulting algorithms provide an alternative to elliptical slice sampling-type algorithms [36, 8]. We also exploit a generalized version of the Metropolis-Hastings algorithm (see, e.g., [31]) satisfying a skewed detailed balance condition to derive novel schemes.

Second, we introduce novel discrete-time PD-MCMC algorithms. These non-reversible algorithms can be thought of as a discretized version of continuous-time PD-MCMC but preserve the target distribution as invariant distribution for all discretization steps. These schemes are not only able to exploit complex dynamics, such as approximate Hamiltonian dynamics arising from symplectic integrators, but it is also always possible to simulate the event times. Moreover, some versions of these discrete-time algorithms do not even require being able to compute the gradient of the log-target. These methods enjoy the same attractive features as their continuous-time counterparts: they can leverage any representation of the target as a product of non-negative factors. Additionally, they can use unbiased estimators of the log-target distribution and its gradient and still provide algorithms with the correct invariant distribution.

The rest of the paper is organised as follows. In Section 2 we review continuous-time PDMPs, provide sufficient conditions to ensure invariance of a PDMP with respect to a given target distribution, discuss existing PD-MCMC algorithms and finally introduce novel algorithms relying on Hamiltonian dynamics. In Section 3, we introduce the class of discrete-time PDMP and provide sufficient conditions to ensure invariance of a PDMP with respect to a given target distribution which parallel the ones obtained in the continuous-time scenarios. We review existing and describe novel discrete-time PD-MCMC algorithms.
Section 4 is dedicated to the efficient implementation of discrete-time algorithms using subsampling and prefetching ideas while Section 5 proposes discrete-time algorithms to handle scenarios where the target is intractable but its logarithm and the gradient of its logarithm can be estimated unbiasedly. The empirical performance of some of these schemes is reviewed in Section 6. Appendix A contains all the proofs of validity of the proposed algorithms while weak convergence of a specific discrete-time scheme to a PDMP is proven in Appendix B.

2 Continuous-Time PDMP and PD-MCMC

2.1 PDMP

PDMPs were introduced in [14]. We will only provide here an informal review of this class of processes in the spirit of [34, 17, 2, 5] and refer the reader to [15] for a detailed theoretical treatment. For the sake of simplicity, assume that $\mathcal{Z} = \mathbb{R}^n$. A $\mathcal{Z}$-valued continuous-time PDMP process $\{z_t; t \geq 0\}$ is a càdlàg process involving a deterministic dynamics altered by random jumps at random event times. It is defined through

1. an Ordinary Differential Equation (ODE) with differentiable drift $\phi : \mathcal{Z} \to \mathcal{Z}$, i.e.,
$$\frac{dz_t}{dt} = \phi(z_t), \tag{1}$$
which induces a deterministic flow
$$(t, z) \in \mathbb{R}_+ \times \mathcal{Z} \mapsto \Phi_t(z) \in \mathcal{Z} \tag{2}$$
satisfying the semi-group property $\Phi_s \circ \Phi_t = \Phi_{s+t}$ and such that $t \mapsto \Phi_t(z)$ is càdlàg,
2. an event rate $\lambda : \mathcal{Z} \to \mathbb{R}_+$, with $\lambda(z_t)\epsilon + o(\epsilon)$ being the probability of having an event in the time interval $[t, t + \epsilon]$, and
3. a Markov transition kernel $Q$ from $\mathcal{Z}$ to $\mathcal{Z}$ where the state at event time $t$ is given by $z_t \sim Q(z_{t^-}, \cdot)$, $z_{t^-}$ being the state of the process just before the event.

Algorithm 1 describes how to simulate the path of a PDMP.

Algorithm 1 Simulation of continuous-time PDMP
1. Initialize $z_0$ arbitrarily on $\mathcal{Z}$ and set $t_0 \leftarrow 0$.
2. for $k = 1, 2, \ldots$ do
   (a) Sample inter-event time $\tau_k$, where $\tau_k$ is a non-negative random variable such that
   $$\mathbb{P}(\tau_k \geq t) = \exp\left[-\int_{r=0}^{t} \lambda\big(\Phi_r(z_{t_{k-1}})\big)\, dr\right]. \tag{3}$$
   (b) For $r \in (0, \tau_k)$, set
   $$z_{t_{k-1}+r} \leftarrow \Phi_r(z_{t_{k-1}}). \tag{4}$$
   (c) Set $t_k \leftarrow t_{k-1} + \tau_k$ and sample
   $$z_{t_k} \sim Q(z_{t_k^-}, \cdot). \tag{5}$$

To be able to exactly simulate a PDMP, we thus need to be able to simulate from the distribution (3) and compute the flow (4). Finally we also need to be able to simulate from the transition kernel $Q$. In important scenarios, exact simulation of the event times can be performed using inversion of the integrated rate function as in [42] or using adaptive thinning procedures as in [1].

We now introduce the generator associated with the PDMP. For functions in the domain of the generator, it is defined by
$$\mathcal{L}f(z) = \lim_{\epsilon \to 0} \frac{\mathbb{E}\left[f(z_{t+\epsilon}) \mid z_t = z\right] - f(z)}{\epsilon}.$$
Under suitable regularity conditions [15, Theorem 26.14], it can be shown that this generator is given by
$$\mathcal{L}f(z) = \langle \phi(z), \nabla f(z) \rangle + \lambda(z) \int \left[f(z') - f(z)\right] Q(z, dz'), \tag{6}$$
where $\langle a, b \rangle$ denotes the scalar product between vectors $a, b$ and $\|a\|^2 = \langle a, a \rangle$. The first term on the right hand side of (6) arises from the deterministic dynamics while the second term corresponds to the jump component of the process.

2.2 From PDMP to PD-MCMC

Assume we are interested in sampling from a given target probability distribution on the Borel space $(\mathcal{Z}, \mathcal{B}(\mathcal{Z}))$. If we want to use a PDMP mechanism to sample this target distribution, this PDMP needs at least to admit this distribution as invariant distribution. We provide here sufficient conditions to ensure this is satisfied. If additionally the PDMP is ergodic, this will allow us to consistently estimate expectations with respect to the invariant distribution. From now onward, the target distribution will be assumed to have a strictly positive density $\rho(z)$ with respect to the Lebesgue measure $dz$ where
$$\rho(z) = \exp(-H(z)). \tag{7}$$
Invariance with respect to $\rho$ will be satisfied if $\int \rho(dz)\, \mathcal{L}f(z) = 0$ for all functions $f$ in the domain of the generator [15, Proposition 34.7].
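From (6), this means that we need
$$\int \rho(dz) \langle \phi(z), \nabla f(z) \rangle + \int \rho(dz)\, \lambda(z) \int Q(z, dz') \left[f(z') - f(z)\right] = 0.$$
However, using integration by parts, we obtain
$$\int \rho(dz) \langle \phi(z), \nabla f(z) \rangle = -\int \rho(dz) \left[\nabla \cdot \phi(z) - \langle \nabla H(z), \phi(z) \rangle\right] f(z)$$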

where $\nabla \cdot \phi(z) := \sum_{i=1}^{n} \partial_i \phi_i(z)$ is the divergence of the vector field $\phi$. Hence, a sufficient condition to ensure invariance of a PDMP with respect to $\rho$ is to have
$$\int \rho(dz) \left[\lambda(z) \int Q(z, dz') \left[f(z') - f(z)\right] - \left\{\nabla \cdot \phi(z) - \langle \nabla H(z), \phi(z) \rangle\right\} f(z)\right] = 0. \tag{8}$$

The following notation will prove useful to formulate sufficient conditions to ensure invariance of a PDMP with respect to $\rho$. Suppose that we are given a measure $\nu$ on $(\mathcal{Z}, \mathcal{B}(\mathcal{Z}))$ and a measurable mapping $\Gamma : \mathcal{Z} \to \mathcal{Z}$. Then the push-forward of the measure $\nu$ under the mapping $\Gamma$, often denoted by $\Gamma_{\#}\nu(dz)$, is the measure $A \mapsto \nu(\Gamma^{-1}(A))$ for any $A \in \mathcal{B}(\mathcal{Z})$. We will use here the notation $\nu(\Gamma^{-1}(dz))$. For any measurable $f : \mathcal{Z} \to \mathbb{R}$, the following identity holds
$$\int_{\mathcal{Z}} f(z)\, \Gamma_{\#}\nu(dz) = \int_{\mathcal{Z}} f \circ \Gamma(z)\, \nu(dz).$$

2.2.1 Sufficient conditions for global methods

We provide here useful sufficient conditions on $\phi$, $\lambda$, and $Q$ to ensure $\rho$-invariance of the associated PDMP, without making any structural assumptions on these objects.

(A1) Conditions on $\phi$, $\lambda$, and $Q$
1. There exists a $\rho$-preserving mapping $S : \mathcal{Z} \to \mathcal{Z}$; that is, $S$ is measurable and satisfies $\rho(S^{-1}(dz)) = \rho(dz)$.
2. The event rate $\lambda$ satisfies
$$\lambda(S(z)) - \lambda(z) = \nabla \cdot \phi(z) - \langle \nabla H(z), \phi(z) \rangle. \tag{9}$$
3. The kernel $Q$ satisfies
$$\int \rho(dz)\, \lambda(z)\, Q(z, dz') = \rho(S^{-1}(dz'))\, \lambda(S(z')). \tag{10}$$

Based on these assumptions, straightforward calculations show that the following result holds.

Proposition 1. Assume (A1). Then the PDMP admits $\rho$ as invariant distribution.

2.2.2 Sufficient conditions for local methods

Assume that $H(z)$ can be decomposed as follows
$$H(z) = \sum_{i=1}^{n} H_i(z), \tag{11}$$
where potentially each $H_i(z)$ only depends on a subset of the components of $z$. In this context, like in standard MCMC, we might be interested in using a transition kernel which is a mixture of kernels performing local updates. This can be achieved in the PDMP framework by introducing an event rate of the form
$$\lambda(z) = \sum_{i=1}^{n} \lambda_i(z) \tag{12}$$
and a transition kernel of the form
$$Q(z, dz') = \sum_{i=1}^{n} \frac{\lambda_i(z)}{\lambda(z)}\, Q_i(z, dz') \tag{13}$$
where the $Q_i$ are Markov transition kernels. Let us write $[n] := \{1, 2, \ldots, n\}$. To simulate the event times of the resulting PDMP, one can associate a clock to each index $i \in [n]$ and use a priority queue [42, 32, 1].
When it is possible to bound $\{\lambda_i; i \in [n]\}$ locally in time, more elaborate thinning strategies have been developed in [1, Section 3.3.2] and [29]. Based on these structural assumptions on $\lambda$ and $Q$, we can provide useful sufficient local conditions on $\phi$, $\{\lambda_i : i \in [n]\}$ and $\{Q_i : i \in [n]\}$ to ensure that invariance of the associated PDMP with respect to $\rho$ is satisfied.

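(A2) Conditions on $\phi$, $\{\lambda_i : i \in [n]\}$ and $\{Q_i : i \in [n]\}$
1. There exists a $\rho$-preserving mapping $S : \mathcal{Z} \to \mathcal{Z}$.
2. The event rates $\{\lambda_i : i \in [n]\}$ satisfy
$$\sum_{i=1}^{n} \left[\lambda_i(S(z)) - \lambda_i(z)\right] = \nabla \cdot \phi(z) - \langle \nabla H(z), \phi(z) \rangle. \tag{14}$$
3. For all $i \in [n]$, the transition kernel $Q_i$ satisfies
$$\int \rho(dz)\, \lambda_i(z)\, Q_i(z, dz') = \rho(S^{-1}(dz'))\, \lambda_i(S(z')). \tag{15}$$

If the functions $\{H_i : i \in [n]\}$ are differentiable then Assumption A2.2 is satisfied for a divergence-free vector field, i.e. $\nabla \cdot \phi = 0$, if for all $i \in [n]$
$$\lambda_i(S(z)) - \lambda_i(z) = -\langle \nabla H_i(z), \phi(z) \rangle. \tag{16}$$

Proposition 2. Assume (A2). Then the PDMP admits $\rho$ as invariant distribution.

2.2.3 Sufficient conditions for doubly stochastic methods

Consider now a slight generalization of the previous scenario where the target distribution cannot even be evaluated pointwise up to a normalizing constant but there exists a measure $\mu$ on some measurable space $(\Omega, \mathcal{G})$ and a function $H_\omega(z) : \Omega \times \mathcal{Z} \to \mathbb{R}$ which can be evaluated pointwise up to an additive constant such that
$$H(z) = \int H_\omega(z)\, \mu(d\omega). \tag{17}$$
In this context, we consider an event rate of the form
$$\lambda(z) = \int \lambda_\omega(z)\, \mu(d\omega) \tag{18}$$
where $\lambda_\omega : \mathcal{Z} \to \mathbb{R}_+$ and a transition kernel of the form
$$Q(z, dz') = \frac{\int \lambda_\omega(z)\, \mu(d\omega)\, Q_\omega(z, dz')}{\int \lambda_\omega(z)\, \mu(d\omega)}, \tag{19}$$
where $Q_\omega$ is a Markov transition kernel from $\mathcal{Z}$ to $\mathcal{Z}$. In Section 2.2.2, (11), (12) and (13) simply correspond to (17), (18) and (19) if we select $\mu$ as the measure such that $\mu(\{i\}) = 1$ for all $i \in \Omega = [n]$. The sufficient conditions of the previous section can be directly generalized.

(A3) Conditions on $\phi$, $\{\lambda_\omega : \omega \in \Omega\}$ and $\{Q_\omega : \omega \in \Omega\}$
1. There exists a $\rho$-preserving mapping $S : \mathcal{Z} \to \mathcal{Z}$.
2. The event rates $\{\lambda_\omega : \omega \in \Omega\}$ satisfy
$$\int \left[\lambda_\omega(S(z)) - \lambda_\omega(z)\right] \mu(d\omega) = \nabla \cdot \phi(z) - \langle \nabla H(z), \phi(z) \rangle. \tag{20}$$
3. For all $\omega \in \Omega$, the transition kernel $Q_\omega$ satisfies
$$\int \rho(dz)\, \lambda_\omega(z)\, Q_\omega(z, dz') = \rho(S^{-1}(dz'))\, \lambda_\omega(S(z')). \tag{21}$$

If $\mu$ is a probability measure and the derivative $\nabla H_\omega(z)$ is well-defined for almost all $\omega \in \Omega$ then, under weak regularity conditions, it follows from (17) that $\nabla H_\omega(z)$ is an unbiased estimate of $\nabla H(z)$ when $\omega \sim \mu$, and Assumption A3.2 will be satisfied for a divergence-free field if
$$\lambda_\omega(S(z)) - \lambda_\omega(z) = -\langle \nabla H_\omega(z), \phi(z) \rangle. \tag{22}$$
We will refer to this class of PD-MCMC as doubly stochastic in reference to doubly-stochastic Poisson processes.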
Proposition 3. Assume (A3). Then the PDMP admits $\rho$ as invariant distribution.
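The "straightforward calculations" behind Proposition 1 can be sketched as follows. This is a reconstruction from (8) and (A1), not the authors' proof (which is deferred to Appendix A), and it assumes the test function $f$ is regular enough for every integral below to be finite.

```latex
\begin{align*}
\int \rho(dz)\,\lambda(z) \int Q(z,dz')\, f(z')
  &= \int \rho\big(S^{-1}(dz')\big)\,\lambda\big(S(z')\big)\, f(z')
     && \text{by A1.3} \\
  &= \int \rho(dz)\,\lambda\big(S(z)\big)\, f(z)
     && \text{by A1.1.}
\end{align*}
% Substituting this into the left-hand side of (8) leaves
\[
\int \rho(dz)\, f(z)\,\Big[\lambda(S(z)) - \lambda(z)
   - \big\{\nabla\cdot\phi(z) - \langle \nabla H(z), \phi(z)\rangle\big\}\Big] = 0,
\]
% where the bracket vanishes pointwise by A1.2.
```

The same two steps, applied to each index $i$ or each $\omega$ and then summed or integrated against $\mu$, appear to give Propositions 2 and 3.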

2.3 Existing PD-MCMC algorithms

All the existing algorithms we are aware of are based on the following framework. The target distribution admits a density with respect to Lebesgue measure on $\mathcal{X} = \mathbb{R}^d$ equal to $\pi(x) = \exp(-U(x))$. Letting $z = (x, v)$, an extended target distribution $\rho(dz)$ on $\mathcal{Z} = \mathcal{X} \times \mathcal{V}$ is then defined as
$$\rho(dz) = \pi(dx)\, \psi(dv), \tag{23}$$
where $\psi$ is an auxiliary distribution on $\mathcal{V}$, where $\mathcal{V}$ can be for example either $\mathbb{R}^d$ or the unit hypersphere $\mathbb{S}^{d-1}$, so that $n = 2d$. The following linear dynamics is then considered, $\phi(z) = (v, 0_d)$, so the resulting flow is analytically tractable and given by
$$\Phi_t(z) = (x + vt, v). \tag{24}$$
In this case, we have $\nabla \cdot \phi = 0$. Additionally, all these algorithms rely on $S(x, v) = (x, -v)$, which can be viewed as a time reversal, so (9) becomes
$$\lambda(S(z)) - \lambda(z) = \lambda(x, -v) - \lambda(x, v) = -\langle \nabla U(x), v \rangle. \tag{25}$$
These algorithms differ in the way the event rate and the transition kernels are specified. We just give a few examples here and refer the reader to the list of references for other examples.

2.3.1 Bouncy particle sampler

This algorithm proposed in [42] exploits any additive decomposition of the potential $U$, i.e.
$$U(x) = \sum_{i=1}^{m} U_i(x). \tag{26}$$
For $\lambda_{\mathrm{ref}} > 0$, it uses the event rate
$$\lambda(z) = \lambda_{\mathrm{ref}} + \sum_{i=1}^{m} \langle \nabla U_i(x), v \rangle_+$$
where $x_+ := \max(0, x)$. It also relies on the transition kernel
$$Q(z, dz') = \frac{\lambda_{\mathrm{ref}}}{\lambda(z)}\, \delta_x(dx')\, \psi(dv') + \sum_{i=1}^{m} \frac{\langle \nabla U_i(x), v \rangle_+}{\lambda(z)}\, \delta_x(dx')\, \delta_{R_{\nabla U_i}(x)v}(dv'), \tag{27}$$
where, for any vector field $W : \mathbb{R}^d \to \mathbb{R}^d$, we define $R_W(x)$ as
$$R_W(x)v := v - 2\, \frac{\langle W(x), v \rangle}{\|W(x)\|^2}\, W(x). \tag{28}$$
We note that (28) corresponds to a bounce as it can be interpreted as a Newtonian collision with the plane perpendicular to $W$ at $x$. In [42], a normal distribution is used for $\psi$ but the uniform distribution on $\mathbb{S}^{d-1}$ can also be used [35, 16].

We are in the scenario where $\lambda$ and $Q$ are of the form (12) and (13) with $n = m + 1$, $\lambda_i(z) = \langle \nabla U_i(x), v \rangle_+$ and $Q_i(z, dz') = \delta_x(dx')\, \delta_{R_{\nabla U_i}(x)v}(dv')$ for $i \in [m]$, and $\lambda_n(z) = \lambda_{\mathrm{ref}}$, $Q_n(z, dz') = \delta_x(dx')\, \psi(dv')$. It can be checked that Assumption A2 holds in this scenario. In particular, Assumption A2.2 can be verified by checking the stronger condition (16).
Indeed, if we write $\nabla H_i := (\nabla_x H_i, \nabla_v H_i)$ then (16) becomes $\lambda_i(x, -v) - \lambda_i(x, v) = -\langle \nabla_x H_i, v \rangle$, which is satisfied for $H_i(z) := U_i(x)$ for $i \in [m]$ and $H_n(z) := 0$.

For $m = 1$, we refer to this algorithm as the global BPS and for $m > 1$ as the local BPS. The local BPS is computationally advantageous compared to the global BPS when either $U_i(x)$ only depends on a subset of the components of $x$, as for sparse graphical models, and/or when $m$ is very large, as for big data applications.

The BPS algorithm has been further extended to the scenario where one has access to an unbiased estimate of $\nabla U$; see [4] and [2, Section 4.4.2]. The validity of this algorithm can be established as an application of the results on doubly stochastic methods in Section 2.2. We are not aware of any implementation of this algorithm in scenarios where $\mu$ is not an atomic measure with finite support; when $\mu$ is atomic with finite support, the algorithm is the local BPS.
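As an illustration of the global BPS ($m = 1$), the sketch below simulates the process exactly for a standard normal target, $U(x) = \|x\|^2/2$, where the bounce time can be obtained by inverting the integrated rate in closed form. The target, the function names and the choice to return a time average of $x_1^2$ are assumptions made for this example, not part of the original text.

```python
import math
import random


def bounce(v, grad):
    """Reflect v against the hyperplane orthogonal to grad, as in (28)."""
    g2 = sum(g * g for g in grad)
    coef = 2.0 * sum(g * vi for g, vi in zip(grad, v)) / g2
    return [vi - coef * g for vi, g in zip(v, grad)]


def bounce_time(x, v, e):
    """Solve int_0^tau <x + v s, v>_+ ds = e for U(x) = |x|^2 / 2."""
    a = sum(xi * vi for xi, vi in zip(x, v))  # <grad U(x), v> at s = 0
    b = sum(vi * vi for vi in v)              # slope of the rate along the ray
    return (-a + math.sqrt(max(a, 0.0) ** 2 + 2.0 * b * e)) / b


def global_bps(d, n_events, lam_ref, rng):
    """Global BPS for a standard normal target (an assumed example).

    Returns the time average of x_1^2 along the piecewise-linear path.
    """
    x = [rng.gauss(0.0, 1.0) for _ in range(d)]
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    total_time, integral = 0.0, 0.0
    for _ in range(n_events):
        t_bounce = bounce_time(x, v, rng.expovariate(1.0))
        t_ref = rng.expovariate(lam_ref)
        tau = min(t_bounce, t_ref)
        # exact integral of (x_1 + v_1 s)^2 over [0, tau]
        integral += (x[0] ** 2 * tau + x[0] * v[0] * tau ** 2
                     + v[0] ** 2 * tau ** 3 / 3.0)
        total_time += tau
        x = [xi + vi * tau for xi, vi in zip(x, v)]  # linear flow (24)
        if t_ref < t_bounce:
            v = [rng.gauss(0.0, 1.0) for _ in range(d)]  # refreshment, v ~ psi
        else:
            v = bounce(v, x)  # here grad U(x) = x
    return integral / total_time
```

For example, `global_bps(2, 20000, 1.0, random.Random(0))` should return a value close to 1, the second moment of each coordinate under the assumed target. For potentials without a closed-form integrated rate, the inversion step would have to be replaced by thinning.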

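2.3.2 Zig-Zag sampler

This algorithm proposed in [6, 7] uses for $\psi$ the uniform distribution on $\{-1, 1\}^d$ (see footnote 1). It relies on the following event rates
$$\lambda_i(z) = \lambda_{\mathrm{ref},i} + \langle \partial_i U(x), v_i \rangle_+,$$
while the transition kernel is selected as
$$Q_i(z, dz') = \delta_x(dx')\, \delta_{-v_i}(dv'_i) \prod_{j \neq i} \delta_{v_j}(dv'_j).$$
It is also possible to further exploit any additive decomposition of $U(x)$ within this framework and this has been used to develop an efficient sampling algorithm for big data [6]. Again, it is easy to show that Assumption A2 is satisfied.

2.3.3 BPS sampler with randomized bounces

Alternatives to bounces of the form (28) have been proposed where one uses
$$Q(z, dz') = \delta_x(dx')\, Q_x(v, dv') \tag{29}$$
and $\psi(v) = g(\|v\|)$. In this case, Assumption A1.3 is verified if
$$\int \psi(dv)\, \lambda(x, v)\, Q_x(v, dv') = \psi(dv')\, \lambda(x, -v'). \tag{30}$$
Here $\psi$ will be the standard multivariate normal distribution. We consider the scenario where $\lambda(x, v) = \langle \nabla U(x), v \rangle_+$ as in the global BPS. To present the various methods proposed in the literature, a decomposition of the velocity similar to that adopted in [33] is useful:
$$v = a_\parallel \eta_\parallel + a_\perp \eta_\perp, \tag{31}$$
where $\eta_\parallel$ and $\eta_\perp$ are unit norm vectors such that
$$\eta_\parallel \propto \nabla U(x), \qquad \eta_\perp \propto v - \langle v, \eta_\parallel \rangle \eta_\parallel. \tag{32}$$
All the randomized bounce procedures return a vector $v'$
$$v' = a'_\parallel \eta_\parallel + a'_\perp \eta'_\perp, \tag{33}$$
where $\langle \eta_\parallel, \eta'_\perp \rangle = 0$. With this notation, we obtain $\lambda(x, -v') = (-a'_\parallel)_+ \|\nabla U(x)\|$.

Let $\chi(k)$ and $\chi^2(k)$ be the $\chi$ and $\chi^2$ distributions respectively, with $k$ degrees of freedom. Under $\psi$, the random variables $a_\parallel$ and $a_\perp$ are independent and satisfy
$$a_\perp \sim \chi(d - 1), \qquad a_\parallel \sim \mathcal{N}(0, 1). \tag{34}$$
Indeed, we have $a_\perp^2 \sim \chi^2(d - 1)$ and $a_\perp \geq 0$. We give below some examples of kernels $Q_x(v, dv')$ satisfying Equation (30).

1. Independent sampling [2]: [2] proposes using $Q_x(v, dv') \propto \psi(dv')\, \lambda(x, -v') \propto (-a'_\parallel)_+\, \psi(dv')$, which satisfies (30), but a scheme to sample this distribution was not given. Using the parameterization (31)-(33), (34) shows this can be achieved by sampling $a'_\parallel$ according to a density proportional to $(-a'_\parallel)_+$ times the standard normal density, which is equivalent to sampling $-a'_\parallel \sim \chi(2)$. Finally, sample $v^* \sim \psi$ and set $a'_\perp \eta'_\perp = v^* - \langle v^*, \eta_\parallel \rangle \eta_\parallel$.
2. Forward-event chain [33]: In [33], $\psi$ is the uniform distribution on $\mathbb{S}^{d-1}$, whereas we consider the scenario where $\psi$ is the normal distribution.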
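One uses $\eta'_\perp = \eta_\perp$, sets $a'_\parallel = -a_\parallel$ and samples $a'_\perp \sim \chi(d - 1)$. Alternatively, sample $-a'_\parallel \sim \chi(2)$ and set $a'_\perp = a_\perp$. For either scheme, we recover the method of [33] on $\mathbb{S}^{d-1}$ by normalizing $v'$, i.e. setting $v' \leftarrow v' / \|v'\|$.
3. Autoregressive bounce: this is a new scheme where one samples $-a'_\parallel \sim \chi(2)$ with probability $p_b$ and sets $a'_\parallel = -a_\parallel$ otherwise, samples $v^* \sim \psi$ and sets $a^*_\perp \eta^*_\perp = v^* - \langle v^*, \eta_\parallel \rangle \eta_\parallel$. Finally, set $a'_\perp \eta'_\perp = \rho\, a_\perp \eta_\perp + \sqrt{1 - \rho^2}\, a^*_\perp \eta^*_\perp$ for $\rho \in [-1, 1]$.

The properties of these randomized bounces are not yet well understood. In Section 6, we compare them experimentally on a variety of models.

$^1$ In this scenario, $\rho(dz)$ does not admit a density with respect to Lebesgue measure but the results discussed previously can be directly extended to this scenario.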
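The independent sampling scheme of item 1 can be sketched as below, using the fact that a $\chi(2)$ variable has density proportional to $r e^{-r^2/2}$ and can be drawn by inversion. The function names are hypothetical and $\psi$ is assumed to be the standard normal; this is an illustration of the construction rather than a reference implementation.

```python
import math
import random


def sample_chi2dof(rng):
    """Draw from the chi distribution with 2 degrees of freedom by inversion.

    Its density is proportional to r * exp(-r^2 / 2) on r > 0.
    """
    return math.sqrt(-2.0 * math.log(1.0 - rng.random()))


def independent_bounce(grad, rng):
    """Sample v' with law proportional to <grad, -v'>_+ psi(dv').

    psi is assumed to be the standard normal distribution.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    eta = [g / norm for g in grad]                # unit vector along grad U(x)
    a_par = -sample_chi2dof(rng)                  # -a'_par ~ chi(2)
    v_star = [rng.gauss(0.0, 1.0) for _ in grad]  # v* ~ psi
    proj = sum(vs * e for vs, e in zip(v_star, eta))
    # parallel part a_par * eta plus the part of v* orthogonal to eta
    return [a_par * e + (vs - proj * e) for vs, e in zip(v_star, eta)]
```

By construction the returned velocity always points downhill, i.e. $\langle \nabla U(x), v' \rangle \leq 0$, which is what makes $\lambda(x, -v')$ positive in (30).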

2.4 Hamiltonian PD-MCMC

Although all previously proposed methods rely on the linear flow (24), the framework presented in Section 2.2 is much more flexible. We exploit here this generalization to provide novel continuous-time PD-MCMC algorithms relying on Hamiltonian dynamics.$^2$ As in Section 2.3, we consider targets of the form $\rho(z) = \pi(x)\, \psi(v)$ with $\pi(x) = \exp(-U(x))$ being the density of interest on $\mathcal{X} = \mathbb{R}^d$ and $\psi$ the standard multivariate normal on $\mathcal{V} = \mathbb{R}^d$. We use here the Hamiltonian flow $\Phi_t$ associated with the Hamiltonian
$$\hat{H}(z) = V(x) + K(v), \tag{35}$$
where $K(v) = v^T v / 2$ and $\mu(x) \propto \exp(-V(x))$ is an auxiliary probability density ensuring $\Phi_t$ is analytically tractable, e.g., $V$ is quadratic or linear [41]. For example, if $\pi(x)$ is a posterior density arising from a Gaussian prior, then $\mu(x)$ could be this Gaussian prior. Alternatively, $\mu(x)$ can always be selected as a Gaussian approximation to $\pi(x)$. We can then rewrite the target as $\rho(z) = \exp(-H(z))$ where
$$H(z) = \tilde{U}(x) + V(x) + K(v),$$
where $\tilde{U}(x) := U(x) - V(x)$. This is the same rationale as in elliptical slice sampling-type algorithms [36, 8]: both schemes use an exact Hamiltonian dynamics associated with an approximation of $\pi$ to explore the space. The difference between these algorithms and the method proposed here is that we correct for the discrepancy between $\mu$ and $\pi$ by using a PDMP mechanism instead of slice sampling techniques.

The Hamiltonian flow $\Phi_t$ is induced by the ODE of drift $\phi = (\phi_x, \phi_v)$ where $\phi_x = \nabla_v \hat{H}(z) = v$ and $\phi_v = -\nabla_x \hat{H}(z) = -\nabla V(x)$. Hence, we have $\nabla \cdot \phi(z) = 0$ and
$$\nabla \cdot \phi(z) - \langle \nabla H(z), \phi(z) \rangle = -\langle \nabla_x H(z), \phi_x \rangle - \langle \nabla_v H(z), \phi_v \rangle = -\langle \nabla \tilde{U}(x), v \rangle - \langle \nabla V(x), v \rangle + \langle \nabla V(x), v \rangle = -\langle \nabla \tilde{U}(x), v \rangle.$$
One can check that Assumption A1 is thus verified for $S(z) = (x, -v)$ if we use an event rate and transition kernel as in the global BPS but based on $\tilde{U}$ only$^3$:
$$\lambda(z) := \lambda_{\mathrm{ref}} + \langle \nabla \tilde{U}(x), v \rangle_+,$$
$$Q(z, dz') := \frac{\lambda_{\mathrm{ref}}}{\lambda(z)}\, \delta_x(dx')\, \psi(dv') + \frac{\langle \nabla \tilde{U}(x), v \rangle_+}{\lambda(z)}\, \delta_x(dx')\, \delta_{R_{\nabla \tilde{U}}(x)v}(dv').$$
We can alternatively use the randomized bounces described in Section 2.3, substituting $\tilde{U}$ for $U$. Figure 1 illustrates a sample path obtained from the resulting Hamiltonian BPS algorithm.
Local and doubly stochastic versions of this algorithm, as for BPS [42, 1, 41], can also be directly developed. In the big data examples considered in [1, 6, 4], one could for example use for $\mu$ a Gaussian approximation of $\pi$. A local algorithm can then be obtained using for $\nabla \tilde{U}_i$ the difference of the gradient of the log-likelihood corresponding to data $i$ and the properly rescaled gradient of the log-approximate posterior, as in [6]. If the terms $\nabla \tilde{U}_i$ are locally bounded, we can simulate the PDMP exactly using thinning techniques which boil down to data subsampling [1, 6]. This provides an alternative to [13], which also exploits Hamiltonian dynamics and subsampling but does not preserve $\pi$ as invariant distribution.

Finally, we also note that the methods introduced in this section can be combined with the HMC algorithm of [41] proposed to perform exact simulation of constrained normal distributions. This significantly extends the applicability of the work in [41], which can be viewed as a special case where $\nabla \tilde{U} = 0$. An alternative approach to constrained problems is proposed in [5] but it is limited to piecewise-linear dynamics.

$^2$ The first arXiv version of [1] proposed a version of the BPS algorithm using Hamiltonian dynamics but uses a different approach based on manifolds. The algorithm suggested therein does not preserve the correct invariant distribution.

$^3$ For $\nabla \tilde{U} = 0$, this algorithm corresponds to a continuous-time HMC algorithm with momentum/velocity refreshment at Poisson times.
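To make the "analytically tractable" Hamiltonian flow concrete, the sketch below takes the simplest assumed case $V(x) = \|x\|^2/2$ (so $\mu$ is a standard normal), for which $\Phi_t$ is a rotation in each $(x_i, v_i)$ plane. The function names and the callable `grad_u_tilde` are hypothetical; simulating the event times along this flow would additionally require inversion or thinning, which is not shown.

```python
import math


def hamiltonian_flow(x, v, t):
    """Exact flow of dx/dt = v, dv/dt = -x, i.e. V(x) = |x|^2 / 2 in (35)."""
    c, s = math.cos(t), math.sin(t)
    x_t = [xi * c + vi * s for xi, vi in zip(x, v)]
    v_t = [-xi * s + vi * c for xi, vi in zip(x, v)]
    return x_t, v_t


def energy(x, v):
    """Auxiliary Hamiltonian V(x) + K(v) for the assumed quadratic V."""
    return 0.5 * (sum(xi * xi for xi in x) + sum(vi * vi for vi in v))


def hbps_rate(x, v, grad_u_tilde, lam_ref):
    """Event rate lam_ref + <grad U~(x), v>_+ of the Hamiltonian BPS."""
    g = grad_u_tilde(x)
    return lam_ref + max(0.0, sum(gi * vi for gi, vi in zip(g, v)))
```

The flow conserves `energy` and satisfies the semi-group property, so only the discrepancy $\tilde{U} = U - V$ drives the events, as in the rate displayed above.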

[Figure 1 plot: three panels titled BPS(Hamiltonian, global), BPS(PiecewiseLinear, global) and BPS(PiecewiseLinear, local); axes: first position coordinate, second position coordinate.]

Figure 1: Examples of paths for the Hamiltonian BPS (left), global BPS (middle) and local BPS (blue). All algorithms are run for a wall clock time of 15ms on a 1-dimensional Gaussian latent field with sparsely observed Poisson distributed observations (one observation for every 1 latent variables), see Section 6.1 for details. The first two position coordinates are shown.

2.5 Using generalized Metropolis-Hastings transitions at event times

All the algorithms we have considered so far are such that only a part of the state $z = (x, v)$ is updated at event times, i.e., the transition kernel is of the form $Q(z, dz') = \delta_x(dx')\, Q_x(v, dv')$. We might be interested in designing more general transition kernels satisfying Assumption A1.3 and similarly Assumption A2.3 or Assumption A3.3.

For the sake of illustration, consider Assumption A1.3. This can be rewritten as
$$\int \bar{\rho}(dz)\, Q(z, dz') = \bar{\rho}(S^{-1}(dz')) \tag{36}$$
for the probability measure $\bar{\rho}(dz) \propto \rho(dz)\, \lambda(z)$, assuming that $\int \rho(dz)\, \lambda(z) < \infty$, a weak condition which we assume holds. If the mapping $S$ is an involution, i.e., $S^{-1} = S$, and we can design a kernel $Q$ satisfying the so-called skewed detailed balance condition
$$\bar{\rho}(dz)\, Q(z, dz') = \bar{\rho}(S(dz'))\, Q(S(z'), S(dz)), \tag{37}$$
then it follows directly by integrating both terms in this equality with respect to the variable $z$ that it will satisfy (36). We present here a generic mechanism which can be used to achieve this, known as the Generalized Metropolis-Hastings (GMH) algorithm.

The GMH algorithm is a simple extension of MH; see for example [31, pp. ]. For a probability measure $\nu(dz) = \nu(z)\, dz$ on $\mathcal{Z}$, let us consider the following GMH kernel defined for a Markov proposal kernel $M$ by
$$T(z, dz') = \beta(z, z')\, M(z, dz') + \left[1 - \int \beta(z, w)\, M(z, dw)\right] \delta_{S(z)}(dz') \tag{38}$$
where
$$\beta(z, z') = g\left(\frac{\nu(S(dz'))\, M(S(z'), S(dz))}{\nu(dz)\, M(z, dz')}\right).$$
We make the following assumptions:

(A4) Conditions on $\nu$, $S$, $M$ and $g$
1. The mapping $S$ is an involution, i.e., $S^{-1} = S$.
2. The Radon-Nikodym derivative
$$\frac{\nu(S(dz'))\, M(S(z'), S(dz))}{\nu(dz)\, M(z, dz')}$$
is defined and positive for almost all $(z, z') \in \mathcal{Z} \times \mathcal{Z}$.
3. The function $g : \mathbb{R}_+ \to [0, 1]$ satisfies $g(r) = r\, g(1/r)$.

Assumption A4.3 is satisfied for $g(r) = \min(1, r)$. For a deterministic proposal $M(z, dz') = \delta_{\Psi(z)}(dz')$, Assumption A4.2 is satisfied if $\Psi$ admits an inverse $\Psi^{-1}$ such that
$$\Psi^{-1} = S \circ \Psi \circ S \tag{40}$$
and then the acceptance probability is given by
$$\beta(z, z') = \beta(z) = g\left(\frac{\nu(S \circ \Psi(dz))}{\nu(dz)}\right). \tag{41}$$

Proposition 4. Assume (A4). Then the GMH kernel $T$ defined by (38) satisfies the following skewed detailed balance condition
$$\nu(dz)\, T(z, dz') = \nu(S(dz'))\, T(S(z'), S(dz)). \tag{42}$$
If additionally $S$ is a $\nu$-preserving mapping then the GMH kernel is $\nu$-invariant.

The proof of this result follows from direct calculations given in the Appendix and can also be found in [31, pp. ].

Using this result, it is possible to check Assumption A1.3 easily for the BPS and Zig-Zag processes. For example, for the BPS, $Q$ is of the form (38) with $\nu = \bar{\rho}$, $S^{-1} = S$ and $g(r) = \min(1, r)$, as we use a deterministic proposal $\Psi(z) = (x, R_{\nabla U}(x)v)$ which verifies $\Psi^{-1} = S \circ \Psi \circ S$, so $\beta(z, z') = 1$ for all $z, z'$. Hence by Proposition 4, $Q$ satisfies the skewed detailed balance (37), hence it satisfies (36).

The benefit of the GMH approach is that it allows us to define much more general kernels at event times. For example, one could use a deterministic proposal with $\Psi(z) = (x, R_{\nabla \hat{U}}(x)v)$ where $\hat{U}$ is a computationally cheap approximation of $U$. It is valid to use such a deterministic proposal as it satisfies $\Psi^{-1}(z) = S \circ \Psi \circ S(z)$. In this case, there is a probability of the bounce being rejected, in which case we set $z \leftarrow S(z)$. We can also use transition kernels which modify the component $x$ of $z$.

3 Discrete-time PDMP and PD-MCMC

We introduce here the class of discrete-time PDMP and present general conditions for such processes to ensure invariance w.r.t. a strictly positive density $\rho(z) = \exp(-H(z))$. These conditions parallel the conditions given in Section 2.2 for continuous-time algorithms.

3.1 Discrete-time PDMP

As in the continuous-time scenario, we assume for simplicity that $\mathcal{Z} = \mathbb{R}^n$. A $\mathcal{Z}$-valued discrete-time PDMP process $\{z_t; t \in \mathbb{N}\}$ involves a deterministic dynamics altered by random jumps at random event times. It is defined through

1. a diffeomorphism $\Phi : \mathcal{Z} \to \mathcal{Z}$ with the absolute value of the determinant of the Jacobian satisfying $|\nabla \Phi(z)| > 0$ for all $z$,
2. an acceptance probability $\alpha : \mathcal{Z} \to [0, 1]$ with $1 - \alpha(z)$ being the probability of having an event at the next time step when the current state is $z$, and
3. a Markov transition kernel $Q$ from $\mathcal{Z}$ to $\mathcal{Z}$ where the state at event time $t$ is given by $z_t \sim Q(z_{t-1}, \cdot)$.

Algorithm 2 describes how to simulate the path of a discrete-time PDMP. It will be convenient to use the conventions $\prod_{i=0}^{-1} = 1$, $\Phi^0(z) = z$ and $\Phi^{r+1}(z) = \Phi^r \circ \Phi(z)$ for $r \in \mathbb{N}$.

Algorithm 2 Simulation of discrete-time PDMP
1. Initialize $z_0$ arbitrarily on $\mathcal{Z}$ and set $t_0 \leftarrow 0$.
2. for $k = 1, 2, \ldots$ do
   (a) Sample inter-event time $\tau_k$, where $\tau_k$ is a non-negative integer-valued random variable such that
   $$\mathbb{P}(\tau_k = j) = \left\{1 - \alpha\big(\Phi^j(z_{t_{k-1}})\big)\right\} \prod_{i=0}^{j-1} \alpha\big(\Phi^i(z_{t_{k-1}})\big). \tag{43}$$
   (b) If $\tau_k \geq 1$ then for $r \in \{1, \ldots, \tau_k\}$, set
   $$z_{t_{k-1}+r} \leftarrow \Phi^r(z_{t_{k-1}}). \tag{44}$$
   (c) Set $t_k \leftarrow t_{k-1} + \tau_k + 1$ and sample
   $$z_{t_k} \sim Q(z_{t_k - 1}, \cdot). \tag{45}$$

The process $\{z_t; t \in \mathbb{N}\}$ is nothing but a Markov process of transition kernel
$$K(z, dz') = \alpha(z)\, \delta_{\Phi(z)}(dz') + (1 - \alpha(z))\, Q(z, dz'). \tag{46}$$

3.2 From discrete-time PDMP to PD-MCMC

Similarly to Section 2.2, assume we are interested in sampling a strictly positive density $\rho(z)$ given by (7) using a discrete-time PDMP process. Invariance of the kernel $K$ with respect to $\rho$ is satisfied if, by definition, one has
$$\int \rho(dz)\, K(z, dz') = \rho(dz'). \tag{47}$$
From (46), (47) can be rewritten as
$$\rho\big(\Phi^{-1}(z')\big)\, \alpha\big(\Phi^{-1}(z')\big)\, \big|\nabla \Phi^{-1}(z')\big|\, dz' + \int \rho(dz)\, \{1 - \alpha(z)\}\, Q(z, dz') = \rho(dz'). \tag{48}$$
All the following developments could also be adapted to sample from distributions on discrete spaces but this will not be discussed here.

3.2.1 Sufficient conditions for global methods

We provide here useful sufficient conditions on $\Phi$, $\alpha$, and $Q$ to ensure $\rho$-invariance of the associated discrete-time PDMP, without making any structural assumption on these objects.

(A5) Conditions on $\Phi$, $\alpha$, and $Q$
1. There exists a $\rho$-preserving mapping $S : \mathcal{Z} \to \mathcal{Z}$.
2. The acceptance probability $\alpha$ satisfies
$$\log \alpha(S \circ \Phi(z)) - \log \alpha(z) = -\log |\nabla \Phi(z)| + H(\Phi(z)) - H(z). \tag{49}$$
3. The kernel $Q$ satisfies
$$\int \rho(dz)\, (1 - \alpha(z))\, Q(z, dz') = \rho(S^{-1}(dz'))\, (1 - \alpha(S(z'))). \tag{50}$$

Remark 5. Conditions A5.1 to A5.3 parallel the conditions A1.1 to A1.3.

Proposition 6. Assume (A5). Then the discrete-time PDMP admits $\rho$ as invariant distribution.

Remark 7. When $S$ is an involution so that $\rho(S^{-1}(dz')) = \rho(S(dz'))$, condition A5.3 can be interpreted as a skewed invariance condition on $\nu(dz) \propto \rho(dz)(1 - \alpha(z))$. The quantity $\rho(dz)(1 - \alpha(z))$ is proportional to the invariant distribution of the jump chain, i.e. the distribution of those states where the proposal $\Phi(z)$ is rejected.
It has a clear analogue in the continuous-time scenario where the jumps occur at states with distribution proportional to $\rho(dz)\lambda(z)$.
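A small numerical sketch may help to see condition (49) at work. It assumes a two-dimensional state $z = (x, v)$, the hypothetical potential $U(x) = x^4/4 + x^2/2$, the volume-preserving map $\Phi(x, v) = (x + \epsilon v, v)$, the flip $S(x, v) = (x, -v)$ and the Metropolis-type choice $\alpha(z) = \min\{1, \rho(\Phi(z))/\rho(z)\}$; none of these specific choices come from this section. The function `kernel_step` draws once from the kernel $K$ in (46) for a user-supplied event kernel.

```python
import math
import random

EPS = 0.3  # step size of the illustrative map (an assumption)


def hamiltonian(z):
    """H(z) = U(x) + v^2 / 2 with the hypothetical U(x) = x^4/4 + x^2/2."""
    x, v = z
    return 0.25 * x ** 4 + 0.5 * x ** 2 + 0.5 * v ** 2


def phi(z):
    """Volume-preserving map Phi(x, v) = (x + EPS * v, v)."""
    x, v = z
    return (x + EPS * v, v)


def flip(z):
    """The involution S(x, v) = (x, -v)."""
    x, v = z
    return (x, -v)


def log_alpha(z):
    """log of alpha(z) = min{1, rho(Phi(z)) / rho(z)}."""
    return min(0.0, hamiltonian(z) - hamiltonian(phi(z)))


def kernel_step(z, sample_q, rng):
    """One draw from K(z, .) in (46); sample_q(z, rng) draws from Q(z, .)."""
    if rng.random() < math.exp(log_alpha(z)):
        return phi(z)
    return sample_q(z, rng)
```

With these choices, `log_alpha(flip(phi(z))) - log_alpha(z)` equals `hamiltonian(phi(z)) - hamiltonian(z)` up to rounding, which is (49) with $|\nabla \Phi| = 1$.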

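3.2.2 Sufficient conditions for local methods

In scenarios where $H(z)$ can be decomposed as in (11), it will prove convenient to consider an acceptance probability of the form
$$\alpha(z) = \prod_{i=1}^{n} \alpha_i(z) \tag{51}$$
where $\alpha_i : \mathcal{Z} \to [0, 1]$ are themselves acceptance probabilities$^4$. To sample an event of probability $\alpha(z)$, we can sample independent Bernoulli variables $B_i$ such that $B_i \sim \mathrm{Ber}(1 - \alpha_i(z))$ for $i \in [n]$, where $\mathrm{Ber}(p)$ is the Bernoulli distribution of parameter $p$. Hence the probability of the event $B = (0, \ldots, 0)$, where $B = (B_1, \ldots, B_n)$, is $\alpha(z)$. Thus if $B = (0, \ldots, 0)$, we will set $z \leftarrow \Phi(z)$. Otherwise, that is if $B \in \mathcal{B}$ where $\mathcal{B} = \{0, 1\}^n \setminus \{(0, \ldots, 0)\}$, then we will sample $z \sim Q(z, \cdot)$ where
$$Q(z, dz') = \sum_{b \in \mathcal{B}} Q_{|B| \geq 1}(b \mid z)\, Q_b(z, dz'). \tag{52}$$
In this expression $Q_b$ is a Markov kernel and $Q_{|B| \geq 1}(b \mid z)$ is the distribution of $B$ conditioned upon $|B| := \sum_{i=1}^{n} B_i \geq 1$, which is given by
$$Q_{|B| \geq 1}(b \mid z) = \frac{\prod_{i=1}^{n} \mathrm{Ber}(b_i; 1 - \alpha_i(z))}{1 - \alpha(z)}. \tag{53}$$
Based on these structural assumptions on $\alpha$ and $Q$, we can provide useful sufficient local conditions on $\Phi$, $\{\alpha_i : i \in [n]\}$ and $\{Q_b : b \in \mathcal{B}\}$ to ensure invariance of the associated discrete-time PDMP w.r.t. $\rho$ is satisfied.

(A6) Conditions on $\Phi$, $\{\alpha_i : i \in [n]\}$, and $\{Q_b : b \in \mathcal{B}\}$
1. There exists a $\rho$-preserving mapping $S : \mathcal{Z} \to \mathcal{Z}$.
2. The acceptance probabilities $\{\alpha_i : i \in [n]\}$ satisfy
$$\sum_{i=1}^{n} \left[\log \alpha_i(S \circ \Phi(z)) - \log \alpha_i(z)\right] = -\log |\nabla \Phi(z)| + H(\Phi(z)) - H(z). \tag{54}$$
3. For all $b \in \mathcal{B}$, the transition kernel $Q_b$ satisfies
$$\int \rho(dz)\, (1 - \alpha(z))\, Q_{|B| \geq 1}(b \mid z)\, Q_b(z, dz') = \rho(S^{-1}(dz'))\, (1 - \alpha(S(z')))\, Q_{|B| \geq 1}(b \mid S(z')). \tag{55}$$

For a mapping such that $|\nabla \Phi| = 1$, Assumption A6.2 is satisfied if for all $i \in [n]$
$$\log \alpha_i(S \circ \Phi(z)) - \log \alpha_i(z) = H_i(\Phi(z)) - H_i(z). \tag{56}$$

Proposition 8. Assume (A6). Then the discrete-time PDMP admits $\rho$ as invariant distribution.

3.2.3 Sufficient conditions for doubly stochastic methods

Consider finally the scenario where $H(z)$ is given by (17). In this context, we consider an acceptance probability of the form
$$\alpha(z) = \exp\left[\int \log \alpha_\omega(z)\, \mu(d\omega)\right] \tag{57}$$
where $\alpha_\omega : \mathcal{Z} \to [0, 1]$, which is a generalization of (51) from the measure $\mu(\{i\}) = 1$ on a finite space $\Omega = [n]$ to an arbitrary measure on a general space.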
Obviously, when $\Omega$ is not finite, the strategy previously adopted to simulate an event of probability $\alpha(z)$ is not applicable. However, this can be achieved by simulating a Poisson process $P$ on $\Omega$ of rate $\Lambda(d\omega) = -\log \alpha_\omega(z)\, \mu(d\omega)$, the law of which we denote by $Q(dP \mid z)$, and noticing that $\alpha(z)$ is the void probability of $P$. A similar idea was used in a different context in [3]. Hence if the number of points is null, i.e. $|P| = 0$, then we will set $z \leftarrow \Phi(z)$. If $|P| \geq 1$, that is $P \in \mathcal{P}$ where $\mathcal{P}$ is the set of configurations of the Poisson process having at least one point, then we will sample $z \sim Q(z, \cdot)$ where
$$Q(z, dz') = \int_{\mathcal{P}} Q_{|P| \geq 1}(dP \mid z)\, Q_P(z, dz'). \tag{58}$$

$^4$ The authors in [32] derive a continuous-time local PD-MCMC by using this factorized acceptance probability, using a mapping $\Phi(z) = (x + \epsilon v, v)$ and taking the limit as $\epsilon \to 0$. However, for a strictly positive $\epsilon > 0$, they do not define a discrete-time local PD-MCMC as proposed here.

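In this expression $Q_P$ is a Markov kernel and $Q_{|P| \geq 1}(dP \mid z)$ is the law of the Poisson process $P$ conditioned upon the event $|P| \geq 1$, which is given by
$$Q_{|P| \geq 1}(dP \mid z) = \frac{\mathbb{I}(|P| \geq 1)\, Q(dP \mid z)}{1 - \alpha(z)}. \tag{59}$$

(A7) Conditions on $\Phi$, $\{\alpha_\omega : \omega \in \Omega\}$ and $\{Q_P : P \in \mathcal{P}\}$
1. There exists a $\rho$-preserving mapping $S : \mathcal{Z} \to \mathcal{Z}$.
2. The acceptance probabilities $\{\alpha_\omega : \omega \in \Omega\}$ satisfy
$$\int \left[\log \alpha_\omega(S \circ \Phi(z)) - \log \alpha_\omega(z)\right] \mu(d\omega) = -\log |\nabla \Phi(z)| + H(\Phi(z)) - H(z). \tag{60}$$
3. For all $P \in \mathcal{P}$, the transition kernel $Q_P$ satisfies
$$\int \rho(dz)\, (1 - \alpha(z))\, Q_{|P| \geq 1}(dP \mid z)\, Q_P(z, dz') = \rho(S^{-1}(dz'))\, (1 - \alpha(S(z')))\, Q_{|P| \geq 1}(dP \mid S(z')). \tag{61}$$

Assumption A7.3 is an informal expression meaning that we assume that for $Q_{|P| \geq 1}(dP \mid z)$-almost all $P \in \mathcal{P}$
$$\int \rho(dz)\, (1 - \alpha(z))\, \frac{dQ_{|P| \geq 1}(P \mid z)}{dQ_{|P| \geq 1}(P \mid S(z'))}\, Q_P(z, dz') = \rho(S^{-1}(dz'))\, (1 - \alpha(S(z'))),$$
and the Radon-Nikodym derivative in the expression above is well-defined and strictly positive for $Q_P(z, dz')$-almost all $z'$. For a mapping such that $|\nabla \Phi| = 1$, Assumption A7.2 is satisfied if for all $\omega \in \Omega$
$$\log \alpha_\omega(S \circ \Phi(z)) - \log \alpha_\omega(z) = H_\omega(\Phi(z)) - H_\omega(z). \tag{62}$$

Proposition 9. Assume (A7). Then the discrete-time PDMP admits $\rho$ as invariant distribution.

3.3 Existing PD-MCMC algorithms

A few algorithms proposed in the literature can be considered as special instances of discrete-time PD-MCMC algorithms. They all rely on the same framework discussed in Section 2.3, that is, they sample an extended target density $\rho(z) = \exp(-H(z)) = \pi(x)\, \psi(v)$ defined in (23) on $\mathcal{Z} = \mathbb{R}^d \times \mathbb{R}^d$, where $\pi$ is the target distribution of interest and $\psi$ is a standard multivariate normal. They use a mapping such that $|\nabla \Phi| = 1$, $\Phi^{-1} = S \circ \Phi \circ S$ with $S(z) = (x, -v)$ and $\alpha(z) = \min\{1, \rho(\Phi(z)) / \rho(z)\}$. A fairly generic scheme is detailed in Algorithm 3.

Algorithm 3 Discrete-time PD-MCMC
1. With probability $\min\{1, \rho(\Phi(z)) / \rho(z)\}$, set $z \leftarrow \Phi(z)$.
2. Otherwise, sample $z' \sim M(z, \cdot)$.
3. With probability
$$\beta((x, v), (x', v')) = \min\left\{1, \frac{\left[\rho(x', v') - \rho(\Phi(x', -v'))\right]_+ M((x', -v'), (x, -v))}{\left[\rho(x, v) - \rho(\Phi(x, v))\right]_+ M((x, v), (x', v'))}\right\}$$
set $z \leftarrow z'$, otherwise set $z \leftarrow (x, -v)$.

This scheme satisfies Assumption A5.1 to Assumption A5.3 and is thus $\rho$-invariant.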
In particular, Assumption A5.3 is satisfied as Steps 2 and 3 correspond to using for the event kernel $Q$ a GMH kernel satisfying the skewed-detailed balance condition (42) for $\nu(dz) \propto \rho(dz)\,(1 - \alpha(z))$.

Remark 10. Algorithm 3 can alternatively be viewed as a composition of reversible kernels. First, a delayed-rejection algorithm proposing $\Phi$ and, in case of rejection, then proposing $M(z, S^{-1}(\cdot))$. Second, the involution $S$ is applied unconditionally. In the delayed-rejection framework, we can view condition A5.3 as a condition on delayed-rejection kernels expressed in a sort of remainder form. While our algorithm uses two proposals, extending this remainder condition to multiple proposals would require that each $Q_k$ satisfies

$$\rho(dz) \prod_{i=1}^{k-1} (1 - \alpha_i(z))\, Q_k(z, dz') = \rho(dz') \prod_{i=1}^{k-1} (1 - \alpha_i(z'))\, Q_k(z', dz).$$

3.3.1 Guided random walk

This algorithm was proposed in [25]. It is a special case of Algorithm 3 which uses $\Phi(z) = (x + v\epsilon, v)$ for some $\epsilon > 0$ and a proposal $M(z, dz') = \delta_{S(z)}(dz')$ which is accepted with probability 1.

3.3.2 Hamiltonian Monte Carlo

The celebrated HMC algorithm proposed in [18] is also a special case of Algorithm 3 which uses a proposal $M(z, dz') = \delta_{S(z)}(dz')$. However, contrary to the guided random walk, it uses for $\Phi$ a symplectic integrator targeting the Hamiltonian $H$. This deterministic proposal indeed satisfies $|\nabla \Phi| = 1$ and $\Phi^{-1} = S \circ \Phi \circ S$ (see, e.g., [38, 3]). The resulting PD-MCMC kernel $K$ is usually combined with a momentum refreshment step $v \sim \psi$.

3.3.3 Reflective Slice Sampling: discrete-time BPS schemes

Several versions of slice sampling, known as reflective slice sampling, are based on bounces similar to the BPS and are also special cases of Algorithm 3; see [37, Section 7]. They rely on $\Phi(z) = (x + v\epsilon, v)$ for some $\epsilon > 0$ and a deterministic proposal $M(z, dz') = \delta_{\Psi(z)}(dz')$. Reflective slice sampling with inner reflections uses

$$\Psi(z) = (x', v') = (x, R_{\nabla U}(x) v),$$

while reflective slice sampling with outer reflections uses

$$\Psi(z) = (x', v') = (x + v\epsilon + R_{\nabla U}(x + v\epsilon) v \epsilon,\; R_{\nabla U}(x + v\epsilon) v).$$

Both proposals satisfy $\Psi^{-1} = S \circ \Psi \circ S$. The outer version of the algorithm has recently been proposed independently in [43]; see also [44] for a related proposal in the context of nested sampling. In either case, the acceptance probability simplifies to

$$\beta((x, v), (x', v')) = \min\left\{1, \frac{[\pi(x') - \pi(x' - v'\epsilon)]^+}{[\pi(x) - \pi(x + v\epsilon)]^+}\right\}.$$

Intuitively, these algorithms can be interpreted as discrete-time versions of the BPS process. Elementary calculations indeed show that in both cases $\alpha(z) \approx 1 - \epsilon \langle \nabla U(x), v \rangle^+$ and $\beta(z) \to 1$ as $\epsilon \to 0$ under regularity assumptions. We provide here a weak convergence result for the resulting Markov chain where $\psi$ is the uniform distribution on $S^{d-1}$, to limit technicalities.

Proposition 11. Under regularity conditions, reflective slice sampling with inner reflections converges weakly to the BPS for $\lambda_{\mathrm{ref}} = 0$ as $\epsilon \to 0$.

A precise mathematical statement, Theorem 12, and its proof are given in Appendix B.
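To make the reflective slice sampling scheme with inner reflections concrete, the following is a minimal Python sketch of one transition, written directly from the three steps of Algorithm 3 with the inner-reflection proposal. The function names (`reflect`, `inner_reflection_step`), the use of log-densities and the NumPy random generator are illustrative assumptions and not part of the original text; the sketch also assumes that $\nabla U(x)$ is non-zero whenever a bounce is proposed.

```python
import numpy as np

def reflect(v, g):
    """Bounce operator R_g v: reflect v in the hyperplane orthogonal to g."""
    return v - 2.0 * np.dot(g, v) / np.dot(g, g) * g

def inner_reflection_step(x, v, eps, log_pi, grad_U, rng):
    """One transition of the discrete-time scheme with inner reflections.

    log_pi(x) is the log target density and grad_U(x) the gradient of
    U = -log pi; both are user-supplied callables (hypothetical names).
    """
    lp_x = log_pi(x)
    lp_fwd = log_pi(x + eps * v)
    # Step 1: move to Phi(z) = (x + v eps, v) w.p. min(1, pi(x + v eps) / pi(x)).
    if rng.uniform() < np.exp(min(0.0, lp_fwd - lp_x)):
        return x + eps * v, v
    # Step 2: deterministic proposal v' = R_{grad U}(x) v, position unchanged.
    v_new = reflect(v, grad_U(x))
    # Step 3: accept w.p. min(1, [pi(x) - pi(x - v' eps)]^+ / [pi(x) - pi(x + v eps)]^+).
    # Both terms are divided by pi(x). This point is only reached when
    # pi(x + v eps) < pi(x), so the denominator is strictly positive.
    num = max(0.0, 1.0 - np.exp(log_pi(x - eps * v_new) - lp_x))
    den = 1.0 - np.exp(lp_fwd - lp_x)
    if rng.uniform() < min(1.0, num / den):
        return x, v_new
    # Rejection: flip the velocity, z <- S(z) = (x, -v).
    return x, -v
```

In this sketch the speed $\|v\|$ never changes, so in practice the kernel would be combined with a refreshment of $v$.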
We can modify this algorithm to include a refreshment, i.e. by sampling $v \sim \psi$ with probability $\lambda_{\mathrm{ref}}\epsilon$. The weak convergence result of Proposition 11 can be directly extended to this case to show that the resulting discrete-time process converges weakly to the BPS process with refreshment rate $\lambda_{\mathrm{ref}}$. Note that the kernel $K$ would still be $\rho$-invariant if the bounce were using a computationally cheap approximation $\nabla \hat{U}$ of $\nabla U$. However, this discrete-time algorithm does not converge to the BPS process, as the probability of accepting $z' = S(z)$ does not vanish as $\epsilon \to 0$ in this scenario. Under regularity conditions, it will instead converge towards the algorithm described at the end of Section …

3.4 Extensions

3.4.1 Discrete-time BPS with randomized bounces

As discussed in Section 2.3.3, a variety of randomized bounces has been proposed for continuous-time PD-MCMC. We show here how to generalize these ideas to discrete time. Let $\psi$ denote the standard normal distribution on $\mathbb{R}^d$, and let $\Phi(z) = (x + v\epsilon, v)$, $\alpha(z) = \min\{1, \rho(\Phi(z))/\rho(z)\}$ and $S(z) = (x, -v)$, which satisfy Assumptions A5.1 and A5.2. We select an event kernel of the form $Q(z, dz') = \delta_x(dx')\, Q_x(v, dv')$ based on a proposal $M_x(v, dv') = M_x(v, v')\,dv'$. This leads to Algorithm 4.

Algorithm 4 Discrete-time BPS with randomized bounces

1. With probability $\min\{1, \pi(x + v\epsilon)/\pi(x)\}$, set $z \leftarrow (x + v\epsilon, v)$.
2. Otherwise
   (a) Sample $v' \sim M_x(v, \cdot)$.
   (b) With probability
   $$\min\left\{1, \frac{\psi(v')\,[\pi(x) - \pi(x - v'\epsilon)]^+\, M_x(-v', -v)}{\psi(v)\,[\pi(x) - \pi(x + v\epsilon)]^+\, M_x(v, v')}\right\}$$
   set $z \leftarrow (x, v')$.
   (c) Otherwise set $z \leftarrow (x, -v)$.

For the kernel $M_x(v, \cdot)$, we can use the randomized bounces developed in Section 2.3.3 as well as $M_x(v, \cdot) = \psi(\cdot)$. The forward-event [33], generalized BPS [47], and autoregressive bouncing procedures discussed in Section 2.3.3 induce a transition kernel $M_x$ satisfying

$$\psi(v)\, \langle \nabla U(x), v \rangle^+\, M_x(v, v') = \psi(v')\, \langle \nabla U(x), -v' \rangle^+\, M_x(-v', -v),$$

for which we would expect the acceptance ratio in Step 2(b) of Algorithm 4 to be close to 1 for small $\epsilon$.

The invariance of the transition kernel with respect to $\rho$ is easy to check. Assumption A5.1 is clearly satisfied. Assumption A5.2 follows from direct calculations using $|\nabla \Phi| = 1$ and $\Phi^{-1} = S \circ \Phi \circ S$. Finally, Assumption A5.3 follows from the fact that the event kernel corresponding to Steps 2(a) to 2(c) of Algorithm 4 is a GMH kernel with $\nu(z) \propto \rho(z)\,(1 - \alpha(z))$ and proposal kernel $M_x(v, dv')$.

3.4.2 Discrete-time Hamiltonian BPS

We consider here the discrete-time version of the Hamiltonian BPS proposed in Section 2.4. This is achieved by setting $\psi$ as the standard normal distribution on $\mathbb{R}^d$, $\alpha(z) = \min\{1, \rho(\Phi(z))/\rho(z)\}$ and $S(z) = (x, -v)$. We also consider an approximation $\hat{H}(z)$, defined in (35), of the Hamiltonian $H(z)$, recall that $\tilde{U}(x) := U(x) - V(x)$, and denote $\Psi(z) = (x, R_{\nabla \tilde{U}}(x) v)$. In Section 2.4, we were considering for $\Phi_t$ the exact Hamiltonian flow associated with $\hat{H}(z)$. In discrete time we can select for $\Phi$ either this exact flow $\Phi_\epsilon$ for some $\epsilon > 0$ or a leapfrog integrator with $L$ steps, which we denote $\Phi_{\mathrm{HD}}$. The crucial difference is thus that it is not necessary to restrict ourselves to a Hamiltonian $\hat{H}(z)$ for which the Hamiltonian equations can be solved exactly. The resulting algorithm then proceeds as follows.

Algorithm 5 Discrete-time Hamiltonian BPS

1. With probability $\min\{1, \rho(\Phi_{\mathrm{HD}}(z))/\rho(z)\}$, set $z \leftarrow \Phi_{\mathrm{HD}}(z)$.
2. Otherwise
   (a) With probability
   $$\min\left\{1, \frac{[\rho(x, R_{\nabla \tilde{U}}(x) v) - \rho(\Phi_{\mathrm{HD}}(x, -R_{\nabla \tilde{U}}(x) v))]^+}{[\rho(x, v) - \rho(\Phi_{\mathrm{HD}}(x, v))]^+}\right\} = \min\left\{1, \frac{[\rho(x, v) - \rho(\Phi_{\mathrm{HD}}(x, -R_{\nabla \tilde{U}}(x) v))]^+}{[\rho(x, v) - \rho(\Phi_{\mathrm{HD}}(x, v))]^+}\right\},$$
   set $z \leftarrow (x, R_{\nabla \tilde{U}}(x) v)$.
   (b) Otherwise set $z \leftarrow (x, -v)$.

The equality in Step 2(a) holds because the reflection preserves $\|v\|$, so that $\rho(x, R_{\nabla \tilde{U}}(x) v) = \rho(x, v)$.

The invariance of the transition kernel with respect to $\rho$ is easy to check. Assumption A5.1 is obviously satisfied. Assumption A5.2 follows from direct calculations using $|\nabla \Phi| = 1$ and $\Phi^{-1} = S \circ \Phi \circ S$. Finally, Assumption A5.3 follows from the fact that the event kernel corresponding to Steps 2(a) and 2(b) of Algorithm 5 is a GMH kernel with $\nu(z) \propto \rho(z)\,(1 - \alpha(z))$ and a deterministic transition kernel satisfying $\Psi^{-1} = S \circ \Psi \circ S$.

If $\Phi$ is a leapfrog integrator of stepsize $\epsilon > 0$ targeting the Hamiltonian $H(z)$, then the strategy described above is not directly applicable, as $\nabla \tilde{U}(x) = 0$ for all $x$ and so $R_{\nabla \tilde{U}}(x)$ is not defined. However, as $\Phi$ can be thought of as the exact time discretization of a shadow Hamiltonian $\hat{H}_\epsilon(z)$ that agrees with $H(z)$ up to a correction term of order $\epsilon^2$ and a remainder $O(\epsilon^4)$ [3, p. 17], it may be possible to build bounces that correct for the discrepancy between the true Hamiltonian dynamics and its leapfrog approximation.

Algorithm 6 Discrete-time gradient-free BPS

1. With probability $\min\{1, \pi(x + v\epsilon)/\pi(x)\}$, set $z \leftarrow (x + v\epsilon, v)$.
2. Otherwise
   (a) Sample $v' \sim \psi$.
   (b) With probability $[\pi(x) - \pi(x - v'\epsilon)]^+ / \pi(x)$ set $z \leftarrow (x, v')$.
   (c) Otherwise go to Step 2(a).

3.4.3 Discrete-time gradient-free BPS

The BPS-type algorithms given thus far all require computation of the gradient of the potential $\nabla U(x)$ in order to update the velocity $v$ when a bounce event occurs. However, we may wish to target potential functions where this gradient cannot be computed or is very expensive to compute. Additionally, the gradient may not be informative in some models, such as certain embeddings of discrete spaces where the gradient may be zero almost everywhere.

A scheme to approximate the gradient $\nabla U(x)$ by computing numerical differences was advanced in [43]. Here, some number $n_{\mathrm{cpt}}$ of orthogonal unit vectors $\zeta_i$, $i \in [n_{\mathrm{cpt}}]$, are selected, and the gradient is approximated along each of these vectors by, e.g.,

$$\nabla_i = \frac{U(x + h\zeta_i) - U(x - h\zeta_i)}{2h}$$

for some small value $h$. The combination of these $n_{\mathrm{cpt}}$ vectors yields an approximation to the gradient

$$\hat{g} = \sum_{i=1}^{n_{\mathrm{cpt}}} \nabla_i\, \zeta_i,$$

which for $n_{\mathrm{cpt}} = d$ is a typical numerical approximation to the gradient. The new velocity is found by a reversible map from the old velocity to the new velocity which preserves the magnitude of the velocity and maintains the projection of the velocity on the gradient vector.

We may derive an algorithm which operates in the same spirit as that of [43]. By taking $n_{\mathrm{cpt}}$ orthogonal unit vectors, here selected randomly and independently of $v$, we can achieve a reversible algorithm by simply taking the reflection off the approximate gradient,

$$v' = v - 2\,\frac{\langle \hat{g}, v \rangle}{\|\hat{g}\|^2}\,\hat{g},$$

and accepting this proposal in the same way we would accept a typical bounce in the discrete-time BPS algorithm; specifically, by accepting the bounce with probability

$$\min\left\{1, \frac{[\pi(x) - \pi(x - v'\epsilon)]^+}{[\pi(x) - \pi(x + v\epsilon)]^+}\right\}.$$

Alternatively, we propose an algorithm which is related to the continuous-time randomized bounces of Section 2.3.3. We had previously noted that the independent sampling algorithm proposed in [2] consists of sampling from the distribution proportional to $\psi(v')\lambda(x, -v')$, independently of the current value of $v$. Based on the discrete-time invariance condition (5), we may analogously sample from the distribution proportional to $\psi(v')\,[\pi(x) - \pi(x - v'\epsilon)]^+$. This can be accomplished by using rejection sampling with instrumental distribution $\psi$, noting that the ratio between the densities is bounded above by $\pi(x)$; thus each rejection sampling proposal $v'$ is accepted with probability $[\pi(x) - \pi(x - v'\epsilon)]^+ / \pi(x)$, and the first accepted proposal is also accepted as the new state $v'$. See Algorithm 6 for details of this rejection-sampling scheme.
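The rejection-sampling scheme of Algorithm 6 only needs pointwise evaluations of $\pi$, which the following minimal Python sketch illustrates. The function names and the `max_tries` safeguard are hypothetical additions for illustration, and $\psi$ is taken to be the standard normal distribution.

```python
import numpy as np

def gradient_free_bounce(x, eps, pi, rng, max_tries=100_000):
    """Step 2 of Algorithm 6: draw v' with density proportional to
    psi(v') * [pi(x) - pi(x - v' eps)]^+ by rejection from psi = N(0, I).

    max_tries is an illustrative safeguard that is not part of the algorithm.
    """
    px = pi(x)
    for _ in range(max_tries):
        v_new = rng.standard_normal(x.shape)          # Step 2(a)
        accept_prob = max(0.0, px - pi(x - eps * v_new)) / px
        if rng.uniform() < accept_prob:               # Step 2(b)
            return v_new
        # Step 2(c): otherwise propose again.
    raise RuntimeError("no proposal accepted within max_tries")

def gradient_free_bps_step(x, v, eps, pi, rng):
    """One transition of the discrete-time gradient-free BPS."""
    # Step 1: move along the line w.p. min(1, pi(x + v eps) / pi(x)).
    if rng.uniform() < min(1.0, pi(x + eps * v) / pi(x)):
        return x + eps * v, v
    # Step 2: keep the position and draw a new velocity.
    return x, gradient_free_bounce(x, eps, pi, rng)
```

Every accepted velocity $v'$ satisfies $\pi(x - v'\epsilon) < \pi(x)$, so the time-reversed move from $x$ would have triggered an event, which is the property the acceptance probability in Step 2(b) encodes.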

3.4.4 Efficient Implementation of Discrete-time PD-MCMC

All the implementations of discrete-time PD-MCMC schemes we are aware of consist of simulating the algorithm using the kernel (46), that is, at each time step it is checked whether an event occurs with probability $1 - \alpha(z)$ when in state $z$. However, it is possible to improve over this implementation in some interesting scenarios. Assume there exists $\bar{\alpha} : Z \times \mathbb{N} \to [0, 1]$ such that for $k \in \mathbb{N}$ we have $\alpha(\Phi^k(z)) \ge \bar{\alpha}(z, k) > 0$, where $\bar{\alpha}(z, k)$ is computationally cheaper to evaluate than $\alpha(\Phi^k(z))$. It is then possible to simulate an inter-event time of distribution (43) by simulating a time from the instrumental distribution

$$P(\tau = j) = (1 - \bar{\alpha}(z, j)) \prod_{i=0}^{j-1} \bar{\alpha}(z, i),$$

which is then accepted with probability $(1 - \alpha(\Phi^\tau(z))) / (1 - \bar{\alpha}(z, \tau))$. For a linear dynamics $\Phi(z) = (x + v\epsilon, v)$, we can obtain such bounds by upper bounding the derivative of $t \mapsto U(x + vt)$. If $\alpha(z) = \min\{1, \rho(\Phi(z))/\rho(z)\}$, we can also always use, for example, the lower bound $\bar{\alpha}(z, k) = \prod_{i=1}^{n} \bar{\alpha}_i(z, k)$ where $\bar{\alpha}_i(z, k) = \min\{1, \rho_i(\Phi^{k+1}(z)) / \rho_i(\Phi^k(z))\}$ for $\rho(z) = \prod_{i=1}^{n} \rho_i(z)$. It has the potential advantage that simulating an event of probability $\bar{\alpha}(z, k)$ can be performed in parallel by simulating independent Bernoulli random variables $B_i \sim \mathrm{Ber}(1 - \bar{\alpha}_i(z, k))$ for $i \in [n]$.

Finally, there are scenarios where it is possible to directly simulate an event time from (43). For example, assume that $\pi(x) = \exp(-U(x))$ where $U$ is strictly convex, $\Phi(z) = (x + v\epsilon, v)$ and

$$\alpha(z) = \min\{1, \rho(\Phi(z))/\rho(z)\} = \min\{1, \exp(-(U(x + v\epsilon) - U(x)))\};$$

then it is easy to show that Algorithm 7 returns a sample from (43). This adapts the approaches developed in [1, Section 2.3.1] for the continuous-time BPS algorithm to the discrete-time case.

Algorithm 7 Simulation of the inter-event time for discrete-time BPS for strictly log-concave targets

1. Minimize the potential along the continuous trajectory: $t^* = \arg\min\{U(x + vt) : t \ge 0\}$.
2. Set $k^* = \arg\min\{U(x + vk\epsilon) : k \in \{\lfloor t^*/\epsilon \rfloor, \lceil t^*/\epsilon \rceil\}\}$.
3. Solve for $t \ge t^*$: $U(x + vt) - U(x + vk^*\epsilon) = E$, where $E \sim \mathrm{Exp}(1)$.
4. Return $\tau = \lfloor t/\epsilon \rfloor$.

All these strategies can be easily combined.
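As an illustration of Algorithm 7, the following Python sketch treats the special case $U(x) = \|x\|^2/2$, for which Steps 1 and 3 have closed-form solutions. The restriction to a standard Gaussian target, the function name and the choice of passing the exponential draw `E` as an argument are assumptions made here so that the example is self-contained and deterministic given `E`.

```python
import math
import numpy as np

def inter_event_time_gaussian(x, v, eps, E):
    """Algorithm 7 for the potential U(x) = |x|^2 / 2.

    E is an Exp(1) draw supplied by the caller, e.g. rng.exponential(1.0).
    Returns the number tau of free-flight steps before the first event.
    """
    U = lambda y: 0.5 * float(y @ y)
    a, b = float(v @ v), float(x @ v)
    # Step 1: minimiser of t -> U(x + v t) = U(x) + b t + a t^2 / 2 over t >= 0.
    t_star = max(0.0, -b / a)
    # Step 2: best grid point among the two neighbours of t_star.
    k_lo, k_hi = math.floor(t_star / eps), math.ceil(t_star / eps)
    k_star = min((k_lo, k_hi), key=lambda k: U(x + k * eps * v))
    U_min = U(x + k_star * eps * v)
    # Step 3: solve a t^2 / 2 + b t + c = 0 with c = U(x) - U_min - E,
    # taking the larger root so that t >= t_star.
    c = U(x) - U_min - E
    disc = max(0.0, b * b - 2.0 * a * c)
    t = (-b + math.sqrt(disc)) / a
    # Step 4: number of whole steps of size eps completed before time t.
    return math.floor(t / eps)
```

For instance, with $x = 2$, $v = -1$, $\epsilon = 0.3$ and $E = 0.5$ in one dimension, the grid minimum is at $k^* = 7$ and the sketch returns $\tau = 10$: the potential has risen by $0.495 \le E$ after 10 steps and by $0.84 > E$ after 11.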
As an example of such a combination, we can use a bound $\bar{\alpha}(z, k) = \prod_{i=1}^{n} \bar{\alpha}_i(z, k)$ where $\rho_i(z)$ is strictly log-concave for some $i \in [n]$.

4 Discrete-time local PD-MCMC

4.1 Algorithm description

Given the framework provided in Section 3.2.2, it is not difficult to obtain discrete-time local PD-MCMC schemes for $\rho(z) = \exp(-\sum_{i=1}^{n} H_i(z)) = \pi(x)\psi(v) = \exp(-U(x))\,\psi(v)$ on $Z = \mathbb{R}^d \times \mathbb{R}^d$, where $\pi$ is the target distribution of interest and $\psi$ is a multivariate normal. We can for example select a dynamics, involution and acceptance probability satisfying $|\nabla \Phi| = 1$, $\alpha_i(z) = \min\{1, \rho_i(\Phi(z))/\rho_i(z)\}$ with $\rho_i(z) = \exp(-H_i(z))$, $S(z) = (x, -v)$, $\Phi^{-1} = S \circ \Phi \circ S$ and $\rho \circ S = \rho$. A rather generic local PD-MCMC scheme is presented in Algorithm 8.

Algorithm 8 Discrete-time local PD-MCMC

1. For $i \in [n]$, sample $B_i \sim \mathrm{Ber}([\rho_i(z) - \rho_i(\Phi(z))]^+ / \rho_i(z))$.
2. If $B_i = 0$ for all $i \in [n]$, set $z \leftarrow \Phi(z)$.
3. Otherwise, sample $z' \sim M_B(z, \cdot)$.
4. With probability
$$\min\left\{1, \frac{M_B(S(z'), S(z))}{M_B(z, z')} \prod_{i=1}^{n} \frac{\rho_i(S(z'))\, \mathrm{Ber}(B_i; 1 - \alpha_i(S(z')))}{\rho_i(z)\, \mathrm{Ber}(B_i; 1 - \alpha_i(z))}\right\} \tag{63}$$
set $z \leftarrow z'$. Otherwise, set $z \leftarrow (x, -v)$.

Here Steps 3 and 4 of Algorithm 8 correspond to a GMH kernel satisfying the skewed-detailed balance condition (42) for $\nu_b(dz) \propto \rho(dz)\,(1 - \alpha(z))\, Q_{|B| \ge 1}(b \mid z)$ and a proposal $M_b(z, dz')$ for any $b \in \mathcal{B}$.

Consider a special case of Algorithm 8, given in Algorithm 9, which corresponds to a discrete-time version of local BPS. It uses $\Phi(z) = (x + v\epsilon, v)$, $S(z) = (x, -v)$ and a deterministic proposal $M_b(z, dz') = \delta_{\Psi_b(z)}(dz')$ satisfying $\Psi_b^{-1} = S \circ \Psi_b \circ S$. We also use $\rho_i(z) = \exp(-U_i(x)) := \pi_i(x)$ so that $U(x) = \sum_{i=1}^{m} U_i(x)$ and $\rho_n(z) = \psi(v)$ with $n = m + 1$. We could have selected $\alpha_n(z) = \alpha_{\mathrm{ref}}$ to refresh the velocity periodically but we omit it for ease of presentation. The only difference with Algorithm 8 is that we actually use here an alternative acceptance probability which is lower than (63) but has the advantage that it factorizes across $i$. It will prove useful as it is then possible to simulate an event with the required acceptance probability by simulating independent events in parallel.

Algorithm 9 Discrete-time local BPS

1. For $i \in [m]$, sample $B_i \sim \mathrm{Ber}([\pi_i(x) - \pi_i(x + v\epsilon)]^+ / \pi_i(x))$.
2. If $B_i = 0$ for all $i \in [m]$, set $z \leftarrow (x + v\epsilon, v)$.
3. Otherwise,
   (a) Set $z' = \Psi_B(z) := (x, v')$, where $v' = R_{\nabla \bar{U}}(x) v$ with $\nabla \bar{U}(x) := \sum_{i : B_i = 1} \nabla U_i(x)$.
   (b) With probability
   $$\prod_{i=1}^{m} \min\left\{1, \frac{\rho_i(S \circ \Psi_B(z))\, \mathrm{Ber}(B_i; 1 - \alpha_i(S \circ \Psi_B(z)))}{\rho_i(z)\, \mathrm{Ber}(B_i; 1 - \alpha_i(z))}\right\} = \prod_{i : B_i = 0} \min\left\{1, \frac{\min(\pi_i(x), \pi_i(x - v'\epsilon))}{\min(\pi_i(x), \pi_i(x + v\epsilon))}\right\} \prod_{i : B_i = 1} \min\left\{1, \frac{[\pi_i(x) - \pi_i(x - v'\epsilon)]^+}{[\pi_i(x) - \pi_i(x + v\epsilon)]^+}\right\}, \tag{64}$$
   set $z \leftarrow \Psi_B(z)$.
   (c) Otherwise, set $z \leftarrow (x, -v)$.

Note that $\nabla \bar{U}(x)$ depends on both $u$, $v$ and $\epsilon$; we stress this dependence as it is omitted notationally. Algorithms 8 and 9 might appear of limited interest as they require sampling $n$ Bernoulli random variables at each iteration.
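A direct, non-optimized transcription of Algorithm 9 may help clarify how the factorized acceptance probability (64) is assembled. The sketch below is illustrative only: the function name, the representation of the factors as lists of callables `log_pis` and `grad_Us`, and the assumption that the summed gradient over the triggered factors is non-zero are all choices made here rather than taken from the paper.

```python
import numpy as np

def local_bps_step(x, v, eps, log_pis, grad_Us, rng):
    """One transition of the discrete-time local BPS (Algorithm 9).

    log_pis[i](x) returns log pi_i(x) and grad_Us[i](x) the gradient of
    U_i = -log pi_i; both lists are hypothetical user-supplied callables.
    """
    m = len(log_pis)
    lp0 = np.array([lp(x) for lp in log_pis])
    lp_fwd = np.array([lp(x + eps * v) for lp in log_pis])
    # Step 1: B_i ~ Ber([pi_i(x) - pi_i(x + v eps)]^+ / pi_i(x)).
    p_event = np.maximum(0.0, 1.0 - np.exp(lp_fwd - lp0))
    B = rng.uniform(size=m) < p_event
    # Step 2: no factor triggered an event, so follow the linear dynamics.
    if not B.any():
        return x + eps * v, v
    # Step 3(a): reflect v off the summed gradient of the triggered factors.
    g = sum(grad_Us[i](x) for i in range(m) if B[i])
    v_new = v - 2.0 * np.dot(g, v) / np.dot(g, g) * g
    lp_bwd = np.array([lp(x - eps * v_new) for lp in log_pis])
    # Step 3(b): factorised acceptance probability (64); every term is
    # divided by pi_i(x), which cancels in each ratio.
    num_event = np.maximum(0.0, 1.0 - np.exp(lp_bwd - lp0))
    num_quiet = np.minimum(1.0, np.exp(lp_bwd - lp0))
    den_quiet = np.minimum(1.0, np.exp(lp_fwd - lp0))
    # p_event[i] > 0 whenever B[i] is True, so the division is safe there.
    ratios = np.where(B, num_event / np.where(B, p_event, 1.0),
                      num_quiet / den_quiet)
    accept = float(np.prod(np.minimum(1.0, ratios)))
    if rng.uniform() < accept:
        return x, v_new
    # Step 3(c): rejection flips the velocity.
    return x, -v
```

Each iteration of this sketch evaluates all $m$ factors three times, which is exactly the cost that the prefetching and subsampling implementations aim to avoid.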
In the next sections, we show how we can propose implementations that parallel the priority queue implementation of the local BPS proposed in [42] (see [1, Section 3.3.1] for a detailed description), as well as the subsampling algorithms proposed in [1, 6, 29, Section 3.3.2].

4.2 Prefetching implementation

We first describe a priority queue type implementation of Algorithm 9 based on parallel prefetching ideas [11, 2] in scenarios where

$$U(x) = \sum_{i=1}^{m} U_i(x_{S_i}),$$

$x_{S_i}$ being a subset of the components of $x$ and $\pi_i(x) = \exp(-U_i(x_{S_i}))$. There are many possible variations of this implementation.

Algorithm 10 Discrete-time local BPS implementation via parallel prefetching

1. Initialization
   (a) For $i \in [m]$, sample non-negative event times $\tau_i$ with distribution
   $$\max\left\{0, 1 - \frac{\pi_i(x + v(\tau_i + 1)\epsilon)}{\pi_i(x + v\tau_i\epsilon)}\right\} \prod_{k=0}^{\tau_i - 1} \min\left\{1, \frac{\pi_i(x + v(k + 1)\epsilon)}{\pi_i(x + vk\epsilon)}\right\}.$$
2. Iteration $t$, $t \ge 1$
   (a) If $\min_i \tau_i > 0$, then set $z \leftarrow (x + \epsilon v, v)$. Update $\tau_i \leftarrow \tau_i - 1$.
   (b) Otherwise,
      i. Compute
      $$\nabla \bar{U}(x) := \sum_{i : \tau_i = 0} \nabla U_i(x_{S_i}), \tag{65}$$
      and let $v' = R_{\nabla \bar{U}}(x) v$.
      ii. With probability
      $$\prod_{i : \tau_i > 0} \min\left\{1, \frac{\min(\pi_i(x), \pi_i(x - v'\epsilon))}{\min(\pi_i(x), \pi_i(x + v\epsilon))}\right\} \prod_{i : \tau_i = 0} \min\left\{1, \frac{[\pi_i(x) - \pi_i(x - v'\epsilon)]^+}{[\pi_i(x) - \pi_i(x + v\epsilon)]^+}\right\}, \tag{66}$$
      set $z \leftarrow (x, v')$. Sample again $\tau_i$ for all $i$ where $v'_j \ne v_j$ for some $j \in S_i$.
      iii. Otherwise set $z \leftarrow (x, -v)$. Sample $\tau_i$ for all $i$.

The efficiency of Algorithm 10 relies on the capability of computing the $\tau_i$ efficiently. This may be possible when, for example, this is done in parallel or when some property of $\pi_i$ allows it, such as in the case of log-concave targets detailed in Algorithm 7 given above.

4.3 Subsampling implementations

For sufficiently small $\epsilon$, we might expect that Step 1 of Algorithm 9 would yield very few indices for which $B_i = 1$. This motivates an approach which can sample these variables more efficiently by finding an upper bound on the probability that $B_i = 1$, essentially allowing us to bound the number of indices for which $B_i = 1$. We present Algorithm 11; here, the acceptance of the bounce move (64) is computed in two stages. In Step 4.b we simulate events of probability

$$1 - \min\left\{1, \frac{[\pi_i(x) - \pi_i(x - v'\epsilon)]^+}{[\pi_i(x) - \pi_i(x + v\epsilon)]^+}\right\}$$

for each $i$ where $B_i = 1$; if these succeed, then in Step 4.c we simulate events of probability

$$1 - \min\left\{1, \frac{\min(\pi_i(x), \pi_i(x - v'\epsilon))}{\min(\pi_i(x), \pi_i(x + v\epsilon))}\right\}$$

for each $i$ where $B_i = 0$. We suggest that one can make use of the efficient procedures described in Algorithm 12 and Algorithm 13 to sample multiple Bernoulli random variables in both Steps 1 and 4.c; in both cases we expect few cases where the respective Bernoulli variables are 1.
While Step 4.b also samples a set of Bernoulli variables, our assumption that $\epsilon$ is small suggests that the number of variables sampled here will be small; as such this step may be inexpensive and there is likely little to be gained by a more sophisticated simulation scheme.

A Note on Effi cient Conditional Simulation of Gaussian Distributions. April 2010

A Note on Effi cient Conditional Simulation of Gaussian Distributions. April 2010 A Note o Effi ciet Coditioal Simulatio of Gaussia Distributios A D D C S S, U B C, V, BC, C April 2010 A Cosider a multivariate Gaussia radom vector which ca be partitioed ito observed ad uobserved compoetswe

More information

4. Partial Sums and the Central Limit Theorem

4. Partial Sums and the Central Limit Theorem 1 of 10 7/16/2009 6:05 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 4. Partial Sums ad the Cetral Limit Theorem The cetral limit theorem ad the law of large umbers are the two fudametal theorems

More information

Lecture 2: Monte Carlo Simulation

Lecture 2: Monte Carlo Simulation STAT/Q SCI 43: Itroductio to Resamplig ethods Sprig 27 Istructor: Ye-Chi Che Lecture 2: ote Carlo Simulatio 2 ote Carlo Itegratio Assume we wat to evaluate the followig itegratio: e x3 dx What ca we do?

More information

Monte Carlo Integration

Monte Carlo Integration Mote Carlo Itegratio I these otes we first review basic umerical itegratio methods (usig Riema approximatio ad the trapezoidal rule) ad their limitatios for evaluatig multidimesioal itegrals. Next we itroduce

More information

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n. Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 19 11/17/2008 LAWS OF LARGE NUMBERS II THE STRONG LAW OF LARGE NUMBERS

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 19 11/17/2008 LAWS OF LARGE NUMBERS II THE STRONG LAW OF LARGE NUMBERS MASSACHUSTTS INSTITUT OF TCHNOLOGY 6.436J/5.085J Fall 2008 Lecture 9 /7/2008 LAWS OF LARG NUMBRS II Cotets. The strog law of large umbers 2. The Cheroff boud TH STRONG LAW OF LARG NUMBRS While the weak

More information

Advanced Sequential Monte Carlo Methods

Advanced Sequential Monte Carlo Methods Advaced Sequetial Mote Carlo Methods Araud Doucet Departmets of Statistics & Computer Sciece Uiversity of British Columbia A.D. () / 35 Geeric Sequetial Mote Carlo Scheme At time =, sample q () ad set

More information

Convergence of random variables. (telegram style notes) P.J.C. Spreij

Convergence of random variables. (telegram style notes) P.J.C. Spreij Covergece of radom variables (telegram style otes).j.c. Spreij this versio: September 6, 2005 Itroductio As we kow, radom variables are by defiitio measurable fuctios o some uderlyig measurable space

More information

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence Chapter 3 Strog covergece As poited out i the Chapter 2, there are multiple ways to defie the otio of covergece of a sequece of radom variables. That chapter defied covergece i probability, covergece i

More information

Optimally Sparse SVMs

Optimally Sparse SVMs A. Proof of Lemma 3. We here prove a lower boud o the umber of support vectors to achieve geeralizatio bouds of the form which we cosider. Importatly, this result holds ot oly for liear classifiers, but

More information

Chapter 2 The Monte Carlo Method

Chapter 2 The Monte Carlo Method Chapter 2 The Mote Carlo Method The Mote Carlo Method stads for a broad class of computatioal algorithms that rely o radom sampligs. It is ofte used i physical ad mathematical problems ad is most useful

More information

Element sampling: Part 2

Element sampling: Part 2 Chapter 4 Elemet samplig: Part 2 4.1 Itroductio We ow cosider uequal probability samplig desigs which is very popular i practice. I the uequal probability samplig, we ca improve the efficiecy of the resultig

More information

On forward improvement iteration for stopping problems

On forward improvement iteration for stopping problems O forward improvemet iteratio for stoppig problems Mathematical Istitute, Uiversity of Kiel, Ludewig-Mey-Str. 4, D-24098 Kiel, Germay irle@math.ui-iel.de Albrecht Irle Abstract. We cosider the optimal

More information

Definition 4.2. (a) A sequence {x n } in a Banach space X is a basis for X if. unique scalars a n (x) such that x = n. a n (x) x n. (4.

Definition 4.2. (a) A sequence {x n } in a Banach space X is a basis for X if. unique scalars a n (x) such that x = n. a n (x) x n. (4. 4. BASES I BAACH SPACES 39 4. BASES I BAACH SPACES Sice a Baach space X is a vector space, it must possess a Hamel, or vector space, basis, i.e., a subset {x γ } γ Γ whose fiite liear spa is all of X ad

More information

Lecture 19: Convergence

Lecture 19: Convergence Lecture 19: Covergece Asymptotic approach I statistical aalysis or iferece, a key to the success of fidig a good procedure is beig able to fid some momets ad/or distributios of various statistics. I may

More information

6.3 Testing Series With Positive Terms

6.3 Testing Series With Positive Terms 6.3. TESTING SERIES WITH POSITIVE TERMS 307 6.3 Testig Series With Positive Terms 6.3. Review of what is kow up to ow I theory, testig a series a i for covergece amouts to fidig the i= sequece of partial

More information

1 Introduction to reducing variance in Monte Carlo simulations

1 Introduction to reducing variance in Monte Carlo simulations Copyright c 010 by Karl Sigma 1 Itroductio to reducig variace i Mote Carlo simulatios 11 Review of cofidece itervals for estimatig a mea I statistics, we estimate a ukow mea µ = E(X) of a distributio by

More information

EECS564 Estimation, Filtering, and Detection Hwk 2 Solns. Winter p θ (z) = (2θz + 1 θ), 0 z 1

EECS564 Estimation, Filtering, and Detection Hwk 2 Solns. Winter p θ (z) = (2θz + 1 θ), 0 z 1 EECS564 Estimatio, Filterig, ad Detectio Hwk 2 Sols. Witer 25 4. Let Z be a sigle observatio havig desity fuctio where. p (z) = (2z + ), z (a) Assumig that is a oradom parameter, fid ad plot the maximum

More information

1 6 = 1 6 = + Factorials and Euler s Gamma function

1 6 = 1 6 = + Factorials and Euler s Gamma function Royal Holloway Uiversity of Lodo Departmet of Physics Factorials ad Euler s Gamma fuctio Itroductio The is a self-cotaied part of the course dealig, essetially, with the factorial fuctio ad its geeralizatio

More information

Properties and Hypothesis Testing

Properties and Hypothesis Testing Chapter 3 Properties ad Hypothesis Testig 3.1 Types of data The regressio techiques developed i previous chapters ca be applied to three differet kids of data. 1. Cross-sectioal data. 2. Time series data.

More information

1 Duality revisited. AM 221: Advanced Optimization Spring 2016

1 Duality revisited. AM 221: Advanced Optimization Spring 2016 AM 22: Advaced Optimizatio Sprig 206 Prof. Yaro Siger Sectio 7 Wedesday, Mar. 9th Duality revisited I this sectio, we will give a slightly differet perspective o duality. optimizatio program: f(x) x R

More information

Physics 324, Fall Dirac Notation. These notes were produced by David Kaplan for Phys. 324 in Autumn 2001.

Physics 324, Fall Dirac Notation. These notes were produced by David Kaplan for Phys. 324 in Autumn 2001. Physics 324, Fall 2002 Dirac Notatio These otes were produced by David Kapla for Phys. 324 i Autum 2001. 1 Vectors 1.1 Ier product Recall from liear algebra: we ca represet a vector V as a colum vector;

More information

Econ 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chi-square Distribution, Student s t distribution 1.

Econ 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chi-square Distribution, Student s t distribution 1. Eco 325/327 Notes o Sample Mea, Sample Proportio, Cetral Limit Theorem, Chi-square Distributio, Studet s t distributio 1 Sample Mea By Hiro Kasahara We cosider a radom sample from a populatio. Defiitio

More information

3. Z Transform. Recall that the Fourier transform (FT) of a DT signal xn [ ] is ( ) [ ] = In order for the FT to exist in the finite magnitude sense,

3. Z Transform. Recall that the Fourier transform (FT) of a DT signal xn [ ] is ( ) [ ] = In order for the FT to exist in the finite magnitude sense, 3. Z Trasform Referece: Etire Chapter 3 of text. Recall that the Fourier trasform (FT) of a DT sigal x [ ] is ω ( ) [ ] X e = j jω k = xe I order for the FT to exist i the fiite magitude sese, S = x [

More information

Distribution of Random Samples & Limit theorems

Distribution of Random Samples & Limit theorems STAT/MATH 395 A - PROBABILITY II UW Witer Quarter 2017 Néhémy Lim Distributio of Radom Samples & Limit theorems 1 Distributio of i.i.d. Samples Motivatig example. Assume that the goal of a study is to

More information

Sequential Monte Carlo Methods - A Review. Arnaud Doucet. Engineering Department, Cambridge University, UK

Sequential Monte Carlo Methods - A Review. Arnaud Doucet. Engineering Department, Cambridge University, UK Sequetial Mote Carlo Methods - A Review Araud Doucet Egieerig Departmet, Cambridge Uiversity, UK http://www-sigproc.eg.cam.ac.uk/ ad2/araud doucet.html ad2@eg.cam.ac.uk Istitut Heri Poicaré - Paris - 2

More information

The standard deviation of the mean

The standard deviation of the mean Physics 6C Fall 20 The stadard deviatio of the mea These otes provide some clarificatio o the distictio betwee the stadard deviatio ad the stadard deviatio of the mea.. The sample mea ad variace Cosider

More information

Singular Continuous Measures by Michael Pejic 5/14/10

Singular Continuous Measures by Michael Pejic 5/14/10 Sigular Cotiuous Measures by Michael Peic 5/4/0 Prelimiaries Give a set X, a σ-algebra o X is a collectio of subsets of X that cotais X ad ad is closed uder complemetatio ad coutable uios hece, coutable

More information

Random Walks on Discrete and Continuous Circles. by Jeffrey S. Rosenthal School of Mathematics, University of Minnesota, Minneapolis, MN, U.S.A.

Random Walks on Discrete and Continuous Circles. by Jeffrey S. Rosenthal School of Mathematics, University of Minnesota, Minneapolis, MN, U.S.A. Radom Walks o Discrete ad Cotiuous Circles by Jeffrey S. Rosethal School of Mathematics, Uiversity of Miesota, Mieapolis, MN, U.S.A. 55455 (Appeared i Joural of Applied Probability 30 (1993), 780 789.)

More information

Stochastic Simulation

Stochastic Simulation Stochastic Simulatio 1 Itroductio Readig Assigmet: Read Chapter 1 of text. We shall itroduce may of the key issues to be discussed i this course via a couple of model problems. Model Problem 1 (Jackso

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2016 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

Basics of Probability Theory (for Theory of Computation courses)

Basics of Probability Theory (for Theory of Computation courses) Basics of Probability Theory (for Theory of Computatio courses) Oded Goldreich Departmet of Computer Sciece Weizma Istitute of Sciece Rehovot, Israel. oded.goldreich@weizma.ac.il November 24, 2008 Preface.

More information

Infinite Sequences and Series

Infinite Sequences and Series Chapter 6 Ifiite Sequeces ad Series 6.1 Ifiite Sequeces 6.1.1 Elemetary Cocepts Simply speakig, a sequece is a ordered list of umbers writte: {a 1, a 2, a 3,...a, a +1,...} where the elemets a i represet

More information

Intro to Learning Theory

Intro to Learning Theory Lecture 1, October 18, 2016 Itro to Learig Theory Ruth Urer 1 Machie Learig ad Learig Theory Comig soo 2 Formal Framework 21 Basic otios I our formal model for machie learig, the istaces to be classified

More information

Introduction to Extreme Value Theory Laurens de Haan, ISM Japan, Erasmus University Rotterdam, NL University of Lisbon, PT

Introduction to Extreme Value Theory Laurens de Haan, ISM Japan, Erasmus University Rotterdam, NL University of Lisbon, PT Itroductio to Extreme Value Theory Laures de Haa, ISM Japa, 202 Itroductio to Extreme Value Theory Laures de Haa Erasmus Uiversity Rotterdam, NL Uiversity of Lisbo, PT Itroductio to Extreme Value Theory

More information

A RANK STATISTIC FOR NON-PARAMETRIC K-SAMPLE AND CHANGE POINT PROBLEMS

A RANK STATISTIC FOR NON-PARAMETRIC K-SAMPLE AND CHANGE POINT PROBLEMS J. Japa Statist. Soc. Vol. 41 No. 1 2011 67 73 A RANK STATISTIC FOR NON-PARAMETRIC K-SAMPLE AND CHANGE POINT PROBLEMS Yoichi Nishiyama* We cosider k-sample ad chage poit problems for idepedet data i a

More information

Linear Regression Demystified

Linear Regression Demystified Liear Regressio Demystified Liear regressio is a importat subject i statistics. I elemetary statistics courses, formulae related to liear regressio are ofte stated without derivatio. This ote iteds to

More information

It is always the case that unions, intersections, complements, and set differences are preserved by the inverse image of a function.

It is always the case that unions, intersections, complements, and set differences are preserved by the inverse image of a function. MATH 532 Measurable Fuctios Dr. Neal, WKU Throughout, let ( X, F, µ) be a measure space ad let (!, F, P ) deote the special case of a probability space. We shall ow begi to study real-valued fuctios defied

More information
