Conditional Sampling for Max-Stable Random Fields
Conditional Sampling for Max-Stable Random Fields

Yizao Wang, Department of Statistics, University of Michigan
GSPC at Duke University, Durham, North Carolina, April 30th
Joint work with Stilian A. Stoev
An Illustrating Example

Given the model and some observations, how do we do prediction?

Figure: a sample from the de Haan-Pereira model (de Haan and Pereira, 2006): a stationary (moving maxima) max-stable random field. Parameters: ρ = 0., β_1 = 1., β_2 = 0.7.
An Illustrating Example

Four conditional samples from the de Haan-Pereira model.

Difficulties:
- An analytical formula is often impossible.
- The naive Monte Carlo method does not apply: we condition on an event of probability zero, so rejection sampling fails.
Conditional Sampling for Max-Stable Random Fields

Our contribution:
- Obtained an explicit formula for the regular conditional probability of max-linear models (including a large class of max-stable random fields).
- Developed efficient software (R package maxlinear) for large-scale conditional sampling.
- Potential applications in prediction for extremal phenomena, e.g., environmental and financial problems.
Max-Linear Models

Formula:

    X_i = ∨_{j=1}^{p} a_{i,j} Z_j,  1 ≤ i ≤ n,   denoted by X = A ⊙ Z (max-matrix product),

where the Z_j ~ f_{Z_j} are independent, continuous, nonnegative random variables, and A = (a_{i,j})_{1≤i≤n, 1≤j≤p} with a_{i,j} ≥ 0.

- Z_1, ..., Z_p independent α-Fréchet: P(Z ≤ t) = exp{−σ^α t^{−α}}.
- Then {X_i}_{1≤i≤n} is an α-Fréchet process.
- Approximation of an arbitrary max-stable process (random field) with p sufficiently large.
- Models spatial extremes arising in meteorology, geology, and environmental applications.
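As a quick illustration, the max-linear model above can be simulated directly by inverse-CDF sampling of the Fréchet coordinates and a max-matrix product. This is a sketch of ours, not the maxlinear package; A is the toy matrix used later in the talk.

```python
import numpy as np

def rfrechet(size, alpha=1.0, sigma=1.0, rng=None):
    """alpha-Frechet sampling via inverse CDF: P(Z <= t) = exp(-sigma^alpha * t^(-alpha))."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.uniform(size=size)
    return sigma * (-np.log(u)) ** (-1.0 / alpha)

def max_linear(A, Z):
    """Max-matrix product X = A (.) Z: X_i = max_j a_{i,j} * Z_j."""
    return (A * Z[None, :]).max(axis=1)

# Toy example from the slides: X_1 = Z_1 v Z_2 v Z_3, X_2 = Z_1.
A = np.array([[1.0, 1.0, 1.0],
              [1.0, 0.0, 0.0]])
Z = rfrechet(3, alpha=1.0, rng=np.random.default_rng(7))
X = max_linear(A, Z)
```

By construction X[1] equals Z[0] exactly and X[0] is the overall maximum, which is what makes the conditional distribution of Z given X = x degenerate in some coordinates.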
Conditional Sampling for Max-Linear Models

Consider the model (A known) X = A ⊙ Z. Observations: X = x.
- What is the conditional distribution of Z, given X = x?
- Prediction of Y = B ⊙ Z, given X = x.

Remarks:
- Theoretical issue: P(Z ∈ E | X = x) is not well defined, since P(X = x) = 0. Rigorous treatment: ν(x, E) : R_+^n × B_{R_+^p} → [0, 1], a regular conditional probability.
- Computational issue: dim(A) = n × p, dim(B) = n_B × p, with n small and n_B, p large.
A Toy Example

Consider X = A ⊙ Z with

    A = (1 1 1; 1 0 0),  i.e.  X_1 = Z_1 ∨ Z_2 ∨ Z_3,  X_2 = Z_1.

We have (in)equality constraints:

    Z_1 ≤ min(X_1, X_2) =: ẑ_1,  Z_2 ≤ X_1 =: ẑ_2,  Z_3 ≤ X_1 =: ẑ_3.

Two cases:
(i) (red) If 0 < X_1 = X_2 = a, then ẑ_1 = ẑ_2 = ẑ_3 = a and Z_1 = ẑ_1, Z_2 ≤ ẑ_2, Z_3 ≤ ẑ_3.
(ii) (blue) If 0 < a = X_2 < X_1 = b, then ẑ_1 = a, ẑ_2 = ẑ_3 = b and either Z_1 = ẑ_1, Z_2 = ẑ_2, Z_3 < ẑ_3, or Z_1 = ẑ_1, Z_2 < ẑ_2, Z_3 = ẑ_3.

When Z_j = ẑ_j, we say that Z_j hits ẑ_j; this gives rise to different hitting scenarios.
Intuition for the Conditional Distribution of Z | X = x

Define ẑ_j := min_{1≤i≤n} x_i / a_{i,j} (minimum over rows with a_{i,j} > 0) and C(A, x) := {z ∈ R_+^p : x = A ⊙ z}. We need a distribution on C(A, x), for each x.

Partition C(A, x) according to the equality constraints (blue case):

    C(A, x) = C_{1,2}(A, x) ∪ C_{1,3}(A, x) ∪ C_{1,2,3}(A, x)

with

    C_{1,2}(A, x) = {z_1 = ẑ_1, z_2 = ẑ_2, z_3 < ẑ_3}
    C_{1,3}(A, x) = {z_1 = ẑ_1, z_2 < ẑ_2, z_3 = ẑ_3}
    C_{1,2,3}(A, x) = {z_1 = ẑ_1, z_2 = ẑ_2, z_3 = ẑ_3}

Hitting scenarios: J ⊂ {1, ..., p} such that C_J(A, x) ≠ ∅. The conditional distribution is a mixture of distributions, indexed by the hitting scenarios J(A, x) = {J : C_J(A, x) ≠ ∅}.
Intuition for the Conditional Distribution of Z | X = x

Define ν_J on each hitting scenario J:

    ν_J(x, E) := ∏_{j∈J} δ_{ẑ_j}(π_j(E)) · ∏_{j∈J^c} P{Z_j ∈ π_j(E) | Z_j < ẑ_j},

where π_j denotes the projection onto the j-th coordinate.

It suffices to concentrate on the relevant hitting scenarios

    J_r(A, x) = {J ∈ J(A, x) : |J| = r},  with  r = min{|J| : J ∈ J(A, x)}.

Clearly, C_{1,2,3} is negligible compared to C_{1,2} and C_{1,3} (C_{1,2,3} has lower dimension).
A Toy Example (Continued)

Consider X = A ⊙ Z with A = (1 1 1; 1 0 0), i.e. X_1 = Z_1 ∨ Z_2 ∨ Z_3, X_2 = Z_1. In this case,

    ẑ_1 = min(X_1, X_2),  ẑ_2 = ẑ_3 = X_1.

Two cases:
(i) (red) If X_1 = X_2 = a with 0 < a, then ẑ_1 = ẑ_2 = ẑ_3 = a and Z_1 = ẑ_1, Z_2 ≤ ẑ_2, Z_3 ≤ ẑ_3. Here J(A, X) = {{1}, {1,2}, {1,3}, {1,2,3}} and J_r(A, X) = {{1}}.
(ii) (blue) If 0 < a = X_2 < X_1 = b, then ẑ_1 = a, ẑ_2 = ẑ_3 = b and either Z_1 = ẑ_1, Z_2 = ẑ_2, Z_3 < ẑ_3, or Z_1 = ẑ_1, Z_2 < ẑ_2, Z_3 = ẑ_3. Here J(A, X) = {{1,2}, {1,3}, {1,2,3}} and J_r(A, X) = {{1,2}, {1,3}}.

Different x yield different hitting scenarios, hence different hitting distributions.
Conditional Distribution for Max-Linear Models

Theorem (Wang and Stoev). The regular conditional probability ν(x, E) of Z w.r.t. X equals

    ν(x, E) = Σ_{J∈J_r(A,x)} p_J(A, x) ν_J(x, E),  for P_X-a.a. x ∈ A ⊙ (R_+^p) and all E ∈ B_{R_+^p},

where

    p_J(A, x) = w_J / Σ_{J'∈J_r(A,x)} w_{J'},  w_J := ∏_{j∈J} ẑ_j f_{Z_j}(ẑ_j) · ∏_{j∈J^c} F_{Z_j}(ẑ_j),

so that Σ_{J∈J_r(A,x)} p_J = 1. The proof works directly with the definition of regular conditional probability.

Algorithm I for conditional sampling of Z | X = x:
(1) compute ẑ_j, J(A, x), r, and p_J(A, x), and
(2) sample Z ~ ν(x, ·).

Not the end of the story! We haven't discussed the identification of J_r(A, x), which is closely related to the NP-hard set covering problem.
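For the toy example, Algorithm I can be sketched end to end. This is a minimal illustration assuming standard 1-Fréchet Z_j (F(t) = exp(−1/t)); the function names are ours, not the maxlinear package API, and the relevant scenarios are found by brute force.

```python
import numpy as np
from itertools import combinations

F = lambda t: np.exp(-1.0 / t)         # standard 1-Frechet CDF
f = lambda t: np.exp(-1.0 / t) / t**2  # its density

def conditional_sample(A, x, rng):
    n, p = A.shape
    # Step 1a: bounds z-hat_j = min over rows i with a_{i,j} > 0 of x_i / a_{i,j}.
    zh = np.array([np.min(x[A[:, j] > 0] / A[A[:, j] > 0, j]) for j in range(p)])
    # Step 1b: hitting matrix h_{i,j} = 1{a_{i,j} * zh_j = x_i}.
    H = np.isclose(A * zh[None, :], x[:, None]) & (A > 0)
    # Step 1c: relevant scenarios = minimum-cardinality column covers of H.
    Jr = []
    for r in range(1, p + 1):
        Jr = [J for J in combinations(range(p), r)
              if H[:, list(J)].any(axis=1).all()]
        if Jr:
            break
    # Step 1d: weights w_J = prod_{j in J} zh_j f(zh_j) * prod_{j not in J} F(zh_j).
    w = np.array([np.prod([zh[j] * f(zh[j]) for j in J]) *
                  np.prod([F(zh[j]) for j in range(p) if j not in J]) for J in Jr])
    # Step 2: pick a scenario J with probability p_J, then sample Z.
    J = Jr[rng.choice(len(Jr), p=w / w.sum())]
    Z = np.empty(p)
    for j in range(p):
        if j in J:
            Z[j] = zh[j]                               # Z_j hits its bound
        else:                                          # Z_j ~ F truncated to [0, zh_j)
            Z[j] = -1.0 / np.log(rng.uniform(0, F(zh[j])))
    return Z

rng = np.random.default_rng(0)
A = np.array([[1.0, 1.0, 1.0], [1.0, 0.0, 0.0]])
x = np.array([2.0, 1.0])                               # the "blue" case: X_2 < X_1
Z = conditional_sample(A, x, rng)
```

Every draw reproduces x exactly: Z_1 always hits ẑ_1 = 1, and exactly one of Z_2, Z_3 hits ẑ_2 = ẑ_3 = 2 depending on the sampled scenario.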
Set Covering Problem

Let H = (h_{i,j})_{n×p} with h_{i,j} ∈ {0,1}. Write [m] := {1, 2, ..., m}, m ∈ N. The column j ∈ [p] covers the row i ∈ [n] if h_{i,j} = 1. The goal is to cover all rows with the fewest columns. This is equivalent to solving

    min_{δ_j ∈ {0,1}} Σ_{j∈[p]} δ_j,  subject to  Σ_{j∈[p]} h_{i,j} δ_j ≥ 1,  i ∈ [n].   (1)

An example:

    H = (1 1 0; 1 0 1; 0 1 1).

The minimum-cost coverings are columns {1,2}, {1,3}, and {2,3}.
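For the small p arising here, problem (1) can be solved exactly by enumeration. The snippet below is our illustration, using a hypothetical 3×3 hitting matrix whose minimum-cost coverings are {1,2}, {1,3}, and {2,3} (0-indexed in the code):

```python
from itertools import combinations

def min_covers(H):
    """All minimum-cardinality sets of columns covering every row of a 0/1 matrix H."""
    n, p = len(H), len(H[0])
    for r in range(1, p + 1):
        covers = [set(J) for J in combinations(range(p), r)
                  if all(any(H[i][j] for j in J) for i in range(n))]
        if covers:               # first nonempty level = minimum cardinality
            return covers
    return []

H = [[1, 1, 0],
     [1, 0, 1],
     [0, 1, 1]]
covers = min_covers(H)           # [{0, 1}, {0, 2}, {1, 2}]
```

Enumeration is exponential in p in the worst case, which is exactly why the block structure exploited later matters.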
Identification of J_r(A, x) and the Set Covering Problem

J ∈ J_r(A, x) if and only if δ = (δ_j) solves (1) with h_{i,j} = 1{a_{i,j} ẑ_j = x_i} and δ_j = 1{j ∈ J}.

A toy example: consider X = A ⊙ Z with A = (1 1 1; 1 0 0), i.e. X_1 = Z_1 ∨ Z_2 ∨ Z_3, X_2 = Z_1. Then, in the blue case 0 < a = X_2 < X_1 = b,

    H = (0 1 1; 1 0 0).

The minimum-cost coverings are columns {1,2} and {1,3}.

Write J_r(H) = J_r(A, x), with H referred to as the hitting matrix. The set covering problem is NP-hard.
Simple Cases of the Set Covering Problem

Two types of H. For a general 0/1 matrix H_1, it takes a while to solve for r(J(H_1)) and J_r(H_1). For a matrix H_2 in which columns 1 and 2 are dominating, clearly r(J(H_2)) = 2 and J(H_2) = {{1,2}}.

Lemma (Wang and Stoev). With probability one, H has a nice structure.
Factorization of the Regular Conditional Probability Formula

Theorem (Wang and Stoev). With probability one,

    ν(X, E) = ∏_{s=1}^{r} ν^(s)(X, E),  with  {Z_j}_{j∈J^(s)} | X = x ~ ν^(s)(x, ·).

Blocking structure: ∪_{s=1}^{r} J^(s) = {1, ..., p} and ∪_{s=1}^{r} I_s = {1, ..., n}.

Conditional independence: restricted to X = x,

    X_i = ∨_{j=1}^{p} a_{i,j} Z_j = ∨_{j∈J^(s)} a_{i,j} Z_j,  i ∈ I_s.

Algorithm II: sample {Z_j}_{j∈J^(1)} ~ ν^(1)(x, ·), ..., {Z_j}_{j∈J^(r)} ~ ν^(r)(x, ·) independently; then

    Z =_d ν(x, ·) = Σ_{J∈J_r} p_J ν_J(x, ·).
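The blocking structure behind Algorithm II amounts to finding the connected components of the bipartite graph linking rows and columns of the hitting matrix. A sketch of ours (not the maxlinear implementation), with an illustrative two-block H:

```python
def blocks(H):
    """Partition rows of a 0/1 hitting matrix into blocks I_s: two rows are
    equivalent if linked by a chain of shared columns (h_{i1,j} = h_{i2,j} = 1)."""
    n, p = len(H), len(H[0])
    row_block = [-1] * n
    s = 0
    for start in range(n):
        if row_block[start] >= 0:
            continue
        stack = [start]                   # DFS over rows reachable via shared columns
        row_block[start] = s
        while stack:
            i = stack.pop()
            for j in range(p):
                if H[i][j]:
                    for i2 in range(n):
                        if H[i2][j] and row_block[i2] < 0:
                            row_block[i2] = s
                            stack.append(i2)
        s += 1
    # For each block, the columns it touches (the J-bar^{(s)} of the slides).
    cols = [sorted({j for i in range(n) if row_block[i] == b
                    for j in range(p) if H[i][j]}) for b in range(s)]
    return row_block, cols

# Illustrative hitting matrix: rows {0, 1} share column 1; row {2} lives on columns {2, 3}.
H = [[1, 1, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 1]]
row_block, cols = blocks(H)
```

Once the blocks are known, each ν^(s)(x, ·) is sampled independently, turning a product over scenarios into a sum of small per-block problems.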
Computational Efficiency

Table: time (in seconds) for the identification of the blocking structure for Algorithm II, over a grid of values of p and n (standard errors in parentheses).

Comparison of the two formulas:

    Σ_{J∈J_r(A,x)} p_J ν_J(x, ·) = ν(x, ·) = ∏_{s=1}^{r} [ Σ_{j∈J^(s)} w_j^(s) ν_j^(s)(x, ·) / Σ_{j∈J^(s)} w_j^(s) ],

where the s-th factor is ν^(s)(x, ·).

Example: suppose, given X = x, there are r blocks of sizes |J^(s)|. Then using ν(x, ·) directly requires memory for |J_r(A, x)| = ∏_{s=1}^{r} |J^(s)| weights, while applying the factors {ν^(s)(x, ·)}_{s∈[r]} requires only Σ_{s=1}^{r} |J^(s)|.
Applications

- Simulations based on the de Haan-Pereira model (de Haan and Pereira, 2006).
- Computational tools for prediction: real data analysis on maximal rainfall data at Bourgogne, France.
De Haan-Pereira Model

A stationary max-stable random field model (de Haan and Pereira, 2006):

    X_t = ∫^e_{R^2} φ(t − u) M_α(du),  t = (t_1, t_2) ∈ R^2,

with

    φ(t_1, t_2) := (β_1 β_2) / (2π √(1 − ρ^2)) · exp{ −1/(2(1 − ρ^2)) [ β_1^2 t_1^2 − 2ρ β_1 β_2 t_1 t_2 + β_2^2 t_2^2 ] }.

Consistent estimators are known for ρ, β_1, β_2.

A discretized version:

    X_t = h^{2/α} ∨_{−q ≤ j_1, j_2 ≤ q−1} φ(t − u_{j_1 j_2}) Z_{j_1 j_2},

with u_{j_1 j_2} = ((j_1 + 1/2)h, (j_2 + 1/2)h) and the Z_{j_1 j_2} i.i.d. 1-Fréchet.
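Assuming the bivariate Gaussian-density form of φ above, the discretized moving-maxima field is straightforward to simulate. A sketch of ours; grid size q, spacing h, and parameter values are illustrative, not from the talk.

```python
import numpy as np

def phi(t1, t2, rho=0.0, b1=1.0, b2=1.0):
    """Gaussian-density kernel of the de Haan-Pereira model."""
    norm = b1 * b2 / (2 * np.pi * np.sqrt(1 - rho**2))
    q = (b1**2 * t1**2 - 2 * rho * b1 * b2 * t1 * t2 + b2**2 * t2**2) / (2 * (1 - rho**2))
    return norm * np.exp(-q)

def simulate_field(points, q=20, h=0.25, alpha=1.0, rng=None):
    """Discretized field: X_t = h^(2/alpha) * max_{j1,j2} phi(t - u_{j1,j2}) * Z_{j1,j2}."""
    rng = np.random.default_rng() if rng is None else rng
    js = np.arange(-q, q)
    u1, u2 = np.meshgrid((js + 0.5) * h, (js + 0.5) * h, indexing="ij")
    Z = -1.0 / np.log(rng.uniform(size=u1.shape))   # i.i.d. standard 1-Frechet
    return np.array([(h ** (2.0 / alpha)) * np.max(phi(t1 - u1, t2 - u2) * Z)
                     for (t1, t2) in points])

X = simulate_field([(0.0, 0.0), (0.5, 0.5)], rng=np.random.default_rng(1))
```

Because the discretized field is max-linear in the Z_{j_1 j_2}, the conditional sampling machinery of the earlier slides applies to it directly.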
Simulations

Figure: conditional samples from the de Haan-Pereira model. Parameters: ρ = 0, β_1 = 1, β_2 = 1.
Simulations

Figure: 9% quantile of the conditional marginal deviation. Parameters: ρ = 0., β_1 = 1., β_2 = 0.7.
Review

- Obtained an explicit formula for the regular conditional probability of max-linear models (including a large class of max-stable random fields).
- Investigated the conditional independence structure.
- Developed efficient software (R package maxlinear) for large-scale conditional sampling.
- Potential applications in prediction for extremal phenomena, e.g., environmental and financial problems.

Thank you. Website: http://
Auxiliary Results
Regular Conditional Probability

The regular conditional probability ν of Z given σ(X) is a function

    ν : R_+^n × B_{R_+^p} → [0, 1]

such that:
(i) ν(x, ·) is a probability measure, for all x ∈ R_+^n;
(ii) the function ν(·, E) is measurable, for all Borel sets E ∈ B_{R_+^p};
(iii) for all E ∈ B_{R_+^p} and D ∈ B_{R_+^n} (with P_X(·) := P(X ∈ ·)),

    P(Z ∈ E, X ∈ D) = ∫_D ν(x, E) P_X(dx).   (2)

We will first guess a formula for ν and then prove (2).
A Heuristic Proof

Consider a neighborhood of C_J(A, x) (which has P-measure 0):

    C_J(A, x) = {z ∈ R_+^p : z_j = ẑ_j, j ∈ J; z_k < ẑ_k, k ∈ J^c}
    C_J^ε(A, x) := {z ∈ R_+^p : z_j ∈ [ẑ_j(1 − ε), ẑ_j(1 + ε)], j ∈ J; z_k < ẑ_k(1 − ε), k ∈ J^c}

for small enough ε > 0, and let C^ε(A, x) := ∪_{J∈J(A,x)} C_J^ε(A, x). The sets A ⊙ (C^ε(A, x)) shrink to the point x as ε → 0.

Proposition (Wang and Stoev). For all x ∈ A ⊙ (R_+^p), we have, as ε → 0,

    P(Z ∈ E | Z ∈ C^ε(A, x)) → ν(x, E),  for all E ∈ B_{R_+^p}.

Remarks:
- Proved by Taylor expansion.
- The choice of C_J^ε(A, x) is delicate.
Nice Structure of the Hitting Matrix H

Blocks of the matrix H: write i_1 ~_j i_2 if h_{i_1,j} = h_{i_2,j} = 1. Define an equivalence relation on [n]:

    i_1 ~ i_2,  if  i_1 = ĩ_0 ~_{j_1} ĩ_1 ~_{j_2} ... ~_{j_m} ĩ_m = i_2.   (3)

r blocks: the equivalence relation (3) induces [n] = ∪_{s=1}^{r} I_s. Further,

    J^(s) := {j ∈ [p] : h_{i,j} = 1 for all i ∈ I_s},
    J̄^(s) := {j ∈ [p] : h_{i,j} = 1 for some i ∈ I_s}.

Example: a hitting matrix H with two blocks, I_1 = {1, 2, 3} and I_2 = {4, 5, 6, 7}, each with its own column sets J^(s) ⊂ J̄^(s) (e.g., J^(2) = {1}).

Lemma (Wang and Stoev). With probability one, the hitting matrix H of the max-linear model X = A ⊙ Z has nice structure: (i) r = r(J(H)) = r(J(A, x)) and (ii) J^(s) is nonempty for all s ∈ [r].
When Do We Have a Bad H? Another Toy Example

Consider X = A ⊙ Z. Different realizations of Z lead to different hitting matrices:

- Z_1 > Z_2, Z_2 > Z_3: one hitting matrix H;
- Z_1 < Z_2, Z_2 > Z_3: another H;
- Z_1 = Z_2, Z_2 > Z_3: a degenerate H with ties, which occurs with probability zero since the Z_j are continuous.
Factorization of the Regular Conditional Probability Formula

Theorem (Wang and Stoev). With probability one, we have:
(i) for all J ⊂ [p], J ∈ J_r(A, A ⊙ Z) if and only if J can be written as J = {j_1, ..., j_r} with j_s ∈ J^(s), s ∈ [r];
(ii) for the regular conditional probability ν(x, E),

    ν(X, E) = ∏_{s=1}^{r} ν^(s)(X, E)  with  ν^(s)(X, E) = Σ_{j∈J^(s)} w_j^(s) ν_j^(s)(X, E) / Σ_{j∈J^(s)} w_j^(s),

where for all j ∈ J^(s),

    w_j^(s) = ẑ_j f_{Z_j}(ẑ_j) ∏_{k∈J^(s)\{j}} F_{Z_k}(ẑ_k),
    ν_j^(s)(x, E) = δ_{ẑ_j}(π_j(E)) ∏_{k∈J^(s)\{j}} P(Z_k ∈ π_k(E) | Z_k < ẑ_k).
Max stable Processes & Random Fields: Representations, Models, and Prediction
Max stable Processes & Random Fields: Representations, Models, and Prediction Stilian Stoev University of Michigan, Ann Arbor March 2, 2011 Based on joint works with Yizao Wang and Murad S. Taqqu. 1 Preliminaries
More informationSimulation of Max Stable Processes
Simulation of Max Stable Processes Whitney Huang Department of Statistics Purdue University November 6, 2013 1 / 30 Outline 1 Max-Stable Processes 2 Unconditional Simulations 3 Conditional Simulations
More informationarxiv:math/ v1 [math.st] 16 May 2006
The Annals of Statistics 006 Vol 34 No 46 68 DOI: 04/009053605000000886 c Institute of Mathematical Statistics 006 arxiv:math/0605436v [mathst] 6 May 006 SPATIAL EXTREMES: MODELS FOR THE STATIONARY CASE
More informationStochastic optimization Markov Chain Monte Carlo
Stochastic optimization Markov Chain Monte Carlo Ethan Fetaya Weizmann Institute of Science 1 Motivation Markov chains Stationary distribution Mixing time 2 Algorithms Metropolis-Hastings Simulated Annealing
More informationMax stable processes: representations, ergodic properties and some statistical applications
Max stable processes: representations, ergodic properties and some statistical applications Stilian Stoev University of Michigan, Ann Arbor Oberwolfach March 21, 2008 The focus Let X = {X t } t R be a
More informationx. Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ 2 ).
.8.6 µ =, σ = 1 µ = 1, σ = 1 / µ =, σ =.. 3 1 1 3 x Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ ). The Gaussian distribution Probably the most-important distribution in all of statistics
More informationToday: Linear Programming (con t.)
Today: Linear Programming (con t.) COSC 581, Algorithms April 10, 2014 Many of these slides are adapted from several online sources Reading Assignments Today s class: Chapter 29.4 Reading assignment for
More informationExtreme Value Analysis and Spatial Extremes
Extreme Value Analysis and Department of Statistics Purdue University 11/07/2013 Outline Motivation 1 Motivation 2 Extreme Value Theorem and 3 Bayesian Hierarchical Models Copula Models Max-stable Models
More informationThe Multivariate Normal Distribution. In this case according to our theorem
The Multivariate Normal Distribution Defn: Z R 1 N(0, 1) iff f Z (z) = 1 2π e z2 /2. Defn: Z R p MV N p (0, I) if and only if Z = (Z 1,..., Z p ) T with the Z i independent and each Z i N(0, 1). In this
More informationIEOR E4703: Monte-Carlo Simulation
IEOR E4703: Monte-Carlo Simulation Output Analysis for Monte-Carlo Martin Haugh Department of Industrial Engineering and Operations Research Columbia University Email: martin.b.haugh@gmail.com Output Analysis
More informationMultivariate Distributions
IEOR E4602: Quantitative Risk Management Spring 2016 c 2016 by Martin Haugh Multivariate Distributions We will study multivariate distributions in these notes, focusing 1 in particular on multivariate
More informationComputer Intensive Methods in Mathematical Statistics
Computer Intensive Methods in Mathematical Statistics Department of mathematics johawes@kth.se Lecture 5 Sequential Monte Carlo methods I 31 March 2017 Computer Intensive Methods (1) Plan of today s lecture
More informationSolutions to Exercises
1/13 Solutions to Exercises The exercises referred to as WS 1.1(a), and so forth, are from the course book: Williamson and Shmoys, The Design of Approximation Algorithms, Cambridge University Press, 2011,
More informationOn the estimation of the heavy tail exponent in time series using the max spectrum. Stilian A. Stoev
On the estimation of the heavy tail exponent in time series using the max spectrum Stilian A. Stoev (sstoev@umich.edu) University of Michigan, Ann Arbor, U.S.A. JSM, Salt Lake City, 007 joint work with:
More informationInference For High Dimensional M-estimates. Fixed Design Results
: Fixed Design Results Lihua Lei Advisors: Peter J. Bickel, Michael I. Jordan joint work with Peter J. Bickel and Noureddine El Karoui Dec. 8, 2016 1/57 Table of Contents 1 Background 2 Main Results and
More informationLecture 11: Introduction to Markov Chains. Copyright G. Caire (Sample Lectures) 321
Lecture 11: Introduction to Markov Chains Copyright G. Caire (Sample Lectures) 321 Discrete-time random processes A sequence of RVs indexed by a variable n 2 {0, 1, 2,...} forms a discretetime random process
More informationChapter 6. Hypothesis Tests Lecture 20: UMP tests and Neyman-Pearson lemma
Chapter 6. Hypothesis Tests Lecture 20: UMP tests and Neyman-Pearson lemma Theory of testing hypotheses X: a sample from a population P in P, a family of populations. Based on the observed X, we test a
More informationDecember 19, Probability Theory Instituto Superior Técnico. Poisson Convergence. João Brazuna. Weak Law of Small Numbers
Simple to Probability Theory Instituto Superior Técnico December 19, 2016 Contents Simple to 1 Simple 2 to Contents Simple to 1 Simple 2 to Simple to Theorem - Events with low frequency in a large population
More informationLecture 5. If we interpret the index n 0 as time, then a Markov chain simply requires that the future depends only on the present and not on the past.
1 Markov chain: definition Lecture 5 Definition 1.1 Markov chain] A sequence of random variables (X n ) n 0 taking values in a measurable state space (S, S) is called a (discrete time) Markov chain, if
More informationPhenomena in high dimensions in geometric analysis, random matrices, and computational geometry Roscoff, France, June 25-29, 2012
Phenomena in high dimensions in geometric analysis, random matrices, and computational geometry Roscoff, France, June 25-29, 202 BOUNDS AND ASYMPTOTICS FOR FISHER INFORMATION IN THE CENTRAL LIMIT THEOREM
More informationDiscrete solid-on-solid models
Discrete solid-on-solid models University of Alberta 2018 COSy, University of Manitoba - June 7 Discrete processes, stochastic PDEs, deterministic PDEs Table: Deterministic PDEs Heat-diffusion equation
More informationOn Markov Chain Monte Carlo
MCMC 0 On Markov Chain Monte Carlo Yevgeniy Kovchegov Oregon State University MCMC 1 Metropolis-Hastings algorithm. Goal: simulating an Ω-valued random variable distributed according to a given probability
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 218 Outlines Overview Introduction Linear Algebra Probability Linear Regression 1
More informationINTRODUCTION TO MARKOV CHAIN MONTE CARLO
INTRODUCTION TO MARKOV CHAIN MONTE CARLO 1. Introduction: MCMC In its simplest incarnation, the Monte Carlo method is nothing more than a computerbased exploitation of the Law of Large Numbers to estimate
More informationA Marshall-Olkin Gamma Distribution and Process
CHAPTER 3 A Marshall-Olkin Gamma Distribution and Process 3.1 Introduction Gamma distribution is a widely used distribution in many fields such as lifetime data analysis, reliability, hydrology, medicine,
More informationMultivariate Normal-Laplace Distribution and Processes
CHAPTER 4 Multivariate Normal-Laplace Distribution and Processes The normal-laplace distribution, which results from the convolution of independent normal and Laplace random variables is introduced by
More informationMarkov Chain Monte Carlo
1 Motivation 1.1 Bayesian Learning Markov Chain Monte Carlo Yale Chang In Bayesian learning, given data X, we make assumptions on the generative process of X by introducing hidden variables Z: p(z): prior
More informationLemma 8: Suppose the N by N matrix A has the following block upper triangular form:
17 4 Determinants and the Inverse of a Square Matrix In this section, we are going to use our knowledge of determinants and their properties to derive an explicit formula for the inverse of a square matrix
More information2. Matrix Algebra and Random Vectors
2. Matrix Algebra and Random Vectors 2.1 Introduction Multivariate data can be conveniently display as array of numbers. In general, a rectangular array of numbers with, for instance, n rows and p columns
More informationConcentration inequalities for Feynman-Kac particle models. P. Del Moral. INRIA Bordeaux & IMB & CMAP X. Journées MAS 2012, SMAI Clermond-Ferrand
Concentration inequalities for Feynman-Kac particle models P. Del Moral INRIA Bordeaux & IMB & CMAP X Journées MAS 2012, SMAI Clermond-Ferrand Some hyper-refs Feynman-Kac formulae, Genealogical & Interacting
More information1 Lyapunov theory of stability
M.Kawski, APM 581 Diff Equns Intro to Lyapunov theory. November 15, 29 1 1 Lyapunov theory of stability Introduction. Lyapunov s second (or direct) method provides tools for studying (asymptotic) stability
More informationSTA 711: Probability & Measure Theory Robert L. Wolpert
STA 711: Probability & Measure Theory Robert L. Wolpert 6 Independence 6.1 Independent Events A collection of events {A i } F in a probability space (Ω,F,P) is called independent if P[ i I A i ] = P[A
More informationSTATS 306B: Unsupervised Learning Spring Lecture 2 April 2
STATS 306B: Unsupervised Learning Spring 2014 Lecture 2 April 2 Lecturer: Lester Mackey Scribe: Junyang Qian, Minzhe Wang 2.1 Recap In the last lecture, we formulated our working definition of unsupervised
More informationMixtures of Gaussians. Sargur Srihari
Mixtures of Gaussians Sargur srihari@cedar.buffalo.edu 1 9. Mixture Models and EM 0. Mixture Models Overview 1. K-Means Clustering 2. Mixtures of Gaussians 3. An Alternative View of EM 4. The EM Algorithm
More informationIntroduction to MCMC. DB Breakfast 09/30/2011 Guozhang Wang
Introduction to MCMC DB Breakfast 09/30/2011 Guozhang Wang Motivation: Statistical Inference Joint Distribution Sleeps Well Playground Sunny Bike Ride Pleasant dinner Productive day Posterior Estimation
More informationNonparametric Bayesian Methods (Gaussian Processes)
[70240413 Statistical Machine Learning, Spring, 2015] Nonparametric Bayesian Methods (Gaussian Processes) Jun Zhu dcszj@mail.tsinghua.edu.cn http://bigml.cs.tsinghua.edu.cn/~jun State Key Lab of Intelligent
More informationGeneral Principles in Random Variates Generation
General Principles in Random Variates Generation E. Moulines and G. Fort Telecom ParisTech June 2015 Bibliography : Luc Devroye, Non-Uniform Random Variate Generator, Springer-Verlag (1986) available on
More informationStable Process. 2. Multivariate Stable Distributions. July, 2006
Stable Process 2. Multivariate Stable Distributions July, 2006 1. Stable random vectors. 2. Characteristic functions. 3. Strictly stable and symmetric stable random vectors. 4. Sub-Gaussian random vectors.
More information(Multivariate) Gaussian (Normal) Probability Densities
(Multivariate) Gaussian (Normal) Probability Densities Carl Edward Rasmussen, José Miguel Hernández-Lobato & Richard Turner April 20th, 2018 Rasmussen, Hernàndez-Lobato & Turner Gaussian Densities April
More informationMixing time for a random walk on a ring
Mixing time for a random walk on a ring Stephen Connor Joint work with Michael Bate Paris, September 2013 Introduction Let X be a discrete time Markov chain on a finite state space S, with transition matrix
More informationSTAT 200C: High-dimensional Statistics
STAT 200C: High-dimensional Statistics Arash A. Amini May 30, 2018 1 / 59 Classical case: n d. Asymptotic assumption: d is fixed and n. Basic tools: LLN and CLT. High-dimensional setting: n d, e.g. n/d
More informationExtremogram and Ex-Periodogram for heavy-tailed time series
Extremogram and Ex-Periodogram for heavy-tailed time series 1 Thomas Mikosch University of Copenhagen Joint work with Richard A. Davis (Columbia) and Yuwei Zhao (Ulm) 1 Jussieu, April 9, 2014 1 2 Extremal
More informationConditional Distributions
Conditional Distributions The goal is to provide a general definition of the conditional distribution of Y given X, when (X, Y ) are jointly distributed. Let F be a distribution function on R. Let G(,
More informationLecture 13 October 6, Covering Numbers and Maurey s Empirical Method
CS 395T: Sublinear Algorithms Fall 2016 Prof. Eric Price Lecture 13 October 6, 2016 Scribe: Kiyeon Jeon and Loc Hoang 1 Overview In the last lecture we covered the lower bound for p th moment (p > 2) and
More informationModeling & Control of Hybrid Systems Chapter 4 Stability
Modeling & Control of Hybrid Systems Chapter 4 Stability Overview 1. Switched systems 2. Lyapunov theory for smooth and linear systems 3. Stability for any switching signal 4. Stability for given switching
More informationStatistical Methods in Particle Physics
Statistical Methods in Particle Physics Lecture 3 October 29, 2012 Silvia Masciocchi, GSI Darmstadt s.masciocchi@gsi.de Winter Semester 2012 / 13 Outline Reminder: Probability density function Cumulative
More informationDiscretization of SDEs: Euler Methods and Beyond
Discretization of SDEs: Euler Methods and Beyond 09-26-2006 / PRisMa 2006 Workshop Outline Introduction 1 Introduction Motivation Stochastic Differential Equations 2 The Time Discretization of SDEs Monte-Carlo
More informationAn Extended BIC for Model Selection
An Extended BIC for Model Selection at the JSM meeting 2007 - Salt Lake City Surajit Ray Boston University (Dept of Mathematics and Statistics) Joint work with James Berger, Duke University; Susie Bayarri,
More informationDeterminant Approximations
Dagstuhl p.1 Determinant Approximations Ilse Ipsen North Carolina State University Joint Work with Dean Lee (Physics) Dagstuhl p.2 Overview Existing methods and determinant inequalities Our idea Diagonal
More informationINTERIOR-POINT METHODS ROBERT J. VANDERBEI JOINT WORK WITH H. YURTTAN BENSON REAL-WORLD EXAMPLES BY J.O. COLEMAN, NAVAL RESEARCH LAB
1 INTERIOR-POINT METHODS FOR SECOND-ORDER-CONE AND SEMIDEFINITE PROGRAMMING ROBERT J. VANDERBEI JOINT WORK WITH H. YURTTAN BENSON REAL-WORLD EXAMPLES BY J.O. COLEMAN, NAVAL RESEARCH LAB Outline 2 Introduction
More informationPCMI Introduction to Random Matrix Theory Handout # REVIEW OF PROBABILITY THEORY. Chapter 1 - Events and Their Probabilities