STAT 201B: LECTURE 2 NOTES
1. Minimal sufficient statistics.

Definition 1.1 (minimal sufficient statistic). A statistic $t := \phi(x)$ is a minimal sufficient statistic for the parametric model $\mathcal{F}$ if

1. it is sufficient;
2. it can be expressed as a function of any other sufficient statistic; i.e., for any other sufficient statistic $t'$, there is some function $g$ such that $t = g(t')$.

Recall that passing a random variable $x$ through a function $g$ is a form of data reduction: a statistic $t' = g(t)$ has no more information than $t$. Thus a minimal sufficient statistic is a sufficient statistic with the least information; that is, it is an optimal form of data reduction. Like sufficient statistics, minimal sufficient statistics are not unique: if $t$ is minimal sufficient, then $g(t)$ for any invertible $g$ is also minimal sufficient. In terms of partitions of the sample space, a minimal sufficient statistic induces the coarsest (hence "minimal") partition of the sample space among all sufficient statistics. Although minimal sufficient statistics are not unique, the minimal sufficient partition is unique.

Going back to the coin tossing example, consider the statistic

(1.1) $\phi(x_1, x_2, x_3) = \begin{cases} 100x_1 + x_2 + x_3 & x_1 + x_2 + x_3 = 2 \\ x_1 + x_2 + x_3 & \text{otherwise.} \end{cases}$

Table 1 gives the conditional distribution of $(x_1, x_2, x_3)$ given $t$. The statistic is sufficient because the conditional distribution does not depend on $p$. However, it is not minimal because it is not a function of $x_1 + x_2 + x_3$, which we know is also sufficient.

Theorem 1.2. A statistic $t := \phi(x)$ is a minimal sufficient statistic for a parametric model $\mathcal{F}$ if the ratio $f_\theta(x_1)/f_\theta(x_2)$ does not depend on $\theta$ if and only if $\phi(x_1) = \phi(x_2)$.

Proof. We shall prove the theorem when the densities in the model have common support.
Table 1
The conditional distribution of three coin tosses given $t$. We see that $t$ is not a minimal sufficient statistic:

    (x_1, x_2, x_3)    t      f_p(x | t)
    (0,0,0)            0      1
    (1,0,0)            1      1/3
    (0,1,0)            1      1/3
    (0,0,1)            1      1/3
    (1,1,0)            101    1/2
    (0,1,1)            2      1
    (1,0,1)            101    1/2
    (1,1,1)            3      1

First, we show that $t$ is sufficient. For each set $A_t = \{x : \phi(x) = t\}$ in the partition induced by $t$, fix a point $x_t \in A_t$. The density has the form

$f_\theta(x) = \frac{f_\theta(x)}{f_\theta(x_{\phi(x)})}\, f_\theta(x_{\phi(x)}).$

By assumption (since $\phi(x) = \phi(x_{\phi(x)})$), the ratio $f_\theta(x)/f_\theta(x_{\phi(x)})$ does not depend on $\theta$. Further,

1. $x_{\phi(x)}$ is a function of $x$, so $f_\theta(x)/f_\theta(x_{\phi(x)})$ depends only on $x$;
2. $f_\theta(x_{\phi(x)})$ depends only on $\phi(x)$.

Thus, by the factorization theorem, $t = \phi(x)$ is a sufficient statistic.

We complete the proof by showing that $t$ is minimal. It suffices to show that $\phi'(x_1) = \phi'(x_2)$ for any sufficient statistic $t' := \phi'(x)$ implies $\phi(x_1) = \phi(x_2)$. By the factorization theorem, the density has the form

$f_\theta(x) = g_\theta(\phi'(x))\, h(x).$

For any $x_1$, choose $x_2$ such that $\phi'(x_1) = \phi'(x_2)$. We observe that the ratio

$\frac{f_\theta(x_1)}{f_\theta(x_2)} = \frac{g_\theta(\phi'(x_1))\, h(x_1)}{g_\theta(\phi'(x_2))\, h(x_2)} = \frac{h(x_1)}{h(x_2)}$

does not depend on $\theta$. By assumption, this implies $\phi(x_1) = \phi(x_2)$.

Example 1.3. Let $x_i \sim_{\text{i.i.d.}} \mathrm{Poi}(\theta)$. We showed that $t := \sum_{i \in [n]} x_i$ is a sufficient statistic for the Poisson family. The ratio

$\frac{f_\theta(x)}{f_\theta(x')} = \frac{\prod_{i \in [n]} x_i'!}{\prod_{i \in [n]} x_i!}\, \theta^{\sum_{i \in [n]} x_i - \sum_{i \in [n]} x_i'},$
which does not depend on $\theta$ if and only if $t = t'$. By Theorem 1.2, $t$ is a minimal sufficient statistic.

2. Exponential families.

Definition 2.1. A set of densities is an exponential family if the densities are of the form

(2.1) $f_\theta(x) = \exp\left( \eta(\theta)^T \phi(x) - a(\theta) \right) h(x),$

where

- $\eta(\theta) \in \mathbf{R}^p$ are the natural parameters,
- $\phi(x) \in \mathbf{R}^p$ are sufficient statistics,
- $a(\theta) := \log \int \exp\left( \eta(\theta)^T \phi(x) \right) h(x)\,dx$ is the log-partition function,
- $h \ge 0$ is the base measure.

Exponential families are of statistical relevance because many common distributions are exponential families. For example,

1. binomial:
$f_p(x) = \binom{n}{x} p^x (1-p)^{n-x} = \exp\left( x \log p + (n-x) \log(1-p) \right) \binom{n}{x} = \exp\left( x \log \tfrac{p}{1-p} + n \log(1-p) \right) \binom{n}{x}.$
The natural parameter is the logit of $p$: $\log \frac{p}{1-p}$.

2. Poisson:
$f_\lambda(x) = \frac{\lambda^x}{x!} e^{-\lambda} = \exp\left( x \log \lambda - \lambda \right) \frac{1}{x!}.$

3. Gaussian ($\sigma^2 = 1$):
$f_\mu(x) = \frac{1}{\sqrt{2\pi}} \exp\left( -\tfrac{1}{2}(x - \mu)^2 \right) = \exp\left( \mu x - \tfrac{1}{2}\mu^2 \right) \frac{e^{-x^2/2}}{\sqrt{2\pi}}.$

4. Gaussian (unknown $\sigma^2$):
$f_{\mu,\sigma^2}(x) = \exp\left( -\frac{x^2}{2\sigma^2} + \frac{\mu x}{\sigma^2} - \frac{\mu^2}{2\sigma^2} - \log \sigma \right) \frac{1}{\sqrt{2\pi}} = \exp\left( \begin{bmatrix} \frac{\mu}{\sigma^2} \\ -\frac{1}{2\sigma^2} \end{bmatrix}^T \begin{bmatrix} x \\ x^2 \end{bmatrix} - \frac{\mu^2}{2\sigma^2} - \log \sigma \right) \frac{1}{\sqrt{2\pi}}.$
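As a quick sanity check, the binomial and Poisson factorizations above can be verified numerically. The following Python sketch (illustrative; the parameter values $n = 10$, $p = 0.3$, $\lambda = 2.5$ are arbitrary choices) compares the standard and exponential-family forms of each pmf:

```python
import math

def binom_pmf(n, x, p):
    # standard form: C(n, x) p^x (1 - p)^(n - x)
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def binom_expfam(n, x, p):
    # exponential-family form: exp(eta * x - a) * C(n, x), with natural
    # parameter eta = log(p / (1 - p)) and log-partition term a = -n log(1 - p)
    eta = math.log(p / (1 - p))
    a = -n * math.log(1 - p)
    return math.exp(eta * x - a) * math.comb(n, x)

def pois_pmf(lam, x):
    # standard form: lambda^x e^{-lambda} / x!
    return lam**x * math.exp(-lam) / math.factorial(x)

def pois_expfam(lam, x):
    # exponential-family form: exp(x log(lambda) - lambda) / x!
    return math.exp(x * math.log(lam) - lam) / math.factorial(x)

for x in range(11):
    assert math.isclose(binom_pmf(10, x, 0.3), binom_expfam(10, x, 0.3))
    assert math.isclose(pois_pmf(2.5, x), pois_expfam(2.5, x))
```

The two forms agree because the exponential-family expression is an algebraic rearrangement of the standard pmf, not a different distribution.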
As all-inclusive as exponential families seem, there are notable exceptions. For example, the $\mathrm{unif}(0, \theta)$ family is not an exponential family. More generally, parametric models whose support depends on the parameter are not exponential families.

Often, it is convenient to re-parametrize an exponential family in terms of its natural parameters $\eta$. Doing so gives the canonical form of the exponential family:

$f_\eta(x) = \exp\left( \eta^T \phi(x) - a(\eta) \right) h(x).$

For a set of sufficient statistics $\phi$ and a base measure $h$, the set of all valid natural parameters is the natural parameter space:

$H := \left\{ \eta \in \mathbf{R}^p : \int \exp\left( \eta^T \phi(x) \right) h(x)\,dx < \infty \right\}.$

Lemma 2.2. The log-partition function of an exponential family in canonical form is convex.

Proof. Let $\eta_1, \eta_2 \in H$. For any $\alpha \in (0, 1)$,

$a(\alpha \eta_1 + (1-\alpha)\eta_2) = \log \int \exp\left( (\alpha \eta_1 + (1-\alpha)\eta_2)^T \phi(x) \right) h(x)\,dx = \log \int \exp\left( \eta_1^T \phi(x) \right)^\alpha \exp\left( \eta_2^T \phi(x) \right)^{1-\alpha} h(x)\,dx.$

By Hölder's inequality (with exponents $\frac{1}{\alpha}$ and $\frac{1}{1-\alpha}$), this is at most

$\log \left[ \left( \int \exp\left( \eta_1^T \phi(x) \right) h(x)\,dx \right)^\alpha \left( \int \exp\left( \eta_2^T \phi(x) \right) h(x)\,dx \right)^{1-\alpha} \right] = \alpha a(\eta_1) + (1-\alpha) a(\eta_2).$

Since $a(\eta_1)$ and $a(\eta_2)$ are finite by assumption, so is $a(\alpha\eta_1 + (1-\alpha)\eta_2)$.

Lemma 2.3. The log-partition function $a : \Theta \to \mathbf{R}$ is convex and infinitely differentiable. Further, its derivatives are

(2.2) $\nabla a(\theta) = \mathbf{E}_\theta\left[ \phi(x) \right],$

(2.3) $\nabla^2 a(\theta) = \mathrm{cov}_\theta\left[ \phi(x) \right].$

Proof. The proof of Lemma 2.2 actually shows $a$ is convex. To complete the proof, we assume exchanging expectation and differentiation is permitted; verifying the validity of the assumption is done by a tedious argument
that appeals to the dominated convergence theorem. We defer the details to the appendix. We exchange expectation and differentiation to obtain

$\partial_i a(\theta) = \frac{\int \phi_i(x) \exp\left( \theta^T \phi(x) \right) h(x)\,dx}{\int \exp\left( \theta^T \phi(x) \right) h(x)\,dx} = \int \phi_i(x) \exp\left( \theta^T \phi(x) - a(\theta) \right) h(x)\,dx = \mathbf{E}_\theta\left[ \phi_i(x) \right]$

for any $i \in [p]$, which establishes (2.2). Exchanging expectation and differentiation again,

$\partial^2_{i,j} a(\theta) = \int \phi_i(x)\left( \phi_j(x) - \partial_j a(\theta) \right) \exp\left( \theta^T \phi(x) - a(\theta) \right) h(x)\,dx = \int \phi_i(x)\phi_j(x) \exp\left( \theta^T \phi(x) - a(\theta) \right) h(x)\,dx - \partial_j a(\theta) \int \phi_i(x) \exp\left( \theta^T \phi(x) - a(\theta) \right) h(x)\,dx = \mathbf{E}_\theta\left[ \phi_i(x)\phi_j(x) \right] - \mathbf{E}_\theta\left[ \phi_i(x) \right] \mathbf{E}_\theta\left[ \phi_j(x) \right]$

for any $i, j \in [p]$, which establishes (2.3).

The superficial dimension of an exponential family may be reduced by re-parametrization in two cases:

1. $T := \phi(\mathcal{X})$ is a subset of an affine set^1 (in $\mathbf{R}^p$): there are $a \in \mathbf{R}^p$ and $b \in \mathbf{R}$ such that $a^T \phi(x) = b$ for any $x \in \mathcal{X}$;
2. $H$ is a subset of an affine set (in $\mathbf{R}^p$).

In the first case, two parameters $\eta_1, \eta_2 \in H$ may correspond to the same density, making the model unidentifiable. Indeed, the densities that correspond to $\eta$ and $\eta + a$ are proportional to each other:

$f_{\eta + a}(x) \propto \exp\left( (\eta + a)^T \phi(x) \right) h(x) = \exp\left( \eta^T \phi(x) + b \right) h(x) \propto \exp\left( \eta^T \phi(x) \right) h(x).$

In fact, any parameter $\eta + \gamma a$ for some $\gamma \in \mathbf{R}$ is associated with the same density. In the second case, $\eta$ obeys linear equality constraints: the values

^1 An affine set is the set of solutions to a linear system: $\{x \in \mathbf{R}^n : Ax = b\}$ for some $A \in \mathbf{R}^{m \times n}$ and $b \in \mathbf{R}^m$.
of some parameters are determined by the values of other parameters. Thus some of the parameters are extraneous. If neither case holds, the exponential family is minimal.

Definition 2.4. An exponential family in canonical form is minimal if

1. the image of $\mathcal{X}$ under $\phi$ is not a subset of an affine set;
2. $H$ is not a subset of an affine set.

Minimal exponential families are classified into two types: full-rank and curved. A minimal exponential family is full-rank when its natural parameter space $H$ contains an open set; otherwise, it is curved.

By the factorization theorem, $\phi(x)$ is a sufficient statistic for densities of the form $f_\eta(x) = \exp\left( \eta^T \phi(x) - a(\eta) \right) h(x)$. If the exponential family is minimal, then $\phi(x)$ is a minimal sufficient statistic.

Theorem 2.5. If $\mathcal{F}$ is a minimal exponential family, then $t = \phi(x)$ is a minimal sufficient statistic.

Proof. By Theorem 1.2, it suffices to show the ratio $f_\eta(x_1)/f_\eta(x_2)$ is constant in $\eta$ if and only if $\phi(x_1) = \phi(x_2)$. If $\phi(x_1) = \phi(x_2)$, the ratio is

$\frac{f_\eta(x_1)}{f_\eta(x_2)} = \exp\left( \eta^T (\phi(x_1) - \phi(x_2)) \right) = 1.$

Conversely, assume $f_\eta(x_1)/f_\eta(x_2)$ is constant in $\eta$; that is,

$\frac{f_{\eta_1}(x_1)}{f_{\eta_1}(x_2)} = \frac{f_{\eta_2}(x_1)}{f_{\eta_2}(x_2)}$ for any $\eta_1, \eta_2 \in H$.

We simplify to obtain

$(\eta_1 - \eta_2)^T (\phi(x_1) - \phi(x_2)) = 0.$

By the definition of a minimal exponential family, $H$ is not a subset of an affine set. Since subspaces are affine sets, $H$ is not a subset of any proper subspace, which implies $\mathrm{span}(H) = \mathbf{R}^p$.
Since

$(\eta_1 - \eta_2)^T (\phi(x_1) - \phi(x_2)) = 0$ for all $\eta_1, \eta_2 \in H$,

we deduce

$\eta^T (\phi(x_1) - \phi(x_2)) = 0$ for all $\eta \in \mathrm{span}(H)$,

hence

$\eta^T (\phi(x_1) - \phi(x_2)) = 0$ for all $\eta \in \mathbf{R}^p$,

which allows us to conclude $\phi(x_1) - \phi(x_2) = 0$, i.e., $\phi(x_1) = \phi(x_2)$.

The joint distribution of i.i.d. samples from a $p$-dimensional exponential family is another $p$-dimensional exponential family:

$\prod_{i \in [n]} f_\theta(x_i) = \prod_{i \in [n]} \exp\left( \eta(\theta)^T \phi(x_i) - a(\theta) \right) h(x_i) = \exp\left( \eta(\theta)^T \left( \sum_{i \in [n]} \phi(x_i) \right) - n\,a(\theta) \right) \prod_{i \in [n]} h(x_i).$

By the factorization theorem, $\sum_{i \in [n]} \phi(x_i)$ is a sufficient statistic. We remark that its dimension does not grow with the sample size. In fact, i.i.d. exponential families are the only parametric models that admit a sufficient statistic whose dimension does not depend on the sample size. The Pitman-Koopman-Darmois^2 theorem says that if $\{x_i\}_{i \in [n]}$ are i.i.d. samples from a continuous density $f_\theta$ in a parametric model that consists of densities whose support does not depend on the parameter, then there is a sufficient statistic whose dimension does not depend on the sample size if and only if the parametric model is an exponential family.

APPENDIX A

Lemma A.1. If two functions $f_1$ and $f_2$ are

1. proportional to each other: $f_1(x) = c f_2(x)$ for any $x$,
2. equal in integral: $\int f_1(x)\,dx = \int f_2(x)\,dx$,

then they are equal: $f_1(x) = f_2(x)$ for any $x$.

Proof. Indeed, by integrating $f_1(x) = c f_2(x)$ and rearranging, we have

$\frac{\int f_1(x)\,dx}{\int f_2(x)\,dx} = c.$

By the assumption that their integrals agree, we deduce $c = 1$, which shows that $f_1(x) = f_2(x)$.

^2 One of the namesakes of the theorem is Dr. Jim Pitman's father, Dr. Edwin Pitman.
Lemma A.2. Let $f_\theta(x)$ be a member of a canonical exponential family with partition function $z(\eta) := \int \exp\left( \eta^T \phi(x) \right) h(x)\,dx$. We have

$\partial_i z(\eta) = \int \partial_i\left[ \exp\left( \eta^T \phi(x) \right) \right] h(x)\,dx$

at any $\eta \in \mathrm{int}(H)$.

Proof. We show that exchanging differentiation and integration is justified at the origin; the argument at other interior points is analogous. By the definition of the derivative,

$\partial_i z(0) = \lim_{n \to \infty} \frac{z(\delta_n e_i) - z(0)}{\delta_n} = \lim_{n \to \infty} \int \frac{e^{\delta_n e_i^T \phi(x)} - 1}{\delta_n}\, h(x)\,dx,$

where $\{\delta_n\}$ is any decreasing sequence that converges to zero. We assume $\delta_0 := \sup_n \delta_n$ is small enough that $\pm 2\delta_0 e_i \in \mathrm{int}(H)$. We recognize the limit of the integrand as $\partial_i\left[ \exp\left( \eta^T \phi(x) \right) \right] h(x)$ evaluated at $\eta = 0$. To justify exchanging $\lim_{n \to \infty}$ and integration, we appeal to the dominated convergence theorem. The (absolute value of the) integrand is at most

$\frac{\left| e^{\delta_n e_i^T \phi(x)} - 1 \right|}{\delta_n}\, h(x) \le \left| e_i^T \phi(x) \right| e^{\delta_0 \left| e_i^T \phi(x) \right|}\, h(x) \le \frac{1}{\delta_0} e^{2\delta_0 \left| e_i^T \phi(x) \right|}\, h(x) \le \frac{1}{\delta_0} \left( e^{2\delta_0 e_i^T \phi(x)} + e^{-2\delta_0 e_i^T \phi(x)} \right) h(x),$

using, in turn, $|e^u - 1| \le |u| e^{|u|}$, $|u| \le e^{|u|}$, and $e^{|u|} \le e^u + e^{-u}$. We check that the bound is integrable:

$\int \frac{1}{\delta_0} \left( e^{2\delta_0 e_i^T \phi(x)} + e^{-2\delta_0 e_i^T \phi(x)} \right) h(x)\,dx = \frac{1}{\delta_0} \left( z(2\delta_0 e_i) + z(-2\delta_0 e_i) \right),$
which, by the assumption that $\pm 2\delta_0 e_i \in \mathrm{int}(H)$, is finite. Thus we are free to exchange $\lim_{n \to \infty}$ and integration to conclude

$\partial_i z(0) = \lim_{n \to \infty} \int \frac{e^{\delta_n e_i^T \phi(x)} - 1}{\delta_n}\, h(x)\,dx = \int \lim_{n \to \infty} \frac{e^{\delta_n e_i^T \phi(x)} - 1}{\delta_n}\, h(x)\,dx = \int \partial_i\left[ \exp\left( \eta^T \phi(x) \right) \right]\Big|_{\eta = 0}\, h(x)\,dx.$

Yuekai Sun
Berkeley, California
November 29, 2015
More informationMiscellaneous Errors in the Chapter 6 Solutions
Miscellaneous Errors in the Chapter 6 Solutions 3.30(b In this problem, early printings of the second edition use the beta(a, b distribution, but later versions use the Poisson(λ distribution. If your
More information10. Composite Hypothesis Testing. ECE 830, Spring 2014
10. Composite Hypothesis Testing ECE 830, Spring 2014 1 / 25 In many real world problems, it is difficult to precisely specify probability distributions. Our models for data may involve unknown parameters
More information4. CONTINUOUS RANDOM VARIABLES
IA Probability Lent Term 4 CONTINUOUS RANDOM VARIABLES 4 Introduction Up to now we have restricted consideration to sample spaces Ω which are finite, or countable; we will now relax that assumption We
More informationIntroduction & Random Variables. John Dodson. September 3, 2008
& September 3, 2008 Statistics Statistics Statistics Course Fall sequence modules portfolio optimization Bill Barr fixed-income markets Chris Bemis calibration & simulation introductions flashcards how
More informationBanach Spaces V: A Closer Look at the w- and the w -Topologies
BS V c Gabriel Nagy Banach Spaces V: A Closer Look at the w- and the w -Topologies Notes from the Functional Analysis Course (Fall 07 - Spring 08) In this section we discuss two important, but highly non-trivial,
More informationExpectation Maximization
Expectation Maximization Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr 1 /
More information4 Expectation & the Lebesgue Theorems
STA 205: Probability & Measure Theory Robert L. Wolpert 4 Expectation & the Lebesgue Theorems Let X and {X n : n N} be random variables on a probability space (Ω,F,P). If X n (ω) X(ω) for each ω Ω, does
More informationFunctional Analysis Exercise Class
Functional Analysis Exercise Class Week 9 November 13 November Deadline to hand in the homeworks: your exercise class on week 16 November 20 November Exercises (1) Show that if T B(X, Y ) and S B(Y, Z)
More informationIntroducing the Normal Distribution
Department of Mathematics Ma 3/103 KC Border Introduction to Probability and Statistics Winter 2017 Lecture 10: Introducing the Normal Distribution Relevant textbook passages: Pitman [5]: Sections 1.2,
More informationLARGE DEVIATIONS OF TYPICAL LINEAR FUNCTIONALS ON A CONVEX BODY WITH UNCONDITIONAL BASIS. S. G. Bobkov and F. L. Nazarov. September 25, 2011
LARGE DEVIATIONS OF TYPICAL LINEAR FUNCTIONALS ON A CONVEX BODY WITH UNCONDITIONAL BASIS S. G. Bobkov and F. L. Nazarov September 25, 20 Abstract We study large deviations of linear functionals on an isotropic
More informationLecture 8: Information Theory and Statistics
Lecture 8: Information Theory and Statistics Part II: Hypothesis Testing and I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 23, 2015 1 / 50 I-Hsiang
More informationCourse 212: Academic Year Section 1: Metric Spaces
Course 212: Academic Year 1991-2 Section 1: Metric Spaces D. R. Wilkins Contents 1 Metric Spaces 3 1.1 Distance Functions and Metric Spaces............. 3 1.2 Convergence and Continuity in Metric Spaces.........
More informationChapter 2: Random Variables
ECE54: Stochastic Signals and Systems Fall 28 Lecture 2 - September 3, 28 Dr. Salim El Rouayheb Scribe: Peiwen Tian, Lu Liu, Ghadir Ayache Chapter 2: Random Variables Example. Tossing a fair coin twice:
More informationt x 1 e t dt, and simplify the answer when possible (for example, when r is a positive even number). In particular, confirm that EX 4 = 3.
Mathematical Statistics: Homewor problems General guideline. While woring outside the classroom, use any help you want, including people, computer algebra systems, Internet, and solution manuals, but mae
More informationRandom Variables and Their Distributions
Chapter 3 Random Variables and Their Distributions A random variable (r.v.) is a function that assigns one and only one numerical value to each simple event in an experiment. We will denote r.vs by capital
More informationENEE 621 SPRING 2016 DETECTION AND ESTIMATION THEORY THE PARAMETER ESTIMATION PROBLEM
c 2007-2016 by Armand M. Makowski 1 ENEE 621 SPRING 2016 DETECTION AND ESTIMATION THEORY THE PARAMETER ESTIMATION PROBLEM 1 The basic setting Throughout, p, q and k are positive integers. The setup With
More informationNew Class of duality models in discrete minmax fractional programming based on second-order univexities
STATISTICS, OPTIMIZATION AND INFORMATION COMPUTING Stat., Optim. Inf. Comput., Vol. 5, September 017, pp 6 77. Published online in International Academic Press www.iapress.org) New Class of duality models
More informationThe Sherrington-Kirkpatrick model
Stat 36 Stochastic Processes on Graphs The Sherrington-Kirkpatrick model Andrea Montanari Lecture - 4/-4/00 The Sherrington-Kirkpatrick (SK) model was introduced by David Sherrington and Scott Kirkpatrick
More information