Nonparametric estimation of s-concave and log-concave densities: alternatives to maximum likelihood
1 Nonparametric estimation of s-concave and log-concave densities: alternatives to maximum likelihood. Jon A. Wellner, University of Washington, Seattle. Cowles Foundation Seminar, Yale University, November 9, 2016.
2 Based on joint work with: Qiyang (Roy) Han, Charles Doss, Fadoua Balabdaoui, Kaspar Rufibach, Arseni Seregin.
3 Outline
A: Log-concave densities on R and R^d
B: s-concave densities on R and R^d
C: Maximum likelihood for log-concave and s-concave densities. 1: Basics. 2: On the model. 3: Off the model.
D: An alternative to ML: Rényi divergence estimators. 1: Basics. 2: On the model. 3: Off the model.
E: Summary: problems and open questions
4 A. Log-concave densities on R and R^d. If a density f on R^d is of the form
f(x) = f_ϕ(x) = exp(ϕ(x)),
where ϕ is concave (so −ϕ is convex), then f is log-concave. The class of all densities f on R^d of this form is called the class of log-concave densities, P_{log-concave} ≡ P_0.
Properties of log-concave densities:
- Every log-concave density f is unimodal (quasi-concave).
- P_0 is closed under convolution.
- P_0 is closed under marginalization.
- P_0 is closed under weak limits.
- A density f on R is log-concave if and only if its convolution with any unimodal density is again unimodal (Ibragimov, 1956).
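Log-concavity of a smooth density is easy to check numerically: ϕ = log f should have non-positive second differences on an equally spaced grid. A small sketch (my own illustration, not from the slides) contrasting the standard normal with the Cauchy density, which fails the check:

```python
import numpy as np

def is_log_concave(f_vals, tol=1e-9):
    """True if log f has non-positive second differences on an equally spaced grid."""
    return bool(np.all(np.diff(np.log(f_vals), n=2) <= tol))

x = np.linspace(-4.0, 4.0, 801)
normal = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # N(0,1): log-concave
cauchy = 1.0 / (np.pi * (1 + x**2))               # t_1: not log-concave

print(is_log_concave(normal), is_log_concave(cauchy))  # -> True False
```

The Cauchy check fails because (log f)'' = 2(x² − 1)/(1 + x²)² is positive for |x| > 1.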
5 Many parametric families are log-concave, for example:
- Normal(µ, σ²)
- Uniform(a, b)
- Gamma(r, λ) for r ≥ 1
- Beta(a, b) for a, b ≥ 1
The t_r densities with r > 0 are not log-concave. Tails of log-concave densities are necessarily sub-exponential. P_{log-concave} = the class of Pólya frequency functions of order 2, PF_2, in the terminology of Schoenberg (1951) and Karlin (1968). See Marshall and Olkin (1979), chapter 18, and Dharmadhikari and Joag-Dev (1988), page 150, for nice introductions.
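The Gamma threshold r ≥ 1 can be read off the log-density (r − 1) log x − λx, whose second derivative −(r − 1)/x² changes sign at r = 1. A quick grid check (my own illustration):

```python
import numpy as np

def gamma_is_log_concave(r, lam=1.0):
    """Check concavity of the Gamma(r, lam) log-density on a grid of (0, inf)."""
    x = np.linspace(0.05, 20.0, 2000)
    log_f = (r - 1) * np.log(x) - lam * x   # normalizing constant is irrelevant
    return bool(np.all(np.diff(log_f, n=2) <= 1e-9))

print(gamma_is_log_concave(2.0), gamma_is_log_concave(0.5))  # -> True False
```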
6 B. s-concave densities on R and R^d. If a density f on R^d is of the form
f(x) = f_ϕ(x) =
  (ϕ(x))^{1/s}, ϕ convex, if s < 0,
  exp(−ϕ(x)), ϕ convex, if s = 0,
  (ϕ(x))^{1/s}, ϕ concave, if s > 0,
then f is s-concave. The classes of all densities f on R^d of these forms are called the classes of s-concave densities, P_s. The following inclusions hold: if −∞ < s < 0 < r < ∞, then
P_r ⊂ P_0 ⊂ P_s ⊂ P_{−∞}.
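The placement of P_0 in this chain reflects the scalar limit (1 − s·h)^{1/s} → e^{−h} as s ↑ 0: an s-concave density built from the convex base function 1 − s·h (convex whenever h is, since s < 0) tends to the log-concave form exp(−h). A numerical illustration of my own, with h(x) = x²/2:

```python
import numpy as np

x = np.linspace(-3.0, 3.0, 601)
h = x**2 / 2                        # convex; target is exp(-h), unnormalized N(0,1)

def f_s(s):
    """s-concave form with convex base function 1 - s*h (s < 0)."""
    return (1 - s * h) ** (1 / s)

err_coarse = np.max(np.abs(f_s(-0.1) - np.exp(-h)))
err_fine = np.max(np.abs(f_s(-0.01) - np.exp(-h)))
```

The sup-norm error shrinks as s ↑ 0 (roughly 0.03 at s = −0.1 versus 0.003 at s = −0.01 on this grid).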
7 Properties of s-concave densities:
- Every s-concave density f is quasi-concave.
- The Student t_ν density satisfies t_ν ∈ P_s for s ≤ −1/(1 + ν). Thus the Cauchy density (= t_1) is in P_{−1/2} ⊂ P_s for s ≤ −1/2.
- The classes P_s have interesting closure properties under convolution and marginalization which follow from the Borell-Brascamp-Lieb inequality: let 0 < λ < 1, −1/d ≤ s ≤ ∞, and let f, g, h : R^d → [0, ∞) be integrable functions such that
h(λx + (1 − λ)y) ≥ M_s(f(x), g(y); λ) for all x, y ∈ R^d,
where
M_s(a, b; λ) = (λa^s + (1 − λ)b^s)^{1/s}, M_0(a, b; λ) = a^λ b^{1−λ}.
Then
∫_{R^d} h(x) dx ≥ M_{s/(sd+1)}(∫_{R^d} f(x) dx, ∫_{R^d} g(x) dx; λ).
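Membership t_1 ∈ P_{−1/2} says that f_Cauchy^{−1/2} = √(π(1 + x²)) should be convex, and indeed √(1 + x²) is convex. A grid check (my own illustration):

```python
import numpy as np

x = np.linspace(-10.0, 10.0, 2001)
cauchy = 1.0 / (np.pi * (1 + x**2))
g = cauchy ** (-0.5)          # should be convex if the Cauchy density is in P_{-1/2}
convex = bool(np.all(np.diff(g, n=2) >= -1e-9))
print(convex)  # -> True
```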
8 Special case s = 0: the Prékopa-Leindler inequality. Let 0 < λ < 1 and let f, g, h : R^d → [0, ∞). If
h(λx + (1 − λ)y) ≥ f(x)^λ g(y)^{1−λ} for all x, y ∈ R^d,
then
∫_{R^d} h(x) dx ≥ (∫_{R^d} f(x) dx)^λ (∫_{R^d} g(x) dx)^{1−λ}.
Proposition. (a) Marginalization preserves log-concavity. (b) Convolution preserves log-concavity.
Proof. (a) Suppose that h(x, y) is a log-concave density on R^m × R^n. Thus
h(λ(x, y) + (1 − λ)(x′, y′)) ≥ h(x, y)^λ h(x′, y′)^{1−λ}.
9 We want to show that f(y) ≡ ∫_{R^m} h(x, y) dx is log-concave. With h̃(x) ≡ h(x, λy + (1 − λ)y′), f̃(x) ≡ h(x, y), g̃(x) ≡ h(x, y′),
h̃(λx + (1 − λ)x′) = h(λx + (1 − λ)x′, λy + (1 − λ)y′) ≥ h(x, y)^λ h(x′, y′)^{1−λ} = f̃(x)^λ g̃(x′)^{1−λ}.
Thus Prékopa-Leindler yields
f(λy + (1 − λ)y′) = ∫_{R^m} h̃(x) dx ≥ (∫_{R^m} h(x, y) dx)^λ (∫_{R^m} h(x, y′) dx)^{1−λ} = f(y)^λ f(y′)^{1−λ}.
(b) If X ∼ f and Y ∼ g are independent with f, g log-concave, then (X, Y) ∼ f·g ≡ h is log-concave. Since log-concavity is preserved under affine transformations, (X − Y, X + Y) has a log-concave density. By (a), X + Y is log-concave.
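The proposition can be sanity-checked numerically (my own illustration): take the log-concave density on R² proportional to exp(−(x² + xy + y²)) and integrate out x; completing the square shows the marginal is proportional to exp(−3y²/4), again log-concave.

```python
import numpy as np

x = np.linspace(-8.0, 8.0, 1601)
y = np.linspace(-3.0, 3.0, 121)
X, Y = np.meshgrid(x, y)
h = np.exp(-(X**2 + X * Y + Y**2))       # log-concave on R^2 (pos.-def. quadratic)

# crude Riemann marginal: f(y) = ∫ h(x, y) dx
marginal = h.sum(axis=1) * (x[1] - x[0])
log_concave = bool(np.all(np.diff(np.log(marginal), n=2) <= 1e-8))
print(log_concave)  # -> True
```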
10 Moments and bounds:
- If f ∈ P_s and s > −1/(d + 1), then E_f|X| < ∞.
- If f ∈ P_s and s > −1/d, then ‖f‖_∞ < ∞.
- If f ∈ P_0, then for some a > 0 and b ∈ R, f(x) ≤ exp(−a|x| + b) for all x.
- If f ∈ P_s with −1/d < s < 0, then with r ≡ −1/s > d, for some a, b > 0, f(x) ≤ (b + a|x|)^{−r} for all x.
- If s < −1/d, then there exists f ∈ P_s with ‖f‖_∞ = ∞.
- If −1/d < s ≤ −1/(d + 1), then there exists f ∈ P_s with E_f|X| = ∞.
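The moment threshold matches the polynomial tail bound: with r = −1/s, the envelope (b + a|x|)^{−r} has a finite first moment on R exactly when r > 2, that is s > −1/2 = −1/(d + 1) for d = 1. Partial integrals illustrate the dichotomy (my own grid-based sketch):

```python
import numpy as np

def partial_first_moment(r, T):
    """Trapezoid-rule approximation of ∫_0^T x (1 + x)^(-r) dx."""
    x = np.linspace(0.0, T, int(200 * T) + 1)
    vals = x * (1 + x) ** (-r)
    dx = x[1] - x[0]
    return float((vals.sum() - 0.5 * (vals[0] + vals[-1])) * dx)

# r = 3 (s = -1/3): converges; r = 2 (s = -1/2): diverges logarithmically
m3_small, m3_big = partial_first_moment(3, 10), partial_first_moment(3, 1000)
m2_small, m2_big = partial_first_moment(2, 10), partial_first_moment(2, 1000)
```

Between T = 10 and T = 1000 the r = 3 integral barely moves (toward its limit 1/2), while the r = 2 integral keeps growing like log T.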
11 Figure: the t_r density
f(x) = (1/(√r B(r/2, 1/2))) (r/(r + x²))^{(1+r)/2},
with r = 0.05, so f ∈ P_{−1/(1+0.05)}.
12 Figure: t_r densities, r ∈ {0.1, 0.2, ..., 1.0}.
13 Figure: f(x) = (1/(2Γ(1/2))) |x|^{−1/2} exp(−|x|), with f ∈ P_{−2}.
14 C. Maximum likelihood: 0-concave and s-concave densities. MLE of f and ϕ: let C denote the class of all concave functions ϕ : R^d → [−∞, ∞). The estimator ϕ̂_n based on X_1, ..., X_n i.i.d. as f_0 is the maximizer over ϕ ∈ C of the adjusted criterion function
ℓ_n(ϕ) = ∫ log f_ϕ(x) dF_n(x) − ∫ f_ϕ(x) dx
= ∫ ϕ(x) dF_n(x) − ∫ e^{ϕ(x)} dx, for s = 0,
= (1/s) ∫ log(−ϕ(x))_+ dF_n(x) − ∫ (−ϕ(x))_+^{1/s} dx, for s < 0.
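For s = 0 the adjusted criterion can be evaluated directly. A small sketch of my own (restricting ϕ to normal log-densities rather than optimizing over the full concave class) shows the criterion prefers the true ϕ to a mis-scaled candidate:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=200)          # sample from f_0 = N(0,1)
grid = np.linspace(-10.0, 10.0, 4001)
dg = grid[1] - grid[0]

def ell_n(phi):
    """Adjusted criterion for s = 0: mean of phi(X) minus ∫ exp(phi(x)) dx."""
    return float(np.mean(phi(X)) - np.sum(np.exp(phi(grid))) * dg)

def normal_log_density(mu, sigma):
    return lambda t: -0.5 * ((t - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

good = ell_n(normal_log_density(0.0, 1.0))
bad = ell_n(normal_log_density(0.0, 3.0))
print(good > bad)  # -> True
```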
15 1. Basics
- The MLEs for P_0 exist and are unique when n ≥ d + 1.
- The MLEs for P_s exist for s ∈ (−1/d, 0) when n ≥ d (r/(r − d)), where r = −1/s. Thus n → ∞ as −1/s = r ↓ d.
- Uniqueness of the MLEs for P_s?
- The MLE ϕ̂_n is piecewise affine for −1/d < s ≤ 0.
- The MLE for P_s does not exist if s < −1/d. (Well known for s = −∞ and d = 1.)
16 2. On the model
- The MLEs are Hellinger and L_1 consistent.
- The log-concave MLEs f̂_{n,0} satisfy
∫ e^{a|x|} |f̂_{n,0}(x) − f_0(x)| dx →_{a.s.} 0
for a < a_0, where f_0(x) ≤ exp(−a_0|x| + b_0).
- The s-concave MLEs are computationally awkward; log is too aggressive a transform for an s-concave density. [Note that ML has difficulties even for location t families: multiple roots of the likelihood equations.]
- Pointwise distribution theory exists for f̂_{n,0} when d = 1; there is no pointwise distribution theory for f̂_{n,s} when d = 1, and none for f̂_{n,0} or f̂_{n,s} when d > 1.
- Global rates? H(f̂_{n,s}, f_0) = O_p(n^{−2/5}) for −1 < s ≤ 0, d = 1.
17 3. Off the model. Now suppose that Q is an arbitrary probability measure on R^d with density q and X_1, ..., X_n are i.i.d. q. The MLE f̂_n for P_0 satisfies
∫_{R^d} |f̂_n(x) − f*(x)| dx →_{a.s.} 0,
where, for the Kullback-Leibler divergence K(q, f) = ∫ q log(q/f) dλ,
f* = argmin_{f ∈ P_0(R^d)} K(q, f)
is the pseudo-true density in P_0(R^d) corresponding to q. In fact,
∫_{R^d} e^{a|x|} |f̂_n(x) − f*(x)| dx →_{a.s.} 0
for any a < a_0, where f*(x) ≤ exp(−a_0|x| + b_0).
18 The MLE f̂_n for P_s does not behave well off the model. Retracing the basic arguments of Cule and Samworth (2010) leads to negative conclusions. (How negative remains to be pinned down!) Conclusion: investigate alternative methods for estimation in the larger classes P_s with s < 0! This leads to the proposals of Koenker and Mizera (2010).
19 D. An alternative to ML: Rényi divergence estimators
0. Notation and definitions
- β = 1 + 1/s < 0; α is conjugate to β: α^{−1} + β^{−1} = 1.
- C(X) = all continuous functions on conv(X).
- C*(X) = all signed Radon measures on conv(X) = the dual space of C(X).
- G(X) = all closed convex (lower s.c.) functions on conv(X).
- G*(X) = {G ∈ C*(X) : ∫ g dG ≥ 0 for all g ∈ G(X)}, the polar (or dual) cone of G(X).
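For instance, s = −1/2 gives β = 1 + 1/s = −1 and conjugate α = 1/2. A two-line helper (my own illustration of the notation):

```python
def renyi_exponents(s):
    """beta = 1 + 1/s (< 0 for -1 < s < 0) and its conjugate alpha: 1/alpha + 1/beta = 1."""
    beta = 1.0 + 1.0 / s
    alpha = beta / (beta - 1.0)   # solve 1/alpha = 1 - 1/beta
    return alpha, beta

alpha, beta = renyi_exponents(-0.5)
print(alpha, beta)  # -> 0.5 -1.0
```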
20 Primal problems, P_0 and P_s:
P_0: min_{g ∈ G(X)} L_0(g, P_n), where L_0(g, P_n) = P_n g + ∫_{R^d} exp(−g(x)) dx.
P_s: min_{g ∈ G(X)} L_s(g, P_n), where L_s(g, P_n) = P_n g + (1/|β|) ∫_{R^d} g(x)^β dx.
21 Dual problems, D_0 and D_s:
D_0: max_f {−∫ f(y) log f(y) dy} subject to f = d(P_n − G)/dy for some G ∈ G*(X).
D_s: max_f {(1/α) ∫ f(y)^α dy} subject to f = d(P_n − G)/dy for some G ∈ G*(X).
22 Why do these make sense? Population version of P_0:
min_{g ∈ G} L_0(g, f_0), where L_0(g, f_0) = ∫ {g(x) f_0(x) + e^{−g(x)}} dx.
Minimizing the integrand pointwise in g = g(x) for fixed f_0(x) yields f_0(x) − e^{−g(x)} = 0, i.e. e^{−g(x)} = f_0(x).
Population version of P_s:
min_{g ∈ G} L_s(g, f_0), where L_s(g, f_0) = ∫ {g(x) f_0(x) + (1/|β|) g^β(x)} dx.
Minimizing the integrand pointwise in g = g(x) for fixed f_0(x) yields f_0(x) + (β/|β|) g^{β−1}(x) = f_0(x) − g^{β−1}(x) = 0, and hence, since β − 1 = 1/s,
g^{1/s}(x) = f_0(x).
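These pointwise minimizations are easy to confirm on a grid. My own check at a single point, with f_0(x) = 0.3 and s = −1/2 (so β = −1):

```python
import numpy as np

f0 = 0.3
t = np.linspace(0.01, 10.0, 200001)

# s = 0 case: minimize t*f0 + exp(-t); the minimizer should be -log f0
t_star0 = t[np.argmin(t * f0 + np.exp(-t))]

# s = -1/2 case (beta = -1): minimize t*f0 + t**beta / |beta|;
# the minimizer should satisfy t_star**(1/s) = f0
s, beta = -0.5, -1.0
t_star_s = t[np.argmin(t * f0 + t**beta / abs(beta))]

print(round(t_star0, 3), round(t_star_s ** (1 / s), 3))  # -> 1.204 0.3
```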
23 1. Basics for the Rényi divergence estimators
- (Koenker and Mizera, 2010) If conv(X) has non-empty interior, then strong duality between P_s and D_s holds. The dual optimal solution exists, is unique, and f̂_n = ĝ_n^{1/s}.
- (Koenker and Mizera, 2010) The solution f̄ = ḡ^{1/s} of the population version of the problem, when Q = P_0 has density p_0 ∈ P_s, is Fisher-consistent; i.e. f̄ = p_0.
24 2. Off the model: Han & W (2015). Let
Q_1 ≡ {Q on (R^d, B^d) : ∫ |x| dQ(x) < ∞}, Q_0 ≡ {Q on (R^d, B^d) : int(csupp(Q)) ≠ ∅}.
Theorem (Han & W, 2015). If −1/(d + 1) < s < 0 and Q ∈ Q_0 ∩ Q_1, then the primal problem P_s(Q) has a unique solution g ∈ G which satisfies f = g^{1/s}, where g is bounded away from 0 and f is a bounded density.
Theorem (Han & W, 2015). Let d = 1. If f̂_{n,s} denotes the solution to the primal problem P_s and f̂_{n,0} the solution to the primal problem P_0, then for any κ > 0 and p ≥ 1,
∫ (1 + |x|)^κ |f̂_{n,s}(x) − f̂_{n,0}(x)|^p dx → 0 as s ↑ 0.
25 Theorem (Han & W, 2015). Suppose that: (i) d ≥ 1, (ii) −1/(d + 1) < s < 0, and (iii) Q ∈ Q_0 ∩ Q_1. If f̂_{Q,s} denotes the (pseudo-true) solution to the primal problem P_s(Q), then for any κ < r − d = (−1/s) − d,
∫ (1 + |x|)^κ |f̂_{n,s}(x) − f̂_{Q,s}(x)| dx →_{a.s.} 0 as n → ∞.
26 3. On the model: Q has density f ∈ P_s; f = g^{1/s} for some g convex.
Consistency: suppose that (i) d ≥ 1 and −1/d < s < 0, and (ii) s′ satisfies s′ > s if s ≤ −1/(d + 1), and s′ = s if s > −1/(d + 1). Then for any κ < r′ − d = (−1/s′) − d,
∫ (1 + |x|)^κ |f̂_{n,s′}(x) − f(x)| dx →_{a.s.} 0 as n → ∞.
Thus H(f̂_{n,s′}, f) →_{a.s.} 0 as well.
Pointwise limit theory: paralleling the results of Balabdaoui, Rufibach, and W (2009) for s = 0.
27 Assumptions:
(A1) g_0 ∈ G and f_0 = g_0^{1/s} ∈ P_s(R) with −1/2 < s < 0.
(A2) f_0(x_0) > 0.
(A3) g_0 is locally C² in a neighborhood of x_0, with g_0^{(2)}(x_0) > 0.
Theorem 1 (pointwise limit theorem; Han & W (2016)). Under assumptions (A1)-(A3), with r = −1/s we have
n^{2/5} (ĝ_n(x_0) − g_0(x_0)) →_d (g_0^4(x_0) g_0^{(2)}(x_0) / (r^4 f_0(x_0)^2 (4)!))^{1/5} H_2^{(2)}(0),
n^{1/5} (ĝ_n′(x_0) − g_0′(x_0)) →_d (g_0^2(x_0) [g_0^{(2)}(x_0)]^3 / (r^2 f_0(x_0) [(4)!]^3))^{1/5} H_2^{(3)}(0),
and ...
28 ... furthermore
n^{2/5} (f̂_n(x_0) − f_0(x_0)) →_d (r f_0(x_0)^3 g_0^{(2)}(x_0) / (g_0(x_0) (4)!))^{1/5} H_2^{(2)}(0),
n^{1/5} (f̂_n′(x_0) − f_0′(x_0)) →_d (r^3 f_0(x_0)^4 [g_0^{(2)}(x_0)]^3 / (g_0^3(x_0) [(4)!]^3))^{1/5} H_2^{(3)}(0),
where H_2 is the unique lower envelope of the process Y_2 satisfying:
1. H_2(t) ≤ Y_2(t) for all t ∈ R;
2. H_2^{(2)} is concave;
3. H_2(t) = Y_2(t) if the slope of H_2^{(2)} decreases strictly at t;
4. Y_2(t) = ∫_0^t W(s) ds − t^4, t ∈ R, where W is two-sided Brownian motion started at 0.
29 Estimation of the mode for d = 1.
Theorem 2 (estimation of the mode). Assume (A1)-(A4) hold. Then
n^{1/5} (m̂_n − m_0) →_d (g_0(m_0)^2 [(4)!]^2 / (r^2 f_0(m_0) g_0^{(2)}(m_0)^2))^{1/5} M(H_2^{(2)}), (1)
where m̂_n = M(f̂_n), m_0 = M(f_0).
What is the price of assuming s < 0 when the truth f ∈ P_0? Assume −1/2 < s < 0 and k = 2. Let f_0 = exp(ϕ_0) be a log-concave density, where ϕ_0 : R → R is the underlying concave function. Then f_0 is also s-concave.
30 Let g_s := f_0^{−1/r} = exp(−ϕ_0/r) be the underlying convex function when f_0 is viewed as an s-concave density. Calculation yields
g_s^{(2)}(x_0) = (1/r^2) g_s(x_0) (ϕ_0′(x_0)^2 − r ϕ_0″(x_0)).
Hence the constant before H_2^{(2)}(0) appearing in the limit distribution for f̂_n becomes
(f_0(x_0)^3 ϕ_0′(x_0)^2 / (4! r) + f_0(x_0)^3 (−ϕ_0″(x_0)) / 4!)^{1/5}.
The second term is the constant involved in the limiting distribution when f_0(x_0) is estimated via the log-concave MLE: see (2.2), page 1305, in Balabdaoui, Rufibach, & W (2009). The ratio of the two constants (or asymptotic relative efficiency) is shown for f_0 standard normal (blue) and logistic (magenta) in the figure.
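For f_0 = N(0, 1), so ϕ_0′(x) = −x and ϕ_0″(x) = −1, the ratio of the two constants simplifies to (1 + x²/r)^{1/5}: it equals 1 at the mode x = 0 and grows away from it. A small sketch (my own simplification of the constants; treat it as an assumption):

```python
def efficiency_ratio(x, s=-0.5):
    """Ratio of the Rényi-estimator to the log-concave-MLE constant, f_0 = N(0,1)."""
    r = -1.0 / s
    return (1.0 + x**2 / r) ** 0.2

print(efficiency_ratio(0.0), efficiency_ratio(2.0) > 1.0)  # -> 1.0 True
```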
31 Figure: efficiency ratio, f_0 = N(0, 1) (blue); f_0 = logistic (magenta).
32 The first term is non-negative and is the price we pay for estimating a true log-concave density via the Rényi divergence estimator over the larger class of s-concave densities. Note that the first term vanishes as r → ∞ (i.e. s ↑ 0), and that the ratio is 1 at the mode of f_0. For estimation of the mode, the ratio of constants is always 1: nothing is lost by enlarging the class from s = 0 to s < 0!
33 E. Summary: problems and open questions
- Global rates of convergence?
- Limiting distribution(s) for d > 1? (n^{−r} with r = 2/(4 + d)?)
- MLE (rate-)inefficient for d ≥ 4 (or perhaps d ≥ 3)? How to penalize to get efficient rates?
- Can we go below s = −1/(d + 1) with other methods?
- Multivariate classes with nice preservation/closure properties, smoother than log-concave?
- Algorithms for computing f̂_n ∈ P_s? (Non-smooth and convex; or non-smooth and non-convex?)
- Related results for convex regression on R^d: Seijo and Sen, Ann. Statist. (2011).
34 Guvenen et al. (2014) have estimated models of income dynamics using very large (10 percent) samples of U.S. Social Security records linked to W2 data. The density is not log-concave, but an s-concave density with s = −1/2 fits well. [Figure: f(x) and log f(x) against x = annual increments of log income. Courtesy Roger Koenker.]
35 F. Selected references
- Dümbgen and Rufibach (2009).
- Cule, Samworth, and Stewart (2010).
- Cule and Samworth (2010).
- Dümbgen, Samworth, and Schuhmacher (2011).
- Balabdaoui, Rufibach, and W (2009).
- Seregin & W (2010), Ann. Statist.
- Koenker and Mizera (2010), Ann. Statist.
- Han & W (2016), Ann. Statist., & arXiv: v3.
- Doss & W (2016), Ann. Statist., & arXiv: v2.
- Guntuboyina & Sen (2015), Prob. Theory & Rel. Fields.
- Chatterjee, Guntuboyina, & Sen (2015), Ann. Statist.
36 Many thanks!
More informationA Bahadur Representation of the Linear Support Vector Machine
A Bahadur Representation of the Linear Support Vector Machine Yoonkyung Lee Department of Statistics The Ohio State University October 7, 2008 Data Mining and Statistical Learning Study Group Outline Support
More informationBoundary Correction Methods in Kernel Density Estimation Tom Alberts C o u(r)a n (t) Institute joint work with R.J. Karunamuni University of Alberta
Boundary Correction Methods in Kernel Density Estimation Tom Alberts C o u(r)a n (t) Institute joint work with R.J. Karunamuni University of Alberta November 29, 2007 Outline Overview of Kernel Density
More informationLecture 4: Convex Functions, Part I February 1
IE 521: Convex Optimization Instructor: Niao He Lecture 4: Convex Functions, Part I February 1 Spring 2017, UIUC Scribe: Shuanglong Wang Courtesy warning: These notes do not necessarily cover everything
More informationStatistica Sinica Preprint No: SS R2
Statistica Sinica Preprint No: SS-2014-0088R2 Title Maximum likelihood estimation of a unimodal probability mass function Manuscript ID SS-2014-0088R2 URL http://www.stat.sinica.edu.tw/statistica/ DOI
More informationTransformations and Bayesian Density Estimation
Transformations and Bayesian Density Estimation Andrew Bean 1, Steve MacEachern, Xinyi Xu The Ohio State University 10 th Conference on Bayesian Nonparametrics June 25, 2015 1 bean.243@osu.edu Transformations
More informationLinear Programming. Larry Blume Cornell University, IHS Vienna and SFI. Summer 2016
Linear Programming Larry Blume Cornell University, IHS Vienna and SFI Summer 2016 These notes derive basic results in finite-dimensional linear programming using tools of convex analysis. Most sources
More informationFE610 Stochastic Calculus for Financial Engineers. Stevens Institute of Technology
FE610 Stochastic Calculus for Financial Engineers Lecture 3. Calculaus in Deterministic and Stochastic Environments Steve Yang Stevens Institute of Technology 01/31/2012 Outline 1 Modeling Random Behavior
More informationAround the Brunn-Minkowski inequality
Around the Brunn-Minkowski inequality Andrea Colesanti Technische Universität Berlin - Institut für Mathematik January 28, 2015 Summary Summary The Brunn-Minkowski inequality Summary The Brunn-Minkowski
More informationRules of Differentiation
Rules of Differentiation The process of finding the derivative of a function is called Differentiation. 1 In the previous chapter, the required derivative of a function is worked out by taking the limit
More informationSeptember Math Course: First Order Derivative
September Math Course: First Order Derivative Arina Nikandrova Functions Function y = f (x), where x is either be a scalar or a vector of several variables (x,..., x n ), can be thought of as a rule which
More informationApplications of Linear Programming
Applications of Linear Programming lecturer: András London University of Szeged Institute of Informatics Department of Computational Optimization Lecture 9 Non-linear programming In case of LP, the goal
More informationJournal of Statistical Software
JSS Journal of Statistical Software March 2011, Volume 39, Issue 6. http://www.jstatsoft.org/ logcondens: Computations Related to Univariate Log-Concave Density Estimation Lutz Dümbgen University of Bern
More informationHigh-dimensional distributions with convexity properties
High-dimensional distributions with convexity properties Bo az Klartag Tel-Aviv University A conference in honor of Charles Fefferman, Princeton, May 2009 High-Dimensional Distributions We are concerned
More informationLECTURE 15: COMPLETENESS AND CONVEXITY
LECTURE 15: COMPLETENESS AND CONVEXITY 1. The Hopf-Rinow Theorem Recall that a Riemannian manifold (M, g) is called geodesically complete if the maximal defining interval of any geodesic is R. On the other
More informationBayesian Regularization
Bayesian Regularization Aad van der Vaart Vrije Universiteit Amsterdam International Congress of Mathematicians Hyderabad, August 2010 Contents Introduction Abstract result Gaussian process priors Co-authors
More informationPart II Probability and Measure
Part II Probability and Measure Theorems Based on lectures by J. Miller Notes taken by Dexter Chua Michaelmas 2016 These notes are not endorsed by the lecturers, and I have modified them (often significantly)
More informationDensity Estimation under Independent Similarly Distributed Sampling Assumptions
Density Estimation under Independent Similarly Distributed Sampling Assumptions Tony Jebara, Yingbo Song and Kapil Thadani Department of Computer Science Columbia University ew York, Y 7 { jebara,yingbo,kapil
More information9. Model Selection. statistical models. overview of model selection. information criteria. goodness-of-fit measures
FE661 - Statistical Methods for Financial Engineering 9. Model Selection Jitkomut Songsiri statistical models overview of model selection information criteria goodness-of-fit measures 9-1 Statistical models
More informationHigh dimensional Ising model selection
High dimensional Ising model selection Pradeep Ravikumar UT Austin (based on work with John Lafferty, Martin Wainwright) Sparse Ising model US Senate 109th Congress Banerjee et al, 2008 Estimate a sparse
More informationSurrogate loss functions, divergences and decentralized detection
Surrogate loss functions, divergences and decentralized detection XuanLong Nguyen Department of Electrical Engineering and Computer Science U.C. Berkeley Advisors: Michael Jordan & Martin Wainwright 1
More informationCourse 212: Academic Year Section 1: Metric Spaces
Course 212: Academic Year 1991-2 Section 1: Metric Spaces D. R. Wilkins Contents 1 Metric Spaces 3 1.1 Distance Functions and Metric Spaces............. 3 1.2 Convergence and Continuity in Metric Spaces.........
More informationi=1 h n (ˆθ n ) = 0. (2)
Stat 8112 Lecture Notes Unbiased Estimating Equations Charles J. Geyer April 29, 2012 1 Introduction In this handout we generalize the notion of maximum likelihood estimation to solution of unbiased estimating
More informationPosterior Regularization
Posterior Regularization 1 Introduction One of the key challenges in probabilistic structured learning, is the intractability of the posterior distribution, for fast inference. There are numerous methods
More informationf(x θ)dx with respect to θ. Assuming certain smoothness conditions concern differentiating under the integral the integral sign, we first obtain
0.1. INTRODUCTION 1 0.1 Introduction R. A. Fisher, a pioneer in the development of mathematical statistics, introduced a measure of the amount of information contained in an observaton from f(x θ). Fisher
More informationA glimpse into convex geometry. A glimpse into convex geometry
A glimpse into convex geometry 5 \ þ ÏŒÆ Two basis reference: 1. Keith Ball, An elementary introduction to modern convex geometry 2. Chuanming Zong, What is known about unit cubes Convex geometry lies
More informationMATH MEASURE THEORY AND FOURIER ANALYSIS. Contents
MATH 3969 - MEASURE THEORY AND FOURIER ANALYSIS ANDREW TULLOCH Contents 1. Measure Theory 2 1.1. Properties of Measures 3 1.2. Constructing σ-algebras and measures 3 1.3. Properties of the Lebesgue measure
More informationMath 341: Convex Geometry. Xi Chen
Math 341: Convex Geometry Xi Chen 479 Central Academic Building, University of Alberta, Edmonton, Alberta T6G 2G1, CANADA E-mail address: xichen@math.ualberta.ca CHAPTER 1 Basics 1. Euclidean Geometry
More informationConvex Analysis and Economic Theory AY Elementary properties of convex functions
Division of the Humanities and Social Sciences Ec 181 KC Border Convex Analysis and Economic Theory AY 2018 2019 Topic 6: Convex functions I 6.1 Elementary properties of convex functions We may occasionally
More informationConstrained Optimization and Lagrangian Duality
CIS 520: Machine Learning Oct 02, 2017 Constrained Optimization and Lagrangian Duality Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the lecture. They may or may
More informationChapter 4: Asymptotic Properties of the MLE
Chapter 4: Asymptotic Properties of the MLE Daniel O. Scharfstein 09/19/13 1 / 1 Maximum Likelihood Maximum likelihood is the most powerful tool for estimation. In this part of the course, we will consider
More informationStat 5101 Lecture Notes
Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random
More informationMAT 578 FUNCTIONAL ANALYSIS EXERCISES
MAT 578 FUNCTIONAL ANALYSIS EXERCISES JOHN QUIGG Exercise 1. Prove that if A is bounded in a topological vector space, then for every neighborhood V of 0 there exists c > 0 such that tv A for all t > c.
More informationLecture Quantitative Finance Spring Term 2015
on bivariate Lecture Quantitative Finance Spring Term 2015 Prof. Dr. Erich Walter Farkas Lecture 07: April 2, 2015 1 / 54 Outline on bivariate 1 2 bivariate 3 Distribution 4 5 6 7 8 Comments and conclusions
More information